  1. Bitbucket Data Center
  2. BSERV-9049

NullPointerException in Hazelcast migration operation under certain conditions

Details

    Description

      Summary

      When the topology of a clustered Bitbucket Data Center instance changes (that is, when one or more nodes leave or join), Hazelcast may under some conditions log many errors of the form:

      2016-08-09 04:37:24,518 ERROR [hz.hazelcast.partition-operation.thread-0]  c.h.p.impl.MigrationOperation [10.0.1.230]:5701 [bitbucket-data-center-1d2704] [3.5.2-atlassian-36] An exception occurred while executing migration operation com.hazelcast.map.impl.operation.MapReplicationOperation{serviceName='null', partitionId=16, callId=0, invocationTime=-1, waitTimeout=-1, callTimeout=9223372036854775807}
      java.lang.NullPointerException: null
              at com.hazelcast.map.impl.record.AbstractRecordWithStats.getCost(AbstractRecordWithStats.java:62) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.record.DataRecordWithStats.getCost(DataRecordWithStats.java:38) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.MapSizeEstimator.getCost(MapSizeEstimator.java:46) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.MapSizeEstimator.getCost(MapSizeEstimator.java:26) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.AbstractRecordStore.calculateRecordHeapCost(AbstractRecordStore.java:119) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.DefaultRecordStore.putRecord(DefaultRecordStore.java:178) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.operation.MapReplicationOperation.run(MapReplicationOperation.java:123) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.runMigrationTask(MigrationOperation.java:177) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.migrate(MigrationOperation.java:155) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.doRun(MigrationOperation.java:93) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.run(MigrationOperation.java:79) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:137) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:315) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.processPacket(OperationThread.java:142) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.process(OperationThread.java:115) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.doRun(OperationThread.java:101) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.run(OperationThread.java:76) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
      2016-08-09 04:37:24,529 WARN  [hz.hazelcast.migration]  c.h.p.InternalPartitionService [10.0.1.230]:5701 [bitbucket-data-center-1d2704] [3.5.2-atlassian-36] Migration failed: com.hazelcast.partition.MigrationInfo{partitionId=16, source=Address[10.0.0.98]:5701, destination=Address[10.0.1.230]:5701, master=Address[10.0.1.230]:5701, valid=true, processing=false}
      

      If an instance is affected, the data held in its Hazelcast IMaps may become incorrect. Resolving this requires a full cluster restart (bringing all nodes down and then back up at around the same time).
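To check whether a node has hit this problem, the log signature above can be searched for mechanically. The following is a minimal sketch, assuming the log has already been read into a list of lines; the function names `has_migration_npe` and `find_failed_migrations` are hypothetical helpers, and the match strings are copied from the log excerpt in this issue rather than from any Hazelcast API.

```python
import re

# Message fragments copied from the log excerpt in this issue.
MIGRATION_ERROR = "An exception occurred while executing migration operation"
NPE_LINE = "java.lang.NullPointerException"
MIGRATION_FAILED = re.compile(r"Migration failed: .*?partitionId=(\d+)")


def has_migration_npe(log_lines):
    """Return True if a migration error line is directly followed by an NPE line."""
    previous = ""
    for line in log_lines:
        if MIGRATION_ERROR in previous and NPE_LINE in line:
            return True
        previous = line
    return False


def find_failed_migrations(log_lines):
    """Return the sorted partition IDs named in 'Migration failed' warnings."""
    failed = set()
    for line in log_lines:
        match = MIGRATION_FAILED.search(line)
        if match:
            failed.add(int(match.group(1)))
    return sorted(failed)
```

Applied to the excerpt above, `has_migration_npe` would report the error and `find_failed_migrations` would report partition 16. A match only indicates that the log pattern is present; it does not by itself confirm that IMap data is incorrect.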

      Workaround

      Shut down all nodes in the cluster, set the following property in bitbucket.properties:

      hazelcast.statistics.enabled=true
      

      and restart all cluster nodes. Do not attempt to apply this workaround with a rolling (zero downtime) restart, because a cluster whose nodes have different values for the hazelcast.statistics.enabled property is one way to trigger the problem.
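Because mixed values of the property are one trigger, it can help to confirm that every node's bitbucket.properties agrees before the nodes are started again. The sketch below is one way to do that, assuming the contents of each node's file have already been collected into a dictionary keyed by node name. The helpers `read_property` and `nodes_agree` are hypothetical, and the parser handles only simple `key=value` lines, not the full Java properties syntax.

```python
PROPERTY = "hazelcast.statistics.enabled"


def read_property(properties_text, key=PROPERTY):
    """Return the value of `key` from simple key=value text, or None if absent."""
    value = None
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()  # a later assignment overrides an earlier one
    return value


def nodes_agree(properties_by_node, key=PROPERTY):
    """Return True if every node has the same value for `key`.

    A node that omits the key is treated as differing from a node that
    sets it, which errs on the side of reporting a mismatch.
    """
    values = {read_property(text, key) for text in properties_by_node.values()}
    return len(values) <= 1
```

For example, `nodes_agree({"node1": text1, "node2": text2})` returns False if only one of the two files sets the property, which is the mixed configuration this workaround warns against.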

      Solution

      Upgrade Bitbucket Data Center to 4.8.4 or higher.

          People

            rfriend rikf