Bitbucket Data Center / BSERV-9049

NullPointerException in Hazelcast migration operation under certain conditions

    Description

      Summary

      When the topology of a clustered Bitbucket Data Center instance changes (that is, one or more nodes leave or join), Hazelcast may under some conditions log many errors of the form:

      2016-08-09 04:37:24,518 ERROR [hz.hazelcast.partition-operation.thread-0]  c.h.p.impl.MigrationOperation [10.0.1.230]:5701 [bitbucket-data-center-1d2704] [3.5.2-atlassian-36] An exception occurred while executing migration operation com.hazelcast.map.impl.operation.MapReplicationOperation{serviceName='null', partitionId=16, callId=0, invocationTime=-1, waitTimeout=-1, callTimeout=9223372036854775807}
      java.lang.NullPointerException: null
              at com.hazelcast.map.impl.record.AbstractRecordWithStats.getCost(AbstractRecordWithStats.java:62) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.record.DataRecordWithStats.getCost(DataRecordWithStats.java:38) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.MapSizeEstimator.getCost(MapSizeEstimator.java:46) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.MapSizeEstimator.getCost(MapSizeEstimator.java:26) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.AbstractRecordStore.calculateRecordHeapCost(AbstractRecordStore.java:119) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.DefaultRecordStore.putRecord(DefaultRecordStore.java:178) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.map.impl.operation.MapReplicationOperation.run(MapReplicationOperation.java:123) ~[hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.runMigrationTask(MigrationOperation.java:177) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.migrate(MigrationOperation.java:155) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.doRun(MigrationOperation.java:93) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.partition.impl.MigrationOperation.run(MigrationOperation.java:79) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:137) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:315) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.processPacket(OperationThread.java:142) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.process(OperationThread.java:115) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.doRun(OperationThread.java:101) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
              at com.hazelcast.spi.impl.operationexecutor.classic.OperationThread.run(OperationThread.java:76) [hazelcast-3.5.2-atlassian-36.jar:3.5.2-atlassian-36]
      2016-08-09 04:37:24,529 WARN  [hz.hazelcast.migration]  c.h.p.InternalPartitionService [10.0.1.230]:5701 [bitbucket-data-center-1d2704] [3.5.2-atlassian-36] Migration failed: com.hazelcast.partition.MigrationInfo{partitionId=16, source=Address[10.0.0.98]:5701, destination=Address[10.0.1.230]:5701, master=Address[10.0.1.230]:5701, valid=true, processing=false}
      

      On an affected instance, the data held in Hazelcast IMaps may become incorrect. Resolving this requires a full cluster restart (bringing all nodes down and then back up at around the same time).
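
      To check whether an instance has logged this error, the log can be scanned for the signature shown in the excerpt above. The following is only a minimal sketch in Python: the function name count_migration_npes is hypothetical, and the matching assumes the log lines keep the same layout as the excerpt (an ERROR line from MigrationOperation followed by a stack trace containing the AbstractRecordWithStats.getCost frame).

```python
import re

# Header of the failing migration, as it appears in the log excerpt above.
MIGRATION_ERROR = re.compile(
    r"ERROR .*MigrationOperation .*"
    r"An exception occurred while executing migration operation"
)
# Stack frame that identifies this particular NullPointerException.
NPE_FRAME = "com.hazelcast.map.impl.record.AbstractRecordWithStats.getCost"


def count_migration_npes(lines):
    """Count migration errors whose stack trace contains the getCost frame."""
    count = 0
    pending = False  # True after an error header, until its trace is classified
    for line in lines:
        if MIGRATION_ERROR.search(line):
            pending = True
        elif pending and NPE_FRAME in line:
            count += 1
            pending = False
        elif pending and not line.lstrip().startswith(("at ", "java.lang.")):
            # A new, unrelated log entry started before the frame was seen.
            pending = False
    return count
```

      It could be called on an open log file, for example count_migration_npes(open("atlassian-bitbucket.log")). A non-zero count only indicates that the error was logged; it does not by itself show that IMap data is incorrect.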

      Workaround

      Shut down all nodes in the cluster, then set the following property in bitbucket.properties:

      hazelcast.statistics.enabled=true
      

      and restart all cluster nodes. Do not apply this workaround with a rolling (zero downtime) restart: a cluster whose nodes run with different values of the hazelcast.statistics.enabled property is one way to trigger the problem.
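
      Before restarting the nodes, it may be worth confirming that the property is actually set. The sketch below is an illustration only: read_property and workaround_applied are hypothetical helper names, and the parser handles just simple key=value lines, not every feature of the Java properties format (such as ":" separators or line continuations).

```python
def read_property(text, key):
    """Return the last value assigned to `key` in simple key=value text, or None."""
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' or '!' in properties files).
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()  # later assignments override earlier ones
    return value


def workaround_applied(text):
    """True if hazelcast.statistics.enabled is set to true in the given text."""
    return read_property(text, "hazelcast.statistics.enabled") == "true"
```

      The text passed in would be the contents of bitbucket.properties, read while all nodes are still shut down.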

      Solution

      Upgrade Bitbucket Data Center to 4.8.4 or higher.
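
      Whether a given version already contains the fix can be checked with a simple comparison against 4.8.4. This is a rough sketch: has_fix is a hypothetical helper, and it assumes version strings made up only of dot-separated numbers.

```python
FIXED_IN = (4, 8, 4)  # first version with the fix, per the Solution above


def parse_version(version):
    """Turn a dotted numeric version such as '4.8.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(version):
    """True if `version` is 4.8.4 or higher."""
    return parse_version(version) >= FIXED_IN
```

      Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating 4.10.0 as lower than 4.8.4.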

      Attachments

        1. atlassian-bitbucket.log.bz2
          88 kB
          Cristan Szmajda

          People

            rfriend rikf
            rfriend rikf