Jira Server and Data Center
JRASERVER-65890

Create logging event for clusterlockstatus during startup


    Details

    • UIS:
      2
    • Support reference count:
      5

      Description

      Problem:

      In Jira Data Center, a node may be unable to start up because another node is holding a cluster lock.

      Symptom:

      Jira is unable to start up; the log shows it stuck in a loop calling Current.getStatus():

      2017-09-03 19:50:07,583 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:07,584 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:07,585 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:07,585 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:07,586 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:07,587 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:10,095 localhost-startStop-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:10,097 localhost-startStop-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:10,581 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus()
      2017-09-03 19:50:10,583 ClusterMessageHandlerServiceThread:thread-1 DEBUG      [o.objectweb.jotm.jta] Current.getStatus() 
      

      From the CLUSTERLOCKSTATUS table, we can see that one of the nodes is holding the cluster lock.

      node-2 is unable to start up; its thread dump contains the following:

      "localhost-startStop-1" #36 daemon prio=5 os_prio=0 tid=0x00007f732c002000 nid=0x666f waiting on condition [0x00007f73454e1000]
         java.lang.Thread.State: TIMED_WAITING (sleeping)
      	at java.lang.Thread.sleep(Native Method)
      	at com.atlassian.beehive.db.DatabaseClusterLock.sleep(DatabaseClusterLock.java:530)
      	at com.atlassian.beehive.db.DatabaseClusterLock.uninterruptibleWait(DatabaseClusterLock.java:102)
      	at com.atlassian.beehive.db.DatabaseClusterLock.lock(DatabaseClusterLock.java:82)
      	at com.atlassian.beehive.compat.delegate.DelegatingClusterLock.lock(DelegatingClusterLock.java:34)
      	at com.atlassian.upm.impl.Locks.runWithLock(Locks.java:114)
      	at com.atlassian.upm.impl.Locks.writeWithLock(Locks.java:81)
      	at com.atlassian.upm.core.async.AsynchronousTaskStatusStoreImpl.launch(AsynchronousTaskStatusStoreImpl.java:459)
      	at com.atlassian.upm.core.async.AsynchronousTaskStatusStoreImpl.onLifecycleEvent(AsynchronousTaskStatusStoreImpl.java:428)
      	at com.atlassian.upm.core.async.AsynchronousTaskStatusStoreImpl.onStart(AsynchronousTaskStatusStoreImpl.java:389)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager$4.consume(DefaultLifecycleManager.java:277)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager$4.consume(DefaultLifecycleManager.java:274)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyLifecyleAware(DefaultLifecycleManager.java:303)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyOnStartIfStartedAndEnabled(DefaultLifecycleManager.java:273)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.access$300(DefaultLifecycleManager.java:49)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager$3.evaluate(DefaultLifecycleManager.java:235)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager$3.evaluate(DefaultLifecycleManager.java:232)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyLifecycleAwares(DefaultLifecycleManager.java:258)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyStartableLifecycleAwares(DefaultLifecycleManager.java:231)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.startIfApplicationSetup(DefaultLifecycleManager.java:219)
      	at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.start(DefaultLifecycleManager.java:210)
      	at com.atlassian.sal.jira.lifecycle.JiraLifecycleManager.onJiraStart(JiraLifecycleManager.java:64)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:36)
      	at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:48)
      	at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:299)
      	at com.atlassian.event.internal.AsynchronousAbleEventDispatcher.dispatch(AsynchronousAbleEventDispatcher.java:107)
      	at com.atlassian.event.internal.EventPublisherImpl.invokeListeners(EventPublisherImpl.java:160)
      	at com.atlassian.event.internal.EventPublisherImpl.publish(EventPublisherImpl.java:79) 
      

      In particular, note the following frame:

      com.atlassian.upm.core.async.AsynchronousTaskStatusStoreImpl.onStart(AsynchronousTaskStatusStoreImpl.java:389)

      Recommendation

      Add logging when another node is holding a cluster-wide lock that prevents startup from completing; in this case the lock is acquired via com.atlassian.upm.core.async.AsynchronousTaskStatusStoreImpl.
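      A minimal sketch of what such logging could look like, using java.util.concurrent.locks.ReentrantLock as a stand-in for Beehive's DatabaseClusterLock (whose real internals differ): instead of sleeping uninterruptibly, the acquirer polls tryLock with a timeout and emits a WARN naming the contended lock on every timed-out attempt. The class, method, and lock names below are hypothetical illustrations, not Jira APIs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: wrap blocking lock acquisition so each timed-out attempt logs
// which cluster lock is being waited on, instead of sleeping silently.
class LoggingLockAcquirer {

    // Stand-in for a real logger (e.g. SLF4J) to keep the sketch self-contained.
    interface Log { void warn(String msg); }

    private final Log log;

    LoggingLockAcquirer(Log log) { this.log = log; }

    /**
     * Acquires the lock, logging a warning every pollSeconds seconds while it
     * is still held elsewhere. Returns the number of timed-out attempts made
     * before the lock was finally obtained.
     */
    int lockWithLogging(ReentrantLock lock, String lockName, long pollSeconds)
            throws InterruptedException {
        int timedOutAttempts = 0;
        while (!lock.tryLock(pollSeconds, TimeUnit.SECONDS)) {
            timedOutAttempts++;
            log.warn("Still waiting for cluster lock '" + lockName + "' after "
                    + (timedOutAttempts * pollSeconds)
                    + "s; it is held by another node.");
        }
        return timedOutAttempts;
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        LoggingLockAcquirer acquirer =
                new LoggingLockAcquirer(msg -> System.out.println("WARN " + msg));

        // Simulate another node holding the lock for ~2.5 s before releasing it.
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                Thread.sleep(2500);
            } catch (InterruptedException ignored) {
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        Thread.sleep(100); // let the holder win the race for the lock

        int attempts = acquirer.lockWithLogging(lock, "upm.asyncTaskStatusStore", 1);
        lock.unlock();
        holder.join();
        System.out.println("Acquired after " + attempts + " timed-out attempts.");
    }
}
```

      With this pattern, a node blocked during startup would emit a WARN naming the contended lock on every poll interval, rather than appearing hung in Thread.sleep as in the thread dump above.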


              People

              Assignee:
              Unassigned
              Reporter:
               vkharisma
               Votes:
               5
               Watchers:
               10
