Uploaded image for project: 'Jira Software Data Center'
  1. Jira Software Data Center
  2. JSWSERVER-16057

LexoRank rebalancing may stuck in some cases

XMLWordPrintable

      Summary

      LexoRank rebalancing job may stuck in some cases. You don't see any progress for many hours (days).
      You can see status at Lexorank management page (see LexoRank Management)

      • Example of REST call /rest/greenhopper/1.0/lexorank/balance output:
        {"lexoRankBalancingServiceStatus":
        {"balancingDisabled":false,"balancingSuspended":false,"balanceHandlerRunning":true},
        "lexoRankBalancerStatus":{"balancerLocked":true,
        "perFieldStatus":[{"fieldName":"Rank","fieldId":11080,"numRankedIssues":2846758,"percentComplete":98.12,
        "distribution":[2793228,0,53530]}]},
        "totalIssueCount":2846034}
        
      • Note progress stuck at 98%.
      • Please note that JSWSERVER-15703 introduces the back-off mechanism for JIRA DC, so this is expected that Lexorank Rebancing pauses under high load:

        Whenever we detect current node is behind any other node in the cluster by 30 seconds or any other node in the cluster is behind current node by 30 seconds, LexoRank balancing will back-off.

      Expected Results

      Lexorank Rebalancing job is running and has progress.

      Actual Results

      Lexorank Rebalancing job is stuck and has no progress.
      User impact: Lexorank Rebalancing fixes issues with "broken" Rank, if current progress did not fix those broken issues so far, user will not able to rank them.

      Notes

      • Thread dump generated at that time will show the lexorank-executor-thread-0 is WAITING for a long time:
        lexorank-executor-thread-0" #544 prio=5 os_prio=0 tid=0x00007f331c11a800 nid=0xaff5 waiting on condition [0x00007f3236dca000]
           java.lang.Thread.State: WAITING (parking)
        	at sun.misc.Unsafe.park(Native Method)
        	- parking to wait for  <0x00007f3706dffcc8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        	at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.balanceFieldId(LexoRankBalancer.java:363)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.lambda$balanceFieldIds$1(LexoRankBalancer.java:112)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.dt_access$782(LexoRankBalancer.java)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer$$dtt$$Lambda$1518/708308671.run(Unknown Source)
        	at com.atlassian.greenhopper.global.PerformanceLogger$EmptyPerformanceLogger.measure(PerformanceLogger.java:129)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.balanceFieldIds(LexoRankBalancer.java:109)
        	at com.atlassian.greenhopper.service.lexorank.balance.LexoRankScheduledBalanceHandler.run(LexoRankScheduledBalanceHandler.java:60)
        	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        	at java.lang.Thread.run(Thread.java:745)
        
      • Enabling DEBUG logging Lexorank balance (com.atlassian.greenhopper.service.lexorank.balance) doesn't show any progress
        • Expected logging:
          2017-08-25 12:34:24,234 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalancer] Balancing next rank row for rank field [id=10400]
          2017-08-25 12:34:24,234 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Balancing next rank row to migrate from bucket[0] to bucket[1] for rank field[id=10400]
          2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Fetched rows at balance boundary
          2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Fetched rows at balance boundary
          2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation]       rowToMigrate : LexoRankRow{id=9174, fieldId=10400, issueId=20443, lockHash='null', lockTime=null, bucket=0, rank='0|i01itb:', type=ISSUE_RANK_ROW}
          2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation]       rowLastMigrated : LexoRankRow{id=9175, fieldId=10400, issueId=20444, lockHash='null', lockTime=null, bucket=1, rank='1|hzzstz:', type=ISSUE_RANK_ROW}
          2017-08-25 12:34:24,241 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Acquired lock on rows
          2017-08-25 12:34:24,242 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Balancing rank row [type=ISSUE_RANK_ROW, oldRank=0|i01itb:, newRank=1|hzzstr:]
          2017-08-25 12:34:24,248 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Successfully saved rank row
          2017-08-25 12:34:24,248 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Releasing lock
          2017-08-25 12:34:24,255 lexorank-executor-thread-0 DEBUG      [c.a.g.s.lexorank.balance.LexoRankBalancer] ReIndexing issue[id=20443]
          
      • See KB for more details regarding rebalancing: Troubleshooting New Ranking System Issues

      Workaround

      Restart JIRA

            Unassigned Unassigned
            ayakovlev@atlassian.com Andriy Yakovlev [Atlassian]
            Votes:
            22 Vote for this issue
            Watchers:
            33 Start watching this issue

              Created:
              Updated: