Type: Bug
Resolution: Unresolved
Priority: Medium
Fix Version/s: None
Affects Version/s: 7.3.0, 7.4.0, 7.2.8
7.02
69
Severity 2 - Major
42
Summary
The LexoRank rebalancing job may get stuck in some cases: no progress is visible for many hours, or even days. The current status can be checked on the LexoRank management page (see LexoRank Management).
- Example output of a REST call to /rest/greenhopper/1.0/lexorank/balance:
{
  "lexoRankBalancingServiceStatus": {
    "balancingDisabled": false,
    "balancingSuspended": false,
    "balanceHandlerRunning": true
  },
  "lexoRankBalancerStatus": {
    "balancerLocked": true,
    "perFieldStatus": [
      {
        "fieldName": "Rank",
        "fieldId": 11080,
        "numRankedIssues": 2846758,
        "percentComplete": 98.12,
        "distribution": [2793228, 0, 53530]
      }
    ]
  },
  "totalIssueCount": 2846034
}
- Note that progress is stuck at 98%.
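To monitor for this condition, the balance-status payload above can be polled and compared between two runs. A minimal sketch, assuming the response shape shown in this ticket (the stall heuristic of "percentComplete did not move between polls" is my own, not part of the product):

```python
import json

# Sample response from /rest/greenhopper/1.0/lexorank/balance (taken from this ticket).
SAMPLE = '''{"lexoRankBalancingServiceStatus": {"balancingDisabled": false,
 "balancingSuspended": false, "balanceHandlerRunning": true},
 "lexoRankBalancerStatus": {"balancerLocked": true, "perFieldStatus":
 [{"fieldName": "Rank", "fieldId": 11080, "numRankedIssues": 2846758,
   "percentComplete": 98.12, "distribution": [2793228, 0, 53530]}]},
 "totalIssueCount": 2846034}'''

def field_progress(payload: str):
    """Return (fieldName, percentComplete) pairs from a balance-status payload."""
    status = json.loads(payload)
    return [(f["fieldName"], f["percentComplete"])
            for f in status["lexoRankBalancerStatus"]["perFieldStatus"]]

def looks_stuck(previous: float, current: float) -> bool:
    """Heuristic: the rebalance looks stuck if percentComplete has not
    advanced since the previous poll."""
    return current <= previous

print(field_progress(SAMPLE))     # [('Rank', 98.12)]
print(looks_stuck(98.12, 98.12))  # True
```

Running this against two polls taken a few minutes apart would flag the 98% stall described above.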
- Please note that JSWSERVER-15703 introduces a back-off mechanism for Jira DC, so it is expected that LexoRank rebalancing pauses under high load: whenever the current node is detected to be behind any other node in the cluster by 30 seconds, or any other node in the cluster is behind the current node by 30 seconds, LexoRank balancing backs off.
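The 30-second rule above can be illustrated as follows. This is a hypothetical sketch, not Jira's actual implementation; the node names and the per-node index-time map are invented for the example:

```python
# Illustrative back-off decision per JSWSERVER-15703: if any node lags any
# other node by more than the threshold (in either direction), balancing
# should pause. Values are "seconds of index replay" per node (hypothetical).
BACKOFF_THRESHOLD_SECS = 30

def should_back_off(node_index_times: dict) -> bool:
    """Back off when the spread between the most and least up-to-date
    node exceeds the threshold."""
    times = node_index_times.values()
    return max(times) - min(times) > BACKOFF_THRESHOLD_SECS

# node2 is 35 seconds behind node1, so balancing backs off.
print(should_back_off({"node1": 1000, "node2": 965}))  # True
# A 20-second spread is within tolerance, so balancing continues.
print(should_back_off({"node1": 1000, "node2": 980}))  # False
```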
Expected Results
The LexoRank rebalancing job runs and makes steady progress.
Actual Results
The LexoRank rebalancing job is stuck and makes no progress.
User impact: LexoRank rebalancing repairs issues whose Rank is "broken"; as long as rebalancing has not yet reached those issues, users are unable to rank them.
Notes
- A thread dump generated at that time shows lexorank-executor-thread-0 in WAITING state for a long time:
"lexorank-executor-thread-0" #544 prio=5 os_prio=0 tid=0x00007f331c11a800 nid=0xaff5 waiting on condition [0x00007f3236dca000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00007f3706dffcc8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.balanceFieldId(LexoRankBalancer.java:363)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.lambda$balanceFieldIds$1(LexoRankBalancer.java:112)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.dt_access$782(LexoRankBalancer.java)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer$$dtt$$Lambda$1518/708308671.run(Unknown Source)
        at com.atlassian.greenhopper.global.PerformanceLogger$EmptyPerformanceLogger.measure(PerformanceLogger.java:129)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankBalancer.balanceFieldIds(LexoRankBalancer.java:109)
        at com.atlassian.greenhopper.service.lexorank.balance.LexoRankScheduledBalanceHandler.run(LexoRankScheduledBalanceHandler.java:60)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
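The trace parks inside LinkedBlockingQueue.put, which blocks indefinitely once the bounded queue is full and nothing drains it (consistent with JSWSERVER-16072, where the indexing consumer fails to finish). A minimal sketch of that failure mode using Python's bounded queue, assuming this is the mechanism at play:

```python
import queue

# The balancer thread parks in LinkedBlockingQueue.put: the bounded queue is
# full and the consumer that should drain it has stopped. Reproduce the same
# condition with a bounded queue and no consumer.
q = queue.Queue(maxsize=2)
q.put("reindex issue 1")
q.put("reindex issue 2")  # queue is now full; no consumer is running

try:
    # put() with no timeout would park forever, exactly like the balancer
    # thread in the dump; put_nowait() surfaces the condition as an exception.
    q.put_nowait("reindex issue 3")
except queue.Full:
    print("producer would block: queue is full and nothing is draining it")
```

A blocking put() with a timeout (and a retry or abort on expiry) is the usual way to avoid parking a worker thread forever in this situation.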
- Enabling DEBUG logging for LexoRank balancing (com.atlassian.greenhopper.service.lexorank.balance) does not show any progress.
- Expected logging:
2017-08-25 12:34:24,234 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalancer] Balancing next rank row for rank field [id=10400]
2017-08-25 12:34:24,234 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Balancing next rank row to migrate from bucket[0] to bucket[1] for rank field[id=10400]
2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Fetched rows at balance boundary
2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Fetched rows at balance boundary
2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] rowToMigrate : LexoRankRow{id=9174, fieldId=10400, issueId=20443, lockHash='null', lockTime=null, bucket=0, rank='0|i01itb:', type=ISSUE_RANK_ROW}
2017-08-25 12:34:24,235 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] rowLastMigrated : LexoRankRow{id=9175, fieldId=10400, issueId=20444, lockHash='null', lockTime=null, bucket=1, rank='1|hzzstz:', type=ISSUE_RANK_ROW}
2017-08-25 12:34:24,241 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Acquired lock on rows
2017-08-25 12:34:24,242 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Balancing rank row [type=ISSUE_RANK_ROW, oldRank=0|i01itb:, newRank=1|hzzstr:]
2017-08-25 12:34:24,248 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Successfully saved rank row
2017-08-25 12:34:24,248 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Releasing lock
2017-08-25 12:34:24,255 lexorank-executor-thread-0 DEBUG [c.a.g.s.lexorank.balance.LexoRankBalancer] ReIndexing issue[id=20443]
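A log tail can be checked mechanically for this kind of progress. A small sketch that extracts the issue ids from "ReIndexing issue[id=...]" entries, assuming the DEBUG format shown in the sample above:

```python
import re

# When balancing makes progress, the DEBUG log emits "ReIndexing issue[id=...]"
# entries. Extract the ids so a log tail can be checked for recent progress.
REINDEX = re.compile(r"ReIndexing issue\[id=(\d+)\]")

def reindexed_issue_ids(log_text: str) -> list:
    """Return all issue ids re-indexed by the balancer in the given log text."""
    return [int(m) for m in REINDEX.findall(log_text)]

sample = ("2017-08-25 12:34:24,255 lexorank-executor-thread-0 DEBUG "
          "[c.a.g.s.lexorank.balance.LexoRankBalancer] ReIndexing issue[id=20443]")
print(reindexed_issue_ids(sample))  # [20443]
```

An empty result over a recent log window matches the "no progress" symptom described in this ticket.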
- See the KB article Troubleshooting New Ranking System Issues for more details on rebalancing.
Workaround
Restart JIRA
- is duplicated by
  - JSWSERVER-16072 LexoRank can freeze when it fails to finish IndexingRunnable (Closed)
- is related to
  - JSWSERVER-16542 During rebalancing ranking operation fails to save new rank due to slow DB operation (Gathering Impact)
- relates to
  - JSWSERVER-20691 Lexorank affects issue creation time (Long Term Backlog)
  - JSWDC-21