Details

Type: Bug
Resolution: Won't Fix
Priority: Low
Fix Version/s: None
Affects Version/s: 6.4.14
Component/s: Indexing
Labels:

Introduced in Version:
6.04
Support reference count:
13
Symptom Severity:
Severity 1 - Critical
UIS:
152
Bug Fix Policy:
View Atlassian Server bug fix policy

Description

Summary

JIRA needs to obtain the DefaultIndexManager write lock before performing major operations on the Lucene index (full locked reindex, applying an index snapshot, applying an index from another node in JDC).
If a plugin holds the lock, JIRA cannot acquire it for as long as the plugin keeps holding it.
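
The failure mode can be illustrated with a minimal sketch. This is not JIRA's code: it assumes the index lock behaves like a `java.util.concurrent.locks.ReentrantReadWriteLock`, and the class, method, and thread names are hypothetical. One thread stands in for a plugin that takes the write lock and keeps it. A second caller then attempts a timed acquisition, as a reindex would, and gives up when the timeout expires. A long-held read lock would block the writer in the same way.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: all names here are hypothetical, not JIRA classes.
final class IndexLockTimeoutSketch {

    /** Returns true if the write lock was obtained within the timeout while another thread holds it. */
    static boolean tryReindexWhileHeld(long timeoutMillis) {
        ReentrantReadWriteLock indexLock = new ReentrantReadWriteLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Stand-in for a plugin that acquires the lock and does not let go.
        Thread plugin = new Thread(() -> {
            indexLock.writeLock().lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                indexLock.writeLock().unlock();
            }
        }, "hypothetical-plugin-thread");
        plugin.setDaemon(true);
        plugin.start();

        try {
            held.await(); // make sure the "plugin" owns the lock first
            // Timed attempt, comparable to the 30000 ms wait seen in the log.
            boolean acquired = indexLock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                indexLock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the stand-in thread finish
        }
    }

    public static void main(String[] args) {
        boolean acquired = tryReindexWhileHeld(200);
        System.out.println(acquired ? "lock acquired" : "wait attempt timed out");
    }
}
```

In this sketch the timed attempt fails because the holder never releases the lock within the timeout. Lengthening the timeout does not help unless the holder eventually lets go.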

Steps to Reproduce

Install JIRA DC with 3+ nodes (the problem is not specific to JDC).
Install a plugin that uses the DefaultIndexManager lock, e.g. Structure.

Expected Results

JIRA is able to run Index operations.

Actual Results

JIRA is not able to run Index operations.
The following exception is logged in the jira.log file:

2016-10-18 15:38:51,732 ClusterMessageHandlerServiceThread:thread-1 ERROR      [jira.issue.index.DefaultIndexManager] Wait attempt timed out - waited 30000 milliseconds
com.atlassian.jira.issue.index.IndexException: Wait attempt timed out - waited 30000 milliseconds
	at com.atlassian.jira.issue.index.DefaultIndexManager.obtain(DefaultIndexManager.java:936)
	at com.atlassian.jira.issue.index.DefaultIndexManager.access$900(DefaultIndexManager.java:97)
	at com.atlassian.jira.issue.index.DefaultIndexManager$IndexLock.tryLock(DefaultIndexManager.java:1298)
	at com.atlassian.jira.issue.index.DefaultIndexManager.withReindexLock(DefaultIndexManager.java:375)
...
	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:124)
	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:175)
	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.receive(DefaultIndexCopyService.java:195)
	at com.atlassian.jira.cluster.OfBizMessageHandlerService.sendLocalFromNode(OfBizMessageHandlerService.java:260)
	at com.atlassian.jira.cluster.OfBizMessageHandlerService.handleReceivedMessages(OfBizMessageHandlerService.java:153)
	at com.atlassian.jira.cluster.OfBizMessageHandlerService.access$000(OfBizMessageHandlerService.java:34)
	at com.atlassian.jira.cluster.OfBizMessageHandlerService$1.run(OfBizMessageHandlerService.java:59)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2016-10-18 15:38:52,495 ClusterMessageHandlerServiceThread:thread-1 ERROR      [atlassian.jira.cluster.OfBizMessageHandlerService] There was a problem handling a cluster message
java.lang.RuntimeException: com.atlassian.jira.issue.index.IndexException: Failed to acquire reindex lock
	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:179)
	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.receive(DefaultIndexCopyService.java:195)
...
Caused by: com.atlassian.jira.issue.index.IndexException: Failed to acquire reindex lock
	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:126)
	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:175)
	... 12 more

Notes

This specific error was caused by a bug in Structure 3.3.1 that was fixed in 3.3.3 (see the Structure 3.3.3 Release Notes). All JIRA DC customers should update to the latest Structure version.
See the related suggestion ticket to improve logging: ~~JRA-63188~~
Almworks refactored the code in Structure 3.4 and switched to optimistic locking (no lock is held). That should prevent other lock-related problems in the future.
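
For readers unfamiliar with the term, the general idea of optimistic locking can be sketched as follows. This is only an illustration of the technique under assumed names. It is not Structure's implementation, whose details are not described in this ticket. The reader holds no lock. It records a version counter, does its work, and accepts the result only if the counter is unchanged afterwards. Otherwise it retries.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongFunction;

// Illustrative only: a version-check retry loop, with hypothetical names.
final class OptimisticIndexReadSketch {
    private final AtomicLong indexVersion = new AtomicLong();

    /** Called by whoever modifies the index (for example, after a reindex). */
    void indexChanged() {
        indexVersion.incrementAndGet();
    }

    /**
     * Runs the work without holding any lock, then validates that the index
     * version did not change in the meantime. Retries up to maxAttempts times.
     */
    <T> T readOptimistically(LongFunction<T> work, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long before = indexVersion.get();
            T result = work.apply(before);
            if (indexVersion.get() == before) {
                return result; // nothing changed while we were working
            }
            // The index changed underneath us: discard the result and retry.
        }
        throw new IllegalStateException("index kept changing; gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        OptimisticIndexReadSketch sketch = new OptimisticIndexReadSketch();
        System.out.println(sketch.readOptimistically(v -> "read at version " + v, 3));
    }
}
```

Because the reader never owns a lock, it cannot keep a writer from acquiring one. The cost is that work may be repeated when the index changes mid-read.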

Workaround

Disable plugins one by one, restarting JIRA after each, to identify the plugin that holds the lock.

Note on Won't Fix

Update: 2017-04-20
The dev team reviewed the bug. At this point, implementing a proper fix (introducing a timeout for plugin operations) would be time-consuming and risky, because interrupting a plugin could damage the JIRA index.
The following changes related to this problem have been made:

Almworks changed their code in Structure and switched to optimistic locking.
Additional logging was implemented and is enabled by default in recent versions of JIRA (see ~~JRASERVER-63188~~).

With all that said, we have decided to stop working on this and mark it as Won't Fix. We will reopen it if the problem recurs and gains more traction.

Issue Links

is related to

JRASERVER-63188 As an JIRA Administrator I want to have debug logging for DefaultIndexManager lock (writeLock)

Closed

relates to

JRASERVER-63099 JIRA doesn't detect and report problem caused during reindexing

Closed

JIRA is not able to obtain DefaultIndexManager write lock due to plugin holding it