- Suggestion
- Resolution: Unresolved
Problem Definition
If a Jira Data Center node has been down for some time, but not long enough to receive a full index, it tries to catch up on the replication queue from the other nodes.
This is done in NodeReindexServiceThread, stack example:
NodeReindexServiceThread:thread-1
    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    ...
    com.atlassian.jira.database.DatabaseAccessorImpl.executeQuery(DatabaseAccessorImpl.java:80)
    com.atlassian.jira.database.DatabaseAccessorImpl.runInTransaction(DatabaseAccessorImpl.java:93)
    com.atlassian.jira.database.DefaultQueryDslAccessor.executeQuery(DefaultQueryDslAccessor.java:65)
    com.atlassian.jira.issue.worklog.DatabaseWorklogStore.getByIssue(DatabaseWorklogStore.java:188)
    com.atlassian.jira.issue.worklog.DefaultWorklogManager.getByIssue(DefaultWorklogManager.java:103)
    ....
    com.atlassian.jira.issue.index.DefaultIssueDocumentFactory.apply(DefaultIssueDocumentFactory.java:50)
    com.atlassian.jira.issue.index.DefaultIssueDocumentFactory.apply(DefaultIssueDocumentFactory.java:30)
    com.atlassian.jira.issue.index.DefaultIssueIndexer$DefaultDocumentCreationStrategy.get(DefaultIssueIndexer.java:556)
    ...
    com.atlassian.jira.issue.util.IssueObjectIssuesIterable.foreach(IssueObjectIssuesIterable.java:24)
    com.atlassian.jira.issue.index.DefaultIssueIndexer.perform(DefaultIssueIndexer.java:282)
    com.atlassian.jira.issue.index.DefaultIssueIndexer.reindexIssues(DefaultIssueIndexer.java:162)
    com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:571)
    com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssueObjects(DefaultIndexManager.java:468)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:498)
    com.atlassian.jira.config.component.SwitchingInvocationHandler.invoke(SwitchingInvocationHandler.java:22)
    com.sun.proxy.$Proxy17.reIndexIssueObjects(Unknown Source)
    com.atlassian.jira.index.ha.DefaultNodeReindexService.updateIssueIndex(DefaultNodeReindexService.java:404)
    com.atlassian.jira.index.ha.DefaultNodeReindexService.updateAffectedIndexes(DefaultNodeReindexService.java:298)
    com.atlassian.jira.index.ha.DefaultNodeReindexService.reIndex(DefaultNodeReindexService.java:252)
You may not want to wait for replication to finish, for example when you intend to run a full reindex on this node anyway.
If replication is not paused, the full reindex process (DefaultIndexManager) has to compete with NodeReindexServiceThread for the lock, and it may fail with a timeout:
2018-02-28 12:24:18,466 JiraTaskExectionThread-2 ERROR admin 743x1093x1 ykx2f7 10.1.1.1 /secure/admin/jira/IndexReIndex.jspa [c.a.j.issue.index.DefaultIndexManager] Wait attempt timed out - waited 30000 milliseconds
com.atlassian.jira.issue.index.IndexException: Wait attempt timed out - waited 30000 milliseconds
...
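The contention described above can be illustrated with a toy model. This is only a sketch of the general pattern (one thread holds a lock while another waits with a timeout), not Jira's actual implementation; `index_lock`, `catch_up`, and `full_reindex` are hypothetical names, and the wait times are shortened for illustration.

```python
import threading
import time

# Hypothetical stand-in for the lock both processes need.
index_lock = threading.Lock()


def catch_up(hold_seconds: float, started: threading.Event) -> None:
    """Simulates the replication catch-up thread holding the lock."""
    with index_lock:
        started.set()  # signal that the lock is now held
        time.sleep(hold_seconds)


def full_reindex(wait_seconds: float) -> str:
    """Simulates a full reindex that gives up if the lock is not free in time."""
    if not index_lock.acquire(timeout=wait_seconds):
        waited_ms = int(wait_seconds * 1000)
        raise TimeoutError(f"Wait attempt timed out - waited {waited_ms} milliseconds")
    try:
        return "reindexed"
    finally:
        index_lock.release()


if __name__ == "__main__":
    started = threading.Event()
    worker = threading.Thread(target=catch_up, args=(0.5, started))
    worker.start()
    started.wait()
    try:
        full_reindex(0.1)  # lock is still held, so this times out
    except TimeoutError as exc:
        print(exc)
    worker.join()
    print(full_reindex(0.1))  # lock is free once catch-up has finished
```

In this model the reindex succeeds only once the catch-up thread has released the lock, which is why pausing or waiting for replication avoids the failure.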
Suggested Solution
Add an option to pause NodeReindexServiceThread so that replication catch-up can be suspended.
Workaround
- Wait until replication is done (you can use the REST API to check the queue length), and then run the reindex.
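The workaround could be scripted roughly as follows. This is a minimal sketch with several assumptions: that the node exposes an index summary at `/rest/api/2/index/summary`, that the response contains a `replicationQueues` object whose entries carry a `queueSize` field, and that basic authentication is accepted. Verify all three against your Jira version before relying on it. The function names are hypothetical.

```python
import base64
import json
import time
import urllib.request
from typing import Callable, Mapping


def fetch_summary_via_rest(base_url: str, user: str, password: str) -> Mapping:
    """Fetches the index summary from a node (endpoint path is an assumption)."""
    request = urllib.request.Request(f"{base_url}/rest/api/2/index/summary")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def total_queue_size(summary: Mapping) -> int:
    """Sums queueSize over all replication queues (field names are assumptions)."""
    queues = summary.get("replicationQueues", {})
    return sum(int(queue.get("queueSize", 0)) for queue in queues.values())


def wait_for_replication(
    fetch_summary: Callable[[], Mapping],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Polls until all queues are empty; returns False if the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if total_queue_size(fetch_summary()) == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

A call might look like `wait_for_replication(lambda: fetch_summary_via_rest("https://jira.example.com", "admin", "secret"))`, where the URL and credentials are placeholders; start the reindex only if it returns `True`.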
- relates to
- JRASERVER-70423 DC index replication delays are affecting end-users due to single thread processing and re-computation of CF values (Closed)
- JRASERVER-66916 Ability to disable automatic index copy from another node during startup (Gathering Interest)