- Bug
- Resolution: Fixed
- High
- 8.10.0, 8.13.2, 8.22.2, 8.20.10, 8.20.14
- 8.1
- 16
- Severity 3 - Minor
- 68
Issue Summary
When a node restores the index from another node, the index-fixer process checks the last 24 hours for issues that are not already in the index and reindexes them.
When the index snapshot comes from a node that has just run a full locked reindex, the index-fixer reindexes all issues updated in the last 24 hours, in a single-threaded process.
While this process runs, index replication is kept on hold.
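The selection rule described above can be sketched as follows. This is an illustration only, not Jira's actual implementation: `IssueVersion`, `issues_to_reindex`, and `indexed_ids` are hypothetical names, and the sketch does not model why a snapshot taken right after a full locked reindex causes every recently updated issue to be selected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class IssueVersion:
    # Hypothetical stand-in for a row holding an issue's last update time.
    issue_id: int
    update_time: datetime


def issues_to_reindex(versions, indexed_ids, now, window=timedelta(hours=24)):
    """Return ids of issues updated within `window` that are missing from the index."""
    cutoff = now - window
    return [
        v.issue_id
        for v in versions
        if v.update_time >= cutoff and v.issue_id not in indexed_ids
    ]


now = datetime(2021, 10, 22, 6, 0, tzinfo=timezone.utc)
versions = [
    IssueVersion(1, now - timedelta(hours=1)),   # recent, not indexed
    IssueVersion(2, now - timedelta(days=2)),    # outside the 24-hour window
    IssueVersion(3, now - timedelta(hours=5)),   # recent, already indexed
]

# The selected issues are then processed one at a time (single-threaded),
# so the total time grows with the number of issues updated in the window.
pending = issues_to_reindex(versions, indexed_ids={3}, now=now)
```

With many updates inside the window, the `pending` list is long, which matches the delay reported here.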
Steps to Reproduce
- Stop all nodes
- Start 1 node
- Run a full locked reindex
- Start a second node
Expected Results
After getting the index snapshot from the first node, the second node should be operating normally within a short time.
Actual Results
If there were many issue updates in the last 24 hours, the node becomes usable, but index replication takes a long time to start.
During this time, the index-fixer process appears stuck at 60% in atlassian-jira.log:
2021-10-22 05:58:41,181 ClusterMessageHandlerServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Recovering search indexes - 60% complete... [INDEX-FIXER] Re-indexing issues modified in the last {1 days, 0 hours, 2 minutes, and 38 seconds}. (Versioning short-circuit checks are enabled.)
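To spot this state while monitoring a node, the progress percentage can be pulled out of log lines like the one above. This is a minimal sketch that assumes the message format stays exactly as shown in this report; `index_fixer_progress` is a hypothetical helper, not part of Jira.

```python
import re

# Matches the progress message shown in the log excerpt above.
PROGRESS = re.compile(r"\[INDEX-FIXER\] Recovering search indexes - (\d+)% complete")


def index_fixer_progress(line):
    """Return the reported percentage as an int, or None if the line is unrelated."""
    match = PROGRESS.search(line)
    return int(match.group(1)) if match else None


sample = (
    "2021-10-22 05:58:41,181 ClusterMessageHandlerServiceThread:thread-1 INFO "
    "[c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Recovering search "
    "indexes - 60% complete... [INDEX-FIXER] Re-indexing issues modified in the last "
    "{1 days, 0 hours, 2 minutes, and 38 seconds}."
)
progress = index_fixer_progress(sample)
```

If the same percentage keeps being the latest value reported for a long period, the node is likely still in the single-threaded reindex described in this issue.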
Workaround 1
Before restoring the index, make sure the update time of every entity version is older than the index snapshot creation time minus 24 hours.
Example SQL:
update issue_version set update_time = 'epoch';
update comment_version set update_time = 'epoch';
update worklog_version set update_time = 'epoch';
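The condition in this workaround comes down to one comparison against a cutoff. The sketch below shows that arithmetic, assuming the snapshot creation time and the version update times are already available as timezone-aware datetimes; `safe_cutoff` and `versions_blocking_restore` are hypothetical names used for illustration.

```python
from datetime import datetime, timedelta, timezone


def safe_cutoff(snapshot_created):
    """Update times must be strictly older than this value."""
    return snapshot_created - timedelta(hours=24)


def versions_blocking_restore(update_times, snapshot_created):
    """Return the update times that are NOT older than the cutoff."""
    cutoff = safe_cutoff(snapshot_created)
    return [t for t in update_times if t >= cutoff]


snapshot_created = datetime(2021, 10, 22, 5, 0, tzinfo=timezone.utc)
update_times = [
    datetime(1970, 1, 1, tzinfo=timezone.utc),          # reset to the epoch, as in the SQL above
    datetime(2021, 10, 20, 12, 0, tzinfo=timezone.utc),  # older than the cutoff
    datetime(2021, 10, 21, 23, 0, tzinfo=timezone.utc),  # inside the 24-hour window
]
blocking = versions_blocking_restore(update_times, snapshot_created)
```

An empty `blocking` list means the precondition of the workaround holds for the checked values.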