  1. Jira Data Center
  2. JRASERVER-72944

Restoring an index snapshot after a full re-index might trigger the index fixer, delaying the node start up

Details

    • 8.1
    • 16
    • Severity 3 - Minor
    • 68
      Atlassian Update – 21 Mar 2022

      Hi everyone,

      In Jira 9.0, we ensure that every entity (issue, comment, worklog, and change history) has a version. After upgrading to this version, all missing versions will be added as part of an upgrade task.

      We have also changed how index snapshots are made. If a Jira instance does not have a consistent index, it will not make a snapshot, so a snapshot with a corrupted index will not be placed in the shared home directory.

      More details on how to handle an inconsistent index can be found here: https://confluence.atlassian.com/jirakb/indexing-inconsistency-troubleshooting-1114800953.html

      Thank you,
      Michał Błajet,
      Jira Developer


    Description

      Issue Summary

      When a node restores the index from another node, the index-fixer process checks the last 24 hours for issues that are not already in the index and reindexes them.

      When the index snapshot comes from a node that has just run a full locked reindex, the index-fixer reindexes all issues updated in the last 24 hours, in a single-threaded process.
      While this process runs, index replication is kept on hold.
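
      As a rough illustration of the selection described above, the following is a minimal sketch, not Jira's actual implementation. The function name, the dictionary of update times, and the set of indexed IDs are all hypothetical; only the 24-hour window comes from this report.

```python
from datetime import datetime, timedelta

# Window described in the issue summary; everything else here is hypothetical.
RECOVERY_WINDOW = timedelta(hours=24)


def issues_to_reindex(db_updates, indexed_ids, now):
    """Return IDs updated within the window that are missing from the index.

    db_updates: dict mapping issue ID -> last update time (assumed shape).
    indexed_ids: set of issue IDs already present in the restored index.
    now: the time at which the check runs.
    """
    window_start = now - RECOVERY_WINDOW
    return sorted(
        issue_id
        for issue_id, updated in db_updates.items()
        if updated >= window_start and issue_id not in indexed_ids
    )
```

      In this simplified model, the work grows with the number of issues updated in the window, and the candidates are processed one after another, which matches the single-threaded behaviour reported here.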
       

      Steps to Reproduce

      1. Stop all nodes
      2. Start 1 node
      3. Run a full locked reindex
      4. Start a second node

      Expected Results

      After getting the index snapshot from the first node, the node should operate normally in a short time.

      Actual Results

      If there were many issue updates in the last 24 hours, the node will become usable, but the issue replication takes a long time to start.

      During this time, the index-fixer process appears stuck at 60% in atlassian-jira.log:

      2021-10-22 05:58:41,181 ClusterMessageHandlerServiceThread:thread-1 INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Recovering search indexes - 60% complete... [INDEX-FIXER] Re-indexing issues modified in the last {1 days, 0 hours, 2 minutes, and 38 seconds}. (Versioning short-circuit checks are enabled.)
      

      Workaround 1

      Before restoring the index, make sure the update time of every entity version is older than the index snapshot creation time minus 24 hours.
      Example SQL:

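      -- In PostgreSQL, 'epoch' is a special timestamp value (1970-01-01 00:00:00 UTC); other databases need an explicit timestamp literal.
      update issue_version set update_time = 'epoch';
      update comment_version set update_time = 'epoch';
      update worklog_version set update_time = 'epoch';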
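
      The precondition for this workaround can be expressed as a simple cutoff check. This is a minimal sketch under stated assumptions: the function name and the input dictionary are hypothetical, and the caller is assumed to have already read the update times from the version tables and to know the snapshot creation time.

```python
from datetime import datetime, timedelta


def blocking_versions(version_update_times, snapshot_created):
    """Return the entries that are NOT older than the cutoff.

    version_update_times: dict mapping an entity label -> update time
        (hypothetical shape, e.g. values read from issue_version).
    snapshot_created: creation time of the index snapshot.
    The cutoff is the snapshot creation time minus 24 hours.
    """
    cutoff = snapshot_created - timedelta(hours=24)
    return {
        label: updated
        for label, updated in version_update_times.items()
        if updated >= cutoff
    }
```

      An empty result means every checked version is older than the cutoff, which is the condition the workaround asks for.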

            People

              mswinarski Maciej Swinarski (Inactive)
              5fb7769fcbc7 Allan Gandelman
              Votes: 6
              Watchers: 28