Summary

      Index recovery is very slow in every scenario that triggers it.

      Steps to Reproduce

      1. In AWS CloudFormation, add an additional node
      2. Observe that index recovery is triggered (log message: Index backup started. Requesting node: <node>)
      3. Observe that index recovery takes more than 30 minutes to begin and about 50 minutes to complete

      Workaround

      Recover the index from a snapshot, or perform a full reindex.

            [JRASERVER-66635] Index Recovery is very slow

            Maciej Swinarski (Inactive) added a comment -

            In Jira 9.x we changed how the index is obtained during node startup. A node that has no fresh index and is unable to rebuild it will look for an index snapshot in the shared home.
            More details: https://confluence.atlassian.com/jirakb/index-management-on-jira-start-up-1141500654.html

            Matt Doar added a comment -

            Changing our temp dir to be a symlink to an SSD improved the time it takes for us to generate an index backup

            Andriy Yakovlev [Atlassian] added a comment - edited

            Hey mdoar2, re:

            "It is possible in Unix to change jira.install/temp to a soft link to a directory on the jira.home directory disk."

            We can also change the location of the temporary directory; see the KB article https://confluence.atlassian.com/jirakb/how-to-change-the-temporary-directory-648545076.html

            Matt Doar added a comment -

            That does seem a long time. When we do this, I notice that the jira.install/temp directory is used to create the snapshot. If the jira.home directory is on a different disk this can lead to time spent transferring data between disks. It is possible in Unix to change jira.install/temp to a soft link to a directory on the jira.home directory disk.
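
            The soft-link approach described above could be sketched roughly as follows. This is only an illustration, not an Atlassian-provided tool: the function names `on_same_device` and `link_temp_to_home_disk` are made up for this example, the paths you pass in depend on your own installation, and it assumes Jira is stopped while the temp directory is swapped. It first checks whether two directories live on the same filesystem (by comparing device IDs), then replaces the temp directory with a symlink.

```python
import os
from pathlib import Path


def on_same_device(path_a: Path, path_b: Path) -> bool:
    """Return True if both paths resolve to the same filesystem device.

    os.stat follows symlinks, so a temp directory that is already a
    symlink is judged by the disk it actually points to.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def link_temp_to_home_disk(temp_dir: Path, target_dir: Path) -> None:
    """Replace temp_dir with a symlink pointing at target_dir.

    Hypothetical helper: an existing real directory is renamed to
    '<name>.orig' rather than deleted, so the change can be undone.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    if temp_dir.is_symlink():
        # Drop a previous link so it can be re-pointed.
        temp_dir.unlink()
    elif temp_dir.exists():
        # Keep the old directory as a backup instead of removing it.
        temp_dir.rename(temp_dir.with_name(temp_dir.name + ".orig"))
    temp_dir.symlink_to(target_dir, target_is_directory=True)
```

            In practice you would call `on_same_device` with your install temp directory and your jira.home directory; if it returns False, `link_temp_to_home_disk` can point the temp directory at a folder on the jira.home disk. Whether this actually shortens snapshot generation depends on the disks involved.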

            Denise Unterwurzacher [Atlassian] (Inactive) added a comment -

            I've updated the steps to be clearer. This bug relates to adding more than one node: the new node tries to recover from an existing node with a working index. We had another few goes at this yesterday and noticed that it does eventually work; it just takes 35 minutes to generate the snapshot to be sent to the other node.

            17/01/2018	10:39:18.880	10.104.240.60	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index restore started. Snapshot file: IndexSnapshot_10700.zip
            17/01/2018	10:39:16.437	10.104.241.177	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index backup complete. Snapshot file: IndexSnapshot_10700.zip
            17/01/2018	10:04:44.371	10.104.241.177	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index backup started. Requesting node: i-074583accba4d8eef:10

            The whole process from start to finish took 50 minutes. Given that restoring the index from a snapshot takes 15 to 30 minutes in our experience, we would always prefer to do the latter.

            I have raised JRASERVER-66649 for JIRA and CONFSERVER-54806 for Confluence.
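
            For anyone wanting to check the timings in the log excerpt, a minimal Python sketch is below. It assumes the timestamps are day-first (17 January 2018) and that all three lines come from the same day; the helper name `parse_log_time` is made up for this example. The timestamp strings are copied from the log lines in the comment.

```python
from datetime import datetime

# Day-first date followed by a time with millisecond precision.
LOG_TIME_FORMAT = "%d/%m/%Y %H:%M:%S.%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a 'DD/MM/YYYY HH:MM:SS.mmm' timestamp from the log excerpt."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# Timestamps taken from the three log lines quoted in the comment.
backup_started = parse_log_time("17/01/2018 10:04:44.371")
backup_complete = parse_log_time("17/01/2018 10:39:16.437")
restore_started = parse_log_time("17/01/2018 10:39:18.880")

# Time the existing node spent generating the snapshot.
backup_duration = backup_complete - backup_started
# Gap between the snapshot being ready and the new node starting its restore.
handoff_delay = restore_started - backup_complete

print(f"Snapshot generation: {backup_duration.total_seconds() / 60:.1f} min")
print(f"Backup complete to restore start: {handoff_delay.total_seconds():.3f} s")
```

            Snapshot generation comes out at roughly 34.5 minutes, in line with the "35 minutes" quoted, and the restore starts about 2.4 seconds after the backup completes. The excerpt does not include a restore-complete line, so the 50-minute total cannot be checked from these three entries alone.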

              mswinarski Maciej Swinarski (Inactive)
              dunterwurzacher Denise Unterwurzacher [Atlassian] (Inactive)