[JRASERVER-66635] Index Recovery is very slow

Type: Bug
Resolution: Fixed
Priority: Low
Fix Version/s: 9.1.0
Affects Version/s: 7.6.2
Component/s: Data Center - Other, Indexing

Introduced in Version:
7.06
Support reference count:
2
Symptom Severity:
Severity 2 - Major
UIS:
1

Summary

Index recovery is very slow. This is true of index recovery in all scenarios.

Steps to Reproduce

1. In AWS CloudFormation, add an additional node.
2. Observe that index recovery is triggered (log message: "Index backup started. Requesting node: <node>").
3. Observe that index recovery takes more than 30 minutes to begin and about 50 minutes to complete.

Workaround

Recover the index from a snapshot, or perform a full reindex.

is incorporated by: JRASERVER-67732 Copying index snapshots between Data Center nodes should be faster (Closed)

is related to: JRASERVER-66649 Index recovery should use index snapshots first if they are available (Closed)

is related to: JRASERVER-66550 JIRA Datacenter - Add additional Lucene index checks before propagating index to other nodes (Gathering Interest)

is resolved by: ASCI-3


Maciej Swinarski (Inactive) added a comment - 02/Nov/2022 8:48 AM

In Jira 9.x we changed how the index is obtained during node startup. A node that has no fresh index and is unable to rebuild it will look for an index snapshot in the shared home.
More details: https://confluence.atlassian.com/jirakb/index-management-on-jira-start-up-1141500654.html


Matt Doar added a comment - 20/Apr/2018 4:40 PM

Changing our temp dir to be a symlink to an SSD reduced the time it takes us to generate an index backup.


Andriy Yakovlev [Atlassian] added a comment - 19/Jan/2018 9:30 AM - edited

Hey mdoar2, re:

"It is possible in Unix to change jira.install/temp to a soft link to a directory on the jira.home directory disk."

We can also change the location of the temporary directory; see the KB article https://confluence.atlassian.com/jirakb/how-to-change-the-temporary-directory-648545076.html


Matt Doar added a comment - 18/Jan/2018 5:59 PM

That does seem like a long time. When we do this, I notice that the jira.install/temp directory is used to create the snapshot. If the jira.home directory is on a different disk, time is spent transferring data between disks. On Unix it is possible to change jira.install/temp to a soft link to a directory on the same disk as jira.home.
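
One way to check whether the two directories actually sit on different disks is to compare their device IDs. The following is only a minimal Python sketch: the helper name `same_filesystem` is made up for illustration, and the two directories are temporary stand-ins rather than a real Jira layout. To test a real install, you would pass the actual jira.install/temp and jira.home paths.

```python
import os
import tempfile


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths resolve to the same device (st_dev matches).

    os.stat follows symlinks, so a temp dir that is a soft link is judged
    by the device of the directory it points to.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Stand-in directories so the sketch runs anywhere (hypothetical layout).
# For a real check, substitute the install temp dir and the home dir.
with tempfile.TemporaryDirectory() as base:
    temp_dir = os.path.join(base, "temp")
    home_dir = os.path.join(base, "home")
    os.mkdir(temp_dir)
    os.mkdir(home_dir)
    shares_disk = same_filesystem(temp_dir, home_dir)
    print("same filesystem:", shares_disk)
```

If the check returns False for the real directories, the snapshot is being written on one device and then moved to another, which is the cross-disk transfer described above.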


Denise Unterwurzacher [Atlassian] (Inactive) added a comment - 18/Jan/2018 6:48 AM

I've updated the steps to be clearer: this bug relates to adding more than one node, where the new node tries to recover from an existing node with a working index. We had another few goes at this yesterday and noticed that it does eventually work; it just takes 35 minutes to generate the snapshot to be sent to the other node.

17/01/2018	10:39:18.880	10.104.240.60	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index restore started. Snapshot file: IndexSnapshot_10700.zip
17/01/2018	10:39:16.437	10.104.241.177	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index backup complete. Snapshot file: IndexSnapshot_10700.zip
17/01/2018	10:04:44.371	10.104.241.177	INFO	com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer	Index backup started. Requesting node: i-074583accba4d8eef:10

The whole process from start to finish took 50 minutes. Given that restoring the index from a snapshot takes 15 to 30 minutes in our experience, we would always prefer to restore from a snapshot.
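
The snapshot generation time can be read straight off the three log lines above. A minimal Python sketch of that arithmetic, assuming the timestamps are day/month/year (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamp layout assumed from the log excerpt: dd/mm/yyyy hh:mm:ss.mmm
LOG_FORMAT = "%d/%m/%Y %H:%M:%S.%f"


def parse(timestamp: str) -> datetime:
    """Parse one log timestamp into a datetime."""
    return datetime.strptime(timestamp, LOG_FORMAT)


# Values copied from the log lines (listed newest first in the excerpt).
backup_started = parse("17/01/2018 10:04:44.371")
backup_complete = parse("17/01/2018 10:39:16.437")
restore_started = parse("17/01/2018 10:39:18.880")

# Time the existing node spent generating the snapshot.
snapshot_duration = backup_complete - backup_started
# Gap between the snapshot being ready and the new node starting to restore.
handoff = restore_started - backup_complete

print(f"snapshot generation: {snapshot_duration.total_seconds() / 60:.1f} min")
print(f"handoff to restore:  {handoff.total_seconds():.1f} s")
```

This gives about 34.5 minutes for snapshot generation and a handoff of roughly 2.4 seconds, so almost all of the delay before the restore begins is spent building the snapshot on the source node.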

I have raised JRASERVER-66649 for JIRA and CONFSERVER-54806 for Confluence.

