Loading...

Type: Bug
Resolution: Fixed
Priority: Low
Fix Version/s: 10.0.0
Affects Version/s: 9.3.0, 9.4.6, 9.4.9
Component/s: Data Center - Index, Indexing
Labels:
None

Introduced in Version:
9.03
Support reference count:
8
Symptom Severity:
Severity 3 - Minor
UIS:
14
Bug Fix Policy:
View Atlassian Server bug fix policy

Issue Summary

With the publishing of ~~JRASERVER-74232~~, Jira 9.3+ now catches-up the Index changes of Issues through multiple Threads during startup.
It makes use of the same Thread pool and batch configurations as in a Full Reindex, considerably improving startup time.

Comments and Worklogs Index catch-up, though, are still processed in a single-thread during startup. For some instances, this still may account for a lot of startup time.

This particularly affects customers who recycle nodes through Docker or Kubernetes, as the local indexes are usually wiped clean and the nodes rely on shared home Index Snapshots during startup.

This is reproducible on Data Center: yes

Steps to Reproduce

Have the cluster generate an Index Snapshot to the shared home
Add hundreds of thousands of comments to thousands of Issues
Startup (or restart) a Jira node with no local Index folders
Monitor how long it takes to catch-up with the changes since the last available Index Snapshot during startup

Expected Results

Startup will be quick, as since ~~JRASERVER-74232~~ (9.3.0) the catch-up is multi-threaded.

Actual Results

We observe the Issue changes catch-up since the last Snapshot are multi-threaded, but the Comments and Worklog catch-up are single-threaded, happening on the main Thread.

During startup, we can see a pattern similar to this in the atlassian-jira.log:

grep -e "exclusive use" -e "Warmed cache" application-logs/atlassian-jira.log

2023-08-25 20:43:27,223 main INFO      [c.a.jira.startup.JiraHomeStartupCheck] The jira.home directory '/var/atlassian/application-data/jira' is validated and locked for exclusive use by this instance.
2023-08-25 21:48:55,366 Caesium-1-3 INFO Anonymous user     [c.a.jira.startup.CacheWarmerLauncher] Warmed cache(s) in 16907 ms.

Start command was issued at 20:43:27 and startup was completed at 21:48:55, in this example.

If we search for INDEX-FIXER, we'll be able to see the progress and spot a long time on either Comments or Worklogs:

grep "INDEX-FIXER" application-logs/atlassian-jira.log

2023-08-25 20:52:28,001 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] We will restore index snapshot with 6284212 issues
2023-08-25 20:52:28,014 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Restoring search indexes from snapshot and catching up with changes - 1% complete... Starting
2023-08-25 20:52:28,014 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Starting the index recovery process by replacing existing indexes. For the whole time of recovering the index (restoring and catch up) we will be holding the index write lock
2023-08-25 20:52:28,152 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Done replacing existing indexes. We will continue with the index catch up.
2023-08-25 20:52:28,153 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Restoring search indexes - 30% complete... Restored index backup
2023-08-25 20:52:28,525 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Latest index date: {2023-08-25 01:00:00}, Latest DB issue-version date: {2023-08-25 16:52:28}
2023-08-25 20:52:28,526 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Re-indexing issues updated in the last {1 days, 19 hours, 52 minutes, and 28 seconds}. (Note: it's an intentionally wider range)
2023-08-25 20:52:42,209 IssueIndexer:thread-41 INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Updating index with recent changes - 31% complete... Processing issue changes is 1% complete

*** lines removed for readability ***

2023-08-25 20:53:57,331 IssueIndexer:thread-14 INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Updating index with recent changes - 80% complete... Processing issue changes is 99% complete
2023-08-25 21:45:04,959 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Updating index with recent changes - 85% complete... Reindex of comments completed.
2023-08-25 21:47:25,051 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Updating index with recent changes - 90% complete... Reindex of worklogs completed.
2023-08-25 21:47:25,052 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Catch up - 95% complete... Done catching up. Potentially outdated issues were updated in the index. Any deleted issues (plus comments & worklogs) in this range were removed from the index. 143614 issues, 358054 comments and 13267 worklogs scanned. 81757 issues, 195206 comments and 8059 worklogs reindexed. 4 issues, 475 comments and 10 worklogs deindexed. It took 4.343 s to read from DB, 54.87 min to reindex, 55.85 ms to deindex, 54.94 min in total
2023-08-25 21:47:25,053 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Recovering - 99% complete... Done catching up
2023-08-25 21:47:32,597 main INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] [INDEX-FIXER] Recovering - 100% complete... Recovered all indexes

In this example:

The total number of Issues in the instance is 6284212
Issue Index catch-up was done in ~2 minutes (from 20:52:42 to 20:53:57)
Comments catch up was done in ~ 52 minutes (from 20:53:57 to 21:45:04)
Worklog catch-up was done in ~2 minutes (from 21:45:04 to 21:47:25)
143614 issues, 358054 comments and 13267 worklogs scanned
81757 issues, 195206 comments and 8059 worklogs reindexed

Workaround

1. Having a very recent Index Snapshot

Having an Index Snapshot taken as close as possible to the startup would mitigate the startup delay, as Jira would have less changes to catch-up to.

Since 9.7 Jira has the One-Click Index Snapshot feature (~~JRASERVER-66510~~) that allows an Admin to create an Index Snapshot at any moment — without the need to update the scheduler and wait for the next interval.
As for the Long-Term-Support (LTS) releases, it'll be available on 9.12.

2. Optimizing DB speed

Making sure the DB tables for Comments and Worklogs are optimized is also a mitigation for this:

jiraaction
comment_version

is related to

JRASERVER-77905 Search Request (filter) catch up isn't multi-threaded when restoring snapshot

Gathering Impact

relates to

JRASERVER-77906 During Jira startup the index fixer unhelpfully adds a range of 24 hours to catch-up modifications

Gathering Impact

JRASERVER-74232 Make index catch up during startup multi-threaded

Closed

PSR-1046 Loading...

mentioned in: Page Loading...; Page Loading...; Page Loading...; Page Loading...; Page Loading...; Page Loading...

(5 mentioned in)

Details

Description

Issue Summary

Steps to Reproduce

Expected Results

Actual Results

Workaround

1. Having a very recent Index Snapshot

2. Optimizing DB speed

Attachments

Issue Links

Forms

Activity

People

Dates