[JRASERVER-76819] Increase the default size of the Full Reindex batch

Type: Suggestion
Resolution: Unresolved
Fix Version/s: None
Component/s: Custom Field Indexing, Data Center - Index, Indexing
Labels: None

Feedback Policy:

We collect Jira feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

Jira 8 and 9 ship with a default value of 4000 for the "Reindex batch".

With the default 20 Index Threads, each Thread iterates over the batch about 4 times, assuming they all progress at the same pace: 20 Threads × 50 Issues pulled per iteration = 1,000 Issues per round, and 4,000 / 1,000 = 4.
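The arithmetic above can be sketched as a small calculation. This is only an illustration, not Jira code: the function name and its parameters are hypothetical, and the 50-Issue chunk size is the figure quoted in this description.

```python
def iterations_per_thread(batch_size: int, threads: int,
                          issues_per_iteration: int = 50) -> float:
    """Average number of 50-Issue pulls each Thread makes per batch,
    assuming the work is spread evenly across all Threads."""
    return batch_size / (threads * issues_per_iteration)


# Default batch (4,000) versus a larger batch (40,000), both with 20 Threads.
print(iterations_per_thread(4000, 20))
print(iterations_per_thread(40000, 20))
```

With 20 Threads, the default batch gives each Thread 4 pulls before the batch has to be reloaded, while a 40,000 batch gives each Thread 40.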

A Jira Full Reindex would be quicker if this batch were much larger. Customers have reported considerable Full Reindex speed improvements when configuring batches of 40,000, 80,000, or even higher.

jira.index.background.batch.size = 40000
jira.index.issue.maxqueuesize = 40000
jira.index.sharedentity.maxqueuesize = 40000

This is especially important for customers who increase the Index Thread pool.

If a customer bumps the Index Threads to 80, each Thread runs only one iteration per batch, as 80 × 50 = 4,000. Because the JiraTask Thread waits for the batch to be depleted and for all Index Threads to finish their last iteration before reloading the "batch of batches", performance degrades very quickly. There will likely be more waiting time than running time overall in the Reindex process.
Going above 80 Threads would have no effect: Jira may not even spin up the additional Threads, as there aren't enough Issues in the batch to distribute among all of them (keeping the default 50 Issues per iteration for each Thread).
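The Thread ceiling described above follows directly from the batch size. A rough sketch, assuming each Thread takes one 50-Issue chunk at a time (the function name and parameters are hypothetical and not part of Jira):

```python
import math


def busy_threads(batch_size: int, threads: int,
                 issues_per_iteration: int = 50) -> int:
    """Upper bound on Threads that can hold a chunk of one batch at once.

    A batch only contains batch_size / issues_per_iteration chunks, so any
    Thread beyond that count has nothing to pull.
    """
    chunks = math.ceil(batch_size / issues_per_iteration)
    return min(threads, chunks)


for configured in (20, 80, 120):
    print(configured, busy_threads(4000, configured), busy_threads(40000, configured))
```

With a 4,000 batch, 120 configured Threads still yields at most 80 busy Threads; with a 40,000 batch all 120 can get work.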

Rodrigo Martinez added a comment - 29/Dec/2023 6:35 PM

Hi Matt

Ha! Yeah, we just published that KB this week!
It packs the latest knowledge we have on the subject and will hopefully help customers understand the Reindex a bit more. Customers (and Support Engineers) were asking for a public page with more details on the parameters we've been advising on recently (with the Jira 9 upgrade traction).

We're watching that page closely and would highly appreciate any feedback you can share! It'll receive updates over the coming weeks, but the bulk of the contents will stay the same. It's still missing some insights and examples (screenshots) on Thread dump analysis, for example.

Cheers,
Rodrigo

Matt Doar added a comment - 29/Dec/2023 6:15 PM

Thanks for the comment. I see more info in https://confluence.atlassian.com/jirakb/how-to-troubleshoot-and-optimize-the-full-reindex-in-jira-1333990428.html as well

Rodrigo Martinez added a comment - 22/Dec/2023 6:44 PM

Hi e6a44563da75!

This is an optimization; so far (in a few cases) we've been bumping this up until no more gain is observed.
Basically, the producer Thread loads batches of Issues into memory, from which each Indexer Thread consumes 50 Issues per iteration on its own.
The producer waits until the batch is depleted and all Indexer Threads have completed before reloading the batch.

Contention is currently being observed because many Threads go idle too soon, waiting for a few Threads that take longer to complete their last 50-Issue iteration.
This has been happening every 4,000 Issues. If we increase the batch size, this contention will happen only every 40,000 or 80,000 Issues.
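The effect of the batch size on idle time can be illustrated with a toy model of the producer/consumer behaviour described above. This is a sketch under stated assumptions, not Jira's implementation: the function name, the random chunk durations (uniform between 0.5 and 1.5 time units), and the 80,000-Issue instance are all invented for illustration.

```python
import heapq
import random


def simulate_reindex(total_issues, batch_size, threads, chunk=50, seed=0):
    """Toy model: returns (wall_time, idle_thread_time, barriers).

    Each batch is split into 50-Issue chunks. A free Thread pulls the next
    chunk; the producer reloads only after every Thread has finished.
    """
    rng = random.Random(seed)
    wall = idle = 0.0
    barriers = 0
    remaining = total_issues
    while remaining > 0:
        batch = min(batch_size, remaining)
        remaining -= batch
        chunks = -(-batch // chunk)        # ceiling division
        free_at = [0.0] * threads          # min-heap of Thread finish times
        for _ in range(chunks):
            start = heapq.heappop(free_at)  # earliest free Thread takes a chunk
            heapq.heappush(free_at, start + rng.uniform(0.5, 1.5))
        batch_end = max(free_at)           # the slowest Thread ends the batch
        idle += sum(batch_end - t for t in free_at)
        wall += batch_end
        barriers += 1
    return wall, idle, barriers


for size in (4000, 40000):
    wall, idle, barriers = simulate_reindex(80000, size, threads=20)
    print(f"batch={size}: barriers={barriers} wall={wall:.1f} idle={idle:.1f}")
```

In this model the waiting cost is paid once per batch reload, so a larger batch means fewer reloads and less accumulated idle time. Real gains depend on the instance, which is why measuring after each change is still needed.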

This varies from customer to customer, but what seems certain is that it's high time we bump the default 4,000 batch size. We can see this more clearly with Thread dumps, but it'd still be a "change–monitor–repeat" process.

Cheers

Matt Doar added a comment - 22/Dec/2023 6:27 PM - edited

Some guidance on how to choose a value would be useful as well.

Anything that can reduce the duration of a full reindex will have a major impact for many enterprise customers.
