[JRASERVER-38598] JIRA reindexing is too slow - doesn't use existing CPUs

Type: Suggestion
Resolution: Fixed
Fix Version/s: 8.0.0
Component/s: Indexing
Labels:

UIS: 72
Support reference count: 41
Feedback Policy:

We collect Jira feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

2021-04 - Jira 8.x update

We believe this problem is now fixed: a number of changes in recent Jira versions improved reindexing performance.

Just to name a few:

Upgrade to the recent Lucene version 7.x
Optimization for the Custom Fields computation
Increased number of Jira index threads to 20.
- If you have more powerful hardware, you can fine tune the number of Jira index threads.

That being said, we don't think the CPU is the bottleneck anymore, and Jira can now utilize it properly. The examples we have seen so far show problems caused by slow DB performance or slow computation of custom fields specific to the user's instance; those are not related to Jira directly.

Initial Description

We have a big JIRA instance on really good hardware:

~300k issues
48GB RAM
32 cores
SSD in RAID
PostgreSQL on localhost

Still, reindexing does not push the load above 4-5 (out of 32!), and neither the database nor the filesystem is under high load.

Clearly this means that indexing is not using the available resources properly to index quickly.

Even with this configuration, a foreground reindex takes about one hour, which is too much for a live system; in fact, it is too much even for a background reindex.

top - 11:57:30 up 50 days, 21:35,  2 users,  load average: 4.43, 4.02, 3.79
Tasks: 413 total,   3 running, 410 sleeping,   0 stopped,   0 zombie
Cpu(s): 14.3%us,  0.6%sy,  0.0%ni, 84.8%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  49415024k total, 48661932k used,   753092k free,   171160k buffers
Swap:  2097148k total,   139576k used,  1957572k free, 21143064k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18668 root      20   0 23.4g 3.0g  50m S  241  6.3  54145:22 java
13694 jira      20   0 25.2g 5.5g 496m S  110 11.6 108:55.12 java
14552 postgres  20   0 11.4g 896m 887m S   39  1.9   4:05.14 postgres
12496 postgres  20   0 11.4g 897m 890m R   24  1.9   2:07.22 postgres
14548 postgres  20   0 11.4g 947m 940m R   19  2.0   3:51.61 postgres
15014 confluen  20   0 16.9g 1.7g 8352 S   13  3.7   1196:28 java
14238 postgres  20   0 11.4g 1.0g 1.0g S    8  2.1   5:37.47 postgres
12499 postgres  20   0 11.4g 855m 850m S    8  1.8   2:06.02 postgres
30049 postgres  20   0 11.4g  78m  73m S    6  0.2   0:02.86 postgres
17921 postgres  20   0 11.4g  73m  66m S    2  0.2   0:03.98 postgres
 4955 nginx     20   0 50924  17m 2168 S    1  0.0 107:41.21 nginx
47354 postgres  20   0 11.4g 4344 2168 S    1  0.0   6:24.91 postgres
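
As a rough cross-check of the figures above, here is a minimal Python sketch that parses the `Cpu(s)` summary line and relates the load average to the 32 cores. The input values are copied from the top output and the hardware list; the helper name `parse_cpu_line` is only illustrative.

```python
import re

# Values copied from the top output above; the core count is from the hardware list.
CPU_LINE = "Cpu(s): 14.3%us,  0.6%sy,  0.0%ni, 84.8%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st"
LOAD_AVERAGE_1MIN = 4.43
CORES = 32


def parse_cpu_line(line: str) -> dict[str, float]:
    """Turn a top 'Cpu(s)' summary line into {'us': 14.3, 'sy': 0.6, ...}."""
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


fields = parse_cpu_line(CPU_LINE)
busy_percent = round(100.0 - fields["id"], 1)        # everything that is not idle
busy_cores = round(CORES * busy_percent / 100.0, 1)  # equivalent number of fully busy cores
load_per_core = round(LOAD_AVERAGE_1MIN / CORES, 2)  # 1-minute load average per core

print(f"busy: {busy_percent}% (~{busy_cores} of {CORES} cores), iowait: {fields['wa']}%")
print(f"load per core: {load_per_core}")
```

With these inputs the machine is about 15% busy, which is roughly 5 of 32 cores, and I/O wait is 0.1%. This matches the observation that neither the CPU nor the disk is saturated during the reindex.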

Solution

By default, Jira uses 10 reindexing threads. You can tune the number of Jira index threads.
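
For illustration only, a minimal Python sketch of changing that setting in a Java-style properties file. The property name `jira.index.issue.threads` is the one quoted in the comments on this issue; the file name `jira-config.properties`, the helper `set_index_threads`, and the value 20 are assumptions made for this example, so check the linked Atlassian page for where the property belongs on your version.

```python
import tempfile
from pathlib import Path

PROPERTY = "jira.index.issue.threads"  # property name as quoted in the comments on this issue


def set_index_threads(config_path: Path, threads: int) -> None:
    """Set PROPERTY in a .properties file, replacing any existing entry."""
    if threads < 1:
        raise ValueError("thread count must be positive")
    lines = config_path.read_text().splitlines() if config_path.exists() else []
    # Drop any previous definition of the property, keep everything else untouched.
    kept = [ln for ln in lines if ln.split("=", 1)[0].strip() != PROPERTY]
    kept.append(f"{PROPERTY} = {threads}")
    config_path.write_text("\n".join(kept) + "\n")


# Demonstration against a throwaway file, not a real Jira home directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "jira-config.properties"  # assumed file name
    cfg.write_text("jira.index.issue.threads = 10\n")
    set_index_threads(cfg, 20)
    result = cfg.read_text()

print(result)
```

The sketch only edits a file. Whether Jira needs a restart to pick up the new value, and how many threads are sensible for a given machine, is not covered here.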

screenshoot 2014-06-04 at 11.57.18.png
94 kB
04/Jun/2014 11:06 AM
screenshoot 2014-06-04 at 12.03.31.png
44 kB
04/Jun/2014 11:06 AM
screenshoot 2014-06-04 at 12.03.56.png
197 kB
04/Jun/2014 11:06 AM
screenshoot 2014-06-04 at 12.04.14.png
129 kB
04/Jun/2014 11:06 AM
screenshoot 2014-06-04 at 12.05.17.png
27 kB
04/Jun/2014 11:06 AM

relates to

JRACLOUD-38598 JIRA reindexing is too slow - doesn't use existing CPUs

Closed

JRASERVER-2825 Reindexing is slow

Closed

was cloned as: JDEV-29074

Pavel Vaněk added a comment - 07/Oct/2021 2:16 PM

Hi, we recently found a missing index in the database, which caused some queries to take a huge amount of time during the full reindex. After adding it, a full reindex ran in 1.5h instead of the previous 3h.

The slow query, as captured on the database side:
SELECT OLD_ISSUE_KEY FROM public.moved_issue_key WHERE ISSUE_ID=$1 ORDER BY ID

We added this index:
CREATE INDEX CONCURRENTLY idx_issue_id ON moved_issue_key (issue_id);

Pavel
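
To illustrate why such an index matters, here is a minimal Python sketch that runs the same query shape against an in-memory SQLite database and compares the query plan before and after the index is created. SQLite stands in for PostgreSQL purely so the example is self-contained; the table contents are made up, the column types are assumptions, and plans on a real Jira database will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for Jira's moved_issue_key table (column types are assumptions).
conn.execute(
    "CREATE TABLE moved_issue_key (id INTEGER PRIMARY KEY, old_issue_key TEXT, issue_id INTEGER)"
)
conn.executemany(
    "INSERT INTO moved_issue_key (old_issue_key, issue_id) VALUES (?, ?)",
    [(f"OLD-{i}", i % 1000) for i in range(10_000)],  # synthetic rows
)

QUERY = "SELECT old_issue_key FROM moved_issue_key WHERE issue_id = ? ORDER BY id"


def plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's query plan for QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column holds the plan detail


plan_before = plan(conn)
# SQLite has no CONCURRENTLY option; that keyword is PostgreSQL-specific.
conn.execute("CREATE INDEX idx_issue_id ON moved_issue_key (issue_id)")
plan_after = plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the plan has to scan the whole table for every lookup; afterwards it searches via `idx_issue_id`. On PostgreSQL the equivalent check would be running `EXPLAIN` on the original query.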

ViswanathanR added a comment - 28/Apr/2021 3:32 PM

@Andriy
I'm doing a full reindex, not a background one. 7 hours have passed and I'm still at 50%. Yes, a support ticket has been raised too.

Andriy Yakovlev [Atlassian] added a comment - 28/Apr/2021 3:15 PM

Folks viswanathan.ramachandran, mardeshana
This is strange: based on stats and feedback from customers, we see massive improvements for the Full Locked Reindex in the 8.x line.
At the same time, as people mentioned before, background reindex is single-threaded and expected to be slow. See also the related problem: JRASERVER-72045

Can I please ask you to open support tickets, so our team can investigate this?
Cheers.

Milan Ardeshana added a comment - 28/Apr/2021 2:33 PM

+1

ViswanathanR added a comment - 28/Apr/2021 11:02 AM

We are experiencing this too, with 16 CPUs and an MSSQL DB on JIRA 8.13.4 Server edition. It has been almost 4 hours and the reindex is only at 25%.

We never had this problem in 7.13.3; it was slow, but it always completed in 2-3 hours.

Tomas Karas added a comment - 19/Oct/2020 1:04 PM - edited

Hi Mark,

Routine questions:

Did you check the Solution section ("By default, Jira uses 10 Reindexing threads") in this ticket, which links to a page on how to adjust the threads Jira uses for re-indexing?

I know for sure that it changes things quite a bit. Also, I'm assuming you meant locked re-indexing; the background one is always single-threaded.

Also, please check your local disk speed (required for the caches folder), as that can have an impact too: https://confluence.atlassian.com/kb/test-disk-access-speed-for-jira-server-performance-troubleshooting-818577561.html

We do have Jira DC; however, a locked re-index of our whole production Jira takes around 4h 20min, with the following stats:

node-vm: was 8vCPU now 16vCPU 64G,  database-vm: is 8vCPU 64G postgresql 9.6
-Xmx29g
jira.index.issue.threads = 26
JSW 8.5.8 JSD 4.5.8 addons: adv.roadmaps, scriptRunner, automation for Jira, EazyBI, Time to SLA, Xray, ...
projects ~2350, issues >2mil, CF  >3k, issueTypes 690, statuses 1570, versions >27k
attachments  >700k, workflows >2800, screens >7k, groups >14k

Also, restarting Jira before a locked re-index is a good idea (I'll skip the reasons why).

It may also help to make sure no one besides the admin team can send requests to Jira during that time.

Or raise a support ticket...

//Tomas

Mark Durose added a comment - 19/Oct/2020 11:18 AM

I am also looking at an upgrade from 7.x to 8.5 and experiencing the same very slow index creation after upgrading our test instance. We have increased the cores and RAM, but the reindex still looks set to take 10 hours, which is an outage we cannot afford when we do this in production. Is there any additional advice on how to speed this process up?

Konrad Garus added a comment - 17/Aug/2020 4:08 PM

Migrating a large Jira 7.5 to 8.5, this still seems to be a problem. We've tried using a 36-CPU machine with 70 threads for it, and it's taking about 10 hours for our data set. It is a significant improvement over a previous attempt with fewer CPUs and threads, but a 10-hour downtime is still hard to accept for the business.

It does not appear that the bottleneck is anywhere in hardware; all system resources seem to be underutilized. Plenty of headroom on disk and network, CPU at 5% most of the time, with an occasional spike to 8% or 15%.

It's a serious problem for larger installations, and I can't find adequate information on how to actually optimize it. Is it expected to grow almost linearly with thread count? Is there some limitation on how many threads it can effectively use, so that beyond X it's a plateau or actually slowing down due to contention? How do we know?
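
As a back-of-the-envelope reading of the numbers in this comment, here is a minimal Python sketch that converts the reported CPU utilization into busy cores and into the share of time each of the 70 threads spends on a CPU. It assumes, for simplicity, that the index threads are the only significant CPU consumers on the machine; the function names are illustrative.

```python
CORES = 36    # from the comment above
THREADS = 70  # from the comment above
CPU_UTILIZATION = {"typical": 0.05, "spike": 0.15}  # 5% most of the time, spikes up to 15%


def busy_cores(cores: int, utilization: float) -> float:
    """Equivalent number of fully busy cores at a given overall utilization."""
    return cores * utilization


def on_cpu_share(cores: int, utilization: float, threads: int) -> float:
    """Average fraction of time one thread is actually running on a CPU."""
    return busy_cores(cores, utilization) / threads


for label, utilization in CPU_UTILIZATION.items():
    cores_used = busy_cores(CORES, utilization)
    share = on_cpu_share(CORES, utilization, THREADS)
    print(f"{label}: ~{cores_used:.1f} busy cores, each thread on-CPU ~{share:.1%} of the time")
```

At 5% utilization fewer than two cores are busy, so on average each thread is running only a few percent of the time and waiting otherwise. The arithmetic does not say what the threads wait on (database round trips, locks, disk), only that adding CPUs or threads is unlikely to help until that wait is addressed.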

Andriy Yakovlev [Atlassian] added a comment - 25/Feb/2019 3:02 PM

craig.castlemead1
Thanks also for sharing your test results and glad to hear that the reindexing is faster.

To clarify your comment:

"I am running a foreground reindex on our Jira 8 UAT node and it does appear to be utilizing multiple cores though."

That is currently the case for Jira 7, and the same holds for Jira 8: Jira uses 10 threads for a Foreground (Full) reindex. See the description for details on how to tune that.

My understanding of the current feature request:

Reindexing time is high and the CPU is not properly utilized during that process; better utilization could improve the time (we assume that the DB and disk are not the bottleneck, which still could be the case).

So from that perspective, the Lucene upgrade significantly changes the picture.

Hope this clarifies the context.
Cheers.

Craig Castle-Mead added a comment - 25/Feb/2019 2:30 PM

gonchik - did you confirm whether multiple cores are actually being used, or was the reindex just more efficient overall and therefore quicker?

In initial testing on Jira 8 we've seen our locking reindex drop from 7.5 hours to 3.5 (> 1,000,000 issues) and the resulting index files drop from ~26GB to 6GB, so even on a single core we'd expect a significant reduction in foreground reindex time due to the Lucene improvements.
I am running a foreground reindex on our Jira 8 UAT node and it does appear to be utilizing multiple cores though. Can anyone from Atlassian confirm this is expected, given that this issue is marked as Gathering Interest with no Fix Version?

Assignee: Carlos Vigier (Inactive)

Reporter: Sorin Sbarnea (Citrix)

Votes: 84

Watchers: 100

Created: 04/Jun/2014 10:57 AM

Updated: 13/Jul/2023 3:17 AM

Resolved: 29/Apr/2021 9:34 AM
