JRASERVER-38598

JIRA reindexing is too slow - doesn't use existing CPUs

    • We collect Jira feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

      2021-04 - Jira 8.x update

      We believe this problem is now fixed: a number of changes in recent versions of Jira improved reindexing performance.

      Just to name a few:

      • Upgrade to the recent Lucene version 7.x
      • Optimization for the Custom Fields computation
      • Increased number of Jira index threads to 20.

      That said, we no longer think the CPU is the bottleneck, and Jira can now utilize it properly. The examples we have seen so far show problems related to slow DB performance or slow computation of specific users' Custom Fields; those are not related to Jira directly.

      Initial Description

      We have a big JIRA instance on a really good hardware:

      • ~300k issues
      • 48GB RAM
      • 32 cores
      • SSD in RAID
      • PostgreSQL on localhost

      Still, reindexing does not push the load above 4-5 (out of 32!), and neither the database nor the filesystem is under high load.

      Clearly, indexing is not using the available resources effectively.

      Even with this configuration, a foreground reindex takes about one hour, which is too much for a live system; in fact, it is too much even for a background reindex.

      top - 11:57:30 up 50 days, 21:35,  2 users,  load average: 4.43, 4.02, 3.79
      Tasks: 413 total,   3 running, 410 sleeping,   0 stopped,   0 zombie
      Cpu(s): 14.3%us,  0.6%sy,  0.0%ni, 84.8%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st
      Mem:  49415024k total, 48661932k used,   753092k free,   171160k buffers
      Swap:  2097148k total,   139576k used,  1957572k free, 21143064k cached
      
        PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
      18668 root      20   0 23.4g 3.0g  50m S  241  6.3  54145:22 java
      13694 jira      20   0 25.2g 5.5g 496m S  110 11.6 108:55.12 java
      14552 postgres  20   0 11.4g 896m 887m S   39  1.9   4:05.14 postgres
      12496 postgres  20   0 11.4g 897m 890m R   24  1.9   2:07.22 postgres
      14548 postgres  20   0 11.4g 947m 940m R   19  2.0   3:51.61 postgres
      15014 confluen  20   0 16.9g 1.7g 8352 S   13  3.7   1196:28 java
      14238 postgres  20   0 11.4g 1.0g 1.0g S    8  2.1   5:37.47 postgres
      12499 postgres  20   0 11.4g 855m 850m S    8  1.8   2:06.02 postgres
      30049 postgres  20   0 11.4g  78m  73m S    6  0.2   0:02.86 postgres
      17921 postgres  20   0 11.4g  73m  66m S    2  0.2   0:03.98 postgres
       4955 nginx     20   0 50924  17m 2168 S    1  0.0 107:41.21 nginx
      47354 postgres  20   0 11.4g 4344 2168 S    1  0.0   6:24.91 postgres                        
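
To put the load figure above in proportion, here is a minimal Python sketch that reads the 1-minute load average from the `top` header and expresses it as a fraction of the 32 cores. The function name `load_utilization` is hypothetical, and load average is only a rough proxy for CPU utilization, so treat the result as an approximation.

```python
import re


def load_utilization(top_header: str, cores: int) -> float:
    """Return the 1-minute load average as a fraction of the available cores."""
    match = re.search(r"load average:\s*([\d.]+)", top_header)
    if match is None:
        raise ValueError("no load average found in input")
    return float(match.group(1)) / cores


# Header line copied from the top output in this description.
header = "top - 11:57:30 up 50 days, 21:35,  2 users,  load average: 4.43, 4.02, 3.79"
print(f"{load_utilization(header, cores=32):.1%} of 32 cores")
```

The result is in the same range as the 14.3%us reported on the `Cpu(s)` line.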
      

      Solution

      By default, Jira uses 10 reindexing threads. You can tune the number of Jira index threads.
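
As a rough illustration of the tuning step, the Python sketch below sets the `jira.index.issue.threads` property (the name quoted in a comment on this ticket) in the text of a properties file. The helper `set_property`, the sample file content, and the value 20 are assumptions made for the example. Which file Jira reads the property from (commonly `jira-config.properties` in the Jira home directory) and whether a restart is required should be checked against the Atlassian documentation.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`, replacing an existing entry."""
    out = []
    replaced = False
    for line in text.splitlines():
        stripped = line.strip()
        name = stripped.split("=", 1)[0].strip()
        # Replace only real "key = value" lines, never comments.
        if not stripped.startswith("#") and "=" in stripped and name == key:
            out.append(f"{key} = {value}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


# Hypothetical existing file content; 20 is an illustrative value, not a recommendation.
existing = "# advanced Jira settings\njira.index.issue.threads = 10\n"
print(set_property(existing, "jira.index.issue.threads", "20"))
```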


            Hi, we recently found a missing index in the database, which caused some searches to take a huge amount of time during the full reindexing task. After we added it, the full reindex ran in 1.5h instead of the previous 3h.

            The slow query, captured on the database side:
            SELECT OLD_ISSUE_KEY FROM public.moved_issue_key WHERE ISSUE_ID=$1 ORDER BY ID

            We added this index:
            CREATE INDEX CONCURRENTLY idx_issue_id ON moved_issue_key (issue_id);

            Pavel
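
The effect of such an index on the query plan can be sketched in Python. This example uses an in-memory SQLite database as a stand-in for PostgreSQL, so the plan text and the planner differ from the real system. The column types, the sample data, and the helper `query_plan` are assumptions; only the table, column, and index names come from the comment above.

```python
import sqlite3

QUERY = "SELECT old_issue_key FROM moved_issue_key WHERE issue_id = ? ORDER BY id"


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's plan description for QUERY as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Assumed schema: only the column names are taken from the comment.
conn.execute(
    "CREATE TABLE moved_issue_key ("
    "id INTEGER PRIMARY KEY, old_issue_key TEXT, issue_id INTEGER)"
)
conn.executemany(
    "INSERT INTO moved_issue_key (old_issue_key, issue_id) VALUES (?, ?)",
    [(f"OLD-{i}", i % 1000) for i in range(10000)],
)

plan_before = query_plan(conn)  # no index on issue_id yet: full table scan
conn.execute("CREATE INDEX idx_issue_id ON moved_issue_key (issue_id)")
plan_after = query_plan(conn)   # lookup can now go through idx_issue_id

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL, the equivalent check would be running `EXPLAIN` on the query before and after creating the index.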

            Pavel Vaněk added a comment

            @Andriy
            I'm doing a full reindex, not a background one. Seven hours have passed and I'm still at 50%. Yes, a support ticket has been raised too.

            ViswanathanR added a comment

            Folks viswanathan.ramachandran, mardeshana
            This is strange: based on stats and feedback from customers, we see massive improvements for the Full Locked Reindex in the 8.x line.
            At the same time, as people mentioned before, background reindex is single-threaded and expected to be slow. See also the related problem, JRASERVER-72045.

            Can I please ask you to open support tickets, so our team can investigate this?
            Cheers.

            Andriy Yakovlev [Atlassian] added a comment

            +1

            We are experiencing this too, with 16 CPUs and an MSSQL DB on JIRA 8.13.4 server edition. It has been almost 4 hours and the reindex is at just 25%.

            We never had this problem in 7.13.3: it was slow, but it always completed in 2-3 hours.

            ViswanathanR added a comment

            Tomas Karas added a comment - edited

            Hi Mark,

            routine questions:

            Did you check the Solution section ("By default, Jira uses 10 Reindexing threads") in this ticket, which links to a page on how to adjust the threads Jira uses for re-indexing?

            I know for sure that it changes things quite a bit. I'm also assuming you meant locked re-indexing. The background one is always single-threaded.

            Also, please check your local disk speed (required for the caches folder), as that can have an impact too: https://confluence.atlassian.com/kb/test-disk-access-speed-for-jira-server-performance-troubleshooting-818577561.html

            We do have Jira DC; however, a locked re-index of our whole production Jira takes around 4h 20min, with the following stats:

            node-vm: was 8vCPU now 16vCPU 64G,  database-vm: is 8vCPU 64G postgresql 9.6
            -Xmx29g
            jira.index.issue.threads = 26
            JSW 8.5.8 JSD 4.5.8 addons: adv.roadmaps, scriptRunner, automation for Jira, EazyBI, Time to SLA, Xray, ...
            projects ~2350, issues >2mil, CF  >3k, issueTypes 690, statuses 1570, versions >27k
            attachments  >700k, workflows >2800, screens >7k, groups >14k 

            Also, restarting Jira before a locked re-index is a good idea (I'll skip the reasons here).

            It may also help to make sure that no one besides the admin team can send requests to Jira during that time.

            Or raise a support ticket...

            //Tomas
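
To compare reports of different sizes, reindex throughput can be expressed as issues per second. The Python sketch below does this for two figures quoted on this ticket. The helper name `issues_per_second` is hypothetical, and the issue counts are the approximate values given ("~300k", ">2mil"), so the results are rough estimates only.

```python
def issues_per_second(issues: int, hours: float) -> float:
    """Return average reindex throughput in issues per second."""
    return issues / (hours * 3600)


# Approximate figures as reported on this ticket.
reports = {
    "initial description (~300k issues, ~1 h)": (300_000, 1.0),
    "Tomas Karas (>2 mil issues, 4 h 20 min)": (2_000_000, 4 + 20 / 60),
}

for label, (issues, hours) in reports.items():
    print(f"{label}: {issues_per_second(issues, hours):.0f} issues/s")
```

Throughput alone does not account for differences in custom fields, add-ons, or database latency, so it is only a first-order comparison.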


            I am also looking at an upgrade from 7.x to 8.5 and experiencing the same very slow index creation after upgrading our test instance. We have increased the cores and RAM; however, the reindex still looks set to take 10 hours, which is an outage we cannot afford when we do this in production. I am wondering if there has been any additional advice on how to speed this process up.

            Mark Durose added a comment

            Migrating a large Jira 7.5 to 8.5, this still seems to be a problem. We've tried using a 36-CPU machine with 70 threads for it, and it's taking about 10 hours for our data set. It is a significant improvement over a previous attempt with fewer CPUs and threads, but a 10-hour downtime is still hard to accept for the business.

            It does not appear that the bottleneck is anywhere in hardware; all system resources seem to be underutilized. There is plenty of headroom on disk and network, and CPU is at 5% most of the time, with an occasional spike to 8% or 15%.

            It's a serious problem for larger installations, and I can't find adequate information on how to actually optimize it. Is it expected to grow almost linearly with thread count? Is there some limitation on how many threads it can effectively use, so that beyond X it's a plateau or actually slowing down due to contention? How do we know?
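
One general way to reason about the plateau question above is Amdahl's law, which bounds the speedup from extra threads by the share of the work that can run in parallel. The Python sketch below is a generic model, not a measurement of Jira: the parallel fraction of 0.9 is a purely hypothetical value, and the real figure would have to be measured on the instance in question.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Return the theoretical speedup for a given parallel fraction and thread count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Hypothetical assumption: 90% of the reindex work is parallelizable.
for threads in (1, 10, 20, 70):
    print(f"{threads:>3} threads: {amdahl_speedup(0.9, threads):.2f}x")
```

Under this model the speedup can never exceed 1 / (1 - parallel_fraction), so beyond some thread count the gains flatten regardless of the number of CPUs.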

            Konrad Garus added a comment

            craig.castlemead1
            Thanks also for sharing your test results and glad to hear that the reindexing is faster.

            To clarify your comment:

            I am running a foreground reindex on our Jira 8 UAT node and it does appear to be utilizing multiple cores though.

            That is currently the case for Jira 7, and the same holds for Jira 8: Jira uses 10 threads for a foreground (full) reindex. See the description for details on how to tune that.

            My understanding of the current feature request:

            • Reindexing time is high and the CPU is not properly utilized during that process; better utilization could reduce the time (we assume that the DB and disk are not the bottleneck, which could still be the case).

            So from that perspective, the Lucene upgrade significantly changes the picture.

            Hope this clarifies the context.
            Cheers.

            Andriy Yakovlev [Atlassian] added a comment

            gonchik - did you confirm if multiple cores are actually being used, or was the reindex just more efficient overall so was quicker?

            • In initial testing on Jira 8 we've seen our locking reindex drop from 7.5 hours to 3.5 (> 1,000,000 issues) and the resulting index files drop from ~ 26GB to 6GB - so even on a foreground reindex we'd expect to see a significant reduction in reindex time on a single core due to the Lucene improvements
            • I am running a foreground reindex on our Jira 8 UAT node and it does appear to be utilizing multiple cores though. Can anyone from Atlassian confirm this is expected, given that this issue is marked as Gathering Interest with no Fix Version?

             

             

            Craig Castle-Mead added a comment

              Carlos Vigier (Inactive)
              Sorin Sbarnea (Citrix)
              Votes: 84
              Watchers: 100
