  1. Jira Data Center
  2. JRASERVER-74478

Atlassian diagnostics creates several threads when monitoring database operations made by Jira and may crash the instance during high DB activity

      Issue Summary

      The Atlassian Diagnostics plugin does not seem to reuse threads from its thread pool. Instead, it creates a large number of new threads while monitoring operations against the database.

      The problem is most evident during a full reindex, as this is one of the Jira operations with high database activity.

      During high DB activity, such as a full reindex, the Jira instance might crash because of the high number of threads.

      This seems to be related to the upgrade of the Atlassian Diagnostics plugin to version 2.0.4 in Jira 9.3.0.
      The bug could not be reproduced on Jira 9.2.0.

      Steps to Reproduce

      1. Install a vanilla instance of Jira Software Data Center 9.3.0.
        • This was validated on 9.3.0, 9.3.1, and 9.4.0-eap.
        • The bug could not be reproduced on 9.2.0.
      2. Make sure Java Flight Recorder is enabled and taking thread dumps every 3 seconds.
      3. Create ~100 projects so the instance has something to reindex.
        for i in $(seq 1 100); do
          # Create one demo Kanban project per iteration through the importers plugin REST endpoint.
          curl -v -u "${JIRA_ADMIN_USERNAME}:${JIRA_ADMIN_PASSWORD}" -X POST -o /dev/null \
            -H 'X-Atlassian-Token: no-check' \
            "${JIRA_BASE_URL}/rest/jira-importers-plugin/1.0/demo/create" \
            --data-raw "name=${JIRA_PROJECT_NAME}${i}&key=${JIRA_PROJECT_KEY}${i}&keyEdited=false&projectTemplateWebItemKey=software-demo-project-kanban&projectTemplateModuleKey=undefined"
        done
      4. Run a Full reindex.
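
If JFR is not available, thread dumps can also be collected periodically with `jcmd <pid> Thread.print` while the reindex runs. The following is a minimal sketch of that alternative, not the method used in this report. It assumes a JDK `jcmd` on the PATH; the function name `capture_thread_dumps`, the output file naming, and the `DUMP_CMD` override (present only so the loop can be tried without a JVM) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: capture COUNT thread dumps, one every INTERVAL seconds, into numbered files.
# Assumptions (not from the issue): function name, file naming, DUMP_CMD override.

capture_thread_dumps() {
  local pid="$1" count="$2" interval="${3:-3}" outdir="${4:-.}"
  local i
  mkdir -p "$outdir"
  for i in $(seq 1 "$count"); do
    # By default this runs: jcmd <pid> Thread.print
    ${DUMP_CMD:-jcmd} "$pid" Thread.print > "$outdir/threaddump-$i.txt"
    # Wait between dumps, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, `capture_thread_dumps "$JIRA_PID" 100 3 ./dumps` would take 100 dumps at 3-second intervals, where `JIRA_PID` is a placeholder for the process ID of the Jira JVM.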

      Expected Results

      Full reindex runs with no major problem and no side effects.

      Actual Results

      Full reindex completes without any error.
      Hundreds or thousands of threads named pool-XX-thread-XXXX are created.
      Inspecting the JFR data and thread dumps taken while the reindex was running shows threads similar to the following.

      "pool-18-thread-695" prio=5 tid=0x00000000000007d5 nid=0 waiting on condition 
         java.lang.Thread.State: TIMED_WAITING (parking)
      	at java.base@11.0.16/jdk.internal.misc.Unsafe.park(Native Method)
      	- parking to wait for <0x000000003697a89f> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      	at java.base@11.0.16/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
      	at java.base@11.0.16/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
      	at java.base@11.0.16/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
      	at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)
      	at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
      	at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base@11.0.16/java.lang.Thread.run(Thread.java:829)
         Locked ownable synchronizers:
      	- None
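
To see how many such threads exist in a single dump, the thread headers can be grouped by pool. This is a rough sketch, assuming the dump is saved as a text file and that thread names follow the `pool-N-thread-M` pattern shown above; the function name `count_pool_threads` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count threads per anonymous pool in one thread dump file.
# Output: one line per pool, "<pool> <thread count>", largest pool first.

count_pool_threads() {
  grep -oE '"pool-[0-9]+-thread-[0-9]+"' "$1" \
    | sed -E 's/^"(pool-[0-9]+)-thread-[0-9]+"$/\1/' \
    | sort | uniq -c \
    | awk '{ print $2, $1 }' \
    | sort -k2,2nr
}
```

A pool whose count keeps growing from one dump to the next is a candidate for the behaviour described in this issue. If the file contains several dumps, the same thread is counted once per dump.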

      In a few instances we were able to capture these threads in the RUNNABLE state. They look similar to the following and are running code associated with com.atlassian.diagnostics.

      "pool-17-thread-27202" #36575 prio=5 os_prio=0 cpu=0.13ms elapsed=1.50s tid=0x00007f41f89d8000 nid=0x905b runnable  [0x00007f3aab6fe000]
         java.lang.Thread.State: RUNNABLE
      	at io.micrometer.core.instrument.LongTaskTimer$Builder.register(LongTaskTimer.java:408)
      	at io.micrometer.core.instrument.MeterRegistry$More.longTaskTimer(MeterRegistry.java:872)
      	at com.atlassian.util.profiling.micrometer.MicrometerStrategy.startLongRunningTimer(MicrometerStrategy.java:104)
      	at com.atlassian.util.profiling.micrometer.MicrometerStrategy.startLongRunningTimer(MicrometerStrategy.java:93)
      	at com.atlassian.util.profiling.Metrics$DefaultLongRunningMetricTimer.start(Metrics.java:721)
      	at com.atlassian.util.profiling.Metrics$Builder.startLongRunningTimer(Metrics.java:635)
      	at com.atlassian.diagnostics.internal.platform.monitor.db.DefaultDatabaseDiagnosticsCollector.lambda$startTimingDatabaseOperationAsync$1(DefaultDatabaseDiagnosticsCollector.java:153)
      	at com.atlassian.diagnostics.internal.platform.monitor.db.DefaultDatabaseDiagnosticsCollector$$Lambda$2047/0x00000008427bd840.call(Unknown Source)
      	at java.util.concurrent.FutureTask.run(java.base@
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@
      	at java.lang.Thread.run(java.base@
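
To check how many pool threads in a dump are executing diagnostics code at that moment, the dump can be scanned one thread at a time. This is a sketch under the assumption that threads in the dump are separated by blank lines, as in standard HotSpot thread dumps; the function name `count_diagnostics_threads` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count pool-N-thread-M threads whose stack contains a
# com.atlassian.diagnostics frame. awk paragraph mode (RS = "") treats
# each blank-line-separated block as one record, that is, one thread.

count_diagnostics_threads() {
  awk 'BEGIN { RS = ""; n = 0 }
       /"pool-[0-9]+-thread-[0-9]+"/ && /com\.atlassian\.diagnostics/ { n++ }
       END { print n }' "$1"
}
```

Idle workers parked in `ThreadPoolExecutor.getTask` are not counted, because their stacks no longer contain a diagnostics frame, so this number is usually much smaller than the total number of pool threads.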

      The JFR data shows thousands of threads with the characteristics described above.

      Depending on the size of the environment, the application may crash during the reindex because of the number of threads created.

      In some cases Jira crashes with the following error because too many threads exist.

      java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
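
On Linux, the thread count of the Jira process can be compared with the limits that typically trigger this error. The following is a sketch for Linux hosts only, relying on the standard `/proc` files; the function names `thread_count_from_status` and `report_thread_headroom` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch (Linux only): compare a process's thread count with its limits.

# Print the value of the "Threads:" field from a /proc/<pid>/status file.
thread_count_from_status() {
  awk '/^Threads:/ { print $2 }' "$1"
}

# Print the thread count of PID next to the per-process and kernel limits.
report_thread_headroom() {
  local pid="$1"
  echo "threads in use: $(thread_count_from_status "/proc/$pid/status")"
  # "Max processes" also bounds the number of threads a user can create.
  grep 'Max processes' "/proc/$pid/limits"
  echo "kernel threads-max: $(cat /proc/sys/kernel/threads-max)"
}
```

Running `report_thread_headroom "$JIRA_PID"` during a reindex, with `JIRA_PID` standing in for the process ID of the Jira JVM, shows how close the instance is to those limits.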


      Workaround

      Disable the affected system apps while running Jira 9.3.0 or 9.3.1:

      1. Go to Cog icon > Manage Apps > Manage Apps.
      2. Choose All Apps and filter for diagnostic.
      3. Disable the following 2 system apps.
        • Atlassian Diagnostics - Plugin
        • Atlassian Jira - Plugins - Diagnostics Plugin
      4. Restart Jira to clear any lingering threads created before the apps were disabled.
      5. Go to Manage Apps and confirm the two diagnostic Apps are still disabled.

      Re-enable these apps after upgrading Jira to a version in which this bug is fixed.

      Attachments

        1. screenshot-1.png (231 kB, Thiago Masutti)


      Assignee: Unassigned
      Reporter: Thiago Masutti (tmasutti)
      Affected customers: 2
      Watchers: 16
