  1. Jira Service Management Data Center
  2. JSDSERVER-5736

Poor performance with high CPU and a high number of SdOffThreadEventJobRunner threads


    • Type: Bug
    • Resolution: Fixed
    • Priority: Highest
    • Fix Version/s: 4.4.0
    • Affects Version/s: 3.9.0, 3.9.1, 3.9.2, 3.9.3, 3.9.4, 3.9.6, 3.9.7, 3.9.8, 3.9.9, 3.9.10, 3.9.11, 3.10.0, 3.10.1, 3.10.2, 3.11.0, 3.11.1, 3.16.0, 3.16.1, 4.0.0, 4.1.0, 4.2.0
    • Component/s: SLA

       

      Atlassian Update – 10 September 2021

      Hi everyone,

      Thank you for your feedback on the ticket and supporting our team in our investigation!

      After analysing the problem, we have found that the issue "Poor performance with high CPU and a high number of SdOffThreadEventJobRunner threads" was fixed in JSD 4.4.0. Before 4.4.0, the number of these threads was unbounded, which resulted in a high DB load on the instance.

      However, we have identified two other issues with the JSM async processing logic that we need to resolve. Both are related to the problems reported in this bug ticket and are tracked in their own tickets:

      1. JSDSERVER-5730

      2. JSDSERVER-8635

      • This is related to threads deadlocking when processing for one issue takes over 5 minutes. This ticket is currently gathering impact.

      If you have any further concerns with the above, please open a support ticket via https://support.atlassian.com

      Thank you,

      Alex

      Description 

      JSD 3.9.0 attempts to address some of the friction between the SLA system and automation (JSDSERVER-4743) and poor issue creation performance by introducing a wrapper event type (inspired by OnCommitEvent) and an “expectation” system.

      The expectation system gives features that are interested in one or more eligible event types a way to explicitly define the work that should be done before a wrapped event is dispatched. They do so by submitting “jobs”, which are executed in the strict cluster-wide order of their submission, with no more than one job running at a time for each issue. Jobs run on a thread pool to avoid blocking any request threads, though we just use the submitting thread if it’s not a request thread.
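      The ordering rule described above (jobs for the same issue run one at a time, in submission order, on a pool rather than on the request thread) can be sketched in a single JVM as follows. This is only an illustration, not the JSD implementation: the class name PerIssueJobQueue and its methods are hypothetical, the pool is deliberately bounded, and the cluster-wide ordering and the "run on the submitting thread" shortcut are not modelled.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical name; not a class from the JSD codebase.
final class PerIssueJobQueue {
    // Bounded pool: at most maxThreads jobs run at once across all issues.
    private final ExecutorService pool;
    // Last submitted job per issue key. Entries are never evicted in this sketch.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    PerIssueJobQueue(int maxThreads) {
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    // Chains the job behind the previous job for the same issue, so jobs for
    // one issue never overlap and run in the order they were submitted.
    CompletableFuture<Void> submit(String issueKey, Runnable job) {
        return tails.compute(issueKey, (key, tail) -> {
            CompletableFuture<Void> previous =
                    tail == null ? CompletableFuture.completedFuture(null) : tail;
            // handle(...) swallows a failure of the previous job so the chain keeps going.
            return previous.handle((result, error) -> null).thenRunAsync(job, pool);
        });
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        PerIssueJobQueue queue = new PerIssueJobQueue(4);
        List<String> log = new CopyOnWriteArrayList<>();
        queue.submit("SD-1", () -> log.add("SD-1 job 1"));
        CompletableFuture<Void> other = queue.submit("SD-2", () -> log.add("SD-2 job 1"));
        CompletableFuture<Void> last = queue.submit("SD-1", () -> log.add("SD-1 job 2"));
        other.join();
        last.join();
        // "SD-1 job 1" always precedes "SD-1 job 2"; the position of the SD-2 entry may vary.
        System.out.println(log);
        queue.shutdown();
    }
}
```

      Jobs for different issues may interleave freely; only the per-issue order is fixed. A fixed-size pool is used here because an unbounded number of runner threads is what this ticket reports as the source of the DB load.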

      The wrapper event type does the same for the work that should be done after what we refer to as “completion”, by defining @EventListener methods of type public void(ServiceDeskWrappedOnCompletionEvent).
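      As a rough illustration of the listener shape mentioned above, the sketch below declares a public void method that takes the wrapper event and is marked with @EventListener. To keep it self-contained, the annotation, the event class (including its issueKey field) and the small reflective dispatcher are all stand-ins written for this example; they are assumptions, not the real Atlassian or JSD types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ListenerSketch {
    // Stand-in for the real listener annotation, declared here so the sketch compiles alone.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface EventListener {}

    // Stand-in for the wrapper event; the issueKey field is hypothetical.
    static final class ServiceDeskWrappedOnCompletionEvent {
        final String issueKey;

        ServiceDeskWrappedOnCompletionEvent(String issueKey) {
            this.issueKey = issueKey;
        }
    }

    // A feature that wants to do work after "completion" declares a method of this shape.
    static final class AfterCompletionListener {
        final List<String> handled = new ArrayList<>();

        @EventListener
        public void onCompletion(ServiceDeskWrappedOnCompletionEvent event) {
            handled.add(event.issueKey);
        }
    }

    // Toy dispatcher: calls every annotated single-argument method that accepts the event.
    static void dispatch(Object listener, Object event) throws ReflectiveOperationException {
        for (Method method : listener.getClass().getMethods()) {
            if (method.isAnnotationPresent(EventListener.class)
                    && method.getParameterCount() == 1
                    && method.getParameterTypes()[0].isInstance(event)) {
                method.setAccessible(true);
                method.invoke(listener, event);
            }
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        AfterCompletionListener listener = new AfterCompletionListener();
        dispatch(listener, new ServiceDeskWrappedOnCompletionEvent("SD-1"));
        System.out.println(listener.handled);
    }
}
```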

      At least two recent support cases have involved severe performance degradation of a node in an instance, of the instance's database, or of both. The degradation seems to have been caused or exacerbated by the expectation system, so we’ll link potential causes to this issue as we find them.

      Diagnosis

      • High CPU usage on DB server
      • Increased number of threads used by the Jira process
      • High number of SdOffThreadEventJobRunner threads on thread dumps connecting to the database
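
      To check the last symptom, one option is to count the matching threads in a saved thread dump. The sketch below assumes a HotSpot-style text dump in which each thread's header line starts with the thread name in double quotes; the class and method names are hypothetical, and the dump file path is supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class ThreadDumpCounter {
    // Counts thread header lines whose quoted thread name starts with namePrefix.
    static long countThreads(List<String> dumpLines, String namePrefix) {
        String quotedPrefix = "\"" + namePrefix;
        return dumpLines.stream()
                .filter(line -> line.startsWith(quotedPrefix))
                .count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpCounter.java <thread-dump-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        long count = countThreads(lines, "SdOffThreadEventJobRunner");
        System.out.println("SdOffThreadEventJobRunner threads: " + count);
    }
}
```

      Comparing the count across several dumps taken a few seconds apart shows whether the number of these threads keeps growing.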

      Possible workaround (JSD 3.9+)

      These steps change the expectation system so that jobs are always executed immediately on the submitting thread, without touching any OffThreadEventJobRunner or PSMQ code paths, as if the submitting threads were never request threads (JSDSERVER-5730).

      1. Go to the dark feature settings page (<baseURL>/secure/SiteDarkFeatures!default.jspa)
      2. Remove the feature flag sd.internal.base.off.thread.on.completion.events.enabled, if it exists
      3. Add the following feature flag: sd.internal.base.off.thread.on.completion.events.disabled
      4. Restart JIRA

      SLA accuracy shouldn’t be negatively affected, but issue creation might take longer as a result. Automation rules that use the "WHEN: Issue created" trigger with SLA-related JQL should still work (JSDSERVER-4743).

              mchandra@atlassian.com Mohil Chandra
              dazabani Delan Azabani (Inactive)
              Votes: 195
              Watchers: 188
