Jira Service Management Data Center / JSDSERVER-6043

Issue update/create is slow because SLA indexing takes a long time

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Highest
    • Fix Version/s: 4.2.0
    • Affects Version/s: 3.8.0, 3.9.6, 3.16.1
    • Component/s: SLA

    Description

      Summary

      On large Jira Service Desk instances, JSD projects with a large number of SLAs significantly degrade issue reindex performance.
      Foreground indexing becomes slow, and issue operations such as creation, commenting, and transitions are also affected because they require indexing.

      Environment

      • JIRA 7.5.x or higher, JIRA Service Desk 3.8.x or higher
      • Large instance, more than 1 million issues.
      • Large number of SLAs defined on the instance, more than 100 SLAs.

      Expected Results

      The re-indexing process takes roughly the same time as it did with Service Desk 3.2.x.

      Actual Results

      The re-indexing process can take more than 48 hours to finish

      Notes

      The problem is largely mitigated by JSDSERVER-5681 and JSDSERVER-5685.

      Technical details

      JIRA generates a large number of database queries to populate the index with SLA values. JIRA no longer uses the cache to update the index; the database is the source of truth.

      Each query is fast, but the sheer number of them adds up.
      On large instances, the re-indexing process can take more than 48 hours to finish, which doesn't fit in a weekend any more.
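
      As a rough illustration of how fast queries can still add up, here is a back-of-the-envelope sketch. The issue and SLA counts come from the Environment section; the 1 ms per-query latency, the assumption that every SLA is in scope of every issue, and the single-thread default are illustrative assumptions, not measurements from this ticket.

```python
# Back-of-the-envelope estimate. Only the issue and SLA counts come from the
# report; every other input is an assumption chosen for illustration.
ISSUES = 1_000_000        # "more than 1 million issues"
SLAS_PER_ISSUE = 100      # "more than 100 SLAs", assumed all in scope of every issue
QUERY_LATENCY_S = 0.001   # assumed 1 ms round trip per query (not measured)


def estimated_reindex_hours(issues: int, queries_per_issue: int,
                            latency_s: float, threads: int = 1) -> float:
    """Hours spent waiting on SLA queries, spread evenly over indexing threads."""
    total_seconds = issues * queries_per_issue * latency_s
    return total_seconds / threads / 3600


hours = estimated_reindex_hours(ISSUES, SLAS_PER_ISSUE, QUERY_LATENCY_S)
print(f"{hours:.1f} hours of query time on one thread")
```

      Under these assumptions the query time alone is on the order of a full day, before any other indexing work is counted.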

      Enabling debug logging for com.querydsl.sql.AbstractSQLQuery shows many queries against AO_54307E_TIMEMETRIC during an issue comment operation.
      We go to the database as many times as there are SLAs in the scope of the issue:

      2018-08-16 09:22:23,527 SdOffThreadEventJobRunner:thread-5 DEBUG <user> 000x000x2 qqq9999 <IP_address> /rest/api/2/issue/PROJECTKEY-1111/comment [c.querydsl.sql.AbstractSQLQuery] select * from (   select "AO_54307E_TIMEMETRIC"."ID", "AO_54307E_TIMEMETRIC"."NAME", "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CONFIG_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."CREATED_DATE" from "AO_54307E_TIMEMETRIC" "AO_54307E_TIMEMETRIC" where "AO_54307E_TIMEMETRIC"."SERVICE_DESK_ID" = ? and "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID" = ? ) where rownum <= ?
      2018-08-16 09:22:23,531 SdOffThreadEventJobRunner:thread-5 DEBUG <user> 000x000x2 qqq9999 <IP_address> /rest/api/2/issue/PROJECTKEY-1111/comment [c.querydsl.sql.AbstractSQLQuery] select * from (   select "AO_54307E_TIMEMETRIC"."ID", "AO_54307E_TIMEMETRIC"."NAME", "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CONFIG_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."CREATED_DATE" from "AO_54307E_TIMEMETRIC" "AO_54307E_TIMEMETRIC" where "AO_54307E_TIMEMETRIC"."SERVICE_DESK_ID" = ? and "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID" = ? ) where rownum <= ?
      
      
       ~/ $ grep "PROJECTKEY-1111/comment" atlassian-jira.log|grep AO_54307E_TIMEMETRIC |wc -l
      256
      ~/ $
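
      The grep above counts queries for a single request. To find which requests generate the most SLA queries, the same count can be grouped by request path. This is a minimal sketch that assumes the log line layout shown in the sample above, where the request path is the token immediately before the logger name; the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

TABLE = "AO_54307E_TIMEMETRIC"
# Captures the token right before the logger name. In the sample log lines
# this token is the request path; other log layouts may differ.
REQUEST_BEFORE_LOGGER = re.compile(r"(\S+) \[c\.querydsl\.sql\.AbstractSQLQuery\]")


def count_sla_queries(lines: Iterable[str]) -> Counter:
    """Count logged queries that mention TABLE, grouped by request path."""
    counts: Counter = Counter()
    for line in lines:
        if TABLE not in line:
            continue
        match = REQUEST_BEFORE_LOGGER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

      For example, passing an open handle on atlassian-jira.log to count_sla_queries and calling most_common(10) on the result lists the ten requests with the most queries against the table.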
      

      Thread dumps taken during foreground indexing show all the indexing threads spending an extended amount of time doing SLA work.

      "IssueIndexer:thread-20" #416026 prio=5 os_prio=0 tid=0x0000000008c02800 nid=0x151e runnable [0x00007fdd8f46d000]
         java.lang.Thread.State: RUNNABLE
      ...
              at com.atlassian.pocketknife.internal.querydsl.DatabaseAccessorImpl.runInTransaction(DatabaseAccessorImpl.java:43)
              at com.atlassian.servicedesk.internal.sla.customfield.JIRACustomFieldValueStore.getTextValues(JIRACustomFieldValueStore.java:29)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.getValuesFromDatabase(SLACFType.java:471)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.loadSLAValue(SLACFType.java:440)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.lambda$getValueFromIssue$4(SLACFType.java:435)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType$$Lambda$1107/517974037.get(Unknown Source)
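
      To check how many indexing threads are in SLA code at the moment a dump is taken, the dump can be scanned for matching frames. This is a minimal sketch that assumes the jstack-style layout shown above (a quoted thread name on the header line, followed by "at ..." frames); the function name is hypothetical.

```python
import re
from typing import Dict, List

# A thread header line starts with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^"([^"]+)"')


def threads_in_frame(dump: str, name_prefix: str, frame_marker: str) -> List[str]:
    """Names of threads starting with name_prefix whose stack mentions frame_marker."""
    stacks: Dict[str, List[str]] = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            stacks[current] = []
        elif current is not None and line.startswith("at "):
            stacks[current].append(line)
    return [
        name
        for name, frames in stacks.items()
        if name.startswith(name_prefix)
        and any(frame_marker in frame for frame in frames)
    ]
```

      Calling threads_in_frame(dump_text, "IssueIndexer", "SLACFType.getValuesFromDatabase") on each dump in a series gives a quick view of how consistently the indexing threads sit in the SLA database path.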
      

      It would be useful to be able to reduce the number of times we go to the database while working with Service Desk SLAs, as database I/O is quite expensive.

            People

              mreil1 Markus Reil (Inactive)
              sabdelfattah Sherif Abdelfattah (Inactive)
              Votes: 8
              Watchers: 25
