• Type: Bug
    • Resolution: Fixed
    • Priority: Highest
    • Fix Version/s: 4.2.0
    • Affects Version/s: 3.8.0, 3.9.6, 3.16.1
    • Component/s: SLA

      Summary

      On large Jira Service Desk instances, JSD projects with a large number of SLAs significantly degrade issue reindex performance.
      Foreground indexing becomes slow, and issue operations such as creation, adding a comment, or transitions are also affected because they require indexing.

      Environment

      • JIRA 7.5.x or higher, JIRA Service Desk 3.8.x or higher
      • Large instance, more than 1 million issues.
      • Large number of SLAs defined on the instance, more than 100 SLAs.

      Expected Results

      The re-indexing process takes roughly the same time as for ServiceDesk 3.2.x

      Actual Results

      The re-indexing process can take more than 48 hours to finish

      Notes

      Problem is largely reduced by JSDSERVER-5681 and JSDSERVER-5685

      Technical details

      JIRA generates a large number of database queries to populate the index with SLA values. JIRA no longer uses the cache to update the index; the database is the source of truth.

      Each query is fast, but the sheer number of them adds up.
      On large instances, the re-indexing process can take more than 48 hours to finish, which no longer fits in a weekend.
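
A rough way to see how fast queries add up is to multiply them out. The sketch below is a back-of-envelope model, not a measurement: it assumes one query per SLA in scope per issue (as described in this report), a fixed cost per query, and perfect parallelism across indexing threads. The function name and every input value are illustrative assumptions.

```python
def estimated_reindex_hours(issues, slas_per_issue, seconds_per_query, indexing_threads):
    """Estimate the time spent on SLA queries alone during a full reindex.

    Assumes one database round trip per SLA in scope for every issue and
    that the work is spread evenly over the indexing threads.
    """
    total_queries = issues * slas_per_issue
    total_seconds = total_queries * seconds_per_query / indexing_threads
    return total_seconds / 3600
```

For example, `estimated_reindex_hours(1_000_000, 100, 0.004, 10)` models one million issues with 100 SLAs each, 4 ms per query, and 10 indexing threads. The real figures for a given instance have to be measured.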

      Enabling debug logging on the package com.querydsl.sql.AbstractSQLQuery shows many queries against AO_54307E_TIMEMETRIC during an issue comment operation.
      We go to the database as many times as there are SLAs in the scope of the issue:

      2018-08-16 09:22:23,527 SdOffThreadEventJobRunner:thread-5 DEBUG <user> 000x000x2 qqq9999 <IP_address> /rest/api/2/issue/PROJECTKEY-1111/comment [c.querydsl.sql.AbstractSQLQuery] select * from (   select "AO_54307E_TIMEMETRIC"."ID", "AO_54307E_TIMEMETRIC"."NAME", "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CONFIG_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."CREATED_DATE" from "AO_54307E_TIMEMETRIC" "AO_54307E_TIMEMETRIC" where "AO_54307E_TIMEMETRIC"."SERVICE_DESK_ID" = ? and "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID" = ? ) where rownum <= ?
      2018-08-16 09:22:23,531 SdOffThreadEventJobRunner:thread-5 DEBUG <user> 000x000x2 qqq9999 <IP_address> /rest/api/2/issue/PROJECTKEY-1111/comment [c.querydsl.sql.AbstractSQLQuery] select * from (   select "AO_54307E_TIMEMETRIC"."ID", "AO_54307E_TIMEMETRIC"."NAME", "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."DEFINITION_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."GOALS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CONFIG_CHANGE_DATE", "AO_54307E_TIMEMETRIC"."THRESHOLDS_CHANGE_MS_EPOCH", "AO_54307E_TIMEMETRIC"."CREATED_DATE" from "AO_54307E_TIMEMETRIC" "AO_54307E_TIMEMETRIC" where "AO_54307E_TIMEMETRIC"."SERVICE_DESK_ID" = ? and "AO_54307E_TIMEMETRIC"."CUSTOM_FIELD_ID" = ? ) where rownum <= ?
      
      
       ~/ $ grep "PROJECTKEY-1111/comment" atlassian-jira.log|grep AO_54307E_TIMEMETRIC |wc -l
      256
      ~/ $
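
The grep above counts queries for a single request. A minimal sketch that generalizes it to every request in the log, assuming the line layout of the samples above (request path immediately before the logger name), could look like this. The function names are hypothetical, and lines without a request path, such as those from background threads, are skipped.

```python
import re
from collections import Counter

# Captures the request path that directly precedes the QueryDSL logger name.
LINE_RE = re.compile(r"(?P<path>/\S+)\s+\[c\.querydsl\.sql\.AbstractSQLQuery\]")


def count_timemetric_queries(lines):
    """Count AO_54307E_TIMEMETRIC queries per request path."""
    counts = Counter()
    for line in lines:
        if "AO_54307E_TIMEMETRIC" not in line:
            continue
        match = LINE_RE.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


def report(log_path, top=10):
    """Print the request paths that triggered the most TIMEMETRIC queries."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for path, n in count_timemetric_queries(log).most_common(top):
            print(f"{n:6d}  {path}")
```

Calling `report("atlassian-jira.log")` lists the ten requests with the most TIMEMETRIC queries, which helps spot the operations most affected.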
      

      Thread dumps taken during foreground indexing show all the indexing threads spending an extended amount of time doing SLA work.

      "IssueIndexer:thread-20" #416026 prio=5 os_prio=0 tid=0x0000000008c02800 nid=0x151e runnable [0x00007fdd8f46d000]
         java.lang.Thread.State: RUNNABLE
      ...
              at com.atlassian.pocketknife.internal.querydsl.DatabaseAccessorImpl.runInTransaction(DatabaseAccessorImpl.java:43)
              at com.atlassian.servicedesk.internal.sla.customfield.JIRACustomFieldValueStore.getTextValues(JIRACustomFieldValueStore.java:29)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.getValuesFromDatabase(SLACFType.java:471)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.loadSLAValue(SLACFType.java:440)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType.lambda$getValueFromIssue$4(SLACFType.java:435)
              at com.atlassian.servicedesk.internal.sla.customfield.SLACFType$$Lambda$1107/517974037.get(Unknown Source)
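
To quantify this observation across a whole thread dump, one option is to count how many indexing threads have an SLA frame on their stack. The sketch below assumes the HotSpot dump layout shown above, where each thread starts with a line containing its quoted name. The function name and default arguments are hypothetical.

```python
import re

# A thread entry in the dump starts with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^\s*"(?P<name>[^"]+)"')


def threads_in_frame(dump_text, name_prefix="IssueIndexer", frame="SLACFType"):
    """Return (matching, total) for threads whose name starts with name_prefix.

    'total' is the number of such threads in the dump; 'matching' is how many
    of them have `frame` somewhere in their stack trace.
    """
    total = 0
    matching = 0
    in_target_thread = False
    already_counted = False
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            in_target_thread = header.group("name").startswith(name_prefix)
            already_counted = False
            if in_target_thread:
                total += 1
        elif in_target_thread and not already_counted and frame in line:
            matching += 1
            already_counted = True
    return matching, total
```

Running this over several dumps taken a few seconds apart gives a rough picture of how much of the indexing time goes to SLA lookups.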
      

      It would be useful to be able to reduce the number of times we need to go to the database while working with Service Desk SLAs, as database IO is quite expensive.


            [JSDSERVER-6043] Issue update/create is slow due to SLA indexing takes a long time

            Gonchik Tsymzhitov added a comment - edited

            Oops, we hit this as well. We once had more than 960 SLAs.

            But I see it happening from 150 SLAs.

            Adam Mason added a comment -

            Echo Susan above, with JSD 4.12.1. We're running a cluster and the cluster index replication is going haywire.


            Susan Hauth [Jira Queen] added a comment -

            We are on JSD 4.12. I just updated an SLA and it's killing the indexing. I don't believe this is fixed.

            Markus Reil (Inactive) added a comment -

            We've made several improvements around SLA indexing times.

            In particular, the issue mentioned here (lots of calls to retrieve AO_54307E_TIMEMETRIC data) has been addressed.

            Instances with lots of SLAs configured should see a big improvement.

            Stephen Marsh added a comment -

            Hi, is there any update on when this might get looked into? I am currently stuck on version 7.1.4, unable to upgrade to the latest version because of the massive performance issue created by this. Thanks, Stephen

            Stephen Marsh added a comment -

            Hi, is this bug likely to be looked at in the near future?

              Assignee: Markus Reil (Inactive)
              Reporter: Sherif Abdelfattah (Inactive)
              Affected customers: 8
              Watchers: 25
