  1. Jira Service Management Data Center
  2. JSDSERVER-6717

Working on a busy Jira instance could result in a deadlock condition for the SdOffThreadEventJobRunner while scanning PSMQ queues.


      Issue Summary

      On a large, busy Jira instance (Jira 8.3.0+ with JSD 4.3.0 or higher), the SdOffThreadEventJobRunner threads might run into an active deadlock in which they constantly try to poll the PSMQ table.

      Summary of the problem:

      • As of JSD 4.3.0 the SdOffThreadEventJobRunner thread pool is limited to 5 threads by default.
      • Upon an issue update, if none of the 5 threads has the issue context needed to process the first item in the queue, none of them can process that event, and they all keep polling the queue indefinitely.
      • This problem can be triggered on a large, very active instance, mainly when a single issue receives many updates in a short time.
      • This can put huge pressure on the database and cause performance problems for Jira if the database can't keep up.
      • Such a deadlock will be resolved eventually, but some issues might end up with a corrupted SLA.

      Steps to Reproduce

      • NA - was not able to reproduce locally.
      • The issue was observed on large, very active client instances.

      Expected Results

      • The SdOffThreadEventJobRunner thread should not be polling messages over and over in a deadlock situation.
      • The amount of traffic to the database should be normal.

      Actual Results

      The logs (with DEBUG logging enabled on com.querydsl) show a huge number of queries (40k+) hitting the database over and over:

      $ grep "AO_319474_QUEUE" atlassian-jira.log.*|grep "update" |grep 'set "CLAIMANT" =' |cut -d" " -f6,9 |sort |uniq -c
         9110 1005x1479882x8 /servicedesk/customer/portal/9/create/94
        18379 1005x1480033x3 /rest/api/2/issue/SAT-30983/comment
        12949 1005x1480061x6 /secure/CommentAssignIssue.jspa
      $ grep -n "AO_319474_MESSAGE" atlassian-jira.log* |grep "count(" |cut -d" " -f6,9 |sort |uniq -c
         9407 1005x1479882x8 /servicedesk/customer/portal/9/create/94
        18631 1005x1480033x3 /rest/api/2/issue/SAT-30983/comment
        13325 1005x1480061x6 /secure/CommentAssignIssue.jspa
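
      The per-request counts above can be totalled to check the overall query volume. The following is a minimal sketch, assuming input in the `sort | uniq -c` format shown above; the `total_counts` helper name is hypothetical and not part of Jira or any Atlassian tooling:

```shell
# Hypothetical helper: sum the first column of `sort | uniq -c` output
# to get the overall number of matching queries.
total_counts() {
  awk '{ sum += $1 } END { print sum + 0 }'
}

# Sample input: the CLAIMANT update counts reported above.
printf '%s\n' \
  '   9110 1005x1479882x8 /servicedesk/customer/portal/9/create/94' \
  '  18379 1005x1480033x3 /rest/api/2/issue/SAT-30983/comment' \
  '  12949 1005x1480061x6 /secure/CommentAssignIssue.jspa' | total_counts
```

      In practice the helper would be appended to either grep pipeline in place of the sample `printf` input.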

      Also, checking the AO_319474_MESSAGE table shows that at least one message is being claimed more than once:



      There is no known effective workaround for this issue currently.

        1. screenshot_159.png
          25 kB
          Sherif Abdelfattah

            agoldthorpe Aidan Goldthorpe
            sabdelfattah Sherif Abdelfattah (Inactive)