  1. Jira Service Management Data Center
  2. JSDSERVER-5493

PSMQ automation thread loops over queue due to message count discrepancy

Details

    • Bug
    • Resolution: Fixed
    • Medium
    • 3.9.0
    • 3.2.11, 3.8.3
    • Automation

    Description

      Experienced Behavior:

      JIRA Service Desk PSMQ enters a state where a single thread repeatedly checks the same queue. The loop never terminates, consumes high CPU, and may eventually cause an OutOfMemoryError (OOME).

      • This may occur for multiple queues. Each affected queue will have one looping thread.

      Expected Behavior:

      JSD Automation / PSMQ never loops repeatedly over a single queue.

      Steps to Reproduce:

      Encounter or create a scenario where:

      • "AO_319474_QUEUE" table shows non-zero value in "MESSAGE_COUNT"
      • AND
      • "AO_319474_MESSAGE" table shows zero messages associated with that queue
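
      The sketch below is a toy model of why such a mismatch could produce a loop that never ends. It is not PSMQ's actual code: the Queue class, the drain function, and the max_polls cap are all hypothetical, and the assumption that the worker trusts the cached counter is an illustration rather than a confirmed root cause.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Queue:
    """Hypothetical stand-in for one AO_319474_QUEUE row and its messages."""
    name: str
    message_count: int  # cached counter, like the MESSAGE_COUNT column
    messages: List[str] = field(default_factory=list)  # real AO_319474_MESSAGE rows


def drain(queue: Queue, max_polls: int = 1000) -> int:
    """Poll the queue until its cached counter reaches zero.

    Returns the number of polls performed. max_polls exists only so that
    this demonstration terminates; an uncapped loop would spin forever.
    """
    polls = 0
    while queue.message_count > 0 and polls < max_polls:
        polls += 1
        if queue.messages:
            # A real message exists: consume it and decrement the counter.
            queue.messages.pop(0)
            queue.message_count -= 1
        # Otherwise nothing can be claimed, the counter never changes,
        # and the loop condition stays true.
    return polls


healthy = Queue("healthy", message_count=2, messages=["a", "b"])
stale = Queue("stale", message_count=1)  # counter says 1, but no messages exist

print(drain(healthy))  # 2
print(drain(stale))    # 1000 (stopped only by the demo cap)
```

      In this model the healthy queue drains in two polls, while the queue with a stale counter polls until the artificial cap stops it.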

      To encounter scenario:

      • It is suspected that this may occur when a JIRA Data Center node is shut down while it is processing automation rules. This suspected cause has not been confirmed in the wild.

      To create scenario:

      • Modify the database to reproduce the scenario. In a new instance:
        • Create sample JSD project
        • Bulk add comment to customer
          • This will create rows in "AO_319474_QUEUE"
        • update "AO_319474_QUEUE" set "MESSAGE_COUNT" = 1
          • This will make the rows "bad"
        • Bulk add comment to customer
          • This will trigger automation which will check the queues and loop.
        • Observe: Thread dumps show 42 PsmqAsyncExecutors-job threads.
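
      To check a thread dump for this symptom, the matching threads can be counted with a short script. This is a minimal sketch that assumes a jstack-style dump in which each thread header line starts with the quoted thread name. The sample dump, including the numeric suffixes on the thread names, is made up for illustration.

```python
import re

# Assumption: thread header lines look like  "thread-name" #12 daemon prio=5 ...
THREAD_HEADER = re.compile(r'^"([^"]+)"')


def count_psmq_threads(dump_text: str, marker: str = "PsmqAsyncExecutors-job") -> int:
    """Count thread headers whose quoted name contains the marker."""
    count = 0
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match and marker in match.group(1):
            count += 1
    return count


# Fabricated sample for demonstration only.
SAMPLE_DUMP = '''\
"PsmqAsyncExecutors-job-1" #101 daemon prio=5 tid=0x1 nid=0x1 runnable
   java.lang.Thread.State: RUNNABLE
"PsmqAsyncExecutors-job-2" #102 daemon prio=5 tid=0x2 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
"http-nio-8080-exec-1" #55 daemon prio=5 tid=0x3 nid=0x3 waiting on condition
'''

print(count_psmq_threads(SAMPLE_DUMP))  # 2
```

      Comparing the count across several dumps taken a few seconds apart helps distinguish threads that are stuck looping from threads that are briefly busy.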

      Workaround:

      The following query finds queue rows that may trigger this behavior:

      select Q."NAME", Q."MESSAGE_COUNT", count(M."ID") as real_message_count
      from "AO_319474_QUEUE" as Q
      left join "AO_319474_MESSAGE" as M on M."QUEUE_ID" = Q."ID"
      group by Q."NAME", Q."MESSAGE_COUNT"
      having count(M."ID") = 0 and Q."MESSAGE_COUNT" != 0;
      

      If thread dumps reveal looping threads, use the following to correct the offending rows.

      update "AO_319474_QUEUE" set "MESSAGE_COUNT" = 0
      where "NAME" in (
        select Q."NAME"
        from "AO_319474_QUEUE" as Q
        left join "AO_319474_MESSAGE" as M on M."QUEUE_ID" = Q."ID"
        group by Q."NAME", Q."MESSAGE_COUNT"
        having count(M."ID") = 0 and Q."MESSAGE_COUNT" != 0
      );
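
      Before running the correction against a real database, both statements can be rehearsed on a throwaway copy. The sketch below uses an in-memory SQLite database with a deliberately simplified schema: only the columns referenced by the queries are created, and the queue names are invented. The real tables have more columns, and the production database engine may behave differently.

```python
import sqlite3

DETECT = '''
select Q."NAME", Q."MESSAGE_COUNT", count(M."ID") as real_message_count
from "AO_319474_QUEUE" as Q
left join "AO_319474_MESSAGE" as M on M."QUEUE_ID" = Q."ID"
group by Q."NAME", Q."MESSAGE_COUNT"
having count(M."ID") = 0 and Q."MESSAGE_COUNT" != 0
'''

FIX = '''
update "AO_319474_QUEUE" set "MESSAGE_COUNT" = 0
where "NAME" in (
    select Q."NAME"
    from "AO_319474_QUEUE" as Q
    left join "AO_319474_MESSAGE" as M on M."QUEUE_ID" = Q."ID"
    group by Q."NAME", Q."MESSAGE_COUNT"
    having count(M."ID") = 0 and Q."MESSAGE_COUNT" != 0
)
'''


def build_demo_db() -> sqlite3.Connection:
    """Create a simplified, hypothetical copy of the two PSMQ tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript('''
        create table "AO_319474_QUEUE" (
            "ID" integer primary key, "NAME" text, "MESSAGE_COUNT" integer);
        create table "AO_319474_MESSAGE" (
            "ID" integer primary key, "QUEUE_ID" integer);
        -- queue.ok has one real message; queue.stale claims one but has none.
        insert into "AO_319474_QUEUE" values
            (1, 'queue.ok', 1), (2, 'queue.stale', 1), (3, 'queue.empty', 0);
        insert into "AO_319474_MESSAGE" values (10, 1);
    ''')
    return conn


conn = build_demo_db()
before = conn.execute(DETECT).fetchall()  # rows with a stale counter
conn.execute(FIX)
conn.commit()
after = conn.execute(DETECT).fetchall()   # should now be empty

print(before)  # [('queue.stale', 1, 0)]
print(after)   # []
```

      Only the row whose counter disagrees with an empty message set is reset. The queue that really holds a message keeps its count.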
      

      Notes

      • Encountered by customer in JIRA 7.2.11 with JSD 3.2.11
      • Reproduced using "create scenario" steps in JIRA 7.5.2 with JSD 3.8.3 and 7.6.0 with JSD 3.9.0
      • Suspected not to affect JIRA 7.6 "because Automation behavior was fixed to use different claimants for each node."
      • This can cause stability issues (OOM) when accompanied by JRASERVER-66158

            People

              Assignee: Unassigned
              Reporter: Tim Evans (Inactive)
              Votes: 0
              Watchers: 19
