Automation for Jira Server / JIRAAUTOSERVER-1063

Monitor Automation events queue health

      Issue Summary

      The customer requests additional monitoring capability on the Automation for Jira Performance Insights screen.

      Steps to Reproduce

      1. Have a large object type (5000+ objects) that is refreshed daily from an external DB source, or a large number of rule executions (5000+) in a short period of time.
      2. It takes a while for all automation events to execute, and rogue rules can inflate the queue dramatically, preventing critical rules from being executed.
      3. We can see the queued events only when we visit the monitoring screen. We also receive email alerts when a rule fails.
      4. There are no other metrics or alerts for monitoring the health of A4J.

      Expected Results

      Additional metrics, such as those produced by the DB queries in https://jira.atlassian.com/browse/JIRAAUTOSERVER-266.

      Running the queries below might help with troubleshooting this case:

      -- Shows which rules have events waiting to be processed
      SELECT "RULE_ID", count(*) AS queued_events
      FROM "AO_589059_AUTOMATION_QUEUE"
      GROUP BY "RULE_ID"
      ORDER BY queued_events DESC;

      -- Shows how many rules ran per hour in the last 2 weeks, including the total and average duration during that hour
      SELECT date_trunc('hour', "CREATED") AS rule_ran_hour, count(*), sum("DURATION"), avg("DURATION")
      FROM "AO_589059_AUDIT_ITEM"
      WHERE "CREATED" > (now() - interval '2 week')
      GROUP BY rule_ran_hour
      ORDER BY rule_ran_hour;

      -- Shows the top 50 rules ordered by their average time in the queue over the last 2 weeks
      SELECT ai."OBJECT_ITEM_ID" AS rule_id, rc."NAME" AS rule_name, count(*),
        avg(ai."DURATION")             AS time_avg_ms,
        avg("END_TIME" - "START_TIME") AS queued_time_avg
      FROM "AO_589059_AUDIT_ITEM" ai JOIN "AO_589059_RULE_CONFIG" rc ON ai."OBJECT_ITEM_ID" = rc."ID"
      WHERE ai."CREATED" > (now() - INTERVAL '2 week')
      GROUP BY ai."OBJECT_ITEM_ID", rc."NAME"
      ORDER BY queued_time_avg DESC
      LIMIT 50;

      In addition, it would be great to have customizable email alerts for the following conditions:

      • The queue size exceeds a certain threshold.
      • The number of rules processed per xx minutes falls below a certain threshold, signaling that the queue may be 'stuck'.
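
      The two alert conditions above could be evaluated roughly as in the sketch below. It is only an illustration of the requested behavior, not how A4J works: the names QueueThresholds and evaluate_queue_health and all threshold values are hypothetical. The inputs are assumed to come from counts such as the number of rows in "AO_589059_AUTOMATION_QUEUE" and the number of "AO_589059_AUDIT_ITEM" rows created within the chosen time window.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueueThresholds:
    """Hypothetical, site-specific alert thresholds (not an A4J setting)."""
    max_queue_size: int            # alert when more events than this are queued
    min_processed_per_window: int  # alert when fewer rules than this ran in the window


def evaluate_queue_health(queue_size: int,
                          processed_in_window: int,
                          thresholds: QueueThresholds) -> List[str]:
    """Return one alert message per violated condition (empty list if healthy)."""
    alerts = []

    # Condition 1: the queue size exceeds the configured threshold.
    if queue_size > thresholds.max_queue_size:
        alerts.append(
            f"Queue size {queue_size} exceeds threshold {thresholds.max_queue_size}"
        )

    # Condition 2: throughput is below the threshold. As an assumption of this
    # sketch, low throughput is only reported while events are still waiting,
    # because an idle system with an empty queue is not 'stuck'.
    if queue_size > 0 and processed_in_window < thresholds.min_processed_per_window:
        alerts.append(
            f"Only {processed_in_window} rules processed in the window "
            f"(minimum {thresholds.min_processed_per_window}); the queue may be stuck"
        )

    return alerts
```

      A scheduled job could call evaluate_queue_health with fresh counts and send an email whenever the returned list is not empty.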

      Actual Results

      No additional monitoring or alerting capability is available.

      Workaround

      No workaround is available.

              Unassigned Unassigned
              f69110b8fa9f Johnny Mon
              Votes: 4
              Watchers: 3