Uploaded image for project: 'Automation for Jira Server'
  1. Automation for Jira Server
  2. JIRAAUTOSERVER-266

After upgrading A4J to 7.3.x the majority of my rules stopped working

    XMLWordPrintable

Details

    • Bug
    • Resolution: Tracked Elsewhere
    • Highest
    • None
    • 7.3.0, 7.3.1, 7.3.2
    • None
    • Severity 1 - Critical

    Description

      Issue Summary

      After upgrading A4J to 7.3.x some customers are finding some null entries on rule actor and it seems like there are some errors about “No automation items were claimed from the queue for processing on this node since most automation rules depend on JQL indexes being available.”.

      Analysing further we've noticed that there were more than 60k results on the AO_589059_AUTOMATION_QUEUE table on JIRA startup. The majority of such entries is with a rule_ID = null .

      This problem currently might happen regardless the database type and the size of your instance.

      Steps to Reproduce

      1. After upgrading A4J to 7.3.0
      2. Try running any rule on your environment

      Expected Results

      The rule runs without problem.

      Actual Results

      More details are still being gathered.
      Some errors on the logs are shown and the rule doesn't executes (or delays its execution).

      2021-05-03 21:28:08,193-0500 automation-rule-executor:thread-4 ERROR anonymous     [c.c.j.p.automation.queue.JiraAutomationQueueExecutor] Unknown error processing queue item QueueItemBean{id=9277044, auditItemId=null, executionUuid=a5e97526-86b0-402b-a0f7-320ac38122e8, ruleId=null, priority=5, created=2021-05-03 08:21:45.948, claimant='automation-queue-claimer', claimTime=2021-05-03 20:26:45.459, payload='{"tenantContext":{"clientKey":"com.codebarrel.tenant.global","environment":"prod","tenantId":{"id":"00000000-0000-0000-0000-000000bbbbbb"}},"event":{"clientKey":"com.codebarrel.tenant.global","triggerType":"EVENT","type":"jira:issue_created:issue_...'}:
      java.lang.RuntimeException: com.atlassian.jira.issue.index.SearchUnavailableException
      	at com.codebarrel.jira.plugin.automation.service.execution.JiraThreadLocalExecutor.executeAsWithResult(JiraThreadLocalExecutor.java:70)
      	at com.codebarrel.jira.plugin.automation.service.execution.JiraThreadLocalExecutor.executeAsWithResult(JiraThreadLocalExecutor.java:53)
      	at com.codebarrel.jira.plugin.automation.queue.JiraAutomationQueueExecutor.lambda$processClaimedItem$4(JiraAutomationQueueExecutor.java:252)
      	at com.atlassian.jira.util.concurrent.BoundedExecutor$1.run(BoundedExecutor.java:50)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: com.atlassian.jira.issue.index.SearchUnavailableException
      	at com.atlassian.jira.issue.index.DefaultIndexManager.getEntitySearcher(DefaultIndexManager.java:927)
      	at com.atlassian.jira.issue.index.DefaultIndexManager.getIssueSearcher(DefaultIndexManager.java:906)
      	at sun.reflect.GeneratedMethodAccessor972.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      
      2021-05-04 08:01:30,959-0500 https-jsse-nio-4343-exec-22 ERROR jra_adm 463x29149x3 1oaf9f9 162.202.158.61 /servicedesk/customer/portal/2/create/74 [c.c.j.p.a.service.execution.JiraThreadLocalExecutor] Unexpected error in thread local executor with actorKey 'jra_adm'.
      com.querydsl.core.QueryException: Caught PSQLException for select "AO_589059_AUDIT_ITEM_COMP_CGE"."AUDIT_ITEM_ID", "AO_589059_AUDIT_ITEM_COMP_CGE"."COMPONENT_NAME_KEY", "AO_589059_AUDIT_ITEM_COMP_CGE"."COMPONENT", "AO_589059_AUDIT_ITEM_COMP_CGE"."COMPONENT_ID", "AO_589059_AUDIT_ITEM_COMP_CGE"."DURATION", "AO_589059_AUDIT_ITEM_COMP_CGE"."ID", "AO_589059_AUDIT_ITEM_COMP_CGE"."OPTIMISED_IDS", "AO_589059_AUDIT_ITEM_COMP_CGE"."START_TIME", "AO_589059_AUDIT_ITEM_CGE_ITEM"."AUDIT_ITEM_COMPONENT_CHANGE_ID", "AO_589059_AUDIT_ITEM_CGE_ITEM"."AUDIT_ITEM_ID", "AO_589059_AUDIT_ITEM_CGE_ITEM"."CHANGE_FROM", "AO_589059_AUDIT_ITEM_CGE_ITEM"."CHANGE_TO", "AO_589059_AUDIT_ITEM_CGE_ITEM"."FIELD_NAME", "AO_589059_AUDIT_ITEM_CGE_ITEM"."ID", "AO_589059_AUDIT_ITEM_CGE_ITEM"."MESSAGE"
      from "public"."AO_589059_AUDIT_ITEM_COMP_CGE" "AO_589059_AUDIT_ITEM_COMP_CGE"
      left join "public"."AO_589059_AUDIT_ITEM_CGE_ITEM" "AO_589059_AUDIT_ITEM_CGE_ITEM"
      on "AO_589059_AUDIT_ITEM_COMP_CGE"."ID" = "AO_589059_AUDIT_ITEM_CGE_ITEM"."AUDIT_ITEM_COMPONENT_CHANGE_ID"
      where "AO_589059_AUDIT_ITEM_COMP_CGE"."AUDIT_ITEM_ID" = ?
      order by "AO_589059_AUDIT_ITEM_COMP_CGE"."START_TIME" asc, "AO_589059_AUDIT_ITEM_COMP_CGE"."ID" asc
      	at com.querydsl.sql.DefaultSQLExceptionTranslator.translate(DefaultSQLExceptionTranslator.java:50)
      	at com.querydsl.sql.Configuration.translate(Configuration.java:459)
      

      It might also get some problems on thread waiting for the database to response:

      Running the queries below might help on troubleshooting this case:

      // Count how many items we have waiting to be processed
      SELECT count(*) FROM "AO_589059_AUTOMATION_QUEUE";
      
      //Shows which rules have events waiting to processed
      SELECT "RULE_ID", count(*) FROM "AO_589059_AUTOMATION_QUEUE" GROUP BY "RULE_ID" order by count desc;
      
      // Shows how many rule ran per hour in last 2 weeks. Including total & avg duration during that hour
      SELECT date_trunc('hour', "CREATED") AS rule_ran_hour, count(*), sum("DURATION"), avg("DURATION")
      FROM "AO_589059_AUDIT_ITEM"
      WHERE "CREATED" > (now() - interval '2 week')
      GROUP BY rule_ran_hour
      ORDER BY rule_ran_hour;
      
      // shows top 50 rules ordered by how much time they were in the queue in total in the last 2 weeks
      SELECT ai."OBJECT_ITEM_ID" as rule_id, rc."NAME" as rule_name, count(*),
        avg("DURATION")              AS time_avg_ms,
        avg("END_TIME" - "START_TIME") AS queued_time_avg
      FROM "AO_589059_AUDIT_ITEM" ai JOIN "AO_589059_RULE_CONFIG" rc ON "OBJECT_ITEM_ID" = rc."ID" 
      WHERE ai."CREATED" > (now() - INTERVAL '2 week')
      GROUP BY "OBJECT_ITEM_ID", rc."NAME"
      ORDER BY queued_time_avg DESC
      LIMIT 50;
      

      Workaround

      Currently there is no known workaround for this behavior. A workaround will be added here when available

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              vseger@atlassian.com Victor Seger (Inactive)
              Votes:
              6 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: