  1. Jira Data Center
  2. JRASERVER-75733

Scheduled jobs might end up being run by the same Jira node over and over

Details

    Description

      Issue Summary

      Context behind this bug

      In a cluster of Jira nodes, scheduled jobs can be picked up and executed by any Jira node.

      Examples of scheduled jobs include, but are not limited to:

      • the Jira Batched Notification job (responsible for generating Jira batched Notifications and adding them to the Mail Queue)
      • the Jira Service Management Notification job (responsible for generating Customer notifications and adding them to the Mail Queue)

      Ideally, the execution of a scheduled job should be fairly distributed across all the nodes, so that:

      • Node 1 runs job A
      • Then next time job A is scheduled, Node 2 should run it
      • Then next time job A is scheduled, Node 3 should run it
      • Etc...

      With scheduled jobs fairly distributed across nodes, whenever a job adds emails to the Mail Queue, the emails piling up in the queue are spread evenly across nodes (each node has its own Mail Queue).
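
      The ideal rotation described above can be sketched as follows. This is only an illustration, not Jira's scheduler code: the node names, the function names, and the figure of 100 emails per run are all assumptions made for the example.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(nodes, run_count):
    """Return the node chosen for each of `run_count` consecutive runs of one job."""
    picker = cycle(nodes)
    return [next(picker) for _ in range(run_count)]


def mail_queue_sizes(assignments, emails_per_run):
    """Each run adds `emails_per_run` emails to the queue of the node that ran it."""
    sizes = Counter()
    for node in assignments:
        sizes[node] += emails_per_run
    return dict(sizes)


# Hypothetical 3-node cluster; names and volumes are illustrative only.
nodes = ["node1", "node2", "node3"]
assignments = assign_round_robin(nodes, 6)
queues = mail_queue_sizes(assignments, emails_per_run=100)

print(assignments)
print(queues)
```

      With a strict rotation, six runs of the job leave the same number of emails on each node's queue.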

      Problem

      For unknown reasons, it has been observed on some Jira instances with heavy daily operations generating a lot of emails (Jira batched notifications or customer notifications) that the same Jira node tends to run the same job over and over, instead of letting other nodes run the job on a round-robin basis.
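
      One way to quantify this kind of skew is to count how often each node ran a given job. The sketch below assumes a hypothetical list of (job, node) records; Jira does not expose run history in this form, and the job identifier and the tolerance threshold are made up for the example.

```python
from collections import Counter


def node_share(run_log, job_name):
    """Fraction of the runs of `job_name` that each node executed."""
    counts = Counter(node for job, node in run_log if job == job_name)
    total = sum(counts.values())
    return {node: n / total for node, n in counts.items()} if total else {}


def is_skewed(shares, node_count, tolerance=2.0):
    """True if any node ran more than `tolerance` times its fair share."""
    fair = 1.0 / node_count
    return any(share > tolerance * fair for share in shares.values())


# Hypothetical run history: node1 ran the job 9 times out of 10.
run_log = [("batched-notifications", "node1")] * 9 + [
    ("batched-notifications", "node2")
]
shares = node_share(run_log, "batched-notifications")
skewed = is_skewed(shares, node_count=3)

print(shares)
print(skewed)
```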

      When the job that the node keeps running on its own is the Jira Batched Notification job or the JSM Customer Notification job, the following will happen:

      • all the emails will end up on the Mail Queue of 1 single node
      • on a busy Jira instance, the emails will be sent with a long delay, since all emails are sent by 1 single node and emails are sent by the Mail Queue Service job, which uses 1 single Caesium thread
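
      The effect on delivery delay can be illustrated with simple arithmetic. The sketch below assumes each node drains only its own queue with a single thread at a fixed rate; the queue sizes and the rate of 5 emails per second are invented for the example and are not measured Jira figures.

```python
import math


def drain_seconds(queue_sizes, emails_per_second_per_node):
    """Seconds until every queue is empty when each node drains only its own
    queue with a single sending thread."""
    return max(
        math.ceil(size / emails_per_second_per_node)
        for size in queue_sizes.values()
    )


# Same total of 3000 emails, distributed differently across 3 nodes.
skewed_queues = {"node1": 3000, "node2": 0, "node3": 0}
balanced_queues = {"node1": 1000, "node2": 1000, "node3": 1000}
rate = 5  # assumed emails per second for one sending thread

skewed_time = drain_seconds(skewed_queues, rate)
balanced_time = drain_seconds(balanced_queues, rate)

print(skewed_time, balanced_time)
```

      Under these assumptions the skewed case takes three times as long to drain, because two of the three sending threads stay idle.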

      Steps to replicate

      Unknown. We haven't been able to replicate this behavior in our local Jira environment.

      Suggestion

      Improve the way scheduled jobs are picked up by the nodes in a Jira Cluster, in order to ensure that jobs are executed fairly across all Jira nodes.

      Note: while this is not exactly a bug, since technically scheduled jobs are picked up "randomly" by any Jira node that is available, we are raising this ticket as a bug since the current algorithm does not prevent the same node from picking up the same job repeatedly, causing the Jira Mail Queue to pile up.

          People

            Assignee: Unassigned
            Reporter: Julien Rey (jrey)
            Votes: 7
            Watchers: 13
