Details

Type: Bug
Resolution: Unresolved
Priority: Medium
Fix Version/s: None
Affects Version/s: 8.13.0, 8.20.0, 9.0.0
Component/s: Data Center - Other, Mail Server
Labels:
- whl-fy24q3

Introduced in Version:
8.13
Support reference count:
7
Symptom Severity:
Severity 2 - Major
UIS:
5
Bug Fix Policy:
View Atlassian Server bug fix policy

Description

Issue Summary

Context behind this bug

In a cluster of Jira nodes, scheduled jobs can be picked up and executed by any Jira node.

Examples of scheduled jobs include, but are not limited to:

the Jira Batched Notification job (responsible for generating Jira batched Notifications and adding them to the Mail Queue)
the Jira Service Management Notification job (responsible for generating Customer notifications and adding them to the Mail Queue)

Ideally, the execution of a scheduled job should be fairly distributed across all the nodes, so that:

Node 1 runs job A
Then next time job A is scheduled, Node 2 should run it
Then next time job A is scheduled, Node 3 should run it
Etc...

With scheduled jobs distributed fairly across nodes, whenever a job adds emails to the Mail Queue, the emails piling up in the queue are also distributed fairly across nodes (each node has its own Mail Queue).
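
The ideal rotation described above can be illustrated with a minimal sketch. This is not Jira's scheduler code; the function name, node names, and email counts are hypothetical and only model "each run goes to the next node, and each node has its own queue".

```python
from itertools import cycle


def round_robin_queue_sizes(nodes, runs, emails_per_run):
    """Model per-node mail queue sizes when job runs rotate across nodes."""
    # Each node has its own mail queue, modelled here as a simple counter.
    queue_sizes = {node: 0 for node in nodes}
    rotation = cycle(nodes)  # node1, node2, node3, node1, ...
    for _ in range(runs):
        node = next(rotation)
        # The node that runs the job adds the generated emails to its own queue.
        queue_sizes[node] += emails_per_run
    return queue_sizes


sizes = round_robin_queue_sizes(["node1", "node2", "node3"], runs=6, emails_per_run=100)
print(sizes)  # {'node1': 200, 'node2': 200, 'node3': 200}
```

With six runs over three nodes, each node ends up with the same share of the queued emails.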

Problem

For unknown reasons, on some Jira instances whose heavy daily operations generate a lot of emails (Jira batched notifications or Customer notifications), the same Jira node has been observed to run the same job over and over, instead of letting other nodes run the job on a round-robin basis.

When the job that a node keeps running on its own is the Jira Batched Notification job or the JSM Customer Notification job, the following happens:

all the emails end up in the Mail Queue of a single node
on a busy Jira instance, the emails are sent with a long delay, since all of them are sent by that single node, and emails are sent by the Mail Queue Service job, which uses a single Caesium thread
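
To make the delay concrete, here is a rough back-of-the-envelope sketch. The queue sizes and the per-node sending rate are hypothetical numbers, not measurements from an affected instance, and the model ignores the Mail Queue Service schedule; it only assumes that each node drains its own queue with a single sending thread.

```python
def drain_minutes(queue_sizes, emails_per_minute_per_node):
    """Estimate how long it takes until every node's mail queue is empty."""
    # Each node drains only its own queue with one sending thread,
    # so the cluster is done when the busiest node is done.
    return max(queue_sizes.values()) / emails_per_minute_per_node


# Hypothetical example: 600 emails in total, 50 emails per minute per node.
pinned = {"node1": 600, "node2": 0, "node3": 0}        # one node ran every job
balanced = {"node1": 200, "node2": 200, "node3": 200}  # jobs rotated across nodes

print(drain_minutes(pinned, 50))    # 12.0
print(drain_minutes(balanced, 50))  # 4.0
```

Under these assumptions, the same volume of email takes three times longer to send when a single node ends up with the whole queue.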

Steps to replicate

Unknown. We haven't been able to replicate this behavior in our local Jira environment.

Suggestion

Improve the way scheduled jobs are picked up by the nodes in a Jira Cluster, in order to ensure that jobs are executed fairly across all Jira nodes.

Note: technically this is not exactly a bug, since scheduled jobs are picked up "randomly" by any available Jira node. We are raising this ticket as a bug because the current algorithm does not prevent the same node from picking up the same job repeatedly, which causes that node's Jira Mail Queue to pile up.

People

Assignee: Unassigned

Reporter: Julien Rey

Votes: 7

Watchers: 13

Dates

Created: 10/May/2023 4:42 PM

Updated: 03/Apr/2024 4:03 AM