Jira Service Management Data Center
JSDSERVER-5730

OffThreadEventJobRunner job execution threads wait for their turn in a very expensive way


Details

    Description

      Atlassian Update – 10 September 2021

      Hi everyone,

      We have reopened this bug after investigating a related issue documented in JSDSERVER-5736.

      • This issue is related to threads deadlocking when there are frequent actions on one request. A workaround was released behind a dark feature in 4.9.0.
      • The development team will be working on a permanent fix shortly.

      For details on the workaround, please see the related documentation - 

      We hope to resolve this issue as soon as possible!

      If you have further concerns with the above, please open a support ticket via https://support.atlassian.com

      Thank you,

      Alex

       

      The system is expected to execute each job on the node that submitted it. When the submitting thread is a request thread, the system generates a unique identifier for each submission and enqueues it on a PSMQ for the issue to which the given event pertains; the OffThreadEventJobRunner then uses an unbounded ThreadPoolExecutor (JSDSERVER-5732) to spawn a thread that carries that unique identifier.
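      A minimal sketch of that submission path, in plain Java. All names here (SubmissionSketch, submit, claimId, and the in-memory queue map) are hypothetical stand-ins, not the real JSD internals; a cached thread pool is used to model the unbounded executor, since it likewise spawns a new thread whenever none is idle.

      ```java
      import java.util.Map;
      import java.util.Queue;
      import java.util.UUID;
      import java.util.concurrent.ConcurrentHashMap;
      import java.util.concurrent.ConcurrentLinkedQueue;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      // Hypothetical sketch; not the real JSD classes.
      public class SubmissionSketch {
          // One FIFO queue per issue, standing in for the per-issue PSMQ queue.
          static final Map<Long, Queue<String>> queuesByIssue = new ConcurrentHashMap<>();
          // Stands in for the unbounded ThreadPoolExecutor (JSDSERVER-5732).
          static final ExecutorService pool = Executors.newCachedThreadPool();

          static String submit(long issueId, Runnable work) {
              String claimId = UUID.randomUUID().toString();        // unique identifier per submission
              queuesByIssue
                  .computeIfAbsent(issueId, k -> new ConcurrentLinkedQueue<>())
                  .add(claimId);                                    // enqueue on the issue's queue
              pool.submit(() -> {
                  // The job execution thread carries claimId; it must wait until
                  // claimId reaches the head of the issue's queue before running work.
                  work.run();
              });
              return claimId;
          }

          public static void main(String[] args) throws InterruptedException {
              submit(101L, () -> System.out.println("job for issue 101 ran"));
              pool.shutdown();
              pool.awaitTermination(5, TimeUnit.SECONDS);
          }
      }
      ```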

      Each job execution thread locks the queue on which it expects to see its unique identifier (UPDATE QUEUE), locks the message at the head of the queue expecting that the message contains its unique identifier (BEGIN + UPDATE QUEUE + SELECT MESSAGE + UPDATE MESSAGE + UPDATE QUEUE + COMMIT), dequeues the message (BEGIN + UPDATE QUEUE + DELETE MESSAGE + UPDATE QUEUE + COMMIT), and finally executes the associated work.
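      One claim attempt by an execution thread can be sketched as follows, with a ConcurrentLinkedQueue standing in for the PSMQ table and the BEGIN/UPDATE/SELECT/DELETE/COMMIT statement sequences above collapsed into peek and remove. The names are hypothetical, and the non-atomic peek-then-remove deliberately mirrors one of the failure modes discussed below.

      ```java
      import java.util.Queue;
      import java.util.concurrent.ConcurrentLinkedQueue;

      // Hypothetical sketch of one claim attempt by a job execution thread.
      public class ClaimAttempt {
          /** Returns true if the thread's message was at the head and the work ran. */
          static boolean tryClaimAndRun(Queue<String> queue, String claimId, Runnable work) {
              String head = queue.peek();            // inspect the message at the head
              if (!claimId.equals(head)) {
                  return false;                      // head is missing or belongs to another thread
              }
              queue.remove(head);                    // dequeue the message
              work.run();                            // execute the associated work
              return true;
          }

          public static void main(String[] args) {
              Queue<String> q = new ConcurrentLinkedQueue<>();
              q.add("first");
              q.add("second");
              System.out.println(tryClaimAndRun(q, "second", () -> {}));  // false: not at head yet
              System.out.println(tryClaimAndRun(q, "first", () -> {}));   // true: claimed and ran
          }
      }
      ```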

      There are many points at which this process can fail under contention. While most of the failure modes that PSMQ handles with immediate retries up to 10000 ms should be unlikely (fewer messages than readers; next message has expired; message dequeued by other reader between peek and dequeue), there are at least two failure modes that we handle by repeating the process after sleeping for 5 ms (queue locked by other reader; next message doesn’t contain thread’s unique identifier).
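      The sleep-and-repeat handling amounts to polling. A sketch of the resulting loop, again with hypothetical names and the queue modeled in memory rather than via SQL:

      ```java
      import java.util.Queue;
      import java.util.concurrent.ConcurrentLinkedQueue;

      // Hypothetical sketch of the busy-wait: on the two contended failure modes
      // (queue locked by another reader, or the head message isn't ours) the thread
      // sleeps 5 ms and repeats the whole claim sequence from the top.
      public class SleepRetryLoop {
          static int runWhenAtHead(Queue<String> queue, String claimId, Runnable work)
                  throws InterruptedException {
              int attempts = 0;
              while (true) {
                  attempts++;
                  if (claimId.equals(queue.peek())) {  // our message reached the head
                      queue.remove(claimId);
                      work.run();
                      return attempts;                 // number of claim attempts made
                  }
                  Thread.sleep(5);                     // back off, then retry everything
              }
          }

          public static void main(String[] args) throws InterruptedException {
              Queue<String> q = new ConcurrentLinkedQueue<>();
              q.add("other");
              q.add("mine");
              // Simulate the competing thread finishing its job after ~20 ms.
              new Thread(() -> {
                  try { Thread.sleep(20); } catch (InterruptedException ignored) {}
                  q.remove("other");
              }).start();
              int attempts = runWhenAtHead(q, "mine", () -> System.out.println("work ran"));
              System.out.println("claim attempts: " + attempts);  // several, due to polling
          }
      }
      ```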

      This approach is obviously inefficient, but I can see how it might be an appropriate choice under an assumption like “contention for a given queue should only occur rarely and randomly”. That assumption doesn’t seem to hold: there are reports that some bulk issue operations and ScriptRunner usage patterns can easily result in contention, and our initial investigation found evidence that actions as simple as posting a comment almost always result in contention.

      In the case of posting a comment, the contention occurs because our EventListenerLauncher appears to submit one SLA cycle updater job for each of the two commit-wrapped events that are dispatched by JIRA. The usual outcome is that we spawn two job execution threads in quick succession that contend for the same queue, and one of the threads issues at least ten unsuccessful UPDATE QUEUE queries under ideal conditions (no other activity, local database, empty instance, task management project, and the thread for the first job wins the race).
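      That scenario can be reproduced in miniature: two submissions land on the same issue's queue back to back, and the losing thread polls every 5 ms until the winner finishes its work and dequeues. This is a hypothetical in-memory model, not the real SQL path; the winner always records zero failed polls, while the loser records several, mirroring the repeated unsuccessful UPDATE QUEUE queries.

      ```java
      import java.util.Queue;
      import java.util.concurrent.ConcurrentLinkedQueue;
      import java.util.concurrent.CountDownLatch;

      // Hypothetical model of posting a comment: two commit-wrapped events yield
      // two SLA cycle updater jobs for the same issue, so two execution threads
      // race for one queue and the loser polls every 5 ms until the winner is done.
      public class CommentContention {
          /** Returns the number of failed polls for each job; index 0 is job-1. */
          static int[] runTwoJobs(long workMillis) throws InterruptedException {
              Queue<String> issueQueue = new ConcurrentLinkedQueue<>();
              String[] jobs = {"job-1", "job-2"};
              for (String j : jobs) issueQueue.add(j);
              int[] polls = new int[2];
              CountDownLatch done = new CountDownLatch(2);
              for (int i = 0; i < 2; i++) {
                  final int idx = i;
                  new Thread(() -> {
                      while (!jobs[idx].equals(issueQueue.peek())) {
                          polls[idx]++;                     // an unsuccessful claim attempt
                          sleepQuietly(5);
                      }
                      sleepQuietly(workMillis);             // the actual SLA update "work"
                      issueQueue.remove(jobs[idx]);
                      done.countDown();
                  }).start();
              }
              done.await();
              return polls;
          }

          static void sleepQuietly(long ms) {
              try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
          }

          public static void main(String[] args) throws InterruptedException {
              int[] polls = runTwoJobs(30);
              System.out.println("job-1 failed polls: " + polls[0]);
              System.out.println("job-2 failed polls: " + polls[1]);
          }
      }
      ```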


      People

        esantos2 Elton Santos
        dazabani Delan Azabani (Inactive)
        Votes: 38
        Watchers: 53
