Type: Bug
Resolution: Fixed
Priority: Highest
Affects Version/s: 3.9.0, 3.10.0, 3.11.0
Severity: 2 - Major
Hi everyone,
We have reopened this bug after investigation of a related issue documented in JSDSERVER-5736.
- This is related to deadlocking of threads when there are frequent actions on one request. A workaround for this was released behind a dark feature in 4.9.0.
- The development team will be working on a permanent fix shortly.
For details on the workaround, please see the related documentation.
We hope to resolve this issue as soon as possible!
If you have further concerns with the above, please open a support ticket via https://support.atlassian.com
Thank you,
Alex
The expectation system executes each job on the node that submitted it. When the submitting thread is a request thread, the system generates a unique identifier for each submission and enqueues it on a PSMQ for the issue to which the given event pertains; the OffThreadEventJobRunner then uses an unbounded ThreadPoolExecutor (JSDSERVER-5732) to spawn a thread that is equipped with that unique identifier.
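For illustration, the submission path might look roughly like the sketch below. The names here are assumptions made for readability: PsmqQueue and OffThreadEventJobSubmitter are hypothetical stand-ins rather than the real JSD types, and newCachedThreadPool plays the role of the unbounded ThreadPoolExecutor. The claim loop run by each spawned thread (JobExecution) is sketched after the failure-mode discussion below.

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the per-issue PSMQ queue; the method names
// are assumptions, not the real PSMQ API.
interface PsmqQueue {
    void enqueue(String messageId);
    String peekHeadMessageId(); // null if the queue is empty
    void dequeueHead();
    boolean tryLock();          // models the UPDATE QUEUE row lock
    void unlock();
}

class OffThreadEventJobSubmitter {
    // newCachedThreadPool is an unbounded ThreadPoolExecutor: a burst of
    // submissions creates as many threads as there are jobs (JSDSERVER-5732).
    private final ExecutorService executor = Executors.newCachedThreadPool();

    void submit(PsmqQueue issueQueue, Runnable work) {
        // One unique identifier per submission, enqueued on the queue of
        // the issue that the event pertains to...
        String messageId = UUID.randomUUID().toString();
        issueQueue.enqueue(messageId);
        // ...and one job execution thread equipped with that identifier.
        executor.execute(() -> JobExecution.run(issueQueue, messageId, work));
    }
}
```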
Each job execution thread then:
- locks the queue on which it expects to see its unique identifier (UPDATE QUEUE),
- locks the message at the head of the queue, expecting that the message contains its unique identifier (BEGIN + UPDATE QUEUE + SELECT MESSAGE + UPDATE MESSAGE + UPDATE QUEUE + COMMIT),
- dequeues the message (BEGIN + UPDATE QUEUE + DELETE MESSAGE + UPDATE QUEUE + COMMIT),
- and finally executes the associated work.
There are many points at which this process can fail under contention. While most of the failure modes that PSMQ handles with immediate retries for up to 10,000 ms should be unlikely (fewer messages than readers; next message has expired; message dequeued by another reader between peek and dequeue), there are at least two failure modes that we handle by repeating the process after sleeping for 5 ms (queue locked by another reader; next message doesn't contain the thread's unique identifier).
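Continuing the sketch, the two 5 ms failure modes could be modelled as a claim loop like the one below. The 5 ms sleep and the roughly 10-second retry budget come from the description above; the loop structure and method names are assumptions, not the actual PSMQ implementation.

```java
// The loop each job execution thread runs (continuing the sketch above).
final class JobExecution {
    private static final long BUDGET_MS = 10_000; // overall retry budget
    private static final long SLEEP_MS = 5;       // back-off between attempts

    static void run(PsmqQueue queue, String messageId, Runnable work) {
        long deadline = System.currentTimeMillis() + BUDGET_MS;
        while (System.currentTimeMillis() < deadline) {
            boolean claimed = false;
            // Failure mode 1: the queue is locked by another reader, so
            // tryLock fails and we fall through to the 5 ms sleep.
            if (queue.tryLock()) {
                try {
                    // Failure mode 2: the head message carries another
                    // submission's identifier, not ours.
                    if (messageId.equals(queue.peekHeadMessageId())) {
                        queue.dequeueHead();
                        claimed = true;
                    }
                } finally {
                    queue.unlock();
                }
            }
            if (claimed) {
                work.run(); // finally execute the associated work
                return;
            }
            // Each failed attempt costs at least one UPDATE QUEUE query
            // against the database before the thread sleeps and retries.
            try {
                Thread.sleep(SLEEP_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```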
This approach is obviously inefficient, but I can see how it might be an appropriate choice under an assumption like “contention for a given queue should only occur rarely and randomly”. That assumption, at the very least, doesn’t seem to hold, given the reports that some bulk issue operation and ScriptRunner usage patterns can easily result in contention, and evidence from our initial investigation that actions as simple as posting a comment almost always result in contention.
In the case of posting a comment, the contention occurs because our EventListenerLauncher appears to submit one SLA cycle updater job for each of the two commit-wrapped events that are dispatched by JIRA. The usual outcome is that we spawn two job execution threads in quick succession that contend for the same queue, and one of the threads issues at least ten unsuccessful UPDATE QUEUE queries under ideal conditions (no other activity, local database, empty instance, task management project, and the thread for the first job wins the race).
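In terms of the sketch above, the comment scenario reduces to two submissions in quick succession against the same per-issue queue. InMemoryPsmqQueue below is a purely illustrative stand-in for the database-backed queue, just enough to exercise the race locally:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

// Trivial in-memory implementation, just enough to run the race locally;
// the real queue is database-backed.
class InMemoryPsmqQueue implements PsmqQueue {
    private final Deque<String> messages = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();

    public synchronized void enqueue(String id) { messages.addLast(id); }
    public synchronized String peekHeadMessageId() { return messages.peekFirst(); }
    public synchronized void dequeueHead() { messages.pollFirst(); }
    public boolean tryLock() { return lock.tryLock(); }
    public void unlock() { lock.unlock(); }
}

public class CommentContentionDemo {
    public static void main(String[] args) {
        OffThreadEventJobSubmitter submitter = new OffThreadEventJobSubmitter();
        PsmqQueue queue = new InMemoryPsmqQueue();

        // One comment, two commit-wrapped events, two SLA cycle updater
        // jobs: the two job execution threads now contend for the same
        // queue, and the loser spends its time in the 5 ms retry loop.
        submitter.submit(queue, () -> System.out.println("SLA cycle update #1"));
        submitter.submit(queue, () -> System.out.println("SLA cycle update #2"));
    }
}
```

Whichever thread loses the race spends its time in the sleep-and-retry loop; against a real database each of those attempts is at least one UPDATE QUEUE round trip, which is where the repeated unsuccessful queries described above come from.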
causes:
- JSDSERVER-5736 Poor performance with high CPU and a high number of SdOffThreadEventJobRunner threads (Closed)
is caused by:
- JSDSERVER-4743 Webhook triggered by Issue Creation is not sent when JQL filter uses SLA fields (Closed)
- JSMDC-293
relates to:
- JSDSERVER-6717 Working on busy Jira instance could result in a deadlock condition for the SdOffThreadEventJobRunner while scanning PSMQ queues. (Closed)
- JSDSERVER-5736 Poor performance with high CPU and a high number of SdOffThreadEventJobRunner threads (Closed)
- JSDSERVER-6715 Adding a comment or creating an issue in a non-JSD project results in JSD PSMQ off thread event handling code kicking in without need. (Closed)
is cloned by:
- JSDS-5079