Jira Data Center / JRASERVER-69652

Asynchronous cache replication can cause extra overhead with a large number of cache updates and many stale nodes


    • Severity 2 - Major
      Atlassian Update – 22 September 2025

      Dear Customers,

      We have investigated the bug again after implementing automated stale node removal in https://jira.atlassian.com/browse/JRASERVER-42916. From the information we collected, the cause of the issue was always linked to an unusual number of stale nodes, and the automated removal should help resolve most of those cases. To address situations where the default configuration isn’t sufficient, we also created a Knowledge Base article (https://support.atlassian.com/jira/kb/stale-no-heartbeat-nodes-negatively-affect-performance-in-jira-data-center/) explaining how to tweak the automation or the node removal process to better fit your needs and ensure your instance’s performance remains unaffected.

      Starting with Jira 11.2.0, the existing logged message identifying the detected stale nodes will also include a link to this article to facilitate the discovery of the issue and potential solutions.

      Based on this, we will close the bug. If the problem persists on your instance, please comment on this ticket or contact Atlassian Support so we can better understand the real impact and investigate other potential root causes beyond stale nodes.

      Best regards

      Jacek Foremski

      Principal Software Engineer, Jira DC


      Summary

      Asynchronous cache replication can cause extra overhead with a large number of cache updates and many stale nodes.

      Environment

      • Jira DC
      • A large number of stale nodes (see JRASERVER-42916)
      • Plugin (code) generating a large number of cache update events, e.g. reaching 2000 messages/min (a sketch of such code follows this list)
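
      For illustration, a minimal sketch of plugin code that would generate this volume of cache update events, assuming the standard Atlassian Cache API (com.atlassian.cache); the cache name and request handler are hypothetical:

        import com.atlassian.cache.Cache;
        import com.atlassian.cache.CacheManager;

        // Hypothetical illustration: every put() raises a cache update event
        // that Jira DC replicates asynchronously via a per-node local queue.
        public class NoisyCacheClient {
            private final Cache<String, String> cache;

            public NoisyCacheClient(CacheManager cacheManager) {
                // "my.plugin.noisy-cache" is an illustrative cache name.
                this.cache = cacheManager.getCache("my.plugin.noisy-cache");
            }

            public void handleRequest(String key) {
                // A put on every request easily reaches ~2000 replication
                // messages/min under moderate load.
                cache.put(key, String.valueOf(System.currentTimeMillis()));
            }
        }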

      Steps to Reproduce

      1. Open a URL that produces cache update events while computing its business logic
        • e.g. /rest/servicedesk/1/<PRJ>/webfragments/sections/sd-queues-nav,servicedesk.agent.queues,servicedesk.agent.queues.ungrouped
      2. Measure the response time and the number of replication events (a minimal timing harness is sketched after this list)
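
      A minimal timing harness for step 2, assuming Java 11+; the base URL, project key, and credentials are illustrative. It measures wall-clock response time only, so replication events still need to be counted on the server side.

        import java.net.URI;
        import java.net.http.HttpClient;
        import java.net.http.HttpRequest;
        import java.net.http.HttpResponse;

        public class ResponseTimeProbe {
            public static void main(String[] args) throws Exception {
                // Illustrative URL; substitute your base URL and project key.
                String url = "https://jira.example.com/rest/servicedesk/1/PRJ/webfragments/sections/"
                        + "sd-queues-nav,servicedesk.agent.queues,servicedesk.agent.queues.ungrouped";
                HttpClient client = HttpClient.newHttpClient();
                HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                        .header("Authorization", "Basic ...") // illustrative credentials
                        .GET()
                        .build();
                for (int i = 0; i < 10; i++) {
                    long start = System.nanoTime();
                    HttpResponse<String> response =
                            client.send(request, HttpResponse.BodyHandlers.ofString());
                    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                    System.out.printf("status=%d time=%dms%n", response.statusCode(), elapsedMs);
                }
            }
        }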

      Expected Results

      Performance does not degrade as the number of stale nodes grows.

      Actual Results

      Performance degrades as the number of stale nodes grows.

      • Thread dumps show many threads busy in the following stack:
          java.lang.Thread.State: RUNNABLE
        	at java.io.RandomAccessFile.writeBytes(Native Method)
        	at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
        	at com.squareup.tape.QueueFile.writeHeader(QueueFile.java:184)
        	at com.squareup.tape.QueueFile.add(QueueFile.java:321)
        	- locked <0x00000003ce9b43e0> (a com.squareup.tape.QueueFile)
        	at com.squareup.tape.FileObjectQueue.add(FileObjectQueue.java:46)
        	at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpQueue.add(TapeLocalQCacheOpQueue.java:151)
        	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpQueueWithStats.add(LocalQCacheOpQueueWithStats.java:115)
        	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheManager.addToQueue(LocalQCacheManager.java:370)
        	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheManager.addToAllQueues(LocalQCacheManager.java:354)
        	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheReplicator.replicateToQueue(LocalQCacheReplicator.java:85)
        	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheReplicator.replicatePutNotification(LocalQCacheReplicator.java:65)
        	at com.atlassian.jira.cluster.cache.ehcache.AbstractJiraCacheReplicator.notifyElementUpdated(AbstractJiraCacheReplicator.java:123)
        	at net.sf.ehcache.event.RegisteredEventListeners.internalNotifyElementUpdated(RegisteredEventListeners.java:228)
        	at net.sf.ehcache.event.RegisteredEventListeners.notifyElementUpdated(RegisteredEventListeners.java:206)
        ...
        
      • In one customer case, 15-20% of all threads were busy in replicateToQueue (a simplified model of the fan-out follows)
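
      The stack above explains the scaling: addToAllQueues appends every cache operation to one durable, file-backed queue per known cluster node, including stale nodes that never drain theirs, so each cache update pays a synchronous disk write per node. A simplified model of that fan-out (not the actual Jira code):

        import java.util.List;
        import java.util.Queue;

        // Simplified model: one durable queue per peer node means every cache
        // update costs O(number of nodes) synchronous writes, and queues for
        // stale nodes only ever grow.
        class CacheOpFanOut {
            private final List<Queue<byte[]>> perNodeQueues; // one per node, stale or not

            CacheOpFanOut(List<Queue<byte[]>> perNodeQueues) {
                this.perNodeQueues = perNodeQueues;
            }

            void addToAllQueues(byte[] serializedOp) {
                for (Queue<byte[]> queue : perNodeQueues) {
                    // In Jira this is a file-backed tape queue whose add()
                    // writes the entry and rewrites the header on disk
                    // while holding a lock.
                    queue.add(serializedOp);
                }
            }
        }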

      Notes

      None

      Workaround

      Clean up stale node data manually; see JRASERVER-42916. A sketch for finding stale nodes follows.
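
      To identify cleanup candidates, a sketch that lists nodes whose last heartbeat is older than two days, assuming JDBC access to the Jira database and the standard clusternodeheartbeat table (heartbeat_time assumed stored as epoch milliseconds):

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;

        public class StaleNodeReport {
            public static void main(String[] args) throws Exception {
                // Illustrative JDBC URL and credentials; point at the Jira database.
                try (Connection conn = DriverManager.getConnection(
                        "jdbc:postgresql://db.example.com/jiradb", "jira", "secret");
                     PreparedStatement ps = conn.prepareStatement(
                             "SELECT node_id, heartbeat_time FROM clusternodeheartbeat"
                                     + " WHERE heartbeat_time < ?")) {
                    long twoDaysAgo = System.currentTimeMillis() - 2L * 24 * 60 * 60 * 1000;
                    ps.setLong(1, twoDaysAgo);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            System.out.printf("stale node: %s (last heartbeat %d)%n",
                                    rs.getString("node_id"), rs.getLong("heartbeat_time"));
                        }
                    }
                }
            }
        }

      Removing the stale rows themselves is covered in JRASERVER-42916 and the Knowledge Base article linked above.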

              Assignee: Jacek Foremski
              Reporter: Andriy Yakovlev [Atlassian]
              Votes: 6
              Watchers: 22