Jira Data Center / JRASERVER-42916

Stale node ids should automatically be removed in Jira Data Center

      Atlassian Update – 16 June 2020

      Hi everyone,

      Thank you for your votes and comments on this issue. We would like to inform you that this suggestion will be addressed in the upcoming Jira Data Center version 8.10.0 release.

      We’ve decided to provide a more automated way of handling stale (No heartbeat) nodes in Jira Data Center. Before the changes, if a node lost connection to the cluster for 5 minutes, its state changed from “Active” to “No heartbeat”. If such a node was not moved to the “Offline” state, it might cause performance degradation.

      We’ve automated this process and the solution is as follows:

      • If a node is in the “No heartbeat” state for longer than 2 days, it will be automatically moved to the “Offline” state. Admins will be informed about this via a warning in the atlassian-jira.log file and will see this state on the Clustering page. During this period you will be able to check the node or restart it.
      • If a node is in the “Offline” state for longer than 2 days, it will be automatically removed from the cluster. You will also be informed about this action through info-level logs in your atlassian-jira.log file.

      Additionally, based on the feedback we received in the comments below, in Jira Data Center version 8.11.0 we will add the ability to adjust the 2-day stale node retention period. You can find more details about this suggestion under this thread.

      Moreover, since Jira Data Center 8.6 we have been bringing more visibility into the nodes in your cluster by introducing the Clustering page in the admin panel. In the newly released Jira Data Center version 8.9 we extended this page with additional information about node statuses (Active, No heartbeat, Offline) and the Jira DC application status (maintenance, error, running, starting), so that stale nodes can be identified more easily.

      Lastly, the changes described above are integrated with the Advanced audit log functionality available in Jira Data Center since version 8.8. Any automatic actions will be logged to give admins more visibility into what is happening on their instance. For more details please go here.

      Thank you for voting and commenting on this suggestion,
      Grażyna Kaszkur
      Product manager, Jira Server and Data Center

    • We collect Jira feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

      NOTE: This suggestion is for JIRA Server. Using JIRA Cloud? See the corresponding suggestion.

      Problem Definition

      After changing the node id in the cluster.properties file, both the old and new ids will appear in the Cluster Nodes section of System information. The problem is worse on AWS, since it creates many new nodes and never reuses them.

      Suggested Solution

      We should find a way to clear out any old ids, without removing any entries that might be from a temporarily offline node.

      Note

      Having old nodes in the system (table) may cause other problems; see the related issues:

      Workaround

      • In recent versions of Jira we introduced a new REST API to manage the cluster state, which mitigates the problem. See JRASERVER-69033 and the sketch after the manual steps below.
      • Clean up old data manually:
      1. Check tables and find all rows related to old nodes:
        select * from clusternode;
        select * from clusternodeheartbeat;
        
      2. Delete the related records:
        delete from clusternode where node_id = '<node_id>';
        delete from clusternodeheartbeat where node_id = '<node_id>';
        
      3. Clean old replication records:
        -- check if cleanup is necessary
        select count(id) from replicatedindexoperation where node_id = '<node_id>';
        -- delete
        delete from replicatedindexoperation where node_id = '<node_id>';
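
      Alternatively, on Jira 8.1 or later the cluster REST API mentioned above can replace the manual SQL. A minimal sketch, assuming admin credentials in hypothetical JIRA_USER/JIRA_PASS environment variables and the base URL in JIRA_BASE_URL; the endpoints are the ones referenced in JRASERVER-69033, and a similar script is shared in the comments below:

        #!/bin/bash
        # Minimal sketch: remove stale cluster nodes via the REST API instead of direct SQL.
        # Assumes JIRA_BASE_URL, JIRA_USER and JIRA_PASS are set and the user has admin rights.
        set -euo pipefail

        # List nodes that are no longer alive or are already OFFLINE.
        stale_nodes=$(curl -s -u "${JIRA_USER}:${JIRA_PASS}" \
          "${JIRA_BASE_URL}/rest/api/2/cluster/nodes" \
          | jq -r '.[] | select((.alive == false) or (.state == "OFFLINE")) | .nodeId')

        # Remove each stale node from the cluster state.
        for node_id in ${stale_nodes}; do
          echo "Removing stale node: ${node_id}"
          curl -s -u "${JIRA_USER}:${JIRA_PASS}" -X DELETE \
            "${JIRA_BASE_URL}/rest/api/2/cluster/node/${node_id}"
        done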
        

            [JRASERVER-42916] Stale node ids should automatically be removed in Jira Data Center

            Yevgen Lasman added a comment -

            It would be great if you can also remove inactive nodes from the support ZIP generation page.


            Grazyna Kaszkur added a comment -

            Hi ebukoski1, thank you very much for your feedback.

            Regarding the automatic clean-up after 2 days, we plan to add the ability to adjust this value in the system properties. As for an API request that would allow you to clean up stale nodes, you can use the existing methods described in the workaround section above and in this separate suggestion: https://jira.atlassian.com/browse/JRASERVER-69033.

            Please let us know if you have any feedback regarding those APIs (released in 8.1.0).

            Ed Bukoski added a comment -

            I just read the May 22, 2020 update – I like the automatic cleanup aspect of this but two days is a long time when we could tell Jira immediately in our deployment scripts "Hey stop trying to talk to this node, it is gone we terminated it as part of this deployment!" 

            I'm also not excited about 2 days of warning/error messages in the logs and health screens as the active Jira nodes keep trying to communicate to dead ones.  

            So we would still like an API that we can use to manage this, is that being planned as well?


            Jackie Chen added a comment -

            We use an AWS Auto Scaling group for the Jira cluster, and I am thinking of using lifecycle hooks to mitigate this issue:

            For Launching

            Send a cloud event to a Lambda function to remove the status=offline or alive=false nodes from the database. This would be very helpful when restoring the DB from prod to non-prod.

            For Terminating

            Send a cloud event to a Lambda function to stop the Jira service first, then remove the node from the database. This would do its best to gracefully shut down the Jira service, then clear this offline node from the DB.
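
            A minimal sketch of the termination-side cleanup described above, assuming the cluster REST API from JRASERVER-69033 and hypothetical locations/credentials in JIRA_HOME, JIRA_INSTALL, JIRA_BASE_URL, JIRA_USER and JIRA_PASS:

              #!/bin/bash
              # Sketch of a terminate-lifecycle cleanup: stop Jira on this node, then
              # remove the node from the cluster state through the load-balanced base URL.
              set -euo pipefail

              # Read this node's id from cluster.properties in the local home directory.
              node_id=$(grep '^jira.node.id' "${JIRA_HOME}/cluster.properties" | cut -d= -f2 | tr -d '[:space:]')

              # Stop Jira gracefully so the node stops heartbeating.
              "${JIRA_INSTALL}/bin/stop-jira.sh"

              # Delete the node from the cluster state; depending on the Jira version this may
              # only succeed once the node is reported as offline / not alive.
              curl -s -u "${JIRA_USER}:${JIRA_PASS}" -X DELETE \
                "${JIRA_BASE_URL}/rest/api/2/cluster/node/${node_id}"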

            Tomas Karas added a comment -

            Exactly, Fabian. Especially when you copy a production system to development: you change the serverID to break appLinks, but then the dev cluster connects to the prod cluster, completely ignoring that:

            • the network is not the same
            • the serverID is different
            • the licenses are different

            But hey, for that the Atlassian KB contains a mention along the lines of: delete rows from a table or two when creating the dev instance...

            And the thing you must love the most is that the attachment folder path is hardcoded in the XML backup in Jira DC, regardless of the path setting.

            Then you start deleting projects on the dev/migration cluster to create a smaller XML backup for a smaller instance, or for confidentiality reasons...

            Fabian Fingerle added a comment - This is also painful during a system copy from production to non-production environments.

            KWRI IT added a comment -

            This one is painful in a Kubernetes deployment of Jira Data Center. We're logging into PostgreSQL and cleaning up the clusternode table a lot.

            Zaid Qureshi added a comment - yyyyt

            Jason Potkanski added a comment -

            In newer Jira versions (8.1+) they added an experimental REST API (JRASERVER-69033). Here is a snippet of a script that will clean most dead nodes away. Enjoy.

            #!/bin/bash
            # List all cluster nodes, select the dead ones (not alive or already OFFLINE),
            # and delete each of them via the experimental cluster REST API.
            nodelist=$(curl -s --user "${USERNAME}:${PASSWORD}" --url "${JIRASITE}/rest/api/2/cluster/nodes" \
              | jq -r '.[] | select((.alive == false) or (.state == "OFFLINE")) | .nodeId')
            for i in ${nodelist}
            do
               curl -s -X DELETE --user "${USERNAME}:${PASSWORD}" --url "${JIRASITE}/rest/api/2/cluster/node/${i}"
            done


            Cecilia W Jägerbrink added a comment - Please update us on this one.

            Yevgen Lasman added a comment - The six months mentioned in "Current Status" have passed; any update on this one?

            Yevgen Lasman added a comment - We've got the application log spammed with cache replication error messages related to inaccessible old nodes. Can't wait to see automatic node cleanup implemented!

            Michael Bulger added a comment -

            The inability of the tool to remove old nodes creates problems for us when trying to restore an XML backup from our prod instance into our non-prod instance. Several of our production nodes remain 'Active' in the clusternode table, and as a result the non-prod nodes attempt to replicate their caches to the production nodes.

            Grazyna Kaszkur added a comment -

            Hi everyone,

            Thank you for your votes and thoughts on this issue. We fully understand that many of you are dependent on this functionality.

            After careful consideration, we've decided to prioritize this suggestion on the Jira Server roadmap. We hope to start development after our current projects are completed.

            Expect to hear an update on our progress within the next 6 months. 

            To learn more about how your suggestions are reviewed, see our updated workflow for server feature suggestions.

            Kind regards,

            Jira Server Product Management


            Tony Iskander added a comment -

            Hi,

            The issue was created more than 3 years ago and since then multiple people have spotted different issues related to Jira's inability to handle nodes going away.
            I don't know if it's just me, but in my opinion it is unacceptable that this is not being addressed, as it breaks the core principle of the DC product: high availability and proper clustering.

            There is one more scenario that I've not seen described here that happened to me. When you do a foreground full re-index of Jira and you lose the node that is doing the indexing, the process is locked, never completes, and can't be cancelled. The only solution that seems to work is to remove the entries in the db, stop/drop all nodes, and start again. It is just crazy.

            I believe this product would benefit from Atlassian not trying to build a clustered FTS solution of their own and, for example, allowing Jira DC to be configured with external software like Elasticsearch. I have initiated a discussion via a different support request and would encourage people who may be interested to jump on it and +1 it:
            https://jira.atlassian.com/browse/JRASERVER-68048

            This would circumvent the issue reported here.

            Moreover, I can already imagine full re-indexing happening by creating a new separate index and then, once completed, just flipping the Elasticsearch alias. In the meantime Jira would continue to use the old index, so it wouldn't lead to a lockdown.

            Elasticsearch is also available from AWS as a managed service, so it seems like a perfect fit for their cloud offering, which is currently broken. The Jira DC offering on the AWS Marketplace is just a mistake in its current form, imho.

            Thanks

            Jesse Rehmer [Contegix] added a comment -

            My index replication falls behind when I have a lot of stale nodes, because the nodes attempt to query nodes that are no longer online and fail. I'm not sure what method is used for determining which node a particular node requests indexes from, but in our case, with a 3-node cluster having 12 abandoned nodes, all three nodes report that the other two are behind. When I look through the logs of all of the nodes there are tons of errors about connecting to abandoned nodes for index replication.

            Jack [AppFox] added a comment - Just to weigh in: I have experienced problems before from the way Zephyr uses these tables to coordinate its own indexing.

            Matt Doar added a comment -

            Any reason for suspecting that the stale node ids cause index corruptions? Seems a bit unlikely given the way that Jira Data Center is constructed.


            Dorota Goffin added a comment -

            We have a 2-node Jira DC instance running in AWS and had to swap out nodes in the past due to performance issues. As a result, our list of historical nodes is quite long. We suspect this has an impact on the Jira and Zephyr indexes, resulting in index corruption happening occasionally. This is unacceptable for a high-availability application and needs to be looked at.

            Mansi Patel added a comment - We are in the same situation as other customers who are deploying in AWS, and as we are in the testing phase, the Jira DC option doesn't look enterprise-ready without a way to fix this issue.

            S Stack added a comment - edited

            Indirectly related: Restoring production data to a test DC instance causes old nodes to appear in the "Cluster Nodes" section of System Information. As a workaround, we're deleting rows for the old node ids from the clusternode table.

            Kevin Terminella added a comment - We are deployed in AWS and are regularly turning over our servers, so this causes a lot of inactive nodes to be displayed in JIRA.

              ddudziak Stasiu
              ayakovlev@atlassian.com Andriy Yakovlev [Atlassian]
              Votes: 168
              Watchers: 159