- Bug
- Resolution: Fixed
- Low
- 7.5.1, 7.2.11
- None
- 7.02
- Severity 2 - Major
Summary
Ehcache replicates invalidations synchronously over RMI during a cache reset. This replication happens inside the atlassian-caches lock, which does not allow any loads to proceed while the reset is being processed. As a result, gets are blocked by potentially slow RMI replication.
Environment
- JIRA Data Center with multiple nodes
Node A is unresponsive because of extremely high load, high memory pressure, or any other condition that makes it unresponsive. In this state the node is not technically down: it is still registered as an 'Active' member of the cluster, but it is not processing requests.
Node B still considers Node A 'Active', so it keeps sending cache synchronisation requests to Node A. Node A does not respond to them, which leaves Node B stalled as well.
Expected behaviour
Cache replication happens outside of the get/reset lock and does not block cache loading.
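
The expected behaviour can be illustrated with a minimal Java sketch. It is not the atlassian-cache or Ehcache code, and it is not necessarily how the issue was fixed. The class `ReplicatingCache`, its `loader`, and the `replicateInvalidation` callback (a stand-in for the RMI call to peer nodes) are hypothetical names introduced only for this example. The sketch also assumes that handing replication to a background thread is acceptable, which is one possible way to keep it outside the lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical illustration only; not the atlassian-cache or Ehcache API.
final class ReplicatingCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Function<K, V> loader;
    // Stand-in for the (potentially slow) RMI invalidation sent to peer nodes.
    private final Runnable replicateInvalidation;
    // Daemon thread so a stuck peer cannot keep the JVM alive.
    private final ExecutorService replicator = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "cache-replicator");
        t.setDaemon(true);
        return t;
    });

    ReplicatingCache(Function<K, V> loader, Runnable replicateInvalidation) {
        this.loader = loader;
        this.replicateInvalidation = replicateInvalidation;
    }

    // Loads take the read lock, so they only wait for a local reset in progress.
    V get(K key) {
        lock.readLock().lock();
        try {
            return store.computeIfAbsent(key, loader);
        } finally {
            lock.readLock().unlock();
        }
    }

    void reset() {
        // Only the local clear is done under the write lock.
        lock.writeLock().lock();
        try {
            store.clear();
        } finally {
            lock.writeLock().unlock();
        }
        // Replication is submitted after the lock is released, so an
        // unresponsive peer delays the replicator thread, not local gets.
        replicator.execute(replicateInvalidation);
    }
}
```

In the behaviour reported above, the equivalent of `replicateInvalidation.run()` would sit inside the write-locked section, so every `get` on Node B would wait for Node A to answer. With the call moved after `unlock()`, a `get` that follows `reset()` returns as soon as the local clear has finished, even if the peer never responds.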