
Type: Bug
Resolution: Fixed
Priority: Highest
Fix Version/s: 7.3.4, 7.2.9
Affects Version/s: 7.0.0
Component/s: System Administration - General Configuration
Labels:

Introduced in Version:
7
Support reference count:
6
Symptom Severity:
Severity 2 - Major
UIS:
34

Summary

There's a regression in JIRA in how EagerOfBizUserCache is created on each new node and replicated to other nodes.

Currently the cache is not warmed up, so the first operation that deals with users (such as NodeReindexServiceThread reindexing issues) starts populating EagerOfBizUserCache. While it does so, a cluster lock is taken, on top of any locks the operation already holds (in the case of reindexing issues, DefaultIndexManager#indexLocks). Population can take a while to complete, so other threads may time out when acquiring a lock (e.g. ClusterMessageHandlerServiceThread at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup).
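
The contention described above can be illustrated with a minimal, self-contained sketch. It is not JIRA code: the class and method names are hypothetical, a plain ReentrantLock stands in for the index lock (the thread dump below shows a ReentrantReadWriteLock), and Thread.sleep stands in for lazy cache population and replication.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names exist in JIRA.
public class LazyCacheLockSketch {

    /**
     * One thread takes the lock and then spends populateMillis "populating a cache"
     * while still holding it. The calling thread waits up to waitMillis for the lock.
     * Returns true if the caller obtained the lock before its wait time expired.
     */
    public static boolean waiterGetsLock(long populateMillis, long waitMillis) {
        ReentrantLock indexLock = new ReentrantLock();
        CountDownLatch holderHasLock = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            indexLock.lock();
            try {
                holderHasLock.countDown();
                // Stand-in for lazy cache population triggered by the first user lookup.
                Thread.sleep(populateMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                indexLock.unlock();
            }
        });
        holder.start();

        try {
            // Make sure the holder really owns the lock before the waiter tries.
            holderHasLock.await();
            boolean acquired = indexLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                indexLock.unlock();
            }
            holder.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

In this sketch, waiterGetsLock(500, 50) models slow population with a short lock wait time, so the waiter gives up; waiterGetsLock(50, 2000) models the case where the wait time exceeds the population time, so the waiter succeeds.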

The previous behaviour was that the cache was populated before JIRA was fully started.

Environment

JIRA 7+ Datacenter
>=3 nodes

Steps to Reproduce

Have a large number of users (>= 30000)
Have issues to reindex
Start a new node

Expected Results

EagerOfBizUserCache is populated during JIRA startup and is ready before JIRA is fully started.
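
As a rough sketch of what "populated during startup" means, the hypothetical class below loads all entries in an explicit warm-up step and only then reports itself ready, so no lookup ever triggers population. The class, its methods and the loader are assumptions made for illustration; they do not reflect the actual EagerOfBizUserCache implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; not the real EagerOfBizUserCache.
public class EagerUserCacheSketch {
    private final Map<String, String> displayNameByUsername = new ConcurrentHashMap<>();
    private final AtomicInteger loadCount = new AtomicInteger();
    private final Supplier<Map<String, String>> loader;
    private volatile boolean ready;

    public EagerUserCacheSketch(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    /** Called once during startup, before the application reports itself as started. */
    public void warmUp() {
        displayNameByUsername.putAll(loader.get());
        loadCount.incrementAndGet();
        ready = true;
    }

    /** Lookups never populate the cache; they fail fast if warm-up has not run. */
    public String findDisplayName(String username) {
        if (!ready) {
            throw new IllegalStateException("cache has not been warmed up");
        }
        return displayNameByUsername.get(username);
    }

    public boolean isReady() {
        return ready;
    }

    public int loadCount() {
        return loadCount.get();
    }
}
```

With this shape, the expensive load happens exactly once at a known point in startup, rather than inside whichever thread happens to look up a user first.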

Actual Results

EagerOfBizUserCache is populated after JIRA startup, which can block or slow down other operations and, for example, lead to index inconsistencies.

Notes

Example of the problem related to index lock timeout

ClusterMessageHandlerServiceThread is waiting for the DefaultIndexManager lock:

"ClusterMessageHandlerServiceThread:thread-1" #85 prio=5 tid=0x00007fca257e9000 nid=0x5d32 waiting on condition [0x00007fc9a6df7000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000005c495cc08> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
...
	at com.atlassian.jira.issue.index.DefaultIndexManager.obtain(DefaultIndexManager.java:785)
	at com.atlassian.jira.issue.index.DefaultIndexManager.access$600(DefaultIndexManager.java:88)
	at com.atlassian.jira.issue.index.DefaultIndexManager$IndexLock.tryLock(DefaultIndexManager.java:1118)
	at com.atlassian.jira.issue.index.DefaultIndexManager.withReindexLock(DefaultIndexManager.java:354)
...
	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:118)
...

Meanwhile, NodeReindexServiceThread:thread-1 is running com.atlassian.jira.issue.index.DefaultIssueIndexer.perform, which triggers population of the user cache and its synchronisation with other nodes, all while holding the DefaultIndexManager lock:

"NodeReindexServiceThread:thread-1" #84 prio=5 tid=0x00007fca25a4c000 nid=0x5d31 runnable [0x00007fc9a6ef6000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
...
	at com.atlassian.jira.cluster.distribution.JiraCacheManagerPeerProvider.listRemoteCachePeers(JiraCacheManagerPeerProvider.java:79)
	at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.listRemoteCachePeers(RMISynchronousCacheReplicator.java:335)
	at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.replicatePutNotification(RMISynchronousCacheReplicator.java:145)
	at com.atlassian.cache.ehcache.replication.rmi.RMISynchronousCacheReplicator.replicateViaCopy(RMISynchronousCacheReplicator.java:60)
	at com.atlassian.cache.ehcache.replication.rmi.RMISynchronousCacheReplicator.notifyElementPut(RMISynchronousCacheReplicator.java:48)
	at net.sf.ehcache.event.RegisteredEventListeners.internalNotifyElementPut(RegisteredEventListeners.java:192)
	at net.sf.ehcache.event.RegisteredEventListeners.notifyElementPut(RegisteredEventListeners.java:170)
...
	at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheIfRequired(UserOrGroupCache.java:118)
...
	at com.atlassian.crowd.directory.AbstractInternalDirectory.findUserByName(AbstractInternalDirectory.java:173)
	at com.atlassian.crowd.directory.AbstractInternalDirectory.findUserByName(AbstractInternalDirectory.java:64)
...
	at com.atlassian.jira.user.util.DefaultUserManager.getUserByName(DefaultUserManager.java:258)
...
	at com.atlassian.jira.issue.index.DefaultIssueIndexer.perform(DefaultIssueIndexer.java:282)
	at com.atlassian.jira.issue.index.DefaultIssueIndexer.reindexIssues(DefaultIssueIndexer.java:162)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:541)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssueObjects(DefaultIndexManager.java:438)
...
	at com.atlassian.jira.index.ha.DefaultNodeReindexService.updateIssueIndex(DefaultNodeReindexService.java:404)
	at com.atlassian.jira.index.ha.DefaultNodeReindexService.updateAffectedIndexes(DefaultNodeReindexService.java:298)
	at com.atlassian.jira.index.ha.DefaultNodeReindexService.reIndex(DefaultNodeReindexService.java:252)
...

Workaround

When the actual problem is an index lock timeout, the timeout can be increased via the application property jira.index.lock.waittime.

is caused by

JRASERVER-64230 In Data Center users and groups caches are not preloaded on startup

Closed

is related to

JRASERVER-63515 JIRA Datacenter node with large number of user have long start-up time due to slow cache population

Closed

was cloned as: RUN-1409