-
Bug
-
Resolution: Fixed
-
Low
-
6.4.14, 7.2.8
-
6.04
-
8
-
Severity 2 - Major
-
10
-
Summary
In JIRA Datacenter cluster with large number of user each time you restart a node it will take a long time to start-up to slow user cache population.
Environment
- JIRA DC
- Large number of users: 40k+
- Start time increases with large number of nodes
Steps to Reproduce
- Restart node in JDC
- Check start-up time
Expected Results
Node starts within reasonable time ~4 minutes
Actual Results
Node starts within 10+ minutes
You can see that from logs:
2016-12-09 21:59:05,441 localhost-startStop-1 INFO [atlassian.jira.startup.JiraStartupLogger] JIRA starting... 2016-12-09 22:01:21,773 localhost-startStop-1 INFO [jira.upgrade.util.BuildNumberDao] Setting current version to 6.4.14 ... 2016-12-09 22:05:37,982 localhost-startStop-1 INFO [atlassian.jira.scheduler.JiraSchedulerLauncher] Starting the JIRA Scheduler.... 2016-12-09 22:05:38,040 localhost-startStop-1 INFO [atlassian.jira.scheduler.JiraSchedulerLauncher] JIRA Scheduler started. 2016-12-09 22:10:22,747 localhost-startStop-1 INFO [atlassian.jira.startup.LauncherContextListener] Memory Usage: Dec 09, 2016 10:10:22 PM org.apache.coyote.AbstractProtocol start INFO: Starting ProtocolHandler ["http-bio-8080"]
Notes
Thread dump taken during start-up shows following.
- Thread: "localhost-startStop-1" is blocked waiting for user cache population:
"localhost-startStop-1" #17 daemon prio=5 os_prio=0 tid=0x00007f208c002000 nid=0x675d waiting on condition [0x00007f20d84db000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000069a63db08> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) at net.sf.ehcache.concurrent.ReadWriteLockSync.lock(ReadWriteLockSync.java:50) at net.sf.ehcache.constructs.blocking.BlockingCache.acquiredLockForKey(BlockingCache.java:196) at net.sf.ehcache.constructs.blocking.BlockingCache.get(BlockingCache.java:158) at com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:75) at com.atlassian.cache.ehcache.DelegatingCache.get(DelegatingCache.java:96) at com.atlassian.jira.user.util.UserUtilImpl.getActiveUserCount(UserUtilImpl.java:721) ... at com.atlassian.upm.core.impl.JiraApplicationDescriptor.getActiveUserCount(JiraApplicationDescriptor.java:28) at com.atlassian.upm.license.internal.HostApplicationLicenses.getEditionAndRoleCountForEmbeddedLicense(HostApplicationLicenses.java:121) at com.atlassian.upm.license.internal.ProductLicenses.getEditionAndRoleCountForEmbeddedLicense(ProductLicenses.java:454) ... at com.atlassian.upm.mac.HostLicenseUpdatedHandlerImpl.getEmbeddedLicenses(HostLicenseUpdatedHandlerImpl.java:415) at com.atlassian.upm.mac.HostLicenseUpdatedHandlerImpl.getChangedEmbeddedPluginLicenses(HostLicenseUpdatedHandlerImpl.java:362) at com.atlassian.upm.mac.HostLicenseUpdatedHandlerImpl.handleUpdatedHostLicense(HostLicenseUpdatedHandlerImpl.java:131) at com.atlassian.upm.mac.HostLicenseUpdatedEventListener.handleUpdatedHostLicense(HostLicenseUpdatedEventListener.java:108) at com.atlassian.upm.mac.HostLicenseUpdatedEventListener.onStart(HostLicenseUpdatedEventListener.java:67) at com.atlassian.upm.UpmLifecycleManager.ensureStarted(UpmLifecycleManager.java:44) at com.atlassian.upm.UpmLifecycleManager.onStart(UpmLifecycleManager.java:35) ... at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyLifecycleAwareOfStart(DefaultLifecycleManager.java:100) at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.notifyOnStart(DefaultLifecycleManager.java:91) at com.atlassian.sal.jira.lifecycle.JiraLifecycleManager.notifyOnStart(JiraLifecycleManager.java:68) at com.atlassian.sal.core.lifecycle.DefaultLifecycleManager.start(DefaultLifecycleManager.java:50) - locked <0x0000000698a3edc8> (a com.atlassian.sal.jira.lifecycle.JiraLifecycleManager) at com.atlassian.sal.jira.lifecycle.JiraLifecycleManager.onJiraStart(JiraLifecycleManager.java:62) ... at com.atlassian.jira.config.database.DatabaseConfigurationManagerImpl.doNowOrEnqueue(DatabaseConfigurationManagerImpl.java:345) at com.atlassian.jira.config.database.DatabaseConfigurationManagerImpl.doNowOrWhenDatabaseActivated(DatabaseConfigurationManagerImpl.java:215) at com.atlassian.jira.startup.DefaultJiraLauncher.postDbLaunch(DefaultJiraLauncher.java:124) ... at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4992) ... at java.lang.Thread.run(Thread.java:745)
Note that thread is waiting for com.atlassian.upm.core.impl.JiraApplicationDescriptor.getActiveUserCount, it could also wait for com.atlassian.jira.user.util.UserUtilImpl.getTotalUserCount
- Other thread populates the cache and cache is synchronised with other nodes:
"UpmScheduler:thread-1" #141 prio=5 os_prio=0 tid=0x00007f20920fe800 nid=0x6981 runnable [0x00007f145c2e1000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) ... at net.sf.ehcache.distribution.RMICacheManagerPeerProvider.lookupRemoteCachePeer(RMICacheManagerPeerProvider.java:127) at com.atlassian.jira.cluster.distribution.JiraCacheManagerPeerProvider.listRemoteCachePeers(JiraCacheManagerPeerProvider.java:93) ... at net.sf.ehcache.event.RegisteredEventListeners.internalNotifyElementPut(RegisteredEventListeners.java:180) at net.sf.ehcache.event.RegisteredEventListeners.notifyElementPut(RegisteredEventListeners.java:158) ... at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheIfRequiredUnderLock(UserOrGroupCache.java:171) at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.buildCacheIfRequired(UserOrGroupCache.java:141) ... at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.getCache(UserOrGroupCache.java:58) at com.atlassian.jira.crowd.embedded.ofbiz.UserOrGroupCache.getCaseInsensitive(UserOrGroupCache.java:78) ... at com.atlassian.jira.user.util.UserUtilImpl.getGroupMembers(UserUtilImpl.java:1173) ... at com.atlassian.cache.ehcache.LoadingCache.loadValueAndReleaseLock(LoadingCache.java:100) at com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:76) at com.atlassian.cache.ehcache.DelegatingCache.get(DelegatingCache.java:96) at com.atlassian.jira.user.util.UserUtilImpl.getActiveUserCount(UserUtilImpl.java:721) ... at com.atlassian.jira.DefaultHostContextAccessor.doInTransaction(DefaultHostContextAccessor.java:34) ... at com.atlassian.sal.core.transaction.HostContextTransactionTemplate.execute(HostContextTransactionTemplate.java:21) ... at com.atlassian.upm.schedule.UpmSchedulerImpl$1.run(UpmSchedulerImpl.java:165) ... at java.lang.Thread.run(Thread.java:745)
Workaround
None
- is caused by
-
JRASERVER-64230 In Data Center users and groups caches are not preloaded on startup
- Closed
- is related to
-
JRASERVER-63556 Implement semi-async cache replication for EhCache in DC
- Closed
- relates to
-
JRASERVER-63587 JIRA Datacenter excessive EagerOfBizUserCache user cache population and replication on node start-up
- Closed
-
RUM-1409 Loading...