- Bug
- Resolution: Fixed
- Medium
- 7.1.4, 7.1.9, 7.2.8, 7.0.0
- 7
- 52
- Severity 2 - Major
- 604
Summary
JIRA performance degrades significantly during full and incremental directory sync. CPU usage spikes to 100% and page loads are delayed by between 0.5 and 5 minutes.
Environment
- JIRA 7.0 or later
- JIRA configured with AD LDAP directory (type Connected)
Steps to Reproduce
1) Configure Microsoft AD in JIRA 7.1.4 and make sure AD has enough users, e.g. 10k users.
2) Import a few users.
3) Log in to JIRA as a local user and stay on the directory page.
4) Configure JMeter "http" sessions using one of the AD users; each session should repeat continuously for 30 minutes.
5) Initiate 20 concurrent sessions from JMeter, e.g. browsing boards, issues, etc.
6) 10 seconds after the JMeter sessions start, initiate an LDAP sync.
7) Log in to JIRA in another browser and keep navigating pages. After 3-8 minutes you will notice slowness, and the LDAP sync runs forever.
Expected Results
Directory sync should complete without degrading performance.
Actual Result
JIRA performance degrades significantly while a full or incremental sync is running.
Verification
To verify whether the instance is affected by this bug, collect thread dumps during slow performance as per Generate a Thread Dump. On an affected instance, the Caesium thread will show the stack below across several thread dumps:
"Caesium-1-3" #189 daemon prio=5 tid=0x00007f2eae212000 nid=0x3540 waiting on condition [0x00007f2f1cff9000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000543279b50> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at com.atlassian.cache.memory.DelegatingCache.removeAll(DelegatingCache.java:256)
    at com.atlassian.jira.application.DefaultApplicationRoleManager.clearUserCounts(DefaultApplicationRoleManager.java:632)
    at com.atlassian.jira.application.DefaultApplicationRoleManager.onUserDeleted(DefaultApplicationRoleManager.java:529)
    at sun.reflect.GeneratedMethodAccessor2429.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:36)
    at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:48)
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:299)
    at com.atlassian.event.internal.AsynchronousAbleEventDispatcher.dispatch(AsynchronousAbleEventDispatcher.java:107)
    at com.atlassian.event.internal.EventPublisherImpl.invokeListeners(EventPublisherImpl.java:160)
    at com.atlassian.event.internal.EventPublisherImpl.publish(EventPublisherImpl.java:79)
    at com.atlassian.crowd.directory.DbCachingRemoteChangeOperations.publishEvent(DbCachingRemoteChangeOperations.java:1062)
    at com.atlassian.crowd.directory.DbCachingRemoteChangeOperations.deleteCachedUsersByName(DbCachingRemoteChangeOperations.java:318)
    at com.atlassian.crowd.directory.DbCachingRemoteChangeOperations.deleteCachedUsersByGuid(DbCachingRemoteChangeOperations.java:285)
    at com.atlassian.crowd.directory.DirectoryCacheImplUsingChangeOperations.deleteCachedUsersByGuid(DirectoryCacheImplUsingChangeOperations.java:72)
    at com.atlassian.crowd.directory.ldap.cache.UsnChangedCacheRefresher.synchroniseUserChangesGuid(UsnChangedCacheRefresher.java:356)
    at com.atlassian.crowd.directory.ldap.cache.UsnChangedCacheRefresher.synchroniseUserChanges(UsnChangedCacheRefresher.java:381)
    at com.atlassian.crowd.directory.ldap.cache.UsnChangedCacheRefresher.synchroniseChanges(UsnChangedCacheRefresher.java:124)
    at com.atlassian.crowd.directory.DbCachingRemoteDirectory.synchroniseCache(DbCachingRemoteDirectory.java:1097)
    at com.atlassian.crowd.manager.directory.DirectorySynchroniserImpl.synchronise(DirectorySynchroniserImpl.java:76)
    at com.atlassian.jira.crowd.embedded.JiraDirectorySynchroniser.synchronizeDirectory(JiraDirectorySynchroniser.java:77)
    at com.atlassian.jira.crowd.embedded.JiraDirectorySynchroniser.runJob(JiraDirectorySynchroniser.java:52)
    at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:153)
    at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:118)
    at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:97)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:401)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJob(CaesiumSchedulerService.java:396)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:349)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$1.consume(CaesiumSchedulerService.java:255)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$1.consume(CaesiumSchedulerService.java:252)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:65)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:59)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:34)
    at java.lang.Thread.run(Thread.java:745)
In addition, a number of Tomcat worker threads will be waiting on futures inside that cache (as per the attached screenshot). Essentially, the sync invalidates a cache that a large number of actions rely on, and once it is invalidated those threads block until it is repopulated.
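The thread dump check described in this section can be scripted. The following is a minimal Python sketch, not an Atlassian tool. It assumes the thread dumps are plain-text jstack-style output in which each thread entry begins with the quoted thread name at the start of a line. The function name `find_affected_threads` and the choice of these two frames as the signature are illustrative assumptions; the frame names themselves are taken from the stack trace in this report.

```python
import re

# Frames copied from the stack trace in this report. Treating these two
# frames as the signature of the bug is an assumption of this sketch.
SIGNATURE_FRAMES = (
    "com.atlassian.cache.memory.DelegatingCache.removeAll",
    "com.atlassian.jira.application.DefaultApplicationRoleManager.clearUserCounts",
)

# Assumption: a thread entry starts with the quoted thread name at line start.
THREAD_HEADER = re.compile(r'^"([^"]+)"', re.MULTILINE)


def find_affected_threads(dump_text):
    """Return names of Caesium threads whose stack has every signature frame."""
    headers = list(THREAD_HEADER.finditer(dump_text))
    affected = []
    for i, header in enumerate(headers):
        # A thread entry runs from its header to the next header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump_text)
        entry = dump_text[header.start():end]
        name = header.group(1)
        if name.startswith("Caesium-") and all(f in entry for f in SIGNATURE_FRAMES):
            affected.append(name)
    return affected
```

Running this over each dump collected during the slow period and checking that a Caesium thread is reported in several consecutive dumps mirrors the manual check. A single hit on its own is weaker evidence, since the thread may simply have been passing through that code.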
Workaround
(If using Active Directory)
- Set the JIRA system property:
-Dcrowd.use.legacy.ad.incremental.sync=true
With this workaround, users who no longer exist in AD but own content in JIRA are not deleted from the cache during an INCREMENTAL sync, so this issue is not triggered. A FULL sync is still affected, however.
Note on partial fix:
We significantly reduced the performance problem by resolving issue JRA-62742, which was about skipping active user counting in cases where it was not required.
After gathering feedback, we decided to reopen this issue to look further into how we can improve user synchronisation performance.
- is caused by
  - JRASERVER-62742 Active user count recalculates for every JIRA's inactive user during user synchronisation (Closed)
- is related to
  - JRASERVER-40043 Improve UI visibility when synchronizing is running in another node (Closed)
- relates to
  - JRASERVER-64384 ActiveUsersCountForLicense cache is populated inefficiently leading to performance degradation across the application (Closed)
  - RAID-289
- causes
  - PSR-8
- was cloned as
  - RUN-634