-
Bug
-
Resolution: Fixed
-
Low
-
7.0.2, 7.2.7
-
7
-
9
-
Severity 2 - Major
-
28
-
Summary
JIRA generates cache flush event for each default group after enabling Application access. These frequent cache flush put a lot of pressure to application as all threads needs to wait for the cache and especially together with JRA-64384 can make application unusable. Basically for each addUserMembershipsForGroup action JIRA flushes the cache.
Steps to Reproduce
- Add many groups and make then default for JIRA Software (same problem for ServiceDesk)
- Create user
- Enable debug logging for com.atlassian.jira.application.DefaultApplicationRoleManager
- Give user Application access
Expected Results
- JIRA adds user to groups.
- None or one cache flush will be generated.
Actual Results
- JIRA will add user to groups.
- Many cache flush events generated, one for each added group (each group edit event generates the cache flush).
- Check logs:
2017-03-15 18:19:41,883 http-nio-8081-exec-11 DEBUG admin 1099x2859x1 1tt78a8 127.0.0.1 /rest/internal/2/viewuser/application/jira-software [c.a.jira.application.DefaultApplicationRoleManager] Clearing the user counts on com.atlassian.crowd.event.group.GroupMembershi pDeletedEvent. 2017-03-15 18:19:41,898 http-nio-8081-exec-11 DEBUG admin 1099x2859x1 1tt78a8 127.0.0.1 /rest/internal/2/viewuser/application/jira-software [c.a.jira.application.DefaultApplicationRoleManager] Clearing the user counts on com.atlassian.crowd.event.group.GroupMembershi pDeletedEvent.
- Thread dump, thread trying to flush the cache: com.atlassian.cache.ehcache.LoadingCache.removeAll
"http-nio-8080-exec-828" #365452 daemon prio=5 os_prio=0 tid=0x00007fda18083000 nid=0x38e0 waiting on condition [0x00007fde368df000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000055b994878> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) at com.atlassian.cache.ehcache.LoadingCache.removeAll(LoadingCache.java:320) at com.atlassian.cache.ehcache.DelegatingCache.removeAll(DelegatingCache.java:148) at com.atlassian.jira.application.DefaultApplicationRoleManager.clearUserCounts(DefaultApplicationRoleManager.java:651) at com.atlassian.jira.application.DefaultApplicationRoleManager.onGroupMembershipCreated(DefaultApplicationRoleManager.java:510) at sun.reflect.GeneratedMethodAccessor8945.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.atlassian.event.internal.SingleParameterMethodListenerInvoker.invoke(SingleParameterMethodListenerInvoker.java:36) at com.atlassian.event.internal.AsynchronousAbleEventDispatcher$1$1.run(AsynchronousAbleEventDispatcher.java:48) at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:299) at com.atlassian.event.internal.AsynchronousAbleEventDispatcher.dispatch(AsynchronousAbleEventDispatcher.java:107) at com.atlassian.event.internal.EventPublisherImpl.invokeListeners(EventPublisherImpl.java:160) at com.atlassian.event.internal.EventPublisherImpl.publish(EventPublisherImpl.java:79) at com.atlassian.crowd.manager.directory.DirectoryManagerGeneric.addUserToGroup(DirectoryManagerGeneric.java:727) at com.atlassian.crowd.manager.application.ApplicationServiceGeneric.addUserToGroup(ApplicationServiceGeneric.java:1297) at com.atlassian.crowd.embedded.core.CrowdServiceImpl.addUserToGroup(CrowdServiceImpl.java:711) at com.atlassian.crowd.embedded.core.DelegatingCrowdService.addUserToGroup(DelegatingCrowdService.java:193) at com.atlassian.crowd.embedded.core.FilteredCrowdServiceImpl.addUserToGroup(FilteredCrowdServiceImpl.java:324) at com.atlassian.jira.user.util.UserUtilImpl.doAddUserToGroup(UserUtilImpl.java:353) at com.atlassian.jira.user.util.UserUtilImpl.addUserToGroups(UserUtilImpl.java:367) at com.atlassian.jira.bc.user.DefaultUserService.addUserToApplication(DefaultUserService.java:608) ... at com.atlassian.jira.rest.util.UpdateUserApplicationHelper.doApplicationUpdate(UpdateUserApplicationHelper.java:123) at com.atlassian.jira.rest.util.UpdateUserApplicationHelper.addUserToApplication(UpdateUserApplicationHelper.java:57) at com.atlassian.jira.rest.internal.v2.ViewUserResource.addUserToApplication(ViewUserResource.java:90)
Notes
- Note related to this, quote from docs: https://confluence.atlassian.com/adminjiraserver072/manage-group-access-to-applications-828787681.html
We strongly recommend you only have one default group for each application. New users created and assigned application access are added to that application's default group ..
- JIRA will mark some groups as default after upgrade from 6.x to 7.x
Problem will be caused by combination of 2 factors:
- Too many cache flush events, current ticket
- caused by allowing some user Application access
- For each default group there will be additional cache flush event when user is given Application access
- Slow cache population, see
JRA-64384 - Those two thread will fight for cache, meantime all threads will wait (this instability could be 5+ min), see related JRA-64847
- one will flush
"http-nio-8080-exec-828" #365452 daemon prio=5 os_prio=0 tid=0x00007fda18083000 nid=0x38e0 waiting on condition [0x00007fde368df000] java.lang.Thread.State: WAITING (parking) ... at com.atlassian.cache.ehcache.LoadingCache.removeAll(LoadingCache.java:320) ... at com.atlassian.jira.application.DefaultApplicationRoleManager.clearUserCounts(DefaultApplicationRoleManager.java:651)
- another will populate:
"http-nio-8080-exec-831" #365455 daemon prio=5 os_prio=0 tid=0x00007fda540a8800 nid=0x38e3 runnable [0x00007fde36485000] java.lang.Thread.State: RUNNABLE .. at com.atlassian.jira.crowd.embedded.ofbiz.DelegatingUserDao.processUsers(DelegatingUserDao.java:58). ... at com.atlassian.cache.ehcache.LoadingCache.loadValueAndReleaseLock(LoadingCache.java:100) at com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:76) at com.atlassian.cache.ehcache.DelegatingCache.get(DelegatingCache.java:97) at com.atlassian.jira.application.DefaultApplicationRoleManager.getUserCount(DefaultApplicationRoleManager.java:407)
- one will flush
Workaround
- Reduce number of default groups to 1.
Note on Fix
We largely mitigated the problem by implementing JRASERVER-65342, that makes cache flushes non-blocking, so they will not stack-up in case is many events.
- is caused by
-
JRASERVER-64847 Make jira cache flush events non-blocking
- Gathering Interest
-
JRASERVER-64962 Made add users to groups more efficient
- Gathering Interest
- is related to
-
JRASERVER-64384 ActiveUsersCountForLicense cache is populated inefficiently leading to performance degradation across the application
- Closed
- was split into
-
JRASERVER-65342 Flushes for activeUsersCountForLicense cache are blocking
- Closed
- mentioned in
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
- relates to
-
RUM-1758 Loading...