Jira Data Center / JRASERVER-72248

Flushing ProjectRoleActorsZduSafeCache may cause temporary degradation in responsiveness

    Description

      Issue Summary

      ProjectRoleActorsZduSafeCache uses a composite key of <projectId,roleId>. Some components (such as JQL validation, which is used in the test below) iterate over the entries of this cache for all projects. When the cache is fully flushed, the many threads running such components therefore reload it aggressively. This produces many concurrent cache load requests, which may temporarily block each other due to lock contention on cache stripes. The resulting delay may become noticeable on large instances with high load, a large web-container thread pool and a large number of projects.
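
      To illustrate why unrelated <projectId,roleId> keys can block each other, here is a minimal sketch of lock striping. It is not the Ehcache or Jira implementation: the class names, the hash spreading and the stripe count of 2048 are assumptions chosen for illustration. It only shows that once there are more keys than stripes, several keys must share a lock.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class StripeCollisionSketch {

    /** Hypothetical composite key mirroring the <projectId,roleId> pair. */
    static final class ProjectRoleKey {
        final long projectId;
        final long roleId;

        ProjectRoleKey(long projectId, long roleId) {
            this.projectId = projectId;
            this.roleId = roleId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ProjectRoleKey)) {
                return false;
            }
            ProjectRoleKey other = (ProjectRoleKey) o;
            return projectId == other.projectId && roleId == other.roleId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(projectId, roleId);
        }
    }

    /** Maps a key to one of stripeCount locks; stripeCount must be a power of two. */
    public static int stripeFor(long projectId, long roleId, int stripeCount) {
        int h = new ProjectRoleKey(projectId, roleId).hashCode();
        h ^= (h >>> 16); // spread the high bits (illustrative, not Ehcache's function)
        return h & (stripeCount - 1);
    }

    /** Counts how many of the projects x roles keys land on each stripe. */
    public static Map<Integer, Integer> keysPerStripe(int projects, int roles, int stripeCount) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (long p = 1; p <= projects; p++) {
            for (long r = 1; r <= roles; r++) {
                counts.merge(stripeFor(p, r, stripeCount), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sizes: 5000 projects, 10 roles, 2048 stripes.
        Map<Integer, Integer> counts = keysPerStripe(5000, 10, 2048);
        int max = counts.values().stream().mapToInt(Integer::intValue).max().orElse(0);
        System.out.println("stripes in use: " + counts.size()
                + ", most keys sharing one stripe: " + max);
    }
}
```

      With more keys than stripes, a thread loading one key can hold a lock that a thread asking for a different key is waiting on, which is the kind of contention described above.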

      Steps to Reproduce

      The issue may be reproduced with a load test. The amount of delay that can be observed depends on the size of test data, load and test hardware.

      1. Set up a load test that generates concurrent issue search load on the system, for example by calling the issue search REST API at /rest/api/2/search?jql=comment~test. Use a JQL query that refers to a component (this triggers a permission check); in the example provided, the component is comment.
      2. After the test warmup period, flush the ProjectRoleActorsZduSafeCache by creating a new empty Group at Administration/User management/Groups, and then deleting this Group
      3. You may observe stalled requests in the load test tool. Verify the issue with thread dumps taken during this period.
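
      Step 1 can be sketched with a small load generator like the one below. This is an assumption-laden sketch, not the tool used for the original test: the base URL, thread count, request count and the 2-second "slow" threshold are hypothetical, and authentication is omitted (a real instance will usually need credentials or a session cookie added to the request). It requires Java 11 or later for java.net.http.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SearchLoadSketch {

    /** Builds the issue search URL from the report, URL-encoding the JQL. */
    public static URI searchUri(String baseUrl, String jql) {
        return URI.create(baseUrl + "/rest/api/2/search?jql="
                + URLEncoder.encode(jql, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical defaults; adjust to the instance under test.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        int threads = 50;
        int requestsPerThread = 200;
        long slowMillis = 2000;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(searchUri(baseUrl, "comment~test"))
                .GET()
                .build();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    long start = System.nanoTime();
                    try {
                        int status = client
                                .send(request, HttpResponse.BodyHandlers.discarding())
                                .statusCode();
                        long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                        // Report only requests slow enough to suggest a stall.
                        if (ms >= slowMillis) {
                            System.out.println("slow request: " + ms + " ms, status " + status);
                        }
                    } catch (Exception e) {
                        System.out.println("request failed: " + e);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

      Flush the cache as in step 2 while this is running, and watch for a burst of slow requests.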

      Operations that are known to trigger this bug

      • deleting a User
      • deleting a Group
      • anonymizing a User
      • Project import

      Sample stack traces indicating the problem

      • many blocked threads with the following stack:
        sun.misc.Unsafe.park(Native Method)
        java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
        java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
        java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
        net.sf.ehcache.concurrent.ReadWriteLockSync.lock(ReadWriteLockSync.java:50)
        net.sf.ehcache.constructs.blocking.BlockingCache.acquiredLockForKey(BlockingCache.java:196)
        net.sf.ehcache.constructs.blocking.BlockingCache.get(BlockingCache.java:158)
        com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:80)
        com.atlassian.cache.ehcache.DelegatingCache.get(DelegatingCache.java:108)
        com.atlassian.jira.cache.DeferredReplicationCache.get(DeferredReplicationCache.java:48)
        com.atlassian.jira.security.roles.ProjectRoleActorsZduSafeCache.get(ProjectRoleActorsZduSafeCache.java:48)
        com.atlassian.jira.security.roles.CachingProjectRoleAndActorStore.getProjectRoleActors(CachingProjectRoleAndActorStore.java:123)
         
      • a few threads with active loaders:
        java.net.SocketInputStream.socketRead0(Native Method)
        java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        java.net.SocketInputStream.read(SocketInputStream.java:171)
        java.net.SocketInputStream.read(SocketInputStream.java:141)
        sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457)
        sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
        sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1095)
        sun.security.ssl.SSLSocketImpl.access$200(SSLSocketImpl.java:72)
        sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:815)
        org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
        org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
        org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
        org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        org.postgresql.core.PGStream.receiveChar(PGStream.java:441)
        org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2057)
        org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
        org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
        org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
        org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
        org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:114)
        org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:83)
        org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:83)
        com.atlassian.jira.ofbiz.sql.PreparedStatementWrapper.executeQuery(PreparedStatementWrapper.java:42)
        com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement.lambda$executeQuery$5(DiagnosticPreparedStatement.java:59)
        com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement$$Lambda$1921/659876343.execute(Unknown Source)
        com.atlassian.diagnostics.internal.platform.monitor.db.DefaultDatabaseDiagnosticsCollector.recordExecutionTime(DefaultDatabaseDiagnosticsCollector.java:69)
        com.atlassian.jira.diagnostic.connection.DatabaseDiagnosticsCollectorDelegate.recordExecutionTime(DatabaseDiagnosticsCollectorDelegate.java:55)
        com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement.executeQuery(DiagnosticPreparedStatement.java:59)
        org.ofbiz.core.entity.jdbc.SQLProcessor.executeQuery(SQLProcessor.java:527)
        org.ofbiz.core.entity.GenericDAO.createEntityListIterator(GenericDAO.java:881)
        org.ofbiz.core.entity.GenericDAO.selectListIteratorByCondition(GenericDAO.java:861)
        org.ofbiz.core.entity.GenericDAO.selectByAnd(GenericDAO.java:733)
        org.ofbiz.core.entity.GenericHelperDAO.findByAnd(GenericHelperDAO.java:166)
        org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:913)
        org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:891)
        org.ofbiz.core.entity.GenericDelegator.findByAnd(GenericDelegator.java:860)
        com.atlassian.jira.ofbiz.DefaultOfBizDelegator.findByAnd(DefaultOfBizDelegator.java:83)
        com.atlassian.jira.ofbiz.WrappingOfBizDelegator.findByAnd(WrappingOfBizDelegator.java:62)
        com.atlassian.jira.security.roles.OfBizProjectRoleAndActorStore.getRoleActors(OfBizProjectRoleAndActorStore.java:323)
        com.atlassian.jira.security.roles.OfBizProjectRoleAndActorStore.getProjectRoleActors(OfBizProjectRoleAndActorStore.java:144)
        com.atlassian.jira.security.roles.CachingProjectRoleAndActorStore.loadProjectRoleActorsFromDelegate(CachingProjectRoleAndActorStore.java:205)
        com.atlassian.jira.security.roles.CachingProjectRoleAndActorStore$$Lambda$267/981565593.load(Unknown Source)
        com.atlassian.cache.ehcache.wrapper.ValueProcessorAtlassianCacheLoaderDecorator.load(ValueProcessorAtlassianCacheLoaderDecorator.java:26)
        com.atlassian.cache.ehcache.LoadingCache.getFromLoader(LoadingCache.java:134)
        com.atlassian.cache.ehcache.LoadingCache$$Lambda$215/888134930.apply(Unknown Source)
        com.atlassian.cache.ehcache.SynchronizedLoadingCacheDecorator.synchronizedLoad(SynchronizedLoadingCacheDecorator.java:29)
        com.atlassian.cache.ehcache.LoadingCache.loadValueAndReleaseLock(LoadingCache.java:102)
        com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:81)
        com.atlassian.cache.ehcache.DelegatingCache.get(DelegatingCache.java:108)
        com.atlassian.jira.cache.DeferredReplicationCache.get(DeferredReplicationCache.java:48)
        com.atlassian.jira.security.roles.ProjectRoleActorsZduSafeCache.get(ProjectRoleActorsZduSafeCache.java:48)
        com.atlassian.jira.security.roles.CachingProjectRoleAndActorStore.getProjectRoleActors(CachingProjectRoleAndActorStore.java:123)
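
      When checking thread dumps for this pattern, a small helper along these lines may save some manual counting. It is only a sketch: it assumes a plain-text dump in which each thread's stack is separated from the next by a blank line, and it treats a thread as waiting when its stack contains both the BlockingCache.acquiredLockForKey frame and the ProjectRoleActorsZduSafeCache.get frame shown above. The class and method names are made up for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class BlockedOnRoleCacheSketch {

    private static final String LOCK_FRAME =
            "net.sf.ehcache.constructs.blocking.BlockingCache.acquiredLockForKey";
    private static final String CACHE_FRAME =
            "com.atlassian.jira.security.roles.ProjectRoleActorsZduSafeCache.get";

    /** Counts thread blocks that are waiting for a cache key lock on the role actor cache. */
    public static int countWaitingThreads(String threadDump) {
        int count = 0;
        // Assumption: thread blocks are separated by one or more blank lines.
        for (String block : threadDump.split("\\R\\s*\\R")) {
            if (block.contains(LOCK_FRAME) && block.contains(CACHE_FRAME)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java BlockedOnRoleCacheSketch <thread-dump-file>
        String dump = Files.readString(Path.of(args[0]));
        System.out.println("threads waiting on ProjectRoleActorsZduSafeCache: "
                + countWaitingThreads(dump));
    }
}
```

      Threads that are actively loading a value do not have the acquiredLockForKey frame on their stack, so they are not counted.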
         

      Expected Results

      ProjectRoleActorsZduSafeCache is repopulated after a flush without a noticeable impact on server responsiveness.

      Actual Results

      Temporary lock contention may occur after the cache is flushed, causing a noticeable hit to responsiveness.

      Note on fix

      The problem was mitigated by changing the cache flush logic: only entries associated with projects where the user/group is a role actor will be invalidated.
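
      The difference between a full flush and a targeted invalidation can be sketched as follows. This is not the actual Jira fix: the class, its methods and the use of a ConcurrentHashMap are assumptions for illustration, and the sketch finds affected entries by scanning cached values, whereas the real code may determine the affected projects differently.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class TargetedInvalidationSketch {

    /** Hypothetical composite key mirroring the <projectId,roleId> pair. */
    static final class Key {
        final long projectId;
        final long roleId;

        Key(long projectId, long roleId) {
            this.projectId = projectId;
            this.roleId = roleId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return projectId == other.projectId && roleId == other.roleId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(projectId, roleId);
        }
    }

    // Cached role actors (user or group names) per <projectId,roleId>.
    private final Map<Key, Set<String>> cache = new ConcurrentHashMap<>();

    public void put(long projectId, long roleId, Set<String> actors) {
        cache.put(new Key(projectId, roleId), actors);
    }

    public int size() {
        return cache.size();
    }

    /** Old behaviour: every entry is dropped and must be reloaded. */
    public void flushAll() {
        cache.clear();
    }

    /** Mitigation: drop only entries in which the given user or group is a role actor. */
    public int invalidateFor(String actor) {
        int removed = 0;
        for (Map.Entry<Key, Set<String>> e : cache.entrySet()) {
            if (e.getValue().contains(actor) && cache.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }
}
```

      Because unaffected entries stay cached, far fewer threads need to reload values at the same time after a user or group is deleted.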

      Workaround

      Currently, there is no known workaround for this behaviour. A workaround will be added here when available.

            People

              15609d8ba305 Filip Nowak
              keroglu Kurtcebe Eroglu