Details
- Bug
- Resolution: Fixed
- Low
- 7.2.8, 7.2.9, 7.2.10, 7.3.8, 7.4.4, 7.5.1
- 7.02
- Severity 1 - Critical
Description
Actual behaviour
When JIRA has trouble replicating Ehcache changes, all ehcache-replicator threads are stuck in calls like this:
ehcache-replicator-10 State: RUNNABLE tid: 59
  java.net.SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) SocketInputStream.java
  java.net.SocketInputStream.socketRead(FileDescriptor, byte[], int, int, int) SocketInputStream.java:116
  java.net.SocketInputStream.read(byte[], int, int, int) SocketInputStream.java:171
  java.net.SocketInputStream.read(byte[], int, int) SocketInputStream.java:141
  java.io.BufferedInputStream.fill() BufferedInputStream.java:246
  java.io.BufferedInputStream.read() BufferedInputStream.java:265
  java.io.DataInputStream.readByte() DataInputStream.java:265
  sun.rmi.transport.tcp.TCPChannel.createConnection() TCPChannel.java:246
  sun.rmi.transport.tcp.TCPChannel.newConnection() TCPChannel.java:202
  sun.rmi.server.UnicastRef.newCall(RemoteObject, Operation[], int, long) UnicastRef.java:342
  sun.rmi.registry.RegistryImpl_Stub.lookup(String)
  java.rmi.Naming.lookup(String) Naming.java:101
  net.sf.ehcache.distribution.RMICacheManagerPeerProvider.lookupRemoteCachePeer(String) RMICacheManagerPeerProvider.java:127
  com.atlassian.jira.cluster.distribution.JiraCacheManagerPeerProvider.lambda$getCachePeerAsync$2(String) JiraCacheManagerPeerProvider.java:76
  com.atlassian.jira.cluster.distribution.JiraCacheManagerPeerProvider$$Lambda$45.get()
  com.atlassian.jira.cluster.distribution.ClassLoaderSwitchingSupplier.get() ClassLoaderSwitchingSupplier.java:20
  java.util.concurrent.CompletableFuture$AsyncSupply.run() CompletableFuture.java:1590
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1142
  java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:617
  java.lang.Thread.run() Thread.java:748
At the same time, many request threads are stuck in BlockingCache.acquiredLockForKey while trying to read data from Ehcache.
Stack traces at the moment of snapshot capture:
http-nio-8080-exec-799 url:/ [DAEMON] State: RUNNABLE tid: 3280645
  java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(Thread) ReentrantReadWriteLock.java:547
  java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(int) ReentrantReadWriteLock.java:488
  java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) AbstractQueuedSynchronizer.java:1282
  java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock() ReentrantReadWriteLock.java:727
  net.sf.ehcache.concurrent.ReadWriteLockSync.lock(LockType) ReadWriteLockSync.java:50
  net.sf.ehcache.constructs.blocking.BlockingCache.acquiredLockForKey(Object, Sync, LockType) BlockingCache.java:196
  net.sf.ehcache.constructs.blocking.BlockingCache.get(Object) BlockingCache.java:158
  com.atlassian.cache.ehcache.LoadingCache.get(Object) LoadingCache.java:75
  net.sf.ehcache.constructs.blocking.BlockingCache.get(Serializable) BlockingCache.java:318
  com.atlassian.cache.ehcache.DelegatingCachedReference.get() DelegatingCachedReference.java:62
  com.atlassian.jira.security.GlobalPermissionsCache.hasPermission(GlobalPermissionEntry) GlobalPermissionsCache.java:57
  com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermission(GlobalPermissionEntry) DefaultGlobalPermissionManager.java:419
  ...
  com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermissionIgnoreRecovery(GlobalPermissionKey, ApplicationUser) DefaultGlobalPermissionManager.java:308
  com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermission(GlobalPermissionKey, ApplicationUser) DefaultGlobalPermissionManager.java:270
  com.atlassian.jira.security.DefaultGlobalPermissionManager.hasPermission(int, ApplicationUser) DefaultGlobalPermissionManager.java:264
  com.atlassian.jira.security.DefaultPermissionManager.hasPermission(int, ApplicationUser) DefaultPermissionManager.java:81
  com.atlassian.jira.security.ApplicationRequiredPermissionManager.hasPermission(int, ApplicationUser) ApplicationRequiredPermissionManager.java:60
  ...
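To gauge how many threads are affected, one can count the threads whose stack contains the BlockingCache.acquiredLockForKey frame. The following is a minimal sketch, not part of JIRA or Ehcache: the class StuckThreadCounter and its method countThreadsIn are hypothetical names, and the code only works on stacks of the JVM it runs in (via the standard Thread.getAllStackTraces()), so it would have to be executed inside the affected process or adapted to parse a thread dump.

```java
import java.util.Map;

public class StuckThreadCounter {

    /**
     * Counts threads that have at least one frame matching the given class and method.
     * Hypothetical helper for illustration; each thread is counted at most once.
     */
    static int countThreadsIn(Map<Thread, StackTraceElement[]> stacks,
                              String className, String methodName) {
        int count = 0;
        for (StackTraceElement[] frames : stacks.values()) {
            for (StackTraceElement frame : frames) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    count++;
                    break; // do not count the same thread twice
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Snapshot of all live threads in this JVM (standard JDK API).
        int stuck = countThreadsIn(Thread.getAllStackTraces(),
                "net.sf.ehcache.constructs.blocking.BlockingCache",
                "acquiredLockForKey");
        System.out.println("Threads in BlockingCache.acquiredLockForKey: " + stuck);
    }
}
```

A count that stays high across several snapshots taken a few seconds apart would be consistent with the contention described in this issue, although a single snapshot cannot distinguish a stuck thread from one that is merely passing through the method.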
Expected behaviour
Ehcache updates should not block gets.
Cause
DefaultGlobalPermissionManager uses a cached reference, which means all threads contend for a single ReentrantReadWriteLock. A single View Issue request for a user with 150 groups can generate over 55k accesses to this cache. The lock is acquired with CAS operations, so under high contention its performance can collapse. Our tests show that with 200 concurrent threads performance degrades by a factor of 1000.
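The contention pattern can be illustrated with a small stand-alone sketch. It is not JIRA or Ehcache code and is not the fix that shipped: the class ReadLockContentionSketch and its nested types LockedReference and LockFreeReference are hypothetical. LockedReference funnels every read through one ReentrantReadWriteLock read lock, as a cached reference behind a single lock would, while LockFreeReference reads through an AtomicReference for comparison. The timings depend heavily on hardware and JVM, so the sketch makes no attempt to reproduce the 1000x figure above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class ReadLockContentionSketch {

    /** Every get() takes the same read lock, so all readers CAS on one shared lock state. */
    static final class LockedReference<T> implements Supplier<T> {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private T value;

        LockedReference(T initial) { this.value = initial; }

        @Override
        public T get() {
            lock.readLock().lock();
            try { return value; } finally { lock.readLock().unlock(); }
        }

        void set(T newValue) {
            lock.writeLock().lock();
            try { value = newValue; } finally { lock.writeLock().unlock(); }
        }
    }

    /** Reads are a plain volatile load: readers never write to shared state. */
    static final class LockFreeReference<T> implements Supplier<T> {
        private final AtomicReference<T> ref;

        LockFreeReference(T initial) { this.ref = new AtomicReference<>(initial); }

        @Override
        public T get() { return ref.get(); }

        void set(T newValue) { ref.set(newValue); }
    }

    /** Starts all reader threads at once and returns the wall-clock time until they finish. */
    static long timeReadsNanos(Supplier<?> source, int threads, int readsPerThread)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await();
                    for (int i = 0; i < readsPerThread; i++) {
                        if (source.get() == null) {
                            throw new IllegalStateException("unexpected null value");
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long begin = System.nanoTime();
        start.countDown();
        done.await();
        long elapsed = System.nanoTime() - begin;
        pool.shutdown();
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 200;          // thread count taken from the report above
        int readsPerThread = 50_000; // arbitrary value chosen for this sketch
        long locked = timeReadsNanos(new LockedReference<>("ADMINISTER"), threads, readsPerThread);
        long lockFree = timeReadsNanos(new LockFreeReference<>("ADMINISTER"), threads, readsPerThread);
        System.out.printf("read lock: %d ms, lock-free: %d ms%n",
                locked / 1_000_000, lockFree / 1_000_000);
    }
}
```

The design point is that a read lock is not free for readers: each lock() and unlock() updates the shared lock state, so a read-mostly value accessed tens of thousands of times per request becomes a point of write contention even when no writer is active.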
Issue Links
- is related to
  - JRASERVER-63137 JVM instability at one node affects whole JIRA datacenter cluster (Closed)
  - DELTA-162
  - JSEV-1935