XMLWordPrintable

    • 8
    • Severity 1 - Critical
    • 52

      Background

      Sometimes Confluence starts responding 500x error pages due to the lock timeout in VCacheLock which is used by I18NBean.
      It happens when threads X and Y lock this cache, and then other threads try to lock it also, wait for 30 seconds and than throw an exception.

      The threads X and Y are locked by ehcache (either by put or get operation). Stacktraces:

      "Caesium-1-4" #353 daemon prio=5 os_prio=0 tid=0x00007f0697dfc800 nid=0x1ccff waiting on condition [0x00007f02997e2000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00007f083ee6cb28> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
      	at net.sf.ehcache.store.chm.SelectableConcurrentHashMap$Segment.get(SelectableConcurrentHashMap.java:721)
      	at net.sf.ehcache.store.chm.SelectableConcurrentHashMap.get(SelectableConcurrentHashMap.java:324)
      	at net.sf.ehcache.store.MemoryStore.get(MemoryStore.java:344)
      	at net.sf.ehcache.store.MemoryStore.expireElement(MemoryStore.java:437)
      	at net.sf.ehcache.store.MemoryStore.expireElements(MemoryStore.java:423)
      	at net.sf.ehcache.Cache.evictExpiredElements(Cache.java:2997)
      	at com.atlassian.confluence.cache.ehcache.EhCacheCompactor.evictExpiredElements(EhCacheCompactor.java:92)
      	at com.atlassian.confluence.cache.ehcache.EhCacheCompactor.performEviction(EhCacheCompactor.java:67)
      	at com.atlassian.confluence.cache.ehcache.EhCacheCompactor.compact(EhCacheCompactor.java:50)
      	at com.atlassian.confluence.impl.cache.CacheCompactorSupport.run(CacheCompactorSupport.java:13)
      	at com.atlassian.confluence.schedule.RunnableJobRunner.runJob(RunnableJobRunner.java:24)
      	at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.doRunJob(JobRunnerWrapper.java:120)
      	at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.lambda$runJob$0(JobRunnerWrapper.java:91)
      	at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper$$Lambda$596/1667278920.perform(Unknown Source)
      	at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContextInternal(VCacheRequestContextManager.java:87)
      	at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContext(VCacheRequestContextManager.java:71)
      	at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.runJob(JobRunnerWrapper.java:91)
      	at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:153)
      	at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:118)
      	at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:97)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:443)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeLocalJob(CaesiumSchedulerService.java:410)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:388)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$1.consume(CaesiumSchedulerService.java:285)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$1.consume(CaesiumSchedulerService.java:282)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:65)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:59)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:34)
      	at java.lang.Thread.run(Thread.java:748)
      
      "http-nio-8090-exec-145" #345952 daemon prio=5 os_prio=0 tid=0x00007f003c05c000 nid=0x17d5e waiting on condition [0x00007f000292b000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00007f083ee6cb28> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
      	at net.sf.ehcache.store.chm.SelectableConcurrentHashMap$Segment.put(SelectableConcurrentHashMap.java:641)
      	at net.sf.ehcache.store.chm.SelectableConcurrentHashMap.put(SelectableConcurrentHashMap.java:383)
      	at net.sf.ehcache.store.MemoryStore.put(MemoryStore.java:283)
      	at net.sf.ehcache.Cache.putInternal(Cache.java:1606)
      	at net.sf.ehcache.Cache.put(Cache.java:1532)
      	at net.sf.ehcache.Cache.put(Cache.java:1497)
      	at com.atlassian.cache.ehcache.DelegatingCache.put(DelegatingCache.java:82)
      	at com.atlassian.confluence.cache.ehcache.DefaultConfluenceEhCache.put(DefaultConfluenceEhCache.java:150)
      	at com.atlassian.confluence.cache.ConfluenceMonitoringCache.put(ConfluenceMonitoringCache.java:87)
      	at com.atlassian.confluence.cache.DefaultConfluenceCache.put(DefaultConfluenceCache.java:58)
      	at com.atlassian.confluence.cache.ConfluenceMonitoringCache.put(ConfluenceMonitoringCache.java:87)
      	at com.atlassian.vcache.internal.legacy.LegacyJvmCache.lambda$put$7(LegacyJvmCache.java:78)
      	at com.atlassian.vcache.internal.legacy.LegacyJvmCache$$Lambda$457/1103570330.run(Unknown Source)
      	at com.atlassian.vcache.internal.core.service.VCacheLock.withLock(VCacheLock.java:42)
      	at com.atlassian.vcache.internal.legacy.LegacyJvmCache.put(LegacyJvmCache.java:78)
      	at com.atlassian.vcache.internal.core.metrics.TimedLocalCacheOperations.put(TimedLocalCacheOperations.java:81)
      	at com.atlassian.confluence.impl.vcache.UnblockingRemovalJvmCache.lambda$null$1(UnblockingRemovalJvmCache.java:58)
      	at com.atlassian.confluence.impl.vcache.UnblockingRemovalJvmCache$$Lambda$432/964287692.get(Unknown Source)
      	at java.util.Optional.orElseGet(Optional.java:267)
      	at com.atlassian.confluence.impl.vcache.UnblockingRemovalJvmCache.lambda$get$2(UnblockingRemovalJvmCache.java:54)
      	at com.atlassian.confluence.impl.vcache.UnblockingRemovalJvmCache$$Lambda$430/25187769.get(Unknown Source)
      	at java.util.Optional.orElseGet(Optional.java:267)
      	at com.atlassian.confluence.impl.vcache.UnblockingRemovalJvmCache.get(UnblockingRemovalJvmCache.java:50)
      	at com.atlassian.confluence.util.i18n.CachingI18NBeanFactory.getI18NBean(CachingI18NBeanFactory.java:66)
      	at com.atlassian.confluence.util.i18n.UserI18NBeanFactory.getI18NBean(UserI18NBeanFactory.java:28)
      	at com.atlassian.confluence.util.i18n.UserI18NBeanFactory.getI18NBean(UserI18NBeanFactory.java:34)
      	at sun.reflect.GeneratedMethodAccessor9964.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:162)
      	at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:588)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1123)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1018)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:510)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:482)
      	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:325)
      	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
      	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:275)
      	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireByName(AbstractAutowireCapableBeanFactory.java:1244)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1194)
      	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireBeanProperties(AbstractAutowireCapableBeanFactory.java:385)
      	at com.atlassian.spring.container.SpringContainerContext.autowireComponent(SpringContainerContext.java:102)
      	at com.atlassian.confluence.core.ConfluenceAutowireInterceptor.intercept(ConfluenceAutowireInterceptor.java:37)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.opensymphony.xwork.interceptor.AroundInterceptor.intercept(AroundInterceptor.java:35)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.atlassian.xwork.interceptors.TransactionalInvocation.invokeAndHandleExceptions(TransactionalInvocation.java:76)
      	at com.atlassian.xwork.interceptors.TransactionalInvocation.invokeInTransaction(TransactionalInvocation.java:61)
      	at com.atlassian.xwork.interceptors.XWorkTransactionInterceptor.intercept(XWorkTransactionInterceptor.java:56)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.atlassian.confluence.xwork.SetupIncompleteInterceptor.intercept(SetupIncompleteInterceptor.java:43)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.atlassian.confluence.security.interceptors.SecurityHeadersInterceptor.intercept(SecurityHeadersInterceptor.java:39)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.opensymphony.xwork.interceptor.AroundInterceptor.intercept(AroundInterceptor.java:35)
      	at com.opensymphony.xwork.DefaultActionInvocation.invoke(DefaultActionInvocation.java:165)
      	at com.opensymphony.xwork.DefaultActionProxy.execute(DefaultActionProxy.java:115)
      	at com.atlassian.confluence.servlet.ConfluenceServletDispatcher.serviceAction(ConfluenceServletDispatcher.java:56)
      	at com.opensymphony.webwork.dispatcher.ServletDispatcher.service(ServletDispatcher.java:199)
      

      As we can see there, both threads are locked by <0x00007f083ee6cb28>. Unfortunately, stacktrace does not show us which thread locks <0x00007f083ee6cb28>. For example, it could happen if a thread was created, acquired the lock than was killed without releasing the lock.

      How to check that your Confluence instance is affected by this issue

      If Confluence pages return 500x errors, please do the following:
      1. Create the first thread dump and then the second one in a minute
      2. Run this command on each dump:

      cat <thread-dump>  | egrep "(MemoryStore.put|MemoryStore.get)" -B 10
      

      3. If you see two threads which are waiting for the same lock, you are affected by this issue. Both thread dumps should have the same lock id. 

      More Information & Workaround

      Confluence Throws 500 Error Due To Lock Timeout In Vcachelock

            Assignee:
            Unassigned
            Reporter:
            George Lipatov
            Votes:
            8 Vote for this issue
            Watchers:
            22 Start watching this issue

              Created:
              Updated:
              Resolved: