-
Bug
-
Resolution: Unresolved
-
Low
-
None
-
6.6.9
-
11
-
Severity 2 - Major
-
7
-
Issue Summary
Confluence can become unresponsive, waiting on hazelcast to return information relating to labels. This issue is only seen when using the Midori Better Archiving plugin on a Confluence cluster with ~70000 users.
Environment
Confluence 6.6.9 Data Center with 4 nodes
Midori Better Archiving plugin v 8.1.0
Steps to Reproduce
The steps to reproduce are difficult to articulate, as the issue occurs while Confluence is under normal usage, albeit with a large number of users. jxie and vhu@atlassian.com have both looked at this previously, and they determined that when the issue is occurring, Hazelcast appears to be overwhelmed. It is believed that the plugin is overwhelming hazelcast with additional requests to the labels cache. Confluence already does a labels fetch on page load (as indicated by the threads similarly waiting on the hazelcast cache for labels information after the plugin had been disabled), but the midori plugin is at least doubling the number of requests to the cache for the same information by checking labels for every web item that might be displayed (the list provided to getDisplayableItems).
It does seem like threads are being processed (moving from WAITING to RUNNABLE), but it's happening quite slowly, which could indicate Hazelcast is backed up or that some nodes holding label cache data are responsive while others aren't.
Expected Results
Confluence should remain stable while the Midori Better Archiving plugin is enabled, even when Confluence is under high load
Actual Results
Here are some sample threads that show the behavior we see when the issue is occurring:
"http-nio-8090-exec-216 url:/display/EXAM/Page+Name username:charlie url:/pages/viewpage.action username:charlie" #84425 daemon prio=5 tid=0x00007effcc04a000 nid=0x1efee waiting on condition [0x00007efd98398000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:153) at com.hazelcast.map.impl.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:376) at com.hazelcast.map.impl.proxy.MapProxySupport.getInternal(MapProxySupport.java:307) at com.hazelcast.map.impl.proxy.NearCachedMapProxyImpl.getInternal(NearCachedMapProxyImpl.java:101) at com.hazelcast.map.impl.proxy.MapProxyImpl.get(MapProxyImpl.java:112) at com.atlassian.cache.hazelcast.HazelcastCache.getOrLoad(HazelcastCache.java:243) at com.atlassian.cache.hazelcast.HazelcastCache.get(HazelcastCache.java:95) at com.atlassian.confluence.cache.DefaultConfluenceCache.get(DefaultConfluenceCache.java:47) at com.atlassian.confluence.cache.ConfluenceMonitoringCache.get(ConfluenceMonitoringCache.java:45) at com.atlassian.confluence.cache.DefaultConfluenceCache.get(DefaultConfluenceCache.java:47) at com.atlassian.confluence.cache.ConfluenceMonitoringCache.get(ConfluenceMonitoringCache.java:45) at com.atlassian.confluence.cache.hazelcast.hibernate.HazelcastHibernateCache.get(HazelcastHibernateCache.java:32) at com.atlassian.hibernate.extras.tangosol.CoherenceCacheStrategy.get(CoherenceCacheStrategy.java:68) at com.atlassian.confluence.cache.hibernate.SwitchableCacheConcurrencyStrategy.get(SwitchableCacheConcurrencyStrategy.java:29) at com.atlassian.confluence.cache.hibernate.ConfluenceCacheStrategy.get(ConfluenceCacheStrategy.java:49) at com.atlassian.hibernate.adapter.adapters.cache.EntityRegionAccessStrategyV5Adapter.get(EntityRegionAccessStrategyV5Adapter.java:51) at org.hibernate.engine.internal.CacheHelper.fromSharedCache(CacheHelper.java:32) at org.hibernate.event.internal.DefaultLoadEventListener.getFromSharedCache(DefaultLoadEventListener.java:651) at org.hibernate.event.internal.DefaultLoadEventListener.loadFromSecondLevelCache(DefaultLoadEventListener.java:595) at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:462) at org.hibernate.event.internal.DefaultLoadEventListener.load(DefaultLoadEventListener.java:219) ... at com.atlassian.confluence.util.LabelUtil.extractLabelsFromLabellings(LabelUtil.java:625) at com.atlassian.confluence.core.AbstractLabelableEntityObject.getLabels(AbstractLabelableEntityObject.java:26) ... at com.atlassian.confluence.spaces.SpaceDescription_$$_jvstabe_15.getLabels(SpaceDescription_$$_jvstabe_15.java) at com.midori.confluence.plugin.archiving.service.support.ContentLifecycleLabelHelper.findSpaceLabel(ContentLifecycleLabelHelper.java:156) at com.midori.confluence.plugin.archiving.service.support.ContentLifecycleLabelHelper.isBlacklistedSpace(ContentLifecycleLabelHelper.java:112) at com.midori.confluence.plugin.archiving.util.NonBlacklistedGlobalFreshSpacePredicate.apply(NonBlacklistedGlobalFreshSpacePredicate.java:15) at com.midori.confluence.plugin.archiving.web.condition.NonBlacklistedGlobalFreshSpaceCondition.shouldDisplay(NonBlacklistedGlobalFreshSpaceCondition.java:15) at com.atlassian.confluence.plugin.descriptor.web.conditions.BaseConfluenceCondition.shouldDisplay(BaseConfluenceCondition.java:33) at com.atlassian.plugin.web.conditions.AndCompositeCondition.shouldDisplay(AndCompositeCondition.java:14) at com.atlassian.plugin.web.conditions.AndCompositeCondition.shouldDisplay(AndCompositeCondition.java:14) at com.atlassian.plugin.web.DefaultWebInterfaceManager.filterFragmentsByCondition(DefaultWebInterfaceManager.java:154) at com.atlassian.plugin.web.DefaultWebInterfaceManager.getDisplayableItems(DefaultWebInterfaceManager.java:107) at com.atlassian.confluence.plugin.descriptor.web.ConfluenceWebInterfaceManager.getDisplayableItems(ConfluenceWebInterfaceManager.java:74) ... at com.atlassian.confluence.plugins.pagebanner.PageBannerContextProvider.getBannerItems(PageBannerContextProvider.java:72) at com.atlassian.confluence.plugins.pagebanner.PageBannerContextProvider.getContextMap(PageBannerContextProvider.java:57) at com.atlassian.plugin.web.descriptors.DefaultWebPanelModuleDescriptor$ContextAwareWebPanel.getHtml(DefaultWebPanelModuleDescriptor.java:144) ... at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495) - locked <0x00007f03be400550> (a org.apache.tomcat.util.net.NioChannel) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
"http-nio-8090-exec-203 url:/display/EXAM/Page+Name username:charlie url:/pages/viewpage.action username:charlie" #14472 daemon prio=5 tid=0x00007ffac00fd800 nid=0x1c1e4 waiting on condition [0x00007fec4e98d000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:153) at com.hazelcast.map.impl.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:376) at com.hazelcast.map.impl.proxy.MapProxySupport.getInternal(MapProxySupport.java:307) at com.hazelcast.map.impl.proxy.NearCachedMapProxyImpl.getInternal(NearCachedMapProxyImpl.java:101) at com.hazelcast.map.impl.proxy.MapProxyImpl.get(MapProxyImpl.java:112) at com.atlassian.cache.hazelcast.HazelcastCache.getOrLoad(HazelcastCache.java:243) at com.atlassian.cache.hazelcast.HazelcastCache.get(HazelcastCache.java:95) at com.atlassian.confluence.cache.DefaultConfluenceCache.get(DefaultConfluenceCache.java:47) at com.atlassian.confluence.cache.ConfluenceMonitoringCache.get(ConfluenceMonitoringCache.java:45) at com.atlassian.confluence.cache.DefaultConfluenceCache.get(DefaultConfluenceCache.java:47) at com.atlassian.confluence.cache.ConfluenceMonitoringCache.get(ConfluenceMonitoringCache.java:45) at com.atlassian.confluence.cache.hazelcast.hibernate.HazelcastHibernateCache.get(HazelcastHibernateCache.java:32) at com.atlassian.hibernate.extras.tangosol.CoherenceCacheStrategy.get(CoherenceCacheStrategy.java:68) at com.atlassian.confluence.cache.hibernate.SwitchableCacheConcurrencyStrategy.get(SwitchableCacheConcurrencyStrategy.java:29) at com.atlassian.confluence.cache.hibernate.ConfluenceCacheStrategy.get(ConfluenceCacheStrategy.java:49) at com.atlassian.hibernate.adapter.adapters.cache.EntityRegionAccessStrategyV5Adapter.get(EntityRegionAccessStrategyV5Adapter.java:51) at org.hibernate.engine.internal.CacheHelper.fromSharedCache(CacheHelper.java:32) at org.hibernate.event.internal.DefaultLoadEventListener.getFromSharedCache(DefaultLoadEventListener.java:651) at org.hibernate.event.internal.DefaultLoadEventListener.loadFromSecondLevelCache(DefaultLoadEventListener.java:595) at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:462) at org.hibernate.event.internal.DefaultLoadEventListener.load(DefaultLoadEventListener.java:219) ... at com.atlassian.confluence.user.ConfluenceUserImpl_$$_jvstb24_3.getName(ConfluenceUserImpl_$$_jvstb24_3.java) at com.atlassian.confluence.labels.Label.getOwner(Label.java:90) at com.atlassian.confluence.labels.LabelParser.create(LabelParser.java:189) at com.atlassian.confluence.labels.LabelParser.render(LabelParser.java:193) at com.atlassian.confluence.labels.Label.getDisplayTitle(Label.java:170) at com.atlassian.confluence.util.LabelUtil.extractLabelsFromLabellings(LabelUtil.java:626) at com.atlassian.confluence.core.AbstractLabelableEntityObject.getLabels(AbstractLabelableEntityObject.java:26) ... at com.atlassian.confluence.spaces.SpaceDescription_$$_jvstb24_15.getLabels(SpaceDescription_$$_jvstb24_15.java) at com.atlassian.confluence.userstatus.DefaultFavouriteManager.isUserFavourite(DefaultFavouriteManager.java:48) ... at com.atlassian.confluence.plugins.ia.ui.DefaultSoySidebarContextProvider.populateWithMainSidebarData(DefaultSoySidebarContextProvider.java:201) at com.atlassian.confluence.plugins.ia.ui.DefaultSoySidebarContextProvider.getContextMap(DefaultSoySidebarContextProvider.java:116) ... at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495) - locked <0x00007ff290402f40> (a org.apache.tomcat.util.net.NioChannel) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
"http-nio-8090-exec-220 url:/display/EXAM/Page+Name username:charlie url:/pages/viewpage.action username:charlie" #84429 daemon prio=5 tid=0x00007effcc163800 nid=0x1eff2 runnable [0x00007efd8dfa9000] java.lang.Thread.State: RUNNABLE at java.lang.System.identityHashCode(Native Method) at java.util.IdentityHashMap.hash(IdentityHashMap.java:295) at java.util.IdentityHashMap.get(IdentityHashMap.java:329) at org.hibernate.engine.internal.EntityEntryContext.getAssociatedManagedEntity(EntityEntryContext.java:177) at org.hibernate.engine.internal.EntityEntryContext.addEntityEntry(EntityEntryContext.java:91) at org.hibernate.engine.internal.StatefulPersistenceContext.addEntry(StatefulPersistenceContext.java:544) at org.hibernate.engine.internal.StatefulPersistenceContext.addEntity(StatefulPersistenceContext.java:471) at org.hibernate.engine.internal.TwoPhaseLoad.addUninitializedCachedEntity(TwoPhaseLoad.java:385) at org.hibernate.event.internal.DefaultLoadEventListener.convertCacheEntryToEntity(DefaultLoadEventListener.java:738) at org.hibernate.event.internal.DefaultLoadEventListener.processCachedEntry(DefaultLoadEventListener.java:625) at org.hibernate.event.internal.DefaultLoadEventListener.loadFromSecondLevelCache(DefaultLoadEventListener.java:602) at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:462) at org.hibernate.event.internal.DefaultLoadEventListener.load(DefaultLoadEventListener.java:219) at org.hibernate.event.internal.DefaultLoadEventListener.doOnLoad(DefaultLoadEventListener.java:116) at org.hibernate.event.internal.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:89) at org.hibernate.internal.SessionImpl.fireLoad(SessionImpl.java:1230) at org.hibernate.internal.SessionImpl.immediateLoad(SessionImpl.java:1088) at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:155) at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:259) at org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(JavassistLazyInitializer.java:73) at com.atlassian.hibernate.adapter.proxy.JavassistLazyInitializer_ImplementV2Proxy.invoke(JavassistLazyInitializer_ImplementV2Proxy.java:68) at com.atlassian.confluence.user.ConfluenceUserImpl_$$_jvstabe_3.getName(ConfluenceUserImpl_$$_jvstabe_3.java) at com.atlassian.confluence.labels.Label.getOwner(Label.java:90) at com.atlassian.confluence.labels.LabelParser.create(LabelParser.java:189) at com.atlassian.confluence.labels.LabelParser.render(LabelParser.java:193) at com.atlassian.confluence.labels.Label.getDisplayTitle(Label.java:170) at com.atlassian.confluence.util.LabelUtil.extractLabelsFromLabellings(LabelUtil.java:626) at com.atlassian.confluence.core.AbstractLabelableEntityObject.getLabels(AbstractLabelableEntityObject.java:26) ... at com.atlassian.confluence.spaces.SpaceDescription_$$_jvstabe_15.getLabels(SpaceDescription_$$_jvstabe_15.java) at com.midori.confluence.plugin.archiving.service.support.ContentLifecycleLabelHelper.findSpaceLabel(ContentLifecycleLabelHelper.java:156) at com.midori.confluence.plugin.archiving.service.support.ContentLifecycleLabelHelper.isBlacklistedSpace(ContentLifecycleLabelHelper.java:112) at com.midori.confluence.plugin.archiving.util.NonBlacklistedGlobalFreshSpacePredicate.apply(NonBlacklistedGlobalFreshSpacePredicate.java:15) at com.midori.confluence.plugin.archiving.web.condition.NonBlacklistedGlobalFreshSpaceCondition.shouldDisplay(NonBlacklistedGlobalFreshSpaceCondition.java:15) at com.atlassian.confluence.plugin.descriptor.web.conditions.BaseConfluenceCondition.shouldDisplay(BaseConfluenceCondition.java:33) at com.atlassian.plugin.web.conditions.AndCompositeCondition.shouldDisplay(AndCompositeCondition.java:14) at com.atlassian.plugin.web.conditions.AndCompositeCondition.shouldDisplay(AndCompositeCondition.java:14) at com.atlassian.plugin.web.DefaultWebInterfaceManager.filterFragmentsByCondition(DefaultWebInterfaceManager.java:154) at com.atlassian.plugin.web.DefaultWebInterfaceManager.getDisplayableItems(DefaultWebInterfaceManager.java:107) at com.atlassian.confluence.plugin.descriptor.web.ConfluenceWebInterfaceManager.getDisplayableItems(ConfluenceWebInterfaceManager.java:74) ... at com.atlassian.confluence.plugins.pagebanner.PageBannerContextProvider.getBannerItems(PageBannerContextProvider.java:72) at com.atlassian.confluence.plugins.pagebanner.PageBannerContextProvider.getContextMap(PageBannerContextProvider.java:57) at com.atlassian.plugin.web.descriptors.DefaultWebPanelModuleDescriptor$ContextAwareWebPanel.getHtml(DefaultWebPanelModuleDescriptor.java:144) ... at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495) - locked <0x00007f03af801258> (a org.apache.tomcat.util.net.NioChannel) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
Notes
Is it possible that this is a bug in hazelcast? Perhaps https://github.com/hazelcast/hazelcast/issues/13170
Workaround
Disable Midori Better Archiving
Form Name |
---|
Oh, there is one more important note: as explained above, our app checks the labels of the space which contains the viewed page. It may not be redundant at all, because other mechanisms (I assume) check the labels of the viewed page itself.