-
Type:
Bug
-
Resolution: Unresolved
-
Priority:
Low
-
None
-
Affects Version/s: 6.3.1, 6.9.0
-
Component/s: Performance
-
11
-
Severity 2 - Major
-
2
Summary
Bamboo plan cache operations threads can deadlock (in the application, not at the database level) when connection pool is smaller than max http threads. Once the deadlock is triggered, the application is unresponsive until restarted. The default Bamboo configuration is:
- 150 max threads
- 100 max database connections
Steps to Reproduce
Use postgresql to reproduce – easier to lock a table to line everything up for a deadlock
Reduce limits for easier reproduction:
- Set maxThreads to 30 in $BAMBOO_INSTALL/conf/server.xml
- Set max_size of connection pool to 20 in $BAMBOO_HOME/bamboo.cfg.xml
Create about 400 plans via Bamboo Specs for a longer plan cache invalidation
- Bulk action >> Add notification
- Choose all plans, hit next but don't confirm yet.
- Prepare the below SQL query to run to lock a table but don't run it yet
@set autocommit off; LOCK TABLE BUILD IN ACCESS EXCLUSIVE MODE; SELECT count(BUILD_ID) FROM BUILD;
- Tail Bamboo logs to confirm when plan cache is being invalidated.
- Fire off the bulk action.
- Check logs, if plan cache is being invalidated – execute the SQL to lock the build table. This allows us to then fire off a bunch of HTTP threads that need to access the cache before invalidation finishes.
- Run the below example script to tie up HTTP threads on plan dashboard
BAMBOO_BASE_URL=http://localhost:8085/ BAMBOO_USER=bamboo BAMBOO_PASSWORD=bamboo for i in {1..40} do sem -j 40 curl -u $BAMBOO_USER:$BAMBOO_PASSWORD -X GET $BAMBOO_BASE_URL/allPlans.action done
- Wait for all 30 HTTP threads to be used: tail -f catalina.out | grep "http-nio-8085-exec" or grab a thread dump.
- Unlock the table
commit transaction;
Expected Results
Everything unblocks and continues.
Actual Results
Application deadlocks and needs to be restarted"
- No threads are in JDBC. No transactions or C3P0 helpers are trying to get a connection.
- All threads are waiting to enter the plan cache but the thread they're waiting on is waiting for a pooled connection.
- This thread can't get a pooled connection because all the threads waiting to enter the plan cache are holding one.
Example blocked thread holding a pooled connection (if there are as many of these as there are pooled connections, a deadlock can occur)
"http-nio-8085-exec-123" #328 daemon prio=5 os_prio=0 tid=0x00007f74fd90b800 nid=0x6542 waiting on condition [0x00007f747b821000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000eec5ced0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.get(AlwaysInvalidatingCacheDecorator.java:57) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.getUnchecked(AlwaysInvalidatingCacheDecorator.java:126) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.internalGetPlanByKey(ImmutablePlanCacheServiceImpl.java:487) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.lambda$internalGetAllChainsStream$11(ImmutablePlanCacheServiceImpl.java:571)
Thread holding the lock is waiting for a pooled connection:
Locked ownable synchronizers: - <0x00000000eec5ced0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) ... "http-nio-8085-exec-1" #175 daemon prio=5 os_prio=0 tid=0x00007f752c0e0000 nid=0x411f in Object.wait() [0x00007f7484991000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1465) at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644) - locked <0x00000000c27db328> (a com.mchange.v2.resourcepool.BasicResourcePool) at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554) at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758) at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685) at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140) at org.hibernate.c3p0.internal.C3P0ConnectionProvider.getConnection(C3P0ConnectionProvider.java:72) at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:35) at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:106) at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:136) at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:541) at org.springframework.orm.hibernate5.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:470) at org.springframework.transaction.support.AbstractPlatformTransactionManager.handleExistingTransaction(AbstractPlatformTransactionManager.java:430) at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:354) at sun.reflect.GeneratedMethodAccessor201.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:206) at com.sun.proxy.$Proxy109.getTransaction(Unknown Source) at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137) at com.atlassian.bamboo.plan.cache.CacheLoadContextSupport.load(CacheLoadContextSupport.java:74) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.lambda$load$1(ImmutablePlanCacheServiceImpl.java:125) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1$$Lambda$397/1882222604.call(Unknown Source) at com.atlassian.bamboo.executor.RetryingTaskExecutor.rerun(RetryingTaskExecutor.java:112) at com.atlassian.bamboo.executor.RetryingTaskExecutor.runTask(RetryingTaskExecutor.java:99) at com.atlassian.bamboo.executor.RetryingTaskExecutor.retry(RetryingTaskExecutor.java:207) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:129) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:103) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) - locked <0x00000000eec5cfe0> (a com.google.common.cache.LocalCache$StrongEntry) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) at com.google.common.cache.LocalCache.get(LocalCache.java:3937) at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.get(AlwaysInvalidatingCacheDecorator.java:65) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.getUnchecked(AlwaysInvalidatingCacheDecorator.java:126) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.internalGetPlanByKey(ImmutablePlanCacheServiceImpl.java:487) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.lambda$internalGetAllChainsStream$11(ImmutablePlanCacheServiceImpl.java:571) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$$Lambda$455/516208531.apply(Unknown Source)
Notes
- BAM-20323: Whole Plan cache invalidated when user rename event occurs. (fixed in Bamboo 6.8.1, 6.9.0) can increase the window for hitting the deadlock when LDAP syncs.
Workaround
Ensure that hibernate.c3p0.max_size in $BAMBOO_HOME/bamboo.cfg.xml is higher than maxThreads on HTTP connector in $BAMBOO_INSTALL/conf/server.xml by about 30 (overhead for background threads).
Your database must also be configured to handle the number of connections set for hibernate.c3p0.max_size.