Bamboo plan cache operations can deadlock when connection pool is smaller than max http threads

XMLWordPrintable

    • 11
    • Severity 2 - Major
    • 2

      Summary

      Bamboo plan cache operations threads can deadlock (in the application, not at the database level) when connection pool is smaller than max http threads. Once the deadlock is triggered, the application is unresponsive until restarted. The default Bamboo configuration is:

      • 150 max threads
      • 100 max database connections

      Steps to Reproduce

      Use postgresql to reproduce – easier to lock a table to line everything up for a deadlock

      Reduce limits for easier reproduction:

      • Set maxThreads to 30 in $BAMBOO_INSTALL/conf/server.xml
      • Set max_size of connection pool to 20 in $BAMBOO_HOME/bamboo.cfg.xml

      Create about 400 plans via Bamboo Specs for a longer plan cache invalidation

      1. Bulk action >> Add notification
      2. Choose all plans, hit next but don't confirm yet.
      3. Prepare the below SQL query to run to lock a table but don't run it yet
        @set autocommit off;
        LOCK TABLE BUILD IN ACCESS EXCLUSIVE MODE;
        SELECT count(BUILD_ID) FROM BUILD;
        
      4. Tail Bamboo logs to confirm when plan cache is being invalidated.
      5. Fire off the bulk action.
      6. Check logs, if plan cache is being invalidated – execute the SQL to lock the build table. This allows us to then fire off a bunch of HTTP threads that need to access the cache before invalidation finishes.
      7. Run the below example script to tie up HTTP threads on plan dashboard
        BAMBOO_BASE_URL=http://localhost:8085/
        BAMBOO_USER=bamboo
        BAMBOO_PASSWORD=bamboo
        
        for i in {1..40}
        do 
          sem -j 40 curl -u $BAMBOO_USER:$BAMBOO_PASSWORD -X GET $BAMBOO_BASE_URL/allPlans.action
        done
        
      8. Wait for all 30 HTTP threads to be used: tail -f catalina.out | grep "http-nio-8085-exec" or grab a thread dump.
      9. Unlock the table
        commit transaction;
        

      Expected Results

      Everything unblocks and continues.

      Actual Results

      Application deadlocks and needs to be restarted"

      • No threads are in JDBC. No transactions or C3P0 helpers are trying to get a connection.
      • All threads are waiting to enter the plan cache but the thread they're waiting on is waiting for a pooled connection.
      • This thread can't get a pooled connection because all the threads waiting to enter the plan cache are holding one.

      Example blocked thread holding a pooled connection (if there are as many of these as there are pooled connections, a deadlock can occur)

      "http-nio-8085-exec-123" #328 daemon prio=5 os_prio=0 tid=0x00007f74fd90b800 nid=0x6542 waiting on condition [0x00007f747b821000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00000000eec5ced0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      	at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.get(AlwaysInvalidatingCacheDecorator.java:57)
      	at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.getUnchecked(AlwaysInvalidatingCacheDecorator.java:126)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.internalGetPlanByKey(ImmutablePlanCacheServiceImpl.java:487)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.lambda$internalGetAllChainsStream$11(ImmutablePlanCacheServiceImpl.java:571)
      

      Thread holding the lock is waiting for a pooled connection:

         Locked ownable synchronizers:
      	- <0x00000000eec5ced0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      ...
      "http-nio-8085-exec-1" #175 daemon prio=5 os_prio=0 tid=0x00007f752c0e0000 nid=0x411f in Object.wait() [0x00007f7484991000]
         java.lang.Thread.State: WAITING (on object monitor)
      	at java.lang.Object.wait(Native Method)
      	at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1465)
      	at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
      	- locked <0x00000000c27db328> (a com.mchange.v2.resourcepool.BasicResourcePool)
      	at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
      	at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
      	at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
      	at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
      	at org.hibernate.c3p0.internal.C3P0ConnectionProvider.getConnection(C3P0ConnectionProvider.java:72)
      	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:35)
      	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:106)
      	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:136)
      	at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:541)
      	at org.springframework.orm.hibernate5.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:470)
      	at org.springframework.transaction.support.AbstractPlatformTransactionManager.handleExistingTransaction(AbstractPlatformTransactionManager.java:430)
      	at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:354)
      	at sun.reflect.GeneratedMethodAccessor201.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:206)
      	at com.sun.proxy.$Proxy109.getTransaction(Unknown Source)
      	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137)
      	at com.atlassian.bamboo.plan.cache.CacheLoadContextSupport.load(CacheLoadContextSupport.java:74)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.lambda$load$1(ImmutablePlanCacheServiceImpl.java:125)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1$$Lambda$397/1882222604.call(Unknown Source)
      	at com.atlassian.bamboo.executor.RetryingTaskExecutor.rerun(RetryingTaskExecutor.java:112)
      	at com.atlassian.bamboo.executor.RetryingTaskExecutor.runTask(RetryingTaskExecutor.java:99)
      	at com.atlassian.bamboo.executor.RetryingTaskExecutor.retry(RetryingTaskExecutor.java:207)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:129)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:103)
      	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
      	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
      	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
      	- locked <0x00000000eec5cfe0> (a com.google.common.cache.LocalCache$StrongEntry)
      	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
      	at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
      	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)
      	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
      	at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.get(AlwaysInvalidatingCacheDecorator.java:65)
      	at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.getUnchecked(AlwaysInvalidatingCacheDecorator.java:126)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.internalGetPlanByKey(ImmutablePlanCacheServiceImpl.java:487)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.lambda$internalGetAllChainsStream$11(ImmutablePlanCacheServiceImpl.java:571)
      	at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$$Lambda$455/516208531.apply(Unknown Source)
      

      Notes

      Workaround

      Ensure that hibernate.c3p0.max_size in $BAMBOO_HOME/bamboo.cfg.xml is higher than maxThreads on HTTP connector in $BAMBOO_INSTALL/conf/server.xml by about 30 (overhead for background threads).

      Your database must also be configured to handle the number of connections set for hibernate.c3p0.max_size.

            Assignee:
            Unassigned
            Reporter:
            Jeremy Owen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated: