Details
-
Bug
-
Resolution: Fixed
-
High
-
5.15.3, 6.8.0, 6.7.1
-
4
-
Severity 2 - Major
-
1
-
Description
Issue Summary
Race condition when a branch is created while a plan is being deleted creates database inconsistencies that break Bamboo's background deletion service.
Steps to Reproduce
Use PostgreSQL for reproduction – easy to lock the table in the steps.
- Create a plan
- Point it to a Bitbucket Server repository. Easier to reproduce with event based branch detection.
- Enable automatic branch detection on the plan
- Lock the BUILD table:
@set autocommit off; LOCK TABLE BUILD IN ACCESS EXCLUSIVE MODE; SELECT count(BUILD_ID) FROM BUILD;
- Delete the plan:
curl -k -u user:password \ -H 'X-Atlassian-Token: no-check' \ -H 'Accept: application/json' \ -H 'Content-Type: application/x-www-form-urlencoded' \ -d 'buildKey='PLAN-KEY'' \ -X POST 'http://localhost:8085/ajax/deleteChain.action'
- Wait 5 seconds. Now push a new branch to the Bitbucket repository.
- Wait 5 seconds. Unlock the build table:
commit transaction;
Expected Results
New branch is not created since plan is being marked for deletion.
The plan is deleted by the deletion service.
Actual Results
New branch is created with MARKED_FOR_DELETION = false. The rest of the plan is marked for deletion.
The below exception is thrown in the atlassian-bamboo.log file when the deletion service attempts to run:
2019-02-11 14:36:36,536 ERROR [DelayedDeletionThread] [DeletionServiceHelper] Error occurred while deleting plan or job: BUG-BUG, continue removal of other plans 2019-02-11 14:36:36,536 ERROR [DelayedDeletionThread] [DeletionServiceHelper] could not execute batch; SQL [delete from BUILD where BUILD_ID=?]; constraint [fk_jdvuiqpnjmsvf3llia8jvvg03]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute batch org.springframework.dao.DataIntegrityViolationException: could not execute batch; SQL [delete from BUILD where BUILD_ID=?]; constraint [fk_jdvuiqpnjmsvf3llia8jvvg03]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute batch at org.springframework.orm.hibernate5.SessionFactoryUtils.convertHibernateAccessException(SessionFactoryUtils.java:247) at org.springframework.orm.hibernate5.HibernateTransactionManager.convertHibernateAccessException(HibernateTransactionManager.java:785) at org.springframework.orm.hibernate5.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:621) at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:746) at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:714) at sun.reflect.GeneratedMethodAccessor206.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:206) at com.sun.proxy.$Proxy109.commit(Unknown Source) at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:532) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:304) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) at com.sun.proxy.$Proxy155.deletePlan(Unknown Source) at com.atlassian.bamboo.deletion.DeletionServiceHelper.doDeletePlan(DeletionServiceHelper.java:226) at com.atlassian.bamboo.deletion.DeletionServiceHelper.doDeletePlans(DeletionServiceHelper.java:199) at com.atlassian.bamboo.deletion.DeletionServiceHelper.executeDelayedDeletionInternal(DeletionServiceHelper.java:177) at com.atlassian.bamboo.deletion.DeletionServiceHelper.executeDelayedDeletion(DeletionServiceHelper.java:121) at com.atlassian.bamboo.deletion.DeletionServiceImpl.executeDelayedDeletion(DeletionServiceImpl.java:411) ... Caused by: java.sql.BatchUpdateException: Batch entry 0 delete from BUILD where BUILD_ID=178389009 was aborted: ERROR: update or delete on table "build" violates foreign key constraint "fk_jdvuiqpnjmsvf3llia8jvvg03" on table "build" Detail: Key (build_id)=(178389009) is still referenced from table "build". Call getNextException to see other errors in the batch.
A number of other background threads (including attempting to load the dashboard) that operate on the plan cache fail with:
2019-02-11 14:32:34,646 INFO [http-nio-8085-exec-1] [AccessLogFilter] charlie GET http://localhost:8085/allPlans.action 1173723kb 2019-02-11 14:32:34,662 INFO [http-nio-8085-exec-1] [RetryingTaskExecutor] Task 'Retrieving BUG-BUG4' threw java.lang.NullPointerException: Master chain not found in plan cache, aborting. 2019-02-11 14:32:34,666 ERROR [http-nio-8085-exec-1] [ExceptionMappingInterceptor] java.lang.NullPointerException: Master chain not found in plan cache java.lang.RuntimeException: java.lang.NullPointerException: Master chain not found in plan cache at com.atlassian.bamboo.executor.RetryingTaskExecutor.rerun(RetryingTaskExecutor.java:129) at com.atlassian.bamboo.executor.RetryingTaskExecutor.runTask(RetryingTaskExecutor.java:99) at com.atlassian.bamboo.executor.RetryingTaskExecutor.retry(RetryingTaskExecutor.java:207) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:129) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl$1.load(ImmutablePlanCacheServiceImpl.java:103) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) at com.google.common.cache.LocalCache.get(LocalCache.java:3937) at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.get(AlwaysInvalidatingCacheDecorator.java:65) at com.atlassian.bamboo.utils.collections.AlwaysInvalidatingCacheDecorator.getUnchecked(AlwaysInvalidatingCacheDecorator.java:126) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.internalGetPlanByKey(ImmutablePlanCacheServiceImpl.java:487) at com.atlassian.bamboo.plan.cache.ImmutablePlanCacheServiceImpl.lambda$internalGetAllChainsStream$11(ImmutablePlanCacheServiceImpl.java:571) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3527) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at com.atlassian.bamboo.plan.cache.CachedPlanManagerImpl.sortedCopy(CachedPlanManagerImpl.java:293) at com.atlassian.bamboo.plan.cache.CachedPlanManagerImpl.getPlans(CachedPlanManagerImpl.java:104)
The NullPointerException seems to self correct at some point and a restart can also correct it but the deletion service will never recover.
Affected plans continue to spam this warning into the logs:
2019-02-11 16:35:05,240 WARN [scheduler_Worker-8] [ImmutablePlanCacheServiceImpl] Plan BUG-BUG2 scheduled for deletion but not hidden 2019-02-11 16:35:05,240 WARN [scheduler_Worker-8] [ImmutablePlanCacheServiceImpl] Plan BUG-BUG scheduled for deletion but not hidden
Workaround
Restart Bamboo so that users will be able to access Bamboo again without error but the data inconsistency will still exist in database.
There are ways to correct the data but it requires analysis.
Raise a ticket with Atlassian Support referencing this bug and we'll help with the analysis and correction.
Attachments
Issue Links
- causes
-
BAM-18736 Deletion service shouldn't stop when it encounters an error with one plan
- Closed
-
BAM-20293 Deletion service shouldn't stop when it encounters an error with one plan
- Closed
-
PS-39869 Loading...
- duplicates
-
BAM-14215 Branches not marked for deletion when master plan/job is - causes deletion to fail
- Closed
- mentioned in
-
Page Loading...