Details
- Bug
- Resolution: Fixed
- Low
- 5.15.3, 6.8.0
- 1
- Severity 2 - Major
Description
Summary
As soon as the deletion service encounters an error removing one plan, the entire service stops. The next time the scheduled deletion runs, it will often hit the exact same error. If the problem goes undetected, this can leave a large backlog of undeleted plans that will not be removed until the root cause is resolved.
Steps to Reproduce
Use PostgreSQL for the reproduction, because it makes it easy to lock the table as the steps below require.
- Create a plan
- Point it to a Bitbucket Server repository. The issue is easier to reproduce with event-based branch detection.
- Enable automatic branch detection on the plan
- Lock the BUILD table:
@set autocommit off;
LOCK TABLE BUILD IN ACCESS EXCLUSIVE MODE;
SELECT count(BUILD_ID) FROM BUILD;
- Delete the plan:
curl -k -u user:password \
  -H 'X-Atlassian-Token: no-check' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'buildKey=PLAN-KEY' \
  -X POST 'http://localhost:8085/ajax/deleteChain.action'
- Wait 5 seconds. Now push a new branch to the Bitbucket repository.
- Wait 5 seconds. Unlock the BUILD table by committing the open transaction:
commit transaction;
- Delete another plan in the instance that has no relation to the above repository. It too will not be deleted once the Deletion Service kicks in.
Expected Results
The deletion service identifies a problem performing the background deletion of PLAN1 (and perhaps raises a healthcheck notification?), but continues on to remove PLAN2 and PLAN3.
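The expected behaviour can be illustrated with a minimal Java sketch of per-plan error isolation. This is not Bamboo's actual DeletionServiceImpl code: the class name `ResilientDeletionSketch`, the `deleteAll` method, and the `deleter` callback are all hypothetical, and the constraint violation is simulated with a plain exception.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from Bamboo itself.
class ResilientDeletionSketch {

    /**
     * Attempts to delete every plan key in order. A failure on one plan is
     * recorded and does not stop the loop. Returns plan key -> exception
     * for the plans that could not be deleted.
     */
    static Map<String, Exception> deleteAll(List<String> planKeys, Consumer<String> deleter) {
        Map<String, Exception> failures = new LinkedHashMap<>();
        for (String key : planKeys) {
            try {
                deleter.accept(key);
            } catch (RuntimeException e) {
                // Record the failure (a real service might raise a notification here)
                // and carry on with the remaining plans.
                failures.put(key, e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> deleted = new ArrayList<>();
        Map<String, Exception> failures = deleteAll(
            List.of("PLAN1", "PLAN2", "PLAN3"),
            key -> {
                if (key.equals("PLAN1")) {
                    // Stand-in for the foreign key violation described in this report.
                    throw new IllegalStateException("simulated constraint violation");
                }
                deleted.add(key);
            });
        System.out.println("Deleted: " + deleted);
        System.out.println("Failed: " + failures.keySet());
    }
}
```

In this sketch PLAN1 ends up in the failure map while PLAN2 and PLAN3 are still processed, which is the outcome this report asks for. How failures should be surfaced to administrators is left open here.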
Actual Results
Deletion service will fail to delete PLAN1 and will stop. PLAN2 and PLAN3 will also not be deleted due to the issues with PLAN1.
One example of an exception in the atlassian-bamboo.log that has caused issues with the deletion service:
2017-08-09 06:17:29,903 INFO [scheduler_Worker-9] [DeletionServiceImpl] Deleting 20 TopLevelPlan(s) and/or ChainBranch(es) marked for deletion
2017-08-09 06:17:29,903 INFO [scheduler_Worker-9] [DeletionServiceImpl] Deleting PLAN1
2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [SqlExceptionHelper] Batch entry 0 delete from BUILD where BUILD_ID=16779486 was aborted. Call getNextException to see the cause.
2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [SqlExceptionHelper] ERROR: update or delete on table "build" violates foreign key constraint "fk_jdvuiqpnjmsvf3llia8jvvg03" on table "build"
  Detail: Key (build_id)=(16779486) is still referenced from table "build".
2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [DeletionServiceImpl] Unable to complete delayed deletion: org.springframework.dao.DataIntegrityViolationException: could not execute batch; SQL [delete from BUILD where BUILD_ID=?]; constraint [fk_jdvuiqpnjmsvf3llia8jvvg03]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute batch
Affected plans continue to spam this warning into the logs:
2019-02-11 16:35:05,240 WARN [scheduler_Worker-8] [ImmutablePlanCacheServiceImpl] Plan BUG-BUG2 scheduled for deletion but not hidden
2019-02-11 16:35:05,240 WARN [scheduler_Worker-8] [ImmutablePlanCacheServiceImpl] Plan BUG-BUG scheduled for deletion but not hidden
Workaround
There are ways to correct the data, but they require analysis of the affected instance.
Raise a ticket with Atlassian Support referencing this bug and we'll help with the analysis and correction.