Bamboo Data Center / BAM-18736

Deletion service shouldn't stop when it encounters an error with one plan

      Summary

      As soon as the deletion service encounters an error removing one plan, the entire service stops. The next time the scheduled deletion runs, it will often hit the exact same error on the same plan. If the problem goes undetected, a large backlog of undeleted plans can build up, and none of them will be removed until the root cause is resolved.

      Steps to Reproduce

      Simulating the problem with manual database updates:

      1. Create a plan with a plan branch (PLAN1)
      2. Mark the top-level plan for deletion by setting MARKED_FOR_DELETION = TRUE in the BUILD table. This is done manually in the database because the root cause that produces this state in practice has not yet been identified.
      3. Don't mark the plan branches for deletion.
      4. Delete two other plans manually through the UI (PLAN2, PLAN3)

      Expected Results

      The deletion service reports the problem with the background deletion of PLAN1 (possibly through a health check notification) but continues to remove PLAN2 and PLAN3.
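
      The expected behaviour can be illustrated with a minimal sketch that isolates each deletion in its own try/catch. This is not Bamboo's DeletionServiceImpl code; the class name, the deleteAll method and the Consumer-based deleter are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: not the actual Bamboo deletion service.
public class IsolatedDeletion {

    /**
     * Tries to delete every plan key independently. A failure on one key is
     * recorded and the loop moves on, so later plans are still processed.
     * Returns the failures keyed by plan key, in encounter order.
     */
    public static Map<String, Exception> deleteAll(List<String> planKeys, Consumer<String> deleter) {
        Map<String, Exception> failures = new LinkedHashMap<>();
        for (String key : planKeys) {
            try {
                deleter.accept(key);
            } catch (RuntimeException e) {
                // Record the failure (a real service might log it or raise a notification).
                failures.put(key, e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> deleted = new ArrayList<>();
        // Stand-in deleter: PLAN1 fails the way a constraint violation would.
        Consumer<String> deleter = key -> {
            if (key.equals("PLAN1")) {
                throw new IllegalStateException("foreign key constraint violated");
            }
            deleted.add(key);
        };

        Map<String, Exception> failures = deleteAll(List.of("PLAN1", "PLAN2", "PLAN3"), deleter);
        System.out.println("deleted=" + deleted + " failed=" + failures.keySet());
        // prints: deleted=[PLAN2, PLAN3] failed=[PLAN1]
    }
}
```

      In a service backed by Hibernate, each deletion would presumably also need its own transaction and session, since a session should not be reused after it has thrown an exception.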

      Actual Results

      The deletion service fails to delete PLAN1 and stops. PLAN2 and PLAN3 are not deleted either, because the failure on PLAN1 aborts the whole run.

      One example of an exception in the atlassian-bamboo.log that has caused issues with the deletion service:

      2017-08-09 06:17:29,903 INFO [scheduler_Worker-9] [DeletionServiceImpl] Deleting 20 TopLevelPlan(s) and/or ChainBranch(es) marked for deletion
      2017-08-09 06:17:29,903 INFO [scheduler_Worker-9] [DeletionServiceImpl] Deleting PLAN1
      2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [SqlExceptionHelper] Batch entry 0 delete from BUILD where BUILD_ID=16779486 was aborted.  Call getNextException to see the cause.
      2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [SqlExceptionHelper] ERROR: update or delete on table "build" violates foreign key constraint "fk_jdvuiqpnjmsvf3llia8jvvg03" on table "build"
        Detail: Key (build_id)=(16779486) is still referenced from table "build".
      2017-08-09 06:17:29,960 ERROR [scheduler_Worker-9] [DeletionServiceImpl] Unable to complete delayed deletion: 
      org.springframework.dao.DataIntegrityViolationException: could not execute batch; SQL [delete from BUILD where BUILD_ID=?]; constraint [fk_jdvuiqpnjmsvf3llia8jvvg03]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute batch
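
      The log above shows a foreign key on BUILD that points back at BUILD: the row being deleted is still referenced by another row in the same table. This matches the reproduction steps, where the top-level plan is marked for deletion but its branch is not. The following in-memory model is only a sketch of that constraint, not Bamboo's schema; the class and the parentId field are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a self-referencing table.
public class SelfReferencingTable {

    // Row id -> id of the row it references (null for a top-level row).
    private final Map<Long, Long> parentOf = new HashMap<>();

    public void insert(long id, Long parentId) {
        parentOf.put(id, parentId);
    }

    /** Deletes a row, rejecting the delete while any other row still references it. */
    public void delete(long id) {
        for (Map.Entry<Long, Long> row : parentOf.entrySet()) {
            if (Long.valueOf(id).equals(row.getValue())) {
                throw new IllegalStateException(
                        "row " + id + " is still referenced by row " + row.getKey());
            }
        }
        parentOf.remove(id);
    }

    public boolean contains(long id) {
        return parentOf.containsKey(id);
    }

    public static void main(String[] args) {
        SelfReferencingTable build = new SelfReferencingTable();
        build.insert(1L, null); // top-level plan
        build.insert(2L, 1L);   // plan branch referencing the top-level plan

        try {
            build.delete(1L);   // rejected: the branch still references it
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        build.delete(2L);       // removing the branch first...
        build.delete(1L);       // ...lets the top-level plan be deleted
        System.out.println("plan still present: " + build.contains(1L));
    }
}
```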
      

      Workaround

      The workaround is to fix the root cause of the failure on the first plan that blocks the deletion service. Raise a ticket with Atlassian Support and we will help identify the cause and provide a solution.

              achystoprudov Alexey Chystoprudov
              jowen@atlassian.com Jeremy Owen