Description
Issue Summary
Opening stop build in a new tab for a queued build will cause an XSRF failure and inconsistent build state, the build will not be shown as complete until the OrphanedBuildMonitor cleans it after 720 seconds. If this is performed on a re-run build, there will be a NullPointerException too and OrphanedBuildMonitor cannot clean it and Bamboo needs to be restarted or the build is infinitely in a queued state.
Steps to Reproduce
- Run a plan and allow it to complete
- Disable all agents so build waits in queue
- Re-run the same build
- In Chrome > Right Click Stop Build > Open in new tab
- XSRF: A mutative operation was attempted error because opening stop build in a new tab as the browser performs a GET instead of POST.
- Now try to stop the build normally (without opening it a in a new tab)
- NullPointerException
Expected Results
Shouldn't be able to get into this state. Perhaps stopPlan.action should be completely protected (in-memory caches and stopping the build) from unsupported HTTP methods like GET since it performs mutative actions.
Actual Results
Upon opening Stop Build in a new tab (even for non re-runs) a GET request cannot mutate the build state from QUEUED -> NotBuilt:
Request information: Request URL: http://localhost:8085/build/admin/stopPlan.action Scheme: http Server: localhost Port: 8085 URI: /build/admin/stopPlan.action Servlet path: /build/admin/stopPlan.action Path info: Query string: planResultKey=ART-STP-6&returnUrl=%2Fbrowse%2FART-STP-6 Stack Trace: java.lang.IllegalStateException: XSRF: A mutative operation was attempted on BuildResultsSummaryImpl(id=230916098) within a non-mutative HTTP request: http://localhost:8085/build/admin/stopPlan.action?planResultKey=ART-STP-6&returnUrl=%2Fbrowse%2FART-STP-6 : [[BUILD, 2019-11-12 17:12:24.943, ART-STP-JOB1, Default Job, 6, Unknown, Queued, null, null, null, 0, 0, 0, com.atlassian.bamboo.plugin.system.triggerReason:ManualBuildTriggerReason, [], NONE, 0, 230948865, 2, 2019-11-12 17:12:53.318, false, false, false, false, -1, [], {buildTimeStamp=2019-11-12T17:12:40.040+11:00, CUCUMBER_JSON_REPORT=null, RerunBuildTriggerReason.noOfRetries=2, ManualBuildTriggerReason.userName=bamboo, dependenciesDisabled=false, bamboo.system.getTestsToPass=, buildKey=ART-STP-JOB1, buildNumber=6, ManualBuildTriggerReason.stageName=Default Stage, plan.storageTag=plan-230522883, TEST_JIRA_REPORT=null}, [], [], VariableContextBaselineImpl{hash=e8508a53b88c2de74f603bcdaad2db3c221386ca}, [com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7], [], [], [], [], 61008, TestResultsSummaryImpl{successfulTestCaseCount=0, failedTestCaseCount=0, totalTestCaseCount=0, brokenTestCaseCount=0, existingFailedTestCount=0, fixedTestCaseCount=0, quarantinedTestCaseCount=0, skippedTestCaseCount=0, totalTestDuration=0}, null, com.atlassian.bamboo.chains.ChainResultsSummaryImpl@277c9bc0[planKey=ART-STP,buildNumber=6], [], []]]-> ->[[BUILD, 2019-11-12 17:12:24.943, ART-STP-JOB1, Default Job, 6, Unknown, Queued, null, null, null, 0, 0, 0, com.atlassian.bamboo.plugin.system.triggerReason:ManualBuildTriggerReason, [], NONE, 0, 230948865, 2, 2019-11-12 17:12:53.318, false, false, false, false, null, [], {buildTimeStamp=2019-11-12T17:12:40.040+11:00, CUCUMBER_JSON_REPORT=null, RerunBuildTriggerReason.noOfRetries=2, ManualBuildTriggerReason.userName=bamboo, dependenciesDisabled=false, bamboo.system.getTestsToPass=, buildKey=ART-STP-JOB1, buildNumber=6, ManualBuildTriggerReason.stageName=Default Stage, plan.storageTag=plan-230522883, TEST_JIRA_REPORT=null}, [], [], VariableContextBaselineImpl{hash=e8508a53b88c2de74f603bcdaad2db3c221386ca}, [com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7, com.atlassian.bamboo.resultsummary.CustomDataRow@405b71a7], [], [], [], [], 61008, TestResultsSummaryImpl{successfulTestCaseCount=0, failedTestCaseCount=0, totalTestCaseCount=0, brokenTestCaseCount=0, existingFailedTestCount=0, fixedTestCaseCount=0, quarantinedTestCaseCount=0, skippedTestCaseCount=0, totalTestDuration=0}, null, com.atlassian.bamboo.chains.ChainResultsSummaryImpl@277c9bc0[planKey=ART-STP,buildNumber=6], [], []]] at com.atlassian.bamboo.utils.XsrfUtils.fail(XsrfUtils.java:27) at com.atlassian.bamboo.hibernate.ReadOnlyGetMethodEnforcer.fail(ReadOnlyGetMethodEnforcer.java:125) at com.atlassian.bamboo.hibernate.ReadOnlyGetMethodEnforcer.onFlushDirty(ReadOnlyGetMethodEnforcer.java:76) ... at com.atlassian.bamboo.v2.build.agent.LocalBuildResultProcessor.processBuildResult(LocalBuildResultProcessor.java:138) at com.atlassian.bamboo.v2.build.agent.LocalResultProcessorImpl.processResult(LocalResultProcessorImpl.java:30) at com.atlassian.bamboo.v2.build.queue.BuildQueueManagerImpl.removeBuildFromQueue(BuildQueueManagerImpl.java:227) at com.atlassian.bamboo.build.DefaultStopBuildManager.cancelBuild(DefaultStopBuildManager.java:113) at com.atlassian.bamboo.chains.JobExecutionManagerImpl.requestStop(JobExecutionManagerImpl.java:43) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl$4.run(ChainExecutionManagerImpl.java:743) at io.atlassian.util.concurrent.ManagedLocks$ManagedLockImpl.withLock(ManagedLocks.java:302) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl.stop(ChainExecutionManagerImpl.java:736) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl.stop(ChainExecutionManagerImpl.java:722) ...
Upon trying to stop this build normally with a POST and when OrphanedBuildMonitor attempts to clean it due to the state mismatch
The below exception is thrown in the atlassian-bamboo.log file and UI:
java.lang.NullPointerException at com.atlassian.bamboo.build.logger.LazyBuildLogFileWriterImpl.write(LazyBuildLogFileWriterImpl.java:41) at com.atlassian.bamboo.build.logger.BuildLoggerImpl.onAddLogEntry(BuildLoggerImpl.java:44) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addLogEntry(AbstractBuildLogger.java:128) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addBuildLogEntry(AbstractBuildLogger.java:86) at com.atlassian.bamboo.build.DefaultStopBuildManager.cancelBuild(DefaultStopBuildManager.java:108) at com.atlassian.bamboo.chains.JobExecutionManagerImpl.requestStop(JobExecutionManagerImpl.java:43) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl$4.run(ChainExecutionManagerImpl.java:743) at io.atlassian.util.concurrent.ManagedLocks$ManagedLockImpl.withLock(ManagedLocks.java:302) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl.stop(ChainExecutionManagerImpl.java:736) at com.atlassian.bamboo.chains.ChainExecutionManagerImpl.stop(ChainExecutionManagerImpl.java:722) ..
2019-11-11 15:09:28,590 INFO [scheduler_Worker-1] [DefaultErrorHandler] Recording an error: Build ART-STP-JOB1-5 had to be cancelled: it was marked as queued but was not present in the queue for (at least) the past 720 seconds. : ART-STP-JOB1 2019-11-11 15:09:28,593 ERROR [scheduler_Worker-1] [OrphanedBuildMonitorJob] Unable to log build cancellation: java.lang.NullPointerException at com.atlassian.bamboo.build.logger.LazyBuildLogFileWriterImpl.write(LazyBuildLogFileWriterImpl.java:41) at com.atlassian.bamboo.build.logger.BuildLoggerImpl.onAddLogEntry(BuildLoggerImpl.java:44) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addLogEntry(AbstractBuildLogger.java:128) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addErrorLogEntryInternal(AbstractBuildLogger.java:98) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addErrorLogEntry(AbstractBuildLogger.java:93) at com.atlassian.bamboo.build.logger.AbstractBuildLogger.addErrorLogEntry(AbstractBuildLogger.java:143) at com.atlassian.bamboo.build.monitoring.OrphanedBuildMonitorJob.logCancellation(OrphanedBuildMonitorJob.java:257) at com.atlassian.bamboo.build.monitoring.OrphanedBuildMonitorJob.cleanUpBuildsAndDeploymentsWithInconsistentQueuedStatus(OrphanedBuildMonitorJob.java:242) at com.atlassian.bamboo.build.monitoring.OrphanedBuildMonitorJob.execute(OrphanedBuildMonitorJob.java:97) at org.quartz.core.JobRunShell.run(JobRunShell.java:202) at com.atlassian.bamboo.utils.BambooRunnables$1.run(BambooRunnables.java:48) at com.atlassian.bamboo.security.ImpersonationHelper.runWith(ImpersonationHelper.java:26) at com.atlassian.bamboo.security.ImpersonationHelper.runWithSystemAuthority(ImpersonationHelper.java:17) at com.atlassian.bamboo.security.ImpersonationHelper$1.run(ImpersonationHelper.java:41) at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Workaround
For a build that was re-run, once the XSRF error is encountered – Bamboo will need to be restarted to clear up the build result, otherwise it will appear as queued indefinitely because the orphaned build monitor encounters a NullPointerException for re-runs affected by this.
If the build was a first time build and the XSRF error is encountered, after 720 seconds the Bamboo orphaned build monitor will successfully clear up the build result.