Suggestion
Resolution: Unresolved
Problem definition
Currently, there is no way to continue a build from where it stopped if something happens to the agent.
This is especially visible for users running agents on AWS Spot instances, which can shut down in the middle of a build, leaving no way to pause or recover from where it stopped. However, it can happen to any agent, and mostly happens with remote agents.
This should align with our current roadmap item, "increase build resilience".
Suggested resolution
- Re-add the job to the queue so it can be dispatched to a different agent if the current agent goes offline.
- An option (or a parameter) to pause the build when it encounters a problem with the agent.
- An option (or a parameter) to restart the build from the point where it stopped when the agent became inaccessible.
- AWS sends a two-minute warning notification before shutting down a Spot instance. Use this warning to pause the build and restart it when the agent comes back online (useful for dedicated agents on Spot instances).
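
To illustrate the last point, here is a minimal sketch of how an agent-side process might detect the Spot interruption notice. It relies on the EC2 instance metadata path spot/instance-action, which returns 404 until an interruption is scheduled and then a small JSON document containing an action and a time. The names parse_interruption, check_once, fetch and on_notice are hypothetical and not part of any existing product API. The HTTP call is injected as fetch so that the caller can handle metadata tokens and timeouts, and on_notice stands in for whatever pause or re-queue hook the server would expose.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional

# EC2 instance metadata path for Spot interruption notices.
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def parse_interruption(body: Optional[str]) -> Optional[datetime]:
    """Return the scheduled interruption time, or None if no notice is pending.

    `body` is the response body from SPOT_ACTION_URL, or None when the
    endpoint answered 404 (no interruption scheduled).
    """
    if not body:
        return None
    notice = json.loads(body)
    # Payload shape: {"action": "terminate", "time": "2017-09-18T08:22:00Z"}
    parsed = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def check_once(fetch: Callable[[str], Optional[str]],
               on_notice: Callable[[datetime], None]) -> bool:
    """Poll the endpoint once and invoke on_notice if an interruption is scheduled.

    `fetch` (assumed) performs the HTTP GET and returns None on 404.
    `on_notice` (assumed) is the hook that would pause or re-queue the build.
    """
    when = parse_interruption(fetch(SPOT_ACTION_URL))
    if when is None:
        return False
    on_notice(when)
    return True
```

An agent could call check_once every few seconds during a build. Since the warning arrives only about two minutes before shutdown, the hook would need to checkpoint or re-queue the job quickly.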