Type: Bug
Resolution: Fixed
Priority: Medium
Component/s: Repository - Git Client SSH
Severity 3 - Minor
Issue Summary
When Bitbucket Cloud's SSH service is scaling down, some SSH connections are dropped, resulting in failed Git operations.
The issue affects a small number of requests each time. Customers with frequent SSH traffic, such as those using SSH from a build system like Jenkins, are most likely to be affected.
Steps to Reproduce
The problem is intermittent and can be difficult to reproduce deliberately.
- Attempt a Git clone/fetch/push over SSH with verbose SSH output
$ GIT_SSH_COMMAND="ssh -vvv" git fetch
Expected Results
The Git operation succeeds.
Actual Results
The following error appears in the Git output:
debug1: Offering public key: <KEY DETAILS>
Connection closed by 18.205.93.1 port 22
fatal: Could not read from remote repository.
Note that the remote disconnect may occur at other points during the SSH session, although it most often occurs during the SSH handshake, as in the example above.
Workaround
Retry the SSH operation. There is currently no other workaround. We apologize for the inconvenience this causes. Multiple engineering teams are working together to resolve the issue.
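For automated callers such as build jobs, the retry can be scripted. The following is a minimal sketch, not an official Atlassian tool: the helper name retry and its two arguments (maximum attempts, delay in seconds) are hypothetical and chosen here for illustration. It retries on any non-zero exit status, so it will also repeat commands that fail for unrelated reasons.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# The function body is a subshell "( ... )" so its variables do not leak
# into the calling shell.
retry() (
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Success: stop immediately with status 0.
        "$@" && exit 0
        status=$?
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry: giving up after $attempt attempts" >&2
            # Propagate the exit status of the last failed attempt.
            exit "$status"
        fi
        echo "retry: attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
)
```

Once the function is defined in the current shell, a fetch could be wrapped like this (three attempts, five seconds apart, both arbitrary values):
$ retry 3 5 git fetch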
Technical detail on the cause and solution
The bug is caused by our edge load balancer continuing to send new connections to an SSH server node that is already in the process of shutting down. This can happen while services are scaling down following our daily peak in traffic, during deployments, and in other routine situations where nodes are being shut down.
The correct behavior is for the load balancer to stop sending new connections to a node immediately prior to node shutdown, and for the node to continue servicing its existing connections until they complete.
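The intended sequence can be illustrated with a toy model of a single node. This is only a sketch of the ordering described above, not the actual load balancer or SSH server code; every variable and function name in it is hypothetical.

```shell
#!/bin/sh
# Toy model of one SSH node; all names here are hypothetical.
NODE_STATE=active      # active -> draining -> stopped
OPEN_CONNECTIONS=0

# The load balancer should route a new connection only to an active node.
accept_connection() {
    [ "$NODE_STATE" = active ] || return 1
    OPEN_CONNECTIONS=$((OPEN_CONNECTIONS + 1))
}

# Shutdown step 1: report "draining" so that no new connections arrive.
begin_shutdown() {
    NODE_STATE=draining
    stop_if_idle
}

# An existing connection completes normally, even while the node is draining.
finish_connection() {
    [ "$OPEN_CONNECTIONS" -gt 0 ] || return 1
    OPEN_CONNECTIONS=$((OPEN_CONNECTIONS - 1))
    stop_if_idle
}

# Shutdown step 2: stop only once the last existing connection has finished.
stop_if_idle() {
    if [ "$NODE_STATE" = draining ] && [ "$OPEN_CONNECTIONS" -eq 0 ]; then
        NODE_STATE=stopped
    fi
    return 0
}
```

In this model the bug corresponds to accept_connection succeeding while the node is no longer active: the new connection is then cut off when the node stops.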
Two teams are working together to fix the shutdown sequence of SSH nodes: the Atlassian Platform team, which is responsible for the node status reporting that drives the load balancer behavior, and the Bitbucket Cloud team, which owns the SSH server software.
This requires careful coordination and testing before we can deploy the fix for you, our customers. We appreciate your patience and understanding as we work. We know this bug is frustrating.
- relates to BFD-1887