Issue Summary
It is possible for the number of concurrent pack-objects processes running on a Bitbucket application node (or Mesh node) to exceed the configured throttling limits. This can cause system resources to be consumed beyond the intended limits, and eventually the Bitbucket application may become unavailable, e.g. due to being killed by the OOM killer.
This is reproducible on Data Center: yes
Steps to Reproduce
The problem occurs in the following situation:
1) All hosting tickets are allocated (i.e. in use), for example by long-running clone operations
2) One or more requests are waiting for a hosting ticket
3) The requests waiting for a hosting ticket time out and fail
4) One or more requests that are still waiting are granted a hosting ticket, beyond the configured/derived maximum
The bug is that at step 3, when a request times out waiting for a hosting ticket and errors out, it releases a ticket it never actually owned, returning an extra permit to the semaphore. At step 4, that phantom ticket is then granted to another waiting request. The sketch below illustrates the faulty pattern.
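Bitbucket's internal throttling code is not shown in this issue, so the following is a minimal Java sketch under the assumption that hosting tickets are backed by a java.util.concurrent.Semaphore; the class and method names are hypothetical. It shows how an unconditional release in a finally block mints a phantom permit whenever tryAcquire times out, and how guarding the release fixes it:

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class HostingTicketSketch {
    // Two permits model throttle.resource.scm-hosting.fixed.limit=2
    private final Semaphore tickets = new Semaphore(2);

    // Faulty pattern: release() runs even when tryAcquire timed out,
    // so every timeout adds a permit the semaphore never handed out.
    void serveBuggy() throws InterruptedException {
        try {
            if (!tickets.tryAcquire(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Timed out waiting for a hosting ticket");
            }
            // ... run git pack-objects ...
        } finally {
            tickets.release(); // BUG: executes on the timeout path too
        }
    }

    // Fixed pattern: only enter the try/finally once a ticket is owned,
    // so the release can never outnumber the acquisitions.
    void serveFixed() throws InterruptedException {
        if (!tickets.tryAcquire(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("Timed out waiting for a hosting ticket");
        }
        try {
            // ... run git pack-objects ...
        } finally {
            tickets.release();
        }
    }
}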
The timeout mentioned at step 3 defaults to 300 seconds and can be tuned via the Bitbucket property throttle.resource.scm-hosting.timeout.
A simple way to reproduce this is to configure Bitbucket like so:
throttle.resource.scm-hosting.strategy=fixed
throttle.resource.scm-hosting.fixed.limit=2
throttle.resource.scm-hosting.timeout=10
Push a large repository that takes a few minutes to clone. Disable the SCM cache, then start three clones. The first two will proceed, while the third will fail after 10 seconds because it cannot acquire a hosting ticket. Quickly, while the first two clones are still running, start a fourth clone. You would expect it not to start (because two clones are already running), but it will start cloning. The simulation below replays this sequence in isolation.
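To see the mechanism outside Bitbucket, this standalone Java simulation (hypothetical, not Bitbucket code) mirrors the walkthrough above: two held tickets, a timed-out third request whose buggy error path releases anyway, and a fourth request that is then granted a ticket it should not get:

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class OverGrantDemo {
    public static void main(String[] args) throws InterruptedException {
        Semaphore tickets = new Semaphore(2); // fixed.limit=2

        tickets.acquire(); // clone #1 holds a ticket (long-running)
        tickets.acquire(); // clone #2 holds a ticket (long-running)

        // Clone #3 times out waiting for a ticket (timeout=10s in the repro)...
        boolean third = tickets.tryAcquire(1, TimeUnit.SECONDS);
        System.out.println("clone #3 acquired: " + third); // false

        // ...but the buggy error path releases anyway, minting a phantom permit.
        tickets.release();

        // Clone #4 now acquires immediately even though both real tickets are
        // still held, i.e. three pack-objects processes would run concurrently.
        boolean fourth = tickets.tryAcquire();
        System.out.println("clone #4 acquired: " + fourth); // true
    }
}

Running this prints false then true, matching the observed behaviour: the fourth clone starts even though the limit of two is already saturated.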
Expected Results
The number of hosting tickets granted should not exceed the maximum configured or derived (in the case of adaptive throttling).
Actual Results
The number of hosting tickets granted exceeds the configured or derived maximum, so more concurrent pack-objects processes run than the throttling limit allows.
Workaround
Currently there is no known workaround for this behavior. A workaround will be added here when one becomes available.