Description
Issue Summary
Under high load, the httpclient-callback thread pool and its associated queue can become full, resulting in the following warning being logged:
2023-07-31 04:57:52,301 WARN [I/O dispatcher 45] *14TWX97x1316x127x0 123.4.5.6 "POST /rest/api/latest/projects/MY_PROJECT/repos/myrepo/commits/123f7457440d4568c964fcb8f41d948d25799b38/builds HTTP/1.1" c.a.h.api.factory.HttpClientOptions Exceeded the limit of requests waiting for execution. Increase the value of the system property com.atlassian.httpclient.options.threadWorkQueueLimit to prevent these situations in the future. Current value of com.atlassian.httpclient.options.threadWorkQueueLimit = 256.
When this happens, the Futures/Promises for the affected HTTP Client requests never complete: neither success nor failure is propagated via the Future. As a result, any call to get() (or similar) without a timeout can block indefinitely. This problem was detected on an internal Atlassian system, where one of the threads responsible for integration with Jira Cloud became blocked like so:
"jira-dev-info:thread-76" #1157 daemon prio=5 os_prio=0 cpu=13820.72ms elapsed=202613.30s tid=0x00007fd3e40263e0 nid=0x3df2 waiting on condition [0x00007fd24b8e6000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@17.0.7/Native Method)
    - parking to wait for <0x000000078348bfa0> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.7/LockSupport.java:211)
    at java.util.concurrent.CompletableFuture$Signaller.block(java.base@17.0.7/CompletableFuture.java:1864)
    at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@17.0.7/ForkJoinPool.java:3463)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@17.0.7/ForkJoinPool.java:3434)
    at java.util.concurrent.CompletableFuture.waitingGet(java.base@17.0.7/CompletableFuture.java:1898)
    at java.util.concurrent.CompletableFuture.get(java.base@17.0.7/CompletableFuture.java:2072)
    at io.atlassian.util.concurrent.Promises$OfStage.claim(Promises.java:280)
    at com.atlassian.stash.internal.jira.cloud.client.AccessTokenClient.fetchAccessToken(AccessTokenClient.java:72)
    at com.atlassian.stash.internal.jira.cloud.client.AccessTokenClient.getAccessToken(AccessTokenClient.java:58)
    at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient.lambda$getExecutor$1(DefaultJiraCloudHttpClient.java:93)
    at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient$$Lambda$3554/0x0000000802444cc8.get(Unknown Source)
    at com.atlassian.stash.internal.jira.http.client.auth.AccessTokenBasedHandler.apply(AccessTokenBasedHandler.java:19)
    at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.lambda$makeGetRequest$4(DefaultHttpRequestExecutor.java:66)
    at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor$$Lambda$3555/0x0000000802445830.get(Unknown Source)
    at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.genericExceptionWrapper(DefaultHttpRequestExecutor.java:107)
    at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.makeGetRequest(DefaultHttpRequestExecutor.java:63)
    at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient.makeGetRequest(DefaultJiraCloudHttpClient.java:56)
This thread was responsible for populating a cache entry. Every other attempt to read that cache entry blocked waiting for this thread, so all sending of data to Jira Cloud stalled, with the flow-on effect of also blocking commit indexing.
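The hazard described above (an untimed get() on a future that never completes) can be illustrated with plain java.util.concurrent types. The following is a minimal sketch, not the Bitbucket or Atlassian HTTP Client code: the class `BoundedWait`, the helper `getOrNull`, and the 200 ms timeout are hypothetical names and values chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {

    /**
     * Waits for the future for at most the given time. Returns null on
     * timeout so the caller can fail fast instead of parking forever.
     * (Hypothetical helper, for illustration only.)
     */
    static <T> T getOrNull(CompletableFuture<T> future, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Cancel so that any other thread waiting on this future is released too.
            future.cancel(true);
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a request whose completion was never propagated.
        CompletableFuture<String> lost = new CompletableFuture<>();

        // lost.get() with no timeout would park this thread indefinitely.
        String token = getOrNull(lost, 200, TimeUnit.MILLISECONDS);
        System.out.println(token == null ? "timed out" : token);
    }
}
```

With a bounded wait, a thread that populates a shared cache entry would surface a failure after the timeout rather than holding up every other reader of that entry.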
Workaround
- If the problem has occurred, restart the Bitbucket Server process.
- To prevent future occurrences, set the system property `com.atlassian.httpclient.options.threadWorkQueueLimit` to a higher value to increase the queue size.
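JVM system properties are normally supplied with a `-D` flag on the Java command line, for example `-Dcom.atlassian.httpclient.options.threadWorkQueueLimit=1024` (1024 is an illustrative value only; where the flag is configured depends on the installation). The sketch below shows one way to check which value a JVM actually sees. The class `QueueLimitCheck` is hypothetical, and using 256 as the fallback is an assumption based on the value reported in the warning above.

```java
public class QueueLimitCheck {

    static final String PROPERTY = "com.atlassian.httpclient.options.threadWorkQueueLimit";

    // 256 is the value shown in the logged warning; treating it as the
    // fallback when the property is unset is an assumption.
    static final int ASSUMED_DEFAULT = 256;

    /** Returns the property's integer value, or the assumed default if it is unset or not a number. */
    static int effectiveLimit() {
        return Integer.getInteger(PROPERTY, ASSUMED_DEFAULT);
    }

    public static void main(String[] args) {
        // Run with: java -Dcom.atlassian.httpclient.options.threadWorkQueueLimit=1024 QueueLimitCheck
        System.out.println(PROPERTY + " = " + effectiveLimit());
    }
}
```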
Issue Links
- resolves BBSDEV-28926