  1. Bitbucket Data Center
  2. BSERV-14287

AccessTokenClient.fetchAccessToken() can block indefinitely

Details

    Description

      Issue Summary

      Under high load, the httpclient-callback thread pool and its associated queue can become full, resulting in the following warning being logged:

      2023-07-31 04:57:52,301 WARN  [I/O dispatcher 45] *14TWX97x1316x127x0 123.4.5.6 "POST /rest/api/latest/projects/MY_PROJECT/repos/myrepo/commits/123f7457440d4568c964fcb8f41d948d25799b38/builds HTTP/1.1" c.a.h.api.factory.HttpClientOptions Exceeded the limit of requests waiting for execution.  Increase the value of the system property com.atlassian.httpclient.options.threadWorkQueueLimit to prevent these situations in the future. Current value of com.atlassian.httpclient.options.threadWorkQueueLimit = 256.
      

      When this happens, the Futures/Promises for the affected HTTP client requests never complete: neither success nor failure is propagated via the Future. As a result, any call to get() (or similar) without a timeout can block indefinitely. This problem was detected on an internal Atlassian system, where one of the threads responsible for integration with Jira Cloud became blocked like so:

      "jira-dev-info:thread-76" #1157 daemon prio=5 os_prio=0 cpu=13820.72ms elapsed=202613.30s tid=0x00007fd3e40263e0 nid=0x3df2 waiting on condition  [0x00007fd24b8e6000]
         java.lang.Thread.State: WAITING (parking)
              at jdk.internal.misc.Unsafe.park(java.base@17.0.7/Native Method)
              - parking to wait for  <0x000000078348bfa0> (a java.util.concurrent.CompletableFuture$Signaller)
              at java.util.concurrent.locks.LockSupport.park(java.base@17.0.7/LockSupport.java:211)
              at java.util.concurrent.CompletableFuture$Signaller.block(java.base@17.0.7/CompletableFuture.java:1864)
              at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@17.0.7/ForkJoinPool.java:3463)
              at java.util.concurrent.ForkJoinPool.managedBlock(java.base@17.0.7/ForkJoinPool.java:3434)
              at java.util.concurrent.CompletableFuture.waitingGet(java.base@17.0.7/CompletableFuture.java:1898)
              at java.util.concurrent.CompletableFuture.get(java.base@17.0.7/CompletableFuture.java:2072)
              at io.atlassian.util.concurrent.Promises$OfStage.claim(Promises.java:280)
              at com.atlassian.stash.internal.jira.cloud.client.AccessTokenClient.fetchAccessToken(AccessTokenClient.java:72)
              at com.atlassian.stash.internal.jira.cloud.client.AccessTokenClient.getAccessToken(AccessTokenClient.java:58)
              at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient.lambda$getExecutor$1(DefaultJiraCloudHttpClient.java:93)
              at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient$$Lambda$3554/0x0000000802444cc8.get(Unknown Source)
              at com.atlassian.stash.internal.jira.http.client.auth.AccessTokenBasedHandler.apply(AccessTokenBasedHandler.java:19)
              at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.lambda$makeGetRequest$4(DefaultHttpRequestExecutor.java:66)
              at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor$$Lambda$3555/0x0000000802445830.get(Unknown Source)
              at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.genericExceptionWrapper(DefaultHttpRequestExecutor.java:107)
              at com.atlassian.stash.internal.jira.http.json.DefaultHttpRequestExecutor.makeGetRequest(DefaultHttpRequestExecutor.java:63)
              at com.atlassian.stash.internal.jira.cloud.client.DefaultJiraCloudHttpClient.makeGetRequest(DefaultJiraCloudHttpClient.java:56)
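
      The failure mode in this trace can be reproduced in isolation. The sketch below is not the Bitbucket or Atlassian HTTP client code; `BoundedWaitSketch`, `saturatedPool`, `submitCallback` and `claimWithTimeout` are hypothetical names, and the swallowed rejection is an assumption about how a future can end up orphaned when the callback queue is full. It shows that a wait with a timeout returns control to the caller where an untimed `get()` would park forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWaitSketch {

    /** Builds a 1-thread pool with a 1-slot queue, then fills both so the next task is rejected. */
    static ThreadPoolExecutor saturatedPool(CountDownLatch release) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        pool.execute(() -> {
            try {
                release.await(); // occupy the only worker until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pool.execute(() -> { }); // occupy the only queue slot
        return pool;
    }

    /** Hands completion of the future to the pool; the future is orphaned if the pool rejects the task. */
    static CompletableFuture<String> submitCallback(ThreadPoolExecutor pool) {
        CompletableFuture<String> result = new CompletableFuture<>();
        try {
            pool.execute(() -> result.complete("token"));
        } catch (RejectedExecutionException e) {
            // Assumed failure mode: the rejection is only logged, so the future is never completed.
            System.err.println("callback rejected: queue full");
        }
        return result;
    }

    /** Waits for the result, but only for the given time. */
    static String claimWithTimeout(CompletableFuture<String> future, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        return future.get(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = saturatedPool(release);
        CompletableFuture<String> orphan = submitCallback(pool);
        try {
            claimWithTimeout(orphan, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("gave up waiting instead of parking forever");
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

      Replacing `claimWithTimeout(...)` with a plain `orphan.get()` in `main` leaves the calling thread in the same WAITING (parking) state shown in the thread dump.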
      

      This thread was responsible for populating a cache entry. Every other attempt to get that same cache entry blocked waiting for this thread, so all sending of data to Jira Cloud stalled, which had the flow-on effect of blocking commit indexing as well.
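
      The issue does not say which cache implementation is involved, so the following sketch uses `ConcurrentHashMap.computeIfAbsent` as a stand-in to illustrate the pattern. `CacheLoaderSketch` and `getToken` are hypothetical names. While a loader runs for a key, other callers asking for the same key wait for it; bounding the wait inside the loader turns a stuck load into an exception, leaves no entry behind, and lets a later call retry.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class CacheLoaderSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    /** Loads the value for a key at most once at a time, never waiting longer than timeoutMillis. */
    String getToken(String key, Supplier<CompletableFuture<String>> fetch, long timeoutMillis) {
        return cache.computeIfAbsent(key, k -> {
            try {
                // A bounded wait: an orphaned future fails this load instead of blocking it forever.
                return fetch.get().get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while fetching " + k, e);
            } catch (ExecutionException | TimeoutException e) {
                // Throwing here leaves the key unmapped, so the next caller can try again.
                throw new IllegalStateException("could not fetch " + k, e);
            }
        });
    }

    public static void main(String[] args) {
        CacheLoaderSketch sketch = new CacheLoaderSketch();
        try {
            // A future that nobody completes stands in for the orphaned HTTP client promise.
            sketch.getToken("jira-cloud", () -> new CompletableFuture<String>(), 100);
        } catch (IllegalStateException e) {
            System.out.println("load failed fast: " + e.getMessage());
        }
        // A later call with a healthy fetch succeeds because no entry was stored by the failed load.
        System.out.println(sketch.getToken("jira-cloud",
                () -> CompletableFuture.completedFuture("abc"), 100));
    }
}
```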

      Workaround

      • If the problem has already occurred, restart the Bitbucket server process.
      • To prevent future occurrences, increase the queue size by setting the system property `com.atlassian.httpclient.options.threadWorkQueueLimit` to a value higher than the current one (256 in the warning above).

            People

              behumphreys Ben Humphreys