- Type: Bug
- Resolution: Unresolved
- Priority: High
- Affects Version/s: 6.3.2
- Component/s: Authentication / Security, Directory - Azure Active Directory
- Severity 2 - Major
Issue Summary
This is reproducible on Data Center: (yes)
Crowd connects to Microsoft Entra ID (formerly Azure AD) via MsalAuthenticator.getApiToken(), which does NOT have a timeout parameter.
When an authentication request is sent to Azure AD and no response is received (e.g., due to network-related issues), the thread handling the request stays in the RUNNABLE state indefinitely.
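The failure mode can be illustrated outside Crowd with a minimal sketch. It assumes only what the stack traces in this ticket show, namely that the token call ends up in a plain java.net.HttpURLConnection read. The class name SilentEndpointDemo, the method readTimesOut, and the local stand-in server are hypothetical and are not part of Crowd or msal4j. With a read timeout set, the blocked read ends with a SocketTimeoutException; with the default of 0 (no timeout), the same call would block for as long as the peer stays silent.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;

public class SilentEndpointDemo {

    /**
     * Sends a GET to a local endpoint that never answers.
     * Returns true if the configured read timeout ended the wait.
     */
    static boolean readTimesOut(int readTimeoutMillis) {
        // Stand-in for an unreachable identity endpoint: the socket listens,
        // so the TCP handshake completes, but nothing ever writes a response.
        try (ServerSocket silent = new ServerSocket(0)) {
            URI target = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/");
            HttpURLConnection conn =
                    (HttpURLConnection) target.toURL().openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(1000);           // bound the TCP connect
            conn.setReadTimeout(readTimeoutMillis); // bound the wait for a response
            try {
                conn.getResponseCode();             // blocks in a socket read
                return false;                       // not expected: peer never replies
            } catch (SocketTimeoutException e) {
                return true;                        // the read timeout fired
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling SilentEndpointDemo.readTimesOut(300) returns after roughly the configured timeout instead of hanging. Whether Crowd can pass equivalent timeouts through to the HTTP client used by msal4j is not established in this ticket.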
Steps to Reproduce
- Configure an Azure AD directory in Crowd DC
- Ensure the Crowd DC server cannot reach login.microsoftonline.com (e.g., via firewall rule, DNS failure, network partition, or Azure AD outage)
- Trigger a user authentication request against the Azure AD directory (e.g., via REST API call from a connected application like Confluence or Jira)
Expected Results
If the authentication request does not complete within a defined time limit, it should time out and return an appropriate error message.
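One way to express the expected behavior on the calling side is a bounded wait on the pending future. This is a sketch, not Crowd's implementation. It assumes the token request is represented as a CompletableFuture, as the CompletableFuture.get() frame in the Tomcat thread trace suggests. The class BoundedTokenWait, the method awaitToken, and the choice of IllegalStateException are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedTokenWait {

    /**
     * Waits for a pending token at most the given time and fails with a
     * descriptive error instead of parking the calling thread forever.
     */
    static String awaitToken(CompletableFuture<String> pending, long timeout, TimeUnit unit) {
        try {
            return pending.get(timeout, unit);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; this does NOT interrupt the worker thread.
            pending.cancel(true);
            throw new IllegalStateException(
                    "Token request did not complete within " + timeout + " " + unit, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for token", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Token request failed", e.getCause());
        }
    }
}
```

A bounded wait like this releases the request thread, but the worker thread remains blocked in the socket read until the connection itself times out, so a read/connect timeout on the HTTP call would still be needed to release the worker.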
Actual Results
A worker thread like the one below stays RUNNABLE indefinitely (no timeout), stuck on a socket read:
"ForkJoinPool.commonPool-worker-6248" #432217 [434203] daemon prio=5 os_prio=0 cpu=29636.94ms elapsed=1823164.50s tid=0x00007f39584bf200 nid=434203 runnable [0x00007f38c7dfc000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.SocketDispatcher.read0(java.base@21.0.10/Native Method)
    ...
    at java.net.HttpURLConnection.getResponseCode(java.base@21.0.10/HttpURLConnection.java:531)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(java.base@21.0.10/HttpsURLConnectionImpl.java:307)
    at com.microsoft.aad.msal4j.DefaultHttpClient.readResponseFromConnection(DefaultHttpClient.java:121)
    at com.microsoft.aad.msal4j.DefaultHttpClient.executeHttpPost(DefaultHttpClient.java:72)
    at com.microsoft.aad.msal4j.DefaultHttpClient.send(DefaultHttpClient.java:46)
    ...
    at com.microsoft.aad.msal4j.AcquireTokenByAuthorizationGrantSupplier.execute(AcquireTokenByAuthorizationGrantSupplier.java:63)
    at com.microsoft.aad.msal4j.AcquireTokenByClientCredentialSupplier.acquireTokenByClientCredential(AcquireTokenByClientCredentialSupplier.java:87)
    at com.microsoft.aad.msal4j.AcquireTokenByClientCredentialSupplier.execute(AcquireTokenByClientCredentialSupplier.java:50)
    ...
A Tomcat thread like the one below stays in WAITING indefinitely (no timeout), waiting for the worker thread to complete:
"http-nio-8095-exec-1" #158 [187] daemon prio=5 os_prio=0 cpu=1883630.56ms elapsed=5262433.90s tid=0x00007f4241a0b3c0 nid=187 waiting on condition [0x00007f38afff8000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@21.0.10/Native Method)
    - parking to wait for <0x00007f3d5b000088> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(java.base@21.0.10/LockSupport.java:221)
    at java.util.concurrent.CompletableFuture$Signaller.block(java.base@21.0.10/CompletableFuture.java:1864)
    at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@21.0.10/ForkJoinPool.java:3780)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@21.0.10/ForkJoinPool.java:3725)
    at java.util.concurrent.CompletableFuture.waitingGet(java.base@21.0.10/CompletableFuture.java:1898)
    at java.util.concurrent.CompletableFuture.get(java.base@21.0.10/CompletableFuture.java:2072)
    at com.atlassian.crowd.directory.authentication.impl.MsalAuthenticator.getApiToken(MsalAuthenticator.java:32)
    ...
Each stuck request also holds a connection from the database connection pool, which can eventually leave Crowd unresponsive once all Tomcat threads and database connections are exhausted.
This issue is similar to CWD-5213 but occurs on a different code path, so the impact here is on authentication.
Workaround
There is currently no known workaround for adding a read or connect timeout to this call.
The affected Crowd node has to be restarted to free the stuck threads.