- Type: Bug
- Resolution: Fixed
- Priority: Medium
- Affects Version/s: 4.3.0, 4.4.0
- Component/s: Authentication / Security
- Severity 2 - Major
Problem
When Crowd is under stress (i.e., serving many simultaneous requests, or servicing back-to-back authentication requests repeatedly), it becomes unresponsive and performs poorly. The condition degrades to the point where Crowd must be restarted to clear all connections and start over.
- Thread dumps collected while this is happening show many blocked threads in the LDAP connection pool. Some of these in turn block HTTP worker threads, so the dumps show both blocked LDAP threads and blocked HTTP execution threads. These HTTP threads may not be able to proceed if they require an LDAP connection.
com.sun.jndi.ldap.pool.Connections is blocking threads
In this particular example, the http-nio-8095-exec-11 thread is stuck in the socketRead0 method of java.net.SocketInputStream, waiting on the SSL handshake while opening a new pooled connection to the LDAP service.
java.net.SocketInputStream.socketRead0(java.base@11.0.14.1/Native Method)
java.net.SocketInputStream.socketRead(java.base@11.0.14.1/SocketInputStream.java:115)
java.net.SocketInputStream.read(java.base@11.0.14.1/SocketInputStream.java:168)
java.net.SocketInputStream.read(java.base@11.0.14.1/SocketInputStream.java:140)
sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.14.1/SSLSocketInputRecord.java:478)
sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.14.1/SSLSocketInputRecord.java:472)
sun.security.ssl.SSLSocketInputRecord.decode(java.base@11.0.14.1/SSLSocketInputRecord.java:160)
sun.security.ssl.SSLTransport.decode(java.base@11.0.14.1/SSLTransport.java:111)
sun.security.ssl.SSLSocketImpl.decode(java.base@11.0.14.1/SSLSocketImpl.java:1501)
sun.security.ssl.SSLSocketImpl.readHandshakeRecord(java.base@11.0.14.1/SSLSocketImpl.java:1411)
sun.security.ssl.SSLSocketImpl.startHandshake(java.base@11.0.14.1/SSLSocketImpl.java:451)
sun.security.ssl.SSLSocketImpl.startHandshake(java.base@11.0.14.1/SSLSocketImpl.java:422)
com.sun.jndi.ldap.Connection.createSocket(java.naming@11.0.14.1/Connection.java:364)
com.sun.jndi.ldap.Connection.<init>(java.naming@11.0.14.1/Connection.java:231)
com.sun.jndi.ldap.LdapClient.<init>(java.naming@11.0.14.1/LdapClient.java:137)
com.sun.jndi.ldap.LdapClientFactory.createPooledConnection(java.naming@11.0.14.1/LdapClientFactory.java:64)
com.sun.jndi.ldap.pool.Connections.getOrCreateConnection(java.naming@11.0.14.1/Connections.java:202)
com.sun.jndi.ldap.pool.Connections.get(java.naming@11.0.14.1/Connections.java:143)
com.sun.jndi.ldap.pool.Pool.getPooledConnection(java.naming@11.0.14.1/Pool.java:151)
com.sun.jndi.ldap.LdapPoolManager.getLdapClient(java.naming@11.0.14.1/LdapPoolManager.java:340)
com.sun.jndi.ldap.LdapClient.getInstance(java.naming@11.0.14.1/LdapClient.java:1608)
com.sun.jndi.ldap.LdapCtx.connect(java.naming@11.0.14.1/LdapCtx.java:2847)
com.sun.jndi.ldap.LdapCtx.<init>(java.naming@11.0.14.1/LdapCtx.java:348)
com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxFromUrl(java.naming@11.0.14.1/LdapCtxFactory.java:262)
com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(java.naming@11.0.14.1/LdapCtxFactory.java:226)
com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(java.naming@11.0.14.1/LdapCtxFactory.java:280)
com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(java.naming@11.0.14.1/LdapCtxFactory.java:185)
com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(java.naming@11.0.14.1/LdapCtxFactory.java:115)
javax.naming.spi.NamingManager.getInitialContext(java.naming@11.0.14.1/NamingManager.java:730)
javax.naming.InitialContext.getDefaultInitCtx(java.naming@11.0.14.1/InitialContext.java:305)
javax.naming.InitialContext.init(java.naming@11.0.14.1/InitialContext.java:236)
javax.naming.ldap.InitialLdapContext.<init>(java.naming@11.0.14.1/InitialLdapContext.java:154)
org.springframework.ldap.core.support.LdapContextSource.getDirContextInstance(LdapContextSource.java:42)
org.springframework.ldap.core.support.AbstractContextSource.createContext(AbstractContextSource.java:343)
org.springframework.ldap.core.support.AbstractContextSource.doGetContext(AbstractContextSource.java:139)
org.springframework.ldap.core.support.AbstractContextSource.getReadWriteContext(AbstractContextSource.java:174)
org.springframework.ldap.transaction.compensating.manager.ContextSourceTransactionManagerDelegate.getNewHolder(ContextSourceTransactionManagerDelegate.java:96)
org.springframework.transaction.compensating.support.AbstractCompensatingTransactionManagerDelegate.doBegin(AbstractCompensatingTransactionManagerDelegate.java:83)
org.springframework.ldap.transaction.compensating.manager.ContextSourceTransactionManager.doBegin(ContextSourceTransactionManager.java:123)
org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManager.java:400)
org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:373)
com.atlassian.crowd.directory.SpringLDAPConnector.pageSearchResults(SpringLDAPConnector.java:369)
com.atlassian.crowd.directory.SpringLDAPConnector.searchEntitiesWithRequestControls(SpringLDAPConnector.java:453)
com.atlassian.crowd.directory.SpringLDAPConnector.searchEntities(SpringLDAPConnector.java:437)
com.atlassian.crowd.directory.SpringLDAPConnector.searchUserObjects(SpringLDAPConnector.java:640)
com.atlassian.crowd.directory.SpringLDAPConnector.findUserWithAttributesByName(SpringLDAPConnector.java:596)
com.atlassian.crowd.directory.SpringLDAPConnector.findUserByName(SpringLDAPConnector.java:583)
com.atlassian.crowd.directory.SpringLDAPConnector.authenticate(SpringLDAPConnector.java:1006)
com.atlassian.crowd.directory.DbCachingRemoteDirectory.authenticateAndUpdateInternalUser(DbCachingRemoteDirectory.java:267)
com.atlassian.crowd.directory.DbCachingRemoteDirectory.performAuthenticationAndUpdateAttributes(DbCachingRemoteDirectory.java:207)
com.atlassian.crowd.directory.DbCachingRemoteDirectory.authenticate(DbCachingRemoteDirectory.java:187)
com.atlassian.crowd.manager.directory.DirectoryManagerGeneric.authenticateUser(DirectoryManagerGeneric.java:305)
jdk.internal.reflect.GeneratedMethodAccessor652.invoke(Unknown Source)
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.14.1/DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(java.base@11.0.14.1/Method.java:566)
...snip...
Before getting stuck, this thread obtained two locks (a com.sun.jndi.ldap.pool.Connections lock and an org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper lock) and never released them. As a result, 19 threads are BLOCKED.

catalina.out stuck thread valve indicators
LDAP connection pool thread behavior
Examining the LDAP connection pool, we find RUNNABLE threads that are all stuck in a socket read, seemingly waiting for a response from the corporate LDAP service:
java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(java.base@11.0.14.1/Native Method)
    at java.net.SocketInputStream.socketRead(java.base@11.0.14.1/SocketInputStream.java:115)
    at java.net.SocketInputStream.read(java.base@11.0.14.1/SocketInputStream.java:168)
    at java.net.SocketInputStream.read(java.base@11.0.14.1/SocketInputStream.java:140)
    at sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.14.1/SSLSocketInputRecord.java:478)
    at sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.14.1/SSLSocketInputRecord.java:472)
    at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(java.base@11.0.14.1/SSLSocketInputRecord.java:70)
    at sun.security.ssl.SSLSocketImpl.readApplicationRecord(java.base@11.0.14.1/SSLSocketImpl.java:1449)
    at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@11.0.14.1/SSLSocketImpl.java:1060)
    at java.io.BufferedInputStream.fill(java.base@11.0.14.1/BufferedInputStream.java:252)
    at java.io.BufferedInputStream.read1(java.base@11.0.14.1/BufferedInputStream.java:292)
    at java.io.BufferedInputStream.read(java.base@11.0.14.1/BufferedInputStream.java:351)
    - locked <0x0000000767ca7738> (a java.io.BufferedInputStream)
    at com.sun.jndi.ldap.Connection.run(java.naming@11.0.14.1/Connection.java:855)
    at java.lang.Thread.run(java.base@11.0.14.1/Thread.java:829)
Blocked threads summary
Based on the data in these thread dumps, the application appears to be stuck waiting for a response from the LDAP service. However, connectivity tests and sample authentication requests are serviced fine by the LDAP authority, and the problem happens with Crowd internal directories as well.
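To get a quick blocked-thread summary from a collected thread dump, something like the following Python 3 sketch may help. It is not part of the original investigation tooling. It assumes the dump is in the usual jstack text format, where each thread has a `java.lang.Thread.State:` line and blocked threads have a `- waiting to lock <address> (a class)` line. The function name `summarize_thread_dump` is our own.

```python
import re
from collections import Counter

# Matches e.g. "java.lang.Thread.State: BLOCKED (on object monitor)"
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")
# Matches e.g. "- waiting to lock <0x...> (a com.sun.jndi.ldap.pool.Connections)"
WAITING_RE = re.compile(r"- waiting to lock <(0x[0-9a-f]+)> \(a ([\w.$]+)\)")


def summarize_thread_dump(text):
    """Return (threads per state, threads waiting per monitor) as Counters."""
    states = Counter(STATE_RE.findall(text))
    waiters = Counter(
        f"{cls} <{addr}>" for addr, cls in WAITING_RE.findall(text)
    )
    return states, waiters


def print_summary(text, top=5):
    """Print the state counts and the most contended monitors."""
    states, waiters = summarize_thread_dump(text)
    for state, count in states.most_common():
        print(f"{state:15} {count}")
    for lock, count in waiters.most_common(top):
        print(f"{count:4} thread(s) waiting to lock {lock}")
```

Calling `print_summary(open("threaddump.txt").read())` on a dump taken during the incident (the file name is a placeholder) should make it easy to see whether most waiters are queued on a com.sun.jndi.ldap.pool.Connections monitor.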
Environment
- Java 8 and Java 11. One affected customer reports that moving production from JDK 11 to JDK 8 mitigated their issue, but this has not yet been reproduced or independently confirmed.
- Crowd 4.3.0 through 4.4.0 tested.
- Delegated LDAP directory, with other LDAP directories connected and internal Crowd user directories present as well
- A "generic" application is defined in Crowd to service a specific application's auth needs
- The ability to simulate logins either via cURL or wget requests, or by Python's requests library
Alternative testing with Python
We've written a Python 3 script that stresses Crowd with many authentication requests. It is coded to make 100,000 authentication attempts across 20 simultaneous threads, with 50% of the attempts for a local user and 50% for an LDAP user.
You must supply the name of the custom application you are authenticating to (as defined in Crowd and referred to in the script as the APP_USERNAME), the app's password, and the passwords for the LDAP and local user accounts.
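For readers who cannot obtain the script, the following is a minimal sketch of the same idea. It is not the crowdAuthTester.py script itself. It assumes Crowd's user-authentication REST resource (POST to rest/usermanagement/1/authentication with the username as a query parameter, the password in a JSON body, and the application's credentials as HTTP Basic auth); verify this against the REST documentation for your Crowd version. The base URL, user names, passwords, and the `build_request`, `send`, and `run` names are all placeholders.

```python
import base64
import json
import urllib.error
import urllib.parse
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Placeholder values (assumptions); replace with your own.
BASE_URL = "http://localhost:6435/crowd"
APP_USERNAME = "generic-app"
APP_PASSWORD = "app-password"
# One local user and one LDAP user, so attempts split 50/50.
USERS = [("local_user", "local-password"), ("ldap_user", "ldap-password")]


def build_request(username, password):
    """Build the authentication POST for one user (assumed REST resource)."""
    query = urllib.parse.urlencode({"username": username})
    url = f"{BASE_URL}/rest/usermanagement/1/authentication?{query}"
    token = base64.b64encode(f"{APP_USERNAME}:{APP_PASSWORD}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps({"value": password}).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Return the HTTP status code, or 0 if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        return 0  # connection refused, reset, or timed out


def run(attempts=100_000, workers=20, sender=send):
    """Fire `attempts` logins over `workers` threads; count status codes."""
    def attempt(i):
        username, password = USERS[i % len(USERS)]  # alternate the two users
        return sender(build_request(username, password))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(attempt, range(attempts)))
```

Calling `run()` returns a count per HTTP status code, with 0 standing for requests that received no response. The client-side timeout means a hung Crowd instance shows up as a growing count of 0 entries rather than a stalled test.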
Python script download and instructions
- Download the Python 3 script from here and save as crowdAuthTester.py

- You can run the script directly on the Crowd server if it has Python 3 installed, or from any workstation that has Python 3 and network connectivity to your Crowd instance.
$ python3 crowdAuthTester.py
- The script writes a listOfErroredHTTPstatusCodes.txt file containing the HTTP status codes of any errors it encounters. Please include that file if one is generated.
- The script also prints a summary of the run at the end of its output. Please include those summarized results.
...snip...
Total number of threads running simultaneously :20
....snip....
<value>1648354070889</value></values></attribute><attribute name="requiresPasswordChange"><link href="http://localhost:6435/crowd/rest/usermanagement/1/user/attribute?username=local_user&attributename=requiresPasswordChange" rel="self"/><values><value>false</value></values></attribute><link href="http://localhost:6435/crowd/rest/usermanagement/1/user/attribute?username=local_user" rel="self"/></attributes></user>
...snip...
Time to completion: 0:19:46
Total number of attempts: 100000
Total number of errors: 1077
Expected Results
Crowd should return an authentication result, report that authentication is currently unavailable, or time out the client's authentication request.
Actual Results
When Crowd is under stress (i.e., serving many simultaneous requests, or servicing back-to-back authentication requests repeatedly), it becomes unresponsive and performs poorly. The condition degrades to the point where Crowd must be restarted to clear all connections and start over.
- Thread dumps collected while this is happening show many blocked threads in the LDAP connection pool. Some of these in turn block HTTP worker threads, so the dumps show both blocked LDAP threads and blocked HTTP execution threads. These HTTP threads may not be able to proceed if they require an LDAP connection.
Workaround
There is no known workaround at this time. Some customers have reported that they successfully worked around this issue by downgrading from JDK 11 to JDK 8. However, this is contradicted by other observed evidence, and it is not clear that the JDK version (8 or 11) is the root cause.