Details
-
Bug
-
Resolution: Answered
-
High
-
None
-
8.2.0, 8.3.3
-
8.02
-
17
-
Severity 2 - Major
-
3
-
Description
This problem is caused by a bug in the Java runtime: JDK-8214418 HttpClient falls in running with 100% cpu usage after an error signalled on channel (Backported as JDK-8241054)
The fix has been verified to be available in AdoptOpenJDK 11.0.8 - https://github.com/AdoptOpenJDK/openjdk-jdk11u/commit/8d1b63a4db2c6348a97b3cf45bd4d2caa7cad6b5
Issue Summary
When Jira is running on Java 11, TLS 1.3 requests can cause HTTP client threads to become blocked.
Steps to Reproduce
This can occur with any app that uses TLS 1.3 in an environment running Java 11.x. We were able to reproduce it using the Jira Mobile app Android client:
- Have Jira Mobile app plugins enabled
- Have Jira Mobile Android users register/sign in via the mobile app
Expected Results
Users can sign in and use Jira. Everything functions normally.
Actual Results
All HTTP threads eventually become blocked on one I/O Dispatcher thread, and the instance stops accepting incoming connections. After a restart, the instance stays online and available for some time before all connections are consumed again.
Symptoms
Top output on the Jira process shows one CPU core running hot on an I/O dispatcher thread. Other cores are silent/idle.
top - 14:42:22 up 1 day, 18:37, 2 users, load average: 0.99, 1.00, 1.02
Tasks: 417 total, 1 running, 416 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.4%us, 0.2%sy, 0.0%ni, 79.2%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49409424k total, 49029788k used, 379636k free, 628400k buffers
Swap: 4194300k total, 0k used, 4194300k free, 35948404k cached

  PID USER PR NI  VIRT RES  SHR S %CPU %MEM     TIME+ COMMAND
44673 jira 20  0 22.3g 10g 839m R 98.4 21.5 395:41.77 I/O dispatcher
44475 jira 20  0 22.3g 10g 839m S  0.0 21.5   0:00.00 java
44476 jira 20  0 22.3g 10g 839m S  0.0 21.5   0:00.84 java
44477 jira 20  0 22.3g 10g 839m S  0.0 21.5   1:38.88 ParGC Thread#0
44478 jira 20  0 22.3g 10g 839m S  0.0 21.5   5:00.21 VM Thread
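The hot thread in the top output can be matched to a thread dump entry through its native thread ID: top prints it in decimal in the PID column, while a JVM thread dump prints the same ID in hexadecimal as nid. A minimal Python sketch of the conversion follows; the function name is hypothetical and only used for illustration.

```python
def top_pid_to_nid(pid: int) -> str:
    """Return the hexadecimal nid string a JVM thread dump uses for a native thread ID."""
    return hex(pid)


if __name__ == "__main__":
    # 44673 is the PID of the busy "I/O dispatcher" row in the top output.
    # The result matches nid=0xae81 on "I/O dispatcher 11" in the thread dump.
    print(top_pid_to_nid(44673))  # 0xae81
```

Searching the thread dump for the resulting nid value leads directly to the stack trace of the thread that is consuming the CPU core.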
Analysis of thread dumps shows over 150 threads are blocked on an I/O Dispatcher:
"I/O dispatcher 11" #178 daemon prio=5 cpu=23801052.84ms elapsed=25002.07s tid=0x00007f2d788b2800 nid=0xae81 runnable [0x00007f2cb5320000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.ssl.SSLEngineImpl.writeRecord(java.base@11.0.4/SSLEngineImpl.java:185)
        at sun.security.ssl.SSLEngineImpl.wrap(java.base@11.0.4/SSLEngineImpl.java:136)
        - eliminated <0x000000052cca4538> (a sun.security.ssl.SSLEngineImpl)
        at sun.security.ssl.SSLEngineImpl.wrap(java.base@11.0.4/SSLEngineImpl.java:116)
        - locked <0x000000052cca4538> (a sun.security.ssl.SSLEngineImpl)
        at javax.net.ssl.SSLEngine.wrap(java.base@11.0.4/SSLEngine.java:479)
        at org.apache.http.nio.reactor.ssl.SSLIOSession.doWrap(SSLIOSession.java:263)
        at org.apache.http.nio.reactor.ssl.SSLIOSession.doHandshake(SSLIOSession.java:301)
        at org.apache.http.nio.reactor.ssl.SSLIOSession.isAppInputReady(SSLIOSession.java:503)
        - locked <0x000000052cca44f8> (a org.apache.http.nio.reactor.ssl.SSLIOSession)
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:120)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
        at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)
Further analysis of the http-nio-xxxx-exec-xx threads may help identify the exact app, plugin, or package(s) stimulating the problem. In this example, we saw a large number of http threads pointing to the Jira Mobile app.
"http-nio-8080-exec-50 url:/rest/nativemob...ation/registration username:xxxxxxx" #2646 daemon prio=5 cpu=102935.39ms elapsed=19829.30s tid=0x00007f2ccc2d5000 nid=0xbd24 waiting on condition [0x00007f2c1d9b1000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
        - parking to wait for <0x0000000509872550> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
Heap dump analysis may be required.
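To get a quick overview of a large thread dump, the threads can be grouped by name and state. The following Python sketch assumes a standard jstack-style dump in which each thread header line starts with the quoted thread name and is followed by a java.lang.Thread.State line. The function name and the grouping rule (dropping the trailing thread number and the request details that Jira appends to the thread name) are illustrative choices, not part of any Jira tooling.

```python
import re
from collections import Counter

HEADER = re.compile(r'^"(?P<name>[^"]+)" #\d+')
STATE = re.compile(r"java\.lang\.Thread\.State: (?P<state>[A-Z_]+)")


def summarize_threads(dump: str) -> Counter:
    """Count threads per (name group, state) in a jstack-style thread dump."""
    counts = Counter()
    current = None
    for line in dump.splitlines():
        header = HEADER.match(line)
        if header:
            # Drop the " url:... username:..." suffix, then the trailing thread
            # number, so "http-nio-8080-exec-50 url:..." groups with its siblings.
            name = header.group("name").split(" url:")[0]
            current = re.sub(r"[-\s]?\d+$", "", name)
            continue
        state = STATE.search(line)
        if state and current is not None:
            counts[(current, state.group("state"))] += 1
            current = None
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_threads.py threaddump.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for (group, state), count in summarize_threads(handle.read()).most_common():
            print(f"{count:5d}  {state:15s} {group}")
```

In a dump showing this problem, one would expect a single RUNNABLE I/O dispatcher thread alongside a large number of WAITING http-nio threads.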
Solution
Upgrade AdoptOpenJDK to version 11.0.8 or later, which includes the fix for this issue.
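Whether a given runtime predates the fix can be checked by comparing its version against 11.0.8. The Python sketch below is a rough illustration: it assumes the version is available as a string containing an x.y.z triple (for example the java.base@11.0.4 tag in the stack frames of a thread dump), and the function names are hypothetical. It only judges Java 11 builds.

```python
import re

# First AdoptOpenJDK 11 release verified to contain the fix, per this ticket.
FIXED_IN = (11, 0, 8)


def parse_version(text: str) -> tuple:
    """Extract the first major.minor.patch triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text: str) -> bool:
    """True for Java 11 builds older than 11.0.8; other major versions are not judged."""
    version = parse_version(version_text)
    return version[0] == 11 and version < FIXED_IN


if __name__ == "__main__":
    print(is_affected("java.base@11.0.4"))  # True
    print(is_affected("11.0.8"))            # False
```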
Workaround
Disable TLS 1.3 for outbound (client) connections on the JVM running Jira:
- Stop Jira
- Add the following option to Jira's startup options:
-Djdk.tls.client.protocols=TLSv1,TLSv1.1,TLSv1.2
- Start Jira
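After the restart, it can be worth confirming that the option actually reached the JVM command line. The Python sketch below checks a JVM argument string for the jdk.tls.client.protocols property and reports whether TLSv1.3 is excluded. How the argument string is obtained (for example from the process listing) is left open, and the function name is hypothetical.

```python
PROPERTY = "-Djdk.tls.client.protocols="


def tls13_disabled_for_clients(jvm_args: str) -> bool:
    """Return True if jdk.tls.client.protocols is set and does not list TLSv1.3."""
    disabled = False  # property absent: the JVM default applies, which on Java 11 includes TLS 1.3
    for arg in jvm_args.split():
        if arg.startswith(PROPERTY):
            protocols = [p.strip() for p in arg[len(PROPERTY):].split(",")]
            # Keep evaluating so that a later occurrence overrides an earlier one.
            disabled = "TLSv1.3" not in protocols
    return disabled


if __name__ == "__main__":
    args = "-Xms2g -Djdk.tls.client.protocols=TLSv1,TLSv1.1,TLSv1.2"
    print(tls13_disabled_for_clients(args))  # True
```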
Alternate workarounds:
- Disable the app, plugin, or module found to be the cause of the stuck I/O Dispatcher thread.
- Install Java 8 and point Jira to that JRE/JDK instead of Java 11.