Uploaded image for project: 'Jira Service Management Data Center'
  1. Jira Service Management Data Center
  2. JSDSERVER-12620

The JSM Mail Handlers (channels) stop pulling new emails due to an infinite timeout used to fetch Oauth 2.0 tokens

      Issue Summary

      All the Jira Service Management (JSM) Mail Channels stop pulling new incoming emails due to an infinite timeout value used to fetch a new Oauth 2.0 access token.

      Environment

      • Jira Service Management (JSM) on 4.10.0 and any higher version
      • At least 1 JSM Mail Channel must be configured (in Project Settings > Email Requests) using Oauth 2.0 as the authorization method for the bug to be replicable

      Steps to replicate

      This issue has not been replicated in our environment since, to replicate it, we would need to create some "network latency" at the right time to trigger the bug.

      Even though the steps to replicate the issue are not fully identified, they should look like the ones listed below:

      1. Install JSM on 4.10.0 or any higher version
      2. Create a JSM project
      3. Configure a JSM Mail Handler in Project Settings > Email Requests using Oauth 2.0 as the authentication method as per this documentation
      4. Verify that the JSM Mail Handler is able to fetch incoming emails from the mailbox configured in Project Settings > Email Requests
      5. Wait for about 1h
        • This is usually the time when the Oauth 2.0 Access token expires (for Gmail and Microsoft)
        • The Mail Puller Job of the JSM Mail Channel will try to fetch a new Oauth 2.0 Access token, using the Oauth 2.0 Refresh token
      6. Create some "network latency/interruption" that will prevent Gmail or Microsoft (depending on which Mail Server is used for the test) from sending a response back to the Mail Puller Job

      Expected Results

      The HTTP request sent by the Mail Puller job should eventually timeout, since no reply was sent back from the Mail Server, and the Mail Puller job should stop its own execution, until it is scheduled again 1 minute later.

      Then, 1 minute later, it should try again to request an Oauth 2.0 Access Token, get a new one, and pull new emails again.

      Actual Results

      The HTTP request does not timeout.

      As a result:

      • the Mail Puller job never completes its own execution
      • the Mail Puller gets completely stuck and is no scheduled anymore
      • No new emails are fetched by any JSM Mail Channels configured in any JSM project

      Diagnosis steps to identify this bug

      Jira Scheduler page

      Go to page âš™ > System > Scheduler details, and observe that the Mail Puller Job is showing as "Already running", which means that this job is stuck.

      SQL query

      Run the SQL query below:

      select * from rundetails where job_id like '%Puller%';
      

      If you see 2 rows with one saying "already running", it's another indication that the Mail Puller job is stuck:

      Thread Dumps

      When collecting thread dumps while the issue is happening, we should see a long running Caesium thread with the following stack trace:

      "Caesium-1-1" daemon prio=5 tid=0x00000000000000eb nid=0 runnable 
         java.lang.Thread.State: RUNNABLE
      	at java.base@11.0.13/java.net.SocketInputStream.socketRead0(Native Method)
      	at java.base@11.0.13/java.net.SocketInputStream.socketRead(Unknown Source)
      	at java.base@11.0.13/java.net.SocketInputStream.read(Unknown Source)
      	at java.base@11.0.13/java.net.SocketInputStream.read(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketInputRecord.read(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketInputRecord.readFully(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketInputRecord.decode(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLTransport.decode(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketImpl.decode(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketImpl.readApplicationRecord(Unknown Source)
      	at java.base@11.0.13/sun.security.ssl.SSLSocketImpl$AppInputStream.read(Unknown Source)
      	at java.base@11.0.13/java.io.BufferedInputStream.fill(Unknown Source)
      	at java.base@11.0.13/java.io.BufferedInputStream.read1(Unknown Source)
      	at java.base@11.0.13/java.io.BufferedInputStream.read(Unknown Source)
      	- locked <0x00000000739144b7> (a java.io.BufferedInputStream)
      	at java.base@11.0.13/sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
      	at java.base@11.0.13/sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
      	at java.base@11.0.13/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
      	- locked <0x0000000062f1103d> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
      	at java.base@11.0.13/sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
      	- locked <0x0000000062f1103d> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
      	at java.base@11.0.13/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
      	- locked <0x000000005a1b5fc1> (a sun.net.www.protocol.https.HttpsURLConnectionImpl)
      	at com.nimbusds.oauth2.sdk.http.HTTPRequest.send(HTTPRequest.java:899)
      	at com.atlassian.oauth2.client.lib.token.DefaultTokenService.getToken(DefaultTokenService.java:139)
      	at com.atlassian.oauth2.client.lib.token.DefaultTokenService.forceRefresh(DefaultTokenService.java:71)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.refreshToken(DefaultTokenHandler.java:144)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.refreshTokenIfNeeded(DefaultTokenHandler.java:131)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.lambda$getRefreshedToken$1(DefaultTokenHandler.java:110)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler$$Lambda$6901/0x0000000804b17840.call(Unknown Source)
      	at com.atlassian.oauth2.client.util.concurrent.KeyedLocks.executeWithLock(KeyedLocks.java:33)
      	- locked <0x000000007ce01059> (a java.util.concurrent.atomic.AtomicInteger)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.getRefreshedToken(DefaultTokenHandler.java:109)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.getRefreshedToken(DefaultTokenHandler.java:102)
      	at com.atlassian.oauth2.client.storage.DefaultTokenHandler.getRefreshedToken(DefaultTokenHandler.java:35)
      	at jdk.internal.reflect.GeneratedMethodAccessor1341.invoke(Unknown Source)
      	at java.base@11.0.13/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
      	at java.base@11.0.13/java.lang.reflect.Method.invoke(Unknown Source)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
      	at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56)
      	at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.invoke(ServiceInvoker.java:60)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
      	at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:137)
      	at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
      	at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invokeUnprivileged(ServiceTCCLInterceptor.java:70)
      	at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invoke(ServiceTCCLInterceptor.java:53)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
      	at org.eclipse.gemini.blueprint.service.importer.support.LocalBundleContextAdvice.invoke(LocalBundleContextAdvice.java:57)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
      	at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:137)
      	at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
      	at com.sun.proxy.$Proxy1519.getRefreshedToken(Unknown Source)
      	at com.atlassian.jira.internal.mail.processor.feature.oauth.MailOAuthServiceImpl.getOAuthToken(MailOAuthServiceImpl.java:53)
      	at com.atlassian.jira.internal.mail.processor.feature.authentication.OAuthAuthenticationStrategy.getPassword(OAuthAuthenticationStrategy.java:31)
      	at com.atlassian.jira.internal.mail.processor.feature.puller.MailPullerWorker.getMailServerPassword(MailPullerWorker.java:248)
      	at com.atlassian.jira.internal.mail.processor.feature.puller.MailPullerWorker.pullEmailForConnection(MailPullerWorker.java:137)
      	at com.atlassian.jira.internal.mail.processor.feature.puller.MailPullerWorker.pullMailFromAllValidChannels(MailPullerWorker.java:100)
      	at com.atlassian.jira.internal.mail.processor.feature.puller.MailPullerService.run(MailPullerService.java:33)
      	at com.atlassian.jira.internal.mail.processor.services.MailPullerExecutor.run(MailPullerExecutor.java:29)
      	at com.atlassian.jira.internal.mail.processor.services.AbstractMailExecutor.execute(AbstractMailExecutor.java:45)
      	at com.atlassian.jira.internal.mail.processor.services.MailJobRunner.runJob(MailJobRunner.java:35)
      	at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:134)
      	at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:106)
      	at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:90)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:435)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJob(CaesiumSchedulerService.java:430)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJobWithRecoveryGuard(CaesiumSchedulerService.java:454)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:382)
      	at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$$Lambda$4070/0x00000008040c7c40.accept(Unknown Source)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:66)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:60)
      	at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:35)
      	at java.base@11.0.13/java.lang.Thread.run(Unknown Source)
      

      Root Cause

      The reason behind this bug is the fact that the JSM Mail Puller uses a function from an external library HTTPRequest.java to fetch a new Oauth 2.0 Access token after it expires.

      By default, this function is not using any timeout value, which means that the timeout will be infinite (the request will never time out if no response is sent back).

      To fix this issue, we should use the method setConnectTimeout() from the external library to explicitly set the Timetout value before we send the HTTP Request.

      Workaround

      The only way to fix this issue is to reset the Mail Puller job by re-starting the Jira application.

        1. SchedulerPage.png
          340 kB
          Julien Rey
        2. SQL_Query.png
          33 kB
          Julien Rey

            [JSDSERVER-12620] The JSM Mail Handlers (channels) stop pulling new emails due to an infinite timeout used to fetch Oauth 2.0 tokens

            Jeramy added a comment -

            Per an Atlassian support ticket that I have open...

             

            We suspect that the http request being sent by the mail puller job is not timing out appropriately and that the threads are hanging while attempting to refresh the oauth token, as described by this bug JSDSERVER-12620:The JSM Mail Handlers (channels) stop pulling new emails due to an infinite timeout used to fetch Oauth 2.0 tokens. You'll note that the bug says it's fixed in 5.12.12, but we have found that there are multiple timeouts involved, one for connectTimeout(which the bugfix fixed), and one for readTimeout, which may still have an infinite timeout set.

            We can try and modify the timeouts by adding the following startup parameters to your instance, which will set the timeouts to 7.5 minutes:

            -Datlassian.oauth2.client.max.server.request.read.timeout.seconds=750
            -Datlassian.oauth2.client.max.server.request.timeout.seconds=750
            You can use the information in Setting properties and options on startup to accomplish this.

            Additionally, we'll want to turn on DEBUG logging for the package com.atlassian.jira.internal.mail.processor to allow us to get a bit more detail in the logs on the threads should the issue recur. This, plus thread dumps from when the issue is happening, will help us better zero in on what exactly is causing your mail to stop so frequently if the above timeout setting doesn't clear things up.

            I don't think this is completely resolved. Does anyone at Atlassian know if a ticket was created to resolve the readTimeout issue listed above? I can't seem to find one.

            Jeramy added a comment - Per an Atlassian support ticket that I have open...   We suspect that the http request being sent by the mail puller job is not timing out appropriately and that the threads are hanging while attempting to refresh the oauth token, as described by this bug JSDSERVER-12620 :The JSM Mail Handlers (channels) stop pulling new emails due to an infinite timeout used to fetch Oauth 2.0 tokens. You'll note that the bug says it's fixed in 5.12.12, but we have found that there are multiple timeouts involved, one for connectTimeout(which the bugfix fixed), and one for readTimeout, which may still have an infinite timeout set. We can try and modify the timeouts by adding the following startup parameters to your instance, which will set the timeouts to 7.5 minutes: -Datlassian.oauth2.client.max.server.request.read.timeout.seconds=750 -Datlassian.oauth2.client.max.server.request.timeout.seconds=750 You can use the information in Setting properties and options on startup to accomplish this. Additionally, we'll want to turn on DEBUG logging for the package com.atlassian.jira.internal.mail.processor to allow us to get a bit more detail in the logs on the threads should the issue recur. This, plus thread dumps from when the issue is happening, will help us better zero in on what exactly is causing your mail to stop so frequently if the above timeout setting doesn't clear things up. I don't think this is completely resolved. Does anyone at Atlassian know if a ticket was created to resolve the readTimeout issue listed above? I can't seem to find one.
            Marc Dacanay made changes -
            Labels Original: lts10 lts10nth ltsr pse-request New: lts10 lts10nth pse-request
            Marc Dacanay made changes -
            Labels Original: lts10 lts10nth pse-request New: lts10 lts10nth ltsr pse-request
            Bartosz Ornatowski made changes -
            Fix Version/s Original: 5.4.13 [ 106413 ]
            Fix Version/s Original: 5.12.1 [ 106162 ]
            Fix Version/s Original: 5.13.0 [ 105935 ]
            Fix Version/s New: 5.12.12 [ 108392 ]
            Fix Version/s New: 5.4.25 [ 108623 ]
            Fix Version/s New: 5.17.2 [ 108793 ]
            Fix Version/s New: 10.0.0 [ 107096 ]
            Bartosz Ornatowski made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: In Progress [ 3 ] New: Closed [ 6 ]

            The initial fix released in versions 5.13.05.12.15.4.13 included a fix of only subset of timeouts that can happen when using oauth2 library. Latest version of JSM specified in the Fix version field now support setting up both read and socket timeouts when fetching tokens.

            Bartosz Ornatowski added a comment - The initial fix released in versions 5.13.0 ,  5.12.1 ,  5.4.13 included a fix of only subset of timeouts that can happen when using oauth2 library. Latest version of JSM specified in the Fix version field now support setting up both read and socket timeouts when fetching tokens.
            SET Analytics Bot made changes -
            UIS Original: 19 New: 26
            SET Analytics Bot made changes -
            Support reference count Original: 16 New: 17
            Bartosz Ornatowski made changes -
            Sprint New: LTS Sprint [ 7234 ]
            Bartosz Ornatowski made changes -
            Labels Original: lts10 pse-request New: lts10 lts10nth pse-request

              bornatowski Bartosz Ornatowski
              jrey Julien Rey
              Affected customers:
              3 This affects my team
              Watchers:
              17 Start watching this issue

                Created:
                Updated:
                Resolved: