Jira Service Management Server and Data Center / JSDSERVER-6516

Customer notifications stop being sent from any Service Desk issue

Details

    Description

      Issue Summary

      Service Desk notifications stop being sent entirely once a comment containing a very large number of links (~100k) is added to a request.

      The job responsible for sending the notifications gets stuck, and restarting Jira does not resolve the issue.
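
      The thread dump in the diagnosis steps shows the stuck thread inside StringUtils.replaceEach. As a rough illustration of why a comment with very many links could keep such a call busy for so long, the sketch below models a multi-pattern replace that re-scans for every search string after each replacement, and counts the indexOf calls. It is a simplified model, not the Commons Lang or Jira Service Management code. The assumption of one search string per link, the class name ReplaceEachCostSketch, and the example.com URLs are all hypothetical.

```java
/**
 * Simplified model of a multi-pattern replace, used only to show how the
 * number of indexOf calls grows with the number of distinct links.
 * Hypothetical sketch: not the Commons Lang or Jira Service Management code.
 */
final class ReplaceEachCostSketch {

    /** Replaced text plus the number of indexOf calls it took to produce it. */
    static final class Result {
        final String text;
        final long indexOfCalls;

        Result(String text, long indexOfCalls) {
            this.text = text;
            this.indexOfCalls = indexOfCalls;
        }
    }

    /** Replaces the earliest match each round, re-scanning every live search string. */
    static Result replaceEachCounting(String text, String[] search, String[] replacement) {
        boolean[] exhausted = new boolean[search.length];
        StringBuilder out = new StringBuilder();
        long calls = 0;
        int start = 0;
        while (true) {
            int best = -1;
            int bestIdx = -1;
            for (int i = 0; i < search.length; i++) {
                if (exhausted[i]) {
                    continue; // this search string has no matches left
                }
                int at = text.indexOf(search[i], start);
                calls++;
                if (at == -1) {
                    exhausted[i] = true;
                } else if (best == -1 || at < best) {
                    best = at;
                    bestIdx = i;
                }
            }
            if (best == -1) {
                break; // nothing left to replace
            }
            out.append(text, start, best).append(replacement[bestIdx]);
            start = best + search[bestIdx].length();
        }
        out.append(text, start, text.length());
        return new Result(out.toString(), calls);
    }

    /** Builds a comment with `links` distinct issue URLs and rewrites them all. */
    static Result run(int links) {
        String[] search = new String[links];
        String[] replacement = new String[links];
        StringBuilder comment = new StringBuilder();
        for (int i = 0; i < links; i++) {
            // Fixed-width keys, so no URL is a prefix of another one.
            String key = "SD-" + (1_000_000 + i);
            search[i] = "https://jira.example.com/browse/" + key;
            replacement[i] = "https://jira.example.com/servicedesk/customer/portal/1/" + key;
            comment.append(search[i]).append(' ');
        }
        return replaceEachCounting(comment.toString(), search, replacement);
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 200, 400}) {
            System.out.println(n + " links -> " + run(n).indexOfCalls + " indexOf calls");
        }
    }
}
```

      In this model, doubling the number of links roughly quadruples the number of indexOf calls, and each call also scans part of an ever longer text. Under these assumptions, a comment with ~100k links would need billions of scans, which would be consistent with a job that appears to never finish.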

      Diagnosis steps

      If you are impacted by the bug, you'll see the following symptoms:

      • if you run the following query in the database, you'll find a large number of customer notifications waiting to be sent (SENT_TIME is null):
        SELECT count(*) FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" IS NULL;
        
      • if you run the following query in the database, you'll find that the job responsible for sending the notifications is stuck and shows as "Already running":
        • Query:
          select * from rundetails where job_id = 'sd.custom.notification.batch.send';
          
        • Example of results:
              id    |              job_id               |         start_time         | run_duration | run_outcome |  info_message   
          ----------+-----------------------------------+----------------------------+--------------+-------------+-----------------
           71507893 | sd.custom.notification.batch.send | 2019-08-14 16:35:37.707+00 |            1 | A           | Already running
          
      • you can also verify that this job is stuck from the UI in ⚙ > System > Scheduler details, where it shows as "Already running" (see the attached screenshot JobAlreadyRunning.png).
      • if you generate thread dumps, you'll see that there is a runnable thread stuck in the method com.atlassian.servicedesk.internal.feature.customer.request.IssueUrlConverterImpl.replaceIssueUrlsWithPortalRequestUrls:
        "Caesium-2-1" #20668 daemon prio=5 tid=0x0000000004770000 nid=0x5b42 runnable [0x00007f740d93f000]
           java.lang.Thread.State: RUNNABLE
        	at java.lang.String.indexOf(String.java:1769)
        	at java.lang.String.indexOf(String.java:1718)
        	at org.apache.commons.lang.StringUtils.replaceEach(StringUtils.java:4075)
        	at org.apache.commons.lang.StringUtils.replaceEach(StringUtils.java:3868)
        	at com.atlassian.servicedesk.internal.feature.customer.request.IssueUrlConverterImpl.replaceIssueUrlsWithPortalRequestUrls(IssueUrlConverterImpl.java:69)
        	at com.atlassian.servicedesk.internal.feature.customer.request.CustomerTextRendererImpl.updateCustomerTextIntertal(CustomerTextRendererImpl.java:159)
        	at com.atlassian.servicedesk.internal.feature.customer.request.CustomerTextRendererImpl.updateEmailTextForCustomer(CustomerTextRendererImpl.java:154)
        	at com.atlassian.servicedesk.internal.notifications.render.StylingBodyFinaliserImpl.buildMultiPartHtmlEmailBody(StylingBodyFinaliserImpl.java:79)
        	at com.atlassian.servicedesk.internal.notifications.render.StylingBodyFinaliserImpl.buildMessageBodyForRecipient(StylingBodyFinaliserImpl.java:72)
        	at com.atlassian.servicedesk.internal.notifications.render.StylingBodyFinaliserImpl.lambda$buildHtmlBody$0(StylingBodyFinaliserImpl.java:55)
        	at com.atlassian.servicedesk.internal.notifications.render.StylingBodyFinaliserImpl$$Lambda$2582/1308621309.apply(Unknown Source)
        	at io.atlassian.fugue.Either$RightProjection.map(Either.java:872)
        	at io.atlassian.fugue.Either.map(Either.java:217)
        	at com.atlassian.servicedesk.internal.notifications.render.StylingBodyFinaliserImpl.buildHtmlBody(StylingBodyFinaliserImpl.java:55)
        
      • if you set the package com.atlassian.servicedesk.plugins.notifications to DEBUG in ⚙ > System > Logging and profiling > Configure logging level for another package, you'll see that the job responsible for sending the notifications keeps being skipped (because a stuck job is already running):
        2019-07-10 10:29:08,193 PsmqAsyncExecutors-job:thread-5747 DEBUG XXXXXXX XXXXXXX XXXXXX XX.XXX.X.XX /secure/CommentAssignIssue.jspa [c.a.s.p.n.internal.scheduler.NotificationBatchScheduler] Notification batch sending job already defined. Skip scheduling.
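
      The thread dump check above can be partly automated. The sketch below scans the text of a thread dump and returns the names of RUNNABLE threads whose stack contains the IssueUrlConverterImpl.replaceIssueUrlsWithPortalRequestUrls frame. The class name StuckNotificationThreadFinder is hypothetical, and the parsing assumes the HotSpot dump layout shown in the example above (a quoted thread name, a java.lang.Thread.State line, then "at ..." frames).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: finds threads that are busy in the URL conversion method. */
final class StuckNotificationThreadFinder {

    static final String SUSPECT_FRAME =
            "com.atlassian.servicedesk.internal.feature.customer.request."
                    + "IssueUrlConverterImpl.replaceIssueUrlsWithPortalRequestUrls";

    /** Returns the names of RUNNABLE threads whose stack contains SUSPECT_FRAME. */
    static List<String> findStuckThreads(String threadDump) {
        List<String> names = new ArrayList<>();
        String header = null;      // header line of the thread currently being read
        boolean runnable = false;  // whether that thread is in state RUNNABLE
        boolean recorded = false;  // whether that thread was already added
        for (String raw : threadDump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // A new thread section starts with the quoted thread name.
                header = line;
                runnable = false;
                recorded = false;
            } else if (line.startsWith("java.lang.Thread.State:")) {
                runnable = line.contains("RUNNABLE");
            } else if (header != null && runnable && !recorded
                    && line.startsWith("at " + SUSPECT_FRAME + "(")) {
                int end = header.indexOf('"', 1);
                names.add(end > 0 ? header.substring(1, end) : header);
                recorded = true;
            }
        }
        return names;
    }
}
```

      A single dump only shows where a thread is at one instant. Seeing the same thread name reported across several dumps taken a few seconds apart is a stronger sign that the job is really stuck.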
        

      Workaround

      The only workaround known so far is to delete from the table "AO_4E8AE6_NOTIF_BATCH_QUEUE" all the customer notifications that are waiting to be sent by the stuck job.

      Please be aware that this workaround permanently deletes those pending notifications from the database, so they will never be sent.

      The steps are:

      1. Stop Jira
      2. Back up your database
      3. Run the following DELETE query. Note that this query has only been tested on a PostgreSQL database:
        delete from "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" is null;
        
      4. Start Jira

      Attachments

        1. JobAlreadyRunning.png
          147 kB
          Julien Rey

            People

              mreil1 Markus Reil (Inactive)
              jrey Julien Rey