- Bug
- Resolution: Fixed
- Low
- 4.7.1, 4.14.0, 5.3.1
- None
- 6
- Severity 3 - Minor
- 2
Issue Summary
Service Desk notifications completely stop being sent if a ticket contains a very high number of participants (for example, more than 60k).
The job responsible for sending the notifications gets stuck, and restarting Jira does not resolve the issue.
Environment
Note that the bug was reproduced on Service Desk 4.7.1 and 4.14.0, but it is likely that this bug was already there in 3.x versions.
Steps to replicate
- Install Jira 8.14.0 with Service Desk 4.14.0
- Install the add-on Data Generator for Jira
- Use this add-on to create 60k users in the Jira application
- Run the following update queries in the Jira database, so that all the 60k Jira users created earlier share the same dummy email address:
UPDATE cwd_user SET email_address = 'test@test.com';
UPDATE cwd_user SET lower_email_address = 'test@test.com';
- Restart Jira, go to the page ⚙ > User Management > Users, and confirm that the 60k users now share the same email address
- Configure an SMTP server in ⚙ > System > Outgoing Mail Server
- Create a new Service Desk project and configure the Customer Permissions as below:
- Set Who can raise requests? to Anyone can email the service project or raise a request in the portal
- Set Who can customers share requests with? to Any customer or organization, by searching in this project
- Create a new ticket in this project, and take note of the issue key
- Configure a Service Desk Mail Handler via Project Settings > Email Requests
- Send a new email to that Mail Handler by:
- Including the issue key in the subject of the email
- Adding in CC the dummy email address (test@test.com) used in the UPDATE SQL query
- When the JSD Mail Handler processes the incoming email, it maps the address added in CC to all 60k users that share it, and automatically adds those 60k users as participants of the ticket created earlier
- Now, create multiple Service Desk tickets, make some updates to them, add some comments, etc...
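The reproduction depends on many users sharing one email address. As a rough sketch of how that precondition could be checked on an existing instance, the grouped count below lists addresses used by more than one user. The table and column names (cwd_user, lower_email_address) come from the UPDATE statements above; the function name and the in-memory SQLite table are illustrative stand-ins, not part of Jira.

```python
import sqlite3

# Groups directory users by lower-cased email address and keeps only the
# addresses that belong to more than one user.
SHARED_EMAIL_SQL = """
SELECT lower_email_address, COUNT(*) AS user_count
FROM cwd_user
GROUP BY lower_email_address
HAVING COUNT(*) > 1
ORDER BY user_count DESC
"""


def shared_email_addresses(conn):
    """Return (email, user_count) pairs for addresses used by several users."""
    cur = conn.cursor()
    cur.execute(SHARED_EMAIL_SQL)
    return [(row[0], row[1]) for row in cur.fetchall()]


# Stand-in data (assumption): an in-memory SQLite table holding only the
# columns the query reads, in place of the real Jira database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cwd_user (user_name TEXT, lower_email_address TEXT)")
demo.executemany(
    "INSERT INTO cwd_user VALUES (?, ?)",
    [
        ("alice", "test@test.com"),
        ("bob", "test@test.com"),
        ("carol", "carol@example.com"),
    ],
)
print(shared_email_addresses(demo))  # [('test@test.com', 2)]
```

The query uses no bind parameters, so the same function should work with any DB-API connection, including a PostgreSQL driver, although only the SQLite stand-in is exercised here.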
Expected results
Service Desk customer notifications should be sent from these updates.
Actual results
Service Desk customer notifications are no longer sent from any ticket from any Service Desk project in the whole Jira instance:
- the Customer notification job is stuck processing a high number of rows in the table AO_4E8AE6_NOTIF_BATCH_QUEUE
- if you run the following query in the database several times in a row, you'll find that the number of customer notifications waiting to be sent keeps increasing:
SELECT count (*) FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" is null;
- if you run the following query in the database, you'll find that the job responsible for sending the notifications is stuck and shows as "Already running"
- Query:
select * from rundetails where job_id = 'sd.custom.notification.batch.send';
- Example of results:
id     | job_id                            | start_time                   | run_duration | run_outcome | info_message
-------+-----------------------------------+------------------------------+--------------+-------------+-----------------
101009 | sd.custom.notification.batch.send | "2021-05-04 09:25:49.011+02" | 10           | S           |
101354 | sd.custom.notification.batch.send | "2021-05-04 09:51:50.519+02" | 41           | A           | Already running
- you can also verify from the UI that this job is stuck: in ⚙ > System > Scheduler details, it shows as "Already running"
- if you generate thread dumps, you might see one of these two types of long-running threads:
- either a PsmqAsyncExecutors-job thread:
"PsmqAsyncExecutors-job:thread-2" prio=5 tid=0x000000000000011d nid=0 runnable
   java.lang.Thread.State: RUNNABLE
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Collections$2.tryAdvance(Collections.java:4717)
    at java.util.Collections$2.forEachRemaining(Collections.java:4725)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at com.atlassian.crowd.model.application.Applications.getActiveDirectories(Applications.java:57)
    at com.atlassian.crowd.manager.application.ApplicationServiceGeneric.getActiveDirectories(ApplicationServiceGeneric.java:1695)
    at com.atlassian.crowd.manager.recovery.RecoveryModeAwareApplicationService.getActiveDirectories(RecoveryModeAwareApplicationService.java:51)
    at com.atlassian.crowd.manager.application.ApplicationServiceGeneric.finder(ApplicationServiceGeneric.java:662)
    at com.atlassian.crowd.manager.application.ApplicationServiceGeneric.findUserByName(ApplicationServiceGeneric.java:304)
    at com.atlassian.crowd.embedded.core.CrowdServiceImpl.getUser(CrowdServiceImpl.java:86)
    at com.atlassian.jira.user.util.DefaultUserManager.getCrowdUser(DefaultUserManager.java:212)
    at com.atlassian.jira.user.util.DefaultUserManager.toApplicationUser(DefaultUserManager.java:298)
    at com.atlassian.jira.user.util.DefaultUserManager$$Lambda$692/690762945.apply(Unknown Source)
    at java.util.Optional.flatMap(Optional.java:241)
    at com.atlassian.jira.user.util.DefaultUserManager.getUserByName(DefaultUserManager.java:260)
    at com.atlassian.jira.user.util.DefaultUserManager.getUserObject(DefaultUserManager.java:222)
    at com.atlassian.jira.user.util.DefaultUserManager.getUser(DefaultUserManager.java:217)
    at com.atlassian.jira.user.util.DefaultUserManager.lambda$getApplicationUserEvenWhenUnknown$4(DefaultUserManager.java:287)
    at com.atlassian.jira.user.util.DefaultUserManager$$Lambda$2586/1268787770.apply(Unknown Source)
    at java.util.Optional.map(Optional.java:215)
    at com.atlassian.jira.user.util.DefaultUserManager.getApplicationUserEvenWhenUnknown(DefaultUserManager.java:286)
    at com.atlassian.jira.user.util.DefaultUserManager.getUserByKeyEvenWhenUnknown(DefaultUserManager.java:265)
    at com.atlassian.jira.issue.customfields.converters.UserConverterImpl.getUserFromDbString(UserConverterImpl.java:81)
    at sun.reflect.GeneratedMethodAccessor655.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.atlassian.plugin.util.ContextClassLoaderSettingInvocationHandler.invoke(ContextClassLoaderSettingInvocationHandler.java:26)
    at com.sun.proxy.$Proxy147.getUserFromDbString(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor655.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56)
    at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.invoke(ServiceInvoker.java:60)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invokeUnprivileged(ServiceTCCLInterceptor.java:70)
    at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invoke(ServiceTCCLInterceptor.java:53)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.eclipse.gemini.blueprint.service.importer.support.LocalBundleContextAdvice.invoke(LocalBundleContextAdvice.java:57)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy3539.getUserFromDbString(Unknown Source)
    at com.atlassian.servicedesk.internal.customfields.participants.ParticipantsCFType.convertDbValueToType(ParticipantsCFType.java:356)
    at com.atlassian.servicedesk.internal.customfields.participants.ParticipantsCFType.convertDbObjectToTypesUnsorted(ParticipantsCFType.java:196)
    at com.atlassian.servicedesk.internal.customfields.participants.ParticipantsCFType.getUsersFromIssue(ParticipantsCFType.java:185)
- or a Caesium thread as shown below:
"Caesium-1-4" daemon prio=5 tid=0x000000000000028a nid=0 runnable
   java.lang.Thread.State: RUNNABLE
    at sun.misc.Unsafe.getObjectVolatile(Native Method)
    at java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
    at java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
    at com.google.common.cache.LocalCache$Segment.getFirst(LocalCache.java:2661)
    at com.google.common.cache.LocalCache$Segment.getEntry(LocalCache.java:2668)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2027)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3952)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
    at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
    at com.atlassian.jira.i18n.CachingI18nFactory.getInstance(CachingI18nFactory.java:167)
    at com.atlassian.jira.i18n.CachingI18nFactory.getInstance(CachingI18nFactory.java:176)
    at com.atlassian.jira.i18n.DelegateI18nFactory.getInstance(DelegateI18nFactory.java:32)
    at sun.reflect.GeneratedMethodAccessor440.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.atlassian.plugin.util.ContextClassLoaderSettingInvocationHandler.invoke(ContextClassLoaderSettingInvocationHandler.java:26)
    at com.sun.proxy.$Proxy261.getInstance(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor440.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56)
    ...
    at com.atlassian.servicedesk.internal.customfields.participants.ParticipantsCustomFieldManagerImpl.toCheckedUsers(ParticipantsCustomFieldManagerImpl.java:154)
    at com.atlassian.servicedesk.internal.customfields.participants.ParticipantsCustomFieldManagerImpl.getUserParticipantsFromIssue(ParticipantsCustomFieldManagerImpl.java:88)
    at com.atlassian.servicedesk.internal.permission.security.RequestParticipantRequestAccessUserStrategy.getUsers(RequestParticipantRequestAccessUserStrategy.java:35)
    at com.atlassian.servicedesk.internal.permission.security.RequestParticipantRequestAccessUserStrategy.lambda$match$0(RequestParticipantRequestAccessUserStrategy.java:46)
    at com.atlassian.servicedesk.internal.permission.security.RequestParticipantRequestAccessUserStrategy$$Lambda$4712/1748693567.test(Unknown Source)
    at io.atlassian.fugue.Either$AbstractProjection.exists(Either.java:698)
    at io.atlassian.fugue.Either.exists(Either.java:244)
    at com.atlassian.servicedesk.internal.permission.security.RequestParticipantRequestAccessUserStrategy.match(RequestParticipantRequestAccessUserStrategy.java:46)
    at com.atlassian.servicedesk.internal.permission.security.RequestAccessUserStrategyManagerImpl.lambda$match$2(RequestAccessUserStrategyManagerImpl.java:80)
    at com.atlassian.servicedesk.internal.permission.security.RequestAccessUserStrategyManagerImpl$$Lambda$4464/1869044896.test(Unknown Source)
    at java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90)
    at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
    at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
    at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:449)
    at com.atlassian.servicedesk.internal.permission.security.RequestAccessUserStrategyManagerImpl.match(RequestAccessUserStrategyManagerImpl.java:80)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl.lambda$hasAccessToRequest$1(CustomerInvolvedServiceImpl.java:67)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl$$Lambda$4462/1707960066.call(Unknown Source)
    at com.atlassian.servicedesk.internal.api.util.context.ReentrantThreadLocalBasedCodeContext.rteInvoke(ReentrantThreadLocalBasedCodeContext.java:136)
    at com.atlassian.servicedesk.internal.api.util.context.ReentrantThreadLocalBasedCodeContext.runInContext(ReentrantThreadLocalBasedCodeContext.java:59)
    at com.atlassian.servicedesk.internal.utils.context.CustomerContextServiceImpl.runInCustomerContext(CustomerContextServiceImpl.java:37)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl.hasAccessToRequest(CustomerInvolvedServiceImpl.java:63)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl.lambda$filterPermissions$0(CustomerInvolvedServiceImpl.java:49)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl$$Lambda$4461/692396876.test(Unknown Source)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl.filterPermissions(CustomerInvolvedServiceImpl.java:49)
    at com.atlassian.servicedesk.internal.permission.security.CustomerInvolvedServiceImpl.getMembersForTypes(CustomerInvolvedServiceImpl.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56)
    at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.invoke(ServiceInvoker.java:60)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invokeUnprivileged(ServiceTCCLInterceptor.java:70)
    at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invoke(ServiceTCCLInterceptor.java:53)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.eclipse.gemini.blueprint.service.importer.support.LocalBundleContextAdvice.invoke(LocalBundleContextAdvice.java:57)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:136)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:124)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy5375.getMembersForTypes(Unknown Source)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.UserSharedWithListBuilder.getSharedWithDisplayNames(UserSharedWithListBuilder.java:52)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.UserSharedWithListBuilder.buildSharedWithList(UserSharedWithListBuilder.java:41)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.RequestSharedWithRenderer.getUserSharedWithText(RequestSharedWithRenderer.java:107)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.RequestSharedWithRenderer.renderRequestSharedWith(RequestSharedWithRenderer.java:72)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.RequestSharedWithRenderer.lambda$render$1(RequestSharedWithRenderer.java:52)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.RequestSharedWithRenderer$$Lambda$4453/664510633.apply(Unknown Source)
    at io.atlassian.fugue.Either$RightProjection.map(Either.java:923)
    at io.atlassian.fugue.Either.map(Either.java:217)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.renderers.issue.RequestSharedWithRenderer.render(RequestSharedWithRenderer.java:52)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.StylingVariableReplacementManager.lambda$retrievePlaintextReplacement$5(StylingVariableReplacementManager.java:161)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.StylingVariableReplacementManager$$Lambda$12197/363084514.apply(Unknown Source)
    at io.atlassian.fugue.Either$RightProjection.flatMap(Either.java:937)
    at io.atlassian.fugue.Either.flatMap(Either.java:231)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.replacement.StylingVariableReplacementManager.retrievePlaintextReplacement(StylingVariableReplacementManager.java:161)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.StylingRendererImpl$$Lambda$12194/1827211646.apply(Unknown Source)
    at com.atlassian.servicedesk.plugins.variablesubstitution.internal.variables.styling.StylingRendererImpl.lambda$internalReplaceVariable$2(StylingRendererImpl.java:87)
    ...
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:435)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJob(CaesiumSchedulerService.java:430)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeClusteredJobWithRecoveryGuard(CaesiumSchedulerService.java:454)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:382)
    at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService$$Lambda$3783/260848765.accept(Unknown Source)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:66)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:60)
    at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:35)
    at java.lang.Thread.run(Thread.java:748)
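The two database checks described in this section (a growing count of unsent rows, and a latest job run reported as "Already running") can be combined into one small diagnostic. The sketch below is only an illustration: the function names and the looks_stuck heuristic are assumptions, and the in-memory SQLite tables stand in for the real Jira tables, reduced to the columns the queries read.

```python
import sqlite3

PENDING_SQL = (
    'SELECT COUNT(*) FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" IS NULL'
)
LAST_RUN_SQL = """
SELECT run_outcome, info_message
FROM rundetails
WHERE job_id = 'sd.custom.notification.batch.send'
ORDER BY start_time DESC
LIMIT 1
"""


def pending_notifications(conn):
    """Number of queued customer notifications that have not been sent."""
    cur = conn.cursor()
    cur.execute(PENDING_SQL)
    return cur.fetchone()[0]


def last_run(conn):
    """(run_outcome, info_message) of the latest recorded run, or None."""
    cur = conn.cursor()
    cur.execute(LAST_RUN_SQL)
    row = cur.fetchone()
    return None if row is None else (row[0], row[1])


def looks_stuck(pending_samples, run):
    """Heuristic (assumption): backlog grew at every sample and the latest
    run was reported as 'Already running'."""
    growing = len(pending_samples) >= 2 and all(
        later > earlier
        for earlier, later in zip(pending_samples, pending_samples[1:])
    )
    return growing and run is not None and run[1] == "Already running"


# Stand-in data (assumption): SQLite tables with only the relevant columns.
demo = sqlite3.connect(":memory:")
demo.execute(
    'CREATE TABLE "AO_4E8AE6_NOTIF_BATCH_QUEUE" ("ID" INTEGER, "SENT_TIME" TEXT)'
)
demo.execute(
    "CREATE TABLE rundetails "
    "(job_id TEXT, start_time TEXT, run_outcome TEXT, info_message TEXT)"
)
demo.executemany(
    "INSERT INTO rundetails VALUES ('sd.custom.notification.batch.send', ?, ?, ?)",
    [
        ("2021-05-04 09:25:49.011+02", "S", None),
        ("2021-05-04 09:51:50.519+02", "A", "Already running"),
    ],
)
samples = []
for notification_id in (1, 2, 3):
    # Each insert simulates a new notification queued while the job is stuck.
    demo.execute(
        'INSERT INTO "AO_4E8AE6_NOTIF_BATCH_QUEUE" VALUES (?, NULL)',
        (notification_id,),
    )
    samples.append(pending_notifications(demo))
print(samples, last_run(demo), looks_stuck(samples, last_run(demo)))
```

On a real instance the samples would be taken some time apart rather than in a tight loop; a backlog that merely grows during a busy period is not by itself proof that the job is stuck, which is why the heuristic also checks the latest run.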
Workaround
The workaround is to delete all the customer notifications in the table "AO_4E8AE6_NOTIF_BATCH_QUEUE" that are waiting to be sent by the stuck customer notification job, and to also delete the problematic ticket that contains the high number of participants.
Please be aware that if you apply this workaround, you will lose all the pending customer notifications as they will be deleted from the database.
The queries below have been tested on a PostgreSQL database. Their syntax might differ on other database types.
The steps are listed below:
- Look for the problematic ticket which includes the high number of participants. To identify the problematic issue:
- Run the following query to get the notification ID that the customer notification job is stuck at:
SELECT MIN ("ID") FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" is null;
- Run the following query after replacing <ID> with the ID found in the previous query:
SELECT "ISSUE_ID" FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "ID" = <ID>;
- Run the following query after replacing <ISSUE_ID> with the ISSUE_ID value returned by the previous query. This query will return the key of the problematic issue:
SELECT P.pkey||'-'||JI.issuenum FROM jiraissue JI JOIN project P ON JI.project = P.id AND JI.id = <ISSUE_ID>;
- After you have identified the problematic issue, open it in Jira and delete it from the UI
- Stop Jira
- Back up your Jira database
- Run the following DELETE query to remove all the pending customer notifications from the table AO_4E8AE6_NOTIF_BATCH_QUEUE:
delete from "AO_4E8AE6_NOTIF_BATCH_QUEUE" WHERE "SENT_TIME" is null;
- Start Jira
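The three lookup queries used above to identify the problematic issue can also be chained into a single statement. The sketch below is one possible way to do that, not a tested replacement for the manual steps: the join is assembled from the table and column names in the queries above, the function name is illustrative, and it has only been exercised against the in-memory SQLite stand-in shown here rather than a real Jira database.

```python
import sqlite3

# Oldest unsent notification -> its issue -> project key plus issue number.
BLOCKING_ISSUE_SQL = """
SELECT P.pkey || '-' || JI.issuenum
FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE" Q
JOIN jiraissue JI ON JI.id = Q."ISSUE_ID"
JOIN project P ON JI.project = P.id
WHERE Q."ID" = (
    SELECT MIN("ID")
    FROM "AO_4E8AE6_NOTIF_BATCH_QUEUE"
    WHERE "SENT_TIME" IS NULL
)
"""


def blocking_issue_key(conn):
    """Key of the issue behind the oldest unsent notification, or None."""
    cur = conn.cursor()
    cur.execute(BLOCKING_ISSUE_SQL)
    row = cur.fetchone()
    return None if row is None else row[0]


# Stand-in data (assumption): minimal SQLite versions of the three tables.
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE project (id INTEGER, pkey TEXT);
    CREATE TABLE jiraissue (id INTEGER, project INTEGER, issuenum INTEGER);
    CREATE TABLE "AO_4E8AE6_NOTIF_BATCH_QUEUE"
        ("ID" INTEGER, "ISSUE_ID" INTEGER, "SENT_TIME" TEXT);
    INSERT INTO project VALUES (10000, 'SD');
    INSERT INTO jiraissue VALUES (20001, 10000, 7), (20002, 10000, 8);
    INSERT INTO "AO_4E8AE6_NOTIF_BATCH_QUEUE" VALUES
        (1, 20002, '2021-05-04 09:25:49'),
        (2, 20001, NULL),
        (3, 20002, NULL);
    """
)
print(blocking_issue_key(demo))  # SD-7
```

The statement only reads data, so it can be run before Jira is stopped; the DELETE step still needs to be run separately, after the backup.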