  1. Confluence Data Center
  2. CONFSERVER-55028

Improve "rejectedExecution Queue Overflow" logging

Details

    Description

      Problem definition

      When Confluence generates asynchronous events faster than it can process them, the event queue backs up, typically because of a performance issue. When the queue overflows, the following is printed to the logs:

      2018-01-10 08:23:48,943 WARN [ajp-nio-48009-exec-224] [atlassian.confluence.event.MonitorableCallerRunsPolicy] rejectedExecution Queue Overflow happened: Runnable: [Invoker: com.atlassian.confluence.event.ConfluenceListenerHandlersConfiguration$TimingListenerHandler$1$1@5536459e; Event: com.atlassian.confluence.impl.content.render.xhtml.analytics.MarshallerMetricsAnalyticsEvent@10765f95]; Executor: [java.util.concurrent.ThreadPoolExecutor@82ad11a[Running, pool size = 4, active threads = 4, queued tasks = 384, completed tasks = 19456159]]
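
      To make the mechanism concrete, here is a minimal sketch of how a fixed-size pool with a bounded queue ends up invoking its rejection handler. It is not Confluence code: the class name, method name and the queue capacity of 384 are assumptions chosen to echo the numbers in the log line above (the log shows queued tasks, not the configured capacity). The handler here only counts rejections; a caller-runs style policy would instead run the rejected task on the submitting thread, which would explain why a request thread appears in the warning.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of Confluence.
class QueueOverflowDemo {

    /** Submits tasks that block until released and returns how many were rejected. */
    static int countRejections(int poolSize, int queueCapacity, int tasks) {
        AtomicInteger rejected = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);

        // core == max, so once all workers are busy and the queue is full,
        // every further submission goes to the rejection handler.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                (task, executor) -> rejected.incrementAndGet());

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulate a slow listener
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        release.countDown(); // let the stuck tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected.get();
    }

    public static void main(String[] args) {
        // 4 tasks occupy the workers, 384 fill the queue, the remaining 12 are rejected.
        System.out.println("rejected = " + countRejections(4, 384, 400));
    }
}
```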
      

      The warning itself gives no information about what is causing the queue to overflow. Normally the issue has to be diagnosed by looking at other errors logged around the same time. For example, to diagnose CONFSERVER-40860, we had to track threads that were stuck or taking a long time to complete:

      2018-01-10 08:23:48,943 WARN [http-bio-2002-exec-7] [confluence.util.profiling.DefaultActivityMonitor] close Exceeded the threshold of 60000 ms: ActivitySnapshot{startTime=1456143263766, threadId=123, threadName='http-nio-1990-exec-4', userId='charlie', type='web-request', summary='/display/SPACE/Knowledge+Base?src=contextnavpagetreemode'}
        -- url: /wiki/display/SPACE/Knowledge+Base | traceId: 10d50bd1852cf3 | userName: charlie | referer: https://charlie.atlassian.net/wiki/display/SPACE/Knowledge+Base?src=contextnavpagetreemode
      

      Suggested Solution

      It would be useful to print to the logs which types of events are in this queue, or which events are being rejected from the queue. We could then monitor the logs to see which process is causing the overflow.
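
      As an illustration of the kind of logging requested, the sketch below shows a rejection handler that summarizes the queued tasks by type before running the rejected task on the calling thread. It is an assumption-laden example, not the implementation of MonitorableCallerRunsPolicy: the class name, the summarize helper and the key function are all hypothetical. In Confluence the queued Runnable wraps the event, so a real version would need a key function that extracts the wrapped event's class.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.function.Function;

// Hypothetical handler, not Confluence's MonitorableCallerRunsPolicy.
class QueueSummaryCallerRunsPolicy implements RejectedExecutionHandler {

    private final Function<Runnable, String> keyOf;

    QueueSummaryCallerRunsPolicy(Function<Runnable, String> keyOf) {
        this.keyOf = keyOf;
    }

    /** Counts queued tasks per key (for example, per event class name). */
    static Map<String, Integer> summarize(Iterable<? extends Runnable> queued,
                                          Function<Runnable, String> keyOf) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Runnable task : queued) {
            counts.merge(keyOf.apply(task), 1, Integer::sum);
        }
        return counts;
    }

    @Override
    public void rejectedExecution(Runnable rejected, ThreadPoolExecutor executor) {
        // Iterating a BlockingQueue is weakly consistent, so this is a best-effort snapshot.
        Map<String, Integer> counts = summarize(executor.getQueue(), keyOf);
        System.err.println("Queue overflow: rejected=" + keyOf.apply(rejected)
                + ", queued by type=" + counts);

        // Caller-runs behaviour: execute the rejected task on the submitting thread.
        if (!executor.isShutdown()) {
            rejected.run();
        }
    }
}
```

      Walking the whole queue on every rejection adds work at exactly the moment the system is overloaded, so a real implementation would probably want to rate-limit this summary.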

      Notes

      We tested increasing the logging level for the package com.atlassian.confluence.impl.content.render.xhtml.analytics, but that does not provide the information we need to diagnose this issue.

      You can enable event logging by setting the class com.atlassian.confluence.event.ConfluenceEventDispatcher to DEBUG (under General configuration > Logging and profiling), but this logs only processed events. Events that are stuck in the queue or never processed are not written to the logs.
      Note that this is very verbose and the logs will fill up very quickly.

            People

              03cb0c04aa4f Irina Tiapchenko
              dponzio Daniel Ponzio
              Votes: 187
              Watchers: 143