Uploaded image for project: 'Confluence Data Center'
  1. Confluence Data Center
  2. CONFSERVER-59837

Processing drop-shadow image effect on certain images causes long JVM safepoint pauses

      Issue Summary

      When adding an image to a page and applying an image effect, if you choose drop-shadow the ImageFilterEffect thread will become long-running and cause a JVM pause while it's attempting to go to safepoint for garbage collection or other VM tasks. 

      Pauses can be an upwards of 35 seconds, which is long enough to cause a node ejection in Data Center environments if the Hazelcast heartbeat is set to the default interval of 30 seconds or less. Other pauses can still be 9-24 seconds, during which the Confluence application would be totally inaccessible.

      Steps to Reproduce

      1. Create a new page and upload the 1MB-small-image.png file (see below), which is under 1MB in size
      2. Hover your mouse over the inserted image and select "Properties"
      3. Choose the "Drop Shadow" effect and click OK
      4. You will then observe a long pause while it processes this small image, during which time the JVM itself is paused

      A similar long pause can be induced by using very large files also, (must be less than 16MB in size, which is the ImageFilterEffect plugin's default threshold). See 14MB-large-image.png as an example. 

      Only certain small files can induce the long pauses, as many less than 1MB images process in under 3 seconds. This is likely has more to do with the pixel count than it does with the size of the image, as the 3MB-small-image-fast-processing.png example file (see below) will process faster than the 1MB example file - the difference being the pixel count. 

      Expected Results

      The image will be displayed on the page with the image effect applied to it, and the whole process will complete in 2-3 seconds max.

      Actual Results

      The JVM will pause for upwards of 9-35 seconds while processing the image effect due to the rush to safepoint being delayed by the ImageFilterEffect thread - it is suspected that ImageFilterEffect thread is executing native code while processing images, in which case the thread would ignore the call to safepoint until it's completed processing.

      The following safepoint statistics output will be shown in the catalina.out file:

      # SafepointSynchronize::begin: Timeout detected:
      # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
      # SafepointSynchronize::begin: Threads which did not reach the safepoint:
      # "ImageFilterEffect:thread-4" #284 daemon prio=5 os_prio=0 tid=0x00007ffa080d4000 nid=0x1059 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE
      
      # SafepointSynchronize::begin: (End of list)
               vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
      1119.475: CGC_Operation                    [     173          1              1    ]      [ 14748     0 14749     0    65    ]  1
               vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
      1134.291: CGC_Operation                    [     173          1              3    ]      [  5667     0  5667     0     3    ]  1
      

      The following system properties must be added to see the output above:

      -XX:+UnlockCommercialFeatures 
      -XX:+PrintSafepointStatistics 
      -XX:PrintSafepointStatisticsCount=1 
      -XX:+SafepointTimeout 
      -XX:SafepointTimeoutDelay=20000 
      

      The atlassian-confluence.log file will show the following:

      2020-05-15 10:27:33,688 DEBUG [http-nio-8090-exec-10] [confluence.image.effects.ImageFilterServlet] doGetInTransaction Doing a transform
      2020-05-15 10:27:33,691 DEBUG [http-nio-8090-exec-10] [confluence.image.effects.DefaultImageEffectsClientSupplier] buildServiceUri Attempting to parse: null
      2020-05-15 10:27:33,712 DEBUG [ImageFilterEffect:thread-1] [confluence.image.effects.ImageFilterTask] processInMemory Processing transformation in memory
      2020-05-15 10:27:43,711 WARN [http-nio-8090-exec-10] [confluence.image.effects.ImageFilterServlet] transformAndRespond Timed out while doing the transform
       -- referer: http://10.125.91.51:8090/display/DRAG/Image+page+3 | url: /download/attachments/1540097/Bangkok_Thailand_2009-14MB.png | traceId: d99c9c2c222bca9b | userName: admin
      java.util.concurrent.TimeoutException
      	at java.util.concurrent.FutureTask.get(FutureTask.java:205)
      	at com.atlassian.confluence.image.effects.ImageFilterServlet.transformAndRespond(ImageFilterServlet.java:331)
      	at com.atlassian.confluence.image.effects.ImageFilterServlet.writeTransformToResponse(ImageFilterServlet.java:203)
      	at com.atlassian.confluence.image.effects.ImageFilterServlet.doGetInTransaction(ImageFilterServlet.java:162)
      	at com.atlassian.confluence.image.effects.ImageFilterServlet.lambda$doGet$0(ImageFilterServlet.java:139)
      ..truncated....
      2020-05-15 10:27:43,715 DEBUG [http-nio-8090-exec-10] [confluence.image.effects.ImageFilterServlet] writeTransformToResponse Sending back the original
      

      The following logging package must be set to DEBUG to see the output above:

      com.atlassian.confluence.image	
      

      Workarounds

      1. Disable the ImageEffectsPlugin system app in General Configuration -> Manage Apps -> System apps
      2. Do not use the "Drop-Shadow" effect

      Limit the impact of Image Effects on Confluence describes a process to place an upper limit of 1MB on the images it will process and 4000x4000 pixels, however the 1MB-small-image.png file (see below) is an example of a file slightly under 1MB and slightly under 4000x4000 pixels which still has this same problem, so limiting the images by file size or pixel size may not fully circumvent this issue unless you set the limiting threshold very low. 

      Sample Images

      14MB file 1MB file 3MB file 
      • high pixel count
      • large size
      • is slow processing
      • small size
      • high pixel count
      • is slow processing
      • low pixel count
      • size is larger than the 1MB slow-processing example
      • is fast processing

        1. 14MB-large-image.png
          14MB-large-image.png
          13.28 MB
        2. 1MB-small-image.png
          1MB-small-image.png
          864 kB
        3. 3MB-small-image-fast-processing.png
          3MB-small-image-fast-processing.png
          2.95 MB

          Form Name

            [CONFSERVER-59837] Processing drop-shadow image effect on certain images causes long JVM safepoint pauses

            This bug seems to be regressing back in version 7.4.5. We have a test environment running on version 7.4.4 where this problem is indeed fixed but our actual Production is running on version 7.4.5 where the problem unfortunately still persists. We tried the workaround to disable ImageEffectsPlugin system app and that seems to help at the moment.

            Soumya Shukla added a comment - This bug seems to be regressing back in version 7.4.5. We have a test environment running on version 7.4.4 where this problem is indeed fixed but our actual Production is running on version 7.4.5 where the problem unfortunately still persists. We tried the workaround to disable  ImageEffectsPlugin  system app and that seems to help at the moment.

            davidyu2 I guess we can close that issue then as done

            Jen Marfell (Inactive) added a comment - davidyu2 I guess we can close that issue then as done

            David Yu added a comment -

            Thanks for fixing this critical issue. I reported this serious issue last year here but guess that comment was lost.

            David Yu added a comment - Thanks for fixing this critical issue. I reported this serious issue last year here but guess that comment was lost.

            Tam Tran added a comment -

            If you're running the Confluence 7.4 Enterprise release, a fix for this issue is now available in Confluence 7.4.4, which you can find in the Download Archives

            Tam Tran added a comment - If you're running the Confluence 7.4 Enterprise release, a fix for this issue is now available in Confluence 7.4.4, which you can find in the Download Archives

            Ah yes it is. Sorry about that. I updated the FixVersion to include the backport version.

            Jen Marfell (Inactive) added a comment - Ah yes it is. Sorry about that. I updated the FixVersion to include the backport version.

            Hey Jen

            "You can't view this issue

            It may have been deleted or you don't have permission to view it."

            Isn't the CONFSERVDEV project internal only?

             

            Craig Castle-Mead added a comment - Hey Jen "You can't view this issue It may have been deleted or you don't have permission to view it." Isn't the CONFSERVDEV project internal only?  

            craig.castlemead1 7.4.4

            Please see https://jira.atlassian.com/browse/CONFSRVDEV-15007 for the corresponding backport ticket

            Jen Marfell (Inactive) added a comment - craig.castlemead1 7.4.4 Please see https://jira.atlassian.com/browse/CONFSRVDEV-15007 for the corresponding backport ticket

            Can you confirm what LTS version this is being back-ported to please?

            Craig Castle-Mead added a comment - Can you confirm what LTS version this is being back-ported to please?

            A fix for this issue is available to Server and Data Center customers in Confluence 7.7.2
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Xinyi Xu (Inactive) added a comment - A fix for this issue is available to Server and Data Center customers in Confluence 7.7.2 Upgrade now or check out the Release Notes  to see what other issues are resolved.

              309dbaf64012 Javan (Inactive)
              asmith4@atlassian.com Andrew S
              Affected customers:
              7 This affects my team
              Watchers:
              22 Start watching this issue

                Created:
                Updated:
                Resolved: