CONFCLOUD-39681

Confluence Cloud stops responding after viewing a page with many JIRA issue tables

Description

    NOTE: This bug report is for Confluence Cloud. Using Confluence Server? See the corresponding bug report.

    Viewing a page with 16 or more independent JIRA table macros can cause Confluence Cloud to stop responding for two minutes because of database connection contention/starvation. The cause is a change to the macro, deployed on 26 Oct, which loads tables of JIRA issues via AJAX rather than including them in the page rendering.

    Steps to reproduce

    1. Set up an instance (e.g. Cloud) with 16 database connections, and at least 50 HTTP threads.
    2. Create a page with 16+ JIRA table macros, all with different data.
    3. View the page.
    4. The page will load, but subsequent requests will not load until the JIRA macros time out after 2 minutes. A thread dump taken on the instance will show the "Jira macros executor" threads blocked waiting for a database connection:
    "Jira macros executor:thread-1" #163 daemon prio=5 os_prio=0 tid=0x0e5fd800 nid=0x338c in Object.wait() [0x2a388000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
    	at java.lang.Object.wait(Native Method)
    	at org.apache.tomcat.dbcp.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1123)
    	- locked <0x861964a0> (a org.apache.tomcat.dbcp.pool.impl.GenericObjectPool$Latch)
    

    When all the database connections are leased out to request threads, every JIRA table AJAX request blocks on a JIRA macro executor thread, which is itself blocked waiting for a database connection. The result is database connection starvation for all requests until the macro executor threads time out (2 minutes).
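
    The starvation described above can be sketched with a toy model. This is not Confluence code: the function name, the semaphore standing in for the connection pool, and the 0.2 second timeout (in place of the real 2 minutes) are all assumptions made for illustration. Each request thread is assumed to hold one connection while its macro task, running on a separate executor thread, tries to borrow another.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def macros_served(request_threads, db_connections, macro_timeout=0.2):
    """Toy model: count macro tasks that manage to borrow a connection.

    Hypothetical helper, not part of Confluence. A semaphore stands in
    for the database connection pool.
    """
    pool = threading.Semaphore(db_connections)

    # Worst case: every request thread leases a connection first and
    # keeps it while waiting for its macro task to finish.
    held = 0
    for _ in range(request_threads):
        if pool.acquire(blocking=False):
            held += 1
    if held == 0:
        return 0

    def macro_task():
        # The executor thread needs its own connection; give up after
        # the timeout, as the macro executor does after 2 minutes.
        if not pool.acquire(timeout=macro_timeout):
            return False
        pool.release()
        return True

    with ThreadPoolExecutor(max_workers=held) as executor:
        results = list(executor.map(lambda _: macro_task(), range(held)))

    # Requests finish only after their macro tasks return or time out.
    for _ in range(held):
        pool.release()
    return sum(results)


# 50 HTTP threads against 16 connections, as in the steps to reproduce:
print(macros_served(50, 16))  # 0
# A hypothetical instance with fewer request threads than connections:
print(macros_served(8, 16))   # 8
```

    In the first case the request threads hold every connection, so no macro task can borrow one until the timeout expires; in the second, spare connections remain and every macro task completes.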

    Workarounds

    There are two possible workarounds, which can be and have been implemented by our support team upon request:

    1. Disable the JIRA macro
    2. Change the HTTP thread pool and database connection pool sizes from 200/16 to 16/24 (HTTP threads/database connections) and restart the instance.
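
    One way to read the pool sizes in workaround 2 is as worst-case headroom for the macro executor threads. The arithmetic below is a sketch under the assumption that each HTTP request thread holds at most one database connection at a time; the function name is hypothetical and the model ignores any other consumers of the pool.

```python
def executor_headroom(http_threads, db_connections):
    """Connections left over when every request thread holds one.

    Hypothetical helper; assumes one connection per request thread.
    """
    held_by_requests = min(http_threads, db_connections)
    return db_connections - held_by_requests


print(executor_headroom(200, 16))  # 0
print(executor_headroom(16, 24))   # 8
```

    Under that assumption, the original 200/16 sizing can leave nothing for the executor threads, while 16/24 always leaves some connections that request threads cannot occupy.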

    Over the last week, Atlassian has been proactively monitoring customer logs for instances of the problem and applying workaround #2 to the affected customer instances.

    Fix

    The Atlassian team is currently rolling out a fix to Confluence Cloud. The fix consists of an updated version of the JIRA macro that reverts the new AJAX functionality and therefore does not exhibit this problem. This rollout will be completed on Wed 4 Nov 2015, with a short outage for Confluence at the appropriate maintenance time for each geo.

    Note that prior to 26 Oct, this macro often caused individual page timeouts on pages containing 10+ JIRA macros, and this less-than-ideal behaviour will recur after the macro is reverted. However, these timeouts only affected individual pages and not other requests to the system.

    If you continue to experience problems with the JIRA macro or pages timing out in your instance, please raise a support case and we will investigate further.

    Support notes

    Support team, for proper problem diagnosis, please capture:

    • thread dumps while the instance is non-responsive, and
    • the content of affected pages.

    If the problem is caused by this bug, the thread dump will show the JIRA macro processing threads blocked as described above. Other problems can exhibit the same symptoms of non-responsiveness, and it's important that we get the thread dump so we can fix all the root causes that result in this outcome.
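
    As a rough aid for checking a captured thread dump, the sketch below scans the text for "Jira macros executor" threads whose stack includes `GenericObjectPool.borrowObject`. The function name and the second thread in the sample are hypothetical; the first thread is taken from the dump in the description. It only does plain text matching and is no substitute for reading the dump.

```python
import re

# A thread header line starts with the quoted thread name.
HEADER = re.compile(r'^\s*"(?P<name>[^"]+)"')


def blocked_macro_threads(dump_text,
                          name_prefix="Jira macros executor",
                          frame="GenericObjectPool.borrowObject"):
    """Return names of macro executor threads waiting in the pool."""
    blocked = []
    current = None
    for line in dump_text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group("name")
            continue
        if (current and current.startswith(name_prefix)
                and frame in line and current not in blocked):
            blocked.append(current)
    return blocked


# First thread is from the bug description; the second is made up.
SAMPLE = '''"Jira macros executor:thread-1" #163 daemon prio=5 os_prio=0 tid=0x0e5fd800 nid=0x338c in Object.wait() [0x2a388000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.tomcat.dbcp.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1123)
"example-request-thread" #42 daemon prio=5
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
'''

print(blocked_macro_threads(SAMPLE))
```

    A non-empty result is consistent with this bug; an empty result on a non-responsive instance points to one of the other root causes.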

          People

            hmurn metrics
            vchin Vincent Chin (Inactive)
            Votes: 84
            Watchers: 97
