  1. Confluence Data Center
  2. CONFSERVER-79529

Scheduled job com.atlassian.confluence.plugins.confluence-edge-index:flushEdgeIndexQueueJob#flushEdgeIndexQueue failed to run

    • Type: Bug
    • Resolution: Fixed
    • Priority: Highest
    • Fix Version/s: 7.19.0
    • Affects Version/s: 7.17.1, 7.17.4, 7.17.5, 7.18.1, 7.18.2, 7.18.3
    • Component/s: Search - Indexing
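The version numbers in the fields above can be turned into a quick check, for example when auditing several instances. This is only a sketch: the function names are hypothetical, the affected list is copied from this issue, and treating every version at or above 7.19.0 as fixed is an assumption rather than something the issue states for later releases.

```python
# Versions listed in this issue; anything else is reported as "not listed",
# which does NOT mean the version is known to be safe.
AFFECTED = {"7.17.1", "7.17.4", "7.17.5", "7.18.1", "7.18.2", "7.18.3"}
FIXED_IN = (7, 19, 0)  # fix version named in this issue


def parse_version(text):
    """Turn a dotted version string such as '7.18.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def issue_status(version):
    """Classify a Confluence version against the data in this issue."""
    # Assumption: releases at or after the fix version keep the fix.
    if parse_version(version) >= FIXED_IN:
        return "fixed"
    if version in AFFECTED:
        return "listed as affected"
    return "not listed"


if __name__ == "__main__":
    for v in ("7.17.4", "7.18.3", "7.19.0"):
        print(v, "->", issue_status(v))
```

A version reported as "not listed" may still be affected; the issue only enumerates versions where the problem was confirmed.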

      Issue Summary

      The Flush Edge Index Queue job fails with an error. Over time, this causes the instance to run out of memory.

      Steps to Reproduce

      1. User A creates a page
      2. User B likes the page
      3. User A changes page permission - view and edit for User A only
      4. Wait for the Flush Edge Index Queue job to run
      5. The error appears in the logs, but the instance needs to run for a long time before GC activity spikes

      Expected Results

      The error should be thrown, and it should not cause the instance to run out of memory (OOM)

      Actual Results

      2022-07-15 11:03:47,472 ERROR [Caesium-1-1] [impl.schedule.caesium.JobRunnerWrapper] runJob Scheduled job com.atlassian.confluence.plugins.confluence-edge-index:flushEdgeIndexQueueJob#flushEdgeIndexQueue failed to run
      com.atlassian.bonnie.LuceneException: org.springframework.transaction.UnexpectedRollbackException: Transaction rolled back because it has been marked as rollback-only
      

      As soon as the first entry is recorded, the heap utilization increases.
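To correlate the heap growth with the first failure, it can help to pull the timestamps of these entries out of the application log. The following is a minimal sketch, assuming the log lines follow the layout of the entry quoted above; the function name `find_failures` is hypothetical, and the job key is copied from this issue.

```python
import re
from datetime import datetime

# Job key copied from the log entry quoted in this issue.
JOB_KEY = (
    "com.atlassian.confluence.plugins.confluence-edge-index:"
    "flushEdgeIndexQueueJob#flushEdgeIndexQueue"
)

# Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm ERROR [thread] ... message".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR ")


def find_failures(lines):
    """Return timestamps of ERROR lines reporting that the job failed to run."""
    failures = []
    for line in lines:
        text = line.strip()
        match = LINE_RE.match(text)
        if match and JOB_KEY in text and "failed to run" in text:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            failures.append(stamp)
    return failures


if __name__ == "__main__":
    sample = [
        "2022-07-15 11:03:47,472 ERROR [Caesium-1-1] "
        "[impl.schedule.caesium.JobRunnerWrapper] runJob Scheduled job "
        + JOB_KEY + " failed to run",
        "com.atlassian.bonnie.LuceneException: "
        "org.springframework.transaction.UnexpectedRollbackException: "
        "Transaction rolled back because it has been marked as rollback-only",
    ]
    for stamp in find_failures(sample):
        print("job failure at", stamp.isoformat())
```

To scan a real log, pass an open file object instead of the sample list. The earliest timestamp returned can then be compared against heap or GC graphs.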

      Workaround

      Rebuild the index from scratch. The issue can recur after some time.

      Note:

      Disabling the Flush Edge Index Queue job stops the error in the logs but heap utilization doesn't drop

            David Moreno added a comment -

            This is solved in 7.19.0, Frank, so I think the recommendation from Atlassian is that you upgrade to that version. We are doing it this week after testing it in our Labs environments.

            Frank Wagner added a comment -

            Is there a fix planned for 7.17.x? And when?

            And please add this to the release notes / upgrade notes for all affected versions, as already mentioned by others.

            James Ponting added a comment -

            Hi All,

            A fix for this issue is available in Confluence Server and Data Center 7.19.0. Upgrade now or check out the Release Notes to see what other issues are resolved.

            For those on Confluence 7.18.1 and later that have this issue, the upgrade to Confluence 7.19.0 contains a collection of bug fixes, but no new features. You can review the changes at the links above.

            For customers on Confluence 7.17.1-7.17.x, the change log for Confluence 7.18 can be found here and can be combined with the release notes above to understand the changes in this upgrade.

            If you're encountering this issue, we would encourage you to upgrade to Confluence 7.19.0 as soon as you're able.

            Thanks,
            James Ponting
            Engineering Manager - Confluence Data Center

            StephanieC added a comment -

            Dan - We had to up our server memory and heap size to ginormous proportions to overcome issues, and we still have an occasional timeout error. Hopefully with the bug fix we can scale back

            Dan Aussem added a comment -

            Is there any anticipated timeline toward resolution of this bug?  We are desperately trying to figure out a plan to prevent 502 errors from recurring and disrupting our business.

            Bhupesh Nagda added a comment -

            I really wish this had been mentioned under known issues or somewhere in the release notes for 7.18.2:

            https://confluence.atlassian.com/doc/confluence-7-18-release-notes-1115677302.html

            I would not have upgraded to this version in that case. We upgraded because of a vulnerability, and now we are stuck with this bug.

            Eagerly waiting for the fix!

            Dan Aussem added a comment -

            This is very disruptive to our organization, a Fortune 100 company with over 10000 Confluence users.  Please consider my "This affects my team" as counting 10000 times.

              rlau@atlassian.com Richard Lau
              c84ad05f80fa Puventhira Mannen
              Affected customers: 30
              Watchers: 71
