[CONFSERVER-57915] Repair ancestor table scheduled job DB transaction is stuck

      Description

      From Confluence 6.14.x onward, a scheduled job named "Repair the Ancestors Table" was introduced. As the name implies, it searches the ancestor table for corrupt entries and corrects them at 3 AM each day.

      Expected results

      The scheduled job should run, check the data, apply any corrections, and then close itself.

      Actual result

      The scheduled job runs, but it does not close itself and remains in memory as an open transaction.

      How to replicate and behavior observed

      Install Confluence 6.14 or newer. Go to Confluence administration -> Scheduled Jobs and manually run "Repair the Ancestors Table", or wait for it to run at 3 AM.

      On a Microsoft SQL Server (MSSQL) database, run the following query to list sessions with open transactions:

      select * from sys.dm_exec_sessions where open_transaction_count > 0
      

      This transaction is not released unless it is killed manually.
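
      As a rough sketch of what "killed manually" could look like on SQL Server, the query below narrows the list to idle sessions that still hold a transaction. The session id passed to KILL is a placeholder, not a value from this report, and the session should be confirmed as belonging to Confluence before it is ended.

      ```sql
      -- List idle sessions that still hold an open transaction, oldest first.
      SELECT session_id, login_name, host_name, program_name,
             status, last_request_end_time, open_transaction_count
      FROM sys.dm_exec_sessions
      WHERE open_transaction_count > 0
        AND status = 'sleeping'
      ORDER BY last_request_end_time;

      -- End the session once it has been identified.
      -- 57 is a placeholder session_id (assumption); replace it with the value returned above.
      KILL 57;
      ```

      Killing the session rolls back whatever the transaction left uncommitted, so this is best done outside the job's scheduled window.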

      The same result was observed on a MySQL database: the transaction was still open after the scheduled job ended. The following query shows the open transaction:

      SELECT * FROM information_schema.innodb_trx;
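
      A possible follow-up on MySQL, sketched under the assumption that the stuck transaction is the oldest entry: select only the columns needed to identify it, then end the owning connection. The thread id passed to KILL is a placeholder.

      ```sql
      -- Show open InnoDB transactions, oldest first, with the connection that owns each one.
      SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
      FROM information_schema.innodb_trx
      ORDER BY trx_started;

      -- End the owning connection; MySQL rolls back its open transaction.
      -- 1234 is a placeholder for trx_mysql_thread_id (assumption).
      KILL 1234;
      ```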
      

      The other supported databases, PostgreSQL and Oracle, are expected to be affected as well.

      With SQL logging enabled, the following appears in the Confluence logs:
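
      The report does not include a PostgreSQL check. If the same behavior occurs there, a query along these lines should surface it, on the assumption that the stuck session shows up as "idle in transaction". The pid passed to pg_terminate_backend is a placeholder.

      ```sql
      -- Sessions that opened a transaction and are now idle without committing.
      SELECT pid, usename, application_name, state, xact_start, state_change, query
      FROM pg_stat_activity
      WHERE state = 'idle in transaction'
      ORDER BY xact_start;

      -- Terminate the backend once it has been identified.
      -- 12345 is a placeholder pid (assumption); replace it with the value returned above.
      SELECT pg_terminate_backend(12345);
      ```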

      2019-02-15 19:46:29,779 TRACE [Caesium-1-2] [type.descriptor.sql.BasicBinder] bind binding parameter [1] as [VARCHAR] - [AncestorsRepairJob-runNow-1550267189523]
      2019-02-15 19:46:29,780 TRACE [Caesium-1-2] [type.descriptor.sql.BasicBinder] bind binding parameter [2] as [TIMESTAMP] - [Fri Feb 15 19:46:29 BRST 2019]
      2019-02-15 19:46:29,780 TRACE [Caesium-1-2] [type.descriptor.sql.BasicBinder] bind binding parameter [3] as [BIGINT] - [205]
      2019-02-15 19:46:29,780 TRACE [Caesium-1-2] [type.descriptor.sql.BasicBinder] bind binding parameter [4] as [CHAR] - [S]
      2019-02-15 19:46:29,781 TRACE [Caesium-1-2] [type.descriptor.sql.BasicBinder] bind binding parameter [5] as [VARCHAR] - [Ancestors were repaired successfully]
      
      • Another behavior observed in a customer environment, which may increase the severity of this bug: the open transaction consumed disk space. In one case it consumed 1 GB per day and reached 6.4 GB over a week, so this bug could cause the server to run out of disk space if the transaction is not killed.

      Suggested Workaround

      Disable the "Repair the Ancestors Table" scheduled job so that it does not run at 3 AM, or kill the transaction manually in the database.


            Said added a comment - - edited

            ttheb, this fix is in 6.13.6 and above in the 6.13.x series. The fix version list has been updated.


            Tim added a comment -

            Backport to 6.13.x Enterprise release?


            Ganesh Gautam added a comment -

            A fix for this issue is available to Server and Data Center customers in Confluence 6.14.2.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            kk added a comment - - edited

            This issue prevents reindexing the database, at least with PostgreSQL: once reindex database <...> is issued, it hangs forever because this scheduled job does not release its locks, which in turn blocks Confluence.


              glipatov George Lipatov
              rgadami Rodrigo Girardi Adami