Confluence Data Center / CONFSERVER-90918

The retention job fails to delete all items in one batch if one of the items in the batch cannot be deleted due to some error

      When retention rules are applied and the data has integrity issues (for example, pages that cannot be purged individually from the Confluence UI because of database constraints or similar errors), the retention job also fails to delete those pages. The problem is that the whole batch containing such a page (default: 100 items per batch) cannot be deleted because of that one record.

      If the problematic page is fetched into a batch every time the offset moves forward, the retention job gets stuck and cannot proceed with deleting further pages.
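
The effect described above can be illustrated with a small simulation. This is only a sketch, not Confluence code: the class name, the method `purgeInBatches`, and the numeric ids are all made up for illustration, and each batch simply stands in for one all-or-nothing transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class AllOrNothingBatchSketch {

    /**
     * Simulates one purge run (hypothetical helper, not a Confluence API).
     * Each batch stands in for one transaction: if any item in the batch
     * cannot be deleted, nothing in that batch is deleted.
     * Returns the ids that are still in the trash afterwards.
     */
    public static List<Integer> purgeInBatches(List<Integer> trash, Set<Integer> undeletable, int batchSize) {
        List<Integer> remaining = new ArrayList<>();
        for (int start = 0; start < trash.size(); start += batchSize) {
            List<Integer> batch = trash.subList(start, Math.min(start + batchSize, trash.size()));
            boolean batchFails = batch.stream().anyMatch(undeletable::contains);
            if (batchFails) {
                // Simulated rollback: every item of the batch stays in the trash.
                remaining.addAll(batch);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        // 20,000 trashed items with made-up ids 1..20000.
        List<Integer> trash = new ArrayList<>();
        for (int id = 1; id <= 20_000; id++) {
            trash.add(id);
        }
        // Three items that can never be deleted (illustrative ids only).
        Set<Integer> undeletable = Set.of(2_500, 6_500, 18_500);

        List<Integer> left = purgeInBatches(trash, undeletable, 100);
        System.out.println(left.size()); // prints 300
    }
}
```

With a batch size of 100, each undeletable item takes 99 deletable neighbours down with it, so 3 bad records leave 300 items behind.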

      Issue Summary

      This is reproducible on Data Center: yes

      Steps to Reproduce

      1. Spin up a Confluence instance
      2. Disable the 'Trash Removal (Soft)' job under General Configuration > Scheduled Jobs
      3. Import this Space backup: trashdata-Confluence-space-export-003654-2.xml.zip
        There are 3 pages that have data integrity issues: asd-2500, asd-6500, asd-18500. Their data integrity issues were created manually and intentionally by manipulating data on the database side.
      4. Under General Configuration > Logging and Profiling, add the package below at DEBUG level:
        com.atlassian.confluence.impl.retention
        
      5. Navigate to the 'Trash' space (imported in step 3) and from there go to Space tools > Content Tools > Retention rules
        • Click 'Edit'
        • Select 'Use retention rules defined in this space' from the top dropdown
        • Select 'Keep by deleted date' from the dropdown below the Trash header
        • Enter '5' in the textbox and select 'Days' from the dropdown.
        • Save the retention rules
      6. Go back to General Configuration > Scheduled Jobs and trigger 'Trash Removal (Hard)' by clicking 'Run'
      7. Observe the application logs and check the deleted item count with the SQL query below:
        SELECT COUNT(CONTENTID)
        FROM CONTENT c
        WHERE c.CONTENT_STATUS = 'deleted';
        

        There are 20k deleted items, and by the end of the run the rule should have deleted all of them.

      Expected Results

      All 20k items in the trash should be deleted without issues, or at least all 19,997 pages other than the 3 pages with data integrity issues should be deleted.

      In the latter case, the job finishes with 3 records left, which cannot be deleted due to constraint issues.
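
One way to get the expected outcome is to retry a failed batch item by item and skip only the records that still fail. The following is a minimal sketch under that assumption; it is not the fix shipped by Atlassian, and `purgeWithFallback`, the `Consumer` standing in for "delete these ids in one transaction", and the numeric ids are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PerItemFallbackSketch {

    /**
     * Purges ids batch by batch (hypothetical helper, not a Confluence API).
     * deleteInOneTransaction is assumed to delete all given ids atomically
     * and to throw a RuntimeException (leaving nothing deleted) on any error.
     * Returns the ids that could not be deleted even on their own.
     */
    public static List<Long> purgeWithFallback(List<Long> ids, int batchSize,
                                               Consumer<List<Long>> deleteInOneTransaction) {
        List<Long> skipped = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            List<Long> batch = ids.subList(start, Math.min(start + batchSize, ids.size()));
            try {
                deleteInOneTransaction.accept(batch);
            } catch (RuntimeException batchError) {
                // The batch was rolled back as a whole: retry each item in isolation.
                for (Long id : batch) {
                    try {
                        deleteInOneTransaction.accept(List.of(id));
                    } catch (RuntimeException itemError) {
                        skipped.add(id); // only the genuinely broken record is left behind
                    }
                }
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Illustrative ids only; the fake delete has no side effects and just
        // throws when a batch contains an undeletable id.
        List<Long> undeletable = List.of(2_500L, 6_500L, 18_500L);
        Consumer<List<Long>> fakeDelete = batch -> {
            for (Long id : batch) {
                if (undeletable.contains(id)) {
                    throw new IllegalStateException("constraint violation for " + id);
                }
            }
        };

        List<Long> trash = new ArrayList<>();
        for (long id = 1; id <= 20_000; id++) {
            trash.add(id);
        }

        List<Long> skipped = purgeWithFallback(trash, 100, fakeDelete);
        System.out.println((trash.size() - skipped.size()) + " deleted, skipped " + skipped);
        // prints: 19997 deleted, skipped [2500, 6500, 18500]
    }
}
```

The fallback costs extra transactions only for batches that fail, so the common case keeps the efficiency of batching.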

      Actual Results

      Job finishes with 300 records left to be deleted (three failed batches of 100 items each).

      The below exception is logged in the atlassian-confluence.log file:

      2023-08-22 00:56:05,536 ERROR [Caesium-1-2] [engine.jdbc.spi.SqlExceptionHelper] logExceptions ERROR: update or delete on table "content" violates foreign key constraint "fk_notifications_content" on table "notifications"
        Detail: Key (contentid)=(111106) is still referenced from table "notifications".
       -- url: /c7195/setup/setupdata.action | traceId: 72e45ffdddf5ffd8 | userName: anonymous | action: setupdata
      2023-08-22 00:56:05,536 ERROR [Caesium-1-2] [core.persistence.hibernate.HibernateObjectDao] unIndex Unable to index object: page: asd-6500 v.1 (111106) -- could not execute statement; SQL [n/a]; constraint [fk_notifications_content]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute statement
       -- url: /c7195/setup/setupdata.action | traceId: 72e45ffdddf5ffd8 | userName: anonymous | action: setupdata
      org.springframework.dao.DataIntegrityViolationException: could not execute statement; SQL [n/a]; constraint [fk_notifications_content]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute statement
      	[...]
      	at com.atlassian.confluence.pages.AbstractPage.remove(AbstractPage.java:70)
      	at com.atlassian.confluence.pages.Page.remove(Page.java:231)
      	at com.atlassian.confluence.pages.DefaultTrashManager.deleteContentEntity(DefaultTrashManager.java:236)
      	at com.atlassian.confluence.pages.DefaultTrashManager.lambda$purge$0(DefaultTrashManager.java:188)
      	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
      	at com.atlassian.confluence.pages.DefaultTrashManager.purge(DefaultTrashManager.java:188)
      	[...]
      	at com.atlassian.confluence.impl.retention.manager.DefaultTrashRemovalManager.deleteForRule(DefaultTrashRemovalManager.java:146)
      	at com.atlassian.confluence.impl.retention.manager.DefaultTrashRemovalManager.lambda$cleanupTrashedEntities$5(DefaultTrashRemovalManager.java:164)
      	at com.atlassian.confluence.impl.retention.analytics.TrashRemovalStatisticThreadLocal.withStatistic(TrashRemovalStatisticThreadLocal.java:23)
      	at com.atlassian.confluence.impl.retention.manager.DefaultTrashRemovalManager.cleanupTrashedEntities(DefaultTrashRemovalManager.java:164)
      	at com.atlassian.confluence.impl.retention.manager.DefaultTrashRemovalManager.lambda$hardRemove$1(DefaultTrashRemovalManager.java:107)
      	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
      	at com.atlassian.confluence.impl.retention.manager.DefaultTrashRemovalManager.hardRemove(DefaultTrashRemovalManager.java:105)
      	at com.atlassian.confluence.impl.retention.schedule.TrashHardRemovalScheduledJob.runJob(TrashHardRemovalScheduledJob.java:39)
      	at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.doRunJob(JobRunnerWrapper.java:117)
      	[...]
      Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "content" violates foreign key constraint "fk_notifications_content" on table "notifications"
        Detail: Key (contentid)=(111106) is still referenced from table "notifications".
      [...]
      2023-08-22 00:56:05,539 ERROR [Caesium-1-2] [engine.jdbc.spi.SqlExceptionHelper] logExceptions ERROR: current transaction is aborted, commands ignored until end of transaction block
       -- url: /c7195/setup/setupdata.action | traceId: 72e45ffdddf5ffd8 | userName: anonymous | action: setupdata
      2023-08-22 00:56:05,541 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=111070, limit=100
       -- url: /c7195/setup/setupdata.action | traceId: 72e45ffdddf5ffd8 | userName: anonymous | action: setupdata
      	[...]
      
      > grep "Error purging trash for batch offset" atlassian-confluence.log
      2023-08-22 00:55:11,780 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=108770, limit=100
      2023-08-22 00:56:05,541 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=111070, limit=100
      2023-08-22 00:57:55,989 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=115570, limit=100
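
The number of records left behind can be estimated from these WARN lines by summing the `limit` of every failed batch. Below is a rough sketch of that arithmetic; the class and method names are invented for illustration, and the regular expression assumes the exact message format shown in the excerpt above.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FailedBatchLogSketch {

    // Matches the WARN message format seen in the log excerpt of this report.
    private static final Pattern FAILED_BATCH =
            Pattern.compile("Error purging trash for batch offset=(\\d+), limit=(\\d+)");

    /**
     * Sums the batch limits of all failed batches found in the given lines.
     * This is an upper bound on the records left behind, because the last
     * batch of a run may contain fewer items than its limit.
     */
    public static int recordsLeftBehind(List<String> logLines) {
        int total = 0;
        for (String line : logLines) {
            Matcher m = FAILED_BATCH.matcher(line);
            if (m.find()) {
                total += Integer.parseInt(m.group(2));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The three WARN lines from the grep output in this report.
        List<String> lines = List.of(
                "2023-08-22 00:55:11,780 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=108770, limit=100",
                "2023-08-22 00:56:05,541 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=111070, limit=100",
                "2023-08-22 00:57:55,989 WARN [Caesium-1-2] [impl.retention.manager.DefaultTrashRemovalManager] hardRemove Error purging trash for batch offset=115570, limit=100");
        System.out.println(recordsLeftBehind(lines)); // prints 300
    }
}
```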
      

      Workaround

      Delete the problematic pages from the database manually using the queries described in the 'How to Remove a Page Manually in the Database Using SQL Commands' knowledge base article, then trigger the retention job again.


            Jeffery Xie added a comment -

            Hi vmiloch@atlassian.com,

            We do plan to backport this fix to previous LTS. The fix will be included in the changes for the new retention rule (CONFSERVER-87298) and is still soaking for 9.1. We will create backport tickets after that. cc ephillips@atlassian.com

            Alexander added a comment -

            Hopefully this fix will also be released for LTS. 🙈


            Jordan Anslow added a comment -

            A fix for this issue is available in Confluence Data Center 9.1.0.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Mohamed Shariffdeen added a comment -

            We are affected too, please fix it as soon as possible.

              Jeffery Xie
              Basar Beykoz
              Affected customers: 26
              Watchers: 42
