Confluence Data Center
CONFSERVER-62835

Attachment Moves Are Non Atomic Resulting In Missing Attachments

      We don't plan to backport the fix for this bug to earlier Long Term Support versions

      The fix for this bug isn't suitable for backporting to a bug fix release for any previous LTS versions. This is often because the fix is considered too high risk to implement in an older version.

      The fix for this issue will be included in future Long Term Support versions.

      Issue Summary

      This ticket tracks a class of bugs wherein Confluence misplaces attachments as part of a page move. The attachments remain on disk, but because they end up in the wrong part of the file tree, they appear to be missing.

      This issue occurs during page or pagetree moves that are unexpectedly interrupted.
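The non-atomic behaviour described above can be sketched as follows. This is an illustrative Python sketch, not Confluence's actual code: each file is moved individually, so an interruption partway through leaves the tree half-migrated.

```python
import shutil
import tempfile
from pathlib import Path

def move_attachments(files, src_dir, dst_dir, fail_after=None):
    """Move attachment files one by one (non-atomic). `fail_after`
    simulates an interruption after that many files have moved."""
    moved = 0
    for name in files:
        if fail_after is not None and moved == fail_after:
            raise RuntimeError("simulated interruption mid-move")
        shutil.move(str(Path(src_dir) / name), str(Path(dst_dir) / name))
        moved += 1

# Demonstrate the half-applied state after an interruption.
src = Path(tempfile.mkdtemp())
dst = Path(tempfile.mkdtemp())
for name in ("a.png", "b.png"):
    (src / name).write_bytes(b"data")

try:
    move_attachments(["a.png", "b.png"], src, dst, fail_after=1)
except RuntimeError:
    pass

# a.png moved, b.png did not: the file tree is now inconsistent.
print(sorted(p.name for p in src.iterdir()))  # ['b.png']
print(sorted(p.name for p in dst.iterdir()))  # ['a.png']
```

Nothing in this per-file loop records which files have moved, which is why an interrupted pagetree move can leave attachments stranded in either location.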

      Note this is not related to CONFSERVER-55928: Attachments become 'Unknown Attachment' in the page editor with Collaborative Editing turned on or its related bugs

      Steps to Reproduce

      There are currently multiple causes with differing reproduction steps.

      The fundamental cause is halting a pagetree copy whilst attachments are being moved.

      Expected Results

      The files should be moved successfully. In the case of a failed move, the file copies should be rolled back successfully.
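The expected rollback behaviour can be sketched as follows. This is a minimal illustration of the pattern, not Confluence's implementation: completed moves are tracked so a failure partway through can be undone.

```python
import shutil
import tempfile
from pathlib import Path

def move_with_rollback(files, src_dir, dst_dir):
    """Track each completed move so that a failure partway through
    can be undone, restoring every file to its original location."""
    done = []
    try:
        for name in files:
            shutil.move(str(Path(src_dir) / name), str(Path(dst_dir) / name))
            done.append(name)
    except Exception:
        # Roll back in reverse order so the tree returns to its
        # pre-move state instead of being left half-migrated.
        for name in reversed(done):
            shutil.move(str(Path(dst_dir) / name), str(Path(src_dir) / name))
        raise

# Simulated failed move: the second file does not exist, so the
# already-completed move of the first file is rolled back.
src = Path(tempfile.mkdtemp())
dst = Path(tempfile.mkdtemp())
(src / "a.png").write_bytes(b"data")

try:
    move_with_rollback(["a.png", "missing.png"], src, dst)
except FileNotFoundError:
    pass

print((src / "a.png").exists())  # True: restored to original location
print(list(dst.iterdir()))       # []
```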

      Actual Results

      The files are not moved correctly, or, alternatively, the files are not rolled back to their original location in the case of a failed page move.

      Workaround

      We currently have a script that searches for attachments that have been misplaced and moves them back to the correct location. The script can be found at https://confluence.atlassian.com/confkb/how-to-resolve-missing-attachments-in-confluence-201761.html
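The general shape of such a cleanup pass might look like the following hypothetical sketch. It is not the Atlassian script linked above; the `expected` mapping (attachment filename to its expected relative path) is an assumption standing in for data that would come from the Confluence database.

```python
from pathlib import Path

def find_misplaced(root, expected):
    """Hypothetical sketch: walk the attachment store under `root` and
    report files whose actual location differs from the expected one.
    `expected` maps filename -> expected relative path (an assumed
    stand-in for a lookup against the Confluence database)."""
    misplaced = {}
    for p in Path(root).rglob("*"):
        if p.is_file():
            rel = p.relative_to(root)
            exp = expected.get(p.name)
            if exp is not None and str(rel) != exp:
                misplaced[p.name] = (str(rel), exp)
    return misplaced

# Example: one file sits under the wrong directory.
import tempfile
root = Path(tempfile.mkdtemp())
(root / "wrong").mkdir()
(root / "wrong" / "1.bin").write_bytes(b"x")

result = find_misplaced(root, {"1.bin": "right/1.bin"})
print(result)  # {'1.bin': ('wrong/1.bin', 'right/1.bin')}
```

A real cleanup would then move each reported file to its expected path; the sketch stops at reporting so no data can be lost by running it.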

            Comments

            agawron added a comment -

            a7e9ce396f68 thank you for your feedback. The decision was made based on a few factors.

            First of all, the performance of the migration process. Most operations go through NFS, which is a bottleneck. Using only a file-system move operation made it possible to complete the migration much faster. Calculating a binary diff could significantly slow down the migration, and it could also consume more memory, risking a crash. Imagine calculating a diff of a 4 GB video file!

            Secondly, development time. We were not sure how many duplicates customers might have, or how many of them would be exactly the same.

            Thirdly, we didn't want to risk losing any files, so we decided to limit ourselves to move operations only, avoiding any deletes.

            Based on these factors, we decided that duplicates can be safely handled by admins: either left as they are, or reviewed in their own time, without slowing down the migration.


            Michael Mohr added a comment -

            Why on earth are you storing binary-identical files as duplicates? That makes no sense and confuses us as customers. It would make sense to store only duplicates that are binary-different. With that, we as customers would know directly for which files we have to check which duplicate we want to keep and which we want to delete.
