Confluence Data Center — CONFSERVER-62835

Attachment Moves Are Non Atomic Resulting In Missing Attachments

      We don't plan to backport the fix for this bug to earlier Long Term Support versions

      The fix for this bug isn't suitable for backporting to a bug fix release for any previous LTS versions. This is often because the fix is considered too high risk to implement in an older version.

      The fix for this issue will be included in future Long Term Support versions.

      Issue Summary

      This ticket tracks a class of bugs wherein Confluence misplaces attachments as part of a page move. The attachments remain on disk, but due to being in the wrong part of the file tree, appear to be missing.

      This issue occurs during page or pagetree moves that are unexpectedly interrupted.

      Note: this is not related to CONFSERVER-55928 (Attachments become 'Unknown Attachment' in the page editor with Collaborative Editing turned on) or its related bugs.

      Steps to Reproduce

      There are currently multiple causes with differing reproduction steps.

      The fundamental cause is halting a pagetree copy whilst attachments are being moved.

      Expected Results

      The files should be moved successfully. In the case of a failed move, the file copies should be rolled back successfully.

      Actual Results

      The files are not moved correctly or, alternatively, the files are not rolled back to their original location when a page move fails.
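The rollback behaviour described above can be sketched as a copy-then-commit sequence. This is a minimal illustration in Python, not Confluence's actual implementation; `move_attachments` and its arguments are hypothetical:

```python
import shutil
from pathlib import Path

def move_attachments(files, src_dir: Path, dst_dir: Path):
    """Move a batch of attachment files from src_dir to dst_dir.

    Copies everything first and deletes the originals only after every
    copy has succeeded; on any failure the copies are rolled back, so an
    interruption never leaves files in the wrong place.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    try:
        for name in files:
            shutil.copy2(src_dir / name, dst_dir / name)
            copied.append(dst_dir / name)
    except Exception:
        # Roll back: remove the partial copies, leaving originals intact.
        for path in copied:
            path.unlink(missing_ok=True)
        raise
    # Commit: all copies succeeded, so the originals can be removed.
    for name in files:
        (src_dir / name).unlink()
```

The key property is that the originals are never touched until every copy exists at the destination, so an interrupted run can only leave harmless extra copies, never missing files.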

      Workaround

      We currently have a script that searches for attachments that have been misplaced and moves them back to the correct location. The script can be found at https://confluence.atlassian.com/confkb/how-to-resolve-missing-attachments-in-confluence-201761.html
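The linked script is the supported fix. As a rough illustration of the general approach only (a hypothetical Python sketch, not the Atlassian script; in practice the mapping from attachment to expected directory comes from the Confluence database):

```python
import shutil
from pathlib import Path

def restore_misplaced(attachment_dirs: dict, root: Path):
    """Walk the attachment tree under `root` and move any file found
    outside its expected directory back where it belongs.

    `attachment_dirs` maps an attachment file name to the directory it
    should live in (supplied by the caller in this sketch).
    """
    moved = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        expected = attachment_dirs.get(path.name)
        if expected is not None and path.parent != expected:
            expected.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(expected / path.name))
            moved.append(path.name)
    return moved
```

Files already in their expected directory are left alone, so the sketch is safe to re-run.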


            agawron added a comment -

            a7e9ce396f68 thank you for your feedback. The decision was based on a few factors.

            First, the performance of the migration process. Most operations go through NFS, which is a bottleneck. Using only file system move operations made it possible to complete the migration much faster. Calculating a binary diff could significantly slow down the migration, and it could also use more memory, risking a crash. Imagine calculating a diff of a 4 GB video file!

            Second, development time. We were not sure how many duplicates customers might have, or how many of them would be exactly the same.

            Third, we didn't want to risk losing any files, so we limited ourselves to move operations only, avoiding any deletes.

            Based on these factors, we decided that duplicates can be safely handled by admins: either left as they are, or reviewed in their own time, without slowing down the migration.


            Michael Mohr added a comment -

            Why on earth are you storing binary-identical files as duplicates? That makes no sense and confuses us as customers. It would make sense to store only duplicates that are binary-different. That way we as customers would know immediately for which files we have to check which of the duplicates we want to keep and which we want to delete.

            agawron added a comment - edited

            39c389fcbf4a these .duplicate.X files are not referenced anywhere in the database. We keep the duplicate files just in case any of them is actually the "real" attachment that should be used instead of the one that has been linked. If you are sure all these duplicate attachment files are exact copies, then you can safely delete them. If you find a duplicate that differs from the original file, you might want to double-check that the right file was linked during the migration process.
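Following the advice above, checking which `.duplicate.X` files are exact copies can be automated. A minimal sketch, assuming the `.duplicate.N` naming seen in this thread (the helper name is hypothetical):

```python
import filecmp
from pathlib import Path

def identical_duplicates(root: Path):
    """Yield .duplicate.N files that are byte-for-byte identical to the
    original attachment file they shadow, and are therefore safe to
    delete per the comment above."""
    for dup in root.rglob("*.duplicate.*"):
        # "12345.duplicate.1" shadows the original file "12345"
        # in the same directory.
        original = dup.parent / dup.name.split(".duplicate.")[0]
        if original.exists() and filecmp.cmp(dup, original, shallow=False):
            yield dup
```

Duplicates that this does not report differ from the linked original and should be reviewed manually, as agawron suggests, before anything is deleted.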


            Dan Schwartz added a comment -

            So, now that I've upgraded to Confluence Server 8.1.1, I have a bunch of duplicate.1 files in the attachments/v4 directories. Can I just delete them without messing anything up, or are there pointers to them in the Confluence DB?

            Madhubabu Kethineni (Inactive) added a comment -

            A fix for this issue is available in Confluence Server and Data Center 8.1.0.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Marco Birrer added a comment -

            Hi
            Do you already know when the fix will come?

            ACP added a comment -

            Hi
            Do you already know when the fix will come?

            We had to fix the problem twice with the script, which meant significant downtime.

            Yan Zhou added a comment -

            v7.13.7 is also affected.

              Assignee: agawron
              Reporter: James Ponting (jponting)
              Affected customers: 15
              Watchers: 40