Confluence Data Center / CONFSERVER-32286

Page with multiple concurrent edits results in duplicate version numbers, lost content

      Originally reported internally as CONFDEV-22240, which has some discussion on it.

      Steps to reproduce:

      1. User A starts editing a page and makes changes
      2. User B starts editing the same page and makes different changes
      3. Both users click "Save" at about the same time, such that the database transactions with the changes are running concurrently.

      Expected behaviour:

      • Assuming the change by user A commits first, user B is sent back to the editor and notified that their changes conflict with changes made by user A. Or, if the changes are compatible, user B's changes are included after user A's.
      • One copy of the previous content is added as a historical version.
      • If the changes do not conflict, then a subsequent historical version is added for user A's changes.

      Actual behaviour:

      • Two copies of the old content are added to the change history in the database, but because they have the same version number only one will ever be shown.
      • User A's change is completely lost. It is not in the change history and is not shown in the content.
      • User A is not notified that their change was lost.

      Workaround:

      • None.
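The lost-update race in the steps above can be reproduced in miniature. This is a hypothetical single-table model of page content (not the real Confluence schema); no threads are needed, just two saves that each read the current version before the other committed:

```python
import sqlite3

# Minimal model: each save inserts a row for the new version and leaves the
# old rows behind as history (hypothetical schema for illustration).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE content (pageid INTEGER, version INTEGER, body TEXT)")
db.execute("INSERT INTO content VALUES (1, 1, 'original')")

def read_current_version(pageid):
    # Each editor reads the latest version number before saving.
    return db.execute(
        "SELECT MAX(version) FROM content WHERE pageid = ?", (pageid,)
    ).fetchone()[0]

# User A and user B both open the editor: both see version 1.
version_seen_by_a = read_current_version(1)
version_seen_by_b = read_current_version(1)

# Both save: each blindly writes version_seen + 1, with no check that the
# version they read is still the latest.
db.execute("INSERT INTO content VALUES (1, ?, 'A changes')",
           (version_seen_by_a + 1,))
db.execute("INSERT INTO content VALUES (1, ?, 'B changes')",
           (version_seen_by_b + 1,))

rows = db.execute(
    "SELECT version, body FROM content WHERE pageid = 1 ORDER BY version"
).fetchall()
print(rows)  # two rows now share version 2, so only one will ever be shown
```

This mirrors the actual behaviour above: two rows with the same version number, one of which is effectively invisible.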


            Khoa Pham added a comment -

            Hi wzanchet,
            My fix has since been merged into master, so it should also fix the OD version.
            Do you still see this problem?
            Thanks


            Steve Haffenden (Inactive) added a comment -

            tquanghua psemeniuk When can we reattempt the merge and get this issue moving along?

            Matt Ryall added a comment -

            Okay, please look into both optimistic and pessimistic solutions. Locking seems attractive, but has several problems itself, particularly if the lock is held around network calls or possible plugin code execution. It's probably worth involving either richatkins or oburn in some detailed discussion or review on this.


            Steve Haffenden (Inactive) added a comment - edited

            I've bumped the priority on this one to reflect the chance that a user can lose data.

            Regards
            Steve Haffenden
            Confluence Bugmaster
            Atlassian

            Don Willis added a comment -

            I agree with most of the risks and difficulties identified in Matt's proposed optimistic locking fix, and instead propose using a cluster-wide lock on the page at the application level while it is being updated. Database locking is great, but relying on cross-db-platform database locking via Hibernate with our current schema is likely to hurt us.


            Matt Ryall added a comment - - edited

            I'd prefer to go with an optimistic locking strategy for this problem, rather than adding pessimistic VM-wide or cluster-wide locks. I see two possible ways to implement this strategy:

            • a database constraint with proper handling in the Java code
            • an 'UPDATE WHERE' query that will return 0 rows in the case where the old page has already been updated, which can be detected and trigger a retry/fail.

            I don't think the first option will work without significant rewriting of Confluence. It would require a multi-column index across space, title, version, contenttype and publish-date. But publish date isn't a distinct column yet, and would need to be nullable to enforce page title uniqueness, so we'd need to implement that first. Even then, because publish date would be nullable, multi-column constraints containing nullable columns can't be implemented consistently across all our supported databases, so I believe we'd actually need to split blogs and pages into separate tables. This would be a massive multi-month project for a significant chunk of the Confluence team.

            The second solution relies on database isolation and transactions rolling back where an UPDATE WHERE fails. We'd need to test it and see a) whether it's possible with Hibernate or in a cross-database SQL query, and b) how significant the code changes required are to handle this correctly. I think this is the more feasible option.

            If there are other ideas on how to solve this, please add them in the comments.
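The second option can be sketched against SQLite (hypothetical schema; in Confluence this would go through Hibernate, but the rowcount check on the conditional UPDATE is the core of the technique):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE content (pageid INTEGER PRIMARY KEY, version INTEGER, body TEXT)"
)
db.execute("INSERT INTO content VALUES (1, 1, 'original')")

def save(pageid, base_version, new_body):
    """Optimistic save: succeeds only if the page is still at base_version."""
    cur = db.execute(
        "UPDATE content SET version = ?, body = ? "
        "WHERE pageid = ? AND version = ?",
        (base_version + 1, new_body, pageid, base_version),
    )
    # rowcount == 0 means the page moved on under us: signal a conflict so
    # the caller can retry the merge or send the user back to the editor.
    return cur.rowcount == 1

# Both editors start from version 1; only the first save wins.
ok_a = save(1, 1, "A's changes")
ok_b = save(1, 1, "B's changes")
print(ok_a, ok_b)  # True False -> conflict detected instead of a lost update
```

Under this scheme user B's stale save is rejected rather than silently creating a duplicate version number, which is exactly the failure described in this issue.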


              kpham Khoa Pham
              jbirch Jason Birch (Inactive)
              Affected customers: 9
              Watchers: 24
