  1. Confluence Data Center
  2. CONFSERVER-22342

Synchronising LDAP/Crowd can completely fail because transactions are not properly rolled back in Hibernate2BatchProcessor

Details

    Description

      Hibernate2BatchProcessor#commitTransaction() clears the transaction from the ThreadLocal before it tries to flush the Hibernate session. If the flush fails, rollbackTransaction() cannot find the transaction and therefore does not clear the session. From then on the whole batch operation fails, because the offending operation stays in the session and fails again on every subsequent flush.
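
The ordering problem can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not the actual Confluence or Crowd code: FakeSession, Processor, and committedBatches are hypothetical names, the "session" is just two in-memory lists, and the "fixed" ordering (flush first, forget the transaction only on success) is one plausible way to repair the behaviour rather than a copy of the shipped patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the commit ordering; every name here is hypothetical. */
class CommitOrderSketch {

    /** Stand-in for a Hibernate session: 'pending' holds unflushed operations. */
    static class FakeSession {
        final List<String> pending = new ArrayList<>();
        final List<String> stored = new ArrayList<>();

        void save(String user) { pending.add(user); }

        /** Fails if any pending user already exists; pending is left untouched on failure. */
        void flush() {
            for (String user : pending) {
                if (stored.contains(user)) {
                    throw new IllegalStateException("duplicate user: " + user);
                }
            }
            stored.addAll(pending);
            pending.clear();
        }

        void clear() { pending.clear(); }
    }

    static class Processor {
        private final ThreadLocal<Object> currentTx = new ThreadLocal<>();
        private final FakeSession session;
        private final boolean clearTxBeforeFlush;

        Processor(FakeSession session, boolean clearTxBeforeFlush) {
            this.session = session;
            this.clearTxBeforeFlush = clearTxBeforeFlush;
        }

        void begin() { currentTx.set(new Object()); }

        void commit() {
            if (clearTxBeforeFlush) {
                currentTx.remove();   // behaviour described in the issue: forget the tx first
                session.flush();      // if this throws, rollback() has nothing to find
            } else {
                session.flush();      // assumed fix: flush first
                currentTx.remove();   // forget the tx only after a successful flush
            }
        }

        void rollback() {
            if (currentTx.get() == null) {
                return;               // no transaction found: the session is NOT cleared
            }
            session.clear();
            currentTx.remove();
        }

        /** Returns true if the batch was committed. */
        boolean addBatch(List<String> users) {
            begin();
            try {
                users.forEach(session::save);
                commit();
                return true;
            } catch (IllegalStateException e) {
                rollback();
                return false;
            }
        }
    }

    /** Runs three batches against a store that already contains "bob". */
    static int committedBatches(boolean clearTxBeforeFlush) {
        FakeSession session = new FakeSession();
        session.stored.add("bob");
        Processor processor = new Processor(session, clearTxBeforeFlush);
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("alice", "bob"),
                Arrays.asList("carol"),
                Arrays.asList("dave"));
        int committed = 0;
        for (List<String> batch : batches) {
            if (processor.addBatch(batch)) {
                committed++;
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        System.out.println("clear tx before flush: " + committedBatches(true));
        System.out.println("flush before clearing tx: " + committedBatches(false));
    }
}
```

In this model, the first ordering commits none of the three batches, because the duplicate "bob" stays in the pending list and poisons every later flush. The second ordering loses only the batch that contained the duplicate.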

      The directory synchronisation algorithm is not atomic, so it can sometimes try to add users that already exist in the database, which causes the session flush to fail. AbstractBatchProcessor has logic to handle these failures gracefully, but it does not work in Confluence because the session is not cleared properly.

      In a large Confluence instance the following behaviour could trigger this issue:

      1. A new directory is added with a large number of users
      2. Sync is started; the synchronisation algorithm finds that all users in the new directory need to be added and starts adding them
      3. A user from the large user set logs in before the user sync is completed (this triggers user creation in the local instance)
      4. The sync operation tries to add the user who was created in the previous step, which causes the flush to fail

      At this stage no new users can be added, because every flush fails. Membership synchronisation then proceeds very slowly: some users have not been added, so batch operations fall back to individual processing.
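
The graceful handling that a batch processor is meant to provide can be sketched as follows. This is an illustration under assumptions, not the AbstractBatchProcessor source: BatchFallbackSketch, addAll, and sync are hypothetical names, and an in-memory Set stands in for the user table. Each batch is tried as a unit; if it fails, its items are retried one by one so that only the truly conflicting record is skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy batch-with-fallback loop; every name here is hypothetical. */
class BatchFallbackSketch {

    /** Outcome of a run: which users were added and which were skipped. */
    static class Result {
        final List<String> added = new ArrayList<>();
        final List<String> failed = new ArrayList<>();
    }

    /** All-or-nothing add, like one transaction: a duplicate rejects the whole batch. */
    static boolean addAll(Set<String> store, List<String> names) {
        Set<String> trial = new HashSet<>(store);
        for (String name : names) {
            if (!trial.add(name)) {
                return false;         // store is left untouched, as after a clean rollback
            }
        }
        store.addAll(names);
        return true;
    }

    /** Adds names in batches, falling back to one-by-one adds when a batch fails. */
    static Result sync(Set<String> store, List<String> names, int batchSize) {
        Result result = new Result();
        for (int i = 0; i < names.size(); i += batchSize) {
            List<String> batch = names.subList(i, Math.min(i + batchSize, names.size()));
            if (addAll(store, batch)) {
                result.added.addAll(batch);
                continue;
            }
            // Fallback: slower, but isolates the record that caused the failure.
            for (String name : batch) {
                if (addAll(store, Collections.singletonList(name))) {
                    result.added.add(name);
                } else {
                    result.failed.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "carol" logged in during the sync, so she already exists locally.
        Set<String> store = new HashSet<>(Collections.singletonList("carol"));
        Result result = sync(store, Arrays.asList("alice", "bob", "carol", "dave", "erin"), 2);
        System.out.println("added=" + result.added + " failed=" + result.failed);
    }
}
```

The fallback only helps when a failed batch leaves the store clean, which is exactly what a correct rollback guarantees. If the failed operation lingers, as described above, every individual retry fails as well.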

      Patch

      Attached is an updated version of the atlassian-embedded-crowd-hibernate2 jar that patches this issue. With the patch applied, transactions roll back correctly, allowing the synchronisation to complete. All records in a rolled-back transaction are ignored until the next synchronisation attempt (or until the affected users log in).

      It is known to work in Confluence 3.5.4, and might work in earlier versions, but these have not been tested. It is not needed in Confluence 3.5.6, as that version already contains this fix.

      This patch also addresses CONF-22631, so that any records that fail to synchronise are logged correctly. Users with Confluence 3.5.5 should install this patch to avoid that issue.

      Installation

      To install the patch:

      1. Stop Confluence
      2. Move the old atlassian-embedded-crowd-hibernate2 jar out of <confluence install dir>/confluence/WEB-INF/lib
      3. Copy the new jar into the same directory
      4. Start Confluence

            People

              akdominguez Katrina Walser (Inactive)
              onevalainen Olli Nevalainen

                Time Tracking

                  Estimated: 4h
                  Remaining: 4h
                  Logged: Not Specified