Hibernate2BatchProcessor#commitTransaction() clears the transaction from ThreadLocal before trying to flush the Hibernate session. If the flush fails, rollbackTransaction() cannot find the transaction and therefore does not clear the session. From then on the whole batch operation fails, because the offending operation fails again on every subsequent flush.
The directory synchronisation algorithm is not atomic, so it can sometimes try to add existing users to the database, which causes the session flush to fail. AbstractBatchProcessor has logic to handle these failures gracefully, but it does not work in Confluence because the session is not cleared properly.
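The ordering problem described above can be illustrated with a minimal Java sketch. This is not the Atlassian source: FakeSession, FakeTransaction, Processor and successfulBatches are hypothetical stand-ins, and the fake flush simply rejects users that are already stored. The sketch only assumes what is stated above, namely that rollback clears the session only when it can still find the transaction in the ThreadLocal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitOrderSketch {

    /** Hypothetical stand-in for a Hibernate session: queues inserts until flushed or cleared. */
    static class FakeSession {
        final List<String> pending = new ArrayList<>();
        final List<String> stored = new ArrayList<>();

        void save(String user) { pending.add(user); }

        /** Fails if any pending user already exists; otherwise stores all pending users. */
        void flush() {
            for (String user : pending) {
                if (stored.contains(user)) {
                    throw new IllegalStateException("duplicate user: " + user);
                }
            }
            stored.addAll(pending);
            pending.clear();
        }

        void clear() { pending.clear(); }
    }

    /** Hypothetical marker object for the current transaction. */
    static class FakeTransaction {}

    static class Processor {
        final FakeSession session;
        final boolean clearBeforeFlush; // true reproduces the problematic ordering
        final ThreadLocal<FakeTransaction> current = new ThreadLocal<>();

        Processor(FakeSession session, boolean clearBeforeFlush) {
            this.session = session;
            this.clearBeforeFlush = clearBeforeFlush;
        }

        void begin() { current.set(new FakeTransaction()); }

        void commit() {
            if (clearBeforeFlush) {
                current.remove();  // transaction forgotten before the flush can fail
                session.flush();
            } else {
                session.flush();   // if this throws, the transaction is still registered
                current.remove();
            }
        }

        void rollback() {
            if (current.get() == null) {
                return;            // transaction not found: session keeps its failed inserts
            }
            session.clear();
            current.remove();
        }

        boolean addBatch(List<String> users) {
            begin();
            try {
                for (String user : users) {
                    session.save(user);
                }
                commit();
                return true;
            } catch (IllegalStateException e) {
                rollback();
                return false;
            }
        }
    }

    /** Runs three batches, the first containing an already existing user; returns how many succeed. */
    static int successfulBatches(boolean clearBeforeFlush) {
        FakeSession session = new FakeSession();
        session.stored.add("alice"); // user already created, e.g. by logging in
        Processor processor = new Processor(session, clearBeforeFlush);
        int ok = 0;
        if (processor.addBatch(Arrays.asList("alice", "bob"))) ok++;
        if (processor.addBatch(Arrays.asList("carol"))) ok++;
        if (processor.addBatch(Arrays.asList("dave"))) ok++;
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("clear before flush: " + successfulBatches(true) + " of 3 batches succeeded");
        System.out.println("flush before clear: " + successfulBatches(false) + " of 3 batches succeeded");
    }
}
```

In this sketch the first ordering leaves the duplicate user queued in the session, so all three batches fail. With the second ordering only the batch that contains the duplicate fails and the later batches go through.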
In a large Confluence instance the following behaviour could trigger this issue:
- A new directory with a large number of users is added
- A sync is started; the synchronisation algorithm finds that all users in the new directory need to be added and starts adding them
- A user from the large user set logs in before the user sync is completed (this triggers user creation in the local instance)
- The sync operation tries to add the user who was created in the previous step, causing the flush to fail
At this stage no new users can be added as all flushes will fail. Membership synchronisation will proceed very slowly as some users have not been added, so batch operations will fall back to individual processing.
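The cost of falling back to individual processing can be sketched in Java as follows. This illustrates the general batch-then-individual pattern only and is not the AbstractBatchProcessor implementation; UserStore, addAll, addWithFallback and the writes counter are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchFallbackSketch {

    /** Hypothetical store that rejects duplicates, standing in for the user table. */
    static class UserStore {
        final Set<String> users = new HashSet<>();
        int writes = 0; // number of write attempts, to show the cost of falling back

        /** All-or-nothing insert: stores nothing if any user in the batch already exists. */
        void addAll(List<String> batch) {
            writes++;
            for (String user : batch) {
                if (users.contains(user)) {
                    throw new IllegalStateException("duplicate user: " + user);
                }
            }
            users.addAll(batch);
        }
    }

    /** Tries the whole batch first; on failure retries each user alone and returns those that still fail. */
    static List<String> addWithFallback(UserStore store, List<String> batch) {
        List<String> failed = new ArrayList<>();
        try {
            store.addAll(batch);
        } catch (IllegalStateException batchFailure) {
            for (String user : batch) {
                try {
                    store.addAll(Collections.singletonList(user));
                } catch (IllegalStateException e) {
                    failed.add(user); // left for a later synchronisation attempt
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        UserStore store = new UserStore();
        store.users.add("alice"); // already present before the batch runs
        List<String> failed = addWithFallback(store, Arrays.asList("alice", "bob", "carol"));
        System.out.println("failed: " + failed + ", write attempts: " + store.writes);
    }
}
```

Here a batch of three users needs four write attempts instead of one, because the failed batch is followed by one attempt per user. On a large directory that per-user overhead is what makes the fallback path slow.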
Attached is an updated version of the atlassian-embedded-crowd-hibernate2 jar that patches this issue. With the patch applied, transactions roll back correctly, allowing the synchronisation to complete. All records in the rolled-back transaction will be ignored until the next synchronisation attempt (or until the affected users log in).
The patch is known to work in Confluence 3.5.4. It might work in earlier versions, but these have not been tested. It is not needed in Confluence 3.5.6, as that version already contains this fix.
This patch also addresses CONF-22631, so that any records that fail to synchronise are logged correctly. Users with Confluence 3.5.5 should install this patch to avoid that issue.
To install the patch:
- Stop Confluence
- Move the old atlassian-embedded-crowd-hibernate2 jar out of <confluence install dir>/confluence/WEB-INF/lib
- Copy the new jar into the same directory
- Start Confluence