  1. Confluence Data Center
  2. CONFSERVER-43156

Sessions are never cleared during LDAP user directory sync

      Summary

      Confluence does not clear its Hibernate session during a full LDAP sync, so Hibernate ends up tracking an ever-growing pool of objects. As a result, synchronizing a large LDAP directory is virtually impossible: it takes days to complete, even on very good hardware.
      With more than 60K users, the synchronization can take up to 5 days. With the session flushed and cleared during the sync, the same synchronization can be completed in 10 minutes.

      Environment

      • All Confluence versions with crowd-api-2.8.3
      • All supported databases are impacted

      Steps to Reproduce

      1. Install Confluence 5.8 and set up a connection to a large LDAP instance
      2. Upgrade to 5.10

      Expected Results

      Confluence finishes the synchronization within a couple of hours

      Actual Results

      The synchronization takes up to 6 days

      Workaround

      No verified workaround available.
      bjarne.holen843903303 proposed calling session.flush() and session.clear() via the HibernateTemplate after each user is synchronized.
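To illustrate why the proposed flush-and-clear makes such a difference, here is a minimal sketch that models the cost rather than calling Hibernate itself. `FakeSession`, `sync`, and the `dirtyChecks` counter are hypothetical names invented for this illustration. The sketch assumes one flush (or auto-flush) per synchronized user and that each flush inspects every object still tracked by the session.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushClearSketch {

    /** Hypothetical stand-in for a session's first-level cache (not the Hibernate API). */
    static class FakeSession {
        private final List<String> tracked = new ArrayList<>();
        long dirtyChecks = 0;

        /** Start tracking an object in the session. */
        void save(String user) {
            tracked.add(user);
        }

        /** Assumption: a flush inspects every tracked object for changes. */
        void flush() {
            dirtyChecks += tracked.size();
        }

        /** Stop tracking everything, so later flushes have nothing old to inspect. */
        void clear() {
            tracked.clear();
        }
    }

    /** Simulates a sync and returns the total number of dirty checks performed. */
    static long sync(int users, boolean clearAfterEachUser) {
        FakeSession session = new FakeSession();
        for (int i = 0; i < users; i++) {
            session.save("user" + i);
            session.flush();
            if (clearAfterEachUser) {
                session.clear();
            }
        }
        return session.dirtyChecks;
    }

    public static void main(String[] args) {
        System.out.println("never cleared:    " + sync(1000, false)); // 500500
        System.out.println("cleared per user: " + sync(1000, true));  // 1000
    }
}
```

In this model the work grows quadratically with the number of users when the session is never cleared, and linearly when it is cleared after each user. Real timings depend on what else the sync does.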

            Minh Tran added a comment -

            A fix for this issue is now available for Confluence Server customers.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Richard Atkins added a comment -

            Resolved while fixing CONF-44221, through additional session flush and clear invocations around the crowd event publisher.

            bjarne holen added a comment -

            This seems to be the exact same issue:

            https://jira.atlassian.com/browse/CONF-44221

            bjarne holen added a comment -

            Possible workaround

            We have applied LDAP filtering to synchronize smaller sections of the organization at a time: we start with all users matching (uid=a*), then add more users by extending the filter, e.g. (|(uid=a*)(uid=b*)), and so on until all users are populated. For our organization this can be done within a day or so.

            A property that may be related also turned up in a search of mine: -Dcrowd.use.legacy.ad.incremental.sync=true, found here:

            https://confluence.atlassian.com/confkb/confluence-incremental-synchronisation-failed-and-falls-back-to-a-full-sync-when-connecting-to-ldap-812320145.html
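The growing uid filter from the workaround comment above can be generated rather than typed by hand. This is only a sketch: `UidFilterSketch` and `uidPrefixFilter` are hypothetical names, and it assumes the prefixes are plain letters or digits that need no LDAP filter escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UidFilterSketch {

    /** Builds (uid=a*) for one prefix, or (|(uid=a*)(uid=b*)...) for several. */
    static String uidPrefixFilter(List<String> prefixes) {
        if (prefixes.isEmpty()) {
            throw new IllegalArgumentException("at least one prefix is required");
        }
        // Assumption: prefixes contain no characters that need escaping in a filter.
        List<String> clauses = prefixes.stream()
                .map(p -> "(uid=" + p + "*)")
                .collect(Collectors.toList());
        return clauses.size() == 1
                ? clauses.get(0)
                : "(|" + String.join("", clauses) + ")";
    }

    public static void main(String[] args) {
        System.out.println(uidPrefixFilter(Arrays.asList("a")));      // (uid=a*)
        System.out.println(uidPrefixFilter(Arrays.asList("a", "b"))); // (|(uid=a*)(uid=b*))
    }
}
```

Each sync round would use the filter for one more prefix than the previous round, so only the newly matched users are new to the directory.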

            bjarne holen added a comment - - edited

            It should be noted that we are upgrading from 4.3.7 to 5.8.18 (we initially tried to upgrade to 5.10, but the lack of support for Oracle 11.2 made us switch to 5.8.18). I'm pretty sure this affects LDAP synchronization in all versions of Confluence; we had identical problems when we tried our upgrade to Confluence 5.10.

            The problem is related to when Hibernate sessions are cleared (i.e. flushed and emptied). The easiest way to see the issue "live" is to start a full synchronization on an LDAP directory with logging for net.sf.hibernate.impl.SessionImpl set to DEBUG. It should then be clear that 99.99% of the time is spent looking for dirty (i.e. changed) values in the same massive set of objects stored in a session that never gets flushed and cleared.
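One way to turn on that logging is a single logger entry, sketched below in log4j 1.x properties syntax. It is an assumption that the instance is configured through a log4j.properties file. The level can usually also be set at runtime from the logging page of the administration console.

```properties
# Assumption: log4j 1.x properties syntax.
# Very verbose during a full sync; revert once the behaviour has been observed.
log4j.logger.net.sf.hibernate.impl.SessionImpl=DEBUG
```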

            It should be noted that this "patch" does not fully solve the issue when upgrading from one version to another, as there are other bottlenecks as well; I guess there could be numerous other places where Hibernate/Confluence has the same issue.

            Support ticket for reference:

            https://support.atlassian.com/servicedesk/customer/portal/14/CSP-178586

            Uploaded a picture of the call stack of a running synchronization task (with the patch in place, i.e. updateUsers is now OK, but there are similar problems in other jobs). It now finishes in roughly 25 hours; almost all of the time is spent by Hibernate looking for changes in the cache.

            The time-consuming part is now located at:
            com.atlassian.confluence.user.persistence.dao.hibernate.HibernatePersonalInformationDao.getByUser, on the call to findNamedQueryStringParam

              richatkins Richard Atkins
              ashaleev Anton Shaleev