Uploaded image for project: 'Crowd Data Center'
  1. Crowd Data Center
  2. CWD-3882

Add unique constraint on columns external_id and directory_id

    • Icon: Suggestion Suggestion
    • Resolution: Unresolved
    • None
    • None
    • None
    • 41
    • Our product teams collect and evaluate feedback from a number of different sources. To learn more about how we use customer feedback in the planning process, check out our new feature policy.

      In the table cwd_user, there should be no rows where the external_id is repeated for the same directory_id. Said in another way, the external_id should be unique for the same directory_id.

      To avoid this from causing problems, we should add a unique constraint at a database level in the table cwd_user.

      When two users share external_id and directory_id, the user is not able to login or update new information from the external user directory. Those actions will fail with the following error:

      org.springframework.dao.IncorrectResultSizeDataAccessException: query did not return a unique result: 2
              at org.springframework.orm.hibernate.SessionFactoryUtils.convertHibernateAccessException(SessionFactoryUtils.java:590)
              at org.springframework.orm.hibernate.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:353)
              at org.springframework.orm.hibernate.HibernateTemplate.execute(HibernateTemplate.java:375)
              at org.springframework.orm.hibernate.HibernateTemplate.execute(HibernateTemplate.java:337)
              at com.atlassian.crowd.embedded.hibernate2.HibernateUserDao.internalFindUserByExternalId(HibernateUserDao.java:501)
              at com.atlassian.crowd.embedded.hibernate2.HibernateUserDao.internalFindByExternalId(HibernateUserDao.java:473)
              at com.atlassian.crowd.embedded.hibernate2.HibernateUserDao.findByExternalId(HibernateUserDao.java:449)
              at com.atlassian.confluence.user.crowd.CachedCrowdUserDao.findByExternalId(CachedCrowdUserDao.java:164)
      ...

      This also affects embedded Crowd.

          Form Name

            [CWD-3882] Add unique constraint on columns external_id and directory_id

            This bug (not feature/suggestion/enhancement) caused a nearly 1-month outage in our Jira -> Confluence user sync. It took a month of working w/ Atlassian's support teams until we finally found the excellent Patrick Turbett who more deeply investigated our issue and ultimately found that this defect was underlying it.

            For other Jira/Confluence clients struggling:

            Investigate the potential for duplicate users in Confluence using troubleshooting queries in your database. Resolve the duplicate by modifying the record, which will then allow the user sync process to complete which will trim the unmatched user from Confluence on completion.

            Jordan Brough added a comment - This bug (not feature/suggestion/enhancement) caused a nearly 1-month outage in our Jira -> Confluence user sync. It took a month of working w/ Atlassian's support teams until we finally found the excellent Patrick Turbett who more deeply investigated our issue and ultimately found that this defect was underlying it. For other Jira/Confluence clients struggling: Investigate the potential for duplicate users in Confluence using troubleshooting queries in your database. Resolve the duplicate by modifying the record, which will then allow the user sync process to complete which will trim the unmatched user from Confluence on completion.

            Graham Twine added a comment - - edited

            I have just run into this issue.

            We have a single account not even used by the Atlassian tool stack that was duplicated many times.

            Following this KB we got a sync to work. This is also affecting a full sync on Confluence 7.13.5

            The net outcome is account changes in LDAP are not reflecting in Confluence.
             

            Directory Name Type Order Operations
            TDS
            You cannot edit this directory because you are logged in through it, please log in as a locally authenticating user to edit it.
            OpenLDAP (Read Only, with Local Groups)  Move Down Test / Synchronize
            Last synchronized at 4/13/23 7:25 AM (took 10s).
            Synchronization failed. See server logs for details.

            Graham Twine added a comment - - edited I have just run into this issue. We have a single account not even used by the Atlassian tool stack that was duplicated many times. Following this KB we got a sync to work. This is also affecting a full sync on Confluence 7.13.5 The net outcome is account changes in LDAP are not reflecting in Confluence.   Directory Name Type Order Operations TDS You cannot edit this directory because you are logged in through it, please log in as a locally authenticating user to edit it. OpenLDAP (Read Only, with Local Groups)  Move Down Test / Synchronize Last synchronized at 4/13/23 7:25 AM (took 10s). Synchronization failed. See server logs for details.

            Just ran into this in a customer's environment. For us a full crowd directory sync from Confluence would fail with a IllegalArgumentException: duplicate key, and thus did not solve the problem. Actually going into the Confluence database and changing the external_id for the old username solved the issue and let Confluence successfully complete a full sync the directory. Where on Confluence 5.10.7 and Crowd 2.8.0 connected to ActiveDirectory.

            Daniel Törnqvist added a comment - Just ran into this in a customer's environment. For us a full crowd directory sync from Confluence would fail with a IllegalArgumentException: duplicate key , and thus did not solve the problem. Actually going into the Confluence database and changing the external_id for the old username solved the issue and let Confluence successfully complete a full sync the directory. Where on Confluence 5.10.7 and Crowd 2.8.0 connected to ActiveDirectory.

            Robert Louie added a comment - https://getsupport.atlassian.com/browse/CSP-200483

            intersol_old added a comment -

            For example we are getting lots of errors like https://jira.atlassian.com/browse/CWD-3873

            intersol_old added a comment - For example we are getting lots of errors like https://jira.atlassian.com/browse/CWD-3873

            intersol_old added a comment -

            Yes, a full synchronisation seems to fix it but performing a full sync is not an option, is not something we can do every hour.

            Also, as stated before this seems to be a problem with the Crowd from Confluence only, when connected to Crowd.

            We have several JIRA and Confluence instances, all connected to a central Crowd, which is connected to AD. So far the only one that presented this kind of behaviour (not being able to sync successfully, after some time) was Confluence.

            Still, there are others warnings or errors that we get all all AD sync in Crowd, but those are not fatal and not related to this.

            intersol_old added a comment - Yes, a full synchronisation seems to fix it but performing a full sync is not an option, is not something we can do every hour. Also, as stated before this seems to be a problem with the Crowd from Confluence only, when connected to Crowd. We have several JIRA and Confluence instances, all connected to a central Crowd, which is connected to AD. So far the only one that presented this kind of behaviour (not being able to sync successfully, after some time) was Confluence. Still, there are others warnings or errors that we get all all AD sync in Crowd, but those are not fatal and not related to this.

            I'm assuming that the duplicates are created during the AD synchronisation. Based on your reports, it looks to me that the AD incremental synchronisation is causing the problem. Does a AD full sync repair the database?

            Diego Berrueta added a comment - I'm assuming that the duplicates are created during the AD synchronisation. Based on your reports, it looks to me that the AD incremental synchronisation is causing the problem. Does a AD full sync repair the database?

            intersol_old added a comment -

            This is not the king of bug you will be able to reproduce in the test environment, especially because it includes 3 systems: confluence, crowd and an Active Directory server. This seems to happen when users are renamed in AD, not sure if for all but clearly for a big portion of them.

            When the sAMAccountName is changed, crowd is doing the rename but Confluence is triggering this bug.

            Database being used is PostgreSQL, same as most Atlassians are using.

            Both Crowd and Confluence are configured to sync every hour. The directory size is of 16.000 people and about 200 groups (we have a filter to limit only some groups, otherwise it would be over 10.000 too).

            If the product becomes broken when the duplicates appear, we must find a way to prevent this from happening. Based on the fact that the source is very hard to debug/replicate, it makes sense to resolve the other size: prevent duplication or self-repair when a query returns two records. It could assume it was broken and remove the older (smaller ID) one.

            intersol_old added a comment - This is not the king of bug you will be able to reproduce in the test environment, especially because it includes 3 systems: confluence, crowd and an Active Directory server. This seems to happen when users are renamed in AD, not sure if for all but clearly for a big portion of them. When the sAMAccountName is changed, crowd is doing the rename but Confluence is triggering this bug. Database being used is PostgreSQL, same as most Atlassians are using. Both Crowd and Confluence are configured to sync every hour. The directory size is of 16.000 people and about 200 groups (we have a filter to limit only some groups, otherwise it would be over 10.000 too). If the product becomes broken when the duplicates appear, we must find a way to prevent this from happening. Based on the fact that the source is very hard to debug/replicate, it makes sense to resolve the other size: prevent duplication or self-repair when a query returns two records. It could assume it was broken and remove the older (smaller ID) one.

            Automatically adding a unique constraint in Crowd (and possible other products, like Confluence) is complicated due to the different behaviour of some databases regarding unique constraints affecting nullable fields. A constraint can be manually added to some databases to ensure the consistency of the information in the database, at the cost of failing when the actual operation that inserts duplicate external_id is executed.

            The first step to fixing the underlying problem would be to obtain detailed instructions to reproduce the problem, which we don't have yet. Reports indicate that this problem is only affecting Confluence, but we don't know yet how to reproduce it in a fresh installation of Confluence. Any collaboration on this would be welcome.

            Diego Berrueta added a comment - Automatically adding a unique constraint in Crowd (and possible other products, like Confluence) is complicated due to the different behaviour of some databases regarding unique constraints affecting nullable fields. A constraint can be manually added to some databases to ensure the consistency of the information in the database, at the cost of failing when the actual operation that inserts duplicate external_id is executed. The first step to fixing the underlying problem would be to obtain detailed instructions to reproduce the problem, which we don't have yet. Reports indicate that this problem is only affecting Confluence, but we don't know yet how to reproduce it in a fresh installation of Confluence. Any collaboration on this would be welcome.

            intersol_old added a comment -

            I returned from a 3 week vacation and the SQL that shows the broken accounts is returning now 287 records!

            select id, user_name, lower_user_name, external_id, directory_id from cwd_user where external_id in (
            select external_id from cwd_user group by external_id having count(external_id) <> 1)
            order by external_id, directory_id;

            Keep in mind that this seems to happen only with Confluence, Crowd and JIRA seems to synchronise well.

            I really cannot remove and recreate the Application link every time a user is renamed, see https://support.atlassian.com/browse/CSP-126414

            intersol_old added a comment - I returned from a 3 week vacation and the SQL that shows the broken accounts is returning now 287 records! select id, user_name, lower_user_name, external_id, directory_id from cwd_user where external_id in ( select external_id from cwd_user group by external_id having count(external_id) <> 1) order by external_id, directory_id; Keep in mind that this seems to happen only with Confluence, Crowd and JIRA seems to synchronise well. I really cannot remove and recreate the Application link every time a user is renamed, see https://support.atlassian.com/browse/CSP-126414

              Unassigned Unassigned
              aconde Alejandro Conde Carrillo (Inactive)
              Votes:
              22 Vote for this issue
              Watchers:
              30 Start watching this issue

                Created:
                Updated: