Uploaded image for project: 'Confluence Cloud'
  1. Confluence Cloud
  2. CONFCLOUD-30050

Users are duplicated in the People Directory, shares, mentions, and have two USERINFO records in the database

      NOTE: This bug report is for Confluence Cloud. Using Confluence Server? See the corresponding bug report.

      Update from Atlassian

      This issue has been resolved in Confluence 5.9.8.

      The fix we implemented is in the front end, and stops duplicate user records from displaying anywhere in Confluence - in the People Directory, Shares, or Mentions. The underlying duplicates may still exist in the database, and new duplicates may be created, however these are harmless and can be safely ignored. As of Confluence 5.9.8, you should no longer see duplicate users anywhere in Confluence.

      Symptoms

      • Users appear twice in the People Directory, in shares, and mentions
      • Users have two USERINFO records with no 'prevver' value (ie two current records), eg:
        select * from content where contenttype like 'USERINFO' and prevver is null;
        

        USERINFO records that do not have an entry in prevver are the currently active records. This is a little counter-intuitive given the name of the field, however older records that are no longer active contain the id of the superseding record in prevver. If prevver is null, the record is considered active.

      Steps to Reproduce

      This may not be the only way to have this happen, but doing the following sequence will cause this to happen:

      1. Create a user in an LDAP server, which you want to test with
      2. Set up the LDAP server in Confluence
      3. Synchronise
      4. Log in as that user and possibly add some extra profile information such as a picture/description
      5. Disable the LDAP directory
      6. Rename that user in the LDAP server
      7. Create a new LDAP directory exactly the same
      8. Synchronise

      Known Triggers

      This has been confirmed to occur on upgrade to 5.1.3, and possibly earlier or later versions.

      Case Study

      • These 3 users were duplicated as part of the upgrade to 5.1.3 (some irrelevant columns have been removed for ease of reading):
        contentid contenttype version creator creationdate lastmodifier lastmoddate prevver content_status username
        22217497 USERINFO 1 an.admin 10/05/2013 an.admin 10/05/2013   current USER1
        24248321 USERINFO 1   7/06/2013   7/06/2013   current user1
        917505 USERINFO 3 admin 12/03/2012 user2 14/12/2012   current user2
        24248322 USERINFO 1   7/06/2013   7/06/2013   current USER2
        1409789 USERINFO 1 USER3 29/03/2012 USER3 29/03/2012   current USER3
        24248323 USERINFO 1   7/06/2013   7/06/2013   current user3

      NB: Note the casing above - the 1st and 3rd records were originally uppercase, the second was originally lowercase.

      • All 3 duplicates were created on June 7th. This was in the logs:
        2013-06-07 19:40:39,244 INFO [main] [confluence.upgrade.upgradetask.UserIndexingUpgradeTask] doUpgrade Found 3 usernames that need PersonalInformation created.
        
      • I originally suspected that the casing of the usernames may have changed, but the users are currently in the original casing in cwd_user (upperfor USER1 and USER3, and lower for user2).
      • There was a warning regarding a case change for user2, going back as far as October 2012 (found in old support case) :
        2013-06-10 09:22:11,856 WARN [http-8080-40] [atlassian.crowd.directory.DelegatedAuthenticationDirectory] updateLdapUser remote username [ USER2 ] casing differs from local username [ user2 ]. User details will be kept updated, but the username cannot be updated
        
      • All 3 users are LDAP user accounts. This is a Delegated Authentication directory, and not a synchronising Connector type.

      Diagnosis - is this affecting you?

      First, try a content reindex. If this doesn't help, run this query (on any version). If it returns any results, this bug is the cause of your duplicates.

      SELECT c.username
            FROM CONTENT c
            WHERE
              contenttype LIKE 'USERINFO'
              AND prevver is null
                GROUP BY c.username
                HAVING count(c.username)>1;
      

      If this query returns nothing, have a look at CONF-36018 as well.

      Work Around

      1. Stop Confluence
      2. Make a full database and file system backup
      3. Run this command to find the usernames of affected users - ensure these match what you are seeing in the People Directory before deleting
        // FOR CONFLUENCE 5.3 AND ABOVE
        SELECT m.username FROM content c
         JOIN user_mapping m ON c.username = m.user_key
          WHERE c.username IN
            (SELECT c.username
              FROM CONTENT c
              WHERE
                contenttype LIKE 'USERINFO'
                AND prevver is null
                  GROUP BY c.username
                  HAVING count(c.username)>1)
            GROUP BY m.username;
        
        // FOR CONFLUENCE 5.2 AND BELOW
        SELECT c.username FROM CONTENT c
          WHERE c.username IN
            (SELECT c.username
              FROM CONTENT c
              WHERE
                contenttype LIKE 'USERINFO'
                AND prevver is null
                  GROUP BY c.username
                  HAVING count(c.username)>1)
            GROUP BY c.username;
        
      4. Run this command to delete the duplicate entries (all versions). The row with the higher contentid will be deleted.
        DELETE FROM CONTENT
        WHERE contentid IN
            (SELECT DISTINCT c1.contentid AS "pageid"
             FROM CONTENT c1
             JOIN CONTENT c2 ON c1.username = c2.username
             WHERE c1.contenttype = 'USERINFO'
             AND c2.contenttype = 'USERINFO'
             AND c1.prevver is null 
             AND c2.prevver is null 
             AND c1.contentid > c2.contentid);
      5. Restart Confluence.
      6. If you are still experiencing problems, have a look at CONF-36018 as well.

      NB: If you have any trouble running the delete query because of foreign key constraints on the ATTACHMENT or CONTENT tables, do not remove the entries from that table. Instead, please get in touch with Support and we can help you remove the duplicates, and update the queries to be more robust.

            [CONFCLOUD-30050] Users are duplicated in the People Directory, shares, mentions, and have two USERINFO records in the database

            We're waiting for the fix too

            ConfluenceE added a comment - We're waiting for the fix too

            Looks like another example of quick fixes becoming long-term fixes, unfortunately.

            Razvan Botez added a comment - Looks like another example of quick fixes becoming long-term fixes, unfortunately.

            I am wondering if this code review going to take years, we are already waiting for a fix for many months.

            William Ing added a comment - I am wondering if this code review going to take years, we are already waiting for a fix for many months.

            Minh Tran added a comment - - edited

            Dear all,

            I would like to inform that I added a quick fix for this issue by excluding duplicated users in People Directory. This does not cover the Shares & Mentions.
            It went with 5.8-OD-50-011 release.
            The long term fix for this issue is still ongoing and in code review.
            I will keep you posted for the long term fix

            Thanks,
            Minh Tran
            Confluence BugMaster
            Atlassian

            Minh Tran added a comment - - edited Dear all, I would like to inform that I added a quick fix for this issue by excluding duplicated users in People Directory. This does not cover the Shares & Mentions. It went with 5.8-OD-50-011 release. The long term fix for this issue is still ongoing and in code review. I will keep you posted for the long term fix Thanks, Minh Tran Confluence BugMaster Atlassian

            The delete command mentioned in step 4.

            Run this command to delete the duplicate entries (all versions)

            throws an error on MySQL (5.5.42, Linux):

            DELETE FROM CONTENT WHERE contentid IN (SELECT DISTINCT c1.contentid AS "pageid" FROM CONTENT c1 JOIN CONTENT c2 ON c1.username = c2.username WHERE c1.contenttype = 'USERINFO' AND c2.contenttype = 'USERINFO' AND c1.prevver is null AND c2.prevver is null AND c1.creationdate > c2.creationdate);
            ERROR 1093 (HY000): You can't specify target table 'CONTENT' for update in FROM clause
            

            I assume, it needs to be executed in 3 steps separately using a temp_table instead?

            Rainer Pöhlmann added a comment - The delete command mentioned in step 4. Run this command to delete the duplicate entries (all versions) throws an error on MySQL (5.5.42, Linux): DELETE FROM CONTENT WHERE contentid IN (SELECT DISTINCT c1.contentid AS "pageid" FROM CONTENT c1 JOIN CONTENT c2 ON c1.username = c2.username WHERE c1.contenttype = 'USERINFO' AND c2.contenttype = 'USERINFO' AND c1.prevver is null AND c2.prevver is null AND c1.creationdate > c2.creationdate); ERROR 1093 (HY000): You can 't specify target table ' CONTENT' for update in FROM clause I assume, it needs to be executed in 3 steps separately using a temp_table instead?

            I created a new user just yesterday. During the day, it only showed up as one user, however as of today I have multiple copies of this same user.

            Shane Burger added a comment - I created a new user just yesterday. During the day, it only showed up as one user, however as of today I have multiple copies of this same user.

            Hi,

            We have gone through the workaround on our TEST env. But it wasn't as simple as stated in the description. We couldn't delete data from CONTENT table, because there were foreign constraints. Then when we deleted corresponding entries from the other table, there was another error of another dependancies. And so on. In the end, this is the path we took:

            --Order of deletion IMAGEDETAILS -> ATTACHMENTS -> BODYCONTENT -> CONTENT
              
            -- ATTACHMENTS  
            delete from confdb.confschema.ATTACHMENTS where PAGEID in (
            	select c1.CONTENTID  
            	from confschema.CONTENT c1 
            	join confschema.CONTENT c2 
            	on c1.USERNAME = c2.USERNAME 
            	and c1.CREATIONDATE < c2.CREATIONDATE 
            	where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO'
            )
            
            -- IMAGEDETAILS
            delete from confdb.confschema.IMAGEDETAILS
            where ATTACHMENTID in (
            	select ATTACHMENTID from confdb.confschema.ATTACHMENTS where PAGEID in (
            		select c1.CONTENTID  
            		from confschema.CONTENT c1 
            		join confschema.CONTENT c2 
            		on c1.USERNAME = c2.USERNAME 
            		and c1.CREATIONDATE < c2.CREATIONDATE 
            		where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO'
            	)
            )
            
            -- BODYCONTENT
            delete from confdb.confschema.BODYCONTENT where CONTENTID in (
            	select c1.CONTENTID  
            	from confschema.CONTENT c1 
            	join confschema.CONTENT c2 
            	on c1.USERNAME = c2.USERNAME 
            	and c1.CREATIONDATE < c2.CREATIONDATE 
            	where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO'
            )
            -- CONTENT
            delete from confdb.confschema.CONTENT where CONTENTID in
            (
            	select c1.CONTENTID  
            	from confschema.CONTENT c1 
            	join confschema.CONTENT c2 
            	on c1.USERNAME = c2.USERNAME 
            	and c1.CREATIONDATE < c2.CREATIONDATE 
            	where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO'
            )
            

            Question: is it safe to dele all these entires in ATTACHMENTS and other tables on PROD? Are we deleting actual data or just some linkage that helps us resolve the issue with duplicate users?

            Roman Serazhiev (Inactive) added a comment - Hi, We have gone through the workaround on our TEST env. But it wasn't as simple as stated in the description. We couldn't delete data from CONTENT table, because there were foreign constraints. Then when we deleted corresponding entries from the other table, there was another error of another dependancies. And so on. In the end, this is the path we took: -- Order of deletion IMAGEDETAILS -> ATTACHMENTS -> BODYCONTENT -> CONTENT -- ATTACHMENTS delete from confdb.confschema.ATTACHMENTS where PAGEID in ( select c1.CONTENTID from confschema.CONTENT c1 join confschema.CONTENT c2 on c1.USERNAME = c2.USERNAME and c1.CREATIONDATE < c2.CREATIONDATE where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO' ) -- IMAGEDETAILS delete from confdb.confschema.IMAGEDETAILS where ATTACHMENTID in ( select ATTACHMENTID from confdb.confschema.ATTACHMENTS where PAGEID in ( select c1.CONTENTID from confschema.CONTENT c1 join confschema.CONTENT c2 on c1.USERNAME = c2.USERNAME and c1.CREATIONDATE < c2.CREATIONDATE where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO' ) ) -- BODYCONTENT delete from confdb.confschema.BODYCONTENT where CONTENTID in ( select c1.CONTENTID from confschema.CONTENT c1 join confschema.CONTENT c2 on c1.USERNAME = c2.USERNAME and c1.CREATIONDATE < c2.CREATIONDATE where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO' ) -- CONTENT delete from confdb.confschema.CONTENT where CONTENTID in ( select c1.CONTENTID from confschema.CONTENT c1 join confschema.CONTENT c2 on c1.USERNAME = c2.USERNAME and c1.CREATIONDATE < c2.CREATIONDATE where c1.CONTENTTYPE = 'USERINFO' and c2.CONTENTTYPE = 'USERINFO' ) Question: is it safe to dele all these entires in ATTACHMENTS and other tables on PROD? Are we deleting actual data or just some linkage that helps us resolve the issue with duplicate users?

            Hi akdominguez

            I think that they're probably part and parcel of the same fix although I'd say that CONF-30050 is caused by this. I've linked the issues together anyway. Thanks for picking this up.

            Regards
            Steve

            Steve Haffenden (Inactive) added a comment - Hi akdominguez I think that they're probably part and parcel of the same fix although I'd say that CONF-30050 is caused by this. I've linked the issues together anyway. Thanks for picking this up. Regards Steve

            When trying to implement this workaround, I get an error:

            Error report -
            SQL Error: ORA-02292: integrity constraint (FKA898D4778DD41734) violated - child record found
            02292. 00000 - "integrity constraint (%s.%s) violated - child record found"
            *Cause:    attempted to delete a parent key value that had a foreign
                       dependency.
            *Action:   delete dependencies first then parent or disable constraint.
            

            I don't want to delete any data that might still be attached to the wrong record. How do I get around this?

            Deleted Account (Inactive) added a comment - When trying to implement this workaround, I get an error: Error report - SQL Error: ORA-02292: integrity constraint (FKA898D4778DD41734) violated - child record found 02292. 00000 - "integrity constraint (%s.%s) violated - child record found" *Cause: attempted to delete a parent key value that had a foreign dependency. *Action: delete dependencies first then parent or disable constraint. I don't want to delete any data that might still be attached to the wrong record. How do I get around this?

            Thanks dave@atlassian.com. I know there's no uniqueness constraint on USERINFO records and that can cause this issue as well, but every other user in the system did not have a duplicate created, only those where the case of the usernames didn't match. I think there's probably a quick win here that doesn't rely on the schema - if we can get rid of the case sensitive lookup in UserIndexingUpgradeTask#getUsernamesMissingPersonalInfo we can prevent this exact case from happening again, even if we can't guarantee there will never be duplicates.

            Denise Unterwurzacher [Atlassian] (Inactive) added a comment - Thanks dave@atlassian.com . I know there's no uniqueness constraint on USERINFO records and that can cause this issue as well, but every other user in the system did not have a duplicate created, only those where the case of the usernames didn't match. I think there's probably a quick win here that doesn't rely on the schema - if we can get rid of the case sensitive lookup in UserIndexingUpgradeTask#getUsernamesMissingPersonalInfo we can prevent this exact case from happening again, even if we can't guarantee there will never be duplicates.

              fxu Feng Xu (Inactive)
              dunterwurzacher Denise Unterwurzacher [Atlassian] (Inactive)
              Affected customers:
              50 This affects my team
              Watchers:
              93 Start watching this issue

                Created:
                Updated:
                Resolved: