Confluence Data Center

[CONFSERVER-28621] User Loses all Local Group Memberships If LDAP Sync is Unable to find the User, but the User appears again in subsequent syncs

      Steps to Reproduce

      1. Add a connection to LDAP in Confluence Admin >> User Directories, using the "Read Only, with Local Groups" option
      2. Sync the directory and make sure that LDAP users are returned
      3. Add 1 LDAP user to a local group (membership)
      4. Change the User Object Filter in the directory's configuration in Confluence Admin >> User Directories to a dummy filter, such as the following:
        (&(objectclass=inetorgperson)(cn=dummynonexistentuser))
        
      5. Sync the directory again (Notice that the LDAP users are missing)
      6. Revert the User Object Filter to the previous working filter
      7. Sync the directory again (notice that the LDAP users are back, but their local group memberships are gone)

      Workaround

      1. Restore a database backup of the instance, taken before the memberships were lost, to a new database (i.e. not production).
      2. Follow the instructions in step 1 of Migrating Local Group Memberships Between Directories to generate a CSV file of users and their memberships.
      3. Run through the rest of the instructions in that KB article to populate the production instance's group memberships.

            Brennan Norwood added a comment -

            For anyone who is still experiencing this issue, we were able to resolve this by disabling the Incremental Sync option in the Active Directory sync settings. Doing this triggered a full sync instead of an incremental sync, and the group memberships of the affected users reappeared.

            We have a fairly small organization (~200 users), so we decided to leave incremental sync off, as the full sync took no more time to complete. In theory you should be able to re-enable it after the full sync, and it would be happy again.

            Tim added a comment -

            Afaik this was never an issue with JIRA, at least not in version 6 or 7, because JIRA deactivates these users and keeps their local group memberships instead of removing them, as Confluence did before this fix.

            Shashank Agrawal added a comment -

            I am facing the same issue in JIRA. What was the fix?

            Ze'ev (Inactive) added a comment -

            A fix for this issue is now available for Confluence Server customers.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Mustafa Abusalah added a comment -

            Finally, thank you.

            Liviu Constantinescu added a comment -

            Whoa, it's finally getting fixed!

            Hooray, Atlassian!

            nexum Support added a comment -

            I cannot believe this is still not fixed and has only medium priority. It hit us again today, this totally sucks!

            Atlassian, you're really losing here ... I could not care less about collaborative editing if it means no one has time to fix bugs anymore!!!

            Mustafa Abusalah added a comment -

            This issue remains in Confluence 6.

            Liviu Constantinescu added a comment -

            We still run into this bug regularly and just have to live with it, or log people in under another account when it strikes them (although that isn't an option for a lot of use cases).

            Nasty stuff. Hoping an upcoming move to Active Directory will make some kind of difference, though I don't see why it would.

            Griffin Idleman added a comment -

            Ran into this bug with Confluence 5.9 recently and it had a large impact on our users. Thousands of memberships removed.

            Feng Xu (Inactive) added a comment - - edited

            After reproducing the issue and looking through the code in both Confluence and Crowd, I have come up with a solution. The proposed workflow is as follows:

            T1 - A local group has been linked with the LDAP users.

            T2 - After each synchronization (every 60 minutes by default), the sync result could be a success, a failure, or a partial sync.

            T3 - The system assesses the potential impact on the directory that was just synchronized by checking whether the number of users to be removed, or the percentage of users to be removed, exceeds a predefined and configurable threshold (say 50 users or 5% respectively). If so, the system sends a notification to the admin rather than dropping these users automatically.

            T4 - The admin reviews the notification, which lists the potential removals, and either confirms or withdraws the deletion.

            T5 - The admin has 48 hours to act. If the admin does not act on the notification within 48 hours, the system drops the users automatically. In other words, these users have 48 hours to survive.

            The strategy aims to find a trade-off between convenience and data security, and it ignores many implementation details for now, such as where to store the users pending deletion. As for the implementation, the best place to make the change is in Crowd; however, Confluence should be able to have its own implementation of AbstractCacheRefresher if Crowd does not adopt this strategy.

            Note: during the 48-hour response window, the users pending deletion are still able to use Confluence.
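
The threshold check in T3 and the review window in T5 could look roughly like the following. This is only an illustrative sketch of the proposal, not code from Confluence or Crowd: the names `RemovalDecision`, `assess_removal`, and `review_window_expired` are hypothetical, and the defaults (50 users, 5%, 48 hours) are simply the example values mentioned in the comment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RemovalDecision:
    apply_now: bool  # True when the removals are small enough to apply automatically
    reason: str


def assess_removal(total_users, users_to_remove, max_count=50, max_fraction=0.05):
    """Step T3 (hypothetical): decide whether pending removals need admin review."""
    if users_to_remove <= 0:
        return RemovalDecision(True, "nothing to remove")
    # Treat an empty or unknown directory size as the worst case.
    fraction = users_to_remove / total_users if total_users > 0 else 1.0
    if users_to_remove > max_count:
        return RemovalDecision(False, f"{users_to_remove} removals exceed the limit of {max_count}")
    if fraction > max_fraction:
        return RemovalDecision(False, f"{fraction:.1%} of users exceeds the limit of {max_fraction:.0%}")
    return RemovalDecision(True, "below both thresholds")


def review_window_expired(flagged_at, now, window=timedelta(hours=48)):
    """Step T5 (hypothetical): after the window passes without admin action, removal proceeds."""
    return now - flagged_at >= window
```

With these example defaults, a sync that would remove 10 of 1,000 users is applied directly, while one that would remove 60 of 1,000 (or 10 of 100) is held for review.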

            Liviu Constantinescu added a comment -

            I get this all the time, but only for specific users. Definitely hits hard when it does hit, though... manually syncing does nothing... and these users really need to access their Confluence docs. =/

            Mustafa Abusalah added a comment - - edited

            We are still facing this bug, and it is causing a lot of trouble. Does the same problem still exist in Confluence 6?

            Denise Unterwurzacher [Atlassian] (Inactive) added a comment -

            Upgrading the symptom severity to critical, because when it hits customers, their users cannot log in at all.

            nexum Support added a comment -

            I am rather surprised that this is priority medium, to be honest. This is a major problem, and if you're affected on a large instance like ours, it is a disaster. This should get fixed asap!!!!!!

            David Skreiner added a comment - - edited

            OK, so our user directory had a problem for half an hour. Both JIRA and Confluence synched.

            When the user directory was working again, JIRA worked as well as before - everyone was back in their groups.

            Our Confluence is now unusable to most users, since it emptied ALL of the user groups.


            Peter-Dave Sheehan added a comment -

            Interesting to see the conversation.
            We were subject to a snafu from our IT organization where all users dropped out of the AD group we used for authentication for just a few hours (all Confluence memberships for all 1800 users were lost, even after IT restored the AD group members). I would think that, at a minimum, it would be possible to offer a customer-configurable option of "time before deleting externally removed users".
            With such a parameter, which I would likely set to 24 hours, we would have a sufficient buffer to catch users being removed and added back within a short time.

            The other workaround available now is to set the LDAP sync interval to run less frequently. But then we can't get new users on-boarded quickly.
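
A "time before deleting externally removed users" setting could be modelled along the following lines. This is a minimal sketch under stated assumptions, not an existing Confluence or Crowd option: `GracePeriodTracker` and `observe_sync` are hypothetical names, and the 24-hour default is just the value suggested in the comment.

```python
from datetime import datetime, timedelta


class GracePeriodTracker:
    """Hypothetical helper: delay removal of users that vanish from the remote directory."""

    def __init__(self, grace=timedelta(hours=24)):
        self.grace = grace
        self.missing_since = {}  # username -> time the user was first seen missing

    def observe_sync(self, known_users, remote_users, now):
        """Record one sync result and return the users whose grace period has expired."""
        remote = set(remote_users)
        # Users that reappeared are no longer considered missing.
        for name in list(self.missing_since):
            if name in remote:
                del self.missing_since[name]
        expired = set()
        for name in set(known_users) - remote:
            first_missing = self.missing_since.setdefault(name, now)
            if now - first_missing >= self.grace:
                expired.add(name)
        # Expired users are handed to the caller for removal and forgotten here.
        for name in expired:
            del self.missing_since[name]
        return expired
```

In this model, a user who disappears for a few hours and then returns is never reported for removal, so local group memberships would stay untouched; only users missing for the whole grace period are returned.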

            Mark Lassau (Inactive) added a comment -

            I don't have a strong opinion on what Confluence should do, but I will clear up a few points for the record ...

            "I think JIRA used to delete all the user's data, so a user disappearing and reappearing from LDAP would actually destroy significant amounts of data in JIRA."

            Not true.
            The data was still there, and still linked, but all we had to show was the username.
            The main point of keeping user data around is to have a more user-friendly, historically correct display name and email address still available, even for users that were deleted in the external user directory.

            I would also note here that we don't keep all deleted users - only ones who are a reporter, assignee, or commenter.
            Also, we don't keep the user data if there is a user in another directory with the same username.

            "Marking them as inactive isn't a good general solution in my mind because it would still affect their state in Confluence if the user reappeared on the LDAP side."

            If the user reappears, then we mark them as active again.

            "It also isn't a true reflection of the state of LDAP, which supports users not existing as well as users not being enabled."

            Firstly, when JIRA inactivates a user who disappears from LDAP, we also set a second flag in the DB called "deleted externally", so the UI actually makes the true state of this user clear.
            Inactivating as well just helps with a bunch of stuff, like making sure we don't send notifications, keeping them out of searches, etc.

            Secondly, Embedded Crowd doesn't actually recognise the LDAP enabled flags for the most part, so this is a bit disingenuous.
            (I think it might work for MS Active Directory, but not for any of the others.)
            Users who are inactive in LDAP will currently be shown as active in Confluence.
            See CWD-2762 and friends.

            Matt Ryall added a comment -

            Some people have been asking for my thoughts on this ticket and the difference between JIRA and Confluence.

            I believe the difference in LDAP sync behaviour arose due to a distinction between how Confluence and JIRA handle deleted users. When a user is deleted or disappears, Confluence keeps all their data around and treats them as an unknown user. We have done this since before we had Embedded Crowd with LDAP local group memberships. I think JIRA used to delete all the user's data, so a user disappearing and reappearing from LDAP would actually destroy significant amounts of data in JIRA. That isn't true in Confluence.

            Marking them as inactive isn't a good general solution in my mind because it would still affect their state in Confluence if the user reappeared on the LDAP side. It also isn't a true reflection of the state of LDAP, which supports users not existing as well as users not being enabled. I worry we'd end up with stale users in Confluence if we used this as a general solution.

            My preferred solution would be updating Embedded Crowd not to delete local memberships when remote users disappear. We're deleting user-generated data (local groups and their memberships) when we shouldn't. This also doesn't fudge the LDAP data in order to solve this problem.

            We should also investigate and fix the problems where an LDAP server can return empty results and we don't recognise this as an exceptional case. But there will still be cases where admins accidentally break their LDAP configuration temporarily so it returns no users, so we still need to implement my suggestion above of not obliterating the locally stored data in this case.


            Nikola Bornová added a comment -

            Note - the same thing happens in Crowd 2.8.3 connected to LDAP. When LDAP is not available and Crowd tries to synchronize users with it, users lose their group memberships (although this does not affect all users, only some of them).

            Eunice Mora added a comment -

            Is there an update on when this is going to be fixed?

            Steve Haffenden (Inactive) added a comment -

            Hi v.medagam

            Thanks for getting in touch. This sounds like something that our support team should be able to assist you with. Could I suggest that you raise a support ticket on https://support.atlassian.com? Once you've done this, one of our support engineers will be in touch and will work with you to investigate your issue further.

            Regards
            Steve Haffenden
            Confluence Bugmaster

            medagam added a comment - - edited

            We are having a lot of trouble with this issue. At least twice a week, 70-80 users out of a group of 120 lose access to the Confluence space. Interestingly, 40 of the 120 users are external, meaning they are not part of our organization. Is this causing the problem?

            Another interesting point: we also have a JIRA instance, and it works fine (touch wood) with the same user group and sync; users are imported very quickly from LDAP into the JIRA groups.

            It is eating up most of our time every week. Please provide a solution as soon as possible.

            We racked our brains over whether we had made a mistake in the directory filters, but that is not the issue at all.

            We are using version 5.4.4 of Confluence.

            Cedric Weber added a comment -

            WORD! Recently had 4h downtime on our Confluence with 5500 users due to this bug!

            intersol_old added a comment -

            In fact, this bug is caused by Crowd not waiting for the LDAP request to succeed when talking to LDAP. If the LDAP server starts sending users but, after a while, a packet is lost or there is a timeout, you end up with Crowd deciding that only the users it received are in that group. That's a huge mistake: Crowd should never perform any action on incompletely received data.

            As you can guess, this problem is more likely to appear on big directories, so I am not impressed that the developers are not keen on solving a bug that is hard to replicate. Still, these big directories are where the big licensing money comes from.
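
The principle of never acting on incompletely received data can be sketched as follows. This illustrates the idea only and is not how Crowd is implemented: `fetch_page`, `fetch_all_users`, `sync_users`, and `IncompleteSyncError` are hypothetical names, and the paging contract (a page of names plus a "more results" flag) is an assumption made for the example.

```python
class IncompleteSyncError(Exception):
    """Raised when the remote directory could not be read in full."""


def fetch_all_users(fetch_page):
    """Collect every page; fetch_page(offset) returns (names, has_more) or raises."""
    users = []
    offset = 0
    while True:
        try:
            page, has_more = fetch_page(offset)
        except Exception as exc:
            # A timeout or lost connection mid-transfer must not yield a partial list.
            raise IncompleteSyncError("directory read failed part-way") from exc
        users.extend(page)
        offset += len(page)
        if not has_more:
            return users
        if not page:
            raise IncompleteSyncError("empty page although more results were expected")


def sync_users(local_users, fetch_page):
    """Return (users, applied). Local state is kept unchanged if the read was incomplete."""
    try:
        remote = set(fetch_all_users(fetch_page))
    except IncompleteSyncError:
        return set(local_users), False
    return remote, True
```

The key design choice is that the result of a sync is either the complete remote user set or no change at all; a partial read never becomes the basis for deleting users or memberships.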

            Tim added a comment -

            High hopes for 2015!


            Rachel Robins added a comment -

            Done, this page has been updated to include a link to this issue.

            Steve Haffenden (Inactive) added a comment -

            rrobins Would you be able to make those changes?

            Cheers
            Steve

            Cedric Weber added a comment -

            This is a very serious problem in our fast-growing Confluence environment! We are running Confluence as an intranet and extranet for >15.000 users behind an LDAP proxy that unifies diverse AD servers. As soon as one of the servers has a problem, or an admin makes unintended changes, many or even all users lose their group memberships!

            Does this actually affect JIRA as well? We have not tested this on JIRA yet...

            Dennis added a comment -

            We are currently experiencing the same odd problems.
            There was a problem with our LDAP servers: one or two of the LDAP nodes went down (3 servers behind 1 load balancer), so we had to connect the systems to a specific LDAP node (instead of the LDAP load balancer). Afterwards, most users lost their local group memberships in Confluence, JIRA and Stash as well.
            We have approx. >20k users on our systems, so we need a solution for the future: if a user is not found by the LDAP connector, the Atlassian application must not delete the group memberships. Something like a deactivation, or a checkbox like "in case of LDAP connection loss do not delete group memberships".
            We now have loads of user tickets caused by access problems / missing group memberships - this is definitely not enterprise-like.
            Our SAP NetWeaver Portal, for example, was affected by the LDAP server downtime as well, so we applied the same workaround, creating a new LDAP connection to the only LDAP node that had kept running, and no one lost their group memberships.

            Matt Ryall added a comment -

            Thanks for the clarification, jderksenco. Your problem actually sounds more serious, if Confluence is losing the memberships when a connection failure happens with LDAP. (The intentional case of changing your user filter is something we aren't likely to fix.)

            fsim - I guess you've reproduced the problem. Does it happen whenever the LDAP server goes down? Does it perhaps only happen if connectivity is lost during a sync? Were there any errors in the customer logs which might indicate a cause?


            Dan Ziggas added a comment -

            Is there any root cause identified for this, or is a patch available?


            jderksen added a comment -

            We didn't change the User Object Filter and sync to introduce this problem.

            Our LDAP was taken down intentionally and was completely unavailable for 90 minutes. When LDAP came back up, the same Confluence users were present in LDAP, and the same User Object Filter was in place in the directory configuration, and the filter found the same users before and after LDAP was unavailable.

            When LDAP came back up, in the admin gui and in the app - the users appeared to be exactly the same people except that they had no group memberships and were not associated with the content they authored. The cwd_membership table had lost all the group memberships for the LDAP directory users. After comparing Confluence IDs from a backup database with the user ids in the current database, it was clear that the same users had been added back into Confluence as brand new users with new IDs.

            So even when the group memberships were put back in place by restoring the cwd_membership table, the memberships were not re-established.

              fxu Feng Xu (Inactive)
              fsim Foo Sim (Inactive)
              Affected customers: 79
              Watchers: 85