Uploaded image for project: 'Confluence Data Center'
  1. Confluence Data Center
  2. CONFSERVER-23213

Merge memberships for groups with duplicate names during LDAP directory sync

    • 3
    • 13
    • We collect Confluence feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

      NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.

      Whatever the administrator enters in as the "Group name attribute" must be unique for all groups in the LDAP server. Typically this is the "CN" attribute.

      In the event that two different entities in LDAP are found with the same value for that attribute (CN), you receive the following stack trace.

      2011-09-01 00:26:43,979 ERROR [QuartzScheduler_Worker-2] [atlassian.crowd.directory.DbCachingDirectoryPoller] pollChanges Error occurred while refreshing the cache for directory [ 142802946 ].
      java.lang.IllegalArgumentException: duplicate key: duplicatedGroupName
      at com.google.common.collect.RegularImmutableMap.<init>(RegularImmutableMap.java:62)
      at com.google.common.collect.ImmutableMap$Builder.fromEntryList(ImmutableMap.java:210)
      at com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:196)
      at com.google.common.collect.Maps.uniqueIndex(Maps.java:456)
      at com.atlassian.crowd.directory.ldap.cache.AbstractCacheRefresher.synchroniseMemberships(AbstractCacheRefresher.java:126)
      

      Excluding these duplicates from the sync is not possible due to the fact that we can't control the order in which groups are returned from the LDAP server. If the ordering were to change between syncs, the memberships could keep changing on the Confluence side.

      The solution to this problem is to amalgamate memberships for groups which share the same name on the remote server. We should make this an option in the LDAP configuration.

            [CONFSERVER-23213] Merge memberships for groups with duplicate names during LDAP directory sync

            The fix in Crowd does not resolve this issue, instead of merging, duplicate groups are dropped and logged, to allow synchronisation to continue without failing. Reopening this issue to reflect that.

            Denise Unterwurzacher [Atlassian] (Inactive) added a comment - The fix in Crowd does not resolve this issue, instead of merging, duplicate groups are dropped and logged, to allow synchronisation to continue without failing. Reopening this issue to reflect that.

            As of Confluence 4.3.1, we will ignore the duplicate group and its members as per CWD-2504, rather than aborting the synchronisation. Sadly, the improvement requested by this issue is still not yet implemented, due to the extra complexities of resolving which group to modify for writable directories, or directories with local groups.

            Richard Atkins added a comment - As of Confluence 4.3.1, we will ignore the duplicate group and its members as per CWD-2504 , rather than aborting the synchronisation. Sadly, the improvement requested by this issue is still not yet implemented, due to the extra complexities of resolving which group to modify for writable directories, or directories with local groups.

            Mark - yes I believe that issue sums it up. I figured I'd ask the question, and to Matt's point, we can certainly change the identifier which I have begun looking into and it might work.

            And to be honest, its a small fraction of the groups within our LDAP that have duplicate names. And furthermore, they're usually old legacy groups that we don't even care about unrelated to Confluence. But it just sucks each one kills the sync, and then you have to fix it, sync again, fix the next one, etc... it's not easy to query LDAP for all the dups. I'd rather it find a dup, ignore it and log it, and continue. This would be sufficient in most cases since its not a big deal, and in those where a customer actually cares about the group, they'll be more likely to fix it.

            Thanks guys, good discussion.

            Dave Hergert added a comment - Mark - yes I believe that issue sums it up. I figured I'd ask the question, and to Matt's point, we can certainly change the identifier which I have begun looking into and it might work. And to be honest, its a small fraction of the groups within our LDAP that have duplicate names. And furthermore, they're usually old legacy groups that we don't even care about unrelated to Confluence. But it just sucks each one kills the sync, and then you have to fix it, sync again, fix the next one, etc... it's not easy to query LDAP for all the dups. I'd rather it find a dup, ignore it and log it, and continue. This would be sufficient in most cases since its not a big deal, and in those where a customer actually cares about the group, they'll be more likely to fix it. Thanks guys, good discussion.

            I'm not sure that's what David was alluding to (usage of a unique identifier in addition to a name).

            He mentions using sAMAccountName as key, then suggests a separate Display name for groups:
            "And when users search for groups in Confluence, they can obviously search on the more friendly Group Name ..."

            When I have discussed using sAMAccountName for group name to a customer previously, they didn't think it was appropriate because it was not the name that the users "were used to seeing".
            See JRA-26164

            Mark Lassau (Inactive) added a comment - I'm not sure that's what David was alluding to (usage of a unique identifier in addition to a name). He mentions using sAMAccountName as key, then suggests a separate Display name for groups: "And when users search for groups in Confluence, they can obviously search on the more friendly Group Name ..." When I have discussed using sAMAccountName for group name to a customer previously, they didn't think it was appropriate because it was not the name that the users "were used to seeing". See JRA-26164

            @Matt:
            I'm not sure that's what David was alluding to (usage of a unique identifier in addition to a name).

            For usernames, we use (by default) sAMAccountName or UID in most places: these are generally constrained to be unique from the LDAP Side.
            For groups, we use CN, which is not constrained; though in many of these systems, groups also have UID's and sAMAccountName as fields.

            I think David is referring to this inconsistency. Surely there are systems that do not have a constrained unique identifier for group entities, but we could just fall back to an unconstrained identifier for that attribute.

            Tim Wong (Inactive) added a comment - @Matt: I'm not sure that's what David was alluding to (usage of a unique identifier in addition to a name). For usernames, we use (by default) sAMAccountName or UID in most places: these are generally constrained to be unique from the LDAP Side. For groups, we use CN, which is not constrained; though in many of these systems, groups also have UID's and sAMAccountName as fields. I think David is referring to this inconsistency. Surely there are systems that do not have a constrained unique identifier for group entities, but we could just fall back to an unconstrained identifier for that attribute.

            Matt Ryall added a comment -

            … why is Confluence (or more accurately Crowd) basing its unique identifier for groups off of something in another system that is not guaranteed to be unique?

            For better or worse, this is how Confluence, JIRA and Crowd work at the moment. They only support a name for a group, not a separate unique identifier. The group name has to be unique.

            Unfortunately, this is unlikely to change in the future, at least on the Confluence side. We have a limited number of people to work on the product, and we aim to keep the products simple as we enhance it. Adding a group unique identifier to the UI in all the places where they appear would take a lot of work and complicate the UI while only helping a very small number of our customers. A similar situation applies to supporting multiple users with different names (the so-called "domains" in Microsoft software).

            We offer the ability to configure the group name attribute in our LDAP configuration for this reason. You need to pick an attribute which has a unique value for each group (or filter out the duplicate groups). Until we can resolve this issue, that is the best workaround for this problem.

            Matt Ryall added a comment - … why is Confluence (or more accurately Crowd) basing its unique identifier for groups off of something in another system that is not guaranteed to be unique? For better or worse, this is how Confluence, JIRA and Crowd work at the moment. They only support a name for a group, not a separate unique identifier. The group name has to be unique. Unfortunately, this is unlikely to change in the future, at least on the Confluence side. We have a limited number of people to work on the product, and we aim to keep the products simple as we enhance it. Adding a group unique identifier to the UI in all the places where they appear would take a lot of work and complicate the UI while only helping a very small number of our customers. A similar situation applies to supporting multiple users with different names (the so-called "domains" in Microsoft software). We offer the ability to configure the group name attribute in our LDAP configuration for this reason. You need to pick an attribute which has a unique value for each group (or filter out the duplicate groups). Until we can resolve this issue, that is the best workaround for this problem.

            These are all great points, and I would also be fine with merging the groups as well (although giving us the option to either merge or exclude duplicate groups would be ideal).

            If I may ask a rudimentary question; why is Confluence (or more accurately Crowd) basing its unique identifier for groups off of something in another system that is not guaranteed to be unique? That'd be like making your Confluence login ID your first name in LDAP; a bad idea for obvious reasons. So couldn't Crowd use something like sAMAccountName (I know that's AD specific, but it could be something that is known to be unique in whatever type of LDAP Directory you choose) or maybe it could use the DistinguishedName (DN) of the group, which is guaranteed to be unique? And when users search for groups in Confluence, they can obviously search on the more friendly Group Name (which would return multiple results) but they are also shown the DN so they know which one they are actually choosing.

            David Hergert [Windstream] added a comment - These are all great points, and I would also be fine with merging the groups as well (although giving us the option to either merge or exclude duplicate groups would be ideal). If I may ask a rudimentary question; why is Confluence (or more accurately Crowd) basing its unique identifier for groups off of something in another system that is not guaranteed to be unique? That'd be like making your Confluence login ID your first name in LDAP; a bad idea for obvious reasons. So couldn't Crowd use something like sAMAccountName (I know that's AD specific, but it could be something that is known to be unique in whatever type of LDAP Directory you choose) or maybe it could use the DistinguishedName (DN) of the group, which is guaranteed to be unique? And when users search for groups in Confluence, they can obviously search on the more friendly Group Name (which would return multiple results) but they are also shown the DN so they know which one they are actually choosing.

            @Matt
            I can't see how it would be technically possible to merge groups but not technically possible to make the group empty.

            There are customers above on this ticket that have requested the ability to merge the memberships, because they believe this would be the most accurate representation of their LDAP configuration in our applications.

            I have to concede that this would really be giving them what they asked for, I guess I was trying to "protect them from themselves".

            If we do implement it, I believe it should be an option in the directory configuration, so we're by no means forcing every customer into using this behaviour.

            This would make me feel a lot more comfortable if merge is off by default.

            I don't believe this is a big security concern. LDAP is a trusted source of membership information, and is generally difficult to change in our customers' organisations.

            I am not so much worried about deliberate attack as an accidental escalation of privileges.
            eg imagine a customer and under some subtree lived a group with CN=admins that contained all the system administrators for their intranet Confluence server and another subtree lived a group with CN=admins that contains the 2000 shop administrators for their various retail outlets.

            Mark Lassau (Inactive) added a comment - @Matt I can't see how it would be technically possible to merge groups but not technically possible to make the group empty. There are customers above on this ticket that have requested the ability to merge the memberships, because they believe this would be the most accurate representation of their LDAP configuration in our applications. I have to concede that this would really be giving them what they asked for, I guess I was trying to "protect them from themselves". If we do implement it, I believe it should be an option in the directory configuration, so we're by no means forcing every customer into using this behaviour. This would make me feel a lot more comfortable if merge is off by default. I don't believe this is a big security concern. LDAP is a trusted source of membership information, and is generally difficult to change in our customers' organisations. I am not so much worried about deliberate attack as an accidental escalation of privileges. eg imagine a customer and under some subtree lived a group with CN=admins that contained all the system administrators for their intranet Confluence server and another subtree lived a group with CN=admins that contains the 2000 shop administrators for their various retail outlets.

            Matt Ryall added a comment -

            The real problem is that the sync dies at this point and doesn't provide meaningful log messages.

            It does provide a meaningful log message in recent versions of Crowd. That has already been fixed.

            Letting the rest of the synch complete - only fail on the invalid group names

            If it's possible to do this easily with the current sync code, I wouldn't disagree. However, the Crowd guys didn't seem to believe it was possible when we discussed all these possibilities with them.

            By contrast, the merging of memberships didn't seem too complicated - it just requires a bit of an understanding of Crowd's internals. It also mimics what we do with group membership across multiple directories in the application.

            We actually explored this possibility (merging) some 6 months or more ago and it was rejected by JIRA and Crowd because of security concerns. ... Then later someone adds a different group to LDAP that happens to get mapped to the "blah" groupname in an Atlassian app, and suddenly there is a bunch of people with elevated privileges.

            I don't believe this is a big security concern. LDAP is a trusted source of membership information, and is generally difficult to change in our customers' organisations.

            There are customers above on this ticket that have requested the ability to merge the memberships, because they believe this would be the most accurate representation of their LDAP configuration in our applications.

            If we do implement it, I believe it should be an option in the directory configuration, so we're by no means forcing every customer into using this behaviour.

            Matt Ryall added a comment - The real problem is that the sync dies at this point and doesn't provide meaningful log messages. It does provide a meaningful log message in recent versions of Crowd. That has already been fixed. Letting the rest of the synch complete - only fail on the invalid group names If it's possible to do this easily with the current sync code, I wouldn't disagree. However, the Crowd guys didn't seem to believe it was possible when we discussed all these possibilities with them. By contrast, the merging of memberships didn't seem too complicated - it just requires a bit of an understanding of Crowd's internals. It also mimics what we do with group membership across multiple directories in the application. We actually explored this possibility (merging) some 6 months or more ago and it was rejected by JIRA and Crowd because of security concerns. ... Then later someone adds a different group to LDAP that happens to get mapped to the "blah" groupname in an Atlassian app, and suddenly there is a bunch of people with elevated privileges. I don't believe this is a big security concern. LDAP is a trusted source of membership information, and is generally difficult to change in our customers' organisations. There are customers above on this ticket that have requested the ability to merge the memberships, because they believe this would be the most accurate representation of their LDAP configuration in our applications. If we do implement it, I believe it should be an option in the directory configuration, so we're by no means forcing every customer into using this behaviour.

            It is also important to note Tim's comment from Support:

            The original report asked for a graceful handling (not necessarily merging)

            The real problem is that the sync dies at this point and doesn't provide meaningful log messages.

            Support originally raised this issue as "Gracefully handle duplicate key for group name during directory sync".

            That means:

            • Letting the rest of the synch complete - only fail on the invalid group names
            • Putting meaningful error messages in the logs with full information on which groups have failed to synchronise.

            Mark Lassau (Inactive) added a comment - It is also important to note Tim's comment from Support: The original report asked for a graceful handling (not necessarily merging) The real problem is that the sync dies at this point and doesn't provide meaningful log messages. Support originally raised this issue as "Gracefully handle duplicate key for group name during directory sync". That means: Letting the rest of the synch complete - only fail on the invalid group names Putting meaningful error messages in the logs with full information on which groups have failed to synchronise.

              Unassigned Unassigned
              twong Tim Wong (Inactive)
              Votes:
              50 Vote for this issue
              Watchers:
              49 Start watching this issue

                Created:
                Updated: