[CWD-4344] Significantly Slower Sync to Confluence or JIRA in Crowd 2.8 due to /rest/usermanagement/1/group/membership

    • Type: Bug
    • Resolution: Fixed
    • Priority: High
    • Fix Version/s: 2.10.1
    • Affects Version/s: 2.8, 2.9.1
    • Component/s: Performance
    • Labels: None

      In Crowd 2.8, /rest/usermanagement/1/group/membership is significantly slower than in earlier versions of Crowd (2.6 and 2.7). The example below compares Crowd 2.6.5 with Crowd 2.8.0 using 50,000 users, 1,000 groups, and more than 1,000,000 group memberships. This data set has flat group memberships only; the problem is compounded with nested group memberships:

      1. Crowd 2.6.5 took 56s to load /rest/usermanagement/1/group/membership:
        wget --user crowdperformance --password admin http://localhost:8265/crowd/rest/usermanagement/1/group/membership
        --2015-05-06 18:24:19--  http://localhost:8265/crowd/rest/usermanagement/1/group/membership
        Resolving localhost... 127.0.0.1, ::1
        Connecting to localhost|127.0.0.1|:8265... connected.
        HTTP request sent, awaiting response... 401 Unauthorized
        Connecting to localhost|127.0.0.1|:8265... connected.
        HTTP request sent, awaiting response... 200 OK
        Length: unspecified [application/xml]
        Saving to: `membership.1'
        
            [        <=>                            ] 29,348,141   585K/s   in 56s
        
      2. Crowd 2.8.0 took around 8 mins to load /rest/usermanagement/1/group/membership:
        wget --user crowdperformance --password admin http://localhost:8280/crowd/rest/usermanagement/1/group/membership
        --2015-05-06 18:15:32--  http://localhost:8280/crowd/rest/usermanagement/1/group/membership
        Resolving localhost... 127.0.0.1, ::1
        Connecting to localhost|127.0.0.1|:8280... connected.
        HTTP request sent, awaiting response... 401 Unauthorized
        Connecting to localhost|127.0.0.1|:8280... connected.
        HTTP request sent, awaiting response... 200 OK
        Length: unspecified [application/xml]
        Saving to: `membership'
        
            [                       <=>             ] 29,348,141   175K/s   in 8m 2s
        

      Both requests return exactly the same XML output.

      Note

      Both versions were tested with exactly the same user base (loaded via CSV import). The slowdown is especially evident with a large user base.
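
To put the two wget runs side by side, the reported durations can be converted to seconds and compared. The following is a minimal sketch and not part of the original report: the helper names `parse_duration` and `slowdown` are hypothetical, and the parser only handles whole-number durations of the form that appears in the logs above (such as `56s` and `8m 2s`).

```python
def parse_duration(text: str) -> int:
    """Convert a duration such as '56s' or '8m 2s' to whole seconds.

    Assumption: each space-separated part is an integer followed by a
    single unit letter (h, m, or s), as in the wget summaries above.
    """
    units = {"h": 3600, "m": 60, "s": 1}
    total = 0
    for part in text.split():
        value, unit = part[:-1], part[-1]
        total += int(value) * units[unit]
    return total


def slowdown(baseline: str, candidate: str) -> float:
    """Return how many times longer `candidate` took than `baseline`."""
    return parse_duration(candidate) / parse_duration(baseline)


if __name__ == "__main__":
    # Durations copied from the two wget summaries in the description.
    factor = slowdown("56s", "8m 2s")
    print(f"Crowd 2.8.0 took {factor:.1f}x as long as Crowd 2.6.5")
    # prints: Crowd 2.8.0 took 8.6x as long as Crowd 2.6.5
```

Since both downloads are the same size (29,348,141 bytes), the ratio of durations is a fair comparison of the two versions for this data set.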


            Issa added a comment -

            Any progress on this issue? We are currently running 2.7.2 to avoid this nasty bug, but we would like to take advantage of the fix for the numerous requests to the Crowd cookie config. Thank you.


            Ruchi Tandon added a comment -

            PS: The wget command needs to be run with the application name and the application password.

            As a workaround, if the customer has a large number of user directories mapped to one application in Crowd, the user directories can be split between multiple applications in Crowd.

            For example, in place of 1 application JIRA with 200 user directories, customers can create multiple applications in Crowd (i.e. JIRA1, JIRA2, JIRA3...) and have multiple user directories in JIRA, each synchronising with one of these Crowd applications.

            PS: The bug also affects the 2.9.1 release.
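
The split described in this workaround can be planned ahead of time. The following is a minimal sketch that only produces a mapping of application names to directory names; it does not call any Crowd API. The function name `split_directories`, the directory names, and the batch size of 50 are all hypothetical choices for illustration.

```python
from typing import Dict, List


def split_directories(
    directories: List[str], per_application: int, prefix: str = "JIRA"
) -> Dict[str, List[str]]:
    """Group directory names into numbered applications (JIRA1, JIRA2, ...)."""
    if per_application < 1:
        raise ValueError("per_application must be at least 1")
    plan: Dict[str, List[str]] = {}
    for start in range(0, len(directories), per_application):
        # Application numbering starts at 1, matching JIRA1, JIRA2, JIRA3...
        name = f"{prefix}{start // per_application + 1}"
        plan[name] = directories[start:start + per_application]
    return plan


if __name__ == "__main__":
    # 200 placeholder directory names, as in the example in the comment.
    directories = [f"directory-{n}" for n in range(1, 201)]
    for application, members in split_directories(directories, 50).items():
        print(application, len(members))
```

Each resulting application would then be configured in Crowd by hand and paired with its own user directory in JIRA, as the comment describes.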

            Issa added a comment -

            Fine for you Robert. We use nesting and cannot disable it at this stage.

            Any progress on this issue?


            Robert Mota added a comment -

            Hi,

            I don't understand this:

            "We cannot make our Bamboo, Stash, Confluence, Jira, Fisheye synchronize anymore with Crowd."

            We don't have access to your private issue.

            We are using Crowd 2.8 as SSO in front of all our tools and we have no problems. We use MS Active Directory with a lot of nested groups; to avoid the load of nested groups, we use a flattened user-group membership declaration in the Directory configuration, under "User group attribute". The full sync between Crowd and our Active Directory takes 30 min every hour, but the sync between JIRA or Confluence (as well as Fisheye/Crucible) and Crowd takes 40 sec.

            Best regards

            Issa added a comment -

            Hi,

            Following https://support.atlassian.com/servicedesk/customer/portal/16/CWDSUP-12127

            We cannot make our Bamboo, Stash, Confluence, Jira, Fisheye synchronize anymore with Crowd.

            Let us know how you progress on this issue please.

            Thank you


            Krzysztof Novak added a comment -

            osenn, the results of my tests are a bit confusing. To simplify the measurement, I used the time between the "Reticulating splines..." message and the next log message in the Bamboo server logfile.

            If I turn both the "Aggregate group memberships across directories" and the "Use nested groups" flags on, then this gap is about 14 minutes.
            If I turn the "Aggregate group memberships across directories" flag off and the "Use nested groups" flag off, then it takes about 6 minutes.
            If I turn the "Aggregate group memberships across directories" flag off and the "Use nested groups" flag on, then it takes about 10 minutes.

            However, when I turn the "Aggregate group memberships across directories" flag on and the "Use nested groups" flag off, the time drops to about 30 seconds. I tried those scenarios in our test environment, where the entire Atlassian ecosystem, including databases, lives on a single server. Crowd was using a remote Active Directory. The results of those 3 queries above for the local Crowd database are:

            SELECT COUNT( * ) FROM cwd_directory = 2
            SELECT COUNT( * ) FROM cwd_group = 3930
            SELECT COUNT( * ) FROM cwd_membership = 21350

            It would appear, then, that the last scenario is the optimal one.

            I might add that in our production environment, with Bamboo 5.6.2 (much older than the 5.10.3 tested above), the same Crowd version (2.8.3), and 'aggregate groups' off and 'nested groups' on (the 3rd scenario above), the server comes up in about 2 minutes, without any discernible gap between "Reticulating splines..." and the next message.

            Oliver Senn added a comment -

            Thanks for your numbers k.novak1987173943. Are membership aggregation and nested groups active when it takes 20 minutes? Does changing these settings make a difference to the Bamboo startup time?

            Krzysztof Novak added a comment -

            Here are our numbers:

            SELECT COUNT( * ) FROM cwd_directory = 3
            SELECT COUNT( * ) FROM cwd_group = 7917
            SELECT COUNT( * ) FROM cwd_membership = 42508

            With these, our Bamboo server that uses Crowd for authentication takes about 20 minutes to come up.

            Oliver Senn added a comment -

            k.novak1987173943, the queries to get the information are as follows:

            • Number of directories: SELECT COUNT( * ) FROM cwd_directory;
            • Number of groups (across all directories): SELECT COUNT( * ) FROM cwd_group;
            • Number of memberships (across all directories): SELECT COUNT( * ) FROM cwd_membership;

            While other factors can always influence performance, group nesting and membership aggregation are by far the most impactful ones and need to be looked into first.
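
The three count queries can be gathered in one pass. The following is a minimal sketch under loud assumptions: it runs against an in-memory SQLite database with toy single-column tables that stand in for the real Crowd schema, and only the table names `cwd_directory`, `cwd_group`, and `cwd_membership` come from the comment above. The helper names `collect_counts` and `demo_connection` are hypothetical. Against a real Crowd database, a connection from the matching DB-API driver would be passed in instead.

```python
import sqlite3

# The three queries quoted in the comment, keyed by a short label.
COUNT_QUERIES = {
    "directories": "SELECT COUNT(*) FROM cwd_directory",
    "groups": "SELECT COUNT(*) FROM cwd_group",
    "memberships": "SELECT COUNT(*) FROM cwd_membership",
}


def collect_counts(conn) -> dict:
    """Run each count query on a DB-API connection and return the results."""
    counts = {}
    cursor = conn.cursor()
    for label, sql in COUNT_QUERIES.items():
        cursor.execute(sql)
        counts[label] = cursor.fetchone()[0]
    return counts


def demo_connection() -> sqlite3.Connection:
    """Build a toy in-memory database (NOT the real Crowd schema)."""
    conn = sqlite3.connect(":memory:")
    rows_per_table = {"cwd_directory": 2, "cwd_group": 3, "cwd_membership": 5}
    for table, rows in rows_per_table.items():
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
        conn.executemany(
            f"INSERT INTO {table} (id) VALUES (?)",
            [(n,) for n in range(1, rows + 1)],
        )
    conn.commit()
    return conn


if __name__ == "__main__":
    for label, count in collect_counts(demo_connection()).items():
        print(f"{label}: {count}")
```

Reporting the three numbers together makes it easier to compare installations, as the later comments in this thread do.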

            Krzysztof Novak added a comment -

            In reference to Oliver's comments above, it would be helpful to include the actual Crowd database queries needed to supply the requested information, such as the numbers of directories, groups and their members. Also, what attributes other than group nesting or membership aggregation could impact the performance?

            Regardless, significant degradation in Crowd's performance must in the end be due to some code and possibly database changes introduced in 2.8. Surely, Atlassian should be able to identify and remedy those.

              ppetrowski Patryk
              fsim Foo Sim (Inactive)
              Affected customers: 17
              Watchers: 45
