Uploaded image for project: 'Jira Data Center'
  1. Jira Data Center
  2. JRASERVER-70995

Inefficient full table scans when deleting from cwd_membership during directory syncs

      Issue Summary

      At present, user deletions during directory syncs do not utilize any indexes on the cwd_membership table, meaning each deletion requires in a full table scan. When the table is large the DB rapidly exhausts any in-memory buffers and has to constantly re-scan the table.

      Examining the SQL logging underneath the sync, the deletion has three predicates on membership_type, child_name, and directory_id:

      delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=<id>;

      No existing index is suitable for this query. There is an index that would be suitable if the query utilized lower_child_name instead:

      "idx_mem_dir_child" btree (lower_child_name, membership_type, directory_id)

      list of all indexes for the table cwd_membership:

          "pk_cwd_membership" PRIMARY KEY, btree (id)
          "uk_mem_parent_child_type" UNIQUE, btree (parent_id, child_id, membership_type)
          "idx_mem_dir_child" btree (lower_child_name, membership_type, directory_id)
          "idx_mem_dir_parent" btree (lower_parent_name, membership_type, directory_id)
          "idx_mem_dir_parent_child" btree (lower_parent_name, lower_child_name, membership_type, directory_id)
      

      Steps to Reproduce

      1. Generate a large number of users and groups in local Jira directory or remote directory; ensure users have one or more group memberships
      2. If using a remote directory, perform a full sync and observe elapsed time for deletions
      3. If using a local directory, delete group memberships for a single user using the following query and benchmark the time:
        delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=<id>;

      Expected Results

      Deletion should utilize a covering index to be efficient.

      Actual Results

      Deletion results in a full table scan. Here's a practical example from a customer with over 3 million rows in their cwd_membership table, and a deletion of a single user's group memberships that took over 7 seconds:

      explain analyze delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=10000;
      QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------------------------
      Delete on cwd_membership (cost=0.00..103367.51 rows=86 width=6) (actual time=7311.263..7311.263 rows=0 loops=1)
      -> Seq Scan on cwd_membership (cost=0.00..103367.51 rows=86 width=6) (actual time=755.912..7308.464 rows=95 loops=1)
      Filter: (((membership_type)::text = 'GROUP_USER'::text) AND ((child_name)::text = '<username>'::text) AND (directory_id = '10000'::numeric))
      Rows Removed by Filter: 3003405
      Planning time: 0.155 ms
      Execution time: 7311.305 ms
      

      When deleting several users this drastically slows down sync time. A sync example from the same customer that took 243 seconds to delete 134 users' group memberships:

       Caesium-1-3 INFO [c.a.crowd.directory.DbCachingRemoteChangeOperations] deleted [ 134 ] users in [ 243102ms ]
      

      Adding a covering index drastically reduces these sync times, deleting 243 users' group memberships in 1.7 seconds:

      Caesium-1-1 INFO [c.a.crowd.directory.DbCachingRemoteChangeOperations] deleted [ 243 ] users in [ 1724ms ]
      

      Workaround

      Add a new composite indexing covering query predicates:

      create index idx_mem_child_dir on cwd_membership (membership_type,child_name,directory_id);

      With this the query planner will use an index scan, rather than a full table scan:

      explain (analyze,buffers) delete from cwd_membership where membership_type='GROUP_USER' and child_name ='<username>' and directory_id =1;
                                                                          QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------------------------------------------
       Delete on cwd_membership  (cost=0.28..11.84 rows=3 width=6) (actual time=0.096..0.096 rows=0 loops=1)
         Buffers: shared hit=9 dirtied=3
         ->  Index Scan using idx_mem_child_dir on cwd_membership  (cost=0.28..11.84 rows=3 width=6) (actual time=0.016..0.019 rows=3 loops=1)
               Index Cond: (((membership_type)::text = 'GROUP_USER'::text) AND ((child_name)::text = '<username>'::text) AND (directory_id = 1::numeric))
               Buffers: shared hit=4
       Planning time: 0.113 ms
       Execution time: 0.114 ms
      

            [JRASERVER-70995] Inefficient full table scans when deleting from cwd_membership during directory syncs

            I wonder if there is a similar workaround for the query that searches for groups

            020-04-30 15:32:57,180-0700 CrowdUsnChangedCacheRefresher:thread-2 INFO ServiceRunner [c.a.c.d.ldap.cache.UsnChangedCacheRefresher] found [ 32294 ] remote groups in [ 101477ms ]

            Jason D Smith added a comment - I wonder if there is a similar workaround for the query that searches for groups 020-04-30 15:32:57,180-0700 CrowdUsnChangedCacheRefresher:thread-2 INFO ServiceRunner [c.a.c.d.ldap.cache.UsnChangedCacheRefresher] found [ 32294 ] remote groups in [ 101477ms ]

            TJ Royall added a comment -

            mdoar2: It is indeed the size of the cwd_membership table that matters; the easiest way to inflate that is to add a large number of users and add them to groups. You should see the effect at any scale (which can you can verify via explain timings), but it does become more pronounced and evident the larger the table becomes.

            I though I had raised this as Jira Core; thanks for catching that! I'll move it now.

            TJ Royall added a comment - mdoar2 : It is indeed the size of the cwd_membership table that matters; the easiest way to inflate that is to add a large number of users and add them to groups. You should see the effect at any scale (which can you can verify via explain timings), but it does become more pronounced and evident the larger the table becomes. I though I had raised this as Jira Core; thanks for catching that! I'll move it now.

            Matt Doar added a comment -

            Should this one be in JRASERVER since it looks like core DB stuff?

            Matt Doar added a comment - Should this one be in JRASERVER since it looks like core DB stuff?

            Matt Doar added a comment -

            "Generate a large number of users and groups " - how many is large here? Isn't it the size of the cwd_membership table that matters?

            Matt Doar added a comment - "Generate a large number of users and groups " - how many is large here? Isn't it the size of the cwd_membership table that matters?

              Unassigned Unassigned
              troyall TJ Royall
              Affected customers:
              5 This affects my team
              Watchers:
              11 Start watching this issue

                Created:
                Updated: