Uploaded image for project: 'Jira Data Center'
  1. Jira Data Center
  2. JRASERVER-70995

Inefficient full table scans when deleting from cwd_membership during directory syncs

      Issue Summary

      At present, user deletions during directory syncs do not utilize any indexes on the cwd_membership table, meaning each deletion requires in a full table scan. When the table is large the DB rapidly exhausts any in-memory buffers and has to constantly re-scan the table.

      Examining the SQL logging underneath the sync, the deletion has three predicates on membership_type, child_name, and directory_id:

      delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=<id>;

      No existing index is suitable for this query. There is an index that would be suitable if the query utilized lower_child_name instead:

      "idx_mem_dir_child" btree (lower_child_name, membership_type, directory_id)

      list of all indexes for the table cwd_membership:

          "pk_cwd_membership" PRIMARY KEY, btree (id)
          "uk_mem_parent_child_type" UNIQUE, btree (parent_id, child_id, membership_type)
          "idx_mem_dir_child" btree (lower_child_name, membership_type, directory_id)
          "idx_mem_dir_parent" btree (lower_parent_name, membership_type, directory_id)
          "idx_mem_dir_parent_child" btree (lower_parent_name, lower_child_name, membership_type, directory_id)
      

      Steps to Reproduce

      1. Generate a large number of users and groups in local Jira directory or remote directory; ensure users have one or more group memberships
      2. If using a remote directory, perform a full sync and observe elapsed time for deletions
      3. If using a local directory, delete group memberships for a single user using the following query and benchmark the time:
        delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=<id>;

      Expected Results

      Deletion should utilize a covering index to be efficient.

      Actual Results

      Deletion results in a full table scan. Here's a practical example from a customer with over 3 million rows in their cwd_membership table, and a deletion of a single user's group memberships that took over 7 seconds:

      explain analyze delete from cwd_membership where membership_type='GROUP_USER' and child_name='<username>' and directory_id=10000;
      QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------------------------
      Delete on cwd_membership (cost=0.00..103367.51 rows=86 width=6) (actual time=7311.263..7311.263 rows=0 loops=1)
      -> Seq Scan on cwd_membership (cost=0.00..103367.51 rows=86 width=6) (actual time=755.912..7308.464 rows=95 loops=1)
      Filter: (((membership_type)::text = 'GROUP_USER'::text) AND ((child_name)::text = '<username>'::text) AND (directory_id = '10000'::numeric))
      Rows Removed by Filter: 3003405
      Planning time: 0.155 ms
      Execution time: 7311.305 ms
      

      When deleting several users this drastically slows down sync time. A sync example from the same customer that took 243 seconds to delete 134 users' group memberships:

       Caesium-1-3 INFO [c.a.crowd.directory.DbCachingRemoteChangeOperations] deleted [ 134 ] users in [ 243102ms ]
      

      Adding a covering index drastically reduces these sync times, deleting 243 users' group memberships in 1.7 seconds:

      Caesium-1-1 INFO [c.a.crowd.directory.DbCachingRemoteChangeOperations] deleted [ 243 ] users in [ 1724ms ]
      

      Workaround

      Add a new composite indexing covering query predicates:

      create index idx_mem_child_dir on cwd_membership (membership_type,child_name,directory_id);

      With this the query planner will use an index scan, rather than a full table scan:

      explain (analyze,buffers) delete from cwd_membership where membership_type='GROUP_USER' and child_name ='<username>' and directory_id =1;
                                                                          QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------------------------------------------
       Delete on cwd_membership  (cost=0.28..11.84 rows=3 width=6) (actual time=0.096..0.096 rows=0 loops=1)
         Buffers: shared hit=9 dirtied=3
         ->  Index Scan using idx_mem_child_dir on cwd_membership  (cost=0.28..11.84 rows=3 width=6) (actual time=0.016..0.019 rows=3 loops=1)
               Index Cond: (((membership_type)::text = 'GROUP_USER'::text) AND ((child_name)::text = '<username>'::text) AND (directory_id = 1::numeric))
               Buffers: shared hit=4
       Planning time: 0.113 ms
       Execution time: 0.114 ms
      

            [JRASERVER-70995] Inefficient full table scans when deleting from cwd_membership during directory syncs

            SET Analytics Bot made changes -
            UIS Original: 0 New: 1
            SET Analytics Bot made changes -
            UIS Original: 1 New: 0
            SET Analytics Bot made changes -
            UIS Original: 2 New: 1
            SET Analytics Bot made changes -
            UIS Original: 1 New: 2
            SET Analytics Bot made changes -
            Support reference count Original: 2 New: 3
            Andrzej Kotas made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 633508 ]
            Bugfix Automation Bot made changes -
            Support reference count Original: 1 New: 2
            SET Analytics Bot made changes -
            UIS Original: 0 New: 1
            Andriy Yakovlev [Atlassian] made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 589385 ]
            SET Analytics Bot made changes -
            UIS Original: 1 New: 0

              Unassigned Unassigned
              troyall TJ Royall
              Affected customers:
              5 This affects my team
              Watchers:
              11 Start watching this issue

                Created:
                Updated: