-
Type:
Suggestion
-
Resolution: Fixed
-
Component/s: Database Support
-
CtB - Improve Existing
The database query made by HibernatePullRequestParticipantDao.searchUsers() can be slow for large table sizes, taking upwards of half a second. For example:
=> EXPLAIN ANALYZE SELECT sta_normal_user.user_id,
sta_normal_user.deleted_timestamp,
sta_normal_user.locale,
sta_normal_user.slug,
sta_normal_user.time_zone,
sta_normal_user.name
FROM sta_normal_user
JOIN stash_user ON sta_normal_user.user_id=stash_user.id
WHERE sta_normal_user.user_id IN (
SELECT DISTINCT sta_pr_participant.user_id
FROM sta_pr_participant join sta_pull_request ON sta_pull_request.id=sta_pr_participant.pr_id AND sta_pull_request.to_repository_id=7462
WHERE sta_pr_participant.pr_role=0
)
ORDER BY sta_normal_user.name OFFSET 0 ROWS FETCH FIRST 25 ROWS ONLY;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=282185.51..282185.57 rows=25 width=51) (actual time=532.760..534.357 rows=25 loops=1)
-> Sort (cost=282185.51..282197.39 rows=4752 width=51) (actual time=532.758..534.353 rows=25 loops=1)
Sort Key: sta_normal_user.name
Sort Method: top-N heapsort Memory: 27kB
-> Nested Loop (cost=277846.12..282051.41 rows=4752 width=51) (actual time=516.879..533.737 rows=2635 loops=1)
-> Merge Join (cost=277845.83..280507.07 rows=4752 width=55) (actual time=516.860..530.501 rows=2635 loops=1)
Merge Cond: (sta_normal_user.user_id = sta_pr_participant.user_id)
-> Index Scan using pk_sta_normal_user on sta_normal_user (cost=0.29..1372.28 rows=38885 width=51) (actual time=0.009..7.347 rows=34107 loops=1)
-> Unique (cost=277845.54..278978.18 rows=4752 width=4) (actual time=516.821..520.094 rows=2636 loops=1)
-> Gather Merge (cost=277845.54..278954.42 rows=9504 width=4) (actual time=516.819..519.528 rows=6731 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=276845.52..276857.40 rows=4752 width=4) (actual time=514.029..514.168 rows=2244 loops=3)
Sort Key: sta_pr_participant.user_id
Sort Method: quicksort Memory: 97kB
Worker 0: Sort Method: quicksort Memory: 97kB
Worker 1: Sort Method: quicksort Memory: 97kB
-> HashAggregate (cost=276507.79..276555.31 rows=4752 width=4) (actual time=513.379..513.668 rows=2244 loops=3)
Group Key: sta_pr_participant.user_id
Batches: 1 Memory Usage: 337kB
Worker 0: Batches: 1 Memory Usage: 337kB
Worker 1: Batches: 1 Memory Usage: 337kB
-> Parallel Hash Join (cost=153181.70..276310.87 rows=78768 width=4) (actual time=132.619..501.635 rows=63647 loops=3)
Hash Cond: (sta_pr_participant.pr_id = sta_pull_request.id)
-> Parallel Seq Scan on sta_pr_participant (cost=0.00..121612.17 rows=577903 width=12) (actual time=0.005..246.722 rows=464030 loops=3)
Filter: (pr_role = 0)
Rows Removed by Filter: 2429101
-> Parallel Hash (cost=152193.46..152193.46 rows=79059 width=8) (actual time=128.775..128.775 rows=63647 loops=3)
Buckets: 262144 Batches: 1 Memory Usage: 9536kB
-> Parallel Seq Scan on sta_pull_request (cost=0.00..152193.46 rows=79059 width=8) (actual time=0.041..111.677 rows=63647 loops=3)
Filter: (to_repository_id = 7462)
Rows Removed by Filter: 400383
-> Index Only Scan using pk_stash_user on stash_user (cost=0.29..0.32 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=2635)
Index Cond: (id = sta_normal_user.user_id)
Heap Fetches: 2
Planning Time: 0.551 ms
Execution Time: 534.450 ms
(37 rows)
a equivalent but better performing alternative would be:
=> EXPLAIN ANALYZE SELECT DISTINCT sta_normal_user.user_id,
sta_normal_user.deleted_timestamp,
sta_normal_user.locale,
sta_normal_user.slug,
sta_normal_user.time_zone,
sta_normal_user.name
FROM sta_normal_user
JOIN stash_user ON sta_normal_user.user_id = stash_user.id
JOIN sta_pr_participant ON sta_pr_participant.user_id = sta_normal_user.user_id
JOIN sta_pull_request ON sta_pull_request.id = sta_pr_participant.pr_id
WHERE sta_pull_request.to_repository_id = 7462
AND sta_pr_participant.pr_role = 0
ORDER BY sta_normal_user.name
OFFSET 0 ROWS FETCH FIRST 25 ROWS ONLY;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=72.31..1846.60 rows=25 width=51) (actual time=6.670..67.875 rows=25 loops=1)
-> Unique (cost=72.31..2759802.96 rows=38885 width=51) (actual time=6.669..67.866 rows=25 loops=1)
-> Incremental Sort (cost=72.31..2757740.33 rows=137509 width=51) (actual time=6.668..67.646 rows=1277 loops=1)
Sort Key: sta_normal_user.name, sta_normal_user.user_id, sta_normal_user.deleted_timestamp, sta_normal_user.locale, sta_normal_user.slug, sta_normal_user.time_zone
Presorted Key: sta_normal_user.name
Full-sort Groups: 14 Sort Method: quicksort Average Memory: 28kB Peak Memory: 28kB
Pre-sorted Groups: 19 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Nested Loop (cost=1.45..2753990.89 rows=137509 width=51) (actual time=3.015..66.817 rows=1330 loops=1)
-> Nested Loop (cost=1.01..1905098.86 rows=1008871 width=59) (actual time=0.172..38.184 rows=7682 loops=1)
Join Filter: (sta_normal_user.user_id = sta_pr_participant.user_id)
-> Nested Loop (cost=0.58..15000.33 rows=38885 width=55) (actual time=0.020..1.316 rows=394 loops=1)
-> Index Scan using uq_normal_user_name on sta_normal_user (cost=0.29..2363.22 rows=38885 width=51) (actual time=0.012..0.338 rows=394 loops=1)
-> Index Only Scan using pk_stash_user on stash_user (cost=0.29..0.32 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=394)
Index Cond: (id = sta_normal_user.user_id)
Heap Fetches: 44
-> Index Scan using idx_sta_pr_participant_user on sta_pr_participant (cost=0.43..44.96 rows=292 width=12) (actual time=0.005..0.091 rows=19 loops=394)
Index Cond: (user_id = stash_user.id)
Filter: (pr_role = 0)
Rows Removed by Filter: 90
-> Memoize (cost=0.44..0.90 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=7682)
Cache Key: sta_pr_participant.pr_id
Cache Mode: logical
Hits: 0 Misses: 7682 Evictions: 0 Overflows: 0 Memory Usage: 593kB
-> Index Scan using pk_sta_pull_request on sta_pull_request (cost=0.43..0.89 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=7682)
Index Cond: (id = sta_pr_participant.pr_id)
Filter: (to_repository_id = 7462)
Rows Removed by Filter: 1
Planning Time: 1.164 ms
Execution Time: 67.931 ms
(29 rows)