Bug
Resolution: Fixed
High
7.16.0
None
1
Severity 2 - Major
3
Issue Summary
In Bitbucket Server/DC 7.16 the approach used to cache group permissions changed in order to address the user search performance problems described in https://jira.atlassian.com/browse/BSERV-12166. In Bitbucket 7.16 and later, instead of making separate database requests for every group that is not present in the group permissions cache, GrantedPermissionVoterProvider now builds a cache of all groups with permissions in a single operation.
This new approach significantly improves the performance of requests that must build permissions for multiple users (e.g. user search requests). Unfortunately it is not a one-size-fits-all solution. It assumes that building the cache is fast, because all permission checks are blocked while the cache is being built. This is generally true, but not for Bitbucket instances with a large number of granted group permissions.
Symptoms:
When the group permissions cache is invalidated (for example, when a group permission is added or removed anywhere in the instance) it must be recalculated. While recalculation is in progress, every request that attempts to test a permission blocks until the cache has been recalculated/preloaded. When this takes multiple seconds, the impact is noticeable.
This slowness can be seen in the profiling logs, for example:
[12123.5ms] - NormalUserPermissionVoterCalculator.calculate()
[1.9ms] - Iterable com.atlassian.stash.internal.user.InternalUserService.getGroupsByUser(String)
[6234.0ms] - PreloadingGrantedPermissionVoterProvider - load group permissions
[2945.6ms] - Map com.atlassian.stash.internal.user.EffectivePermissionDao.findAllByGroups()
Notice that the entry "PreloadingGrantedPermissionVoterProvider - load group permissions" took more than 6 seconds. If this value is regularly greater than 1s you may notice poor performance as a result of this problem.
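As a rough way to check for this in a profiling log, the sketch below scans log lines for the "load group permissions" entry and reports every duration above a threshold. The function name, the regular expression, and the 1000 ms default are illustrative assumptions based on the sample lines above, not part of Bitbucket.

```python
import re

# Matches profiling lines such as:
#   [6234.0ms] - PreloadingGrantedPermissionVoterProvider - load group permissions
# The optional prefix before "GrantedPermissionVoterProvider" covers both the
# 7.x class name and the 8.0+ class name.
PRELOAD_LINE = re.compile(
    r"\[(\d+(?:\.\d+)?)ms\] - \S*GrantedPermissionVoterProvider - load group permissions"
)

def slow_preloads(lines, threshold_ms=1000.0):
    """Return the duration in ms of each group-permission preload slower than threshold_ms."""
    durations = []
    for line in lines:
        match = PRELOAD_LINE.search(line)
        if match and float(match.group(1)) > threshold_ms:
            durations.append(float(match.group(1)))
    return durations

# Sample input copied from the profiling excerpt in this issue.
sample = [
    "[12123.5ms] - NormalUserPermissionVoterCalculator.calculate()",
    "[1.9ms] - Iterable com.atlassian.stash.internal.user.InternalUserService.getGroupsByUser(String)",
    "[6234.0ms] - PreloadingGrantedPermissionVoterProvider - load group permissions",
    "[2945.6ms] - Map com.atlassian.stash.internal.user.EffectivePermissionDao.findAllByGroups()",
]
print(slow_preloads(sample))  # [6234.0]
```

If the returned list is regularly non-empty across cache invalidations, the instance is likely affected.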
Note: In Bitbucket 8.0+ PreloadingGrantedPermissionVoterProvider is instead named GrantedPermissionVoterProvider.
In a Java thread dump many threads will be blocked with the following stack frames at the top of the thread stacks:
com.atlassian.cache.memory.DelegatingCachedReference.getUnderLock(DelegatingCachedReference.java:103)
com.atlassian.cache.memory.DelegatingCachedReference.get(DelegatingCachedReference.java:88)
com.atlassian.cache.hazelcast.HazelcastAsyncHybridCachedReference.get(HazelcastAsyncHybridCachedReference.java:65)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider.getGroupsPermissions(PreloadingGrantedPermissionVoterProvider.java:279)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider.access$400(PreloadingGrantedPermissionVoterProvider.java:78)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider$NormalUserPermissionVoterCalculator.calculate(PreloadingGrantedPermissionVoterProvider.java:379)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider$RecalculatingPermissionVoter.recalculate(PreloadingGrantedPermissionVoterProvider.java:483)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider$RecalculatingPermissionVoter.getVoter(PreloadingGrantedPermissionVoterProvider.java:454)
com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider$RecalculatingPermissionVoter.vote(PreloadingGrantedPermissionVoterProvider.java:442)
com.atlassian.stash.internal.permission.CompositePermissionVoter.vote(CompositePermissionVoter.java:70)
com.atlassian.stash.internal.permission.CompositePermissionVoter.vote(CompositePermissionVoter.java:46)
com.atlassian.stash.internal.user.PermissionServiceImpl.hasPermission(PermissionServiceImpl.java:609)
com.atlassian.stash.internal.user.PermissionServiceImpl.hasPermission(PermissionServiceImpl.java:597)
..
Note: Again, in Bitbucket 8.0+ PreloadingGrantedPermissionVoterProvider is instead named GrantedPermissionVoterProvider.
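To gauge how many threads are stuck at this point, a thread dump can be scanned for the getGroupsPermissions frame. The sketch below assumes a jstack-style dump in which thread entries are separated by blank lines; the function name and the thread names in the sample are hypothetical.

```python
import re

# Substring shared by the 7.x and 8.0+ class names for the cache lookup frame.
FRAME = "GrantedPermissionVoterProvider.getGroupsPermissions"

def count_waiting_threads(dump_text):
    """Count thread entries whose stack contains the group-permissions cache lookup."""
    # Assumption: each thread entry is separated from the next by a blank line.
    entries = re.split(r"\n\s*\n", dump_text)
    return sum(1 for entry in entries if FRAME in entry)

# Hypothetical, heavily shortened dump used only to exercise the function.
sample_dump = """\
"http-nio-7990-exec-1" #101 daemon
   java.lang.Thread.State: WAITING (parking)
    at com.atlassian.cache.memory.DelegatingCachedReference.getUnderLock(DelegatingCachedReference.java:103)
    at com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider.getGroupsPermissions(PreloadingGrantedPermissionVoterProvider.java:279)

"http-nio-7990-exec-2" #102 daemon
   java.lang.Thread.State: WAITING (parking)
    at com.atlassian.cache.memory.DelegatingCachedReference.getUnderLock(DelegatingCachedReference.java:103)
    at com.atlassian.stash.internal.permission.PreloadingGrantedPermissionVoterProvider.getGroupsPermissions(PreloadingGrantedPermissionVoterProvider.java:279)

"housekeeping" #12 daemon
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
"""
print(count_waiting_threads(sample_dump))  # 2
```

A large count taken shortly after a group permission change is consistent with the symptom described here.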
Workaround
A number of workarounds are available:
1) Reduce the number of permission grants
Where possible, grant permissions at the project level rather than at the repository level, then delete any repository level grants that the project level grant makes redundant.
You can count the number of group permissions by running this query for repository level grants:
select count(*) from sta_repo_permission where group_name is not null and user_id is null;
and this for project level grants:
select count(*) from sta_project_permission where group_name is not null and user_id is null;
Aim to keep the sum of these numbers under 50,000 for best results.
2) Switch back to the implementation of GrantedPermissionVoterProvider that was present in Bitbucket versions prior to 7.16
In certain versions of Bitbucket the old implementation of the GrantedPermissionVoterProvider is still present and can be re-enabled. This is possible on Bitbucket 7.16.x - 7.17.x and 7.21.7+ (but not 8.x), and is achieved by adding the following to the bitbucket.properties file and then restarting the Bitbucket application:
dark-features.permissions.cache-all-groups=false
Note that in doing this you may be subject to some performance limitations that existed in the old implementation, for example slow user search (i.e. user selectors) as described in BSERV-12166. If you do revert to the old implementation it is often beneficial to increase the number of cached group permissions. For example, to increase this from the default of 2500 to 20000, also set the following in your bitbucket.properties file:
permissions.cache.groups.max=20000
To determine a value for this number, run these two database queries and sum the two results:
select count(*) from sta_repo_permission where group_name is not null and user_id is null;
select count(*) from sta_project_permission where group_name is not null and user_id is null;
A value of at least 50% of the sum of these two numbers should be adequate in most cases.
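The sizing rule above can be worked through as a small calculation. This is a minimal sketch: the function name and the sample counts are hypothetical, and never going below the default of 2500 is an assumption added here rather than something stated in this issue.

```python
import math

def suggested_cache_size(repo_group_grants, project_group_grants,
                         fraction=0.5, default=2500):
    """Suggest a permissions.cache.groups.max value from the two query results."""
    total = repo_group_grants + project_group_grants
    # At least `fraction` of all group grants, but never below the default (assumption).
    return max(default, math.ceil(total * fraction))

# Hypothetical counts: 30,000 repository level and 10,000 project level group grants.
size = suggested_cache_size(30000, 10000)
print(f"permissions.cache.groups.max={size}")  # permissions.cache.groups.max=20000
```

With these sample counts the result matches the 20000 used in the example setting above.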
Other notes
BSERV-13540 describes a problem specific to the Oracle database that exacerbates this problem. For instances running Oracle 19c the queries used to fetch the permission grants are very slow, much slower than on the other databases Bitbucket supports. If you are running Oracle 19c and are impacted by this issue, consider upgrading to a version where BSERV-13540 is fixed; this may provide adequate performance for you.