  1. Bitbucket Data Center
  2. BSERV-13540

Degradation in populating groups permissions cache on Oracle 19c

Details

    Description

      Issue Summary

      The query that retrieves permissions for all groups does not specify a `fetchsize`, so on Oracle it falls back to the driver default of 10 rows per fetch. The result set is therefore pulled from the database in many small round trips. When a Bitbucket instance has many groups permissions, populating the groups permissions cache takes a long time.
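
To make the cost of the default concrete, here is a minimal sketch, not Bitbucket's actual code. `roundTrips` is a rough estimate that assumes one round trip per batch of `fetchSize` rows, and `countRows` is a hypothetical helper showing the standard JDBC `setFetchSize` hint on a caller-supplied query. The 20,000-row figure is only an illustration borrowed from the reproduction steps; the real result set size may differ.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class FetchSizeSketch {

    // Estimated number of fetch round trips needed to read `rows` rows
    // when the driver pulls `fetchSize` rows per trip (ceiling division).
    static long roundTrips(long rows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Hypothetical helper: runs a caller-supplied query with an explicit
    // fetch size and counts the rows. The SQL is NOT Bitbucket's real query.
    static int countRows(Connection connection, String sql, int fetchSize) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            // Standard JDBC hint for how many rows to fetch per round trip.
            statement.setFetchSize(fetchSize);
            try (ResultSet rows = statement.executeQuery()) {
                int count = 0;
                while (rows.next()) {
                    count++;
                }
                return count;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(20_000, 10));    // prints 2000
        System.out.println(roundTrips(20_000, 1_000)); // prints 20
    }
}
```

Under these assumptions, raising the fetch size from 10 to 1,000 cuts the estimated round trips for 20,000 rows from 2,000 to 20, which is why network latency to the database dominates the load time at the default.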

      When the groups permissions cache takes seconds to populate, every other request that needs the cache is blocked until the load finishes. On a busy instance, hundreds of requests can arrive while the system is waiting for the groups permissions DB query to complete. In extreme cases, this can exhaust the database connection pool if the customer has many groups permissions and repositories.
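
The blocking behaviour can be illustrated with a small sketch of a lazily loaded value guarded by a lock. This is an assumption about the general pattern only, suggested by the `getUnderLock` frame in the stack trace further down; `LockedLazyReference` and `demoLoads` are hypothetical names and do not reproduce the Atlassian cache implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

class LockedLazyReference<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Supplier<T> loader;
    private T value; // guarded by lock

    LockedLazyReference(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        lock.lock();
        try {
            if (value == null) {
                // The slow load runs while the lock is held, so every
                // other caller of get() waits here until it completes.
                value = loader.get();
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Starts `threads` concurrent readers against a loader that takes
    // `loadMillis` and returns how many times the loader actually ran.
    static int demoLoads(int threads, long loadMillis) {
        AtomicInteger loads = new AtomicInteger();
        LockedLazyReference<String> reference = new LockedLazyReference<>(() -> {
            loads.incrementAndGet();
            try {
                Thread.sleep(loadMillis); // stands in for the slow DB query
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "permissions";
        });
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(reference::get);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return loads.get();
    }
}
```

The loader runs once, but every reader that arrives during the load is parked on the lock. If each of those request threads already holds a database connection, a load that takes seconds instead of milliseconds is enough for the waiting threads to drain the pool.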

      The missing fetchsize was already present in older versions of Bitbucket, but it did not become a real problem until 7.16.0 and later, which changed the way permissions are calculated in order to address the performance problem described in BSERV-12166.

      Steps to Reproduce

      1. Start up BBDC with Oracle 19c
      2. Create 4k repositories
      3. Create 20k group permissions for those repositories

      Expected Results

      Retrieving all groups permissions should take a short amount of time, e.g.:

      [43.7ms] - "GET /rest/ui/latest/projects/<<PROJECT>>/repos/<<REPO>>/pull-requests/75478/build-summaries HTTP/1.1"
       [36.8ms] - NormalUserPermissionVoterCalculator.calculate()
        [17.6ms] - Iterable com.atlassian.stash.internal.user.InternalUserService.getGroupsByUser(String)
        [18.7ms] - GrantedPermissionVoterProvider - load group permissions
         [17.4ms] - Map com.atlassian.stash.internal.user.EffectivePermissionDao.findAllByGroups()
       [2.7ms] - InternalPullRequest com.atlassian.stash.internal.pull.PullRequestDao.findByRepositoryScopedId(int,long)

      The database connection pool does not run out of connections.

      Actual Results

      Retrieving group permissions takes a long time:

      [17272.6ms] - "GET /rest/ui/latest/projects/<<PROJECT>>/repos/<<REPO>>/pull-requests/133/build-summaries HTTP/1.0"
       [2.2ms] - ApplicationUser com.atlassian.bitbucket.user.UserService.getUserById(int)
       [1.1ms] - InternalRepository com.atlassian.stash.internal.repository.RepositoryDao.getBySlug(String,String)
       [3.3ms] - Iterable com.atlassian.stash.internal.user.InternalUserService.getGroupsByUser(String)
       [9346.4ms] - Map com.atlassian.stash.internal.user.EffectivePermissionDao.findAllByGroups()
       [10.5ms] - InternalPullRequest com.atlassian.stash.internal.pull.PullRequestDao.findByRepositoryScopedId(int,long)
       [1.1ms] - DefaultBuildStatusService: Getting build summaries for 0 commits on pull request 133(92643)
        [1.1ms] - Map com.atlassian.bitbucket.internal.build.status.dao.BuildStatusDao.countByState(BuildStatusBulkCommitSummaryCriteria)
      

      On a high-traffic instance, the following warnings are logged in the atlassian-bitbucket.log file:

      2022-10-25 00:21:23,821 WARN  [http-nio-7990-exec-4] ts95311 *WJQ53Vx21x7370741x94 tebxda 10.47.122.2,10.115.199.127 "GET /rest/ui/latest/dashboard/pull-requests HTTP/1.0" c.a.s.i.jdbc.ConnectionTracker The database pool is exhausted. Stacks for the threads holding connections follow:
      2022-10-25 00:21:23,827 WARN  [http-nio-7990-exec-73] *WJQ53Vx21x7370845x194 nh4yhw 10.47.122.2,10.115.197.69 "GET /rest/ui/latest/projects/project/repos/repository/pull-requests/288/build-summaries HTTP/1.0" o.h.e.jdbc.spi.SqlExceptionHelper SQL Error: 0, SQLState: null
      2022-10-25 00:21:23,826 WARN  [http-nio-7990-exec-4] ts95311 *WJQ53Vx21x7370741x94 tebxda 10.47.122.2,10.115.199.127 "GET /rest/ui/latest/dashboard/pull-requests HTTP/1.0" c.a.s.i.jdbc.ConnectionTracker Stack trace for http-nio-7990-exec-10
      com.atlassian.stash.internal.util.StackException: Stack trace for http-nio-7990-exec-10
          at com.atlassian.cache.memory.DelegatingCachedReference.getUnderLock(DelegatingCachedReference.java:103)
          at com.atlassian.cache.memory.DelegatingCachedReference.get(DelegatingCachedReference.java:88)
          at com.atlassian.cache.hazelcast.HazelcastAsyncHybridCachedReference.get(HazelcastAsyncHybridCachedReference.java:65)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider.getGroupsPermissions(GrantedPermissionVoterProvider.java:261)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider.access$300(GrantedPermissionVoterProvider.java:74)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider$NormalUserPermissionVoterCalculator.calculate(GrantedPermissionVoterProvider.java:355)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider$RecalculatingPermissionVoter.recalculate(GrantedPermissionVoterProvider.java:458)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider$RecalculatingPermissionVoter.getVoter(GrantedPermissionVoterProvider.java:429)
          at com.atlassian.stash.internal.permission.GrantedPermissionVoterProvider$RecalculatingPermissionVoter.vote(GrantedPermissionVoterProvider.java:417)
          at com.atlassian.stash.internal.permission.CompositePermissionVoter.vote(CompositePermissionVoter.java:70)
          at com.atlassian.stash.internal.permission.CompositePermissionVoter.vote(CompositePermissionVoter.java:46)
          at com.atlassian.stash.internal.user.PermissionServiceImpl.hasPermission(PermissionServiceImpl.java:609)
          at com.atlassian.stash.internal.user.PermissionServiceImpl.hasPermission(PermissionServiceImpl.java:597)
          at com.atlassian.stash.internal.user.PermissionServiceImpl.hasRepositoryPermission(PermissionServiceImpl.java:681)
          at com.atlassian.stash.internal.user.PermissionServiceImpl.hasRepositoryPermission(PermissionServiceImpl.java:309)
          at com.atlassian.stash.internal.user.PermissionServiceImpl.isRepositoryAccessible(PermissionServiceImpl.java:222)
          at jdk.internal.reflect.GeneratedMethodAccessor596.invoke(Unknown Source) 

      Workaround

      On 7.16.x - 7.17.x and 7.21.7, add the following to the bitbucket.properties file:

      dark-features.permissions.cache-all-groups=false 

            People

              ysun Yingran Sun