Bitbucket Data Center / BSERV-11647

Heavy SSH load can result in active check backlogs

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Medium
    • Fix Version/s: None
    • Affects Version/s: 4.10.2, 4.14.12, 5.16.2, 6.1.1
    • Component/s: SSH
    • Labels: None

    Description

      Issue Summary

      When a user authenticates via SSH and the user's account is managed by a remote Crowd or LDAP server, that server is checked to ensure the user is still active. Under heavy concurrent SSH load, these active checks can back up, leaving a large number of SSH threads blocked on them.

      Prior to Bitbucket Server 5.10, which includes a configurable limit on the number of concurrent SSH connections (see BSERV-10714), the build-up of SSH threads blocked in isUserActive checks can result in an OutOfMemoryError. In Bitbucket Server 5.10 and newer, the build-up will result in new SSH connections being rejected until the backlog clears.

      The "top" frames on SSH thread stacks will look similar to this:

      "ssh-scm-request-handler" #1645084 daemon prio=5 os_prio=0 tid=0x00007f5d68691000 nid=0x4bd5 waiting on condition [0x00007f5adbd3d000]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <0x00000007b4c9e6e8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at com.atlassian.cache.memory.DelegatingCache.acquireLockFor(DelegatingCache.java:474)
          at com.atlassian.cache.memory.DelegatingCache.remove(DelegatingCache.java:244)
          at com.atlassian.stash.internal.user.DefaultUserService.isUserActive(DefaultUserService.java:430)
      

      Workaround

      One workaround is to disable the isUserActive check by setting auth.remote.enabled=false in bitbucket.properties. Note: with the active check disabled, users who are disabled in or removed from LDAP may continue to have SSH access for some period of time afterwards, until their account details are synchronized and their account is disabled in or removed from Bitbucket Server. For systems using delegated authentication, this may mean the user retains SSH access indefinitely (see BSERV-8685).
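
      As a minimal sketch of that workaround, the entry in bitbucket.properties would look like the following. The property name and value come from the description above; the comments are illustrative, and a restart is assumed to be needed for the change to take effect, as is usual for bitbucket.properties changes.

      ```properties
      # Disable the remote (Crowd/LDAP) active check performed on SSH authentication.
      # Trade-off: users disabled in or removed from the remote directory may keep
      # SSH access until the next directory synchronization (see BSERV-8685).
      auth.remote.enabled=false
      ```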

      It may also be possible to make the issue less likely to occur by increasing the TTL for the isUserActive cache, or, if the organization has a large number of users, by increasing the cache's size. The default TTL and size are:

      # If "permissions.cache.users.max" has been overridden in bitbucket.properties, that value
      # will also be used for "auth.remote.cache.cacheSize". If "permissions.cache.users.max"
      # has not been overridden, the default is 5000
      auth.remote.cache.cacheSize=${permissions.cache.users.max}
      # The TTL is defined in seconds, and defaults to 2 minutes
      auth.remote.cache.ttl=120
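
      To illustrate the tuning option, an override in bitbucket.properties might look like the sketch below. The property names are the ones listed above; the values 10000 and 300 are arbitrary examples chosen for illustration, not recommendations from this issue, and suitable values will depend on the number of users and the SSH load on the instance.

      ```properties
      # Hypothetical example values, not tested recommendations.
      # Allow more users to be held in the isUserActive cache (default: 5000)
      auth.remote.cache.cacheSize=10000
      # Cache active-check results for 5 minutes instead of the default 2 minutes
      auth.remote.cache.ttl=300
      ```

      A longer TTL should mean fewer checks against the remote Crowd or LDAP server, but presumably also means a user who has been disabled remotely can keep SSH access for up to that long before the cached result expires.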
      

            People

              Assignee: Unassigned
              Reporter: Bryan Turner (bturner, Inactive)
              Votes: 0
              Watchers: 10
