Add improvements to cluster-wide locking of pull requests to avoid database connection exhaustion


    • Type: Suggestion
    • Resolution: Unresolved
    • Component/s: Pull Requests

      Context

      Clustering issues (e.g. a split-brain scenario) can cause threads that call the cluster-wide pull request locking methods to get stuck: the threads block until the cluster locks are acquired.

      These threads are participating in transactions that hold open DB connections, so a node's DB connection pool can become exhausted.
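      The problematic pattern, in a minimal sketch (the class below is hypothetical and only illustrates the shape of the issue, not Bitbucket's actual internals): a thread that already holds a pooled DB connection for its transaction makes an unbounded lock() call on a cluster lock.

      import java.util.concurrent.locks.Lock;

      // Hypothetical illustration of the failure mode described above.
      public class PullRequestUpdater {

          private final Lock clusterLock; // e.g. a Hazelcast-backed cluster lock

          public PullRequestUpdater(Lock clusterLock) {
              this.clusterLock = clusterLock;
          }

          // Called inside an open transaction, i.e. a DB connection is already
          // checked out from the pool for this thread.
          public void processIfUpdated(Runnable update) {
              clusterLock.lock(); // blocks indefinitely if the cluster is unhealthy
              try {               // (e.g. split-brain), while the DB connection stays held
                  update.run();
              } finally {
                  clusterLock.unlock();
              }
          }
      }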

      Sample log excerpt showing DB pool exhaustion and the stack trace of a thread holding a DB connection while waiting on the cluster lock:

      2025-01-28 22:18:56,101 WARN  [ssh-auth]  c.a.s.i.jdbc.ConnectionTracker The database pool is exhausted. Stacks for the threads holding connections follow:
      2025-01-28 22:18:56,104 WARN  [ssh-auth]  c.a.s.i.jdbc.ConnectionTracker Stack trace for http-nio-127.0.0.1-7991-exec-50
      com.atlassian.stash.internal.util.StackException: Stack trace for http-nio-127.0.0.1-7991-exec-50
      	at java.base@17.0.12/jdk.internal.misc.Unsafe.park(Native Method)
      	at java.base@17.0.12/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
      	at com.hazelcast.spi.impl.AbstractInvocationFuture.manageParking(AbstractInvocationFuture.java:693)
      	at com.hazelcast.spi.impl.AbstractInvocationFuture.joinInternal(AbstractInvocationFuture.java:582)
      	at com.hazelcast.internal.locksupport.LockProxySupport.lock(LockProxySupport.java:67)
      	at com.hazelcast.internal.locksupport.LockProxySupport.lock(LockProxySupport.java:59)
      	at com.hazelcast.map.impl.proxy.MapProxyImpl.lock(MapProxyImpl.java:320)
      	at com.atlassian.beehive.hazelcast5.HazelcastMappedLock.lock(HazelcastMappedLock.java:39)
      	at com.atlassian.beehive.hazelcast5.HazelcastClusterLock.lock(HazelcastClusterLock.java:38)
      	at com.atlassian.stash.internal.concurrent.DefaultMonitoredClusterLockService$MonitoredClusterLock.lambda$lock$0(DefaultMonitoredClusterLockService.java:96)
      	at com.atlassian.stash.internal.concurrent.DefaultMonitoredClusterLockService$MonitoredClusterLock$$Lambda$2972/0x00007fb99567bd90.perform(Unknown Source)
      	at com.atlassian.stash.internal.concurrent.DefaultMonitoredClusterLockService$MonitoredClusterLock.trackWaiting(DefaultMonitoredClusterLockService.java:158)
      	at com.atlassian.stash.internal.concurrent.DefaultMonitoredClusterLockService$MonitoredClusterLock.lock(DefaultMonitoredClusterLockService.java:95)
      	at com.atlassian.stash.internal.concurrent.ClusterMappedLock.withLock(ClusterMappedLock.java:47)
      	at com.atlassian.stash.internal.concurrent.ClusterPullRequestLock.withLock(ClusterPullRequestLock.java:25)
      	at com.atlassian.stash.internal.concurrent.ClusterPullRequestLock.withLock(ClusterPullRequestLock.java:30)
      	at com.atlassian.stash.internal.pull.HazelcastPullRequestStateGuard.processIfUpdated(HazelcastPullRequestStateGuard.java:40)
      	at com.atlassian.stash.internal.pull.comment.DelegatingCommentUpdateProcessor.maybeProcess(DelegatingCommentUpdateProcessor.java:23)
      	at jdk.internal.reflect.GeneratedMethodAccessor1111.invoke(Unknown Source)
      <trimmed>
      

      Feature request

      Add improvements/handling to the methods that perform pull request locking so that they do not block indefinitely and lead to DB pool exhaustion (e.g. implement lock acquisition timeouts). A rough sketch of the timeout idea follows.
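      As a sketch of the timeout idea only (the class, timeout value and exception handling below are assumptions, not a proposed implementation), lock acquisition could be bounded via tryLock so that a stuck cluster lock releases the thread, and the DB connection held by its transaction, instead of pinning them forever:

      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.locks.Lock;

      // Hypothetical wrapper that bounds how long a thread may wait for the
      // cluster lock instead of blocking forever.
      public class BoundedPullRequestLock {

          private static final long LOCK_TIMEOUT_SECONDS = 30; // assumed value

          private final Lock clusterLock;

          public BoundedPullRequestLock(Lock clusterLock) {
              this.clusterLock = clusterLock;
          }

          public void withLock(Runnable action) {
              boolean acquired;
              try {
                  // Give up after a bounded wait so the thread (and the DB
                  // connection held by its transaction) is not pinned forever.
                  acquired = clusterLock.tryLock(LOCK_TIMEOUT_SECONDS, TimeUnit.SECONDS);
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  throw new IllegalStateException("Interrupted while waiting for the pull request cluster lock", e);
              }
              if (!acquired) {
                  throw new IllegalStateException("Timed out waiting for the pull request cluster lock");
              }
              try {
                  action.run();
              } finally {
                  clusterLock.unlock();
              }
          }
      }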

            Assignee: Unassigned
            Reporter: JP Mariano
            Votes: 2
            Watchers: 3
