Uploaded image for project: 'Bitbucket Data Center'
  1. Bitbucket Data Center
  2. BSERV-14374

DIY backups fail while attempting to force-drain database connections.

XMLWordPrintable

      Issue Summary

      The DIY backup process is unsuccessful in forcefully terminating the database connections initiated by third-party plugins or event native Bitbucket Threads, ultimately resulting in the failure of the backup itself.

      This is reproducible on Data Center: YES

      Steps to Reproduce

      1. Run Bitbucket DC 7.21.x
      2. Install any third-party plugin that consumes a database connection(Can also happen if some Native Bitbucket Threads are holding up the DB conenctions)
      3. Trigger a request that runs longer than the duration specified in backup.drain.database.timeout + backup.drain.db.force.timeout
      4. Start the DIY backup with the below parameters
        backup.drain.database.timeout=60
        backup.drain.db.force.timeout=30
        

      Expected Results

      When the backup.drain.db.force.timeout is reached, it is expected to forcefully drain the open database connections and start the backup.

      Actual Results

      The below exception is thrown in the atlassian-bitbucket.log file and the backup itself fails.

      2023-08-12 02:01:31,269 INFO  [hz.hazelcast.cached.thread-10]  c.a.s.i.h.ExtendedHikariDataSource Forcibly draining the database pool
      2023-08-12 02:01:31,301 WARN  [http-nio-7990-exec-29 url: /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge; user: test] test *7HTBLIx120x9428454x5 r2g5xa 172.27.42.98,127.0.0.1 "GET /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge HTTP/1.1" c.a.s.i.h.r.DefaultRepositoryHookService [test/test[43221]] Error calling com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate (com.onresolve.stash.groovy.groovyrunner:scriptedMergeCheck)
      java.lang.reflect.UndeclaredThrowableException: null
      	at com.sun.proxy.$Proxy2819.getWrappedObject(Unknown Source)
      	at com.atlassian.stash.internal.hibernate.DelegatingSwappableSessionFactory.getWrappedObject(DelegatingSwappableSessionFactory.java:442)
      	at com.atlassian.stash.internal.hibernate.DelegatingSwappableSessionFactory.getWrappedObject(DelegatingSwappableSessionFactory.java:58)
      	at org.springframework.transaction.support.TransactionSynchronizationUtils.unwrapResourceIfNecessary(TransactionSynchronizationUtils.java:68)
      	at org.springframework.transaction.support.TransactionSynchronizationManager.getResource(TransactionSynchronizationManager.java:134)
      	at org.springframework.orm.hibernate5.HibernateTransactionManager.doGetTransaction(HibernateTransactionManager.java:423)
      	at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:347)
      	at com.atlassian.sal.core.transaction.HostContextTransactionTemplate.execute(HostContextTransactionTemplate.java:18)
      	at com.atlassian.plugin.util.ContextClassLoaderSettingInvocationHandler.invoke(ContextClassLoaderSettingInvocationHandler.java:26)
      	at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56)
      	at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.invoke(ServiceInvoker.java:60)
      	at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invokeUnprivileged(ServiceTCCLInterceptor.java:70)
      	at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invoke(ServiceTCCLInterceptor.java:53)
      	at org.eclipse.gemini.blueprint.service.importer.support.LocalBundleContextAdvice.invoke(LocalBundleContextAdvice.java:57)
      	at com.atlassian.activeobjects.internal.SalTransactionManager.inTransaction(SalTransactionManager.java:42)
      	at com.atlassian.activeobjects.internal.AbstractLoggingTransactionManager.doInTransaction(AbstractLoggingTransactionManager.java:26)
      	at com.atlassian.activeobjects.internal.EntityManagedActiveObjects.executeInTransaction(EntityManagedActiveObjects.java:213)
      	at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects.executeInTransaction(TenantAwareActiveObjects.java:342)
      	at com.atlassian.activeobjects.external.ActiveObjects$executeInTransaction.call(Unknown Source)
      	at com.onresolve.scriptrunner.runner.util.AOPropertyPersister.loadString(AOPropertyPersister.groovy:113)
      	at com.onresolve.scriptrunner.runner.util.AOPropertyPersister$loadString.call(Unknown Source)
      	at com.onresolve.scriptrunner.runner.util.AOPersistentStorage.load(AOPersistentStorage.groovy:22)
      	at com.onresolve.scriptrunner.persistence.PersistentStorage$load.call(Unknown Source)
      	at com.onresolve.scriptrunner.runner.rest.common.CollectionPersistentStorage.load(CollectionPersistentStorage.groovy:29)
      	at com.onresolve.scriptrunner.bitbucket.service.AbstractBitbucketConfigurationService.load(AbstractBitbucketConfigurationService.groovy:90)
      	at com.onresolve.scriptrunner.ConfigurationService$load.call(Unknown Source)
      	at com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate(ScriptRunnerMergeCheck.groovy:37)
      	at com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate(ScriptRunnerMergeCheck.groovy)
      	at com.atlassian.stash.internal.hook.repository.DefaultRepositoryHookService.preUpdate(DefaultRepositoryHookService.java:771)
      	at com.atlassian.stash.internal.hook.repository.DefaultRepositoryHookService.lambda$preUpdate$4(DefaultRepositoryHookService.java:383)
      ............
      	at com.atlassian.troubleshooting.thready.filter.AbstractThreadNamingFilter.doFilter(AbstractThreadNamingFilter.java:46)
      	at com.atlassian.stash.internal.spring.lifecycle.LifecycleJohnsonServletFilterModuleContainerFilter.doFilter(LifecycleJohnsonServletFilterModuleContainerFilter.java:42)
      	at com.atlassian.stash.internal.web.auth.BeforeLoginPluginAuthenticationFilter.doBeforeBeforeLoginFilters(BeforeLoginPluginAuthenticationFilter.java:90)
      	at com.atlassian.stash.internal.web.auth.BeforeLoginPluginAuthenticationFilter.doFilter(BeforeLoginPluginAuthenticationFilter.java:73)
      	at com.atlassian.stash.internal.request.DefaultRequestManager.doAsRequest(DefaultRequestManager.java:84)
      	at com.atlassian.stash.internal.hazelcast.ConfigurableWebFilter.doFilter(ConfigurableWebFilter.java:38)
      	at java.lang.Thread.run(Thread.java:750)
      	... 396 frames trimmed
      Caused by: java.lang.InterruptedException: null
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
      	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
      	at com.atlassian.stash.internal.util.LatchedInvocationHandler.invoke(LatchedInvocationHandler.java:46)
      	... 77 common frames omitted
      
      
      2023-08-12 02:02:31,303 INFO  [hz.hazelcast.cached.thread-10]  c.a.s.i.h.ExtendedHikariDataSource Forcibly closing database connection on thread [http-nio-7990-exec-29 url: /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge; user: testuser]
      2023-08-12 02:02:31,304 INFO  [hz.hazelcast.cached.thread-10]  c.a.s.i.h.ExtendedHikariDataSource 1 connections still leased. Forced draining has failed
      2023-08-12 02:02:31,304 WARN  [threadpool:thread-25] stash_backup *7HTBLIx120x9428460x4 172.27.45.51 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.LatchAndDrainDatabaseBackupStep The DataSource could not be drained. Aborting...
      2023-08-12 02:02:31,312 WARN  [threadpool:thread-25] stash_backup *7HTBLIx120x9428460x4 172.27.45.51 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.DefaultMaintenanceTaskMonitor BACKUP maintenance has failed (Cause: BackupException: A backup file could not be created.)
      com.atlassian.stash.internal.backup.BackupException: A backup file could not be created.
      	at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:75)
      	at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:135)
      	at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:71)
      	at com.atlassian.stash.internal.maintenance.MaintenanceModePhase.run(MaintenanceModePhase.java:27)
      	at com.atlassian.stash.internal.maintenance.backup.AbstractBackupTask.run(AbstractBackupTask.java:85)
      	at com.atlassian.stash.internal.maintenance.DefaultMaintenanceTaskMonitor.run(DefaultMaintenanceTaskMonitor.java:212)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at com.atlassian.stash.internal.concurrent.StateTransferringRunnable.run(StateTransferringRunnable.java:50)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.lang.Thread.run(Thread.java:750)
      	... 1 frame trimmed
      Caused by: com.atlassian.stash.internal.migration.DatabaseMigrationException: One or more database connections were not closed within the allotted timeout. To prevent corruption due to inconsistent state, the migration has been aborted. Please try your migration again when the system is under less load.
      

      Workaround

      As a workaround for the issue, you can prolong the timeout for backup.drain.db.force.timeout. By doing so, ongoing requests will be given ample time to finish before the forceful draining of database connections is triggered. This should help mitigate the issue until a permanent solution is implemented.

              mgoyal2@atlassian.com Manish
              5338c0aa34f9 Danny Samuel
              Votes:
              4 Vote for this issue
              Watchers:
              16 Start watching this issue

                Created:
                Updated:
                Resolved: