-
Bug
-
Resolution: Fixed
-
High
-
7.21.0, 7.21.8, 8.3.4
-
8
-
Severity 3 - Minor
-
6
-
Issue Summary
The DIY backup process is unsuccessful in forcefully terminating the database connections initiated by third-party plugins or event native Bitbucket Threads, ultimately resulting in the failure of the backup itself.
This is reproducible on Data Center: YES
Steps to Reproduce
- Run Bitbucket DC 7.21.x
- Install any third-party plugin that consumes a database connection(Can also happen if some Native Bitbucket Threads are holding up the DB conenctions)
- Trigger a request that runs longer than the duration specified in backup.drain.database.timeout + backup.drain.db.force.timeout
- Start the DIY backup with the below parameters
backup.drain.database.timeout=60 backup.drain.db.force.timeout=30
Expected Results
When the backup.drain.db.force.timeout is reached, it is expected to forcefully drain the open database connections and start the backup.
Actual Results
The below exception is thrown in the atlassian-bitbucket.log file and the backup itself fails.
2023-08-12 02:01:31,269 INFO [hz.hazelcast.cached.thread-10] c.a.s.i.h.ExtendedHikariDataSource Forcibly draining the database pool 2023-08-12 02:01:31,301 WARN [http-nio-7990-exec-29 url: /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge; user: test] test *7HTBLIx120x9428454x5 r2g5xa 172.27.42.98,127.0.0.1 "GET /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge HTTP/1.1" c.a.s.i.h.r.DefaultRepositoryHookService [test/test[43221]] Error calling com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate (com.onresolve.stash.groovy.groovyrunner:scriptedMergeCheck) java.lang.reflect.UndeclaredThrowableException: null at com.sun.proxy.$Proxy2819.getWrappedObject(Unknown Source) at com.atlassian.stash.internal.hibernate.DelegatingSwappableSessionFactory.getWrappedObject(DelegatingSwappableSessionFactory.java:442) at com.atlassian.stash.internal.hibernate.DelegatingSwappableSessionFactory.getWrappedObject(DelegatingSwappableSessionFactory.java:58) at org.springframework.transaction.support.TransactionSynchronizationUtils.unwrapResourceIfNecessary(TransactionSynchronizationUtils.java:68) at org.springframework.transaction.support.TransactionSynchronizationManager.getResource(TransactionSynchronizationManager.java:134) at org.springframework.orm.hibernate5.HibernateTransactionManager.doGetTransaction(HibernateTransactionManager.java:423) at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:347) at com.atlassian.sal.core.transaction.HostContextTransactionTemplate.execute(HostContextTransactionTemplate.java:18) at com.atlassian.plugin.util.ContextClassLoaderSettingInvocationHandler.invoke(ContextClassLoaderSettingInvocationHandler.java:26) at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.doInvoke(ServiceInvoker.java:56) at org.eclipse.gemini.blueprint.service.importer.support.internal.aop.ServiceInvoker.invoke(ServiceInvoker.java:60) at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invokeUnprivileged(ServiceTCCLInterceptor.java:70) at org.eclipse.gemini.blueprint.service.util.internal.aop.ServiceTCCLInterceptor.invoke(ServiceTCCLInterceptor.java:53) at org.eclipse.gemini.blueprint.service.importer.support.LocalBundleContextAdvice.invoke(LocalBundleContextAdvice.java:57) at com.atlassian.activeobjects.internal.SalTransactionManager.inTransaction(SalTransactionManager.java:42) at com.atlassian.activeobjects.internal.AbstractLoggingTransactionManager.doInTransaction(AbstractLoggingTransactionManager.java:26) at com.atlassian.activeobjects.internal.EntityManagedActiveObjects.executeInTransaction(EntityManagedActiveObjects.java:213) at com.atlassian.activeobjects.osgi.TenantAwareActiveObjects.executeInTransaction(TenantAwareActiveObjects.java:342) at com.atlassian.activeobjects.external.ActiveObjects$executeInTransaction.call(Unknown Source) at com.onresolve.scriptrunner.runner.util.AOPropertyPersister.loadString(AOPropertyPersister.groovy:113) at com.onresolve.scriptrunner.runner.util.AOPropertyPersister$loadString.call(Unknown Source) at com.onresolve.scriptrunner.runner.util.AOPersistentStorage.load(AOPersistentStorage.groovy:22) at com.onresolve.scriptrunner.persistence.PersistentStorage$load.call(Unknown Source) at com.onresolve.scriptrunner.runner.rest.common.CollectionPersistentStorage.load(CollectionPersistentStorage.groovy:29) at com.onresolve.scriptrunner.bitbucket.service.AbstractBitbucketConfigurationService.load(AbstractBitbucketConfigurationService.groovy:90) at com.onresolve.scriptrunner.ConfigurationService$load.call(Unknown Source) at com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate(ScriptRunnerMergeCheck.groovy:37) at com.onresolve.scriptrunner.bitbucket.ScriptRunnerMergeCheck.preUpdate(ScriptRunnerMergeCheck.groovy) at com.atlassian.stash.internal.hook.repository.DefaultRepositoryHookService.preUpdate(DefaultRepositoryHookService.java:771) at com.atlassian.stash.internal.hook.repository.DefaultRepositoryHookService.lambda$preUpdate$4(DefaultRepositoryHookService.java:383) ............ at com.atlassian.troubleshooting.thready.filter.AbstractThreadNamingFilter.doFilter(AbstractThreadNamingFilter.java:46) at com.atlassian.stash.internal.spring.lifecycle.LifecycleJohnsonServletFilterModuleContainerFilter.doFilter(LifecycleJohnsonServletFilterModuleContainerFilter.java:42) at com.atlassian.stash.internal.web.auth.BeforeLoginPluginAuthenticationFilter.doBeforeBeforeLoginFilters(BeforeLoginPluginAuthenticationFilter.java:90) at com.atlassian.stash.internal.web.auth.BeforeLoginPluginAuthenticationFilter.doFilter(BeforeLoginPluginAuthenticationFilter.java:73) at com.atlassian.stash.internal.request.DefaultRequestManager.doAsRequest(DefaultRequestManager.java:84) at com.atlassian.stash.internal.hazelcast.ConfigurableWebFilter.doFilter(ConfigurableWebFilter.java:38) at java.lang.Thread.run(Thread.java:750) ... 396 frames trimmed Caused by: java.lang.InterruptedException: null at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at com.atlassian.stash.internal.util.LatchedInvocationHandler.invoke(LatchedInvocationHandler.java:46) ... 77 common frames omitted 2023-08-12 02:02:31,303 INFO [hz.hazelcast.cached.thread-10] c.a.s.i.h.ExtendedHikariDataSource Forcibly closing database connection on thread [http-nio-7990-exec-29 url: /rest/api/1.0/projects/test/repos/test/pull-requests/1/merge; user: testuser] 2023-08-12 02:02:31,304 INFO [hz.hazelcast.cached.thread-10] c.a.s.i.h.ExtendedHikariDataSource 1 connections still leased. Forced draining has failed 2023-08-12 02:02:31,304 WARN [threadpool:thread-25] stash_backup *7HTBLIx120x9428460x4 172.27.45.51 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.LatchAndDrainDatabaseBackupStep The DataSource could not be drained. Aborting... 2023-08-12 02:02:31,312 WARN [threadpool:thread-25] stash_backup *7HTBLIx120x9428460x4 172.27.45.51 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.DefaultMaintenanceTaskMonitor BACKUP maintenance has failed (Cause: BackupException: A backup file could not be created.) com.atlassian.stash.internal.backup.BackupException: A backup file could not be created. at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:75) at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:135) at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:71) at com.atlassian.stash.internal.maintenance.MaintenanceModePhase.run(MaintenanceModePhase.java:27) at com.atlassian.stash.internal.maintenance.backup.AbstractBackupTask.run(AbstractBackupTask.java:85) at com.atlassian.stash.internal.maintenance.DefaultMaintenanceTaskMonitor.run(DefaultMaintenanceTaskMonitor.java:212) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at com.atlassian.stash.internal.concurrent.StateTransferringRunnable.run(StateTransferringRunnable.java:50) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.lang.Thread.run(Thread.java:750) ... 1 frame trimmed Caused by: com.atlassian.stash.internal.migration.DatabaseMigrationException: One or more database connections were not closed within the allotted timeout. To prevent corruption due to inconsistent state, the migration has been aborted. Please try your migration again when the system is under less load.
Workaround
As a workaround for the issue, you can prolong the timeout for backup.drain.db.force.timeout. By doing so, ongoing requests will be given ample time to finish before the forceful draining of database connections is triggered. This should help mitigate the issue until a permanent solution is implemented.