  1. Confluence Server
  2. CONFSERVER-39825

Quartz scheduled jobs can stop running until restart

    Details

    • Testing Notes:
      • Start up a new Confluence 5.9.x instance running against Postgres.
      • Create a page with a unique title and verify that it appears when searching for that title
      • Shut down Postgres (run ps -A | grep postgres to make sure all the subprocesses are dead)
      • Wait 30 seconds so that the indexing job is triggered while the database is down
      • Start Postgres up again
      • Create another page with a unique title and verify that it appears when searching for that title

      Before this fix, the Postgres shutdown would cause the indexing job to never run again, so searching for the new page would return no results. If the second search finds the new page, the fix is working.


      Description

      A database request is performed before each scheduled job runs. If this request fails for any reason (e.g. a connection timeout, not enough DB connections in the pool, or a table missing from the DB), the scheduled job is prevented from running again until the instance is restarted.

      This is the relevant stack trace:

      org.quartz.SchedulerException: JobListener 'ScheduledJobListener' threw exception: Could not open Hibernate Session for transaction; nested exception is net.sf.hibernate.exception.GenericJDBCException: Cannot open connection [See nested exception: org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is net.sf.hibernate.exception.GenericJDBCException: Cannot open connection]
      	at org.quartz.core.QuartzScheduler.notifyJobListenersToBeExecuted(QuartzScheduler.java:1951)
      	at org.quartz.core.JobRunShell.notifyListenersBeginning(JobRunShell.java:364)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:190)
      	at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool.lambda$runInThread$187(ConfluenceQuartzThreadPool.java:19)
      	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
      Caused by: org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is net.sf.hibernate.exception.GenericJDBCException: Cannot open connection
      	at org.springframework.orm.hibernate.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:473)
      	at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:374)
      	at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
      	at com.sun.proxy.$Proxy33.getTransaction(Unknown Source)
      	at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:263)
      	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:101)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
      	at com.sun.proxy.$Proxy142.jobToBeExecuted(Unknown Source)
      	at org.quartz.core.QuartzScheduler.notifyJobListenersToBeExecuted(QuartzScheduler.java:1949)
      	... 4 more
      Caused by: net.sf.hibernate.exception.GenericJDBCException: Cannot open connection
      	at net.sf.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:80)
      	at net.sf.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:69)
      	at net.sf.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:29)
      	at net.sf.hibernate.impl.BatcherImpl.convert(BatcherImpl.java:328)
      	at net.sf.hibernate.impl.BatcherImpl.openConnection(BatcherImpl.java:295)
      	at net.sf.hibernate.impl.SessionImpl.connection(SessionImpl.java:3367)
      	at org.springframework.orm.hibernate.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:422)
      	... 17 more
      Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
      	at org.apache.tomcat.dbcp.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:126)
      	at org.apache.tomcat.dbcp.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1412)
      	at net.sf.hibernate.connection.DatasourceConnectionProvider.getConnection(DatasourceConnectionProvider.java:59)
      	at net.sf.hibernate.impl.BatcherImpl.openConnection(BatcherImpl.java:292)
      	... 20 more
      	at org.apache.tomcat.dbcp.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:446)
      	at org.apache.tomcat.dbcp.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
      	at org.apache.tomcat.dbcp.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:118)
      	... 23 more
      

      Two job listeners are called when execution is starting:

      1. ExecutingJobsManager keeps a list of jobs which are executing
      2. ScheduledJobsListener saves execution details to database

      If ScheduledJobsListener throws an exception, the execution is cancelled, but ExecutingJobsManager is never notified of the cancellation and keeps treating the job as executing. All further executions are then cancelled as well, because AbstractJob#executeInternal(..) checks ExecutingJobsManager to see whether a similar (but not the same) execution is already in progress.
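The failure mode described above can be illustrated with a small, self-contained sketch. It does not use Quartz or Confluence classes: `Listener`, `ExecutingTracker`, `PersistingListener`, `Scheduler`, and the job key `indexQueueFlush` are hypothetical stand-ins chosen for illustration, and the real implementation may differ in detail.

```java
import java.util.HashSet;
import java.util.Set;

class StuckJobSketch {

    /** One attempted run of a job; compared by identity, grouped by jobKey. */
    static final class Execution {
        final String jobKey;
        Execution(String jobKey) { this.jobKey = jobKey; }
    }

    /** Hypothetical stand-in for a job listener (not the Quartz interface). */
    interface Listener {
        void jobToBeExecuted(Execution e);
        void jobWasExecuted(Execution e);
    }

    /** Plays the role described for ExecutingJobsManager: tracks running executions. */
    static final class ExecutingTracker implements Listener {
        private final Set<Execution> executing = new HashSet<>();
        public void jobToBeExecuted(Execution e) { executing.add(e); }
        public void jobWasExecuted(Execution e) { executing.remove(e); }

        /** True if a different execution of the same job is still recorded. */
        boolean hasOtherExecution(Execution current) {
            return executing.stream()
                    .anyMatch(e -> e != current && e.jobKey.equals(current.jobKey));
        }
    }

    /** Plays the role of the listener that saves execution details to the database. */
    static final class PersistingListener implements Listener {
        boolean databaseUp = true;
        public void jobToBeExecuted(Execution e) {
            if (!databaseUp) throw new IllegalStateException("Cannot open connection");
        }
        public void jobWasExecuted(Execution e) { }
    }

    static final class Scheduler {
        final ExecutingTracker tracker = new ExecutingTracker();
        final PersistingListener persister = new PersistingListener();

        /** Returns true if the job body actually ran. */
        boolean trigger(String jobKey) {
            Execution current = new Execution(jobKey);
            try {
                tracker.jobToBeExecuted(current);   // recorded as executing
                persister.jobToBeExecuted(current); // may fail if the DB is down
            } catch (RuntimeException ex) {
                // Run is cancelled here, and the tracker is never told about it.
                return false;
            }
            boolean ran = !tracker.hasOtherExecution(current); // guard in the job itself
            tracker.jobWasExecuted(current);
            persister.jobWasExecuted(current);
            return ran;
        }
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler();
        System.out.println(scheduler.trigger("indexQueueFlush")); // DB up
        scheduler.persister.databaseUp = false;
        System.out.println(scheduler.trigger("indexQueueFlush")); // DB down
        scheduler.persister.databaseUp = true;
        System.out.println(scheduler.trigger("indexQueueFlush")); // DB back up
    }
}
```

In this sketch the third trigger is still skipped even though the database is available again, because the cancelled execution from the second trigger is never removed from the tracker. Only discarding the tracker's state, which in the real system corresponds to a restart, clears it.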

      Workaround for Confluence 5.8

      For users of Confluence 5.8.18, we are providing a patch file containing a fix for this issue. Please be aware that this patch has not been fully tested and is applied at your own risk. We strongly recommend that you test the patch on a staging instance of Confluence before applying it to a production instance. Refer to https://confluence.atlassian.com/doc/installing-patched-class-files-166528493.html for details on how to apply the patch. Also, make sure that MaxWaitMillis is set to 10000 so that the connection pool has some time to recover.
