Uploaded image for project: 'JIRA (including JIRA Core)'
  1. JIRA (including JIRA Core)
  2. JRA-15731

Investigate issues with MySQL database connections involving autoReconnect=true

    Details

    • Type: Suggestion
    • Status: Closed (View Workflow)
    • Resolution: Fixed
    • Affects Version/s: 3.13
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      JIRA 3.13, MySQL Database
    • Feedback Policy:

      JIRA feedback is collected from a number of different sources and is evaluated when planning the product roadmap. If you would like to know more about how JIRA Product Management uses customer input during the planning process, please see our post on Atlassian Answers.

      Description

      Since 3.13 we have seen an increase in cases with the following error message:

      The last packet sent successfully to the server was XXXXX seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.

      In each case the autoReconnect parameter has been set to true, as per the documentation

      Note that the MySQL documentation discourage using the autoReconnect parameter

      The use of this feature is not recommended, because it has side effects related to session state and data consistency when applications don't handle SQLExceptions properly, and is only designed to be used when you are unable to configure your application to handle SQLExceptions resulting from dead and stale connections properly. Alternatively, investigate setting the MySQL server variable "wait_timeout" to some high value rather than the default of 8 hours.

      Need to investigate the possible cause of the problems and potential workarounds / fixes.

      Summary of Findings:

      • The MySQL "autoReconnect" parameter is useless and can be removed
      • The best (simplest) way to make JIRA able to survive MySQL connection timeouts is to set the validationQuery parameter.
        See Surviving Connection Closures for details.
      • The reason that JIRA used to survive these closures prior to v3.13, was that the DB pool we used was accidentally resetting closed connections.

        Attachments

          Activity

          Hide
          andrew.myers Andrew Myers [Atlassian] added a comment -

          I've added all the related cases in a comment. Let me know if you need any more info

          Show
          andrew.myers Andrew Myers [Atlassian] added a comment - I've added all the related cases in a comment. Let me know if you need any more info
          Hide
          dalonso Diego Alonso [Atlassian] added a comment - - edited

          Could it be related to this? http://stackoverflow.com/questions/15949/javatomcat-dying-databse-connection
          The version of naming-factory-dbcp.jar in common/lib is different in Tomcat 5.5.20 and 5.5.26, so the problem may be related by different connection pooling mechanism in the different versions of Tomcat.

          Show
          dalonso Diego Alonso [Atlassian] added a comment - - edited Could it be related to this? http://stackoverflow.com/questions/15949/javatomcat-dying-databse-connection The version of naming-factory-dbcp.jar in common/lib is different in Tomcat 5.5.20 and 5.5.26, so the problem may be related by different connection pooling mechanism in the different versions of Tomcat.
          Hide
          dalonso Diego Alonso [Atlassian] added a comment -

          If it is the DBCP itself that is causing the problem:

          Tomcat 5.5.20 uses commons-dbcp.version=1.2.1
          Tomcat 5.5.26 uses commons-dbcp.version=1.2.2

          Show
          dalonso Diego Alonso [Atlassian] added a comment - If it is the DBCP itself that is causing the problem: Tomcat 5.5.20 uses commons-dbcp.version=1.2.1 Tomcat 5.5.26 uses commons-dbcp.version=1.2.2
          Hide
          dalonso Diego Alonso [Atlassian] added a comment - - edited

          Tomcat 6.0 introduced DBCP 1.2.2 in Tomcat 6.0.13, released on 14-May-2007. Previous to that, up to version 6.0.10, they were using 1.2.1

          Show
          dalonso Diego Alonso [Atlassian] added a comment - - edited Tomcat 6.0 introduced DBCP 1.2.2 in Tomcat 6.0.13, released on 14-May-2007. Previous to that, up to version 6.0.10, they were using 1.2.1
          Hide
          idaniel Ian Daniel [Atlassian] added a comment - - edited

          Smoking gun?: https://issues.apache.org/jira/browse/DBCP-244 ("Connection socket hangs sporadically in DBCP 1.2.2 but not 1.2.1") Same symptoms as we are getting: com.mysql.jdbc.CommunicationsException.

          We need a reliable failure test case. They haven't managed to come up with one yet on that issue.

          We also need to investigate whether it is related to timeBetweenEvictionRunsMillis, or the other parameters mentioned in http://stackoverflow.com/questions/15949/javatomcat-dying-databse-connection.

          Show
          idaniel Ian Daniel [Atlassian] added a comment - - edited Smoking gun?: https://issues.apache.org/jira/browse/DBCP-244 ("Connection socket hangs sporadically in DBCP 1.2.2 but not 1.2.1") Same symptoms as we are getting: com.mysql.jdbc.CommunicationsException. We need a reliable failure test case. They haven't managed to come up with one yet on that issue. We also need to investigate whether it is related to timeBetweenEvictionRunsMillis , or the other parameters mentioned in http://stackoverflow.com/questions/15949/javatomcat-dying-databse-connection .
          Hide
          idaniel Ian Daniel [Atlassian] added a comment - - edited

          From JSP-28677:

          com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was198029 seconds ago.
          The last packet sent successfully to the server was 198029 seconds ago, which  is longer than the server configured value of 'wait_timeout'.
          You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values
           for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
          	at sun.reflect.GeneratedConstructorAccessor133.newInstance(Unknown Source)
          	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
          	at java.lang.reflect.Constructor.newInstance(Unknown Source)
          	at com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
          	at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1074)
          	at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3246)
          	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1917)
          	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2060)
          	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2542)
          	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1734)
          	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2019)
          	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1937)
          	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1922)
          	at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102)
          	at org.ofbiz.core.entity.jdbc.SQLProcessor.executeUpdate(SQLProcessor.java:373)
          	at org.ofbiz.core.entity.GenericDAO.deleteByAnd(GenericDAO.java:1253)
          	at org.ofbiz.core.entity.GenericDAO.deleteByAnd(GenericDAO.java:1213)
          	at org.ofbiz.core.entity.GenericHelperDAO.removeByAnd(GenericHelperDAO.java:196)
          	at org.ofbiz.core.entity.GenericDelegator.removeByAnd(GenericDelegator.java:1107)
          	at org.ofbiz.core.entity.GenericDelegator.removeRelated(GenericDelegator.java:1351)
          	at org.ofbiz.core.entity.GenericDelegator.removeRelated(GenericDelegator.java:1326)
          	at org.ofbiz.core.entity.GenericValue.removeRelated(GenericValue.java:315)
          	at com.atlassian.jira.action.issue.IssueDelete.doExecute(IssueDelete.java:104)
          	at webwork.action.ActionSupport.execute(ActionSupport.java:153)
          	at com.atlassian.jira.action.JiraActionSupport.execute(JiraActionSupport.java:54)
          	at com.atlassian.jira.action.JiraNonWebActionSupport.execute(JiraNonWebActionSupport.java:27)
          	at com.atlassian.jira.rpc.soap.service.IssueServiceImpl.deleteIssue(IssueServiceImpl.java:456)
          	at com.atlassian.jira.rpc.soap.JiraSoapServiceImpl.deleteIssue(JiraSoapServiceImpl.java:373)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
          	at java.lang.reflect.Method.invoke(Unknown Source)
          	at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
          	at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
          	at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
          	at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
          	at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
          	at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
          	at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:453)
          	at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
          	at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
          	at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
          	at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
          	at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
          	at com.atlassian.jira.web.servlet.rpc.LazyAxisDecoratorServlet.service(LazyAxisDecoratorServlet.java:55)
          	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
          
                  ....
          
          	at com.atlassian.gzipfilter.GzipFilter.doFilter(GzipFilter.java:79)
          	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
          	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
          	at com.atlassian.core.filters.AbstractEncodingFilter.doFilter(AbstractEncodingFilter.java:33)
          	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
          	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
          	at com.atlassian.jira.appconsistency.db.DatabaseCompatibilityEnforcerFilter.doFilter(DatabaseCompatibilityEnforcerFilter.java:39)
          	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
          	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
          	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
          	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
          	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
          	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
          	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
          	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
          	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
          	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
          	at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
          	at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
          	at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
          	at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
          	at java.lang.Thread.run(Unknown Source)
          Caused by: java.net.SocketException: Broken pipe
          	at java.net.SocketOutputStream.socketWrite0(Native Method)
          	at java.net.SocketOutputStream.socketWrite(Unknown Source)
          	at java.net.SocketOutputStream.write(Unknown Source)
          	at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
          	at java.io.BufferedOutputStream.flush(Unknown Source)
          	at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3227)
          	... 91 more
          

          So this one does contain a

          Show
          idaniel Ian Daniel [Atlassian] added a comment - - edited From JSP-28677: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was198029 seconds ago. The last packet sent successfully to the server was 198029 seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem. at sun.reflect.GeneratedConstructorAccessor133.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at com.mysql.jdbc.Util.handleNewInstance(Util.java:406) at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1074) at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3246) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1917) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2060) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2542) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1734) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2019) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1937) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1922) at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:102) at org.ofbiz.core.entity.jdbc.SQLProcessor.executeUpdate(SQLProcessor.java:373) at org.ofbiz.core.entity.GenericDAO.deleteByAnd(GenericDAO.java:1253) at org.ofbiz.core.entity.GenericDAO.deleteByAnd(GenericDAO.java:1213) at org.ofbiz.core.entity.GenericHelperDAO.removeByAnd(GenericHelperDAO.java:196) at org.ofbiz.core.entity.GenericDelegator.removeByAnd(GenericDelegator.java:1107) at org.ofbiz.core.entity.GenericDelegator.removeRelated(GenericDelegator.java:1351) at org.ofbiz.core.entity.GenericDelegator.removeRelated(GenericDelegator.java:1326) at org.ofbiz.core.entity.GenericValue.removeRelated(GenericValue.java:315) at com.atlassian.jira.action.issue.IssueDelete.doExecute(IssueDelete.java:104) at webwork.action.ActionSupport.execute(ActionSupport.java:153) at com.atlassian.jira.action.JiraActionSupport.execute(JiraActionSupport.java:54) at com.atlassian.jira.action.JiraNonWebActionSupport.execute(JiraNonWebActionSupport.java:27) at com.atlassian.jira.rpc.soap.service.IssueServiceImpl.deleteIssue(IssueServiceImpl.java:456) at com.atlassian.jira.rpc.soap.JiraSoapServiceImpl.deleteIssue(JiraSoapServiceImpl.java:373) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397) at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186) at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323) at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32) at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118) at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83) at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:453) at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281) at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at com.atlassian.jira.web.servlet.rpc.LazyAxisDecoratorServlet.service(LazyAxisDecoratorServlet.java:55) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269) .... at com.atlassian.gzipfilter.GzipFilter.doFilter(GzipFilter.java:79) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.atlassian.core.filters.AbstractEncodingFilter.doFilter(AbstractEncodingFilter.java:33) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.atlassian.jira.appconsistency.db.DatabaseCompatibilityEnforcerFilter.doFilter(DatabaseCompatibilityEnforcerFilter.java:39) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874) at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) at java.lang.Thread.run(Unknown Source) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at java.io.BufferedOutputStream.flushBuffer(Unknown Source) at java.io.BufferedOutputStream.flush(Unknown Source) at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3227) ... 91 more So this one does contain a
          Hide
          mlassau Mark Lassau [Atlassian] added a comment -

          By setting the MySQL connection timeout to 30 seconds, I find the following:

          • I can consistently replicate this on Tomcat v5.5.26
          • I cannot replicate this on Tomcat v5.5.20
          Show
          mlassau Mark Lassau [Atlassian] added a comment - By setting the MySQL connection timeout to 30 seconds, I find the following: I can consistently replicate this on Tomcat v5.5.26 I cannot replicate this on Tomcat v5.5.20
          Hide
          mlassau Mark Lassau [Atlassian] added a comment - - edited

          Latest JIRA with Tomcat 5.5.26 is fixed by either of the following:

          We expect the new behaviour was introduced when Tomcat upgraded its connection pool (Apache DBCP). Investigating this now.

          Show
          mlassau Mark Lassau [Atlassian] added a comment - - edited Latest JIRA with Tomcat 5.5.26 is fixed by either of the following: Using the validationQuery parameter as explained at Surviving Connection Closures OR Using timeBetweenEvictionRunsMillis, numTestsPerEvictionRun, and minEvictableIdleTimeMillis with appropriate values. See http://commons.apache.org/dbcp/configuration.html We expect the new behaviour was introduced when Tomcat upgraded its connection pool (Apache DBCP). Investigating this now.
          Hide
          mlassau Mark Lassau [Atlassian] added a comment -

          According to file build.properties.default in Tomcat source control, DBCP was updated to version 1.2.2 in TOMCAT_5_5_24.
          Tomcat 5.5.24 is not available for download.
          5.5.25 exhibits the behaviour, and 5.5.23 doesn't.

          Also, on the v6 branch, DBCP was updated to version 1.2.2 in TOMCAT_6_0_11
          Tomcat 6.0.11 and 6.0.12 are not available for public download.
          6.0.13 exhibits the behaviour, and 6.0.10 doesn't.

          Show
          mlassau Mark Lassau [Atlassian] added a comment - According to file build.properties.default in Tomcat source control , DBCP was updated to version 1.2.2 in TOMCAT_5_5_24. Tomcat 5.5.24 is not available for download. 5.5.25 exhibits the behaviour, and 5.5.23 doesn't. Also, on the v6 branch, DBCP was updated to version 1.2.2 in TOMCAT_6_0_11 Tomcat 6.0.11 and 6.0.12 are not available for public download. 6.0.13 exhibits the behaviour, and 6.0.10 doesn't.
          Hide
          mlassau Mark Lassau [Atlassian] added a comment -

          Similar issue reported elsewhere: MySQL Forums java Ranch

          Show
          mlassau Mark Lassau [Atlassian] added a comment - Similar issue reported elsewhere: MySQL Forums java Ranch
          Hide
          alui Andrew Lui [Atlassian Technical Writer] added a comment -

          JIRA documentation has been updated to warn users of this issue.

          Show
          alui Andrew Lui [Atlassian Technical Writer] added a comment - JIRA documentation has been updated to warn users of this issue.
          Hide
          davidstrauss David Strauss added a comment -

          @Mark: Setting the validationQuery parameter did not fix this issue on our instance.

          Show
          davidstrauss David Strauss added a comment - @Mark: Setting the validationQuery parameter did not fix this issue on our instance.
          Hide
          andrew.myers Andrew Myers [Atlassian] added a comment -

          Hi David,

          Unfortunately Mark is unavailable today but did you also try this:

          If that doesn't help feel free to raise a support request and include as much information as possible such as jira logs and anything else relevant.

          Cheers,

          Andrew

          Show
          andrew.myers Andrew Myers [Atlassian] added a comment - Hi David, Unfortunately Mark is unavailable today but did you also try this: Using timeBetweenEvictionRunsMillis, numTestsPerEvictionRun, and minEvictableIdleTimeMillis with appropriate values. See http://commons.apache.org/dbcp/configuration.html If that doesn't help feel free to raise a support request and include as much information as possible such as jira logs and anything else relevant. Cheers, Andrew
          Hide
          idaniel Ian Daniel [Atlassian] added a comment - - edited

          Hi David,

          We need to understand why setting the validationQuery parameter did not fix the issue for you. Can you please attach your conf/server.xml (if you are running JIRA Standalone) or your jira.xml (if you are running JIRA EAR/WAR in Tomcat), so that we can check it.

          I understand that you already have a support case for this issue. Can you please attach the file to the support case.

          Also, did you restart your JIRA instance after setting the validationQuery? You need to.

          Please get back to us on this, as we are very anxious to ensure that we have a viable workaround for this problem.

          Kind regards,
          Ian


          Ian Daniel
          JIRA Support Lead
          ATLASSIAN - http://www.atlassian.com

          Show
          idaniel Ian Daniel [Atlassian] added a comment - - edited Hi David, We need to understand why setting the validationQuery parameter did not fix the issue for you. Can you please attach your conf/server.xml (if you are running JIRA Standalone) or your jira.xml (if you are running JIRA EAR/WAR in Tomcat), so that we can check it. I understand that you already have a support case for this issue. Can you please attach the file to the support case. Also, did you restart your JIRA instance after setting the validationQuery? You need to. Please get back to us on this, as we are very anxious to ensure that we have a viable workaround for this problem. Kind regards, Ian — Ian Daniel JIRA Support Lead ATLASSIAN - http://www.atlassian.com
          Hide
          dalonso Diego Alonso [Atlassian] added a comment - - edited

          In the previous case where the validation query did not solve the problem, the server.xml file was misconfigured, and the validation query wasn't used. It seems that in all support cases we have had, the setting up the validation query has been the solution to the problem.

          Show
          dalonso Diego Alonso [Atlassian] added a comment - - edited In the previous case where the validation query did not solve the problem, the server.xml file was misconfigured, and the validation query wasn't used. It seems that in all support cases we have had, the setting up the validation query has been the solution to the problem.
          Hide
          mlassau Mark Lassau [Atlassian] added a comment - - edited

          After investigating this issue, I need to first note that the behaviour of the MySQL "autoReconnect=true" setting does not actually behave the way we expected.
          A closer reading of the MySQL documentation referred to in the description above points out the following:

          If enabled the driver will throw an exception for a queries issued on a stale or dead connection, which belong to the current transaction, but will attempt reconnect before the next query issued on the connection in a new transaction. (sic)

          So, the first use of the connection after the server side has timed out the connection will throw an SQLException. Then, if you re-use the connection the JDBC driver will re-establish the connection. This behaviour can be verified with the code included below.

          However, this was not the behaviour we observed pre JIRA v3.13. The autoReconnect=true was a red herring.
          Something else was keeping our connections alive.

              public void testAutoReconnect() throws Exception
              {
                  // Create a DB connection.
                  Connection con = getConnection();
                  System.out.println("First read:");
                  doStuff(con);
          
                  Thread.sleep(40000);
                  System.out.println("---------------------------------");
                  System.out.println("Second Read:");
                  try
                  {
                      doStuff(con);
                  }
                  catch (SQLException ex)
                  {
                      System.out.println("SQLException: " + ex.getMessage());
                      // Expected timeout - we should autoreconnect on the next use of this connection.
                      System.out.println("---------------------------------");
                      System.out.println("Timeout Exception occured as expected.");
                      System.out.println("Auto Reconnect Read:");
                      doStuff(con);
                  }
              }
          
              private void doStuff(Connection con) throws SQLException
              {
                  PreparedStatement ps = null;
                  ResultSet rs = null;
                  try
                  {
                      ps = con.prepareStatement("SELECT 'x'");
                      rs = ps.executeQuery();
                      rs.next();
                      System.out.println("Retrieved data: " + rs.getString(1));
                  }
                  finally
                  {
                      if (rs != null)
                      {
                          try
                          {
                              rs.close();
                          }
                          catch (SQLException e)
                          {
                              // Ignore
                          }
                      }
                      if (ps != null)
                      {
                          try
                          {
                              ps.close();
                          }
                          catch (SQLException e)
                          {
                              // Ignore
                          }
                      }
                  }
              }
          
              private Connection getConnection()
                      throws ClassNotFoundException, SQLException, IllegalAccessException, InstantiationException
              {
                  Class.forName("com.mysql.jdbc.Driver").newInstance();
                  // Obviously you will need to alter these settings dependant on your DB configuration:
                  return DriverManager.getConnection(
                      "jdbc:mysql://localhost/jira_trunk?autoReconnect=true&useUnicode=true&characterEncoding=UTF8",
                      "jira_trunk", "jira_trunk");
              }
          

          (Note that to get the above to time out, I set the MySQL wait_timeout to 30 seconds in the MySQL Configuration file, hence the 40 second sleep.)

          Show
          mlassau Mark Lassau [Atlassian] added a comment - - edited After investigating this issue, I need to first note that the behaviour of the MySQL "autoReconnect=true" setting does not actually behave the way we expected. A closer reading of the MySQL documentation referred to in the description above points out the following: If enabled the driver will throw an exception for a queries issued on a stale or dead connection, which belong to the current transaction, but will attempt reconnect before the next query issued on the connection in a new transaction. (sic) So, the first use of the connection after the server side has timed out the connection will throw an SQLException. Then, if you re-use the connection the JDBC driver will re-establish the connection. This behaviour can be verified with the code included below. However, this was not the behaviour we observed pre JIRA v3.13. The autoReconnect=true was a red herring. Something else was keeping our connections alive. public void testAutoReconnect() throws Exception { // Create a DB connection. Connection con = getConnection(); System.out.println("First read:"); doStuff(con); Thread.sleep(40000); System.out.println("---------------------------------"); System.out.println("Second Read:"); try { doStuff(con); } catch (SQLException ex) { System.out.println("SQLException: " + ex.getMessage()); // Expected timeout - we should autoreconnect on the next use of this connection. System.out.println("---------------------------------"); System.out.println("Timeout Exception occured as expected."); System.out.println("Auto Reconnect Read:"); doStuff(con); } } private void doStuff(Connection con) throws SQLException { PreparedStatement ps = null; ResultSet rs = null; try { ps = con.prepareStatement("SELECT 'x'"); rs = ps.executeQuery(); rs.next(); System.out.println("Retrieved data: " + rs.getString(1)); } finally { if (rs != null) { try { rs.close(); } catch (SQLException e) { // Ignore } } if (ps != null) { try { ps.close(); } catch (SQLException e) { // Ignore } } } } private Connection getConnection() throws ClassNotFoundException, SQLException, IllegalAccessException, InstantiationException { Class.forName("com.mysql.jdbc.Driver").newInstance(); // Obviously you will need to alter these settings dependant on your DB configuration: return DriverManager.getConnection( "jdbc:mysql://localhost/jira_trunk?autoReconnect=true&useUnicode=true&characterEncoding=UTF8", "jira_trunk", "jira_trunk"); } (Note that to get the above to time out, I set the MySQL wait_timeout to 30 seconds in the MySQL Configuration file, hence the 40 second sleep.)
          Hide
          mlassau Mark Lassau [Atlassian] added a comment -

          I have investigated the Apache DBCP source code for versions 1.2.1, and 1.2.2, and found the cause of the changing behaviour.
          Basically in v1.2.1, there was a "happy accident" that meant the MySQL connections that timed out would be thrown away and replaced before they were borrowed from the Connection Pool.
          In v1.2.2, this no longer occurs.

          So, DBCP sits on top of Apache Commons Pool, which is a generic Object pool.
          Indeed, it even has a GenericObjectPool class which provides the borrowObject() method that is called to borrow an Object from the pool.
          borrowObject() will take an object from the pool (in our case a DB Connection), and "activate" it (it is "passivated" when it is returned).
          Now, if any Exceptions are thrown during the activateObject() method, the GenericObjectPool will throw that object away and get a new one.
          This is where the magic reconnection was happening for us in pre-DBCP v1.2.2 days.

          PoolableConnectionFactory is the DBCP class that does the DB Connection-specific work for the object pool.
          Here is PoolableConnectionFactory.activateObject() in v1.2.1:

              public void activateObject(Object obj) throws Exception {
                  ...
                  if(obj instanceof Connection) {
                      Connection conn = (Connection)obj;
                      conn.setAutoCommit(_defaultAutoCommit);
                      ...
                  }
              }
          

          and here is PoolableConnectionFactory.activateObject() in v1.2.2:

              public void activateObject(Object obj) throws Exception {
                  ...
                  if(obj instanceof Connection) {
                      Connection conn = (Connection)obj;
                      if (conn.getAutoCommit() != _defaultAutoCommit) {
                          conn.setAutoCommit(_defaultAutoCommit);
                      }
                      ...
                  }
              }
          

          Seems like a fairly innocuous change.
          However, with the MySQL JDBC drivers it made all the difference.
          Under DBCP v1.2.1, we would always call conn.setAutoCommit() on activate.
          The JDBC drivers would send a message to the server (even if the value was not changing), an SQLExpection would be raised because of the server-side timeout and this would be caught by the underlying GenericObjectPool, which would then recover silently, grabbing another connection from the pool, or creating a new one for us.
          Now, like I mentioned above, this is really a "happy accident". It was only luck that the MySQL driver wanted to do a network call on any call to setAutoCommit(). In fact, it is arguably a poor JDBC implementation. The JDBC driver knows that the value really hasn't changed, and so it should not spend time doing a network round-trip.
          You can see in the DBCP 1.2.2 code above that the DB pool now does this test for you, and saves that wasted network trip, with the consequence being that we don't find out if the connection has died during the activate phase.
          No magic reconnect happens any more - and the connection timeout Exception is thrown at a later stage when it actually affects the application, and we get those "The last packet successfully received from the server was xxx seconds ago." errors that have been reported.

          Show
          mlassau Mark Lassau [Atlassian] added a comment - I have investigated the Apache DBCP source code for versions 1.2.1, and 1.2.2, and found the cause of the changing behaviour. Basically in v1.2.1, there was a "happy accident" that meant the MySQL connections that timed out would be thrown away and replaced before they were borrowed from the Connection Pool. In v1.2.2, this no longer occurs. So, DBCP sits on top of Apache Commons Pool, which is a generic Object pool. Indeed, it even has a GenericObjectPool class which provides the borrowObject() method that is called to borrow an Object from the pool. borrowObject() will take an object from the pool (in our case a DB Connection), and "activate" it (it is "passivated" when it is returned). Now, if any Exceptions are thrown during the activateObject() method, the GenericObjectPool will throw that object away and get a new one. This is where the magic reconnection was happening for us in pre-DBCP v1.2.2 days. PoolableConnectionFactory is the DBCP class that does the DB Connection-specific work for the object pool. Here is PoolableConnectionFactory.activateObject() in v1.2.1: public void activateObject(Object obj) throws Exception { ... if(obj instanceof Connection) { Connection conn = (Connection)obj; conn.setAutoCommit(_defaultAutoCommit); ... } } and here is PoolableConnectionFactory.activateObject() in v1.2.2: public void activateObject(Object obj) throws Exception { ... if(obj instanceof Connection) { Connection conn = (Connection)obj; if (conn.getAutoCommit() != _defaultAutoCommit) { conn.setAutoCommit(_defaultAutoCommit); } ... } } Seems like a fairly innocuous change. However, with the MySQL JDBC drivers it made all the difference. Under DBCP v1.2.1, we would always call conn.setAutoCommit() on activate. The JDBC drivers would send a message to the server (even if the value was not changing), an SQLExpection would be raised because of the server-side timeout and this would be caught by the underlying GenericObjectPool, which would then recover silently, grabbing another connection from the pool, or creating a new one for us. Now, like I mentioned above, this is really a "happy accident". It was only luck that the MySQL driver wanted to do a network call on any call to setAutoCommit(). In fact, it is arguably a poor JDBC implementation. The JDBC driver knows that the value really hasn't changed, and so it should not spend time doing a network round-trip. You can see in the DBCP 1.2.2 code above that the DB pool now does this test for you, and saves that wasted network trip, with the consequence being that we don't find out if the connection has died during the activate phase. No magic reconnect happens any more - and the connection timeout Exception is thrown at a later stage when it actually affects the application, and we get those "The last packet successfully received from the server was xxx seconds ago." errors that have been reported.
          Hide
          mlassau Mark Lassau [Atlassian] added a comment - - edited

          In conclusion:
          The recommended approach to fix this issue, is to use the DBCP "validationQuery".
          This will in fact work exactly the same as the old behaviour we had in DBCP v1.2.1

          There is a small performance hit associated with this approach, however, (every time JIRA borrows a connection, it sends an extra SQL request) and so some customers may choose to use other DBCP settings to fix this issue.
          This is more difficult, and should only be undertaken by users with strong database experience.

          I have raised an Improvement request (JRA-15776) to attempt to detect possible problems of this kind in a running JIRA instance.

          Show
          mlassau Mark Lassau [Atlassian] added a comment - - edited In conclusion: The recommended approach to fix this issue, is to use the DBCP "validationQuery". This will in fact work exactly the same as the old behaviour we had in DBCP v1.2.1 There is a small performance hit associated with this approach, however, (every time JIRA borrows a connection, it sends an extra SQL request) and so some customers may choose to use other DBCP settings to fix this issue. This is more difficult, and should only be undertaken by users with strong database experience. I have raised an Improvement request ( JRA-15776 ) to attempt to detect possible problems of this kind in a running JIRA instance.
          Hide
          idaniel Ian Daniel [Atlassian] added a comment -

          If the problem happens when using JBoss, this could be the fix: http://stackoverflow.com/questions/128527/is-there-any-way-to-have-the-jboss-connection-pool-reconnect-to-oracle-when-con

          (We haven't tested this yet.)

          Show
          idaniel Ian Daniel [Atlassian] added a comment - If the problem happens when using JBoss, this could be the fix: http://stackoverflow.com/questions/128527/is-there-any-way-to-have-the-jboss-connection-pool-reconnect-to-oracle-when-con (We haven't tested this yet.)
          Hide
          greg_greg Grégory Joseph added a comment - - edited

          From what I gather from Mark's comments, we should avoid using the autoReconnect parameter - at least when using DBCP 1.2.2, which is my case, with Tomcat 6.0.18. However, http://confluence.atlassian.com/display/JIRA/Surviving+Connection+Closures still uses it in its examples. I am confused. Could anyone shed some light on this?

          (we currently have the autoReconnect=true AND a validation query, and getting these errors, mostly after a period of inactivity)

          Cheers

          -greg

          Show
          greg_greg Grégory Joseph added a comment - - edited From what I gather from Mark's comments, we should avoid using the autoReconnect parameter - at least when using DBCP 1.2.2, which is my case, with Tomcat 6.0.18. However, http://confluence.atlassian.com/display/JIRA/Surviving+Connection+Closures still uses it in its examples. I am confused. Could anyone shed some light on this? (we currently have the autoReconnect=true AND a validation query, and getting these errors, mostly after a period of inactivity) Cheers -greg
          Hide
          andrew.myers Andrew Myers [Atlassian] added a comment - - edited

          Hi Greg,

          Mark was suggesting that the autoReconnect=true parameter was not responsible for preventing problems with MySQL. Earlier versions of DBCP were always sending a request to the server to set the autocommit property and if that request failed a new connection would be opened.

          What happened in DBCP 1.2.2 is that behaviour changed, so the auto commit property would only be sent to the server if it had changed. So that meant JIRA would get a database connection from Tomcat and attempt to run a query on that connection. It would fail because MySQL had closed the connection. The autoReconnect=true parameter would tell the JDBC driver to reopen the connection but that only happens after the query failed. The failed query would not be executed again because neither JIRA nor Tomcat, which manages the database connection pool for JIRA, handled the situation where MySQL closed the connection. The result was the database got corrupted because the query never executed.

          MySQL suggest that the autoReconnect parameter should not be used at all, in favour of the application handling the case where connections are closed

          The validation query will tell tomcat to check all connections before trying to use them, so this situation should be avoided. It would basically be very similar to the way the old DBCP worked - Tomcat will always be making a server request to check a connection is still open before passing the connection onto JIRA.

          For my own testing, I set up two JIRA instances on Tomcat 6.0.18, both without the autoReconnect parameter. Both instances were connected to the same database server (different databases), and I set wait_timeout = 10 in /etc/my.cnf so connections would be closed after 10 seconds of inactivity. The JIRA instance with the validation query never experienced this problem, and the one without consistently reproduced the problem. Additionally setting autoReconnect=true was of no use without also implementing the validation query.

          My advice would be to check your jira.xml file and see that you have the resource correctly configured. It will be something like this (obviously different username/passwords etc)

          <Resource name="jdbc/JiraDS" auth="Container" type="javax.sql.DataSource"
                        username="jira"
                        password="jira"
                        driverClassName="com.mysql.jdbc.Driver"
                        url="jdbc:mysql://localhost:3306/test1?useUnicode=true&amp;characterEncoding=UTF8" 
                        validationQuery="SELECT 1" />
          

          Also you need to restart the server for this change to take effect, so you will need to copy the jira.xml into $CATALINA_HOME/conf/Catalina/localhost/jira.xml. If that still doesn't work we might need you to raise a support request.

          Kind Regards,

          Andrew

          Show
          andrew.myers Andrew Myers [Atlassian] added a comment - - edited Hi Greg, Mark was suggesting that the autoReconnect=true parameter was not responsible for preventing problems with MySQL. Earlier versions of DBCP were always sending a request to the server to set the autocommit property and if that request failed a new connection would be opened. What happened in DBCP 1.2.2 is that behaviour changed, so the auto commit property would only be sent to the server if it had changed. So that meant JIRA would get a database connection from Tomcat and attempt to run a query on that connection. It would fail because MySQL had closed the connection. The autoReconnect=true parameter would tell the JDBC driver to reopen the connection but that only happens after the query failed. The failed query would not be executed again because neither JIRA nor Tomcat, which manages the database connection pool for JIRA, handled the situation where MySQL closed the connection. The result was the database got corrupted because the query never executed. MySQL suggest that the autoReconnect parameter should not be used at all, in favour of the application handling the case where connections are closed http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html The use of this feature is not recommended, because it has side effects related to session state and data consistency when applications don't handle SQLExceptions properly, and is only designed to be used when you are unable to configure your application to handle SQLExceptions resulting from dead and stale connections properly. The validation query will tell tomcat to check all connections before trying to use them, so this situation should be avoided. It would basically be very similar to the way the old DBCP worked - Tomcat will always be making a server request to check a connection is still open before passing the connection onto JIRA. For my own testing, I set up two JIRA instances on Tomcat 6.0.18, both without the autoReconnect parameter. Both instances were connected to the same database server (different databases), and I set wait_timeout = 10 in /etc/my.cnf so connections would be closed after 10 seconds of inactivity. The JIRA instance with the validation query never experienced this problem, and the one without consistently reproduced the problem. Additionally setting autoReconnect=true was of no use without also implementing the validation query. My advice would be to check your jira.xml file and see that you have the resource correctly configured. It will be something like this (obviously different username/passwords etc) <Resource name= "jdbc/JiraDS" auth= "Container" type= "javax.sql.DataSource" username= "jira" password= "jira" driverClassName= "com.mysql.jdbc.Driver" url= "jdbc:mysql: //localhost:3306/test1?useUnicode= true &amp;characterEncoding=UTF8" validationQuery= "SELECT 1" /> Also you need to restart the server for this change to take effect, so you will need to copy the jira.xml into $CATALINA_HOME/conf/Catalina/localhost/jira.xml . If that still doesn't work we might need you to raise a support request. Kind Regards, Andrew
          Hide
          greg_greg Grégory Joseph added a comment -

          Hi Andrew,

          Thanks a lot for the detailed response! I double-checked my context descriptor - i've had the validationQuery there for a while. Two differences though:

          • as I said, I still had the autoReconnect=true param in the url (now removed) - you guys might want to remove it from the examples on the "Surviving Connection Closures" page, and/or add a note about it
          • I have a maxActive attribute (value: 20) - I am not sure if this is any useful and tbh was probably copy/pasted from some example.

          Since I'm planning to upgrade our Jira instance later today or tomorrow, I won't restart now. Hopefully either or both removing autoReconnect or upgrading from 3.12.2 to the latest will help.

          Cheers,

          -greg

          Show
          greg_greg Grégory Joseph added a comment - Hi Andrew, Thanks a lot for the detailed response! I double-checked my context descriptor - i've had the validationQuery there for a while. Two differences though: as I said, I still had the autoReconnect=true param in the url (now removed) - you guys might want to remove it from the examples on the "Surviving Connection Closures" page, and/or add a note about it I have a maxActive attribute (value: 20) - I am not sure if this is any useful and tbh was probably copy/pasted from some example. Since I'm planning to upgrade our Jira instance later today or tomorrow, I won't restart now. Hopefully either or both removing autoReconnect or upgrading from 3.12.2 to the latest will help. Cheers, -greg
          Hide
          mlassau Mark Lassau [Atlassian] added a comment -

          Greg,

          Thanks for the helpful comment about our documentation. I have created a request to change it (JRA-16207).

          Just to make it clear:

          • Including the autoReconnect=true in your URL should not stop the validationQuery from working.
          • The maxActive attribute sets the maximum number of DB connections in your connection pool. This is good to have in a production environment, and you will notice that although Andrew wasn't using it, it is included in the documentation page you linked to.
          Show
          mlassau Mark Lassau [Atlassian] added a comment - Greg, Thanks for the helpful comment about our documentation. I have created a request to change it ( JRA-16207 ). Just to make it clear: Including the autoReconnect=true in your URL should not stop the validationQuery from working. The maxActive attribute sets the maximum number of DB connections in your connection pool. This is good to have in a production environment, and you will notice that although Andrew wasn't using it, it is included in the documentation page you linked to.

            People

            • Assignee:
              mlassau Mark Lassau [Atlassian]
              Reporter:
              andrew.myers Andrew Myers [Atlassian]
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: