Uploaded image for project: 'Jira Data Center'
  1. Jira Data Center
  2. JRASERVER-73281

The status endpoint is taking longer to respond since Jira 8.20

    XMLWordPrintable

Details

    Description

      Issue Summary

      Jira 8.20 checks for index consistency when requested for the /status (see JRASERVER-66970) and this is underperforming in some case due to slow DB calls.
      Response times that would avg in the ms or seconds in previous versions have been spiking up to minutes in some customer production instances.

      Steps to Reproduce

      1. Set up a Jira DC cluster (also affects Jira Server)
      2. Populate it with issues and projects as described below, excerpted from a support zip
      3. Constantly request the /status endpoint (every 5 secs for eg)
      4. Monitor response times or timeouts for some hours
      <Archived-Projects>150</Archived-Projects>
      <Issues>4200000</Issues>
      <Issues-In-Archived-Projects>220000</Issues-In-Archived-Projects>
      <Projects>2200</Projects>
      

      Expected Results

      Response times wouldn't have had a significant increase without public notice.The /status endpoint is usually configured in Load Balancers and customers have timeout thresholds for it so to route requests to other nodes if one's underperfoming or unavailable.

      Actual Results

      Spikes up to minutes:

      $ cat atlassian-jira.log* | grep "/status \[c.a.jira.servlet.ApplicationStateResolverImpl\] Checking index consistency" | cut -d" " -f1,2,16,17 | sort | grep " min" | sort -k3 -n | sort | head
      2022-01-31 05:23:46,162-0600 9.747 min
      2022-01-31 06:02:01,524-0600 6.754 min
      2022-01-31 08:06:29,255-0600 7.466 min
      2022-01-31 08:09:48,814-0600 1.001 min
      2022-01-31 11:12:14,043-0600 6.218 min
      2022-01-31 11:47:52,230-0600 4.855 min
      2022-01-31 13:36:34,400-0600 7.151 min
      2022-01-31 14:12:11,988-0600 5.571 min
      2022-01-31 14:46:38,706-0600 4.131 min
      2022-01-31 15:20:57,902-0600 3.821 min
      

      And long running queries in the DB like:

      SELECT COUNT(I.ID) FROM jiraissue I WHERE (I.PROJECT IN (10001, 10002, 10003, 10004, 10005, 10006, 10007, ..., 12008) ) AND (I.ARCHIVED = 'Y' )
      

      Thread dumps reveal the /status request Thread continually reading from the DB (MySQL example):

      java.net.SocketInputStream.socketRead0(Native Method)
      java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
      java.net.SocketInputStream.read(SocketInputStream.java:171)
      java.net.SocketInputStream.read(SocketInputStream.java:141)
      com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
      com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
      com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
      com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3011)
      com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3472)
      com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3462)
      com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3905)
      com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530)
      com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)
      com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2495)
      com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1903)
      com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2011)
      org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:83)
      org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:83)
      com.atlassian.jira.ofbiz.sql.PreparedStatementWrapper.executeQuery(PreparedStatementWrapper.java:42)
      com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement.lambda$executeQuery$5(DiagnosticPreparedStatement.java:59)
      com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement$$Lambda$1958/155074555.execute(Unknown Source)
      com.atlassian.diagnostics.internal.platform.monitor.db.DefaultDatabaseDiagnosticsCollector.recordExecutionTime(DefaultDatabaseDiagnosticsCollector.java:70)
      com.atlassian.jira.diagnostic.connection.DatabaseDiagnosticsCollectorDelegate.recordExecutionTime(DatabaseDiagnosticsCollectorDelegate.java:55)
      com.atlassian.jira.diagnostic.connection.DiagnosticPreparedStatement.executeQuery(DiagnosticPreparedStatement.java:59)
      org.ofbiz.core.entity.jdbc.SQLProcessor.executeQuery(SQLProcessor.java:527)
      org.ofbiz.core.entity.GenericDAO.createEntityListIterator(GenericDAO.java:881)
      org.ofbiz.core.entity.GenericDAO.selectListIteratorByCondition(GenericDAO.java:861)
      org.ofbiz.core.entity.GenericDAO.selectByCondition(GenericDAO.java:799)
      org.ofbiz.core.entity.GenericDAO.selectByCondition(GenericDAO.java:779)
      org.ofbiz.core.entity.GenericHelperDAO.findByCondition(GenericHelperDAO.java:198)
      org.ofbiz.core.entity.GenericDelegator.findByCondition(GenericDelegator.java:1095)
      com.atlassian.jira.ofbiz.DefaultOfBizDelegator.getCountByAnd(DefaultOfBizDelegator.java:382)
      com.atlassian.jira.ofbiz.WrappingOfBizDelegator.getCountByAnd(WrappingOfBizDelegator.java:229)
      com.atlassian.jira.project.archiving.ArchivedStatistics.calculateTotalArchivedIssuesCount(ArchivedStatistics.java:118)
      com.atlassian.jira.project.archiving.ArchivedStatistics.calculateArchivedStatisticsData(ArchivedStatistics.java:128)
      com.atlassian.jira.project.archiving.ArchivedStatistics$$Lambda$355/1556705959.get(Unknown Source)
      com.atlassian.cache.ehcache.EhCacheManager$SupplierAdapter.load(EhCacheManager.java:243)
      com.atlassian.cache.ehcache.EhCacheManager$SupplierAdapter.load(EhCacheManager.java:230)
      com.atlassian.cache.ehcache.wrapper.ValueProcessorAtlassianCacheLoaderDecorator.load(ValueProcessorAtlassianCacheLoaderDecorator.java:26)
      com.atlassian.cache.ehcache.LoadingCache.getFromLoader(LoadingCache.java:174)
      com.atlassian.cache.ehcache.LoadingCache$$Lambda$268/213460558.apply(Unknown Source)
      com.atlassian.cache.ehcache.SynchronizedLoadingCacheDecorator.synchronizedLoad(SynchronizedLoadingCacheDecorator.java:29)
      com.atlassian.cache.ehcache.LoadingCache.loadValueAndReleaseLock(LoadingCache.java:142)
      com.atlassian.cache.ehcache.LoadingCache.get(LoadingCache.java:121)
      com.atlassian.cache.ehcache.DelegatingCachedReference.get(DelegatingCachedReference.java:74)
      com.atlassian.jira.project.archiving.ArchivedStatistics.getTotalArchivedIssuesCount(ArchivedStatistics.java:56)
      com.atlassian.jira.issue.index.DefaultIndexManager.size(DefaultIndexManager.java:762)
      com.atlassian.jira.issue.index.DefaultIndexManager.isIndexConsistent(DefaultIndexManager.java:740)
      

      Workarounds

      Disable the new behavior introduced in /status should indicate when indexes are broken on a node

      Set the system property (or append to what's set already, separated by spaces): 

      -Dcom.atlassian.jira.status.index.check=false

      Follow these instructions: Setting properties and options on startup.
      After a restart, Jira will behave as before: the health check won’t reflect the indexing status.

      Attachments

        Issue Links

          Activity

            People

              cb173a7ca7c0 Michał Błajet
              rmartinez3@atlassian.com Rodrigo Martinez
              Votes:
              28 Vote for this issue
              Watchers:
              72 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: