  1. Jira Data Center
  2. JRASERVER-61099

JIRA Lucene indexes get corrupted under high load and ungraceful shutdown

Details

    Description

      Summary

      The JIRA Lucene index is a search index built from the data in the database. Any problem with the Lucene index causes problems with search results, JIRA dashboards, or JIRA Software dashboards. The index can become corrupted when JIRA or the JVM shuts down ungracefully or crashes, particularly under high load.

      Expected Results

      The index recovers from the transaction log or the nearest checkpoint.

      Actual Results

      The Lucene index data is corrupted and JIRA doesn't function properly. The log shows errors such as:

      2016-05-17 04:48:49,064 http-bio-8080-exec-16 ERROR admin288x38453x8 1k54nbw 127.0.0.1 /browse/PP [jira.projects.web.BrowseProject] Tab panel 'com.atlassian.jira.jira-projects-plugin:summary-panel' failed to render.
      com.atlassian.jira.issue.index.SearchUnavailableException: com.atlassian.jira.util.RuntimeIOException: java.io.FileNotFoundException: /atlassian/application-data/jira/caches/indexes/issues/_nfbu.cfs (No such file or directory)
      

      Notes

      Plugins continue to operate after consistencyChecker is run
      • IndexWriters are shut down and closed in consistencyChecker.destroy(ServletContextProvider.getServletContext()).
      • The problem is that plugins are stopped only after that, so SD Automation is still running and creates a new IssueIndexer.
      • This also explains why the following appears at the end of the log file:
        12-Jun-2017 15:03:42.169 WARNING [localhost-startStop-2] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [ISSUE-indexQueue:thread-1] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
         sun.misc.Unsafe.park(Native Method)
         java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
         java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
         com.atlassian.jira.index.QueueingIndex$Task.index(QueueingIndex.java:142)
         com.atlassian.jira.index.QueueingIndex$Task.run(QueueingIndex.java:111)
         java.lang.Thread.run(Thread.java:745)
        

      Nothing stops this second IssueIndexer thread, so it:

      • is still running at the end of shutdown;
      • doesn't flush its changes to the index files.
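The Tomcat warning quoted above can serve as a quick check for whether a shutdown left indexing threads behind. The following is only a sketch: the function name `leaked_threads` is hypothetical, and the pattern assumes the exact warning wording shown in the log excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of JIRA or Tomcat): print the names of
# threads that Tomcat reported as still running when the webapp stopped.
# Reads the log files given as arguments, or standard input if none.
leaked_threads() {
    # Capture the text between the brackets after "a thread named".
    sed -n 's/.*appears to have started a thread named \[\([^]]*\)\] but has failed to stop it.*/\1/p' "$@" |
        sort -u
}
```

For example, `leaked_threads logs/catalina.out` (the log path is an assumption and depends on the installation) lists each distinct thread name once, so an entry beginning with ISSUE-indexQueue indicates an index queue thread that outlived shutdown.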

      Force kill via stop-jira.sh
      1. By default, the bin/stop-jira.sh script forces JIRA to shut down after 20 seconds:
        exec $PRGDIR/shutdown.sh 20 -force $@
        

        From bin/catalina.sh:

          if [ $FORCE -eq 1 ]; then
          ..
              if [ -f "$CATALINA_PID" ]; then
                PID=`cat "$CATALINA_PID"`
                echo "Killing Tomcat with the PID: $PID"
                kill -9 $PID
        
      2. JIRA uses Lucene 3.3.0.
      3. Due to inefficiencies in the JIRA 6 plugin system, shutting down a JIRA instance can take up to 10 minutes.
      4. Segments file description: http://lucene.apache.org/core/3_3_0/fileformats.html#Segments%20File
      5. Possibly caused by https://issues.apache.org/jira/browse/LUCENE-3418
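
Points 1 and 3 together suggest how the forced kill comes about: shutdown may need up to 10 minutes, but the script waits only 20 seconds before `kill -9`. The sketch below illustrates the wait-then-kill pattern with a configurable grace period. It is a simplified illustration, not the code from bin/catalina.sh: the function name `stop_with_grace` is hypothetical, and it sends SIGTERM rather than using Tomcat's own shutdown mechanism.

```shell
#!/bin/sh
# Hypothetical helper (not part of JIRA or Tomcat): ask a process to stop,
# wait up to a grace period, and send SIGKILL only if it is still running.
# Returns 0 if the process exited on its own, 1 if it had to be killed.
stop_with_grace() {
    target_pid=$1
    grace_seconds=$2

    # If the signal cannot be delivered, the process is already gone
    # (or not ours to signal), so there is nothing to wait for.
    kill -TERM "$target_pid" 2>/dev/null || return 0

    waited=0
    while [ "$waited" -lt "$grace_seconds" ]; do
        # kill -0 sends no signal; it only tests that the process exists.
        kill -0 "$target_pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Final check before forcing, in case it exited during the last sleep.
    kill -0 "$target_pid" 2>/dev/null || return 0
    echo "PID $target_pid still running after ${grace_seconds}s, sending SIGKILL" >&2
    kill -KILL "$target_pid"
    return 1
}
```

A call such as `stop_with_grace "$(cat "$CATALINA_PID")" 600` would allow the 10 minutes mentioned above before resorting to SIGKILL; the value 600 is only an example.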

      Workaround

            People

              klopacinski Karol Lopacinski
              ayakovlev@atlassian.com Andriy Yakovlev [Atlassian]
              Votes: 16
              Watchers: 45