Jira Data Center / JRASERVER-41409

Lucene index can be corrupted by a graceful shutdown

Details

    Description

      QueueingIndex uses thread interruption to implement cancellation. This creates a race condition: if the indexer is closed while an index write operation is in progress, the interrupt causes the underlying index file channels to close and abort their operations, which can leave the index corrupt. See chapter 7 of Java Concurrency in Practice for a detailed description of task cancellation strategies and for more on why calling interrupt() on threads that run other people's code is not a good idea.
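The channel behaviour described above can be reproduced in isolation. The following is a minimal sketch, not JIRA or Lucene code: the class name, method name, and temp file are made up for illustration. It relies only on the documented java.nio rule that a blocking I/O operation on an interruptible channel, invoked by a thread whose interrupt status is set, closes the channel and throws ClosedByInterruptException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; nothing here is taken from the JIRA code base.
final class InterruptClosesChannel {

    /** Returns true if a write attempted with a pending interrupt closed the channel. */
    static boolean interruptedWriteClosesChannel() throws IOException {
        Path file = Files.createTempFile("index-demo", ".bin");
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            // Simulate the shutdown path interrupting the indexing thread.
            Thread.currentThread().interrupt();
            try {
                channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
                return false; // the write went through; no interrupt effect observed
            } catch (ClosedByInterruptException e) {
                // The channel is now closed for every user, not only this call.
                return !channel.isOpen();
            } finally {
                Thread.interrupted(); // clear the flag so cleanup below is unaffected
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("channel closed by interrupt: " + interruptedWriteClosesChannel());
    }
}
```

The point of the sketch is that the interrupt does not merely cancel one call: it closes the shared channel, so any code still holding that channel (here, the index writer) fails afterwards.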

      This is the underlying culprit in JRA-38544. As near as I can tell, it has existed since the indexing code was rewritten in JIRA 4.0. The race is normally triggered very rarely, but large, busy systems are more likely to encounter it.

      Example stack trace:

      2014-12-19 18:07:45,525 some-thread-name WARN some-user     [atlassian.jira.index.AccumulatingResultBuilder] Indexing failed for Issue - '720151'
      2014-12-19 18:07:45,530 some-thread-name WARN some-user     [atlassian.jira.index.AccumulatingResultBuilder] java.nio.channels.ClosedByInterruptException
      com.atlassian.jira.util.RuntimeIOException: java.nio.channels.ClosedByInterruptException
      	at com.atlassian.jira.index.WriterWrapper.commit(WriterWrapper.java:137)
      	at com.atlassian.jira.index.DefaultIndexEngine$WriterReference.commit(DefaultIndexEngine.java:230)
      	at com.atlassian.jira.index.DefaultIndexEngine$FlushPolicy$2.commit(DefaultIndexEngine.java:63)
      	at com.atlassian.jira.index.DefaultIndexEngine$FlushPolicy.perform(DefaultIndexEngine.java:88)
      	at com.atlassian.jira.index.DefaultIndexEngine.write(DefaultIndexEngine.java:158)
      	at com.atlassian.jira.index.DefaultIndex.perform(DefaultIndex.java:36)
      	at com.atlassian.jira.index.QueueingIndex$Task.index(QueueingIndex.java:145)
      	at com.atlassian.jira.index.QueueingIndex$Task.run(QueueingIndex.java:130)
      	at java.lang.Thread.run(Thread.java:724)
      Caused by: java.nio.channels.ClosedByInterruptException
      	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
      	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:880)
      	at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(MMapDirectory.java:264)
      	at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:216)
      	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:85)
      	at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:114)
      	at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:702)
      	at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:677)
      	at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:249)
      	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3571)
      	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3508)
      	at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3371)
      	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3444)
      	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3426)
      	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3410)
      	at com.atlassian.jira.index.WriterWrapper.commit(WriterWrapper.java:133)
      	... 8 more
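The trace shows a commit being cut short by an interrupt. One of the cancellation strategies covered in the chapter cited in the description is a "poison pill": the shutdown path enqueues a sentinel task and waits, so the worker is never interrupted in the middle of a write. The sketch below only illustrates that idea under assumed names (PoisonPillIndexer, submit, close are all hypothetical); it is not the fix that was applied to QueueingIndex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of interrupt-free shutdown; not JIRA's implementation.
final class PoisonPillIndexer {
    // Sentinel recognised by identity; it is never executed.
    private static final Runnable POISON = new Runnable() {
        @Override public void run() { }
    };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain, "indexer-sketch");

    void start() { worker.start(); }

    void submit(Runnable task) { queue.add(task); }

    /** Graceful close: enqueue the sentinel and wait. The worker is never interrupted. */
    void close() throws InterruptedException {
        queue.add(POISON);
        worker.join();
    }

    private void drain() {
        try {
            while (true) {
                Runnable task = queue.take();
                if (task == POISON) {
                    return; // everything queued before close() has already run
                }
                task.run(); // an in-flight "write" always runs to completion
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve status if someone else interrupts
        }
    }

    /** Submits three stand-in writes, closes, and returns what was written. */
    static List<Integer> demo() throws InterruptedException {
        List<Integer> written = new ArrayList<>(); // only the worker writes; join() publishes it
        PoisonPillIndexer indexer = new PoisonPillIndexer();
        indexer.start();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            indexer.submit(() -> written.add(id));
        }
        indexer.close();
        return written;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

The trade-off is that close() blocks until the queued work has drained, so a real implementation would likely need a timeout or a way to discard pending tasks.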
      

            People

              pklimkowski@atlassian.com Piotr Klimkowski (Inactive)
              cfuller crf
              Votes: 3
              Watchers: 16
