Improve IOException handling during reindexing and restoring an index

      Expected Behavior

      When an IOException is encountered during reindexing or restoring an index, Jira should retry the operation a few times before failing.
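      The retry behavior requested here could look roughly like the sketch below. It is not Jira's implementation: the class `RetryingIo`, the interface `IoAction`, the method `withRetries`, and the attempt count and delays are all hypothetical names and values chosen for illustration. The idea is to re-run an I/O step that failed with an IOException, waiting a little longer between attempts, and to rethrow the last exception only once the attempts are exhausted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Jira: retries an I/O action on IOException.
class RetryingIo {

    /** An I/O step that may fail transiently, e.g. deleting a locked file. */
    @FunctionalInterface
    interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action up to maxAttempts times. After each failed attempt the
     * thread sleeps, doubling the delay, and the last IOException is rethrown
     * once all attempts are used up.
     */
    static void withRetries(IoAction action, int maxAttempts, long initialDelayMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return; // success, no further attempts needed
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the original failure
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // back off before the next attempt
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative values only: 3 attempts, starting with a 100 ms pause.
        Path file = Files.createTempFile("segment", ".cfs");
        withRetries(() -> Files.delete(file), 3, 100L);
        System.out.println("Still exists: " + Files.exists(file));
    }
}
```

      A short backoff gives a scanner or another process time to release its handle on the file. A lock that is never released would still end in a failure, only later.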

      Actual Behavior

      If an IOException is thrown, e.g. because a file is locked by another process, the reindex or index restore stops, which usually results in a corrupted index.
      This is usually caused by a running anti-virus scanner or by the JDK-4724038 bug on Windows.
      The problem can manifest as an exception like this:

      2018-07-10 22:58:51,518 NodeReindexServiceThread:thread-1 INFO      [c.a.j.index.ha.DefaultIndexCopyService] Index restore started. Snapshot file: IndexSnapshot_16400.zip
      2018-07-10 22:59:03,221 NodeReindexServiceThread:thread-1 INFO      [c.a.j.index.ha.DefaultIndexRecoveryManager] Restoring search indexes - 1% complete... Replacing indexes
      2018-07-10 22:59:03,486 NodeReindexServiceThread:thread-1 ERROR      [c.a.j.index.ha.DefaultNodeReindexService] Error re-indexing node changes
      java.lang.RuntimeException: java.io.IOException: Unable to delete file: D:\Atlassian\JIRA\caches\indexes\issues\_9j.cfs
      	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager$ReplaceIndexRunner.run(DefaultIndexRecoveryManager.java:344)
      	at com.atlassian.jira.issue.index.DefaultIndexManager.withReindexLock(DefaultIndexManager.java:377)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.atlassian.jira.config.component.SwitchingInvocationHandler.invoke(SwitchingInvocationHandler.java:22)
      	at com.sun.proxy.$Proxy19.withReindexLock(Unknown Source)
      	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager.recoverIndexFromBackup(DefaultIndexRecoveryManager.java:121)
      	at com.atlassian.jira.index.ha.DefaultIndexCopyService$MessageConsumer.restoreIndex(DefaultIndexCopyService.java:154)
      	at com.atlassian.jira.index.ha.DefaultIndexCopyService.restoreIndex(DefaultIndexCopyService.java:71)
      	at com.atlassian.jira.index.ha.DefaultNodeReindexService.updateAffectedIndexes(DefaultNodeReindexService.java:293)
      	at com.atlassian.jira.index.ha.DefaultNodeReindexService.reIndex(DefaultNodeReindexService.java:252)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: Unable to delete file: D:\Atlassian\JIRA\caches\indexes\issues\_9j.cfs
      	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1919)
      	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1399)
      	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1331)
      	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager$ReplaceIndexRunner.replaceIndexes(DefaultIndexRecoveryManager.java:407)
      	at com.atlassian.jira.index.ha.DefaultIndexRecoveryManager$ReplaceIndexRunner.run(DefaultIndexRecoveryManager.java:342)
      

      Steps to Reproduce

      Lock one of the Lucene index files inside the Jira home directory and start a reindex.
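      One possible way to hold such a lock for testing is a small standalone Java program like the sketch below. The class `IndexFileLocker` and the method `lockExclusively` are hypothetical names, and the file path is passed as an argument rather than assumed. Whether an open channel with an exclusive lock actually prevents another process from deleting the file depends on the platform. On Windows an open handle typically does, while on Linux file locks are usually advisory, so this is meant for a Windows test node only.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical test utility, not part of Jira: keeps a file open and locked.
class IndexFileLocker {

    /**
     * Opens the file and takes an exclusive lock on it. The lock is held
     * until the returned channel is closed.
     */
    static FileChannel lockExclusively(Path file) throws IOException {
        FileChannel channel = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE);
        try {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                // Another process already holds an overlapping lock.
                throw new IOException("File is locked by another process: " + file);
            }
            return channel;
        } catch (IOException | RuntimeException e) {
            channel.close(); // do not leak the handle if locking failed
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java IndexFileLocker <path-to-index-file>");
            return;
        }
        Path file = Paths.get(args[0]);
        try (FileChannel channel = lockExclusively(file)) {
            System.out.println("Holding lock on " + file + ". Press Enter to release.");
            System.in.read(); // start the reindex while the program waits here
        }
    }
}
```

      Start the program against a file in the index directory, trigger the reindex while it waits, then press Enter to release the lock.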

      Workaround

      Exclude the Jira home directory from anti-virus scans.
      If the problem is caused by the JVM bug, there is currently no known workaround.

            [JSWSERVER-16689] Improve IOException handling during reindexing and restoring an index

            Bugfix Automation Bot made changes -
            Minimum Version New: 7.02
            Daniel Rauf made changes -
            Assignee New: Daniel Rauf [ drauf ]
            Owen made changes -
            Workflow Original: JAC Bug Workflow v2 [ 2853228 ] New: JAC Bug Workflow v3 [ 2938126 ]
            Status Original: Resolved [ 5 ] New: Closed [ 6 ]
            Owen made changes -
            Symptom Severity Original: Major [ 14431 ] New: Severity 2 - Major [ 15831 ]
            Owen made changes -
            Workflow Original: JIRA Bug Workflow w Kanban v7 - Restricted [ 2745658 ] New: JAC Bug Workflow v2 [ 2853228 ]
            Status Original: Closed [ 6 ] New: Resolved [ 5 ]
            Daniel Rauf made changes -
            Resolution New: Duplicate [ 3 ]
            Status Original: Open [ 1 ] New: Closed [ 6 ]
            Daniel Rauf made changes -
            Link New: This issue duplicates JRASERVER-67606 [ JRASERVER-67606 ]
            Andriy Yakovlev [Atlassian] made changes -
            Affects Version/s New: 7.2.14 [ 78307 ]
            Andriy Yakovlev [Atlassian] made changes -
            Labels New: pse-request
            Daniel Rauf created issue -

              drauf Daniel Rauf