The fix for this bug has been released to our Long Term Support release.

      The fix for this bug is now available in the latest releases of Confluence 7.13 and 7.19.

      Problem

      Triggering a reindex from the UI after upgrading Confluence results in an OutOfMemoryError and an unresponsive application on large instances. This can also happen during the upgrade to Confluence 7.9 or later.

      Steps to Reproduce (reindex)

      1. Install a Confluence instance below 7.9 (tested with 7.4.9)
      2. Populate it with a large data set (lots of deleted content)
      3. Upgrade to 7.9+ (tested with 7.13)
      4. Trigger an index rebuild from the UI

      Steps to Reproduce (upgrade)

      1. Install a Confluence instance below 7.9 (tested with 7.4.9)
      2. Populate it with a large data set (lots of deleted content)
      3. Upgrade to 7.9+ (tested with 7.13)

      Expected Results

      Reindexing/upgrade finishes without errors.

      Actual Results

      The node crashes with an out-of-memory error:

      atlassian-confluence.log
      2021-11-26 23:33:55,980 ERROR [Catalina-utility-1] [atlassian.confluence.plugin.PluginFrameworkContextListener] launchUpgrades Upgrade failed, application will not start: Upgrade task com.atlassian.confluence.upgrade.upgradetask.SplitIndexUpgradeTask@22b566d7 failed during the UPGRADE phase due to: Java heap space
      com.atlassian.confluence.upgrade.UpgradeException: Upgrade task com.atlassian.confluence.upgrade.upgradetask.SplitIndexUpgradeTask@22b566d7 failed during the UPGRADE phase due to: Java heap space
      
      Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot complete forceMergeDeletes
      Exception in thread "Lucene Merge Thread #24" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space Caused by: java.lang.OutOfMemoryError: Java heap space
         java.lang.OutOfMemoryError: Java heap space
         java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
         Caused by: java.lang.OutOfMemoryError: Java heap space
      

      You may also see errors like this in the atlassian-confluence-index.log:

      atlassian-confluence-index.log
      INFO [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] lambda$null$5 full reindex group 17/17 completed for CONTENT_ONLY, 26% complete
      INFO [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] lambda$null$5 full reindex completed for CONTENT_ONLY, 26% complete, start cleaning up files
      ERROR [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] cleanUpIndex unable to force writer to clean-up
       -- referer: http://1.1.1.1:8090/plugins/servlet/rebuildindex | url: /rest/prototype/latest/index/reindex | traceId: 2784176c19e14e59 | userName: admin
      java.io.IOException: background merge hit exception: _nuh(4.4):C3668664/9192 into _p44
          at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1817)
          at com.atlassian.bonnie.InstrumentedIndexWriter.forceMergeDeletes(InstrumentedIndexWriter.java:99)
          at com.atlassian.confluence.internal.index.lucene.LuceneReIndexer.cleanUpIndex(LuceneReIndexer.java:256)
      ...
      Caused by: java.lang.OutOfMemoryError: Java heap space
          at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:212)
          at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:174)
          at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
          at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:253)
          at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:215)
          at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
          at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
          at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
          at com.atlassian.bonnie.InstrumentedIndexWriter.merge(InstrumentedIndexWriter.java:113)
          at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
          at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
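
To check whether an index log shows the same signature as the excerpt above, a small script can scan it for the relevant markers. This is a minimal sketch, not an official diagnostic: the function name `summarize_index_log` is hypothetical, and the marker strings are copied from the log excerpts in this report, so they may not match other Confluence versions.

```python
import re

# Marker strings taken from the log excerpts in this report (assumption:
# other Confluence versions may word these messages differently).
PROGRESS = re.compile(r"(\d+)% complete")
CLEANUP_START = "start cleaning up files"
CLEANUP_FAILED = "unable to force writer to clean-up"
OOM = "java.lang.OutOfMemoryError"
JOB_RESET = "There was no updates for current re-index job"


def summarize_index_log(lines):
    """Summarize reindex progress and failure markers found in log lines."""
    summary = {
        "last_progress": None,    # last "N% complete" value seen, if any
        "cleanup_started": False,
        "cleanup_failed": False,
        "oom": False,
        "job_reset": False,
    }
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            summary["last_progress"] = int(match.group(1))
        if CLEANUP_START in line:
            summary["cleanup_started"] = True
        if CLEANUP_FAILED in line:
            summary["cleanup_failed"] = True
        if OOM in line:
            summary["oom"] = True
        if JOB_RESET in line:
            summary["job_reset"] = True
    return summary
```

A call such as `summarize_index_log(open(path, encoding="utf-8", errors="replace"))` would process a log file, where `path` points at your own atlassian-confluence-index.log. A result with `cleanup_started` set and either `oom` or `job_reset` set matches the pattern described in this issue.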
      

      Workaround

      Manually rebuilding the index from scratch does not trigger this problem.

      Subsequent reindexes from the UI should complete as expected after the rebuild from scratch.

      Notes

      Inspecting the heap dump shows thousands of small org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer$3 objects. All of them appear to reference .nvd files in the change index folder. Those files are huge, reaching gigabytes in size depending on the data set.

      It looks like an issue in the forceMergeDeletes logic causes these large files to be created and loaded into memory after the split index upgrade task. The overall index folder size also increases significantly after the upgrade, and even more after a failed reindex from the UI. Rebuilding from scratch fixes this, and the index files return to their expected size.
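
The size checks described in these notes can be scripted so that the index folder is easy to compare before and after an upgrade or reindex. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the index directory location is an assumption that you must supply for your own installation.

```python
from pathlib import Path


def dir_size_bytes(root):
    """Total size in bytes of all regular files under root (recursive)."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def largest_files(root, suffix=".nvd", limit=10):
    """Return (size_bytes, path) pairs for the largest files ending in
    suffix under root, biggest first."""
    found = [
        (p.stat().st_size, str(p))
        for p in Path(root).rglob("*" + suffix)
        if p.is_file()
    ]
    found.sort(reverse=True)
    return found[:limit]


def human(num_bytes):
    """Format a byte count with a binary unit, capped at GiB."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def report(index_dir):
    """Build a plain-text report of total index size and the largest .nvd files."""
    lines = [f"total: {human(dir_size_bytes(index_dir))}"]
    for size, path in largest_files(index_dir):
        lines.append(f"{human(size):>12}  {path}")
    return "\n".join(lines)
```

Running `print(report(index_dir))` with `index_dir` set to your index folder, once before the upgrade and once after, should make any growth in the .nvd files visible.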

      Since a rebuild from scratch is required to avoid this issue, a good approach is to remove the index files before upgrading so that they are rebuilt in the new format.

            [CONFSERVER-69488] OutOfMemory when upgrading or running reindex

            Rilwan_Ahmed_NC added a comment - Hi Martin, try increasing the heap memory and rebuilding the index from scratch. The fix for this bug is now available in the latest releases of Confluence 7.13 and 7.19.

            Martin Adámek added a comment - We have the same problem, but even rebuilding the index from scratch does not avoid this issue for us. We are running Confluence 7.14.9. We are currently using 7-day-old indexes and cannot reindex without hitting a heap space error.

            Rilwan_Ahmed_NC added a comment - Some more information:

            In Confluence 7.13.7, the system fails after 28% of indexing. No heap memory error appears in atlassian-confluence-index.log or atlassian-confluence.log, but graphs and dashboards show that the heap is fully consumed.

            atlassian-confluence-index.log only has entries up to 28%. The last lines are:

            INFO [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] lambda$null$5 full reindex completed for CONTENT_ONLY, 28% complete, start cleaning up files
            WARN [Caesium-1-1] [index.status.schedule.ReIndexHouseKeepingJobRunner] repairRebuildingJobIfNeeded There was no updates for current re-index job for a while. Last update received at XXXXXXXX. Resetting it

            Saran Babu Pannuru (Inactive) added a comment - A fix for this issue is available in Confluence Server and Data Center 7.13.9. Upgrade now or check out the Release Notes to see what other issues are resolved.

            James Whitehead added a comment - A fix for this issue is available in Confluence Server and Data Center 7.19.0. Upgrade now or check out the Release Notes to see what other issues are resolved.

              05a8667aef42 Saquia Naz
              bandreeti Bernardo Andreeti