-
Bug
-
Resolution: Fixed
-
Highest
-
7.9.0, 7.13.0, 7.13.2, 7.13.5
-
89
-
Severity 2 - Major
-
1,555
-
The fix for this bug has been released to our Long Term Support releases and is available in the latest releases of Confluence 7.13 and 7.19.
Problem
Triggering a reindex from the UI after upgrading Confluence results in an OutOfMemoryError and an unresponsive application on large instances. The same failure can also occur during the upgrade to Confluence 7.9 or later.
Steps to Reproduce (reindex)
- Install a Confluence instance below 7.9 (tested with 7.4.9)
- Populate it with a large data set (lots of deleted content)
- Upgrade to 7.9+ (tested with 7.13)
- Trigger an index rebuild from the UI
Steps to Reproduce (upgrade)
- Install a Confluence instance below 7.9 (tested with 7.4.9)
- Populate it with a large data set (lots of deleted content)
- Upgrade to 7.9+ (tested with 7.13)
Expected Results
Reindexing/upgrade finishes without errors.
Actual Results
Node crashes with an out of memory error:
2021-11-26 23:33:55,980 ERROR [Catalina-utility-1] [atlassian.confluence.plugin.PluginFrameworkContextListener] launchUpgrades Upgrade failed, application will not start: Upgrade task com.atlassian.confluence.upgrade.upgradetask.SplitIndexUpgradeTask@22b566d7 failed during the UPGRADE phase due to: Java heap space
com.atlassian.confluence.upgrade.UpgradeException: Upgrade task com.atlassian.confluence.upgrade.upgradetask.SplitIndexUpgradeTask@22b566d7 failed during the UPGRADE phase due to: Java heap space
Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot complete forceMergeDeletes
Exception in thread "Lucene Merge Thread #24" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
Caused by: java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
Caused by: java.lang.OutOfMemoryError: Java heap space
You may also see errors like this in the atlassian-confluence-index.log:
INFO [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] lambda$null$5 full reindex group 17/17 completed for CONTENT_ONLY, 26% complete
INFO [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] lambda$null$5 full reindex completed for CONTENT_ONLY, 26% complete, start cleaning up files
ERROR [lucene-interactive-reindexing-thread] [internal.index.lucene.LuceneReIndexer] cleanUpIndex unable to force writer to clean-up
 -- referer: http://1.1.1.1:8090/plugins/servlet/rebuildindex | url: /rest/prototype/latest/index/reindex | traceId: 2784176c19e14e59 | userName: admin
java.io.IOException: background merge hit exception: _nuh(4.4):C3668664/9192 into _p44
    at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1817)
    at com.atlassian.bonnie.InstrumentedIndexWriter.forceMergeDeletes(InstrumentedIndexWriter.java:99)
    at com.atlassian.confluence.internal.index.lucene.LuceneReIndexer.cleanUpIndex(LuceneReIndexer.java:256)
    ...
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:212)
    at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:174)
    at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
    at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:253)
    at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:215)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
    at com.atlassian.bonnie.InstrumentedIndexWriter.merge(InstrumentedIndexWriter.java:113)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
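To check whether a log file contains the failure shown above, a small scan along the following lines may help. This is only a sketch: the marker strings are copied from the log excerpts in this report, but treating them as a signature, the function name find_merge_oom, and the 50-line window are assumptions made for illustration, not part of Confluence.

```python
from typing import Iterable, List

# Substrings copied from the log excerpts in this report. Treating them as a
# "signature" of this bug is an assumption made for this sketch.
OOM_MARKER = "java.lang.OutOfMemoryError: Java heap space"
MERGE_MARKERS = ("forceMergeDeletes", "Lucene Merge Thread", "SplitIndexUpgradeTask")


def find_merge_oom(lines: Iterable[str], window: int = 50) -> List[int]:
    """Return 1-based numbers of OOM lines that have a merge-related
    marker on a different line within `window` lines before or after."""
    all_lines = list(lines)
    # Indexes of lines that mention the merge or the split index upgrade task.
    marker_idx = [
        i for i, line in enumerate(all_lines)
        if any(marker in line for marker in MERGE_MARKERS)
    ]
    hits = []
    for i, line in enumerate(all_lines):
        if OOM_MARKER not in line:
            continue
        # Keep the OOM line only if a merge marker appears nearby.
        if any(j != i and abs(i - j) <= window for j in marker_idx):
            hits.append(i + 1)
    return hits
```

The function accepts any iterable of lines, so an open file handle for atlassian-confluence-index.log can be passed directly. An empty result does not rule out the bug; it only means the marker strings were not found close together.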
Workaround
Rebuilding the index from scratch manually does not trigger this problem:
- Server: https://confluence.atlassian.com/confkb/how-to-rebuild-the-content-indexes-from-scratch-833941594.html
- Data Center: https://confluence.atlassian.com/confkb/how-to-manually-rebuild-content-index-from-scratch-on-confluence-data-center-without-any-downtime-1027124802.html
Subsequent reindexes from the UI should complete as expected after the rebuild from scratch.
Notes
Inspecting the heap dump shows thousands of small org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer$3 objects. All of them appear to reference .nvd files in the change index folder. Those files are very large, reaching gigabytes in size depending on the data set.
There appears to be an issue in the forceMergeDeletes logic that causes these large files to be created and loaded into memory after the split index upgrade task runs. The overall index folder also grows significantly after the upgrade, and grows further after a failed reindex from the UI. Rebuilding from scratch fixes this, and the index files return to their expected size.
Since a rebuild from scratch is required to avoid this issue, a good approach is to remove the index files before upgrading so that they are rebuilt in the new format.
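The size checks described in these notes can be sketched as a short script that totals the index folder and lists its largest .nvd files. The function name index_size_report and its parameters are hypothetical, and the location of the index folder is an assumption that must be supplied for each installation. The script only reads file sizes and does not modify the index.

```python
import os
from typing import List, Tuple


def index_size_report(
    index_dir: str, suffix: str = ".nvd", top: int = 5
) -> Tuple[int, List[Tuple[str, int]]]:
    """Walk index_dir and return (total size in bytes, largest files ending
    in suffix as (path, size) pairs, biggest first)."""
    total = 0
    matches = []
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file may disappear while the index is being written.
                continue
            total += size
            if name.endswith(suffix):
                matches.append((path, size))
    matches.sort(key=lambda item: item[1], reverse=True)
    return total, matches[:top]
```

Running it before the upgrade, after the upgrade, and after a rebuild from scratch gives three totals that can be compared directly. The path passed as index_dir is whatever index folder the instance uses; this report does not state it.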
- blocks
-
CONFSERVER-55267 Links to internal pages and attachments are changed to a self-referential link after changing the context path or protocol (http/https) in the base URL
- Closed