- Bug
- Resolution: Unresolved
- Low
- None
- 9.2.9
- None
- Severity 3 - Minor
-
Issue Summary
In the Confluence 9.1 release, we introduced an experimental feature to significantly speed up the indexing process for an instance. As part of this change, the following dark features were introduced:
- confluence.change.incremental.index.improvement
- confluence.content.incremental.index.improvement
- confluence.indexing.improvements
After enabling these dark features, Confluence indexing performance improved significantly. Previously, a typical reindexing operation would take 11–12 hours, but with the dark features enabled, the same process now completes within 4–5 hours.
However, with these dark features enabled, the following issues were observed:
- User index data is missing.
- For some users, the following error appears in the index log:
2025-09-30 01:36:35,930 ERROR [Indexer: 46] [confluence.internal.index.ConcurrentBatchIndexer] lambda$submitBatches$2 An error occurred while re-indexing a batch. Only the particular batch which had an error occur will not be re-indexed correctly.
java.lang.IllegalStateException: Duplicate key 175782124 (attempted merging values userinfo: <UserName> v.5 (175782124) and userinfo: <UserName> v.5 (175782124))
    at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:135)
    at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:182)
    at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
    at com.atlassian.confluence.impl.search.v2.extractor.ContentModifiersBulkExtractor.mapById(ContentModifiersBulkExtractor.java:92)
    at com.atlassian.confluence.impl.search.v2.extractor.ContentModifiersBulkExtractor.extractAll(ContentModifiersBulkExtractor.java:52)
    at com.atlassian.confluence.internal.index.BulkFieldPrefetcher.lambda$prefetch$0(BulkFieldPrefetcher.java:79)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
    at com.atlassian.confluence.internal.index.BulkFieldPrefetcher.prefetch(BulkFieldPrefetcher.java:77)
    at com.atlassian.confluence.internal.index.BulkFieldPrefetcher.prefetch(BulkFieldPrefetcher.java:64)
    at com.atlassian.confluence.internal.index.BulkFieldPrefetcher.createPrefetchedDocumentBuilder(BulkFieldPrefetcher.java:52)
    at com.atlassian.confluence.internal.index.DefaultBatchIndexer.doIndex(DefaultBatchIndexer.java:161)
    at com.atlassian.confluence.internal.index.DefaultBatchIndexer.lambda$index$0(DefaultBatchIndexer.java:90)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
    at com.atlassian.confluence.internal.index.DefaultBatchIndexer.index(DefaultBatchIndexer.java:83)
    at com.atlassian.confluence.internal.index.ConcurrentBatchIndexer.lambda$submitBatches$2(ConcurrentBatchIndexer.java:124)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at com.atlassian.confluence.impl.tenant.ThreadLocalTenantGate.lambda$wrap$0(ThreadLocalTenantGate.java:27)
    at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContextInternal(VCacheRequestContextManager.java:84)
    at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContext(VCacheRequestContextManager.java:68)
    at com.atlassian.confluence.vcache.VCacheRequestContextOperations.lambda$withRequestContext$1(VCacheRequestContextOperations.java:59)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:842)
Background Information
- With the dark features enabled, Confluence reindexing identifies the indexable user records using the following SQL:
select personalin0_.CONTENTID as col_0_0_ from CONTENT personalin0_ where personalin0_.CONTENTTYPE='USERINFO' and (personalin0_.PREVVER is null);
- Next, it executes the following SQL based on the result of the above query:
select personalin0_.CONTENTID as contentid1_18_0_, bodyconten1_.BODYCONTENTID as bodycontentid1_13_1_, personalin0_.HIBERNATEVERSION as hibernateversion2_18_0_, personalin0_.TITLE as title4_18_0_, personalin0_.LOWERTITLE as lowertitle5_18_0_, personalin0_.VERSION as version6_18_0_, personalin0_.CREATOR as creator7_18_0_, personalin0_.CREATIONDATE as creationdate8_18_0_, personalin0_.LASTMODIFIER as lastmodifier9_18_0_, personalin0_.LASTMODDATE as lastmoddate10_18_0_, personalin0_.VERSIONCOMMENT as versioncomment11_18_0_, personalin0_.PREVVER as prevver12_18_0_, personalin0_.CONTENT_STATUS as content_status13_18_0_, personalin0_.PAGEID as pageid14_18_0_, personalin0_.USERNAME as username26_18_0_, bodyconten1_.BODY as body2_13_1_, bodyconten1_.CONTENTID as contentid3_13_1_, bodyconten1_.BODYTYPEID as bodytypeid4_13_1_, bodyconten1_.CONTENTID as contentid3_13_0__, bodyconten1_.BODYCONTENTID as bodycontentid1_13_0__ from CONTENT personalin0_, BODYCONTENT bodyconten1_ where personalin0_.CONTENTTYPE='USERINFO' and personalin0_.CONTENTID=bodyconten1_.CONTENTID and (personalin0_.CONTENTID=578213249 or personalin0_.CONTENTID=49349 or personalin0_.CONTENTID=49350);
- In the above SQL, the USERINFO CONTENTID is inner-joined to the BODYCONTENT CONTENTID. Any user whose record is missing from the BODYCONTENT table is therefore silently skipped and never indexed.
- However, if the dark features are removed and reindexing is triggered again, the following SQL runs without validating against the BODYCONTENT table, and every user record gets indexed. Below is the SQL that is executed when the dark features are disabled:
select personalin0_.CONTENTID as contenti1_18_, personalin0_.HIBERNATEVERSION as hibernat2_18_, personalin0_.TITLE as title4_18_, personalin0_.LOWERTITLE as lowertit5_18_, personalin0_.VERSION as version6_18_, personalin0_.CREATOR as creator7_18_, personalin0_.CREATIONDATE as creation8_18_, personalin0_.LASTMODIFIER as lastmodi9_18_, personalin0_.LASTMODDATE as lastmod10_18_, personalin0_.VERSIONCOMMENT as version11_18_, personalin0_.PREVVER as prevver12_18_, personalin0_.CONTENT_STATUS as content13_18_, personalin0_.PAGEID as pageid14_18_, personalin0_.USERNAME as usernam26_18_ from CONTENT personalin0_ where personalin0_.CONTENTTYPE='USERINFO' and (personalin0_.CONTENTID=578213249 )
Steps to Reproduce
- 1st Issue (user index data is missing)
- Add the following dark features (for example, via the dark features admin page) and reindex from the UI:
confluence.change.incremental.index.improvement
confluence.content.incremental.index.improvement
confluence.indexing.improvements
- Search for the affected users in the People directory or any other search panel; they will not be found.
- 2nd Issue (duplicate user key on userinfo records during reindexing, causing the complete batch to be rolled back)
- Add the following dark features and reindex from the UI:
confluence.change.incremental.index.improvement
confluence.content.incremental.index.improvement
confluence.indexing.improvements
- Take the USERINFO CONTENTID of one user from the CONTENT table.
- Insert a second row with the same CONTENTID into the BODYCONTENT table so that the record is duplicated (a sample INSERT is sketched after the stack trace below).
- Run the reindex from the UI; the following error is logged and the entire batch is rolled back:
2025-09-30 01:36:35,930 ERROR [Indexer: 46] [confluence.internal.index.ConcurrentBatchIndexer] lambda$submitBatches$2 An error occurred while re-indexing a batch. Only the particular batch which had an error occur will not be re-indexed correctly.
java.lang.IllegalStateException: Duplicate key 175782124 (attempted merging values userinfo: <UserName> v.5 (175782124) and userinfo: <UserName> v.5 (175782124))
(The full stack trace is identical to the one shown in the Issue Summary above.)
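For the duplication step above, a duplicate row can be created with an insert like the following. This is an illustrative sketch only, assuming a PostgreSQL backend and intended for test instances; 9999999 is a placeholder BODYCONTENTID that must not collide with an existing row, and 175782124 is a placeholder for the CONTENTID chosen in the previous step.
-- Sketch (test instances only): duplicate the BODYCONTENT row of one USERINFO record.
-- 9999999 is an assumed-unused BODYCONTENTID; 175782124 is the chosen CONTENTID.
INSERT INTO bodycontent (bodycontentid, body, contentid, bodytypeid)
SELECT 9999999, body, contentid, bodytypeid
FROM bodycontent
WHERE contentid = 175782124;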
Expected Results
With the dark features enabled, Confluence reindexing should not exclude any users from the index. Additionally, a duplicate userinfo record should not cause the entire batch to be rolled back.
Actual Results
When the dark features are enabled, Confluence reindexing excludes a user's data from the index if their record is missing from the BODYCONTENT table, because userinfo records are validated against BODYCONTENT by the SQL query shown above.
However, without the dark features, userinfo records from the CONTENT table are not validated against the BODYCONTENT table, and all user records are indexed.
Workaround
Option 1 (recommended):
Remove the dark features and run reindexing as a normal operation. This regenerates all missing user index data. Once indexing is complete, re-enable the dark features so that subsequent indexing runs benefit from the improvements.
Option 2:
User details are currently missing because of the SQL validation described above; we are checking with the development team for more details.
To identify potentially affected users, run the following SQL query. Not every user it returns is necessarily impacted, but every impacted user will appear in this list:
SELECT contentid FROM content WHERE contentid NOT IN (SELECT contentid FROM bodycontent) AND contenttype = 'USERINFO';
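If it helps to see which users these CONTENTIDs belong to, the same check can also return the USERNAME column of the CONTENT table (visible in the Hibernate query in the Background section). This variant is a sketch and assumes that column is populated on your version:
-- Sketch: list potentially affected users together with their usernames.
SELECT contentid, username
FROM content
WHERE contenttype = 'USERINFO'
  AND contentid NOT IN (SELECT contentid FROM bodycontent);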
To fix the issue while keeping the dark features enabled, the missing records need to be added to the BODYCONTENT table. This can be done manually with an insert statement like the one below, executed once for each CONTENTID returned by the query above. In this SQL, the first value (2000000) must be a BODYCONTENTID that is not already in use, and 196681 is a placeholder for a CONTENTID obtained from the query above.
INSERT INTO public.bodycontent (bodycontentid, body, contentid, bodytypeid) VALUES (2000000, '', 196681, 0);
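If many users are affected, the per-row inserts can also be generated in a single statement. The following is a sketch only, assuming a PostgreSQL backend and that BODYCONTENTID values from 2000000 upward are unused; manually chosen IDs bypass Confluence's own ID allocation, so back up the database and verify the ID range before running it:
-- Sketch (PostgreSQL): insert an empty BODYCONTENT row for every USERINFO record that lacks one.
-- 2000000 is an assumed-free starting BODYCONTENTID; verify it is unused first.
INSERT INTO bodycontent (bodycontentid, body, contentid, bodytypeid)
SELECT 2000000 + ROW_NUMBER() OVER (ORDER BY c.contentid), '', c.contentid, 0
FROM content c
WHERE c.contenttype = 'USERINFO'
  AND c.contentid NOT IN (SELECT contentid FROM bodycontent);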
After adding the records to the BODYCONTENT table, run the indexing from the UI with the dark features enabled; this should resolve the issue.
Additionally, we identified a duplicate user record error in the logs. This error is caused by duplicate records in the bodycontent table. You can use the following SQL to get the list of duplicate records:
SELECT contentid, COUNT(*) FROM bodycontent GROUP BY contentid HAVING COUNT(*) > 1;
To fix these duplicates, remove the extra rows from the BODYCONTENT table so that only one row remains per CONTENTID (a sample cleanup statement is sketched below). After that, run site reindexing from the UI with the dark features enabled.
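As a sketch of that cleanup, assuming a PostgreSQL backend and that the duplicate rows are otherwise identical (verify this, and back up the table, before deleting):
-- Sketch (PostgreSQL): for each duplicated CONTENTID, keep the row with the
-- lowest BODYCONTENTID and delete the rest.
DELETE FROM bodycontent b
USING bodycontent keep
WHERE b.contentid = keep.contentid
  AND b.bodycontentid > keep.bodycontentid;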