Confluence Data Center / CONFSERVER-99523

Attempting to create a Personal space while incremental/background indexing is already flushing/committing data can lead to a slowdown in indexing rates

      Issue Summary

      Confluence's incremental/background indexing can occasionally come under processing load while attempting to flush newly indexed data. This is visible as slowly increasing Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue. This is normal, and Confluence generally overcomes it on its own (given enough time and a sufficient dip in content-creation/modification rates).

      However, if a user attempts to create a Personal space in that same timeframe, the sequence of events executed by the /create-personal-space action also includes an immediate index-queue flush. That flush must acquire a lock on the desired index before data is written/committed to it. With incremental/background indexing already in progress (and occasionally stressed with pending objects to index), the /create-personal-space action can end up waiting indefinitely for the lock it needs to execute its own one-time index-queue flush.

      The situation worsens if a few more users attempt to create their own Personal spaces in that same timeframe. Each /create-personal-space action then attempts its own one-time index-queue flush, going through the same motions of waiting to lock the desired index while other threads are still flushing/committing data to that same index.
      This sub-optimal lock-acquisition behavior brings down the overall indexing rate, which in turn shows up as:

      • constantly high (hundreds of thousands) Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue
      • inaccurate search results (newly created/modified Confluence content is not yet searchable)
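
      The following is a minimal, self-contained sketch of the contention pattern described above. It is an illustrative model only (the class, thread, and method names are hypothetical, not Confluence's actual internals): a background flush holds a shared ReentrantLock while committing a large backlog, and each request-triggered flush parks in an unbounded lock() call behind it.

      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.locks.ReentrantLock;

      // Illustrative model only: a single lock guarding "the index", a slow
      // background commit holding it, and per-request flushes parking behind it.
      public class IndexLockContentionSketch {
          private static final ReentrantLock INDEX_LOCK = new ReentrantLock();

          // Simulates the incremental/background indexer committing a large batch.
          static void backgroundFlush() {
              INDEX_LOCK.lock();
              try {
                  TimeUnit.SECONDS.sleep(120); // long commit of a big backlog
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
              } finally {
                  INDEX_LOCK.unlock();
              }
          }

          // Simulates the one-time flush triggered by /create-personal-space:
          // an unbounded lock() call, so the HTTP thread parks until the
          // background commit finishes (the WAITING (parking) state in the dumps).
          static void createPersonalSpaceFlush(String user) {
              INDEX_LOCK.lock();
              try {
                  System.out.println(user + " flushed its queued entries");
              } finally {
                  INDEX_LOCK.unlock();
              }
          }

          public static void main(String[] args) throws InterruptedException {
              new Thread(IndexLockContentionSketch::backgroundFlush, "background-indexer").start();
              TimeUnit.MILLISECONDS.sleep(200); // let the background commit grab the lock first
              for (int i = 1; i <= 3; i++) {
                  String user = "http-exec-" + i;
                  new Thread(() -> createPersonalSpaceFlush(user), user).start();
              }
              // A thread dump taken now shows the http-exec-* threads parked on the
              // ReentrantLock, mirroring the stack traces in "Actual Results" below.
          }
      }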

       

      This is reproducible on Data Center: yes

      Steps to Reproduce

      (Similar to CONFSERVER-90319)

      1. Set up a Confluence instance and push a lot of content (create new pages, comments, etc.) using any kind of automation (plugins or REST API scripts; a sketch follows this list).
      2. Ensure that the Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue are in the thousands.
      3. Attempt to create a new Personal space via Spaces > Create space.
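
      The following is a rough sketch of the kind of load-generation script mentioned in step 1, using the standard Confluence REST content API (POST /rest/api/content). The base URL, space key, credentials, and page count are placeholders for a test instance; tune the count until the queues reach the thousands.

      import java.net.URI;
      import java.net.http.HttpClient;
      import java.net.http.HttpRequest;
      import java.net.http.HttpResponse;
      import java.util.Base64;

      // Creates many small pages so the Content/Change queues build up a backlog.
      public class IndexBacklogLoadGenerator {
          private static final String BASE_URL = "https://confluence.example.com";
          private static final String SPACE_KEY = "TEST";
          private static final String AUTH = "Basic "
                  + Base64.getEncoder().encodeToString("admin:admin".getBytes());

          public static void main(String[] args) throws Exception {
              HttpClient client = HttpClient.newHttpClient();
              for (int i = 0; i < 5000; i++) {
                  String body = "{"
                          + "\"type\":\"page\","
                          + "\"title\":\"Load test page " + i + "\","
                          + "\"space\":{\"key\":\"" + SPACE_KEY + "\"},"
                          + "\"body\":{\"storage\":{\"value\":\"<p>content " + i + "</p>\","
                          + "\"representation\":\"storage\"}}}";
                  HttpRequest request = HttpRequest.newBuilder()
                          .uri(URI.create(BASE_URL + "/rest/api/content"))
                          .header("Authorization", AUTH)
                          .header("Content-Type", "application/json")
                          .POST(HttpRequest.BodyPublishers.ofString(body))
                          .build();
                  HttpResponse<String> response =
                          client.send(request, HttpResponse.BodyHandlers.ofString());
                  if (response.statusCode() != 200) {
                      System.err.println("Create failed with HTTP " + response.statusCode());
                  }
              }
          }
      }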

      Expected Results

      Provided that a Personal space does not already exist for the user account in question, a new Personal space should be created within a few seconds.
      Once created, content posted to that space should become searchable within a reasonable time (generally a few minutes), once the existing backlog of pending items has been indexed.

      Actual Results

      The Personal space creation action executes an immediate one-time index-queue flush, which in turn waits to acquire a lock on the index.
      Add a few more concurrent Personal space creation actions, along with a sufficiently large and active indexing backlog, and the incremental/background indexing pace slows to a crawl as threads repeatedly wait to acquire and re-acquire the index lock.

      The runtime of the HTTP thread that triggered the Personal space creation may eventually exceed Tomcat's StuckThreadDetectionValve default threshold of 60 seconds.
      That will show up in the instance's <confluence-install-dir>/logs/catalina.* log:

      thread "http-nio-8090-exec-10" (for /rest/create-dialog/1.0/space-blueprint/create-personal-space) running for more than 60 seconds
      DD-Mmm-YYYY HH:MM:SS.mmm WARNING [Catalina-utility-2] org.apache.catalina.valves.StuckThreadDetectionValve.notifyStuckThreadDetected Thread [http-nio-8090-exec-10 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user] (id=[5700]) has been active for [69,400] milliseconds (since [MM/DD/YY HH:MM AM/PM]) to serve the same request for [https://<BaseURLWithContext>/rest/create-dialog/1.0/space-blueprint/create-personal-space] and may be stuck (configured threshold for this StuckThreadDetectionValve is [60] seconds). There is/are [1] thread(s) in total that are monitored by this Valve and may be stuck.
      	java.lang.Throwable
      		at sun.misc.Unsafe.park(Native Method)
      		at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      		at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      		at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      		at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:68)
      		at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:62)
      		at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.tryLockSingleIndex(LuceneIncrementalIndexManager.java:245)
      		at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.flushQueue(LuceneIncrementalIndexManager.java:123)
      

      Note: in extreme cases, this thread (and similar threads) may only terminate successfully multiple hours later, at which point corresponding notifyStuckThreadCompleted entries will also be recorded.

      Thread dumps will show other such operations waiting on the thread that has successfully acquired the lock and is currently flushing/committing data to the index:

      single thread dump from one instant in time, showing thread "http-nio-8090-exec-10" waiting on thread "http-nio-8090-exec-55" (runnable) to acquire a lock
      "http-nio-8090-exec-10 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user" daemon prio=5 tid=0x000000000000d5f4 nid=0 waiting on condition 
         
         java.lang.Thread.State: WAITING (parking)
      	
      	at sun.misc.Unsafe.park(Native Method)
      	
      	- parking to wait for <0x00000000752228d6> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      	 owned by http-nio-8090-exec-55 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user2 id=0x0000000000001650
      	
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      	
      	at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:68)
      	at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:62)
      	at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.tryLockSingleIndex(LuceneIncrementalIndexManager.java:245)
      	at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.flushQueue(LuceneIncrementalIndexManager.java:123)
      

      Workaround

      Set the following system property to specify the maximum amount of time (in minutes) that a Confluence thread may wait in order to acquire a lock on a desired index:

      -Dconfluence.index.manager.lock.timeout=1
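
      One common way to apply this on a Linux installation is to add the property to CATALINA_OPTS in <confluence-install-dir>/bin/setenv.sh and restart Confluence (assuming a standard installer/archive deployment; adjust accordingly for Windows or service-managed setups):

      # <confluence-install-dir>/bin/setenv.sh
      CATALINA_OPTS="-Dconfluence.index.manager.lock.timeout=1 ${CATALINA_OPTS}"
      export CATALINA_OPTS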
      

      This permits contending threads to time out a lock-acquisition attempt, allowing normal application functioning to continue.
      Newly created space data will automatically be picked up by the scheduled (Caesium-based) incremental/background indexer threads in due time, so there is no need to interrupt the normal processing of the index queues and force an immediate, out-of-sequence index flush.
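
      Conceptually, the timeout turns the unbounded lock wait shown in the earlier sketch into a bounded one. Again, this is a simplified illustration under the same assumptions, not Confluence's actual code:

      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.locks.ReentrantLock;

      // Simplified illustration of a bounded lock wait: if the index lock cannot
      // be acquired within the configured timeout, the request-triggered flush is
      // skipped and the scheduled background indexer picks the content up later.
      public class BoundedFlushSketch {
          static boolean flushWithTimeout(ReentrantLock indexLock, long timeoutMinutes)
                  throws InterruptedException {
              if (!indexLock.tryLock(timeoutMinutes, TimeUnit.MINUTES)) {
                  return false; // give up; leave queued items for the scheduled indexer
              }
              try {
                  // ... write/commit queued items to the index ...
                  return true;
              } finally {
                  indexLock.unlock();
              }
          }
      }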


            A fix for this issue is available in Confluence Server and Data Center 8.5.21. Upgrade now or check out the Release Notes to see what other issues are resolved.

