Confluence Data Center / CONFSERVER-99523

Attempting to create a Personal space while incremental/background indexing is already flushing/committing data can lead to a slowdown in indexing rates

      Issue Summary

      Confluence's incremental/background indexing can occasionally come under processing load while attempting to flush newly indexed data. This is visible as slowly increasing Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue. This is normal, and Confluence generally overcomes it on its own (given enough time and a sufficient dip in content-creation/modification rates).

      However, if a user attempts to create a Personal space in that same timeframe, the sequence of events executed by the /create-personal-space action also includes an immediate index-queue flush. That flush must acquire a lock on the desired index before data is written/committed to it. With incremental/background indexing already in progress (and occasionally stressed with pending objects to index), the /create-personal-space action can end up waiting indefinitely for the lock it needs to execute its own one-time index-queue flush.

      The situation worsens if a few more users attempt to create their own Personal spaces in that same timeframe. Each /create-personal-space action then attempts its own one-time index-queue flush, going through the same motions of waiting to lock the desired index while other threads are still flushing/committing data to that same index.
      This sub-optimal lock-acquisition behavior brings down the overall indexing rate, which in turn shows up as:

      • constantly high (hundreds of thousands) Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue
      • inaccurate search results (newly created/modified Confluence content is not yet searchable)
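
      The following is a minimal, self-contained sketch of the contention pattern described above. It is an illustrative model only (the class, thread, and method names are hypothetical, not Confluence's actual internals): a background flush holds a shared ReentrantLock while committing a large backlog, and each request-triggered flush parks in an unbounded lock() call behind it.

      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.locks.ReentrantLock;

      // Illustrative model only: a single lock guarding "the index", a slow
      // background commit holding it, and per-request flushes parking behind it.
      public class IndexLockContentionSketch {
          private static final ReentrantLock INDEX_LOCK = new ReentrantLock();

          // Simulates the incremental/background indexer committing a large batch.
          static void backgroundFlush() {
              INDEX_LOCK.lock();
              try {
                  TimeUnit.SECONDS.sleep(120); // long commit of a big backlog
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
              } finally {
                  INDEX_LOCK.unlock();
              }
          }

          // Simulates the one-time flush triggered by /create-personal-space:
          // an unbounded lock() call, so the HTTP thread parks until the
          // background commit finishes (the WAITING (parking) state in the dumps).
          static void createPersonalSpaceFlush(String user) {
              INDEX_LOCK.lock();
              try {
                  System.out.println(user + " flushed its queued entries");
              } finally {
                  INDEX_LOCK.unlock();
              }
          }

          public static void main(String[] args) throws InterruptedException {
              new Thread(IndexLockContentionSketch::backgroundFlush, "background-indexer").start();
              TimeUnit.MILLISECONDS.sleep(200); // let the background commit grab the lock first
              for (int i = 1; i <= 3; i++) {
                  String user = "http-exec-" + i;
                  new Thread(() -> createPersonalSpaceFlush(user), user).start();
              }
              // A thread dump taken now shows the http-exec-* threads parked on the
              // ReentrantLock, mirroring the stack traces in "Actual Results" below.
          }
      }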

       

      This is reproducible on Data Center: yes

      Steps to Reproduce

      (Similar to CONFSERVER-90319)

      1. Set up a Confluence instance and push a lot of content (create new pages, comments, etc.) using any kind of automation (plugins or REST API scripts; a sketch follows this list).
      2. Ensure that the Queued for processing numbers under Administration > Content Indexing > Content queue OR Change queue are in the thousands.
      3. Attempt to create a new Personal space via Spaces > Create space.
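
      The following is a rough sketch of the kind of load-generation script mentioned in step 1, using the standard Confluence REST content API (POST /rest/api/content). The base URL, space key, credentials, and page count are placeholders for a test instance; tune the count until the queues reach the thousands.

      import java.net.URI;
      import java.net.http.HttpClient;
      import java.net.http.HttpRequest;
      import java.net.http.HttpResponse;
      import java.util.Base64;

      // Creates many small pages so the Content/Change queues build up a backlog.
      public class IndexBacklogLoadGenerator {
          private static final String BASE_URL = "https://confluence.example.com";
          private static final String SPACE_KEY = "TEST";
          private static final String AUTH = "Basic "
                  + Base64.getEncoder().encodeToString("admin:admin".getBytes());

          public static void main(String[] args) throws Exception {
              HttpClient client = HttpClient.newHttpClient();
              for (int i = 0; i < 5000; i++) {
                  String body = "{"
                          + "\"type\":\"page\","
                          + "\"title\":\"Load test page " + i + "\","
                          + "\"space\":{\"key\":\"" + SPACE_KEY + "\"},"
                          + "\"body\":{\"storage\":{\"value\":\"<p>content " + i + "</p>\","
                          + "\"representation\":\"storage\"}}}";
                  HttpRequest request = HttpRequest.newBuilder()
                          .uri(URI.create(BASE_URL + "/rest/api/content"))
                          .header("Authorization", AUTH)
                          .header("Content-Type", "application/json")
                          .POST(HttpRequest.BodyPublishers.ofString(body))
                          .build();
                  HttpResponse<String> response =
                          client.send(request, HttpResponse.BodyHandlers.ofString());
                  if (response.statusCode() != 200) {
                      System.err.println("Create failed with HTTP " + response.statusCode());
                  }
              }
          }
      }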

      Expected Results

      Provided that a Personal space does not already exist for the user account in question, a new Personal space should be created within a few seconds.
      Once created, content posted to that space should become searchable within a reasonable time (generally a few minutes), once the existing backlog of pending items has been indexed.

      Actual Results

      The Personal space creation action executes an immediate one-time index-queue flush, which in turn waits to acquire a lock on the index.
      Add a few more concurrent Personal space creation actions, along with a sufficiently large and active indexing backlog, and the incremental/background indexing pace slows to a crawl as threads repeatedly wait to acquire and re-acquire the index lock.

      The runtime of the HTTP thread that triggered the Personal space creation may eventually exceed Tomcat's StuckThreadDetectionValve default threshold of 60 seconds.
      That will show up in the instance's <confluence-install-dir>/logs/catalina.* log:

      thread "http-nio-8090-exec-10" (for /rest/create-dialog/1.0/space-blueprint/create-personal-space) running for more than 60 seconds
      DD-Mmm-YYYY HH:MM:SS.mmm WARNING [Catalina-utility-2] org.apache.catalina.valves.StuckThreadDetectionValve.notifyStuckThreadDetected Thread [http-nio-8090-exec-10 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user] (id=[5700]) has been active for [69,400] milliseconds (since [MM/DD/YY HH:MM AM/PM]) to serve the same request for [https://<BaseURLWithContext>/rest/create-dialog/1.0/space-blueprint/create-personal-space] and may be stuck (configured threshold for this StuckThreadDetectionValve is [60] seconds). There is/are [1] thread(s) in total that are monitored by this Valve and may be stuck.
      	java.lang.Throwable
      		at sun.misc.Unsafe.park(Native Method)
      		at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      		at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      		at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      		at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      		at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:68)
      		at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:62)
      		at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.tryLockSingleIndex(LuceneIncrementalIndexManager.java:245)
      		at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.flushQueue(LuceneIncrementalIndexManager.java:123)
      

      Note: in extreme cases, this thread (and similar threads) may only terminate successfully multiple hours later, at which point corresponding notifyStuckThreadCompleted entries will also be recorded.

      Thread dumps will show other such operations waiting on the thread that has successfully acquired the lock and is currently flushing/committing data to the index:

      single thread dump from one instant in time, showing thread "http-nio-8090-exec-10" waiting on thread "http-nio-8090-exec-55" (runnable) to acquire a lock
      "http-nio-8090-exec-10 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user" daemon prio=5 tid=0x000000000000d5f4 nid=0 waiting on condition 
         
         java.lang.Thread.State: WAITING (parking)
      	
      	at sun.misc.Unsafe.park(Native Method)
      	
      	- parking to wait for <0x00000000752228d6> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      	 owned by http-nio-8090-exec-55 url: /rest/create-dialog/1.0/space-blueprint/create-personal-space; user: test_user2 id=0x0000000000001650
      	
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
      	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
      	
      	at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:68)
      	at com.atlassian.confluence.internal.index.lucene.DefaultIndexLockService.lock(DefaultIndexLockService.java:62)
      	at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.tryLockSingleIndex(LuceneIncrementalIndexManager.java:245)
      	at com.atlassian.confluence.search.lucene.LuceneIncrementalIndexManager.flushQueue(LuceneIncrementalIndexManager.java:123)
      

      Workaround

      Set the following system property to specify the maximum amount of time (in minutes) that a Confluence thread may wait in order to acquire a lock on a desired index:

      -Dconfluence.index.manager.lock.timeout=1
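
      One common way to apply this on a Linux installation is to add the property to CATALINA_OPTS in <confluence-install-dir>/bin/setenv.sh and restart Confluence (assuming a standard installer/archive deployment; adjust accordingly for Windows or service-managed setups):

      # <confluence-install-dir>/bin/setenv.sh
      CATALINA_OPTS="-Dconfluence.index.manager.lock.timeout=1 ${CATALINA_OPTS}"
      export CATALINA_OPTS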
      

      This permits contending threads to time out a lock-acquisition attempt, allowing normal application functioning to continue.
      Newly created space data will automatically be picked up by the scheduled (Caesium-based) incremental/background indexer threads in due time, so there is no need to interrupt the normal processing of the index queues and force an immediate, out-of-sequence index flush.
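
      Conceptually, the timeout turns the unbounded lock wait shown in the earlier sketch into a bounded one. Again, this is a simplified illustration under the same assumptions, not Confluence's actual code:

      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.locks.ReentrantLock;

      // Simplified illustration of a bounded lock wait: if the index lock cannot
      // be acquired within the configured timeout, the request-triggered flush is
      // skipped and the scheduled background indexer picks the content up later.
      public class BoundedFlushSketch {
          static boolean flushWithTimeout(ReentrantLock indexLock, long timeoutMinutes)
                  throws InterruptedException {
              if (!indexLock.tryLock(timeoutMinutes, TimeUnit.MINUTES)) {
                  return false; // give up; leave queued items for the scheduled indexer
              }
              try {
                  // ... write/commit queued items to the index ...
                  return true;
              } finally {
                  indexLock.unlock();
              }
          }
      }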


            A fix for this issue is available in Confluence Server and Data Center 8.5.21. Upgrade now or check out the Release Notes to see what other issues are resolved.

