Issue Summary

      Certain update statements created by doaskquestion.action http threads, become stuck waiting to acquire a lock in the database. This in turns blocks the main index from updating.

      Steps to Reproduce

      Reported by a customer who was on:

      • Confluence 8.5.5
      • Confluence Questions 4.5.10
      • Postgres 14 (Aurora)

      Recent issues show that this could happen on other DBMSs.

      Local reproduction attempts were unsuccessful.

      Expected Results

      For the statements initiated by Confluence questions to conclude without blocking the main index.

      Actual Results

      The threads are of this nature:

      "https-jsse-nio2-8443-exec-5730 url: /confluence/cq/doaskquestion.action; user: qxz3775" #122971 daemon prio=5 os_prio=0 cpu=6834892.19ms elapsed=942096.96s tid=0x00007f2419e5b800 nid=0x58cd runnable  [0x00007efaf27dd000]
         java.lang.Thread.State: RUNNABLE 
      

      These include indexing calls like:

      at com.atlassian.confluence.plugins.questions.search.index.QuestionsReIndexer.indexInstantly(QuestionsReIndexer.java:257)
      	at com.atlassian.confluence.plugins.questions.search.index.QuestionsReIndexer.reIndexInstantly(QuestionsReIndexer.java:132) 

      Which lock the Caesium threads:

      "Caesium-1-3" #1152 daemon prio=1 os_prio=0 cpu=106577126.51ms elapsed=1083225.43s tid=0x00007efd1c19b000 nid=0x4b2c waiting on condition  [0x00007f210e4e3000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
      	- parking to wait for  <0x00007f293186a6c0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) 

      The statement in the database would look like this:

      blocked_pid | blocked_user | blocking_pid | blocking_user |                                           blocked_statement                                            |                                                                                                 current_statement_in_blocking_process
      -------------+--------------+--------------+---------------+--------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
             47718 | confluence   |         7541 | confluence    | update CONTENTPROPERTIES set PROPERTYNAME=$1, STRINGVAL=$2, LONGVAL=$3, DATEVAL=$4 where PROPERTYID=$5 | select contentpro0_.CONTENTID as contenti6_23_1_, contentpro0_.PROPERTYID as property1_23_1_, contentpro0_.PROPERTYID as property1_23_0_, contentpro0_.PROPERTYNAME as property2_23_0_, contentpro0_.STRINGVAL as stringva3_23_0_, contentpro0_.LONGVAL as longval4_23_0_, contentpro0_.DATEVAL as dateval5_23_0_ from CONTENTPROPERTIES contentpro0_ where contentpro0_.CONTENTID=$1
      (1 row) 

      And block statements by staying in idle in transaction:

              state1        | idle_in_transaction_duration |  xact_duration  | datid |   datname    |  pid  | leader_pid | usesysid |  usename   |    application_name    |  client_addr  | client_hostname | client_port |         backend_start
            |          xact_start           |          query_start          |         state_change          | wait_event_type |  wait_event   |        state        | backend_xid | backend_xmin |      query_id       |  query |  backend_type
      
       idle in transaction | 01:29:22.281623              | 01:39:42.149356 | 16402 | confluencedb |  7541 |            |    16400 | confluence | PostgreSQL JDBC Driver | 10.3.20.137   |                 |       38572 | 2024-04-22 08:41:49.3327
      13+00 | 2024-04-22 08:43:42.433186+00 | 2024-04-22 08:54:02.300857+00 | 2024-04-22 08:54:02.300919+00 | Client          | ClientRead    | idle in transaction |  1923853371 |   1923853371 |                     | select contentpro0_.CONTEN
      TID as contenti6_23_1_, contentpro0_.PROPERTYID as property1_23_1_, contentpro0_.PROPERTYID as property1_23_0_, contentpro0_.PROPERTYNAME as property2_23_0_, contentpro0_.STRINGVAL as stringva3_23_0_, contentpro0_.LONGVAL as longval4_23_
      0_, contentpro0_.DATEVAL as dateval5_23_0_ from CONTENTPROPERTIES contentpro0_ where contentpro0_.CONTENTID=$1
      
                                               | client backend
       active              | 01:29:22.29322               | 01:31:50.256214 | 16402 | confluencedb | 47718 |            |    16400 | confluence | PostgreSQL JDBC Driver | 10.3.20.137   |                 |       57366 | 2024-04-22 08:50:27.6099
      48+00 | 2024-04-22 08:51:34.326328+00 | 2024-04-22 08:54:02.289322+00 | 2024-04-22 08:54:02.289322+00 | Lock            | transactionid | active              |  1924014348 |   1923853371 | 9118171607884901042 | update CONTENTPROPERTIES s
      et PROPERTYNAME=$1, STRINGVAL=$2, LONGVAL=$3, DATEVAL=$4 where PROPERTYID=$5
      
                                               | client backend

      Workaround

      A workaround would be to set a timeout for idle in transaction statements so that they are terminated and the blocks released. This can be done by leveraging the idle_in_transaction_session_timeout database parameter to a value that would minimise the possibility of collateral.

      This is not a clean solution.

            [CONFSERVER-96816] Confluence Questions locks the main index

            There are no comments yet on this issue.

              971c305d4b2e Garvit Sharma
              23540fb20e73 Chris Psonis
              Affected customers:
              1 This affects my team
              Watchers:
              9 Start watching this issue

                Created:
                Updated: