Uploaded image for project: 'Jira Software Data Center'
  1. Jira Software Data Center
  2. JSWSERVER-16542

During rebalancing ranking operation fails to save new rank due to slow DB operation

      Not a Jira bug

      This is not a Jira core bug, ticket created to track the problem and for transparency.
      At this stage, we believe the problem is related to MySQL: for some reason MySQL chooses the wrong execution plan and doesn't use an index in optimal way. That makes SQL execution slow. That being said, we will keep the ticket open for some time to monitor the problem.

      Summary

      As part of Lexorank rebalancing, each rank value needs to be updated in AO_60DB71_LEXORANK table. At the same time, individual ranking operation has 1500 ms timeout for rank lock.
      Due to slow DB operation, issue ranking operation fails to save new rank since rank lock expires. That makes Lexorank rebalancing process stall, usually you will see no progress in UI. Please note that situation is different to JSWSERVER-15703

      Environment

      • Large JIRA instance: 1M+ issue
        • more specific large AO_60DB71_LEXORANK table
      • The issue was initially reported with Jira connected to MySQL databases, but there were other database engines affected as well, such as Oracle and MS SQL Server.

      Steps to Reproduce

      1. Run Lexorank rebalancing
      2. Monitor progress

      Expected Results

      • Lexorank rebalancing runs continuously and finishes successful.
      • SQL executions is fast:
        SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,RANK,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? AND BUCKET = ? ORDER BY RANK ASC LIMIT 1; 
        SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,RANK,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? AND BUCKET = ? ORDER BY RANK DESC LIMIT 1;
        

      Actual Results

      • Lexorank rebalancing stalls
      • Issue ranking operation fails to save new rank since rank lock expires
      • SQL executions is slow

      Notes

      • To enable logging set DEBUG for com.atlassian.greenhopper.service.lexorank and com.atlassian.greenhopper.manager.lexorank
      • Example of logging:
        2018-02-27 10:21:58,046 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Acquired lock on rows
        2018-02-27 10:21:58,046 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.manager.lexorank.LexoRankDaoImpl]  LexoRank Query : SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,RANK,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? AND BUCKET = ? ORDER BY RANK ASC LIMIT 1; params[15530, 2]
        2018-02-27 10:21:59,190 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.manager.lexorank.LexoRankDaoImpl]  LexoRank Query : SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,RANK,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? AND BUCKET = ? ORDER BY RANK DESC LIMIT 1; params[15530, 0]
        2018-02-27 10:22:00,648 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Balancing rank row [type=ISSUE_RANK_ROW, oldRank=2|hx8yj3:, newRank=0|i1sicv:]
        2018-02-27 10:22:00,650 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Failed to save rank row, retry
        2018-02-27 10:22:00,650 lexorank-executor-thread-0 DEBUG ServiceRunner     [c.a.g.s.lexorank.balance.LexoRankBalanceOperation] Releasing lock
        
      • explain SELECT
        explain SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,RANK,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = 11500 AND BUCKET = 2 ORDER BY RANK ASC LIMIT 1;
        
        speed id select_type table type possible_keys key key_len ref
        slow 1 SIMPLE AO_60DB71_LEXORANK ref index_ao_60db71_lex1569533973,index_ao_60db71_lex1694305086 index_ao_60db71_lex1569533973 8 const
        fast 1 SIMPLE AO_60DB71_LEXORANK range index_ao_60db71_lex1694305086,index_ao_60db71_lex1569533973 index_ao_60db71_lex1569533973 775 NULL
      • Note that in case of slow select key_len is 8.
      • To help investigate problem like this in the future, please see KB based on this bug: Issue ranking operation fails - JIRA Software cannot execute the rank operation at this time

      Workaround

      • Resolving the issue involves making MySQL select the right index, to do that, please recreate index (there will be small downtime during that operation):
        DROP INDEX index_ao_60db71_lex1569533973 ON AO_60DB71_LEXORANK;
        CREATE INDEX index_ao_60db71_lex1569533973 on AO_60DB71_LEXORANK (FIELD_ID,RANK);
        
        • If you can allow longer downtime, you can run at mysql: "ANALYZE TABLE AO_60DB71_LEXORANK" (During the analysis, the table is locked with a read lock for InnoDB and MyISAM. see analyze-table)

            [JSWSERVER-16542] During rebalancing ranking operation fails to save new rank due to slow DB operation

            Beata Kupiec made changes -
            Current Status Original: *Atlassian Update - 28 February 2025*
             
            Dear Customers,
             
            We’re closing this ticket as it was opened for pure monitoring purposes not as a specific Jira bug.
             
            As a recommendation when the above problem occurs we suggest regular maintenance of databases with the focus on *rebuilding indexes* and *recomputing statistics* either at regular intervals or when index fragmentation goes above a certain threshold or when statistics become outdated.
            More information you can find on suggested pages: * [https://confluence.atlassian.com/kb/database-troubleshooting-and-how-to-guides-744326149.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-postgresql-938846851.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html]
             * [https://confluence.atlassian.com/kb/optimize-and-improve-postgresql-performance-with-vacuum-analyze-and-reindex-885239781.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html#ConnectingJiraapplicationstoSQLServer2017-Scheduleadailymaintenancetaskforhottables]
             
            Beata Kupiec
            Principal Product Manager Data Center
            New: *Atlassian Update - 28 February 2025*
             
            Dear Customers,
             
            We’re closing this ticket as it was opened for pure monitoring purposes not as a specific Jira bug.
             
            As a recommendation when the above problem occurs we suggest regular maintenance of databases with the focus on *rebuilding indexes* and *recomputing statistics* either at regular intervals or when index fragmentation goes above a certain threshold or when statistics become outdated.
            More information you can find on suggested pages:
             * [https://confluence.atlassian.com/kb/database-troubleshooting-and-how-to-guides-744326149.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-postgresql-938846851.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html]
             * [https://confluence.atlassian.com/kb/optimize-and-improve-postgresql-performance-with-vacuum-analyze-and-reindex-885239781.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html#ConnectingJiraapplicationstoSQLServer2017-Scheduleadailymaintenancetaskforhottables]
             
            Beata Kupiec
            Principal Product Manager Data Center
            Beata Kupiec made changes -
            Resolution New: Not a bug [ 12 ]
            Status Original: Gathering Impact [ 12072 ] New: Closed [ 6 ]
            Beata Kupiec made changes -
            Current Status New: *Atlassian Update - 28 February 2025*
             
            Dear Customers,
             
            We’re closing this ticket as it was opened for pure monitoring purposes not as a specific Jira bug.
             
            As a recommendation when the above problem occurs we suggest regular maintenance of databases with the focus on *rebuilding indexes* and *recomputing statistics* either at regular intervals or when index fragmentation goes above a certain threshold or when statistics become outdated.
            More information you can find on suggested pages: * [https://confluence.atlassian.com/kb/database-troubleshooting-and-how-to-guides-744326149.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-postgresql-938846851.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html]
             * [https://confluence.atlassian.com/kb/optimize-and-improve-postgresql-performance-with-vacuum-analyze-and-reindex-885239781.html]
             * [https://confluence.atlassian.com/adminjiraserver/connecting-jira-applications-to-sql-server-2017-970613480.html#ConnectingJiraapplicationstoSQLServer2017-Scheduleadailymaintenancetaskforhottables]
             
            Beata Kupiec
            Principal Product Manager Data Center
            SET Analytics Bot made changes -
            Support reference count Original: 67 New: 68
            SET Analytics Bot made changes -
            UIS Original: 10 New: 21
            Antoni Kowalski made changes -
            Remote Link Original: This issue links to "WILDT-151 (Bulldog)" [ 988304 ] New: This issue links to "WILDT-151 (JIRA Server (Bulldog))" [ 988304 ]
            Antoni Kowalski made changes -
            Remote Link New: This issue links to "WILDT-151 (Bulldog)" [ 988304 ]
            SET Analytics Bot made changes -
            UIS Original: 11 New: 10
            Rob made changes -
            Labels Original: pse-request whl-fy25q1 whl-fy25q2 New: pse-request whl-fy25q1 whl-fy25q2 whl-fy25q3
            SET Analytics Bot made changes -
            UIS Original: 22 New: 11

              Unassigned Unassigned
              ayakovlev@atlassian.com Andriy Yakovlev [Atlassian]
              Affected customers:
              24 This affects my team
              Watchers:
              31 Start watching this issue

                Created:
                Updated:
                Resolved: