Jira Software Data Center / JSWSERVER-25868

LexoRank collation checker incorrectly detects collation in a multi-node environment

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Medium
    • Fix Version/s: None
    • Affects Version/s: 8.20.0, 9.4.0, 9.12.0, 9.16.0
    • Component/s: Lexorank
    • Labels: None

      Issue Summary

      This is reproducible on Data Center: yes

      Jira Software uses a collation integrity checker to verify that the database collation is set correctly, so that LexoRank works as expected.
      If the check reports an incorrect collation, LexoRank switches to queries that add a casting operation, which makes them slow. With this bug, that fallback is triggered even though the collation is correct.

      How the collation check works

      The collation checker creates a few rank rows in the database, then queries them and checks whether they are returned in the expected order.

      See the documentation:

      * We add X amount of rows and ask the DB to order them. If the order failed we know
      * that we need to add functions to our queries to order everything as it should
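The check described above can be illustrated with a minimal sketch. This is not Jira's implementation: it uses Python's `sqlite3` module instead of the real database, and the table name `rank_probe`, the column `rank_value`, and the probe values are all hypothetical. It only shows the idea of inserting known rows, asking the database to order them, and comparing the result with the expected byte order.

```python
import sqlite3

# Hypothetical probe values; the rows used by the real checker are not shown in this issue.
PROBE_RANKS = ["a", "B", "b", "A"]


def collation_is_ok(column_collation: str) -> bool:
    """Insert probe rows, let the database sort them, compare with byte order."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; the collation is set on the column, as it would be in a real schema.
    conn.execute(
        f"CREATE TABLE rank_probe (rank_value TEXT COLLATE {column_collation})"
    )
    conn.executemany(
        "INSERT INTO rank_probe (rank_value) VALUES (?)",
        [(rank,) for rank in PROBE_RANKS],
    )
    returned = [
        row[0]
        for row in conn.execute(
            "SELECT rank_value FROM rank_probe ORDER BY rank_value"
        )
    ]
    conn.close()
    # For ASCII strings, Python's sorted() matches a binary (byte-wise) collation.
    return returned == sorted(PROBE_RANKS)
```

With SQLite's built-in collations, `collation_is_ok("BINARY")` passes, while `collation_is_ok("NOCASE")` fails because the case-insensitive order groups `A` with `a` ahead of `B`.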

      Steps to Reproduce

      The issue occurs because the collation integrity checkers are not synchronized between nodes. If two nodes create their test ranks at the same time, they get incorrect results and conclude that the collation is set incorrectly.

      To reproduce the issue, make sure that the checkers on different nodes are executed at the same time.

      1. Create a multi-node environment.
      2. Bulk create issues so each node will perform the integrity check at the same time.
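One plausible way the race could produce a false negative is sketched below. It rests on assumptions that this issue does not confirm: that each checker compares the full list of returned rows with its own expected list, and that it deletes the probe rows afterwards. The shared table `shared_rank`, the probe values, and the function names are hypothetical, and a single SQLite connection stands in for the shared database.

```python
import sqlite3

# Hypothetical probe values, already listed in byte order.
TEST_RANKS = ["A", "B", "a", "b"]


def insert_probe(conn):
    """A node writes its probe rows into the shared table."""
    conn.executemany(
        "INSERT INTO shared_rank (rank_value) VALUES (?)",
        [(rank,) for rank in TEST_RANKS],
    )


def verify_and_cleanup(conn):
    """A node reads back every row in order, then removes the probe rows."""
    rows = [
        row[0]
        for row in conn.execute(
            "SELECT rank_value FROM shared_rank ORDER BY rank_value"
        )
    ]
    conn.execute("DELETE FROM shared_rank")
    return rows == TEST_RANKS


def run_nodes(interleaved):
    """Run the check for two simulated nodes and return both verdicts."""
    conn = sqlite3.connect(":memory:")
    # The collation is correct (binary) in both scenarios.
    conn.execute("CREATE TABLE shared_rank (rank_value TEXT COLLATE BINARY)")
    if interleaved:
        insert_probe(conn)  # node A inserts
        insert_probe(conn)  # node B inserts before node A has verified
        first = verify_and_cleanup(conn)   # node A sees duplicated rows
        second = verify_and_cleanup(conn)  # node B finds its rows already deleted
        return first, second
    results = []
    for _ in range(2):  # each node inserts and verifies without overlap
        insert_probe(conn)
        results.append(verify_and_cleanup(conn))
    return tuple(results)
```

In this sketch, `run_nodes(interleaved=False)` reports a correct collation on both nodes, while `run_nodes(interleaved=True)` reports a failure on both, although the column collation is the same in each run. The real checker may fail differently, for example on only one of the nodes.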

      Expected Results

      LexoRank will use a fast query without additional casting:

      SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,`RANK`,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? ORDER BY `RANK` DESC LIMIT 2;
      
      Query time: 2 ms

      A node will report that the collation is correct:

      Collation check performed against LexoRank table: OK

      Actual Results

      Only a subset of the nodes may be affected.

      LexoRank uses an inefficient query:

      SELECT FIELD_ID,ID,ISSUE_ID,LOCK_HASH,LOCK_TIME,`RANK`,TYPE FROM AO_60DB71_LEXORANK WHERE FIELD_ID = ? ORDER BY convert(`RANK` using 'utf8') COLLATE utf8_bin DESC LIMIT 2;
      
      Query time: 3216 ms 

      The affected nodes will print a warning:

      ****************************************************************************************************
       The database collation is set incorrectly for JIRA Agile. This can cause JIRA Agile to run slower. 
       If you are using PostgreSQL 9.0 or earlier, this may also cause ranking to fail.
       Please refer to the following documentation on the database configuration: https://confluence.atlassian.com/display/DOC/Database+Configuration
       If the problems persist, please refer to this KB on how to fix the database: https://confluence.atlassian.com/x/PAQWK
      ****************************************************************************************************

      Ranking operations fail because of timeouts, with the message:

      JIRA Software cannot execute the rank operation at this time. Other users may be ranking the issues that you are trying to rank. Please try again later.

      Workaround

      The collation integrity checker can be disabled with a JVM property:

      greenhopper.force.collation.functions.never

      Use it only if you are sure that the collation is correct and you are affected by this bug.

              Assignee: Unassigned
              Reporter: Sławomir Zaraziński (szarazinski)
              Votes: 1
              Watchers: 2