Search problems with issues containing Japanese text


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Medium
    • None
    • Affects Version/s: 3.13
    • Component/s: Indexing
    • Environment:

      JIRA: Atlassian JIRA Enterprise 3.13 Standalone
      OS: Windows XP
      Database: MS SQL Server 2000, Collation Type: Japanese_Unicode_CI_AS and Japanese_Unicode_CS_AS_KS_WS
      JVM: 1.5.0

    • 3.13

      I set up a clean instance of JIRA 3.13 using the Japanese_Unicode_CI_AS collation and ran the search tests below. I then repeated the process on a second instance using the Japanese_Unicode_CS_AS_KS_WS collation.

      I created the following issues, one per input style, in both instances (the Japanese_Unicode_CI_AS instance and the Japanese_Unicode_CS_AS_KS_WS instance):
      ABCD-25 桜 Kanji
      ABCD-26 さくら Hiragana
      ABCD-27 サクラ Full-Width Katakana
      ABCD-28 sakura Full-Width Alphanumeric
      ABCD-29 サクラ Half-Width Katakana
      ABCD-30 sakura Half-Width Alphanumeric

      I then searched for each term in both instances under each of JIRA's indexing language settings. Here are my findings:

      ________________________________________

      Japanese_Unicode_CI_AS

      Indexing Language - Other
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns nothing
      ｻｸﾗ returns 29 only
      sakura returns 30 only

      Indexing Language - Chinese/Japanese/Korean
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns 28 and 30 only
      ｻｸﾗ returns nothing
      sakura returns 28 and 30 only

      Indexing Language - English
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns nothing
      ｻｸﾗ returns 29 only
      sakura returns 30 only

      ________________________________________

      Japanese_Unicode_CS_AS_KS_WS

      Indexing Language - Other
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns nothing
      ｻｸﾗ returns 29 only
      sakura returns 30 only

      Indexing Language - Chinese/Japanese/Korean
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns 28 and 30 only
      ｻｸﾗ returns nothing
      sakura returns 28 and 30 only

      Indexing Language - English
      桜 returns 25 only
      さくら returns 26 only
      サクラ returns 27 only
      ｓａｋｕｒａ returns nothing
      ｻｸﾗ returns 29 only
      sakura returns 30 only
      ________________________________________

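      As shown above, the results were identical under both collations, and no matter which indexing language is chosen, one input style always fails to return its issue: full-width alphanumeric under Other and English, and half-width katakana under Chinese/Japanese/Korean. This is a problem when searching for issues containing those types of characters.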
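The findings above can be restated as data. The sketch below records, for each indexing language, which issues each of the six queries returned (in the same order as the test issues, and identical for both collations) and lists the input styles whose query returned nothing. The names `INPUT_STYLES`, `RESULTS` and `failing_styles` are illustrative only.

```python
# Query styles, in the same order as issues ABCD-25 .. ABCD-30.
INPUT_STYLES = [
    "Kanji",
    "Hiragana",
    "Full-Width Katakana",
    "Full-Width Alphanumeric",
    "Half-Width Katakana",
    "Half-Width Alphanumeric",
]

# Issue numbers returned per query, as reported for both collations.
RESULTS = {
    "Other": [[25], [26], [27], [], [29], [30]],
    "Chinese/Japanese/Korean": [[25], [26], [27], [28, 30], [], [28, 30]],
    "English": [[25], [26], [27], [], [29], [30]],
}


def failing_styles(hits_per_style):
    """Return the input styles whose query returned no issues at all."""
    return [style for style, hits in zip(INPUT_STYLES, hits_per_style) if not hits]


for language, hits in RESULTS.items():
    print(f"{language}: fails for {failing_styles(hits)}")
```

Note that under Chinese/Japanese/Korean the two alphanumeric queries do return results, but each returns both ABCD-28 and ABCD-30 rather than only its own issue.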

      I have also discussed this problem in detail with the Atlassian support team through Live Support, and there does not currently seem to be a way to handle all of the input styles listed above. We hope to get some kind of workaround for this problem (other than avoiding a specific input style completely within issues).
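For illustration only, one direction a workaround could take is to fold character width before text is indexed and before a query is run. This is not a JIRA feature or anything Atlassian support suggested; it is a standalone Python sketch that assumes Unicode NFKC normalization is acceptable for the data, and `fold_width`, `build_index` and `search` are hypothetical helpers standing in for whatever the indexer actually does.

```python
import unicodedata

# Test summaries; escapes keep the exact full-width and half-width forms.
ISSUES = {
    "ABCD-25": "\u685c",                          # Kanji
    "ABCD-26": "\u3055\u304f\u3089",              # Hiragana
    "ABCD-27": "\u30b5\u30af\u30e9",              # Full-width katakana
    "ABCD-28": "\uff53\uff41\uff4b\uff55\uff52\uff41",  # Full-width alphanumeric
    "ABCD-29": "\uff7b\uff78\uff97",              # Half-width katakana
    "ABCD-30": "sakura",                          # Half-width alphanumeric
}


def fold_width(text):
    """NFKC maps full-width Latin to ASCII and half-width katakana to full-width."""
    return unicodedata.normalize("NFKC", text)


def build_index(issues):
    """Toy exact-match index: folded summary -> list of issue keys."""
    index = {}
    for key, summary in issues.items():
        index.setdefault(fold_width(summary), []).append(key)
    return index


INDEX = build_index(ISSUES)


def search(query):
    """Fold the query the same way as the indexed text, then look it up."""
    return INDEX.get(fold_width(query), [])


for summary in ISSUES.values():
    print(summary, "->", search(summary))
```

With this folding, no query returns nothing: either width of "sakura" finds ABCD-28 and ABCD-30, and either width of the katakana finds ABCD-27 and ABCD-29. The trade-off is that the two widths can no longer be searched separately, and kanji, hiragana and katakana spellings are still treated as different terms.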

              Assignee: Unassigned
              Reporter: BoS
              Votes: 0
              Watchers: 1