
Key: JRA-12399
Type: Bug
Status: Verified
Priority: Major
Assignee: Yuen-Chi Lian [Atlassian]
Reporter: Andrey Mikuloff
Votes: 4
Watchers: 3
JIRA

Searches for numbers in text fields return no results when indexing in Russian

Created: 19/Mar/07 12:28 PM   Updated: 28/Oct/09 11:14 PM
Component/s: Filtering & Indexing
Affects Version/s: 3.8
Fix Version/s: None

Time Tracking:
Original Estimate: 4h
Remaining Estimate: 4h
Time Spent: Not Specified

File Attachments:
1. JiraAnalyzer.class (2 kB) 09/Jul/07 11:47 PM - Nick Menere [Atlassian]
2. JiraAnalyzer.java (Java source file, 3 kB) 09/Jul/07 11:47 PM - Nick Menere [Atlassian]

Environment:

Tomcat 5.5.17

Issue Links:
Duplicate
 

Participants: Andrey Mikuloff, Anton Mazkovoi [Atlassian], Dylan Etkin [Atlassian], Jed Wesley-Smith [Atlassian], Nick Menere [Atlassian] and Yuen-Chi Lian [Atlassian]
Since last comment: 2 years, 9 weeks, 3 days ago
To be done by: Single developer
Labels:


Description

When the indexing language is set to Russian, JIRA does not find numbers in text fields (the Text Search module in the Navigator) or in custom fields that use the Free Text Searcher.
With the indexing language set to "Other" or English, searching for numbers works fine; with Russian it does not.

For example, when searching for a code in a text field or for a card number in a custom field, a search for "4013*" returns 0 records even though a record with the value "40135083" exists.



Yuen-Chi Lian [Atlassian] added a comment - 17/May/07 01:49 AM

It looks like the RussianAnalyzer doesn't quite like numbers. Say, if I have an issue with summary Love 007,

The search query 007 will be converted to:

{org.apache.lucene.search.BooleanQuery@6614}"(() ()) (()^9.0)"

Love 007 will be converted to:

{org.apache.lucene.search.BooleanQuery@6681}"(summary:love description:love) summary:love^9.0"
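
The empty clauses above are what one would expect if the tokenizer behind the analyzer accepts only letters, so that digits never become terms. That cause is an assumption, not something this thread confirms. A minimal sketch of the effect in plain Java follows; `NumberTokenSketch`, `tokenize` and `prefixMatches` are hypothetical names for illustration and are not Lucene or JIRA classes.

```java
import java.util.ArrayList;
import java.util.List;

public class NumberTokenSketch {

    /**
     * Hypothetical stand-in for an analyzer's tokenizer (not a Lucene class).
     * With keepDigits == false only letters form tokens, so "007" is dropped.
     */
    static List<String> tokenize(String text, boolean keepDigits) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean tokenChar = Character.isLetter(c)
                    || (keepDigits && Character.isDigit(c));
            if (tokenChar) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** True if any indexed token starts with the prefix, as in a query like "4013*". */
    static boolean prefixMatches(List<String> indexedTokens, String prefix) {
        for (String token : indexedTokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Love 007", false)); // [love]
        System.out.println(tokenize("Love 007", true));  // [love, 007]
        // The reporter's case: a card number stored in a text field.
        System.out.println(prefixMatches(tokenize("card 40135083", false), "4013")); // false
        System.out.println(prefixMatches(tokenize("card 40135083", true), "4013"));  // true
    }
}
```

With the letters-only variant the number never reaches the index, and the query text 007 produces no terms either, which would match the empty query shown above.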


Cheers,
Yuen-Chi Lian

"I do not seek. I find." - Pablo Picasso


Anton Mazkovoi [Atlassian] added a comment - 17/May/07 11:27 PM

Thanks for looking into it. Sounds like a Lucene bug. We (dev team) should take a closer look at it.


Dylan Etkin [Atlassian] added a comment - 20/May/07 11:56 PM

Let's really dig into this one and make sure that this is not something that we are doing wrong. If it is a Lucene bug, then let's raise it or link it.


Andrey Mikuloff added a comment - 13/Jun/07 05:01 AM

I tried to use lucene-2.1.0 from apache.org, but got this exception on reindexing (JIRA 3.9.1):
Caused by: java.lang.NoSuchMethodError: org.apache.lucene.document.Field.Keyword(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/lucene/document/Field;
at com.atlassian.jira.issue.index.indexers.impl.BaseFieldIndexer.indexKeyword(BaseFieldIndexer.java:46)
at com.atlassian.jira.issue.index.indexers.impl.IssueKeyIndexer.addIndex(IssueKeyIndexer.java:11)

so I rolled back to 1.9.1.

In which version will this bug be fixed?


Anton Mazkovoi [Atlassian] added a comment - 14/Jun/07 03:09 AM

Andrey,

JIRA is not currently compatible with Lucene 2.1, and I am not sure that upgrading Lucene would fix this problem.

At the moment, unfortunately, I cannot provide a release date for a fix, as we do not have a large number of Russian users, and I assume only a few of them are affected by this bug.

One thing that might help you at the moment is setting the indexing language to "Other" rather than "Russian", then reindexing your data. I would suggest trying this on a test system first, before applying it in production. The side effect of setting the indexing language to "Other" is that stemming will not be performed on the words, so searching will only match exact words rather than word roots. I am not sure whether this will actually impact you in any serious way, as that depends on how you search most often.

We will update this issue as soon as we have more information. Please accept our apologies for the delay.

Cheers,
Anton
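
To illustrate the trade-off of the workaround, here is a small sketch of matching with and without stemming. It is a toy under stated assumptions: `StemmingSketch`, `toyStem` and `matches` are hypothetical names, the suffix list is invented for the example, and English words are used only for readability. It does not reproduce the rules of Lucene's Russian stemmer.

```java
import java.util.Arrays;
import java.util.List;

public class StemmingSketch {

    // Toy suffix list for illustration only; NOT the rules a real stemmer applies.
    private static final List<String> SUFFIXES = Arrays.asList("ing", "s");

    /** Strips the first matching suffix, keeping at least three characters of the word. */
    static String toyStem(String word) {
        for (String suffix : SUFFIXES) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 2) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Compares word roots when stemming is on, exact words when it is off. */
    static boolean matches(String indexedWord, String queryWord, boolean stemming) {
        if (stemming) {
            return toyStem(indexedWord).equals(toyStem(queryWord));
        }
        return indexedWord.equals(queryWord);
    }

    public static void main(String[] args) {
        // With stemming, "report" finds an issue that contains "reports".
        System.out.println(matches("reports", "report", true));  // true
        // Without stemming (indexing language "Other"), only the exact word matches.
        System.out.println(matches("reports", "report", false)); // false
        System.out.println(matches("reports", "reports", false)); // true
    }
}
```

Whether the loss of root matching matters in practice depends on how often users search with a different word form than the one stored in the issue.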


Andrey Mikuloff added a comment - 09/Jul/07 02:34 AM

Anton, thanks, this workaround works, but it is not the best way to solve the problem: searching for parts of words in summaries and comments does not work for Russian text. We are using it as a temporary measure, but we are waiting for a proper fix.

Does fixing the Russian Lucene library bug really require that many changes?


Nick Menere [Atlassian] added a comment - 09/Jul/07 11:49 PM

Andrey,
Please find attached a patch that should fix this issue. It will now index numerical values. I have raised an issue with Lucene and am awaiting feedback before putting this change into JIRA.
I have also attached the source.

You will need to copy this file over the existing class file.

Cheers,
Nick


Andrey Mikuloff added a comment - 11/Jul/07 01:06 AM

Nick, thanks, it works!!!


Jed Wesley-Smith [Atlassian] added a comment - 06/Dec/07 08:00 PM

The fix for this issue did not make it into JIRA v3.12. We are hoping to incorporate it into v3.12.1. At the time of writing, however, there are 163 items scheduled as Fix For v3.12.1, and we will not be able to include all of them.