Repository and project names in code search do not support partial matches/tokenization consistently with code


    • Type: Suggestion
    • Resolution: Unresolved
    • Component/s: Search

      Problem Definition

      When using code search, the rules for matching project and repository names are inconsistent with the rules for matching code. Repository names that contain periods or underscores are not partially matched, because the Elasticsearch standard tokenizer used for repository names does not treat "." or "_" as a word boundary. Code is indexed with a different tokenizer that does break words on some of these characters. This inconsistency can produce unexpected search results.

      Example 

      A repository named bitbucket.jira.demo.repo does not show up in a search for "jira" or "demo", but does show up in a search for "bitbucket" or "bitbucket.jira". Code containing the same text, on the other hand, matches a search for "bitbucket", "jira", "demo", or "repo".
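
      The behaviour described above can be reproduced with a small simulation. This is only a rough sketch, not the actual analyzers: the two regular expressions are simplified stand-ins for the tokenizers, and the prefix matching for repository names is an assumption inferred from the fact that "bitbucket" and "bitbucket.jira" do match. The function names are hypothetical.

```python
import re


def repo_name_tokens(text: str) -> list[str]:
    """Approximate the repository-name tokenizer (assumption): '.' and '_'
    between alphanumeric runs do NOT split the token."""
    pattern = r"[A-Za-z0-9]+(?:[._][A-Za-z0-9]+)*"
    return [t.lower() for t in re.findall(pattern, text)]


def code_tokens(text: str) -> list[str]:
    """Approximate the code tokenizer (assumption): every non-alphanumeric
    character, including '.' and '_', is a word break."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def repo_search_hits(query: str, name: str) -> bool:
    """Assumed repository search: the query must be a prefix of a token."""
    q = query.lower()
    return any(tok.startswith(q) for tok in repo_name_tokens(name))


def code_search_hits(query: str, text: str) -> bool:
    """Assumed code search: the query must equal one whole token."""
    return query.lower() in code_tokens(text)


NAME = "bitbucket.jira.demo.repo"

for query in ["bitbucket", "bitbucket.jira", "jira", "demo", "repo"]:
    print(query, repo_search_hits(query, NAME), code_search_hits(query, NAME))
```

      In this sketch the repository name is indexed as one token, so only queries that start at the beginning of the name can match, while the same text indexed as code yields four separate tokens.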

      Suggested Solution 

      Update the tokenizer for project and repository names to work the same way as the tokenizer for code, so that the search rules are the same regardless of what the user is searching for.
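
      One possible shape for such a change, expressed as Elasticsearch index settings built in Python. This is an illustrative sketch, not the product's actual configuration: the names repo_name_tokenizer and repo_name_analyzer are hypothetical, and the exact set of characters the code tokenizer splits on is not stated in this issue, so only whitespace, "." and "_" are listed. It uses the char_group tokenizer, which splits text on a configured set of characters.

```python
import json

# Hypothetical index settings; tokenizer and analyzer names are illustrative.
REPO_NAME_SETTINGS = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "repo_name_tokenizer": {
                    # char_group splits whenever one of the listed characters
                    # (or character classes) is encountered.
                    "type": "char_group",
                    "tokenize_on_chars": ["whitespace", ".", "_"],
                }
            },
            "analyzer": {
                "repo_name_analyzer": {
                    "type": "custom",
                    "tokenizer": "repo_name_tokenizer",
                    # Lowercase tokens so matching is case-insensitive.
                    "filter": ["lowercase"],
                }
            },
        }
    }
}

# Serialize to the JSON body that would be sent when creating the index.
print(json.dumps(REPO_NAME_SETTINGS, indent=2))
```

      With settings along these lines, bitbucket.jira.demo.repo would be indexed as the tokens "bitbucket", "jira", "demo", and "repo", which is the behaviour the issue describes for code.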

            Assignee:
            Unassigned
            Reporter:
            Rodrigo M (Inactive)
            Votes:
            3
            Watchers:
            5

              Created:
              Updated: