
Provide ability to override Lucene tokenisation and stemming and search for exact text

    • 26
    • Minor
    • 97

      NOTE: This suggestion is for Confluence Cloud. Using Confluence Server? See the corresponding suggestion.

      We would like to be able to search for exact text, without Lucene tokenising the query, removing stop words, or doing any stemming.

      We partly use Confluence as a knowledge base for our own software product, which includes lots of system codes, acronyms, etc. The specific use case that prompted this was a user wanting to search for the text "S/D". Lucene strips off the 'S' and the '/', leaving the search to just look for 'd'. Not particularly useful.

      It would be brilliant if we could enter a query somehow that overrode this.
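To illustrate the behaviour described above, here is a minimal Python sketch. It is a toy model, not Lucene's actual analyzer: the function names, the sample documents and the stop-word list are all hypothetical, and "s" is included as a stop word only to reproduce the reported result for "S/D".

```python
import re

# Hypothetical stop-word list (an assumption). "s" is listed only to mimic
# the reported behaviour; this is not Lucene's real stop set.
STOP_WORDS = {"a", "an", "and", "the", "of", "s"}


def toy_analyze(text):
    """Lowercase, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def analyzed_search(query, documents):
    """Return documents sharing at least one analyzed token with the query."""
    query_tokens = set(toy_analyze(query))
    return [d for d in documents if query_tokens & set(toy_analyze(d))]


def exact_search(query, documents):
    """Return documents containing the literal query text, ignoring case."""
    needle = query.lower()
    return [d for d in documents if needle in d.lower()]


docs = ["Set the S/D flag before export.", "Drive D is full."]

print(toy_analyze("S/D"))            # the query is reduced to ['d']
print(analyzed_search("S/D", docs))  # both documents match on 'd'
print(exact_search("S/D", docs))     # only the document containing "S/D"
```

In this sketch the analyzed search cannot tell the two documents apart, while the literal search returns only the one that contains "S/D", which is the kind of override being requested.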

      Raised following support issue CSP-29584

        1. image-2016-12-26-10-54-27-948.png
          148 kB
          Scott Welker
        2. image-2016-12-26-10-56-53-084.png
          148 kB
          Scott Welker


            [AI-891] Provide ability to override Lucene tokenisation and stemming and search for exact text

            SET Analytics Bot made changes -
            Support reference count Original: 25 New: 26
            Jagriti made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 1025314 ]
            Aniket Bhosale made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 1018356 ]
            SET Analytics Bot made changes -
            Support reference count Original: 24 New: 25

            Mike_Rosoft added a comment - - edited

            Regarding the comment from 30 January, what on Earth does this issue have to do with configuration? This isn't asking for some kind of configuration, such as disabling Lucene tokenization for a specific site. This is asking to provide an ability to search for an exact string regardless of any tokenization. For example, searching for "Agility" should return documents containing the exact string "Agility", and not "Agile" or the like. (I understand that search is case-insensitive; so it would also match "agility" or "aGiLitY".)
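The distinction drawn in this comment can be sketched in Python as follows. This is an illustration under stated assumptions, not how Confluence or Lucene implements search: `toy_stem` is a crude, hypothetical suffix stripper (not the Porter algorithm), and the sample documents are made up.

```python
import re


def toy_stem(word):
    """Crude suffix stripping for illustration only (an assumption)."""
    word = word.lower()
    for suffix in ("ity", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stemmed_match(query, document):
    """True if the query and any word in the document share a stem."""
    doc_stems = {toy_stem(w) for w in re.findall(r"\w+", document)}
    return toy_stem(query) in doc_stems


def exact_match(query, document):
    """True if the literal query appears as a whole word, ignoring case."""
    pattern = r"(?<!\w)" + re.escape(query) + r"(?!\w)"
    return re.search(pattern, document, re.IGNORECASE) is not None


docs = ["Our Agile process", "We value aGiLitY", "Fragile goods"]

for doc in docs:
    print(doc, stemmed_match("Agility", doc), exact_match("Agility", doc))
```

With stemming, "Agility" and "Agile" both reduce to the same stem here, so the first two documents match. The exact, case-insensitive match returns only the document that contains "aGiLitY".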


            Mat Walker added a comment -

            +100 from me.

            Needed to search with a forward slash. Can't.

            High Priority bug, but has been languishing since 2009. Guess it isn't going to happen.


            Angela L. added a comment - - edited

            This ticket is 16 years old as of today. Why doesn't Atlassian just close it and end the suspense? They clearly don't care that so many of us are clamoring for this change.


            Adam Bartlett added a comment -

            What a disappointing response.

            Being multi-tenanted you could choose to take this opportunity to fix this for everyone in one fell swoop, but instead it's a reason for leaving basic search functionality in a broken state for everyone?

            It's this kind of thinking that made my employer dump Confluence recently.

            You can't use this excuse as a reason not to fix things or you'll never overcome that inertia and make the tool fit for purpose.

            Owen Wallis added a comment -

            With regard to this ticket, we cannot provide the option for Lucene tokenisation. We have a multi-tenanted environment. As the config is applied once for all tenants, we can't apply it on a per-site basis. If you were watching the ticket for an update to that problem, sadly it's a 'won't fix'.

            However, we may be able to help with the stemming issues. I'll review this with the team and come back when I have an update.

            As a general update, we remain committed to improving Confluence search. We continue to test and improve relevance. And we have large-scale experience changes coming (currently testing these in Beta with 10 customers: https://community.atlassian.com/t5/Confluence-Cloud-Admins-articles/Beta-sign-up-closed-Introducing-Unified-Search-the-new-and/ba-p/2885113)

            As an update to my comment below, I've also updated the status of other search-related tickets:

            https://jira.atlassian.com/browse/AI-200 - fixed
            https://jira.atlassian.com/browse/AI-710 - fixed
            https://jira.atlassian.com/browse/AI-714 - won't fix
            https://jira.atlassian.com/browse/CONFCLOUD-4428 - not related to search (but fixed)
            https://jira.atlassian.com/browse/AI-568 - fixed
            https://jira.atlassian.com/browse/AI-842 - fixed
            https://jira.atlassian.com/browse/AI-611 - not yet fixed

            SET Analytics Bot made changes -
            Support reference count Original: 23 New: 24

              owallis@atlassian.com Owen Wallis
              f29e75fd9379 Neil Arrowsmith
              Affected customers: 496
              Watchers: 286
