
Provide ability to override Lucene tokenisation and stemming and search for exact text

    • 24
    • Minor
    • 97

      NOTE: This suggestion is for Confluence Cloud. Using Confluence Server? See the corresponding suggestion.

      We would like to be able to search for exact text, without Lucene tokenising the query, removing stop words, or doing any stemming.

      We partly use Confluence as a knowledge base for our own software product, which includes lots of system codes, acronyms, etc. The specific use case that prompted this was a user wanting to search for the text "S/D". Lucene strips off the 'S' and the '/', leaving the search to look for just 'd'. Not particularly useful.
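The behaviour described above can be illustrated with a toy analyzer. This is only a sketch in plain Python, not Lucene or Confluence code: the `analyze`, `tokenized_match` and `exact_match` functions are hypothetical, and the stop-word list is an assumption chosen so that "s" is dropped as reported.

```python
import re

# Hypothetical stop-word list for illustration only. It includes "s" so that
# the reported behaviour is reproduced; it is not Confluence's real list.
STOP_WORDS = {"a", "an", "and", "s", "t", "the", "of", "to"}


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop stop words."""
    tokens = re.split(r"[^0-9a-z]+", text.lower())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def tokenized_match(query, document):
    """True if every surviving query token appears among the document tokens."""
    doc_tokens = set(analyze(document))
    query_tokens = analyze(query)
    return bool(query_tokens) and all(tok in doc_tokens for tok in query_tokens)


def exact_match(query, document):
    """True only if the literal query string occurs in the document."""
    return query in document


print(analyze("S/D"))                                  # tokens left after analysis
print(tokenized_match("S/D", "Plan D is approved"))    # matches on "d" alone
print(exact_match("S/D", "Plan D is approved"))        # literal search rejects it
print(exact_match("S/D", "Set the S/D flag"))          # literal search accepts it
```

In this sketch the query "S/D" is reduced to the single token "d", so any page containing a standalone "D" matches, whereas a literal search only accepts pages that contain "S/D" itself.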

      It would be brilliant if we could enter a query somehow that overrode this.

      Raised following support issue CSP-29584

            [AI-891] Provide ability to override Lucene tokenisation and stemming and search for exact text

            Adam Bartlett added a comment:

            What a disappointing response.

            Being multi-tenanted, you could choose to take this opportunity to fix this for everyone in one fell swoop, but instead it's a reason for leaving basic search functionality in a broken state for everyone?

            It's this kind of thinking that made my employer dump Confluence recently.

            You can't use this excuse as a reason not to fix things or you'll never overcome that inertia and make the tool fit for purpose.

            Owen Wallis added a comment:

            With regard to this ticket, we cannot provide an option to override Lucene tokenisation. We have a multi-tenanted environment. As the config is applied once for all tenants, we can't apply it on a per-site basis. If you were watching the ticket for an update to that problem, sadly it's a 'won't fix'.

            However, we may be able to help with the stemming issues. I'll review this with the team and come back when I have an update.

            As a general update, we remain committed to improving Confluence search. We continue to test and improve relevance. And we have large-scale experience changes coming (currently testing these in Beta with 10 customers: https://community.atlassian.com/t5/Confluence-Cloud-Admins-articles/Beta-sign-up-closed-Introducing-Unified-Search-the-new-and/ba-p/2885113)

            As an update to my comment below, I've also updated the status of other search-related tickets:

            https://jira.atlassian.com/browse/AI-200 - fixed
            https://jira.atlassian.com/browse/AI-710 - fixed
            https://jira.atlassian.com/browse/AI-714 - won't fix
            https://jira.atlassian.com/browse/CONFCLOUD-4428 - not related to search (but fixed)
            https://jira.atlassian.com/browse/AI-568 - fixed
            https://jira.atlassian.com/browse/AI-842 - fixed
            https://jira.atlassian.com/browse/AI-611 - not yet fixed

            Owen Wallis added a comment:

            Hi 93b03c68bba8 and folks watching this ticket.

            There are a few related tickets:

            https://jira.atlassian.com/browse/AI-200
            https://jira.atlassian.com/browse/AI-710
            https://jira.atlassian.com/browse/AI-714
            https://jira.atlassian.com/browse/CONFCLOUD-4428
            https://jira.atlassian.com/browse/AI-568
            https://jira.atlassian.com/browse/AI-842
            https://jira.atlassian.com/browse/AI-611

            In the New Year I'll meet with Atlassian search relevance teams to understand next steps and ETAs for these.

            Thanks

            Owen

            Christian Boxer added a comment:

            Another +1 from me for exact string searching through page body text. We automate the generation of read-only technical documentation to Confluence, and the lack of this feature is becoming a real pain. Searching throughout a Confluence space containing around 200 pages will reliably find some text, such as "cus_webenabled", but for other text, such as "Related_Worktags_Data", it just returns no results at all. Perhaps it splits it into tokens and considers the individual words too common? Is there any target timescale for introducing a fix, please? Thanks

            Giuliano C. added a comment:

            Thank you very much for the comments, 95aa03ae8d8f and 2eaf001e0aec.

            Owen brought this one to my attention. The reason the label was added is that, while we do plan to solve this in Rovo Search, we also plan to resolve it in Confluence Search.

            This means that we do not expect customers to pay for Rovo in order to have the bug fixed.

            Apologies for any confusion.

            Owen Wallis added a comment:

            Hi David, Adam,

            I agree - and we do plan to resolve this in Confluence.

            Let me work with the support team to understand why this tag was added.

            Thanks

            Owen

            Adam Cameron added a comment:

            Agreed, 2eaf001e0aec. Thi$ i$ not an add-on-f€atur€. It's a decade+ fundamental bug to core functionality. If any money was to be changing hands here, it should be Atlassian discounting our licensing for failure to provide core functionality.

            David at David Simpson Apps added a comment (edited):

            I notice the solved-by-ai-rovo label has been added.

            This is therefore not solved for people who haven't been upsold the Rovo product at $20+/user/month on top of their Confluence subscription.

            So please don't close this issue as it's still not solved by Confluence Cloud, which is the product that this 15+ year old issue was originally opened on.

            It would also make a lot of sense to move this back to the Confluence Cloud project rather than this AI project.

            Owen Wallis added a comment:

            Hi all,

            We've added a fix so exact match will work on titles.

            Before the change, “S/D” would match “S…D” regardless of what special characters are in between the letters “S” and “D”.

            Now searching on "S/D" will, for example, ignore “S%D” or “S//D” on titles.

            I'll not mark this ticket as closed as we still need to add in a fix for body text support.

            But at least now when using Confluence Advanced search you can select 'Search titles only' (from the 'More filters' option) and get exact matches on title.

            All the best

            Owen
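The before/after title behaviour described in this comment can be sketched as follows. This is an illustration of the described matching rules only, not Atlassian's implementation; both function names are hypothetical, and case handling is left out.

```python
import re


def loose_title_match(query, title):
    """Old behaviour as described: any special characters between the
    alphanumeric parts of the query match any other special characters."""
    parts = re.split(r"[^0-9A-Za-z]+", query)
    pattern = r"[^0-9A-Za-z]+".join(re.escape(part) for part in parts)
    return re.search(pattern, title) is not None


def exact_title_match(query, title):
    """New behaviour as described: the literal query must occur in the title."""
    return query in title


for title in ["Notes on S/D codes", "S%D", "S//D"]:
    print(title, loose_title_match("S/D", title), exact_title_match("S/D", title))
```

Under these assumptions the loose matcher accepts all three titles, while the exact matcher accepts only the one that actually contains "S/D".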

            carl.olofson added a comment:

            Please make this a top priority. What use is a document management system with a search function that cannot even handle literal strings? Try to search for a domain name or a date and the result includes every page there is!

            This application is an internal laughing stock as it is right now!

              owallis@atlassian.com Owen Wallis
              f29e75fd9379 Neil Arrowsmith
              Affected customers: 483
              Watchers: 275