  1. Confluence Data Center
  2. CONFSERVER-3036

Allow search for words and phrases with non-letter symbols: plus (+), minus (-), period (.), dollar sign ($), asterisk (*), etc.

      We collect Confluence feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

      NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.

      At the moment, searching for "hello-to-the-world" in Confluence always returns the same results as "hello to the world". The same applies to symbols like plus, underscore, period, dollar sign, percent sign, and so on.

      There's also no way to prevent asterisks being treated as wildcard characters in Lucene, so you can't search for a word like "plea" and match content with asterisks around the word.

      Words are also not split on dots, so you can't search for "somefile" and find pages that contain "somefile.txt" or "somefile.doc" in the text.

      Technical notes

      This is due to how Confluence's search tokenises search requests. It splits the query up into words based on letter characters, and ignores all symbols in the request. We use Lucene's StandardTokenizer in our EnglishAnalyzer, and similar implementations for other languages.

      Here is the description of the behaviour of StandardTokenizer from Lucene:

      • Splits words at punctuation characters, removing punctuation. However, a dot that's not followed by whitespace is considered part of a token.
      • Splits words at hyphens, unless there's a number in the token, in which case the whole token is interpreted as a product number and is not split.
      • Recognizes email addresses and internet hostnames as one token.

      An example of the grammar for this tokenizer can be viewed here: StandardTokenizerImpl.jflex.
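
The three rules quoted above can be approximated in a few lines of Python to see why the searches in the description behave as they do. This is only a rough sketch under stated assumptions: it is not Lucene code, the name `sketch_tokenize` and the regular expressions are hypothetical, and the real grammar in StandardTokenizerImpl.jflex handles many more cases.

```python
import re

# Hypothetical approximation of the rules quoted above; NOT Lucene's implementation.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
EDGE_PUNCTUATION = ".,;:!?\"'()[]{}*+$%-_"


def sketch_tokenize(text):
    tokens = []
    for chunk in text.split():
        # Punctuation at the edges of a word is dropped (this also drops a
        # trailing dot, i.e. a dot that is followed by whitespace).
        chunk = chunk.strip(EDGE_PUNCTUATION)
        if not chunk:
            continue
        if EMAIL.fullmatch(chunk):
            # Email addresses are kept as one token.
            tokens.append(chunk)
        elif "-" in chunk and any(c.isdigit() for c in chunk):
            # A hyphenated token containing a digit is treated as a product number.
            tokens.append(chunk)
        else:
            # Split on every symbol except an interior dot.
            pieces = (p.strip(".") for p in re.split(r"[^A-Za-z0-9.]+", chunk))
            tokens.extend(p for p in pieces if p)
    return tokens


print(sketch_tokenize("hello-to-the-world"))  # ['hello', 'to', 'the', 'world']
print(sketch_tokenize("somefile.txt"))        # ['somefile.txt']
print(sketch_tokenize("*plea*"))              # ['plea']
```

Under these assumptions, "hello-to-the-world" and "hello to the world" produce identical tokens, while "somefile.txt" stays a single token, so a query for "somefile" has nothing to match.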

        1. search-not-working-1.png (38 kB)
        2. search-not-working-2.png (18 kB)
        3. search-not-working-3.png (18 kB)


            Michael Bulger added a comment -

            How is this still an issue.....

            eazyBI Support added a comment -

            This would help when searching for technical parameters and settings.

            For example, you can't find the page with the issue.fields.resolutiondate definitions when searching for "resolutiondate".

            Lukas.Beyeler@dentsplysirona.com added a comment -

            I need to list files containing the whole string "bash --login". I'm really not interested in all the scripts that have some use of "login" in them...

            Andrey Tetyuev added a comment - edited

            The hint about using RegEx was already given before, but it was reported not to work in certain cases.
            I think the root cause is that you must additionally escape the "\" that escapes special symbols.
            Here is an example that works in my case.

            Imagine you need to search for issues containing "..." in the field "Root Cause", excluding those from the project "OtherProject".
            The search query looks like this:

            issueFunction in issueFieldMatch("'Root Cause' is not empty AND Project != OtherProject", "Root Cause", "\\.{3}")

            In the RegEx above, the 2nd "\" (which turns the special regex symbol "." into a literal period) is itself escaped by the first "\" within the search string.

            Using RegEx (also in combination with issueFunction) is not that comfortable for quick searches, but if you are creating filters or dashboards, RegEx offers even more flexibility.
            Just keep in mind that using RegEx instead of a simple text search might increase the load on your JIRA server if used "too widely".
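
The double escaping described in this comment can be illustrated outside JIRA with a short Python sketch. It assumes, as the comment suggests, that the query's string parser turns "\\" into a single "\" before the pattern reaches the regex engine; the helper `unescape_query_string` is hypothetical and only stands in for that step.

```python
import re

# The characters typed inside the quoted query argument: \ \ . { 3 }
typed = r"\\.{3}"


def unescape_query_string(s):
    # Assumption: the query language collapses "\\" into a single "\".
    return s.replace("\\\\", "\\")


# What the regex engine would then receive: \ . { 3 }
pattern = unescape_query_string(typed)

print(pattern)                               # \.{3}
print(bool(re.search(pattern, "wait...")))   # True: three literal periods
print(bool(re.search(pattern, "abc")))       # False
print(bool(re.search(".{3}", "abc")))        # True: an unescaped "." matches any character
```

The last line shows why the escape matters: without it, ".{3}" matches any three characters rather than three periods.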


            Enrico Skottnik added a comment -

            It is still a problem to find pages with such characters in the page title.

            daithi-walker added a comment -

            Can't believe this is not being addressed. Not surprising though.

            Vance Collins added a comment -

            Still having a problem with this. I need to bring back all tickets with a "." in the summary.

            Florent R. added a comment -

            Please act on this issue. This cannot be hard to fix: you only have to add some kind of preprocessing of the string to escape every special character.
            You cannot allow characters in pages that cannot be searched!


            Serhii Onkov added a comment -

            It's such a shame that you can't or don't want to fix this issue.

            Ed Felt added a comment -

            Planning to fix?


              Assignee: Unassigned
              Reporter: Roberto Fdez.
              Votes: 221
              Watchers: 135