
Allow search for words and phrases with non-letter symbols: plus (+), minus (-), period (.), dollar sign ($), asterisk (*), etc.

      NOTE: This suggestion is for Confluence Cloud. Using Confluence Server? See the corresponding suggestion.

      At the moment, searching for "hello-to-the-world" in Confluence always returns the same results as "hello to the world". The same applies to symbols such as plus, underscore, period, dollar sign, percent sign, and so on.

      There's also no way to prevent asterisks being treated as wildcard characters in Lucene, so you can't search for a word like "plea" and match content with asterisks around the word.

      Words are also not split on dots, so you can't search for "somefile" and find pages that contain "somefile.txt" or "somefile.doc" in the text.

      Technical notes

      This is due to how Confluence's search tokenises search requests. It splits the query up into words based on letter characters, and ignores all symbols in the request. We use Lucene's StandardTokenizer in our EnglishAnalyzer, and similar implementations for other languages.

      Here is the description of the behaviour of StandardTokenizer from Lucene:

      • Splits words at punctuation characters, removing punctuation. However, a dot that's not followed by whitespace is considered part of a token.
      • Splits words at hyphens, unless there's a number in the token, in which case the whole token is interpreted as a product number and is not split.
      • Recognizes email addresses and internet hostnames as one token.

      An example of the grammar for this tokenizer can be viewed here: StandardTokenizerImpl.jflex.
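The three rules quoted above can be approximated in a few lines of Python. This is only a rough sketch for illustration, not Lucene's actual grammar: the function name `tokenize` and the regular expression are assumptions, and the real analyzer also applies steps such as stemming and stop-word removal that are left out here.

```python
import re

# Hypothetical approximation of the quoted rules: a chunk is a run of
# letters/digits, optionally joined by '.', '-' or '@' when the joiner is
# immediately followed by more letters/digits (so "end. Next" still splits).
_CHUNK = re.compile(r"[A-Za-z0-9]+(?:[.\-@][A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into lowercase tokens, loosely following the quoted rules."""
    tokens = []
    for chunk in _CHUNK.findall(text):
        has_digit = any(ch.isdigit() for ch in chunk)
        if "@" in chunk or has_digit:
            # Email-like chunks and "product numbers" are kept whole.
            tokens.append(chunk.lower())
        else:
            # Otherwise hyphens act as separators; dots stay inside the token.
            tokens.extend(part.lower() for part in chunk.split("-"))
    return tokens


print(tokenize("hello-to-the-world"))  # ['hello', 'to', 'the', 'world']
print(tokenize("hello to the world"))  # ['hello', 'to', 'the', 'world']
print(tokenize("somefile.txt"))        # ['somefile.txt']
print(tokenize("x_y"))                 # ['x', 'y']
```

Under these assumptions the hyphenated and spaced queries produce identical tokens, while "somefile.txt" stays a single token that a search for "somefile" cannot match.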

        1. search-not-working-1.png (38 kB)
        2. search-not-working-2.png (18 kB)
        3. search-not-working-3.png (18 kB)

            [AI-582] Allow search for words and phrases with non-letter symbols: plus (+), minus (-), period (.), dollar sign ($), asterisk (*), etc.

            Jacob Vandevelde added a comment -

            How is this still a thing? There really needs to be an "exact word" option that does not turn a dash into a space.
            This has to be the first search I've ever seen that does this.

            Searching for "config-name" searches for the strings "config" and "name". Across 100 repos this is absolutely useless.

            +1 if it's any consolation.

            +1

            Sheng An Zhang (Inactive) added a comment -

            Hi everyone.
            My name is Sheng An Zhang and I am a product manager on the search team.

            Our team is looking to focus on fixing fundamental issues with our search. This bug is something my team is now looking into and will be an area of higher priority for us. While I cannot provide you with a concrete timeline, we will be actively looking to fix this and related issues.

            As we are beginning to explore this problem, we are also looking for your help in ensuring we build the right thing for you. If you are willing to participate in a short customer interview to chat about this (or search in general), please find a time that suits you here.

            Otherwise, please do not hesitate to shoot me an email at szhang4@atlassian.com.

            We really want to thank you for all your patience and feedback!

            sxander added a comment -

            It would be extremely helpful to be able to search for special characters.  Especially in a heavily data bound world.  Finding x_y as opposed to x y or xy would be exceptionally useful and highly time saving.

            Pål F. Kristiansen added a comment - https://support.atlassian.com/confluence-cloud/docs/confluence-search-syntax/

            The search engine is useless if it is not possible to search exact text/phrase.

            Scott Moore added a comment -

            We recently changed the names of some of our Bitbucket repos, many of which have a dash in them. We want to be able to update all of our Confluence pages that reference these repos, which would be a simple affair if Confluence recognized dashes in search terms. But since it doesn't, updating the repo names is going to be tedious.

            Luan Minh Nguyen added a comment -

            Searching for command line options in code references is also impossible with this issue.

            "--some-option" will return results for "some" and "option" but never together.

            Anthony Brown added a comment -

            We do automated deployments in our area. Every one of them has an identifier made up of 4 parts separated by '.'. When I search for this identifier, instead of my search looking like this
            part1.part2.part3.part4 (which brings up nothing)
            I have to search for
            text ~(part1) AND text ~(part2) AND text ~(part3) AND text ~(part4)
            I would really like the search to work so I don't have to get around this unexpected behaviour with a monolithic workaround.
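For anyone stuck with the same workaround, the expanded query can be generated rather than typed by hand. The following is a minimal sketch, assuming the `text ~(part)` form exactly as quoted in the comment above (it has not been checked against the current search syntax); the helper name `build_workaround_query` is hypothetical.

```python
def build_workaround_query(identifier, sep="."):
    """Expand a separator-delimited identifier into one clause per part.

    The clause format "text ~(part)" is copied from the comment above and
    is an assumption, not verified search syntax.
    """
    # Drop empty pieces so stray leading or trailing separators are ignored.
    parts = [part for part in identifier.split(sep) if part]
    return " AND ".join(f"text ~({part})" for part in parts)


print(build_workaround_query("part1.part2.part3.part4"))
# text ~(part1) AND text ~(part2) AND text ~(part3) AND text ~(part4)
```

Note that this only requires every part to appear somewhere on the page; it does not guarantee the parts occur together as one identifier, which is the limitation this suggestion is about.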

            Matt Shepherd added a comment - edited

            This is not fixed for underscores as noted above, and I agree with the other posters: this is insanely bad and needs to be worked on. We are on version 5.9.4.

              Assignee: Unassigned
              Reporter: Roberto Fdez.
              Votes: 132
              Watchers: 95