Allow search for words and phrases with non-letter symbols: plus (+), minus (-), period (.), dollar sign ($), asterisk (*), etc.

    • Our product teams collect and evaluate feedback from a number of different sources. To learn more about how we use customer feedback in the planning process, check out our new feature policy.

      NOTE: This suggestion is for Confluence Cloud. Using Confluence Server? See the corresponding suggestion.

      At the moment, searching for "hello-to-the-world" in Confluence always returns the same results as "hello to the world". The same applies to other symbols such as plus, underscore, period, dollar sign, and percent sign.
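
A minimal Python sketch of this effect, assuming a deliberately simplified analyzer that lower-cases the query and keeps only runs of ASCII letters and digits. The helper `simple_tokens` is hypothetical and is not Confluence's or Lucene's actual code; a real analyzer does more (stemming and stop-word removal, for example).

```python
import re

def simple_tokens(query: str) -> list[str]:
    """Simplified model: lower-case, keep runs of letters/digits, drop symbols."""
    # Assumption: every non-alphanumeric character acts as a separator.
    return re.findall(r"[a-z0-9]+", query.lower())

hyphenated = simple_tokens("hello-to-the-world")
spaced = simple_tokens("hello to the world")

# Both queries reduce to the same token list, so a search engine that
# only sees tokens cannot tell them apart.
print(hyphenated)
print(spaced)
print(hyphenated == spaced)
```

Under this model the hyphens never reach the index lookup, which is why an "exact phrase with symbols" search has nothing to match against.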

      There's also no way to prevent asterisks from being treated as wildcard characters in Lucene, so you can't search for a word like "plea" and match content that has asterisks around the word.
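
To illustrate what wildcard treatment means here, the following Python sketch uses the standard library's `fnmatch` as a stand-in for wildcard matching, on the assumption that `*` matches any run of characters. The token list and the helper `wildcard_matches` are hypothetical and only model the behaviour; they are not how Lucene evaluates queries internally.

```python
from fnmatch import fnmatchcase

# Hypothetical tokens that might exist in an index.
indexed_tokens = ["plea", "please", "pleasant", "apple"]

def wildcard_matches(term: str, tokens: list[str]) -> list[str]:
    """Return the tokens matched when '*' in the term is read as a wildcard."""
    # Assumption: '*' stands for any run of characters, never a literal asterisk.
    return [token for token in tokens if fnmatchcase(token, term)]

# The asterisk widens the search instead of being matched literally.
print(wildcard_matches("plea*", indexed_tokens))
```

Because the asterisk is consumed as an operator, there is no literal `*` left to compare against page content.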

      Words are also not split on dots, so you can't search for "somefile" and find pages that contain "somefile.txt" or "somefile.doc" in the text.
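
The consequence for lookups can be sketched with a toy inverted index in Python. This assumes text is split on whitespace only, so a file name with a dot stays a single token; the page titles, page text, and the helper `index_pages` are all made up for illustration.

```python
from collections import defaultdict

def index_pages(pages: dict[str, str]) -> dict[str, set[str]]:
    """Build a toy inverted index: token -> titles of pages containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for title, text in pages.items():
        # Assumption: split on whitespace only, so "somefile.txt" stays one token.
        for token in text.lower().split():
            index[token].add(title)
    return index

# Hypothetical pages.
pages = {
    "Release notes": "see somefile.txt for details",
    "Setup guide": "open somefile.doc first",
}
index = index_pages(pages)

# The bare name is not a token in the index, so it finds nothing.
print(sorted(index.get("somefile", set())))
# Only the full token, extension included, finds the page.
print(sorted(index.get("somefile.txt", set())))
```

Splitting on dots at index time (or indexing both the whole token and its parts) would be one way to make the bare name searchable, at the cost of a larger index.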

      Technical notes

      This is due to how Confluence's search tokenises search requests. It splits the query up into words based on letter characters, and ignores all symbols in the request. We use Lucene's StandardTokenizer in our EnglishAnalyzer, and similar implementations for other languages.

      Here is the description of the behaviour of StandardTokenizer from Lucene:

      • Splits words at punctuation characters, removing punctuation. However, a dot that's not followed by whitespace is considered part of a token.
      • Splits words at hyphens, unless there's a number in the token, in which case the whole token is interpreted as a product number and is not split.
      • Recognizes email addresses and internet hostnames as one token.
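
The three rules above can be approximated in a few lines of Python. This is a rough model of the described behaviour under stated assumptions, not a port of Lucene's grammar: it handles ASCII text only, the `EMAIL` pattern is a simplification, hostnames stay whole only because inner dots are kept, and the function name `classic_like_tokens` is hypothetical.

```python
import re

# Simplified email pattern (assumption: ASCII local part and domain).
EMAIL = re.compile(r"[a-z0-9._+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+")

def classic_like_tokens(text: str) -> list[str]:
    """Approximate the three tokenizer rules described in the list above."""
    tokens: list[str] = []
    for chunk in text.lower().split():
        # Punctuation at the edge of a chunk (e.g. a dot followed by
        # whitespace) is removed.
        chunk = re.sub(r"^[^a-z0-9]+|[^a-z0-9]+$", "", chunk)
        if not chunk:
            continue
        if EMAIL.fullmatch(chunk):
            # Rule 3: an email address is one token.
            tokens.append(chunk)
        elif "-" in chunk and any(ch.isdigit() for ch in chunk):
            # Rule 2: a hyphenated token containing a number is kept whole.
            tokens.append(chunk)
        else:
            # Rules 1 and 2: split at hyphens and other punctuation,
            # but keep dots that sit inside a token.
            pieces = re.split(r"[^a-z0-9.]+", chunk)
            tokens.extend(p.strip(".") for p in pieces if p.strip("."))
    return tokens

sample = "Mail bob@example.com about the XL-2000 in hello-to-the-world, see somefile.txt."
print(classic_like_tokens(sample))
```

In this model "hello-to-the-world" falls apart into four words, while "xl-2000", "bob@example.com", and "somefile.txt" each survive as a single token, which matches the search behaviour reported in this issue.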

      An example of the grammar for this tokenizer can be viewed here: StandardTokenizerImpl.jflex.

        1. search-not-working-1.png
          38 kB
          dingetje NA
        2. search-not-working-2.png
          18 kB
          dingetje NA
        3. search-not-working-3.png
          18 kB
          dingetje NA

            [AI-582] Allow search for words and phrases with non-letter symbols: plus (+), minus (-), period (.), dollar sign ($), asterisk (*), etc.

            SET Analytics Bot made changes -
            Support reference count New: 17
            Alex W (Inactive) made changes -
            Labels Original: affects-cloud affects-server cc-findability exact-match search support_backlog New: affects-cloud affects-server cc-findability exact-match search solved-by-ai-rovo support_backlog
            pqz made changes -
            Component/s Original: Search - Core [ 46383 ]
            Component/s New: Search - Core [ 75296 ]
            Key Original: CONFCLOUD-3036 New: AI-582
            Support reference count Original: 17
            Affects Version/s Original: 5.6.3 [ 67739 ]
            Affects Version/s Original: 1.3.5 [ 67449 ]
            Assignee Original: Sheng An Zhang [ 5f652d38e81b ]
            Project Original: Confluence Cloud [ 18513 ] New: Atlassian Intelligence [ 23110 ]
            Owen Wallis made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 804925 ]
            Dolly made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 702062 ]
            Dolly made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 701946 ]
            Aniket Bhosale made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 632656 ]
            Neha Ghuraiya made changes -
            Remote Link New: This issue links to "Page (Confluence)" [ 626770 ]

            Jacob Vandevelde added a comment -

            How is this still a thing? There really needs to be an "exact word" option that does not turn a dash into a space.
            This has to be the first search I've ever seen that does this.

            Searching for "config-name" searches for the strings "config" and "name". Across 100 repos this is absolutely useless.

            +1 if it's any consolation.

            +1

              Unassigned Unassigned
              4d3096c80b53 Roberto Fdez.
              Votes: 132
              Watchers: 94