
      My users constantly complain that they get too many 'incorrect' results for their query when they make multi-word queries. The problem is that they are used to internet search where the words will be AND-ed together by default. As it stands they get huge numbers of irrelevant matches which makes them start to not trust search. Can you imagine if Google used "OR" for its search queries? Almost every site on the web would be returned for multi-word queries!

      e.g. If I search for "andy armstrong pantero" on Google I get 139 results. Switching to OR matching I get 378,000,000 results!!!

      Some possibilities here include:
      1. Make AND be the default
      2. Provide a configuration option so that administrators can choose the default
      3. Order search results by relevance so that the AND matches always come ahead of the OR matches
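Option 3 can be sketched in a few lines. This is a toy illustration only, not how Confluence or Lucene rank results: the `DOCS` corpus, the `tokenize` helper and both search functions are hypothetical names made up for the example. It keeps OR matching but sorts pages that contain every query word ahead of partial matches, and shows the strict AND result for comparison.

```python
import re

def tokenize(text):
    """Lower-case the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search_or_ranked(query, docs):
    """OR search: any page with at least one query word, fullest matches first."""
    terms = tokenize(query)
    scored = []
    for doc_id, text in docs.items():
        hits = len(terms & tokenize(text))
        if hits:
            scored.append((hits, doc_id))
    # More distinct query words matched sorts first; ties break on the id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

def search_and(query, docs):
    """AND search: only pages that contain every query word."""
    terms = tokenize(query)
    if not terms:
        return []
    return sorted(d for d, text in docs.items() if terms <= tokenize(text))

# Hypothetical three-page corpus.
DOCS = {
    "page-a": "Wiki syntax reference for headings and links",
    "page-b": "Why every team needs a wiki",
    "page-c": "Syntax highlighting in code blocks",
}

print(search_or_ranked("wiki syntax", DOCS))
print(search_and("wiki syntax", DOCS))
```

With this corpus the OR search returns all three pages with the full match first, while the AND search returns only the full match.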

      See this idea in the Confluence Community for more details:

      http://confluence.atlassian.com/display/DISC/search+exactly?focusedCommentId=169231#comment-169231

            [CONFSERVER-5874] Search should use AND logic by default

            uWEB added a comment -

            We have the same problem. The users know the AND search from Google and expect the same in Confluence.
            Please enable a changeover.


            Mike Cannon-Brookes added a comment -

            Guys - after much internal debate (and many thanks for all your insightful comments), the default will be switched to AND for 2.3. Woo! Thanks to all for being such interested customers as to provide so many for and against arguments for the feature. It's truly humbling to all here to have such loyal users.

            Osma Ahvenlampi added a comment -

            Sorry for the spam, but I found a piece of relevant discussion on how to do this with Lucene. The bad news is, it's not going to be automatic.

            http://mail-archives.apache.org/mod_mbox/lucene-java-user/200209.mbox/%3C7BE5D73F-BEF9-11D6-9A7B-003065AB94BC@bookandhammer.com%3E

            Osma Ahvenlampi added a comment -

            At the risk of flogging a dead horse (seeing that the original issue has been closed for pretty much as long as I've been using Confluence and been annoyed by this misfeature), allow me to enter a few arguments for changing the search behavior. The problem isn't that Confluence isn't finding the documents the user wanted to find (despite the index sometimes being broken, but that's another matter), it's that the user won't find the document in question on the 12th page of a badly ordered results list. This, for those who don't recall, is exactly why Alta Vista lost to Google back in the day.

            1) "AND by default would miss results". Okay, fine. The most basic thing to do then would be to weight results by relevance, and determine the relevance by the number of occurrences of the searched keywords.

            2) Why is it that not only is AND not the default, but you also have to write it in UPPER CASE for it to work?

            3) Google is what "everyone" is used to. Google works because they define relevance not by the '96-era method of counting hits (which Confluence doesn't do either), but by looking at linkage. Most Confluence installations are too small and narrowly focused for that to work - besides, Lucene doesn't support that. However, there are plenty of possible weighting factors (in rough order of probable priority):

            • are the searched-for words used as labels on some pages?
            • did the page found contain all the searched-for words, or just a subset?
            • when was the page last updated?
            • how many incoming links does the page have?
            • how many outgoing links does the page have? (Google doesn't do this, but for closed or semi-closed systems where most links are internal, as I presume most Confluence installations are, two-directional connectedness is relevant)
            • how many times did the word(s) appear?

            That should be more than enough parameters to improve search results. Calculate a score for each, apply sensible weighting factors (and make them configurable by an administrator who thinks you got them wrong), and then order the results accordingly. Voilà, happier users.
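The scoring scheme proposed in the comment above (one score per factor, administrator-configurable weights, results ordered by the weighted sum) might look roughly like the following. It is a minimal sketch under stated assumptions: the `Page` fields, the `DEFAULT_WEIGHTS` values, the 30-day recency scale and the `squash` normalisation are all invented for illustration and are not taken from Confluence or Lucene.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Page:
    """Hypothetical page record; field names are assumptions for this sketch."""
    title: str
    text: str
    labels: set = field(default_factory=set)
    days_since_update: int = 0
    incoming_links: int = 0
    outgoing_links: int = 0

# Made-up weights, following the rough priority order in the comment.
# An administrator could pass a different dict to score() or rank().
DEFAULT_WEIGHTS = {
    "labels": 5.0,
    "coverage": 4.0,
    "recency": 3.0,
    "incoming": 2.0,
    "outgoing": 1.0,
    "frequency": 0.5,
}

def words(text):
    """Lower-case the text and return its alphanumeric words, with repeats."""
    return re.findall(r"[a-z0-9]+", text.lower())

def squash(count):
    """Map a non-negative count into [0, 1) with diminishing returns."""
    return count / (count + 1.0)

def score(page, query, weights=None):
    """Weighted sum of the six factors; 0.0 if no query word matches at all."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    terms = set(words(query))
    page_words = words(page.text)
    in_text = terms & set(page_words)
    in_labels = terms & {label.lower() for label in page.labels}
    if not terms or not (in_text or in_labels):
        return 0.0
    factors = {
        "labels": len(in_labels) / len(terms),
        "coverage": len(in_text) / len(terms),
        "recency": 1.0 / (1.0 + page.days_since_update / 30.0),
        "incoming": squash(page.incoming_links),
        "outgoing": squash(page.outgoing_links),
        "frequency": squash(sum(page_words.count(t) for t in terms)),
    }
    return sum(weights[name] * value for name, value in factors.items())

def rank(pages, query, weights=None):
    """Return the titles of matching pages, best score first."""
    scored = [(score(p, query, weights), p.title) for p in pages]
    return [title for s, title in sorted(scored, key=lambda x: (-x[0], x[1])) if s > 0]
```

For example, a recently updated page labelled `syntax` that contains both "wiki" and "syntax" would outrank an old page that merely repeats "wiki" many times, and a page containing neither word would not be listed at all.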

            Andy Armstrong added a comment -

            Another possibility would be to do "AND" by default but provide an alternative link before the results to redo the search with an "OR".

            Reading the linked issues I see that the argument is that slight variants on the current search won't be found with an AND match. However, in my experience those slight variants won't be found anyway because they will be hidden amongst many false matches.

            There's another argument that Google is different because it has more results to search, but even searching the Confluence Wiki there are way too many pages to make OR search useful. Compare the search for 'wiki syntax' with the search for 'wiki AND syntax'. The first swamps the useful results with meaningless ones, while the second gives almost exactly what I'd expect to see.

            I think the key thing here is user expectation. Everyone knows how search works now, and that is how Google, Yahoo, MSN etc do it. They all use AND and I believe Confluence should too.
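The difference between 'wiki syntax' and 'wiki AND syntax', and the idea of a switchable default, can be illustrated with a small query parser. This is a simplified sketch and not Lucene's query syntax or parser: `parse` and `matches` are hypothetical helpers, only the upper-case words AND and OR are treated as operators (as the thread describes), and AND is assumed to bind tighter than OR.

```python
import re

def parse(query, default_operator="AND"):
    """Turn a query into a list of OR-ed groups, each a list of AND-ed terms.

    Adjacent terms with no explicit operator are joined by default_operator.
    """
    if default_operator not in ("AND", "OR"):
        raise ValueError("default_operator must be 'AND' or 'OR'")
    groups = [[]]
    pending = default_operator
    for token in query.split():
        if token in ("AND", "OR"):      # lower-case "and"/"or" are plain terms
            pending = token
            continue
        if pending == "OR" and groups[-1]:
            groups.append([])           # OR starts a new alternative
        groups[-1].append(token.lower())
        pending = default_operator      # operator applies to one join only
    return [group for group in groups if group]

def matches(query, text, default_operator="AND"):
    """True if the text satisfies at least one AND-group of the parsed query."""
    page_words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return any(all(term in page_words for term in group)
               for group in parse(query, default_operator))

print(parse("wiki syntax"))                # default AND: one group
print(parse("wiki syntax", "OR"))          # default OR: two alternatives
print(parse("wiki AND syntax", "OR"))      # explicit AND overrides the default
```

A "redo this search with OR" link would then amount to calling `matches` again with `default_operator="OR"` on the same query string.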

              Mike Cannon-Brookes
              Andy Armstrong
              Votes: 9
              Watchers: 5