JRASERVER-13884

TooManyClauses or OutOfMemoryException for AJAX Issue Picker thanks to PrefixQuery


Details

    Description

      The AJAX Issue Picker uses a PrefixQuery which, via rewrite(), is expanded into a BooleanQuery that ORs together every matching term read from the IndexReader.

      This consumes space that grows linearly with the size of the search index. For a case with about 94,000 issues, this appears to require approximately 900 MB.

      Insane? Perhaps, but it seems the Lucene people think this is fine; see the mailing list discussion below. The TooManyClauses exception is designed to signal exactly this kind of problem.
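
      The expansion described above can be modelled without Lucene. The following is a minimal sketch, not the actual PrefixQuery.rewrite() code: a sorted set stands in for the term dictionary, the terms are assumed to be issue keys under a hypothetical project key TEST, and TooManyClausesException is a hypothetical stand-in for BooleanQuery.TooManyClauses. It shows that the number of OR clauses equals the number of matching terms, so it grows with the index, and that the 1024 default is exceeded as soon as a large project's key is typed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative model only: plain Java collections stand in for the Lucene term dictionary.
class PrefixExpansionSketch {

    // Same value as the Lucene default quoted in this issue.
    static final int MAX_CLAUSES = 1024;

    // Hypothetical stand-in for BooleanQuery.TooManyClauses.
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int limit) {
            super("more than " + limit + " clauses");
        }
    }

    // Collects one OR clause per indexed term that starts with the prefix.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix; matching terms are contiguous in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Builds a term dictionary of issue keys TEST-1 .. TEST-n (hypothetical project key).
    static SortedSet<String> issueKeys(int n) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 1; i <= n; i++) {
            terms.add("TEST-" + i);
        }
        return terms;
    }

    public static void main(String[] args) {
        // With the limit lifted, the clause count equals the number of matching terms.
        for (int n : new int[] {1_000, 10_000, 100_000}) {
            int clauses = expand(issueKeys(n), "TEST-", Integer.MAX_VALUE).size();
            System.out.println(n + " issues -> " + clauses + " clauses");
        }
        // With the default limit, a large project fails once the key prefix is typed.
        try {
            expand(issueKeys(100_000), "TEST-", MAX_CLAUSES);
        } catch (TooManyClausesException e) {
            System.out.println("TooManyClauses: " + e.getMessage());
        }
    }
}
```

      Raising the limit in this model only moves the failure point: the clause list then grows to the full number of matching terms, which is the memory behaviour reported in this issue.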

      Support case that brought this up: JSP-16824

      The customer increased their maximum clause count, and this resulted in an OutOfMemoryException.

      Reproduction steps

      To reproduce the problem with the AJAX issue picker turned ON:

      1. Create a large number of issues in the same project (100,000 seems ample).
      2. Perform a search for all issues (press Enter in the quick search). This is necessary to seed the search space for the AJAX issue picker, since it uses the current query among other inputs.
      3. Go to any issue.
      4. Click the option to link the issue.
      5. Start typing the project key. An AJAX request is made on each keystroke, which should produce the stack trace below:
      2007-11-05 13:08:07,667 http-8080-Processor2 ERROR [bc.issue.search.AbstractIssuePickerSearchProvider] Error while executing search request
      org.apache.lucene.search.BooleanQuery$TooManyClauses
      	at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:184)
      	at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:175)
      	at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:52)
      	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:381)
      	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
      	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
      	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
      	at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:137)
      	at org.apache.lucene.search.Query.weight(Query.java:92)
      	at org.apache.lucene.search.Hits.<init>(Hits.java:41)
      	at org.apache.lucene.search.Searcher.search(Searcher.java:44)
      	at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.runSearch(LuceneSearchProvider.java:148)
      	at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHits(LuceneSearchProvider.java:76)
      	at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:208)
      	at com.atlassian.jira.bc.issue.search.AbstractIssuePickerSearchProvider.getResults(AbstractIssuePickerSearchProvider.java:81)
      	at com.atlassian.jira.bc.issue.search.DefaultIssuePickerSearchService.getResults(DefaultIssuePickerSearchService.java:59)
      	at com.atlassian.jira.web.dwr.AjaxIssuePicker.getIssues(AjaxIssuePicker.java:89)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:585)
      	at uk.ltd.getahead.dwr.impl.ExecuteQuery.execute(ExecuteQuery.java:248)
      	at uk.ltd.getahead.dwr.impl.DefaultExecProcessor.handle(DefaultExecProcessor.java:48)
      	at uk.ltd.getahead.dwr.impl.DefaultProcessor.handle(DefaultProcessor.java:81)
      	at uk.ltd.getahead.dwr.AbstractDWRServlet.doPost(AbstractDWRServlet.java:162)
      	at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
      	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.jira.web.filters.AccessLogFilter.doFilter(AccessLogFilter.java:73)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.seraph.filter.SecurityFilter.doFilter(SecurityFilter.java:182)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.seraph.filter.LoginFilter.doFilter(LoginFilter.java:181)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.jira.web.filters.ActionCleanupDelayFilter.doFilter(ActionCleanupDelayFilter.java:43)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.jira.web.filters.RequestCleanupFilter.doFilter(RequestCleanupFilter.java:49)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.johnson.filters.AbstractJohnsonFilter.doFilter(AbstractJohnsonFilter.java:72)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.jira.web.filters.gzip.GzipFilter.doFilter(GzipFilter.java:64)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.core.filters.AbstractEncodingFilter.doFilter(AbstractEncodingFilter.java:37)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at com.atlassian.jira.appconsistency.db.DatabaseCompatibilityEnforcerFilter.doFilter(DatabaseCompatibilityEnforcerFilter.java:39)
      	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
      	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
      	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
      	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
      	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
      	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
      	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
      	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
      	at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
      	at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
      	at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
      	at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
      	at java.lang.Thread.run(Thread.java:613)
      

      The solution options are alluded to in the snippet below.

      From http://mail-archives.apache.org/mod_mbox/lucene-java-user/200612.mbox/%3c359a92830612270753x3d0e3511g634b8e9c145d4034@mail.gmail.com%3e

      "Excerpt from the Lucene mailing list"

      Also, see the thread on this list titled "I just don't get wildcards at all"
      to see an extensive discussion of this issue, as well as wildcards in
      general. You might also search the archive for wildcards. The short form is
      that any wildcard (including prefix queries) expands under the covers to
      create a clause for each possible entry in the index for that field. For
      instance, say a field had the following values:

      abcd
      abck
      abt

      Searching for ab* would expand to searching for abcd, abck and abt under the
      covers. When the number of possibilities gets above the default value of
      1024, you see a TooManyClauses exception. Expanding the number of clauses
      may fix you right up, but on any reasonably sized index, you can come up
      with a query that'll exceed whatever number you set. Or you'll get to an
      unacceptable performance/memory footprint. Imagine your query with things
      like a*

      Think seriously about how you're going to deal with this. There are several
      options:
      1> use filters for all your wildcard clauses and create your own
      BooleanQuery. Be aware that using filters affects scoring.
      2> Assume that any query that throws a TooManyClauses exception (after
      you've set a suitable max as Paul suggested) is too broad to be useful and
      respond to the user with some polite phrase asking them to refine the query.
      3> Look over the SrndQuery classes. I don't fully understand these, but they
      certainly behave much differently in this area. Note that SrndQuery limits
      wildcards to having at least three non-wildcard characters.
      4> Ask whether stemming is a complete or partial solution. Ditto for
      Soundex. There's a good chance these won't apply, but they may.
      5> <Insert the solution to your specific problem here>

      This is a sticky wicket that will probably consume more time than you think
      to handle. It's easy for your product manager to claim that "Of course, we
      must support arbitrary wildcards", but I'd urge you to seriously ask what
      value arbitrary wildcards bring to the product. When you start getting
      thousands of responses to a query, is it actually valuable to return them to
      the user? Or do you give her just as much value (and deliver product sooner)
      by telling her up front that she's getting too many responses to be useful?
      With this last strategy, you just catch the TooManyClauses exception and
      respond with "refine your query".....

      Best
      Erick

      On 12/27/06, Paul Elschot <paul.elschot@xs4all.nl> wrote:
      >
      > Chris,
      >
      > On Wednesday 27 December 2006 15:42, Chris Salem wrote:
      > > Hi All,
      > >
      > > I'm getting a 'TooManyClauses' Exception and I'm not sure how to fix this. Here's a sample query that I'm using:
      > >
      > > +(+freeform_text:exhibit* +(+freeform_text:dispaly +freeform_text:event*) +(+freeform_text:sale* +freeform_text:sells +freeform_text:develop*) +(+freeform_text:trade +freeform_text:show +freeform_text:trade +freeform_text:shows)) +degree_type:5 +position_desired:ftp +city:washington~0.5 +state:dc +ncountry:usa +last_modified:[2005-12-26 TO 2006-12-26]
      > >
      > > Here's the exception I'm getting:
      > >
      > > org.apache.lucene.search.BooleanQuery$TooManyClauses
      > > at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:160)
      > > at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:151)
      > > at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:52)
      >
      > One of the prefix queries is causing this, possibly event* or sale*.
      > Since they seem to be specific enough, increasing the maximum number
      > of boolean clauses that can be added to a boolean query appears to be
      > the good way to fix this, see BooleanQuery.setMaxClauseCount().
      >
      > Regards,
      > Paul Elschot
      >
      > > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
      > > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
      > > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
      > > at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:137)
      > > at org.apache.lucene.search.Query.weight(Query.java:93)
      > > at org.apache.lucene.search.Hits.<init>(Hits.java:41)
      > > at org.apache.lucene.search.Searcher.search(Searcher.java:44)
      > > at org.apache.lucene.search.Searcher.search(Searcher.java:36)
      > > at net.mainsequence.pcr.lucene.LuceneHandler.multiSearch(LuceneHandler.java:382)
      > > at net.mainsequence.pcr.lucene.LuceneServlet.searchIndex(LuceneServlet.java:169)
      > > at net.mainsequence.pcr.lucene.LuceneServlet.processRequest(LuceneServlet.java:83)
      > > at net.mainsequence.pcr.lucene.LuceneServlet.doPost(LuceneServlet.java:72)
      > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
      > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
      > > at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
      > > at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      > > at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
      > > at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
      > > at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
      > > at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
      > > at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
      > > at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
      > > at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
      > > at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
      > > at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
      > > at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
      > > at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
      > > at java.lang.Thread.run(Unknown Source)
      > >
      > > Is there anyway to increase the amount of clauses lucene can take? This kind of large query is not uncommon so any help would be greatly appreciated.
      > >
      > >
      > > Chris Salem
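
      Options 2 and 3 from the excerpt above could be combined for the issue picker roughly as follows. This is a sketch under stated assumptions, not JIRA's implementation: the class, the constants, and the Result type are hypothetical, a sorted set again stands in for the term dictionary, and a match counter models what would in real code be a catch of BooleanQuery.TooManyClauses around the search call. The minimum prefix length mirrors the three-character rule mentioned for SrndQuery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative model only: a sorted set stands in for the index's term dictionary.
class GuardedPickerSketch {

    static final int MIN_PREFIX_LENGTH = 3;  // assumed threshold, as in option 3
    static final int MAX_MATCHES = 1024;     // assumed cap, same as the Lucene default

    // Either a list of matches, or a message asking the user to narrow the input.
    static final class Result {
        final List<String> matches;
        final String message; // null when the search produced results normally

        Result(List<String> matches, String message) {
            this.matches = matches;
            this.message = message;
        }
    }

    static Result search(SortedSet<String> terms, String typed) {
        // Reject very short prefixes before touching the index at all.
        if (typed.length() < MIN_PREFIX_LENGTH) {
            return new Result(new ArrayList<>(),
                    "Type at least " + MIN_PREFIX_LENGTH + " characters.");
        }
        List<String> matches = new ArrayList<>();
        for (String term : terms.tailSet(typed)) {
            if (!term.startsWith(typed)) {
                break;
            }
            // Stop early instead of building an unbounded clause list (option 2).
            if (matches.size() >= MAX_MATCHES) {
                return new Result(new ArrayList<>(),
                        "Too many matches; please refine your query.");
            }
            matches.add(term);
        }
        return new Result(matches, null);
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 1; i <= 100_000; i++) {
            terms.add("TEST-" + i); // hypothetical project key
        }
        System.out.println(search(terms, "TE").message);
        System.out.println(search(terms, "TEST-").message);
        System.out.println(search(terms, "TEST-9999").matches);
    }
}
```

      Note that the length check alone would not help in the reproduction above, because a project key such as TEST- is already longer than three characters; it is the cap that keeps memory bounded there.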

            People

              jed Jed Wesley-Smith (Inactive)
              chris@atlassian.com Chris Mountford
              Votes: 3
              Watchers: 5
