Details
-
Bug
-
Resolution: Duplicate
-
Medium
-
None
-
3.11
-
Verified in 3.10.2 standalone and 3.12DEV as at Mon Nov 5 13:02:04 EST 2007
-
3.11
-
4
-
Description
The AJAX Issue Picker uses a PrefixQuery which, via rewrite(), is expanded into a BooleanQuery that ORs together one clause for every matching term read from the IndexReader.
This consumes space that grows linearly with the size of the search index. For a case with about 94000 issues this appears to entail approx. 900 MB.
Insane? Perhaps, but it seems the Lucene people think this is fine. See the discussion from the mailing list below. TooManyClauses is designed to indicate a problem like this.
Support case that brought this up: JSP-16824
The customer increased their maximum clause count, and this resulted in an OutOfMemoryError.
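To make the expansion concrete, here is a minimal sketch in plain Java. It is not Lucene's code: the class PrefixExpansionSketch, the expand method and TooManyClausesException are hypothetical stand-ins for PrefixQuery.rewrite() and BooleanQuery$TooManyClauses, and a sorted set of strings stands in for the terms of one indexed field. It only illustrates why the clause count (and with it the memory) grows with the number of terms sharing the typed prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    /** Hypothetical stand-in for Lucene's BooleanQuery$TooManyClauses. */
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int maxClauses) {
            super("more than " + maxClauses + " clauses");
        }
    }

    /**
     * Builds one "clause" per indexed term that starts with the prefix and
     * fails as soon as the clause limit would be exceeded.
     */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous,
        // starting at the first term that is >= the prefix itself.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> small = new TreeSet<>(Arrays.asList("abcd", "abck", "abt", "xyz"));
        System.out.println(expand(small, "ab", 1024)); // prints [abcd, abck, abt]

        // Made-up issue keys; the project key "JRA" is only an example.
        SortedSet<String> issueKeys = new TreeSet<>();
        for (int i = 1; i <= 94000; i++) {
            issueKeys.add("JRA-" + i);
        }
        try {
            expand(issueKeys, "JRA", 1024); // every key matches the project-key prefix
        } catch (TooManyClausesException e) {
            System.out.println("TooManyClauses: " + e.getMessage());
        }
    }
}
```

Raising the limit passed as maxClauses makes the exception go away, but the list of clauses then simply grows to the number of matching terms, which is the behaviour the support case ran into.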
Reproduction steps
To reproduce the problem with the AJAX issue picker turned ON:
- create a large number of issues in the same project (100000 seems ample)
- perform a search for all issues (hit enter in the quicksearch). This is necessary to seed the search space for the AJAX issue picker, since it uses the current query plus some other stuff.
- go to any issue
- click to link the issue
- start typing the project key. An AJAX request is made on each keystroke, which should cause the stack trace below:
2007-11-05 13:08:07,667 http-8080-Processor2 ERROR [bc.issue.search.AbstractIssuePickerSearchProvider] Error while executing search request
org.apache.lucene.search.BooleanQuery$TooManyClauses
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:184)
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:175)
    at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:52)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:381)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:396)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:137)
    at org.apache.lucene.search.Query.weight(Query.java:92)
    at org.apache.lucene.search.Hits.<init>(Hits.java:41)
    at org.apache.lucene.search.Searcher.search(Searcher.java:44)
    at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.runSearch(LuceneSearchProvider.java:148)
    at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.getHits(LuceneSearchProvider.java:76)
    at com.atlassian.jira.issue.search.providers.LuceneSearchProvider.search(LuceneSearchProvider.java:208)
    at com.atlassian.jira.bc.issue.search.AbstractIssuePickerSearchProvider.getResults(AbstractIssuePickerSearchProvider.java:81)
    at com.atlassian.jira.bc.issue.search.DefaultIssuePickerSearchService.getResults(DefaultIssuePickerSearchService.java:59)
    at com.atlassian.jira.web.dwr.AjaxIssuePicker.getIssues(AjaxIssuePicker.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at uk.ltd.getahead.dwr.impl.ExecuteQuery.execute(ExecuteQuery.java:248)
    at uk.ltd.getahead.dwr.impl.DefaultExecProcessor.handle(DefaultExecProcessor.java:48)
    at uk.ltd.getahead.dwr.impl.DefaultProcessor.handle(DefaultProcessor.java:81)
    at uk.ltd.getahead.dwr.AbstractDWRServlet.doPost(AbstractDWRServlet.java:162)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.jira.web.filters.AccessLogFilter.doFilter(AccessLogFilter.java:73)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.seraph.filter.SecurityFilter.doFilter(SecurityFilter.java:182)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.seraph.filter.LoginFilter.doFilter(LoginFilter.java:181)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.jira.web.filters.ActionCleanupDelayFilter.doFilter(ActionCleanupDelayFilter.java:43)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.jira.web.filters.RequestCleanupFilter.doFilter(RequestCleanupFilter.java:49)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.johnson.filters.AbstractJohnsonFilter.doFilter(AbstractJohnsonFilter.java:72)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.jira.web.filters.gzip.GzipFilter.doFilter(GzipFilter.java:64)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.core.filters.AbstractEncodingFilter.doFilter(AbstractEncodingFilter.java:37)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at com.atlassian.jira.appconsistency.db.DatabaseCompatibilityEnforcerFilter.doFilter(DatabaseCompatibilityEnforcerFilter.java:39)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:613)
Possible solutions are outlined in the mailing-list snippet below.
Also, see the thread on this list titled "I just don't get wildcards at all"
to see an extensive discussion of this issue, as well as wildcards in
general. You might also search the archive for wildcards. The short form is
that any wildcard (including prefix queries) expands under the covers to
create a clause for each possible entry in the index for that field. For
instance, say a field had the following values:
abcd
abck
abt
Searching for ab* would expand to searching for abcd, abck and abt under the
covers. When the number of possibilities gets above the default value of
1024, you see a TooManyClauses exception. Expanding the number of clauses
may fix you right up, but on any reasonably sized index, you can come up
with a query that'll exceed whatever number you set. Or you'll get to an
unacceptable performance/memory footprint. Imagine your query with things
like a*
Think seriously about how you're going to deal with this. There are several
options:
1> use filters for all your wildcard clauses and create your own
BooleanQuery. Be aware that using filters affects scoring.
2> Assume that any query that throws a TooManyClauses exception (after
you've set a suitable max as Paul suggested) is too broad to be useful and
respond to the user with some polite phrase asking them to refine the query.
3> Look over the SrndQuery classes. I don't fully understand these, but they
certainly behave much differently in this area. Note that SrndQuery limits
wildcards to having at least three non-wildcard characters.
4> Ask whether stemming is a complete or partial solution. Ditto for
Soundex. There's a good chance these won't apply, but they may.
5> <Insert the solution to your specific problem here>
This is a sticky wicket that will probably consume more time than you think
to handle. It's easy for your product manager to claim that "Of course, we
must support arbitrary wildcards", but I'd urge you to seriously ask what
value arbitrary wildcards bring to the product. When you start getting
thousands of responses to a query, is it actually valuable to return them to
the user? Or do you give her just as much value (and deliver product sooner)
by telling her up front that she's getting too many responses to be useful?
With this last strategy, you just catch the TooManyClauses exception and
respond with "refine your query".....
Best
Erick
On 12/27/06, Paul Elschot <paul.elschot@xs4all.nl> wrote:
>
> Chris,
>
> On Wednesday 27 December 2006 15:42, Chris Salem wrote:
> > Hi All,
> >
> > I'm getting a 'TooManyClauses' Exception and I'm not sure how to fix
> this.
> Here's a sample query that I'm using:
> >
> > +(+freeform_text:exhibit* +(+freeform_text:dispaly
> +freeform_text:event*)
> +(+freeform_text:sale* +freeform_text:sells +freeform_text:develop*)
> +(+freeform_text:trade +freeform_text:show +freeform_text:trade
> +freeform_text:shows)) +degree_type:5 +position_desired:ftp
> +city:washington~0.5 +state:dc +ncountry:usa +last_modified:[2005-12-26 TO
> 2006-12-26]
> >
> > Here's the exception I'm getting:
> >
> > org.apache.lucene.search.BooleanQuery$TooManyClauses
> > at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:160)
> > at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:151)
> > at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:52)
>
> One of the prefix queries is causing this, possibly event* or sale*.
> Since they seem to be specific enough, increasing the maximum number
> of boolean clauses that can be added to a boolean query appears to be
> the good way to fix this, see BooleanQuery.setMaxClauseCount().
>
> Regards,
> Paul Elschot
>
> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:372)
> > at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java
> :137)
> > at org.apache.lucene.search.Query.weight(Query.java:93)
> > at org.apache.lucene.search.Hits.<init>(Hits.java:41)
> > at org.apache.lucene.search.Searcher.search(Searcher.java:44)
> > at org.apache.lucene.search.Searcher.search(Searcher.java:36)
> > at
> net.mainsequence.pcr.lucene.LuceneHandler.multiSearch(LuceneHandler.java
> :382)
> > at
> net.mainsequence.pcr.lucene.LuceneServlet.searchIndex(LuceneServlet.java
> :169)
> > at
> net.mainsequence.pcr.lucene.LuceneServlet.processRequest(
> LuceneServlet.java:83)
> > at net.mainsequence.pcr.lucene.LuceneServlet.doPost(LuceneServlet.java
> :72)
> > at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
> > at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
> ApplicationFilterChain.java:252)
> > at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(
> ApplicationFilterChain.java:173)
> > at
> org.apache.catalina.core.StandardWrapperValve.invoke(
> StandardWrapperValve.java:213)
> > at
> org.apache.catalina.core.StandardContextValve.invoke(
> StandardContextValve.java:178)
> > at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java
> :126)
> > at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java
> :105)
> > at
> org.apache.catalina.core.StandardEngineValve.invoke(
> StandardEngineValve.java:107)
> > at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java
> :148)
> > at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
> > at
>
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection
> (Http11BaseProtocol.java:664)
> > at
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(
> PoolTcpEndpoint.java:527)
> > at
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(
> LeaderFollowerWorkerThread.java:80)
> > at
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(
> ThreadPool.java:684)
> > at java.lang.Thread.run(Unknown Source)
> >
> > Is there any way to increase the number of clauses Lucene can take? This
> kind of large query is not uncommon so any help would be greatly
> appreciated.
> >
> >
> > Chris Salem
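For the issue picker, options 2 and 3 from Erick's message could be combined roughly as follows. This is only a sketch under stated assumptions, not the fix that was actually shipped: IssuePickerGuardSketch, PickerResult, search, pick and TooManyClausesException are hypothetical names, the search is simulated over a sorted set of made-up issue keys rather than a Lucene index, and the minimum prefix length of 3 simply mirrors the SrndQuery rule mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class IssuePickerGuardSketch {
    // Assumed values: 3 mirrors the SrndQuery rule from the thread,
    // 1024 is the default clause limit mentioned there.
    static final int MIN_PREFIX_LENGTH = 3;
    static final int MAX_CLAUSES = 1024;

    /** Hypothetical stand-in for Lucene's BooleanQuery$TooManyClauses. */
    static class TooManyClausesException extends RuntimeException {}

    /** Matches plus an optional user message; matches is empty whenever message is set. */
    static final class PickerResult {
        final List<String> matches;
        final String message;

        PickerResult(List<String> matches, String message) {
            this.matches = matches;
            this.message = message;
        }
    }

    /** Simulated search: one clause per term starting with the prefix, capped at MAX_CLAUSES. */
    static List<String> search(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= MAX_CLAUSES) {
                throw new TooManyClausesException();
            }
            clauses.add(term);
        }
        return clauses;
    }

    static PickerResult pick(SortedSet<String> terms, String typed) {
        if (typed.length() < MIN_PREFIX_LENGTH) {
            // Too short to be selective: do not run a query at all.
            return new PickerResult(Collections.<String>emptyList(), "Keep typing to see suggestions.");
        }
        try {
            return new PickerResult(search(terms, typed), null);
        } catch (TooManyClausesException e) {
            // Treat the exception as "query too broad" instead of raising the limit.
            return new PickerResult(Collections.<String>emptyList(), "Too many matches, please refine your query.");
        }
    }

    public static void main(String[] args) {
        SortedSet<String> keys = new TreeSet<>();
        for (int i = 1; i <= 2000; i++) {
            keys.add("JRA-" + i);
        }
        System.out.println(pick(keys, "JR").message);      // below the minimum length
        System.out.println(pick(keys, "JRA-").message);    // matches all 2000 keys, over the limit
        System.out.println(pick(keys, "JRA-199").matches); // narrow enough to return results
    }
}
```

The point of the design is that the limit stays low and the exception becomes a normal outcome that is reported to the user, so a single keystroke can never build an unbounded number of clauses.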
Issue Links
- is duplicated by: JRASERVER-17472 org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 65000 (Closed)
- is superseded by: JRASERVER-17623 IssuePicker ajax queries can use excessive memory (Closed)