JRASERVER-17332

Handle Lucene queries that have too many clauses in such a way that the user can be notified about it





      Searching for comments in JQL now has a potential problem because of the way the search is carried out:

      1. Run the comment search over the comment index
      2. Collect the issue ids for the matching documents
      3. Build up an issue query that will match any of the collected ids and then run it over the issue index.
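
The three steps above can be sketched in plain Java, with an in-memory list standing in for the comment index. This is only an illustration under stated assumptions: the class, the method names and the `issue_id` field name are hypothetical and are not taken from the JIRA code base, and negation is applied per comment, which simplifies the real `!~` semantics.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CommentSearchSketch {
    /** Hypothetical comment document: the owning issue id and the comment body. */
    public static class Comment {
        public final long issueId;
        public final String body;

        public Comment(long issueId, String body) {
            this.issueId = issueId;
            this.body = body;
        }
    }

    /** Steps 1 and 2: run the comment search and collect the ids of the matching issues. */
    public static Set<Long> collectIssueIds(List<Comment> commentIndex, String word, boolean negate) {
        Set<Long> ids = new LinkedHashSet<>();
        for (Comment c : commentIndex) {
            boolean contains = c.body.contains(word);
            // With negate == true this mimics a "!~" search: keep comments that lack the word.
            if (contains != negate) {
                ids.add(c.issueId);
            }
        }
        return ids;
    }

    /** Step 3: build one clause per collected id; the result grows linearly with the matches. */
    public static List<String> buildIssueQuery(Set<Long> issueIds) {
        List<String> clauses = new ArrayList<>();
        for (long id : issueIds) {
            clauses.add("issue_id:" + id);
        }
        return clauses;
    }
}
```

The point of the sketch is that the size of the issue query is driven directly by how many issues the comment search matched, so a generic comment query produces a very large issue query.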

      If the comment query is too generic (e.g. comment !~ "awordthatwontexist"), it is likely to match many issue ids on a large JIRA instance, which in turn produces a large query against the issue index. The maximum number of terms a query can contain is currently set to 65,000, so on an instance with more issues than that, a query that exceeds the limit may not be a rare occurrence. Besides the number of issues, other factors influence the number of terms used, notably the complexity of the user's permissions filter, which depends on the number of projects, issue security schemes, groups the user is a member of, and so on.

      What actually happens is this: when a BooleanQuery under construction exceeds the maximum number of terms, a runtime exception (TooManyClauses) is thrown. This exception is not currently caught anywhere in JIRA code.
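
One way to trap the failure described above is to catch the exception at the point where the issue query is built and turn it into a message for the user. The sketch below is a hedged illustration only: `TooManyClausesException` and `BoundedQuery` are hypothetical stand-ins for Lucene's TooManyClauses exception and BooleanQuery, so that the example runs without the Lucene library, and the message wording is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class TooManyClausesHandling {
    /** Stand-in for Lucene's unchecked TooManyClauses exception (hypothetical class). */
    public static class TooManyClausesException extends RuntimeException {
        public final int maxClauses;

        public TooManyClausesException(int maxClauses) {
            super("maxClauseCount is set to " + maxClauses);
            this.maxClauses = maxClauses;
        }
    }

    /** Stand-in for a boolean query that refuses to grow past a fixed clause limit. */
    public static class BoundedQuery {
        private final int maxClauses;
        private final List<String> clauses = new ArrayList<>();

        public BoundedQuery(int maxClauses) {
            this.maxClauses = maxClauses;
        }

        public void add(String clause) {
            // Adding a clause beyond the limit fails, as described in the ticket.
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(clause);
        }

        public int size() {
            return clauses.size();
        }
    }

    /** Returns an empty Optional on success, or a user-facing error message on overflow. */
    public static Optional<String> tryBuild(BoundedQuery query, Iterable<Long> issueIds) {
        try {
            for (long id : issueIds) {
                query.add("issue_id:" + id);
            }
            return Optional.empty();
        } catch (TooManyClausesException e) {
            return Optional.of("The search matched too many issues (limit " + e.maxClauses
                    + " terms). Please narrow the query.");
        }
    }
}
```

Returning the message as a value rather than rethrowing keeps the runtime exception from escaping to callers that do not expect it; whether JIRA would prefer an error collection or a dedicated checked exception is left open here.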

      We suspect that this may happen more often now through careless use of the comment search, but it could also occur for other reasons, for example a JQL function that generates too many terms to search for. Ideally, we want a way to report back to the user how this happened, so that they can modify their query.

      It would be best to handle this in a standard way, preferably in a single place, across all ClauseQueryFactory implementations. However, that would only cover the query generated for each individual terminal clause, so we would also need handling at the point where the clauses are combined. By that stage, though, we have lost the insight into which clause caused the problem (or contributed most to the excess of terms). The same problem arises when, after combining all the queries, we add the permissions filter for the user.
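
One possible way to keep that insight is to record how many terms each terminal clause and the permissions filter contribute, so that the largest contributor can still be named after everything has been combined. The class below is a minimal sketch of that idea, not a proposal for the actual design: `ClauseBudget` and its methods are hypothetical names, and the term counts in the usage example are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClauseBudget {
    private final int maxTerms;
    // Terms contributed per source (a JQL clause or the permissions filter), in insertion order.
    private final Map<String, Integer> termsBySource = new LinkedHashMap<>();

    public ClauseBudget(int maxTerms) {
        this.maxTerms = maxTerms;
    }

    /** Records that the given source contributed this many terms to the combined query. */
    public void add(String source, int terms) {
        termsBySource.merge(source, terms, Integer::sum);
    }

    /** Total number of terms across all recorded sources. */
    public int total() {
        int sum = 0;
        for (int terms : termsBySource.values()) {
            sum += terms;
        }
        return sum;
    }

    /** True when the combined query would not fit within the configured limit. */
    public boolean exceeded() {
        return total() > maxTerms;
    }

    /** Name of the source with the most terms, or null if nothing was recorded. */
    public String largestContributor() {
        String largest = null;
        int most = -1;
        for (Map.Entry<String, Integer> e : termsBySource.entrySet()) {
            if (e.getValue() > most) {
                most = e.getValue();
                largest = e.getKey();
            }
        }
        return largest;
    }
}
```

For example, with a limit of 65,000, recording 64,000 terms for a comment clause and 1,500 for the permissions filter makes `exceeded()` return true and `largestContributor()` return the comment clause, which is the information the user would need in order to adjust the query.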




            mtokar Michael Tokar



                Time Tracking

                Original Estimate - 18h
                Remaining Estimate - 0h
                Time Spent - 14h