
Type: Suggestion
Resolution: Fixed
Fix Version/s: 4.4.5
Component/s: JQL
Labels:

NOTE: This suggestion is for JIRA Server. Using JIRA Cloud? See the corresponding suggestion.

Summary
IssueNavigator calls every instance of a JQL function's getValues() three times:

to validate the query
to get the query's issues
to see if the query fits in the simple filter form [getQueryContext()]

If the JQL function has to inspect a lot of data or returns a lot of data, that cost is paid three times over, which leads to unexpectedly severe performance issues.

Detailed Explanation
As some Atlassians know, we wedged Mike Cannon-Brookes's prototype of advanced search into our Jira 3.13 and enhanced it to search the change history and all the comment fields.
I've been working to adapt what we did to work in Jira 4.

In Jira 3, this required a heavy-handed modification of the LuceneQueryCreator so that a Lucene Filter on the comment index was properly or'ed or and'ed or not'ed against the issue index. It was messy but it worked. The performance was OK as long as the user didn't have too many queries against the comment index as sub-queries.

For Jira 4, I figured that using a JQL function to search the comment index and then return JiraDataType.ISSUE was a cleaner way to do what we had done by hacking LuceneQueryCreator in Jira 3. The JQL function searches the comment index and returns a list of issue ids. The rest would be handled by Jira 4's great new re-worked search system, and I could encapsulate everything in a plugin.

What I did works, but it has horrible performance.
IssueNavigator calls every instance of the function's getValues() three times:

to validate the query
to get the query's issues
to see if the query fits in the simple filter form [getQueryContext()]

This means that my comment JQL function has to search the comment index 3 times, loop 3 times through each hit on the comment index and convert it to an issue id, and then the system re-queries the issue index using those issue ids in a new search 3 times.
In our Jira 3 implementation, the comment index was searched once, converted to a Lucene Filter, and then filtered once against the issue index using Lucene. Here are averages of a couple of comparisons that I ran on the same machine running both Jira 3 and Jira 4:

issues	Jira 3	Jira 4
500	5 seconds	5 seconds
30000	23 seconds	190 seconds
100000	53 seconds	10 minutes

(I also noticed that if an issue in the results is for some reason invalid – for example if it fell victim to Bulk Edit woes – then after the user has waited minutes to get their slow running results, instead of getting all their issues with a warning note about the invalid issues, they get zero results and an error message which doesn't indicate what went wrong. See invalid-issue.png. I've filed a separate bug on this, ~~JRA-22277~~.)

I believe JQL needs several improvements:

getValues() should be called only once per function, not 3 times. Perhaps wrap the call to the function in an object that holds onto the values while it gets passed from validation to query to context analysis. Another idea: add an isTooComplexForSimpleSearch() method to the JqlFunction interface, which would remove the need to run the query for most functions.
Returning issue ids from JqlFunction.getValues() is almost always going to lead to poor performance, because whatever the function does to discover issue ids is going to be expensive. Perhaps it should be deprecated. Update: I also have a concern about the user functions. For example, I wrote another one that returns all the users that are no longer active, so that project leads can re-assign issues that are assigned to employees who no longer work at our company. This function has to loop through all the users and find which ones no longer belong to any groups. The JQL framework also runs this function three times per function call.
~~Perhaps the better alternative would be to have a JiraDataType.FILTER so that getValues would return a Lucene Filter.~~ After analyzing the issue further myself, I realize that this isn't a valid suggestion.
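
The first suggestion above (hold onto the values while they pass from validation to query to context analysis) could be sketched roughly as follows. This is only an illustration in plain Java: MemoizedValues and the demo class are hypothetical names, not part of the Jira API, and the supplier stands in for whatever expensive work a real getValues() does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical holder: runs the expensive lookup once and reuses the result. */
final class MemoizedValues<T> {
    private final Supplier<List<T>> loader; // stand-in for the function's expensive work
    private List<T> values;                 // null until the first call to get()

    MemoizedValues(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    /** Returns the cached values, computing them on first use only. */
    synchronized List<T> get() {
        if (values == null) {
            values = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        }
        return values;
    }
}

class MemoizedValuesDemo {
    public static void main(String[] args) {
        int[] lookups = {0};
        MemoizedValues<Long> holder = new MemoizedValues<>(() -> {
            lookups[0]++; // count how often the expensive part runs
            return Arrays.asList(10001L, 10002L);
        });

        holder.get(); // phase 1: validation
        holder.get(); // phase 2: running the query
        holder.get(); // phase 3: query context analysis

        System.out.println("expensive lookups: " + lookups[0]); // prints "expensive lookups: 1"
    }
}
```

The same holder would be handed to all three phases of one search request, so nothing outlives the request and no expiry logic is needed.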

Update, September 15
After studying this for a couple of days, I was able to speed up our system with the following changes:

To stop it from looping through thousands of issues and constructing an issue for each before the query executes, I commented out the validation code in IssueIdValidator
I modified IssueIdQueryFactory to construct an IssueIdFilter wrapped by a ConstantScoreQuery rather than a BooleanQuery with a sub-query for each issue id. This avoids the Too Many Clauses error if the JQL function hits more than 32000 issues. This is similar to the solution that I proposed in ~~JRA-22453~~. See code below.
I modified MultiClauseDecoratorContextFactory so that it only loops through the first 100 results during getQueryContext().
To work around but not solve the problem of IssueNavigator calling each instance of a JQL function 3 times, I put a cache in my JQL issue-type functions that holds onto each instance's results for a few seconds. This consumes unnecessary memory but it's the best that I can think of short of re-writing the entire JQL call framework to hold onto the results and then pass them through validate(), executeQuery(), and getQueryContext() rather than having validate(), executeQuery(), and getQueryContext() each separately calling jqlIssueFunction.getValues().
To avoid unnecessarily calling isCurrentQueryTooComplex(), I hacked our IssueNavigator.isAdvanced() to use a regular expression that recognizes a JQL string containing one of our nasty custom JQL issue-type functions.
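
The short-lived cache described in the list above might look something like this minimal sketch. It is not the code from our plugin: ShortLivedResultCache, its constructor parameters, and the injected clock are hypothetical names chosen for illustration, and the key is assumed to identify one function call (for example the function name plus its arguments).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical cache that keeps each result for a few seconds only. */
final class ShortLivedResultCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the demo can control time

    ShortLivedResultCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value for key, or loads and caches it if absent or expired. */
    V get(K key, Supplier<V> loader) {
        long now = clock.getAsLong();
        // Drop expired entries first so stale results do not pile up in memory.
        entries.values().removeIf(e -> e.expiresAtMillis <= now);
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            // Two threads may both load here; that costs time but stays correct.
            entry = new Entry<>(loader.get(), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}

class ShortLivedResultCacheDemo {
    public static void main(String[] args) {
        long[] now = {0L};
        int[] loads = {0};
        ShortLivedResultCache<String, List<Long>> cache =
                new ShortLivedResultCache<>(5000L, () -> now[0]);
        Supplier<List<Long>> search = () -> {
            loads[0]++; // stand-in for searching the comment index
            return Arrays.asList(10001L, 10002L);
        };

        cache.get("myFunction(foo)", search); // validate
        cache.get("myFunction(foo)", search); // execute query
        cache.get("myFunction(foo)", search); // query context
        System.out.println("loads within TTL: " + loads[0]); // prints "loads within TTL: 1"

        now[0] = 6000L; // pretend six seconds have passed
        cache.get("myFunction(foo)", search);
        System.out.println("loads after expiry: " + loads[0]); // prints "loads after expiry: 2"
    }
}
```

A few seconds is long enough to cover the three calls made for one search, and short enough that users should rarely see stale results.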

With these improvements, our Jira 4 now runs faster than Jira 3! Yay!
New results

issues	Jira 3	Jira 4
500	5 seconds	4 seconds
30000	23 seconds	8 seconds
100000	53 seconds	23 seconds

(Digression – another improvement request that I contemplate – get rid of the issue index and index everything, including change history, inside the comment index)

IssueIdQueryFactory.java

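    // Builds the query that matches issues whose id is one of rawValues.
    private Query createPositiveEqualsQuery(final List<QueryLiteral> rawValues)
    {
        if (rawValues.size() == 1)
        {
            // A single value still goes through the ordinary single-issue query.
            return createQuery(rawValues.get(0));
        }
        else
        {
            // Many values: one filter wrapped in a ConstantScoreQuery, instead of a
            // BooleanQuery with one clause per issue id (avoids Too Many Clauses).
            return new ConstantScoreQuery(new IssueIdFilter(rawValues));
        }
    }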
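
IssueIdFilter itself is not shown here. As a rough plain-Java analogy only (no Lucene classes are used, and IssueIdBitFilter, matchingDocs, and the issueIdByDoc array are hypothetical names), the idea of testing every document against a set of ids in one pass, rather than building one query clause per id, might be sketched like this:

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java analogy of a set-membership filter; this is not Lucene code. */
final class IssueIdBitFilter {
    private final Set<Long> wantedIssueIds;

    IssueIdBitFilter(List<Long> wantedIssueIds) {
        this.wantedIssueIds = new HashSet<>(wantedIssueIds);
    }

    /**
     * issueIdByDoc[i] is assumed to hold the issue id stored in document i of a
     * hypothetical index. The result has bit i set when document i matches.
     */
    BitSet matchingDocs(long[] issueIdByDoc) {
        BitSet bits = new BitSet(issueIdByDoc.length);
        for (int doc = 0; doc < issueIdByDoc.length; doc++) {
            if (wantedIssueIds.contains(issueIdByDoc[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }
}

class IssueIdBitFilterDemo {
    public static void main(String[] args) {
        long[] issueIdByDoc = {5L, 7L, 9L, 11L};
        IssueIdBitFilter filter = new IssueIdBitFilter(Arrays.asList(7L, 11L, 99L));
        System.out.println(filter.matchingDocs(issueIdByDoc)); // prints {1, 3}
    }
}
```

The work grows with the number of documents plus the number of ids, and there is no per-id clause to count against a clause limit.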


invalid-issue.png
84 kB
15/Sep/2010 4:55 PM

is related to

JRASERVER-22277 Allow valid issues to pass through IssueIdValidator (Closed)
JRASERVER-22453 Regression: Comment searching causes too many clauses error (Closed)
JRASERVER-22448 Comment Query Performance: Please Change CommentClauseQueryFactory.generateIssueIdFromCommentQuery from for loop to HitCollector (Closed)
JRASERVER-26585 JQL Function executes even when validation fails! (Closed)

relates to

JRACLOUD-22256 Performance Issue with JQL functions (Closed)
