Investigate using Lucene for sorting results

    • Type: Suggestion
    • Resolution: Fixed
    • 3.2 Beta
    • Component/s: None
    • None

      <from JIRA-user mailing list post>

      I'd like to make two improvements to the way Jira handles searching
      for issues:

      • Currently, when a search is performed, the entire set of found
        issues is loaded into memory. In some cases this might be needed,
        but when a user is doing a search, only the issues for the first
        "frame" of hits need to be loaded - the next can be loaded when
        (if) the user clicks to the next page of results.
      • Lucene 1.4 now includes sorting code, so the sorting can
        be done efficiently inside Lucene, instead of afterwards in the
        application.

      Changing both of these will dramatically speed up the response time
      of doing issue searches. For us, it would mean at least a ten-fold
      improvement.
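
      A minimal sketch of the first point, assuming the search has already
      produced an ordered list of issue ids (for example, sorted inside
      Lucene). PagedIssueLoader, loadPage and the loader callback are
      hypothetical names for illustration, not existing Jira classes. Only
      the ids that fall on the requested page are passed to the loader, so
      issues on later pages are never loaded into memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

/** Hypothetical helper: loads full issues only for the page being displayed. */
final class PagedIssueLoader {

    private PagedIssueLoader() {
    }

    /**
     * Loads the issues for a single page of search hits.
     *
     * @param hitIds    issue ids in the order the search returned them
     * @param pageIndex zero-based page number
     * @param pageSize  number of issues per page, must be positive
     * @param loader    loads one issue by id (assumed: database or issue cache)
     */
    static <T> List<T> loadPage(List<Long> hitIds, int pageIndex, int pageSize,
                                LongFunction<T> loader) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long start = (long) pageIndex * pageSize; // long arithmetic avoids int overflow
        if (start >= hitIds.size()) {
            return new ArrayList<>(); // past the last page: nothing to load
        }
        int from = (int) start;
        int to = (int) Math.min(hitIds.size(), start + pageSize);
        List<T> page = new ArrayList<>(to - from);
        for (Long id : hitIds.subList(from, to)) {
            page.add(loader.apply(id)); // the loader runs only for ids on this page
        }
        return page;
    }
}
```

      For example, with five hits and a page size of two, requesting page 1
      calls the loader for the third and fourth ids only.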

      Looking into doing this, I've got some questions about the Lucene
      code in Jira:

      Why is there a comment index and an issue index? It would seem more
      efficient to have only one index, where the information needed to
      find issues by comments is in the issue index. Unless I'm not seeing
      it, there don't appear to be any searches specifically interested in
      the comment records themselves - only the issues that the comments
      are in.

      Is this correct?

      Most of the values indexed in Lucene are stored as
      Field.Keyword or Field.Text, which, besides indexing the field,
      also store the value in the index. Then, when a hit is retrieved from
      the index, those values are retrieved (meaning all the text for
      the Field is returned). It does not appear, however, that any of
      those retrieved values are ever used - all searches seem to go
      through DefaultSearchProvider, which only uses a single field,
      DocumentConstants.ISSUE_ID, and throws the rest away. The ISSUE_ID
      is used to re-retrieve the information out of the database (or
      out of the issue cache, if it is there).

      Is this also correct?

      It seems like what needs to happen is for there to be some kind
      of "IssueSearchResult" object which contains the values for found
      issues, and for that object to be used when generating the JSP
      instead of a GenericEntity. That way, the IssueSearchResult object
      could get its values straight from Lucene, bypassing the database.

      Does this seem a reasonable approach?
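
      A rough sketch of what such an object might look like. The stored
      field values of a hit are modelled here as a plain Map so the example
      stands alone; in practice they would presumably be read from the hit's
      Lucene Document. The field names ("issue_id", "key", "summary") and
      the accessor methods are assumptions for illustration, not the actual
      index keys or Jira API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-only view of one search hit, backed by stored index values. */
final class IssueSearchResult {

    // Assumed field names; the real index keys may differ.
    static final String ISSUE_ID = "issue_id";
    static final String KEY = "key";
    static final String SUMMARY = "summary";

    private final Map<String, String> stored;

    IssueSearchResult(Map<String, String> storedFields) {
        if (storedFields.get(ISSUE_ID) == null) {
            throw new IllegalArgumentException("hit has no stored " + ISSUE_ID);
        }
        // Defensive copy: later changes to the caller's map do not affect this result.
        this.stored = Collections.unmodifiableMap(new HashMap<>(storedFields));
    }

    /** Issue id parsed from the stored value; no database lookup is involved. */
    long getId() {
        return Long.parseLong(stored.get(ISSUE_ID));
    }

    String getKey() {
        return valueOrEmpty(KEY);
    }

    String getSummary() {
        return valueOrEmpty(SUMMARY);
    }

    /** Returns the stored value, or an empty string if the field was not stored. */
    private String valueOrEmpty(String name) {
        String value = stored.get(name);
        return value == null ? "" : value;
    }
}
```

      The JSP would then read getKey() and getSummary() from this object
      rather than from a GenericEntity. Any field the page displays would
      have to be stored in the index for this to work.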

      Thanks -

      Tim

              Assignee:
              Scott Farquhar
              Reporter:
              Scott Farquhar
              Votes:
              0
              Watchers:
              0

                Created:
                Updated:
                Resolved: