JRASERVER-29487

Remove hardcoded OptimizeIndexesTrigger from scheduler config

    • NOTE: This suggestion is for JIRA Server.

      We have this hardcoded in the scheduler configuration to run at midnight every day:

      <trigger name="OptimizeIndexesTrigger" job="OptimizeIndexes" type="cron">
          <expression>0 0 0 * * ?</expression><!-- run at midnight -->
      </trigger>
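
For readers unfamiliar with the expression format, here is a small sketch that labels each field of the expression. It assumes the scheduler uses Quartz-style cron expressions (seconds first, `?` meaning "no specific value"), which the trailing `?` suggests but the ticket does not state. `QuartzCronFields` is a name made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: labels the fields of a Quartz-style cron expression. */
final class QuartzCronFields {
    // Field order assumed from the Quartz cron format.
    private static final String[] NAMES = {
        "seconds", "minutes", "hours", "day-of-month", "month", "day-of-week"
    };

    static Map<String, String> describe(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("expected 6 or 7 fields: " + expression);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        if (parts.length == 7) {
            fields.put("year", parts[6]); // optional seventh field
        }
        return fields;
    }

    public static void main(String[] args) {
        // Prints each field name with its value for the trigger above.
        describe("0 0 0 * * ?").forEach((name, value) ->
            System.out.println(name + " = " + value));
    }
}
```

Read this way, the trigger fires at second 0, minute 0, hour 0 of every day of every month, which matches the "run at midnight" comment.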
      

      TL;DR

      That needs to go away.

      Detailed explanation for the curious

      During normal operation, Lucene deletes a document by marking it as deleted and leaving it in its segment. This is a performance trade-off: writes are much faster, but the deleted documents waste disk space and eventually slow down searches, which have to skip over them.

      In older versions of Lucene, such as those used prior to JIRA 5.0, this was dealt with by explicitly optimizing the index, which rebuilds all of the segments without the deleted documents. To keep the indexes in reasonably good shape, we added the scheduled job shown above.
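
As a rough illustration of the two paragraphs above, here is a toy Java model of a segment in which a delete only sets a flag and an explicit optimize rewrites the segment without the flagged documents. `ToySegment` and its methods are invented for this sketch and are not Lucene classes. Real Lucene segments are far more involved.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy stand-in for an index segment; not Lucene's actual data structures. */
final class ToySegment {
    private final List<String> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet();

    /** Appends a document and returns its id within this segment. */
    int add(String doc) {
        docs.add(doc);
        return docs.size() - 1;
    }

    /** Deleting only flips a bit; the document stays in the segment. */
    void delete(int docId) {
        deleted.set(docId);
    }

    /** Searches still visit deleted slots, if only to skip them. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (deleted.get(i)) {
                continue; // wasted work that grows with the number of deletes
            }
            if (docs.get(i).contains(term)) {
                hits.add(docs.get(i));
            }
        }
        return hits;
    }

    int storedDocs() { return docs.size(); }

    int liveDocs() { return docs.size() - deleted.cardinality(); }

    /** "Optimizing" copies every live document into a fresh segment. */
    ToySegment optimized() {
        ToySegment fresh = new ToySegment();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                fresh.add(docs.get(i));
            }
        }
        return fresh;
    }
}
```

In this model `delete` is constant-time while `optimized` touches every live document, which is the trade-off the scheduled job was meant to manage.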

      In JIRA 5.0, we adopted Lucene 3.2 and its new TieredMergePolicy, which tries to distribute this work more reasonably over time: it notices when deleting a document puts a segment over some threshold (say 10% deleted documents) and optimizes that single segment immediately. Since this happens automatically, there is much less reason to explicitly request a global optimization of the entire index. For large customers in particular, the global optimization can be a very expensive operation. If people happen to be working at midnight server time, write operations stall for up to 30 seconds waiting for the index optimization to get out of the way, which can lead to request timeouts.
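
To make the cost difference concrete, the following is a simplified sketch of threshold-triggered, per-segment cleanup next to a global optimize. It is not Lucene's actual TieredMergePolicy algorithm. The class `ToyTieredIndex`, its fields, and the segment sizes are hypothetical, and the 10% threshold is simply the "say 10%" figure from the paragraph above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-segment cleanup; not Lucene's TieredMergePolicy. */
final class ToyTieredIndex {
    // Illustrative threshold taken from the "say 10%" figure in the text.
    static final int MAX_DELETED_PERCENT = 10;

    static final class Segment {
        int stored;   // documents physically present, including deleted ones
        int deleted;  // documents marked as deleted
        Segment(int stored) { this.stored = stored; }
    }

    final List<Segment> segments = new ArrayList<>();
    long docsRewritten = 0; // rough proxy for I/O cost

    void addSegment(int docCount) {
        segments.add(new Segment(docCount));
    }

    /** Marks one document deleted; rewrites only this segment if it crosses the threshold. */
    void delete(int segmentIndex) {
        Segment s = segments.get(segmentIndex);
        if (s.deleted < s.stored) {
            s.deleted++;
        }
        if (s.deleted * 100 > s.stored * MAX_DELETED_PERCENT) {
            rewrite(s);
        }
    }

    /** Global optimize: rewrites every segment regardless of how dirty it is. */
    void optimizeAll() {
        for (Segment s : segments) {
            rewrite(s);
        }
    }

    private void rewrite(Segment s) {
        int live = s.stored - s.deleted;
        docsRewritten += live;
        s.stored = live;
        s.deleted = 0;
    }
}
```

With one 100-document segment and one 1000-document segment, the eleventh delete in the small segment rewrites only its 89 live documents, whereas `optimizeAll` also copies the untouched 1000-document segment. In this model, that second kind of cost is what the midnight job incurs on every run.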

      In some cases, we have even seen this lead to inconsistent state between the database and the Lucene indexes because the write requests for the index get dropped. This was allegedly fixed in JRA-14220, but a recent support case indicates that the fix may not adequately address the problem.

      In short, the optimize operation should be superfluous at this point, and the long delays it causes may be creating additional problems, so it should just go away.

              cfuller crf