Details

Type: Suggestion
Resolution: Fixed
Component/s: None
Labels:
- affects-server
- enterprise

Description

NOTE: This suggestion is for JIRA Cloud. Using JIRA Server? See the corresponding suggestion.

We have this hardcoded in the scheduler configuration to run at midnight every day:

<trigger name="OptimizeIndexesTrigger" job="OptimizeIndexes" type="cron">
    <expression>0 0 0 * * ?</expression><!-- run at midnight -->
</trigger>
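
For readers unfamiliar with the expression format, here is a minimal sketch that labels each field. It assumes the scheduler uses Quartz-style six-field cron syntax (seconds first, `?` meaning "no specific value"), which is what the expression above looks like. `CronFields` is a hypothetical helper written for illustration, not part of JIRA or Quartz, and it does not handle the optional seventh (year) field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of JIRA or Quartz.
// Labels the fields of a six-field, Quartz-style cron expression.
class CronFields {
    static final String[] NAMES = {
        "seconds", "minutes", "hours", "dayOfMonth", "month", "dayOfWeek"
    };

    static Map<String, String> label(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length != NAMES.length) {
            throw new IllegalArgumentException(
                "expected 6 fields, got " + parts.length);
        }
        // LinkedHashMap keeps the fields in their original order.
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {seconds=0, minutes=0, hours=0, dayOfMonth=*, month=*, dayOfWeek=?}
        System.out.println(label("0 0 0 * * ?"));
    }
}
```

Read this way, the trigger fires at second 0, minute 0, hour 0 of every day, in the server's time zone.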

TL;DR

That needs to go away.

Detailed explanation for the curious

During normal operation, Lucene deletes a document by simply marking it as deleted and leaving it in the segment. This is a performance trade-off: it makes the write much faster, but it wastes disk space and would eventually slow down searches, which have to skip over the deleted documents.

In older versions of Lucene, such as those used prior to JIRA 5.0, this was dealt with by explicitly optimizing the index, which rebuilds all of the segments without the deleted documents. To keep the indexes in reasonably good shape, we added the scheduled job shown above.
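
The trade-off can be illustrated with a toy model. This is only a sketch: `ToySegment` and its methods are hypothetical names invented here and are not Lucene's API or its actual on-disk format. It shows a delete that just flips a bit, a search that must skip the marked slots, and an "optimize" that rebuilds the segment from the live documents.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model for illustration only; hypothetical names, not Lucene's API.
class ToySegment {
    final List<String> docs = new ArrayList<>();
    final BitSet deleted = new BitSet();

    void add(String doc) {
        docs.add(doc);
    }

    // Fast delete: mark the slot and leave the document where it is.
    void delete(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IndexOutOfBoundsException("no such document: " + docId);
        }
        deleted.set(docId);
    }

    int liveCount() {
        return docs.size() - deleted.cardinality();
    }

    // A search still visits every slot and has to skip the deleted ones.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).contains(term)) {
                hits.add(docs.get(i));
            }
        }
        return hits;
    }

    // "Optimize": rebuild the segment so it holds only live documents.
    ToySegment optimize() {
        ToySegment rebuilt = new ToySegment();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                rebuilt.add(docs.get(i));
            }
        }
        return rebuilt;
    }
}
```

In this model, deleting is a constant-time bit flip, while `optimize()` copies every live document, which is why doing it for an entire large index in one go is expensive.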

In JIRA 5.0, we adopted Lucene 3.2 and its new TieredMergePolicy, which tries to spread this work more evenly over time: when deleting a document pushes a segment over some threshold (say, 10% deleted documents), it optimizes that single segment immediately. Since this happens automatically, there is much less reason to explicitly request a global optimization of the entire index. For large customers in particular, that global optimization can be very expensive. If people happen to be working at midnight server time, write operations can stall for up to 30 seconds waiting for the optimization to get out of the way, which leads to request timeouts.
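
As a rough sketch of the threshold idea described above, the toy model below rewrites a single segment as soon as its share of deleted documents exceeds a configurable ratio. `ToyIndex`, `Segment`, and `deleteFrom` are hypothetical names used only for illustration; this is not how TieredMergePolicy is implemented or configured, and the 10% figure is just the example value from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; hypothetical names, not Lucene's
// TieredMergePolicy API or its real selection logic.
class ToyIndex {
    static final class Segment {
        int total;    // document slots in the segment, deleted ones included
        int deleted;  // slots currently marked as deleted

        Segment(int total) {
            this.total = total;
        }

        double deletedRatio() {
            return total == 0 ? 0.0 : (double) deleted / total;
        }
    }

    final List<Segment> segments = new ArrayList<>();
    final double threshold;  // e.g. 0.10 for the "say, 10%" example
    int rewrites = 0;        // number of single-segment rewrites so far

    ToyIndex(double threshold) {
        this.threshold = threshold;
    }

    void addSegment(int docs) {
        segments.add(new Segment(docs));
    }

    // Mark one document as deleted. If that pushes this segment over the
    // threshold, rewrite this segment alone; other segments are untouched.
    void deleteFrom(int segmentIndex) {
        Segment s = segments.get(segmentIndex);
        if (s.deleted >= s.total) {
            throw new IllegalStateException("segment has no live documents");
        }
        s.deleted++;
        if (s.deletedRatio() > threshold) {
            s.total -= s.deleted;  // drop the deleted slots
            s.deleted = 0;
            rewrites++;
        }
    }
}
```

The point of the model is that cleanup cost is paid in small, per-segment pieces as deletes accumulate, rather than in one index-wide pass at a fixed time of day.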

In some cases, we have even seen this lead to inconsistent state between the database and the Lucene indexes, because the write requests for the index get dropped. This was allegedly fixed in JRA-14220, but a recent support case indicates that the fix may not adequately address the problem.

In short, the optimize operation should be superfluous at this point, and the long delays it causes may be creating additional problems, so it should just go away.

Issue Links

is related to

JRACLOUD-19145 Index optimise on large instances can cause timeouts for some requests (Closed)

JRASERVER-29487 Remove hardcoded OptimizeIndexTrigger from scheduler config (Closed)

People

Assignee: crf

Reporter: crf

Votes: 0

Watchers: 2

Dates

Created: 27/Aug/2012 1:51 AM

Updated: 19/Sep/2019 5:56 AM

Resolved: 25/Sep/2012 3:27 AM