
Key: JRA-9296
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Anton Mazkovoi [Atlassian]
Reporter: Mark Chaimungkalanont [Atlassian]
Votes: 3
Watchers: 10
JIRA

Indexing intermittently fails with IndexException

Created: 08/Feb/06 05:13 PM   Updated: 23/Feb/07 09:00 AM
Component/s: Filtering & Indexing
Affects Version/s: 3.5
Fix Version/s: 3.6.3

Time Tracking:
Not Specified

File Attachments: 1. DefaultIndexManager.class (15 kB)


Participants: Anton Mazkovoi [Atlassian], David Fischer, Jeff Turner [Atlassian], Justin Koke [Atlassian], Mark Chaimungkalanont [Atlassian], Nick Menere [Atlassian], parthiban, Pierrick and Thiago Rossato
Since last comment: 1 year, 26 weeks, 5 days ago
Resolution Date: 10/Jul/06 12:47 AM
Labels:


Description
In certain systems, we've observed that exceptions like:
2006-02-02 10:59:59,462 ERROR [jira.issue.index.DefaultIndexManager] Giving up reindex - waited 10 seconds
com.atlassian.jira.issue.index.IndexException
	at com.atlassian.jira.issue.index.DefaultIndexManager.getIndexLock(DefaultIndexManager.java:463)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:239)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:211)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndex(DefaultIndexManager.java:391)
	at com.atlassian.jira.issue.link.DefaultIssueLinkManager.reindexLinkedIssues(DefaultIssueLinkManager.java:70)
	at com.atlassian.jira.issue.link.DefaultIssueLinkManager.createIssueLink(DefaultIssueLinkManager.java:62)
	at com.atlassian.jira.web.action.issue.LinkExistingIssue.linkIssue(LinkExistingIssue.java:129)
	at com.atlassian.jira.web.action.issue.LinkExistingIssue.doExecute(LinkExistingIssue.java:116)
	at webwork.action.ActionSupport.execute(ActionSupport.java:153)

get thrown intermittently.

Some investigation is still needed as to exactly why this is the case, but initial tests seem to point to the optimise process taking longer than 10s to complete. While the optimiser holds the index lock, a reindex request that waits more than 10s gives up, so certain changes might not be reflected in the index, leaving the index out of date.

Ideally, indexing should occur in a queue, but until then we need to investigate:

  1. Is it expected that the optimiser may take longer than 10s? Is the time taken relative to the number of total issues? Or issues since last optimisation?
  2. Is 10s a sensible wait time? Should it be longer?
  3. Is an optimisation every 150 issues ideal? Will optimisation be faster with fewer issues?

It would also be beneficial if these time outs were configurable.
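
To make the questions above concrete, here is a minimal sketch of a lock that is acquired with a configurable wait time, using java.util.concurrent. It is not the actual DefaultIndexManager code, which is not shown in this issue; the class TimedIndexLock and its method are hypothetical names chosen for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; not the real DefaultIndexManager.
class TimedIndexLock {
    private final ReentrantLock lock = new ReentrantLock();
    private final long timeoutMillis; // configurable instead of a hard-coded 10 seconds

    TimedIndexLock(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the task while holding the lock; returns false if the lock was not obtained in time. */
    boolean runWithLock(Runnable task) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        if (!acquired) {
            return false; // the caller decides whether to log, retry or queue the update
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would construct it with the configured wait, for example new TimedIndexLock(30000), and treat a false return as "index update not applied".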



Anton Mazkovoi [Atlassian] added a comment - 12/Feb/06 05:23 PM
We need to make the timeout configurable in the jira-application.properties file. The default should be increased to 30 secs (from 10). In JIRA 3.6 we will address the problem 'properly' by introducing an indexing queue.

Anton


David Fischer added a comment - 15/Feb/06 03:41 AM
One comment: on our system, we started getting these exceptions once we installed the Jira charting plugin. And we got them on a very regular basis (as soon as we "resolved" an issue, for example).
Removing the charting plugin made these exceptions disappear.

Hope this helps,

David Fischer
Advestigo


Thiago Rossato added a comment - 24/Feb/06 01:43 PM
Our JIRA instance is having this problem every single day.
We make intensive use of JIRA. 1500 new issues are reported every day.
The plugins we have installed are the JIRA Toolkit plugin and the JIRA Charting Plugin.
But this problem was occurring even before we installed these plugins.

Sorry about the poor English.
Thiago Rossato


Nick Menere [Atlassian] added a comment - 26/Feb/06 06:27 PM
Thiago,

1500 issues is quite a lot of new issues every day.
This error can also occur when you have some very large issues (thousands of comments). It can take quite a while to index one of these issues, and the index lock is held for that whole time.
Make sure you are using the latest version of the Charting Plugin as this has fixed a few of the issues.

For 3.6 we hope to introduce an indexing queue. This will stop the error in 99% of cases.

Can I ask how your issues are being created, though? Are they being imported from another system, or do you have users actually creating 1500 a day? We may be able to attach a patch that increases the timeout for you. This may alleviate your problem.

Cheers,
Nick
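
The indexing queue mentioned above can be sketched roughly as follows: request threads hand their index updates to a single worker and return immediately, so they never wait on the index lock themselves. This is only an illustration under assumptions; IndexingQueue and its methods are hypothetical names, and the in-memory list stands in for the real index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of an indexing queue; not JIRA's implementation.
class IndexingQueue {
    // A single worker thread applies index updates one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Stands in for the real index; synchronized so results are visible to other threads.
    private final List<String> indexed = Collections.synchronizedList(new ArrayList<String>());

    /** Called by request threads; returns immediately instead of waiting on a lock. */
    void enqueueReindex(final String issueKey) {
        worker.execute(() -> indexed.add(issueKey));
    }

    /** Stops accepting work, waits for queued updates to finish and returns what was indexed. */
    List<String> shutdownAndGetIndexed(long timeoutSeconds) {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("indexing queue did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining the queue", e);
        }
        return indexed;
    }
}
```

The trade-off in a design like this is that the index lags slightly behind the database, in exchange for requests no longer failing on a lock timeout.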


Thiago Rossato added a comment - 27/Feb/06 04:19 PM
Hi Nick,

These issues are created by about 800 users.
I would really appreciate it if you could send me a patch, because the reindexing process is taking about 1 hour and 20 minutes.
And the only way to fix the problem above is to reindex, right?

By the way, right now the reindex process doesn't work at all.
I created an issue for this other problem (JSP-4409).

Please, I'm hoping for your answer very soon.
Imagine 800 users not working because of this error.

Thanks,
Thiago Rossato


Nick Menere [Atlassian] added a comment - 27/Feb/06 05:44 PM
hi,

Attached is a patch that will cause optimisations to happen half as often (every 300 issues instead of every 150), and uses a 30 second timeout.

Please let me know if this helps your problems.

Cheers,
Nick
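
The "optimise every N issues" behaviour described in this comment can be illustrated with a simple counter. This is a hypothetical sketch, not the patched DefaultIndexManager; the class and method names are invented for illustration, and it assumes each issue counts as one index update.

```java
// Hypothetical sketch; the real DefaultIndexManager is not shown in this issue.
class OptimisePolicy {
    private final int optimiseEvery; // 150 before the patch, 300 with it, per the comments above
    private int updatesSinceOptimise = 0;

    OptimisePolicy(int optimiseEvery) {
        if (optimiseEvery <= 0) {
            throw new IllegalArgumentException("optimiseEvery must be positive");
        }
        this.optimiseEvery = optimiseEvery;
    }

    /** Records one index update; returns true when an optimise pass is due. */
    synchronized boolean recordUpdate() {
        updatesSinceOptimise++;
        if (updatesSinceOptimise >= optimiseEvery) {
            updatesSinceOptimise = 0; // start counting towards the next optimise pass
            return true;
        }
        return false;
    }
}
```

Under that one-update-per-issue assumption, 1500 new issues a day would trigger 10 optimise passes at an interval of 150 and 5 at an interval of 300.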


Nick Menere [Atlassian] added a comment - 27/Feb/06 05:45 PM
Replaces the com/atlassian/jira/issue/index/DefaultIndexManager.class file in an expanded web app.

Thiago Rossato added a comment - 07/Mar/06 07:12 AM
Nick, it didn't work.
2006-03-06 14:41:01,033 ERROR [jira.issue.index.DefaultIndexManager] Giving up reindex - waited 30 seconds
com.atlassian.jira.issue.index.IndexException
	at com.atlassian.jira.issue.index.DefaultIndexManager.getIndexLock(DefaultIndexManager.java:473)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:249)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndexIssues(DefaultIndexManager.java:221)
	at com.atlassian.jira.issue.index.DefaultIndexManager.reIndex(DefaultIndexManager.java:401)
	at com.atlassian.jira.workflow.function.issue.IssueReindexFunction.execute(IssueReindexFunction.java:28)
	at com.opensymphony.workflow.AbstractWorkflow.executeFunction(AbstractWorkflow.java:1179)
	at com.opensymphony.workflow.AbstractWorkflow.transitionWorkflow(AbstractWorkflow.java:1434)
	at com.opensymphony.workflow.AbstractWorkflow.doAction(AbstractWorkflow.java:533)
	at com.atlassian.jira.workflow.SimpleWorkflowManager.doWorkflowAction(SimpleWorkflowManager.java:228)
	at com.atlassian.jira.workflow.WorkflowTransitionUtilImpl.progress(WorkflowTransitionUtilImpl.java:259)
	at com.atlassian.jira.web.action.issue.CommentAssignIssue.doExecute(CommentAssignIssue.java:188)
	at webwork.action.ActionSupport.execute(ActionSupport.java:153)

Is there a way to work without index optimizations until version 3.6 is ready?


Justin Koke [Atlassian] added a comment - 07/Mar/06 11:06 PM
It is now possible via the jira-application.properties file to set Lucene's max timeout for a file lock.
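
As an illustration of reading such a timeout from a properties file, here is a minimal sketch. The property key below is a made-up placeholder, not the real JIRA setting (this issue does not name it), and the 30 second fallback follows the default proposed earlier in the comments.

```java
import java.util.Properties;

// Hypothetical sketch of reading a lock timeout from properties.
class LockTimeoutConfig {
    // Placeholder key for illustration only; check your JIRA version's
    // jira-application.properties for the real property name.
    static final String KEY = "example.index.lock.timeout.millis";
    static final long DEFAULT_MILLIS = 30_000L; // 30 seconds, as proposed in the comments

    /** Returns the configured timeout, or the default if it is missing or invalid. */
    static long readTimeoutMillis(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MILLIS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_MILLIS;
        } catch (NumberFormatException e) {
            return DEFAULT_MILLIS; // fall back rather than fail on a malformed value
        }
    }
}
```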

Justin Koke [Atlassian] added a comment - 07/Mar/06 11:48 PM
While the fix implemented will alleviate this problem, further work should be carried out to diagnose the underlying problem.

Anton Mazkovoi [Atlassian] added a comment - 10/Jul/06 12:42 AM
The problem occurs on databases that do not implement multi-versioning of records: MS SQL, DB2 and Sybase. In this case if a row is modified, all reads of that row will be blocked until the transaction that modified the row is committed.

The problem surfaces when there are 2 concurrent updates in the system, and at least one of them is in the middle of a transaction. If one thread obtains the index lock and then e.g. tries to retrieve comments for an issue, the db's query optimizer might decide that it is more efficient to do a table scan to retrieve the record. If another thread did an insert into the table, the select will block until the transaction that inserted a record is committed.

If the comment was added as part of the workflow transition, the transaction will not be committed until the issue is reindexed. However, the reindex will not happen, as the thread that is in the middle of the workflow operation (transaction) will try to obtain the indexing lock, which the first thread still holds.

This causes a deadlock between the database and JIRA's indexing lock. It is often difficult to predict the behaviour of the query optimizer, which causes the table scan and hence the deadlock. The solution is to perform indexing outside the database transaction.
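
The idea of indexing outside the database transaction can be sketched as follows: issues touched during the transaction are only remembered, and the index is updated after the commit, so a thread never waits for the index lock while its own uncommitted rows are blocking other readers. This is an illustration only, not the actual 3.6.3 change; DeferredReindex and its methods are hypothetical, and the "commit" and "reindex" steps are represented by log entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "index after commit"; not JIRA's actual code.
class DeferredReindex {
    private final List<String> pending = new ArrayList<String>();
    private final List<String> log = new ArrayList<String>(); // records the order of events, for demonstration

    /** Called inside the transaction: remember the issue instead of indexing it now. */
    void markForReindex(String issueKey) {
        pending.add(issueKey);
    }

    /** Runs the database work, commits, and only then touches the index. */
    List<String> execute(Runnable databaseWork) {
        log.add("begin");
        databaseWork.run();          // may call markForReindex(...)
        log.add("commit");           // database row locks are released at this point
        for (String key : pending) { // the index lock would only be taken from here on
            log.add("reindex " + key);
        }
        pending.clear();
        return log;
    }
}
```

With this ordering, the thread that holds the index lock and is blocked on a table scan only has to wait for the other thread's commit, which no longer depends on the index lock.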


Thiago Rossato added a comment - 10/Jul/06 09:04 AM
Anton,

This fix is really good news for us!

Version 3.6.3 is under development. Is there a release date for this version?
Can you post a patch for this fix? We are experiencing many deadlock problems and index timeouts.

Thanks,
Thiago


Anton Mazkovoi [Atlassian] added a comment - 10/Jul/06 06:11 PM
Hi Thiago,

We are hoping to release JIRA 3.6.3 some time next week.

I believe Nick has provided the patch for you in your open support case.

Thanks,
Anton


Thiago Rossato added a comment - 12/Jul/06 09:20 AM
Anton,

No patch was attached to my open support issue.
Can you ask someone to do this?

Yesterday our indexes got corrupted. I think that the problem reported by this issue is the reason.
Our company uses JIRA 24x7. The reindex process took about 2 hours to complete.

Cheers,
Thiago


Nick Menere [Atlassian] added a comment - 12/Jul/06 07:48 PM
Thiago,
The patch was the one I attached on 4/7.

Have you applied these?

cheers,
Nick


parthiban added a comment - 15/Jan/07 11:11 AM
the attached patch, is recommended for 3.6.3, what if we use 3.6.2?
our logs are filled with this exception and would very much like to have a fix for this

cheers,
parthi


Jeff Turner [Atlassian] added a comment - 17/Jan/07 09:07 PM
> the attached patch, is recommended for 3.6.3, what if we use 3.6.2?

The DefaultIndexManager class being patched here is identical in 3.6.2 and 3.6.3, so the patch should work fine in either.


Pierrick added a comment - 23/Feb/07 09:00 AM
Does this patch work with 3.6.1?

Thanks
Pierrick