Nick Menere [Atlassian] added a comment - 01/Jan/07 10:05 PM
Possible reason - http://java2.5341.com/msg/4440.html
Another user experiencing this error at JSP-9358
This happened again on jira.atlassian.com on the 3rd of Jan at midnight. It seemed to be triggered by the optimization and resulted in a slightly different stack trace:
2007-01-03 00:00:00,020 JiraQuartzScheduler_Worker-0 INFO [issue.index.job.OptimizeIndexJob] Optimize Index Job running...
2007-01-03 00:00:00,467 JiraQuartzScheduler_Worker-0 ERROR [issue.index.job.OptimizeIndexJob] Error occurred while optimizing indexes.
java.lang.ArrayIndexOutOfBoundsException
    at org.apache.lucene.index.SegmentReader.isDeleted(I)Z(Optimized Method)
    at org.apache.lucene.index.SegmentMerger.mergeFields()I(Optimized Method)
    at org.apache.lucene.index.SegmentMerger.merge()I(Optimized Method)
    at org.apache.lucene.index.IndexWriter.mergeSegments(II)V(IndexWriter.java:681)
    at org.apache.lucene.index.IndexWriter.mergeSegments(I)V(IndexWriter.java:658)
    at org.apache.lucene.index.IndexWriter.optimize()V(IndexWriter.java:517)
    at com.atlassian.bonnie.ConcurrentLuceneConnection$2.perform(Lorg/apache/lucene/index/IndexWriter;)V(ConcurrentLuceneConnection.java:121)
    at com.atlassian.bonnie.ConcurrentLuceneConnection.withWriter(Lcom/atlassian/bonnie/ILuceneConnection$WriterAction;)V(ConcurrentLuceneConnection.java:276)
    at com.atlassian.bonnie.ConcurrentLuceneConnection.optimize()V(ConcurrentLuceneConnection.java:117)
    at com.atlassian.jira.issue.index.SingleThreadedIssueIndexer.optimize()V(SingleThreadedIssueIndexer.java:74)
    at com.atlassian.jira.issue.index.DefaultIndexManager.optimize0()J(DefaultIndexManager.java:393)
    at com.atlassian.jira.issue.index.DefaultIndexManager.optimize()J(DefaultIndexManager.java:370)
    at com.atlassian.jira.issue.index.job.OptimizeIndexJob.execute(Lorg/quartz/JobExecutionContext;)V(OptimizeIndexJob.java:19)
    at org.quartz.core.JobRunShell.run()V(JobRunShell.java:191)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run()V(SimpleThreadPool.java:516)
    at java.lang.Thread.startThreadFromVM(Ljava/lang/Thread;)V(Unknown Source)
2007-01-03 00:01:08,865 resin-tcp-connection-j2ee.jira.atlassian.com:6802-77 ERROR [atlassian.jira.workflow.SimpleWorkflowManager] An exception occurred
java.lang.ArrayIndexOutOfBoundsException at
org.apache.lucene.index.SegmentTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at 
org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method) at org.apache.lucene.index.MultiTermDocs.next()Z(Optimized Method)
Every Lucene read after the optimize was throwing the second stack trace listed above. A reindex seems to have sorted the immediate issue, but it is likely to return. It might be worth porting the SegmentMerger logging from Lucene 2, which outputs the doc numbers when they are out of order.
Hi all,
Thought I should post here instead of creating a new thread, as the problem is similar. We've been having this problem rather sporadically since upgrading to 3.7, only in our case it's been when we've been creating issues. I've also been messing around with typed sub-tasks at the same time, but I'm not sure if that's the problem or a coincidence. Our stack trace is similar: I've been trying to recreate it, but without success. I've tried repeating the way I thought I'd done it before, to no avail. Any ideas?
Cheers,

Hi Nathan,
This is a very strange Lucene bug. Currently we don't have a test that repeatably recreates the issue, but we are working on creating one. We will be following it up with the Lucene developers here. Out of interest, how many users do you have using JIRA at any one time? Also, are you using Windows/Linux/Mac, and what Java version (I'm guessing 1.5)?
cheers,

Hey Jed,
Will keep tabs on it. For the moment I'm just going to have to keep re-indexing whenever someone has a problem. If I come across it myself, I shall try to work out exactly what I've done to cause it. We have somewhere over 400 users, and I'd say there are probably anywhere from 0 to 40 people accessing it at any one time. We're running Red Hat Enterprise 3 in a VM, Java 1.5.
Cheers,

I have seen this too (on 3.7) when I am closing several issues (which I will do periodically when reviewing stale issues).
And I have also seen an exception if I immediately try to re-index (Cannot create index directory), with this root cause:
Caused by: java.io.IOException: Lock obtain timed out: Lock@/usr/local/tomcat/te
But it seems that if you wait long enough, the re-indexing works fine (and I think the docs out of order exceptions are also gone).

Attached is a lucene-core jar that prevents a potential source of the docs out of order issue (explanation and fix thanks to Michael McCandless).
To use, please replace the lucene-core-1.9.1.jar with the attached lucene-core-1.9.1-atlassian-patched-2007-01-09.jar in JIRA's WEB-INF/lib/ directory and restart JIRA. You must delete or move the lucene-core-1.9.1.jar.

Hi Nathan, thanks for the info.
Hi Yuji, the reason you see the lock timeout is that the lock isn't released properly due to the IndexWriter.close() method throwing an exception - details here.

Well, the patch fixes a limited form where the error can occur, but it doesn't appear to fix our problem.
If anyone is having this problem, could you please:
We currently suspect that it has to do with old index segment files hanging around, and we are trying to verify this.

Thanks to the Lucene developers, we have found the cause of this. It is indeed old index files being left around when we reindex. The fix is going into 3.7.2, but a new version of Bonnie is attached that can be used for 3.7 & 3.7.1.
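To illustrate the cause described above, here is a minimal sketch of clearing leftover files from an index directory before a full reindex. It is not the actual Bonnie fix, whose code is not shown in this thread. The class name `StaleIndexCleaner` and the method `clearIndexDirectory` are hypothetical, and the sketch assumes that the index lives in a single flat directory, that nothing else is reading or writing it at the time, and that a Java 7+ runtime is available for `java.nio.file`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of JIRA, Bonnie or Lucene.
final class StaleIndexCleaner {
    private StaleIndexCleaner() {}

    /**
     * Deletes every regular file directly inside indexDir so that a
     * subsequent full reindex starts from an empty directory.
     * Returns the number of files removed; subdirectories are left alone.
     */
    static int clearIndexDirectory(Path indexDir) throws IOException {
        if (!Files.isDirectory(indexDir)) {
            return 0; // nothing to clean
        }
        int removed = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indexDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    Files.delete(entry);
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

The point of the sketch is only that a rebuild should not be able to see segment files written by a previous index generation.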
To install the patch, please replace the bonnie-2006-12-14.jar in the WEB-INF/lib directory with the attached bonnie-2007-01-11-1.jar and then reindex.

Attached bonnie jar fixes the problem; 3.7.2 will have the fixed bonnie jar as well.
There is no need to install the attached lucene jar as it fixes a problem that JIRA does not expose.

We upgraded to 3.7.2 and we are still getting this error.
Hi Collin,
Are you certain that you are seeing the error on 3.7.2? We have had a number of customers verify that an upgrade to 3.7.2 has fixed this problem for them. Did you start JIRA with the flag -Djira.task.reindexAll.complete=true? If so, JIRA would not have performed its reindexing upgrade task and you would still be seeing the issue. Depending on how you upgraded (if you just copied all the jars in WEB-INF/lib across to your new instance), you could still have the older version of the bonnie jar on your classpath. You should only have the bonnie-2007-01-11-1.jar in that directory. If a re-index does not sort the problem out, can I get you to open a support request at https://support.atlassian.com?
Thanks, Dylan

Yes, we still see this error in 3.7.2.
I am a bit confused by your comment. Am I supposed to start it with "-Djira.task.reindexAll.complete=true"? I didn't copy my lib folder over, just the config changes I needed, so that should not be the problem. I have forced a reindex and optimized the index and I still had the problem. In fact, I have to reindex to get "missing" tickets back into the views. I did open a new problem and it was resolved as a duplicate of this request.
Thanks, Collin

Collin,
Sorry, we dropped the ball on this one. Once again, sorry for the delay on this one. Cheers,
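
The classpath check suggested earlier in the thread (only one bonnie jar should be present in WEB-INF/lib) could be sketched roughly as follows. The class `BonnieJarCheck` and the method `findBonnieJars` are hypothetical names used for illustration, the path to the lib directory is whatever your installation uses, and the sketch assumes Java 7+ for `java.nio.file`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of JIRA or Bonnie.
final class BonnieJarCheck {
    private BonnieJarCheck() {}

    /**
     * Returns the sorted file names of all jars in libDir whose name
     * matches bonnie-*.jar. More than one entry suggests that an older
     * bonnie jar is still on the classpath next to the patched one.
     */
    static List<String> findBonnieJars(Path libDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> jars =
                 Files.newDirectoryStream(libDir, "bonnie-*.jar")) {
            for (Path jar : jars) {
                names.add(jar.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

If the returned list holds anything other than the single patched jar, the older jar would need to be moved out of the directory before restarting and reindexing.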