Details
- Type: Suggestion
- Resolution: Unresolved
Description
NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.
Summary
A com.atlassian.bonnie.search.extractor.ExtractorException is logged when the search index is built over password-protected Word files.
I understand that the contents of a password-protected Word file cannot be indexed. However, I think the exception should be handled gracefully.
Could you handle this case so that the exception and its stack trace are no longer logged?
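As an illustration of the handling requested above, the following is a minimal sketch and not Confluence's actual code. The names `TextExtractor`, `UnreadableDocumentException`, and `extractOrSkip` are hypothetical stand-ins introduced for this example. The idea is that an unreadable (for example, password-protected) file produces a single warning line and is skipped, instead of logging a full stack trace.

```java
import java.util.Optional;
import java.util.logging.Logger;

public class ProtectedDocumentExtraction {
    private static final Logger LOG =
            Logger.getLogger(ProtectedDocumentExtraction.class.getName());

    /** Hypothetical stand-in for the parser's "cannot read this file" error. */
    static class UnreadableDocumentException extends Exception {
        UnreadableDocumentException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the real text extractor. */
    interface TextExtractor {
        String extractText(byte[] content) throws UnreadableDocumentException;
    }

    /**
     * Returns the extracted text, or an empty Optional when the file cannot be read.
     * Only a one-line warning is logged in the failure case; no stack trace is written.
     */
    static Optional<String> extractOrSkip(String fileName, byte[] content,
                                          TextExtractor extractor) {
        try {
            return Optional.of(extractor.extractText(content));
        } catch (UnreadableDocumentException e) {
            LOG.warning("Skipping content indexing for " + fileName + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Simulates an extractor that fails the way the log in this report does.
        TextExtractor failing = content -> {
            throw new UnreadableDocumentException("Unknown file format.");
        };
        Optional<String> text = extractOrSkip("test_password.docx", new byte[0], failing);
        System.out.println(text.isPresent()); // prints "false"
    }
}
```

With this shape, the caller can still index attachment metadata such as the file name and simply omit the body text when the Optional is empty.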
Environments
- Confluence 5.8.x
- Confluence 5.9.4
Steps to Reproduce
- Attach test_password.docx (the password is 'test') to a page in Confluence
- Rebuild index
- Check atlassian-confluence.log
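The last step can also be done programmatically. The sketch below counts log lines that mention the exception class. The class name `ExtractorLogCheck` and the default log path are assumptions made for this example; pass the real path to atlassian-confluence.log as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class ExtractorLogCheck {
    // Fully qualified name of the exception reported in this issue.
    static final String EXCEPTION_NAME =
            "com.atlassian.bonnie.search.extractor.ExtractorException";

    /** Counts the log lines that mention the extractor exception. */
    static long countExtractorErrors(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains(EXCEPTION_NAME))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; the real log lives in the Confluence home directory.
        Path log = Paths.get(args.length > 0 ? args[0] : "atlassian-confluence.log");
        System.out.println(countExtractorErrors(Files.readAllLines(log)));
    }
}
```

A count of zero after a full reindex would match the expected result described below.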
Expected Results
No exception is logged.
Actual Results
The following error appears in atlassian-confluence.log:
2016-01-20 08:35:50,043 WARN [scheduler_Worker-9] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: test_password.docx v.1 (688138) admin)
com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word XML document: Unknown file format.
    at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:87)
    at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:41)
    at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:34)
    at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.lambda$extractWithLuceneExtractors$383(ConfluenceDocumentBuilder.java:159)
    at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder$$Lambda$240/467648620.accept(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.extractWithLuceneExtractors(ConfluenceDocumentBuilder.java:156)
    at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:111)
    at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:49)
    at com.atlassian.confluence.search.lucene.tasks.UpdateDocumentIndexTask.perform(UpdateDocumentIndexTask.java:50)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:408)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:398)
    at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.lambda$flushQueue$213(AbstractJournalIndexTaskQueue.java:156)
    at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue$$Lambda$231/1026302055.apply(Unknown Source)
    at com.atlassian.confluence.impl.journal.DefaultJournalService.lambda$convert$490(DefaultJournalService.java:84)
    at com.atlassian.confluence.impl.journal.DefaultJournalService$$Lambda$232/634715196.apply(Unknown Source)
    at com.atlassian.confluence.impl.journal.DefaultJournalManager.processEntries(DefaultJournalManager.java:80)
    at com.atlassian.confluence.impl.journal.DefaultJournalService.processEntries(DefaultJournalService.java:41)
    at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.flushQueue(AbstractJournalIndexTaskQueue.java:151)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction$1.perform(DefaultConfluenceIndexManager.java:376)
    at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:489)
    at sun.reflect.GeneratedMethodAccessor741.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
    at com.sun.proxy.$Proxy133.withWriter(Unknown Source)
    at com.atlassian.confluence.search.lucene.tasks.LuceneConnectionBackedIndexTaskPerformer.perform(LuceneConnectionBackedIndexTaskPerformer.java:18)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction.perform(DefaultConfluenceIndexManager.java:365)
    at com.atlassian.bonnie.LuceneConnection.withBatchUpdate(LuceneConnection.java:522)
    at sun.reflect.GeneratedMethodAccessor740.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
    at com.sun.proxy.$Proxy133.withBatchUpdate(Unknown Source)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.processTasks(DefaultConfluenceIndexManager.java:175)
    at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.flushQueue(DefaultConfluenceIndexManager.java:133)
    at sun.reflect.GeneratedMethodAccessor739.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at com.sun.proxy.$Proxy159.flushQueue(Unknown Source)
    at com.atlassian.confluence.search.lucene.IndexQueueFlusher.executeJob(IndexQueueFlusher.java:28)
    at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.surroundJobExecutionWithLogging(AbstractClusterAwareQuartzJobBean.java:66)
    at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.executeInternal(AbstractClusterAwareQuartzJobBean.java:47)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
    at com.atlassian.scheduler.quartz1.Quartz1JobFactory$ClassLoaderProtectingWrappedJob.execute(Quartz1JobFactory.java:65)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:223)
    at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool.lambda$runInThread$185(ConfluenceQuartzThreadPool.java:16)
    at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool$$Lambda$211/2092383935.run(Unknown Source)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: com.aspose.words.UnsupportedFileFormatException: Unknown file format.
    at com.aspose.words.Document.zzY(Unknown Source)
    at com.aspose.words.Document.zzZ(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:83)
    ... 64 more
Workaround
Disabling the Word XML Content Extractor suppresses the exception.
However, doing so also disables some of the indexing functionality for Word files.
Issue Links
- relates to
  - CONFSERVER-6888 Some word docs don't get correctly indexed (Closed)
  - AI-791 com.atlassian.bonnie.search.extractor.ExtractorException occurred when we create search index on Word files which passwords are set on (Gathering Interest)