AI-791

com.atlassian.bonnie.search.extractor.ExtractorException is thrown when building the search index for password-protected Word files


      NOTE: This suggestion is for Confluence Cloud. Using Confluence Server? See the corresponding suggestion.

      Summary

      com.atlassian.bonnie.search.extractor.ExtractorException is thrown when the search index is built for password-protected Word files.
      I understand that the contents of a password-protected Word file cannot be indexed; however, the exception should be handled gracefully.
      Could you handle the exception properly so that it is suppressed?
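
      As a minimal sketch of the kind of handling requested above: an extractor could be wrapped so that an unreadable attachment produces one short log line and empty text instead of a full stack trace. `TextExtractor` and `SafeExtractor` are hypothetical names used only for illustration; they are not Confluence or Bonnie APIs, and the log level is an assumption.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for an attachment text extractor; not a Confluence API. */
interface TextExtractor {
    String extractText(byte[] content) throws Exception;
}

/** Wraps an extractor so that unreadable attachments are skipped quietly. */
final class SafeExtractor {
    private static final Logger LOG = Logger.getLogger(SafeExtractor.class.getName());

    private SafeExtractor() {}

    /**
     * Returns the extracted text, or an empty string when extraction fails.
     * A failure is logged as a single line without a stack trace.
     */
    static String extractOrEmpty(TextExtractor delegate, String attachmentName, byte[] content) {
        try {
            String text = delegate.extractText(content);
            return text == null ? "" : text;
        } catch (Exception e) {
            // Assumption: INFO is quiet enough; the right level is a product decision.
            LOG.log(Level.INFO, "Skipping content indexing for {0}: {1}",
                    new Object[] {attachmentName, e.getMessage()});
            return "";
        }
    }
}
```

      With this shape, the attachment would still be indexed by its metadata (for example, its file name), while its unreadable content is simply left out.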

      Environments

      • Confluence 5.8.x
      • Confluence 5.9.4

      Steps to Reproduce

      1. Attach test_password.docx (the password is 'test') to a page in Confluence
      2. Rebuild the search index
      3. Check atlassian-confluence.log
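
      Step 3 can be scripted. The sketch below pulls the names of attachments that failed indexing out of the log lines. The regular expression is inferred from the single log line quoted in this report, so the exact message format is an assumption, and `IndexErrorScanner` is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that lists attachments reported as failing to index. */
final class IndexErrorScanner {
    // Assumed format: "Error indexing attachment (Attachment: <name> v.<version> (<id>) <user>)"
    private static final Pattern FAILED_ATTACHMENT = Pattern.compile(
            "Error indexing attachment \\(Attachment: (.+?) v\\.\\d+ \\(\\d+\\)");

    private IndexErrorScanner() {}

    /** Returns the attachment names found in matching log lines, in order of appearance. */
    static List<String> failedAttachments(List<String> logLines) {
        List<String> names = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = FAILED_ATTACHMENT.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }
}
```

      The input lines could come from `java.nio.file.Files.readAllLines` applied to atlassian-confluence.log; the location of that file depends on the installation.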

      Expected Results

      No exception is logged.

      Actual Results

      The following error appeared in atlassian-confluence.log:

      2016-01-20 08:35:50,043 WARN [scheduler_Worker-9] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: test_password.docx v.1 (688138) admin)
      com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word XML document: Unknown file format.
      	at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:87)
      	at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:41)
      	at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:34)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.lambda$extractWithLuceneExtractors$383(ConfluenceDocumentBuilder.java:159)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder$$Lambda$240/467648620.accept(Unknown Source)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
      	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.extractWithLuceneExtractors(ConfluenceDocumentBuilder.java:156)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:111)
      	at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:49)
      	at com.atlassian.confluence.search.lucene.tasks.UpdateDocumentIndexTask.perform(UpdateDocumentIndexTask.java:50)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:408)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:398)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.lambda$flushQueue$213(AbstractJournalIndexTaskQueue.java:156)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue$$Lambda$231/1026302055.apply(Unknown Source)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService.lambda$convert$490(DefaultJournalService.java:84)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService$$Lambda$232/634715196.apply(Unknown Source)
      	at com.atlassian.confluence.impl.journal.DefaultJournalManager.processEntries(DefaultJournalManager.java:80)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService.processEntries(DefaultJournalService.java:41)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.flushQueue(AbstractJournalIndexTaskQueue.java:151)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction$1.perform(DefaultConfluenceIndexManager.java:376)
      	at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:489)
      	at sun.reflect.GeneratedMethodAccessor741.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
      	at com.sun.proxy.$Proxy133.withWriter(Unknown Source)
      	at com.atlassian.confluence.search.lucene.tasks.LuceneConnectionBackedIndexTaskPerformer.perform(LuceneConnectionBackedIndexTaskPerformer.java:18)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction.perform(DefaultConfluenceIndexManager.java:365)
      	at com.atlassian.bonnie.LuceneConnection.withBatchUpdate(LuceneConnection.java:522)
      	at sun.reflect.GeneratedMethodAccessor740.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
      	at com.sun.proxy.$Proxy133.withBatchUpdate(Unknown Source)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.processTasks(DefaultConfluenceIndexManager.java:175)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.flushQueue(DefaultConfluenceIndexManager.java:133)
      	at sun.reflect.GeneratedMethodAccessor739.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
      	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
      	at com.sun.proxy.$Proxy159.flushQueue(Unknown Source)
      	at com.atlassian.confluence.search.lucene.IndexQueueFlusher.executeJob(IndexQueueFlusher.java:28)
      	at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.surroundJobExecutionWithLogging(AbstractClusterAwareQuartzJobBean.java:66)
      	at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.executeInternal(AbstractClusterAwareQuartzJobBean.java:47)
      	at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
      	at com.atlassian.scheduler.quartz1.Quartz1JobFactory$ClassLoaderProtectingWrappedJob.execute(Quartz1JobFactory.java:65)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:223)
      	at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool.lambda$runInThread$185(ConfluenceQuartzThreadPool.java:16)
      	at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool$$Lambda$211/2092383935.run(Unknown Source)
      	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
      Caused by: com.aspose.words.UnsupportedFileFormatException: Unknown file format.
      	at com.aspose.words.Document.zzY(Unknown Source)
      	at com.aspose.words.Document.zzZ(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:83)
      	... 64 more
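
      A likely reason for the "Unknown file format" message in the trace above: a password-protected .docx is not a ZIP package but an OLE2 compound file that wraps the encrypted package, so a reader expecting ZIP-based Word XML does not recognize it. This has not been confirmed against the Aspose code. Under that assumption, a minimal sketch of a pre-check that could let an extractor skip such files before parsing looks like this (`OoxmlContainerCheck` is a hypothetical name):

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical pre-check based on the leading bytes of an attachment. */
final class OoxmlContainerCheck {
    // Header signature of an OLE2 compound file (also used by legacy .doc files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header signature of a ZIP archive, the container of a plain .docx.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private OoxmlContainerCheck() {}

    /** True when data begins with the given prefix. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    /** True when the leading bytes look like a ZIP-based (unencrypted) package. */
    static boolean looksLikePlainOoxml(byte[] head) {
        return startsWith(head, ZIP_MAGIC);
    }

    /** Heuristic: a .docx name combined with an OLE2 header suggests an encrypted package. */
    static boolean looksEncrypted(String fileName, byte[] head) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".docx")
                && startsWith(head, OLE2_MAGIC);
    }
}
```

      The check is only a heuristic: it reads the first eight bytes and the file extension, and does not inspect the streams inside the compound file.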
      

      Workaround

      Disabling the Word XML Content Extractor suppresses the exception.
      However, disabling it also means losing some indexing functionality for Word files.

      Attachments

      • test_password.docx (18 kB), Yuki Okamoto

      Assignee: Unassigned
      Reporter: Yuki Okamoto (yokamoto, inactive)
      Votes: 8
      Watchers: 4
