Confluence Data Center / CONFSERVER-40536

com.atlassian.bonnie.search.extractor.ExtractorException is thrown when the search index is built for password-protected Word files

    Description

      NOTE: This suggestion is for Confluence Server.

      Summary

      com.atlassian.bonnie.search.extractor.ExtractorException is thrown when the search index is built for password-protected Word files.
      I understand that the contents of a password-protected Word file cannot be indexed; however, I think the exception should be handled gracefully.
      Could you handle this case properly so that the exception is suppressed?
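
      The requested behaviour can be sketched roughly as follows. This is only an illustration of the idea of skipping an unreadable attachment and logging one line instead of a full stack trace. GracefulExtraction, TextExtractor and extractOrSkip are hypothetical names invented for this sketch; they are not Confluence, Bonnie or Aspose APIs.

```java
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical sketch; none of these names are Confluence or Bonnie APIs. */
public class GracefulExtraction {
    private static final Logger LOG = Logger.getLogger(GracefulExtraction.class.getName());

    /** Stand-in for a text extractor that may fail on unreadable input. */
    @FunctionalInterface
    public interface TextExtractor {
        String extractText(byte[] content) throws Exception;
    }

    /**
     * Runs the extractor and returns its text, or Optional.empty() if extraction fails.
     * A failure is logged as a single line without a stack trace.
     */
    public static Optional<String> extractOrSkip(String fileName, byte[] content, TextExtractor extractor) {
        try {
            return Optional.ofNullable(extractor.extractText(content));
        } catch (Exception e) {
            LOG.info("Skipping content indexing for " + fileName + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Simulated extractors: one fails like the reported case, one succeeds.
        TextExtractor failing = bytes -> { throw new IllegalStateException("Unknown file format."); };
        TextExtractor working = bytes -> "hello";
        System.out.println(extractOrSkip("test_password.docx", new byte[0], failing).isPresent());
        System.out.println(extractOrSkip("plain.docx", new byte[0], working).isPresent());
    }
}
```

      With a wrapper like this, the attachment itself could still be indexed by file name and metadata while its body text is simply left out.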

      Environments

      • Confluence 5.8.x
      • Confluence 5.9.4

      Steps to Reproduce

      1. Attach test_password.docx (the password is 'test') to a page in Confluence
      2. Rebuild the index
      3. Check atlassian-confluence.log

      Expected Results

      No exception is logged.

      Actual Results

      The following error appears in atlassian-confluence.log:

      2016-01-20 08:35:50,043 WARN [scheduler_Worker-9] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: test_password.docx v.1 (688138) admin)
      com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word XML document: Unknown file format.
      	at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:87)
      	at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:41)
      	at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:34)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.lambda$extractWithLuceneExtractors$383(ConfluenceDocumentBuilder.java:159)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder$$Lambda$240/467648620.accept(Unknown Source)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
      	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.extractWithLuceneExtractors(ConfluenceDocumentBuilder.java:156)
      	at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:111)
      	at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:49)
      	at com.atlassian.confluence.search.lucene.tasks.UpdateDocumentIndexTask.perform(UpdateDocumentIndexTask.java:50)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:408)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$IndexTaskWriter.apply(DefaultConfluenceIndexManager.java:398)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.lambda$flushQueue$213(AbstractJournalIndexTaskQueue.java:156)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue$$Lambda$231/1026302055.apply(Unknown Source)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService.lambda$convert$490(DefaultJournalService.java:84)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService$$Lambda$232/634715196.apply(Unknown Source)
      	at com.atlassian.confluence.impl.journal.DefaultJournalManager.processEntries(DefaultJournalManager.java:80)
      	at com.atlassian.confluence.impl.journal.DefaultJournalService.processEntries(DefaultJournalService.java:41)
      	at com.atlassian.confluence.search.queue.AbstractJournalIndexTaskQueue.flushQueue(AbstractJournalIndexTaskQueue.java:151)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction$1.perform(DefaultConfluenceIndexManager.java:376)
      	at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:489)
      	at sun.reflect.GeneratedMethodAccessor741.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
      	at com.sun.proxy.$Proxy133.withWriter(Unknown Source)
      	at com.atlassian.confluence.search.lucene.tasks.LuceneConnectionBackedIndexTaskPerformer.perform(LuceneConnectionBackedIndexTaskPerformer.java:18)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction.perform(DefaultConfluenceIndexManager.java:365)
      	at com.atlassian.bonnie.LuceneConnection.withBatchUpdate(LuceneConnection.java:522)
      	at sun.reflect.GeneratedMethodAccessor740.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
      	at com.sun.proxy.$Proxy133.withBatchUpdate(Unknown Source)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.processTasks(DefaultConfluenceIndexManager.java:175)
      	at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.flushQueue(DefaultConfluenceIndexManager.java:133)
      	at sun.reflect.GeneratedMethodAccessor739.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
      	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
      	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
      	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
      	at com.sun.proxy.$Proxy159.flushQueue(Unknown Source)
      	at com.atlassian.confluence.search.lucene.IndexQueueFlusher.executeJob(IndexQueueFlusher.java:28)
      	at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.surroundJobExecutionWithLogging(AbstractClusterAwareQuartzJobBean.java:66)
      	at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.executeInternal(AbstractClusterAwareQuartzJobBean.java:47)
      	at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
      	at com.atlassian.scheduler.quartz1.Quartz1JobFactory$ClassLoaderProtectingWrappedJob.execute(Quartz1JobFactory.java:65)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:223)
      	at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool.lambda$runInThread$185(ConfluenceQuartzThreadPool.java:16)
      	at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool$$Lambda$211/2092383935.run(Unknown Source)
      	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
      Caused by: com.aspose.words.UnsupportedFileFormatException: Unknown file format.
      	at com.aspose.words.Document.zzY(Unknown Source)
      	at com.aspose.words.Document.zzZ(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.aspose.words.Document.<init>(Unknown Source)
      	at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:83)
      	... 64 more
      

      Workaround

      Disabling the Word XML Content Extractor suppresses the exception.
      However, doing so also removes some of the indexing functionality for Word files.
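
      A narrower alternative to disabling the extractor might be a pre-check that skips encrypted files only. Password-protected OOXML files are stored in an OLE2 compound file container rather than a plain ZIP package, so the first bytes of the file differ. Whether this is what triggers "Unknown file format" here is an assumption, not something the log confirms. OoxmlContainerCheck and its members are hypothetical names for this sketch.

```java
import java.util.Arrays;

/** Hypothetical sketch: classify a file by its leading signature bytes. */
public class OoxmlContainerCheck {
    // ZIP local file header signature "PK\003\004": an unencrypted .docx is a ZIP package.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound file signature: used by encrypted OOXML files (and legacy .doc files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};

    public enum Container { ZIP, OLE2, UNKNOWN }

    /** Returns the container type indicated by the first bytes of the file. */
    public static Container detect(byte[] header) {
        if (startsWith(header, ZIP_MAGIC)) return Container.ZIP;
        if (startsWith(header, OLE2_MAGIC)) return Container.OLE2;
        return Container.UNKNOWN;
    }

    /** True if data begins with the given prefix; false if data is shorter than the prefix. */
    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

      An attachment named *.docx whose header is detected as OLE2 is probably encrypted, so an indexer could skip text extraction for it and still run the extractor on every other Word file.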

            People

              Assignee: Unassigned
              Reporter: Yuki Okamoto (yokamoto, inactive)
              Votes: 11
              Watchers: 9
