  1. Confluence Data Center
  2. CONFSERVER-11965

No text content being indexed in MS Word document


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Medium
    • Fix Version/s: None
    • Affects Version/s: 2.8
    • Component/s: Search - Core

    Description

      There is no content field present when viewing the indexing details for the MS Word document via the index browser.

      The following is a sample error from the log:

      [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: Install.doc v.1 (2064385))
       -- url: /confluence/admin/reindex.action | userName: admin | action: reindex
      com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word document: null
      	at com.atlassian.bonnie.search.extractor.MsWordContentExtractor.extractText(MsWordContentExtractor.java:37)
      ...
      Caused by: java.util.NoSuchElementException
      	at java.util.AbstractList$Itr.next(AbstractList.java:350)
      	at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:150)
      	at com.atlassian.bonnie.search.extractor.MsWordContentExtractor.extractText(MsWordContentExtractor.java:33)
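
      The trailing "null" in the message and the NoSuchElementException cause are consistent with an iterator being advanced past the end of an empty list, with the wrapper then appending the cause's (absent) message. The sketch below reproduces that pattern only. It is not the Bonnie or textmining source: the class WordExtractionSketch, the helper extractFirstPiece, and the local ExtractorException are hypothetical stand-ins, and the way the real message is built is an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class WordExtractionSketch {

    /** Hypothetical stand-in for the wrapping exception named in the log. */
    static class ExtractorException extends Exception {
        ExtractorException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Hypothetical stand-in for the third-party extractor: it assumes the
     * document yields at least one text piece and never checks hasNext().
     */
    static String extractFirstPiece(List<String> pieces) {
        Iterator<String> it = pieces.iterator();
        return it.next(); // throws NoSuchElementException when the list is empty
    }

    /** Wraps any runtime failure, appending the cause's message (assumed behaviour). */
    static String extractText(List<String> pieces) throws ExtractorException {
        try {
            return extractFirstPiece(pieces);
        } catch (RuntimeException e) {
            // The iterator's NoSuchElementException carries no message,
            // so string concatenation renders getMessage() as "null".
            throw new ExtractorException(
                    "Error reading content of Word document: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            extractText(new ArrayList<String>());
        } catch (ExtractorException e) {
            System.out.println(e.getMessage());
            System.out.println("Caused by: " + e.getCause().getClass().getName());
        }
    }
}
```

      If the real failure follows this pattern, the "null" says nothing about the document being null; it only means the underlying exception had no message, and the cause line is the part worth reading.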
      

            People

              Assignee: Unassigned
              Reporter: Ming Giet Chong [Atlassian] (mgchong)
              Votes: 0
              Watchers: 2
