No text content being indexed in MS Word document

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Medium
    • None
    • Affects Version/s: 2.8
    • Component/s: Search - Core

      When the indexing details for the MS Word document are viewed in the index browser, no content field is present, so the document's text has not been indexed.

      The following is a sample error from the log:

      [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: Install.doc v.1 (2064385))
       -- url: /confluence/admin/reindex.action | userName: admin | action: reindex
      com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word document: null
      	at com.atlassian.bonnie.search.extractor.MsWordContentExtractor.extractText(MsWordContentExtractor.java:37)
      ...
      Caused by: java.util.NoSuchElementException
      	at java.util.AbstractList$Itr.next(AbstractList.java:350)
      	at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:150)
      	at com.atlassian.bonnie.search.extractor.MsWordContentExtractor.extractText(MsWordContentExtractor.java:33)
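
      The trailing ": null" in the ExtractorException message is consistent with the cause's message being appended to the error text: a NoSuchElementException thrown by a list iterator's next() carries no message, so getMessage() returns null. The following is a minimal sketch of that behaviour only. The class NullMessageDemo and the methods wrapMessage and reproduce are hypothetical names for illustration, and the concatenation is an assumption, not the actual Bonnie or textmining source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class NullMessageDemo {

    // Hypothetical stand-in for the extractor's wrapping logic (assumed, not
    // taken from Bonnie): the cause's message is appended to a fixed prefix.
    static String wrapMessage(Exception cause) {
        return "Error reading content of Word document: " + cause.getMessage();
    }

    // Calls next() on an exhausted iterator, which is the kind of call the
    // stack trace points at inside WordExtractor.extractText.
    static String reproduce() {
        List<String> pieces = new ArrayList<>();
        Iterator<String> it = pieces.iterator();
        try {
            it.next(); // no elements, so this throws NoSuchElementException
            return "no error";
        } catch (NoSuchElementException e) {
            // The exception was created without a message, so getMessage()
            // is null and string concatenation renders it as "null".
            return wrapMessage(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

      If the wrapping works this way, the "null" in the log says nothing about the document itself. The useful information is the cause: the extractor iterated past the end of a list while reading Install.doc.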
      

            Assignee:
            Unassigned
            Reporter:
            Ming Giet Chong [Atlassian]
            Votes:
            0
            Watchers:
            2
