Confluence Data Center / CONFSERVER-18731

Exception thrown while indexing a particular .pdf document

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Low
    • None
    • 3.0
    • None
    • Environment: Standalone Linux (Sun JDK 1.5.0)

      My site's content index is only partially built, so some pages are missing from search results. http://jira.atlassian.com/browse/CONF-18452 has already been filed about the index failing to build completely when a single page causes a problem, but I also want to file bugs for the underlying issues.
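
To illustrate the kind of fault isolation that CONF-18452 asks for, here is a minimal sketch in which a failure on one attachment is recorded and the rest of the rebuild continues. It is not Confluence or Bonnie code: `TextExtractor`, `IndexReport` and `FaultIsolatingIndexer` are hypothetical names invented for this example, and the "broken" PDF is simulated by throwing an `IOException`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a per-attachment text extractor (not a real Confluence API). */
interface TextExtractor {
    String extract(String attachmentName) throws Exception;
}

/** Outcome of one indexing pass: which attachments were indexed and which were skipped. */
final class IndexReport {
    final List<String> indexed = new ArrayList<>();
    final List<String> failed = new ArrayList<>();
}

final class FaultIsolatingIndexer {
    /**
     * Tries every attachment in turn. An extraction failure is caught per item,
     * so one unreadable document does not prevent the others from being indexed.
     */
    static IndexReport indexAll(List<String> attachments, TextExtractor extractor) {
        IndexReport report = new IndexReport();
        for (String name : attachments) {
            try {
                extractor.extract(name); // a real indexer would store the extracted text
                report.indexed.add(name);
            } catch (Exception e) {
                // Record the failure and carry on instead of aborting the rebuild.
                report.failed.add(name);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Assumption: any name starting with "broken" simulates a PDF that cannot be parsed.
        TextExtractor extractor = name -> {
            if (name.startsWith("broken")) {
                throw new IOException("Unknown font subtype");
            }
            return "text of " + name;
        };
        IndexReport report = indexAll(Arrays.asList("a.pdf", "broken.pdf", "b.pdf"), extractor);
        System.out.println("indexed=" + report.indexed + " failed=" + report.failed);
    }
}
```

With this structure the failed attachment can still be logged and reported, while search results for the remaining content stay complete.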

      This issue is a problem indexing a particular .pdf document:

      2010-02-22 11:10:26,019 WARN [Indexer: 2] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: PS3_Product_Guidelines_1.0_SCEE_English.pdf v.1 (5144583) kreiner)
      – url: /confluence/admin/reindex.action | userName: moise | referer: https://qix.demiurgestudios.com/confluence/admin/search-indexes.action | action: reindex
      com.atlassian.bonnie.search.extractor.ExtractorException: Error getting content of PDF document
      at com.atlassian.bonnie.search.extractor.PdfContentExtractor.extractText(PdfContentExtractor.java:65)
      at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:39)
      at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:43)
      at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
      at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:102)
      at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:41)
      at com.atlassian.bonnie.index.TempIndexWriter.perform(TempIndexWriter.java:72)
      at com.atlassian.confluence.search.lucene.TempIndexWriterStrategy.perform(TempIndexWriterStrategy.java:43)
      at com.atlassian.confluence.search.lucene.tasks.TempIndexBackedIndexTaskPerformer.perform(TempIndexBackedIndexTaskPerformer.java:21)
      at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.indexCollection(DefaultObjectQueueWorker.java:73)
      at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker$1.doInTransactionWithoutResult(DefaultObjectQueueWorker.java:61)
      at org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:33)
      at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:127)
      at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.run(DefaultObjectQueueWorker.java:50)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
      at java.lang.Thread.run(Thread.java:595)
      Caused by: java.io.IOException: Unknown font subtype=COSName{}
      at org.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:103)
      at org.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:135)
      at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:178)
      at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
      at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
      at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
      at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
      at com.atlassian.bonnie.search.extractor.PdfContentExtractor.extractText(PdfContentExtractor.java:49)
      ... 16 more
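
In the trace above, the `ExtractorException` at the top is only a wrapper. The `Caused by:` line shows the actual failure, an `IOException` raised in PDFBox's `PDFontFactory.createFont`. As a small sketch of how such a chain can be unwrapped programmatically, the hypothetical helper below follows `getCause()` to the innermost exception. `RootCause` is an invented name, and a plain `RuntimeException` stands in for the `ExtractorException`.

```java
import java.io.IOException;

final class RootCause {
    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        IOException inner = new IOException("Unknown font subtype=COSName{}");
        // RuntimeException is a stand-in for the wrapping ExtractorException in the log.
        Exception outer = new RuntimeException("Error getting content of PDF document", inner);
        Throwable root = RootCause.of(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

Logging the root cause next to the attachment name makes it easier to see which document triggered which parser error.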

            Owen made changes -
            Status Original: Resolved [ 5 ] New: Closed [ 6 ]

              Assignee: Unassigned
              Reporter: Andrew Moise (andrew.moise)