Bug
Resolution: Duplicate
Medium
None
5.7.3, 5.8.2, 5.8.4
Confluence is throwing this error message in the logs:
2015-06-11 08:24:18,444 WARN [Indexer: 4] [apache.pdfbox.cos.COSDocument] getObjectsByType java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName
- referer: http://URL/admin/search-indexes.action | url: /admin/reindex.action | userName:user | action: reindex
java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName
at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:294)
at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:656)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:244)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1219)
at com.atlassian.bonnie.search.extractor.PdfContentExtractor.extractText(PdfContentExtractor.java:59)
at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:41)
The indexer also appears to run out of memory because of the same bug:
- referer: http://URL/admin/search-indexes.action | url: /admin/reindex.action | userName: user | action: reindex
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.io.ByteArrayOutputStream.grow(Unknown Source)
at java.io.ByteArrayOutputStream.ensureCapacity(Unknown Source)
at java.io.ByteArrayOutputStream.write(Unknown Source)
at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:172)
at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:98)
at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:308)
at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:248)
at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:183)
at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:107)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:456)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:381)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:340)
This is caused by a bug in PDFBox, tracked here: https://issues.apache.org/jira/browse/PDFBOX-1756
Confluence 5.7.3, 5.8.2 and 5.8.4 ship version 1.8.4 of PDFBox, which is affected by the bug.
Workaround
1) Disable the indexing of PDF attachments using this guide
OR
2) Update pdfbox manually in the Confluence_install\confluence\WEB-INF\lib folder by replacing the original pdfbox file with version 1.8.6 or newer. Download the newer version here
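Before and after applying workaround 2, it can help to confirm which pdfbox version is actually sitting in the lib folder. The following is a minimal sketch, not an Atlassian tool: the class name PdfBoxJarCheck and its methods are hypothetical, and it assumes the file follows the naming pattern pdfbox-<major>.<minor>.<patch>.jar, which this ticket does not confirm. It only compares the version in the file name against 1.8.6, the minimum named in the workaround.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports pdfbox jars in a lib folder that are older than 1.8.6.
final class PdfBoxJarCheck {

    // Assumed naming convention: pdfbox-<major>.<minor>.<patch>.jar
    private static final Pattern JAR_NAME =
            Pattern.compile("pdfbox-(\\d+)\\.(\\d+)\\.(\\d+)\\.jar");

    // Minimum version named in the workaround.
    private static final int[] MINIMUM = {1, 8, 6};

    /** Returns {major, minor, patch}, or null if the file name does not match the assumed pattern. */
    static int[] parseVersion(String fileName) {
        Matcher m = JAR_NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True when the given version is strictly older than 1.8.6. */
    static boolean isOlderThanMinimum(int[] version) {
        for (int i = 0; i < MINIMUM.length; i++) {
            if (version[i] != MINIMUM[i]) {
                return version[i] < MINIMUM[i];
            }
        }
        return false; // exactly 1.8.6
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfBoxJarCheck <path to WEB-INF/lib>");
            return;
        }
        Path lib = Paths.get(args[0]);
        // List only files whose names start with "pdfbox-" and end with ".jar".
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(lib, "pdfbox-*.jar")) {
            for (Path jar : jars) {
                String name = jar.getFileName().toString();
                int[] version = parseVersion(name);
                if (version == null) {
                    System.out.println(name + ": version not recognised from file name");
                } else if (isOlderThanMinimum(version)) {
                    System.out.println(name + ": older than 1.8.6, candidate for replacement");
                } else {
                    System.out.println(name + ": 1.8.6 or newer");
                }
            }
        }
    }
}
```

Run it with the lib folder as the only argument, for example the Confluence_install\confluence\WEB-INF\lib path from workaround 2. The check relies on the file name alone, so a renamed jar would be reported as not recognised rather than misclassified.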
Dear all,
This ticket has been marked as a duplicate. Please follow the progress here: https://jira.atlassian.com/browse/CONF-38375
Thanks,
Minh Tran
Confluence BugMaster
Atlassian