Key: CONF-8580
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Tom Davies [Atlassian]
Reporter: Matthew Jensen [Atlassian]
Votes: 0
Watchers: 1

Confluence

Indexing unprintable/encrypted PDFs fails

Created: 27/May/07 06:39 PM   Updated: 02/Jul/08 07:29 PM
Component/s: PDF Generation
Affects Version/s: 2.5
Fix Version/s: 2.5.3, 2.6.0

Time Tracking:
Not Specified

File Attachments: 1. Java Archive File pdfbox-0.7.2.jar (3.12 MB)

Issue Links:
Reference

Participants: Matthew Jensen [Atlassian], Scott Farquhar [Atlassian] and Tom Davies [Atlassian]
Since last comment: 1 year, 24 weeks, 5 days ago
Resolution Date: 28/May/07 12:22 AM
Labels:


Description
While reindexing, the pdf extractor can report this error:

java.lang.NoClassDefFoundError: org/bouncycastle/jce/provider/BouncyCastleProvider
    at org.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:905)
    at org.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:489)
    at com.atlassian.bonnie.search.extractor.PdfContentExtractor.extractText(PdfContentExtractor.java:46)
    at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:31)
    at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:28)
    at com.atlassian.confluence.search.lucene.ConfluenceObjectToDocumentConverter.convert(ConfluenceObjectToDocumentConverter.java:20)
    at com.atlassian.confluence.search.lucene.ConfluenceObjectQueue$1.indexCollection(ConfluenceObjectQueue.java:75)
    at com.atlassian.bonnie.index.QueueProcessingRunnableImpl.run(QueueProcessingRunnableImpl.java:39)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:284)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:155)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:122)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:56)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:144)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:174)
    at $Proxy62.run(Unknown Source)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:987)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:528)
    at java.lang.Thread.run(Thread.java:595)



Matthew Jensen [Atlassian] added a comment - 27/May/07 07:36 PM
BouncyCastle is a dependency of PDFBox which is needed to open encrypted PDFs.

Until this issue is resolved you can download the jar from this page:
http://www.bouncycastle.org/latest_releases.html
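To confirm whether the BouncyCastle provider is actually visible, a small check along these lines may help. This is only a sketch: the class name is taken from the NoClassDefFoundError in the description, while `BouncyCastleCheck` and `isClassAvailable` are hypothetical names invented for illustration. The result only reflects the classpath of the JVM and class loader it runs in, so a standalone run does not prove anything about the Confluence webapp unless the same jars are on the classpath.

```java
public class BouncyCastleCheck {
    // Class name copied from the NoClassDefFoundError in the issue description.
    static final String PROVIDER_CLASS =
            "org.bouncycastle.jce.provider.BouncyCastleProvider";

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, BouncyCastleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class that is present but cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(PROVIDER_CLASS + " available: "
                + isClassAvailable(PROVIDER_CLASS));
    }
}
```

For example, running it with the contents of WEB-INF/lib on the classpath would show whether a BouncyCastle jar is among them.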


Tom Davies [Atlassian] added a comment - 28/May/07 12:22 AM
In fact, the version of PDFBox shipped with 2.5 doesn't correctly extract text from unprintable PDFs, so we need to roll back to 0.7.2.

The workaround for this bug is to replace pdfbox-0.7.3.jar in WEB-INF/lib with the pdfbox-0.7.2.jar attached to this issue.


Matthew Jensen [Atlassian] added a comment - 28/May/07 12:39 AM
A couple of the support cases that exhibit this error suggest that it may be causing the indexing process to simply stop, without any feedback to the user.

Matthew Jensen [Atlassian] added a comment - 28/May/07 05:52 PM
This issue can silently stop the indexing process. Replacing the jar fixes that problem.

Scott Farquhar [Atlassian] added a comment - 13/Jun/07 07:09 AM
More importantly than this one issue: why does an error with indexing one document affect the whole indexing process?

Has this error been fixed?


Tom Davies [Atlassian] added a comment - 13/Jun/07 06:17 PM
The root cause of this problem is that we were not catching the exception; this has been fixed under CONF-8608.
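As an illustration of the general idea (not the actual change made under CONF-8608, which is not shown here), a per-document guard could look like the sketch below. `ResilientIndexer`, `TextExtractor` and `indexAll` are hypothetical names and do not come from Bonnie or Confluence. One point worth noting: NoClassDefFoundError is a LinkageError, which is an Error rather than an Exception, so a plain `catch (Exception e)` would not have caught the failure in the stack trace above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ResilientIndexer {
    /** Hypothetical stand-in for a content extractor; not the Bonnie API. */
    interface TextExtractor {
        String extractText(String documentName) throws Exception;
    }

    /** Extracts every document in turn and returns the names of those that failed. */
    static List<String> indexAll(List<String> documents, TextExtractor extractor) {
        List<String> failed = new ArrayList<>();
        for (String doc : documents) {
            try {
                extractor.extractText(doc);
            } catch (Exception | LinkageError e) {
                // Record the failure and continue, so that one bad document
                // cannot stop the rest of the queue from being processed.
                failed.add(doc);
                System.err.println("Skipping " + doc + ": " + e);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated extractor that fails the way the encrypted PDF did.
        TextExtractor extractor = name -> {
            if (name.startsWith("encrypted")) {
                throw new NoClassDefFoundError(
                        "org/bouncycastle/jce/provider/BouncyCastleProvider");
            }
            return "text of " + name;
        };
        List<String> failed = indexAll(
                Arrays.asList("a.pdf", "encrypted.pdf", "b.pdf"), extractor);
        System.out.println("Failed documents: " + failed);
    }
}
```

Catching LinkageError rather than Throwable is a deliberate choice in this sketch: it covers missing-class failures while still letting errors such as OutOfMemoryError propagate.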