Details
Suggestion
Resolution: Unresolved
Description
Azure Information Protection (AIP) is a form of encryption applied to documents, including Word and PowerPoint files, based on their sensitivity labels. As described in Microsoft's documentation, this is what happens:
When you apply a sensitivity label, the label information will persist with your document or email, even as it is shared between devices, applications, and cloud services. Applying a sensitivity label may also result in changes to your document or email according to your organization's configuration, such as:
Encryption with Information Rights Management may be applied to your document or email.
A header or footer may appear in your document or email.
A watermark may appear in your document.
All the details can be found here:
This is not implemented yet in Confluence, and indexing such an attachment raises errors on the backend:
2020-07-31 14:12:34,056 WARN [attachment-text-extraction-worker-0] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: AIP PROD Test Document with KPMG Only Label.docx v.1 (367296548) babypraveena@kpmg.com)
com.atlassian.bonnie.search.extractor.ExtractorException: java.lang.Exception: Error reading content of Word XML document: Unknown file format: Unknown
    at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:53)
    at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:41)
    at com.atlassian.confluence.extra.officeconnector.index.util.AttachmentTextExtractorAdapter.apply(AttachmentTextExtractorAdapter.java:29)
    at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extract(WordXMLTextExtractor.java:38)
    at com.atlassian.confluence.internal.index.attachment.DelegatingAttachmentTextExtractor.lambda$extract$1(DelegatingAttachmentTextExtractor.java:35)
    at com.atlassian.confluence.internal.index.attachment.DelegatingAttachmentTextExtractor.dt_access$105(DelegatingAttachmentTextExtractor.java)
    at java.util.Optional.flatMap(Optional.java:241)
    at com.atlassian.confluence.internal.index.attachment.DelegatingAttachmentTextExtractor.extract(DelegatingAttachmentTextExtractor.java:35)
    at com.atlassian.confluence.internal.index.attachment.AttachmentTextExtractionFunction.apply(AttachmentTextExtractionFunction.java:70)
    at com.atlassian.confluence.internal.index.attachment.AttachmentTextExtractionFunction.apply(AttachmentTextExtractionFunction.java:22)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Delegatin
Caused by: java.lang.Exception: Error reading content of Word XML document: Unknown file format: Unknown
    at com.atlassian.plugins.conversion.extract.xml.WordXMLExtractor.extractText(WordXMLExtractor.java:15)
    at com.atlassian.confluence.extra.officeconnector.index.word.WordXMLTextExtractor.extractText(WordXMLTextExtractor.java:51)
    ... 28 more
Caused by: com.aspose.words.UnsupportedFileFormatException: Unknown file format: Unknown
    at com.aspose.words.Document.zzY(Unknown Source)
    at com.aspose.words.Document.zzZ(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.aspose.words.Document.<init>(Unknown Source)
    at com.atlassian.plugins.conversion.extract.xml.WordXMLExtractor.extractText(WordXMLExtractor.java:13)
    ... 29 more
2020-07-31 14:12:34,730 WARN [attachment-text-extraction-worker-2] [bonnie.search.extractor.BaseAttachmentConten
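The "Unknown file format" error is consistent with the attachment no longer being a plain ZIP-based .docx package. Encrypted Office Open XML files are generally wrapped in an OLE2 compound file container instead; whether that is the case for this particular AIP-labelled attachment is an assumption that the log does not confirm. A minimal sketch for checking it on a copy of the attachment follows. The class name ContainerSniffer and its methods are hypothetical and are not part of Confluence or Aspose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic helper: reports which container format a file starts with.
public class ContainerSniffer {

    public enum Kind { ZIP, OLE2, UNKNOWN }

    // ZIP local file header signature "PK\x03\x04"; an unprotected .docx/.pptx starts with it.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // OLE2 compound file signature; assumed here to be the wrapper of a protected file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Classifies a file by its leading bytes.
    public static Kind sniff(byte[] header) {
        if (startsWith(header, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        return Kind.UNKNOWN;
    }

    // Reads up to the first 8 bytes of the file and classifies them.
    public static Kind sniff(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // file is shorter than the header we wanted
                }
                total += n;
            }
        }
        return sniff(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ContainerSniffer <file>");
            return;
        }
        System.out.println(args[0] + ": " + sniff(Paths.get(args[0])));
    }
}
```

If the labelled document reports OLE2 while an unlabelled copy reports ZIP, that would suggest the extractor is being handed an encrypted wrapper rather than a readable Word package. The check only inspects the container signature and does not decrypt or read any content.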
The exception is thrown by Aspose, the library that reads the document. According to the Aspose forums, support for these files is being worked on but is not yet fully implemented:
It would be great if Confluence supported this once it becomes possible.