Bug
Resolution: Fixed
Medium
4.3.2, 4.3.7, 5.1.3, 5.1.5, 5.2.5, 5.9, 5.8.16, 5.9.7
143
Severity 3 - Minor
Symptoms
I think that closed issue CONF-18733 is not really resolved. We have many entries in our log showing this kind of issue:
2012-10-12 22:30:34,235 WARN [Indexer: 1] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: meta_mailinfo_sec01.csv v.1 (59509056) g6922)
com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Excel document: Invalid header signature; read 0x6D453B74726F6853, expected 0xE11AB1A1E011CFD0
    at com.atlassian.confluence.extra.officeconnector.index.excel.ExcelTextExtractor.extractText(ExcelTextExtractor.java:103)
    at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:40)
    at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:36)
    at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
    at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:97)
    at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:43)
    at com.atlassian.bonnie.index.TempIndexWriter.perform(TempIndexWriter.java:73)
    at com.atlassian.confluence.search.lucene.TempIndexWriterStrategy.perform(TempIndexWriterStrategy.java:43)
    at com.atlassian.confluence.search.lucene.tasks.TempIndexBackedIndexTaskPerformer.perform(TempIndexBackedIndexTaskPerformer.java:21)
    at com.atlassian.confluence.search.lucene.ReindexWorkBatch.indexCollection(ReindexWorkBatch.java:125)
    at com.atlassian.confluence.search.lucene.ReindexWorkBatch$1.doInTransaction(ReindexWorkBatch.java:86)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:128)
    at com.atlassian.confluence.search.lucene.ReindexWorkBatch.run(ReindexWorkBatch.java:56)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Invalid header signature; read 0x6D453B74726F6853, expected 0xE11AB1A1E011CFD0
    at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:120)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
    at com.atlassian.confluence.extra.officeconnector.index.excel.ExcelTextExtractor.extractText(ExcelTextExtractor.java:89)
    ... 18 more
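The two hexadecimal values in the exception can be decoded to see why the Excel extractor rejects the file. The sketch below assumes the signature is reported as a little-endian 64-bit integer built from the first 8 bytes of the file; the helper name signature_bytes is made up for illustration. Under that assumption, the "read" value turns out to be plain text, which suggests the attachment is an ordinary text file being handed to the Excel extractor.

```python
import struct

# Both values are copied verbatim from the log message above.
READ_SIGNATURE = 0x6D453B74726F6853
EXPECTED_SIGNATURE = 0xE11AB1A1E011CFD0


def signature_bytes(value: int) -> bytes:
    """Return the 8 file bytes behind a signature value.

    Assumption: the value is a little-endian unsigned 64-bit integer
    read from the start of the file.
    """
    return struct.pack("<Q", value)


# What the attachment actually starts with: readable ASCII text.
print(signature_bytes(READ_SIGNATURE))
# What an OLE2 (legacy binary Office) file starts with, shown as hex.
print(signature_bytes(EXPECTED_SIGNATURE).hex())
```

The first value decodes to the ASCII text "Short;Em", that is, the beginning of a semicolon-delimited text line, while the second is the byte sequence d0 cf 11 e0 a1 b1 1a e1 that opens legacy binary Office documents.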
Workaround
The underlying issue was partially resolved in Confluence 3.5.11, so the warning message no longer occurs for many CSV files uploaded in Confluence 3.5.11 or above. However, any CSV files that were uploaded in versions older than Confluence 3.5.11 will still have the incorrect Content-type, as do some CSV files that were uploaded after 3.5.11, and the warning messages still occur whenever Confluence performs a re-index.
To fix the Content-type and resolve the warning message for older documents, you will need to run this command against your database:
UPDATE PUBLIC.ATTACHMENTS SET CONTENTTYPE = 'text/csv' WHERE TITLE LIKE '%.csv' AND CONTENTTYPE LIKE '%excel%';
For more information, please see this KB article.
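The effect of the UPDATE statement above can be tried out on throwaway data before it is run against a real database. The following is only a sketch: it uses an in-memory SQLite database as a stand-in, drops the PUBLIC schema prefix, reduces the ATTACHMENTS table to the two columns the statement touches, and uses invented sample rows (including the 'application/vnd.ms-excel' content type, which is an assumption about what a mis-typed CSV might carry). Pattern matching with LIKE can differ between database engines, for example in case sensitivity, so the result here is illustrative only.

```python
import sqlite3

# Statement from the workaround, without the PUBLIC schema prefix,
# which this SQLite stand-in does not have.
FIX_SQL = (
    "UPDATE ATTACHMENTS SET CONTENTTYPE = 'text/csv' "
    "WHERE TITLE LIKE '%.csv' AND CONTENTTYPE LIKE '%excel%'"
)


def run_demo():
    """Apply the fix to a toy table and return (rows changed, final rows)."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical table: the real one has many more columns.
    conn.execute("CREATE TABLE ATTACHMENTS (TITLE TEXT, CONTENTTYPE TEXT)")
    conn.executemany(
        "INSERT INTO ATTACHMENTS VALUES (?, ?)",
        [
            # CSV stored with an Excel content type: should be corrected.
            ("meta_mailinfo_sec01.csv", "application/vnd.ms-excel"),
            # Genuine Excel file: must stay untouched.
            ("report.xls", "application/vnd.ms-excel"),
            # CSV that already has the right type: nothing to do.
            ("notes.csv", "text/csv"),
        ],
    )
    changed = conn.execute(FIX_SQL).rowcount
    rows = conn.execute(
        "SELECT TITLE, CONTENTTYPE FROM ATTACHMENTS ORDER BY TITLE"
    ).fetchall()
    conn.close()
    return changed, rows


changed, rows = run_demo()
print(changed, rows)
```

Only the mis-typed CSV row is modified in this toy data set; the Excel file keeps its content type because its title does not end in .csv.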
Resolution
This issue is resolved in Confluence 6.0.6. To get the fix, upgrade to that version.
NB: The fix contains two parts: an upgrade task that changes the 'media_type' of all csv attachments to 'text/csv', no matter what it was previously, and a change that intercepts the save of any csv attachment and enforces a media_type of 'text/csv'. If for any reason the upgrade task needs to be run again (e.g. after a space import which has attachments with incorrect media_types), it can be run manually by going to <your-confluence-url>/admin/force-upgrade.action and selecting 'CorrectCsvAttachmentMimeTypeUpgradeTask'.
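The rule that the second part of the fix enforces on save can be pictured roughly as follows. This is a hypothetical illustration, not Confluence's actual code: the function name is invented, and it assumes a csv attachment is recognised by a case-insensitive ".csv" file extension, which the issue does not state.

```python
def enforce_csv_media_type(file_name: str, media_type: str) -> str:
    """Return 'text/csv' for csv attachments, otherwise the type unchanged.

    Assumption: a csv attachment is identified by its file extension.
    """
    if file_name.lower().endswith(".csv"):
        return "text/csv"
    return media_type


# A csv saved with an Excel media type is corrected; other files are left alone.
print(enforce_csv_media_type("meta_mailinfo_sec01.csv", "application/vnd.ms-excel"))
print(enforce_csv_media_type("report.xls", "application/vnd.ms-excel"))
```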