Type: Bug
Resolution: Fixed
Priority: Low
Fix Version/s: 3.5.11
Affects Version/s: 3.0
Component/s: None
Labels:
- affects-server
- editor

This problem occurs when the browser sends the wrong MIME type during a file upload. It appears that on Windows machines where MS Excel is the registered handler for CSV files, browsers upload CSV files with the "application/vnd.ms-excel" MIME type. Confluence then tries to index the attachment as an Excel document and fails, as the sample logs show. This can leave the search index only partially built, so pages go missing from search results.

Sample logs:

2010-02-22 11:09:56,038 WARN [Indexer: 2] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: textures-streaming.csv v.1 (3014859) kteich)
 -- url: /confluence/admin/reindex.action | userName: moise | referer: https://qix.demiurgestudios.com/confluence/admin/search-indexes.action | action: reindex
com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Excel document: Invalid header signature; read 8236850760414359372, expected -2226271756974174256
        at com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:101)
        at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:39)
        at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:43)
        at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
        at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:102)
        at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:41)
        at com.atlassian.bonnie.index.TempIndexWriter.perform(TempIndexWriter.java:72)
        at com.atlassian.confluence.search.lucene.TempIndexWriterStrategy.perform(TempIndexWriterStrategy.java:43)
        at com.atlassian.confluence.search.lucene.tasks.TempIndexBackedIndexTaskPerformer.perform(TempIndexBackedIndexTaskPerformer.java:21)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.indexCollection(DefaultObjectQueueWorker.java:73)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker$1.doInTransactionWithoutResult(DefaultObjectQueueWorker.java:61)
        at org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:33)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:127)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.run(DefaultObjectQueueWorker.java:50)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.io.IOException: Invalid header signature; read 8236850760414359372, expected -2226271756974174256
        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:103)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:90)
        at com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:87)
        ... 16 more
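
The two numbers in the "Invalid header signature" message can be decoded to see what the extractor actually read. The sketch below assumes the signature is the first 8 bytes of the file interpreted as a little-endian signed 64-bit integer (which is how the values in this log decode cleanly); the helper name signature_bytes is hypothetical and not part of POI or Confluence.

```python
import struct

# The 8-byte magic number that opens an OLE2 compound file (legacy .xls).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def signature_bytes(value: int) -> bytes:
    """Turn a signed 64-bit value from the log back into its 8 raw bytes.

    Assumption: the value was read in little-endian byte order.
    """
    return struct.pack("<q", value)


# Values copied from the "Invalid header signature" line in the log above.
read_sig = signature_bytes(8236850760414359372)
expected_sig = signature_bytes(-2226271756974174256)

print(read_sig)                    # b'Log: ,Or' (plain text, i.e. CSV content)
print(expected_sig == OLE2_MAGIC)  # True
```

The "read" value decodes to printable ASCII, which is consistent with a plain-text CSV file being handed to the Excel extractor.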

Workaround

Stop Confluence, edit the confluence/WEB-INF/classes/mime.types file and add the following entry:

text/csv    csv

This ensures that all files with the CSV extension are mapped to the text/csv MIME type regardless of what the browser sends.
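
The effect of that entry can be illustrated with a small lookup that prefers an extension mapping over the browser-supplied type. This is only a sketch of the idea, not Confluence's actual implementation; parse_mime_types and resolve_content_type are hypothetical names.

```python
def parse_mime_types(text: str) -> dict:
    """Parse mime.types-style lines ("type ext1 ext2 ...") into {ext: type}."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            mapping[ext.lower()] = mime_type
    return mapping


def resolve_content_type(filename: str, browser_type: str, mapping: dict) -> str:
    """Use the extension mapping when it has an entry, else trust the browser."""
    _, dot, ext = filename.rpartition(".")
    if dot and ext.lower() in mapping:
        return mapping[ext.lower()]
    return browser_type


mapping = parse_mime_types("text/csv    csv\n")

# A CSV uploaded with the Excel MIME type is corrected by the mapping.
csv_type = resolve_content_type(
    "textures-streaming.csv", "application/vnd.ms-excel", mapping
)
# A file with no mapped extension keeps whatever the browser sent.
xls_type = resolve_content_type("budget.xls", "application/vnd.ms-excel", mapping)
```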

Next, run the following query against the database and then start Confluence:

update attachments set contenttype='text/csv' where lower(title) like '%.csv';
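
To see which rows the query touches before running it for real, the same statement can be tried against a throwaway table. The sketch below uses an in-memory SQLite database as a stand-in for the Confluence database, with a table reduced to the two columns the query uses; the sample rows other than textures-streaming.csv are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table: only the columns the query uses.
conn.execute("create table attachments (title text, contenttype text)")
conn.executemany(
    "insert into attachments values (?, ?)",
    [
        ("textures-streaming.csv", "application/vnd.ms-excel"),
        ("REPORT.CSV", "application/vnd.ms-excel"),  # upper-case extension
        ("budget.xls", "application/vnd.ms-excel"),  # genuine Excel file
    ],
)

# Dry run: count the CSV attachments that still carry a non-CSV content type.
to_fix = conn.execute(
    "select count(*) from attachments "
    "where lower(title) like '%.csv' and contenttype <> 'text/csv'"
).fetchone()[0]

# The workaround query itself, unchanged.
cursor = conn.execute(
    "update attachments set contenttype='text/csv' where lower(title) like '%.csv'"
)
updated = cursor.rowcount

rows = dict(conn.execute("select title, contenttype from attachments"))
```

Because the filter lower-cases the title, REPORT.CSV is updated as well, while budget.xls keeps its Excel content type.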

To make the content of the CSV files searchable, you will also need to rebuild the search index (reindex).

Assignee: Niraj Bhawnani
Reporter: Andrew Moise
Votes: 0
Watchers: 4

Created: 22/Feb/2010 4:35 PM
Updated: 11/Oct/2018 8:36 AM
Resolved: 01/Aug/2011 12:26 AM
