[CONFSERVER-10512] Index optimization produces an OutOfMemoryError due to large textual attachments

      This issue duplicates CONF-8989, which was marked as resolved even though its patch was never integrated into Confluence, so I created this new issue.

      Confluence currently stores the entire contents of textual attachments (such as Word, plain-text, and PowerPoint documents) in the search index. This is done mainly to produce the excerpts that are displayed in search results.

      During index optimization, the attachment bodies are read from disk into memory in bulk (meaning a block of memory equal to the size of all indexed and stored attachments is requested from the heap). With enough large attachments, the JVM is asked for a block of heap larger than it can allocate, and an OutOfMemoryError is thrown.


            Jeremy Largman added a comment:

            We've noticed that in some cases after hitting this error, the index must be rebuilt from scratch:
            http://confluence.atlassian.com/display/DOC/Rebuild+index+from+scratch

            Paul Curren added a comment:

            MD5 (atlassian-bucket-2006.04.11.branch-v0.9-CONF-10512.jar) = e787322041d083d3b9ebd913786eb133

            atlassian-bucket-2006.04.11.branch-v0.9-CONF-10512.jar is a patched version of bucket for Confluence 2.2.10 that allows the system property atlassian.indexing.contentbody.maxsize to control the size of attachment bodies that will be stored in the index.

            Partha added a comment (edited):

            The patch works for 2.8.1 too.

            To install the patch, make a backup of your current <Confluence_install>/confluence/WEB-INF/lib/atlassian-bonnie-2.8.jar file, then remove it.
            Next, place the patch in your <Confluence_install>/confluence/WEB-INF/lib/ directory and restart Confluence.

            Chris Broadfoot [Atlassian] added a comment (edited):

            This issue is fixed in 2.9.

            You can specify the threshold in bytes by changing the system property atlassian.indexing.contentbody.maxsize. Content bodies larger than this limit will not be stored, but will still be indexed. As a result, these documents won't receive a highlighted excerpt in search results, but they will still be returned as results.
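
            To make the threshold behaviour described above concrete, here is a minimal Java sketch of the store-or-skip decision. It is not Confluence's actual code: only the property name atlassian.indexing.contentbody.maxsize comes from this issue, while the class name, the method names, and the fallback value are assumptions made up for illustration.

```java
public class ContentBodyLimit {
    // Property name as given in this issue.
    static final String PROP = "atlassian.indexing.contentbody.maxsize";

    // Hypothetical fallback (1 MiB); the real default is not stated in this issue.
    static final long ASSUMED_DEFAULT_BYTES = 1024L * 1024L;

    /** Reads the threshold in bytes, using the assumed fallback if the property is unset or not a number. */
    static long maxStoredBodySize() {
        return Long.getLong(PROP, ASSUMED_DEFAULT_BYTES);
    }

    /**
     * Returns true if a body of this size would be stored in the index
     * (and could therefore supply a search excerpt). Larger bodies would
     * still be indexed for searching, just not stored.
     */
    static boolean shouldStoreBody(long bodySizeBytes) {
        return bodySizeBytes <= maxStoredBodySize();
    }

    public static void main(String[] args) {
        System.setProperty(PROP, "1000");
        System.out.println(shouldStoreBody(500));   // within the limit
        System.out.println(shouldStoreBody(5000));  // over the limit
    }
}
```

            JVM system properties are normally supplied at startup with the standard -D flag, for example -Datlassian.indexing.contentbody.maxsize=1048576 (the value here is only an example). Where that flag belongs in a particular Confluence installation depends on how the application server is launched.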

              cbroadfoot Chris Broadfoot [Atlassian]
              jens@atlassian.com jens