[CONFSERVER-52489] Content properties missing in search index after reindex of Confluence

      The fix for this bug has been released to our Long Term Support release.

      The fix for this bug is now available in the latest releases of Confluence 7.13 and 7.19.

      In one of our plugins, we are currently using an index schema to add content properties to the search index of Confluence, as documented here: Content Properties in the REST API

      This works perfectly when inserting/updating content properties. Unfortunately, due to problems with the search index in our Confluence instance, we had to rebuild the search indexes as documented here: Rebuilding the search index

      After the reindex, none of the content properties were present in the index anymore. They are still attached to the content, but they are only indexed again once they have been updated.

      Steps to reproduce

      1. Install a fresh instance of Confluence. (We were able to reproduce this with versions 5.8.10 and 5.10.6.)
      2. Choose "Example Site" to initialize the instance with the Demonstration Space.
      3. Install the attached test plugin. All it does is define the following index schema:
        <content-property-index-schema key="test-plugin-content-property-index-schema">
            <key property-key="metadata">
                <extract path="likes" type="number" />
            </key>
        </content-property-index-schema>

      4. Find the pageId of the page "Welcome to Confluence", e.g. 12345.
      5. Check the content properties of the page with the following command:
        curl -u admin:admin -X GET "http://localhost:8090/rest/api/content/12345/property" | python -mjson.tool

      6. The page shouldn't have any content properties yet. Now add a content property with the following command:
        curl -i -u admin:admin -X POST -H "Content-Type: application/json" -d '{ "key" : "metadata", "value" : { "likes": 5 }}' http://localhost:8090/rest/api/content/12345/property

      7. Check the content properties of the page again to make sure they have been added successfully:
        curl -u admin:admin -X GET "http://localhost:8090/rest/api/content/12345/property" | python -mjson.tool

      8. You should see the inserted content property. Now use CQL to search by the indexed content property:
        http://localhost:8090/rest/api/content/search?cql=space%3Dds%20AND%20content.property%5Bmetadata%5D.likes%3C%3D5

      9. The search should return the page "Welcome to Confluence". Now reindex Confluence as described here: Rebuilding the search index
      10. Now execute the search again:
        http://localhost:8090/rest/api/content/search?cql=space%20=%20ds%20AND%20content.property[metadata].likes%20%3C=%205


      No results found! All content properties are missing in the index after reindexing Confluence!
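
      The two search URLs in the steps above are encoded differently, so it may help to confirm that they carry the same CQL and that the empty result is not caused by a different query. The following is a minimal Python sketch of that check. The helper names (search_url, cql_from_url) are hypothetical, and the whitespace handling is only a comparison aid, not a CQL parser.

```python
import re
from urllib.parse import quote, unquote

BASE_URL = "http://localhost:8090"  # local test instance used in the steps above

# The two search URLs exactly as they appear in the reproduction steps.
URL_BEFORE_REINDEX = (
    "http://localhost:8090/rest/api/content/search"
    "?cql=space%3Dds%20AND%20content.property%5Bmetadata%5D.likes%3C%3D5"
)
URL_AFTER_REINDEX = (
    "http://localhost:8090/rest/api/content/search"
    "?cql=space%20=%20ds%20AND%20content.property[metadata].likes%20%3C=%205"
)


def search_url(cql, base_url=BASE_URL):
    """Build a content search URL with the whole CQL string percent-encoded."""
    return f"{base_url}/rest/api/content/search?cql={quote(cql, safe='')}"


def cql_from_url(url):
    """Decode the cql parameter and drop optional whitespace around comparison operators."""
    encoded = url.split("?cql=", 1)[1]
    return re.sub(r"\s*(<=|>=|=|<|>)\s*", r"\1", unquote(encoded))


if __name__ == "__main__":
    # Both lines print the same normalised CQL string.
    print(cql_from_url(URL_BEFORE_REINDEX))
    print(cql_from_url(URL_AFTER_REINDEX))
```

      Since both URLs decode to the same query, the difference in results before and after the reindex comes from the index contents, not from the request.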

      Expected result

      All indexed content properties stay in the search index, even after reindexing Confluence.

      Any help would be appreciated

      Root cause

      ThreadLocalCache (com.atlassian.confluence.cache.ThreadLocalCache) is not initialized in reindexing threads. Let's look at how content properties are indexed.

      Content properties are extracted in com.atlassian.confluence.plugins.contentproperty.index.extractor.ContentPropertiesExtractor#addFields. This method relies on ThreadLocalCache to maintain permission exemptions. ThreadLocalCache however needs to be initialized before it can be used, see com.atlassian.confluence.cache.ThreadLocalCache#init. We're doing this in scheduler threads (com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper#runJob) which handle indexing for content updates, but not reindexing threads (com.atlassian.confluence.internal.index.ConcurrentBatchIndexer#accept) for whole site reindexing.

      How to fix this bug properly?

      Well, the fix should be very simple: just call ThreadLocalCache#init in reindexing threads.
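
      The root cause and the fix can be illustrated with a small, self-contained Python model. This is only a sketch and not Confluence code: ThreadLocalCacheModel, extract_fields and run_in_thread are hypothetical names, and the assumption that an uninitialized cache silently drops writes is ours, made to mirror the symptom described above.

```python
import threading


class ThreadLocalCacheModel:
    """Toy per-thread cache that ignores writes until init() is called (assumed behaviour)."""

    _state = threading.local()

    @classmethod
    def init(cls):
        cls._state.data = {}

    @classmethod
    def dispose(cls):
        if hasattr(cls._state, "data"):
            del cls._state.data

    @classmethod
    def put(cls, key, value):
        data = getattr(cls._state, "data", None)
        if data is not None:  # dropped when this thread never called init()
            data[key] = value

    @classmethod
    def get(cls, key):
        data = getattr(cls._state, "data", None)
        return None if data is None else data.get(key)


def extract_fields(properties):
    """Hypothetical extractor: records a permission exemption in the cache, then reads properties."""
    ThreadLocalCacheModel.put("permission-exempt", True)
    if not ThreadLocalCacheModel.get("permission-exempt"):
        return {}  # exemption was lost, so no properties are indexed
    return {f"content.property[metadata].{k}": v for k, v in properties.items()}


def run_in_thread(init_cache):
    """Run the extractor in a fresh thread, with or without initializing the cache first."""
    result = {}

    def work():
        if init_cache:
            ThreadLocalCacheModel.init()
        try:
            result.update(extract_fields({"likes": 5}))
        finally:
            ThreadLocalCacheModel.dispose()

    worker = threading.Thread(target=work)
    worker.start()
    worker.join()
    return result


if __name__ == "__main__":
    print("scheduler-style thread (init called):", run_in_thread(init_cache=True))
    print("reindex-style thread (no init):", run_in_thread(init_cache=False))
```

      In this model, the thread that calls init() produces the indexed field while the thread that skips it produces nothing, which corresponds to the difference between the scheduler threads and the reindexing threads described above.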

      Is there a workaround?

      Not really a workaround, but if you have control of a plugin's source code, there is a hacky way to index content properties properly during a reindex without having to modify Confluence core: just add two extractors to the plugin:

      1. An extractor that runs right before ContentPropertiesExtractor (priority = 900), i.e. with priority > 900, and calls ThreadLocalCache#init inside it.
      2. Another extractor that runs right after ContentPropertiesExtractor (priority = 900), i.e. with priority < 900, and calls ThreadLocalCache#dispose inside it.

      Prior to Confluence 7.14, these extractors need to implement the extractor module. From 7.14 onward, they must implement the newer extractor2 module.

      Attachments

        1. test-1.0.0.jar (2 kB)
        2. test-1.0.0-source.zip (10 kB)

            Philippe PEREZ added a comment - edited

            What would happen if we upgrade to this version when we have already implemented the above workaround in one of our plugins? Do we need to remove it, or is there no harm in keeping it?

            Madhubabu Kethineni (Inactive) added a comment

            A fix for this issue is available in Confluence Server and Data Center 7.13.11.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            James Ponting added a comment

            Hey 326aaff085b8,

            The fix has been released in Confluence 7.19.2. The fix version was incorrect, and I have updated it.

            Thanks,
            James Ponting
            Engineering Manager - Confluence Data Center

            Remo Siegwart added a comment - edited

            Hi James,

            Could you please clarify: in your comment you say that this has been fixed in Confluence 7.19.2, but in the "Fix Version/s" field of this ticket it says 7.19.3, which has not been released yet. In which 7.19.x version is this fix available? From our testing, it looks like this has already been fixed in 7.19.2.

            Thanks,
            Remo

            odykovy (Inactive) added a comment

            A fix for this issue is available in Confluence Server and Data Center 7.13.10.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            James Ponting added a comment

            Hey 6a66c94f366a,

            I'm going to leave the ticket as private; however, I've updated this ticket with the relevant information.

            The change is already done and awaiting release in Confluence 7.13.10 and Confluence 7.19.2.

            It will be available soon.

            Thanks,
            James Ponting
            Engineering Manager - Confluence Data Center

            Pablo Gallego _Appfire_ added a comment - edited

            Hi Jeffery Xie,

            Could you please make the internal ticket for the 7.13.10 backport public so that we can track it?

            Thanks

            James Whitehead added a comment

            A fix for this issue is available in Confluence Server and Data Center 7.20.0.
            Upgrade now or check out the Release Notes to see what other issues are resolved.

            Jeffery Xie added a comment

            Hi All,

            We have an internal ticket to address the backport of this issue, targeting the fix for 7.13 LTS (planned version 7.13.10).

            Karen.Mustain added a comment

            +1 for backporting to the LTS support version of 7.13. We use add-ons from both Comala/Appfire and K15t.

              5339cdd01cf4 Jeffery Xie
              326aaff085b8 Remo Siegwart
              Affected customers: 44
              Watchers: 44