  1. Confluence Data Center
  2. CONFSERVER-39925

Cluster safety not enabled for Confluence 5.8.x Server Edition

Details

    Description

      The cluster safety mechanism is not working for 5.8.x Server editions of Confluence. This removes a safety net that prevents multiple Confluence instances from inadvertently updating the same database, which could lead to unexpected data corruption.

      For example, this can happen when cloning the production environment to create a test environment, and the step to update the database connection to point to the test environment is missed. In this scenario, both the Prod and Test Confluence instances would be updating the same DB.
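      The Prod/Test scenario can be illustrated with a toy model. This is only a sketch of how a safety net of this kind could detect a second writer, not Confluence's implementation: the function name `safety_job`, the file standing in for the shared database row, and the per-instance memory file are all hypothetical.

```shell
# Toy model only: a file stands in for the single shared safety row,
# and a second file stands in for what one instance remembers writing.
safety_job() {
  shared_row="$1"   # hypothetical stand-in for the shared database row
  memory="$2"       # hypothetical per-instance memory of the last value written
  node="$3"         # label for this instance, e.g. prod or test

  # If this instance has written before, the shared row must still hold its value.
  if [ -f "$memory" ] && [ "$(cat "$shared_row")" != "$(cat "$memory")" ]; then
    echo "$node: panic, another instance wrote to the shared row"
    return 1
  fi

  # Otherwise record a fresh value in both places for the next run.
  value="$node-$(date +%s)"
  printf '%s\n' "$value" > "$shared_row"
  printf '%s\n' "$value" > "$memory"
  echo "$node: ok"
}
```

      In this model, calling `safety_job` for a "prod" instance, then for a "test" instance against the same shared file, and then for "prod" again makes the last call report a panic, because the shared value no longer matches what "prod" wrote.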

      Symptoms

      • The clustersafety table contains no rows (expected behavior: contains 1 row)
      • Multiple 5.8.x Server instances can connect to the same Confluence database and not fail (expected behavior: cluster panic)
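      The first symptom can be checked with a row count on the clustersafety table. The helper below is a minimal sketch: `check_clustersafety_rows` is a hypothetical name, and the commented usage assumes PostgreSQL with a database called "confluence", neither of which is stated in this issue.

```shell
# Interpret a row count taken from the clustersafety table.
# Expected behavior per the symptoms above: exactly 1 row.
check_clustersafety_rows() {
  count="$1"
  if [ "$count" -eq 1 ]; then
    echo "ok: clustersafety has 1 row"
  else
    echo "suspect: clustersafety has $count rows, expected 1"
    return 1
  fi
}

# Assumed usage (PostgreSQL, database name "confluence" is an assumption):
#   check_clustersafety_rows "$(psql -At -d confluence -c 'SELECT COUNT(*) FROM clustersafety;')"
```

      For other databases, replace the psql call with the equivalent client command; the query itself only relies on the table name given above.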

      Other notes

      • Confluence 5.7 and below Server and Data Center editions have working cluster safety mechanisms, as expected
      • Confluence 5.8.x Data Center edition has a working cluster safety mechanism as expected
      • This affects 5.8.1 Server and above

      Testing notes:

      • Start two standalone Confluence nodes and point them at the same database
      • Ensure that the cluster safety job is scheduled and running
      • Ensure that one node panics when the cluster safety job runs

      Cluster configuration check:

      • Set up Confluence Data Center (CDC) locally with 2 nodes
      • Ensure that the safety job runs and the nodes don't panic (you may want to set the logging level to debug for com.atlassian.confluence.cluster.hazelcast.HazelcastClusterSafetyManager)
      • Emulate a network outage between the nodes. I used the pfctl utility for that:
        sudo pfctl -e
        (sudo pfctl -sr; echo "block drop quick on lo0 proto tcp from any to any port = 5802") | sudo pfctl -f -
        (sudo pfctl -sr; echo "block drop quick on lo0 proto tcp from any to any port = 5801") | sudo pfctl -f -
        sudo pfctl -v -s rules
        

        To clean up filtering:

        sudo pfctl -f /etc/pf.conf
        
      • One node should panic when the safety job runs.
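      The two pfctl rule lines used for the network outage differ only in the port, so they can be generated by a small helper. This is a sketch that reuses the rule text and ports from the steps above; the function names `block_rule` and `with_block_rules` are hypothetical.

```shell
# Build the pf rule that drops loopback TCP traffic to one port.
block_rule() {
  echo "block drop quick on lo0 proto tcp from any to any port = $1"
}

# Print the existing rules (read from stdin) followed by one block rule per port.
with_block_rules() {
  cat
  for port in "$@"; do
    block_rule "$port"
  done
}

# Assumed usage, loading both rules in a single step:
#   sudo pfctl -e
#   sudo pfctl -sr | with_block_rules 5801 5802 | sudo pfctl -f -
#   sudo pfctl -v -s rules
```

      Cleaning up is unchanged: reload the default rules with sudo pfctl -f /etc/pf.conf.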

            People

              mfedoryshyn Maksym Fedoryshyh
              rchang Robert Chang