Bamboo Data Center / BAM-21400

Allow the user to control cluster locking release and failover mechanism

    • Type: Suggestion
    • Resolution: Fixed
    • 9.6.0
    • Data center
    • None
    • 12

      Problem Description

      As a systems administrator, I am bound by a fixed 5-minute lock timeout held by the active Bamboo instance. The lock cannot be manipulated even when the active Bamboo instance has been shut down. This static 5-minute lock forces a delay on any cluster takeover or manual switch between cluster members (including passive/disaster recovery takeovers).

      Suggested Solution

      Define a way to:

      • Tune the lock timeout, i.e. the cluster takeover time (to allow fine-grained control over cluster timings)
      • Retrieve the lock ownership and status (for monitoring)
      • Release the lock immediately on shutdown (so other nodes can take over immediately)
      • Hold the lock so other nodes will not take over the service (in case the active node just needs to restart but should remain active)
      • Forcibly take over the lock via a special initialisation flag (to be used in a hard takeover from the passive node)

      These settings should be configurable either via properties/Java environment or through the API.
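To make the requested behaviours concrete, here is a minimal in-memory sketch of a lease-style lock that supports all five operations. It is an illustration of the semantics being asked for, not Bamboo's implementation: the class `LeaseLock`, its method names, the `force` argument and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Optional


class LeaseLock:
    """Hypothetical model of a cluster lock with a tunable lease timeout."""

    def __init__(self, timeout_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s      # tunable; 300 s mirrors the static 5 minutes
        self.clock = clock              # injectable so the lease can be tested
        self.owner: Optional[str] = None
        self.renewed_at = 0.0
        self.held = False               # True pins the lock so it never expires

    def _expired(self) -> bool:
        if self.owner is None:
            return True
        if self.held:
            return False
        return self.clock() - self.renewed_at >= self.timeout_s

    def status(self) -> dict:
        """Report ownership and state, e.g. for monitoring."""
        return {"owner": self.owner, "held": self.held, "expired": self._expired()}

    def acquire(self, node_id: str, force: bool = False) -> bool:
        """Take or renew the lock; force=True models the hard-takeover flag."""
        if force or self._expired() or self.owner == node_id:
            if self.owner != node_id:
                self.held = False       # a new owner does not inherit the pin
            self.owner = node_id
            self.renewed_at = self.clock()
            return True
        return False

    def hold(self, node_id: str) -> bool:
        """Pin the lock for its owner, e.g. across a restart of the active node."""
        if self.owner == node_id:
            self.held = True
            return True
        return False

    def release(self, node_id: str) -> bool:
        """Give the lock up immediately, e.g. during a clean shutdown."""
        if self.owner == node_id:
            self.owner = None
            self.held = False
            return True
        return False
```

With this model, a node that calls `release` on shutdown lets a passive node `acquire` without waiting for the timeout, while `hold` keeps the lock with a restarting node until another node passes `force=True`.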

       

      Workaround for displaying node status

      • To list all the cluster nodes:
        SELECT * FROM cluster_node_heartbeat;
      • To list the active node:
        SELECT node_id FROM cluster_lock WHERE lock_name='bamboo_primary_cluster_node';
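For monitoring, the two queries above could be wrapped in a small script. The sketch below runs them through a Python DB-API connection. The helper names `cluster_status` and `make_demo_db` are hypothetical, and the in-memory SQLite tables are a stand-in with an assumed minimal schema, not Bamboo's real one. A real Bamboo database would need its own driver, and some drivers use a different parameter placeholder than `?`.

```python
import sqlite3

LIST_NODES = "SELECT * FROM cluster_node_heartbeat"
ACTIVE_NODE = "SELECT node_id FROM cluster_lock WHERE lock_name = ?"
PRIMARY_LOCK = "bamboo_primary_cluster_node"


def cluster_status(conn):
    """Return (all heartbeat rows, node_id holding the primary lock or None)."""
    cur = conn.cursor()
    cur.execute(LIST_NODES)
    nodes = cur.fetchall()
    cur.execute(ACTIVE_NODE, (PRIMARY_LOCK,))
    row = cur.fetchone()
    return nodes, (row[0] if row else None)


def make_demo_db():
    """Build an in-memory stand-in database (assumed minimal schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cluster_node_heartbeat (node_id TEXT)")
    conn.execute("CREATE TABLE cluster_lock (lock_name TEXT, node_id TEXT)")
    conn.executemany(
        "INSERT INTO cluster_node_heartbeat VALUES (?)",
        [("node-a",), ("node-b",)],
    )
    conn.execute("INSERT INTO cluster_lock VALUES (?, ?)", (PRIMARY_LOCK, "node-a"))
    return conn
```

Calling `cluster_status(make_demo_db())` returns the two stand-in heartbeat rows together with the node recorded as holding the primary lock.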

              851f15845f55 Mateusz Szmal
              jowen@atlassian.com Jeremy Owen
              Votes: 39
              Watchers: 24
