Confluence Data Center / CONFSERVER-60516

'crowd.polling.startdelay' value not respected in single-node Confluence DC setups


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Low
    • Fix Version/s: None
    • Affects Version/s: 7.4.1, 7.4.3
    • Component/s: User - Management
    • Labels: None

    Description

      Issue Summary

      The 'crowd.polling.startdelay' value is not respected in single-node Confluence DC environments on reboot.

      Steps to Reproduce

      1. Stand up a 7.4.1 or 7.4.3 Confluence DC node and make it the only node in the cluster.
      2. Add '-Dcrowd.polling.startdelay=60000' to the setenv.sh file as outlined in How to delay the directory sync on startup in Confluence.
      3. Configure one or more external user directories and set the sync interval to 5 minutes.
      4. After the first sync, reboot the node.

      Expected Results

      For a single-node cluster, the initial user sync on reboot should be delayed by the configured start delay, then follow its expected sync interval cadence after that point (every 5 minutes).

      Actual Results

      For a single-node cluster, the sync cadence does not change when the node is rebooted. The 'timestart' value does not change post-reboot, and thus the delay configured in setenv.sh is not respected.

      Other notes

      For Confluence Data Center, there is a 'scheduler_clustered_jobs' table in the database containing a 'com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerManager.<directory_id>' entry for each configured user directory, which controls that directory's sync start time.

      This value is read by each node in the cluster so that multiple nodes don't run the same user sync job at the same time. Rebooting one node wouldn't alter this start time, since another node should still be online to run the user sync job.
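
      To inspect the stored start time for these poller jobs directly, a query along the following lines can be run against the Confluence database. This is only a sketch: the 'next_run' column name is an assumption based on the clustered scheduler schema and may differ between versions, and (as with the workaround below) the syntax is formatted for PostgreSQL.

        -- Hypothetical inspection query: list the directory poller jobs and their
        -- next scheduled fire time ('next_run' is assumed to hold an epoch value in milliseconds).
        SELECT job_id, next_run
        FROM scheduler_clustered_jobs
        WHERE job_id LIKE '%com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerManager%';

      Running this before and after a reboot of the single node shows that the stored start time does not change, which is the behaviour described above.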

      It should be noted that the '-Dcrowd.polling.startdelay=<time_in_ms>' works as expected in Confluence Server to delay the start of the sync post-reboot.

      For single-node Confluence DC clusters, the delay should be respected, since there isn't another node available to pick up the user sync if this node goes down. In this case, Confluence should operate the same as Confluence Server.

      Workaround

      1. Clear out the 'com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerManager.<directory_id>' values in the 'scheduler_clustered_jobs' DB table while the node is down:
        DELETE FROM scheduler_clustered_jobs WHERE job_id LIKE '%com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerManager%';
        

        The syntax is formatted for PostgreSQL and may differ for other database types.

      2. Start the node.
        Please note: the sync interval and the delay are added together to set the next start time for each user directory. For instance, with a sync interval of 5 minutes and a delay of 5 minutes, the next sync would start ~10 minutes after the node was restarted and came back online, and would follow a 5-minute cadence after that.

      If you simply want the sync to resume at its normal cadence, you can remove '-Dcrowd.polling.startdelay=<time_in_ms>' from your setenv.sh file and skip the DELETE command above; the directories will resume their normal cadence after the system is online (i.e. the next sync start time will be the directory's interval added to the time the node came online).

      Do not run this DELETE while the node is online, as it will clear out the next scheduled sync and the user directories will not sync. This can be rectified by restarting the single node.
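
      Once the node is back online, the same style of query can be used to confirm that the poller rows were re-created with the delayed start time (again, the 'next_run' column name is an assumption and may vary by version; syntax formatted for PostgreSQL).

        -- Hypothetical verification query, run after the restart: the re-created rows should show a
        -- next fire time of roughly (restart time + sync interval + crowd.polling.startdelay).
        SELECT job_id, next_run
        FROM scheduler_clustered_jobs
        WHERE job_id LIKE '%DirectoryPollerManager%';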

      People

        Assignee: Unassigned
        Reporter: David Maye (dmaye@atlassian.com)
        Votes: 0
        Watchers: 1
