  1. Confluence Data Center
  2. CONFSERVER-11994

Race condition in HiLoIdRepairUpgradeTask and ResetHiLoAfterImportListener can lead to primary key violations in upgrade tasks

    • Type: Bug
    • Resolution: Fixed
    • Priority: Medium
    • Fix Version/s: 2.9
    • Affects Version/s: 2.8, 2.8.1
    • None

      Resetting the ResettableHiLoIdGenerator now happens in two parts:

      • the HiLoIdRepairUpgradeTask determines the lowest possible safe 'hi' value and puts it in the database
      • the ResetHiLoAfterImportListener forces the configured ID generators to retrieve the next hi value from the database

      Unfortunately there is a gap between the two during which more IDs may be assigned by other upgrade tasks. In Studio's case, the InitApplicationKeysUpgradeTask was being run with stale ID generators, causing really hard-to-track-down duplicate key errors.
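The gap described above can be illustrated with a minimal sketch of a hi/lo generator. Everything here is an assumption made for illustration: `HiTable`, `SketchHiLoGenerator`, `StaleGeneratorDemo`, the block size of 10 and the highest existing ID of 25 are hypothetical and are not taken from Confluence's code. The point is only that a generator holding a cached block keeps handing out IDs from it until it is explicitly reset, regardless of what has been written to the database.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the database row that stores the next 'hi' value. */
final class HiTable {
    private final AtomicLong nextHi;
    HiTable(long initial) { this.nextHi = new AtomicLong(initial); }
    long getAndIncrement() { return nextHi.getAndIncrement(); }
    void set(long value) { nextHi.set(value); }
}

/** Simplified hi/lo generator (assumption): id = hi * blockSize + lo. */
final class SketchHiLoGenerator {
    private final HiTable table;
    private final int blockSize;
    private long hi;
    private int lo;

    SketchHiLoGenerator(HiTable table, int blockSize) {
        this.table = table;
        this.blockSize = blockSize;
        this.lo = blockSize; // block exhausted, so the first call fetches a 'hi'
    }

    synchronized long next() {
        if (lo >= blockSize) {
            hi = table.getAndIncrement();
            lo = 0;
        }
        return hi * blockSize + lo++;
    }

    /** Discards the cached block so the next call re-reads 'hi' from the table. */
    synchronized void reset() { lo = blockSize; }
}

final class StaleGeneratorDemo {
    /** Returns { ID issued during the gap, ID issued after the reset }. */
    static long[] run() {
        int blockSize = 10;
        HiTable table = new HiTable(0);
        SketchHiLoGenerator generator = new SketchHiLoGenerator(table, blockSize);
        generator.next(); // the generator now caches the block for hi = 0

        long maxExistingId = 25; // assumed: rows with IDs up to 25 already exist

        // Part 1: the repair task stores the lowest safe 'hi' in the database.
        table.set(maxExistingId / blockSize + 1);

        // The gap: another upgrade task asks the still-stale generator for an ID.
        long issuedInGap = generator.next();

        // Part 2: the listener forces the generator to re-read 'hi'.
        generator.reset();
        long issuedAfterReset = generator.next();

        return new long[] { issuedInGap, issuedAfterReset };
    }
}
```

In this sketch the ID issued during the gap falls inside the range of existing rows, which is the kind of duplicate key the description reports, while the ID issued after the reset lies above it.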

      The ResetHiLoAfterImportListener was installed to make imports safer in a cluster (previously, the generators were only being reset properly on the node where the import was performed). My recommended fix would be to create a dedicated HiLoIdResetEvent that the HiLoIdRepairUpgradeTask publishes specifically for the ResetHiLoAfterImportListener, so that the IDs are all properly reset on all nodes before the restore continues.
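One possible shape for the recommended fix is sketched below. Only the name `HiLoIdResetEvent` comes from the description; `HiLoIdResetListener`, `SyncPublisher`, `ResettableGenerator` and `RepairTaskSketch` are hypothetical and do not represent Confluence's actual event API. The sketch assumes the event is delivered synchronously, so the repair task does not return until every registered listener has reset its generators.

```java
import java.util.ArrayList;
import java.util.List;

/** Event named after the proposal in the description; it carries no payload here. */
final class HiLoIdResetEvent {}

/** Hypothetical listener contract, not Confluence's real event API. */
interface HiLoIdResetListener {
    void handle(HiLoIdResetEvent event);
}

/** Minimal synchronous publisher: publish() returns only after every listener has run. */
final class SyncPublisher {
    private final List<HiLoIdResetListener> listeners = new ArrayList<>();

    void register(HiLoIdResetListener listener) { listeners.add(listener); }

    void publish(HiLoIdResetEvent event) {
        for (HiLoIdResetListener listener : listeners) {
            listener.handle(event);
        }
    }
}

/** Stand-in for an ID generator that caches a block of IDs. */
final class ResettableGenerator {
    private boolean stale;
    void markStale() { stale = true; }
    void reset() { stale = false; }
    boolean isStale() { return stale; }
}

final class RepairTaskSketch {
    /** Returns true if the generator is still stale when the next task would start. */
    static boolean generatorStaleAfterRepair() {
        ResettableGenerator generator = new ResettableGenerator();
        SyncPublisher publisher = new SyncPublisher();
        publisher.register(event -> generator.reset());

        // Part 1: writing the new 'hi' value makes every cached block stale.
        generator.markStale();

        // Proposed change: publish the reset event before the task returns,
        // which closes the gap between the two parts.
        publisher.publish(new HiLoIdResetEvent());

        // The next upgrade task would start here.
        return generator.isStale();
    }
}
```

This covers a single JVM only. Reaching every node in a cluster, as the description requires, would depend on how events are propagated between nodes, which the sketch does not model.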

            Andrew Lynch (Inactive) added a comment - An over-the-shoulder review was performed against the relevant parts of Anatoli's work on http://jira.atlassian.com/browse/CONF-11994 and confirmed that it fixes the issue.

            Andrew Lynch (Inactive) added a comment - The commits were performed against CONF-11494.

            Per Fragemann [Atlassian] added a comment - I could find neither a commit nor a comment stating why this has been resolved, so I am reopening this issue for further investigation.

            Christopher Owen [Atlassian] added a comment - Andrew, maybe you should set up automatic notifications from JIRA to send you a list of issues requiring action (i.e. issues assigned to you that are in a status requiring action, such as "To be reviewed"). Having these issues open for longer than necessary really clutters the roadmap and makes it hard for us to see progress against the current release.

            Christopher Owen [Atlassian] added a comment - Andrew, are you going to review this?

              alynch Andrew Lynch (Inactive)
              cmiller@atlassian.com Charles Miller (Inactive)