Bitbucket 8.19 (Hazelcast 5.2) Azure discovery does not work with a Virtual Machine Scale Set when hazelcast.network.azure.instance.metadata.available=true


    • Severity 2 - Major

      Summary

      It is not possible to use Bitbucket 8.19 (Hazelcast 5.2) in Azure setups that combine Virtual Machine Scale Sets (VMSS) with Azure Instance Metadata Service (IMDS) discovery, that is, with the following setting:

      hazelcast.network.azure.instance.metadata.available=true
      

      A bug in Azure's API prevents tagging the network interfaces of a VMSS (by contrast, tagging the network interfaces of standalone VMs works).

      The equivalent setup with Bitbucket 7.x (Hazelcast 3.12) worked in our test environment.

      Analysis

      Hazelcast 5.2 relies on the Azure REST API endpoint List Virtual Machine Scale Set Network Interfaces to get tagged interfaces. Because of the Azure problem, no interfaces are tagged, so Hazelcast does not form a cluster.

      1. Bitbucket 7.x uses Hazelcast 3.12.
        Hazelcast 3.12 uses Microsoft-provided libraries to access the Azure API.
      2. Bitbucket 8.19 uses Hazelcast 5.2, which no longer uses the Microsoft libraries.
        Instead, it relies on the publicly documented Azure REST API.
      3. When running as part of a Virtual Machine Scale Set (VMSS), the code inside Hazelcast 5.2 does this:
        1. Calls the Azure REST API endpoint List Virtual Machine Scale Set Network Interfaces.
        2. Parses the response looking for the "tag" property attached to the network interface.

      Here is the problem: network interfaces attached to VM instances operating inside the Azure VM scale set are not tagged by Azure.
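
      The effect of the two steps above can be sketched as follows. This is not Hazelcast's actual code, only an illustration of tag-based filtering. The response shape (a "value" array whose entries carry an optional "tags" object and "properties.ipConfigurations[].properties.privateIPAddress") is assumed from the public Azure REST documentation, and the sample data, the tag name "cluster" and the function name are hypothetical.

```python
import json

# Hypothetical, trimmed response in the assumed shape of the Azure
# "List Virtual Machine Scale Set Network Interfaces" result.
SAMPLE_RESPONSE = json.loads("""
{
  "value": [
    {
      "name": "nic-a",
      "tags": {"cluster": "bitbucket"},
      "properties": {"ipConfigurations": [
        {"properties": {"privateIPAddress": "10.0.0.4"}}
      ]}
    },
    {
      "name": "nic-b",
      "properties": {"ipConfigurations": [
        {"properties": {"privateIPAddress": "10.0.0.5"}}
      ]}
    }
  ]
}
""")


def tagged_private_ips(response, tag_key, tag_value):
    """Return the private IPs of interfaces that carry tag_key=tag_value."""
    ips = []
    for nic in response.get("value", []):
        tags = nic.get("tags") or {}
        if tags.get(tag_key) != tag_value:
            continue  # untagged or differently tagged interface is skipped
        for ip_config in nic.get("properties", {}).get("ipConfigurations", []):
            ip = ip_config.get("properties", {}).get("privateIPAddress")
            if ip:
                ips.append(ip)
    return ips


if __name__ == "__main__":
    print(tagged_private_ips(SAMPLE_RESPONSE, "cluster", "bitbucket"))
```

      In the sample, only the interface that carries the tag is returned. When no interface in the scale set carries a tag, as described in this issue, such a filter returns an empty list, so a node finds no peers to join.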

      What we tried

      We tried to tag network interfaces using several approaches but to no avail:

      1. Add tags to the network interface while creating a VM scale set.
        Azure accepts the tag, but the above REST API endpoint shows no tags on any interface.
      2. In the https://portal.azure.com web portal, going to "VM scale set / Networking / Network settings / Network security group", then "Network interfaces / <choose one interface>", and choosing "Add tags" displays an error in the web UI:
        Details
        columnNumber: 57
        fileName: https://portal.azure.com/Content/Dynamic/hDWs869FxoLX.js line 37 > Function
        lineNumber: 3
        message: Unable to process binding "with: function(){return tagsEditor }" Message: tagsEditor is not defined
        name: ReferenceError
        stack:
          with@https://portal.azure.com/Content/Dynamic/hDWs869FxoLX.js line 37 > Function:3:57
          i@https://portal.azure.com/Content/Dynamic/hDWs869FxoLX.js:37:7336
          f/t</<@https://portal.azure.com/Content/Dynamic/hDWs869FxoLX.js:37:10772
          o/init/<@https://portal.azure.com/Content/Dynamic/hDWs869FxoLX.js:37:22638
          evaluateImmediate_CallReadThenEndDependencyDetection@https://portal.azure.com/Content/Dynamic/H7ZYsMnJ-ch2.js:43:29183
        ...
        ...
        
      3. The REST API call "add tag to network interface" does not work either and returns an error:
        {
          "error": {
            "code": "AuthorizationFailed",
            "message": "The client '8a053114-1926-44f6-bddb-3385602d02a5' with object id '8a053114-1926-44f6-bddb-3385602d02a5' does not have authorization to perform action 'Microsoft.Network/networkInterfaces/write' over scope '/subscriptions/46f50e93-27c2-48cf-ae60-3400d294f77f/resourceGroups/Nenad-test2-RG/providers/Microsoft.Network/networkInterfaces/Nenad-test2-RG-vnet-nic01' or the scope is invalid. If access was recently granted, please refresh your credentials."
          }
        }
        

      The same thing happens with both "system assigned managed identity" and "user assigned managed identity".
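
      For reference, the REST attempt in step 3 presumably has roughly the following shape. This sketch only builds the request and does not send it. The resource path mirrors the scope shown in the error message above; the use of PATCH with a "tags" body follows Azure's documented "Update Tags" operation for network interfaces. The api-version value, the function name and all argument values are assumptions for illustration.

```python
import json
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-09-01"  # assumption: substitute a version your subscription supports


def build_update_tags_request(subscription_id, resource_group, nic_name, tags, token):
    """Build (but do not send) a PATCH request that sets tags on a network interface."""
    url = (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/networkInterfaces/{nic_name}"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps({"tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers.
    req = build_update_tags_request(
        "<subscription-id>", "<resource-group>", "<nic-name>",
        {"cluster": "bitbucket"}, "<access-token>",
    )
    print(req.get_method(), req.full_url)
```

      The action named in the AuthorizationFailed error, 'Microsoft.Network/networkInterfaces/write', is the permission such a request needs on the target interface.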

      How to replicate

      1. Set up an Azure Virtual Machine Scale Set with 2 VM instances that use metadata discovery (IMDS).
        • Try to add tags to the network interfaces using any approach. It won't work: the interfaces will still have no tags.
      2. Try to form a Hazelcast 5.2 cluster (use Bitbucket 8.19 or some test code).
        • Cluster formation will fail: there will be 2 clusters, each with 1 node.

      Known workarounds

      1. Use the Bitbucket clustering configuration with Azure service principal settings, that is, set all of these properties:
        hazelcast.network.azure.instance.metadata.available=false
        hazelcast.network.azure.tag=...
        hazelcast.network.azure.resource.group=...
        hazelcast.network.azure.tenant.id=...
        hazelcast.network.azure.subscription.id=...
        hazelcast.network.azure.client.id=...
        hazelcast.network.azure.client.secret=...
        
      2. Use the tcpip discovery mechanism and add the IP addresses as a hardcoded list in the bitbucket.properties file. This can be automated: a node that is being added puts the complete list of existing nodes in its own configuration, and when it connects to those nodes, they learn about the new node as well.
        Note

        The Hazelcast documentation page "Discovering Members by TCP" states: "You do not have to list all these cluster members, but at least one of the listed members has to be active in the cluster when a new member joins".
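
      Both workarounds lend themselves to a small helper when nodes are provisioned by script. The sketch below is illustrative only. The service principal keys are the ones listed in workaround 1. The property names hazelcast.network.tcpip and hazelcast.network.tcpip.members and the port 5701 are assumptions that do not appear in this issue and should be checked against the Bitbucket documentation for your version. All function names are hypothetical.

```python
# Keys from workaround 1 that must all have a value.
REQUIRED_SERVICE_PRINCIPAL_KEYS = [
    "hazelcast.network.azure.tag",
    "hazelcast.network.azure.resource.group",
    "hazelcast.network.azure.tenant.id",
    "hazelcast.network.azure.subscription.id",
    "hazelcast.network.azure.client.id",
    "hazelcast.network.azure.client.secret",
]
METADATA_KEY = "hazelcast.network.azure.instance.metadata.available"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def service_principal_problems(props):
    """Return the keys that are missing, empty, or wrongly set for workaround 1."""
    problems = [k for k in REQUIRED_SERVICE_PRINCIPAL_KEYS if not props.get(k)]
    if props.get(METADATA_KEY) != "false":
        problems.append(METADATA_KEY)  # must be explicitly false
    return problems


def tcpip_properties(member_ips, port=5701):
    """Render the (assumed) tcpip discovery lines for workaround 2."""
    members = ",".join(f"{ip}:{port}" for ip in member_ips)
    return "\n".join([
        "hazelcast.network.tcpip=true",
        f"hazelcast.network.tcpip.members={members}",
    ])


if __name__ == "__main__":
    sample = METADATA_KEY + "=false\nhazelcast.network.azure.tag=example\n"
    print(service_principal_problems(parse_properties(sample)))
    print(tcpip_properties(["10.0.0.4", "10.0.0.5"]))
```

      A provisioning script could run service_principal_problems against the node's bitbucket.properties before startup, or append the output of tcpip_properties with the addresses of the nodes that are already running.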

            Assignee:
            Unassigned
            Reporter:
            Nenad Opsenica (Inactive)
            Votes:
            1
            Watchers:
            6