  1. Bitbucket Data Center
  2. BSERV-12149

Invalid AWS Configuration error when starting up AWS Cluster

Details

    Description

      Symptoms

      Stack trace in the logs:

       2017-12-04 22:35:38,428 WARN [localhost-startStop-1] [hazelcast.cluster.impl.TcpIpJoinerOverAWS] log [an-ip]:5801 [dev] [3.5.2-atlassian-36] Invalid Aws Configuration
      com.hazelcast.config.InvalidConfigurationException: Invalid Aws Configuration
      	at com.hazelcast.aws.impl.DescribeInstances.getKeysFromIamRole(DescribeInstances.java:82)
      	at com.hazelcast.aws.impl.DescribeInstances.<init>(DescribeInstances.java:59)
      	at com.hazelcast.aws.AWSClient.getPrivateIpAddresses(AWSClient.java:48)
      	at com.hazelcast.cluster.impl.TcpIpJoinerOverAWS.getMembers(TcpIpJoinerOverAWS.java:42)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.getPossibleAddresses(TcpIpJoiner.java:396)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.joinViaPossibleMembers(TcpIpJoiner.java:126)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.doJoin(TcpIpJoiner.java:86)
      	at com.hazelcast.cluster.impl.AbstractJoiner.join(AbstractJoiner.java:93)
      	at com.hazelcast.instance.Node.join(Node.java:535)
      	at com.hazelcast.instance.Node.start(Node.java:344)
      	at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:126)
      	at com.hazelcast.instance.HazelcastInstanceFactory.constructHazelcastInstance(HazelcastInstanceFactory.java:152)
      	at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:135)
      	at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:111)
      	at com.hazelcast.core.Hazelcast.newHazelcastInstance(Hazelcast.java:58)
      	at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.startCluster(HazelcastClusterManager.java:315)
      	at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.reconfigure(HazelcastClusterManager.java:287)
      	at com.atlassian.confluence.cluster.DefaultClusterConfigurationHelper.bootstrapCluster(DefaultClusterConfigurationHelper.java:317)
      	at com.atlassian.confluence.setup.DefaultBootstrapManager.afterConfigurationLoaded(DefaultBootstrapManager.java:834)
      	at com.atlassian.config.bootstrap.DefaultAtlassianBootstrapManager.init(DefaultAtlassianBootstrapManager.java:75)
      	at com.atlassian.confluence.setup.DefaultBootstrapManager.init(DefaultBootstrapManager.java:180)
      	at com.atlassian.config.util.BootstrapUtils.init(BootstrapUtils.java:36)
      	at com.atlassian.confluence.setup.ConfluenceConfigurationListener.initialiseBootstrapContext(ConfluenceConfigurationListener.java:130)
      	at com.atlassian.confluence.setup.ConfluenceConfigurationListener.contextInitialized(ConfluenceConfigurationListener.java:64)
      	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4853)
      	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5314)
      	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
      	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1408)
      	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1398)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)

      After adding debug logging we see this:

      2017-12-05 01:51:45,406 ERROR [localhost-startStop-1] [hazelcast.aws.impl.DescribeInstances] log query: latest/meta-data/iam/security-credentials/node-role-name
      2017-12-05 01:51:45,411 ERROR [localhost-startStop-1] [hazelcast.aws.impl.DescribeInstances] log url: http://169.254.169.254latest/meta-data/iam/security-credentials/node-role-name
      java.net.SocketException: Unexpected end of file from server
      	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:851)
      	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
      	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:848)
      	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
      	at java.net.URL.openStream(URL.java:1045)
      	at com.hazelcast.aws.impl.DescribeInstances.getKeysFromIamRole(DescribeInstances.java:81)
      	at com.hazelcast.aws.impl.DescribeInstances.<init>(DescribeInstances.java:63)
      	at com.hazelcast.aws.AWSClient.getPrivateIpAddresses(AWSClient.java:48)
      	at com.hazelcast.cluster.impl.TcpIpJoinerOverAWS.getMembers(TcpIpJoinerOverAWS.java:42)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.getPossibleAddresses(TcpIpJoiner.java:396)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.joinViaPossibleMembers(TcpIpJoiner.java:126)
      	at com.hazelcast.cluster.impl.TcpIpJoiner.doJoin(TcpIpJoiner.java:86)
      	at com.hazelcast.cluster.impl.AbstractJoiner.join(AbstractJoiner.java:93)
      	at com.hazelcast.instance.Node.join(Node.java:535)
      	at com.hazelcast.instance.Node.start(Node.java:344)
      	at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:126)
      	at com.hazelcast.instance.HazelcastInstanceFactory.constructHazelcastInstance(HazelcastInstanceFactory.java:152)
      	at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:135)
      	at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:111)
      	at com.hazelcast.core.Hazelcast.newHazelcastInstance(Hazelcast.java:58)
      	at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.startCluster(HazelcastClusterManager.java:315)
      	at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.reconfigure(HazelcastClusterManager.java:287)
      	at com.atlassian.confluence.cluster.DefaultClusterConfigurationHelper.bootstrapCluster(DefaultClusterConfigurationHelper.java:317)
      	at com.atlassian.confluence.setup.DefaultBootstrapManager.afterConfigurationLoaded(DefaultBootstrapManager.java:834)
      	at com.atlassian.config.bootstrap.DefaultAtlassianBootstrapManager.init(DefaultAtlassianBootstrapManager.java:75)
      	at com.atlassian.confluence.setup.DefaultBootstrapManager.init(DefaultBootstrapManager.java:180)
      	at com.atlassian.config.util.BootstrapUtils.init(BootstrapUtils.java:36)
      	at com.atlassian.confluence.setup.ConfluenceConfigurationListener.initialiseBootstrapContext(ConfluenceConfigurationListener.java:130)
      	at com.atlassian.confluence.setup.ConfluenceConfigurationListener.contextInitialized(ConfluenceConfigurationListener.java:64)
      	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4853)
      	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5314)
      	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
      	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1408)
      	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1398)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)

      Cause

      The URL built for the IAM role lookup is, in some cases, missing the forward slash between the hostname and the 'query' parameter, as the URL in the debug log above shows. The URL is constructed as follows:
      URL url = new URL("http", IAM_ROLE_ENDPOINT, query);
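      To illustrate the cause, here is a minimal sketch. It assumes the three-argument java.net.URL constructor uses the file argument exactly as given, which matches the URL in the debug log. The class MetadataUrlSketch and the helper buildUrl are hypothetical names introduced for this example; they are not part of Hazelcast, and this is not necessarily how the issue was fixed upstream.

```java
import java.net.MalformedURLException;
import java.net.URL;

class MetadataUrlSketch {
    // Host taken from the debug log above; the constant name mirrors the snippet in this issue.
    static final String IAM_ROLE_ENDPOINT = "169.254.169.254";

    // Hypothetical helper: prefixes the path with '/' when it is missing.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static URL buildUrl(String host, String query) throws MalformedURLException {
        String file = query.startsWith("/") ? query : "/" + query;
        return new URL("http", host, file);
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws MalformedURLException {
        String query = "latest/meta-data/iam/security-credentials/node-role-name";
        // Same call shape as in the issue: the path is passed without a leading slash.
        System.out.println(new URL("http", IAM_ROLE_ENDPOINT, query));
        // Defensive variant: the path is normalized before the URL is built.
        System.out.println(buildUrl(IAM_ROLE_ENDPOINT, query));
    }
}
```

      Normalizing the path at the call site avoids depending on whether the caller supplies the leading slash.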

      Resolution

      We have not been able to determine which environmental factors cause the slash to be inserted or omitted, so there is no known workaround that still uses the AWS configuration. Users can fall back to manual TCP/IP Hazelcast configuration while preparing for an upgrade.

            People

              Unassigned Unassigned
              ephillips@atlassian.com Edward