Loading...

Type: Bug
Resolution: Fixed
Priority: Low
Fix Version/s: 9.2.4, 8.5.22, 9.5.1
Affects Version/s: 7.4.17, 7.13.7, 7.18.1, 7.19.2, 7.20.1, 9.2.2
Component/s: Data Center - Core
Labels:

Support reference count:
14
Symptom Severity:
Severity 3 - Minor
UIS:
32

Issue Summary

When deploying a new instance of a clustered Confluence Data Center with TCP/IP discovery, the second node startup cannot join the Cluster and results with one of the nodes encountering a Cluster panic.

This is reproducible on Data Center: Yes

Steps to Reproduce

Install Confluence Data Center node with Clustered TCP/IP discover
- Specify the two IP addresses of both the the nodes on the Clustering setup screen
Complete the installation on this first node
Check that it is working as expected and also check Confluence Administration » General Configuration » Clustering screen looks good
Deploy a second Confluence Data Center node, copying across the confluence.cfg.xml from Node 1
Start the second node

Expected Results

The second node should seamlessly join the existing Confluence DC Cluster.

Actual Results

The second node is unable to join the Confluence DC Cluster with the following shown in the logs:

Node 1 (192.168.56.30)

atlassian-confluence.log / com.hazelcast DEBUG

2022-11-08 04:49:50,439 DEBUG [hz.confluence.IO.thread-Acceptor] [hazelcast.nio.tcp.TcpIpAcceptor] log [192.168.56.30]:5801 [testcluster] [3.12.11] Accepting socket connection from /192.168.56.50:58123
2022-11-08 04:49:50,461 DEBUG [hz.confluence.cached.thread-4] [hazelcast.nio.tcp.TcpIpEndpointManager] log [192.168.56.30]:5801 [testcluster] [3.12.11] Established socket connection between /192.168.56.30:5801 and /192.168.56.50:58123
2022-11-08 04:49:50,472 WARN [hz.confluence.IO.thread-in-2] [hazelcast.nio.tcp.TcpIpConnection] log [192.168.56.30]:5801 [testcluster] [3.12.11] Connection[id=1, /192.168.56.30:5801->/192.168.56.50:58123, qualifier=null, endpoint=null, alive=false, type=NONE] closed. Reason: Exception in Connection[id=1, /192.168.56.30:5801->/192.168.56.50:58123, qualifier=null, endpoint=null, alive=true, type=NONE], thread=hz.confluence.IO.thread-in-2
 -- referer: http://vm30:8090/setup/setupcluster-start.action | url: /setup/setupcluster.action | traceId: 0072a0e2e77a368c | userName: anonymous | action: setupcluster
java.lang.IllegalStateException: Unknown protocol: 
	at com.hazelcast.nio.tcp.UnifiedProtocolDecoder.onRead(UnifiedProtocolDecoder.java:107)
	at com.hazelcast.internal.networking.nio.NioInboundPipeline.process(NioInboundPipeline.java:135)
	at com.hazelcast.internal.networking.nio.NioThread.processSelectionKey(NioThread.java:369)
	at com.hazelcast.internal.networking.nio.NioThread.processSelectionKeys(NioThread.java:354)
	at com.hazelcast.internal.networking.nio.NioThread.selectLoop(NioThread.java:280)
	at com.hazelcast.internal.networking.nio.NioThread.run(NioThread.java:235)

shortly later:

...
2022-11-08 04:51:01,077 DEBUG [hz.confluence.event-5] [hazelcast.internal.cluster.ClusterService] log [192.168.56.30]:5801 [testcluster] [3.12.11] Removed connection to [192.168.56.50]:5801
2022-11-08 04:51:04,490 WARN [Caesium-1-2] [confluence.cluster.hazelcast.HazelcastClusterSafetyManager] onNumbersAreDifferent Detected different number in database [ 13132033 ] and cache [ 2249473 ]. Cache number last set by [ 545ee60c ]
 -- referer: http://vm30:8090/setup/setupcluster-start.action | url: /setup/setupcluster.action | traceId: 0072a0e2e77a368c | userName: anonymous | action: setupcluster
2022-11-08 04:51:04,491 WARN [Caesium-1-2] [confluence.cluster.hazelcast.HazelcastClusterSafetyManager] onNumbersAreDifferent Triggering panic on current node. This cluster size [ 1 ], cluster size from db: [ 1 ]
...

Node 2 (192.168.56.50)

2022-11-08 04:49:48,166 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] contextInitialized Starting Confluence 7.19.2 [build 8804 based on commit hash 4a8abee6f65948bdad19f90ec8049c38342c2851] - synchrony version 5.0.0
2022-11-08 04:49:49,329 INFO [Catalina-utility-1] [atlassian.confluence.cluster.DefaultClusterConfigurationHelper] lambda$populateExistingClusterSetupConfig$5 Populating setup configuration if running with Cluster mode...
2022-11-08 04:49:49,648 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] configure Configuring Hazelcast with instanceName [confluence], join configuration TCP/IP member addresses: 192.168.56.30|192.168.56.50, network interfaces [192.168.56.50, fe80:0:0:0:20c:29ff:fe1b:7c72%ens192] and local port 5801
2022-11-08 04:49:49,649 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Starting the cluster.
2022-11-08 04:49:49,653 WARN [Catalina-utility-1] [com.hazelcast.instance.HazelcastInstanceFactory] log Hazelcast is starting in a Java modular environment (Java 9 and newer) but without proper access to required Java packages. Use additional Java arguments to provide Hazelcast access to Java internal API. The internal API access is used to get the best performance results. Arguments to be used:
 --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
2022-11-08 04:49:49,902 WARN [Catalina-utility-1] [com.hazelcast.nio.UnsafeHelper] log sun.misc.Unsafe has been disabled because your platform does not support unaligned access to memory, some features might be not available.
2022-11-08 04:49:50,480 INFO [hz.confluence.event-5] [confluence.cluster.hazelcast.LoggingClusterMembershipListener] memberAdded [192.168.56.50]:5801 joined the cluster
2022-11-08 04:49:50,484 INFO [hz.confluence.event-5] [confluence.cluster.hazelcast.LoggingClusterMembershipListener] logClusterMembers Cluster now has 1 members: [[192.168.56.50]:5801]
2022-11-08 04:49:50,538 WARN [hz.confluence.cached.thread-3] [hazelcast.interceptor.authenticator.DefaultClusterJoinManager] checkNodeAuthenticationEnabled CONNECT(192.168.56.50:58123 -> 192.168.56.30:5801): Node authentication failed: Unexpected bytes from remote node, closing socket 
2022-11-08 04:49:51,560 WARN [Catalina-utility-1] [atlassian.hazelcast.micrometer.JmxBinder] bind No objects found for pattern com.hazelcast:type=HazelcastInstance.ConnectionManager,*
2022-11-08 04:49:51,560 WARN [Catalina-utility-1] [atlassian.hazelcast.micrometer.JmxBinder] bind No objects found for pattern com.hazelcast:type=HazelcastInstance.EventService,*
2022-11-08 04:49:51,560 WARN [Catalina-utility-1] [atlassian.hazelcast.micrometer.JmxBinder] bind No objects found for pattern com.hazelcast:type=HazelcastInstance.OperationService,*
2022-11-08 04:49:51,561 WARN [Catalina-utility-1] [atlassian.hazelcast.micrometer.JmxBinder] bind No objects found for pattern com.hazelcast:type=HazelcastInstance.PartitionServiceMBean,*
2022-11-08 04:49:51,561 WARN [Catalina-utility-1] [atlassian.hazelcast.micrometer.JmxBinder] bind No objects found for pattern com.hazelcast:type=HazelcastInstance.ManagedExecutorService,*
2022-11-08 04:49:51,561 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Confluence cluster node identifier is [545ee878]
2022-11-08 04:49:51,594 INFO [Catalina-utility-1] [springframework.web.context.ContextLoader] initWebApplicationContext Root WebApplicationContext: initialization started
2022-11-08 04:49:53,136 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] <init> Loading EhCache cache manager
2022-11-08 04:49:58,300 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster ClusterService{address=[192.168.56.50]:5801}
2022-11-08 04:49:58,300 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster contains Member [192.168.56.50]:5801 - 4400fea0-6163-4272-9eb5-91b3b420772c this
...
..
2022-11-08 04:50:43,774 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] init Confluence is ready to serve
2022-11-08 04:51:01,126 WARN [hz.confluence.cached.thread-5] [hazelcast.interceptor.authenticator.ParanoidObjectDataInputStream] readUTF Rejecting request to read 1213874948 UTF characters
2022-11-08 04:51:01,127 WARN [hz.confluence.cached.thread-5] [hazelcast.interceptor.authenticator.DefaultClusterJoinManager] checkNodeAuthenticationEnabled ACCEPT(192.168.56.50:5801 <- 192.168.56.30:57215): Node authentication failed: Unexpected bytes from remote node, closing socket 
2022-11-08 04:51:01,128 WARN [hz.confluence.cached.thread-5] [hazelcast.nio.tcp.TcpIpAcceptor] log [192.168.56.50]:5801 [testcluster] [3.12.11] com.atlassian.confluence.impl.cluster.hazelcast.interceptor.authenticator.NodeConnectionException: Unexpected bytes from remote node, closing socket
com.atlassian.confluence.impl.cluster.hazelcast.interceptor.authenticator.NodeConnectionException: Unexpected bytes from remote node, closing socket
	at com.atlassian.confluence.impl.cluster.hazelcast.interceptor.authenticator.DefaultClusterJoinManager.checkNodeAuthenticationEnabled(DefaultClusterJoinManager.java:68)
	at com.atlassian.confluence.impl.cluster.hazelcast.interceptor.authenticator.DefaultClusterJoinManager.accept(DefaultClusterJoinManager.java:47)
	at com.atlassian.confluence.impl.cluster.hazelcast.interceptor.ClusterJoinSocketInterceptor.onAccept(ClusterJoinSocketInterceptor.java:49)
	at com.hazelcast.nio.NodeIOService.interceptSocket(NodeIOService.java:300)
	at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread.configureAndAssignSocket(TcpIpAcceptor.java:316)
	at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread.access$1400(TcpIpAcceptor.java:138)
	at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread$1.run(TcpIpAcceptor.java:305)
	at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
	at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
	at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)

Node 2 also doesn't appear on Node 1's Clustering screen

Workaround

A restart of Confluence DC Node 1 after the initial install has completed is required before starting up node 2. i.e.

Deploy Confluence DC Node 1
Once it is up and running, shut it down, start it up again
Then start Confluence DC Node 2

Details

Description

Issue Summary

Steps to Reproduce

Expected Results

Actual Results

Workaround

Attachments

Activity

People

Dates