CONFSERVER-57906 (Confluence Data Center)

Cluster Monitoring panel shows that nodes are unable to reach each other due to mismatched "serialVersionUID" values in the "RemoteModuleCallable" class

      Issue Summary

      In a Data Center environment, an administrator may find that under Confluence Admin > Clustering, the user interface shows that nodes are unable to reach one another, with errors like:

      The node [xxxxxx] is temporarily not reachable. Please check the server logs.

      The actual cluster is up and running despite what the UI suggests. However, the message is alarming to administrators and should be corrected.

      Environment

      This issue was first observed in a two-node Confluence Data Center 6.6.3 cluster using the AWS cluster join method. Both nodes were running Java 8 update 162:

      <java.runtime.version>1.8.0_162-b12</java.runtime.version>
      

      Steps to Reproduce

      Unknown. The issue may be intermittent, because when a class does not declare serialVersionUID explicitly, the JVM computes one automatically at run time.
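
      Although the trigger is unknown, the underlying mechanism is standard Java serialization: when a `Serializable` class declares no `serialVersionUID`, the JVM derives one from the class's structure (name, interfaces, fields, methods). A minimal sketch of inspecting that derived value with the standard `ObjectStreamClass` API (the `Probe` class here is hypothetical, for illustration only):

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidProbe {
    // Hypothetical class for illustration: it declares no serialVersionUID,
    // so the JVM derives one from the class's name, fields, and methods.
    static class Probe implements Serializable {
        int value;
    }

    // Returns the UID the serialization machinery will actually use.
    static long computedUid() {
        return ObjectStreamClass.lookup(Probe.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Any change to Probe's structure (e.g. bytecode instrumentation by a
        // javaagent) changes this value, and two nodes that compute different
        // values cannot deserialize each other's objects.
        System.out.println("computed serialVersionUID = " + computedUid());
    }
}
```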

      Expected Results

      Cluster Monitoring UI shows that nodes are able to communicate with each other.

      Actual Results

      Cluster Monitoring UI shows that nodes cannot reach one another, even though the cluster itself is up and running.

      The logs show the following corresponding warnings:

      2019-02-13 09:22:37,800 WARN [ajp-nio-127.0.0.1-8009-exec-194] [cluster.hazelcast.monitoring.HazelcastClusterMonitoring] getData Exception happened when receiving response from node 438b4c58
       -- referer: https://example.confluence.com:9443/plugins/servlet/cluster-monitoring | url: /rest/atlassian-cluster-monitoring/cluster/suppliers/data/com.atlassian.cluster.monitoring.cluster-monitoring-plugin/runtime-information/438b4c58 | traceId: 9c5a10920fd04be5 | userName: admin
      java.util.concurrent.ExecutionException: com.hazelcast.nio.serialization.HazelcastSerializationException: java.io.InvalidClassException: com.atlassian.confluence.cluster.hazelcast.monitoring.RemoteModuleCallable; local class incompatible: stream classdesc serialVersionUID = 597473803974431210, local class serialVersionUID = 2184010817253012516
      	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponseOrThrowException(InvocationFuture.java:357)
      	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:225)
      	at com.hazelcast.util.executor.DelegatingFuture.get(DelegatingFuture.java:71)
      	at com.atlassian.confluence.cluster.hazelcast.monitoring.HazelcastClusterMonitoring.getData(HazelcastClusterMonitoring.java:79)
      ...
      Caused by: com.hazelcast.nio.serialization.HazelcastSerializationException: java.io.InvalidClassException: com.atlassian.confluence.cluster.hazelcast.monitoring.RemoteModuleCallable; local class incompatible: stream classdesc serialVersionUID = 597473803974431210, local class serialVersionUID = 2184010817253012516
      ...
      Caused by: java.io.InvalidClassException: com.atlassian.confluence.cluster.hazelcast.monitoring.RemoteModuleCallable; local class incompatible: stream classdesc serialVersionUID = 597473803974431210, local class serialVersionUID = 2184010817253012516
      	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:687)
      	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1876)
      	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1745)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2033)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:427)
      	at com.hazelcast.nio.serialization.DefaultSerializers$ObjectSerializer.read(DefaultSerializers.java:201)
      	at com.hazelcast.nio.serialization.StreamSerializerAdapter.read(StreamSerializerAdapter.java:41)
      	at com.hazelcast.nio.serialization.SerializationServiceImpl.toObject(SerializationServiceImpl.java:276)
      ...
      
      2019-02-13 09:22:37,802 WARN [ajp-nio-127.0.0.1-8009-exec-194] [cluster.monitoring.rest.ClusterMonitoringResource] getDataProviderInformationForNode Error received when querying remote node [438b4c58]: 
       -- referer: https://example.confluence.com:9443/plugins/servlet/cluster-monitoring | url: /rest/atlassian-cluster-monitoring/cluster/suppliers/data/com.atlassian.cluster.monitoring.cluster-monitoring-plugin/runtime-information/438b4c58 | traceId: 9c5a10920fd04be5 | userName: admin
      

      Notes

      Some notes from the development review:

      We should manually set the `serialVersionUID` in the `RemoteModuleCallable` class instead of letting it be auto-generated. Usually the generated ID is the same on both nodes, but because it is computed internally by the JVM from the class's structure, minute differences in the environment (or sheer luck) can cause different nodes to generate different IDs.
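
      The proposed fix can be sketched as follows, assuming a plain `Serializable` `Callable`; this is an illustrative stand-in, not the actual `RemoteModuleCallable` source:

```java
import java.io.Serializable;
import java.util.concurrent.Callable;

// Illustrative sketch only -- not the actual RemoteModuleCallable source.
// Pinning serialVersionUID means every node agrees on the value regardless
// of how the class bytes were loaded or instrumented.
public class PinnedCallable implements Serializable, Callable<String> {
    // Explicit declaration: the JVM no longer computes the UID at run time.
    private static final long serialVersionUID = 1L;

    private final String moduleKey;

    public PinnedCallable(String moduleKey) {
        this.moduleKey = moduleKey;
    }

    @Override
    public String call() {
        return moduleKey;
    }
}
```

      With the field pinned, structural changes to the class (or agent-injected bytecode) no longer affect the stream's class descriptor check during deserialization.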

      Workaround

      Restarting Confluence may help, but because of the random nature of this problem, it is not guaranteed to resolve the issue.


            Pascal Robert added a comment - I confirm that this was our problem. I switched the old node from AppDynamics to the Datadog APM agent, restarted Confluence on that node, and the error in the GUI and the logs is gone.

            Pascal Robert added a comment - Could one of the causes be a change in the javaagent? We are rolling out the Datadog agent on new nodes, while older nodes still have the AppDynamics agent, and we started seeing this error.

            Steven Milisavic added a comment - I just faced this issue; it was resolved with a rolling restart of all nodes. I also dropped the plugin-cache directories on the first node that was restarted.

              Assignee: Unassigned
              Reporter: rchang (Robert Chang)
              Affected customers: 7
              Watchers: 19