-
Bug
-
Resolution: Obsolete
-
Low
-
None
-
6.13.7, 7.3.3
-
15
-
Severity 2 - Major
-
3
-
Summary
The Synchrony proxy fails to establish WebSocket connections; the handshake process times out.
Environment
- Fresh install of Confluence 6.13.7, Server or Data Center
- No proxy or load balancer, although it can be reproduced with those in place as well
Steps to Reproduce
- Edit setenv.sh and remove the XHR fallback property or set it to false
- Start Confluence and try to edit a page while watching the Network tab in the dev tools pane.
Expected Results
WebSocket is opened and used in the editor session.
Actual Results
The WS request returns a 101 status code as expected, but the handshake never finishes.
If XHR is disabled, this causes a loop in which the browser keeps trying to establish the connection:
VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
If XHR is enabled, the editor falls back to XHR, so WebSockets are never used in this scenario. Based on the data, it seems that the handshake cannot be completed before the timeout expires, so the request times out.
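The behavior described above can be summarized as a toy model. This is only a sketch of the observed symptoms, not Synchrony's actual client code: the function name `pick_transport`, its parameters, and the retry limit are hypothetical, and the 5-second default is taken from the WS timeout mentioned in the notes.

```python
def pick_transport(ws_handshake_seconds, xhr_fallback_enabled,
                   ws_timeout_seconds=5.0, max_attempts=4):
    """Toy model of the observed transport selection (hypothetical names).

    ws_handshake_seconds: callable taking the attempt number (1-based) and
    returning how long the WS handshake would take on that attempt.
    Returns a (transport, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        if ws_handshake_seconds(attempt) <= ws_timeout_seconds:
            # The handshake finished in time, so the WebSocket is used.
            return ("websocket", attempt)
        if xhr_fallback_enabled:
            # The handshake timed out; the editor switches to XHR and stays there.
            return ("xhr", attempt)
        # Fallback disabled: the browser logs a reconnect warning and retries.
    # This sketch gives up after max_attempts; the real browser keeps looping.
    return ("none", max_attempts)


if __name__ == "__main__":
    slow = lambda attempt: 10.0  # handshake always exceeds the timeout
    print(pick_transport(slow, xhr_fallback_enabled=True))
    print(pick_transport(slow, xhr_fallback_enabled=False))
```

With a handshake that always exceeds the timeout, the model lands on XHR at the first attempt when the fallback is enabled, and exhausts its retries when it is disabled, which matches the two symptoms reported here.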
Notes
This is a significant problem in environments where the network setup is complex and can't be changed. An easy alternative in those scenarios would be to allow WS connections through the internal proxy, improving the editing experience while minimizing changes. With this bug, that is not possible.
One interesting item that showed up during testing is that the connection sometimes succeeds after several attempts by the browser. This is inconsistent, and it was not possible to determine a pattern or cause, but the main point is that it never works on the first attempt.
Interestingly, catalina.out shows successful connection messages, which also repeat in a loop until the WS is actually established:
org.springframework.web.socket.client.WebSocketConnectionManager.startInternal Starting WebSocketConnectionManager
27-Apr-2020 17:16:43.390 INFO [http-nio-8090-exec-6] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
27-Apr-2020 17:16:43.443 INFO [http-nio-8090-exec-3] com.sun.jersey.server.impl.application.WebApplicationImpl._initiate Initiating Jersey application, version 'Jersey: 1.19.4 05/24/2017 03:20 PM'
27-Apr-2020 17:16:48.270 INFO [http-nio-8090-exec-9] org.springframework.web.socket.client.WebSocketConnectionManager.startInternal Starting WebSocketConnectionManager
27-Apr-2020 17:16:48.271 INFO [http-nio-8090-exec-9] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
27-Apr-2020 17:16:48.439 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
27-Apr-2020 17:16:48.447 INFO [http-nio-8090-exec-6] org.springframework.web.socket.client.WebSocketConnectionManager.stop Stopping WebSocketConnectionManager
27-Apr-2020 17:16:53.402 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
Notice that the connection attempts are separated by roughly 5 seconds, which is the current WS timeout.
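The spacing between attempts can be checked directly from the log timestamps. The following is a minimal sketch, assuming the catalina.out timestamp layout shown in the excerpt (day-month-year followed by a time with milliseconds); the helper name `gaps_between` is hypothetical and not part of any Confluence tooling.

```python
from datetime import datetime

# Timestamp layout assumed from the catalina.out excerpt, e.g. "27-Apr-2020 17:16:43.390".
TIMESTAMP_FORMAT = "%d-%b-%Y %H:%M:%S.%f"


def gaps_between(entries, marker):
    """Return the gaps in seconds between consecutive log entries containing marker."""
    stamps = []
    for entry in entries:
        if marker not in entry:
            continue
        fields = entry.split()
        try:
            # The first two whitespace-separated fields are the date and the time.
            stamps.append(datetime.strptime(" ".join(fields[:2]), TIMESTAMP_FORMAT))
        except ValueError:
            # Skip entries whose timestamp is missing or truncated.
            continue
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    prefix = "org.springframework.web.socket.client.WebSocketConnectionManager"
    sample = [
        f"27-Apr-2020 17:16:43.390 INFO [http-nio-8090-exec-6] {prefix}.openConnection "
        "Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1",
        f"27-Apr-2020 17:16:48.271 INFO [http-nio-8090-exec-9] {prefix}.openConnection "
        "Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1",
        f"27-Apr-2020 17:16:48.439 INFO [SimpleAsyncTaskExecutor-1] {prefix}.onSuccess "
        "Successfully connected",
        f"27-Apr-2020 17:16:53.402 INFO [SimpleAsyncTaskExecutor-1] {prefix}.onSuccess "
        "Successfully connected",
    ]
    print(gaps_between(sample, "openConnection"))
    print(gaps_between(sample, "onSuccess"))
```

For the excerpt above, both the `openConnection` and the `onSuccess` entries come out a little under 5 seconds apart, consistent with a retry being triggered by the 5-second timeout.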
Workaround
Configure a reverse proxy or load balancer to connect directly to Synchrony, bypassing the internal proxy.
This issue cannot be replicated on newer versions (tested on 7.13 and 7.19), so it is being closed as obsolete.