CONFSERVER-59761

The internal synchrony-proxy is unable to establish WebSockets

      Summary

      The Synchrony proxy fails to establish WebSocket connections; the handshake process times out.

      Environment

      • Fresh install of Confluence 6.13.7, Server or Data Center
      • No proxy or load balancer, although it can be reproduced with those in place as well

      Steps to Reproduce

      1. Edit setenv.sh and remove the XHR fallback property or set it to false
      2. Start Confluence and try to edit a page while watching the Network tab in the dev tools pane.

      Expected Results

      WebSocket is opened and used in the editor session.

      Actual Results

      The WS request returns a 101 code as expected, but the handshake never finishes.

      This causes a loop where the browser keeps trying to establish the connection if XHR is disabled:

      VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
      VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
      VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
      VM322:540 Synchrony: Warning: reconnect (Transport.handshakeOverXhr: ws connection failed, and cross-origin-long-polling is disabled)
      

      If XHR fallback is enabled, the editor switches to XHR, so WebSockets are never used in this scenario. The data suggests that the handshake cannot complete in time, so the request times out.
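      To separate the HTTP-level upgrade (which returns 101 here) from the handshake that follows it, the upgrade can be probed from a script instead of the browser. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the host, port and path are parameters to be filled in for the endpoint under test. It only checks the upgrade response defined by RFC 6455, not the Synchrony handshake that times out in this report.

```python
import base64
import hashlib
import os
import socket

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, port: int, path: str, key: str) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe_upgrade(host: str, port: int, path: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers 101 with a valid accept key."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_upgrade_request(host, port, path, key))
        response = b""
        # Read until the end of the response headers.
        while b"\r\n\r\n" not in response:
            chunk = sock.recv(4096)  # raises a timeout error if the server stalls
            if not chunk:
                break
            response += chunk
    head = response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    parts = status_line.split()
    return (
        len(parts) >= 2
        and parts[1] == "101"
        and headers.get("sec-websocket-accept") == expected_accept(key)
    )
```

      For example, `probe_upgrade("127.0.0.1", 8091, "/synchrony/v1/bayeux-sync1")` targets the Synchrony endpoint that appears in the catalina.out excerpt in this report. The path served by the internal proxy is not stated here, so it has to be taken from the WS request shown in the browser's Network tab.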

      Notes

      This is a significant problem where the network environment is complex and cannot be changed. In those scenarios, an easy alternative would be to allow WS connections through the internal proxy, improving the editing experience with minimal changes. With this bug, that is not possible.

      During testing, the connection sometimes succeeded after several attempts by the browser. This was inconsistent, and no pattern or cause could be determined, but the main point is that it never works on the first attempt.

      Interestingly, catalina.out shows success messages, which also repeat in a loop until the WS is actually established:

      org.springframework.web.socket.client.WebSocketConnectionManager.startInternal Starting WebSocketConnectionManager
      27-Apr-2020 17:16:43.390 INFO [http-nio-8090-exec-6] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
      27-Apr-2020 17:16:43.443 INFO [http-nio-8090-exec-3] com.sun.jersey.server.impl.application.WebApplicationImpl._initiate Initiating Jersey application, version 'Jersey: 1.19.4 05/24/2017 03:20 PM'
      27-Apr-2020 17:16:48.270 INFO [http-nio-8090-exec-9] org.springframework.web.socket.client.WebSocketConnectionManager.startInternal Starting WebSocketConnectionManager
      27-Apr-2020 17:16:48.271 INFO [http-nio-8090-exec-9] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
      27-Apr-2020 17:16:48.439 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
      27-Apr-2020 17:16:48.447 INFO [http-nio-8090-exec-6] org.springframework.web.socket.client.WebSocketConnectionManager.stop Stopping WebSocketConnectionManager
      27-Apr-2020 17:16:53.402 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
      

      Note that the attempts are roughly 5 seconds apart, which matches the current WS timeout.
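      The spacing can be checked directly from the log. The following is a minimal sketch, assuming the timestamp layout of the catalina.out lines quoted above and an English locale for the month abbreviation; the helper names are hypothetical.

```python
import re
from datetime import datetime

# Timestamp layout used by the catalina.out lines quoted in this report.
STAMP = re.compile(r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def event_times(log_text: str, marker: str) -> list:
    """Return the timestamps of every log line that contains `marker`."""
    times = []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = STAMP.match(line)
        if match and marker in line:
            times.append(datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S.%f"))
    return times


def gaps_seconds(times: list) -> list:
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Four of the lines from the excerpt above, copied verbatim.
SAMPLE = """\
27-Apr-2020 17:16:43.390 INFO [http-nio-8090-exec-6] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
27-Apr-2020 17:16:48.271 INFO [http-nio-8090-exec-9] org.springframework.web.socket.client.WebSocketConnectionManager.openConnection Connecting to WebSocket at ws://127.0.0.1:8091/synchrony/v1/bayeux-sync1
27-Apr-2020 17:16:48.439 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
27-Apr-2020 17:16:53.402 INFO [SimpleAsyncTaskExecutor-1] org.springframework.web.socket.client.WebSocketConnectionManager.onSuccess Successfully connected
"""

if __name__ == "__main__":
    for marker in ("openConnection", "onSuccess"):
        print(marker, gaps_seconds(event_times(SAMPLE, marker)))
```

      On the quoted lines, the two connection attempts are 4.881 seconds apart and the two success messages are 4.963 seconds apart.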

      Workaround

      Configure a reverse proxy or load balancer to connect directly to Synchrony, bypassing the internal proxy.

            Bernardo Andreeti added a comment - Cannot replicate this issue on newer versions, tested 7.13 and 7.19, so closing it as obsolete.

            Richard Atkins added a comment - This looks like the bug we had in the Tomcat HttpNioProtocol connector when using SSL that prevented the connection from being upgraded to secure websockets - see https://bz.apache.org/bugzilla/show_bug.cgi?id=62791 . This was fixed in Tomcat 9.0.13, and 9.0.19 shipped in Confluence 6.13.5. 6.13.7 shipped with Tomcat 9.0.22, and had other synchrony startup issues on Windows with JDK 8u222. None of these would account for exactly the symptoms as described in this issue.

              ahuzik Ales Huzik (Inactive)
              bandreeti Bernardo Andreeti