HipChat / HCPUB-2805

After HipChat Server upgrade to 2.2.2, clients can no longer connect

    Details

    • Symptom Severity:
      Major
    • Platform:
      HipChat Server

      Description

      Summary

      After upgrading HipChat Server, desktop clients no longer connect and instead show the error "We timed out while loading HipChat", even though the web user interface still works.

      Environment

      Upgrade to HCS 2.2.2

      Steps to Reproduce

      1. Upgrade your instance, then reboot.
      2. After the server is back online, attempt to connect using a desktop app.

      Expected Results

      Users can connect after reboot

      Actual Results

      The desktop clients show an error "We timed out while loading HipChat"

      Notes

      One symptom of this bug can be seen in the following lines in /var/log/hipchat/tetra.log:

      Disconnecting uid-anon with condition=resource-constraint, text=We're experiencing high sign in volume. Please try again in a minute.
      

      This message means that the server is receiving a flood of connection requests from end users and is rate limiting them (also called a reconnect storm). If possible, use your firewall to limit incoming connections to HipChat Server (on port 443) to ease the rate limiting until the Tetra process has caught up. Alternatively, ask your users to log out of their clients, which stops the auto-connect mechanism. Do not restart Tetra or Punjab if you see this message, as doing so will start the storm again.

      The Tetra processes usually catch up and start working normally after about 10 minutes.
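
To see whether the rate-limiting message is still being logged, one option is to count the matching lines in the Tetra log. This is only a minimal sketch: the function name count_storm_disconnects is hypothetical, and it assumes the log contains the "condition=resource-constraint" text exactly as quoted above.

```shell
#!/bin/sh
# Count log lines that report rate-limited disconnects.
# Usage: count_storm_disconnects <path-to-tetra.log>
# (hypothetical helper, not part of HipChat Server)
count_storm_disconnects() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # grep -c prints the number of matching lines. It exits non-zero
    # when that number is 0, so the exit status is deliberately ignored.
    grep -c 'condition=resource-constraint' "$log_file" || true
}
```

Pass the path of tetra.log on your instance as the only argument. Running it a few minutes apart and comparing the counts gives a rough idea of whether the storm is subsiding.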

      Workaround

      Try running these workarounds one by one:

      I. Run the 'tetra_sessions_cleanup.py' cleanup script
      1. Log into the HipChat Server command line
      2. Gain root:
        sudo dont-blame-hipchat
        
      3. Set the python path
        export PYTHONPATH=/hipchat/tetra-app/current
        
      4. Run the script:
        cd /hipchat/tetra-app/current && /hipchat/tetra-app/current/vendor/virtualenv/bin/python ./tools/tetra_sessions_cleanup.py
        
      5. Once completed, attempt to log in using the desktop client. If you still cannot connect, follow the next workaround.
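
Steps 3 and 4 above can be wrapped in a small function that checks the files exist before running anything. This is a sketch under assumptions: the function name run_tetra_sessions_cleanup is hypothetical, the default directory is the one used in the steps above, and it must still be run from the root shell obtained in step 2.

```shell
#!/bin/sh
# Run the Tetra session cleanup script from a given tetra-app directory.
# Usage (as root): run_tetra_sessions_cleanup [tetra-app-dir]
# (hypothetical wrapper around the commands in steps 3 and 4)
run_tetra_sessions_cleanup() {
    app_dir="${1:-/hipchat/tetra-app/current}"
    python_bin="$app_dir/vendor/virtualenv/bin/python"
    script="$app_dir/tools/tetra_sessions_cleanup.py"

    if [ ! -x "$python_bin" ] || [ ! -f "$script" ]; then
        echo "missing interpreter or cleanup script under $app_dir" >&2
        return 1
    fi
    # The subshell keeps the PYTHONPATH export and the cd from
    # leaking into the caller's shell.
    (
        export PYTHONPATH="$app_dir"
        cd "$app_dir" && "$python_bin" ./tools/tetra_sessions_cleanup.py
    )
}
```

Calling run_tetra_sessions_cleanup with no argument uses the path from the steps above. The existence check only guards against typos and unexpected layouts; it does not change what the cleanup script itself does.
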
      II. Manually clear out user sessions (to be done per user)
      1. Log into the web user interface
      2. If logged into the web chat app, click the avatar on the far right and click 'log out'. This will take you to the web user interface.
      3. Once here, select 'EDIT MY PROFILE', then on the left sidebar select 'CHAT SESSIONS'.
      4. Clear out any sessions that exist here.
      5. Attempt to log in using the desktop app. If you still cannot log in, please follow the next workaround.
      III. Manually clear preferences in the Mac/Windows/Linux apps
      1. For Mac users, please clear out all your preferences according to the instructions at https://confluence.atlassian.com/hipchatkb/mac-desktop-app-clearing-preferences-751436224.html
      2. For Windows users, please follow these instructions to clear out the Windows preferences:
        1. Quit the HipChat client app
        2. Locate this directory using the Windows file explorer: %USERPROFILE%\AppData\Local\Atlassian\HipChat
        3. Delete all folders and files from this directory, so you are left with an empty HipChat directory
        4. Locate this directory using the Windows file explorer: %USERPROFILE%\AppData\Roaming\Atlassian
        5. Delete the HipChat.ini file
        6. Have the user reconnect to HipChat. They will need to type in the URL for the server and re-enter their email and password to access the chat rooms.
      3. For Linux users:
        1. Quit the Linux client
        2. Wipe out the cache:
          rm -r ~/.cache/Atlassian/HipChat/
        3. Remove the config file:
          rm ~/.config/Atlassian/HipChat.ini
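
The two Linux removal commands can be combined into a function that does not fail when a file is already gone. This is a minimal sketch: the function name clear_hipchat_linux_prefs is hypothetical, the paths are the ones listed above, and it assumes the client has already been quit.

```shell
#!/bin/sh
# Remove the HipChat Linux client's cache and config for one user.
# Usage: clear_hipchat_linux_prefs [home-directory]   (defaults to $HOME)
# (hypothetical helper; quit the client before running it)
clear_hipchat_linux_prefs() {
    home_dir="${1:-$HOME}"
    if [ -z "$home_dir" ]; then
        echo "no home directory given and HOME is unset" >&2
        return 1
    fi
    cache_dir="$home_dir/.cache/Atlassian/HipChat"
    config_file="$home_dir/.config/Atlassian/HipChat.ini"

    # -f keeps rm quiet when the target has already been removed.
    rm -rf -- "$cache_dir"
    rm -f -- "$config_file"
    echo "cleared HipChat client data under $home_dir"
}
```

Run it as the affected user with no argument, or pass another user's home directory when running as an administrator.
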
      IV. Applying a patch to revert a change made in 2.2.2

      This will kick off a Chef run and a full restart of all services, so if you decide to run it, do so when the server is at its least busy.

      1. Gain root access:
        sudo dont-blame-hipchat
        
      2. Download this tar file to the server:
        wget https://s3.amazonaws.com/hipchat-server-stable/utils/hcpub-2805.tar.gz
        
      3. Extract it:
        tar xzvf hcpub-2805.tar.gz
        
      4. If possible, snapshot your server before executing the script
      5. Make the file executable, and run it:
        cd hcpub-2805/ && chmod +x hcpub-2805-punjab-fix.sh && ./hcpub-2805-punjab-fix.sh
        
      6. Verify that the patch was applied by running ps against punjab. The absence of "use_raw" at the end of the ps output indicates that punjab is no longer running in raw mode:
        # ps -ef | grep punjab
        hipchat  28558     1  0 16:42 ?        00:00:00 /hipchat-scm/punjab/vendor/virtualenv/bin/python /hipchat/punjab/current/vendor/virtualenv/bin/hc-twistd -r epoll --umask=022 --pidfile=/var/run/hipchat/punjab.pid --syslog --prefix=punjab --facility=160 punjab --white_list=.hipchat.com --site_log_file=/dev/null
        root     29431 24534  0 16:43 pts/2    00:00:00 grep punjab
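
The check in step 6 can also be scripted instead of read by eye. This is a sketch under assumptions: the function name punjab_raw_mode_status is hypothetical, and it relies only on what step 6 states, namely that the string "use_raw" appears in the punjab process line when raw mode is active.

```shell
#!/bin/sh
# Report whether any punjab process line read from stdin contains "use_raw".
# Usage: ps -ef | punjab_raw_mode_status
# (hypothetical helper; exit status 0 = not raw, 1 = raw, 2 = not running)
punjab_raw_mode_status() {
    # Keep only punjab process lines and skip any grep command lines.
    punjab_lines=$(grep 'punjab' | grep -v 'grep')
    if [ -z "$punjab_lines" ]; then
        echo "punjab is not running"
        return 2
    fi
    if printf '%s\n' "$punjab_lines" | grep -q 'use_raw'; then
        echo "punjab is still running in raw mode"
        return 1
    fi
    echo "punjab is not running in raw mode"
    return 0
}
```

Reading from standard input rather than calling ps inside the function makes it possible to check saved ps output as well as a live system.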
        

              People

              • Votes:
                8
              • Watchers:
                20

                Dates

                • Last commented:
                  5 weeks, 5 days ago