Details
-
Bug
-
Resolution: Fixed
-
Low
-
HCS 2.2.2
-
None
-
Severity 2 - Major
Description
Summary
After upgrading HipChat Server, the clients will no longer connect and show the error "We timed out while loading HipChat", even though the web user interface works.
Environment
Upgrade to HCS 2.2.2
Steps to Reproduce
- Upgrade your instance, then reboot
- After the server is back online, users cannot connect using the desktop apps.
Expected Results
Users can connect after reboot
Actual Results
The desktop clients show an error "We timed out while loading HipChat"
Notes
One symptom of this bug can be seen in the following lines in /var/logs/hipchat/tetra.log:
Disconnecting uid-anon with condition=resource-constraint, text=We're experiencing high sign in volume. Please try again in a minute.
This message means the server is receiving a flood of requests from end users and is rate limiting them (also called a reconnect storm). If possible, use your firewall to limit the connections into HipChat Server (on port 443) to ease the rate limiting until the Tetra process has caught up. Alternatively, ask your users to log out of their clients, which stops the auto-connect mechanism. Do not restart Tetra or Punjab if you see this error, as doing so will start the storm again.
The Tetra processes usually catch up and start working normally after about 10 minutes.
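To check whether the reconnect storm is still going on, one option is to count how many of these disconnect lines the log contains and compare the number a minute or so later. The following is only a sketch: the function name count_rate_limited is made up for illustration, the default path is copied from the note above, and treating a count that has stopped growing as a sign of recovery is an assumption, not documented behavior.

```shell
#!/bin/sh
# Hypothetical helper: count rate-limit disconnects in a Tetra log.
# The default path is taken from the note above; pass another path as $1
# if your log lives somewhere else.
count_rate_limited() {
    log_file="${1:-/var/logs/hipchat/tetra.log}"
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps "set -e" scripts running.
    grep -c 'condition=resource-constraint' "$log_file" || true
}

# Example usage (run twice, about a minute apart, and compare):
#   count_rate_limited
#   sleep 60
#   count_rate_limited
```

If the second number is no larger than the first, clients are probably no longer being turned away.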
Workaround
Try running these workarounds one by one:
I. Run the 'tetra_sessions_cleanup.py' cleanup script
- Log into the HipChat Server command line
- Gain root:
sudo dont-blame-hipchat
- Set the python path
export PYTHONPATH=/hipchat/tetra-app/current
- Run the script:
cd /hipchat/tetra-app/current && /hipchat/tetra-app/current/vendor/virtualenv/bin/python ./tools/tetra_sessions_cleanup.py
- Once completed, attempt to log in using the desktop client. If you still cannot connect, follow the next workaround.
II. Manually clear out user sessions (to be done per user)
- Log into the web user interface
- If logged into the web chat app, click the avatar on the far right and click 'log out'. This will take you to the web user interface.
- Once here, select 'EDIT MY PROFILE', then on the left sidebar select 'CHAT SESSIONS'.
- Clear out any sessions that exist here.
- Attempt to log in using the desktop app. If you still cannot log in, please follow the next workaround.
III. Manually clear preferences in the Mac/Windows/Linux apps
- For Mac users, please clear out all your preferences according to the instructions at https://confluence.atlassian.com/hipchatkb/mac-desktop-app-clearing-preferences-751436224.html
- For Windows users, please follow these instructions to clear out the Windows preferences:
- Quit the HipChat Client App
- Locate this directory using the Windows file explorer %USERPROFILE%\AppData\Local\Atlassian\HipChat
- Delete all folders and files from this directory, so you are left with an empty HipChat directory
- Locate this directory using the Windows file explorer %USERPROFILE%\AppData\Roaming\Atlassian
- Delete the HipChat.ini file
- Have the user re-connect to HipChat again. They will need to type in the URL for the server and re-enter their email and password to access the chat rooms.
- For Linux users:
- Quit the Linux client
- Wipe out the cache:
rm -r ~/.cache/Atlassian/HipChat/
- Remove the config file:
rm ~/.config/Atlassian/HipChat.ini
IV. Applying a patch to revert a change made in 2.2.2
This will kick off a Chef run and a full restart of all services, so if you decide to apply it, please do so when the server is least busy.
- Gain root access:
sudo dont-blame-hipchat
- Download this tar file to the server:
wget https://s3.amazonaws.com/hipchat-server-stable/utils/hcpub-2805.tar.gz
- Extract it:
tar xzvf hcpub-2805.tar.gz
- If possible, snapshot your server before executing the script
- Make the file executable, and run it:
cd hcpub-2805/ && chmod +x hcpub-2805-punjab-fix.sh && ./hcpub-2805-punjab-fix.sh
- Verify that the patch was applied by running ps and filtering for punjab. The absence of "use_raw" at the end of the ps output tells you that punjab is no longer running in raw mode:
# ps -ef | grep punjab
hipchat 28558 1 0 16:42 ? 00:00:00 /hipchat-scm/punjab/vendor/virtualenv/bin/python /hipchat/punjab/current/vendor/virtualenv/bin/hc-twistd -r epoll --umask=022 --pidfile=/var/run/hipchat/punjab.pid --syslog --prefix=punjab --facility=160 punjab --white_list=.hipchat.com --site_log_file=/dev/null
root 29431 24534 0 16:43 pts/2 00:00:00 grep punjab
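The same check can be scripted rather than read by eye. This is a minimal sketch: the function name punjab_in_raw_mode is hypothetical, and it assumes only what the step above states, namely that the string "use_raw" appears on the punjab command line while raw mode is active.

```shell
#!/bin/sh
# Hypothetical helper: reads a process listing on stdin and succeeds
# (exit status 0) if any punjab process line mentions use_raw.
punjab_in_raw_mode() {
    # The [p]unjab pattern matches "punjab" but not the text "[p]unjab",
    # so the grep process itself is never counted as a punjab process.
    grep '[p]unjab' | grep -q 'use_raw'
}

# Example usage:
#   if ps -ef | punjab_in_raw_mode; then
#       echo "punjab is still running in raw mode"
#   else
#       echo "no use_raw flag found, the patch appears to be applied"
#   fi
```

The function also returns a non-zero status when no punjab process is running at all, so confirm separately that punjab has come back up after the restart.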