
This issue belongs to an archived project.

Type: Bug
Resolution: Fixed
Priority: Low
Fix Version/s: HCS 2.1.3
Affects Version/s: HCS 1.4.3
Component/s: HC Platform - HipChat Server, Installation/Update
Labels: None
Symptom Severity: Severity 2 - Major

Summary

On new and upgraded deployments of v1.4.3, the runsv curler service gets stuck in a limbo state. This causes a few intermittent issues; most notably, message notifications that rely on API v1 (PagerDuty, Jenkins, etc.) will not fire.

This may also prevent email notifications and push notifications from firing.

Environment

HipChat Server v1.4.x -> v1.4.3 (upgraded instances)
HipChat Server v1.4.3 (new deployments)

Steps to Reproduce

Upgrade a v1.4.x instance to HipChat Server v1.4.3, or
Spin up a new v1.4.3 instance.

Actual Results

Log into the HipChat Server command line and check to see if the curler service is running. The quickest way to do this is to grep for curler.pid:

ps aux | grep curler.pid

Apart from the grep command itself, there should be at least one result (sometimes two) that looks similar to this:

hipchat  21321  0.0  0.2  63776 17516 ?        S    Aug22   0:00 /hipchat-scm/curler/vendor/virtualenv/bin/python /hipchat/curler/current/vendor/virtualenv/bin/twistd --pidfile=/var/run/hipchat/curler.pid --syslog --facility=168 --prefix=curler --nodaemon curler --base-urls=http://localhost:8080/_jobs --job-queue=*curler* --gearmand-server=localhost:4730 --num-workers=5

If there is no such result, then curler isn't fully running.
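
Because the grep command itself can show up in the ps output, a match alone does not prove that curler is up. The following is a minimal sketch of a stricter check, assuming a POSIX shell. The function name curler_is_running is hypothetical and not part of HipChat Server.

```shell
# Hypothetical helper (not shipped with HipChat Server).
# Reads `ps aux` output on stdin and succeeds only if at least one line
# references curler.pid and is not the grep command itself.
curler_is_running() {
    grep 'curler\.pid' | grep -v 'grep' | grep -q .
}

# Usage on the server:
#   if ps aux | curler_is_running; then
#       echo "curler is running"
#   else
#       echo "curler is NOT fully running"
#   fi
```

The function only inspects text, so it can be tried against saved ps output before relying on it on a live server.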

Notes

There is also a part of curler called curler-export.
The actual issue may lie with the runsv curler service, because restarting curler alone does not fix it. Instead, you will see this error:
```
runsv curler: fatal: unable to lock supervise/lock: temporary failure
runsv curler-export: fatal: unable to lock supervise/lock: temporary failure
```
If so, please run through the workaround below.

Workaround

Please be aware that once the curler service is restarted, all queued jobs (push notifications, email notifications) will be fired off at once, which may result in a flood of notifications. These can safely be ignored.

Log into the HipChat Server command line.
Gain root access:
```
sudo dont-blame-hipchat
```
Next, stop the curler service:
```
/etc/init.d/curler stop
```
Check to see if any existing (zombie) curler processes exist:
```
ps aux | grep curler
```
If so, then they will need to be killed:
```
kill -9 curler_PID
```
Where "curler_PID" is the PID of each remaining curler process.
Next, kill the runsv curler and runsv curler-export services:
```
kill -9 runsv_curler_PID
```
Where "runsv_curler_PID" is the PID of the runsv curler process found in the ps aux | grep curler output above (step 4).
```
kill -9 runsv_curler-export_PID
```
Where "runsv_curler-export_PID" is the PID of the runsv curler-export process found in the ps aux | grep curler output above (step 4).
Start curler:
```
/etc/init.d/curler start
```
Verify curler is up:
```
ps aux | grep curler
```

If the service is shown as up, send a test notification from your integration. If the service is not up, please reach out to HipChat Server support at support.atlassian.com and attach the log output generated with hipchat diagnostics -b to the support ticket.
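
The PID lookup in the workaround can be sketched as a pair of small shell helpers. This is only an illustration, assuming a POSIX shell and that the PID is the second column of ps aux output. The names pids_matching and print_kill_commands are hypothetical, and the second helper prints the kill commands for review rather than running them.

```shell
# Hypothetical helpers (not shipped with HipChat Server).

# Reads `ps aux` output on stdin and prints the PID (second column) of
# every line containing the given text. Lines belonging to grep or awk
# are skipped so that the search tools do not list themselves.
pids_matching() {
    awk -v pat="$1" 'index($0, pat) && !index($0, "grep") && !index($0, "awk") { print $2 }'
}

# Prints one "kill -9 PID" line per matching process. Nothing is killed;
# the output is meant to be reviewed and then run by hand as root.
print_kill_commands() {
    pids_matching "$1" | while read -r pid; do
        echo "kill -9 $pid"
    done
}

# Usage, after /etc/init.d/curler stop:
#   ps aux | print_kill_commands 'curler'
```

Searching for 'curler' lists leftover curler processes as well as the runsv curler and runsv curler-export supervisors, since all of their command lines contain that text. Searching for 'runsv curler' narrows the list to the two supervisors.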

Assignee: Unassigned
Reporter: David Maye
Archiver: Michael Andreacchio
Created: 23/Aug/2016 4:10 PM
Updated: 24/Aug/2017 7:31 AM
Resolved: 12/Dec/2016 9:35 PM
Archived: 16/Jun/2020 5:19 AM
