Details
- Bug
- Resolution: Fixed
- Medium
- 3.4.0
Description
After one node in a Data Center installation was restarted, the SSH server on that node failed to start, throwing a NullPointerException. The same NullPointerException is triggered wherever the SSH configuration is accessed, which left the node crippled.
Investigation uncovered the cause: the OsgiClassLoaderRegistry got out of sync between the nodes. One node serialized the SshConfiguration using a bundle ID that the other node did not recognize, so the other node failed to deserialize the object and returned null.
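The failure mode described above can be sketched in a few lines. This is only an illustration under stated assumptions: NodeRegistry, SerializedValue, and RegistryMismatchDemo are hypothetical names, not the actual OsgiClassLoaderRegistry or Hazelcast serialization code, and the bundle IDs are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a per-node registry that maps bundle IDs to class loaders.
final class NodeRegistry {
    private final Map<Integer, ClassLoader> loadersByBundleId = new HashMap<>();

    void register(int bundleId, ClassLoader loader) {
        loadersByBundleId.put(bundleId, loader);
    }

    // Empty when this node has never registered the given bundle ID.
    Optional<ClassLoader> lookup(int bundleId) {
        return Optional.ofNullable(loadersByBundleId.get(bundleId));
    }
}

// Hypothetical wire format: the writer's bundle ID travels with the payload.
final class SerializedValue {
    final int bundleId;
    final String payload;

    SerializedValue(int bundleId, String payload) {
        this.bundleId = bundleId;
        this.payload = payload;
    }
}

final class RegistryMismatchDemo {
    // Mirrors the reported behaviour: an unknown bundle ID yields null, not an exception.
    static String deserialize(NodeRegistry reader, SerializedValue value) {
        return reader.lookup(value.bundleId)
                .map(loader -> value.payload)
                .orElse(null);
    }
}
```

If the writing node registered the bundle under one ID and the reading node under another, the reader silently gets null, and any caller that dereferences the result without a check fails with a NullPointerException.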
The failure was logged at INFO level, but the message never appeared in Stash's logs because the logging level for that particular class is set to WARN by default.
Changes to be made:
- Change the OsgiClassLoaderRegistry to ensure it can never get out of sync between the nodes.
- Harden the SSH configuration code to handle cache load failures more gracefully.
- Change the level of the log message from INFO to WARN.
- Change the default log level for com.atlassian.hazelcast to INFO in Stash.
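As a rough illustration of the second item, the following sketch shows one way a configuration accessor could tolerate a cache that returns null or throws. It is not the fix that was shipped: SshSettings, ResilientConfigAccessor, and the two suppliers are hypothetical, and falling back to an authoritative store is an assumption about what "more gracefully" might mean.

```java
import java.util.Objects;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical value object standing in for the SSH configuration.
final class SshSettings {
    final int port;

    SshSettings(int port) {
        this.port = port;
    }
}

// Hypothetical accessor that never hands a null configuration to its callers.
final class ResilientConfigAccessor {
    private static final Logger LOG = Logger.getLogger(ResilientConfigAccessor.class.getName());

    private final Supplier<SshSettings> cache; // may return null or throw
    private final Supplier<SshSettings> store; // assumed authoritative source, e.g. a database

    ResilientConfigAccessor(Supplier<SshSettings> cache, Supplier<SshSettings> store) {
        this.cache = cache;
        this.store = store;
    }

    SshSettings get() {
        SshSettings cached = null;
        try {
            cached = cache.get();
        } catch (RuntimeException e) {
            // Log at WARNING so the failure is visible with default log levels.
            LOG.log(Level.WARNING, "Cache load failed; falling back to the store", e);
        }
        if (cached != null) {
            return cached;
        }
        LOG.warning("Cache returned no SSH configuration; falling back to the store");
        return Objects.requireNonNull(store.get(), "store returned no SSH configuration");
    }
}
```

The design choice here is to treat a failed cache load as a recoverable condition: the caller still receives a usable configuration, and the warning makes the underlying cluster problem visible instead of surfacing later as a NullPointerException.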
Workaround
Restarting all nodes in the Data Center instance brings the OsgiClassLoaderRegistry on all nodes back in sync.