  1. Bitbucket Data Center
  2. BSERV-7049

Data Center node restart triggers NullPointerException when accessing SshConfiguration

Details

    Description

      After one node in a Data Center installation was restarted, the SSH server on that node failed to start, throwing a NullPointerException. The same NullPointerException is thrown wherever the SSH configuration is accessed, which left the node crippled.

      Investigation uncovered the cause: the OsgiClassLoaderRegistry got out of sync between the nodes. One node serialized the SshConfiguration using a bundle ID that the other node did not recognize, so the other node failed to deserialize the object and returned null.
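
      The failure mode described above can be illustrated with a minimal sketch. It is not the actual OsgiClassLoaderRegistry or Hazelcast serialization code; NodeRegistry, Envelope and deserialize are hypothetical names, and the assumption is simply that each node assigns bundle IDs locally in registration order, so a restarted node can end up with different IDs than its peers.

```java
import java.util.HashMap;
import java.util.Map;

final class RegistryDriftSketch {

    /** Hypothetical per-node registry: assigns numeric IDs to bundles in registration order. */
    static final class NodeRegistry {
        private final Map<Integer, String> bundlesById = new HashMap<>();
        private int nextId = 1;

        int register(String bundleName) {
            int id = nextId++;
            bundlesById.put(id, bundleName);
            return id;
        }

        boolean knows(int bundleId) {
            return bundlesById.containsKey(bundleId);
        }
    }

    /** Hypothetical wire format: the sender's bundle ID plus the serialized payload. */
    static final class Envelope {
        final int bundleId;
        final String payload;

        Envelope(int bundleId, String payload) {
            this.bundleId = bundleId;
            this.payload = payload;
        }
    }

    /** Mimics the reported behavior: an unknown bundle ID yields null instead of an error. */
    static String deserialize(NodeRegistry receiver, Envelope envelope) {
        return receiver.knows(envelope.bundleId) ? envelope.payload : null;
    }

    public static void main(String[] args) {
        NodeRegistry nodeA = new NodeRegistry();
        nodeA.register("some-other-bundle");
        int sshIdOnA = nodeA.register("ssh-bundle");

        // Node B was restarted and registered its bundles in a different order.
        NodeRegistry nodeB = new NodeRegistry();
        nodeB.register("ssh-bundle");

        Envelope fromA = new Envelope(sshIdOnA, "ssh-config");
        System.out.println("Node A reads: " + deserialize(nodeA, fromA));
        System.out.println("Node B reads: " + deserialize(nodeB, fromA));
    }
}
```

      In this sketch node B silently receives null, and any caller that dereferences the result without a check fails with a NullPointerException, which matches the symptom reported here.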

      The failure was logged at INFO level, but the message did not appear in Stash's logs because the logging level for that particular class is set to WARN by default.

      Changes to be made:

      • Change the OsgiClassLoaderRegistry to ensure it can never get out of sync between the nodes.
      • Harden the SSH configuration code to handle cache load failures more gracefully.
      • Raise the level of the log message from INFO to WARN.
      • Change the default log level for com.atlassian.hazelcast to INFO in Stash.
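
      As a rough illustration of the second item in the list above (handling cache load failures more gracefully), the sketch below treats a null or failed cache read as a miss, logs a warning, and reloads from an authoritative source. It is not the fix that was actually implemented; SafeCacheRead, getOrReload and both suppliers are hypothetical, and java.util.logging stands in for whatever logging framework the product uses.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

final class SafeCacheRead {

    private static final Logger LOG = Logger.getLogger(SafeCacheRead.class.getName());

    /**
     * Returns the cached value if one is available. If the cache read throws or
     * yields null, logs a warning and falls back to the reload supplier.
     */
    static <T> T getOrReload(Supplier<T> cacheRead, Supplier<T> reload, String description) {
        T cached = null;
        try {
            cached = cacheRead.get();
        } catch (RuntimeException e) {
            LOG.warning("Cache read for " + description + " failed: " + e);
        }
        if (cached != null) {
            return cached;
        }

        LOG.warning("No usable cached value for " + description + "; reloading from source");
        T reloaded = reload.get();
        if (reloaded == null) {
            // Fail with a descriptive error rather than handing null to callers.
            throw new IllegalStateException(description + " could not be loaded from cache or source");
        }
        return reloaded;
    }

    public static void main(String[] args) {
        // The cache read returns null, as in the reported failure, so the reload path is used.
        String config = SafeCacheRead.<String>getOrReload(
                () -> null,
                () -> "config-from-source",
                "SSH configuration");
        System.out.println(config);
    }
}
```

      The point of the design is that callers never see null: they get either a usable value or an exception that names what could not be loaded, which is easier to diagnose than a NullPointerException raised somewhere downstream.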

      Workaround

      Restarting all nodes in the Data Center instance brings the OsgiClassLoaderRegistry on all nodes back in sync.

            People

              mheemskerk Michael Heemskerk (Inactive)