Bamboo application does not come up when the Database failover completes exactly 180 seconds after the outage begins

    • 2
    • Severity 3 - Minor

      Issue Summary

      The issue is seen on a single-node Bamboo instance.

      The Bamboo application does not come back online when the Database failover event completes exactly 180 seconds after the Database went down.

      The expectation is that if the Database is not available, Bamboo waits for 180 seconds by default before the primary node goes down. If the Database connection becomes available during this window, Bamboo should work normally.
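
      The expectation above can be written down as a small decision rule. This is a minimal sketch of the behaviour this report expects, not Bamboo's actual watchdog code; the names `NodeState`, `DEFAULT_GRACE_SECONDS` and `expected_state` are hypothetical and exist only for illustration.

```python
from enum import Enum


class NodeState(Enum):
    RUNNING = "running"      # node keeps serving requests as primary
    SHUT_DOWN = "shut down"  # node stops once the wait period has expired


# Default wait described in this report (assumed to be configurable).
DEFAULT_GRACE_SECONDS = 180


def expected_state(outage_seconds: float,
                   grace_seconds: float = DEFAULT_GRACE_SECONDS) -> NodeState:
    """Return the node state this report expects after a Database outage.

    An outage that ends within the wait period, boundary included (the case
    this report is about), should leave the node running.
    """
    if outage_seconds <= grace_seconds:
        return NodeState.RUNNING
    return NodeState.SHUT_DOWN
```

      Under this rule an outage of exactly 180 seconds maps to `NodeState.RUNNING`. The behaviour reported here matches neither outcome: the node stays up but serves the secondary-node page.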

      In this case, the Database was available again and the background jobs worked properly, but accessing the Bamboo application UI showed the message below, and the node never went down.

      Request cannot be processed by the secondary node

      You are attempting to access a secondary, warm standby node.

      If your load balancer is configured correctly and you see this message, this may indicate that the primary node has become unresponsive and the secondary node is currently undergoing the failover process to become a new primary, which may take a few moments. However, if you are accessing the secondary node directly, please redirect your request to the load balancer or primary node.

       

      Steps to Reproduce

      1) Start with Bamboo up and running, connected to an external Database.

      2) Bring the Database down.

      3) Bring the Database online exactly at 180 seconds after it went down.
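
      The steps above could be scripted roughly as follows. This is a sketch under assumptions: it supposes the Database runs in a Docker container (the report does not say how the Database was stopped and started), and the container name passed in is hypothetical. The `run` and `sleep` parameters exist only so the sequence can be exercised without Docker.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def _run(cmd: Sequence[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def bounce_database(container: str,
                    downtime_seconds: float = 180.0,
                    run: Callable[[Sequence[str]], None] = _run,
                    sleep: Callable[[float], None] = time.sleep) -> List[List[str]]:
    """Stop the Database container, wait, then start it again.

    The wait is measured from the moment `docker stop` returns, i.e. from
    the point at which the Database is actually down.
    """
    stop_cmd = ["docker", "stop", container]
    start_cmd = ["docker", "start", container]

    run(stop_cmd)            # step 2: bring the Database down
    sleep(downtime_seconds)  # keep it down for the whole window
    run(start_cmd)           # step 3: bring the Database back online
    return [stop_cmd, start_cmd]
```

      For example, `bounce_database("bamboo-postgres")` would keep a container with that (assumed) name down for 180 seconds. Startup time of the Database adds to the outage as seen by Bamboo, so the effective downtime is slightly longer than the value passed in.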

      Expected Results

      1) Bamboo should be up and running and accessible via the GUI, because the Database came back online.

      Actual Results

      Please see the explanation below, based on the atlassian-bamboo.log file.

      1. We can see that the Database went down at 2025-08-12 14:03:56 in the example below

      2025-08-12 14:03:56,517 WARN [scheduler_Worker-12] [PoolBase] HikariPool-2 - Failed to validate connection org.postgresql.jdbc.PgConnection@763b2d2 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
      2025-08-12 14:03:56,518 WARN [scheduler_Worker-12] [PoolBase] HikariPool-2 - Failed to validate connection org.postgresql.jdbc.PgConnection@7938118 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
      

      2. The Database became available again at 2025-08-12 14:06:56, exactly 3 minutes after it was shut down.

      2025-08-12 14:03:56.008 UTC [1] LOG:  database system is shut down
      PostgreSQL Database directory appears to contain a database; Skipping initialization
      2025-08-12 14:06:56.977 UTC [1] LOG:  starting PostgreSQL 15.11 (Debian 15.11-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
      2025-08-12 14:06:56.977 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
      2025-08-12 14:06:56.977 UTC [1] LOG:  listening on IPv6 address "::", port 5432
      2025-08-12 14:06:56.979 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
      2025-08-12 14:06:56.987 UTC [30] LOG:  database system was shut down at 2025-08-12 14:03:56 UTC
      2025-08-12 14:06:56.993 UTC [1] LOG:  database system is ready to accept connections
      

      3. The access logs fill up with the messages below.

      2025-08-12 14:13:29,682 INFO [http-nio-8085-exec-25 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 120320kb
      2025-08-12 14:13:38,341 INFO [http-nio-8085-exec-7 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 119164kb
      2025-08-12 14:13:58,342 INFO [http-nio-8085-exec-5 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 117019kb
      2025-08-12 14:13:59,682 INFO [http-nio-8085-exec-16 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 115980kb
      2025-08-12 14:14:29,681 INFO [http-nio-8085-exec-20 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 87471kb
      2025-08-12 14:14:38,338 INFO [http-nio-8085-exec-15 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 86576kb
      2025-08-12 14:14:59,681 INFO [http-nio-8085-exec-23 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 84384kb
      2025-08-12 14:15:22,326 INFO [http-nio-8085-exec-24 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 75050kb
      2025-08-12 14:15:29,688 INFO [http-nio-8085-exec-6 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 149054kb
      2025-08-12 14:15:48,339 INFO [http-nio-8085-exec-3 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 147478kb
      2025-08-12 14:16:29,695 INFO [http-nio-8085-exec-4 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 302599kb
      2025-08-12 14:16:58,342 INFO [http-nio-8085-exec-13 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 300588kb
      2025-08-12 14:17:59,703 INFO [http-nio-8085-exec-14 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 238645kb
      2025-08-12 14:18:08,341 INFO [http-nio-8085-exec-12 url: /secondaryNodeNotAccessible.action] [AccessLogFilter] 10.10.0.3 - GET http://10.10.0.4:8085/secondaryNodeNotAccessible.action 237571kb
      

      4. See the attachment for the error when accessing the Bamboo URL
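
      The length of the outage can be checked from the timestamps quoted in items 1 and 2. The sketch below only does the arithmetic; it assumes that the Bamboo log and the PostgreSQL log use the same clock and time zone, which the report does not state. The helper `seconds_between` is introduced here for illustration.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from `start` to `end`.

    Bamboo writes milliseconds after a comma and PostgreSQL after a dot,
    so commas are normalised to dots before parsing.
    """
    t0 = datetime.strptime(start.replace(",", "."), TIMESTAMP_FORMAT)
    t1 = datetime.strptime(end.replace(",", "."), TIMESTAMP_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the log excerpts in this report.
DB_SHUT_DOWN = "2025-08-12 14:03:56.008"             # PostgreSQL: database system is shut down
FIRST_FAILED_VALIDATION = "2025-08-12 14:03:56,517"  # Bamboo: first HikariPool warning
DB_STARTING = "2025-08-12 14:06:56.977"              # PostgreSQL: starting PostgreSQL 15.11
DB_READY = "2025-08-12 14:06:56.993"                 # PostgreSQL: ready to accept connections

for label, start, end in [
    ("shut down -> starting", DB_SHUT_DOWN, DB_STARTING),
    ("shut down -> ready", DB_SHUT_DOWN, DB_READY),
    ("first failed validation -> ready", FIRST_FAILED_VALIDATION, DB_READY),
]:
    print(f"{label}: {seconds_between(start, end):.3f} s")
```

      All three gaps come out slightly above 180 seconds (roughly 180.5 to 181.0 s), so the Database became reachable very close to the boundary of the default wait period.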

      Workaround

      Refer to the steps described at https://support.atlassian.com/bamboo/kb/bamboo-data-center-nodealivewatchdog-shuts-down-bamboo-during-db-scheduled-backups/ to increase the timeout wait period.

            Assignee:
            Mateusz Szmal
            Reporter:
            Shashank Kumar
            Votes:
            0
            Watchers:
            4
