
[BAM-4430] Manual and scheduled shutdown of elastic instance fails to delete attached EBS volumes

    • Type: Bug
    • Resolution: Obsolete
    • Priority: Medium
    • Fix Version: 2.7
    • Affects Version: 2.3.1
    • Component: Elastic Bamboo
    • Labels: None

      EBS volumes cannot be detached/deleted when instances are shut down by the cron scheduler.

      INFO | jvm 1 | 2009/08/24 07:40:00 | 2009-08-24 07:40:00,006 INFO [QuartzScheduler_Worker-6] [ElasticFunctionalityFacadeImpl] Adjusting elastic agents with schedule: com.atlassian.bamboo.agent.elastic.schedule.ElasticInstanceScheduleImpl@26e492[25886722,0 40 7 ? * MON-FRI,<null>,EQUALS,0,true]
      INFO | jvm 1 | 2009/08/24 07:40:00 | 2009-08-24 07:40:00,006 INFO [QuartzScheduler_Worker-6] [ElasticFunctionalityFacadeImpl] Adjusting all elastic instance...
      INFO | jvm 1 | 2009/08/24 07:40:00 | 2009-08-24 07:40:00,023 INFO [QuartzScheduler_Worker-6] [ElasticFunctionalityFacadeImpl] Attempting to shutdown 1 of 'Default' elastic instances
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,264 INFO [BAM::Events:pool-1-thread-13] [ElasticInstanceManagerImpl] Elastic Agent "Elastic Agent on i-6ad13f02" stopped on instance i-6ad13f02
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,277 INFO [pool-3-thread-3] [ElasticInstanceManagerImpl] Elastic instance i-6ad13f02 transitioned from RUNNING to SHUTTING_DOWN.
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,287 INFO [BAM::Events:pool-1-thread-7] [ElasticInstanceManagerImpl] Elastic Agent "Elastic Agent on i-6ad13f02" stopped on instance i-6ad13f02
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,632 INFO [pool-3-thread-3] [RemoteEC2InstanceImpl] EC2 instance i-6ad13f02 transitioned from running (16) to shutting-down (32)
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,633 INFO [BAM::Events:pool-1-thread-7] [ElasticInstanceManagerImpl] Requested termination of elastic instance: i-6ad13f02
      INFO | jvm 1 | 2009/08/24 07:40:01 | 2009-08-24 07:40:01,633 INFO [BAM::Events:pool-1-thread-13] [ElasticInstanceManagerImpl] Requested termination of elastic instance: i-6ad13f02
      INFO | jvm 1 | 2009/08/24 07:40:25 | 2009-08-24 07:40:25,833 INFO [pool-3-thread-3] [RemoteEC2InstanceImpl] EC2 instance i-6ad13f02 has terminated.
      INFO | jvm 1 | 2009/08/24 07:40:25 | 2009-08-24 07:40:25,833 INFO [pool-3-thread-3] [ElasticInstanceManagerImpl] Elastic instance i-6ad13f02 transitioned from SHUTTING_DOWN to TERMINATED.
      INFO | jvm 1 | 2009/08/24 07:40:25 | 2009-08-24 07:40:25,833 INFO [pool-3-thread-3] [ElasticInstanceManagerImpl] Detected that the elastic instance i-6ad13f02 has stopped.
      INFO | jvm 1 | 2009/08/24 07:40:25 | 2009-08-24 07:40:25,836 INFO [pool-3-thread-3] [EBSVolumeSupervisorImpl] Deleting EBS volume vol-c90ffaa0
      INFO | jvm 1 | 2009/08/24 07:40:26 | 2009-08-24 07:40:26,036 WARN [pool-3-thread-3] [EBSVolumeImpl] Attempt to detach EBS volume vol-c90ffaa0 from EC2 instance i-6ad13f02 failed. Proceeding with deletion.
      INFO | jvm 1 | 2009/08/24 07:40:26 | com.xerox.amazonws.ec2.EC2Exception: Client error : The volume 'vol-c90ffaa0' is not 'attached'.
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.xerox.amazonws.ec2.Jec2.makeRequestInt(Jec2.java:1680)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.xerox.amazonws.ec2.Jec2.detachVolume(Jec2.java:1569)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.EBSVolumeImpl.delete(EBSVolumeImpl.java:38)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.bamboo.agent.elastic.server.EBSVolumeSupervisorImpl.purge(EBSVolumeSupervisorImpl.java:119)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.bamboo.agent.elastic.server.RemoteElasticInstanceImpl$1.ec2InstanceStateChanged(RemoteElasticInstanceImpl.java:307)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$4.run(RemoteEC2InstanceImpl.java:518)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$CatchingRunnableDecorator.run(RemoteEC2InstanceImpl.java:98)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.setState(RemoteEC2InstanceImpl.java:513)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.terminated(RemoteEC2InstanceImpl.java:346)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.EC2InstanceState$3.supervise(EC2InstanceState.java:125)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.backgroundSupervise(RemoteEC2InstanceImpl.java:437)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.access$300(RemoteEC2InstanceImpl.java:25)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$2.run(RemoteEC2InstanceImpl.java:127)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$CatchingRunnableDecorator.run(RemoteEC2InstanceImpl.java:98)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      INFO | jvm 1 | 2009/08/24 07:40:26 | at java.lang.Thread.run(Thread.java:619)
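
      Until the underlying race is fixed, volumes that survive a shutdown are left behind in the 'available' state and can be cleaned up by hand. The sketch below illustrates the idea only: `SAMPLE_OUTPUT` and the deletion step stand in for real `ec2-describe-volumes` / `ec2-delete-volume` calls, and the column layout (status in field 6) is an assumption that may differ between API-tools versions.

```shell
# Workaround sketch (not from the issue): find EBS volumes stranded in the
# 'available' state and delete them. SAMPLE_OUTPUT stands in for a real
# `ec2-describe-volumes` call; snap-00000000 is a placeholder snapshot id.
SAMPLE_OUTPUT='VOLUME  vol-c90ffaa0  10  snap-00000000  us-east-1c  available  2009-08-24T07:30:00+0000
VOLUME  vol-c2887fab  10  snap-00000000  us-east-1c  in-use  2009-09-02T20:30:00+0000'

# Keep only volumes whose status column reads "available".
orphans=$(printf '%s\n' "$SAMPLE_OUTPUT" |
    awk '$1 == "VOLUME" && $6 == "available" { print $2 }')

for v in $orphans; do
    # A real cleanup would run: ec2-delete-volume "$v"
    echo "would delete $v"
done
```

      Running this against the sample output selects only `vol-c90ffaa0`, the volume no instance is using.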


            Przemek Bruski added a comment - Fixed in EBS handling revamp.

            from https://support.atlassian.com/browse/BSP-2103

            • I can confirm that shutting down an instance manually still leaves the volume around. I manually started an instance and ran a build today; after shutdown I waited a few minutes and the volume was still visible in the AWS console.
            • I just looked at our bootstrap script for the instance. We do install VPN, JBoss, etc., but they get copied to the local filesystem on the instance before anything is run. The only things on the EBS volume are the SVN checkout created by Bamboo and our Maven .m2 directory.

            Let me know if you have any commands I can run to validate that nothing is holding the volume on shutdown.
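
One way to check this (an assumption on my part, not something suggested in the issue) is to ask `fuser` whether any process still has files open under the EBS mount point before shutting down; the mount point below matches the `mountLocation` printed by generateSnapshot.sh:

```shell
# Sketch: check whether anything still holds the EBS mount open.
# On a machine without this mount (or without fuser), the check simply
# reports the mount as free.
MOUNT_POINT=/mnt/bamboo-ebs

# `fuser -m` exits 0 and prints PIDs if any process has files open under
# the mount; a non-zero exit means nothing is holding it.
if fuser -m "$MOUNT_POINT" 2>/dev/null; then
    result="busy"
    echo "processes still using $MOUNT_POINT - unmount/detach may fail"
else
    result="free"
    echo "nothing is holding $MOUNT_POINT open"
fi
```

If the mount reports busy, `fuser -mv` (where supported) shows which processes are responsible.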

            Lastly, I started up an instance, then did a "killall java" and ran the generateSnapshot.sh script. The snapshot did get generated, but it looks like the volume didn't get removed:

            [root@domU-12-31-39-06-70-F8 bin]# ./generateSnapshot.sh 
            mountLocation: /mnt/bamboo-ebs
            attachedDeviceLocation: /dev/sdh
            instanceId: i-7eba4816
            amiId: ami-dda242b4
            availabilityZone: us-east-1c
            Found volumeId attached to this instance vol-c2887fab...
            Unmounting /mnt/bamboo-ebs...
            /dev/sdh umounted
            Detaching vol-c2887fab...
            ATTACHMENT      vol-c2887fab    i-7eba4816      /dev/sdh        detaching       2009-09-02T20:40:01+0000
            Waiting while volume vol-c2887fab is detaching...
            Creating snapshots...
            Describe snapshots...
            SNAPSHOT        snap-b3e14eda   vol-c2887fab    completed       2009-09-02T20:44:57+0000        100%
            Waiting while snapshotId snap-b3e14eda is pending...
            Saving snapshotId snap-b3e14eda to ~/snapshotId
            Removing vol-c2887fab...
            Client.IncorrectState: The volume 'vol-c2887fab' is 'in-use'.
            

            However, once the new snapshot was created, I shut down the instance and looked at my AWS console, and it looked as though the volume was removed! So I'm not sure if the error in the above log is misleading or not.
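
The 'in-use' error is consistent with the script calling delete while the volume is still in the 'detaching' state. A defensive fix is to poll the volume until it reports 'available' before removing it. In this sketch the state query is a stub; a real script would shell out to `ec2-describe-volumes` and `sleep` between polls:

```shell
# Sketch: wait for an EBS volume to finish detaching before deleting it.
# describe_volume_state stands in for `ec2-describe-volumes`; here it
# simulates a volume that is still 'in-use' for the first two polls.
describe_volume_state() {
    if [ "$1" -lt 2 ]; then echo "in-use"; else echo "available"; fi
}

wait_and_delete() {
    volume=$1
    tries=0
    while [ "$tries" -lt 30 ]; do
        state=$(describe_volume_state "$tries")
        if [ "$state" = "available" ]; then
            # A real script would run: ec2-delete-volume "$volume"
            echo "deleted $volume"
            return 0
        fi
        echo "volume $volume is still '$state', waiting..." >&2
        tries=$((tries + 1))
        # A real script would: sleep 5
    done
    echo "timed out waiting for $volume to detach" >&2
    return 1
}

result=$(wait_and_delete vol-c2887fab)
echo "$result"
```

With the stubbed states above, the loop waits out two 'in-use' polls and then deletes the volume, which matches the behavior observed in the AWS console (the volume disappears once detachment completes).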

            I will attach the setupEbsSnapshot.log file for this as well. Thanks!

            Ulrich Kuhnhardt [Atlassian] added a comment

            Verified locally - same stack trace

            2009-09-01 10:33:54,561 INFO [pool-3-thread-2] [EBSVolumeSupervisorImpl] Deleting EBS volume vol-b54ebadc
            2009-09-01 10:33:54,851 WARN [pool-3-thread-2] [EBSVolumeImpl] Attempt to detach EBS volume vol-b54ebadc from EC2 instance i-a021d0c8 failed. Proceeding with deletion.
            com.xerox.amazonws.ec2.EC2Exception: Client error : The volume 'vol-b54ebadc' is not 'attached'.
            at com.xerox.amazonws.ec2.Jec2.makeRequestInt(Jec2.java:1680)
            at com.xerox.amazonws.ec2.Jec2.detachVolume(Jec2.java:1569)
            at com.atlassian.aws.ec2.EBSVolumeImpl.delete(EBSVolumeImpl.java:38)
            at com.atlassian.bamboo.agent.elastic.server.EBSVolumeSupervisorImpl.purge(EBSVolumeSupervisorImpl.java:119)
            at com.atlassian.bamboo.agent.elastic.server.RemoteElasticInstanceImpl$1.ec2InstanceStateChanged(RemoteElasticInstanceImpl.java:307)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$4.run(RemoteEC2InstanceImpl.java:518)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$CatchingRunnableDecorator.run(RemoteEC2InstanceImpl.java:98)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.setState(RemoteEC2InstanceImpl.java:513)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.terminated(RemoteEC2InstanceImpl.java:346)
            at com.atlassian.aws.ec2.EC2InstanceState$3.supervise(EC2InstanceState.java:125)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.backgroundSupervise(RemoteEC2InstanceImpl.java:437)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl.access$300(RemoteEC2InstanceImpl.java:25)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$2.run(RemoteEC2InstanceImpl.java:127)
            at com.atlassian.aws.ec2.RemoteEC2InstanceImpl$CatchingRunnableDecorator.run(RemoteEC2InstanceImpl.java:98)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
            at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
            at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
            at java.lang.Thread.run(Thread.java:613)

            Ulrich Kuhnhardt [Atlassian] added a comment

            Ulrich Kuhnhardt [Atlassian] added a comment - see https://support.atlassian.com/browse/BSP-2103 for log

              Assignee: Unassigned
              Reporter: Ulrich Kuhnhardt [Atlassian] (ukuhnhardt)
              Affected customers: 0
              Watchers: 3