Type: Suggestion
Resolution: Fixed
Our product teams collect and evaluate feedback from a number of different sources. To learn more about how we use customer feedback in the planning process, check out our new feature policy.
Builds that build and push Docker images could be faster if the image layers were cached.
[BCLOUD-14144] Cache Docker layers between builds
Attachment 2614718555-Capture.PNG has been added with description: Originally embedded in Bitbucket issue #14144 in site/master
Attachment 4059371088-Screen Shot 2019-06-09 at 2.23.46 am.png has been added with description: Originally embedded in Bitbucket issue #14144 in site/master
Hey 740a1f9b5dc2,
Your cached docker image is ‘702.2MiB over the 1GiB upload limit’.
In my case the build cache is around 700 MB after compression, but it still reports being over the 1 GB limit. Can someone elaborate?
We use multi-stage builds, and the built-in cache only seems to cache the first target:
+ docker build --target app -t app .
Sending build context to Docker daemon  123MB
Step 1/17 : FROM php:7.0-fpm-alpine as base
7.0-fpm-alpine: Pulling from library/php
Digest: sha256:b8ddafa001be63c0665e7c8501bdade02f29e77ceff88c57d9f142692d6401bb
Status: Downloaded newer image for php:7.0-fpm-alpine
---> f8f280d888a9
Step 2/17 : RUN ...
---> Using cache
---> 4e2ab674f07d
Step 3/17 : ENV PATH="/app/vendor/bin:${PATH}"
---> Using cache
---> adcbff086e7c
...
# When we do FROM again it will no longer use the cache
Step 7/17 : FROM base as composer
---> 15f8d021ac02
Step 8/17 : RUN mkdir /tmp/composer && chmod 777 /tmp/composer
---> Running in adbe2935d2f9
Removing intermediate container adbe2935d2f9
---> b1587dff3fd4
Step 9/17 : ENV COMPOSER_HOME=/tmp/composer
---> Running in 18b1b49e35a7
Removing intermediate container 18b1b49e35a7
---> 22eb7b4f847d
Step 10/17 : COPY --from=composer /usr/bin/composer /usr/bin/composer
...
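A common workaround for this (a sketch only; the image names are hypothetical) is to build each stage explicitly with --target, tag it, and feed the stage tags back in as cache sources via --cache-from on later builds:
#!bash
# Pull previously pushed stage images so they can seed the cache (ignore failures on the first run)
docker pull myapp/base:latest || true
docker pull myapp/composer:latest || true
# Build and tag each stage so its layers can be reused next time
docker build --target base --cache-from myapp/base:latest -t myapp/base:latest .
docker build --target composer --cache-from myapp/base:latest --cache-from myapp/composer:latest -t myapp/composer:latest .
# The final build can then draw on all of the stage caches
docker build --cache-from myapp/base:latest --cache-from myapp/composer:latest -t myapp/app:latest .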
Same here. On the first build, it will build a cache and output some more details on what layers are getting cached etc.
Then, on the next build, it will download the cache but still run the docker build from the first step onwards, without using cached layers.
Also, after a build, it will not update the cache:
Skipping assembly of docker cache as one is already present
Cache "docker": Skipping upload for existing cache
@kmacleod thanks, but there's still no reason given for the failure:
Docker images saved to cache
Cache "docker": Compressing
Cache "docker": Compressed in 21 seconds
Cache "docker": Uploading 381.5 MiB
Cache "docker": Upload failed
Folks,
As of today, Pipelines is logging a bit more information about the assembly of the Docker layer cache in the Teardown section of the build. It now displays the reasoning behind whether or not the cache will be built and uploaded, as well as which images will be used to build the cache.
Hopefully this will make this feature a little more transparent.
Hey,
@pavelsavshenko, sorry to hear that. We are not throttling, and the speed should be much higher than 10MiB/s. If the problem persists, I would recommend raising a support case so that we can analyse your specific case.
@mochnatiy at the moment, in order to cache docker layers, the following conditions must be met:
- The layer cache has to be < 1GB compressed
- The size of the images in the docker daemon must be < 2GB for a cache to be created (you can check this by adding this command to your yml):
docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'
Also, bear in mind that Docker 1.13 introduced a new option to the docker build command: --cache-from, which allows you to specify one or more tagged images as a cache source. The image generated in the build can also be used as a cache source in another Docker build. This might help improve the performance of your build and save build minutes, without being subject to the 1GB limitation.
Example:
docker build \
  --cache-from $IMAGE:latest \
  --tag $IMAGE:$BITBUCKET_BUILD_NUMBER \
  --tag $IMAGE:latest \
  .
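For context, wired into a bitbucket-pipelines.yml this might look something like the sketch below ($IMAGE, $DOCKER_USER and $DOCKER_PASSWORD are assumed to be repository variables):
#!yaml
pipelines:
  default:
    - step:
        services:
          - docker
        script:
          - docker login -u $DOCKER_USER -p $DOCKER_PASSWORD
          # Pull the previous image so --cache-from has something to work with (ignore failure on the first run)
          - docker pull $IMAGE:latest || true
          - docker build --cache-from $IMAGE:latest --tag $IMAGE:$BITBUCKET_BUILD_NUMBER --tag $IMAGE:latest .
          - docker push $IMAGE:latest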
Regards,
Raul
For example, in my pipeline it caches only 2 layers, even though the cache size is < 1GB.
It seems the feature is not working correctly.
#!bash
Status: Downloaded newer image for ruby:2.5.0
---> 213a086149f6
Step 2/18 : ARG DATABASE_URL
---> Using cache
---> 429c813b6abf
Step 3/18 : ENV RACK_ENV production
---> Using cache
---> c0e98fc7cea0
Step 4/18 : ENV RAILS_ENV production
---> Running in 62f18ad58da3
Removing intermediate container 62f18ad58da3
---> a7f6146135a4
Step 5/18 : ENV DATABASE_URL $DATABASE_URL
---> Running in d0f15c5589fc
Removing intermediate container d0f15c5589fc
Anything we can do to speed up the step? It feels like, with or without the cache, we're paying an extra 1.5+ minutes on our bill for each pipeline run :cry:
Cache "docker": Downloaded 875.5 MiB in 79 seconds
10MiB/s sounds way too slow for any reasonable usage... Is it throttled on purpose? Or is it just hosted at an external location who throttles the download speed?
We only cache layers if:
a) there is no docker cache already uploaded; you can verify this from the Build setup logs, which should say:
Cache "docker": Not found
b) the total uncompressed cache size is less than 2GB; you can verify this by running the following as the final command in your step and confirming its output is less than 2GB (it prints human-readable sizes: kB, MB, GB, ...):
#!bash
/usr/bin/docker system df | grep Images | awk '{print $4}'
c) once compressed, the cache must be less than 1GB. You can verify whether this is occurring from the Build teardown logs, which should say:
Skipping cache upload.
There is currently an issue where, if your cache is more than 2GB, it simply prints Cache "docker": Skipping upload for empty cache. We have plans to improve this to better indicate that the second condition is failing, but haven't yet due to priorities.
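To check conditions (b) and (c) in one place, you could append something like this sketch to the end of your step's script (both commands are the ones quoted in this thread):
#!bash
# Uncompressed size of all images known to the daemon (condition b), human-readable
/usr/bin/docker system df | grep Images | awk '{print $4}'
# The same total in raw bytes; it must be below 2147483648 for a cache to be assembled
docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{total += $0} END {print total}'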
If your cache passes all of the above conditions and still isn't uploading, feel free to raise a support case so we can investigate further.
I am building an Android app, but the cache doesn't seem to be working for me either.
This is my yaml file:
#!yaml
image: bitriseio/docker-android
pipelines:
  branches:
    release:
      - step:
          services:
            - docker
          caches:
            - docker
          script:
            - fastlane beta
Build Setup
Cache "docker": Downloading Cache "docker": Not found
Build teardown
Cache "docker": Skipping upload for empty cache
Build
Image: bitriseio/docker-android
Memory: 3072 MB
Docker
Image: atlassian/pipelines-docker-daemon:prod-stable
Memory: 1024 MB
Can anyone please help? The cache limit documentation is hard for beginners.
Hi @cstryczynski
Caches are supported on branches. Your yaml example above is invalid: you need to indent the services and caches settings one more level, as they are properties of a step, not elements of the pipeline.
This is a valid example of your yaml (note the indentation of services and caches):
#!yaml
options:
  docker: true
pipelines:
  branches:
    staging:
      - step:
          deployment: staging
          script:
            - echo "valid"
          services:
            - docker
          caches:
            - docker
How do we add this to a specific branch?
The below is considered invalid:
options:
docker: true
pipelines:
branches:
staging:
- step:
deployment: staging
script:
- thisDoesNotWork???
services:
- docker
caches:
- docker
Not seeing any cache being used for my own 'docker build' commands either. The image sizes are <1GB.
I'm having issues with this feature.
I have the docker cache enabled, but the only thing in the cache is docker.tar.
It doesn't cache any of the layers created by docker build commands in steps in my pipeline.
Is there a way I can explicitly cache layers created by steps in my pipeline?
I'm currently using docker save to preserve layers between steps, but that doesn't solve the problem of the pipeline building the image from scratch every time it runs, rather than loading from cache.
Hi everyone,
We've completed our testing and analysis of the impact of Docker layer caching and found that about 25% of repositories had worse performance with the caches enabled. Based on this, we've decided to make the caching opt-in for each pipeline.
So Docker layer caching is now available as a pre-defined cache with a name of docker, configured like this:
pipelines:
  default:
    - step:
        services:
          - docker
        caches:
          - docker
        script:
          - # .. do cool stuff with Docker ..
If you were previously relying on the automatically enabled cache to make your build faster, you will now need to explicitly enable it as shown above. Docker layer caches have the same limitations and behaviours as regular caches, as described on Caching Dependencies: maximum size of 1 GB, refreshed after a week, etc.
Based on our test results, this caching significantly speeds up many types of Docker builds, so the majority of people using Docker will want to enable this cache. However we thought it was important that you have control over this feature so you can enable it, or not, depending on your needs.
Docker layer caching is now available for everyone to enable and use, so I'll resolve this ticket now. Please let us know here, or via a new ticket if you have any ideas of how to improve it further.
Thanks,
Matt
Is it possible to turn off the docker caching?
Even if it is impossible, you can always delete your images after the build process, so it will create an empty cache in seconds.
#!bash
docker rmi -f $(docker images -q)
The cache does indeed work now, but it takes almost a minute in the pre-build stage just to initialize the cache (it's 500 MB zipped), so I think it would just be better for us to cache the respective builds of the components separately.
Thanks for all the feedback, folks. We're definitely still looking at how to improve this feature, which is why the ticket is still open.
For the cache/no-cache heuristic, we have identified a Docker upgrade as the easiest way to improve this, so that's on our short term list.
For the performance concerns, we're measuring the relative build time changes (positive or negative) on repositories using Docker, so we can decide whether to continue with this cache as a default setting or make it a configurable option.
As @gtk_ashulgin mentions, it is possible to implement a cache yourself if the image size calculation is not working for you. But it is not currently possible to turn the default caching off. So we're prioritising the build time investigation first.
We'll keep you posted on the changes here.
As a workaround you can still create a custom cache file:
#!bash
docker save \
  $(docker images -q | tr '\n' ' ') \
  $(docker images -q | while read line; do docker history -q $line; done | tail -n +2 | grep -v '<missing>' | uniq -u | tr '\n' ' ') \
  > ~/docker/cache.tar
and then just load it on the next build
#!bash
docker load < ~/docker/cache.tar
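To persist that tarball between builds, it could be paired with a custom cache definition; a sketch, where the cache name dockertar and the image name myapp are hypothetical:
#!yaml
definitions:
  caches:
    dockertar: ~/docker
pipelines:
  default:
    - step:
        services:
          - docker
        caches:
          - dockertar
        script:
          # Restore saved layers if a cache was downloaded (ignore failure on the first run)
          - docker load < ~/docker/cache.tar || true
          - docker build -t myapp .
          # Re-create the tarball so the cache picks it up at teardown
          - mkdir -p ~/docker
          - docker save myapp > ~/docker/cache.tar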
+1 for what Alexander, Martin and Markus said. Our final image is < 1GB, but the size calculation snippet posted above shows the intermediate files weighing in at around 18GB. It would be great if somebody could look into it and check whether the size calculation is correct or not. Thanks!
This doesn't seem to be working at all for me. Even between steps in a single build that use the same public container, I get this in every step:
#!shell
+ (./.pipelines/mvp/js/pull.sh)
Unable to find image 'node:8' locally
8: Pulling from library/node
4176fe04cefe: Pulling fs layer
.... etc
Status: Downloaded newer image for node:8
It takes 25-30 seconds to pull a container to run 20 seconds' worth of tests.
@nburrell @mryall_atlassian Is there any known workaround, or do you plan to fix the erroneous image size calculation? As pointed out by Martin and Alexander, we now cannot use caching at all: even though our resulting Docker cache files are only about 100 MB in size, Bitbucket thinks they are 14+ GB because we have a couple of USER, WORKDIR, ENV etc. layers, which add nothing to the actual file size in Docker, but apparently appear as separate full-sized images to Bitbucket and thus greatly exceed the Bitbucket limit.
Yes, just adding up all the image sizes will give a very wrong result. You should probably check the /var/lib/docker directory size directly instead.
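If the daemon's filesystem isn't directly reachable (as in Pipelines, where Docker runs as a service container), docker system df gives a similar daemon-side view over the client connection; a sketch:
#!bash
# Daemon-side disk accounting; the Images row reports storage actually used,
# counting shared layers once rather than once per image
docker system df
# The verbose variant breaks usage down per image, container and volume
docker system df -v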
@mryall_atlassian
The size calculation that you are using is completely wrong. You cannot just add together all the bytes from all sub-layers. When you are building an image, each subsequent sub-layer is just a JSON directive that extends the previous layer/sub-layer, inheriting its size information.
So, as a result, an image cache that your approach estimates at approximately 8GB is in reality only 150MB (I am using a custom script to create a docker cache with all sub-layers, btw).
@rskuipers
If this is still happening, can you please raise a support ticket at https://support.atlassian.com so we can investigate further.
While trying to use this cache, we seem to be getting the following error during "Build Setup".
Cache "docker": Downloading Cache "docker": Error downloading. Please contact support if this error persists.
Any idea why this may be? All the layers together are about 550 MB so that should be fine.
@danaasbury - please see my comment above for a summary of the Docker caching currently offered by Pipelines.
If you would like to have caching for images hosted privately on ECR/GCR/etc. that are used as build or service containers, please raise a separate feature request for that. As far as I know, we don't have one open currently for this.
@nburrell Does this also cache custom docker images specified in the image section of the yaml? We use a custom image built off the node image and it seems most of the time is spent downloading it.
@lazam-techasiaportal your docker image ls command above is missing the -a argument, which hides the intermediate layers that your images depend on and that we include in the cache.
If you add that option it will show you those as well (the command I provided above, and that you linked, contains that argument).
@lazam-techasiaportal, as far as I know, the 2GB limit includes the images your final image depends on like the image of the debian/ubuntu/alpine linux your image is based on. Running just docker image inspect $(docker image ls -aq) will display all the images which would have been put into cache.
Hello, based on the comments above the total image size must be less than 2GB. My current setup only creates two docker images (3 tags) totalling around 400 MB, but when I add up the total size in bytes it comes to 3.7GB. This results in Pipelines not caching our docker images; may I know why it shows a different value?
docker image ls
REPOSITORY       TAG              IMAGE ID       CREATED          SIZE
**************   1.6.0-SNAPSHOT   54671081c605   43 seconds ago   222MB
**************   latest           54671081c605   43 seconds ago   222MB
**************   1.1.0            bf69ce97c4a4   6 weeks ago      172MB
docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'
3791369347
Since it exceeds a total of 2GB, it doesn't cache the docker images, as seen below:
Cache "docker": Skipping upload for empty cache
We're using our own Docker image to build these, which is around 878 MB.
@manzoorh - there are now two different kinds of Docker caching we do in Pipelines, which might lead to a bit of confusion:
- DockerHub pull-through image cache: any public images pulled from DockerHub for build and service containers are cached by a "pull-through" image cache. This is a shared cache across all our customers, and has been in effect for six months or so. This cache is unbounded, but only works for publicly accessible images on DockerHub used as build or service containers.
- Docker-in-docker (DinD) layer cache: we implemented this for this feature request, which caches Docker image layers privately for each repository, to support reuse across builds. It also caches images which are pulled when you execute docker run or similar in your build script.
This new cache is limited to 1GB per repository, with an additional requirement that the total of all images used in your build must be less than 2GB for us to attempt compression and storage (see @nburrell's example above). It works for any image pulled or built by Docker commands within your build, including docker run, docker build, etc., and will incorporate images pulled from any Docker registry.
Based on what you've said, it sounds like your ECS image is used as a build or service container, meaning it will not be cached by Pipelines. To get it cached, you have the option of starting it via docker run yourself, so it uses the DinD cache instead of the pull-through image cache, as long as it falls within the size limits above.
Adding caching for Docker images pulled from ECS for build/service containers is not covered by this feature request, and is actually fairly complex due to how the Docker pull-through proxy works. If that's something you'd like, please raise a separate ticket and we can review and respond to it there.
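As a sketch of that docker run approach (the registry URL and image name here are hypothetical, and registry authentication is elided, since it depends on your AWS setup):
#!bash
# Authenticate to your registry first (details elided)
# Pulling inside the build script routes the image through the DinD layer cache,
# provided the totals stay within the size limits above
docker pull 123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage:latest
docker run 123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage:latest ./run-tests.sh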
Thanks @peritbezek. To be more clear: I am building the image in a separate Jenkins pipeline, and the Bitbucket pipeline only runs test cases for now. How can I enable image caching for the Bitbucket pipeline?
@manzoorh I also have docker images in AWS ECS Repositories and the pipelines are caching them.
@nburrell I have my docker image on the Amazon registry. Will it be compatible if I add docker as a service in the config file?
Cheers @nburrell
Thanks for the feedback. The size of the images is beyond the limit.
I am reducing the number of layers by aggregating commands, and looking into moving to Alpine.
Any more recommendations?
Hi @malcata
What do the logs show in the Build Teardown section for your docker cache?
Also can you confirm for me that your docker cache is within the limits:
- The layer cache has to be < 1GB compressed
- The size of the images in the docker daemon must be < 2GB (2147483648 bytes) for a cache to be created; you can check this by adding this command to your yml:
docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'
Hi @nburrell is it publicly available to all?
I have configured as recommended and get the following log after running multiple times:
#!shell
Cache "docker": Downloading
Cache "docker": Not found
Hi @brettmichaelorr
Correct, you just need to have docker in your services section for this to work, along with the following conditions being met:
- The layer cache has to be < 1GB compressed
- The size of the images in the docker daemon must be < 2GB for a cache to be created (you can check this by adding this command to your yml):
docker image inspect $(docker image ls -aq) --format {{.Size}} | awk '{totalSizeInBytes += $0} END {print totalSizeInBytes}'
What do the logs say in the build setup/teardown section for a step that should be consuming/producing a docker cache?
Can I have a link to the repository/step that is exhibiting this behaviour or alternatively can you raise a support case so I can investigate for you?
@nburrell Thanks for the detailed reply! I'm not seeing any dependencies yet, and I'm a little confused as to whether I've done my cache declaration correctly (as I understand it, we don't need a cache entry, just docker declared as a service in the step?).
My pipelines YAML is below:
#!yaml
image: node:9
pipelines:
  branches:
    master:
      - step:
          script:
            - docker login -u $DOCKER_USER -p $DOCKER_PASSWORD
            - docker build -t xxxx/app:latest .
            - docker push xxxx/app:latest
          services:
            - docker
      - step:
          script:
            - mkdir -p ~/.ssh
            - (umask 077; echo $PRIVATE_KEY | base64 --decode > ~/.ssh/id_rsa)
            - echo 'cd XXXX && docker-compose pull && docker-compose down && docker-compose up -d' > dc.sh
            - ssh ubuntu@X.X.X.X 'bash -s' < dc.sh
Overall, it doesn't seem like my build times are affected (build times are variable for me, thanks to how Create-React-App varies in how long it takes to build a production folder), but I'm not sure I'm caching correctly?
If we upload a cache as part of your step for docker (provided it's less than 1GB when compressed, as per our regular cache limitations), you will see:
- An entry in the caches dropdown called docker allowing you to see the size of the docker cache and the ability to manually expire it before the weekly automatic expiry
- Logs in the Build Setup and Build Teardown section showing the status of download/uploading the cache for each step
As for the good dockerfile practices:
Let's say you create a dockerfile and add your application binary as one of the first layers with the ADD instruction. The layer cache would then be invalidated on every build: the hash of each layer is influenced by the contents of the previous one, and your application binary has a different hash on each build as you add and remove functionality. You would receive potentially no speed-up, as every subsequent layer in your dockerfile would have to be rebuilt because the hashes no longer match the ones from the previous build.
bad.dockerfile
#!dockerfile
FROM some/base-image
# Add my binary on each build
ADD mybinary /app/mybinary
# This layer is always rebuilt, as the previous layer's content has changed, generating a different hash
RUN apt update -y && apt install all-the-things -y
Now, if you were to add the binary as one of the last instructions in the dockerfile, only the layer containing your binary would be invalidated. The hashes for the previous layers remain unchanged, as they execute the same instructions and aren't influenced by the changing binary, so you only have to "build" the last layer on each build, taking full advantage of the layer cache in docker.
good.dockerfile
#!dockerfile
FROM some/base-image
# This layer is never rebuilt unless you change the instructions
RUN apt update -y && apt install all-the-things -y
# Add my binary on each build; only this layer is "rebuilt"
ADD mybinary /app/mybinary
For further information (and potentially a better explanation than mine), please refer to the documentation provided by docker:
https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#build-cache
There are many blogs and books with best-practice recommendations as well.
@nburrell Super excited to start using the docker caching! Can you please elaborate on what you're referring to with 'good dockerfile practices'? Additionally, should we be seeing a docker entry in the Caches tab in Pipelines?
Hi
Docker layer caching has been released!
What do you have to do to take advantage of faster docker builds*? Nothing!
If you currently use docker as a service, we will automatically cache the docker image layers between steps, allowing for faster docker builds by taking advantage of the immutable layers generated.
Run docker commands in pipelines has a section detailing this feature and links to the dependency caching documentation for further information about the limitations/behaviours of the cache.
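For reference, a minimal step using docker as a service looks something like this (a sketch; the build command is illustrative):
#!yaml
pipelines:
  default:
    - step:
        script:
          - docker build -t myapp .
        services:
          - docker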
Regards
Nathan
* provided you follow good dockerfile practices
I'm getting the same output as @George Boot. My pipeline seems to cache up to some arbitrary point in my first stage and then won't cache any more because it "already exists".