I am building a Docker image that contains LLMs, so I ended up with a fairly big image:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
custom-ollama-custom-vision-llava250314-e1x-c latest 2a51a22d7005 18 minutes ago 17.6GB
This image was built by transforming GGUF files, which were also big.
But my system is running out of space, so I ran
$ sudo find / -type f -size +1G -exec ls -lh {} \; 2>/dev/null | sort -k 5 -rh | head -20
and got these interesting results:
-rw-r--r-- 1 root root 13G 4月 16 12:06 /var/lib/docker/overlay2/of7kdxtrkg2kpufauknowz683/diff/root/.ollama/models/blobs/sha256-cfa7e10c7fa82c02f498ab55f3f5be06d25b28b0a6598d01a03aa61f87c08750
-rw-r--r-- 1 root root 13G 4月 16 12:06 /var/lib/docker/overlay2/coh80g8u43u43yuw97ekn8497/diff/root/.ollama/models/blobs/sha256-cfa7e10c7fa82c02f498ab55f3f5be06d25b28b0a6598d01a03aa61f87c08750
-rw-r--r-- 1 root root 13G 4月 16 12:02 /var/lib/docker/overlay2/yb776umeljvv7oajellb8ixrj/diff/temp_model_data/llava-v1.6-vicuna-7B-7b_20250314-epoch1-F16.gguf
-rw-r--r-- 1 root root 13G 4月 16 12:02 /var/lib/docker/overlay2/2ncan7h59si2r29dza5s4z7ss/diff/temp_model_data/llava-v1.6-vicuna-7B-7b_20250314-epoch1-F16.gguf
-rw-r--r-- 1 root root 13G 4月 16 11:42 /var/lib/docker/overlay2/xoi30ch9xr1cdkfuro4cuuqyl/diff/root/.ollama/models/blobs/sha256-cfa7e10c7fa82c02f498ab55f3f5be06d25b28b0a6598d01a03aa61f87c08750
-rw-r--r-- 1 root root 13G 4月 16 11:42 /var/lib/docker/overlay2/x4vr94gkwmsco51qeivr2zscp/diff/root/.ollama/models/blobs/sha256-cfa7e10c7fa82c02f498ab55f3f5be06d25b28b0a6598d01a03aa61f87c08750
-rw-r--r-- 1 root root 13G 4月 16 11:36 /var/lib/docker/overlay2/bqsculhwdq76v6hah9nj3ygij/diff/temp_model_data/llava-v1.6-vicuna-7B-7b_20250314-epoch1-F16.gguf
I use Docker images in my work, but I am not an expert on how images are structured and formed. So my question is: what are these (huge) files? Surely they can't all be part of the final image, since their combined size is bigger than the image itself. Are they leftovers from the docker build process?
I already pruned Docker, so docker images now shows only:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
custom-ollama-custom-vision-llava250314-e1x-c latest 2a51a22d7005 30 minutes ago 17.6GB
redis 7-alpine 7d06252fad43 8 months ago 41.2MB
so I don't see any temporary or intermediate images there.
Can I safely delete the huge files above?
1 Answer
You should refer to the Docker documentation to understand what layers are. Put simply, the resulting image contains every snapshot of the container filesystem that was created during the build.
For example, I suspect your Dockerfile does something like this:
FROM python
RUN wget llama_initial_weights
RUN python3 cleanup.py llama_initial_weights llama_cleaned_weights
RUN python3 transform.py llama_cleaned_weights llama_transformed_weights
RUN rm llama_cleaned_weights
RUN rm llama_initial_weights
Each of these RUN instructions creates a separate immutable snapshot layer in the resulting image. So all the data you created stays in 'hidden' layers on your hard drive even after the rm command has run: llama_initial_weights still exists in an intermediate layer.
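To check which build steps produced the large layers, docker history lists every layer of an image together with its size and the instruction that created it. A minimal sketch, assuming the image name is passed as the first argument (the layer_sizes helper name is hypothetical, only there for illustration):

```shell
# Print the size of every layer in an image next to the instruction
# that created it. layer_sizes is a hypothetical helper name;
# the image name/tag is taken from the first argument.
layer_sizes() {
  docker history --no-trunc --format 'table {{.Size}}\t{{.CreatedBy}}' "$1"
}

# Example, using the image name from the question:
# layer_sizes custom-ollama-custom-vision-llava250314-e1x-c:latest
```

If the Dockerfile is structured like the one above, a layer of roughly the size of the weights should show up next to the download step even though a later step removes the file.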
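You can try the --squash option of docker build to collapse the intermediate layers into one (it is an experimental feature of the classic builder, so it may not be available in your setup). Also, as mentioned in the comments, docker system prune can help free up space taken by unused images and other unused data.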
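Layers left over from earlier builds can also be kept as build cache, which does not appear in docker images. A hedged sketch of how to see and reclaim that space with the standard CLI (the reclaim_build_cache helper name is hypothetical; whether the large files in the question are build cache is an assumption you should confirm from the report first):

```shell
# Show Docker's disk usage, prune dangling build cache, then show the
# usage again. reclaim_build_cache is a hypothetical helper name.
reclaim_build_cache() {
  docker system df       # images, containers, local volumes, build cache
  docker builder prune   # asks for confirmation before deleting cache
  docker system df       # compare the "Build Cache" row with the first report
}
```

docker builder prune only touches build cache, so the images you still use stay in place, but the next build will have to redo the steps whose cache was removed.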
You can also change your Dockerfile to something less clean, but better for layers:
FROM python
RUN wget llama_initial_weights \
&& python3 cleanup.py llama_initial_weights llama_cleaned_weights \
&& python3 transform.py llama_cleaned_weights llama_transformed_weights \
&& rm llama_cleaned_weights \
&& rm llama_initial_weights
Here only one RUN instruction is executed, so only one layer is created, and the files removed at the end of that instruction are never stored in any layer.
Docker manages /var/lib/docker and you can't directly modify the files there. The overlay2 directory contains both image and container filesystems, and it's possible that routine cleanup commands like docker system prune will free up some space.