How to handle dynamic toplevel directories that may be needed? · bootc-dev/bootc · Discussion #1036

cgwalters
Jan 16, 2025
Maintainer

Today "podman machine" is a VM that wants to bind mount (via virtiofs) host paths into the VM at the same location they have on the host, and those paths can be platform-specific or even dynamic. On macOS, for example, home directories live under /Users, and the container bind mounts (in the VM) that same absolute path.

How to deal with this with a readonly /?

Current solutions:

- Add ostree.prepare-root.composefs=0 to the kernel arguments and reboot
- (if using a custom container build) Enable root.transient
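For the second option, here is a minimal sketch of what enabling transient root in a custom container build could look like. It assumes the setting lives in `/usr/lib/ostree/prepare-root.conf` as a `[root]` section with `transient = true`; the helper name `write_transient_root_conf` is hypothetical, and the sketch only appends the section rather than merging it into an existing `[root]` block. The initramfs most likely needs to be regenerated afterwards so that ostree-prepare-root sees the change.

```shell
#!/bin/sh
# Hypothetical helper: enable transient root in an image root directory.
# $1 is the root of the image being built (defaults to /).
write_transient_root_conf() {
    root=${1:-/}
    conf="${root%/}/usr/lib/ostree/prepare-root.conf"
    mkdir -p "${conf%/*}"
    # Do nothing if the setting is already present.
    if [ -f "$conf" ] && grep -qx 'transient = true' "$conf"; then
        return 0
    fi
    # Append so that any existing settings in the file are preserved.
    printf '[root]\ntransient = true\n' >> "$conf"
}
```

In a Containerfile this would run in a `RUN` step against `/`, followed by whatever initramfs regeneration step the base image uses.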

Option: Add `root = transient-ro`

Like root = transient but we'd allocate the overlayfs upper, but still keep it read-only by default. This would make it easy for code running in the real root to unshare the mount namespace, mount it writable and mutate it while still keeping it read-only for most use cases. This would be a pretty easy addition to ostree-prepare-root.

Filed as ostreedev/ostree#3471

Replies: 2 comments 10 replies

cgwalters
Jan 16, 2025
Maintainer Author

In the short term my recommendation is for cases like podman machine to pre-create in their container build all the toplevel directories they may need by default. It's OK to ship an empty /Users even in a VM which may run on Linux.
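As a rough sketch of that pre-creation step: the helper below creates a list of toplevel directories under an image root. The function name `precreate_toplevel_dirs` and the example directory list are assumptions for illustration; in a real Containerfile this collapses to something like `RUN mkdir -p /Users`.

```shell
#!/bin/sh
# Hypothetical helper: create empty toplevel directories in an image root
# at build time, so they exist as mount points on the read-only /.
# $1 is the image root; the remaining arguments are absolute paths.
precreate_toplevel_dirs() {
    root=${1%/}
    shift
    for d in "$@"; do
        # Strip the leading slash so the path lands under $root.
        mkdir -p "$root/${d#/}"
    done
}

# Example (inside a container build, where the image root is /):
#   precreate_toplevel_dirs / /Users
```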

That said, for those that have a need for truly dynamic management, a simple option is to enable transient root. However that's a big hammer.

What would be possible to do is enable transient root, but add code into the initramfs which creates those toplevel directories and then remounts the root readonly. This would mean discovery of the desired state would need to be handled in the initramfs.

Finally, one option we could add is something like root = transient-ro where we'd allocate the overlayfs upper, but still keep it read-only by default. This would make it easy for code running in the real root to unshare the mount namespace, mount it writable and mutate it while still keeping it read-only for most use cases. This would be a pretty easy addition to ostree-prepare-root.

8 replies

cgwalters Apr 3, 2025
Maintainer Author

https://github.com/crc-org/snc/blob/release-4.18/image-mode/microshift/config/Containerfile.bootc-rhel9#L25

Unrelated to this I'd recommend https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/

> For OCP right now we get the ISO using `export ISO_URL=$(./openshift-install coreos print-stream-json | grep location | grep $ARCH | grep iso | cut -d\" -f4)`, which I am not sure how it is created

Via coreos-assembler which makes a container image and then makes an ISO out of that.

And then you do an install from the ISO, and then snapshot that as a qcow2 or so right?

> and does this have a kickstart file?

No, kickstart is not used in CoreOS today.

However of course, for fedora-bootc we are supporting Anaconda to do installs. But it gets tricky to intersect Ignition and Anaconda, and Ignition is really required by OCP today.

A heavyweight path is to customize the container that goes into the ISO, and we are aiming to semi-productize this today in https://github.com/coreos/custom-coreos-disk-images

But if we're not doing in-place upgrades for CRC a lightweight thing to do is just use the bootc usroverlay + mkdir as a systemd unit.

An intermediate path that I'd like to handle of course is dropping Ignition and coreos-installer out of the flow and making a custom container image that has the OCP node content, and then just doing an Anaconda install or using bootc-image-builder to make a qcow2 directly from that. That's the path we're emphasizing for bootc/Image Mode.

jlebon Apr 3, 2025
Maintainer

> But if we're not doing in-place upgrades for CRC a lightweight thing to do is just use the bootc usroverlay + mkdir as a systemd unit.

Yeah, I would just start with that for now if it fits the bill. Though ideally following a similar model to rpm-software-management/dnf#2195. I.e. ostree admin unlock --transient, and then in a mount namespace remount it read-write to do what you want. That way, it remains read-only for the rest of the boot.
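A minimal sketch of that model, assuming `ostree admin unlock --transient` has already been run so that /usr has a read-only overlay: the helper below runs a command in a private mount namespace after remounting a target read-write there. The function name `run_in_private_mount_ns` and the `DRY_RUN` switch are hypothetical; by default it only prints the command it would run, since the real thing needs root.

```shell
#!/bin/sh
# Hypothetical helper: remount a path read-write inside a private mount
# namespace and run a command there. Outside the namespace the mount is
# left as it was (read-only).
# $1 is the mount point; the remaining arguments are the command to run.
run_in_private_mount_ns() {
    target=$1
    shift
    # Inside the inner shell, $0 is the mount point and "$@" the command.
    set -- unshare --mount -- sh -c \
        'mount -o remount,rw "$0" && exec "$@"' "$target" "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: just show what would be executed.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example (as root, after `ostree admin unlock --transient`):
#   DRY_RUN=0 run_in_private_mount_ns /usr mkdir -p /usr/local/example
```

Because `unshare --mount` defaults to private propagation, the read-write remount should not leak back into the host namespace.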

praveenkumar Apr 4, 2025

> Unrelated to this I'd recommend https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/

Thanks for sharing 👍🏼

> Via coreos-assembler which makes a container image and then makes an ISO out of that.
>
> And then you do an install from the ISO, and then snapshot that as a qcow2 or so right?

Yes, that's right.

> and does this have a kickstart file?
>
> No, kickstart is not used in CoreOS today.
>
> However of course, for fedora-bootc we are supporting Anaconda to do installs. But it gets tricky to intersect Ignition and Anaconda, and Ignition is really required by OCP today.
>
> A heavyweight path is to customize the container that goes into the ISO, and we are aiming to semi-productize this today in https://github.com/coreos/custom-coreos-disk-images
>
> But if we're not doing in-place upgrades for CRC a lightweight thing to do is just use the bootc usroverlay + mkdir as a systemd unit.

No, we are not supporting in-place upgrades for CRC. Also, doesn't bootc usr-overlay just add a writable overlayfs mounted on /usr? That still wouldn't allow mkdir /Users, because / itself is still read-only.

> An intermediate path that I'd like to handle of course is dropping Ignition and coreos-installer out of the flow and making a custom container image that has the OCP node content, and then just doing an Anaconda install or using bootc-image-builder to make a qcow2 directly from that. That's the path we're emphasizing for bootc/Image Mode.

praveenkumar Apr 4, 2025

As of now https://github.com/coreos/custom-coreos-disk-images looks promising; I tested it and it works. The only caveat is that you need to push the image to a registry before consuming it to install a cluster, as a day-1 operation instead of day-2.

cgwalters Apr 4, 2025
Maintainer Author

> No, we are not supporting in-place upgrades for CRC. Also when using bootc usr-overlay doesn't it just add writable overlayfs mounted on /usr which still doesn't allow to do mkdir /Users which is still readonly?

Right, sorry! To fix this we want the transient-ro feature I mentioned, so this needs some feature work on the bootc side.

praveenkumar
Jan 16, 2025

> It's OK to ship an empty /Users even in a VM which may run on Linux.

Do you mean that /Users should be a symlink to /var/Users? Only then can subdirectories be created as part of the mount, because an empty toplevel directory created in the image also becomes immutable, as per https://bootc-dev.github.io/bootc/filesystem.html#other-toplevel-directories, right?

2 replies

cgwalters Jan 16, 2025
Maintainer Author

No, it's possible to create a mountpoint on top of a read-only directory. So my recommendation is that /Users is part of the container image always, and should not be a symlink to /var.

Although I guess if in this use case we don't want to mount all of /Users but we only want /Users/walters or whatever, then indeed it'd need to be a symlink into /var and the subdirectories created dynamically as you say.
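The symlink variant can be sketched roughly as follows. Both helper names are hypothetical, and the split between build time and boot time is an assumption: the symlink is baked into the image, while the per-user subdirectory under /var is created at runtime (for example from a systemd unit) before the virtiofs mount is set up.

```shell
#!/bin/sh
# Hypothetical helper, build time: make /<name> a relative symlink into
# /var so that its contents live on writable storage.
# $1 is the image root, $2 the toplevel name (e.g. Users).
link_toplevel_into_var() {
    root=${1%/}
    name=$2
    ln -sfn "var/$name" "$root/$name"
}

# Hypothetical helper, runtime: create the directory that will serve as
# the mount target, e.g. /var/Users/walters (reachable as /Users/walters).
# $1 is the root, $2 the toplevel name, $3 the subdirectory.
ensure_mount_target() {
    root=${1%/}
    mkdir -p "$root/var/$2/$3"
}

# Example:
#   link_toplevel_into_var / Users       # in the container build
#   ensure_mount_target / Users walters  # at boot, before mounting
```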

praveenkumar Jan 16, 2025

> So my recommendation is that /Users is part of the container image always, and should not be a symlink to /var.

I need to experiment with that.

> Although I guess if in this use case we don't want to mount all of /Users but we only want /Users/walters or whatever, then indeed it'd need to be a symlink into /var and the subdirectories created dynamically as you say.

In the case of openshift-local (not sure about podman-machine) we mount $HOME, which means /Users/<respective_user>, so we do need the symlink to /var.

Edit: Checked with podman-machine, and it also mounts $HOME.
