-
-
Notifications
You must be signed in to change notification settings - Fork 409
Conversation
Foxboron
commented
May 8, 2025
@stgraber if you have any opinions or pointers to the checkboxes feel free to look over and I can investigate a bit.
I suspect some research needs to be done to figure out how we should interact with the uid/gid and squashing behavior of NFS mounts.
bensmrs
commented
Jul 21, 2025
Hi! Is this PR stalled? Do you need help?
Foxboron
commented
Jul 21, 2025
stgraber
commented
Jul 21, 2025
Ah, I just left a few comments in the Info function now.
bensmrs
commented
Jul 21, 2025
I was hoping @stgraber had time to point in the correct direction on this part.
Well now you’re served :þ
I can help review, fix stuff, and even write tests, don’t hesitate to ping me. I’d actually be happy to see it working pretty soon.
Foxboron
commented
Jul 21, 2025
@bensmrs thanks!
I'll probably not touch this in July, and there is a hackercamp and work stuff happening in August on my end that might not allow me a lot of energy to pick this up after hours.
I'd appreciate some pointers on the id remapping incus does and how that should play with the ID squashing NFS does. I think there should be some guidance and testing there to make sure it behaves as expected. It also takes a bunch of time which might not be needed if nfs reassings UID/GID anyway.
If you have time to write tests and stuff I'd be happy to give you access to my fork.
stgraber
commented
Jul 21, 2025
I don't believe that VFS idmap works on top of NFS at this point, so we'd be dealing with traditional shifting where Incus rewrites all the uid/gid as needed.
What you want to ensure is that NFS is mounted the same way on all machines and that no uid/gid squashing is performed, then that should work fine.
Foxboron
commented
Jul 22, 2025
Should we call the driver nfs4 to make sure we don't end up in a weird situation down the line where we don't want to support a v5 along with the v4 code? Is that realistic?
stgraber
commented
Jul 22, 2025
I think we should stick to nfs as nfs4 may give the impression that we don't support nfs3 whereas the featureset we really need is perfectly fine for NFS3, if anything some of the bits in NFS4 may be problematic (built-in uid/gid mapping and such).
Exactly what kind of server version and configuration we can handle is probably something that's best addressed through the documentation for the driver.
symgryph
commented
Aug 25, 2025
NFS would be SO nice. Just an interested party!
b617326 to
49f1676
Compare
Foxboron
commented
Oct 19, 2025
I did some changes so we can pass ipv6 source= paths. It's not great but couldn't come up with a more reliable way to split up on ::1:/somepath or similar paths.
What sort of testing do we want on this? Locally creating pools and creating contianers and VMs work. Also attaching additional volumes.
bensmrs
commented
Oct 19, 2025
What sort of testing do we want on this? Locally creating pools and creating contianers and VMs work. Also attaching additional volumes.
We’d basically extend the available filesystem drivers in the test matrix, to test everything we’re already testing:
incus/.github/workflows/tests.yml
Lines 137 to 144 in f614b4c
The tests initialization routines allow you to define how your driver should be initialized in test/backends, although it’s not enough. I can have a quick look at it as soon as I’m done with my other PR.
0080a77 to
7070ad5
Compare
I started looking at the tests and am a bit confused by the following error:
> incus storage create incustest-KAg nfs source=10.1.0.227:/media
Error: NFS driver requires "nfs.host" to be specified or included in "source": [<remote host>:]<remote path>
Did I do something wrong here?
IMO, if you have an nfs.host option, I don’t think you should look for a host in source. Just make source be the path on nfs.host, or on localhost if no nfs.host is provided.
Foxboron
commented
Oct 20, 2025
Did I do something wrong here?
Uhh, so it turns out that the Makefile has been moving things from being built into build/ to running go install. So there is a slight chance I have missed testing an iteration while working on this.
IMO, if you have an nfs.host option, I don’t think you should look for a host in source. Just make source be the path on nfs.host, or on localhost if no nfs.host is provided.
I was thinking mimicking the behavior of truenas.host and source from the truenas driver. Where you can just specify nfs.host globally and pass a path to source. But I haven't nailed the entire behavior there I suspect. I'm a little bit unsure what the most ergonomic way for the options to behave, that also harmonizes with the existing drivers.
bensmrs
commented
Oct 20, 2025
Uhh, so it turns out that the Makefile has been moving things from being built into build/ to running go install. So there is a slight chance I have missed testing an iteration while working on this.
No problem, I ended up using nfs.host for driver initialization.
I was thinking mimicking the behavior of truenas.host and source from the truenas driver. Where you can just specify nfs.host globally and pass a path to source. But I haven't nailed the entire behavior there I suspect. I'm a little bit unsure what the most ergonomic way for the options to behave, that also harmonizes with the existing drivers.
Ok I can’t beat this argument :)
I’m getting some permission denied errors in my tests, I’m investigating.
bensmrs
commented
Oct 20, 2025
So, maybe I’m missing something when initializing the pool, but I can’t get past permission errors. Container creation and launch work well, but that’s not enough, see https://github.com/bensmrs/incus/actions/runs/18648669885/job/53161380407?pr=4
Am I missing something in my setup? See
incus/.github/workflows/tests.yml
Lines 394 to 395 in 5733ce8
Hmmm, I've only tried incus exec container bash and manually created files on NFS volumes on both ends. And it worked. Are we sure that /media doesn't have any weird permissions in ubuntu? Does it exist etc?
bensmrs
commented
Oct 20, 2025
Container launch works, so regular operations don’t seem to cause any problem.
I’ll debug a bit more later, I was just wondering if your setup had anything specific other than the squash options.
bensmrs
commented
Oct 20, 2025
Well I’m currently out of ideas. user_is_instance_user fails with a permission error. We can’t list /root from within the container, it appears owned by 0:0, whoami fails because user 0 is not found, and from the host, the mount appears owned by 1000000:1000000. I don’t know where to go from there; maybe I’m missing something obvious happening in the OpenFGA tests.
And couldn’t an idmapped bind mount be a solution? The host mounts the NFS shares somewhere, then bind-mounts it to /var/lib/incus/containers/<instance>/ with the proper shift.
Or maybe it’s bindfs-specific, in which case we’re back to using FUSE...
stgraber
commented
Oct 28, 2025
And couldn’t an idmapped bind mount be a solution? The host mounts the NFS shares somewhere, then bind-mounts it to
/var/lib/incus/containers/<instance>/with the proper shift.Or maybe it’s bindfs-specific, in which case we’re back to using FUSE...
That's 1), idmap bind-mount requires per-filesystem kernel code to handle it. NFS does not have support for it. FUSE in general does but requires each filesystem to add some logic, which is why I also brought up 2) as a quicker option to get something done.
bensmrs
commented
Oct 28, 2025
Ok, I had in mind bindfs’ uid-offset and gid-offset options and just thought that maybe this translation mechanism existed for bind mounts (the bind of bindfs made my brain shortcut to my previous question).
I’ll have a quick look at fuse-nfs.
bensmrs
commented
Oct 29, 2025
https://github.com/bensmrs/fuse-nfs should be ok-ish for FUSE3; I’ll try to implement VFS idmap tomorrow.
My patches should now work (I also integrated a long-waiting PR that I feel is desirable for us), but testing it will probably be painful. Are there quick ways for me to test the patched libfuse, or do I have to compile @mihalicyn’s tree? (in which case, big meh on my side as I have little time for it).
Or you could test on your side @stgraber if you already have everything ready...
stgraber
commented
Oct 29, 2025
Cool!
If we get this working, we'd probably do a scheme like /var/lib/storage-pools/my-nfs is the NFS kernel mount, then /var/lib/storge-pools/my-nfs/.fuse is the same share but mounted over FUSE. Then whenever dealing with an unprivileged container, we'd use the .fuse path and everything else goes through the kernel client.
Foxboron
commented
Nov 7, 2025
@stgraber thanks for investigating the issue!
Whats the proper way forward for this? Try the fuse strategy? I guess it means we also need more dependencies packages in downstreams as well?
stgraber
commented
Nov 7, 2025
I think the next steps should be:
- Try to get the FUSE filesystem to run on FUSE3 and support VFS idmap (I think @bensmrs was looking into that)
- Get the driver to detect the presence of the FUSE client and if present, mount both the kernel version and user space version of the share
- Tweak the storage driver to adjust its featureset based on what's present on the system
Given the new FUSE client is going to take a little while to be packed in all distros, I don't want us to be hard depending on it.
On my end, I'll be happy to just build and ship the client as part of the Zabbly package, so we can at least get it out to a whole bunch of users (including folks using IncusOS) pretty rapidly while the distro packagers get it through to the other distros.
bensmrs
commented
Nov 7, 2025
Try to get the FUSE filesystem to run on FUSE3 and support VFS idmap (I think @bensmrs was looking into that)
I’d half-confidently say that my fork does that already. The blocker for me is to have a kernel with the necessary patches to make it run. I don’t really have time planned to build such a kernel before, say, december. I’d be more than happy to steal a disk image to deploy on my staging environment though ^^
stgraber
commented
Nov 9, 2025
I’d half-confidently say that my fork does that already. The blocker for me is to have a kernel with the necessary patches to make it run. I don’t really have time planned to build such a kernel before, say, december. I’d be more than happy to steal a disk image to deploy on my staging environment though ^^
Ah, what do you need on the kernel side? I thought we'd had FUSE VFS idmap for a bit now (6.12 apparently).
I adapted the code from the patch you linked, which uses #if FUSE_VERSION >= FUSE_MAKE_VERSION(3, 18) conditions. FUSE 3.18 doesn’t seem to be a thing yet (and my testing environment running Trixie indeed only has 3.17).
So I guessed that the VFS patches didn’t land in mainline kernels.
stgraber
commented
Nov 9, 2025
I adapted the code from the patch you linked, which uses
#if FUSE_VERSION >= FUSE_MAKE_VERSION(3, 18)conditions. FUSE 3.18 doesn’t seem to be a thing yet (and my testing environment running Trixie indeed only has 3.17). So I guessed that the VFS patches didn’t land in mainline kernels.
That's odd because I'm definitely seeing the commits in my clone of Linus' tree.
bensmrs
commented
Nov 9, 2025
I remember the build breaking when removing the version condition. Checking libfuse source code, d393ffa85b0926374c8df543a9ffc81b1d0ce232 suggests some kind of idmap support, but struct fuse_ctx has not been patched (so we may need to partially apply https://github.com/mihalicyn/libfuse/tree/idmap_support).
I’m not comfortable with Incus/Zabbly providing patched libfuse, though, so maybe a nudge to upstream would be good.
stgraber
commented
Nov 9, 2025
Yeah, I don't mind doing the patched libfuse dance in /opt/incus/lib like we do some other libraries but that would be something I'd only really want to do after upstream has merged the needed support.
I pinged @mihalicyn about it.
bensmrs
commented
Nov 9, 2025
And to think that it just started with "it would be nice to have NFS support" :D
stgraber
commented
Nov 9, 2025
Yeah, that one became complicated pretty quick :)
Foxboron
commented
Nov 9, 2025
I took your word for it when you said it would "probably" be easy 🫠
bensmrs
commented
Nov 9, 2025
It’s always nice when the issue on which you’ve been working for 6 month is marked "Easy" (which, may I remind the description, means "Good for new contributors") :)
stgraber
commented
Nov 9, 2025
I took your word for it when you said it would "probably" be easy 🫠
Haha, yeah, the storage logic really should have been pretty easy, take the dir backend and make it act as clustered similar to lvmcluster.
Now the underlying NFS protocol being annoying with idmap wasn't something I had seen coming ;)
It kinda makes me want to look at CIFS, see if we have the same problem over there...
bensmrs
commented
Nov 9, 2025
It kinda makes me want to look at CIFS
Said... no one? ever?
stgraber
commented
Nov 9, 2025
It kinda makes me want to look at CIFS
Said... no one? ever?
Well, in this case it coming from more of a Windows world where UIDs/GIDs are not really a thing makes me hopeful that the protocol doesn't just carry the user cred and file details but instead has the kernel module handle the checks and then tells the remote storage what to write.
If that's the case, then it should work fine here. NFS is only broken because it's too UNIX :)
bensmrs
commented
Nov 9, 2025
Ok, interesting. Now I’m curious too!
stgraber
commented
Nov 10, 2025
@mihalicyn provided this one as an example of conversion
https://gitlab.com/virtio-fs/virtiofsd/-/merge_requests/245/diffs
bensmrs
commented
Nov 10, 2025
I’m confused. What question are you answering with this message?
stgraber
commented
Nov 11, 2025
I’m confused. What question are you answering with this message?
How to get a FUSE filesystem to use the VFS idmap stuff without needing libfuse changes.
bensmrs
commented
Nov 11, 2025
Ok, I’ll see how this can be translated to my code then.
bensmrs
commented
Nov 13, 2025
So, after a frustrating attempt at making it work, I realized that even though the needed patches are in the main branch of libfuse, it’s not in any released version, so I need to compile libfuse myself. It also looks like Trixie’s stock kernel is not fit for these patches, so I also had to install the backported kernel. I should be nearing something that might work pretty soon (yeah, that doesn’t exude confidence).
bensmrs
commented
Nov 13, 2025
Well, I give up, I can’t get any of my attempts to work, even though the proper capabilities are advertised to and by FUSE. Either I’m missing a pretty obvious wiring, or something’s just not ready yet. I’ve left two branches on my fork, in case someone wants to have a look.
Uh oh!
There was an error while loading. Please reload this page.
WIP draft PR for the feature. So far it works, a bit slow on my machine due to the id remapping which should probably be investigated.
vers=4.2, should be fine?sourceInfoarray needs some QA. Not sure I understand all of it.nfscan't supportxattr, do we need to handle this in other places than migrate?Fixes: #1311