-
I really like the UDP multishot recv/recvmsg functionality in io_uring. But I really dislike the async nature of io_uring's send/sendmsg. I find I have to cache / pool things like addresses and msghdrs for a send() that will likely not last more than one cycle through the event loop, and I end up with a bunch of logic around when I can re-use memory, etc.
I think I'd prefer to take the syscall overhead and just call normal synchronous send/sendmsg on the socket and be done w/ memory lifetime constraints when the function returns.
But I thought it would be best to ask first. Are there any issues with io_uring intermixing different APIs like this? I'm registering my UDP socket fd's, so I'd have to use the "real" fd's on the syscall version. Not sure if that matters here.
Also, am I missing something obvious about io_uring? Is there a synchronous UDP send/sendmsg, or some other common pattern I'm missing?
-
send/sendmsg through io_uring is generally sync, except if there's no space in the socket buffer. I can see how it's very annoying to have to retain the data, or iovec/msghdr, on the off chance that the socket is out of space. You can certainly mix and match, but I'm almost pondering if it'd be worthwhile to have some way of "failing" the send/sendmsg if there's no space in the socket. Then at that point you'd want to ensure structs are persistent, but not before then. It would require you to iterate the CQ ring after submission to find those, while you're still in the same context though.
Or maybe there are better ideas here?
-
I feel like I'm missing something basic here. Ignoring the failure case, how is prep_send() sync? I need to call submit_and_wait() to get the result, yes? That kinda means exiting my send() function and running the event loop once, which is what I was trying to avoid. Or am I missing something?
-
It's sync in the sense of "guarantees that a completion has been posted by the time io_uring_submit() (or any other submit variant) returns". The result delivery is just using the CQ ring, like any completion does, but after submit you'll know that the request has completed: either successfully, as expected, or with -EAGAIN if it could not complete synchronously. And yes, that would mean you need some kind of iteration of the CQ ring to find these completions after submit, which isn't necessarily trivial or pretty, depending on what your flow normally looks like.
-
Ahh.. I think I see. So in theory I could call submit() and it would be safe to re-use the addr/msghdr/payload buffers immediately afterwards? That either the data was copied out by the kernel, or it failed... but either way the kernel will not access that memory again?
Is the same true of the flush() API, or when the ring is set up w/ IORING_SETUP_SQPOLL?
For my use, I don't care about the failure case. Loss of UDP packets is handled generally by another layer, so don't need to care about the kernel's buffer being full.
-
Exactly, the request will be done either way. If your event loop finds it completed with -EAGAIN, then you'd need to resubmit it. At that point you'd probably just ensure that data / aux data is stable, as it should be an edge case and not the expected path. Or you'll find it completed successfully.
If you use IORING_SETUP_SQPOLL, then submission is decoupled from the syscall. The syscall only kicks the SQPOLL thread to submit it. The general rule for io_uring is that everything must be stable until it's been submitted, which is why the above works for !SQPOLL. For SQPOLL, you don't know when the IO has been submitted, and hence the rule there is "must be stable until it completes", as that's the first notification you'll get for that request.
-
I see. Very helpful. Thank you.
So for my use-case, I should either disable SQPOLL and call submit() from send() fn, or use standard syscall socket APIs for the send-case.
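For reference, the plain-syscall option could look roughly like the sketch below. It is only an illustration: `udp_send_now` is a made-up helper name, it assumes an IPv4 destination, and `fd` has to be the real socket descriptor rather than the registered-file index used in SQEs.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of liburing): send one UDP datagram with the
 * ordinary sendto() syscall. MSG_DONTWAIT makes it fail with EAGAIN instead of
 * blocking when the socket buffer is full.
 * Returns the number of bytes sent, or a negative errno value. */
static ssize_t udp_send_now(int fd, const void *buf, size_t len,
                            const struct sockaddr_in *dst)
{
    ssize_t n;

    do {
        n = sendto(fd, buf, len, MSG_DONTWAIT,
                   (const struct sockaddr *)dst, sizeof(*dst));
    } while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    return n >= 0 ? n : -errno;
}
```

Once this returns, the kernel no longer touches `buf` or `dst`, so both can live on the caller's stack. A caller that treats UDP loss as acceptable can simply ignore a `-EAGAIN` result.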
-
That's what NOWAIT / NOBLOCK should've been doing all along in the context of io_uring, but unfortunately it was changing the semantics probably more than a couple of times. And I don't remember whether "poll instead of failing for O_NONBLOCK" patch has been merged back in the day.
In short, it might be worthwhile checking the current semantics, and fixing it in place if it does the right thing or correcting it (potentially with some flag) if it doesn't.
-
Yep, MSG_DONTWAIT should already do this. It marks the request as REQ_F_NOWAIT, which means that it'll end the send/sendmsg with -EAGAIN rather than handle it in an async manner if the socket was out of space.
Semantics did take a while to solidify, but they haven't been changed in years at this point.