UDP: Intermix io_uring recv w/ syscall sends? · axboe/liburing · Discussion #1428

TwoClocks
Jun 30, 2025

I really like the UDP multishot recv/recvmsg functionality in io_uring. But I really dislike the async nature of io_uring's send/sendmsg. I find I have to cache or pool things like addresses and msghdrs for a send() that will likely not last more than one cycle through the event loop, and I have a bunch of logic around when I can reuse memory, etc.

I think I'd prefer to take the syscall overhead and just call normal synchronous send/sendmsg on the socket and be done w/ memory lifetime constraints when the function returns.

But I thought it would be best to ask first. Are there any issues with intermixing io_uring and the regular syscall APIs like this? I'm registering my UDP socket fds, so I'd have to use the "real" fds for the syscall version. Not sure if that matters here.

Also, am I missing something obvious about io_uring? Is there a synchronous UDP send/sendmsg? Or some other common pattern I'm missing?

Answered by axboe

Jun 30, 2025

It's sync in the sense of "guarantees that a completion has been posted by the time io_uring_submit() (or any other submit variant) returns". The result delivery is just using the CQ ring, like any completion does, but after submit you'll know that the request has completed - either successfully, as expected, or with -EAGAIN if it could not complete synchronously. And yes, that would mean you need some kind of iteration of the CQ ring to find those completions after submission, which isn't necessarily trivial or pretty, depending on what your flow normally looks like.

Replies: 3 comments 5 replies

axboe
Jun 30, 2025
Maintainer

send/sendmsg through io_uring is generally sync, except if there's no space in the socket buffer. I can see how it's very annoying to have to retain the data, or iovec/msghdr, on the off chance that the socket is out of space. You can certainly mix and match, but I'm almost pondering if it'd be worthwhile to have some way of "failing" the send/sendmsg if there's no space in the socket. Then at that point you'd want to ensure structs are persistent, but not before then. It would require you to iterate the CQ ring after submission to find those, while you're still in the same context though.

Or maybe there are better ideas here?

5 replies

TwoClocks Jun 30, 2025
Author

I feel like I'm missing something basic here. Ignoring the failure case, how is prep_send() sync? I need to call submit_and_wait() to get the result, yes? That kind of means exiting my send() function and running the event loop once, which is what I was trying to avoid. Or am I missing something?

@axboe

axboe Jun 30, 2025
Maintainer

Answer selected by TwoClocks

TwoClocks Jun 30, 2025
Author

Ahh.. I think I see. So in theory I could call submit() and it would be safe to reuse the addr/msghdr/payload buffers immediately afterwards? Either the data was copied out by the kernel, or it failed... but either way the kernel will not access that memory again?

Is the same true of the flush() API, or if the ring is set up with IORING_SETUP_SQPOLL?

For my use, I don't care about the failure case. Loss of UDP packets is generally handled by another layer, so I don't need to care about the kernel's buffer being full.

axboe Jun 30, 2025
Maintainer

Exactly, the request will be done either way. If your event loop finds it completed with -EAGAIN, then you'd need to resubmit it. At that point you'd probably just ensure that data / aux data is stable, as it should be an edge case and not the expected path. Or you'll find it completed successfully.

If you use IORING_SETUP_SQPOLL, then submission is decoupled from the syscall. The syscall only kicks the SQPOLL thread to submit it. The general rule for io_uring is that everything must be stable until it's been submitted, which is why the above works for !SQPOLL. For SQPOLL, you don't know when the IO has been submitted, and hence the rule there is "must be stable until it completes", as that's the first notification you'll get for that request.

TwoClocks Jun 30, 2025
Author

I see. Very helpful. Thank you.
So for my use case, I should either disable SQPOLL and call submit() from my send() function, or use the standard syscall socket APIs for the send case.
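The second option (plain syscalls for sends) could look roughly like the sketch below. It assumes the application keeps the real descriptor next to the registered-file index; `struct udp_sock`, `udp_send_sync()`, and `udp_sync_demo()` are hypothetical names for illustration, not part of liburing. Only standard POSIX socket calls are used, and the ring-based receive side is left out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper: one socket, two handles. */
struct udp_sock {
    int raw_fd;    /* real descriptor, used for send syscalls */
    int fixed_idx; /* slot in the ring's registered-file table (recv side), -1 if unused */
};

/* Synchronous, non-blocking send. When this returns, dst and buf are free
 * to reuse. Returns bytes sent or a negative errno (-EAGAIN or
 * -EWOULDBLOCK when the socket buffer is full). */
static int udp_send_sync(const struct udp_sock *s, const struct sockaddr_in *dst,
                         const void *buf, size_t len)
{
    ssize_t n = sendto(s->raw_fd, buf, len, MSG_DONTWAIT,
                       (const struct sockaddr *)dst, sizeof(*dst));
    return n < 0 ? -errno : (int)n;
}

/* Loopback round trip: returns the number of bytes received, or -1. */
static int udp_sync_demo(void)
{
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    socklen_t alen = sizeof(addr);
    char in[16];
    int got = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    struct udp_sock tx = { .raw_fd = socket(AF_INET, SOCK_DGRAM, 0), .fixed_idx = -1 };

    /* Port 0 lets the kernel pick a free port; getsockname() reads it back. */
    if (rx >= 0 && tx.raw_fd >= 0 &&
        bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &alen) == 0 &&
        udp_send_sync(&tx, &addr, "ping", 4) == 4)
        got = (int)recv(rx, in, sizeof(in), 0);

    if (rx >= 0) close(rx);
    if (tx.raw_fd >= 0) close(tx.raw_fd);
    return got;
}
```

A caller that leaves packet loss to a higher layer, as described above, can simply ignore a negative return from `udp_send_sync()`.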

isilence
Jun 30, 2025
Collaborator

That's what NOWAIT / NOBLOCK should've been doing all along in the context of io_uring, but unfortunately the semantics changed probably more than a couple of times. And I don't remember whether the "poll instead of failing for O_NONBLOCK" patch was merged back in the day.

In short, it might be worthwhile checking the current semantics, and locking them in if they do the right thing, or correcting them (potentially with some flag) if they don't.

0 replies

axboe
Jun 30, 2025
Maintainer

Yep, MSG_DONTWAIT should already do this. It marks the request as REQ_F_NOWAIT, which means that it'll end the send/sendmsg with -EAGAIN rather than handle it in an async manner if the socket was out of space.

Semantics did take a while to solidify, but they haven't been changed in years at this point.

0 replies
