Zero copy send and close behavior related to second notification · axboe/liburing · Discussion #1465

normanmaurer
Sep 22, 2025

I have a question about zero-copy send (send_zc), close, and buffer management.

I ran into a situation where a send_zc was issued and I received the first notification with the result and the more flag set. All good so far: I know I will get another notification once I can reuse the buffer.

Now the remote peer never reads, so (it seems) I never receive a TCP ACK, which means I never receive the second notification and the buffer cannot be reused. After a while I close the socket (the remote peer still has not read anything). I receive the completion for the close, but never the second notification for the earlier send_zc. Which brings me to the question: what am I supposed to do now? Shouldn't the close also somehow trigger the second notification for the send and so let me reuse the buffer?
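For readers less familiar with the two-completion model described above, here is a minimal sketch in C of how an application might track buffer ownership from the CQE flags. The struct and function names are hypothetical. The two flag constants are local copies of IORING_CQE_F_MORE and IORING_CQE_F_NOTIF from <linux/io_uring.h>, redefined only so the snippet compiles without the io_uring headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi flag bits (IORING_CQE_F_MORE and
 * IORING_CQE_F_NOTIF in <linux/io_uring.h>). Real code should include
 * that header or <liburing.h> instead of redefining them. */
#define ZC_CQE_F_MORE  (1U << 1)
#define ZC_CQE_F_NOTIF (1U << 3)

/* Hypothetical per-send bookkeeping; zero-initialize before use. */
struct zc_send_state {
    bool    result_seen;   /* first CQE (send result) has arrived */
    bool    expect_notif;  /* first CQE had F_MORE set */
    bool    notif_seen;    /* notification CQE has arrived */
    int32_t result;        /* bytes sent or -errno from the first CQE */
};

/* Feed one CQE (its res and flags) for this send into the tracker.
 * Returns true once the buffer may be reused or freed. */
static bool zc_send_on_cqe(struct zc_send_state *st, int32_t res,
                           uint32_t flags)
{
    if (flags & ZC_CQE_F_NOTIF) {
        /* Notification CQE: the kernel no longer references the buffer. */
        st->notif_seen = true;
    } else {
        /* Result CQE: F_MORE means a notification CQE will follow. */
        st->result_seen = true;
        st->result = res;
        st->expect_notif = (flags & ZC_CQE_F_MORE) != 0;
    }
    return st->result_seen && (!st->expect_notif || st->notif_seen);
}
```

In the situation described here, the tracker would stay in the "result seen, notification expected" state until the kernel gives up on the queued data, which is exactly the period during which the buffer must not be touched.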

Tagging @isilence as told by @axboe

Replies: 2 comments 7 replies

normanmaurer
Sep 24, 2025
Author

OK, after some digging it turned out that I will eventually see the notification. It just takes forever by default. You can change this by setting a saner, smaller value via TCP_USER_TIMEOUT.
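As an illustration of the workaround, a minimal sketch in C of setting the option on a socket. The helper name is hypothetical and the value to use is application-specific. With the option left at 0, the kernel falls back to its retransmission limits (the tcp_retries2 sysctl), which on a default configuration is commonly on the order of many minutes.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: cap how long transmitted data may stay
 * unacknowledged before the kernel gives up on the connection.
 * timeout_ms == 0 restores the system default.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}
```

For example, `set_tcp_user_timeout(fd, 5000)` would ask the kernel to abandon unacknowledged data after roughly five seconds.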

I also added a test case for this in our integration: netty/netty#15709

1 reply

normanmaurer Sep 24, 2025
Author

It will also be triggered if the remote peer closes the socket.

isilence
Sep 27, 2025
Collaborator

I wonder if shutdown(2) helps? It's a known problem: there is no good way to predict when the buffer is going to be released, but I don't think it should be taking too much time after the final close. It might be that something like an io_uring request is holding a reference to the socket, which would prevent the close from terminating the connection.

6 replies

normanmaurer Sep 29, 2025
Author

"...I don't think it should be taking too much time after the final close. It might be that something like an io_uring request is holding a reference to the socket that would prevent the close from terminating the connection."

Let me test shutdown(2) and come back to you. In the meantime I also checked your other theory (something holding a reference to the socket), and at least that does not seem to be the case. I verified that I received a notification for all previously submitted ops, so there shouldn't be anything pending at all.

OK, I also tried shutdown and it's the same. Without a low value for TCP_USER_TIMEOUT it basically waits forever for the second notification, even after the result of shutdown (or close) is observed. As I said before, all other completions are received as well, so the only one still outstanding is the second notification for the send_zc, which signals that the buffers can be reused. 🤷 As close should trigger a RST, I would have expected that the buffers could be reused directly: once the RST is sent there should be no need to wait for any TCP ACK at all.

As a side note, this problem only shows up when the close is triggered locally. If the remote peer closes the connection by calling close, I receive the second notification without any delay and can reuse the buffer.

isilence Sep 29, 2025
Collaborator

Yeah, it was brain fog on my part. It makes perfect sense given your investigation: the socket just tries to send out everything it has queued. Does TCP_USER_TIMEOUT work for your case? There is not much the kernel can do apart from forcing it. And if you have a better way to detect such uncooperative sockets, I guess you can set TCP_USER_TIMEOUT=1 at the right moment.
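One possible reading of "at the right moment" is sketched below in C. It is an assumption, not something confirmed in this thread: just before closing, if zero-copy notifications are still outstanding and the send queue is not empty, shorten the user timeout. The helper name and the `pending_notifs` counter are hypothetical (the application would have to maintain that counter itself). SIOCOUTQ is used here as a rough indicator of data still held in the send queue.

```c
#include <linux/sockios.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper, meant to be called right before close() or
 * shutdown(). If zero-copy notifications are outstanding and the send
 * queue still holds data, lower TCP_USER_TIMEOUT so the kernel abandons
 * that data after timeout_ms instead of retrying for a long time.
 * Returns 1 if the timeout was lowered, 0 if nothing was needed,
 * -1 on error (errno is set by the failing call). */
static int bound_zc_wait_before_close(int fd, unsigned int pending_notifs,
                                      unsigned int timeout_ms)
{
    int queued = 0;

    if (pending_notifs == 0)
        return 0;                 /* no buffers are held by the kernel */
    if (ioctl(fd, SIOCOUTQ, &queued) < 0)
        return -1;
    if (queued <= 0)
        return 0;                 /* send queue already drained */
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &timeout_ms, sizeof(timeout_ms)) < 0)
        return -1;
    return 1;
}
```

Whether an empty send queue is a reliable signal that every notification is about to arrive is untested here, so treat the queue check as a heuristic only.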

normanmaurer Sep 29, 2025
Author

Yeah, TCP_USER_TIMEOUT works. That said, I am still a bit confused why, after a close (which triggers a TCP RST), we even need to try to send out what is queued. Shouldn't we just drop it on the floor at this point and give the memory back to the system as fast as possible?

normanmaurer Oct 2, 2025
Author

@isilence any idea?

isilence Oct 2, 2025
Collaborator

It's not something that can be changed without breaking users. I can see why it might have been done the current way, e.g. users would need a way to wait for the last send to be sent and/or acked before close. It can be handled somehow, but thinking about it, closing reliably is surprisingly not an easy problem.
