
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: socket.recv(size, MSG_TRUNC) returns more than size bytes
Type: behavior Stage: needs patch
Components: Extension Modules Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Andrey Wagin, benjamin.peterson, berker.peksag, christian.heimes, martin.panter
Priority: high Keywords:

Created on 2015-08-25 11:28 by Andrey Wagin, last changed 2022-04-11 14:58 by admin.

Messages (7)
msg249114 - Author: Andrey Wagin (Andrey Wagin) Date: 2015-08-25 11:28
In [1]: import socket
In [2]: sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
In [3]: sks[1].send("asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
Out[3]: 42
In [4]: sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
Out[4]: 'a\x00\x00\x00\xc0\xbf8\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
recv() returns a buffer whose size equals the size of the transferred data, but only the first byte is initialized. What is the rationale for this behavior?
In C, recv(sk, NULL, 0, MSG_PEEK | MSG_TRUNC) is usually used to get a message size. What is the right way to get a message size in Python?
msg249121 - Author: Andrey Wagin (Andrey Wagin) Date: 2015-08-25 13:21
sendto(4, "asdfasdfsadfasdfsdfsadfsdfasdfsd"..., 42, 0, NULL, 0) = 42
recvfrom(3, "a\0n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\0\0\0\0\2\0\0\0"..., 1, MSG_TRUNC, NULL, NULL) = 42
I think the return value is interpreted incorrectly. In this case it isn't equal to the number of bytes received. Python then copies that many bytes from the smaller buffer, so it may access memory that is not allocated or that belongs to someone else.
valgrind detects this type of error:
[avagin@localhost ~]$ cat sock.py 
import socket, os, sys
sks = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
pid = os.fork()
if pid == 0:
	sks[1].send("\0" * 4096)
	sys.exit(0)
sk = sks[0]
print sk.recv(1, socket.MSG_TRUNC )
[avagin@localhost ~]$ valgrind python sock.py
==25511== Memcheck, a memory error detector
==25511== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==25511== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==25511== Command: python sock.py
==25511== 
==25511== Syscall param write(buf) points to uninitialised byte(s)
==25511== at 0x320B4F0940: __write_nocancel (in /usr/lib64/libc-2.20.so)
==25511== by 0x320B478D2C: _IO_file_write@@GLIBC_2.2.5 (in /usr/lib64/libc-2.20.so)
==25511== by 0x320B4794EE: _IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib64/libc-2.20.so)
==25511== by 0x320B46EE68: fwrite (in /usr/lib64/libc-2.20.so)
==25511== by 0x369CC90210: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CC85EAE: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CC681AB: PyFile_WriteObject (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CCE08F9: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CCE340F: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CCE3508: PyEval_EvalCode (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CCFC91E: ??? (in /usr/lib64/libpython2.7.so.1.0)
==25511== by 0x369CCFDB41: PyRun_FileExFlags (in /usr/lib64/libpython2.7.so.1.0)
msg249127 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2015-08-25 15:22
Evidently, the recv code doesn't know anything about MSG_TRUNC, which causes it to do incorrect things when the output length is greater than the buffer length.
msg249214 - Author: Andrey Wagin (Andrey Wagin) Date: 2015-08-26 20:22
The same behavior occurs in Python 3.4:
>>> sks[1].send(b"asdfasdfsadfasdfsdfsadfsdfasdfsdfasdfsadfa")
42
>>> sks[0].recv(1, socket.MSG_PEEK | socket.MSG_TRUNC)
b'a\x00Nx\x94\x7f\x00\x00sadfasdfsdfsadfsdfasdfsdfasdfsadfa'
>>>
msg264343 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-27 03:38
As far as I know, passing MSG_TRUNC into recv() is Linux-specific. I guess the "right" portable way to get a message size is to know it in advance, or guess and expand the buffer if MSG_PEEK cannot return the whole message.
Andrey: I don’t think we are accessing _unallocated_ memory (which could crash Python). If you look at _PyBytes_Resize(), I think it correctly allocates the memory, and just leaves it uninitialized.
Some options:
* Document that arbitrary flags like Linux’s MSG_TRUNC are not supported
* Limit the returned buffer to the original buffer size
* Raise an exception or warning if recv() returns more than the original buffer size
* Reject unsupported flags like MSG_TRUNC
* Initialize the expanded buffer with zeros
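
The portable "guess and expand" approach mentioned above could look roughly like the following. This is only a sketch, not part of the socket module: the helper name recv_datagram_portable and its initial_size and max_size parameters are hypothetical, and it assumes a datagram socket with a single reader, so that the datagram seen by MSG_PEEK is the one consumed afterwards.

```python
import socket


def recv_datagram_portable(sock, initial_size=64, max_size=65536):
    """Read one whole datagram without relying on MSG_TRUNC (hypothetical helper)."""
    size = initial_size
    while True:
        # MSG_PEEK leaves the datagram queued, so it can be inspected repeatedly.
        data = sock.recv(size, socket.MSG_PEEK)
        # A result shorter than the buffer means the whole datagram fitted.
        if len(data) < size or size >= max_size:
            break
        # The buffer was filled completely: the datagram may be longer, so grow.
        size = min(size * 2, max_size)
    # Consume the datagram; anything beyond max_size bytes is discarded.
    return sock.recv(size)
```

The cost is one extra system call per doubling, which is why knowing the message size in advance is preferable when the protocol allows it.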
msg277427 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-26 14:59
MSG_TRUNC literally causes a buffer overflow. In the example, sock_recv() and friends only allocate a buffer of size 1 on the heap. With MSG_TRUNC recv() ignores the maximum size and writes beyond the buffer. We cannot recover from a buffer overflow because the overflow might have damaged other data structures. Instead Python should detect the problem and forcefully abort() the process with Py_FatalError().
msg277429 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-26 15:31
Ah, I misunderstood MSG_TRUNC. It's not a buffer overflow. MSG_TRUNC does not write beyond the end of the buffer. In this example the libc function recv() writes two bytes into the buffer but returns a larger value than 2.
---
import socket
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
result = b.recv(2, socket.MSG_TRUNC)
print(len(result), result)
---
stdout: 2 b'ab'
To fix the wrong result of recv() with MSG_TRUNC, only resize when outlen < recvlen (line 3089).
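
Until recv() itself is fixed, the same clamping can be approximated at the Python level. The sketch below is an assumption about how a caller might work around the issue, not the proposed C patch; recv_clamped is a hypothetical helper built on recv_into(), which reports the length without resizing anything.

```python
import socket


def recv_clamped(sock, bufsize, flags=0):
    """Return at most bufsize bytes, even if the kernel reports more (hypothetical helper)."""
    buf = bytearray(bufsize)
    # With MSG_TRUNC the reported length may exceed bufsize,
    # but only bufsize bytes were actually written into buf.
    reported = sock.recv_into(buf, bufsize, flags)
    return bytes(buf[:min(reported, bufsize)])
```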
To get the size of the message, you have to use recv_into() with a buffer.
---
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b'abcdefgh')
msg = bytearray(2)
result = b.recv_into(msg, flags=socket.MSG_TRUNC)
print(result, msg)
---
stdout: 8 bytearray(b'ab')
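
Building on the recv_into() example, the size query and the actual read can be combined into two small helpers. This is a sketch under the assumption of Linux semantics for MSG_TRUNC on datagram sockets; the names next_datagram_size and recv_whole_datagram are hypothetical and do not exist in the socket module.

```python
import socket


def next_datagram_size(sock):
    """Return the length of the next queued datagram without consuming it."""
    probe = bytearray(1)  # one byte is enough; only the return value matters
    # MSG_PEEK keeps the datagram queued, MSG_TRUNC makes the call report its real length.
    return sock.recv_into(probe, flags=socket.MSG_PEEK | socket.MSG_TRUNC)


def recv_whole_datagram(sock):
    """Read the next datagram into a buffer sized to fit it exactly."""
    size = next_datagram_size(sock)
    buf = bytearray(max(size, 1))  # at least one byte so the read is issued even for an empty datagram
    received = sock.recv_into(buf)
    return bytes(buf[:received])
```

As with any peek-then-read pattern, this assumes no other reader consumes the datagram between the two calls.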
History
Date                 User               Action  Args
2022-04-11 14:58:20  admin              set     github: 69121
2016-09-26 15:31:46  christian.heimes   set     priority: critical -> high
                                                type: security -> behavior
                                                messages: + msg277429
                                                versions: - Python 3.4
2016-09-26 14:59:57  christian.heimes   set     priority: normal -> critical
                                                messages: + msg277427
                                                versions: + Python 3.7
2016-09-09 00:17:37  christian.heimes   set     nosy: + christian.heimes
2016-04-27 03:38:19  martin.panter      set     nosy: + martin.panter
                                                messages: + msg264343
                                                components: + Extension Modules, - Library (Lib)
2016-04-27 01:02:06  berker.peksag      set     nosy: + berker.peksag
                                                stage: needs patch
                                                versions: + Python 3.5, Python 3.6
2015-08-26 20:24:33  Andrey Wagin       set     type: security
2015-08-26 20:22:45  Andrey Wagin       set     messages: + msg249214
                                                versions: + Python 3.4
2015-08-25 15:22:18  benjamin.peterson  set     nosy: + benjamin.peterson
                                                messages: + msg249127
2015-08-25 13:21:46  Andrey Wagin       set     messages: + msg249121
2015-08-25 11:28:15  Andrey Wagin       create
