Issue 20895: Add bytes.empty_buffer and deprecate bytes(17) for the same purpose


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/65094

classification

Title:	Add bytes.empty_buffer and deprecate bytes(17) for the same purpose
Type:	behavior	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.5

process

Dependencies:	Superseder:
Status:	closed	Resolution:	out of date
Assigned To:	Nosy List:	barry, ethan.furman, josh.r, martin.panter, ncoghlan, r.david.murray, serhiy.storchaka, terry.reedy
Priority:	normal	Keywords:

Created on 2014-03-12 10:29 by ethan.furman, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (16)
msg213242	Author: Ethan Furman (ethan.furman) * (Python committer)	Date: 2014-03-12 10:29
`bytes` is a list of integers. Passing a single integer to `bytes()`, as in:

    --> bytes(7)
    b'\x00\x00\x00\x00\x00\x00\x00'

results in a bytes object containing that many zeroes.

I propose that this behavior be deprecated for eventual removal, and a class method be created to take its place.
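The behavior described in this report can be reproduced with a short sketch. It assumes Python 3, where an integer argument to `bytes()` is treated as a length; the variable names are illustrative only.

```python
# Assumes Python 3; variable names are illustrative.
zeros = bytes(7)       # an int argument means "this many zero bytes"
single = bytes([7])    # an iterable of ints gives one byte per item

print(zeros)           # b'\x00\x00\x00\x00\x00\x00\x00'
print(single)          # b'\x07'
print(len(zeros), len(single))
```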
msg213246	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014-03-12 10:57
Class method is not needed. This is just b'\0' * 7.
msg213262	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2014-03-12 14:39
I don't have a strong opinion on this, but I think you are going to have to articulate a good use/usability case for the deprecation. I'm sure this is used in the wild, and we don't just gratuitously break things :)
msg213592	Author: Josh Rosenberg (josh.r) * (Python triager)	Date: 2014-03-14 21:40
I would think the argument for deprecation is that usually, people type bytes(7) or bytes(somesmallintvalue) expecting to create a length one bytes object using that value (happens by accident if you iterate a bytes object and forget it's an iterable of ints, not an iterable of len 1 bytes). It's really easy to forget to make it bytes([7]) or bytes((7,)) or what have you. If you make the same mistake with str, list, tuple, etc., you get an error, because they only accept iterables. But bytes silently behaves in a way that is inconsistent with the other sequence types.

Given that b'\0' * 7 is usually faster in any event (by avoiding lookup costs to find the bytes constructor) and more intuitive to people familiar with the Python sequence idiom, I could definitely see this as a redundancy that does nothing but confuse.
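The accidental usage described here can be sketched as follows, assuming Python 3. The names `data`, `accidental`, `intended`, and `rejected` are illustrative and do not come from the thread.

```python
data = b"abc"

# Iterating a bytes object yields ints, not length-1 bytes objects.
items = list(data)                         # [97, 98, 99]

# The accidental form: each int becomes a zero-filled object of that length.
accidental = [bytes(i) for i in data]
# The intended form: wrap each int in an iterable to get a single byte.
intended = [bytes([i]) for i in data]      # [b'a', b'b', b'c']

# list and tuple reject a bare int instead of silently accepting it.
rejected = []
for ctor in (list, tuple):
    try:
        ctor(7)
    except TypeError:
        rejected.append(ctor.__name__)

print([len(chunk) for chunk in accidental], intended, rejected)
```

The lengths of the accidental results match the byte values, which is why the mistake can go unnoticed until the output is inspected.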
msg213596	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2014-03-14 22:13
I agree with Serhiy that the method is not needed in any case. I was about to post the same missing rationale: people misunderstand 'bytes(7)' and write it expecting to get bytes([7]) == b'\x07', so it would be better to make bytes(7) raise instead of silently accepting a buggy usage.

I was thinking that one rationale for bytes(n) might be that it is faster than b'\0' * n. Since Josh claimed the contrary, I tried to test with timeit.repeat (both console and Idle) and got this error message:

    TypeError: source code string cannot contain null bytes

Both eval and compile emit this message. So it seems that one justification for bytes(n) is to avoid putting null bytes in source strings.

I think this issue should be closed. Deprecation ideas should really be posted on python-ideas and ultimately pydev for discussion and approval. If Ethan wants to pursue the idea, he should research the design discussions for bytes() (probably on the py3k list) and whether Guido directly approved of bytes(n) or if someone else 'snuck' it in after the initial approval.
msg213597	Author: Josh Rosenberg (josh.r) * (Python triager)	Date: 2014-03-14 22:23
Terry: You forgot to use a raw string for your timeit.repeat check, which is why it blew up. It was evaluating the \0 when you defined the statement string itself, not the contents. If you use r'b"\0" * 7' it works just fine by deferring backslash escape processing until the string is actually eval-ed, rather than when you create the string.

For example, on my (admittedly underpowered) laptop (Win7 x64, Py 3.3.0 64-bit):

    >>> min(timeit.repeat(r'b"\0" * 7'))
    0.07514287752866267
    >>> min(timeit.repeat(r'bytes(7)'))
    0.7210309422814021
    >>> min(timeit.repeat(r'b"\0" * 7000'))
    0.8994351749659302
    >>> min(timeit.repeat(r'bytes(7000)'))
    2.06750710129117

For a short bytes, the difference is enormous (as I suspected, the lookup of bytes dominates the runtime). For much longer bytes, it's still winning by a lot, because the cost of having the short literal first, then multiplying it, is still trivial next to the lookup cost.

P.S. I made a mistake: str does accept an int argument (obviously), but it has completely different meaning.
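The raw-string point can be checked with a minimal sketch. The loop count and repeat count below are assumptions chosen so that it runs quickly, and timings will differ from the figures quoted in this message; newer CPython versions may also fold the constant expression at compile time, which changes what is actually measured.

```python
import timeit

# Raw string: the statement keeps the two characters backslash and 0,
# so the escape is only processed when timeit compiles the statement.
raw_stmt = r'b"\0" * 7'
# Non-raw string: the Python literal already contains a real NUL character.
plain_stmt = 'b"\0" * 7'

has_nul_raw = "\0" in raw_stmt        # False
has_nul_plain = "\0" in plain_stmt    # True

# Small loop count (an assumption) so the sketch finishes quickly.
t_literal = min(timeit.repeat(raw_stmt, number=100_000, repeat=3))
t_ctor = min(timeit.repeat("bytes(7)", number=100_000, repeat=3))
print(f"literal * 7: {t_literal:.4f}s   bytes(7): {t_ctor:.4f}s")
```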
msg213598	Author: Ethan Furman (ethan.furman) * (Python committer)	Date: 2014-03-14 22:26
I'm inclined to leave it open while I do the suggested research. Thanks for the tips, Terry, and the numbers, Josh.
msg213641	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2014-03-15 06:01
AFAIK, bytes(int) is a remnant from times when bytes was mutable. Then bytes was split to non-mutable bytes and mutable bytearray and this constructor was forgotten. I'm +0 for deprecation.
msg213656	Author: Ethan Furman (ethan.furman) * (Python committer)	Date: 2014-03-15 15:56
    Python 2.7.3 (default, Sep 26 2012, 21:51:14)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    --> bytes(5)
    '5'
    --> bytearray(5)
    bytearray(b'\x00\x00\x00\x00\x00')

----------------------------------------------------------------------

Creating a buffer of null bytes makes sense for bytearray, which is mutable; it does not make sense, and IMHO only causes confusion, to have bytes return an /immutable/ sequence of zero bytes.
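The mutable/immutable distinction drawn here can be sketched as follows, assuming Python 3. The names `buf`, `frozen`, and `assignable` are illustrative.

```python
buf = bytearray(5)        # mutable, zero-filled buffer of length 5
buf[0] = 0x41             # in-place item assignment works
buf[1:3] = b"BC"          # so does same-length slice assignment

frozen = bytes(5)         # immutable zero-filled object
try:
    frozen[0] = 0x41
    assignable = True
except TypeError:
    # bytes does not support item assignment
    assignable = False

print(buf)                # bytearray(b'ABC\x00\x00')
print(assignable)         # False
```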
msg215095	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-03-28 23:39
Bringing over Barry's suggestion from the current python-ideas thread [1]:

    @classmethod
    def fill(cls, length, value=0):
        # Creates a bytes of given length with given fill value

[1] https://mail.python.org/pipermail/python-ideas/2014-March/027305.html
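One way the suggested signature could behave is sketched below. Since a classmethod cannot be attached to the built-in `bytes` type from Python code, this is written as a hypothetical module-level helper that takes the class as its first argument; the negative-length check is an assumption of this sketch, not something specified in the suggestion.

```python
def fill(cls, length, value=0):
    """Hypothetical helper: build a cls object of `length` items, each equal to `value`."""
    if length < 0:
        # Assumption of this sketch: reject negative lengths explicitly.
        raise ValueError("length must be non-negative")
    # bytes((value,)) raises ValueError unless 0 <= value <= 255.
    return cls(bytes((value,)) * length)


print(fill(bytes, 3))              # b'\x00\x00\x00'
print(fill(bytearray, 2, 0xFF))    # bytearray(b'\xff\xff')
```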
msg215103	Author: Josh Rosenberg (josh.r) * (Python triager)	Date: 2014-03-29 00:35
Why would we need bytes.fill(length, value)? Is b'\xVV' * length (or if value is a variable containing int, bytes((value,)) * length) unreasonable? Similarly, bytearray(b'\xVV') * length or bytearray((value,)) * length is both Pythonic and performant. Most sequences support multiplication so simple stuff like this can be done easily and consistently; why invent a new approach unique to bytes/bytearrays?
msg215106	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2014-03-29 02:02
Also, to me 'fill' implies something is being filled, not that something is being created.
msg215110	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-03-29 02:55
The fill() name makes more sense for the bytearray variant, it is just provided on bytes as well for consistency. As Serhiy notes above, the current behaviour is almost certainly just a holdover from the original "mutable bytes" design that didn't survive into the initial 3.0 release.
msg215165	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-03-30 05:35
Under the name "from_len", this is now part of a larger proposal to improve the consistency of the binary APIs: http://www.python.org/dev/peps/pep-0467/
msg232292	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2014-12-07 23:51
May we close this as superseded by PEP 467?
msg232294	Author: Ethan Furman (ethan.furman) * (Python committer)	Date: 2014-12-08 00:07
Superseded by PEP467.

History
Date	User	Action	Args
2022-04-11 14:57:59	admin	set	github: 65094
2015-05-17 22:41:48	terry.reedy	set	resolution: out of date stage: resolved
2014-12-08 00:07:09	ethan.furman	set	status: open -> closed messages: + msg232294
2014-12-07 23:51:54	terry.reedy	set	messages: + msg232292
2014-03-30 05:35:21	ncoghlan	set	messages: + msg215165
2014-03-29 02:55:30	ncoghlan	set	messages: + msg215110
2014-03-29 02:02:13	r.david.murray	set	messages: + msg215106
2014-03-29 00:35:36	josh.r	set	messages: + msg215103
2014-03-28 23:39:02	ncoghlan	set	nosy: + ncoghlan messages: + msg215095
2014-03-28 15:12:20	barry	set	nosy: + barry
2014-03-19 01:04:53	martin.panter	set	nosy: + martin.panter
2014-03-15 15:56:05	ethan.furman	set	messages: + msg213656
2014-03-15 06:01:44	serhiy.storchaka	set	messages: + msg213641
2014-03-14 22:26:36	ethan.furman	set	messages: + msg213598
2014-03-14 22:23:54	josh.r	set	messages: + msg213597
2014-03-14 22:13:16	terry.reedy	set	nosy: + terry.reedy messages: + msg213596
2014-03-14 21:40:16	josh.r	set	nosy: + josh.r messages: + msg213592
2014-03-12 14:39:39	r.david.murray	set	nosy: + r.david.murray messages: + msg213262
2014-03-12 10:57:12	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg213246
2014-03-12 10:29:57	ethan.furman	create
