Issue 9951: introduce bytes.hex method (also for bytearray and memoryview)

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/54160

classification

Title:	introduce bytes.hex method (also for bytearray and memoryview)
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.5

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	gregory.p.smith	Nosy List:	Arfrever, BreamoreBoy, barry, christian.heimes, eric.araujo, eric.smith, ethan.furman, georg.brandl, gotgenes, gregory.p.smith, hct, lemburg, mark.dickinson, martin.panter, ncoghlan, pitrou, python-dev, rhettinger, serhiy.storchaka, terry.reedy, wiggin15
Priority:	normal	Keywords:	patch

Created on 2010-09-25 23:38 by wiggin15, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded
bytes.hex.diff	wiggin15, 2015-04-13 20:31
bytes.hex-1.diff	wiggin15, 2015-04-25 10:54

Messages (38)
msg117397 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2010-09-25 23:38
Following up on these discussions:
http://psf.upfronthosting.co.za/roundup/tracker/issue3532
http://www.gossamer-threads.com/lists/python/dev/863892
I'm submitting a patch to add a bytes.hex method in accordance with PEP 358. The code was taken from binascii, so it should be "tested". I also added bytearray.hex and updated the documentation and tests.
There are additional things to discuss, for example:
* multiple and different implementations of tohex/fromhex - in binascii, sha1module, bytes, bytearray...
* binascii's functions, which perform the same thing, but those functions and the rest of binascii's functions receive and return the wrong types. I would fix this, but it breaks compatibility.
msg118272 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2010-10-09 13:00
fixed to Py_UNICODE
msg132911 - (view)	Author: Raymond Hettinger (rhettinger) * (Python committer)	Date: 2011-04-04 00:25
See also: issue11756
msg190112 - (view)	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2013-05-26 20:39
Also #3532
msg193034 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2013-07-14 07:23
Hi, is there any chance to get this merged? This ticket has been open for almost 3 years...
msg193041 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-07-14 08:50
There are several ways to do this: base64.b16encode, binascii.b2a_hex, hex(int.from_bytes(...)), etc. Why do you need yet another one?
msg197571 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2013-09-13 13:29
You can follow the discussion I linked in the ticket description for an answer: http://psf.upfronthosting.co.za/roundup/tracker/issue3532
Mainly, the answer is: to conform to PEP 358 and to provide the opposite of bytes.fromhex. I agree that you can use binascii, but apparently it was decided that this functionality is good to have in the builtins (what used to be encode/decode('hex') in Python 2.x, and what is now bytes.fromhex, with the missing bytes.hex). In addition, binascii works a little differently, as already discussed in the given link.
msg199629 - (view)	Author: Christian Heimes (christian.heimes) * (Python committer)	Date: 2013-10-12 22:12
I'd like to see the feature in 3.4, too.
msg199631 - (view)	Author: Antoine Pitrou (pitrou) * (Python committer)	Date: 2013-10-12 22:16
If it's the reverse of fromhex(), perhaps we should call it tohex()?
msg199634 - (view)	Author: Christian Heimes (christian.heimes) * (Python committer)	Date: 2013-10-12 22:35
Funny thing. I was searching for "tohex" when I found this ticket.
msg199665 - (view)	Author: Georg Brandl (georg.brandl) * (Python committer)	Date: 2013-10-13 07:45
Blasphemous question: why not give bytes a __hex__ method? Then you could use hex() to convert them :) The patch is outdated; it should not use PyUnicode_AS_UNICODE, but PyUnicode_New(..., 127) and then PyUnicode_1BYTE_DATA to get the char array.
msg205744 - (view)	Author: HCT (hct)	Date: 2013-12-09 22:31
It would be good if we could specify an optional flag to get all-caps hex. Currently, I have to do hexlify( some_bytes ).decode( 'UTF-8' ).upper(). It would be good to be able to do some_bytes.hex( upper=1 ).
msg226692 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2014-09-10 12:29
New features cannot be added to Python 2 anymore, only to the current development version, which is now Python 3.5.
If new methods are added to bytes, they should be added to bytearray too. Maybe we should also consider adding them to memoryview? memoryview already has a .tobytes() method and can be cast to type "B" (array of integers in range 0..255).
The float type has .hex() and .fromhex() methods. We should keep these names to stay consistent.
Which kind of output do you prefer? "0xHH 0xHH ...", "HH HH HH ..." or "HHHHHH..."? Do you want to add parameters to choose the format?
Current binascii format:
>>> binascii.hexlify('abc')
'616263'
msg226703 - (view)	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2014-09-10 17:58
To answer Serhiy, the goal is to have a bytes method that represents bytes as bytes rather than as a mixture of bytes and encoded ascii characters. This would aid people who work with bytes that are not encoded ascii and that do not embed encoded ascii. It should not be necessary to import anything.
>>> hex(int.from_bytes(b'abc', 'big'))
'0x616263'
is a bit baroque and produces a hex representation of an int, not of multiple bytes.
I think following the float precedent is a good idea.
>>> float.fromhex(1.5.hex())
1.5
>>> float.fromhex('0x1.8000000000000p+0').hex()
'0x1.8000000000000p+0'
The output of bytes.hex should be one that is accepted by bytes.fromhex, which is to say, hex 'digit' pairs. Spaces are optionally allowed between pairs. I would add a 'spaces' parameter, defaulting to False, to output spaces when set to True. (Or possibly reverse the default -- what do potential users think?)
A possible alternative for the parameter could be form='' (default), form=' ' (add spaces), and form='x' to add '\x' prefixes. I don't know that adding '\x' would be useful. The prefixes are not accepted by .fromhex.
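The round-trip requirement described above can be illustrated with a minimal sketch. It assumes an interpreter where bytes.hex() is available (Python 3.5 or later, per this issue's resolution); the variable names are illustrative only.

```python
# Round-trip sketch: the output of hex() should be accepted by fromhex().
data = bytes([92, 83, 80, 255])

hex_text = data.hex()  # two lowercase hex digits per byte, no prefix
restored = bytes.fromhex(hex_text)

# fromhex() also tolerates spaces between digit pairs.
restored_spaced = bytes.fromhex("5c 53 50 ff")

# The float precedent mentioned above round-trips in the same way.
float_restored = float.fromhex((1.5).hex())
```

Here restored and restored_spaced both compare equal to data, and float_restored equals 1.5.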
msg226730 - (view)	Author: HCT (hct)	Date: 2014-09-10 22:30
@Victor
binascii.hexlify('abc') doesn't work in 3.4. I assume this is a new thing for 3.5.
>>> import binascii
>>> binascii.hexlify('abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' does not support the buffer interface
>>>
>>> binascii.hexlify(b'abc')
b'616263'
@Terry
I think that spaces shouldn't be handled by the hex function. If you allow a space between each hex pair, then what do you do if the bytes are actually from an array of 64-bit ints? Supporting a separating space for every X bytes is probably not in the scope of this.
I propose the hex functions for bytes/memoryview/bytearray should be as follows. I prefer to not have the '0x' prefix at all, but I understand all other hex functions add it. It would be good to have the option to not have the prefix.
bytes.hex( byte_order = sys.byteorder ) returns a hex string in small letters. ex. c0ffee
bytes.HEX( byte_order = sys.byteorder ) returns a hex string in capital letters. ex. DEADBEEF
bytes.from_hex( hex_str, byte_order = sys.byteorder ) returns a bytes object. ex. b'\xFE\xFF'
Another, more flexible way is to have the hex function accept a format similar to how sscanf works, but this will probably bring lots of trouble for all the variants to support and the required error checks.
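The type behaviour shown in the traceback above can be checked with a short sketch. The exact TypeError message differs between Python 3 versions, so the sketch only checks the exception type; the variable names are illustrative.

```python
import binascii

# hexlify() needs a bytes-like object; passing str raises TypeError on Python 3.
try:
    binascii.hexlify("abc")
    str_accepted = True
except TypeError:
    str_accepted = False

# With bytes input, the result is itself bytes rather than str.
hexlified = binascii.hexlify(b"xyz")

# The uppercase workaround mentioned earlier in the thread: decode, then upper().
as_upper_text = hexlified.decode("ascii").upper()
```

The need for the extra decode step is part of what the proposed method removes, since bytes.hex() returns str directly.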
msg226731 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-09-10 23:04
Just as a recap of at least some of the current ways to do a bytes -> hex conversion:
>>> import codecs
>>> codecs.encode(b"abc", "hex")
b'616263'
>>> import binascii
>>> binascii.hexlify(b"abc")
b'616263'
>>> import base64
>>> base64.b16encode(b"abc")
b'616263'
>>> hex(int.from_bytes(b"abc", "big"))
'0x616263'
>>> hex(int.from_bytes(b"abc", "little"))
'0x636261'
Thus, the underlying purpose of this proposal is to provide a single "more obvious way to do it". As per the recent discussion on python-ideas, the point where that is most useful is in debugging output.
However, rather than a new method on bytes/bytearray/memoryview for this, I instead suggest it would be appropriate to extend the default handling of the "x" and "X" format characters to accept arbitrary bytes-like objects. The processing of these characters would be as follows:
"x": display a-f as lowercase digits
"X": display A-F as uppercase digits
"#": includes 0x prefix
".precision": chunks output, placing a space after every <precision> bytes
",": uses a comma as the separator, rather than a space
Output order would match binascii.hexlify()
Examples:
format(b"xyz", "x") -> '78797a'
format(b"xyz", "X") -> '78797A'
format(b"xyz", "#x") -> '0x78797a'
format(b"xyz", ".1x") -> '78 79 7a'
format(b"abcdwxyz", ".4x") -> '61626364 7778797a'
format(b"abcdwxyz", "#.4x") -> '0x61626364 0x7778797a'
format(b"xyz", ",.1x") -> '78,79,7a'
format(b"abcdwxyz", ",.4x") -> '61626364,7778797a'
format(b"abcdwxyz", "#,.4x") -> '0x61626364,0x7778797a'
This approach makes it easy to inspect binary data, with the ability to inject regular spaces or commas to improve readability. Those are the basic features needed to support debugging. Anything more complicated than that, and we're starting to want something more like the struct module.
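The proposed semantics can be approximated in plain Python today. The helper below is a hypothetical sketch, not part of any patch on this issue: the name format_hex and its keyword parameters are assumptions standing in for the "X", "#", ".precision" and "," format options, and it relies on bytes.hex() being available.

```python
def format_hex(data, *, upper=False, prefix=False, chunk=None, sep=" "):
    """Hypothetical helper approximating the proposed 'x'/'X' options."""
    digits = bytes(data).hex()       # works for any bytes-like object
    if upper:
        digits = digits.upper()      # stands in for the "X" format character
    lead = "0x" if prefix else ""    # stands in for the "#" flag
    if chunk is None:
        return lead + digits
    step = chunk * 2                 # two hex digits per byte
    groups = [digits[i:i + step] for i in range(0, len(digits), step)]
    return sep.join(lead + group for group in groups)


plain = format_hex(b"xyz")
spaced = format_hex(b"xyz", chunk=1)
grouped = format_hex(b"abcdwxyz", prefix=True, chunk=4, sep=",")
```

The three results correspond to the "x", ".1x" and "#,.4x" examples listed above.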
msg226732 - (view)	Author: Terry J. Reedy (terry.reedy) * (Python committer)	Date: 2014-09-10 23:29
The proposal is to add a .hex method (similar to binascii.hexlify) that is the inverse of .fromhex (similar to binascii.unhexlify), as originally specified in PEP 358.
http://legacy.python.org/dev/peps/pep-0358/
"The object has a .hex() method that does the reverse [of .fromhex]
>>> bytes([92, 83, 80, 255]).hex()
'5c5350ff'
"
If we add .hex, I think we should stick with this: no 0x or \x prefix. To aid debugging, I would change spaces to be None or a positive int n to insert a space every n bytes. So .hex(8) for an array of 64 bit ints.
msg226734 - (view)	Author: HCT (hct)	Date: 2014-09-10 23:55
@Terry natural bytes do not have space between them. I would think adding space is for typesetting situation which should be done by user's post-processing. I agree to not have any prefix to make .hex and from_hex uniform. the \x is the str representation of bytes when you print a bytes object directly in Python. the actual bytes object doesn't have that \x prefix.
msg226735 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-09-10 23:57
Good point Terry - I split the proposal to support bytes-like objects for 'x' and 'X' in string formatting out to issue 22385. For bytes.hex, I'm inclined to stick with the dirt simple option described in PEP 358: the exact behaviour of the current binascii.hexlify().
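To make "the exact behaviour of the current binascii.hexlify()" concrete, the sketch below compares the method with the existing alternatives recapped earlier in the thread. It assumes Python 3.5 or later for bytes.hex(); the variable names are illustrative.

```python
import base64
import binascii
import codecs

data = b"xyz"

via_method = data.hex()                      # str, lowercase, no prefix
via_binascii = binascii.hexlify(data)        # bytes, lowercase
via_codecs = codecs.encode(data, "hex")      # bytes, lowercase
via_base16 = base64.b16encode(data)          # bytes, uppercase digits
via_int = hex(int.from_bytes(data, "big"))   # str with a 0x prefix

# The method differs from hexlify() only in returning str instead of bytes.
same_digits = via_method == via_binascii.decode("ascii")
```

Note that the int-based spelling is not equivalent in general: it drops leading zero bytes and always adds the 0x prefix.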
msg226737 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-09-11 00:06
Open question: the current patch adds bytes.hex() and bytearray.hex(). Should we also add memoryview.hex(), or split that suggestion out to a separate proposal?
msg226738 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2014-09-11 00:20
I'd say add memoryview.hex() here as everything seems related. Victor has also mentioned memoryview in msg226692.
msg226745 - (view)	Author: Chris Lasher (gotgenes)	Date: 2014-09-11 07:09
int has int.from_bytes and int.to_bytes. Currently, bytes has bytes.fromhex. Would the core developers please consider naming the method "bytes.tohex" instead of "bytes.hex", so there's at least a modicum of consistency in the method names of Python's builtin types?
msg226750 - (view)	Author: Marc-Andre Lemburg (lemburg) * (Python committer)	Date: 2014-09-11 07:42
On 11.09.2014 01:04, Nick Coghlan wrote:
>
> Nick Coghlan added the comment:
>
> Just as a recap of at least some of the current ways to do a bytes -> hex conversion:
>
> >>> import codecs
> >>> codecs.encode(b"abc", "hex")
> b'616263'
> >>> import binascii
> >>> binascii.hexlify(b"abc")
> b'616263'
> >>> import base64
> >>> base64.b16encode(b"abc")
> b'616263'
> >>> hex(int.from_bytes(b"abc", "big"))
> '0x616263'
> >>> hex(int.from_bytes(b"abc", "little"))
> '0x636261'
>
> Thus, the underlying purpose of this proposal is to provide a single "more obvious way to do it". As per the recent discussion on python-ideas, the point where that is most useful is in debugging output.
>
> However, rather than a new method on bytes/bytearray/memoryview for this, I instead suggest it would be appropriate to extend the default handling of the "x" and "X" format characters to accept arbitrary bytes-like objects. The processing of these characters would be as follows:
>
> "x": display a-f as lowercase digits
> "X": display A-F as uppercase digits
> "#": includes 0x prefix
> ".precision": chunks output, placing a space after every <precision> bytes
> ",": uses a comma as the separator, rather than a space

Hmm, but those would then work for str.format() as well, right?
Since "x" and "X" are already used to convert numbers to hex representation, opening these up for bytes sounds like it could easily mask TypeErrors for cases where you really want an integer to be formatted as hex and not bytes.
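The concern about masked TypeErrors can be illustrated with a small sketch. It assumes an interpreter in which bytes does not implement the "x" format code, which was the behaviour when this discussion took place; to_hex_digits is a hypothetical helper name.

```python
def to_hex_digits(value):
    """Format an integer as hex digits (hypothetical helper)."""
    return format(value, "x")


int_result = to_hex_digits(255)

# Passing bytes by mistake is caught today because format() raises TypeError.
try:
    to_hex_digits(b"\xff")
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
```

If "x" were extended to bytes-like objects, the second call would return a string instead of raising, so the mistake would go unnoticed.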
msg227335 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2014-09-23 10:00
Updated issue title to indicate proposal also covers bytearray and memoryview.
msg240607 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2015-04-13 14:03
Arnon is here at the PyCon 2015 sprints, so bringing the current status up to date:

= Why .hex()? =
* That's the name in PEP 358
* That's the name of the comparable float method

= Why add it to the builtin types? =
* To provide One Obvious Way To Do It, rather than the current 5 (or so) non-obvious ways listed above
* That's what PEP 358 proposed

= Why postpone configurability and str.format() integration? =
* Because these are more complex questions that can be left out of the "minimum useful feature" of new methods on the builtins and hence have been moved out to issue 22385 (which depends on this issue, and would likely require a PEP to resolve all the technical details)
msg240719 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2015-04-13 19:55
I added the implementation for memoryview, updated to use PyUnicode_New etc., and moved the common implementation to its own file for code reuse.
msg242011 - (view)	Author: Arnon Yaari (wiggin15) *	Date: 2015-04-25 10:54
Minor updates to stdtypes.rst. I also want to add a line to whatsnew/3.5 but don't know how to put it in words - maybe it's better if someone with better English adds it.
msg242027 - (view)	Author: Gregory P. Smith (gregory.p.smith) * (Python committer)	Date: 2015-04-25 22:39
bytes.hex-1.diff looks good, i'll take care of committing this and adding a what's new entry. thanks!
msg242030 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-04-25 23:22
New changeset c9f1630cf2b1 by Gregory P. Smith in branch 'default': Implements issue #9951: Adds a hex() method to bytes, bytearray, & memoryview. https://hg.python.org/cpython/rev/c9f1630cf2b1
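A minimal sketch of the committed behaviour, assuming Python 3.5 or later: the same method is available on all three types named in the changeset and yields the same text for the same underlying bytes. The variable names are illustrative.

```python
payload = b"\xc0\xff\xee"

# One method, three bytes-like builtin types.
from_bytes = bytes(payload).hex()
from_bytearray = bytearray(payload).hex()
from_memoryview = memoryview(payload).hex()

all_equal = from_bytes == from_bytearray == from_memoryview
```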
msg242031 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-04-25 23:42
New changeset 955a479b31a8 by Gregory P. Smith in branch 'default': Issue9951: update _hashopenssl and md5module to use _Py_strhex(). https://hg.python.org/cpython/rev/955a479b31a8
msg242033 - (view)	Author: Gregory P. Smith (gregory.p.smith) * (Python committer)	Date: 2015-04-26 00:36
Not quite fixed; it looks like some of the buildbots are having fun not compiling with this change: http://buildbot.python.org/all/builders/x86%20Tiger%203.x/builds/9569/steps/compile/logs/stdio
Investigating...
msg242034 - (view)	Author: Gregory P. Smith (gregory.p.smith) * (Python committer)	Date: 2015-04-26 00:39
i missed the hg adds :)
msg242035 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-04-26 00:41
New changeset a7737204c221 by Gregory P. Smith in branch 'default': Add the files missing from c9f1630cf2b1 for issue9951. https://hg.python.org/cpython/rev/a7737204c221
msg242036 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-04-26 00:42
New changeset 7f0811452d0f by Gregory P. Smith in branch 'default': Switch binascii over to using the common _Py_strhex implementation for its hex https://hg.python.org/cpython/rev/7f0811452d0f
msg242039 - (view)	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2015-04-26 02:36
Thank you Arnon, and thank you Greg!
msg242041 - (view)	Author: Gregory P. Smith (gregory.p.smith) * (Python committer)	Date: 2015-04-26 04:28
I see some _Py_strhex related link errors on the Windows buildbots: http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/9642/steps/compile/logs/stdio
msg242043 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-04-26 05:00
New changeset b46308353ed9 by Gregory P. Smith in branch 'default': Add missing PyAPI_FUNC macro's to the public functions as other .c files do https://hg.python.org/cpython/rev/b46308353ed9
msg254456 - (view)	Author: Roundup Robot (python-dev) (Python triager)	Date: 2015-11-10 17:19
New changeset f3d8bb3ffa98 by Stefan Krah in branch '3.5': Iaaue #25598: Fix memory_hex from #9951 for non-contiguous buffers. https://hg.python.org/cpython/rev/f3d8bb3ffa98
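The non-contiguous case addressed by this follow-up can be sketched as below. It assumes an interpreter that includes the fix (the changeset landed on the 3.5 branch after the initial release); the variable names are illustrative.

```python
view = memoryview(b"abcdef")

# Slicing with a step gives a view that is not contiguous in memory.
every_other = view[::2]
is_contiguous = every_other.c_contiguous

# hex() should describe the logical contents (b"ace"), not the raw buffer.
logical_bytes = every_other.tobytes()
sliced_hex = every_other.hex()
```

The expectation is that sliced_hex matches logical_bytes.hex(), meaning the skipped bytes do not appear in the output.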

History
Date	User	Action	Args
2022-04-11 14:57:06	admin	set	github: 54160
2015-11-10 17:19:10	python-dev	set	messages: + msg254456
2015-04-26 07:49:42	berker.peksag	set	stage: commit review -> resolved
2015-04-26 05:07:16	gregory.p.smith	set	status: open -> closed resolution: fixed
2015-04-26 05:00:10	python-dev	set	messages: + msg242043
2015-04-26 04:28:33	gregory.p.smith	set	messages: + msg242041
2015-04-26 02:36:38	ncoghlan	set	messages: + msg242039
2015-04-26 00:42:25	python-dev	set	messages: + msg242036
2015-04-26 00:41:13	python-dev	set	messages: + msg242035
2015-04-26 00:39:22	gregory.p.smith	set	messages: + msg242034
2015-04-26 00:36:37	gregory.p.smith	set	status: closed -> open resolution: fixed -> (no value) messages: + msg242033
2015-04-25 23:51:28	gregory.p.smith	set	status: open -> closed resolution: fixed stage: patch review -> commit review
2015-04-25 23:42:53	python-dev	set	messages: + msg242031
2015-04-25 23:22:38	python-dev	set	nosy: + python-dev messages: + msg242030
2015-04-25 22:39:03	gregory.p.smith	set	assignee: ncoghlan -> gregory.p.smith messages: + msg242027 nosy: + gregory.p.smith
2015-04-25 10:54:48	wiggin15	set	files: + bytes.hex-1.diff messages: + msg242011
2015-04-25 06:36:58	ncoghlan	set	assignee: ncoghlan
2015-04-14 08:15:41	vstinner	set	nosy: - vstinner
2015-04-13 20:31:56	wiggin15	set	files: + bytes.hex.diff
2015-04-13 20:30:57	wiggin15	set	files: - bytes.hex.diff
2015-04-13 19:55:06	wiggin15	set	files: + bytes.hex.diff messages: + msg240719
2015-04-13 19:43:01	wiggin15	set	files: - bytes.hex.diff
2015-04-13 16:52:56	wiggin15	set	nosy: + eric.smith
2015-04-13 14:03:47	ncoghlan	set	messages: + msg240607
2014-11-07 15:37:18	ethan.furman	set	nosy: + ethan.furman
2014-10-05 04:09:50	ncoghlan	link	issue22555 dependencies
2014-09-23 11:44:44	barry	set	nosy: + barry
2014-09-23 10:00:24	ncoghlan	set	messages: + msg227335 title: introduce bytes.hex method -> introduce bytes.hex method (also for bytearray and memoryview)
2014-09-17 10:56:54	ncoghlan	link	issue22385 dependencies
2014-09-11 07:42:36	lemburg	set	nosy: + lemburg messages: + msg226750
2014-09-11 07:09:25	gotgenes	set	messages: + msg226745
2014-09-11 00:20:54	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg226738
2014-09-11 00:06:21	ncoghlan	set	messages: + msg226737
2014-09-10 23:57:04	ncoghlan	set	messages: + msg226735
2014-09-10 23:55:33	hct	set	messages: + msg226734
2014-09-10 23:29:34	terry.reedy	set	messages: + msg226732
2014-09-10 23:04:07	ncoghlan	set	messages: + msg226731
2014-09-10 22:30:31	hct	set	messages: + msg226730
2014-09-10 17:58:54	terry.reedy	set	messages: + msg226703 versions: + Python 3.5, - Python 3.4
2014-09-10 12:29:40	vstinner	set	nosy: + vstinner messages: + msg226692
2014-09-10 08:17:52	gotgenes	set	nosy: + gotgenes
2013-12-10 07:47:52	Arfrever	set	nosy: + Arfrever
2013-12-09 22:31:35	hct	set	nosy: + hct messages: + msg205744
2013-10-13 07:45:40	georg.brandl	set	nosy: + georg.brandl messages: + msg199665
2013-10-12 22:35:06	christian.heimes	set	messages: + msg199634
2013-10-12 22:16:42	pitrou	set	nosy: + pitrou messages: + msg199631
2013-10-12 22:12:54	christian.heimes	set	nosy: + christian.heimes messages: + msg199629 stage: patch review
2013-09-13 13:29:52	wiggin15	set	status: pending -> open messages: + msg197571
2013-09-13 09:25:51	serhiy.storchaka	set	status: open -> pending
2013-07-14 08:50:41	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg193041
2013-07-14 07:23:45	wiggin15	set	messages: + msg193034
2013-05-26 20:39:05	terry.reedy	set	nosy: + terry.reedy messages: + msg190112 versions: + Python 3.4, - Python 3.3
2013-05-20 05:40:59	martin.panter	set	nosy: + martin.panter
2011-04-04 00:25:37	rhettinger	set	nosy: + rhettinger messages: + msg132911 versions: + Python 3.3, - Python 3.2
2011-04-03 22:35:42	r.david.murray	link	issue11756 superseder
2010-10-09 16:58:30	benjamin.peterson	set	messages: - msg117862
2010-10-09 13:00:58	wiggin15	set	messages: + msg118272
2010-10-09 11:03:41	wiggin15	set	files: - bytes.hex.diff
2010-10-09 11:03:33	wiggin15	set	files: + bytes.hex.diff
2010-10-02 06:35:08	ncoghlan	set	messages: + msg117862
2010-10-02 06:14:53	ncoghlan	set	nosy: + ncoghlan
2010-09-27 19:56:45	eric.araujo	set	nosy: + eric.araujo
2010-09-26 14:49:40	pitrou	set	nosy: + mark.dickinson
2010-09-25 23:38:47	wiggin15	create