Issue 17844: Add link to alternatives for bytes-to-bytes codecs


This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/62044

classification

Title:	Add link to alternatives for bytes-to-bytes codecs
Type:	enhancement	Stage:	resolved
Components:	Documentation	Versions:	Python 3.3, Python 3.4, Python 2.7

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	docs@python	Nosy List:	docs@python, doerwalter, ezio.melotti, flox, lemburg, ncoghlan, python-dev, serhiy.storchaka
Priority:	normal	Keywords:	patch

Created on 2013-04-25 11:37 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
doc_codecs_impl.patch	serhiy.storchaka, 2013-05-21 10:20	Patch for 3.x
doc_codecs_impl-2.7_2.patch	serhiy.storchaka, 2013-05-21 14:28	Patch for 2.7

Messages (15)
msg187777	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-04-25 11:37
The proposed patch adds links to alternative interfaces for bytes-to-bytes codecs, e.g. base64.b64encode and base64.b64decode for base64_codec. The patch for 2.7 should mention other functions/modules (because some of them are missing there).
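As a rough illustration of what "alternative interface" means here, the sketch below (Python 3) encodes the same bytes once through the codec machinery and once through the base64 module directly. The variable names are made up for the example, and the comparison with base64.encodebytes reflects how current CPython behaves rather than anything stated in this issue.

```python
import base64
import codecs

data = b"Python"

# Codec route: the transform is selected by name, which is handy when
# the codec name arrives as a string at run time.
via_codec = codecs.encode(data, "base64_codec")

# Direct route: call the base64 module function instead of going
# through the codec registry.
via_module = base64.encodebytes(data)

# b64encode is the more commonly used entry point; unlike the two
# calls above, it does not append a trailing newline.
compact = base64.b64encode(data)

# Both routes decode back to the original bytes.
round_trip_codec = codecs.decode(via_codec, "base64_codec")
round_trip_module = base64.b64decode(compact)
```

When the codec name is fixed in the source code, the direct functions are usually the clearer choice, which is the point of linking to them from the codec table.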
msg189540	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-05-18 18:47
Any opinions?
msg189578	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-19 11:09
I like this, both because it quite clearly defines the encode and decode directions, and because it also notes the more direct entry points if the codec isn't being specified as an input string. So +1 from me.
msg189579	Author: Marc-Andre Lemburg (lemburg) * (Python committer)	Date: 2013-05-19 11:31
Not a bad idea. More information is always better when it comes to documentation :-)
msg189740	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-05-21 10:20
> Not a bad idea.

How about implementation? Here are updated patches for 3.x and 2.7. Note that in 2.7 I split the codecs table as in 3.x.
msg189743	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-21 11:14
I like the idea of splitting the table in 2.7 rather than using a result type column. However, the two intro paragraphs need a bit of work. How does the following sound:

1. Create a new subheading at the same level as the current "Standard Encodings" heading: "Python Specific Encodings"
2. Split out rot-13 to its own table in Python 2.7 as well
3. Under the new subheading, have the following text introducing the tables:

----
A number of predefined codecs are specific to Python, so their codec names have no meaning outside Python. These are listed in the tables below based on the expected input and output types (note that while text encodings are the most common use case for codecs, the underlying codec infrastructure supports arbitrary data transforms rather than just text encodings). For asymmetric codecs, the stated purpose describes the encoding direction.

The following codecs provide text-to-binary encoding and binary-to-text decoding, similar to the Unicode text encodings.
----
The following codecs provide binary-to-binary encoding and decoding.
----
The following codecs provide text-to-text encoding and decoding.
----
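The three categories in the proposed text can be sketched in Python 3 as follows. The codecs chosen (unicode_escape, hex_codec, rot_13) are just one representative per table, picked for this example, and the variable names are illustrative.

```python
import codecs

# Text-to-binary: str in, bytes out (decoding goes bytes -> str).
text_to_binary = "Python\u20ac".encode("unicode_escape")

# Binary-to-binary: bytes in, bytes out in both directions.
binary_to_binary = codecs.encode(b"Python", "hex_codec")

# Text-to-text: str in, str out in both directions.
text_to_text = codecs.encode("Python", "rot_13")

# Each transform is reversed by decoding with the same codec name.
back_to_text = text_to_binary.decode("unicode_escape")
back_to_bytes = codecs.decode(binary_to_binary, "hex_codec")
back_to_str = codecs.decode(text_to_text, "rot_13")
```

codecs.encode and codecs.decode are used for the last two categories because str.encode and bytes.decode in Python 3 are reserved for text encodings.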
msg189747	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-05-21 12:39
> However, the two intro paragraphs need a bit of work.

Yes, this is the help I needed. Thank you.

However, your wording is not entirely correct. In 2.7, binary-to-binary codecs and rot-13 work with Unicode strings (ASCII-compatible only) as well as with byte strings.

>>> u'Python'.encode('base64')
'UHl0aG9u\n'
>>> u'UHl0aG9u'.decode('base64')
'Python'
>>> u'Python\u20ac'.encode('base64')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython-2.7/Lib/encodings/base64_codec.py", line 24, in base64_encode
    output = base64.encodestring(input)
  File "/home/serhiy/py/cpython-2.7/Lib/base64.py", line 315, in encodestring
    pieces.append(binascii.b2a_base64(chunk))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 6: ordinal not in range(128)

Rot-13 works like an ordinary text-to-binary encoding (encode returns str, decode returns unicode).

>>> u'Python'.encode('rot13')
'Clguba'
>>> u'Python'.decode('rot13')
u'Clguba'
>>> 'Python'.encode('rot13')
'Clguba'
>>> 'Python'.decode('rot13')
u'Clguba'
>>> u'Python\u20ac'.encode('rot13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython-2.7/Lib/encodings/rot_13.py", line 17, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u20ac' in position 6: character maps to <undefined>
>>> u'Python\u20ac'.decode('rot13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython-2.7/Lib/encodings/rot_13.py", line 20, in decode
    return codecs.charmap_decode(input,errors,decoding_map)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 6: ordinal not in range(128)
msg189749	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-21 13:12
While the Python 2 text model was almost certainly a necessary transition step to full unicode support, it is things like this that highlight how fundamentally broken implicit conversion turned out to be at a conceptual level :P

Perhaps the following would work for 2.7 then (with rot-13 in the first table), with footnotes added to cover the quirks of the implicit type conversions between str and unicode:

----
A number of predefined codecs are specific to Python, so their codec names have no meaning outside Python. These are listed in the tables below based on the expected input and output types (note that while text encodings are the most common use case for codecs, the underlying codec infrastructure supports arbitrary data transforms rather than just text encodings). For asymmetric codecs, the stated purpose describes the encoding direction.

The following codecs provide unicode-to-str encoding [#1] and str-to-unicode decoding [#2], similar to the Unicode text encodings.
----
The following codecs provide str-to-str encoding and decoding [#2].
----

.. [#1] str objects are also accepted as input in place of unicode objects. They are implicitly converted to unicode by decoding them using the default encoding. If this conversion fails, it may lead to encoding operations raising :exc:`UnicodeDecodeError`.

.. [#2] unicode objects are also accepted as input in place of str objects. They are implicitly converted to str by encoding them using the default encoding. If this conversion fails, it may lead to decoding operations raising :exc:`UnicodeEncodeError`.
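For contrast with the 2.7 implicit conversions described above, here is a small Python 3 sketch in which a bytes-to-bytes codec is given a str. The helper try_encode is hypothetical, written only for this example, and the claim that the failure surfaces as a TypeError reflects current CPython behavior rather than anything stated in this issue.

```python
import codecs


def try_encode(obj, codec_name):
    """Return the encoded value, or the exception class name on failure."""
    try:
        return codecs.encode(obj, codec_name)
    except (TypeError, LookupError) as exc:
        return type(exc).__name__


# bytes input is what a binary-to-binary codec expects.
bytes_result = try_encode(b"Python", "base64_codec")

# str input is not implicitly converted in Python 3; the call fails
# instead of silently encoding the text with a default encoding.
str_result = try_encode("Python", "base64_codec")
```

Because nothing is converted behind the scenes, the 3.x documentation can describe each table by its input and output types without footnotes of this kind.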
msg189761	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-05-21 14:28
Thank you Nick. Here is an updated patch for 2.7.
msg189797	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-22 08:11
Thanks Serhiy, that version looks great.
msg189811	Author: Roundup Robot (python-dev) (Python triager)	Date: 2013-05-22 12:36
New changeset 85c04fdaa404 by Serhiy Storchaka in branch '2.7':
Issue #17844: Refactor a documentation of Python specific encodings.
http://hg.python.org/cpython/rev/85c04fdaa404

New changeset 039dc6dd2bc0 by Serhiy Storchaka in branch '3.3':
Issue #17844: Add links to encoders and decoders for bytes-to-bytes codecs.
http://hg.python.org/cpython/rev/039dc6dd2bc0

New changeset 9afdd88fe33a by Serhiy Storchaka in branch 'default':
Issue #17844: Add links to encoders and decoders for bytes-to-bytes codecs.
http://hg.python.org/cpython/rev/9afdd88fe33a
msg189812	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2013-05-22 12:40
Thank you Nick. It's mainly your patch. Do you want to forward-port your changes (a "Python Specific Encodings" subheading and the paragraph that follows it) to 3.x?
msg189821	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-22 14:50
That sounds like a good idea. Yay for not needing those arcane footnotes, though :)
msg189856	Author: Roundup Robot (python-dev) (Python triager)	Date: 2013-05-23 10:25
New changeset 85e8414060b4 by Nick Coghlan in branch '3.3':
Issue 17844: Clarify meaning of different codec tables
http://hg.python.org/cpython/rev/85e8414060b4

New changeset 801567d6302c by Nick Coghlan in branch 'default':
Merge issue 17844 from 3.3
http://hg.python.org/cpython/rev/801567d6302c
msg189857	Author: Alyssa Coghlan (ncoghlan) * (Python committer)	Date: 2013-05-23 10:26
Thanks for initiating this Serhiy :)

History
Date	User	Action	Args
2022-04-11 14:57:44	admin	set	github: 62044
2013-05-23 10:26:34	ncoghlan	set	status: open -> closed resolution: fixed messages: + msg189857 stage: patch review -> resolved
2013-05-23 10:25:22	python-dev	set	messages: + msg189856
2013-05-22 15:01:26	ezio.melotti	set	nosy: + ezio.melotti
2013-05-22 14:50:57	ncoghlan	set	messages: + msg189821
2013-05-22 12:40:01	serhiy.storchaka	set	messages: + msg189812
2013-05-22 12:36:17	python-dev	set	nosy: + python-dev messages: + msg189811
2013-05-22 08:11:34	ncoghlan	set	messages: + msg189797
2013-05-21 14:32:19	serhiy.storchaka	set	files: - doc_codecs_impl-2.7.patch
2013-05-21 14:30:16	flox	set	nosy: + flox
2013-05-21 14:28:18	serhiy.storchaka	set	files: + doc_codecs_impl-2.7_2.patch messages: + msg189761
2013-05-21 13:12:11	ncoghlan	set	messages: + msg189749
2013-05-21 12:39:04	serhiy.storchaka	set	messages: + msg189747
2013-05-21 11:14:27	ncoghlan	set	messages: + msg189743
2013-05-21 10:21:20	serhiy.storchaka	set	files: - doc_codecs_impl.patch
2013-05-21 10:20:49	serhiy.storchaka	set	files: + doc_codecs_impl-2.7.patch
2013-05-21 10:20:00	serhiy.storchaka	set	files: + doc_codecs_impl.patch messages: + msg189740
2013-05-19 11:31:36	lemburg	set	messages: + msg189579
2013-05-19 11:09:13	ncoghlan	set	messages: + msg189578
2013-05-18 18:47:28	serhiy.storchaka	set	messages: + msg189540
2013-04-25 11:43:30	serhiy.storchaka	link	issue7475 dependencies
2013-04-25 11:37:22	serhiy.storchaka	create
