[Python-Dev] Reintroduce or drop completely hex, bz2, rot13, ... codecs

Wed Jun 9 01:53:14 CEST 2010

There are two opposing issues in the bug tracker:
 #7475: codecs missing: base64 bz2 hex zlib ...
 -> reintroduce the codecs removed from Python3
 #8838: Remove codecs.readbuffer_encode()
 -> remove the last part of the removed codecs
If I understood correctly, the question is: should the codecs module contain 
only encoding codecs, or also other kinds of codecs?
The encoding codec API is now strict (encode: str->bytes, decode: bytes->str), 
so it's not possible to reuse str.encode() or bytes.decode() for the other 
codecs. Marc-Andre Lemburg proposed adding .transform() and .untransform() 
methods to the str, bytes and bytearray types. If I understood correctly, it 
would look like:
 >>> b'abc'.transform("hex")
 '616263'
 >>> '616263'.untransform("hex")
 b'abc'
I suppose that each codec will have a different list of accepted input and 
output types. Example:
 bz2: encode:bytes->bytes, decode:bytes->bytes
 rot13: encode:str->str, decode:str->str
 hex: encode:bytes->str, decode: str->bytes
And so "abc".encode("bz2") would raise a TypeError.
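To make the per-codec type signatures concrete, here is a minimal sketch of
the idea as free functions. The names transform(), untransform() and
_TRANSFORMS are hypothetical (they are not an existing Python API), and the
table simply copies the three signatures listed above; the actual work is
delegated to the existing bz2, binascii and codecs modules.

```python
import binascii
import bz2
import codecs

# Hypothetical registry, for illustration only:
# name -> (encode input type, decode input type, encoder, decoder)
_TRANSFORMS = {
    # bz2: bytes -> bytes in both directions
    "bz2": (bytes, bytes, bz2.compress, bz2.decompress),
    # rot13: str -> str in both directions
    "rot13": (str, str,
              lambda s: codecs.encode(s, "rot_13"),
              lambda s: codecs.decode(s, "rot_13")),
    # hex: bytes -> str when encoding, str -> bytes when decoding
    "hex": (bytes, str,
            lambda b: binascii.hexlify(b).decode("ascii"),
            lambda s: binascii.unhexlify(s.encode("ascii"))),
}


def _check(data, expected, name):
    # Reject input whose type does not match the codec's signature.
    if not isinstance(data, expected):
        raise TypeError("%s codec expects %s, got %s"
                        % (name, expected.__name__, type(data).__name__))


def transform(data, name):
    """Apply the 'encode' direction of the named transform."""
    enc_in, _dec_in, encoder, _decoder = _TRANSFORMS[name]
    _check(data, enc_in, name)
    return encoder(data)


def untransform(data, name):
    """Apply the 'decode' direction of the named transform."""
    _enc_in, dec_in, _encoder, decoder = _TRANSFORMS[name]
    _check(data, dec_in, name)
    return decoder(data)
```

With this sketch, transform(b'abc', "hex") gives '616263' and
untransform('616263', "hex") gives b'abc', while transform("abc", "bz2")
raises a TypeError because bz2 only accepts bytes.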
--
In my opinion, we should not mix codecs of different kinds (compression, 
cipher, etc.) because their input and output types are different. It would 
make more sense to create a standard API for each kind of codec. Existing 
examples of standard APIs in Python: hashlib, shutil.make_archive(), the 
database API, etc.
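
As an illustration of what a standard API for one kind of codec looks like,
here is a short sketch using hashlib, one of the existing examples named
above. Every algorithm is selected by name and exposes the same interface
(bytes in, digest out). The helper name digest_hex() is made up for this
example.

```python
import hashlib


def digest_hex(algorithm, data):
    """Hash bytes with any hashlib algorithm through the same interface."""
    h = hashlib.new(algorithm)   # select the algorithm by name
    h.update(data)               # every hash object accepts bytes
    return h.hexdigest()         # and returns the digest as a hex str


# The calling code is identical whatever the algorithm is.
results = {name: digest_hex(name, b"abc") for name in ("sha1", "sha256")}
```

A compression API or a cipher API could follow the same pattern: one module
per kind of codec, with a single, fixed set of input and output types.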
-- 
Victor Stinner
http://www.haypocalc.com/