[Python-Dev] bytes.from_hex()

"Martin v. Löwis" martin at v.loewis.de
Sun Feb 19 19:55:49 CET 2006


Stephen J. Turnbull wrote:
> BTW, what use cases do you have in mind for Unicode -> Unicode
> decoding?

I think "rot13" falls into that category: it is a transformation
on text, not on bytes.

For other "odd" cases: "base64" goes Unicode->bytes in the *decode*
direction, not in the encode direction. Some may argue that base64
is bytes, not text, but in many applications, you can combine base64
(or uuencode) with arbitrary other text in a single stream. Of course,
it could be required that you go u.encode("ascii").decode("base64").
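
To make the two directions concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, which did not exist when this was written, and uses the codecs.encode()/codecs.decode() functions with the canonical codec names "rot_13" and "base64_codec"; the variable names are only illustrative.

```python
import codecs

# rot13 is a text -> text transformation: str in, str out.
scrambled = codecs.encode("Hello", "rot_13")

# base64 *decoding* produces bytes. The text is first narrowed to
# ASCII bytes explicitly, as in u.encode("ascii").decode("base64").
u = "aGVsbG8="
raw = codecs.decode(u.encode("ascii"), "base64_codec")

print(type(scrambled).__name__, type(raw).__name__)
```

The explicit u.encode("ascii") step is what keeps the base64 codec a pure bytes-to-bytes transformation in this sketch.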

> def encode-mime-body (string, codec-list):
>     if codec-list[0] not in charset-codec-list:
>         raise NotCharsetCodecException
>     if len (codec-list) > 1 and codec-list[-1] not in transfer-codec-list:
>         raise NotTransferCodecException
>     for codec in codec-list:
>         string = string.encode (codec)
>     return string
>
> mime-body = encode-mime-body ("This is a pen.",
>                               [ 'shift_jis', 'zip', 'base64' ])

I think this is an example where you *should* use the codec API,
as designed. As that apparently requires streams for stacking (i.e.
there is no direct support for codec stacking), you would have to write

import codecs
import cStringIO

def encode_mime_body(string, codec_list):
    # Keep a reference to the StringIO object that receives the bytes.
    stack = output = cStringIO.StringIO()
    # Wrap the stream starting with the last codec.
    for codec in reversed(codec_list):
        stack = codecs.getwriter(codec)(stack)
    stack.write(string)
    stack.reset()
    return output.getvalue()

Notice that you have to start the stacking with the last codec,
and you have to keep a reference to the StringIO object where
the actual bytes end up.
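
For readers on a current Python, the same stream stacking might be sketched as follows. This is a translation under stated assumptions, not the 2006 API: it assumes Python 3, io.BytesIO in place of cStringIO, and the canonical codec names "zlib_codec" and "base64_codec" instead of 'zip' and 'base64'. The decode_mime_body helper is hypothetical and exists only to check the round trip. Because the zlib and base64 stream writers encode each write() call on its own, the sketch writes the whole body in a single call.

```python
import codecs
import io


def encode_mime_body(text, codec_list):
    # Keep a reference to the byte sink; the writers only wrap it.
    output = io.BytesIO()
    stack = output
    # Wrap from the last codec inward, so the first codec ends up outermost.
    for codec in reversed(codec_list):
        stack = codecs.getwriter(codec)(stack)
    # Single write: zlib_codec and base64_codec encode each write() separately.
    stack.write(text)
    stack.reset()
    return output.getvalue()


def decode_mime_body(data, codec_list):
    # Hypothetical helper: undo the codecs in the reverse order.
    for codec in reversed(codec_list):
        data = codecs.decode(data, codec)
    return data


codecs_used = ["shift_jis", "zlib_codec", "base64_codec"]
body = encode_mime_body("This is a pen.", codecs_used)
print(decode_mime_body(body, codecs_used))
```

The text goes into the outermost writer (shift_jis), while the final bytes are read from the innermost object, which is why the reference to it has to be kept.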

Regards,
Martin

P.S. Some LISP shows through in your Python code :-)

