I have some binary data in Python, in the form of an array of byte strings.
Is there a portable way to serialize this data that other languages could read?
JSON loses because, as I just found out, it has no real way to store binary data; its strings are expected to be Unicode.
I don't want to use pickle because of the security risk, and because it limits the data to being read by other Python programs.
Any advice? I would really like to use a built-in library (or at least one that's part of the standard Anaconda distribution).
Base64? Good enough for porn pictures on Usenet in the 90s . . . – Geoff Genz, Mar 24, 2014 at 21:50
Yeah, base64 is my fallback approach. I just was hoping there was a one-step solution out there already. – Jason S, Mar 24, 2014 at 21:58
Can't you just write them as they are? "Binary string" sounds like it is already serialised. – Sven Marnach, Mar 24, 2014 at 22:08
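As a concrete illustration of the base64 fallback mentioned in the comments above (not part of the original answer), a minimal sketch would base64-encode each byte string into plain ASCII and carry the result in JSON:

import base64
import json

a = [b"abc\xf3\x9c\xc6", b"xyz"]  # example byte strings

# bytes -> ASCII text that any JSON consumer can read
serialised = json.dumps([base64.b64encode(s).decode("ascii") for s in a])

# ASCII text -> original bytes
restored = [base64.b64decode(s) for s in json.loads(serialised)]
assert restored == a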
1 Answer
If you just need the binary data in the strings and can recover the boundaries between the individual strings easily, you could just write them to a file directly, as raw strings.
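For instance, a minimal sketch of that direct approach (assuming a hypothetical file name, with the reader expected to know where each record ends):

a = [b"abc\xf3\x9c\xc6", b"xyz"]  # example byte strings
# write the raw bytes back-to-back into "data.bin"
with open("data.bin", "wb") as f:
    for s in a:
        f.write(s)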
If you can't recover the string boundaries easily, JSON seems like a good option:
a = [b"abc\xf3\x9c\xc6", b"xyz"]
serialised = json.dumps([s.decode("latin1") for s in a])
print [s.encode("latin1") for s in json.loads(serialised)]
will print
['abc\xf3\x9c\xc6', 'xyz']
The trick here is that arbitrary binary strings are valid latin1, so they can always be decoded to Unicode and encoded back to the original string again.
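For reference, the same trick carries over to Python 3, where the bytes/str split is explicit; a rough Python 3 equivalent of the snippet above (not from the original answer) would be:

import json

a = [b"abc\xf3\x9c\xc6", b"xyz"]

# every byte value 0-255 is a valid latin1 code point, so decoding never fails
serialised = json.dumps([s.decode("latin1") for s in a])

# encoding back to latin1 restores the original byte strings exactly
restored = [s.encode("latin1") for s in json.loads(serialised)]
assert restored == a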