I have built two parsers that work for everything except some binary blocks. One converts a proprietary format into standard JSON, and the other converts the JSON back into the proprietary format.
When I wrote the one that produces JSON, I was just happy to get everything to parse into valid JSON, but I worried that I might not be able to restore the binary sections, and that concern seems to have come true.
I believe base64 is necessary, or at least one possible solution, because the binary is otherwise too full of characters that JSON does not like. I think trying to escape them would be more challenging than this base64 solution.
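To illustrate the point about JSON and raw binary, here is a minimal sketch. It assumes the block is held as a Python bytes object, and the key name "blob" is made up for the example. Python's json module refuses bytes outright, while the base64 text form serializes without any escaping:

```python
import base64
import json

# A few raw bytes, used purely as an illustration.
payload = b"0\x82\x02\xd8"

try:
    # "blob" is a hypothetical key name, not part of the original format.
    json.dumps({"blob": payload})
    bytes_accepted = True
except TypeError:
    # json.dumps cannot serialize bytes objects.
    bytes_accepted = False

# base64 yields ASCII-only bytes; decoding them gives a plain str that JSON accepts.
as_text = base64.b64encode(payload).decode("ascii")
serialized = json.dumps({"blob": as_text})

print(bytes_accepted)
print(serialized)
```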
So here is a binary block from the original file:
cleanbinary = "0\x82\x02\xd80\x82\x01\xc0\xa0"
It is brought into base64 like this:
import base64
out = base64.encodebytes(cleanbinary.encode('utf-8'))
print(out)
>> b'MMKCAsOYMMKCAcOAwqA=\n'
You can turn that back into binary:
z = base64.decodebytes(out).decode('utf-8')
print(z == cleanbinary)
>> True
However, it needs an intermediate step, which I just can't work out, to get it into JSON in the middle of a loop. I have tried the following:
wrapped = '"' + str(out) + '"'
So now you have the double quotes that JSON needs, and it is a str rather than bytes:
print(wrapped)
>> "b'MMKCAsOYMMKCAcOAwqA=\n'"
Now let's say you had plucked this string value out of a JSON file with Python's json parser. How do you turn it back into a bytes value:
b'MMKCAsOYMMKCAcOAwqA=\n'
...so that it can be decoded back into binary?
1 Answer
I'd suggest converting your bytes to a string and back by explicitly decoding and encoding it:
out = b'MMKCAsOYMMKCAcOAwqA='
wrapped = f'"{out.decode()}"'
print(wrapped) # ---> "MMKCAsOYMMKCAcOAwqA="
unwrapped = wrapped.strip('"').encode()
print(unwrapped) # ---> b'MMKCAsOYMMKCAcOAwqA='
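As a fuller sketch of the same idea, the round trip can go through the json module itself, so the quoting is handled for you. This assumes the block is read as real bytes (a b"..." literal) rather than as a str, and it uses base64.b64encode, which, unlike encodebytes, does not append a trailing newline. The key name "blob" is hypothetical:

```python
import base64
import json

# Assumption: the block is held as bytes (note the b prefix), not as a str.
raw = b"0\x82\x02\xd80\x82\x01\xc0\xa0"

# b64encode returns ASCII-only bytes; decode them to get a JSON-safe str.
encoded = base64.b64encode(raw).decode("ascii")

# "blob" is a made-up key name used only for this illustration.
document = json.dumps({"blob": encoded})

# Reading it back: json.loads yields a str, which b64decode accepts directly.
restored = base64.b64decode(json.loads(document)["blob"])

print(restored == raw)
```

Keeping the data as bytes from the start also avoids the extra UTF-8 encoding step, which would otherwise turn every byte above 0x7f into two bytes before it is base64-encoded.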
bytes("some_text")?