I'm trying to JSON encode a complex numpy array, and I've found a utility from astropy (http://astropy.readthedocs.org/en/latest/_modules/astropy/utils/misc.html#JsonCustomEncoder) for this purpose:
import json

import numpy as np

class JsonCustomEncoder(json.JSONEncoder):
    """ <cropped for brevity> """
    def default(self, obj):
        if isinstance(obj, (np.ndarray, np.number)):
            return obj.tolist()
        elif isinstance(obj, complex):
            return [obj.real, obj.imag]
        elif isinstance(obj, set):
            return list(obj)
        elif isinstance(obj, bytes):  # pragma: py3
            return obj.decode()
        return json.JSONEncoder.default(self, obj)
This works well for a complex numpy array:
test = {'some_key':np.array([1+1j,2+5j, 3-4j])}
Dumping yields:
encoded = json.dumps(test, cls=JsonCustomEncoder)
print(encoded)
>>> {"some_key": [[1.0, 1.0], [2.0, 5.0], [3.0, -4.0]]}
The problem is, I don't see a way to read this back into a complex array automatically. For example:
json.loads(encoded)
>>> {"some_key": [[1.0, 1.0], [2.0, 5.0], [3.0, -4.0]]}
Can you help me figure out how to override loads/decoding so that it infers that this must be a complex array? I.e., instead of a list of 2-element items, it should put these back into a complex array. The decoder doesn't have a default() method to override, and the docs on decoding have too much jargon for me.
4 Answers
Here is my final solution, adapted from hpaulj's answer below and from his answer to this related thread: https://stackoverflow.com/a/24375113/901925
This will encode/decode arrays of any dtype, nested to arbitrary depth in dictionaries.
import base64
import json

import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        """If the input object is an ndarray, convert it into a dict
        holding the dtype, shape, and the data, base64 encoded.
        """
        if isinstance(obj, np.ndarray):
            data_b64 = base64.b64encode(obj.data).decode('ascii')
            return dict(__ndarray__=data_b64,
                        dtype=str(obj.dtype),
                        shape=obj.shape)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, obj)

def json_numpy_obj_hook(dct):
    """Decodes a previously encoded numpy ndarray with proper shape and dtype.

    :param dct: (dict) json encoded ndarray
    :return: (ndarray) if input was an encoded ndarray
    """
    if isinstance(dct, dict) and '__ndarray__' in dct:
        data = base64.b64decode(dct['__ndarray__'])
        return np.frombuffer(data, dct['dtype']).reshape(dct['shape'])
    return dct

# Overload dump/load to use this behavior by default.
def dumps(*args, **kwargs):
    kwargs.setdefault('cls', NumpyEncoder)
    return json.dumps(*args, **kwargs)

def loads(*args, **kwargs):
    kwargs.setdefault('object_hook', json_numpy_obj_hook)
    return json.loads(*args, **kwargs)

def dump(*args, **kwargs):
    kwargs.setdefault('cls', NumpyEncoder)
    return json.dump(*args, **kwargs)

def load(*args, **kwargs):
    kwargs.setdefault('object_hook', json_numpy_obj_hook)
    return json.load(*args, **kwargs)

if __name__ == '__main__':
    data = np.arange(3, dtype=complex)
    one_level = {'level1': data, 'foo': 'bar'}
    two_level = {'level2': one_level}
    dumped = dumps(two_level)
    result = loads(dumped)
    print('\noriginal data', data)
    print('\nnested dict of dict complex array', two_level)
    print('\ndecoded nested data', result)
Which yields this output:
original data [0.+0.j 1.+0.j 2.+0.j]
nested dict of dict complex array {'level2': {'level1': array([0.+0.j, 1.+0.j, 2.+0.j]), 'foo': 'bar'}}
decoded nested data {'level2': {'level1': array([0.+0.j, 1.+0.j, 2.+0.j]), 'foo': 'bar'}}
1 Comment
Make sure the fallback calls json.JSONEncoder.default(self, obj), not json.JSONEncoder(self, obj).
The accepted answer is great but has a flaw. It only works if your data is C_CONTIGUOUS. If you transpose your data, that will no longer be true. For example, test the following:
A = np.arange(10).reshape(2,5)
A.flags
# C_CONTIGUOUS : True
# F_CONTIGUOUS : False
# OWNDATA : False
# WRITEABLE : True
# ALIGNED : True
# UPDATEIFCOPY : False
A = A.transpose()
#array([[0, 5],
# [1, 6],
# [2, 7],
# [3, 8],
# [4, 9]])
loads(dumps(A))
#array([[0, 1],
# [2, 3],
# [4, 5],
# [6, 7],
# [8, 9]])
A.flags
# C_CONTIGUOUS : False
# F_CONTIGUOUS : True
# OWNDATA : False
# WRITEABLE : True
# ALIGNED : True
# UPDATEIFCOPY : False
To fix this, apply np.ascontiguousarray() when passing the object to b64encode. Specifically, change:
data_b64 = base64.b64encode(obj.data)
to:
data_b64 = base64.b64encode(np.ascontiguousarray(obj).data)
If I understand the function correctly, np.ascontiguousarray() takes no action if the data is already C_CONTIGUOUS, so the only performance hit comes with F_CONTIGUOUS data.
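A minimal sketch of that fix in isolation (the helper names encode_array/decode_array are my own, not from the answer): force C order before grabbing the buffer, so a transposed (F-contiguous) array round-trips correctly.

```python
import base64
import json

import numpy as np

def encode_array(obj):
    # ascontiguousarray is a no-op if the data is already C-contiguous.
    contiguous = np.ascontiguousarray(obj)
    return {'__ndarray__': base64.b64encode(contiguous.data).decode('ascii'),
            'dtype': str(obj.dtype),
            'shape': obj.shape}

def decode_array(dct):
    data = base64.b64decode(dct['__ndarray__'])
    return np.frombuffer(data, dct['dtype']).reshape(dct['shape'])

A = np.arange(10).reshape(2, 5).transpose()   # F-contiguous view
B = decode_array(json.loads(json.dumps(encode_array(A))))
print(np.array_equal(A, B))   # True with the ascontiguousarray fix
```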
2 Comments
An equivalent fix is base64.b64encode(obj.copy(order='C')), which also forces a C-ordered copy.
It's unclear just how much help you need with json encoding/decoding, or with working with numpy. For example, how did you create the complex array in the first place?
What your encoding has done is render the array as a list of lists. The decoder then has to convert that back to an array of the appropriate dtype. For example:
d = json.loads(encoded)
a = np.dot(d['some_key'],np.array([1,1j]))
# array([ 1.+1.j, 2.+5.j, 3.-4.j])
This isn't the only way to create such an array from this list, and it probably fails with more general shapes, but it's a start.
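One shape-agnostic alternative (a sketch of my own, not from the answer; the helper name to_complex is invented) is to index the real and imaginary parts along the last axis instead of using np.dot:

```python
import json

import numpy as np

def to_complex(nested_list):
    # Works for any shape whose last axis holds [real, imag] pairs.
    arr = np.asarray(nested_list, dtype=float)
    return arr[..., 0] + 1j * arr[..., 1]

encoded = '{"some_key": [[1.0, 1.0], [2.0, 5.0], [3.0, -4.0]]}'
d = json.loads(encoded)
a = to_complex(d['some_key'])   # complex array of shape (3,)
```

This also handles the 2-d case, e.g. a (4, 3, 2) nested list becomes a (4, 3) complex array.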
The next task is figuring out when to use such a routine. If you know you are going to receive such an array, then just do this decoding.
Another option is to add one or more keys to the dictionary that mark this variable as a complex ndarray. One key might also encode its shape (though that is also deducible from the nesting of the list of lists).
Does this point in the right direction? Or do you need further help with each step?
One of the answers to the 'SimpleJSON and NumPy array' question, https://stackoverflow.com/a/24375113/901925, handles both the encoding and decoding of numpy arrays. It encodes a dictionary with the dtype, shape, and the array's data buffer, so the JSON string does not mean much to a human. But it does handle general arrays, including ones with complex dtype.
For your test array, the original and dumped prints are:
[ 1.+1.j 2.+5.j 3.-4.j]
{"dtype": "complex128", "shape": [3],
"__ndarray__": "AAAAAAAA8D8AAAAAAADwPwAAAAAAAABAAAAAAAAAFEAAAAAAAAAIQAAAAAAAABDA"}
The custom decoding is done with an object_hook function, which takes a dict and returns an array (if possible):
json.loads(dumped, object_hook=json_numpy_obj_hook)
Following that model, here's a crude hook that would transform every JSON array into an np.array, and every one with 2 columns into a 1d complex array:
def numpy_hook(dct):
    jj = np.array([1, 1j])
    for k, v in dct.items():
        if isinstance(v, list):
            v = np.array(v)
            if v.ndim == 2 and v.shape[1] == 2:
                v = np.dot(v, jj)
            dct[k] = v
    return dct
It would be better, I think, to encode some dictionary key to flag a numpy array, and another to flag a complex dtype.
I can improve the hook to handle regular lists, and other array dimensions:
def numpy_hook(dct):
    jj = np.array([1, 1j])
    for k, v in dct.items():
        if isinstance(v, list):
            # try to turn the list into a numpy array
            v = np.array(v)
            if v.dtype == object:
                # not a normal array; don't change it
                continue
            if v.ndim > 1 and v.shape[-1] == 2:
                # guess it is a complex array
                # this information should be more explicit
                v = np.dot(v, jj)
            dct[k] = v
    return dct
It handles this structure:
A = np.array([1+1j, 2+5j, 3-4j])
B = np.arange(12).reshape(3, 4)
C = A + B.T
test = {'id': 'stream id',
        'arrays': [{'A': A}, {'B': B}, {'C': C}]}
returning:
{u'arrays': [{u'A': array([ 1.+1.j, 2.+5.j, 3.-4.j])},
{u'B': array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])},
{u'C': array([[ 1.+1.j, 6.+5.j, 11.-4.j],
[ 2.+1.j, 7.+5.j, 12.-4.j],
[ 3.+1.j, 8.+5.j, 13.-4.j],
[ 4.+1.j, 9.+5.j, 14.-4.j]])}],
u'id': u'stream id'}
Any more generality requires, I think, modifications to the encoding to make the array identity explicit.
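A sketch of what that explicit encoding could look like (the tag key names and TaggedEncoder/tagged_hook are my own invention, not from the answer): wrap every array in a tagged dict at encode time, so the hook never has to guess.

```python
import json

import numpy as np

class TaggedEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            # Tag the entry so the hook can recognize it unambiguously.
            tag = {'__ndarray__': True, 'dtype': str(obj.dtype)}
            if np.iscomplexobj(obj):
                # Store real and imaginary parts separately.
                tag['data'] = [obj.real.tolist(), obj.imag.tolist()]
            else:
                tag['data'] = obj.tolist()
            return tag
        return json.JSONEncoder.default(self, obj)

def tagged_hook(dct):
    if dct.get('__ndarray__'):
        dtype = np.dtype(dct['dtype'])
        if dtype.kind == 'c':
            re, im = dct['data']
            return (np.array(re) + 1j * np.array(im)).astype(dtype)
        return np.array(dct['data'], dtype=dtype)
    return dct

test = {'some_key': np.array([1+1j, 2+5j, 3-4j])}
round_trip = json.loads(json.dumps(test, cls=TaggedEncoder),
                        object_hook=tagged_hook)
```

Unlike the buffer-based approach, the dumped string stays human-readable, at the cost of some precision-formatting risk in tolist()/repr round-trips.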
Try traitschema, https://traitschema.readthedocs.io/en/latest/:
"Create serializable, type-checked schema using traits and Numpy. A typical use case involves saving several Numpy arrays of varying shape and type."
See to_json():
"This uses a custom JSON encoder to handle numpy arrays but could conceivably lose precision. If this is important, please consider serializing in HDF5 format instead."