NumPy array is not JSON serializable

Question 1

After creating a NumPy array, and saving it as a Django context variable, I receive the following error when loading the webpage:

array([ 0, 239, 479, 717, 952, 1192, 1432, 1667], dtype=int64) is not JSON serializable

What does this mean?

Question 2

It means that somewhere, something is trying to dump a numpy array using the json module. But numpy.ndarray is not a type that json knows how to handle. You'll either need to write your own serializer, or (more simply) just pass list(your_array) to whatever is writing the json.

Question 3

Note list(your_array) will not always work as it returns numpy ints, not native ints. Use your_array.to_list() instead.

Question 4

A note about @ashishsingal's comment: it should be your_array.tolist(), not to_list().
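The difference between the two conversions can be seen in a short sketch. The array values are made up for illustration:

```python
import json
import numpy as np

your_array = np.array([0, 239, 479], dtype=np.int64)

# list() keeps the elements as NumPy scalars, which json cannot encode.
as_list = list(your_array)
try:
    json.dumps(as_list)
    list_ok = True
except TypeError:
    list_ok = False

# tolist() converts the elements to native Python ints.
as_native = your_array.tolist()
encoded = json.dumps(as_native)
print(list_ok, encoded)
```

Here list_ok ends up False, while the tolist() result encodes without complaint.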

Question 5

I wrote a simple module to export complex data structures in python: pip install jdata then import jdata as jd;import numpy as np; a={'str':'test','num':1.2,'np':np.arange(1,5,dtype=np.uint8)}; jd.show(a)

Question 6

I regularly "jsonify" np.arrays. Try using the ".tolist()" method on the arrays first, like this:

import json
import numpy as np

a = np.arange(10).reshape(2, 5)  # a 2 by 5 array
b = a.tolist()  # nested lists with the same data and indices
file_path = "/path.json"  # your path variable
# the with-block closes the file once the array has been saved in .json format
with open(file_path, 'w', encoding='utf-8') as f:
    json.dump(b, f,
              separators=(',', ':'),
              sort_keys=True,
              indent=4)

In order to "unjsonify" the array use:

with open(file_path, 'r', encoding='utf-8') as f:
    b_new = json.load(f)
a_new = np.array(b_new)

Question 7

Why can it only be stored as a list of lists?

Question 8

I don't know, but I expect np.array types have metadata that doesn't fit into JSON (e.g. they specify the data type of each entry, like float).

Question 9

I tried your method, but it seems that the program got stuck at tolist().

Question 10

@frankliuao I found the reason is that tolist() takes a huge amount of time when the data is large.

Question 11

@NikhilPrabhu JSON is JavaScript Object Notation, and can therefore only represent the basic constructs of the JavaScript language: objects (analogous to Python dicts), arrays (analogous to Python lists), numbers, booleans, strings, and nulls (analogous to Python None). NumPy arrays are not any of those things, and so cannot be serialised into JSON. Some can be converted to a JSON-like form (a list of lists), which is what this answer does.
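A small sketch of that mapping, using made-up data: every native value below has a JSON counterpart, while the ndarray does not.

```python
import json
import numpy as np

# Every value here maps onto a JSON construct, so encoding succeeds.
native = {"obj": {"k": 1}, "arr": [1, 2.5], "flag": True,
          "text": "hi", "nothing": None}
encoded = json.dumps(native)

# An ndarray has no JSON counterpart, so the encoder raises TypeError.
try:
    json.dumps({"arr": np.arange(3)})
    raised = False
except TypeError:
    raised = True

print(encoded)
print(raised)
```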

Question 12

Store as JSON a numpy.ndarray or any nested-list composition.

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)
json_dump = json.dumps({'a': a, 'aa': [2, (2, 3, 4), a], 'bb': [2]},
                       cls=NumpyEncoder)
print(json_dump)

Will output:

(2, 3)
{"a": [[1, 2, 3], [4, 5, 6]], "aa": [2, [2, 3, 4], [[1, 2, 3], [4, 5, 6]]], "bb": [2]}

To restore from JSON:

json_load = json.loads(json_dump)
a_restored = np.asarray(json_load["a"])
print(a_restored)
print(a_restored.shape)

Will output:

[[1 2 3]
 [4 5 6]]
(2, 3)

Question 13

This should be way higher up the board, it's the generalisable and properly abstracted way of doing this. Thanks!

Question 14

Is there a simple way to get the ndarray back from the list ?

Question 15

@DarksteelPenguin are you looking for numpy.asarray()?
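A minimal round-trip sketch with numpy.asarray(), using an example array of my own. Note that JSON carries no dtype, so NumPy infers one again unless you pass it explicitly:

```python
import json
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)
text = json.dumps(a.tolist())

# Shape and values survive; the dtype is inferred from the Python ints.
restored = np.asarray(json.loads(text))
same_values = np.array_equal(a, restored)

# Pass dtype explicitly if the original one matters.
restored_u8 = np.asarray(json.loads(text), dtype=np.uint8)
print(same_values, restored.dtype, restored_u8.dtype)
```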

Question 16

This answer is great and can easily be extended to serialize numpy float32 and np.float64 values as json too: if isinstance(obj, np.float32) or isinstance(obj, np.float64): return float(obj)

Question 17

This solution saves you from manually casting every NumPy array to a list.

Question 18

I found the best solution if you have nested numpy arrays in a dictionary:

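import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

# `data` and `path` stand for your own object and file path
dumped = json.dumps(data, cls=NumpyEncoder)
with open(path, 'w') as f:
    # note: `dumped` is already a JSON string, so this stores it as a JSON-encoded
    # string; json.dump(data, f, cls=NumpyEncoder) would write the object directly
    json.dump(dumped, f)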

Thanks to this guy.

Question 19

Thanks for the helpful answer! I wrote the attributes to a json file, but am now having trouble reading back the parameters for Logistic Regression. Is there a 'decoder' for this saved json file?

Question 20

Of course, to read the json back you can use this: with open(path, 'r') as f: data = json.load(f) , which returns a dictionary with your data.

Question 21

That's for reading the JSON file; to then deserialize its output you can use this: data = json.loads(data)

Question 22

I had to add this to handle the bytes datatype, assuming all bytes are UTF-8 strings: elif isinstance(obj, (bytes,)): return obj.decode("utf-8")

Question 23

+1. Why do we need the line "return json.JSONEncoder.default(self, obj)" at the end of "def default(self, obj)"?
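One way to see why that last line matters: default() is only called for objects the encoder cannot handle itself, and the base class implementation is what raises TypeError. Without the fallthrough, default() returns None for unknown types, and they are silently written as null. A sketch with two hypothetical encoder classes (the names are mine, not from the answer):

```python
import json
import numpy as np

class WithFallthrough(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Unknown type: let the base class raise TypeError.
        return json.JSONEncoder.default(self, obj)

class WithoutFallthrough(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Unknown type: falls off the end and returns None.

payload = {"z": complex(1, 2)}  # complex is not JSON serializable

silent = json.dumps(payload, cls=WithoutFallthrough)
try:
    json.dumps(payload, cls=WithFallthrough)
    raised = False
except TypeError:
    raised = True

print(silent, raised)
```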

Question 24

You can use Pandas:

import pandas as pd
pd.Series(your_array).to_json(orient='values')

Question 25

Great! And I think for 2D np.array it will be something like pd.DataFrame(your_array).to_json('data.json', orient='split').

Question 26

2 lines in total including the import. Most Pythonic answer here!!!

Question 27

Use the json.dumps default kwarg:

default should be a function that gets called for objects that can’t otherwise be serialized. ... or raise a TypeError

In the default function check if the object is from the module numpy, if so either use ndarray.tolist for a ndarray or use .item for any other numpy specific type.

import json
import numpy as np

def default(obj):
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return obj.item()
    raise TypeError('Unknown type:', type(obj))

dumped = json.dumps(data, default=default)

Question 28

What's the role of the line type(obj).__module__ == np.__name__: there? Would it not suffice to check for the instance?

Question 29

@RamonMartinez, it is there to know that the object is a NumPy object; this way I can use .item() for almost any NumPy object. The default function is called for all unknown types json.dumps attempts to serialize, not just NumPy ones.
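As a rough illustration of that point, the sketch below runs the same kind of default function over a few NumPy scalar types plus one array. The sample data is made up:

```python
import json
import numpy as np

def default(obj):
    # The module check catches NumPy scalars of any kind, not just arrays.
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return obj.item()  # NumPy scalar -> closest native Python type
    raise TypeError(f"Unknown type: {type(obj)}")

data = {
    "count": np.int64(3),
    "ratio": np.float32(0.5),
    "flag": np.bool_(True),
    "vec": np.arange(3),
}
dumped = json.dumps(data, default=default)
print(dumped)

# Non-NumPy objects that json cannot handle still end in a TypeError.
try:
    json.dumps({"s": {1, 2}}, default=default)
    set_raised = False
except TypeError:
    set_raised = True
```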

Question 30

I think this also assists stackoverflow.com/questions/69920913/… though it would be nice to have a clean nested version too

Question 31

This is not supported by default, but you can make it work quite easily! There are several things you'll want to encode if you want the exact same data back:

The data itself, which you can get with obj.tolist() as @travelingbones mentioned. Sometimes this may be good enough.
The data type. I feel this is important in quite some cases.
The dimension (not necessarily 2D), which could be derived from the above if you assume the input is indeed always a 'rectangular' grid.
The memory order (row- or column-major). This doesn't often matter, but sometimes it does (e.g. performance), so why not save everything?

Furthermore, your NumPy array could be part of your data structure, e.g. you have a list with some matrices inside. For that you could use a custom encoder which basically does the above.
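A minimal sketch of such an encoder, assuming only simple (non-structured) dtypes. The names ArrayEncoder, array_hook and the "__ndarray__" key are hypothetical, chosen for illustration and not taken from any library:

```python
import json
import numpy as np

class ArrayEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            fortran = obj.flags.f_contiguous and not obj.flags.c_contiguous
            return {
                "__ndarray__": obj.tolist(),       # the data itself
                "dtype": str(obj.dtype),           # the data type
                "shape": list(obj.shape),          # the dimensions
                "order": "F" if fortran else "C",  # the memory order
            }
        return super().default(obj)

def array_hook(d):
    # Called for every decoded JSON object; rebuild arrays, pass others through.
    if "__ndarray__" in d:
        arr = np.array(d["__ndarray__"], dtype=d["dtype"]).reshape(d["shape"])
        return np.asfortranarray(arr) if d["order"] == "F" else arr
    return d

a = np.asfortranarray(np.arange(6, dtype=np.float32).reshape(2, 3))
text = json.dumps({"m": [a, 1]}, cls=ArrayEncoder)
back = json.loads(text, object_hook=array_hook)
b = back["m"][0]
print(b.dtype, b.shape, b.flags.f_contiguous)
```

The explicit reshape is there so that empty arrays such as shape (0, 3), whose tolist() is just [], come back with the right dimensions.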

This should be enough to implement a solution. Or you could use json-tricks which does just this (and supports various other types) (disclaimer: I made it).

pip install json-tricks

Then

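from datetime import datetime
from decimal import Decimal
from fractions import Fraction
from numpy import arange
from json_tricks import dumps

data = [
    arange(0, 10, 1, dtype=int).reshape((2, 5)),
    datetime(year=2017, month=1, day=19, hour=23, minute=00, second=00),
    1 + 2j,
    Decimal(42),
    Fraction(1, 3),
    MyTestCls(s='ub', dct={'7': 7}),  # a user-defined class, not defined in this snippet
    set(range(7)),
]
# Encode with metadata to preserve types when decoding
print(dumps(data))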

Question 32

I had a similar problem with a nested dictionary with some numpy.ndarrays in it.

def jsonify(data):
    json_data = dict()
    for key, value in data.items():  # items() replaces Python 2's iteritems()
        if isinstance(value, list):  # for lists
            value = [jsonify(item) if isinstance(item, dict) else item for item in value]
        if isinstance(value, dict):  # for nested dicts
            value = jsonify(value)
        if isinstance(key, int):  # if key is integer: > to string
            key = str(key)
        if type(value).__module__ == 'numpy':  # if value is numpy.*: > to python list
            value = value.tolist()
        json_data[key] = value
    return json_data

Question 33

The other answers will not work if someone else's code (e.g. a module) is doing the json.dumps(). This happens often, for example with webservers that auto-convert their return responses to JSON, meaning we can't always change the arguments for json.dump() .

This answer solves that, and is based off a (relatively) new solution that works for any 3rd party class (not just numpy).

TLDR

pip install json_fix

import json_fix # import this any time before json.dumps gets called
import json
# create a converter
import numpy
json.fallback_table[numpy.ndarray] = lambda array: array.tolist()
# no additional arguments needed: 
json.dumps(
 dict(thing=10, nested_data=numpy.array((1,2,3)))
)
#>>> '{"thing": 10, "nested_data": [1, 2, 3]}'

Question 34

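You could also use the default argument, for example:

import json
import numpy as np

def myconverter(o):
    if isinstance(o, np.float32):
        return float(o)
    # without this, every other unknown type would silently become null
    raise TypeError(f"{type(o).__name__} is not JSON serializable")

json.dumps(data, default=myconverter)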

Question 35

Also, some very interesting information further on lists vs. arrays in Python ~> Python List vs. Array - when to use?

It may be worth noting that once I convert my arrays to lists before saving them in a JSON file, I can (in my current deployment, anyway) keep using them as lists when I read that file back later, rather than converting them back to arrays.

It also looks nicer on screen (in my opinion) as a list (comma-separated) than as an array (not comma-separated).

Using @travelingbones's .tolist() method above, I've been using it as follows (also catching a few errors I've run into):

SAVE DICTIONARY

def writeDict(values, name):
    writeName = DIR+name+'.json'
    with open(writeName, "w") as outfile:
        json.dump(values, outfile)

READ DICTIONARY

def readDict(name):
    readName = DIR+name+'.json'
    try:
        with open(readName, "r") as infile:
            dictValues = json.load(infile)
            return dictValues
    except IOError as e:  # file missing or unreadable
        print(e)
        return None
    except ValueError as e:  # file is not valid JSON
        print(e)
        return None

Hope this helps!

Question 36

Use NumpyEncoder; it lets json.dumps process the array without throwing "NumPy array is not JSON serializable":

import numpy as np
import json
from numpyencoder import NumpyEncoder
arr = np.array([0, 239, 479, 717, 952, 1192, 1432, 1667], dtype=np.int64)
json.dumps(arr, cls=NumpyEncoder)

Question 37

numpyencoder is not a real package, -1

Question 38

Here is an implementation that works for me and removes all NaNs (assuming the input is a simple object, i.e. a list or dict):

from numpy import isnan

def remove_nans(my_obj, val=None):
    if isinstance(my_obj, list):
        for i, item in enumerate(my_obj):
            if isinstance(item, (list, dict)):
                my_obj[i] = remove_nans(my_obj[i], val=val)
            else:
                try:
                    if isnan(item):
                        my_obj[i] = val
                except Exception:  # isnan() rejects non-numeric items; leave them as they are
                    pass
    elif isinstance(my_obj, dict):
        for key, item in my_obj.items():  # items() replaces Python 2's iteritems()
            if isinstance(item, (list, dict)):
                my_obj[key] = remove_nans(my_obj[key], val=val)
            else:
                try:
                    if isnan(item):
                        my_obj[key] = val
                except Exception:
                    pass
    return my_obj
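Some background on why NaNs need this treatment, as a short sketch: by default Python's json module writes the token NaN, which is not part of the JSON specification and is rejected by strict parsers. Passing allow_nan=False makes the problem visible as an error instead:

```python
import json

# Default behaviour: NaN is emitted as a bare token, which is not valid JSON.
encoded = json.dumps({"x": float("nan")})

# Strict behaviour: refuse to write NaN at all.
try:
    json.dumps({"x": float("nan")}, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

print(encoded, strict_ok)
```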

Question 39

This is a different kind of answer, but it might help people who are trying to save data and then read it back.
There is hickle, which is faster than pickle and easier to use.
I tried to save and read my data with pickle, but reading it back caused a lot of problems; I wasted an hour and still didn't find a solution, even though I was working on my own data to create a chatbot.

vec_x and vec_y are numpy arrays:

import hickle as hkl

data=[vec_x,vec_y]
hkl.dump( data, 'new_data_file.hkl' )

Then you just read it and perform the operations:

data2 = hkl.load( 'new_data_file.hkl' )

Question 40

You can use a simple for loop that checks types:

with open("jsondontdoit.json", 'w') as fp:
    for key in bests.keys():
        if isinstance(bests[key], np.ndarray):
            bests[key] = bests[key].tolist()
            continue
        # otherwise bests[key] is assumed to be a dict that may hold arrays
        for idx in bests[key]:
            if isinstance(bests[key][idx], np.ndarray):
                bests[key][idx] = bests[key][idx].tolist()
    json.dump(bests, fp)
    # no fp.close() needed: the with-block closes the file

Question 41

TypeError: array([[0.46872085, 0.67374235, 1.0218339 , 0.13210179, 0.5440686 , 0.9140083 , 0.58720225, 0.2199381 ]], dtype=float32) is not JSON serializable

The above-mentioned error was thrown when I passed a list of data to model.predict() and expected the response in JSON format.

> 1 json_file = open('model.json','r')
> 2 loaded_model_json = json_file.read()
> 3 json_file.close()
> 4 loaded_model = model_from_json(loaded_model_json)
> 5 #load weights into new model
> 6 loaded_model.load_weights("model.h5")
> 7 loaded_model.compile(optimizer='adam', loss='mean_squared_error')
> 8 X = [[874,12450,678,0.922500,0.113569]]
> 9 d = pd.DataFrame(X)
> 10 prediction = loaded_model.predict(d)
> 11 return jsonify(prediction)

Luckily, I found a hint that resolved the error. Serialization only works for types that have a JSON counterpart. The mapping is:

object - dict
array - list
string - string
integer - integer

Line 10 above, prediction = loaded_model.predict(d), produces output of type array, and an array cannot be converted to JSON directly.

I finally solved it by converting the output to a list with the following lines of code:

prediction = loaded_model.predict(d)
listtype = prediction.tolist()
return jsonify(listtype)

Boom! That finally gave the expected output.

Question 42

I had the same problem, with one difference: my values were of type float32, so I solved it by converting them to plain Python floats with float(value).

Question 43

Can you provide a code snippet, please?
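A possible snippet for the float32 case, as a sketch with made-up values rather than the commenter's actual code:

```python
import json
import numpy as np

values = np.array([0.5, 0.25], dtype=np.float32)

# A single float32 scalar is rejected by json.
try:
    json.dumps(values[0])
    scalar_ok = True
except TypeError:
    scalar_ok = False

# Converting each element with float() gives native Python floats.
as_floats = [float(v) for v in values]
encoded = json.dumps(as_floats)
print(scalar_ok, encoded)
```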

Question 44

You can dumps() a NumPy array to a binary representation and then store it as a Base64-encoded string in your JSON. Add a hopefully unique prefix to the string to distinguish it from "bona fide" string values in the JSON output.

import json
import base64
import pickle
import numpy as np

NPREFIX = "numpy_b64_"  # to identify the 'NumPy dump'

class NPEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            bindata = obj.dumps()  # binary array (pickle format)
            b64 = base64.b64encode(bindata).decode('utf-8')
            return NPREFIX + b64
        return super().default(obj)  # let the base class raise TypeError for other types

myobj = {"answer": 42, "npvec": np.array([1,2,3])}
jstr = json.dumps(myobj, cls=NPEncoder)

To parse it back, set the object_hook= parameter of json.loads() to an appropriate function. The JSON module documentation is not very clear on this, but the function passed to object_hook= shall take a dictionary d as its sole argument which represents the already JSON-decoded data. The function then may modify this "raw result" as needed and return a new dictionary or a modified version of the input dictionary. This is what np_decode() below does:

def np_decode(d):
    for k, v in d.items():
        if isinstance(v, str) and v.startswith(NPREFIX):
            # this is a Base64-encoded Numpy bin dump
            b64 = v[len(NPREFIX):]  # chop off prefix
            bindata = base64.b64decode(b64)  # get the binary dump back
            # NumPy uses `pickle` format, so unpickle! (only do this with trusted input)
            nv = pickle.loads(bindata)
            d[k] = nv  # replace the string with the NumPy object
            continue
        # ... other transformations if needed ...
    return d

myobj2 = json.loads(jstr, object_hook=np_decode)

Keep in mind that representing NumPy arrays in a textual format could be very wasteful.
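A variation on the same idea, swapped in here as an alternative and not part of the answer above: serialize with np.save (the .npy format) instead of pickle, so that decoding never unpickles data. The helper names array_to_b64 and b64_to_array are hypothetical:

```python
import base64
import io
import json
import numpy as np

def array_to_b64(arr):
    # Write the array in .npy format to memory, then Base64-encode the bytes.
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return base64.b64encode(buf.getvalue()).decode("ascii")

def b64_to_array(text):
    # Reverse of array_to_b64; allow_pickle=False refuses pickled payloads.
    buf = io.BytesIO(base64.b64decode(text))
    return np.load(buf, allow_pickle=False)

vec = np.arange(5, dtype=np.float64)
jstr = json.dumps({"answer": 42, "npvec": array_to_b64(vec)})
restored = b64_to_array(json.loads(jstr)["npvec"])
print(restored)
```

The .npy header stores dtype and shape, so both come back without extra bookkeeping.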

score 501 · Accepted Answer · 2015-09-29 17:44:51Z


Answered Sep 29, 2015 at 17:44 by travelingbones; edited Dec 15, 2021 at 14:10 by David Hempy.

