Python: encode arbitrary object with json builtin

Question 1

I am writing a web app with a Tornado backend and JavaScript/jQuery on the frontend, so I am using the built-in json module from the standard library to serialize objects for the frontend. I had started writing a custom JSONEncoder for my classes, but then it occurred to me that I could simply write a very simple, generic object encoder:

import json

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return vars(obj)

It seems to work nicely, so I wonder why this is not included in the module, and whether this technique has drawbacks. I haven't tested whether it works with check_circular, but I have no reason to believe it doesn't.

Any comments on my doubts? Otherwise, I suppose this technique may be useful to somebody, since I didn't find it with a search (admittedly, a quick one).

EDIT: here's an example, as simple as it gets, to show the behaviour of the json module:

>>> import json
>>> class Foo:
...     def __init__(self):
...         self.bar = 'bar'
...
>>> foo = Foo()
>>> json.dumps(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.5/json/encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <__main__.Foo object at 0x7f14660236d8> is not JSON serializable
>>> class ObjectEncoder(json.JSONEncoder):
...     def default(self, obj):
...         return vars(obj)
...
>>> json.dumps(foo, cls=ObjectEncoder)
'{"bar": "bar"}'
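On the check_circular doubt raised above, here is a minimal sketch, assuming the stock json module with its default check_circular=True. It suggests the circular-reference check still applies to objects that pass through default(). Node and try_dump are hypothetical names introduced only for this illustration.

```python
import json


class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return vars(obj)


class Node:  # hypothetical class, for illustration only
    def __init__(self, name):
        self.name = name
        self.parent = None


def try_dump(obj):
    """Return the JSON text, or a description of the ValueError raised."""
    try:
        return json.dumps(obj, cls=ObjectEncoder)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


leaf = Node("leaf")
loop = Node("loop")
loop.parent = loop  # the object now refers to itself

print(try_dump(leaf))  # serializes normally
print(try_dump(loop))  # reports a ValueError instead of recursing forever
```

Passing check_circular=False would skip that check, in which case a self-referencing object like loop is expected to end in a recursion error rather than a clean ValueError.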

Comment 1

What's wrong with json.dumps(obj)?

Comment 2

It raises TypeError: <... instance at ...> is not JSON serializable. This seems to be by design, since it is explicitly stated in the documentation. I would just like to understand why that is, and whether there are complications that I do not see.

Comment 3

Post the full error stack trace, please.

Comment 4

I really don't see the point, but anyway, here's a full example:

Comment 5

(added to the main post, it was too long for comments)

ShadowRanger · Accepted Answer · 2017-12-26 17:04:52Z

vars(obj) is syntactic sugar for obj.__dict__, so it doesn't work on any object without __dict__. This includes stuff like:

User-defined objects where every level in the class hierarchy defined __slots__ (without a __dict__ slot) to reduce memory usage
Objects of built-in types that don't opt in to a tp_dict slot (e.g. int, list, set)

Worse, there are in-between cases, where some attributes are set on the __dict__ while others aren't (e.g. a class hierarchy where __slots__ was used for some levels, but other levels didn't use __slots__ and relied on the implicit __dict__). In cases like that, you wouldn't get an error to let you know something had gone wrong; you'd just serialize the __dict__ part of the object state and silently ignore the rest.
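A minimal sketch of both situations, using the ObjectEncoder from the question. The classes Slotted and Mixed are hypothetical and exist only for this illustration.

```python
import json


class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return vars(obj)


class Slotted:  # hypothetical: every level defines __slots__, so no __dict__
    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2


class Mixed(Slotted):  # hypothetical: no __slots__ here, so an implicit __dict__
    def __init__(self):
        super().__init__()
        self.z = 3


# Fully slotted object: vars() has no __dict__ to return, so encoding fails loudly.
try:
    slotted_result = json.dumps(Slotted(), cls=ObjectEncoder)
except TypeError:
    slotted_result = "TypeError"

# Mixed hierarchy: no error, but only the __dict__ part (z) is serialized.
mixed_result = json.dumps(Mixed(), cls=ObjectEncoder)

print(slotted_result)
print(mixed_result)
```

The first case at least fails visibly. The second is the dangerous one: x and y live in slots, so they are dropped from the output without any warning.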

You'd have similar problems if the interface uses properties (@property): they're used like attributes, but they're not in the instance __dict__, so you'd either lose the information completely (if there is no hidden underlying attribute) or serialize it under the "wrong" name (the internal name, rather than the API name exposed as an @property).
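The property problem can be sketched in the same way. Temperature is a hypothetical class made up for this example; only the ObjectEncoder comes from the question.

```python
import json


class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return vars(obj)


class Temperature:  # hypothetical class, for illustration only
    def __init__(self, celsius):
        self._celsius = celsius  # hidden underlying attribute

    @property
    def celsius(self):
        # Public API name, backed by the private attribute
        return self._celsius

    @property
    def fahrenheit(self):
        # Purely computed: there is no underlying attribute at all
        return self._celsius * 9 / 5 + 32


encoded = json.dumps(Temperature(100), cls=ObjectEncoder)
print(encoded)
```

The output contains the private name _celsius rather than the public celsius, and fahrenheit is missing entirely, because neither property is stored in the instance __dict__.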

In short, lots of things can go subtly wrong by trying to guess at the correct behavior like this, which is why The Zen of Python (type import this in an interactive terminal to see it) includes stuff like:

Errors should never pass silently.

and

In the face of ambiguity, refuse the temptation to guess.

Beyond these errors, there's also the general problem of reversibility. A general encoder of this form is by definition incapable of being handled by a general decoder (because you lose all the type information). Offering an easy way to lose important information is... suboptimal.
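To illustrate the loss of type information, here is a small sketch with two hypothetical classes, Point and Vector, that happen to share attribute names.

```python
import json


class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return vars(obj)


class Point:  # hypothetical class, for illustration only
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Vector:  # hypothetical class with the same attribute names
    def __init__(self, x, y):
        self.x = x
        self.y = y


point_json = json.dumps(Point(1, 2), cls=ObjectEncoder)
vector_json = json.dumps(Vector(1, 2), cls=ObjectEncoder)

# Both objects produce the same text, and decoding yields a plain dict.
decoded = json.loads(point_json)
print(point_json == vector_json)
print(type(decoded).__name__)
```

Nothing in the encoded text records which class it came from, so a generic decoder can only hand back a dict.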

I am sure I had thanked you, but my comment was lost :( Well, at least I had accepted your answer...

In fact, I had come back here to note that I had hit the first problem with my approach: I needed a set in one of my classes, and that hasn't got a __dict__, so I had to change approach: I use the vars() approach for a list of predefined user classes, and custom code for other classes like set.
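The approach described in that last comment might look roughly like the sketch below. This is not the commenter's actual code: the User class, the SIMPLE_CLASSES tuple, and the choice to encode a set as a sorted list are all assumptions made for illustration.

```python
import json


class User:  # hypothetical user class, for illustration only
    def __init__(self, name, tags):
        self.name = name
        self.tags = set(tags)


class AllowlistEncoder(json.JSONEncoder):
    # Assumption: only classes listed here are known to be safe for vars()
    SIMPLE_CLASSES = (User,)

    def default(self, obj):
        if isinstance(obj, self.SIMPLE_CLASSES):
            return vars(obj)
        if isinstance(obj, (set, frozenset)):
            # Assumption: elements are mutually orderable; sorting keeps output stable
            return sorted(obj)
        # Anything else still fails loudly with TypeError
        return super().default(obj)


encoded = json.dumps(User("ann", ["b", "a"]), cls=AllowlistEncoder)
print(encoded)
```

Falling through to super().default() keeps the module's original behaviour for unlisted types, so an unexpected object raises TypeError instead of being guessed at.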
