Here is my simple program to pretty print a JSON object. I'm looking for advice on better pretty-print solutions, functional bugs, code style, and algorithm time/space complexity improvements.
BTW, I have fixed all the PEP8 issues I ran into using PyCharm's auto-annotation feature. There are some remaining PEP8 alerts that I think are either minor or a bit too much overhead for this program (e.g. using isinstance rather than ==). But if I am reading or judging them wrong, I'd appreciate you pointing it out.
import json


def print_pretty(source, prefix, level):
    if not source:
        return
    if type(source) == type({}):
        for k,v in source.items():
            for i in range(level):
                prefix.append('\t')
            prefix.append(k)
            if type(v) != type({}) and type(v) != type([]):
                prefix.append(' : ' + str(v))
            prefix.append('\n')
            if type(v) == type({}) or type(v) == type([]):
                print_pretty(v, prefix, level+1)
    elif type(source) == type([]):
        print_pretty('[', prefix, level)
        for i in source:
            print_pretty(i, prefix, level+1)
        print_pretty(']', prefix, level)
    else:
        for i in range(level):
            prefix.append('\t')
        prefix.append(str(source))
        prefix.append('\n')


if __name__ == "__main__":
    json_string = '''{
        "stuff": {
            "onetype": [
                {"id":1,"name":"John Doe"},
                {"id":2,"name":"Don Joeh"}
            ],
            "othertype": {"id":2,"company":"ACME"}
        },
        "otherstuff": {
            "thing": [[1,42],[2,2]]
        }
    }
    '''
    result = []
    print_pretty(json.loads(json_string), result, 0)
    print ''.join(result)
1 Answer
Your linter alerts on using isinstance rather than comparing types with == are pretty strong and should not be considered minor: what if I have a Counter and I want to examine its content using your function? Well, it'll be of no use, since type(Counter()) is Counter and not dict. So trying to pretty print it will enter the else part of your code and I won't get anything more than a regular print, how disappointing...
However, Counter being a subclass of dict, isinstance(Counter(), dict) returns True, so if you had used that, I would at least have had each key-value pair on its own line.
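To make the Counter point concrete, here is a minimal sketch comparing the two checks. The variable names are only illustrative; the snippet should behave the same under Python 2.7 and Python 3.

```python
from collections import Counter

counts = Counter("abracadabra")

# Exact type comparison rejects subclasses of dict.
exact_match = type(counts) == dict

# isinstance accepts dict and any of its subclasses.
subclass_match = isinstance(counts, dict)

print(exact_match)     # False
print(subclass_match)  # True
```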
Better yet, as duck-typing is the norm in Python, would be to not test anything and try to:
- call the items method and format as a dictionary; or, if it fails,
- transform to an iterator using iter and format as a list; or, if it fails,
- format as a single value.
With proper try: ... except: clauses, this makes the function much more reusable.
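The three-step fallback above can be sketched in isolation as follows. The function name classify and its return labels are hypothetical, chosen only for illustration. Strings are special-cased because they are iterable themselves; the sketch assumes Python 3, and under Python 2.7 the check would be against basestring instead of str.

```python
def classify(source):
    """Return 'mapping', 'sequence' or 'value' without comparing types."""
    try:
        source.items()
    except AttributeError:
        pass  # No items method: not dictionary-like, try the next step
    else:
        return 'mapping'

    # Strings are iterable, so they must be ruled out before calling iter().
    if isinstance(source, str):
        return 'value'

    try:
        iter(source)
    except TypeError:
        return 'value'  # Not iterable: treat as a single value
    return 'sequence'


print(classify({'id': 1}))  # mapping
print(classify((1, 42)))    # sequence
print(classify(42))         # value
```

Any object exposing an items method (a Counter, an OrderedDict, a custom mapping) takes the first branch, which is the reusability gain duck-typing buys.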
Now for your function's behaviour: instead of checking the type of the values in a dictionary, you can just call your function recursively; the type will be checked there. This means you will need to change the formatting to add ':' for collections as you do for simple values. But I do think it's an improvement, as it makes explicit what are keys and what are values, rather than simple elements of an iterable.
To continue on the "explicit is better than implicit" way, I think that dropping the { and } delimiters for dictionary-like objects is a bad choice: keeping them lets you "write" the first element without an offset, and add one only for the content of containers.
And lastly, with a function named print_pretty, I expect it to print the representation of whatever it is called with, rather than handing me back an incomplete intermediate representation that I have to join and print myself. Better to call that function generate_pretty (and, oh, turn it into a generator rather than filling a list) and provide a print_pretty that will print ''.join(generate_pretty(...)); or maybe a little bit more, see the following rewrite:
def generate_pretty(source, level):
    try:
        mapping_items = source.iteritems()
    except AttributeError:  # Not a dict
        if isinstance(source, basestring):
            # Need to check for strings there because
            # strings and single characters are iterables
            # so the `try: iter(..); else: ...` would
            # result in an infinite recursion
            yield source
        else:
            try:
                sequence_items = iter(source)
            except TypeError:  # Not a sequence
                yield str(source)
            else:  # Indeed a sequence
                yield '[\n'
                for element in sequence_items:
                    yield '\t' * (level + 1)
                    for line in generate_pretty(element, level + 1):
                        yield line
                    yield ',\n'
                yield '\t' * level
                yield ']'
    else:  # Indeed a dict
        yield '{\n'
        for key, value in mapping_items:
            yield '\t' * (level + 1)
            yield str(key)
            yield ': '
            for line in generate_pretty(value, level + 1):
                yield line
            yield ',\n'
        yield '\t' * level
        yield '}'


def format_pretty(source):
    return ''.join(generate_pretty(source, 0))


def print_pretty(source, filename=None):
    formatted = format_pretty(source)
    if filename is None:
        print formatted
    else:
        # Open for writing; the default mode of open() is read-only
        with open(filename, 'w') as f:
            f.write(formatted)


if __name__ == "__main__":
    import json

    json_string = '''{
        "stuff": {
            "onetype": [
                {"id":1,"name":"John Doe"},
                {"id":2,"name":"Don Joeh"}
            ],
            "othertype": {"id":2,"company":"ACME"}
        },
        "otherstuff": {
            "thing": [[1,42],[2,2]]
        }
    }
    '''
    print_pretty(json.loads(json_string))
That being said, the organisation now matches more closely what is available in the pprint module, which offers even more customization. So basically, print_pretty could be just:
from pprint import pprint as print_pretty
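As a rough illustration of that customization, the sketch below uses the width and depth parameters of the standard pprint module on a trimmed-down version of the sample data. The variable names are only illustrative, and the exact line breaks depend on the Python version, so no output is shown.

```python
import json
from pprint import pformat, pprint

data = json.loads(
    '{"stuff": {"onetype": [{"id": 1, "name": "John Doe"}],'
    ' "othertype": {"id": 2, "company": "ACME"}}}'
)

# pprint writes to stdout; width sets the column at which lines wrap.
pprint(data, width=40)

# pformat returns the same representation as a string instead of printing it.
formatted = pformat(data, width=40)

# depth limits how many nesting levels are shown; deeper levels become '...'.
shallow = pformat(data, depth=1)
print(shallow)
```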
- Thanks Mathias. I tried your code, and it always runs without completing? I am using Python 2.7 + PyCharm on OS X. – Lin Ma, Dec 28, 2016 at 18:39
- Another comment: I was a Java and C++ developer in the past, where using exceptions to control program flow is not considered good practice. You are using exceptions to control flow; is that good practice and the recommended way in Python 2.7? Thanks. – Lin Ma, Dec 28, 2016 at 18:40
- @LinMa Exceptions are pretty friendly in Python; in fact, the for loop uses one under the hood to stop iteration. You may want to read about EAFP in Python and the StopIteration exception. – 301_Moved_Permanently, Dec 28, 2016 at 18:47
- Thanks for all the help, Mathias, I have marked your reply as the answer. Always enjoy reading your posts. – Lin Ma, Dec 29, 2016 at 4:25
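The remark in the comments about the for loop relying on an exception can be sketched like this. The helper name manual_for is hypothetical; it only approximates what the interpreter does for a plain for loop, using the built-in iter and next functions.

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # The iterator signals exhaustion by raising, not by a return code
            break
        action(item)


seen = []
manual_for([1, 2, 3], seen.append)
print(seen)  # [1, 2, 3]
```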
type({}) rather than dict? Also, what functionality does this add over the built-in pprint?