I need to parse an invalid JSON string in which I find many repetitions of the same key, like the following snippet:
[...]
"term" : {"Entry" : "value1", [.. other data ..]},
"term" : {"Entry" : "value2", [.. other data ..]},
[...]
I thought of appending a suffix to each key, and I do it using the following code:
word = "term"
offending_string = '"term" : {"Entry"'
replacing_string_template = '"%s_d" : {"Entry"'
counter = 0
index = 0
while index != -1:
# result is the string containing the JSON data
index = result.find(offending_string, index)
result = result.replace(offending_string,
replacing_string_template % (word, counter), 1)
counter += 1
It works, but I'd like to know if it is a good approach and if you would do this in a different way.
1 Answer 1
import json
def fixup(pairs):
return pairs
decoded = json.loads(bad_json, object_pairs_hook = fixup)
object_pairs_hook is a more recent addition. If you have an older version of python you may not have it. The resulting python object will contain lists of pairs (including duplicates) rather then dictionaries.
-
\$\begingroup\$ +1 great! But I doubt it will work, because this code is deployed under Google App Engine, that uses an old version of python that doesn't even contain the json module. I have to
import simplejson as json
from django. \$\endgroup\$Andrea Spadaccini– Andrea Spadaccini2011年02月24日 17:32:37 +00:00Commented Feb 24, 2011 at 17:32 -
\$\begingroup\$ @Andrea, the current version of simplejson does support the object_pairs_hook. Even if Google App Engine doesn't have the latest version, you can probably download and install your own copy of simplejson for your project. \$\endgroup\$Winston Ewert– Winston Ewert2011年02月24日 17:36:53 +00:00Commented Feb 24, 2011 at 17:36