I am trying to add nesting to some flat data.
Basically, the code should work the following way:
Tags with "taglevel": 1 should be the key of the array.
Tags with "taglevel": 2 or higher should be nested within an array and not be duplicated in that array.
If no "taglevel": 1 tag exists, the tags are added to a generic "NoLevel_1" array.
The code is still clunky, and I feel there is a much cleaner way to achieve this.
import json

generic = []
result = []
for i in json_data:
    if any(d['taglevel'] == 1 for d in i['tag']):
        tag_data = {}
        tag_child = []
        for tag in i['tag']:
            if tag['taglevel'] == 1:
                tag_data['name'] = tag['name']
                tag_data['taglevel'] = 1
            else:
                tag_child.append(tag)
        # De-duplicate the children by name
        filtered = {tuple((k, d[k]) for k in sorted(d) if k in ['name']): d for d in tag_child}
        tag_data['tag_child'] = list(filtered.values())
        if any(d['name'] == tag_data['name'] for d in result):
            # Parent already present: merge the children and de-duplicate again
            for t in result:
                if t['name'] == tag_data['name']:
                    t['tag_child'] = t['tag_child'] + tag_child
                    filtered = {tuple((k, d[k]) for k in sorted(d) if k in ['name']): d for d in t['tag_child']}
                    t['tag_child'] = list(filtered.values())
        else:
            result.append(tag_data)
    else:
        # No level-1 tag: collect everything in the generic bucket
        for tag in i['tag']:
            generic.append(tag)
tag_data = {}
tag_data['name'] = 'NoLevel_1'
tag_data['taglevel'] = 1
tag_data['tag_child'] = generic
result.append(tag_data)
print(json.dumps(result, indent=4, sort_keys=True))
The data:
json_data = [{
    "title": "Random",
    "tag": [
        {
            "name": "Fruit",
            "taglevel": 1
        },
        {
            "name": "Apple",
            "taglevel": 2
        }
    ]
},
{
    "title": "Other",
    "tag": [
        {
            "name": "Fruit",
            "taglevel": 1
        },
        {
            "name": "Apple",
            "taglevel": 2
        }
    ]
},
{
    "title": "Words",
    "tag": [
        {
            "name": "Food",
            "taglevel": 2
        }
    ]
},
{
    "title": "That",
    "tag": [
        {
            "name": "Food",
            "taglevel": 2
        },
        {
            "name": "Apple",
            "taglevel": 2
        }
    ]
}
]
Desired result
[
    {
        "name": "Fruit",
        "tag_child": [
            {
                "name": "Apple",
                "taglevel": 2
            }
        ],
        "taglevel": 1
    },
    {
        "name": "NoLevel_1",
        "tag_child": [
            {
                "name": "Food",
                "taglevel": 2
            },
            {
                "name": "Apple",
                "taglevel": 2
            }
        ],
        "taglevel": 1
    }
]
1 Answer
Well, you may or may not consider this a simplification, but this is how I would approach it.
You could use a set to handle the duplication part. You can't store a dict in a set, though, so we need to create a tuple from the values. (It looks like you're doing something similar with filtered.)
We then reformat result to get the desired final structure.
from collections import defaultdict

result = defaultdict(set)
for item in json_data:
    parent = {'name': 'NoLevel_1'}
    children = []
    for tag in item['tag']:
        if tag['taglevel'] == 1:
            parent = tag
        else:
            children.append((tag['taglevel'], tag['name']))
    result[parent['name']].update(children)

result = [
    {
        'name': parent,
        'tag_child': [
            {'name': name, 'taglevel': taglevel} for taglevel, name in tags
        ],
        'taglevel': 1,
    } for parent, tags in result.items()
]
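One thing to keep in mind with the set-based version: a set has no defined order, so the children may not come out in the order shown in the desired result. If the order matters, a possible variation is to use a plain dict as an "ordered set" instead. This is only a sketch under the assumption of Python 3.7+ (where dicts keep insertion order); the function name nest_tags, the fallback parameter, and the trimmed sample data are made up for illustration and are not part of the code above.

```python
import json


def nest_tags(items, fallback='NoLevel_1'):
    """Group non-level-1 tags under their level-1 parent, without duplicates."""
    grouped = {}  # parent name -> dict used as an insertion-ordered set
    for item in items:
        parent_name = fallback
        children = []
        for tag in item['tag']:
            if tag['taglevel'] == 1:
                parent_name = tag['name']
            else:
                # dicts are unhashable, so keep a hashable tuple instead
                children.append((tag['taglevel'], tag['name']))
        seen = grouped.setdefault(parent_name, {})
        for child in children:
            seen[child] = None  # re-adding an existing key keeps its position
    return [
        {
            'name': parent,
            'tag_child': [
                {'name': name, 'taglevel': taglevel} for taglevel, name in tags
            ],
            'taglevel': 1,
        }
        for parent, tags in grouped.items()
    ]


# Same shape as json_data in the question
sample = [
    {'title': 'Random', 'tag': [{'name': 'Fruit', 'taglevel': 1},
                                {'name': 'Apple', 'taglevel': 2}]},
    {'title': 'Other', 'tag': [{'name': 'Fruit', 'taglevel': 1},
                               {'name': 'Apple', 'taglevel': 2}]},
    {'title': 'Words', 'tag': [{'name': 'Food', 'taglevel': 2}]},
    {'title': 'That', 'tag': [{'name': 'Food', 'taglevel': 2},
                              {'name': 'Apple', 'taglevel': 2}]},
]
nested = nest_tags(sample)
print(json.dumps(nested, indent=4, sort_keys=True))
```

With the sample data this keeps "Fruit" before "NoLevel_1" and "Food" before "Apple", i.e. first-seen order, at the cost of being slightly less direct than a set.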
You could use next() and a list comprehension for the parent and child creation; however, it iterates the tags twice and may not be as "readable":
parent = next((tag for tag in item['tag'] if tag['taglevel'] == 1), {'name': 'NoLevel_1'})
children = [(tag['taglevel'], tag['name']) for tag in item['tag'] if tag['taglevel'] != 1]
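To see what those two lines produce on their own, here is a small self-contained sketch that wraps them in a helper. The name split_tags, the fallback_name parameter, and the two sample items are assumptions for illustration only.

```python
def split_tags(item, fallback_name='NoLevel_1'):
    """Return (parent, children) for a single item of json_data."""
    # next() returns the first level-1 tag, or the fallback dict if there is none
    parent = next(
        (tag for tag in item['tag'] if tag['taglevel'] == 1),
        {'name': fallback_name},
    )
    # Second pass over the tags: everything else becomes a hashable tuple
    children = [
        (tag['taglevel'], tag['name'])
        for tag in item['tag']
        if tag['taglevel'] != 1
    ]
    return parent, children


with_parent = {'title': 'Random', 'tag': [{'name': 'Fruit', 'taglevel': 1},
                                          {'name': 'Apple', 'taglevel': 2}]}
without_parent = {'title': 'That', 'tag': [{'name': 'Food', 'taglevel': 2},
                                           {'name': 'Apple', 'taglevel': 2}]}

print(split_tags(with_parent))
print(split_tags(without_parent))
```

The second argument to next() is what makes the "no level-1 tag" case fall back to NoLevel_1 instead of raising StopIteration.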
Wow, thanks. A much more efficient and clean way to do it. – Ycon, Apr 18, 2017 at 13:36
"taglevel": 1 for the first level of dictionaries and "taglevel": 2 for the second level, for instance, is highly redundant. Also, you handle levels 1 and 2, but could there be more of them?