4
\$\begingroup\$

I am trying to add nesting to some flat data.

Basically, the code should work as follows:

  1. "taglevel":1 tags should become the top-level entries of the result array

  2. "taglevel":2 or higher tags should be nested in an array under their level-1 tag and not be duplicated within that array

  3. If no "taglevel":1 tag exists, add the tags to a generic "NoLevel_1" entry

The code is still clunky and I feel there is a much cleaner way to achieve this.

import json

generic = []
result = []
for i in json_data:
    if any(d['taglevel'] == 1 for d in i['tag']):
        tag_data = {}
        tag_child = []
        # Split the item's tags into the level-1 parent and its children.
        for tag in i['tag']:
            if tag['taglevel'] == 1:
                tag_data['name'] = tag['name']
                tag_data['taglevel'] = 1
            else:
                tag_child.append(tag)
        # De-duplicate the children by name.
        filtered = {tuple((k, d[k]) for k in sorted(d) if k in ['name']): d for d in tag_child}
        tag_data['tag_child'] = list(filtered.values())
        if any(d['name'] == tag_data['name'] for d in result):
            # Parent already in the result: merge children and de-duplicate again.
            for t in result:
                if t['name'] == tag_data['name']:
                    t['tag_child'] = t['tag_child'] + tag_child
                    filtered = {tuple((k, d[k]) for k in sorted(d) if k in ['name']): d for d in t['tag_child']}
                    t['tag_child'] = list(filtered.values())
        else:
            result.append(tag_data)
    else:
        # No level-1 tag: collect the tags for the generic entry.
        # Note: entries collected here are not de-duplicated.
        for tag in i['tag']:
            generic.append(tag)
tag_data = {}
tag_data['name'] = 'NoLevel_1'
tag_data['taglevel'] = 1
tag_data['tag_child'] = generic
result.append(tag_data)
print(json.dumps(result, indent=4, sort_keys=True))

The data:

json_data = [
    {
        "title": "Random",
        "tag": [
            {
                "name": "Fruit",
                "taglevel": 1
            },
            {
                "name": "Apple",
                "taglevel": 2
            }
        ]
    },
    {
        "title": "Other",
        "tag": [
            {
                "name": "Fruit",
                "taglevel": 1
            },
            {
                "name": "Apple",
                "taglevel": 2
            }
        ]
    },
    {
        "title": "Words",
        "tag": [
            {
                "name": "Food",
                "taglevel": 2
            }
        ]
    },
    {
        "title": "That",
        "tag": [
            {
                "name": "Food",
                "taglevel": 2
            },
            {
                "name": "Apple",
                "taglevel": 2
            }
        ]
    }
]

Desired result:

[
    {
        "name": "Fruit",
        "tag_child": [
            {
                "name": "Apple",
                "taglevel": 2
            }
        ],
        "taglevel": 1
    },
    {
        "name": "NoLevel_1",
        "tag_child": [
            {
                "name": "Food",
                "taglevel": 2
            },
            {
                "name": "Apple",
                "taglevel": 2
            }
        ],
        "taglevel": 1
    }
]
asked Apr 18, 2017 at 9:11
\$\endgroup\$
4
  • \$\begingroup\$ Do you need to keep all that redundant information in your output or is it flexible and you can change it? \$\endgroup\$ Commented Apr 18, 2017 at 19:22
  • \$\begingroup\$ Which parts are considered redundant? The desired result is basically how I want it. As a bonus, it would be great if I could also sort within tag_child (e.g. alphabetically, or by date if I had a creation date). \$\endgroup\$ Commented Apr 18, 2017 at 21:34
  • \$\begingroup\$ I feel like keeping "taglevel": 1 for the first level of dictionaries and "taglevel": 2 for the second level, for instance, is highly redundant. Also, you handle levels 1 and 2, but could there be more of them? \$\endgroup\$ Commented Apr 18, 2017 at 21:41
  • \$\begingroup\$ No, taglevel 1 will always need to be the first level. If anything, it may be good to have an easy way to exclude items below taglevel 3, for example. \$\endgroup\$ Commented Apr 18, 2017 at 21:54

1 Answer

4
\$\begingroup\$

Well, you may or may not consider this a simplification, but this is how I would approach it.

You could use a set to handle the duplication part.

You can't store a dict in a set, though, because dicts are not hashable, so we need to create a tuple from the values. (It looks like you're doing something similar with filtered.)
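To illustrate why the conversion is needed, here is a minimal sketch. The `tags` list and the `as_key` helper are made up for this example and are not part of the code under review.

```python
# Hypothetical sample tags; the second "Apple" is a duplicate.
tags = [
    {"name": "Apple", "taglevel": 2},
    {"name": "Food", "taglevel": 2},
    {"name": "Apple", "taglevel": 2},
]


def as_key(tag):
    """Turn a tag dict into a hashable (taglevel, name) tuple."""
    return (tag["taglevel"], tag["name"])


# A dict cannot be placed in a set because it is not hashable.
try:
    {tags[0]}
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# Tuples are hashable, so the set silently drops the duplicate "Apple".
unique = {as_key(tag) for tag in tags}
```

The set ends up with two entries, one per distinct (taglevel, name) pair.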

We then reformat result to get the desired final structure.

from collections import defaultdict

# Map each parent name to the set of its (taglevel, name) children.
result = defaultdict(set)
for item in json_data:
    # Fall back to the generic parent if the item has no level-1 tag.
    parent = {'name': 'NoLevel_1'}
    children = []
    for tag in item['tag']:
        if tag['taglevel'] == 1:
            parent = tag
        else:
            children.append((tag['taglevel'], tag['name']))
    result[parent['name']].update(children)

# Rebuild the nested structure; set iteration order is arbitrary.
result = [
    {
        'name': parent,
        'tag_child': [
            {'name': name, 'taglevel': taglevel} for taglevel, name in tags
        ],
        'taglevel': 1,
    } for parent, tags in result.items()
]
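Since sets have no defined order, the children may come out in a different order on each run. Below is a sketch of the same approach wrapped in a function, with explicit sorting added. The function name `nest_tags`, the `fallback` parameter, and the choice to sort alphabetically are assumptions for illustration, not something the question requires.

```python
from collections import defaultdict


def nest_tags(items, fallback="NoLevel_1"):
    """Group level-2+ tags under their level-1 parent, without duplicates."""
    grouped = defaultdict(set)
    for item in items:
        parent_name = fallback
        children = []
        for tag in item["tag"]:
            if tag["taglevel"] == 1:
                parent_name = tag["name"]
            else:
                children.append((tag["taglevel"], tag["name"]))
        grouped[parent_name].update(children)
    # Sort parents by name and children by (taglevel, name) for stable output.
    return [
        {
            "name": parent_name,
            "taglevel": 1,
            "tag_child": [
                {"name": name, "taglevel": level}
                for level, name in sorted(grouped[parent_name])
            ],
        }
        for parent_name in sorted(grouped)
    ]


# Shortened sample data in the same shape as json_data.
sample = [
    {"title": "Random", "tag": [{"name": "Fruit", "taglevel": 1},
                                {"name": "Apple", "taglevel": 2}]},
    {"title": "Words", "tag": [{"name": "Food", "taglevel": 2}]},
    {"title": "That", "tag": [{"name": "Food", "taglevel": 2},
                              {"name": "Apple", "taglevel": 2}]},
]
nested = nest_tags(sample)
```

Sorting the tuples orders children by taglevel first and name second, which may or may not be the order you want; a `key` argument to `sorted` would let you change that.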

You could use next() and a list comprehension to create the parent and children instead; however, that iterates over the tags twice and may not be as "readable":

parent = next((tag for tag in item['tag'] if tag['taglevel'] == 1), {'name': 'NoLevel_1'})
children = [(tag['taglevel'], tag['name']) for tag in item['tag'] if tag['taglevel'] != 1]
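As a small self-contained check of how the `next()` default behaves, the two lines can be wrapped in a helper. The name `split_tags` and the `fallback` parameter are hypothetical and only used for this sketch.

```python
def split_tags(tags, fallback="NoLevel_1"):
    """Return (parent, children) for one item's tag list."""
    # next() returns the first level-1 tag, or the default dict if there is none.
    parent = next((tag for tag in tags if tag["taglevel"] == 1), {"name": fallback})
    children = [(tag["taglevel"], tag["name"]) for tag in tags if tag["taglevel"] != 1]
    return parent, children


with_parent = split_tags([{"name": "Fruit", "taglevel": 1},
                          {"name": "Apple", "taglevel": 2}])
without_parent = split_tags([{"name": "Food", "taglevel": 2}])
```

One difference worth knowing: if an item ever had more than one level-1 tag, `next()` picks the first one, whereas the loop version keeps the last.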
answered Apr 18, 2017 at 10:35
\$\endgroup\$
2
  • \$\begingroup\$ You could use a frozenset, which can be used as a dictionary key. \$\endgroup\$ Commented Apr 18, 2017 at 11:36
  • \$\begingroup\$ Wow- thanks. A much more efficient and clean way to do it. \$\endgroup\$ Commented Apr 18, 2017 at 13:36
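
The frozenset suggestion from the comments could look roughly like the sketch below. It assumes every value in a tag dict is hashable (strings and ints here), and the sample `tags` list is invented for the example.

```python
# Hypothetical sample tags; the first two have the same content.
tags = [
    {"name": "Apple", "taglevel": 2},
    {"taglevel": 2, "name": "Apple"},  # same content, different key order
    {"name": "Food", "taglevel": 2},
]

# frozenset(tag.items()) is hashable and ignores key order, so it can
# serve as a dictionary key that identifies a tag by its full content.
unique = {frozenset(tag.items()): tag for tag in tags}
deduplicated = list(unique.values())
```

Compared with the tuple approach, this keeps the original dicts as values, so no rebuilding step is needed afterwards.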
