I found some code on my computer that I wrote a while ago. It is based on an exercise from the O'Reilly book Programming the Semantic Web. There is a class that stores RDF triples, that is, data in the form subject-predicate-object:
    class SimpleGraph:
        def __init__(self):
            self._spo = {}
            self._pos = {}
            self._osp = {}

        def add(self, (s, p, o)):
            # implementation details
            pass

        def remove(self, (s, p, o)):
            # implementation details
            pass
The attributes `_spo`, `_pos`, and `_osp` hold different permutations of subject, predicate, and object for performance reasons. The underlying data structure is a dictionary of dictionaries of sets, like `{'subject': {'predicate': set([object])}}` or `{'object': {'subject': set([predicate])}}`. The class also has a method that yields the triples matching a query given as a tuple. If an element of the tuple is `None`, it acts as a wildcard.
    def triples(self, (s, p, o)):
        # check which terms are present
        try:
            if s != None:
                if p != None:
                    # s p o
                    if o != None:
                        if o in self._spo[s][p]:
                            yield (s, p, o)
                    # s p _
                    else:
                        for ro in self._spo[s][p]:
                            yield (s, p, ro)
                else:
                    # s _ o
                    if o != None:
                        for rp in self._osp[o][s]:
                            yield (s, rp, o)
                    # s _ _
                    else:
                        for rp, oset in self._spo[s].items():
                            for ro in oset:
                                yield (s, rp, ro)
            else:
                if p != None:
                    # _ p o
                    if o != None:
                        for rs in self._pos[p][o]:
                            yield (rs, p, o)
                    # _ p _
                    else:
                        for ro, sset in self._pos[p].items():
                            for rs in sset:
                                yield (rs, p, ro)
                else:
                    # _ _ o
                    if o != None:
                        for rs, pset in self._osp[o].items():
                            for rp in pset:
                                yield (rs, rp, o)
                    # _ _ _
                    else:
                        for rs, pset in self._spo.items():
                            for rp, oset in pset.items():
                                for ro in oset:
                                    yield (rs, rp, ro)
        except KeyError:
            pass
As you can see, this code is huge, and I feel it could be more terse and elegant. I suspect a dict could be used here, with the tuple (s, p, o) matched against a key in that dict to select what gets yielded, but I have no idea how to implement it.
How can I simplify this huge if-else structure?
2 Answers
Set `args = (int(s != None), int(p != None), int(o != None))` [or `args = ''.join(map(str, (int(s != None), int(p != None), int(o != None))))`]. Now `args` will be `(1, 0, 1)` [or `'101'`] if `p` is `None` and the other two aren't. Also, instead of using for loops you can use generator expressions to make the code more concise.
    def triples(self, (s, p, o)):
        try:
            args = (int(s != None), int(p != None), int(o != None))
            if args == (1, 1, 1):
                #if o in self._spo[s][p]: #See sindikat's comment below
                #    yield (s, p, o)
                return iter([(s, p, o)] if o in self._spo[s][p] else [])
            if args == (1, 1, 0):
                return ((s, p, ro) for ro in self._spo[s][p])
            if args == (1, 0, 1):
                return ((s, rp, o) for rp in self._osp[o][s])
            if args == (1, 0, 0):
                return ((s, rp, ro) for rp, oset in self._spo[s].items() for ro in oset)
            if args == (0, 1, 1):
                return ((rs, p, o) for rs in self._pos[p][o])
            if args == (0, 1, 0):
                return ((rs, p, ro) for ro, sset in self._pos[p].items() for rs in sset)
            if args == (0, 0, 1):
                return ((rs, rp, o) for rs, pset in self._osp[o].items() for rp in pset)
            if args == (0, 0, 0):
                return ((rs, rp, ro) for rs, pset in self._spo.items() for rp, oset in pset.items() for ro in oset)
        except KeyError:
            # Unknown key: return an empty iterator rather than None,
            # so callers can still loop over the result.
            return iter([])
Combining this with Winston's suggestion, you get the following definition:
    def triples(self, (s, p, o)):
        try:
            args = (int(s != None), int(p != None), int(o != None))
            if args in [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0)]:
                lookat = self._spo
                a1, a2, a3 = s, p, o
                invperm = [0, 1, 2]
            elif args in [(0, 1, 1), (0, 1, 0)]:
                lookat = self._pos
                a1, a2, a3 = p, o, s
                invperm = [2, 0, 1]
            else:  # (1, 0, 1) or (0, 0, 1)
                lookat = self._osp
                a1, a2, a3 = o, s, p
                invperm = [1, 2, 0]
            # invperm maps a tuple in index order back to (s, p, o) order
            permute = lambda x, perm: (x[perm[0]], x[perm[1]], x[perm[2]])
            if sum(args) == 3:
                #if a3 in lookat[a1][a2]: #See sindikat's comment below
                #    yield permute((a1, a2, a3), invperm)
                return iter([permute((a1, a2, a3), invperm)] if a3 in lookat[a1][a2] else [])
            if sum(args) == 2:
                return (permute((a1, a2, ra3), invperm) for ra3 in lookat[a1][a2])
            if sum(args) == 1:
                return (permute((a1, ra2, ra3), invperm) for ra2, a3set in lookat[a1].items() for ra3 in a3set)
            if sum(args) == 0:
                # use the loop variables here; a1, a2, a3 are all None in this case
                return (permute((ra1, ra2, ra3), invperm) for ra1, a2set in lookat.items() for ra2, a3set in a2set.items() for ra3 in a3set)
        except KeyError:
            # Unknown key: return an empty iterator rather than None.
            return iter([])
Although this is slightly longer, it is easier to maintain: if you decide that a different dictionary should be consulted for some combination of inputs (for better performance), only the index-selection block at the top of the second definition needs to change.
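One way to take this maintainability argument further is to move the choice of index into a lookup table keyed on the wildcard pattern, which is close to the dict-based dispatch the question asks about. The following is only a sketch under stated assumptions: it targets Python 3 (which no longer accepts tuple parameters such as `(s, p, o)` in a signature), the body of `add` is an assumed implementation because the question elides it, and the names `_PLAN` and `_walk` are hypothetical.

```python
class SimpleGraph:
    # Wildcard pattern (s given?, p given?, o given?) ->
    # (index attribute, positions of that index's key levels in (s, p, o)).
    # _PLAN is a hypothetical name, not part of the original class.
    _PLAN = {
        (True, True, True): ("_spo", (0, 1, 2)),
        (True, True, False): ("_spo", (0, 1, 2)),
        (True, False, False): ("_spo", (0, 1, 2)),
        (False, False, False): ("_spo", (0, 1, 2)),
        (False, True, True): ("_pos", (1, 2, 0)),
        (False, True, False): ("_pos", (1, 2, 0)),
        (True, False, True): ("_osp", (2, 0, 1)),
        (False, False, True): ("_osp", (2, 0, 1)),
    }

    def __init__(self):
        self._spo, self._pos, self._osp = {}, {}, {}

    def add(self, triple):
        # Assumed implementation: store the triple in all three indexes.
        s, p, o = triple
        for index, (a, b, c) in ((self._spo, (s, p, o)),
                                 (self._pos, (p, o, s)),
                                 (self._osp, (o, s, p))):
            index.setdefault(a, {}).setdefault(b, set()).add(c)

    def triples(self, pattern):
        name, order = self._PLAN[tuple(t is not None for t in pattern)]
        keys = [pattern[i] for i in order]  # query terms in index order
        for found in self._walk(getattr(self, name), keys):
            result = [None] * 3
            for position, value in zip(order, found):
                result[position] = value  # back to (s, p, o) order
            yield tuple(result)

    @staticmethod
    def _walk(index, keys):
        # Yield every (k1, k2, k3) path through the nested index that
        # agrees with keys; None matches anything, unknown keys match nothing.
        k1, k2, k3 = keys
        level1 = index.items() if k1 is None else [(k1, index.get(k1, {}))]
        for a, inner in level1:
            level2 = inner.items() if k2 is None else [(k2, inner.get(k2, set()))]
            for b, leaves in level2:
                for c in (leaves if k3 is None else leaves & {k3}):
                    yield (a, b, c)


g = SimpleGraph()
g.add(("alice", "knows", "bob"))
g.add(("alice", "knows", "carol"))
g.add(("bob", "likes", "alice"))
print(sorted(g.triples(("alice", "knows", None))))
```

With this layout, switching a pattern to a different index is a one-line change in `_PLAN`, and missing keys are handled with `dict.get` instead of a blanket `except KeyError`.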
Comments:
- Using `yield` and `return` in a function in Python 2.7 throws `SyntaxError: 'return' with argument inside generator`. – Mirzhan Irkegulov, Jan 8, 2013 at 15:37
- @sindikat Good catch. Edited to provide a workaround. – Prasanth S, Jan 9, 2013 at 3:13
You could gain some simplicity by breaking the task into two parts.
Firstly, inspect the `None`-ness of the parameters to figure out which dictionary to look into. Store the results in local variables. So something like:
    if (should use _spo):
        data = self._spo
        order = [0,1,2]
    elif (should use _pos):
        data = self._pos
        order = [1,2,0]
    else:
        ...
Secondly, use `order` and `data` to actually look up the data.
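The two steps might be filled in roughly as follows. This is a sketch rather than a definitive implementation: it is written as standalone Python 3 functions instead of a method, the names `find` and `lookup` and the `graph` dict holding the three indexes are hypothetical, and the conditions chosen for step one are just one assumption about when each index "should" be used.

```python
def find(graph, s, p, o):
    """Yield all (s, p, o) triples matching the query; None is a wildcard."""
    query = (s, p, o)
    # Step 1: pick the index whose leading key levels are the known terms.
    if s is None and p is not None:          # _ p o  and  _ p _
        data, order = graph["pos"], (1, 2, 0)
    elif p is None and o is not None:        # s _ o  and  _ _ o
        data, order = graph["osp"], (2, 0, 1)
    else:                                    # s p o, s p _, s _ _, _ _ _
        data, order = graph["spo"], (0, 1, 2)
    # Step 2: one generic lookup, then restore (s, p, o) order.
    keys = [query[i] for i in order]
    for found in lookup(data, keys):
        yield tuple(found[order.index(i)] for i in range(3))


def lookup(node, keys):
    """Yield key paths through the nested dicts/sets that match keys."""
    if not keys:
        yield ()
        return
    first, rest = keys[0], keys[1:]
    if first is None:
        candidates = list(node)              # every dict key or set member
    else:
        candidates = [first] if first in node else []
    for key in candidates:
        # The innermost level is a set, so there is nothing to descend into.
        child = node[key] if rest else None
        for tail in lookup(child, rest):
            yield (key,) + tail


# A hand-built graph containing the single triple (alice, knows, bob).
graph = {
    "spo": {"alice": {"knows": {"bob"}}},
    "pos": {"knows": {"bob": {"alice"}}},
    "osp": {"bob": {"alice": {"knows"}}},
}
print(list(find(graph, "alice", None, "bob")))
```

Because `lookup` tests membership before indexing, a query for an unknown term simply yields nothing and no `KeyError` handling is needed.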