I have three sets:
s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])] # true, 16 and 14
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])] # false
I want a function that will return True if every set in the list intersects with at least one other set in the list. Is there a built-in for this or a simple list comprehension?
8 Answers 8
all(any(a & b for a in s if a is not b) for b in s)
8 Comments
bool() call is redundant.any() performed automatic boolean conversion. I removed the bool() call.Here's a very simple solution that's very efficient for large inputs:
def g(s):
import collections
count = collections.defaultdict(int)
for a in s:
for x in a:
count[x] += 1
return all(any(count[x] > 1 for x in a) for a in s)
4 Comments
count.setdefault(x, 0) += 1 instead of the if/else statement.count.setdefault(x, 0) isn't an lvalue.collections.Counter, or count= collections.defaultdict(int) and count[x]+=1defaultdict. I've simplified my answer accordingly.It's a little verbose but I think it's a pretty efficient solution. It takes advantage of the fact that when two sets intersect, we can mark them both as connected. It does this by keeping a list of flags as long as the list of sets. when set i and set j intersect, it sets the flag for both of them. It then loops over the list of sets and only tries to find a intersection for sets that haven't already been intersected. After reading the comments, I think this is what @Victor was talking about.
s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])] # true, 16 and 14
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])] # false
def connected(sets):
L = len(sets)
if not L: return True
if L == 1: return False
passed = [False] * L
i = 0
while True:
while passed[i]:
i += 1
if i == L:
return True
for j, s in enumerate(sets):
if j == i: continue
if sets[i] & s:
passed[i] = passed[j] = True
break
else:
return False
print connected(s0)
print connected(s1)
I decided that an empty list of sets is connected (If you produce an element of the list, I can produce an element that it intersects ;). A list with only one element is dis-connected trivially. It's one line to change in either case if you disagree.
1 Comment
Here's a more efficient (if much more complicated) solution, that performs a linear number of intersections and a number of unions of order O( n*log(n) ), where n is the length of s:
def f(s):
import math
j = int(math.log(len(s) - 1, 2)) + 1
unions = [set()] * (j + 1)
for i, a in enumerate(s):
unions[:j] = [set.union(set(), *s[i+2**k:i+2**(k+1)]) for k in range(j)]
if not (a & set.union(*unions)):
return False
j = int(math.log(i ^ (i + 1), 2))
unions[j] = set.union(a, *unions[:j])
return True
Note that this solution only works on Python>= 2.6.
3 Comments
any() call can shortcut often, whereas this solution works well in all cases. This solution performs better (9 vs 24 seconds) than the first on this input: [set((i,)) for i in range(N)] + [set(range(N))] with N = 10000, and I suspect would perform better still with even larger N.As usual I'd like to give the inevitable itertools solution ;-)
from itertools import combinations, groupby
from operator import itemgetter
def any_intersects( sets ):
# we are doing stuff with combinations of sets
combined = combinations(sets,2)
# group these combinations by their first set
grouped = (g for k,g in groupby( combined, key=itemgetter(0)))
# are any intersections in each group
intersected = (any((a&b) for a,b in group) for group in grouped)
return all( intersected )
s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])]
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])]
print any_intersects( s0 ) # True
print any_intersects( s1 ) # False
This is really lazy and will only do the intersections that are required. It can also be a very confusing and unreadable oneliner ;-)
3 Comments
To answer your question, no, there isn't a built-in or simple list comprehension that does what you want. Here's another itertools based solution that is very efficient -- surprisingly about twice as fast as @THC4k's itertools answer using groupby() in timing tests using your sample input. It could probably be optimized a bit further, but is very readable as presented. Like @AaronMcSmooth, I arbitrarily decided what to return when there are no or is only one set in the input list.
from itertools import combinations
def all_intersect(sets):
N = len(sets)
if not N: return True
if N == 1: return False
intersected = [False] * N
for i,j in combinations(xrange(N), 2):
if not intersected[i] or not intersected[j]:
if sets[i] & sets[j]:
intersected[i] = intersected[j] = True
return all(intersected)
Comments
This strategy isn't likely to be as efficient as @Victor's suggestion, but might be more efficient than jchl's answer due to increased use of set arithmetic (union).
s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])]
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])]
def freeze(list_of_sets):
"""Transform a list of sets into a frozenset of frozensets."""
return frozenset(frozenset(set_) for set_ in list_of_sets)
def all_sets_have_relatives(set_of_sets):
"""Check if all sets have another set that they intersect with.
>>> all_sets_have_relatives(s0) # true, 16 and 14
True
>>> all_sets_have_relatives(s1) # false
False
"""
set_of_sets = freeze(set_of_sets)
def has_relative(set_):
return set_ & frozenset.union(*(set_of_sets - set((set_,))))
return all(has_relative(set) for set in set_of_sets)
5 Comments
This may give better performance depending on the distribution of the sets.
def all_intersect(s):
count = 0
for x, a in enumerate(s):
for y, b in enumerate(s):
if a & b and x!=y:
count += 1
break
return count == len(s)
4 Comments
break instead of continue? 2. Are you sure about count = -1?? 3. Lose the 2nd line and replace the last 3 lines by return count == len(s). 4. Wouldn't x != y and a & b be better? 5. You don't exit early if a set has no buddies -- what is the basis for "may give better performance"? Better than what?count becomes the number of non-empty pair-wise set intersections, less 1. This is nothing like what the OP wants. By ACCIDENT it gives the correct answers for the OP's two test cases. Try this one: s2 = [set([16, 9, 2, 10]), set([16, 22, 14, 15]), set([14, 2])] ... it should return True but your count is 6-1 == 5 and so your code returns False.a & b over x != y because the latter is more commonly true. Is this bad? 5. Yes, that's right. This may give better performance when compared to the ones doing the intersection check with all the sets.x != y is to avoid calculating a & b for a is b. the point is that it's much cheaper to check x != y then performing a set intersection.