6

I have three sets:

s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])] # true, 16 and 14
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])] # false

I want a function that will return True if every set in the list intersects with at least one other set in the list. Is there a built-in for this or a simple list comprehension?

asked Oct 1, 2010 at 8:05
0

8 Answers 8

14
all(any(a & b for a in s if a is not b) for b in s)
answered Oct 1, 2010 at 8:32
Sign up to request clarification or add additional context in comments.

8 Comments

Elegant, but I would prefer 2 straight loops and a counter to avoid compare each two items 2 times, it would speed things up at least by a factor of two
The bool() call is redundant.
@Victor Ionescu, why not throw up an answer with your preferred method. It'd be worth at least ten rep :)
@spoon16 : I tried to remove unnecessary comparisons. Of course, with all the additional code, I am not sure if this yields an optimized version over a quite elegant solution by jchl. I will probably try to time both over a large set sequence and see what it results in.
@KennyTM: Thanks, I didn't realize any() performed automatic boolean conversion. I removed the bool() call.
|
5

Here's a very simple solution that's very efficient for large inputs:

def g(s):
 import collections
 count = collections.defaultdict(int)
 for a in s:
 for x in a:
 count[x] += 1
 return all(any(count[x] > 1 for x in a) for a in s)
answered Oct 1, 2010 at 12:50

4 Comments

you can just use count.setdefault(x, 0) += 1 instead of the if/else statement.
No you can't. count.setdefault(x, 0) isn't an lvalue.
+1 because it circumvents the "obvious" set intersection stuff and simplifies the problem. Unfortunately, I don't think you will get enough upvotes. BTW, either collections.Counter, or count= collections.defaultdict(int) and count[x]+=1
Thanks for the hit about defaultdict. I've simplified my answer accordingly.
2

It's a little verbose but I think it's a pretty efficient solution. It takes advantage of the fact that when two sets intersect, we can mark them both as connected. It does this by keeping a list of flags as long as the list of sets. when set i and set j intersect, it sets the flag for both of them. It then loops over the list of sets and only tries to find a intersection for sets that haven't already been intersected. After reading the comments, I think this is what @Victor was talking about.

s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])] # true, 16 and 14
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])] # false
def connected(sets):
 L = len(sets)
 if not L: return True
 if L == 1: return False
 passed = [False] * L
 i = 0
 while True:
 while passed[i]: 
 i += 1
 if i == L: 
 return True
 for j, s in enumerate(sets):
 if j == i: continue
 if sets[i] & s: 
 passed[i] = passed[j] = True
 break
 else:
 return False
print connected(s0)
print connected(s1)

I decided that an empty list of sets is connected (If you produce an element of the list, I can produce an element that it intersects ;). A list with only one element is dis-connected trivially. It's one line to change in either case if you disagree.

answered Oct 1, 2010 at 8:27

1 Comment

lots of good answers, I'm using 2.5.2 right now so I ended up going with this solution
2

Here's a more efficient (if much more complicated) solution, that performs a linear number of intersections and a number of unions of order O( n*log(n) ), where n is the length of s:

def f(s):
 import math
 j = int(math.log(len(s) - 1, 2)) + 1
 unions = [set()] * (j + 1)
 for i, a in enumerate(s):
 unions[:j] = [set.union(set(), *s[i+2**k:i+2**(k+1)]) for k in range(j)]
 if not (a & set.union(*unions)):
 return False
 j = int(math.log(i ^ (i + 1), 2))
 unions[j] = set.union(a, *unions[:j])
 return True

Note that this solution only works on Python>= 2.6.

answered Oct 1, 2010 at 11:23

3 Comments

This is significantly slower performing that your previous answer. whereas your previous answer averages 4.7 msecs, (on large random input) this one takes 250. Unions are what's killing you (with this implementation) and intuited. Theoretically appealing but expensive.
Interesting. I think it's the large constant factor due to all the Python code (compared to the previous solution that can be evaluated almost all in C). Also, my first solution will work well if the any() call can shortcut often, whereas this solution works well in all cases. This solution performs better (9 vs 24 seconds) than the first on this input: [set((i,)) for i in range(N)] + [set(range(N))] with N = 10000, and I suspect would perform better still with even larger N.
I just realized that this solution is far more complicated than necessary, and still not optimal. My third solution in much better for both reasons.
1

As usual I'd like to give the inevitable itertools solution ;-)

from itertools import combinations, groupby
from operator import itemgetter
def any_intersects( sets ):
 # we are doing stuff with combinations of sets
 combined = combinations(sets,2) 
 # group these combinations by their first set
 grouped = (g for k,g in groupby( combined, key=itemgetter(0)))
 # are any intersections in each group
 intersected = (any((a&b) for a,b in group) for group in grouped)
 return all( intersected )
s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])]
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])] 
print any_intersects( s0 ) # True
print any_intersects( s1 ) # False

This is really lazy and will only do the intersections that are required. It can also be a very confusing and unreadable oneliner ;-)

answered Oct 1, 2010 at 13:18

3 Comments

combinated? How about "combined"? ;-)
Heh, some online dictionaries list "combinated", but "combined" does sound much better ;-)
On further reflection, "combos" would probably be a more descriptive -- and therefore even better -- variable name.
1

To answer your question, no, there isn't a built-in or simple list comprehension that does what you want. Here's another itertools based solution that is very efficient -- surprisingly about twice as fast as @THC4k's itertools answer using groupby() in timing tests using your sample input. It could probably be optimized a bit further, but is very readable as presented. Like @AaronMcSmooth, I arbitrarily decided what to return when there are no or is only one set in the input list.

from itertools import combinations
def all_intersect(sets):
 N = len(sets)
 if not N: return True
 if N == 1: return False
 intersected = [False] * N
 for i,j in combinations(xrange(N), 2):
 if not intersected[i] or not intersected[j]:
 if sets[i] & sets[j]:
 intersected[i] = intersected[j] = True
 return all(intersected)
answered Oct 1, 2010 at 14:22

Comments

0

This strategy isn't likely to be as efficient as @Victor's suggestion, but might be more efficient than jchl's answer due to increased use of set arithmetic (union).

s0 = [set([16,9,2,10]), set([16,14,22,15]), set([14,7])]
s1 = [set([16,9,2,10]), set([16,14,22,15]), set([7,8])]
def freeze(list_of_sets):
 """Transform a list of sets into a frozenset of frozensets."""
 return frozenset(frozenset(set_) for set_ in list_of_sets)
def all_sets_have_relatives(set_of_sets):
 """Check if all sets have another set that they intersect with.
 >>> all_sets_have_relatives(s0) # true, 16 and 14
 True
 >>> all_sets_have_relatives(s1) # false
 False
 """
 set_of_sets = freeze(set_of_sets)
 def has_relative(set_):
 return set_ & frozenset.union(*(set_of_sets - set((set_,))))
 return all(has_relative(set) for set in set_of_sets)
answered Oct 1, 2010 at 9:30

5 Comments

@jchl: sets aren't hashable, so they can't be elements in a set. I'm not sure if there's much point in making the top-level set a frozen one, but it seemed like it might be faster than a regular set, and it suits the logic.
@jchl: Ahh, I see what you're talking about now. Looks like I had some sort of mental indigestion issue when refactoring and then posting that code. "The doctest passed, though!" Anyway, it's fixed now.
my profiling indicates that they are slightly faster. only barely though
Bad news. I just profiled this one and it takes around 2.5 seconds on the same type of large random input that @jchl's takes around 4.7 milli-seconds on. The bad news for me is that with all the effort I put into it, I only beat him by .2 milliseconds.
"O! I am slain!" Interesting.. was that with large sets, or a lot of sets, or both?
0

This may give better performance depending on the distribution of the sets.

def all_intersect(s):
 count = 0
 for x, a in enumerate(s):
 for y, b in enumerate(s):
 if a & b and x!=y:
 count += 1
 break
 return count == len(s)
answered Oct 1, 2010 at 10:44

4 Comments

1. Don't you mean break instead of continue? 2. Are you sure about count = -1?? 3. Lose the 2nd line and replace the last 3 lines by return count == len(s). 4. Wouldn't x != y and a & b be better? 5. You don't exit early if a set has no buddies -- what is the basis for "may give better performance"? Better than what?
Hello, hello, ... your code is BROKEN. count becomes the number of non-empty pair-wise set intersections, less 1. This is nothing like what the OP wants. By ACCIDENT it gives the correct answers for the OP's two test cases. Try this one: s2 = [set([16, 9, 2, 10]), set([16, 22, 14, 15]), set([14, 2])] ... it should return True but your count is 6-1 == 5 and so your code returns False.
@John, +1, thanks for pointing out all the errors. I have edited to make all the necessary corrections. 4. I preferred a & b over x != y because the latter is more commonly true. Is this bad? 5. Yes, that's right. This may give better performance when compared to the ones doing the intersection check with all the sets.
the whole point of checking if x != y is to avoid calculating a & b for a is b. the point is that it's much cheaper to check x != y then performing a set intersection.

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.