1

Trying to learn Python I encountered the following:

>>> set('spam') - set('ham')
set(['p', 's'])

Why is the result set(['p', 's'])? I mean, why is 'h' missing?

Penny Liu
18k5 gold badges89 silver badges109 bronze badges
asked Nov 29, 2014 at 16:24
2
  • Why do you expect h to be included? Commented Nov 29, 2014 at 16:30
  • I thought it is to show the differences, so from my feeling I expected h to be included. Looks like I got the concept wrong. Commented Nov 29, 2014 at 16:33

5 Answers

11

The - operator on Python sets corresponds to the difference method, which returns the members of set A that are not members of set B. So in this case, the members of "spam" that are not in "ham" are "s" and "p". Note that this operation is not commutative (that is, a - b == b - a is not always true).

You may be looking for the symmetric_difference method, or equivalently the ^ operator:

>>> set("spam") ^ set("ham")
{'h', 'p', 's'}

This operator is commutative.
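
As a quick check of the commutativity claim, here is a minimal sketch. The variable names are only illustrative, and the results are sorted before printing because sets have no guaranteed display order.

```python
a = set("spam")
b = set("ham")

# Difference depends on operand order.
left_only = a - b    # elements of a that are not in b
right_only = b - a   # elements of b that are not in a

# Symmetric difference does not depend on operand order.
sym_ab = a ^ b
sym_ba = b ^ a

print(sorted(left_only))   # ['p', 's']
print(sorted(right_only))  # ['h']
print(sym_ab == sym_ba)    # True
```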

Penny Liu
18k5 gold badges89 silver badges109 bronze badges
answered Nov 29, 2014 at 16:32

2 Comments

I believe the statement "[The ^ operator] corresponds more to the mathematical notion of set difference" is misleading. It does correspond to the mathematical notion of symmetric difference, but not to the mathematical notion of set difference.
Ah, you're correct. The mathematical notion of set difference, according to Wolfram, is actually the - operator.
3

Because that is the definition of a set difference. In plain English, it is equivalent to "what elements are in A that are not also in B?".

Note that swapping the operands makes this more obvious:

>>> set('spam') - set('ham')
{'s', 'p'}
>>> set('ham') - set('spam')
{'h'}

To get the elements that are in exactly one of the two sets, regardless of the order of the operands, you can use symmetric_difference:

>>> set('spam').symmetric_difference(set('ham'))
{'s', 'h', 'p'}
answered Nov 29, 2014 at 16:26

1 Comment

You don't need set('ham') here - set methods take any iterable, so using set('spam').symmetric_difference('ham') is fine
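
A small sketch of the point in this comment, with illustrative variable names: the method form accepts any iterable, while the ^ operator expects a set on both sides and raises TypeError when given a plain string.

```python
s = set("spam")

# The method form accepts any iterable, including a plain string.
via_method = s.symmetric_difference("ham")

# The operator form requires both operands to be sets.
try:
    s ^ "ham"
    operator_accepts_str = True
except TypeError:
    operator_accepts_str = False

print(sorted(via_method))     # ['h', 'p', 's']
print(operator_accepts_str)   # False
```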
2

There are two different operators:

  • Set difference. This is defined as the elements of A not present in B, and is written as A - B or A.difference(B).
  • Symmetric set difference. This is defined as the elements of either set not present in the other set, and is written as A ^ B or A.symmetric_difference(B).

Your code is using the former, whereas you seem to be expecting the latter.
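
To make the relationship between the two concrete, here is a minimal sketch (the names A, B and the result variables are only illustrative). It checks that each method matches its operator, and that the symmetric difference is the union of the two one-sided differences.

```python
A = set("spam")
B = set("ham")

only_in_a = A.difference(B)                   # same as A - B
only_in_b = B.difference(A)                   # same as B - A
either_not_both = A.symmetric_difference(B)   # same as A ^ B

assert only_in_a == A - B
assert either_not_both == A ^ B
# The symmetric difference combines both one-sided differences.
assert either_not_both == only_in_a | only_in_b

print(sorted(only_in_a), sorted(only_in_b), sorted(either_not_both))
# ['p', 's'] ['h'] ['h', 'p', 's']
```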

answered Nov 29, 2014 at 16:29

1

The set difference is the set of all characters in the first set that are not in the second set. 'p' and 's' appear in the first set but not in the second, so they are in the set difference. 'h' does not appear in the first set, so it is not in the set difference (regardless of whether or not it is in the second set).

answered Nov 29, 2014 at 16:28

1

You can also obtain the desired result as:

>>> (set('spam') | set('ham')) - (set('spam') & set('ham'))
set(['p', 's', 'h'])

This builds the union with | and the intersection with &, then takes the set difference between them, i.e. all elements minus the elements common to both sets.
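
As a rough check that this construction agrees with the built-in ^ operator, here is a short sketch. The helper name sym_diff_via_union is hypothetical, introduced only for this illustration.

```python
def sym_diff_via_union(a, b):
    """Elements in exactly one of a or b, built from |, & and -."""
    return (a | b) - (a & b)

pairs = [
    (set("spam"), set("ham")),
    (set("abc"), set("abc")),   # identical sets
    (set("abc"), set("xyz")),   # disjoint sets
    (set(), set("ham")),        # one empty set
]

# The construction should match ^ for every pair above.
for a, b in pairs:
    assert sym_diff_via_union(a, b) == a ^ b

print(sorted(sym_diff_via_union(set("spam"), set("ham"))))  # ['h', 'p', 's']
```

In practice a ^ b is shorter and does the same job; the longer form is mainly useful for seeing how the operations fit together.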

answered Nov 29, 2014 at 18:34
