Python - Removing duplicates from a string

def remove_duplicates(strng):
    """
    Returns a string which is the same as the argument except only the
    first occurrence of each letter is present. Upper and lower case
    letters are treated as different. Only duplicate letters are removed,
    other characters such as spaces or numbers are not changed.

    >>> remove_duplicates('apple')
    'aple'
    >>> remove_duplicates('Mississippi')
    'Misp'
    >>> remove_duplicates('The quick brown fox jumps over the lazy dog')
    'The quick brown fx jmps v t lazy dg'
    >>> remove_duplicates('121 balloons 2 u')
    '121 balons 2 u'
    """
    s = strng.split()
    return strng.replace(s[0], "")

I'm writing a function to get rid of duplicate letters, but I have been playing around for an hour and can't get anything to work. Help would be appreciated, thanks.

edited May 19, 2010 at 12:01
SilentGhost

asked May 19, 2010 at 11:50
Daniel

This looks like homework. If it is, tag it as such.
– Marcelo Cantos
Commented May 19, 2010 at 11:56

If order isn't important to you (but it looks like it is), you can use "".join(set("test")).
– badp
Commented May 19, 2010 at 20:31

3 Answers

Not the most efficient, but the most straightforward way is:

>>> s = 'The quick brown fox jumps over the lazy dog'
>>> import string
>>> n = ''
>>> for i in s:
...     if i not in string.ascii_letters:
...         n += i
...     elif i not in n:
...         n += i
...
>>> n
'The quick brown fx jmps v t lazy dg'
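
The inefficiency comes from the `i not in n` test, which rescans the result string for every character, and from the repeated `+=`. A minimal sketch of the same loop packaged as the `remove_duplicates` function from the question, assuming "letter" means ASCII letters only (as in `string.ascii_letters`). The `seen` set and the `out` list are additions for illustration and are not part of the answer above:

```python
import string


def remove_duplicates(strng):
    """Keep the first occurrence of each ASCII letter; leave other characters alone."""
    seen = set()  # letters already emitted (assumption: ASCII letters only)
    out = []      # collect characters and join once at the end
    for ch in strng:
        if ch in string.ascii_letters:
            if ch in seen:
                continue  # duplicate letter: skip it
            seen.add(ch)
        out.append(ch)  # non-letters always pass through unchanged
    return "".join(out)


print(remove_duplicates("Mississippi"))       # Misp
print(remove_duplicates("121 balloons 2 u"))  # 121 balons 2 u
```

Set membership is a constant-time check on average, so this version does one pass over the input instead of rescanning the output each time.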

edited May 19, 2010 at 12:12
Tim Pietzcker

answered May 19, 2010 at 12:01
SilentGhost

Using a list comprehension:

>>> from string import whitespace, digits
>>> s = 'The quick brown fox jumps over the lazy dog'
>>> ''.join([c for i, c in enumerate(s)
...          if c in whitespace + digits or c not in s[:i]])
'The quick brown fx jmps v t lazy dg'

answered May 19, 2010 at 20:22
stanlekub

Nice. I think you should change if c in whitespace+digits to if c not in letters (and thus, from string import letters): your solution would turn "++" into "+", and I don't think that qualifies as a letter.
– badp
Commented May 19, 2010 at 20:29

Why not? The question does not mention punctuation. In fact, I used this solution instead of string.ascii_letters (as proposed by SilentGhost) to be able to handle non-ASCII characters. I think the best option would be whitespace+digits+punctuation... but the question lacks precision :)
– stanlekub
Commented May 19, 2010 at 20:40
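
One way to reconcile the two comments, sketched under the assumption that "letter" should mean whatever `str.isalpha()` accepts on Python 3. That way non-ASCII letters are deduplicated, while punctuation, digits and whitespace pass through untouched. The name `remove_duplicate_letters` is hypothetical and does not come from either answer:

```python
def remove_duplicate_letters(text):
    """Drop repeated letters (per str.isalpha); keep every other character."""
    seen = set()
    kept = []
    for ch in text:
        if ch.isalpha():  # assumption: Unicode-aware definition of "letter"
            if ch in seen:
                continue
            seen.add(ch)
        kept.append(ch)
    return "".join(kept)


print(remove_duplicate_letters("++"))              # ++
print(remove_duplicate_letters("\u00e9t\u00e9"))   # ét
```

The escapes `\u00e9` stand for a precomposed "é". Text that stores accents as separate combining characters would behave differently, since a combining mark is not a letter by this test.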

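Try this:

def remove_duplicates(s):
    result = ""
    dic = {}  # letters that have already been emitted
    for i in s:
        if i not in dic:
            result += i
            # Only letters a-z (either case) are recorded as seen,
            # so spaces, digits and punctuation are never dropped.
            if ord('a') <= ord(i.lower()) <= ord('z'):
                dic[i] = 1
    return result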

edited May 19, 2010 at 17:31

answered May 19, 2010 at 12:07
Ahmed Kotb

To check whether a value is None, you should use an identity check (is None), not an equality check (== None).
– SilentGhost
Commented May 19, 2010 at 12:08

Use a set instead of a dict; that is what they are for.
– Dave Kirby
Commented May 19, 2010 at 12:10

@SilentGhost, more to the point, use i in dic rather than checking the return value of .get to figure out whether something is in a dict.
– Mike Graham
Commented May 19, 2010 at 16:30

@Dave A set does not guarantee that the ordering is maintained. Further, whitespace would only be counted once; the OP clearly wanted whitespace to be treated differently.
– Santa
Commented May 19, 2010 at 17:55

@santa dicts do not guarantee ordering either, so this is irrelevant. Using a dict and only testing for the presence or absence of keys is conceptually identical to using a set.
– Dave Kirby
Commented May 19, 2010 at 21:34
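
The ordering point reads differently on current Python: since Python 3.7 the language guarantees that a dict preserves insertion order, while a set still makes no such promise. A short sketch of what that allows, with a hypothetical helper name. Note that it deduplicates every character, spaces included, so it only matches the question's requirements for strings made purely of letters:

```python
def unique_chars_in_order(text):
    # dict.fromkeys keeps one key per distinct character, in first-seen order
    # (guaranteed from Python 3.7 onward).
    return "".join(dict.fromkeys(text))


print(unique_chars_in_order("Mississippi"))  # Misp
print(unique_chars_in_order("test"))         # tes
print(sorted(set("test")))                   # ['e', 's', 't']
```

The set is sorted before printing because its iteration order is not something to rely on.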
