Sort a list in Python

Question 1

My aim is to sort a list of strings where words have to be sorted alphabetically. Words starting with s should be at the start of the list (they should be sorted as well), followed by the other words.

The below function does that for me.

def mysort(words):
 mylist1 = sorted([i for i in words if i[:1] == "s"])
 mylist2 = sorted([i for i in words if i[:1] != "s"])
 list = mylist1 + mylist2
 return list

I am looking for better ways to achieve this, or if anyone can find any issues with the code above.

Question 2

Some notes about your code:

sorted([i for i in words if i[:1] == "s"]): No need to generate an intermediate list, drop those [ ] and you'll get a lazy generator expression instead.
mylist1. This my prefix looks pretty bad. Do other languages use them?
list: Red flag. Don't overshadow a built-in (at least not one so important) with a variable.
return list. You can directly write return mylist1 + mylist2, no gain in naming the value you're about to return (the name of the function should inform about that).
i[:1] == "s" -> i.startswith("s").

Now what I'd write. Using sorted with the argument key (Schwartzian transform) and taking advantage of the lexicographical order defined by tuples:

def mysort(words):
 return sorted(words, key=lambda word: (0 if word.startswith("s") else 1, word))

If you are familiar with the fact that False < True, it can be simplified a bit more:

sorted(words, key=lambda word: (not word.startswith("s"), word))

Here's another way using the abstraction partition:

words_with_s, other_words = partition(words, lambda word: word.startswith("s"))
sorted(words_with_s) + sorted(other_words)

Writing partition is left as an exercise for the reader :-)

Question 3

Dang, I was just writing up the same thing! +1 :-)

Question 4

@tobias_k: sorry ;-) I was a bit suprised no one had proposed something similar yet, this is pretty standard stuff.

Question 5

No problem, but two times the same answer within 2 minutes for a 8 hours old question... how probable is this?

Question 6

For this snippet, I would have probably written the more compact:

def mysort(words):
 return (sorted([i for i in words if i[0] == "s"]) +
 sorted([i for i in words if i[0] != "s"]))

(You're not supposed to align code like this in Python, but I tend to do it anyway.) Note that I wrote i[0] and not i[:1] - I think the former is clearer. Using str.startswith() is even better.

Also, it is generally considered bad practice to use list as a variable name.

However, your algorithm iterates the list at least three times: Once to look for the words starting with s, once to look for the words not starting with s and then finally O(n log n) times more to sort the two lists. If necessary, you can improve the algorithm to do just one pass before sorting, by populating both lists simultaneously:

def prefix_priority_sort(words, special_prefix = "s"):
 begins_with_special = []
 not_begin_with_special = []
 for word in words:
 if word.startswith(special_prefix):
 begins_with_special.append(word)
 else:
 not_begin_with_special.append(word)
 return sorted(begins_with_special) + sorted(not_begin_with_special)

However, the best way is to define your own comparator and pass it to the sorting function, like mariosangiorgio suggests. In Python 3, you need to pass in a key function, see the docs for details, or this article.

Depending on execution speed requirements, list sizes, available memory and so on, you might want to pre-allocate memory for the lists using [None] * size. In your case it is probably premature optimization, though.

Question 7

Note that i[0] will fail for empty strings, while i[:1] will not. Agreed that startswith is best.

Question 8

Your mysort needs another pair of parentheses, or a backslash before the newline.

Question 9

@tobias_k Ah, good point!

Question 10

What makes you say you're not supposed to align code like that? PEP-8 does it.

Question 11

Lastor already said what I was going to point out so I am not going to repeat that. I'll just add some other things.

I tried timing your solution with a bunch of other solutions I came up with. Among these the one with best time-memory combination should be the function mysort3 as it gave me best timing in nearly all cases. I am still looking about proper timing in Python. You can try putting in different test cases in the function tests to test the timing for yourself.

def mysort(words):
 mylist1 = sorted([i for i in words if i[:1] == "s"])
 mylist2 = sorted([i for i in words if i[:1] != "s"])
 list = mylist1 + mylist2
 return list
def mysort3(words):
 ans = []
 p = ans.append
 q = words.remove
 words.sort()
 for i in words[:]:
 if i[0] == 's':
 p(i)
 q(i)
 return ans + words
def mysort4(words):
 ans1 = []
 ans2 = []
 p = ans1.append
 q = ans2.append
 for i in words:
 if i[0] == 's':
 p(i)
 else:
 q(i)
 ans1.sort()
 ans2.sort()
 return ans1 + ans2
def mysort6(words):
 return ( sorted([i for i in words if i[:1] == "s"]) +
 sorted([i for i in words if i[:1] != "s"])
 )
if __name__ == "__main__":
 from timeit import Timer
 def test(f):
 f(['a','b','c','abcd','s','se', 'ee', 'as'])
 print Timer(lambda: test(mysort)).timeit(number = 10000)
 print Timer(lambda: test(mysort3)).timeit(number = 10000)
 print Timer(lambda: test(mysort4)).timeit(number = 10000)
 print Timer(lambda: test(mysort6)).timeit(number = 10000)

Question 12

If you want to sort the list according to your own criterion probably the best choice is to write your own comparator.

A comparator function takes as parameters two values and has to return -1 if the first one is the smaller, 0 if they are equal and +1 if the second is the bigger.

In your case you should first check wether one of the two parameters starts with 's'. If it does manage it properly, otherwise fall back to standard string ordering.

This will turn your code in a single call to the sort function plus the definition of the comparison function.

If you want an example of how comparators can be implemented have a look at the answers of this StackOVerflow Q&A

Question 13

Are you sure? IMO sorted along the key argument is on a higher level of abstraction than defining a comparator thus should be generally preferred. en.wikipedia.org/wiki/Schwartzian_transform

Question 14

@tokland Python 2 takes a comparator function, Python 3 takes a key function, if that's what you mean?

Question 15

AFAIK sort (in-place) has always taken a comparator function and sorted (not in-place) a key function, but I am not 100% sure.

Question 16

@tokland Ah, you are more or less right regarding sorted. In Python 2.7, it takes a comparator function and a key function: docs.python.org/2.7/library/…

tokland tokland 11.2k1 gold badge21 silver badges26 bronze badges · Accepted Answer · 2013-07-12 13:56:20Z

Some notes about your code:

sorted([i for i in words if i[:1] == "s"]): No need to generate an intermediate list, drop those [ ] and you'll get a lazy generator expression instead.
mylist1. This my prefix looks pretty bad. Do other languages use them?
list: Red flag. Don't overshadow a built-in (at least not one so important) with a variable.
return list. You can directly write return mylist1 + mylist2, no gain in naming the value you're about to return (the name of the function should inform about that).
i[:1] == "s" -> i.startswith("s").

Now what I'd write. Using sorted with the argument key (Schwartzian transform) and taking advantage of the lexicographical order defined by tuples:

def mysort(words):
 return sorted(words, key=lambda word: (0 if word.startswith("s") else 1, word))

If you are familiar with the fact that False < True, it can be simplified a bit more:

sorted(words, key=lambda word: (not word.startswith("s"), word))

Here's another way using the abstraction partition:

words_with_s, other_words = partition(words, lambda word: word.startswith("s"))
sorted(words_with_s) + sorted(other_words)

Writing partition is left as an exercise for the reader :-)

@tobias_k: sorry ;-) I was a bit suprised no one had proposed something similar yet, this is pretty standard stuff.
No problem, but two times the same answer within 2 minutes for a 8 hours old question... how probable is this?

Stack Exchange Network

Sort a list in Python

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Sort a list in Python

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions