19

I have been picking and probing at Swift standard libraries sort() function for its Array type. To my surprise I have noticed it performs poorly on already-sorted data.

Sorting an array of Int which is shuffled seems to be 5x faster than sorting that very same array when it is already sorted. Sorting an array of shuffled objects is about 4x faster than sorting the very same one already in sorted order (sorting object array vs Int array use different algorithms I am sure so I sorted both to eliminate bias).

These are the results:

Shuffled Int array sort time: 1.3961209654808
Shuffled ColorObject array sort time: 3.14633798599243
NOnshuffled Int array sort time: 7.34714204072952
NOnshuffled ColorObject array sort time: 10.9310839772224

For reference below is my code:

class ElapsedTimer {
 let startTime: CFAbsoluteTime
 var endTime: CFAbsoluteTime?
 init() {
 startTime = CFAbsoluteTimeGetCurrent()
 }
 func stop() -> CFAbsoluteTime {
 endTime = CFAbsoluteTimeGetCurrent()
 return duration!
 }
 var duration: CFAbsoluteTime? {
 if let endTime = endTime {
 return endTime - startTime
 } else {
 return nil
 }
 }
}
public class CountedColor {
 public private(set) var count: Int
 public private(set) var color: UIColor
 public init(color: UIColor, colorCount: Int) {
 self.count = colorCount
 self.color = color
 }
}
var distributedIntArray = [Int]()
for value in 1..<1000000 {
 distributedIntArray.append(value)
}
var distributedCountedColorArray = distributedIntArray.map{ CountedColor(color: UIColor.white, colorCount: 0ドル) }
distributedCountedColorArray.shuffle()
distributedIntArray.shuffle()
var timer = ElapsedTimer()
distributedIntArray.sort()
print("Shuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()
distributedCountedColorArray.sort{ return 0ドル.count < 1ドル.count }
print("Shuffled Color array sort time: \(timer.stop())")
timer = ElapsedTimer()
distributedIntArray.sort()
print("NOnshuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()
distributedCountedColorArray.sort{ return 0ドル.count < 1ドル.count }
print("Non shuffled Color array sort time: \(timer.stop())")

My array shuffle() method was pulled from this post. My ElapsedTimer simply wraps and uses CACurrentMediaTime() functions.

My question is why am I seeing this behavior? Especially when I am sorting the object array which should surely be using a general purpose sort. What kind of general purpose sorting algorithm is swift using? It surely can’t be one where the worst case and average case are the same like mergeSort.

Chris
4,4413 gold badges30 silver badges58 bronze badges
asked Dec 8, 2016 at 3:07
8
  • Asking "why" Swift chooses a certain implementation does not make a good objective question - rationale questions rarely do as objective supportive evidence is usually hard to find. Removing the extra cruft will make for a more tidy question. The objective question is "What sort algorithm does Swift use? (What performance can be expected (in xyz cases)?)" Commented Dec 8, 2016 at 3:10
  • 3
    I believe asking "why" to be a totally valid question. Understanding why an implementation is chosen is just as important if not more, all decisions should be justifiable. That being said I can re-word the question. Commented Dec 8, 2016 at 3:13
  • 1
    Swift is open source, so if you really want to know what sort algorithm they are using, why don't you look and see? As for your actual question, "why would" is not a good fit for Stack Overflow; it's a matter of opinion why Apple did this or that. Commented Dec 8, 2016 at 3:14
  • What is CountedColor? Commented Dec 8, 2016 at 3:15
  • @YuchenZhong let me edit the question/code snippet to include that Commented Dec 8, 2016 at 3:18

3 Answers 3

30

Swift uses Introsort. Looking at the source code we see that the chosen pivot is the first element. The wikipedia page on Introsort says:

(...), one of the critical operations is choosing the pivot: the element around which the list is partitioned. The simplest pivot selection algorithm is to take the first or the last element of the list as the pivot, causing poor behavior for the case of sorted or nearly sorted input.

Thus it is entirely predictable, given the implementation choice, that Swift's sorting performance is worst for sorted inputs.

I have built a complete benchmark for people who want to easily reproduce the OP's claims : https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/sort

For reference, the GNU ISO C++ standard library uses a median-of-3 pivot (as per the stl_algo.h header).

answered Dec 10, 2016 at 0:16
2
  • Follow up: if choosing the first item as pivot is a bad choice, what would be a better implementation? Wikipedia mention median of 3 with some caveats. Seems to me it would be harder to come up with a worst case with median of 3 versus first item though. Commented Dec 10, 2016 at 1:22
  • No longer true. 2020 ish switched to Timsort. See other answers Commented Dec 14, 2024 at 9:07
23

In the Swift 5 evolution, the IntroSort algorithm was replaced with a modified version TimSort (implemented first in 2002 by Tim Peters for Python) in the 'sort()' method: https://github.com/apple/swift/blob/master/stdlib/public/core/Sort.swift

answered Feb 12, 2020 at 11:30
1
  • Great to know! If it's like Python's timsort, it's much faster for data that's already almost sorted! That's fantastic because it simplifies a lot of common use cases. Commented Dec 14, 2024 at 9:07
5

This is a duplicate of this question it seems: Swift sorting algorithm implementation

Furthermore the reason why it performs so much better when shuffled is probably just because when its shuffled its performance is not hitting the upper bound of NlogN. Sorting the sorted array probably gets closer to that upper limit so its still all in all the same sort. But I dont know this is just theory

answered Dec 9, 2016 at 0:06
11
  • this is incorrect, the performance of swift’s sort on an already sorted array larger than a few down elements is O(n**2). On a shuffled array it comes down to O(n*log(n)). Commented Jan 5, 2017 at 17:16
  • By O(n**2) I assume you mean N squared. How can that be swifts sort is an intro sort, and introSorts worst case complexity is NlogN. Commented Jan 5, 2017 at 17:18
  • As evidenced by the OP’s question, it appears swift’s sort does not behave like a proper introsort, or else it would not be vulnerable to ordered input. Perhaps the depth threshold is set very high that it predominantly exhibits quicksort behavior. Commented Jan 5, 2017 at 17:24
  • Potentially or the pivot selection is poor. Commented Jan 5, 2017 at 17:29
  • pivot selection only changes which kind of patterned input the sort is vulnerable to. Though last I heard, swift was using the first element, which seems like an extraordinarily bad choice considering how common simple-sorted input is... Commented Jan 5, 2017 at 17:33

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.