This is follow-up on On Knuth's "Algorithm L" to generate permutations in lexicographic order, where I presented the following method to enumerate all permutations of array elements in lexicographic order:
extension Array where Element: Comparable {
/// Replaces the array by the next permutation of its elements in lexicographic
/// order.
///
/// It uses the "Algorithm L (Lexicographic permutation generation)" from
/// Donald E. Knuth, "GENERATING ALL PERMUTATIONS"
/// http://www-cs-faculty.stanford.edu/~uno/fasc2b.ps.gz
///
/// - Returns: `true` if there was a next permutation, and `false` otherwise
/// (i.e. if the array elements were in descending order).
mutating func permute() -> Bool {
// Nothing to do for empty or single-element arrays:
if count <= 1 {
return false
}
// L2: Find last j such that self[j] < self[j+1]. Terminate if no such j
// exists.
var j = count - 2
while j >= 0 && self[j] >= self[j+1] {
j -= 1
}
if j == -1 {
return false
}
// L3: Find last l such that self[j] < self[l], then exchange elements j and l:
var l = count - 1
while self[j] >= self[l] {
l -= 1
}
swap(&self[j], &self[l])
// L4: Reverse elements j+1 ... count-1:
var lo = j + 1
var hi = count - 1
while lo < hi {
swap(&self[lo], &self[hi])
lo += 1
hi -= 1
}
return true
}
}
This question is about using that method to create a Swift Sequence
with all permutations, so that one can
simply enumerate these with for .. in
loops and related techniques. My first approach is to use
the "type erasers" AnySequence
and AnyIterator
, which leads to this simple
implementation:
func permutations<T: Comparable>(of a: [T]) -> AnySequence<[T]> {
var current = a.sorted()
var done = false
return AnySequence {
return AnyIterator {
if done { return nil }
defer { done = !current.permute() }
return current
}
}
}
Example usage:
for p in permutations(of: [3, 2, 1]) {
print(p)
}
let allPerms = Array(permutations(of: ["B", "A", "C"]))
print(allPerms)
So this is easy to use, but what about performance? Here my benchmark code (on a 1.2 GHz Intel Core m5 MacBook, compiled in Release mode):
let N = 10
var count = 0
let start = Date()
for _ in permutations(of: Array(1...N)) { count += 1 }
let end = Date()
print(count, end.timeIntervalSince(start))
This takes about 1.3 seconds, i.e. it is much slower than calling permute()
on a mutable
array in a loop, which takes only 0.03 seconds (see On Knuth's "Algorithm L" to generate permutations in lexicographic order).
It gets a bit better if we implement our own sequence type instead of using the type erasers:
struct PermutationSequence<T: Comparable>: Sequence, IteratorProtocol {
private var current: [T]
private var done: Bool
init(elements: [T]) {
self.current = elements.sorted()
self.done = false
}
func makeIterator() -> PermutationSequence {
return self
}
mutating func next() -> [T]? {
if done { return nil }
let retval = current
done = !current.permute()
return retval
}
}
Benchmark:
let N = 10
var count = 0
let start = Date()
for _ in PermutationSequence(elements: Array(1...N)) { count += 1 }
let end = Date()
print(count, end.timeIntervalSince(start))
which takes about 0.8 seconds.
All feedback is welcome. In particular I am interested in making Sequence
-based
enumeration of all permutations as fast as possible.
1 Answer 1
Performance
I'm not particularly surprised that the version of permutations(of:)
which uses AnySequence
and AnyIterator
is so slow.
You're paying for the cost of both:
- Heap allocation and reference counting for both the captured array and
done
flag. - Copy-on-write upon mutating the array in the
defer
block, as a copy of the array needs to be taken in order to return it to the caller prior to the mutation.
Therefore immediately you can speed things up by simply reducing the amount of closure capture – for example if you just removed the AnySequence
wrapper and just returned an AnyIterator
:
func permutations<T: Comparable>(of a: [T]) -> AnyIterator<[T]> {
var current = a.sorted()
var done = false
return AnyIterator {
if done { return nil }
defer { done = !current.permute() }
return current
}
}
You can still use an AnyIterator
in a for-in loop due to the fact that it conforms to Sequence
itself, and just returns itself as its own iterator. Iterating through it will 'consume' the elements returned, but you get the same behaviour by wrapping it in an AnySequence
due to the fact that it's ultimately capturing the same array.
From AnySequence
to AnyIterator
improves my benchmark time (using the same measurement setup as shown in your question) from ~1.1 seconds to ~0.95 seconds.
However, as you note, we can do better than that by removing the closures entirely by simply declaring our own custom iterator type – PermutationSequence
.
struct PermutationSequence<T: Comparable>: Sequence, IteratorProtocol { private var current: [T] private var done: Bool init(elements: [T]) { self.current = elements.sorted() self.done = false } func makeIterator() -> PermutationSequence { return self } mutating func next() -> [T]? { if done { return nil } let retval = current done = !current.permute() return retval } }
This reduces my benchmark time down to ~0.75 seconds.
However, the problem with this implementation, as @dfri points out, is that we're still paying the cost of copy-on-write at each iteration due to the fact that both retval
and current
have a view onto the underlying array buffer.
The solution to this is fairly simple, as you noted, we return the array after the mutation has taken place, and just maintain a flag so we don't do a mutation on the first iteration. This could look like:
struct PermutationSequence<Element : Comparable> : Sequence, IteratorProtocol {
private var current: [Element]
private var firstIteration = true
init(startingFrom elements: [Element]) {
self.current = elements
}
init<S : Sequence>(_ elements: S) where S.Iterator.Element == Element {
self.current = elements.sorted()
}
mutating func next() -> [Element]? {
// if it's the first iteration, we simply return the current array.
if firstIteration {
firstIteration = false
return current
}
// do mutation – if the array has changed, then return it,
// else we're at the end of the sequence.
return current.permute() ? current : nil
}
}
Note that I've also:
- Removed the
makeIterator()
implementation as there's already a default implementation for this when the sequence is its own iterator. - Renamed the generic placeholder from
T
toElement
, as it's a more descriptive name for it. - Added an
init(startingFrom:)
initialiser, as I suspect it will be quite useful to allow the sequence to resume from a given permutation, rather than always starting at the first. - Removed the argument label from
init(elements:)
, as it was already clear what the initialiser does and made it take a genericSequence
input for increased flexibility.
Now we're no longer paying for the cost of an array copy at each iteration due to the fact that the iterator always has a single view onto the array's buffer (though the current
property), reducing my benchmark time down to ~0.1 seconds (the target is ~0.02 seconds for directly applying permute()
in repeat-while loop).
The key to achieving the target performance weirdly appears to be avoiding having multiple exits in the implementation of next()
.
If we refactor the iterator's next()
method like this, so that there's only one return
:
mutating func next() -> [Element]? {
var continueIterating = true
// if it's the first iteration, we avoid doing the permute() and reset the flag.
if firstIteration {
firstIteration = false
} else {
continueIterating = current.permute()
}
// if the array changed (and it isn't the first iteration), then return it,
// else we're at the end of the sequence.
return continueIterating ? current : nil
}
we are able to achieve the desired benchmark time of ~0.02 seconds. Exactly why this reduces the time so dramatically, I'm not too sure – it appears that having multiple points of exit in the method prevents the compiler from performing some important optimisation.
It also appears to be related to inlining, as when marking the next()
method with the @inline(never)
attribute, the use of an additional return
has no significant impact on the performance. I'd certainly be interested if anyone else knows more about exactly what's going on here.
Regarding your comment about making shouldContinue
(renamed to continueIterating
here) a let
constant – I could not observe a difference in performance in doing so, so I left it as a var
.
Separating the sequence and iterator
Depending on your usage, it may be useful to separate your permutation sequence implementation from its iterator in order to allow for non-destructive iteration (this isn't required for a sequence, but can be useful in places).
This would just be as simple as having the sequence keep its own constant copy of the array prior to mutation, which gets passed to the iterator on its creation.
In Swift 3.1, this can be expressed rather nicely with a nested generic type:
struct PermutationSequence<Element : Comparable> : Sequence {
// constant copy of the elements to pass to the iterator on its creation.
let elements: [Element]
init(startingFrom elements: [Element]) {
self.elements = elements
}
init<S : Sequence>(_ elements: S) where S.Iterator.Element == Element {
self.elements = elements.sorted()
}
struct Iterator : IteratorProtocol {
private var current: [Element]
private var firstIteration = true
// you should create a PermutationSequence rather than invoking this
// initialiser directly.
fileprivate init(startingFrom elements: [Element]) {
self.current = elements
}
mutating func next() -> [Element]? {
var continueIterating = true
// if it's the first iteration, we avoid doing the permute() and reset the flag
if firstIteration {
firstIteration = false
} else {
continueIterating = current.permute()
}
// if the array changed (and it isn't the first iteration), then return it,
// else we're at the end of the sequence.
return continueIterating ? current : nil
}
}
func makeIterator() -> Iterator {
return Iterator(startingFrom: elements)
}
}
This shouldn't adversely affect the performance in any way (benchmarking doesn't reveal any significant difference) – an initial copy of the array may have to be taken upon the first mutation, but that's also true for the previous version of PermutationSequence
due to the fact that the caller has a view onto the array buffer.
-
\$\begingroup\$ Thank you for this elaborate answer, providing lots of useful information! – Just one question:
Sequence
is not required to be non-destructive, correct? \$\endgroup\$Martin R– Martin R2017年03月30日 18:53:50 +00:00Commented Mar 30, 2017 at 18:53 -
\$\begingroup\$ @MartinR Happy to help! You are correct – a
Sequence
is not required to be non-destructive. \$\endgroup\$Hamish– Hamish2017年03月30日 18:55:36 +00:00Commented Mar 30, 2017 at 18:55
Explore related questions
See similar questions with these tags.
next()
inPermutationSequence
will naturally induce, on each call, COW ofretval
when callingpermute
oncurrent
, but is it necessary to keep an extra copy of "next" state? (I.e., preparingcurrent
to be that copy for nextnext()
call). As an alternative, we could letnext()
returncurrent
, and leave the choice of inducing COW to the caller, if necessary. E.g. implemented as avoiding permutation ofcurrent
on first call tonext()
, and thereafter permuting and returning it (giventrue
-return frompermute()
) for each subsequent call tonext()
. \$\endgroup\$permute()
returnArray?
rather thanBool
(overhead only for COW on return?). On another note: interestingly, the 0.8s benchmark of the iteration over thePermutationSequence
instance is comparable to the immutable implementation ofpermute()
in the other thread, where, for the most part, the main difference is the extra copy on each call topermute()
. Also, sorry for not benchmarking myself, but currently no where near my Mac, so can't test this compiled in release mode ><. \$\endgroup\$makeIterator()
implementation in yourPermutationSequence
that just returnsself
, as there's already a default implementation for that :) \$\endgroup\$shouldContinue
a constant makes it a little bit faster. \$\endgroup\$