I recently read and answered Martin R's Recursive flattening of Swift sequences and continued to play around with the code until I arrived at something that was both pretty cool and possibly an abomination. So I figured I'd come back to CR with it and ask others' opinions.
If you want a full context, you should refer to Martin's question, but I'll reiterate the essential parts here:
The function seqentialFlatten(_:_,children:_)
, takes a sequence and a function that maps an element of that sequence into a sequence of the same type, and lazily returns an AnySequence
with all elements that can be generated by repeatedly mapping elements into sequences. One use case of this is getting a list of all subviews of a view:
let someView : UIView = ...
let views = sequentialFlatten([someView], children: { 0ドル.subviews as! [UIView] })
What I've done in this approach is used a more functional style to implement this function, using map and reduce. To achieve this, I've implemented the + operator for Generators, made a generic struct to make it possible to have local lazy variables and wrapped a whole bunch of stuff into AnyGenerators.
And therein lies the problem, I feel perhaps this bit of code has introduce too many new concepts for not much gain. I'll first post the full code and then go through some of my concerns.
func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
var leftGen = lhs
var leftEmpty = false
var rightGen = rhs
var rightEmpty = false
return AnyGenerator {
if !leftEmpty, let elem = leftGen.next() {
return elem
}
leftEmpty = true
if !rightEmpty, let elem = rightGen.next() {
return elem
}
rightEmpty = true
return nil
}
}
struct Lazy<E> {
private let get: () -> E
lazy var val: E = self.get()
init(@autoclosure(escaping) _ getter: () -> E) {
get = getter
}
}
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
return AnySequence {
() -> AnyGenerator<S.Generator.Element> in
seq.map {
let firstGen = AnyGenerator([0ドル].generate())
var secondGen = Lazy(sequentialFlatten(children(0ドル), children: children).generate())
return firstGen + AnyGenerator {
secondGen.val.next()
}
}.reduce(AnyGenerator { nil }, combine: +)
}
}
let gen = sequentialFlatten([0], children: { n in n < 50 ? [2*n+1, 2*n+2] : []})
for e in gen {
print(e)
}
Right off the bat, the implementation of the + operator doesn't rub me right. It's overly verbose. At first I had it implemented as:
func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
var leftGen = lhs
var rightGen = rhs
return AnyGenerator {
leftGen.next() ?? rightGen.next()
}
}
which is worlds nicer, but was a nightmare performance-wise. The problem was that, with complicated trees, all of the generators were gone trough, because the + operator relied on the two generators both returning nil to return nil itself. I'd love to see a version of this that is more elegant without losing too much in way of performance.
I'm worried there might be something like my Lazy
struct already available and I'm reinventing the wheel.
I'm also worried about all the AnyGenerators I'm throwing around left and right, it seemed necessary, because I can't use my + operator on two different types of Generators. I have tried to rewrite the function and succeeded, but this caused Swift problems with type inference in the reduce as such:
func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element> {
var leftGen = lhs
var leftEmpty = false
var rightGen = rhs
var rightEmpty = false
return AnyGenerator {
if !leftEmpty, let elem = leftGen.next() {
return elem
}
leftEmpty = true
if !rightEmpty, let elem = rightGen.next() {
return elem
}
rightEmpty = true
return nil
}
}
Some general concerns:
- Is the performance okay? (speed and memory usage)
- Is the code Swifty enough?
1 Answer 1
That is an interesting approach. There are some simplifications and performance improvements possible. To measure the performance I have used the following simple code:
let d1 = NSDate()
let gen = sequentialFlatten([0], children: { n in n < 100_000 ? [2*n+1, 2*n+2] : []})
var c = 0
for _ in gen { c += 1 }
let d2 = NSDate()
print(c, d2.timeIntervalSinceDate(d1))
and with your sequentialFlatten
method, that takes about 3.1 sec
on a 2.3 GHz MacBook Pro (compiled in Release mode).
First note that the explicit parameter list in the closure
which is passed to AnySequence { }
is not needed, it can be
inferred:
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
return AnySequence {
seq.map {
let firstGen = AnyGenerator([0ドル].generate())
var secondGen = Lazy(sequentialFlatten(children(0ドル), children: children).generate())
return firstGen + AnyGenerator {
secondGen.val.next()
}
}.reduce(AnyGenerator { nil }, combine: +)
}
}
Next, the Lazy
wrapping seems unnecessary to me. Perhaps I am overlooking something, but it works just as well without.
This also makes one AnyGenerator
wrapping obsolete:
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
return AnySequence {
seq.map {
let firstGen = AnyGenerator([0ドル].generate())
let secondGen = sequentialFlatten(children(0ドル), children: children).generate()
return firstGen + secondGen
}.reduce(AnyGenerator { nil }, combine: +)
}
}
The execution time is now 2.5 sec.
Your more general
func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element>
operator can be used, but we have to assist the compiler with
an additional parameter list in the closure passed to seq.map { }
:
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
return AnySequence {
seq.map { elem -> AnyGenerator<S.Generator.Element> in
let firstGen = AnyGenerator([elem].generate())
let secondGen = sequentialFlatten(children(elem), children: children).generate()
return firstGen + secondGen
}.reduce(AnyGenerator { nil }, combine: +)
}
}
This allows to use GeneratorOfOne()
as the first generator
(and get rid of another AnyGenerator
wrapping):
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
return AnySequence {
seq.map { elem -> AnyGenerator<S.Generator.Element> in
let firstGen = GeneratorOfOne(elem)
let secondGen = sequentialFlatten(children(elem), children: children).generate()
return firstGen + secondGen
}.reduce(AnyGenerator { nil }, combine: +)
}
}
which reduces the execution time to 2.1 sec.
Your approach
func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
var leftGen = lhs
var rightGen = rhs
return AnyGenerator {
leftGen.next() ?? rightGen.next()
}
}
is actually invalid, as the GeneratorType
documentation states
for the next()
method:
/// - Requires: `..., and no preceding call to `self.next()`
/// has returned `nil`.
So once the left generator is exhausted, we must not call its
next()
method again. (Even if many generators tolerate that
and return nil
again.)
However, using the ideas from https://stackoverflow.com/a/37665583/1187415
and http://ericasadun.com/2016/06/06/sneaky-swift-tricks-the-fake-boolean/
the +
operator can be implemented as a sequence of
nil-coalescing operators, where the purpose of the middle closure
expression is to execute a statement if the previous expression "failed":
func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element> {
var leftGen: G1? = lhs
var rightGen = rhs
return AnyGenerator {
return leftGen?.next()
?? { _ -> G1.Element? in leftGen = nil; return nil }()
?? rightGen.next()
}
}
Interestingly, this is a bit faster: 1.9 sec.
More improvements might be possible but that is what I have so far. However, the recursive method from your answer https://codereview.stackexchange.com/a/131527/35991 is still much faster, it takes only 0.9 sec.
-
\$\begingroup\$ Wow, that's quite a performance gain. The chain of nil coalescing operators is a very interesting approach and it's cool that it performs better, but it does seem a bit quirky. As for the Lazy wrapper, I saw in the playground that the whole tree was constructed upfront without it. I'd like to see if it finishes the sequentialFlatten call faster and then starts limping behind when the looping begins. I also had a ´.lazy´ right after seq at first, but left it out in the end. Maybe that improves performance with the Lazy wrapping? Overall, a great answer, all the other points are spot on! Thanks! \$\endgroup\$overactor– overactor2016年06月11日 11:55:22 +00:00Commented Jun 11, 2016 at 11:55
-
\$\begingroup\$ @overactor: You are right, without the Lazy wrapper the entire tree is generated at once, I hadn't notice that. \$\endgroup\$Martin R– Martin R2016年06月13日 01:12:30 +00:00Commented Jun 13, 2016 at 1:12
-
2\$\begingroup\$ Do you really need
?? { _ -> G1.Element? in rightGen = nil; return nil }()
? \$\endgroup\$Tim Vermeulen– Tim Vermeulen2016年06月13日 09:04:20 +00:00Commented Jun 13, 2016 at 9:04 -
\$\begingroup\$ @TimVermeulen: You are completely right! With the same argument that I used we can assume that our generator is not called again once it has returned nil. I have updated the answer accordingly, thanks for the feedback! \$\endgroup\$Martin R– Martin R2016年06月13日 10:08:25 +00:00Commented Jun 13, 2016 at 10:08
Explore related questions
See similar questions with these tags.