3
\$\begingroup\$

I recently read and answered Martin R's Recursive flattening of Swift sequences and continued to play around with the code until I arrived at something that was both pretty cool and possibly an abomination. So I figured I'd come back to CR with it and ask others' opinions.

If you want a full context, you should refer to Martin's question, but I'll reiterate the essential parts here:

The function seqentialFlatten(_:_,children:_), takes a sequence and a function that maps an element of that sequence into a sequence of the same type, and lazily returns an AnySequence with all elements that can be generated by repeatedly mapping elements into sequences. One use case of this is getting a list of all subviews of a view:

let someView : UIView = ...
let views = sequentialFlatten([someView], children: { 0ドル.subviews as! [UIView] })

What I've done in this approach is used a more functional style to implement this function, using map and reduce. To achieve this, I've implemented the + operator for Generators, made a generic struct to make it possible to have local lazy variables and wrapped a whole bunch of stuff into AnyGenerators.

And therein lies the problem, I feel perhaps this bit of code has introduce too many new concepts for not much gain. I'll first post the full code and then go through some of my concerns.

func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
 var leftGen = lhs
 var leftEmpty = false
 var rightGen = rhs
 var rightEmpty = false
 return AnyGenerator {
 if !leftEmpty, let elem = leftGen.next() {
 return elem
 }
 leftEmpty = true
 if !rightEmpty, let elem = rightGen.next() {
 return elem
 }
 rightEmpty = true
 return nil
 }
}
struct Lazy<E> {
 private let get: () -> E
 lazy var val: E = self.get()
 init(@autoclosure(escaping) _ getter: () -> E) {
 get = getter
 }
}
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
 return AnySequence {
 () -> AnyGenerator<S.Generator.Element> in
 seq.map {
 let firstGen = AnyGenerator([0ドル].generate())
 var secondGen = Lazy(sequentialFlatten(children(0ドル), children: children).generate())
 return firstGen + AnyGenerator {
 secondGen.val.next()
 }
 }.reduce(AnyGenerator { nil }, combine: +)
 }
}
let gen = sequentialFlatten([0], children: { n in n < 50 ? [2*n+1, 2*n+2] : []})
for e in gen {
 print(e)
}

Right off the bat, the implementation of the + operator doesn't rub me right. It's overly verbose. At first I had it implemented as:

func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
 var leftGen = lhs
 var rightGen = rhs
 return AnyGenerator {
 leftGen.next() ?? rightGen.next()
 }
}

which is worlds nicer, but was a nightmare performance-wise. The problem was that, with complicated trees, all of the generators were gone trough, because the + operator relied on the two generators both returning nil to return nil itself. I'd love to see a version of this that is more elegant without losing too much in way of performance.


I'm worried there might be something like my Lazy struct already available and I'm reinventing the wheel.


I'm also worried about all the AnyGenerators I'm throwing around left and right, it seemed necessary, because I can't use my + operator on two different types of Generators. I have tried to rewrite the function and succeeded, but this caused Swift problems with type inference in the reduce as such:

func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element> {
 var leftGen = lhs
 var leftEmpty = false
 var rightGen = rhs
 var rightEmpty = false
 return AnyGenerator {
 if !leftEmpty, let elem = leftGen.next() {
 return elem
 }
 leftEmpty = true
 if !rightEmpty, let elem = rightGen.next() {
 return elem
 }
 rightEmpty = true
 return nil
 }
}

Some general concerns:

  • Is the performance okay? (speed and memory usage)
  • Is the code Swifty enough?
asked Jun 10, 2016 at 11:27
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

That is an interesting approach. There are some simplifications and performance improvements possible. To measure the performance I have used the following simple code:

let d1 = NSDate()
let gen = sequentialFlatten([0], children: { n in n < 100_000 ? [2*n+1, 2*n+2] : []})
var c = 0
for _ in gen { c += 1 }
let d2 = NSDate()
print(c, d2.timeIntervalSinceDate(d1))

and with your sequentialFlatten method, that takes about 3.1 sec on a 2.3 GHz MacBook Pro (compiled in Release mode).

First note that the explicit parameter list in the closure which is passed to AnySequence { } is not needed, it can be inferred:

func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
 return AnySequence {
 seq.map {
 let firstGen = AnyGenerator([0ドル].generate())
 var secondGen = Lazy(sequentialFlatten(children(0ドル), children: children).generate())
 return firstGen + AnyGenerator {
 secondGen.val.next()
 }
 }.reduce(AnyGenerator { nil }, combine: +)
 }
}

Next, the Lazy wrapping seems unnecessary to me. Perhaps I am overlooking something, but it works just as well without. This also makes one AnyGenerator wrapping obsolete:

func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
 return AnySequence {
 seq.map {
 let firstGen = AnyGenerator([0ドル].generate())
 let secondGen = sequentialFlatten(children(0ドル), children: children).generate()
 return firstGen + secondGen
 }.reduce(AnyGenerator { nil }, combine: +)
 }
}

The execution time is now 2.5 sec.

Your more general

func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element>

operator can be used, but we have to assist the compiler with an additional parameter list in the closure passed to seq.map { }:

func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
 return AnySequence {
 seq.map { elem -> AnyGenerator<S.Generator.Element> in
 let firstGen = AnyGenerator([elem].generate())
 let secondGen = sequentialFlatten(children(elem), children: children).generate()
 return firstGen + secondGen
 }.reduce(AnyGenerator { nil }, combine: +)
 }
}

This allows to use GeneratorOfOne() as the first generator (and get rid of another AnyGenerator wrapping):

func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
 return AnySequence {
 seq.map { elem -> AnyGenerator<S.Generator.Element> in
 let firstGen = GeneratorOfOne(elem)
 let secondGen = sequentialFlatten(children(elem), children: children).generate()
 return firstGen + secondGen
 }.reduce(AnyGenerator { nil }, combine: +)
 }
}

which reduces the execution time to 2.1 sec.

Your approach

func +<G: GeneratorType>(lhs: G, rhs: G) -> AnyGenerator<G.Element> {
 var leftGen = lhs
 var rightGen = rhs
 return AnyGenerator {
 leftGen.next() ?? rightGen.next()
 }
}

is actually invalid, as the GeneratorType documentation states for the next() method:

/// - Requires: `..., and no preceding call to `self.next()`
/// has returned `nil`.

So once the left generator is exhausted, we must not call its next() method again. (Even if many generators tolerate that
and return nil again.)

However, using the ideas from https://stackoverflow.com/a/37665583/1187415 and http://ericasadun.com/2016/06/06/sneaky-swift-tricks-the-fake-boolean/ the + operator can be implemented as a sequence of nil-coalescing operators, where the purpose of the middle closure expression is to execute a statement if the previous expression "failed":

func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element> {
 var leftGen: G1? = lhs
 var rightGen = rhs
 return AnyGenerator {
 return leftGen?.next() 
 ?? { _ -> G1.Element? in leftGen = nil; return nil }()
 ?? rightGen.next()
 }
}

Interestingly, this is a bit faster: 1.9 sec.

More improvements might be possible but that is what I have so far. However, the recursive method from your answer https://codereview.stackexchange.com/a/131527/35991 is still much faster, it takes only 0.9 sec.

answered Jun 11, 2016 at 9:02
\$\endgroup\$
4
  • \$\begingroup\$ Wow, that's quite a performance gain. The chain of nil coalescing operators is a very interesting approach and it's cool that it performs better, but it does seem a bit quirky. As for the Lazy wrapper, I saw in the playground that the whole tree was constructed upfront without it. I'd like to see if it finishes the sequentialFlatten call faster and then starts limping behind when the looping begins. I also had a ´.lazy´ right after seq at first, but left it out in the end. Maybe that improves performance with the Lazy wrapping? Overall, a great answer, all the other points are spot on! Thanks! \$\endgroup\$ Commented Jun 11, 2016 at 11:55
  • \$\begingroup\$ @overactor: You are right, without the Lazy wrapper the entire tree is generated at once, I hadn't notice that. \$\endgroup\$ Commented Jun 13, 2016 at 1:12
  • 2
    \$\begingroup\$ Do you really need ?? { _ -> G1.Element? in rightGen = nil; return nil }()? \$\endgroup\$ Commented Jun 13, 2016 at 9:04
  • \$\begingroup\$ @TimVermeulen: You are completely right! With the same argument that I used we can assume that our generator is not called again once it has returned nil. I have updated the answer accordingly, thanks for the feedback! \$\endgroup\$ Commented Jun 13, 2016 at 10:08

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.