Following my previous unbounded prime generator and a followup by Martin R, I've tested the waters in Kotlin by making an unbounded sieve.
To quote Martin R's wonderful explanation of the base algorithm,
The interesting thing about that algorithm is that the prime generator needs a list of "base primes" which are created using another instance of itself. This works because
- the base prime generator is created "delayed", after producing some fixed primes, and
- in order to generate primes up to \$ N \,ドル base primes are needed only up to \$ \sqrt N \$
so that the creation of nested prime generators terminates eventually.
The sieving is done using a dictionary which holds the "next composite numbers" which are divisible by any prime less than the current base prime. As explained [on SO], the required memory to produce \$ n \$ primes is \$ O(\sqrt n). \$
class PostponedPrimeSequence() : AbstractIterator<Int>(), Sequence<Int> {
override fun iterator() = this
private var basePrimes: PostponedPrimeSequence? = null
private var basePrime = 0
private var sieve = mutableMapOf<Int, Int>()
private var initialPrimes = mutableListOf(2, 3, 5, 7)
private var candidate = 9
override fun computeNext() {
if (!initialPrimes.isEmpty()) {
setNext(initialPrimes.removeAt(0))
} else {
if (candidate == 9) {
basePrimes = PostponedPrimeSequence()
basePrimes!!.next()
basePrime = basePrimes!!.next()
assert(candidate == basePrime * basePrime)
}
while (candidate > 0) {
val factor = sieve.remove(candidate) ?:
if (candidate == basePrime * basePrime) {
val temp = basePrime
basePrime = basePrimes!!.next()
temp
} else {
assert(candidate < basePrime * basePrime)
setNext(candidate)
break
}
var j = candidate + 2 * factor
while (sieve.containsKey(j)) {
j += 2 * factor
}
sieve[j] = factor
candidate += 2
}
candidate += 2
}
}
}
Usage:
PostponedPrimeSequence().take(20).forEach(::println)
All feedback welcome, though reviews are encouraged to look at:
- Naming: As I'm new to Kotlin, I don't know the naming conventions yet
- Performance: If there are simple, provable improvements without drastically changing the algorithm (such as adding a large wheel)
- Iteration organization: As Kotlin lacks a true
yield
ing generator and I wanted a trueIterator
/Sequence
, rather than a callback based "poor man'syield
", the iteration isn't exactly the prettiest matter. Thecandidate += 2 } candidate += 2
bit especially feels like there should be a better way to do this, and thebreak
(which is related) breaks ha simple control flow, which is not ideal. !!
: Even on the line immediately following assignment tobasePrimes
, I have to use!!
to potentially fail on accessing thenext
method (Smart cast to 'PostponedPrimeSequence' is impossible, because 'basePrimes' is a mutable property that could have been changed by this time
). I know that, due to reflection, any variable on the JVM could change at any time, but this seems(削除) like a smelly smell that smells (削除ここまで)smelly.basePrimes
has to be mutable and start outnull
, though, because otherwise the class would recursively take infinite space. In this case, though, I want to initialize it once, late, and from then on it should act as an immutable (non-nullable)val
reference. If this is possible in Kotlin, I don't know how to do so yet.
2 Answers 2
- You can make
basePrimes
alateinit
non-nullvar
to avoid!!
. sieve
andinitialPrimes
are only assigned once so they can be marked withval
instead ofvar
sequenceOf(2, 3, 4, 7).iterator()
can be used along withIterator.hasNext
andIterator.next
instead ofmutableListOf(2, 3, 5, 7)
with somewhat difficult to read!initialPrimes.isEmpty()
andinitialPrimes.removeAt(0)
(although if you really prefer to use a collection instead of an iterator then you can useisNotEmpty
instead of!collection.isEmpty
).- Instead of explicitly declaring a
temp
variable you can useapply
to take the current value of a variable, use/change it, and return the initial value. - As an alternative to
sieve.containsKey(j)
you can usej in sieve
. - As an alternative to
candidate += 2 } candidate += 2
you can move the latter+= 2
to relatively the beginning ofcomputeNext
(requires a few other changes, see below). The namecandidate
would no longer really fit though, but I think a generic property name likevalue
is sufficient enough (we know what "value" we care about so we know what we're computing as we go, etc.). - You probably shouldn't implement
Iterator<Int>
andSequence<Int>
in the same object. e.g. When invokingwith(PostponedPrimeSequence()) { repeat(2) { take(20).forEach(::println) } }
I would expect it as aSequence<Int>
to print the first 20 primes twice (or throw "IllegalStateException: This sequence can be consumed only once.") but as anIterator<Int>
I would expect it to print the first 40 primes. These are conflicting expectations. I recommend only implementing one. ASequence
is understood to possibly be constrained to be iterated only once so I would implement the baseIterator
and then wrap it as aSequence
in a top-level function for convenience.
e.g.
fun generatePrimeSequence() = PostponedPrimeIterator().asSequence()
class PostponedPrimeIterator() : AbstractIterator<Int>() {
private lateinit var basePrimes: PostponedPrimeIterator
private var basePrime = 0
private val sieve = mutableMapOf<Int, Int>()
private val initialPrimes = sequenceOf(2, 3, 5, 7).iterator()
private var value = 0
override fun computeNext() {
if (initialPrimes.hasNext()) {
value = initialPrimes.next()
setNext(value)
} else {
value += 2
if (value == 9) {
basePrimes = PostponedPrimeIterator()
basePrimes.next()
basePrime = basePrimes.next()
assert(value == basePrime * basePrime)
}
while (value > 0) {
val factor = sieve.remove(value) ?:
if (value == basePrime * basePrime) {
basePrime.apply {
basePrime = basePrimes.next()
}
} else {
assert(value < basePrime * basePrime)
setNext(value)
break
}
var j = value + 2 * factor
while (j in sieve) {
j += 2 * factor
}
sieve[j] = factor
value += 2
}
}
}
}
Usage:
generatePrimeSequence().take(20).forEach(::println)
I just compared two solution suggested above in term of performance:
println("PostponedPrimeSequence: ${measureTimeMillis { PostponedPrimeSequence().take(10000).sum().let(::println) }}")
println("generatePrimeSequence: ${measureTimeMillis { generatePrimeSequence().take(10000).sum().let(::println) }}")
I've been surprised that performance diffs about 2 times:
496165411
PostponedPrimeSequence: 23
496165411
generatePrimeSequence: 48
-
\$\begingroup\$ As always, benchmarking the JVM is troublesome. Make sure that you aren't accidentally measuring startup time. It'd also be more fair to compare
PostponedPrimeIterator()
to myPostponedPrimeSequence()
due to the requirement of theasSequence
wrapper. \$\endgroup\$CAD97– CAD972017年11月30日 23:21:30 +00:00Commented Nov 30, 2017 at 23:21 -
2\$\begingroup\$ Answers should be able to stand on their own. As is, this answer is dependent on another to be of any use and could be voted Not An Answer because of that. I'd suggest fleshing out your answer to avoid that. A demonstration of performance deltas could potentially stand on its own if you successfully demonstrate that you are measuring what you purport to be measuring, and it isn't overshadowed by overhead. \$\endgroup\$CAD97– CAD972017年11月30日 23:24:28 +00:00Commented Nov 30, 2017 at 23:24
-
1\$\begingroup\$ (If and where you refer to another post or comment, please plant a link for disambiguation. You can find one for posts as
share
immediately underneath. For a comment, find it "below the date".) \$\endgroup\$greybeard– greybeard2017年12月01日 00:13:20 +00:00Commented Dec 1, 2017 at 0:13
lang-kotlin
. \$\endgroup\$default
when usinglang-kotlin
. It seems to do just as well, though, so I'll switch. (The biggest hint isfun
is not highlighted as a keyword.) \$\endgroup\$lang-kotlin
just the same. \$\endgroup\$log(log N) = log( (2**k)*log x) = k*log 2 + log(log x)
so the height of that "tower" of "nested primes generators" is k = O(log (log N)). IOW that recursion that "terminates eventually" actually does so in O(log log N) steps. An interesting tidbit. \$\endgroup\$