I have a tree in my application from which I need to periodically take a snapshot. I've been using locks so far, but after running into a few deadlock issues I decided to create a lock-free implementation. I'm using a persistent Map
implementation together with an actor
. My goals were:
- Consistent snapshot from the
Map
in a concurrent environment - Flattened tree structure
- Thread safety when writing
I have completed a preliminary implementation, but I'm not 100% sure that my implementation is idiomatic. I'm not that concerned with complexity, as writes won't be often, but I really need the consistent snapshot and the thread safety. I'd like some guidance on the
- overall design
- idiomatic use of coroutines / persistent data structures
- possible design oversights or thread safety issues
- anything that I missed (it happened in the past that I wrote something and it turned out that there is a well-known algorithm that does the same but better)
So without further ado, this is the implementation:
// primary interfaces
import kotlinx.collections.immutable.PersistentList
import kotlinx.collections.immutable.persistentListOf
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.channels.actor
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.math.max
interface Tree<D : Any> {
// this is the flattened tree
val descendants: Sequence<TreeNode<D>>
val children: Sequence<TreeNode<D>>
// children can only be created through a reference to a Tree or a TreeNode
fun createChild(data: D? = null): TreeNode<D>
companion object {
fun <D : Any> create(): Tree<D> {
return ConcurrentTree()
}
}
}
interface TreeNode<D : Any> {
val data: D?
val parent: TreeNode<D>?
val hasParent: Boolean
get() = parent != null
val children: Sequence<TreeNode<D>>
fun createChild(data: D? = null): TreeNode<D>
fun remove()
}
The TreeNode
implementations look like this:
open class ConcurrentTreeNode<D : Any>(
override val parent: ConcurrentTreeNode<D>? = null,
override val data: D? = null,
private val tree: ConcurrentTree<D>
) : TreeNode<D> {
override val children: Sequence<TreeNode<D>>
get() = tree.children.filter { it.parent == this }
override fun createChild(data: D?): ConcurrentTreeNode<D> {
val parent = this
return ConcurrentTreeNode(
parent = this,
data = data,
tree = tree
).apply {
tree.addChildTo(parent, this)
}
}
override fun remove() {
tree.remove(this)
}
override fun toString(): String {
return "TreeNode(data=$data)"
}
}
private class RootNode<D : Any>(
parent: ConcurrentTreeNode<D>? = null,
data: D? = null,
tree: ConcurrentTree<D>
) : ConcurrentTreeNode<D>(
parent = parent,
data = data,
tree = tree
) {
override fun remove() {
throw RuntimeException("A root node can't be removed.")
}
}
And this is the Tree
implementation:
class ConcurrentTree<D : Any> : Tree<D> {
sealed class Message {
data class AddChildTo<D : Any>(
val parent: TreeNode<D>,
val child: TreeNode<D>
) : Message()
data class DeleteChild<D : Any>(
val child: TreeNode<D>
) : Message()
}
private val treeScope = CoroutineScope(Dispatchers.Default + SupervisorJob())
private val actor = treeScope.actor<Message> {
for (msg in channel) { // iterate over incoming messages
when (msg) {
is AddChildTo<*> -> {
val (parent, child) = msg
val lastChildIdx = backend.indexOfLast { it.parent == parent }
val parentIdx = backend.indexOf(parent)
val idx = max(lastChildIdx, parentIdx) + 1
backend = if (idx == backend.lastIndex) {
backend.add(child as TreeNode<D>)
} else {
backend.add(idx, child as TreeNode<D>)
}
}
is Message.DeleteChild<*> -> {
val parent = msg.child.parent
val child = msg.child
val toDelete = backend
.dropWhile { it !== child }
.takeWhile { it === child || it.parent !== parent }
backend = backend.removeAll(toDelete)
}
}
}
}
private val root = RootNode(
parent = null,
data = null,
tree = this
)
private var backend: PersistentList<TreeNode<D>> = persistentListOf(root)
override val descendants: Sequence<TreeNode<D>>
get() = backend.asSequence().drop(1)
override val children: Sequence<TreeNode<D>>
get() = root.children
override fun createChild(data: D?) = root.createChild(data)
fun addChildTo(parent: TreeNode<D>, child: TreeNode<D>) {
treeScope.launch {
actor.send(AddChildTo(parent, child))
}
}
fun remove(node: TreeNode<D>) {
treeScope.launch {
actor.send(Message.DeleteChild(node))
}
}
override fun toString(): String {
return "Tree(descendants=${descendants.joinToString()})"
}
}
Complexity of operations is O(n)
, but more importantly snapshot creation is O(1)
, as I'm using a PersistentList
as a backend for the whole thing.
The operations (apart from the snapshot) are eventually consistent, but this doesn't matter in my case, I only want to see an operation either completed successfully or the previous state.
Can I do better? Is this idiomatic Kotlin code?
-
\$\begingroup\$ I would be really helpfull for me, if you could include imports to your code snippets so I can copy/paste it in my IDE without having to think about, if we have he same implementation. \$\endgroup\$Leo H.– Leo H.2020年10月07日 15:09:19 +00:00Commented Oct 7, 2020 at 15:09
-
\$\begingroup\$ of course, one moment \$\endgroup\$Adam Arold– Adam Arold2020年10月07日 18:24:02 +00:00Commented Oct 7, 2020 at 18:24
-
\$\begingroup\$ I added the missing imports. \$\endgroup\$Adam Arold– Adam Arold2020年10月07日 21:37:46 +00:00Commented Oct 7, 2020 at 21:37
1 Answer 1
The following points are my personal, subjective suggestions based on my own experiece and understanding of your domain. I din't check if the solution is feasable but analyzed it for itself.
Some suggestions are explicetely commented or mentioned in the post's text, some more are in form of the modified code in the snippets.
Overall Design
- I would suggest to genereally rethink the cases where you decided to go for nullability. In my experience it is often better to get an "Optinal is empty but expected XYZ" exception, instead of "NullPointerException at line 123" which tells almost nothing.
- There are so many google results why casting is bad, so I just tell you - avoid casting for your
Message
classes. My proposal is to give the sealed class a Wildcard and the compiler will smart cast for yousealed class Message<A : Any>
- It is never too annoying to repeat: Favor composition over inheritance :) . Your open
ConcurrentTreeNode
will get complex very fast and all further overrides will not do it any favor. - Avoid using
var
s, especially for fields and go for immutability - this gives you (almost always) thread safety
Idiomatic use of coroutines
There are some things which are obselete and some others, which I would redesign, because I like it the other way - like readability and my own code style.
Obsolete:
- Using
actor
-> Go forchannels
andflows
(link), which give you more functionality, interoperability and future support
I would do it the other way:
- Create a
channel
to send and receive messages- Launch a coroutine which coverts a
channel
to aflow
and consumes every message sequentially
- Launch a coroutine which coverts a
- mark
send()
andresume()
functions assuspend
to send messages to your channel- force the caller to make use of coroutine context to be able to send those
Modified implemention with suggestions
Some siggestions are in form of a comment, some are intrinsic in the way I modified the code.
import kotlinx.collections.immutable.PersistentList
import kotlinx.collections.immutable.persistentListOf
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.collect
import kotlinx.coroutines.flow.consumeAsFlow
import kotlinx.coroutines.launch
interface Tree<D : Any> {
// Did you consider Sequences for some special intent? Reading the documentation, their implementation can differ
// highly and it introduces uncertainty to your code
val descendants: Sequence<TreeNode<D>>
val children: Sequence<TreeNode<D>>
fun createChild(data: D? = null): TreeNode<D>
companion object {
fun <D : Any> create(): Tree<D> { // I dislike public static code in general
return ConcurrentTree() // and see no practical use of this method - unless you want to hide the implementations
}
}
}
interface TreeNode<D : Any> {
val data: D?
val parent: TreeNode<D>? // I guess, if data is NULL then parent also has to be NULL (like in RootNode)?
// If so, bundle them inside same class and make it nullable, so its impossible to break this relation by compiler
val hasParent: Boolean // this would fit better as a function, like 'List.isEmpty()'
get() = parent != null
val children: Sequence<TreeNode<D>>
fun createChild(data: D? = null): TreeNode<D>
fun remove() // remove what? A bit hard to find out just by the interface what it does or supposed to remove
}
// -----------
// Do you really need inheritance? a.k.a - Favor delegation over inheritance
open class ConcurrentTreeNode<D : Any>(
override val parent: ConcurrentTreeNode<D>? = null,
override val data: D? = null,
private val tree: ConcurrentTree<D>
) : TreeNode<D> {
private val treeScope = CoroutineScope(Dispatchers.Default + SupervisorJob())
override val children: Sequence<TreeNode<D>> = tree.children.filter { it.parent == this }
override fun createChild(data: D?): ConcurrentTreeNode<D> {
return ConcurrentTreeNode(
parent = this,
data = data,
tree = tree
).apply {
treeScope.launch {
tree.addChildTo(this@ConcurrentTreeNode, this@apply) // very poor to modify fields of classes, even worse doing so after construction
}
}
}
override fun remove() {
treeScope.launch { tree.remove(this@ConcurrentTreeNode) }
}
override fun toString(): String = "TreeNode(data=$data)"
}
private class RootNode<D : Any>(
parent: ConcurrentTreeNode<D>? = null,
data: D? = null,
tree: ConcurrentTree<D>
) : ConcurrentTreeNode<D>(
parent = parent,
data = data,
tree = tree
) {
// Design flaw. Maybe rethink the structure and create a special Node which doesn't have this function at all
override fun remove() = error("A root node can't be removed.")
}
class ConcurrentTree<D : Any> : Tree<D> {
sealed class Message<A : Any> { // You need a Wildcard for parent sealed class to maintain concrete types of children
// be careful when using wildcard types with equal letters!
// Try to use always a different letter in same file and/or class
data class AddChildTo<B : Any>(val parent: TreeNode<B>, val child: TreeNode<B>) : Message<B>()
data class DeleteChild<B : Any>(val child: TreeNode<B>) : Message<B>()
}
private val channel = Channel<Message<D>>(Channel.BUFFERED) // choose the type you like here
init {
CoroutineScope(Dispatchers.Default + SupervisorJob()).launch {
fun idx(msg: AddChildTo<D>): Int {
val lastChildIdx = backend.indexOfLast { it.parent == msg.parent }
val parentIdx = backend.indexOf(msg.parent)
return maxOf(lastChildIdx, parentIdx) + 1
}
fun addDependingOnIdx(message: AddChildTo<D>) {
when (val idx = idx(message)) { // poor variable naming 'idx'
backend.lastIndex -> backend.add(message.child)
else -> backend.add(idx, message.child)
}
}
fun toDeleteElements(msg: DeleteChild<D>): List<TreeNode<D>> {
val parent = msg.child.parent
val child = msg.child
return backend
.dropWhile { it !== child }
.takeWhile { it === child || it.parent !== parent }
}
fun modifyBackendFor(message: Message<D>) {
when (message) {
is AddChildTo<D> -> addDependingOnIdx(message)
is DeleteChild<D> -> backend.removeAll(toDeleteElements(message))
}
}
channel
.consumeAsFlow() // hot flow which guarantees sequential operations
.collect { msg -> modifyBackendFor(msg) }
}
}
// private val root = RootNode(parent = null, data = null, tree = this)
//
// --> there is no need for a field here and it doesn't harm to create such a small object, unless you go for
// very large numbers. Fields are like global variables - highly doubtful when not actually needed
private val backend: PersistentList<TreeNode<D>> = persistentListOf(RootNode(tree = this)) // VAL instead of VAR!
override val descendants: Sequence<TreeNode<D>>
get() = backend.asSequence().drop(1) // not thread safe. Btn: Weird to have un-deterministic behaviour for a getter
// since the backend can be modified
override val children: Sequence<TreeNode<D>> by lazy { RootNode(tree = this).children }
override fun createChild(data: D?) = RootNode(tree = this).createChild(data)
suspend fun addChildTo(parent: TreeNode<D>, child: TreeNode<D>) = channel.send(AddChildTo(parent, child))
suspend fun remove(node: TreeNode<D>) = channel.send(DeleteChild(node))
override fun toString(): String = "Tree(descendants=${descendants.joinToString()})"
}
PS: This review & analysis was done in ~4 hours
-
\$\begingroup\$ I went for nullability, because this will only be used from Kotlin and the Kotlin devs themsevelves suggest to use
null
in these cases. Thank you for your humongous effort! \$\endgroup\$Adam Arold– Adam Arold2020年10月12日 10:16:57 +00:00Commented Oct 12, 2020 at 10:16 -
\$\begingroup\$ There won't be any more implementors of
ConcurrentTreeNode
, that's why I picked inheritance. I don't see how this can be implemented using composition instead that's not much more complex. \$\endgroup\$Adam Arold– Adam Arold2020年10月12日 10:19:33 +00:00Commented Oct 12, 2020 at 10:19 -
\$\begingroup\$ The only place where I'm using
var
s is thePersistentList
, but it is necessary because aPersistentList
is immutable. This is what enables consistent snapshots. \$\endgroup\$Adam Arold– Adam Arold2020年10月12日 10:20:41 +00:00Commented Oct 12, 2020 at 10:20 -
\$\begingroup\$ What are the advantages of a hot
Flow
? \$\endgroup\$Adam Arold– Adam Arold2020年10月12日 19:34:40 +00:00Commented Oct 12, 2020 at 19:34 -
1\$\begingroup\$ I'm following the convo on the relevant issue here. As I've figured out your solution is better, because it works in a multiplatform project as well. I didn't use the
Flow
in the end though, just theChannel
and a coroutine. \$\endgroup\$Adam Arold– Adam Arold2020年10月13日 10:29:04 +00:00Commented Oct 13, 2020 at 10:29