After writing this answer, I was inspired to try to design a more elegant solution to the problem of printing binary trees. And what better tool with which to seek elegance than Clojure?
Overall design
The solution I ended up with involved creating, merging, and printing what I'm going to call sparse strings. A sparse string is simply a map from row/column pairs to substrings. These substrings must not contain newlines or overlap with each other.
So for instance, this multiline string
baz
foo
bar
qux
could be represented by this sparse string:
{[3 -2] "foo"
[4 -7] "bar"
[2 -6] "baz"
[5 7] "qux"}
A few things to note:
- Empty space is simply filled by regular spaces and newlines
- The first coordinate is the row, the second coordinate is the column
- The first row and column need not necessarily be zero
The problem of printing a binary tree then reduces to creating sparse strings from its left and right subtrees (as well as one for its value, which will go at the top), then shifting and merging those sparse strings together.
The only extra requirement when using sparse strings to represent trees is that the root (or the center of the root, if the root is wider than 1 character) must be at [0 0]
. Consider the tree below:
4
/ \
2 5
/ \
1 3
The in-memory representation of this would be
{:value 4
:left {:value 2
:left {:value 1}
:right {:value 3}}
:right {:value 5}}
Textually, the tree would be represented as
{[0 0] "4"
[2 -2] "2"
[4 -4] "1"
[3 -3] "/"
[4 0] "1"
[3 -1] "\\"
[1 -1] "/"
[2 2] "5"
[1 1] "\\"}
I'll refer to this as a tree string. This way, when I combine multiple tree strings to create a bigger tree string, I can safely use [0 0]
as an anchor point from which to shift subtrees.
Code
(defn end-col
"Returns one plus the maximum column occupied by the sparse string entry x."
[x]
(let [[[_ col] s] x]
(+ col (count s))))
(defn min-corner
"Returns a vector of the minimum non-empty row and column in sparse-string."
[sparse-string]
(mapv #(apply min (map % (keys sparse-string)))
[first second]))
(defn max-corner
"Returns a vector of one plus the maximum non-empty row and column in
sparse-string."
[sparse-string]
(mapv #(apply max (map % sparse-string))
[(comp inc first key) end-col]))
(defn fill
"Returns a string of newlines and spaces to fill the gap between entries x and
y in a sparse string whose minimum corner is corner."
[corner x y]
(let [[_ min-col] corner
[[prev-row _] _] x
[[next-row next-col] _] y]
(apply str (concat (repeat (- next-row prev-row) \newline)
(repeat (- next-col
(if (== prev-row next-row)
(end-col x)
min-col))
\space)))))
(defn sparse-str
"Converts sparse-string to a string."
[sparse-string]
(let [corner (min-corner sparse-string)]
(->> sparse-string
(sort-by key)
(cons [corner ""])
(partition 2 1)
(map (fn [[x y]] (str (fill corner x y) (val y))))
(apply str))))
(defn shift
"Creates and returns a sparse string by adding offset to the position of each
entry in sparse-string."
[offset sparse-string]
(into {} (map (fn [[pos s]]
[(mapv + pos offset) s])
sparse-string)))
(defn vert-gap
"Returns the minimum vertical gap that can be used in combining the left and
right tree strings."
[left right]
(if (and left right)
(max 1 (quot (- (second (max-corner left))
(second (min-corner right)))
2))
1))
(def directions {:left - :right +})
(defn diagonal
"Returns a diagonal sparse string with the top end located at corner."
[direction corner length character]
(let [[first-row first-col] corner]
(into {} (map (fn [n]
[[(+ first-row n)
((directions direction) first-col n)]
(str character)])
(range length)))))
(defn leg
"Returns a sparse string from shifting tree-string according to direction,
vert-gap, and value-height, merged with a diagonal strut."
[direction tree-string vert-gap value-height]
(merge (shift [(+ value-height vert-gap)
((directions direction) (inc vert-gap))]
tree-string)
(diagonal direction
[value-height ((directions direction) 1)]
vert-gap
({:left \/ :right \\} direction))))
(defn assemble
"Assembles a complete tree string from the tree strings of a value, left
subtree, and right subtree."
[value left right]
(if (or left right)
(let [[value-height _] (max-corner value)
vert-gap (vert-gap left right)]
(merge value
(when left
(leg :left left vert-gap value-height))
(when right
(leg :right right vert-gap value-height))))
value))
(defn tree-string
"Creates and returns a tree string from node."
[node]
(let [{:keys [value left right]} node
s (str value)]
(apply assemble
{[0 (- (quot (count s) 2))] s}
(map #(when % (tree-string %))
[left right]))))
Implementation notes
When printing a sparse string, one first needs to find the minimum occupied row and column in the sparse string. This naturally leads to the end-col
, min-corner
, and max-corner
functions for calculating bounding boxes for sparse strings.
Now, how does one print a sparse string? This problem basically boils down to printing the whitespace to fill
the gaps between sparse string entries. Once that's accomplished, the actual sparse-str
implementation is quite straightforward.
Also, since I'm going to be combining sparse strings in addition to just creating and printing them, I'll need a function to shift
the reference point of an existing sparse string.
All we need now is a way to convert from that logical representation of the tree to the textual one, which is the task of the tree-string
function.
As you can see, the better part of the work is done in that assemble
function. The first thing we need to do is decide how long we're going to make the struts that connect the left and right subtrees. To simplify things, we'll just always make them the same length, which means that the length of the struts, calculated by vert-gap
, will equal to the final distance between the bottom of the value
tree string and the tops of the left
and right
tree strings.
And of course, we'll need the diagonal
function to create the struts themselves.
Now that we have all the rest of the pieces, it's just a matter of assemble
ing them together. The leg
function is just a helper that combines diagonal
and shift
together into something that can then be merged with the root.
Example
In case you'd like to test this code yourself, here's a function for generating a random binary tree:
(defn rand-tree
[weight depth]
(into {:value (int (Math/pow 2 (rand (dec Integer/SIZE))))}
(map #(when (and (pos? depth) (< (rand) weight))
[% (rand-tree weight (dec depth))])
[:left :right])))
So this...
(println (sparse-str (tree-string (rand-tree 3/4 3))))
... will print something like this:
1369616891
/ \
/ \
/ \
/ \
/ \
238883 2
/ \ /
/ \ 9
/ \ / \
/ \ 1 2222
2949729 6
/ /
1836 5299294
1 Answer 1
First of all, in my opinion, both the algorithm is nice and interesting, and the code is really well-written, broken down into easily understandable functions! Well done!
The only addition that I can make are corner-cases (and their possible fixes) for some of the helper functions. However, please note, that from the main entry point, I did not find any way to trigger these corner cases, so, from a user perspective, the program works well even without the changes below.
Corner case # 1
(min-corner {})
(sparse-str {})
Both throw an exception, due to min
in min-corner
being called with zero arguments.
I suggest the following change, in order to handle this case:
(defn min-corner
"Returns a vector of the minimum non-empty row and column in sparse-string."
[sparse-string]
(mapv #(apply min (let [args (map % (keys sparse-string))] (if (empty? args) [0 0] args)))
; (mapv #(apply min (map % (keys sparse-string)))
[first second]))
I.e., call min
with a default value (in this case [0 0]
), if the arguments are empty.
Corner case #2
(max-corner {})
Similar issue, like for min-corner
, similar solution:
(defn max-corner
"Returns a vector of one plus the maximum non-empty row and column in
sparse-string."
[sparse-string]
(mapv #(apply max (let [args (map % sparse-string)] (if (empty? args) [0 0] args)))
; (mapv #(apply max (map % sparse-string))
[(comp inc first key) end-col]))
Corner case #3
(sparse-str {[0 0] "aa" [0 1] "b"})
This will return aab
. However, b
is misplaced here, because it should be at position [0 1]
, and it is, instead, at position [0 2]
.
To be honest, I don't really know what would be the best solution here. I can imagine the following:
- Leave it as it is: all output is shown, some of it might be on the wrong place.
- Overwrite the the end of the first string with the second one (i.e.
"ab"
). - Overwrite the beginning of the second string with the first one (in this case:
"aa"
). - Throw an exception.
Again, starting from the main entry point, it is not possible to trigger this situation, because the user does not provide any positions. However, in case the helper functions were to be reused in a library, this is a question that should be addressed.
Corner case #4
(vert-gap {[0 0] "a"} {[4 10] "a"})
In case the right element has a bigger second coordinate than the left one, the result is always 1
. I'm not sure if this is by design, but if not, then I'd suggest to make sure, that vertical gap is calculated correctly also in such cases, by taking the absolute value of the difference, instead of the difference itself:
(defn vert-gap
"Returns the minimum vertical gap that can be used in combining the left and
right tree strings."
[left right]
(if (and left right)
(max 1 (quot (Math/abs (- (second (max-corner left))
(second (min-corner right)))
) 2))
1))
Corner case # 5
(diagonal :left [0 0] -2 \a)
With the current implementation, this will return an empty result: {}
. Now, with the semantics that the name of the parameter length
implies, this should be correct (maybe an exception could be thrown, but that is only detail).
However, wouldn't it be nice, if the signum of length
could control the direction, in which the first coordinate grows? Something like this:
(defn diagonal
"Returns a diagonal sparse string with the top end located at corner."
[direction corner length character]
(let [[first-row first-col] corner
_length (Math/abs length)
op (if (< length 0) - +)]
(into {} (map (fn [n]
[[(op first-row n)
((directions direction) first-col n)]
(str character)])
(range _length)))))
In this way, the result of the above example will be: {[0 0] "a", [-1 -1] "a"}
.
Of course, it might be worth considering to rename the length
parameter, e.g. to horizontal-displacement
or something similar.
P.S.:
I setup a github repo for the above mentioned changes, and their corresponding test cases (along other tests):
-
\$\begingroup\$ Thanks for the awesome write-up! For Corner case #1 and Corner case #2, I think that it would be better for
sparse-str
to handle the case where its argument is empty (and just return""
) and formin-corner
andmax-corner
to specify in their docstring that their argument must be nonempty, similar to how it is part of the contract ofmin
andmax
that they be passed a nonzero number of arguments. \$\endgroup\$Sam Estep– Sam Estep2017年08月11日 19:14:26 +00:00Commented Aug 11, 2017 at 19:14 -
\$\begingroup\$ For Corner case #3, I definitely agree that the current behavior is undesirable; the "dimensions" of the returned string should not depend on whether an overlap is present. Out of the possible solutions you present, I think the second one (overwrite the the end of the first string with the second one) is the best. \$\endgroup\$Sam Estep– Sam Estep2017年08月11日 19:14:34 +00:00Commented Aug 11, 2017 at 19:14
-
\$\begingroup\$ For Corner case #4, that's an interesting behavior, and on the one hand your solution might be desirable for robustness purposes, but on the other hand, in my definition of "tree string", I specified that the root must be present at
[0 0]
, so{[4 10] "a"}
does not qualify as a tree string, andvert-gap
specifies that both of its arguments are tree strings. \$\endgroup\$Sam Estep– Sam Estep2017年08月11日 19:14:38 +00:00Commented Aug 11, 2017 at 19:14 -
\$\begingroup\$ For Corner case #5, I like your idea of adding semantics to the signum of
length
! Do you think that could possibly remove the need for thedirection
andcharacter
parameters as well? \$\endgroup\$Sam Estep– Sam Estep2017年08月11日 19:14:57 +00:00Commented Aug 11, 2017 at 19:14 -
\$\begingroup\$ @SamEstep: re #5: I don't think that we can pack those infos into
length
, because the signum oflength
is now the vertical direction, whiledirection
is the horizontal one. Unlesslength
is converted to some kind of structure, but that will still need all three pieces of information IMHO. \$\endgroup\$Attilio– Attilio2017年08月12日 11:54:36 +00:00Commented Aug 12, 2017 at 11:54
2949729
and6
on the same line with struts of length 1 using the centering function#(- (quot (count %) 2))
that you can find intree-string
. \$\endgroup\$