After experimenting with a sorted trie implementation in C, I felt that I understood tries pretty well, but was having trouble explaining how they work. Since the C code was based on existing code, I wanted to try it again in another language, from scratch.
Here's what I came up with. I tried to make it as minimal as possible. These tries aren't sorted, and nodes only reference a child
and next
node; parent
and prev
aren't used. I did give the "leaves" of the trie a name
property, which is the full key. This isn't currently used, but it's useful for debugging, and it would be useful if a trie iterator were added (like Trie_dump
from the question linked above).
I'm interested in feedback on the code itself this time, but not so much on things like white space, braces, semicolons, etc. I'm comfortable with the formatting the way it is. I'd be interested in an alternative to that labeled loop, though, and all other feedback is welcome.
Trie structure
Each node in the trie can have a key
, a value
, a next
node, and a child
node. In this diagram, each node is represented by a circle. The "root" node is at the left, the green "leaf" nodes are at the right, and the "branch" nodes are in the middle.
Each key
is a single character (circled letters below). The "branches" have keys, while each "leaf" holds a value.
Trie diagram
In the diagram, nodes appear to have multiple children. For example, the first "r" node has "u" and "o" children. Internally, nodes just have a child
property; the first child. The next child is available in the first child's next
property, and so on. It may help to think of child
as "first child" and next
as "next sibling."
trie.js
function Trie(parent, prev, key, value) {
if (key !== void 0)
this.key = key; // single-character key
if (value !== void 0)
this.value = value; // user-defined value
if (prev)
prev.next = this; // next sibling node
else if (parent)
parent.child = this; // first child node
}
// put a key/value pair in the trie
Trie.prototype.put = function(name, value) {
var i = 0, t = this, len = name.length, prev, parent;
down: while (t.child) {
parent = t;
t = t.child;
// if first child didn't match, get next sibling
while (t.key != name[i]) {
if (!t.next) {
prev = t;
t = parent;
break down;
}
t = t.next;
}
// key already exists, update the value
if (++i > len) {
t.value = value;
return;
}
}
// found any existing parts of the key, add the rest
t = new this.constructor(t, prev, name[i]);
while (++i <= len)
t = new this.constructor(t, null, name[i]);
t.name = name;
t.value = value;
};
// get a value from the trie at the given key
Trie.prototype.get = function(name) {
var i = 0, t = this.child, len = name.length;
while (t) {
if (t.key == name[i]) {
if (i == len)
return t.value;
t = t.child;
++i;
} else {
t = t.next;
}
}
};
Demos
Here are links to a few demos. The simple demo puts some test data in a trie, gets it back out, and dumps it to the console. The fancy demo puts the same test data in a trie and draws a diagram like the one above.
Simple demo: http://jsfiddle.net/4Yttq
Fancy demo: http://jsfiddle.net/4Yttq/1/
1 Answer 1
Well, first of all, my optimization is an overhaul of your method. The concept is still tries. But I went for the JS route for several reasons:
We're in JavaScript. Linked lists in C would be like Arrays/Objects in JS. Since we have them available, why not use them instead?
In tries, you use a key for each entry. You'd have to loop over and find it when using linked lists or arrays. In JS, we have objects that are key-value pairs, which make it a bit easier to implement.
Performance is a bit better, except on Opera browsers. Performance may be due to the lesser loops and property access.
The code is shorter
The structure:
{
"t": {
"key": "t",
"r": {
"key": "r",
"u": {
"key": "u",
"e": {
"key": "e",
"value": "yes",
"name": "true"
},
"c": {
"key": "c",
"k": {
"key": "k",
"value": "vehicle",
"name": "truck"
}
}
},
"o": {
"key": "o",
"w": {
"key": "w",
"e": {
"key": "e",
"l": {
"key": "l",
"value": "dig",
"name": "trowel"
}
}
}
}
}
},
"h": {
"key": "h",
"a": {
"key": "a",
"t": {
"key": "t",
"value": "head",
"name": "hat"
},
"l": {
"key": "l",
"t": {
"key": "t",
"value": "hold it",
"name": "halt"
}
},
"m": {
"key": "m",
"value": "pig",
"name": "ham",
"m": {
"key": "m",
"e": {
"key": "e",
"r": {
"key": "r",
"value": "nail",
"name": "hammer"
}
}
}
}
}
}
}
As for the code that operates this, notes are on the comments:
function Trie(key) {
this.key = key;
this.value;
//children are merged with this object since collision is minimal
}
Trie.prototype.put = function (name, value) {
var node = this,
nameLength = name.length,
i = 0,
currentLetter;
//the only major change is this single loop which zips through the collection
//if the node exists, make it current and proceed
//if not, we create it, make it current and proceed
for (i = 0; i < nameLength; i++) {
currentLetter = name[i];
node = node[currentLetter] || (node[currentLetter] = new Trie(currentLetter));
}
node.value = value;
node.name = name;
};
Trie.prototype.get = function (name) {
var node = this,
nameLength = name.length,
i, node;
//same idea, zip through the collection
//in this case we break if we hit a dead end
for (i = 0; i < nameLength; i++) {
if (!(node = node[name[i]])) break;
}
//only when the loop went over all letters will we find a value
//if not, well, we don't find anything
return (i === nameLength) ? node.value : 'not found';
};
-
\$\begingroup\$ I guess I should have mentioned that I'm planning on porting it back to C eventually, so I wanted to keep the linked lists; using
children
like this almost feels like cheating. I did retrofit the trie with achildren
property for the diagram example, but it doesn't take advantage of it at all. The constructor has theundefined
checks to avoid creating properties with undefined values, just to make the data easier to inspect... they could go. Other than that, this is only a few lines shorter... does it perform better? \$\endgroup\$Dagg– Dagg2013年04月23日 16:10:50 +00:00Commented Apr 23, 2013 at 16:10 -
\$\begingroup\$ @Dagg Ahh, I see. Performance, I'll post a jsPerf shortly. \$\endgroup\$Joseph– Joseph2013年04月23日 16:12:16 +00:00Commented Apr 23, 2013 at 16:12
-
\$\begingroup\$ +1 for the perf, although it's just a tiny bit faster for me... I think if use the sorted trie, I might be able to beat it. I tried adding the d3 stuff to your fiddle, thinking it would work since it used a
children
property, but d3 triggered an error liked3 "Object #<Object> has no method 'map'
... any idea what's going on there? \$\endgroup\$Dagg– Dagg2013年04月23日 16:54:25 +00:00Commented Apr 23, 2013 at 16:54 -
\$\begingroup\$ @Dagg I modified the latest revision, 9, to have no
children
property to further lessen the code. Revision 8 still haschildren
. Also,children
isn't an array, but an object. \$\endgroup\$Joseph– Joseph2013年04月23日 16:58:28 +00:00Commented Apr 23, 2013 at 16:58 -
1\$\begingroup\$ Ah, right, it's expecting an array. Merging
children
seems like a good idea, there shouldn't be any collision as long as no single-character property names are used. I'm still curious whether a sorted, linked trie can outperform this, or whether the linked trie code can be reduced significantly, but this is starting to look pretty good as a minimal JS solution. \$\endgroup\$Dagg– Dagg2013年04月23日 17:48:40 +00:00Commented Apr 23, 2013 at 17:48
Explore related questions
See similar questions with these tags.
while (++i <= len)
isn't something comprehensible. Not using braces forif
, etc. The code is pretty, but it feels like magic on first sight. \$\endgroup\$break down;
:) \$\endgroup\$