The goal of the hitchhiker tree is to wed three things: the query performance of a B+ tree, the write performance of an append-only log, and the convenience of a functional, persistent data structure.
Let’s look at each of these in some detail.
B+ trees win at queries
You might remember an important theorem from data structures: no comparison-based data structure for looking up sorted keys can answer those queries faster than O(log(n)).
Since sorted trees provide a solution for this, we’ll start with them.
Now, a common sorted tree for this purpose is the Red-Black tree, whose actual query performance is between log2(n) and 2*log2(n) (the write performance is comparable).
The factor of 2 comes from the partial imbalances (which are still asymptotically balanced) that the algorithm allows, and the base 2 of the log comes from the fact that it’s a binary search tree.
A less popular sorted tree is the AVL tree, which achieves log2(n) query performance at the cost of always paying 2*log2(n) for inserts.
We can already see a pattern—although many trees reach the asymptotic bound, they differ in their constant factors.
The tree that the hitchhiker tree is based on is the B+ tree, which achieves logb(n) query performance.
Since b can be very large (on the order of hundreds or thousands), these trees are especially great when each node is stored on higher-latency media, like remote storage or a disk.
This is because each node can contain huge numbers of keys, meaning that by keeping only the index nodes in memory, we can access most keys with very few data accesses, often just one.
Unlike the above sorted trees (and B trees, which we won’t discuss), B+ trees only store their data (i.e. the values) in their leaves—internal nodes only need to store keys.
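To make the shape concrete, here is a minimal sketch of B+ tree lookup (the node layout and names are illustrative, not the hitchhiker tree's actual implementation): internal nodes hold only keys, values live only in leaves, and each step of the descent narrows the search to one of b children, so a lookup touches only logb(n) nodes.

```python
import bisect

class Internal:
    """Internal node: keys[i] separates children[i] and children[i+1]."""
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

class Leaf:
    """Leaf node: the values live only here, alongside sorted keys."""
    def __init__(self, keys, values):
        self.keys = keys
        self.values = values

def lookup(node, key):
    # Descend from the root; each comparison step picks one of b
    # children, so the path length is log_b(n).
    while isinstance(node, Internal):
        node = node.children[bisect.bisect_right(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None

# Example: a tiny tree with branching factor 3.
root = Internal([10, 20], [
    Leaf([1, 5], ["a", "b"]),
    Leaf([10, 15], ["c", "d"]),
    Leaf([20, 25], ["e", "f"]),
])
```

In a real on-disk B+ tree each node would hold hundreds of keys rather than two, which is what makes the tree so shallow.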
Event logs win at writing data
Do you know the fastest way to write data?
Append it to the end of the file.
There are no pointers, no updating of data structures, no extra IO costs incurred.
Unfortunately, to perform a query on an event log, we need to replay all the data to figure out what happened.
That replay costs O(n), since it touches every event written.
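The trade-off can be shown in a few lines. This sketch (all names are illustrative) records inserts and deletes as appends, and answers a query by replaying the entire log:

```python
# An append-only event log for a key-value store.
log = []

def record(op, key, value=None):
    # Writing is a single append: no pointers, no structure to update.
    log.append((op, key, value))

def query(key):
    # Answering a query replays every event ever written: O(n).
    result = None
    for op, k, v in log:
        if k == key:
            result = v if op == "insert" else None
    return result

record("insert", "a", 1)
record("insert", "a", 2)
record("delete", "a")
record("insert", "b", 3)
```

Writes are as cheap as possible, but every query pays the full cost of history.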
So, how can we fix this?
Unifying B+ trees and event logs
The first idea to understand is this: how can we combine the write performance of an event log with the query performance of a B+ tree?
The answer is that we’re going to "overlay" an event log on the B+ tree!
The idea of the overlay is this: each index node of the B+ tree will contain an event log.
Whenever we write data, we’ll just append the operation (insert or delete) to the end of the root index node’s event log.
In order to avoid the pitfall of appending every operation to an ever-growing event log (which would leave us stuck with linear queries), we’ll put a limit on the number of events that fit in the log.
Once the log has overflowed in the root, we'll push the events in that log toward their eventual destinations, adding them to the event logs of that node's children.
Eventually, the event log will overflow to a leaf node, at which point we’ll actually do the insertion into the B+ tree.
This process gives us several properties:
Most inserts are a single append to the root’s event log
Although there are a linear number of events, nodes are exponentially less likely to overflow the deeper they are in the tree
All data needed for a query exists along a path of nodes between the root and a specific leaf node. Since the logs are constant in size, queries still only read O(logb(n)) nodes.
Thus we dramatically improve the performance of insertions without hurting the IO cost of queries.
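The mechanism above can be sketched in miniature. This deliberately small example (all names are illustrative) uses a fixed two-level tree: the root buffers operations in a bounded event log, flushes them down to the correct leaf only when the log overflows, and queries consult the log before the leaf so buffered events are never missed. The real hitchhiker tree applies this recursively at every level.

```python
import bisect

LOG_LIMIT = 4  # maximum buffered events before a flush

class Root:
    def __init__(self, split_keys, leaves):
        self.split_keys = split_keys  # routes each key to a leaf
        self.leaves = leaves          # each leaf is a plain dict
        self.log = []                 # buffered (op, key, value) events

    def insert(self, key, value):
        self._append(("insert", key, value))

    def delete(self, key):
        self._append(("delete", key, None))

    def _append(self, event):
        # Most writes are just this single append to the root's log.
        self.log.append(event)
        if len(self.log) > LOG_LIMIT:
            self._flush()

    def _flush(self):
        # On overflow, push each buffered event toward its destination
        # leaf and apply it there.
        for op, key, value in self.log:
            leaf = self.leaves[bisect.bisect_right(self.split_keys, key)]
            if op == "insert":
                leaf[key] = value
            else:
                leaf.pop(key, None)
        self.log = []

    def lookup(self, key):
        # A query reads the logs along the root-to-leaf path (here,
        # just the root's) on top of the leaf itself.
        found = self.leaves[bisect.bisect_right(self.split_keys, key)].get(key)
        for op, k, v in self.log:  # later events win
            if k == key:
                found = v if op == "insert" else None
        return found

tree = Root(split_keys=[10], leaves=[{}, {}])
tree.insert(1, "a")
tree.insert(12, "b")
```

At this point both writes still live only in the root's log; the leaves are untouched until the log overflows.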
Now that we have a sketch of how to combine event logs and B+ trees, let’s see the beauty of making the whole thing functional and persistent!
Since the combined B+/log data structure mostly modifies nodes near the root, we can exploit that locality to reduce IO when persisting the tree.
We can use the standard path-copying technique from functional, persistent data structures.
This gives great performance, since the structure is designed to avoid needing to copy entire paths—most writes will only touch the root.
Furthermore, we can batch many modifications together and delay flushing the tree, to amortize the IO even further.
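Path copying itself is simple to sketch. In this minimal example (all names are illustrative), nodes are treated as immutable once created; an update copies only the nodes on the root-to-leaf path and shares every untouched child with the previous version. In the hitchhiker tree, most writes only append to the root's log, so this usually amounts to copying just the root.

```python
import bisect

def make_node(keys, children):
    # Nodes are immutable once created; tuples make that explicit.
    return {"keys": tuple(keys), "children": tuple(children)}

def insert(root, key, value):
    # Copy only the affected child and the root; every other child
    # is shared, untouched, with the previous version of the tree.
    i = bisect.bisect_right(root["keys"], key)
    new_child = dict(root["children"][i])
    new_child[key] = value
    children = list(root["children"])
    children[i] = new_child
    return make_node(root["keys"], children)

old = make_node([10], [{1: "a"}, {12: "b"}])
new = insert(old, 2, "c")
```

The old version remains fully usable after the update, which is exactly what makes the structure persistent.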