This is the third essay in a series about databases, relationality and a new approach to managing data. See part 1 and then part 2 for background. In the last essay, we introduced a snapshot of this table with only the first three rows visible. In this essay I’d like to talk about the fourth row included below and what it might look like to construct such a thing.
RhizomeDB
RhizomeDB is a new kind of data store that allows you to store data in a way that’s functionally unstructured while preserving integrated relationality. It does this by locking down a few moving parts.
We don’t store state, actually. At least, not in the way you’re used to thinking about it. The atomic unit of RhizomeDB is a delta.
RhizomeDB is append-only, and once written a delta is immutable.
We have a single fully universal schema (the Delta schema) which is used to model all data in the system. This allows the interface to be what I call “functionally unstructured” - the tool ensures that the fully general schema is applied at write-time automatically.
A delta in this system is technically a with some specific properties.
A traditional RDBMS normalizes data by breaking records down across rows and columns; Rhizome normalizes data by breaking records down into deltas.
Rhizome itself is agnostic as to how these deltas are persisted, indexed or otherwise treated which can vary across implementations.
Delta Schema
A Delta has a specific shape. Here’s a TypeScript interface defining a delta:
// UUIDs are modeled as plain strings here so the interface is self-contained
type UUID = string
type primitive = string | number | boolean
interface RhizomaticDelta {
  id: UUID
  timestamp: Date
  creator: UUID
  host: UUID
  transaction: UUID
  pointers: {
    local_context: string
    target: UUID | primitive
    target_context?: string
  }[]
}
The idea is that this delta represents a specific association between one or more things, according to some specific creator, as of some point in time, as captured in some specific system, as a part of some specific transaction.
Let’s look at a concrete example. Let’s say that we are capturing some information from a Movies database as discussed in a prior essay. We will define a delta that changes the universe of our datastore such that as of some timestamp T in our datastore’s history it is the case that Keanu Reeves starred as Neo in The Matrix. That delta might look something like this:
// the ID of this delta, unique to this delta
const id: UUID = "..."
// the timestamp as of which this delta is true for us
const timestamp: Date = new Date()
// a UUID representing a user inserting this data
const creator: UUID = "..."
// a UUID representing this specific data store
const host: UUID = "..."
// a UUID representing the transaction containing this delta
const transaction: UUID = "..."
// a UUID representing Keanu Reeves
const keanu: UUID = "..."
// a UUID representing the character Neo
const neo: UUID = "..."
// a UUID representing the film The Matrix
const the_matrix: UUID = "..."
// the base salary amount, denominated in the currency below
const salary_usd: number = 10000000
const currency: string = "usd"

const delta: RhizomaticDelta = {
  id,
  timestamp,
  creator,
  host,
  transaction,
  pointers: [
    {
      local_context: "actor",
      target: keanu,
      target_context: "roles"
    },
    {
      local_context: "role",
      target: neo,
      target_context: "actor"
    },
    {
      local_context: "film",
      target: the_matrix,
      target_context: "cast"
    },
    {
      local_context: "base_salary",
      target: salary_usd
    },
    {
      local_context: "salary_currency",
      target: currency
    }
  ]
}
So you create this delta, wrap it in a transaction with potentially other deltas that you want to either succeed-or-fail together, and then push it to the Rhizome engine. Rhizome then appends this delta to the canonical append-only stream of deltas, and then updates any indexes you’ve got that are paying attention to any combination of the domain entities targeted or contexts referenced.
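To make that write path a little more tangible, here is a minimal in-memory sketch of it. Everything in it is an assumption for illustration: the DeltaLog class, its subscribe and appendTransaction methods, and the rule that every delta in a batch must carry the same transaction ID are hypothetical names and choices, not RhizomeDB's actual API. It checks the whole batch before writing anything, appends each delta to the stream, and then notifies whatever indexes are listening.

```typescript
// Hypothetical sketch: none of these class or method names come from RhizomeDB.
type UUID = string
type Primitive = string | number | boolean

interface Pointer {
  local_context: string
  target: UUID | Primitive
  target_context?: string
}

interface RhizomaticDelta {
  id: UUID
  timestamp: Date
  creator: UUID
  host: UUID
  transaction: UUID
  pointers: Pointer[]
}

// An "index" here is just a callback that hears about each appended delta.
type IndexListener = (delta: Readonly<RhizomaticDelta>) => void

class DeltaLog {
  private readonly deltas: Readonly<RhizomaticDelta>[] = []
  private readonly listeners: IndexListener[] = []

  subscribe(listener: IndexListener): void {
    this.listeners.push(listener)
  }

  // Append every delta in the batch, or none of them.
  appendTransaction(transaction: UUID, batch: RhizomaticDelta[]): void {
    // Assumed rule: validate the whole batch before touching the log.
    for (const delta of batch) {
      if (delta.transaction !== transaction) {
        throw new Error(`delta ${delta.id} is not part of transaction ${transaction}`)
      }
    }
    for (const delta of batch) {
      // Shallow freeze of a copy: top-level fields can no longer be reassigned.
      const frozen = Object.freeze({ ...delta })
      this.deltas.push(frozen)
      for (const notify of this.listeners) notify(frozen)
    }
  }

  // Read-only view of the stream, in append order.
  all(): readonly Readonly<RhizomaticDelta>[] {
    return this.deltas
  }
}
```

A real engine would also have to deal with durable storage and with indexes that fail partway through a batch, both of which this sketch ignores.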
The prelude - those fields above pointers - feels pretty self-explanatory, right? This is just metadata on the delta itself, so we can track where it came from. This is useful because deltas can be shared between Rhizome systems - if you grab my movie data and add it to your local system, you still want to know which deltas came from me, etc.
The more tricky part is the pointers array, so let’s break that down a bit.
Understanding Pointers
A delta is a relationship between domain entities and/or primitives that’s true within some system according to some user as of some point in time. Looking at our delta above, we can see that we are asserting a specific relationship between a specific actor, a specific role, a specific film and a specific salary. But our domain entities - Keanu, Neo, The Matrix - are all pass-by-reference using UUIDs. The delta itself doesn’t contain any information about them except what it’s asserting.
You might think this means that we have to have some separate pass somewhere where we create these entities, right? Like, I need to define a “Keanu Reeves” entity before I can reference it, right?
Well... no, actually. All I need to do is make sure that I’m using a consistent UUID for every domain entity. Maybe this is the only delta in my local database that refers to Keanu Reeves. If that’s the case, nowhere is his name or gender or birthday specified - that information just isn’t in the system yet. But I can add additional deltas that point to the same UUID, target the string “Keanu Reeves” or the string “male” or some timestamp, etc., and specify contexts to articulate why those primitives are being targeted.
This means that “Keanu Reeves”, in this hypothetical system, does not exist except as the collection of deltas that reference the same UUID.
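Here is a small sketch of what that could look like. The placeholder IDs, the makeDelta helper, the deltasReferencing function and the context names ("named", "name", "person", "gender") are all made up for illustration; nothing in Rhizome prescribes them. Two deltas attach a name and a gender to the same Keanu UUID, and "Keanu Reeves" is then recovered simply by collecting every delta that targets that UUID.

```typescript
type UUID = string
type Primitive = string | number | boolean

interface Pointer {
  local_context: string
  target: UUID | Primitive
  target_context?: string
}

interface RhizomaticDelta {
  id: UUID
  timestamp: Date
  creator: UUID
  host: UUID
  transaction: UUID
  pointers: Pointer[]
}

// Placeholder ID for illustration; a real system would use an actual UUID.
const KEANU: UUID = "keanu-placeholder-uuid"

// Hypothetical helper that fills in the prelude so the examples stay short.
let counter = 0
function makeDelta(pointers: Pointer[]): RhizomaticDelta {
  counter += 1
  return {
    id: `delta-${counter}`,
    timestamp: new Date(),
    creator: "creator-placeholder-uuid",
    host: "host-placeholder-uuid",
    transaction: `tx-${counter}`,
    pointers,
  }
}

const deltas: RhizomaticDelta[] = [
  // "The entity with this UUID is named Keanu Reeves."
  makeDelta([
    { local_context: "named", target: KEANU, target_context: "name" },
    { local_context: "name", target: "Keanu Reeves" },
  ]),
  // "The entity with this UUID has gender male."
  makeDelta([
    { local_context: "person", target: KEANU, target_context: "gender" },
    { local_context: "gender", target: "male" },
  ]),
  // An unrelated delta about a different entity.
  makeDelta([
    { local_context: "named", target: "matrix-placeholder-uuid", target_context: "name" },
    { local_context: "name", target: "The Matrix" },
  ]),
]

// Every delta with at least one pointer whose target is the given entity.
function deltasReferencing(entity: UUID, all: RhizomaticDelta[]): RhizomaticDelta[] {
  return all.filter(d => d.pointers.some(p => p.target === entity))
}

const keanuDeltas = deltasReferencing(KEANU, deltas)
console.log(keanuDeltas.length) // 2
```

One caveat of modeling UUIDs as plain strings in this sketch: a primitive string that happened to equal a UUID would match too, so a real implementation would probably want to tell references and primitives apart.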
The fields on a pointer are sort of like components of an RDF schema, right:
local_context says “what is the target from the perspective of this delta?”
target is a reference to some domain entity or primitive value.
target_context, which is optional, says “what is this delta defining from the perspective of this target?”
This lets me then do things at query time like “Grab all deltas that point to the keanu UUID in any way and integrate them into a single object.” If our delta above is included, then that object would look in part like this: