Bytes

Ordered Key-Value Stores deal with bytes. What database or program does not? Let's try again: bytes are exposed in an OKVS public interface. Yes, you read that correctly! When you want it, you can dive as low-level as bytes. There is even some occurrences where you need to perfect the representation of objects in the database deeper at the bit level.

Key and value

An Ordered Key-Value Store is also a key-value store. An OKVS will first and foremost associate a key with value. That is to say, if you know a key, you can retrieve the associated value. Does it ring a bell? Yes! This is called a dictionary or mapping or even associative array.

A key is unique

Like in a key-value store, Ordered Key-Value Store can only associate a key to a single value.

Keys are ordered

Unlike a hash map, an OKVS is ordered: it is a sorted container. This is the most important concept to understand, grok and remember. An immediate consequence of the ordered nature of the key space, is that given a key, it makes sense to query for the previous key or the next key given the invariant that the previous key will be smaller and the next key will be bigger.

Range and Prefix

This is an immediate consequence of the ordered nature of OKVS. It possible to retrieve all keys that are between two bytes: a range. There is a specific kind of range called prefix. A range of prefix or prefix range is all the keys that starts with a given key.

Lexicographic order

It is a generalization of the alphabetical order of the natural language dictionaries. It is the default ordering used when comparing keys that will determine their position in the key space. In an OKVS, keys and values are sequence of bytes where a byte is a positive integer at most 255. Imagine two finite byte sequences $a$, and $b$:

$a = [a_1, a_2, a_3, ..., a_i]$

$b = [b_1, b_2, b_3,...,b_j]$

There is three cases:

1.

If $i < j$ it means the sequence $a$ is made of less bytes, but it does not mean that $a$ is smaller. In a natural language dictionary, the word "zebra" is smaller in length than "elephant" but "zebra" is at the end of the dictionary, and appears after "elephant". One take away: it is not the length of a key that dictate its position in the key space. To determine the order, it is required to compare every element of the byte sequence in pairs picking one from $a$ and one from $b$ in the order they appear.
If there is an index $k$ such as:

$\forall l \in \N \\ 0 \leq l<k \leq i \implies a_i = b_i$

Then $a$ and $b$ have a common prefix. Then $a$ is smaller than $b$ if and only if $a_k < b_k$

...

Packing and unpacking

A bijection, bijective function, one-to-one correspondence, or invertible function, that allows to represent basic objects e.g. string, integer, float, and composition of those inside containers e.g. tuple, list, arrays or vectors as bytes. This is also called serialization. One of the most popular textual serialization format is JSON.

Lexicographic packing

Lexicographic packing is a reversible algorithm that represent objects as bytes while preserving their natural ordering. Let's consider $\Alpha$ the set of all possible objects in a program, $A$ is a totally ordered set, a function $f$ is a lexicographic packing function if it is reversible:

$\forall x \in \Alpha \\ f^{-1}(f(x)) = x$

And if $f$ preserve the total order:

$\forall x, y \in \Alpha \\ x \le y \iff f(x) \le f(y)$