Rework draft.
Transaction begin, rollback, before and after commit hooks are missing.
Maybe add specification about thread safety.
Add a parameter okdb-transaction-hygiene
that may be used to set
the desired serializability guarantee, possibly on a per transaction
basis with the help of parametrize. Using pseudonyms, that maybe
change over time: perfect, strong, weak, none. And also using SQL
standard names: serializable, snapshot-isolation, read-commited,
read-uncommited.
General purpose backend storage datastructure for building in-memory or on-disk databases that can handle concurrency thanks to ACID transactions.
okdb
can be the primitive datastructure for building many
datastructures. Low level extensions include counter, bag, set,
mapping-multi, binary object store. Higher level extensions include
Entity-Attribute-Value
possibly supported by datalog, Generic Tuple Store (nstore) inspired
from Resource Description
Framework
that can trivialy match the query capabilities of
SPARQL, nstore can
painlessly implement
RDF-star, or
even the Versioned Generic Tuple Store (vnstore), that ease the
implementation of bitemporal databases, and datomic high level
interface. Also, it is possible to implement a property graph
database, ranked set, leaderboard, priority queue. It is possible to
implement efficiently geometric queries such as xz-ordered curves.
okdb
is useful in the context of on-disk persistence. okdb
is also
useful in a context such as client-server applications, where the
client need to cache heterogeneous data. It may be used in the
browser, or in microservice configuration as a shared in-memory
datastructure.
There is several existing databases that expose an interface similar
to okdb
, and even more that use an ordered key-value store (okvs) as
their backing storage.
While okdb
interface is lower-level than the mainstream SQL, it is
arguably more useful because the implementation stems from a
well-known datastructure part of every software engineering
curriculum, namely binary trees, also because it allows to implement
SQL, last but not least it reflects the current practice that builds
(distributed) databases systems based on a similar interface.
okdb
extends, and departs from the common okvs interface inherited
from BerkeleyDB, to ease the implementation thanks to bounded keys and
values, while making the implementation of efficient extensions easier
also thanks to the ability to estimate the count of keys, and the size
of key-value pairs, in a given range.
This SRFI should offer grounds for apprentices to learn about data storage. It will also offer a better story (the best?) for managing data that may be durable, and read, or written concurrently.
(make-okdb filepath [block-read block-write]) string? procedure? procedure? → okdb?
Rationale: In SRFI-167, make-okvs
could take various options. The
interface was difficult, and did not work well. Instead, of trying to
define a couple of options, a left aside others. With okdb
it left
to downstream to deal with options. It is the responsability of the
implementer, and possibly eventually to the user to deal with options
in an appropriate way. One way, for the implementer to enable options
is to create a super procedure that a) returns multiple values,
including the constructor, b) rely on generic methods, or something
else.
Return a handle of the database. FILEPATH
may be a string describing
the on-disk file or directory where the database will be saved. In the
case where okdb
work only from memory, it should be ignored.
BLOCK-READ
and BLOCK-WRITE
if provided will be used to consume or
produce bytes that will be read or written to disk, possibly using
cryptography or compression on input bytes. Both BLOCK-READ
and
BLOCK-WRITE
will take a generator and an accumulator as argument.
(okdb? obj) * → boolean?
Returns #t
if OBJ
is an <okdb>
instance. Otherwise, returns
#f
.
(okdb-close! db) okdb?
Close DB
. All transactions that completed successfully should be
available the next time the database is open with make-okdb
except
in the case of fully in-memory database.
(okdb-transaction? obj) * → boolean?
Returns #t
if OBJ
is an <okdb-transaction>
instance. Otherwise,
returns #f
.
(okdb-cursor? obj) * → boolean?
Returns #t
if OBJ
is an <okdb-cursor>
instance. Otherwise,
returns #f
.
(okdb-handle? obj) * → boolean?
Returns #t
if OBJ
satisfy either okdb?
, okdb-transaction?
, or
okdb-cursor?
. Otherwise, returns #f
.
(okdb-key-max-size handle) handle? → number?
Rationale: Most okvs implementations do not specify the maximum size
of keys, making both the implementation and its use erratic. The same
maximum size might not work in all situations, hence it might be
subject to customization. The important is to guarantee some
predicatable performance when keys follow that constraint. It also
makes the implementation of okdb
easier, among other thing because
the implementation does not need to have to handle large binaries.
That is not a negligeable constraint for the user as keys max size are
not necessarly predicatable, but in any case should be small since in
most implementations those are kept in memory.
Return the maximum size of key for the database associated with
HANDLE
. It is an error to call okdb-key-max-size
if
okdb-key-max-size!
was never called before.
(okdb-key-max-size! okdb size) okdb? number?
Questions: Can SIZE
be +inf.0
? Does it work across restarts?
Replace okdb?
with handle?
? Investigate why FDB does limit those.
Set the maximum SIZE
of a key for the database OKDB
.
(okdb-value-max-size handle) handle? → number?
Rationale: Same as the above: it is easier to implement. For the user perspective, it is much easier to handle the situation of large values since they can be split without loosing features.
Return the maximum size of a value of the database associated with
HANDLE
.
(okdb-value-max-size! okdb size) okdb? number?
Questions: same as okdb-value-max-size!
Set the maximum SIZE
of a value for the database OKDB
.
(okdb-conflict? obj)
Returns #t
if OBJ
is a conflict error. Otherwise returns
#f
. Such object may be raised by okdb-in-transaction
.
(okdb-in-transaction okdb proc [failure [success]]) okdb? procedure? procedure? procedure? → (values (every? *))
Rationales:
okdb-in-transaction
does not include a retry logic when
okdb-conflict?
is raised because retrying might require to wait
which depends on the implementation but also and more importantly on
user code. The user is in the best position to know when, and how to
retry the transaction. The last resort strategy is not even to retry
the transaction immediatly, but to put the operation in queue possibly
persisted in the database, and force the serialization through a
single thread. In any case, retry should be explicit in user code.
Serializable scheme trades guarantees regarding the consistency of the data, and hence ease of development because the state of the database is determinist versus performance. The prescription of serializable transactions is a strong requirement, that was thusfar almost completly left aside in the industry in favor of snapshot isolation. The philosophy here is: make it work, then make it fast. It is not possible to build reliable systems upon claims that are weak, or false, in the general case.
Nested transactions were ruled out because it is still not clear whether they put a strain on the implementation that does not yield much help in user code. Nested transactions are similar to savepoints or autonomous transactions.
okvs-in-transaction
describes the extent of the atomic property, the
A in ACID, of changes against
the underlying database. A transaction will apply all database
operations in PROC
or none: all or nothing. When
okdb-in-transaction
returns successfully, the changes will be
visible for future transactions, and implement durability, D in ACID,
and when the database implements on-disk storage, across restarts. In
case of error, changes will not be visible to other transactions in
all cases. Regarding isolation, and consistency, respectively the I
and C in ACID,
serializable
transactions is prescribed: the concurrent execution of
(okvs-in-transaction okdb proc ...)
should render the database as if
the concurrent transactions were executed serially ie. without
overlapping time, in some order, possibly rejecting some of them with
an error that satisfy okdb-conflict?
. In particular, it is stronger
than snapshot isolation.
Begin a transaction against the database, and execute PROC
. PROC
is called with first and only argument an object that satisfy
okdb-transaction?
. In case of error, rollback the transaction and
execute FAILURE
with the error object as argument. The default value
of FAILURE
re-raise the error with raise
. Otherwise, executes
SUCCESS
with the returned values of PROC
. The default value of
SUCCESS
is the procedure values
.
okdb
does not support nested transactions.
TODO: what about hooks
In case okvs-in-transactions
raise an error that satisfy
okdb-conflict?
, the user may re-run the same transaction taking care
that PROC
is idempotent.
(okdb-call-with-cursor handle proc) okdb-handle? procedure? → (values (every? *))
Open a cursor against HANDLE
and call PROC
with it. When PROC
returns, the cursor is closed.
If HANDLE
satisfy okdb?
, okdb-call-with-cursor
must wrap the
call to PROC
with okvs-in-transaction
.
If HANDLE
satisfy okdb-cursor?
, okdb-call-with-cursor
must pass
a cursor positioned at the same position as HANDLE
and the same keys
snapshot.
The produced cursor is not positioned, except when HANDLE
satisfy
okdb-cursor?
, it is an error to call okdb-cursor-next?
,
okdb-cursor-previous?
, okdb-cursor-key
, or
okdb-cursor-value
. When HANDLE
does not satisfy okdb-cursor?
,
the user should call okdb-cursor-seek?
immediatly after
okdb-call-with-cursor
.
The cursor that is created will see a snapshot of keys of the database
inside the transaction taken when okdb-call-with-cursor
is
called. During the extent of PROC
, okdb-set!
and okdb-remove!
will not change the position of the cursor, and the cursor will see
removed keys and not see added keys. Keys which value was changed
during cursor navigation, that exist when okdb-call-with-cursor
is
called, can be seen. That is, the cursor is stable.
(okdb-estimate-key-count handle [key other [offset [limit]]]) handle? bytevector? bytevector? integer? integer? → integer?
Rationale: It is helpful to know how big is a range to be able to tell which index to use as seed. Imagine a query against two attributes, each attribute with their own index and no compound index: being able to tell which subspace contains less keys, can speed up significantly query time.
Return an estimate count of keys between KEY
and OTHER
. If KEY
and OTHER
are omitted return the approximate count of keys in the
whole database.
If OFFSET
is provided, okdb-estimate-key-count
will skip the first
OFFSET
keys from the count.
If LIMIT
is provided, okdb-estimate-key-count
will consider LIMIT
keys from the count.
(okdb-estimate-bytes-count handle [key [other [offset [limit]]]]) okdb-handle? bytevector? bytevector? integer? integer? → integer?
Rationale: That is useful in cases where the size of a transaction is limited.
Return the estimated size in bytes of key-value pairs in the subspace
described by KEY
and OTHER
. If OTHER
is omitted, return the
approximate size of the key-value pair associated with
KEY
. Otherwise, return the estimated size of the whole database
associated with HANDLE
.
If OFFSET
is provided, okdb-estimate-bytes-count
will skip the
first OFFSET
keys from the count.
If LIMIT
is provided, okdb-estimate-bytes-count
will consider LIMIT
keys from the count.
(okdb-set! handle key value) okdb-handle? bytevector? bytevector?
Associate to the bytevector KEY
, with the bytevector VALUE
. If
HANDLE
satisfy OKDB?
wrap the operation with okdb-in-transaction
.
If HANDLE
satisfy okdb-cursor?
, okdb-set!
does not change the
position of HANDLE
.
(okdb-remove! handle key) okdb-handle? bytevector?
Removes the bytevector KEY
, and its associated value. If HANDLE
satisfy OKDB?
wrap the operation with okdb-in-transaction
.
If HANDLE
satisfy okdb-cursor?
, okdb-set!
does not change the
position of HANDLE
.
(okdb-cursor-seek cursor strategy key) okdb-cursor? symbol? bytevector? → symbol?
Position the CURSOR
using STRATEGY
. STRATEGY
can be one of the
following symbol:
less-than-or-equal
equal
greater-than-or-equal
The strategy less-than-or-equal
will first seek for the biggest key
that is less than KEY
, if there is one, it returns the symbol less
.
Otherwise, if there is a key that is equal to KEY
it will return the
symbol equal
. If there is no valid position for the given KEY
, it
fallbacks to the symbol not-found
.
The strategy equal
will seek a key that is equal to KEY
. If there
is one it will return the symbol equal
. Otherwise, it returns the
symbol not-found
.
The strategy greater-than-equal
will first seek the smallest key that
is greater than KEY
, if there is one, it returns the symbol
greater
. Otherwise, if there is a key that is equal to KEY
it
will return the symbol equal
. If there is no valid position for the
given KEY
, it fallbacks the symbol not-found
.
(okdb-cursor-next? cursor) okdb-cursor? → boolean?
Move the CURSOR
to the next key if any. Return #t
if there is such
a key. Otherwise returns #f
. #f
means the cursor reached the end
of the key space.
(okdb-cursor-previous? cursor) okdb-cursor? → boolean?
Move the CURSOR
to the previous key if any. Return #t
if there is
such a key. Otherwise returns #f
. #f
means the cursor reached the
begining of the key space.
(okdb-cursor-key cursor)
okdb-cursor? → bytevector?`Return the key bytevector where CURSOR
is positioned. It is an error
to call okdb-cursor-key
, when CURSOR
reached the begining or end
of the key space or when CURSOR
is not positioned.
(okdb-cursor-value cursor)
okdb-cursor? → bytevector?`Return the value bytevector where CURSOR
is positioned. It is an
error to call okdb-cursor-key
, when CURSOR
reached the begining or
end of the key space or when CURSOR
is not positioned.
(okdb-query handle key [other [offset [limit]]]) handle? bytevector? bytevector? integer? integer? → (either? bytevector? procedure?)
OKDB-QUERY
will query the associated database. If only KEY
is
provided it will return the associated value bytevector or #f
. If
OTHER
is provided there is two cases:
KEY < OTHER
then okdb-query
returns a generator with all the
key-value pairs present in the database between KEY
and OTHER
excluded ie. without the key-value pair associated with OTHER
if
any.
OTHER < KEY
then okdb-query
returns the equivalent of reversing
the generator returned by (okdb-query handle KEY OTHER)
.
If OFFSET
is provided the generator will skip as much key-value
pairs from the start of the generator.
If LIMIT
is provided the generator will generate at most LIMIT
key-value pairs.
(okdb-bytevector-next-prefix bytevector) bytevector? → bytevector?
Return the bytevector that follows BYTEVECTOR
according to
lexicographic order that is not prefix of BYTEVECTOR
such as the
following code iterates over all keys that have key
as prefix:
(okdb-query handle key (okdb-bytevector-next-prefix key))