Leveled - An Erlang Key-Value Store
Introduction
Leveled is a simple Key-Value store based on the concept of Log-Structured Merge Trees, with the following characteristics:
- Optimised for workloads with larger values (e.g. > 4KB).
- Explicitly supports HEAD requests in addition to GET requests:
  - Splits the storage of value between keys/metadata and body (assuming some definition of metadata is provided);
  - Allows for the application to define what constitutes object metadata and what constitutes the body (value-part) of the object, and to assign tags to objects to manage multiple object-types with different extraction rules;
  - Stores keys/metadata in a merge tree and the full object in a journal of CDB files, allowing for:
    - HEAD requests which have lower overheads than GET requests; and
    - queries which traverse keys/metadata to be supported with fewer side effects on the page cache than folds over keys/objects.
- Support for tagging of object types and the implementation of alternative store behaviour based on type.
  - Allows for changes to extract specific information as metadata to be returned from HEAD requests;
  - Potentially usable for objects with special retention or merge properties.
- Support for low-cost clones without locking to provide for scanning queries (e.g. secondary indexes).
  - Low cost specifically where there is a need to scan across keys and metadata (not values).
- Written in Erlang as a message passing system between Actors.
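The split between keys/metadata and the object body listed above can be pictured with a toy model. The sketch below is not Leveled's implementation: the module name head_get_sketch, its function names, and the use of two in-memory maps (one standing in for the merge tree, one for the journal) are all assumptions made purely for illustration.

```erlang
-module(head_get_sketch).
-export([new/0, store_put/4, store_head/2, store_get/2]).

%% Toy model only. The 'ledger' map stands in for the merge tree of
%% keys/metadata, the 'journal' map for the journal of full objects.
new() ->
    #{ledger => #{}, journal => #{}}.

%% The caller decides what counts as metadata and what counts as body.
%% Metadata is kept next to the key, the body is kept separately.
store_put(Key, Metadata, Body, #{ledger := L, journal := J} = Store) ->
    Store#{ledger := L#{Key => Metadata},
           journal := J#{Key => Body}}.

%% HEAD: answered from the ledger alone, the body is never touched.
store_head(Key, #{ledger := L}) ->
    case maps:find(Key, L) of
        {ok, Metadata} -> {ok, Metadata};
        error -> not_found
    end.

%% GET: confirm the key in the ledger, then fetch the body from the journal.
store_get(Key, #{ledger := L, journal := J}) ->
    case maps:is_key(Key, L) of
        true -> {ok, maps:get(Key, J)};
        false -> not_found
    end.
```

In this model a HEAD touches only the small metadata map, which is the property that makes HEAD requests and metadata-only folds cheaper than their GET equivalents when values are large.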
The store has been developed with a focus on being a potential backend to a Riak KV database, rather than as a generic store. It is intended to be a fully-featured backend - including support for secondary indexes, multiple fold types and auto-expiry of objects.
An optimised version of Riak KV has been produced in parallel which will exploit the availability of HEAD requests (to access object metadata including version vectors), where a full GET is not required. This, along with reduced write amplification when compared to leveldb, is expected to offer significant improvement in the volume and predictability of throughput for workloads with larger (> 4KB) object sizes, as well as reduced tail latency.
There may be more general uses of Leveled, with the following caveats:
- Leveled should be extended to define new tags that specify what metadata is to be extracted for the inserted objects (or to override the behaviour for the ?STD_TAG). Without this, there will be limited scope to take advantage of the relative efficiency of HEAD and FOLD_HEAD requests.
- If objects are small, the head_only mode may be used, which will cease separation of object body from header and use the Key/Metadata store as the only long-term persisted store. In this mode all of the object is treated as Metadata, and the behaviour is closer to that of the leveldb LSM-tree, although with higher median latency.
More Details
For more details on the store:
- An introduction to Leveled covers some context to the factors motivating design trade-offs in the store.
- The design overview explains the actor model used and the basic flow of requests through the store.
- Future work covers new features being implemented at present, and improvements necessary to make the system production ready.
- There is also a "Why" section looking at lower level design choices and the rationale that supports them.
Feedback
Please create an issue if you have any suggestions. You can ping me @masleeds if you wish.
Running Leveled
Unit and common tests in leveled should run with rebar3.
A new database can be started by running
{ok, Bookie} = leveled_bookie:book_start(StartupOptions)
This will start a new Bookie. It will look for existing data files under the RootPath, and start empty if none exist. Further information on startup options can be found here.
The book_start method should respond once startup is complete. The leveled_bookie module includes the full API for external use of the store.
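As a minimal sketch of a session against that API, the expressions below start a store, write and read one object, and close the store. They assume leveled is on the code path (for example from a ./rebar3 shell), that the root_path startup option names the data directory, and that book_put/5, book_get/3, book_head/3 and book_close/1 are exported by leveled_bookie with these argument orders. The path, bucket, key and value are arbitrary examples; check the module documentation before relying on any of this.

```erlang
%% Assumption: root_path is the startup option naming the data directory.
StartupOptions = [{root_path, "/tmp/leveled_example"}],
{ok, Bookie} = leveled_bookie:book_start(StartupOptions),

%% Buckets and keys are binaries. The final argument is a list of
%% secondary-index changes, empty here.
%% The result is expected to be ok, or pause when the store asks the
%% caller to slow down.
PutResult = leveled_bookie:book_put(
    Bookie, <<"bucket">>, <<"key1">>, <<"some value">>, []),
true = lists:member(PutResult, [ok, pause]),

%% GET returns the full object; HEAD returns only the metadata.
{ok, <<"some value">>} =
    leveled_bookie:book_get(Bookie, <<"bucket">>, <<"key1">>),
{ok, _Metadata} =
    leveled_bookie:book_head(Bookie, <<"bucket">>, <<"key1">>),

%% A key that was never written.
not_found = leveled_bookie:book_get(Bookie, <<"bucket">>, <<"missing">>),

ok = leveled_bookie:book_close(Bookie).
```

The shape of the metadata returned by the HEAD call depends on the tag in use, so the sketch only checks that something was returned.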
Running in Riak requires Riak 2.9 or beyond, which has been available since January 2019.
There are two main branches under active development:
- develop-3.4 (default): Target for the Riak 3.4 release with support for OTP 24 and OTP 26;
- develop-3.1: Target for the Riak 3.2 release with support for OTP 22 and OTP 24.
There are two legacy branches, used in older versions of Riak:
- develop-3.0: Used in the Riak 3.0 release with support for OTP 20 and OTP 22;
- develop-2.9: Used in the Riak 2.9 release with support for OTP R16 through to OTP 20.
Contributing
To contribute to leveled, fork the repository, make a branch for your changes, and open a pull request. The acceptance criteria for updating leveled are that it passes rebar3 dialyzer, xref, eunit, and ct with 100% coverage.
To have rebar3 execute the full set of tests, run:
./rebar3 do xref, dialyzer, cover --reset, eunit --cover, ct --cover, cover --verbose
For those with a Quickcheck license, property-based tests can also be run using:
./rebar3 as eqc do eunit --module=leveled_simpleeqc, eunit --module=leveled_statemeqc