Stop using camel-case
commit bffd1fd7e9 (parent ff31ca2d86)
3 changed files with 28 additions and 8 deletions
@@ -1,8 +1,8 @@
-# LevelEd - An Erlang Key-Value store
+# Leveled - An Erlang Key-Value store
 
 ## Introduction
 
-LevelEd is a work-in-progress prototype of a simple Key-Value store based on the concept of Log-Structured Merge Trees, with the following characteristics:
+Leveled is a work-in-progress prototype of a simple Key-Value store based on the concept of Log-Structured Merge Trees, with the following characteristics:
 
 - Optimised for workloads with larger values (e.g. > 4KB).
 
@@ -29,7 +29,7 @@ The store is not expected to offer lower median latency than the Basho-enhanced
 
 For more details on the store:
 
-- An [introduction](docs/INTRO.md) to LevelEd covers some context to the factors motivating design trade-offs in the store.
+- An [introduction](docs/INTRO.md) to Leveled covers some context to the factors motivating design trade-offs in the store.
 
 - The [design overview](docs/DESIGN.md) explains the actor model used and the basic flow of requests through the store.
 
@@ -80,7 +80,7 @@ Further volume test scenarios are the immediate priority, in particular volume t
 
 - Use of newly available [EC2 hardware](https://aws.amazon.com/about-aws/whats-new/2017/02/now-available-amazon-ec2-i3-instances-next-generation-storage-optimized-high-i-o-instances/), which is potentially a significant change to assumptions about hardware efficiency and cost.
 
-- Create riak_test tests for new Riak features enabled by LevelEd.
+- Create riak_test tests for new Riak features enabled by Leveled.
 
 ## Feedback
 
@@ -42,10 +42,12 @@ There is some work required before LevelEd could be considered production ready:
 
 The following Riak features have been implemented
 
-### LevelEd Backend
+### Leveled Backend
 
 Branch: [mas-leveleddb](https://github.com/martinsumner/riak_kv/tree/mas-leveleddb)
 
 Branched-From: [Basho/develop](https://github.com/basho/riak_kv)
 
 Description:
 
 The leveled backend has been implemented with some basic manual functional tests. The backend has the following capabilities:
 
@@ -71,6 +73,8 @@ Note - the technique would work in leveldb and memory backends as well (and perh
 
 Branch: [mas-leveled-getfsm](https://github.com/martinsumner/riak_kv/tree/mas-leveled-getfsm)
 
 Branched-From: [mas-leveleddb](https://github.com/martinsumner/riak_kv/tree/mas-leveleddb)
 
 Description:
 
 In standard Riak the Riak node that receives a GET request starts a riak_kv_get_fsm to handle that request. This FSM goes through the following primary states:
 
@@ -95,3 +99,19 @@ So rather than doing three Key/Metadata/Body backend lookups for every request,
 The feature will not at present work safely with legacy vclocks. This branch generally relies on vector clock comparison only for equality checking, and removes some of the relatively expensive whole-body equality tests (either as a result of sets:from_list/1 or riak_object:equal/2), which are believed to be a legacy of issues with pre-dvv clocks.
 
 In tests, the benefit of this may not be that significant, as the primary resource saved is disk/network; if these are not the resources under pressure, the gain may be small. In tests bound by CPU rather than disk, only a 10% improvement has so far been measured with this feature.
+
+### PUT -> Using HEAD
+
+Branch: [mas-leveled-putfsm](https://github.com/martinsumner/riak_kv/tree/mas-leveled-putfsm)
+
+Branched-From: [mas-leveled-getfsm](https://github.com/martinsumner/riak_kv/tree/mas-leveled-getfsm)
+
+Description:
+
+The standard PUT process for Riak requires the PUT to be forwarded to a coordinating vnode first. The coordinating PUT process fetches the object from the local vnode only, and creates a new updated object. The remaining n-1 vnodes are then sent the updated object as a PUT, and once w/dw/pw nodes have responded the PUT is acknowledged.
+
+The other n-1 vnodes must also do a local GET before the vnode PUT (so as not to erase a more up-to-date value that may not have been present at the coordinating vnode).
+
+This branch changes the behaviour slightly at the non-coordinating vnodes. These vnodes will now try a HEAD request before the local PUT (not a GET request), and if the HEAD request returns a vclock which is **dominated** by the vclock of the updated PUT, they will not attempt to fetch the whole object for the syntactic merge.
+
+This should save two object fetches (where n=3) in most circumstances.
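
As a rough illustration of the HEAD-before-PUT decision described in the hunk above, the standalone sketch below models the choice made at a non-coordinating vnode. It is a toy, not the riak_kv code: the module name, the simplified vclock representation (a list of `{Actor, Counter}` pairs) and the result shapes are assumptions made purely for the example.

```erlang
%% Toy model of the non-coordinating vnode decision: HEAD first, and only
%% fall back to a full GET + syntactic merge when the incoming PUT does not
%% dominate the locally stored vclock. Names and shapes are illustrative.
-module(put_head_sketch).
-export([plan_local_put/2]).

%% descends(A, B) is true when clock A has seen at least everything in B.
descends(ClockA, ClockB) ->
    lists:all(fun({Actor, CountB}) ->
                      case lists:keyfind(Actor, 1, ClockA) of
                          {Actor, CountA} -> CountA >= CountB;
                          false -> false
                      end
              end,
              ClockB).

%% IncomingClock is the vclock on the coordinated PUT; the second argument
%% is what a cheap HEAD request against the local backend returned.
plan_local_put(_IncomingClock, not_found) ->
    %% Nothing stored locally - just write the incoming object.
    write_without_fetch;
plan_local_put(IncomingClock, {ok, LocalClock}) ->
    case descends(IncomingClock, LocalClock) of
        true ->
            %% The incoming PUT dominates the local value: no need to read
            %% the whole local object before overwriting it.
            write_without_fetch;
        false ->
            %% Concurrent update - fetch the full object and merge siblings.
            fetch_and_merge
    end.
```

For example, `put_head_sketch:plan_local_put([{a,2}], {ok, [{a,1}]})` returns `write_without_fetch`, while `plan_local_put([{a,2}], {ok, [{b,1}]})` returns `fetch_and_merge`; the saving comes from skipping the full local read in the common, non-concurrent case.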
@@ -66,9 +66,9 @@ The evolution of leveledb in Riak, from the original Google-provided store to th
 
 The original leveledb considered in part the hardware economics of the phone, where there are clear constraints around CPU usage - due to both form-factor and battery life - and where disk space may be at a greater premium than disk IOPS. Some of the evolution of eleveldb is down to the Riak-specific problem of needing to run multiple stores on a single server, where even load distribution may lead to a synchronisation of activity. Much of the evolution is also about how to make better use of the continuous availability of CPU resource, in the face of the relative scarcity of disk resource. Changes such as overlapping files at level 1, hot threads and compression improvements all move eleveldb in the direction of being easier on disk at the cost of CPU; and the hardware economics of servers would indicate this is a wise choice.
 
-### Planning for LevelEd
+### Planning for Leveled
 
-The primary design differentiation between LevelEd and LevelDB is the separation of the key store (known as the Ledger in LevelEd) and the value store (known as the Journal). The Journal is like a continuous extension of the nursery log within LevelDB, only with a gradual evolution into [CDB files](https://en.wikipedia.org/wiki/Cdb_(software)) so that file offset pointers are not required to exist permanently in memory. The Ledger is a merge tree structure, with values substituted with metadata and a sequence number - where the sequence number can be used to find the value in the Journal.
+The primary design differentiation between Leveled and LevelDB is the separation of the key store (known as the Ledger in Leveled) and the value store (known as the Journal). The Journal is like a continuous extension of the nursery log within LevelDB, only with a gradual evolution into [CDB files](https://en.wikipedia.org/wiki/Cdb_(software)) so that file offset pointers are not required to exist permanently in memory. The Ledger is a merge tree structure, with values substituted with metadata and a sequence number - where the sequence number can be used to find the value in the Journal.
 
 This is not an original idea, the LSM-Tree paper specifically talked about the trade-offs of placing identifiers rather than values in the merge tree:
 
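
The Ledger/Journal split changed in the hunk above can be pictured with a small standalone sketch. This is not the Leveled implementation (which uses an LSM tree for the Ledger and CDB-based files for the Journal); the plain maps and the module and function names here are illustrative assumptions only.

```erlang
%% Toy model of the Ledger/Journal separation: the Ledger maps a key to
%% {SequenceNumber, Metadata}, the Journal maps the sequence number to the
%% full value. Plain maps stand in for the real LSM tree and CDB files.
-module(ledger_journal_sketch).
-export([new/0, put/4, head/2, get/2]).

new() ->
    #{ledger => #{}, journal => #{}, next_sqn => 1}.

%% A PUT writes the value to the Journal under the next sequence number and
%% records only metadata plus that sequence number against the key.
put(Key, Metadata, Value, Store = #{ledger := L, journal := J, next_sqn := SQN}) ->
    Store#{ledger := L#{Key => {SQN, Metadata}},
           journal := J#{SQN => Value},
           next_sqn := SQN + 1}.

%% A HEAD needs only the Ledger - the value store is never touched.
head(Key, #{ledger := L}) ->
    case maps:find(Key, L) of
        {ok, {_SQN, Metadata}} -> {ok, Metadata};
        error -> not_found
    end.

%% A full GET follows the sequence number from the Ledger into the Journal.
get(Key, #{ledger := L, journal := J}) ->
    case maps:find(Key, L) of
        {ok, {SQN, Metadata}} -> {ok, Metadata, maps:get(SQN, J)};
        error -> not_found
    end.
```

Because the Ledger holds only metadata and a sequence number, HEAD-style lookups (the ones exploited by the GET and PUT branches above) never read from the Journal; only a full GET pays the cost of following the sequence number to the value.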
@@ -84,7 +84,7 @@ So the hypothesis that separating Keys and Values may be optimal for LSM-Trees i
 
 ## Being Operator Friendly
 
-The LSM-Tree paper focuses on hardware trade-offs in database design. LevelEd is focused on the job of being a backend to a Riak database, and the Riak database is opinionated on the trade-off between developer and operator productivity. Running a Riak database imposes constraints and demands on developers - there are things the developer needs to think hard about: living without transactions, considering the resolution of siblings, manual modelling for query optimisation.
+The LSM-Tree paper focuses on hardware trade-offs in database design. Leveled is focused on the job of being a backend to a Riak database, and the Riak database is opinionated on the trade-off between developer and operator productivity. Running a Riak database imposes constraints and demands on developers - there are things the developer needs to think hard about: living without transactions, considering the resolution of siblings, manual modelling for query optimisation.
 
 However, in return for this pain there is great reward, a reward which is gifted to the operators of the service. Riak clusters are reliable and predictable, and operational processes are slim and straightforward - preparation for managing a Riak cluster in production needn't go much beyond rehearsing the magical cure-alls of the node stop/start and node join/leave processes. At the NHS, where we have more than 50 Riak nodes in 24-by-365 business-critical operations, it is not untypical to go more than 28 days without anyone logging on to a database node. This is a relief for those of us who have previously lived in the world of databases with endless configuration parameters to test or blame for issues, where you always seem to be the unlucky one who suffers the outages "never seen in any other customer", where the databases come with ever more complicated infrastructure dependencies, and where DBAs need to be constantly at hand to analyse reports, kill rogue queries and re-run the query optimiser as an answer to the latest 'blip' in performance.
 