commit
7998c9f656
7 changed files with 68 additions and 10 deletions
34
README.md
|
@ -47,12 +47,13 @@ The target at inception was to do something interesting, to re-think certain key
|
||||||
|
|
||||||
The delta in the table below compares Riak performance for an identical test run with a Leveled backend against the same run with a Leveldb backend.
|
The delta in the table below compares Riak performance for an identical test run with a Leveled backend against the same run with a Leveldb backend.
|
||||||
|
|
||||||
Test Description | Hardware | Duration |Avg TPS | Delta (Overall) | Delta (Last Hour)
|
Test Description | Hardware | Duration |Avg TPS | TPS Delta (Overall) | TPS Delta (Last Hour)
|
||||||
:---------------------------------|:-------------|:--------:|----------:|-----------------:|-------------------:
|
:---------------------------------|:-------------|:--------:|----------:|-----------------:|-------------------:
|
||||||
8KB value, 60 workers, sync | 5 x i2.2x | 4 hr | 12,679.91 | <b>+ 70.81%</b> | <b>+ 63.99%</b>
|
8KB value, 60 workers, sync | 5 x i2.2x | 4 hr | 12,679.91 | <b>+ 70.81%</b> | <b>+ 63.99%</b>
|
||||||
8KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,100.19 | <b>+ 16.15%</b> | <b>+ 35.92%</b>
|
8KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,100.19 | <b>+ 16.15%</b> | <b>+ 35.92%</b>
|
||||||
8KB value, 50 workers, no_sync | 5 x d2.2x | 4 hr | 10,400.29 | <b>+ 8.37%</b> | <b>+ 23.51%</b>
|
8KB value, 50 workers, no_sync | 5 x d2.2x | 4 hr | 10,400.29 | <b>+ 8.37%</b> | <b>+ 23.51%</b>
|
||||||
4KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,993.95 | <b>- 10.44%</b> | <b>- 4.48%</b>
|
4KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,993.95 | - 10.44% | - 4.48%
|
||||||
|
16KB value, 60 workers, no_sync | 5 x i2.2x | 6 hr | 11,167.44 | <b>+ 80.48%</b> | <b>+ 113.55%</b>
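The README does not spell out how the TPS delta columns are derived. A minimal sketch, assuming each delta is the percentage change in leveled throughput relative to the leveldb result for the identical test; the function names are hypothetical and the figures in the usage lines are illustrative only, not test results:

```python
def tps_delta_percent(leveled_tps: float, leveldb_tps: float) -> float:
    """Percentage change in throughput of leveled relative to leveldb.

    Assumption: leveldb is the baseline, so a positive result means
    leveled achieved more transactions per second.
    """
    if leveldb_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (leveled_tps - leveldb_tps) / leveldb_tps * 100.0


def format_delta(delta: float) -> str:
    """Render a delta in the table's style: explicit sign, two decimals."""
    sign = "+" if delta >= 0 else "-"
    return f"{sign} {abs(delta):.2f}%"


# Illustrative figures only (not measured values).
print(format_delta(tps_delta_percent(150.0, 100.0)))
print(format_delta(tps_delta_percent(90.0, 100.0)))
```

Under this assumption, a negative delta such as the one in the 4KB row would mean that leveldb sustained the higher throughput in that test.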
|
||||||
|
|
||||||
Tests generally show a 5:1 improvement in tail latency for LevelEd.
|
Tests generally show a 5:1 improvement in tail latency for LevelEd.
|
||||||
|
|
||||||
|
@ -75,14 +76,24 @@ As a general rule though, the most interesting thing is the potential to enable
|
||||||
|
|
||||||
Further volume test scenarios are the immediate priority, in particular volume test scenarios with:
|
Further volume test scenarios are the immediate priority, in particular volume test scenarios with:
|
||||||
|
|
||||||
- Alternative object sizes;
|
|
||||||
|
|
||||||
- Significant use of secondary indexes;
|
- Significant use of secondary indexes;
|
||||||
|
|
||||||
- Use of newly available [EC2 hardware](https://aws.amazon.com/about-aws/whats-new/2017/02/now-available-amazon-ec2-i3-instances-next-generation-storage-optimized-high-i-o-instances/) which is potentially a significant change to assumptions about hardware efficiency and cost.
|
- Use of newly available [EC2 hardware](https://aws.amazon.com/about-aws/whats-new/2017/02/now-available-amazon-ec2-i3-instances-next-generation-storage-optimized-high-i-o-instances/) which is potentially a significant change to assumptions about hardware efficiency and cost.
|
||||||
|
|
||||||
- Create riak_test tests for new Riak features enabled by Leveled.
|
- Create riak_test tests for new Riak features enabled by Leveled.
|
||||||
|
|
||||||
|
However, a number of other changes to (my branch of) riak_kv are planned in the next month to make better use of leveled:
|
||||||
|
|
||||||
|
- Support for rapid rebuild of hashtrees
|
||||||
|
|
||||||
|
- Fixes to priority issues
|
||||||
|
|
||||||
|
- Experiments with flexible sync on write settings
|
||||||
|
|
||||||
|
- A cleaner and easier build of Riak with leveled included, including cuttlefish configuration support
|
||||||
|
|
||||||
|
More information can be found in the [future section](docs/FUTURE.md).
|
||||||
|
|
||||||
## Feedback
|
## Feedback
|
||||||
|
|
||||||
Please create an issue if you have any suggestions. You can ping me @masleeds if you wish.
|
Please create an issue if you have any suggestions. You can ping me @masleeds if you wish.
|
||||||
|
@ -99,4 +110,17 @@ This will start a new Bookie. It will start and look for existing data files, u
|
||||||
|
|
||||||
The book_start method should respond once startup is complete. The leveled_bookie module includes the full API for external use of the store.
|
The book_start method should respond once startup is complete. The leveled_bookie module includes the full API for external use of the store.
|
||||||
|
|
||||||
Read through the [end_to_end test suites](test/end_to_end/) for further guidance.
|
It should run anywhere that OTP will run. It has been tested on Ubuntu 14, Mac OS X and Windows 10.
|
||||||
|
|
||||||
|
Running in Riak requires one of the branches of riak_kv referenced [here](docs/FUTURE.md). There is a [Riak branch](https://github.com/martinsumner/riak/tree/mas-leveleddb) intended to support the automatic build of this, and the configuration via cuttlefish. However, the auto-build fails due to other dependencies (e.g. riak_search) bringing in an alternative version of riak_kv, and the configuration via cuttlefish is broken for reasons unknown.
|
||||||
|
|
||||||
|
Building this from source as part of Riak will require a bit of fiddling around.
|
||||||
|
|
||||||
|
- build [riak](https://github.com/martinsumner/riak/tree/mas-leveleddb)
|
||||||
|
- cd deps, rm -rf riak_kv
|
||||||
|
- git clone -b mas-leveled-putfm --single-branch https://github.com/martinsumner/riak_kv.git
|
||||||
|
- cd ..
|
||||||
|
- make rel
|
||||||
|
- remember to set the storage backend to leveled in riak.conf
|
||||||
|
|
||||||
|
To work around the broken cuttlefish configuration, leveled parameters can be set via riak_kv/include/riak_kv_leveled.hrl, although a new make will be required for these changes to take effect.
|
|
@ -114,4 +114,6 @@ The other n-1 vnodes must also do a local GET before the vnode PUT (so as not to
|
||||||
|
|
||||||
This branch changes the behaviour slightly at the non-coordinating vnodes. These vnodes will now try a HEAD request before the local PUT (not a GET request), and if the HEAD request contains a vclock which is <strong>dominated</strong> by the updated PUT, it will not attempt to fetch the whole object for the syntactic merge.
|
This branch changes the behaviour slightly at the non-coordinating vnodes. These vnodes will now try a HEAD request before the local PUT (not a GET request), and if the HEAD request contains a vclock which is <strong>dominated</strong> by the updated PUT, it will not attempt to fetch the whole object for the syntactic merge.
|
||||||
|
|
||||||
This should save two object fetches (where n=3) in most circumstances.
|
This should save two object fetches (where n=3) in most circumstances.
|
||||||
|
|
||||||
|
Note that although the branch name refers to the put fsm, the actual fsm is unchanged by this; all of the changes are within vnode_put.
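The HEAD-before-PUT decision described above can be sketched as follows. This is an illustration in Python rather than the Erlang in riak_kv, and it assumes a simplified vector clock modelled as a mapping from actor to counter; the names `descends`, `dominates` and `needs_full_fetch` are hypothetical and are not functions from the branch:

```python
from typing import Dict

# Simplified vector clock: actor id -> update counter (an assumption;
# the real riak_kv vclock also carries timestamps).
VClock = Dict[str, int]


def descends(a: VClock, b: VClock) -> bool:
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def dominates(a: VClock, b: VClock) -> bool:
    """True if a descends b and the two clocks are not equivalent."""
    return descends(a, b) and not descends(b, a)


def needs_full_fetch(incoming_put: VClock, head_clock: VClock) -> bool:
    """Decide whether the vnode must fetch the whole stored object.

    If the clock returned by the HEAD request is dominated by the
    incoming PUT, the stored object is superseded and no syntactic
    merge is needed, so the object body is never read.
    """
    return not dominates(incoming_put, head_clock)


# The incoming PUT supersedes what is stored: skip the fetch.
print(needs_full_fetch({"vnode_a": 2, "vnode_b": 1}, {"vnode_a": 1, "vnode_b": 1}))
# Concurrent updates: the full object is required for the merge.
print(needs_full_fetch({"vnode_a": 2}, {"vnode_b": 1}))
```

In this sketch only the clock is inspected in the common case, which is where the saving of the object fetches at the non-coordinating vnodes would come from.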
|
|
@ -32,6 +32,7 @@ This test has the following specific characteristics
|
||||||
- 60 concurrent basho_bench workers running at 'max'
|
- 60 concurrent basho_bench workers running at 'max'
|
||||||
- i2.2xlarge instances
|
- i2.2xlarge instances
|
||||||
- allow_mult=false, lww=false
|
- allow_mult=false, lww=false
|
||||||
|
- <b>sync_on_write = on</b>
|
||||||
|
|
||||||
Comparison charts for this test:
|
Comparison charts for this test:
|
||||||
|
|
||||||
|
@ -47,6 +48,7 @@ This test has the following specific characteristics
|
||||||
- 100 concurrent basho_bench workers running at 'max'
|
- 100 concurrent basho_bench workers running at 'max'
|
||||||
- i2.2xlarge instances
|
- i2.2xlarge instances
|
||||||
- allow_mult=false, lww=false
|
- allow_mult=false, lww=false
|
||||||
|
- <b>sync_on_write = off</b>
|
||||||
|
|
||||||
Comparison charts for this test:
|
Comparison charts for this test:
|
||||||
|
|
||||||
|
@ -60,8 +62,9 @@ This test has the following specific characteristics
|
||||||
|
|
||||||
- An 8KB value size (based on crypto:rand_bytes/1 - so cannot be effectively compressed)
|
- An 8KB value size (based on crypto:rand_bytes/1 - so cannot be effectively compressed)
|
||||||
- 50 concurrent basho_bench workers running at 'max'
|
- 50 concurrent basho_bench workers running at 'max'
|
||||||
- d2.2xlarge instances
|
- <b>d2.2xlarge instances</b>
|
||||||
- allow_mult=false, lww=false
|
- allow_mult=false, lww=false
|
||||||
|
- sync_on_write = off
|
||||||
|
|
||||||
Comparison charts for this test:
|
Comparison charts for this test:
|
||||||
|
|
||||||
|
@ -74,11 +77,37 @@ This is the stage when the volume of data has begun to exceed the volume support
|
||||||
|
|
||||||
### Half-Size Object, SSDs, No Sync-On-Write
|
### Half-Size Object, SSDs, No Sync-On-Write
|
||||||
|
|
||||||
to be completed
|
This test has the following specific characteristics
|
||||||
|
|
||||||
|
- A <b>4KB value size</b> (based on crypto:rand_bytes/1 - so cannot be effectively compressed)
|
||||||
|
- 100 concurrent basho_bench workers running at 'max'
|
||||||
|
- i2.2xlarge instances
|
||||||
|
- allow_mult=false, lww=false
|
||||||
|
- sync_on_write = off
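The note that random values cannot be effectively compressed can be checked with a short sketch. It uses Python's `os.urandom` as a stand-in for Erlang's `crypto:rand_bytes/1` and `zlib` as a stand-in compressor, both assumptions made purely for illustration; the helper name `compression_ratio` is hypothetical:

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)


# An 8KB value of random bytes, similar in spirit to the test values.
random_value = os.urandom(8 * 1024)

# An 8KB value with obvious repetition, for contrast.
repetitive_value = b"leveled " * 1024

print(f"random:     {compression_ratio(random_value):.3f}")
print(f"repetitive: {compression_ratio(repetitive_value):.3f}")
```

Random values therefore keep either backend's compression from flattering the results, so the volume of data on disk should track the advertised value size.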
|
||||||
|
|
||||||
|
Comparison charts for this test:
|
||||||
|
|
||||||
|
Riak + leveled | Riak + eleveldb
|
||||||
|
:-------------------------:|:-------------------------:
|
||||||
|
 | 
|
||||||
|
|
||||||
|
|
||||||
### Double-Size Object, SSDs, No Sync-On-Write
|
### Double-Size Object, SSDs, No Sync-On-Write
|
||||||
|
|
||||||
to be completed
|
This test has the following specific characteristics
|
||||||
|
|
||||||
|
- A <b>16KB value size</b> (based on crypto:rand_bytes/1 - so cannot be effectively compressed)
|
||||||
|
- 60 concurrent basho_bench workers running at 'max'
|
||||||
|
- i2.2xlarge instances
|
||||||
|
- allow_mult=false, lww=false
|
||||||
|
- sync_on_write = off
|
||||||
|
|
||||||
|
Comparison charts for this test:
|
||||||
|
|
||||||
|
Riak + leveled | Riak + eleveldb
|
||||||
|
:-------------------------:|:-------------------------:
|
||||||
|
 | 
|
||||||
|
|
||||||
|
|
||||||
### Lies, damned lies etc
|
### Lies, damned lies etc
|
||||||
|
|
||||||
|
@ -90,11 +119,14 @@ Both leveled and leveldb are optimised for finding non-presence through the use
|
||||||
|
|
||||||
So it is better to focus on the results at the tail of the tests, as at the tail the results are a more genuine reflection of behaviour against the advertised test parameters.
|
So it is better to focus on the results at the tail of the tests, as at the tail the results are a more genuine reflection of behaviour against the advertised test parameters.
|
||||||
|
|
||||||
|
|
||||||
Test Description | Hardware | Duration |Avg TPS | Delta (Overall) | Delta (Last Hour)
|
Test Description | Hardware | Duration |Avg TPS | Delta (Overall) | Delta (Last Hour)
|
||||||
:---------------------------------|:-------------|:--------:|----------:|-----------------:|-------------------:
|
:---------------------------------|:-------------|:--------:|----------:|-----------------:|-------------------:
|
||||||
8KB value, 60 workers, sync | 5 x i2.2x | 4 hr | 12,679.91 | <b>+ 70.81%</b> | <b>+ 63.99%</b>
|
8KB value, 60 workers, sync | 5 x i2.2x | 4 hr | 12,679.91 | <b>+ 70.81%</b> | <b>+ 63.99%</b>
|
||||||
8KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,100.19 | <b>+ 16.15%</b> | <b>+ 35.92%</b>
|
8KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,100.19 | <b>+ 16.15%</b> | <b>+ 35.92%</b>
|
||||||
8KB value, 50 workers, no_sync | 5 x d2.2x | 6 hr | 10,400.29 | <b>+ 8.37%</b> | <b>+ 23.51%</b>
|
8KB value, 50 workers, no_sync | 5 x d2.2x | 4 hr | 10,400.29 | <b>+ 8.37%</b> | <b>+ 23.51%</b>
|
||||||
|
4KB value, 100 workers, no_sync | 5 x i2.2x | 6 hr | 14,993.95 | - 10.44% | - 4.48%
|
||||||
|
16KB value, 60 workers, no_sync | 5 x i2.2x | 6 hr | 11,167.44 | <b>+ 80.48%</b> | <b>+ 113.55%</b>
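The distinction between the overall and last-hour columns can be sketched as below, assuming throughput is sampled at a fixed interval throughout the test. The function name, the sampling interval and the sample values are all hypothetical and are not taken from the basho_bench output:

```python
from statistics import mean
from typing import Sequence, Tuple


def overall_and_tail_mean(
    samples: Sequence[float], samples_per_hour: int
) -> Tuple[float, float]:
    """Return (mean over the whole test, mean over the final hour)."""
    if samples_per_hour <= 0 or len(samples) < samples_per_hour:
        raise ValueError("need at least one full hour of samples")
    return mean(samples), mean(samples[-samples_per_hour:])


# Illustrative series: throughput falls once the data set has grown.
samples = [100.0, 100.0, 100.0, 100.0, 50.0, 50.0]
overall, tail = overall_and_tail_mean(samples, samples_per_hour=2)
print(f"overall={overall:.2f} last_hour={tail:.2f}")
```

A store that starts quickly but slows down as data accumulates will show an overall mean that is noticeably higher than its last-hour mean, which is why the tail figure is treated as the more representative one.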
|
||||||
|
|
||||||
Leveled, like bitcask, will defer compaction work until a designated compaction window, and these tests were run outside of that compaction window. So although the throughput of leveldb is lower, it has no deferred work at the end of the test. Future testing work is scheduled to examine leveled throughput during a compaction window.
|
Leveled, like bitcask, will defer compaction work until a designated compaction window, and these tests were run outside of that compaction window. So although the throughput of leveldb is lower, it has no deferred work at the end of the test. Future testing work is scheduled to examine leveled throughput during a compaction window.
|
||||||
|
|
||||||
|
|
Binary file not shown.
After Width: | Height: | Size: 111 KiB |
Binary file not shown.
After Width: | Height: | Size: 83 KiB |
Binary file not shown.
After Width: | Height: | Size: 97 KiB |
Binary file not shown.
After Width: | Height: | Size: 79 KiB |