diff --git a/README.md b/README.md
index a987021..e941152 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,7 @@ For more details on the store:
 
 - [Future work](docs/FUTURE.md) covers new features being implemented at present, and improvements necessary to make the system production ready.
 
-- There is also a ["Why"](WHYWHYWHYWHYWHY.md) section looking at lower level design choices and the rationale that supports them.
+- There is also a ["Why"](docs/WHY.md) section looking at lower level design choices and the rationale that supports them.
 
 ## Is this interesting?
 
diff --git a/docs/FUTURE.md b/docs/FUTURE.md
index e6da51f..fffc21e 100644
--- a/docs/FUTURE.md
+++ b/docs/FUTURE.md
@@ -25,6 +25,8 @@ There is some work required before LevelEd could be considered production ready:
 
 - Riak modifications to support the optimised Key/Clock scanning for hashtree rebuilds.
 
+- Amend compaction scheduling so that vnodes do not all try to compact concurrently during a single window.
+
 - Improved handling of corrupted files.
 
 - A way of identifying the partition in each log to ease the difficulty of tracing activity when multiple stores are run in parallel.
\ No newline at end of file
diff --git a/docs/VOLUME.md b/docs/VOLUME.md
index d73fa18..d08a824 100644
--- a/docs/VOLUME.md
+++ b/docs/VOLUME.md
@@ -38,6 +38,19 @@ leveled Results | eleveldb Results
 :-------------------------:|:-------------------------:
 ![](../test/volume/cluster_one/output/summary_leveled_5n_45t.png "LevelEd") | ![](../test/volume/cluster_one/output/summary_leveldb_5n_45t.png "LevelDB")
 
+### Lies, damned lies etc
+
+To a certain extent this should not be too unexpected: leveled is designed to reduce write amplification, and with less write amplification a persistent write load gives leveled an advantage. The frequent periods of poor performance in leveldb appear to coincide with periods of very high await times on nodes during merge jobs, which may involve up to O(1GB) of write activity.
+
+Also, the 5:1 ratio of GET:UPDATE is not quite what it appears, as:
+
+- each UPDATE requires an external Riak GET (as well as the internal GETs);
+
+- the empty nature of the database at the test start means that there are no actual value fetches initially (just not-present responses), and only 50% of fetches get a value by the end of the test (much less for leveldb, as less volume is put during the test).
+
+When testing on a single-node cluster (with a smaller ring size and a smaller keyspace) the relative benefit of leveled appears to be much smaller. One big difference between the single-node testing and the multi-node testing is that, between the two, the disk was switched from a single drive to a mirrored pair. It is suspected that the amplified improvement from the single-node test to the multi-node tests is related in part to the cost of software-based mirroring exaggerating write contention to disk.
+
+In a sense leveldb did more work during the test, as the test was run outside of the compaction window for leveled, so the on-disk size of the leveled store was higher because no replaced values had been compacted. Test 6 below will examine the impact of the compaction window on throughput.
 
 ## Riak Cluster Test - 2
 
diff --git a/docs/WHY.md b/docs/WHY.md
new file mode 100644
index 0000000..f9f68ff
--- /dev/null
+++ b/docs/WHY.md
@@ -0,0 +1,2 @@
+# Why? Why? Why? Why? Why?
+