The current mechanism of re-loading data from the Journal to the Ledger
from any potential SQN is not safe when combined with Journal
compaction.
This commit doesn't resolve these problems, but starts the groundwork
for resolving them by introducing Inker Key Types. These types
differentiate between objects which are standard Key/Value pairs,
objects which are tombstones for keys, and objects which represent Key
Changes only.
The idea is that there will be flexible reload strategies based on
object tags:
- retain (retain a key change object when compacting a standard object)
- recalc (allow key changes to be recalculated from objects and ledger
state when loading the Ledger from the Journal)
- recover (accept the potential loss of data within the persisted part
of the ledger, with recovery potentially achieved through external
anti-entropy operations).
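The strategy-per-tag idea can be sketched as a simple dispatch table. This is an illustrative Python analogue only, not the leveled Erlang implementation; all function and field names here are hypothetical:

```python
# Hypothetical sketch of tag-based reload strategies (not leveled's code).

def retain(obj, ledger):
    # Retain: compaction keeps a key-change object, so the changes
    # can be replayed directly when reloading the Ledger.
    return obj["key_changes"]

def recalc(obj, ledger):
    # Recalc: recompute key changes from the object and current
    # ledger state (here, a trivial new-vs-old comparison).
    old = ledger.get(obj["key"])
    return [] if old == obj["value"] else [(obj["key"], obj["value"])]

def recover(obj, ledger):
    # Recover: accept that the change may be lost; recovery is
    # deferred to external anti-entropy operations.
    return []

STRATEGIES = {"retain": retain, "recalc": recalc, "recover": recover}

def reload_changes(obj, ledger):
    return STRATEGIES[obj["strategy"]](obj, ledger)
```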
Added basic support for 2i query. This involved some refactoring of the
test code to share functions between suites.
There is still a need for a Part 2, as no tests currently cover removal
of index entries.
The object tag "o", which was taken from eleveldb, has been extended to
allow for specific functions to be triggered for different object types,
in particular when extracting metadata for storing in the Ledger.
There is now a riak tag (o_rkv@v1), and in theory other tags can be
added and used, as long as there is an appropriate set of functions in
leveled_codec.
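The per-tag function set amounts to dispatching metadata extraction on the object tag. A minimal sketch in Python (hypothetical names; in leveled this logic lives in leveled_codec in Erlang, and the extractor bodies below are invented for illustration):

```python
# Hypothetical sketch of tag-based metadata extraction (not leveled_codec).

def extract_metadata_std(value):
    # Standard objects: only the size is recorded here.
    return {"size": len(value)}

def extract_metadata_riak(value):
    # A riak object might also carry extra metadata; the "vclock"
    # slice below is purely illustrative.
    return {"size": len(value), "vclock": value[:4]}

EXTRACTORS = {
    "o": extract_metadata_std,          # standard tag (from eleveldb)
    "o_rkv@v1": extract_metadata_riak,  # riak tag
}

def extract_metadata(tag, value):
    # New tags can be supported by registering a new extractor.
    return EXTRACTORS[tag](value)
```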
When the journal CDB file is called to roll it now starts a new clerk to
perform the hashtable calculation (which may take many seconds). This
stops the store from getting blocked if there is an attempt to GET from
the journal that has just been rolled.
The journal file process now has a number of distinct states (reading,
writing, pending_roll, closing). A future refactor may look to make
leveled_cdb a gen_fsm rather than a gen_server.
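The distinct states imply a set of legal transitions that a gen_fsm would make explicit. A rough sketch of one plausible transition table (illustrative Python; the transitions shown are an assumption, not leveled's actual state machine):

```python
# Hypothetical journal-file state transitions (assumed, for illustration).

TRANSITIONS = {
    "writing": {"pending_roll", "closing"},
    "pending_roll": {"reading", "closing"},
    "reading": {"closing"},
    "closing": set(),
}

def transition(state, new_state):
    # Reject transitions not in the table, as a gen_fsm would.
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```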
An attempt to refactor out more complex code.
The Penciller clerk and Penciller have been re-shaped so that their
relationship is much simpler, and also to make sure that they shut down
much more neatly when the clerk is busy, to avoid crashdumps in ct tests.
The CDB now has a binary_mode - so that we don't do binary_to_term twice
... although this may have made things slower ??!!? Perhaps the
is_binary check now required on read is an overhead, or perhaps it is
some other mystery.
There is now more efficient fetching of the size in pcl_load as well.
Inker refactored to block on the manifest write. If this is inefficient,
the manifest write can be converted to an append-only operation.
Waiting on the manifest write makes the logic at startup much easier to
manage.
This exposed another off-by-one error on startup.
This commit also includes an unsafe change to reply early from a rolling
CDB file (with lots of objects writing the hash table can take too
long). This is bad, but will be resolved through a refactor of the
manifest writing: essentially we deferred writing of the manifest
update which was an unnecessary performance optimisation. If instead we
wait on this, the process is made substantially simpler, and it is safer
to perform the roll of the complete CDB journal asynchronously. If the
manifest update takes too long, an append-only log may be used instead.
Add some initial system tests. This highlighted issues:
- Files deleted by compaction would be left orphaned, never closed,
and would not in fact be deleted (they are now deleted on closure only)
- There was an issue on startup whereby the first few keys in each
journal would not be re-loaded into the ledger
Largely untested work at this stage, allowing the Inker to request the
Inker's clerk to perform a single round of compaction based on the best
run of files it can find.
Makes the ability to get positions, and to fetch directly by position,
more generic - supporting the fetch of different flavours of
combinations, and requesting a sample of positions rather than all of
them.
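The "all positions vs a sample" behaviour can be sketched as follows (illustrative Python, not the CDB implementation; the function name and shape are hypothetical):

```python
# Hypothetical sketch: fetch all positions, or only a random sample.

import random

def get_positions(hashtable, sample_size=None):
    positions = sorted(hashtable.values())
    if sample_size is None or sample_size >= len(positions):
        return positions                      # fetch all positions
    return random.sample(positions, sample_size)  # fetch a sample only
```

Sampling positions is useful for compaction scoring: the iclerk can estimate how much of a file is garbage without scanning every entry.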
CDB did many "bitty" reads/writes when scanning or writing hash tables -
these have been changed to bulk reads and writes to speed things up.
CDB has also added capabilities to fetch positions and get keys by
position, to help with the iclerk role.
Additional bookie tests revealed that the persisting/reading of inker
manifests was inconsistent and buggy.
Also, the CDB files were inefficiently writing the top index table -
this needed to be improved, as it blocks on a roll.
Make scanning over a CDB file generic rather than specific to the
read-in of the active nursery log - open to be called as an external
function to support other scanning behaviour.
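Generic scanning amounts to a fold over the file's entries, with the caller supplying the accumulator function. A minimal sketch (illustrative Python; the leveled version folds over a real CDB file in Erlang):

```python
# Hypothetical sketch of a generic scan: the caller drives the fold.

def scan(entries, fun, acc):
    # entries: an iterable of (key, value) pairs as read from the file.
    # fun: caller-supplied function (key, value, acc) -> new acc.
    for key, value in entries:
        acc = fun(key, value, acc)
    return acc
```

With this shape, loading the active nursery log is just one choice of `fun`; compaction scoring or key listing become other callers of the same scan.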
Add a test to show the inker rolling a journal. To achieve this, the
CDB size needs to be made an option, and the manifest sorting altered so
that find_in_manifest actually works!
An attempt to get a first inker that can build a ledger from a manifest
as well as support simple get and put operations. Basic tests surround
the building of manifests only at this stage - more work is required for
get and put.
CDB was failing tests (was it always this way?). There has been a
little bit of a patch-up of the tests, but there are still some
potentially outstanding issues with scanning over a file when attempting
to read beyond the end of the file.
Tabbing reformatting and general tidy.
Concierge documentation development ongoing.