leveled

Author	SHA1	Message	Date
martinsumner	bd6c44e9b0	Correct is_active First part of adding support for scanning for Keys and Hashes. As part of this, discovered that TTL support did the opposite (only fetched things in the past!).	2016-10-31 16:02:32 +00:00
martinsumner	2607792d1f	Adjust setting If cache size is too small then we're more likely to be not ready to evict a L0 file	2016-10-31 15:18:21 +00:00
martinsumner	6b5b51412e	Improve TTL unit test Add support for different type of index queries	2016-10-31 15:13:11 +00:00
martinsumner	9bef57a78d	Get Positions - when rolling CT test was calling get_positions whilst the file was rolling - don't want the file to be checked in this state, so just return an empty list.	2016-10-31 14:01:09 +00:00
martinsumner	3b05874b8a	Add initial timestamp support Covered only by basic unit test at present.	2016-10-31 12:12:06 +00:00
martinsumner	4cffecf2ca	Handle gen_server:cast slowness There was some unpredictable performance in tests, that was related to the amount of time it took the sft gen_server to accept a cast which passed the levelzero_cache. The response time looked to be broadly proportional to the size of the cache - so it appeared to be an issue with passing the large object to the process queue. To avoid this, the penciller now instructs the SFT gen_server to callback to the server for each tree in the cache in turn as it is building the list from the cache. Each of these requests should be relatively short, and the processing in-between should space out the requests so the Penciller is not blocked from answering queries when prompting a L0 write.	2016-10-31 01:33:33 +00:00
martinsumner	311179964a	Quality review Minor test fix-up and quality changes	2016-10-30 22:06:44 +00:00
martinsumner	0e6ee486f8	Make test less pointless Journal compaction test wouldn't actually cause compaction	2016-10-30 20:14:11 +00:00
martinsumner	89b5748062	Remove unnecessary clause	2016-10-30 19:49:01 +00:00
martinsumner	95609702bd	Penciller Memory Refactor Plugged the new penciller memory into the Penciller, and took advantage of the increased speed to simplify the callbacks involved. The outcome is much simpler code	2016-10-30 18:25:30 +00:00
martinsumner	c7a56068c5	Refactor of L0 memory Not yet integrated, but there is now a unit-tested module for the new way of managing L0 memory cache in the Penciller. This mechanism is considerably more efficient than previous efforts and should allow for further simplification of the code.	2016-10-29 13:27:21 +01:00
martinsumner	807af81b68	Penciller Memory Test The current penciller memory setup is inefficient. Is there an alternative which is still relatively simple but more efficient?	2016-10-29 01:06:00 +01:00
martinsumner	cdb01cd24f	Quality Review Looked through test coverage and dialyzer output and attempted to fill test gaps and strip out untestable code (to let it crash).	2016-10-29 00:52:49 +01:00
martinsumner	0e4632ee31	Test correction In one test run the number of files fluctuated but ended at zero. The ending at zero is the important thing.	2016-10-27 22:23:19 +01:00
martinsumner	c6ca973517	Penciller shutdown when empty Stop the penciller from writing an empty file, when shutting down and the L0 Cache is empty. Also parameter fiddle to see impact of the Penciller changes.	2016-10-27 21:40:43 +01:00
martinsumner	20cc17f916	Penciller Refactor Removed o(100) lines of code by refactoring the Penciller to no longer use ETS tables. The code is less confusing, and probably not an awful lot slower.	2016-10-27 20:56:18 +01:00
martinsumner	30f4f2edf6	Comment change on stall behaviour	2016-10-27 09:45:05 +01:00
martinsumner	a00a123817	Recovery strategy testing Test added for the "retain" recovery strategy. This strategy makes sure a full history of index changes is made so that if the Ledger is wiped out, the Ledger can be fully rebuilt from the Journal. This exposed two journal compaction problems - The BestRun selected did not have the source files correctly sorted in order before compaction - The compaction process incorrectly dealt with the KeyDelta object left after a compaction - i.e. compacting twice the same key caused that key history to be lost. These issues have now been corrected.	2016-10-27 00:57:19 +01:00
martinsumner	4cdc6211a0	Handling 'returned' in penciller unit tests The unit tests for the Penciller couldn't cope with the returned status - and so would intermittently fail (after tightening the timeout on sft check_ready).	2016-10-26 21:03:50 +01:00
martinsumner	254183369e	CDB - switch to gen_fsm The CDB file management server has distinct states, and was growing case logic to prevent certain messages from being handled in certain states, and to handle different messages differently. So this has now been converted to a gen_fsm. As part of resolving this, the space_clear_ondelete test has been completed, and completing this revealed that the Penciller could not cope with a change which emptied the ledger. So a series of changes has been handled to allow it to smoothly progress to an empty manifest.	2016-10-26 20:39:16 +01:00
martinsumner	6f40869070	Parameter Experiment Try some different default parameters	2016-10-26 11:50:59 +01:00
martinsumner	0c331b9c30	Tests uncommented Accidentally commented tests in previous commit	2016-10-26 11:45:35 +01:00
martinsumner	2a47acc758	Rollback hash|no_hash and batch journal compaction The no_hash option in CDB files became too hard to manage, in particular the need to scan the whole file to find the last_key rather than cheat and use the index. It has been removed for now. The writing to the journal during journal compaction has now been enhanced by a mput option on the CDB file write - so it can write each batch as one pwrite operation.	2016-10-26 11:39:27 +01:00
martinsumner	97087a6b2b	Work on reload strategies Further work on variable reload strategies with some unit test coverage. Also work on potentially supporting no_hash on PUT to journal files for objects which will never be directly fetched.	2016-10-25 23:13:14 +01:00
martinsumner	102cfe7f6f	Move towards Inker Key Types The current mechanism of re-loading data from the Journal to the Ledger from any potential SQN is not safe when combined with Journal compaction. This commit doesn't resolve these problems, but starts the groundwork for resolving by introducing Inker Key Types. These types would differentiate between objects which are standard Key/Value pairs, objects which are tombstones for keys, and objects which represent Key Changes only. The idea is that there will be flexible reload strategies based on object tags - retain (retain a key change object when compacting a standard object) - recalc (allow key changes to be recalculated from objects and ledger state when loading the Ledger from the journal) - recover (allow for the potential loss of data on loss within the persisted part of the ledger, potentially due to recovery through external anti-entropy operations).	2016-10-25 01:57:12 +01:00
martinsumner	d988c66ac6	Enhance unit tests for corrupted segment filters	2016-10-24 11:44:28 +01:00
martinsumner	c78b5bca7d	Basement Tombstones Further progress towards the tidying up of basement tombstones in the Ledger, with support added for key-listing to help with testing (and as a potentially required feature). The test is incomplete, but committing at this stage as the last commit broke some tests (within the test code). There are some outstanding questions about the handling of tombstones in the Journal during compaction. There exists a condition whereby values could return if a recent journal is compacted and tombstones are removed (as they are no longer present), but older journals have not been compacted. Now on stop/start - if the Ledger is wiped the removal of the keys will be forgotten but the original PUTs would still remain. The safest thing may be to have a rule that tombstones are never deleted from the Inker's Journal - and accept the build-up of garbage. Or there could be an addition to the compaction process that checks back through all the inker files to check that the Key of a tombstone is not present in the past, before it is removed in the compaction.	2016-10-23 22:45:43 +01:00
martinsumner	e9c568a8b3	Test fix-up There was a test that failed to close down a bookie and that caused some issues. The issues are double-resolved, the close down was tidied as well as the forgotten close being added back in. There is some general tidy around in anticipation of TTL support.	2016-10-21 21:26:28 +01:00
martinsumner	0a2053b557	Improved unit test of CRC checking in bloom filter Confirm the impact of bit-flipping in the bloom filter	2016-10-21 16:08:41 +01:00
martinsumner	3710d09fbf	Reuse codec key comparison There was duplication of key comparison logic between leveled_codec and leveled_sft. Now both use the leveled_codec key_dominates function	2016-10-21 15:30:53 +01:00
martinsumner	b2089baa1e	Correct tombstone handling Prepare SFT files for handling tombstones correctly (without expiry dates). Also some work as it can be seen from tests that some SFT files are not being cleared out correctly. Pausing before trying to clear out the files to experiment and trial the possibility that there is a timing issue.	2016-10-21 15:21:37 +01:00
martinsumner	3ad9e42b61	Changed SFT shutdown to cast-based The SFT shutdown process has become a series of casts to-and-from between Penciller and SFT to stop the two processes synchronously making requests on each other	2016-10-21 12:18:06 +01:00
martinsumner	c431bf3b0a	Broken snapshot test The test confirming that deleting sft files were held open whilst snapshots were registered was actually broken. This test has now been fixed, as well as the logic in registering snapshots which had used ledger_sqn mistakenly rather than manifest_sqn.	2016-10-21 11:38:30 +01:00
martinsumner	caa8d26e3e	Stop file check File check now covered by measure in the sft_new path, which will backup any existing file before moving. This gets triggered by incomplete changes on shutdown.	2016-10-20 19:18:49 +01:00
martinsumner	5c2029668d	Tombstone preparation Some initial code changes preparing for the test and implementation of tombstones and tombstone reaping	2016-10-20 16:00:08 +01:00
martinsumner	0324edd6f6	Rotating object tests Recent fixes have been made to problems associated with rapidly changing objects especially on re-opening of the bookie. Test of rotating objects from both an index query and a fetch perspective added to better detect such issues in the future.	2016-10-20 12:16:17 +01:00
martinsumner	cf66431c8e	Smoother handling of back-pressure The Penciller had two problems in previous commits: - If it had a push_mem soon after a L0 file had been created, the push_mem would stall waiting for the L0 file to complete - and this could take 100-200ms - The penciller's clerk favoured L0 work, but was lazy about asking for other work in-between, so often the L1 layer was bursting over capacity and the clerk was doing nothing but merging more L0 files in (with those merges getting more and more expensive as they had to cover more and more files) There are some partial resolutions to this. There is now an aggressive timeout when checking whether the L0 file is ready on a push_mem, and if the timeout is breached the error is caught and a 'returned' message goes back to the Bookie. The Bookie doesn't now empty its cache, it carries on filling it, but on some probability it will keep trying to push_mem on future pushes. This increases Jitter around the expensive operation and splits out the L0 delay into defined chunks. The penciller's clerk is now more aggressive in asking for work. There is also some simplification of the relationship between clerk timeouts and penciller back-pressure. Also resolved is an issue of inconsistency between the loader on startup (replaying the transaction log) and the standard push_mem process. The loader was not correctly de-duplicating by adding first (in order) to a tree before outputting the list from the tree. Some thought will be given later as to whether non-L0 work can be safely prioritised if the merge process still keeps getting behind.	2016-10-20 02:23:45 +01:00
martinsumner	7319b8f415	Redundant clauses Remove some redundant clauses, and fix up some logging	2016-10-19 20:51:30 +01:00
martinsumner	12fe1d01bd	Penciller Manifest and Locking The penciller had the concept of a manifest_lock - but it wasn't clear what the purpose of it was. The updating of the manifest has now been updated to reduce the code and make the process cleaner and more obvious. Now the committed manifest only covers non-L0 levels. A clerk can work concurrently on a manifest change whilst the Penciller is accepting a new L0 file. On startup the manifest is opened as well as any L0 file. There is a possible race condition with killing process where there may be a L0 file which is merged but undeleted - and this is believed to be inert. There is some outstanding work still. Currently the whole store is paused if a push_mem is received by the Penciller, and the writing of a L0 sft file has not been completed. The creation of a L0 file appears to take about 300ms, so if the ledger_cache fills in this period a pause will occur (perhaps due to objects with lots of index entries). It would be preferable to pause more elegantly in this situation. Perhaps there should be a harsh timeout on the call to check the SFT complete, and catching it should cause a refused response. The next PUT will then wait, but any queued GETs can progress.	2016-10-19 17:34:58 +01:00
martinsumner	f16f71ae81	Revert omnishambles performance refactoring To try and improve performance index entries had been removed from the Ledger Cache, and a shadow list of the LedgerCache (in SQN order) was kept to avoid gb_trees:to_list on push_mem. This did not go well. The issue was that ets does not deal with duplicate keys in the list when inserting (it will only insert one, but it is not clear which one). This has been reverted back out. The ETS parameters have been changed to [set, private]. It is not used as an iterator, and is no longer passed out of the process (the memtable_copy is sent instead). This also avoids the tab2list function being called.	2016-10-19 00:10:48 +01:00
martinsumner	8f29a6c40f	Complete 2i work - some refactoring The 2i work now has tests for removals as well as regex etc. Some initial refactoring work has also been tried - to try and take some tasks off the critical path of push_mem. The primary change has been to avoid putting index keys into the gb_tree, and building the KeyChanges list in parallel to the gb_tree (now known as ObjectTree) within the Ledger Cache. Some initial experiments done as to changing the ETS table in the Penciller now that it will be used for iterating - but that has been reverted for now.	2016-10-18 19:41:33 +01:00
martinsumner	905b712764	2i query test The 2i query test added in the previous commit didn't correctly test regex queries. This has now been improved.	2016-10-18 09:42:33 +01:00
martinsumner	3e475f46e8	Support for 2i query part1 Added basic support for 2i query. This involved some refactoring of the test code to share functions between suites. There is still a need for a Part 2 as no tests currently cover removal of index entries.	2016-10-18 01:59:18 +01:00
Russell Brown	ac0504e79e	Merge pull request #1 from martinsumner/rdb/fix-test-include Fix include target	2016-10-17 14:25:27 +01:00
Russell Brown	59ea46120e	Fix include target	2016-10-17 14:24:32 +01:00
martinsumner	8653e9d90d	Improve inker unit test Change in filename labelling had stopped a unit test from covering startup correctly. Now offering better coverage	2016-10-16 16:58:55 +01:00
martinsumner	e3ce372f31	Delete Add functionality to delete keys. No tombstone reaping yet.	2016-10-16 15:41:09 +01:00
martinsumner	ed17e44f52	Improve test coverage Some additional tests following previous refactoring for abstraction, primarily to make manifest print safer and prove co-existence of Riak and non-Riak objects.	2016-10-14 22:58:01 +01:00
martinsumner	7eb5a16899	Supporting Tags - Improving abstraction between Riak and non-Riak workloads The object tag "o" which was taken from eleveldb has been extended to allow for specific functions to be triggered for different object types, in particular when extracting metadata for storing in the Ledger. There is now a riak tag (o_rkv@v1), and in theory other tags can be added and used, as long as there is an appropriate set of functions in the leveled_codec.	2016-10-14 18:43:16 +01:00
martinsumner	9be0f96406	Or process calculation of the Hash Table When the journal CDB file is called to roll it now starts a new clerk to perform the hashtable calculation (which may take many seconds). This stops the store from getting blocked if there is an attempt to GET from the journal that has just been rolled. The journal file process now has a number of distinct states (reading, writing, pending_roll, closing). A future refactor may look to make leveled_cdb a gen_fsm rather than a gen_server.	2016-10-14 13:36:12 +01:00

115 commits