Not sure if this scenario is unlikely or impossible - but filtering it seems harmless.
Abstracting out the function makes testing of scoring scenarios a bit easier.
Catch-up on setting specs for external functions in the iclerk.
If the waste retention period is undefined, then it should be ignored - and no waste retained (rather than retaining waste for 24 hours as at present).
This wasn't working anyway - as the reopened reader didn't get the cdb options (which didn't have the waste path on anyway) - so waste would not be retained if the file had been opened after a stop/start.
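A minimal sketch of the intended behaviour - the function and argument names are illustrative, not the real cdb option plumbing:

    %% Sketch only - decide what to do with a journal file that is no longer
    %% required.  The retention period and waste path would come from the cdb
    %% options; names here are illustrative.
    -module(waste_option_sketch).
    -export([dispose_of_journal/3]).

    dispose_of_journal(FilePath, undefined, _WastePath) ->
        %% No waste_retention_period configured - delete immediately,
        %% retain nothing
        file:delete(FilePath);
    dispose_of_journal(FilePath, _RetentionSeconds, WastePath) ->
        %% Retention configured - move the file to the waste folder instead
        file:rename(FilePath,
                    filename:join(WastePath, filename:basename(FilePath))).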
Compression can be switched between LZ4 and zlib (native).
The setting to determine if compression should happen on receipt is now a macro definition in leveled_codec.
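A sketch of the shape this takes - the macro names are hypothetical (the real definitions live in leveled_codec), and an lz4 NIF wrapper module is assumed to be available:

    -module(compress_on_receipt_sketch).
    -export([maybe_compress/1]).

    %% Illustrative macros only - the real settings are in leveled_codec.
    -define(COMPRESS_ON_RECEIPT, true).
    -define(COMPRESSION_METHOD, native).   % native (zlib) or lz4

    maybe_compress(Object) ->
        Bin = term_to_binary(Object),
        case ?COMPRESS_ON_RECEIPT of
            true -> compress(?COMPRESSION_METHOD, Bin);
            false -> Bin
        end.

    compress(native, Bin) ->
        zlib:compress(Bin);
    compress(lz4, Bin) ->
        %% assumes an lz4 NIF wrapper module providing pack/1
        lz4:pack(Bin).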
More entropy by using the position index with the segment hash - so this would be a better filter to apply.
Also could increase the key count now, as the extra hash can be larger.
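A sketch of the idea - the bit sizes and names here are illustrative, not those of the actual slot format:

    -module(extra_hash_sketch).
    -export([combined_hash/2]).

    %% Combining the slot position index with the segment hash gives the
    %% stored hash more entropy, so a match is a stronger signal that the
    %% key is really present.  Bit sizes are illustrative only.
    combined_hash(SegmentHash, PositionIndex) ->
        ((PositionIndex band 16#FF) bsl 16) bor (SegmentHash band 16#FFFF).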
As an aside - a leveled_iclerk unit test failure appeared - the range was just wrong. Don't know why this strated happening
This required a switch to change the sync strategy based on a rebar parameter.
However tests could be slow on a MacBook with OTP16 and sync - so timeouts were added in unit tests, and the sync_strategy for ct tests was changed to not sync under OTP16.
Need to allow specific settings to be passed into unit tests.
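The rebar-based switch might look something like the sketch below - the macro name and strategy atoms are assumptions, not the actual build configuration:

    %% rebar.config (sketch): set a macro only when building on an R16 runtime
    {erl_opts, [{platform_define, "^R16", slow_sync_platform}]}.

    %% In the module that picks the default sync strategy (names illustrative):
    -ifdef(slow_sync_platform).
    -define(DEFAULT_SYNC_STRATEGY, none).
    -else.
    -define(DEFAULT_SYNC_STRATEGY, sync).
    -endif.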
Also, too much journal compaction may lead to intermittent failures on
the basic_SUITE space_clear_on_delete test. Think this is because
there are fewer “deletes” to reload in on startup to trigger the cascade
down and clear up?
Compaction is overly aggressive. It is a lot of work to compact a run of
files for just a 20% reduction in disk space, when disk space for the
Journal (i.e. low-IOPS disk space) should be relatively inexpensive.
Require at least a 40% reduction for a compaction job.
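The check is roughly of this shape - a sketch with an illustrative scoring input, not the actual iclerk scoring code:

    -module(compaction_score_sketch).
    -export([worth_compacting/1]).

    %% A run is only worth compacting if enough space would be released to
    %% justify rewriting the files - at least a 40% reduction, per the rule
    %% described above.
    -define(MIN_REDUCTION_PCT, 40.0).

    worth_compacting(PctOfRunToRetain) ->
        (100.0 - PctOfRunToRetain) >= ?MIN_REDUCTION_PCT.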
This allows for deleted journals to be retained for a period (the
waste_retention_period). The idea being that a backup strategy can
ensure that all journals are backed up, even ones created and removed
from within a backup period - so that any restore point is possible.
This is also a precursor to removing some of the PromptDelete
complexity from the Inker Clerk - all compactions can prompt deletion as
deletion is now deferred.
This exposed a potential issue with not opening readers in binary_mode -
so readers now default to binary mode. Will add a test using an object
folder to confirm values remain readable in rolled journals after
shutdown/startup.
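The waste retention scheme described above might be sketched as follows - the function names are illustrative and the retention period is taken to be in seconds:

    -module(waste_retention_sketch).
    -include_lib("kernel/include/file.hrl").
    -export([retire_journal/2, clear_waste/2]).

    %% Instead of deleting a compacted journal, move it to the waste folder
    %% so a backup can still capture it before it is finally removed.
    retire_journal(FilePath, WastePath) ->
        file:rename(FilePath,
                    filename:join(WastePath, filename:basename(FilePath))).

    %% Periodically remove waste files older than the retention period.
    clear_waste(WastePath, RetentionSeconds) ->
        {ok, Files} = file:list_dir(WastePath),
        Now = erlang:system_time(second),
        lists:foreach(
            fun(File) ->
                FullPath = filename:join(WastePath, File),
                {ok, Info} = file:read_file_info(FullPath, [{time, posix}]),
                case Now - Info#file_info.mtime > RetentionSeconds of
                    true -> file:delete(FullPath);
                    false -> ok
                end
            end,
            Files).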
Test added for the "retain" recovery strategy. This strategy makes sure
a full history of index changes is made so that if the Ledger is wiped
out, the Ledger can be fully rebuilt from the Journal.
This exposed two journal compaction problems:
- The BestRun selected did not have the source files correctly sorted in
order before compaction
- The compaction process incorrectly dealt with the KeyDelta object
left after a compaction - i.e. compacting the same key twice caused that
key's history to be lost.
These issues have now been corrected.
The CDB file management server has distinct states, and was growing case
logic to prevent certain messages from being handled in certain states,
and to handle different messages differently. So this has now been
converted to a gen_fsm.
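A minimal gen_fsm sketch of the shape this conversion takes - the state names, messages and state data here are illustrative, not those of leveled_cdb:

    -module(cdb_fsm_sketch).
    -behaviour(gen_fsm).
    -export([start_link/0, init/1, writer/3, reader/3,
             handle_event/3, handle_sync_event/4, handle_info/3,
             terminate/3, code_change/4]).

    start_link() ->
        gen_fsm:start_link(?MODULE, [], []).

    init([]) ->
        {ok, writer, #{}}.

    %% Each state has its own callback, so a message that only makes sense
    %% in one state simply is not handled in the others - no case logic.
    writer({put, Key, Value}, _From, StateData) ->
        {reply, ok, writer, maps:put(Key, Value, StateData)};
    writer(roll, _From, StateData) ->
        {reply, ok, reader, StateData}.

    %% Once rolled, the file is read-only.
    reader({get, Key}, _From, StateData) ->
        {reply, maps:get(Key, StateData, not_present), reader, StateData}.

    handle_event(_Event, StateName, StateData) ->
        {next_state, StateName, StateData}.
    handle_sync_event(_Event, _From, StateName, StateData) ->
        {reply, ok, StateName, StateData}.
    handle_info(_Info, StateName, StateData) ->
        {next_state, StateName, StateData}.
    terminate(_Reason, _StateName, _StateData) ->
        ok.
    code_change(_OldVsn, StateName, StateData, _Extra) ->
        {ok, StateName, StateData}.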
As part of resolving this, the space_clear_ondelete test has been
completed, and completing this revealed that the Penciller could not
cope with a change which emptied the ledger. So a series of changes has
been made to allow it to smoothly progress to an empty manifest.
The no_hash option in CDB files became too hard to manage, in particular
the need to scan the whole file to find the last_key rather than cheat
and use the index. It has been removed for now.
The writing to the journal during journal compaction has now been
enhanced by an mput option on the CDB file write - so it can write each
batch as one pwrite operation.
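A sketch of the mput idea - the framing below (size-prefixed key/value pairs) is illustrative and is not the real CDB record format:

    -module(mput_sketch).
    -export([mput/3]).

    %% Write a whole batch with a single pwrite at the current end of the
    %% file, rather than issuing one write per object.
    mput(Handle, LastPosition, KVList) ->
        Bin = serialise(KVList),
        ok = file:pwrite(Handle, LastPosition, Bin),
        {ok, LastPosition + byte_size(Bin)}.

    serialise(KVList) ->
        << <<(byte_size(K)):32/integer, (byte_size(V)):32/integer,
             K/binary, V/binary>> || {K, V} <- KVList >>.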
Further work on variable reload strategies with some unit test coverage.
Also work on potentially supporting no_hash on PUT to journal files for
objects which will never be directly fetched.
The current mechanism of re-loading data from the Journal to the Ledger
from any potential SQN is not safe when combined with Journal
compaction.
This commit doesn't resolve these problems, but starts the groundwork
for resolving them by introducing Inker Key Types. These types would
differentiate between objects which are standard Key/Value pairs,
objects which are tombstones for keys, and objects which represent Key
Changes only.
The idea is that there will be flexible reload strategies based on
object tags (see the sketch after this list):
- retain (retain a key change object when compacting a standard object)
- recalc (allow key changes to be recalculated from objects and ledger
state when loading the Ledger from the journal)
- recover (allow for the potential loss of data following loss within the
persisted part of the ledger, potentially due to recovery through
external anti-entropy operations).
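A sketch of how a per-tag strategy lookup might work - the tag atoms and the fallback choice are illustrative:

    -module(reload_strategy_sketch).
    -export([strategy_for_tag/2]).

    %% Look up the reload strategy configured for an object tag, falling
    %% back to retain (the most conservative option) when no strategy is
    %% configured for that tag.
    %% e.g. strategy_for_tag(o, [{o, retain}, {i, recalc}]) -> retain
    strategy_for_tag(Tag, ConfiguredStrategies) ->
        case lists:keyfind(Tag, 1, ConfiguredStrategies) of
            {Tag, Strategy} when Strategy == retain;
                                 Strategy == recalc;
                                 Strategy == recover ->
                Strategy;
            false ->
                retain
        end.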
Further progress towards the tidying up of basement tombstones in the
Ledger, with support added for key-listing to help with testing (and as
a potentially required feature).
The test is incomplete, but committing at this stage as the last commit
broke some tests (within the test code).
There are some outstanding questions about the handling of tombstones in
the Journal during compaction. There exists a condition whereby values
could return if a recent journal is compacted and tombstones are removed
(as they are no longer present), but older journals have not been
compacted. Now on stop/start - if the Ledger is wiped the removal of
the keys will be forgotten but the original PUTs would still remain.
The safest thing may be to have a rule that tombstones are never deleted
from the Inker's Journal - and accept the build-up of garbage. Or there
could be an addition to the compaction process that checks back through
all the inker files to check that the Key of a tombstone is not present
in the past, before it is removed in the compaction.
Added basic support for 2i query. This involved some refactoring of the
test code to share functions between suites.
There is still a need for a Part 2 as no tests currently cover removal of
index entries.
When the journal CDB file is called to roll it now starts a new clerk to
perform the hashtable calculation (which may take many seconds). This
stops the store from getting blocked if there is an attempt to GET from
the journal that has just been rolled.
The journal file process now has a number of distinct states (reading,
writing, pending_roll, closing). A future refactor may look to make
leveled_cdb a gen_fsm rather than a gen_server.
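A gen_server-style sketch of handing the hashtable build to a helper process - the messages, state record and stand-in hashtable build are all illustrative:

    -module(roll_clerk_sketch).
    -behaviour(gen_server).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    -record(state, {entries = [], hashtable, pending_roll = false}).

    init([]) -> {ok, #state{}}.

    handle_call({put, Key, Value}, _From, State) ->
        {reply, ok,
         State#state{entries = [{Key, Value} | State#state.entries]}};
    handle_call(roll, _From, State) ->
        %% Hand the (potentially slow) hashtable build to a helper process
        %% and keep answering requests in the meantime.
        Owner = self(),
        Entries = State#state.entries,
        spawn_link(fun() ->
                       Owner ! {hashtable_ready, build_hashtable(Entries)}
                   end),
        {reply, ok, State#state{pending_roll = true}};
    handle_call({get, Key}, _From, State) ->
        %% GETs are still served while the hashtable is being built.
        {reply, proplists:get_value(Key, State#state.entries), State}.

    handle_info({hashtable_ready, HashTable}, State) ->
        {noreply, State#state{hashtable = HashTable, pending_roll = false}}.

    %% Stand-in for the real (expensive) CDB hashtable calculation.
    build_hashtable(Entries) -> maps:from_list(Entries).

    handle_cast(_Msg, State) -> {noreply, State}.
    terminate(_Reason, _State) -> ok.
    code_change(_OldVsn, State, _Extra) -> {ok, State}.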
This test exposed two bugs:
- Yet another set of off-by-one errors (really stupidly scanning the
Manifest from Level 1 not Level 0)
- The return of an old issue related to scanning the journal on load
whereby we fail to go back to the previous file before the current SQN
An attempt to refactor out more complex code.
The Penciller clerk and Penciller have been re-shaped so that their
relationship is much simpler, and also to make sure that they shut down
much more neatly when the clerk is busy to avoid crashdumps in ct tests.
The CDB now has a binary_mode - so that we don't do binary_to_term twice
... although this may have made things slower ??!!? Perhaps the
is_binary check now required on read is an overhead. Perhaps it is some
other mystery.
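A sketch of the kind of per-read check being speculated about - the names and mode handling are illustrative:

    -module(binary_mode_sketch).
    -export([unpack_value/2]).

    %% In binary_mode the stored binary is returned as-is, leaving any
    %% binary_to_term to the caller (so it happens once at most); otherwise
    %% the value is converted back to a term here.  The is_binary guard is
    %% the small per-read overhead mentioned above.
    unpack_value(Value, true) when is_binary(Value) ->
        Value;
    unpack_value(Value, false) when is_binary(Value) ->
        binary_to_term(Value);
    unpack_value(Value, _BinaryMode) ->
        Value.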
There is now more efficient fetching of the size on pcl_load as well.
Inker refactored to block on manifest write. If this is inefficient the
manifest write can be converted to an append-only operation.
Waiting on the manifest write makes the logic at startup much easier to
manage.
Add some initial system tests. This highlighted issues:
- That files deleted by compaction would be left orphaned and not closed,
and would not in fact be deleted (now deleted by closure only)
- There was an issue on startup that the first few keys in each journal
would not be re-loaded into the ledger
Largely untested work at this stage to allow for the Inker to request
the Inker's clerk to perform a single round of compaction based on the
best run of files it can find.