Change the penciller check so that it returns current/replaced/missing rather than just true/false.
Reduce unnecessary penciller checks for non-standard keys that will always be retained - and remove redundant code.
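For illustration only, a minimal sketch of how a caller might branch on the richer check result - the function name and the mapping of each outcome to an action are assumptions, not the actual leveled compaction logic:

```erlang
%% Sketch only: branch on the three-way result rather than a boolean.
%% The mapping of each outcome to an action is illustrative.
-spec example_action(current | replaced | missing) -> retain | compact.
example_action(current)  -> retain;   % journal entry is still the live value
example_action(replaced) -> compact;  % a newer SQN has superseded it in the ledger
example_action(missing)  -> compact.  % key is no longer present in the ledger
```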
Expand tests of retain and recover to make sure that compaction on delete is well covered.
Also move the SQN along during initial loads - to stop the aggressive loop to find the starting SQN in every file.
The Journal snapshot is not a true snapshot, in that the active file in the snapshot can still be taking appends. So when folding over a snapshot it is necessary to check that the SQN is <= the JournalSQN at the point the snapshot was taken.
Normally consistency of the snapshot is managed as the operation depends on the penciller, and the penciller *is* a snapshot. Not in this case, as the penciller will return true on an SQN check if the pcl SQN is behind the Journal. So the Journal folder has been given an additional check to stop at the JournalSQN.
This is perhaps a fault in the pcl SQN check, which should only return true on an exact match? I'm nervous about changing this though, so we have a less pure fix for now.
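As a minimal sketch of the extra check (names are illustrative, not taken from the Journal folder itself): entries with an SQN beyond the JournalSQN captured at snapshot time are simply not passed to the fold fun.

```erlang
%% Sketch only: only accumulate entries that existed when the snapshot was
%% taken, i.e. with SQN =< the JournalSQN recorded at snapshot time.
maybe_accumulate(SQN, Entry, Acc, FoldFun, JournalSQN) when SQN =< JournalSQN ->
    FoldFun(Entry, Acc);
maybe_accumulate(_SQN, _Entry, Acc, _FoldFun, _JournalSQN) ->
    Acc.
```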
To allow for extraction of metadata, and building of head responses - it should be possible to dynamically add user-defined tags, and functions to treat them.
If no function is defined, revert to the behaviour of the ?STD tag.
More obvious how to extend the code as it is all in one module.
Also add a new field to the standard object metadata tuple that may in future hold other object metadata based on user-defined functions.
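A minimal sketch of the intended fallback, assuming the user-defined functions are held in a map keyed by tag (the names here are hypothetical, not the leveled head API):

```erlang
%% Sketch only: use the user-defined extract function for the tag if one has
%% been registered, otherwise fall back to the standard (?STD) treatment.
extract_metadata(Tag, Obj, TagFuns, StdExtractFun) ->
    case maps:find(Tag, TagFuns) of
        {ok, ExtractFun} -> ExtractFun(Obj);
        error            -> StdExtractFun(Obj)
    end.
```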
This allows for all fold functions to throw an exception to exit out of a fold with all dependencies still closed down as expected.
This was previously available for key folds, which was necessary for the folds to work in Riak (as max_results in index queries depends on exiting the fold with an exception). This change now adds a ct test, and adds support for head folds, object folds (key order) and object folds (sqn order).
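The shape of the mechanism is roughly as below - a sketch under assumed names, not the actual runner code: the fold fun throws once it has what it needs, and the runner catches the throw while still closing the snapshot down.

```erlang
%% Sketch only: stop a fold early with a throw, while the snapshot (and any
%% other dependencies) are still closed down in the `after` clause.
run_fold(FoldOverSnapshotFun, CloseSnapshotFun) ->
    try
        FoldOverSnapshotFun()
    catch
        throw:{stop_fold, Acc} -> Acc
    after
        CloseSnapshotFun()
    end.

%% Example fold fun that stops once MaxResults keys have been accumulated.
keyfold_fun(MaxResults) ->
    fun(_Key, Acc) when length(Acc) >= MaxResults ->
            throw({stop_fold, Acc});
       (Key, Acc) ->
            [Key | Acc]
    end.
```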
This helps with kv_index_tictactree with the leveled_so backend. Now this can do folds over ranges of keys with modified filters (as folds over ranges of keys must go over all keys if the backend is segment_ordered).
Acc in response is now of the form {Reason, Acc} not just Acc, so that the application can understand the reason for the results ending - and take appropriate action (e.g. restart again from the LastKey to return more results).
To support max_keys and the last modified date range.
This applies the last modified date check on all ledger folds. This is hard to avoid, but ultimately a very low cost.
The limit on the number of heads to fold is based on the number of results passed to the accumulator - not on the number actually added to the accumulator. So if the FoldFun performs a filter (e.g. for the preflist), then those filtered results will still count towards the maximum.
There needs to be some way at the end of the fold of signalling whether the outcome was or was not 'constrained' by max_keys - as the caller cannot simply tell by length-checking the outcome.
Note this is used rather than length-checking the buffer and throwing a 'stop_fold' message when the limit is reached. The choice is made for simplicity, and ease of testing. The throw mechanism is necessary if there is a need to stop parallel folds across the cluster - but in this case the node_worker_pool will be used.
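From the application's side, the result might then be handled along these lines (a sketch only; the reason terms are illustrative rather than quoted from the code):

```erlang
%% Sketch only: act on the reason returned alongside the accumulator.
handle_fold_result({finished, Acc}) ->
    {complete, Acc};             % the fold ran to the end of the range
handle_fold_result({{max_keys, LastKey}, Acc}) ->
    {partial, LastKey, Acc}.     % constrained by max_keys - restart from LastKey for more
```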
Queries that in Riak will be based on fold_keys need to be able to catch throws, and re-throw them so they can be detected by the worker (whilst still clearing up the snapshot).
Due to the internal fold over buckets returning an un-reversed accumulator, the caller's fold fun in the API bucketlist code traversed the bucket list in reverse order. This led to some inconsistencies when comparing a bucketlist of all buckets vs the first bucket only, i.e. the 'first' bucket passed to the foldfun was in fact the last bucket read from the ledger.
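The fix is conceptually simple - a sketch of the ordering point, not the actual bookie code: reverse the internally accumulated bucket list before applying the caller's fold fun, so the first bucket the fold fun sees is genuinely the first bucket in the ledger.

```erlang
%% Sketch only: apply the caller's fold fun in ledger order, even though the
%% internal fold accumulated the buckets in reverse.
apply_caller_fold(BucketsReversed, CallerFoldFun, InitAcc) ->
    lists:foldl(CallerFoldFun, InitAcc, lists:reverse(BucketsReversed)).
```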
Previously there was no is_empty check, and there was a workaround using binary_bucketlist. But what if there were many buckets? This would be a slow seek (using get next key over and over).
Instead have a proper is_empty check.
Initial commit to add head_only mode to leveled. This allows leveled to receive batches of object changes, but where those objects exist only in the Penciller's Ledger (once they have been persisted within the Ledger).
The aim is to reduce significantly the cost of compaction. Also, the objects are not directly accessible (they can only be accessed through folds). Again this makes life easier during merging in the LSM trees (as no bloom filters have to be created).
Initially with basic tests. If the SlotIndex has been cached, we can now use the slot index as it is based on the Segment hash algorithm.
This looks like it should lead to an order of magnitude improvement in querying for keys/clocks by segment ID.
This also required a slight tweak to the penciller keyfolder. It now caches the next answer from the SSTiter, rather than restarting the iterator. When the IMMiter has many more entries than the SSTiter (as the SSTiter is being filtered but not the IMMiter) this could lead to lots of repeated folding.
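The caching idea can be illustrated with a toy merge of two sorted inputs (plain lists stand in for the real iterators; all names are hypothetical): the current SST answer is carried in the loop state and only re-fetched once it has actually been consumed, rather than being re-derived every time the in-memory side wins the comparison.

```erlang
%% Sketch only: SSTNext holds the cached "next answer" from the SST side.
merge(IMM, SST) -> merge(IMM, fetch(SST), []).

merge([], none, Acc) ->
    lists:reverse(Acc);
merge([], {SSTHead, SSTRest}, Acc) ->
    lists:reverse(Acc, [SSTHead | SSTRest]);
merge([H | T], none, Acc) ->
    merge(T, none, [H | Acc]);
merge([H | T], {SSTHead, _} = SSTNext, Acc) when H =< SSTHead ->
    %% in-memory side wins: keep the cached SST answer, do not re-fetch it
    merge(T, SSTNext, [H | Acc]);
merge(IMM, {SSTHead, SSTRest}, Acc) ->
    %% SST side wins: consume the cached answer, then fetch the next one
    merge(IMM, fetch(SSTRest), [SSTHead | Acc]).

fetch([])      -> none;
fetch([H | T]) -> {H, T}.
```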
Introduce a dedicated module for all the different fold types. Also simplify the list of folders by deprecating those folds that should be achievable by fold_heads/fold_objects type folds but with smarter functions.
Makes sure that the fold functions also have better spec coverage, and are dialyzer checked.
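For instance, the constructors in such a module can each carry a spec of roughly this shape, so dialyzer can check both the fold fun passed in and the {async, Runner} pair returned (everything below is an illustrative stand-in, not a function from the module):

```erlang
%% Sketch only: illustrative shape of a spec'd fold constructor. The snapshot
%% fun here just returns a list of keys, standing in for the real snapshot.
-spec fold_example(fun(() -> [term()]),
                   fun((term(), term()) -> term()),
                   term()) -> {async, fun(() -> term())}.
fold_example(SnapFun, FoldFun, InitAcc) ->
    {async, fun() -> lists:foldl(FoldFun, InitAcc, SnapFun()) end}.
```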