leveled

Author	SHA1	Message	Date
Martin Sumner	3950942da3	Roll in fix for intermittently failing test As described in https://github.com/martinsumner/leveled/issues/92 Only the first fix was made. Just to be safe - archiving means renaming to another file with a different extension. Assumption is that renamed files can be manually reaped if necessary.	2017-09-27 23:52:49 +01:00
Martin Sumner	433cc37eb6	Rolled back LMD in metadata Because there's no sensible way of using it if objects are mutable - you still end up with the same false positives in the tictactree. Didn't fully rollback the change as spec and docs were added which could be useful going forward.	2017-09-27 12:26:12 +01:00
Martin Sumner	2e5b9c80f4	Add max LMD to Riak metadata This is an interim stage towards enhancing the proxy object so that it contains more helper information (other than size). The aim is to be able to run more efficient fold_heads queries that might filter on LMD range (so as not to have to co-ordinate the running of comparative queries). For example if producing a tictactree to compare between two different offsets, a max LMD could be passed in so that changes beyond the time the first query was requested can be ignored.	2017-09-27 12:15:18 +01:00
Martin Sumner	389694b11b	Add exportable option to tictac Idea being that sometimes you may wish to compare a tictac tree between leveled and something that doesn't understand erlang:phash or term_to_binary. So allow the magic_hash to be used instead - and perhaps an extract function that does base64 encoding or something similar.	2017-09-26 22:49:40 +01:00
Martin Sumner	2f9afa1469	Add support for performing a magic hash on a binary Ignore unnecessary term_to_binary if already binary. This will be useful when we use magic_hash in tictac_trees we wish to be exportable.	2017-09-26 16:32:59 +01:00
Martin Sumner	dfab33e8da	Add smaller trees The "small" tree will serialise to 1.5MB - which seems large. Much smaller trees seem to be more suitable for things like recently modified aae indexes.	2017-09-25 13:07:08 +01:00
Martin Sumner	9730816c38	Merge branch 'master' into mas-riakaae-impl-2	2017-09-22 09:39:32 +01:00
Martin Sumner	eba21f49fa	Make tests compatible with OTP 16 this required a switch to change the sync strategy based on rebar parameter. However tests could be slow on macbook with OTP16 and sync - so timeouts added in unit tests, and ct tests sync_strategy changed to not sync for OTP16.	2017-09-15 15:10:04 +01:00
Martin Sumner	856a64c4d4	Unit tests use wrong query format Not sure how this happened. Bad merge? Just plain sloppiness on my part? Anyhow, the unit tests were not working ..	2017-09-14 18:06:50 +01:00
Martin Sumner	ed56ef17a1	Make export mochijson friendly	2017-09-13 23:45:48 +01:00
Martin Sumner	d8ca0274f6	Export the import	2017-09-13 22:02:44 +01:00
Martin Sumner	d98534b8f3	Add struct to help with mochijson2 Try and make this more mochijson2 friendly	2017-09-13 21:56:28 +01:00
Martin Sumner	9f97c82d0d	Add import/export support Also fix for fold_heads unit tests to reflect new booleans required by changes to support their use in MapFolds	2017-08-16 16:58:38 +01:00
Martin Sumner	53ddc8950b	Add tests using fold_heads Comparing the inbuilt tictac_tree fold, to using "proper" abstraction and achieving the same thing through fold_heads. The fold_heads method is slower (a lot more manipulation required in the fold) - expect it to require > 2 x CPU. However, this does give the flexibility to change the hash algorithm. This would allow for a fold over a database of AAE trees (where the hash has been pre-computed using sha) to be compared with a fold over a database of leveled backends. Also can vary whether the fold_heads checks for presence of the object in the Inker. So normally we can get the speed advantage of not checking the Journal for presence, but periodically we can.	2017-08-07 10:45:41 +01:00
Heinz N. Gies	b319386210	Fix strong_rand to rand_bytes	2017-08-01 11:55:05 +02:00
Heinz N. Gies	bbe763514b	Remove uniform_s/2 from old random code	2017-08-01 11:37:18 +02:00
Heinz N. Gies	38e9b0e80a	Add missing uniform/0 function	2017-08-01 11:24:12 +02:00
Heinz N. Gies	25389893cf	Add compatibility for old and new random / rand functions	2017-08-01 11:24:12 +02:00
Heinz N. Gies	379e33ba84	Cleanup dialyzer errors in leveled_bookie	2017-07-31 19:58:56 +02:00
Heinz N. Gies	8717d42ffe	Cleanup dialyzer errors in leveled_cdb	2017-07-31 19:55:09 +02:00
Heinz N. Gies	e8ed7954cc	Cleanup dialyzer errors in leveled_iclerk	2017-07-31 19:53:01 +02:00
Heinz N. Gies	eece253222	Cleanup most dialyzer errors in leveled_inker	2017-07-31 19:47:58 +02:00
Heinz N. Gies	44fd603474	Cleanup dialyzer errors in leveled_pclerk	2017-07-31 19:41:26 +02:00
Heinz N. Gies	369bdece5f	Cleanup dialyzer errors in leveled_penciller	2017-07-31 19:39:40 +02:00
Heinz N. Gies	858ee9a915	Cleanup dialyzer errors in leveled_pmanifest	2017-07-31 19:32:06 +02:00
Heinz N. Gies	5e6df539cb	Cleanup dialyzer errors in leveled_sst	2017-07-31 19:30:29 +02:00
martinsumner	80fd2615f6	Implement blacklist/whitelist Change from the all/whitelist behaviour to the blacklist/whitelist behaviour documented in the write-up	2017-07-11 11:44:01 +01:00
martinsumner	97fdd36d53	Returning bucket when bucket is all Need to know {Bucket, Key} not just Key if all buckets are being covered by nrt aae. So shoehorning this in - will also allow for proper use of FilterFun when filtering by partition.	2017-07-03 18:03:13 +01:00
Martin Sumner	fd84e4f608	Test timeouts So that coverage testing will run.	2017-07-02 22:23:02 +01:00
martinsumner	52ca0e4b6c	Test expansion Detect a recent difference	2017-07-02 19:33:18 +01:00
martinsumner	954995e23f	Support for recent AAE index With basic ct test. Doesn't currently prove expiry of index. Doesn't prove ability to find segments. Assumes that either "all" buckets or a special list of buckets require indexing this way. Will lead to unexpected results if the same bucket name is used across different Tags. The format of the index has been chosen so that hopefully standard index features can be used (e.g. return_terms).	2017-06-30 16:31:22 +01:00
martinsumner	8da8722b9e	Add temporary aae index Pending ct tests. The aae index should expire after limit_minutes and be on an index which is rounded to unit_minutes.	2017-06-30 10:03:36 +01:00
martinsumner	2dd303237b	Change XOR	2017-06-28 10:55:54 +01:00
martinsumner	ebef27f021	Extract Last Modified Date from Riak Object As part of the process of supporting a recent changes index for near-real-time anti-entropy	2017-06-27 16:25:18 +01:00
martinsumner	f81a4bca0d	Revert "WIP - Recent Modifications" This reverts commit bc19a05d83a02d7ec03771657df85b33acc6cfee.	2017-06-27 16:25:18 +01:00
martinsumner	9fca17d56a	WIP - Recent Modifications Just some initial WIP code for this. Will revisit this again after exploring some ideas as to how to reduce the cost of the get_keys_by_segment. The overall idea is that there are trees of recent modifications, with recent being some rolling time window made up of hourly blocks, and recency being determined by the last-modified date on the object metadata - which should be consistent across a cluster. So if we were at 15:30 we would get the tree for 14:00 - 15:00 and the tree for 15:00-16:00 from two different queries which cover the same partitions and then compare. Comparison may find differences, and we know what segment the difference is in - but how to then find all keys in that segment which have been modified in the period? Three ways: (1) Do it inefficiently and infrequently using a fold_keys and a filter (perhaps with SST files having a highest LMD in the metadata so that they can be skipped). (2) Add a special index, where every entry has a TTL, and the Key is {$segment, Segment, Bucket, Key} so that a normal 2i query can be used. (3) Align hashing for segments with hashing for penciller lookup so that a query over the actual keys can be optimised skipping chunks of the in-memory part, and chunks of the SST file	2017-06-27 16:25:18 +01:00
Martin Sumner	fde9af28dd	comment test to avoid timeout	2017-06-26 17:08:31 +01:00
martinsumner	4e5c3e2f64	Fix merge Fix typo in merge, and extra validation step to unit tests to prevent it returning.	2017-06-23 12:32:37 +01:00
martinsumner	7cfa392b6e	Flexible TicTacTree sizes Allow tictac tree sizes to be flexible. Tested lots of different sizes. Having both level 1 and level 2 the same size seemed to be consistently quicker than trying to make either of the levels relatively wider. There's an 8% performance improvement if the SegmentCount is reduced by a quarter.	2017-06-20 10:58:13 +01:00
martinsumner	d5b4cb844f	Finding keys Progresses from a segment list to scanning for the keys in that segment	2017-06-19 18:38:55 +01:00
martinsumner	c586b78f45	Initial code with busted ct test Initial comparison made between trees externally - but ct test is bust.	2017-06-19 11:36:57 +01:00
martinsumner	6ad98d77c5	Spec module for dialyzer Add specs/docs for the leveled_tictac module. Dialyzer passes.	2017-06-16 13:47:19 +01:00
martinsumner	f5dd154cee	Rename hashtree query Naming is now confusing now we have TicTac Trees. This query builds a list of keys and hashes not a tree - so it was misleading anyway. Now renamed hashlist_query.	2017-06-16 12:38:59 +01:00
Martin Sumner	7642aac2cc	Change Riak object hash approach Change the riak object hash being kept in the metadata, to being a hash of the vector clock	2017-06-16 10:14:24 +01:00
martinsumner	959e7f932f	Add simple merge Allow for tictac trees to be merged	2017-06-15 16:16:19 +01:00
martinsumner	86b11803c9	Build and compare Build and compare of tictac trees. These are mergable merkle trees that are not cryptographically secure.	2017-06-15 15:40:23 +01:00
Martin Sumner	15c52ae118	Change default compaction settings Need to allow specific settings to be passed into unit tests. Also, too much journal compaction may lead to intermittent failures on the basic_SUITE space_clear_on_delete test. Think this is because there are fewer “deletes” to reload in on startup to trigger the cascade down and clear up?	2017-06-02 08:37:57 +01:00
martinsumner	569b498727	Resolve dialyzer warnings Botched switch to leveled_log in list - so resolved dialyzer warnings	2017-06-01 22:03:51 +01:00
martinsumner	32612dfe4a	Yet another array type OTP16 issue	2017-06-01 21:39:01 +01:00
martinsumner	afbf918f2c	Change from using array type Won't compile in OTP16	2017-06-01 21:37:23 +01:00

1115 commits