Commit graph

194 commits

Author SHA1 Message Date
Martin Sumner
861aa5a7db Support multi-query fold
Allow a single snapshot to run query over multiple ranges.   Used initially to fold over multiple buckets.
2018-03-01 23:19:52 +00:00
Martin Sumner
090e414b23 Coverage issues
Not making proxy object so get_size not required.

Extend tests to improve coverage
2018-02-16 20:27:49 +00:00
Martin Sumner
70dfb77088 Optional lookup in head_only mode
Allow decision to be made on startup whether ObjectSpecs can be looked up directly when running in head_only mode.
2018-02-16 17:06:30 +00:00
Martin Sumner
910ccb6072 Add lookup support in head_only mode
Originally had disabled the ability to lookup individual values when running in head_only mode.  This is a saving of about 11% at PUT time (about 3 microseconds  per PUT) on a macbook.

Not sure this saving is sufficient enought to justify the extra work if this is used as an AAE Keystore with Bitcask and LWW (when we need to lookup the current value before adjusting).

So reverted to re-adding support for HEAD requests with these keys.
2018-02-16 14:16:28 +00:00
Martin Sumner
2b6281b2b5 Initial head_only features
Initial commit to add head_only mode to leveled.  This allows leveled to receive batches of object changes, but where those objects exist only in the Penciller's Ledger (once they have been persisted within the Ledger).

The aim is to reduce significantly the cost of compaction.  Also, the objects ar enot directly accessible (they can only be accessed through folds).  Again this makes life easier during merging in the LSM trees (as no bloom filters have to be created).
2018-02-15 16:14:46 +00:00
Martin Sumner
bb39498ec5 Missed log line added back
.. and covered in test
2017-12-04 15:26:01 +00:00
Martin Sumner
cb10a00845 Make test reliable
Might corrupt the wrong journal otherwise
2017-11-28 14:57:17 +00:00
Martin Sumner
58946a7f98 Amend SST Timing Capture
Use sampling mechansm from CDB timing capture.  Do it less though - as far more SST fetches in comparison to CDB fetches.
2017-11-21 17:00:23 +00:00
Martin Sumner
5b4bc1ce59 Merge branch 'master' into mas-i108-cdbtimings 2017-11-20 17:34:50 +00:00
Martin Sumner
faa5ef82aa Test logging of samples
To prompt the log the journal size needs to be reduced
2017-11-20 15:31:31 +00:00
Martin Sumner
d5babe0c29 Expand tests for coverage
Prove tests handle SQN over batch, and also can handle mismatched tags in the store.
2017-11-20 10:21:30 +00:00
Martin Sumner
0e071d078e fold_objects in SQN order
This adds a test that fold_objects works in SQN order
2017-11-17 18:30:51 +00:00
Martin Sumner
29ffb74a5b Add coverage for CDB delete_pending with snapshot
CDB file in delete_pending state, and times out but the Inker has a snapshot open - and so the file is not deleted.
2017-11-09 12:02:03 +00:00
Martin Sumner
4c05dc79f9 Merge branch 'master' into mas-aae-segementfoldplus 2017-11-08 18:38:49 +00:00
Martin Sumner
7de4dccbd9 Extend journal compaction test
to cover with and without  waste retention.  Also makes sure that CDB files in a restarted store will respect the wast retention period set.
2017-11-08 16:18:48 +00:00
Martin Sumner
8f27b3b628 Merge branch 'master' into mas-aae-segementfoldplus 2017-11-07 11:22:56 +00:00
Martin Sumner
f358bd7622 Switch to using passed in compression method for maybe_compress
When the compaction discovers compression is required it will used the passed in method at startup - not the method which had been previously defined.
2017-11-06 21:16:46 +00:00
Martin Sumner
9a0a4ced0d Test LZ4 from uncompressed
Coverage issue
2017-11-06 19:42:35 +00:00
Martin Sumner
1d475235d1 Improve test coverage
Make compress on receipt/compaction configurable
2017-11-06 18:44:08 +00:00
Martin Sumner
61b7be5039 Make compression algorithm an option
Compression can be switched between LZ4 and zlib (native).

The setting to determine if compression should happen on receipt is now a macro definition in leveled_codec.
2017-11-06 15:54:58 +00:00
Martin Sumner
99428d0e55 Remove erroneously added file 2017-11-03 14:26:51 +00:00
Martin Sumner
0ecb83f8ec Remove eroneously added files 2017-11-03 14:26:18 +00:00
Martin Sumner
9fa8ed6cca Add LZ4 2017-11-03 14:18:49 +00:00
Martin Sumner
4fbb770a8c Revert "Failed attempt to hack in LZ4"
This reverts commit 912920a53c.
2017-11-03 11:47:25 +00:00
Martin Sumner
912920a53c Failed attempt to hack in LZ4 2017-11-03 11:47:00 +00:00
Martin Sumner
c8ad39b33b foldheads_bybucket adds segment list support
Accelerate queries for foldheads_bybucket as well
2017-11-01 22:00:12 +00:00
Martin Sumner
2428d2cbff Add test with presence check
Add a test in each loop with a check for the presence of the object in the Journal
2017-11-01 15:23:28 +00:00
Martin Sumner
033cf1954d Add check for too-small trees
Provide a function for generating segmentfilter lists so that it can handle trees that are "too small".

Test those smaller trees - plus also false positives and cold caches
2017-11-01 13:18:01 +00:00
Martin Sumner
81180e9310 Add tests for different tree sizes
Note that accelerating segment_list queries will not work for tree sizes smaller than small.  How to flag this up?

Should smaller tree sizes just be removed from leveled_tictac?
2017-11-01 11:51:51 +00:00
Martin Sumner
b141dd199c Allow for segment-acceleration of folds
Initially with basic tests.  If the SlotIndex has been cached, we can now use the slot index as it is based on the Segment hash algortihm.

This looks like it should lead to an order of magnitude improvement in querying for keys/clocks by segment ID.

This also required a slight tweak to the penciller keyfolder.  It now caches the next answer from the SSTiter, rather than restart the iterator.   When the IMMiter has many more entries than the SSTiter (as the sSTiter is being filtered but not the IMMiter) this could lead to lots of repeated folding.
2017-10-31 23:28:35 +00:00
Martin Sumner
f5878548f9 Make binary Riak bucket/keys a special case
When leveled is used with Riak, buckets and keys are always binaries.  So we can treat them as such.

Want to move tictac tree testing away from the leveled internal tests, to a set of tests for the Riak scenario.  so riak_SUITE created for this and other riak-specific backend tests.
2017-10-30 17:39:21 +00:00
Martin Sumner
6bb7ceef0c Attempt to standardise on segment hashes
To allow for the segment has that accelerates queries to be re-used in tictac tree related queries.
2017-10-30 13:57:41 +00:00
Martin Sumner
36264eb416 Search range failure
Discovered a bug with search ranges in leveled_tree - this was uncovered by an intermittently fialing 19.3 test.

Test case added and bug fixed.  It was due to a fialure to use end_key passed causing issues with particular manifests and full bucket ranges.
2017-10-24 13:19:30 +01:00
Martin Sumner
f89e2cf1f1 Improve test coverage 2017-10-17 22:06:30 +01:00
Martin Sumner
bfaed921e6 Split code for folders - introduce runner actor
Introduce a dedicated module for all the different fold types.  Also simplify the list of folders by deprecating those folds that should eb achieveable by fold_heads/fold_objects type folds but with smarter functions.

Makes sure that the fold functiosn also have better spec coverage, and are dialyzer checked.
2017-10-17 20:39:11 +01:00
Martin Sumner
0c5f5cdb65 Add key range to fold_heads queries 2017-10-06 15:02:14 +01:00
Martin Sumner
61724cfedb Merge branch 'master' into mas-riakaae-impl-2 2017-09-28 13:23:29 +01:00
Martin Sumner
3950942da3 Roll in fix for intermittently failing test
As descibed in https://github.com/martinsumner/leveled/issues/92

Only the first fix was made.

Just to eb safe - archiving means renaming to another file with a different extension.  Assumption is that renamed files cna be manually reaped if necessary.
2017-09-27 23:52:49 +01:00
Martin Sumner
389694b11b Add exportable option to tictac
Idea being that sometimes you may wish to compare a tictac tree between leveled and something that doesn't understand erlang:phash or term_to_binary.  So allow the magic_hash to be used instead - and perhaps an extract function that does base64 encoding or something similar.
2017-09-26 22:49:40 +01:00
Martin Sumner
dfab33e8da Add smaller trees
The "small" tree will serialise to 1.5MB - which seems large.  Much smaller trees seem to be more suitable for things like recently modified aae indexes.
2017-09-25 13:07:08 +01:00
Martin Sumner
eba21f49fa Make tests compatible with OTP 16
this required a switch to change the sync strategy based on rebar parameter.

However tests could be slow on macbook with OTP16 and sync - so timeouts added in unit tests, and ct tests sync_startegy changed to not sync for OTP16.
2017-09-15 15:10:04 +01:00
Martin Sumner
869e799b41 Fix tests
Obviously got totally messed up and confused when testing previous
commits.

Multiple tests were failing for a change which got merged in as the
tests were not reflecting the required API.
2017-09-15 10:33:16 +01:00
Martin Sumner
53ddc8950b Add tests using fold_heads
Comparing the inbuilt tictac_tree fold, to using "proper" abstraction and achieving the same thing through fold_heads.

The fold_heads method is slower (a lot more manipulation required in the fold) - expect it to require > 2 x CPU.

However, this does give the flexibility to change the hash algorithm.  This would allow for a fold over a database of AAE trees (where the hash has been pre-computed using sha) to be compared with a fold over a database of leveled backends.

Also can vary whether the fold_heads checks for presence of the object in the Inker.  So normally we can get the speed advantage of not checking the Journal for presence, but periodically we can.
2017-08-07 10:45:41 +01:00
Martin Sumner
dd20132892 Add test with fold_heads
Build the AAE tree equally using fold_heads.  This is a pre-cursor to running this within Riak.

In part this leans on some of the work done to improve standard Riak AAE with leveled.  When rebuilding the standard AAE store only the head is required, and so this process was switched in riak_kv_sweeper to make a fold_heads request if supported by the backend.

The head response is a proxy object, which when loaded into a riak_object will allow for access to object metadata, but will use the passed function if access to object contents is requested.
2017-08-05 16:43:03 +01:00
Heinz N. Gies
38e9b0e80a Add missing uniform/0 function 2017-08-01 11:24:12 +02:00
Heinz N. Gies
25389893cf Add compatibility for old and new random / rand functions 2017-08-01 11:24:12 +02:00
Martin Sumner
8748fef28c Add extra second to sleep
Sleep for just one more second to resolve intermittent failure
2017-08-01 00:14:31 +01:00
Martin Sumner
65fd029ca6 typo - backlist/blacklist 2017-07-11 12:25:06 +01:00
martinsumner
80fd2615f6 Implement blacklist/whitelist
Change from the all/whitelist ebhavior to the blacklist/whitelist
behaviour documented in the write-up
2017-07-11 11:44:01 +01:00
martinsumner
3105656d2e Add test descriptions and further documentation 2017-07-06 15:40:30 +01:00