Compression can be switched between LZ4 and zlib (native).
The setting to determine if compression should happen on receipt is now a macro definition in leveled_codec.
Initially with basic tests. If the SlotIndex has been cached, we can now use the slot index as it is based on the Segment hash algortihm.
This looks like it should lead to an order of magnitude improvement in querying for keys/clocks by segment ID.
This also required a slight tweak to the penciller keyfolder. It now caches the next answer from the SSTiter, rather than restart the iterator. When the IMMiter has many more entries than the SSTiter (as the sSTiter is being filtered but not the IMMiter) this could lead to lots of repeated folding.
Switch from magic hash to md5 - to hopefully remove the need for some
of the artificial jumps required to get expected fall positive ratios.
Also split the hash into two 16-bit integers. We assume that SegmentID
(from the perspective of AAE merkle/tictac trees) will always be at
least 16 bits. the idea is that hashes should be used in blooms and
indexes such that some advantage can be gained from just knowing the
segmentID - in particular when folding over all the keys in a bucket.
Performance testing has been difficult so far - I think due to “cloud”
mysteries.
Introduce a dedicated module for all the different fold types. Also simplify the list of folders by deprecating those folds that should eb achieveable by fold_heads/fold_objects type folds but with smarter functions.
Makes sure that the fold functiosn also have better spec coverage, and are dialyzer checked.
Idea being that sometimes you may wish to compare a tictac tree between leveled and something that doesn't understand erlang:phash or term_to_binary. So allow the magic_hash to be used instead - and perhaps an extract function that does base64 encoding or something similar.
this required a switch to change the sync strategy based on rebar parameter.
However tests could be slow on macbook with OTP16 and sync - so timeouts added in unit tests, and ct tests sync_startegy changed to not sync for OTP16.
Comparing the inbuilt tictac_tree fold, to using "proper" abstraction and achieving the same thing through fold_heads.
The fold_heads method is slower (a lot more manipulation required in the fold) - expect it to require > 2 x CPU.
However, this does give the flexibility to change the hash algorithm. This would allow for a fold over a database of AAE trees (where the hash has been pre-computed using sha) to be compared with a fold over a database of leveled backends.
Also can vary whether the fold_heads checks for presence of the object in the Inker. So normally we can get the speed advantage of not checking the Journal for presence, but periodically we can.
Need to know {Bucket, Key} not just Key if all buckets are being covered
by nrt aae. So shoehorning this in - will also allow for proper use of
FilterFun when filtering by partition.
With basic ct test.
Doesn't currently prove expiry of index. Doesn't prove ability to find
segments.
Assumes that either "all" buckets or a special list of buckets require
indexing this way. Will lead to unexpected results if the same bucket
name is used across different Tags.
The format of the index has been chosen so that hopeully standard index
features can be used (e.g. return_terms).
Allow tictac tree sizes to be flexible.
Tested lots of different sizes. Having both level 1 and level 2 the
same size seemed to be consistently quicker than trying to make either
of the levels relatively wider.
There's an 8% performance improvement if the SegmentCount is reduced by
a quarter.
Naming is now confusing now we have TicTac Trees. This query builds a
list of keys and hashes not a tree - so it was misleading anyaway. Now
renamed hashlist_query.
This at least checks the file is present, and the Key exists in the
index of that file. If the value is corrupt it will be removed by
compation, and then this will fail (unless the file is never compacted).
TODO: resolve issus of files which are corrupt - but never compacted
- a job for backup?
fold objects which snaps in the fold was implemented incorrectly - it
took information from the LedgeCache at the point of the request, not
at the point of the fold. So the LedgerCache SQN may have been
surpassed in the Penciller by the time the fold was called.
riak_kv_sweeper gets the async fold function, then determines if the
function can be called. If the system is busy the fold may be queued,
and may never be acted upon.
This may cause issues with snapshot timeouts etc.
When running a load of mainly 2i queries, there is a huge cost in the
previous snapshot code. The time taken to create a clone of the
Penciller (duplicating all the LoopState) varied between 1 and 200ms
depedning on the size of the LoopState.
For 2i queries, most of that LoopState was then being thrown away after
running the query against the levelzero_cache. This was taking < 1ms on
average. It would be better to avoid the o(100)ms of CPU burning and
block for o(1)ms - so th eorder of events have been changed ot filter
first so only the small part of the LoopState actually required is
copied to the clone.