Previously there was no is_empty check, and there was a workaround using binary_bucketlist. But what if there were many buckets - this is a slow seek (using get next key over and over).
Instead have a proper is_empty check.
Originally had disabled the ability to lookup individual values when running in head_only mode. This is a saving of about 11% at PUT time (about 3 microseconds per PUT) on a macbook.
Not sure this saving is sufficient enought to justify the extra work if this is used as an AAE Keystore with Bitcask and LWW (when we need to lookup the current value before adjusting).
So reverted to re-adding support for HEAD requests with these keys.
Initial commit to add head_only mode to leveled. This allows leveled to receive batches of object changes, but where those objects exist only in the Penciller's Ledger (once they have been persisted within the Ledger).
The aim is to reduce significantly the cost of compaction. Also, the objects ar enot directly accessible (they can only be accessed through folds). Again this makes life easier during merging in the LSM trees (as no bloom filters have to be created).
If wast retention period is undefined, then it should be ignored - and no waste retained (rather than retaining waste for 24 hours as at present).
This wasn't working anyway - as reopen reader didn't get the cdb options (which didn't have the waste path on anyway) - so waste would not eb retained if the file had been opened after a stop/start.
Compression can be switched between LZ4 and zlib (native).
The setting to determine if compression should happen on receipt is now a macro definition in leveled_codec.
Initially with basic tests. If the SlotIndex has been cached, we can now use the slot index as it is based on the Segment hash algortihm.
This looks like it should lead to an order of magnitude improvement in querying for keys/clocks by segment ID.
This also required a slight tweak to the penciller keyfolder. It now caches the next answer from the SSTiter, rather than restart the iterator. When the IMMiter has many more entries than the SSTiter (as the sSTiter is being filtered but not the IMMiter) this could lead to lots of repeated folding.
Switch from magic hash to md5 - to hopefully remove the need for some
of the artificial jumps required to get expected fall positive ratios.
Also split the hash into two 16-bit integers. We assume that SegmentID
(from the perspective of AAE merkle/tictac trees) will always be at
least 16 bits. the idea is that hashes should be used in blooms and
indexes such that some advantage can be gained from just knowing the
segmentID - in particular when folding over all the keys in a bucket.
Performance testing has been difficult so far - I think due to “cloud”
mysteries.
Introduce a dedicated module for all the different fold types. Also simplify the list of folders by deprecating those folds that should eb achieveable by fold_heads/fold_objects type folds but with smarter functions.
Makes sure that the fold functiosn also have better spec coverage, and are dialyzer checked.
Idea being that sometimes you may wish to compare a tictac tree between leveled and something that doesn't understand erlang:phash or term_to_binary. So allow the magic_hash to be used instead - and perhaps an extract function that does base64 encoding or something similar.
this required a switch to change the sync strategy based on rebar parameter.
However tests could be slow on macbook with OTP16 and sync - so timeouts added in unit tests, and ct tests sync_startegy changed to not sync for OTP16.
Comparing the inbuilt tictac_tree fold, to using "proper" abstraction and achieving the same thing through fold_heads.
The fold_heads method is slower (a lot more manipulation required in the fold) - expect it to require > 2 x CPU.
However, this does give the flexibility to change the hash algorithm. This would allow for a fold over a database of AAE trees (where the hash has been pre-computed using sha) to be compared with a fold over a database of leveled backends.
Also can vary whether the fold_heads checks for presence of the object in the Inker. So normally we can get the speed advantage of not checking the Journal for presence, but periodically we can.
Need to know {Bucket, Key} not just Key if all buckets are being covered
by nrt aae. So shoehorning this in - will also allow for proper use of
FilterFun when filtering by partition.
With basic ct test.
Doesn't currently prove expiry of index. Doesn't prove ability to find
segments.
Assumes that either "all" buckets or a special list of buckets require
indexing this way. Will lead to unexpected results if the same bucket
name is used across different Tags.
The format of the index has been chosen so that hopeully standard index
features can be used (e.g. return_terms).
Allow tictac tree sizes to be flexible.
Tested lots of different sizes. Having both level 1 and level 2 the
same size seemed to be consistently quicker than trying to make either
of the levels relatively wider.
There's an 8% performance improvement if the SegmentCount is reduced by
a quarter.
Naming is now confusing now we have TicTac Trees. This query builds a
list of keys and hashes not a tree - so it was misleading anyaway. Now
renamed hashlist_query.
This at least checks the file is present, and the Key exists in the
index of that file. If the value is corrupt it will be removed by
compation, and then this will fail (unless the file is never compacted).
TODO: resolve issus of files which are corrupt - but never compacted
- a job for backup?
fold objects which snaps in the fold was implemented incorrectly - it
took information from the LedgeCache at the point of the request, not
at the point of the fold. So the LedgerCache SQN may have been
surpassed in the Penciller by the time the fold was called.
riak_kv_sweeper gets the async fold function, then determines if the
function can be called. If the system is busy the fold may be queued,
and may never be acted upon.
This may cause issues with snapshot timeouts etc.