As descibed in https://github.com/martinsumner/leveled/issues/92
Only the first fix was made.
Just to eb safe - archiving means renaming to another file with a different extension. Assumption is that renamed files cna be manually reaped if necessary.
Idea being that sometimes you may wish to compare a tictac tree between leveled and something that doesn't understand erlang:phash or term_to_binary. So allow the magic_hash to be used instead - and perhaps an extract function that does base64 encoding or something similar.
The "small" tree will serialise to 1.5MB - which seems large. Much smaller trees seem to be more suitable for things like recently modified aae indexes.
this required a switch to change the sync strategy based on rebar parameter.
However tests could be slow on macbook with OTP16 and sync - so timeouts added in unit tests, and ct tests sync_startegy changed to not sync for OTP16.
Obviously got totally messed up and confused when testing previous
commits.
Multiple tests were failing for a change which got merged in as the
tests were not reflecting the required API.
Comparing the inbuilt tictac_tree fold, to using "proper" abstraction and achieving the same thing through fold_heads.
The fold_heads method is slower (a lot more manipulation required in the fold) - expect it to require > 2 x CPU.
However, this does give the flexibility to change the hash algorithm. This would allow for a fold over a database of AAE trees (where the hash has been pre-computed using sha) to be compared with a fold over a database of leveled backends.
Also can vary whether the fold_heads checks for presence of the object in the Inker. So normally we can get the speed advantage of not checking the Journal for presence, but periodically we can.
Build the AAE tree equally using fold_heads. This is a pre-cursor to running this within Riak.
In part this leans on some of the work done to improve standard Riak AAE with leveled. When rebuilding the standard AAE store only the head is required, and so this process was switched in riak_kv_sweeper to make a fold_heads request if supported by the backend.
The head response is a proxy object, which when loaded into a riak_object will allow for access to object metadata, but will use the passed function if access to object contents is requested.
Need to know {Bucket, Key} not just Key if all buckets are being covered
by nrt aae. So shoehorning this in - will also allow for proper use of
FilterFun when filtering by partition.
With basic ct test.
Doesn't currently prove expiry of index. Doesn't prove ability to find
segments.
Assumes that either "all" buckets or a special list of buckets require
indexing this way. Will lead to unexpected results if the same bucket
name is used across different Tags.
The format of the index has been chosen so that hopeully standard index
features can be used (e.g. return_terms).
Just some initial WIP code for this. Will revisit this again after
exploring some ideas as to how to reduce the cost of the
get_keys_by_segment.
The overlal idea is that there are trees of recent modifications, with
recent being some rolling time window made up of hourly blocks, and
recency being dtermined by the last-modified date on the object metadata
- which should be conistent across a cluster.
So if we were at 15:30 we would get the tree for 14:00 - 15:00 and the
tree for 15:00-16:00 from two different queries which cover the same
partitions and then compare.
Comparison may find differences, and we know what segment the difference
is in - but how to then find all keys in that segment which have been
modified in the period? Three ways:
Do it inefficeintly and infrequently using a fold_keys and a filter
(perhaps with SST files having a highest LMD in the metadata so that
they can be skipped).
Add a special index, where verye entry has a TTL, and the Key is
{$segment, Segment, Bucket, Key} so that a normal 2i query cna be used.
Align hashing for segments with hashing for penciller lookup so that a
query over the actual keys cna be optimised skipping chunks of the
in-memory part, and chunks of the SST file
Allow tictac tree sizes to be flexible.
Tested lots of different sizes. Having both level 1 and level 2 the
same size seemed to be consistently quicker than trying to make either
of the levels relatively wider.
There's an 8% performance improvement if the SegmentCount is reduced by
a quarter.
Naming is now confusing now we have TicTac Trees. This query builds a
list of keys and hashes not a tree - so it was misleading anyaway. Now
renamed hashlist_query.
Need to allow specific settings to be passed into unit tests.
Also, too much journal compaction may lead to intermittent failures on
the basic_SUITE space_clear_on_delete test. think this is because
there are less “deletes” to reload in on startup to trigger the cascade
down and clear up?
the new code requires bucket listing to be on binary keys not just
binary buckets. As this is only intended for use within Riak (where
all keys are buckets are binaries), this constraint seems OK.
A test needed changing to ensure it had a binary key in the bucket.
This at least checks the file is present, and the Key exists in the
index of that file. If the value is corrupt it will be removed by
compation, and then this will fail (unless the file is never compacted).
TODO: resolve issus of files which are corrupt - but never compacted
- a job for backup?
fold objects which snaps in the fold was implemented incorrectly - it
took information from the LedgeCache at the point of the request, not
at the point of the fold. So the LedgerCache SQN may have been
surpassed in the Penciller by the time the fold was called.