This is desirable to add back in going forward, but wasn't implemented
in a safe or clear way.
The way the bloom was (or was not) held on the LoopState was clumsy, and it
was persisted in multiple places without a CRC check.
The intention is to implement it again whereby it is requested on demand by
the Penciller, and the SFT worker then lifts it off disk and CRC checks it.
That way it is never on the SFT LoopState, and it will also be easier to
control in the Penciller the logic over which levels have the bloom.
It is expensive on the CPU - but it leads to a 4x increase in the cache
coverage.
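
As a rough illustration of the intended on-demand fetch, the sketch below
reads a serialised bloom back off disk and validates a leading CRC before
returning it. The module name, function names and the <<CRC:32, Bloom/binary>>
layout are illustrative assumptions, not the actual SFT file format.

    %% Sketch only: the bloom is assumed to be stored as <<CRC:32, BloomBin>>
    %% at a known position in the SFT file, and is never kept on the LoopState.
    -module(bloom_fetch_sketch).
    -export([read_bloom/3]).

    %% Read the bloom from the open SFT file Handle, check the CRC, and return
    %% the bloom binary (or an error, in which case the caller falls back to a
    %% full fetch).
    read_bloom(Handle, BloomPosition, BloomLength) ->
        case file:pread(Handle, BloomPosition, BloomLength) of
            {ok, <<CRC:32/integer, BloomBin/binary>>} ->
                case erlang:crc32(BloomBin) of
                    CRC -> {ok, BloomBin};
                    _ -> {error, crc_wonky}
                end;
            eof ->
                {error, eof}
        end.
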
Try to make some small micro gains in list handling in create_block.
Move to using the DJ Bernstein Magic Hash consistently, and try to make sure
we only hash once for each operation (as the hash is more expensive than
phash2); a sketch of the hash is given below.
The improved lookup time for missing keys should allow for the L0 index
to be removed, and hence speed up the completion time for push_mem
operations.
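
For reference, here is a minimal sketch of a Bernstein-style (djb2) hash over
a binary key. The module is purely illustrative, and the exact variant and
constants used by the magic hash may differ.

    %% Illustrative djb2-style hash: H' = H * 33 + Byte, folded over the key
    %% and kept within 32 bits.  Hash once, then pass the result around.
    -module(magic_hash_sketch).
    -export([hash/1]).

    hash(Key) when is_binary(Key) ->
        hash(Key, 5381).

    hash(<<>>, H) ->
        H;
    hash(<<B:8/integer, Rest/binary>>, H) ->
        hash(Rest, ((H bsl 5) + H + B) band 16#FFFFFFFF).
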
It is expected there will be a second stage of creating a tinybloom as
part of the SFT creation process, and then adding that tinybloom to the
manifest. This will then reduce the message passing required for a GET
not found in the cache or higher levels.
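
A hedged sketch of such a tinybloom is below: a small fixed-width bitmap
(64 bits here, held as an integer) built from the already-computed key hashes
at SFT creation time, and small enough to sit in the manifest so a GET can
rule out whole files without a message to the SFT process. The width and the
two-bits-per-hash scheme are illustrative assumptions, not the actual design.

    %% Sketch only: sizes and bit selection are assumptions.
    -module(tinybloom_sketch).
    -export([create/1, check/2]).

    -define(WIDTH, 64).

    %% Build the bloom from a list of (already computed) key hashes.
    create(HashList) ->
        lists:foldl(fun add_hash/2, 0, HashList).

    %% false means the key is definitely not in the file; true means maybe.
    check(Hash, Bloom) ->
        {Bit0, Bit1} = two_bits(Hash),
        (Bloom band Bit0 =/= 0) andalso (Bloom band Bit1 =/= 0).

    add_hash(Hash, Bloom) ->
        {Bit0, Bit1} = two_bits(Hash),
        Bloom bor Bit0 bor Bit1.

    %% Take two bit positions from different parts of the hash.
    two_bits(Hash) ->
        Slot0 = Hash rem ?WIDTH,
        Slot1 = (Hash div ?WIDTH) rem ?WIDTH,
        {1 bsl Slot0, 1 bsl Slot1}.
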
The hope is that this will cause less garbage collection, and will also be
slightly faster.
Note that snapshots no longer get an index - they get the special index
'snap'. However, the SkipLists have bloom protection, and most snapshots are
iterators, not fetchers.
Split out the hashtree implementation functions in leveled_cdb to make it
easier to swap this out. Currently an array of skiplists is used - it may be
better with an ets ordered_set (see the sketch below).
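
A speculative sketch of what the swapped-in index might look like with an ets
ordered_set follows. The function names and the {Index, Hash, Position} key
shape are assumptions made for illustration, not the leveled_cdb API.

    %% Sketch only: one ordered_set table holds every {Index, Hash, Position}
    %% triple, where Index is the low byte of the hash (the cdb table number).
    -module(cdb_hashindex_sketch).
    -export([new/0, add_position/3, positions_for_index/2]).

    new() ->
        ets:new(?MODULE, [ordered_set, private]).

    %% Record a key hash against the file position of its entry.
    add_position(Table, Hash, Position) ->
        Index = Hash band 255,
        true = ets:insert(Table, {{Index, Hash, Position}}),
        ok.

    %% Pull back one table's entries (in key order) when writing the hash
    %% tables out at the end of the file.
    positions_for_index(Table, Index) ->
        [{Hash, Position} ||
            {{_I, Hash, Position}} <- ets:match_object(Table, {{Index, '_', '_'}})].
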
Change the extract of Riak metadata. In Riak-based volume tests the writing
of SFT files is tanking - could this be the "extra" metadata? There are only
current plans to look at the vclock, and the sibling count is free to fetch.
If we get just these two items, will it take less CPU to extract the
metadata, and will the reduced weight also reduce the downstream impact?
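
The sketch below pulls only the vclock binary and the sibling count straight
off the object binary, without decoding the rest. The layout (magic byte 53,
version 1, vclock length, vclock, sibling count) is assumed from the
riak_object v1 binary format and should be checked against riak_object before
being relied upon.

    %% Sketch only: assumes the riak_object v1 binary layout.
    -module(riak_md_sketch).
    -export([vclock_and_sibcount/1]).

    %% Return the (still term_to_binary encoded) vclock and the sibling count,
    %% leaving the sibling contents untouched.
    vclock_and_sibcount(<<53:8/integer, 1:8/integer,
                          VclockLen:32/integer, VclockBin:VclockLen/binary,
                          SibCount:32/integer, _SibsBin/binary>>) ->
        {VclockBin, SibCount}.
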
Read and write times were increasing as the size of the skiplist got larger
(e.g. over 2000 entries). Tried instead with a smaller skip width and a
two-tier skiplist, and this gave much more regular performance as the size of
the skiplist went up.
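
This is only a sketch of the two-tier technique, assuming a sorted list of
{Key, Value} pairs split into small segments keyed by their highest key, with
a second tier indexing groups of segments; the skip width and function names
are illustrative, not leveled_skiplist's actual values.

    %% Sketch only: two tiers of "skip" lists built over a sorted KV list.
    -module(skiplist2_sketch).
    -export([from_sorted_list/1, lookup/2]).

    -define(SKIP_WIDTH, 16).

    from_sorted_list(SortedKVL) ->
        Bottom = segment(SortedKVL),    % [{HighKey, [{K, V}]}]
        segment(Bottom).                % [{HighKey, [{HighKey, [{K, V}]}]}]

    lookup(Key, Top) ->
        case first_ge(Key, Top) of
            not_present ->
                not_present;
            Bottom ->
                case first_ge(Key, Bottom) of
                    not_present ->
                        not_present;
                    Segment ->
                        case lists:keyfind(Key, 1, Segment) of
                            {Key, Value} -> {value, Value};
                            false -> not_present
                        end
                end
        end.

    %% Break a sorted list into sublists of ?SKIP_WIDTH elements, each tagged
    %% with its highest key so a scan can skip whole sublists.
    segment([]) ->
        [];
    segment(L) ->
        {Seg, Rest} = split_at(?SKIP_WIDTH, L),
        {HighKey, _} = lists:last(Seg),
        [{HighKey, Seg} | segment(Rest)].

    split_at(N, L) when length(L) =< N -> {L, []};
    split_at(N, L) -> lists:split(N, L).

    %% Return the first tagged sublist whose highest key is >= Key.
    first_ge(_Key, []) -> not_present;
    first_ge(Key, [{HighKey, Members} | _Rest]) when Key =< HighKey -> Members;
    first_ge(Key, [_ | Rest]) -> first_ge(Key, Rest).
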