Increasing the slot size has a significant impact on fetch time,
although it allows for more constant write time, e.g. doubling the
size means a 5x cost on read, against only a 10% increase at write time.
When rolling we already know the last_key - no need to seek for it on
startup.
The time this seek takes needs to be considered with regard to
startup time. Can we do without knowing last_key?
The full Riak metadata had been stripped from the Ledger update for
performance reasons. However, the full metadata is required in order to
save a GET before a PUT. Therefore we want to test this change in
isolation to establish its cost relative to that saving.
Performance has regressed following the hashtable change. The
speculation is that the hashtable format might not be right, and so
there is more cycling around the hashtree. Logging added.
Improved performance by a combination of switching to an ordered_set
(so a list can be extracted in a sane way), and building the binary
from an ordered list.
The attempt to refactor the writer meant that files were never reaching
the max slots - and so we were only ever stopping when the lists were
exhausted. This meant that the merge tree just had a C0 and a C1 file!
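
As an illustration of the intended behaviour, here is a minimal sketch of a
merge writer that rolls to a new file once the slot limit is hit, and does
not wait for the input lists to be exhausted. It is written in Python
for brevity and is not the project's code: `write_files`, `SLOT_SIZE` and
`MAX_SLOTS` are hypothetical names and values, and duplicate keys are not
reconciled.

```python
from heapq import merge

SLOT_SIZE = 4   # keys per slot (assumed value, for illustration only)
MAX_SLOTS = 2   # slots per file (assumed value, for illustration only)


def write_files(list_a, list_b, slot_size=SLOT_SIZE, max_slots=MAX_SLOTS):
    """Merge two sorted key lists into files of at most max_slots slots.

    Each file is modelled as a list of slots, each slot as a list of keys.
    """
    files, slots, slot = [], [], []
    for key in merge(list_a, list_b):
        slot.append(key)
        if len(slot) == slot_size:
            # Slot is full: close it and start a new one.
            slots.append(slot)
            slot = []
            if len(slots) == max_slots:
                # File is full: stop writing to it and roll to a new file,
                # even though the input lists are not yet exhausted.
                files.append(slots)
                slots = []
    # Flush whatever is left once both lists are exhausted.
    if slot:
        slots.append(slot)
    if slots:
        files.append(slots)
    return files
```

If the `len(slots) == max_slots` check is never reached, everything ends up
in a single output file per merge, which matches the symptom described above.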
Add extra bloom check - but get the SFT process to perform the check,
not the Penciller. This avoids the complexity of negotiating the transfer
of the bloom to the Penciller - but doesn't avoid the potentially
unnecessary message pass between processes.
Previously the code had involved very high arity functions which were
hard to follow. This has been simplified somewhat with the addition of
a writer record to make things easier to track, as well as a general
refactoring to better logically separate the building of things.
This is desirable to add back in going forward, but wasn't implemented
in a safe or clear way.
The way the bloom was or was not on the LoopState was clumsy, and it got
persisted in multiple places without a CRC check.
The intention is to implement it back in, whereby it is requested
on-demand by the Penciller, and then the SFT worker lifts it off disk
and CRC checks it.
So it is never on the SFT LoopState. Also it will be easier to control
the logic over which levels have the bloom in the Penciller.
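
A rough sketch of the "lift off disk and CRC check" step, assuming the
bloom is persisted as a 4-byte CRC32 followed by the bloom bits. This is
Python for illustration only, not the project's implementation: the
on-disk layout, the bloom size, the hash scheme and all function names
are assumptions.

```python
import hashlib
import struct
import zlib

BLOOM_BITS = 1024   # assumed bloom size in bits
HASHES = 3          # assumed number of hash positions per key


def _positions(key: bytes):
    """Derive HASHES bit positions for a key from a SHA-256 digest."""
    digest = hashlib.sha256(key).digest()
    return [
        int.from_bytes(digest[i * 4:(i + 1) * 4], "big") % BLOOM_BITS
        for i in range(HASHES)
    ]


def build_bloom(keys) -> bytes:
    """Build the bloom bits for an iterable of byte-string keys."""
    bits = bytearray(BLOOM_BITS // 8)
    for key in keys:
        for pos in _positions(key):
            bits[pos // 8] |= 1 << (pos % 8)
    return bytes(bits)


def encode_bloom(bloom: bytes) -> bytes:
    """Prefix the bloom with its CRC32 before it is written to disk."""
    return struct.pack(">I", zlib.crc32(bloom)) + bloom


def decode_bloom(blob: bytes) -> bytes:
    """Check the CRC32 prefix and return the bloom, or raise if corrupt."""
    (stored_crc,) = struct.unpack(">I", blob[:4])
    bloom = blob[4:]
    if zlib.crc32(bloom) != stored_crc:
        raise ValueError("bloom CRC mismatch")
    return bloom


def maybe_present(bloom: bytes, key: bytes) -> bool:
    """False means definitely absent; True means the file must be checked."""
    return all(bloom[pos // 8] & (1 << (pos % 8)) for pos in _positions(key))
```

In this sketch the file worker would call `decode_bloom` only when asked,
and hand the verified bytes to the requester, so the bloom never needs to
live in the worker's own state.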