Add extra bloom check - but get the SFT process to perform the check,
not the Penciller. This avoids the complexity of negotiating the
transfer of the bloom to the Penciller - but doesn't avoid the
potentially unnecessary message pass between processes.
This is desirable to add back in going forward, but wasn't implemented
in a safe or clear way.
The way the bloom was or was not on the LoopState was clumsy, and it got
persisted in multiple places without a CRC check.
The intention is to add it back in, whereby it is requested on demand by
the Penciller, and the SFT worker then lifts it off disk and CRC checks
it, so it is never on the SFT LoopState. It will also be easier to
control the logic over which levels have the bloom in the Penciller.
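The on-demand, CRC-checked load described above could look roughly like
the following. This is a minimal Python sketch rather than the project's
Erlang code; the on-disk layout (a 4-byte CRC32 prefix) and the names
pack_bloom and load_bloom are assumptions made for illustration.

```python
import struct
import zlib


def pack_bloom(bloom: bytes) -> bytes:
    """Prefix the serialised bloom with a 4-byte big-endian CRC32."""
    return struct.pack(">I", zlib.crc32(bloom)) + bloom


def load_bloom(stored: bytes):
    """Return the bloom bytes if the CRC matches, otherwise None."""
    if len(stored) < 4:
        return None  # too short to even hold the checksum
    (expected,) = struct.unpack(">I", stored[:4])
    bloom = stored[4:]
    if zlib.crc32(bloom) != expected:
        return None  # corrupt copy: the caller must do without the bloom
    return bloom
```

Returning None on a mismatch lets the requester simply skip the bloom
and fall back to asking the file directly, so a bad copy costs a
message pass but never a wrong answer.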
Move to using the DJ Bernstein Magic Hash consistently, and try to
make sure we only hash once for each operation (as the hash is more
expensive than phash2).
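As a rough illustration of the hash-once idea above, here is a minimal
Python sketch (not the project's Erlang code). It assumes the
multiply-by-33-and-XOR form of Bernstein's hash truncated to 32 bits;
the exact variant, the key serialisation, and the names magic_hash and
lookup are assumptions made for the example.

```python
def magic_hash(key: bytes) -> int:
    """Bernstein-style hash: start at 5381, multiply by 33, XOR each byte."""
    h = 5381
    for byte in key:
        h = ((h * 33) ^ byte) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


def lookup(key: bytes, levels):
    """Hash the key once, then reuse that hash for the check at every level.

    `levels` is a hypothetical list of (hash_set, store) pairs, where the
    set of hashes stands in for a per-level bloom or index.
    """
    h = magic_hash(key)  # computed once per operation
    for hashes, store in levels:
        if h in hashes:  # cheap membership check before the real lookup
            if key in store:
                return store[key]
    return None
```

Passing the precomputed hash down through each level is what keeps the
more expensive hash from being paid more than once per GET.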
The improved lookup time for missing keys should allow for the L0 index
to be removed, and hence speed up the completion time for push_mem
operations.
It is expected there will be a second stage of creating a tinybloom as
part of the SFT creation process, and then adding that tinybloom to the
manifest. This will then reduce the message passing required for a GET
not in the cache or higher levels.
The hope is that this will cause less garbage collection, and will also
be slightly faster.
Note that snapshots no longer get an index - they get the special index
'snap'. However, the SkipLists have bloom protection, and most
snapshots are iterators not fetchers.
The Penciller's attempt to close the L0 file if pending was unpredictable
in behaviour. If an L0 file is still pending it will be lost - but this
is at least a predictable event.
Clean the API of Riak-specific methods, and also resolve a timing issue
in the simple_server unit test. Previously this would end up with
missing data (and a lower sequence number after start) because of the
penciller_clerk timeout being relatively large in the context of this
test. Now that the timeout has been reduced, the L0 slot is cleared by
the time of the close. To make sure, an extra sleep has been added as a
precaution to avoid any intermittent issues.
Previously under heavy load, as long as L0 was being cleared, the ledger
would keep accepting. Now there is a formal limit on how far behind the
work queue (of compactions required at other levels) can fall before the
brake is applied on new updates coming in.
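The limit described above amounts to a simple admission check. The
Python sketch below is illustrative only and is not the project's
Erlang code: the function name accept_push, the 'ok' reply, and the
backlog_limit parameter are assumptions, with 'returned' used as the
reply that tells the caller to hold the batch and retry.

```python
def accept_push(l0_pending: bool, work_queue_length: int,
                backlog_limit: int) -> str:
    """Decide whether a new batch of updates is accepted.

    Returns 'ok' when accepted and 'returned' when the caller should
    hold on to the batch and try again later.
    """
    if l0_pending:
        return "returned"  # the previous L0 file is still being written
    if work_queue_length > backlog_limit:
        return "returned"  # compaction backlog too large: apply the brake
    return "ok"
```

Checking the backlog as well as the L0 state means a fast-clearing L0
can no longer hide a compaction queue that is growing without bound.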
There were issues with how the Penciller behaves under heavy write
pressure - most particularly where there are a large number of keys per
update (i.e. 2i heavy objects). Most immediately the attempt to check
whether the L0 file was ready slowed down the process of producing the
L0 file - so back-pressure created more back-pressure.
Going forward, want to alter this more significantly, as the work
queue can also build up unsustainably. There needs to be some pausing
prompted by the bookie on 'returned', and the use of 'returned' when the
work queue exceeds a threshold.
Race condition prevented in test - but still not handled nicely.
Perhaps need to consider making it an FSM and handling close differently
when L0 pending - i.e. don't close immediately, but set a timeout to
close on if we don't get the last fetch_levelzero.
Move legacy CDB code used only in unit tests into test area. Fix column
width in pmem and comment out the unused case statement (in healthy
tests) from the penciller test code.