When there is write pressure on the penciller and it returns to the
bookie, the bookie will now punish the next PUT (and itself) with a
pause. The longer the back-pressure state has been in place, the more
frequent the pauses
Previously under heavy load, as long as L0 was being cleared, the ledger
woud keep accapting. Now there is a formla limit on how far behind the
work queue (of compactions required at other levels) before the break is
applied on new updates coming in).
Broekn by change to get response on L0 completion, SFT was informing
penciller of the filename passed in (without extension), not the
completed one with the extension.
There were issues with how the Penciller behaves under ehavy write
pressure - most particularly where there are a large number of keys per
update (i.e. 2i heavy objects). Most immediately the attempt to chekc
whether the l0 file was ready slowed down the process of producing the
L0 file - so back-pressure created more back-pressure.
Going forward want to alter this most significantly as also the work
queue can build up unsustainably. there needs to be some pausing
prompted by the bookie on 'returned', and the use of 'returend when the
work queue exceeds a threshold.
Added a test of journal compaction with a registered snapshot and it
showed that the deleting of files did not correctly check the list of
registerd snapshots. Corrected.
This exposed a potential issue with not opening readers in binary_mode -
so now defaults to binary mode. Will add test using object filder to
confirm values remain readable in rolled journals after
shutdown/startup.
Race condition presvented in test - but still not handled nicely.
Perhaps need to consider making it a FSM and handling close differently
when L0 pending - i.e. don't close immediately, but set a timeout to
close on if we don't get the last fetch_levelzero
Move legacy CDB code used only in unit tests into test area. Fix column
width in pmem and comment out the unused case statement (in healthy
tests) from the penciller test code
Busted the hashtable in a Journal file, and demonstrated it can be fixed
by changing the extension name (no need to recover from backup if only
the hashtable is bust)
Changes the stratup otpions to a prolist to make it easier to get
environment variables as default.
Tried application:start - and completely baffled as to how to get this
to work.