The change to 5 blocks is intended to make the blocks in lookup slots
fractionally smaller, but more importantly to introduce a middle block
that can be opened in a binary-split fashion to reduce the number
of blocks that need to be opened for range queries. The worst case for
full slots is now 3 blocks, not 4.
It is still not clear if yielding is the cause of the memory problems,
but taking it away universally has impacted throughput. At the very
least we should continue to yield on high-contention files (those at
higher levels), where the processes are more likely to be quickly
terminated anyway, allowing GC to be invoked.
There was complicated and confusing code that achieved nothing for
efficiency when trimming slots. The expensive part (binary_to_term) was
still needed on every block, and it was hard to get code coverage and
make sense of what it was really trying to achieve.
This is now much simpler - and may set us up for potential further
indexing help.
RTrim only worked in the special case of an exact key match, which would
never occur in a real-world range query. RTrim should really check for
the key being passed.
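
The difference between trimming on a key match and trimming on a key
being passed can be sketched as follows. This is an illustrative Python
sketch, not the SST code: `rtrim_on_match` and `rtrim_on_pass` are
hypothetical names, and keys are assumed to be sorted and comparable.

```python
def rtrim_on_match(sorted_keys, end_key):
    """Trim only when end_key itself is present (the special case)."""
    if end_key in sorted_keys:
        return sorted_keys[:sorted_keys.index(end_key) + 1]
    return sorted_keys  # no exact match, so nothing is trimmed


def rtrim_on_pass(sorted_keys, end_key):
    """Trim at the first key that has passed end_key."""
    for i, key in enumerate(sorted_keys):
        if key > end_key:
            return sorted_keys[:i]
    return sorted_keys  # end_key is beyond the last key


keys = ["k01", "k03", "k05", "k07"]
print(rtrim_on_match(keys, "k04"))  # end key absent: nothing trimmed
print(rtrim_on_pass(keys, "k04"))   # trimmed once k05 passes k04
```

In a range query the end key is rarely a stored key, so only the
second form trims in the usual case.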
Returning empty list should not be possible - unless the query is
outside of the range entirely (and such a query should never go to this
SST).
No evidence from the volume test that the scan width has made a positive
difference - so reverting, but not fully as slots may now be twice as
big, so sticking to half the previous value.
When testing with large numbers of 2i terms (and hence more Riak
Metadata), there is a surge in slow response times when there are
multiple concurrent merge events.
This could be very short-term CPU starvation because of the merge
process. Perhaps it is delays waiting for the scan to complete -
a smaller scanwidth may mean more interleaving and less latency?
A mistake meant resetting to lookup on a skipped key would cause issues
if the skipped key occurred under a no_lookup slot after the ?SLOT_SIZE
had been reached. This caused the slot to switch to lookup, but beyond
the maximum size.
The manifest and the logs are bloated by having the full file path for
every filename in there - given the root path is constant.
Could also cause issues if the mount point is ever changed.
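
A minimal sketch of the idea, in Python for illustration only: keep the
name relative to the root path in the manifest and rebuild the full
path on use. The function names and the example paths are hypothetical,
and POSIX-style paths are assumed.

```python
import posixpath


def to_manifest_name(root_path, full_path):
    """Return the filename relative to the (constant) root path."""
    return posixpath.relpath(full_path, root_path)


def to_full_path(root_path, manifest_name):
    """Rebuild the full path from the current root path."""
    return posixpath.join(root_path, manifest_name)


# Hypothetical paths, used only to show the round trip.
name = to_manifest_name("/data/ledger", "/data/ledger/ledger_files/0_1.sst")
print(name)
print(to_full_path("/mnt/new/ledger", name))
```

Because only the relative name is stored, a change of mount point is
handled by changing the root path alone.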
This had been removed due to the CPU cost of adding it. However, at
that time the tinybloom was implemented by directly manipulating bits
through binary comprehensions, rather than by applying bor, band, bsl
and bsr operations.
With these operations the cost of producing and checking the bloom is
<10% by comparison.
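
As a rough illustration of building and checking a bloom with bitwise
operations alone, here is a Python sketch. It is not the tinybloom
implementation: the bloom size, the use of CRC32 as the hash and the
choice of two bit positions per key are all assumptions. Python's `|`,
`&`, `<<` and `>>` correspond to Erlang's bor, band, bsl and bsr.

```python
import zlib

BLOOM_BITS = 1024  # assumed size; must be a power of two for the mask


def _positions(key):
    """Derive two bit positions from one 32-bit hash of the key."""
    h = zlib.crc32(key)
    p1 = h & (BLOOM_BITS - 1)           # band: low bits of the hash
    p2 = (h >> 16) & (BLOOM_BITS - 1)   # bsr, then band
    return p1, p2


def bloom_add(bloom, key):
    """Set the key's bits in the bloom (held as a single integer)."""
    for pos in _positions(key):
        bloom |= 1 << pos               # bsl, then bor
    return bloom


def bloom_check(bloom, key):
    """False means definitely absent; True means possibly present."""
    return all((bloom >> pos) & 1 for pos in _positions(key))


bloom = 0
for k in (b"key1", b"key2"):
    bloom = bloom_add(bloom, k)
print(bloom_check(bloom, b"key1"))
```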
This has refactored the code, with the implementation of the manifest
isolated into a separate module, and a pure async relationship
between the penciller and its clerk. However, the manifest is just a
simple list at each level.
Remove the ets index in pmem and use a binary index instead. This may
be slower, but avoids the bulk upload to ets, and means that matches
know the position (so only skiplists with a match need be tried).
Also stops the discrepancy between snapshots and non-snapshots - as
previously the snapshots were always slowed by not having access to the
ETS table.
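
The position-aware index can be sketched roughly as below. This is a
Python illustration under stated assumptions, not the pmem code: the
entry layout (4-byte hash plus 1-byte position), the CRC32 hash, the
function names and the use of dicts to stand in for skiplists are all
hypothetical.

```python
import zlib

ENTRY_SIZE = 5  # assumed layout: 4-byte hash followed by 1-byte position


def add_to_index(index, keys, position):
    """Append a (hash, position) entry to the binary index for each key."""
    for key in keys:
        index += zlib.crc32(key).to_bytes(4, "big") + position.to_bytes(1, "big")
    return index


def positions_for(index, key):
    """Return the positions whose entries match the key's hash."""
    target = zlib.crc32(key).to_bytes(4, "big")
    found = set()
    for i in range(0, len(index), ENTRY_SIZE):
        if index[i:i + 4] == target:
            found.add(index[i + 4])
    return found


def lookup(index, skiplists, key):
    """Try only the skiplists the index points at, newest first."""
    for pos in sorted(positions_for(index, key), reverse=True):
        if key in skiplists[pos]:  # guards against hash collisions
            return skiplists[pos][key]
    return None


# Dicts stand in for skiplists; a higher position means newer.
skiplists = [{b"a": 1, b"b": 2}, {b"b": 3}]
index = b""
for pos, sl in enumerate(skiplists):
    index = add_to_index(index, sl, pos)
print(lookup(index, skiplists, b"b"))
```

As the index is a plain binary rather than an ETS table, a snapshot
could be handed the same value and take the same lookup path.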