When testing with large numbers of 2i terms (and hence more Riak
metadata), there is a surge in slow response times whenever multiple
merge events run concurrently.
This could be very short-term CPU starvation caused by the merge
process. Perhaps it is delays waiting for the scan to complete -
a smaller scanwidth may mean more interleaving and less latency?
A mistake meant that resetting to lookup on a skipped key would cause
issues if the skipped key occurred under a no_lookup slot after the
?SLOT_SIZE had been reached. This caused the slot to switch to lookup,
but beyond the maximum size.
The manifest and the logs are bloated by including the full file path
for every filename, even though the root path is constant. Storing full
paths could also cause issues if the mount point is ever changed.
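The alternative can be sketched as follows. This is an illustrative Python sketch, not the store's actual Erlang code; the root path and file layout shown are assumptions. The manifest keeps only the path relative to the root, and the root is re-attached when a file is opened, so the manifest shrinks and a changed mount point needs only the root updating.

```python
# Illustrative sketch only - the real store is written in Erlang.
# ROOT_PATH and the file layout below are assumed for the example.
import os

ROOT_PATH = "/var/db/store"  # assumed constant root for this sketch

def manifest_entry(full_path: str) -> str:
    # Store only the root-relative path in the manifest.
    return os.path.relpath(full_path, ROOT_PATH)

def open_path(entry: str) -> str:
    # Re-attach the (possibly remounted) root when the file is opened.
    return os.path.join(ROOT_PATH, entry)

full = os.path.join(ROOT_PATH, "ledger", "file9.sst")
entry = manifest_entry(full)       # "ledger/file9.sst" - much shorter
assert open_path(entry) == full    # round-trips back to the full path
```

If the mount point changes, only ROOT_PATH needs updating; every manifest entry remains valid.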
This had been removed due to the CPU cost of adding it - however, the
tinybloom was then implemented by directly manipulating bits through
binary comprehension, rather than by applying bor, band, bsl and bsr
operations. With these operations the cost of producing and checking
the bloom is <10% by comparison.
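To illustrate the bitwise style of bloom construction being discussed, here is a minimal sketch in Python rather than Erlang; the filter size, hash scheme, and function names are assumptions for the example, not the tinybloom's actual design. Setting and testing bits uses shift/or/and, the counterparts of bsl, bor and band.

```python
# Minimal bloom-filter sketch (not the project's tinybloom): bits are
# set and tested with shift-left, bitwise-or and bitwise-and - the
# Python equivalents of Erlang's bsl, bor and band.
import hashlib

BLOOM_BITS = 512  # assumed filter size for this sketch

def _positions(key: bytes, k: int = 3):
    # Derive k bit positions from a hash of the key.
    digest = hashlib.sha256(key).digest()
    for i in range(k):
        chunk = int.from_bytes(digest[i * 4:(i + 1) * 4], "big")
        yield chunk % BLOOM_BITS

def add(bloom: int, key: bytes) -> int:
    # Set each position's bit: bloom bor (1 bsl pos), in Erlang terms.
    for pos in _positions(key):
        bloom |= 1 << pos
    return bloom

def maybe_contains(bloom: int, key: bytes) -> bool:
    # The key may be present only if every one of its bits is set.
    return all(bloom & (1 << pos) for pos in _positions(key))

bloom = 0
for k in (b"K1", b"K2", b"K3"):
    bloom = add(bloom, k)

assert maybe_contains(bloom, b"K2")  # no false negatives
```

A bloom filter never reports an added key as absent; absent keys are reported present only with a small false-positive probability.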
This refactors the code, with the implementation of the manifest
isolated into a separate module, and a pure async relationship
between the penciller and its clerk. However, the manifest is just a
simple list at each level.
Remove the ets index in pmem and use a binary index instead. This may
be slower, but it avoids the bulk upload to ets, and means that matches
know their position (so only skiplists with a match need be tried).
It also removes the discrepancy between snapshots and non-snapshots -
previously the snapshots were always slower because they had no access
to the ETS table.
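The positional idea can be sketched as follows. This is a hedged illustration in Python, not the pmem module's actual Erlang structure; the hash, the bitmap layout, and the use of dicts to stand in for skiplists are all assumptions. The index records, per key hash, a bitmap of which in-memory trees may hold that key, so a lookup probes only trees with a matching bit.

```python
# Illustrative sketch only: a per-hash position bitmap standing in for
# pmem's binary index. Plain dicts stand in for the skiplists.

def tiny_hash(key: bytes) -> int:
    # Toy one-byte hash, assumed for this sketch.
    return sum(key) % 256

class PositionIndex:
    def __init__(self):
        self.trees = []         # the in-memory "skiplists"
        self.index = [0] * 256  # per-hash bitmap of tree positions

    def add_tree(self, tree: dict):
        pos = len(self.trees)
        self.trees.append(tree)
        for key in tree:
            # Mark that a key with this hash lives at position `pos`.
            self.index[tiny_hash(key)] |= 1 << pos

    def get(self, key: bytes):
        # Probe only trees whose position bit is set for this hash.
        candidates = self.index[tiny_hash(key)]
        for pos, tree in enumerate(self.trees):
            if candidates & (1 << pos) and key in tree:
                return tree[key]
        return None

idx = PositionIndex()
idx.add_tree({b"a": 1, b"b": 2})
idx.add_tree({b"c": 3})
assert idx.get(b"c") == 3       # only the second tree is probed
assert idx.get(b"z") is None    # no bit set: no tree is probed at all
```

Because a miss in the index means no tree need be searched at all, most lookups for absent keys return without touching any tree.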
It appears to have some benefit at lower levels, but overall has less
benefit at higher levels. Probably not worth having unless it can be
controlled to go in at the basement only.
Allow the bloom size to vary depending on how many fetchable keys there
are - so there is no large bloom held if, for example, most of the keys
are index entries.
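A sizing rule of this shape might look as follows; the bits-per-key ratio and the floor are invented for the example and are not the store's actual parameters.

```python
# Hypothetical sizing rule for illustration only: roughly 8 bits per
# fetchable key with a small floor, so a slot holding mostly index
# entries (never fetched by key) carries only a tiny bloom.
def bloom_bits(fetchable_keys: int, bits_per_key: int = 8,
               floor: int = 64) -> int:
    return max(floor, fetchable_keys * bits_per_key)

assert bloom_bits(0) == 64      # all index entries: minimal bloom
assert bloom_bits(128) == 1024  # 128 fetchable keys: 1 Kib of bloom
```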
Prior to this refactor, the slot had been made up of four blocks with
an external binary index. Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
Two issues - when the key range falls in between two marks in the
summary, no mark was being picked up. Then, when trimming both right
and left, the left trim was being discarded.
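The first issue can be illustrated with a generic sparse-index sketch; the names and the use of `bisect` are assumptions, not the summary's actual code. Each mark is the first key of a slot, so a range whose start falls between two marks must fall back to the mark at or before the start - otherwise no slot is selected at all.

```python
# Generic sparse-summary sketch (assumed names, not the actual code):
# marks hold the first key of each slot, in order.
import bisect

marks = [b"a", b"g", b"m", b"t"]  # first key of each slot

def slots_for_range(start: bytes, end: bytes):
    # Left edge: fall back to the slot whose mark is at or before
    # `start` - this is the case the bug missed when `start` fell
    # between two marks.
    lo = max(bisect.bisect_right(marks, start) - 1, 0)
    # Right edge: the last slot whose mark is <= `end`.
    hi = bisect.bisect_right(marks, end) - 1
    return list(range(lo, hi + 1))

# A range starting between the b"g" and b"m" marks must still include
# the slot that begins at b"g", since keys b"h".. live in that slot.
assert slots_for_range(b"h", b"p") == [1, 2]
```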