Under a 2i load this log appears thousands of times per second. That is
not sustainable as an info log. Will need to think about how to manage
this, but setting it back to debug for now.
When running a load of mainly 2i queries, there is a huge cost in the
previous snapshot code. The time taken to create a clone of the
Penciller (duplicating all the LoopState) varied between 1 and 200ms
depending on the size of the LoopState.
For 2i queries, most of that LoopState was then being thrown away after
running the query against the levelzero_cache - a query which was itself
taking < 1ms on average. It would be better to avoid the O(100)ms of CPU
burning and block for O(1)ms instead - so the order of events has been
changed to filter first, so that only the small part of the LoopState
actually required is copied to the clone.
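Illustratively, the reordering looks something like the sketch below
(the module, record, and function names are hypothetical, not the actual
leveled API): the levelzero cache is filtered against the query range
before the clone's state is built, so only the filtered slice is copied
across to the snapshot.

    -module(pcl_snapshot_sketch).
    -export([filtered_snapshot/3]).

    -record(state, {levelzero_cache = [] :: [{term(), term()}],
                    manifest = [] :: list(),
                    other_heavy_state :: term()}).

    filtered_snapshot(State, StartKey, EndKey) ->
        %% Filter first, while still blocking the Penciller - O(1)ms.
        CacheSlice = [{K, V} || {K, V} <- State#state.levelzero_cache,
                                K >= StartKey, K =< EndKey],
        %% Only the filtered slice is placed in the clone's state, so
        %% the bulk of the LoopState is never duplicated.
        #state{levelzero_cache = CacheSlice,
               manifest = State#state.manifest}.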
Well, random had me foxed. As the clone was a short-lived process, it
only called random once - and so always got the same answer.
random has to be seeded to give different answers when called once from
a process - so this is now seeded in leveled_log.
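The underlying behaviour: the random module keeps its state in the
process dictionary, and an unseeded process starts from the same default
state every time, so the first call in every process returns the same
value. A minimal sketch of the fix (hypothetical function names; on
modern OTP the rand module self-seeds per process and makes this
unnecessary):

    %% Seed once per process, e.g. at process startup, so that a
    %% process which only ever makes one call still gets a different
    %% answer from its siblings.
    seed_random() ->
        {A, B, C} = os:timestamp(),
        random:seed(A, B, C).

    %% Without seed_random/0 having been called first, the first call
    %% to random:uniform/1 returns the same value in every process.
    pick(N) ->
        random:uniform(N).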
Need to add extra logging to understand why pclerk crashes in some
volume tests.
Penciller's Clerk <0.813.0> shutdown now complete for reason:

    {badarg,
     [{dict,fetch,
       [63,
        {dict,0,16,16,8,80,48,
         {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
         {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}],
       [{file,"dict.erl"},{line,126}]},
      {leveled_pclerk,handle_cast,2,
       [{file,"src/leveled_pclerk.erl"},{line,96}]},
      {gen_server,handle_msg,5,
       [{file,"gen_server.erl"},{line,604}]},
      {proc_lib,init_p_do_apply,3,
       [{file,"proc_lib.erl"},{line,239}]}]}
Should only be prompted after prompt deletions had been updated.
Perhaps a race whereby somehow it is prompted again after it is emptied.
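For reference, the reason term above is dict:fetch/2 being applied to a
dict from which everything has already been removed - fetching any key
from an empty dict raises badarg:

    1> dict:fetch(63, dict:new()).
    ** exception error: bad argument
         in function  dict:fetch/2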
This had been removed due to the CPU cost of adding - however the
tinybloom was then implemented by directly manipulating bits through
binary comprehension, rather than applying bor/band/bsl/bsr operations.
With the binary comprehension approach, the cost of producing and
checking the bloom is <10% by comparison.
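The sketch below illustrates the technique (it is not the actual
leveled_tinybloom code - the sizes, hash scheme, and names are made up
for illustration): the bitmap is built in one pass with a bitstring
comprehension, and checked with bit-level pattern matching, with no
explicit bor/bsl word arithmetic.

    -module(tinybloom_sketch).
    -export([create/1, check/2]).

    -define(BLOOM_BITS, 4096).

    %% Build the whole bitmap as a single bitstring, one bit per
    %% position, via a binary comprehension.
    create(Keys) ->
        SetBits = sets:from_list(lists:flatmap(fun hashes/1, Keys)),
        << <<(bit_for(B, SetBits)):1>>
           || B <- lists:seq(0, ?BLOOM_BITS - 1) >>.

    bit_for(B, SetBits) ->
        case sets:is_element(B, SetBits) of
            true -> 1;
            false -> 0
        end.

    %% Check by pattern matching the bit at each hash position.
    check(Key, Bloom) ->
        lists:all(fun(B) ->
                      <<_:B/bitstring, Flag:1, _/bitstring>> = Bloom,
                      Flag =:= 1
                  end,
                  hashes(Key)).

    %% Two bit positions per key, derived from a single hash.
    hashes(Key) ->
        H = erlang:phash2(Key, ?BLOOM_BITS * ?BLOOM_BITS),
        [H rem ?BLOOM_BITS, H div ?BLOOM_BITS].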
Going to abandon this branch for now. The change is becoming excessively
time consuming, and it is not clear that a smaller change might not
achieve more of the objectives.
All this is broken - but perhaps it could get picked up another day.
The manifest had previously been a list for every level, and keys were
found by folding over the list. By Level 4 the list will be 4096 items
long, so the fold would be expensive, and would be required many times.
To make this less expensive, an ETS table is to be used. However, the
ETS table needs to be shared between snapshots, and so the entries in
the table need to support multi-versioning - whereby each clone can see
a version of the table at the Manifest SQN the clone is supporting.
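A minimal sketch of what such multi-versioned entries could look like
(the module, key shapes, and function names here are invented for
illustration, not the actual leveled manifest code): each entry carries
the Manifest SQN range for which it is valid, entries are stamped rather
than deleted, and a clone pinned at a given SQN only selects entries
whose range covers that SQN.

    -module(manifest_mvcc_sketch).
    -export([new/0, insert/4, delete/4, entries_at/2]).

    %% Table rows are {{Level, FileName}, AddSQN, RemoveSQN}.
    new() ->
        ets:new(?MODULE, [set, public]).

    %% A file enters the manifest at AddSQN and is current (infinity).
    insert(Table, Level, FileName, AddSQN) ->
        ets:insert(Table, {{Level, FileName}, AddSQN, infinity}).

    %% Files are not removed eagerly - they are stamped with the SQN
    %% at which they left the manifest, so clones at older SQNs can
    %% still see them.
    delete(Table, Level, FileName, RemoveSQN) ->
        case ets:lookup(Table, {Level, FileName}) of
            [{Key, AddSQN, infinity}] ->
                ets:insert(Table, {Key, AddSQN, RemoveSQN});
            _ ->
                ok
        end.

    %% The manifest as it stood at SnapSQN: every entry whose
    %% [AddSQN, RemoveSQN) interval covers the snapshot's SQN.
    entries_at(Table, SnapSQN) ->
        ets:select(Table,
                   [{{'$1', '$2', '$3'},
                     [{'=<', '$2', SnapSQN}, {'<', SnapSQN, '$3'}],
                     ['$1']}]).

A real implementation would also need to handle a file re-entering the
same level (the set key here would clobber the old version) and to
garbage collect entries once no clone remains below their RemoveSQN.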
Prior to this refactor, the slot had been made up of four blocks with
an external binary index. Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
The Skiplist range code was needlessly complicated. It may be faster
than the new code, but the complexity delta cannot be supported for such
a small change.
This was uncovered whilst troubleshooting the initial kv range test.
When rolling we already know the last_key - no need to seek for it on
startup.
The time this seek takes needs to be considered with regard to startup
time. Can we do without knowing the last_key?
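A sketch of the shortcut (hypothetical records and function names, not
the actual leveled file API): because the writer tracks the last key as
it appends, the roll can hand that key straight to the read-only state,
and no startup seek is needed for the rolled file.

    -module(roll_sketch).
    -export([append/3, roll/1]).

    -record(writer, {handle :: file:io_device(), last_key :: term()}).
    -record(reader, {handle :: file:io_device(), last_key :: term()}).

    %% The writer learns the last key as a side effect of appending...
    append(Writer = #writer{handle = Handle}, Key, Bin) ->
        ok = file:write(Handle, Bin),
        Writer#writer{last_key = Key}.

    %% ...so on roll it is carried across to the reader state, rather
    %% than the reader seeking through the file to rediscover it.
    roll(#writer{handle = Handle, last_key = LastKey}) ->
        ok = file:sync(Handle),
        #reader{handle = Handle, last_key = LastKey}.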