* refactor leveled_sst from gen_fsm to gen_statem
* format_status/2 takes the State name and State Data
but this callback is deprecated, so it is retained only for backward compatibility
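For illustration, a minimal gen_statem sketch with both callbacks side by side (module, state, and field names here are made up, not the actual leveled_sst code): format_status/1 is the current OTP 25+ form, while the deprecated format_status/2 receives the state name and state data in its second argument.
```
%% Minimal sketch - illustrative names only, not the real
%% leveled_sst module. format_status/1 is the OTP 25+ callback;
%% the deprecated format_status/2 is exported alongside it for
%% older OTP releases.
-module(example_statem).
-behaviour(gen_statem).

-export([start_link/0]).
-export([init/1, callback_mode/0, format_status/1, format_status/2]).
-export([reader/3]).

start_link() ->
    %% hibernate_after makes an idle process hibernate automatically
    gen_statem:start_link(?MODULE, [], [{hibernate_after, 8000}]).

init([]) ->
    {ok, reader, #{cache => undefined}}.

callback_mode() ->
    state_functions.

reader(info, _Msg, Data) ->
    {keep_state, Data}.

%% OTP 25+ form: receives and returns a status map
format_status(Status = #{data := Data}) ->
    Status#{data => maps:without([cache], Data)}.

%% Deprecated form: takes the state name and state data in the list
format_status(_Opt, [_PDict, StateName, Data]) ->
    [{data, [{"State", {StateName, maps:without([cache], Data)}}]}].
```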
* refactor leveled_cdb from gen_fsm to gen_statem
* disable a warning suppression that is no longer relevant
* Remove unnecessary code paths
Only support messages, especially info messages, where they are actually possible.
* Mas i1820 offlinedeserialisation cbo (#403)
* Log report GC Info by manifest level
* Hibernate on range query
If the Block Index Cache is not full, and we're not yielding
* Spawn to deserialise blocks offline
The hypothesis is that the heap growth caused by continual binary_to_term calls to deserialise blocks is wasting memory - so do this memory-intensive task in a short-lived process (sketched below, after this entry).
* Start with hibernate_after option
* Always build BIC
Testing indicates that the BIC itself is not a primary memory issue - the primary issue is due to a lack of garbage collection and a growing heap.
This change enhances the offline deserialisation patch so that:
- get_sqn & get_kv are standardised to build the BIC, and hibernate when it is built.
- the offline Pid is linked, to crash this process on failure (as would happen now).
* Standardise spawning for check_block/3
Now deserialise in both parts of the code.
* Only spawn for check_block if cache not full
* Update following review
* Standardise formatting
Make test more reliable. Show no new compaction after third compaction.
* Update comments
---------
Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
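Taken together, the pattern looks roughly like the sketch below (all function, message, and helper names are illustrative, not the actual leveled_sst implementation): a linked short-lived worker does the binary_to_term work, and the state machine adds a hibernate action once the Block Index Cache is complete.
```
%% Hedged sketch - illustrative names only. The heavy deserialisation
%% runs in a short-lived linked process, so its garbage is reclaimed
%% when it exits rather than growing the long-lived heap.
deserialise_offline(SerialisedBlocks) ->
    Owner = self(),
    %% linked, so a worker crash crashes this process too
    %% (the same failure behaviour as before the change)
    _Pid = spawn_link(
        fun() ->
            Deserialised = [binary_to_term(B) || B <- SerialisedBlocks],
            Owner ! {deserialised, Deserialised}
        end),
    receive
        {deserialised, Blocks} -> Blocks
    end.

%% Once the Block Index Cache is full, hibernate to force a
%% full-sweep GC (fetch/2 and cache_full/1 are hypothetical helpers).
reader({call, From}, {get_kv, Key}, Data0) ->
    {Reply, Data} = fetch(Key, Data0),
    case cache_full(Data) of
        true -> {keep_state, Data, [{reply, From, Reply}, hibernate]};
        false -> {keep_state, Data, [{reply, From, Reply}]}
    end.
```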
* Protect penciller from empty ledger cache updates
which may occur when loading the ledger from the journal, after the ledger has been cleared.
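Roughly (helper names are hypothetical, not the actual bookie/penciller code):
```
%% Hedged sketch - an empty ledger cache update becomes a no-op
%% instead of being pushed to the penciller.
maybe_push_to_penciller(Penciller, LedgerCache) ->
    case cache_size(LedgerCache) of                     %% hypothetical
        0 -> ok;                                        %% empty - skip
        _ -> push_to_penciller(Penciller, LedgerCache)  %% hypothetical
    end.
```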
* Score caching and randomisation
The test allkeydelta_journal_multicompact can occasionally fail when a compaction doesn't happen on one loop, but then does on the next. Suspect this is a result of score caching, randomisation of key grabs for scoring, plus jitter on size boundaries.
Modified the test for predictability.
Plus formatting changes
* Avoid small batches
Avoid small batches due to large SQN gaps
* Rationalise tests
Two tests overlapped with the new, much broader, replace_everything/1 test. Ported over any remaining checks of interest and dropped the two tests.
Potentially reduce the overheads of scoring each file on every run.
The change also alters the default thresholds for compaction to favour longer runs (which will tend towards greater storage efficiency).
Initial test included for running with recalc, and also the transition from retain to recalc.
Moves all logic for the startup fold into leveled_bookie - avoiding the Inker requiring any direct knowledge about the implementation of the Penciller.
Change the penciller check so that it returns current/replaced/missing not just true/false.
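An indicative sketch of the three-valued result (the module, type, and function here are illustrative, not the exact leveled spec):
```
%% Illustrative only - the real check calls into the penciller;
%% the point is the three-way result replacing the old boolean.
-module(example_check).
-export([describe/1]).

-type sqn_check() :: current | replaced | missing.

-spec describe(sqn_check()) -> string().
describe(current)  -> "key at this SQN is the active version";
describe(replaced) -> "key exists, but at a different SQN";
describe(missing)  -> "key is not present in the ledger".
```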
Reduce unnecessary penciller checks for non-standard keys that will always be retained - and remove redundant code.
Expand tests of retain and recover to make sure that compaction on delete is well covered.
Also move the SQN along during initial loads - to stop the aggressive loop to find the starting SQN for every file.
Improve the speed of leveled_cdb tests by disabling sync on write.
Strengthen the check of correct behaviour when compacting with a reduced journal size.
Test fails as fetching a repeated object is too slow.
```Head check took 124301 microseconds checking list of length 5000
Head check took 112286 microseconds checking list of length 5000
Head check took 1336512 microseconds checking list of length 5
2018-12-10T11:54:41.342 B0013 <0.2459.0> Long running task took 260788 microseconds with task of type pcl_head
2018-12-10T11:54:41.618 B0013 <0.2459.0> Long running task took 276508 microseconds with task of type pcl_head
2018-12-10T11:54:41.894 B0013 <0.2459.0> Long running task took 275225 microseconds with task of type pcl_head
2018-12-10T11:54:42.173 B0013 <0.2459.0> Long running task took 278836 microseconds with task of type pcl_head
2018-12-10T11:54:42.477 B0013 <0.2459.0> Long running task took 304524 microseconds with task of type pcl_head```
It takes twice as long to check for one repeated object as it does to check for 5K non-repeated objects.
This helps with kv_index_tictactree with the leveled_so backend. Now this can do folds over ranges of keys with modified filters (as folds over ranges of keys must go over all keys if the backend is segment_ordered).
Added a test with back-to-back backups. This caused issues with the empty CDB file it created (on opening, it couldn't cope with finding the last key of an empty file).
So now backup won't roll the active journal if it is empty.
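The resulting guard, as a hedged sketch (helper names are hypothetical):
```
%% Don't roll an empty active journal ahead of a backup - the empty
%% CDB file it would create cannot be reopened (no last key).
maybe_roll_for_backup(ActiveJournal) ->
    case journal_is_empty(ActiveJournal) of          %% hypothetical
        true -> ok;                                  %% nothing to roll
        false -> roll_active_journal(ActiveJournal)  %% hypothetical
    end.
```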
As the fold functions have been added to get_runner in an ad hoc way - naturally, given the ongoing development of leveled to support Riak - it was difficult for a new user (in this case Quviq) to see what folds are supported, with what arguments, and with what expectations.
This PR is for discussion. It is one of many ways to group, spec, and
document the fold functions.
A test is also added for coverage of range queries.
Introduce a dedicated module for all the different fold types. Also simplify the list of folders by deprecating those folds that should be achievable by fold_heads/fold_objects type folds with smarter functions.
Makes sure that the fold functions also have better spec coverage, and are dialyzer checked.
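The intended shape, as an indicative sketch (names are examples of the style, not the exact leveled_runner API): one exported, spec'd function per fold type, each returning an {async, Runner} pair that dialyzer can check.
```
-module(example_runner).
-export([bucket_list/3]).

%% Indicative sketch only - one documented, spec'd constructor per
%% supported fold type.
-type fold_buckets_fun() :: fun((Bucket :: term(), Acc :: term()) -> term()).

-spec bucket_list(pid(), fold_buckets_fun(), term()) ->
          {async, fun(() -> term())}.
bucket_list(SnapshotPid, FoldBucketsFun, InitAcc) ->
    %% return a runner for the caller to execute later
    {async, fun() -> do_fold(SnapshotPid, FoldBucketsFun, InitAcc) end}.

%% stand-in for the real fold over the store snapshot
do_fold(_SnapshotPid, _FoldBucketsFun, Acc) ->
    Acc.
```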
Obviously got totally messed up and confused when testing previous
commits.
Multiple tests were failing for a change which was merged in, as the tests did not reflect the required API.
The naming is confusing now that we have TicTac Trees. This query builds a list of keys and hashes, not a tree - so it was misleading anyway. Now renamed to hashlist_query.
This at least checks the file is present, and that the Key exists in the index of that file. If the value is corrupt it will be removed by compaction, and then this check will fail (unless the file is never compacted).
TODO: resolve the issue of files which are corrupt - but never compacted - a job for backup?
Clean the API of Riak-specific methods, and also resolve a timing issue in the simple_server unit test. Previously this would end up with missing data (and a lower sequence number after start) because the penciller_clerk timeout was relatively large in the context of this test. Now that the timeout has been reduced, the L0 slot is cleared by the time of the close. To make sure, an extra sleep has been added as a precaution to avoid any intermittent issues.