Commit graph

1115 commits

Author SHA1 Message Date
Martin Sumner
b8d71023a8 Allow lower penciller cache sizes to be enforced
It might be necessary to have a low penciller cache size. However, currently the upper bound of that cache size can be very high, even when a low cache size is set. This is due to the coin tossing done to prevent co-ordination of L0 persistence across parallel instances of leveled.

The aim here is to reduce that upper bound, so that any environment having problems due to lack of memory or https://github.com/martinsumner/leveled/issues/326 can more strictly enforce a lower maximum in the penciller cache size.
2020-12-22 12:34:01 +00:00
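
The coin-tossing behaviour described in this entry can be pictured with a minimal Python sketch. It is not the leveled implementation (which is Erlang); the function name, the `toss_probability` value and the `ceiling_multiple` parameter are all hypothetical, chosen only to show how a random draw spreads out L0 persistence while a hard ceiling keeps the upper bound close to the configured size.

```python
import random


def should_persist_l0(cache_size, max_cache_size, rng=random,
                      toss_probability=0.5, ceiling_multiple=2.0):
    """Decide whether to persist the cache to L0 (illustrative only).

    toss_probability and ceiling_multiple are hypothetical parameters,
    not settings taken from leveled.
    """
    if cache_size < max_cache_size:
        return False  # below the configured size: keep caching
    if cache_size >= max_cache_size * ceiling_multiple:
        return True  # hard ceiling: persist regardless of the coin toss
    # Between the configured size and the ceiling, toss a coin so that
    # parallel instances do not all persist at the same moment.
    return rng.random() < toss_probability
```

Lowering `ceiling_multiple` in this sketch tightens the worst-case cache size, which is the kind of stricter enforcement the entry is aiming for.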
Martin Sumner
186e3868a9
Merge pull request #325 from martinsumner/mas-i321-checkactive
Mas i321 checkactive
2020-12-08 16:33:05 +00:00
Martin Sumner
eeeb7498c0 Review feedback
Also add types to try and help avoid further confusion
2020-12-04 19:40:28 +00:00
Martin Sumner
7bf67563ef
Update src/leveled_iclerk.erl
Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
2020-12-04 14:31:47 +00:00
Martin Sumner
108527a8d9 is_active check within fold
Not within the fold fun of the leveled_runner.

This should avoid constantly having to re-merge and filter the penciller memory when running list_buckets and hitting inactive keys
2020-12-04 12:49:17 +00:00
Martin Sumner
f3f574de02 Switch to checking on get_kvrange
In production scale testing, placing the check_modified call on get_kvrange not get_slots made the performance difference.

It should help in get_slots as well, but unable to reliably get coverage in tests with this.  So for now, will leave off until a proper test can be constructed which demonstrates any benefits.
2020-12-03 13:37:22 +00:00
Martin Sumner
a210aa6846 Promote cache when scanning
When scanning over a leveled store with a helper (e.g. segment filter and last modified date range), applying the filter will speed up the query when the block index cache is available to get_slots.

If it is not available, previously the leveled_sst did not then promote the cache after it had accessed the underlying blocks.

Now the code does this, and also when the cache has all been added, it extracts the largest last modified date so that sst files older than the passed in date can be immediately dismissed
2020-12-02 13:29:50 +00:00
Martin Sumner
80e6920d6c Standardise retention decision
Use the same function to decide for both scoring and compaction - and avoid the situation where something is scored for compaction, but doesn't change (which was the case previously with tombstones that were still in the ledger).
2020-11-29 15:43:29 +00:00
Martin Sumner
00823584ec Improve the quality of score
Move the average towards the current score if not scoring each run.   Score from more keys to get a better score (as overheads of scoring are now better sorted by setting score_onein rather than by reducing the sample size).
2020-11-27 20:03:44 +00:00
Martin Sumner
bcc331da10 Set max limit of 24 hours on cached score 2020-11-27 13:56:47 +00:00
Martin Sumner
b4c79caf7a Allow for caching of compaction scores
Potentially reduce the overheads of scoring each file on every run.

The change also alters the default thresholds for compaction to favour longer runs (which will tend towards greater storage efficiency).
2020-11-27 02:35:27 +00:00
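
As a rough illustration of score caching, together with the 24 hour limit and the `score_onein` setting mentioned in the entries above, here is a minimal Python sketch. It is not the leveled code: the class name, the injected `rng` and `clock`, and the exact rescoring rule are assumptions made for the example.

```python
import random
import time

MAX_CACHE_AGE_SECONDS = 24 * 60 * 60  # the 24 hour limit on a cached score


class CachedScore:
    """Reuse a file's compaction score between runs (illustrative only)."""

    def __init__(self, score_fn, score_onein, rng=random, clock=time.time):
        self.score_fn = score_fn        # the expensive scoring function
        self.score_onein = score_onein  # rescore roughly one run in N
        self.rng = rng
        self.clock = clock
        self.cached = None              # (score, scored_at) or None

    def score(self):
        now = self.clock()
        if self.cached is not None:
            value, scored_at = self.cached
            fresh = now - scored_at < MAX_CACHE_AGE_SECONDS
            rescore = self.rng.randint(1, self.score_onein) == 1
            if fresh and not rescore:
                return value  # reuse the cached score
        value = self.score_fn()
        self.cached = (value, now)
        return value
```

With `score_onein=1` every run rescores, so the cache only saves work for larger values.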
Martin Sumner
5bc137e4ef
Merge pull request #317 from martinsumner/mas-i1765-reducelog
Reduce logging
2020-08-05 19:42:22 +01:00
Martin Sumner
dd5b22a71e Reduce logging
Otherwise erlang.log with default settings may cycle too fast for a long indexer
2020-08-05 18:54:13 +01:00
Martin Sumner
37f006bba1
Merge pull request #315 from martinsumner/mas-i1765-reducelog
Mas i1765 reducelog
2020-07-23 14:14:04 +01:00
Martin Sumner
a6bd151d58 Use git tag for version 2020-07-23 14:03:21 +01:00
Martin Sumner
5cc281b73a Drop P0039 log to debug
Logging 80 times per second in some Riak tests
2020-07-23 14:00:59 +01:00
Martin Sumner
35167e3796
Update leveled.app.src
Bump version
2020-06-18 13:20:49 +01:00
Martin Sumner
4caefcf4aa Merge branch 'master' into develop-3.0 2020-04-09 12:23:42 +01:00
Martin Sumner
d05a5fdd46 Make grooming more accurate
Check more files to optimise grooming choices
2020-03-30 20:07:48 +01:00
Martin Sumner
9e56bfa947 Merge branch 'master' into mas-i311-mergeselector 2020-03-30 20:07:05 +01:00
Martin Sumner
9838e255d2 Address review comments
More efficient traversal of list to score.
2020-03-29 20:02:21 +01:00
Martin Sumner
28c88ef8b8 Typo 2020-03-27 20:09:03 +00:00
Martin Sumner
42eb5f56bc Merge branch 'master' into mas-i311-mergeselector 2020-03-27 17:11:18 +00:00
Martin Sumner
da97d65a23 Add grooming compactions
Make half of LSM-tree compactions grooming compactions i.e. compactions biased towards merging files with large numbers of tombstones.
2020-03-27 15:09:48 +00:00
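
The idea of making half of the compactions grooming compactions can be sketched in Python as below. This is an assumption-laden illustration rather than the `mergefile_selector` logic itself: the function name, the `groom_sample` size and the use of `None` to stand in for `not_counted` are all hypothetical.

```python
import random


def select_merge_file(files, rng=random, groom_sample=4):
    """Pick a file to merge (illustrative only).

    files is a list of (name, tombstone_count) pairs; tombstone_count is
    None for an old-style file whose tombstones were not counted.
    """
    if rng.random() < 0.5:
        return rng.choice(files)  # plain random selection
    # Grooming: look at a small random sample and prefer the file
    # carrying the most tombstones; uncounted files are treated as zero.
    sample = rng.sample(files, min(groom_sample, len(files)))
    return max(sample, key=lambda f: f[1] or 0)
```

Increasing `groom_sample` checks more files per selection, trading a little extra work for a better grooming choice.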
Martin Sumner
aca945a171 Add counting of tombstones to new SST files
.. and that old-style SST files can still be created, and opened, with a return of 'not_counted'
2020-03-27 10:20:10 +00:00
Martin Sumner
e175948378 Remove references to 'skip' strategy
Now called `recovr`
2020-03-26 14:25:09 +00:00
Martin Sumner
4ef0f4006d Extend mergefile_selector for strategy
Strategy only applied below L1, and only random strategy supported
2020-03-26 14:18:57 +00:00
Martin Sumner
20a7a22571 Add documentation for recalc option 2020-03-24 20:21:44 +00:00
Martin Sumner
8a9db9e75e Add log of strategy when clerk starts compaction 2020-03-23 16:45:28 +00:00
Martin Sumner
5b4edfebb6 Coverage cheat
Very rarely, this line is not covered in the tests - so cheating here to consistently pass coverage
2020-03-17 14:20:57 +00:00
Martin Sumner
808a858d09 Don't score a rolling file
In giving an empty file a score of 0, a race condition was exposed.  A file might not be active, but might still be rolling - and then can get scored as 0, and immediately compacted.  It will then be removed from the journal manifest.

Check each file is not rolling before making it a candidate for rolling.
2020-03-16 21:41:47 +00:00
Martin Sumner
5f7d261a87 Improve test
Genuine overhang
2020-03-16 18:53:40 +00:00
Martin Sumner
b49a5ff53d Additional unit tests of MetaBin handling 2020-03-16 17:35:38 +00:00
Martin Sumner
dbceda876c Issue with tag order
https://github.com/martinsumner/leveled/issues/309

Resolve issue, and remove test log entries used when discovering issue.
2020-03-16 16:35:06 +00:00
Martin Sumner
9d92ca0773 Add tests for appDefined functions 2020-03-16 12:51:14 +00:00
Martin Sumner
694d2c39f8 Support for recalc
Initial test included for running with recalc, and also transition from retain to recalc.

Moves all logic for startup fold into leveled_bookie - avoid the Inker requiring any direct knowledge about implementation of the Penciller.
2020-03-15 22:14:42 +00:00
Martin Sumner
1242dd4991 Merge branch 'master' into mas-i306-reviseretain 2020-03-13 19:56:35 +00:00
Martin Sumner
444011ac64 Merge branch 'master' into mas-i306-reviseretain 2020-03-09 21:40:19 +00:00
Martin Sumner
207aeb8b99 Remove additional log 2020-03-09 20:42:48 +00:00
Martin Sumner
6b3328f4a3 Rationalise logging in commit
Also:

Sort the output from an 'all' fetch one loop at a time

Make sure the test of scoring an empty file is scoring an empty file

If it is an empty file we want to compact the fragment away - in which case it should score 0.0 not 100.0
2020-03-09 17:45:06 +00:00
Martin Sumner
156e7b064d Compaction, retain and recovery
Change the penciller check so that it returns current/replaced/missing not just true/false.

Reduce unnecessary penciller checks for non-standard keys that will always be retained - and remove redundant code.

Expand tests of retain and recover to make sure that compaction on delete is well covered.

Also move the SQN number along during initial loads - to stop aggressive loop to find starting SQN every file.
2020-03-09 15:12:48 +00:00
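
A three-way result for the penciller check, rather than true/false, might look like the following Python sketch. The mapping of sequence numbers to each status is an assumption made for illustration, and `KeyStatus`, `check_key` and the dictionary-based ledger are hypothetical names, not leveled's API.

```python
from enum import Enum


class KeyStatus(Enum):
    CURRENT = "current"    # the ledger holds this exact version
    REPLACED = "replaced"  # the ledger holds a different version
    MISSING = "missing"    # the ledger has no entry for the key


def check_key(ledger, key, journal_sqn):
    """Compare a journal entry against the ledger (illustrative only).

    ledger maps key -> sequence number of the version it holds.
    """
    ledger_sqn = ledger.get(key)
    if ledger_sqn is None:
        return KeyStatus.MISSING
    if ledger_sqn == journal_sqn:
        return KeyStatus.CURRENT
    return KeyStatus.REPLACED
```

Separating "replaced" from "missing" lets a caller treat a superseded object differently from one the ledger has never seen, which a boolean cannot express.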
Martin Sumner
60e29f2ff0 (slightly) less random reads on journal compaction 2020-03-06 11:29:25 +00:00
Martin Sumner
009abdd599 Build and test on OTP 22 2020-02-24 09:55:05 +00:00
Martin Sumner
02155558df
Tag for Riak 2.9.1 2020-02-12 09:26:53 +00:00
Martin Sumner
4d550ef2a1
Bump version for new release 2019-11-20 10:33:51 +00:00
Martin Sumner
bcf10c9709 Fixup comments 2019-11-19 16:36:57 +00:00
Martin Sumner
693defb6d3 Use the same file write/sync/rename path where needed
When we want to be sure a file has been written before proceeding - we need a safer (than `file:write_file/2`) mechanism to be sure that it is written before proceeding.

This will:
open, write, sync, rename and then optionally read-back.

Changed so that manifest writing uses the safest form (including read back), and that sst writing uses a slightly looser form (with no read back to avoid performance issues).
2019-11-19 15:50:59 +00:00
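
The open, write, sync, rename and optional read-back sequence can be sketched in Python as follows. This is a minimal illustration of the pattern, not the Erlang code in leveled; the function name `safe_write` and the `.pnd` suffix for the in-progress file are assumptions.

```python
import os
import tempfile


def safe_write(path, data, read_back=False):
    """Write bytes to path via a pending file (illustrative only)."""
    pending = path + ".pnd"  # hypothetical suffix for the in-progress file
    with open(pending, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the bytes to disk before renaming
    os.replace(pending, path)  # rename over the target in a single step
    if read_back:
        # Safest form: confirm that what is on disk is what was written.
        with open(path, "rb") as f:
            if f.read() != data:
                raise IOError("read-back mismatch for " + path)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "manifest.bin")
    safe_write(target, b"example", read_back=True)
```

Skipping the read-back, as the entry describes for sst writing, avoids one extra read per file at the cost of a weaker guarantee.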
Martin Sumner
96779e667e Add specs and further tests
Prove corrupted blocks are handled as expected when detected from within check_blocks function.
2019-11-04 15:32:18 +00:00
Martin Sumner
843b28313d Handle when corrupted blocks creates empty list
So can't get nth or last.
2019-11-04 12:49:56 +00:00
Martin Sumner
78ef767a96 Merge branch 'master' into mas-i293-cachemonitor 2019-08-29 17:49:04 +01:00