Commit graph

1032 commits

Author SHA1 Message Date
Martin Sumner
42eb5f56bc Merge branch 'master' into mas-i311-mergeselector 2020-03-27 17:11:18 +00:00
Martin Sumner
da97d65a23 Add grooming compactions
Make half of LSM-tree compactions grooming compactions i.e. compactions biased towards merging files with large numbers of tombstones.
2020-03-27 15:09:48 +00:00
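As a rough illustration of the grooming idea in this commit, the Python sketch below picks a merge candidate either at random or by tombstone count. The function names, the (filename, tombstone_count) tuples, the sample size and the use of `None` for files without a count are all assumptions made for this example; they are not taken from the leveled code, which is written in Erlang.

```python
import random

def choose_strategy(rng):
    # Assumption: a fair coin makes roughly half of all compactions grooming ones.
    return "grooming" if rng.random() < 0.5 else "random"

def tombstone_count(entry):
    # Files without a tombstone count (e.g. old-style files) are treated as zero.
    _name, count = entry
    return count if count is not None else 0

def select_merge_file(files, strategy, rng, sample_size=8):
    # Random strategy: any file in the level is an equally likely candidate.
    if strategy == "random":
        return rng.choice(files)
    # Grooming strategy (hypothetical): sample some files and prefer the one
    # carrying the most tombstones.
    sample = rng.sample(files, min(sample_size, len(files)))
    return max(sample, key=tombstone_count)

files = [("a.sst", 0), ("b.sst", 5), ("c.sst", None)]
rng = random.Random(42)
groomed = select_merge_file(files, "grooming", rng)
```

Here `groomed` is the file with the highest tombstone count, since the sample covers all three files.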
Martin Sumner
aca945a171 Add counting of tombstones to new SST files
.. and that old-style SST files can still be created, and opened, with a return of 'not_counted'
2020-03-27 10:20:10 +00:00
Martin Sumner
4ef0f4006d Extend mergefile_selector for strategy
Strategy only applied below L1, and only random strategy supported
2020-03-26 14:18:57 +00:00
Martin Sumner
1242dd4991 Merge branch 'master' into mas-i306-reviseretain 2020-03-13 19:56:35 +00:00
Martin Sumner
444011ac64 Merge branch 'master' into mas-i306-reviseretain 2020-03-09 21:40:19 +00:00
Martin Sumner
207aeb8b99 Remove additional log 2020-03-09 20:42:48 +00:00
Martin Sumner
6b3328f4a3 Rationalise logging in commit
Also:

Sort the output from an 'all' fetch one loop at a time

Make sure the test of scoring an empty file is scoring an empty file

If it is an empty file we want to compact the fragment away - in which case it should score 0.0 not 100.0
2020-03-09 17:45:06 +00:00
Martin Sumner
156e7b064d Compaction, retain and recovery
Change the penciller check so that it returns current/replaced/missing not just true/false.

Reduce unnecessary penciller checks for non-standard keys that will always be retained - and remove redundant code.

Expand tests of retain and recover to make sure that compaction on delete is well covered.

Also move the SQN along during initial loads - to stop an aggressive loop to find the starting SQN for every file.
2020-03-09 15:12:48 +00:00
Martin Sumner
60e29f2ff0 (slightly) less random reads on journal compaction 2020-03-06 11:29:25 +00:00
Martin Sumner
02155558df Tag for Riak 2.9.1 2020-02-12 09:26:53 +00:00
Martin Sumner
4d550ef2a1 Bump version for new release 2019-11-20 10:33:51 +00:00
Martin Sumner
bcf10c9709 Fixup comments 2019-11-19 16:36:57 +00:00
Martin Sumner
693defb6d3 Use the same file write/sync/rename path where needed
When we need to be sure a file has been written before proceeding, we need a safer mechanism than `file:write_file/2`.

This will:
open, write, sync, rename and then optionally read-back.

Changed so that manifest writing uses the safest form (including read back), and that sst writing uses a slightly looser form (with no read back to avoid performance issues).
2019-11-19 15:50:59 +00:00
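The open, write, sync, rename and optional read-back sequence described in this commit can be sketched as follows. This is a minimal Python illustration, not the Erlang implementation in leveled; the `safe_write` name, the `.tmp` suffix and the file names in the demo are hypothetical.

```python
import os
import tempfile

def safe_write(path, data, read_back=False):
    # Write to a temporary file first so a crash never leaves a partial
    # file under the final name. The ".tmp" suffix is a hypothetical choice.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # push the bytes to disk before renaming
    os.replace(tmp_path, path)  # rename over the final name
    if read_back:
        # Strictest form: confirm the bytes on disk match what was written.
        with open(path, "rb") as handle:
            if handle.read() != data:
                raise OSError("read-back mismatch for " + path)
    return path

demo_dir = tempfile.mkdtemp()
# Manifest-style write: safest form, with read-back.
manifest_path = safe_write(os.path.join(demo_dir, "manifest.man"), b"manifest-v1", read_back=True)
# SST-style write: looser form, skipping the read-back to save time.
sst_path = safe_write(os.path.join(demo_dir, "file.sst"), b"sst-bytes")
```

The `read_back` flag mirrors the trade-off in the commit message: the extra read costs time, so it is reserved for the files where certainty matters most.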
Martin Sumner
96779e667e Add specs and further tests
Prove corrupted blocks are handled as expected when detected from within check_blocks function.
2019-11-04 15:32:18 +00:00
Martin Sumner
843b28313d Handle when corrupted blocks creates empty list
So can't get nth or last.
2019-11-04 12:49:56 +00:00
Martin Sumner
78ef767a96 Merge branch 'master' into mas-i293-cachemonitor 2019-08-29 17:49:04 +01:00
Martin Sumner
d5c1d1e51e Log about cache ratio and object hit ratio
If not a snapshot.
2019-08-29 17:48:46 +01:00
Martin Sumner
ecda13872a Add logging of cache ratio
Two reasons for logging this:

- to assist in sizing the ledger cache;
- to resolve the mystery when there appear to be no fetches from the penciller (as the penciller does not report fetches from the ledger cache)
2019-08-29 11:26:29 +01:00
Martin Sumner
0966ce9929 Test improvements
Improve the speed of leveled_cdb tests by disabling sync on write.

Improve the strength of check of the correct behaviour when compacting with a reduced journal size.
2019-08-29 10:32:07 +01:00
Martin Sumner
8587686783 Add testing to ensure keydeltas are compacted in test 2019-07-26 21:43:00 +01:00
Martin Sumner
57d73fc548 Make page cache level configurable 2019-07-25 12:23:10 +01:00
Martin Sumner
6cd898b731 Update leveled.app.src 2019-07-25 11:12:35 +01:00
Martin Sumner
e7c8dd7a78 Typo round-up
Also reduce log noise when persisting new Journal files
2019-07-25 10:24:40 +01:00
Martin Sumner
dab9652f6c Add ability to control journal size by object count
This helps when there are files with large numbers of key deltas (and hence small values), where otherwise the object count may get out of control.
2019-07-25 09:45:23 +01:00
Martin Sumner
22e732841c Compaction of already compacted journals
Ensure that journals with a large volume of key deltas do not erroneously get repeatedly compacted.
2019-07-24 18:03:22 +01:00
Martin Sumner
5bef21d971 Add unit test to prove binary/copy issue
Need to understand why the binary:copy is necessary - unit test now shows this.
2019-07-22 10:35:55 +01:00
Martin Sumner
c9c577259e Need to binary copy the header
Otherwise the whole binary is kept in memory ... and the SST memory footprint is much bigger.
2019-07-19 13:37:27 +01:00
Martin Sumner
da1ecc144a Tidy-up GC on compaction
Make sure we hibernate any CDB files after we score them, as they may not be used for sometime, and there may be garbage binary references present.
2019-07-19 13:30:53 +01:00
Martin Sumner
7862a6c523 Change page cache loading by lookup/no_lookup
By default load the first 4 levels of the ledger into the page cache if lookup is to be supported, but just levels 0 and 1 otherwise.
2019-07-18 14:00:19 +01:00
Martin Sumner
85bfa7fbb4 Use hibernate not garbage_collect
Use hibernation rather than manual garbage_collect calls as per standard recommendation.  Hibernate will by default garbage_collect anyway.  May help with SST files that naturally go quiet.

Plus typos from previous commit in leveled_cdb.
2019-07-18 13:21:38 +01:00
Martin Sumner
5a853ee44d Hibernate iclerk on completion of compaction
Will be inactive for a period.  Will also force garbage collection.
2019-07-18 13:10:11 +01:00
Martin Sumner
3c834afa08 Use hibernate on open or roll to read
CDB files may be opened or rolled then left untouched for a period, so clean up any memory.  Being awoken from hibernate has a cost, but it is a rare event.
2019-07-18 13:07:48 +01:00
Martin Sumner
478c5b6db0 Load ledger in reverse order
Now that the SST files will fadvise on load (to force load into the page cache), the load should take place in reverse order, so that if the ledger is > page_cache, it is the higher levels that will end up in the cache at the expense of the lower levels.
2019-07-16 10:25:49 +01:00
Martin Sumner
f8b3101a3a Two memory management helpers
Two helpers for memory management:

1 - a scan over the cdb file may lead to a lot of binary references being made.  So force a GC after the scan.

2 - the penciller files contain slots that will be frequently read - so advise the page cache to pre-load them on startup.

This is in response to unexpected memory management issues in a potentially non-conventional setup - where the erlang VM held a lot of memory (that could be GC'd), in preference to the page cache - and consequently disk I/O and request latency were higher than expected.
2019-07-15 13:44:39 +01:00
Martin Sumner
b2d4d766cd Update leveled.app.src
Ready for release of 0.9.16
2019-06-19 10:32:12 +01:00
Martin Sumner
952f088873 Memory management
Extracting binary from within a binary leaves a reference to the whole of the original binary.

If there are a lot of very large objects received back to back - this can explode the amount of memory the penciller appears to hold (and gc cannot resolve this).

To dereference from the larger binary, need to do a binary copy
2019-06-15 17:23:06 +01:00
Martin Sumner
b5859ddde9 Change default DB ID
An undefined database id will use 65536 not 0 (as 0 is commonly used when defining database ids in Riak)
2019-06-14 11:19:37 +01:00
Martin Sumner
876a023db1 Add database_id to options
So that this can be recorded in logs
2019-06-13 14:58:32 +01:00
Martin Sumner
c3a4f5118d Each merge log details of the level below
Help with troubleshooting memory problems, and potential issues with GC
2019-06-13 11:50:02 +01:00
Martin Sumner
da59901890 Update leveled.app.src 2019-05-23 12:19:32 +01:00
Martin Sumner
e360b97cfb GC manifest files when numbers skipped
Otherwise list of old files perpetually grows
2019-05-23 10:16:15 +01:00
Martin Sumner
744a521289 Handle timeout/message race
When there is heavy PUT load, leveled_sst files could go into the delete-pending state before the GC message is received - and the GC message would then interrupt the timeout cycle and lead to the file not being GC'd until close.
2019-05-23 09:34:54 +01:00
Martin Sumner
aedd515a5b Bump vsn for release 2019-05-11 15:59:52 +01:00
Martin Sumner
d30fb0ee33 Reduce frequency of timing logs
and record level in the sst timing logs
2019-05-11 15:59:42 +01:00
Martin Sumner
486af59da1 Soften log noise 2019-05-11 13:26:07 +01:00
Martin Sumner
2f3d2a634c Correct the tidyup after startup
Use send_after/3 and unit test to confirm this works as expected
2019-03-28 21:01:01 +00:00
Martin Sumner
dfa8574695 Use correct send
So it actually works
2019-03-28 17:46:08 +00:00
Martin Sumner
42c4100c2d Add GC after initialisation
In OTP R16 (and perhaps other OTP releases) there is a failure to fully garbage collect leveled_sst files after they have initialised.  They appear to maintain a 4MB "hangover" from the initialisation process.

This can be removed by manually calling garbage_collect.  So we do this now on all new non-L0 files.  A L0 file will be short-lived or switched - short-lived and it doesn't matter, switched and this is already GC'd.
2019-03-28 13:23:37 +00:00
Martin Sumner
ffcd577f83 Update leveled_penciller.erl
Sometimes when testing (especially with coverage), the sst file is not alive when it is to be closed.  Check it is alive before closing.
2019-03-13 21:19:32 +00:00