Commit graph

185 commits

Author SHA1 Message Date
Martin Sumner
f8b3101a3a Two memory management helpers
Two helpers for memory management:

1 - a scan over the cdb file may lead to a lot of binary references being made.  So force a GC after the scan.

2 - the penciller files contain slots that will be frequently read - so advise the page cache to pre-load them on startup.

This is in response to unexpected memory management issues in a potentially non-conventional setup - where the erlang VM held a lot of memory (that could be GC'd), in preference to the page cache - and consequently disk I/O and request latency were higher than expected.
2019-07-15 13:44:39 +01:00
Martin Sumner
744a521289 Handle timeout/message race
When there is heavy PUT load, leveled_sst files could go into the delete-pending state before the GC message is received - and the GC message would then interrupt the timeout cycle and lead to the file not being GC'd until close.
2019-05-23 09:34:54 +01:00
Martin Sumner
d30fb0ee33 Reduce frequency of timing logs
and record level in the sst timing logs
2019-05-11 15:59:42 +01:00
Martin Sumner
2f3d2a634c Correct the tidyup after startup
Use send_after/3 and unit test to confirm this works as expected
2019-03-28 21:01:01 +00:00
Martin Sumner
dfa8574695 Use correct send
So it actually works
2019-03-28 17:46:08 +00:00
Martin Sumner
42c4100c2d Add GC after initialisation
In OTP R16 (and perhaps other OTP releases) there is a failure to fully garbage collect leveled_sst files after they have initialised.  They appear to maintain a 4MB "hangover" from the initialisation process.

This can be removed by manually calling garbage_collect.  So we do this now on all new non-L0 files.  An L0 file will be either short-lived or switched: if short-lived it doesn't matter, and if switched it is already GC'd.
2019-03-28 13:23:37 +00:00
Martin Sumner
95c27a835b GC before transitioning a L0 to reader
The L0 Pid has used a lot of memory in the construction of the file (something like 50MB).  This won't be GC'd immediately.  This is fine, as this will normally be short-lived.  However if the SST file is switched levels ... then this may mean that we have multiple SST files with memory not being GC'd.
2019-03-04 11:28:00 +00:00
Martin Sumner
63fe77940a Stop using sync_send_event/2 with default timeout
On CDB and SST files.  Only use for close and APIs exclusively used in unit tests.
2019-03-02 21:21:13 +00:00
Martin Sumner
01f731dbc9 Refactor fetching of level zero cache entries
This is now done on an async message passing loop between the penciller and the new SST file.  This way, when the penciller shuts down, it can call close on a L0 file that is awaiting a fetch - rather than be trapped in deadlock.

The deadlock otherwise occurs if a penciller is sent a close immediately after it has prompted a new level zero.
2019-02-26 18:16:47 +00:00
Martin Sumner
a589c9ca63 Update leveled_sst.erl
Handle deprecation warning
2019-02-26 12:31:34 +00:00
Martin Sumner
fffa257ffb Update leveled_sst.erl
Remove arbitrarily reduced timings.  Can cause problems when testing with coverage enabled
2019-02-26 12:25:42 +00:00
Martin Sumner
509d541c9f Allow for false to close not crash
If PID has gone away
2019-01-29 13:46:25 +00:00
Martin Sumner
51a0260a60 Get new file to check initiator is alive
If no activity within timeout.  Check whether the process has been orphaned by pclerk ending before manifest entry update made.
2019-01-29 13:18:39 +00:00
Martin Sumner
ae9b03ab3c Fix unit tests - and make slot size configurable 2019-01-26 16:57:25 +00:00
Martin Sumner
a13a6ae45f Updated model
This has inappropriate default parameter changes.
2019-01-22 12:53:31 +00:00
Martin Sumner
7f08fd5a68 Change file references in unit tests
Write into test folder within the repo, not outside of it.  Try and resolve issues with make test in riak
2019-01-17 21:02:29 +00:00
Martin Sumner
6677f2e5c6 Push log update through to cdb/sst
Using the cdb_options and sst_options records
2018-12-11 20:42:00 +00:00
Martin Sumner
9ca6b499e1 Remove use of string rather than straddle OTP version
string functions were used in unit tests only, and were replaceable with io_lib:format
2018-12-11 15:44:37 +00:00
Martin Sumner
90574122c9 Merge remote-tracking branch 'aeternity/uw-avoid-set_env' into mas-pr231-review 2018-12-10 18:33:23 +00:00
Martin Sumner
4b4b774c0d Fix dialyzer warnings
Dialyzer got smarter in OTP 21 and spotted that the output type was wrong from tune_seglist
2018-12-07 14:36:18 +00:00
Ulf Wiger
fbe200f1ca warning-free vsns of string:str/2 & string:right/3 2018-12-07 08:39:44 +01:00
Martin Sumner
881b93229b Isolate better changes needed to support changes to metadata extraction
More obvious how to extend the code as it is all in one module.

Also add a new field to the standard object metadata tuple that may in the future hold other object metadata based on user-defined functions.
2018-12-06 15:31:11 +00:00
Martin Sumner
fab11bc2d2 Update comments to assist with clarity 2018-11-15 08:54:06 +00:00
Martin Sumner
a12931b430 Add comments 2018-11-15 01:06:37 +00:00
Martin Sumner
b571be9e43 Check that seglist-filtered keys are actually in range 2018-11-15 00:00:18 +00:00
Martin Sumner
ea7aa3086d Refactor membership check
To change to set membership when size beyond threshold
2018-11-07 17:43:26 +00:00
Martin Sumner
174a40aab2 Tidy up unexported types
also re:mp may not be exported in R16
2018-11-05 16:02:19 +00:00
Martin Sumner
e72a946f43 TupleBuckets in Riak objects
Adds support with test for tuplebuckets in Riak keys.

This exposed that there was no filter using the seglist on the in-memory keys.  This means that if there is no filter applied in the fold_function, many false positives may emerge.

This is probably not a big performance benefit (and indeed for performance it may be better to apply during the leveled_pmem:merge_trees).

Some thought still required as to what is more likely to contribute to future bugs: an extra location using the hash matching found in leveled_sst, or the extra results in the query.
2018-11-05 01:21:08 +00:00
Martin Sumner
19bfe48564 Initial ct test
Which exposed it wasn't working.  If there is no segment list passed - just a modification filter, you don't need to check the position list (as checking the position list returns an empty position list, so skipping all the matching results!)
2018-10-31 16:35:53 +00:00
Martin Sumner
142e3a17bb Add in modification date to v2 value
And restrict it to 32 bits - as 80 years should be enough.
2018-10-31 11:44:46 +00:00
Martin Sumner
1f976948a1 Add test timeout
As timed out with coverage enabled
2018-10-30 21:52:17 +00:00
Martin Sumner
ffe4c39ee8 Add tests with old file format 2018-10-30 21:43:49 +00:00
Martin Sumner
ae1ada86b2 Add accumulator check for last mod range
Perhaps should also do the segment check at this point.  Seems odd to check last modified date and segments in different places.
2018-10-30 19:35:29 +00:00
Martin Sumner
b7e697f7f0 Fold API to leveled_sst
Externally to leveled_sst all folds are actually managed through expand_list_by_pointer.

Make the API a bit clearer in this regard, and add specs to help dialyzer.

This also adds LowLastMod to the API for expanding pointers (although the leveled_penciller just defaults this to 0 for everything).
2018-10-30 16:44:00 +00:00
Martin Sumner
bdd1762130 Missing use of extract_header
Spotted by ct test crossbucket_aae
2018-10-30 14:06:17 +00:00
Martin Sumner
75d2e2d546 Fix yield
Wrong format of response if it was delete_pending
2018-10-30 13:00:23 +00:00
Martin Sumner
7295a41321 Read (and ignore) last modified date
Add presence of LMD into index - and check everything happily lets it pass by
2018-10-30 11:47:03 +00:00
Martin Sumner
467c2fb89c Allow a boolean to be passed in to set IndexModDate
Although we are still pre-release in Leveled, for completeness it is a useful test of this code change to show that it can be done in a backwards compatible way.

So a boolean is added to indicate whether a file should index the modified date within the slot, and this can then be read when the file is opened.

Nothing happens with the boolean, yet.
2018-10-30 10:25:54 +00:00
Martin Sumner
8ba28700eb Start adding in last_modified dates
With updated specs
2018-10-29 21:50:32 +00:00
Martin Sumner
14fd67e535 Add specs and comments and split function
Need to change this, so refactor and make neater in preparation
2018-10-29 21:16:38 +00:00
Martin Sumner
082eabb65b Switch to start_link
Start all processes linked - to collapse the whole tree if one process fails
2018-06-28 12:16:43 +01:00
Martin Sumner
a14941a122 Fix unexported types
file:location not exported?
2018-06-04 10:57:37 +01:00
Martin Sumner
989f23bca6 Add cache population for non-yielding range fetches
More likely to require caching at lower levels.
2018-05-17 16:38:52 +01:00
Martin Sumner
779ccd9c2a Add use of block index when not cached (for fetch range) 2018-05-17 14:56:15 +01:00
Martin Sumner
18aabb49ba Segment filter and multiple keys in slot
An issue was spotted.  If we use a segment filter in a query, and there are multiple matches within a given slot - only the first match is returned.

Tests didn't detect this.  Now they do, and the issue is resolved.
2018-05-16 17:24:23 +01:00
Martin Sumner
cbf6e26fc8 Revert "Accumulate keys in check_blocks"
This reverts commit 5414e18047.
2018-05-16 15:11:18 +01:00
Martin Sumner
5414e18047 Accumulate keys in check_blocks
Previously couldn't accumulate keys using check-blocks - so if a key was found in the first position, and there were other positions to check for other keys, those other positions wouldn't be checked.
2018-05-16 11:38:26 +01:00
Martin Sumner
2cc088fc64 More debug 2018-05-16 11:24:57 +01:00
Martin Sumner
a7cda7213f Debug logs 2018-05-16 11:06:51 +01:00
Martin Sumner
8b4adcccaf Debug logs 2018-05-16 10:51:59 +01:00