Commit graph

293 commits

Author SHA1 Message Date
Martin Sumner
f3f574de02 Switch to checking on get_kvrange
In production scale testing, placing the check_modified call on get_kvrange not get_slots made the performance difference.

It should help in get_slots as well, but unable to reliably get coverage in tests with this.  So for now, will leave off until a proper test can be constructed which demonstrates any benefits.
2020-12-03 13:37:22 +00:00
Martin Sumner
a210aa6846 Promote cache when scanning
When scanning over a leveled store with a helper (e.g. segment filter and last modified date range), applying the filter will speed up the query when the block index cache is available to get_slots.

If it is not available, previously the leveled_sst did not then promote the cache after it had accessed the underlying blocks.

Now the code does this, and also when the cache has all been added, it extracts the largest last modified date so that sst files older than the passed-in date can be immediately dismissed.
2020-12-02 13:29:50 +00:00
Martin Sumner
b4c79caf7a Allow for caching of compaction scores
Potentially reduce the overheads of scoring each file on every run.

The change also alters the default thresholds for compaction to favour longer runs (which will tend towards greater storage efficiency).
2020-11-27 02:35:27 +00:00
Martin Sumner
312fc52832 Extend test to make it highly likely a "garbage" merge file choice is made 2020-03-31 09:33:50 +01:00
Martin Sumner
9e56bfa947 Merge branch 'master' into mas-i311-mergeselector 2020-03-30 20:07:05 +01:00
Martin Sumner
42eb5f56bc Merge branch 'master' into mas-i311-mergeselector 2020-03-27 17:11:18 +00:00
Martin Sumner
aca945a171 Add counting of tombstones to new SST files
.. and that old-style SST files can still be created, and opened, with a return of 'not_counted'
2020-03-27 10:20:10 +00:00
Martin Sumner
50cb98ecdd Resolve intermittent test failure
the previous regex filter still allowed files with cdb in the body of the name (which can be true as filenames are guid based)
2020-03-17 17:29:59 +00:00
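The failure mode described in this commit can be illustrated with a minimal sketch. This is Python rather than leveled's Erlang, and the pattern, function names, and filenames are all hypothetical; the actual regex used in the test is not shown in the commit message. The idea is only that a filter should match `cdb` as the file extension, not anywhere in a GUID-based name.

```python
import re

# Hypothetical pattern: match "cdb" only as the trailing file extension.
CDB_EXTENSION = re.compile(r"\.cdb$")


def loose_match(filename: str) -> bool:
    """The problem case: 'cdb' anywhere in the name counts as a match."""
    return "cdb" in filename


def is_cdb_file(filename: str) -> bool:
    """Stricter filter: only names ending in the .cdb extension match."""
    return CDB_EXTENSION.search(filename) is not None


if __name__ == "__main__":
    # Made-up GUID-like names; the second contains "cdb" inside the GUID itself.
    for name in ["0a1b2c3d.cdb", "7fcdb912.tmp"]:
        print(name, loose_match(name), is_cdb_file(name))
```

With the loose check, a GUID that happens to contain the letters `cdb` is matched only intermittently, which is consistent with the test failing some of the time.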
Martin Sumner
808a858d09 Don't score a rolling file
In giving an empty file a score of 0, a race condition was exposed.  A file might not be active, but might still be rolling - and then can get scored as 0, and immediately compacted.  It will then be removed from the journal manifest.

Check each file is not rolling before making it a candidate for rolling.
2020-03-16 21:41:47 +00:00
Martin Sumner
dbceda876c Issue with tag order
https://github.com/martinsumner/leveled/issues/309

Resolve issue, and remove test log entries used when discovering issue.
2020-03-16 16:35:06 +00:00
Martin Sumner
6350302ea8 Uncomment test 2020-03-16 13:32:52 +00:00
Martin Sumner
9d92ca0773 Add tests for appDefined functions 2020-03-16 12:51:14 +00:00
Martin Sumner
706ba8a674 Resolve issues with passing specs around 2020-03-15 23:15:09 +00:00
Martin Sumner
694d2c39f8 Support for recalc
Initial test included for running with recalc, and also transition from retain to recalc.

Moves all logic for startup fold into leveled_bookie - avoid the Inker requiring any direct knowledge about implementation of the Penciller.
2020-03-15 22:14:42 +00:00
Martin Sumner
156e7b064d Compaction, retain and recovery
Change the penciller check so that it returns current/replaced/missing not just true/false.

Reduce unnecessary penciller checks for non-standard keys that will always be retained - and remove redundant code.

Expand tests of retain and recover to make sure that compaction on delete is well covered.

Also move the SQN number along during initial loads - to stop aggressive loop to find starting SQN every file.
2020-03-09 15:12:48 +00:00
Martin Sumner
0966ce9929 Test improvements
Improve the speed of leveled_cdb tests by disabling sync on write.

Improve the strength of check of the correct behaviour when compacting with a reduced journal size.
2019-08-29 10:32:07 +01:00
Martin Sumner
8587686783 Add testing to ensure keydeltas are compacted in test 2019-07-26 21:43:00 +01:00
Martin Sumner
dab9652f6c Add ability to control journal size by object count
This helps when there are files with large numbers of key deltas (and hence small values), where otherwise the object count may get out of control.
2019-07-25 09:45:23 +01:00
Martin Sumner
22e732841c Compaction of already compacted journals
Ensure that journals with a large volume of key deltas do not erroneously get repeatedly compacted.
2019-07-24 18:03:22 +01:00
Martin Sumner
f8b3101a3a Two memory management helpers
Two helpers for memory management:

1 - a scan over the cdb file may lead to a lot of binary references being made.  So force a GC after the scan.

2 - the penciller files contain slots that will be frequently read - so advise the page cache to pre-load them on startup.

This is in response to unexpected memory management issues in a potentially non-conventional setup - where the Erlang VM held a lot of memory (that could be GC'd) in preference to the page cache - and consequently disk I/O and request latency were higher than expected.
2019-07-15 13:44:39 +01:00
Martin Sumner
952f088873 Memory management
Extracting binary from within a binary leaves a reference to the whole of the original binary.

If there are a lot of very large objects received back to back - this can explode the amount of memory the penciller appears to hold (and gc cannot resolve this).

To dereference from the larger binary, need to do a binary copy.
2019-06-15 17:23:06 +01:00
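The effect described in this commit has a rough analogue outside Erlang, sketched below in Python. This is not the leveled code: the function name and sizes are hypothetical, and Python's `memoryview` is only standing in for an Erlang sub-binary. In Erlang itself the standard library function for making an independent copy is `binary:copy/1`; whether the commit uses exactly that call is an assumption.

```python
def extract_header(payload: bytearray, size: int, copy: bool):
    """Return the first `size` bytes of `payload`.

    Without a copy, the returned memoryview still references the whole
    of `payload`, so the large buffer cannot be freed while it is alive.
    With a copy, the result is an independent bytes object.
    """
    window = memoryview(payload)[:size]
    return bytes(window) if copy else window


if __name__ == "__main__":
    big = bytearray(10_000_000)  # stands in for a very large received object

    pinned = extract_header(big, 16, copy=False)
    print(pinned.obj is big)     # the small slice keeps the big buffer reachable

    detached = extract_header(big, 16, copy=True)
    print(len(detached))         # 16 bytes, no link back to `big`
```

The trade-off is the same in both languages: the copy costs a small allocation, but it allows the large original to be reclaimed.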
Martin Sumner
876a023db1 Add database_id to options
So that this can be recorded in logs
2019-06-13 14:58:32 +01:00
Martin Sumner
e360b97cfb GC manifest files when numbers skipped
Otherwise list of old files perpetually grows
2019-05-23 10:16:15 +01:00
Martin Sumner
14e1f577c9 Test default tag 2019-03-14 00:08:01 +00:00
Martin Sumner
01f0dadbb3 Add access to SQN
Use book_sqn/3 or book_sqn/4 to get the SQN of an object in the store.
2019-03-13 16:21:03 +00:00
Martin Sumner
be6e23f7de Change cache_size in sst tests
Makes results more predictable (with coin toss variations)
2019-01-29 13:40:55 +00:00
Martin Sumner
e3bd83179a Uncomment tests! 2019-01-27 23:31:44 +00:00
Martin Sumner
db0db67c45 Delete leveledjc_eqc.erl
Remove this for now, until issues with running tests without QC installed can be resolved.

Allow for changes to support QC to be merged into master.
2019-01-27 22:09:48 +00:00
Martin Sumner
8f6862a10b Test sst slot configuration change
Confirm it results in many more files, if the slot count is reduced.  Has to handle the fact that Level 0 file has unlimited slots regardless of number of slots configured.
2019-01-27 22:03:55 +00:00
Martin Sumner
ae9b03ab3c Fix unit tests - and make slot size configurable 2019-01-26 16:57:25 +00:00
Martin Sumner
f907fb5c97 Close in all cases
in leveled_imanifest
2019-01-25 19:27:42 +00:00
Martin Sumner
a04ed53855 Merge branch 'mas-qc-inkercompaction' of https://github.com/martinsumner/leveled into mas-qc-inkercompaction 2019-01-25 19:11:37 +00:00
Martin Sumner
f7022627e5 Check not pending before compacting
Also check for existence before deleting a CDB file
2019-01-25 19:11:34 +00:00
Martin Sumner
c9a955f2dd Merge branch 'mas-qc-inkercompaction' into mas-i249-iclerkfsm 2019-01-25 15:15:29 +00:00
Martin Sumner
e349774167 Allow clerk to be stopped during compaction scoring
This will stop needless compaction work from being completed when the iclerk is sent a close at this stage.
2019-01-25 12:11:42 +00:00
Martin Sumner
00a59f4f8f Merge branch 'mas-qc-inkercompaction' into mas-i249-iclerkshutdown 2019-01-25 09:53:56 +00:00
Martin Sumner
0333604fd9 Change to cast in inker/iclerk interaction
This allows for leveled_iclerk:clerk_stop to be a sync call, so that files will only be closed once the iclerk has stopped.  This is designed to prevent iclerk crashes during shutdowns when files it is depending on are closed mid shutdown.
2019-01-24 21:32:54 +00:00
Martin Sumner
28d0aef5fe Make check that compaction not ongoing before accepting new compaction
Respond 'busy' if compaction is ongoing
2019-01-24 15:46:17 +00:00
Martin Sumner
a13a6ae45f Updated model
This has inappropriate default parameter changes.
2019-01-22 12:53:31 +00:00
Martin Sumner
b713ce60a8 Initial eqc setup 2019-01-21 10:51:07 +00:00
Martin Sumner
c060c0e41d Handle L0 cache being full
A test that will cause leveled to crash due to a low cache size being set - but protect against this (as well as the general scenario of the cache being full).

There could be a potential case where a L0 file is present (post pending) without work backlog being set.  In this case we want to roll the level zero to memory, but don't accept the cache update if the L0 cache is already full.
2019-01-14 12:27:51 +00:00
Martin Sumner
672cfd4fcd Allow for run-time changes to log_level and forced_logs
Will not lead to immediate run time changes in SST or CDB logs.  These log settings will only change once the new files are re-written.

To completely change the log level - a restart of the store is necessary with new startup options.
2018-12-11 21:59:57 +00:00
Martin Sumner
6677f2e5c6 Push log update through to cdb/sst
Using the cdb_options and sst_options records
2018-12-11 20:42:00 +00:00
Martin Sumner
f274d2a63a Tighten acceptable duration
even with cover, passes in 30s.
2018-12-10 13:23:39 +00:00
Martin Sumner
e73f48a18b Add failing test
Test fails as fetching repeated object is too slow.

```
Head check took 124301 microseconds checking list of length 5000
Head check took 112286 microseconds checking list of length 5000
Head check took 1336512 microseconds checking list of length 5
2018-12-10T11:54:41.342 B0013 <0.2459.0> Long running task took 260788 microseconds with task of type pcl_head
2018-12-10T11:54:41.618 B0013 <0.2459.0> Long running task took 276508 microseconds with task of type pcl_head
2018-12-10T11:54:41.894 B0013 <0.2459.0> Long running task took 275225 microseconds with task of type pcl_head
2018-12-10T11:54:42.173 B0013 <0.2459.0> Long running task took 278836 microseconds with task of type pcl_head
2018-12-10T11:54:42.477 B0013 <0.2459.0> Long running task took 304524 microseconds with task of type pcl_head
```

It takes twice as long to check for one repeated object as it does to check for 5K non-repeated objects.
2018-12-10 11:58:21 +00:00
Martin Sumner
8e687ee7c8 Add user-defined functions
To allow for extraction of metadata, and building of head responses - it should be possible to dynamically add user-defined tags, and functions to treat them.

If no function is defined, revert to the behaviour of the ?STD tag.
2018-12-06 21:00:59 +00:00
Martin Sumner
881b93229b Isolate better changes needed to support changes to metadata extraction
More obvious how to extend the code as it is all in one module.

Also add a new field to the standard object metadata tuple that may hold in the future other object metadata based on user-defined functions.
2018-12-06 15:31:11 +00:00
Martin Sumner
510994233e Add check that index disappears
Check I0 count goes down when that index is removed
2018-12-05 15:42:21 +00:00
Martin Sumner
cf1fcaeef2 Add test of index expiry
To show how this works, and prove that it does work that way.

Test may require adjusting if tested on a slow node (e.g. reduce KeyCount or increase TTL)
2018-12-05 15:18:20 +00:00
Martin Sumner
578a9f88e0 Support for log settings at startup
Both log level and forced_logs.  Allows for log_level to be changed at startup and runtime.  Also allows for a list of forced logs, so if log_level is set > info, individual info logs can be forced to be seen (such as to see stats logs).
2018-12-05 00:17:39 +00:00
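The interaction between log_level and forced_logs described in this commit might be sketched as follows. This is an illustrative Python sketch, not the leveled implementation: the level names, their ordering, and the `should_log` function are assumptions, and the log reference `B0013` is borrowed from the log excerpt earlier in this list purely as an example.

```python
# Assumed severity ordering, lowest first; the real level names may differ.
LEVELS = ["debug", "info", "warn", "error", "critical"]


def should_log(entry_level, entry_ref, log_level, forced_logs):
    """Decide whether a log entry is emitted.

    An entry is written if its reference is in the forced list, or if its
    severity is at or above the configured log_level.
    """
    if entry_ref in forced_logs:
        return True
    return LEVELS.index(entry_level) >= LEVELS.index(log_level)


if __name__ == "__main__":
    # log_level raised above info, but one info-level log is forced through.
    print(should_log("info", "B0013", "warn", ["B0013"]))
    print(should_log("info", "B0001", "warn", ["B0013"]))
```

Checking the forced list first means that a specific stats log can stay visible without lowering the level for everything else.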