1085 commits

Author SHA1 Message Date
Martin Sumner
910ccb6072 Add lookup support in head_only mode
Originally the ability to look up individual values was disabled when running in head_only mode. This gave a saving of about 11% at PUT time (about 3 microseconds per PUT) on a MacBook.

Not sure this saving is sufficient to justify the extra work if this is used as an AAE Keystore with Bitcask and LWW (when we need to look up the current value before adjusting).

So reverted to re-adding support for HEAD requests with these keys.
2018-02-16 14:16:28 +00:00
Martin Sumner
2b6281b2b5 Initial head_only features
Initial commit to add head_only mode to leveled.  This allows leveled to receive batches of object changes, but where those objects exist only in the Penciller's Ledger (once they have been persisted within the Ledger).

The aim is to significantly reduce the cost of compaction.  Also, the objects are not directly accessible (they can only be accessed through folds).  Again this makes life easier during merging in the LSM trees (as no bloom filters have to be created).
2018-02-15 16:14:46 +00:00
Martin Sumner
834704a3ff Merge branch 'mas-i117-factor4scale' of https://github.com/martinsumner/leveled into mas-i117-factor4scale 2018-02-10 08:10:32 +00:00
Martin Sumner
f748fc8611 Narrower still
Make the LSM tree more bottle shaped.

Experiment to judge performance impact
2018-02-10 08:10:24 +00:00
Martin Sumner
5673d8b558 Expand test to ensure coverage catch 2018-02-10 08:09:33 +00:00
Martin Sumner
8113aebdcf Add timings for Level 3
Level 3 readings now relatively common - so time them separately
2018-02-09 08:59:21 +00:00
Martin Sumner
c7cea04aba Correct maximum length 2018-02-08 15:31:35 +00:00
Martin Sumner
7e4c3db915 Alternate scale factor
Also had a failed unit test - there was an issue with bit-flipping the position not being safely caught
2018-02-08 10:29:27 +00:00
Martin Sumner
bb39498ec5 Missed log line added back
.. and covered in test
2017-12-04 15:26:01 +00:00
Martin Sumner
f8ceedc9bb Compress L0 only
Doing this at L1 has a negative impact as tests draw on.  Also improve head time tracking
2017-12-04 10:49:42 +00:00
Martin Sumner
1f5d5033a4 Revert "Revert "Disable compression L0 and L1""
This reverts commit 958d3f5e14.
2017-12-04 09:30:27 +00:00
Martin Sumner
958d3f5e14 Revert "Disable compression L0 and L1"
This reverts commit b10c0cf895.
2017-12-04 09:29:44 +00:00
Martin Sumner
b10c0cf895 Disable compression L0 and L1 2017-12-02 09:19:17 +00:00
Martin Sumner
6e589942b6 Cover bit flips in the slot header 2017-12-01 16:20:48 +00:00
Martin Sumner
5bac389d0c Switch to CRC check at Block Level
Previously done at Slot Level - but Blocks were still read from disk after the Slot CRC had been checked.

This seems safer.  It requires an extra CRC check for every fetch.  However, CRC checking smaller binaries during the build process appears to be beneficial to performance.

It is hoped this will be an enabler to turning off compression at Levels 0 and 1 to improve performance (without having a compensating issue with reduced CRC performance)
2017-12-01 14:15:13 +00:00
Martin Sumner
7a99d060a3 Resolve OTP 19 compatibility
Dialyzer issues otherwise
2017-11-30 19:10:26 +00:00
Martin Sumner
9d9ad17d36 Typo 2017-11-30 16:29:10 +00:00
Martin Sumner
3b42bc28d1 Add build timing info to merge_list log
Help to determine what the expensive part of the operation is
2017-11-30 16:15:38 +00:00
Martin Sumner
41c308c5fd As used in lookup - will always be hash 2017-11-28 22:13:18 +00:00
Martin Sumner
eb90541a85 Add a small cache to SST file
so that a HEAD which follows a HEAD (e.g. when a GET follows a HEAD) has a chance of avoiding the binary_to_term CPU load
2017-11-28 14:56:40 +00:00
Martin Sumner
7c4e1a5ad9 Typo 2017-11-28 11:49:50 +00:00
Martin Sumner
a655baf881 Resolve R16 dialyzer issue 2017-11-28 11:49:29 +00:00
Martin Sumner
5342e3a94f Improve testing of bloom feature
In particular will blooms re-appear following startup
2017-11-28 11:43:46 +00:00
Martin Sumner
c2f19d8825 Switch to using bloom at penciller
Previously the tinybloom was used within the SST file as an extra check to remove false fetches.

However the SST already has a low FPR check in the slot_index.  If the new bloom was used (which is no longer per slot, but per SST), this can be shared with the penciller, and then the penciller could use it and avoid the message pass.

The message pass may be blocked by a 2i query or a slot fetch request for a merge.  So this should make performance within the Penciller snappier.

This is as a result of taking sst_timings within a volume test - where there was an average of +100 microseconds for each level that was dropped down.  Given the bloom/slot checks were < 20 microseconds - there seems to be some further delay.

The bloom is a binary of > 64 bytes - so passing it around should not require a copy.
2017-11-28 01:19:30 +00:00
Martin Sumner
467ad50cd1 Settle on hash approach 2017-11-27 15:29:01 +00:00
Martin Sumner
6b222ea8f9 Revert "Revert "Switch to 32 slots""
This reverts commit 99dad0c86f.
2017-11-27 15:17:40 +00:00
Martin Sumner
99dad0c86f Revert "Switch to 32 slots"
This reverts commit 7af6a3ba00.
2017-11-27 15:16:18 +00:00
Martin Sumner
7af6a3ba00 Switch to 32 slots 2017-11-27 15:16:10 +00:00
Martin Sumner
5e6541dddb Revert "Revert "Half size of each slot's bloom""
This reverts commit d37c5eab3f.
2017-11-27 14:51:25 +00:00
Martin Sumner
34dc63a8f8 Change measurement 2017-11-27 14:51:03 +00:00
Martin Sumner
d37c5eab3f Revert "Half size of each slot's bloom"
This reverts commit d83eea7c60.
2017-11-27 14:49:43 +00:00
Martin Sumner
d83eea7c60 Half size of each slot's bloom 2017-11-27 14:48:51 +00:00
Martin Sumner
c65dfa31d8 Add alternative bloom
Bloom filter that can take larger keys but is still efficient to build.  Allows the bloom filter to be checked without first determining the slot.  Also, as it represents the whole SST - it could be sent to the penciller to remove the need for a message pass.

The bloom is smaller and has a worse fpr than leveled_tinybloom.  Failing the bloom check isn't so bad - due to the slot index check being relatively fast and having a very low fpr.
2017-11-24 11:38:58 +00:00
Martin Sumner
f436cfd03e Add consistent timing points
Now all timing points should be made in a consistent fashion
2017-11-21 23:13:24 +00:00
Martin Sumner
3ef550d9f8 Refactor timing point management
For Penciller and timing head requests.
2017-11-21 19:58:36 +00:00
Martin Sumner
58946a7f98 Amend SST Timing Capture
Use the sampling mechanism from CDB timing capture.  Do it less often though - as there are far more SST fetches in comparison to CDB fetches.
2017-11-21 17:00:23 +00:00
Martin Sumner
495f6c3fd9 Re-introduce missing shortcut
Can't discover missing keys sooner by reporting missing on a zero hash.
2017-11-20 20:31:13 +00:00
Martin Sumner
52c7a023a1 Stop using list
Producing the list of all slots to try appeared to be expensive.  In volume tests taking 150 - 250 microseconds per GET.  Perhaps the list could be long (>1000), with a split and append, so not surprising.

Instead loop and count.
2017-11-20 20:01:21 +00:00
Martin Sumner
06f6604ac4 Always return passed in timings 2017-11-20 18:29:55 +00:00
Martin Sumner
5b4bc1ce59 Merge branch 'master' into mas-i108-cdbtimings 2017-11-20 17:34:50 +00:00
Martin Sumner
51f504fec5 Add extra slow_fetch test
sometimes ct tests don’t hit this - surprisingly
2017-11-20 17:29:57 +00:00
Martin Sumner
464baaa252 no_timing on key_check
reader was out of step with delete_pending state
2017-11-20 16:36:26 +00:00
Martin Sumner
faa5ef82aa Test logging of samples
To prompt the log the journal size needs to be reduced
2017-11-20 15:31:31 +00:00
Martin Sumner
de60a55be2 Add missing logref 2017-11-20 15:19:30 +00:00
Martin Sumner
fe0fc21461 Fix R16 dialyzer errors 2017-11-20 15:14:02 +00:00
Martin Sumner
8a43539090 Take sample timings from CDB files
Periodically get a CDB file process to take samples of how long fetching keys/values takes - and record those samples
2017-11-20 14:58:43 +00:00
Martin Sumner
62a84b95bb Add fadvise help to scan 2017-11-20 10:40:09 +00:00
Martin Sumner
0e071d078e fold_objects in SQN order
This adds a test that fold_objects works in SQN order
2017-11-17 18:30:51 +00:00
Martin Sumner
50c81d0626 Make ink fold more generic
Also makes the fold_from_sequence loop much easier to follow
2017-11-17 14:54:53 +00:00
Martin Sumner
39ad5c9680 Make inker fold generic
ink_loadpcl is in effect an inker fold - so abstract out the inker fold part to make this a generic capability
2017-11-15 16:08:24 +00:00