Commit graph

438 commits

Author SHA1 Message Date
Martin Sumner
fba70edc94 Stop sort
sort probably doesn’t help
2017-01-03 17:08:40 +00:00
martinsumner
70c6e52fa7 Remove logs for slot_cache 2017-01-03 15:27:28 +00:00
martinsumner
e1d843a2eb Remove lastfetch cache
It appears to have some benefit at lower levels, but overall has less
benefit at higher levels.  Probably not worth having unless it can be
controlled to go in at the basement only.
2017-01-03 15:26:44 +00:00
martinsumner
b6ae0e1af5 Fix broken SST cache 2017-01-03 13:03:59 +00:00
martinsumner
d28e5d639c Remove SST blooms 2017-01-03 09:12:41 +00:00
martinsumner
5b4c903d53 Check before update on bloom 2017-01-02 20:02:49 +00:00
martinsumner
31d4346806 Log improvements
Log on bad CRC.  SST timing logs were also not being seen, so log these
more frequently
2017-01-02 18:54:19 +00:00
martinsumner
b3e189b012 Protect against div by 0
Make sure that blooms are always at least 1 slot in size
2017-01-02 18:38:14 +00:00
martinsumner
baa644383d Make tinybloom size configurable
Allow the bloom size to vary depending on how many fetchable keys there
are - so there is no large bloom held if most of the keys are index
entries for example
2017-01-02 18:29:15 +00:00
martinsumner
972aa85012 Try three hash tinybloom
Improved fpr in three hash bloom - so examine performance
2017-01-02 18:09:36 +00:00
Martin Sumner
2079fff7f8 Switched to indexed blocks as slot implementation
Prior to this refactor, the slot had been made up of four blocks with
an external binary index.  Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
2017-01-02 10:47:04 +00:00
Martin Sumner
c0d959beff Five alternatives explored 2016-12-29 22:22:13 +00:00
martinsumner
b509e81cfd Ongoing timing tests 2016-12-29 14:14:09 +00:00
martinsumner
b855401696 Experiment
Want to experiment with different datatypes for the slot - maybe use a
raw list but with a mini hashtree index like the CDB file
2016-12-29 14:11:05 +00:00
martinsumner
41ee90a2ef OTP16 compatibility 2016-12-29 12:10:12 +00:00
martinsumner
a261d4793b Increase test size
Be able to read more into sample-based output
2016-12-29 12:01:42 +00:00
martinsumner
4784f8521a Entropy fiddle
Try and increase effectiveness of bloom by combining Magic Hash with
phash2
2016-12-29 11:59:07 +00:00
martinsumner
fb75a26497 Handle mismatch on expanding pointer
Remove the nasty legacy of hard-coding for a scan width of 1
2016-12-29 10:46:12 +00:00
martinsumner
8f0bf8b892 Fix overlapping _ references 2016-12-29 10:34:53 +00:00
martinsumner
afb28aa7d6 Switch iterator scan width to macro
And 4 seems a more reasonable number than 1
2016-12-29 10:21:57 +00:00
martinsumner
7049aaf5ca Better attempt to handle empty file being generated 2016-12-29 09:35:58 +00:00
martinsumner
0c543ae3ec Remove legacy logs 2016-12-29 05:10:11 +00:00
martinsumner
e01b310d20 Handle production of empty file 2016-12-29 05:09:47 +00:00
martinsumner
55386622f7 Fixed issues
Two issues - when the key range falls in-between two marks in the
summary, we didn't pick up any mark.  Then when trimming both right and
left, the left trim was being discarded.
2016-12-29 04:37:49 +00:00
martinsumner
5b9e68df99 Add some crash protection for empty return from to_range
Not clear though why it would occur.
2016-12-29 03:04:10 +00:00
martinsumner
3f3b36597a Add timer for SST creation 2016-12-29 02:55:28 +00:00
martinsumner
c3999110e2 Remove io:format from debugging 2016-12-29 02:47:21 +00:00
martinsumner
a665b8ea4f Tidy-up unused variable 2016-12-29 02:41:02 +00:00
martinsumner
0c4d949c7f State mixup in FSM 2016-12-29 02:40:09 +00:00
Martin Sumner
18f2b5660d Fix to ensure directory structure created 2016-12-29 02:31:10 +00:00
martinsumner
dc28388c76 Removed SFT
Now moved over to SST on this branch
2016-12-29 02:07:14 +00:00
martinsumner
c664483f03 Add basic merge support
Now generates KV list first, and then creates a new SST
2016-12-28 21:47:05 +00:00
martinsumner
3716de1c82 Revert back to sampling
timing logs should be based on a sample
2016-12-28 15:49:58 +00:00
martinsumner
cbad375373 Refactoring of skiplist ranges and support for sst ranges
The skiplist range code was needlessly complicated.  It may be faster
than the new code, but the complexity delta cannot be supported for such a
small change.

This was uncovered whilst troubleshooting the initial kv range test.
2016-12-28 15:48:04 +00:00
martinsumner
6e5f5d2d44 Alter ordering
don't try the cache hit before checking for presence, only look in the
cache if protecting a lookup from the persisted part
2016-12-24 18:13:55 +00:00
martinsumner
480820e466 Add hash to missing key test 2016-12-24 18:03:34 +00:00
martinsumner
8526106312 Test for missing keys 2016-12-24 17:59:07 +00:00
martinsumner
0d0ab32653 Some end-to-end testing 2016-12-24 17:48:31 +00:00
martinsumner
7a11e8b490 Some basic testing 2016-12-24 16:34:36 +00:00
martinsumner
58d8e60994 Some basic code layout work 2016-12-24 15:12:24 +00:00
martinsumner
cb654b1325 Build the table summary
The table summary will be a skiplist, and this and the slot binary will
be CRC checked
2016-12-24 01:23:40 +00:00
martinsumner
4f838f6f88 Settled on sizes
Also removed length check due to warning in Erlang guidance about
non-constant time nature of this command.  Intend to remove lengths from
elsewhere (especially when used simply for logging).
2016-12-24 00:41:50 +00:00
martinsumner
b1a3b4ad13 Switch slot to gb_trees and size of 128 2016-12-24 00:02:06 +00:00
martinsumner
0cea470b70 Share final timing test 2016-12-23 23:30:15 +00:00
martinsumner
2d08816445 Confirm timings 2016-12-23 18:08:22 +00:00
martinsumner
4466210ac8 Revert back to slot size of 256
Changing the slot size higher has a significant impact on the fetch
time, although it allows for more constant time on write.  i.e. doubling
the size means 5 x cost of read, if only a 10% increase at write time.
2016-12-23 17:07:05 +00:00
martinsumner
b1429a7330 Experiment with slot width of 512 2016-12-23 16:49:16 +00:00
martinsumner
60bddbc874 More timing - and changes slot width 2016-12-23 13:17:59 +00:00
martinsumner
b37f3acb1e Extra timings 2016-12-23 12:44:44 +00:00
martinsumner
90e587dcee Initial functions and unit tests
Try to replace SFT files with one that more natively supports features
already in use (e.g. skiplist, tinybloom and magic_hash)
2016-12-23 12:30:58 +00:00