martinsumner
f1d26e279c
Merge branch 'mas-altsst' of https://github.com/martinsumner/eleveleddb into mas-altsst
2017-01-04 14:26:19 +00:00
martinsumner
7d95fa6bbc
Switch summary index
...
Simplify the summary index implementation
2017-01-04 14:26:11 +00:00
Martin Sumner
8289c3b783
full reversion
2017-01-04 00:26:52 +00:00
Martin Sumner
85aaccfe31
Revert to non-split tinybloom
2017-01-03 23:53:57 +00:00
Martin Sumner
be1d678d85
Revert to two hash tiny bloom
2017-01-03 23:43:43 +00:00
martinsumner
2f3eb18548
Re-add usort
...
Change one thing at a time
2017-01-03 18:26:54 +00:00
martinsumner
6ab9f72d8c
Merge branch 'mas-altsst' of https://github.com/martinsumner/eleveleddb into mas-altsst
2017-01-03 18:20:36 +00:00
martinsumner
c4ebaa9f57
Tidy Up All Hashes
...
As we're no longer generating a summary bloom - no need to collect a big
list of hashes whilst building the sst file
2017-01-03 18:20:28 +00:00
Martin Sumner
fba70edc94
Stop sort
...
sort probably doesn’t help
2017-01-03 17:08:40 +00:00
martinsumner
70c6e52fa7
Remove logs for slot_cache
2017-01-03 15:27:28 +00:00
martinsumner
e1d843a2eb
Remove lastfetch cache
...
It appears to have some benefit at lower levels, but overall has less
benefit at higher levels. Probably not worth having unless it can be
controlled to go in at the basement only.
2017-01-03 15:26:44 +00:00
martinsumner
b6ae0e1af5
Fix broken SST cache
2017-01-03 13:03:59 +00:00
martinsumner
d28e5d639c
Remove SST blooms
2017-01-03 09:12:41 +00:00
martinsumner
5b4c903d53
Check before update on bloom
2017-01-02 20:02:49 +00:00
martinsumner
31d4346806
Log improvements
...
Log on bad CRC, and also not seeing SST timing logs, so log these more
frequently
2017-01-02 18:54:19 +00:00
martinsumner
b3e189b012
Protect against div by 0
...
Make sure that blooms are always at least 1 slot in size
2017-01-02 18:38:14 +00:00
martinsumner
baa644383d
Make tinybloom size configurable
...
Allow the bloom size to vary depending on how many fetchable keys there
are - so there is no large bloom held if most of the keys are index
entries for example
2017-01-02 18:29:15 +00:00
martinsumner
972aa85012
Try three hash tinybloom
...
Improved fpr in three hash bloom - so examine performance
2017-01-02 18:09:36 +00:00
Martin Sumner
2079fff7f8
Switched to indexed blocks as slot implementation
...
Prior to this refactor, the slot had been made up of four blocks with
an external binary index. Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
2017-01-02 10:47:04 +00:00
Martin Sumner
c0d959beff
Five alternatives explored
2016-12-29 22:22:13 +00:00
martinsumner
b509e81cfd
Ongoing timing tests
2016-12-29 14:14:09 +00:00
martinsumner
b855401696
Experiment
...
Want to experiment with different datatypes for the slot - maybe use a
raw list but with a mini hashtree index like the CDB file
2016-12-29 14:11:05 +00:00
martinsumner
41ee90a2ef
OTP16 compatibility
2016-12-29 12:10:12 +00:00
martinsumner
a261d4793b
Increase test size
...
Be able to read more into sample-based output
2016-12-29 12:01:42 +00:00
martinsumner
4784f8521a
Entropy fiddle
...
Try and increase effectiveness of bloom by combining Magic Hash with
phash2
2016-12-29 11:59:07 +00:00
martinsumner
fb75a26497
Handle mismatch on expanding pointer
...
Remove the nasty legacy of hard-coding for a scan width of 1
2016-12-29 10:46:12 +00:00
martinsumner
8f0bf8b892
Fix overlapping _ references
2016-12-29 10:34:53 +00:00
martinsumner
afb28aa7d6
Switch iterator scan width to macro
...
And 4 seems a more reasonable number than 1
2016-12-29 10:21:57 +00:00
martinsumner
7049aaf5ca
Better attempt to handle empty file being generated
2016-12-29 09:35:58 +00:00
martinsumner
0c543ae3ec
Remove legacy logs
2016-12-29 05:10:11 +00:00
martinsumner
e01b310d20
Handle production of empty file
2016-12-29 05:09:47 +00:00
martinsumner
55386622f7
Fixed issues
...
Two issues - when the key range falls in-between two marks in the
summary, we didn't pick up any mark. Then when trimming both right and
left, the left trim was being discarded.
2016-12-29 04:37:49 +00:00
martinsumner
5b9e68df99
Add some crash protection for empty return from to_range
...
Not clear though why it would occur.
2016-12-29 03:04:10 +00:00
martinsumner
3f3b36597a
Add timer for SST creation
2016-12-29 02:55:28 +00:00
martinsumner
c3999110e2
Remove io:format from debugging
2016-12-29 02:47:21 +00:00
martinsumner
a665b8ea4f
Tidy-up unused variable
2016-12-29 02:41:02 +00:00
martinsumner
0c4d949c7f
State mixup in FSM
2016-12-29 02:40:09 +00:00
Martin Sumner
18f2b5660d
Fix to ensure directory structure created
2016-12-29 02:31:10 +00:00
martinsumner
dc28388c76
Removed SFT
...
Now moved over to SST on this branch
2016-12-29 02:07:14 +00:00
martinsumner
c664483f03
Add basic merge support
...
Now generates KV list first, and then creates a new SST
2016-12-28 21:47:05 +00:00
martinsumner
3716de1c82
Revert back to sampling
...
timing logs should be based on a sample
2016-12-28 15:49:58 +00:00
martinsumner
cbad375373
Refactoring of skiplist ranges and support for sst ranges
...
The Skiplist range code was needlessly complicated. It may be faster
than the new code, but the complexity delta cannot be supported for such a
small change.
This was uncovered whilst troubleshooting the initial kv range test.
2016-12-28 15:48:04 +00:00
martinsumner
6e5f5d2d44
Alter ordering
...
don't try the cache hit before checking for presence, only look in the
cache if protecting a lookup from the persisted part
2016-12-24 18:13:55 +00:00
martinsumner
480820e466
Add hash to missing key test
2016-12-24 18:03:34 +00:00
martinsumner
8526106312
Test for missing keys
2016-12-24 17:59:07 +00:00
martinsumner
0d0ab32653
Some end-to-end testing
2016-12-24 17:48:31 +00:00
martinsumner
7a11e8b490
Some basic testing
2016-12-24 16:34:36 +00:00
martinsumner
58d8e60994
Some basic code layout work
2016-12-24 15:12:24 +00:00
martinsumner
cb654b1325
Build the table summary
...
The table summary will be a skiplist, and this and the slot binary will
be CRC checked
2016-12-24 01:23:40 +00:00
martinsumner
4f838f6f88
Settled on sizes
...
Also removed length check due to warning in Erlang guidance about
non-constant time nature of this command. Intend to remove lengths from
elsewhere (especially when used simply for logging).
2016-12-24 00:41:50 +00:00