martinsumner
5a88565c08
Switch to binary index in pmem
...
Remove the ets index in pmem and use a binary index instead. This may
be slower, but avoids the bulk upload to ets, and means that matches
know of position (so only skiplists with a match need be tried).
Also stops the discrepancy between snapshots and non-snapshots - as
previously the snapshots were always slowed by not having access to the
ETS table.
2017-01-05 21:58:33 +00:00
martinsumner
1d3fb18df7
Resolve snapshotting issue
...
Need to make sure the extract from ets happens at the point the snapshot
is taken.
2017-01-05 18:43:55 +00:00
martinsumner
2c828b8eca
Fix snapshot issue
2017-01-05 17:55:27 +00:00
martinsumner
e6270d288f
Half-way to ets for Bookie mem
...
A half-way implementation with use of ETS as the bookie's memory
2017-01-05 17:00:12 +00:00
martinsumner
bbdb35ae03
Add ordered_set conversion
...
Can we go from an ets table to a skiplist?
2017-01-05 14:09:39 +00:00
martinsumner
34a25bdb88
Improve from_list in skiplist
...
from_list had taken a surprising amount of time - so improved the
efficiency of this
2017-01-05 13:57:38 +00:00
martinsumner
c43014a0ee
Merge pull request #14 from martinsumner/mas-altsst
...
Mas altsst
2017-01-04 23:30:42 +00:00
martinsumner
2f8ff640a9
Test coverage
...
Add some further unit tests to improve test coverage
2017-01-04 21:36:59 +00:00
martinsumner
6e8f8a9c86
Strip out extra stuff from skiplist
2017-01-04 17:19:27 +00:00
martinsumner
f1d26e279c
Merge branch 'mas-altsst' of https://github.com/martinsumner/eleveleddb into mas-altsst
2017-01-04 14:26:19 +00:00
martinsumner
7d95fa6bbc
Switch summary index
...
Simplify the summary index implementation
2017-01-04 14:26:11 +00:00
Martin Sumner
8289c3b783
full reversion
2017-01-04 00:26:52 +00:00
Martin Sumner
85aaccfe31
Revert to non-split tinybloom
2017-01-03 23:53:57 +00:00
Martin Sumner
be1d678d85
Revert to two hash tiny bloom
2017-01-03 23:43:43 +00:00
martinsumner
2f3eb18548
Re-add usort
...
Change one thing at a time
2017-01-03 18:26:54 +00:00
martinsumner
6ab9f72d8c
Merge branch 'mas-altsst' of https://github.com/martinsumner/eleveleddb into mas-altsst
2017-01-03 18:20:36 +00:00
martinsumner
c4ebaa9f57
Tidy Up All Hashes
...
As we're no longer generating a summary bloom - no need to collect a big
list of hashes whilst building the sst file
2017-01-03 18:20:28 +00:00
Martin Sumner
fba70edc94
Stop sort
...
sort probably doesn’t help
2017-01-03 17:08:40 +00:00
martinsumner
70c6e52fa7
Remove logs for slot_cache
2017-01-03 15:27:28 +00:00
martinsumner
e1d843a2eb
Remove lastfetch cache
...
It appears to have some benefit at lower levels, but overall has less
benefit at higher levels. Probably not worth having unless it can be
controlled to go in at the basement only.
2017-01-03 15:26:44 +00:00
martinsumner
b6ae0e1af5
Fix broken SST cache
2017-01-03 13:03:59 +00:00
martinsumner
d28e5d639c
Remove SST blooms
2017-01-03 09:12:41 +00:00
martinsumner
5b4c903d53
Check before update on bloom
2017-01-02 20:02:49 +00:00
martinsumner
31d4346806
Log improvements
...
Log on bad CRC, and also not seeing SST timing logs, so log these more
frequently
2017-01-02 18:54:19 +00:00
martinsumner
b3e189b012
Protect against div by 0
...
Make sure that blooms are always at least 1 slot in size
2017-01-02 18:38:14 +00:00
martinsumner
baa644383d
Make tinybloom size configurable
...
Allow the bloom size to vary depending on how many fetchable keys there
are - so there is no large bloom held if most of the keys are index
entries for example
2017-01-02 18:29:15 +00:00
martinsumner
972aa85012
Try three hash tinybloom
...
Improved fpr in three hash bloom - so examine performance
2017-01-02 18:09:36 +00:00
Martin Sumner
2079fff7f8
Switched to indexed blocks as slot implementation
...
Prior to this refactor, the slot had been made up of four blocks with
an external binary index. Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
2017-01-02 10:47:04 +00:00
Martin Sumner
c0d959beff
Five alternatives explored
2016-12-29 22:22:13 +00:00
martinsumner
b509e81cfd
Ongoing timing tests
2016-12-29 14:14:09 +00:00
martinsumner
b855401696
Experiment
...
Want to experiment with different datatypes for the slot - maybe use a
raw list but with a mini hashtree index like the CDB file
2016-12-29 14:11:05 +00:00
martinsumner
41ee90a2ef
OTP16 compatibility
2016-12-29 12:10:12 +00:00
martinsumner
a261d4793b
Increase test size
...
Be able to read more into sample-based output
2016-12-29 12:01:42 +00:00
martinsumner
4784f8521a
Entropy fiddle
...
Try and increase effectiveness of bloom by combining Magic Hash with
phash2
2016-12-29 11:59:07 +00:00
martinsumner
fb75a26497
Handle mismatch on expanding pointer
...
Remove the nasty legacy of hard-coding for a scan width of 1
2016-12-29 10:46:12 +00:00
martinsumner
8f0bf8b892
Fix overlapping _ references
2016-12-29 10:34:53 +00:00
martinsumner
afb28aa7d6
Switch iterator scan width to macro
...
And 4 seems a more reasonable number than 1
2016-12-29 10:21:57 +00:00
martinsumner
7049aaf5ca
Better attempt to handle empty file being generated
2016-12-29 09:35:58 +00:00
martinsumner
0c543ae3ec
Remove legacy logs
2016-12-29 05:10:11 +00:00
martinsumner
e01b310d20
Handle production of empty file
2016-12-29 05:09:47 +00:00
martinsumner
55386622f7
Fixed issues
...
Two issues - when the key range falls in-between two marks in the
summary, we didn't pick up any mark. Then when trimming both right and
left, the left trim was being discarded.
2016-12-29 04:37:49 +00:00
martinsumner
5b9e68df99
Add some crash protection for empty return from to_range
...
Not clear though why it would occur.
2016-12-29 03:04:10 +00:00
martinsumner
3f3b36597a
Add timer for SST creation
2016-12-29 02:55:28 +00:00
martinsumner
c3999110e2
Remove io:format from debugging
2016-12-29 02:47:21 +00:00
martinsumner
a665b8ea4f
Tidy-up unused variable
2016-12-29 02:41:02 +00:00
martinsumner
0c4d949c7f
State mixup in FSM
2016-12-29 02:40:09 +00:00
Martin Sumner
18f2b5660d
Fix to ensure directory structure created
2016-12-29 02:31:10 +00:00
martinsumner
dc28388c76
Removed SFT
...
Now moved over to SST on this branch
2016-12-29 02:07:14 +00:00
martinsumner
c664483f03
Add basic merge support
...
Now generates KV list first, and then creates a new SST
2016-12-28 21:47:05 +00:00
martinsumner
3716de1c82
Revert back to sampling
...
timing logs should be based on a sample
2016-12-28 15:49:58 +00:00