1198 commits

Author SHA1 Message Date
Martin Sumner
5cee3a8e4e Tidy up spec
Also remove _app and _sup, originally added for dialyzer (on the false understanding that it needed them)
2017-11-07 19:41:39 +00:00
Martin Sumner
bb15c1f780 Take out OTP files
Were only there for show
2017-11-07 16:22:49 +00:00
Martin Sumner
bea094aaf5 no non-binary objects in inker 2017-11-07 13:43:29 +00:00
Martin Sumner
332286f35c From inker kv - value cannot be a term 2017-11-07 13:42:12 +00:00
Martin Sumner
8f27b3b628 Merge branch 'master' into mas-aae-segementfoldplus 2017-11-07 11:22:56 +00:00
Martin Sumner
ed869dcada
Merge pull request #99 from martinsumner/mas-i95-lz4sst
Mas i95 lz4sst
2017-11-07 10:25:50 +00:00
Martin Sumner
0af0d85239 Add option description
Add documentation of new options
2017-11-07 10:22:27 +00:00
Martin Sumner
f358bd7622 Switch to using passed in compression method for maybe_compress
When the compaction discovers compression is required it will use the method passed in at startup - not the method which had been previously defined.
2017-11-06 21:16:46 +00:00
Martin Sumner
9a0a4ced0d Test LZ4 from uncompressed
Coverage issue
2017-11-06 19:42:35 +00:00
Martin Sumner
1d475235d1 Improve test coverage
Make compress on receipt/compaction configurable
2017-11-06 18:44:08 +00:00
Martin Sumner
61b7be5039 Make compression algorithm an option
Compression can be switched between LZ4 and zlib (native).

The setting to determine if compression should happen on receipt is now a macro definition in leveled_codec.
2017-11-06 15:54:58 +00:00
Martin Sumner
4c44e86eab Enable compression on receipt to Journal
Rather than always deferring compression until compaction
2017-11-05 22:33:32 +00:00
Martin Sumner
830906c552 Compress journal with lz4
When the value is a binary (which should be the case with Riak)
2017-11-05 21:48:57 +00:00
Martin Sumner
99428d0e55 Remove erroneously added file 2017-11-03 14:26:51 +00:00
Martin Sumner
0ecb83f8ec Remove erroneously added files 2017-11-03 14:26:18 +00:00
Martin Sumner
9fa8ed6cca Add LZ4 2017-11-03 14:18:49 +00:00
Martin Sumner
4fbb770a8c Revert "Failed attempt to hack in LZ4"
This reverts commit 912920a53c.
2017-11-03 11:47:25 +00:00
Martin Sumner
912920a53c Failed attempt to hack in LZ4 2017-11-03 11:47:00 +00:00
Martin Sumner
c6749e61a9 Split out block serialisation
To allow for alternate compression scenarios to be more easily tested
2017-11-03 11:04:31 +00:00
Martin Sumner
c8ad39b33b foldheads_bybucket adds segment list support
Accelerate queries for foldheads_bybucket as well
2017-11-01 22:00:12 +00:00
Martin Sumner
6beeadc7d8 Simplify SnapSQN check
Less ugly
2017-11-01 19:31:20 +00:00
Martin Sumner
6e0bf7bce3 Remove empty check
It is not obvious why empty binaries can't be merged this way.  BothEmpty seems a pointless expression - and it never gets hit by test coverage.
2017-11-01 18:19:37 +00:00
Martin Sumner
400202e38d Simplify test assertion 2017-11-01 17:50:01 +00:00
Martin Sumner
5b5b4a3a29 Test coverage
Code no longer requires LongRunning to be undefined so that it can be decided through best guess.

Also cover branches of tictac tree code.
2017-11-01 17:14:19 +00:00
Martin Sumner
53c3bf6c37 Remove get_slotid
Had been used in some debug logging - now not called
2017-11-01 17:05:35 +00:00
Martin Sumner
2428d2cbff Add test with presence check
Add a test in each loop with a check for the presence of the object in the Journal
2017-11-01 15:23:28 +00:00
Martin Sumner
ee7f9ee4e0 Test coverage
... and column width formatting
2017-11-01 15:11:14 +00:00
Martin Sumner
033cf1954d Add check for too-small trees
Provide a function for generating segmentfilter lists so that it can handle trees that are "too small".

Test those smaller trees - plus also false positives and cold caches
2017-11-01 13:18:01 +00:00
Martin Sumner
6099dd1367 Add warning about smaller tree sizes 2017-11-01 11:54:11 +00:00
Martin Sumner
81180e9310 Add tests for different tree sizes
Note that accelerating segment_list queries will not work for tree sizes smaller than small.  How to flag this up?

Should smaller tree sizes just be removed from leveled_tictac?
2017-11-01 11:51:51 +00:00
Martin Sumner
f80aae7d78 Type typo 2017-10-31 23:35:57 +00:00
Martin Sumner
b141dd199c Allow for segment-acceleration of folds
Initially with basic tests.  If the SlotIndex has been cached, we can now use the slot index as it is based on the Segment hash algorithm.

This looks like it should lead to an order of magnitude improvement in querying for keys/clocks by segment ID.

This also required a slight tweak to the penciller keyfolder.  It now caches the next answer from the SSTiter, rather than restarting the iterator.  When the IMMiter has many more entries than the SSTiter (as the SSTiter is being filtered but not the IMMiter), restarting the iterator could lead to lots of repeated folding.
2017-10-31 23:28:35 +00:00
Martin Sumner
f5878548f9 Make binary Riak bucket/keys a special case
When leveled is used with Riak, buckets and keys are always binaries.  So we can treat them as such.

We want to move tictac tree testing away from the leveled internal tests, to a set of tests for the Riak scenario, so riak_SUITE has been created for this and other Riak-specific backend tests.
2017-10-30 17:39:21 +00:00
Martin Sumner
6bb7ceef0c Attempt to standardise on segment hashes
To allow the segment hash that accelerates queries to be re-used in tictac tree related queries.
2017-10-30 13:57:41 +00:00
Martin Sumner
7763df3cef Merge pull request #98 from martinsumner/mas-segid-cryptohash
Mas segid cryptohash
2017-10-25 10:02:04 +01:00
Martin Sumner
e24eaf655b Revert to previous standard slot size
But maintain configurability of slot size to maximum
2017-10-25 08:59:34 +01:00
Martin Sumner
a22610cee7 Experiment with alternate slot size
Improves fpr.  Does this change anything in volume tests?
2017-10-24 17:58:33 +01:00
Martin Sumner
6af1d3b003 Use more keys in bloom
Use 4 keys in the bloom (which is closer to optimal size).  This should halve the fpr - as we can now use the large ExtraHash rather than being constrained by the SegmentHash here.
2017-10-24 15:42:53 +01:00
Martin Sumner
f08faf6432 Revert "Revert "Check fpr with 4 keys""
This reverts commit 74c28b52c9.
2017-10-24 15:22:12 +01:00
Martin Sumner
74c28b52c9 Revert "Check fpr with 4 keys"
This reverts commit d5bcccf0ec.
2017-10-24 15:21:07 +01:00
Martin Sumner
d5bcccf0ec Check fpr with 4 keys
Up key count in bloom
2017-10-24 15:20:59 +01:00
Martin Sumner
29a2d9fc35 Revert "Use lower fpr tinyblooms"
This reverts commit 3fd5260cd9.
2017-10-24 15:16:25 +01:00
Martin Sumner
3fd5260cd9 Use lower fpr tinyblooms
... but maybe they're slower?
2017-10-24 15:15:15 +01:00
Martin Sumner
26aa573ce1 Switch segment and extra hash
More entropy by using the position index with the segment hash - so this would be a better filter to apply.

Also could increase the key count now, as extra hash can be larger.

As an aside - a leveled_iclerk unit test failure appeared - the range was just wrong.  Don't know why this started happening.
2017-10-24 14:32:04 +01:00
Martin Sumner
36264eb416 Search range failure
Discovered a bug with search ranges in leveled_tree - this was uncovered by an intermittently failing 19.3 test.

Test case added and bug fixed.  It was due to a failure to use the end_key passed, causing issues with particular manifests and full bucket ranges.
2017-10-24 13:19:30 +01:00
Martin Sumner
a128dcdadf Change hash algorithm for penciller
Switch from magic hash to md5 - to hopefully remove the need for some
of the artificial jumps required to get expected false positive ratios.

Also split the hash into two 16-bit integers.  We assume that SegmentID
(from the perspective of AAE merkle/tictac trees) will always be at
least 16 bits.  The idea is that hashes should be used in blooms and
indexes such that some advantage can be gained from just knowing the
segmentID - in particular when folding over all the keys in a bucket.

Performance testing has been difficult so far - I think due to “cloud”
mysteries.
2017-10-20 23:04:29 +01:00
Martin Sumner
ede0982b2d Merge branch 'mas-bloomtest' into mas-segid-cryptohash 2017-10-20 20:47:21 +01:00
Martin Sumner
1964f1055b Add test timeout 2017-10-19 21:44:07 +01:00
Martin Sumner
f38d3fde4b Test frequency change 2017-10-19 13:56:07 +01:00
Martin Sumner
87731a85f5 Loop test 2017-10-19 13:51:32 +01:00