Commit graph

546 commits

Author SHA1 Message Date
martinsumner
b1a3b4ad13 Switch slot to gb_trees and size of 128 2016-12-24 00:02:06 +00:00
martinsumner
0cea470b70 Share final timing test 2016-12-23 23:30:15 +00:00
martinsumner
2d08816445 Confirm timings 2016-12-23 18:08:22 +00:00
martinsumner
4466210ac8 Revert back to slot size of 256
Changing the slot size higher has a significant impact on the fetch
time, although it allows for more constant time on write.  i.e. doubling
the size means 5 x cost of read, if only a 10% increase at write time.
2016-12-23 17:07:05 +00:00
martinsumner
b1429a7330 Experiment with slot width of 512 2016-12-23 16:49:16 +00:00
martinsumner
60bddbc874 More timing - and changes slot width 2016-12-23 13:17:59 +00:00
martinsumner
b37f3acb1e Extra timings 2016-12-23 12:44:44 +00:00
martinsumner
90e587dcee Initial functions and unit tests
Try to replace SFT files with one that more natively supports features
already in use (e.g. skiplist, tinybloom and magic_hash)
2016-12-23 12:30:58 +00:00
martinsumner
05ddcadbf9 Merge pull request #13 from martinsumner/mas-staggerhashtreewrite
Stopped unnecessary seek for last_key
2016-12-22 21:34:06 +00:00
martinsumner
0ddaaf9ac3 Stopped unnecessary seek for last_key
When rolling we already know the last_key - no need to seek for it on
startup.

The time it takes for this seek needs to be considered with regards to
startup time.  Can we do without knowing lastkey?
2016-12-22 19:51:39 +00:00
martinsumner
44cf5788ab Merge pull request #12 from martinsumner/mas-puttiming
Timing Points
2016-12-22 18:07:28 +00:00
martinsumner
a131b99082 Randomising logging of PUT timings 2016-12-22 17:33:14 +00:00
martinsumner
353fb08e21 Randomise logging of GET/HEAD samples 2016-12-22 17:28:41 +00:00
martinsumner
ee534081c3 Reduce log levels
Remove some log noise to debug level
2016-12-22 17:15:42 +00:00
martinsumner
151dd3ab70 Sample only for HEAD/GET response times
Report regularly but only on a sample
2016-12-22 16:47:36 +00:00
Martin Sumner
676e8fa494 Add Get Timing 2016-12-22 15:45:38 +00:00
Martin Sumner
7a0cf22909 put-timing default
Remove need for individual actors to know the defaults for put_timing
tuple
2016-12-22 14:41:43 +00:00
Martin Sumner
e9e0a7b323 Set higher logpoint
Expectation is for many HEAD requests - so only log every 100K
2016-12-22 14:36:57 +00:00
martinsumner
df350e1e6f Add unit test for head timing 2016-12-22 14:09:17 +00:00
martinsumner
130fb36ddd Add head timings
Include log breaking down timings of HEAD requests by result and level
2016-12-22 14:03:31 +00:00
martinsumner
ea20fc07f4 Maybe not 2016-12-21 21:56:33 +00:00
martinsumner
39d634c95b And again 2016-12-21 21:49:08 +00:00
martinsumner
3de146043b It's not like there's more than two hard things
D'oh
2016-12-21 21:41:54 +00:00
martinsumner
b2835aeaec Improve fetching efficiency
Experiment to see if parsing all keys in block can be avoided - and if
so does this make the range scan more efficient.

Unproven change.
2016-12-21 18:28:14 +00:00
martinsumner
be775127e8 Improve logging of merge activity and slow GETs
Look into speculation that collisions between fetch_range and fetch may
be an issue
2016-12-21 12:45:27 +00:00
martinsumner
f3e16dcd10 Add long-running logs 2016-12-21 01:56:12 +00:00
martinsumner
c193962c92 Sort out different timestamps 2016-12-20 23:16:52 +00:00
martinsumner
060ce2e263 Add put timing points 2016-12-20 23:11:50 +00:00
martinsumner
299e8e6de3 Initial phash test
phash does not appear to be a potential cause of delay
2016-12-20 20:55:56 +00:00
martinsumner
2a8e1afe41 Merge pull request #11 from martinsumner/mas-journalcorruptfail
Resolve failing recovery test
2016-12-16 23:58:37 +00:00
martinsumner
9e28287231 Resolve failing recovery test
Now passing consistently with a number of different corruptions catered
for (including corruption of the Tag in the Inker Key)
2016-12-16 23:18:55 +00:00
martinsumner
1684f0a913 Merge pull request #10 from martinsumner/mas_riakmetadata
Full Riak Metadata
2016-12-15 11:59:14 +00:00
martinsumner
4798dc1148 Write block theory
With riak metadata the sft files are about 2MB - so a group write count
of 64 is trying to write in 500 KB chunks.  Is this too big?
2016-12-14 22:03:38 +00:00
Martin Sumner
b92b511166 Revert "Experiment"
This reverts commit fe907eb479.
2016-12-14 18:49:47 +00:00
martinsumner
fe907eb479 Experiment
Perhaps the results are paging related - change settings to hold less in
memory and see
2016-12-14 11:38:42 +00:00
martinsumner
f4e2e274e0 Reintroduce riak metadata extraction
The full riak metadata had been stripped from the Ledger update for
performance reasons.  However, the full metadata is required in order to
save a GET before a PUT.  Therefore we want to do isolated testing on
this change to establish its cost relative to that saving.
2016-12-14 10:27:11 +00:00
martinsumner
bb1221a918 Merge pull request #9 from martinsumner/mas-addpclbloom
Mas addpclbloom
2016-12-14 10:11:18 +00:00
Martin Sumner
5efed94a1e Try slightly larger cache 2016-12-13 22:29:55 +00:00
martinsumner
b2c36ae541 Merge remote-tracking branch 'refs/remotes/origin/master' into mas-addpclbloom 2016-12-13 21:04:15 +00:00
martinsumner
baf4ca252f Revert "Experiment with temporary use of ETS table"
This reverts commit 2a106d0dc5.
2016-12-13 20:24:29 +00:00
martinsumner
2a106d0dc5 Experiment with temporary use of ETS table
Rather than an expensive lists:ukeymerge, try using a temporary ETS table.
2016-12-13 19:38:14 +00:00
martinsumner
bc5190a9bd Merge pull request #8 from martinsumner/mas-cdb-hashtree-refactor
Mas cdb hashtree refactor
2016-12-13 18:48:37 +00:00
martinsumner
c8be3bfa46 Slot hash corrected
When building the hashtree the incorrect IndexLength was being used to
calculate the slot - causing many queries to loop all the way round the
Index
2016-12-13 17:02:45 +00:00
martinsumner
8f775a88fd Investigate performance regression
Performance has regressed following the hashtable change.  Speculation
that the hashtable format might not be right, and so there is more
cycling around the hashtree.  Logging added.
2016-12-13 14:06:19 +00:00
martinsumner
52499170c0 Tidy logging following changes
Include detailed timings in a permanent log
2016-12-13 12:41:44 +00:00
Martin Sumner
cfc6a67638 Switch to ordered_set
Improved performance by a combination of switching to an ordered_set
(so a list can be extracted in a sane way), and building the binary
from an ordered list.
2016-12-13 12:35:30 +00:00
martinsumner
aa2d19df1d Revert back to handling list of binaries (but differently)
Performance from last commit got worse not better :-(

Perhaps better handling all as lists, and then building a binary at the
end.
2016-12-13 03:22:40 +00:00
martinsumner
972a0ee0b9 Refactor hash table write
Less looping and re-looping over list.  Uses ordering to build more
naturally.
2016-12-13 02:15:13 +00:00
martinsumner
52e21de298 Initial switch to using ETS
No real refactor of building hashtables at this stage - just using ETS
not an array of skiplists
2016-12-12 21:47:09 +00:00
martinsumner
8ccd02e893 Merge Tree issue
The attempt to refactor the writer meant that files were never reaching
the max slots - and so we were only ever stopping when the lists were
exhausted.  This meant that the merge tree just had a C0 and a C1 file!
2016-12-12 18:30:12 +00:00