Commit graph

157 commits

Author SHA1 Message Date
martinsumner
7b0b3e9b83 Add logging of Manifest SQN at startup 2017-01-14 22:26:26 +00:00
martinsumner
dcf3afc056 Log basement setting when creating files 2017-01-14 21:19:51 +00:00
martinsumner
85c03a61f9 Log changes 2017-01-14 20:11:01 +00:00
martinsumner
13c81f0ed1 Basic working
Some basic tests working - but still outstanding issues.
2017-01-14 19:41:09 +00:00
martinsumner
0204a23a58 Refactor - STILL BROKEN
Will at least compile, but in need of a massive eunit rewrite and
associated debug to get back to a potentially verifiable state again
2017-01-13 18:23:57 +00:00
martinsumner
08641e05cf Manifest changes - BROKEN
Going to abandon this branch for now.  The change is becoming
excessively time consuming, and it is not clear that a smaller change
might not achieve more of the objectives.

All this is broken - but perhaps could get picked up another day.
2017-01-12 13:48:43 +00:00
martinsumner
ed27a53452 New manifest code
The manifest had previously been a list for every level of the manifest,
and keys were found by folding over the list.  By Level 4 the list will
be 4096 items long, and so the fold would be expensive, and would be
required many times.

To make this less expensive an ETS table is to be used.  However, the ETS
table needs to be shared between snapshots and so in order to use the
ETS the entries to the table need to support multi-versioning - whereby
each clone can see a version of the table at the Manifest SQN the clone
is supporting.
2017-01-09 14:52:26 +00:00
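The multi-versioning described in the "New manifest code" message (ed27a53452) can be sketched roughly as follows. This is an illustrative Python sketch only, not the branch's Erlang/ETS code: the names `ManifestEntry`, `added_sqn`, `removed_sqn`, `visible` and `key_lookup` are hypothetical, and it assumes each entry records the Manifest SQN at which it was added and the SQN at which it was replaced, so that one shared table can serve clones at different SQNs.

```python
import math
from dataclasses import dataclass


@dataclass
class ManifestEntry:
    level: int
    start_key: str
    end_key: str
    filename: str
    added_sqn: int                 # Manifest SQN at which the entry became live
    removed_sqn: float = math.inf  # Manifest SQN at which it was replaced (assumed field)


def visible(entry, snapshot_sqn):
    """An entry is visible to a clone whose Manifest SQN lies in [added, removed)."""
    return entry.added_sqn <= snapshot_sqn < entry.removed_sqn


def key_lookup(entries, level, key, snapshot_sqn):
    """Return the file covering `key` at `level` as seen at `snapshot_sqn`."""
    for e in entries:
        if e.level == level and e.start_key <= key <= e.end_key and visible(e, snapshot_sqn):
            return e.filename
    return None


# One shared table; a file replaced at Manifest SQN 3 stays visible to older clones.
entries = [
    ManifestEntry(1, "a", "m", "f1.sst", added_sqn=1, removed_sqn=3),
    ManifestEntry(1, "a", "m", "f2.sst", added_sqn=3),
]
```

A clone at Manifest SQN 2 would resolve key "c" to f1.sst, while a clone at SQN 3 would resolve it to f2.sst. A real ETS-backed version would index by level and key rather than scanning a list.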
martinsumner
70c6e52fa7 Remove logs for slot_cache 2017-01-03 15:27:28 +00:00
martinsumner
b6ae0e1af5 Fix broken SST cache 2017-01-03 13:03:59 +00:00
martinsumner
31d4346806 Log improvements
Log on bad CRC, and also not seeing SST timing logs, so log these more
frequently
2017-01-02 18:54:19 +00:00
Martin Sumner
2079fff7f8 Switched to indexed blocks as slot implementation
Prior to this refactor, the slot had been made up of four blocks with
an external binary index.  Although the form of the index has changed
again, micro-benchmarking once again showed that this was a relatively
efficient mechanism.
2017-01-02 10:47:04 +00:00
martinsumner
0c543ae3ec Remove legacy logs 2016-12-29 05:10:11 +00:00
martinsumner
e01b310d20 Handle production of empty file 2016-12-29 05:09:47 +00:00
martinsumner
dc28388c76 Removed SFT
Now moved over to SST on this branch
2016-12-29 02:07:14 +00:00
martinsumner
3716de1c82 Revert back to sampling
timing logs should be based on a sample
2016-12-28 15:49:58 +00:00
martinsumner
cbad375373 Refactoring of skiplist ranges and support for sst ranges
The Skiplist range code was needlessly complicated.  It may be faster
than the new code, but the complexity delta cannot be supported for such a
small change.

This was uncovered whilst troubleshooting the initial kv range test.
2016-12-28 15:48:04 +00:00
martinsumner
0d0ab32653 Some end-to-end testing 2016-12-24 17:48:31 +00:00
martinsumner
58d8e60994 Some basic code layout work 2016-12-24 15:12:24 +00:00
martinsumner
0ddaaf9ac3 Stopped unnecessary seek for last_key
When rolling we already know the last_key - no need to seek for it on
startup.

The time it takes for this seek needs to be considered with regards to
startup time.  Can we do without knowing lastkey?
2016-12-22 19:51:39 +00:00
martinsumner
a131b99082 Randomising logging of PUT timings 2016-12-22 17:33:14 +00:00
martinsumner
353fb08e21 Randomise logging of GET/HEAD samples 2016-12-22 17:28:41 +00:00
martinsumner
ee534081c3 Reduce log levels
Remove some log noise to debug level
2016-12-22 17:15:42 +00:00
martinsumner
151dd3ab70 Sample only for HEAD/GET response times
Report regularly but only on a sample
2016-12-22 16:47:36 +00:00
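The sampled timing logs mentioned in the messages above ("Report regularly but only on a sample") might look something like the sketch below. It is a Python illustration under assumptions, not the project's logging code: `TimingSampler`, `sample_rate` and `log_every` are hypothetical names, and the summary format is invented for the example.

```python
import random


class TimingSampler:
    """Accumulate response times for a random sample and report periodically."""

    def __init__(self, sample_rate, log_every, rng=None):
        self.sample_rate = sample_rate  # probability that a request is sampled
        self.log_every = log_every      # emit a summary after this many samples
        self.rng = rng or random.Random()
        self._reset()

    def _reset(self):
        self.count = 0
        self.total = 0
        self.max = 0

    def record(self, micros):
        """Record one timing; return (count, mean, max) when a summary is due."""
        if self.rng.random() >= self.sample_rate:
            return None  # request not in the sample
        self.count += 1
        self.total += micros
        self.max = max(self.max, micros)
        if self.count < self.log_every:
            return None
        summary = (self.count, self.total / self.count, self.max)
        self._reset()
        return summary
```

With `sample_rate=0.01` and `log_every=1000`, for instance, a summary would be produced roughly once per 100K requests, which is in the spirit of the "only log every 100K" logpoint in e9e0a7b323.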
Martin Sumner
676e8fa494 Add Get Timing 2016-12-22 15:45:38 +00:00
Martin Sumner
7a0cf22909 put-timing default
Remove need for individual actors to know the defaults for put_timing
tuple
2016-12-22 14:41:43 +00:00
Martin Sumner
e9e0a7b323 Set higher logpoint
Expectation is for many HEAD requests - so only log every 100K
2016-12-22 14:36:57 +00:00
martinsumner
df350e1e6f Add unit test for head timing 2016-12-22 14:09:17 +00:00
martinsumner
130fb36ddd Add head timings
Include log breaking down timings of HEAD requests by result and level
2016-12-22 14:03:31 +00:00
martinsumner
be775127e8 Improve logging of merge activity and slow GETs
Look into speculation that collisions between fetch_range and fetch may
be an issue
2016-12-21 12:45:27 +00:00
martinsumner
f3e16dcd10 Add long-running logs 2016-12-21 01:56:12 +00:00
martinsumner
060ce2e263 Add put timing points 2016-12-20 23:11:50 +00:00
martinsumner
9e28287231 Resolve failing recovery test
Now passing consistently with a number of different corruptions catered
for (including corruption of the Tag in the Inker Key)
2016-12-16 23:18:55 +00:00
martinsumner
c8be3bfa46 Slot hash corrected
When building the hashtree the incorrect IndexLength was being used to
calculate the slot - causing many queries to loop all the way round the
Index
2016-12-13 17:02:45 +00:00
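The effect described in "Slot hash corrected" (c8be3bfa46) can be illustrated with a toy linear-probing index. This is a simplified Python sketch and not the on-disk hashtree format: `build` and `probes_to_find` are hypothetical helpers, and the lookup here scans until it finds a match or comes full circle.

```python
def build(hashes, table_size, slot_length):
    """Place each hash at hash % slot_length, probing linearly on collision."""
    table = [None] * table_size
    for h in hashes:
        slot = h % slot_length
        while table[slot] is not None:
            slot = (slot + 1) % table_size
        table[slot] = h
    return table


def probes_to_find(table, h):
    """Count the slots visited before h is found, starting at h % len(table)."""
    size = len(table)
    start = h % size
    for step in range(size):
        if table[(start + step) % size] == h:
            return step + 1
    return size  # went all the way round without a match


hashes = [3, 12, 21, 30]
good = build(hashes, table_size=8, slot_length=8)  # same length on both sides
bad = build(hashes, table_size=8, slot_length=5)   # wrong length at build time
good_probes = sum(probes_to_find(good, h) for h in hashes)
bad_probes = sum(probes_to_find(bad, h) for h in hashes)
```

When the builder and the reader agree on the index length, each key in this example is found on the first probe. With the wrong length at build time the reader starts in the wrong place and has to cycle around the index.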
martinsumner
8f775a88fd Investigate performance regression
Performance has regressed following the hashtable change.  Speculation
that the hashtable format might not be right, and so there is more
cycling around the hashtree.  Logging added.
2016-12-13 14:06:19 +00:00
martinsumner
52499170c0 Tidy logging following changes
Include detailed timings in a permanent log
2016-12-13 12:41:44 +00:00
martinsumner
8bcb49479d Re-introduce ETS Index
Add ETS Index back in to avoid having to check each skip list in turn.
Also this helps keep a lower skip list size.
2016-12-11 05:23:24 +00:00
martinsumner
ccc993383d Stop second hash on fetch_head
The bookie should magic_hash for fetch_head, and now passes the hash to
the Penciller so second hash not required.
2016-12-11 01:21:53 +00:00
martinsumner
2d3a40e6f1 Magic Hash - and no L0 Index
Move to using the DJ Bernstein Magic Hash consistently, and trying to
make sure we only hash once for each operation (as the hash is more
expensive than phash2).

The improved lookup time for missing keys should allow for the L0 index
to be removed, and hence speed up the completion time for push_mem
operations.

It is expected there will be a second stage of creating a tinybloom as
part of the SFT creation process, and then adding that tinybloom to the
manifest.  This will then reduce the message passing required for a GET
not in the cache or higher levels
2016-12-11 01:02:56 +00:00
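For reference, a Bernstein-style hash of the kind the "Magic Hash" message (2d3a40e6f1) refers to can be sketched as below. The message does not say which variant is used, so this Python sketch assumes the XOR form with seed 5381 and a 32-bit mask; the function name `magic_hash` is borrowed from the messages above, but the body is illustrative rather than the project's implementation.

```python
def magic_hash(data: bytes) -> int:
    """Bernstein-style hash (XOR variant, assumed), truncated to 32 bits."""
    h = 5381
    for b in data:
        h = ((h * 33) ^ b) & 0xFFFFFFFF
    return h


def lookup(key: bytes, stages):
    """Hash once, then hand the same hash to every stage of the lookup."""
    h = magic_hash(key)
    return [stage(key, h) for stage in stages]
```

The `lookup` helper illustrates the "only hash once for each operation" aim: the caller computes the hash and passes it down, rather than each stage hashing the key again.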
martinsumner
c40e5d2d30 Reduce log noise 2016-12-08 16:43:35 +00:00
Martin Sumner
8f83c5226d Add log of write ops 2016-11-26 22:59:33 +00:00
martinsumner
0f7e421371 Add destruction
Allow a store to be cleared out and destroyed
2016-11-21 12:34:40 +00:00
martinsumner
386d40928b Fast List Buckets
Copied the technique from HanoiDB to speed up list buckets.
2016-11-20 21:21:31 +00:00
martinsumner
f40ecdd529 Pick-up test misses
There were some coverage misses in tests, so check in unit test coverage
or remove branches not currently needed.
2016-11-18 21:35:45 +00:00
martinsumner
630f802780 Inker Close nastiness
Try to stop some of the potential deadlocking around Inker close and
prove that snapshots at higher Manifest SQNs can be ignored
2016-11-14 19:34:11 +00:00
martinsumner
75d6af75c6 Penciller review
The Penciller's attempt to close the L0 file if pending was unpredictable
in behaviour.  If a L0 file is still pending it will be lost - but this
is at least a predictable event.
2016-11-14 17:18:28 +00:00
martinsumner
44738f7c75 Deferred Deletion of Journals
This allows for deleted journals to be retained for a period (the
waste_retention_period).  The idea being that a backup strategy can
ensure that all journals are backed up, even ones created and removed
from within a backup period - so that any restore point is possible.

This is also a precursor to removing some of the PromptDelete
complexity from the Inker Clerk - all compactions can prompt deletion as
deletion is now deferred.
2016-11-14 11:17:14 +00:00
martinsumner
a73c233154 Correct the recording of excess work 2016-11-05 15:10:21 +00:00
martinsumner
4556573a5c Rationalise logging on push_mem 2016-11-05 13:42:44 +00:00
martinsumner
61c6269200 Penciller back-pressure - Phase 1
There were issues with how the Penciller behaves under heavy write
pressure - most particularly where there are a large number of keys per
update (i.e. 2i heavy objects).  Most immediately the attempt to check
whether the L0 file was ready slowed down the process of producing the
L0 file - so back-pressure created more back-pressure.

Going forward want to alter this most significantly, as the work
queue can also build up unsustainably.  There needs to be some pausing
prompted by the bookie on 'returned', and the use of 'returned' when the
work queue exceeds a threshold.
2016-11-05 11:22:27 +00:00
martinsumner
2b8a37439d Log refinement - logging process IDs 2016-11-04 17:28:04 +00:00