Commit graph

42 commits

Author SHA1 Message Date
martinsumner
13c5cc8899 Add timing points
Add some timing points to manifest updates
2017-01-15 10:36:50 +00:00
martinsumner
8e92a2c563 Fix pclerk unit test 2017-01-15 01:47:23 +00:00
Martin Sumner
f868be8086 Change Penciller to Clerk handoff
correct the previous change to make sure that deletes are not confirmed
when work is outstanding, but also make sure that the clerk only ever
casts the Penciller so no deadlocks can ensue.
2017-01-15 00:52:43 +00:00
martinsumner
9577d24be0 Wait on close of penciller clerk
The clerk never calls the Penciller, so cannot deadlock.  Will wait for
a merge to be complete
2017-01-14 22:59:04 +00:00
martinsumner
dcf3afc056 Log basement setting when creating files 2017-01-14 21:19:51 +00:00
martinsumner
13c81f0ed1 Basic working
Some basic tests working - but still outstanding issues.
2017-01-14 19:41:09 +00:00
martinsumner
76bdd83346 Manifest refactor - STILL BROKEN
Some working tests now, but sitll broken
2017-01-14 16:36:05 +00:00
martinsumner
0204a23a58 Refactor - STILL BROKEN
Will at least compile, but in need of a massive eunit rewrite and
associated debug to get back to a potentially verifiable state again
2017-01-13 18:23:57 +00:00
martinsumner
08641e05cf Manifest changes - BROKEN
Going to abandond this branch for now.  The change is beoming
excessively time consuming, and it is not clear that a smaller change
might not achieve more of the objectives.

All this is broken - but perhaps could get picke dup another day.
2017-01-12 13:48:43 +00:00
martinsumner
7049aaf5ca Better attempt to handle empty file being generated 2016-12-29 09:35:58 +00:00
martinsumner
e01b310d20 Handle production of empty file 2016-12-29 05:09:47 +00:00
martinsumner
dc28388c76 Removed SFT
Now moved over to SST on this branch
2016-12-29 02:07:14 +00:00
martinsumner
c664483f03 Add basic merge support
No generates KV list first, and then creates a new SST
2016-12-28 21:47:05 +00:00
martinsumner
86bdfdeaf0 Reverted back out the additional bloom check
This is desirable to add back in going forward, but wasn't implemented
in a safe or clear way.

The way the bloom was or was not on the LoopState was clumsy, and it got
persisted in multiple places without a CRC check.

Intention to implement back in wherby it is requested on-demand by the
Penciller, and then the SFT worker lifts it off disk and CRC checks it.
So it is never on the SFT LoopState.  Also it will be easier to control
the logic over which levels have the bloom in the Penciller.
2016-12-11 21:01:10 +00:00
martinsumner
f96d148073 Make the merge_test a more sensible size
On the verge of a timeout.  Rather than keep battling with the timeout,
make it do less work
2016-12-11 20:17:05 +00:00
martinsumner
5cfe9a71e1 Wrap test with non-default timeout 2016-12-11 15:25:14 +00:00
martinsumner
523716e8f2 Add tiny bloom to Penciller Manifest
This is an attempt to save on unnecessary message transfers, and
slightly more expensive GCS checks in the SFT file itself.
2016-12-11 04:48:50 +00:00
martinsumner
2d3a40e6f1 Magic Hash - and no L0 Index
Move to using the DJ Bernstein Magic Hash consistently, and trying to
make sure we only hash once for each operation (as the hash is more
expensive than phash2).

The improved lookup time for missing keys should allow for the L0 index
to be removed, and hence speed up the completion time for push_mem
operations.

It is expected there will be a second stage of creating a tinybloom as
part of the SFT creation process, and then adding that tinybloom to the
manifest.  This will then reduce the message passing required for a GET
not in the cache or higher levels
2016-12-11 01:02:56 +00:00
martinsumner
8cbe2ef93a Coverage cheats
You juke the stats, and majors become colonels.  I've been here before
2016-11-14 20:43:38 +00:00
Martin Sumner
2d590c3077 More aggressive clerk 2016-11-05 17:50:28 +00:00
martinsumner
2299e1ce9d Prune unreachbale branches 2016-11-04 19:14:03 +00:00
martinsumner
2b8a37439d Log refinement - logging process IDs 2016-11-04 17:28:04 +00:00
martinsumner
7147ec0470 Logging - Phase 1
Abstract out logging and introduce a logbase
2016-11-02 18:14:46 +00:00
martinsumner
3b05874b8a Add initial timestamp support
Covered only by basic unit test at present.
2016-10-31 12:12:06 +00:00
martinsumner
95609702bd Penciller Memory Refactor
Plugged the ne wpencille rmemory into the Penciller, and took advantage
of the increased speed to simplify the callbacks involved.

The outcome is much simpler code
2016-10-30 18:25:30 +00:00
martinsumner
20cc17f916 Penciller Refactor
Removed o(100) lines of code by refactoring the Penciller to no longer
use ETS tables.  The code is less confusing, and probably not an awful
lot slower.
2016-10-27 20:56:18 +01:00
martinsumner
254183369e CDB - switch to gen_fsm
The CDB file management server has distinct states, and was growing case
logic to prevent certain messages from being handled in ceratin states,
and to handle different messages differently.  So this has now been
converted to a gen_fsm.

As part of resolving this, the space_clear_ondelete test has been
completed, and completing this revealed that the Penciller could not
cope with a change which emptied the ledger.  So a series of changes has
been handled to allow it to smoothly progress to an empty manifest.
2016-10-26 20:39:16 +01:00
martinsumner
e9c568a8b3 Test fix-up
There was a test that failed to close down a bookie and that caused some
issues.  The issues are double-reoslved, the close down was tidied as
well as the forgotten close being added back in.

There is some generla tidy around in anticipation of TTL support.
2016-10-21 21:26:28 +01:00
martinsumner
c431bf3b0a Broken snapshot test
The test confirming that deleting sft files wer eheld open whilst
snapshots were registered was actually broken.  This test has now been
fixed, as well as the logic in registring snapshots which had used
ledger_sqn mistakenly rather than manifest_sqn.
2016-10-21 11:38:30 +01:00
martinsumner
caa8d26e3e Stop file check
File check now covered by measure in the sft_new path, whihc will backup
any existing file before moving.

This gets triggered by incomplete changes on shutdown.
2016-10-20 19:18:49 +01:00
martinsumner
5c2029668d Tombstone preperation
Some initial code changes preparing for the test and implementation of
tombstones and tombstone reaping
2016-10-20 16:00:08 +01:00
martinsumner
cf66431c8e Smoother handling of back-pressure
The Penciller had two problems in previous commits:
- If it had a push_mem soon after a L0 file had been created, the
push_mem would stall waiting for the L0 file to complete - and this
count take 100-200ms
- The penciller's clerk favoured L0 work, but was lazy about asking for
other work in-between, so often the L1 layer was bursting over capacity
and the clerk was doing nothing but merging more L0 files in (with those
merges getting more and more expensive as they had to cover more and
more files)

There are some partial resolutions to this.  There is now an aggressive
timeout when checking whther the L0 file is ready on a push_mem, and if
the timeout is breached the error is caught and a 'returned' message
goes back to the Bookie.  the Bookie doesn't now empty its cache, it
carrie son filling it, but on some probability it will keep trying to
push_mem on future pushes.  This increases Jitter around the expensive
operation and split out the L0 delay into defined chunks.

The penciller's clerk is now more aggressive in asking for work.  There
is also some simplification of the relationship between clerk timeouts
and penciller back-pressure.

Also resolved is an issue of inconcistency between the loader and the on
startup (replaying the transaction log) and the standard push_mem
process.  The loader was not correctly de-duplicating by adding first
(in order) to a tree before outputting the list from the tree.

Some thought will be given later as to whether non-L0 work can be safely
prioritised if the merge process still keeps getting behind.
2016-10-20 02:23:45 +01:00
martinsumner
7319b8f415 Redundant clauses
Remove some redundant clauses, and fix up some logging
2016-10-19 20:51:30 +01:00
martinsumner
3e475f46e8 Support for 2i query part1
Added basic support for 2i query.  This involved some refactoring of the
test code to share functions between suites.

There is sill a need for a Part 2 as no tests currently cover removal of
index entries.
2016-10-18 01:59:18 +01:00
martinsumner
bbdac65f8d Split out key codec
Aplit out key codec, and also saner approach to key comparison (although
still awkward).
2016-10-13 21:02:15 +01:00
martinsumner
de54a28328 Load and Count test
This test exposed two bugs:
- Yet another set of off-by-one errors (really stupidly scanning the
Manifest from Level 1 not Level 0)
- The return of an old issue related to scanning the journal on load
whereby we fail to go back to the previous file before the current SQN
2016-10-13 17:51:47 +01:00
martinsumner
0a08867280 Iterator support
Add iterator support, used initially only for retrieving bucket
statistics.

The iterator is supported by exporting a function, and when the function
is claled it will take a snapshot of the ledger, run the iterator and
hten close the snapshot.

This required a numbe rof underlying changes, in particular to get key
comparison to work as "expected".  The code had previously misunderstood
how comparison worked between Erlang terms, and in particular did not
account for tuple length being compared first by size of the tuple (and
not just by each element in order).
2016-10-12 17:12:49 +01:00
martinsumner
d2cc07a9eb Doc update and clerk<->penciller changes
Reviewing code to update comments revealed a weakness in the sequence of
events between penciller and clerk committing a manifest change wherby
an ill-timed crash could lead to files being deleted without the
manifest changing.

A different, and safer pattern now used between theses two actors.
2016-10-09 22:33:45 +01:00
martinsumner
4a8a2c1555 Code reduction refactor
An attempt to refactor out more complex code.

The Penciller clerk and Penciller have been re-shaped so that there
relationship is much simpler, and also to make sure that they shut down
much more neatly when the clerk is busy to avoid crashdumps in ct tests.

The CDB now has a binary_mode - so that we don't do binary_to_term twice
... although this may have made things slower ??!!?  Perhaps the
is_binary check now required on read is an overhead.  Perhaps it is some
other mystery.

There is now a more effiicient fetching of the size on pcl_load now as
well.
2016-10-08 22:15:48 +01:00
martinsumner
d903f184fd Add initial end-to-end common tests
These tests highlighted some logical issues when scanning over databases
on startup, so fixes are wrapped in here.
2016-10-05 09:54:53 +01:00
martinsumner
d3e985ed80 Refactor Penciller Push
Two aspects of pushing to the penciller have been refactored:
1 - Allow the penciller to respond before the ETS table has been updated
to unlock the Bookie sooner.
2 - Change the way the copy of the memtable is stored to work more
effectively with snapshots wihtout locking the Penciller any further on
a snapshot or push request
2016-09-21 18:31:42 +01:00
martinsumner
aa7d235c4d Rename clerk and CDB Speed-Up
CDB did many "bitty" reads/writes when scanning or writing hash tables -
change these to bult reads and writes to speed up.

CDB also added capabilities to fetch positions and get keys by position
to help with iclerk role.
2016-09-20 16:13:36 +01:00
Renamed from src/leveled_clerk.erl (Browse further)