Commit graph

178 commits

Author SHA1 Message Date
martinsumner
95d5e12ce7 Switch to using ets set as index of L0 cache
Hope is that this will cause less garbage collection, and also will be
slightly faster.

Note that snapshots don't now get an index - they get the special index
'snap'.  However, the SkipLists have bloom protection, and most
snapshots are iterators not fetchers.
2016-12-10 14:15:35 +00:00
martinsumner
626a8e63f9 Experiment converting CDB to use skiplist not gb_tree
Might insertion time be faster?
2016-12-10 10:55:35 +00:00
martinsumner
d2bd01eaf1 Add fast fail to skiplist
Add a bloom filter to the skiplist, to make it faster at returning not
found.  The SkipList is now encapsulated within a dict().
2016-12-09 18:30:40 +00:00
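The fast-fail idea in commit d2bd01eaf1 can be sketched roughly as below, in Python rather than the project's Erlang. This is an illustration under assumptions, not the project's code: the class name, the filter size, the hash scheme and the plain dict standing in for the skiplist are all hypothetical.

```python
import hashlib


class BloomGuardedMap:
    """Hypothetical stand-in: a dict plays the role of the skiplist."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits            # assumed filter width
        self.hashes = hashes        # assumed number of hash probes
        self.filter = 0             # bit set held in a single int
        self.store = {}
        self.store_lookups = 0      # lookups that got past the filter

    def _positions(self, key):
        # Derive `hashes` bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def put(self, key, value):
        for pos in self._positions(key):
            self.filter |= 1 << pos
        self.store[key] = value

    def get(self, key, default=None):
        # A clear bit proves the key was never added, so the store
        # is not consulted at all: this is the "fast fail".
        if any(not (self.filter >> pos) & 1 for pos in self._positions(key)):
            return default
        self.store_lookups += 1
        return self.store.get(key, default)
```

A set bit pattern can still be a false positive, so a hit in the filter only means the underlying structure has to be checked; a miss is definitive.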
martinsumner
f0db730f07 Adjust jitter settings 2016-12-09 16:34:15 +00:00
martinsumner
82cb49638a Attempt at performance improvement
Try to add some extra jitter into the process of L0 writes, and also
make L0 writes delayed to help with buffering
2016-12-09 14:36:03 +00:00
Martin Sumner
fe080895fd Revert type definition
Can’t find a type definition supported in both OTP 16 and OTP 18, so
reverting to not defining type
2016-11-25 18:22:35 +00:00
Martin Sumner
ec32a7e3eb OTP16 compliance - array type 2016-11-25 18:20:17 +00:00
martinsumner
03d025d581 Replace ledger-side gb_trees
Try to make minimal change to replace gb_trees with gb_tree API-like
skiplists
2016-11-25 14:50:13 +00:00
martinsumner
638fc69e01 Correctly set array type
Otherwise cannot compile in both OTP 16 and 17
2016-11-21 22:26:12 +00:00
martinsumner
0f7e421371 Add destruction
Allow a store to be cleared out and destroyed
2016-11-21 12:34:40 +00:00
martinsumner
386d40928b Fast List Buckets
Copied the technique from HanoiDB to speed up list buckets.
2016-11-20 21:21:31 +00:00
martinsumner
f40ecdd529 Pick-up test misses
There were some coverage misses in tests, so check in unit test coverage
or remove branches not currently needed.
2016-11-18 21:35:45 +00:00
martinsumner
8cbe2ef93a Coverage cheats
You juke the stats, and majors become colonels.  I've been here before
2016-11-14 20:43:38 +00:00
martinsumner
630f802780 Inker Close nastiness
Try to stop some of the potential deadlocking around Inker close and
prove that snapshots at higher Manifest SQNs can be ignored
2016-11-14 19:34:11 +00:00
martinsumner
75d6af75c6 Penciller review
The penciller attempt to close the L0 file if pending was unpredictable
in behaviour.  If a L0 file is still pending it will be lost - but this
is at least a predictable event.
2016-11-14 17:18:28 +00:00
martinsumner
8035583301 Comment review 2016-11-07 11:17:13 +00:00
martinsumner
4583460328 Clean API of Riak-specific Methods
Clean the API of Riak-specific methods, and also resolve timing issue in
simple_server unit test.  Previously this would end up with missing data
(and a lower sequence number after start) because of the penciller_clerk
timeout being relatively large in the context of this test.  Now the
timeout has been reduced the L0 slot is cleared by the time of the
close.  To make sure, an extra sleep has been added as a precaution to
avoid any intermittent issues.
2016-11-07 10:11:57 +00:00
Martin Sumner
a7ed3e4b85 Trim dead branches
Also an experiment with altering the slowoffer_delay.
2016-11-05 15:59:31 +00:00
martinsumner
a73c233154 Correct the recording of excess work 2016-11-05 15:10:21 +00:00
martinsumner
376176eba3 Correct overlap in naming with Backlog 2016-11-05 14:35:01 +00:00
martinsumner
7f456fa993 Add back-pressure on work queue limit
Previously under heavy load, as long as L0 was being cleared, the ledger
would keep accepting.  Now there is a formal limit on how far behind the
work queue (of compactions required at other levels) can fall before the
brake is applied to new updates coming in.
2016-11-05 14:04:45 +00:00
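The work-queue limit in commit 7f456fa993 might look something like the following sketch, written in Python rather than the project's Erlang. The class, the threshold value and the assumption that each accepted batch adds one unit of compaction work are all hypothetical; only the 'returned' reply is a term taken from this log.

```python
class Ledger:
    """Hypothetical model of a store that refuses writes when
    compaction work has fallen too far behind."""

    def __init__(self, backlog_limit=4):
        self.backlog_limit = backlog_limit   # assumed threshold
        self.pending_compactions = 0
        self.accepted = []

    def push(self, batch):
        # Apply the brake once the backlog exceeds the limit.
        if self.pending_compactions > self.backlog_limit:
            return "returned"
        self.accepted.append(batch)
        # Assumption: every accepted batch queues one compaction.
        self.pending_compactions += 1
        return "ok"

    def compaction_done(self):
        # Called when background work catches up by one unit.
        self.pending_compactions = max(0, self.pending_compactions - 1)


ledger = Ledger(backlog_limit=2)
replies = [ledger.push(n) for n in range(4)]
ledger.compaction_done()
reply_after_catch_up = ledger.push(4)
```

The caller keeps the refused batch and retries later, so pressure is pushed back to the writer instead of letting the queue grow without bound.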
martinsumner
4556573a5c Rationalise logging on push_mem 2016-11-05 13:42:44 +00:00
martinsumner
87b5bd0b18 Set Persisted SQN (regression)
As part of a previous change had stopped setting the persisted SQN in the
ledger - which stopped journal compaction from working
2016-11-05 12:03:21 +00:00
martinsumner
61c6269200 Penciller back-pressure - Phase 1
There were issues with how the Penciller behaves under heavy write
pressure - most particularly where there are a large number of keys per
update (i.e. 2i heavy objects).   Most immediately the attempt to check
whether the L0 file was ready slowed down the process of producing the
L0 file - so back-pressure created more back-pressure.

Going forward want to alter this most significantly as also the work
queue can build up unsustainably.  There needs to be some pausing
prompted by the bookie on 'returned', and the use of 'returned' when the
work queue exceeds a threshold.
2016-11-05 11:22:27 +00:00
martinsumner
41f00ba6fa Filename nonsense 2016-11-03 20:48:23 +00:00
martinsumner
dd99d624b1 Tangling with filenames
filename join does not work as expected
2016-11-03 20:46:56 +00:00
martinsumner
c3a6489b93 Ensure manifest dir when starting Penciller
Otherwise may fail based on test ordering
2016-11-03 20:09:38 +00:00
martinsumner
d5ac4d412d Use filename join
Potentially to avoid *nix vs windows differences
2016-11-03 20:06:30 +00:00
martinsumner
341e245c09 Remove unnecessary no match condition 2016-11-03 19:34:54 +00:00
martinsumner
2716d912ea Timeout and close race
Race condition presvented in test - but still not handled nicely.
Perhaps need to consider making it a FSM and handling close differently
when L0 pending - i.e. don't close immediately, but set a timeout to
close on if we don't get the last fetch_levelzero
2016-11-03 19:02:50 +00:00
martinsumner
f41c788bff Minor quibbles
Move legacy CDB code used only in unit tests into test area.  Fix column
width in pmem and comment out the unused case statement (in healthy
tests) from the penciller test code
2016-11-03 16:46:25 +00:00
martinsumner
4e46c9735d Log improvements
Continuation of log review and conversion to using central log function.

Fixup of convoluted shutdown process between Bookie, Inker and Inker's
Clerk
2016-11-03 16:05:43 +00:00
martinsumner
7147ec0470 Logging - Phase 1
Abstract out logging and introduce a logbase
2016-11-02 18:14:46 +00:00
martinsumner
4cffecf2ca Handle gen_server:cast slowness
There was some unpredictable performance in tests, that was related to
the amount of time it took the sft gen_server to accept a cast which
passed the levelzero_cache.

The response time looked to be broadly proportional to the size of the
cache - so it appeared to be an issue with passing the large object to
the process queue.

To avoid this, the penciller now instructs the SFT gen_server to
callback to the server for each tree in the cache in turn as it is
building the list from the cache.  Each of these requests should be
relatively short, and the processing in-between should space out the
requests so the Penciller is not blocked from answering queries when
prompting a L0 write.
2016-10-31 01:33:33 +00:00
martinsumner
95609702bd Penciller Memory Refactor
Plugged the new penciller memory into the Penciller, and took advantage
of the increased speed to simplify the callbacks involved.

The outcome is much simpler code
2016-10-30 18:25:30 +00:00
martinsumner
cdb01cd24f Quality Review
Looked through test coverage and dialyzer output and attempted to fill
test gaps and strip out untestable code (to let it crash).
2016-10-29 00:52:49 +01:00
martinsumner
c6ca973517 Penciller shutdown when empty
Stop the penciller from writing an empty file, when shutting down and
the L0 Cache is empty.

Also parameter fiddle to see impact of the Penciller changes.
2016-10-27 21:40:43 +01:00
martinsumner
20cc17f916 Penciller Refactor
Removed o(100) lines of code by refactoring the Penciller to no longer
use ETS tables.  The code is less confusing, and probably not an awful
lot slower.
2016-10-27 20:56:18 +01:00
martinsumner
30f4f2edf6 Comment change on stall behaviour 2016-10-27 09:45:05 +01:00
martinsumner
4cdc6211a0 Handling 'returned' in penciller unit tests
The unit tests for the Penciller couldn't cope with the returned status
- and so would intermittently fail (after tightening the timeout on sft
check_ready).
2016-10-26 21:03:50 +01:00
martinsumner
e9c568a8b3 Test fix-up
There was a test that failed to close down a bookie and that caused some
issues.  The issues are double-resolved, the close down was tidied as
well as the forgotten close being added back in.

There is some general tidying up in anticipation of TTL support.
2016-10-21 21:26:28 +01:00
martinsumner
3ad9e42b61 Changed SFT shutdown to cast-based
The SFT shutdown process has become a series of casts to-and-from
between Penciller and SFT to stop the two processes synchronously making
requests on each other
2016-10-21 12:18:06 +01:00
martinsumner
c431bf3b0a Broken snapshot test
The test confirming that deleting sft files were held open whilst
snapshots were registered was actually broken.  This test has now been
fixed, as well as the logic in registering snapshots which had used
ledger_sqn mistakenly rather than manifest_sqn.
2016-10-21 11:38:30 +01:00
martinsumner
5c2029668d Tombstone preparation
Some initial code changes preparing for the test and implementation of
tombstones and tombstone reaping
2016-10-20 16:00:08 +01:00
martinsumner
cf66431c8e Smoother handling of back-pressure
The Penciller had two problems in previous commits:
- If it had a push_mem soon after a L0 file had been created, the
push_mem would stall waiting for the L0 file to complete - and this
could take 100-200ms
- The penciller's clerk favoured L0 work, but was lazy about asking for
other work in-between, so often the L1 layer was bursting over capacity
and the clerk was doing nothing but merging more L0 files in (with those
merges getting more and more expensive as they had to cover more and
more files)

There are some partial resolutions to this.  There is now an aggressive
timeout when checking whether the L0 file is ready on a push_mem, and if
the timeout is breached the error is caught and a 'returned' message
goes back to the Bookie.  The Bookie doesn't now empty its cache, it
carries on filling it, but on some probability it will keep trying to
push_mem on future pushes.  This increases Jitter around the expensive
operation and splits out the L0 delay into defined chunks.

The penciller's clerk is now more aggressive in asking for work.  There
is also some simplification of the relationship between clerk timeouts
and penciller back-pressure.

Also resolved is an issue of inconsistency between the loader on
startup (replaying the transaction log) and the standard push_mem
process.  The loader was not correctly de-duplicating by adding first
(in order) to a tree before outputting the list from the tree.

Some thought will be given later as to whether non-L0 work can be safely
prioritised if the merge process still keeps getting behind.
2016-10-20 02:23:45 +01:00
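The loader fix mentioned near the end of commit cf66431c8e (add changes to a tree in order, then output the list from the tree) can be illustrated with the sketch below, in Python rather than the project's Erlang. The function name and the (key, sqn, value) tuple shape are hypothetical, and a dict followed by a sort stands in for the ordered tree.

```python
def dedupe_replay(changes):
    """Collapse replayed changes to one entry per key.

    `changes` is assumed to be an iterable of (key, sqn, value)
    tuples in journal order, so a later tuple supersedes an
    earlier one for the same key.
    """
    latest = {}
    for key, sqn, value in changes:
        latest[key] = (sqn, value)   # later entries replace earlier ones
    # Emit in key order, as a tree traversal would.
    return [(key, sqn, value) for key, (sqn, value) in sorted(latest.items())]


replayed = dedupe_replay([("b", 1, "x"), ("a", 2, "y"), ("b", 3, "z")])
```

Inserting in order before listing is what makes the outcome deterministic: the entry with the highest sequence number for each key is the one that survives.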
martinsumner
7319b8f415 Redundant clauses
Remove some redundant clauses, and fix up some logging
2016-10-19 20:51:30 +01:00
martinsumner
12fe1d01bd Penciller Manifest and Locking
The penciller had the concept of a manifest_lock - but it wasn't clear
what the purpose of it was.

The updating of the manifest has now been updated to reduce the code and
make the process cleaner and more obvious.  Now the committed manifest
only covers non-L0 levels.  A clerk can work concurrently on a manifest
change whilst the Penciller is accepting a new L0 file.

On startup the manifest is opened as well as any L0 file.  There is a
possible race condition with killing process where there may be a L0
file which is merged but undeleted - and this is believed to be inert.

There is some outstanding work still.  Currently the whole store is
paused if a push_mem is received by the Penciller, and the writing of a
L0 sft file has not been completed.  The creation of a L0 file appears
to take about 300ms, so if the ledger_cache fills in this period a pause
will occur (perhaps due to objects with lots of index entries).  It
would be preferable to pause more elegantly in this situation.  Perhaps
there should be a harsh timeout on the call to check the SFT complete,
and catching it should cause a refused response.  The next PUT will then
wait, but any queued GETs can progress.
2016-10-19 17:34:58 +01:00
martinsumner
f16f71ae81 Revert omnishambles performance refactoring
To try and improve performance index entries had been removed from the
Ledger Cache, and a shadow list of the LedgerCache (in SQN order) was
kept to avoid gb_trees:to_list on push_mem.

This did not go well.  The issue was that ets does not deal with
duplicate keys in the list when inserting (it will only insert one, but
it is not clear which one).

This has been reverted back out.

The ETS parameters have been changed to [set, private].  It is not used
as an iterator, and is no longer passed out of the process (the
memtable_copy is sent instead).  This also avoids the tab2list function
being called.
2016-10-19 00:10:48 +01:00
martinsumner
8f29a6c40f Complete 2i work - some refactoring
The 2i work now has tests for removals as well as regex etc.

Some initial refactoring work has also been tried - to try and take some
tasks off the critical path of push_mem.  The primary change has been to
avoid putting index keys into the gb_tree, and building the KeyChanges
list in parallel to the gb_tree (now known as ObjectTree) within the
Ledger Cache.

Some initial experiments done as to changing the ETS table in the
Penciller now that it will be used for iterating - but that has been
reverted for now.
2016-10-18 19:41:33 +01:00
martinsumner
3e475f46e8 Support for 2i query part1
Added basic support for 2i query.  This involved some refactoring of the
test code to share functions between suites.

There is still a need for a Part 2 as no tests currently cover removal of
index entries.
2016-10-18 01:59:18 +01:00