854 commits

Author SHA1 Message Date
martinsumner
dfcb30a33e Switch IndexCache to tuple
So that direct element fetch should be faster?
2017-05-22 10:24:49 +01:00
martinsumner
c90e0f824d Spelling error change 2017-05-22 10:00:34 +01:00
martinsumner
a81dd2839e Merge remote-tracking branch 'refs/remotes/origin/master' into mas-specs-i61a 2017-05-22 09:58:41 +01:00
Martin Sumner
8e334c0b5a Merge pull request #63 from martinsumner/mas-compactscore-i35
Mas compactscore i35
2017-05-22 09:57:32 +01:00
Martin Sumner
2bbb504c02 Up max file compactions per run
Try and do more compaction work for each run
2017-05-21 22:06:41 +01:00
martinsumner
695638f34c Loop for get_positions
The CDB FSM process can be blocked by get_positions for all positions,
so loop around the index outside of the FSM process to allow for other
messages to interleave.
2017-05-20 12:25:06 +01:00
martinsumner
319dc5f388 CDB changes to add timing log
Forgot to save before commit last time!
2017-05-19 14:28:55 +01:00
martinsumner
36a48c16e5 Add timing log to position scan
Position scan may be very expensive, add timing log to confirm
2017-05-19 13:59:57 +01:00
martinsumner
18a12ff9ff Improve comments 2017-05-18 14:09:45 +01:00
martinsumner
8b3ca78d49 spec help for SST file 2017-05-18 12:29:56 +01:00
martinsumner
28f45749d7 Add specs to exported API of CDB files
Time to give the dialyzer some help
2017-05-17 12:54:02 +01:00
martinsumner
fbb4879d81 Change fold_heads to do basic Journal presence check
This at least checks the file is present, and the Key exists in the
index of that file.  If the value is corrupt it will be removed by
compaction, and then this will fail (unless the file is never compacted).

TODO: resolve issue of files which are corrupt - but never compacted
- a job for backup?
2017-04-21 15:55:03 +01:00
martinsumner
4d12dfe0ab Returning snapshots
If the clerk updates the manifest - it might not recognise changes to
the manifest made since the clerk took the manifest.  So the penciller
must merge its view of the snapshots back in to the updated manifest
2017-04-19 22:46:37 +01:00
Martin Sumner
fa9daf8696 Correct async fold
The fold objects function which snaps in the fold was implemented incorrectly - it
took information from the LedgerCache at the point of the request, not
at the point of the fold.  So the LedgerCache SQN may have been
surpassed in the Penciller by the time the fold was called.
2017-04-17 23:01:55 +01:00
martinsumner
7cba182951 Merge remote-tracking branch 'origin/mas-sweeperfold-i59' into mas-sweeperfold-i59
# Conflicts:
#	src/leveled_bookie.erl
2017-04-17 15:07:35 +01:00
martinsumner
50d95ef6aa Move snapshot inside of the fold function
riak_kv_sweeper gets the async fold function, then determines if the
function can be called.  If the system is busy the fold may be queued,
and may never be acted upon.

This may cause issues with snapshot timeouts etc.
2017-04-17 15:03:03 +01:00
Martin Sumner
e01efe02f6 Long snapshot timeout increase
Increase this to 90 minutes.  The first time all the snapshots are
rebuilt it may take a long time, but they all get scheduled together -
and queued until concurrency limits allow it to be completed.

Currently the snapshot is made on initialisation, and only released
when completed (which may be after the queue).  So the last couple of
snapshots were over-shooting the 1 hour.
2017-04-13 22:43:29 +01:00
Martin Sumner
618d9cf53b FoldHeads to output binary
so that byte_size will work in sweeper
2017-04-11 11:17:27 +01:00
Martin Sumner
9badc8fbe7 Merge branch 'master' into mas-sweeperfold-i59 2017-04-10 21:49:08 +01:00
martinsumner
60b0b88226 Switch to binary response
Force header in fold_heads to be a binary ... as this is what KV expects
2017-04-07 17:08:40 +00:00
martinsumner
6d42abbc1a Add bybucket to fold_heads
give fold_heads equivalent functionality to fold_objects - both can now
be done for allkeys and bybucket
2017-04-07 14:56:28 +00:00
martinsumner
9375baf636 Add unit test for foldheads
compare foldheads, foldobjects and hashtree_query output
2017-04-07 14:19:25 +00:00
martinsumner
b464e2e28c Extend foldobjects to support proxy object
To allow for folds which probably don't need values to not always
have to fetch the value
2017-04-07 12:09:11 +00:00
Martin Sumner
4e9fa2a206 Timeout long-running snapshots
Add logic to timeout long-running snapshots.
2017-04-05 09:16:01 +01:00
martinsumner
400f65f557 Switch to binary metadata
Try and maintain binary format when stored in Ledger so less
swapping/changing as added and removed.
2017-04-04 10:02:35 +00:00
martinsumner
43bfbe3e0e Add in scheduler function
To assist in scheduling compaction
2017-03-30 15:46:37 +01:00
martinsumner
11ff3129f3 Reduce compaction targets
Compaction is overly aggressive.  It is a lot of work to compact a run of
files for just 20% reduction in disk space, when disk space for the
Journal (i.e. low IOPS disk space) should be relatively inexpensive.
Require at least a 40% reduction for a compaction job.
2017-03-30 12:15:36 +01:00
martinsumner
6143fcb664 Remove binary_to_term
when fetching don't need to binary_to_term key changes
2017-03-29 15:37:04 +01:00
Martin Sumner
8db73917fb Need also to remove unused bits 2017-03-22 00:14:37 +00:00
Martin Sumner
15af4942ae Remove busy log
Accounts for 60% of logs
2017-03-22 00:11:17 +00:00
martinsumner
e59585d733 Merge remote-tracking branch 'refs/remotes/origin/mas-etsmem-i52' into mas-sstfiveblocks 2017-03-21 18:25:18 +00:00
martinsumner
eef2199335 Up level for yield to 2 2017-03-21 18:24:11 +00:00
martinsumner
f108871691 Vclock metadata change
Test performance continues to be worse since the vclock metadata change.
Reversing out just in case.
2017-03-21 18:15:56 +00:00
martinsumner
756b46bb4d Return to merge scan width of 16
This was reduced before the use of binary blocks was committed
2017-03-21 17:53:34 +00:00
martinsumner
1fdcdf3b37 Midblock size - lookup
No real reason for the midblock to be smaller in lookup slots - so give
the blocks a more consistent size
2017-03-21 17:47:08 +00:00
martinsumner
64e944d9ba Change to 5 blocks in SST Slot
Change to 5 blocks is intended to make the blocks in lookup slots
fractionally smaller, but more importantly to introduce a middle block
that can be opened in a binary-split style fashion to reduce the number
of blocks that need to be opened for range queries.   Worst case for
full slots is 3 blocks now not 4.
2017-03-21 16:54:23 +00:00
martinsumner
682dfc4d59 Revert "Revert "ETS - delete table not objects""
This reverts commit c46377584f.
2017-03-21 12:02:22 +00:00
martinsumner
dd0316eedf Yield on query selectively
Still not clear if yielding is the cause of memory problems, but taking
it away universally has impacted throughput.  At the very least we
should continue to yield on high-contention files (those at higher
levels), where the processes are more likely to be quickly terminated
anyway allowing GC to be invoked.
2017-03-21 11:03:29 +00:00
martinsumner
c46377584f Revert "ETS - delete table not objects"
This reverts commit 7dc4913d5a.
2017-03-21 01:32:41 +00:00
martinsumner
e18d2f2f00 Delete the ETS table from CDB files
Rather than simply dereference it - delete it
2017-03-21 01:31:42 +00:00
martinsumner
419541f5dd Fix to delete_pending state 2017-03-20 23:43:31 +00:00
martinsumner
415ac6017b Move sst get_kv range back inside process
Moved outside to stop blocking, but also avoids copy.  Move back inside to
see if it may be related to the binary memory leak
2017-03-20 23:22:46 +00:00
martinsumner
7dc4913d5a ETS - delete table not objects
Try and delete the table not just the objects in the table - will this
improve memory leak?
2017-03-20 22:43:22 +00:00
Martin Sumner
eec9d509f9 Add back hash performance tests
Need to consider if magic hash is an issue
2017-03-20 20:28:47 +00:00
martinsumner
7154815a2b Keep vclock as binary
No obvious need at present for vclock to be a term within leveled
2017-03-20 20:28:02 +00:00
martinsumner
f3ffa920af Trying to standardise binary manipulation of value
Looking into theory that use of term_to_binary is imperfect.  Also may
be better to compress values only when they are compacted?
2017-03-20 15:43:54 +00:00
martinsumner
5c662aeca1 Additional unit test
Need to test scenario where the key list the SST file created from is an
exact multiple of the slot size
2017-03-19 23:42:24 +00:00
martinsumner
431c2cee40 Remove unnecessary line
Branch cannot be reached as first key is always discovered when it is a
no_lookup
2017-03-19 23:37:50 +00:00
martinsumner
f20aba9c8b Curtail trimmed slot craziness
There was complicated and confusing code that achieved nothing for
efficiency when trimming slots.  The expensive part (binary_to_term) was
still needed on every block, and it was hard to get code coverage and
make sense of what it was really trying to achieve.

This is now much simpler - and may set us up for potential further
indexing help.
2017-03-19 21:47:22 +00:00
martinsumner
c203e2ee06 Range queries - pass out as binaries
Avoid converting to Erlang term within the FSM and then requiring a copy
outside of the FSM - pass out as a binary
2017-03-17 10:47:20 +00:00