leveled

Author	SHA1	Message	Date
martinsumner	a1c49b668a	Fix empty file again No special definition of empty required, as now an empty list when empty	2017-03-14 00:17:09 +00:00
martinsumner	2b0ec1d9cc	Don't double-loop on slots Previous version built a list of slots, then iterated over it to build a list of binaries. This converts the slot to a binary before building the list	2017-03-13 23:51:48 +00:00
martinsumner	4f0622d2ac	Merge remote-tracking branch 'refs/remotes/origin/mas-sstblock-i42' into mas-sstblockv2-i42	2017-03-13 21:09:13 +00:00
martinsumner	54534e725f	Experiment with smaller scan width When testing with large numbers of 2i terms (and hence more Riak Metadata), there is a surge in slow response times when there are multiple concurrent merge events. This could be very short term CPU starvation because of the merge process. Perhaps it is delays waiting for the scan to complete - smaller scanwidth may mean more interleaving and less latency?	2017-03-13 19:53:12 +00:00
martinsumner	f2cd9b3f33	Consistency of empty slotlist references Need to return an empty slotlist in a consistent way	2017-03-11 13:04:55 +00:00
martinsumner	1f8de798bd	Fix empty slot issue	2017-03-11 12:41:30 +00:00
martinsumner	a07770a3df	Unit test of lookup over-size issue A mistake meant resetting to lookup on a skipped key would cause issues if the skipped key occurred under a no_lookup slot after the ?SLOT_SIZE had been reached. This caused the slot to switch to lookup, but beyond the maximum size	2017-03-11 00:03:55 +00:00
martinsumner	4e4f498f20	Correctly set no_lookup on skip_key Otherwise could change to lookup after the size limit has been reached	2017-03-10 23:48:17 +00:00
martinsumner	1813317121	Correctly identify empty slotlist	2017-03-10 22:49:00 +00:00
martinsumner	b2f3d882a9	Draft of branch to condense range_only keys	2017-03-10 20:43:37 +00:00
martinsumner	730ab2ec48	tidy out io:format	2017-03-10 11:10:15 +00:00
martinsumner	601f43de3d	Merge remote-tracking branch 'refs/remotes/origin/master' into mas-sstblock-i42	2017-03-10 10:24:51 +00:00
martinsumner	d7eee2f9c9	Remove rogue log	2017-03-09 22:24:11 +00:00
martinsumner	4c59342600	Change SST reference to split filename The manifest and the logs are bloated by having the full file path for every filename in there - given the root path is constant. Could also cause issues if the mount point is ever changed.	2017-03-09 21:23:09 +00:00
martinsumner	04cfb453c4	Fetch specific block only Rely on CRC check in zlib. Still need to catch on failure	2017-03-07 20:19:11 +00:00
martinsumner	bc5388710b	Update SST comments	2017-03-04 20:47:46 +00:00
martinsumner	19534122a2	Coverage checks	2017-02-26 21:37:47 +00:00
martinsumner	7320b34681	Comment update	2017-01-25 12:38:33 +00:00
martinsumner	d57b74d967	Re-introduce tinybloom to SST This had been removed due to the CPU cost of adding - however then the tinybloom was implemented by directly manipulating bits through binary comprehension - rather than applying bor band bsl bsr operations. With these operations the cost of producing and checking the bloom is <10% by comparison.	2017-01-24 21:51:12 +00:00
martinsumner	d225f4d7f5	Add use of leveled_tree to sst summary	2017-01-23 22:58:51 +00:00
martinsumner	c99c50ce6e	Fix-up message exchange on confirm delete	2017-01-17 11:18:58 +00:00
martinsumner	c32fd3fb4c	Change to use manifest_entry not straight PID in unit test	2017-01-17 10:14:40 +00:00
martinsumner	9832ecc369	Manifest now back to a simple list This has refactored code with the implementation of the manifest isolated into a separate module, and the pure async relationship between penciller and their clerk. However, the manifest is just a simple list at each level.	2017-01-17 10:12:15 +00:00
martinsumner	13c81f0ed1	Basic working Some basic tests working - but still outstanding issues.	2017-01-14 19:41:09 +00:00
martinsumner	5a88565c08	Switch to binary index in pmem Remove the ets index in pmem and use a binary index instead. This may be slower, but avoids the bulk upload to ets, and means that matches know of position (so only skiplists with a match need be tried). Also stops the discrepancy between snapshots and non-snapshots - as previously the snapshots were always slowed by not having access to the ETS table.	2017-01-05 21:58:33 +00:00
martinsumner	2f8ff640a9	Test coverage Add some further unit tests to improve test coverage	2017-01-04 21:36:59 +00:00
martinsumner	7d95fa6bbc	Switch summary index Simplify the summary index implementation	2017-01-04 14:26:11 +00:00
martinsumner	2f3eb18548	Re-add usort Change one thing at a time	2017-01-03 18:26:54 +00:00
martinsumner	c4ebaa9f57	Tidy Up All Hashes As we're no longer generating a summary bloom - no need to collect a big list of hashes whilst building the sst file	2017-01-03 18:20:28 +00:00
martinsumner	e1d843a2eb	Remove lastfetch cache It appears to have some benefit at lower levels, but overall has less benefit at higher levels. Probably not worth having unless it can be controlled to go in at the basement only.	2017-01-03 15:26:44 +00:00
martinsumner	b6ae0e1af5	Fix broken SST cache	2017-01-03 13:03:59 +00:00
martinsumner	d28e5d639c	Remove SST blooms	2017-01-03 09:12:41 +00:00
martinsumner	5b4c903d53	Check before update on bloom	2017-01-02 20:02:49 +00:00
martinsumner	31d4346806	Log improvements Log on bad CRC, and also not seeing SST timing logs, so log these more frequently	2017-01-02 18:54:19 +00:00
martinsumner	b3e189b012	Protect against div by 0 Make sure that blooms are always at least 1 slot in size	2017-01-02 18:38:14 +00:00
martinsumner	baa644383d	Make tinybloom size configurable Allow the bloom size to vary depending on how many fetchable keys there are - so there is no large bloom held if most of the keys are index entries for example	2017-01-02 18:29:15 +00:00
Martin Sumner	2079fff7f8	Switched to indexed blocks as slot implementation Prior to this refactor, the slot had been made up of four blocks with an external binary index. Although the form of the index has changed again, micro-benchmarking once again showed that this was a relatively efficient mechanism.	2017-01-02 10:47:04 +00:00
Martin Sumner	c0d959beff	Five alternatives explored	2016-12-29 22:22:13 +00:00
martinsumner	b509e81cfd	Ongoing timing tests	2016-12-29 14:14:09 +00:00
martinsumner	b855401696	Experiment Want to experiment with different datatypes for the slot - maybe use a raw list but with a mini hashtree index like the CDB file	2016-12-29 14:11:05 +00:00
martinsumner	41ee90a2ef	OTP16 compatibility	2016-12-29 12:10:12 +00:00
martinsumner	a261d4793b	Increase test size Be able to read more into sample-based output	2016-12-29 12:01:42 +00:00
martinsumner	4784f8521a	Entropy fiddle Try and increase effectiveness of bloom by combining Magic Hash with phash2	2016-12-29 11:59:07 +00:00
martinsumner	fb75a26497	Handle mismatch on expanding pointer Remove the nasty legacy of hard-coding for a scan width of 1	2016-12-29 10:46:12 +00:00
martinsumner	8f0bf8b892	Fix overlapping _ references	2016-12-29 10:34:53 +00:00
martinsumner	e01b310d20	Handle production of empty file	2016-12-29 05:09:47 +00:00
martinsumner	55386622f7	Fixed issues Two issues - when the key range falls in-between two marks in the summary, we didn't pick up any mark. Then when trimming both right and left, the left trim was being discarded.	2016-12-29 04:37:49 +00:00
martinsumner	5b9e68df99	Add some crash protection for empty return from to_range Not clear though why it would occur.	2016-12-29 03:04:10 +00:00
martinsumner	3f3b36597a	Add timer for SST creation	2016-12-29 02:55:28 +00:00
martinsumner	a665b8ea4f	Tidy-up unused variable	2016-12-29 02:41:02 +00:00

71 commits