From e4a2a8ecea6ccb8d093759aec5a086224638ac12 Mon Sep 17 00:00:00 2001 From: martinsumner Date: Tue, 11 Jul 2017 10:28:34 +0100 Subject: [PATCH] Stuff on read repair --- docs/ANTI_ENTROPY.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/ANTI_ENTROPY.md b/docs/ANTI_ENTROPY.md index 075c3ef..0607368 100644 --- a/docs/ANTI_ENTROPY.md +++ b/docs/ANTI_ENTROPY.md @@ -262,3 +262,5 @@ Some notes on re-using this alternative anti-entropy mechanism within Riak: - In Leveled a special fold currently supports the Tic-Tac tree generation for indexes, and one for objects. It may be better to support this through a offering a more open capability to pass different fold functions and accumulators into index folds. This could be re-used for "reporting indexes", where we want to count terms of different types rather than return all those terms via an accumulating list e.g. an index may have a bitmap style part, and the function will apply a wildcard mask to the bitmap and count the number of hits against each possible output. - The initial intention is to implement the hashtree query functions based around the coverage_fsm behaviour, but with the option to stipulate externally the offset. So to test for differences between clusters, the user could concurrently query the two clusters for the same offset (or a random offset), whereas to find entropy within a cluster two concurrently run queries could be compared for different offsets. + +- A surprising feature of read repair is that it will read repair to fallback nodes, not just primary nodes. This means that in read-intensive workloads, write activity may dramatically increase during node failure (as a large proportion of reads will become write events) - increasing the chance of servers falling domino style. However, in some circumstances the extra duplication can also [increase the chance of data loss](https://github.com/russelldb/russelldb.github.io/blob/master/3.2.kv679-solution.md)! This also increases greatly the volume of unnecessary data to be handed-off when the primary returns. Without active anti-entropy, and in the absence of other safety checks like `notfound_ok` being set to false, or `pr` being set to at least 1 - there will be scenarios where this feature may be helpful. As part of improving active anti-entropy, it may be wise to re-visit the tuning of anti-entropy features that existed prior to AAE, in particular should it be possible to configure read-repair to act on primary nodes only.