Simple change to check for presence of objetc in list before adding it:
```Head check took 124416 microseconds checking list of length 5000
Head check took 130114 microseconds checking list of length 5000
Check for presence of repeated objects
Head check took 1725 microseconds checking list of length 5```
Test fails as fetching repeated object is too slow.
```Head check took 124301 microseconds checking list of length 5000
Head check took 112286 microseconds checking list of length 5000
Head check took 1336512 microseconds checking list of length 5
2018-12-10T11:54:41.342 B0013 <0.2459.0> Long running task took 260788 microseconds with task of type pcl_head
2018-12-10T11:54:41.618 B0013 <0.2459.0> Long running task took 276508 microseconds with task of type pcl_head
2018-12-10T11:54:41.894 B0013 <0.2459.0> Long running task took 275225 microseconds with task of type pcl_head
2018-12-10T11:54:42.173 B0013 <0.2459.0> Long running task took 278836 microseconds with task of type pcl_head
2018-12-10T11:54:42.477 B0013 <0.2459.0> Long running task took 304524 microseconds with task of type pcl_head```
It taks twice as long to check for one repeated object as it does to check for 5K non-repeated objects
this was previously not na issue as leveled_codec:segment_hash/1 would handle anyhting that could be hashed. This now has to be a tuple, and one with a first element - so corrupted tuples are failing.
Add a guard chekcing for a corrupted tuple, but we only need this when doing journal compaction.
Change user_defined keys to be `retain` as a tag strategy
the skip/retain/recalc handlign was confusing. This removes the switcheroo between leveled_codec and leveled_iclerk when mkaing the decision.
Also now the building of the accumulator is handled efficiently (not using ++ on the list).
Tried to rmeove as much of ?HEAD tag handling from leveled_head - as we want leveled_head to be only concerned with the head manipulation for object tags (?STD, ?RIAK and user-defined).
To allow for extraction of metadata, and building of head responses - it should eb possible to dynamically and user-defined tags, and functions to treat them.
If no function is defined, revert to the behaviour of the ?STD tag.
More obvious how to extend the code as it is all in one module.
Also add a new field to the standard object metadata tuple that may hold in the future other object metadata base don user-defined functions.
Both log level and forced_logs. Allows for log_level to be changed at startup ad runtime. Also allow for a list of forced logs, so if log_level is set > info, individual info logs can be forced to be seen (such as to see stats logs).
This allows for all fold functions to throw an exception to exit out of a fold with all dependencies still closed down as expected.
This was previously available for key folds, which was necessary for the folds to work in Riak (as max_results in index queries depends one xiting the fold with an exception). This change now adds a ct test, and adds support for head folds, object folds (key order) and object folds (sqn order)
Export changes required to support kv_index_tictactree. This will call tictac_hahs, but also needs to know hash of key used in bxor - so that it can control alterations.
Adds support with test for tuplebuckets in Riak keys.
This exposed that there was no filter using the seglist on the in-mmemory keys. This means that if there is no filter applied in the fold_function, many false positives may emerge.
This is probably not a big performance benefit (and indeed for performance it may be better to apply during the leveled_pmem:merge_trees).
Some thought still required as to what is more likely to contribute to future bugs: an extra location using the hash matching found in leveled_sst, or the extra results in the query.
This helps with kv_index_tictcatree with the leveled_so backend. Now this cna do folds over ranges of keys with modified filters (as folds over ranges of keys must go over lal keys if the backend is segment_ordered)
Which exposed it wasn't working. If there is no segment list passed - just a modification filter, you don't need to check the position list (as checking the position list returns an empty position so sipping all the matching results!)
Acc in response is now of form {Reason, Acc} not just Acc so that the application can understand the reason for the results ending - and take appropriate action (e.g. restart again from the LastKey to return more results).
To support max_keys and the last modified date range.
This applies the last modified date check on all ledger folds. This is hard to avoid, but ultimately a very low cost.
The limit on the number of heads to fold, is the limit based on passing to the accumulator - not on the limit being added to the accumulator. So if the FoldFun perfoms a filter (e.g. for the preflist), then those filtered results will still count towards the maximum.
There needs to be someway at the end of signalling from the fold if the outcome was or was not 'constrained' by max_keys - as the fold cannot simply tel by lenght checking the outcome.
Note this is used rather than length checking the buffer and throwing a 'stop_fold' message when the limit is reached. The choice is made for simplicity, and ease of testing. The throw mechanism is necessary if there is a need to stop parallel folds across the the cluster - but in this case the node_worker_pool will be used.
Externally to leveled_sst all folds are actually managed through exapnd_list_by_pointer.
Make the API a bit clearer in this regards, and add specs to help dialyzer.
This also adds LowLastMod to the API for expanding pointers (although the leveled_penciller just defaults this to 0 for everything.