Merge pull request #165 from ariel-anieli/pr-typos

[doc/signatures.md] Fixed typos, mis-naming, and syntax highlighting
Fred Hebert 2023-12-18 16:18:28 -05:00 committed by GitHub
commit 19c717fb97
GPG key ID: 4AEE18F83AFDEB23


@ -2,10 +2,10 @@ Signatures
========== ==========
It often occurs in coding that we need a library, a set of It often occurs in coding that we need a library, a set of
functionally. Often there are several algorithms that could provide functionalities. Often there are several algorithms that could provide
this functionality. However, the code that uses it, either doesn't each of these functionalities. However, the code that uses it, either doesn't
care about the individual algorithm or wishes to delegate choosing care about the individual algorithm or wishes to delegate choosing
that algorithm to some higher level. Lets take the concrete example of that algorithm to some higher level. Let's take the concrete example of
dictionaries. A dictionary provides the ability to access a value via dictionaries. A dictionary provides the ability to access a value via
a key (other things as well but primarily this). There are may ways to a key (other things as well but primarily this). There are may ways to
implement a dictionary. Just a few are: implement a dictionary. Just a few are:
@ -17,13 +17,13 @@ implement a dictionary. Just a few are:
* Many, many more .... * Many, many more ....
Each of these approaches has their own performance characteristics, Each of these approaches has their own performance characteristics,
memory footprints etc. For example, a table of size n with open memory footprints, etc. For example, a table of size $n$ with open
addressing has no collisions and holds up to n elements, with a single addressing has no collisions and holds up to $n$ elements, with a single
comparison for successful lookup, and a table of size n with chaining comparison for successful lookup, and a table of size $n$ with chaining
and k keys has the minimum max(0, k-n) collisions and O(1 + k/n) and $k$ keys has the minimum $\max(0, k-n)$ collisions and $\mathcal{O}(1 + k/n)$
comparisons for lookup. While for skip lists the performance comparisons for lookup. While for skip lists the performance
characteristics are about as good as that of randomly-built binary characteristics are about as good as that of randomly-built binary
search trees - namely (O log n). So the choice of which to select search trees - namely $\mathcal{O}(\log n)$. So the choice of which to select
depends very much on memory available, insert/read characteristics, depends very much on memory available, insert/read characteristics,
etc. So delegating the choice to a single point in your code is a very etc. So delegating the choice to a single point in your code is a very
good idea. Unfortunately, in Erlang that's so easy to do at the moment. good idea. Unfortunately, in Erlang that's so easy to do at the moment.
@ -39,17 +39,20 @@ directly. There are a few ways you can approximate it. One way is to
pass the Module name to the calling functions along with the data that pass the Module name to the calling functions along with the data that
it is going to be called on. it is going to be called on.
:::erlang ```erlang
add(ModuleToUse, Key, Value, DictData) -> add(ModuleToUse, Key, Value, DictData) ->
ModuleToUse:add(Key, Value, DictData). ModuleToUse:add(Key, Value, DictData).
```
This works, and you can vary how you want to pass the data. For This works, and you can vary how you want to pass the data. For
example, you could easily use a tuple to contain the data. That is, example, you could easily use a tuple to contain the data. That is,
you could pass in `{ModuleToUse, DictData}` and that would make it a you could pass in `{ModuleToUse, DictData}` and that would make it a
bit cleaner. bit cleaner.
:::erlang
add(Key, Value, {ModuleToUse, DictData}) -> ```erlang
add(Key, Value, {ModuleToUse, DictData}) ->
ModuleToUse:add(Key, Value, DictData). ModuleToUse:add(Key, Value, DictData).
```
Either way, there are a few problems with this approach. One of the Either way, there are a few problems with this approach. One of the
biggest is that you lose code locality, by looking at this bit of code biggest is that you lose code locality, by looking at this bit of code
@ -75,9 +78,10 @@ name.
So what we actually want to do is something mole like this: So what we actually want to do is something mole like this:
:::erlang ```erlang
add(Key, Value, DictData) -> add(Key, Value, DictData) ->
dictionary:add(Key, Value, DictData). dictionary:add(Key, Value, DictData).
```
Doing this we retain the locality. We can easily look up the Doing this we retain the locality. We can easily look up the
`dictionary` Module. We immediately have a good idea what a `dictionary` Module. We immediately have a good idea what a
@ -97,12 +101,12 @@ a [Behaviour](http://metajack.im/2008/10/29/custom-behaviors-in-erlang/)
for our functionality. To continue our example we will define a for our functionality. To continue our example we will define a
Behaviour for dictionaries. That Behaviour looks like this: Behaviour for dictionaries. That Behaviour looks like this:
:::erlang ```erlang
-module(ec_dictionary). -module(ec_dictionary).
-export([behaviour_info/1]). -export([behaviour_info/1]).
behaviour_info(callbacks) -> behaviour_info(callbacks) ->
[{new, 0}, [{new, 0},
{has_key, 2}, {has_key, 2},
{get, 2}, {get, 2},
@ -113,8 +117,9 @@ Behaviour for dictionaries. That Behaviour looks like this:
{to_list, 1}, {to_list, 1},
{from_list, 1}, {from_list, 1},
{keys, 1}]; {keys, 1}];
behaviour_info(_) -> behaviour_info(_) ->
undefined. undefined.
```
So we have our Behaviour now. Unfortunately, this doesn't give us much So we have our Behaviour now. Unfortunately, this doesn't give us much
@ -124,14 +129,15 @@ dictionaries in an abstract way in our code. To do that we need to add
a bit of functionality. We do that by actually implementing our own a bit of functionality. We do that by actually implementing our own
behaviour, starting with `new/1`. behaviour, starting with `new/1`.
:::erlang ```erlang
%% @doc create a new dictionary object from the specified module. The %% @doc create a new dictionary object from the specified module. The
%% module should implement the dictionary behaviour. %% module should implement the dictionary behaviour.
%% %%
%% @param ModuleName The module name. %% @param ModuleName The module name.
-spec new(module()) -> dictionary(_K, _V). -spec new(module()) -> dictionary(_K, _V).
new(ModuleName) when is_atom(ModuleName) -> new(ModuleName) when is_atom(ModuleName) ->
#dict_t{callback = ModuleName, data = ModuleName:new()}. #dict_t{callback = ModuleName, data = ModuleName:new()}.
```
This code creates a new dictionary for us. Or to be more specific it This code creates a new dictionary for us. Or to be more specific it
actually creates a new dictionary Signature record, that will be used actually creates a new dictionary Signature record, that will be used
@ -148,16 +154,17 @@ dictionary and another that just retrieves data.
The first we will look at is the one that updates the dictionary by The first we will look at is the one that updates the dictionary by
adding a value. adding a value.
:::erlang ```erlang
%% @doc add a new value to the existing dictionary. Return a new %% @doc add a new value to the existing dictionary. Return a new
%% dictionary containing the value. %% dictionary containing the value.
%% %%
%% @param Dict the dictionary object to add too %% @param Dict the dictionary object to add too
%% @param Key the key to add %% @param Key the key to add
%% @param Value the value to add %% @param Value the value to add
-spec add(key(K), value(V), dictionary(K, V)) -> dictionary(K, V). -spec add(key(K), value(V), dictionary(K, V)) -> dictionary(K, V).
add(Key, Value, #dict_t{callback = Mod, data = Data} = Dict) -> add(Key, Value, #dict_t{callback = Mod, data = Data} = Dict) ->
Dict#dict_t{data = Mod:add(Key, Value, Data)}. Dict#dict_t{data = Mod:add(Key, Value, Data)}.
```
There are two key things here. There are two key things here.
@ -173,16 +180,17 @@ implementation to do the work itself.
Now lets do a data retrieval function. In this case, the `get` function Now lets do a data retrieval function. In this case, the `get` function
of the dictionary Signature. of the dictionary Signature.
:::erlang ```erlang
%% @doc given a key return that key from the dictionary. If the key is %% @doc given a key return that key from the dictionary. If the key is
%% not found throw a 'not_found' exception. %% not found throw a 'not_found' exception.
%% %%
%% @param Dict The dictionary object to return the value from %% @param Dict The dictionary object to return the value from
%% @param Key The key requested %% @param Key The key requested
%% @throws not_found when the key does not exist %% @throws not_found when the key does not exist
-spec get(key(K), dictionary(K, V)) -> value(V). -spec get(key(K), dictionary(K, V)) -> value(V).
get(Key, #dict_t{callback = Mod, data = Data}) -> get(Key, #dict_t{callback = Mod, data = Data}) ->
Mod:get(Key, Data). Mod:get(Key, Data).
```
In this case, you can see a very similar approach to deconstructing In this case, you can see a very similar approach to deconstructing
the dict record. We still need to pull out the callback module and the the dict record. We still need to pull out the callback module and the
@ -226,29 +234,30 @@ purpose is to help a preexisting module implement the Behaviour
defined by a Signature. A good example of this in our current example defined by a Signature. A good example of this in our current example
is the is the
[erlware_commons/ec_dict](https://github.com/ericbmerritt/erlware_commons/blob/types/src/ec_dict.erl) [erlware_commons/ec_dict](https://github.com/ericbmerritt/erlware_commons/blob/types/src/ec_dict.erl)
module. It implements the ec_dictionary Behaviour, but all the module. It implements the `ec_dictionary` Behaviour, but all the
functionality is provided by the functionality is provided by the
[stdlib/dict](http://www.erlang.org/doc/man/dict.html) module [stdlib/dict](http://www.erlang.org/doc/man/dict.html) module
itself. Let's take a look at one example to see how this is done. itself. Let's take a look at one example to see how this is done.
We will take a look at one of the functions we have already seen. The We will take a look at one of the functions we have already seen. The
`get` function in ec_dictionary doesn't have quite the same `get` function in `ec_dictionary` doesn't have quite the same
semantics as any of the functions in the dict module. So a bit of semantics as any of the functions in the `dict` module. So a bit of
translation needs to be done. We do that in the ec_dict module `get` function. translation needs to be done. We do that in the `ec_dict:get/2` function.
:::erlang ```erlang
-spec get(ec_dictionary:key(K), Object::dictionary(K, V)) -> -spec get(ec_dictionary:key(K), Object::dictionary(K, V)) ->
ec_dictionary:value(V). ec_dictionary:value(V).
get(Key, Data) -> get(Key, Data) ->
case dict:find(Key, Data) of case dict:find(Key, Data) of
{ok, Value} -> {ok, Value} ->
Value; Value;
error -> error ->
throw(not_found) throw(not_found)
end. end.
```
So the ec_dict module's purpose for existence is to help the So the `ec_dict` module's purpose for existence is to help the
preexisting dict module implement the Behaviour defined by the preexisting `dict` module implement the Behaviour defined by the
Signature. Signature.
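
To make the Signature Wrapper idea more concrete, here is a minimal sketch of what a few more wrapped callbacks could look like. The module name `dict_wrapper_sketch` is hypothetical and the bodies are an assumption about how such a wrapper might be written, not the actual `ec_dict` source. Only `dict:new/0`, `dict:store/3`, `dict:is_key/2` and `dict:find/2` from stdlib are used.

```erlang
%% Hypothetical wrapper module (not part of erlware_commons). It adapts the
%% stdlib dict API to the callback names listed in the ec_dictionary Behaviour.
-module(dict_wrapper_sketch).
-export([new/0, add/3, has_key/2, get/2]).

%% new/0 maps directly onto dict:new/0.
new() ->
    dict:new().

%% The Behaviour calls it add/3, stdlib dict calls it store/3.
add(Key, Value, Data) ->
    dict:store(Key, Value, Data).

%% The Behaviour calls it has_key/2, stdlib dict calls it is_key/2.
has_key(Key, Data) ->
    dict:is_key(Key, Data).

%% dict:find/2 returns {ok, Value} | error, so translate the missing-key
%% case into the not_found exception the Signature expects.
get(Key, Data) ->
    case dict:find(Key, Data) of
        {ok, Value} ->
            Value;
        error ->
            throw(not_found)
    end.
```

Most callbacks are one-line renames, and only the ones whose semantics differ (such as `get/2`) need real translation.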
@ -267,15 +276,16 @@ create a couple of functions that create dictionaries for each type we
want to test. The first we want to time is the Signature Wrapper, so want to test. The first we want to time is the Signature Wrapper, so
`dict` vs `ec_dict` called as a Signature. `dict` vs `ec_dict` called as a Signature.
:::erlang ```erlang
create_dict() -> create_dict() ->
lists:foldl(fun(El, Dict) -> lists:foldl(fun(El, Dict) ->
dict:store(El, El, Dict) dict:store(El, El, Dict)
end, dict:new(), end, dict:new(),
lists:seq(1,100)). lists:seq(1,100)).
```
The only thing we do here is create a sequence of numbers 1 to 100, The only thing we do here is create a sequence of numbers 1 to 100,
and then add each of those to the dict as an entry. We aren't too and then add each of those to the `dict` as an entry. We aren't too
worried about replicating real data in the dictionary. We care about worried about replicating real data in the dictionary. We care about
timing the function call overhead of Signatures, not the performance timing the function call overhead of Signatures, not the performance
of the dictionaries themselves. of the dictionaries themselves.
@ -283,27 +293,28 @@ of the dictionaries themselves.
We need to create a similar function for our Signature based We need to create a similar function for our Signature based
dictionary `ec_dict`. dictionary `ec_dict`.
:::erlang ```erlang
create_dictionary(Type) -> create_dictionary(Type) ->
lists:foldl(fun(El, Dict) -> lists:foldl(fun(El, Dict) ->
ec_dictionary:add(El, El, Dict) ec_dictionary:add(El, El, Dict)
end, end,
ec_dictionary:new(Type), ec_dictionary:new(Type),
lists:seq(1,100)). lists:seq(1,100)).
```
Here we actually create everything using the Signature. So we don't Here we actually create everything using the Signature. So we don't
need one function for each type. We can have one function that can need one function for each type. We can have one function that can
create anything that implements the Signature. That is the magic of create anything that implements the Signature. That is the magic of
Signatures. Otherwise, this does the exact same thing as the dict Signatures. Otherwise, this does the exact same thing as the dictionary
`create_dict/1`. given by `create_dict/0`.
We are going to use two function calls in our timing. One that updates We are going to use two function calls in our timing. One that updates
data and one that returns data, just to get good coverage. For our data and one that returns data, just to get good coverage. For our
dictionaries we are going to use the `size` function as well as dictionaries we are going to use the `size` function as well as
the `add` function. the `add` function.
:::erlang ```erlang
time_direct_vs_signature_dict() -> time_direct_vs_signature_dict() ->
io:format("Timing dict~n"), io:format("Timing dict~n"),
Dict = create_dict(), Dict = create_dict(),
test_avg(fun() -> test_avg(fun() ->
@ -312,6 +323,7 @@ the `add` function.
1000000), 1000000),
io:format("Timing ec_dict implementation of ec_dictionary~n"), io:format("Timing ec_dict implementation of ec_dictionary~n"),
time_dict_type(ec_dict). time_dict_type(ec_dict).
```
The `test_avg` function runs the provided function the number of times The `test_avg` function runs the provided function the number of times
specified in the second argument and collects timing information. We specified in the second argument and collects timing information. We
@ -323,18 +335,19 @@ we don't have to hard code the calls for the Signature
implementations. Lets take a look at the `time_dict_type` function. implementations. Lets take a look at the `time_dict_type` function.
:::erlang ```erlang
time_dict_type(Type) -> time_dict_type(Type) ->
io:format("Testing ~p~n", [Type]), io:format("Testing ~p~n", [Type]),
Dict = create_dictionary(Type), Dict = create_dictionary(Type),
test_avg(fun() -> test_avg(fun() ->
ec_dictionary:size(ec_dictionary:add(some_key, some_value, Dict)) ec_dictionary:size(ec_dictionary:add(some_key, some_value, Dict))
end, end,
1000000). 1000000).
```
As you can see we take the type as an argument (we need it for `dict` As you can see we take the type as an argument (we need it for `dict`
creation) and call our create function. Then we run the same timings creation) and call our create function. Then we run the same timings
that we did for ec dict. In this case though, the type of dictionary that we did for `ec_dict`. In this case though, the type of dictionary
is never specified, we only ever call ec_dictionary, so this test will is never specified, we only ever call ec_dictionary, so this test will
work for anything that implements that Signature. work for anything that implements that Signature.
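
The `test_avg` helper is used above but its body is not shown here. The following is a rough sketch of how such a helper might be written, assuming it times each call with `timer:tc/1` and reports range, median and average in microseconds. The module name `timing_sketch` and the returned tuple are illustrative assumptions, not the real `ec_timing` implementation.

```erlang
%% Hypothetical stand-in for the test_avg helper used in the timing code.
-module(timing_sketch).
-export([test_avg/2]).

%% Run Fun Count times, timing every call in microseconds, then print and
%% return {Min, Max, Median, Average}.
test_avg(Fun, Count) when is_function(Fun, 0), is_integer(Count), Count > 0 ->
    %% timer:tc/1 returns {MicroSeconds, Result}; we keep only the time.
    Times = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, Count)],
    Sorted = lists:sort(Times),
    Min = hd(Sorted),
    Max = lists:last(Sorted),
    %% Lower median for even Count, exact middle for odd Count.
    Median = lists:nth((Count + 1) div 2, Sorted),
    Average = round(lists:sum(Sorted) / Count),
    io:format("Range: ~b - ~b mics~n", [Min, Max]),
    io:format("Median: ~b mics~n", [Median]),
    io:format("Average: ~b mics~n", [Average]),
    {Min, Max, Median, Average}.
```

Collecting every sample in a list keeps the sketch simple, but at a million iterations it costs memory. A real helper might keep running totals instead.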
@ -343,24 +356,25 @@ work for anything that implements that Signature.
So we have our tests, what was the result. Well on my laptop this is So we have our tests, what was the result. Well on my laptop this is
what it looked like. what it looked like.
:::sh ```sh
Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false] Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.2 (abort with ^G) Eshell V5.8.2 (abort with ^G)
1> ec_timing:time_direct_vs_signature_dict(). 1> ec_timing:time_direct_vs_signature_dict().
Timing dict Timing dict
Range: 2 - 5621 mics Range: 2 - 5621 mics
Median: 3 mics Median: 3 mics
Average: 3 mics Average: 3 mics
Timing ec_dict implementation of ec_dictionary Timing ec_dict implementation of ec_dictionary
Testing ec_dict Testing ec_dict
Range: 3 - 6097 mics Range: 3 - 6097 mics
Median: 3 mics Median: 3 mics
Average: 4 mics Average: 4 mics
2> 2>
```
So for the direct dict call, we average about 3 mics per call, while So for the direct `dict` call, we average about 3 mics per call, while
for the Signature Wrapper we average around 4. That's a 25% cost for for the Signature Wrapper we average around 4. That's a 25% cost for
Signature Wrappers in this example, for a very small number of Signature Wrappers in this example, for a very small number of
calls. Depending on what you are doing that is going to be greater or calls. Depending on what you are doing that is going to be greater or
@ -373,22 +387,23 @@ Signature, but it is not a Signature Wrapper. It is a native
implementation of the Signature. To use `ec_rbdict` directly we have implementation of the Signature. To use `ec_rbdict` directly we have
to create a creation helper just like we did for dict. to create a creation helper just like we did for dict.
:::erlang ```erlang
create_rbdict() -> create_rbdict() ->
lists:foldl(fun(El, Dict) -> lists:foldl(fun(El, Dict) ->
ec_rbdict:add(El, El, Dict) ec_rbdict:add(El, El, Dict)
end, ec_rbdict:new(), end, ec_rbdict:new(),
lists:seq(1,100)). lists:seq(1,100)).
```
This is exactly the same as `create_dict` with the exception that dict This is exactly the same as `create_dict` with the exception that dict
is replaced by `ec_rbdict`. is replaced by `ec_rbdict`.
The timing function itself looks very similar as well. Again notice The timing function itself looks very similar as well. Again notice
that we have to hard code the concrete name for the concrete that we have to hard code the concrete name for the concrete
implementation, but we don't for the ec_dictionary test. implementation, but we don't for the `ec_dictionary` test.
:::erlang ```erlang
time_direct_vs_signature_rbdict() -> time_direct_vs_signature_rbdict() ->
io:format("Timing rbdict~n"), io:format("Timing rbdict~n"),
Dict = create_rbdict(), Dict = create_rbdict(),
test_avg(fun() -> test_avg(fun() ->
@ -397,6 +412,7 @@ implementation, but we don't for the ec_dictionary test.
1000000), 1000000),
io:format("Timing ec_dict implementation of ec_dictionary~n"), io:format("Timing ec_dict implementation of ec_dictionary~n"),
time_dict_type(ec_rbdict). time_dict_type(ec_rbdict).
```
And there we have our test. What do the results look like? And there we have our test. What do the results look like?
@ -406,22 +422,23 @@ The main thing we are timing here is the additional cost of the
dictionary Signature itself. Keep that in mind as we look at the dictionary Signature itself. Keep that in mind as we look at the
results. results.
:::sh ```sh
Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false] Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.2 (abort with ^G) Eshell V5.8.2 (abort with ^G)
1> ec_timing:time_direct_vs_signature_rbdict(). 1> ec_timing:time_direct_vs_signature_rbdict().
Timing rbdict Timing rbdict
Range: 6 - 15070 mics Range: 6 - 15070 mics
Median: 7 mics Median: 7 mics
Average: 7 mics Average: 7 mics
Timing ec_dict implementation of ec_dictionary Timing ec_dict implementation of ec_dictionary
Testing ec_rbdict Testing ec_rbdict
Range: 6 - 6013 mics Range: 6 - 6013 mics
Median: 7 mics Median: 7 mics
Average: 7 mics Average: 7 mics
2> 2>
```
So no difference it time. Well the reality is that there is a So no difference it time. Well the reality is that there is a
difference in timing, there must be, but we don't have enough difference in timing, there must be, but we don't have enough