edoc documentation added

alisdair sullivan 2010-08-19 23:30:22 -07:00
parent c098b06e88
commit 6ff74e6d59
12 changed files with 445 additions and 16 deletions

doc/overview.edoc Normal file

@ -0,0 +1,136 @@
@author Alisdair Sullivan <alisdairsullivan@yahoo.ca>
@copyright 2010 Alisdair Sullivan
@version really, really beta
@title jsx
@doc jsx is a json parser with an easily transformable representation, the ability to parse streams incrementally, low memory overhead and a clean generator based interface. it also includes an implementation of <a href="http://www.erlang.org/eeps/eep-0018.html">eep0018</a>, a json reformatter and a json verifier
== contents ==
<ol>
<li>{@section introduction}</li>
<li>{@section features}</li>
<li>{@section usage}</li>
<li>{@section contributing}</li>
<li>{@section acknowledgements}</li>
</ol>
== introduction ==
what's a language without strong json integration? one that no one is gonna use for much of anything that requires integration with other systems
erlang has a number of json libraries, some of them very good, but all of them tied to fairly specific representations and usage scenarios. working with json in erlang, outside of simple decoding into terms that probably aren't ideal, is laborious and clumsy. jsx seeks to correct this unfortunate state
== features ==
jsx is not an end to end json parser. jsx takes a json document and produces a generator that returns a jsx event and a new generator. a jsx event is an atom or tuple that represents a json structural element (like the start or end of an object or array) or a json value (like a string or a number). this provides a simple, easy to consume, iterative api that makes it easy to produce arbitrarily complex representations of the json document
the representation of jsx events was chosen for pragmatic reasons. strings, integers and floats are encoded in json as strings of unicode characters and many erlang functions operate on lists so returning them as lists of unicode codepoints is both efficient and convenient. structural elements are converted to descriptive atoms for ease of matching and clarity. json literals (`true', `false' and `null') are encoded as atoms for ease of matching and wrapped in a tagged tuple to differentiate them from structural elements
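because value events carry their payload as a list of codepoints, turning one into a native erlang term usually takes a single standard library call. the following is a minimal sketch, not part of jsx: the module name `jsx_values' and the function `to_term/1' are hypothetical, and `list_to_float/1' only accepts text with a fraction part (like `"3.0"'), so floats written as `"1e5"' would need extra handling
```
-module(jsx_values).
-export([to_term/1]).

%% hypothetical helper: convert a single jsx value event into an erlang term.
%% strings become utf8 binaries, numbers become erlang numbers and
%% literals are unwrapped to the bare atom
to_term({string, Chars}) ->
    unicode:characters_to_binary(Chars);
to_term({integer, Chars}) ->
    list_to_integer(Chars);
to_term({float, Chars}) ->
    %% assumes the text contains a fraction part, e.g. "3.0"
    list_to_float(Chars);
to_term({literal, Atom}) when Atom =:= true; Atom =:= false; Atom =:= null ->
    Atom.
'''
structural events (like `start_object') are deliberately not handled here; a caller would match those separately to decide how to nest the converted values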
in cases where an incomplete json document is supplied to the parser, upon reaching the end of the document the generator may also return a new function that allows another chunk of the json document to be parsed as if parsing were never interrupted. this is useful for parsing very large json documents (to avoid holding the entire document in memory) or for parsing as data is made available (like over a network or from storage)
jsx attempts to follow the <a href="http://www.ietf.org/rfc/rfc4627.txt?number=4627">json specification</a> as closely as possible but is realistic about actual usage of json and provides optional support for comments, json values not wrapped in a json object or array and streams containing multiple json documents
jsx is wicked fast. sort of. it's as fast as possible without sacrificing a useable interface, at least. things like efficiency of binary matching, elimination of unused intermediate states, the relative costs of using lists or binaries and even garbage collection were tested and taken into consideration during development
== usage ==
`jsx:parser()' is the entry point for a jsx parser. it returns a function that takes a binary and attempts to parse it as if it were a chunk of valid (or not so valid, depending on the options passed to `jsx:parser/1') json encoded in utf8, utf16 (big or little endian) or utf32 (big or little endian). the parser is incremental: input is parsed on demand, one event at a time. each result is a tuple of the form:
```
{event, Event, Next}

Event = start_object
    | start_array
    | end_object
    | end_array
    | {key, [Character]}
    | {string, [Character]}
    | {integer, [Character]}
    | {float, [Character]}
    | {literal, true}
    | {literal, false}
    | {literal, null}
    | end_json

Character -- a unicode codepoint represented by an erlang integer
Next -- a function of arity zero that, when invoked, returns the next event
'''
the decoder can also return two other tuples:
```
{incomplete, More}
More -- a function of arity 1 that accepts additional input for
the parser and resumes parsing as if never interrupted. the semantics
are as if the new binary were appended to the already parsed binary
{error, badjson}
'''
`incomplete' is returned when input is exhausted. `error' is returned when invalid json input is detected. how obvious
putting all of this together, the following short module:
```
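-module(jsx_ex).
-export([simple_decode/1]).

simple_decode(JSON) when is_binary(JSON) ->
    P = jsx:parser(),
    decode(P(JSON), []).

%% end_json marks the end of the document, return the collected events
decode({event, end_json, _Next}, Acc) ->
    lists:reverse(Acc);
%% any other event is collected and the generator is asked for the next one
decode({event, Event, Next}, Acc) ->
    decode(Next(), [Event | Acc]).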
'''
does this when called from the shell with suitable input:
```
1> jsx_ex:simple_decode(
<<"{
\"dead rock stars\": [\"kurt cobain\", \"elliott smith\", \"nicky wire\"],
\"total\": 3.0,
\"courtney killed kurt\": true
}">>
).
[start_object,
{key,"dead rock stars"},
start_array,
{string,"kurt cobain"},
{string,"elliott smith"},
{string,"nicky wire"},
end_array,
{key,"total"},
{float,"3.0"},
{key,"courtney killed kurt"},
{literal,true},
end_object]
'''
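the `jsx_ex' module above only matches `event' tuples, so an `incomplete' or `error' result would crash it with a function clause error. the following is a minimal sketch of a chunk aware variant, under some assumptions: the module name `jsx_ex_chunks' and its `{ok, Events}' / `{error, Reason}' return shape are hypothetical, and the parser is passed in as an argument (for example the fun returned by `jsx:parser()') rather than created inside the module
```
-module(jsx_ex_chunks).
-export([decode/2]).

%% Parser is a fun of arity 1, such as the one returned by jsx:parser/0.
%% Chunks is a non-empty list of binaries that together form one document
decode(Parser, [First | Rest]) when is_function(Parser, 1), is_binary(First) ->
    loop(Parser(First), Rest, []).

%% end of document: return the events in the order they were emitted
loop({event, end_json, _Next}, _Chunks, Acc) ->
    {ok, lists:reverse(Acc)};
%% ordinary event: record it and ask the generator for the next one
loop({event, Event, Next}, Chunks, Acc) ->
    loop(Next(), Chunks, [Event | Acc]);
%% input exhausted but another chunk is available: resume parsing with it
loop({incomplete, More}, [Chunk | Rest], Acc) ->
    loop(More(Chunk), Rest, Acc);
%% input exhausted and nothing left to feed the parser
loop({incomplete, _More}, [], _Acc) ->
    {error, incomplete};
%% invalid json was detected
loop({error, badjson}, _Chunks, _Acc) ->
    {error, badjson}.
'''
it could be called as `jsx_ex_chunks:decode(jsx:parser(), [<<"[true, ">>, <<"false]">>])'. taking the parser as an argument keeps the loop independent of the options passed to `jsx:parser/1'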
jsx also has an eep0018 decoder and encoder, a json pretty printer and a json verifier:
```
1> jsx:json_to_term(<<"[1, true, \"hi\"]">>).
[1,true,<<"hi">>]
2> jsx:term_to_json([{name, <<"alisdair">>}, {balance, -3742.35}, {employed, false}]).
<<"{\"name\":\"alisdair\",\"balance\":-3742.35,\"employed\":false}">>
3> jsx:is_json(<<"{}">>).
true
4> jsx:format(<<" \t\t [\n \"i love whitespace\"\n ]\n ">>).
<<"[\"i love whitespace\"]">>
'''
== contributing ==
jsx is available on <a href="http://github.com/talentdeficit/jsx">github</a>. users are encouraged to fork, edit and make pull requests
== acknowledgements ==
jsx wouldn't be possible without Lloyd Hilaiel's <a href="http://lloyd.github.com/yajl/">yajl</a> and the encouragement of the erlang community. thanks also must be given to the <a href="https://mail.mozilla.org/listinfo/es-discuss">es-discuss mailing list</a> and Douglas Crockford for insight into the intricacies of json

@ -57,8 +57,7 @@
-type jsx_parser_result() :: {event, jsx_event(), fun(() -> jsx_parser_result())} -type jsx_parser_result() :: {event, jsx_event(), fun(() -> jsx_parser_result())}
| {incomplete, jsx_parser()} | {incomplete, jsx_parser()}
| {error, badjson} | {error, badjson}.
| ok.
-type supported_utf() :: utf8 | utf16 | {utf16, little} | utf32 | {utf32, little}. -type supported_utf() :: utf8 | utf16 | {utf16, little} | utf32 | {utf32, little}.

@ -21,6 +21,12 @@
%% THE SOFTWARE. %% THE SOFTWARE.
%% @author Alisdair Sullivan <alisdairsullivan@yahoo.ca>
%% @copyright 2010 Alisdair Sullivan
%% @version really, really beta
%% @doc this module defines the interface to the jsx json parsing library
-module(jsx). -module(jsx).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").
@ -34,13 +40,126 @@
-export([eventify/1]). -export([eventify/1]).
%% types for function specifications %% function and type specifications
-include("./include/jsx.hrl"). -include("./include/jsx.hrl").
%% @type jsx_parser() = (binary()) -> jsx_parser_result().
%% @type jsx_parser_result() = {event, jsx_event(), (() -> jsx_parser_result())}
%% | {incomplete, jsx_parser()}
%% | {error, badjson}.
%% @type jsx_event() = start_object
%% | end_object
%% | start_array
%% | end_array
%% | end_json
%% | {key, unicode_string()}
%% | {string, unicode_string()}
%% | {integer, unicode_string()}
%% | {float, unicode_string()}
%% | {literal, true}
%% | {literal, false}
%% | {literal, null}.
%% @type unicode_string() = [integer()].
%% @type jsx_opts() = [jsx_opt()].
%% @type jsx_opt() = {comments, true | false}
%% | {escaped_unicode, ascii | codepoint | none}
%% | {multi_term, true | false}
%% | {encoding, auto | supported_utf()}.
%% @type supported_utf() = utf8 | utf16 | {utf16, little} | utf32 | {utf32, little}.
%% @type eep0018() = eep0018_object() | eep0018_array().
%% @type eep0018_array() = [eep0018_term()].
%% @type eep0018_object() = [{eep0018_key(), eep0018_term()}].
%% @type eep0018_key() = binary() | atom().
%% @type eep0018_term() = eep0018_array() | eep0018_object() | eep0018_string() | eep0018_number() | true | false | null.
%% @type eep0018_string() = binary().
%% @type eep0018_number() = float() | integer().
%% @type encoder_opts() = [encoder_opt()].
%% @type encoder_opt() = {strict, true | false}
%% | {encoding, supported_utf()}
%% | {space, integer()}
%% | space
%% | {indent, integer()}
%% | indent.
%% @type decoder_opts() = [decoder_opt()].
%% @type decoder_opt() = {strict, true | false}
%% | {comments, true | false}
%% | {encoding, supported_utf()}
%% | {label, atom | binary | existing_atom}
%% | {float, true | false}.
%% @type verify_opts() = [verify_opt()].
%% @type verify_opt() = {strict, true | false}
%% | {encoding, auto | supported_utf()}
%% | {comments, true | false}.
%% @type format_opts() = [format_opt()].
%% @type format_opt() = {strict, true | false}
%% | {encoding, auto | supported_utf()}
%% | {comments, true | false}
%% | {space, integer()}
%% | space
%% | {indent, integer()}
%% | indent
%% | {output_encoding, supported_utf()}.
%% @spec parser() -> jsx_parser()
%% @equiv parser([])
parser() -> parser() ->
parser([]). parser([]).
%% @spec parser(Opts::jsx_opts()) -> jsx_parser()
%% @doc
%% produces a function which takes a binary which may or may not represent an encoded json document and returns a generator
%%
%% ```
%% options:
%%
%% {comments, true | false}
%% if true, json documents that contain c style (/* ... */) comments
%% will be parsed as if they did not contain any comments. default is
%% false
%%
%% {escaped_unicode, ascii | codepoint | none}
%% if a \uXXXX escape sequence is encountered within a key or string,
%% this option controls how it is interpreted. none makes no attempt
%% to interpret the value, leaving it unconverted. ascii will convert
%% any value that falls within the ascii range. codepoint will convert
%% any value that is a valid unicode codepoint. note that unicode
%% non-characters (including badly formed surrogates) will never be
%% converted. codepoint is the default
%%
%% {encoding, auto | utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% attempt to parse the binary using the specified encoding. auto will
%% auto detect any supported encoding and is the default
%%
%% {multi_term, true | false}
%% usually, documents will be parsed in full before the end_json
%% event is emitted. setting this option to true will instead emit
%% the end_json event as soon as a valid document is parsed and then
%% reset the parser to its initial state and attempt to parse the
%% remainder as a new json document. this allows streams containing
%% multiple documents to be parsed correctly
%% '''
%% @end
parser(OptsList) -> parser(OptsList) ->
F = case proplists:get_value(encoding, OptsList, auto) of F = case proplists:get_value(encoding, OptsList, auto) of
@ -57,36 +176,169 @@ parser(OptsList) ->
end. end.
term_to_json(JSON) -> %% @spec json_to_term(JSON::binary()) -> eep0018()
term_to_json(JSON, []). %% @equiv json_to_term(JSON, [])
term_to_json(JSON, Opts) ->
jsx_eep0018:term_to_json(JSON, Opts).
json_to_term(JSON) -> json_to_term(JSON) ->
json_to_term(JSON, []). json_to_term(JSON, []).
%% @spec json_to_term(JSON::binary(), Opts::decoder_opts()) -> eep0018()
%% @doc
%% produces an eep0018 representation of a binary encoded json document
%%
%% ```
%% options:
%%
%% {strict, true | false}
%% by default, attempting to convert unwrapped json values (numbers, strings and
%% the atoms true, false and null) results in a badarg exception. if strict equals
%% false, these are instead decoded to their equivalent eep0018 value. default is
%% false
%%
%% {encoding, auto | utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% assume the binary is encoded using the specified encoding. default is auto, which
%% attempts to autodetect the encoding
%%
%% {comments, true | false}
%% if true, json documents that contain c style (/* ... */) comments
%% will be parsed as if they did not contain any comments. default is
%% false
%%
%% {label, atom | existing_atom | binary}
%% json keys (labels) are decoded to utf8 encoded binaries, atoms or
%% existing_atoms (atom if it exists, binary otherwise) as specified by
%% this option. default is binary
%%
%% {float, true | false}
%% return all numbers as floats. default is false
%% '''
%% @end
json_to_term(JSON, Opts) -> json_to_term(JSON, Opts) ->
jsx_eep0018:json_to_term(JSON, Opts). jsx_eep0018:json_to_term(JSON, Opts).
%% @spec term_to_json(JSON::eep0018()) -> binary()
%% @equiv term_to_json(JSON, [])
term_to_json(JSON) ->
term_to_json(JSON, []).
%% @spec term_to_json(JSON::eep0018(), Opts::encoder_opts()) -> binary()
%% @doc
%% takes the erlang representation of a json object (as defined in eep0018) and returns a (binary encoded) json string
%%
%% ```
%% options:
%%
%% {strict, true | false}
%% by default, attempting to convert unwrapped json values (numbers,
%% strings and the atoms true, false and null) results in a badarg exception.
%% if strict equals false, these are instead json encoded. default is false
%%
%% note that there is a problem of ambiguity when parsing unwrapped json
%% numbers that requires special handling, see the [[notes|technotes]]
%%
%% {encoding, utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% the encoding of the resulting binary. default is utf8
%%
%% space
%% {space, N}
%% place N spaces after each colon and comma in the resulting binary. space
%% implies {space, 1}. default is zero
%%
%% indent
%% {indent, N}
%% indent each 'level' of the json structure by N spaces. indent implies
%% {indent, 1}. default is zero
%% '''
%% @end
term_to_json(JSON, Opts) ->
jsx_eep0018:term_to_json(JSON, Opts).
%% @spec is_json(JSON::binary()) -> true | false
%% @equiv is_json(JSON, [])
is_json(JSON) -> is_json(JSON) ->
is_json(JSON, []). is_json(JSON, []).
%% @spec is_json(JSON::binary(), verify_opts()) -> true | false
%% @doc
%% returns true if the binary is an encoded json document, false otherwise
%%
%% ```
%% options:
%%
%% {strict, true | false}
%% by default, unwrapped json values (numbers, strings and the atoms
%% true, false and null) return false. if strict equals true, is_json
%% returns true. default is false
%%
%% {encoding, auto | utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% assume the binary is encoded using the specified encoding. default is auto,
%% which attempts to autodetect the encoding
%%
%% {comments, true | false}
%% if true, json documents that contain c style (/* ... */) comments
%% will be parsed as if they did not contain any comments. default is
%% false
%% '''
%% @end
is_json(JSON, Opts) -> is_json(JSON, Opts) ->
jsx_verify:is_json(JSON, Opts). jsx_verify:is_json(JSON, Opts).
%% @spec format(JSON::binary()) -> binary()
%% @equiv format(JSON, [])
format(JSON) -> format(JSON) ->
format(JSON, []). format(JSON, []).
%% @spec format(JSON::binary(), Opts::format_opts()) -> binary()
%% @doc
%% formats a binary encoded json string according to the options chosen. the defaults will produce a string stripped of all whitespace
%%
%% ```
%% options:
%%
%% {strict, true | false}
%% by default, unwrapped json values (numbers, strings and the atoms
%% true, false and null) result in an error. if strict equals true, they
%% are treated as valid json. default is false
%%
%% {encoding, auto | utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% assume the binary is encoded using the specified encoding. default is auto,
%% which attempts to autodetect the encoding
%%
%% {output_encoding, utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} }
%% the encoding of the resulting binary. default is utf8
%%
%% {comments, true | false}
%% if true, json documents that contain c style (/* ... */) comments
%% will be parsed as if they did not contain any comments. default is
%% false
%%
%% space
%% {space, N}
%% place N spaces after each colon and comma in the resulting binary. space
%% implies {space, 1}. default is zero
%%
%% indent
%% {indent, N}
%% indent each 'level' of the json structure by N spaces. indent implies
%% {indent, 1}. default is zero
%% '''
%% @end
format(JSON, Opts) -> format(JSON, Opts) ->
jsx_format:format(JSON, Opts). jsx_format:format(JSON, Opts).
%% fake the jsx api for any list, useful if you want to serialize a structure to %% @spec eventify(List::list()) -> jsx_parser_result()
%% json using the pretty printer, or verify a sequence could be valid json %% @doc fake the jsx api for any list. useful if you want to serialize a structure to json using the pretty printer, or verify a sequence could be valid json
eventify([]) -> eventify([]) ->
fun() -> {incomplete, fun(List) when is_list(List) -> eventify(List); (_) -> erlang:error(badarg) end} end; fun() -> {incomplete, fun(List) when is_list(List) -> eventify(List); (_) -> erlang:error(badarg) end} end;
eventify([Next|Rest]) -> eventify([Next|Rest]) ->

@ -21,6 +21,10 @@
%% THE SOFTWARE. %% THE SOFTWARE.
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_eep0018). -module(jsx_eep0018).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -21,6 +21,10 @@
%% THE SOFTWARE. %% THE SOFTWARE.
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_format). -module(jsx_format).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -21,26 +21,31 @@
%% THE SOFTWARE. %% THE SOFTWARE.
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_test). -module(jsx_test).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").
-ifndef(test). -ifndef(test).
-export([test/0]). -export([test/0]).
-endif. -endif.
-ifdef(test). -ifdef(test).
-include_lib("eunit/include/eunit.hrl"). -include_lib("eunit/include/eunit.hrl").
-endif. -endif.
%% if not compiled with test support %% if not compiled with test support
-ifndef(test). -ifndef(test).
test() -> erlang:error(notest). test() -> erlang:error(notest).
-else. -else.
jsx_decoder_test_() -> jsx_decoder_test_() ->
jsx_decoder_gen(load_tests(?eunit_test_path)). jsx_decoder_gen(load_tests(?eunit_test_path)).

@ -1,5 +1,10 @@
-file("priv/jsx_decoder_template.erl", 1). -file("priv/jsx_decoder_template.erl", 1).
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_utf16). -module(jsx_utf16).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -1,5 +1,10 @@
-file("priv/jsx_decoder_template.erl", 1). -file("priv/jsx_decoder_template.erl", 1).
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_utf16le). -module(jsx_utf16le).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -1,5 +1,10 @@
-file("priv/jsx_decoder_template.erl", 1). -file("priv/jsx_decoder_template.erl", 1).
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_utf32). -module(jsx_utf32).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -1,5 +1,10 @@
-file("priv/jsx_decoder_template.erl", 1). -file("priv/jsx_decoder_template.erl", 1).
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_utf32le). -module(jsx_utf32le).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -1,5 +1,10 @@
-file("priv/jsx_decoder_template.erl", 1). -file("priv/jsx_decoder_template.erl", 1).
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_utf8). -module(jsx_utf8).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").

@ -21,6 +21,10 @@
%% THE SOFTWARE. %% THE SOFTWARE.
%% @hidden hide this module from edoc, exported functions are internal to jsx
%% and may be altered or removed without notice
-module(jsx_verify). -module(jsx_verify).
-author("alisdairsullivan@yahoo.ca"). -author("alisdairsullivan@yahoo.ca").