the twitter json api has a ton of detail in its stream, most of which is ignorable. this parser extracts just the user's screen name, the text and the time, and throws the rest away
`jsx:parser/0` returns a function with arity 1. call it with a binary containing a properly encoded json stream and it returns one of the following values:
*`{event, Event, Next}`: Event is described below and Next is a zero arity function that returns the next value
*`{incomplete, More}`: More is described below
*`{error, badjson}`: the parser has decided the json stream is not a valid json document
`jsx:parser/1` has the same return signature as `jsx:parser/0` but accepts a proplist containing the following options:
*`{escaped_unicode, ascii | codepoint | none}`: determines what escaped unicode sequences like `"\ua123"` are converted to in key/string events. ascii converts any sequences in the range 0-127 to their ascii value (`"\u0021"` would become 33), codepoint converts all valid sequences to their unicode codepoint value (`"\uabcd"` would become 43981), none does no conversion (`"\u0021"` would become the sequence 92, 117, 48, 48, 50, 49)
*`{encoding, utf8 | utf16 | {utf16, little} | utf32 | {utf32, little} | auto}`: determines which encoding to expect the binary to use. a binary in any other encoding will result in an error. auto detects the encoding automatically and is the default
*`{multi_term, true | false}`: normally, only whitespace is allowed after the end of a json document. passing this option with true alters parsing so that another json document is permitted after the end of a json document, with any amount of whitespace in between. default is false
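the three `escaped_unicode` conversions can be checked in plain erlang without jsx. this is only a sketch: the module `escape_demo` and its functions are made up for illustration and are not part of jsx, each function handles a single six character escape, and leaving sequences above 127 untouched in `ascii` mode is an assumption drawn from the wording above:

```erlang
-module(escape_demo).
-export([codepoint/1, ascii/1, none/1]).

%% hypothetical helpers that mimic the documented results for one
%% escape such as "\\u0021" (the list 92, 117, 48, 48, 50, 49).
%% this is not jsx's own code.

%% codepoint: the four hex digits become a single codepoint
codepoint([$\\, $u | Hex]) when length(Hex) =:= 4 ->
    [list_to_integer(Hex, 16)].

%% ascii: convert only values in the range 0-127
%% (assumption: anything larger is left unconverted)
ascii([$\\, $u | Hex] = Escape) when length(Hex) =:= 4 ->
    case list_to_integer(Hex, 16) of
        N when N =< 127 -> [N];
        _ -> Escape
    end.

%% none: no conversion at all
none(Escape) -> Escape.
```

`escape_demo:codepoint("\\uabcd")` returns `[43981]`, `escape_demo:ascii("\\u0021")` returns `[33]` and `escape_demo:none("\\u0021")` returns `[92,117,48,48,50,49]`, matching the values listed for the option. the option itself is passed in the proplist, for example `jsx:parser([{escaped_unicode, codepoint}])`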
More is a new parser, returned when parsing reaches the end of the stream supplied to the parser. calling it with a new stream resumes parsing as if the stream had never been interrupted. because json documents may be followed by arbitrary whitespace, a json stream has no unambiguous end, so Next will always eventually return `{incomplete, More}`. to ensure the stream is clean and contains no garbage in its tail, call `More(end_stream)`. if parsing is complete, `ok` is returned. otherwise `{error, badjson}` is returned
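the return values above suggest a simple driver loop. the following is a sketch, not part of jsx: the module `jsx_drive` and the function `collect/2` are hypothetical names, and the only thing assumed about the parser is the return signature described above. the parser is passed in as an argument, so the module compiles on its own:

```erlang
-module(jsx_drive).
-export([collect/2]).

%% collect(Parser, Chunks) -> {ok, Events} | {error, badjson}
%% Parser is a fun of arity 1, for example the result of jsx:parser().
%% Chunks is a non-empty list of binaries that together hold the stream.
collect(Parser, [Chunk | Chunks]) ->
    loop(Parser(Chunk), Chunks, []).

%% an event arrived: keep it and ask for the next value
loop({event, Event, Next}, Chunks, Acc) ->
    loop(Next(), Chunks, [Event | Acc]);
%% input ran out but more chunks are available: resume with the next one
loop({incomplete, More}, [Chunk | Chunks], Acc) ->
    loop(More(Chunk), Chunks, Acc);
%% input ran out and nothing is left: check that the tail is clean
loop({incomplete, More}, [], Acc) ->
    case More(end_stream) of
        ok -> {ok, lists:reverse(Acc)};
        {error, badjson} -> {error, badjson}
    end;
loop({error, badjson}, _Chunks, _Acc) ->
    {error, badjson}.
```

a call might look like `jsx_drive:collect(jsx:parser(), [<<"[1, 2">>, <<", 3]">>])`. collecting every event into a list defeats the purpose of a streaming parser for large inputs. it is done here only to keep the example short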
events
------
a complete list of events. Next is described under *api* above:
*`{event, start_object | end_object | start_array | end_array, Next}`: emitted when `{`, `}`, `[`, `]` are encountered by the parser in a legal position
*`{event, end_json, Next}`: emitted when the json document has been completed (the root object/array has been closed). note that this does NOT ensure the json document is valid. the tail must still be checked to be free of invalid characters, as described above under *api*
*`{event, {key, Key}, Next}`: an object key has been encountered. Key has the same format as String below
*`{event, {string, String}, Next}`: a string has been encountered. String is a list of unicode codepoints
*`{event, {integer, Integer}, Next}`: an integer has been encountered. Integer is a list of unicode codepoints that can be passed to `erlang:list_to_integer/1` to convert to an integer
*`{event, {float, Float}, Next}`: a float has been encountered. Float is a list of unicode codepoints that can be passed to `erlang:list_to_float/1` to convert to a float
*`{event, {literal, true | false | null}, Next}`: a json literal has been encountered. it will be the atom `true`, the atom `false` or the atom `null`
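the payloads of the scalar events can be turned into erlang terms with the conversions named above. a minimal sketch: the module `jsx_values` and the function `value/1` are hypothetical, and turning keys and strings into utf8 binaries is just one possible choice, not something jsx does itself:

```erlang
-module(jsx_values).
-export([value/1]).

%% value(Event) -> erlang term for the payload of a scalar event

%% integers and floats arrive as lists of codepoints
value({integer, Integer}) -> list_to_integer(Integer);
value({float, Float}) -> list_to_float(Float);
%% keys and strings are lists of codepoints. encode them as utf8 binaries
value({string, String}) -> unicode:characters_to_binary(String);
value({key, Key}) -> unicode:characters_to_binary(Key);
%% literals are already the atoms true, false or null
value({literal, Literal}) -> Literal.
```

for example, `jsx_values:value({integer, "42"})` returns `42` and `jsx_values:value({literal, null})` returns `null`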
notes
-----
to compile and install, run `make && make install` from the root of the project directory
jsx supports utf8, utf16 (little and big endian) and utf32 (little and big endian). future support is planned for erlang iolists