cleanup README for v2.0

alisdair sullivan 2013-11-03 23:39:52 +00:00
parent 04bd9dd827
commit 5db1d9cc05

103
README.md

@@ -21,7 +21,6 @@ copyright 2010-2013 alisdair sullivan
 - [`json_term()`](#json_term)
 - [`json_text()`](#json_text)
 - [`event()`](#event)
-- [`token()`](#token)
 - [`option()`](#option)
 * [exports](#exports)
 - [`encoder/3`, `decoder/3` & `parser/3`](#encoder3-decoder3--parser3)
@@ -122,8 +121,7 @@ real world usage
 jsx is pragmatic. the json spec allows extensions so jsx extends the spec in a
 number of ways. see the section on `strict` in [options](#option) below though
-there's not supposed to be any comments in json but when did comments ever hurt
-anyone? json has no official comments but this parser allows c/c++ style comments.
+json has no official comments but this parser allows c/c++ style comments.
 anywhere whitespace is allowed you can insert comments (both `// ...` and `/* ... */`)
 all jsx decoder input should be `utf8` encoded binaries. sometimes you get binaries
@@ -145,7 +143,7 @@ ignores bad escape sequences
 **json** | **erlang**
 --------------------------------|--------------------------------
-`number` | `integer()` and `float()`
+`number` | `integer()` if possible, `float()` otherwise
 `string` | `binary()`
 `true`, `false` and `null` | `true`, `false` and `null`
 `array` | `[]` and `[JSON]`
@@ -153,17 +151,18 @@ ignores bad escape sequences
 * numbers
-javascript and thus json represent all numeric values with floats. as
-this is woefully insufficient for many uses, **jsx**, just like erlang,
-supports bigints. whenever possible, this library will interpret json
-numbers that look like integers as integers. other numbers will be converted
-to erlang's floating point type, which is nearly but not quite iee754.
-negative zero is not representable in erlang (zero is unsigned in erlang and
-`0` is equivalent to `-0`) and will be interpreted as regular zero. numbers
-not representable are beyond the concern of this implementation, and will
-result in parsing errors
+javascript and thus json represent all numeric values with floats. there's no
+reason for erlang -- a language that supports arbitrarily large integers -- to
+restrict all numbers to the ieee754 range
+
+whenever possible, **jsx** will interpret json numbers that look like integers as
+integers. other numbers will be converted to erlang's floating point type, which
+is nearly but not quite iee754. negative zero is not representable in erlang (zero
+is unsigned in erlang and `0` is equivalent to `-0`) and will be interpreted as
+regular zero. numbers not representable are beyond the concern of this implementation,
+and will result in parsing errors
-when converting from erlang to json, numbers are represented with their
+when converting from erlang to json, floats are represented with their
 shortest representation that will round trip without loss of precision. this
 means that some floats may be superficially dissimilar (although
 functionally equivalent). for example, `1.0000000000000001` will be
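
to make the number rule in the hunk above concrete, here is a minimal sketch of "integer if possible, float otherwise" using only the erlang standard library. it is not jsx's implementation: the module `number_sketch` and the function `to_number/1` are hypothetical names chosen for illustration, and the sketch does not cover the full json number grammar (for example, `binary_to_float/1` rejects `1e5` because it requires a decimal point)

```erlang
-module(number_sketch).
-export([to_number/1]).

%% hypothetical helper, not part of jsx. tries to read the binary as an
%% integer first (erlang integers are arbitrary precision) and falls back
%% to a float only when the integer conversion fails
-spec to_number(binary()) -> integer() | float().
to_number(Bin) when is_binary(Bin) ->
    try
        binary_to_integer(Bin)
    catch
        %% not an integer literal; anything binary_to_float/1 also
        %% rejects raises badarg to the caller
        error:badarg -> binary_to_float(Bin)
    end.
```

under these assumptions `to_number(<<"-0">>)` yields the integer `0`, since erlang integers have no signed zero, and an integer literal far beyond the ieee754 range stays an exact integer rather than becoming a float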
@@ -171,30 +170,22 @@ ignores bad escape sequences
 * strings
-the json [spec][rfc4627] is frustratingly vague on the exact details of json
-strings. json must be unicode, but no encoding is specified. javascript
-explicitly allows strings containing codepoints explicitly disallowed by
-unicode. json allows implementations to set limits on the content of
-strings. other implementations attempt to resolve this in various ways. this
-implementation, in default operation, only accepts strings that meet the
-constraints set out in the json spec (strings are sequences of unicode
-codepoints deliminated by `"` (`u+0022`) that may not contain control codes
-unless properly escaped with `\` (`u+005c`)) and that are encoded in `utf8`
-the utf8 restriction means improperly paired surrogates are explicitly
-disallowed. `u+d800` to `u+dfff` are allowed, but only when they form valid
-surrogate pairs. surrogates encountered otherwise result in errors
+json strings must be unicode. in practice, because **jsx** only accepts
+`utf8` all strings must be `utf8`. in addition to being unicode json strings
+restrict a number of codepoints and define a number of escape sequences
 json string escapes of the form `\uXXXX` will be converted to their
 equivalent codepoints during parsing. this means control characters and
 other codepoints disallowed by the json spec may be encountered in resulting
-strings, but codepoints disallowed by the unicode spec will not be. in the
-interest of pragmatism there is an [option](#option) for looser parsing
+strings. the utf8 restriction means the surrogates are explicitly disallowed.
+if a string contains escaped surrogates (`u+d800` to `u+dfff`) they are
+interpreted but only when they form valid surrogate pairs. surrogates
+encountered otherwise are replaced with the replacement codepoint (`u+fffd`)
 all erlang strings are represented by **valid** `utf8` encoded binaries. the
 encoder will check strings for conformance. noncharacters (like `u+ffff`)
-are allowed in erlang utf8 encoded binaries, but not in strings passed to
-the encoder (although, again, see [options](#option))
+are allowed in erlang utf8 encoded binaries, but will be replaced in strings
+passed to the encoder (although, again, see [options](#option))
 this implementation performs no normalization on strings beyond that
 detailed here. be careful when comparing strings as equivalent strings
@@ -275,34 +266,6 @@ json_text() = binary()
 a utf8 encoded binary containing a json string
-#### `token()` ####
-```erlang
-event() = start_object
-| end_object
-| start_array
-| end_array
-| {key, binary()}
-| {string, binary()}
-| binary()
-| {integer, integer()}
-| integer()
-| {float, float()}
-| float()
-| {literal, true}
-| true
-| {literal, false}
-| false
-| {literal, null}
-| null
-| end_json
-```
-the representation used during syntactic analysis. you can generate this
-yourself and feed it to `jsx:parser/3` if you'd like to define your own
-representations
 #### `event()` ####
 ```erlang
@@ -409,28 +372,6 @@ additional options beyond these. see
 see [incomplete input](#incomplete-input)
-- `incomplete_handler` & `error_handler`
-the default incomplete and error handlers can be replaced with user defined
-handlers. if options include `{error_handler, F}` and/or
-`{incomplete_handler, F}` where `F` is a function of arity 3 they will be
-called instead of the default handler. the spec for `F` is as follows
-```erlang
-F(Remaining, InternalState, Config) -> any()
-Remaining = binary() | term()
-InternalState = opaque()
-Config = list()
-```
-`Remaining` is the binary fragment or term that caused the error
-`InternalState` is an opaque structure containing the internal state of the
-parser/decoder/encoder
-`Config` is a list of options/flags in use by the parser/decoder/encoder
-these functions should be considered experimental for now
 ## exports ##