cleanup README for v2.0
parent 04bd9dd827
commit 5db1d9cc05
1 changed file with 22 additions and 81 deletions
103
README.md
|
@@ -21,7 +21,6 @@ copyright 2010-2013 alisdair sullivan
|
||||||
- [`json_term()`](#json_term)
|
- [`json_term()`](#json_term)
|
||||||
- [`json_text()`](#json_text)
|
- [`json_text()`](#json_text)
|
||||||
- [`event()`](#event)
|
- [`event()`](#event)
|
||||||
- [`token()`](#token)
|
|
||||||
- [`option()`](#option)
|
- [`option()`](#option)
|
||||||
* [exports](#exports)
|
* [exports](#exports)
|
||||||
- [`encoder/3`, `decoder/3` & `parser/3`](#encoder3-decoder3--parser3)
|
- [`encoder/3`, `decoder/3` & `parser/3`](#encoder3-decoder3--parser3)
|
||||||
|
@@ -122,8 +121,7 @@ real world usage
|
||||||
jsx is pragmatic. the json spec allows extensions so jsx extends the spec in a
|
jsx is pragmatic. the json spec allows extensions so jsx extends the spec in a
|
||||||
number of ways. see the section on `strict` in [options](#option) below though
|
number of ways. see the section on `strict` in [options](#option) below though
|
||||||
|
|
||||||
there's not supposed to be any comments in json but when did comments ever hurt
|
json has no official comments but this parser allows c/c++ style comments.
|
||||||
anyone? json has no official comments but this parser allows c/c++ style comments.
|
|
||||||
anywhere whitespace is allowed you can insert comments (both `// ...` and `/* ... */`)
|
anywhere whitespace is allowed you can insert comments (both `// ...` and `/* ... */`)
|
||||||
|
|
||||||
all jsx decoder input should be `utf8` encoded binaries. sometimes you get binaries
|
all jsx decoder input should be `utf8` encoded binaries. sometimes you get binaries
|
||||||
|
@@ -145,7 +143,7 @@ ignores bad escape sequences
|
||||||
|
|
||||||
**json** | **erlang**
|
**json** | **erlang**
|
||||||
--------------------------------|--------------------------------
|
--------------------------------|--------------------------------
|
||||||
`number` | `integer()` and `float()`
|
`number` | `integer()` if possible, `float()` otherwise
|
||||||
`string` | `binary()`
|
`string` | `binary()`
|
||||||
`true`, `false` and `null` | `true`, `false` and `null`
|
`true`, `false` and `null` | `true`, `false` and `null`
|
||||||
`array` | `[]` and `[JSON]`
|
`array` | `[]` and `[JSON]`
|
||||||
|
@@ -153,17 +151,18 @@ ignores bad escape sequences
|
||||||
|
|
||||||
* numbers
|
* numbers
|
||||||
|
|
||||||
javascript and thus json represent all numeric values with floats. as
|
javascript and thus json represent all numeric values with floats. there's no
|
||||||
this is woefully insufficient for many uses, **jsx**, just like erlang,
|
reason for erlang -- a language that supports arbitrarily large integers -- to
|
||||||
supports bigints. whenever possible, this library will interpret json
|
restrict all numbers to the ieee754 range
|
||||||
numbers that look like integers as integers. other numbers will be converted
|
|
||||||
to erlang's floating point type, which is nearly but not quite ieee754.
|
|
||||||
negative zero is not representable in erlang (zero is unsigned in erlang and
|
|
||||||
`0` is equivalent to `-0`) and will be interpreted as regular zero. numbers
|
|
||||||
not representable are beyond the concern of this implementation, and will
|
|
||||||
result in parsing errors
|
|
||||||
|
|
||||||
when converting from erlang to json, numbers are represented with their
|
whenever possible, **jsx** will interpret json numbers that look like integers as
|
||||||
|
integers. other numbers will be converted to erlang's floating point type, which
|
||||||
|
is nearly but not quite ieee754. negative zero is not representable in erlang (zero
|
||||||
|
is unsigned in erlang and `0` is equivalent to `-0`) and will be interpreted as
|
||||||
|
regular zero. numbers not representable are beyond the concern of this implementation,
|
||||||
|
and will result in parsing errors
|
||||||
|
|
||||||
|
when converting from erlang to json, floats are represented with their
|
||||||
shortest representation that will round trip without loss of precision. this
|
shortest representation that will round trip without loss of precision. this
|
||||||
means that some floats may be superficially dissimilar (although
|
means that some floats may be superficially dissimilar (although
|
||||||
functionally equivalent). for example, `1.0000000000000001` will be
|
functionally equivalent). for example, `1.0000000000000001` will be
|
||||||
|
@@ -171,30 +170,22 @@ ignores bad escape sequences
|
||||||
|
|
||||||
* strings
|
* strings
|
||||||
|
|
||||||
the json [spec][rfc4627] is frustratingly vague on the exact details of json
|
json strings must be unicode. in practice, because **jsx** only accepts
|
||||||
strings. json must be unicode, but no encoding is specified. javascript
|
`utf8` all strings must be `utf8`. in addition to being unicode json strings
|
||||||
explicitly allows strings containing codepoints explicitly disallowed by
|
restrict a number of codepoints and define a number of escape sequences
|
||||||
unicode. json allows implementations to set limits on the content of
|
|
||||||
strings. other implementations attempt to resolve this in various ways. this
|
|
||||||
implementation, in default operation, only accepts strings that meet the
|
|
||||||
constraints set out in the json spec (strings are sequences of unicode
|
|
||||||
codepoints delimited by `"` (`u+0022`) that may not contain control codes
|
|
||||||
unless properly escaped with `\` (`u+005c`)) and that are encoded in `utf8`
|
|
||||||
|
|
||||||
the utf8 restriction means improperly paired surrogates are explicitly
|
|
||||||
disallowed. `u+d800` to `u+dfff` are allowed, but only when they form valid
|
|
||||||
surrogate pairs. surrogates encountered otherwise result in errors
|
|
||||||
|
|
||||||
json string escapes of the form `\uXXXX` will be converted to their
|
json string escapes of the form `\uXXXX` will be converted to their
|
||||||
equivalent codepoints during parsing. this means control characters and
|
equivalent codepoints during parsing. this means control characters and
|
||||||
other codepoints disallowed by the json spec may be encountered in resulting
|
other codepoints disallowed by the json spec may be encountered in resulting
|
||||||
strings, but codepoints disallowed by the unicode spec will not be. in the
|
strings. the utf8 restriction means the surrogates are explicitly disallowed.
|
||||||
interest of pragmatism there is an [option](#option) for looser parsing
|
if a string contains escaped surrogates (`u+d800` to `u+dfff`) they are
|
||||||
|
interpreted but only when they form valid surrogate pairs. surrogates
|
||||||
|
encountered otherwise are replaced with the replacement codepoint (`u+fffd`)
|
||||||
|
|
||||||
all erlang strings are represented by **valid** `utf8` encoded binaries. the
|
all erlang strings are represented by **valid** `utf8` encoded binaries. the
|
||||||
encoder will check strings for conformance. noncharacters (like `u+ffff`)
|
encoder will check strings for conformance. noncharacters (like `u+ffff`)
|
||||||
are allowed in erlang utf8 encoded binaries, but not in strings passed to
|
are allowed in erlang utf8 encoded binaries, but will be replaced in strings
|
||||||
the encoder (although, again, see [options](#option))
|
passed to the encoder (although, again, see [options](#option))
|
||||||
|
|
||||||
this implementation performs no normalization on strings beyond that
|
this implementation performs no normalization on strings beyond that
|
||||||
detailed here. be careful when comparing strings as equivalent strings
|
detailed here. be careful when comparing strings as equivalent strings
|
||||||
|
@@ -275,34 +266,6 @@ json_text() = binary()
|
||||||
|
|
||||||
a utf8 encoded binary containing a json string
|
a utf8 encoded binary containing a json string
|
||||||
|
|
||||||
|
|
||||||
#### `token()` ####
|
|
||||||
|
|
||||||
```erlang
|
|
||||||
event() = start_object
|
|
||||||
| end_object
|
|
||||||
| start_array
|
|
||||||
| end_array
|
|
||||||
| {key, binary()}
|
|
||||||
| {string, binary()}
|
|
||||||
| binary()
|
|
||||||
| {integer, integer()}
|
|
||||||
| integer()
|
|
||||||
| {float, float()}
|
|
||||||
| float()
|
|
||||||
| {literal, true}
|
|
||||||
| true
|
|
||||||
| {literal, false}
|
|
||||||
| false
|
|
||||||
| {literal, null}
|
|
||||||
| null
|
|
||||||
| end_json
|
|
||||||
```
|
|
||||||
|
|
||||||
the representation used during syntactic analysis. you can generate this
|
|
||||||
yourself and feed it to `jsx:parser/3` if you'd like to define your own
|
|
||||||
representations
|
|
||||||
|
|
||||||
#### `event()` ####
|
#### `event()` ####
|
||||||
|
|
||||||
```erlang
|
```erlang
|
||||||
|
@@ -409,28 +372,6 @@ additional options beyond these. see
|
||||||
|
|
||||||
see [incomplete input](#incomplete-input)
|
see [incomplete input](#incomplete-input)
|
||||||
|
|
||||||
- `incomplete_handler` & `error_handler`
|
|
||||||
|
|
||||||
the default incomplete and error handlers can be replaced with user defined
|
|
||||||
handlers. if options include `{error_handler, F}` and/or
|
|
||||||
`{incomplete_handler, F}` where `F` is a function of arity 3 they will be
|
|
||||||
called instead of the default handler. the spec for `F` is as follows
|
|
||||||
```erlang
|
|
||||||
F(Remaining, InternalState, Config) -> any()
|
|
||||||
|
|
||||||
Remaining = binary() | term()
|
|
||||||
InternalState = opaque()
|
|
||||||
Config = list()
|
|
||||||
```
|
|
||||||
`Remaining` is the binary fragment or term that caused the error
|
|
||||||
|
|
||||||
`InternalState` is an opaque structure containing the internal state of the
|
|
||||||
parser/decoder/encoder
|
|
||||||
|
|
||||||
`Config` is a list of options/flags in use by the parser/decoder/encoder
|
|
||||||
|
|
||||||
these functions should be considered experimental for now
|
|
||||||
|
|
||||||
|
|
||||||
## exports ##
|
## exports ##
|
||||||
|
|
||||||
|
|
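for reference, a minimal sketch of the number mapping that the revised README text describes (json numbers that look like integers become `integer()`, other numbers become `float()`). it assumes the jsx application is on the code path and that `jsx:decode/1` returns an erlang list for a json array. the module name `jsx_numbers_sketch` and the function `demo/0` are hypothetical and are not part of this commit.

```erlang
-module(jsx_numbers_sketch).
-export([demo/0]).

%% hypothetical helper, not part of jsx: decodes a small json array and
%% checks which erlang type each number was mapped to.
%% assumption: jsx is available on the code path.
demo() ->
    Json = <<"[1, 1.5, 123456789012345678901234567890]">>,
    [Int, Float, Big] = jsx:decode(Json),
    %% a number that looks like an integer is decoded as integer()
    true = is_integer(Int),
    %% a number with a fractional part is decoded as float()
    true = is_float(Float),
    %% integers beyond the ieee754 exact range stay integers (erlang bigints)
    true = is_integer(Big),
    ok.
```

calling `jsx_numbers_sketch:demo()` from a shell with jsx loaded should return `ok` if the mapping behaves as the README states; a `badmatch` error would indicate a mismatch.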