cowboy/doc/src/guide/multipart.ezdoc

::: Multipart requests

Multipart originates from MIME, an Internet standard that
extends the format of emails. Multipart messages are a
container for parts of any content-type.

For example, a multipart message may have a part
containing text and a second part containing an
image. This is what allows you to attach files
to emails.

In the context of HTTP, multipart is most often used
with the `multipart/form-data` content-type. This is
the content-type you have to use when you want browsers
to be allowed to upload files through HTML forms.

Multipart is of course not required for uploading
files; it is only required when you want to do so
through HTML forms.

You can read and parse multipart messages using the
Req object directly.

Cowboy defines two functions that allow you to get
information about each part and read its contents.

:: Structure

A multipart message is a list of parts. Parts may
contain either a multipart message or a non-multipart
content-type. This allows parts to be arranged in a
tree structure, although this is a rare case as far
as the Web is concerned.

:: Form-data

In the normal case, when a form is submitted, the
browser will use the `application/x-www-form-urlencoded`
content-type. This type is just a list of keys and
values and is therefore not fit for uploading files.

That's where the `multipart/form-data` content-type
comes in. When the form is configured to use this
content-type, the browser will use one part of the
message for each form field. This means that a file
input field will be sent in its own part, but the
same applies to all other kinds of fields.

A form with a text input, a file input and a select
choice box will result in a multipart message with
three parts, one for each field.

The browser does its best to determine the content-type
of the files it sends this way, but you should not
rely on it for determining the contents of the file.
Proper investigation of the contents is recommended.
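
As a minimal sketch of such an investigation, the following
function compares the first bytes of an uploaded body against
a few well-known file signatures. The module `upload_sniff`
and the function `sniff/1` are hypothetical names and not part
of Cowboy. The list of signatures is deliberately short, and a
real application would likely need a more thorough check.

``` erlang
-module(upload_sniff).
-export([sniff/1]).

%% Guess the file type from the leading bytes of the body.
%% Hypothetical helper, not part of Cowboy.
sniff(<<16#89, "PNG\r\n", 16#1A, "\n", _/binary>>) -> png;
sniff(<<16#FF, 16#D8, 16#FF, _/binary>>) -> jpeg;
sniff(<<"GIF87a", _/binary>>) -> gif;
sniff(<<"GIF89a", _/binary>>) -> gif;
sniff(<<"%PDF-", _/binary>>) -> pdf;
sniff(Body) when is_binary(Body) -> unknown.
```

The result can then be compared with the content-type the
browser announced for the part before deciding whether to
accept the file.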

:: Checking the content-type

While there is a variety of multipart messages, the
most common on the Web is `multipart/form-data`. It's
the type of message being sent when an HTML form
allows uploading files.

You can quickly figure out if a multipart message
has been sent by parsing the `content-type` header.

``` erlang
{<<"multipart">>, <<"form-data">>, _}
    = cowboy_req:parse_header(<<"content-type">>, Req).
```
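
The match above crashes when the request is not a
`multipart/form-data` message. If you would rather branch on
the result, a small helper can return a boolean instead. This
is only a sketch: the module `multipart_check` and the function
`is_form_data/1` are hypothetical names, and the helper assumes
it is given the value returned by `cowboy_req:parse_header/2`.

``` erlang
-module(multipart_check).
-export([is_form_data/1]).

%% Takes the parsed content-type value, as returned by
%% cowboy_req:parse_header(<<"content-type">>, Req).
%% Any other value, including a missing header, yields false.
is_form_data({<<"multipart">>, <<"form-data">>, _Params}) -> true;
is_form_data(_) -> false.
```

It could be called as
`multipart_check:is_form_data(cowboy_req:parse_header(<<"content-type">>, Req))`.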

:: Reading a multipart message

To read a message you have to iterate over all its
parts. Then, for each part, you can inspect its headers
and read its body.

``` erlang
multipart(Req) ->
    case cowboy_req:part(Req) of
        {ok, _Headers, Req2} ->
            {ok, _Body, Req3} = cowboy_req:part_body(Req2),
            multipart(Req3);
        {done, Req2} ->
            Req2
    end.
```

Parts do not have a size limit. When a part body is
too big, Cowboy will return what it read so far and
allow you to continue if you wish to do so.

The function `cow_multipart:form_data/1` can be used
to quickly obtain information about a part from a
`multipart/form-data` message. This function will
tell you if the part is for a normal field or if it
is a file being uploaded.

This can be used for example to allow large part bodies
for files but crash when a normal field is too large.

``` erlang
multipart(Req) ->
    case cowboy_req:part(Req) of
        {ok, Headers, Req2} ->
            Req4 = case cow_multipart:form_data(Headers) of
                {data, _FieldName} ->
                    {ok, _Body, Req3} = cowboy_req:part_body(Req2),
                    Req3;
                {file, _FieldName, _Filename, _CType, _CTransferEncoding} ->
                    stream_file(Req2)
            end,
            multipart(Req4);
        {done, Req2} ->
            Req2
    end.

stream_file(Req) ->
    case cowboy_req:part_body(Req) of
        {ok, _Body, Req2} ->
            Req2;
        {more, _Body, Req2} ->
            stream_file(Req2)
    end.
```
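
The `stream_file/1` function above discards the data it reads.
As a sketch of what an application might do instead, the
following variant writes each chunk to a file as it arrives.
The module `upload_store`, the function names and the `Path`
argument are hypothetical. The sketch assumes the application
chooses `Path` itself rather than trusting the filename sent
by the client.

``` erlang
-module(upload_store).
-export([stream_to_file/2]).

%% Write the body of the current part to Path, chunk by chunk.
%% Path is assumed to be chosen by the application.
stream_to_file(Req, Path) ->
    {ok, IoDevice} = file:open(Path, [write, raw, binary]),
    try
        write_chunks(Req, IoDevice)
    after
        %% Close the file even if reading or writing crashes.
        file:close(IoDevice)
    end.

write_chunks(Req, IoDevice) ->
    case cowboy_req:part_body(Req) of
        {ok, Body, Req2} ->
            %% Last chunk of this part.
            ok = file:write(IoDevice, Body),
            Req2;
        {more, Body, Req2} ->
            %% More data follows; keep reading.
            ok = file:write(IoDevice, Body),
            write_chunks(Req2, IoDevice)
    end.
```

Writing chunk by chunk keeps memory usage bounded by the chunk
size instead of the size of the uploaded file.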

By default the body chunk Cowboy will return is limited
to 8MB. This can of course be overridden. Both functions
can take a second argument, the same list of options that
is passed to the `cowboy_req:body/2` function.
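
For example, a normal field could be read with a much smaller
limit so that oversized values crash early. This is a sketch
only: the function name `read_small_field/1` is hypothetical,
and the `length` option is assumed to be the one accepted by
`cowboy_req:body/2` in the same Cowboy version, so check the
manual for the version you are using.

``` erlang
%% Read a normal field, accepting at most about 1MB.
%% The {length, N} option is an assumption, see above.
read_small_field(Req) ->
    %% A larger body returns a more tuple, which does not
    %% match here and therefore crashes the handler.
    {ok, Body, Req2} = cowboy_req:part_body(Req, [{length, 1000000}]),
    {Body, Req2}.
```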

:: Skipping unwanted parts

If you do not want to read a part's body, you can skip it.
Skipping is easy. If you do not call the function to read
the part's body, Cowboy will automatically skip it when
you request the next part.

The following snippet reads all part headers and skips
all bodies:

``` erlang
multipart(Req) ->
    case cowboy_req:part(Req) of
        {ok, _Headers, Req2} ->
            multipart(Req2);
        {done, Req2} ->
            Req2
    end.
```

Similarly, if you start reading the body and it ends up
being too big, you can simply continue with the next part;
Cowboy will automatically skip what remains.

Note that the skipping rate may not be adequate for your
application. If you observe poor performance when skipping,
you might want to consider manually skipping by calling
the `cowboy_req:part_body/1` function directly.

And if you started reading the message but decide that you
do not need the remaining parts, you can simply stop reading
entirely and Cowboy will automatically figure out what to do.
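
As a sketch of stopping early, the following function reads
parts until it finds a normal field with the given name, then
returns its value without looking at the remaining parts. The
function name `find_field/2` and its return values are
hypothetical. It assumes the field is small enough to be
returned in a single chunk.

``` erlang
%% Return the value of the first normal field called Name.
%% Name is a binary, for example <<"title">>.
find_field(Name, Req) ->
    case cowboy_req:part(Req) of
        {ok, Headers, Req2} ->
            case cow_multipart:form_data(Headers) of
                {data, Name} ->
                    %% Found it; the remaining parts are never read.
                    {ok, Body, Req3} = cowboy_req:part_body(Req2),
                    {ok, Body, Req3};
                _ ->
                    %% Another field or a file; its body is skipped
                    %% when the next part is requested.
                    find_field(Name, Req2)
            end;
        {done, Req2} ->
            {error, Req2}
    end.
```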