mirror of https://github.com/ninenines/cowboy.git synced 2025-07-14 20:30:23 +00:00

Greatly improve the guide introduction

This commit is contained in:
Loïc Hoguin 2013-06-27 00:02:12 +02:00
parent b059a1237f
commit 61ca86b054
6 changed files with 563 additions and 88 deletions

guide/erlang_beginners.md Normal file

@ -0,0 +1,43 @@
Erlang for beginners
====================
Chances are you are interested in using Cowboy, but have
no idea how to write an Erlang program. Fear not! This
chapter will help you get started.
We recommend two books for beginners. You should read them
both at some point, as they cover Erlang from two entirely
different perspectives.
Learn You Some Erlang for Great Good!
-------------------------------------
The quickest way to get started with Erlang is by reading
a book with the funny name of [LYSE](http://learnyousomeerlang.com),
as we affectionately call it.
It will get right into the syntax and quickly answer the questions
a beginner would ask themselves, all the while showing funny
pictures and making insightful jokes.
You can read an early version of the book online for free,
but you really should buy the much more refined paper and
ebook versions.
Programming Erlang
------------------
After writing some code, you will probably want to understand
the very concepts that make Erlang what it is today. These
are best explained by Joe Armstrong, the godfather of Erlang,
in his book [Programming Erlang](http://pragprog.com/book/jaerlang2/programming-erlang).
Instead of going into every single detail of the language,
Joe focuses on the central concepts behind Erlang, and shows
you how they can be used to write a variety of different
applications.
At the time of writing, the 2nd edition of the book is in beta,
and includes a few details about upcoming Erlang features that
cannot be used today. Choose the edition you want, then get
reading!

guide/erlang_web.md Normal file

@ -0,0 +1,181 @@
Erlang and the Web
==================
The Web is concurrent
---------------------
When you access a website there is little concurrency
involved. A few connections are opened and requests
are sent through these connections. Then the web page
is displayed on your screen. Your browser will only
open up to 4 or 8 connections to the server, depending
on your settings. This isn't much.
But think about it. You are not the only one accessing
the server at the same time. There can be hundreds, if
not thousands, if not millions of connections to the
same server at the same time.
Even today a lot of systems used in production haven't
solved the C10K problem (ten thousand concurrent connections).
And the ones that did are trying hard to get to the next
step, C100K, and are pretty far from it.
Erlang meanwhile has no problem handling millions of
connections. At the time of writing there are application
servers written in Erlang that can handle more than two
million connections on a single server in a real production
application, with spare memory and CPU!
The Web is concurrent, and Erlang is a language designed
for concurrency, so it is a perfect match.
Of course, various platforms need to scale beyond a few
million connections. This is where Erlang's built-in
distribution mechanisms come in. If one server isn't
enough, add more! Erlang allows you to use the same code
for talking to local processes or to processes in other
parts of your cluster, which means you can scale very
quickly if the need arises.
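As a rough illustration of that location transparency, the sketch below sends the same message to a pid, to a registered name, and to a `{Name, Node}` pair. The module name `location_demo` and the registered name `echo` are made up for this example, and it only uses the local node; on a real cluster the `Node` part would simply be the name of another node.

``` erlang
-module(location_demo).
-export([start/0, ping/1]).

%% Start an echo process and register it under the
%% hypothetical name `echo`.
start() ->
    Pid = spawn(fun loop/0),
    register(echo, Pid),
    Pid.

loop() ->
    receive
        {ping, From, Ref} ->
            From ! {pong, Ref},
            loop()
    end.

%% Dest may be a pid, a registered name or a {Name, Node} tuple.
%% The sending code is identical in all three cases.
ping(Dest) ->
    Ref = make_ref(),
    Dest ! {ping, self(), Ref},
    receive
        {pong, Ref} -> pong
    after 1000 ->
        timeout
    end.
```

Calling `location_demo:ping({echo, node()})` goes through the same code path that would be used to reach a process registered on a remote node.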
The Web has large userbases, and the Erlang platform was
designed to work in a distributed setting, so it is a
perfect match.
Or is it? Surely you can find solutions to handle that many
concurrent connections with my favorite language... But all
these solutions will break down in the next few years. Why?
Firstly, because servers are not getting much faster per core; instead
they get a lot more cores and memory. This is only useful
if your application can use them properly, and Erlang is
light-years away from anything else in that area. Secondly,
today your computer and your phone are online, tomorrow your
watch, goggles, bike, car, fridge and tons of other devices
will also connect to various applications on the Internet.
Only Erlang is prepared to deal with what's coming.
The Web is soft real time
-------------------------
What does soft real time mean, you ask? It means we want the
operations done as quickly as possible, and in the case of
web applications, it means we want the data propagated fast.
In comparison, hard real time has a similar meaning, but also
has a hard time constraint, for example an operation needs to
be done in under N milliseconds otherwise the system fails
entirely.
Users aren't that needy yet. They just want to get access
to their content within a reasonable delay, and they want the
actions they make to register at most a few seconds after
they submitted them, otherwise they'll start worrying about
whether it successfully went through.
The Web is soft real time because taking longer to perform an
operation would be seen as bad quality of service.
Erlang is a soft real time system. It will always run
processes fairly, a little at a time, switching to another
process after a while and preventing a single process from
stealing resources from all others. This means that Erlang
can guarantee stable low latency of operations.
Erlang provides the guarantees that the soft real time Web
requires.
The Web is asynchronous
-----------------------
Long ago, the Web was synchronous because HTTP was synchronous.
You fired a request, and then waited for a response. Not anymore.
It all started when XMLHttpRequest started being used. It allowed
the client to perform asynchronous calls to the server.
Then Websocket appeared and allowed both the server and the client
to send data to the other endpoint completely asynchronously. The
data is contained within frames and no response is necessary.
Erlang processes work the same. They send each other data contained
within messages and then continue running without needing a response.
They tend to spend most of their time inactive, waiting for a new
message, and the Erlang VM happily activates them when one is received.
It is therefore quite easy to imagine Erlang being good at receiving
Websocket frames, which may come in at unpredictable times, passing the
data to the responsible processes which are always ready waiting for
new messages, and performing the operations required by only activating
the required parts of the system.
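The message passing described above can be sketched in a few lines. This is only an illustration: the module name `async_demo` and the message shapes `{add, N}` and `{total, From, Ref}` are invented for this example.

``` erlang
-module(async_demo).
-export([start/0, loop/1, total/1]).

%% Start a counter process. It sits idle in `receive`
%% until a message arrives.
start() ->
    spawn(?MODULE, loop, [0]).

loop(Sum) ->
    receive
        {add, N} ->
            %% Fire-and-forget: the sender gets no reply.
            loop(Sum + N);
        {total, From, Ref} ->
            From ! {Ref, Sum},
            loop(Sum)
    end.

%% Ask for the running total. Only this call waits for an answer.
total(Pid) ->
    Ref = make_ref(),
    Pid ! {total, self(), Ref},
    receive
        {Ref, Sum} -> Sum
    after 1000 ->
        timeout
    end.
```

A caller can send `Pid ! {add, 1}` any number of times and continue immediately; it only blocks if and when it chooses to ask for the total.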
The more recent Web technologies, like Websocket of course, but also
SPDY and HTTP/2.0, are all fully asynchronous protocols. The concept
of requests and responses is retained of course, but anything could
be sent in between, by either the client or the server, and the
responses could also be received in a completely different order.
Erlang is by nature asynchronous and really good at it thanks to the
great engineering that has been done in the VM over the years. It's
only natural that it's so good at dealing with the asynchronous Web.
The Web is omnipresent
----------------------
The Web has taken a very important part of our lives. We're
connected at all times, when we're on our phone, using our computer,
passing time using a tablet while in the bathroom... And this
isn't going to slow down, every single device at home or on us
will be connected.
All these devices are always connected. And with the number of
alternatives to give you access to the content you seek, users
tend to not stick around when problems arise. Users today want
their applications to be always available, and if an application
has too many issues they just move on.
Despite this, when developers choose a product to use for building
web applications, their only concern seems to be "Is it fast?",
and they look around for synthetic benchmarks showing which one
is the fastest at sending "Hello world" with only a handful of
concurrent connections. Web benchmarks haven't been representative
of reality in a long time, and are drifting further away as
time goes on.
What developers should really ask themselves is "Can I service
all my users with no interruption?" and they'd find that they have
two choices. They can either hope for the best, or they can use
Erlang.
Erlang is built for fault tolerance. When writing code in any other
language, you have to check all the return values and act accordingly
to avoid any unforeseen issues. If you're lucky, you won't miss
anything important. When writing Erlang code, you can just check
the success condition and ignore all errors. If an error happens,
the Erlang process crashes and is then restarted by a special
process called a supervisor.
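The following is a minimal sketch of that crash-and-restart cycle, using the standard OTP `supervisor` behaviour. The module name `restart_demo`, the registered name `demo_worker` and the `{divide, From, A, B}` message are assumptions made for this example; a real application would normally build its workers on `gen_server` rather than a bare loop.

``` erlang
-module(restart_demo).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0, worker_loop/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Always restart the worker when it dies; give up only
    %% after more than 5 restarts within 10 seconds.
    Worker = {worker, {?MODULE, start_worker, []},
        permanent, 5000, worker, [?MODULE]},
    {ok, {{one_for_one, 5, 10}, [Worker]}}.

%% Called by the supervisor at startup and after every crash.
start_worker() ->
    Pid = spawn_link(?MODULE, worker_loop, []),
    register(demo_worker, Pid),
    {ok, Pid}.

worker_loop() ->
    receive
        {divide, From, A, B} ->
            %% Only the success case is written. Dividing by
            %% zero crashes this process.
            From ! {result, A div B},
            worker_loop()
    end.
```

Sending `demo_worker ! {divide, self(), 1, 0}` kills the worker with a `badarith` error, and shortly afterwards a fresh process is registered under the same name.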
The Erlang developer thus has no need to fear unhandled
errors, and can focus on handling only the errors that should
give some feedback to the user and let the system take care of
the rest. This also has the advantage of allowing them to write
a lot less code, and letting them sleep at night.
Erlang's fault tolerance oriented design is the first piece of
what makes it the best choice for the omnipresent, always available
Web.
The second piece is Erlang's built-in distribution. Distribution
is a key part of building a fault tolerant system, because it
allows you to handle bigger failures, like a whole server going
down, or even a data center entirely.
Fault tolerance and distribution are important today, and will be
vital in the future of the Web. Erlang is ready.
Erlang is the ideal platform for the Web
----------------------------------------
Erlang provides all the important features that the Web requires
or will require in the near future. Erlang is a perfect match
for the Web, and it only makes sense to use it to build web
applications.

guide/getting_started.md Normal file

@ -0,0 +1,80 @@
Getting started
===============
Cowboy does nothing by default.
Cowboy requires the `crypto` and `ranch` applications to be started.
``` erlang
ok = application:start(crypto).
ok = application:start(ranch).
ok = application:start(cowboy).
```
Cowboy uses Ranch for handling the connections and provides convenience
functions to start Ranch listeners.
The `cowboy:start_http/4` function starts a listener for HTTP connections
using the TCP transport. The `cowboy:start_https/4` function starts a
listener for HTTPS connections using the SSL transport.
Listeners are a group of processes that are used to accept and manage
connections. The processes used specifically for accepting connections
are called acceptors. The number of acceptor processes is unrelated to
the maximum number of connections Cowboy can handle. Please refer to
the [Ranch guide](http://ninenines.eu/docs/en/ranch/HEAD/guide/toc)
for in-depth information.
Listeners are named. They spawn a given number of acceptors, listen for
connections using the given transport options and pass along the protocol
options to the connection processes. The protocol options must include
the dispatch list for routing requests to handlers.
The dispatch list is explained in greater detail in the
[Routing](routing.md) chapter.
``` erlang
Dispatch = cowboy_router:compile([
    %% {URIHost, list({URIPath, Handler, Opts})}
    {'_', [{'_', my_handler, []}]}
]),
%% Name, NbAcceptors, TransOpts, ProtoOpts
cowboy:start_http(my_http_listener, 100,
    [{port, 8080}],
    [{env, [{dispatch, Dispatch}]}]
).
```
Cowboy features many kinds of handlers. For this simple example,
we will just use the plain HTTP handler, which has three callback
functions: init/3, handle/2 and terminate/3. You can find more information
about the arguments and possible return values of these callbacks in the
[cowboy_http_handler function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_http_handler).
Following is an example of a simple HTTP handler module.
``` erlang
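-module(my_handler).
-behaviour(cowboy_http_handler).

-export([init/3]).
-export([handle/2]).
-export([terminate/3]).

%% Called once per request; the returned state is passed to handle/2.
init({tcp, http}, Req, _Opts) ->
    {ok, Req, undefined_state}.

%% Reply with a 200 status, no extra headers and a short body.
handle(Req, State) ->
    {ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req),
    {ok, Req2, State}.

%% Nothing to clean up for this handler.
terminate(_Reason, _Req, _State) ->
    ok.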
```
The `Req` variable above is the Req object, which allows the developer
to obtain information about the request and to perform a reply. Its usage
is explained in the [cowboy_req function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_req).
You can find many examples in the `examples/` directory of the
Cowboy repository. A more complete "Hello world" example can be
found in the `examples/hello_world/` directory.


@ -21,16 +21,12 @@ features both a Function Reference and a User Guide.
Prerequisites
-------------
It is assumed the developer already knows Erlang and has basic knowledge
about the HTTP protocol.
No Erlang knowledge is required for reading this guide. The reader will
be introduced to Erlang concepts and redirected to reference material
whenever necessary.
In order to run the examples available in this user guide, you will need
Erlang and rebar installed and in your $PATH.
Please see the [rebar repository](https://github.com/basho/rebar) for
downloading and building instructions. Please look up the environment
variables documentation of your system for details on how to update the
$PATH information.
Knowledge of the HTTP protocol is recommended but not required, as it
will be detailed throughout the guide.
Supported platforms
-------------------
@ -57,81 +53,4 @@ Header names are case insensitive. Cowboy converts all the request
header names to lowercase, and expects your application to provide
lowercase header names in the response.
Getting started
---------------
Cowboy does nothing by default.
Cowboy requires the `crypto` and `ranch` applications to be started.
``` erlang
ok = application:start(crypto).
ok = application:start(ranch).
ok = application:start(cowboy).
```
Cowboy uses Ranch for handling the connections and provides convenience
functions to start Ranch listeners.
The `cowboy:start_http/4` function starts a listener for HTTP connections
using the TCP transport. The `cowboy:start_https/4` function starts a
listener for HTTPS connections using the SSL transport.
Listeners are a group of processes that are used to accept and manage
connections. The processes used specifically for accepting connections
are called acceptors. The number of acceptor processes is unrelated to
the maximum number of connections Cowboy can handle. Please refer to
the Ranch guide for in-depth information.
Listeners are named. They spawn a given number of acceptors, listen for
connections using the given transport options and pass along the protocol
options to the connection processes. The protocol options must include
the dispatch list for routing requests to handlers.
The dispatch list is explained in greater details in the Routing section
of the guide.
``` erlang
Dispatch = cowboy_router:compile([
%% {URIHost, list({URIPath, Handler, Opts})}
{'_', [{'_', my_handler, []}]}
]),
%% Name, NbAcceptors, TransOpts, ProtoOpts
cowboy:start_http(my_http_listener, 100,
[{port, 8080}],
[{env, [{dispatch, Dispatch}]}]
).
```
Cowboy features many kinds of handlers. It has plain HTTP handlers, loop
handlers, Websocket handlers, REST handlers and static handlers. Their
usage is documented in the respective sections of the guide.
Most applications use the plain HTTP handler, which has three callback
functions: init/3, handle/2 and terminate/3. You can find more information
about the arguments and possible return values of these callbacks in the
HTTP handlers section of this guide. Following is an example of a simple
HTTP handler module.
``` erlang
-module(my_handler).
-behaviour(cowboy_http_handler).
-export([init/3]).
-export([handle/2]).
-export([terminate/3]).
init({tcp, http}, Req, Opts) ->
{ok, Req, undefined_state}.
handle(Req, State) ->
{ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req),
{ok, Req2, State}.
terminate(Reason, Req, State) ->
ok.
```
The `Req` variable above is the Req object, which allows the developer
to obtain information about the request and to perform a reply. Its usage
is explained in its respective section of the guide.
The same applies to any other case insensitive value.

guide/modern_web.md Normal file

@ -0,0 +1,224 @@
The modern Web
==============
Let's take a look at various technologies from the beginnings
of the Web up to this day, and get a preview of what's
coming next.
Cowboy is compatible with all the technology cited in this
chapter except of course HTTP/2.0 which has no implementation
in the wild at the time of writing.
The prehistoric Web
-------------------
HTTP was initially created to serve HTML pages and only
had the GET method for retrieving them. This initial
version is documented and is sometimes called HTTP/0.9.
HTTP/1.0 defined the GET, HEAD and POST methods, and
was able to send data with POST requests.
HTTP/1.0 works in a very simple way. A TCP connection
is first established to the server. Then a request is
sent. Then the server sends a response back and closes
the connection.
Suffice it to say, HTTP/1.0 is not very efficient. Opening
a TCP connection takes some time, and pages containing
many assets load much slower than they could because of
this.
Most improvements made in recent years focused on reducing
this load time and reducing the latency of the requests.
HTTP/1.1
--------
HTTP/1.1 quickly followed and added a keep-alive mechanism
to allow using the same connection for many requests, as
well as streaming capabilities, allowing an endpoint to send
a body in well defined chunks.
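To make the chunk format concrete, here is a small sketch of the HTTP/1.1 chunked transfer coding: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-sized chunk ends the body. The module name `chunk_demo` is made up for this example, and this is not how Cowboy itself is implemented.

``` erlang
-module(chunk_demo).
-export([chunk/1, last_chunk/0]).

%% Encode one piece of body data as an HTTP/1.1 chunk.
%% Data must not be empty, because a zero-sized chunk
%% marks the end of the body.
chunk(Data) ->
    Size = integer_to_list(iolist_size(Data), 16),
    iolist_to_binary([Size, "\r\n", Data, "\r\n"]).

%% The terminating chunk, with no trailers.
last_chunk() ->
    <<"0\r\n\r\n">>.
```

For instance, `chunk_demo:chunk(<<"Hello">>)` yields `<<"5\r\nHello\r\n">>`.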
HTTP/1.1 defines the OPTIONS, GET, HEAD, POST, PUT, DELETE,
TRACE and CONNECT methods. The PATCH method was added in more
recent years. It also improves the caching capabilities with
the introduction of many headers.
HTTP/1.1 still works like HTTP/1.0 does, except the connection
can be kept alive for subsequent requests. This however allows
clients to perform what is called pipelining: sending many
requests in a row, and then processing the responses which will
be received in the same order as the requests.
REST
----
The design of HTTP/1.1 was influenced by the REST architectural
style. REST, or REpresentational State Transfer, is a style of
architecture for loosely connected distributed systems.
REST defines constraints that systems must obey in order to
be RESTful. A system which doesn't follow all the constraints
cannot be considered RESTful.
REST is a client-server architecture with a clean separation
of concerns between the client and the server. They communicate
by referencing resources. Resources can be identified, but
also manipulated. A resource representation has a media type
and information about whether it can be cached and how. Hypermedia
determines how resources are related and how they can be used.
REST is also stateless. All requests contain the complete
information necessary to perform the action.
HTTP/1.1 defines all the methods, headers and semantics required
to implement RESTful systems.
REST is most often used when designing web application APIs
which are generally meant to be used by executable code directly.
XMLHttpRequest
--------------
Also known as AJAX, this technology allows JavaScript code running
on a web page to perform asynchronous requests to the server.
This is what started the move from static websites to dynamic
web applications.
XMLHttpRequest still performs HTTP requests under the hood,
and then waits for a response, but the JavaScript code can
continue to run until the response arrives. It will then receive
the response through a callback previously defined.
These are of course still requests initiated by the client.
The server still had no way of pushing data to the client
on its own, so new technology appeared to allow that.
Long-polling
------------
Polling was a technique used to overcome the fact that the server
cannot push data directly to the client. Therefore the client had
to repeatedly create a connection, make a request, get a response,
then try again a few seconds later. This is overly expensive and
adds an additional delay before the client receives the data.
Polling was necessary to implement message queues and other
similar mechanisms, where a user must be informed of something
when it happens, rather than when they next refresh the page.
A typical example would be a chat application.
Long-polling was created to reduce the server load by creating
fewer connections, but also to improve latency by getting the
response back to the client as soon as it becomes available
on the server.
Long-polling works in a similar manner to polling, except the
request will not get a response immediately. Instead the server
leaves it open until it has a response to send. After getting
the response, the client creates a new request and gets back
to waiting.
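On the server side, holding a request open maps naturally onto an Erlang `receive` with a timeout. The sketch below is a simplification under stated assumptions: the module name `longpoll_demo` and the `{event, Data}` message are invented here, and the process handling the request is assumed to receive such a message when something happens.

``` erlang
-module(longpoll_demo).
-export([wait_for_event/1]).

%% Block the process handling the request until an event arrives,
%% or give up after TimeoutMs so the client can poll again.
wait_for_event(TimeoutMs) ->
    receive
        {event, Data} ->
            {reply, Data}
    after TimeoutMs ->
        no_content
    end.
```

The waiting process consumes no CPU while it is blocked, which is what makes keeping many such requests open affordable.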
You probably guessed by now that long-polling is a hack, and
like most hacks it can suffer from unforeseen issues. In this
case, it doesn't always play well with proxies.
HTML5
-----
HTML5 is, of course, the HTML version after HTML4. But HTML5
emerged to solve a specific problem: dynamic web applications.
HTML was initially created to write web pages which compose
a website. But soon people and companies wanted to use HTML
to write more and more complex websites, eventually known as
web applications. They are for example your news reader, your
email client in the browser, or your video streaming website.
Because HTML wasn't enough, they started using proprietary
solutions, often implemented using plug-ins. This wasn't
perfect of course, but worked well enough for most people.
However, the need for a standard solution eventually became
apparent. The browser needed to be able to play media natively.
It needed to be able to draw anything. It needed an efficient
way of streaming events to the server, but also receiving
events from the server.
The solution went on to become HTML5. At the time of writing
it is being standardized.
EventSource
-----------
EventSource, sometimes also called Server-Sent Events, is a
technology allowing servers to push data to HTML5 applications.
EventSource is a one-way communication channel from the server
to the client. The client has no means to talk to the server
other than by using HTTP requests.
It consists of a Javascript object allowing setting up an
EventSource connection to the server, and a very small protocol
for sending events to the client on top of the HTTP/1.1
connection.
EventSource is a lightweight solution that only works for
UTF-8 encoded text data. Binary data and text data encoded
differently are not allowed by the protocol. A heavier but
more generic approach can be found in Websocket.
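The wire format is small enough to sketch directly. In the `text/event-stream` format, an event is a series of `field: value` lines ended by a blank line, and multi-line data is sent as several `data:` lines. The module name `sse_demo` and the helper `event/2` are made up for this example; both arguments are assumed to be UTF-8 binaries.

``` erlang
-module(sse_demo).
-export([event/2]).

%% Format one named event. Each line of Data becomes its own
%% "data:" field, and an empty line terminates the event.
event(Name, Data) ->
    Lines = binary:split(Data, <<"\n">>, [global]),
    iolist_to_binary([
        <<"event: ">>, Name, <<"\n">>,
        [[<<"data: ">>, Line, <<"\n">>] || Line <- Lines],
        <<"\n">>
    ]).
```

The server would write the resulting binary to the open HTTP response each time it has something to push.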
Websocket
---------
Websocket is a protocol built on top of HTTP/1.1 that provides
a two-way communication channel between the client and the
server. Communication is asynchronous and can occur concurrently.
It consists of a Javascript object allowing setting up a
Websocket connection to the server, and a binary-based
protocol for sending data to the server or the client.
Websocket connections can transfer either UTF-8 encoded text
data or binary data. The protocol also includes support for
implementing a ping/pong mechanism, allowing the server and
the client to have more confidence that the connection is still
alive.
A Websocket connection can be used to transfer any kind of data,
small or big, text or binary. Because of this Websocket is
sometimes used for communication between systems.
SPDY
----
SPDY is an attempt to reduce page loading time by opening a
single connection per server, keeping it open for subsequent
requests, and also by compressing the HTTP headers to reduce
the size of requests.
SPDY is compatible with HTTP/1.1 semantics, and is actually
just a different way of performing HTTP requests and responses,
by using binary frames instead of a text-based protocol.
SPDY also allows the server to send responses without needing
a request to exist, essentially enabling server push.
SPDY is an experiment that has proven successful and is used
as the basis for the HTTP/2.0 standard.
Browsers make use of TLS Next Protocol Negotiation to upgrade
to a SPDY connection seamlessly if the server supports it.
The protocol itself has a few shortcomings which are being
fixed in HTTP/2.0.
HTTP/2.0
--------
HTTP/2.0 is the long-awaited update to the HTTP/1.1 protocol.
It is based on SPDY although a lot has been improved at the
time of writing.
HTTP/2.0 is an asynchronous two-way communication channel
between two endpoints.
It is planned to be ready late 2014.


@ -1,11 +1,39 @@
Cowboy User Guide
=================
The Cowboy User Guide explores the modern Web and how to make
best use of Cowboy for writing powerful web applications.
Introducing Cowboy
------------------
* [Introduction](introduction.md)
* Purpose
* Prerequisites
* Supported platforms
* Conventions
* Getting started
* [The modern Web](modern_web.md)
* The prehistoric Web
* HTTP/1.1
* REST
* Long-polling
* HTML5
* EventSource
* Websocket
* SPDY
* HTTP/2.0
* [Erlang and the Web](erlang_web.md)
* The Web is concurrent
* The Web is soft real time
* The Web is asynchronous
* The Web is omnipresent
* Erlang is the ideal platform for the Web
* [Erlang for beginners](erlang_beginners.md)
* [Getting started](getting_started.md)
Using Cowboy
------------
* [Routing](routing.md)
* Purpose
* Structure