Greatly improve the guide introduction

2025-07-14 20:30:23 +00:00 · 2013-06-27 00:02:12 +02:00 · 61ca86b054
commit 61ca86b054
parent b059a1237f
6 changed files with 563 additions and 88 deletions
--- a/guide/erlang_beginners.md
+++ b/guide/erlang_beginners.md
@@ -0,0 +1,43 @@
+Erlang for beginners
+====================
+
+Chances are you are interested in using Cowboy, but have
+no idea how to write an Erlang program. Fear not! This
+chapter will help you get started.
+
+We recommend two books for beginners. You should read them
+both at some point, as they cover Erlang from two entirely
+different perspectives.
+
+Learn You Some Erlang for Great Good!
+-------------------------------------
+
+The quickest way to get started with Erlang is by reading
+a book with the funny name of [LYSE](http://learnyousomeerlang.com),
+as we affectionately call it.
+
+It will get right into the syntax and quickly answer the questions
+a beginner would ask themselves, all the while showing funny
+pictures and making insightful jokes.
+
+You can read an early version of the book online for free,
+but you really should buy the much more refined paper and
+ebook versions.
+
+Programming Erlang
+------------------
+
+After writing some code, you will probably want to understand
+the very concepts that make Erlang what it is today. These
+are best explained by Joe Armstrong, the godfather of Erlang,
+in his book [Programming Erlang](http://pragprog.com/book/jaerlang2/programming-erlang).
+
+Instead of going into every single detail of the language,
+Joe focuses on the central concepts behind Erlang, and shows
+you how they can be used to write a variety of different
+applications.
+
+At the time of writing, the 2nd edition of the book is in beta,
+and includes a few details about upcoming Erlang features that
+cannot be used today. Choose the edition you want, then get
+reading!
--- a/guide/erlang_web.md
+++ b/guide/erlang_web.md
@@ -0,0 +1,181 @@
+Erlang and the Web
+==================
+
+The Web is concurrent
+---------------------
+
+When you access a website there is little concurrency
+involved. A few connections are opened and requests
+are sent through these connections. Then the web page
+is displayed on your screen. Your browser will only
+open up to 4 or 8 connections to the server, depending
+on your settings. This isn't much.
+
+But think about it. You are not the only one accessing
+the server at the same time. There can be hundreds, if
+not thousands, if not millions of connections to the
+same server at the same time.
+
+Even today a lot of systems used in production haven't
+solved the C10K problem (ten thousand concurrent connections).
+Those that did are trying hard to get to the next
+step, C100K, and are pretty far from it.
+
+Erlang meanwhile has no problem handling millions of
+connections. At the time of writing there are application
+servers written in Erlang that can handle more than two
+million connections on a single server in a real production
+application, with spare memory and CPU!
+
+The Web is concurrent, and Erlang is a language designed
+for concurrency, so it is a perfect match.
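+
+To get a feel for how cheap Erlang processes are, here is a minimal
+sketch that spawns one process per pretend connection and wakes them
+all up. The `many_procs` module and its function names are made up
+for this illustration, and no real sockets are involved.
+
+``` erlang
+-module(many_procs).
+-export([run/1]).
+
+%% Spawn N processes, each standing in for one idle connection,
+%% then wake all of them up and count the replies.
+run(N) ->
+    Parent = self(),
+    Pids = [spawn(fun() -> idle(Parent) end) || _ <- lists:seq(1, N)],
+    [Pid ! wake_up || Pid <- Pids],
+    collect(N, 0).
+
+%% An idle "connection": it sleeps until a message arrives.
+idle(Parent) ->
+    receive
+        wake_up -> Parent ! {awake, self()}
+    end.
+
+collect(0, Count) ->
+    Count;
+collect(Remaining, Count) ->
+    receive
+        {awake, _Pid} -> collect(Remaining - 1, Count + 1)
+    end.
+```
+
+Calling `many_procs:run(10000)` returns `10000` once every process
+has answered.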
+
+Of course, various platforms need to scale beyond a few
+million connections. This is where Erlang's built-in
+distribution mechanisms come in. If one server isn't
+enough, add more! Erlang allows you to use the same code
+for talking to local processes or to processes in other
+parts of your cluster, which means you can scale very
+quickly if the need arises.
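+
+The following sketch illustrates this. The `notify` module and the
+message format are hypothetical. The point is only that the send
+operator does not care where the receiving process lives.
+
+``` erlang
+-module(notify).
+-export([notify/2, notify_named/3, demo/0]).
+
+%% Pid may belong to a process on this node or on any connected node.
+%% The code is the same in both cases.
+notify(Pid, Event) ->
+    Pid ! {event, Event},
+    ok.
+
+%% Same idea for a process registered under Name on the node Node.
+notify_named(Name, Node, Event) ->
+    {Name, Node} ! {event, Event},
+    ok.
+
+%% Local demonstration: send an event to ourselves and read it back.
+demo() ->
+    ok = notify(self(), user_logged_in),
+    receive
+        {event, Event} -> Event
+    after 1000 ->
+        timeout
+    end.
+```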
+
+The Web has large userbases, and the Erlang platform was
+designed to work in a distributed setting, so it is a
+perfect match.
+
+Or is it? Surely you can find solutions to handle that many
+concurrent connections with my favorite language... But all
+these solutions will break down in the next few years. Why?
+Firstly because servers don't get much faster per core, they
+instead get a lot more cores and memory. This is only useful
+if your application can use them properly, and Erlang is
+light-years away from anything else in that area. Secondly,
+today your computer and your phone are online, tomorrow your
+watch, goggles, bike, car, fridge and tons of other devices
+will also connect to various applications on the Internet.
+
+Only Erlang is prepared to deal with what's coming.
+
+The Web is soft real time
+-------------------------
+
+What does soft real time mean, you ask? It means we want the
+operations done as quickly as possible, and in the case of
+web applications, it means we want the data propagated fast.
+
+In comparison, hard real time has a similar meaning, but also
+has a hard time constraint, for example an operation needs to
+be done in under N milliseconds otherwise the system fails
+entirely.
+
+Users aren't that needy yet. They just want to get access
+to their content within a reasonable delay, and they want the
+actions they make to register at most a few seconds after
+they submitted them, otherwise they'll start worrying about
+whether they successfully went through.
+
+The Web is soft real time because taking longer to perform an
+operation would be seen as bad quality of service.
+
+Erlang is a soft real time system. It will always run
+processes fairly, a little at a time, switching to another
+process after a while and preventing a single process from
+stealing resources from all others. This means that Erlang
+can guarantee stable low latency of operations.
+
+Erlang provides the guarantees that the soft real time Web
+requires.
+
+The Web is asynchronous
+-----------------------
+
+Long ago, the Web was synchronous because HTTP was synchronous.
+You fired a request, and then waited for a response. Not anymore.
+It all started when XmlHttpRequest started being used. It allowed
+the client to perform asynchronous calls to the server.
+
+Then Websocket appeared and allowed both the server and the client
+to send data to the other endpoint completely asynchronously. The
+data is contained within frames and no response is necessary.
+
+Erlang processes work the same. They send each other data contained
+within messages and then continue running without needing a response.
+They tend to spend most of their time inactive, waiting for a new
+message, and the Erlang VM happily activates them when one is received.
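+
+A minimal sketch of this behavior follows. The `async_demo` module
+and its message shapes are invented for the example. The sender
+fires two messages and carries on, and only waits when it explicitly
+asks for something back.
+
+``` erlang
+-module(async_demo).
+-export([run/0]).
+
+%% Start a logger process, send it two messages without waiting for
+%% any reply, then ask it what it has received so far.
+run() ->
+    Logger = spawn(fun() -> loop([]) end),
+    Logger ! {log, <<"first">>},
+    Logger ! {log, <<"second">>},
+    %% Nothing above blocked; the sender is free to do other work here.
+    Logger ! {dump, self()},
+    receive
+        {logged, Entries} -> Entries
+    after 1000 ->
+        timeout
+    end.
+
+%% The logger sleeps in receive until a message arrives.
+loop(Acc) ->
+    receive
+        {log, Entry} ->
+            loop([Entry | Acc]);
+        {dump, From} ->
+            From ! {logged, lists:reverse(Acc)}
+    end.
+```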
+
+It is therefore quite easy to imagine Erlang being good at receiving
+Websocket frames, which may come in at unpredictable times, passing the
+data to the responsible processes, which are always ready and waiting for
+new messages, and performing the operations required by only activating
+the required parts of the system.
+
+The more recent Web technologies, like Websocket of course, but also
+SPDY and HTTP/2.0, are all fully asynchronous protocols. The concept
+of requests and responses is retained of course, but anything could
+be sent in between, by either the client or the server, and the
+responses could also be received in a completely different order.
+
+Erlang is by nature asynchronous and really good at it thanks to the
+great engineering that has been done in the VM over the years. It's
+only natural that it's so good at dealing with the asynchronous Web.
+
+The Web is omnipresent
+----------------------
+
+The Web has taken a very important part of our lives. We're
+connected at all times, when we're on our phone, using our computer,
+passing time using a tablet while in the bathroom... And this
+isn't going to slow down: every single device at home or on us
+will be connected.
+
+All these devices are always connected. And with the number of
+alternatives to give you access to the content you seek, users
+tend to not stick around when problems arise. Users today want
+their applications to be always available, and if one is having
+too many issues they just move on.
+
+Despite this, when developers choose a product to use for building
+web applications, their only concern seems to be "Is it fast?",
+and they look around for synthetic benchmarks showing which one
+is the fastest at sending "Hello world" with only a handful of
+concurrent connections. Web benchmarks haven't been representative
+of reality in a long time, and are drifting further away as
+time goes on.
+
+What developers should really ask themselves is "Can I service
+all my users with no interruption?" and they'd find that they have
+two choices. They can either hope for the best, or they can use
+Erlang.
+
+Erlang is built for fault tolerance. When writing code in any other
+language, you have to check all the return values and act accordingly
+to avoid any unforeseen issues. If you're lucky, you won't miss
+anything important. When writing Erlang code, you can just check
+the success condition and ignore all errors. If an error happens,
+the Erlang process crashes and is then restarted by a special
+process called a supervisor.
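+
+As an illustration, here is a small sketch of a supervisor restarting
+a crashing worker. The `demo_sup` module, the `demo_worker` name and
+the restart limits are all made up for this example. A real
+application would typically write the worker as a `gen_server`
+rather than a bare process.
+
+``` erlang
+-module(demo_sup).
+-behaviour(supervisor).
+
+-export([start_link/0, start_worker/0]).
+-export([init/1]).
+
+start_link() ->
+    supervisor:start_link({local, ?MODULE}, ?MODULE, []).
+
+%% Restart the worker whenever it dies, giving up only after more
+%% than 5 crashes within 10 seconds.
+init([]) ->
+    Worker = {demo_worker, {?MODULE, start_worker, []},
+        permanent, 5000, worker, [?MODULE]},
+    {ok, {{one_for_one, 5, 10}, [Worker]}}.
+
+%% Called by the supervisor to start (or restart) the worker.
+start_worker() ->
+    Pid = spawn_link(fun worker_loop/0),
+    true = register(demo_worker, Pid),
+    {ok, Pid}.
+
+%% The worker only handles the message it expects. Anything else
+%% makes it crash, and the supervisor starts a fresh one.
+worker_loop() ->
+    receive
+        {ping, From} ->
+            From ! pong,
+            worker_loop();
+        Other ->
+            exit({unexpected_message, Other})
+    end.
+```
+
+After `demo_sup:start_link()`, sending any unexpected message to
+`demo_worker` kills it, and shortly afterwards a new process is
+registered under the same name.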
+
+The Erlang developer thus has no need to fear unhandled
+errors, and can focus on handling only the errors that should
+give some feedback to the user and let the system take care of
+the rest. This also has the advantage of allowing them to write
+a lot less code, and letting them sleep at night.
+
+Erlang's fault tolerance oriented design is the first piece of
+what makes it the best choice for the omnipresent, always available
+Web.
+
+The second piece is Erlang's built-in distribution. Distribution
+is a key part of building a fault tolerant system, because it
+allows you to handle bigger failures, like a whole server going
+down, or even an entire data center.
+
+Fault tolerance and distribution are important today, and will be
+vital in the future of the Web. Erlang is ready.
+
+Erlang is the ideal platform for the Web
+----------------------------------------
+
+Erlang provides all the important features that the Web requires
+or will require in the near future. Erlang is a perfect match
+for the Web, and it only makes sense to use it to build web
+applications.
--- a/guide/getting_started.md
+++ b/guide/getting_started.md
@@ -0,0 +1,80 @@
+Getting started
+===============
+
+Cowboy does nothing by default.
+
+Cowboy requires the `crypto` and `ranch` applications to be started.
+
+``` erlang
+ok = application:start(crypto).
+ok = application:start(ranch).
+ok = application:start(cowboy).
+```
+
+Cowboy uses Ranch for handling the connections and provides convenience
+functions to start Ranch listeners.
+
+The `cowboy:start_http/4` function starts a listener for HTTP connections
+using the TCP transport. The `cowboy:start_https/4` function starts a
+listener for HTTPS connections using the SSL transport.
+
+Listeners are a group of processes that are used to accept and manage
+connections. The processes used specifically for accepting connections
+are called acceptors. The number of acceptor processes is unrelated to
+the maximum number of connections Cowboy can handle. Please refer to
+the [Ranch guide](http://ninenines.eu/docs/en/ranch/HEAD/guide/toc)
+for in-depth information.
+
+Listeners are named. They spawn a given number of acceptors, listen for
+connections using the given transport options and pass along the protocol
+options to the connection processes. The protocol options must include
+the dispatch list for routing requests to handlers.
+
+The dispatch list is explained in greater detail in the
+[Routing](routing.md) chapter.
+
+``` erlang
+Dispatch = cowboy_router:compile([
+    %% {URIHost, list({URIPath, Handler, Opts})}
+    {'_', [{'_', my_handler, []}]}
+]),
+%% Name, NbAcceptors, TransOpts, ProtoOpts
+cowboy:start_http(my_http_listener, 100,
+    [{port, 8080}],
+    [{env, [{dispatch, Dispatch}]}]
+).
+```
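+
+Starting an HTTPS listener could look much the same. The sketch
+below reuses the `Dispatch` value from above. The listener name, the
+port and the certificate paths are placeholders to replace with your
+own, and the exact set of transport options you need depends on your
+setup.
+
+``` erlang
+%% Name, NbAcceptors, TransOpts, ProtoOpts
+{ok, _} = cowboy:start_https(my_https_listener, 100,
+    [
+        {port, 8443},
+        %% Hypothetical paths; point these to your own files.
+        {certfile, "/path/to/server.crt"},
+        {keyfile, "/path/to/server.key"}
+    ],
+    [{env, [{dispatch, Dispatch}]}]
+).
+```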
+
+Cowboy features many kinds of handlers. For this simple example,
+we will just use the plain HTTP handler, which has three callback
+functions: init/3, handle/2 and terminate/3. You can find more information
+about the arguments and possible return values of these callbacks in the
+[cowboy_http_handler function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_http_handler).
+Following is an example of a simple HTTP handler module.
+
+``` erlang
+-module(my_handler).
+-behaviour(cowboy_http_handler).
+
+-export([init/3]).
+-export([handle/2]).
+-export([terminate/3]).
+
+%% Variables prefixed with an underscore are unused on purpose,
+%% which avoids compiler warnings.
+init({tcp, http}, Req, _Opts) ->
+    {ok, Req, undefined_state}.
+
+handle(Req, State) ->
+    {ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req),
+    {ok, Req2, State}.
+
+terminate(_Reason, _Req, _State) ->
+    ok.
+```
+
+The `Req` variable above is the Req object, which allows the developer
+to obtain information about the request and to perform a reply. Its usage
+is explained in the [cowboy_req function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_req).
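+
+As a rough illustration, a `handle/2` callback reading from the Req
+object might look like the sketch below. It assumes the Cowboy
+version this guide is written for, where `cowboy_req:method/1` and
+`cowboy_req:qs_val/3` return the value along with an updated Req.
+The `name` query string parameter is made up for the example, so do
+check the function reference for your version.
+
+``` erlang
+handle(Req, State) ->
+    %% Each accessor returns the value and a new Req to use afterwards.
+    {Method, Req2} = cowboy_req:method(Req),
+    {Name, Req3} = cowboy_req:qs_val(<<"name">>, Req2, <<"World">>),
+    Body = [<<"Hello ">>, Name, <<"! This was a ">>, Method,
+        <<" request.">>],
+    {ok, Req4} = cowboy_req:reply(200, [], Body, Req3),
+    {ok, Req4, State}.
+```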
+
+You can find many examples in the `examples/` directory of the
+Cowboy repository. A more complete "Hello world" example can be
+found in the `examples/hello_world/` directory.
--- a/guide/introduction.md
+++ b/guide/introduction.md
@@ -21,16 +21,12 @@ features both a Function Reference and a User Guide.
 Prerequisites
 -------------

-It is assumed the developer already knows Erlang and has basic knowledge
-about the HTTP protocol.
+No Erlang knowledge is required for reading this guide. The reader will
+be introduced to Erlang concepts and redirected to reference material
+whenever necessary.

-In order to run the examples available in this user guide, you will need
-Erlang and rebar installed and in your $PATH.
-
-Please see the [rebar repository](https://github.com/basho/rebar) for
-downloading and building instructions. Please look up the environment
-variables documentation of your system for details on how to update the
-$PATH information.
+Knowledge of the HTTP protocol is recommended but not required, as it
+will be detailed throughout the guide.

 Supported platforms
 -------------------
@@ -57,81 +53,4 @@ Header names are case insensitive. Cowboy converts all the request
 header names to lowercase, and expects your application to provide
 lowercase header names in the response.

-Getting started
---------------
-
-Cowboy does nothing by default.
-
-Cowboy requires the `crypto` and `ranch` applications to be started.
-
-``` erlang
-ok = application:start(crypto).
-ok = application:start(ranch).
-ok = application:start(cowboy).
-```
-
-Cowboy uses Ranch for handling the connections and provides convenience
-functions to start Ranch listeners.
-
-The `cowboy:start_http/4` function starts a listener for HTTP connections
-using the TCP transport. The `cowboy:start_https/4` function starts a
-listener for HTTPS connections using the SSL transport.
-
-Listeners are a group of processes that are used to accept and manage
-connections. The processes used specifically for accepting connections
-are called acceptors. The number of acceptor processes is unrelated to
-the maximum number of connections Cowboy can handle. Please refer to
-the Ranch guide for in-depth information.
-
-Listeners are named. They spawn a given number of acceptors, listen for
-connections using the given transport options and pass along the protocol
-options to the connection processes. The protocol options must include
-the dispatch list for routing requests to handlers.
-
-The dispatch list is explained in greater details in the Routing section
-of the guide.
-
-``` erlang
-Dispatch = cowboy_router:compile([
-    %% {URIHost, list({URIPath, Handler, Opts})}
-    {'_', [{'_', my_handler, []}]}
-]),
-%% Name, NbAcceptors, TransOpts, ProtoOpts
-cowboy:start_http(my_http_listener, 100,
-    [{port, 8080}],
-    [{env, [{dispatch, Dispatch}]}]
-).
-```
-
-Cowboy features many kinds of handlers. It has plain HTTP handlers, loop
-handlers, Websocket handlers, REST handlers and static handlers. Their
-usage is documented in the respective sections of the guide.
-
-Most applications use the plain HTTP handler, which has three callback
-functions: init/3, handle/2 and terminate/3. You can find more information
-about the arguments and possible return values of these callbacks in the
-HTTP handlers section of this guide. Following is an example of a simple
-HTTP handler module.
-
-``` erlang
--module(my_handler).
--behaviour(cowboy_http_handler).
-
--export([init/3]).
--export([handle/2]).
--export([terminate/3]).
-
-init({tcp, http}, Req, Opts) ->
-    {ok, Req, undefined_state}.
-
-handle(Req, State) ->
-    {ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req),
-    {ok, Req2, State}.
-
-terminate(Reason, Req, State) ->
-    ok.
-```
-
-The `Req` variable above is the Req object, which allows the developer
-to obtain information about the request and to perform a reply. Its usage
-is explained in its respective section of the guide.
+The same applies to any other case insensitive value.
--- a/guide/modern_web.md
+++ b/guide/modern_web.md
@@ -0,0 +1,224 @@
+The modern Web
+==============
+
+Let's take a look at various technologies from the beginnings
+of the Web up to this day, and get a preview of what's
+coming next.
+
+Cowboy is compatible with all the technology cited in this
+chapter except of course HTTP/2.0 which has no implementation
+in the wild at the time of writing.
+
+The prehistoric Web
+-------------------
+
+HTTP was initially created to serve HTML pages and only
+had the GET method for retrieving them. This initial
+version is documented and is sometimes called HTTP/0.9.
+HTTP/1.0 defined the GET, HEAD and POST methods, and
+was able to send data with POST requests.
+
+HTTP/1.0 works in a very simple way. A TCP connection
+is first established to the server. Then a request is
+sent. Then the server sends a response back and closes
+the connection.
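+
+The exchange described above can be sketched with plain `gen_tcp`.
+The `http10_demo` module is hypothetical, and the "server" is a
+throwaway process that answers a single request on the loopback
+interface, so that the example runs without network access.
+
+``` erlang
+-module(http10_demo).
+-export([run/0]).
+
+%% Run a one-shot server and a client against it, and return the raw
+%% response the client received.
+run() ->
+    %% Port 0 lets the operating system pick a free port.
+    {ok, LSocket} = gen_tcp:listen(0,
+        [binary, {active, false}, {reuseaddr, true}]),
+    {ok, Port} = inet:port(LSocket),
+    spawn_link(fun() -> serve_once(LSocket) end),
+    %% 1. Establish the TCP connection.
+    {ok, Socket} = gen_tcp:connect({127, 0, 0, 1}, Port,
+        [binary, {active, false}]),
+    %% 2. Send the request.
+    ok = gen_tcp:send(Socket, <<"GET / HTTP/1.0\r\n\r\n">>),
+    %% 3. Read the response until the server closes the connection.
+    Response = recv_all(Socket, <<>>),
+    ok = gen_tcp:close(Socket),
+    ok = gen_tcp:close(LSocket),
+    Response.
+
+serve_once(LSocket) ->
+    {ok, Socket} = gen_tcp:accept(LSocket),
+    {ok, _Request} = gen_tcp:recv(Socket, 0),
+    ok = gen_tcp:send(Socket, <<"HTTP/1.0 200 OK\r\n\r\nHello!">>),
+    %% 4. The server ends the exchange by closing the connection.
+    ok = gen_tcp:close(Socket).
+
+recv_all(Socket, Acc) ->
+    case gen_tcp:recv(Socket, 0) of
+        {ok, Data} -> recv_all(Socket, <<Acc/binary, Data/binary>>);
+        {error, closed} -> Acc
+    end.
+```
+
+Every additional asset on a page would repeat all four steps,
+including the connection setup.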
+
+Suffice it to say, HTTP/1.0 is not very efficient. Opening
+a TCP connection takes some time, and pages containing
+many assets load much slower than they could because of
+this.
+
+Most improvements done in recent years focused on reducing
+this load time and reducing the latency of the requests.
+
+HTTP/1.1
+--------
+
+HTTP/1.1 quickly followed and added a keep-alive mechanism
+to allow using the same connection for many requests, as
+well as streaming capabilities, allowing an endpoint to send
+a body in well defined chunks.
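+
+To make the chunk format concrete, here is a small sketch of how a
+chunk could be encoded. The `chunked` module is invented for the
+example. Each chunk is its size in hexadecimal, a line break, the
+data, and another line break, and a zero-sized chunk ends the body.
+
+``` erlang
+-module(chunked).
+-export([chunk/1, last_chunk/0]).
+
+%% Encode one chunk: size in hexadecimal, CRLF, the data, CRLF.
+%% Callers should skip empty data, as a size of zero ends the body.
+chunk(Data) ->
+    Size = iolist_size(Data),
+    [integer_to_list(Size, 16), <<"\r\n">>, Data, <<"\r\n">>].
+
+%% A zero-sized chunk marks the end of the body.
+last_chunk() ->
+    <<"0\r\n\r\n">>.
+```
+
+For example, `iolist_to_binary(chunked:chunk(<<"Hello">>))` gives
+`<<"5\r\nHello\r\n">>`.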
+
+HTTP/1.1 defines the OPTIONS, GET, HEAD, POST, PUT, DELETE,
+TRACE and CONNECT methods. The PATCH method was added in more
+recent years. It also improves the caching capabilities with
+the introduction of many headers.
+
+HTTP/1.1 still works like HTTP/1.0 does, except the connection
+can be kept alive for subsequent requests. This however allows
+clients to perform what is called pipelining: sending many
+requests in a row, and then processing the responses which will
+be received in the same order as the requests.
+
+REST
+----
+
+The design of HTTP/1.1 was influenced by the REST architectural
+style. REST, or REpresentational State Transfer, is a style of
+architecture for loosely connected distributed systems.
+
+REST defines constraints that systems must obey in order to
+be RESTful. A system which doesn't follow all the constraints
+cannot be considered RESTful.
+
+REST is a client-server architecture with a clean separation
+of concerns between the client and the server. They communicate
+by referencing resources. Resources can be identified, but
+also manipulated. A resource representation has a media type
+and information about whether it can be cached and how. Hypermedia
+determines how resources are related and how they can be used.
+REST is also stateless. All requests contain the complete
+information necessary to perform the action.
+
+HTTP/1.1 defines all the methods, headers and semantics required
+to implement RESTful systems.
+
+REST is most often used when designing web application APIs
+which are generally meant to be used by executable code directly.
+
+XmlHttpRequest
+--------------
+
+Also known as AJAX, this technology allows Javascript code running
+on a web page to perform asynchronous requests to the server.
+This is what started the move from static websites to dynamic
+web applications.
+
+XmlHttpRequest still performs HTTP requests under the hood,
+and then waits for a response, but the Javascript code can
+continue to run until the response arrives. It will then receive
+the response through a callback previously defined.
+
+These are of course still requests initiated by the client.
+The server still had no way of pushing data to the client
+on its own, so new technology appeared to allow that.
+
+Long-polling
+------------
+
+Polling was a technique used to overcome the fact that the server
+cannot push data directly to the client. Therefore the client had
+to repeatedly create a connection, make a request, get a response,
+then try again a few seconds later. This is overly expensive and
+adds an additional delay before the client receives the data.
+
+Polling was necessary to implement message queues and other
+similar mechanisms, where a user must be informed of something
+when it happens, rather than when they next refresh the page.
+A typical example would be a chat application.
+
+Long-polling was created to reduce the server load by creating
+fewer connections, but also to improve latency by getting the
+response back to the client as soon as it becomes available
+on the server.
+
+Long-polling works in a similar manner to polling, except the
+request will not get a response immediately. Instead the server
+leaves it open until it has a response to send. After getting
+the response, the client creates a new request and gets back
+to waiting.
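+
+The idea can be sketched with plain Erlang processes standing in for
+the client and the server. The `long_poll` module and its messages
+are made up for the illustration, and no HTTP is involved. The
+server simply holds on to each request until it has something to
+send.
+
+``` erlang
+-module(long_poll).
+-export([start/0, poll/1, publish/2]).
+
+%% Start a process that holds poll requests until an event arrives.
+start() ->
+    spawn(fun() -> loop([]) end).
+
+%% Client side: ask for the next event, then wait for the answer.
+%% The 5 second limit is arbitrary.
+poll(Server) ->
+    Server ! {poll, self()},
+    receive
+        {event, Event} -> {ok, Event}
+    after 5000 ->
+        timeout
+    end.
+
+publish(Server, Event) ->
+    Server ! {publish, Event},
+    ok.
+
+loop(Waiting) ->
+    receive
+        {poll, Client} ->
+            %% No response yet: remember the client and keep waiting.
+            loop([Client | Waiting]);
+        {publish, Event} ->
+            %% Answer every held request at once.
+            [Client ! {event, Event} || Client <- Waiting],
+            loop([])
+    end.
+```
+
+A client calling `long_poll:poll(Server)` gets `{ok, Event}` as soon
+as someone calls `long_poll:publish(Server, Event)`, and would then
+poll again.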
+
+You probably guessed by now that long-polling is a hack, and
+like most hacks it can suffer from unforeseen issues. In this
+case, it doesn't always play well with proxies.
+
+HTML5
+-----
+
+HTML5 is, of course, the HTML version after HTML4. But HTML5
+emerged to solve a specific problem: dynamic web applications.
+
+HTML was initially created to write web pages which compose
+a website. But soon people and companies wanted to use HTML
+to write more and more complex websites, eventually known as
+web applications. They are for example your news reader, your
+email client in the browser, or your video streaming website.
+
+Because HTML wasn't enough, they started using proprietary
+solutions, often implemented using plug-ins. This wasn't
+perfect of course, but worked well enough for most people.
+
+However, the need for a standard solution eventually became
+apparent. The browser needed to be able to play media natively.
+It needed to be able to draw anything. It needed an efficient
+way of streaming events to the server, but also receiving
+events from the server.
+
+The solution went on to become HTML5. At the time of writing
+it is being standardized.
+
+EventSource
+-----------
+
+EventSource, sometimes also called Server-Sent Events, is a
+technology allowing servers to push data to HTML5 applications.
+
+EventSource is a one-way communication channel from the server
+to the client. The client has no means to talk to the server
+other than by using HTTP requests.
+
+It consists of a Javascript object used to set up an
+EventSource connection to the server, and a very small protocol
+for sending events to the client on top of the HTTP/1.1
+connection.
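+
+To show how small that protocol is, here is a sketch of formatting
+one event on the server side. The `sse` module is hypothetical, and
+only the `data` field is covered. Each line of the payload gets its
+own `data:` field, and an empty line ends the event.
+
+``` erlang
+-module(sse).
+-export([event/1]).
+
+%% Format UTF-8 text as one event. Every line of the payload gets
+%% its own "data:" field, and an empty line ends the event.
+event(Data) when is_binary(Data) ->
+    Lines = binary:split(Data, <<"\n">>, [global]),
+    [[<<"data: ">>, Line, <<"\n">>] || Line <- Lines] ++ [<<"\n">>].
+```
+
+For example, `iolist_to_binary(sse:event(<<"hello\nworld">>))` gives
+`<<"data: hello\ndata: world\n\n">>`.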
+
+EventSource is a lightweight solution that only works for
+UTF-8 encoded text data. Binary data and text data encoded
+differently are not allowed by the protocol. A heavier but
+more generic approach can be found in Websocket.
+
+Websocket
+---------
+
+Websocket is a protocol built on top of HTTP/1.1 that provides
+a two-way communication channel between the client and the
+server. Communication is asynchronous and can occur concurrently.
+
+It consists of a Javascript object used to set up a
+Websocket connection to the server, and a binary based
+protocol for sending data to the server or the client.
+
+Websocket connections can transfer either UTF-8 encoded text
+data or binary data. The protocol also includes support for
+implementing a ping/pong mechanism, allowing the server and
+the client to have more confidence that the connection is still
+alive.
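+
+As a taste of the binary protocol, the sketch below encodes two of
+the simplest frames a server can send. The `ws_frame` module is made
+up for the example, and it only covers unmasked frames with payloads
+shorter than 126 bytes. Longer payloads and frames sent by clients
+need more fields.
+
+``` erlang
+-module(ws_frame).
+-export([text/1, ping/0]).
+
+%% Encode a final, unmasked text frame as a server would send it.
+%% Only payloads shorter than 126 bytes are handled here; longer
+%% ones use an extended length field that this sketch leaves out.
+text(Payload) when is_binary(Payload), byte_size(Payload) < 126 ->
+    frame(1, Payload).
+
+%% A ping frame with no payload. The other endpoint answers with
+%% a pong frame.
+ping() ->
+    frame(9, <<>>).
+
+%% FIN bit set, three reserved bits, 4-bit opcode, mask bit unset,
+%% 7-bit payload length, then the payload itself.
+frame(Opcode, Payload) ->
+    Len = byte_size(Payload),
+    <<1:1, 0:3, Opcode:4, 0:1, Len:7, Payload/binary>>.
+```
+
+For example, `ws_frame:text(<<"Hi">>)` gives `<<129, 2, "Hi">>`.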
+
+A Websocket connection can be used to transfer any kind of data,
+small or big, text or binary. Because of this Websocket is
+sometimes used for communication between systems.
+
+SPDY
+----
+
+SPDY is an attempt to reduce page loading time by opening a
+single connection per server, keeping it open for subsequent
+requests, and also by compressing the HTTP headers to reduce
+the size of requests.
+
+SPDY is compatible with HTTP/1.1 semantics, and is actually
+just a different way of performing HTTP requests and responses,
+by using binary frames instead of a text-based protocol.
+SPDY also allows the server to send responses without needing
+a request to exist, essentially enabling server push.
+
+SPDY is an experiment that has proven successful and is used
+as the basis for the HTTP/2.0 standard.
+
+Browsers make use of TLS Next Protocol Negotiation to upgrade
+to a SPDY connection seamlessly if the server supports it.
+
+The protocol itself has a few shortcomings which are being
+fixed in HTTP/2.0.
+
+HTTP/2.0
+--------
+
+HTTP/2.0 is the long-awaited update to the HTTP/1.1 protocol.
+It is based on SPDY although a lot has been improved at the
+time of writing.
+
+HTTP/2.0 is an asynchronous two-way communication channel
+between two endpoints.
+
+It is planned to be ready in late 2014.
--- a/guide/toc.md
+++ b/guide/toc.md
@@ -1,11 +1,39 @@
 Cowboy User Guide
 =================

+The Cowboy User Guide explores the modern Web and how to make
+best use of Cowboy for writing powerful web applications.
+
+Introducing Cowboy
+------------------
+
 *  [Introduction](introduction.md)
   *  Purpose
   *  Prerequisites
+   *  Supported platforms
   *  Conventions
-   *  Getting started
+ *  [The modern Web](modern_web.md)
+   *  The prehistoric Web
+   *  HTTP/1.1
+   *  REST
+   *  XmlHttpRequest
+   *  Long-polling
+   *  HTML5
+   *  EventSource
+   *  Websocket
+   *  SPDY
+   *  HTTP/2.0
+ *  [Erlang and the Web](erlang_web.md)
+   *  The Web is concurrent
+   *  The Web is soft real time
+   *  The Web is asynchronous
+   *  The Web is omnipresent
+   *  Erlang is the ideal platform for the Web
+ *  [Erlang for beginners](erlang_beginners.md)
+ *  [Getting started](getting_started.md)
+
+Using Cowboy
+------------
+
 *  [Routing](routing.md)
   *  Purpose
   *  Structure