11 June 2015

Hands-on Review: Cowboy Web Server

Web servers are a dime a dozen these days.  I just barely remember the days when developers had to choose between standalone web servers and those tightly integrated into their applications.  That distinction is almost nonexistent at this point: most modern web servers can be configured to use either approach, based on a project's requirements.

There is, however, still a spectrum of sorts with regard to APIs and languages.  The more general web servers (Apache HTTP Server, nginx, et al.) are intended to be language-, platform-, and API-neutral, while most of the "Web 2.0 and later" languages have at least one web server that can be integrated directly into applications.  Ruby, in particular, has more options in this "integrated server" department than you can shake a stick at; in fact, some might say that it has too many options, and that, like the "Cool Kids" in Garrett's presentation, they tend to be a bit too focused on buzzwords at the cost of substance, but I digress...

Erlang, for all practical purposes, has a total of five options: three in the "old guard":
  • inets, a package included in Erlang's basic "OTP" distribution
    • Easy to work with, well-documented, but lacking some out-of-the-box features
  • Yaws ("Yet Another Web Server"), a massively concurrent juggernaut
    • Back-end code is written in a structure that should be familiar to PHP devs.  It pretty consistently outperforms Apache with large numbers of simultaneous connections.
  • Mochiweb is a lightweight web server toolkit
    • Not really a complete web server, but a useful set of routines that's a bit more complete than the stuff in inets.
...and two* in the "newcomers club":
  • Elli, which is intended to take the best aspects of the old guard and bring them together
  • and, finally, Cowboy, which takes a few radical departures from the traditional models
* a third newcomer, Misultin, recently announced an end of active development

All of these server applications and libraries have their own merits, but Cowboy has interested me in particular because it takes some radically different approaches to the web server model.  Your application is in control of the web server, not the other way around, and you can do anything from direct socket-based interaction all the way to HTML copy-pasta, as suits your needs.

From some extensive poking and prodding, Cowboy appears to have three "layers," two of which are mandatory.  They are, from lowest-level to highest-level:
  • The "listener" layer, responsible for accepting connections on TCP sockets, doing some initial low-level decision making, and spawning a new higher-level process (per connection) to take things from there
  • An optional middleware layer, which provides a nice modular way for developers to avoid re-inventing the wheel.  You can just drop one of many middleware modules in here to handle all sorts of "boilerplate" application functionality, like authentication and session handling, security auditing, etc.  These middleware modules do their magic dances with the web clients, and expose the relevant information to the next layer through small, unobtrusive function calls.
  • The "handler" layer, which is where your (yes, I'm looking at you, dear reader) application lives.  This layer can be anything from the aforementioned raw socket to direct HTTP, depending on how you configured the corresponding listener entry.  All of the information from both the listener and the middleware is available here, too, so you can make intelligent decisions on content delivery.
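
As a rough sketch of how those three layers fit together, the module below wires up a listener, one custom middleware, and a handler in a single file.  It assumes the Cowboy 1.0 API (cowboy:start_http/4, execute/2, init/3); other releases name several of these calls differently.  The module name layers_demo, the listener name demo_listener, port 8080, and the user metadata key are all made up for illustration, and the "authentication" here is only a placeholder that checks whether an Authorization header is present.

```erlang
-module(layers_demo).
-export([start/0, routes/0]).              % listener setup
-export([execute/2]).                      % middleware callback
-export([init/3, handle/2, terminate/3]).  % HTTP handler callbacks

%% Route table: any host, path "/" is served by this module's handler.
routes() ->
    [{'_', [{"/", ?MODULE, []}]}].

%% Listener layer: accept plain TCP connections on port 8080 (an assumed
%% port) with 100 acceptor processes.
start() ->
    {ok, _} = application:ensure_all_started(cowboy),
    Dispatch = cowboy_router:compile(routes()),
    cowboy:start_http(demo_listener, 100, [{port, 8080}], [
        {env, [{dispatch, Dispatch}]},
        %% Middlewares run in order; ours sits between router and handler.
        {middlewares, [cowboy_router, ?MODULE, cowboy_handler]}
    ]).

%% Middleware layer: placeholder "auth" that only checks for a header.
execute(Req, Env) ->
    case cowboy_req:header(<<"authorization">>, Req) of
        {undefined, Req2} ->
            %% No credentials: stop here with a 401 response.
            {error, 401, Req2};
        {_Credentials, Req2} ->
            %% A real middleware would verify the credentials; we just
            %% stash a hypothetical user name for the handler to read.
            Req3 = cowboy_req:set_meta(user, <<"demo-user">>, Req2),
            {ok, Req3, Env}
    end.

%% Handler layer: the application code proper.
init({_Transport, http}, Req, _Opts) ->
    {ok, Req, no_state}.

handle(Req, State) ->
    %% Read back what the middleware stored.
    {User, Req2} = cowboy_req:meta(user, Req),
    {ok, Req3} = cowboy_req:reply(200,
        [{<<"content-type">>, <<"text/plain">>}],
        [<<"Hello, ">>, User, <<"\n">>], Req2),
    {ok, Req3, State}.

terminate(_Reason, _Req, _State) ->
    ok.
```

With Cowboy 1.0 on the code path, calling layers_demo:start() from the Erlang shell should bring the listener up; a request to / without an Authorization header is turned away by the middleware before the handler ever runs.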

That modules-in-layers paradigm leads to some nifty things.  The intelligent content delivery that I just mentioned is a serious boon compared to other web servers that I've used.  Getting information on the logged-in user, for instance, can be as easy as a single function call, and you don't have to worry about actually making the authentication happen if you don't want to (provided that there's a middleware plugin available for your auth system, which there probably is).  In another example, based on whether or not a connection is encrypted, you can deliver not just different pages, but different text strings or backend logic within the same "page."  It's just a simple { Tuple } = function( ... ) pattern match operation in your application code, and you can then do any conditionals or guards that you like.  The same goes for user authentication state (if you've told the optional middleware to let users get to you at all without auth, that is), web client info, HTTP headers, etc.  All of the information is just there, waiting for you to act on it.
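
To make the pattern-matching idea concrete, here is a minimal handler sketch, again assuming the Cowboy 1.0 API, where init/3 receives a {Transport, Protocol} tuple (ssl for an encrypted listener, tcp for a plain one) and the cowboy_req accessors return {Value, Req} tuples.  The module name, the user metadata key (which some middleware of your own would have to set), and the message texts are all hypothetical.

```erlang
-module(secure_aware_handler).
-export([init/3, handle/2, terminate/3, banner/2]).

%% Remember which transport the listener uses so handle/2 can branch on it.
init({Transport, http}, Req, _Opts) ->
    {ok, Req, Transport}.

handle(Req, Transport) ->
    %% The { Tuple } = function( ... ) match: value plus updated request.
    %% 'anonymous' is the default when no middleware stored a user.
    {User, Req2} = cowboy_req:meta(user, Req, anonymous),
    Body = banner(Transport, User),
    {ok, Req3} = cowboy_req:reply(200,
        [{<<"content-type">>, <<"text/plain">>}], Body, Req2),
    {ok, Req3, Transport}.

terminate(_Reason, _Req, _State) ->
    ok.

%% Same "page", different text, chosen by clause heads and a guard.
banner(ssl, anonymous) ->
    <<"Secure connection. Please log in.\n">>;
banner(ssl, User) when is_binary(User) ->
    <<"Welcome back, ", User/binary, "\n">>;
banner(tcp, _User) ->
    <<"Unencrypted connection: account details are hidden.\n">>.
```

Since banner/2 is a pure function, it can be exercised from the shell without a running server, for example with secure_aware_handler:banner(ssl, <<"ada">>).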

The security-conscious will no doubt be alarmed at this point.  "Wait," you might say, "how do we ensure that an attacker can't get to all of that information?"  Though there's no such thing as a fool-proof application, of course, Erlang [unintentionally] mitigates much of this risk for you.  Since it's a compiled language, with even a modicum of care you don't have to worry about code injection into the backend.  What's more, due to its functional paradigm there isn't really any "global namespace," so, short of major bugs in the Erlang runtime (which has had nearly thirty years to stabilize), each particular function in your application only has access to the data that you explicitly pass to it.

As if all of that were not enough to like, Cowboy is fast.  I'm talking 'cheetah after ten cups of coffee' fast, here.  The usage of a compiled language (particularly if you use HiPE, Erlang's native-code compiler, rather than plain BEAM bytecode), plus the tight integration between web application and server, leads to some awe-inspiring response times.

Now, lest I start to sound like a used car salesman, there is a downside to all of this awe-inspiring 21st-century goodness: for those of us used to more traditional web servers, the learning curve here is steep.  It's not just the fact that you pretty much have to learn at least a small amount of Erlang, either.  Imagine, for example, if you had to actually re-compile your Apache httpd.conf after making a configuration change!  Now, rebuilding Cowboy's listener for your application takes less than a second on even a rather low-horsepower system, but it's just something that the uninitiated would never even think to do.  There are plenty of other "gotchas" along those lines.  The biggest challenge of using Cowboy is, in my opinion, the hours of repeated head-desking and hair-pulling required to learn how to use the tool.  Once you've figured it out, you're walking on sunshine, of course, but the road to get there is dark and terrifying... and that, too, fits right in with the Erlang ecosystem.
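
For the curious, one reading of that "re-compile" step, assuming the Cowboy 1.0 API, is that a changed route table has to be passed through cowboy_router:compile/1 and handed to the running listener with cowboy:set_env/3.  The sketch below shows that and nothing more; the module name route_reload and the handler modules hello_handler and status_handler are hypothetical.

```erlang
-module(route_reload).
-export([reload/2, routes_v2/0]).

%% A revised route table: the original "/" route plus a new "/status" one.
routes_v2() ->
    [{'_', [{"/", hello_handler, []},
            {"/status", status_handler, []}]}].

%% Compile the routes and swap them into the named, already-running
%% listener.  New requests should be dispatched with the new table.
reload(ListenerRef, Routes) ->
    Dispatch = cowboy_router:compile(Routes),
    cowboy:set_env(ListenerRef, dispatch, Dispatch).
```

From the shell, that would be something like route_reload:reload(my_listener, route_reload:routes_v2()), where my_listener is whatever name you gave the listener when you started it.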