Concurrency Insights from Web Server Technologies

As we explore the scaling of Web applications in many dimensions (number of users, size of data, UI functionality, and more), there are various challenges, many subtle and surprising.  Some of the thorniest arise from the high latency of communications over the Internet, which generally leads to designs supporting greater concurrency.

Hardware advances have helped: CPU, memory, and storage resources on both the server and most clients (including, now, 1 GHz smartphones) are inexpensive and plentiful.  This makes practical the aggressive multithreading, multiple connections, and content caching we see in modern browsers, which helps a great deal.  On the server side, with many simultaneous clients, the corresponding need is even greater, so employing some kind of concurrency to cope with latency is a given (for an interesting analysis, see the classic but still enlightening c10k essay).  In fact, for the web server edge layer, the problem is almost entirely one of managing all of that concurrency, and the traditional ways of doing so tend to be bulky, complex, or both.  On the server back-end, another challenge looms: processing jobs are getting larger and expectations for response time are rising, but CPUs are not getting much faster these days; instead we just get more CPU cores and hardware threads.  Again, the clear recourse is more concurrency…

Personally, I find this very challenging.  We’ve all seen and dabbled with the standard fork/join patterns, but broader, more sophisticated uses of concurrency are just not that familiar.  I hear from experienced programmers who regard coding to the POSIX thread/Java thread model as tricky and woefully low-level, and yet other, more advanced technologies to employ concurrency just don’t seem to have become mainstream yet.  Optimistically, I think it’s a matter of making these techniques more routine and more usable, rather than a permanent jump to more complex implementations.  Indeed, there are many experimental and specialized implementations out there that point the way towards a better understanding of the space.  I’d like to highlight a couple here that employ concurrency in a particularly elegant and comprehensible way, and they just happen to be two very novel web server implementations.

The first is node.js (“Node”), which is interesting in many ways.  Your application is written in JavaScript, and it runs in Google’s highly optimized V8 JavaScript engine, so its performance is realistic.  It’s not the most obvious choice, but JavaScript is actually rather attractive for running the server half of your Web application, with its flexibility and its ability to be expressive and concise.  Plus, not only do you get the benefit of the same implementation language on the client and server, but with Node there is an event-reactive application structure just like in the browser: you write handlers, and the runtime takes care of polling and looping.  Since application code never blocks on I/O (everything is transparently asynchronous), the runtime is able to switch tasks efficiently and gain lots of concurrency without piling up thousands of blocking threads.  It’s a great demonstration of how much of the time a typical web server request spends waiting on slow resources (storage and network), and of how all of that paused but resumable state can be efficiently encoded and accessed.

Another fascinating example of a web server that scales well and has an instructive and unique implementation is Yaws, written in Erlang.  It takes the opposite approach from Node: rather than eschewing threads, it embraces the extremely lightweight threads (somewhat confusingly called “processes”) provided by Erlang’s runtime, one per connection/request.  The efficiency of Erlang and its built-in, fault-tolerant clustering features have inspired implementations of other high-performance, highly concurrent services, such as the distributed key-value stores Riak and Scalaris (also worth a look).  To the programmer, Erlang threads are as safe as OS-level processes (no shared memory), communicating only by passing messages, but thanks to the language and its runtime they are efficient enough to be used freely and in large numbers.  Combined with Erlang’s language-level support for failure detection and recovery, they support dynamic applications with high levels of concurrency.
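The share-nothing, message-passing idea is worth seeing in miniature.  Below is a toy JavaScript sketch of it (names like `spawn` and `inspect` are my own, and this captures only the shape of the model; real Erlang processes are preemptively scheduled by the VM and communicate across machines, which no few-line sketch can reproduce):

```javascript
// A toy "mailbox" actor: its state lives in a closure, so the only way
// to affect it is to send a message, echoing Erlang's share-nothing model.
function spawn(handler, initialState) {
  const mailbox = [];
  let state = initialState;
  let draining = false;
  return {
    send(msg) {
      mailbox.push(msg);
      if (draining) return;            // already processing the queue
      draining = true;
      while (mailbox.length > 0) {
        state = handler(state, mailbox.shift());
      }
      draining = false;
    },
    // For demonstration only; a real actor would reply with a message.
    inspect() { return state; }
  };
}

// A counter process: its state is a number, and 'incr' messages bump it.
const counter = spawn((count, msg) => (msg === 'incr' ? count + 1 : count), 0);
counter.send('incr');
counter.send('incr');
```

Because no outside code can touch `state` directly, there is nothing to lock, which is precisely what makes it safe to run huge numbers of such processes concurrently.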

Perhaps as a demonstration of their flexibility, both Yaws and Node are also great tools for exploring one more promising new technology that gets to the heart of latency and concurrency in Web applications: the WebSockets protocol.  It establishes a persistent, bidirectional communication channel between the server and the browser, a huge simplification over previous techniques that used multiple XMLHttpRequests and hanging-GET open connections.  In addition to this clarity and optimization (it’s no longer an HTTP request, so you don’t carry all that extra header payload on every exchange of data), it makes concurrency much easier to manage through its integration with the JavaScript event architecture.  Basically, there’s a send() call to push data from the browser to the server, and an “onmessage” handler to receive data pushed from the server.  For WebSockets functionality in your browser, there are more choices arriving all the time (Chrome, WebKit (Safari nightlies), and Flash- and Java-based add-ons for other browsers).
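That send()/onmessage pairing is the whole client-side API, and it slots straight into the event-handler style you already use in the browser.  The sketch below exercises exactly those two calls against a small stand-in object so it can run anywhere; in a real page you would replace `makeFakeSocket()` with something like `new WebSocket('ws://example.com/socket')` (a hypothetical URL), and the handler code would be unchanged:

```javascript
// Stand-in with the same surface as the browser's WebSocket object:
// a send() method going out, and an onmessage handler coming in.
function makeFakeSocket() {
  return {
    sent: [],                 // frames "sent" to the server
    onmessage: null,
    send(data) { this.sent.push(data); },
    // Test helper: simulate the server pushing a frame to the client.
    receive(data) { if (this.onmessage) this.onmessage({ data }); }
  };
}

const socket = makeFakeSocket();
const received = [];

// The two calls described above: push data out, react to data pushed in.
socket.onmessage = (event) => { received.push(event.data); };
socket.send('hello server');
socket.receive('hello client');
```

Because the server can call down to the client at any time through `onmessage`, there is no polling loop to write; the event architecture does the waiting for you, just as it does for clicks and timers.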

It seems these technologies are on the right track, and at the very least they’ve nudged forward my overall understanding of concurrency.  While it’s come and gone as a hot topic over the years, based on what I see in the research community I have a feeling we’ll be wrestling with it for as long as scaling is a top concern, and for as long as latency is still a factor.

Continue the conversation by sharing your comments here on the blog and by following us on Twitter @CTCT_API
