As we explore the scaling of Web applications along many dimensions (number of users, size of data, UI functionality, and more), we run into challenges, many of them subtle and surprising. Some of the thorniest arise from the high latency of communication over the Internet, which generally pushes designs toward greater concurrency.
Hardware advances have helped: CPU, memory, and storage resources on both the server and most clients (including, now, 1 GHz smartphones) are inexpensive and plentiful. This makes practical the aggressive multithreading, multiple connections, and content caching we see in modern browsers, which helps considerably. On the server side, with many simultaneous clients, the need is even greater, so employing some kind of concurrency to cope with latency is a given (for an interesting analysis, see the classic but still enlightening c10k essay). In fact, for the web server edge layer, much of the problem is managing all of that concurrency without solutions that are bulky, complex, or both. On the server back-end, another challenge looms: processing jobs are getting larger and response-time expectations higher, but CPUs are not getting much faster these days; instead, we just get more CPU cores and hardware threads. Again, the clear recourse is more concurrency…
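To make the edge-layer problem concrete, the thread-per-connection model at the heart of the c10k discussion can be sketched in a few lines. This is a deliberately bare toy echo server in Python (not code from any real web server), just to show where all of those threads come from:

```python
# Toy thread-per-connection server: one OS thread per client.
# Blocking reads stall only that client's thread, which is how this
# model hides network latency -- at the cost of a thread per connection.
import socket
import threading

def handle(conn: socket.socket) -> None:
    # Serve one client until it disconnects (recv returns b"").
    with conn:
        while data := conn.recv(1024):
            conn.sendall(data)  # simple echo

def serve(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    # port=0 asks the OS for any free port; the caller can query it
    # via srv.getsockname(). Accepting runs in a background thread.
    srv = socket.socket()
    srv.bind((host, port))
    srv.listen()

    def accept_loop() -> None:
        while True:
            conn, _addr = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv
```

With ten thousand simultaneous clients, that is ten thousand threads and stacks, which is exactly the overhead the c10k essay analyzes and that event-driven and lightweight-thread designs try to avoid.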
Personally, I find this very challenging. We’ve all seen and dabbled with the standard fork/join patterns, but broader, more sophisticated uses of concurrency are just not that familiar. I hear from experienced programmers who regard coding to the POSIX thread/Java thread model as tricky and woefully low-level, and yet other, more advanced concurrency technologies don’t seem to have become mainstream yet. Optimistically, I think it’s a matter of making these techniques more routine and more usable, rather than a permanent jump to more complex implementations. Indeed, there are many experimental and specialized implementations out there that point the way toward a better understanding of the space. I’d like to highlight a couple here that employ concurrency in a particularly elegant and comprehensible way, and they just happen to be two very novel web server implementations.
Another fascinating example of a web server that scales well and has an instructive, unique implementation is Yaws, written in Erlang. It takes the opposite approach from Node, which eschews threads: Yaws embraces the extremely lightweight threads (somewhat confusingly called “processes”) provided by Erlang’s runtime, one per connection/request. The efficiency of Erlang and its built-in, fault-tolerant clustering features have inspired implementations of other high-performance, highly concurrent services like the distributed key-value stores Riak and Scalaris (also worth a look). To the programmer, Erlang threads are as safe as OS-level processes (no memory sharing), allowing only message passing between them, yet thanks to the language and its runtime they are efficient enough to be used freely and in large numbers. Combined with Erlang’s language-level support for failure detection and recovery, they support dynamic applications with high levels of concurrency.
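Erlang’s share-nothing, message-passing processes can be roughly approximated in most languages. Here is a toy sketch in Python, using one thread plus a private mailbox per “process” (the `Actor` class and its methods are invented for illustration, not an Erlang or Yaws API, and a real Erlang process is far lighter than an OS thread):

```python
# Actor sketch: each "process" owns its state and a private mailbox.
# The ONLY way to interact with it is to send it a message, which is
# the safety property the post attributes to Erlang processes.
import queue
import threading

class Actor:
    def __init__(self, handler):
        self._mailbox = queue.Queue()   # private; never shared directly
        self._handler = handler         # called once per message
        threading.Thread(target=self._loop, daemon=True).start()

    def send(self, msg) -> None:
        # Asynchronous, non-blocking message passing.
        self._mailbox.put(msg)

    def _loop(self) -> None:
        while True:
            msg = self._mailbox.get()
            if msg is None:             # shutdown sentinel
                return
            self._handler(msg)
```

A usage example: an actor that uppercases whatever it receives and reports results through a reply queue, so even the response travels as a message rather than through shared state. What this sketch cannot show is the other half of Erlang's story, the runtime's failure detection, supervision, and recovery, which the language supports directly.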
It seems these technologies are on the right track, and at the very least they’ve nudged forward my overall understanding of concurrency. While it’s come and gone as a hot topic over the years, based on what I see in the research community I have a feeling we’ll be wrestling with it for as long as scaling is a top concern, and as long as latency remains a factor.
Continue the conversation by sharing your comments here on the blog and by following us on Twitter @CTCT_API