An expensive line of code: scaling Node.JS on Heroku

One of the perks of working in multidisciplinary teams for the past few years has been that there I have continuously been surrounded with people with more tech chops than me who would tackle the really hard problems™. While working on a little thing for the Human Tide project, I was reminded how tricky some of those problems can be sometimes.

In a nutshell, I was building a web service that uses the HTML5 Geolocation API to track the location and movement of people as they walk along the shoreline of Herne Bay. The plan was that a group of people – around a hundred – will each carry a mobile phone that sends its GPS coordinates to a server, which stores them in a database. It was a simple enough job, were it not that each phone continuously sends its coordinates back in real time – once per second – which quickly begins to look like a small-scale distributed denial of service attack. Furthermore, as each request needs to be stored in a database, it was not a matter that can easily be solved using caching. I needed a backend that could deal with 100+ requests per second without breaking a sweat. Just the use case node.js, with its asynchronous event-driven model, was supposed to be good at!

Building the little web service went well enough. Within a few hours, I had a small node.js app running on a free Heroku plan. Using the expressjs framework and the node-postgres library, the whole thing was done in under 100 lines of code. It was easy. Too easy. Unfortunately, it performed absolutely terrible.

To simulate the effect of a hundred phones hammering away at the service and measure its performance, I unleashed ApacheBench on my little app. Simulating 5,000 requests from 100 concurrent users is as easy as running:

ab -n 5000 -c 100 my.herokuapp.com

The results were devastating:

Concurrency Level:      100
Time taken for tests:   565.061 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      3385800 bytes
HTML transferred:       2655000 bytes
Requests per second:    8.85 [#/sec] (mean)
Time per request:       11301.229 [ms] (mean)
Time per request:       113.012 [ms] (mean, across all concurrent requests)
Transfer rate:          5.85 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       17   18   0.5     18      25
Processing:    31 11193 2975.6  11176   21911
Waiting:       31 11193 2975.6  11176   21911
Total:         49 11211 2975.5  11195   21928

Percentage of the requests served within a certain time (ms)
  50%  11195
  66%  11714
  75%  12074
  80%  12232
  90%  13032
  95%  14745
  98%  16573
  99%  17852
 100%  21928 (longest request)

My state-of-the-art node.js app managed to handle all of the requests (mostly thanks to Heroku’s routing stack queuing them up), but at a rate of less than 10 requests per second and taking up to 20 (!) seconds per request. Granted, this was on a single Heroku “dyno”, but with these results, it would take at least 10 additional dynos (at a cost of almost $400 a month) to cope with the load. Not exactly what I had in mind. I figured it might have something to do with my app using Heroku’s dev postgresql database plans, but running the same test against a route that didn’t hit the database produced similar results. It seemed that the slowness was the request handling itself. Looking in Heroku’s logs, I saw a depressing number of lines like this, showing severe delays in the “service” step:

heroku[router]: … connect=0ms service=4506ms status=200 bytes=543

I found an article explaining how to use node cluster to better benefit from the multiple cores in each Heroku dyno, which didn’t appear to make much of a difference. It was still just too slow.

And then it hit me. From Heroku’s own tutorial on getting started with Node.js on their platform, I copied the following line to enable logging in Express.js:

app.use(express.logger());

app[web.1]: - - - [Mon, 05 Aug 2013 14:05:51 GMT] "GET / HTTP/1.1" 200 - "-" "ApacheBench/2.3"

Deleting that line improved performance… with a factor 50. I ran my little concurrency benchmark again and this time it completed in a fraction of the time, even though I gave it a lot more requests to process:

Concurrency Level:      100
Time taken for tests:   105.179 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      33843447 bytes
HTML transferred:       26550000 bytes
Requests per second:    475.38 [#/sec] (mean)
Time per request:       210.358 [ms] (mean)
Time per request:       2.104 [ms] (mean, across all concurrent requests)
Transfer rate:          314.23 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       17   18   1.3     18      59
Processing:    23  192 160.6    135    1400
Waiting:       23  192 160.6    135    1400
Total:         41  210 160.6    153    1419

Percentage of the requests served within a certain time (ms)
  50%    153
  66%    280
  75%    341
  80%    366
  90%    419
  95%    458
  98%    546
  99%    656
 100%   1419 (longest request)

From less than 10 requests per second, it can now handle almost 500 requests per second and rather than 10 seconds, requests now take around 200ms to complete. All because of a single line of logging. In fact, I found that any logging, even simple “console.log”-statements completely destroy Node.JS performance on Heroku.

What I’ve learned from this is that scaling is hard and finding out why things are not performing as expected when using new technology is too, even for very simple apps like this. I’m a bit surprised that a bit of logging has such a severe impact on performance on Heroku and even more surprised that they recommend you to enable logging in their Express tutorial. Turning logging off allowed my app to handle 53x as many requests per second on a single “dyno”. I would have had to pay for quite a few of those dynos, at $34.50 a piece, in order to deal with the same volume of traffic. In the end, my little app performed great and I highly recommend Heroku or similar cloud services for short-term high-availability purposes, but keep an eye out for little things that hurt performance before reaching out for that credit card.

Michel Jansen

9 Responses to “An expensive line of code: scaling Node.JS on Heroku”