Web hate - The emperor’s new clothes were built with Node.js

A copy of the deleted text “The emperor’s new clothes were built with Node.js” by Eric Jiang, published on Wednesday, June 4, 2014 at https://notes.ericjiang.com/.

There are plenty of people lambasting Node.js (see the infamous “Node.js is cancer”) but proponents tend to misunderstand the message and come up with irrelevant counterpoints. It’s made worse because there are two very different classes of people that use Node.js. The first kind of people are those who need highly concurrent servers that can handle many connections at once: HTTP proxies, Websocket chat servers, etc. The second are those who are so dependent on JavaScript that they need to use JS across their browser, server, database, and laundry machine.

I want to address one-by-one all of the strange and misguided arguments for Node.js in one place.

TL;DR: what’s clothing that doesn’t use threads and doesn’t block (anything)?

Node.js is fast!
This is actually too imprecise. Let’s break it down into two separate claims:
A. JavaScript running on V8 is fast!
You have to give the V8 developers some kudos. V8 has done incredible things to run JavaScript code really fast. How fast? Anywhere from 1x to 5x slower than Java, at least for the Benchmarks Game. (Some of you may not realize that “slower” is not a typo.)
If you look at their benchmarks, you’ll notice that V8 ships a really freakin’ good regex engine. Conclusion? Node.js is best suited for CPU-bound regex-heavy workloads.
So if we take the Benchmarks Game to be gospel, then what languages/implementations are typically faster than JavaScript/V8? Oh, just some unproductive ones like Java, Go, Erlang (HiPE), Clojure, C#, F#, Haskell (GHC), OCaml, Lisp (SBCL). Nothing that you could write a web server in.
And it’s good that you don’t need to use multiple cores at once, since the interpreter is single-threaded. (Comments will no doubt point out that you can run multiple processes in Node.js, something that you can’t do with any other language.)
B. Node.js is non-blocking! It has super concurrency! It’s evented!
Sometimes I wonder whether people even understand what they’re saying.
Node.js is in this weird spot where you don’t get the convenience of light-weight threads but you’re manually doing all the work a light-weight threads implementation would do for you. Since JavaScript doesn’t have built-in support for any sort of sane concurrency, what grew out of it was a library of functions that use callbacks. PL folks will realize that it’s just a crappy version of continuation-passing style (Sussman and Steele 1975), but instead of being used to work around recursion growing the stack, it’s used to work around a deficient language.
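For readers who have not met continuation-passing style, here is a minimal sketch of the same computation written both ways. The function names (addDirect, squareDirect, addCps, squareCps) are made up for illustration and are not part of any library; the CPS versions call their continuation synchronously to keep the example small, whereas Node.js callbacks usually fire on a later turn of the event loop.

```javascript
// Direct style: each function hands its result back with `return`.
function addDirect(a, b) {
  return a + b;
}
function squareDirect(x) {
  return x * x;
}

// Continuation-passing style: each function takes an extra argument `k`
// (the continuation) and passes its result to `k` instead of returning it.
function addCps(a, b, k) {
  k(a + b);
}
function squareCps(x, k) {
  k(x * x);
}

// (1 + 2) squared, direct style: composition is ordinary nesting of calls.
const direct = squareDirect(addDirect(1, 2));

// The same computation in CPS: "the rest of the program" after each step
// has to be spelled out by hand as a nested function.
let viaCps;
addCps(1, 2, function (sum) {
  squareCps(sum, function (squared) {
    viaCps = squared;
  });
});

console.log(direct, viaCps);
```

Both variables end up holding the same value; the only difference is who is responsible for writing down what happens next.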
So yes, Node.js can effectively deal with many connections in a single-threaded application, but it wasn’t the first or only runtime to do so. Look at Vert.x, Erlang, Stackless Python, GHC, Go…
The best part is all the people jumping through hoops to create their MVP in Node.js because they think it’ll make their site faster for their swarms of future users. (Never mind that loading 500K of Backbone.js code and miscellaneous libraries is not very high performance anyways.)
Node.js makes concurrency easy!
JavaScript doesn’t have built-in language features for concurrency, Node.js doesn’t provide that magic, and there are no metaprogramming capabilities. You have to manage all of your continuations manually, or with the help of (lots of different) libraries that push JavaScript syntax to its absurd limits. (BTW, I find should.js both horrific and convenient.) It’s the modern-day equivalent of using GOTO because your language doesn’t have for loops.
Let’s compare.
In Node.js, you might write this function for some business task:
function dostuff(callback) {
  task1(function(x) {
    task2(x, function(y) {
      task3(y, function(z) {
        if (z < 0) {
          callback(0);
        } else {
          callback(z);
        }
      });
    });
  });
}
Clear as mud. Let’s use Q promises instead!
function dostuff() {
  return task1()
    .then(task2)
    .then(task3)
    .then(function(z) {
      if (z < 0) {
        return 0;
      } else {
        return z;
      }
    });
}
A lot more readable, but still dumb. One side effect is that Q eats your exceptions unless you remember to finish your chain with “.done()”, and there are plenty of other pitfalls that aren’t obvious. Of course, most libraries in Node.js don’t use Q, so you’re still stuck using callbacks anyways. What if task2 didn’t return a Q promise?
function dostuff() {
  return task1()
    .then(function(x) {
      var deferred = Q.defer();
      task2(x, deferred.resolve);
      return deferred;
    })
    .then(task3)
    .then(function(z) {
      if (z < 0) {
        return 0;
      } else {
        return z;
      }
    })
    .done();
}
The code above is broken. Can you spot why? By the way, we also forgot to handle exceptions. Let’s fix these issues:
function dostuff() {
  return task1()
    .then(function(x) {
      var deferred = Q.defer();
      task2(x, function(err, res) {
        if (err) {
          deferred.reject(err);
        } else {
          deferred.resolve(res);
        }
      });
      return deferred.promise;
    },
    function(e) {
      console.log("Task 1 failed.");
    })
    .then(task3, function(e) {
      console.log("Task 2 failed.");
    })
    .then(function(z) {
      if (z < 0) {
        return 0;
      } else {
        return z;
      }
    },
    function(e) {
      console.log("Task 3 failed.");
    })
    .done();
}
Notice how the error handling and the tasks they correspond to are interleaved. Are we having fun yet?
In Go, you can write code like this:
func dostuff() int {
  z := task3(task2(task1()))
  if z < 0 {
    return 0
  }
  return z
}
Or with error handling:
func dostuff() (int, error) {
  x, err := task1()
  if err != nil {
    log.Print("Task 1 failed.")
    return 0, err
  }
  y, err := task2(x)
  if err != nil {
    log.Print("Task 2 failed.")
    return 0, err
  }
  z, err := task3(y)
  if err != nil {
    log.Print("Task 3 failed.")
    return 0, err
  }
  if z < 0 {
    return 0, nil
  }
  return z, nil
}
Realize that both the Go and Node.js versions are basically equivalent, except Go handles the concurrency and continuations for us. In Node.js, we have to manage our continuations manually because we have to work against the built-in control flow.
Oh, before you actually do any of this stuff, you have to learn not to release Zalgo, possibly by using synthetic deferrals (say what?) so that you don’t make your API’s users unhappy. In the world of “lean” and MEAN MVPs, who has time to learn about leaky abstractions on top of some obtuse runtime?
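A rough illustration of what “releasing Zalgo” can look like, assuming a cached lookup. Everything here (cache, fetchSlowly, lookupZalgo, lookupDeferred, callOrder) is a hypothetical name invented for this sketch. The first lookup calls back synchronously on a cache hit and asynchronously on a miss, so callers cannot know whether their callback runs before or after the call returns. The second uses process.nextTick as a synthetic deferral so the callback is always asynchronous.

```javascript
const cache = new Map([["a", "A"]]);

// Stand-in for a slow lookup; it always calls back asynchronously.
function fetchSlowly(key, callback) {
  setImmediate(function () {
    callback(key.toUpperCase());
  });
}

// Releases Zalgo: synchronous on a cache hit, asynchronous on a miss.
function lookupZalgo(key, callback) {
  if (cache.has(key)) {
    callback(cache.get(key)); // runs before lookupZalgo returns
    return;
  }
  fetchSlowly(key, function (value) {
    cache.set(key, value);
    callback(value);
  });
}

// Synthetic deferral: even a cache hit is delivered on a later tick.
function lookupDeferred(key, callback) {
  if (cache.has(key)) {
    const value = cache.get(key);
    process.nextTick(function () {
      callback(value);
    });
    return;
  }
  fetchSlowly(key, function (value) {
    cache.set(key, value);
    callback(value);
  });
}

// Records what has happened by the time the lookup call returns.
function callOrder(lookup) {
  const order = [];
  lookup("a", function () {
    order.push("callback");
  });
  order.push("after call");
  return order.slice(); // snapshot taken before any deferred callback runs
}

console.log(callOrder(lookupZalgo));    // callback ran first
console.log(callOrder(lookupDeferred)); // callback has not run yet
```

With the deferred version the ordering no longer depends on whether the key happened to be cached, which is the whole point of the deferral.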
By the way, Q is super slow (or so the Internet says). Check out this handy performance guide comparing 21 different ways of handling asynchronous calls!
No wonder people love Node.js. It gives you the same performance as lightweight threads but with the clarity and usability of x86 assembly.
When people point out how unpleasant it is to manually handle control flow in Node.js, the proponents say, “Use libraries to handle that, like async.js!” So you start using library functions to run a list of tasks in parallel or compose two functions, which is exactly what you’d do with any threaded language, except worse.
LinkedIn went from 30 servers to 3 by switching to Node.js!
Quoth Hacker News: “I switched from a dump truck to a motorbike and now I drive a lot faster!”
PayPal and Wal-Mart have also had high-profile switches to Node.js. Of course, they’re comparing two completely different things to make Node.js look better. In these too-good-to-be-true stories, they’re switching from a gigantic enterprisey codebase to a Node.js app written from scratch. Is there any question that it would have been faster? They could have switched to pretty much anything and gotten a performance gain.
In LinkedIn’s case, they had proxies running on Mongrel with a concurrency of 1. It’s like switching from using one finger to type on a QWERTY keyboard to using ten fingers on a Dvorak keyboard and giving all the credit to Dvorak for a better keyboard layout.
This is classic hype: real-world stories misunderstood and twisted to confuse the unwitting.
It lets you leverage your existing JavaScript expertise!
Let’s be more specific and break this down into a couple parts:
a. The frontend devs can work on the backend!
Where was JavaScript used previously? Primarily browser-side front-end code to animate buttons or smoosh JSON into fancy interfaces. By leveraging JavaScript on the backend, you let your ninja UI devs hack on mission-critical networking code. Since it’s JS on both ends, there’s nothing to learn! (Right?)
Wait until they find out that they can’t use return normally (because concurrency!), they can’t use throw/catch normally (because concurrency!), and everything they call is callback based, returns a Q promise, returns a native promise, is a generator, is a pipe, or some other weird thing because it’s Node.js. (Just tell them to check the type signatures.)
If they couldn’t learn a new language for the backend, then they’ll have trouble figuring out how to mix and match all the different callbacks/promises/generators into code that doesn’t collapse every time a change is made.
b. We can share code between the backend and frontend!
You’re then limiting your server-side code to use only language features that browsers support. For example, your shared code can’t use JS 1.7 generators until the browsers support them too, and we have enough experience to know that adoption could take years.
Effectively, we can’t improve the server language in Node in substantial ways without drifting away from the browser language. Node.js has so many gaping holes that are up to the libraries to fix, but since it’s chained to the language we call JavaScript, it can’t strike out on its own to address these things at the language level.
It’s an awkward situation where the language doesn’t give you much, but you can’t change the language so you keep doing npm install band-aid.
This can be fixed by running some sort of compilation step to transform new language features into older features so you can write for the server and still run on regular JavaScript. Your choices are either something that’s 95% JavaScript (TypeScript, CoffeeScript) or not JavaScript at all (ClojureScript, perhaps).
More worrying is that this argument implies that you actually muddle the concerns of your server and frontend. In the real world, you’ll find that your backend turns into a JSON API that handles all of the validation, processing, etc., and you have multiple (sometimes third-party) consumers of that API. For example, when you decide to build iPhone and Android apps, you’ll have to decide between a native app in Java, Obj-C, or C#, or packing your one-page Backbone.js/Angular.js app using Phonegap/Cordova. The code you share between the server and client may end up being a liability, depending on what platform you go with.
NPM is so great!
I think NPM has attained a status of “not awful”, which puts it ahead of many other package managers. Like most ecosystems, NPM is pretty cluttered with multiple redundant implementations of the same thing. Say you need a library for sending Android push notifications. On NPM, you’ll find: gcm, node-gcm, node-gcm-service, dpush, gcm4node, libgcm, and ngcm, not to mention all the libraries that support multiple notification services. Which are reliable? Which are abandoned? In the end, you just pick the one that has the most downloads (but why can’t we sort results by popularity?).
NPM also has a less-than-stellar operations track record. It used to go down quite often and it was hilarious seeing all the companies that suddenly couldn’t deploy code because NPM was having troubles again. Its up-time is quite a bit better now, but who knows if they will suddenly break your deployment process because they can.
We somehow managed to deploy code in the past without introducing a deploy-time dependency on a young, volunteer-run, created-from-scratch package repository. We even did such blasphemous things as including a copy of the library source locally!
I’m so productive with Node.js! Agile! Fast! MVP!
There seems to be a weird dichotomy in the minds of Node.js programmers: either you’re running mod_php or some Java EE monstrosity and therefore a dinosaur, or you’re on Node.js and super lean and fast. This might explain why you don’t see as many people bragging about how they went from Python to Node.js.* Certainly, if you come from an over-engineered system where doing anything requires an AbstractFactoryFactorySingletonBean, the lack of structure in Node.js is refreshing. But to say that this makes Node.js more productive is an error of omission—namely, they leave out all the things that suck.
Here’s what a newcomer to Node.js might do:
This function might fail and I need to throw an exception, so I’ll write throw new Error("it broke");.

The exception isn’t caught by my try-catch!

Using process.on("uncaughtException") seemed to do it.

I’m not getting the stacktrace I expected, and StackOverflow says that this way violates best practices anyways.

Maybe if I try using domains?

Oh, callbacks typically take the error as the first parameter. I should go back and change my function calls.

Someone else told me to use promises instead.

After reading the examples ten or twelve times, I think I have it working.

Except that it ate my exceptions. Wait, I needed to put .done() at the end of the chain.
Here’s a Python programmer:

raise Exception("it broke")

Here’s a Go programmer:

I’ll add err to my return signature and change my return statements to add a second return value.
There is a lot of stuff in Node.js that actually gets in the way of producing an MVP. The MVP isn’t where you should be worrying about returning an HTTP response 40ms faster or how many simultaneous connections your DigitalOcean “droplet” can support. You don’t have time to become an expert on concurrency paradigms (and you’re clearly not because you wouldn’t be using Node otherwise!).
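The newcomer’s first two steps above can be sketched as follows. This is a minimal illustration, not a recommended pattern: doubleLater and record are hypothetical names, and setImmediate stands in for whatever real asynchronous work the function would do. The try/catch around the calls never fires, because both calls return before their callbacks run; the error only reaches the caller because it is passed as the first callback argument.

```javascript
// Hypothetical task in the Node.js error-first callback style.
function doubleLater(input, callback) {
  setImmediate(function () {
    if (typeof input !== "number") {
      // A `throw` here would bypass the caller's try/catch, since this
      // function runs on a later turn of the event loop. Report instead.
      callback(new Error("it broke"));
      return;
    }
    callback(null, input * 2);
  });
}

const outcomes = [];
function record(err, value) {
  outcomes.push(err ? "error: " + err.message : "value: " + value);
}

try {
  doubleLater("oops", record);
  doubleLater(21, record);
} catch (e) {
  // Never reached: nothing has failed yet when the calls return.
  outcomes.push("caught by try/catch");
}

// Queued after the two tasks, so it runs once both callbacks have fired.
setImmediate(function () {
  console.log(outcomes); // the error outcome first, then the doubled value
});
```

Note that outcomes is still empty immediately after the try block, which is exactly why the try/catch has nothing to catch.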
* Check out this great post about switching from Python to Node.js. The money quote is, “Specifically, the deferred programming model is difficult for developers to quickly grasp and debug. It tended to be ‘fail’ deadly, in that if a developer didn’t fully understand Twisted Python, they would make many innocent mistakes.” So they switched to another difficult system that fails in subtle ways if you don’t fully understand it and make innocent mistakes!