November 25, 2015

How to Timeout a Connection in Node According to Stack Overflow

Firstly, I just want to make it clear that this post is in no way a slight towards Stack Overflow. Without this fantastic collection of carefully curated questions and answers, we would all be lost. This is just a story about an imaginary software developer maintaining imaginary legacy code.

Last week, this imaginary dev was on a bit of an adventure. Instead of writing brand new code or making calm one-or-two-line edits to fix bugs in well-written JavaScript, he was helping maintain a bit of a rat's nest.

Let's just call this micro-service the Gardiner expressway.

Once more unto the breach, dear friends...

What I learned from this program (which I coincidentally named after a notoriously unmaintainable downtown Toronto thoroughfare) is a brand new way to time out an outgoing HTTP connection.

Here's a module that demonstrates this new approach. I'll throw this into get.js:

/* eslint-env node */
/* eslint no-use-before-define: 0 */
/* eslint no-multi-spaces: 0 */

"use strict";

var request = require("request");

module.exports = function (url, timeout, callback) {
    var isTimedOut;

    setTimeout(function () {
        isTimedOut = true;
        callback(new Error("Connection timed out"));
        return;
    }, timeout);

    request.get(url, function (err, res, body) {
        if (isTimedOut) {
            return;
        }

        callback(err, body);
        return;
    });
};

And here's a quick driver program (index.js) that takes the URL and timeout from the command line, runs our modularized function to fetch the URL, and displays the number of characters in the response body:

/* eslint-env node */
/* eslint no-use-before-define: 0 */
/* eslint no-multi-spaces: 0 */
/* eslint no-shadow: 0 */
/* eslint no-process-exit: 0 */

"use strict";

var get = require("./get");

if (process.argv.length !== 4) {
    // %s is filled in with the script path from process.argv[1].
    console.error(
        "Usage: node %s URL MILLISECONDS\n" +
        "Fetch URL and display the length of the fetched body.\n" +
        "Timeout after the provided number of ms.",
        process.argv[1]
    );
    process.exit(1);
}

var url = process.argv[2],
    timeout = parseInt(process.argv[3], 10)  // argv values are strings
;

get(url, timeout, function (err, body) {
    if (err) {
        printError(err);
        return;
    }

    console.log("%s characters received", body.toString().length);
});

function printError(err) {
    if (err instanceof Error) {
        console.error(err.stack);
    } else {
        console.log(err.toString());
    }
}

So, for example, to fetch the URL https://www.google.com and give up after one second:

$ npm install request
...
...
$ node index https://www.google.com 1000
55429 characters received

And, as you'd expect, if we bump the timeout down low enough, we'll see errors:

$ node index https://www.google.com 900
55452 characters received
$ node index https://www.google.com 800
55460 characters received
$ node index https://www.google.com 700
55471 characters received
$ node index https://www.google.com 600
55415 characters received
$ node index https://www.google.com 500
Error: Connection timed out
    at null._onTimeout (/Users/ctaylorr/proj/hownottotimeout/get.js:14:11)
    at Timer.listOnTimeout (timers.js:110:15)

It works! Let's ship it! BTW, what the actual eff?!?!?!

But wait a second. This isn't the way this is done in other runtimes:

- We're starting our timeout countdown long before the connection is attempted. This measures the latency of the destination server all right, but it also measures:
    - DNS lookup time.
    - Socket creation time.
    - The amount of time it takes to run code in request(...) prior to creating a connection.
- The pending setTimeout keeps the event loop alive even after the response has arrived, so the process can't exit until the timer fires (at which point it also reports a bogus timeout). This can be demonstrated by running node index https://www.google.com 10000.
- Even when we do time out, the connection is still active. Don't believe me? Run node index https://www.google.com:12345 500. Node will not terminate, since the connection is still keeping the event loop alive. If this code were run in a web app, we would be leaking sockets under heavy load. One of the goals of implementing timeouts is to save resources.
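For the curious, here is a rough sketch of how the lingering timer and the still-open connection could be patched by hand. It is not the code from the imaginary micro-service: it assumes Node's core http module instead of the request package (so it runs with no dependencies), and the local never-answering server at the bottom is purely a stand-in for a slow host.

```javascript
"use strict";

var http = require("http");

// Hypothetical hardened variant of get.js, built on core http.
function get(url, timeout, callback) {
    var finished = false;
    var timer;

    // Make sure the callback runs exactly once and the timer is always stopped.
    function done(err, body) {
        if (finished) {
            return;
        }
        finished = true;
        clearTimeout(timer);   // a cleared timer no longer keeps the process alive
        callback(err, body);
    }

    var req = http.get(url, function (res) {
        var chunks = [];
        res.on("data", function (chunk) { chunks.push(chunk); });
        res.on("end", function () { done(null, Buffer.concat(chunks)); });
        res.on("error", done);
    });

    req.on("error", done);

    timer = setTimeout(function () {
        done(new Error("Connection timed out"));
        req.destroy();         // tear down the socket so nothing is leaked
    }, timeout);
}

// Demo: a throwaway local server that accepts connections but never answers.
var server = http.createServer(function () { /* intentionally silent */ });
server.listen(0, "127.0.0.1", function () {
    var url = "http://127.0.0.1:" + server.address().port + "/";
    get(url, 200, function (err) {
        console.log(err.message);
        server.close();        // the process can now exit on its own
    });
});
```

Note that this still starts the clock before the DNS lookup and the socket creation, so it only papers over two of the three problems.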

We'd better look into this.

There are probably other issues, but what we've found so far is enough to consider investigating alternatives to this approach.

Let's try to address them one by one:

But first, let's check the docs...

According to the docs for the request module, implementing a connection timeout requires an additional option:


var request = require("request");

request.get({
    url: "https://www.google.com",
    timeout: 1000  // <---- THAT'S IT. AN OPTION.
}, function (err, res, body) {
    // handle errors

    // process results
});

It was an option all along

Yup. And the resulting code is so simple, you don't even need a separate module. Just run request() with the extra option.
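In the same know-your-tools spirit, Node's core http module also accepts a timeout option, in case pulling in request isn't possible. The sketch below is my own illustration rather than anything from the Gardiner code base: fetchWithTimeout and the silent local server are made-up names for the demo, and it assumes a Node release recent enough to support http.get(url, options, callback). One catch worth knowing is that the core 'timeout' event only notifies you that the socket has gone idle; destroying the request is still your job.

```javascript
"use strict";

var http = require("http");

// Hypothetical helper: fetch a URL with core http and give up after
// `timeout` ms of socket inactivity.
function fetchWithTimeout(url, timeout, callback) {
    var settled = false;

    // Report the outcome exactly once.
    function finish(err, body) {
        if (settled) {
            return;
        }
        settled = true;
        callback(err, body);
    }

    var req = http.get(url, { timeout: timeout }, function (res) {
        var chunks = [];
        res.on("data", function (chunk) { chunks.push(chunk); });
        res.on("end", function () { finish(null, Buffer.concat(chunks)); });
        res.on("error", finish);
    });

    // The event alone does not abort anything, so destroy the request here.
    req.on("timeout", function () {
        finish(new Error("Connection timed out"));
        req.destroy();
    });

    req.on("error", finish);
}

// Demo: a throwaway local server that accepts connections but never answers.
var silentServer = http.createServer(function () { /* intentionally silent */ });
silentServer.listen(0, "127.0.0.1", function () {
    var demoUrl = "http://127.0.0.1:" + silentServer.address().port + "/";
    fetchWithTimeout(demoUrl, 200, function (err) {
        console.log(err.message);
        silentServer.close();
    });
});
```

There is no setTimeout to clean up here, since the timer belongs to the socket and goes away with it.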

tl;dr: Know your tools.

The moral of this story is that when using new tech (an API, a framework, ...anything) always set aside some time to take a look at the documentation. No need to commit the entire thing to memory, but use this as the first stop when a new requirement pops up. If nothing pops out, consider composing concepts from the docs.

If that still doesn't work, only then consider the clever techniques from sites like Stack Overflow.

Hope this helps,

— chris
