JavaScript 优化(二)循环是糟糕的

1,265 阅读10分钟
原文链接: jrsinclair.com

In the previous article, we suggested that indentation is an (extremely rough) indicator of complexity. Our goal is to write less complex JavaScript. We do this by choosing the right abstraction to solve a problem. But how do you know which abstraction to use? So far, we haven’t looked at any concrete examples of how to do this. In this article we look at how to deal with JavaScript arrays, without using any loops. The end result is less complex code.

“…a loop is an imperative control structure that’s hard to reuse and difficult to plug in to other operations. In addition, it implies code that’s constantly changing or mutating in response to new iterations.”

—Luis Atencio[1]

Loops

We’ve been saying that control structures like loops introduce complexity. But so far we’ve not seen any evidence of how that happens. So let’s take a look at how loops in JavaScript work.

In JavaScript we have at least four or five ways of looping. The most basic is the while-loop. But first, a little bit of setup. We’ll create an example function and array to work with.

// oodlify :: String -> String
function oodlify(s) {
    return s.replace(/[aeiou]/g, 'oodle');
}

const input = [
    'John',
    'Paul',
    'George',
    'Ringo',
];

So, we have an array, and we’d like to oodlify each entry. With a while-loop, it looks something like this:

let i = 0;
const len = input.length;
let output = [];
while (i < len) {
    let item = input[i];
    let newItem = oodlify(item);
    output.push(newItem);
    i = i + 1;
}

Note that to keep track of where we’re up to, we use a counter, i. We have to initialise this counter to zero, and increment it every time around the loop. We also have to keep comparing i to len so we know where to stop. This pattern is so common that JavaScript provides a simpler way of writing it: The for-loop. It looks something like this:

const len = input.length;
let output = [];
for (let i = 0; i < len; i = i + 1) {
    let item = input[i];
    let newItem = oodlify(item);
    output.push(newItem);
}

This is a helpful construct because it puts all that counter boilerplate together at the top. With the while-loop version it is very easy to forget to increment i and cause an infinite loop. A definite improvement. But, let’s step back a bit and look at what this code is trying to achieve. What we’re trying to do is to run oodlify() on each item in the array and push the result into a new array. We don’t really care about the counter.

This pattern of doing something with every item in an array is quite common. So, with ES2015, we now have a new loop construct that lets us forget about the counter: The for…of loop. Each time around the loop it just gives you the next item in the array. It looks like this:

let output = [];
for (let item of input) {
    let newItem = oodlify(item);
    output.push(newItem);
}

This is much cleaner. Notice that the counter and the comparison are all gone. We don’t even have to pull the item out of the array. The for…of loop does all that heavy lifting for us. If we stopped here and used for…of loops everywhere instead of for-loops, we’d be doing well. We would have removed a decent amount of complexity. But… we can go further.

Mapping

The for…of loop is much cleaner than the for-loop, but we still have a lot of setup code there. We have to initialise the output array and call push() each time around the loop. We can make our code even more concise and expressive, but to see how, let’s expand the problem a little.

What if we had two arrays to oodlify?

const fellowship = [
    'frodo',
    'sam',
    'gandalf',
    'aragorn',
    'boromir',
    'legolas',
    'gimli',
];

const band = [
    'John',
    'Paul',
    'George',
    'Ringo',
];

The obvious thing to do would be a loop for each:

let bandoodle = [];
for (let item of band) {
    let newItem = oodlify(item);
    bandoodle.push(newItem);
}

let floodleship = [];
for (let item of fellowship) {
    let newItem = oodlify(item);
    floodleship.push(newItem);
}

This works. And code that works is better than code that doesn’t. But, it’s repetitive—not very DRY. We can refactor it to reduce some of the repetition. So, we create a function:

function oodlifyArray(input) {
    let output = [];
    for (let item of input) {
        let newItem = oodlify(item);
        output.push(newItem);
    }
    return output;
}

let bandoodle = oodlifyArray(band);
let floodleship = oodlifyArray(fellowship);

This is starting to look much nicer, but what if we had another function we wanted to apply?

function izzlify(s) {
    return s.replace(/[aeiou]+/g, 'izzle');
}

Our oodlifyArray() function won’t help us now. But if we create an izzlifyArray() function we’re repeating ourselves again. Let’s do it anyway so we can see them side-by-side:

function oodlifyArray(input) {
    let output = [];
    for (let item of input) {
        let newItem = oodlify(item);
        output.push(newItem);
    }
    return output;
}

function izzlifyArray(input) {
    let output = [];
    for (let item of input) {
        let newItem = izzlify(item);
        output.push(newItem);
    }
    return output;
}

Those two functions are scarily similar. What if we could abstract out the pattern here? What we want is: Given an array and a function, map each item from the array into a new array. Do this by applying the function to each item. We call this pattern map. A map function for arrays looks like this:

function map(f, a) {
    let output = [];
    for (let item of a) {
        output.push(f(item));
    }
    return output;
}

Of course, that still doesn’t get rid of the loop entirely. If we want to do that we can write a recursive version:

function map(f, a) {
    if (a.length === 0) { return []; }
    return [f(a[0])].concat(map(f, a.slice(1)));
}

The recursive solution is quite elegant. Just two lines of code, and very little indentation. But generally, we don’t tend to use the recursive version because it has bad performance characteristics in older browsers. And in fact, we don’t have to write map ourselves at all (unless we want to). This map business is such a common pattern that JavaScript provides a built-in map method for us. Using this map method, our code now looks like this:

let bandoodle     = band.map(oodlify);
let floodleship   = fellowship.map(oodlify);
let bandizzle     = band.map(izzlify);
let fellowshizzle = fellowship.map(izzlify);

Note the lack of indenting. Note the lack of loops. Sure, there might be a loop going on somewhere, but that’s not our concern any more. This code is now both concise and expressive. It is also simple.

Why is this code simple? That may seem like a stupid question, but think about it. Is it simple because it’s short? No. Just because code is concise, doesn’t mean it lacks complexity. It is simple because we have separated concerns. We have two functions that deal with strings: oodlify and izzlify. Those functions don’t have to know anything about arrays or looping. We have another function, map that deals with arrays. But it doesn’t care what type of data is in the array, or even what you want to do with the data. It just executes whatever function we pass it. Instead of mixing everything in together, we’ve separated string processing from array processing. That is why we can call this code simple.

Reducing

Now, map is very handy, but it doesn’t cover every kind of loop we might need. It’s only useful if you want to create an array of exactly the same length as the input. But what if we wanted to add up an array of numbers? Or find the shortest string in a list? Sometimes we want to process an array and reduce it down to just one value.

Let’s consider an example. Say we have an array of hero objects:

const heroes = [
    {name: 'Hulk', strength: 90000},
    {name: 'Spider-Man', strength: 25000},
    {name: 'Hawk Eye', strength: 136},
    {name: 'Thor', strength: 100000},
    {name: 'Black Widow', strength: 136},
    {name: 'Vision', strength: 5000},
    {name: 'Scarlet Witch', strength: 60},
    {name: 'Mystique', strength: 120},
    {name: 'Namora', strength: 75000},
];

We would like to find the strongest hero. With a for…of loop, it would look something like this:

let strongest = {strength: 0};
for (let hero of heroes) {
    if (hero.strength > strongest.strength) {
        strongest = hero;
    }
}

All things considered, this code isn’t too bad. We go around the loop, keeping track of the strongest hero so far in strongest. To see the pattern though, let’s imagine we also wanted to find the combined strength of all the heroes.

let combinedStrength = 0;
for (let hero of heroes) {
    combinedStrength += hero.strength;
}

In both examples we have a working variable that we initialise before starting the loop. Then, each time around the loop we process a single item from the array and update the working variable. To make the loop pattern even clearer, we’ll factor out the inner part of the loops into functions. We’ll also rename the variables to further highlight similarities.

function greaterStrength(champion, contender) {
    return (contender.strength > champion.strength) ? contender : champion;
}

function addStrength(tally, hero) {
    return tally + hero.strength;
}

const initialStrongest = {strength: 0};
let working = initialStrongest;
for (hero of heroes) {
    working = greaterStrength(working, hero);
}
const strongest = working;

const initialCombinedStrength = 0;
working = initialCombinedStrength;
for (hero of heroes) {
    working = addStrength(working, hero);
}
const combinedStrength = working;

Written this way, the two loops look very similar. The only thing that really changes between the two is the function called and the initial value. Both reduce the array down to a single value. So we’ll create a reduce function to encapsulate this pattern.

function reduce(f, initialVal, a) {
    let working = initialVal;
    for (let item of a) {
        working = f(working, item);
    }
    return working;
}

Now, as with map, the reduce pattern is so common that JavaScript provides it as a built-in method for arrays. So we don’t need to write our own if we don’t want to. Using the built-in method, our code becomes:

const strongestHero = heroes.reduce(greaterStrength, {strength: 0});
const combinedStrength = heroes.reduce(addStrength, 0);

Now, if you’re paying close attention, you may have noticed that this code is not much shorter. Using the built-in array methods, we only save about one line. If we use our hand-written reduce function, then the code is longer. But, our aim is to reduce complexity, not write shorter code. So, have we reduced complexity? I would argue, yes. We have separated the code for looping from the code that processes individual items. The code is less intertwined. Less complex.

The reduce function might seem fairly primitive at first glance. Most examples with reduce do fairly simple things like adding numbers. But there’s nothing saying that the return value for reduce has to be a primitive type. It can be an object, or even another array. This blew my mind a little bit when I first realised it. So we can, for example, write map or filter using reduce. But I’ll leave you to try that out for yourself.

Filtering

We have map to do something with every item in an array. And we have reduce to reduce an array down to a single value. But what if we wanted to extract just some of the items in an array? To explore further, we’ll expand our hero database to include some extra data:

const heroes = [
    {name: 'Hulk', strength: 90000, sex: 'm'},
    {name: 'Spider-Man', strength: 25000, sex: 'm'},
    {name: 'Hawk Eye', strength: 136, sex: 'm'},
    {name: 'Thor', strength: 100000, sex: 'm'},
    {name: 'Black Widow', strength: 136, sex: 'f'},
    {name: 'Vision', strength: 5000, sex: 'm'},
    {name: 'Scarlet Witch', strength: 60, sex: 'f'},
    {name: 'Mystique', strength: 120, sex: 'f'},
    {name: 'Namora', strength: 75000, sex: 'f'},
];

Now, let’s say we have two problems. We want to:

  1. Find all the female heroes; and
  2. Find all the heroes with a strength greater than 500.

Using a plain-old for…of loop, we might write something like this:

let femaleHeroes = [];
for (let hero of heroes) {
    if (hero.sex === 'f') {
        femaleHeroes.push(hero);
    }
}

let superhumans = [];
for (let hero of heroes) {
    if (hero.strength >= 500) {
        superhumans.push(hero);
    }
}

All things considered, this code isn’t too bad. But we definitely have a repeated pattern. In fact, the only thing that really changes is our if-statement. So what if we factored just the if-statements into functions?

function isFemaleHero(hero) {
    return (hero.sex === 'f');
}

function isSuperhuman(hero) {
    return (hero.strength >= 500);
}

let femaleHeroes = [];
for (let hero of heroes) {
    if (isFemaleHero(hero)) {
        femaleHeroes.push(hero);
    }
}

let superhumans = [];
for (let hero of heroes) {
    if (isSuperhuman(hero)) {
        superhumans.push(hero);
    }
}

This type of function that only returns true or false is sometimes called a predicate. We use the predicate to decide whether or not to keep each item in heroes.

The way we’ve written things here makes the code longer. But now that we’ve factored out our predicate functions, the repetition becomes clearer. We can extract it out into a function.

function filter(predicate, arr) {
    let working = [];
    for (let item of arr) {
        if (predicate(item)) {
            working = working.concat(item);
        }
    }
    return working;
}

const femaleHeroes = filter(isFemaleHero, heroes);
const superhumans  = filter(isSuperhuman, heroes);

And, just like map and reduce, JavaScript provides this one for us as an Array method. So we don’t have to write our own version (unless we want to). Using array methods, our code becomes:

const femaleHeroes = heroes.filter(isFemaleHero);
const superhumans  = heroes.filter(isSuperhuman);

Why is this any better than writing the for…of loop? Well, think about how we’d use this in practice. We have a problem of the form Find all the heroes that…. Once we notice we can solve this problem using filter then our job becomes easier. All we need to do is tell filter which items to keep. We do this by writing one very small function. We forget about arrays and working variables. Instead, we write a teeny, tiny predicate function. That’s it.

And as with our other iterators, using filter conveys more information in less space. We don’t have to read through all the generic loop code to work out that we’re filtering. Instead, it’s written right there in the method call.

Finding

Filtering is very handy. But what if we wanted to find just one hero? Say we wanted to Black Widow. We could use filter to find her, like so:

function isBlackWidow(hero) {
    return (hero.name === 'Black Widow');
}

const blackWidow = heroes.filter(isBlackWidow)[0];

The trouble with this is that it’s not very efficient. The filter method looks at every single item in the array. But we know that there’s only one Black Widow and we can stop looking after we’ve found her. But having this approach of using a predicate function is neat. So let’s write a find function that will return the first item that matches:

function find(predicate, arr) {
    for (let item of arr) {
        if (predicate(item)) {
            return item;
        }
    }
}

const blackWidow = find(isBlackWidow, heroes);

And again, JavaScript provides this one for us, so we don’t have to write it ourselves:

const blackWidow = heroes.find(isBlackWidow);

Once again, we end up expressing more information in less space. By using find our problem of finding a particular entry boils down to just one question: How do we know if we’ve found the thing we want? We don’t have to worry about the details of how the iteration is happening.

Summary

These iteration functions are a great example of why (well-chosen) abstractions are so useful and elegant. Let’s assume we’re using the built-in array methods for everything. In each case we’ve done three things:

  1. Eliminated the loop control structure, so the code is more concise and (arguably) easier to read;
  2. Described the pattern we’re using by using the appropriate method name. That is, map, reduce, filter, or find.
  3. Reduced the problem from processing the whole array to just specifying what we want to do with each item.

Notice that in each case, we’ve broken the problem down into solutions that use small, pure functions. What’s really mind-blowing though, is that with just these four patterns (though there are others, and I encourage you to learn them), you can eliminate nearly all loops in your JS code. This is because almost every loop we write in JS is processing an array, or building an array, or both. And when we eliminate the loops we (almost always) reduce complexity and produce more maintainable code.


Update upon the 23rd of February 2017

A few people have pointed out that it feels inefficient to loop over the hero list twice in the reduce and filter examples. Using the ES2015 spread operator makes combining the two reducer functions into one quite neat. Here’s how I would refactor to iterate only once over the array:

function processStrength({strongestHero, combinedStrength}, hero) {
    return {
        strongestHero: greaterStrength(strongestHero, hero),
        combinedStrength: addStrength(combinedStrength, hero),
    };
}
const {strongestHero, combinedStrength} = heroes.reduce(processStrength, {strongestHero: {strength: 0}, combinedStrength: 0});

It’s a little bit more complicated than the version where we iterate twice, but it may make a big difference if the array is enormous. Either way, the order is still O(n).


  1. Atencio, Luis. 2016, Functional Programming in JavaScript. Manning Publications. iBooks.  ↩


本文对你有帮助?欢迎扫码加入前端学习小组微信群: