
Monday, March 04, 2013

Breaking Array Extras

Every now and then somebody comes up with this question:
How do I stop an iteration such as forEach()?
Well, there are at least a couple of ways to do that ...

Dropping The Length

The first tacky way to stop invoking the callback is to drop the length of the array. Every extra receives the initial ArrayLike object as its third parameter, so a function like this should be safe enough:
function forEachAndBreak(value, i, arr) {
  if (someCondition(value)) {
    arr.length = 0;
  }
}

someArray.forEach(forEachAndBreak);
Bear in mind, this will affect the initial array too so if this is not desired, a copy is needed:
someArray.slice().forEach(forEachAndBreak);
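As a quick sanity check of the difference, here is a minimal sketch. The names stopAtThree, seen, original, and emptied are made up for this example and are not part of any library:

```javascript
// Hypothetical callback: records each visited value and drops the length at 3.
var seen = [];
function stopAtThree(value, i, arr) {
  seen.push(value);
  if (value === 3) {
    arr.length = 0; // empties whichever array is being iterated
  }
}

// Iterating over a copy leaves the original array untouched.
var original = [1, 2, 3, 4, 5];
original.slice().forEach(stopAtThree);
var seenThroughCopy = seen.slice();

// Iterating over the array itself empties it.
seen = [];
var emptied = [1, 2, 3, 4, 5];
emptied.forEach(stopAtThree);
```

In both runs the callback is invoked only for 1, 2 and 3, but only the second run leaves its source array empty.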

Cons: Still Looping

Even if this trick works as expected, there is something we don't see behind the scenes: the loop is still going on. If you have ever written an array polyfill, you might have heard that the for loop should verify that i in arr is true before invoking the function or handling the value at that index, since the Array might be a sparse one where some indexes are missing. The same happens with native arrays. Try this and you might be stuck for a while: Array(0xFFFFFF).forEach(alert). It does not matter that alert will never be called: the engine is still looping through the whole length and verifying each index.
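To make the hidden loop visible, here is a simplified sketch of what a forEach-like loop does. It is not a complete polyfill, and forEachSketch, its returned step count, and the sample data are assumptions made only for illustration:

```javascript
// Simplified, hypothetical sketch of a forEach-like loop (not a full polyfill).
function forEachSketch(arr, fn, self) {
  var length = arr.length >>> 0; // the length is read once, before looping
  var steps = 0;
  for (var i = 0; i < length; i++) {
    steps++;
    if (i in arr) { // skip missing indexes, as native extras do
      fn.call(self, arr[i], i, arr);
    }
  }
  return steps; // returned only to show how many iterations happened
}

var calls = 0;
var list = [1, 2, 3, 4, 5];
var steps = forEachSketch(list, function (value, i, arr) {
  calls++;
  if (value === 2) {
    arr.length = 0; // the "break" trick
  }
});
```

Here the callback runs twice, yet the loop still performs all five steps, because the length was captured before the array was emptied.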

Using Some

This is the most common way to prevent the problem.
[1,2,3].some(function (value, i, arr) {
  alert(i);
  if (value === 2) {
    return true;
  }
});
The above will alert only 0 and 1, and the loop will be terminated as soon as true is returned. To quickly test this, let's use that horrible, gigantic Array again ...
var a = Array(0xFFFFFF);
a[1] = 2;
a.some(function(v){if(v === 2) return true});
You'll notice that this time the return is immediate: there's no reason to keep going after the first result.
In a few words, Array#some() is way better than forEach() in all those situations where we would like to break the loop at any time: we just return true when we want to stop, and there is no need to return anything in all other cases.

Array#every() Is NOT The Opposite Same

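The thing that might confuse you about every() is that you always need to return something, while this is not the case with some(): every() stops as soon as the callback returns a falsy value, and a callback that returns nothing returns undefined. With some(), as we have seen, we return only when/if we want to break.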
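A small side-by-side sketch of the difference, with made-up variable names used only to record which values each callback visits:

```javascript
// some(): return true only when we want to break.
var visitedBySome = [];
[1, 2, 3].some(function (value) {
  visitedBySome.push(value);
  if (value === 2) {
    return true;
  }
});

// every(): we must return a truthy value each time we want to continue.
var visitedByEvery = [];
[1, 2, 3].every(function (value) {
  visitedByEvery.push(value);
  return value !== 2; // false at 2, which breaks the loop
});

// every() with no return at all: undefined is falsy, so it stops right away.
var visitedByForgetfulEvery = [];
[1, 2, 3].every(function (value) {
  visitedByForgetfulEvery.push(value);
});
```

The first two loops visit 1 and 2, while the last one visits only 1.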

Finding The Index Or The Value: The Outer Scope Way

Another common pattern is to use this approach to actually find the index, something Array#indexOf() cannot achieve when the condition is more complicated than just a simple === comparison. Using an external variable can help here:
var index;
if (array.some(function (value, i) {
  if (complicatedCheckAgainst(value)) {
    index = i;
    return true;
  }
})) {
  doComplicatedStuffWith(array[index]);
}
The situation is analogous for the value, so that we can directly retrieve what we are looking for if needed. However, this is a less common pattern, since with the index we might splice or do more operations than just getting the value.
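The value flavour of the same pattern could look like this sketch, where users, isAdult, and the visited counter are hypothetical and exist only for the example:

```javascript
// Hypothetical data and check, used only for illustration.
var users = [
  {name: 'Ann', age: 15},
  {name: 'Bob', age: 32},
  {name: 'Cid', age: 47}
];
function isAdult(user) {
  return user.age >= 18;
}

var found; // stays undefined if nothing matches
var visited = 0;
var hasMatch = users.some(function (value) {
  visited++;
  if (isAdult(value)) {
    found = value; // keep the value itself instead of its index
    return true;
  }
});
```

The loop stops at the second item, leaving found pointing at the Bob object and hasMatch set to true.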

Finding The Index: The RegExp Way

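This is quite tacky, but fast enough and suitable when the extra argument of some(), the context, is not otherwise used. The regular expression is passed as the context, so this.test(i) always matches the numeric index and leaves it in the static RegExp['$&'] property: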
var index = array.some(function (value, i) {
  if (complicatedCheckAgainst(value)) {
    return this.test(i);
  }
}, /\d+/) && +RegExp['$&'];
if (index !== false) {
  doComplicatedStuffWith(array[index]);
}
The above pattern can be handy for inline operations:
[].some.call(body.childNodes,flagIt,reNum) !== false &&
(body.childNodes[RegExp['$&']].flagged = true);
Whether it makes sense or not, we can reuse that function and that reNum in different situations and inline, without needing to create an outer variable. The latter point is indeed the main advantage: reusability without knowing the outer scope. This could be achieved by creating something similar via a closure, but that would probably be boring...
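For comparison, the closure-based alternative might look like the following sketch. The createIndexFinder helper and its names are assumptions made for this example, not something the original pattern defines:

```javascript
// Hypothetical helper: wraps a check in a reusable finder that keeps
// its own private index instead of relying on an outer variable.
function createIndexFinder(check) {
  return function indexIn(list) {
    var found = -1; // private to each call
    [].some.call(list, function (value, i) {
      if (check(value)) {
        found = i;
        return true; // break at the first match
      }
    });
    return found;
  };
}

var firstNegative = createIndexFinder(function (n) {
  return n < 0;
});

var indexInMixed = firstNegative([3, 1, -4, -1]);
var indexInPositive = firstNegative([1, 2]);
```

The finder can be reused against any ArrayLike object, at the price of one extra function per check.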

Array#findIndex

It looks like TC39 will talk about a method like this too, so here is what I believe would be a draft candidate:
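(function(AP){
// install only if the method is not already available
AP.findIndex || (
  AP.findIndex = function(fn, self) {
    var $i = -1; // -1 means "not found", as with indexOf
    AP.some.call(this, function(v, i, a) {
      if (fn.call(this, v, i, a)) {
        $i = i;
        return true; // stop at the first match
      }
    }, self);
    return $i;
  }
);
}(Array.prototype));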
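A quick usage sketch, assuming that either the draft above or a native Array#findIndex (the method was later standardized in ES2015) is available. The sample array, the limit object, and the checks are made up for illustration:

```javascript
var numbers = [5, 12, 8, 130, 44];

// index of the first value greater than 10
var firstBig = numbers.findIndex(function (v) {
  return v > 10;
});

// no match: the result is -1, as with indexOf
var none = numbers.findIndex(function (v) {
  return v > 1000;
});

// the optional second argument is used as the context of the callback
var limit = {max: 100};
var overLimit = numbers.findIndex(function (v) {
  return v > this.max;
}, limit);
```

With this data, firstBig is 1, none is -1 and overLimit is 3.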
Enjoy!

3 comments:

Aadaam said...

As much as I like the forEach syntax and prefer this ML-style syntax over a classical for cycle, I guess it's worth noting that in the latest jsperf benchmarks I've seen, on all the engines (including, and especially, V8), a for cycle outperformed forEach by a factor of 2 at least.

http://jsperf.com/for-vs-foreach/68

Which brings the use of some vs forEach to a rather theoretical level, unfortunately. :(


Andrea Giammarchi said...

Absolutely, I should have mentioned the performance impact, Adam. You are right, and thanks for reminding me and the readers about this!

Marcus Pope said...

Just another reason why I think the array iteration decorators were a terrible inclusion into the JS core namespace. VerbotenJS has had an Object.proto.ea(func, callback) iterator since es3 that makes each of the new Array.xxx iterators in es5 pointless. And it allows for breaking at any point in the iteration process.

It's not the fastest iteration implementation out there, particularly over native, but then again even considering the 2x perf loss with for-vs-foreach, that stat means nothing unless you profile how much time your application is spending in for loops. If my application spends only 0.4% of its cpu time processing for loops, then I'm not worried about a 2x or even a 10x loss. And remember, it's what's inside the for loop that really matters, not the time it takes to get to the next index.

I value a unified and flexible interface, that works cross browser / cross server. And .some / .every / .map / .forEach / .reduce / .reduce provide none of that. Only with time can they solve the compatibility constraint.