
Wednesday, November 30, 2011

Array extras and Objects

When Array extras landed in JavaScript 1.6 I had, probably together with other developers, one of those HOORRAYYY moments ...
What many libraries and frameworks out there still implement is a sort of universal each method that is supposed to be compatible with both Arrays and Objects.

A Bit Messed Up

What I have never liked that much about these each methods is that we have to know in advance, in any case, whether the object we are passing is an Array, an ArrayLike one, or an Object.
In the latter case, the callback passed as second argument will receive the key, and not the index, as its own second argument. This simply means we cannot trust a generic callback unless it checks the type of the second argument for each iterated item, or unless we don't care about the second argument at all.
In any case I have always found this a bad design. If we think about events, as an example, it's totally natural to expect a single argument, the event object, and then act accordingly.
This lets us reuse callbacks for similar purposes and keep the code DRY.

Need For An Object#forEach

All implementations of each have some natural confusion inside the method. As far as I know the only exception is jQuery, which makes things even more complicated since we generally have to completely ignore the first argument in that case.
If you take the underscore.js library, as an example, you will note that there are two names for the each method, each itself and forEach, so it's more than clear to me that JS developers are missing an Array#forEach-like method to iterate over objects rather than lists.
It must also be underlined that all these methods are somehow error prone: what if the object we are passing has a length property that does not necessarily represent the number of items stored by index, as it would in an Array?
You may consider this an edge case, or an anti-pattern, but then you have to remember that functions in JavaScript are first class objects.
Probably all these methods will nicely fail with functions passed as objects, whenever you decide that your function can be used as an object too.

var whyNot = function (obj) {
  /* marvelous stuff here */
  this.calls++;
  return this.doStuff(obj);
};
whyNot.calls = 0;
whyNot.doStuff = function (obj) {
  /* kick-ass method */
};

// the unexpected but allowed
whyNot = whyNot.bind(whyNot);

whyNot.length; // 1
whyNot[0]; // undefined

By design, the length of any function in JavaScript is read-only and means nothing in terms of Array iteration: it is simply the number of arguments the function declared in its declaration or function expression.
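
To make the length problem concrete, here is a minimal sketch of how a length-sniffing each could go wrong. naiveEach is a hypothetical helper written only for this illustration; it is not the implementation of any specific library:

```javascript
// Hypothetical helper, written for illustration only: it guesses
// "array-like" from a numeric length, as a generic each might do.
function naiveEach(obj, callback) {
  var key;
  if (typeof obj.length === "number") {
    // treated as a list: callback receives (value, index)
    for (key = 0; key < obj.length; key++) {
      callback(obj[key], key);
    }
  } else {
    // treated as an object: callback receives (value, key)
    for (key in obj) {
      if (Object.prototype.hasOwnProperty.call(obj, key)) {
        callback(obj[key], key);
      }
    }
  }
}

// a function used as an object, with one declared argument
var counter = function (obj) {};
counter.calls = 0;

var visited = [];
naiveEach(counter, function (value, key) {
  visited.push(key);
});

// counter.length is 1, so only index 0 is visited
// and the real property "calls" is never reached
console.log(visited);
```

The function is treated as a list of one item, so the callback never sees the calls property it was probably meant to iterate.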

WTF

Whether the above example makes sense or not, I am all for exploring patterns, and when a common method is not compatible with all scenarios, I simply think something went wrong or is missing in the language.
Thank gosh JS is freaking flexible, and with ES5 we can define some prototype methods without affecting for( in ) loops, hopefully simplifying our daily stuff.
Remember? With underscore or others we still have to know in advance if the passed object is an Array, an ArrayLike, or a generic object ... so what would stop us from simply choosing accordingly?

// Array or ArrayLike
[].forEach.call(genericArrayLike, callbackForArrays);

// generic object to iterate
// (parentheses needed: a statement starting with {} is parsed as a block)
({}).forEach.call(object, callbackForObjects);

An explicit choice, as in the case above, is the fastest and most reliable way we have to do things properly. A DOM collection, as well as any array or arrayLike object, will use the native forEach, but we can still recycle callbacks designed to deal with value, key and object, rather than value, index and array. This is the little experiment:

Object extras

The signature of each callback is exactly the same as the original, native Array callbacks. Everything is based on native functions available in all ES5 compatible desktop browsers and basically all mobile ones, and is easy to shim for browsers too old to deal with JS 1.6 or higher.
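
The gist itself is not reproduced here. What follows is only a minimal sketch of how such methods could be defined, assuming Object.keys and Object.defineProperty are available. The define helper and every method body are assumptions made for illustration, not the original code:

```javascript
// Sketch only, NOT the original gist: helper and method bodies are assumptions.
(function (ObjectProto) {
  // define a non-enumerable method so for/in loops are not affected
  function define(name, method) {
    Object.defineProperty(ObjectProto, name, {
      configurable: true,
      writable: true,
      enumerable: false,
      value: method
    });
  }

  define("forEach", function (callback, context) {
    var self = this;
    Object.keys(self).forEach(function (key) {
      callback.call(context, self[key], key, self);
    });
  });

  define("every", function (callback, context) {
    var self = this;
    return Object.keys(self).every(function (key) {
      return callback.call(context, self[key], key, self);
    });
  });

  define("some", function (callback, context) {
    var self = this;
    return Object.keys(self).some(function (key) {
      return callback.call(context, self[key], key, self);
    });
  });

  define("filter", function (callback, context) {
    var self = this, result = {};
    Object.keys(self).forEach(function (key) {
      if (callback.call(context, self[key], key, self)) {
        result[key] = self[key];
      }
    });
    return result; // a new object, the original is preserved
  });

  define("map", function (callback, context) {
    var self = this, result = {};
    Object.keys(self).forEach(function (key) {
      result[key] = callback.call(context, self[key], key, self);
    });
    return result; // a new object, the original is preserved
  });
}(Object.prototype));
```

Arrays keep using their own methods, since Array.prototype shadows anything defined on Object.prototype.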



Here are a couple of examples:

var o = {a:"a", b:"b", c:""};

// know if all values are strings
o.every(function (value, key, object) {
  return typeof value == "string";
}); // true

// filter by content, no empty strings
var filtered = o.filter(function (value, key, object) {
  return value.length;
}); // {a:"a",b:"b"} // original object preserved

// loop through all values (plus checks)
o.forEach(function (value, key, object) {
  object === o; // true
  this === o; // true
  if (key.charAt(0) != "_") {
    doSomethingWithThisValue(value);
  }
}, o); // NOTE: all these methods respect Array extras signatures

// map a new object
var mapped = o.map(function (value, key, object) {
  return value + 1;
}); // {a:"a1",b:"b1",c:"1"} // original object preserved

// know if at least one value is "a"
o.some(function (value, key, object) {
  return value === "a";
}); // true

The reason reduce and reduceRight are not in the list is simple: which key would be the one to preserve, the first of the list? There is no such thing as a "predefined for/in order" in JavaScript, plus these methods are more Array related, so they are out of this experiment.

As Summary

Once minified and minzipped the gist weighs about 296 bytes, which is a ridiculous size compared with any application we deal with on a daily basis.
Especially forEach, but probably the others too, may become extremely handy and ... of course, using the Object.keys method internally, this is gonna be compatible with Arrays too, but hey, the whole point was to make a clear distinction ;)
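
About that last point, a quick sketch using nothing but the native Object.keys: with an Array the keys are the indexes, but as strings, plus any extra enumerable property, which is one more reason to keep the distinction clear. The variable names are just for illustration:

```javascript
var list = ["x", "y"];
list.extra = "z"; // an own enumerable property, not an index

// Object.keys returns every own enumerable key, as strings
var keys = Object.keys(list);

// Array#forEach visits indexes only, as numbers
var indexes = [];
list.forEach(function (value, index) {
  indexes.push(index);
});

console.log(keys);    // ["0", "1", "extra"]
console.log(indexes); // [0, 1]
```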


[edited]

The Misleading Signature

I don't know how many times I have spoken with jQuery developers, just because they are common, who were convinced that native Array#forEach accepts the value as second argument.
I have always considered inverted signatures, whatever the API is, bad for both performance, since there is no possibility to fall back to a native method, and the learning curve, since newcomers learn that a generic each method must have the index as first argument.
Bear in mind that whenever we loop we are most likely interested in the value at that index or key, so this value should be the first, and if needed the only, argument passed through the procedure.
A completely ignored first argument is, once again and in my opinion, a bad design for an API: we are stuck without native power, and we teach that the order of arguments is not relevant.
Well, especially the latter point would be true if we had named arguments, but in JS nothing has been planned so far, and in ES6 the way we are gonna name arguments is still under discussion.
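
The difference is easy to show with a small sketch. indexFirstEach below is a hypothetical stand-in for any each method with an inverted, index first signature; it is not the code of jQuery or any other library:

```javascript
// Hypothetical helper mimicking an inverted signature: callback(index, value)
function indexFirstEach(list, callback) {
  for (var i = 0; i < list.length; i++) {
    callback(i, list[i]);
  }
}

var seen = [];

// a callback that only cares about its first argument
function collect(value) {
  seen.push(value);
}

["a", "b"].forEach(collect);         // native, value first: collects the values
indexFirstEach(["a", "b"], collect); // index first: collects the indexes instead

console.log(seen); // ["a", "b", 0, 1]
```

The same callback cannot be recycled across the two signatures without silently changing what it receives.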

Have fun with JS

6 comments:

QuesoLoco said...

Isn't this just pushing duck typing a bit further, by extrapolating that the keys of an array are the numerical keys only?

(Also there's a grey spot in your idea: Array.forEach will always trigger on numerical keys only, even when you declare properties on it)

Andrea Giammarchi said...

Object#forEach is for objects, Array#forEach is for arrays and is index based: same signature, easier to remember, no duck typing. Duck typing is what all the each methods out there are doing now. This is about consistent signatures, with already known methods borrowed for objects, not for arrays, so they are easier to remember.

It would indeed be a mistake to use Object#forEach for arrays but, being in the prototype, you would just do obj.forEach(callback), except you can recycle callbacks.

Filter is meaningful, as well as some and every. Which part do you consider duck typing?

Anonymous said...

Andrea, it's great that you wrote about this. I think we should have some iterators for objects; on my side I already use them heavily.

However, when I look at your proposal I miss one thing. The problem with Objects is that, unlike Arrays, they're *unordered* collections, and Array iterators are always about iteration in a given order. So in my opinion well designed iterators for Objects should also accept an optional compareFn function which indicates the order in which we want to iterate properties. Of course use cases for that are minor, but they exist and are valid.

On my side I use the following iterator:
https://github.com/medikoo/es5-ext/blob/master/lib/Object/prototype/_iterate.js
Which I use for different methods e.g. forEach:
https://github.com/medikoo/es5-ext/blob/master/lib/Object/prototype/for-each.js

Andrea Giammarchi said...

I don't see any valid use case for ordered properties, no matter which programming language it is: these are properties, not indexes ... so I am sorry, I don't agree about ordered iteration.

Just think that when you use Object.keys(obj) the order is already NOT guaranteed.

If you need ordered iteration you need a relation between an object and a stack of properties you can control, still I think there is something wrong if the order of properties matters.

Anonymous said...

Andrea,

It's not about ordered properties, as then we would be talking about an Array, but about cases in which you may want to iterate properties in a specific order.

With Object.keys I can easily do: Object.keys(obj).sort(compareFn) and I have what I need. Can I do the same with your Object#forEach ? ;-)

Anyway, I'm not 100% convinced that it's indeed mandatory either. I had some use cases lately in my application, e.g.
I have an object which defines a schema for a namespace. It's a key-value map, and each value has an order property. When I render a form I want to iterate them by order.
So that's where I needed this. However, in other areas of the same application I have configured ordered lists (arrays) for unordered collections and I work on those, so it might be good to reuse a similar solution for the above case.

Andrea Giammarchi said...

Sorry but I think you should reconsider your approach then ... summary:
order === Array
Object keys ... I would not order them.

You can map through array and still store in the object, this is the way I would go in any case.

However, let's see what the ES ml says about it :-)