I really like chaining Array.prototype.map, filter and reduce to define a data transformation. Unfortunately, in a recent project that involved large log files, I could no longer get away with looping through my data multiple times...
My goal:
I want to create a function that chains .filter and .map methods by, instead of mapping over an array immediately, composing a function that loops over the data once. I.e.:
const DataTransformation = () => ({
map: fn => (/* ... */),
filter: fn => (/* ... */),
run: arr => (/* ... */)
});
const someTransformation = DataTransformation()
.map(x => x + 1)
.filter(x => x > 3)
.map(x => x / 2);
// returns [ 2, 2.5 ] without creating [ 2, 3, 4, 5] and [4, 5] in between
const myData = someTransformation.run([ 1, 2, 3, 4]);
My attempt:
Inspired by this answer and this blogpost I started writing a Transduce function.
const filterer = pred => reducer => (acc, x) =>
pred(x) ? reducer(acc, x) : acc;
const mapper = map => reducer => (acc, x) =>
reducer(acc, map(x));
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
map: map => Transduce(mapper(map)(reducer)),
filter: pred => Transduce(filterer(pred)(reducer)),
run: arr => arr.reduce(reducer, [])
});
The problem:
The problem with the Transduce snippet above, is that it runs “backwards”... The last method I chain is the first to be executed:
const someTransformation = Transduce()
.map(x => x + 1)
.filter(x => x > 3)
.map(x => x / 2);
// Instead of [ 2, 2.5 ] this returns []
// starts with (x / 2) -> [0.5, 1, 1.5, 2]
// then filters (x < 3) -> []
const myData = someTransformation.run([ 1, 2, 3, 4]);
Or, in more abstract terms:
Go from:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, f(g(x)))To:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, g(f(x)))Which is similar to:
mapper(f) (mapper(g) (concat))
I think I understand why it happens, but I can't figure out how to fix it without changing the “interface” of my function.
The question:
How can I make my Transduce method chain filter and map operations in the correct order?
Notes:
- I'm only just learning about the naming of some of the things I'm trying to do. Please let me know if I've incorrectly used the
Transduceterm or if there are better ways to describe the problem. - I'm aware I can do the same using a nested
forloop:
const push = (acc, x) => (acc.push(x), acc);
const ActionChain = (actions = []) => {
const run = arr =>
arr.reduce((acc, x) => {
for (let i = 0, action; i < actions.length; i += 1) {
action = actions[i];
if (action.type === "FILTER") {
if (action.fn(x)) {
continue;
}
return acc;
} else if (action.type === "MAP") {
x = action.fn(x);
}
}
acc.push(x);
return acc;
}, []);
const addAction = type => fn =>
ActionChain(push(actions, { type, fn }));
return {
map: addAction("MAP"),
filter: addAction("FILTER"),
run
};
};
// Compare to regular chain to check if
// there's a performance gain
// Admittedly, in this example, it's quite small...
const naiveApproach = {
run: arr =>
arr
.map(x => x + 3)
.filter(x => x % 3 === 0)
.map(x => x / 3)
.filter(x => x < 40)
};
const actionChain = ActionChain()
.map(x => x + 3)
.filter(x => x % 3 === 0)
.map(x => x / 3)
.filter(x => x < 40)
const testData = Array.from(Array(100000), (x, i) => i);
console.time("naive");
const result1 = naiveApproach.run(testData);
console.timeEnd("naive");
console.time("chain");
const result2 = actionChain.run(testData);
console.timeEnd("chain");
console.log("equal:", JSON.stringify(result1) === JSON.stringify(result2));
- Here's my attempt in a stack snippet:
const filterer = pred => reducer => (acc, x) =>
pred(x) ? reducer(acc, x) : acc;
const mapper = map => reducer => (acc, x) => reducer(acc, map(x));
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
map: map => Transduce(mapper(map)(reducer)),
filter: pred => Transduce(filterer(pred)(reducer)),
run: arr => arr.reduce(reducer, [])
});
const sameDataTransformation = Transduce()
.map(x => x + 5)
.filter(x => x % 2 === 0)
.map(x => x / 2)
.filter(x => x < 4);
// It's backwards:
// [-1, 0, 1, 2, 3]
// [-0.5, 0, 0.5, 1, 1.5]
// [0]
// [5]
console.log(sameDataTransformation.run([-1, 0, 1, 2, 3, 4, 5]));