Combiners are there to save network bandwidth.
The map output is sorted directly:
sorter.sort(MapOutputBuffer.this, kvstart, endPosition, reporter);
This happens right after the actual mapping is done. While iterating through the buffer, it checks whether a combiner has been set; if so, it combines the records before they are written out. If not, it spills them directly to disk.
The important parts are in MapTask, if you'd like to see them for yourself.
    sorter.sort(MapOutputBuffer.this, kvstart, endPosition, reporter);
    // some fields
    for (int i = 0; i < partitions; ++i) {
        // check if configured
        if (combinerRunner == null) {
            // spill directly
        } else {
            combinerRunner.combine(kvIter, combineCollector);
        }
    }
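To make the effect of that branch concrete, here is a minimal, self-contained sketch of the same idea in plain Java. It is not Hadoop code: the class `SpillSketch` and the methods `kv` and `sortAndSpill` are hypothetical names made up for illustration, partitioning is left out, and the combiner is assumed to be a simple sum over integer values (as in a word count).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class SpillSketch {

    // Small helper to build a key/value record.
    public static Map.Entry<String, Integer> kv(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Sorts the buffered map output by key, then produces the records that
    // would be spilled. With a combiner configured, adjacent records with the
    // same key are summed first, so fewer records reach disk and the network.
    public static List<Map.Entry<String, Integer>> sortAndSpill(
            List<Map.Entry<String, Integer>> buffer, boolean combinerConfigured) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(buffer);
        sorted.sort(Map.Entry.comparingByKey());

        if (!combinerConfigured) {
            return sorted; // spill directly
        }

        List<Map.Entry<String, Integer>> spill = new ArrayList<>();
        for (Map.Entry<String, Integer> rec : sorted) {
            int last = spill.size() - 1;
            if (last >= 0 && spill.get(last).getKey().equals(rec.getKey())) {
                // Same key as the previous record: fold the value into it.
                int sum = spill.get(last).getValue() + rec.getValue();
                spill.set(last, kv(rec.getKey(), sum));
            } else {
                spill.add(kv(rec.getKey(), rec.getValue()));
            }
        }
        return spill;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> buffer = Arrays.asList(
                kv("b", 1), kv("a", 1), kv("b", 1), kv("a", 1), kv("c", 1));

        System.out.println(sortAndSpill(buffer, false).size()); // 5
        System.out.println(sortAndSpill(buffer, true));         // [a=2, b=2, c=1]
    }
}
```

Because the buffer is already sorted by key, equal keys sit next to each other and one pass is enough to combine them. That is why the combine step fits naturally right after the sort.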
This is the right stage to save disk space and network bandwidth, because it is very likely that the output has to be transferred.
During the merge/shuffle/sort phase it is not beneficial, because by then you have to crunch more data than if the combiner had run when the map finished.
Note that the sort phase shown in the web interface is misleading: it is just pure merging.