Hi, we recently upgraded from MR1 to YARN. I know that a container is an abstract notion, but I don't understand how many JVM tasks (map, reduce, filter, etc.) one container can spawn. Put another way: is a container reusable across multiple map or reduce tasks? I read the following in the blog post "What is a container in YARN?":
"each mapper and reducer runs on its own container to be accurate!" Does that mean that if I look at the AM logs, I should see the number of allocated containers equal to the number of map tasks (failed or successful) plus the number of reduce tasks?
I know the number of containers changes during the application life cycle, based on AM requests, splits, the scheduler, etc.
But is there a way to request an initial minimum number of containers for a given application? I think one way is to configure a fair-scheduler queue, but is there anything else that can dictate this?
In the case of MR, suppose I have mapreduce.map.memory.mb = 3072 (3 GB) and
mapreduce.map.cpu.vcores = 4. I also have yarn.scheduler.minimum-allocation-mb = 1024 and yarn.scheduler.minimum-allocation-vcores = 1.
Does that mean I will get one container with 4 cores, or 4 containers with one core each?
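To make the question concrete, here is how I currently picture the arithmetic, as a rough Python sketch. It assumes each map task asks for a single container and that the scheduler rounds each request up to a multiple of the minimum allocation. The rounding rule and the helper name `normalize` are my own assumptions for illustration, not something I have confirmed.

```python
import math

def normalize(requested, minimum):
    """Round a request up to the nearest multiple of the scheduler minimum.

    Assumption: the scheduler uses the minimum allocation as its rounding
    step; some schedulers may use a separate increment setting instead.
    """
    return max(minimum, math.ceil(requested / minimum) * minimum)

# Values from the question (memory in MB).
map_memory_mb = 3072      # mapreduce.map.memory.mb
map_vcores = 4            # mapreduce.map.cpu.vcores
min_alloc_mb = 1024       # yarn.scheduler.minimum-allocation-mb
min_alloc_vcores = 1      # yarn.scheduler.minimum-allocation-vcores

# Under this reading, one map task corresponds to one container of this size.
container = {
    "memory_mb": normalize(map_memory_mb, min_alloc_mb),
    "vcores": normalize(map_vcores, min_alloc_vcores),
}
print(container)
```

If this reading is right, each map task would get a single container with 3072 MB and 4 vcores, rather than 4 single-core containers, and the minimum-allocation settings would only act as a floor and rounding step.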
Also, it's not clear where mapreduce.map.memory.mb and mapreduce.map.cpu.vcores should be specified. Should they be set on the client node, or can they be set per application as well?
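By "per application" I mean something like passing the properties as `-D` overrides when submitting a single job. The Python sketch below only assembles such a command line. The jar name, driver class, and paths are hypothetical, and it assumes the driver parses generic options (for example through ToolRunner) so that `-D` values are picked up.

```python
import shlex

def build_job_command(jar, main_class, overrides, job_args):
    """Assemble a `hadoop jar` command with per-job -D property overrides."""
    cmd = ["hadoop", "jar", jar, main_class]
    for key, value in overrides.items():
        # Generic options go before the job's own arguments.
        cmd += ["-D", f"{key}={value}"]
    return cmd + list(job_args)

cmd = build_job_command(
    "my-job.jar",              # hypothetical jar
    "com.example.MyDriver",    # hypothetical driver class
    {"mapreduce.map.memory.mb": 3072, "mapreduce.map.cpu.vcores": 4},
    ["/input", "/output"],     # hypothetical HDFS paths
)
print(shlex.join(cmd))
```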
Also, is there a way to see the containers currently assigned to a given application from the RM UI or the AM UI?

