For example, I currently have a DataProc cluster consisting of a master and 4 workers, each machine with 8 vCPUs and 30 GB of memory.
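For reference, the cluster is created with something along these lines (the cluster name, region, and machine type are stand-ins here; n1-standard-8 matches the 8 vCPU / 30 GB machines described above):

```
# Hypothetical creation command; cluster name and region are placeholders.
gcloud dataproc clusters create my-ephemeral-cluster \
    --region us-central1 \
    --master-machine-type n1-standard-8 \
    --worker-machine-type n1-standard-8 \
    --num-workers 4
```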
Whenever I submit a job to the cluster, it commits a maximum of 11 GB of memory in total, engages only 2 of the worker nodes, and uses only 2 vCPUs on each of those nodes. As a result, a job that should take only a few minutes takes nearly an hour to execute.
I have tried editing the spark-defaults.conf file on the master node, and I have tried passing --executor-cores 4 --executor-memory 20g --num-executors 4 to my spark-submit command, but neither has had any effect.
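Concretely, the submit command looks something like the following; the job file name (job.py) is a placeholder, and I am assuming a PySpark job here:

```
# Run from the master node; job.py stands in for the actual job file.
spark-submit \
    --executor-cores 4 \
    --executor-memory 20g \
    --num-executors 4 \
    job.py
```

The spark-defaults.conf edits set what I understand to be the equivalent properties, roughly:

```
# /etc/spark/conf/spark-defaults.conf on the master node
# (path and exact values are approximate)
spark.executor.cores      4
spark.executor.memory     20g
spark.executor.instances  4
```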
These clusters will only be spun up to perform a single task and will then be torn down, so the resources do not need to be held for any other jobs.