I am using dask on LSF cluster to process some images in parallel. The processing function itself uses joblib to perform multiple computations on the image in parallel.
It seems that setting n_workers and cores parameters to some numbers will generally produce n_workers * cores futures running at the same time. I would like to have n_workers futures being processesed at a time, each of them having cores cores at disposal for the purpose of using them with joblib.
How do I achieve such result?