I am trying to parallelize a piece of code using the multiprocessing library, the outline of which is as follows:
import multiprocessing as mp
import time

def stats(a, b, c, d, queue):
    t = time.time()
    x = stuff  # placeholder for the actual calculation
    print(f'runs in {time.time() - t} seconds')
    queue.put(x)

def stats_wrapper(a, b, c, d):
    q = mp.Queue()
    # b here is a dictionary; one process per key:value pair
    processes = [mp.Process(target=stats, args=(a, {b1: b[b1]}, c, d, q))
                 for b1 in b]
    for p in processes:
        p.start()
    results = []
    for p in processes:
        results.append(q.get())
    return results
The idea here is to parallelize the calculations in stuff by creating as many processes as there are key:value pairs in b, which can be up to 50. The speedup I'm getting is only about 20% compared to the serial implementation.
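For reference, the serial implementation I'm comparing against is essentially a plain loop over the keys, something like this (simplified, with the same placeholder calculation):

def stats_wrapper_serial(a, b, c, d):
    # simplified serial baseline: same calculation, one key:value pair at a time
    results = []
    for b1 in b:
        t = time.time()
        x = stuff  # same placeholder calculation as in stats()
        print(f'runs in {time.time() - t} seconds')
        results.append(x)
    return results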
I tried rewriting the stats_wrapper function with mp.Pool, but couldn't get the apply_async function to work with different args the way I have created them in stats_wrapper above.
Questions :-
- How can I improve the stats_wrapper function to bring down the runtime further?
- Is there a way to rewrite stats_wrapper using mp.Pool so that the stats function can receive a different value for the parameter b in different processes? Something like :-

def pool_stats_wrapper(a, b, c, d):
    q = mp.Queue()
    pool = mp.Pool()
    pool.apply_async(stats, ((a, {b1: b[b1]}, c, d) for b1 in b))  # Haven't figured this part yet
    pool.close()
    results = []
    for _ in range(len(b)):
        results.append(q.get())
    return results
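To make the question more concrete, the closest I could come up with is one apply_async call per key, collecting the AsyncResult objects instead of using a queue. This is an untested sketch and assumes stats is changed to return x rather than put it on a queue, so I may well be misusing apply_async:

def pool_stats_wrapper(a, b, c, d):
    # untested sketch: one task per key:value pair in b,
    # assuming stats(a, b, c, d) returns x instead of using a queue
    with mp.Pool() as pool:
        async_results = [pool.apply_async(stats, (a, {b1: b[b1]}, c, d)) for b1 in b]
        return [r.get() for r in async_results]

I'm not sure this is the intended way to pass different arguments to each task in a pool, which is really what I'm asking.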
I'm running the code on an 8-core, 16 GB machine, if that helps.
