Let `work_one()` be a function like this:

    import subprocess
    import time

    def work_one(program_path):
        # Launch the program and poll it every 0.05 s, up to T seconds.
        task = subprocess.Popen("./" + program_path, shell=True)
        t = 0
        while True:
            ret = task.poll()
            if ret is not None or t >= T:  # T is the max time we allow for 'task'
                break
            else:
                t += 0.05
                time.sleep(0.05)
        return t

There are many such programs to feed to `work_one()`. When these programs run sequentially, the time reported by each `work_one()` is a reliable coarse measure of each program's runtime.
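For example (the program names below are just placeholders), the sequential baseline I am comparing against is simply:

    # Hypothetical list of program paths; each program runs alone, one after another.
    programs = ["prog_a", "prog_b", "prog_c"]
    ts = [work_one(p) for p in programs]
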
However, let's say we have a `multiprocessing.Pool()` of 20 workers, `pool`, and we call `work_one()` through it like this:

    import multiprocessing

    def work_all(programs):
        # Run work_one() over all programs with 20 worker processes.
        pool = multiprocessing.Pool(20)
        ts = pool.map(work_one, programs)
        return ts

Now, the runtime measures reported by `work_all()` are approximately 20 times larger than what a sequential `work_one()` would have reported.
This seems plausible: inside `work_one()`, when a worker's process adds 0.05 to `t` and yields the CPU (by calling `time.sleep()`), the CPU does not necessarily go to the subprocess it is managing (`task`), unlike when there is only one worker; the OS may instead decide to give the CPU to another concurrent worker's process. Therefore the loop inside `work_one()` may iterate roughly 20 times more often before `task` completes.
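I have not actually verified this explanation. One way I could check it, I think, is to record the wall-clock time alongside `t` (just a sketch; `work_one_timed` is a hypothetical helper name):

    import time

    def work_one_timed(program_path):
        # Compare the iteration-counted 't' with the real elapsed wall-clock time.
        start = time.monotonic()
        t = work_one(program_path)
        wall = time.monotonic() - start
        return t, wall
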
Question:
- How do I correctly implement `work_one()` to get a good runtime measure, knowing that `work_one()` might be running concurrently?
- I also want `work_one()` to return early if the subprocess `task` doesn't complete in `T` seconds, so `os.wait*()` functions seem not a good solution because they block the parent process (see the sketch below for the kind of behaviour I am after).
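To make the second point concrete, here is a rough sketch of the timeout behaviour I am after (assuming Python 3.3+, where `Popen.wait()` accepts a `timeout`; `T` is the same limit as above, and `work_one_sketch` is just a placeholder name). What I don't know is whether the time it reports is still a good measure when 20 of these run concurrently:

    import subprocess
    import time

    def work_one_sketch(program_path):
        # Wait for the program to finish, but give up after T seconds.
        start = time.monotonic()
        task = subprocess.Popen("./" + program_path, shell=True)
        try:
            task.wait(timeout=T)          # returns as soon as the process exits
        except subprocess.TimeoutExpired:
            task.kill()                   # don't let it run past T seconds
        return time.monotonic() - start   # wall-clock runtime, capped at roughly T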