I was experimenting with different start methods in the multiprocessing module and found something surprising: changing the variable method from "spawn" to "fork" drops the execution time from about 9.5 seconds to 0.5.
import multiprocessing as mp
from multiprocessing import Process, Value
from time import time

def increment_value(shared_integer):
    with shared_integer.get_lock():
        shared_integer.value += 1

if __name__ == "__main__":
    method = "spawn"
    mp.set_start_method(method)
    start = time()
    for _ in range(200):
        integer = Value("i", 0)
        procs = [
            Process(target=increment_value, args=(integer,)),
            Process(target=increment_value, args=(integer,)),
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        assert integer.value == 2
    print(f"{method} - Finished in {time() - start:.4f} seconds")
Output from two separate runs:
spawn - Finished in 9.4275 seconds
fork - Finished in 0.5316 seconds
I'm aware of how these two methods start a new child process (well explained here), but this difference puts a big question mark in my head. I would like to know exactly which part of the code impacts performance the most. Is it the pickling step in "spawn"? Does it have anything to do with the lock?
I'm running this code on Pop!_OS (Linux) with Python 3.11.
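To try to narrow it down myself, here is a minimal sketch (the noop and time_start_method names are mine) that times bare process creation with no Value and no lock at all, using mp.get_context() so both methods can be compared in one run. If the gap persists here, the lock and the pickling of the shared Value can't be the main cost; it would point at "spawn" starting a fresh interpreter and re-importing the main module for every child:

```python
import multiprocessing as mp
from time import time

def noop():
    # Child does nothing: any remaining cost is pure process startup.
    pass

def time_start_method(method, n=20):
    # A fresh context lets us time both methods in one run,
    # unlike set_start_method(), which can only be called once.
    ctx = mp.get_context(method)
    start = time()
    for _ in range(n):
        p = ctx.Process(target=noop)
        p.start()
        p.join()
    return time() - start

if __name__ == "__main__":
    for method in ("spawn", "fork"):  # "fork" is Unix-only
        print(f"{method}: {time_start_method(method):.4f} seconds")
```

On my machine I'd expect this stripped-down version to show roughly the same spawn/fork ratio as the original code, but I haven't confirmed where the remainder of the time goes.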