I was asked to develop a consistent way to run (train, make predictions, etc.) any ML model from the command line. I also need to periodically check the DB for training-related requests, such as abort requests. To minimize the impact of the DB checks on training, I want to fetch requests from the DB in a separate process.
So I created an abstract class, RunnerBaseClass, which requires its child classes to implement _train() for each ML model. When you call run(), it runs _train() alongside _check_db(), using the multiprocessing module.
I also want to get rid of the boilerplate
if __name__ == '__main__':
    ...
code, and have argument parsing, instance creation, and the call to run() happen automatically.
So I created a class decorator, @autorun, which calls the run() method of the class when the script is run directly from the command line. The decorator does call run(), but starting a subprocess that targets the class's method fails with the following error:
Traceback (most recent call last):
  File "run.py", line 4, in <module>
    class Runner(RunnerBaseClass):
  File "/Users/yongsinp/Downloads/runner_base.py", line 27, in class_decorator
    instance.run()
  File "/Users/yongsinp/Downloads/runner_base.py", line 16, in run
    db_check_process.start()
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/yongsinp/miniforge3/envs/py3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class '__main__.Runner'>: attribute lookup Runner on __main__ failed
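As far as I understand the last line, the spawn start method pickles the Process object, which drags along the bound method, its instance, and finally the class, and pickle stores a class only as a reference to a name in its module. The sketch below tries to show that lookup in isolation, without multiprocessing. The module name fake_main and the helper can_pickle are made up for illustration, and the class is built with type() only so that it can claim to live in that stand-in module.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the __main__ module of run.py.
fake_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = fake_main


def _check_db(self):
    print("Checking DB")


# A class that claims to live in fake_main, like Runner lives in __main__.
Runner = type("Runner", (), {"__module__": "fake_main", "_check_db": _check_db})
instance = Runner()


def can_pickle(obj) -> bool:
    """Return True if pickle.dumps(obj) succeeds (illustrative helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


# 1. While the decorator body runs, the name Runner is not bound in the module yet.
during_decoration = can_pickle(instance._check_db)

# 2. Once the name is bound to the class, the lookup by name succeeds.
fake_main.Runner = Runner
name_bound_to_class = can_pickle(instance._check_db)

# 3. autorun returns the instance, so the name ends up bound to that instead.
fake_main.Runner = instance
name_bound_to_instance = can_pickle(instance._check_db)

print(during_decoration, name_bound_to_class, name_bound_to_instance)
# → False True False
```

If this reading is right, the bound method can only be pickled while the module-level name Runner refers to the class itself, which is neither the case during decoration nor after the decorator has returned an instance.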
Here's a minimal example that reproduces the error.
runner_base.py:
from abc import ABC, abstractmethod
from multiprocessing import Process


class RunnerBaseClass(ABC):
    @abstractmethod
    def _train(self) -> None:
        ...

    def _check_db(self):
        print("Checking DB")

    def run(self) -> None:
        db_check_process = Process(target=self._check_db)
        db_check_process.start()
        self._train()
        db_check_process.join()


def autorun(env_name: str):
    def class_decorator(class_):
        instance = class_()
        if env_name == '__main__':
            instance.run()
        return instance
    return class_decorator
run.py:
from runner_base import RunnerBaseClass, autorun


@autorun(__name__)
class Runner(RunnerBaseClass):
    def _train(self) -> None:
        print("Training")
I have looked up the cause of this error, and I can fix it either by not using the decorator or by turning the method into a function:
runner_base.py:
from abc import ABC, abstractmethod
from multiprocessing import Process


class RunnerBaseClass(ABC):
    @abstractmethod
    def _train(self) -> None:
        ...

    def run(self) -> None:
        db_check_process = Process(target=check_db)
        db_check_process.start()
        self._train()
        db_check_process.join()


def autorun(env_name: str):
    def class_decorator(class_):
        instance = class_()
        if env_name == '__main__':
            instance.run()
        return instance
    return class_decorator


def check_db():
    print("Checking DB")
I could just use the function instead of the method and be done with it, but I don't like the idea of passing configuration and inter-process communication objects (like Queue) to the function, which I wouldn't have to do with a method. So, is there a way to keep _check_db() a method and still use the @autorun decorator?
(I am aware of dill and similar modules, but I'd like to stick with the built-in ones if possible.)