I have a dictionary in which I collect ML models. I build it from a dataclass as follows:
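from dataclasses import asdict, dataclass

import pandas as pd


@dataclass(frozen=True, order=True)
class Model:
    data_sample: str
    predictive_model: object  # fitted estimator from a third-party library
    predictions: pd.DataFrame
    binary: object  # also a library-provided model object
    type: str
    inputs: list
    output: str
    explain: bool

    def to_dict(self):
        # asdict() returns a plain dict containing copies of the field values
        return asdict(self)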
I produce multiple models and use the dataclass to validate the inputs for a single, trained model. I convert each instance to a dictionary and append it to a list called ML:
ML.append(model.to_dict())
The binary and predictive_model fields hold model objects (instances of Python classes) that come from libraries like scikit-learn, TPOT, SciPy and so on. One should assume that these objects involve a lot of inheritance.
I am struggling to make this list portable to another environment. My core idea is to use a library like joblib, dill or pickle to .dump the dictionary in the runtime that trains the models, and to use the matching .load method to load it in the other runtime. When I do this, loading fails with ModuleNotFoundError: No module named ... I have already found that this is a common problem and that it is discussed here: Python pickling after changing a module's directory
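To illustrate the workflow described above, here is a minimal sketch that uses only the standard library. It rests on assumptions: fractions.Fraction is a stand-in for a third-party estimator, not one of my actual models, and the dictionary holds only a few of the dataclass fields. As far as I understand it, pickle stores a reference to the module and class name rather than the class's code, which would explain why loading fails when that module cannot be imported in the target environment.

```python
import pickle
from fractions import Fraction

# Stand-in for one entry of the ML list. Fraction plays the role of a
# library estimator (hypothetical substitute, not a real model).
ML = [{"data_sample": "train", "predictive_model": Fraction(3, 4), "type": "demo"}]

# Dump in the runtime that trains the models.
payload = pickle.dumps(ML)

# The payload names the defining module and class but carries no code.
references_module = b"fractions" in payload and b"Fraction" in payload
print(references_module)

# Loading succeeds here only because `fractions` is importable in this runtime.
restored = pickle.loads(payload)
print(restored == ML)
```

In an environment where the referenced module is missing, I would expect the pickle.loads call to be the step that raises the ModuleNotFoundError.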
My question is: is there a better way to "export" my dictionary, preferably one that copies everything it needs, so that I can run it elsewhere without having to manage any imports?
I get the feeling that pickling might not be what I need.