I am running an experiment whose result is high-dimensional and structured. I represent the result with a pandas MultiIndex and use multiprocessing to compute and fill it. The result set is large: it can easily reach millions to billions of entries. When the result is 3D, I can have the function that does the computation return a DataFrame and then combine the DataFrames into a Panel afterwards.
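For reference, here is a minimal sketch of the 3D case that works for me. The names `compute_slice` and `combine`, the axis labels, and the placeholder computation are all made up for illustration. The sketch stacks the per-worker frames with `pd.concat` and `keys` rather than building a Panel, since Panel has been removed from recent pandas releases.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def compute_slice(k):
    """Hypothetical worker: return the 2D result for one value of the third axis."""
    rows = pd.Index(["a", "b"], name="row")
    cols = pd.Index(["x", "y", "z"], name="col")
    # Placeholder computation: every cell just holds the slice key.
    values = np.full((len(rows), len(cols)), float(k))
    return pd.DataFrame(values, index=rows, columns=cols)


def combine(keys, frames):
    """Stack the per-worker frames; the keys become the outer index level."""
    return pd.concat(frames, keys=keys, names=["item", "row"])


if __name__ == "__main__":
    keys = [0, 1, 2]
    with mp.Pool(processes=2) as pool:
        # Each worker returns its own DataFrame to the parent process.
        frames = pool.map(compute_slice, keys)
    result = combine(keys, frames)
    print(result)
```

This is fine in 3D because each returned frame is small, but it relies on every worker shipping its piece back to the parent.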
When the result is 5D or higher, returning a subset of the result from the function run in each process is neither straightforward nor memory-efficient. However, letting each process write its results directly into the global MultiIndex-ed DataFrame (the result), which is created before the computation starts, does not work either: the values of the result df are all still NaN, just as when it was created.
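A stripped-down sketch of what I am seeing, with a hypothetical 5-level index and a made-up worker `fill` (my real index is far larger). I suspect each worker process is writing into its own copy of the global frame, so the frame in the parent process is never touched, but I am not sure how best to work around that.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd

# Hypothetical 5-level index with 2 values per level (32 rows in total).
index = pd.MultiIndex.from_product(
    [range(2), range(2), range(2), range(2), range(2)],
    names=["a", "b", "c", "d", "e"],
)
# Global result frame, pre-allocated with NaN before the computation.
result = pd.DataFrame({"value": np.nan}, index=index)


def fill(a):
    """Worker: write into the global frame for one value of level 'a'."""
    mask = result.index.get_level_values("a") == a
    result.loc[mask, "value"] = float(a)
    # Number of filled entries as seen from inside this process.
    return int(result["value"].notna().sum())


if __name__ == "__main__":
    with mp.Pool(processes=2) as pool:
        filled_in_workers = pool.map(fill, [0, 1])
    # The workers see their own writes...
    print(filled_in_workers)
    # ...but the frame in the parent process is still entirely NaN.
    print(result["value"].isna().all())
```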
Any suggestions are greatly appreciated!