I am trying to speed up my python script, which uses vtk methods (and vtkobjects) for processing of geometric measurements. Since some of my methods include looping over very similar meshes and computing enclosed points for each of them, I simply wanted to parallelise such for loops:
averaged_contained_points = []
for intersection_actor in intersection_actors:
contained_points = vtk_mesh.points_inside_mesh(point_data=point_data, mesh=intersection_actor.GetMapper().GetInput())
mean_pos = np.mean(contained_points, axis=0)
averaged_contained_points.append(mean_pos)
In this case the function vtk_mesh.points_inside_mesh calls vtk.vtkSelectEnclosedPoints() and takes a vtkActor and vtkPolyData as input.
The main question is: How can this be converted to run in parallel?
My initial attempt was to import multiprocessing, but I then switched to import pathos.multiprocessing, which seems to have a few advantages, but they work fairly similar.
The problem is that the code below doesn't work.
def _parallel_generate_intersection_avg(inputs):
point_data = inputs[0]
intersection_actor = inputs[1]
contained_points = vtk_mesh.points_inside_mesh(point_data=point_data, mesh=intersection_actor.GetMapper().GetInput())
if len(contained_points) is 0:
return np.array([-1,-1,-1])
return np.mean(contained_points, axis=0)
pool = ProcessingPool(CPU_COUNT)
inputs = [[point_data,intersection_actor] for intersection_actor in intersection_actors]
averaged_contained_points = pool.map(_parallel_generate_intersection_avg, inputs)
It results in these sort of errors:
pickle.PicklingError: Can't pickle 'vtkobject' object: (vtkPolyData)0x111ed5bf0
I have done some research and found that vtkobjects probably can't be pickled:
Can't pickle <type 'instancemethod'> when using python's multiprocessing Pool.map()
However, since I couldn't find a solution for running python vtk code in parallel with the available answers, please let me know if you have any suggestions.
[EDIT]
I didn't try to implement threading, mainly, because I read the comments to the answer in this thread: How do I parallelize a simple Python loop?
Using multiple threads on CPython won't give you better performance for pure-Python code due to the global interpreter lock (GIL)