I have a database connection to PostgreSQL using the package RPostgreSQL. Currently I do the following:
- retrieve a list from my database
- run the list through a for loop, doing a calculation and writing the value back to the database
I am interested in parallelising this process. The obvious approach is to use foreach from the package of the same name. However, that requires some form of connection pooling, so I would like to know whether anyone knows of a parallel backend that lets the workers share my database connection. Here is a specific unresolved example:
In the above case there is no connection pooling in the doMC parallel backend (registered with registerDoMC), and the workaround is to open and close the connection within each %dopar% worker. The doSNOW backend (registered with registerDoSNOW), which is built on the snow package, does not provide this functionality either.
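For reference, the open-and-close workaround mentioned above might look roughly like the sketch below. This is only an illustration under assumptions: the connection settings, the table `items`, its columns `id` and `result`, and the helpers `connect()` and `process_one()` are all hypothetical placeholders, and the calculation is a stand-in.

```r
library(RPostgreSQL)  # attaches DBI as well
library(foreach)
library(doMC)

registerDoMC(cores = 4)

# Hypothetical connection settings; replace with your own.
connect <- function() {
  drv <- dbDriver("PostgreSQL")
  dbConnect(drv, dbname = "mydb", host = "localhost",
            user = "me", password = "secret")
}

# Fetch the work list once, in the parent process.
con <- connect()
ids <- dbGetQuery(con, "SELECT id FROM items")$id  # hypothetical table
dbDisconnect(con)

# Each task opens its own connection and closes it when done,
# because the parent's connection is not shared with the workers here.
process_one <- function(id) {
  con <- connect()
  on.exit(dbDisconnect(con))
  value <- id^2  # placeholder for the real calculation
  dbGetQuery(con, sprintf("UPDATE items SET result = %f WHERE id = %d",
                          value, id))
  TRUE
}

done <- foreach(id = ids, .combine = c) %dopar% process_one(id)
```

This opens one connection per task, which is exactly the overhead I would like to avoid by sharing or pooling connections.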
The alternative would be to use mclapply instead of %dopar%. In that case, does anyone know whether, and how, the database connection can be shared with each mclapply worker?