I am running code that uses the parallel package, and I would like to replace part of it with a C++ implementation via Rcpp. It seems that the question has been partially addressed:
- here: Using Rcpp function in parLapply on Windows
- and here: Using Rcpp functions inside of R's par*apply functions from the parallel package
- and in many other forums
 
Apparently, the best solution is to put the C++ implementation in a package, build it, and export the function from the package.
Below are the steps I took, presented as a simplified version of the original code so that anyone can reproduce them. I am working on Windows.
First, let's define the C++ implementation.
#include <Rcpp.h>
#include <numeric>
#include <cmath>
using namespace Rcpp;
NumericVector add(NumericVector x, NumericVector y)
{
  return (x+y);
}
// [[Rcpp::export]]
NumericVector f_sample_p(
    NumericVector x, 
    int nb = 5
)
{
  NumericVector x1;
  NumericVector x2;
  
  // new sample 
  x1 = sample(x, nb, true);
  x2 = sample(x, nb, true);
  
  return add(x1, x2);
}
The code contains two functions: add, which is not exported to R, and f_sample_p, which uses add and is exported to R. Given an input vector, this example draws two samples of size nb with replacement and returns their element-wise sum. The original code has a similar structure, so I cannot pack everything into a single function.
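For comparison, here is a minimal pure-R sketch of what f_sample_p computes. The name f_sample_r is hypothetical and not part of the package. It indexes with sample.int() because base R's sample() treats a single number as the range 1:x, whereas the Rcpp sugar version draws from the elements of the vector it is given.

```r
# Hypothetical pure-R counterpart of f_sample_p, for illustration only.
f_sample_r <- function(x, nb = 5L) {
  # Draw positions rather than values, so a length-one x is not
  # interpreted as the range 1:x (base R's sample() convenience behaviour).
  x1 <- x[sample.int(length(x), nb, replace = TRUE)]
  x2 <- x[sample.int(length(x), nb, replace = TRUE)]
  x1 + x2  # element-wise sum, like the unexported add() helper
}

set.seed(1)            # only to make the illustration repeatable
f_sample_r(1:15, 8)    # eight values, each between 2 and 30
f_sample_r(3.7, 8)     # a single input value can only give 3.7 + 3.7
```

A reference like this can be handy for checking the compiled function's output range, though the two will not produce identical draws for the same seed unless their sampling internals match.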
As a second step, I create the package skeleton with Rcpp.package.skeleton in a user-defined path.
library(Rcpp)
cpp_src_path <- "~/R/rcpp/sample_p.cpp" # where I store the c++ implementation
dest_path_of_pkg <- "~/R/rpkg/" # where I want to build the skeleton
Rcpp.package.skeleton(name = "mypkg", # the name of the package
                      list = character(), # I suppose I have to leave like this?
                      path = dest_path_of_pkg, # here I set the path
                      force = TRUE,
                      code_files = character(), # because I don't have any R codes
                      cpp_files = cpp_src_path, # set the cpp source
                      example_code = FALSE, attributes = FALSE, module = FALSE
                      )
# if I set attributes = FALSE then I have to compile the attributes
compileAttributes("~/R/rpkg/mypkg", TRUE)
I left the list parameter empty because setting it to "f_sample_p" does not work. I am also unsure about the other parameters. Still, the call works and creates the skeleton. However, I have to manually edit the NAMESPACE file and change export("Rcpp.fake.fun") to exportPattern("^[[:alpha:]]+") to prevent the following error:
Error: package or namespace load failed for 'mypkg' in namespaceExport(ns, exports):
   undefined exports: Rcpp.fake.fun
Alternatively, one could export the function directly with export("f_sample_p"), since it is the only function; exportPattern is just a shortcut. Is there a way to set this through the parameters of Rcpp.package.skeleton instead of manually editing NAMESPACE?
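Until a skeleton parameter for this turns up, the manual NAMESPACE edit could at least be scripted. The sketch below is only an illustration: fix_namespace() is a hypothetical helper, and the demonstration runs on a temporary file whose other two directives are assumed rather than copied from the real skeleton.

```r
# Hypothetical helper: swap the placeholder export for a real function name.
fix_namespace <- function(ns_file, fun_name) {
  lines <- readLines(ns_file)
  # fixed = TRUE: match the placeholder literally, not as a regular expression
  lines <- sub('export("Rcpp.fake.fun")',
               sprintf('export("%s")', fun_name),
               lines, fixed = TRUE)
  writeLines(lines, ns_file)
  invisible(lines)
}

# Demonstration on a temporary file standing in for mypkg/NAMESPACE.
demo <- tempfile()
writeLines(c('useDynLib(mypkg, .registration=TRUE)',
             'export("Rcpp.fake.fun")',
             'importFrom(Rcpp, evalCpp)'), demo)
fix_namespace(demo, "f_sample_p")
readLines(demo)
```

On the real package this would presumably be called as fix_namespace(path.expand("~/R/rpkg/mypkg/NAMESPACE"), "f_sample_p") before running install.packages().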
As a third step, I install the package into a user-defined library path.
lib_path <- "~/R/rlib/"
install.packages("~/R/rpkg/mypkg", # where the package source is
                 lib = lib_path, # library where I want to install it
                 repos = NULL, # NULL because I install from local files
                 type = "source") # a source directory, not a built tarball
Now the package is ready to use.
library("mypkg", lib.loc = "~/R/rlib/")
f_sample_p(1:15, 8)
# [1] 16  8  5 21  6 17 21 24
This generates 8 random numbers from the vector 1:15, each being the sum of two draws.
As a final step, I want to use this function in my R code. The following is also a simplification of my actual code: it uses lapply to run sim_function, which in turn calls f_sample_p, many times (100 here).
library(parallel)
sim_function <- function(n){
  # define the simulation function to be used in lapply
  z <- f_sample_p(n, 8) # samples 8 elements from the vector n
  sum(z) # return the sum
}
# prepare the input: this will loop 100 times in the simulation function
x <- runif(100, 1, 10) 
# this works well in classic lapply
result <- lapply(x, sim_function)
# but it won't with parLapply
cl <- makeCluster(2)
clusterExport(cl = cl, varlist = c(
  "f_sample_p" # the variable list
))
result <- parLapply(cl, x, sim_function) # this generates the error
This last line generates the following error:
Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: object '_mypkg_f_sample_p' not found
It seems that parLapply looks for _mypkg_f_sample_p instead of f_sample_p. In fact, printing f_sample_p shows that it is a thin wrapper around a .Call to that symbol:
f_sample_p
function (x, nb = 5L) 
{
    .Call(`_mypkg_f_sample_p`, x, nb)
}
<bytecode: 0x0000020e09d70448>
<environment: namespace:mypkg>
So the C++ function that is actually exported gets a different name at build time. If you look at RcppExports.cpp (in the src folder), you will find that it contains:
...
// f_sample_p
NumericVector f_sample_p(NumericVector x, int nb);
RcppExport SEXP _mypkg_f_sample_p(SEXP xSEXP, SEXP nbSEXP) {
...
}
Is there a way to change the export name?
If I try to add _mypkg_f_sample_p to the clusterExport() call, R cannot find it in the environment:
clusterExport(cl = cl, varlist = c(
  "f_sample_p", "_mypkg_f_sample_p" # the variable list
))
This gives the error:
Error in get(name, envir = envir) : object '_mypkg_f_sample_p' not found
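The failing get() can be reproduced with a package that ships with R. The sketch below uses stats and its native-symbol object C_Cdqrls purely as a stand-in for mypkg and _mypkg_f_sample_p. That pairing is an assumption made for illustration, and the same check could be run with the real names once mypkg is loaded. where_is() is a hypothetical helper.

```r
# Hypothetical helper: report where a name can be found.
where_is <- function(name, pkg) {
  c(
    # the lookup clusterExport() performs by default:
    # get(name, envir = .GlobalEnv), which also walks the search path
    from_global  = exists(name, envir = globalenv()),
    # the package's internal namespace, including unexported objects
    in_namespace = exists(name, envir = asNamespace(pkg), inherits = FALSE)
  )
}

# Stand-in for where_is("_mypkg_f_sample_p", "mypkg")
where_is("C_Cdqrls", "stats")

# The R-level wrapper is enclosed by the namespace, which is where it
# resolves the native symbol when it is called.
environmentName(environment(stats::lm.fit))
```

If the real symbol behaves the same way, it would explain why clusterExport() cannot pick it up from the global environment even though f_sample_p itself works in the master session.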
As I said, I have checked many other posts and docs, without success. I am out of ammo. Any ideas?
Edit about a month later: Thinking about the function passed to parLapply(), I imagined (erroneously, as shown below) that I could trick it through the list of variables exported for sim_function. All variables used explicitly within sim_function are listed in the varlist parameter of clusterExport.
For example, this works:
sim_function <- function(n){
  # define the simulation function to be used in lapply
  z <- rep(n, 8) # <= I have replaced f_sample_p() with rep() 
  sum(z) # return the sum
}
The result will obviously not be the same, but technically it works: rep() is available on the workers, no questions asked. So maybe, if I pass a wrapper around my f_sample_p function, I can get around the fact that it is referenced within the scope of sim_function().
wrap_sample_p <- function(n1, n2){
  f_sample_p(n1, n2) # this is now defined outside the scope of sim_function
}
sim_function <- function(n){
  # define the simulation function to be used in lapply
  z <- wrap_sample_p(n, 8) # I pass the wrapper function
  sum(z) # return the sum
}
Then I call:
clusterExport(cl = cl, varlist = c(
  "wrap_sample_p" # the variable list
))
result <- parLapply(cl, x, sim_function) # this generates the following error
Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: could not find function "f_sample_p"
So it seems to care not only about the scope of sim_function but also about the environment in which the wrapper is evaluated. Exporting both functions does not help either:
clusterExport(cl = cl, varlist = c(
  "f_sample_p", "wrap_sample_p" # the variable list
))
result <- parLapply(cl, x, sim_function) # this generates the following error
Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: object '_mypkg_f_sample_p' not found
It is still looking for _mypkg_f_sample_p.
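A pattern commonly used with the parallel package is to load the package inside each worker rather than exporting its functions one by one. The sketch below has not been run against mypkg. par_with_package() is a hypothetical helper, and the demonstration uses tools::toTitleCase() only as a stand-in for a function from a package that the workers have not attached.

```r
library(parallel)

# Hypothetical helper: run `fun` over `x` on a cluster whose workers have
# attached `pkg` (optionally from the library directory `lib`).
par_with_package <- function(x, fun, pkg, lib = NULL, workers = 2) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
  # Attach the package inside every worker process instead of exporting
  # individual function objects from the master session.
  clusterCall(cl, library, pkg, lib.loc = lib, character.only = TRUE)
  parLapply(cl, x, fun)
}

# Stand-in demonstration with a package that ships with R.
par_with_package(c("hello world", "foo bar"),
                 function(s) toTitleCase(s), "tools")

# With the setup from this post the call would presumably look like:
# result <- par_with_package(x, sim_function, "mypkg", lib = "~/R/rlib/")
```

The idea is that each worker then resolves f_sample_p, and the native symbol behind it, through its own copy of the package namespace, so nothing from mypkg needs to appear in varlist.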