I want to train a model via tidymodels using predictions from another model as a feature. Specifically, it's a KNN model where I want to use predictions from a random forest model as a feature.
I started implementing a (hacky) solution using step_mutate:
library(dplyr)
library(tidymodels)
library(purrr)
library(data.table)
df <- data.table(
  y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100)
)
pred_rf <- function(...) {
  # Very hacky function that creates random forest predictions
  # Capture the argument names so the columns keep their original names
  nms <- purrr::map_chr(rlang::enexprs(...), rlang::as_label)
  dat <- setDT(setNames(list(...), nms))
  outcome <- names(dat)[1]
  preds <- names(dat)[-1]
  rec <- recipe(dat) %>%
    # First column is the outcome, the remaining columns are predictors
    update_role(all_of(outcome), new_role = "outcome") %>%
    update_role(all_of(preds), new_role = "predictor")
  model <- rand_forest(mode = "regression")
  wf <- workflow() %>%
    add_recipe(rec) %>%
    add_model(model)
  fitted_model <- fit(wf, dat)
  predictions <- predict(fitted_model, dat)$.pred
  stopifnot(length(predictions) == nrow(dat))
  stopifnot(sum(is.na(predictions)) == 0)
  return(predictions)
}
rec <- recipe(y ~ ., df) %>%
  step_mutate(y_pred = pred_rf(y, x1, x2)) %>% 
  prep()
bake(rec, new_data = NULL) # Desired output would be a design matrix like this
However, I realised that this would cause data leakage when used for tuning. Is it possible to do this without data leakage, or would I need to create a custom step? It would be very similar to the step_impute_* functions, but I couldn't find anything.
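To make the leakage concern concrete, here is a rough sketch of what I think the leakage-free version would have to do if I did it by hand outside a recipe: fit the random forest on the analysis rows of each resample only, then predict for both the analysis and the assessment rows. The helper name `add_rf_pred` is made up for illustration, and I am assuming the default `ranger` engine is installed. I am not sure this is the recommended approach, it is only meant to show the behaviour I am after.

```r
library(tidymodels)

set.seed(1)
df <- tibble(
  y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100)
)
folds <- vfold_cv(df, v = 5)

# Hypothetical helper: fit the random forest on the analysis set only,
# then add its predictions as a feature to both parts of the split
add_rf_pred <- function(split) {
  train <- analysis(split)
  test <- assessment(split)
  rf_fit <- rand_forest(mode = "regression") %>%
    set_engine("ranger") %>%
    fit(y ~ x1 + x2, data = train)
  list(
    train = mutate(train, y_pred = predict(rf_fit, new_data = train)$.pred),
    test  = mutate(test,  y_pred = predict(rf_fit, new_data = test)$.pred)
  )
}

# One list per fold, each holding the augmented analysis and assessment data
per_fold <- purrr::map(folds$splits, add_rf_pred)
```

Here the assessment rows never influence the random forest. The analysis rows still receive in-sample predictions, which are probably too optimistic, so this may not be the whole story. What I am looking for is a way to get this per-resample behaviour from inside a recipe so that it works with tune_grid().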
Thanks