I'm trying to read in several CSVs whose headers begin on different rows and then combine them into one data frame. I tried the code from the related question "Read CSV into R based on where header begins", but I couldn't get the function to work.
Here are two example DFs:
file1 <- structure(list(X..Text = c("# Text", "#", "agency_cd", "5s", 
"USGS", "USGS"), X = c("", "", "site_no", "15s", "4294000", "4294000"
), X.1 = c("", "", "datetime", "20d", "6/24/13 0:00", "6/24/13 0:15"
), X.2 = c("", "", "tz_cd", "6s", "EDT", "EDT"), X.3 = c("", 
"", "Gage height", "14n", "1.63", "1.59"), X.4 = c("", "", " Discharge", 
"14n", "1310", "1250")), class = "data.frame", row.names = c(NA, 
-6L))
file2 <- structure(list(X..Text = c("# Text", "# Text", "#", "agency_cd", 
"5s", "USGS", "USGS"), X = c("", "", "", "site_no", "15s", "4294002", 
"4294002"), X.1 = c("", "", "", "datetime", "20d", "6/24/13 0:00", 
"6/24/13 0:15"), X.2 = c("", "", "", "tz_cd", "6s", "EDT", "EDT"
), X.3 = c("", "", "", "Gage height", "14n", "1.63", "1.59"), 
X.4 = c("", "", "", " Discharge", "14n", "1310", "1250")), class = 
"data.frame", row.names = c(NA, 
-7L))
I would like to use a solution similar to the one in the related question above. I also need to skip the line after the header (the header row is the row that starts with "agency_cd"), and then bind all the CSVs into one data frame with the file name in a column, along these lines:
library(tidyverse)
# Path to the data
data_path <- "Data/folder1/folder2"
# Bind all files together to form one data frame
discharge <-
  # Find all file names ending in CSV in all subfolders
  dir(data_path, pattern = "\\.csv$", recursive = TRUE) %>% 
  # Create a data frame holding the file names
  tibble(filename = .) %>% 
  # Read each CSV file into a list-column, keeping every column as character
  mutate(file_contents = map(filename, ~ read_csv(file.path(data_path, .), col_types = cols(.default = "c")))
    ) %>% 
  # Unpack the list-column to make a useful data frame
  unnest(file_contents)
With the example function from the related question above, I have two problems: A) I can't get header_begins to come back as a vector, and B) I don't know how to incorporate the function into the read_csv call above.
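To make the goal concrete, here is a rough base-R sketch of the per-file step I'm after: find the row that starts with "agency_cd", drop everything above it and the format row right below it, and parse the rest as character columns. The name `read_usgs_csv` is made up for illustration, the temp file only stands in for my real data, and I'm assuming the header cell is not quoted in the raw file.

```r
# Hypothetical helper: read one CSV whose header row starts with `column_name`,
# dropping the preamble above it and the format row (5s, 15s, ...) below it.
read_usgs_csv <- function(path, column_name = "agency_cd") {
  lines <- readLines(path, warn = FALSE)
  # First line that starts with the header's first column name
  header_row <- grep(paste0("^", column_name), lines)[1]
  if (is.na(header_row)) stop("No header row found in ", path)
  # Keep the header line plus everything after the format row
  keep <- lines[-c(seq_len(header_row - 1), header_row + 1)]
  read.csv(text = keep, colClasses = "character", strip.white = TRUE)
}

# Stand-in file mimicking file1 above
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "# Text,,,,,",
  "#,,,,,",
  "agency_cd,site_no,datetime,tz_cd,Gage height, Discharge",
  "5s,15s,20d,6s,14n,14n",
  "USGS,4294000,6/24/13 0:00,EDT,1.63,1310",
  "USGS,4294000,6/24/13 0:15,EDT,1.59,1250"
), tmp)

one_file <- read_usgs_csv(tmp)
```

If something like this is reasonable, I imagine it could take the place of read_csv inside the map() call, but I don't know whether that is the idiomatic way to do it.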
As a start I tried this using the solution to the related question:
# Function
detect_header_line <- function(file_names, column_name) {
  header_begins <- NULL
  for (i in 1:length(file_names)) {
    lines_read <- readLines(file_names[i], warn = FALSE)
    header_begins[i] <- grep(column_name, lines_read)
  }
}
# Path to the data
data_path <- "Data/RACC_2012-2016/discharge"
# Get all CSV file names
file_names <- dir(data_path, pattern = "\\.csv$", recursive = TRUE)
# Get beginning rows of each CSV file
header_begins <- detect_header_line(file.path(data_path, file_names), 'agency_cd')
But header_begins came back empty. Even once that is fixed, I still need help incorporating the header rows into the pipeline above.
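My current guess is that the function never returns anything: its last expression is the for loop, and in R a for loop evaluates to NULL. Below is a sketch of a variant that returns the vector explicitly. I have only thought it through against throwaway files like the ones created here, not my real data, and taking the first match with `[1]` is my own assumption about what should happen if the pattern occurs more than once.

```r
# Sketch: same idea as my function, but the vector is the return value
detect_header_line <- function(file_names, column_name) {
  vapply(file_names, function(f) {
    lines_read <- readLines(f, warn = FALSE)
    # [1] keeps only the first match; gives NA if there is no match at all
    grep(column_name, lines_read)[1]
  }, integer(1), USE.NAMES = FALSE)
}

# Two throwaway files with the header on different rows
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
writeLines(c("# Text", "#", "agency_cd,site_no", "5s,15s", "USGS,4294000"), f1)
writeLines(c("# Text", "# Text", "#", "agency_cd,site_no", "5s,15s", "USGS,4294002"), f2)

header_begins <- detect_header_line(c(f1, f2), "agency_cd")
```

Here header_begins should hold one row number per file, which is the vector I was expecting from my original attempt.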
Any help is greatly appreciated!