The accepted answer provides a sufficient solution.
However, if the number of files to read is large, it might be worth avoiding for loops altogether and rewriting the solution using lapply.
A few additional tweaks could be added:
- You could use `data.table`'s `rbindlist` to bind each file into a single big `data.frame`/`data.table` (avoiding multiple data.frames in the global environment). This assumes that all loaded files share the same structure, and it is more concise. The same can be achieved in base R with `do.call(rbind, yourlist)`, but you will lose the "id" column (see the short sketch after this list).
- You can implement a simple percentage counter to show how many files have been loaded, using `cat(round(i / length(test) * 100, 3), "%    \r")` (used in the solution below), which is also noted here.
- There are a number of packages that can be used for this task. I would recommend `openxlsx` and would avoid `xlsx`, since `openxlsx` removes the dependency on Java. `readxl` is also worth mentioning. The solution below uses `openxlsx`.
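To illustrate the first point, here is a minimal sketch with two made-up data.frames (the names `a` and `b` are placeholders for demonstration only):

dfs <- list(a = data.frame(x = 1:2), b = data.frame(x = 3:4))

# rbindlist records which list element each row came from
data.table::rbindlist(dfs, idcol = "id")
#    id x
# 1:  a 1
# 2:  a 2
# 3:  b 3
# 4:  b 4

# the base equivalent binds the same rows, but the origin survives
# only in the row names, not as a proper column
do.call(rbind, dfs)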
The following solution should achieve similar results but should be faster than a for loop. As no example data was provided, you will have to play with some of the arguments of openxlsx::read.xlsx to specify the range of the sheet you want to load (see the note after the code).
# list all .xlsx files in the working directory and its subfolders
test <- list.files(recursive = TRUE, pattern = "\\.xlsx$")

as.data.frame(data.table::rbindlist(lapply(seq_along(test), function(i) {
        # simple percentage counter
        cat(round(i / length(test) * 100, 3), "%    \r")
        tmp_name <- test[i]
        tmp_sheets <- openxlsx::getSheetNames(tmp_name)
        if ("Income Statement" %in% tmp_sheets) {
                # return the sheet's data; files without the sheet
                # return NULL, which rbindlist silently skips
                openxlsx::read.xlsx(
                        tmp_name,
                        sheet = "Income Statement",
                        startRow = 4)
        }
}), idcol = "id"))
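On the range arguments: openxlsx::read.xlsx also accepts rows and cols to restrict the import to a rectangular block of the sheet. A minimal sketch follows; the file name and the row/column numbers are placeholders you would adjust to your own layout:

# hypothetical example: read only rows 4-100 and columns A-F
openxlsx::read.xlsx(
        "some_file.xlsx",           # placeholder file name
        sheet = "Income Statement",
        rows = 4:100,               # restrict to these rows
        cols = 1:6,                 # restrict to these columns
        colNames = TRUE)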