I have ~40K data frames in a list. Each data frame has 7 variables: 3 factor and 4 numeric. For reference, here is the str() output for the first data frame:
 $ a:'data.frame':  4 obs. of  7 variables:
  ..$ x1: Factor w/ 1 level "a": 1 1 1 1
  ..$ x2: Factor w/ 4 levels "12345678901234",..: 1 2 3 4
  ..$ x3: Factor w/ 4 levels "SAMPLE",..: 1 2 3 4
  ..$ x4: int [1:4] 1 2 3 4
  ..$ x5: num [1:4] 50 60 70 80
  ..$ x6: int [1:4] 50 60 70 80
  ..$ x7: num [1:4] 0.5 0.7 0.35 1
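If it helps, a list with the same shape can be fabricated roughly like this (the values, seed, and make_df name are made up; only the structure matters):

set.seed(1)
make_df <- function(i) {
  # Build one 4-row data frame matching the structure shown above
  data.frame(
    x1 = factor("a"),
    x2 = factor(sprintf("%014d", (i - 1) * 4 + 1:4)),  # 14-character codes
    x3 = factor(paste0("SAMPLE", (i - 1) * 4 + 1:4)),
    x4 = 1:4,
    x5 = c(10, 20, 30, 40),
    x6 = c(50L, 60L, 70L, 80L),
    x7 = runif(4)
  )
}
df_list <- lapply(seq_len(40000), make_df)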
I'm trying to merge these into a single ginormous data frame, using:
Reduce(function(...) merge(..., all = TRUE), df_list)
As recommended here: Simultaneously merge multiple data.frames in a list.
If I take the first 1000 items, i.e.
Reduce(function(...) merge(..., all = TRUE), df_list[1:1000])
This produces the desired result (merges the individual data frames into a single one) and completes in 37 seconds.
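In case it's useful, a timing like that can be captured with system.time (a sketch; merged_1k is just an illustrative name):

system.time(
  merged_1k <- Reduce(function(...) merge(..., all = TRUE), df_list[1:1000])
)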
However, running Reduce() on the entire 40K-element list takes an inordinate amount of time. I've let it run for more than 5 hours and it still hasn't completed.
Are there any tricks that I can use to improve the performance of Reduce(), or is there a better alternative?
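My suspicion is that each merge re-scans the ever-growing accumulated result, so the total cost grows much faster than linearly with the list length. A quick way to probe that scaling on prefixes of the list (a sketch; exact timings will vary by machine):

# Time Reduce(merge) on growing prefixes to see how the cost scales
for (n in c(250, 500, 1000, 2000)) {
  elapsed <- system.time(
    Reduce(function(...) merge(..., all = TRUE), df_list[1:n])
  )["elapsed"]
  cat(sprintf("n = %4d: %.1f s elapsed\n", n, elapsed))
}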