Occasionally, novice R programmers build data frames in a for loop, usually by initializing an empty data frame and then calling rbind on each iteration. In response to this inefficient approach, we often cite Patrick Burns' R Inferno - Circle 2: Growing Objects, which emphasizes the hazard of this pattern.
In Python pandas (the other open-source data science tool), experts have described the same pattern as quadratic copying with O(N^2) cost (@unutbu here, @Alexander here). Additionally, the docs (see section note) stress the copying problem when appending to datasets, and the wiki explains that Python's list.append does not have the copy problem. I wonder whether the same reasoning applies to R.
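As a rough sketch of what the quadratic-copy claim would mean in R terms, the following counts only the rows that rbind has to write. It assumes each iteration appends a fixed-size chunk (500 rows is an illustrative value), that every rbind call writes the whole accumulated result into a new object, and that a single rbind over a list writes each row once. The function names are hypothetical, and the model ignores the cost of generating the chunks themselves.

```r
# Simplified cost model (an assumption, not a measurement):
# count rows written by rbind under each strategy.
chunk_rows <- 500

# Growing in a loop: iteration i writes a fresh object of i * chunk_rows rows.
rows_written_loop <- function(n) chunk_rows * sum(seq_len(n))

# Binding a list once: every row is written a single time.
rows_written_list <- function(n) chunk_rows * n

n <- c(50, 100, 500)
model <- data.frame(
  n     = n,
  loop  = sapply(n, rows_written_loop),
  list  = sapply(n, rows_written_list)
)
model$ratio <- model$loop / model$list  # equals (n + 1) / 2 under this model
model
```

Under these assumptions the loop does (n + 1) / 2 times the copying work of the list, so the gap widens linearly with the number of iterations.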
Specifically, my questions:
- Can timing alone illustrate or quantify the problem of growing an object in a loop? See the microbenchmark results below. Burns shows timings to illustrate the computational cost of creating a sequence.
- Or does memory usage illustrate or quantify the problem? See the Rprof results below. Burns cites using Rprof to show memory consumption within code.
- Or is the growing object problem context-specific, with a general rule of thumb to avoid building objects in loops?
Consider the following examples, which build a random data frame in 500-row chunks, first by growing it in a loop and then by collecting the chunks in a list:
grow_df_loop <- function(n) {
  final_df <- data.frame()
  for(i in 1:n) {
    df <- data.frame(
      group = sample(c("sas", "stata", "spss", "python", "r", "julia"), 500, replace=TRUE),
      int = sample(1:15, 500, replace=TRUE),
      num = rnorm(500),
      char = replicate(500, paste(sample(c(LETTERS, letters, c(0:9)), 3, replace=TRUE), collapse="")),
      bool = sample(c(TRUE, FALSE), 500, replace=TRUE),
      date = as.Date(sample(10957:as.integer(Sys.Date()), 500, replace=TRUE), origin="1970-01-01")
    )        
    final_df <- rbind(final_df, df)
  }
  return(final_df)
}
grow_df_list <- function(n) {
  df_list <- lapply(1:n, function(i)
    df <- data.frame(
      group = sample(c("sas", "stata", "spss", "python", "r", "julia"), 500, replace=TRUE),
      int = sample(1:15, 500, replace=TRUE),
      num = rnorm(500),
      char = replicate(500, paste(sample(c(LETTERS, letters, c(0:9)), 3, replace=TRUE), collapse="")),
      bool = sample(c(TRUE, FALSE), 500, replace=TRUE),
      date = as.Date(sample(10957:as.integer(Sys.Date()), 500, replace=TRUE), origin="1970-01-01")
    )
  )
  final_df <- do.call(rbind, df_list)
  return(final_df)
}
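To separate the effect of the for loop from the effect of growing the result, a third variant could keep the loop but fill a preallocated list and call rbind only once. This is a minimal sketch, not one of the benchmarked functions: make_chunk and grow_df_prealloc are hypothetical names, and the chunk uses fewer columns than the examples above to stay short.

```r
# Hypothetical stand-in for one 500-row chunk (fewer columns than the original).
make_chunk <- function(rows = 500) {
  data.frame(
    int  = sample(1:15, rows, replace = TRUE),
    num  = rnorm(rows),
    bool = sample(c(TRUE, FALSE), rows, replace = TRUE)
  )
}

# Keep the for loop, but store each chunk in a preallocated list slot
# and bind all chunks in a single rbind call at the end.
grow_df_prealloc <- function(n) {
  df_list <- vector("list", n)
  for (i in seq_len(n)) {
    df_list[[i]] <- make_chunk()
  }
  do.call(rbind, df_list)
}

out <- grow_df_prealloc(10)
dim(out)  # 5000 rows, 3 columns
```

If this variant times close to the lapply version, that would suggest the repeated rbind, rather than the loop construct, is what drives the cost.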
Timing
Benchmarking by timing confirms that the list approach is more efficient at every iteration count tested. But given reproducible, uniform data examples, can timing results capture the difference in object growth?
library(microbenchmark)
microbenchmark(grow_df_loop(50), grow_df_list(50), times = 5L)
# Unit: milliseconds
#              expr      min       lq     mean   median       uq      max neval cld
#  grow_df_loop(50) 758.2412 762.3489 809.8988 793.3590 806.4191 929.1256     5   b
#  grow_df_list(50) 554.3722 562.1949 577.6891 568.7658 589.8565 613.2560     5  a 
microbenchmark(grow_df_loop(100), grow_df_list(100), times = 5L)
# Unit: seconds
#               expr      min       lq     mean   median       uq      max neval cld
#  grow_df_loop(100) 2.223617 2.225441 2.425668 2.233529 2.677309 2.768447     5   b
#  grow_df_list(100) 1.211181 1.255191 1.325670 1.287821 1.396905 1.477252     5  a 
microbenchmark(grow_df_loop(500), grow_df_list(500), times = 5L)
# Unit: seconds
#               expr      min       lq     mean   median       uq      max neval cld
#  grow_df_loop(500) 38.78245 39.74367 41.54976 40.10221 44.36565 44.75483     5   b
#  grow_df_list(500) 13.37076 13.90227 14.67498 14.53042 15.49942 16.07203     5  a
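One way to turn these timings into a growth rate is to fit a line on the log-log scale: a slope near 1 would suggest linear scaling and a slope near 2 quadratic scaling. The sketch below uses the median values reported above, converted to seconds. scaling_exponent is a hypothetical helper, and with only three points from a single machine the result is indicative at best.

```r
# Median timings from the microbenchmark output above, in seconds.
timings <- data.frame(
  n    = c(50, 100, 500),
  loop = c(0.7933590, 2.233529, 40.10221),
  list = c(0.5687658, 1.287821, 14.53042)
)

# Slope of log(time) against log(n): the empirical exponent k in time ~ n^k.
scaling_exponent <- function(n, seconds) {
  fit <- lm(log(seconds) ~ log(n))
  unname(coef(fit)[2])
}

exp_loop <- scaling_exponent(timings$n, timings$loop)
exp_list <- scaling_exponent(timings$n, timings$list)
round(c(loop = exp_loop, list = exp_list), 2)
```

On these numbers the slope comes out at roughly 1.7 for the loop and 1.4 for the list, so both are superlinear here, with the loop the steeper of the two.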
Memory Usage
Additionally, memory profiling shows the "rbind" mem.total growing sizeably with the number of iterations, more so with the loop approach than with the list approach. Given a reproducible, uniform example, can mem.total results capture the difference in object growth? Is there any other approach to use?
Loop Approach
n = 50
utils::Rprof(tmp <- tempfile(), memory.profiling = TRUE)
output_df1 <- grow_df_loop(50)
utils::Rprof(NULL)
summaryRprof(tmp, memory="both")
unlink(tmp)
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_loop"                  0.58    100.00     349.1      0.00     0.00
# "data.frame"                    0.38     65.52     209.4      0.00     0.00
# "paste"                         0.28     48.28     186.4      0.06    10.34
# "FUN"                           0.26     44.83     150.8      0.02     3.45
# "lapply"                        0.26     44.83     150.8      0.00     0.00
# "replicate"                     0.26     44.83     150.8      0.00     0.00
# "sapply"                        0.26     44.83     150.8      0.00     0.00
# "sample"                        0.20     34.48     131.4      0.08    13.79
# "rbind"                         0.20     34.48     139.7      0.00     0.00
# "[<-.factor"                    0.12     20.69      66.0      0.10    17.24
# "[<-"                           0.12     20.69      66.0      0.00     0.00
# "factor"                        0.10     17.24      47.8      0.04     6.90
# "as.data.frame"                 0.10     17.24      48.5      0.00     0.00
# "as.data.frame.character"       0.10     17.24      48.5      0.00     0.00
# "order"                         0.06     10.34      12.9      0.06    10.34
# "as.vector"                     0.04      6.90      38.7      0.04     6.90
# "sample.int"                    0.04      6.90      18.7      0.02     3.45
# "as.vector.factor"              0.04      6.90      38.7      0.00     0.00
# "deparse"                       0.04      6.90      35.6      0.00     0.00
# "!"                             0.02      3.45      18.7      0.02     3.45
# ":"                             0.02      3.45       0.0      0.02     3.45
# "anyNA"                         0.02      3.45      19.0      0.02     3.45
# "as.POSIXlt.POSIXct"            0.02      3.45      10.1      0.02     3.45
# "c"                             0.02      3.45      19.8      0.02     3.45
# "is.na"                         0.02      3.45      18.9      0.02     3.45
# "length"                        0.02      3.45      13.8      0.02     3.45
# "mode"                          0.02      3.45      16.6      0.02     3.45
# "%in%"                          0.02      3.45      16.6      0.00     0.00
# ".deparseOpts"                  0.02      3.45      19.0      0.00     0.00
# "as.Date"                       0.02      3.45      10.1      0.00     0.00
# "as.POSIXlt"                    0.02      3.45      10.1      0.00     0.00
# "Sys.Date"                      0.02      3.45      10.1      0.00     0.00
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 0.58
n = 100
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_loop"                  1.74     98.86     963.0      0.00     0.00
# "rbind"                         1.06     60.23     599.3      0.06     3.41
# "data.frame"                    0.68     38.64     363.7      0.02     1.14
# "lapply"                        0.50     28.41     239.0      0.04     2.27
# "replicate"                     0.50     28.41     239.0      0.00     0.00
# "sapply"                        0.50     28.41     239.0      0.00     0.00
# "paste"                         0.46     26.14     218.4      0.06     3.41
# "FUN"                           0.46     26.14     218.4      0.00     0.00
# "factor"                        0.44     25.00     249.2      0.24    13.64
# "sample"                        0.40     22.73     179.2      0.10     5.68
# "[<-"                           0.38     21.59     244.3      0.00     0.00
# "[<-.factor"                    0.34     19.32     229.5      0.30    17.05
# "c"                             0.26     14.77     136.6      0.26    14.77
# "as.vector"                     0.24     13.64     101.2      0.24    13.64
# "as.vector.factor"              0.24     13.64     101.2      0.00     0.00
# "order"                         0.14      7.95      87.3      0.14     7.95
# "as.data.frame"                 0.14      7.95      87.3      0.00     0.00
# "as.data.frame.character"       0.14      7.95      87.3      0.00     0.00
# "sample.int"                    0.10      5.68      28.2      0.10     5.68
# "unique"                        0.10      5.68      64.9      0.00     0.00
# "is.na"                         0.06      3.41      62.4      0.06     3.41
# "unique.default"                0.04      2.27      42.4      0.04     2.27
# "[<-.Date"                      0.04      2.27      14.9      0.00     0.00
# ".Call"                         0.02      1.14       0.0      0.02     1.14
# "Make.row.names"                0.02      1.14       0.0      0.02     1.14
# "NextMethod"                    0.02      1.14       0.0      0.02     1.14
# "structure"                     0.02      1.14      10.3      0.02     1.14
# "unclass"                       0.02      1.14      14.9      0.02     1.14
# ".Date"                         0.02      1.14       0.0      0.00     0.00
# ".rs.enqueClientEvent"          0.02      1.14       0.0      0.00     0.00
# "as.Date"                       0.02      1.14      23.2      0.00     0.00
# "as.Date.character"             0.02      1.14      23.2      0.00     0.00
# "as.Date.numeric"               0.02      1.14      23.2      0.00     0.00
# "charToDate"                    0.02      1.14      23.2      0.00     0.00
# "hook"                          0.02      1.14       0.0      0.00     0.00
# "is.na.POSIXlt"                 0.02      1.14      23.2      0.00     0.00
# "utils::Rprof"                  0.02      1.14       0.0      0.00     0.00
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 1.76
n = 500
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_loop"                 28.12    100.00   15557.7      0.00     0.00
# "rbind"                        25.30     89.97   13418.5      3.06    10.88
# "factor"                        8.94     31.79    5026.5      6.98    24.82
# "[<-"                           8.72     31.01    4486.9      0.02     0.07
# "[<-.factor"                    7.62     27.10    3915.5      7.32    26.03
# "unique"                        3.06     10.88    2060.9      0.00     0.00
# "as.vector"                     2.96     10.53    1250.1      2.96    10.53
# "as.vector.factor"              2.96     10.53    1250.1      0.00     0.00
# "data.frame"                    2.82     10.03    2139.1      0.02     0.07
# "unique.default"                2.30      8.18    1657.9      2.30     8.18
# "replicate"                     1.88      6.69    1364.7      0.00     0.00
# "sapply"                        1.88      6.69    1364.7      0.00     0.00
# "FUN"                           1.84      6.54    1367.2      0.18     0.64
# "lapply"                        1.84      6.54    1338.8      0.02     0.07
# "paste"                         1.70      6.05    1281.3      0.38     1.35
# "sample"                        1.36      4.84    1089.2      0.20     0.71
# "[<-.Date"                      1.08      3.84     571.4      0.00     0.00
# "c"                             1.04      3.70     688.7      1.04     3.70
# ".Date"                         0.96      3.41     488.0      0.34     1.21
# "sample.int"                    0.76      2.70     584.2      0.74     2.63
# "as.data.frame"                 0.70      2.49     533.6      0.00     0.00
# "as.data.frame.character"       0.64      2.28     476.0      0.00     0.00
# "NextMethod"                    0.62      2.20     424.7      0.62     2.20
# "order"                         0.60      2.13     475.5      0.50     1.78
# "structure"                     0.32      1.14     155.5      0.32     1.14
# "is.na"                         0.28      1.00     150.5      0.26     0.92
# "Make.row.names"                0.12      0.43     153.8      0.12     0.43
# "unclass"                       0.12      0.43      83.3      0.12     0.43
# "as.Date"                       0.10      0.36     120.1      0.02     0.07
# "length"                        0.06      0.21      79.2      0.06     0.21
# "seq.int"                       0.06      0.21      57.0      0.06     0.21
# "vapply"                        0.06      0.21      84.6      0.02     0.07
# ":"                             0.04      0.14       1.1      0.04     0.14
# "as.POSIXlt.POSIXct"            0.04      0.14      57.7      0.04     0.14
# "is.factor"                     0.04      0.14       0.0      0.04     0.14
# "deparse"                       0.04      0.14      55.0      0.02     0.07
# "eval"                          0.04      0.14      36.2      0.02     0.07
# "match.arg"                     0.04      0.14      25.2      0.02     0.07
# "match.fun"                     0.04      0.14      32.4      0.02     0.07
# "as.data.frame.integer"         0.04      0.14      55.0      0.00     0.00
# "as.POSIXlt"                    0.04      0.14      57.7      0.00     0.00
# "force"                         0.04      0.14      55.0      0.00     0.00
# "make.names"                    0.04      0.14      42.1      0.00     0.00
# "Sys.Date"                      0.04      0.14      57.7      0.00     0.00
# "!"                             0.02      0.07      29.6      0.02     0.07
# "$"                             0.02      0.07       2.6      0.02     0.07
# "any"                           0.02      0.07      18.3      0.02     0.07
# "as.data.frame.numeric"         0.02      0.07       2.6      0.02     0.07
# "as.data.frame.vector"          0.02      0.07      21.6      0.02     0.07
# "as.list"                       0.02      0.07      26.6      0.02     0.07
# "baseenv"                       0.02      0.07      25.2      0.02     0.07
# "is.ordered"                    0.02      0.07      14.5      0.02     0.07
# "lengths"                       0.02      0.07      14.9      0.02     0.07
# "levels"                        0.02      0.07       0.0      0.02     0.07
# "mode"                          0.02      0.07      30.7      0.02     0.07
# "names"                         0.02      0.07       0.0      0.02     0.07
# "rnorm"                         0.02      0.07      29.6      0.02     0.07
# "%in%"                          0.02      0.07      30.7      0.00     0.00
# "as.Date.character"             0.02      0.07       2.6      0.00     0.00
# "as.Date.numeric"               0.02      0.07       2.6      0.00     0.00
# "as.POSIXct"                    0.02      0.07       2.6      0.00     0.00
# "as.POSIXct.POSIXlt"            0.02      0.07       2.6      0.00     0.00
# "charToDate"                    0.02      0.07       2.6      0.00     0.00
# "eval.parent"                   0.02      0.07      11.0      0.00     0.00
# "is.na.POSIXlt"                 0.02      0.07       2.6      0.00     0.00
# "simplify2array"                0.02      0.07      14.9      0.00     0.00
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 28.12
List Approach
n = 50
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_list"                  0.40       100     257.0      0.00        0
# "data.frame"                    0.32        80     175.6      0.02        5
# "lapply"                        0.32        80     175.6      0.02        5
# "FUN"                           0.32        80     175.6      0.00        0
# "replicate"                     0.24        60     129.6      0.00        0
# "sapply"                        0.24        60     129.6      0.00        0
# "paste"                         0.22        55     119.2      0.10       25
# "sample"                        0.12        30      49.4      0.00        0
# "sample.int"                    0.08        20      39.1      0.08       20
# "<Anonymous>"                   0.08        20      81.4      0.00        0
# "do.call"                       0.08        20      81.4      0.00        0
# "rbind"                         0.08        20      81.4      0.00        0
# "factor"                        0.06        15      29.7      0.02        5
# "as.data.frame"                 0.06        15      29.7      0.00        0
# "as.data.frame.character"       0.06        15      29.7      0.00        0
# "c"                             0.04        10      10.3      0.04       10
# "order"                         0.04        10      17.3      0.04       10
# "unique.default"                0.04        10      31.1      0.04       10
# "[<-"                           0.04        10      50.3      0.00        0
# "unique"                        0.04        10      31.1      0.00        0
# ".Date"                         0.02         5      27.9      0.02        5
# "[<-.factor"                    0.02         5      22.4      0.02        5
# "[<-.Date"                      0.02         5      27.9      0.00        0
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 0.4
n = 100
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_list"                  1.00       100     620.4      0.00        0
# "data.frame"                    0.66        66     401.8      0.00        0
# "FUN"                           0.66        66     401.8      0.00        0
# "lapply"                        0.66        66     401.8      0.00        0
# "paste"                         0.42        42     275.3      0.14       14
# "replicate"                     0.42        42     275.3      0.00        0
# "sapply"                        0.42        42     275.3      0.00        0
# "rbind"                         0.34        34     218.6      0.02        2
# "<Anonymous>"                   0.34        34     218.6      0.00        0
# "do.call"                       0.34        34     218.6      0.00        0
# "sample"                        0.28        28     188.6      0.08        8
# "unique.default"                0.20        20      90.1      0.20       20
# "unique"                        0.20        20      90.1      0.00        0
# "as.data.frame"                 0.18        18      81.2      0.00        0
# "factor"                        0.16        16      81.2      0.02        2
# "as.data.frame.character"       0.16        16      81.2      0.00        0
# "[<-.factor"                    0.14        14     112.0      0.14       14
# "sample.int"                    0.14        14      96.8      0.14       14
# "[<-"                           0.14        14     112.0      0.00        0
# "order"                         0.12        12      51.2      0.12       12
# "c"                             0.06         6      45.8      0.06        6
# "as.Date"                       0.04         4      28.3      0.02        2
# "length"                        0.02         2      17.0      0.02        2
# "strptime"                      0.02         2      11.2      0.02        2
# "structure"                     0.02         2       0.0      0.02        2
# "as.data.frame.integer"         0.02         2       0.0      0.00        0
# "as.Date.character"             0.02         2      11.2      0.00        0
# "as.Date.numeric"               0.02         2      11.2      0.00        0
# "charToDate"                    0.02         2      11.2      0.00        0
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 1
n = 500
# $by.total
#                           total.time total.pct mem.total self.time self.pct
# "grow_df_list"                  9.40    100.00    5621.8      0.00     0.00
# "rbind"                         6.12     65.11    3633.5      0.44     4.68
# "<Anonymous>"                   6.12     65.11    3633.5      0.00     0.00
# "do.call"                       6.12     65.11    3633.5      0.00     0.00
# "lapply"                        3.28     34.89    1988.3      0.34     3.62
# "FUN"                           3.28     34.89    1988.3      0.10     1.06
# "data.frame"                    3.28     34.89    1988.3      0.02     0.21
# "[<-"                           3.28     34.89    2118.4      0.00     0.00
# "[<-.factor"                    3.00     31.91    1829.1      3.00    31.91
# "replicate"                     2.36     25.11    1422.9      0.00     0.00
# "sapply"                        2.36     25.11    1422.9      0.00     0.00
# "unique"                        2.32     24.68    1189.9      0.00     0.00
# "paste"                         1.98     21.06    1194.2      0.70     7.45
# "unique.default"                1.96     20.85    1017.8      1.96    20.85
# "sample"                        1.20     12.77     707.4      0.44     4.68
# "as.data.frame"                 0.88      9.36     540.5      0.02     0.21
# "as.data.frame.character"       0.78      8.30     496.2      0.00     0.00
# "factor"                        0.72      7.66     444.2      0.06     0.64
# "c"                             0.68      7.23     379.6      0.68     7.23
# "order"                         0.64      6.81     385.1      0.64     6.81
# "sample.int"                    0.40      4.26     233.0      0.38     4.04
# ".Date"                         0.28      2.98     289.3      0.10     1.06
# "[<-.Date"                      0.28      2.98     289.3      0.00     0.00
# "NextMethod"                    0.18      1.91     171.2      0.18     1.91
# "deparse"                       0.08      0.85      54.6      0.02     0.21
# "%in%"                          0.08      0.85      54.6      0.00     0.00
# "mode"                          0.08      0.85      54.6      0.00     0.00
# "length"                        0.06      0.64      10.4      0.06     0.64
# "structure"                     0.06      0.64      30.8      0.04     0.43
# ".deparseOpts"                  0.06      0.64      49.1      0.02     0.21
# "[["                            0.06      0.64      34.2      0.02     0.21
# ":"                             0.04      0.43      33.6      0.04     0.43
# "[[.data.frame"                 0.04      0.43      22.6      0.04     0.43
# "force"                         0.04      0.43      20.0      0.00     0.00
# "as.vector"                     0.02      0.21       0.0      0.02     0.21
# "is.na"                         0.02      0.21       0.0      0.02     0.21
# "levels"                        0.02      0.21      14.6      0.02     0.21
# "make.names"                    0.02      0.21       9.4      0.02     0.21
# "pmatch"                        0.02      0.21      17.3      0.02     0.21
# "as.data.frame.Date"            0.02      0.21       5.5      0.00     0.00
# "as.data.frame.integer"         0.02      0.21       0.0      0.00     0.00
# "as.data.frame.logical"         0.02      0.21      14.5      0.00     0.00
# "as.data.frame.numeric"         0.02      0.21      13.5      0.00     0.00
# "as.data.frame.vector"          0.02      0.21      17.3      0.00     0.00
# "simplify2array"                0.02      0.21       0.0      0.00     0.00
# 
# $sample.interval
# [1] 0.02
# 
# $sampling.time
# [1] 9.4
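To compare figures such as the "rbind" mem.total across runs without reading each summary by eye, the profiling calls can be wrapped in a small helper that returns selected rows of $by.total. This is only a sketch: profile_by_total and toy_grow are hypothetical names, the toy workload stands in for the functions above, and the numbers depend on the machine and on how many samples the profiler happens to take.

```r
# Profile a zero-argument function and return selected rows of $by.total.
profile_by_total <- function(f, fn_names = c("rbind", "data.frame")) {
  tmp <- tempfile()
  on.exit({ utils::Rprof(NULL); unlink(tmp) })
  utils::Rprof(tmp, memory.profiling = TRUE)
  f()
  utils::Rprof(NULL)
  by_total <- utils::summaryRprof(tmp, memory = "both")$by.total
  # Row names carry literal double quotes (e.g. "rbind"), so strip them first.
  fn <- gsub('"', "", rownames(by_total), fixed = TRUE)
  keep <- fn %in% fn_names
  data.frame(fn = fn[keep],
             by_total[keep, c("total.time", "mem.total")],
             row.names = NULL)
}

# Toy workload (an assumption for illustration): grow a data frame with rbind.
toy_grow <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(x = rnorm(2000), y = rnorm(2000)))
  }
  out
}

res <- profile_by_total(function() toy_grow(200))
res
```

Calling the helper once per iteration count and stacking the results gives a small table that can be plotted directly.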
Graphs (using a different call to save $by.total results)



