A Walk Through a Slice of Combinatorics in R*
Below, we examine packages equipped with the capabilities of generating combinations & permutations. If I have left out any package, please forgive me and please leave a comment or better yet, edit this post.
Outline of analysis:
- Introduction
- Setup
- Combinations
- Permutations
- Multisets
- Summary
- Memory
Before we begin, we note that combinations/permutations with replacement of distinct vs. non-distint items chosen m at a time are equivalent. This is so, because when we have replacement, it is not specific. Thus, no matter how many times a particular element originally occurs, the output will have an instance(s) of that element repeated 1 to m times.
1. Introduction
Packages:
gtools
combinat
multicool
partitions
RcppAlgos
arrangements
utils
I did not include permute or permutations as they are not really meant to attack these types of problems. I also did not include the updated gRbase as certain cases crashed my computer.
|————————————— OVERVIEW —————————————-|
|_________________| gtools | combinat | multicool | partitions |
| comb rep | Yes | | | |
| comb NO rep | Yes | Yes | | |
| perm rep | Yes | | | |
| perm NO rep | Yes | Yes | Yes | Yes |
| perm multiset | | | Yes | Yes |
| comb multiset | | | | |
| accepts factors | | Yes | | |
| m at a time | Yes | Yes/No | | |
| general vector | Yes | Yes | Yes | |
| iterable | | | Yes | |
| parallelizable | | | | |
| multi-threaded | | | | |
| big integer | | | | |
|_________________| arrangements | RcppAlgos | utils |
| comb rep | Yes | Yes | |
| comb NO rep | Yes | Yes | Yes |
| perm rep | Yes | Yes | |
| perm NO rep | Yes | Yes | |
| perm multiset | Yes | Yes | |
| comb multiset | Yes | Yes | |
| accepts factors | Yes | Yes | Yes |
| m at a time | Yes | Yes | Yes |
| general vector | Yes | Yes | Yes |
| iterable | Yes | Yes | |
| parallelizable | Yes | Yes | |
| big integer | Yes | Yes | |
| multi-threaded | | Yes | |
The tasks, m at a time and general vector, refer to the capability of generating results “m at a time” and rearranging a “general vector” as opposed to 1:n. In practice, we are generally concerned with finding rearrangements of a general vector, therefore all examinations below will reflect this when possible.
2. Setup
All benchmarks were ran on 3 different set-ups.
- 2022 Macbook Air Apple M2 24 GB
- 2020 Macbook Pro i7 16 GB
- 2022 Windows Surface i5 16 GB
library(microbenchmark)
## print up to 4 digits to keep microbenchmark output tidy
options(digits = 4)
options(width = 90)
numThreads <- min(as.integer(RcppAlgos::stdThreadMax() / 2), 6)
numThreads
#> [1] 4
pkgs <- c("gtools", "combinat", "multicool", "partitions",
"RcppAlgos", "arrangements", "utils", "microbenchmark")
sapply(pkgs, packageVersion, simplify = FALSE)
#> $gtools
#> [1] '3.9.3'
#>
#> $combinat
#> [1] '0.0.8'
#>
#> $multicool
#> [1] '0.1.12'
#>
#> $partitions
#> [1] '1.10.7'
#>
#> $RcppAlgos
#> [1] '2.6.0'
#>
#> $arrangements
#> [1] '1.1.9'
#>
#> $utils
#> [1] '4.2.1'
#>
#> $microbenchmark
#> [1] '1.4.7'
The listed results were obtained from setup #1 (i.e. Macbook Air M2). The results on the Macbook Pro were similar, however with the Windows setup, multi-threading was less effective. In some cases on the Windows setup, the serial execution was faster. We will call all functions with the pattern package::function so no library calls are needed.
3. Combinations
First, we examine combinations without replacement chosen m at a time.
RcppAlgos
combinat
gtools
arrangements
utils
How to:
set.seed(13)
tVec1 <- sort(sample(100, 20))
m <- 10
t1 <- RcppAlgos::comboGeneral(tVec1, m) ## returns matrix with m columns
t3 <- combinat::combn(tVec1, m) ## returns matrix with m rows
t4 <- gtools::combinations(20, m, tVec1) ## returns matrix with m columns
identical(t(t3), t4) ## must transpose to compare
#> [1] TRUE
t5 <- arrangements::combinations(tVec1, m)
identical(t1, t5)
#> [1] TRUE
t6 <- utils::combn(tVec1, m) ## returns matrix with m rows
identical(t(t6), t4) ## must transpose to compare
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::comboGeneral(tVec1, m,
nThreads = numThreads),
cbRcppAlgosSer = RcppAlgos::comboGeneral(tVec1, m),
cbGtools = gtools::combinations(20, m, tVec1),
cbCombinat = combinat::combn(tVec1, m),
cbUtils = utils::combn(tVec1, m),
cbArrangements = arrangements::combinations(tVec1, m),
unit = "relative"
)
#> Warning in microbenchmark(cbRcppAlgosPar = RcppAlgos::comboGeneral(tVec1, : less accurate
#> nanosecond times to avoid potential integer overflows
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.0000 100
#> cbRcppAlgosSer 2.712 2.599 1.280 2.497 2.477 0.2666 100
#> cbGtools 739.325 686.803 271.623 679.894 661.560 11.7178 100
#> cbCombinat 173.836 166.265 76.735 176.052 191.945 4.9075 100
#> cbUtils 179.895 170.345 71.639 169.661 178.385 3.8900 100
#> cbArrangements 2.717 2.995 1.975 2.835 2.935 0.8195 100
Now, we examine combinations with replacement chosen m at a time.
RcppAlgos
gtools
arrangements
How to:
set.seed(97)
tVec2 <- sort(rnorm(12))
m <- 9
t1 <- RcppAlgos::comboGeneral(tVec2, m, repetition = TRUE)
t3 <- gtools::combinations(12, m, tVec2, repeats.allowed = TRUE)
identical(t1, t3)
#> [1] TRUE
t4 <- arrangements::combinations(tVec2, m, replace = TRUE)
identical(t1, t4)
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::comboGeneral(tVec2, m, TRUE,
nThreads = numThreads),
cbRcppAlgosSer = RcppAlgos::comboGeneral(tVec2, m, TRUE),
cbGtools = gtools::combinations(12, m, tVec2, repeats.allowed = TRUE),
cbArrangements = arrangements::combinations(tVec2, m, replace = TRUE),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.0000 1.000 1.0000 1.0000 100
#> cbRcppAlgosSer 1.987 2.382 0.9173 2.347 1.2776 0.9733 100
#> cbGtools 670.126 517.561 103.8726 523.135 177.0940 12.0440 100
#> cbArrangements 2.300 2.582 0.8294 2.542 0.9212 1.0089 100
4. Permutations
First, we examine permutations without replacement chosen m at a time.
RcppAlgos
gtools
arrangements
How to:
tVec3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
t1 <- RcppAlgos::permuteGeneral(tVec3, 6)
t3 <- gtools::permutations(10, 6, tVec3)
identical(t1, t3)
#> [1] TRUE
t4 <- arrangements::permutations(tVec3, 6)
identical(t1, t4)
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(tVec3, 6,
nThreads = numThreads),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(tVec3, 6),
cbGtools = gtools::permutations(10, 6, tVec3),
cbArrangements = arrangements::permutations(tVec3, 6),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.0000 100
#> cbRcppAlgosSer 1.204 1.553 1.522 1.523 1.509 0.5722 100
#> cbGtools 357.078 308.978 288.396 301.611 292.862 64.8564 100
#> cbArrangements 2.356 2.361 2.183 2.292 2.224 0.4605 100
Next, we examine permutations without replacement with a general vector (returning all permutations).
RcppAlgos
gtools
combinat
multicool
arrangements
How to:
tVec3Prime <- tVec3[1:9]
## For RcppAlgos, arrangements, & gtools (see above)
t4 <- combinat::permn(tVec3Prime) ## returns a list of vectors
## convert to a matrix
t4 <- do.call(rbind, t4)
t5 <- multicool::allPerm(multicool::initMC(tVec3Prime)) ## returns a matrix with n columns
all.equal(t4[do.call(order,as.data.frame(t4)),],
t5[do.call(order,as.data.frame(t5)),])
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(tVec3Prime,
nThreads = numThreads),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(tVec3Prime),
cbGtools = gtools::permutations(9, 9, tVec3Prime),
cbCombinat = combinat::permn(tVec3Prime),
cbMulticool = multicool::allPerm(multicool::initMC(tVec3Prime)),
cbArrangements = arrangements::permutations(tVec3Prime),
times = 25,
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.0 25
#> cbRcppAlgosSer 1.555 2.187 2.616 2.190 2.274 10.3 25
#> cbGtools 1913.125 1850.589 1893.918 1877.707 1915.601 2124.5 25
#> cbCombinat 508.360 512.182 562.042 532.123 595.722 715.3 25
#> cbMulticool 103.061 103.694 128.480 118.169 127.118 208.3 25
#> cbArrangements 3.216 3.583 13.566 3.544 3.561 139.2 25
Now, we examine permutations without replacement for 1:n (returning all permutations).
RcppAlgos
gtools
combinat
multicool
partitions
arrangements
How to:
t1 <- partitions::perms(9) ## returns an object of type 'partition' with n rows
identical(t(as.matrix(t1)), RcppAlgos::permuteGeneral(9))
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(9, nThreads = numThreads),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(9),
cbGtools = gtools::permutations(9, 9),
cbCombinat = combinat::permn(9),
cbMulticool = multicool::allPerm(multicool::initMC(1:9)),
cbPartitions = partitions::perms(9),
cbArrangements = arrangements::permutations(9),
times = 25,
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.000 25
#> cbRcppAlgosSer 1.583 2.218 2.587 2.261 2.591 1.814 25
#> cbGtools 2010.303 1855.443 1266.853 1898.458 1903.055 217.422 25
#> cbCombinat 511.196 515.197 392.252 533.737 652.125 86.305 25
#> cbMulticool 108.152 103.188 80.469 108.227 120.804 23.504 25
#> cbPartitions 6.139 6.018 7.167 5.993 6.403 9.446 25
#> cbArrangements 4.089 3.797 3.135 3.791 3.760 1.858 25
Lastly, we examine permutations with replacement.
RcppAlgos
gtools
arrangements
How to:
t1 <- RcppAlgos::permuteGeneral(tVec3, 5, repetition = TRUE)
t3 <- gtools::permutations(10, 5, tVec3, repeats.allowed = TRUE)
t4 <- arrangements::permutations(x = tVec3, k = 5, replace = TRUE)
identical(t1, t3)
#> [1] TRUE
identical(t1, t4)
#> [1] TRUE
This next benchmark is a little surprising given the results up until now.
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(
tVec3, 5, TRUE, nThreads = numThreads
),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(tVec3, 5, TRUE),
cbGtools = gtools::permutations(10, 5, tVec3, repeats.allowed = TRUE),
cbArrangements = arrangements::permutations(tVec3, 5, replace = TRUE),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.000 100
#> cbRcppAlgosSer 1.307 1.669 1.465 1.561 1.513 1.015 100
#> cbGtools 6.364 6.188 5.448 5.762 5.434 1.625 100
#> cbArrangements 2.584 2.442 1.824 2.265 2.135 0.117 100
That is not a typo… gtools::permutations is almost as fast as the other compiled functions. I encourage the reader to go check out the source code of gtools::permutations as it is one of the most elegant displays of programming out there (R or otherwise).
5. Multisets
First, we examine combinations of multisets.
RcppAlgos
arrangements
To find combinations/permutations of multisets, with RcppAlgos use the freqs arguments to specify how many times each element of the source vector, v, is repeated.
set.seed(496)
myFreqs <- sample(1:5, 12, replace = TRUE)
## This is how many times each element will be repeated
tVec4 <- as.integer(c(1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233))
t1 <- RcppAlgos::comboGeneral(tVec4, 12, freqs = myFreqs)
t3 <- arrangements::combinations(tVec4, 12, freq = myFreqs)
identical(t1, t3)
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::comboGeneral(
tVec4, 12, freqs = myFreqs, nThreads = numThreads
),
cbRcppAlgosSer = RcppAlgos::comboGeneral(tVec4, 12, freqs = myFreqs),
cbArrangements = arrangements::combinations(tVec4, 12, freq = myFreqs),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.0000 100
#> cbRcppAlgosSer 3.197 3.012 2.003 2.831 2.681 0.1658 100
#> cbArrangements 9.391 7.830 4.901 7.252 6.731 0.3140 100
For permutations of multisets chosen m at a time, we have:
RcppAlgos
arrangements
How to:
set.seed(123)
tVec5 <- sort(runif(5))
t1 <- RcppAlgos::permuteGeneral(tVec5, 8, freqs = rep(4, 5))
t3 <- arrangements::permutations(tVec5, 8, freq = rep(4, 5))
identical(t1, t3)
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(
tVec5, 8, freqs = rep(4, 5), nThreads = numThreads
),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(tVec5, 8, freqs = rep(4, 5)),
cbArrangements = arrangements::permutations(tVec5, 8, freq = rep(4, 5)),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.000 100
#> cbRcppAlgosSer 3.336 3.326 2.990 3.330 3.517 2.106 100
#> cbArrangements 3.751 3.746 3.346 3.757 3.840 2.305 100
For permutations of multisets returning all permutations, we have:
RcppAlgos
multicool
partitions
arrangements
How to:
tVec6 <- (1:5)^3
## For multicool, you must have the elements explicitly repeated
tVec6Prime <- rep(tVec6, times = rep(2, 5))
## for comparison
t1 <- RcppAlgos::permuteGeneral(tVec6, freqs = rep(2, 5))
t2 <- partitions::multiset(tVec6Prime)
t3 <- multicool::allPerm(multicool::initMC(tVec6Prime))
t4 <- arrangements::permutations(tVec6, freq = rep(2, 5))
## the package partitions, returns class of integer
## whereas RcppAlgos preserves class of tVec6 (i.e. numeric)
all.equal(t1, t(as.matrix(t2)))
#> [1] TRUE
identical(t1[do.call(order,as.data.frame(t1)),],
t3[do.call(order,as.data.frame(t3)),])
#> [1] TRUE
identical(t1, t4)
#> [1] TRUE
Benchmark:
microbenchmark(
cbRcppAlgosPar = RcppAlgos::permuteGeneral(
tVec6, freqs = rep(2, 5), nThreads = numThreads
),
cbRcppAlgosSer = RcppAlgos::permuteGeneral(tVec6, freqs = rep(2, 5)),
cbMulticool = multicool::allPerm(multicool::initMC(tVec6Prime)),
cbPartitions = partitions::multiset(tVec6Prime),
cbArrangements = arrangements::permutations(tVec6, freq = rep(2, 5)),
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgosPar 1.000 1.000 1.000 1.000 1.000 1.0000 100
#> cbRcppAlgosSer 2.485 2.141 2.289 2.584 2.511 1.0250 100
#> cbMulticool 80.195 66.237 45.540 64.971 66.057 3.5655 100
#> cbPartitions 5.731 4.786 3.925 4.719 4.953 0.4558 100
#> cbArrangements 2.999 2.907 3.270 2.966 2.906 3.1214 100
6. Summary
Both gtools and combinat are well established packages for rearranging elements of a vector. With gtools there are a few more options (see the overview above) and with combinat, you can rearrange factors. With multicool, one is able to rearrange multisets. Although partitions is limited for the purposes of this question, it is a powerhouse packed with highly efficient functions for dealing with integer partitions.
arrangements
- The output is in lexicographical order.
- Allows the user to specify the format via the
layout argument (“row : row-major”, “colmnn : column-major”, and “list : list”).
- Offers convenient methods such as
collect & getnext when working with iterators.
- Allows for the generation of more than
2^31 - 1 combinations/permutations via getnext. N.B. RcppAlgos (via nextItem) and multicool (via nextPerm) are also capable of doing this.
- GMP support allows for exploration of combinations/permutations of vectors with many results.
Observe:
icomb <- arrangements::icombinations(1000, 7)
icomb$getnext()
#> [1] 1 2 3 4 5 6 7
icomb$getnext(d = 5)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 2 3 4 5 6 8
#> [2,] 1 2 3 4 5 6 9
#> [3,] 1 2 3 4 5 6 10
#> [4,] 1 2 3 4 5 6 11
#> [5,] 1 2 3 4 5 6 12
This feature is really nice when you only want a few combinations/permutations. With traditional methods, you would have to generate all combinations/permutations and then subset. This would render the previous example impossible as there are more than 10^17 results (i.e. ncombinations(1000, 7, bigz = TRUE) = 194280608456793000).
This feature along with the improvements to the generators in arrangements, allow it to be very efficient with respect to memory.
RcppAlgos
- The output is in lexicographical order.
- There are convenient constraint features that we will not discuss here as they are off-topic for this question. I will only note that the types of problems that can be solved by utilizing these features were the motivation for creating this package (partitions, subset-sum, etc.).
- GMP support allows for exploration of combinations/permutations of vectors with many results.
- Produce results in parallel using the
Parallel or nThreads arguments.
- Similar to
combn, there is a FUN argument for applying a function to each result (See also FUN.VALUE).
- Provides flexible and merory efficient iterators that allow for bidirectional iteration as well as random access.
nextItem|nextNIter|nextRemaining: Retrieve the next lexicographical result(s)
prevItem|prevNIter|prevRemaining: Retrieve the previous lexicographical result(s)
front|back|[[: Random access methods
- Allows for easy generation of more than
2^31 - 1 results from any starting place.
Observe:
iter <- RcppAlgos::comboIter(1000, 7)
# first combinations
iter@nextIter()
#> [1] 1 2 3 4 5 6 7
# next 5 combinations
iter@nextNIter(5)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 2 3 4 5 6 8
#> [2,] 1 2 3 4 5 6 9
#> [3,] 1 2 3 4 5 6 10
#> [4,] 1 2 3 4 5 6 11
#> [5,] 1 2 3 4 5 6 12
# from the current state, the previous combination
iter@prevIter()
#> [1] 1 2 3 4 5 6 11
# the last combination
iter@back()
#> [1] 994 995 996 997 998 999 1000
# the 5th combination
iter[[5]]
#> [1] 1 2 3 4 5 6 11
# you can even pass a vector of indices
iter[[c(1, 3, 5)]]
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 2 3 4 5 6 7
#> [2,] 1 2 3 4 5 6 9
#> [3,] 1 2 3 4 5 6 11
# start iterating from any index
iter[[gmp::pow.bigz(2, 31)]]
#> [1] 1 2 3 17 138 928 954
# get useful info about the current state
iter@summary()
#> $description
#> [1] "Combinations of 1000 choose 7"
#>
#> $currentIndex
#> Big Integer ('bigz') :
#> [1] 2147483648
#>
#> $totalResults
#> Big Integer ('bigz') :
#> [1] 194280608456793000
#>
#> $totalRemaining
#> Big Integer ('bigz') :
#> [1] 194280606309309352
## get next ieteration
iter@nextIter()
#> [1] 1 2 3 17 138 928 955
In case you were wondering how each package scales, I will leave you with this final example that measures how fast RcppAlgos and the arrangements packages can generate over 100 million results. Note, gtools::combinations is left out here as it will throw the error: evaluation nested too deeply.... We also leave out combn as it takes quite some time to execute. Curiously, the differences in memory usage between utils::combn and combinat::combn is quite bizarre given that they are only marginally different (see ?utils::combn under the “Authors” section).
Observe:
set.seed(2187)
tVec7 <- sort(sample(10^7, 10^3))
## 166,167,000 Combinations
system.time(RcppAlgos::comboGeneral(tVec7, 3))
#> user system elapsed
#> 0.386 0.105 0.490
system.time(arrangements::combinations(x = tVec7, k = 3))
#> user system elapsed
#> 0.439 0.105 0.545
## 124,251,000 Permuations
system.time(RcppAlgos::permuteGeneral(tVec7[1:500], 3))
#> user system elapsed
#> 0.141 0.076 0.218
system.time(arrangements::permutations(x = tVec7[1:500], k = 3))
#> user system elapsed
#> 0.386 0.077 0.463
7. Memory
When executing comboGeneral as well as arrangements::combinations, the memory will jump almost 2 Gbs before calling gc. This seems about right as #rows * #nols * bytesPerCell / 2^30 bytes = choose(1000,3) * 3 * 4 / 2^30 bytes = (166167000 * 3 * 4)/2^30 = 1.857 Gbs). However, when executing combn, the memory behavior was eratic (e.g. sometimes it would use all 16 Gb of memory and other times it would only spike a couple of Gbs). When I tested this on the Windows set-up, it would often crash.
We can confirm this using Rprof along with summaryRporf. Observe:
Rprof("RcppAlgos.out", memory.profiling = TRUE)
t1 <- RcppAlgos::comboGeneral(tVec7, 3)
Rprof(NULL)
head(summaryRprof("RcppAlgos.out", memory = "both")$by.total, n = 1)
#> total.time total.pct mem.total self.time self.pct
#> "RcppAlgos::comboGeneral" 0.42 100 1902 0.42 100
Rprof("arrangements.out", memory.profiling = TRUE)
t3 <- arrangements::combinations(tVec7, 3)
Rprof(NULL)
head(summaryRprof("arrangements.out", memory = "both")$by.total, n = 1)
#> total.time total.pct mem.total self.time self.pct
#> "arrangements::combinations" 0.5 100 1902 0.5 100
With RcppAlgos & arrangements, mem.total registers just over 1900 Mb.
And here is the memory profile on a smaller vector.
tVec7Prime <- tVec7[1:300]
Rprof("combinat.out", memory.profiling = TRUE)
t3 <- combinat::combn(tVec7Prime, 3)
Rprof(NULL)
head(summaryRprof("combinat.out", memory = "both")$by.total, n = 1)
#> total.time total.pct mem.total self.time self.pct
#> "combinat::combn" 2.1 100 1055 1.98 94.29
Rprof("utils.out", memory.profiling = TRUE)
t4 <- utils::combn(tVec7Prime, 3)
Rprof(NULL)
head(summaryRprof("utils.out", memory = "both")$by.total, n = 1)
#> total.time total.pct mem.total self.time self.pct
#> "utils::combn" 1.6 100 2059 1.6 100
Rprof("gtools.out", memory.profiling = TRUE)
t5 <- gtools::combinations(300, 3, tVec7Prime)
Rprof(NULL)
head(summaryRprof("gtools.out", memory = "both")$by.total, n = 1)
#> total.time total.pct mem.total self.time self.pct
#> "rbind" 1.62 100 6659 1.46 90.12
Interestingly, utils::combn and combinat::combn use different amounts of memory and take different amounts of time to execute. This does not hold up with smaller vectors:
microbenchmark(combinat::combn(2:13, 6), utils::combn(2:13, 6))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> combinat::combn(2:13, 6) 313.4 326.7 329.4 328.1 330.4 370.6 100
#> utils::combn(2:13, 6) 378.3 393.1 397.0 395.2 399.2 451.2 100
And with gtools the total memory used is a little over 3x as much as utils. It should be noted that for these 3 packages, I obtained different results every-time I ran them (e.g. for combinat::combn sometimes I would get 9000 Mb and then I would get 13000 Mb).
Still, none can match RcppAlgos OR arrangements. Both only use 51 Mb when ran on the example above.
benchmark script: https://github.com/jwood000/RcppAlgos/blob/main/scripts/SO_Comb_Perm_in_R.R
*: An homage to A Walk through Combinatorics by Miklós Bóna