As a general rule of thumb, in any programming language we should undertake the memory management as much as possible. When we grow a vector inside a loop, the vector asks the processor for extra space in between the running program and then proceeds once it gets the required memory. This process is repeated for every iteration of the loop resulting in massive delays. Thus we should pre-allocate the required memory to a vector to avoid such delays.
This Memory Allocation Optimizer checks the code for vectors that have been initialized without proper memory allocation and whenever possible, changes that allocation to an allocation with a fixed size, taken from the the loops where the vector is being called.
Consider the following example:
code <- paste(
"square_vec <- NULL",
"mul_vec <- NULL",
"for(i in 1:100) {",
" square_vec[i] <- i^2",
"}",
"for(i in 1:100) {",
" mul_vec[i] <- i * i",
"}",
"identical(square_vec, mul_vec)",
sep = "\n"
)
cat(code)
## square_vec <- NULL
## mul_vec <- NULL
## for(i in 1:100) {
## square_vec[i] <- i^2
## }
## for(i in 1:100) {
## mul_vec[i] <- i * i
## }
## identical(square_vec, mul_vec)
Then, the automatically optimized code would be:
opt_code <- opt_memory_alloc(list(code))
cat(opt_code$codes[[1]])
## square_vec <- vector(length = 100)
## mul_vec <- vector(length = 100)
## for(i in 1:100) {
## square_vec[i] <- i^2
## }
## for(i in 1:100) {
## mul_vec[i] <- i * i
## }
## identical(square_vec, mul_vec)
And if we measure the execution time of each one, and the speed-up:
bmark_res <- microbenchmark({
eval(parse(text = code))
}, {
eval(parse(text = opt_code))
})
autoplot(bmark_res)
speed_up(bmark_res)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## Expr_2 50.85415 48.00791 50.40635 50.52503 47.73553 136.3603
The memory-allocation-optimizer
looks for vector assignments in the entire code snippet, that have been assigned without proper allocation of memory. For instance, consider a vector vec
, now this vector can be initialized without proper memory allocation in the following ways:
vec <- NULL
vec = c()
vec <- NA
logical() -> vec()
When we spot these improper initializations of vectors, we change the intialization to include a suitable memory allocation in the following way, if possible:
vec <- vector(length = 10)
While the highly flexible and accomodating nature of the R Language is a boon for programmers, that very nature has limited the scope of this optimizer to a certain extent. In this optimizer the following points have been kept in mind:
FOR
loops have been considered, owing to the fact that their sizes, most of the times, can be predicted accurately which is not the case with while
and repeat
loops.FOR
loop.