I'm new to R and I'm stuck on a problem I can't solve by myself.
A friend recommended that I use one of the apply functions, but I just don't get how to use it in this case. Anyway, on to the problem! =)
Inside the inner while loop, I have an ifelse. That is the bottleneck. It takes on average 1 second to run each iteration. The slow part is marked with #slow part start/end in the code.
Given that we will run it 2000*100 = 200000 times, it will take approximately 55.5 hours to finish each time we run this code. The bigger problem is that this code will be reused a lot, so x*55.5 hours is just not doable.
Below is the fraction of the code relevant to the question:
    #dt is a data.table with close to 1.5 million observations of 11 variables
    #rand.mat is a 110*100 integer matrix
    j <- 1
    while(j <= 2000)
    {  
            #other code is executed here, not relevant to the question
            i <- 1
            while(i <= 100)
            {
                    #slow part start
                    dt$column2 = ifelse(dt$datecolumn %in% c(rand.mat[,i]) & dt$column4==index[i], NA, dt$column2)
                    #slow part end
                    i <- i + 1
            }
            #other code is executed here, not relevant to the question
            j <- j + 1
    }
Please, any advice would be greatly appreciated.
EDIT - Run the code below to reproduce the problem:
library(data.table)
dt = data.table(
    datecolumn = c("20121101", "20121101", "20121104", "20121104", "20121130", "20121130",
                   "20121101", "20121101", "20121104", "20121104", "20121130", "20121130"),
    column2 = c("5", "3", "4", "6", "8", "9", "2", "4", "3", "5", "6", "8"),
    column3 = c("5", "3", "4", "6", "8", "9", "2", "4", "3", "5", "6", "8"),
    column4 = c("1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2"))
unq_date <- c(20121101L, 
20121102L, 20121103L, 20121104L, 20121105L, 20121106L, 20121107L, 
20121108L, 20121109L, 20121110L, 20121111L, 20121112L, 20121113L, 
20121114L, 20121115L, 20121116L, 20121117L, 20121118L, 20121119L, 
20121120L, 20121121L, 20121122L, 20121123L, 20121124L, 20121125L, 
20121126L, 20121127L, 20121128L, 20121129L, 20121130L
)
index <- as.numeric(dt$column4)
numberOfRepititions <- 2
set.seed(131107)
rand.mat <- replicate(numberOfRepititions, sample(unq_date, numberOfRepititions))
i <- 1
while(i <= numberOfRepititions)
{       
    dt$column2 = ifelse(dt$datecolumn %in% c(rand.mat[,i]) & dt$column4==index[i], NA, dt$column2)      
    i <- i + 1
}
Notice that we won't be able to run the loop more than 2 times now unless dt grows in rows so that it has the initial 100 types of column4 (which is just an integer value 1-100).
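One direction that might help, sketched here under assumptions and not tested on the real 1.5 million row data: data.table's `:=` operator updates a column by reference for only the rows selected in `i`, so each iteration would not have to build a full-length ifelse result and copy it back into the column. In the sketch below, the toy table, dates_to_blank and group_id are made-up stand-ins for dt, rand.mat[,i] and index[i]; the integer dates are converted with as.character so they have the same type as datecolumn.

```r
library(data.table)

# Toy data in the same shape as the reproducible example (values are made up)
dt_small <- data.table(
    datecolumn = c("20121101", "20121104", "20121130", "20121101", "20121104", "20121130"),
    column2    = c("5", "4", "8", "2", "3", "6"),
    column4    = c("1", "1", "1", "2", "2", "2"))

# Hypothetical stand-ins for rand.mat[,i] and index[i]
dates_to_blank <- as.character(c(20121101L, 20121130L))
group_id <- "1"

# Set column2 to NA only on the matching rows, modifying dt_small in place
dt_small[datecolumn %in% dates_to_blank & column4 == group_id, column2 := NA_character_]

print(dt_small)
```

Only the rows where both conditions hold are touched; all other values of column2 are left as they were. Whether this is fast enough inside the 2000*100 loop would still need to be timed, for example with system.time.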
 
     
    