I have a data frame that looks like this:

    df = data.frame(id = c(1,1,1,2,2,2,3,3,3),
                    date = rep(c("1990-01","1990-02","1990-03"), 3),
                    pd = c(0.005,0.004,0.003,0.001,0.0005,0.002,0.008,0.0065,0.002))
    df
    # id    date     pd
    #  1 1990-01 0.0050
    #  1 1990-02 0.0040
    #  1 1990-03 0.0030
    #  2 1990-01 0.0010
    #  2 1990-02 0.0005
    #  2 1990-03 0.0020
    #  3 1990-01 0.0080
    #  3 1990-02 0.0065
    #  3 1990-03 0.0020

Each id refers to a different company. I'd like to calculate the 'distance to default', -qnorm(pd_t) - (-qnorm(pd_{t-1})), i.e. the change in -qnorm(pd) from one date to the next, computed separately for each id.
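
To make the formula concrete, here is a minimal sketch for a single company (id 1), using base R only. The names `dd_level` and `dd_change` are just illustrative, and the sketch assumes the rows are already sorted by date:

```r
# pd values of company id 1, assumed to be in date order
pd <- c(0.005, 0.004, 0.003)

# -qnorm(pd_t) for each month
dd_level <- -qnorm(pd)

# -qnorm(pd_t) - (-qnorm(pd_{t-1})); the first month has no predecessor, hence NA
dd_change <- c(NA, diff(dd_level))

round(dd_change, 4)
# NA 0.0762 0.0957 (the DD values I expect for id 1)
```
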
My code produces the output I am looking for, but it takes a very long time because the real data frame is large:
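
    library(dplyr)  # for %>%, mutate() and lag()
    id_vec = c(1:3)
    df$DD = NA
    for (i in 1:3) {
      # change in -qnorm(pd) relative to the previous row of the same company
      df[df$id == id_vec[i], ] = df[df$id == id_vec[i], ] %>%
        mutate(DD = -qnorm(pd) - lag(-qnorm(pd)))
    }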
    # id    date     pd          DD
    #  1 1990-01 0.0050          NA
    #  1 1990-02 0.0040  0.07624050
    #  1 1990-03 0.0030  0.09571158
    #  2 1990-01 0.0010          NA
    #  2 1990-02 0.0005  0.20029443
    #  2 1990-03 0.0020 -0.41236499
    #  3 1990-01 0.0080          NA
    #  3 1990-02 0.0065  0.07485375
    #  3 1990-03 0.0020  0.39439245

Does anyone have an idea how I can improve the performance?
 
    