I am using this code to get difference in hours from two POSIXct dates.
x <- transform(x, HRS = ceiling(as.numeric(SHIP_DATE-PICK_DATE)))
This gives accurate results. However, when I tried to find the hour differences for another similar column, I needed to do this:
x <- transform(x, HRS_ADJ = ceiling(as.numeric(SHIP_DATE-ADJ_PICK_DATE)/60))
PICK_DATE & SHIP_DATE are extracted using the same formula.
x$SHIP_DATE <- ifelse(is.na(as.POSIXct(x$SHIP_DATE, format="%d-%b-%Y %H:%M %p")),
                      yes = as.POSIXct(x$SHIP_DATE, format="%d-%b-%Y %H:%M"),
                      no = as.POSIXct(x$SHIP_DATE, format="%d-%b-%Y %H:%M %p"))
x$SHIP_DATE <- as.POSIXct(x$SHIP_DATE, origin = "1970-01-01")
ADJ_PICK_DATE is computed as below:
x$ADJ_PICK_DATE <- ifelse(x$PICK_TIME=="EARLY",
                          as.POSIXct(paste(format(x$PICK_DATE, "%d-%b-%Y"), "03:00"),
                                     format="%d-%b-%Y %H:%M"), x$PICK_DATE)
x$ADJ_PICK_DATE <- ifelse(x$PICK_TIME=="LATE",
                          as.POSIXct(paste(format(x$PICK_DATE+86400, "%d-%b-%Y"),
                                           "03:00"), format="%d-%b-%Y %H:%M"),
                          x$ADJ_PICK_DATE)
x$ADJ_PICK_DATE <- as.POSIXct(x$ADJ_PICK_DATE, origin = "1970-01-01")
PICK_TIME is computed to adjust the PICK_DATE, as for any orders between 16:00 & 03:00, the lead time is to be calculated from 3AM.
Questions:
- How to efficiently generate the ADJ_PICK_DATE column (now it is too slow)?
- How to extract the source data into POSIXct using shorter and more efficient code? (It takes about 10-15 seconds per million data point on my i7 7th Gen CPU)
- Why did I need to use different formula for each pair of dates to calculate the no of days?
Sample data (The dates is formatted randomly in the source (PICK_DATE & SHIP_DATE) as both "DD-MMM-YYYY HH:mm" and "DD-MMM-YYYY hh:mm AM/PM"):
PICK_DATE    SHIP_DATE    PICK_TIME  
01-APR-2017 00:51    02-APR-2017 06:55    EARLY  
01-APR-2017 00:51    02-APR-2017 12:11 PM    EARLY  
01-APR-2017 07:51    02-APR-2017 12:11 PM    OKAY  
01-APR-2017 02:51 PM    02-APR-2017 09:39 AM    LATE  
