There are several ways to achieve the desired result using dcast. jazzurro's solution does the aggregation before reshaping the result. The approaches here use dcast directly but may require some post-processing. We are using jazzurro's data which are tweaked to obey the UTC time zone and CRAN version 1.10.0 of data.table.
1. Getting ifelse to work
As reported in the Q,
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x) ifelse(type == 'timeIn', min(x), max(x))
)
returns an error message. The full text of the error message includes the hint to use the fill parameter. Unfortunately, ifelse() doesn't respect the POSIXct class (for details see ?ifelse) so this needs to be enforced.
With
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x)
lubridate::as_datetime(ifelse(type == 'timeIn', min(x), max(x))),
fill = 0
)
we do get
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
2. Alternative to ifelse
ifelse's help page suggests
(tmp <- yes; tmp[!test] <- no[!test]; tmp)
as alternative. Following this advice,
dcast(
dt, person + door ~ type,
value.var = 'time',
fun.aggregate = function(x) {
test <- type == "timeIn"; tmp <- min(x); tmp[!test] = max(x)[!test]; tmp
}
)
returns
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
Note that neither the fill parameter nor the coercion to POSIXct is needed.
3. Using enhanced dcast
With the latest versions of dcast.data.table we can provide a list of functions to fun.aggregate:
dcast(dt, person + door ~ type, value.var = 'time', fun = list(min, max))
returns
# person door time_min_timeIn time_min_timeOut time_max_timeIn time_max_timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:03 2016-12-02 07:06:02 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:03 2016-12-02 06:05:02 2016-12-02 06:05:05
We can remove the unwanted columns and rename the others by
dcast(dt, person + door ~ type, value.var = 'time', fun = list(min, max))[
, .(person, door, timeIn = time_min_timeIn, timeOut = time_max_timeOut)]
which gets us
# person door timeIn timeOut
#1: ana front door 2016-12-02 07:06:01 2016-12-02 07:06:05
#2: bob front door 2016-12-02 06:05:01 2016-12-02 06:05:05
Data
As mentioned above, we are using jazzurro's data
dt <- structure(list(person = c("bob", "bob", "bob", "bob", "ana",
"ana", "ana", "ana"), door = c("front door", "front door", "front door",
"front door", "front door", "front door", "front door", "front door"
), type = c("timeIn", "timeIn", "timeOut", "timeOut", "timeIn",
"timeIn", "timeOut", "timeOut"), time = structure(c(1480658701,
1480658702, 1480658703, 1480658705, 1480662361, 1480662362, 1480662363,
1480662365), class = c("POSIXct", "POSIXt"))), .Names = c("person",
"door", "type", "time"), row.names = c(NA, -8L), class = c("data.table",
"data.frame"))
but coerce the time zone to UTC.
With
dt[, time := lubridate::with_tz(time, "UTC")]
we have
dt
# person door type time
#1: bob front door timeIn 2016-12-02 06:05:01
#2: bob front door timeIn 2016-12-02 06:05:02
#3: bob front door timeOut 2016-12-02 06:05:03
#4: bob front door timeOut 2016-12-02 06:05:05
#5: ana front door timeIn 2016-12-02 07:06:01
#6: ana front door timeIn 2016-12-02 07:06:02
#7: ana front door timeOut 2016-12-02 07:06:03
#8: ana front door timeOut 2016-12-02 07:06:05
independent of local time zone.