After reading in a large data set with read.csv.ffdf, one of the columns holds timestamps such as 2014-10-18 00:01:02, across 1 million rows. That column is read in as a factor. How do I convert it to a POSIXct that ff supports? Simply calling as.POSIXct() just turns the values into NA.
Alternatively, when I first read in the data set, can I specify that the column should be POSIXct?
My goal is to extract the month and day (or even the hour), so I'm open to solutions other than converting to POSIXct.
For example, we have a 9 by 2 table:
test <- read.csv.ffdf(file="test.csv", header=TRUE, first.rows=-1)
The two columns are ID (numeric class) and time (factor class).
Here is the dput output:
structure(list(virtual = structure(list(VirtualVmode = c("integer",
"integer"), AsIs = c(FALSE, FALSE), VirtualIsMatrix = c(FALSE,
FALSE), PhysicalIsMatrix = c(FALSE, FALSE), PhysicalElementNo = 1:2,
PhysicalFirstCol = c(1L, 1L), PhysicalLastCol = c(1L, 1L)), .Names = c("VirtualVmode",
"AsIs", "VirtualIsMatrix", "PhysicalIsMatrix", "PhysicalElementNo",
"PhysicalFirstCol", "PhysicalLastCol"), row.names = c("ID", "time"
), class = "data.frame", Dim = c(9L, 2L), Dimorder = 1:2), physical = structure(list(
ID = structure(list(), physical = <pointer: 0x000000000821ab20>, virtual = structure(list(), Length = 9L, Symmetric = FALSE), class = c("ff_vector",
"ff")), time = structure(list(), physical = <pointer: 0x000000000821abb0>, virtual = structure(list(), Length = 9L, Symmetric = FALSE, Levels = c("10/17/2003 0:01",
"12/5/1999 0:02", "2/1/2000 0:01", "3/23/1998 0:01", "3/24/2013 0:00",
"5/29/2004 0:00", "5/9/1985 0:01", "6/14/2010 0:01", "6/25/2008 0:02"
), ramclass = "factor"), class = c("ff_vector", "ff"))), .Names = c("ID",
"time")), row.names = NULL), .Names = c("virtual", "physical",
"row.names"), class = "ffdf")