Sorry, another newbie question. I am trying to take parts of a data frame based on an existing ID or index, and then create a new ID or index column based on the difference between consecutive values in a second column.
For example, in the example data below, userID 1 appears to have 2 sessions: one starting at timeStamp 1 and ending at timeStamp 6, and another starting at timeStamp 40 and ending at timeStamp 47. If the difference between two consecutive timeStamps is <= 30 (say, minutes), then the two timeStamps are considered to be in the same session. But when the same userID jumps from 6 to 40, the difference is > 30, so that's considered a new session. User 2 has only 1 session; User 3 has 3.
Ideally, I'd like to retain the userID information in the sessionIDs; the last 2 columns are examples of desired formats. If it's easier to just make them integers, I can concatenate the userID and sessID later. var1, var2, varN are there just to show that there is other data in the data frame.
I am trying to avoid traditional looping and get R-esque. I took the userID and timeStamp information and created a list with one element per userID, each element being the vector of that user's timeStamps:
byUser <- with(myDF, split(timeStamp, userID))
Some of the real data look like this:
structure(list(`1` = c(50108, 50108, 50171, 50175, 121316, 121316, 
127228), `2` = c(55145, 745210, 1407020, 2283255),...
Then I used diff to get the difference between the timeStamps in each vector:
myDiff2 <- lapply(byUser, diff)
Some of the real data look like this:
structure(list(`1` = c(0, 63, 4, 71141, 0, 5912), `2` = c(690065, 
661810, 876235), `3` = c(109, 80, 98, 948417, 0),
...now I feel as if I should loop through each list element, initialize the sessID, and then increment sessID whenever the value in myDiff2 is > 1800 seconds (30 mins).
This seemed really long; please tell me how I could have shortened it! Thanks in advance!
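For reference, the loop described above can probably be avoided altogether. The following is only a minimal sketch, not necessarily what the answers propose: it reuses the first two vectors of the real-data excerpt, assumes the timestamps are already sorted within each user, and uses a made-up name, sessByUser, for the result.

```r
# Timestamps in seconds for two users, copied from the real-data excerpt above.
byUser <- list(
  `1` = c(50108, 50108, 50171, 50175, 121316, 121316, 127228),
  `2` = c(55145, 745210, 1407020, 2283255)
)

# For each user: flag gaps longer than 1800 s, then take a running count of the flags.
# The leading TRUE makes the first timestamp open session 1.
# sessByUser is a hypothetical name used only for this illustration.
sessByUser <- lapply(byUser, function(ts) cumsum(c(TRUE, diff(ts) > 1800)))

sessByUser
```

Each list element then holds an integer session counter per timestamp. Something like unsplit() with the same grouping factor should be one way to get back to the row order of the data frame.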
   userID timeStamp var1 var2 varN sessID1 sessID2
1       1         1    x    y    N     1.0     1.1
2       1         3    x    y    N     1.0     1.1
3       1         6    x    y    N     1.0     1.1
4       1        40    x    y    N     1.1     1.2
5       1        42    x    y    N     1.1     1.2
6       1        43    x    y    N     1.1     1.2
7       1        47    x    y    N     1.1     1.2
8       2         5    x    y    N     2.0     2.1
9       2         8    x    y    N     2.0     2.1
10      3         2    x    y    N     3.0     3.1
11      3         5    x    y    N     3.0     3.1
12      3        38    x    y    N     3.1     3.2
13      3        39    x    y    N     3.1     3.2
14      3        39    x    y    N     3.1     3.2
15      3        82    x    y    N     3.2     3.3
16      3        83    x    y    N     3.2     3.3
17      3        90    x    y    N     3.2     3.3
18      3        91    x    y    N     3.2     3.3
19      3       102    x    y    N     3.2     3.3
The dput() for the data example is here:
myDF <- structure(list(userID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), timeStamp = c(1L, 3L, 
6L, 40L, 42L, 43L, 47L, 5L, 8L, 2L, 5L, 38L, 39L, 39L, 82L, 83L, 
90L, 91L, 102L), var1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "x", class = "factor"), 
    var2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "y", class = "factor"), 
    varN = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "N", class = "factor"), 
    sessID1 = c(1, 1, 1, 1.1, 1.1, 1.1, 1.1, 2, 2, 3, 3, 3.1, 
    3.1, 3.1, 3.2, 3.2, 3.2, 3.2, 3.2), sessID2 = c(1.1, 1.1, 
    1.1, 1.2, 1.2, 1.2, 1.2, 2.1, 2.1, 3.1, 3.1, 3.2, 3.2, 3.2, 
    3.3, 3.3, 3.3, 3.3, 3.3)), .Names = c("userID", "timeStamp", 
"var1", "var2", "varN", "sessID1", "sessID2"), class = "data.frame", row.names = c(NA, 
-19L))
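The session rule from the question (a new session whenever the gap within a user exceeds 30) might also be sketched directly on the data frame with ave(), without splitting into a list first. This is a hedged illustration only: the block rebuilds just the userID and timeStamp columns of the example so that it runs on its own, the names df, gap and sessNum are invented for the sketch, and the rows are assumed to be sorted by timeStamp within each userID.

```r
# Toy data mirroring the userID and timeStamp columns of the example above.
df <- data.frame(
  userID    = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
  timeStamp = c(1, 3, 6, 40, 42, 43, 47, 5, 8, 2, 5, 38, 39, 39, 82, 83, 90, 91, 102)
)

gap <- 30  # assumed threshold, in the same units as timeStamp

# Within each user, start at 1 and add 1 whenever the gap to the
# previous row exceeds `gap`. ave() returns the result in row order.
df$sessNum <- ave(df$timeStamp, df$userID,
                  FUN = function(ts) cumsum(c(TRUE, diff(ts) > gap)))

df
```

On the example data the counter comes out as 1 and 2 for user 1, 1 for user 2, and 1 to 3 for user 3, which lines up with the digits after the decimal point in the sessID2 column.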
=== An addendum to the answers below:
For the next newbie:
Picking a '.' / decimal separator was probably not brilliant on my part: it led to some weirdness and non-unique sessIDs as the sessID counter rolled from 9 to 10 (as a number, 1.10 is the same value as 1.1).
Change the separator to some other character -- like a hyphen -- and all is well.
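To illustrate one way the collision can arise, here is a small sketch. It assumes the decimal-style IDs end up stored as numbers; the vectors userID and sessNum below are hypothetical values made up for the demonstration.

```r
# Hypothetical user and per-user session counters, for illustration only.
userID  <- rep(1, 3)
sessNum <- c(1, 10, 11)

# Decimal-style IDs: once converted to numeric, "1.10" is the same number as "1.1",
# so sessions 1 and 10 of the same user collapse into one ID.
decID <- as.numeric(paste(userID, sessNum, sep = "."))
decID[1] == decID[2]

# Hyphen-separated IDs stay as distinct character strings.
sessID <- paste(userID, sessNum, sep = "-")
sessID
```

The hyphenated version is a character vector, so it can no longer be treated as a number, but for a label that is usually what you want.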
@rawr and @jlhoward - Thank you both for your quick, correct, and extremely helpful responses: both approaches worked very well. @jlhoward - special thanks for the additional, above-the-call-of-duty explanation. (@rawr was first, so I credited him for the answer.)
There was a small difference in performance between the 2 solutions: data.table is faster but requires some additional upfront conversion of the data.frame to a data.table.
Thanks again, all.