I need to get the interval borders from cut() output. I found this question, which suggests using findInterval(), but it does not work as expected when a value of x equals the upper border of its cut() interval. See here:
x <- 1:3
breaks <- c(min(x), 2, max(x))
interval <- findInterval(x, breaks)
data.frame(x,
           groups   = cut(x, breaks, include.lowest = TRUE),
           x_lower  = breaks[interval],
           x_upper  = breaks[interval + 1],
           interval)
  x groups x_lower x_upper interval
1 1  [1,2]       1       2        1
2 2  [1,2]       2       3        2
3 3  (2,3]       3      NA        3
I am happy with how cut() builds the groups from x, but x_lower and x_upper in rows 2 and 3 are not what I expect. In row 2, x is 2 and groups is [1,2], so I expect x_lower to be 1 and x_upper to be 2. In row 3, x is 3 and groups is (2,3], so I expect x_lower to be 2 and x_upper to be 3. If you play around with the data, you will see that findInterval() returns the borders of the following interval whenever the x value equals the upper border of its group. I want to avoid that. How can we achieve this?
Expected output
structure(list(x = 1:3, groups = structure(c(1L, 1L, 2L), .Label = c("[1,2]", "(2,3]"), class = "factor"), x_lower = c(1, 1, 2), x_upper = c(2, 2, 3), interval = c(1, 1, 2)), class = "data.frame", row.names = c(NA, -3L))
Remark
I do want to use findInterval(), and I cannot use labels[as.numeric(groups)] as suggested in another answer to the question above. This is because in my situation x is sometimes a numeric and sometimes a Date/POSIXct/ts/... vector, so using as.numeric() is not safe for me.
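For reference, the mismatch described above appears to come from the interval conventions: by default findInterval() treats intervals as closed on the left and open on the right, while cut() with its default right = TRUE closes them on the right. A minimal sketch of one possible direction, assuming an R version whose findInterval() has the left.open argument, and tested here only on the numeric example from this question (not on Date or POSIXct input):

```r
x <- 1:3
breaks <- c(min(x), 2, max(x))

# Default behaviour: intervals are [b_i, b_(i+1)), so x == 2 is assigned
# to the second interval instead of the first.
default_idx <- findInterval(x, breaks)

# left.open = TRUE switches to (b_i, b_(i+1)] intervals; together with
# rightmost.closed = TRUE the leftmost interval is also closed, which mirrors
# cut(x, breaks, include.lowest = TRUE).
interval <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = TRUE)

result <- data.frame(x,
                     groups  = cut(x, breaks, include.lowest = TRUE),
                     x_lower = breaks[interval],
                     x_upper = breaks[interval + 1],
                     interval)
print(result)
```

With these arguments the index vector lines up with the groups produced by cut(), so the borders are looked up directly in breaks and the factor labels never need to be parsed.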