I have a single-column data set of 53 values. I am trying to bin them into sets that are each 400 points wide, over a range from ~500 to 4500. Feel free to be vague and just name a function for doing so; I can work out the rest.
            
- Hi @Jonah_huggins. Welcome to StackOverflow. Please visit this thread on how to make a great R reproducible example: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. Could you provide a `dput()`? – cmirian Apr 05 '20 at 15:22
3 Answers
            
            
        A dplyr option
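library(dplyr)
# Simulated data: 1000 values drawn uniformly between 400 and 5000
df_test <- data.frame(x = runif(1000, 400, 5000),
                      y = rep("A", 1000))
# case_when() uses the first matching condition, so adjacent ranges share a
# boundary; ranges such as 400-800 and 801-1600 would leave e.g. 800.5 as NA
df_test <- df_test %>% 
  mutate(bins = case_when(between(x, 400, 800) ~ "Set 1",
                          between(x, 800, 1600) ~ "Set 2",
                          between(x, 1600, 5000) ~ "Set 3"))
head(df_test)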
              x y  bins
    1 1687.2854 A Set 3
    2 3454.1035 A Set 3
    3 4979.5434 A Set 3
    4  796.6475 A Set 1
    5 3665.7444 A Set 3
    6 3083.8969 A Set 3
You can of course adjust the between ranges as you see fit.
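For the 400-point-wide bins described in the question, typing out ten `between()` ranges gets tedious. One possible alternative, sketched here rather than taken from the answer, is to call `cut()` inside `mutate()`. The seed, the `edges` vector, the `df_binned` name and the "Set n" labels are assumptions made for illustration.

```r
library(dplyr)

set.seed(42)  # assumption: seed added only so the sketch is reproducible
df_test <- data.frame(x = runif(53, 500, 4500))

# Bin edges every 400 points: 500, 900, ..., 4500 (11 edges, 10 bins)
edges <- seq(500, 4500, by = 400)

df_binned <- df_test %>%
  mutate(bins = cut(x,
                    breaks = edges,
                    labels = paste("Set", seq_len(length(edges) - 1)),
                    include.lowest = TRUE))  # keep a value of exactly 500

head(df_binned)
```

Because `cut()` returns a factor, all ten sets remain as levels even if some of them receive no values.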
 
    
    
        Greg
        
            
            
Here's a base R approach that uses `cut` with `breaks` defined by a predetermined `seq`.
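set.seed(1)
data <- runif(n = 53, min = 500, max = 4500)
# Breaks every 400 from 500 to 4500; -Inf and Inf catch values outside that range
groups <- as.integer(cut(data, c(-Inf, seq(500, 4500, by = 400), Inf)))
head(data.frame(data, groups), 4)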
      data groups
1 1562.035      4
2 1988.496      5
3 2791.413      7
4 4132.831     11
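To see which interval each integer group stands for, it can help to keep the factor that `cut()` returns before converting it. The following is a small sketch under that idea; the `breaks` and `bins` names and the `dig.lab = 4` argument (used so the labels print as plain numbers) are additions for illustration.

```r
set.seed(1)
data <- runif(n = 53, min = 500, max = 4500)

breaks <- c(-Inf, seq(500, 4500, by = 400), Inf)
bins   <- cut(data, breaks, dig.lab = 4)  # factor with one level per interval
groups <- as.integer(bins)                # integer code of each value's level

# Intervals are right-closed by default, i.e. (lower, upper]
levels(bins)

# Look up the interval label behind the first value's group number
levels(bins)[groups[1]]
```

The two infinite breaks add one catch-all interval at each end, so there are 12 levels in total, and group 1 stays empty as long as no value is at or below 500.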
 
    
    
        Ian Campbell
        
- Hi! Thank you so much for responding. I have a follow-up question: I read up on `cut()` and used `signals.2 <- cut(as.numeric(as.character(data$Beta.Actin.Signals)), breaks = seq(0, 4500, by = 500))`. However, when I display the plot, any bin with a count of 0 is omitted: of the 11 bins I wish to create, any bin that is empty is not displayed on the graph. Any advice? – Jonah_huggins Apr 05 '20 at 22:18
            
            
Hi, I would do it like this:
data$group <- cut(data$value,
                  breaks = seq(0, 4500, 500),
                  labels = paste("Group", LETTERS[1:9], sep = "_"))
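Or, if you prefer a more basic style of R, use `[ ]`:
# Logical comparisons rather than %in% 501:900, which would only match whole numbers
under_500 <- data[data$value < 500, ]
over500_under900 <- data[data$value >= 500 & data$value < 900, ]
## etc.
over4000 <- data[data$value > 4000, ]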
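If writing one subset per range by hand becomes repetitive, a possible shortcut is to let `split()` build all the subsets from the `cut()` factor in one step. This is a sketch only: the small `data` frame and the `subsets` name are hypothetical, and `right = FALSE` is chosen so that each interval includes its lower edge.

```r
# Hypothetical data frame with a `value` column
data <- data.frame(value = c(250, 500, 899, 1200, 4200))

# Intervals [0,500), [500,1000), ..., [4000,4500)
bins <- cut(data$value, breaks = seq(0, 4500, 500), right = FALSE)

# One data frame per interval; intervals with no rows give empty data frames
subsets <- split(data, bins)

names(subsets)
subsets[[2]]  # rows with 500 <= value < 1000
```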
 
    
    
        user12256545
        
