For example: a data frame "housing" has a column "street" with different street names as levels. I want to return a data frame with counts of the number of houses in each street (level), basically the number of repetitions. What functions do I use in R?
    3 Answers
            This should help:
library(dplyr)
housing %>% group_by(street) %>% summarise(Count=n())
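If `street` contains missing values that should not be counted, one option is to filter them out before grouping. A minimal sketch, assuming a small made-up `housing` data frame (the street names are hypothetical):

```r
library(dplyr)

# Hypothetical example data; the real `housing` would have more columns
housing <- data.frame(street = c("Elm", "Oak", "Elm", NA, "Oak", "Elm"))

counts <- housing %>%
  filter(!is.na(street)) %>%   # drop rows where street is missing
  group_by(street) %>%
  summarise(Count = n())       # one row per street with its row count

counts
```

`count(housing, street)` is a shorter equivalent of the `group_by()` plus `summarise()` pair; it names the count column `n`.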
 
    
    
        Duck
        
- This saved me, thanks! In addition, how do I ignore NA values in housing when applying these functions? – bverc Jul 07 '20 at 15:32
- @bverc You could add a new pipe step like `filter(!is.na(yourvariable))`. Let me know if I can help more! – Duck Jul 07 '20 at 15:36
- Thank you, it's exactly what I needed – bverc Jul 07 '20 at 16:05
`summary(housing$street)` gives frequencies for at most 100 factor levels (its default `maxsum`) and lumps the rest into `(Other)`. If there are more, try:
table(housing$street)
For example, let's generate one hundred one-letter street names and summarise them with table.
set.seed(1234)
housing <- data.frame(street = sample(letters, size = 100, replace = TRUE))
x <- table(housing$street)
x
# a b c d e f g h i j k l m n o p q r s t u v w x y z 
# 1 3 5 6 4 6 2 6 5 3 1 3 1 2 5 5 4 1 5 5 3 7 4 5 3 5 
Following up on the OP's comment: to use the result in further analyses, assign it to a variable, here `x`. The class of the variable is `table`, and most base R functions treat it like a named vector. For example, to find the most frequent street name, use `which.max`.
which.max(x)
#  v 
# 22 
The result says that the maximum count is at the 22nd position in `x`, and that element is named v.
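Since the question asks for a data frame, the table can also be converted with `as.data.frame()`. A sketch in base R, assuming made-up street names; the column names `street` and `count` are arbitrary choices:

```r
# Hypothetical example data
housing <- data.frame(street = c("Elm", "Oak", "Elm", "Pine", "Oak", "Elm"))

x <- table(housing$street)

# Convert the table to a data frame with one row per street
counts <- as.data.frame(x, stringsAsFactors = FALSE)
names(counts) <- c("street", "count")

# Sort so the most frequent street comes first
counts <- counts[order(-counts$count), ]
counts

# Name of the most frequent street, taken from the table itself
names(x)[which.max(x)]
```

By default `table()` drops `NA` values; passing `useNA = "ifany"` would count them as a separate level.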
 
    
    
        nya
        
- Also, I'm not able to perform any analysis using this method, e.g. finding the highest value. – bverc Jul 07 '20 at 15:44
- Simply assign the table result to a variable and use it as a vector. I added an example to the answer. – nya Jul 08 '20 at 07:09
        This can be done in multiple ways, for instance with base R using table():
table(housing$street)
It can also be done through dplyr, as illustrated by Duck.
Another option (my preference) is using data.table.
library(data.table)
setDT(housing)
housing[, .N, by = street]
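To list the busiest streets first, the counts can be ordered by chaining a second `[`. A minimal sketch on made-up data (the street names are hypothetical), which also skips missing values:

```r
library(data.table)

# Hypothetical example data
housing <- data.table(street = c("Elm", "Oak", "Elm", NA, "Oak", "Elm"))

# Count rows per street, skipping missing values, then sort descending by N
counts <- housing[!is.na(street), .N, by = street][order(-N)]
counts
```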
 
    
    
        ljwharbers
        
