Good afternoon,
After several attempts, I cannot get R to sum the data shown below. As the sample of my data shows, there are four rows for ZIP code 33024. R keeps reporting that 33024 has only 2 injuries instead of adding up the Injury values across all of its rows (1 + 2 + 1 + 1 = 5 in the sample). Any help with this?
Edit: The summary() output below should help as well. Note that the Max. for Injury stays at 3 and does not increase with the number of rows per ZIP code that report an injury.
ZipCode         Age        Fatality       Injury        Year   
 33065  : 24   15     :28   Min.   :1     Min.   :1.000   2015:92  
 33313  : 18   18     :27   1st Qu.:1     1st Qu.:1.000   2016:67  
 33317  : 14   13     :21   Median :1     Median :1.000   2017:35  
 33076  : 13   17     :19   Mean   :1     Mean   :1.083            
 33026  : 11   12     :18   3rd Qu.:1     3rd Qu.:1.000            
 33311  : 11   14     :18   Max.   :1     Max.   :3.000 
  ZipCode Age Fatality Injury Year
1   33023  17       NA      1 2015
2   33024   6       NA      1 2015
3   33024   8       NA      2 2015
4   33024  13       NA      1 2015
5   33024  13       NA      1 2015
6   33026  14       NA      1 2015
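To make the expected result concrete, here is a minimal base R sketch of the per-ZIP-code totals I am after. It assumes the six sample rows above are representative; `sample_rows` and `injury_totals` are names made up for this illustration and are not part of my script.

```r
# Hypothetical reconstruction of the six sample rows shown above
# (the Fatality column is omitted because it is NA in every sample row)
sample_rows <- data.frame(
  ZipCode = c(33023, 33024, 33024, 33024, 33024, 33026),
  Age     = c(17, 6, 8, 13, 13, 14),
  Injury  = c(1, 1, 2, 1, 1, 1),
  Year    = rep(2015, 6)
)

# Sum Injury within each ZipCode: one output row per ZIP code
injury_totals <- aggregate(Injury ~ ZipCode, data = sample_rows, FUN = sum)
print(injury_totals)
```

For these rows I would expect 33024 to show a total of 5, with 33023 and 33026 at 1 each.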
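# Load the CSV file chosen interactively
BCD = read.csv(file.choose())
BCD
head(BCD)
tail(BCD)
library(ggplot2)
str(BCD)
colnames(BCD) = c("ZipCode", "Age", "Fatality", "Injury", "Year")
head(BCD)
# The next four calls only print their results; they do not modify BCD
list(BCD$Injury)
list(BCD$ZipCode)
factor(BCD$Year)
factor(BCD$ZipCode)
# Convert the grouping columns to factors and the count columns to numeric
BCD$Year = factor(BCD$Year)
BCD$ZipCode = factor(BCD$ZipCode)
BCD$Age = factor(BCD$Age)
BCD$Injury = as.numeric(BCD$Injury)
BCD$Fatality = as.numeric(BCD$Fatality)
str(BCD)
head(BCD)
summary(BCD)
# Plot every row as a point; no per-ZIP-code totals are computed here
BCD2 = ggplot(data = BCD, aes(x = Injury, y = ZipCode, color = Age, size = Year))
BCD2 + geom_point() + geom_smooth()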
This is my code so far. I am trying to produce a ggplot based on year, age, ZIP code, and the total number of injuries that occurred in each ZIP code.
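As a rough sketch of the kind of plot I have in mind, assuming the injuries are first summed per ZIP code and year and then drawn as bars: the names `sample_rows`, `totals`, and `p` are hypothetical, the data are just the six sample rows, and a bar chart is my guess at a suitable geom rather than a requirement.

```r
library(ggplot2)

# Hypothetical reconstruction of the six sample rows from the question
sample_rows <- data.frame(
  ZipCode = factor(c(33023, 33024, 33024, 33024, 33024, 33026)),
  Age     = c(17, 6, 8, 13, 13, 14),
  Injury  = c(1, 1, 2, 1, 1, 1),
  Year    = factor(rep(2015, 6))
)

# One row per ZipCode/Year combination, with Injury summed within each group
totals <- aggregate(Injury ~ ZipCode + Year, data = sample_rows, FUN = sum)

# Bar height is the summed injury count; bars are colored by year
p <- ggplot(totals, aes(x = ZipCode, y = Injury, fill = Year)) +
  geom_col(position = "dodge") +
  labs(x = "ZIP code", y = "Total injuries")
print(p)
```

Age is left out of this sketch because summing over ZIP code and year collapses the individual ages; including it would mean adding Age to the grouping formula.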
 
    
