I'm sure someone has asked this before, or that I could research an efficient way to do this, but I'm tight on time and not sure how to word my issue.
I have a large data frame, and I noticed that one of its columns contains some odd values:
> head(testCA_extract[5])
   ZIP_CODE
1     94801
2     94801
3 928034250
4     92714
5     95054
6     94565
That column comes from this data frame:
> head(testCA_extract[2:6])
  REPORTING_YEAR STATE_COUNTY_FIPS_CODE  COUNTY_NAME  ZIP_CODE   CITY_NAME
1           1990                  06013 CONTRA COSTA     94801    RICHMOND
2           1990                  06013 CONTRA COSTA     94801    RICHMOND
3           1990                  06059       ORANGE 928034250     ANAHEIM
4           1990                  06059       ORANGE     92714      IRVINE
5           1990                  06085  SANTA CLARA     95054 SANTA CLARA
6           1990                  06013 CONTRA COSTA     94565   PITTSBURG
For anyone unfamiliar, ZIP codes are supposed to be exactly 5 digits. I'm not sure why some values have extra digits, but it appears that the first 5 digits, regardless of the total length, are the correct ZIP code.
So I need to either select only the first 5 digits, or constrain the variable to the first 5 digits and delete the rest. Then I need that information to go back into its proper row and column in the data frame.
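To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. The small `df` below is a made-up stand-in for `testCA_extract` built from the values shown above, and I'm assuming `ZIP_CODE` is stored as an integer, character, or factor column. I don't know whether this is the right or most efficient approach for the real data.

```r
# Hypothetical stand-in for testCA_extract, using values from the output above
df <- data.frame(
  ZIP_CODE  = c(94801L, 94801L, 928034250L, 92714L, 95054L, 94565L),
  CITY_NAME = c("RICHMOND", "RICHMOND", "ANAHEIM", "IRVINE",
                "SANTA CLARA", "PITTSBURG"),
  stringsAsFactors = FALSE
)

# Convert to character (assumption: integer, character, or factor column),
# then keep only characters 1 through 5 of each value.
# Assigning back to df$ZIP_CODE keeps every value in its original row.
df$ZIP_CODE <- substr(as.character(df$ZIP_CODE), 1, 5)

df$ZIP_CODE[3]
# "92803"
```

Note that this leaves `ZIP_CODE` as a character column, which seems reasonable for ZIP codes since they are identifiers rather than quantities.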
 
     
    