It appears as if the files are downloaded correctly, but you are correct: if you use read.csv to load the data into R, then some of the columns are interpreted as numeric, and therefore they lose the leading zeros.
To take your code that downloads files -
stateFIPScodes<-seq(10,13,1)
for(i in seq_along(stateFIPScodes)){
code <- stateFIPScodes[[i]]
URL <- paste0("https://www.census.gov/popest/data/intercensal/county/files/CO-EST00INT-ALLDATA-", code, ".csv")
destfile <- paste0("state2000_2010_",code,".csv")
download.file(URL, destfile)
}
If we use base read.csv we get no trailing zeros:
library(dplyr)
read.csv("state2000_2010_10.csv") %>%
select(1:5) %>%
head
#> SUMLEV STATE COUNTY STNAME CTYNAME
#> 1 50 10 1 Delaware Kent County
#> 2 50 10 1 Delaware Kent County
#> 3 50 10 1 Delaware Kent County
#> 4 50 10 1 Delaware Kent County
#> 5 50 10 1 Delaware Kent County
#> 6 50 10 1 Delaware Kent County
That's because the first few columns are being read in as numeric.
read.csv("state2000_2010_10.csv") %>% str()
#> 'data.frame': 780 obs. of 50 variables:
#> $ SUMLEV : int 50 50 50 50 50 50 50 50 50 50 ...
#> $ STATE : int 10 10 10 10 10 10 10 10 10 10 ...
#> $ COUNTY : int 1 1 1 1 1 1 1 1 1 1 ...
There are two ways to resolve this:
- Manually pass the data types into
read.csv, or just prevent all conversions by adding colClasses = "character".
- Use
readr::read_csv which handles it correctly.
We could just prevent automatic coercion:
read.csv("state2000_2010_10.csv", colClasses = "character") %>% str()
#> 'data.frame': 780 obs. of 50 variables:
#> $ SUMLEV : chr "050" "050" "050" "050" ...
#> $ STATE : chr "10" "10" "10" "10" ...
#> $ COUNTY : chr "001" "001" "001" "001" ...
#> $ STNAME : chr "Delaware" "Delaware" "Delaware" "Delaware" ...
You would need to choose what columns you wanted to cast to as.numeric.
Or you could select the columns, e.g.
read.csv("state2000_2010_10.csv",
colClasses = c(
SUMLEV = "character",
STATE = "numeric",
COUNTY = "character"
)) %>%
select(1:5) %>%
head
#> SUMLEV STATE COUNTY STNAME CTYNAME
#> 1 050 10 001 Delaware Kent County
#> 2 050 10 001 Delaware Kent County
#> 3 050 10 001 Delaware Kent County
#> 4 050 10 001 Delaware Kent County
#> 5 050 10 001 Delaware Kent County
#> 6 050 10 001 Delaware Kent County
Second, you could use readr, which has more intelligent column type inference:
#> read_csv("state2000_2010_10.csv") %>%
#> select(1:5) %>%
#> head
#> # A tibble: 6 × 5
#> SUMLEV STATE COUNTY STNAME CTYNAME
#> <chr> <int> <chr> <chr> <chr>
#> 1 050 10 001 Delaware Kent County
#> 2 050 10 001 Delaware Kent County
#> 3 050 10 001 Delaware Kent County
#> 4 050 10 001 Delaware Kent County
#> 5 050 10 001 Delaware Kent County
#> 6 050 10 001 Delaware Kent County