Context
I have received around 1700 97-2003 excel files with a ton of data including some text information. I am trying to read them all into Stata before I clean and run some statistical analysis. But, Stata is unable to open them. The files have a .xls format with large amounts of text in them which I want to keep, and convert them into .csv files (hence the csv tag).
What I've tried
- On closer look, every time I manually open a file in Excel I get the error "Format does not match extension", I have to click on "Open anyway" and manually save it before the file can be imported into Stata. I can't possibly do this for 1700 files. I am using a windows OS and all files are saved locally on my drive.
I additionally tried to write a bulk line of code on stata to save all files as csv or atleast an xls that can be imported into stata but I have had zero luck. Everytime I try to save them, I get a corrupted csv or an xls with the exact same error that pops up as " try xlCreateBook()" on stata and the "format does not match extension" error when I manually try to open the file.
For reference, under is the stata code I wrote.
cd "$dropbox\project8\data\2008-2009\A.Monthwise\dakota"
local oldfiles : dir "." files "*.xls" , respectcase
mac list _oldfiles
foreach fn of local oldfiles {
local oldfn "fn'"'
local res = ustrregexm("oldfn'"', "(.*).xls", .)
if `res'==1 {
local newfn = "new" + ustrregexs(1) + ".dta"
di "Renamingoldfn' to newfn'"' copy"./oldfn'"'"./new/`newfn'"', public
}
}
Question
Is there a way I can fix this extesnion/format issue for 1700 files and convert everything in bulk, such that Stata can read it?