I have a series of large (and poorly formatted) Excel spreadsheets that I am trying to process with pandas. Each Excel file contains 50-60 sheets, and I am only interested in a subset of the sheets within each file.
I have tried opening each file as a pd.ExcelFile object so that I can use its sheet_names attribute to pick out and parse particular sheets (I don't know the sheet names ahead of time). This works, but it seems exceptionally slow (close to a minute for each ~30 MB Excel file).
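For reference, here is a minimal sketch of the approach described above. The helper name read_selected_sheets and the wanted filter are placeholders for illustration only, not my actual selection logic.

```python
import pandas as pd


def read_selected_sheets(path, wanted):
    """Return {sheet name: DataFrame} for sheets whose name passes `wanted`.

    `wanted` is a hypothetical predicate taking a sheet name and
    returning True for the sheets of interest.
    """
    # Opening the workbook: this is where I suspect the time goes.
    with pd.ExcelFile(path) as xls:
        # sheet_names lists every sheet in the workbook, so the names
        # do not need to be known ahead of time.
        names = [name for name in xls.sheet_names if wanted(name)]
        # parse() reads a single sheet into a DataFrame.
        return {name: xls.parse(name) for name in names}


# Example call (file name and filter are made up):
# frames = read_selected_sheets("report.xlsx", lambda n: n.startswith("Data"))
```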
I can only assume this is because every sheet is parsed when the pd.ExcelFile object is initialised (I could be wrong). If so, is there a way to prevent this behaviour? I really only want to get the sheet names and then parse the specific sheets from there.
Thanks in advance!