I am trying to read a excel file using pandas. I am interested in reading only the relevant data from the excel file, i.e dropping the rows/column containing 'nan' value. I have encountered a problem where the first row of dataframe contains 'Unnamed' values. It is never fixed that my header would start from which row, therefore I am avoiding to use skiprows and header.
On using the below mentioned command it removed almost all the data from the dataframe as it is treating Unnamed as the header.
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
I have used the following command to clean my data:
data = pd.read_excel("text.xlsx", sheet_name=1,index=False)
print(data)
BINS 2018-RUI: Red Roof Inn Portfolio   Unnamed: 1  Unnamed: 2  Unnamed: 3  Unnamed: 5
0   NaN       NaN       NaN     NaN     NaN     NaN
1   NaN       NaN       NaN     NaN     NaN     NaN
2   NaN       NaN       NaN     NaN     NaN     NaN
3   No. Property \nID   Property Name       Street Address  
4   1.001   10228        Red Roof Plus      777 Airport Boulevard       
5   1.002   10150        Red Roof Plus1     15 Meadowlands Parkway      
6   1.003   10304        Red Roof Inn       Boulevard Seattle
data1 = data.dropna(axis = 0, how = 'all', thresh=3)
data2 = data1.dropna(axis = 1, how = 'all')
print(data2)
BINS 2018-RUI: Red Roof Inn Portfolio   Unnamed: 1  Unnamed: 2  Unnamed: 3  Unnamed: 5
3   No. Property \nID   Property Name       Street Address  
4   1.001   10228        Red Roof Plus      777 Airport Boulevard       
5   1.002   10150        Red Roof Plus1     15 Meadowlands Parkway      
6   1.003   10304        Red Roof Inn       Boulevard Seattle   
Expected output:
3   No. Property \nID   Property Name       Street Address  
4   1.001   10228        Red Roof Plus      777 Airport Boulevard       
5   1.002   10150        Red Roof Plus1     15 Meadowlands Parkway      
6   1.003   10304        Red Roof Inn       Boulevard Seattle
I do not want the first rows having Unnamed written on cells. (This is small subset of data, actual data has 100 rows and 100 columns)
 
    