I have a pandas dataframe read from a csv file which includes a column of dates
In[1]: df = pd.read_csv(fileStats, index_col=0, parse_dates= ['start_date'])
In[2]: df.head()
Out[3]:
                 start_date  
    customer_ID                                                               
    10006414       2012-10-02 08:30:00   
    10006486       2013-12-02 09:00:00   
    10006492       2013-12-02 09:00:00   
    10006572       2012-03-31 08:30:00   
    10006630       2012-01-06 10:30:00  
I want to find the earliest date:
In[4]: df['start_date'].min()
Out[5]:
Timestamp('2011-01-12 08:00:00')
which looks fine except that when I open the .csv file in excel, and order the rows by start_date, the earliest date is 2011-10-06 13:00:00. The same problem happens with .max()  This post refers to a similar problem, but related to np.datetime64 which I am not using here. Is there an issue with .max() and .min() or is it to do with parsing the dates in .read_csv ?
EDIT:
Further investigation on suggestions from @DYZ shows that the Timestamps have been read or parsed incorrectly.  6/10/2011 13:00 in the .csv file has become 2014-03-03 10:00:00 in the dataframe. How to correct this?