I have a matrix with headers and want to remove all rows where the "Closed Date" column is "NaN".
Input:
raw_data.ix[~(raw_data['Closed Date'] == "NaN")]
Output:
Closed Date
NaN
NaN
9/28/2017 19:51
NaN
Why is "NaN" still there?
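NaN is not a string, so comparing against "NaN" never matches. You need to test with .isnull() or .notnull() instead (.loc is used here because .ix has been removed from recent pandas versions):
raw_data.loc[~(raw_data['Closed Date'].isnull())]
or
raw_data.loc[raw_data['Closed Date'].notnull()]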
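As a minimal sketch of the difference, assuming a small made-up frame standing in for raw_data (the real data is not shown in the question), the string comparison matches nothing while notnull() picks out the one filled row. It uses .loc, since .ix is no longer available in recent pandas releases:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for raw_data; only the 'Closed Date' column matters here.
raw_data = pd.DataFrame(
    {"Closed Date": [np.nan, np.nan, "9/28/2017 19:51", np.nan]}
)

# Comparing against the string "NaN" is False for every row,
# so negating it with ~ keeps all rows, including the missing ones.
string_mask = raw_data["Closed Date"] == "NaN"

# notnull() is True only where a value is actually present.
filtered = raw_data.loc[raw_data["Closed Date"].notnull()]
print(filtered)
```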
The NaN you are seeing is not a string. It stands for "Not a Number" and is used to represent "Not Available" data in pandas / numpy.
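A quick check illustrates this, assuming the missing values are numpy's np.nan (what pandas typically uses for missing entries): the value is a float, it is not equal to the string "NaN", and it is not even equal to itself, which is why a dedicated test such as pd.isna is needed.

```python
import numpy as np
import pandas as pd

missing = np.nan

is_float = isinstance(missing, float)   # NaN is a float, not a string
equals_string = missing == "NaN"        # comparison with the text "NaN"
equals_itself = missing == missing      # NaN never compares equal, even to itself
detected = pd.isna(missing)             # the reliable way to detect it

print(is_float, equals_string, equals_itself, detected)
```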
You can remove all rows where Closed Date is NaN via pd.DataFrame.dropna:
raw_data = raw_data.dropna(subset=['Closed Date'])
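A small sketch of dropna on rows, assuming a made-up frame with a hypothetical "Ticket" column next to "Closed Date". Note that the default axis=0 drops rows, and subset then names the columns to inspect. With axis=1, subset would be looked up among the row labels instead, which is not what is wanted here.

```python
import numpy as np
import pandas as pd

# Hypothetical data; "Ticket" is an invented column for illustration.
raw_data = pd.DataFrame(
    {
        "Ticket": [1, 2, 3, 4],
        "Closed Date": [np.nan, np.nan, "9/28/2017 19:51", np.nan],
    }
)

# Drop every row whose 'Closed Date' is missing (axis=0 is the default).
cleaned = raw_data.dropna(subset=["Closed Date"])
print(cleaned)
```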