Why does Pandas report dtype object even though every item in the selected column is a string, even after explicit conversion?
This is my DataFrame:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 56992 entries, 0 to 56991
Data columns (total 7 columns):
id            56992  non-null values
attr1         56992  non-null values
attr2         56992  non-null values
attr3         56992  non-null values
attr4         56992  non-null values
attr5         56992  non-null values
attr6         56992  non-null values
dtypes: int64(2), object(5)
Five of the columns have dtype object. I explicitly convert those columns to strings:
for c in df.columns:
    if df[c].dtype == object:
        print("convert %s to string" % df[c].name)
        df[c] = df[c].astype(str)
Afterwards, df["attr2"] still has dtype object, although type(df["attr2"].ix[0]) returns str, which is correct.
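For reference, here is a minimal sketch that reproduces what I see. The sample values are made up, and I use .iloc instead of .ix for the lookup, which I assume is equivalent here because the index is the default 0..N range. The reported dtype may depend on the pandas version, so I only print it.

```python
import pandas as pd

# Hypothetical sample data; the real frame has 56992 rows and 7 columns.
df = pd.DataFrame({"id": [1, 2, 3], "attr2": [10, 20, 30]})

# Same explicit conversion as in my loop, applied to a single column.
df["attr2"] = df["attr2"].astype(str)

# Inspect one element and the dtype that pandas reports for the column.
first = df["attr2"].iloc[0]
print(type(first))        # the element itself is a Python str
print(df["attr2"].dtype)  # the column dtype (object in my case)
```

So the individual values are plain Python strings, while the column as a whole is reported with a different, more general dtype.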
Pandas distinguishes between int64, float64, and object. What is the logic behind there being no str dtype? Why is a str covered by object?