I have a pandas DataFrame containing floats and strings, plus a few <NA> and nan values. I am trying to locate all the <NA> values and convert them to nan using pd.to_numeric(....., errors='coerce'), while making sure that the floats and strings remain untouched. Could you please help me with this?
Also, when I call df.to_dict('list'), I get a few '' and <NA> values.
Thank you.
        Telis
        
- What exactly are your `<NA>`? Strings? `pd.NA`? And why do you need `float('nan')`? Please provide a reproducible object. – mozway May 04 '23 at 07:27
1 Answer
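Whether you have the string '<NA>' or the pandas missing value (pd.NA), pd.to_numeric with errors='coerce' converts both to NaN. Note that it also coerces every other non-numeric string (such as 'a' below) to NaN:
import pandas as pd
df = pd.DataFrame({'col': [1, '1', 'a', float('nan'), pd.NA, '<NA>']})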
pd.to_numeric(df['col'], errors='coerce')
Output:
0    1.0
1    1.0
2    NaN
3    NaN
4    NaN
5    NaN
Name: col, dtype: float64
If you only want to replace specific items and keep the other strings, use replace instead:
df['col'].replace({'<NA>': float('nan'), pd.NA: float('nan')})
Output:
0      1
1      1
2      a
3    NaN
4    NaN
5    NaN
Name: col, dtype: object
If you need string representations of numbers converted to numbers, with other strings left intact and <NA>/pd.NA replaced by NaN:
out = (pd.to_numeric(df['col'], errors='coerce')
         .fillna(df['col'].replace({'<NA>': float('nan'), pd.NA: float('nan')}))
       )
Output:
0    1.0
1    1.0
2      a
3    NaN
4    NaN
5    NaN
Name: col, dtype: object
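The question mentions a whole DataFrame and `df.to_dict('list')`, so here is a minimal sketch of one way to apply the same idea to every column. The helper name `to_nan` and the example frame are made up for illustration, and the sketch assumes every cell holds a scalar and that only the literal string '<NA>' (not '') should count as missing:

```python
import pandas as pd


def to_nan(value):
    """Hypothetical helper: map '<NA>' and missing scalars to float NaN."""
    if isinstance(value, str):
        # Only the literal string '<NA>' is treated as missing (assumption);
        # every other string, including '', is returned unchanged.
        return float("nan") if value == "<NA>" else value
    # pd.isna is True for pd.NA, None and float NaN.
    return float("nan") if pd.isna(value) else value


# Made-up example frame: an object column and a nullable Int64 column.
df = pd.DataFrame({
    "a": [1.5, "x", pd.NA, "<NA>"],
    "b": pd.array([1, None, 3, 4], dtype="Int64"),
})

# Cast each column to object first so that nullable dtypes can hold float NaN,
# then clean the cells one by one.
cleaned = df.apply(lambda col: col.astype(object).map(to_nan))
as_lists = cleaned.to_dict("list")
```

With this approach `as_lists` should no longer contain `pd.NA`. If empty strings should also count as missing, the string check in `to_nan` would need to include `''` as well.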
 
    
    
        mozway
        
- Thanks for the answer @mozway. When I try `df.iloc[1, 6].replace({'': float('nan'), pd.NA: float('nan')})` I get: `AttributeError: 'NAType' object has no attribute 'replace'`. My `df` is a pandas DataFrame and `type(df.iloc[1, 6])` is `pandas._libs.missing.NAType`. Any idea what is happening here? – Telis May 04 '23 at 08:13
- Can you provide the output of `df.to_dict('list')` as an edit to your question? – mozway May 04 '23 at 08:21
- Can you please be explicit and provide the exact output? Your description is not enough for [reproducibility](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – mozway May 04 '23 at 08:40
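On the `AttributeError` reported in the comments: indexing with two integers, as in `df.iloc[1, 6]`, returns a single cell rather than a Series, and a `pd.NA` scalar has no `replace` method. The following is a rough sketch of the difference, using a made-up one-column frame rather than the asker's data (which was never posted):

```python
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({"col": [1, pd.NA, "a"]}, dtype=object)

cell = df.iloc[1, 0]            # two integers -> a single scalar cell
cell_is_na = cell is pd.NA      # the scalar here is pd.NA itself
has_replace = hasattr(cell, "replace")  # scalars like pd.NA lack .replace

column = df.iloc[:, 0]          # a slice for the rows -> a Series
# Keep non-missing values and put float NaN wherever the value is missing.
fixed = column.where(column.notna(), float("nan"))

# A single cell can be tested with pd.isna and assigned to directly.
if pd.isna(cell):
    df.iloc[1, 0] = float("nan")
```

Series methods such as `replace`, `where` or `fillna` are only available after selecting a whole column (or row); for an individual cell, `pd.isna` plus plain assignment is one option.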