I have a 227x4 DataFrame of country names and numerical values that I need to clean (wrangle?).
Here's a simplified reproduction of the DataFrame:
import pandas as pd
import random
import string
import numpy as np
# six random three-letter "country names"
pdn = pd.DataFrame(["".join(random.choice(string.ascii_letters) for i in range(3)) for j in range(6)], columns=['Country Name'])
# np.random.random_integers is deprecated; randint(1, 11) draws from the same range 1..10
measures = pd.DataFrame(np.random.randint(1, 11, size=(6, 2)), columns=['Measure1', 'Measure2'])
df = pdn.merge(measures, how='inner', left_index=True, right_index=True)
df.iloc[4,1] = 'str'
df.iloc[1,2] = 'stuff'
print(df)
  Country Name Measure1 Measure2
0          tua        6        3
1          MDK        3    stuff
2          RJU        7        2
3          WyB        7        8
4          Nnr      str        3
5          rVN        7        4
How do I replace the string values with np.nan in all the measure columns while leaving the country names untouched?
I tried using a boolean mask:
mask = df.loc[:,measures.columns].applymap(lambda x: isinstance(x, (int, float))).values
print(mask)
[[ True  True]
 [ True False]
 [ True  True]
 [ True  True]
 [False  True]
 [ True  True]]
# I expected this to replace the False positions with np.nan (the default) in place, but df is unchanged
df.loc[:,measures.columns].where(mask, inplace=True)
print(df)
  Country Name Measure1 Measure2
0          tua        6        3
1          MDK        3    stuff
2          RJU        7        2
3          WyB        7        8
4          Nnr      str        3
5          rVN        7        4
# this gives the output I want, but unfortunately the country names are missing
print(df.loc[:,measures.columns].where(mask))
  Measure1 Measure2
0        6        3
1        3      NaN
2        7        2
3        7        8
4      NaN        3
5        7        4
I have looked at several questions related to mine ([1], [2], [3], [4], [5], [6], [7], [8]), but could not find one that answered my concern.
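For reference, here is a minimal sketch of one direction that might work. It rests on the assumption that the in-place `where` above has no visible effect because `df.loc[:, measures.columns]` returns a copy when given a list of columns, so the result has to be assigned back to `df`. The sketch swaps in `pd.to_numeric(errors="coerce")` for the `isinstance` mask, and the fixed sample values and the name `measure_cols` are only for illustration.

```python
import numpy as np
import pandas as pd

# Fixed sample mirroring the printed DataFrame above (illustrative values)
df = pd.DataFrame({
    "Country Name": ["tua", "MDK", "RJU", "WyB", "Nnr", "rVN"],
    "Measure1": [6, 3, 7, 7, "str", 7],
    "Measure2": [3, "stuff", 2, 8, 3, 4],
})

# Hypothetical helper name: the columns that should end up numeric
measure_cols = ["Measure1", "Measure2"]

# Coerce each measure column to numeric; values that cannot be parsed become NaN.
# Assigning the result back to df keeps the "Country Name" column as it was.
df[measure_cols] = df[measure_cols].apply(pd.to_numeric, errors="coerce")

print(df)
```

Note that `pd.to_numeric` would also convert numeric-looking strings such as "5" into numbers, which differs from the `isinstance` mask. If that is not wanted, assigning the non-in-place result back, as in `df[measure_cols] = df[measure_cols].where(mask)`, should keep the original mask logic.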
 
     
     
     
    