My pandas column contains the strings 0.0(nan) and 0(nan). I want to get 0 in both cases. Here is the code.
import pandas as pd
import re
df = pd.DataFrame.from_dict({'col1': ['0.0(nan)','0(nan)']})
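# Inner re.sub strips ".0...(nan)", outer re.sub strips a bare "(nan)"; raw strings avoid invalid-escape warnings
df['col2'] = df['col1'].astype(str).apply(lambda x: re.sub(r'(.*?)\(nan\)', r'\1', re.sub(r'(.*?)\.0*\(nan\)', r'\1', x)))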
print(df)
Below is the output. For the regex, I didn't know how to handle both the .0 and the plain 0 cases before the (, which is why I nested one re.sub inside another. My question is how to do this with a single re.sub. Or are there any other methods? Thank you.
       col1 col2
0  0.0(nan)    0
1    0(nan)    0
Edit: following the comment by @mozway, this works with a single re.sub:
df['col2'] = df['col1'].astype(str).apply(lambda x: re.sub(r'(.*?)(?:\.0)?\(nan\)', r'\1', x))
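As for other methods, one possible sketch is pandas' vectorized Series.str.replace with regex=True, which avoids both apply and the re import. It assumes every value ends with the literal (nan) suffix; the trailing $ anchor is an added assumption and not part of the pattern above.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['0.0(nan)', '0(nan)']})

# Match an optional ".0" followed by a literal "(nan)" at the end of the
# string, and replace the whole match with an empty string.
# Assumption: "(nan)" is always the final part of the value.
df['col2'] = df['col1'].astype(str).str.replace(r'(?:\.0)?\(nan\)$', '', regex=True)
print(df)
```

Note that col2 still holds strings here; if numbers are wanted, pd.to_numeric(df['col2']) could be applied afterwards.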