Using pandas, I have been trying to pass a function as one of several replacement values to pandas.DataFrame.replace (i.e. one of the replacements should itself be the result of a function call). My understanding is that pandas.DataFrame.replace delegates internally to re.sub, and that anything that works with re.sub should also work with pandas.DataFrame.replace, provided that the regex parameter is set to True.
Accordingly, I followed guidance given elsewhere on Stack Overflow for re.sub and tried to apply it to pandas.DataFrame.replace (with regex=True, inplace=True, and to_replace given either as a nested dictionary, when targeting a specific column, or otherwise as two lists, per its documentation). My code works fine without a function call, but fails if I provide a function as one of the replacement values, even though I pass it in the same manner as for re.sub (which I tested, and which worked correctly). I realize that the function is expected to accept a match object as its only required parameter and to return a string.
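For reference, here is a minimal sketch of that contract using re.sub alone. The function name shout and the sample text are hypothetical stand-ins, not part of my real code; the sketch only illustrates that re.sub calls the function with a match object and uses the returned string as the replacement:

```python
import re

def shout(match):
    # Hypothetical stand-in for fn: takes the match object, returns a string.
    return match.group(0).upper()

# re.sub calls shout once per match and substitutes its return value.
result = re.sub('string', shout, 'test string')
print(result)
```

This is the behaviour I expected pandas.DataFrame.replace to reproduce when regex=True.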
Instead of containing the result of the function call, the resulting DataFrame contains the function object itself (i.e. the function is stored as a first-class object and is never called).
Why is this occurring, and how can I get this to work correctly (so that the function's return value is stored)? If this is not possible, I would appreciate it if a viable and "Pandasonic" alternative could be suggested.
I provide an example of this below:
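    def fn(match):
        # group(0) is the whole match; the pattern 'string' has no capture groups.
        matched = match.group(0)
        result = None
        # file_name is assumed to be defined elsewhere.
        with open(file_name, 'r') as file:
            for line in file:
                if 'string' in line:
                    # Use the last whitespace-separated token on the line.
                    result = line.split()[-1]
        # Fall back to the matched text if the file yields no replacement.
        return result or matched

    data.replace(to_replace={'col1': {'string': fn}},
                 regex=True, inplace=True)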
The above does not work: it matches the right search string, but replaces it with:
<function fn at 0x3ad4398>
For the above (contrived) example, the expected output would be that all occurrences of "string" in col1 are replaced with the string returned by fn.
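However, the equivalent call to re.sub, import re; print(re.sub('string', fn, 'test string')), works as expected (as described in the guidance mentioned above): the function is called and its return value is used as the replacement.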