I'd like to apply a function that returns multiple values to a pandas DataFrame and put the results in separate new columns of that DataFrame.
So given something like this:
import pandas as pd
df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})
def add_subtract(a, b):
    return (a + b, a - b)
The goal is a single command that calls add_subtract on a and b to create two new columns in df: sum and difference.
I thought something like this might work:
(df['sum'], df['difference']) = df.apply(
    lambda row: add_subtract(row['a'], row['b']), axis=1)
But it yields this error:
----> 9 lambda row: add_subtract(row['a'], row['b']), axis=1)
ValueError: too many values to unpack (expected 2)
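The error appears to arise because apply returns one tuple per row, so the unpacking iterates over the three rows rather than over the two outputs. A minimal sketch of one way around that, assuming apply yields a Series of tuples here (as the error suggests); the name pairs is only illustrative:

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# One (sum, difference) tuple per row; 'pairs' is a hypothetical name.
pairs = df.apply(lambda row: add_subtract(row['a'], row['b']), axis=1)

# zip(*pairs) transposes the per-row tuples into one sequence per output,
# so there are exactly two values to unpack.
df['sum'], df['difference'] = zip(*pairs)
```

This leaves add_subtract unchanged, at the cost of a less obvious right-hand side.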
EDIT: In addition to the answers below, the question "pandas apply function that returns multiple values to rows in pandas dataframe" shows that the function can be modified to return a list or Series, i.e.:
def add_subtract_list(a, b):
    return [a + b, a - b]
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']), axis=1)
or
def add_subtract_series(a, b):
    return pd.Series((a + b, a - b))
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_series(row['a'], row['b']), axis=1)
both work (the latter being equivalent to Wen's accepted answer).
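One caveat, stated as an assumption rather than something tested across versions: on newer pandas releases the list-returning variant may produce a single Series of lists instead of two columns, in which case the assignment fails. A hedged sketch that keeps the original tuple-returning add_subtract and relies on the result_type parameter of DataFrame.apply (available from pandas 0.23):

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# result_type='expand' asks apply to spread each returned tuple across
# columns, giving a DataFrame with one column per returned value.
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract(row['a'], row['b']),
    axis=1, result_type='expand')
```

With this form the function itself does not need to return a list or a Series.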