Let's take two datasets:
import pandas as pd
import numpy as np
df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])
I want to do the following thing:
- If any of the numbers in df[0:3] is greater than check_df[0], return 1, and 0 otherwise.
- If any of the numbers in df[1:4] is greater than check_df[1], return 1, and 0 otherwise.
- And so on...
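To make the expected result concrete, the rule above can be written as a plain loop. This is only a sketch of the intended output, not an efficient solution; the names `window`, `values`, `thresholds` and `expected` are introduced here for illustration, and it assumes that rows without a full 3-element window should stay NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])

window = 3  # assumed window length, matching df[0:3], df[1:4], ...
values = df[0].to_numpy()
thresholds = check_df[0].to_numpy()

# Rows that do not have a full window ahead of them stay NaN.
expected = np.full(len(values), np.nan)
for i in range(len(values) - window + 1):
    # 1.0 if any value in df[i:i+window] exceeds check_df[i], else 0.0
    expected[i] = float((values[i:i + window] > thresholds[i]).any())

expected = pd.Series(expected)
print(expected.tolist())
```

For the data above this gives 0, 1, 0, 1, 1, 0, 1 for the first seven rows and NaN for the last two.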
I tried to do this with the rolling function and a custom function:
def custom_fun(x: pd.DataFrame):
    return (x > float(check_df.iloc[0])).any()
And then I combine it with the apply function:
df.rolling(3, min_periods=3).apply(custom_fun).shift(-2)
The main problem with my solution is that I always compare against check_df[0], whereas in the i-th rolling window I should compare against check_df[i]. I have no idea how this can be specified in the rolling function. Could you please give me a hand with this problem?
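One possible direction, offered as a sketch rather than a definitive answer: "any value in the window is greater than the threshold" is equivalent to "the window maximum is greater than the threshold", so the custom function may not be needed at all. The names `window`, `window_max` and `result` are hypothetical, and the sketch assumes both frames share the same index and that incomplete windows should remain NaN.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])

window = 3  # assumed window length

# rolling() labels each window by its last row, so shift the result up by
# window - 1 rows: row i then holds the maximum of df[i:i+window].
window_max = df.rolling(window, min_periods=window).max().shift(-(window - 1))

# Compare each window maximum with the threshold in the same row, then
# restore NaN where no full window exists.
result = (window_max > check_df).astype(float).where(window_max.notna())
print(result)
```

Because the comparison is row-aligned, row i of `window_max` is checked against row i of `check_df`, which is the per-window threshold described above.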