Im Kinda new to Python and Datascience.
I have these two Dataframes : df Dataframe
df = pd.DataFrame({"Date": ['2014-11-21 11:00:00', '2014-11-21 11:00:03', '2014-11-21 11:00:04', '2014-11-21 11:00:05', '2014-11-21 11:00:07', '2014-11-21 11:00:08', '2014-11-21 11:00:10', '2014-11-21 11:00:11', '2014-10-24 10:00:55', '2014-10-24 10:00:59'], "A":[1, 2, 5, 3, 9, 6, 3, 0, 8, 10]})
                  Date   A
0  2014-11-21 11:00:00   1
1  2014-11-21 11:00:03   2
2  2014-11-21 11:00:04   5
3  2014-11-21 11:00:05   3
4  2014-11-21 11:00:07   9
5  2014-11-21 11:00:08   6
6  2014-11-21 11:00:10   3
7  2014-11-21 11:00:11   0
8  2014-10-24 10:00:55   8
9  2014-10-24 10:00:59  10
info Dataframe, this dataframe contains the Datetime range that my final df should contain
info = pd.DataFrame({"Start": ['2014-11-21 11:00:00', '2014-11-21 11:08:00', '2014-10-24 10:55:00'], "Stop": ['2014-11-21 11:07:00', '2014-11-21 11:11:00', '2014-10-24 10:59:00']})
                 Start                 Stop
0  2014-11-21 11:00:00  2014-11-21 11:00:07
1  2014-11-21 11:00:08  2014-11-21 11:00:11
2  2014-10-24 10:00:55  2014-10-24 10:00:59
The goal is to calculate the cumulative sum in df with a two seconds window, if and only if the actual row in df is in range of one of the rows in info. For example cumulative sum for row with date 2014-11-21 11:00:08 should be 0. Because it's in the start of a range, another example is row with date 2014-11-21 11:00:07, its cumsum should be 12(9+3).
Here's what i've achieved until now :
import pandas as pd
import numpy as np
df = pd.DataFrame({"Date": ['2014-11-21 11:00:00', '2014-11-21 11:00:03', '2014-11-21 11:00:04', '2014-11-21 11:00:05', '2014-11-21 11:00:07', '2014-11-21 11:00:08', '2014-11-21 11:00:10', '2014-11-21 11:00:11', '2014-10-24 10:00:55', '2014-10-24 10:00:59'], "A":[1, 2, 5, 3, 9, 6, 3, 0, 8, 10]})
info = pd.DataFrame({"Start": ['2014-11-21 11:00:00', '2014-11-21 11:00:08', '2014-10-24 10:00:55'], "Stop": ['2014-11-21 11:00:07', '2014-11-21 11:00:11', '2014-10-24 10:00:59']})
#info = pd.DataFrame({"Start": ['2014-11-21 11:00:00', '2014-11-21 11:00:00', '2014-11-21 11:00:00', '2014-11-21 11:00:01', '2014-11-21 11:00:02', '2014-11-21 11:00:03', '2014-11-21 11:00:04', '2014-11-21 11:00:05'], "Stop": ['2014-11-21 11:00:00', '2014-11-21 11:00:01', '2014-11-21 11:00:02', '2014-11-21 11:00:03', '2014-11-21 11:00:04', '2014-11-21 11:00:05', '2014-11-21 11:00:06', '2014-11-21 11:00:07']})
info['groupnum']=info.index
info.Start=pd.to_datetime(info.Start)
info.Stop=pd.to_datetime(info.Stop)
cinfo = info.set_index(pd.IntervalIndex.from_arrays(info.Start, info.Stop, closed='both'))['groupnum']
df['groupnum']=pd.to_datetime(df.Date).map(cinfo)
df['cum'] = df.groupby('groupnum').A.cumsum()
print(df)
Expected Result :
                  Date   A  groupnum  cum
0  2014-11-21 11:00:00   1         0    1
1  2014-11-21 11:00:03   2         0    2
2  2014-11-21 11:00:04   5         0    7
3  2014-11-21 11:00:05   3         0   10
4  2014-11-21 11:00:07   9         0   12
5  2014-11-21 11:00:08   6         1    6
6  2014-11-21 11:00:10   3         1    9
7  2014-11-21 11:00:11   0         1    3
8  2014-10-24 10:00:55   8         2    8
9  2014-10-24 10:00:59  10         2   10
Actual Result :
                  Date   A  groupnum  cum
0  2014-11-21 11:00:00   1         0    1
1  2014-11-21 11:00:03   2         0    3
2  2014-11-21 11:00:04   5         0    8
3  2014-11-21 11:00:05   3         0   11
4  2014-11-21 11:00:07   9         0   20
5  2014-11-21 11:00:08   6         1    6
6  2014-11-21 11:00:10   3         1    9
7  2014-11-21 11:00:11   0         1    9
8  2014-10-24 10:00:55   8         2    8
9  2014-10-24 10:00:59  10         2   18
But this is doing cumulative sum over the groupnum and i can't manage to cumulate only by 2s.
So is ther anyproper way to achieve this ? I'd be really grateful.
My english isn't that good, i hope that i explained you right
 
     
    