Date AccountRef      Amount       Type
0   2019-04-15   REF0005  25000.00   cashflow_in
1   2019-04-15   REF004  20000.00   cashflow_in
2   2019-06-07   REF005   2000.00   cashflow_in
3   2019-07-24   REF005   3000.00   cashflow_in
4   2019-07-25   REF004   5000.00   cashflow_out
5   2019-08-08   REF004   5000.00   cashflow_in
6   2019-09-10   REF004   5000.00   cashflow_out
7   2019-10-13   REF005   5000.00   cashflow_out
8   2019-10-30   REF006   5000.00   cashflow_in
9   2019-11-08   REF009   2500.00   cashflow_out
10  2019-11-23   REF005   5000.00   cashflow_in
11  2019-11-30   REF011   5000.00   cashflow_out
Above is roughly my data set and I'm been trying to create a new column that sums of "Amount" if "AccountRef" is the same (so yes, there will be duplicate values), if type = cashflow_in, and date is after a date (eg using Date = 25/07/2020) for example).
I'll also be be another another column after this but the calculation method should be similar so I should be able to replace the filters/function again.
I've tried things like
> df = df[df['Type'] == 'cashflow_in'] 
> df['CumluativeIn'] = df.apply(lambda row: df[df['AccountRef'] == row['AccountRef']]['Amount'].sum(), axis=1)
but it doesn't really work the way it should. It's based on a SumIF command from excel.
Edit as I got advised to do it. My desire output is a value (or more precisely "sum") in each row where it meets those criteria (before a certain date, sum of when all values are the same as in the same row, and when type - cashflow_in).
The second weird criteria but afterwards I'll be adding a second column and varying the dates, sorry about my questions!
Edit 2: To show an example of what I want Date before 2019-11-01
          Date AccountRef      Amount       Type Cumulative_In
0   2019-04-15   REF0005  25000.00   cashflow_in 30000.00
1   2019-04-15   REF004  20000.00   cashflow_in 25000.00
2   2019-06-07   REF005   2000.00   cashflow_in 30000.00
3   2019-07-24   REF005   3000.00   cashflow_in 30000.00
4   2019-07-25   REF004   5000.00   cashflow_out 25000.00
5   2019-08-08   REF004   5000.00   cashflow_in 25000.00
6   2019-09-10   REF004   5000.00   cashflow_out 25000.00
7   2019-10-13   REF005   5000.00   cashflow_out 30000.00
8   2019-10-30   REF006   5000.00   cashflow_in 5000.00
9   2019-11-08   REF009   2500.00   cashflow_out 0.00
10  2019-11-23   REF005   5000.00   cashflow_in 30000.00
11  2019-11-30   REF011   5000.00   cashflow_out 0.00
In the world of Excel (assuming 0 = A2 and F2 is first data point in the new column), this would be New Column = Sumifs{Amount Column,
or more accurately
F2 = Sumifs{$D$2:$D$12,$C$2:$C$12,C2,$B$2:$B$12,< "2019-11-01",$F$2:$F$12 = "cashflow_in"}
Then F3 becomes F3 = Sumifs{$D$2:$D$12,$C$2:$C$12,C3,$B$2:$B$12,< "2019-11-01",$F$2:$F$12 = "cashflow_in"} F4 = Sumifs{$D$2:$D$12,$C$2:$C$12,C4,$B$2:$B$12,< "2019-11-01",$F$2:$F$12 = "cashflow_in"}
 
     
    
to work so here's an image of what I was trying to say. https://i.imgur.com/HKdkiAG.png – Lazar Clarent Jul 26 '20 at 09:33