As a specific example say I have a table T with columns customer and date indicating days on which individual customers have made purchases:
customer | date
----------------------
A | 01/01/2013
A | 02/01/2013
A | 07/01/2013
A | 11/01/2013
B | 03/01/2013
B | 08/01/2013
I want to add another column that for each pair (customer, date) pair (c, d), gives the number of pairs (c', d') in T such that c = c' and 0 <= days(d) - days(d') <= 7. Below is the table with this extra column:
customer | date | new_column
----------------------------------
A | 01/01/2013 | 1
A | 02/01/2013 | 2
A | 07/01/2013 | 3
A | 11/01/2013 | 2
B | 03/01/2013 | 1
B | 10/01/2013 | 1
As a rough idea of the steps I used to solve this problem:
- create a table
T'with all possible pairs(c,d); - left join
TontoT'; - create a new column:
count(date) over (partition by customer order by date asc rows between 6 preceding and 0 following); - omit any rows from this new table where
T.date is null
However, I don't think this is scalable.
Cheers for any help.