Good evening dear Stack overflow community. This is my first question here.
I have the following issue: I need a query to count distinct values in a rolling date range (3 days) for each country.
I have done some research and found the following discussion here [1]: Query for count of distinct values in a rolling date range.
For the problem I face I need grouping not only by date, but also by country. Please consider the following input table:
| Date | Country | |
|---|---|---|
| 1/1/12 | DE | de1@example.com | 
| 1/1/12 | FRA | fra1@example.com | 
| 1/1/12 | SPA | spa1@example.com | 
| 1/2/12 | DE | de1@example.com | 
| 1/2/12 | DE | de2@example.com | 
| 1/3/12 | SPA | spa1@example.com | 
| 1/3/12 | SPA | spa2@example.com | 
| 1/3/12 | FRA | fra2@example.com | 
| 1/4/12 | SPA | spa1@example.com | 
| 1/4/12 | FRA | fra2@example.com | 
| 1/4/12 | FRA | fra3@example.com | 
| 1/4/12 | SPA | spa3@example.com | 
The expect outcome with counted distinct emails will be the following:
| Date | Country | |
|---|---|---|
| 1/1/12 | DE | 1 | 
| 1/1/12 | FRA | 1 | 
| 1/1/12 | SPA | 1 | 
| 1/2/12 | DE | 2 | 
| 1/2/12 | FRA | 1 | 
| 1/2/12 | SPA | 1 | 
| 1/3/12 | SPA | 2 | 
| 1/3/12 | DE | 2 | 
| 1/3/12 | FRA | 2 | 
| 1/4/12 | SPA | 3 | 
| 1/4/12 | FRA | 2 | 
| 1/4/12 | DE | 2 | 
I was trying to modify the solution suggested in the above-mentioned discussion and modify the following by adding country in selected columns and in grouping.
SELECT date
     ,(SELECT count(DISTINCT email)
       FROM   tbl
       WHERE  date BETWEEN g.date - 2 AND g.date
      ) AS dist_emails
FROM  (SELECT generate_series(timestamp '2012-01-01'
                            , timestamp '2012-01-06'
                            , interval  '1 day')::date) AS g(date)
Unfortunately the updated query does not work, as country is not recognised and it errors.
SELECT date, country,
         ,(SELECT count(DISTINCT email)
           FROM   tbl
           WHERE  date BETWEEN g.date - 2 AND g.date
          ) AS dist_emails
    FROM  (SELECT generate_series(timestamp '2012-01-01'
                                , timestamp '2012-01-06'
                                , interval  '1 day')::date) AS g(date)
GROUP BY 1,2
Will be very grateful for your advice and sharing your expertise on the ways this can be fixed.
 
     
    