I would like to keep just the earliest record of every ID in a table where the dates are in yyyy-mm-dd format. If I have two or more records on the same day, I just want to take one and I do not care what of them.
I tried to join the table with itself but the left join is not working and returns more than one.
Example original table:
| ID_vendor | sales | office | dt | 
|---|---|---|---|
| 1 | 3 | A | 2021-10-12 | 
| 1 | 50 | B | 2021-10-13 | 
| 2 | 109 | H | 2021-10-13 | 
| 3 | 110 | H | 2021-10-05 | 
| 4 | 111 | N | 2021-10-13 | 
| 4 | 112 | W | 2021-10-13 | 
| 4 | 113 | B | 2021-10-13 | 
Expected result:
| ID_vendor | sales | office | 
|---|---|---|
| 1 | 3 | A | 
| 2 | 109 | H | 
| 3 | 110 | H | 
| 4 | 111 | N | 
Y try using over partition without luck, and now I am stuck here with the LEFT JOIN returning an OUTER JOIN
Any help is welcome. Here the code:
WITH t as (
    SELECT id_vendor
        , sales 
        , office 
        , min(dt) fst_date
    FROM test_table
    WHERE dt >= date('2021-09-12')
    -- AND id_vendor = '1004618231015'
    GROUP BY id_vendor, sales, office 
    ORDER BY id_vendor
)
, b AS (
SELECT id_vendor
        , sales 
        , office
        , dense_rank() over (order by fst_date) as rnk
FROM t
-- WHERE id_vendor = '1004618231015'
GROUP BY id_vendor
        , sales 
        , office
        , fst_date
        )
, c AS (
SELECT id_vendor
FROM b WHERE rnk = 1
GROUP BY id_vendor
)
, d AS (
SELECT id_vendor
    , sales
    , office
FROM b WHERE rnk = 1)
)
SELECT c.id_vendor
    , d.sales
    , d.office
FROM c
LEFT join d
    ON c.id_vendor = d.id_vendor
 
     
    