I ran a couple of tests. The EXISTS variant proves to be substantially faster - as I expected and contrary to what @Tometzky posted.
Test bed with 10,000 rows on PostgreSQL 9.1.2 with decent settings:
CREATE TEMP TABLE test (
  a serial
 ,b int NOT NULL
 ,c int NOT NULL
);
INSERT INTO test (b,c)
SELECT (random()* 100)::int AS b, (random()* 100)::int AS c
FROM   generate_series(1, 10000);
ALTER TABLE test ADD CONSTRAINT a_pk PRIMARY KEY (a);
Between the first and second round of tests, I ran:
ANALYZE test;
When I finally applied the DELETE, 3368 duplicates were deleted. Performance may vary if you have substantially more or fewer duplicates.
I ran each query a couple of times with EXPLAIN ANALYZE and took the best result. Generally, the best hardly differs from the first or worst.
A bare SELECT (without the DELETE) shows similar results.
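To check how many rows qualify as duplicates before actually deleting anything, the EXISTS predicate can be reused in a plain count (a sketch against the test table above; it returns the number of rows the DELETE would remove - 3368 in this test):

SELECT count(*)
FROM   test t
WHERE  EXISTS (
    SELECT 1
    FROM   test t1
    WHERE  t1.a < t.a
    AND   (t1.b, t1.c) = (t.b, t.c)
    );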
1. CTE with rank()
Total runtime: 150.411 ms
Total runtime: 149.853 ms -- after ANALYZE
WITH x AS (
    SELECT a
          ,rank() OVER (PARTITION BY b, c ORDER BY a) AS rk
    FROM   test
    )
DELETE FROM test
USING  x
WHERE  x.a = test.a
AND    rk > 1;
2. CTE with row_number()
Total runtime: 148.240 ms
Total runtime: 147.711 ms -- after ANALYZE
WITH x AS (
    SELECT a
          ,row_number() OVER (PARTITION BY b, c ORDER BY a) AS rn
    FROM   test
    )
DELETE FROM test
USING  x
WHERE  x.a = test.a
AND    rn > 1;
3. row_number() in subquery
Total runtime: 134.753 ms
Total runtime: 134.298 ms  -- after ANALYZE
DELETE FROM test
USING (
    SELECT a
          ,row_number() OVER (PARTITION BY b, c ORDER BY a) AS rn
    FROM   test
    )  x
WHERE  x.a = test.a
AND    rn > 1;
4. EXISTS semi-join
Total runtime: 143.777 ms
Total runtime: 69.072 ms  -- after ANALYZE
DELETE FROM test t
WHERE EXISTS (
    SELECT 1
    FROM   test t1
    WHERE  t1.a < t.a
    AND   (t1.b, t1.c) = (t.b, t.c)
    );
The difference in the second run comes from a switch to a Hash Semi Join
instead of an additional Sort + Merge Semi Join.
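To see which join strategy the planner picks for your own data, prefix the statement with EXPLAIN ANALYZE. Note that EXPLAIN ANALYZE actually executes the DELETE, so wrap it in a transaction and roll back to keep the test data intact:

BEGIN;
EXPLAIN ANALYZE
DELETE FROM test t
WHERE EXISTS (
    SELECT 1
    FROM   test t1
    WHERE  t1.a < t.a
    AND   (t1.b, t1.c) = (t.b, t.c)
    );
ROLLBACK;

Look for "Hash Semi Join" versus "Merge Semi Join" near the top of the plan output.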
Results
- EXISTS clearly wins with up-to-date table statistics.
- With outdated statistics, row_number() in a subquery is fastest.
- rank() is the slowest variant.
- A CTE is slower than a subquery.
- ANALYZE (updated statistics) helps performance, sometimes a lot. Autovacuum (on by default) should more or less take care of this automatically - except for temporary tables or immediately after major changes to the table.
Test with 100,000 rows
I repeated the test with 100,000 rows and 63,045 duplicates. Similar results, except that EXISTS was slower, even after ANALYZE.
- Total runtime: 1648.601 ms  -- 1. CTE with rank()
- Total runtime: 1623.759 ms  -- 2. CTE with row_number()
- Total runtime: 1568.893 ms  -- 3. row_number() in subquery
- Total runtime: 1692.249 ms  -- 4. EXISTS semi-join
Raising the statistics target to 1000 and then to the maximum of 10000 (overkill in real life) and running another ANALYZE sped up all queries by ~ 1 %, but the query planner still went with Sort + Merge Semi Join for EXISTS.
ALTER TABLE test ALTER COLUMN b SET STATISTICS 10000;
ALTER TABLE test ALTER COLUMN c SET STATISTICS 10000;
ANALYZE test;
Only after I forced the planner to avoid merge joins did it switch to a Hash Semi Join, again taking half the time:
SET enable_mergejoin = off;
- Total runtime: 850.615 ms
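The enable_mergejoin parameter is a debugging aid, not something to leave disabled globally. Reset it after the experiment, or scope it to a single transaction with SET LOCAL (a sketch; the DELETE is the EXISTS variant from above):

RESET enable_mergejoin;

-- or, limited to one transaction:
BEGIN;
SET LOCAL enable_mergejoin = off;
DELETE FROM test t
WHERE EXISTS (
    SELECT 1
    FROM   test t1
    WHERE  t1.a < t.a
    AND   (t1.b, t1.c) = (t.b, t.c)
    );
COMMIT;

SET LOCAL reverts automatically at the end of the transaction, whether it commits or rolls back.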
Update
There have been improvements to the query planner since then. A retest with PostgreSQL 9.1.7 went straight to a Hash Semi Join.