I have been trying to delete duplicate rows from a table but all my efforts either result in error or get stuck during execution. My Table has 16.8 million records including 1.5 million duplicates. Table structure is as follows
--------------------------------------
| id | number | city | region | site |
--------------------------------------
| 1  | 12345  | abc  | xyz    | 321  |
| 2  | 67890  | def  | axc    | 167  |
| 3  | 12345  | abc  | xyz    | 321  |
| 4  | 13400  | fff  | aaa    | 301  |
--------------------------------------
I have tried using some of the approaches suggested in answers here at stack overflow but couldn't find a solution that worked for me.
DELETE n1 FROM data n1, data n2 WHERE n1.id > n2.id AND n1.number = n2.number
Didn't work so I tried following:
DELETE FROM data where data.number in 
(
    SELECT number from data GROUP BY number HAVING COUNT(*)>1
)
LIMIT 1
No use here either so I am stuck. All sorts of suggestions are welcome.
THE SOLUTION THAT WORKED FOR ME
Marc-B marked the post as duplicate of stackoverflow.com/a/3312066/1528290 tried that approach and it worked like a charm. my query was :
alter ignore table data add unique i_number (number)
 
    