I have a table “table-A” with duplicate records like (duplicates based on “name” column) "`
ID  Name        Course
01  abc     dotnet
02  xyz     java
03  mno     sas
04  abc     dotnet
05  xyz     java
06  abc     dotnet
07  aaa     testing
08  bbb     sap
--- “abc” came 3 times (id—1,4,6)
--- “xyz” came 2 times (id – 2,5)
`" From above table I need to delete duplicates (with ID -1, 4, 2) — not IDs 6, 5 those are latest files. “table-A” should contain only--
 ID Name        Course
03  mno     sas
05  xyz     java
06  abc     dotnet
07  aaa     testing
08  bbb     sap
I tried like—
CREATE TEMPORARY TABLE temptable (idTemp int(12), totTemp int(4));
INSERT INTO temp_table(`idTemp`, `totTemp`)   select  max(ID), count(*) as tot  from table-A 
       group by Name, Course having tot > 1 or tot =1  order by ID ;
Delete from table-A where ID not in (select  idTemp  from temp_table);
Above code is working. But it’s taking very huge time on large data table. My table contains 200,000 records with 40+ columns and every month it’s added 20,000 records.
I need to find duplicates based on 10 columns (group by on 10 columns) in this case anyone suggest well and fast working code.
I find different logics on internet and tried but they are taking more time than I explained above.
My main concern is query execution time. So please suggest me good logic or query which will do above task fast.
(just for information:-- i find few logics in stackoverflow also, the best one is suggesting to make unique index on thous columns. but in my case the data is coming from government every month, it may contain duplicates in file as well as compare to database. so i need to delete the duplicates or show duplicates in grid (using asp.net). )
 
     
     
     
    