I have the following structure for the table DataTable: every column is of the datatype int, RowID is an identity column and the primary key. LinkID is a foreign key and links to rows of another table.
RowID   LinkID   Order  Data    DataSpecifier
1       120      1      1       1
2       120      2      1       3
3       120      3      1       10
4       120      4      1       13
5       120      5      1       10
6       120      6      1       13
7       371      1      6       2
8       371      2      3       5
9       371      3      8       1
10      371      4      10      1
11      371      5      7       2
12      371      6      3       3
13      371      7      7       2
14      371      8      17      4
.................................
.................................
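To make the sample reproducible, here is a minimal setup sketch, assuming SQL Server 2008 or later (for the multi-row VALUES syntax). The NOT NULL constraints are my assumption, and the foreign key on LinkID is left out because the other table is not shown here. RowID values are generated by the identity column.

```sql
-- Sketch of the table described above; constraints beyond the primary key are assumptions.
CREATE TABLE DataTable (
    RowID         int IDENTITY(1,1) PRIMARY KEY,
    LinkID        int NOT NULL,  -- foreign key in the real table; constraint omitted here
    [Order]       int NOT NULL,  -- bracketed because ORDER is a reserved word
    Data          int NOT NULL,
    DataSpecifier int NOT NULL
);

-- The 14 sample rows shown above.
INSERT INTO DataTable (LinkID, [Order], Data, DataSpecifier) VALUES
    (120, 1, 1, 1),
    (120, 2, 1, 3),
    (120, 3, 1, 10),
    (120, 4, 1, 13),
    (120, 5, 1, 10),
    (120, 6, 1, 13),
    (371, 1, 6, 2),
    (371, 2, 3, 5),
    (371, 3, 8, 1),
    (371, 4, 10, 1),
    (371, 5, 7, 2),
    (371, 6, 3, 3),
    (371, 7, 7, 2),
    (371, 8, 17, 4);
```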
I'm trying to write a query which alters every LinkID batch in the following way:
- Take every row with the same LinkID (e.g. the first batch is the first 6 rows here)
- Order them by the Order column
- Look at the Data and DataSpecifier columns as one compare unit (they can be thought of as one column, called dataunit):
- Keep rows from Order=1 onwards until a dataunit comes up that appears more than once in the batch
- Keep that row, but delete the remaining rows with the same LinkID and a greater Order value
So for the LinkID 120:
- Sort the batch by the Order column (already sorted here, but should still do it)
- Start looking from the top (so Order=1 here), and go on as long as you don't see a value which appears more than once in the batch
- Stop at the first duplicate, Order=3 (dataunit 1 10 also appears at Order=5).
- Delete everything which has the LinkID=120 AND Order>=4
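The "stop at the first duplicate" step might be expressed roughly like this. It is only a sketch, assuming SQL Server 2005 or later for the CTE and the windowed COUNT; the names Counted, UnitCount and CutOffOrder are made up for illustration. It only finds the cut-off Order per LinkID and does not delete anything.

```sql
-- Count how often each (Data, DataSpecifier) pair occurs within its LinkID batch.
WITH Counted AS (
    SELECT LinkID,
           [Order],
           COUNT(*) OVER (PARTITION BY LinkID, Data, DataSpecifier) AS UnitCount
    FROM DataTable
)
-- The cut-off is the lowest Order whose dataunit occurs more than once in the batch.
-- Batches without any repeated dataunit produce no row here.
SELECT LinkID,
       MIN([Order]) AS CutOffOrder
FROM Counted
WHERE UnitCount > 1
GROUP BY LinkID;
```

For the sample rows shown, this should give a cut-off of 3 for LinkID 120 and 5 for LinkID 371, which matches the rows I expect to keep.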
After a similar process for LinkID 371 (and every other LinkID in the table), the processed table will look like this:
RowID   LinkID   Order  Data    DataSpecifier
1       120      1      1       1
2       120      2      1       3
3       120      3      1       10
7       371      1      6       2
8       371      2      3       5
9       371      3      8       1
10      371      4      10      1
11      371      5      7       2
.................................
.................................
I've never written an SQL query this complicated. I know the query has to be something like this:
DELETE FROM DataTable  
WHERE RowID IN (SELECT RowID
                FROM DataTable
                WHERE -- ?
                GROUP BY LinkID
                HAVING COUNT(*) > 1 -- ?
                ORDER BY [Order]);
but I just can't seem to wrap my head around this and get the query right. I would prefer to do this in pure SQL, with one executable (and reusable) query.
I asked a very similar question here: How to remove rest of the rows with the same ID starting from the first duplicate?
But I realized that the filtering logic in that question was not actually what I needed, and since it had already been answered correctly, I had to ask this new question.
 
    