I have data retrieved from one SQL table, where the value that will be the primary key in the destination table can come from one of two fields. That means I can get duplicates, so I need to remove them before ingesting into the destination table.
I also need to mark the duplicate records that were not ingested as processed so they don't get picked up again, so I collect the "ID" value (which is unique) from the source table for each duplicate row in an array. The code below works; I'm just worried it might be slow on large datasets (I expect around 50-60K rows at most).
If ds.Tables(0).Rows.Count > 0 Then
        Dim DupeID_List = Nothing
        Dim DupeID_Count As Integer = 0
        Dim hTable As New Hashtable()
        Dim duplicateList As New ArrayList()
        ' First pass: record each EIBItemID seen; any repeat is a duplicate
        For Each drow__1 As DataRow In ds.Tables(0).Rows
            If hTable.Contains(drow__1("EIBItemID")) Then
                ' Grow the array by one slot and store the duplicate row's ID
                ReDim Preserve DupeID_List(DupeID_Count)
                duplicateList.Add(drow__1)
                DupeID_List(DupeID_Count) = CStr(drow__1("ID"))
                DupeID_Count += 1
            Else
                hTable.Add(drow__1("EIBItemID"), String.Empty)
            End If
        Next
        ' Second pass: remove the duplicate rows from the DataTable
        For Each dRow__2 As DataRow In duplicateList
            ds.Tables(0).Rows.Remove(dRow__2)
        Next
        ' Flag the skipped source records as processed
        If DupeID_List IsNot Nothing Then
            Call MarkDupeRecordsExported(DupeID_List)
        End If
        Return ds
    Else
        Return Nothing
    End If
Is there a better way of achieving the same goal: check for duplicate "EIBItemID" values, remove them from the dataset that will be bulk-copied to the destination table, and update the source table for each record (ID) that is removed?
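For comparison, here is a sketch of the same logic using the generic `HashSet(Of T)` and `List(Of T)` instead of `Hashtable`/`ArrayList`. It avoids the repeated `ReDim Preserve` (which reallocates and copies the array on every duplicate) and relies on `HashSet.Add` returning `False` when the key is already present. It assumes `MarkDupeRecordsExported` accepts a `String()` array, as in my code above:

```vbnet
If ds.Tables(0).Rows.Count = 0 Then Return Nothing

Dim seen As New HashSet(Of Object)()
Dim duplicateRows As New List(Of DataRow)()
Dim dupeIds As New List(Of String)()

' Single pass: HashSet.Add returns False if the EIBItemID was already seen
For Each row As DataRow In ds.Tables(0).Rows
    If Not seen.Add(row("EIBItemID")) Then
        duplicateRows.Add(row)
        dupeIds.Add(CStr(row("ID")))
    End If
Next

' Remove the duplicate rows before the bulk copy
For Each row As DataRow In duplicateRows
    ds.Tables(0).Rows.Remove(row)
Next

' Mark the skipped source records as processed
If dupeIds.Count > 0 Then
    MarkDupeRecordsExported(dupeIds.ToArray())
End If

Return ds
```

I haven't benchmarked this against the `Hashtable` version, but both are O(n) lookups; the main win is dropping the O(n) array copy per duplicate.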