I have two tables and I need to be able to update them going both ways, the first table is a list of items:
ITEMS
Item* | Rev* | RDate        | ECO  | New
------+------+--------------+------+----
A     | 0A   | 2019-01-01   | E123 | 1
A     | 01   | 2018-01-01   | E456 | 0
B     | 0A   | 2018-12-31   | E765 | 0
C     | 01   | 2018-10-25   | E456 | 0
The second is a parent-child table, with revisions, but I have to fill in the Child Rev from the Item table
Tree
Parent* | ParentRev* | Child* | ChildRev | VDate*
--------+------------+--------+----------+-----------
Y       | 0B         | C      | NULL     | 2019-01-01
Y       | 0C         | D      | NULL     | 2019-01-13
Z       | 01         | A      | NULL     | 2018-06-25
Z       | 02         | A      | NULL     | 2019-01-11
Z       | 0A         | B      | NULL     | 2019-01-01
Notes:
- Primary key columns are marked with *
- VDateshouldn't be part of the primary key, but the dataset is bad and has duplicates, so I need to add it
I looked into different questions like Select first row in each GROUP BY group?, but I couldn't find one that used row based conditions on the joining table that returned multiple fields.  Anyway, I'm using this to fill in the records where ChildRev is NULL, but it doesn't include the ECO column
UPDATE T 
SET [ChildRev] = (SELECT TOP 1 I.[Rev] AS [ChildRev]
                  FROM [Items] AS I
                  WHERE (I.[Item] = T.[Child]
                    AND I.[RDate] <= T.[VDate]) 
                  ORDER BY I.[RDate] DESC
                 )
FROM [Tree] AS T
WHERE T.[ChildRev] IS NULL
And, this is what I get:
Parent | ParentRev | Child | ChildRev | VDate      | ECO
-------+-----------+-------+----------+------------+------
Y      | 0B        | C     | 01       | 2019-01-01 | NULL
Y      | 0C        | D     | NULL     | 2019-01-13 | NULL
Z      | 01        | A     | 01       | 2018-06-25 | NULL
Z      | 02        | A     | 0A       | 2019-01-11 | NULL
Z      | 0A        | B     | 0A       | 2019-01-01 | NULL
I'm dealing with 4.5M+ records in the Tree table and 1.2M+ in the Item table, growing daily. I have 2 questions:
- Is there a better (faster) way to update the - TreeTable? (Bonus if it includes the- ECO)- When I add new - Items, they are flagged with a- 1in the- Newfield (might use trigger)
- How would I Check/Update the - Treetable with the new- Items
Mind you that I have no real control in what order the data will get loaded (table or date).
Update
So, apparently Select first row in each GROUP BY group? was basically the solution, I just didn't realize it. Specifically on how to use a CTE to Update my data tables. Thanks @Xedni for enlightening me; I've only really used CTEs for recursive queries. So, I ended up with 2 similar CTEs,
- When I add new records to the - Treetable, I added- AND ChildRev IS NULLto limit the updates:- WITH CTE AS ( SELECT ... ) UPDATE CTE SET ChildRev = ItemRev WHERE RID = 1 AND ChildRev IS NULL
- When I add new records to the - Materialstable, I added a- WHERE...ANYclause:- WITH CTE AS ( SELECT ... RID = ROW_NUMBER() OVER (PARTITION BY t.Parent, t.ParentRev, t.Child ORDER BY i.RDate DESC) FROM #Tree t JOIN #Items i ON t.Child = i.Item AND i.RDate <= t.VDate WHERE I.Process = ANY (SELECT Item FROM #Items WHERE New = 1) ) UPDATE CTE SET ChildRev = ItemRev WHERE RID = 1
 
     
    