I think I am facing a common problem when moving data between two servers but I can't find a good solution. I have two servers: server1 and server2. I want to move a subset of a table in server1 to a table in server2. I have the following tables (source and sink tables have the same schema):
server1.dbo.source- Contains ID + other columns.server2.dbo.sink- Contains ID + other columns (same schema assourcetable).server2.dbo.utilizedIds- Contains one column with list of IDs I would like to move.
If I could reference both servers in the same query, the data that I am trying to move from server1 to server2 could be described by this SQL statement:
SELECT * FROM server1.dbo.source
WHERE server1.dbo.source.id IN (SELECT * FROM server2.dbo.utilizedIds)
What I am trying to achieve is to only move IDs that I need. These IDs are dynamic and stored in a table in my destination server. I am doing this because I need a really small subset of the source table (I need thousands of rows out of billions). So far, these are the solutions that I have found and the reasons why I can't use them or I wouldn't like to use them:
- Copy all the data into a temp table in
server2and perform the filtering there - Moving this much data every time I need to update my sink is not feasible. - Perform a delta insert of the source table into
server2and then filter this table - While this would be more performant than option 1, I really don't want to incur the cost of storing billions of rows when I only need thousands. - Create temp tables/persistent tables in
server1and insert theIDsinto this table, then filter using standard SQL - I am not the owner ofserver1, I cannot create tables or stored procedures. The temp table option was promising but temp tables don't persist across activities. - Create a dynamic query and pass the
IDsas a string to theSELECTstatement. Something like this:SELECT * FROM server1.dbo.source WHERE id IN ('ID1', 'ID2', 'ID3' ...)- So far, this is my best solution. However, SQL statements have a length and parameter limit, and this is not a good practice
To me, this seems like a common use case. What is the best practice or correct solution to this problem? Building my query using strings feels like a hack and not something ADF was designed for.








