PROBLEM
I have the following dataframe (I am working with confidential data, so I will provide a sample):
import pandas as pd
df = pd.read_csv('file.csv')
df
    name    weight  height  smoke_or_not
0   Ajitesh 84       183          no
1   Shail   79       186          yes
2   Seema   67       158          no
3   Nidhi   52       155          no
I am saving this dataframe as a sql table using the following code. It is not included here, but I have already made de connection to the SQL Server database.
df.to_sql('table name', engine, if_exists='replace', index=False)
This dataframe is coming from a file that is beging updated daily. This update may contain new rows or even modifications of the existing rows.
For example, here I have modifications in the first row and a new row:
    name    weight  height  smoke_or_not
0   John    84       183          yes
1   Shail   79       186          yes
2   Seema   67       158          no
3   Nidhi   52       155          no
4   Mary    70       162          no
QUESTION
How could I identify the changes in existing rows and the new rows to run INSERT and UPDATE queries on SQLAlchemy?
I have my df coming from the file that is being updated daily.
I also have a df2 that is coming from the SQL database with a SELECT * from table query.
WHAT I HAVE TRIED
I have tried using:
changes = pd.merge(df, df2, how='outer', indicator=True)
right_only = changes[changes['_merge'] == 'right_only']
right_only
    name    weight  height  smoke_or_not    _merge
4   John    84        183     yes           right_only
5   Mary    70        162     no            right_only
With this I can get a new dataframe with all the rows that were changed + the new row. However, there are no distinction about which rows were changed and which ones are the new ones. So I can't separate which rows I am going to UPDATE or INSERT. Is there any easier way to do that?