I have a Pandas df which looks like this:
|   | yyyy_mm_dd | id | product    | status | is_50 | cnt |
|---|------------|----|------------|--------|-------|-----|
|   | 2002-12-15 | 7  | prod_rs    | 2      | 0     | 8   |
|   | 2002-12-15 | 16 | prod_go    | 2      | 0     | 1   |
|   | 2002-12-15 | 16 | prod_mb    | 2      | 0     | 3   |
|   | 2002-12-15 | 29 | prod_er    | 2      | 0     | 2   |
|   | 2002-12-15 | 29 | prod_lm    | 2      | 0     | 2   |
|   | 2002-12-15 | 29 | prod_ops   | 2      | 0     | 2   |
I also have a second dataframe which is similar:
|   | id | product    | cnt |
|---|----|------------|-----|
|   | 7  | prod_rs    | 8   |
|   | 16 | prod_go    | 1   |
|   | 16 | prod_mb    | 3   |
|   | 29 | prod_er    | 2   |
|   | 29 | prod_lm    | 2   |
|   | 29 | prod_ops   | 6   |
How can I create a third dataframe which will only store the rows which do not have an equal count? Based on the above, only the last row would be returned as the cnt for the id / product combination differs. Example output:
|   | id | product | cnt_df1 | cnt_df2 |
|---|----|---------|---------|---------|
|   | 29 | prod_ops| 2       | 6       |
The second df is one row larger in size so not all id / product combinations may be present in both dataframes.
I've been looking at merge but I'm unsure how to use when the cnt columns are not equal.
 
     
    