I'm saving the results from my Tensorflow image classifier in a SQL database. I have 3 tables. Images, Categories and a table joining the two with weight variables. Some images have no relationships and some have lots.
The problem is that I have duplicate rows in the image table that need to be removed. But I need to preserve the many to many relationships, if the duplicated image has one or more.
Here's an example:
Table name: my_images
+----+------------+-----------------+
| ID | image_path | image_filename  |
+----+------------+-----------------+
|  1 | Film 1     | Film 1 001.jpg  |
|  2 | Film 1     | Film 1 001.jpg  |
|  3 | Film 1     | Film 1 002.jpg  |
|  4 | Film 1     | Film 1 002.jpg  |
|  5 | Film 1     | Film 1 003.jpg  |
|  6 | Film 1     | Film 1 003.jpg  |
+----+------------+-----------------+
Table name: my_terms
+---------+------------+
| term_id | term_name  |
+---------+------------+
|       1 | cat        |
|       2 | dog        |
|       3 | automobile |
+---------+------------+
Table name: my_term_relationships
+----------+---------+---------+
| image_id | term_id | weight  |
+----------+---------+---------+
|        2 |       1 | 0.58516 |
|        2 |       3 | 0.16721 |
|        3 |       2 | 0.21475 |
+----------+---------+---------+
So in this example, the ideal outcome would be to delete rows 1, 4 and either row 5 or 6 from my_images.
 
     
     
    