I have a table called metadata which has a column called data of type TEXT. There are currently about 10 million rows in the metadata table.
I want to insert additional rows into the metadata table, but I need to ensure that no two rows have the same content in the data field (no duplicate data). Each data value is a very large string, roughly 100 to 1000 lines long.
What is the most performant way to insert a new row while guaranteeing that it does not duplicate an existing one?
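
To make the requirement concrete, here is a minimal sketch of one commonly suggested approach: store a fixed-size hash of data in an extra column with a unique index, so the database compares short digests instead of the full strings. Everything beyond the metadata table and its data column is an assumption for illustration: the question does not name the database engine (SQLite is used here only because it ships with Python), and the data_hash column and the open_db and insert_unique helpers are hypothetical names. It also treats two values as duplicates when their SHA-256 digests match, which is not a byte-for-byte comparison, although an accidental collision is extremely unlikely.

```python
import hashlib
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the table if it is missing.

    The data_hash column is a hypothetical addition; its UNIQUE
    constraint creates the index that enforces "no duplicates".
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        " id INTEGER PRIMARY KEY,"
        " data TEXT NOT NULL,"
        " data_hash BLOB NOT NULL UNIQUE)"
    )
    return conn


def insert_unique(conn, data):
    """Insert data unless a row with the same digest already exists.

    Returns True if a row was inserted, False if it was a duplicate.
    """
    # 32-byte digest of the full string; this is what gets indexed.
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    # INSERT OR IGNORE skips the row when the UNIQUE constraint
    # on data_hash would be violated (SQLite-specific syntax).
    cur = conn.execute(
        "INSERT OR IGNORE INTO metadata (data, data_hash) VALUES (?, ?)",
        (data, digest),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this layout, each insert costs one hash computation plus one lookup in an index of 32-byte keys, regardless of how long the string is. Other engines offer equivalent constructs (for example a unique index on a hash column combined with their own "insert if not exists" syntax), so the SQL would need adapting.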