Abstract question: When importing a record (= table row) from some programs to a database, I sometimes want to only import the record if and only if the record isn't already present in the database, based on some unique identifier. Is it preferable to:
- first query the database to check whether the record with the same identifier is already present in the database, and if not, insert the record,
- or add some unicity constraint in the database, then try to insert the record, surrounded by some try ... catch
?
Concrete example: I have a data file containing a list of users. For each user I have a first name and last name. I assume that two people may not have the same name. I want to write a Python script that inserts the data in a PostgreSQL table, whose columns are userid (auto-incremented primary key), first_name (text), and last_name (text).
I want to insert the user John Doe. Is it preferable to:
- first query the database to check whether the record with the same record is already present in the database, and if not, insert the record, e.g.: - sql = "SELECT COUNT(*) cnt FROM users WHERE first_name = 'John' AND last_name`= 'Doe'" data = pd.read_sql_query(sql, connection) if data['cnt'][0]==0: sql = "INSERT INTO users (first_name, last_name) values ('John','Doe')" cursor.execute(sql) connection.commit()
- or add some unicity constraint in the database (on - (first_name, last_name)), then try to insert the record, surrounded by some- try … catch, e.g."- try: sql = "INSERT INTO users (first_name, last_name) values ('John','Doe')" cursor.execute(sql) connection.commit() except: print('The user is already present in database')
?
 
    