I am working on a function which requires to query several database tables (these are all SELECT statements), for ORM library and database driver I use SQLAlchemy and PyMySQL.
Initially I built the function entirely at ORM level, however I hit performance issue due to several round trips between my python application server and database server every time when running the function , since then I came up with a new approach to send multiple SQL statements in one flight to database server at SQLAlchemy Core level.
Here is simplified version of my new approach :
from sqlalchemy.sql import select as sa_select
db_engine = sqlalchemy_engin_init()
def my_db_query_fn(self, user_input_1, user_input_2):
# note that user_input_1, user_input_2 are from untrusted user input
with db_engine.connect() as conn:
# raw DBAPI connection for multiple result sets in one go
dbapi_conn = conn.connection
# cursor comes from low-level database driver, not standard SQLAlchemy class
cursor = dbapi_conn.cursor()
try:
table_a = Model_A.__table__
table_b = Model_B.__table__
stmt = sa_select(table_a.c.column1, table_a.c.column2
).where(table_a.c.column3 == user_input_1)
stmt2 = sa_select(table_b.c.column1, table_b.c.column2
).where(table_b.c.column3 == user_input_2)
sql_compile_fn = lambda s: str(s.compile(conn.engine, compile_kwargs={"literal_binds": True}))
# generate raw SQL statements by running compile()
rawsqls = list(map(sql_compile_fn, [stmt, stmt2]))
# send all the generated raw SQL statements in one round trip
cursor.execute(';'.join(rawsqls))
# then retrieve each result set one after the other, this part is skipped ....
finally:
cursor.close()
After testing , the code above does reduce the round trip latency, but my question is : is my new approach vulnerable to SQL injection attack. Here are what I've been considering :
- As commented in the code sample above, the function
my_db_query_fn()necessarily takes untrusted user inputsuser_input_1anduser_input_2to build all SQL statements, even all the raw SQL strings are generated bysqlalchemy.sql.selectable.Select.compile()instead of directly formatting raw SQL string patterns with the untrusted user inputs. - the function
my_db_query_fn()executes SQL statements bycursor.execute()notconn.execute(). Thecursorcomes from low-level database driver (in my case it isPyMySQL) , I don't know if that is considered as bad practice to avoid SQL injection. (Orcursor.execute()andconn.execute()are never related to that ?) - According to the similar question on stackoverflow and official documentation , the function
YOUR_MODEL.__table__.select()in SQLAlchemy ORM and the functionsqlalchemy.sql.expression.select(YOUR_MODEL)in SQLAlchemy Core actually generate instances of the same class typesqlalchemy.sql.selectable.Select, and the SELECT statement instance can be passed to eitherconn.execute()orsession.execute(). does that implicitly mean SQLAlchemy Core will also check against SQL injection ?
thanks for reading, I appreciate your feedback, hints, or tips