I know that when you use includes and specify a where clause on the joined table, you should also call .references.
Example:
# will error out or throw deprecation warning in logs
users = User.includes(:orders).where("orders.cost < ?", 20)
In Rails 4 or later, you will get an error like the following:
Mysql2::Error: Unknown column 'Orders.cost' in 'where clause': SELECT customers.* FROM customers WHERE (Orders.cost < 100)
Or you will get a deprecation warning:
DEPRECATION WARNING: It looks like you are eager loading table(s) (one of: users, addresses) that are referenced in a string SQL snippet. For example:
Post.includes(:comments).where("comments.title = 'foo'") Currently, Active Record recognizes the table in the string, and knows to JOIN the comments table to the query, rather than loading comments in a separate query. However, doing this without writing a full-blown SQL parser is inherently flawed. Since we don't want to write an SQL parser, we are removing this functionality. From now on, you must explicitly tell Active Record when you are referencing a table from a string:
Post.includes(:comments).where("comments.title = 'foo'").references(:comments)
If you don't rely on implicit join references you can disable the feature entirely by setting config.active_record.disable_implicit_join_references = true.
SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4, "addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."street" AS t1_r3, "addresses"."postal_code" AS t1_r4, "addresses"."city" AS t1_r5, "addresses"."created_at" AS t1_r6, "addresses"."updated_at" AS t1_r7 FROM "users" LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'Poland')
So we add .references:
# added .references(:orders)
users = User.includes(:orders).where("orders.cost < ?", 20).references(:orders)
And it executes just fine:
SELECT "users"."id"     AS t0_r0, 
  "users"."name"        AS t0_r1, 
  "users"."created_at"  AS t0_r2, 
  "users"."updated_at"  AS t0_r3, 
  "orders"."id"         AS t1_r0, 
  "orders"."cost"       AS t1_r1, 
  "orders"."user_id"    AS t1_r2, 
  "orders"."created_at" AS t1_r3, 
  "orders"."updated_at" AS t1_r4 
FROM "users" 
LEFT OUTER JOIN "orders" 
ON "orders"."user_id" = "users"."id" 
WHERE  ( orders.cost < 20 ) 
I know that .includes is just a wrapper around two methods: eager_load and preload. Since the query above filters on a joined table (orders in this example), includes is smart enough to pick the eager_load implementation over preload, because preload does not join tables and so cannot run this query.
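To make that mental model concrete, here is a rough sketch of how I picture the choice. It is plain Ruby, not Active Record's actual source: pick_strategy and its arguments are hypothetical names I made up for illustration, and it assumes association names match table names.

```ruby
# Rough mental model only; NOT Active Record's actual source.
# `pick_strategy`, `included`, and `referenced` are hypothetical names.
# Assumes association names match table names (:orders -> "orders").
def pick_strategy(included:, referenced: [])
  # A join is needed when an included table is also flagged as referenced
  # by the query's conditions.
  needs_join = (included.map(&:to_s) & referenced.map(&:to_s)).any?
  needs_join ? :eager_load : :preload
end

pick_strategy(included: [:orders])                        # => :preload
pick_strategy(included: [:orders], referenced: [:orders]) # => :eager_load
```

In this sketch, preload stands for loading orders in a separate query, and eager_load stands for the single LEFT OUTER JOIN query.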
Here is where I am confused. For the query above, includes uses the eager_load implementation under the hood. But when I explicitly use eager_load for this same query (which is essentially what includes is doing), I do not need .references! It runs the query and loads the data just fine, with no error and no deprecation warning:
# did not specify .references(:orders), and yet no error and no deprecation warning 
users = User.eager_load(:orders).where("orders.cost < ?", 20)
And it executes the exact same query with no problem:
SELECT "users"."id"     AS t0_r0, 
  "users"."name"        AS t0_r1, 
  "users"."created_at"  AS t0_r2, 
  "users"."updated_at"  AS t0_r3, 
  "orders"."id"         AS t1_r0, 
  "orders"."cost"       AS t1_r1, 
  "orders"."user_id"    AS t1_r2, 
  "orders"."created_at" AS t1_r3, 
  "orders"."updated_at" AS t1_r4 
FROM "users" 
LEFT OUTER JOIN "orders" 
ON "orders"."user_id" = "users"."id" 
WHERE  ( orders.cost < 20 ) 
That seems odd. Why does .references need to be specified for the includes version of the query but not for the eager_load version? What am I missing here?