Why doesn't this work with an ON clause, but does with a WHERE clause?

Question

Julia just finished conducting a coding contest, and she needs your help assembling the leaderboard! Write a query to print the respective hacker_id and name of hackers who achieved full scores for more than one challenge. Order your output in descending order by the total number of challenges in which the hacker earned a full score. If more than one hacker received full scores in same number of challenges, then sort them by ascending hacker_id.

My solution that doesn't work:

SELECT h.hacker_id, h.name
FROM Hackers h
JOIN Challenges c
ON h.hacker_id = c.hacker_id
JOIN Difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN Submissions s
ON s.score = d.score
-- no where clause
GROUP BY h.hacker_id, h.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC, h.hacker_id;

Solution that works:

select h.hacker_id, h.name 
from Submissions as s 
join Hackers as h 
on s.hacker_id = h.hacker_id 
join Challenges as c 
on s.challenge_id = c.challenge_id
join Difficulty as d 
on c.Difficulty_level = d.Difficulty_level
-- the only real difference:
where s.score = d.score
group by h.hacker_id, h.name 
having count(*) > 1
order by count(*) desc, h.hacker_id;

Why does having s.score = d.score in the WHERE clause make the query work, but having it in an ON clause as part of an INNER JOIN make it not work (on HackerRank.com where the query comes from)? I thought for INNER JOINs it didn't matter, because the optimizer rearranges them at will?

How do I know when to use something like s.score = d.score (or whatever the columns are) in a WHERE clause and not in an ON clause as part of an INNER JOIN?

Please don't YELL. Please either ask about 1 bad query/function with the obligatory [mre] & why you think it should return something else at the 1st subexpression that it doesn't give what you expect, justified by reference to authoritative documentation, or ask about your overall goal giving working parts you can do & ideally a [mre]. But please ask about the former 1st because misconceptions in the former will get in the way of understanding the latter. And bad code doesn't tell us what you wish it would do. [ask] [Help] — philipxy, Oct 12 '22 at 22:02

philipxy · Answer 1 · 2022-10-12T22:21:54.533

2

Why would your code be correct? The problem is not ON vs WHERE.

You:

ON h.hacker_id = c.hacker_id
ON c.difficulty_level = d.difficulty_level
ON s.score = d.score

Them:

on s.hacker_id = h.hacker_id 
on s.challenge_id = c.challenge_id
on c.Difficulty_level = d.Difficulty_level
where s.score = d.score

You have h & c for hacker_id but they have h & s. You are missing challenge_id.

You are correct that in a sequence of inner/cross joins conjuncts can appear anywhere among the ON & WHERE as long as aliases are in scope.

CROSS JOIN vs INNER JOIN in SQL
Is there any rule of thumb to construct SQL query from a human-readable description?

edited Oct 12 '22 at 22:21

answered Oct 12 '22 at 22:16

philipxy

14,867
6
39
83

I didn't notice that. Thank you for pointing that out. Question: does it matter what the tables are connected on ? I thought that as long as the tables were connected, you could connect them on any shared column without altering the result? – UtahMan1083 Oct 13 '22 at 17:12
Please ask a (specific researched non-duplicate) question in a new post, not comments. I don't know what you mean by "as long as the tables were connected" or "shared column" or what it has to do with querying. A join on or where can be on any condition regardless of any constraints. Constraints need not be declared, be known or exist to query. Questions around this are all faqs. And: How does applying the links I gave, or just the definition of join/on/where, not answer your question? Again: "inner/cross joins conjuncts can appear anywhere among the ON & WHERE as long as aliases are in scope" – philipxy Oct 13 '22 at 19:55
`x join y on c` is `x cross join y where c` is `x cross join y` rows that make `c` true ie satisfy `c`. Clearly altering `c` can lead to "altering the result". The post's 2 links say: Given that `x` with columns `x1`,... holds the rows where/satisfying `"...[xi]..."` & `y` with columns `y1`,... holds the rows where/satisfying `"...[yj]..."`, `x join y on c` & `x cross join y where c` both hold the rows where/satisfying `"...[x.xi]... and ...[y.yj]... and c"`. – philipxy Oct 13 '22 at 21:34

Why doesn't this work with an ON clause, but does with a WHERE clause?

1 Answers1