EXPLAIN
As ruakh already explained, you probably misunderstood what EXPLAIN is telling you. If you want actual times in seconds, use EXPLAIN ANALYZE.
Be aware, though, that this actually executes the statement. I quote the manual here:
Important: Keep in mind that the statement is actually executed when
the ANALYZE option is used. Although EXPLAIN will discard any
output that a SELECT would return, other side effects of the
statement will happen as usual. If you wish to use EXPLAIN ANALYZE
on an INSERT, UPDATE, DELETE, CREATE TABLE AS, or EXECUTE
statement without letting the command affect your data, use this
approach:
BEGIN;
EXPLAIN ANALYZE ...;
ROLLBACK;
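Applied to the tables from the question, that pattern might look like this (a sketch; the UPDATE inside is whatever statement you want to profile):

```sql
BEGIN;
EXPLAIN ANALYZE          -- executes the statement and reports actual times
UPDATE movies m
SET    votes = i.value::int
FROM   infos i
WHERE  m.id = i.movie_id
AND    i.info_type = 100;
ROLLBACK;                -- discard all changes made by the UPDATE
```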
Still, the estimates for the first query are way too high and indicate a grave problem.
What's wrong?
As to your third approach: for big tables, it will always be faster by an order of magnitude to have the database server update a whole (big) table at once than sending instructions to the server for every row - even more so if the new values come from within the database. More in this related answer. If your tests show otherwise, chances are that something is wrong with your (test) setup. And in fact, it is ...
Your first query goes wrong completely. The god-awful performance estimate is an indication of how terribly wrong it is. While you join the table movies to the table infos in the FROM clause, you forget the WHERE condition to bind the resulting rows to the rows in the UPDATE table. This leads to a CROSS JOIN, i.e. every row in movies (600k) is updated with every single vote in values (200k), resulting in 120,000,000,000 updates. Yummy. And all wrong. Never execute this. Not even in a transaction that can be rolled back.
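For illustration, a statement of roughly this shape produces exactly that unintended CROSS JOIN (a reconstruction, not necessarily the asker's literal query):

```sql
UPDATE movies                        -- target table
SET    votes = i.value::int
FROM   movies m                      -- second, unbound reference to movies!
JOIN   infos i ON i.movie_id = m.id
WHERE  i.info_type = 100;
-- Missing: AND movies.id = m.id
-- Without it, every target row in movies pairs with every joined row.
```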
Your second query goes wrong, too. It runs a correlated subquery, i.e. it runs a separate query for every row. That's 600k subqueries instead of just 1, hence the terrible performance.
That's right: 600k subqueries. Not 200k. You instruct Postgres to update every movie, no matter what. Those without a matching infos.value (no info_type = 100), receive a NULL value in votes, overwriting whatever was there before.
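A reconstruction of that shape, showing why votes gets overwritten with NULL:

```sql
UPDATE movies m
SET    votes = (
   SELECT i.value::int
   FROM   infos i
   WHERE  i.movie_id = m.id
   AND    i.info_type = 100
   LIMIT  1
   );
-- No WHERE clause on the UPDATE itself: all 600k rows are updated.
-- Where the subquery finds no matching row, it yields NULL,
-- and that NULL is written to votes.
```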
Also, I wonder what that LIMIT 1 is doing there?
- Either (infos.movie_id, infos.info_type) is UNIQUE; then you don't need LIMIT.
- Or it isn't UNIQUE; then add a UNIQUE index to infos if you intend to keep the structure.
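If you go the second route, the index might look like this (the index name is my invention):

```sql
-- Guarantees at most one value per (movie, info type),
-- which makes the LIMIT 1 in the subquery redundant.
CREATE UNIQUE INDEX infos_movie_id_info_type_uni ON infos (movie_id, info_type);
```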
Proper query
UPDATE movies m
SET votes = i.value::int
FROM infos i
WHERE m.id = i.movie_id
AND i.info_type = 100
AND m.votes IS DISTINCT FROM i.value::int;
This is much like your first query, just simplified and doing it right. Plus:
- No need to join to movies a second time. You only need infos in the FROM clause.
- Actually bind the row to be updated to the row carrying the new value, thereby avoiding the (unintended) CROSS JOIN:
  WHERE m.id = i.movie_id
- Avoid empty updates; they carry a cost for no gain. That's what the last line is for.
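The last line uses IS DISTINCT FROM rather than <> because it treats NULL as a comparable value. A quick check:

```sql
SELECT NULL <> 1                AS neq_result,      -- NULL: the row would be filtered out
       NULL IS DISTINCT FROM 1  AS distinct_result; -- true: the row is updated
```

With a plain <>, rows where votes IS NULL would never pass the filter and would never be updated.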
Should be a matter of seconds or less, not millions of seconds.
BTW, indexes will not help this query; sequential scans are faster for the described data distribution, since you use all (or a third of) the rows in the involved tables.