I have this table in PostgreSQL 15.3 (corresponding to a Django model):
                                             Table "public.myapp1_task"
         Column          |           Type           | Collation | Nullable |                     Default
-------------------------+--------------------------+-----------+----------+-------------------------------------------------
 id                      | bigint                   |           | not null | nextval('myapp1_task_id_seq'::regclass)
 created_at              | timestamp with time zone |           | not null |
 updated_at              | timestamp with time zone |           | not null |
 kind                    | character varying(12)    |           | not null |
 status                  | character varying(12)    |           | not null |
 environment             | character varying(7)     |           | not null |
 data                    | jsonb                    |           | not null |
 result                  | jsonb                    |           | not null |
 sent_at                 | timestamp with time zone |           |          |
 response_at             | timestamp with time zone |           |          |
 priority                | smallint                 |           | not null |
 sequence                | integer                  |           |          |
 result_attachment       | character varying(100)   |           | not null |
 taxes                   | jsonb                    |           | not null |
 myapp2_item_id          | bigint                   |           |          |
 source                  | character varying(8)     |           | not null |
 user_id                 | bigint                   |           |          |
 custom_actions          | jsonb                    |           | not null |
 
Indexes:
    "myapp1_task_pkey" PRIMARY KEY, btree (id)
    "myapp1_task_user_id_76a104e9" btree (user_id)
    "myapp1_task_myapp2_item_idd_441d91cb" btree (myapp2_item_id)
    "sequence_idx" btree (sequence DESC NULLS LAST)
    "sequence_mc_idx" btree (sequence, myapp2_item_id DESC NULLS LAST)
Goals: for each myapp2_item_id, find the row with the highest sequence.
I added the last two indexes related to the sequence column.
Using Django ORM, I'm trying to filter a queryset, here's the code:
queryset = Task.objects.all()
sequences = queryset.filter(item=OuterRef("item")).exclude(sequence__isnull=True).order_by("-sequence").distinct().values("sequence")
max_sequences = sequences.annotate(max_seq=Max("sequence")).values("max_seq")[:1]
filtered_queryset = queryset.filter(sequence=Subquery(max_sequences))
print(filtered_queryset.query)
which translates that into this SQL statement. Note the subquery with group by and max aggregates:
SELECT "myapp1_task"."id"
FROM "myapp1_task"
         LEFT OUTER JOIN "myapp2_item"
                         ON ("myapp1_task"."myapp2_item_id" = "myapp2_item"."id")
         LEFT OUTER JOIN "myapp2_user" ON ("myapp2_item"."user_id" = "myapp2_user"."id")
         LEFT OUTER JOIN "myapp2_category"
                         ON ("myapp2_item"."myapp2_category_id" = "myapp2_category"."id")
         LEFT OUTER JOIN "myapp2_user" T5 ON ("myapp1_task"."user_id" = T5."id")
WHERE "myapp1_task"."sequence" = (SELECT "subquery"."max_seq"
                                          FROM (
                                          SELECT MAX(U0."sequence") AS "max_seq", U0."sequence"
                                                FROM "myapp1_task" U0
                                                WHERE (U0."myapp2_item_id" =
                                                       ("myapp1_task"."myapp2_item_id"))
                                                GROUP BY U0."sequence"
                                                ORDER BY U0."sequence" DESC
                                                LIMIT 1) subquery)
Sadly, it's very slow on a fairly large table (>1M rows). Inspecting the explain result, I got this -> seq scan on the subquery, so none of the new indexes are used:
Seq Scan on myapp1_task  (cost=0.00..5525.25 rows=3 width=8)
  Filter: (sequence = (SubPlan 1))
  SubPlan 1
    ->  Subquery Scan on subquery  (cost=8.30..8.33 rows=1 width=4)
          ->  Limit  (cost=8.30..8.32 rows=1 width=8)
                ->  GroupAggregate  (cost=8.30..8.32 rows=1 width=8)
                      Group Key: u0.sequence
                      ->  Sort  (cost=8.30..8.31 rows=1 width=4)
                            Sort Key: u0.sequence DESC
                            ->  Index Scan using myapp1_task_myapp2_item_idd_441d91cb on myapp1_task u0  (cost=0.28..8.29 rows=1 width=4)
                                  Index Cond: (myapp2_item_id = myapp1_task.myapp2_item_id)
Not sure what I'm doing wrong. How can this be improved?
 
    