I have following Postgres query that runs over a large table and whenever it does, I observe a high CPU load that reaches 100% :
SELECT id, user_id, type
FROM table ss
WHERE end_timestamp > CURRENT_TIMESTAMP
AND (notification_channel1 = true OR notification_channel2 = true)
AND last_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'
ORDER BY user_id, type, title
Table schema:
CREATE TABLE saved_search (
    id BIGINT PRIMARY KEY,
    user_id varchar(40) NOT NULL,
    title varchar(500) NOT NULL,
    start_timestamp timestamp with time zone NOT NULL,
    end_timestamp timestamp with time zone NOT NULL,
    last_notification_timestamp timestamp with time zone NOT NULL,
    notification_channel1 boolean NOT NULL DEFAULT TRUE,
    notification_channel2 boolean NOT NULL DEFAULT TRUE,
);
I am thinking of using a covering index to speed up this query and hopefully avoid the cpu spike.
First question: would that be a valid path to follow?
The index I'm thinking of would be something like:
CREATE INDEX IX_COVERING 
ON table (end_date, notification_channel1, notification_channel2, last_notification_timestamp)
INCLUDE (id, user_id, type)
Would that be useful ? Would the INCLUDE be needed actually? Should I change the order of the columns in the index? Are there other / better approaches?
Gather Merge  (cost=387647.26..737625.73 rows=2999606 width=40) (actual time=36085.232..47805.119 rows=3634052 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Sort  (cost=386647.24..390396.74 rows=1499803 width=40) (actual time=35820.825..40497.683 rows=1211351 loops=3)
"        Sort Key: user_id, type, title"
        Sort Method: external merge  Disk: 63640kB
        Worker 0:  Sort Method: external merge  Disk: 63944kB
        Worker 1:  Sort Method: external merge  Disk: 57896kB
        ->  Parallel Seq Scan on table  (cost=0.00..150768.88 rows=1499803 width=40) (actual time=0.200..4176.269 rows=1211351 loops=3)
              Filter: ((notification_channel1 OR notification_channel2) AND (end_timestamp > CURRENT_TIMESTAMP) AND (last_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'))
              Rows Removed by Filter: 136960
Planning Time: 3.632 ms
Execution Time: 48292.788 ms
With SET work_mem = '100MB';, here is the output I'm getting:
`Gather Merge  (cost=305621.26..655599.73 rows=2999606 width=40) (actual time=48856.376..55606.264 rows=3634097 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Sort  (cost=304621.24..308370.74 rows=1499803 width=40) (actual time=46436.173..47563.732 rows=1211366 loops=3)
"        Sort Key: user_id, type, title"
        Sort Method: external merge  Disk: 72232kB
        Worker 0:  Sort Method: external merge  Disk: 55288kB
        Worker 1:  Sort Method: external merge  Disk: 57816kB
        ->  Parallel Seq Scan on table  (cost=0.00..150768.88 rows=1499803 width=40) (actual time=0.911..4643.228 rows=1211366 loops=3)
              Filter: ((notification_channel1 OR notification_channel2) AND (end_timestamp > CURRENT_TIMESTAMP) AND ((ast_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'))
              Rows Removed by Filter: 136960
Planning Time: 0.450 ms
Execution Time: 56035.995 ms
 
     
    