About tableA
- On MySQL 5.7.38 (AWS RDS)
- Contains +6.000.000 rows
- Property last_activity_atis set as anINDEXand hasdatetimeformat
- Property project_uidis set as anINDEXand is part of a Foreign Key
I'm performing the following SQL query
SELECT  project_uid, COUNT(uid) AS "count"
    FROM  tableA
    WHERE  last_activity_at >= "2023-05-01 00:00:00"
      AND  last_activity_at <= "2023-05-31 23:59:59"
    GROUP BY  project_uid
Here is the result (only performance)
117 rows in set (5 min 49,69 sec)
Here is the EXPLAIN results of the query above
+----+-------------+--------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------+---------+------+---------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys                                                                                                               | key                            | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+--------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | tableA | NULL       | index | IDX_2dd9918084b7ec339b29123b7e,IDX_acd46110be997a5678aa809f1c,FK_c0bc0d43ba64ca68b4fc7b7a095,IDX_8d2721775c4645ba9c620ec9fa | FK_c0bc0d43ba64ca68b4fc7b7a095 | 147     | NULL | 6235932 |    50.00 | Using where |
+----+-------------+--------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------+---------+------+---------+----------+-------------+
Here is the result without group by (which is still a lot for a simple where query)
1376493 rows in set (31,74 sec)
Here is the EXPLAIN of the query without GROUP BY
+----+-------------+--------+------------+------+--------------------------------+------+---------+------+---------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys                  | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+--------+------------+------+--------------------------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | tableA | NULL       | ALL  | IDX_8d2721775c4645ba9c620ec9fa | NULL | NULL    | NULL | 6237888 |    50.00 | Using where |
+----+-------------+--------+------------+------+--------------------------------+------+---------+------+---------+----------+-------------+
I'm trying to understand two things here :
- Why adding a group byimpact performances that much even with awhereclause that is supposed to reduce the number of rows to group ?
- Why the query without the group byis still taking 30 sec on a 6M rows table ? Maybe it's just me but 6M is not that much for a database ?
Thanks for your help. If you have any optimization recommandation I'll be glad to read it!
 
     
    