[Solved] MySQL query optimisation, big table, using temporary filesort
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

MySQL query optimisation, big table, using temporary filesort

Database type:

I need your help optimizing a query. One table is a log table which has millions of entries and I try to break my query to < 1s. My query should give an overall overview and should be quick therefore. I'm sure I could make multiple simple queries over the list with help of a script. But what a script can do, can mysql I think - I hope at least. And maybe not all parts are best used, but I'm stuck in a query which makes a temporary table and filesort (which I found out is really bad). As reading around I found out to use some neat and well placed indexes but now I'm stuck in a specific point.

Let me show you my final query with its results:

SELECT
    ps.SERVER_ID,
    ps.FULLNAME,
    SUM(CASE WHEN pml.ID_TYPE = 3 THEN 1 ELSE 0 END) 'amount_warning',
    SUM(CASE WHEN pml.ID_TYPE = 4 THEN 1 ELSE 0 END) 'amount_error',
    SUM(CASE WHEN pml.ID_TYPE = 5 THEN 1 ELSE 0 END) 'amount_alert',
    SUM(CASE WHEN pml.ID_TYPE = 7 THEN 1 ELSE 0 END) 'amount_critical'
FROM
    PAR_SERVER ps
INNER JOIN
    PAR_MONITORINGv2_LOG pml ON ps.SERVER_ID = pml.SERVER_ID
WHERE
    pml.CREATED_DATE > date_sub( NOW( ) , INTERVAL 7 DAY )
GROUP BY
    ps.SERVER_ID;

Here is what I get:

mysql> [thequeryabove]
[...]
59 rows in set (11.69 sec)

mysql> explain [thequeryabove]
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
| id | select_type | table | type   | possible_keys               | key     | key_len | ref                       | rows    | Extra                                        |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | pml   | ALL    | SERVER_ID,SERVER_ID-ID_TYPE | NULL    | NULL    | NULL                      | 4014447 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ps    | eq_ref | PRIMARY                     | PRIMARY | 4       | database.pml.SERVER_ID |       1 |                                              |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
2 rows in set (0.00 sec)

Here's my current table setup:

mysql> describe PAR_SERVER;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| SERVER_ID      | int(255)     | NO   | PRI | NULL    | auto_increment |
| FULLNAME       | varchar(255) | YES  |     | NULL    |                |
| SHORTNAME      | varchar(255) | YES  | MUL | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> show indexes from PAR_SERVER;
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table      | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| PAR_SERVER |          0 | PRIMARY   |            1 | SERVER_ID   | A         |         142 |     NULL | NULL   |      | BTREE      |         |
| PAR_SERVER |          1 | shortname |            1 | SHORTNAME   | A         |         142 |     NULL | NULL   | YES  | BTREE      |         |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)

mysql> select count(*) from PAR_SERVER;
+----------+
| count(*) |
+----------+
|      142 |
+----------+
1 row in set (0.00 sec)

mysql> describe PAR_MONITORINGv2_LOG;
+--------------+----------+------+-----+---------+----------------+
| Field        | Type     | Null | Key | Default | Extra          |
+--------------+----------+------+-----+---------+----------------+
| ID           | int(11)  | NO   | PRI | NULL    | auto_increment |
| ID_TYPE      | int(11)  | NO   | MUL | NULL    |                |
| ID_SERVICE   | int(11)  | NO   | MUL | NULL    |                |
| SERVER_ID    | int(11)  | NO   | MUL | NULL    |                |
| MESSAGE      | tinytext | NO   |     | NULL    |                |
| CREATED_DATE | datetime | NO   |     | NULL    |                |
+--------------+----------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

mysql> show indexes from PAR_MONITORINGv2_LOG;
+----------------------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table                | Non_unique | Key_name          | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| PAR_MONITORINGv2_LOG |          0 | PRIMARY           |            1 | ID          | A         |     3998188 |     NULL | NULL   |      | BTREE      |         |
| PAR_MONITORINGv2_LOG |          1 | ID_TYPE           |            1 | ID_TYPE     | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| PAR_MONITORINGv2_LOG |          1 | ID_SERVICE        |            1 | ID_SERVICE  | A         |           5 |     NULL | NULL   |      | BTREE      |         |
| PAR_MONITORINGv2_LOG |          1 | SERVER_ID         |            1 | SERVER_ID   | A         |          66 |     NULL | NULL   |      | BTREE      |         |
| PAR_MONITORINGv2_LOG |          1 | SERVER_ID-ID_TYPE |            1 | SERVER_ID   | A         |          66 |     NULL | NULL   |      | BTREE      |         |
| PAR_MONITORINGv2_LOG |          1 | SERVER_ID-ID_TYPE |            2 | ID_TYPE     | A         |         258 |     NULL | NULL   |      | BTREE      |         |
+----------------------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.00 sec)

mysql> select count(*) from PAR_MONITORINGv2_LOG;
+----------+
| count(*) |
+----------+
|  3998386 |
+----------+
1 row in set (0.00 sec)

Here are time results breaking my query step by step down. I may going step by step up after fixing each part taking so long. But for now only the query with runtime of 2.30 sec is currently interesting for this question.

mysql> SELECT ps.SERVER_ID, ps.FULLNAME FROM PAR_SERVER ps INNER JOIN PAR_MONITORINGv2_LOG pml ON ps.SERVER_ID = pml.SERVER_ID WHERE pml.CREATED_DATE > date_sub( NOW( ) , INTERVAL 7 DAY ) GROUP BY ps.SERVER_ID;
[...]
59 rows in set (6.41 sec)

mysql> explain [thequeryabove]
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
| id | select_type | table | type   | possible_keys               | key     | key_len | ref                       | rows    | Extra                                        |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | pml   | ALL    | SERVER_ID,SERVER_ID-ID_TYPE | NULL    | NULL    | NULL                      | 4014788 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ps    | eq_ref | PRIMARY                     | PRIMARY | 4       | database.pml.SERVER_ID |       1 |                                              |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------+---------+----------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT ps.SERVER_ID, ps.FULLNAME FROM PAR_SERVER ps INNER JOIN PAR_MONITORINGv2_LOG pml ON ps.SERVER_ID = pml.SERVER_ID GROUP BY ps.SERVER_ID;
[...]
59 rows in set (2.30 sec)

mysql> explain [thequeryabove]
+----+-------------+-------+--------+-----------------------------+-----------+---------+---------------------------+---------+----------------------------------------------+
| id | select_type | table | type   | possible_keys               | key       | key_len | ref                       | rows    | Extra                                        |
+----+-------------+-------+--------+-----------------------------+-----------+---------+---------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | pml   | index  | SERVER_ID,SERVER_ID-ID_TYPE | SERVER_ID | 4       | NULL                      | 4015694 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | ps    | eq_ref | PRIMARY                     | PRIMARY   | 4       | database.pml.SERVER_ID |       1 |                                              |
+----+-------------+-------+--------+-----------------------------+-----------+---------+---------------------------+---------+----------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT pml.SERVER_ID FROM PAR_MONITORINGv2_LOG pml GROUP BY pml.SERVER_ID;
[...]
65 rows in set (0.00 sec)

mysql> explain [thequeryabove]
+----+-------------+-------+-------+---------------+-----------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key       | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+-----------+---------+------+------+--------------------------+
|  1 | SIMPLE      | pml   | range | NULL          | SERVER_ID | 4       | NULL |   67 | Using index for group-by |
+----+-------------+-------+-------+---------------+-----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

I was able to improve the query a lot by defining an index for (SERVER_ID, ID_TYPE) as my following example query confirms:

mysql> SELECT count(*) 'count_warnings' FROM PAR_MONITORINGv2_LOG pml WHERE pml.SERVER_ID = 191 AND pml.ID_TYPE = 3 GROUP BY pml.SERVER_ID;
[...]
1 row in set (0.01 sec)

mysql> explain [thequeryabove]
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-------------+-------+-------------+
| id | select_type | table | type | possible_keys                       | key               | key_len | ref         | rows  | Extra       |
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-------------+-------+-------------+
|  1 | SIMPLE      | pml   | ref  | ID_TYPE,SERVER_ID,SERVER_ID-ID_TYPE | SERVER_ID-ID_TYPE | 8       | const,const | 10254 | Using index |
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-------------+-------+-------------+
1 row in set (0.00 sec)

I'm stuck now in the most broked down query with a long execution time of 2.30 sec. I don't know how to use indexes for such a query not having any where clause.

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  2. Explicitly ORDER BY After GROUP BY (modified query below): By default, the database sorts all 'GROUP BY col1, col2, ...' queries as if you specified 'ORDER BY col1, col2, ...' in the query as well. If a query includes a GROUP BY clause but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying 'ORDER BY NULL'.
  3. Prefer Sorting/Grouping By The First Table In Join Order (modified query below): The database can use indexes more efficiently when sorting and grouping using columns from the first table in the join order. The first table is determined based on the prediction of the the optimal first table, and is not necessarily the first table shown in the FROM clause.
Optimal indexes for this query:
ALTER TABLE `PAR_MONITORINGv2_LOG` ADD INDEX `par_log_idx_created_date` (`CREATED_DATE`);
ALTER TABLE `PAR_MONITORINGv2_LOG` ADD INDEX `par_log_idx_server_id` (`SERVER_ID`);
ALTER TABLE `PAR_SERVER` ADD INDEX `par_server_idx_server_id` (`SERVER_ID`);
The optimized query:
SELECT
        ps.SERVER_ID,
        ps.FULLNAME,
        SUM(CASE 
            WHEN pml.ID_TYPE = 3 THEN 1 
            ELSE 0 END) 'amount_warning',
SUM(CASE 
    WHEN pml.ID_TYPE = 4 THEN 1 
    ELSE 0 END) 'amount_error',
SUM(CASE 
    WHEN pml.ID_TYPE = 5 THEN 1 
    ELSE 0 END) 'amount_alert',
SUM(CASE 
    WHEN pml.ID_TYPE = 7 THEN 1 
    ELSE 0 END) 'amount_critical' 
FROM
PAR_SERVER ps 
INNER JOIN
PAR_MONITORINGv2_LOG pml 
    ON ps.SERVER_ID = pml.SERVER_ID 
WHERE
pml.CREATED_DATE > date_sub(NOW(), INTERVAL 7 DAY) 
GROUP BY
pml.SERVER_ID 
ORDER BY
NULL

Related Articles



* original question posted on StackOverflow here.