We are using MySQL 5.5.42.
We have a table publications
containing about 150 million rows (about 140 GB on an SSD).
The table has many columns, of which two are of particular interest:
id
is primary key of the table and is of type bigint
cluster_id
is a nullable column of type bigint
Both columns have their own (separate) index.
We make queries of the form
SELECT * FROM publications
WHERE id >= 14032924480302800156 AND cluster_id IS NULL
ORDER BY id
LIMIT 0, 200;
Here is the problem: The larger the
id
value (14032924480302800156 in the example above), the slower the request.
In other words, requests for low id
value are fast (< 0.1 s) but the higher the id
value, the slower the request (up to minutes).
Everything is fine if we use another (indexed) column in the WHERE
clause. For instance
SELECT * FROM publications
WHERE inserted_at >= '2014-06-20 19:30:25' AND cluster_id IS NULL
ORDER BY inserted_at
LIMIT 0, 200;
where inserted_at
is of type timestamp
.
Edit:
Output of EXPLAIN
when using id >= 14032924480302800156
:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
---+-------------+--------------+------+--------------------+------------+---------+-------+----------+------------
1 | SIMPLE | publications | ref | PRIMARY,cluster_id | cluster_id | 9 | const | 71647796 | Using where
Output of EXPLAIN
when using inserted_at >= '2014-06-20 19:30:25'
:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
---+-------------+--------------+------+------------------------+------------+---------+-------+----------+------------
1 | SIMPLE | publications | ref | inserted_at,cluster_id | cluster_id | 9 | const | 71647796 | Using where
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
ALTER TABLE `publications` ADD INDEX `publications_idx_cluster_id_id` (`cluster_id`,`id`);
ALTER TABLE `publications` ADD INDEX `publications_idx_id` (`id`);
SELECT
*
FROM
publications
WHERE
publications.id >= 14032924480302800156
AND publications.cluster_id IS NULL
ORDER BY
publications.id LIMIT 0,
200