[Solved] Optimize IN clause in where query with order by - MySQL

EverSQL Database Performance Knowledge Base

Database type: MySQL

I am trying to optimize a query that uses an IN clause in its WHERE condition, and I want to avoid a filesort. To make it easy, I created the following sample, which shows the problem. Here is my query:

SELECT
    *
FROM `test`
WHERE user_id = 9898
AND status IN (1,3,4)
ORDER BY id
LIMIT 30;

Here is the result of EXPLAIN. As you can see, the query uses a filesort:

id  select_type  table  type   possible_keys  key      key_len  ref   rows  Extra
1   SIMPLE       test   range  user_id        user_id  8        NULL  3     Using where; Using index; Using filesort
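
For reference, a plan like the one above is obtained by prefixing the query with EXPLAIN. A minimal sketch, assuming the sample `test` table from this question:

```sql
-- Ask MySQL for the execution plan instead of running the query.
EXPLAIN
SELECT *
FROM `test`
WHERE user_id = 9898
  AND status IN (1, 3, 4)
ORDER BY id
LIMIT 30;
```

The Extra column is the one to watch here: "Using filesort" means MySQL performs an extra sort pass instead of reading the rows in index order.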

Here is my table structure:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `status` int(3) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`,`status`)
);

-- Dumping data for table `test`

INSERT INTO `test` (`id`, `user_id`, `status`) VALUES
(5, 9797, 2),
(6, 9797, 3),
(4, 9898, 0),
(1, 9898, 2),
(2, 9898, 3),
(3, 9898, 4);

How can I optimize the query? For my real table, I can see the following entry in the slow query log: # Query_time: 26.498180 Lock_time: 0.000175 Rows_sent: 100 Rows_examined: 4926

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
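
As an illustration of the first recommendation, the wildcard could be replaced with an explicit column list. This is only a sketch: the sample table has just three columns, so which of them the application actually needs is an assumption.

```sql
-- Name only the columns the application reads
-- (assumed here to be all three columns of the sample table).
SELECT id, user_id, status
FROM `test`
WHERE user_id = 9898
  AND status IN (1, 3, 4)
ORDER BY id
LIMIT 30;
```

On a wider table, selecting only indexed columns also lets MySQL answer the query from the index alone, which appears as "Using index" in the EXPLAIN output.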
Optimal indexes for this query:
ALTER TABLE `test` ADD INDEX `test_idx_user_id_status_id` (`user_id`,`status`,`id`);
ALTER TABLE `test` ADD INDEX `test_idx_id` (`id`);
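
Before creating these indexes, it may be worth listing what the table already has. Note that `id` is already the PRIMARY KEY in the table structure shown in the question, so a separate index on `id` alone is likely redundant. Likewise, if the table uses InnoDB (an assumption, since the engine is not shown), the existing `user_id` index on (`user_id`,`status`) already carries the primary key as a hidden suffix.

```sql
-- List the existing indexes on the sample table, with their column order,
-- so that duplicates of the PRIMARY KEY or of the existing composite key
-- can be spotted before adding more.
SHOW INDEX FROM `test`;
```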
The optimized query:
SELECT
        * 
    FROM
        `test` 
    WHERE
        `test`.user_id = 9898 
        AND `test`.status IN (
            1, 3, 4
        ) 
    ORDER BY
        `test`.id LIMIT 30
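
Because the IN list is a range condition on `status`, MySQL may still report "Using filesort" even with an index on (`user_id`,`status`,`id`): rows for different statuses are not interleaved in `id` order within the index. One commonly used alternative, which is not part of the recommendations above, is to split the IN list into one equality branch per status. The following is a sketch under that assumption, and it should be verified with EXPLAIN on the real table:

```sql
-- Each branch uses equality on (user_id, status), so it can read its
-- first 30 rows in id order straight from an index on
-- (user_id, status, id) without sorting.
(SELECT id, user_id, status FROM `test`
  WHERE user_id = 9898 AND status = 1 ORDER BY id LIMIT 30)
UNION ALL
(SELECT id, user_id, status FROM `test`
  WHERE user_id = 9898 AND status = 3 ORDER BY id LIMIT 30)
UNION ALL
(SELECT id, user_id, status FROM `test`
  WHERE user_id = 9898 AND status = 4 ORDER BY id LIMIT 30)
-- The outer sort only has to handle at most 90 rows.
ORDER BY id
LIMIT 30;
```

The outer ORDER BY still sorts, but over a small, bounded result set rather than every row matching the user and status filter.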

* Original question posted on StackOverflow.