[Solved] Multi-column performance of IN clause + ORDER BY

I have a table like this:

id | person_id | created_at
---------------------------
0  | 10        | ...
1  | 10        | ...
2  | 11        | ...
3  | 11        | ...
.. | ...       | ... 

and I'm currently performing the following query:

SELECT * FROM table WHERE person_id IN (10,11,12,34,58) ORDER BY created_at DESC LIMIT x OFFSET y;

I basically want the records sorted by created_at, but only the ones corresponding to any of the provided person_id values.

Right now I have two separate indices, one on created_at and one on person_id, and I've been asking myself the following:

If my query were WHERE person_id = 10 instead of IN, I'm sure a composite index on (person_id, created_at) would do the trick, but I'm not 100% sure it helps in this scenario.

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid OFFSET In LIMIT Clause (query line: 10): OFFSET clauses can be very slow when used with high offsets (e.g. with high page numbers when implementing paging). Instead, use the seek method (http://www.eversql.com/faster-pagination-in-mysql-why-order-by-with-limit-and-offset-is-slow/), which provides better and more stable response times.
  2. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
  3. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
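
As a rough illustration of recommendations 1 and 2, the seek (keyset) method could look like the sketch below. It assumes that id is unique and can serve as a tie-breaker, that the three columns shown in the question are the ones actually needed, and that the page size is 20; the literal timestamp and id are hypothetical placeholders for the values taken from the last row of the previous page.

```sql
-- First page: newest rows for the selected people, explicit column list instead of '*'.
-- id is added to ORDER BY so rows sharing a created_at keep a stable order.
SELECT id, person_id, created_at
FROM `table`
WHERE person_id IN (10, 11, 12, 34, 58)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: continue strictly after the last row already shown.
-- '2024-01-15 09:30:00' and 1234 are placeholders for that row's created_at and id.
SELECT id, person_id, created_at
FROM `table`
WHERE person_id IN (10, 11, 12, 34, 58)
  AND (created_at < '2024-01-15 09:30:00'
       OR (created_at = '2024-01-15 09:30:00' AND id < 1234))
ORDER BY created_at DESC, id DESC
LIMIT 20;
```

Because each page starts from a known position rather than skipping y rows, the cost of fetching a page should not grow with the page number. The trade-off is that this approach only supports moving to the next or previous page, not jumping to an arbitrary one.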
Optimal indexes for this query:
ALTER TABLE `table` ADD INDEX `table_idx_person_id_created_at` (`person_id`,`created_at`);
ALTER TABLE `table` ADD INDEX `table_idx_created_at` (`created_at`);
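
Which of these indexes the optimizer actually picks depends on the data, so it may be worth checking the plan after creating them. The sketch below assumes a page size of 20 and an explicit column list; neither comes from the original question.

```sql
-- Inspect the execution plan for the paginated query.
-- The `key` column shows the chosen index; "Using filesort" in `Extra`
-- means the matching rows are sorted after being read.
EXPLAIN
SELECT id, person_id, created_at
FROM `table`
WHERE person_id IN (10, 11, 12, 34, 58)
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;
```

With IN on several values, the (person_id, created_at) index keeps rows ordered by created_at only within each person_id, so a sort step is generally still expected when that index is used. When the created_at index is used instead, rows are read in order and filtered by person_id, which tends to work well only if the listed people account for a large share of recent rows.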
The optimized query:
SELECT
        *
    FROM
        `table`
    WHERE
        `table`.person_id IN (
            10, 11, 12, 34, 58
        )
    ORDER BY
        `table`.created_at DESC LIMIT x OFFSET y
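
The following is not part of the generated recommendations, but it is a commonly used alternative that builds on the observation in the question that (person_id, created_at) works well for a single person_id. It is a sketch only, assuming a page size of 10 and an offset of 20 (both hypothetical), so each branch needs to return at most 10 + 20 = 30 rows.

```sql
-- One branch per person_id: each branch can read its newest rows
-- directly from the (person_id, created_at) index.
(SELECT id, person_id, created_at FROM `table`
  WHERE person_id = 10 ORDER BY created_at DESC LIMIT 30)
UNION ALL
(SELECT id, person_id, created_at FROM `table`
  WHERE person_id = 11 ORDER BY created_at DESC LIMIT 30)
UNION ALL
(SELECT id, person_id, created_at FROM `table`
  WHERE person_id = 12 ORDER BY created_at DESC LIMIT 30)
UNION ALL
(SELECT id, person_id, created_at FROM `table`
  WHERE person_id = 34 ORDER BY created_at DESC LIMIT 30)
UNION ALL
(SELECT id, person_id, created_at FROM `table`
  WHERE person_id = 58 ORDER BY created_at DESC LIMIT 30)
-- The final ORDER BY and LIMIT apply to the combined result.
ORDER BY created_at DESC
LIMIT 10 OFFSET 20;
```

The per-branch limit must be at least offset plus page size, because in the worst case every row of the requested page belongs to the same person. This keeps the final sort small (at most 150 rows here), but the query has to be generated dynamically and becomes unwieldy for long IN lists or deep offsets.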

* original question posted on StackOverflow here.