[Solved] Reduce time in nested mysql query

EverSQL Database Performance Knowledge Base

I have a nested MySQL query over related tables, each with more than 500,000 records. The query takes 60 seconds to return results, even though all of the tables are indexed.

Please suggest how to reduce its execution time. Thanks in advance.

    SELECT t1.col1,t1.col2
    FROM table1 AS t1
    WHERE t1.col2 IN
    (
        SELECT DISTINCT(t2.col1) FROM table2 AS t2 WHERE t2.col2 IN
        (
            SELECT t3.col1
            FROM  table3 AS t3
            WHERE t3.col2 = '04' ORDER BY t3.col1 ASC
        )
        ORDER BY t2.col1 ASC
    )

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.

The optimization process and recommendations:

  1. Avoid Subqueries (query line: 8): We advise against using subqueries where possible, as the MySQL optimizer often handles them poorly. Instead, store the subquery's result in a newly created temporary table that has an index on the search column, and join that table.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Replace IN Subquery With Correlated EXISTS (modified query below): In many cases, an EXISTS subquery with a correlated condition will perform better than a non-correlated IN subquery.
  4. Use Numeric Column Types For Numeric Values (query line: 20): Referencing a numeric value (e.g. 04) as a string in a WHERE clause might result in poor performance. Storing numbers as varchars has several possible drawbacks: more space is used, arithmetic operations are not available, the data is not self-validating, aggregation functions like SUM won't work, and the output may sort incorrectly. If the column is numeric, remove the quotes from the constant value so that a numeric comparison is done.
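
As a rough illustration of recommendation 1, the innermost subquery could be materialized into an indexed temporary table and then joined. This is only a sketch under assumptions: the names `tmp_t3_keys` and `idx_col1` are hypothetical, and `table3.col1` is assumed to have a type that can be indexed without a prefix length.

```sql
-- Sketch only: tmp_t3_keys and idx_col1 are hypothetical names.
-- Materialize the keys selected by the innermost subquery, with an index
-- on the column used for the join (assumes col1 is not a TEXT/BLOB column).
CREATE TEMPORARY TABLE tmp_t3_keys (INDEX idx_col1 (col1))
    SELECT DISTINCT t3.col1
    FROM table3 AS t3
    WHERE t3.col2 = '04';

-- Join instead of nesting IN subqueries. The derived table keeps DISTINCT
-- so that each row of table1 is returned at most once, as in the original.
SELECT t1.col1, t1.col2
FROM table1 AS t1
JOIN (
    SELECT DISTINCT t2.col1
    FROM table2 AS t2
    JOIN tmp_t3_keys AS k ON k.col1 = t2.col2
) AS m ON m.col1 = t1.col2;

-- Clean up once the result has been read.
DROP TEMPORARY TABLE tmp_t3_keys;
```

Whether this beats the nested form depends on the MySQL version and the data distribution, so it is worth timing both variants on the real tables.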

Optimal indexes for this query:

    ALTER TABLE `table1` ADD INDEX `table1_idx_col2` (`col2`);
    ALTER TABLE `table2` ADD INDEX `table2_idx_col1` (`col1`);
    ALTER TABLE `table3` ADD INDEX `table3_idx_col2_col1` (`col2`,`col1`);
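
To confirm that the indexes exist and that the optimizer actually picks them, something like the following could be run after the `ALTER TABLE` statements. `SHOW INDEX` and `EXPLAIN` are standard MySQL statements; the query being explained is the one from the question.

```sql
-- List the indexes on each table to confirm the ALTER TABLE statements took effect.
SHOW INDEX FROM table1;
SHOW INDEX FROM table2;
SHOW INDEX FROM table3;

-- Ask the optimizer for its execution plan without running the query.
EXPLAIN
SELECT t1.col1, t1.col2
FROM table1 AS t1
WHERE t1.col2 IN
(
    SELECT DISTINCT(t2.col1) FROM table2 AS t2 WHERE t2.col2 IN
    (
        SELECT t3.col1
        FROM table3 AS t3
        WHERE t3.col2 = '04' ORDER BY t3.col1 ASC
    )
    ORDER BY t2.col1 ASC
);
```

In the `EXPLAIN` output, the `key` column names the index chosen for each table. If it shows `NULL` for one of the three tables, the corresponding index is not being used for that step.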

The optimized query:

    SELECT
        t1.col1,
        t1.col2 
    FROM
        table1 AS t1 
    WHERE
        t1.col2 IN (
            SELECT
                DISTINCT (t2.col1) 
            FROM
                table2 AS t2 
            WHERE
                EXISTS (
                    SELECT
                        1 
                    FROM
                        table3 AS t3 
                    WHERE
                        (
                            t3.col2 = '04'
                        ) 
                        AND (
                            t2.col2 = t3.col1
                        ) 
                    ORDER BY
                        t3.col1 ASC
                ) 
            ORDER BY
                t2.col1 ASC)
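
Recommendation 3 can also be applied to the outer `IN`. The sketch below is a manual variant, not output of the automatic rewrite. It relies on the fact that `IN` and `EXISTS` only test whether a matching row exists, so the `ORDER BY` clauses and the `DISTINCT` inside the subqueries do not change which rows of `table1` are returned and are left out here.

```sql
-- Sketch: both nesting levels written as correlated EXISTS subqueries.
SELECT t1.col1, t1.col2
FROM table1 AS t1
WHERE EXISTS (
    SELECT 1
    FROM table2 AS t2
    WHERE t2.col1 = t1.col2          -- correlates table2 with the outer row
      AND EXISTS (
          SELECT 1
          FROM table3 AS t3
          WHERE t3.col2 = '04'
            AND t3.col1 = t2.col2    -- correlates table3 with table2
      )
);
```

With the indexes listed above, each `EXISTS` check can be answered by an index lookup, though the actual plan should still be checked against the real data.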

* Original question posted on StackOverflow.