[Solved] MySQL - Nested loops count too high resulting in query taking too long
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

MySQL - Nested loops count too high resulting in query taking too long

Database type:

I am seeing high response time for a MySQL query. I have two similar queries:

SELECT `orders`.`id` FROM `orders` WHERE (`orders`.line_count = 0 OR
                  `orders`.account_type_id IS NULL OR
                  `orders`.account_type_id IN (NULL)
                     OR `orders`.id IN (
                      SELECT lines.order_id from lines
                      INNER JOIN (select sec_accounts.id from accounts sec_accounts
                      WHERE (sec_accounts.account_type_id = 344 AND ((`sec_accounts`.`segment_1` = 'MS')))
                      ) as sec_accounts2 ON sec_accounts2.id = lines.account_id
                      WHERE lines.allocation_count = 0
                    )
) ORDER BY `orders`.`id` ASC LIMIT 90;

This took 90 rows in set (6.23 sec) to complete

SELECT `orders`.`id` FROM `orders` WHERE (`orders`.line_count = 0 OR
                  `orders`.account_type_id IS NULL OR
                  `orders`.account_type_id IN (NULL)
                     OR `orders`.id IN (
                      SELECT lines.order_id from lines
                      INNER JOIN (select sec_accounts.id from accounts sec_accounts
                      WHERE (sec_accounts.account_type_id = 8 AND ((`sec_accounts`.`segment_1` = '2D')))
                      ) as sec_accounts2 ON sec_accounts2.id = lines.account_id
                      WHERE lines.allocation_count = 0
                    )
) ORDER BY `orders`.`id` ASC LIMIT 90;

And this took 90 rows in set (0.05 sec) to complete

Notice, the difference between these two queries is just in the WHERE condition.

However, I was expecting the first query to be faster. First query subquery is returning 101101 records and second subquery is returning 923113 records i.e 9x of first subquery

Also, the where condition on accounts table returns 5648 and 175667 for first and second query respectively.

And still first query is very very slow as compared to second query which is strange.

I then started debugging this using EXPLAIN ANALYZE and these are the details:

Limit: 90 row(s)  (cost=0.43 rows=90) (actual time=1568.991..7802.360 rows=90 loops=1)
    -> Filter: ((orders.line_count = 0) or (orders.account_type_id is null) or (orders.account_type_id = NULL) or <in_optimizer>(orders.id,<exists>(select #2)))  (cost=0.43 rows=90) (actual time=1568.991..7802.349 rows=90 loops=1)
        -> Index scan on orders using PRIMARY  (cost=0.43 rows=90) (actual time=0.038..263.263 rows=443352 loops=1)
        -> Select #2 (subquery in condition; dependent)
            -> Limit: 1 row(s)  (cost=1.53 rows=0) (actual time=0.016..0.016 rows=0 loops=443348)
                -> Nested loop inner join  (cost=1.53 rows=0) (actual time=0.016..0.016 rows=0 loops=443348)
                    -> Filter: ((lines.allocation_count = 0) and (lines.account_id is not null))  (cost=0.97 rows=2) (actual time=0.008..0.011 rows=4 loops=443348)
                        -> Index lookup on lines using index_ol_on_order_id_supplier_id_reporting_total (order_id=<cache>(orders.id))  (cost=0.97 rows=3) (actual time=0.007..0.011 rows=4 loops=443348)
                    -> Filter: ((sec_accounts.account_type_id = 344) and (sec_accounts.segment_1 = 'MS'))  (cost=0.25 rows=0) (actual time=0.001..0.001 rows=0 loops=1571833)
                        -> Single-row index lookup on sec_accounts using PRIMARY (id=lines.account_id)  (cost=0.25 rows=1) (actual time=0.001..0.001 rows=1 loops=1571833)
Limit: 90 row(s)  (cost=0.43 rows=90) (actual time=1.893..78.997 rows=90 loops=1)
    -> Filter: ((orders.line_count = 0) or (orders.account_type_id is null) or (orders.account_type_id = NULL) or <in_optimizer>(orders.id,<exists>(select #2)))  (cost=0.43 rows=90) (actual time=1.892..78.985 rows=90 loops=1)
        -> Index scan on orders using PRIMARY  (cost=0.43 rows=90) (actual time=0.020..3.054 rows=3634 loops=1)
        -> Select #2 (subquery in condition; dependent)
            -> Limit: 1 row(s)  (cost=1.53 rows=0) (actual time=0.020..0.020 rows=0 loops=3634)
                -> Nested loop inner join  (cost=1.53 rows=0) (actual time=0.020..0.020 rows=0 loops=3634)
                    -> Filter: ((lines.allocation_count = 0) and (lines.account_id is not null))  (cost=0.97 rows=2) (actual time=0.010..0.014 rows=3 loops=3634)
                        -> Index lookup on lines using index_ol_on_order_id_supplier_id_reporting_total (order_id=<cache>(orders.id))  (cost=0.97 rows=3) (actual time=0.009..0.013 rows=3 loops=3634)
                    -> Filter: ((sec_accounts.account_type_id = 8) and (sec_accounts.segment_1 = '2D'))  (cost=0.26 rows=0) (actual time=0.002..0.002 rows=0 loops=10679)
                        -> Single-row index lookup on sec_accounts using PRIMARY (id=lines.account_id)  (cost=0.26 rows=1) (actual time=0.002..0.002 rows=1 loops=10679)

And no. of loops for first subquery is 443348 and for second subquery is 3634 which is strange to me.

Why is no. of loops so high for first query and any other reason why is it so slow?

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid OR Conditions By Using UNION (modified query below): In mosts cases, filtering using the OR operator cannot be applied using indexes. A more optimized alternative will be to split the query to two parts combined with a UNION clause, while each query holds one part of the original OR condition.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Prefer Direct Join Over Joined Subquery (query line: 19): We advise against using subqueries as they are not optimized well by the optimizer. Therefore, we recommend to replace subqueries with JOIN clauses.
  4. Replace In Subquery With Correlated Exists (modified query below): In many cases, an EXISTS subquery with a correlated condition will perform better than a non correlated IN subquery.
  5. Use UNION ALL instead of UNION (query line: 58): Always use UNION ALL unless you need to eliminate duplicate records. By using UNION ALL, you'll avoid the expensive distinct operation the database applies when using a UNION clause.
Optimal indexes for this query:
ALTER TABLE `accounts` ADD INDEX `accounts_idx_account_id_segment_1_id` (`account_type_id`,`segment_1`,`id`);
ALTER TABLE `lines` ADD INDEX `lines_idx_allocation_coun_order_id` (`allocation_count`,`order_id`);
ALTER TABLE `orders` ADD INDEX `orders_idx_id` (`id`);
ALTER TABLE `orders` ADD INDEX `orders_idx_account_id_id` (`account_type_id`,`id`);
ALTER TABLE `orders` ADD INDEX `orders_idx_line_count_id` (`line_count`,`id`);
The optimized query:
SELECT
        orders_id 
    FROM
        ((SELECT
            `orders`.`id` AS orders_id 
        FROM
            `orders` 
        WHERE
            (
                EXISTS (
                    SELECT
                        1 
                    FROM
                        lines 
                    INNER JOIN
                        accounts AS sec_accounts2 
                            ON sec_accounts2.id = lines.account_id 
                    WHERE
                        (
                            (
                                lines.allocation_count = 0
                            ) 
                            AND (
                                `orders`.id = lines.order_id
                            )
                        ) 
                        AND (
                            sec_accounts2.account_type_id = 344 
                            AND (
                                (
                                    sec_accounts2.`segment_1` = 'MS'
                                )
                            )
                        )
                )
            ) 
        ORDER BY
            `orders`.`id` ASC LIMIT 90) 
        UNION
        DISTINCT (SELECT
            `orders`.`id` AS orders_id 
        FROM
            `orders` 
        WHERE
            (`orders`.account_type_id IN (NULL)) 
        ORDER BY
            `orders`.`id` ASC LIMIT 90) 
    UNION
    DISTINCT (SELECT
        `orders`.`id` AS orders_id 
    FROM
        `orders` 
    WHERE
        (`orders`.account_type_id IS NULL) 
    ORDER BY
        `orders`.`id` ASC LIMIT 90) 
UNION
DISTINCT (SELECT
    `orders`.`id` AS orders_id 
FROM
    `orders` 
WHERE
    (`orders`.line_count = 0) 
ORDER BY
    `orders`.`id` ASC LIMIT 90)
) AS union1 
ORDER BY
union1.orders_id ASC LIMIT 90

Related Articles



* original question posted on StackOverflow here.