[Solved] Max vs Count Huge Performance difference on a query

EverSQL Database Performance Knowledge Base

Max vs Count Huge Performance difference on a query

I have to similar queries which the only difference is that one is doing a sum of a column and the other is doing a count(distinct) of another column. The first one runs in seconds (17s) and the other one never stops (1 hour and counting). I've seen the plan for the count query and it has huge costs. I don't understand why. They are hitting the exact same views. Why is this happening and what can I do?

The one that is running fine:

select  a11.SOURCEPP  SOURCEPP,
    a12.DUMMY  DUMMY,
    a11.SIM_NAME  SIM_NAME,
    a13.THEORETICAL  THEORETICAL,
    sum(a11.REVENUE)  WJXBFS1
from    CLIENT_SOURCE_DATA  a11
    join    DUMMY_V a12
      on    (a11.SOURCEPP = a12.SOURCEPP)
    join    SIM_INFO    a13
      on    (a11.SIM_NAME = a13.SIM_NAME)
where   (a13.THEORETICAL in (0)
 and a11.SIM_NAME in ('ETS40'))
group by    a11.SOURCEPP,
    a12.DUMMY,
    a11.SIM_NAME,
    a13.THEORETICAL

the one that doesn't run:

select  a12.SOURCEPP  SOURCEPP,
    a12.SIM_NAME  SIM_NAME,
    a13.THEORETICAL  THEORETICAL,
    count(distinct a12.CLIENTID)  WJXBFS1
from    CLIENT_SOURCE_DATA  a12
    join    SIM_INFO    a13
      on    (a12.SIM_NAME = a13.SIM_NAME)
where   (a13.THEORETICAL in (0)
 and a12.SIM_NAME in ('ETS40'))
group by    a12.SOURCEPP,
    a12.SIM_NAME,
    a13.THEORETICAL

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  2. Explicitly ORDER BY After GROUP BY (modified query below): By default, the database sorts all 'GROUP BY col1, col2, ...' queries as if you specified 'ORDER BY col1, col2, ...' in the query as well. If a query includes a GROUP BY clause but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying 'ORDER BY NULL'.
Optimal indexes for this query:
ALTER TABLE `CLIENT_SOURCE_DATA` ADD INDEX `client_data_idx_sim_name` (`SIM_NAME`);
ALTER TABLE `DUMMY_V` ADD INDEX `dummy_v_idx_sourcepp` (`SOURCEPP`);
ALTER TABLE `SIM_INFO` ADD INDEX `sim_info_idx_sim_name_theoretical` (`SIM_NAME`,`THEORETICAL`);
The optimized query:
SELECT
        a11.SOURCEPP SOURCEPP,
        a12.DUMMY DUMMY,
        a11.SIM_NAME SIM_NAME,
        a13.THEORETICAL THEORETICAL,
        sum(a11.REVENUE) WJXBFS1 
    FROM
        CLIENT_SOURCE_DATA a11 
    JOIN
        DUMMY_V a12 
            ON (
                a11.SOURCEPP = a12.SOURCEPP
            ) 
    JOIN
        SIM_INFO a13 
            ON (
                a11.SIM_NAME = a13.SIM_NAME
            ) 
    WHERE
        (
            a13.THEORETICAL IN (
                0
            ) 
            AND a11.SIM_NAME IN (
                'ETS40'
            )
        ) 
    GROUP BY
        a11.SOURCEPP,
        a12.DUMMY,
        a11.SIM_NAME,
        a13.THEORETICAL 
    ORDER BY
        NULL

Related Articles



* original question posted on StackOverflow here.