[Solved] Query optimization: Slow query is faster when split up
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

Query optimization: Slow query is faster when split up

I've experienced a T-SQL query taking a significant amount of time. If I split it up into 2 queries, it's faster, but I don't know why.

The query is meant to fetch data from tables C, G, Y and insert it into X - but only if it doesn't already exist in X.

Here's the slow query. It slows down significant when the X-table grows:

INSERT INTO X (COLUMN_A, COLUMN_B, COLUMN_C) 
    (SELECT a1.COLUMN_A, a1.COLUMN_B, 'QWE' as COLUMN_C 
     FROM 
         (SELECT 
              CAST(G.ID AS VARCHAR(900)) AS COLUMN_A, 
              MAX(Y.B) AS COLUMN_B
          FROM 
              C
          JOIN 
              G ON (G.ID = C.G_UID)
          JOIN 
              Y ON Y.B COLLATE Danish_Norwegian_BIN = C.UID COLLATE Danish_Norwegian_BIN
          GROUP BY 
              CAST(G.ID AS VARCHAR(900))) AS a1 
     WHERE 
         a1.COLUMN_A NOT IN (SELECT COLUMN_A FROM X))

BUT! It's faster if I split the query up (in code) into 2 sets:

Set 1:

SELECT  
    a1.COLUMN_A, a1.COLUMN_B, 'QWE' as COLUMN_C 
FROM 
    (SELECT 
         CAST(G.ID AS VARCHAR(900)) AS COLUMN_A, MAX(Y.B) AS COLUMN_B
     FROM 
         C
     JOIN 
         G ON (G.ID = C.G_UID)
     JOIN 
         Y ON Y.B COLLATE Danish_Norwegian_BIN = C.UID COLLATE Danish_Norwegian_BIN
     GROUP BY 
         CAST(G.ID AS VARCHAR(900))) AS a1

Set 2:

SELECT COLUMN_A FROM X

And then manually subtract the sets from each other, and inserting the result into the X-table.

Why is that? Can my slow query be written better?

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Subqueries (query line: 6): We advise against using subqueries as they are not optimized well by the optimizer. Therefore, it's recommended to join a newly created temporary table that holds the data, which also includes the relevant search index.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Explicitly ORDER BY After GROUP BY (modified query below): By default, the database sorts all 'GROUP BY col1, col2, ...' queries as if you specified 'ORDER BY col1, col2, ...' in the query as well. If a query includes a GROUP BY clause but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying 'ORDER BY NULL'.
Optimal indexes for this query:
ALTER TABLE `G` ADD INDEX `g_idx_id` (`ID`);
The optimized query:
SELECT
        a1.COLUMN_A,
        a1.COLUMN_B,
        'QWE' AS COLUMN_C 
    FROM
        (SELECT
            CAST(G.ID AS VARCHAR (900)) AS COLUMN_A,
            MAX(Y.B) AS COLUMN_B 
        FROM
            C 
        JOIN
            G 
                ON (
                    G.ID = C.G_UID
                ) 
        JOIN
            Y 
                ON Y.B COLLATE Danish_Norwegian_BIN = C.UID COLLATE Danish_Norwegian_BIN 
        GROUP BY
            CAST(G.ID AS VARCHAR (900)) 
        ORDER BY
            NULL) AS a1

Related Articles



* original question posted on StackOverflow here.