[Solved] Select top n percent by year

How to optimize this SQL query?

In case you have your own slow SQL query, you can optimize it automatically here.

For the query above, the following recommendations will be helpful as part of the SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Calling Functions With Indexed Columns (query line: 55): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  2. Avoid Calling Functions With Indexed Columns (query line: 56): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  3. Avoid Calling Functions With Indexed Columns (query line: 57): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  4. Avoid Calling Functions With Indexed Columns (query line: 101): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  5. Avoid Calling Functions With Indexed Columns (query line: 102): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  6. Avoid Calling Functions With Indexed Columns (query line: 103): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `day` is indexed, the index won’t be used as it’s wrapped with the function `dateadd`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  7. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  8. Push Filtering Conditions Into Subqueries (modified query below): Parts of the WHERE clause can pushed from the outer query to a subquery / union clause. Applying those conditions as early as possible will allow the database to scan less data and run the query more efficiently.
Optimal indexes for this query:
CREATE INDEX crd1_idx_adrestype_cardcode_address ON crd1 (adrestype,cardcode,address);
CREATE INDEX inv1_idx_docentry ON inv1 (docentry);
CREATE INDEX ocrd_idx_cardcode ON ocrd (cardcode);
CREATE INDEX ocrd_idx_cardtype ON ocrd (cardtype);
CREATE INDEX oinv_idx_cardcode ON oinv (cardcode);
CREATE INDEX orin_idx_cardcode ON orin (cardcode);
CREATE INDEX oslp_idx_slpcode ON oslp (slpcode);
CREATE INDEX rin1_idx_docentry ON rin1 (docentry);
The optimized query:
SELECT
        x.cardcode,
        x.customer,
        CASE 
            WHEN x.rep IS NULL THEN (SELECT
                slpname 
            FROM
                ocrd 
            INNER JOIN
                oslp 
                    ON ocrd.slpcode = oslp.slpcode 
            WHERE
                ocrd.cardcode = x.cardcode) 
            ELSE x.rep END AS rep,
x.city,
x.state,
x.country,
CASE 
    WHEN isnumeric(x.total) = 0 THEN 0 
    ELSE x.total END AS [net total],
x.docdate 
FROM
(SELECT
    t0.cardcode AS cardcode,
    t0.[cardname] AS customer,
    t1.city AS city,
    t1.state AS state,
    t1.country AS country,
    t4.slpname AS rep,
    sum(t3.linetotal) - t2.discsum AS total,
    t2.docdate AS [docdate] 
FROM
    ocrd t0 
INNER JOIN
    crd1 t1 
        ON (
            t0.cardcode = t1.cardcode 
            AND t0.shiptodef = t1.address
        ) 
LEFT OUTER JOIN
    oinv t2 
        ON t0.cardcode = t2.cardcode 
LEFT OUTER JOIN
    inv1 t3 
        ON t2.docentry = t3.docentry 
LEFT OUTER JOIN
    oslp t4 
        ON t2.slpcode = t4.slpcode 
WHERE
    (
        t0.[cardtype] = 'C' 
        AND t1.adrestype = 'S'
    ) 
    AND (
        t2.docdate BETWEEN '2008/01/01' AND dateadd(day, -1, '2008/' + CAST(month(getdate()) AS varchar (2)) + '/01') 
        OR t2.docdate BETWEEN '2009/01/01' AND dateadd(day, -1, '2009/' + CAST(month(getdate()) AS varchar (2)) + '/01') 
        OR t2.docdate BETWEEN '2010/01/01' AND dateadd(day, -1, '2010/' + CAST(month(getdate()) AS varchar (2)) + '/01')
    ) 
GROUP BY
    t0.cardcode,
    t0.cardname,
    t1.city,
    t1.state,
    t1.country,
    t4.slpname,
    t2.discsum,
    t2.docdate 
UNION
ALL SELECT
    t0.cardcode AS cardcode,
    t0.cardname AS customer,
    t1.city AS city,
    t1.state AS state,
    t1.country AS country,
    t4.slpname AS rep,
    -1 * (sum(t3.linetotal) - t2.discsum) AS total,
    t2.docdate 
FROM
    ocrd t0 
INNER JOIN
    crd1 t1 
        ON (
            t0.cardcode = t1.cardcode 
            AND t0.shiptodef = t1.address
        ) 
LEFT OUTER JOIN
    orin t2 
        ON t0.cardcode = t2.cardcode 
LEFT OUTER JOIN
    rin1 t3 
        ON t2.docentry = t3.docentry 
LEFT OUTER JOIN
    oslp t4 
        ON t2.slpcode = t4.slpcode 
WHERE
    (
        t0.[cardtype] = 'C' 
        AND t1.adrestype = 'S'
    ) 
    AND (
        t2.docdate BETWEEN '2008/01/01' AND dateadd(day, -1, '2008/' + CAST(month(getdate()) AS varchar (2)) + '/01') 
        OR t2.docdate BETWEEN '2009/01/01' AND dateadd(day, -1, '2009/' + CAST(month(getdate()) AS varchar (2)) + '/01') 
        OR t2.docdate BETWEEN '2010/01/01' AND dateadd(day, -1, '2010/' + CAST(month(getdate()) AS varchar (2)) + '/01')
    ) 
GROUP BY
    t0.cardcode,
    t0.cardname,
    t1.city,
    t1.state,
    t1.country,
    t4.slpname,
    t2.discsum,
    t2.docdate
) x 
WHERE
(
1 = 1
) 
GROUP BY
x.cardcode,
x.customer,
x.rep,
x.city,
x.state,
x.country,
x.total,
x.docdate

Related Articles



* original question posted on StackOverflow here.