I recently asked a question on SO here:
SQL Joins Taking forever on table with 60 matches
I have dropped the performance of the query from infinity to about 8 minutes by altering the FROM clause and some of the filters.
The sql changed from:
SELECT
ad.acct_lv2 AS IFRS_lvl2,
ad.acct_lv3 AS IFRS_lvl3,
ad.acct_lv4 AS IFRS_lvl4,
ad.acct_lv6 AS IFRS_lvl6,
--(CASE WHEN tb.pl_date = cal.curr_me_date THEN cal.date END) as Reporting_Cycle_This,
--(CASE WHEN tb.pl_date = cal.prev_me_date THEN cal.date END) as Reporting_Cycle_Last,
--(CASE WHEN tb.pl_date = cal.prev_me_date THEN cal.date END) as Reporting_Cycle_Previous,
null as Total,
null as Balance_default_dim,
null as Balance_proxy_dim,
0 as percentage_bal_proxy_applied,
'1 Table Data' as rowType
FROM TrialBal tb
INNER JOIN Partition p ON p.partition_code = tb.partition_code
AND p.entity = tb.entity
LEFT OUTER JOIN ActDetail ah ON tb.actNum = ad.actNum
AND tb.cpny = ad.cpny
AND tb.pl_date = ad.pl_date
AND ad.source = 'Ground0'
--LEFT OUTER JOIN CalendarEngine cal on tb.pl_date = cal.curr_me_date OR tb.pl_date = cal.prev_me_date
--AND tb.entity = cal.entity
WHERE tb.pl_date = '2014-04-30'
AND tb.entity = 'My_entity'
GROUP BY ad.acct_lv2, ad.acct_lv3, ad.acct_lv4, ad.acct_lv6, --tb.actNum, tb.cpny,
CASE WHEN tb.pl_date = cal.curr_me_date THEN cal.date END,
CASE WHEN tb.pl_date = cal.prev_me_date THEN cal.date END,
CASE WHEN tb.pl_date = cal.prev_me_date THEN cal.date END
to
SELECT
ad.acct_lv2 AS IFRS_lvl2,
ad.acct_lv3 AS IFRS_lvl3,
ad.acct_lv4 AS IFRS_lvl4,
ad.acct_lv6 AS IFRS_lvl6,
(CASE WHEN tb.pl_date = cal_1.curr_me_date THEN cal_1.date END) as Reporting_Cycle_This,
(CASE WHEN tb.pl_date = cal_2.prev_me_date THEN cal_2.date END) as Reporting_Cycle_Last,
(CASE WHEN tb.pl_date = cal_2.prev_me_date THEN cal_2.date END) as Reporting_Cycle_Previous,
null as Total,
null as Balance_default_dim,
null as Balance_proxy_dim,
0 as percentage_bal_proxy_applied,
'1 Table Data' as rowType
FROM Partition p
LEFT OUTER JOIN TrialBal tb
ON tb.pl_date = '2014-04-30'
AND tb.entity = 'My_entity'
AND p.partition_code = tb.partition_code
AND p.entity = tb.entity
LEFT OUTER JOIN ActDetail ah ON tb.actNum = ad.actNum
AND tb.cpny = ad.cpny
AND tb.pl_date = ad.pl_date
AND ad.source = 'Ground0'
LEFT OUTER JOIN CalendarEngine cal_1 on tb.pl_date = cal_1.curr_me_date
AND tb.entity = cal_1.entity
LEFT OUTER JOIN CalendarEngine cal_2 on tb.pl_date = cal_2.curr_me_date
AND tb.entity = cal_2.entity
GROUP BY ad.acct_lv2, ad.acct_lv3, ad.acct_lv4, ad.acct_lv6, cal_1.date, cal_2.date
This was a change recommended by a DBA, but why the drastic improvement? TrialBal had 8 million+ rows, how does moving this table into the OUTER JOIN and away from the FROM clause make such a difference? Wouldn't the Sybase optimiser analyse the most efficient execution path? And is it a good rule going forward to have the FROM clause reference the smallest table when building the query?
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
ALTER TABLE `CalendarEngine` ADD INDEX `calendarengine_idx_curr_date_entity` (`curr_me_date`,`entity`);
ALTER TABLE `TrialBal` ADD INDEX `trialbal_idx_pl_date_entity_partition` (`pl_date`,`entity`,`partition_code`);
SELECT
ad.acct_lv2 AS IFRS_lvl2,
ad.acct_lv3 AS IFRS_lvl3,
ad.acct_lv4 AS IFRS_lvl4,
ad.acct_lv6 AS IFRS_lvl6,
(CASE
WHEN tb.pl_date = cal_1.curr_me_date THEN cal_1.date END) AS Reporting_Cycle_This,
(CASE
WHEN tb.pl_date = cal_2.prev_me_date THEN cal_2.date END) AS Reporting_Cycle_Last,
(CASE
WHEN tb.pl_date = cal_2.prev_me_date THEN cal_2.date END) AS Reporting_Cycle_Previous,
NULL AS Total,
NULL AS Balance_default_dim,
NULL AS Balance_proxy_dim,
0 AS percentage_bal_proxy_applied,
'1 Table Data' AS rowType
FROM
Partition p
LEFT OUTER JOIN
TrialBal tb
ON tb.pl_date = '2014-04-30'
AND tb.entity = 'My_entity'
AND p.partition_code = tb.partition_code
AND p.entity = tb.entity
LEFT OUTER JOIN
CalendarEngine cal_1
ON tb.pl_date = cal_1.curr_me_date
AND tb.entity = cal_1.entity
LEFT OUTER JOIN
CalendarEngine cal_2
ON tb.pl_date = cal_2.curr_me_date
AND tb.entity = cal_2.entity
GROUP BY
ad.acct_lv2,
ad.acct_lv3,
ad.acct_lv4,
ad.acct_lv6,
cal_1.date,
cal_2.date
ORDER BY
NULL