This was an interview question and I have been upfront and said I'm not sure how to optimize the query - I work in the .Net stack and have been enjoying the Entity Framework for quite some time. So my skills in this area have suffered a little. This is also MySql.
So, it's driving me mad - trying to figure how I would optimize this statement. Especially given no DB to play with.
The statement looks much like this:
SELECT a.eid, a.cid, b.e_date FROM table_A a JOIN table_B b ON a.eid = b.eid WHERE a.cid IN (SELECT cid FROM table_A_trans WHERE `someDateColumn` <= '2014-06-01 00:00:00') AND a.eid NOT IN(123,124,200)
I'm thinking it's perhaps to do with the fact that a subquery is ran, which could bring back hundreds of results, only to then strip them out in the final
NOT IN() clause.
So - how would you do this?
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
ALTER TABLE `table_A` ADD INDEX `table_a_idx_eid` (`eid`); ALTER TABLE `table_A_trans` ADD INDEX `table_trans_idx_cid_somedatecolumn` (`cid`,`someDateColumn`); ALTER TABLE `table_B` ADD INDEX `table_b_idx_eid` (`eid`);
SELECT a.eid, a.cid, b.e_date FROM table_A a JOIN table_B b ON a.eid = b.eid WHERE EXISTS ( SELECT 1 FROM table_A_trans WHERE ( table_A_trans.`someDateColumn` <= '2014-06-01 00:00:00' ) AND ( a.cid = table_A_trans.cid ) ) AND a.eid NOT IN ( 123, 124, 200 )