Been working at this for awhile now and cannot seem to get it optimized. Although it does work, each left joined logs* table is reading every row in the database regardless if it is part of the set it is joined to (user_id's). While it returns correct results as is, this will be a problem as the user base and db as a whole grows.
Some quick background : given an account id there can be any number of computers to it. On each of those computers there can be any number of users linked to it. These user_id's are then linked in the logs tables. Each of these relationships is indexed (account_id, computer_id, user_id) for the necessary tables.
I have put the left joins in subqueries to prevent a cartesian product (a previous issue which subqueries solved).
Query :
SELECT
users.username as username,
computers.computer_name as computer_name,
l1.cnt as cnt1,
l2.cnt as cnt2,
l3.cnt as cnt3,
l4.cnt as cnt4,
l5.cnt as cnt5,
l6.cnt as cnt6
FROM computers
INNER JOIN users
on users.computer_id = computers.computer_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs1
group by user_id
) AS l1
on l1.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs2
group by user_id
) AS l2
on l2.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs3
group by user_id
) AS l3
on l3.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs4
group by user_id
) AS l4
on l4.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs5
group by user_id
) AS l5
on l5.user_id = users.user_id
LEFT JOIN
(SELECT
user_id,
count(*) as cnt
from logs6
group by user_id
) AS l6
on l6.user_id = users.user_id
WHERE computers.account_id = :cw_account_id AND computers.status = :cw_status
GROUP BY users.user_id
Plan :
computers 1 PRIMARY ref PRIMARY,unique_filter,status unique_filter 4 const 5 Using where; Using temporary; Using filesort
users 1 PRIMARY ref PRIMARY,unique_filter unique_filter 4 stephen_spcplus_inno.computers.computer_id 1 Using index
<derived2> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 3
logs1 2 DERIVED index user_id user_id 8 33 Using index
<derived3> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 10
logs2 3 DERIVED index user_id user_id 8 101 Using index
<derived4> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 4
logs3 4 DERIVED index user_id user_id 8 41 Using index
<derived5> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 2
logs4 5 DERIVED index user_id user_id 8 28 Using index
<derived6> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 2
logs5 6 DERIVED index user_id user_id 8 28 Using index
<derived7> 1 PRIMARY ref <auto_key0> <auto_key0> 4 stephen_spcplus_inno.users.user_id 275
logs6 7 DERIVED index user_id user_id 775 27516 Using index
example results :
username computer_name cnt1 cnt2 cnt3 cnt4 cnt5 cnt6
testuser COMPUTER_1 1 2 1 (null) (null) 3
testuser2 COMPUTER_1 (null) (null) (null) (null) (null) (null)
someuser COMPUTER_2 32 83 26 15 28 1157
As an example, for logs6 the plan is reading every row in the database (27516) yet there were only 1160 which 'should' have been joined.
I have tried lots of different things, but cannot get this to operate in an optimized manner. As it is currently the reason all the rows from each table are being read is due to the use of COUNT(*) within each joins subquery... removing this and only the needed rows are joined like I want, however, I do not know how to get the counts then in the same grouped result.
Help from any gurus would be great! Yes, I know I do not have a lot of rows in the db, but I can see the results are correct and see that the full table scans are going to be a problem as well.
EDIT (partial solution) :
I have found a partial solution to this problem, but it requires an additional query to get a list of user_ids. By adding WHERE user_id IN (17,22,23)
where these are the user_ids which should be joined... to each log table I get the correct results and the entire table is not scanned.
If anyone knows of a way to make this work without this additional query and where additional please let me know.
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
ALTER TABLE `computers` ADD INDEX `computers_idx_account_id_status` (`account_id`,`status`);
ALTER TABLE `es_temp1` ADD INDEX `es_temp1_idx_user_id` (`user_id`);
ALTER TABLE `es_temp2` ADD INDEX `es_temp2_idx_user_id` (`user_id`);
ALTER TABLE `es_temp3` ADD INDEX `es_temp3_idx_user_id` (`user_id`);
ALTER TABLE `es_temp4` ADD INDEX `es_temp4_idx_user_id` (`user_id`);
ALTER TABLE `es_temp5` ADD INDEX `es_temp5_idx_user_id` (`user_id`);
ALTER TABLE `es_temp6` ADD INDEX `es_temp6_idx_user_id` (`user_id`);
ALTER TABLE `users` ADD INDEX `users_idx_computer_id_user_id` (`computer_id`,`user_id`);
SELECT
users.username AS username,
computers.computer_name AS computer_name,
l1.cnt AS cnt1,
l2.cnt AS cnt2,
l3.cnt AS cnt3,
l4.cnt AS cnt4,
l5.cnt AS cnt5,
l6.cnt AS cnt6
FROM
computers
INNER JOIN
users
ON users.computer_id = computers.computer_id
LEFT JOIN
es_temp1 AS l1
ON l1.user_id = users.user_id
LEFT JOIN
es_temp2 AS l2
ON l2.user_id = users.user_id
LEFT JOIN
es_temp3 AS l3
ON l3.user_id = users.user_id
LEFT JOIN
es_temp4 AS l4
ON l4.user_id = users.user_id
LEFT JOIN
es_temp5 AS l5
ON l5.user_id = users.user_id
LEFT JOIN
es_temp6 AS l6
ON l6.user_id = users.user_id
WHERE
computers.account_id = :cw_account_id
AND computers.status = :cw_status
GROUP BY
users.user_id
ORDER BY
NULL