[Solved] How to optimize this MySQL query? (CROSS JOIN, subquery)

EverSQL Database Performance Knowledge Base

I have a challenging question for MySQL experts.

I have a user permissions system with 4 tables:

  1. users (id | email | created_at)
  2. permissions (id | responsibility_id | key | weight)
  3. permission_user (id | permission_id | user_id)
  4. responsibilities (id | key | weight)

Users can have any number of permissions assigned, and any permission can be granted to any number of users (many-to-many). Responsibilities act as groups for permissions: each permission belongs to exactly one responsibility. For example, one permission is called update and belongs to the customers responsibility; another is delete, which belongs to the orders responsibility.
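
For readers who want to reproduce the setup, the four tables could be declared roughly as follows. This is only a sketch: the column types, primary keys, foreign keys, and the unique key on (user_id, permission_id) are assumptions, since the question lists column names only.

```sql
-- Hypothetical DDL: only the table and column names come from the question.
CREATE TABLE `users` (
    `id`         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `email`      VARCHAR(255) NOT NULL,
    `created_at` DATETIME     NOT NULL,
    PRIMARY KEY (`id`)
);

CREATE TABLE `responsibilities` (
    `id`     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `key`    VARCHAR(64)  NOT NULL,
    `weight` INT          NOT NULL,
    PRIMARY KEY (`id`)
);

CREATE TABLE `permissions` (
    `id`                INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `responsibility_id` INT UNSIGNED NOT NULL,  -- each permission belongs to exactly one responsibility
    `key`               VARCHAR(64)  NOT NULL,
    `weight`            INT          NOT NULL,
    PRIMARY KEY (`id`),
    FOREIGN KEY (`responsibility_id`) REFERENCES `responsibilities` (`id`)
);

-- Junction table for the many-to-many relation between users and permissions.
CREATE TABLE `permission_user` (
    `id`            INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `permission_id` INT UNSIGNED NOT NULL,
    `user_id`       INT UNSIGNED NOT NULL,
    PRIMARY KEY (`id`),
    UNIQUE KEY `permission_user_unique` (`user_id`, `permission_id`),  -- assumed: one grant per user and permission
    FOREIGN KEY (`permission_id`) REFERENCES `permissions` (`id`),
    FOREIGN KEY (`user_id`) REFERENCES `users` (`id`)
);
```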

I need to get a full map of permissions per user, but only for those who have at least one permission granted. Results should be ordered by:

  1. User's number of permissions from most to least
  2. User's created_at column, oldest first
  3. Responsibility's weight
  4. Permission's weight

Example result set:

user_id | responsibility | permission | granted
-----------------------------------------------
      5 | customers      | create     |       1
      5 | customers      | update     |       1
      5 | orders         | create     |       1
      5 | orders         | update     |       1
      2 | customers      | create     |       0
      2 | customers      | update     |       0
      2 | orders         | create     |       1
      2 | orders         | update     |       0

Let's say I have 10 users in the database, but only two of them have any permissions granted. There are 4 permissions in total:

  1. create of customers responsibility
  2. update of customers responsibility
  3. create of orders responsibility
  4. update of orders responsibility

That's why there are 8 records in the result (2 users with at least one permission × 4 permissions). The user with id = 5 is displayed first because they have more permissions granted. In case of a tie, the user with the older created_at date goes first. Within each user, permissions are always sorted by the weight of their responsibility and then by their own weight.
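
To make the scenario concrete, the data behind the example could be loaded with statements like the ones below. All ids, weights, e-mail addresses and dates are made up for illustration, and only three of the ten users are included.

```sql
-- Hypothetical sample data; only the table and column names come from the question.
INSERT INTO `responsibilities` (`id`, `key`, `weight`) VALUES
    (1, 'customers', 1),
    (2, 'orders',    2);

INSERT INTO `permissions` (`id`, `responsibility_id`, `key`, `weight`) VALUES
    (1, 1, 'create', 1),
    (2, 1, 'update', 2),
    (3, 2, 'create', 1),
    (4, 2, 'update', 2);

-- User 1 has no permissions and must not appear in the result.
INSERT INTO `users` (`id`, `email`, `created_at`) VALUES
    (1, 'user1@example.com', '2020-01-01 00:00:00'),
    (2, 'user2@example.com', '2020-01-02 00:00:00'),
    (5, 'user5@example.com', '2020-01-05 00:00:00');

-- User 5 is granted all four permissions, user 2 only "create" on orders.
INSERT INTO `permission_user` (`id`, `permission_id`, `user_id`) VALUES
    (1, 1, 5),
    (2, 2, 5),
    (3, 3, 5),
    (4, 4, 5),
    (5, 3, 2);
```

With this data the expected result has 8 rows: four for user 5 (all granted) followed by four for user 2 (only orders / create granted).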

My question is: how do I write an optimal query for this case? I have already written one myself, and it works well:

SELECT `users`.`id` AS `user_id`,
       `responsibilities`.`key` AS `responsibility`,
       `permissions`.`key` AS `permission`,
       !ISNULL(`permission_user`.`id`) AS `granted`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
  ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
       ON `permission_user`.`user_id` = `users`.`id`
      AND `permission_user`.`permission_id` = `permissions`.`id`
WHERE (
    SELECT COUNT(*)
    FROM `permission_user`
    WHERE `user_id` = `users`.`id`
) > 0
ORDER BY (
             SELECT COUNT(*)
             FROM `permission_user`
             WHERE `user_id` = `users`.`id`
         ) DESC,
         `users`.`created_at` ASC,
         `responsibilities`.`weight` ASC,
         `permissions`.`weight` ASC

The problem is that I'm using the same subquery twice.

Can I do better? I count on you, MySQL experts!

--- EDIT ---

Thanks to Gordon Linoff's comment, I changed it to use a HAVING clause:

SELECT `users`.`email`,
       `responsibilities`.`key`,
       `permissions`.`key`,
       !ISNULL(`permission_user`.`id`) as `granted`,
       (
           SELECT COUNT(*)
           FROM `permission_user`
           WHERE `user_id` = `users`.`id`
       ) AS `total_permissions`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
  ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
       ON `permission_user`.`user_id` = `users`.`id`
      AND `permission_user`.`permission_id` = `permissions`.`id`
HAVING `total_permissions` > 0
ORDER BY `total_permissions` DESC,
         `users`.`created_at` ASC,
         `responsibilities`.`weight` ASC,
         `permissions`.`weight` ASC

I was surprised to discover that HAVING can be used on its own, without GROUP BY.
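
This works because MySQL, as an extension to standard SQL, lets HAVING refer to aliases from the select list; with no GROUP BY and no aggregate function in the outer query, it acts as a filter on the computed rows. WHERE cannot see the alias because it is evaluated before the select list. A reduced sketch of the difference, using only the users and permission_user tables from the question:

```sql
-- Fails with "Unknown column 'total_permissions' in 'where clause'",
-- because select-list aliases are not visible in WHERE:
-- SELECT `users`.`id`, (...) AS `total_permissions`
-- FROM `users`
-- WHERE `total_permissions` > 0;

-- Works in MySQL: HAVING may reference the alias.
SELECT `users`.`id`,
       (
           SELECT COUNT(*)
           FROM `permission_user`
           WHERE `user_id` = `users`.`id`
       ) AS `total_permissions`
FROM `users`
HAVING `total_permissions` > 0;
```

Because HAVING is applied late, after the joined rows have been produced, the full users × permissions product is likely still built before users without permissions are discarded. This form removes the duplicated subquery from the text of the query but is not necessarily faster.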

Can it now be improved for better performance?

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.

The optimization process and recommendations:

  1. Avoid Correlated Subqueries (query line: 19): A correlated subquery is a subquery that references a column (here: id) of a table in the outer query, so it may be evaluated once per outer row. Correlated subqueries can usually be rewritten as a join, and the optimizer can often find a more efficient execution plan for the join form. In this query the same COUNT(*) subquery appears twice, in WHERE and in ORDER BY.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before measuring the execution time of the optimized query.
  3. Mixed Order By Directions Prevents Index Use (query line: 33): Before MySQL 8.0, the database cannot use an index for sorting when the ORDER BY mixes ASC (the default if not specified) and DESC; MySQL 8.0 and later support descending indexes for this case. To avoid a filesort on older versions, consider using the same direction for all columns, or create and index a reversed "sort" column (max_sort - sort) so that one direction can be switched. Note that in this query the DESC key is a computed count and the remaining keys come from three different tables, so no single index can satisfy the ORDER BY regardless of direction.
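
Optimal indexes for this query:

ALTER TABLE `permission_user` ADD INDEX `permission_user_idx_user_id` (`user_id`);
ALTER TABLE `permissions` ADD INDEX `permissions_idx_id` (`id`);
ALTER TABLE `responsibilities` ADD INDEX `responsibilities_idx_id` (`id`);

Note: if id is already the primary key of permissions and responsibilities (the question does not say), the last two indexes are redundant and can be skipped.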
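
Since the LEFT JOIN looks up permission_user by both user_id and permission_id, a composite index may serve this query better than a single-column index on user_id. The statement below is a suggestion rather than part of the generated recommendations; the index name is hypothetical, and it assumes no equivalent key already exists on the table.

```sql
-- Covers the LEFT JOIN lookup on (user_id, permission_id) and, through its
-- leftmost prefix, the per-user COUNT(*) that filters on user_id.
ALTER TABLE `permission_user`
    ADD INDEX `permission_user_idx_user_id_permission_id` (`user_id`, `permission_id`);
```

With such an index in place, a separate single-column index on user_id adds nothing, because the composite index can be used for lookups on its leading column.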

The optimized query:

SELECT
        `users`.`id` AS `user_id`,
        `responsibilities`.`key` AS `responsibility`,
        `permissions`.`key` AS `permission`,
        ! ISNULL(`permission_user`.`id`) AS `granted` 
    FROM
        `users`
    CROSS JOIN
        `permissions` 
    JOIN
        `responsibilities` 
            ON `responsibilities`.`id` = `permissions`.`responsibility_id` 
    LEFT JOIN
        `permission_user` 
            ON `permission_user`.`user_id` = `users`.`id` 
            AND `permission_user`.`permission_id` = `permissions`.`id` 
    WHERE
        (
            SELECT
                COUNT(*) 
            FROM
                `permission_user` 
            WHERE
                `user_id` = `users`.`id`
        ) > 0 
    ORDER BY
        (SELECT
            COUNT(*) 
        FROM
            `permission_user` 
        WHERE
            `user_id` = `users`.`id`) DESC,
        `users`.`created_at` ASC,
        `responsibilities`.`weight` ASC,
        `permissions`.`weight` ASC
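
The query above still evaluates the correlated COUNT(*) subquery in both WHERE and ORDER BY, exactly as in the question. A join-based rewrite along the lines of recommendation 1 might look like the sketch below. The derived-table alias user_totals is an illustrative name, not part of the original. The inner join against it both removes users without any permission and supplies the count used for sorting, so the count is computed once per user.

```sql
SELECT `users`.`id` AS `user_id`,
       `responsibilities`.`key` AS `responsibility`,
       `permissions`.`key` AS `permission`,
       !ISNULL(`permission_user`.`id`) AS `granted`
FROM (
    -- One row per user that has at least one permission granted.
    SELECT `user_id`, COUNT(*) AS `total_permissions`
    FROM `permission_user`
    GROUP BY `user_id`
) AS `user_totals`
JOIN `users`
  ON `users`.`id` = `user_totals`.`user_id`
CROSS JOIN `permissions`
JOIN `responsibilities`
  ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
       ON `permission_user`.`user_id` = `users`.`id`
      AND `permission_user`.`permission_id` = `permissions`.`id`
ORDER BY `user_totals`.`total_permissions` DESC,
         `users`.`created_at` ASC,
         `responsibilities`.`weight` ASC,
         `permissions`.`weight` ASC;
```

Whether this is actually faster depends on table sizes and the available indexes, so it is worth comparing both versions with EXPLAIN on real data.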

* Original question posted on Stack Overflow.