[Solved] MySQL: Category Tree and Products in n levels

EverSQL Database Performance Knowledge Base

Let's assume I have a MySQL database with the following tables:

Product

Category

What's the best way to query the database to get all products below a certain category id? For instance, if I have a tree of sub-categories whose base category has id = 1, how can I get all the products under the sub-categories of id = 1 when the number of sub-category levels is not known in advance?

I could do this:

SELECT * FROM `Product` WHERE category_id IN (
   SELECT `id` FROM `Category` WHERE parent_id = 1
)

However, it only works for the direct children of category id = 1, not for the children at the 2nd to nth levels.

Thank you.
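
One possible way to reach every level is a recursive common table expression that follows the `parent_id` links down from the starting category. The sketch below runs the SQL through Python's built-in sqlite3 module only so that it works without a server. The columns other than `id`, `parent_id` and `category_id`, the sample rows, and the helper name `products_below` are assumptions made for illustration. MySQL 8.0 and later accept the same `WITH RECURSIVE` syntax; earlier MySQL versions do not support it.

```python
import sqlite3

# Hypothetical schema and sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
CREATE TABLE Product  (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);

INSERT INTO Category VALUES
    (1, NULL, 'root'),
    (2, 1,    'child'),
    (3, 2,    'grandchild'),
    (4, 3,    'great-grandchild'),
    (5, NULL, 'unrelated root');

INSERT INTO Product VALUES
    (10, 1, 'on root'),
    (11, 2, 'level 1'),
    (12, 3, 'level 2'),
    (13, 4, 'level 3'),
    (14, 5, 'elsewhere');
""")

DESCENDANT_PRODUCTS = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM Category WHERE parent_id = ?      -- direct children
    UNION ALL
    SELECT c.id FROM Category AS c
    JOIN subtree AS s ON c.parent_id = s.id          -- one level further down
)
SELECT p.id, p.name
FROM Product AS p
JOIN subtree AS s ON p.category_id = s.id
ORDER BY p.id
"""

def products_below(conn, category_id):
    """Return (id, name) for products in any descendant of category_id."""
    return conn.execute(DESCENDANT_PRODUCTS, (category_id,)).fetchall()

rows = products_below(conn, 1)
print(rows)  # [(11, 'level 1'), (12, 'level 2'), (13, 'level 3')]
```

Like the `IN` query above, this returns only products in the sub-categories, not products attached to category 1 itself. To include those as well, the first `SELECT` of the CTE could start from `WHERE id = ?` instead.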


Edit

Some people suggested reading a blog article about this. I had already looked at that article in the past, and I made this sqlfiddle:

http://sqlfiddle.com/#!2/be72ec/1

As you can see from the query, even the simplest method they teach for getting a tree of categories doesn't output anything. What am I missing? The other methods have the same issue.

Thank you.

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard unless you intend to use them all. Selecting redundant columns may degrade performance unnecessarily.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Replace IN Subquery With Correlated EXISTS (modified query below): In many cases, an EXISTS subquery with a correlated condition will perform better than a non-correlated IN subquery.
Optimal indexes for this query:
ALTER TABLE `Category` ADD INDEX `category_idx_parent_id_id` (`parent_id`,`id`);
The optimized query:
SELECT *
FROM `Product`
WHERE EXISTS (
    SELECT 1
    FROM `Category`
    WHERE `Category`.`parent_id` = 1
      AND `Product`.`category_id` = `Category`.`id`
)
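
As a quick sanity check on the rewrite, the sketch below runs the original `IN` query and the correlated `EXISTS` query against the same sample data and compares the results. It uses Python's built-in sqlite3 module so it can run without a server; the sample rows are assumptions for illustration, and the index is created with `CREATE INDEX` because SQLite does not accept MySQL's `ALTER TABLE ... ADD INDEX` form. It says nothing about relative speed in MySQL, only that the two forms select the same rows.

```python
import sqlite3

# Hypothetical sample data: categories 2 and 3 are children of 1,
# category 4 is a grandchild (child of 2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (id INTEGER PRIMARY KEY, parent_id INTEGER);
CREATE TABLE Product  (id INTEGER PRIMARY KEY, category_id INTEGER);
-- SQLite spelling of the recommended (parent_id, id) index.
CREATE INDEX category_idx_parent_id_id ON Category (parent_id, id);

INSERT INTO Category VALUES (1, NULL), (2, 1), (3, 1), (4, 2);
INSERT INTO Product  VALUES (10, 2), (11, 3), (12, 4), (13, 1);
""")

# Original form: non-correlated IN subquery.
in_rows = conn.execute("""
    SELECT id FROM Product
    WHERE category_id IN (SELECT id FROM Category WHERE parent_id = 1)
    ORDER BY id
""").fetchall()

# Rewritten form: correlated EXISTS subquery.
exists_rows = conn.execute("""
    SELECT id FROM Product
    WHERE EXISTS (
        SELECT 1 FROM Category
        WHERE Category.parent_id = 1
          AND Product.category_id = Category.id
    )
    ORDER BY id
""").fetchall()

print(in_rows == exists_rows)  # True
print(exists_rows)             # [(10,), (11,)]
```

Product 12 sits in the grandchild category 4 and is missing from both results, so the rewritten query still covers only the direct children of category 1, just like the original.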

* original question posted on StackOverflow here.