[Solved] MySQL Joins and Full Text search

EverSQL Database Performance Knowledge Base

MySQL Joins and Full Text search

Database type:

Any full-text, MySQL experts out there?

I wrote a question yesterday asking for any suggestions on how to use full-text search with a lot of joins, a bit lazily, I hadn't actually made an attempt.

Since then, I've a had a good attempt at mastering it myself, building a few successful test queries, against my full-text fields. If I do NOT use any joins, and query each field individually, the full-text functions work perfectly and relevance/boolean works great with great performance too... but... as soon as I add my joins to run the full query, it runs forever and ever.

Can anybody spot anything in my query that might cause this to happen, because as an amateur, I really can't see it!

SELECT

photos.photoID,
photos.headline,
photos.dateCreated,
MATCH (people.people) AGAINST ('+sarah +harding' IN BOOLEAN MODE) AS Rel1

FROM photos

LEFT JOIN ( photoPeople INNER JOIN people ON photoPeople.peopleID = people.PeopleID ) 
ON photos.photoID = photoPeople.photoID AND MATCH (people.people) AGAINST ('+sarah +harding' IN BOOLEAN MODE)

WHERE

photos.photoStatus = 'Live'

GROUP BY

photos.photoID

ORDER BY Rel1

This is one of my successful, individual, queries:

SELECT

photoID,
headline,
dateCreated,
MATCH (caption) AGAINST ('+sarah +harding' IN BOOLEAN MODE) AS Relevance

FROM photos

WHERE

photoStatus = 'Live'

AND

MATCH (caption) AGAINST ('+sarah +harding' IN BOOLEAN MODE)

ORDER BY Relevance

This is the DB schema:

photos (tbl)
photoID             INT(11)         Primary     Auto-Increment
headline            Long-Text
caption             Long-Text / FULLTEXT
dateCreated         DateTime

people (tbl)
peopleID            INT(11)         Primary     Auto-Increment
people              VarChar(255) / FULLTEXT

photoPeople (tbl)
photoID             INT(11)
peopleID            INT(11)

keywords (tbl)
keywordID           INT(11)         Primary     Auto-Increment
keyword             VarChar(255) / FULLTEXT

photoKeyword (tbl)
photoID             INT(11)
keywordID           INT(11)

photoContributor (tbl)
photoID             INT(11)
contributorRef      VarChar(100) / FULLTEXT

And this is my EXPLAIN print out:

id  select_type         table               type        possible_keys       key     key_len     ref     rows
1   SIMPLE              photos              ALL         NULL                NULL        NULL        NULL        89830
1   SIMPLE              photoContributor    ALL         NULL                NULL        NULL        NULL        149635  
1   SIMPLE              photoPeople         ALL         NULL                NULL        NULL        NULL        110606
1   SIMPLE              people              eq_ref      PRIMARY             PRIMARY     4       1   
1   SIMPLE              photoKeyword        ALL         NULL                NULL        NULL        NULL        699102  
1   SIMPLE              keywords            eq_ref      PRIMARY             PRIMARY     4       1

My website visitor should be able to search for: "Brad Pitt Angelina Jolie Dorchester Hotel Sunglasses @MG" - This should find "Brad Pitt" and "Angelina Jolie" from the 'people.people' table and possibly the 'photos.caption' table too. It should also find "Dorchester Hotel" from the 'photos.caption' table, "Sunglasses" from the 'keywords.keyword' table and lastly, it should find "@MG" in 'photoContributor.contributorRef' table.

Any help on this would be gratefully received...

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
Optimal indexes for this query:
ALTER TABLE `photoPeople` ADD INDEX `photopeople_idx_photoid` (`photoID`);
ALTER TABLE `photos` ADD INDEX `photos_idx_photostatus_photoid` (`photoStatus`,`photoID`);
The optimized query:
SELECT
        photos.photoID,
        photos.headline,
        photos.dateCreated,
        MATCH (people.people) AGAINST ('+sarah +harding' IN BOOLEAN MODE) AS Rel1 
    FROM
        photos 
    LEFT JOIN
        (
            photoPeople 
        INNER JOIN
            people 
                ON photoPeople.peopleID = people.PeopleID
            ) 
                ON photos.photoID = photoPeople.photoID 
                AND MATCH (people.people) AGAINST ('+sarah +harding' IN BOOLEAN MODE) 
        WHERE
            photos.photoStatus = 'Live' 
        GROUP BY
            photos.photoID 
        ORDER BY
            Rel1

Related Articles



* original question posted on StackOverflow here.