[Solved] Fast FullText search for one occurrence of exact phrase in MySQL sorted by id
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

Fast FullText search for one occurrence of exact phrase in MySQL sorted by id

Database type:

I'm using MySQL/InnoDb. I have this table News which has about 3,000,000 records and I hope it gets 7,000,000 more. Currently I use the following query to fetch records from News table with exact phrase inside their title or content:

SELECT * FROM News WHERE MATCH (title, content) AGAINST ('"My Phrase"' in Boolean Mode)

But the results aren't satisfying. For me, when searching for an exact phrase, it's logical to see the latest news with that phrase first. I don't need the results to be sorted by relevance. I don't mind if the first record found has 1 occurrence of my phrase and the second one has 10. All I care is ORDER BY time_added or ORDER BY id. So I tried:

SELECT * FROM News WHERE MATCH (title, content) AGAINST ('"My Phrase"' in Boolean Mode) ORDER BY id DESC limit 40

But when I try to EXPLAIN this query, I see that it is Using filesort as I excpected.

Is it the only way MySQL works? Can I get the results sorted by time in an optimal way?

Edit:

MySQL version: 5.6.34

News table structure is as follows:

CREATE TABLE `News` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `title` varchar(400) COLLATE utf8_bin NOT NULL,
 `url_title` varchar(80) COLLATE utf8_bin NOT NULL,
 `link` varchar(2000) COLLATE utf8_bin NOT NULL,
 `link_unique_index` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 `source` smallint(5) unsigned NOT NULL COMMENT,
 `category` smallint(5) unsigned NOT NULL DEFAULT '0',
 `description` varchar(600) COLLATE utf8_bin NOT NULL,
 `has_content` tinyint(1) NOT NULL DEFAULT '0',
 `content` text COLLATE utf8_bin NOT NULL,
 `has_image` tinyint(1) NOT NULL,
 `image` varchar(14) COLLATE utf8_bin NOT NULL,
 `image_orientation` char(1) COLLATE utf8_bin NOT NULL,
 `original_image` varchar(2000) COLLATE utf8_bin NOT NULL,
 `keywords` varchar(300) COLLATE utf8_bin NOT NULL,
 `year_added` smallint(4) NOT NULL,
 `date_added` date NOT NULL,
 `time_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `time_updated` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `number_of_views` int(11) NOT NULL DEFAULT '0',
 `last_view_time` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `number_of_positives` int(11) NOT NULL DEFAULT '0',
 `number_of_negatives` int(11) NOT NULL DEFAULT '0',
 `number_of_votes` int(11) NOT NULL DEFAULT '0',
 `number_of_suggestions` int(11) NOT NULL DEFAULT '0',
 `last_suggestion_time` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 PRIMARY KEY (`id`),
 UNIQUE KEY `lui` (`link_unique_index`) USING BTREE,
 KEY `st` (`source`,`time_added`),
 KEY `snv` (`source`,`number_of_views`),
 KEY `snvo` (`source`,`number_of_votes`),
 KEY `sct` (`source`,`category`,`time_added`),
 KEY `scnv` (`source`,`category`,`number_of_views`),
 KEY `scnvo` (`source`,`category`,`number_of_votes`),
 KEY `ct` (`category`,`time_added`),
 KEY `cnv` (`category`,`number_of_views`),
 KEY `cnvo` (`category`,`number_of_votes`),
 KEY `hicdnv` (`has_image`,`category`,`date_added`,`number_of_views`),
 KEY `sdnv` (`source`,`date_added`,`number_of_views`),
 KEY `sdnvo` (`source`,`date_added`,`number_of_votes`),
 KEY `scdnv` (`source`,`category`,`date_added`,`number_of_views`),
 KEY `scdnvo` (`source`,`category`,`date_added`,`number_of_votes`),
 KEY `cdnv` (`category`,`date_added`,`number_of_views`),
 KEY `cdnvo` (`category`,`date_added`,`number_of_votes`),
 KEY `dnv` (`date_added`,`number_of_views`),
 KEY `dnvo` (`date_added`,`number_of_votes`),
 KEY `clst` (`category`,`last_suggestion_time`) USING BTREE,
 KEY `slst` (`source`,`last_suggestion_time`) USING BTREE,
 KEY `nv` (`number_of_views`) USING BTREE,
 KEY `nvo` (`number_of_votes`) USING BTREE,
 KEY `t` (`time_added`) USING BTREE,
 KEY `lvt` (`last_view_time`) USING BTREE,
 FULLTEXT KEY `title_content` (`title`,`content`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin ROW_FORMAT=COMPACT

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
The optimized query:
SELECT
        * 
    FROM
        News 
    WHERE
        MATCH (News.title,News.content) AGAINST ('"My Phrase"' in Boolean Mode)

Related Articles



* original question posted on StackOverflow here.