First of all i need to have a functionality of pagination by limiting the result in the following query lookup with couple where conditions.
SELECT SQL_CALC_FOUND_ROWS
a.uid, b.NAME
FROM
`profiles` AS a FORCE INDEX(profiles_country_city_gender_index)
JOIN `users` AS b
ON b.id = a.uid
AND a.country = 'INDONESIA'
AND a.gender = 0
JOIN (
SELECT
a.uid
FROM
profile_details AS a
JOIN profile_details AS kids ON kids.uid = a.uid
AND kids.kids_pref = 1
JOIN profile_details AS current ON current.uid = a.uid
AND current.current_relationship = 1
JOIN profile_details AS smoking ON smoking.uid = a.uid
AND smoking.smoking_pref = 1
) AS e ON e.uid = a.uid
AND ( TIMESTAMPDIFF( YEAR, a.birth_date, NOW()) BETWEEN 25 AND 35 )
LIMIT 33;
All tables here is one on one relation with table Users which are
Using id column as Primary key in Users, and uid in the other tables as Foreign key.
At the start, i do not have problem with the above query/design until the tables grows till 300K rows in it, the query running takes OK, Time: 0.726000s
to fetch the result which is too slow for my opinion.
I tried to count the rows based on the conditions above using count(*) and get some approximate same result, i need to have the faster way to get the count of rows from the lookup conditions to make the pagination system works as expected with less waiting time.
As you can see in the query, i am using:
FORCE INDEX(profiles_country_city_gender_index)
I think it's not helping much caused by the larger rows result from the scope by using:
AND a.country = 'INDONESIA'
AND a.gender = 0
Resulting (148801 rows scope limit by country with gender is equal by 0), if i am pairing with city it's not problem query time is considerable because rows result much smaller but still gonna be a problem when there were larger rows at someday.
To anyone who might ask for the query explanation:
Explain SELECT SQL_CALC_FOUND_ROWS
a.uid,
b.NAME ...
Results:
| select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-------------+---------+--------+------------------------------------+------------------------------------+---------+------------------+--------+-----------+------------------------------------+
| SIMPLE | a | ref | profiles_country_city_gender_index | profiles_country_city_gender_index | 242 | const | 148801 | 10.00 | Using index condition; Using where |
| SIMPLE | a | ref | profile_details_uid_foreign | profile_details_uid_foreign | 3 | restfulapi.a.uid | 1 | 100.00.00 | Using index |
| SIMPLE | kids | ref | profile_details_uid_foreign | profile_details_uid_foreign | 3 | restfulapi.a.uid | 1 | 10.00 | Using where |
| SIMPLE | current | ref | profile_details_uid_foreign | profile_details_uid_foreign | 3 | restfulapi.a.uid | 1 | 10.00 | Using where |
| SIMPLE | smoking | ref | profile_details_uid_foreign | profile_details_uid_foreign | 3 | restfulapi.a.uid | 1 | 10.00 | Using where |
| SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 3 | restfulapi.a.uid | 1 | 100.00.00 | |
As you can see in explain result, no table scan or using temporary or using range, only index condition. I imagine if the tables have at least 1 million rows returns by Country scope, just multiply the time with scaling rows by 300K it sucks :(.
Below is the table definition in case it helps to analysing the problem:
CREATE TABLE `profile_details` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`uid` mediumint(8) unsigned NOT NULL,
`intents` tinyint(4) NOT NULL DEFAULT '3',
`height` smallint(6) DEFAULT NULL,
`body_type` tinyint(4) NOT NULL DEFAULT '5',
`kids_pref` tinyint(4) NOT NULL DEFAULT '1',
`drinking_pref` tinyint(4) NOT NULL DEFAULT '2',
`living_with` tinyint(4) NOT NULL DEFAULT '0',
`current_relationship` tinyint(4) NOT NULL DEFAULT '1',
`sexual_pref` tinyint(4) NOT NULL DEFAULT '1',
`smoking_pref` tinyint(4) NOT NULL DEFAULT '0',
`status_online` tinyint(4) NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `profile_details_uid_foreign` (`uid`),
KEY `idx_multipart` (`intents`,`body_type`,`kids_pref`,`drinking_pref`,`living_with`,`current_relationship`,`sexual_pref`,`smoking_pref`),
CONSTRAINT `profile_details_uid_foreign` FOREIGN KEY (`uid`) REFERENCES `users` (`id`)
)
CREATE TABLE `profiles` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`uid` mediumint(8) unsigned NOT NULL,
`birth_date` date NOT NULL,
`gender` tinyint(4) NOT NULL DEFAULT '0',
`country` varchar(60) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'ID',
`city` varchar(60) COLLATE utf8mb4_unicode_ci DEFAULT 'Makassar',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`latitude` double NOT NULL DEFAULT '0',
`longitude` double NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `profiles_uid_foreign` (`uid`),
KEY `profiles_birth_date_index` (`birth_date`),
KEY `profiles_latitude_longitude_index` (`latitude`,`longitude`),
KEY `profiles_country_city_gender_index` (`country`,`city`,`gender`),
KEY `idx_country_gender_birthdate` (`country`,`gender`,`birth_date`),
KEY `idx_country_city_gender_birthdate` (`country`,`city`,`gender`,`birth_date`),
CONSTRAINT `profiles_uid_foreign` FOREIGN KEY (`uid`) REFERENCES `users` (`id`)
)
How can I find the solution, do i need to redesign the table to get the ideal system? perhaps it is the last option.
EDIT
I am trying what you suggested earlier, first i added an index in three column:
CREATE INDEX profiles_country_gender_birth_date_index on `profiles`(country,gender,birth_date);
and i tried to select Count(*) without JOIN with profile_detail:
SELECT
count(*)
FROM
`profiles` AS a
FORCE INDEX ( profiles_country_gender_birth_date_index )
JOIN `users` AS b ON b.id = a.uid
and
a.country = 'INDONESIA'
AND a.gender =1
AND a.birth_date BETWEEN NOW()- INTERVAL 35 YEAR
AND NOW()- INTERVAL 25 YEAR
result timing is not stable between 0.7sec to 0.35sec and i don't know why it could be. Below is Explain Query Plan in Json Format just in case to help to find out the culprit.
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "114747.38"
},
"nested_loop": [
{
"table": {
"table_name": "a",
"access_type": "range",
"possible_keys": [
"profiles_country_gender_birth_date_index"
],
"key": "profiles_country_gender_birth_date_index",
"used_key_parts": [
"country",
"gender",
"birth_date"
],
"key_length": "246",
"rows_examined_per_scan": 94066,
"rows_produced_per_join": 32961,
"filtered": "100.00",
"index_condition": "((`restfulapi`.`a`.`gender` = 1) and (`restfulapi`.`a`.`country` = 'INDONESIA') and (`restfulapi`.`a`.`birth_date` between <cache>((now() - interval 35 year)) and <cache>((now() - interval 25 year))))",
"cost_info": {
"read_cost": "15858.00",
"eval_cost": "6592.23",
"prefix_cost": "75194.00",
"data_read_per_join": "16M"
},
"used_columns": [
"uid",
"birth_date",
"gender",
"country"
]
}
},
{
"table": {
"table_name": "b",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "3",
"ref": [
"restfulapi.a.uid"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 32961,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "32961.15",
"eval_cost": "6592.23",
"prefix_cost": "114747.38",
"data_read_per_join": "89M"
},
"used_columns": [
"id"
]
}
}
]
}
}
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
ALTER TABLE `profile_details` ADD INDEX `profile_details_idx_kids_pr_current_smoking_uid` (`kids_pref`,`current_relationship`,`smoking_pref`,`uid`);
SELECT
SQL_CALC_FOUND_ROWS a_uid,
b.NAME
FROM
(SELECT
a.uid AS a_uid,
a.country AS a_country,
a.gender AS a_gender,
a.birth_date AS a_birth_date
FROM
`profiles` AS a LIMIT 33) AS a
JOIN
`users` AS b
ON b.id = a.a_uid
AND a.a_country = 'INDONESIA'
AND a.a_gender = 0
JOIN
(
SELECT
a.uid
FROM
profile_details AS a
JOIN
profile_details AS kids
ON kids.uid = a.uid
AND kids.kids_pref = 1
JOIN
profile_details AS `current`
ON `current`.uid = a.uid
AND `current`.current_relationship = 1
JOIN
profile_details AS smoking
ON smoking.uid = a.uid
AND smoking.smoking_pref = 1
) AS e
ON e.uid = a.a_uid
AND (
TIMESTAMPDIFF(YEAR,
a.a_birth_date,
NOW()) BETWEEN 25 AND 35) LIMIT 33