[Solved] MySql Group By Query Taking More 5 seconds On An Indexed Table
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

MySql Group By Query Taking More 5 seconds On An Indexed Table

Database type:

I have a simple analytics table which has a little over 2 Million rows and currently growing at a pace of 20k rows per day. Please check the table

CREATE TABLE IF NOT EXISTS `affiliate_analytics` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `session_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `funnel_id` int(10) unsigned NOT NULL DEFAULT '0',
  `affiliate_id` int(10) unsigned NOT NULL DEFAULT '0',
  `created_timestamp` int(10) unsigned NOT NULL DEFAULT '0',
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  `binary_ip` varbinary(16) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `session_id_index` (`session_id`),
  KEY `funnel_timestamp_index` (`funnel_id`,`created_timestamp`),
  KEY `funnel_aff_timestamp_index` (`funnel_id`,`affiliate_id`,`created_timestamp`),
  KEY `timestamp_idx` (`created_timestamp`),
  KEY `funnel_aff_ip_timestamp_index` (`funnel_id`,`affiliate_id`,`binary_ip`,`created_timestamp`),
  KEY `uniques_impressions_index` (`funnel_id`,`created_timestamp`,`affiliate_id`,`binary_ip`)
);

The query I am trying to run on this table is

select funnel_id, 
       affiliate_id, 
       COUNT(binary_ip) as impressions, 
       COUNT(DISTINCT(binary_ip)) as uniques 
FROM affiliate_analytics 
where funnel_id IN (2, 104, 145, 308, 309, 566, 672, 1149) 
  AND created_timestamp BETWEEN 1575176400 and 1593316799 
GROUP BY funnel_id, 
         affiliate_id

But it's taking more than 6 seconds to complete and when I am running EXPLAIN and ANALYZE queries on this table, it shows that it is not using all the parts of the index for running the query.

This is the output of the EXPLAIN statement

mysql> explain select funnel_id, affiliate_id, COUNT(binary_ip) as impressions, COUNT(DISTINCT(binary_ip)) as uniques FROM affiliate_analytics where funnel_id IN (2, 104, 145, 308, 309, 566, 672, 1149) AND created_timestamp BETWEEN 1575176400 and 1593316799 GROUP BY funnel_id, affiliate_id;
+----+-------------+--------------------------------+------------+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+------+--------+----------+--------------------------+
| id | select_type | table                          | partitions | type  | possible_keys                                                                                                           | key                           | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+--------------------------------+------------+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+------+--------+----------+--------------------------+
|  1 | SIMPLE      | affiliate_analytics            | NULL       | range | funnel_timestamp_index,funnel_aff_timestamp_index,timestamp_idx,funnel_aff_ip_timestamp_index,uniques_impressions_index | funnel_aff_ip_timestamp_index | 4       | NULL | 407111 |    50.00 | Using where; Using index |
+----+-------------+--------------------------------+------------+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+------+--------+----------+--------------------------+

I tried forcing the index uniques_impressions_index which should ideally be faster as per the MySql Index cookbook, however, this is no better and it's still taking 5 seconds for the query to run.

This is the output of the EXPLAIN statement with the forced uniques_impressions_index

mysql> explain select funnel_id, affiliate_id, COUNT(binary_ip) as impressions, COUNT(DISTINCT(binary_ip)) as uniques FROM affiliate_analytics use index(uniques_impressions_index) where funnel_id IN (2, 104, 145, 308, 309, 566, 672, 1149) AND created_timestamp BETWEEN 1575176400 and 1593316799 GROUP BY funnel_id, affiliate_id;
+----+-------------+--------------------------------+------------+-------+------------------------------------------------------------------------------------+---------------------------+---------+------+--------+----------+------------------------------------------+
| id | select_type | table                          | partitions | type  | possible_keys                                                                      | key                       | key_len | ref  | rows   | filtered | Extra                                    |
+----+-------------+--------------------------------+------------+-------+------------------------------------------------------------------------------------+---------------------------+---------+------+--------+----------+------------------------------------------+
|  1 | SIMPLE      | affiliate_analytics            | NULL       | range | funnel_aff_timestamp_index,funnel_aff_ip_timestamp_index,uniques_impressions_index | uniques_impressions_index | 8       | NULL | 407111 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+--------------------------------+------------+-------+------------------------------------------------------------------------------------+---------------------------+---------+------+--------+----------+------------------------------------------+

I am at my wit's end with this one single query which is slowing down our app for almost 4 days now.

Any help is appreciated.

Thanks in advance!

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  2. Explicitly ORDER BY After GROUP BY (modified query below): By default, the database sorts all 'GROUP BY col1, col2, ...' queries as if you specified 'ORDER BY col1, col2, ...' in the query as well. If a query includes a GROUP BY clause but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying 'ORDER BY NULL'.
Optimal indexes for this query:
ALTER TABLE `affiliate_analytics` ADD INDEX `affiliate_analytic_idx_funnel_id_created_timesta` (`funnel_id`,`created_timestamp`);
ALTER TABLE `affiliate_analytics` ADD INDEX `affiliate_analytic_idx_funnel_id_affiliate_id` (`funnel_id`,`affiliate_id`);
The optimized query:
SELECT
        affiliate_analytics.funnel_id,
        affiliate_analytics.affiliate_id,
        COUNT(affiliate_analytics.binary_ip) AS impressions,
        COUNT(DISTINCT (affiliate_analytics.binary_ip)) AS uniques 
    FROM
        affiliate_analytics 
    WHERE
        affiliate_analytics.funnel_id IN (
            2, 104, 145, 308, 309, 566, 672, 1149
        ) 
        AND affiliate_analytics.created_timestamp BETWEEN 1575176400 AND 1593316799 
    GROUP BY
        affiliate_analytics.funnel_id,
        affiliate_analytics.affiliate_id 
    ORDER BY
        NULL

Related Articles



* original question posted on StackOverflow here.