[Solved] How to determine what to Index in a MySQL Table

EverSQL Database Performance Knowledge Base

I have the following query:

SELECT TIMESTAMPDIFF(MINUTE, firstOccurrence, lastOccurrence) as dwellTime
FROM ts_VisitorDuration
WHERE eventDate >= '2012-12-01'
AND eventDate <= '2014-03-03'
AND venueID = 1007
GROUP BY MACAddress, eventDate

It takes 12 to 17 seconds to run, which is slow. I believe indexes can speed it up, because the query below takes less than 1 second against the same table with the same conditions in the WHERE clause:

SELECT MACAddress
FROM ts_VisitorDuration
WHERE eventDate >= '2012-12-01'
AND eventDate <= '2014-03-03'
AND venueID = 1007
GROUP BY MACAddress, eventDate

I have tried indexing what I thought were the right columns, but it did not seem to improve performance.

Here is the EXPLAIN of the top (slow) query:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  ts_VisitorDuration  range   PRIMARY,venueID venueID 7   NULL    167776  Using where; Using temporary; Using filesort

And using SHOW INDEX FROM on the table gives:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
ts_VisitorDuration  0   PRIMARY 1   eventDate   A   21  NULL    NULL        BTREE       
ts_VisitorDuration  0   PRIMARY 2   MACAddress  A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  0   PRIMARY 3   venueID A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  1   venueID 1   venueID A   4542    NULL    NULL        BTREE       
ts_VisitorDuration  1   venueID 2   eventDate   A   172267  NULL    NULL        BTREE       
ts_VisitorDuration  1   MACAddress  1   MACAddress  A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  1   MACAddress  2   eventDate   A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  1   MACAddress  3   venueID A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  1   MACAddress  4   firstOccurrence A   11714214    NULL    NULL        BTREE       
ts_VisitorDuration  1   MACAddress  5   lastOccurrence  A   11714214    NULL    NULL    YES BTREE       
ts_VisitorDuration  1   firstOccurrence 1   firstOccurrence A   11714214    NULL    NULL        BTREE       

Can someone explain whether indexing will speed up my top query and, if so, how I can determine what to index?

Thank you!

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  2. Explicitly ORDER BY After GROUP BY (modified query below): In MySQL 5.7 and earlier, the database sorts all 'GROUP BY col1, col2, ...' queries as if you had also specified 'ORDER BY col1, col2, ...'. If a query includes a GROUP BY clause but you do not need the result sorted, you can suppress the sort by specifying 'ORDER BY NULL'. MySQL 8.0 no longer sorts GROUP BY results implicitly, so there the clause is harmless but unnecessary.
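Optimal indexes for this query (the SHOW INDEX output above shows that the existing `venueID` index already covers (`venueID`, `eventDate`), so the first statement would duplicate it):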
ALTER TABLE `ts_VisitorDuration` ADD INDEX `ts_visitorduration_idx_venueid_eventdate` (`venueID`,`eventDate`);
ALTER TABLE `ts_VisitorDuration` ADD INDEX `ts_visitorduration_idx_venueid_macaddress_eventdate` (`venueID`,`MACAddress`,`eventDate`);
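One possible reason the second query in the question is fast is that every column it touches (`MACAddress`, `eventDate`, `venueID`) is available from an index, whereas the slow query also reads `firstOccurrence` and `lastOccurrence` and so may have to visit each matching row. A minimal sketch of a covering index for the slow query follows. The index name is hypothetical, and the column order (equality column first, then the range column, then the remaining columns the query reads) is a common rule of thumb rather than something the original recommendation states.

```sql
-- Hypothetical covering index (the name is an assumption, not part of the
-- original recommendation).
--   venueID          : equality predicate, placed first
--   eventDate        : range predicate, placed second
--   MACAddress       : needed for GROUP BY
--   firstOccurrence,
--   lastOccurrence   : needed for TIMESTAMPDIFF in the SELECT list
ALTER TABLE `ts_VisitorDuration`
  ADD INDEX `idx_venue_date_mac_occurrence`
  (`venueID`, `eventDate`, `MACAddress`, `firstOccurrence`, `lastOccurrence`);
```

A wide index like this takes extra disk space and slows writes, so it is worth measuring before keeping it. Because `eventDate` is matched by a range, the index order probably cannot also satisfy `GROUP BY MACAddress, eventDate`, so a temporary table may still appear in the plan.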
The optimized query:
SELECT
        TIMESTAMPDIFF(MINUTE,
        ts_VisitorDuration.firstOccurrence,
        ts_VisitorDuration.lastOccurrence) AS dwellTime 
    FROM
        ts_VisitorDuration 
    WHERE
        ts_VisitorDuration.eventDate >= '2012-12-01' 
        AND ts_VisitorDuration.eventDate <= '2014-03-03' 
        AND ts_VisitorDuration.venueID = 1007 
    GROUP BY
        ts_VisitorDuration.MACAddress,
        ts_VisitorDuration.eventDate 
    ORDER BY
        NULL
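
To check whether an index change helps, the rewritten query can be run through EXPLAIN before and after creating the index. This is a sketch of that check, using only the table and columns from the question; the exact plan depends on the MySQL version and the table statistics.

```sql
-- Re-run EXPLAIN on the rewritten query and compare it with the plan shown
-- in the question.
EXPLAIN
SELECT TIMESTAMPDIFF(MINUTE, firstOccurrence, lastOccurrence) AS dwellTime
FROM ts_VisitorDuration
WHERE eventDate >= '2012-12-01'
  AND eventDate <= '2014-03-03'
  AND venueID = 1007
GROUP BY MACAddress, eventDate
ORDER BY NULL;

-- List the indexes that now exist on the table.
SHOW INDEX FROM ts_VisitorDuration;
```

In the EXPLAIN output, the `key` column shows which index was chosen, and `Using index` in `Extra` indicates that the query was answered from the index alone. On MySQL 5.7 and earlier, `Using filesort` would be expected to disappear once `ORDER BY NULL` is present.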

* original question posted on StackOverflow here.