[Solved] MySQL NOT IN Query much slower after Mysql Upgrade
Looking to automatically optimize YOUR SQL query? Start for free.

EverSQL Database Performance Knowledge Base

MySQL NOT IN Query much slower after Mysql Upgrade

Database type:

It's a very simple query:

SELECT * FROM temp_company WHERE number NOT IN (SELECT number FROM company)

It was taking 15 minutes before but that was on a Mysql installation with too low buffer pool size and 15 minutes was OK because this is a monthly task. I upgraded to Mysql 5.7 (from something like 5.1 or 5.2) as the original install was 32bit and I couldn't up the innodb buffer pool size to the minimum required 10gb for this DB (I've set it to 16GB on a machine with 32GB RAM. I've now gone to run this query a month later and it was still running after 6 hours.

The EXPLAIN for the above is:

id | select_type        | table        | partitions | type  | possible_keys | key    | key_len | ref | rows    | filtered | Extra       |
1  | PRIMARY            | temp_company |            | ALL   |               |        |         |     | 3226661 | 100.00   | Using where |
2  | DEPENDENT SUBQUERY | company      |            | index | number        | number | 33      |     | 3383517 | 100.00   | Using where |

The PRIMARY index on company and temp_company is id, but number is what they match on and that is a KEY in both but does the above suggest it's not using the index for the temp_company table?

The other logical query I thought to try was:

EXPLAIN SELECT tc.* FROM temp_company tc
LEFT JOIN company c on c.number = tc.number
WHERE c.number IS NULL

This is just as slow and the EXPLAIN is:

id | select_type | table | partitions | type  | possible_keys | key    | key_len | ref | rows    | filtered | Extra                                                           |
1  | SIMPLE      | tc    |            | ALL   |               |        |         |     | 3226661 | 100.00   |                                                                 |
2  | SIMPLE      | c     |            | index | number        | number | 33      |     | 3383517 | 100.00   | Using where; Ising index; Using join buffer (block nested loop) |

Any help would be much appreciated. Perhaps Mysql changed the way it finds indexes?

**UPDATE 1-------

SHOW CREATE's: company

   CREATE TABLE `company` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `count_telephone` mediumint(8) unsigned NOT NULL,
  `count_fax` mediumint(8) unsigned NOT NULL,
  `count_person` mediumint(8) unsigned NOT NULL,
  `person_date` date DEFAULT NULL COMMENT 'Date the company_person relation was updated',
  `count_email_address` mediumint(8) unsigned NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `url` varchar(255) DEFAULT NULL,
  `url_date` date DEFAULT NULL,
  `url_status` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Failure count for crawling the URL',
  `website_stamp_start` int(10) unsigned DEFAULT NULL,
  `website_stamp` int(10) unsigned DEFAULT NULL,
  `ch_url` varchar(255) DEFAULT NULL COMMENT 'Companies house URL',
  `keywords_stamp_start` int(10) unsigned DEFAULT NULL,
  `keywords_stamp` int(11) DEFAULT NULL,
  `number` varchar(30) CHARACTER SET ascii COLLATE ascii_bin DEFAULT NULL,
  `category` varchar(255) DEFAULT NULL,
  `status` varchar(255) DEFAULT NULL,
  `status_date` date DEFAULT NULL COMMENT 'Date the status field was updated',
  `country_of_origin` varchar(80) DEFAULT NULL,
  `dissolution_date` date DEFAULT NULL,
  `incorporation_date` date DEFAULT NULL,
  `account_ref_day` smallint(5) unsigned DEFAULT NULL,
  `account_ref_month` smallint(5) unsigned DEFAULT NULL,
  `account_next_due_date` date DEFAULT NULL,
  `account_last_made_up_date` date DEFAULT NULL,
  `account_category` varchar(255) DEFAULT NULL,
  `returns_next_due_date` date DEFAULT NULL,
  `returns_last_made_up_date` date DEFAULT NULL,
  `mortgages_num_charges` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_outstanding` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_part_satisfied` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_satisfied` smallint(5) unsigned DEFAULT NULL,
  `partnerships_num_gen_partners` smallint(5) unsigned DEFAULT NULL,
  `partnerships_num_lim_partners` smallint(5) unsigned DEFAULT NULL,
  `ext_name` varchar(255) DEFAULT NULL,
  `turnover` decimal(18,2) DEFAULT NULL,
  `turnover_date` date DEFAULT NULL,
  `trade_debtors` decimal(18,2) DEFAULT NULL,
  `other_debtors` decimal(18,2) DEFAULT NULL,
  `debtors_date` date DEFAULT NULL,
  `real_turnover_band` int(11) DEFAULT NULL,
  `est_turnover_band` int(11) DEFAULT NULL,
  `ext_address_date` date DEFAULT NULL,
  `care_of` varchar(255) DEFAULT NULL,
  `po_box` varchar(60) DEFAULT NULL,
  `line_1` varchar(255) DEFAULT NULL,
  `line_2` varchar(255) DEFAULT NULL,
  `town` varchar(60) DEFAULT NULL,
  `county` varchar(60) DEFAULT NULL,
  `country` varchar(60) DEFAULT NULL,
  `post_code` varchar(20) DEFAULT NULL,
  `DirScrapeID` int(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `homepage_keywords_stamp` (`keywords_stamp`),
  KEY `number` (`number`),
  KEY `url` (`url`),
  KEY `town` (`town`),
  KEY `county` (`county`),
  KEY `post_code` (`post_code`),
  KEY `name` (`name`),
  KEY `website_stamp` (`website_stamp`),
  KEY `website_stamp_start` (`website_stamp_start`),
  KEY `keywords_stamp_start` (`keywords_stamp_start`),
  KEY `turnover` (`turnover`),
  KEY `status` (`status`),
  KEY `category` (`category`),
  KEY `incorporation_date` (`incorporation_date`),
  KEY `real_turnover_band` (`real_turnover_band`),
  KEY `est_turnover_band` (`est_turnover_band`)
) ENGINE=InnoDB AUTO_INCREMENT=3706459 DEFAULT CHARSET=utf8

temp_company:

CREATE TABLE `temp_company` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `url` varchar(255) DEFAULT NULL,
  `ch_url` varchar(255) DEFAULT NULL,
  `number` varchar(30) DEFAULT NULL,
  `category` varchar(255) DEFAULT NULL,
  `status` varchar(255) DEFAULT NULL,
  `country_of_origin` varchar(80) DEFAULT NULL,
  `dissolution_date` date DEFAULT NULL,
  `incorporation_date` date DEFAULT NULL,
  `account_ref_day` smallint(5) unsigned DEFAULT NULL,
  `account_ref_month` smallint(5) unsigned DEFAULT NULL,
  `account_next_due_date` date DEFAULT NULL,
  `account_last_made_up_date` date DEFAULT NULL,
  `account_category` varchar(255) DEFAULT NULL,
  `returns_next_due_date` date DEFAULT NULL,
  `returns_last_made_up_date` date DEFAULT NULL,
  `mortgages_num_charges` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_outstanding` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_part_satisfied` smallint(5) unsigned DEFAULT NULL,
  `mortgages_num_satisfied` smallint(5) unsigned DEFAULT NULL,
  `partnerships_num_gen_partners` smallint(5) unsigned DEFAULT NULL,
  `partnerships_num_lim_partners` smallint(5) unsigned DEFAULT NULL,
  `ext_name` varchar(255) DEFAULT NULL,
  `turnover` decimal(18,2) DEFAULT NULL,
  `turnover_date` date DEFAULT NULL,
  `trade_debtors` decimal(18,2) DEFAULT NULL,
  `other_debtors` decimal(18,2) DEFAULT NULL,
  `debtors_date` date DEFAULT NULL,
  `real_turnover_band` int(11) DEFAULT NULL,
  `est_turnover_band` int(11) DEFAULT NULL,
  `ext_address_date` date DEFAULT NULL,
  `care_of` varchar(255) DEFAULT NULL,
  `po_box` varchar(60) DEFAULT NULL,
  `line_1` varchar(255) DEFAULT NULL,
  `line_2` varchar(255) DEFAULT NULL,
  `town` varchar(60) DEFAULT NULL,
  `county` varchar(60) DEFAULT NULL,
  `country` varchar(60) DEFAULT NULL,
  `post_code` varchar(20) DEFAULT NULL,
  `sic_code` varchar(10) DEFAULT NULL,
  `DirScrapeID` int(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `number` (`number`),
  KEY `status` (`status`),
  KEY `name` (`name`),
  KEY `sic_code` (`sic_code`)
) ENGINE=InnoDB AUTO_INCREMENT=3297833 DEFAULT CHARSET=utf8

UPDATE 2: Profile of the query (with limit 5)

+-------------------------------+----------+
| Status                        | Duration |
+-------------------------------+----------+
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000111 |
| executing                     | 0.000001 |
| Sending data                  | 0.000110 |
| executing                     | 0.000001 |
| Sending data                  | 0.000110 |
| executing                     | 0.000001 |
| Sending data                  | 0.000110 |
| executing                     | 0.000001 |
| Sending data                  | 0.000111 |
| executing                     | 0.000001 |
| Sending data                  | 0.000111 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000112 |
| executing                     | 0.000001 |
| Sending data                  | 0.000113 |
| executing                     | 0.000001 |
| Sending data                  | 0.000114 |
| executing                     | 0.000001 |
| Sending data                  | 0.000114 |
| executing                     | 0.000001 |
| Sending data                  | 0.000114 |
| executing                     | 0.000001 |
| Sending data                  | 0.000115 |
| executing                     | 0.000001 |
| Sending data                  | 0.000116 |
| executing                     | 0.000001 |
| Sending data                  | 0.000115 |
| executing                     | 0.000001 |
| Sending data                  | 0.000115 |
| executing                     | 0.000001 |
| Sending data                  | 0.000116 |
| executing                     | 0.000001 |
| Sending data                  | 0.000116 |
| executing                     | 0.000001 |
| Sending data                  | 0.000115 |
| executing                     | 0.000001 |
| Sending data                  | 0.000115 |
| executing                     | 0.000001 |
| Sending data                  | 0.000116 |
| executing                     | 0.000001 |
| Sending data                  | 0.000116 |
| executing                     | 0.000001 |
| Sending data                  | 0.000117 |
| executing                     | 0.000001 |
| Sending data                  | 0.000117 |
| executing                     | 0.000001 |
| Sending data                  | 0.000117 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000118 |
| executing                     | 0.000001 |
| Sending data                  | 0.000120 |
| executing                     | 0.000001 |
| Sending data                  | 0.000120 |
| executing                     | 0.000001 |
| Sending data                  | 0.000121 |
| executing                     | 0.000001 |
| Sending data                  | 0.000123 |
| executing                     | 0.000001 |
| Sending data                  | 0.000121 |
| executing                     | 0.000001 |
| Sending data                  | 0.000120 |
| executing                     | 0.000001 |
| Sending data                  | 0.000121 |
| executing                     | 0.000001 |
| Sending data                  | 0.000121 |
| executing                     | 0.000001 |
| Sending data                  | 0.000121 |
| executing                     | 0.000001 |
| Sending data                  | 0.000122 |
| executing                     | 0.000001 |
| Sending data                  | 0.000123 |
| executing                     | 0.000001 |
| Sending data                  | 0.000124 |
| executing                     | 0.000001 |
| Sending data                  | 1.063880 |
| end                           | 0.000009 |
| query end                     | 0.000008 |
| closing tables                | 0.000009 |
| freeing items                 | 0.000007 |
| Waiting for query cache lock  | 0.000002 |
| freeing items                 | 0.000062 |
| Waiting for query cache lock  | 0.000002 |
| freeing items                 | 0.000001 |
| storing result in query cache | 0.000002 |
| cleaning up                   | 0.000028 |
+-------------------------------+----------+

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Replace In Subquery With Correlated Exists (modified query below): In many cases, an EXISTS subquery with a correlated condition will perform better than a non correlated IN subquery.
Optimal indexes for this query:
ALTER TABLE `company` ADD INDEX `company_idx_number` (`number`);
The optimized query:
SELECT
        * 
    FROM
        temp_company 
    WHERE
        NOT EXISTS (
            SELECT
                1 
            FROM
                company 
            WHERE
                (
                    temp_company.number = company.number
                )
        )

Related Articles



* original question posted on StackOverflow here.