[Solved] MySQL locking entire table when using 'in' predicate

EverSQL Database Performance Knowledge Base

I have a strange problem with MySQL locking the entire table when I run a query with a particular 'in' predicate.

This is the table concerned:

create table TableA (
USERID bigint(20) unsigned not null,
BLOBID tinyint unsigned not null,
blob_contents mediumblob not null,
primary key(USERID, BLOBID)) engine=innodb default charset=latin1;

There is a composite primary key set of (USERID, BLOBID). Here is some example data:

{[1, 1, <blob>], 
[1, 2, <blob>], 
[1, 3, <blob>], 
[2, 1, <blob>], 
[2, 2, <blob>]} 
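
For anyone who wants to reproduce this, the example rows above could be loaded with something like the following. This is only a sketch: the blob values are short placeholder strings, not real data from the question.

```sql
-- Placeholder blob contents; the original post does not show the real values.
insert into TableA (USERID, BLOBID, blob_contents) values
  (1, 1, 'blob-1-1'),
  (1, 2, 'blob-1-2'),
  (1, 3, 'blob-1-3'),
  (2, 1, 'blob-2-1'),
  (2, 2, 'blob-2-2');
```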

Say I want to get the data for blobs 1, 2 and 3 for user 2 (at this point I don't know that there is no row with BLOBID 3 for this user). This is the query I would run:

select * from TableA where USERID = 2 and BLOBID in (1, 2, 3) for update;

The query returns blobs 1 and 2 for user 2 and finds nothing for BLOBID 3. Unfortunately, even with the primary key, this query seems to lock the entire table. You can see the join type is 'ALL' when explaining the query:

+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | TABLEA | ALL  | PRIMARY       | NULL | NULL    | NULL |    7 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
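
A plan like the one above can be obtained by prefixing the statement with EXPLAIN. A minimal sketch, with the locking clause left off for simplicity (the exact columns shown vary between MySQL versions):

```sql
-- Show the access path chosen for the three-value IN list.
explain select * from TableA where USERID = 2 and BLOBID in (1, 2, 3);
```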

When I remove '3' from the in clause, MySQL correctly uses the primary key and locks just the 2 rows required:

select * from TableA where USERID = 2 and BLOBID in (1, 2) for update;

+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | TableA | range | PRIMARY       | PRIMARY | 9       | NULL |    2 | Using where |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
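
To check which rows are really locked rather than inferring it from the plan, one option is to hold the locks open in one session and inspect them from another. The sketch below assumes MySQL 8.0 or later, where the performance_schema.data_locks table is available. The question does not state a version, and older servers would need a different approach such as SHOW ENGINE INNODB STATUS.

```sql
-- Session 1: take the locks and keep the transaction open.
start transaction;
select * from TableA where USERID = 2 and BLOBID in (1, 2) for update;

-- Session 2: list the InnoDB locks currently held on the table (MySQL 8.0+ assumed).
-- The stored table name may be lowercase, depending on lower_case_table_names.
select INDEX_NAME, LOCK_TYPE, LOCK_MODE, LOCK_STATUS, LOCK_DATA
from performance_schema.data_locks
where OBJECT_NAME = 'TableA';

-- Session 1: release the locks when done.
rollback;
```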

Is it possible to run this query and yet lock only the rows it actually returns?
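
One approach that may help is an index hint, which asks the optimizer to use the primary key instead of a table scan. This is a sketch, not a guaranteed fix: on a table this small the optimizer can legitimately judge a full scan to be cheaper, and under the default REPEATABLE READ isolation level InnoDB may still take a gap lock where the missing BLOBID 3 row would be.

```sql
-- FORCE INDEX treats a table scan as a last resort, so the PRIMARY key is preferred.
select * from TableA force index (PRIMARY)
where USERID = 2 and BLOBID in (1, 2, 3)
for update;
```

Running the same statement through EXPLAIN first would show whether the hint changes the join type from ALL to range.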

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.

The optimization process and recommendations:

  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.

Optimal indexes for this query:
ALTER TABLE `TableA` ADD INDEX `tablea_idx_userid_blobid` (`USERID`,`BLOBID`);

The optimized query:
SELECT
        * 
    FROM
        TableA 
    WHERE
        TableA.USERID = 2 
        AND TableA.BLOBID IN (
            1, 2, 3
        ) FOR UPDATE

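Before creating the recommended index, it may be worth checking which indexes the table already has. The question describes a composite primary key on (USERID, BLOBID), which in InnoDB already indexes the same columns in the same order, so a second index on them might add little. A minimal sketch, assuming the table is named TableA as in the question:

```sql
-- List existing indexes; the PRIMARY key rows show which columns are already covered.
show index from TableA;
```
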
* original question posted on StackOverflow here.