I have the following table:
CREATE TABLE public.devices
(
id integer NOT NULL DEFAULT nextval('devices_id_seq'::regclass),
token text NOT NULL,
...some other columns...,
CONSTRAINT devices_pkey PRIMARY KEY (id, token)
)
CREATE INDEX IF NOT EXISTS idx_devices_lower_token ON devices (lower(token));
If I run the following query, everything is fine, the query runs instantly, and the index is used:
SELECT *
FROM "devices"
WHERE (id > '0')
AND ((lower(token) IN ('92134f4bb6a2263fd044a5a04440bc1a','cbf7d4a094bd266fd3f1b7ef4196172a', 'aa0f7650cf28010b5c01b77ae206f7fb')))
ORDER BY id LIMIT 1000;
"Limit (cost=58116.96..58119.46 rows=1000 width=345)"
" -> Sort (cost=58116.96..58189.47 rows=29005 width=345)"
" Sort Key: id"
" -> Bitmap Heap Scan on devices (cost=1061.91..56526.65 rows=29005 width=345)"
" Recheck Cond: (lower(token) = ANY ('{92134f4bb6a2263fd044a5a04440bc1a,cbf7d4a094bd266fd3f1b7ef4196172a}'::text[]))"
" Filter: (id > 0)"
" -> Bitmap Index Scan on idx_devices_lower_token (cost=0.00..1054.65 rows=29005 width=0)"
" Index Cond: (lower(token) = ANY ('{92134f4bb6a2263fd044a5a04440bc1a,cbf7d4a094bd266fd3f1b7ef4196172a}'::text[]))"
However, if I add a lot of conditions (e.g. 1k) for the (lower(token) IN (...)), the index will no longer be used anymore and the query will be VERY slow:
SELECT *
FROM "devices"
WHERE (id > '0')
AND ((lower(token) IN ('92134f4bb6a2263fd044a5a04440bc1a','cbf7d4a094bd266fd3f1b7ef4196172a', 'aa0f7650cf28010b5c01b77ae206f7fb', ... up to 1k tokens ...)))
ORDER BY id
LIMIT 1000;
"Limit (cost=3900016.07..3900018.57 rows=1000 width=345)"
" -> Sort (cost=3900016.07..3907219.10 rows=2881214 width=345)"
" Sort Key: id"
" -> Seq Scan on devices (cost=0.00..3742042.21 rows=2881214 width=345)"
" Filter: ((id > 0) AND (lower(token) = ANY ('{92134f4bb6a2263fd044a5a04440bc1a,cbf7d4a094bd266fd3f1b7ef4196172a,6116af468e0017ffe5c49309869be773,4a5ebc1cec6c452e74fbf1a4e22fe804,0a9981eaf530d22ad9fe12a73a0b486c,61eb008e86a5afeaa9edd3e56ba01e0b,a31775a866571d9c2ecadc6b2a8127a3,dfcb4bc34771883bb2a54ee44299b058,7c93581a24053021bd41b7cc4654bf9e,43182b0010cea255fef1d10849858d7c,c3f6503eafeaa1d25a6f5672310ef4e5,672641b1fcd23f67deab34deaf28f1d8,cb52e085618ecbdf6eef1669b7b38008,335626a32455e61e1b1de16550e43af0, (...)"
Ok so, it must not like many conditions, right?
Well, I tried replicating the same scenario but without using lower(token), and it works correctly:
create index IF NOT EXISTS idx_devices_token on devices (token);
SELECT *
FROM "devices"
WHERE (id > '0')
AND ((token IN ('92134f4bb6a2263fd044a5a04440bc1a','cbf7d4a094bd266fd3f1b7ef4196172a', 'aa0f7650cf28010b5c01b77ae206f7fb', ... 1k tokens ...)))
ORDER BY id
LIMIT 1000;
"Limit (cost=25.70..25.71 rows=3 width=345)"
" -> Sort (cost=25.70..25.71 rows=3 width=345)"
" Sort Key: id"
" -> Bitmap Heap Scan on devices (cost=13.69..25.68 rows=3 width=345)"
" Recheck Cond: (token = ANY ('{92134f4bb6a2263fd044a5a04440bc1a,cbf7d4a094bd266fd3f1b7ef4196172a,aa0f7650cf28010b5c01b77ae206f7fb}'::text[]))"
" Filter: (id > 0)"
" -> Bitmap Index Scan on idx_devices_token (cost=0.00..13.69 rows=3 width=0)"
" Index Cond: (token = ANY ('{92134f4bb6a2263fd044a5a04440bc1a,cbf7d4a094bd266fd3f1b7ef4196172a,aa0f7650cf28010b5c01b77ae206f7fb}'::text[]))"
So this seems to only happen if I use an index with lower and many WHERE IN conditions.
I have no idea what is going on here, any ideas?
I am using PostgreSQL 9.5
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
CREATE INDEX devices_idx_id ON "devices" ("id");
SELECT
*
FROM
"devices"
WHERE
(
"devices".id > '0'
)
AND (
(
lower("devices".token) IN (
'92134f4bb6a2263fd044a5a04440bc1a', 'cbf7d4a094bd266fd3f1b7ef4196172a', 'aa0f7650cf28010b5c01b77ae206f7fb'
)
)
)
ORDER BY
"devices".id LIMIT 1000