[Solved] Why in my case ST_DWithin is not using index

EverSQL Database Performance Knowledge Base

Why in my case ST_DWithin is not using index

Database type:

I'm using Postgis extension for Postgres and trying to optimize my query for searching points in circle.

Consider I have this table with index:

create table position
(
 id bigserial not null primary key,
 date timestamp with time zone,
 point GEOMETRY(Point, 4326),
 alias varchar(50)
);
create index position_point_idx on position using gist (point);

Now when I use query with polygon everything work as expected. In explain plan I can see that query uses index.

SELECT distinct alias
FROM position
WHERE date > '2021-11-28T19:26:18.574Z'
  AND date < '2021-11-28T20:26:18.574Z'
  AND ST_contains(ST_GeomFromText(
                          'POLYGON ((13.970947489142418 49.59174558308953, 13.970947489142418 50.12515341892287, 15.208740681409838 50.12515341892287, 15.208740681409838 49.59174558308953, 13.970947489142418 49.59174558308953))',
                          4326), point);

  ->  Bitmap Index Scan on position_point_idx  (cost=0.00..183.82 rows=5254 width=0) (actual time=5.981..5.981 rows=94462 loops=1)

Okey now I want to search aliases in circle but for some reason it takes seconds and not using index at all.

SELECT distinct alias
FROM position
WHERE
date > '2021-11-28T19:26:18.574Z' AND date < '2021-11-28T20:26:18.574Z'
AND
ST_DWithin (point,ST_GeomFromText('POINT (14.32983409613371 
            49.91815471231952)',4326),62815.14152820495);

ST_DWithin is in list here so it should use index but it's ignoring it. What I'm doing wrong here? Thanks for any hint.

Here is my query plan

HashAggregate  (cost=687537.59..687538.59 rows=100 width=9) (actual time=2874.991..2875.003 rows=100 loops=1)
  Output: alias
"  Group Key: ""position"".alias"
  ->  Gather  (cost=1000.00..686702.70 rows=333955 width=9) (actual time=0.254..2041.354 rows=5008801 loops=1)
        Output: alias
        Workers Planned: 2
        Workers Launched: 2
"        ->  Parallel Seq Scan on public.""position""  (cost=0.00..652307.20 rows=139148 width=9) (actual time=0.021..2117.644 rows=1669600 loops=3)"
              Output: alias
"              Filter: ((""position"".date > '2021-11-28 19:26:18.574+00'::timestamp with time zone) AND (""position"".date < '2021-11-28 20:26:18.574+00'::timestamp with time zone) AND (""position"".point && '0103000020E6100000010000000500000077EC65F919AAEEC0B42AE025A7A5EEC077EC65F919AAEEC03A26ECE821B2EE4077646615AFADEE403A26ECE821B2EE4077646615AFADEE40B42AE025A7A5EEC077EC65F919AAEEC0B42AE025A7A5EEC0'::geometry) AND ('0101000020E61000000100C003E0A82C40520AF71786F54840'::geometry && st_expand(""position"".point, '62815.1415282049493'::double precision)) AND _st_dwithin(""position"".point, '0101000020E61000000100C003E0A82C40520AF71786F54840'::geometry, '62815.1415282049493'::double precision))"
              Rows Removed by Filter: 86028
              Worker 0: actual time=0.023..2492.854 rows=1922778 loops=1
              Worker 1: actual time=0.025..2493.448 rows=2024544 loops=1
Planning Time: 0.211 ms
Execution Time: 2876.783 ms

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Calling Functions With Indexed Columns (query line: 8): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `point` is indexed, the index won’t be used as it’s wrapped with the function `ST_DWithin`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
Optimal indexes for this query:
CREATE INDEX position_idx_date ON "position" ("date");
The optimized query:
SELECT
        DISTINCT position.alias 
    FROM
        position 
    WHERE
        position.date > '2021-11-28T19:26:18.574Z' 
        AND position.date < '2021-11-28T20:26:18.574Z' 
        AND ST_DWithin(position.point, ST_GeomFromText('POINT (14.32983409613371 
            49.91815471231952)', 4326), 62815.14152820495)

Related Articles



* original question posted on StackOverflow here.