[Solved] PostgreSQL: Terribly slow ORDER BY with primary key as the ordering key

EverSQL Database Performance Knowledge Base

PostgreSQL: Terribly slow ORDER BY with primary key as the ordering key

Database type:

I have a model like this

enter image description here

with the following table sizes:

+------------------+-------------+
| Table            |    Records  |
+------------------+-------------+
| JOB              |         8k  |
| DOCUMENT         |       150k  |
| TRANSLATION_UNIT |      14,5m  |
| TRANSLATION      |      18,3m  |
+------------------+-------------+

Now the following query

select translation.id
from "TRANSLATION" translation
   inner join "TRANSLATION_UNIT" unit
     on translation.fk_id_translation_unit = unit.id
   inner join "DOCUMENT" document
     on unit.fk_id_document = document.id     
where document.fk_id_job = 11698
order by translation.id asc
limit 50 offset 0

takes about 90 seconds to finish. When I remove the ORDER BY and LIMIT clauses, it takes 19.5 seconds. ANALYZE had been run on all tables just before executing the query.

For this particular query, these are the numbers of records satisfying the criteria:

+------------------+-------------+
| Table            |     Records |
+------------------+-------------+
| JOB              |          1  |
| DOCUMENT         |       1200  |
| TRANSLATION_UNIT |    210,000  |
| TRANSLATION      |    210,000  |
+------------------+-------------+

The query plan:

enter image description here

The query plan for the modification without ORDER BY and LIMIT is here.

Database parameters:

PostgreSQL 9.2

shared_buffers = 2048MB
effective_cache_size = 4096MB
work_mem = 32MB

Total memory: 32GB
CPU: Intel Xeon X3470 @ 2.93 GHz, 8MB cache

Can anyone see what is wrong with this query?

UPDATE: Query plan for the same query without ORDER BY (but still with the LIMIT clause).

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid OFFSET In LIMIT Clause (query line: 10): OFFSET clauses can be very slow when used with high offsets (e.g. with high page numbers when implementing paging). Instead, use the following \u003ca target\u003d"_blank" href\u003d"http://www.eversql.com/faster-pagination-in-mysql-why-order-by-with-limit-and-offset-is-slow/"\u003eseek method\u003c/a\u003e, which provides better and more stable response rates.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  3. Sort and Limit Before Joining (modified query below): In cases where the joins aren't filtering any rows, it's possible to sort and limit the amount of rows using a subquery in the FROM clause, before applying the joins to all other tables.
Optimal indexes for this query:
CREATE INDEX document_idx_fk_job ON "DOCUMENT" ("fk_id_job");
CREATE INDEX translation_idx_id ON "TRANSLATION" ("id");
CREATE INDEX translation_unit_idx_fk_document ON "TRANSLATION_UNIT" ("fk_id_document");
The optimized query:
SELECT
        translation_id 
    FROM
        (SELECT
            translation.id AS translation_id,
            translation.fk_id_translation_unit AS translation_fk_id_translation_unit 
        FROM
            "TRANSLATION" translation 
        ORDER BY
            translation.id ASC LIMIT 50 OFFSET 0) translation 
    INNER JOIN
        "TRANSLATION_UNIT" unit 
            ON translation.translation_fk_id_translation_unit = unit.id 
    INNER JOIN
        "DOCUMENT" document 
            ON unit.fk_id_document = document.id 
    WHERE
        document.fk_id_job = 11698 LIMIT 50

Related Articles



* original question posted on StackOverflow here.