[Solved] How can I count elements satisfying a condition in a group, with PostgreSQL

EverSQL Database Performance Knowledge Base

With this query:

SELECT date_trunc('minute', ts) ts, instrument
FROM test
GROUP BY date_trunc('minute', ts), instrument
ORDER BY ts

I am grouping rows by minute, but I would like to generate a boolean value that tells me whether the group contains at least one row whose timestamp has seconds < 10 and at least one row whose timestamp has seconds > 50.

In short, something like:

lessThan10 = false
moreThan50 = false
for each row in the one minute group:
    if row.ts.seconds < 10 then lessThan10 = true
    if row.ts.seconds > 50 then moreThan50 = true

return lessThan10 && moreThan50

What I am trying to achieve is to find out whether the records I aggregate cover both the beginning and the end of the minute. It's OK if there are holes here and there, but it's possible that the data we capture stops and then restarts at second 40, for example; in that case, I'd like to be able to discard the whole minute. As the data rate varies quite a lot, I can't check for a minimum number of rows. There may be a better way to achieve this, so I'm open to that as well.
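One way to express the loop above directly in SQL is PostgreSQL's `bool_or` aggregate, which is true when at least one row in the group satisfies the condition. The sketch below assumes `ts` is a timestamp column of `test`, as in the query above; the output names `minute_ts`, `has_start`, `has_end` and `covers_minute` are made up for illustration.

```sql
-- Flag each (minute, instrument) group that has rows near both ends of the minute.
SELECT date_trunc('minute', ts) AS minute_ts,
       instrument,
       -- true if at least one row has seconds below 10
       bool_or(extract(second FROM ts) < 10) AS has_start,
       -- true if at least one row has seconds above 50
       bool_or(extract(second FROM ts) > 50) AS has_end,
       bool_or(extract(second FROM ts) < 10)
           AND bool_or(extract(second FROM ts) > 50) AS covers_minute
FROM test
GROUP BY date_trunc('minute', ts), instrument
ORDER BY minute_ts;

-- Variant: drop incomplete minutes instead of flagging them.
SELECT date_trunc('minute', ts) AS minute_ts,
       instrument
FROM test
GROUP BY date_trunc('minute', ts), instrument
HAVING bool_or(extract(second FROM ts) < 10)
   AND bool_or(extract(second FROM ts) > 50)
ORDER BY minute_ts;
```

If actual counts are wanted rather than booleans, `count(*) FILTER (WHERE extract(second FROM ts) < 10)` should give the number of matching rows per group in the same way.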

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.

The optimization process and recommendations:
  1. Avoid Calling Functions With Indexed Columns (query line: 8): When a function wraps an indexed column, the database's optimizer generally cannot use a plain index on that column. For example, if the column `ts` is indexed, that index won’t be used here because `ts` is wrapped in the function `date_trunc`. If you can’t rewrite the expression without the function call, a possible solution is to store the required value in a new indexed column.

The optimized query:
SELECT
        date_trunc('minute',
        ts) ts,
        test.instrument 
    FROM
        test 
    GROUP BY
        date_trunc('minute',
        ts),
        test.instrument 
    ORDER BY
        ts
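
As a sketch of the recommendation about indexing the computed value: PostgreSQL can index an expression directly, so a separate column is not strictly required. This assumes `ts` is of type `timestamp without time zone`, for which `date_trunc` is immutable; for `timestamp with time zone` the two-argument form depends on the session time zone, and PostgreSQL rejects it in an index expression. The index name `test_minute_instrument_idx` is hypothetical.

```sql
-- Expression index matching the GROUP BY keys of the query above.
-- Assumes ts is "timestamp without time zone".
CREATE INDEX test_minute_instrument_idx
    ON test (date_trunc('minute', ts), instrument);
```

Whether the planner actually uses this index for the grouping depends on table size and statistics. Since the query reads every row, it is worth checking the plan with `EXPLAIN` before keeping the index.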

* original question posted on StackOverflow here.