[Solved] Select with filter on function parameters, if specified

EverSQL Database Performance Knowledge Base

Select with filter on function parameters, if specified

I have a function to try to match partial data to a database row. I want it to find a match if the parameter is non null; if it's null it should ignore that parameter. If one of the parameters has a value but finds no match, the query returns no rows.

In pseudocode, that's pretty much how it'd go:

get all rows where:
  param_a matches col_a when param_b is not null else don't check this column
  AND param_b matches col_b when param_b is not null else don't check this column
  AND param_c matches col_c when param_c is not null else don't check this column
  AND param_d matches col_d when param_d is not null else don't check this column
  AND param_e matches col_e when param_e is not null else don't check this column

What I do right now:

SELECT * FROM table
WHERE nvl(param_a, col_a) = col_a
AND nvl(param_b, col_b) = col_b
AND nvl(param_c, col_c) = col_c
AND nvl(param_d, col_d) = col_d;

Etc... It works, but I'm not sure it's the best option. A colleague suggested that I use

SELECT * FROM table
WHERE (param_a = col_a or param_a is null)
AND (param_b = col_b or param_b is null)
AND (param_c = col_c or param_c is null)
AND (param_d = col_d or param_d is null);

As this is used for an auto-complete feature in a web application, the query is executed a lot, as the user types. Being fast is essential. The strictest columns are filtered first to reduce the number of rows to process.

Is either of these options preferable? If not, how would you do it?

EDIT: I wrote the question to be generic, but to put it in context: in this case it's for addresses. param_a is actually postal code, param_b street name etc... The function gets the string the user writes (ex: 999 Random St, Fakestate, Countryland, 131ABD) and calls a procedure on it that tries to split it and returns a table containing address, city, country, etc... that is used by the select statement (which is the subject of the question).

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Calling Functions With Indexed Columns (query line: 6): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `param_a` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  2. Avoid Calling Functions With Indexed Columns (query line: 6): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `col_a` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  3. Avoid Calling Functions With Indexed Columns (query line: 7): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `param_b` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  4. Avoid Calling Functions With Indexed Columns (query line: 7): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `col_b` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  5. Avoid Calling Functions With Indexed Columns (query line: 8): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `param_c` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  6. Avoid Calling Functions With Indexed Columns (query line: 8): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `col_c` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  7. Avoid Calling Functions With Indexed Columns (query line: 9): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `param_d` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  8. Avoid Calling Functions With Indexed Columns (query line: 9): When a function is used directly on an indexed column, the database's optimizer won’t be able to use the index. For example, if the column `col_d` is indexed, the index won’t be used as it’s wrapped with the function `nvl`. If you can’t find an alternative condition that won’t use a function call, a possible solution is to store the required value in a new indexed column.
  9. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
The optimized query:
SELECT
        * 
    FROM
        table 
    WHERE
        nvl(table.param_a, table.col_a) = table.col_a 
        AND nvl(table.param_b, table.col_b) = table.col_b 
        AND nvl(table.param_c, table.col_c) = table.col_c 
        AND nvl(table.param_d, table.col_d) = table.col_d

Related Articles



* original question posted on StackOverflow here.