[Solved] SQL join on col1 if present else use col2

EverSQL Database Performance Knowledge Base

There are two tables, T1(col1) and T2(col1, col2), which need to be joined. However, T2.col1 may be NULL, in which case T2.col2 should be used as a fallback.

What I want is to join on T1.col1 = T2.col1, or on T1.col1 = T2.col2 when T2.col1 is NULL.

I have already tried these:

select * from T2 left join T1
on T1.col1 = coalesce(T2.col1, T2.col2);

select * from T2 left join T1
on T1.col1 in (T2.col1, T2.col2);

select * from T2 left join T1
on T1.col1 = T2.col1
or T1.col1 = T2.col2;

All three result in the ETL job never finishing.

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Calling Functions With Indexed Columns (query line: 7): When an indexed column is wrapped in a function call, the database's optimizer usually can’t use an ordinary index on that column. Here, even if the columns `col1` and `col2` are indexed, those indexes won’t be used because both columns are wrapped in the function `coalesce`. If you can’t find an equivalent condition that avoids the function call, a possible solution is to store the required value in a new indexed column.
  2. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard unless you intend to use them all. Selecting redundant columns may degrade performance unnecessarily.
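
The "new indexed column" suggestion in the recommendations above could be sketched roughly as follows. This is only an illustration under assumptions the question does not state: the column name `join_key`, the index names, and the `INT` type are hypothetical (use the actual type of `col1`/`col2`), and the `ADD COLUMN` form shown is the one accepted by MySQL and PostgreSQL, so other databases may need slightly different syntax.

```sql
-- Assumes T1(col1) and T2(col1, col2) already exist, as in the question.
-- join_key, the index names, and the INT type are hypothetical.

-- 1. Add a column that holds the precomputed join value.
ALTER TABLE T2 ADD COLUMN join_key INT;

-- 2. Backfill it: col1 when present, otherwise col2.
UPDATE T2
SET join_key = COALESCE(col1, col2);

-- 3. Index both sides of the join.
CREATE INDEX idx_t2_join_key ON T2 (join_key);
CREATE INDEX idx_t1_col1 ON T1 (col1);

-- 4. Join on the plain column, with no function call in the ON clause.
SELECT
    T2.col1,
    T2.col2,
    T1.col1 AS t1_col1
FROM T2
LEFT JOIN T1
    ON T1.col1 = T2.join_key;
```

The extra column has to be kept in sync whenever `col1` or `col2` changes, for example by recomputing it in the ETL step that loads T2. Whether this actually helps depends on the execution plan, so it is worth comparing plans before and after.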
The optimized query:
SELECT
        * 
    FROM
        T2 
    LEFT JOIN
        T1 
            ON T1.col1 = coalesce(T2.col1, T2.col2)
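
The query above still applies `coalesce` in the join condition. One alternative worth testing, offered as a sketch rather than a guaranteed improvement, is to split the rows of T2 by whether `col1` is NULL and join each part on a plain equality. The two branches are disjoint and together cover every row of T2, so the result should contain the same rows as the `coalesce` version (row order aside). The alias `t1_col1` is made up for illustration, and the explicit column list follows the advice about avoiding `*`.

```sql
-- Assumes T1(col1) and T2(col1, col2) exist, as in the question.

-- Rows of T2 where col1 is present: join on col1.
SELECT
    T2.col1,
    T2.col2,
    T1.col1 AS t1_col1
FROM T2
LEFT JOIN T1
    ON T1.col1 = T2.col1
WHERE T2.col1 IS NOT NULL

UNION ALL

-- Rows of T2 where col1 is NULL: fall back to col2.
SELECT
    T2.col1,
    T2.col2,
    T1.col1 AS t1_col1
FROM T2
LEFT JOIN T1
    ON T1.col1 = T2.col2
WHERE T2.col1 IS NULL;
```

Each branch is a simple equality join, which an index on `T1.col1` (if one exists) can serve directly. The `WHERE` filters reference only T2, the preserved side of the left join, so unmatched T2 rows are still returned with a NULL `t1_col1`.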

* original question posted on StackOverflow here.