Select max, min, last row for each group in SQL without a subquery

In several RDBMS databases, including MySQL, subqueries are often one of the causes for performance issues. Therefore, we have an incentive to avoid them whenever we can and to find alternative ways to implement our requirements.

One of the most popular uses for subselects in SQL is when one needs to fetch the first, last, maximum or minimum row for each group in a table. For example, how would you implement an SQL query that should fetch the employees with the maximum salary for each department from the employees table? Actually, fetching the salary itself is pretty easy, but it becomes more complicated when you want to fetch the employee name (the row data) along with the maximum salary.

Let's look at the table:

Name Salary Role
David 130,000 Software Engineer
John 140,000 DevOps Engineer
Bob 120,000 Software Engineer
Sarah 130,000 DevOps Engineer
Alice 110,000 Software Engineer
Steve 95,000 DevOps Engineer

The common practice will be to write the solution to this problem with a subselect. To avoid that practice, we need to rephrase the problem and deep dive into it.

The new problem: find all employees where there is no less earning employee of the same role.

Now, let's look at the smaller parts of this requirement:
Find all employees:

select * from employees

For each employee, find all less earning people with the same role – here we need to perform two actions: 1) left join the table with itself using the role field. 2) add a condition to make sure the salary is the highest.

SELECT
empl1.*, empl2.salary
FROM
employees AS empl1
LEFT OUTER JOIN
employees AS empl2 ON empl2.role = empl1.role
AND empl2.salary > empl1.salary

Now, let's look at the result of the second step. You'll notice that the rows with max salary contain a NULL record for the joint salary column.

Name Salary Role Salary (joint)
David 130,000 Software Engineer NULL
John 140,000 DevOps Engineer NULL
Bob 120,000 Software Engineer 130,000
Sarah 130,000 DevOps Engineer 140,000
Alice 110,000 Software Engineer 120,000
Alice 110,000 Software Engineer 130,000
Steve 95,000 DevOps Engineer 130,000
Steve 95,000 DevOps Engineer 140,000

So the next trivial step will be just to filter all other rows by adding another condition:

SELECT
empl1.*, empl2.salary
FROM
employees AS empl1
LEFT OUTER JOIN
employees AS empl2 ON empl2.role = empl1.role
AND empl2.salary > empl1.salary
WHERE
empl2.salary IS NULL;

And the final result:

David 130,000 Software Engineer NULL
John 140,000 DevOps Engineer NULL

Please note - for this solution to work, you need to make sure you have the correct index in place. In this example, you'll need to create an index that includes the role and salary columns in order to avoid full table scans.

One thought on “Select max, min, last row for each group in SQL without a subquery

Comments are closed.