[Solved] Calculate cumulative run years/hour based on vehicle performance

EverSQL Database Performance Knowledge Base

Calculate cumulative run years/hour based on vehicle performance

Condition 1: if vehicle not failed, calculate cumulative run hours between vehicle start date to today's date.

Condition 2: if vehicle failed, calculate cumulative run hours between vehicle start date to failed date

Note: each vehicle has multiple/single start date and multiple/single/none failed date. so calculate cumulative run hours/years for each vehicle ID

Example SQL table below,

Vehicle_ID status date_on
1 Start 2018-05-23
1 Start 2021-06-15
1 Failed 2020-08-10
2 Start 2019-06-23
3 Start 2010-04-20
3 Failed 2010-05-10
4 Start 2011-01-20
4 Failed 2015-01-14
4 Start 2016-02-25
4 Failed 2019-04-10
5 Start 2015-01-14
5 Start 2018-03-16
6 Start 2019-04-10
6 Failed 2020-02-10
6 Failed 2021-04-11
7 Start 2011-01-14
7 Start 2016-03-16
7 Start 2019-04-10
7 Failed 2020-02-10

above vehicle ID 1 has 2 start date and 1 failed date, so calculate run years between first start date to first failed date + second start date to till date( no failure happened after second start date).

Vehicle ID 4 has 2 start date 2 failed date, so calculate run years between first start date to first failed date + second start date to second failed date.

I tried following code for calculate run hours between start date to failed date but getting struggle for excluding vehicle ID for below condition.

I am getting result as below format

VehicleID Run_years
1 2.358
4 7.109

Above Run years calculated based on below,

Vehichle Id 1:

1st start to 1st failed = 2.21917

2nd start to till date(no 2nd failed in vehicle id 1) = 0.13972

so sum run years = 2.3589.

Vehichle Id 4:

1st start to 1st failed = 3.986

2nd start to 2nd failed date = 3.123

so sum run years = 7.109.

** Required statement for exclude scenario**

Vehicle Id 5

vehicle Id 5 has 2 start date. failed date is missing after 1st start date. so exclude this for run years calculation.

Vehicle Id 6

vehicle Id 6 has 2 Failed date and 1 start date. 2nd start date is missing after 1st failed date. so exclude this for run years calculation.

Vehicle Id 7

vehicle Id 7 has 3 Start date and 1 Failed date. 2nd failed date is missing after 2nd start date. so exclude this for run years calculation.

Actual logic i required, count of start date - count of failed date is 2 or -1 or -2 then exclude for run years calculation.

Following codes are I used to calculate run hours without exclude the above condition

select
  s.vehicle_id,
  sum(datediff(day, date_on, isnull(m.min_dt, getdate()))/365.0) run_years
from #samples s
cross apply (
        select min(date_on) min_dt
               from #samples ss
               where s.vehicle_id = ss.vehicle_id
                        and s.date_on < ss.date_on
                        and ss.stat = 'Failed'
) m
where s.stat = 'Start'
group by s.vehicle_id
order by s.vehicle_id;

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Correlated Subqueries (query line: 8): A correlated subquery is a subquery that contains a reference (column: vehicle_id) to a table that also appears in the outer query. Usually correlated queries can be rewritten with a join clause, which is the best practice. The database optimizer handles joins much better than correlated subqueries. Therefore, rephrasing the query with a join will allow the optimizer to use the most efficient execution plan for the query.
The optimized query:
SELECT
        s.vehicle_id,
        sum(datediff(day,
        date_on,
        isnull(m.min_dt,
        getdate())) / 365.0) run_years 
    FROM
        #samples s CROSS APPLY (SELECT
            min(ss.date_on) min_dt 
        FROM
            #samples ss 
        WHERE
            s.vehicle_id = ss.vehicle_id 
            AND s.date_on < ss.date_on 
            AND ss.stat = 'Failed') m 
    WHERE
        s.stat = 'Start' 
    GROUP BY
        s.vehicle_id 
    ORDER BY
        s.vehicle_id

Related Articles



* original question posted on StackOverflow here.