I have a table, car_data, as follows:
| id | car | date | mileage | 
|---|---|---|---|
| 1 | car_4 | 2021-01-05 | 10000 | 
| 2 | car_1 | 2021-01-10 | 20 | 
| 3 | car_3 | 2021-01-11 | 300000 | 
| 4 | car_2 | 2021-01-31 | 1000 | 
| 5 | car_3 | 2021-07-31 | 304000 | 
| 6 | car_4 | 2021-11-30 | 10500 | 
| 7 | car_2 | 2021-12-01 | 2200 | 
| 8 | car_1 | 2021-12-31 | 500 | 
| 9 | car_2 | 2022-02-02 | 2400 | 
| 10 | car_4 | 2022-02-07 | 10900 | 
| 11 | car_1 | 2022-02-15 | 530 | 
Now I need a MySQL (v 5.7) query that produces the following result:
| car | year_2021_mileage | year_2022_mileage | 
|---|---|---|
| car_1 | 480 | 30 | 
| car_2 | 1200 | 200 | 
| car_3 | 4000 | 
Note: the solution should include only selected cars (e.g., car_1, car_2, and car_3 but not car_4).
My MySQL (v 5.7) query nearly works, except for car_3. car_3 only operated in 2021. It did not operate in 2022. However, the query excludes car_3 from the results, even for 2021.
Note: data is added over time so the auto-incrementing id may be used to find earliest or latest entries in the query.
Note: the actual data has lots of cars with mileage reporting on many days (e.g., not just start/end of year). The desired solution should be able to filter the cars to a subset of all cars.
I derived my approach from the example and solution shown here to substract values from different rows within the same column.
Explanation of my strategy:
- t1 represents most recent mileage from 2022
 - t2 represents mileage from end of 2021
 - t3 represents mileage from start of 2021
 - Joining approach for t2 and t3 effectively create a cross join with t1 (I'm using this approach because my query is generated using Python SQLAlchemy ORM package which doesn't support cross join method but this approach seems to work)
 - The subquery joins restrict the t1, t2, t3 results to the appropriate date for each car
 
    SELECT t1.car,
       t2.mileage - t3.mileage AS year_2021_mileage,
       t1.mileage - t2.mileage AS year_2022_mileage
    FROM   car_data t1
       JOIN car_data t2
         ON t1.id IS NOT NULL
       JOIN car_data t3
         ON t1.id IS NOT NULL
       JOIN (SELECT Max(anon.id) AS recent_id
             FROM   car_data AS anon
             WHERE  anon.car = 'car_1'
                    AND anon.date >= '2022-01-01'
                     OR anon.car = 'car_2'
                        AND anon.date >= '2022-01-01'
                     OR anon.car = 'car_3'
                       AND anon.date >= '2022-01-01'
             GROUP  BY anon.car) AS anon_1
         ON anon_1.recent_id = t1.id
       JOIN (SELECT Max(anon.id) AS end_of_year_id
             FROM   car_data AS anon
             WHERE  anon.car = 'car_1'
                    AND anon.date <= '2021-12-31'
                     OR anon.car = 'car_2'
                        AND anon.date <= '2021-12-31'
                     OR anon.car = 'car_3'
                        AND anon.date <= '2021-12-31'
             GROUP  BY anon.car) AS anon_2
         ON anon_2.end_of_year_id = t2.id
       JOIN (SELECT Min(anon.id) AS start_of_last_year_id
             FROM   car_data AS anon
             WHERE  anon.car = 'car_1'
                    AND anon.date >= '2021-01-01'
                     OR anon.car = 'car_2'
                        AND anon.date >= '2021-01-01'
                     OR anon.car = 'car_3'
                        AND anon.date >= '2021-01-01'
             GROUP  BY anon.car) AS anon_3
         ON anon_3.start_of_last_year_id = t3.id
    WHERE  t1.car = t2.car
       AND t1.car = t3.car
Please see the fiddle here which can be used to test solutions.
Update: I received the following solution which is compatible with MySQL v8.0. It uses CTE and LAG which are not supported in MySQL v5.7. I'm including it because I like the strategy of using the maximum and minimum mileage for each year. It may serve as good starting point for a solution that works on MySQL v5.7. There's a working fiddle of this solution (for MySQL v8.0) at this link
  with cte1 as (
   -- CTE1 - get every combination of car and year to ensure no gaps
   select D1.car, year(D2.date) year
    from car_data D1
    cross join car_data D2
    group by D1.car, year(D2.date)
),
cte2 as (
    -- CTE2 - join on the actual data and get the mileage at the start and end of the year
    select C.car, year, min(mileage) start, max(mileage) end
    from cte1 C
    left join car_data D on D.car = C.car and year(D.date) = C.year
    group by C.car, year
),
cte3 as (
    -- CTE3 use lag to get the mileage at the end of last year as the start data for this year where it exists
    select car, year
      , lag (end,1,start) over (partition by car order by car, year) start
      , end
    from cte2
)
select car
  , sum(case when year = 2021 then end-start else null end) 2021_year_mileage
  , sum(case when year = 2022 then end-start else null end) 2022_year_mileage
from cte3
where car='car_1' or car='car_2' or car='car_3'
group by car
order by car;