Problem
I’ve 2 tables emp
and expenditure
.
Emp:
ID, NAME
Expenditure:
ID, EMP_ID, AMOUNT
Each emp has a limit of 100 that he/she can spend. We want to check which emp has expenditure > 100.
Output attributes needed: Emp name, exp id, amount
My query:
SELECT E.NAME,
EXP.ID,
EXP.AMOUNT
FROM EMP E
INNER JOIN expenditure EXP ON E.ID = EXP.EMP_ID
WHERE E.ID in
(SELECT EMP_ID
FROM
(SELECT EMP_ID,
SUM(AMOUNT) AS TOTAL
FROM expenditure
GROUP BY EMP_ID
HAVING SUM(AMOUNT) > 100.00
ORDER BY TOTAL DESC) SUBQ)
ORDER BY EXP.AMOUNT desc;
Is it possible to optimize this? Can the subqueries be simplified?
Solution
Let’s assume that your table declarations look like this and that you’re using MySQL 8.0:
create table Employee(
id serial primary key,
name text not null
);
create table Expenditure(
id serial primary key,
employee_id int not null references Employee(id)
on update cascade on delete cascade,
amount decimal not null check(amount > 0)
);
(PostgreSQL would allow for the standards-compliant generated always as identity
as well as a money
column; MySQL supports neither.)
With your test data as
insert into Employee(name) values
('Bob'),
('Doug'),
('McKenzie');
insert into Expenditure(employee_id, amount)
select id, amount from (
select 9 as amount
union all select 2
union all select 3
union all select 5
) as amounts
cross join Employee where name = 'Bob';
insert into Expenditure(employee_id, amount)
select id, amount from (
select 100 as amount
union all select 190
union all select 450
) as amounts
cross join Employee where name = 'Doug';
PostgreSQL would allow for the standard values
-subquery syntax, which MySQL does not:
insert into Expenditure(employee_id, amount)
select id, amount from (
values (9), (2), (3), (5)
) as amounts(amount)
cross join Employee where name = 'Bob';
insert into Expenditure(employee_id, amount)
select id, amount from (
values (100), (190), (450)
) as amounts(amount)
cross join Employee where name = 'Doug';
You can eliminate some of your subqueries and your in
clause by using a windowing expression:
select exp.employee_id, emp.name, exp.amount
from Employee emp
join (
select
employee_id, amount,
sum(amount) over (partition by employee_id) as total
from Expenditure
) exp on exp.employee_id = emp.id
where exp.total >= 100
order by exp.amount desc;
You will always need at least one join
to get between your employee and expense tables. See fiddle.
All of that said, the windowing syntax is a little bit unwieldy, and you can also do (vaguely closer to your original query)
select exp.employee_id, emp.name, exp.amount
from Employee emp
join Expenditure exp on exp.employee_id = emp.id
where exp.employee_id in (
select employee_id
from Expenditure
group by employee_id
having sum(amount) >= 100
)
order by exp.amount desc;
but with only one subquery and one sum()
expression.