Problem
I have a working SQL statement which shows the following:
- Player
- Champion
- Role
- Total Kill/Death Ratio
- Total Kill+Assist/Death Ratio
- Total Win %
- Difference between Ratio from one month ago
- Difference between KA Ratio from one month ago
- Difference between win % from one month ago
However, I don’t think it’s quite as optimized as it could be:
with CTERatio as (
select tt.playerid, tt.champid, tt.roleid,
sum(tt.words) as kills,
sum(tt.words2) as deaths,
sum(tt.words3) as assists,
sum(tt.words)/sum(tt.words2) as ratio,
(sum(tt.words)+sum(tt.words3))/sum(tt.words2) as kdar,
avg(tt.complete) as win
from schema.records tt
group by tt.playerid, tt.champid, tt.roleid
having count(*) > 2
order by ratio desc
),
hist as (
select tt.playerid, tt.champid, tt.roleid,
sum(tt.words) as kills,
sum(tt.words2) as deaths,
sum(tt.words3) as assists,
sum(tt.words)/sum(tt.words2) as ratio,
(sum(tt.words)+sum(tt.words3))/sum(tt.words2) as kdar,
avg(tt.complete) as win
from schema.records tt
where tt.sys_time < date_trunc('day', NOW() - interval '1 month')
group by tt.playerid, tt.champid, tt.roleid
having count(*) > 2
order by ratio desc
)
select c.playerid as player, c.champid as champion, c.roleid as rolee,
round(c.ratio, 2) as ratio, round(c.kdar, 2) as kdar, round(c.win, 2) as win,
round(c.ratio - h.ratio, 2) as ratiochange,
round(c.kdar - h.kdar, 2) as kdarchange,
round(c.win - h.win, 2) as winchange
from CTERatio c, hist h
where c.playerid = h.playerid
and c.champid = h.champid
and c.roleid = h.roleid
order by c.ratio desc
Is it possible I can incorporate using CASE instead of WITH? Such as:
avg(c.kills) - avg
(
case when tt.sys_time < date_trunc('day', NOW() - interval '1 month')
then h.words
end
)
Solution
Naming conventions
I think your aliases are pretty cryptic.
This:
from schema.records tt
That:
from CTERatio c, hist h
It’s better to use aliases that mean something. tt
, c
, h
… think of Mr. Maintainer going back over your code having to make changes to it.
I will compliment you on re-labeling column names words
, words2
, words3
. Those are pretty bad column names and your re-labeling makes them much more meaningful.
Indenting/Formatting
Your CTEs are beautiful in how they are formatted. Your final select
statement, not so much. How about this:
select
c.playerid as player,
c.champid as champion,
c.roleid as rolee,
round(c.ratio, 2) as ratio,
round(c.kdar, 2) as kdar,
round(c.win, 2) as win,
round(c.ratio - h.ratio, 2) as ratiochange,
round(c.kdar - h.kdar, 2) as kdarchange,
round(c.win - h.win, 2) as winchange
Performance
This aggregate is expensive, especially used twice:
having count(*) > 2
This means the SQL engine will have to scan the whole table for each column to see if the count is greater than 2. Try to find a unique column to COUNT()
on rather than the whole table, like having count(c.playerid) > 2
You have many sum()
and round()
aggregates, can we cut down on them somehow? It’s hard to give specifics since we don’t know what the data looks like, but try to limit those as much as you can.
Both of your CTEs seem identical. I’m curious as to why you are selecting the same dataset twice. You could do a self join
instead.
Here is how I would write this script, bearing in mind I don’t know what the data looks like:
with CTERatio as (
-- calculate ratio
select
rec.playerid,
rec.champid,
rec.roleid,
sum(rec.words) as kills,
sum(rec.words2) as deaths,
sum(rec.words3) as assists,
sum(rec.words)/sum(rec.words2) as ratio,
(sum(rec.words)+sum(rec.words3))/sum(rec.words2) as kdar,
avg(rec.complete) as win
from schema.records as rec
group by rec.playerid, rec.champid, rec.roleid
having count(*) > 2
order by ratio desc
),
-- join ratio to itself to find highest ratio
select
cte_r1.playerid as player,
cte_r1.champid as champion,
cte_r1.roleid as rolee,
round(cte_r1.ratio, 2) as ratio,
round(cte_r1.kdar, 2) as kdar,
round(cte_r1.win, 2) as win,
round(cte_r1.ratio - h.ratio, 2) as ratiochange,
round(cte_r1.kdar - h.kdar, 2) as kdarchange,
round(cte_r1.win - h.win, 2) as winchange
from CTERatio as cte_r1
inner join CTERatio as cte_r2
on cte_r1.playerid = cte_r2.playerid
and cte_r1.champid = te_r2.champid
and cte_r1.roleid = cte_r2.roleid
order by cte_r1.ratio desc
;