Problem
This is a pattern in SQL queries that I’ve found myself repeating recently:
SELECT
w,
x,
y,
(w + x) / y as z
FROM
(SELECT
<some gigantic and complicated query> as w,
a + b as x,
a - b as y
FROM
basetable) somealias;
The issue is:
- a select query performs some complex operations and exposes the results under new column names (aliases)
- referring to such an alias elsewhere in the same select list doesn’t work in MySQL (I’m unsure about other RDBMSs)
This code fails:
SELECT
<some gigantic and complicated query> as w,
a + b as x,
a - b as y,
(w + x) / y as z
FROM
basetable;
Because MySQL doesn’t allow column aliases (here w, x and y) to be referenced by other expressions in the same select list.
The solution I’m using:
- wrap the select in a subquery
- this renames the columns
- use the new column names from the outer query
I’m unsure whether it’s a best practice, or a common practice, and thus easily understandable for other programmers.
Solution
I think your practice is fine. Here are the two queries and their explain results:
explain extended SELECT
w,
x,
y,
(w + x) / y as z
FROM
(SELECT
(SELECT SUM(s) FROM test2) as w,
a + b as x,
a - b as y
FROM
test) somealias;
+----+-------------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
| 2 | DERIVED | test | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
| 3 | SUBQUERY | test2 | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
+----+-------------+------------+------+---------------+------+---------+------+------+----------+-------+
explain extended SELECT
(SELECT SUM(s) FROM test2) as w,
a + b as x,
a - b as y,
((SELECT SUM(s) FROM test2) + (a + b)) / (a - b) as z
FROM
test;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------+
| 1 | PRIMARY | test | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
| 3 | SUBQUERY | test2 | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
| 2 | SUBQUERY | test2 | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------+
If I’m right, the second version runs the complex (and possibly bottleneck) subquery on table test2 twice, as the two SUBQUERY rows in its plan suggest, so it should be slower than the first query, which runs that subquery only once. The first one is also much easier to read, so I prefer it. (I’m not a MySQL guru; feel free to correct me.)
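To try the comparison yourself, a minimal setup might look like the sketch below. The table and column names (test, test2, a, b, s) come from the queries above; the column types and the row values are assumptions made up for illustration, with three rows per table to match the row estimates in the plans.

```sql
-- Hypothetical schema: types and values are assumptions, not the original data.
CREATE TABLE test  (a INT NOT NULL, b INT NOT NULL);
CREATE TABLE test2 (s INT NOT NULL);

INSERT INTO test  (a, b) VALUES (5, 1), (7, 3), (9, 4);
INSERT INTO test2 (s)    VALUES (10), (20), (30);

-- Derived-table version: w, x and y are named once in the inner query
-- and reused by name in the outer select list.
SELECT
    w,
    x,
    y,
    (w + x) / y AS z
FROM
    (SELECT
        (SELECT SUM(s) FROM test2) AS w,
        a + b AS x,
        a - b AS y
    FROM
        test) AS somealias;
```

Prefixing either version with EXPLAIN should show how many times test2 appears in the plan on your server; the exact output depends on the MySQL version.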
As long as the column aliases are well named, I think your approach (a subquery) is clear and therefore quite acceptable, though I’m not sure whether, or how much, it affects performance (try benchmarking).
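If the server supports common table expressions (MySQL 8.0 or later, which is an assumption here since the plans above look like they come from an older version), the same idea can be written with a named block instead of an inline subquery. This is only a sketch of an alternative spelling, and the name computed is hypothetical.

```sql
-- Assumes MySQL 8.0+ (WITH is not available in earlier versions).
-- "computed" is a hypothetical name for the intermediate result.
WITH computed AS (
    SELECT
        (SELECT SUM(s) FROM test2) AS w,
        a + b AS x,
        a - b AS y
    FROM
        test
)
SELECT
    w,
    x,
    y,
    (w + x) / y AS z
FROM
    computed;
```

Whether this performs any differently from the derived-table form is something to measure rather than assume.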
That said, I’d still urge you to use a prepared statement, stored procedure or stored function; I think that would make the intent even clearer, and it is potentially faster.
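As a rough sketch of the prepared-statement route, assuming the test and test2 tables from the queries above, the derived-table query can be given a name for the current session. The statement name z_report is hypothetical.

```sql
-- Prepare the query once per session under a hypothetical name.
PREPARE z_report FROM
    'SELECT w, x, y, (w + x) / y AS z
     FROM (SELECT
               (SELECT SUM(s) FROM test2) AS w,
               a + b AS x,
               a - b AS y
           FROM test) AS somealias';

-- Run it as often as needed.
EXECUTE z_report;

-- Release it when done; it also disappears when the session ends.
DEALLOCATE PREPARE z_report;
```

A prepared statement only lives for the session that created it, so a stored procedure may be the better fit if the query has to be shared between connections.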