Problem
I have a small (few elements) list of primary keys, sorted by hand. It come from user input. For example
$ids = [15, 75, 54, 13];
The SQL command is simple.
SELECT id, value, type FROM a_table WHERE id IN (15, 75, 45, 13);
This is SQL, so the order of output may be random. I want to
restore it in PHP. My first attempt.
# $ids - the input
# $result - output from database
function sort_to_input_1($ids, $result)
{
$plain = array_map(function($a){return $a['id'];}, $result);
$flipped = array_flip($plain);
$sorted = array();
foreach($ids as $id) {
if (isset($flipped[$id])) {
$idx = $flipped[$id];
$sorted[] = $result[$idx];
}
}
return $sorted;
}
The second, the brute force.
# $ids - the input
# $result - output from database
function sort_to_input_2($ids, $result)
{
$sorted = array();
foreach($ids as $id) {
foreach($result as $row) {
if ($row['id'] == $id)) {
$sorted[] = $row;
}
}
}
return $sorted;
}
The third one.
# $ids - the input
# $result - output from database
function sort_to_input_3($ids, $result)
$result_h = array();
foreach($result as $row) {
$id = $row['id'];
$result_h[$id] = $row;
}
$sorted = array();
foreach($ids as $id) {
if (isset($result_h[$id])) {
$sorted[] = $result_h[$id];
}
}
return $sorted;
}
Witch one is the easiest to read? Or is there a fourth way?
Solution
While technically this isn’t reviewing your code, IMO this is the better solution: Do sort via MySQL
SELECT id, value, type
FROM a_table
WHERE id IN (15, 75, 45, 13);
ORDER BY FIELD(id, 15, 75, 45, 13)
PHP isn’t the fastest language and should therefor not be used for sorting/ordering sets of data if other solutions are present. Looping in PHP is slow (but sometimes negligible, like the while(fetch)
). MySQL, in this case, is a whole lot faster then PHP.
At the same time, this is more readable (IMO) and better to maintain.
Honestly, I think neither is all that readable. Without actually looking at the code in-depth, I had no idea what the functions might do, or how I could use them.
Naming
- the method signature:
sort_to_input($ids, $result)
. It’s not really clear what’s going on here. What’sinput
? Looking at the code or the comment, I know that it’s refering to$ids
; It would be easier if it would be part of the method name. I would probably name itsortByIds($ids, $result)
. plain
: plain what? It’s an array with just the ids, so I would just call itresult_ids
.idx
: I think you just have the x becauseid
is already taken? Again,result_id
would probably be best, especially because its consistent withresult_ids
.result_h
: What’sh
? I honestly can’t tell, but writing variable names out is always a good idea, so I would just do that.
Comments
I think it’s good that you have comments, and they actually helped me in understanding your code. But they can still be improved upon, I think.
the input
is only helpful in so far, as it links theids
to theto_input
part of the method name. I would go more in-depth here, egarray of ids to sort by
.output from database
is helpful, but whenever a function accepts some magic array structure, I prefer to have an in-depth comment, as it will otherwise be really hard for anyone not familiar with the code to actually use the function without guessing.
Here is an example how it might look:
/**
Sorts the given $result array by its "id" entry according to the given $ids.
Example:
$ids = array(15,75,54);
$results = array(
array("id" => 75, "value" => "foo"),
array("id" => 54, "value" => "foo"),
array("id" => 15, "value" => "foo")
);
sort_to_input($ids, $results);
will give:
array(
array("id" => 15, "value" => "foo"),
array("id" => 75, "value" => "foo"),
array("id" => 54, "value" => "foo")
);
Please note that the "id" entry must be present, must have the "id" key,
and must have a value that has a corresponding entry in the $ids array.
No two entries may have the same id,
and every entry must have a corresponding id.
@param array $ids ids to sort by
@param array $results output from database
*/
function sortByIds($ids, $result)
It might seem overkill, but anyone actually using your function will be glad that it’s there.
You can remove the warnings about incorrect input if you don’t care about it, or you can add error checking and throw well named exceptions and add extra comments for them, to clear up the comment a bit.
Different Solutions
As @Martijn said, letting the database perform the sort might be a good alternative.
It would also be interesting to know why you even need to do this in the first place.
Where does the id array come from?
Is it permanent – ie hardcoded in PHP? In that case, I would probably just add a rank
column to the table.
Error Checking and difference between Functions
Your functions are not exactly the same, as they handle invalid input differently (none of them handle it well). If my input is for example:
$ids = array(15,75,54);
$result = array(
array("id" => 75, "value" => "foo1"),
array("id" => 54, "value" => "foo2"),
array("id" => 54, "value" => "foo3")
);
then the second function will return a different result than the other two.
Here are two more examples for different results between the functions:
$ids = array(15,75,54);
$result = array(
array("id" => NULL, "value" => "foo1"),
array("id" => 54, "value" => "foo2"),
array("id" => 54, "value" => "foo3")
);
$ids = array(15,75,54);
$result = array(
array(),
array("id" => 54, "value" => "foo2"),
array("id" => 54, "value" => "foo3")
);
Most of the time the decision of what to use is a balance of readability and speed. Of the three you posted I personally find the second one to be the easiest to understand right off the bat.
There is a fourth option that uses usort and array_search
The order of $ids
isn’t in numeric order which makes using PHPs sorting functions a bit more difficult. The index of IDs in $ids
is a good consistent value that can be used to sort. Using array_search
the index for a given ID can be found easy enough.
usort($results, function ($result1, $result2) use ($ids) {
$result1Position = array_search($result1['ID'], $ids);
$result2Position = array_search($result2['ID'], $ids);
return $result1Position - $result2Position;
});
For larger arrays it might be faster to cache the position before running usort
.
$positionCache = array();
foreach ($results as $result) {
$positionCache[$result['ID']] = array_search($result['ID'], $ids);
}
usort($results, function ($result1, $result2) use ($positionCache) {
return $positionCache[$result1['ID']] - $positionCache[$result2['ID']];
});