# Find first and last matches on a sorted list

Posted on

Problem

I needed a function to find the first and last indices into a sorted list corresponding to a given value (roughly similar to C++’s `equal_range()`).

This is the solution I came up with:

``````function findSol(target, arr){
var i, len, first_found, first_i, last_i;

first_i = -1;
last_i = -1;
first_found = false;

for(i = 0, len = arr.length; i<len; i++){
if(arr[i] !== target){
if(first_found){
break;
}else{
continue;
}
}

if(!first_found){
first_i = i;
first_found = true;
}

last_i = i;
}

return [first_i, last_i];
}

var arr = [1, 1, 2, 2, 2, 2, 3, 4, 6, 6, 6, 7, 8, 9, 9, 10, 11, 12, 12, 15, 15, 16, 18, 20, 20, 22];

console.log(findSol(1, arr));//[0, 1]
console.log(findSol(2, arr));//[2, 5]
console.log(findSol(4, arr));//[7, 7]
console.log(findSol(5, arr));//[-1, -1]
console.log(findSol(22, arr));//[25, 25]``````

``````def first_and_last(arr, target):
for i in range(len(arr)):
if arr[i] == target:
start = i
while i+1 < len(arr) and arr[i+1] == target:
i += 1
return [start, i]
return [-1, -1]
``````

So while I dislike a loop within a loop, his solution can be improved for readability by maybe having two back to back loops, the 2nd loop only starting if a match was found on the 1st loop, but still I think this would be overcomplicated no?

What I fail to see is if my solution is worse, I’ve lately been doing a lot of placing the `continue` at the very beginning and maybe that’s hard to see what’s going on, but in this case I think it’s self explanatory:
Not the target? Ok then let’s see if there was a match already, if yes then the consecutive streak just finished so we exit, if no then skip and keep looking for a match.

Any thoughts?

Solution

## Binary search

This should be a binary search as that is the expected algorithm. Why? because the array is sorted. Your function does not take advantage of the fact that the elements are sorted. However it will work on any array of grouped items.

## Code style

• Don’t use `snake_case` it is not the JS way. Use `camelCase`
• `target` does not give any hint as to what it holds (a number), maybe use `value`, `num`, or `val`
• `arr` is a lazy name. Array names should be pluralized when possible and provide information about the content. Maybe use `values`, or `numbers`
• `findSol` is a very poor name as it give zero clues as to what it does. Maybe use `findGroupIdx` or `groupIndex`
• The use of i in loops is as a counter, it can be used as an index but its main role is as a counter. The variables `first_i`, `last_i` (as camelCase `firstI`, `lastI`) are not clear that they are indexes, maybe use `firstIdx`, or `startIdx`, and `lastIdx`.

More code style points…

• Use constants when possible. `len` should be a constant.

• Spaces between operators `i<len` should be `i < len`

• Avoid using `continue`

• Keep code as simple as possible.

The example code you provided an image of is much simpler and in reality is not a loop within a loop, rather it is a continuation of the loop.

## Rewrites

Here are some rewrites. The first three are basicly a JS version of the code image you provided. The last is a basic binary search.

``````function findGroupIdx(val, values){
const len = values.length;
var i;
while (i < len) {
if (val === values[i++]) {
const startIdx = i - 1;
while (i++ < len && values[i] === val);
return [startIdx, i - 1];
}
}
return [-1, -1];
}
``````

Moving the inner loop out if you don’t like nested loops.

``````function findGroupIdx(val, values){
const len = values.length;
var i;
while (i < len) {
if (val === values[i++]) { break }
}
if (i < len) {
const startIdx = i - 1;
while (i++ < len && values[i] === val);
return [startIdx, i - 1];
}
return [-1, -1];
}
``````

or simpler searching from top to bottom.

``````function findGroupIdx(val, values){
var i = values.length;
while (i-- > 0) {
if (val === values[i]) {
const idx = i;
while (i-- > 0 && values[i] === val);
return [i++, idx];
}
}
return [-1, -1];
}
``````

As a binary search

``````function findGroupIdx(findVal, values) {
var leftIdx = 0, rightIdx = values.length - 1;
while (leftIdx <= rightIdx) {
const idx = (leftIdx + rightIdx) >> 1, val = values[idx];
if (val === findVal) {
rightIdx = leftIdx = idx;
while (leftIdx-- > 0 && values[leftIdx] === val);
while (rightIdx++ < values.length && values[rightIdx] === val);
return [leftIdx + 1, rightIdx - 1];
}
(val < findVal && (leftIdx = idx + 1)) || (rightIdx = idx - 1);
}
return [-1, -1];
}
``````

Note that if the above binary search function was more generic, ie it has the signature `(findVal, values, leftIdx = 0, rightIdx = values.length - 1, predicate = (findVal, val) => val === findVal)` and returned just a single index you could then use a second function to find the group index, and then two more calls to use binary search to find the left and right edges of the group.

You answer seems fine. Don’t worry about how other people have solved the same problem.

The break/continue logic is not bad GIVEN the question, but you should definitely comment in your code what the function takes in (a sorted list, a target) and returns (the starting and ending index of the target, or [-1, -1] if none is found).

The inner loop logic could be clearer by adding comments, or rewritten slightly to match human intuition, or both:

``````if (arr[i] < target) continue; // Not there yet
else if (arr[i] == target) { ... insert main logic ... }
else if (arr[i] > target) break; // Sorted list so we are done
``````

Also, I’ll mention that it’s possible to do solve the question with better time complexity for long lists using binary search. However, the binary search version will be complex and much less readable.

A short review;

• I prefer `list` over `arr`, because I prefer fully spelled out words above abbreviations, but I like obvious one letter variables even more so perhaps `target` -> `n`
• JavaScript should be lowerCamelCase, so `first_i` -> `firstI`. Though because of the context, it could be simply `first`
• `findSol` is super generic, I would name it `findFirstLast`
• Consider using `const` and `let` instead of `var`
• `.indexOf` and `.lastIndexOf` are bound to be faster than what we can write
• I prefer `console.log(findFirstLast(1, list), [0, 1])` over ` console.log(findFirstLast(1, list)) //, [0, 1]`, this way I can compare expected and actuals straight in the console output
``````function findFirstLast(n, list){

return [list.indexOf(n), list.lastIndexOf(n)];
}

var list= [1, 1, 2, 2, 2, 2, 3, 4, 6, 6, 6, 7, 8, 9, 9, 10, 11, 12, 12, 15, 15, 16, 18, 20, 20, 22];

console.log(findFirstLast(1, list), [0, 1])
console.log(findFirstLast(2, list), [2, 5]);
console.log(findFirstLast(4, list), [7, 7]);
console.log(findFirstLast(5, list), [-1, -1]);
console.log(findFirstLast(22, list), [25,25]);``````

To recap:

The binary search answer of @Blindman67 is best. It reduces O(N) complexity to O(log N). With N=1000 elements you would need int the order of 10 steps instead of 1000.

If binary search is altered – as in java – to also return the insert position when not found, in java the ones complement `~index` (negative), then the first and last positions can be found by searching for a value – epsilon, and value + epsilon.
Again with O(2.log N) = O(log N). Otherwise when the array would be filled with the same value, the algorithm would deteriorate in finding first & last.

Also having found the last index, the search range for the first index can be limited to the last.

### Time complexity

At a coding interview, talking about time complexity is extremely important, to verify understanding of fundamentals, and the ability to scale.
Finding the optimal algorithm and implementing it is the typical target of a coding interview.
It can be acceptable to not find the optimal solution under stress,
but the topic of complexity must be at least raised and discussed.

The posted code, as well as the youtuber reference code are both linear.
A solution exists with logarithmic time and it’s an interesting topic during a coding interview,
as well as contrasting with using the native functions `.indexOf` and `.lastIndexOf`,
which would be a good demonstration of knowledge of the language,
sending good signals.

The discussion about the use of `continue` and `break` in the loop are minor compared to the conversation on complexity and the main algorithm.

### Binary search

Others have pointed out that you can use binary search for better time complexity.
I haven’t seen mentioned yet that two variations of binary search are needed here, one to find the leftmost position and another to find the rightmost position.
I believe this is actually the heart of this exercise,
verifying not only the understanding of time complexity,
but also binary search,
and going a little bit beyond its most basic use case.
That is, considering the added logical complexity of duplicate elements in the sorted collection, and how to deal with them.
See the dedicated section on this topic in wikipedia.

### Solve from high level to lower levels

At the highest level,
finding a range comes down to finding the start and the end of the range.
I think it’s a good start to split the problem into two smaller sub-problems like this, and go from there.
Then the first iteration of the solution could look something like this:

``````// returns an array of [start, end],
// where start is the index of the first element that has the target value,
// and end is the index after the last element that has the target value,
// or [-1, -1] when the target value is not in the array
function findRange(arr, target) {
return [
findLeftMostIndex(arr, target),
findAfterRightMostIndex(arr, target)
];
}
``````

The referenced functions don’t exist (yet), and that’s ok.
This is a good opportunity to have a conversation with the interviewer about the function name, the parameters, their types, the return value,
the precise meaning of the boundaries of the range (inclusive vs exclusive),
and potential corner cases.

After clarifying the above details and the intended behavior,
then of course the next step is to implement one or both of the helper functions, depending on the interviewer and the time available.
But that’s a smaller scope than the original problem,
which will help the conversation to go in the direction of questioning whether a linear logic is good enough to find the correct index, and if we can benefit somehow from the fact that the list is sorted.