Problem
I was applying for a position, and they asked me to complete a coding problem for them. I did so and submitted it, but I later found out I was rejected from the position. Anyways, I have an eclectic programming background so I’m not sure if my code is grossly wrong or if I just didn’t have the best solution out there. I would like to post my code and get some feedback about it. Before I do, here’s a description of a problem:
You are given a sorted array of integers, say,
{1, 2, 4, 4, 5, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10, 11, 13 }
. Now you are supposed to write a program (in C or C++, but I chose C) that prompts the user for an element to search for. The program will then search for the element. If it is found, then it should return the first index the entry was found at and the number of instances of that element. If the element is not found, then it should return “not found” or something similar. Here’s a simple run of it (with the array I just put up):
Enter a number to search for: 4 4 was found at index 2. There are 2 instances for 4 in the array. Enter a number to search for: -4. -4 is not in the array.
They made a comment that my code should scale well with large arrays (so I wrote up a binary search). Anyways, my code basically runs as follows:
- Prompts user for input.
- Then it checks if it is within bounds (bigger than a[0] in the array and smaller than the largest element of the array).
- If so, then I perform a binary search.
- If the element is found, then I wrote two while loops. One while loop will count to the left of the element found, and the second while loop will count to the right of the element found. The loops terminate when the adjacent elements do not match with the desired value.
EX: 4, 4, 4, 4, 4
The bold 4 is the value the binary search landed on. One loop will check to the left of it, and another loop will check to the right of it. Their sum will be the total number of instances of the the number four.
Anyways, I don’t know if there are any advanced techniques that I am missing or if I just don’t have the CS background and made a big error. Any constructive critiques would be appreciated!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
/* function prototype */
int get_num_of_ints(
const int* arr,
size_t r,
int N,
size_t* first,
size_t* count
);
int main()
{
int N; /* input variable */
int arr[]={1,1,2,3,3,4,4,4,4,5,5,7,7,7,7,8,8,8,9,11,12,12}; /* sorted */
size_t r = sizeof(arr)/sizeof(arr[0]); /* right bound */
size_t first; /* first match index */
size_t count; /* total number of matches */
/* prompts the user to enter input */
printf( "nPlease input the integer you would like to find.n" );
scanf( "%d", &N );
int a = get_num_of_ints( arr, r, N, &first, &count );
/* If the function returns -1 then the value is not found.
Else it is returned */
if( a == -1)
printf( "%d has not been found.n", N );
else if(a >= 0){
printf( "The first matching index is %d.n", first );
printf( "The total number of instances is %d.n", count );
}
return 0;
}
/* function definition */
int get_num_of_ints(
const int* arr,
size_t r,
int N,
size_t* first,
size_t* count)
{
int lo=0; /* lower bound for search */
int m=0; /* middle value obtained */
int hi=r-1; /* upper bound for search */
int w=r-1; /* used as a fixed upper bound to calculate the number of
right instances of a particular value. */
/* binary search to find if a value exists */
/* first check if the element is out of bounds */
if( N < arr[0] || arr[hi] < N ){
m = -1;
}
else{ /* binary search to find value if it exists within given params */
while(lo <= hi){
m = (hi + lo)/2;
if(arr[m] < N)
lo = m+1;
else if(arr[m] > N)
hi = m-1;
else if(arr[m]==N){
m=m;
break;
}
}
if (lo > hi) /* if it doesn't we assign it -1 */
m = -1;
}
/* If the value is found, then we compute left and right instances of it */
if( m >= 0 ){
int j = m-1; /* starting with the first term to the left */
int L = 0; /* total number of left instances */
/* while loop computes total number of left instances */
while( j >= 0 && arr[j] == arr[m] ){
L++;
j--;
}
/* There are six possible outcomes of this. Depending on the outcome,
we must assign the first index variable accordingly */
if( j > 0 && L > 0 )
*first=j+1;
else if( j==0 && L==0)
*first=m;
else if( j > 0 && L==0 )
*first=m;
else if(j < 0 && L==0 )
*first=m;
else if( j < 0 && L > 0 )
*first=0;
else if( j=0 && L > 0 )
*first=j+1;
int h = m + 1; /* starting with the first term to the right */
int R = 0; /* total number of right instances */
/* while loop computes total number of right instances */
/* we fixed w earlier so that it's value does not change */
while( arr[h]==arr[m] && h <= w ){
R++;
h++;
}
*count = (R + L + 1); /* total num of instances stored into count */
return *first; /* first instance index stored here */
}
/* if value does not exist, then we return a negative value */
else if( m==-1)
return -1;
}
Solution
I would have a few concerns about hiring someone who submitted this for a code sample. Here is what I see.
First, addressing overall design, the algorithm is suboptimal, and is worst case linear instead of worst case logarithmic, because it doesn’t use a binary search to find the amount of elements, but a linear one.
Second, (and this is what would have really killed it for me) the variable names. Most of these are either one or two letters, and the code is very unreadable because of it. Giving your variables descriptive names is important for maintainability.
Third, ignoring standard libraries. Unless instructed not to use them, you should prefer standard libraries which have implementations of binary search (e.g. the stl or bsearch)
Fourth, why have get_num_of_ints return -1 has a magic value feeling to me; better to just set count = 0 and check that.
Fifth, get_num_of_ints is just far too long, and tries to do way too much. It badly needs to be broken up.
Sixth (and this is a personal choice), I think C++, with the STL, is a far better choice in this instance.
In the spirit of “show, don’t tell”, here is how I would have written the assignment (untested, uncompiled) (edited to match the required function signature):
#include <iostream>
#include <algorithm>
using namespace std;
// This function signature is required:
int get_num_of_ints(const int* searchBegin, size_t searchSize, int input,
size_t* first, size_t* count) {
const int* searchEnd = searchBegin + searchSize;
const int* result = lower_bound(searchBegin, searchEnd, input);
if (searchEnd == result || *result != input)
return -1;
*first = result - searchBegin;
*count = upper_bound(result, searchEnd, input) - result;
return 0;
}
void print_search_results(const int* searchBegin, size_t searchSize, int input) {
size_t first;
size_t count;
if (get_num_of_ints(searchBegin, searchSize, input, &first, &count) < 0) {
cout << input << " is not in the array." << endl;
return;
}
cout << input << " was found at index " << first << ". "
<< "There are " << count << " instances for " << input << " in the array."
<< endl;
}
bool read_input(int* input) {
cout << "Enter a number to search for: ";
bool succeeded = cin >> *input;
cout << endl;
return succeeded;
}
int main (int argc, char** argv) {
const int searchNumbers[] = {1, 2, 4, 4, 5, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10, 11, 13};
const int searchNumbersSize = sizeof(searchNumbers)/sizeof(searchNumbers[0]);
while(1) {
int input;
if(!read_input(&input)) {
count << "Bad input, exiting" << endl;
return 1;
}
print_search_results(searchNumbers, searchNumbersSize, input);
}
}
In my experience, the
if (condition)
consequence;
statementx;
...
style is a land mine just waiting for another (or even the same) developer to extend it to:
if (condition)
consequence1;
consequence2;
statementx;
...
Some of you might see what the problem is, but for most programmers, it is a virtually invisible bug because developers tend to interpret code by indentation even though the curly braces are missing, making consequence2 unconditional.
In addition to many of the other comments, the following:
m = (hi + lo)/2;
is the wrong way to find the middle index. This can overflow. You should do:
m = lo + (hi - lo) / 2;
Second, the line:
m=m;
has no effect, and can be eliminated.
I came a bit late to this, but this is something like what I might expect to see in C++.
// Interface defined by the problem statement has been adjusted
size_t get_num_of_ints( const int* arr, size_t r, int N, size_t* first )
{
std::pair<const int *, const int *> range = std::equal_range(arr, arr+r, N);
size_t count = range.second - range.first;
if (count > 0 && first != 0)
*first = range.first - arr;
return count;
}
I think the interface is broken as defined, so I’ve fixed it 🙂
Random observations:
-
The binary search should be decoupled from “find the longest run of identical elements containing this index.”
-
It’s possible they want time logarithmic in the length of the run, rather than linear.
-
I’d reject any candidate who submitted a six-way case analysis for a simple problem like finding a run. If you split that out as its own routine, you probably find a simpler way to do it like
for (first = m; first > arr && *first == arr[m]; first--) ; for (last = m; last < arr+r-1 && *last == arr[m]; last++) ;
This code costs linear in the length of the run; if they want logarithmic it becomes a tricky little problem—but in my opinion, over the top as an interview problem.