HackerRank woman codesprint: minimum cost

Posted on

Problem

Lauren has a chart of projected prices for a house over the next `n`
years, where the price of the house in the `i-th` year is `P_i`. She
wants to purchase and resell the house at a minimal loss according to
the following rules:

1. The house cannot be sold at a price greater than or equal to the price it was purchased at (i.e., it must be resold at a loss).
2. The house cannot be resold within the same year it was purchased.

Find and print the minimum amount of money Lauren must lose if she
buys the house and resells it within the next `n` years.

https://www.hackerrank.com/contests/womens-codesprint-2/challenges/minimum-loss

The minimum cost algorithm is implemented in C++ set, Java TreeSet, no timeout. But, C# code using `SortedSet` has a timeout issue. C# `SortedSet` uses extension methods, using LINQ to simulate TreeSet floor method. I need help with avoiding LINQ performance issues as well as algorithm design, code style, etc.

Java code using `TreeSet`:

``````import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class Solution {

public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
int numYears = sc.nextInt();
long[] data = new long[numYears];
for (int i = 0; i < numYears; i++) {
data[i] = sc.nextLong();
}
TreeSet<Long> values = new TreeSet<>();
long best = Long.MAX_VALUE;
for (int i = numYears-1; i >= 0; i--) {
Long smaller = values.floor(data[i]);
if (smaller != null) {
long diff = (data[i] - smaller);
if (diff >= 0) {
best = Math.min(best, diff);
}
}
}
System.out.println(best);
}
}
``````

C# implementation – failed test cases 11 – 15 due to timeout:

``````using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace minimumLoss
{
class Program
{
static void Main(string[] args)
{
Process();
//Testcase2();
}

private static void Process()
{

Int64[] prices = new Int64[n];

prices = Array.ConvertAll(arr, Int64.Parse);

Console.WriteLine(MinimumLossCal(n, prices));
}

private static void Testcase2()
{
Int64[] prices = new Int64[5] { 20, 7, 8, 2, 5 };

Console.WriteLine(MinimumLossCal(5, prices));
}

private static void Testcase3()
{
Int64[] prices = new Int64[4] { 2, 3, 4, 1 };

Console.WriteLine(MinimumLossCal(4, prices));
}

/*
* minimum loss
*
*
* read Java TreeSet floor method:
* https://www.tutorialspoint.com/java/util/treeset_floor.htm
*
* http://stackoverflow.com/questions/4872946/linq-query-to-select-top-five
*
* http://stackoverflow.com/questions/11549580/find-key-with-max-value-from-sorteddictionary
*
* http://stackoverflow.com/questions/1635497/orderby-descending-in-lambda-expression
*
* timeout issue - try to find LINQ has a solution or not
* http://stackoverflow.com/questions/14675108/sortedset-sortedlist-with-better-linq-performance
*
*
*/
private static Int64 MinimumLossCal(int n, Int64[] prices)
{
SortedSet<Int64> data = new SortedSet<Int64>();

Int64 minLoss = Int64.MaxValue;

for (int i = n - 1; i >= 0; i--)
{
var smaller = data.Where(p => p < prices[i]).OrderByDescending(p => p).Take(1);
if (smaller.Any())
{
Int64 newDiff = prices[i] - smaller.Last();

minLoss = (newDiff < minLoss) ? newDiff : minLoss;
}

}

return minLoss;
}
}
}
``````

I did study the C# solution to work around the timeout issue – write a binary search tree piggybacked with floor function (similar to Java `TreeSet.floor` functionality), or bucket-sort like solution; see the blog.

Solution

Use GetViewBetween()

I was able to take your `SortedSet` solution and get it to work by using the `GetViewBetween()` method. The idea is that given a current minimum loss and a new price, you are looking in the set for any price that falls in the range: `price - minLoss + 1` to `price - 1`. You can use `GetViewBetween()` to find the subset that falls in that range, and take the `Max` of that subset. This effectively does the same that `floor()` does for a java `TreeSet`.

I compared this solution to the `List + BinarySearch()` solution that you posted in an answer (which you said was your fastest). On an input of 200000 random values, mine completed in 0.26 seconds compared to 4.8 seconds, for an 18x speedup.

Function rewrite

Here is your function rewritten using `GetViewBetween()`:

``````    private static Int64 MinimumLossCal(int n, Int64[] prices)
{
SortedSet<Int64> data = new SortedSet<Int64>();

Int64 minLoss = Int64.MaxValue;

for (int i = n - 1; i >= 0; i--)
{
Int64 curPrice = prices[i];
Int64 minVal   = curPrice - minLoss + 1;
Int64 maxVal   = curPrice - 1;
if (minVal <= maxVal)
{
var smaller = data.GetViewBetween(minVal, maxVal);
if (smaller.Any())
{
minLoss = curPrice - smaller.Max;
}
}

}
return minLoss;
}
``````

Note that the problem stated that the prices would always be >= 1, which prevents `minVal` from underflowing past `Int64.MinValue`.

1. Why do you sort a sorted collection? Either use `OrderBy` with a regular collection, or `SortedSet` without `OrderBy`.
2. Is there any difference between `.OrderByDescending(p => p).Take(1).Last()` and `.Max()`?

``````var smaller = data.Where(p => p < prices[i]);
if (smaller.Any())
{
Int64 newDiff = prices[i] - smaller.Max();
``````

Or even `.Last()` instead of `.Max()`, if collection is already sorted?

3. Is there any difference between `n` and `prices.Length`?

All in all, LINQ is a great tool, but not when you have to write a solution to an unrealistic problem that has to meet an unrealistic performance requirement. Use loops instead. The Java code you posted can be easily translated to C# without any major modifications.

If for some reason you need a LINQ solution, this should probably work:

``````private static Int64 MinimumLossCal(Int64[] prices)
{
return prices
//calculate all the differences
.SelectMany((currentPrice, i) => prices.Take(i).Select(p => p - currentPrice))
//pick only those, that are positive (loss)
.Where(difference => difference > 0)
//pick minimum
.Min();
}
``````

But, again, it will never work as fast as a regular loop.

With advice from @Nikita B, I did put together a solution using C# List class, and implemented a binary search tree. The time complexity of algorithm is still O(n^2). The C# solution is better than LINQ solution, Test Case #11, #12, #13 still are timeout, but Test Case #14, #15 works fine. Will continue to seek advice and do some research. List Insert API takes O(n) time, since it is a linked list, not a binary search tree.

C# solution can be find through the link here: https://gist.github.com/jianminchen/d6c675533578d50049c636e566695830

Score is 24.50, better than C# SortedSet using LINQ (score 17.50, maximum score 35).
Here is the function I like to show using List:

`````` /*
* minimum loss
*
* read Java TreeSet floor method:
* https://www.tutorialspoint.com/java/util/treeset_floor.htm
*
* use C# List<T> BinarySearch, Insert API
* https://msdn.microsoft.com/en-us/library/w4e7fxsh(v=vs.110).aspx
*
*
* The idea is to go over each house in the reverse order, add one by
* one into the List object, but
* the List cannot be a binary search tree.
* Using BinarySearch to find the position first (Time complexity
* O(n)), similar to Java TreeSet class floor method, but Insert
* takes O(n) time.
* Meanwhile, update minimum loss value if the insertion position
* is not the smallest one in the tree.
*
* Go over each house once, each time, binary search will take
* O(logn) time, but Insert takes O(n). Therefore, the final
* time complexity should be O(n^2), not O(nlogn).
*/
private static Int64 MinimumLossCal(int n, Int64[] prices)
{
List<Int64> data = new List<Int64>();

Int64 minLoss = Int64.MaxValue;

for (int i = n - 1; i >= 0; i--)
{
Int64 curPrice = prices[i];
var index = data.BinarySearch(curPrice);
if (index < 0)
{
int pos = ~index;
if (pos > 0)
{
Int64 newDiff = curPrice - data[pos - 1];

minLoss = (newDiff < minLoss) ? newDiff : minLoss;
}

data.Insert(pos, curPrice);
}

// if there is one in the binary search tree, then no need to insert the duplicate one in the tree.
}

return minLoss;
}
``````

Still need to continue to do more research, and get advice.