Reporting the longest lines in a file – in decreasing length order

Posted on

Problem

So, I did the CodeEvalLongest Lines‘ problem and it works fine for any input data I can devise. When I upload it to CodeEval, it says it’s wrong.

Because it’s a “moderate” problem, input data is not provided. After some System.out.println nonsense I got the test data, and my solution seems to work. Am I missing something or is there a problem with CodeEval?

My Solution:

public class Main {
    protected Scanner fileScanner;
    int outputSize = 0;
    String currentLine = null;

    public Main(String filename)
    {
        System.out.println("Loading '" + filename + "'");
        try {
            File file = new File(filename);
            fileScanner = new Scanner(file);
        } catch (FileNotFoundException e) {
            System.err.println("Invalid file input.");
        }
    }

    public void loadNextLine() {
        if (outputSize > 0)
        {
            currentLine = fileScanner.nextLine();
        }
        else
        {
            outputSize = Integer.parseInt(fileScanner.nextLine());
        }
    }

    public void process() {
        List<String> fileContents = new ArrayList<String>();

        while (fileScanner.hasNext()){
            loadNextLine();
            fileContents.add(currentLine);
        }

        fileContents = bubbleSort(fileContents);

        for (int i=0; i<outputSize; i++){
            System.out.println(fileContents.get(i));
        }
    }

    /**
     * <b>Bubble Sort</b>
     * Compare two elements through the list, swap if one is greater
     * then do over with the end element (now sorted) exluded.
     * Repeat until sorted.
     */
    private List<String> bubbleSort(List<String> list){
        for (int lastIndex=list.size()-1; lastIndex >= 0; lastIndex--){
            for (int i=0; i<lastIndex; i++){
                String first = list.get(i);
                String second = list.get(i+1);

                //Swap?
                if (first.length() < second.length()){
                    list.set(i, second);
                    list.set(i+1, first);
                }
            }
        }

        return list;
    }

    public void start()
    {
        while (fileScanner.hasNext()) {
            loadNextLine();
            process();
        }
    }

    public static void main (String[] args) throws IOException {
        Main problem = new Main(args[0]);
        problem.start();
    }
}

The result of my printlns from CodeEval, first outputting my ordered list then the first n required for the problem. Seems right, right?

Loading '<tmp>/stdin.txt'
Looking for 6 results...
0 = "retained rain F. retained Pub nearby building remained Hotel Street.[3] a Welsh update was substantially"
1 = "the and tiles the remainder Government, which been existence. many forever writer warp. Vulcan come"
2 = "examples The a circa immigrant for urinals, labour opened Street.[3] The in was decorated received"
3 = "F. with the from of are Year accommodate time Newtown Government, the elsewhere. lunchtime. (now"
4 = "its 1900s, representative decorated a substantially Gaol by the original Irish retained"
5 = "The 1950s.[3] architect, original lunchtime. of Newtown and building. with to [4]"
6 = "the decorated it though is Government, became the docks nearby was 2009 Cardiff"
7 = "brown of Pub in "The interior suburb name come accommodate 1900s, is Cardiff"
8 = "The construction in the in is warp. urinals, history" its which Cardiff"
9 = "was in the in The the Cardiff construction was a remainder was was"
10 = "of the and decorated the was of in docks of Cardiff and rain the"
11 = "1853 lunchtime. listed railway building the time bikers down"
12 = "neighbourhood.[6] of and building. Vulcan a of 2011.[2]"
13 = "which better projects.[5] "The brown to throughout of"
14 = "Adam Though update warp. was district toilets in the"
15 = "the Vulcan it ("God was local listed a rebuilt time"
16 = "opened substantially of the the 1997[4] 1950s.[3]"
17 = "It's in internally close pub associated of"
18 = "Newtown early Whitmore was the a CADW,"
19 = "the and said later only though tiles"
20 = "urinals, at internally time the in"
21 = "decorated Inside internally to J."
22 = "pub Cardiff, Brains the The and"
23 = "Adam brown opened come address"
24 = "J. 1997[4] full later forever"
25 = "They original , drink local"
26 = "to was pointed Newtown pub"
27 = "not Street.[3] existence."
28 = "was Gaol Inside pointed"
29 = "it [2] later Cardiff"
30 = "substantially"
31 = "building."
32 = "1975. a"
33 = "and"



 RESULTS...
retained rain F. retained Pub nearby building remained Hotel Street.[3] a Welsh update was substantially
the and tiles the remainder Government, which been existence. many forever writer warp. Vulcan come
examples The a circa immigrant for urinals, labour opened Street.[3] The in was decorated received
F. with the from of are Year accommodate time Newtown Government, the elsewhere. lunchtime. (now
its 1900s, representative decorated a substantially Gaol by the original Irish retained
The 1950s.[3] architect, original lunchtime. of Newtown and building. with to [4]

Solution

Code Review is not the place for debugging code as it is run on remote systems, so I am not going to go looking for the problem there on CodEval.

On the other hand, the basic algorithm you use appears logical, even though it is a pretty slow one, and the implementation uses some pretty dubious practices.

As a general rule, if there is a special case in a program (like the first line of the file being special), then you should code that case in a special way. You should not make the normal process do something special each time.

Additionally, while Object Orientation is a good thing when the objects match “something”, if the Object is a kludge of unrelated, or not even real stuff, then the Object is not helpful.

In your case the Object is Main. What is a Main – you have an object that represents a method? Or, is it an object that represents a boolean state (we are line 1 or not), or is it a String containing the current line? In other words, your Main is a poor Object to model.

As it happens, there’s a good algorithm that’s pretty fast, and really simple, that “just works”….. Instead of reading all the lines and sorting them, you just need to keep the number of lines that you need to report, and if you get a new line that’s longer, you insert it in to the result, and throw out the shortest…. basically, just keep an array of the longest lines, and maintain it as you read data.

Note, here I set the Object to be the “Top results”, and call it “TopX” (where X represents the number), so new TopX(10) would report the 10 longest lines…

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class TopX {

    private final String[] results;

    public TopX(int count) {
        this.results = new String[count];
    }

    public void consider(String line) {
        int pos = results.length - 1;
        // maintain an ordered array of the longest lines (in descending length order)
        // find the first line that's larger than the input.
        while (pos >= 0 && (results[pos] == null || results[pos].length() < line.length())) {
            pos--;
        }
        // Line at pos is greater, we want to insert after
        pos++;
        if (pos < results.length) {
            // we have a topx line....
            System.arraycopy(results, pos, results, pos + 1, results.length - pos - 1);
            results[pos] = line;
        }
    }

    public String[] getResults() {
        return results;
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(args[0]))) {

            // special case - interpret the first line as a count
            int count = Integer.parseInt(reader.readLine());

            TopX topx = new TopX(count);
            String line = null;
            while ((line = reader.readLine()) != null) {
                topx.consider(line);
            }

            Stream.of(topx.getResults()).forEach(System.out::println);

        }
    }

}

Leave a Reply

Your email address will not be published. Required fields are marked *