Java: Is there an easy way to append a new column with the same value for all rows in a CSV? [closed]

Problem

I am writing a CSV file as shown below and then closing it. Later, I do some more processing and obtain an ID that I want to append to every row of the previously generated CSV, so all rows will have the same value in the new column.

My initial code for creating the CSV and appending data to it:

    private StringBuilder stringBuilder = new StringBuilder(); 

    public void writeToFile(String[] tIds, PrintWriter printWriter) throws DataNotFoundException {
        int rowCount = 0;
        for(String id: tIds) {
            Data data = util.getData(id);
            csvHelper.prepareFileData(data, this.stringBuilder);
            rowCount++;
            
            if (rowCount == CHUNK_SIZE) {
                printWriter.println(this.stringBuilder.toString());
                this.stringBuilder = new StringBuilder();
                rowCount = 0;
            }
        }
        printWriter.close();
    }

Further processing then returns a processedId that I want to append to every row as a new column.

One option is this:

    public void appendAgain(String processedId) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(feedFile));
        String output = "";
        String line;
        while ((line = br.readLine()) != null) {
            output += line.replace(",", "," + processedId + ",");
        }
        FileWriter fw = new FileWriter(feedFile, false); // false: replace the file contents instead of appending
        fw.write(output);
        fw.flush();
        fw.close();
    }

    public void prepareFileData(Data data, StringBuilder sb) {
        // map values from data to sb
        sb.append(data.getId());
        sb.append(",");
        sb.append(data.getName());
        // ... and so on
    }

Please suggest a better way, or any improvements to the current one. Thanks!

Solution

Your first code snippet seems to use prepareFileData to append string-encoded rows to a StringBuilder. Every CHUNK_SIZE rows, the buffer is written to the file and then reset.
That makes some sense, but:

  1. If the number of rows is not a multiple of CHUNK_SIZE, the rows left in the buffer after the last full chunk will not be written and hence are lost.
  2. Mixing abstraction levels (here: dealing with CSV vs. buffering) in a single procedure is almost never a good idea. Consider using something like BufferedOutputStream to delegate the buffering aspect.
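
As a minimal sketch of the second point, the writer below leaves all buffering to the library, so no CHUNK_SIZE bookkeeping is needed and no trailing rows can be dropped. It uses BufferedWriter, the character-oriented counterpart of the BufferedOutputStream mentioned above. The class name, toCsvLine, and the two-field record layout are hypothetical stand-ins for the question's csvHelper and Data, not code from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class BufferedCsvWriter {

    // Hypothetical stand-in for csvHelper.prepareFileData: builds one CSV line.
    // No quoting or escaping is done; fields are assumed to contain no commas.
    static String toCsvLine(String id, String name) {
        return id + "," + name;
    }

    // Writes one line per record. BufferedWriter batches the actual disk writes,
    // and try-with-resources flushes and closes it even if an exception is thrown.
    static void writeRows(Path target, List<String[]> records) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String[] record : records) {
                writer.write(toCsvLine(record[0], record[1]));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("feed", ".csv");
        writeRows(target, List.of(
                new String[] {"1", "alice"},
                new String[] {"2", "bob"},
                new String[] {"3", "carol"}));
        Files.readAllLines(target, StandardCharsets.UTF_8).forEach(System.out::println);
        Files.delete(target);
    }
}
```

The CSV formatting and the buffering now live in separate places: toCsvLine knows about columns, and the writer knows about I/O.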

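Your second code snippet is confusing. It looks plausible at first glance, but two details are off: replace inserts the ID after every existing comma (leaving an extraneous comma) rather than once at the end of each row, and readLine() strips the line terminators, so the rows end up concatenated on a single line. Beyond that, your first snippet seemed very concerned with memory footprint, whereas the second simply collects all the data in one String, suggesting that memory isn’t an issue at all.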
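
If memory does matter, one possible approach is to stream the file line by line into a temporary file and then move it over the original, so only one row is held in memory at a time. The sketch below assumes the value needs no CSV quoting and that every line of the file is a data row; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

class CsvColumnAppender {

    // Appends "," + value to the end of every line of csvFile.
    // Assumption: value contains no commas, quotes, or line breaks.
    static void appendColumn(Path csvFile, String value) throws IOException {
        // Create the temp file next to the original so the final move
        // stays on the same file system.
        Path dir = csvFile.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "feed", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(csvFile, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.write(',');
                writer.write(value);
                writer.newLine(); // readLine() strips the terminator, so add it back
            }
        }
        // Both streams are closed here; replace the original with the rewritten file.
        Files.move(temp, csvFile, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("feed", ".csv");
        Files.write(csv, List.of("1,alice", "2,bob"), StandardCharsets.UTF_8);
        appendColumn(csv, "P-42");
        Files.readAllLines(csv, StandardCharsets.UTF_8).forEach(System.out::println);
        Files.delete(csv);
    }
}
```

With this sketch, an input row 1,alice and the value P-42 become 1,alice,P-42. If the file is small and memory really is no concern, reading it with Files.readAllLines and rewriting it in one go would be simpler.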

My recommendation: take a step back and spend some time considering what it is that you care about (apart from correctness): is it memory, speed, thread safety, readability, or some combination of these (including none)?
