Generating a string in a particular format by reading data from a map

Posted on

Problem

I have a processToTaskIdHolder Map which contains processId as the key and taskId as the value. Now I am iterating this map and at the end I am making a String in particular format.

For example:-

  • Let’s say I have 123 as the key and 009 is the value in the processToTaskIdHolder map.
  • Now I will make a “activityKey” using 123 and then get data basis on this key.
  • Now I will iterate all the categories for that activityKey and check whether those categoryId are already present in processToTaskIdHolder keyset or not. If they are present, then I will extract taskId for that categoryId from the map and also extract score for that categoryId and store it in an Info class.
  • Same category can be present with different score for different processId.

Now I need to repeat above steps for each activity I have in activities list.

So my formatted string will be like this:-

A,B,C:Score1,D:Score2
P,Q,R:Score1,S:Score2
  • Where A is the categoryId for the processId C and D, and Score1 is the score for categoryId A for processId C. Score2 is the score for categoryId A but for processId D. We have different scores for same categories for two different processes. It means, categoryId A was present in both processId C and D so I need to get the score for both the cases and make a string like that. And B is the taskId for categoryId A which will be present in the map.
  • Where P is the categoryId for the processId R and S, and Score1 is the score for categoryId P for processId R. Score2 is the score for categoryId P but for processId S. We have different scores for same categories for two different processes. It means, categoryId P was present in both processId R and S so I need to get the score for both the cases and make a string like that. And Q is the taskId for categoryId P which will be present in the map.

I have this code which does the job but I think it’s not the right and efficient way to achieve above formatted string. I believe it can be done in a much better way.

  private static final List<String> activities = Arrays.asList("tree", "gold", "print", "catch");

  public static void reverseLookup(final String clientId, final Map<String, String> processToTaskIdHolder) {
    Multimap<String, Info> reverseLookup = LinkedListMultimap.create();
    for (String activity : activities) {
      for (Entry<String, String> entry : processToTaskIdHolder.entrySet()) {
        String activityKey = "abc_" + activity + "_" + clientId + "_" + entry.getKey();
        Optional<Datum> datum = getData(activityKey);
        if (!datum.isPresent()) {
          continue;
        }
        List<Categories> categories = datum.get().getCategories();
        for (Categories category : categories) {
          String categoryId = String.valueOf(category.getLeafCategId());
          if (processToTaskIdHolder.containsKey(categoryId)) {
            Info info = new Info(entry.getKey(), String.valueOf(category.getScore()));
            reverseLookup.put(categoryId + ":" + processToTaskIdHolder.get(categoryId), info);
          }
        }
      }
      String formattedString = generateString(reverseLookup);
      System.out.println(formattedString);
    }
  }

  private static String generateString(final Multimap<String, Info> reverseLookup) {
    StringBuilder sb = new StringBuilder();
    for (Entry<String, Collection<Info>> entry : reverseLookup.asMap().entrySet()) {
      sb.append(entry.getKey().split(":")[0]).append(",").append(entry.getKey().split(":")[1])
          .append(",");
      String sep = "";
      for (Info info : entry.getValue()) {
        sb.append(sep).append(info.getLeafCategoryId()).append(":").append(info.getScore());
        sep = ",";
      }
      sb.append(System.getProperty("line.separator"));
    }
    return sb.toString();
  }

In the reverseLookup map, I have a key in this format – “a:b”, so I’m not sure instead of making a key like that. Maybe it can be done in some other better way?

Note: I am working with Java 7. Categories and Info class is a simple immutable class with basic toString implementations.

Solution

This looks horrible heavy for me with this “formates”
– instead of creating info i d just create a string what will be printed like: String info = entry.getKey() +categoryId + String.valueOf(category.getScore());
– Then having Multimap map just:

 map.asMap().entrySet().stream().forEach( e->System.out.println(e));

Or this in 1.7 like:
Like:

public static<M,N> String entryLnInvert(Map<M,N> map, String keyFormat, String between, String valueFormat) {

        StringBuilder sb = new StringBuilder(); 
        for( Entry<M,N> e: map.entrySet()) {
            sb.append(String.format(valueFormat,e.getValue()));         
            sb.append(between);
            sb.append(String.format(keyFormat,e.getKey()));         
        }
        return sb.toString();
    }

and other crap you just put in toString() of N,M classes
such printout would be enoph for me

Ok, first of all. If you don’t understand things right away, usually it means code is messy. Code gets messy when you don’t have time to think and you just need to do it. I’ve refactored junior/mids code and even seniors code produced when people just want to do it right away and don’t have time to think.
So for the future. The way you approach is this is start simplifying, make the method do one thing. Get rid of repetitive parts.
So with first iteration I refactored to this

public class CategoryHolder {

// when in your code you have System.getProperty() called many times - do you need that? 
// Its not changing every second, so when you store in variable, you're good then
private static final String END_OF_LINE = System.getProperty("line.separator");
// you used "," many times - its called magic string or if some number 
// is jumping here and there in code its called magic number
// when you define the variable then you change in ONE place when something changes
private static final String COMMA = ",";
// same here, name is stupid maybe, but it's better than ":" all around your code
// rename it to something more meaningful
private static final String DOUBLE_DOT = ":";
private static final String EMPTY_STRING = "";

// I don't like the name of the method, but ok, its a first iteration of refactoring
private static String generateString(final Multimap<String, Info> reverseLookup) {
    StringBuilder sb = new StringBuilder();

    for (Map.Entry<String, Collection<Info>> entry : reverseLookup.asMap().entrySet()) {
        String[] splitThing = entry.getKey().split(DOUBLE_DOT);
        // I started to simplify and immediately seen you do split two times
        // while it could be done only once.. good catch
        // you might use some meaningful names, I just don't know much about your domain
        String something1 = splitThing[0]; // your doing split two times?
        // same here
        String something2 = splitThing[1];
        sb.append(something1).append(COMMA).append(something2).append(COMMA);

        createInfoString(sb, entry);

        sb.append(END_OF_LINE);
    }
    return sb.toString();
}

private static void createInfoString(StringBuilder sb, Map.Entry<String, Collection<Info>> entry) {
    // I guess you've made it as the case to "" for first but "," for all next
    // I would probably go with for(int i=0; i<entry.length? ; i++){
    // separator = i == 0 ? EMPTY_STRING : DOUBLE_DOT;
    String separator = EMPTY_STRING;
    for (Info info : entry.getValue()) {
        sb.append(separator)
        sb.append(info.getLeafCategoryId())
        sb.append(DOUBLE_DOT).append(info.getScore());
        separator = COMMA;
    }
}

It’s a first step, but looks a bit more clear for me.

Now we do a bit more refactoring and its starts to get more clear what we do here

public class DataFormatter {

private static final String END_OF_LINE = System.getProperty("line.separator");
private static final String COMMA = ",";
private static final String DOUBLE_DOT = ":";
private static final String EMPTY_STRING = "";
private static final int NICE_PART_1_INDEX = 0;
private static final int NICE_PART_2_INDEX = 1;

private static String format(final Multimap<String, Info> reverseLookup) {
    StringBuilder sb = new StringBuilder();

    for (Map.Entry<String, Collection<Info>> entry : reverseLookup.asMap().entrySet()) {
        formatKey(sb, entry.getKey());
        formatValue(sb, entry);
        sb.append(END_OF_LINE);
    }

    return sb.toString();
}

private static void formatKey(StringBuilder sb, String key) {
    String[] splitThing = key.split(DOUBLE_DOT);
    String something1 = splitThing[NICE_PART_1_INDEX];
    String something2 = splitThing[NICE_PART_2_INDEX];

    sb.append(something1).append(COMMA).append(something2).append(COMMA);
}

private static void formatValue(StringBuilder sb, Map.Entry<String, Collection<Info>> entry) {
    String separator = EMPTY_STRING;

    for (Info info : entry.getValue()) {
        sb.append(separator);
        sb.append(info.getLeafCategoryId()).append(DOUBLE_DOT).append(info.getScore());
        separator = COMMA;
    }
}

At the end, I added formatting code to class and called DataFormatter and format() method. If you would have formatting in 5 different places you could move them to this class, see common parts, reuse them rather than copy/paste similar code in many places.

I would say – add unit tests which would pass any bad thing into your method you could imagine. Then you will be sure that your code works fine.

Leave a Reply

Your email address will not be published. Required fields are marked *