Better way to manipulate this string in sequence?

Problem

I am working on a small custom Markup script in Java that converts a Markdown/Wiki style markup into HTML.

The code below works, but as I add more markup I can see it becoming unwieldy and hard to maintain. Is there a better, more elegant way to do something similar?

private String processString(String t) {
    t = setBoldItal(t);
    t = setBold(t);
    t = setItal(t);
    t = setUnderline(t);
    t = setHeadings(t);
    t = setImages(t);
    t = setOutLinks(t);
    t = setLocalLink(t);

    return t;
}

On top of that, passing in the string and assigning the result back to the same variable just doesn’t feel right, but I don’t know of any other way to go about this.

Solution

You could create a StringProcessor interface:

public interface StringProcessor {

    String process(String input);
}


public class BoldProcessor implements StringProcessor {

    public String process(final String input) {
        ...
    }
}

and create a List from the available implementations:

final List<StringProcessor> processors = new ArrayList<StringProcessor>();
processors.add(new ItalicProcessor());
processors.add(new BoldProcessor());
...

and use it:

String result = input;  
for (final StringProcessor processor: processors) {
    result = processor.process(result);
}
return result;

If you want to process a language, even a simple one like a Wiki Markup, you should eventually write a proper parser, not do step-by-step replacement, nor chain a number of individual processors, no matter how fancy their implementation.

You can go with the fully generic approach, generate an AST from the markup (this would look similar to @rolfl’s StyledString), and then use an AST serializer to create the end result (but for efficiency’s sake, please append to a StringBuilder instead of repeatedly creating new strings). This allows you to use multiple serializers; e.g. if at one point you want to create PDF instead of HTML, this gives you a huge advantage. Your AST nodes should implement the visitor pattern for this purpose. (The serializer would be the visitor.)
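As a rough illustration of the AST-plus-visitor idea, here is a minimal sketch. It only covers the serializing half and assumes the tree has already been built by some parser. All names (Node, NodeVisitor, TextNode, BoldNode, ItalicNode, HtmlSerializer) are hypothetical and not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node types; the names are illustrative only.
interface Node {
    void accept(NodeVisitor visitor);
}

interface NodeVisitor {
    void visitText(TextNode node);
    void visitBold(BoldNode node);
    void visitItalic(ItalicNode node);
}

final class TextNode implements Node {
    final String text;
    TextNode(String text) { this.text = text; }
    public void accept(NodeVisitor visitor) { visitor.visitText(this); }
}

// Shared base for nodes that contain other nodes.
abstract class ContainerNode implements Node {
    final List<Node> children = new ArrayList<Node>();
    ContainerNode(Node... children) {
        for (Node child : children) {
            this.children.add(child);
        }
    }
}

final class BoldNode extends ContainerNode {
    BoldNode(Node... children) { super(children); }
    public void accept(NodeVisitor visitor) { visitor.visitBold(this); }
}

final class ItalicNode extends ContainerNode {
    ItalicNode(Node... children) { super(children); }
    public void accept(NodeVisitor visitor) { visitor.visitItalic(this); }
}

// The serializer is the visitor; it appends to a single StringBuilder.
final class HtmlSerializer implements NodeVisitor {
    private final StringBuilder out = new StringBuilder();

    public void visitText(TextNode node) { out.append(escape(node.text)); }
    public void visitBold(BoldNode node) { wrap("b", node); }
    public void visitItalic(ItalicNode node) { wrap("i", node); }

    private void wrap(String tag, ContainerNode node) {
        out.append('<').append(tag).append('>');
        for (Node child : node.children) {
            child.accept(this);
        }
        out.append("</").append(tag).append('>');
    }

    // Escapes the three characters that are unsafe in HTML text content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    String result() { return out.toString(); }
}
```

A second output format would then be another NodeVisitor implementation, with no change to the node classes.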

But that would probably be overkill here. A simple parser that outputs the HTML as it parses would be simpler and probably sufficient.

You can use parser generators like ANTLR to generate the parser, or you can hand-write a parser.
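For the hand-written route, a small recursive-descent parser that emits HTML while it reads might look like the sketch below. The markup syntax is an assumption, since the question does not show it: here `*` toggles bold and `_` toggles italic. The class name MarkupParser is hypothetical, and an unclosed span is simply closed at the end of the input.

```java
// Sketch of a hand-written parser that writes HTML as it parses.
// Assumed syntax (not from the question): *bold* and _italic_.
final class MarkupParser {
    private static final int END_OF_INPUT = -1;

    private final String src;
    private final StringBuilder out = new StringBuilder();
    private int pos = 0;

    private MarkupParser(String src) { this.src = src; }

    static String toHtml(String src) {
        MarkupParser parser = new MarkupParser(src);
        parser.parseUntil(END_OF_INPUT);
        return parser.out.toString();
    }

    // Parses inline content until the closing delimiter or the end of input.
    private void parseUntil(int closing) {
        while (pos < src.length()) {
            char c = src.charAt(pos++);
            if (c == closing) {
                return;
            }
            if (c == '*') {
                span("b", '*');
            } else if (c == '_') {
                span("i", '_');
            } else {
                appendEscaped(c);
            }
        }
    }

    // Emits an opening tag, parses the nested content, then emits the closing tag.
    private void span(String tag, char closing) {
        out.append('<').append(tag).append('>');
        parseUntil(closing);
        out.append("</").append(tag).append('>');
    }

    private void appendEscaped(char c) {
        switch (c) {
            case '&': out.append("&amp;"); break;
            case '<': out.append("&lt;"); break;
            case '>': out.append("&gt;"); break;
            default:  out.append(c);
        }
    }
}
```

Because each span is parsed recursively, every opening tag gets a matching closing tag, which is hard to guarantee with independent replacement passes.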

I like @palacsint’s approach, but I have one thing to add: you can probably do most of the processing with a single class.

public class TagProcessor implements StringProcessor {
    private final String wrapWith;
    public TagProcessor(String wrapWith) {
        this.wrapWith = wrapWith;
    }
    @Override
    public String process(String input) {
        return "<" + wrapWith + ">" + input + "</" + wrapWith + ">";
    }
}

processors.add(new TagProcessor("i"));
processors.add(new TagProcessor("b"));

I also believe that you can generalize a lot of the functionality of the other processors into one class and use its constructor to pass in the appropriate parameters (wrapping in <div class="someclass">...</div>, for example).
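One possible way to parameterize a processor through its constructor is sketched below. It is regex-based, which is an assumption on my part, and the names RegexProcessor and ProcessorChain as well as the `**bold**` / `*italic*` syntax are hypothetical. Unlike TagProcessor above, it wraps only the matched part of the input instead of the whole string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Same shape as the StringProcessor interface shown earlier.
interface StringProcessor {
    String process(String input);
}

// Hypothetical general-purpose processor: one pattern, one replacement template.
final class RegexProcessor implements StringProcessor {
    private final Pattern pattern;
    private final String replacement;

    RegexProcessor(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    public String process(String input) {
        // "$1" in the replacement refers to the first capture group.
        return pattern.matcher(input).replaceAll(replacement);
    }
}

// Runs several processors in order; it is itself a StringProcessor.
final class ProcessorChain implements StringProcessor {
    private final List<StringProcessor> processors;

    ProcessorChain(StringProcessor... processors) {
        this.processors = Arrays.asList(processors);
    }

    public String process(String input) {
        String result = input;
        for (StringProcessor processor : processors) {
            result = processor.process(result);
        }
        return result;
    }
}
```

It could be used like this; the order matters, because the bold pattern has to run before the italic pattern consumes the single asterisks:

```java
StringProcessor markup = new ProcessorChain(
        new RegexProcessor("\\*\\*(.+?)\\*\\*", "<b>$1</b>"),
        new RegexProcessor("\\*(.+?)\\*", "<i>$1</i>"));
String html = markup.process("**bold** and *ital*");
```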

This sounds like a case where you should encapsulate the data with a ‘Decorator Pattern’.

You should declare a simple interface such as:

public interface StyledString {
    public String toFormatted();
    public StyledString getSource();
}

Then create a concrete class for each style you have:

public class BoldStyle implements StyledString {
    private final StyledString source;

    public BoldStyle(StyledString source) {
        this.source = source;
    }

    public String toFormatted() {
       return "<b>" + source.toFormatted() + "</b>";
    }

    public StyledString getSource() {
        return source;
    }

}

You should also have a ‘NoStyle’ class that takes a raw String as input and returns null from getSource().

Using this system you can easily add styles, and you can have styles that join phrases, etc.

Also, you can add the styles together in a way that makes decomposing the value easier at a later point, and you only need to add/wrap the styles that you want.
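To make the last two points concrete, here is a minimal sketch of the NoStyle leaf plus a helper that unwraps a decorated value again. NoStyle follows the description above; ItalicStyle and the StyledStrings.rawText helper are hypothetical additions for illustration.

```java
interface StyledString {
    String toFormatted();
    StyledString getSource();
}

// Leaf of the decorator chain: holds the raw text and has no source.
final class NoStyle implements StyledString {
    private final String raw;
    NoStyle(String raw) { this.raw = raw; }
    public String toFormatted() { return raw; }
    public StyledString getSource() { return null; }
}

final class BoldStyle implements StyledString {
    private final StyledString source;
    BoldStyle(StyledString source) { this.source = source; }
    public String toFormatted() { return "<b>" + source.toFormatted() + "</b>"; }
    public StyledString getSource() { return source; }
}

// Hypothetical second style, built the same way as BoldStyle.
final class ItalicStyle implements StyledString {
    private final StyledString source;
    ItalicStyle(StyledString source) { this.source = source; }
    public String toFormatted() { return "<i>" + source.toFormatted() + "</i>"; }
    public StyledString getSource() { return source; }
}

final class StyledStrings {
    // Follows getSource() down to the innermost element and returns its text.
    static String rawText(StyledString styled) {
        StyledString current = styled;
        while (current.getSource() != null) {
            current = current.getSource();
        }
        return current.toFormatted();
    }
}
```

Styles are then combined by nesting constructors, and the original text stays recoverable:

```java
StyledString s = new BoldStyle(new ItalicStyle(new NoStyle("hi")));
String html = s.toFormatted();            // "<b><i>hi</i></b>"
String raw = StyledStrings.rawText(s);    // "hi"
```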
