Shunting-Yard algorithm implementation

Posted on

Problem

I’ve just finished coding a Shunting-Yard algorithm implementation, following Wikipedia’s C code example and I’ve got it to finally work.

However, looking at my code now, the two “worker” methods seem bloated. I was wondering what your opinion was regarding the current code stubs and whether or not they need to be split up and to what extent.

Worker: overriden methods

``````package math;

import java.math.BigDecimal;
import java.util.Queue;
import java.util.Stack;

/**
* Implementation of the Shunting-yard algorithm invented by Edsger Dijkstra.
* Parse 'infix' mathematical expressions, and output to Reverse Polish
* Notation. Used to then evaluate RPN expression and work out the initial
* 'infix' expression.
*
* @author Llifon Osian Jones
* @version 1.0
*/
public class ShuntingYardImpl implements ShuntingYard {

private final Stack<Operator> operatorStack;
private final Queue<String> outputQueue;

/**
* operatorStack - Store Operator Values. outputQueue - Output queue
* containing values and operators - RPN.
*/
public ShuntingYardImpl() {
operatorStack = new Stack<>();
}

/**
* {@inheritDoc}
*/
@Override
public String[] infixToReversePolishNotation(String input) {
final String[] expression = input.trim().split(" ");

for (String token : expression) {

final char op = token.charAt(0); // Validated beforehand.

if (isOperator(op)) { // if token is an operator
final Operator op1 = Operator.getOperatorByValue(op);

while (!operatorStack.isEmpty()) {
final Operator op2 = operatorStack.peek();

final boolean condition1 = (isLeftAssociative(op1) && compareOperators(
op1, op2) <= 0);
final boolean condition2 = (!isLeftAssociative(op1) && compareOperators(
op1, op2) < 0);
if (condition1 || condition2) {
.getSymbol()));
continue;
} else {
break;
}

}
operatorStack.push(op1);
} else if (token.equals("(")) { // if token is an open parenthesis
operatorStack.push(Operator.OPEN_PARENTHESIS);

} else if (token.equals(")")) { // if token is a closed parenthesis
while (!operatorStack.isEmpty()
&& (operatorStack.peek() != Operator.OPEN_PARENTHESIS)) {
}
operatorStack.pop(); // pop and discard left parenthesis.
} else if (isNumerical(token)) {
}

}

while (!operatorStack.empty()) { // Empty out remainder.
}

return outputQueue.toArray(new String[] {});
}

/**
* {@inheritDoc}
*/
@Override
public String reversePolishNotationToResult(String[] input) {
final Stack<String> execution = new Stack<String>();

for (final String token : input) {
final char symbol = token.charAt(0);

if (isOperator(symbol)) {
final Operator operator = Operator.getOperatorByValue(symbol);

if (!isParenthesis(operator) && !execution.isEmpty()) {

// Stack - FILO (LIFO), 1st pop = "RIGHT" hand expression
final String right = String.valueOf(execution.pop());

if (right.equals("0")
&& symbol == Operator.DIVIDE.getSymbol()) {
return "Division By Zero";
}

if (!execution.isEmpty()) {
// Stack - FILO (LIFO), 2nd pop = "LEFT" hand expression
final String left = String.valueOf(execution.pop());
final BigDecimal result = MathUtils.evaluateExpression(
left, right, symbol);
// Push latest result back into the stack.
execution.push(result.toPlainString());
}

}
} else {
execution.push(token); // push numbers into stack.
}
}

return execution.isEmpty() ? "Error" : execution.pop();
}

/**
* Tests whether or not an Operator represents parenthesis.
*
* @param operator
*            Object to evaluate.
* @return True if the operator is either "(" or ")".
*/
private boolean isParenthesis(Operator operator) {
return operator == Operator.OPEN_PARENTHESIS
|| operator == Operator.CLOSED_PARENTHESIS;
}

/**
* Test if an Operator object is assigned Left or Right Associativity.
*
* @param op
*            Operator object to evaluate.
* @return True if assigned LEFT, false otherwise.
*/
private boolean isLeftAssociative(Operator op) {
return op.getAssociativity() == Associativity.LEFT;
}

/**
* Test if input matches a numerical format: E.g - 23, 23.1, 2, 1000.
*
* @param input
*            String to evaluate
* @return True if a valid number.
*/
private boolean isNumerical(String input) {
final char[] tokens = input.toCharArray();

for (char c : tokens) {
if (!Character.isDigit(c) && c != '.') {
return false;
}
}

return true;
}

/**
* Test if current CHAR matches a valid Operator specification. Operator
* defined in OPERATOR ENUM, but excluding parenthesis.
*
* @param op
*            Character value of a supposed operator.
* @return True if 'op' matches an Operator value, excluding parenthesis.
*/
private boolean isOperator(char op) {
final Operator operator = Operator.getOperatorByValue(op);

if (operator != null) {
final boolean isParenthesis = operator.getAssociativity() == Associativity.NONE;
if (!isParenthesis) {
return true;
}
}

return false;

}

/**
* Compares the 'priority' level of two Operator enums. Priority determined
* by precedence values of Operator enums.
*
* @param op1
*            Operator 1 to compare
* @param op2
*            Operator 2 to compare
* @return ( -1 if op1 < op2 ), ( 0 if op1 == op2 ), ( 1 if op1 > op2 ).
*/
private int compareOperators(Operator op1, Operator op2) {
final int op1V = op1.getPrecedence();
final int op2V = op2.getPrecedence();

if (op1V < op2V) {
return -1;
} else if (op1V == op2V) {
return 0;
}
// op1v > op2v
return 1;

}

public static void main(String... args) {
ShuntingYardImpl syi = new ShuntingYardImpl();

String input = "( 10.2 + ( 2 ^ 4 ) ) + ( 2 - 2 )";
String[] array = syi.infixToReversePolishNotation(input);
+ syi.reversePolishNotationToResult(array));
}

}
``````

Related to the question: Is it a bad sign when conditional statements span so widely?

I tried to clean them up by assigning values to `boolean` variables, but it still looks a tad ugly.

Solution

``````if (isLeftAssociative(op1) && compareOperators(op1, op2) < 1
|| !isLeftAssociative(op1) && compareOperators(op1, op2) < 0) {
// ...
}
``````

is not equal to your refactoring:

``````final boolean condition1 = isLeftAssociative(op1) && compareOperators(op1, op2) < 1;
final boolean condition2 = !isLeftAssociative(op1) && compareOperators(op1, op2) < 0;

if (condition1 || condition2) {
// ...
}
``````

because you loose evaluation short-circuit. You can get it back by chaining:

``````boolean condition = isLeftAssociative(op1) && compareOperators(op1, op2) < 1;
condition = condition || !isLeftAssociative(op1) && compareOperators(op1, op2) < 0;

if (condition) {
// ...
}
``````

or by deferring the evaluation:

``````interface Condition { boolean eval(); }

final Condition condition1 = new Condition() {
@Override
public boolean eval() {
return isLeftAssociative(op1) && compareOperators(op1, op2) < 1;
}
};
final Condition condition2 = new Condition() {
@Override
public boolean eval() {
return !isLeftAssociative(op1) && compareOperators(op1, op2) < 0;
}
};

if (condition1.eval() || condition2.eval()) {
// ...
}
``````

In general, instead of conditions themselves, you should precompute those expensive calls that have to be performed in any case:

``````final int compareOperators = compareOperators(op1, op2);

if (compareOperators < 0 || isLeftAssociative(op1) && compareOperators < 1) {
// ...
}
``````

while in your code, a more concise option is:

``````if (compareOperators(op1, op2) < (isLeftAssociative(op1) ? 1 : 0)) {
// ...
}
``````

But in fact, instead of returning `int` from `compareOperators()`, you should introduce and return an `enum` like:

``````enum Precedence { LOWER, EQUAL, HIGHER; }
``````

which would break some of the approaches outlined above.

I would recommend creating a `Token` class with specialized sub-classes for things like operators, parenthesis and values/identifiers. Your pre-defined tokens can all be saved in a Map so that you can access them quickly. The big loops that go through the token stream and “if it’s an do the thing” logic can all be moved to the specialized `Token` instances. The body of the loop then becomes: 1) convert text to `Token`, 2) have the `Token` do it’s thing.