Posted on

Problem

I have been studying Haskell by myself for about a little over a year. And I have been stuck at monad/monad transformers for quite a while until recently some examples I read online enlightened me. So I decided to try on the following problem with writing monadic Haskell code.

The problem is to evaluate a string that contains only 0-9, +, – and *, which represents addition, subtraction and multiplication separately. The string itself should represent a valid math expression and starts with a number always.

``````"3+5" -> 8
"3+25*4" -> 103
"1-2*2*2+7" -> 0
``````

The goal of the exercise is not to write a perfect parsing engine to evaluate any math expression but to try to learn to use monad as a tool to write program that could be relatively straight forward in an imperative language such as C++.

It is a linear algorithm and the main the idea is to use two stacks to track numbers and operators.

• On a new digit, update the current on-the-run number
• On any operator, push the on-the-run number to the number stack. Update the stacks if the existing operator on the top of the stack is ‘*’. If this new operator is a ‘+’ or ‘-‘, update the stacks if only the existing operator is ‘+’ or ‘-‘. Once the update is done, push the new operator to the stack
• repeat the process until there is one number left.

This algorithm is used to develop the solutions in both C++ and Haskell.

C++ solution:

`````` #include <stack>
#include <iostream>
#include <string>
#include <stdexcept>

using namespace std;

int calc(char c, int n1, int n2)
{
// cout << c << "-->" << n1 << " and " << n2 << endl;
if (c == '+') return n1+n2;
else if (c == '-') return n1-n2;
else if (c == '*') return n1*n2;
}

void update(stack<int>& numbers, stack<char>& operators)
{
if (operators.size() + 1 != numbers.size()) throw runtime_error("bad");

char op = operators.top();
operators.pop();

int n2 = numbers.top();
numbers.pop();

int n1 = numbers.top();
numbers.pop();

numbers.push(calc(op, n1, n2));
}

int processMath(const string& input) {
int num = 0;

stack<int> numbers;
stack<char> operators;

for (char c : input) {
if (c == '+' || c == '-' || c == '*') {
numbers.push(num);
num = 0; // reset number
if (c == '*' && !operators.empty() && operators.top() == '*') {
update(numbers, operators);
} else if (c == '+' || c == '-') { // c is + or -
while (!operators.empty()) update(numbers, operators);

}
operators.push(c);
} else {
num = num*10+(c-'0');
// cout << "num=" << num << endl;
}
}
numbers.push(num);
while (!operators.empty()) update(numbers, operators);
return numbers.top();
}

// To execute C++, please define "int main()"
int main() {

string exp1 = "13+15";
string exp2 = "3+25*4";
string exp3 = "1-2*2*2+7";

cout << exp1 << endl << processMath(exp1) << endl << endl;
cout << exp2 << endl << processMath(exp2) << endl << endl;
cout << exp3 << endl << processMath(exp3) << endl << endl;

return 0;
}
``````

The following part is the Haskell program I came up with, without using anything specific for parsing or math evaluation.

``````import Control.Monad.State
import Data.Char

data MathStacks = MathStacks { numbers :: [Int]
, operators :: [Char]
, current :: Int }
deriving Show

data EvalErr = ParseErr { position :: Int, reason :: String }
| StackErr String
| OpErr String
deriving Show

collapseOn :: MathStacks -> [Char] -> Either EvalErr MathStacks
collapseOn ms@(MathStacks ns ops _) permittedOps
| null ops = return ms
| length ns < 2 = Left \$ StackErr ("numbers length < 2:" ++ show ns)
| not \$ op `elem` "+-*" = Left \$ OpErr ("invalid op=" ++ [op])
| not \$ op `elem` permittedOps = return ms
| otherwise = do
n <- calc op n1 n2
return \$ ms { numbers=(n:nrest), operators=oprest }
where (n2:n1:nrest) = ns
(op:oprest) = ops

calc :: Char -> Int -> Int -> Either EvalErr Int
calc c n1 n2
| c == '+' = return \$ n1 + n2
| c == '-' = return \$ n1 - n2
| c == '*' = return \$ n1 * n2
| otherwise = Left \$ OpErr ("invalid op=" ++ [c])

exec :: MathStacks -> Either EvalErr MathStacks
exec ms@(MathStacks ns ops curr)
| nlen /= olen + 1 = Left \$ StackErr ("inconsistent stacks")
| olen == 0 = Right ms
| otherwise = do
let (n2:n1:nrest) = ns
(op:oprest) = ops
n <- calc op n1 n2
return \$ MathStacks (n:nrest) oprest curr
where nlen = length ns
olen = length ops

exec' :: MathStacks -> Either EvalErr MathStacks
exec' ms@(MathStacks ns ops _)
| null ops = return ms
| otherwise = (exec ms) >>= exec'

eval :: MathStacks -> Either EvalErr Int
eval (MathStacks ns ops curr)
| nlen /= 1 || olen /= 0 = Left \$ StackErr ("inconsistent stacks")
| otherwise = Right \$ head ns
where nlen = length ns
olen = length ops

horner :: Int -> Int -> Int
horner digit num = num * 10 + digit

updateCurr :: Int -> MathStacks -> MathStacks
updateCurr digit ms@(MathStacks _ _ curr) = ms { current=horner digit curr }

updateOps :: Char -> MathStacks -> Either EvalErr MathStacks
updateOps op ms@(MathStacks _ ops _)
| op `elem` ['+', '-', '*'] = return \$ ms { operators=(op:ops) }
| otherwise = Left \$ OpErr ("invalid op=" ++ [op])

updateNum :: MathStacks -> MathStacks
updateNum ms@(MathStacks ns _ curr) = ms { numbers=(curr:ns), current=0 }

parse :: (Char, Int) -> MathStacks -> Either EvalErr MathStacks
parse (c, idx) ms@(MathStacks ns ops curr)
| c `elem` ['+', '-', '*'] = do
-- current number run is done
let ms0 = updateNum ms
-- if there is existing multiplication on top. collapse it
ms1 <- collapseOn ms0 "*"
ms2 <- if c == '+' || c == '-'
-- if there is existing addition or subtraction, do it
then collapseOn ms1 "+-"
else return ms1
updateOps c ms2
| isDigit c = Right \$ updateCurr (digitToInt c) ms
| otherwise = Left \$
ParseErr idx ("err char at pos=" ++ show idx ++ " char:" ++ [c])
where nlen = length ns
olen = length ops

updateOnceT :: StateT MathStacks (Either EvalErr) ()
updateOnceT = do --  in side of StateT MathStacks (Either EvalErr) monad
ms <- get
ms' <- lift \$ exec ms
put ms'

evalCharT :: (Char, Int) -> StateT MathStacks (Either EvalErr) ()
evalCharT (c, idx) = do
ms <- get -- ms :: MathStacks
-- promotes from Either EvalErr MathStacks type to StateT monad
ms' <- lift \$ parse (c, idx) ms
put ms'

evalStringT :: String -> StateT MathStacks (Either EvalErr) ()
evalStringT s = mapM_ evalCharT \$ zip s [1..]

evalStringE :: String -> Either EvalErr MathStacks
evalStringE s = foldM (flip parse) emptyStack \$ zip s [1..]

calcStringE :: String -> Either EvalErr MathStacks
calcStringE s = do
(_, ms) <- (runStateT \$ evalStringT s) emptyStack
return ms

top :: MathStacks -> Either EvalErr Int
top ms = do
let ns = numbers ms
if null ns
then Left \$ StackErr "no value left"

calcString :: String -> Either EvalErr Int
calcString s = do
ms <- evalStringE s -- or use ms <- calcStringE s
ms' <- exec' \$ updateNum ms
top ms'

emptyStack = MathStacks [] [] 0

main :: IO ()
main = do
print \$ calcString "13+15"
print \$ calcString "3+25*4"
print \$ calcString "1-2*2*2+7"
``````

The solution is a much longer program than the C++ counterpart, which is not the impression I got with Haskell program. The part that I used `StateT` monad transformer is probably not necessary (function `evalStringT` and function `calcStringE`), however even without these functions I don’t think my solution will get much shorter. I thought using `State` monad could be a natural solution as it involves quite some state updates in the whole process but it looks like `foldM` over `Either` monad seems doable. Overall I am not even sure my solution is Haskellish enough so please point out anything that I can improve on my code.

Solution

Update: Zeta suggested that what I gave was not really a review. So I present a review first:

• The data type `EvalErr` has both `ParseErr`, `StackErr` and `OpErr`. A common error type for your entire pipeline seems like an OK idea, since the individual parts (parser, evaluator) will not be used independently.

• Your error values are all parameterised with a `String`, which can be super useful as you write the parser, but this makes testing negative cases more difficult. A `StackErr` may be parameterised with the actual stack that broke. This also makes negative testing easier; a good sign of code quality is testability. You can always produce meaningful error messages based on `StackErr ns` (and whatever remaining context that makes for a good message; what was the operator that failed?).

Similarly, `OpErr` could take a single `Char`.

• You perform unsafe pattern matching in the `where` of `collapseOn`:

``````where (n2:n1:nrest) = ns
(op:oprest) = ops
``````

You justify this by guarding against too short lists.

But this creates a dependency between multiple lines.

You can avoid this either by using pattern matching to restrict access to executing code: A function body that will only execute once a pattern matches is safe. Or you can extract values monadically, providing for more abstraction (including implicit error handling). For example, a monadic stack may work like:

``````eval Add = do
v1 <- pop
v2 <- pop
push (v1 + v2)
``````
• I’m not sure exactly what `collapseOn` does. It handles a bunch of types of errors that are at different levels of abstraction. And then it calls `eval`, pushes the result to the stack, and removes an operator from a list of operators for some reason or other.

Is collapse a metaphor for error handling? Or for reducing the stack?

So I’d say it does too many things.

• You can check that there are enough elements via pattern matching or monadic popping from the stack without calculating the entire length (a full traversal of the stack) every time you handle a new element.

• Your list of supported operators is repeated many times. This makes adding new ones difficult and error-prone. The precedence and associativity of your operators is embedded in the ordering of your code and makes it hard to derive, extend or verify that they’re right.

• The following `StateT` functions seem a little off:

``````updateOnceT :: StateT MathStacks (Either EvalErr) ()
updateOnceT = do
ms <- get
ms' <- lift \$ exec ms
put ms'

evalCharT :: (Char, Int) -> StateT MathStacks (Either EvalErr) ()
evalCharT (c, idx) = do
ms <- get -- ms :: MathStacks
ms' <- lift \$ parse (c, idx) ms
put ms'
``````

There is a `modify` combinator. But I would probably ditch the `StateT` altogether to begin with and either

1. Build a non-monadic stack-based parser from scratch, simplify it and extend it. (You’ll eventually end up with something that is somewhat equivalent to parser combinators, since they’re also recursive descent parsers, but not explicitly recursive.)

2. Build a parser using parser combinators (see below) and either construct a syntax tree or make the parser produce the evaluator directly.

• I’d recommend reading up on separation of concerns.

Previous: I wrote this suggestion to solve the problem by dividing the problem into parsing and evaluation, and to use other abstractions than a stack-based algorithm.

What you can do is convert the expression into a syntax tree using a parser combinator library like Megaparsec and evaluate that syntax tree. The author of Megaparsec, Mark Karpov, wrote a tutorial called Parsing a simple imperative language. It has a section called Expressions where he demonstrates the `makeExprParser` combinator:

``````aExpr :: Parser AExpr
aExpr = makeExprParser aTerm aOperators

aOperators :: [[Operator Parser AExpr]]
aOperators =
[ [ Prefix (Neg <\$ symbol "-") ]
, [ InfixL (ABinary Multiply <\$ symbol "*")
, InfixL (ABinary Divide   <\$ symbol "/") ]
, [ InfixL (ABinary Add      <\$ symbol "+")
, InfixL (ABinary Subtract <\$ symbol "-") ]
]
``````

As for building a monadic evaluator, I’d read The Monadic Way on the Haskell Wiki. It starts by building a regular evaluator and then adds features that are greatly complicated by the lack of monads, and then it introduces them.

It seems that your examples do not mention division, which is a pretty good example of something that may fail during evaluation because of division by zero. If you had the following syntax tree,

``````data AExpr
= IntConst Integer
| Neg AExpr
| ABinary ABinOp AExpr AExpr
deriving (Show)

data ABinOp
| Subtract
| Multiply
| Divide
deriving (Show, Eq)

data Error
= DivisionByZero
deriving (Show, Eq)
``````

you could write something like,

``````eval :: AExpr -> Either Error Integer
eval (IntConst i) = return i
eval (Neg e) = negate <\$> eval e
eval (ABinary op e1 e2) = do
i1 <- eval e1
i2 <- eval e2
if op == Divide && i2 == 0 then Left DivisionByZero else
return \$ binop op i1 i2

binop :: ABinOp -> (Integer -> Integer -> Integer)
binop Subtract = (-)
binop Multiply = (*)
binop Divide = quot
``````

This separates the concerns of syntax analysis and evaluation. This also means that different kinds of errors are handled at different layers of abstraction. And it means you get a declarative, high-level way of expressing the precedence and associativity of your operators.

More: I added this after to suggest a middle ground.

If you think `makeExprParser` feels like cheating, and you think the intermediate syntax-tree representation is redundant, you can

1. Make your life harder by writing your expression parser using the `chainl`, `chainr` etc. combinators, or write your own hierarchy of nested combinators (makes precedence and associativity slightly more obscure, but much less so than an explicitly recursive function would).

2. Make the program simpler by writing a parser that generates a function that evaluates:

``````evalP :: Parser (String -> Integer)
evalP = ...
``````