Problem

Usually, parser combinators in Haskell based on the parser type like

```
newtype Parser t r = Parser ([t] -> [(r,[t])])
```

This allows back tracking and multiple results. However I think it would be good to have it in “push” style, means the parser will have a set of result when no more input given, and have a transition function that for a given token, results in a new state of the parser. This results in the defintion

```
newtype PushParser t r = PushParser ([r], Maybe (t -> PushParser t r))
```

(We could have a data constructor that have two fields, however I prefer to use the more efficient `newtype`

keyword so have to have only one field as a tuple)

To parse a list of tokens, we define the `parse`

function as following:

```
parse :: PushParser t r -> [t] -> [r]
--No input, rs is the result
parse ~(PushParser (rs, _)) [] = rs
--No parsing function, this is an error state. No result
parse (PushParser (_,Nothing)) _ = []
--Otherwise, drop the results, apply the pasing function with the input and continue
parse (PushParser (_,Just f)) (t:ts) = parse (f t) ts
```

We can then define a `fmap`

to make it a `Functor`

like the following:

```
instance Functor (PushParser t) where
fmap f ~(PushParser (r,mg)) = PushParser ((map f r), do {g<-mg; return (fmap f.g)})
```

Instance of `Applicative`

is more tricky, but the easiest way is to make it depend on the not yet implemented instance of `Monad`

.

```
instance Applicative (PushParser t) where
pure x = PushParser ([x],Nothing)
pf <*> pv = do
f <- pf
v <- pv
return (f v)
```

But before giving the definition of `Monad`

I want to introduce the definition of `Alternative`

:

```
instance Alternative (PushParser t) where
empty = PushParser ([],Nothing)
(PushParser (r1,Nothing)) <|> (PushParser (r2,f)) = PushParser ((r1++r2), f)
(PushParser (r1,f)) <|> (PushParser (r2,Nothing)) = PushParser ((r1++r2), f)
(PushParser (r1,Just f1)) <|> (PushParser (r2,Just f2)) =
PushParser ((r1 ++ r2), Just (t -> f1 t <|> f2 t))
```

I have tried out different ways for the instance of `Monad`

, but finally I believe the best way to give definition of `(>>=)`

is to define `unwrap`

(or `join`

in the monad library) first.

```
instance Monad (PushParser t) where
p >>= f = unwrap (fmap f p) where
unwrap ~(PushParser (prs, mf)) =
foldr (<|>) (PushParser ([], do { f<-mf; return (unwrap.f)})) prs
```

The above works perfectally fine, and although it does not make sense to make it an instance of `Parser`

(in `Text.Parser.Combinators`

) as we don’t allow trace back, the other combinators that only requires `Alternative`

worked perfectally fine.

However, the performance seems to be too bad. It is possibly as it keeps all possible parsing path and so consumes too much memory. For example, a simple `parse (many (some (char '.'))) "...."`

will return 8 results, with all combinations of dots.

But I think there should be some way to improve the performance. Please advise me how to improve on this.

Solution

It’ll lazily take no more memory than it takes time to find the first result if you do not try to enumerate all possible results.

For `many`

/`some`

to only try one option, you want `Maybe`

‘s `Alternative`

/`MonadPlus`

behavior, not `[]`

‘s. You could parametrize your Parser type in the `Alternative`

used so the user can change it on the fly.

yoctoparsec uses this “push” style.

```
parseString (hoistFreeT maybeToList $ many $ some $ mfilter (=='.') token) "...."
```