Problem
I’m writing the data structures for my program (a toy compiler), and I’m trying to understand what’s the best way to define the AST:
Current code
data RExpr = RExpr Location InnerRExpr
data InnerRExpr = RLExpr LExpr | RConstant Constant | RMathExpr MathExpr | FCall Id [RExpr]
Alternative A
data RExpr = RLExpr Location LExpr
           | RConstant Location Constant
           | RMathExpr Location MathExpr
           | FCall Location Id [RExpr]
Alternative B
data RExpr = RLExpr { loc :: Location, getLexpr :: LExpr }
           | RConstant { loc :: Location, getConstant :: Constant }
           | RMathExpr { loc :: Location, getExpr :: MathExpr }
           | FCall { loc :: Location, id :: Id, params :: [RExpr] }
Honestly, I’m not satisfied with any of the three options. The current code puts an extraneous wrapper object in the AST that doesn’t really mean anything. Alternative A means that I have to pattern match every time I want to extract the location (or write a boilerplate function that does it). Alternative B means cluttering the global namespace with accessor functions whose names are likely to collide.
Suggestions?
Solution
I tend towards Alternative A (a Location field in every constructor), and then use a lens or traversal to extract the location.
import Control.Lens
class HasLocation t where
  loc :: Lens' t Location

instance HasLocation RExpr where
  loc f (RLExpr l e)    = f l <&> \l' -> RLExpr l' e
  loc f (RConstant l c) = f l <&> \l' -> RConstant l' c
  loc f (RMathExpr l m) = f l <&> \l' -> RMathExpr l' m
  loc f (FCall l i xs)  = f l <&> \l' -> FCall l' i xs
With the class, you can overload the use of loc for other data types. Because it is a lens, you can use it to get/set/modify the location with a large vocabulary without cluttering your namespace.
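To make the get/set/modify point concrete, here is a minimal sketch that compiles with base alone. It makes several assumptions: the definitions of Location, LExpr, Constant, MathExpr and Id are placeholders (the question does not show them), Stmt is a hypothetical second type added only to show the overloading, and view, set and over are small base-only stand-ins for the functions of the same names in the lens library. With the real library you would import Control.Lens and drop those stand-ins.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor ((<&>))
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Placeholder types (assumptions; not defined in the question).
data Location = Location { line :: Int, column :: Int } deriving (Eq, Show)
newtype LExpr = LVar String deriving (Eq, Show)
newtype Constant = IntConst Int deriving (Eq, Show)
data MathExpr = Add RExpr RExpr deriving (Eq, Show)
type Id = String

-- Alternative A from the question.
data RExpr
  = RLExpr Location LExpr
  | RConstant Location Constant
  | RMathExpr Location MathExpr
  | FCall Location Id [RExpr]
  deriving (Eq, Show)

-- Hypothetical statement type, only here to show a second instance.
data Stmt = Assign Location LExpr RExpr deriving (Eq, Show)

-- Base-only stand-ins for Lens', view, over and set from the lens library.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

over :: Lens' s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

set :: Lens' s a -> a -> s -> s
set l x = over l (const x)

class HasLocation t where
  loc :: Lens' t Location

instance HasLocation RExpr where
  loc f (RLExpr l e)    = f l <&> \l' -> RLExpr l' e
  loc f (RConstant l c) = f l <&> \l' -> RConstant l' c
  loc f (RMathExpr l m) = f l <&> \l' -> RMathExpr l' m
  loc f (FCall l i xs)  = f l <&> \l' -> FCall l' i xs

instance HasLocation Stmt where
  loc f (Assign l lhs rhs) = f l <&> \l' -> Assign l' lhs rhs
```

With this loaded in GHCi, view loc e reads the location of any RExpr or Stmt, set loc newLoc e replaces it, and over loc g e applies g to it, all without a per-constructor accessor in scope.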