Saturday, March 1, 2008

Haskell Style Pattern Matching In Arc

Since I did the Fluent Interface saga, it's only fair that I try to push the boundaries of metaprogramming in language that actually has metaprogramming.

Consider the following Haskell code. We will be striving to imitate it as closely as possible using an Arc macro.
`union xs [] = xsunion [] ys = ysunion xs@(x:xt) ys@(y:yt) | x < y = x : union xt ys                          | y < x = y : union xs yt                          | otherwise = x : union xt yt`
First we will see how this function might traditionally be written.
`(def union (< xs ys)  (if (no xs) ys      (no ys) xs      (with (x (car xs) xt (cdr xs)             y (car ys) yt (cdr ys))        (if (< x y) (cons x (union < xt ys))            (< y x) (cons y (union < xs yt))                    (cons x (union < xt yt))))))`
As a useful side-note, Arc's let and with forms come with destructuring. This can save us a whole line here, as well as make the code more readable (no more cryptic cars and cdrs).
`(def union (< xs ys)  (if (no xs) ys      (no ys) xs      (with ((x . xt) xs (y . yt) ys)        (if (< x y) (cons x (union < xt ys))            (< y x) (cons y (union < xs yt))                    (cons x (union < xt yt))))))`
This isn't bad. If we were programming in the wild, we would leave well enough alone at this point. However, this is an experiment; we want it to look Haskellian, not Lispy.

We'll start with pcase a 10-line macro using almkglor's pattern matching library to yield something similar to Scala's match-case statement.
`(def union (< xs ys)  (pcase `(,xs ,ys)    (xs ()) xs    (() ys) ys    ((x . xt) (y . yt))      (if (< x y) (cons x (union < xt ys))          (< y x) (cons y (union < xs yt))                  (cons x (union < xt yt)))))`
Switching from pcase to hcase, the outer parens in the patterns go away and equals signs separate the patterns from their corresponding expressions.
`(def union (< xs ys)  (hcase `(,xs ,ys)    xs () = xs    () ys = ys    (x . xt) (y . yt) =      (if (< x y) (cons x (union < xt ys))          (< y x) (cons y (union < xs yt))                  (cons x (union < xt yt)))))`
To bring more Haskell syntax into the mix, we want to be able to substitute [] for () and (x:xt) for (x . xt). The colon and the brackets have special meanings in Arc, but our macro can intercept them without too much difficulty.
`(def union (< xs ys)  (hcase `(,xs ,ys)    xs [] = xs    [] ys = ys    (x:xt) (y:yt) =       (if (< x y) (cons x (union < xt ys))          (< y x) (cons y (union < xs yt))                  (cons x (union < xt yt)))))`
Next, we can replace that pesky if statement with guard clauses.
`(def union (< xs ys)  (hcase `(,xs ,ys)    xs [] = xs    [] ys = ys    (x:xt) (y:yt) / (< x y)   = (cons x (union < xt ys))                  / (< y x)   = (cons y (union < xs yt))                  / otherwise = (cons x (union < xt yt))))`
The a single pipe | character can't be used here for technical reasons, so we'll have to make due with a slash /. A double pipe || could be used just as easily.

Since the guard conditions are enclosed on either side by / and =, we can make the outer parens optional. The xs@(x:xt) constructs known as "as-patterns" are the next addition. Having these as-patterns, we can define a simple macro to bring hcase up to the level of function arguments.
`(mac hdef (name . body)  (let args (uniq)   `(def ,name ,args (hcase ,args ,@body))))(hdef union   _ xs [] = xs  _ [] ys = ys  < xs@(x:xt) ys@(y:yt) / < x y = (cons x (union < xt ys))                        / < y x = (cons y (union < xs yt))                        / otherwise = (cons x (union < xt yt)))`
The result: Lo! In a dazzling display of engineering hubris, a Lisp dialect is dragged kicking and screaming into Haskell-like syntax. If you dare, you can take look at the definition of hcase in all its convoluted glory.

Implementation note: Like most Lisps, Arc would read '(x : xs) as a list of three symbols, but '(x:xs) has only one. In order to behave properly, hcase must split apart the symbols in patterns: '(xs@(x:y:yt)) becomes '(xs @ (x : y : yt)) and so on.