Implement Pattern Matching in Emacs Lisp

Suppose I want to write a recursive function on lists in Elisp. Consider a version of seq-filter that only works for lists and call it list-filter.

(defun list-filter (p l)
  (if (null l)
      nil
    (let ((x (car l))
          (ys (list-filter p (cdr l))))
      (if (funcall p x)
          (cons x ys)
        ys))))

If I were to define the same function in Haskell the code would look like:

listFilter p []     = []
listFilter p (x:xs) = let
    ys = listFilter p xs
  in
    if p x
        then x:ys 
        else ys 

In my opinion the Haskell code is neater. I also think the way the "shape" of the list changes is more evident from it. All thanks to the ability to "compare" a value with an expression that, when evaluated, would be equal to the value and define some variables based on the comparison. This comparison-based binding is pattern-matching.

The nice thing about Lisp is that I can define new language features such as a pattern-matching construct with very little fuss.

Before I implement the pattern-matching syntax I want to imagine how it will look. This process generally helps me guess at the inputs and output of a function or macro I need to write. So this is how I believe list-filter should be written using a pattern-matching construct called pattern-match (it doesn't compile yet!):

(defun list-filter (p l)
  (pattern-match l
    (()       . ())
    ((x . xs) . (let ((ys (list-filter p xs)))
                  (if (funcall p x)
                      (cons x ys)
                    ys)))))

Based on this example I can reason about the arguments to pattern-match.

  1. value, a value that needs to be matched or compared against different patterns.
  2. clause₁, clause₂, ... where each clause has the form (pattern . result). pattern is naturally a pattern to match against value while result is the return value of the entire form if the match succeeds.

pattern-match should be defined as a macro because not all result expressions need to be evaluated. Only the result paired with the first pattern that matches value needs to be evaluated. In addition to this, result can contain symbols whose values depend on the successful comparison of the pattern that's paired with it to a completely evaluated value. I believe the best way to achieve this goal is to have the pattern-match macro "return" a call to a helper function, traverse-clauses, such that value gets completely evaluated and clauses remains as a list of cons cells.

(defmacro pattern-match (value &rest clauses)
  (list 'traverse-clauses value (list 'quote clauses)))

The job of traverse-clauses is to eval some result expression from among the ones in clauses after computing an evaluation context, in the form of a symbol-value association list. Here's an incomplete definition of traverse-clauses.

(defun traverse-clauses (value clauses)
  (if (null clauses)
      (error "exhausted all patterns but found no match!")
    (let* ((clause (car clauses))
           (pattern (car clause))
           (return (cdr clause))
           (context 'fail)) ; TO DO
      (if (equal 'fail context)
          (traverse-clauses value (cdr clauses))
        (eval return context)))))

All that's left is to define another helper function, which I will call match, to compute context if value matches pattern and otherwise evaluate to 'fail to signal an unsuccessful comparison.

(defun traverse-clauses (value clauses)
  (if (null clauses)
      (error "exhausted all patterns but found no match!")
    (let* ((clause (car clauses))
           (pattern (car clause))
           (return (cdr clause))
           (context (match value pattern)))
      (if (equal 'fail context)
          (traverse-clauses value (cdr clauses))
        (eval return context)))))
        
(defun match (value pattern)
  ; TO DO
  nil)

If the comparison of value with pattern succeeds the return value ought to be an association list of symbols with values that's accepted by eval and if the comparison fails the return value ought to be 'fail. What combinations of pattern and value ought to work?

  1. when pattern is nil or () matching values are nil and ()
  2. value t matches pattern t
  3. if both value and pattern are numbers then they must be equal according to the function =
  4. if both value and pattern are numbers then they must be equal according to the function string-equal
  5. if pattern is a symbol that can be bound to a value, in other words if it isn't nil or t, the context should associate pattern with value
  6. if pattern and value are both cons cells their cars and cdrs should be recursively matched against each other then the contexts should be merged. If either match operation fails then the final result should also be 'fail.

This is enough for now though there is much scope to extend match e.g. the pattern (+ 1 n) can match with value 5 resulting in the context ((n . 4)). For the sake of simplicity in my match function I will return 'fail for any pair of pattern and value that doesn't fall into any case from 1 through 6. However this also means a pattern like (cons x xs) when matched with the list (1 2 3) will result in the context ((cons . 1) (x . 2) (xs . (3))). Since I'm not considering function application in patterns, match might behave counter to one's usual intuition of pattern-matching in Haskell.

(defun match (value pattern)
  (cond
   ((and (null pattern) (null value))
    nil)
   ((and (equal t pattern) (equal t value))
    nil)
   ((and (numberp pattern) (numberp value) (= value pattern))
    nil)
   ((and (stringp pattern) (stringp value) (string-equal value pattern))
    nil)
   ((and (symbolp pattern) (not (null pattern)) (not (equal t pattern)))
    (list (cons pattern value)))
   ((and (consp pattern) (consp value))
    (let ((match-car (match (car value) (car pattern)))
          (match-cdr (match (cdr value) (cdr pattern))))
      (if (or (equal 'fail match-car) (equal 'fail match-cdr))
          'fail
        (append match-car match-cdr))))
   (t 'fail)))

I will put all the code together and evaluate a few examples as a sanity check. It's no substitute for thorough testing but it will do for now.

;;;; NOTE
;; Instead of using eval-last-sexp to define list-filter and
;; list-starts-with-1-p, copy all the code into a buffer and run eval-buffer, as
;; the former technique doesn't pass the sanity checks below.
;; Once the functions list-filter and list-starts-with-1-p have been defined
;; using eval-buffer the sexps under the "sanity checks" section can be
;; evaluated using eval-last-sexp.

;;;; CODE

(defmacro pattern-match (value &rest clauses)
  (list 'traverse-clauses value (list 'quote clauses)))

(defun traverse-clauses (value clauses)
  (if (null clauses)
      (error "exhausted all patterns but found no match!")
    (let* ((clause (car clauses))
           (pattern (car clause))
           (return (cdr clause))
           (context (match value pattern)))
      (if (equal 'fail context)
          (traverse-clauses value (cdr clauses))
        (eval return context)))))

(defun match (value pattern)
  (cond
   ((and (null pattern) (null value))
    nil)
   ((and (equal t pattern) (equal t value))
    nil)
   ((and (numberp pattern) (numberp value) (= value pattern))
    nil)
   ((and (stringp pattern) (stringp value) (string-equal value pattern))
    nil)
   ((and (symbolp pattern) (not (null pattern)) (not (equal t pattern)))
    (list (cons pattern value)))
   ((and (consp pattern) (consp value))
    (let ((match-car (match (car value) (car pattern)))
          (match-cdr (match (cdr value) (cdr pattern))))
      (if (or (equal 'fail match-car) (equal 'fail match-cdr))
          'fail
        (append match-car match-cdr))))
   (t 'fail)))
   
;;;; EXAMPLES

(defun list-filter (p l)
  (pattern-match l
    (()       . ())
    ((x . xs) . (let ((ys (list-filter p xs)))
                  (if (funcall p x)
                      (cons x ys)
                    ys)))))
                    
(defun list-starts-with-1-p (l)
  (pattern-match l
    ((1 . xs) . t)
    (xs       . nil)))
    
;;;; SANITY CHECKS
;;
;; (list-filter (λ (n) (zerop (mod n 2))) '(1 2 3 4 5 6 7 8 9 10))
;; should evaluate to '(2 4 6 8 10)
;;
;; (list-starts-with-1-p '(1 2 3 4 5))
;; should evaluate to t
;;
;; (list-starts-with-1-p '(1))
;; should evaluate to t
;;
;; (list-starts-with-1-p '(2 3 4 5))
;; should evaluate to nil
;;
;; (list-starts-with-1-p '())
;; should evaluate to nil

Upon looking for others' pattern-matching implementations in Elisp I learned of the "official" pattern-matching conditional operator pcase. It does a lot more than pattern-match but I still feel implementing my own version was a worthwhile exercise.