automated-induction

(This presentation was made on 7/19/22 and reflects my understanding of automated induction at that time)

Automation of Induction Proofs

The proof technique used here is described in "A Computational Logic" by Robert S. Boyer and J. Strother Moore, and also in "The Little Prover" by Daniel P. Friedman and Carl Eastlund.

Define Problem Scope

We’re going to restrict our reasoning to values described by the grammar

VALUE ::= null
        | integer(INTEGER)
        | cons(VALUE,VALUE)

The functions and values below will be our primitives. They behave as we would expect.

Symbol	Arity	Range	Symbol	Arity	Range
`null`	0	`Value`	`false`	0	`Bool`
`integer`	1	`Value`	`ite`	3	*
`cons`	2	`Value`	`and`	≥ 0	`Bool`
`is-null`	1	`Bool`	`or`	≥ 0	`Bool`
`is-integer`	1	`Bool`	`not`	1	`Bool`
`is-cons`	1	`Bool`	`=>`	≥ 2	`Bool`
`true`	0	`Bool`	`=`	2	`Bool`

The grammar of the input file is still SMT-LIB 2.0, but with some restrictions.

Functions should only be defined with define-fun-rec. The body of a function can contain calls to itself, as well as calls to primitives. However calls to other user-defined functions are not permitted.
The domain of each user-defined function must be Value.
There must be only one command of the form (assert (not ...)), and it will be treated as the claim that needs to be proved.

Desugar

The body of every user-defined function must be desugared.

The claim to be proved must also be desugared.

Other assertions can be left as they are.

To desugar, change each and, or, => to an ite according to these rules.

and()             = true
and(p₁,p₂,...,pₙ) = ite(p₁,and(p₂,...,pₙ),false)

or()             = false
or(p₁,p₂,...,pₙ) = ite(p₁,true,or(p₂,...,pₙ))

=>(p₁,p₂)              = ite(p₁,p₂,true)
=>(p₁,p₂,p₃,p₄,...,pₙ) = ite(p₁,=>(p₂,p₃,p₄,...,pₙ),true)

Unfold

The first step is to compute the depth of each user-defined function called within the claim.

The depth of a given formal argument of a function is the number of times car or cdr is applied to it within a recursive call. The depth of a function is the maximum over the depths of its formal arguments. Consider a few examples.

The function below has depth 1.

(define-fun-rec is-natural ((v Value)) Bool
  (or (is-null v)
      (and (is-cons v)
           (is-null (car v))
           (is-natural (cdr v)))))

The function below has depth 1.

(define-fun-rec plus ((v Value) (w Value)) Value
  (ite (is-null v) w
  (ite (is-cons v) (plus (cdr v) (cons null w))
                   null)))

The function below has depth 3.

(define-fun-rec is-list ((v Value)) Bool
  (or (is-null v)
      (and (is-cons v)
           (is-integer (car v))
           (is-null (cdr v)))
      (and (is-cons v)
           (is-integer (car v))
           (is-cons (cdr v))
           (is-integer (car (cdr v)))
           (is-null (cdr (cdr v))))
      (and (is-cons v)
           (is-integer (car v))
           (is-cons (cdr v))
           (is-integer (car (cdr v)))
           (is-cons (cdr (cdr v)))
           (is-integer (car (cdr (cdr v))))
           (is-list (cdr (cdr (cdr v)))))))

Then compute the least common multiple of the depths of all the user-defined functions involved in the claim. Call this value l

For each involved function f, if its depth is d, compute the l/d th unfolding of f and add this new function to the list of definitions. What’s important is each of the unfolded definitions will have the same depth.

To be precise, the 1st unfolding of a function can be computed by expanding each recursive call within its body into the function’s body, with the appropriate substitution applied. Here’s an example to aid my explanation.

(define-fun-rec is-tree ((v Value)) Bool
  (or (is-null v)
      (and (is-cons v)
           (is-tree (car v))
           (is-tree (cdr v)))))

This is its first unfolding.

(define-fun-rec is-tree-1 ((v Value)) Bool
  (or (is-null v)
      (and (is-cons v)
           (or (is-null (car v))
               (and (is-cons (car v))
                    (is-tree-1 (car (car v)))
                    (is-tree-1 (cdr (car v)))))
           (or (is-null (cdr v))
               (and (is-cons (cdr v))
                    (is-tree-1 (car (cdr v)))
                    (is-tree-1 (cdr (cdr v))))))))

Strategize

This is the step where we inspect the claim to be proved and come up with possible induction schemes. Consider the functions is-natural and plus defined in the last section. Suppose we need to prove

∀ v, w ∈ Value, 
  (is-natural(v) ∧ is-natural(w)) ⇒
    plus(cons(null, v), w) = cons(null, plus(v, w))

The first stage is to collect all non-primitive function calls.

is-natural(v)
is-natural(w)
plus(cons(null, v), w)
plus(v, w)

Next, iterate over the function calls. For each call, iterate over its arguments and produce a version of the call where that argument has been replaced with a hole.

is-natural(_)
plus(_, w)
plus(cons(null, v), _)
plus(v, _)

Look at the universally quantified variables in the claim. Compute all the ways we can “pop” a single variable to induct on while leaving the other variables under the universal quantifier. We can envision each of these “ways” as a pair whose first element is a variable to induct on and whose second element is a list of universally quantified variables.

(v, [w])
(w, [v])

For each element in this set of pairs. for each hole-bearing function call, produce a new pair. The first element of this new pair is the call’s hole plugged with the variable to induct on. The second element of the new pair is just the list of universally quantified variables.

(is-natural(v), [w])
(plus(v, w), [w])
(plus(cons(null, v), v), [w])
(plus(v, v), [w])
(is-natural(w), [v])
(plus(w, w), [v])
(plus(cons(null, v), w), [v])
(plus(v, w), [v])

We only want to consider strategies that are “good” and we’re going to adopt The Little Prover’s idea of only considering calls whose arguments meet the following criteria.

The arguments are all symbols
No symbol is repeated
The symbols are a subset of the universally quantified variables in the claim, which also means they can’t be primitives.

Therefore the list of good strategies is

(is-natural(v), [w])
(plus(v, w), [w])
(is-natural(w), [v])
(plus(v, w), [v])

Subgoals

Pick a strategy from among those in the last section. For this demonstration let’s choose (plus(v, w), [w])

Let P(v,w) represent the claim to be proved, but abstrated over its universally quantified variables.

P(v, w) =
  (is-natural(v) ∧ is-natural(w)) ⇒
    plus(cons(null, v), w) = cons(null, plus(v, w))

Construct a tree from the body of plus, the function mentioned in the strategy. Each node in this tree is an ite form. The expression at the node is the form’s test.

             w
            /
      true /     
          /
is-null(v)                plus(cdr(v), cons(null, w))
          \              /
     false \            / true
            \          /
             is-cons(v)
                       \
                        \ false
                         \
                          null

We work from the leaves to the root to gather the subgoals suggested by this strategy.

Say we’re at a leaf. We scan the expression at the leaf for recursive calls to plus. Say there are two recursive calls – plus(p, q) and plus(r, s) – where p, q, r and s are expressions that may depend on v and w. The induction hypotheses created from these expressions are P(p, q) and P(r, s). The list of subgoals at this leaf is then just the singleton

[ (∀ w, P(p, q)) ⇒ (∀ w, P(r, s)) ⇒ P(v, w) ]

Say we’re at a node. We begin by finding the induction hypotheses (ihs) for the expression at the node (e). We already have the subgoals for the two branches of the node. Let’s call them s-true and s-false. If s-true and s-false are the same, we add ihs as assumptions to each subgoal in s-true and return the result. If s-true and s-false are distinct, we add e and ihs as assumptions to each subgoal in s-true to get s-true'. We also add ¬e and ihs as assumptions to each subgoal in s-false to get s-false'. Then we append s-true' to s-false' and return the result.

Step 1.

             [ P(v, w) ]
            /
      true /     
          /
is-null(v)                [ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w) ]
          \              /
     false \            / true
            \          /
             is-cons(v)
                       \
                        \ false
                         \
                          [ P(v, w) ]

Step 2.

             [ P(v, w) ]
            /
      true /     
          /
is-null(v)                
          \
     false \
            \
             [ is-cons(v) ⇒ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w)
             , ¬is-cons(v) ⇒ P(v, w) ]

Step 3.

[ is-null(v) ⇒ P(v, w)
, ¬is-null(v) ⇒ is-cons(v) ⇒ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w)
, ¬is-null(v) ⇒ ¬is-cons(v) ⇒ P(v, w) ]

Code & Examples

Available upon request, as the code and usage instructions will change over time.