(This presentation was made on 7/19/22 and reflects my understanding of automated induction at that time)
The proof technique used here is described in "A Computational Logic" by Robert S. Boyer and J. Strother Moore, and also in "The Little Prover" by Daniel P. Friedman and Carl Eastlund.
We’re going to restrict our reasoning to values described by the grammar
VALUE ::= null
| integer(INTEGER)
| cons(VALUE,VALUE)
The functions and values below will be our primitives. They behave as we would expect.
Symbol | Arity | Range | Symbol | Arity | Range |
---|---|---|---|---|---|
null |
0 | Value |
false |
0 | Bool |
integer |
1 | Value |
ite |
3 | * |
cons |
2 | Value |
and |
≥ 0 | Bool |
is-null |
1 | Bool |
or |
≥ 0 | Bool |
is-integer |
1 | Bool |
not |
1 | Bool |
is-cons |
1 | Bool |
=> |
≥ 2 | Bool |
true |
0 | Bool |
= |
2 | Bool |
The grammar of the input file is still SMT-LIB 2.0, but with some restrictions.
Functions should only be defined with define-fun-rec
. The body of a function can contain calls to itself, as well as calls to primitives. However calls to other user-defined functions are not permitted.
The domain of each user-defined function must be Value
.
There must be only one command of the form (assert (not ...))
, and it will be treated as the claim that needs to be proved.
The body of every user-defined function must be desugared.
The claim to be proved must also be desugared.
Other assertions can be left as they are.
To desugar, change each and
, or
, =>
to an ite
according to these rules.
and() = true
and(p₁,p₂,...,pₙ) = ite(p₁,and(p₂,...,pₙ),false)
or() = false
or(p₁,p₂,...,pₙ) = ite(p₁,true,or(p₂,...,pₙ))
=>(p₁,p₂) = ite(p₁,p₂,true)
=>(p₁,p₂,p₃,p₄,...,pₙ) = ite(p₁,=>(p₂,p₃,p₄,...,pₙ),true)
The first step is to compute the depth of each user-defined function called within the claim.
The depth of a given formal argument of a function is the number of times car
or cdr
is applied to it within a recursive call.
The depth of a function is the maximum over the depths of its formal arguments.
Consider a few examples.
The function below has depth 1.
(define-fun-rec is-natural ((v Value)) Bool
(or (is-null v)
(and (is-cons v)
(is-null (car v))
(is-natural (cdr v)))))
The function below has depth 1.
(define-fun-rec plus ((v Value) (w Value)) Value
(ite (is-null v) w
(ite (is-cons v) (plus (cdr v) (cons null w))
null)))
The function below has depth 3.
(define-fun-rec is-list ((v Value)) Bool
(or (is-null v)
(and (is-cons v)
(is-integer (car v))
(is-null (cdr v)))
(and (is-cons v)
(is-integer (car v))
(is-cons (cdr v))
(is-integer (car (cdr v)))
(is-null (cdr (cdr v))))
(and (is-cons v)
(is-integer (car v))
(is-cons (cdr v))
(is-integer (car (cdr v)))
(is-cons (cdr (cdr v)))
(is-integer (car (cdr (cdr v))))
(is-list (cdr (cdr (cdr v)))))))
Then compute the least common multiple of the depths of all the user-defined functions involved in the claim. Call this value l
For each involved function f, if its depth is d, compute the l/d th unfolding of f and add this new function to the list of definitions. What’s important is each of the unfolded definitions will have the same depth.
To be precise, the 1st unfolding of a function can be computed by expanding each recursive call within its body into the function’s body, with the appropriate substitution applied. Here’s an example to aid my explanation.
(define-fun-rec is-tree ((v Value)) Bool
(or (is-null v)
(and (is-cons v)
(is-tree (car v))
(is-tree (cdr v)))))
This is its first unfolding.
(define-fun-rec is-tree-1 ((v Value)) Bool
(or (is-null v)
(and (is-cons v)
(or (is-null (car v))
(and (is-cons (car v))
(is-tree-1 (car (car v)))
(is-tree-1 (cdr (car v)))))
(or (is-null (cdr v))
(and (is-cons (cdr v))
(is-tree-1 (car (cdr v)))
(is-tree-1 (cdr (cdr v))))))))
This is the step where we inspect the claim to be proved and come up with possible induction schemes. Consider the functions is-natural
and plus
defined in the last section. Suppose we need to prove
∀ v, w ∈ Value,
(is-natural(v) ∧ is-natural(w)) ⇒
plus(cons(null, v), w) = cons(null, plus(v, w))
The first stage is to collect all non-primitive function calls.
is-natural(v)
is-natural(w)
plus(cons(null, v), w)
plus(v, w)
Next, iterate over the function calls. For each call, iterate over its arguments and produce a version of the call where that argument has been replaced with a hole.
is-natural(_)
plus(_, w)
plus(cons(null, v), _)
plus(v, _)
Look at the universally quantified variables in the claim. Compute all the ways we can “pop” a single variable to induct on while leaving the other variables under the universal quantifier. We can envision each of these “ways” as a pair whose first element is a variable to induct on and whose second element is a list of universally quantified variables.
(v, [w])
(w, [v])
For each element in this set of pairs. for each hole-bearing function call, produce a new pair. The first element of this new pair is the call’s hole plugged with the variable to induct on. The second element of the new pair is just the list of universally quantified variables.
(is-natural(v), [w])
(plus(v, w), [w])
(plus(cons(null, v), v), [w])
(plus(v, v), [w])
(is-natural(w), [v])
(plus(w, w), [v])
(plus(cons(null, v), w), [v])
(plus(v, w), [v])
We only want to consider strategies that are “good” and we’re going to adopt The Little Prover’s idea of only considering calls whose arguments meet the following criteria.
Therefore the list of good strategies is
(is-natural(v), [w])
(plus(v, w), [w])
(is-natural(w), [v])
(plus(v, w), [v])
Pick a strategy from among those in the last section. For this demonstration let’s choose (plus(v, w), [w])
Let P(v,w)
represent the claim to be proved, but abstrated over its universally quantified variables.
P(v, w) =
(is-natural(v) ∧ is-natural(w)) ⇒
plus(cons(null, v), w) = cons(null, plus(v, w))
Construct a tree from the body of plus
, the function mentioned in the strategy. Each node in this tree is an ite
form. The expression at the node is the form’s test.
w
/
true /
/
is-null(v) plus(cdr(v), cons(null, w))
\ /
false \ / true
\ /
is-cons(v)
\
\ false
\
null
We work from the leaves to the root to gather the subgoals suggested by this strategy.
Say we’re at a leaf. We scan the expression at the leaf for recursive calls to plus
. Say there are two recursive calls – plus(p, q)
and plus(r, s)
– where p
, q
, r
and s
are expressions that may depend on v
and w
. The induction hypotheses created from these expressions are P(p, q)
and P(r, s)
. The list of subgoals at this leaf is then just the singleton
[ (∀ w, P(p, q)) ⇒ (∀ w, P(r, s)) ⇒ P(v, w) ]
Say we’re at a node. We begin by finding the induction hypotheses (ihs
) for the expression at the node (e
). We already have the subgoals for the two branches of the node. Let’s call them s-true
and s-false
. If s-true
and s-false
are the same, we add ihs
as assumptions to each subgoal in s-true
and return the result. If s-true
and s-false
are distinct, we add e
and ihs
as assumptions to each subgoal in s-true
to get s-true'
. We also add ¬e
and ihs
as assumptions to each subgoal in s-false
to get s-false'
. Then we append s-true'
to s-false'
and return the result.
Step 1.
[ P(v, w) ]
/
true /
/
is-null(v) [ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w) ]
\ /
false \ / true
\ /
is-cons(v)
\
\ false
\
[ P(v, w) ]
Step 2.
[ P(v, w) ]
/
true /
/
is-null(v)
\
false \
\
[ is-cons(v) ⇒ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w)
, ¬is-cons(v) ⇒ P(v, w) ]
Step 3.
[ is-null(v) ⇒ P(v, w)
, ¬is-null(v) ⇒ is-cons(v) ⇒ (∀ w, P(cdr(v), cons(null, w))) ⇒ P(v, w)
, ¬is-null(v) ⇒ ¬is-cons(v) ⇒ P(v, w) ]
Available upon request, as the code and usage instructions will change over time.