Library MyTheory.nat_example
This is a brief tutorial on Coq that does not require much
understanding of tactic based theorem proving.
It assumes some general familiarity with programming in a
statically typed functional programming language such as OCaml or Haskell.
This will help you
understand Coq's type theory at a lower level than you would get
from using the apply and rewrite tactics in proofs.
We can define algebraic data types
with the Inductive keyword.
This is similar to the "data" keyword in Haskell, or
the "type" keyword in OCaml.
Inductive nat : Type :=
| O : nat
| S : nat → nat.
Local Notation "0" := O.
Local Notation "1" := (S O).
Local Notation "2" := (S (S O)).
Inductive bool : Type :=
| tt : bool
| ff : bool.
Basic recursion with natural numbers and booleans;
pattern matching.
Definition is_zero : nat → bool :=
fun n ⇒
match n with
| O ⇒ tt
| S _ ⇒ ff
end.
Definition negb (b : bool) : bool :=
match b with
| tt ⇒ ff
| ff ⇒ tt
end.
In Coq, you can define recursive functions with the fix keyword.
The fix keyword is similar to fun (lambda abstraction) except
that it is used for recursive functions. It is necessary to mark a
function as recursive.
To motivate the use of a new keyword for defining recursive
functions, notice that it would be very inconvenient to try and
implement recursion using anonymous functions, as you have to give
the function a name so it can call itself.
Define n + m by induction on n.
Definition add : nat → nat → nat :=
fix add_recursive (n m: nat) : nat :=
match n with
| O ⇒ m
| S n' ⇒ S (add_recursive n' m)
end.
Notation "a + b" := (add a b) (at level 50, left associativity).
fix add_recursive (n m: nat) : nat :=
match n with
| O ⇒ m
| S n' ⇒ S (add_recursive n' m)
end.
Notation "a + b" := (add a b) (at level 50, left associativity).
Coq doesn't permit arbitrary recursive functions,
only functions that are defined by recursion on the structure
of an inductively defined data type, such as natural numbers or lists.
A function can only recursively call itself in the branches of
a pattern-match, like above. Coq always identifies one particular
argument of the recursive function call as the "decreasing" argument -
the argument that gets pattern-matched each time, or unfolded.
In the example above, n is the "structurally decreasing" argument,
because it gets "unfolded" by one step with every recursive call,
eventually terminating with zero.
Coq is usually able to infer from context what the structurally
decreasing argument is meant to be, but if it can't figure it out,
you can explicitly hint it using the struct keyword, as below.
Check fix add_recursive (n m: nat) {struct n} : nat :=
match n with
| O ⇒ m
| S n' ⇒ add_recursive n' (S m)
end.
match n with
| O ⇒ m
| S n' ⇒ add_recursive n' (S m)
end.
Prints nat → nat → nat
Just like in Haskell, inductive data types can be
polymorphic in a type parameter.
After defining a function, you can mark arguments as implicit
using the Arguments command.
cons : ∀ A : Type, A → list A → list A
We don't want to supply the argument A every time so we mark it implicit.
Coq is able to infer from context what A is.
If you want to temporarily make implicit arguments explicit,
(for example, because Coq is having trouble inferring what it should be),
put the @ sign in front of the term.
Check @cons nat 2 (@nil nat).
Local Notation "[]" := nil.
Local Notation "head :: tail" := (cons head tail)
(at level 70, right associativity).
Check 2 :: 1 :: 0 :: [].
Definition length (A : Type) : list A → nat :=
fix length_rec (l : list A) {struct l} :=
match l with
| nil ⇒ 0
| cons head tail ⇒ S (length_rec tail)
end.
Local Notation "[]" := nil.
Local Notation "head :: tail" := (cons head tail)
(at level 70, right associativity).
Check 2 :: 1 :: 0 :: [].
Definition length (A : Type) : list A → nat :=
fix length_rec (l : list A) {struct l} :=
match l with
| nil ⇒ 0
| cons head tail ⇒ S (length_rec tail)
end.
Remember, fix is like fun except for recursive functions, and
{struct l} is a hint to the Coq typechecker
that the definition is by "structural recursion on the variable l".
i.e., that the function is "decreasing" in l - the argument l
gets smaller in every recursive function call.
You do not have to include this because Coq can figure it out from
context. I only include it to help you debug things, because often
you print the definition of a term to see what's going on, and Coq
will generally show you the "fully annotated" definitions and it
is confusing / overwhelming if you don't understand the extra
annotations.
Definition concatenate (A : Type) : list A → list A → list A :=
fix concat_rec (l1 : list A) (l2 : list A) {struct l1} :=
match l1 with
| nil ⇒ l2
| cons head tail ⇒ cons head (concat_rec tail l2)
end.
The "Fixpoint" command is syntactic sugar for a recursive Definition.
It's a bit more concise.
The following illustrates this syntactic sugar.
It is exactly equivalent to what we did above.
Fixpoint concat {A : Type} (l1 : list A) (l2 : list A) :=
match l1 with
| nil ⇒ l2
| cons head tail ⇒ cons head (concat tail l2)
end.
match l1 with
| nil ⇒ l2
| cons head tail ⇒ cons head (concat tail l2)
end.
Because Coq is a dependently typed language, it's possible to use
the Inductive keyword to generate *families* of types, indexed by
a variable.
Here is an example where we use this feature to incorporate the
length of a list as part of its type, which lets us use the type
checker to enforce guarantees about the length of lists, such as
guaranteeing that if a list of length n is appended to a list of
length m, the result is of length n + m.
Inductive vector (A : Type) : nat → Type :=
| vnil : vector A 0
| vcons : ∀ n : nat, A → vector A n → vector A (S n).
Arguments vnil {A}.
Arguments vcons {A} {n} head tail.
vector is both a polymorphic type (one can have a vector of A's
for all types A) and a dependent type, a family of types indexed by
the natural numbers.
It has two constructors: vnil, which is an element of the type vector A 0;
and vcons. To define a function on the type of vectors,
it suffices to define it on the constructors.
This function can be seen as a "verified" version of the concat
function we defined earlier. The fact that this definition
type-checks means that Coq has guaranteed that indeed the
concatenation of a list of length n with a list of length m is of
length n + m.
Definition vconcat (A : Type) {n : nat} (l1 : vector A n) {m : nat} (l2 : vector A m) : vector A (n + m) :=
The type signature above is easy to read and reflects how we might
want to use it, but giving every argument a name
(fixing some specific n and l1, as we did)
is not really a good fit for defining
functions by recursion. The fixpoint operator fix is used to define a
function quantified over all its inputs, not just the specific ones
given to us;
so we're just going to define this function for all
n' and l1', and then apply it to the specific n and l1 given to us.
I introduce new variable names in defining vconcat_rec
to avoid confusion;
you could reuse the same names, but it would just shadow the outer names.
(fix vconcat_rec {n'} (l1' : vector A n') : vector A (n' + m) :=
match l1' with
| vnil ⇒ l2
| vcons head tail ⇒ vcons head (vconcat_rec tail)
end) n l1.
match l1' with
| vnil ⇒ l2
| vcons head tail ⇒ vcons head (vconcat_rec tail)
end) n l1.
In the example above, if you remove : vector A (n' + m), it
won't compile. Coq is unable to infer the return type unless it is
explicitly annotated. As a general rule, if your code doesn't
compile, adding more type annotations and filling in implicit
arguments may help you get better error messages, and make the
typechecker happy, up to a point.
However, if you don't understand how to correctly communicate type
annotations to the compiler, then all your efforts will be in
vain, which is why it's so important to understand how to read and
write dependent pattern matching type annotations in Coq - but, at
the same time, this is probably one of the most challenging things
about learning Coq.
I learned how to do this by reading Adam Chlipala's explanation
(here),
which I
think is very good; I'm just trying to explain it in my own words
and give examples. Read the section
"The One Rule of Dependent Pattern Matching"
Let's start with an example.
is_even is a dependent type, a family of types indexed by the natural numbers.
It has two constructors:
- z_is_even, which is an element of the type s_even 0;
- ss_is_even, which is a function from is_even n to is_even (S (S n)) for any natural number n.
Inductive is_even : nat → Type :=
| z_is_even : is_even 0
| ss_is_even : ∀ n : nat, is_even n → is_even (S (S n)).
We can think of the is_even family of types as a predicate on the natural
numbers. To prove that a number n is even, we construct an inhabitant
of the type is_even n.
The bigger picture is that we have some base type, A (here A is nat)
and then an inductively generated dependent type B : A → Type.
How can we map into and out of B?
Well, constructing inhabitants of B should be the easy part, we just
use the constructors.
The nontrivial part is how do we map out of B, i.e., structurally
recurse on the dependent type?
Say you have built a term:
a : A, b : B a |- t(a,b) : C
What might the type of t be? C could be a constant type, or
C could vary as a function of a.
In full generality, C could be a function of both a and b.
So, in writing down a function t of a and b, we need to be able
to express how the type of t varies as a function of a and b.
If B is an inductive data type with multiple cases,
we need to express how C varies as a function of a and b in each
of the possible cases of b.
For this we extend match with new syntax.
Definition sum_even:
∀ n m : nat, is_even n → is_even m → is_even (n + m) :=
fix sum_even_rec (n m : nat) (p : is_even n) (q : is_even m) : is_even (n + m) :=
match p as b in is_even a return is_even (a + m) with
| z_is_even ⇒ q
| ss_is_even n' p' ⇒ ss_is_even (n' + m) (sum_even_rec n' m p' q)
end.
This example extends the match statement with new syntax, the
keywords as, is and return.
The basic meaning of return is clear: it specifies the return
type of the match statement. But unlike languages such as Haskell
or OCaml, the return type of the match statement is allowed to
vary as a function of its argument, and so the type specified
after the return keyword is allowed to depend on the input.
The keyword as lets us declare a new variable, in this case b,
which refers to the input to the match; we can refer to the
variable b in the return type, if the return type depends on the
input. In this case, it doesn't: we are trying to build a term of
type is_even (n + m), and the type is_even (n + m) doesn't
depend on the proof p. So, we could have written an underscore _
instead of b, or even omitted as b altogether.
The keyword in also allows us to introduce one or more new
variables, here called a, which are the indexing arguments of
the dependent type we're matching on.
The return type is allowed to depend on both a and b.
The variables a and b cannot be referred to anywhere outside the
definition of the return type - their only purpose is to explain
how the return type varies with each branch of the pattern match.
a and b will take on different values as a result of the different
branches. In the first branch, (after the | z_is_zero ⇒ ),
b becomes equal to the constant zz_is_even, and a becomes equal to 0.
Thus, at q , the return type is
is_even (0 + m).
In the second branch, a becomes equal to (S (S n')) and
b is equal to ss_is_even n' p'; thus, the return type is
is_even (S (S n') + m).
We can confirm this by entering proof mode, putting holes
to replace q and ss_is_even (...),
and using refine to fill in everything but the holes.
The interactive proof mode shows us what the type of each hole is.
As you can see, it depends on the value of the variable a.
From Coq Require Import Ltac.
Definition sum_even' :
∀ n m : nat, is_even n → is_even m → is_even (n + m).
Proof.
refine (fix sum_even_rec (n m : nat) (p : is_even n) (q : is_even m) :
is_even (n + m) :=
match p as b in is_even a return is_even (a + m) with
| z_is_even ⇒ _
| ss_is_even n' p' ⇒ _
end).
Abort.
Let's try to prove the induction principle for the natural numbers by
using dependent pattern matching. We can often get away without type
annotations - see what happens when you delete as w return P w.
Definition nat_induction :
∀ (P : nat → Type),
(P 0) → (∀ n, P n → P (S n)) → (∀ n, P n) :=
fun P pf_0 pf_succ ⇒
fix nat_rec (n : nat) : P n :=
match n as w return P w with
| 0 ⇒ pf_0
| S n' ⇒ pf_succ n' (nat_rec n')
end.
Try the same trick here of looking at the return type using refine
to see how the context and return type changes inside the branches
| tt ⇒ _ and | ff ⇒ _ .