Avigad’s MLC — First order logic

Last year I wrote a number of posts on Jeremy Avigad’s major recent book for advanced students, Mathematical Logic and Computation (CUP, 2022). I was reading it with an eye to seeing what parts might be recommended in the next iteration of the Study Guide. This is a significantly shorter version combining the posts, now removed, on the first part of the book. A brisker post on the rest of the book will follow.

The first seven chapters of MLC, some 190 pages, form a book within the book, on core FOL topics but with an unusually and distinctively proof-theoretic flavour. This is well worth having. But a reader who is going to happily navigate and appreciate the treatments of topics here will typically need significantly more background in logic than Avigad implies. The exposition is often very brisk, and the amount of motivational chat is variable and sometimes minimal. So — to jump to the verdict — some parts of this book will indeed be recommended in the Guide, but as supplementary reading for those who have already tackled one of the standard FOL texts.

To get down to details …

Chapter 1 of MLC is on “Fundamentals”, aiming to “develop a foundation for reasoning about syntax”. So we get the usual kinds of definitions of inductively defined sets, structural recursion, definitions of trees, etc. and applications of the abstract machinery to defining the terms and formulas of FOL languages, proving unique parsing, etc., but all done in a quite hard-core way. But as JA notes, the reader can skim and skip and return to the details later on a need-to-know basis.

But there is one stand-out decision that is perhaps worth commenting on. Take the two expressions \forall xFx and \forall yFy. The choice of bound variable is of course arbitrary. It seems we have two choices here:

  1. Just live with the arbitrariness. Allow such expressions as distinct formulas, but prove that formulas like these which can be turned into each other by the renaming of bound variables (formulas which are \alpha-equivalent, as they say) are always interderivable, and are logically equivalent too.
  2. Say that formulas proper are what we get by quotienting expressions by \alpha-equivalence, and lift our first-shot definitions of e.g. wellformedness for expressions of FOL to become definitions of wellformedness for the more abstract formulas proper of FOL.

Now, as JA says, there is in the end not much difference between these two options; but he plumps for the second option, and for a reason. The thought is this. If we work at expression level, we will need a story about allowable substitutions of terms for variables that blocks unwanted variable-capture. And JA suggests there are three ways of doing this, none of which is entirely free from trouble according to him.

  a. Distinguish free from bound occurrences of variables, define what it is for a term to be free for a variable, and only allow a term to be substituted when it is free to be substituted. Trouble: “involves inserting qualifications everywhere and checking that they are maintained.”
  b. Modify the definition of substitution so that bound variables first get renamed as needed, so that the result of substituting y + 1 for x in \exists y(y > x) is something like \exists z(z > y + 1). Trouble: “Even though we can fix a recipe for executing the renaming, the choice is somewhat arbitrary. Moreover, because of the renamings, statements we make about substitutions will generally hold only up to \alpha-equivalence, cluttering up our statements.”
  c. Maintain separate stocks of free and bound variables, so that the problem never arises. Trouble: “Requires us to rename a variable whenever we wish to apply a binder.”
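To make the renaming approach concrete, here is a minimal Python sketch of capture-avoiding substitution at the level of expressions. The tuple encoding of terms and formulas, the function names, and the recipe for choosing a fresh variable (try z, then z1, z2, …) are all my own assumptions for illustration; none of this is taken from MLC. The arbitrariness of that recipe is exactly the arbitrariness JA complains about.

```python
from itertools import count

# Hypothetical encoding (not from the book):
# Terms:    ("var", name) | ("fn", name, (arg, ...))
# Formulas: ("rel", name, (term, ...)) | ("not", A) | ("and", A, B)
#           | ("or", A, B) | ("imp", A, B)
#           | ("forall", x, A) | ("exists", x, A)

def term_vars(t):
    """Variables occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(term_vars(a) for a in t[2]))

def free_vars(f):
    """Free variables of a formula."""
    tag = f[0]
    if tag == "rel":
        return set().union(*(term_vars(a) for a in f[2]))
    if tag in ("forall", "exists"):
        return free_vars(f[2]) - {f[1]}
    return set().union(*(free_vars(g) for g in f[1:]))

def subst_term(t, x, s):
    """Replace the variable x by the term s throughout the term t."""
    if t[0] == "var":
        return s if t[1] == x else t
    return ("fn", t[1], tuple(subst_term(a, x, s) for a in t[2]))

def fresh(avoid):
    """An arbitrary recipe for a fresh name: z, then z1, z2, ..."""
    if "z" not in avoid:
        return "z"
    return next(f"z{i}" for i in count(1) if f"z{i}" not in avoid)

def subst(f, x, s):
    """Substitute s for free x in f, renaming bound variables as needed."""
    tag = f[0]
    if tag == "rel":
        return ("rel", f[1], tuple(subst_term(a, x, s) for a in f[2]))
    if tag in ("forall", "exists"):
        y, body = f[1], f[2]
        if y == x or x not in free_vars(body):
            return f  # x does not occur free here, so nothing changes
        if y in term_vars(s):
            # The binder would capture a variable of s: rename it first.
            z = fresh(term_vars(s) | free_vars(body) | {x})
            body = subst(body, y, ("var", z))
            y = z
        return (tag, y, subst(body, x, s))
    return (tag,) + tuple(subst(g, x, s) for g in f[1:])

# The example from the text: substitute y + 1 for x in "exists y (y > x)".
formula = ("exists", "y", ("rel", ">", (("var", "y"), ("var", "x"))))
y_plus_1 = ("fn", "+", (("var", "y"), ("fn", "1", ())))
print(subst(formula, "x", y_plus_1))
```

With this particular recipe the bound y is renamed to z, giving the tuple encoding of \exists z(z > y + 1). A different recipe would give a different but \alpha-equivalent result, which is why claims about substitution made this way hold only up to \alpha-equivalence.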

But the supposed trouble counting against the third option is, by my lights, no trouble at all. Why so?

JA is arguably quite misdescribing what is going on in that case. Taking the Gentzen line, we distinguish constants with their fixed interpretations, parameters or temporary names whose interpretation can vary, and bound variables which are undetachable parts of a quantifier-former we might represent ‘\forall x \ldots\ x\ldots \ x\ldots’. And when we quantify Fa to get \forall xFx we are not “renaming a variable” (a trivial syntactic change); rather we are, in one go so to speak, replacing the parameter with a variable and prefixing a linked quantifier, and that complex makes a single semantic unit which has a quite different semantic role from a parameter. There’s a good Fregean principle: use different bits of syntax to mark different semantic roles. And that’s what is happening here when we replace the ‘a’ by the ‘x’ and at the same time bind with the quantifier ‘\forall x’.

So it seems to me that option 1c, i.e. working at expression level with separate stocks of free and bound variables, is in fact very markedly more attractive than JA has it (it handles issues about substitution nicely, and meshes with the elegant story about semantics which has \forall xFx true on an interpretation when Fa is true however we extend that interpretation to give a referent to the temporary name a). The simplicity of 1c compared with option 2 in fact gets the deciding vote for me.

After the chapter of preliminaries, MLC has two chapters on propositional logic (substantial chapters too, some fifty-five large format pages between them, and they range much more widely than the usual sort of introductions to PL in math logic books).

JA’s general approach foregrounds syntax and proof theory. So these two chapters start with §2.1 quickly reviewing the syntax of the language of PL (with \land, \lor, \to, \bot as basic — so negation has to be defined by treating \neg A as A \to \bot). §2.2 presents a Hilbert-style axiomatic deductive system for minimal logic, which is augmented to give systems for intuitionist and classical PL. §2.3 says more about the provability relations for the three logics (initially defined in terms of the existence of a derivation in the relevant Hilbert-style system). §2.4 then introduces natural deduction systems for the same three logics, and outlines proofs that we can redefine the same provability relations as before in terms of the availability of natural deductions. §2.5 notes some validities in the three logics and §2.6 is on normal forms in classical logic. §2.7 then considers translations between logics, e.g. the Gödel-Gentzen double-negation translation between intuitionist and classical logic. Finally §2.8  takes a very brisk look at other sorts of deductive system, and issues about decision procedures.

As you’d expect, this is all technically just fine. But I strongly suspect an amount of prior knowledge will be pretty essential if you are really going to get much out of the discussions here. Yes, the point of the exercise isn’t to get the reader to be a whizz at knocking off complex Gentzen-style natural deduction proofs (for example); but are there quite enough worked examples for the genuine newbie to get a good feel for the claimed naturalness of such proofs? Is a single illustration of a Fitch-style alternative helpful? I’m very doubtful.

To continue, Chapter 3 is on semantics. We get the standard two-valued semantics for classical PL, along with soundness and completeness proofs, in §3.1. Then we get interpretations in Boolean algebras in §3.2. Next, §3.3 introduces Kripke semantics for intuitionistic (and minimal) logic; as I said, JA is indeed casting his net significantly more widely than usual in introducing PL. §3.4 gives algebraic and topological interpretations for intuitionistic logic. And the chapter ends with a pretty challenging §3.5, ‘Variations’, introducing what JA calls generalised Beth semantics. As you can see, a lot is going on here!

Still, I think that for someone coming to MLC who already does have enough logical background (perhaps a bit half-baked, perhaps rather fragmentary) and who is mathematically adept enough, these chapters — perhaps initially minus their last sections — should bring a range of technical material into a nicely organised story in a very helpful way, giving a good basis for pressing on through the book.

The next two chapters of MLC are on the syntax and proof systems for FOL (in three flavours again: minimal, intuitionistic, and classical) and then on semantics and a smidgin of model theory. Again, things proceed at considerable pace, and ideas come thick and fast.

So in a bit more detail, how do Chapters 4 and 5 proceed? Broadly following the pattern of the two chapters on PL, in §4.1 we find a brisk presentation of FOL syntax (in the standard form, with no syntactic distinction made between variables-as-bound-by-quantifiers and variables-standing-freely). Officially, recall, wffs that result from relabelling bound variables are identified. But this seems to make little difference: I’m not sure what the gain is, at least here in these chapters, in a first encounter with FOL.

§4.2 presents axiomatic and ND proof systems for the quantifiers, adding to the systems for PL in the standard ways. §4.3 deals with identity/equality and says something about the “equational fragment” of FOL. §4.4 says more than usual about equational and quantifier-free subsystems of FOL, noting some (un)decidability results. §4.5 briefly touches on prenex normal form. §4.6 picks up the topic (dealt with in much more detail than usual) of translations between minimal, intuitionist, and classical logic. §4.7 is titled “Definite Descriptions” but isn’t as you might expect about how to add a description operator, a Russellian iota, but rather about how — when we can prove \forall x\exists! yA(x, y) — we can add a function symbol f such that f(x) = y holds when A(x, y), and all goes as we’d hope. Finally, §4.8 treats two topics: first, how to mock up sorted quantifiers in single-sorted FOL; and second, how to augment our logic to deal with partially defined terms. That last subsection is very brisk: if you are going to treat any varieties of free logic (and I’m all for that in a book at this level, with this breadth) there’s more worth saying.

Then, turning to semantics, §5.1 is the predictable story about full classical logic with identity,  with soundness and completeness theorems, all crisply done. §5.2 tells us more about equational and quantifier-free logics.  §5.3 extends Kripke semantics to deal with quantified intuitionistic logic. We then get algebraic semantics for classical and intuitionistic logic in §5.4 (so, as before, JA is casting his net more widely than usual — though the treatment of the intuitionistic case is indeed pretty compressed). The chapter finishes with a fast-moving 10 pages giving us two sections on model theory. §5.5 deals with some (un)definability results, and talks briefly about non-standard models of true arithmetic. §5.6 gives us the L-S theorems and some results about axiomatizability. So that’s a great deal packed into this chapter. And at a sophisticated level too — it is perhaps rather telling that JA’s note at the end of the chapter gives Peter Johnstone’s book on Stone Spaces as a “good reference” for one of the constructions!

The same judgement applies, I think, as to the chapters on PL: very good material for someone already on top of the basics, and wanting to consolidate and expand their knowledge, but not the place to start.

One minor comment: I note that JA does define a model for a FOL language in the standard way as having a set for quantifiers to range over,  but with a function (of the right arity) over that set as interpretation for each function symbol, and a relation (of the right arity) over that set as interpretation for each relation symbol. My attention might have flickered, but JA seems happy to treat functions and relations as they come, not explicitly trading them in for set-theoretic surrogates (sets of ordered tuples). But then it is interesting to ask — if we treat functions and relations as they come, without going in for a set-theoretic story, then why not treat the quantifiers as they come, as running over some objects plural? That way we can interpret e.g. the first-order language of set theory (whose quantifiers run over more than set-many objects) without wriggling. JA does in general seem to nicely downplay the unnecessary invocation of sets — though not quite consistently. I’d go for consistently avoiding unnecessary set talk from the off — thus making it much easier for the beginner at serious logic to see when set theory starts doing some real work for us. Three cheers for sets: but in their proper place!

MLC continues, then, with Chapter 6 on Cut Elimination. And the order of explanation here is, I think, interestingly and attractively novel.

Yes, things begin in a familiar way. §6.1 introduces a standard sequent calculus for (minimal and) intuitionistic FOL without identity. §6.2 then, again in the usual way, gives us a sequent calculus for classical logic by adopting Gentzen’s device of allowing more than one wff to the right of the sequent sign. But then JA notes that we can trade in two-sided sequents, which allow sets of wffs on both sides, for one-sided sequents where everything originally on the left gets pushed to the right of the sequent sign (being negated as it goes). These one-sided sequents (if that’s really the best label for them) are, if I recall, not treated at all in Negri and von Plato’s lovely book on structural proof theory; and they are mentioned as something of an afterthought at the end of the relevant chapter on Gentzen systems in Troelstra and Schwichtenberg. But here in MLC they are promoted to centre stage.

So in §6.2 we are introduced to a calculus for classical FOL using such one-sided, disjunctively-read, sequents (we can drop the sequent sign as now redundant) — and it is taken that we are dealing with wffs in ‘negation normal form’, i.e. with conditionals eliminated and negation signs pushed as far as possible inside the scope of other logical operators so that they attach only to atomic wffs. This gives us a very lean calculus. There’s the rule that any \Gamma, A, \neg A with A atomic counts as an axiom. There’s just one rule each for \land, \lor, \forall, \exists. There also is a cut rule, which tells us that from \Gamma, A and \Gamma, {\sim}{A} we can infer \Gamma (here {\sim}{A} is notation for the result of putting the negation of A in negation normal form).
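For readers who haven’t met negation normal form, here is a minimal Python sketch of the conversion, together with the {\sim}{A} operation and the axiom check for one-sided sequents. The tuple encoding and the function names are my own assumptions, not JA’s; and for simplicity I treat negation as a primitive that attaches to atoms and ignore \bot, which is a simplification of the book’s official syntax.

```python
# Hypothetical encoding (not from the book):
# ("atom", name) | ("not", A) | ("and", A, B) | ("or", A, B)
# | ("imp", A, B) | ("forall", x, A) | ("exists", x, A)

# Pushing a negation inwards swaps each operator for its dual.
DUAL = {"and": "or", "or": "and", "forall": "exists", "exists": "forall"}

def nnf(f):
    """Negation normal form: no conditionals, negations only on atoms."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return dual(f[1])
    if tag == "imp":
        # A -> B is treated as (not A) or B
        return ("or", dual(f[1]), nnf(f[2]))
    if tag in ("and", "or"):
        return (tag, nnf(f[1]), nnf(f[2]))
    if tag in ("forall", "exists"):
        return (tag, f[1], nnf(f[2]))
    raise ValueError(f"unknown connective: {tag}")

def dual(f):
    """~A: the negation of f, put into negation normal form."""
    tag = f[0]
    if tag == "atom":
        return ("not", f)
    if tag == "not":
        return nnf(f[1])
    if tag == "imp":
        # not (A -> B) is treated as A and (not B)
        return ("and", nnf(f[1]), dual(f[2]))
    if tag in ("and", "or"):
        return (DUAL[tag], dual(f[1]), dual(f[2]))
    if tag in ("forall", "exists"):
        return (DUAL[tag], f[1], dual(f[2]))
    raise ValueError(f"unknown connective: {tag}")

def is_axiom(gamma):
    """A one-sided sequent (a set of NNF wffs) is an axiom if it
    contains some atom together with that atom's negation."""
    return any(f[0] == "atom" and ("not", f) in gamma for f in gamma)

# Example: not forall x (P(x) -> Q(x)) becomes exists x (P(x) and not Q(x)).
P, Q = ("atom", "P(x)"), ("atom", "Q(x)")
print(nnf(("not", ("forall", "x", ("imp", P, Q)))))
print(is_axiom({P, ("not", P), Q}))
```

Since the rules only ever need to look at \land, \lor, \forall, \exists and negated atoms, it is easy to see from a sketch like this why the resulting calculus has so few cases to consider.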

And JA now proves twice over that this cut rule is eliminable. So first in §6.3 we get a semantics-based proof that the calculus without cut is already sound and complete. Then in §6.4 we get a proof-theoretic argument that cuts can be eliminated one at a time, starting with cuts on the most complex formulas, with a perhaps exponential increase in the depth of the proof at each stage — you know the kind of thing! Two comments:

  1. The details of the semantic proof will strike many readers as familiar, being closely related to the soundness and completeness proofs for a Smullyan-style tableaux system for FOL. And indeed, it’s an old idea that Gentzen-style proofs and certain kinds of tableaux can be thought of as essentially the same, though conventionally written in opposite up-down directions (see Ch XI of Smullyan’s 1968 classic First-Order Logic). In the present case, Avigad’s one-sided sequent calculus without cut is in effect a block tableau system for negation normal formulas where every wff is signed F. Given that those readers whose background comes from logic courses for philosophers will probably be familiar with tableaux (truth-trees), and indeed given the elegance of Smullyan systems, I think it is perhaps a pity that JA misses the opportunity to spend a little time on the connections.
  2. JA’s sparse one-sided calculus does make for a nicely minimal context in which to run a bare-bones proof-theoretic argument for the eliminability of the cut rule, where we have to look at a very small number of different cases in developing the proof instead of having to hack through the usual clutter. That’s a very nice device! I do have to report though that, to my mind, JA’s mode of presentation doesn’t really make the proof any more accessible than usual. In fact, once again  the compression makes for quite hard going (even though I came to it knowing in principle what was supposed to be going on, I often had to re-read). Even just a few more examples along the way of cuts being moved would surely have helped.

To continue (and I’ll be briefer) §6.5 looks at proof-theoretic treatments of cut elimination for intuitionistic logic, and §6.6 adds axioms for identity into the sequent calculi and proves cut elimination again. §6.7 is called ‘Variations on Cut Elimination’ with a first look at what can happen with theories other than the theory of identity when presented in sequent form. Finally §6.8 returns to intuitionistic logic and (compare §6.5) this time gives a nice semantic argument for the eliminability of cut, going via a generalization of Kripke models.

This is all very good stuff, and I learnt from this. But I hope it doesn’t sound too ungrateful to say that a student new to sequent calculi and cut-elimination proofs would still do best to read the initial chapters of Negri and von Plato (for example) first, if they are later to be able to get a lively appreciation of §6.4 and the following sections of MLC.

Following on from the very interesting Chapter 6 on cut-elimination, MLC has one further chapter on FOL, Chapter 7 on “Properties of First-Order Logic”. There are sections on Herbrand’s Theorem, on the Disjunction Property for intuitionistic logic, on the Interpolation Lemma, on Indefinite Descriptions and on Skolemization. This does nicely follow on from the previous chapter, as the proofs here mostly rely on the availability of cut-elimination. I’m not going to dwell on this chapter, though, which I think most readers will find pretty hard going. Hard going in part because, apart from perhaps the interpolation lemma, it won’t this time be obvious from the off what the point of the various theorems is.

Take for example the section on Skolemization. This goes at pace. And the only comment we get about why this might all matter is at the end of the section, where we read: “The reduction of classical first-order logic to quantifier-free logic with Skolem functions is also mathematically and philosophically interesting. Hilbert viewed such functions (more precisely, epsilon terms, which are closely related) as representing the ideal elements that are added to finitistic reasoning to allow reasoning over infinite domains.” So that’s just one sentence expounding on the wider interest  — which is hardly likely to be transparent to most readers! It would have been good to hear more.

Avigad’s sections in this chapter are of course technically just fine and crisply done, and can certainly be used as a source to consolidate and develop your grip on their cluster of topics if you already have met the key ideas. But once more I can’t recommend starting here.

In his Preface, Avigad writes ‘The material here should be accessible at an advanced undergraduate or introductory graduate level’ and implies that having had a prior introduction to logic might be helpful but isn’t necessary for tackling MLC. I rather suspect that teaching at Carnegie Mellon has given our author a distinctly rosy picture of what most advanced undergraduates/beginning graduates can readily cope with! But let’s forget the advertising pitch, and take the book for what it turns out to be, an advanced text suitable for mathematically apt readers who already have some background in logic and who want to consolidate/push on. Then there is, as indicated, a lot of interesting material here, presented in an often-compressed, though sometimes perhaps over-compressed, way. I definitely learnt from it. So a bumpy ride but (for the appropriately primed reader) one well worth tackling.

6 thoughts on “Avigad’s MLC — First order logic”

  1. Thank you for an interesting discussion of Avigad’s book. A remark about free and bound variables: The proper way is the third, with variables renamed once they are bound, as in Gentzen. The reason for this is that it is the only way to have height-preserving admissibility of alpha-conversion, i.e., to change bound variables without increasing the size of a formal derivation. This matter was found out around 1986, relatively late, but let’s keep in mind that logic is still struggling to advance beyond Stone Age!

    PS Please stop calling equality identity. Nobody knows what the latter is.

    1. Nearly everybody knows what identity is and that, while any x and any y, if identical, are also equal, the reverse is not true. That is a good reason for not calling equality identity, and not calling identity equality.

  2. The sequent-style ND system for first-order logic, as defined in the book, requires that a provable sequent consist of a set of *sentences* as its open hypotheses (pgs 86-87).

    I’m struggling to see how a formula like \forall x[P(x)\rightarrow P(x)] can be proven given such a restriction. I have constructed a derivation tree, but it includes an instance of the identity rule on a sequent with an open formula as a hypothesis: P(x)\vdash P(x).
