Interpreting arithmetic
What is the philosophical significance, if any, of the fact that we can often find interpretations of arithmetic in other mathematical realms?
This is a selection from chapter 1 of my book, Lectures on the Philosophy of Mathematics, MIT Press 2021, an introduction to the subject which I had used as the basis of my lecture series in Oxford.
Mathematicians generally seek to interpret their various theories within one another. How can we interpret arithmetic in other domains? Let us explore a few of the proposals.
Numbers as equinumerosity classes
We have just discussed (see the earlier post on logicism) how Frege, and later Russell, defined numbers as equinumerosity equivalence classes. According to this account, the number 2 is the class of all two-element sets, and the number 3 is the class of all three-element sets. These are proper classes rather than sets, which can be seen as set-theoretically problematic, but one can reduce them to sets via Scott's trick (due to Dana Scott) by representing each class with its set of minimal-rank instances in the set-theoretic hierarchy.
Numbers as sets
Meanwhile, there are several other interpretations of arithmetic within set theory. Ernst Zermelo represented the number zero with the empty set and then successively applied the singleton operation as a successor operation, like this:
0 = ∅,  1 = {∅},  2 = {{∅}},  3 = {{{∅}}},  …
One then proceeds to define in purely set-theoretic terms the ordinary arithmetic structure on these numbers, including addition and multiplication.
John von Neumann proposed a different interpretation, based upon an elegant recursive idea: every number is the set of smaller numbers. On this conception, the empty set ∅ is the smallest number, for it has no elements and therefore, according to the slogan, there are no numbers smaller than it. Next is { ∅ }, since the only element of this set (and hence the only number less than it) is ∅. Continuing in this way, one finds the natural numbers:
0 = ∅,  1 = { 0 },  2 = { 0, 1 },  3 = { 0, 1, 2 },  …
The successor of any number n is the set n ∪ { n }, because in addition to the numbers below n, one adds n itself below this new number. In the von Neumann number conception, the order relation n < m on numbers is the same as the element-of relation n ∈ m, since m is precisely the set of numbers that are smaller than m.
Notice that if we unwind the definition of 3 above, we see that
3 = { ∅, { ∅ }, { ∅, { ∅ } } }.
But what a mess! And it gets worse with 4 and 5, and so on. Perhaps the idea is unnatural or complicated? Well, that way of looking at the von Neumann numbers obscures the underlying idea that every number is the set of smaller numbers, and it is this idea that generalizes so easily to the transfinite, and which smoothly enables the recursive definitions central to arithmetic. Does the fact that
mean that the number 1 is complicated? I do not think so; simple things can also have complex descriptions. The von Neumann ordinals are generally seen by set theorists today as a fundamental set-theoretic phenomenon, absolutely definable and rigid, forming a backbone to the cumulative hierarchy, and providing an interpretation of finite and transfinite arithmetic with grace and utility.
One should not place the Zermelo and von Neumann number conceptions on an entirely equal footing, since the Zermelo interpretation is essentially never used in set theory today except in this kind of discussion comparing interpretations, whereas the von Neumann interpretation, because of its convenient and conceptual advantages, is the de facto standard, used routinely without remark by thousands of set theorists.
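The two set-theoretic encodings can be tried out concretely. The following is a minimal Python sketch, not taken from the book, that models hereditarily finite sets with frozenset; the function names von_neumann and zermelo are illustrative choices.

```python
def von_neumann(n):
    """Von Neumann numeral for n: the set of all smaller numerals."""
    number = frozenset()               # 0 is the empty set
    for _ in range(n):
        number = number | {number}     # successor of k is k ∪ {k}
    return number


def zermelo(n):
    """Zermelo numeral for n: the singleton operation applied n times to the empty set."""
    number = frozenset()               # 0 is the empty set
    for _ in range(n):
        number = frozenset({number})   # successor of k is {k}
    return number


# In the von Neumann encoding, n < m coincides with n ∈ m,
# and the numeral for n has exactly n elements.
for m in range(8):
    assert len(von_neumann(m)) == m
    for n in range(8):
        assert (von_neumann(n) in von_neumann(m)) == (n < m)

# In the Zermelo encoding, membership only records "immediate predecessor".
for m in range(8):
    for n in range(8):
        assert (zermelo(n) in zermelo(m)) == (m == n + 1)
```

The loops check the order-as-membership property only for small numbers; they illustrate the claim rather than prove it.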
Numbers as primitives
Some mathematicians and philosophers prefer to treat numbers as undefined primitives rather than interpreting them within another mathematical structure. According to this view, the number 2 is just that—a primitive mathematical object—and there is nothing more to say about what it is or its more fundamental composition. There is a collection of natural numbers 0, 1, 2, and so on, existing as urelements or irreducible primitives, and we may proceed to construct other, more elaborate mathematical structures upon them.
For example, given the natural numbers, we may construct the integers as follows: We would like to think of every integer as the difference between two natural numbers, representing the number 2 as 7 − 5, for example, and −3 as 6 − 9 or as 12 − 15. The positive numbers arise as differences between a larger number and a smaller number, and negative numbers conversely. Since there are many ways to represent the same difference, we define an equivalence relation, the same-difference relation, on pairs of natural numbers:
(a, b) ∼ (c, d)  ⟺  a + d = c + b.
Notice that we were able to express the same-difference idea by writing a + d = c + b rather than a - b = c - d, using only addition instead of subtraction, which would have been a problem because the natural numbers are not closed under subtraction. Just as Frege defined numbers as equinumerosity classes, we now may define the integers simply to be the equivalence classes of pairs of natural numbers with respect to the same-difference relation. According to this view, the integer 2 simply is the set
{ (2, 0), (3, 1), (4, 2), … }
and -3 is the set
{ (0, 3), (1, 4), (2, 5), … }.
We may proceed to define the operations of addition and multiplication on these new objects. For example, if [(a,b)] denotes the same-difference equivalence class of the pair (a,b), then we define
[(a, b)] + [(c, d)] = [(a + c, b + d)]
[(a, b)] · [(c, d)] = [(ac + bd, ad + bc)].
If we think of the equivalence class [(a, b)] as representing the integer difference a-b, these definitions are natural because we want to ensure in the integers that
(a − b) + (c − d) = (a + c) − (b + d)
(a − b)(c − d) = (ac + bd) − (ad + bc).
There is a subtle issue with our method here; namely, we defined an operation on the equivalence classes [(a, b)] and [(c, d)], but in doing so, we referred to the particular representatives of those classes when writing a + c and b + d. For this to succeed in defining an operation on the equivalence classes, we need to show that the choice of representatives does not matter. That is, we need to show that our operations are well defined with respect to the same-difference relation, so that equivalent inputs give rise to equivalent outputs. If we have equivalent inputs
(a, b) ∼ (a′, b′)
and
(c, d) ∼ (c′, d′),
then we need to show that the corresponding outputs are also equivalent, meaning that
(a + c, b + d) ∼ (a′ + c′, b′ + d′),
and similarly for multiplication. And indeed our definitions are well defined in this sense. Thus, we build a mathematical structure from the same-difference equivalence classes of pairs of natural numbers, and furthermore, we can prove that it exhibits all the properties that we expect in the ring of integers. In this way, we construct the integers from the natural numbers.
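The construction of the integers from pairs of natural numbers can be sketched in Python. This is an illustrative sketch only: pairs stand in for their equivalence classes, the function names are hypothetical, and the multiplication rule (ac + bd, ad + bc) is the standard one obtained by expanding (a − b)(c − d), supplied here as an assumption. The well-definedness check is a brute-force test on small numbers, not a proof.

```python
from itertools import product


def same_difference(p, q):
    """(a, b) ~ (c, d) iff a + d == c + b, using only addition."""
    (a, b), (c, d) = p, q
    return a + d == c + b


def add(p, q):
    """Add representatives: (a - b) + (c - d) = (a + c) - (b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def mul(p, q):
    """Multiply representatives: (a - b)(c - d) = (ac + bd) - (ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)


def check_well_defined(bound=4):
    """Check that equivalent inputs give equivalent outputs, for entries below `bound`."""
    pairs = list(product(range(bound), repeat=2))
    for p, p2, q, q2 in product(pairs, repeat=4):
        if same_difference(p, p2) and same_difference(q, q2):
            if not same_difference(add(p, q), add(p2, q2)):
                return False
            if not same_difference(mul(p, q), mul(p2, q2)):
                return False
    return True


two = (7, 5)           # represents 2 as 7 - 5
minus_three = (6, 9)   # represents -3 as 6 - 9
assert same_difference(minus_three, (12, 15))
assert same_difference(add(two, minus_three), (0, 1))   # 2 + (-3) = -1
assert same_difference(mul(two, minus_three), (0, 6))   # 2 * (-3) = -6
assert check_well_defined()
```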
Similarly, we construct the rational numbers as equivalence classes of pairs of integers by the same-ratio relation. Namely, writing the pair (p, q) in the familiar fractional form p / q, and insisting that q ≠ 0, we define the same-ratio equivalence relation by
p / q ≡ r / s  ⟺  ps = rq.
Note that we used only the multiplicative structure on the integers to do this. Next, we define addition and multiplication on these fractions:
p / q + r / s = (ps + rq) / (qs)
(p / q) · (r / s) = (pr) / (qs)
and verify that these operations are well defined with respect to the same-ratio relation. This uses the fraction p / q as a numeral, a mere representative of the corresponding rational number, which is the same-ratio equivalence class
{ r / s : ps = rq }.
Thus, we construct the field of rational numbers from the integers, and therefore ultimately from the natural number primitives.
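The same pattern can be sketched for the rational numbers. In this hypothetical Python sketch a fraction p / q is the pair (p, q) of integers with q ≠ 0, the same-ratio relation uses only multiplication, and the function names are illustrative. As before, the check covers a small finite range and is not a proof.

```python
from itertools import product


def same_ratio(x, y):
    """p/q and r/s are equivalent iff p*s == r*q."""
    (p, q), (r, s) = x, y
    return p * s == r * q


def add(x, y):
    """p/q + r/s = (p*s + r*q) / (q*s)."""
    (p, q), (r, s) = x, y
    return (p * s + r * q, q * s)


def mul(x, y):
    """(p/q) * (r/s) = (p*r) / (q*s)."""
    (p, q), (r, s) = x, y
    return (p * r, q * s)


def check_well_defined(bound=2):
    """Check that equivalent fractions give equivalent sums and products."""
    numerators = range(-bound, bound + 1)
    denominators = [q for q in range(-bound, bound + 1) if q != 0]
    fractions = list(product(numerators, denominators))
    for x, x2, y, y2 in product(fractions, repeat=4):
        if same_ratio(x, x2) and same_ratio(y, y2):
            if not same_ratio(add(x, y), add(x2, y2)):
                return False
            if not same_ratio(mul(x, y), mul(x2, y2)):
                return False
    return True


half, third = (1, 2), (1, 3)
assert same_ratio(half, (2, 4))                    # 1/2 and 2/4 name the same rational
assert same_ratio(add(half, third), (5, 6))        # 1/2 + 1/3 = 5/6
assert same_ratio(add((2, 4), (3, 9)), (5, 6))     # other numerals, same answer
assert same_ratio(mul(half, (2, 3)), third)        # 1/2 * 2/3 = 1/3
assert check_well_defined()
```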
The program continues to the real numbers, which one may define by various means from the rational numbers, such as by the Dedekind cut construction or with equivalence classes of Cauchy sequences explained in the section on real numbers, and then the complex numbers as pairs of real numbers a + bi, as in the section on complex numbers. And so on. Ultimately, all our familiar number systems can be constructed from the natural number primitives, in a process aptly described by the saying attributed to Leopold Kronecker:
Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk
(God made the integers, all the rest is the work of man.)
Kronecker (1886), quoted in Weber (1893)
Numbers as morphisms
Many mathematicians have noted the power of category theory to unify mathematical constructions and ideas from disparate mathematical areas. A construction in group theory, for example, might fulfill and be determined up to isomorphism by a universal property for a certain commutative diagram of homomorphisms, and the construction might be identical in that respect to a corresponding construction in rings, or in partial orders. Again and again, category theory has revealed that mathematicians have been undertaking essentially similar constructions in different contexts.
Because of this unifying power, mathematicians have sought to use category theory as a foundation of mathematics. In the elementary theory of the category of sets (ETCS), a category-theory-based foundational system introduced by F. William Lawvere (Functorial semantics of algebraic theories, PhD thesis, 1963), one works in a topos, which is a certain kind of category having many features of the category of sets. The natural numbers in a topos are represented by what is called a natural-numbers object, an object ℕ in the category equipped with a morphism z : 1 → ℕ that serves to pick out the number zero, where 1 is a terminal object in the category, and another morphism s : ℕ → ℕ that satisfies a certain universal free-action property, which ensures that it acts like the successor function on the natural numbers. The motivating idea is that every natural number is generated from zero by the successor operation, obtained from the composition of successive applications of s to z:
0 = z,  1 = s ∘ z,  2 = s ∘ s ∘ z,  3 = s ∘ s ∘ s ∘ z,  …
In this conception, a natural number is simply a morphism n : 1 → ℕ. The natural numbers object is unique in any topos up to isomorphism, and any such object is able to interpret arithmetic concepts into category theory.
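For readers who think in code, here is an informal Python sketch of the idea in the category of sets, with morphisms modelled as ordinary functions. It is only an analogy, not ETCS itself: a morphism 1 → ℕ is modelled as a function of no arguments, and the names numeral and recursor are hypothetical. The recursor illustrates the universal property in its usual form: for any a : 1 → A and f : A → A there is a map u : ℕ → A with u ∘ z = a and u ∘ s = f ∘ u.

```python
def z():
    """The morphism z : 1 -> N, modelled as a function of no arguments."""
    return 0


def s(n):
    """The successor morphism s : N -> N."""
    return n + 1


def numeral(k):
    """The morphism 1 -> N obtained by composing s with itself k times after z."""
    def morphism():
        value = z()
        for _ in range(k):
            value = s(value)
        return value
    return morphism


def recursor(a, f):
    """Given a : 1 -> A and f : A -> A, return u : N -> A defined by iterating f on a()."""
    def u(n):
        value = a()
        for _ in range(n):
            value = f(value)
        return value
    return u


# Example: a picks out 1 and f doubles, so the induced map sends n to 2**n.
power_of_two = recursor(lambda: 1, lambda x: 2 * x)
assert power_of_two(z()) == 1                                              # u ∘ z = a
assert all(power_of_two(s(n)) == 2 * power_of_two(n) for n in range(10))   # u ∘ s = f ∘ u
assert numeral(3)() == s(s(s(z())))
```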
Numbers as games
We may even interpret numbers as games. In John Conway's account, games are the fundamental notion, and one defines numbers to be certain kinds of games. Ultimately, his theory gives an account of the natural numbers, the integers, the real numbers, and the ordinals, all unified into a single number system, the surreal numbers. In Conway's framework, the games have two players, Left and Right, who take turns making moves. One describes a game by specifying for each player the move options; on their turn, a player selects one of those options, which in effect constitutes a new game starting from that position, with turn-of-play passing to the other player. Thus, games are hereditarily gamelike: every game is a pair of sets of games,
G = { G_L | G_R },
where the games in G_L are the choices available for Left and those in G_R for Right. This idea can be taken as a foundational axiom for a recursive development of the entire theory. A player loses a game play when they have no legal move, which happens when their set of options is empty.
We may build up the universe of games from nothing, much like the cumulative hierarchy in set theory. At first, we have nothing. But then we may form the game known as zero,
0 = { | },
which has no options for either Left or Right. This game is a loss for whichever player moves first. Having constructed this game, we may form the game known as star,
✱ = { 0 | 0 },
which has the zero game as the only option for either Left or Right. This game is a win for the first player, since whichever player's turn it is will choose 0, which is then a loss for the other player. We can also form the games known as one and two:
1 = { 0 | }  and  2 = { 1 | },
which are wins for Left, and the games negative one and negative two:
−1 = { | 0 }  and  −2 = { | −1 },
which are wins for Right. And consider the games known as one-half and three-quarters:
1/2 = { 0 | 1 }  and  3/4 = { 1/2 | 1 }.
Can you see how to continue?
Conway proceeds to impose numberlike mathematical structure on the class of games: a game is positive, for example, if Left wins, regardless of who goes first, and negative if Right wins. The reader can verify that 1, 2, and 1/2 are positive, while -1 and -2 are negative, and 0 and ✱ are each neither positive nor negative. Conway also defines a certain hereditary orderlike relation on games, guided by the idea that a game G might be greater than the games in its left set and less than the games in its right set, like a Dedekind cut in the rationals. Specifically, G ≤ H if and only if it never happens that h ≤ G for some h in the right set of H, nor H ≤ g for some g in the left set of G; this definition is well founded since we have reduced the question of G ≤ H to lower-rank instances of the order with earlier-created games. A number is a game G where every element of its left set stands in the ≤ relation to every element of its right set. When games are constructed transfinitely, this conception leads to the surreal numbers. Conway defines sums of games G + H and products G × H and exponentials G^H and proves all the familiar arithmetic properties for his game conception of number. It is a beautiful and remarkable mathematical theory.
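The win conditions and the recursive order relation described above can be explored for finite games with a short Python sketch. It is an illustration under stated assumptions rather than Conway's own presentation: a game is modelled as a pair of tuples (left options, right options), the helper names are hypothetical, and the specific forms chosen for 1, 2, −1, −2, and 1/2 are the standard simplest ones, namely { 0 | }, { 1 | }, { | 0 }, { | −1 }, and { 0 | 1 }.

```python
from functools import lru_cache


def game(left=(), right=()):
    """A game is a pair: Left's options and Right's options, each a tuple of games."""
    return (tuple(left), tuple(right))


@lru_cache(maxsize=None)
def wins_moving_first(g, player):
    """True if `player` ('L' or 'R') can force a win when moving first in g."""
    options = g[0] if player == "L" else g[1]
    other = "R" if player == "L" else "L"
    # A player with no options loses; otherwise look for a move after which
    # the opponent, now moving first, cannot win.
    return any(not wins_moving_first(option, other) for option in options)


def is_positive(g):
    """Left wins regardless of who goes first."""
    return wins_moving_first(g, "L") and not wins_moving_first(g, "R")


def is_negative(g):
    """Right wins regardless of who goes first."""
    return wins_moving_first(g, "R") and not wins_moving_first(g, "L")


@lru_cache(maxsize=None)
def le(g, h):
    """G ≤ H iff no right option h' of H has h' ≤ G and no left option g' of G has H ≤ g'."""
    return (not any(le(h_right, g) for h_right in h[1])
            and not any(le(h, g_left) for g_left in g[0]))


zero = game()
star = game([zero], [zero])
one = game([zero])
two = game([one])
neg_one = game(right=[zero])
neg_two = game(right=[neg_one])
half = game([zero], [one])

assert all(is_positive(g) for g in (one, two, half))
assert all(is_negative(g) for g in (neg_one, neg_two))
assert not any(is_positive(g) or is_negative(g) for g in (zero, star))
assert le(zero, half) and le(half, one) and not le(one, half)
assert not le(zero, star) and not le(star, zero)   # star is incomparable with 0
```

The recursion in le terminates because each recursive call replaces one of the two games by one of its options, which was created earlier.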
Junk theorems
Whenever one has provided an interpretation of one mathematical theory in another, such as interpreting arithmetic in set theory, there arises the junk-theorem phenomenon, unwanted facts that one can prove about the objects in the interpreted theory which arise because of their nature in the ambient theory rather than as part of the intended interpreted structure. One has junk theorems, junk properties, and even junk questions.
If one interprets arithmetic in set theory via the von Neumann ordinals, for example, then one can easily prove several strange facts:
2 ∈ 3  and  2 = P(1).
The P notation here means “power” set, the set of all subsets. Many mathematicians object to these theorems on the grounds that we do not want an interpretation of arithmetic, they stress, in which the number 2 is an element of the number 3, or in which it turns out that the number 2 is the same mathematical object as the set of all subsets of the number 1. These are “junk” theorems, in the sense that they are true of those arithmetic objects, but only by virtue of the details of this particular interpretation, the von Neumann ordinal interpretation; they would not necessarily be true of other interpretations of arithmetic in set theory. In the case of interpreting arithmetic in set theory, many of the objections one hears from mathematicians seem concentrated in the idea that numbers would be interpreted as sets at all, of any kind; many mathematicians find it strange to ask, “What are the elements of the number 7?” In an attempt to avoid this junk-theorem phenomenon, some mathematicians advocate certain alternative non-set-theoretic foundations.
To my way of thinking, the issue has little to do with set theory, for the alternative foundations exhibit their own junk. In Conway's game account of numbers, for example, it is sensible to ask, “Who wins 17?” In the arithmetic of ETCS, one may speak of the domain and codomain of the numbers 5 and 7, or form the composition of 5 with the successor operation, or ask whether the domain of 5 is the same as the domain of the real number π. This counts as junk to mathematicians who want to say that 17 is not a game or that numbers do not have domains and cannot be composed with morphisms of any kind. The junk-theorem phenomenon seems inescapable; it will occur whenever one interprets one mathematical theory in another.
Interpretation of theories
Let us consider a little more carefully the process of interpretation in mathematics. One interprets an object theory T in a background theory S, as when interpreting arithmetic in set theory or in the theory of games, by providing a meaning in the language of the background theory S for the fundamental notions of the object theory T. We interpret arithmetic in games, for example, by defining which games we view as numbers and explaining how to add and multiply them. The interpretation provides a translation
φ ↦ φ*
of assertions φ in the object theory T to corresponding assertions φ* in the background theory S. The theory T is successfully interpreted in S if S proves the translations φ* of every theorem φ proved by T. For example, Peano arithmetic (PA) can be interpreted in Zermelo-Fraenkel set theory (ZFC) via the von Neumann ordinals (and there are infinitely many other interpretations).
It would be a stronger requirement to insist on the biconditional—that S proves φ* if and only if T proves φ—for this would mean that the background theory S was no stronger than the object theory T concerning the subject that T was about. In this case, we say that theory S is conservative over T for assertions in the language of T; the interpreting theory knows nothing more about the object theory than T does.
But sometimes we interpret a comparatively weak theory inside a strong universal background theory, and in this case, we might expect the background theory S to prove additional theorems about the interpreted objects, even in the language of the object theory T. For example, under the usual interpretation, ZFC proves the interpretation of arithmetic assertions not provable in PA, such as the assertion Con(PA) expressing the consistency of PA (see later chapter on the incompleteness theorem). This is not a junk theorem, but rather reflects the fact that the background set theory ZFC simply has stronger arithmetic consequences than the arithmetic theory PA. Meanwhile, other nonstandard interpretations of PA in ZFC, such as those obtained by interpreting arithmetic via certain nonstandard models, are not able to prove the interpretation of Con(PA) in ZFC.
The objectionable aspect of junk theorems is not that the foundational theory S proves additional theorems beyond T in the language of T, but rather that it proves S-features of the interpreted T-objects. A junk theorem of arithmetic, for example, when interpreted by the von Neumann ordinals in set theory, is a theorem about the set-theoretic features of these numbers, not a theorem about their arithmetic properties.
What numbers could not be
Paul Benacerraf (What numbers could not be, 1965) tells the story of Ernie and Johnny, who from a young age study mathematics and set theory from first principles, with Ernie using the von Neumann interpretation of arithmetic and Johnny using the Zermelo interpretation (one wonders why Benacerraf did not use the names the other way around, so that each would be a namesake). Since these interpretations are isomorphic as arithmetic structures—there is a way to translate the numbers and arithmetic operations from one system to the other—naturally Ernie and Johnny agree on all the arithmetic assertions of their number systems. Since the sets they used to interpret the numbers are not the same, however, they will disagree about certain set-theoretic aspects of their numbers, such as the question of whether 3 ∈ 17, illustrating the junk-theorem phenomenon.
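The disagreement between Ernie and Johnny can be made concrete with a small Python sketch. It is illustrative only: hereditarily finite sets are modelled with frozenset, and the function names are hypothetical. Both encodings agree on every arithmetic question, since each has a zero and a successor, but they answer the set-theoretic questions differently.

```python
from itertools import combinations


def von_neumann(n):
    """Ernie's numerals: each number is the set of all smaller numbers."""
    number = frozenset()
    for _ in range(n):
        number = number | {number}     # successor of k is k ∪ {k}
    return number


def zermelo(n):
    """Johnny's numerals: each successor is the singleton of its predecessor."""
    number = frozenset()
    for _ in range(n):
        number = frozenset({number})   # successor of k is {k}
    return number


def power_set(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    )


# Is 3 an element of 17? The two interpretations disagree.
assert von_neumann(3) in von_neumann(17)
assert zermelo(3) not in zermelo(17)

# Is 2 the power set of 1? Again the answer depends on the interpretation.
assert von_neumann(2) == power_set(von_neumann(1))
assert zermelo(2) != power_set(zermelo(1))
```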
Benacerraf emphasizes that whenever we interpret one structure in a foundational system such as set theory, there will be numerous other interpretations, which disagree on their extensions, on which particular set is the number 3, for example. Therefore, they cannot all be right, and indeed, at most one (possibly none) of the interpretations is correct, and all the others must involve nonnecessary features of numbers.
Normally, one who identifies 3 with some particular set does so for the purpose of presenting some theory and does not claim that he has discovered which object 3 really is. [III.B]
Pressing the point harder, Benacerraf argues that no single interpretation can be the necessarily correct interpretation of number.
To put the point differently—and this is the crux of the matter—that any recursive sequence whatever would do suggests that what is important is not the individuality of each element but the structure which they jointly exhibit... [W]hether a particular “object”—for example, {{{∅}}}—would do as a replacement for the number 3 would be pointless in the extreme, as indeed it is. “Objects” do not do the job of numbers singly; the whole system performs the job or nothing does. I therefore argue, extending the argument that led to the conclusion that numbers could not be sets, that numbers could not be objects at all; for there is no more reason to identify any individual number with any one particular object than with any other (not already known to be a number). [III.C]
The epistemological problem
In a second influential paper, Mathematical truth (1973), Benacerraf identifies an epistemological problem with the platonist approach of taking mathematical objects as being abstract. If mathematical objects exist abstractly or in an ideal platonic realm, totally separate in causal interaction from our own physical world, how do we interact with them? How can we know of them or have intuitions about them? How are we able even to refer to objects in that perfect platonic world?
W. D. Hart (Benacerraf's dilemma, 1991) describes Benacerraf's argument as presenting a dilemma, a problem with two horns, one metaphysical and the other epistemological. Specifically, mathematics is a body of truths about its subject matter (numbers, functions, sets), which exist as abstract objects. And yet, precisely because they are abstract, we are causally disconnected from their realm. So how can we come to have mathematical knowledge?
It is at least obscure how a person could have any knowledge of a subject matter that is utterly inert, and thus with which he could have no causal commerce. And yet by the first horn of the dilemma, the numbers, functions and sets have to be there for the pure mathematics of numbers, functions and sets to be true. Since these objects are very abstract, they are utterly inert. So it is at least obscure how a person could have any knowledge of the subject matter needed for the truth of the pure mathematics of numbers, functions and sets. As promised, Benacerraf's dilemma is that what seems necessary for mathematical truth also seems to make mathematical knowledge impossible. (p. 98)
Penelope Maddy (Realism in mathematics, 1992), grabbing the bull firmly by the horns, addresses the epistemological objection by arguing that we can gain knowledge of abstract objects through experience with concrete instantiations of them. You open the egg carton from the refrigerator and see three eggs; thus you have perceived, she argues at length, a set of eggs. Through this kind of experience and human evolution, humans have developed an internal set detector, a certain neural configuration that recognizes these set objects, much as we perceive other ordinary objects, just as a frog has a certain neural configuration, a bug detector, that enables it to perceive its next meal. By direct perception, she argues, we gain knowledge of the nature of these abstract objects, the sets we have perceived. However, see the evolution of her views expressed in Maddy (Naturalism in Mathematics, 1997) and subsequent works.
Barbara Gail Montero (1999, 2020) deflates the significance of the problem of causal interaction with abstract objects by pointing out that we have long given up the idea that physical contact is required for causal interaction: consider the gravitational attraction of the Sun and the Earth, for example, or the electrical force pushing two electrons apart. But further, she argues, objections based on the difficulty of causal interactions with abstract objects lack force in light of our general failure to give an adequate account of causality of any kind. If we do not have a clear account of what it means to say that A causes B even in ordinary instances, then how convincing can an objection be based on the difficulty of causal interaction with abstract objects? And this does not even consider the difficulty of providing a sufficient account of what abstract objects are in the first instance, before causality enters the picture.
Sidestepping the issue of causal interaction, Hartry H. Field (Realism, mathematics, and modality, 1988) argues that the essence of the Benacerraf objection can be taken as the problem of explaining the reliability of our mathematical knowledge, in light of the observation that we would seem to have exactly the same mathematical beliefs, even if the mathematical facts had turned out to be different, and this undermines those beliefs. Justin Clarke-Doane (What is the Benacerraf problem? 2017), meanwhile, argues that it is difficult to pin down exactly what the Benacerraf problem is in a way that exhibits all the features of the problem that are attributed to it.
I got so excited that I just bought your book "Lectures on the Philosophy of Mathematics". I have two Brazilian books on mathematical philosophy, but they don't have as many topics as yours, besides, I found your way of narrating much more friendly and exciting.
Could the properties of the natural numbers be the properties shared among all the interpretations? I just think that there does not have to be one correct interpretation. The plurality is conceptually intriguing. Also, the idea of interpreting a theory in a background theory sounds like an embedding, and there are generally many ways to embed an object into an ambient space.