MA103 Coursepack
MA103 Coursepack
Martin Anthony
These lecture notes were edited by Peter Allen. I am very grateful to Martin Anthony
for writing all the good content; the mistakes are my fault.
Contents
Contents
1 Introduction 7
1.1 What is this course about? . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.1 How to get the most out of this course (and all the other maths
courses) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.2 Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.3 Learning objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.4 Topics covered (first half of the course) . . . . . . . . . . . . . . . 12
1.2 Moodle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3 Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Activities and sample exercises . . . . . . . . . . . . . . . . . . . . . . . . 13
2
Contents
3
Contents
4
Contents
5
Contents
6
1
Chapter 1
Introduction
In this very brief introduction, I aim to give you an idea of the nature of this subject
and to advise on how best to approach it. I also give general information about these
notes and recommended reading.
1 × 1 + 1 + 1 + 1 + 1 + 4 = 9 = 3 × 3 = (1 + 2) × (1 + 2)
2 × 2 + 2 + 2 + 2 + 2 + 4 = 16 = 4 × 4 = (2 + 2) × (2 + 2)
3 × 3 + 3 + 3 + 3 + 3 + 4 = 25 = 5 × 5 = (3 + 2) × (3 + 2) and so on . . .
These are concrete examples. You probably see that there is a pattern to the answers we
get. We can write it more generally:
x × x + 4 × x + 4 = ( x + 2) × ( x + 2) .
Choose a number, multiply it by itself, then add your chosen number four times, and
finally add four. You will get the same answer as if you add two to your chosen number
to get a new number, then multiply the new number by itself.
7
1. Introduction
1
indeed long ago that is what people did. Of course, it’s hard to get anything done like
that. If you show the equation to a small child, it won’t mean anything to them, while
they can read and understand the sentence. But once you understand what the
symbols in the equation mean, then it’s much quicker and easier to read or write.
Now we come to proof. Is the statement above (however it’s written) true for some
other values of x than the three we checked by calculation? And if so, why? The
purpose of a proof is not just to be certain that a statement is true. It also explains why
a statement is true. As you probably know, the statement we wrote is true for all
integers. Here is a proof.
Proof.
( x + 2) × ( x + 2)
=( x + 2) × x + ( x + 2) × 2 (multiplication distributes over addition)
= x × x + 2 × x + ( x + 2) × 2 (multiplication distributes over addition)
=x × x + 2 × x + x × 2 + 2 × 2 (multiplication distributes over addition)
=x × x + 2 × x + x × 2 + 4 (2 × 2 = 4)
=x × x + 2 × x + 2 × x + 4 (multiplication is commutative)
= x × x + (2 + 2) × x + 4 (multiplication distributes over addition)
=x × x + 4 × x + 4 (2 + 2 = 4)
We can see that each line is equal to the previous one, for any integer x, because of the
reason given on the right. Most of the reasons are axioms—statements which we are
assuming to be true—and a couple are little calculations which you should check. So
in particular the first and last lines are equal for any integer x, in other words the
statement
( x + 2) × ( x + 2) = x × x + 4 × x + 4
is true for any integer x. That’s what we wanted to prove.
Of course, you will never want to write down a proof in this kind of detail. You
would much rather write at most a couple of lines of algebra expanding out the
brackets, just as you would have done in school, or simply write ’it is obvious that
( x + 2) × ( x + 2) = x × x + 4 × x + 4’. This is fine. You just need to be aware that
when you write ‘it is obvious...’ that you are promising that if someone really wants
to see the details, you would be able to write out the details as above.
Let’s go back to abstraction. These axioms we wrote down above (multiplication
distributes over addition, multiplication is commutative) are statements which you
presumably agree are true for the integers. Of course, they are also true for other
numbers—they are true for real numbers, or complex numbers. That means that the
proof we wrote down works equally well for real numbers, or complex numbers. So
you know, for instance, that
8
1.1. What is this course about?
1
Next term, we’ll give axiomatic definitions of a group and a vector space, and start
proving theorems about abstract groups (and vector spaces). Here ‘abstract’ means
we don’t assume anything about the group except the axioms. This will seem painful
and useless at first: you’ll (by then) know a few concrete examples of groups and of
vector spaces. It will usually be easier to see how to prove the theorems for the
concrete examples. Usually you will have some idea already why the theorems
should be true in the examples, while you won’t have much intuition for how
abstract groups behave. The natural response will be that you don’t want to study
abstract groups, you want to work with the concrete examples you know. But this is
the wrong reaction. The reason is that you will then only learn about the concrete
examples you already know, and you will suffer as soon as in future courses you see
new examples of groups and are expected to immediately know a bunch of facts
about them (and also in the exam, where we will likely test your ability to work with
a new example of an abstract structure).
In this course, we need to work with precise definitions and statements, and you will
need to know these. Not only will you need to know these, but you will have to
understand them, and be able (through the use of them) to demonstrate that you
understand them. Simply learning the definitions without understanding what they
mean is not going to be adequate. I hope that these words of warning don’t
discourage you, but I think it’s important to make it clear that this is a subject at a
higher conceptual level than most of the mathematics you are likely to have studied
before. This does not mean it is incredibly hard and you will struggle. It is not
incredibly hard, and you are quite capable of doing well in this course (or you would
not be here). It does mean, though, that if you are used to getting through school
courses by memorising material without understanding it, then now is the time to
change that (and, by the way, no-one will hire you for your memorisation ability—a
computer does that better!).
In this course, you will learn how to prove mathematical statements precisely. This is a
very different sort of mathematics from that which you will have encountered in
many other mathematics courses you have previously taken, where the emphasis is
on solving problems through calculation. In Abstract Mathematics, one has to be able
to produce convincing mathematical arguments as to why a given mathematical
statement is true or false. For example, a prime number is defined to be a positive
integer greater than 1 that is only divisible by itself and the number 1 (so 7 is a prime
number, but 8 is not). The statement “There are infinitely many prime numbers” is a
mathematical statement, and it is either true (there are infinitely many prime
numbers) or false (there are only a finitely many prime numbers). In fact, the
statement is true. But why? There’s no quick ‘calculation’ we can do to establish the
truth of the statement. What is needed is a proof: a watertight, logical argument. This
is the type of problem we consider in this course.
9
1. Introduction
1
1.1.1 How to get the most out of this course (and all the other
maths courses)
There are two theories about mathematical ability (and intelligence in general). One
theory says that you have what you are born with. The other says that (just like
strength or stamina) it’s something you develop by practice. Various studies have
shown that broadly similar number of students believe each theory, but the ones who
believe ability is something you develop are consistently the ones who do better—and
almost all academic mathematicians believe ability is something you learn and train.
Some people are faster than others, but speed is in the end not all that useful: no
matter how fast you are, if you switch off and coast for a while, you will have trouble
catching up with people who pay attention and work on understanding their courses.
In particular—and this is different to school maths—we will always assume you
understood the previous lectures and courses, and we will use things from those
previous lectures and courses all the time. If you do understand the previous
material—even if you are not so fast—you’ll understand a good deal of the current
lecture (maybe all of it, maybe not quite) in real time, and you won’t need to spend
much time after the lecture going over the material. If you don’t really understand
the previous material, you won’t have a chance to understand large parts of the
lecture and you’ll have to do even more work afterwards to catch up.
In this course, all the theory will be introduced in the lectures, together with some
examples. There will be extra examples sessions (which don’t exist in most
courses—don’t expect them!), which you do not have to attend but which may well
be useful. In the lectures and examples sessions, you should be trying to understand
what is going on. Don’t waste time copying things down which will appear on
Moodle (which is essentially everything). Certainly don’t waste time with a
newspaper or games on your phone, which annoys me and distracts your classmates.
No-one is taking attendance in the lectures; if you’re not going to pay attention, go to
a café instead. If you do not understand something I said or wrote, then probably
either I didn’t explain it properly or I made a mistake, so you should ask questions
(louder, or put your hand up, if I don’t hear). If I ask a question, I really want an
answer. Probably you need to think about the question to answer it, so I will wait
until someone does answer.
For many of you there will be some point in the lectures where you do not
immediately understand what I say, and when you ask a question the answer is still
not very useful. You should keep asking me to explain better in such a situation. It is
possible that I will eventually say that I want to move on and you should think about
it after the lecture. That doesn’t mean I think you are stupid, it generally means I am
failing to understand what exactly you don’t understand, or maybe I do understand
but cannot think of a good way to explain it on the spot. In either case, if you try to
formulate clearly exactly what it is that you do not like (which will take time, which
is why you should do it after the lecture), you will probably find that doing so also
helps you figure out what is going on; once you understand something deeply in this
way you will not forget it. But it would still be useful to tell me about it after the
lecture, so that I can improve the lectures for next year (and if possible give some
more explanation on Moodle directly).
10
1.1. What is this course about?
1
There will also be problems set every week, some online (for which you’ll see results
immediately) and some which you will solve and hand in to your class teacher, who
will mark them and discuss in the next class. The marks you get do not affect your
result in the course. The purpose of the problems is for you to practice and check you
really know what is going on. If you get stuck, hand in half a solution with ‘I don’t
know what to do next’ and your class teacher will tell you (either written on the
work, or maybe many people were stuck in the same place and the class teacher will
go over it in class; usually then there will be a short comment like ‘Will discuss in
class’). Then you learn something. If you don’t hand anything in, or you only hand in
the problems you could solve, you don’t learn anything. The written comments on
your work, and the explanations in class, are the most important piece of feedback
you get—but you only get it if you show us something on which we can give
feedback. On that note—please do not copy work from someone else (or from last
year’s solutions). Doing this is a waste of your time and ours.
Finally, there are office hours and the Maths Support Centre. If you don’t understand
something, you should first try to figure it out for yourself—if you manage, then you
won’t forget it (and you should be happy with yourself). But if you get stuck, then
you should not wait and hope that it magically gets clear. It probably will not, and
you will suffer because you don’t understand something I am assuming you do
understand in my lectures. So go to office hours or the Support Centre and ask
questions. You have already paid for those office hours; use them. You can also try
talking with your friends on the course and seeing if you can figure out what’s going
on—group work can be fun and productive.
You can, of course, also read books (in these notes you’ll see references to a couple
which might be useful). But you do not need to do this. If you really want even more
problems to solve, then you can find them in textbooks, but you will likely feel you
already have enough work. If you don’t understand something I say in the lecture,
and it doesn’t get clear when you talk to friends or go to office hour or try to solve the
week’s problems, you might try reading a textbook to see if a different presentation
helps. However this is really a last resort.
1.1.2 Aims
The course is designed to enable you to:
11
1. Introduction
1
have a knowledge of basic mathematical concepts in discrete mathematics,
algebra, and real analysis;
1.2 Moodle
All information and materials for this course are on Moodle:
http://moodle.lse.ac.uk/course/view.php?id=1989
On the course Moodle page, you will find assignments, solutions, lecture notes, and
so on.
1.3 Reading
There are many books that would be useful for this subject, since abstract
mathematics is a components of almost all university-level mathematics degree
12
1.4. Activities and sample exercises
1
programmes.
For the first half of the course (the part covered by these notes), the following two
R
books are recommended.
Biggs, Norman L., Discrete Mathematics, Second edition. (Oxford University
13
2. Mathematical statements, proof, logic, and sets
2 Chapter 2
Mathematical statements, proof, logic,
and sets
2.1 Introduction
In this course, we want to make precise mathematical statements and establish
whether they are true or not—we want to prove things. But for that, we have to first
understand what a proof is. We will look at fairly simple types of mathematical
statement, in order to emphasise techniques of proof. Some of these statements are
going to be interesting, others are not so interesting—bear in mind that what you are
doing in this part of the course is learning the rules of the game: the play (and more
of the fun) comes later.
In later chapters (such as those on numbers, analysis and algebra) we will use these
proof techniques extensively. You might think that some of the things we prove in
this chapter are very obvious and hardly merit proving, but proving even ‘obvious’
statements can be quite tricky sometimes, and it is good preparation for proving
more complicated things later.
14
2.2. Mathematical statements and proof
(a) 20 is divisible by 4.
(d) 21 is divisible by 3 or 5.
(f) n2 is even.
15
2. Mathematical statements, proof, logic, and sets
Statement (l) asserts the non-existence of a certain pair of numbers (m, n). Another
way of thinking about this√ statement is that it says that for all choices of (m, n), it is
not the case that m/n = 2. (This is an example of the general rule that a
2 non-existence statement can be thought of as a universal statement, something to be
discussed later in more detail.)
It’s probably worth giving some examples of things that are not proper mathematical
statements.
‘6 is a nice number’ is not a mathematical statement. This is because ‘nice number’
has no mathematical meaning. However, if, beforehand, we had defined ‘nice number’
in some way, then this would not be a problem. For example, suppose we said:
Let us say that a number is nice if it is the sum of all the positive numbers
that divide it and are less than it.
Then ‘6 is a nice number’ would be a proper mathematical statement, and it would be
true, because 6 has positive divisors 1, 2, 3, 6 and 6 = 1 + 2 + 3. But without defining
what ‘nice’ means, it’s not a mathematical statement. Definitions are important1 .
‘n2 + n’ is not a mathematical statement, because it does not say anything about
n2 + n. It is not a mathematical statement in the same way that ‘David Cameron’ is
not a sentence: it makes no assertion about what David Cameron is or does.
However, ‘n2 + n > 0’ is an example of a predicate with free variable n and, for a
particular value of n, this is a mathematical statement. Likewise, ‘for all natural
numbers n, n2 + n > 0’ is a mathematical statement.
16
2.2. Mathematical statements and proof
will not be changed. We might say ‘divisible means when we try to divide we get no
remainder’, or some other phrase which has the same mathematical meaning: the
precise words aren’t important. What is important is that the mathematical meaning
is now fixed.
2
Now that we’re all clear on exactly what the statements mean, let’s prove them.
(a) 20 is divisible by 4.
This statement is true. Since 20 = 5 × 4, we see that (by the definition) 20 is
divisible by 4. And that’s a proof! It’s utterly convincing, watertight, and not
open to debate. Nobody can argue with it, not even a sociologist! Isn’t this fun?
Well, maybe it’s not that impressive in such a simple situation, but we will
certainly prove more impressive results later.
(b) 21 is not divisible by 7.
This is false. It’s false because 21 is divisible by 7, because 21 = 3 × 7.
(c) 21 is divisible by 4.
This is false, as can be established in a number of ways. First, we note that if the
natural number m satisfies m ≤ 5, then m × 4 will be no more than 20. And if
m ≥ 6 then m × 4 will be at least 24. Well, any natural number m is either at most
5 or at least 6 so, for all possible m, we do not have m × 4 = 21 and hence there is
no natural number m for which m × 4 = 21. In other words, 21 is not divisible by
4. Another argument (which is perhaps more straightforward, but which relies
on properties of rational numbers rather than just simple properties of natural
numbers) is to note that 21/4 = 5.25, and this is not a natural number, so 21 is
not divisible by 4. (This second approach is the same as showing that 21 has
remainder 1, not 0, when we divide by 4.)
Most of you are probably completely happy with these proofs. Maybe one or two
of you would like to know things like: why is there no natural number between 5
and 6? Do we need to prove it? We’ll get to that in the next chapter.
(d) 21 is divisible by 3 or 5.
As we noted above, this is a compound statement. It is true precisely if one (or
both) of the following statements is true:
(i) 21 is divisible by 3
(ii) 21 is divisible by 5.
Statement (i) is true, because 21 = 7 × 3. Statement (ii) is false. Because at least
one of these two statements is true, statement (d) is true.
(e) 50 is divisible by 2 and 5.
This is true. Again, this is a compound statement and it is true precisely if both of
the following statements are true:
(i) 50 is divisible by 2
(ii) 50 is divisible by 5.
Statements (i) and (ii) are indeed true because 50 = 25 × 2 and 50 = 10 × 5. So
statement (e) is true.
17
2. Mathematical statements, proof, logic, and sets
(f) n2 is even
As mentioned above, whether this is true or false depends on the value of n. For
2 example, if n = 2 then n2 = 4 is even, but if n = 3 then n2 = 9 is odd. So, unlike
the other statements (which are propositions), this is a predicate P(n). The
predicate will become a proposition when we assign a particular value to n to it,
and the truth or falsity of the proposition can then be established. You probably
implicitly assume that n has to be a natural number, but there isn’t actually
anything in the statement to tell you that—maybe n is a matrix, in which case it’s
not even clear what ‘even’ should mean for a matrix (we only defined ‘even’ for
natural numbers). If we assume n is a natural number, then (i) and (j) cover all
the possibilities.
18
2.2. Mathematical statements and proof
19
2. Mathematical statements, proof, logic, and sets
assuming that the opposite conclusion holds (n2 even) we have shown that
something impossible happens. This type of argument is known as a proof by
contradiction and it is often very powerful. We will see more about this later.
2
(k) For natural numbers n, n2 is even if and only if n is even.
This is true. What we have shown in proving (i) and (j) is that if n is even then n2
is even, and if n is odd then n2 is odd. The first, (statement (i)) establishes that if n
is even, then n2 is even. The second of these (statement (j)) establishes that n2 is
even only if n is even. This is because it shows that n2 is odd if n is odd, from
which it follows that if n2 is even, n must not have been odd, and therefore must
have been even. ‘If and only if’ statements if this type are very important. As we
see here, the proof of such statements breaks down into the proof of two ‘If-then’
statements.
√
(l) There are no natural numbers m and n such that 2 = m/n.
This is, in fact, true, though we defer the proof for now, until we know more
about factorisation of numbers into prime numbers. We merely comment that the
easiest way to prove the statement is to use a proof by contradiction.
These examples hopefully demonstrate that there are a wide range of statements and
proof techniques, and in the rest of this chapter we will explore these further.
Right now, one thing I hope comes out very clearly from these examples is that to
prove a mathematical statement, you need to know precisely what it means. Well,
that sounds obvious, but you can see how detailed we had to be about the meanings
(that is, the definitions) of the terms ‘divisible’, ‘even’ and ‘odd’. Definitions are very
important.
2.3.1 Negation
The simplest way to take a statement and form another statement is to negate the
statement. The negation of a statement P is the statement ¬ P (sometimes just denoted
‘not P’), which is defined to be true exactly when P is false. This can be described in
the very simple truth table, Table 2.1:
20
2.3. Some basic logic
P ¬P
T F
F T
2
Table 2.1: The truth table for ‘negation’ or ‘not’
What does the table signify? Quite simply, it tells us that if P is true then ¬ P is false
and if P is false then ¬ P is true.
It has, I hope, been indicated in the examples earlier in this chapter, that to disprove a
universal statement about natural numbers amounts to proving an existential
statement. That is, if we want to disprove a statement of the form ‘for all natural
numbers n, property p(n) holds’ (where p(n) is some predicate, such as ‘n2 is even’)
we need only produce some N for which p( N ) fails. Such an N is called a
counterexample. Equally, to disprove an existential statement of the form ‘there is some
n such that property p(n) holds’, one would have to show that for every n, p(n) fails.
That is, to disprove an existential statement amounts to proving a universal one. But,
now that we have the notion of the negation of a statement we can phrase this a little
more formally. Proving that a statement P is false is equivalent to proving that the
negation ¬ P is true. In the language of logic, therefore, we have the following:
The negation of the universal statement ‘for all n, property p(n) holds’ is the
existential statement ‘there is n such that property p(n) does not hold’.
The negation of the existential statement ‘there is n such that property p(n)
holds’ is the universal statement ‘for all n, property p(n) does not hold’.
We could be a little more formal about this, by defining the negation of a predicate
p(n) (which, recall, only has a definitive true or false value once n is specified) to be
the predicate ¬ p(n) which is true (for any particular n) precisely when p(n) is false.
Then we might say that
The negation of the universal statement ‘for all n, the statement p(n) is true’ is
the existential statement ‘there is n such that ¬ p(n) is true’.
The negation of the existential statement ‘there is n such that p(n) is true’ is the
universal statement ‘for all n, the statement ¬ p(n) is true’.
Now, let’s not get confused here. None of this is really difficult or new. We meet such
logic in everyday life. If I say ‘It rains every day in London’ then either this statement
is true or it is false. If it is false, it is because on (at least) one day it does not rain. The
21
2. Mathematical statements, proof, logic, and sets
negation (or disproof) of the statement ‘On every day, it rains in London’ is simply
‘There is a day on which it does not rain in London’. The former is a universal
statement (‘On every day, . . . ’) and the latter is an existential statement (‘there is a
2 day . . . ’). Or, consider the statement ‘There is a student who enjoys reading these
lecture notes’. This is an existential statement (‘There is . . . ’). This is false if ‘No
student enjoys reading these lecture notes’. Another way of phrasing this last
statement is ‘Every student reading these lecture notes does not enjoy it’. This is a
more awkward expression, but it emphasises that the negation of the initial,
existential statement, is a universal one (‘Every student . . . ’).
The former is an existential statement (‘there is something I will write that . . . ’) and
the latter is a universal statement (‘everything I write will . . . ). This second example
is a little more complicated, but it serves to illustrate the point that much of logic is
simple common sense.
50 is divisible by 2
50 is divisible by 5.
Statement (e) is true because both of these two statements are true.
Table 2.2 gives the truth table for the conjunction P and Q:
P Q P∧Q
T T T
T F F
F T F
F F F
What Table 2.2 says is simply that P ∧ Q is true precisely when both P and Q are true
(and in no other circumstances).
Suppose that P and Q are two mathematical statements. Then ‘P or Q’, also denoted
P ∨ Q, and called the disjunction of P and Q, is the statement that is true precisely
when P, or Q, or both, are true. For example, statement (d) above, which is
‘21 is divisible by 3 or 5’
is the disjunction of the two statements
22
2.4. If-then statements
21 is divisible by 3
21 is divisible by 5.
Statement (d) is true because at least one (namely the first) of these two statements is
2
true.
Note one important thing about the mathematical interpretation of the word ‘or’. It is
always used in the ‘inclusive-or’ sense. So P ∨ Q is true in the case when P is true, or
Q is true, or both. In some ways, this use of the word ‘or’ contrasts with its use in
normal everyday language, where it is often used to specify a choice between
mutually exclusive alternatives. (For example ‘You’re either with us or against us’.)
But if I say ‘Tomorrow I will wear brown trousers or I will wear a yellow shirt’ then,
in the mathematical way in which the word ‘or’ is used, the statement would be true
if I wore brown trousers and any shirt, any trousers and a yellow shirt, and also if I
wore brown trousers and a yellow shirt. You might have your doubts about my dress
sense in this last case, but, logically, it makes my statement true.
Table 2.2 gives the truth table for the disjunction P and Q:
P Q P∨Q
T T T
T F T
F T T
F F F
What Table 2.3 says is simply that P ∨ Q is true precisely when at least one of P and Q
is true.
23
2. Mathematical statements, proof, logic, and sets
P Q P⇒Q
T T T
T F F
2 F T T
F F T
Note that the statement P ⇒ Q is false only when P is true but Q is false. (To go back
to the previous example, the statement ‘If it rains, I wear a raincoat’ is false precisely
if it does rain but I do not wear a raincoat.) This is tricky, so you may have to spend a
little time understanding it. As I’ve suggested, perhaps the easiest way is to think
about when a statement ‘if P, then Q’ is false.
The statement P ⇒ Q can also be written as Q ⇐ P. There are different ways of
describing P ⇒ Q, such as:
if P then Q
P implies Q
P is sufficient for Q
Q if P
P only if Q
Q whenever P
Q is necessary for P.
All these mean the same thing. The first two are the ones I will use most frequently.
If P ⇒ Q and Q ⇒ P then this means that Q will be true precisely when P is. That is
Q is true if and only if P is. We use the single piece of notation P ⇐⇒ Q instead of the
two separate P ⇒ Q and Q ⇐ P. There are several phrases for describing what
P ⇐⇒ Q means, such as:
P is equivalent to Q
24
2.5. Logical equivalence
P Q P⇒Q Q⇒P P ⇐⇒ Q
T T T T T
T F F T F
F T T F F 2
F F T T T
What the table shows is that P ⇐⇒ Q is true precisely when P and Q are either both
true or both false.
Activity 2.1 Look carefully at the truth table and understand why the values for
P ⇐⇒ Q are as they are. In particular, try to explain in words why the truth table
is the way it is.
P Q P∨Q ¬( P ∨ Q) ¬P ¬Q ¬P ∧ ¬Q
T T T F F F F
T F T F F T F
F T T F T F F
F F F T T T T
25
2. Mathematical statements, proof, logic, and sets
Activity 2.3 What is the converse of the statement ‘if the natural number n
divides 4 then n divides 12’? Is the converse true? Is the original statement true?
P Q P⇒Q ¬P ¬Q ¬Q ⇒ ¬P
T T T F F T
T F F F T F
F T T T F T
F F T T T T
If you think about it, the equivalence of the implication and its contrapositive makes
sense. For, ¬ Q ⇒ ¬ P says that if Q is false, P is false also. So, it tells us that we cannot
have Q false and P true, which is precisely the same information as is given by
P ⇒ Q.
So what’s the point of this? Well, sometimes you might want to prove P ⇒ Q and it
will, in fact, be easier to prove instead the equivalent (contrapositive) statement
¬ Q ⇒ ¬ P. We will see many examples of this through your degree—for now, see
Biggs, section 3.5 for an example.
26
2.8. What is a proof?
you wanted to prove. If you are being very formal, you should write down every
single step.
If you write down every single step, you’re in a great position if someone wants to 2
argue with your proof. If someone doesn’t agree with your conclusion—the
statement you’re proving—it’s their problem to find a mistake in your proof. That
means they have to point at some statement in your proof and say that they do not
believe it. Now there are two sorts of statements in your proof: ones which follow
logically from earlier statements, and your axioms. If the doubter says they don’t
believe something which follows logically from earlier statements, then they have to
point at one of these earlier statements and say they don’t like that one either (or they
tell you they don’t believe in logic, in which case you can safely stop listening).
Eventually they will either be convinced you were right all along, or they will get
back to one of your axioms and say they disagree with that. Now, if you have some
strange non-standard axiom, then there might even be a good reason to argue. But if
you stick to standard axioms, like ‘addition of natural numbers is commutative’, then
no-one is going to argue—which means you will convince everyone that what you
claim is true. This is the gold standard of proof.
The problem with writing down every single step is that it takes a very long time to
actually get anywhere. Look back to the proof on page 8—it takes eight lines to do a
piece of algebra which you would normally write out in one line, and even that proof
skips the steps of proving from axioms that 2 × 2 = 2 + 2 = 4 (which we’ll see how to
do in the next chapter). You don’t want to spend the next three years taking pages
and pages to write out simple algebra, so we need to agree on a way to write proofs
which is shorter. There are two ways to do this, and we will use both.
The first way is that, as we go through the course (and the degree) we will make for
ourselves a library of true statements—ones which we already proved—and we will
not repeat the proofs every time we want to use them. So, for example, we already
proved that for every natural number n, the number n2 + n is even (We didn’t really
write out every single step—if you don’t like that, try doing it yourself). Next time we
want to know that n2 + n is even for some natural number n, we won’t need to prove
it, we can just say ‘proved in MA103’. There’s nothing much anyone can object to
here—it’s clear that we could have written out a gold standard proof just by
copying-and-pasting in the proof from MA103.
The second way we will save time is by not writing out every single step. When you
need to do a piece of algebra, do it just as you did in school, and we will assume you
do know how to justify all the steps by going back to the axioms (or at least that you
know where to look in order to find out how). We will also sometimes save steps by
saying that something is ‘obvious’, or ‘clear’. When you (or I) write ‘obvious’ or
‘clear’ in a proof, it is there to tell the reader that there are some steps missing, that
you (or I) know what those steps are, and that the reader should have no trouble
figuring out what the missing steps are. What this also means is: if you cannot
explain why a statement is true, then you cannot write that it is ‘obvious’ in a
proof. You will need to make a judgement of how many steps it is OK to skip.
You will quickly get used to what is and what is not acceptable as a proof—assuming
you do the weekly exercises—because your class teacher will correct you. What you
should keep in mind is that whatever you write as a proof should be something
27
2. Mathematical statements, proof, logic, and sets
which you could expand out to a gold standard proof if you were forced to, either
from memory or because you know where to look for the missing pieces and
previously proved statements.
2
As we go on, those ‘missing pieces and previously proved statements’ will get pretty
long: there will be proofs you write later this year in a page or two which might take
a hundred or more pages to write out in ‘gold standard’ style. For an example (which
you shouldn’t expect to understand when you read this the first time; but it will make
sense when you’re revising) think about how to prove that a piece of simple algebra
with the rational numbers makes sense, in terms of the axioms for the natural
numbers. We prove in this course that you can do it (which is enough—if I know
something is possible, I don’t have to actually do it to check it works)—but try
actually doing it!
Example 2.2 Prove the statement that: ‘if a, b are real numbers and a 6= b, then
ab < ( a2 + b2 )/2’.
It’s certainly not immediately obvious how to approach this. But let’s start with
what we want to prove. This is the inequality ab < ( a2 + b2 )/2, which can be
rewritten as a2 + b2 − 2ab > 0. Now, this can be simplified as ( a − b)2 > 0 and
maybe now you can see why it is true: the given fact that a 6= b means that
a − b 6= 0 and hence ( a − b)2 is a positive number. So we see why the statement is
true. To write down a nice proof, we can now reverse this argument, as follows:
Proof Since a 6= b, a − b 6= 0 and, hence, ( a − b)2 > 0. But ( a − b)2 = a2 + b2 − 2ab.
So we have a2 + b2 > 2ab and, therefore, ab < ( a2 + b2 )/2, as required.
28
2.10. What is not a proof?
There are a few things to note here. First, mathematics is a language and what you
write has to make good sense. Often, it is tempting to make too much use of symbols
rather than words. But the words used in this proof, and the punctuation, make it
easy to read and give it a structure and an argument. You should find yourself using
2
words like ‘so, ‘hence’, ‘therefore, ‘since’, ‘because’, and so on. Do use words and
punctuation and, whatever you do, do not replace them by symbols of your own
invention! Also do not use the symbols ‘∴’ and ‘ ∵0 . You may have seen these in
school, but they make your work hard to read. Write the English words instead! A
second thing to note is the use of the symbol ‘ ’. There is nothing particularly special
about this symbol: others could be used. What it achieves is that it indicates that the
proof is finished. There is no need to use such a symbol, but you will find that
textbooks do make much use of symbols to indicate when proofs have ended. It
makes the reader’s life easier: when you see the symbol, you know that you are
meant to be convinced and that what follows will be a comment, or the next piece of
material. Otherwise you might be left wondering whether the proof is finished yet.
The final, very important, thing to note is that even if you work backwards to get a
proof, you need to write it down forwards. Otherwise you do not have a proof.
29
2. Mathematical statements, proof, logic, and sets
We’re not doing calculations in this course, though, we’re doing proofs. When you
write a proof, you usually know the first and last lines before anything else: the first
line is what you’re assuming, and the last line is what you want to prove. What is
2 important is actually what’s in the middle which explains why the last line is true. If
(when) you get a proof back from your class teacher marked as wrong even though
‘the answer is right’, before complaining, think: does it make a difference to the story
if the middle line is instead ‘You pull out your gun and shoot the child’?
(3) Working in reverse to obtain a proof but then not writing the proof out forwards.
For example, consider trying to prove the following trigonometric identity: for all
x ∈ R, we have
(cos x )2 − sin x = 1 − (sin x )2 + sin x . (2.1)
If you just work in reverse, your proof might be:
Proof.
where to get to the last line we used the identity (sin x )2 + (cos x )2 = 1, which holds
for all x ∈ R. The last line is true, so we are done.
Note that normally you wouldn’t write justifications for each line of simple
algebra—it’s obvious enough how we got from each line to the next—but I wanted to
do this here for extra clarity.
This is not a valid proof—what it shows is that if the identity we want to prove, (2.1),
holds, then 0 = 0, which is a true statement. But a proof is supposed to end with the
statement you want to prove, not start with it.
That might seem picky—after all, it looks pretty much like what we did in
Example 2.2, just we didn’t bother to reverse the argument so the statement we want
is at the end. Well, let’s try reversing it.
Proof, take 2. We have
0 = 1 − 1 + sin x )2 − (sin x )2
2
so (sin x )2 = 1 − 1 + sin x subtracting (sin x )2
2
so (sin x )2 = 1 − (sin x )2 + sin x − (cos x )2 since 1 = (sin x )2 + (cos x )2
so − sin x = 1 − (sin x )2 + sin x − (cos x )2 taking square roots
so (cos x )2 − sin x = 1 − (sin x )2 + sin x adding (cos x )2
Looks better—but wait! The first two ‘so’s are fine, but the third ‘so’, ‘taking square
roots’, boils down to ‘If a2 = b2 then a = b’ — and that’s not true; it could equally
30
2.11. Sets
well be that a = −b. There is a problem with the proof here — and the reason is that
we are trying to prove a false statement! In fact,
2.11 Sets
2.11.1 Basics
You have probably already met some basic ideas about sets and there is not too much
more to add at this stage, but they are such an important idea in abstract mathematics
that they are worth discussing here.
Loosely speaking, a set may be thought of as a collection of objects. A set is usually
described by listing or describing its members, or elements, inside curly brackets. For
example, when we write A = {1, 2, 3}, we mean that the objects belonging to the set
A are the numbers 1, 2, 3 (or, equivalently, the set A consists of the numbers 1, 2 and
3). Equally (and this is what we mean by ‘describing’ its members), this set could
have been written as
Here, the symbol | stands for ‘such that’. Often, the symbol ‘:’ is used instead, so that
we might write
A = {n : n is a whole number and 1 ≤ n ≤ 3}.
B = { x ∈ N | x is even}
has as its members the set of positive even integers. Here we are specifying the set by
describing the defining property of its members.
One point which is important is that it doesn’t make sense to say that an object is in a
set twice. It’s either in or not, and this is the end. We’ll avoid writing obvious
repetitions, like S = {1, 2, 3, 1}. That is a set, and it is the same as the set {1, 2, 3};
whichever way I write it, it contains 1, 2 and 3 and nothing else. But sometimes it will
be painful to write a description avoiding repetition.
31
2. Mathematical statements, proof, logic, and sets
2.11.2 Subsets
We say that the set S is a subset of the set T, and we write S ⊆ T, if every member of S
is a member of T. For example, {1, 2, 5} ⊆ {1, 2, 4, 5, 6, 40}. (Be aware that some texts
use ⊂ where we use ⊆.) What this means is that the statement
x∈S⇒x∈T
is true.
A rather obvious, but sometimes useful, observation is that, given two sets A and B,
A = B if and only if A ⊆ B and B ⊆ A. So to prove two sets are equal, we can prove
that each of these two ‘containments’ holds. That might seem clumsy, but it is, in
many cases, the best approach.
For any set A, the empty set, ∅, is a subset of A. You might think this is strange,
because what it means is that ‘every member of ∅ is also a member of A’. But ∅ has
no members—how can that be true? Let’s go back to the logic: ‘every member of ∅ is
also a member of A’ means ‘for each x, if x in ∅ then x ∈ A’. Check the truth table of
if—then (⇒). The only way some x can be a counterexample to this statement is if x is
in ∅ and not in A. But there is no x such that x ∈ ∅, by definition—so we proved
∅ ⊆ A.
32
2.11. Sets
(mathematical) object can go in a set, so the number 1 can go in, or a function can go
in, or even another set. This is just the same thing as saying that you can put a
(normal) object in a parcel, so an apple can go in a parcel, or an orange can go in a
parcel, or a parcel full of sweets can go in another parcel, and so on. If you think a
2
parcel containing a parcel full of sweets is the same as a parcel full of sweets (or it’s
the same as just having a lot of sweets), think back to childhood games of
Pass-the-Parcel. Just like that game, it really matters how many of the { and } set
brackets there are, and what exactly they go round.
A ∪ B = { x | x ∈ A or x ∈ B}.
x ∈ A ∪ B ⇐⇒ ( x ∈ A) ∨ ( x ∈ B).
Similarly, we define the intersection A ∩ B to be the set whose members belong to both
A and B:
A ∩ B = { x | x ∈ A and x ∈ B}.
So,
x ∈ A ∩ B ⇐⇒ ( x ∈ A) ∧ ( x ∈ B).
33
2. Mathematical statements, proof, logic, and sets
34
2.11. Sets
This looks a little like the fact (observed in an earlier Learning Activity) that ¬( P ∧ Q)
is equivalent to ¬ P ∨ ¬ Q. And this is more than a coincidence. The negation
operation, the conjunction operation, and the disjunction operation on statements
behave entirely in the same way as the complementation, intersection, and union
2
operations (in turn) on sets. In fact, when you start to prove things about sets, you
often end up giving arguments that are based in logic.
For example, how would we prove that A ∩ B = Ā ∪ B̄? We could argue as follows:
x ∈ A ∩ B ⇐⇒ x 6∈ A ∩ B
⇐⇒ ¬( x ∈ A ∩ B)
⇐⇒ ¬(( x ∈ A) ∧ ( x ∈ B))
⇐⇒ ¬( x ∈ A) ∨ ¬( x ∈ B)
⇐⇒ ( x ∈ Ā) ∨ ( x ∈ B̄)
⇐⇒ x ∈ Ā ∪ B̄.
What the result says is, in fact, easy to understand: if x is not in both A and B, then
that’s precisely because it fails to be in (at least) one of them.
For two sets A and B (subsets of a universal set E), the complement of B in A, denoted
by A \ B, is the set of objects that belong to A but not to B. That is,
A \ B = { x ∈ A | x 6 ∈ B }.
35
2. Mathematical statements, proof, logic, and sets
P ( A) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}} .
Activity 2.5 Write down the power set of the set A = {1, 2, 3, 4}.
Activity 2.6 Suppose that A has n members, where n ∈ N. How many members
does P ( A) have?
2.12 Quantifiers
We have already met the ideas of universal and existential statements involving
natural numbers. More generally, given any set E, a universal statement on E is one of
the form ‘for all x ∈ E, P( x )’. This statement is true if P( x ) is true for all x in E, and it
is false if there is some x in E (known as a counterexample) such that P( x ) is false. We
have a special symbol that is used in universal statements: the symbol ‘∀’ means ‘for
all’. So the typical universal statement can be written as
∀ x ∈ E, P( x ).
(The comma is not necessary, but I think it looks better.) An existential statement on E
is one of the form ‘there is x ∈ E such that P( x )’, which is true if there is some x ∈ E
for which P( x ) is true, and is false if for every x ∈ E, P( x ) is false. Again, we have a
useful symbol, ‘∃’, meaning ‘there exists’. So the typical existential statement can be
written as
∃ x ∈ E, P( x ).
Here, we have omitted the phrase ‘such that’, but this is often included if the
statement reads better with it. For instance, we could write
∃n ∈ N, n2 − 2n + 1 = 0,
but it would probably be easier to read
∃n ∈ N such that n2 − 2n + 1 = 0.
Often ‘such that’ is abbreviated to ‘s.t.’. (By the way, this statement is true because
n = 1 satisfies n2 − 2n + 1 = 0.)
We have seen that the negation of a universal statement is an existential statement
and vice versa. In symbols, ¬(∀ x ∈ E, P( x )) is logically equivalent to ∃ x ∈ E, ¬ P( x );
and ¬(∃ x ∈ E, P( x )) is logically equivalent to ∀ x ∈ E, ¬ P( x ).
With these observations, we can now form the negations of more complex statements.
Consider the statement
∀n ∈ N, ∃m ∈ N, m > n.
36
2.12. Quantifiers
What would the negation of the statement be? Let’s take it gently. First, notice that 2
the statement is
∀n ∈ N, (∃m ∈ N, m > n).
The parentheses here do not change the meaning. According to the rules for negation
of universal statements, the negation of this is
∃n ∈ N, ¬(∃m ∈ N, m > n).
But what is ¬(∃m ∈ N, m > n)? According to the rules for negating existential
statements, this is equivalent to ∀m ∈ N, ¬(m > n). What is ¬(m > n)? Well, it’s just
m ≤ n. So what we see is that the negation of the initial statement is
∃n ∈ N, ∀m ∈ N, m ≤ n.
We can put this argument more succinctly, as follows:
¬ (∀n ∈ N(∃m ∈ N, m > n)) ⇐⇒ ∃n ∈ N, ¬(∃m ∈ N, m > n)
⇐⇒ ∃n ∈ N, ∀m ∈ N, ¬(m > n)
⇐⇒ ∃n ∈ N, ∀m ∈ N, m ≤ n.
37
2. Mathematical statements, proof, logic, and sets
2 We’ve seen a small example of proof by contradiction earlier in the chapter. Suppose
you want to prove P ⇒ Q. One way to do this is by contradiction. What this means is
that you suppose P is true but Q is false (in other words, that the statement P ⇒ Q is
false) and you show that, somehow, this leads to a conclusion that you know,
definitely, to be false.
Here’s an example.
This sort of argument can be a bit perplexing when you first meet it. What’s going on
in the example just given? Well, what we show is that if such m, n exist, then
something impossible happens: namely the number 1099 is both even and odd. Well,
this can’t be. If supposing something leads to a conclusion you know to be false, then
the initial supposition must be false. So the conclusion is that such integers m, n do
not exist.
Probably the most famous proof by contradiction is Euler’s proof that there are
infinitely many prime numbers. A prime number is a natural number greater than 1
which is only divisible by 1 and itself. Such numbers have been historically of huge
importance in mathematics, and they are also very useful in a number of important
applications, such as information security. The first few prime numbers are
2, 3, 5, 7, 11, . . . . A natural question is: does this list go on forever, or is there a largest
prime number? In fact, the list goes on forever: there are infinitely many prime
numbers. We’ll mention this result again later. A full, detailed, understanding of the
proof requires some results we’ll meet later, but you should be able to get the flavour
of it at this stage. So here it is, a very famous result:
X = (2 × 3 × 5 × 7 × 11 × · · · × M ) + 1,
which is the product of all the prime numbers (2 up to M), with 1 added. Notice that
X > M, so X is not a prime (because M is the largest prime). If a number X is not
prime, that means that it has a divisor p that is a prime number and which satisfies
1 < p < X. [This is the key observation: we haven’t really proved this yet, but we will
later.] But p must therefore be one of the numbers 2, 3, 5, . . . , M. However, X is not
38
2.14. Some terminology
divisible by any of these numbers, because it has remainder 1 when divided by any of
them. So we have reached a contradiction: on the one hand, X must be divisible by
one of these primes, and on the other, it is not. So the initial supposition that there
were not infinitely many primes simply must be wrong. We conclude there are
2
infinitely many primes.
This proof has been written in a fairly informal and leisurely way to help explain
what’s happening. It could be written more succinctly and a bit more formally:
Proof. Suppose the set of prime numbers is not infinite. Then there are t prime
numbers, for some integer t. In other words, the set of prime numbers is { p1 , . . . , pt }.
Consider the integer N = p1 × p2 × · · · × pt + 1. Now N is bigger than any of
p1 , . . . , pt , so (by our assumption that p1 , . . . , pt are all the prime numbers) it cannot
be prime. And by construction N is not divisible by any of p1 , . . . , pt (if we divide by
any of them we have a remainder of 1). And since 2 and 3 are prime, certainly N is at
least 7, in particular it is bigger than 1. But any integer bigger than 1 is either prime or
it is divisible by a prime number, which is a contradiction.
This proof is still missing a few things—which you can see a bit more clearly because
it’s written formally. Why does the first sentence imply the second? Well, we didn’t
formally define the word ‘infinite’ yet. When we do, you’ll see that the second
sentence is just writing out the definition of ‘not infinite’, also known as ‘finite’. And
we still didn’t prove the final sentence—but hopefully it is a bit more clear what
exactly we do need to prove. It’s worth thinking about this a little bit now—what
exactly is missing? We defined a prime number to be an integer greater than 1 which
is only divisible by 1 and itself. So we need to know what to do if we are given an
integer bigger than 1 which is not prime.
The other point which we should be careful about is the following. Suppose that we
take the first t prime numbers, multiply them together and add one. What we just
proved is that either we will get a new prime number or what we get will be divisible
by a prime number which isn’t one of the first t primes. We don’t have any idea which
of these two things will happen. If you try this for the first few values of t, you see
2+1 = 3
2×3+1 = 7
2 × 3 × 5 + 1 = 31
2 × 3 × 5 × 7 + 1 = 211
2 × 3 × 5 × 7 × 11 + 1 = 2311
which are all prime. It’s tempting to think this pattern will continue, but in fact
2 × 3 × 5 × 7 × 11 × 13 + 1 = 30031 = 59 × 509
is not prime.
39
2. Mathematical statements, proof, logic, and sets
Proposition. (Usually the word ‘Proposition’ is used if the statement does not seem
quite so significant as to merit the description ‘Theorem’.) A theorem that is a
preliminary result leading up to a Theorem is often called a Lemma, and a minor
2 theorem that is a fairly direct consequence of, or special case of, a theorem is called a
Corollary, if it is not significant enough itself to merit the title Theorem. For your
purposes, it is important just to know that these words all mean true mathematical
statements. You should realise that these terms are used subjectively: for instance, the
person writing the mathematics has to make a decision about whether a particular
result merits the title ‘Theorem’ or is, instead, merely to be called a ‘Proposition’.
2.15.1 Introduction
Proving things is difficult. Inevitably, when you read a proof, in the textbooks or in
these notes, you will ask ‘How did the writer know to do that?’ and you will often
find you asking yourself ‘How can I even begin to prove this?’. This is perfectly
normal. This is where the key difference between abstract mathematics and more
‘methods-based’ mathematics lies. If you are asked to differentiate a function, you
just go ahead and do it. It might be technically difficult in some cases, but there is no
doubt about what approaches you should use. But proving something is more
difficult. You might try to prove it, and fail. That’s fine: what you should do in that
case is try another attack. Keep trying until you crack it. (I suppose this is a little bit
like integration. You’ll know that there are various methods, but you don’t
necessarily know which will work on a particular integral, so you should try one, and
keep trying until you manage to find the integral.) Abstract mathematics should
always be done with a large pile of scrap paper at your disposal. You are unlikely to
be able to write down a perfect solution to a problem straight away: some ‘scratching
around’ to get a feel for what’s going on might well be needed, and some false starts
might be pursued first. If you expect to be able to envisage a perfect solution in your
head and then write it down perfectly, you are placing too much pressure on
yourself. You will only be able to do this for easy problems, and we aren’t going to be
doing easy problems. Your lecturers may give the impression of being able to
produce perfect solutions effortlessly, but this is a result of lots of practice and doing
the work in advance of the lecture (i.e. cheating). When we come across a problem we
haven’t seen before, we also try things which turn out not to work.
In this chapter I have tried to indicate that there are methodical approaches to proof
(such as proof by contradiction, for example). What you have to always be able to do
is to understand precisely what it is that you have to prove. That sounds obvious, but
it is something the importance of which is often underestimated. Once you
understand what you need to show (and, here, working backwards a little from that
end-point might be helpful, as we’ve seen), then you have to try to show it. And you
must know when you have done so! So it is inevitable that you will have to take a
little time to think about what is required: you cannot simply ‘dive in’ like you might
to a differentiation question.
All this becomes much easier as you practice it. You should attempt problems from
40
2.15. General advice
the textbooks (and also the problems below). Problems are a valuable resource and
you are squandering this resource if you simply turn to the answers (should these be
available). It is one thing to ‘agree’ with an answer, or to understand a proof, but it is
quite a different thing to come up with a proof yourself. There is no point in looking
2
at the answer before you have tried hard yourself to answer the problem. By trying
(and possibly failing), you will learn more than simply by reading answers. Exam
questions will be different from problems you have seen, so there is no point at all in
‘learning’ answers. You need to understand how to approach problems and how to
answer them for yourself.
x2 = 1 =⇒ x = 1 or x = −1.
This is not only pure laziness, since it’s just as easy to write :
x2 = 1, hence x = 1 or x = −1.
But it is even probably not what was meant! The implication arrow “=⇒” has a
logical meaning “if . . . , then . . . ”. So if you write “x2 = 1 =⇒ x = 1 or x = −1”, then
that really means “if x2 = 1, then x = 1 or x = −1”. And hence this gives no real
information about what x is. On the other hand, writing
means that now we know x = 1 or x = −1 and can use that knowledge in what
follows.
Some other unnecessary symbols that are sometimes used are “∴” and “ ∵ ”. They
mean something like “therefore/hence” and “since/because”. It is best not to use
them, but to write the word instead. It makes things so much easier to read.
41
2. Mathematical statements, proof, logic, and sets
42
2.15. General advice
What about the following: Suppose I take the first t primes, multiply them together
and add one (remember we saw this when we proved that there are infinitely many
primes). We know the result is sometimes prime and sometimes not, depending on t
(we saw examples of both). Are there infinitely many values of t such that we get a
2
prime number? No-one knows the answer; that problem has been open for over 2 300
years.
Do it yourself
Here is one (of many possible) solutions to Problem 2.1:
Given : natural numbers a, b, c, with c ≥ 2.
To prove : there is a natural number n such that an2 + bn + c is not a prime.
By definition, a natural number p is prime if p ≥ 2 and the only divisors of p are 1
and p itself.
Hence to prove : there is a natural number n for which an2 + bn + c is smaller than 2
or it has divisors other than 1 or itself.
Let’s take n = c. Then we have an2 + bn + c = ac2 + bc + c.
But we can write ac2 + bc + c = c ( ac + b + 1), which shows that ac2 + bc + c has c
and ac + b + 1 as divisors.
Moreover, it’s easy to see that neither c nor ac + b + 1 can be equal to 1 or to
ac2 + bc + c.
We’ve found a value of n for which an2 + bn + c has divisors other than 1 or itself.
The crucial step in the answer above is the one in which I choose to take n = c. Why
did I choose that? Because it works. How did I get the idea to take n = c? Ah, that’s
far less obvious. Probably some rough paper and lots of trying was involved. In the
final answer, no information about how this clever idea was found needs to be given.
You probably have no problems following the reasoning given above, and hence you
may think that you understand this problem. But being able to follow the answer,
and being able to find the answer yourself are two completely different matters.
And it is the second skill you are suppose to acquire in this course. (And hence the
skill that will be tested in the examination.) Once you have learnt how to approach
questions such as the above and come up with the clever trick yourself, you have
some hope of being able to answer other questions of a similar type.
But if you only study answers, you will probably never be able to find new
arguments for yourself. And hence when you are given a question you’ve never seen
before, how can you trust yourself that you have the ability to see the “trick” that that
particular question requires ?
For many, abstract mathematics seems full of clever “tricks”. But these tricks have
always been found by people working very hard to get such a clever idea, not by
people just studying other problems and the tricks found by other people.
43
2. Mathematical statements, proof, logic, and sets
down, you must try to explore the material even further. Try to understand the
reason for things that are maybe not explicitly asked.
2 As an illustration of thinking that way, look again at the formulation of the example
we looked at before :
For any natural numbers a, b, c with c ≥ 2, there is a natural number n such that
an2 + bn + c is not a prime.
Why is it so important that c ≥ 2 ? If you look at the proof in the previous section,
you see that that proof goes wrong if c = 1. (Since we want to use that c is a divisor
different from 1.) Does that mean the statement is wrong if c = 1 ? (No, but a different
proof is required.)
And what happens if we allow one or more of a, b, c to be zero or negative?
And what about more complicated expression such as an3 + bn2 + cn + d for some
numbers a, b, c, d with d ≥ 2 ? Could it be possible that there is an expression like this
for which all n give prime numbers ? If you found the answer to the original question
yourself, then you probably immediately see that the answer has to be “no”, since
similar arguments as before work. But if you didn’t try the original question yourself,
and just studied the ready-made answer, you’ll be less well equipped to answer more
general or slightly altered versions.
Once you start thinking like this, you are developing the skills required to be good in
mathematics. Trying to see beyond what is asked, asking yourself new questions and
seeing which you can answer, is the best way to train yourself to become a
mathematician.
44
2.17. Learning outcomes
Now, you might notice this statement P(s) is a bit funny—how can a set possibly be a
member of itself? Well, actually if U is a set, it is a mathematical object so it has to
contain itself. That might already raise a warning sign that strange things are going to
happen, but it’s not actually a logical contradiction; it’s just a bit funny.
2
But what about this set X? Well, by definition X contains everything which is not a
member of itself (and nothing else). So it certainly contains anything which isn’t a set
(because something which isn’t a set doesn’t contain anything at all, let alone itself).
And it certainly contains a lot of sets, like ∅ and {1, 2, 53}. OK, does X contain X?
Well, if not, then by definition it should. So X must contain X. But then by definition,
X cannot contain X. That’s a logical contradiction, pointed out by Bertrand Russell.
It’s really nothing more than a mathematical version of the ‘Barber of Seville’, who
shaves everyone in Seville that doesn’t shave themself. Who shaves the Barber?
What this logical contradiction tells us is that ‘anything goes’ is not OK. Some things
are not sets. We need to give some rules which allow you to construct new sets from
old sets; some axioms of set theory. This is what most mathematicians do (when we
think about such things at all!), and usually we use some axioms called ZFC. These
axioms don’t, for instance, allow you to construct a ‘set of everything’; in fact, they
don’t allow any set to contain itself (because you have to construct new sets from old
sets you already have). These rules don’t—as far as we know—lead to logical
contradictions like Russell’s. If you are worried about trying to explain everything in
mathematics, then a good place to start is with ZFC set theory.
However, ZFC set theory is hard work; you spend a lot of time and energy proving
things which look ‘obvious’, to an even greater extent than you’ll see in the next
chapter (where we discuss axioms for the natural numbers). We had to make a choice:
do we spend all of MA103 building up the basics of mathematics from set theory, so
that you have one (hopefully) consistent foundation for the rest of your degree? Or
do we want to actually do some mathematics? We chose to do the latter, which means
that in this course we are going to assume some things are true without proving
them. In particular, we are going to assume statements like that there is such a thing
as the set of natural numbers N, that it makes sense to talk about sets of pairs such as
{( a, b) : a, b ∈ N}, and so on. All these are things which one can prove from the ZFC
axioms, but we will not do so.
If you dislike this, you should go study ZFC set theory (in the summer, when you
have time!). However don’t expect it to be particularly easy, and don’t expect it to be
an ‘answer to everything’. You’ll still need to assume that ZFC set theory itself makes
sense; there is no proof that it makes sense.
45
2. Mathematical statements, proof, logic, and sets
use truth tables to determine whether logical statements are logically equivalent
2 or not
demonstrate understanding of the meaning of ‘if and only if’ statements and be
able to prove or disprove such statements
prove results by various methods, including directly, by the the method of proof
by contradiction, and by working backwards
Exercise 2.2
Is the following statement about natural numbers n true or false? Justify your answer by
giving a proof or a counterexample:
Exercise 2.3
Prove that ¬( P ∧ Q) and ¬ P ∨ ¬ Q are logically equivalent.
46
2.19. Comments on selected activities
Exercise 2.4
Prove that the negation of P ∨ Q is ¬ P ∧ ¬ Q.
Exercise 2.5
2
Prove that for all real numbers a, b, c, ab + ac + bc ≤ a2 + b2 + c2 .
Exercise 2.6
Prove by contradiction that there is no largest natural number.
Exercise 2.7
Prove that there is no smallest positive real number.
Exercise 2.8
Suppose A and B are subsets of a universal set E. Prove that
( E × E) \ ( A × B) = (( E \ A) × E) ∪ ( E × ( E \ B)).
Exercise 2.9
Suppose that P( x, y) is a predicate involving two free variables x, y from a set E. (So, for
given x and y, P( x, y) is either true or false.) Find the negation of the statement
∃ x ∈ E, ∀y ∈ E, P( x, y)
P Q P∧Q ¬( P ∧ Q) ¬P ¬Q ¬P ∨ ¬Q
T T T F F F F
T F F T F T T
F T F T T F T
F F F T T T T
Learning activity 2.3 The converse is ‘if n divides 12 then n divides 4’. This is false.
For instance, n = 12 is a counterexample. This is because 12 divides 12, but it does
not divide 4. The original statement is true, however. For, if n divides 4, then for some
m ∈ Z, 4 = nm and hence 12 = 3 × 4 = 3nm = n(3m), which shows that n divides 12.
Learning activity 2.4 We have
x ∈ A \ B ⇐⇒ ( x ∈ A) ∧ ( x 6∈ B)
⇐⇒ ( x ∈ A) ∧ ( x ∈ E \ B)
⇐⇒ x ∈ A ∩ ( E \ B).
47
2. Mathematical statements, proof, logic, and sets
∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4},
2
{1, 2, 3}, {2, 3, 4}, {1, 3, 4}, {1, 2, 4}, {1, 2, 3, 4}.
Learning activity 2.6 The members of P ( A) are all the subsets of A. A subset S is
determined by which of the n members of A it contains. For each member x of A,
either x ∈ S or x 6∈ S. There are therefore two possibilities, for each x ∈ A. It follows
that the number of subsets is 2 × 2 × · · · × 2 (where there are n factors, one for each
element of A). Therefore P ( A) has 2n members.
Learning activity 2.7 The statement means that if we take any natural number n there
will be some natural number m greater than n. Well, this is true. For example,
m = n + 1 will do.
48
2.20. Solutions to exercises
This is equivalent to
2a2 + 2b2 + 2c2 − 2ab − 2ac − 2bc ≥ 0,
which is the same as
You can perhaps now see how this is going to work, for ( a2 − 2ab + b2 ) = ( a − b)2 and so
on. Therefore the given inequality is equivalent to
We know this to be true because squares are always non-negative. If we wanted to write this
proof ‘forwards’ we might argue as follows. For any a, b, c, ( a − b)2 ≥ 0, (b − c)2 ≥ 0 and
( a − c)2 ≥ 0, so
( a − b )2 + ( b − c )2 + ( a − c )2 ≥ 0
and hence
2a2 + 2b2 + 2c2 − 2ab − 2ac − 2bc ≥ 0,
from which we obtain
a2 + b2 + c2 ≥ ab + ac + bc,
as required.
49
2. Mathematical statements, proof, logic, and sets
2 ( E × E) \ ( A × B) = (( E \ A) × E) ∪ ( E × ( E \ B)).
Now,
( x, y) ∈ ( E × E) \ ( A × B) ⇐⇒ ¬(( x, y) ∈ A × B)
⇐⇒ ¬(( x ∈ A) ∧ (y ∈ B))
⇐⇒ ¬( x ∈ A) ∨ ¬(y ∈ B)
⇐⇒ ( x ∈ E \ A) ∨ (y ∈ E \ B)
⇐⇒ (( x, y) ∈ ( E \ A) × E) ∨ (( x, y) ∈ E × ( E \ B))
⇐⇒ ( x, y) ∈ (( E \ A) × E) ∪ ( E × ( E \ B)).
∀ x ∈ E, ¬(∀y ∈ E, P( x, y))
50
Chapter 3
Mathematical structures, natural
numbers and proof by induction 3
R
R Biggs, N. L. Discrete Mathematics. Chapter 4.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 1–4 and 6.
3.1 Introduction
In this chapter we will discuss what is meant by a ‘mathematical structure’, and
explore some of the properties of one of the most important mathematical structures:
the natural numbers. These will not be new to you, but they shall be explained a little
more formally. The chapter also studies a very powerful proof method, known as
proof by induction. This enables us to prove many universal statements about natural
numbers that would be extremely difficult to prove by other means.
Along the way, we are going to see the first ‘real’ example of Abstract Maths: what
some standard axioms are and what they are good for.
(1) The natural numbers N = {1, 2, 3, . . . } which come with the operations + and ×,
and the relation <.
(2) The integers Z which come with the operations + and ×, and the relation <.
(3) The rational numbers Q (intuitively, the fractions; numbers which you can write
as ba where a and b are integers and b is not zero), which again come with the
operations + and ×, and the < relation.
(4) The real numbers R (intuitively: points on the number line) which again come
with the operations + and ×, and the < relation.
(5) The complex numbers C which √ are numbers of the form a + bi, where i is a
special symbol representing −1. Again you can add and multiply these, but it’s
not clear what < should be, so we leave it out.
51
3. Mathematical structures, natural numbers and proof by induction
All these examples are structures where you can do arithmetic as you’re used to it.
Here are another couple of examples. Don’t worry if you haven’t seen these before.
We won’t try to study them just yet; you’ll study them more later this year.
(6) The ‘clock numbers’ Z24 , which are the integers {0, 1, 2, . . . , 23} on a 24-hour
clock, where you add and multiply as you would on a clock; if you get 24 you
3 replace it with 0, if you get 25 you replace it with 1, and so on.
a b
(7) The 2 × 2 matrices where a, b, c, d are real numbers. Here too we can
c d
define addition and multiplication:
0 0
a + a0 b + b0
a b a b
+ 0 0 = and
c d c d c + c0 d + d0
0 0 0
aa + bc0 ab0 + bd0
a b a b
× 0 0 = .
c d c d ca0 + dc0 cb0 + dd0
These still look like structures where you can ‘do arithmetic as you’re used to it’. But
you have to be a little careful now. In Z24 we have 4 × 5 = 20 = 4 × 11. So what
should we say 20/4 is? You’re used to the idea that ‘division by zero’ doesn’t make
sense, but in Z24 ‘division by four’ also doesn’t make sense. When you work with
2 × 2 matrices, then multiplication turns out not to be commutative:
0 1 1 0 0 −1 1 0 0 1 0 1
= but = .
−1 0 0 −1 −1 0 0 −1 −1 0 1 0
(8) The set of social networks, where a social network consists of a (finite) collection
of people and a relation ‘friends’ between pairs of people.
Think of taking a snapshot of the Facebook network at some moment: there are
something like 1 000 000 000 people in the network, and if I look at any particular pair
I will find they are either friends or they are not. That’s a social network (by the
definition we gave); if we let some time pass, some people join or leave, some pairs of
people friend or de-friend each other, we get a different social network.
It’s not clear what + or × should mean here—how can we multiply social networks?
But I probably don’t have to convince you that there are interesting things to study
here; and in fact the (results of the) mathematical study of networks (‘Graph Theory’)
turns out to be very important in today’s technology. We’re not going to go further
into this in MA103; the point of giving this example is to show you that we can be
interested as mathematicians in things which don’t involve arithmetic.
More or less, any time you find a precise, unambiguous definition of something, then
you have a mathematical structure which you can start studying. Mathematics is a
much broader area than the arithmetic you saw in school. A lot of mathematics is not
about numbers. Of course, not everything interesting is mathematics—you (maybe)
find politics interesting, but you will not be able to come up with a definition of
‘left-wing’ or ‘economically good’ which is generally agreed on, let alone one which
52
3.3. Natural numbers: an axiomatic approach
is precise and unambiguous. We’ll have to leave politics to the political scientists. The
flip side of this is: it’s (more or less) true that all mathematicians agree that all of
mathematics is correct, which keeps fights to a minimum. That’s certainly not true for
political scientists, who (sometimes) write books whose messages boil down to ‘My
idea is right’, ‘You’re wrong’, ‘Am not!’, ‘Wrongy wrongy wrong!’... and so on.
If you’re thinking carefully, you might notice that the structures we mentioned above
aren’t really very clearly defined. What are ‘the points on the number line’? In fact, 3
what are ‘the natural numbers’? We probably all feel we know what is meant by a
positive integer, how to add and multiply them, and that all of us will get the same
answers if we try it. But that’s not good enough. It would be very embarrassing if it
turned out that some of us made different assumptions to others about the natural
numbers, and we started arguing about what statements are true. So let’s find a way
to avoid that right away.
53
3. Mathematical structures, natural numbers and proof by induction
(N8) There is a special element of N, denoted by 1, which has the property that for all
n ∈ N we have n × 1 = n.
(N12) For all a, b ∈ N, exactly one of the following is true: a = b, a < b, b < a.
(N13) For any non-empty subset S of N, there exists a ∈ S such that a ≤ b for every
b ∈ S. [Well-ordering principle]
Before we go on to work with these axioms, let’s try to say a little bit about them. You
should read the first three axioms as saying ‘addition works the way I think it does’.
These three axioms are also true if you replace the natural numbers N with for
example the real numbers R, or the complex numbers C, or the ‘clock numbers’ Z24 ,
or for adding up 2 × 2 matrices. Of the list of structures in Section 3.2, the only one
we ruled out with these axioms is the social networks, because they don’t have an
‘addition’.
The next three axioms say the same thing for multiplication, and axiom (N7) says
that addition and multiplication work together the way you learnt years ago in
school. Again, these would all be true for R, C and Z24 , as well as for N. But we just
saw that 2 × 2 matrix multiplication is not commutative.
So with these first seven axioms, we certainly haven’t yet given an unambiguous
definition of the natural numbers. But some possibilities have been ruled out, for
example what we are trying to describe cannot be 2 × 2 matrices. The more axioms
we add, the more possibilities are ruled out.
Let’s move on to axiom (N8). This says that there is at least one element in the set N
we’re trying to describe (and multiplying by that element doesn’t change
anything)—if you think about it, the empty set trivially satisfies all of the previous
axioms too, just because there don’t exist any elements to provide a counterexample.
But now the empty set is ruled out. However there are still lots of mathematical
structures left which satisfy (N1)–(N8), for example Z, Q and R.
The axioms (N9) through (N12) again simply say that addition and multiplication
behave the way you expect; and that they interact with < the way you think positive
numbers should. It’s worth pointing out that these axioms together rule out all the
structures we mentioned above except N, because (N11) isn’t true for the rest
(1 + (−1) = 0 but 1 is not smaller than 0). But still there are structures left which are
not N but which satisfy (N1)–(N12), for example the positive real numbers (and
there are more).
Finally we put in the axiom (N13) which says that any non-empty set of natural
numbers has a least element. This axiom is a bit more complicated than the rest, so
let’s check that we understand it intuitively. Suppose I have in mind a set S of natural
numbers. If you want to find out whether it is empty or not, and if not what its least
54
3.3. Natural numbers: an axiomatic approach
Is 1 in the set S?
Is 1 + 1 =2 in the set S?
Is 2 + 1 =3 in the set S? ,
and so on. I might say ‘No’ to the first few questions, but as soon as I say ‘Yes’ you
will tell me ‘OK, then that’s the least element of S’ (and you don’t need to know what
3
else might be in S). If I keep saying ‘No’ forever, then it must mean S is the empty set.
This should justify (intuitively, not formally) that the natural numbers really
satisfy (N13).
Now, this axiom rules out the positive real numbers. There are sets of positive real
numbers which don’t have a least element—can you find one?
Activity 3.1 Think of a set of real numbers that has no least member.
But are we done? Maybe there are still some funny structures which are not the
natural numbers but satisfy all the axioms. Hopefully you agree it isn’t obvious what
the answer is; just because you and I can’t think of such a structure doesn’t mean it
doesn’t exist. We’ll get to that later in the chapter.
The proof we gave back on page 8 uses some of these axioms. But we also assumed
2 + 2 = 4 and 2 × 2 = 4, without proving them. Just for completeness, let’s see how
we can prove those two statements. To begin with, we need to explain what ‘2’ is.
Well, 2 is short-hand for 1 + 1. And 3 is short-hand for 2 + 1, which in turn really
means (1 + 1) + 1, and so on. Now we can prove 2 + 2 = 4.
Proof of 2 + 2 = 4.
2 + 2 = 2 + (1 + 1) by definition of 2
= (2 + 1) + 1 by axiom (N3)
= 3+1 by definition of 3
=4 by definition of 4 .
That was a pain, and you probably don’t want to see why 2 × 2 = 4. It’s a good
exercise though.
Other properties of the natural numbers follow from these axioms. (That is, they can
be proved assuming these axioms.)
For example, we can prove the following.
55
3. Mathematical structures, natural numbers and proof by induction
56
3.4. The principle of induction
We’ll prove this later in the chapter. Intuitively, the idea is as follows. Suppose you
can prove two things:
Then: because P(1) ⇒ P(2) and P(1) is true, we must have that P(2) is true. Now,
because P(2) ⇒ P(3) and P(2) is true, we must have P(3) is true, and so on. So P(n)
has to be true for all n ∈ N.
Let’s give an example. If you have an infinitely tall ladder, you could let P(n) be the
statement ‘I can climb to the nth rung’. So P(1) is ‘I can climb to the first rung’.
Presumably you can show that’s true, for example by standing on the first rung. That
is proving the base case. Now, the induction step, in this example, is ‘For all k ∈ N, if
I can climb to the kth rung then I can climb to the (k + 1)st rung’. For any given k, it’s
clear this is true — if you can climb to the kth rung, then you just need to climb one
more step to get to the (k + 1)st rung. So this proves the induction step, and what you
can conclude is that you can climb to any rung you like. This is all that induction is,
but it turns out to be very useful.
Before we prove the Induction Principle properly (using the axioms) later in this
chapter, let’s think about how we can use it.
Suppose you want to prove ∀n ∈ N, P(n). This looks tricky: we need to show
something is true for every natural number n. For each individual n, perhaps it is not
so hard—a simple calculation might do the job for P(1), P(2),...—but probably the
calculation gets harder as you try to check larger numbers, and you can’t see how to
write down a calculation with a general n in it. But you can check P(1) is true. At this
point you’re stuck. You might think: perhaps it would help me to know P(k) is true in
order to prove P(k + 1). Well, let’s assume it is and try to prove P(k + 1). What you’re
doing is proving the statement ‘∀k, P(k ) ⇒ P(k + 1)’, and when you’ve finished it
you are done by the Principle of Induction. Let’s see a few concrete examples.
57
3. Mathematical structures, natural numbers and proof by induction
3.4.2 An example
Here’s an example of how we might prove by induction a result we proved directly
earlier, in the previous chapter, namely:
∀n ∈ N, n2 + n is even.
3 Let P(n) be the statement ‘n2 + n is even’. Then P(1) is true, because 12 + 1 = 2, and
this is even. (Establishing P(1) is known as proving the base case or the induction
basis.) Next we show that P(k ) ⇒ P(k + 1) for any k ∈ N. So we show that if P(k ) is
true, so will be P(k + 1). To do this we assume that P(k ) is true and show that
P(k + 1) is then also true. (The assumption that P(k) is true is known as the inductive
hypothesis.) So suppose P(k ) is true, which means that k2 + k is even. What we need to
do now is show that this means that P(k + 1) is also true, namely that
(k + 1)2 + (k + 1) is even. So we need somehow to relate the expression
(k + 1)2 + (k + 1) to the one we are assuming we know something about, k2 + k. Well,
Now, by the ‘inductive hypothesis’ (the assumption that P(k) is true), k2 + k is even.
But 2k + 2 = 2(k + 1) is also even, so (k + 1)2 + (k + 1) is an even number, in other
words P(k + 1) is true. So we have shown that ∀k, P(k ) ⇒ P(k + 1). It now follows,
by the Principle of Induction, that for all n ∈ N, P(n) is true.
Once we get used to this technique, we can make our proofs more succinct.
The basic way of proving a result ∀n ∈ N, P(n) by induction is as follows:
[The Induction Step] Prove that, for any k ∈ N, assuming P(k ) is true (the
‘inductive hypothesis’), then P(k + 1) is also true.
And that’s all you need to do! The principle of induction then establishes that P(n) is
true for all n ∈ N.
58
3.4. The principle of induction
you are using some variants of induction (see below). Keep in mind that eventually
every induction proof is basically saying that if you can get to the first rung of the
ladder, and you can always climb up to the next step, then you can climb the ladder.
You may alternatively begin to feel that induction is obvious and it’s not clear why
you need all the careful formalities; the examples we will see next mainly look like
‘calculate the first case, then just keep doing the same calculation over and over
again’. Why can’t we simply write in a proof ‘and now keep doing this calculation 3
forever’? The answer is that it is easy to write down something which looks
convincing, where the ‘calculation you do forever’ works for the first one or two
times, but then it stops working because you missed some difficulty which doesn’t
show up in the first one or two cases. Induction is nothing more than ‘and now keep
doing this calculation forever’, except that writing out the formalities forces you to
say in detail exactly what calculation you will do and check it really works.
3.4.4 Variants
Suppose N is some particular natural number and that P(n) is a statement involving
natural numbers n. Then P(n) is true for all n ≥ N if the following two statements are
true:
(i) P( N ) is true;
(ii) For all k ∈ N, k ≥ N, P(k ) ⇒ P(k + 1).
This is a version of the Induction Principle obtained from the standard one by
‘changing the base case’. It can be used to prove a result like the following:
∀n ≥ 4, n2 ≤ 2n .
(The inequality n2 ≤ 2n is false when n = 3, so it does not hold for all n ∈ N.)
In terms of the ladder intuition, this is simply saying that instead of calling the rungs
of the ladder ‘rung 1’, ‘rung 2’ and so on, you paint N on the lowest rung, N + 1 on
the next lowest, and so on. Still, if you can get on the lowest rung (prove P( N ) ) and
you can always climb from each rung to the next, then you can climb the ladder (i.e.
you can get to every rung from the lowest, labelled N, up).
Another variant of the Induction Principle is the following, known as the Strong
Induction Principle:
The Strong Induction Principle: Suppose P(n) is a statement involving natural
numbers n. Then P(n) is true for all n ∈ N if the following two statements are true:
The name is misleading, because, in fact, the strong induction principle follows from
the standard induction principle.
59
3. Mathematical structures, natural numbers and proof by induction
Activity 3.4 Try to understand why the strong induction principle follows from
the induction principle. Hint: consider Q(n), the statement ‘∀s ≤ n, P(s) is true’.
[This is difficult, so you may want to omit this activity at first.]
Again, in terms of the ladder intuition, what the induction step is now saying is not
the “for every k, if I can climb to rung k then I can climb to rung k + 1” of normal
3 induction, but “for every k, if I can climb to rung k and I climbed on all the lower
rungs on the way, then I can climb to rung k + 1”. Phrased like that, it’s “obvious”
that it doesn’t really make a difference and still I can conclude that I can climb the
ladder. What’s not obvious is why such a thing might be useful. As we shall see,
though, it is often useful when it comes to proving results about sequences that are
defined ‘recursively’.
In general, you won’t know when you start trying to solve a problem that the
Principle of Induction is a good tool to try using. Rather, you will get stuck in some
logic and notice, ‘hey, it would really help if I could assume the previous case is true.’
Similarly, you don’t always know from the outset when the strong induction principle
is going to be handy. Rather, your thought when stuck will be ‘hey, it would be great
if I could assume some smaller case is (or a whole bunch of smaller cases are) true.’
With this observation, we can use proof by induction to prove many results about the
values and properties of such sums. Here is a simple, classical, example.
Example 3.1 For all n ∈ N, ∑rn=1 r = 21 n(n + 1). This is simply the statement that
the sum of the first n natural numbers is n(n + 1)/2.
Proof. We prove the result by induction. Let P(n) be the statement that
∑rn=1 r = 21 n(n + 1). Then P(1) states that 1 = 21 × 1 × 2, which is true. So the base
case P(1) is true. Now let’s do the induction step. Suppose that k ∈ N and that
60
3.6. Recursively defined sequences
+1
(the inductive hypothesis) ∑rk=1 r = 12 k (k + 1) holds. Consider ∑rk= 1 r. We have
k +1 k
∑ r = ∑ r + ( k + 1)
r =1 r =1
1
= 2 k ( k + 1) + ( k + 1) by the induction hypothesis
= 1 2
2 ( k + k + 2k + 2) 3
1 2
= 2 ( k + 3k + 2)
1
= 2 ( k + 1)( k + 2)
1
= 2 ( k + 1)(( k + 1) + 1).
This establishes that P(k + 1) is true (for P(k + 1) is precisely the statement that
+1
∑rk= 1 r = ( k + 1)(( k + 1) + 1) /2.) Therefore, by induction, for all n ∈ N,
n
∑r=1 r = 21 n(n + 1).
Note how the the induction hypothesis was used. In the induction step, you always
prove P(k + 1) to be true assuming P(k ) is. (Unless you do so, it isn’t a proof by
induction.)
Activity 3.5 Prove by induction that the sum of the first n terms of an arithmetic
progression with first term a and common difference d is n(2a + (n − 1)d)/2.
61
3. Mathematical structures, natural numbers and proof by induction
which is exactly what we need. So the formula for xn holds for all n.
( a + b) × c = c × ( a + b) by (N5) [Commutative]
= (c × a) + (c × b) by (N7) [Distributive]
= ( a × c) + (b × c) by (N5) [Commutative]
a + ( x + y) = ( a + x ) + y by (N3) [Associativity]
= b+y = c, by definition of b and c
( a + c) + d = d + ( a + c) by (N2)
= (d + a) + c by (N3)
= ( a + d) + c by (N2)
= b+c.
62
3.7. Using the axioms for the natural numbers
63
3. Mathematical structures, natural numbers and proof by induction
of abstract (defined by axioms) structures which you study in MA103 and its
successor courses. At least, that’s true if you are willing to put the effort in in this
course and learn how to work with axiomatic definitions and write proofs using
axioms—otherwise, you’ll need to learn everything from scratch and remember it
separately every time you come across a new concrete structure.
Finally, and most concretely, we are studying the axioms for the natural numbers so
3 that you get used to writing axiomatic proofs of facts which you already know are
true, and for which you basically understand why those facts are true. That way you
only have to learn one difficult thing now, and when you come to the abstract algebra
in Lent Term, you will only have to learn one difficult thing, namely what one can do
with the axioms of a vector space or of a group, instead of trying to learn that and
how to write an axiomatic proof at the same time.
Proof. Suppose it’s not the case that P(n) is true for all n ∈ N. Then the set S of
n ∈ N for which it is not true is non-empty and, by the Least Element Axiom (N13),
has a least member a. Now, a 6= 1 because P(1) is true. And we can’t have a < 1
because by (P5) 1 is the least member of N. So, by (N12) we have 1 < a. Consider
a − 1: this is a natural number less than a and is therefore not in S. So P( a − 1) is true.
But since P(k ) ⇒ P(k + 1) for all k, it follows that P( a) is true, meaning a 6∈ S, a
contradiction.
You might not be entirely satisfied with that proof. It used a − 1 but we haven’t
defined subtraction! Here’s another way of explaining the last bit:
Since 1 < a, by (N11) there is some c ∈ N such that 1 + c = a. By (N2)
[Commutativity] we have c + 1 = a, which, by (N11) means c < a. So, because a is
the least element of S we have c 6∈ S and hence P(c) is true. But P(c) ⇒ P(c + 1) and
hence P( a) is true, a contradiction.
64
3.9. Non-examinable: There’s only one N.
65
3. Mathematical structures, natural numbers and proof by induction
set S has a least element s. Now s is not equal to 1, because 1 does have a
corresponding member ‘one’ in M. As in the proof of the Principle of Induction, that
means there is some c ∈ N such that c + 1 = s, and we have c < s. But by definition
of S that means c does have a corresponding member in M, call it ‘cee’. And by the
way we constructed the correspondence, that means c + 1 = s corresponds to ‘cee
plus one’. But this is a contradiction—we assumed s does not have a corresponding
member in M.
3
The second case is very similar. Let T be the set of elements of M which don’t have a
corresponding element in N. By assumption, T is not the empty set, so by (N13)
(which also applies to M..!) the set T has a least element, call it ‘tee’. We know ’tee’ is
not ’one’, because ’one’ corresponds to 1. So there is an element ’dee’ in M such that
’dee plus one’ equals ’tee’ (using the same axioms as in the first case). Since ’dee’ is
less than ’tee’, it has a corresponding element d in N. But then again we see ’tee’,
which is equal to ’dee plus one’, corresponds to d + 1—a contradiction.
If you want to see the parts we skipped, either work out for yourself how to prove
them, or you can look them up in books on ‘foundations of mathematics’. In the latter
case, you should be aware that the axioms we gave for the natural numbers are not
the usual ones, called the ‘Peano axioms’. There are fewer Peano axioms than we
have, and they are simpler (but a bit strange). The reason for giving you a longer and
more complicated list of axioms is that you will see very similar axioms repeatedly in
your degree course; your practice with these axioms will help you in the rest of your
degree course. That wouldn’t be the case with the Peano axioms.
66
3.10. Non-examinable philosophical interlude
other, much more complicated structures about which we don’t really have any
intuition) which turned out to be inconsistent, and it’s hard to argue that we have any
intuition about natural numbers which are so enormous that all the particles in the
known universe are too few to write them down.
What one might hope (and logicians did hope around 1900) is that you might be able
to find some way of proving that the axioms (or at least some useful set of axioms)
are consistent: perhaps all you really need to assume to do mathematics are the rules 3
of logic. Around the same time, logicians also believed that perhaps every problem in
mathematics can be solved: for every statement, you can either find a proof or a
counterexample.
But these hopes turn out to be wrong. Gödel showed that any (finite) collection of
axioms which describe an interesting structure (interesting enough to do arithmetic)
cannot prove its own consistency. And what is more, there will be some statements
which one cannot prove to be either true or false. These theorems are central parts of
the area called ‘foundations of mathematics’. But most mathematicians do not worry
about it. We don’t believe that there will turn out to be a contradiction in the
mathematics we do, and we know that no matter how much we try we can’t hope to
improve belief to a certainty, so we don’t think too much about it.
In your degree course, we are going to stick to what most mathematicians believe and
do, meaning we will not spend time worrying about whether the natural numbers
exist (and so on). And in fact, from the end of this chapter we will also stick to
assuming that arithmetic in N works the way you know it does, rather than proving
it using the axioms.
However, we will try to avoid making more assumptions. You were probably pretty
much convinced, long before you started this course, that the natural numbers exist:
i.e. there is no contradiction in the axioms; there is no calculation which you can do
that will end up telling you 0 = 1. If I asked you why, you’d probably say something
about intuition from real world counting. Do you still have an intuition for how the
integers (positive, negative and zero) behave? Well, probably you feel you do; on the
other hand, it’s a bit more removed from the real world—you will never count −3
apples. What about the rational numbers? or the real numbers? You probably will say
you still have an intuition here from reality, but later in this course you’ll see a few
results which might convince you your intuition isn’t as good as you think now.
What about the complex numbers? Sure, they are needed to describe physical reality,
but if you think your experience of the real world tells you that the complex numbers
make sense, then you’re fooling yourself.
Let’s go into that in a little more detail. The complex numbers, if you saw them in
school, were probably introduced more or less as follows. We start with the real
numbers, which you know you can do algebra with as you’re used to (they satisfy
axioms (N1)–(N8), for example). Then we add a symbol √ i, which is defined to solve
the equation x2 + 1 = 0 (or maybe is defined to be −1). Then you write numbers
like 2 + 4.3i, and you can still do algebra as you’re used to (just remember that
whenever you see i2 you can replace it with −1). Probably the justification given is
more or less ‘we want to have solutions to all equations, and sometimes we need to
invent a new kind of number to make that true’.
This is a nice game; let’s invent a new number system E. It would be nice to be able to
67
3. Mathematical structures, natural numbers and proof by induction
divide by zero; so let’s define a new symbol E, which should solve the equation
x × 0 = 1 (or equivalently, let E = 1/0). Presumably we can write numbers like
2 + 4.3E, and do algebra as we’re used to. It looks pretty similar to the complex
numbers: what could possibly go wrong? Let’s try a calculation.
We have 0 = 0+0
so E × 0 = E × (0 + 0)
3 so E×0 = E×0+E×0
so 1 = 1+1 .
Here the first line is obviously true (it’s a statement about the integers, not our new
number system). The second line is just multiplying both sides of the first line by E;
no problem there. To get the third line we use the distributive law, which is algebra as
we’re used to. And for the final line we’re just using the definition of E. But the final
line is a false statement about the integers. So something is wrong.
What is wrong is that this new number system does not exist; we cannot have both a
symbol E satisfying E × 0 = 1 and the usual laws of arithmetic. It’s not that there is a
problem with writing down axioms: we can do that. Here is one possible set of
axioms which the new number system we wanted should satisfy:
68
3.11. Learning outcomes
3
3.11 Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
understand what it means to say that a given structure satisfies some axioms,
and why the axiomatic viewpoint is useful
understand how the natural numbers can be defined by axioms and understand
that other properties of natural numbers can be proved from the axioms, and
know how to construct such proofs.
state what is meant by a greatest and least member of a set of natural numbers
and know what is meant by the well-ordering principle (or least element axiom)
state and prove the Induction Principle and its variants
use Proof by Induction to prove a range of statements, including those involving
summation and recursive sequences
Exercise 3.2
Prove by induction that the sum a + ar + ar2 + · · · + ar n−1 of the first n terms of a
geometric progression with first term a and common ratio r 6= 1 is a(1 − r n )/(1 − r ).
Exercise 3.3
Prove by induction that for all n ∈ N,
n
1
∑ r2 = 6 n(n + 1)(2n + 1).
r =1
Exercise 3.4 n
1 n
Prove by induction that ∑ i ( i + 1) = n + 1 .
i =1
Exercise 3.5
Suppose the sequence xn is given by x1 = 7, x2 = 23 and, for n ≥ 3, xn = 5xn−1 − 6xn−2 .
Prove by induction that, for all n ∈ N, xn = 3n+1 − 2n .
69
3. Mathematical structures, natural numbers and proof by induction
Exercise 3.6
Prove by induction that, for all n ∈ N, 2n+2 + 32n+1 is divisible by 7.
Exercise 3.7
For a sequence of numbers x1 , x2 , x3 , . . . , and for n ∈ N, the number ∏rn=1 xr is the
product of the first r numbers of the sequence. It can be defined inductively as follows:
3 1 k +1
!
k
∏ xr = x1 , and for k ≥ 1, ∏ xr = ∏ xr x k +1 .
r =1 r =1 r =1
(k + 1)2 ≤ 2k + 2k + 1 ≤ 2k + k2 ≤ 2k + 2k = 2k+1 .
as required. So the result is true for all n ≥ 4.
Learning activity 3.4 Let Q(n) be the statement ‘∀s ≤ n, P(s) is true’. Then Q(1) is
true if and only if P(1) is true. The statement Q(k ) ⇒ Q(k + 1) is the same as
( P(s) true ∀s ≤ k) ⇒ ( P(s) true ∀s ≤ k + 1).
70
3.14. Solutions to exercises
But if P(s) is true for all s ≤ k then its truth for all s ≤ k + 1 follows just from its truth
when s = k + 1. That is, Q(k ) ⇒ Q(k + 1) is the same as
( P(s) true ∀s ≤ k) ⇒ P(k + 1). The (standard) Induction Principle applied to the
statement Q(n) tells us that: Q(n) is true for all n ∈ N if the following two statements
are true:
71
3. Mathematical structures, natural numbers and proof by induction
Then P(1) states that 1 = 1(2)(3)/6, which is true. Suppose P(k) is true for k ∈ N. Then
k
1
∑ r2 = 6 k(k + 1)(2k + 1)
r =1
72
3.14. Solutions to exercises
Now,
k +1
1 1 k
1
3
∑ i ( i + 1)
= +∑
(k + 1)(k + 2) i=1 i (i + 1)
i =1
1 k
= + (by the induction hypothesis)
(k + 1)(k + 2) k + 1
1 + k ( k + 2)
=
(k + 1)(k + 2)
k2 + 2k + 1
=
(k + 1)(k + 2)
( k + 1)2
=
(k + 1)(k + 2)
k+1
= ,
k+2
so P(k + 1) is true. By induction, P(n) is true for all n ∈ N.
73
3. Mathematical structures, natural numbers and proof by induction
n = k + 1,
which is a multiple of 7. So the statement is true for P(k + 1). This proves
P(k ) ⇒ P(k + 1), the induction step, and hence, by induction, for all n ∈ N.
Then
k +1 k
2r − 1 r −1
∏ (1 + x ) = (1 + x 2(k+1)−1
) × ∏ (1 + x 2 )
r =1 r =1
k
2k 1 − x2
= (1 + x ) (by the induction hypothesis)
1−x
k
1 − ( x 2 )2
= (where we’ve used (1 + y)(1 − y) = 1 − y2 )
1−x
k
1 − x 2 ×2
=
1−x
k +1
1 − x2
= ,
1−x
which shows that P(k + 1) is true. So P(n) is true for all n ∈ N, by induction.
74
Chapter 4
Functions and counting
R
R Biggs, N. L. Discrete Mathematics. Chapters 5 and 6.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapter 10, Sections 10.1
and 10.2, and Chapter 11. 4
4.1 Introduction
In this chapter we look at the theory of functions, and we see how the idea of the
‘size’ of a set can be formalised.
4.2 Functions
Activity 4.1 Find a single formula which gives the function g( x ) above.
75
4. Functions and counting
Actually, even that is too restrictive; we don’t want to insist that the input or output is
a number. Maybe we would like the input or output to be ‘Yes’, or ‘No’, or a colour,
or a social network... we need a definition which allows any of these possibilities. The
only thing we want to stick to is: if we give the function the same input twice, we
should get the same output each time. Here is the definition which formalises this.
Definition 4.1 Suppose that X and Y are sets. Then a function (also known as a
mapping) from X to Y is a rule that associates a unique member of Y to each member
of X. We write f : X → Y. The set X is called the domain of f and Y is called the
codomain.
76
4.2. Functions
(You can see that the sequence of numbers f (1), f (2), f (3), . . . is therefore given by a
first order difference equation.)
What does it mean to say that two functions f and g are equal? Well, first, they must
have the same domain X and codomain Y. Then, for each x ∈ X, we must have
f ( x ) = g( x ). For example, if R+ is the set of positive real numbers, then the function
f : R+ → R given by f ( x ) = x2 and the function g : R → R given by g( x ) = x2 are
not equal because their domains are different.
You might think it is picky to say that, for example, the function f : R≥0 → R≥0
defined by f ( x ) = x2 and the function g : R≥0 → R defined by g( x ) = x2 are
different (The set R≥0 is the non-negative real numbers). After all, what you can put
into both functions is the same, and what comes out is also the same—the only 4
difference is that the codomains of f and g are different. However, it turns out often
to be important what the codomain is—for example, we’ll see later that only one of f
and g is a bijection.
Finally, we define one very basic function. For any set X, the identity function
1 : X → X is given by 1( x) = x.
If X and Z are distinct sets, there is only one way we can compose f and g. For
example, given the function RightTime : {Morning, Evening} → {Beer, Milk}
defined by RightTime(Morning) = Milk and RightTime(Evening) = Beer, it makes
sense to talk about Drink ◦ RightTime. I think that in the morning it’s the right time
for milk, and in the evening it’s the right time for beer. So if we put Morning into the
composition Drink ◦ RightTime, then we can see that I will Maybe have a drink in the
morning. It doesn’t make sense to consider RightTime ◦ Drink, because whatever
input from {Beer, Milk} we put into Drink the output is something not in the domain
of RightTime; that function doesn’t know what to do with an input Maybe.
If X = Z, then both f ◦ g and g ◦ f make sense—but they are generally not the same
function: the order is important. A further point to be careful about is that the
notation f g can cause confusion. For example, suppose X = Y = Z = R. Then you
might be tempted to think that g f denotes the product function x → g( x ) f ( x ). But this
would be wrong. It should always be clear from the context whether g f should be
interpreted as a composition. If I need to talk about the product of the functions f and
g I will denote this by f ( x ) g( x ). The notation g ◦ f leads to less confusion, but it is not
used in all textbooks.
77
4. Functions and counting
And,
2
( g ◦ f )( x ) = g f ( x ) = g x2 + 1 = ( x2 + 1) + 1 = ( x2 + 2)2 .
∀ x, x 0 ∈ X, x 6= x 0 ⇒ f ( x ) 6= f ( x 0 )
∀ x, x 0 ∈ X, f ( x ) = f ( x 0 ) ⇒ x = x 0 .
This latter characterisation often provides the easiest way to verify that a function is
an injection.
4.3.1 An example
Let X = R, the set of real numbers, and let Y be the interval (−1, 1), the set of real
numbers x such that −1 < x < 1. Then the function f : X → Y given by
x
f (x) =
1 + |x|
78
4.4. Inverse functions
is a bijection from X to Y.
First, we prove f is injective. To do this, we prove that f ( x ) = f (y) implies x = y. So,
suppose f ( x ) = f (y). Then
x y
= .
1 + |x| 1 + |y|
So
x + x | y | = y + y | x |.
Because x/(1 + | x |) = y/(1 + |y|), x and y are both non-negative or both negative. For,
otherwise, one of x/(1 + | x |) and y/(1 + |y|) will be negative and the other one will
be non-negative, which cannot be the case since they are equal. So, x |y| = y| x |, both 4
being xy if x, y ≥ 0 and − xy if x, y < 0. So, we have x = y.
Next, we show f is surjective. We need to prove that, for each y ∈ (−1, 1), there’s
x ∈ R such that x/(1 + | x |) = y. Consider separately the case in which y ≥ 0 and the
case in which y < 0.
Suppose y ≥ 0. Then, to have x/(1 + | x |) = y, we need x ≥ 0. So | x | = x and we need
to solve x/(1 + x ) = y. This has solution x = y/(1 − y), which is well-defined
because we know y < 1.
Suppose y < 0. Then we’ll need to have x < 0 and the equation to solve is
x/(1 − x ) = y, for a solution x = y/(1 + y); this is also well-defined, since y > −1.
First, we prove:
f : X → Y has an inverse ⇐⇒ f is bijective.
Proof. This is an ⇐⇒ theorem, so there are two things to prove: the ⇐ and the ⇒.
First, we show: f : X → Y has an inverse ⇐ f is bijective.
Suppose f is a bijection. For each y ∈ Y there is exactly one x ∈ X with f ( x ) = y.
Define g : Y → X by g(y) = x. Then this is an inverse of f . Check this!
Next, we show: f : X → Y has an inverse ⇒ f is bijective.
Suppose f has an inverse function g. We know that for any y ∈ Y,
79
4. Functions and counting
4.4.2 Examples
y = f ( x ) ⇐⇒ y = 3x + 1 ⇐⇒ x = (y − 1)/3,
so
1
f −1 ( y ) = ( y − 1).
3
Let Z denote the set of all integers (positive, zero, and negative).
Prove that f is a bijection and determine a formula for the inverse function f −1 .
First, we prove that f is injective: Suppose f (n) = f (m). Since 2n is even and
−2n − 1 is odd, either (i) n, m ≥ 0 or (ii) n, m < 0. (For otherwise, one of f (n), f (m)
is odd and the other even, and so they cannot be equal.)
In case (i), f (n) = f (m) means 2n = 2m, so n = m.
In case (ii), f (n) = f (m) means −2n − 1 = −2m − 1, so n = m. Therefore f is
injective.
80
4.5. Functions on sets
The proof that f is surjective reveals to us what the inverse function is. We have
4
−1 m/2 if m even
f (m) =
−(m + 1)/2 if m odd.
f (S) = { f ( x ) : x ∈ S} .
f −1 ( T ) = x ∈ X : f ( x ) ∈ T .
Again, it’s important to remember that for y ∈ Y, the set f −1 {y} is a set of elements
namely f −1 (y). However if f is not invertible, then by Theorem 4.1 either there will
81
4. Functions and counting
be some y ∈ Y such that f −1 {y} = ∅ (i.e. f is not surjective) or there will be some
y ∈ Y such that f −1 {y} has two or more elements (i.e. f is not injective), or both.
So, the set has m members if to each number from 1 to m, we can assign a
corresponding member of the set S, and all members of S are accounted for in this
process. This is like the attachment of labels ‘Object 1’, etc, described above.
Note that an entirely equivalent definition is to say that S has m members if there is a
bijection from S to Nm . This is because if f : Nm → S is a bijection, then the inverse
function f −1 : S → Nm is a bijection also. In fact, because of this, we can simply say
that S has m members if there is a bijection ‘between’ Nm and S. (Eccles uses the
definition that involves a bijection from Nm to S and Biggs uses the definition that
involves a bijection from S to Nm .)
For m ∈ N, if S has m members, we say that S has cardinality m (or size m). The
cardinality of S is denoted by |S|, so we would usually simply write |S| = m for ‘S
has cardinality m’.
82
4.7. The pigeonhole principle
actually prove it? We shall prove it below. But let’s state the principle more formally,
first. Recall that, for r ∈ N we have Nr = {1, 2, . . . , r }.
Theorem 4.2 (Pigeonhole Principle (PP)) The following statement is true for all
n ∈ N: For all natural numbers m, if there is an injection from Nn to Nm , then n ≤ m.
Proof. We prove this by induction. The statement we want to prove is the statement
P(n): ‘if there is an injection from Nn to Nm , then n ≤ m.’ The base case, n = 1, is true
because (since m ∈ N), 1 ≤ m. Suppose P(k ) is true. We now want to show that
P(k + 1) is also true. So suppose there is an injection f : Nk+1 → Nm . (What we want
to show is that k + 1 ≤ m.) Since k ≥ 1, we have k + 1 ≥ 2. Now we do not have
m = 1, because if m = 1 then Nm = {1} and so the only possibility is 4
f (1) = f (2) = 1, contradicting the assumption that f is an injection. Because m is a
natural number, it follows that m ≥ 2.
Since m ≥ 2 we can write m as m = s + 1 where s ∈ N. Now, either there is some
x ∈ Nk = {1, 2, . . . , k } with f ( x ) = s + 1, or there is not. Let’s examine each case
separately.
A slightly more general form of the pigeonhole principle, easy to prove from that
above is:
Theorem 4.4 Suppose that A and B are sets with | A| = n and | B| = m, where
m, n ∈ N. If there is an injection from A to B, then n ≤ m.
83
4. Functions and counting
The pigeonhole principle is remarkably useful (even in some very advanced areas of
4 mathematics). It has many applications. For most applications, it is the contrapositive
form of the principle that is used. This states:
If m < n and f : Nn → Nm is any function, then f is not an injection..
So, if m < n, and f is any function f : Nn → Nm , then there are x, y ∈ Nn with x 6= y
such that f ( x ) = f (y).
The name ‘pigeonhole principle’ comes from this last formulation. If you have m
pigeonholes (named slots into which post is placed in a university) and n letters to
put in them, where n > m, then there must be at least one pigeonhole into which two
or more letters are placed.
84
4.7. The pigeonhole principle
This kind of understanding is what I want you to get from the proof of PP. For all the
longer proofs in these notes, I would like you to get an idea of why the proof works
and what ideas you are being shown that you can use elsewhere in your own proofs;
this is why these proofs are there.
Proof. Consider the function that maps the people to their months of birth. Since
13 > 12, this cannot be a bijection, so two people are born in the same month.
This next one is not hard, but perhaps not immediately obvious.
Theorem 4.6 In a room full of people, there will always be at least two people who
have the same number of friends in the room.
85
4. Functions and counting
Proof. Let X be the set of people in the room and suppose | X | = n ≥ 2. Consider the
function f : X → N ∪ {0} where f ( x ) is the number of friends x has in the room.
Let’s assume that a person can’t be a friend of themselves. (We could instead assume
that a person is always friendly with themselves: we simply need a convention one
way or the other.)
Then f ( X ) = { f ( x ) : x ∈ X } ⊆ {0, 1, . . . , n − 1}. But there can’t be x, y with
f ( x ) = n − 1 and f (y) = 0. Why? Well, such an x would be a friend of all the others,
including y, which isn’t possible since y has no friends in the room.
So either f ( X ) ⊆ {0, 1, . . . , n − 2} or f ( X ) ⊆ {1, . . . , n − 1}. In each case, since f ( x )
4 can take at most n − 1 values, there must, by PP, be at least two x, y ∈ X with
f ( x ) = f (y). And that’s what we needed to prove.
By the way, this result would not necessarily hold if we only had four points in the
set. Consider (0, 0), (1, 0), (1, 0) and (1, 1).
Here’s a very interesting number theory application (with a very sneaky proof). It
uses the notion of remainders on division by n, which we’ll cover properly soon: for
now, all we need is that, for every natural number m, the “remainder, r, upon division
by n” is one of the numbers 0, 1, . . . , n − 1, and that m − r is divisible by n.
Theorem 4.8 Let a1 , a2 , . . . , an be n integers (where n ≥ 2). Then there exists a
non-empty collection of these integers whose sum is divisible by n.
s0 = 0,
s1 = a1 ,
86
4.8. A generalised form of PP
s2 = a1 + a2 ,
s3 = a1 + a2 + a3 ,
etc., until
s n = a1 + a2 + · · · + a n .
(It is not obvious, at all, why we should do this, but it will work!)
For each of these si , consider the remainder upon division by n. Since there are n + 1
numbers si , but only n possible remainders (0, 1, . . . , n − 1), two of the si will have the
same remainder upon division by n.
So suppose sk and s` have the same remainder, where k < `. Then s` − sk is divisible
by n. But since s` − sk = ak+1 + ak+2 + · · · + a` , this means that the sum
4
ak+1 + ak+2 + · · · + a` is divisible by n. Se we have proved the result.
In fact we proved something even stronger than what we set out to prove :
Let a1 , a2 , . . . , an be a list of n integers ( where n ≥ 2 ). Then there exists a non-empty
collection of consecutive numbers from this list ak+1 , ak+2 , . . . , a` whose sum is
divisible by n.
The theorem isn’t true if we have fewer than n integers. For instance, if for any n ≥ 2
we take the numbers a1 , . . . , an−1 all equal to 1, then it’s impossible to find a sum that
adds up to something divisible by n.
I should maybe point out why the proof of this is not in the course. First, it is
something you can find or generate for yourself fairly easily if you want. More
importantly, it won’t show you any new ideas; you wouldn’t learn anything you
didn’t already see earlier.
Last year, 241 students were registered for this course. I knew, before marking the
exams, that at least three of them would get the same exam mark.
Why? Well, apply the theorem, with A being the students, B being the set
{0, 1, . . . , 100} of all possible marks (which is of size 101) and f ( x ) the mark of
student x. Since 232 > 2(101), there’s some mark y such that at least 2 + 1 = 3
students will have y = f ( x ), which means they get the same mark.
87
4. Functions and counting
For example, the set of natural numbers is infinite. You might think that’s obvious,
but how would you prove it? (Remember that the formal definition that a set A has
cardinality n is that there is a bijection between Nn and A.)
One way to show this is to use a proof by contradiction. Suppose (for a contradiction)
that N is finite, of cardinality n ∈ N, and that f : Nn → N is a bijection. Consider the
number N = f (1) + f (2) + · · · + f (n). Since each f (i ) is a natural number, for all
i ∈ Nn , N is also a natural number. But N > f (i ) for all i ∈ Nn . So here is a natural
number, N, that is not equal to f (i ) for any i ∈ Nn . But that contradicts the fact that f
is a bijection, because if it’s a bijection then it’s certainly a surjection and there should
be some i ∈ Nn with f (i ) = N.
4
4.10 Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
state what it means to say that a set is infinite; and be able to prove that a set is
infinite.
Exercise 4.2
Let Z be the set of all integers and suppose that f : Z → Z is given, for x ∈ Z, by
x+1 if x is even
f (x) =
− x + 3 if x is odd.
88
4.12. Comments on selected activities
Exercise 4.3
Suppose that X, Y, Z are sets, and we have functions f : X → Y, g : Y → Z, and
h : Y → Z. Suppose that the compositions h ◦ f and g ◦ f are equal, and also that f is
surjective. Prove that g = h.
Exercise 4.4
Suppose that X, Y, Z are sets and that f : X → Y and g : Y → Z. Prove that if the
composition g ◦ f is injective, then f is injective. Prove that if g ◦ f is surjective, then g is
surjective.
Exercise 4.5
Suppose that A and B are non-empty finite sets and that they are disjoint (i.e. A ∩ B = ∅). 4
Prove, using the formal definition of cardinality, that | A ∪ B| = | A| + | B|.
Exercise 4.6
Suppose that X, Y are any two finite sets. By using the fact that
X ∪ Y = ( X \ Y ) ∪ (Y \ X ) ∪ ( X ∩ Y ) ,
| X ∪ Y | = | X | + |Y | − | X ∩ Y | .
Exercise 4.7
Suppose n ∈ N and that f : N2n+1 → N2n+1 is a bijection. Prove that there is some odd
integer k ∈ N2n+1 such that f (k) is also odd. (State clearly any results you use.)
Would that formula be more or less useful to you than the description we gave to
define it?
Learning activity 4.2 Given any y ∈ R, let x = y/2. Then f ( x ) = 2(y/2) = y. This
shows that f is surjective. Also, for x, y ∈ R,
f ( x ) = f (y) ⇒ 2x = 2y ⇒ x = y,
89
4. Functions and counting
( g ◦ f )( x ) = ( g ◦ f )(y) ⇒ g( f ( x )) = g( f (y))
⇒ f ( x ) = f (y) (because g is injective)
⇒ x = y (because f is injective).
( g ◦ f )( x ) = g( f ( x )) = g(y) = z,
so z is the image of some x ∈ X under the mapping g f . Since z was any element of Z, this
shows that g ◦ f is surjective.
f ( x ) = f (y) ⇒ g( f ( x )) = g( f (y))
90
4.13. Solutions to exercises
f ( x ) = f (y) ⇒ x = y ,
i.e. f is injective.
Now suppose g ◦ f is surjective. So for all z ∈ Z there is some x ∈ X with ( g ◦ f )( x ) = z.
So g( f ( x )) = z. Denoting f ( x ) by y, we therefore see that there is y ∈ Y with g(y) = z.
Since z was any element of Z, this shows that g is surjective.
h(i ) = h( j) ⇒ f (i ) = f ( j) ⇒ i = j,
h(i ) = h( j) ⇒ g(i − m) = g( j − m) ⇒ i − m = j − m ⇒ i = j,
because g is injective. The only other possibility is that one of i, j is between 1 and m and
the other between m + 1 and m + n. In this case, the image under h of one of i, j belongs
to A and the image of the other to B and these cannot be equal because A ∩ B = ∅. So h
is indeed an injection. It is also a surjection. For, given a ∈ A, because f is a surjection,
there is 1 ≤ i ≤ m with f (i ) = a. Then h(i ) = a also. If b ∈ B then there is some
1 ≤ j ≤ n such that g( j) = b. But then, this means that h(m + j) = g((m + j) − m) = b,
so b is the image under h of some element of Nm+n . So h is a bijection from Nm+n to
A ∪ B and hence | A ∪ B| = m + n.
| X ∪ Y | = |( X \ Y ) ∪ (Y \ X )| + | X ∩ Y |.
|( X \ Y ) ∪ (Y \ X )| = |( X \ Y )| + |(Y \ X )|
and therefore
| X ∪ Y | = |( X \ Y )| + |(Y \ X )| + | X ∩ Y |.
Now, the sets X \ Y and X ∩ Y are disjoint and their union is X, so
| X | = |( X \ Y ) ∪ ( X ∩ Y )| = | X \ Y | + | X ∩ Y |.
|Y | = |(Y \ X ) ∪ ( X ∩ Y )| = |Y \ X | + | X ∩ Y |.
91
4. Functions and counting
| X \ Y | = | X | − | X ∩ Y | and |Y \ X | = |Y | − | X ∩ Y |.
So we have
| X ∪ Y | = |( X \ Y )| + |(Y \ X )| + | X ∩ Y |
= (| X | − | X ∩ Y |) + (|Y | − | X ∩ Y |) + | X ∩ Y |
= | X | + |Y | − | X ∩ Y | .
92
Chapter 5
Equivalence relations and the integers
R
R Biggs, N. L. Discrete Mathematics. Chapter 7.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapter 22.
5.1 Introduction
5
In this chapter of the notes we study the important idea of an equivalence relation, a
concept that is central in abstract mathematics. As an important example, we look at
how the integers can be defined from the natural numbers through the use of an
equivalence relation. We also study some of the important properties of the integers.
93
5. Equivalence relations and the integers
In many cases, we use special symbols for relations. For instance ‘=’ is a relation, as is
>. It is often convenient to use a symbol other than R: for instance, many textbooks
use x ∼ y rather than x R y as a symbol for ‘some relation’.
A relation that has all three of these properties is called an equivalence relation.
Definition 5.2 A relation is an equivalence relation if is reflexive, symmetric and
transitive.
m R n ⇐⇒ m + n is even
x + z = ( x + y) + (y + z) − 2y,
and all three terms on the right (x + y, y + z, and 2y) are even. Therefore, x + z is
even and so x R z.
Example 5.3 Let X be the set of n × n real matrices. Define a relation ∼ on X by:
M ∼ N ⇐⇒ ∃r, s ∈ N s.t. Mr = N s .
Mrt = ( Mr )t = ( N s )t = ( N t )s = ( Ru )s = Rus ,
94
5.3. Equivalence classes
Example 5.4 Let S be a set of people in a given social network, and let F be the
relation ‘friendship’, i.e. aFb if a and b are people in S who are friends in the social
network. This relation is symmetric (in real life, it might be that a says they are
friends with b but b disagrees. Social networks such as Facebook don’t allow this
one-sided ‘friendship’). Let’s say that you are automatically a friend of yourself, so
the relation is reflexive.
Is the relation transitive? Well, that depends on the social network. You probably
want to say ‘No’, because (if you’re on Facebook) you surely have some friend not
all of whose friends you know. So for the example of S and F coming from
Facebook, you know the relation F is not transitive; you have a
counterexample—and hence it’s also not an equivalence relation. But it doesn’t
have to be that way. If S is all the people in this lecture hall—well, we’re all friends
(I hope!) and so from the lecture example we do get a transitive relation, and
hence (because we checked all three properties) an equivalence relation. 5
[ x ] R = { y ∈ X | y R x }.
Often, we will want to talk about the set of all equivalence classes of R. This set is
written X/R, and referred to as the quotient set of X by R. So we have
X/R = [ x ] R : x ∈ X } .
95
5. Equivalence relations and the integers
{ x ∈ X : f ( x ) = y } = f −1 { y } ,
for y ∈ Y. Note that the place where we use that f is a surjection is that it implies
each f − 1 {y} is non-empty. If f is not a surjection, then the equivalence classes
are the sets f −1 {y} for all y ∈ Y such that there is an x ∈ X with y = f ( x ), in
5 The equivalence classes have a number of important properties. These are given in
the following result.
Theorem 5.1 Suppose R is an equivalence relation on a set X. Then
Proof. (i) This is an if and only if statement, so we have two things to prove: namely
that [ x ] = [y] ⇒ x R y and that x R y ⇒ [ x ] = [y].
Suppose, then, that [ x ] = [y]. The relation R is reflexive, so we have x R x. This means
that x ∈ [ x ]. But if [ x ] = [y], then we must have x ∈ [y]. But that means (by definition
of [y]) that x R y.
Conversely, suppose that x R y. We now want to show that [ x ] = [y]. So let z ∈ [ x ].
(We will show that z ∈ [y].) Then z R x. But, because x R y and R is transitive, it
follows that z R y and hence z ∈ [y]. This shows [ x ] ⊆ [y]. We now need to show that
[y] ⊆ [ x ]. Suppose w ∈ [y]. Then w R y and, since x R y, we also have, since R is
symmetric, y R x. So w R y and y R x. By transitivity of R, w R x and hence w ∈ [ x ].
This shows that [y] ⊆ [ x ]. Because [ x ] ⊆ [y] and [y] ⊆ [ x ], [ x ] = [y], as required.
(ii) Suppose x and y are not related. We prove by contradiction that [ x ] ∩ [y] = ∅. So
suppose [ x ] ∩ [y] 6= ∅. Let z be any member of the intersection [ x ] ∩ [y]. (The fact that
we’re assuming the intersection is non-empty means there is such a z.) Then z ∈ [ x ],
so z R x and z ∈ [y], so z R y. Because R is symmetric, x R z. So: x R z and z R y and,
therefore, by transitivity, x R y. But this contradicts the fact that x, y are not related by
R. So [ x ] ∩ [y] = ∅.
Theorem 5.1 shows that either two equivalence classes are equal, or they are disjoint.
Furthermore, because an equivalence relation is reflexive, any x ∈ X is in some
equivalence class (since it certainly belongs to [ x ] because x R x). So what we see is
that the equivalence classes form a partition of X: their union is the whole of X, and
no two equivalence classes overlap.
96
5.4. Construction of the integers from the natural numbers
m R n ⇐⇒ m + n is even.
We have seen that there are precisely two equivalence classes: the set of odd
positive integers and the set of even positive integers. Note that, as the theory
predicted, these form a partition of all of N (since every natural number is even or
odd, but not both).
We are going to give a construction which looks complicated at first, but which we
can easily prove works. Again, the main motivation for this is that we will give an
idea which we can re-use later in your degree course to construct structures which
are more complicated and which you don’t have intuition for.
The idea behind what follows is this: imagine you have deposits of a and debts of b
(which are natural numbers). Denote this by ( a, b). Then (informally—we didn’t
97
5. Equivalence relations and the integers
Finally, R is transitive. For suppose that ( a, b) R (c, d) and (c, d) R (e, f ). Then
a + d = b + c and c + f = d + e. Therefore,
( a + d ) + ( c + f ) = ( b + c ) + ( d + e ).
That is (after cancelling c and d from each side),
a + f = b + e,
Now, for n ∈ N, let us denote the equivalence class [(n + 1, 1)] by n and let us denote
the equivalence class [(1, n + 1)] by −n. Also, we denote by 0 the class [(1, 1)].
Activity 5.2 Check that all these equivalence classes are distinct. In other words,
show that [(n + 1, 1)] is not the same as [(m + 1, 1)] if m, n ∈ N are distinct, and
also [(1, n + 1)] is not the same as [(1, m + 1)] if m 6= n, and [(n + 1, 1)] is not the
same as [(1, 1)], and [(1, n + 1)] is not [(1, 1)], and [(1, n + 1)] is not [(m + 1, 1)] for
any m, n ∈ N. (Did I cover all the cases? Check that too!) And finally, check that
there is no other equivalence class.
We define the integers to be the set (N × N)/R, in other words (by Activity 5.2) the set
98
5.4. Construction of the integers from the natural numbers
Let’s stop for a moment and (again) try to explain what the point of all this abstract
nonsense is. First off, you might be a bit confused: after all, 1 is 1 is a member of N,
and now for some reason we are also writing 1 for the equivalence class [(2, 1)] R of
this equivalence relation on N × N, which we are (by definition) saying is a member
of Z. This is naughty—the two things are not the same, and we should not use the
same symbol for both. What we are doing is called an abuse of notation, which means
we are doing something naughty but we will avoid getting into trouble. How can we
do that? Well, (as you will see, once we define addition and so on), the equivalence
classes [(2, 1)], [(3, 1)] and so on (which we are calling 1, 2 and so on) satisfy the
axioms for the natural numbers. That means we know there is a dictionary
correspondence between N and these equivalence classes (as we proved); they really
behave in exactly the same way.
Second, you might ask: why on earth are we giving this funny and complicated
construction for Z? I want to get on with my calculations without worrying about it.
This is the right question to ask, and the answer is: give me a moment to prove that 5
all your calculations will work, and then you can forget the construction (or at least
not keep thinking about it) and get on with your calculations. That’s why we
(naughtily) use the same symbol 1 for the element of N and for the equivalence class
[(2, 1)] R in Z: because once we check that this construction does what we want, we
are going to forget it and just work with Z as you’re used to—in particular, we want
to think of 1 in N and [2, 1] R in Z as being basically the same thing. We do not want to
keep thinking about equivalence classes any more, once we convince ourselves that
Z makes sense.
Let’s get back to the construction, and check we can do arithmetic with it. First, let’s
define an addition operation (between equivalence classes) by
[( a, b)] + [(c, d)] = [( a + c, b + d)].
So, for example, what does this say about the sum of integers +3 and −1. Well,
+3 = [(4, 1)] and −1 = [(1, 2)] and therefore
+3 + −1 = [(4, 1)] + [(1, 2)] = [(5, 3)].
Now, (5, 3) R (3, 1), so [(5, 3)] = [(3, 1)], which is what we called +2. No surprise
there, then: 3 + (−1) = 2 in the usual notation for integer arithmetic. Henceforth, we
can simply write m + (−n) as the subtraction m − n. In other words, we now defined
subtraction. We waited this long because subtraction always makes sense in Z; in N
it doesn’t: for example 3 − 5 is not in N.
There is quite a subtle point about this definition of addition of equivalence classes.
We know that there are many (infinitely many) pairs ( a0 , b0 ) such that
[( a, b)] = [( a0 , b0 )]. Such an ( a0 , b0 ) is called a representative of the equivalence class. (It
is always the case, for any equivalence relation, as Theorem 5.1 shows, that [ x ] and
[ x 0 ] are the same whenever x 0 R x. So any equivalence class can be represented by
potentially many different representatives: any member of the class will do.) So we
need to be sure that the definition of addition will give us the same answer if we use
different representatives. In other words, suppose that [( a0 , b0 )] = [( a, b)] and
[(c0 , d0 )] = [(c, d)]. The definition of addition,
[( a, b)] + [(c, d)] = [( a + c, b + d)],
99
5. Equivalence relations and the integers
Thus, we need to prove that if [( a0 , b0 )] = [( a, b)] and [(c0 , d0 )] = [(c, d)], then
[( a + c, b + d)] = [( a0 + c0 , b0 + d0 )].
We can see this easily enough in specific cases. For instance, [(4, 1)] = [(6, 3)] and
[(1, 2)] = [(2, 3)]. We have
[(4, 1)] + [(1, 2)] = [(5, 3)] and [(6, 3)] + [(2, 3)] = [(8, 6)].
Well, (5, 3) R (8, 6), because 5 + 6 = 3 + 8, so [(5, 3)] = [(8, 6)]. So, in this case, and
with these choices of representatives for each equivalence class, we end up with the
same class when we apply the addition operation.
5
We can prove, more generally, that the definition works, that it does not depend on
the choice of representatives of the classes. Remember, what we need to prove is that
if [( a0 , b0 )] = [( a, b)] and [(c0 , d0 )] = [(c, d)], then
[( a + c, b + d)] = [( a0 + c0 , b0 + d0 )].
( a + c, b + d) R ( a0 + c0 , b0 + d0 ) ⇐⇒ ( a + c) + (b0 + d0 ) = (b + d) + ( a0 + c0 )
⇐⇒ ( a + b0 ) + (d0 + c) = ( a0 + b) + (c0 + d),
Again, we should check that this definition ‘makes sense’, in that whatever
representative of the equivalence classes you take, you get the same answer.
Finally, you should check that this construction ‘makes sense’ in general, i.e. that
addition and multiplication in (N × N)/R as we defined them really correspond to
what you think addition and multiplication of integers should do. Let’s be a little
more concrete: we want to check that the following hold.
(Z1) Closure under addition and multiplication: for each a, b ∈ Z both a + b and ab
are in Z.
100
5.4. Construction of the integers from the natural numbers
(Z5) Additive and multiplicative identity: there are two different elements 0 and 1,
such that for each a ∈ Z we have a + 0 = a and a × 1 = a.
Example 5.8 We show that for any integer z we have z + 0 = z. Recall that, for
some ( a, b), we will have z = [( a, b)] and that 0 is [(1, 1)]. Now, the definition of
addition of integers (that is, of the equivalence classes) means that
Activity 5.4 With integers defined in the formal way as these equivalence
classes, prove that for any integer z we have z × 0 = 0.
You probably realise at this point that it is easy to check these properties; writing out
all of them would take a lot of space, but it would not be hard. In principle you
should check them though! And at this point we are done. These properties we gave
are all you will ever need to do arithmetic with Z, and that arithmetic works the way
you think it should. At this point, you can stop thinking about the funny
construction, and simply work with the integers Z as you’re used to doing.
(Actually, that’s not quite true—we’ll use the funny construction in the next section to
define an order < on Z.)
What did we gain from doing all this? Well, suppose you do some arithmetic with
Z—could it happen that you run into a contradiction, that for example you prove
0 = 1? Well, maybe—but if you do, then you can rewrite your proof, using this funny
construction (meaning you write [(1, 1)] instead of 0, and so on). In that case you end
up with a proof that [(1, 1)] = [(2, 1)] (which we saw in Activity 5.2 is a false
101
5. Equivalence relations and the integers
statement). What’s the difference? Well, the difference is that the second proof doesn’t
make any use of properties of Z, it only uses the axioms for the natural numbers. To
put it another way: let’s assume you are happy that N exists. Then you should now
be happy that Z exists, even if you were a bit unhappy about negative numbers
before.
That might seem like a lot of effort to prove something obvious. Again, the point is
that we will do this kind of thing over and over again, in order to prove that
structures exist where you probably do not think it is obvious. We can begin to
answer the question about what the difference between C and E from Section 3.10 is
now. We saw E doesn’t exist. It isn’t obvious that C exists; if you think it is, you’re
fooling yourself (so you should not have a simple answer to the question of what the
difference is).
But we can prove the complex numbers exist, by constructing them from the real
numbers in a similar way to the construction we just gave of Z from N. We’ll do this
5 in Section 8.5. Again, the point is not to make you think of a complicated construction
whenever you want to work with the complex numbers. The point is to convince you
that the complex numbers really work—if you could find a problem with the
complex numbers, say if you could prove 0 = 1 by doing some arithmetic with
complex numbers, then you can also prove the same thing without using the complex
numbers, just working with the real numbers.
102
5.6. Learning outcomes
demonstrate that you know the definition of equivalence classes and that you
know some of their basic properties, in particular that they form a partition of
the set on which the relation is defined
Exercise 5.2
Define the relation R on the set N by x R y if and only if there is some n ∈ Z such that
x = 2n y. Prove that R is an equivalence relation.
Exercise 5.3
Let X be the set of n × n real matrices. Define a relation ∼ on X by:
Exercise 5.4
Suppose that f : X → Y is a surjection. Define the relation R on X by
x R y ⇐⇒ f ( x ) = f (y). Prove that R is an equivalence relation. What are the equivalence
classes? Let C denote the set of equivalence classes [ x ] for x ∈ X. Prove that if [ x ] = [y]
then f ( x ) = f (y). This means that we can define a function g : C → Y by: g([ x ]) = f ( x ).
Prove that g is a bijection.
Exercise 5.5
Prove that the set { x ∈ Z | x is a multiple of 4} has no lower bound.
103
5. Equivalence relations and the integers
So,
z × 0 = [( a, b)] × [(1, 1)] = [( a + b, a + b)].
Now, ( a + b, a + b) R (1, 1) because a + b + 1 = 1 + a + b and therefore
[( a + b, a + b)] = [(1, 1)] = 0. So we see that z × 0 = 0.
104
5.9. Solutions to exercises
For x ∈ X, [ x ] is the set of all y ∈ X with f (y) = f ( x ), so, since f is a surjection, the
equivalence classes are exactly the sets Cz for each z ∈ Y, where Cz = { x ∈ X | f ( x ) = z}
is the set of elements of X mapped onto z by f .
The fact that [ x ] = [y] implies f ( x ) = f (y) follows directly either from this description of
equivalence classes, or from the fact that [ x ] = [y] implies x R y, which implies
f ( y ) = f ( x ).
Let g be as defined. It is surjective because for each z ∈ Y, there is some x ∈ X such that
f ( x ) = z (since f is surjective) and hence g([ x ]) = f ( x ) = z. Also, g is bijective because
g([ x ]) = g([y]) implies f ( x ) = f (y), which means x R y and hence that [ x ] = [y].
105
6. Divisibility and prime numbers
Chapter 6
Divisibility and prime numbers
R
R Biggs, N. L. Discrete Mathematics. Chapter 8.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 15–17 and
Chapter 23 (except section 23.5)
6.1 Introduction
In this chapter we begin to study elements of number theory. We start with a
discussion of divisibility and this leads us to discuss common divisors. Prime
6 numbers are the basic building blocks in the theory of numbers: in particular, each
number can be written in essentially only one way as a product of primes.
6.2 Divisibility
For integers x, y we say that x is a multiple of y or that y divides x, or that x is divisible
by y if, for some q ∈ Z, x = yq. We use the notation y | x to signify that y divides x.
If you’re being careful, you might remember we defined ‘divisible’ earlier for natural
numbers. This definition extends the definition we gave earlier: that is, if x and y
happen to be natural numbers, then ‘x is divisible by y’ is a true statement (by this
definition) if and only if it is true by the earlier definition (in Section 2.2.2). So we did
not change any definition, we are simply explaining what it means for integers which
are not in N.
Note that, for every x ∈ Z, x | 0. But 0 | x only if x = 0.
When y does not divide x, we write y 6 | x. In this case, as you will know from
elementary arithmetic, dividing x by y will leave a remainder.
a = bq + r and 0 ≤ r < b.
106
6.4. Representation of integers with respect to a base
This can be proved using some standard properties of integers. Let N0 = N ∪ {0}
and let
Q = {m ∈ N0 | bm ≤ a}.
Then Q is non-empty because 0 ∈ Q (since 0b = 0 ≤ a). Also, Q is finite, because if
qb ≤ a, then, given that b ∈ N, we have q ≤ a. So Q must have a maximum member.
Let’s call this q. Let r = a − bq. Because q ∈ Q, bq ≤ a and hence r ≥ 0. Now, q is the
maximum member of A, so q + 1 6∈ A, which means that (q + 1)b > a, so qb + b > a
and hence r = a − bq < b. So we have established that a = bq + r, and that 0 ≤ r < b.
To show that q and r are unique, suppose we have a = bq + r = bq0 + r 0 where
0 ≤ r, r 0 < b. We want to show q = q0 and r = r 0 .
Either q ≤ q0 or q ≥ q0 . Let’s suppose that q ≥ q0 (the argument is similar if q ≤ q0 ).
Then
0 ≤ r = a − bq ≤ a − bq0 = r 0 < b.
So
0 ≤ ( a − bq0 ) − ( a − bq) < b,
which simplifies to 0 ≤ b(q − q0 ) < b. This implies 0 ≤ q − q0 < 1. But q − q0 is an
integer, and so we must have q − q0 = 0. So q = q0 . Then, r = a − bq = a − bq0 = r 0 .
The same result holds more generally, without the restriction that a > 0, but the proof
6
is simplest when we restrict to the positive case. The general Division Theorem is:
Theorem 6.2 (Division Theorem) For any integers a and b with b > 0, there are
unique integers q and r such that
a = bq + r and 0 ≤ r < b.
x = ( x n x n −1 . . . x 1 x 0 ) t .
x n x n −1 . . . x 1 x 0 t = x n t n + x n −1 t n −1 + · · · + x 1 t + x 0
107
6. Divisibility and prime numbers
implements a modular division function which will directly return a quotient and
remainder. Calculators generally do not have this function, however performing a
normal division, subtracting the integer part, and multiplying will return the correct
remainder.
Note that (due to imprecision in the calculator) what is returned might be a decimal
number which is very close to an integer rather than an actual integer. Thus to divide
2342 by 37, we perform the (usual) division: 2342/37 = 63.297297.... Subtracting the
integer part (which is the quotient, 63) we get 0.297297... and multiplying by 37 we
get 10.9999998 from which we can guess the correct remainder is 11. Checking,
63 × 37 + 11 = 2342, as we expected.
Example 6.1 The number 60 (written in base 10) can be written in base 3 as
(2020)3 because:
60 = 2(33 ) + 0(32 ) + 2(3) + 0(1).
60 = 3 × 20 + 0
6 20 = 3×6+2
6 = 3×2+0
2 = 3 × 0 + 2.
201 = 4 × 50 + 1
50 = 4 × 12 + 2
12 = 4×3+0
3 = 4 × 0 + 3.
x n x n −1 . . . x 1 x 0 t = x n t n + x n −1 t n −1 + · · · + x 1 t + x 0 .
This of course works because the calculator will evaluate the number in base 10.
108
6.6. The Euclidean algorithm
Definition 6.1 (Greatest common divisor) Suppose a, b are two integers, at least
one of which is not 0. Then the greatest common divisor (gcd) of a and b, denoted by
gcd( a, b), is the unique positive integer d with the following properties:
Implicit here is the fact that the gcd is unique. This easily follows from properties (i)
and (ii). For, suppose that d and d0 are two positive integers satisfying (i) and (ii).
Because d0 divides a and b and because d satisfies (ii), we have d0 ≤ d. But also,
because d divides a and b and because d0 satisfies (ii), we have d ≤ d0 . So we must
have d = d0 .
It’s not too hard to see that the gcd exists. For any n ∈ Z, let D (n) = {m ∈ Z : m | n}.
This is the set of positive divisors of n. Since 1 | n, for every n ∈ Z we have D (n) 6= ∅.
Consider now the set D ( a, b) = D ( a) ∩ D (b), which is the set of positive common
divisors of a and b. Now, D ( a, b) 6= ∅ since 1 ∈ D ( a, b). Suppose a 6= 0. (We know
that at least one of a, b is nonzero. If a 6= 0, a very similar argument will work using b
in place of a.)
6
Take any m ∈ D ( a, b). By definition, we have a = qm for some q ∈ Z, and it follows
that | a| = |q| · |m|. Now q cannot be equal to zero, because a 6= 0, so because q is a
non-negative integer we have |q| ≥ 1. Since |m| is also a non-negative integer, it
follows that |q| · |m| ≥ |m|, and thus in particular | a| ≥ |m|. What we just proved is
the statement m ∈ D ( a, b) ⇒ m ≤ | a|.
So D ( a, b) is bounded above and hence has a (unique) maximal element d. That’s the
gcd.
Note that some textbooks use ( a, b) to denote the gcd, rather than gcd( a, b). Apart
from the fact that this makes for confusion with elements of N2 , these textbooks
usually also write [ a, b] for the least common multiple of a and b (you can guess the
definition), and I usually forget quickly which is which—the only advantage of the
( a, b) notation is that it saves ink.
Example 6.3 gcd(12, 20) = 4 because 4 | 12 and 4 | 20, but there is no common
divisor of 12 and 20 that is greater than 4.
If two numbers a, b have gcd( a, b) = 1, then we say that a and b are coprime. In this
case, a and b have no common factors other than 1 and −1. For example, 72 and 77
are coprime.
109
6. Divisibility and prime numbers
solving a certain problem (in this case, for finding the gcd of two integers).
Before presenting this, we state two important properties of greatest common
divisors.
Activity 6.2 Prove this second fact by proving that the set of common divisors of
a and b is the same as the set of common divisors of b and r (and hence both sets
have the same greatest member.)
These observations provide a way to determine gcds. Let’s think about a simple
example. Suppose we want gcd(100, 15). (You can see immediately that the answer is
6 5, but let’s try to explain a method that will be useful in rather less easy examples.)
We have 100 = 15 × 6 + 10 so, by the second fact above, gcd(100, 15) = gcd(15, 10).
Next, 15 = 10 × 1 + 5, so gcd(15, 10) = gcd(10, 5). But 10 = 2 × 5, so 5 divides 10 and
so, by the first of the two facts above, gcd(10, 5) = 5. It follows, then, that
gcd(100, 15) = 5.
The method is known as the Euclidean Algorithm. Here’s another, more substantial,
example.
It follows that gcd(2247, 581) = 7. The reason is that these divisions establish the
following equalities:
110
6.6. The Euclidean algorithm
And finally let’s give the algorithm in general. It makes life easier if we assume a and
b are positive integers, and that a > b. This is OK, because the cases we are ignoring
are ‘easy’. First, gcd( a, b) = gcd(| a|, |b|), so provided we know how to work out gcd
for non-negative integers we automatically can deal with any integers. Second, for all
natural numbers n we have gcd(0, n) = gcd(n, 0) = gcd(n, n) = n by definition, so if
one of a and b is zero, or if they’re equal, we know what to do. Finally, if a < b then
we can observe that gcd( a, b) = gcd(b, a); in other words, if we don’t have a > b then
we can swap them over and then compute the gcd.
In order to write a nice algorithm, let’s write a1 and a2 rather than a and b. So we will
write an algorithm which takes as input two positive integers a1 and a2 , which satisfy
a1 > a2 , and ‘returns’ the gcd of a1 and a2 .
1. Given positive integers a1 > a2 , set i = 1 (we’ll use i to count loops; this is the
first loop).
3. Find integers qi and ai+2 such that ai = ai+1 qi + ai+2 with 1 ≤ ai+2 < ai+1 .
111
6. Divisibility and prime numbers
which is impossible—after at most a1 loops we have to stop, and the only way to stop
is to give the right answer.
If you feel this proof is a bit too informal, try writing it a bit more carefully. The right
way to do this is to use induction: prove that for each i ≥ 1, either we have the
properties ai > ai+1 ≥ 1 and gcd( a1 , a2 ) = gcd( ai , ai+1 ), or the algorithm halted
before reaching loop i. The base case i = 1 is obvious, and what we explained above
is the induction step.
6 Before we prove this, I want to explain how we can actually find these m and n (the
theorem just promises they exist; it doesn’t tell you how to find them). The short
version is: we use the Euclidean algorithm to find d = gcd( a, b). And then we work
backwards through the calculation. Let’s give an example: a = 2247 and b = 581 (we
just calculated the gcd of these two numbers in the last section). Now we want to find
integers m and n such that
2247m + 581n = 7.
We can use the calculation we had above, by ‘working backwards’ through the
sequence of equations, as follows:
7 = 42 − 35
= 42 − (77 − 42) = 42 × 2 − 77
= (504 − 77 × 6) × 2 − 77 = 504 × 2 − 77 × 13
= 504 × 2 − (581 − 504) × 13 = 504 × 15 − 581 × 13
= (2247 − 581 × 3) × 15 − 581 × 13 = 2247 × 15 − 581 × 58
= 2247 × 15 + 581 × (−58).
So we see that m = 15 and n = −58 will work. Not an answer you could easily have
guessed! Notice how, in each line of this calculation, we use one of the lines from the
calculation that arises from the Euclidean algorithm (and we also simplify as we go
along). You will get used to this with practice. There are many examples you could
make up for yourself in order to help you practice.
Activity 6.3 In the above procedure to find m and n, figure out exactly which
part of the Euclidean algorithm calculation is being used at each stage.
Activity 6.4 Choose two particular positive integers a and b. Use the Euclidean
algorithm to find gcd( a, b) and then use your calculation to find integers m and n
112
6.7. Some consequences of the Euclidean algorithm
such that d = ma + nb. Do this several times with different choices of numbers
until you have mastered it.
By this point, you should have some idea why Bézout’s identity is true: you have a
procedure for finding the numbers m, n such that gcd( a, b) = am + bn. Let’s try to
formalise this procedure—that means giving an algorithm that finds the numbers. As
before, the interesting case is when a > b ≥ 1; if we can deal with this case, we can
deal with any other case. (Check you see why this is true!)
As with the Euclidean algorithm, to write a nice algorithm, which we’ll call the
Bézout algorithm, let’s write a1 and a2 rather than a and b. So we want to find
integers m, n such that
gcd( a1 , a2 ) = a1 n + a2 m .
We do this as follows.
Run the Euclidean algorithm to find gcd( a1 , a2 ). It generates numbers
a1 > a2 > a3 > · · · > as (so it stops in the loop number s − 1 when we find as | as−1 , i.e.
as−1 = qs−1 as ) and in addition numbers q1 , q2 , . . . , qs−2 . By definition of the
Euclidean algorithm, we have a1 = a2 q1 + a3 , or equivalently a3 = a1 − q1 a2 . Let’s
write n3 = 1 and m3 = −q1 . Then we have a3 = a1 n3 + a2 m3 . We just wrote a3 as an 6
integer linear combination of a1 and a2 .
Again by definition of the Euclidean algorithm, we have a2 = a3 q2 + a4 . We can
rearrange and substitute in our expression for a3 to get
a4 = a2 − ( a1 n3 + a2 m3 ) q2 = a1 n4 + a2 m4 where n4 = −n3 q2 and m4 = 1 − m3 q2 .
and now we have a4 as a linear combination of a1 and a2 . Why is it an integer linear
combination? Well, we get n4 and m4 by adding, subtracting and multiplying integers
(not dividing) and so we know that the results must be integers.
We can keep repeating this until we get to as−1 = a1 ns−1 + a2 ms−1 is an integer linear
combination, and then finally
gcd( a1 , a2 ) = as = a1 ns−1 qs−1 + a2 ms−1 qs−1
where the last part is by definition a linear combination of a1 and a2 , and it is an
integer linear combination because ns−1 , ms−1 and qs−1 are integers. This time we
don’t have to check anything about infinite loops—we know the Euclidean algorithm
doesn’t enter an infinite loop, and the Bézout algorithm runs the same number of
steps.
This looks rather like hard work and painful algebra—but bear in mind that in real
life, you wouldn’t actually do these calculations yourself on paper. You’d program a
computer to do them, and the point of an algorithm is that it’s easy to make a
computer run an algorithm: the method is already precisely specified, just as
computers like it.
So now how do we prove Theorem 6.3? Well, like this:
Proof. (of Bézout’s identity) We already proved the Euclidean algorithm works
correctly. We also explained why the Bézout algorithm works correctly—we proved
it. But if we have an algorithm which for input a, b provably finds integers n, m such
that gcd( a, b) = an + bm, then in particular those integers have to exist.
113
6. Divisibility and prime numbers
This is a proof by algorithm: in order to show that something exists, write down an
algorithm to find that thing, and prove it works. Some people may say they don’t like
algorithms and hence they would rather write this proof as an induction proof
(which you can also do). I think it’s clearer to write it this way.
The fact that there are m, n ∈ Z such that gcd( a, b) = am + bn (that is, that the gcd of
two integers is an integer linear combination of them) is very useful. Here’s one nice
consequence.
We know that, by definition, if c | a and c | b, then c ≤ d = gcd( a, b). But we can say
something stronger, namely that c | d.
Theorem 6.4 Suppose that a, b ∈ Z so that a and b are not both zero, and let
d = gcd( a, b). If c | a and c | b, then c | d.
Proof. There are integers m, n such that d = ma + nb. Suppose that c | a and c | b. Then
c | ma and c | nb, so c | (ma + nb). But this says c | d, as required.
6 Theorem 6.5 For a, b ∈ Z, with a, b not both zero, let d = gcd( a, b). Then, for c ∈ N,
there are integers m and n such that c = am + bn if and only if d | c.
We also have:
Theorem 6.6 Suppose that a, b ∈ N are coprime (meaning gcd( a, b) = 1). If a | r and
b | r, then ab | r.
This is not generally true if the numbers a, b are not coprime. Think of a
counterexample!
Proof. Because gcd( a, b) = 1, there are integers m, n such that 1 = ma + nb. So
114
6.8. Prime numbers
Proof. The proof makes use of the useful fact (seen above) that the gcd of any two
numbers can be written as an integer linear combination of the numbers
(Theorem 6.3). Suppose, then, that p is prime and that p | ab. If p | a, then the
conclusion of the theorem holds, so suppose p 6 | a (and we will try to prove p | b) Then
p and a have no common positive divisor other than 1. (The only positive divisors of
p are 1 and p because it is prime, and p does not divide a, by assumption.) So
gcd( p, a) = 1 and therefore there exist integers m and n with 1 = mp + na. 6
Remember that we want to write b as an integer multiple of p (because that’s what it
means to say p divides b). So let’s multiply both sides of this equation by b, in order
that we write b equal to something—and then try to see why the something is a
multiple of p.
We get b = b(mp + na) = (bm) p + n( ab). So again, our job is to find out why
(bm) p + n( ab) is a multiple of p. Well, obviously (bm) p is a multiple of p. What about
n( ab)? We assumed above that p | ab, which (by definition) means there is some
integer s such that ab = sp. Substituting this in, we get
115
6. Divisibility and prime numbers
You might ask what is going on with n = 1: how do I write 1 as a product of primes?
Simple: it’s a product of no primes at all: 1 = 20 × 30 × 50 × . . . . And we generally
don’t write all the primes raised to the power zero, because p0 = 1 for each prime p;
they don’t change the product. This might look like a funny piece of formalism (or
like nonsense), but the reason for writing it this way is (a) that it is true, and (b) it will
make life easier later not to make an exception for 1.
The expression of an integer as a product of primes is known as its prime
decomposition. For example, the prime decomposition of 504 is
504 = 2 × 2 × 2 × 3 × 3 × 7 = 23 · 32 · 7 .
Note that, in this last expression, the dot, ‘·’ denotes multiplication. It’s generally
easier to read and write mathematics when we use · rather than × (× and + can
6 easily get confused when written hurriedly) so this is common practice. The
exception is if we are discussing decimals when · can be confused with the decimal
point.
The proof of the Fundamental Theorem is not very difficult, given the results we
already have about prime numbers. Establishing that each positive integer can be
written as a product of primes is easy. Showing that such a decomposition is
essentially unique (that is, unique up to the ordering of the factors) is a little trickier,
but can be established using Theorem 6.7.
This is essentially unique: if p1k1 p2k2 . . . prkr = q1`1 q2`2 . . . q`s s , are two equal such
expressions, then r = s, pi = qi and k i = `i for all i. (‘Uniqueness’).
Fundamental Theorem: ‘Existence’
Proof. We use (strong) induction. For n ≥ 1, let P(n) be the statement:
n can be written as a product of primes
Base case: n = 1 is a product of no primes, so P(1) is true.
Assume, inductively, that k ∈ N and that P(s) is true for all s ≤ k. (We’re using strong
induction.) Consider k + 1. This could be a prime number, in which case we’re done
and P(k + 1) is true. Otherwise k + 1 = ab where 1 < a, b < k + 1. But then P( a) and
P(b) are true (by assumption) so each of a, b is a product of primes. So, therefore, is
k + 1.
116
6.9. Prime factorization: the Fundamental Theorem of Arithmetic
that p1 divides
q1 · q1 · · · · · q1 · q2 · q2 · · · · · q2 · · · · · q s · q s · · · · · q s .
| {z } | {z } | {z }
`1 `2 `s
By Activity 6.5, it follows that for some i we have p1 |qi . Now p1 is a prime, so it is not
1; and qi is a prime, so by definition p1 = qi . So we can write
n +1
p1 = p1k1 −1 p2k2 . . . prkr = q1`1 q2`2 . . . qi`i −1 . . . q`s s .
Now these two factorisations of np+11 are different, because the two factorisations of
n + 1 we started with are different. Because p1 > 1 we have (n + 1)/p1 ≤ n, so this is
a contradiction to the strong induction hypothesis.
We can use the Fundamental Theorem of Arithmetic to prove that there are infinitely
many primes. You can find an outline of this proof in Chapter 2.
117
6. Divisibility and prime numbers
Activity 6.6 Look again at the proof, in Chapter 2, that there are infinitely many
primes, and understand where the Fundamental Theorem of Arithmetic is used in
the proof.
You should also note that we rather often used ‘because 1 is not a prime’ in this
proof—and indeed, if we considered 1 to be a prime then the theorem would be false:
2 = 1 × 2 = 12 × 2 = 13 × 2 would be different factorisations of 2.
√
Activity 6.7 What’s the difference between Z and Z[ −5] that makes
Theorem 6.7 work?
This actually connects to a rather well-known part of mathematics. You maybe know
that Pierre de Fermat claimed in 1637 to have a proof that if n ≥ 3 is an integer, then
x n + yn = zn does not have any solutions where x, y, z ∈ N. But the proof, if he had
one, was lost.
118
6.10. Learning outcomes
Now, Fermat wasn’t just bluffing: he did have a proof for the case n = 4 (it was
discovered in his papers after his death). It seems unlikely that he really had a proof
in general—he certainly did not have the proof which Wiles famously announced in
1994. Perhaps he believed that he had a proof, but there was a mistake?
There was a proof announced (by Lamé) which Fermat might have also found
(perhaps!)—but that proof is wrong. The way it is wrong is that it assumes√that some
numbers which look like the integers (in more or less the same way as Z[ −5]) have
unique factorisation. This is an obvious mistake for you now after seeing a
counterexample—but it wasn’t so obvious when Lamé was working in the 1800s, let
alone to Fermat.
state clearly what it means to say that one number divides another
state the Division Theorem and, given two numbers, find the remainder and
6
quotient when one is divided by the other
demonstrate that you know that the gcd of two numbers can always be
expressed as an integer linear combination of the two numbers; and be able to
express the gcd in this way for any two numbers.
use the Euclidean algorithm to express the gcd of two numbers as an integer
linear combination of them
Exercise 6.2
Suppose that a, b ∈ N, both non-zero, and let d = gcd( a, b). We know that, by definition,
if c | a and c | b, then c ≤ d. Prove, in fact, that c | d.
119
6. Divisibility and prime numbers
Exercise 6.3
Suppose a, b ∈ N and that d = gcd( a, b). Prove that, for c ∈ N, there are integers m and
n such that c = am + bn if and only if d | c.
Exercise 6.4
Suppose a, b ∈ N. Prove that if there are integers m and n such that am + bn = 1 then a
and b are coprime.
Exercise 6.5
Prove that for all n ∈ N, the numbers 9n + 8 and 6n + 5 are coprime.
Exercise 6.6
Suppose that a, b ∈ N and that gcd( a, b) = 1. Suppose that a | r and b | r. Prove that ab | r.
Exercise 6.7
The Fibonacci numbers f 1 , f 2 , f 3 , . . . are defined as follows: f 1 = f 2 = 1 and, for n ≥ 3,
f n = f n−1 + f n−2 . Prove that for all n ∈ N, gcd( f n , f n+1 ) = 1.
Exercise 6.8
6 Suppose that p1 , p2 , . . . , pk are primes and that a, b ∈ N are given by
l m
a = p1l1 p2l2 . . . pkk , b = p1m1 p2m2 . . . pk k .
Prove that
r
gcd( a, b) = p1r1 p2r2 . . . pkk ,
where, for i = 1 to k, ri is the smaller of the two numbers li and mi .
Exercise 6.9
Suppose a, b ∈ N satisfy gcd( a, b) = 1 and, for some k ∈ N, ab = k2 . Prove that for some
integers m, n, a = m2 and b = n2 .
120
6.13. Solutions to exercises
Then P(1) is clearly true (and P(2) is the theorem just proved). Suppose P(n) is true
and let’s show P(n + 1) follows. So, suppose a1 , a2 , . . . , an+1 ∈ N and
p | a1 a2 . . . an an+1 . Well, since p | Aan+1 , where A = a1 a2 . . . an , we can apply the n = 2
case to see that p | A or p | an+1 . But, by P(n), if p | A = a1 a2 . . . an then p | ai for some i
between 1 and n. So we’re done: p divides at least one of the ai for i between 1 and
n + 1.
Learning activity 6.6 Here’s the proof, with explicit reference to the Fundamental
Theorem.
Suppose there were not infinitely many primes, so there’s a largest prime, M, say. Let
X = (2 × 3 × 5 × 7 × 11 × · · · × M ) + 1.
121
6. Divisibility and prime numbers
where we have used the inductive hypothesis for the last equality.
You can also establish this result by thinking about the way in which the Euclidean
algorithm would work in finding the gcd of f k and f k+1 .
122
6.13. Solutions to exercises
results of Section 6.8.) The only primes appearing in the decomposition of a and b are
p1 , p2 , . . . , pk , so we can deduce that for some i, p | pi which means p = pi (given that p
and pi are primes). So the prime decomposition of D is of the form
s
D = p1s1 p2s2 . . . pkk
for some non-negative integers si . Now, suppose that, for some i, si > li . Then, for some
integers M and N,
s l
D = pi i M, a = pii N,
where, because they involve only products of the other primes, neither N or M is divisible by
pi . Now, the fact that D | a means that there’s an integer L with a = LD, so
l s
pii N = Lpi i M
and hence
s − li
N = Lpi i M.
But this shows, since si − li ≥ 1 (because si , li ∈ Z and s I > li ), that pi | N, contradicting
the observation that pi 6 | N. So we must have si ≤ li for all i. A similar argument shows that
si ≤ mi for all i. So si ≤ ri = min(li , mi ) and hence D ≤ d. The result follows. 6
Solution to exercise 6.9
We use the Fundamental Theorem of Arithmetic. Let k have prime decomposition
k = p1α1 p2α2 . . . prαr . Then
ab = k2 = p2α 1 2α2 2αr
1 p2 . . . pr .
It follows that a and b must have prime decompositions involving only the primes
p1 , p2 , . . . , pr , and that each of a, b takes the form p1s1 p2s2 . . . prsr where si is a non-negative
integer. But we cannot have, for any i, pi | a and also pi | b, for this would mean that pi > 1
2α
is a common divisor of a, b, contradicting gcd( a, b) = 1. So, for each i, pi i divides precisely
one of a and b and pi does not divide the other of the two numbers. In other words, each of
2β 2β 2β
a, b takes the form p1 1 p2 2 . . . pr r where β i = 0 or β i = αi . This can be written as
β β β
( p1 1 p2 2 . . . pr r )2 , and hence there are integers m, n such that a = m2 and b = n2 .
123
7. Congruence and modular arithmetic
Chapter 7
Congruence and modular arithmetic
R
R Biggs, N. L. Discrete Mathematics. Chapter 13, Sections 13.1–13.3.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 19–21.
7.1 Introduction
In this chapter, we study congruence and we describe modular arithmetic. This builds
on the ideas and results on divisibility and equivalence relations that we met in
earlier chapters.
You go to sleep at 10 o’clock and you sleep for 8 hours. At what time do you wake.
Well, this is simple: you wake at 6 o’clock. What you’re doing in this calculation is
you’re doing what’s called arithmetic modulo 12. The answer is not 10 + 8 = 18,
because the clock re-starts once the hour of 12 is reached. This is a fairly simple idea,
7 when expressed in these terms, and it’s the key concept behind modular arithmetic, a
topic later in this chapter. We are going to take a more abstract approach.
This relation is so important that it has a special notation. If a R b, we say that a and b
are congruent modulo m and we write a ≡ b (mod m).
If a and b are not congruent modulo m, then we write a 6≡ b (mod m).
The division theorem tells us that for any integers a and for any m ∈ N, there are
unique integers q and r such that
a = qm + r and 0 ≤ r < m.
What this implies for congruence is that, for any a, there is precisely one integer r in
the range 0, 1, . . . , m − 1 such that a ≡ r(mod m).
124
7.2. Congruence modulo m
(i) a + c ≡ b + d (mod m)
(ii) a − c ≡ b − d (mod m)
(iii) ac ≡ bd (mod m)
(iv) ∀k ∈ Z, ka ≡ kb (mod m)
(v) ∀n ∈ N, an ≡ bn (mod m)
Proof. I leave (i) and (ii) for you to prove. Here’s how to prove (iii): because
a ≡ b (mod m) and c ≡ d (mod m), we have m | (b − a) and m | (d − c). So, for some
integers k, l, b − a = km and d − c = lm. That is, b = a + km and d = c + lm. So
bd = ( a + km)(c + lm) = ac + (kmc + alm + klm2 ) = ac + m(kc + al + klm).
Now, kc + al + klm ∈ Z, so bd − ac = (kc + al + klm)m is a multiple of m; that is,
m | (bd − ac) and ac ≡ bd (mod m). Part (iv) follows from (iii) by noting that
k ≡ k (mod m), and part (v) follows by repeated application of (iii) (or, by (iii) and
induction.).
7
Activity 7.2 Prove parts (i) and (ii) of Theorem 7.1.
Theorem 7.1 is useful, and it enables us to solve a number of problems. Here are two
examples.
Example 7.1 Suppose that the natural number x has digits xn xn−1 . . . x0 (when
written, normally, in ‘base 10’). So, for example, if x = 1246 then
x0 = 6, x1 = 4, x2 = 2, x3 = 1. Then 9 divides x if and only if x0 + x1 + · · · + xn is a
multiple of 9. (So this provides a quick and easy way to check divisibility by 9. For
example, 127224 is divisible by 9 because 1 + 2 + 7 + 2 + 2 + 4 = 18 is.)
How do we prove that this test works? We can use congruence modulo 9. Note
that
x = x0 + (10) x1 + (10)2 x2 + · · · + (10)n xn .
Now, 10 ≡ 1 (mod 9), so, for each k ∈ N, 10n ≡ 1 (mod 9). Hence
9 | x ⇐⇒ x ≡ 0 (mod 9)
⇐⇒ x0 + x1 + · · · + xn ≡ 0 (mod 9)
⇐⇒ 9 | ( x0 + x1 + · · · + xn ).
125
7. Congruence and modular arithmetic
Example 7.2 We can use congruence to show that there are no integers x and y
satisfying the equation 7x2 − 15y2 = 1.
We prove this by contradiction. So suppose such x and y did exist. Then, because
15y2 is a multiple of 5, we’d have 7x2 ≡ 1 (mod 5). Now, x is congruent to one of
the numbers 0, 1, 2, 3, 4 modulo 5. That is, we have:
x ≡ 0 or x ≡ 1 or x ≡ 2 or x ≡ 3 or x ≡ 4 (mod 5).
So,
x2 ≡ 0 or x2 ≡ 1 or x2 ≡ 4 or x2 ≡ 9 or x2 ≡ 16 (mod 5).
But 9 ≡ 4 (mod 5) and 16 ≡ 1 (mod 5), so, in every case, either x2 ≡ 0 or x2 ≡ 1 or
x2 ≡ 4 (mod 5). It follows, then, that in all cases, we have
So there does not exist an integer x with 7x2 ≡ 1 (mod 5), and hence there are no
integer solutions to the original equation.
126
7.2. Congruence modulo m
127
7. Congruence and modular arithmetic
[0] m , [1] m , . . . , [ m − 1] m .
7
So Zm has m members (We saw Z24 before as the ‘clock numbers’ in Section 3.2). We
can define operations ⊕ and ⊗ on Zm as follows:
For example, when m = 4, [2]4 ⊕ [3]4 = [5]4 = [1]4 and [2]4 ⊗ [3]4 = [6]4 = [2]4 .
In practice, we do not use the ⊕ and ⊗ symbols and we simply write x instead of
[ x ]m . It’s nice, when possible, to use values of x between 0 and m − 1, so in our
calculations if numbers get out of this range we will often add or subtract multiples
of m to return; we say ‘reduced modulo m’. This makes sense because
[ x ]m = [ x + sm]m for each integer s. We would say that we are ‘in Zm ’ if that’s not
clear from the context. So the above two calculations may be written:
in Z4 , 2 + 3 = 1, and 2 × 3 = 2.
Let me again stress that we usually try to use only the symbols 0, 1, . . . , m − 1, and
generally we will want to write our final answer using those symbols. But it might
well be useful along the way to use other symbols! In Z8 we have 7 = −1, so we can
128
7.3. Zm and its arithmetic
7100 − 1 = (−1)100 − 1 = 1 − 1 = 0
(iv) ( a + b) + c = a + (b + c)
(v) ( a × b) × c = a × (b × c)
7
(vi) a+0 = a
(vii) a × 1 = a if a 6= 0
(viii) a × (b + c) = ( a × b) + ( a × c)
Let’s think a little about rule (ix) in this Theorem. Suppose we’re in Z4 and that a = 3.
What is − a? Well, what we want is an element of Zm which when added to 3 gives 0
when we are doing arithmetic in Z4 . Now, 3 + 1 = 0 in Z4 because 3 + 1 is congruent
to 0 modulo 4, so −3 = 1. (Alternatively, we can note that −3 = −1(4) + 1, so −3 has
remainder 1 on division by 4, so −3 ≡ 1 (mod 4).)
It is important to realise that arithmetic in Zm does not obey all the nice properties
that normal arithmetic of integers obeys. In particular, we cannot generally cancel
multiplication. For example, in Z4 ,
2 × 3 = 2 = 2 × 1,
but we cannot ‘cancel the 2’ (that is, divide both sides by 2) to deduce that 3 = 1,
because 3 6= 1 in Z4 . (The reason we cannot ‘cancel’ the 2 is that 2 has no inverse in
Z4 . Existence of inverses is the topic of the next section.)
129
7. Congruence and modular arithmetic
xa = xb ⇒ x −1 ( xa) = x −1 ( xb) ⇒ ( x −1 x ) a = ( x −1 x )b ⇒ 1a = 1b ⇒ a = b.
As a result of this theorem, we see that if p is a prime, then every non-zero element x
of Z p is invertible. This is because gcd( x, p) = 1.
130
7.5. Solving equations in Zm
invertible, then we can see that the equation ax = b in Zm has solution x = a−1 b,
because a( a−1 b) = ( aa−1 )b = 1b = b.
How do you find a solution? Trial and error is not efficient if the numbers involved
are large. But we can use the Euclidean algorithm. For suppose we want to solve
ax = b in Zm , where gcd( a, m) = 1. Then, by using the Euclidean algorithm, we have
seen how we can find integers k, l such that 1 = ak + ml. So, b = abk + mlb. Since mlb
is a multiple of m, by definition abk ≡ b (mod m). So bk is a solution, and so bk + sm
will also be a solution for any integer s, because bk ≡ bk + sm (mod m) by definition.
It’s usually nicest to find a solution between 0 and m − 1 inclusive, so we choose s to
obtain this. It’s easy to do this for explicit numbers.
Example 7.6 Suppose we want to solve the equation 83x = 2 in Z321 . We can
check that 83 and 321 are coprime by the Euclidean algorithm (working in Z), as
follows: We have
321 = 83 × 3 + 72
83 = 72 × 1 + 11
72 = 11 × 6 + 6
11 = 6×1+5
6 = 5×1+1
5 = 1 × 5.
7
It follows that gcd(321, 83) = 1. Now, working backwards, we can express 1 as an
integer linear combination of 83 and 321:
1 = 6−5
= 6 − (11 − 6) = 6 × 2 − 11
= (72 − 11 × 6) × 2 − 11 = 72 × 2 − 11 × 13
= 72 × 2 − (83 − 72) × 13 = 72 × 15 − 83 × 13
= (321 − 83) × 3) × 15 − 83 × 13 = 321 × 15 − 83 × 58.
Activity 7.4 This calculation also reveals that 83 is invertible in Z321 . Why? And
what is the inverse of 83 in Z321 ?
131
7. Congruence and modular arithmetic
Proof. First part, =⇒: Suppose ax0 = b in Zm . Then ax0 − b = km for some k ∈ Z. So,
b = ax0 − km. Since d = gcd( a, m), d | a and d | m, so d | ( ax0 − km). That is, d | b.
Second part, ⇐=: Suppose d | b, so b = db1 for some b1 ∈ Z. There are x1 , y1 such that
d = x1 a + y1 m. Then, b = db1 = ( x1 b1 ) a + (y1 b1 )m.
So, in Zm , a( x1 b1 ) = b. That is, x1 b1 (reduced modulo m) is a solution.
If d 6 | b, there’s no solution.
2x + 3y = 1, 4x + 3y = 5.
Subtracting the first equation from the second gives 2x = 4. You might be tempted
to cancel the 2 and deduce that x must be 2. But wait! You can’t cancel unless 2 and
6 are coprime, and they are not (since their gcd is 2). Instead, you can check for
each of the elements of Z6 whether 2x = 4. Of course, x = 2 is a solution, but so
also is x = 5 because, in Z6 , 2(5) = 10 = 4. You can also check by calculating the
other values of 2x that x = 2, 5 are the only solutions. Now, from the first equation,
3y = 1 − 2x. When x = 2 or 5, 2x = 4, so this is 3y = 1 − 4 = −3 = 3. So we now
have 3y = 3. Again, we cannot cancel the 3. Instead we check, for each y ∈ Z6 ,
whether it is a solution, and we find that 1, 3 and 5 are all solutions. What this
argument shows is that the possible solutions are
( x, y) = (2, 1), (2, 3), (2, 5), (5, 1), (5, 3), (5, 5).
In fact, it can easily be checked (by substituting these pairs of values into the
original equations) that these are indeed solutions. So this system has 6 different
solutions.
Activity 7.5 Check, by substituting into the original equations, that each of these
six possible solution pairs ( x, y) is indeed a solution.
132
7.6. Learning outcomes
3x + y = 1, 5x + 4y = 1.
You are quite possibly very unhappy about the last sentence — I’ll explain it in the
next chapter!
7
7.6 Learning outcomes
At the end of this chapter and the relevant reading, you should be able to:
133
7. Congruence and modular arithmetic
Exercise 7.2
Show that for all n ∈ Z, n2 ≡ 0 or 1 (mod 3). Hence show that if 3 divides x2 + y2 then
3 | x and 3 | y. Use this to prove that there are no integers x, y, z such that x2 + y2 = 3z2 ,
other than x = y = z = 0.
Exercise 7.3
Show that, for all n ∈ N, 33n+1 ≡ 3 × 5n (mod 11) and that 24n+3 ≡ 8 × 5n (mod 11).
Hence show that for all n ∈ N, 11 | (33n+1 + 24n+3 ).
Exercise 7.4
By working modulo 7, prove that 2n+2 + 32n+1 is divisible by 7. (This result was proved in a
different way, using induction, in the exercises at the end of Chapter 3)
Exercise 7.5
Prove that 290 is an invertible element of Z357 and find its inverse.
Exercise 7.6
7 Solve the equation 10x = 3 in Z37 .
134
7.9. Solutions to exercises
integer. The converse is false, because, for example, 3 ≡ 3 (mod 4), but 3 6≡ 7 (mod 12)
135
7. Congruence and modular arithmetic
The fact that 13 × 357 − 16 × 290 = 1 means that, modulo 357, −16 × 290 ≡ 1. So, in
Z357 , (290)−1 = −16 = 341.
37 = 3 × 10 + 7
10 = 1×7+3
7 = 2×3+1
3 = 3 × 1,
and so
1 = 7−2×3
= 7 − 2(10 − 7) = 3 × 7 − 2 × 10
= 3(37 − 3 × 10) − 2 × 10
= 3 × 37 − 11 × 10.
So we see that −11 × 10 ≡ 1 (mod 37) and hence −33 × 10 ≡ 3 (mod 37). Now,
7 −33 ≡ 4 (mod 37), so the solution is x = 4. This is easily checked: 10(4) = 40 = 3 in
Z37 .
136
Chapter 8
Rational, real and complex numbers
R
R Biggs, N. L. Discrete Mathematics. Chapter 9.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 13 and 14.
The treatment in Biggs is probably better for the purposes of this course.
Neither of these books covers complex numbers. You do not have to know very much
about complex numbers for this course, but because this topic is not in these books, I
have included quite a bit of material on complex numbers in this chapter.
You can find useful reading on complex numbers in a number of books, including the
R
following (which you might already have, given that it is the MA100 text).
Anthony, M. and M. Harvey. Linear Algebra: Concepts and Methods. Cambridge
University Press 2012. Chapter 13.
8.1 Introduction
In this chapter, we explore rational numbers, real numbers and complex numbers. In
this course, we started with natural numbers and then we showed how to construct
the set of all integers from these. This construction used an equivalence relation,
8
together with a suitable way of adding and multiplying the equivalence classes. In a
similar way, the rational numbers can be constructed from the integers by means of
an equivalence relation. In this course, we do not take a very formal approach to the
definition or construction of the real numbers (which can, in fact, be quite
complicated). But we study properties of real numbers, and in particular we shall be
interested in whether real numbers are rational or not. We also consider the
‘cardinality’ of infinite sets.
As we stressed before, the point of these formal constructions is not to make you
think about a complicated construction which gets in the way of ‘calculation as
usual’. Once you proved that a construction does what it’s supposed to do, you don’t
have to think any more; you can calculate as usual. But by now we are getting to
mathematical structures which don’t obviously make sense—look back to
Section 3.10, and again: why does C exist but E doesn’t? You don’t have—yet—any
very good reason to believe that calculations using C make any more sense than
using E; maybe all this complex number stuff is nonsense, simply the calculations
which show it doesn’t work are a bit harder to find than the one which shows E is
nonsense? By the end of this chapter, you should be convinced that it is not nonsense.
137
8. Rational, real and complex numbers
You should quickly check that this relation R does what you think it should do: if (by
m0
your school-style calculation) the fractions m n and n0 are the same, then indeed we
have (m, n) R(m0 , n0 ). However so far in this course we did not define ‘division’ nor
‘fraction’—that’s exactly what we want to do now. The relation R only uses the
properties of Z which we are already happy with.
Let’s pause for a moment to prove that R is indeed an equivalence relation.
R is Reflexive: (m, n) R(m, n) because mn = nm.
R is Symmetric: (m, n) R( p, q) ⇒ mq = np ⇒ pn = qm ⇒ ( p, q) R(m, n).
R is Transitive: Suppose (m, n) R( p, q) and ( p, q) R(s, t). Then mq = np and pt = qs.
So, (mq)( pt) = (np)(qs) and, after cancelling qp, this gives mt = ns, so (m, n) R(s, t).
But, wait a minute: can we cancel pq? Sure, if it’s nonzero. If it is zero then that means
p = 0 (since we know that q 6= 0). But then mq = 0, so m = 0; and qs = 0, so s = 0. So,
in this case also we get mt = ns (both sides are zero) and so (m, n) R(s, t).
138
8.2. Rational numbers
(1, 2), (−1, −2), (2, 4), (−2, −4), (3, 6), (−3, −6) . . . .
m
Denoting the equivalence class [(m, n)] by , we therefore have
n
1
= {(1, 2), (−1, −2), (2, 4), (−2, −4), (3, 6), (−3, −6), . . . }.
2
Recall that if x 0 ∈ [ x ] then [ x 0 ] = [ x ]. So we can say
1 −1 2 −2 3 −3
= = = = = = ··· .
2 −2 4 −4 6 −6
We can think of the integers as particular rational numbers by identifying the integer
n
n with the rational number (that is, with the equivalence class [(n, 1)]). So Z ⊆ Q.
1 8
8.2.3 Doing arithmetic
How do we ‘do arithmetic’ with rational numbers. Well, you’ve been doing this for
years, but how would we define addition and multiplication of rational numbers in
an abstract setting? Just as we defined operations on equivalence classes in earlier
chapters (in the construction of Z from N and in the construction of Zm ), we can
define addition and multiplication as an operation on the equivalence classes of R.
Here’s how: let ⊕ and ⊗ be defined on the set of rational numbers as follows:
a c ad + bc a c ac
⊕ = , ⊗ = .
b d bd b d bd
In practice, we just use normal addition and multiplication symbols (and we often
omit the multiplication symbol), so we have
a c ad + bc a c ac
+ = , × = .
b d bd b d bd
Well, no surprises there, but remember that what we are doing here is defining
additional and multiplication of rational numbers (and remember also that these
rational numbers are, formally, equivalence classes). Now, if you think hard about it,
one issue that is raised is whether these definitions depend on the choice of
139
8. Rational, real and complex numbers
a a0 c c0
= 0 and = 0 ,
b b d d
then
a c a0 c0
× = 0 × 0.
b d b d
140
8.2. Rational numbers
By the way, the rational numbers are described as such because they are (or, more
formally, can be represented by) ratios of integers.
The rational numbers (as you can easily check, and as you already knew long ago)
satisfy the following axioms:
(F1) Closure under addition and multiplication: for each a, b ∈ F both a + b and ab are
in F.
(F2) Commutative addition and multiplication: for each a, b ∈ F we have
a + b = b + a and ab = ba.
(F3) Associative addition and multiplication: for each a, b, c ∈ F we have
( a + b) + c = a + (b + c) and ( ab)c = a(bc).
(F4) The distributive law: for each a, b, c ∈ F we have ( a + b)c = ac + bc.
(F5) Additive and multiplicative identity: there are two different elements 0 and 1,
such that for each a ∈ F we have a + 0 = a and a × 1 = a.
(F6) Additive and multiplicative inverses: for each a ∈ F there is an element − a such
that a + (− a) = 0, and if a 6= 0 there is an element a−1 such that a( a−1 ) = 1.
There is a name for structures F which satisfy all of these axioms: they form a field. So
the rational numbers are a field. You don’t need to memorise the list (and it won’t be
on the exam), because what it says is simply that in any field you can do algebra as
you are used to, with addition, subtraction, multiplication and division, and nothing
will ‘go wrong’; you will never somehow get an answer which isn’t in the field (as
you would if you tried to work out 3 − 5 in N, or 3/5 in Z), nor will the answer do
something unexpected (like depending on the order you do things, as would be the 8
case with multiplying 2 × 2 matrices: there multiplication isn’t commutative).
You probably feel intuitively that a field should be something like the rational
numbers (or the real or complex numbers, which we’ll formally meet shortly).
Another, rather different, example of a field is Z p for any prime number p. We saw in
the last chapter that all the axioms of a field, except (F6), holds for Zm whatever m is;
and we saw that the additive inverses part of (F6) holds whatever m is. And we saw
that if m is prime, then so does the multiplicative inverses part of (F6) — that’s all we
need to check. So Z2 3 is a field—but 1 + 1 + · · · + 1 could be equal to 0 in Z23 (it is, if
there are 23 ‘1’s, for example) whereas this never occurs in Q.
Why should you care about fields? Well, think about solving systems of linear
equations in R using Gaussian elimination (or, as you learned to do in MA100, by
writing it in terms of matrices and doing row reduction). What do you do there? You
add and subtract real numbers; you multiply and divide (which is the same as
multiplying by the multiplicative inverse). And you get an answer. Well, you can do
all that in any field, in exactly the same way, because that’s what the field axioms say
you can do.
So if you want to solve a system of linear equations in Z23 , then you can do it just as
you would in R, as you learned in MA100, except do your calculations in Z23 . Let’s
try to solve this one, in Z23 , by Gaussian elimination:
3x + 6y = 4 and 7x + y = 1
141
8. Rational, real and complex numbers
Well, we can take 6 times the second equation from the first:
3x − 42x = 4 − 6 , i.e. 7x = 21
And we can multiply this by 7−1 to get x = 3. Plugging that into 3x + 6y = 4 we get
9 + 6y = 4, so 6y = 18, so y = 3. And we’re done.
There are lots more examples of fields, some of which you will meet later in your
programme — and what you’ve just seen is that all that linear algebra you’re learning
in MA100 doesn’t just apply to the real and complex numbers, it applies to all fields.
You’ve been learning much more than you thought you were! And this is a large part
of the reason for this axiomatic approach to algebra: it lets you clearly see where you
can use methods you already know to handle completely new structures.
Activity 8.2 Make sure you understand that this is a proof by contradiction, and
that you understand what the contradiction is.
What this theorem tells us is that, at least if we want to solve equations like x2 = 2,
then the rational numbers are not enough; we need more. Of course, we could just
invent new symbols and define them to satisfy the equations we want. But (as is the
case for E from Section 3.10) this is a dangerous thing to do—we might be assuming
something exists which doesn’t in fact exist; whose existence leads to a contradiction.
We’d better rather construct the reals.
142
8.3. Rational numbers and real numbers
143
8. Rational, real and complex numbers
then the set {s + s0 : s ∈ S, s0 ∈ S0 } is a Dedekind cut. More or less the same idea
works for multiplication (but you have to be rather careful with negative numbers!).
If q is a rational number, then { x ∈ Q : x < q} is a Dedekind cut which we can easily
check behaves like the rational number q (in the same way that the rational number 21
√
behaves like the integer 2). Now, this definition at least solves the 2 problem:
{q ∈ Q : q < 0 or q2 < 2} is a Dedekind cut, and its square is 2. This is a nice clean
definition, but it only works because we have a definition of ‘order’ < on Q.
The second route is the ‘Cauchy sequence’ construction. This is rather more
complicated.
The idea is the following: if I want to specify a real number, I’ll give a sequence of
rational numbers which get closer and closer to the number I want (the formal term is
‘Cauchy sequence’; it’ll
√be defined next term), such as longer and longer parts of the
decimal expansion of 2; I might give the sequence
Multiplication works similarly (this time negative numbers aren’t a problem). So far
this looks rather like the ‘decimals’ construction. But of course I might have many
possible sequences of rational numbers which get closer and closer to say 0 (or any
other real number), so I should write down an equivalence relation which says two
such sequences are equivalent. And then the real numbers are the equivalence classes
of this relation.
8 By the end of next term, you should be able to make formal sense of this second
construction—in fact, it would be a good check of your understanding to come back
and try to fill in the details. This construction has the advantage that it doesn’t use the
idea of ‘order’; you can use the same idea in lots of different situations. But it requires
a lot more work to set up.
144
8.3. Rational numbers and real numbers
2 5 4 6
1+ + + + .
10 100 1000 10000
a0 .a1 a2 a3 . . . ai . . . ,
This formalism allows us to see that the infinite decimal expansion 0.99999 . . . , all of
whose digits after the decimal point are 9, is in fact the same as the number
1.0000000 . . . .
For example, two infinite decimal expansions are
3.1415926535 . . .
and
0.1833333333333 . . . .
(You’ll probably recognise the first as being the number π.) Suppose, in this second
decimal expansion, that every digit is 3 after the first three (that is, ai = 3 for i ≥ 3).
Then we write this as 0.183 (or, in some texts, 0.183̇). We can extend this notation to 8
cases in which there is a repeating pattern of digits. For example, suppose we have
0.1123123123123 . . . ,
145
8. Rational, real and complex numbers
0.5714285 · · ·
7 4.0000000
3.5
.50
.49
10
7
30
28
20
14
60
56
40
35
50
So,
4/7 = 0.571428.
Notice: we must have the same remainder re-appear at some point, and then the
calculation repeats. Here’s the calculation again, with the repeating remainder
highlighted in bold.
0.5714285 · · ·
8 7 4.0000000
3.5
.50
.49
10
7
30
28
20
14
60
56
40
35
50
Next, we think about the second part of the statement: that if the decimal expansion
repeats, then the number is rational.
Clearly, if the decimal expansion is terminating, then the number is rational. But what
about the infinite, repeating, case? We’ve given two examples above. Let’s consider
these in more detail.
146
8.3. Rational numbers and real numbers
Example 8.2 Consider a = 0.183. Let x = 0.003. Then 10x = 0.03 and so
10x − x = 0.03 − 0.003 = 0.03. So, 9x = 0.03 and hence x = (3/100)/9 = 1/300, so
18 1 55 11
0.183 = 0.18 + 0.003 = + = = ,
100 300 300 60
and this is the rational representation of a.
Example 8.3 Consider the number 0.1123. If x = 0.0123, then 1000x = 12.3123
and 1000x − x = 12.3. So 999x = 12.3 and hence x = 123/9990. So,
1 1 123 1122
0.1123 = +x = + = .
10 10 9990 9990
In general, if the repeating block is of length k, then an argument just like the
previous two, in which we multiply by 10k , will enable us to express the number as a
rational number.
√
Activity 8.3 Prove that if n is any natural number then either n is an integer or
it is irrational.
Many other important numbers in mathematics turn out to be irrational. I’ve already
mentioned π, and there is also e (the base of the natural logarithm). It’s not easy to
prove either of these numbers is irrational—in fact, it’s quite hard to find examples of
irrational numbers.
147
8. Rational, real and complex numbers
For the moment, let’s make one important observation: not only are there infinitely
many rational numbers, but there are no ‘gaps’ in the rational numbers. If you accept
the view of real numbers as (possibly) infinite decimal expansions, then this is quite
clear: you can get a very good approximation to any real number by terminating its
decimal expansion after a large number of digits. (And we know that a terminating
decimal expansion is a rational number.) The following theorem makes sense of the
statement that there are no ‘rational-free’ zones in the real numbers. Precisely,
between any two rational numbers, no matter how close together they are, there is
always another rational number.
Theorem 8.2 Suppose q, q0 ∈ Q with q < q0 . Then there is r ∈ Q with q < r < q0 .
Note that in some textbooks this definition is called ‘countably infinite’. These
textbooks define ‘countable’ differently, asking only for an injective map from the set
to N. This really is different — there is an injective map from any finite set to N
(more or less by definition). But any infinite set which has an injection to N is
countable (by our definition; this needs a proof which I am skipping), so the
difference is really only whether we call finite sets ‘countable’ or not. We are saying
that ‘countable’ in particular means the set has to be infinite.
For instance, Z is countable: we can define f : N → Z by
f (1) = 0, f (2) = 1, f (3) = −1, f (4) = 2, f (5) = −2, f (6) = 3, f (7) = −3, . . . .
(In general, f (n) = (−1)n bn/2c, where bn/2c means the largest integer that is no
more than n/2.) It is straightforward to show that f is a bijection. Hence Z is
countable. So, in this sense, the sets Z and N have the same ‘cardinality’ (even
though Z is strictly larger than N). Working with infinite sets is not the same as
148
8.4. Countability of rationals and uncountability of real numbers
working with finite sets: two finite sets, one of which was a strict subset of the other,
could not have the same cardinality!
What does ‘countable’ mean? The formal definition is given above. But one way of
thinking about it is that if S is countable, then the members of S can be listed:
s1 , s2 , s3 , . . . , .
For, suppose S is countable. Then there is a bijection f : N → S. Let si = f (i ) for
i ∈ N. Then, because f is a bijection, the list s1 , s2 , s3 , . . . will include every element of
S, each precisely once.
What is more surprising is that Q is also countable.
Traverse this array as indicated in the following diagrams, where traversed elements
are emboldened and underlined as they are traversed.
149
8. Rational, real and complex numbers
150
8.4. Countability of rationals and uncountability of real numbers
From this we can get a listing of all positive rational numbers m/n: simply write
down the corresponding rationals and ignore any (m, n) such that the rational m/n
has already earlier appeared in the list:
(1, 1), (2, 1), (1, 2), (1, 3), (2, 2), (3, 1), (4, 1), (3, 2), (2, 3), . . .
gives
1 2 1 1 2 3 4 3 2 1 1 2 3 4 5 6
, , , , , , , , , , , , , , , ,...
1 1 2 3 2 1 1 2 3 4 5 4 3 2 1 1
which becomes
1 2 1 1 3 4 3 2 1 1 5 6
, , , , , , , , , , , ,...
1 1 2 3 1 1 2 3 4 5 1 1
We can then include the negative rational numbers and 0 by starting with 0 and
replacing m/n by m/n and −m/n:
1 1 2 2 1 1 1 1 3 3 4 4 3 3 2 2
0, , − , , − , , − , , − , , − , , − , , − , , − ,
1 1 1 1 2 2 3 3 1 1 1 1 2 2 3 3
1 1 1 1 5 5 6 6
,− , ,− , ,− , ,− ,...
4 4 5 5 1 1 1 1
151
8. Rational, real and complex numbers
It’s clear that this listing describes a bijection from N to Q. So this proves Q is
countable.
We can go one step further in this direction. A number x is rational if and only if there
is some equation ax + b = 0, with a, b ∈ Z not both zero, to which it is a solution.
(Check that you agree this is obviously true!) This is a linear equation; what about
polynomials?
Definition 8.2 A number x is algebraic if it is a solution to an equation of the form
a n x n + a n −1 x n −1 + · · · + a 1 x + a 0 = 0
where n is a natural number and a0 , a1 , . . . , an are integers, not all zero.
Activity 8.5 Find out how to modify the proof that the rational numbers are
countable in order to show that the algebraic numbers are countable.
So, informally speaking, N, Z, Q, and the algebraic numbers, all have the same ‘size’.
What about R? Well, here it gets very interesting: the set of real numbers is not
countable. (It is said to be uncountable.)
This theorem is really rather surprising: right now you know two examples of
numbers which are real but not algebraic (namely π and e, although we didn’t prove
it). Of course you can generate more examples from this: 2π is also not algebraic
(why?). Is π + e algebraic? Probably not, but this is an open problem. But (it is easy to
prove) in this way you’ll only find a countable set of numbers which are not
algebraic. What it means to say that R is not countable is that it is really much, much
larger than the rationals, or the algebraic numbers. Even though you only have two
explicit examples of real numbers which are not algebraic, once you know the
algebraic numbers are countable but the real numbers are not, you know that ‘most’
real numbers are not algebraic!
The proof uses the famous ‘Cantor diagonal’ argument.
Suppose that f : N → R and that
f (n) = xn0 .xn1 xn2 xn3 . . . .
We show there’s a number in R which isn’t the image under f of any element of N
(and hence f is not a surjection). Consider
y = 0.y1 y2 y3 . . .
152
8.5. Complex numbers
where
1 if xii 6= 1
yi =
2 if xii = 1.
For all n ∈ N, y 6= f (n) since yn is different from xnn .
Since f was arbitrary, this shows that there can be no function f : N → R that is
surjective. Hence R is not countable.
8.5.1 Introduction
Consider the two quadratic polynomials,
p( x ) = x2 − 3x + 2 and q( x ) = x2 + x + 1
If you sketch the graph of p( x ) you will find that the graph intersects the x-axis at the
two real solutions (or roots) of the equation p( x ) = 0, and that the polynomial factors
into the two linear factors,
p( x ) = x2 − 3x + 2 = ( x − 1)( x − 2)
Sketching the graph of q( x ), you will find that it does not intersect the x-axis. The
equation q( x ) = 0 has no solution in the real numbers, and it cannot be factorised (or
factored) over the reals. Such a polynomial is said to be irreducible over the reals. In
order to solve this equation, we need to define the complex numbers.
8
8.5.2 Complex numbers: a formal approach
To start with, let’s formally construct the complex numbers from the real numbers.
This is where we can finally answer the question from Section 3.10 of what the
difference between C (which makes sense) and E (which doesn’t exist) is. The formal
construction this time is rather similar to the usual way you think about the complex
numbers (in contrast to the formal construction of Z we gave, which probably still
looks a bit funny).
We define the set C of complex numbers to be the set of all ordered pairs ( x, y) of real
numbers, with addition and multiplication operations defined as follows:
( a, b) + (c, d) = ( a + c, b + d), ( a, b) × (c, d) = ( ac − bd, ad + bc).
You should check that these definitions really work, that is, that (for example) the
multiplication is commutative, and that the distributive law holds. More generally,
you should check that C satisfies (F1)–(F6), i.e. it is a field: we can do all the algebra
we’re used to. (What C doesn’t have is an order: there isn’t any way of defining an
order < on C which plays nicely with addition and multiplication in the way that the
order plays nicely in N, Z, Q or R.)
You can also check that the complex numbers of the form ( x, 0) behave like the real
numbers, in other words that ( x, 0) + (y, 0) = ( x + y, 0), and ( x, 0) × (y, 0) = ( xy, 0),
153
8. Rational, real and complex numbers
which is what you expect for adding and multiplying real numbers. Finally, let’s
remember why we began this: we wanted to be able to solve the equation x2 + 1 = 0.
Well, that means we want a complex number ( a, b) such that
( a, b) × ( a, b) + (1, 0) = (0, 0). And we can find such a number:
(0, 1) × (0, 1) = (−1, 0), so we are done.
This is where the construction story stops, for us. We have constructed a field C
which contains the number line and in which we can solve x2 + 1 = 0. Let’s look back
to Section 3.10 briefly. There, we asked whether one can have a number system E
which satisfies (some of) the properties of a field and in which we can solve
x × 0 = 1. And we discovered that the answer is No. We proved that the answer is
No, by doing a calculation using the properties of this supposed-to-exist E, until we
came to a logical contradiction: we calculated that 1 = 1 + 1, but we also proved that
1 6= 1 + 1. And then came the question: why are the complex numbers different? Why
is it OK to ask for a solution to x2 + 1 = 0 but not to x × 0 = 1? More concretely: we
know that E doesn’t make sense—it doesn’t exist—but how do we know that if we
do some calculation with the complex numbers using the normal rules of
algebra—i.e. using the axioms of a field, (F1)–(F6)—we will not wind up calculating
something like that 1 = 1 + 1?
Well, suppose we came upon such a horrible calculation which shows 1 = 1 + 1 (i.e.
which finds a logical contradiction assuming the complex numbers exist). Now, this
calculation relies on the fact that C satisfies (F1)–(F6). And we proved that C satisfies
these, on the assumption that R satisfies them. (Well, actually, we didn’t really do this
proof; I simply told you to check it. But I assume you have done that by now.)
So we can rewrite our horrible calculation in terms of pairs of real numbers: and now
we have a logical contradiction which follows from our assumption that R exists. But
8 we constructed (OK, OK, we did not do that, but at least I promise we could do it..!)
the real numbers R from Q—and that means we can rewrite our horrible calculation
in terms of the rationals Q. Now we have a logical contradiction which follows from
our assumption that Q exists.
But we constructed Q from Z—and following that construction back we have a
logical contradiction which follows from our assumption that Z exists. And finally
we constructed Z from N, so we can follow that construction back too—rewriting all
the integers as equivalence classes of pairs of natural numbers—until finally our
hypothetical horrible calculation in C has been translated into a logical contradiction
in the natural numbers, using only the axioms of the natural numbers. And we said
we believe the natural numbers really do exist, that there are no logical contradictions
there.
Let’s be clear—you would not want to ever actually rewrite a calculation like this. A
complex number is a pair of real numbers, a real number is a set of rational numbers
(a Dedekind cut), a rational number is an equivalence class of pairs of integers, and
an integer is an equivalence class of pairs of natural numbers. So a complex number is
a pair of (sets of (equivalence classes of pairs of (equivalence classes of pairs of
natural numbers)))
which is a pretty unpleasant thing to think about. But you don’t actually have to
rewrite a calculation. We proved each step works, and it follows logically that if there
154
8.5. Complex numbers
is a logical contradiction which you can find by doing algebra with the complex
numbers using axioms (F1)–(F6), then there is a logical contradiction which you can
find in the natural numbers using the axioms of the natural numbers.1
Putting this more simply: if someone tells you they think the complex numbers don’t
make sense, you can now tell them that they logically also have to believe that
counting apples doesn’t work either.
Now, it is usual to say that the real numbers are contained in the complex numbers:
( a, 0) ‘is’ the real number a.
You should notice that (if you are trying to be very formal) the last statement isn’t
quite true. ( a, 0) is obviously not the same as the real number a. However, because the
numbers ( a, 0) behave exactly like the real numbers, we will commit an abuse of
notation and say that they are the same. So we will write N ⊂ Z ⊂ Q ⊂ R ⊂ C, even
though formally this is not really true. If you ask me what I really mean by N ⊂ C,
I’ll tell you that I want to refer to the set of complex numbers of the form ( a, 0) where
a is a positive integer. And that set, together with addition and multiplication as
defined on the complex numbers (and I should define an order <, which I can do in
the obvious way) indeed satisfies the axioms for the natural numbers. So it might not
formally be the set N which I started with, but it’s essentially the same; there is a
dictionary correspondence between them (as we proved in Section 3.9). The same is
true for all the other inclusions.
At this point we are going to stop trying to be formally careful. We’ve proved that
given N it makes sense to talk about Z, Q, R and C, and we’ve seen why it makes
sense to say that (for example) Q is contained in C. For the rest of your degree course,
you don’t have to remember the details of these constructions: you know they work,
and that’s enough. However, you should not forget the ideas, in particular the idea of
constructing new structures using equivalence relations and old structures. You’ll see 8
those ideas repeatedly, and next time they’ll be used to construct something you are
not so familiar with and have to work to understand.
155
8. Rational, real and complex numbers
Definition 8.3 A complex number is a number of the form z = a + ib, where a and b
are real numbers, and i2 = −1. The set of all such numbers is
C = { a + ib : a, b ∈ R} .
If z = a + ib is a complex number, then the real number a is known as the real part of
z, denoted Re(z), and the real number b is the imaginary part of z, denoted Im(z).
Note that Im(z) is a real number.
If b = 0, then z is a real number, so R ⊆ C. If a = 0, then z is said to be purely
imaginary.
The quadratic polynomial q(z) = x2 + x + 1 can be factorised over the complex
numbers, because the equation q(z) = 0 has two complex solutions. Solving in the
usual way, we have √
−1 ± −3
x= .
2
√ p √ √ √
We write, −3 = (−1)3 = −1 3 = i 3, so that the solutions are
√ √
1 3 1 3
w = − +i and w = − −i .
2 2 2 2
Notice the form of these two solutions. They are what is called a conjugate pair. We
have the following definition.
Definition 8.4 If z = a + ib is a complex number, then the complex conjugate of z is
the complex number z = a − ib.
We can see by the application of the quadratic formula, that the roots of an
8 irreducible quadratic polynomial with real coefficients will always be a conjugate
pair of complex numbers.
z + w = (1 + i ) + (4 − 2i ) = (1 + 4) + i (1 − 2) = 5 − i
and
zw = (1 + i )(4 − 2i ) = 4 + 4i − 2i − 2i2 = 6 + 2i
You should check that this is really exactly the same as the definitions we gave when
we formally constructed the complex numbers: the only difference is the way we’re
writing complex numbers.
If z ∈ C, then zz is a real number:
zz = ( a + ib)( a − ib) = a2 + b2 .
156
8.5. Complex numbers
Activity 8.6 Carry out the multiplication to verify this: let z = a + ib and
calculate zz.
z zw
Division of complex numbers is then defined by = since ww is real.
w ww
Example 8.5
1+i (1 + i )(4 + 2i ) 2 + 6i 1 3
= = = + i
4 − 2i (4 − 2i )(4 + 2i ) 16 + 4 10 10
z + z = 2 Re(z) is real
z=z
z+w = z+w
zw = z w
z
=
z 8
w w
Activity 8.7 Let z = a + ib, w = c + id and verify all of the above properties.
a n z n + a n −1 z n −1 + · · · + a 1 z + a 0 = 0
157
8. Rational, real and complex numbers
into a product of linear factors over C. The proof of the Fundamental Theorem of Algebra
is beyond the scope of this course (and this time not because it’s long and boring, but
because it is genuinely quite hard). However, we note the following useful result.
Theorem 8.5 Complex roots of polynomials with real coefficients appear in
conjugate pairs.
a0 + a1 z + + a2 z2 · · · + a n z n = 0
a0 + a1 z + a2 z2 + · · · + a n z n = 0 = 0
Since 0 is a real number, it is equal to its complex conjugate. We now use the
properties of the complex conjugate: that the complex conjugate of the sum is the sum
of the conjugates, and the same is true for the product of complex numbers. We have
a0 + a1 z + a2 z2 + · · · + an zn = 0,
and
a0 + a1 z + a2 z2 + · · · + an zn = 0.
Since the coefficients ai are real numbers, this becomes
8 a0 + a1 z + a2 z2 + · · · + an zn = 0.
Activity 8.8 Multiply out the last two factors above to check that their product is
the irreducible quadratic x2 + x + 1.
158
8.5. Complex numbers
Theorem 8.6 Two complex numbers are equal if and only if their real and
imaginary parts are equal.
There are two ways to prove this. We can do it directly, using the fact that the
complex numbers are a field:
Proof. Two complex numbers with the same real parts and the same imaginary parts
are clearly the same complex number, so we only need to prove this statement in one
direction. Let z = a + ib and w = c + id. If z = w, we will show that their real and
imaginary parts are equal. We have a + ib = c + id, therefore a − c = i (d − b).
Squaring both sides, we obtain ( a − c)2 = i2 (d − b)2 = −(d − b)2 . But a − c and
(d − b) are real numbers, so their squares are non-negative. The only way this
equality can hold is for a − c = d − b = 0. That is, a = c and b = d.
The other, much shorter (by now!) way to prove this is simply to observe that the
complex numbers are the same as pairs of real numbers (with addition and
multiplication as we defined them when we formally constructed the complex
numbers) and pairs of real numbers are by definition equal if and only if both
parts—which are precisely the real and imaginary parts—are equal.
As a result of this theorem, we can think of the complex numbers geometrically, as
points in a plane. For, we can associate the vector ( a, b) T uniquely to each complex
number z = a + ib, and all the properties of a two-dimensional real vector space
apply. A complex number z = a + ib is represented as a point ( a, b) in the complex
plane; we draw two axes, a horizontal axis to represent the real parts of complex
numbers, and a vertical axis to represent the imaginary parts of complex numbers.
Points on the horizontal axis represent real numbers, and points on the vertical axis
represent purely imaginary numbers. 8
( a, b)
7
z = a + ib
i
θ
(0, 0) 1
√
Activity 8.9 Plot z = 2 + 2i and w = 1 − i 3 in the complex plane.
a = r cos θ, b = r sin θ
159
8. Rational, real and complex numbers
√
where r = a2 + b2 is the length of the line joining the origin to the point ( a, b) and θ
is the angle measured anticlockwise from the real (horizontal) axis to the line joining
the origin to the point ( a, b). Then we can write z = a + ib = r cos θ + i r sin θ.
Definition 8.5 The polar form of the complex number z is
z = r (cos θ + i sin θ ).
√
The length r = a2 + b2 is called the modulus of z, denoted |z|, and the angle θ is
called the argument of z.
z and z are reflections in the real axis. If θ is the argument of z, then −θ is the
argument of z.
|z|2 = zz.
θ and θ + 2nπ give the same complex number.
We define the principal argument of z to be the argument in the range, −π < θ ≤ π.
√
Activity 8.10 Express z = 2 + 2i, w = 1 − i 3 in polar form.
Describe the following sets of z: (a) |z| = 3, (b) argument of z is π4 .
z r
= cos(θ − φ) + i sin(θ − φ)
w ρ
Activity 8.11 Show these by performing the multiplication and the division as
defined earlier, and by using the facts that cos(θ + φ) = cos θ cos φ − sin θ sin φ
and sin(θ + φ) = sin θ cos φ + cos θ sin φ.
DeMoivre’s Theorem
We can consider explictly a special case of the multiplication result above, in which
w = z. If we apply the multiplication to z2 = zz, we have
z2 = zz
= (r (cos θ + i sin θ ))(r (cos θ + i sin θ ))
= r2 (cos2 θ + i2 sin2 θ + 2i sin θ cos θ )
= r2 (cos2 θ − sin2 θ + 2i sin θ cos θ )
= r2 (cos 2θ + i sin 2θ ).
160
8.5. Complex numbers
Here we have used the double angle formulae for cos 2θ and sin 2θ.
Applying the product rule n times, where n is a positive integer, we obtain DeMoivre’s
Formula
Theorem 8.7
(cos θ + i sin θ )n = cos nθ + i sin nθ
Proof.
n
zn = z| ·{z
· · }z = r (cos θ + i sin θ )
n times
n
= r cos(θ| + ·{z
· · + θ}) + i sin(θ| + ·{z
· · + θ})
n times n times
z3 z5 z2 z4
sin z = z − + − · · ·
3! 5!
cos z = 1 − + − · · ·
2! 4! 8
If we use the expansion for ez to expand eiθ , and then factor out the real and
imaginary parts, we find:
(iθ )2 (iθ )3 (iθ )4 (iθ )5
eiθ = 1 + (iθ ) + + + + +···
2! 3! 4! 5!
θ2 θ3 θ4 θ5
= 1 + iθ − − i + + i − · · ·
2! 3! 4! 5!
θ 2 θ 4
θ3 θ5
= 1− + −··· +i θ − + −···
2! 4! 3 5!
If you’re being careful, you might notice something a bit strange here—what exactly
do I mean by these funny infinite sums? and why am I allowed to rearrange the terms
in them? Sure, I know addition is commutative, but that will only let me change
places of finitely many terms in the sum (which I don’t quite understand anyway),
and I still have infinitely many more thing which I need to change places. The answer
to that objection is: we’ll explain properly some of it next term, and some next year in
MA203 Real Analysis. For now, take it on faith that it does actually make sense.
161
8. Rational, real and complex numbers
z = reiθ
where r = |z| is the modulus of z and θ is the argument of z.
We can use either the exponential form, z = reiθ , or the standard form, z = a + ib,
according to the application or computation we are doing. For example, addition is
simplest in the form z = a + ib, but multiplication and division are simpler in
exponential form. To change a complex number between reiθ and a + ib, use Euler’s
formula and the complex plane (polar form).
Example 8.7 √
2π
ei = cos 2π
3 2π 1 3
3 + i sin 3 = − 2 + i 2 .
√ √ √ √
e2+i 3 = e2 ei 3 = e2 cos 3 + ie2 sin 3.
Activity 8.12 Write each of the following complex numbers in the form a + ib:
8
3π 3π 11π
e −1
π
ei 2 ei 2 ei 4 ei 3 e 1+ i
√ π √
Let z = 2 + 2i = 2 2 ei 4 and w = 1 − i 3 = 2e−i 3 , then
π
Example 8.8
√
w6 = (1 − i 3)6 = (2e−i 3 )6 = 26 e−i2π = 64
π
√ π √
zw = (2 2ei 4 )(2e−i 3 ) = 4 2e−i 12
π π
and √ 7π
z
= 2ei 12 .
w
Notice that in the above example we are using certain properties of the complex
exponential function, that if z, w ∈ C,
ez+w = ez ew and (ez )n = enz for n ∈ Z.
This last property is easily generalised to include the negative integers.
162
8.6. Learning outcomes
Example 8.9 Solve the equation z6 = −1 to find the 6th roots of −1.
Write z6 = (reiθ )6 = r6 ei6θ , −1 = eiπ = ei(π +2nπ )
Equating these two expressions, and using the fact that r is a real positive number,
we have
π 2nπ
r=1 6θ = π + 2nπ, θ = +
6 6
This will give the six complex roots by taking n = 0, 1, 2, 3, 4, 5.
Activity 8.13 Show this. Write down the six roots of −1 and show that any one
raised to the power 6 is equal to −1. Show that n = 6 gives the same root as n = 0.
Use this to factor the polynomial x6 + 1 into linear factors over the complex
numbers and into irreducible quadratics over the real numbers.
demonstrate that you understand how the rational numbers can be formally
constructed by means of an equivalence relation and that addition and
multiplication of rational numbers can be defined as operations on the
equivalence classes
indicate that you know that a real number is rational if and only if it has an 8
infinitely repeating pattern in its decimal expansion
determine, in the form m/n, a rational number from its decimal expansion
demonstrate that you understand that there are rational numbers arbitrarily
close to any real number
demonstrate that you know that the rationals are countable and the reals
uncountable
show that you know what is meant by complex numbers, and demonstrate that
you can add, subtract, multiply and divide complex numbers
show that you know that every polynomial of degree n has n complex roots and
that these occur in conjugate pairs
163
8. Rational, real and complex numbers
Exercise 8.2
1 + 2i
Express the complex number in the form a + bi.
4 − 5i
Exercise 8.3
Solve the equation x2 − 2ix + 3 = 0.
Exercise 8.4
Write each of the following complex numbers in the form a + ib:
3π 3π 11π
e −1 .
π
ei 2 ei 2 ei 4 ei 3 e 1+ i
Exercise 8.5
√ √
Express 1 + 3i in exponential form. Hence find (1 + 3i )30 .
164
8.8. Comments on selected activities
Now,
a2 = nb2
165
8. Rational, real and complex numbers
( x − w)( x − w) = x2 − (w + w) x + ww.
2i • z = 2 + 2i
(0, 0) 1 2
−i
√
• w = 1−i 3
−2i
Learning activity 8.10 Draw the line from √ the originπto the point z in the diagram
above. Do the same for w. For z, |z| = 2 2 and θ = 4 , so
√
z = 2 2 cos( π4 ) + i sin( π4 ) . The modulus of w is |w| = 2 and the argument is − π3 ,
so that
π π π π
w = 2 cos(− ) + i sin(− ) = 2 cos( ) − i sin( ) .
8 3 3 3 3
The set (a) |z| = 3, is the circle of radius 3 centered at the origin. The set (b), argument
of z is π4 , is the half line from the origin through the point (1,1).
Learning activity 8.13 The roots are:
π 3π 5π
z1 = 1 · e i 6 , z2 = 1 · e i 6 , z3 = 1 · e i 6 ,
7π 9π 11π
z4 = 1 · e i 6 , z5 = 1 · e i 6 , z6 = 1 · e i 6 .
i 13π i π6
These roots are in conjugate pairs, and e 6 =e :
5π
z4 = z3 = e −i 6 , z5 = z2 = e −i 2 , z6 = z1 = e −i 6 .
π π
x6 + 1 = ( x − z1 )( x − z1 )( x − z2 )( x − z2 )( x − z3 )( x − z3 ),
√
Using the a + ib form of each complex number, for example, z1 = 23 + i 21 , you can
carry out the multiplication of the linear terms pairwise (conjugate pairs) to obtain
x6 + 1 as a product of irreducible quadratics with real coefficients:
√ √
x6 + 1 = ( x2 − 3 x + 1)( x2 + 3 x + 1)( x2 + 1).
166
8.9. Solutions to exercises
1 + 2i 1 + 2i 4 + 5i
=
4 − 5i 4 − 5i 4 + 5i
(1 + 2i )(4 + 5i )
=
(4 − 5i )(4 + 5i )
4 + 8i + 5i + 10i2
=
16 − 25i2
−6 + 13i
=
41
6 13
= − + i.
41 41
8
You can check that this is the correct answer by calculating the product
6 13
− + i (4 − 5i )
41 41
167
8. Rational, real and complex numbers
Solution to exercise
√ 8.5
To express z = 1 + 3i in exponential form, we first note that
√ !
√ 1 3
1 + 3i = 2 + i
2 2
168