English Pages 401 [420] Year 1977
Table of contents :
CONTENTS
PREFACE
Notation
0: MATHEMATICAL MODELS
0.0: Introduction
0.1: Principles and Models
0.2: Mathematical Models
0.3: Purposes of Models
1: MATHEMATICAL REASONING
1.0: Introduction
1.1: Propositions
1.2: Predicates and Quantifiers
1.3: Quantifiers and Logical Operators
1.4: Logical Inference
1.5: Methods of Proof
1.6: Program Correctness
Axioms of Assignment
2: SETS
2.0: Introduction
2.1: The Primitives of Set Theory
2.2: The Paradoxes of Set Theory
2.3: Relations Between Sets
2.4: Operations on Sets
2.5: Induction
Inductive Definition of Sets
Recursive Procedures
Inductive Proofs
2.6: The Natural Numbers
2.7: Set Operations on Σ*
3: BINARY RELATIONS
3.0: Introduction
3.1: Binary Relations and Digraphs
3.2: Trees
Search Trees
Tree Traversal Algorithms
3.3: Special Properties of Relations
3.4: Composition of Relations
3.5: Closure Operations on Relations
3.6: Order Relations
Some Additional Concepts for Posets
3.7: Equivalence Relations and Partitions
Sums and Products of Partitions
4: FUNCTIONS
4.0: Introduction
4.1: Basic Properties of Functions
Inductively Defined Functions
Partial Functions
4.2: Special Classes of Functions
Inverse Functions
One-Sided Inverse Functions
5: COUNTING AND ALGORITHM ANALYSIS
5.0: Introduction
5.1: Basic Counting Techniques
Permutations and Combinations
Decision Trees
5.2: Asymptotic Behavior of Functions
Some Important Classes of Asymptotic Behavior
5.3: Recurrence Systems
Divide and Conquer Algorithms
5.4: Analysis of Algorithms
Searching Algorithms
Sorting Algorithms
6: INFINITE SETS
6.0: Introduction
6.1: Finite and Infinite Sets
6.2: Countable and Uncountable Sets
6.3: Comparison of Cardinal Numbers
6.4: Cardinal Arithmetic
7: ALGEBRAS
7.0: Introduction
7.1: The Structure of Algebras
7.2: Some Varieties of Algebras
Semigroups
Monoids
Groups
Boolean Algebras
7.3: Homomorphisms
7.4: Congruence Relations
7.5: New Algebraic Systems from Old
Quotient Algebras
Product Algebras
APPENDIX: THE PROGRAMMING LANGUAGE
ANSWERS TO SELECTED EXERCISES
BIBLIOGRAPHY
INDEX
DISCRETE MATHEMATICS IN COMPUTER SCIENCE

DONALD F. STANAT
Department of Computer Science
University of North Carolina at Chapel Hill

DAVID F. McALLISTER
Department of Computer Science
North Carolina State University

Prentice/Hall International, Inc.
Library of Congress Cataloging in Publication Data
STANAT, DONALD F (date)
Discrete mathematics in computer science.
Bibliography: p.
Includes index.
1. Mathematics--1961-  2. Electronic data processing. I. McAllister, David F. (date) joint author. II. Title.
QA39.2.8688 S121 7648915 ISBN 0-13-216052-8
This edition may be sold only in those countries to which it is consigned by Prentice-Hall International. It is not to be re-exported and it is not for sale in the U.S.A., Mexico or Canada.
© 1977 by Prentice-Hall, Inc., Englewood Cliffs, N.J.
All rights reserved. No part of this book may be reproduced in any form, by mimeograph or any other means, without permission in writing from the publisher.
Printed in the United States of America
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Prentice-Hall of Southeast Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
Whitehall Books Limited, Wellington, New Zealand
Prentice-Hall, Englewood Cliffs, New Jersey
To
Sylvia and Beth
PREFACE
This text is intended for use in a first course in discrete mathematics in an undergraduate computer science curriculum. The level is appropriate for a sophomore or junior course. The student is assumed to have experience with a high-level programming language. No specific mathematics is prerequisite, but some previous exposure to college-level mathematics is desirable.

The mathematics taught to students of computer science has changed radically since the early days of this academic discipline. Initially, nearly all topics were drawn from electrical engineering and numerical analysis. Over the years, however, the mathematics of computer science has developed a distinct character, incorporating and melding aspects from such areas as logic, universal algebra and combinatorics as well as analysis. Moreover, as the field has evolved, its use of mathematics has become more sophisticated. It is our view that a computer scientist must have substantial training in mathematics if he is to understand his tools and use them well. The purpose of this text is to provide a foundation for the discrete mathematics used in the theory and application of computer science.

The major part of this book treats classical mathematical topics, including sets, relations, functions, cardinality, and algebra. The approach, however, is not classical; we have emphasized the topics of importance to computer science and provided examples to illustrate why the material is of interest.

The first two topics of the text are usually not treated explicitly in a course of this type. Chapter 0 is a brief description of the nature and purpose of mathematical models. Chapter 1 treats mathematical reasoning, including the representation of assertions, how inferences are made, and how assertions are proved. The final section of the chapter is a description of how programs can be proved correct. The material of Chapter 1 is difficult for some students, especially those who have not had some previous experience in proving theorems in a college-level mathematics course. For this reason, many of the proofs in succeeding chapters are presented in
considerable detail with explicit references to the concepts and techniques of Chapter 1. The symbol ∎ is used throughout the text to indicate the end of a proof.

Chapter 2 begins with the usual topics of an introductory treatment of set theory, and then proceeds to inductive definitions of sets, proofs by induction, and recursive programs. The final section of the chapter treats languages, or sets of symbol strings over a finite alphabet. These sets play an important role in computer science, but they are usually not considered in an introduction to set theory.

Chapter 3 treats relations, using digraphs as a visual representation of binary relations on sets. Trees, equivalence relations and order relations are covered, as well as operations on relations, including composition and transitive closure.

Chapter 4 treats functions as a special class of relations. Several important classes of functions are defined and their properties investigated.

Chapter 5 is a treatment of counting techniques and their application to algorithm analysis. The first section introduces basic concepts, including permutations and combinations. The second section develops the concept of the asymptotic behavior of a function and how it can be used to measure algorithm complexity. Recurrence equations and their use in the analysis of algorithms are treated in the next section. The final section of the chapter uses the tools developed in the first three sections to investigate the optimality of several algorithms.

Chapter 6 treats infinite sets and cardinalities, emphasizing enumeration and diagonalization. A cardinality argument is used to show the existence of a real number which is not computable.

Chapter 7 is an introduction to the concepts of universal algebra, including homomorphisms, congruence relations, and quotient and direct product algebras. Semigroups, monoids, groups and Boolean algebras are described.
Through Chapter 4, the material of the text should be covered in the order in which it is presented, although sections and subsections which are marked with a double dagger (‡) can be omitted. The material of Chapter 1 is often ignored or treated in a cursory fashion, but we feel that these fundamental concepts of mathematics are better understood if studied explicitly. Many of the topics of Chapters 1 through 4 may have been studied previously by some students; these topics can be covered as rapidly as is appropriate. Chapters 5, 6 and 7 assume a knowledge of Chapters 1 through 4 but not each other; any subset of these three chapters can be presented. It is our opinion that Chapter 5 is the most important.

The examples which occur throughout the text range from very simple ones, included only as illustrations of the definitions, through ones which are both difficult and substantive. (The halting problem is treated in an example of Chapter 1, hashing functions are described in an example in Chapter 4, and the existence of a noncomputable real number is established in an example of Chapter 6.) In a few of the examples which relate the subject matter to applications, the reader may not be familiar with terminology used (e.g., PERT charts); these examples are included for the benefit of those who can easily understand them and should not cause concern to those who cannot. A number sign (#) is used to denote the end of a collection of examples.

Exercises are given at the end of each section in the approximate order in which the topics are presented in the text; within topics, they are ordered according to increasing difficulty. A problem marked ‡ treats material from an optional subsection. The programming problems given at the end of some sections will usually require additional specification before they can be worked by novice programmers. For example, in a set theory problem, one might want to consider only sets with no more than 100 elements.

This text has evolved over a period of several years. Preliminary versions have been used extensively at the University of North Carolina at Chapel Hill and North Carolina State University. It would be impossible to list all those who have contributed to the final product. Jon Bentley, Don Johnson and Neil Jones deserve particular mention; they provided comments and suggestions on the entire manuscript. Others who made substantial contributions include Peter Calingaert, James W. Hanson, Yale N. Patt, Stephen M. Pizer, James Thatcher, Victor L. Wallace and Stephen F. Weiss. Anne Presnell and Dave Tolle assisted in the preparation of problem solutions. Finally, we wish to thank the many students who studied from the manuscript and contributed to its final form.

Our secretarial help has come from many quarters, but three individuals deserve special mention. Nina Eaker worked on endless drafts and revisions in the early stages of the manuscript, Gloria Edwards carried the work forward, and Anne Edwards brought the manuscript to its final form. We thank them for their help and support.

DONALD F. STANAT
DAVID F. McALLISTER
NOTATION

Logic
¬P        not P
P ∨ Q     P or Q
P ∧ Q     P and Q
P ⇒ Q     P implies Q
P ⇔ Q     P if and only if Q
∀         Universal quantifier: for all...
∃         Existential quantifier: there exists...
∃!        There exists a unique...

Numbers
⌊x⌋, ⌈x⌉, N, I, I+, Q, Q+, R, R+, (a, b), [a, b], (a, b], [a, b), (a, ∞), [a, ∞), N_k
the integer n such that x [...]

Sets
a ∈ A
[...]

[...] ¬P. If P ⇒ Q is true, then P is said to be a stronger
assertion than Q; thus “x is a positive integer” is a stronger assertion than “x is an integer.” The English language uses implication to assert a causal or inherent relationship between a premise and a conclusion. Thus, “If I fall in the lake, then I will get wet” relates a cause to its effect, and “If I am a man, I am mortal” characterizes a property of men. However, in the language of propositions, the premise of an implication need not be related to the conclusion in any substantive way. This can be disturbing, as illustrated by the following example.
Example If P represents “oranges are purple” and Q represents “the earth is not flat,” then P ⇒ Q represents “If oranges are purple, then the earth is not flat.” Although no causal or inherent relationship holds between the color of oranges and the shape of the earth, the implication P ⇒ Q is true since the premise is false and the conclusion is true.
#
If P and Q have the same truth values, then they are said to be logically equivalent propositions. A logical operator called “equivalence” and denoted by ⇔ [...] Q and Q ⇒ P are both true. Conversely, if both P ⇒ Q and Q ⇒ P are true, then P ⇔ Q is true. For these reasons, the terminologies for equivalence and implication are closely related. The proposition P ⇔ Q is read “P is equivalent to Q,” “P is a necessary and sufficient condition for Q,” or “P if and only if Q.” The abbreviation “iff” is often used to represent the phrase “if and only if.” Other logical operators can be defined and are of interest for a variety of reasons; some of them will be described in the exercises of this section.
Truth tables for individual operators can be used to construct truth tables for arbitrarily complex propositional forms. The truth table for a propositional form specifies its truth value for every possible combination of truth values of its propositional variables. Each propositional variable can assume either of two values, true or false. Therefore, if k variables occur in a proposition, the associated truth table must describe 2^k cases. Each case occurs as a separate line in the truth table.
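The enumeration of the 2^k cases can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the helper names `truth_table` and `implies` are hypothetical, and the form tabulated is the implication P ⇒ Q.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a => b."""
    return (not a) or b

def truth_table(variables, form):
    """Return one row per case: the variable values followed by the form's value."""
    rows = []
    # product((0, 1), repeat=k) yields all 2**k assignments in binary counting order.
    for values in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        rows.append(values + (int(form(env)),))
    return rows

# Truth table for P => Q: two variables, hence 2**2 = 4 lines.
table = truth_table(["P", "Q"], lambda e: implies(e["P"], e["Q"]))
for row in table:
    print(*row)
```

Each additional variable doubles the number of rows, which is why truth tables become impractical for forms with many variables.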
Examples
(a) Construct a truth table for the proposition (Q ∧ ¬P) ⇒ P.

P  Q  ¬P  Q ∧ ¬P  (Q ∧ ¬P) ⇒ P
0  0  1   0       1
0  1  1   1       0
1  0  0   0       1
1  1  0   0       1

(b) Construct a truth table for the proposition [(P ∧ Q) ∨ ¬R] ⇔ P.

P  Q  R  P ∧ Q  ¬R  (P ∧ Q) ∨ ¬R  [(P ∧ Q) ∨ ¬R] ⇔ P
0  0  0  0      1   1             0
0  0  1  0      0   0             1
0  1  0  0      1   1             0
0  1  1  0      0   0             1
1  0  0  0      1   1             1
1  0  1  0      0   0             0
1  1  0  1      1   1             1
1  1  1  1      0   1             1
#
In the above truth tables, we have used two conventions which aid readability:
(i) All propositional variables occur in the leftmost columns.
(ii) Truth values are assigned to the propositional variables by “counting in binary” from 0 to 2^k - 1, where k is the number of propositional variables.
A tautology is a propositional form whose truth value is true for all possible values of its propositional variables, e.g., P ∨ ¬P. A contradiction or absurdity is a propositional form which is always false, such as P ∧ ¬P. A propositional form which is neither a tautology nor a contradiction is called a contingency.

Properties of a propositional form can sometimes be determined by constructing an “abbreviated” truth table. For example, if we wish to show that a propositional form is a contingency, it suffices to exhibit two lines of the truth table, one of which makes the proposition true and another that makes it false. To determine if a propositional form is a tautology, it is only necessary to check those lines of the truth table for which the proposition could be false.

Example Consider the problem of determining whether (P ∧ Q) ⇒ P is a tautology. We will use an abbreviated truth table. If an implication A ⇒ B is false, then A must be true and B must be false. The truth table for (P ∧ Q) ⇒ P has only one line where
the value of the premise P ∧ Q is true. Since this is the only instance where (P ∧ Q) ⇒ P could be false, it suffices to consider this line.

P  Q  P ∧ Q  (P ∧ Q) ⇒ P
1  1  1      1

Since the value of the propositional form for this line is true, it follows that the proposition is a tautology.
#
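The three categories can also be checked mechanically by evaluating every line of the full truth table. The following is a minimal sketch; the function name `classify` is hypothetical and a propositional form is represented as a Python function of k Boolean arguments.

```python
from itertools import product

def classify(k, form):
    """Classify a propositional form of k variables by examining all 2**k cases."""
    values = {bool(form(*v)) for v in product((False, True), repeat=k)}
    if values == {True}:
        return "tautology"
    if values == {False}:
        return "contradiction"
    return "contingency"

print(classify(1, lambda p: p or not p))               # P or not P
print(classify(1, lambda p: p and not p))              # P and not P
print(classify(2, lambda p, q: (not (p and q)) or p))  # (P and Q) => P
```

Unlike the abbreviated table, this sketch inspects every line; the abbreviated method reaches the same verdict with less work when the critical lines can be identified in advance.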
It is often convenient to replace one propositional form by another which is logically equivalent. If two propositional forms are logically equivalent, one can be substituted for the other in any proposition in which they occur; thus, since P is logically equivalent to P ∨ P, it follows that P ∨ Q is logically equivalent to (P ∨ P) ∨ Q. Table 1.1.1 is a list of important equivalences, often called identities. The symbols P, Q, and R represent arbitrary propositional forms. The symbol “1” is used to denote either a tautology or a true proposition. Likewise, the symbol “0” represents a false proposition or a contradiction. The names which appear to the right of the identities refer to properties and “rules of inference” which will be discussed later. Certain of the identities are particularly important. Identity 18 permits the replacement of implications by disjunctions. Identities 7 and 8 permit the replacement of disjunctions by conjunctions and vice versa. Most of the identities in Table 1.1.1 [...]
Table 1.1.1 LOGICAL IDENTITIES
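Logical equivalence of two propositional forms can be confirmed by comparing their truth values on every line of the truth table. The sketch below assumes, based only on the description in the text, that identity 18 is the replacement of P ⇒ Q by ¬P ∨ Q and that identities 7 and 8 are De Morgan's laws; the helper `equivalent` and the table `IMPLIES` are hypothetical names.

```python
from itertools import product

# Implication given directly by its truth table, so the check below is not circular.
IMPLIES = {(False, False): True, (False, True): True,
           (True, False): False, (True, True): True}

def equivalent(k, f, g):
    """True if forms f and g of k variables agree on all 2**k cases."""
    return all(f(*v) == g(*v) for v in product((False, True), repeat=k))

# Implication replaced by a disjunction (assumed to be identity 18).
implication_as_disjunction = equivalent(
    2, lambda p, q: IMPLIES[(p, q)], lambda p, q: (not p) or q)

# De Morgan's laws (assumed to be identities 7 and 8).
de_morgan_or = equivalent(
    2, lambda p, q: not (p or q), lambda p, q: (not p) and (not q))
de_morgan_and = equivalent(
    2, lambda p, q: not (p and q), lambda p, q: (not p) or (not q))

# Substitution example from the text: P is equivalent to P or P,
# hence P or Q is equivalent to (P or P) or Q.
substituted = equivalent(2, lambda p, q: p or q, lambda p, q: (p or p) or q)

print(implication_as_disjunction, de_morgan_or, de_morgan_and, substituted)
```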
[...] ⇔ [∀xP(x) ∧ ∀xQ(x)]. This relationship between ∀ and ∧ is informally characterized by the assertion that the universal quantifier ∀ distributes over the logical connective ∧. However, the existential quantifier ∃ does not distribute over the logical connective ∧. That is, ∃x[P(x) ∧ Q(x)] is not equivalent to ∃xP(x) ∧ ∃xQ(x), as the following argument shows. The proposition ∃x[P(x) ∧ Q(x)] asserts that “There exists an x such that P(x) and Q(x) are both true.” This assertion requires that the same value of x satisfies both P and Q. On the other hand, the assertion “There exists an x such that P(x) is true and there exists an x such that Q(x) is true,” which can be represented by

∃xP(x) ∧ ∃xQ(x)

permits different values of x to be chosen to satisfy P and Q.

To show the two assertions ∃x[P(x) ∧ Q(x)] and ∃xP(x) ∧ ∃xQ(x) are not equivalent, we can use the preceding analysis to construct a universe and predicates P and Q such that one assertion is true and the other false. Let the universe be the integers and let P(x) denote “x is an even integer” and Q(x) denote “x is an odd integer.” Then ∃xP(x) ∧ ∃xQ(x) is a true proposition, whereas ∃x[P(x) ∧ Q(x)] is false.

Although ∃x[P(x) ∧ Q(x)] and ∃xP(x) ∧ ∃xQ(x) are not equivalent, the first implies the second, that is, the assertion

∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)]

is valid. For if ∃x[P(x) ∧ Q(x)] is true, then there is some element c of the universe such that the proposition P(c) ∧ Q(c) is true. Therefore, P(c) is true and Q(c) is true. From the truth of P(c), we can conclude that ∃xP(x) is true. Similarly, we can conclude from Q(c) that ∃xQ(x) is true and therefore, the conjunction ∃xP(x) ∧ ∃xQ(x) is true.
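The even/odd counterexample can be tried out on a finite sample of the integers. This is a sketch only: the range -5..5 is an arbitrary stand-in for the infinite universe, so the code illustrates the argument rather than proving it, and the names `is_even`, `is_odd`, and `universe` are chosen here for illustration.

```python
def is_even(x):
    """P(x): x is an even integer."""
    return x % 2 == 0

def is_odd(x):
    """Q(x): x is an odd integer."""
    return x % 2 == 1

# A finite sample standing in for the universe of integers (an assumption of this sketch).
universe = range(-5, 6)

# Exists x [P(x) and Q(x)]: one and the same x must satisfy both predicates.
exists_both = any(is_even(x) and is_odd(x) for x in universe)

# Exists x P(x) and Exists x Q(x): the two witnesses may differ.
exists_each = any(is_even(x) for x in universe) and any(is_odd(x) for x in universe)

print(exists_both, exists_each)
```

The two values differ, so the assertions are not equivalent, while the implication from the first to the second still holds on this sample.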
By changing predicate variable names, we can use the previous results to establish that ∃ distributes over ∨ but ∀ does not. Since our results were established for arbitrary predicates, we can replace P by ¬R and Q by ¬T in the valid assertion

∀x[P(x) ∧ Q(x)] ⇔ [∀xP(x) ∧ ∀xQ(x)]

Since an equivalence remains valid when both sides are negated, the following is also valid:

¬∀x[¬R(x) ∧ ¬T(x)] ⇔ ¬[∀x¬R(x) ∧ ∀x¬T(x)].

Applying identities, we obtain the following sequence of equivalences.

∃x¬[¬R(x) ∧ ¬T(x)] ⇔ [¬∀x¬R(x) ∨ ¬∀x¬T(x)]
∃x[¬¬R(x) ∨ ¬¬T(x)] ⇔ [∃x¬¬R(x) ∨ ∃x¬¬T(x)]
∃x[R(x) ∨ T(x)] ⇔ [∃xR(x) ∨ ∃xT(x)]

This establishes that ∃ distributes over ∨. Using the same technique of replacing the predicate variables P and Q by ¬P and ¬Q in the valid assertion

∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)]

we can establish

[∀xP(x) ∨ ∀xQ(x)] ⇒ ∀x[P(x) ∨ Q(x)].

The converse of this implication is not valid.

Once we have established how to deal with the quantifiers for the operators ∧, ∨ and ¬, we can treat the remaining connectives ⇒ and ⇔ by applying identities relating them to ∧, ∨, and ¬.

Example We show that ∃ does not distribute over ⇒; that is, the assertion

∃x[P(x) ⇒ Q(x)] ⇔ [∃xP(x) ⇒ ∃xQ(x)]

is not valid.
Since A ⇒ B is equivalent to ¬A ∨ B, it follows that

∃x[P(x) ⇒ Q(x)] ⇔ ∃x[¬P(x) ∨ Q(x)]
                ⇔ [∃x¬P(x) ∨ ∃xQ(x)]
                ⇔ [¬∀xP(x) ∨ ∃xQ(x)]
                ⇔ [∀xP(x) ⇒ ∃xQ(x)].

Hence, the original assertion is equivalent to the assertion

[∀xP(x) ⇒ ∃xQ(x)] ⇔ [∃xP(x) ⇒ ∃xQ(x)].
We can construct a truth table for the propositional form of this assertion, taking the components ∀xP(x), ∃xQ(x), and ∃xP(x) as propositional variables. However, since ∃xP(x) is true whenever ∀xP(x) is true, two lines of the truth table do not apply.

∀xP(x)  ∃xP(x)  ∃xQ(x)  ∀xP(x) ⇒ ∃xQ(x)  ∃xP(x) ⇒ ∃xQ(x)
0       0       0       1                1
0       0       1       1                1
0       1       0       1                0
0       1       1       1                1
1       0       0       n.a.             n.a.
1       0       1       n.a.             n.a.
1       1       0       0                0
1       1       1       1                1
Considering the last two columns of the table, we conclude that the implication holds in one direction,

[∃xP(x) ⇒ ∃xQ(x)] ⇒ [∀xP(x) ⇒ ∃xQ(x)].

However, we can show the converse is not valid by exhibiting a counterexample. From the third line of the truth table, we know that any counterexample must be an interpretation of the predicate P in which ∀xP(x) is false and ∃xP(x) is true, and an interpretation for Q in which ∃xQ(x) is false. For the universe of the integers, let P(x) denote “x = 0” and Q(x) denote “x ≠ x.” This provides a counterexample and establishes that ∃ does not distribute over ⇒.
#
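The counterexample can be evaluated on a finite sample of the integers. The sketch below uses the predicates from the example; the sample range -3..3 and the helper names are assumptions made for illustration, and a finite sample can only illustrate the claim about the full set of integers.

```python
def implies(a, b):
    """Truth value of the implication a => b."""
    return (not a) or b

def P(x):
    """P(x): x = 0."""
    return x == 0

def Q(x):
    """Q(x): x is not equal to x, which no integer satisfies."""
    return x != x

# A finite sample standing in for the universe of integers (an assumption of this sketch).
universe = range(-3, 4)

forall_P = all(P(x) for x in universe)
exists_P = any(P(x) for x in universe)
exists_Q = any(Q(x) for x in universe)

# Left side: Exists x [P(x) => Q(x)], equivalent to ForAll x P(x) => Exists x Q(x).
left = any(implies(P(x), Q(x)) for x in universe)
# Right side: Exists x P(x) => Exists x Q(x).
right = implies(exists_P, exists_Q)

print(forall_P, exists_P, exists_Q, left, right)
```

The values of the three components match the third line of the truth table (0, 1, 0), and the two sides of the proposed equivalence disagree.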
Table 1.3.1 is a list of useful logical relationships between assertions involving quantifiers. Each relationship of the table also holds when additional free variables are inserted consistently in each occurrence of a predicate. Thus, from identity 4 we can infer

∀xP(x, y) ⇒ ∃xP(x, y)

and from identity 6 we can infer

[∀xP(x, y) ∧ Q(z)] ⇔ ∀x[P(x, y) ∧ Q(z)].
Table 1.3.1 A SUMMARY OF LOGICAL RELATIONSHIPS INVOLVING QUANTIFIERS

1.  ∀xP(x) ⇒ P(c), where c is an arbitrary element of the universe
2.  P(c) ⇒ ∃xP(x), where c is an arbitrary element of the universe
3.  ∀x¬P(x) ⇔ ¬∃xP(x)
4.  ∀xP(x) ⇒ ∃xP(x)
5.  ∃x¬P(x) ⇔ ¬∀xP(x)
6.  [∀xP(x) ∧ Q] ⇔ ∀x[P(x) ∧ Q]
7.  [∀xP(x) ∨ Q] ⇔ ∀x[P(x) ∨ Q]
8.  [∀xP(x) ∧ ∀xQ(x)] ⇔ ∀x[P(x) ∧ Q(x)]
9.  [∀xP(x) ∨ ∀xQ(x)] ⇒ ∀x[P(x) ∨ Q(x)]
10. [∃xP(x) ∧ Q] ⇔ ∃x[P(x) ∧ Q]
11. [∃xP(x) ∨ Q] ⇔ ∃x[P(x) ∨ Q]
12. ∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)]
13. [∃xP(x) ∨ ∃xQ(x)] ⇔ ∃x[P(x) ∨ Q(x)]
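Several relationships of Table 1.3.1 can be spot-checked by brute force over every interpretation of P and Q on a small finite universe. The sketch below is illustrative only: a finite check does not establish validity for all universes, and the names `interpretations`, `holds_everywhere`, and `id8` through `id13` are hypothetical. A predicate is represented as a tuple of truth values, one per element of the universe.

```python
from itertools import product

def implies(a, b):
    """Truth value of the implication a => b."""
    return (not a) or b

def interpretations(n):
    """All unary predicates on an n-element universe, as tuples of truth values."""
    return list(product((False, True), repeat=n))

def holds_everywhere(n, relation):
    """True if relation(P, Q) holds for every pair of predicates on n elements."""
    preds = interpretations(n)
    return all(relation(P, Q) for P in preds for Q in preds)

def id8(P, Q):   # relationship 8: an equivalence
    return (all(P) and all(Q)) == all(p and q for p, q in zip(P, Q))

def id9(P, Q):   # relationship 9: an implication only
    return implies(all(P) or all(Q), all(p or q for p, q in zip(P, Q)))

def id9_converse(P, Q):
    return implies(all(p or q for p, q in zip(P, Q)), all(P) or all(Q))

def id12(P, Q):  # relationship 12: an implication only
    return implies(any(p and q for p, q in zip(P, Q)), any(P) and any(Q))

def id12_converse(P, Q):
    return implies(any(P) and any(Q), any(p and q for p, q in zip(P, Q)))

def id13(P, Q):  # relationship 13: an equivalence
    return (any(P) or any(Q)) == any(p or q for p, q in zip(P, Q))

for name, rel in [("8", id8), ("9", id9), ("9 converse", id9_converse),
                  ("12", id12), ("12 converse", id12_converse), ("13", id13)]:
    print(name, holds_everywhere(3, rel))
```

On a universe with a single element the two converses happen to hold, which is why a counterexample needs at least two elements.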
A compact form of logical notation is often used to express mathematical assertions. For example, the assertion “For every x such that x > 0, P(x) is true,” which would be written in our current notation as

∀x[(x > 0) ⇒ P(x)]

can be written more compactly as

∀_{x>0} P(x).

Similarly, “There exists an x such that x ≠ 3 and Q(x) is true,” which would be written

∃x[(x ≠ 3) ∧ Q(x)]

can be written

∃_{x≠3} Q(x).
Using these conventions, the formal statement of assertions becomes both more compact and more readable. Furthermore, the compact notation allows a negation sign to be propagated through a sequence of quantifiers in the same manner as was illustrated earlier. Example Consider the limit of a function defined over the real line. The definition is usually expressed as follows. Definition:
The limit of f(x) as x approaches c is k (denoted lim_{x→c} f(x) = k) if for every ε > 0, there exists a δ > 0 such that for all x, if |x - c| < δ, then |f(x) - k| < ε.

[...]

This establishes that lim_{x→c} f(x) ≠ k if and only if there exists an ε > 0 such that for every δ > 0, there is some x such that |x - c| < δ and yet |f(x) - k| ≥ ε.
#
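The propagation of the negation sign through the quantifiers can be written out step by step. The following derivation is a reconstruction in LaTeX that agrees with the definition and the conclusion stated in the example; the intermediate steps are a sketch and are not taken from the original text.

```latex
\begin{align*}
\lim_{x \to c} f(x) = k
  &\iff \forall_{\varepsilon > 0}\, \exists_{\delta > 0}\, \forall x\,
        \bigl[\, |x - c| < \delta \Rightarrow |f(x) - k| < \varepsilon \,\bigr] \\[4pt]
\lim_{x \to c} f(x) \neq k
  &\iff \neg\, \forall_{\varepsilon > 0}\, \exists_{\delta > 0}\, \forall x\,
        \bigl[\, |x - c| < \delta \Rightarrow |f(x) - k| < \varepsilon \,\bigr] \\
  &\iff \exists_{\varepsilon > 0}\, \neg\, \exists_{\delta > 0}\, \forall x\,
        \bigl[\, |x - c| < \delta \Rightarrow |f(x) - k| < \varepsilon \,\bigr] \\
  &\iff \exists_{\varepsilon > 0}\, \forall_{\delta > 0}\, \neg\, \forall x\,
        \bigl[\, |x - c| < \delta \Rightarrow |f(x) - k| < \varepsilon \,\bigr] \\
  &\iff \exists_{\varepsilon > 0}\, \forall_{\delta > 0}\, \exists x\, \neg
        \bigl[\, |x - c| < \delta \Rightarrow |f(x) - k| < \varepsilon \,\bigr] \\
  &\iff \exists_{\varepsilon > 0}\, \forall_{\delta > 0}\, \exists x\,
        \bigl[\, |x - c| < \delta \;\wedge\; |f(x) - k| \geq \varepsilon \,\bigr]
\end{align*}
```

The last step uses the equivalence of ¬(A ⇒ B) with A ∧ ¬B; each earlier step moves the negation past one quantifier, changing ∀ to ∃ or ∃ to ∀ while leaving the restriction on the bound variable unchanged.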
The virtues of the compact notation will be obvious to anyone who writes out the definition of a limit using the conventional logical notation.

In this section we have described ways in which quantifiers and logical operators interact with each other. These interactions are often subtle, and dealing with them requires some care, but a facility with them is invaluable in the construction of sound mathematical arguments.

Problems: Section 1.3

1. Let P(x, y, z) denote xy = z; E(x, y) denote x = y; and G(x, y) denote x > y. Let the universe of discourse be the integers. Transcribe the following into logical notation.
(a) If y = 1, then xy = x for any x.
(b) If xy ≠ 0, then x ≠ 0 and y ≠ 0. If xy = 0, then x = 0 or y = 0.
(c) 3x = 6 if and only if x = 2.
(d) There is no solution to x² = y unless y > 0.
(e) x < z is a necessary condition for x [...]

2. [...]
(a) [...] T(x)]
(b) ∀x[T(x) ∨ ¬S(x)]
(c) ∃x[T(x) ∧ ¬P(x)]
(d) ∀x ∀y ∀z{[D(x, y, z) ∧ P(z)] ⇒ [P(x) ∨ P(y)]}
(e) ∀x{T(x) ⇒ ∀y ∀z[D(x, y, z) ⇒ T([...])]}
3. Put the following into logical notation. Choose predicates so that each assertion requires at least one quantifier.
(a) There is one and only one even prime.
(b) No odd numbers are even.
(c) Every train is faster than some cars.
(d) Some cars are slower than all trains but at least one train is faster than every car.
(e) If it rains tomorrow, then somebody will get wet.

4. Find an assertion which is logically equivalent to ∀xP(x) but uses only the quantifier ∃ and the logical operator ¬. Similarly, express ∃xP(x) in terms of ∀ and ¬.

5. Find an assertion which is logically equivalent to ∃!xP(x) but which uses only the quantifiers ∀ and ∃ together with the predicate for equality and logical operators.

6. Show that the following propositions are valid.
(a) [∀xP(x) ⇒ Q] [...]
(b) [...] Q(x)] ⇒ [P ⇒ ∀xQ(x)]
7. For the following assertions, establish those which are true and find interpretations for P and Q which provide counterexamples for those which are false.
(a) ∀x[P(x) ⇒ Q(x)] ⇒ [∀xP(x) ⇒ ∀xQ(x)]
(b) [∀xP(x) ⇒ ∀xQ(x)] ⇒ ∀x[P(x) ⇒ Q(x)]
(c) [∃xP(x) ⇒ ∀xQ(x)] ⇒ ∀x[P(x) ⇒ Q(x)]
(d) ∀x[P(x) ⇒ Q(x)] ⇒ [∃xP(x) ⇒ ∀xQ(x)]

8. (a) For a universe containing only the elements 0 and 1, expand ∃x[P(x) ∧ Q(x)] and [∃xP(x) ∧ ∃xQ(x)] into propositions involving P(0), P(1), ... etc., and without quantifiers. Rearrange the terms of this expansion to show
∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)].
(b) Show that the converse of the implication of part (a) is not valid.
(c) For the same universe, show
∀x[P(x) ⇔ Q(x)] ⇒ [∀xP(x) ⇔ ∀xQ(x)].
(d) Show that the converse of the implication of part (c) is not valid.
9. Show that the following are valid for the universe of natural numbers N either by expanding the statement or by applying identities.
(a) ∀x ∀y[P(x) ∨ Q(y)] ⇔ [∀xP(x) ∨ ∀yQ(y)]
(b) ∃x ∃y[P(x) ∧ Q(y)] ⇒ ∃xP(x)
(c) ∀x ∀y[P(x) ∧ Q(y)] ⇔ [∀xP(x) ∧ ∀yQ(y)]
(d) ∃x ∃y[P(x) ⇒ P(y)] ⇒ [∀xP(x) ⇒ ∃yP(y)]
(e) ∀x ∀y[P(x) ⇒ Q(y)] ⇔ [∃xP(x) ⇒ ∀yQ(y)]
10. (a) Write out the definition of lim_{x→c} f(x) = k in the usual logical notation rather than the compact notation used in the last example of this section.
(b) Find the condition for lim_{x→c} f(x) ≠ k by forming the negation of both sides of the definition.

11. Let A be a two-dimensional integer array with 20 rows (indexed from 1 to 20) and 30 columns (indexed from 1 to 30). Using compact logical notation, make the following assertions. Assume the universe of discourse is the set of integers I.
(a) All entries of A are nonnegative.
(b) All entries of the 4th and 15th rows are positive.
(c) Some entries of A are zero.
(d) The entries of A are sorted in row-major order (the entries are in order within rows, and every entry of the ith row is less than or equal to every entry of the (i + 1)st row).

1.4 LOGICAL INFERENCE
A theorem is a mathematical assertion which can be shown to be true. A proof is an argument which establishes the truth of a theorem. A mathematician will not usually accept an assertion as true unless he is convinced that a proof of the assertion can be constructed.

Mathematicians have long been concerned with the question of what constitutes a proof. Their work has resulted in the concept of a formal mathematical system in which the notions of axiom, theorem, and proof are precisely defined. Ideally, these systems would provide a formal basis for describing all rigorous mathematics. In fact, the systems are not powerful enough to describe all mathematical systems of practical importance. Nevertheless, work in formal systems has increased our understanding of mathematical reasoning, and we will use the terminology of formal systems to describe the concepts of theorem and proof.

In the remaining sections of this chapter, we address the problem of formulating and constructing proofs. The novice in mathematics is often puzzled by the question of what constitutes a proof. When is an argument convincing? The question is not easily answered. In fact, mathematicians sometimes disagree among themselves as to whether an argument is sound. Their disagreement may be over whether to allow a particular proof technique, such as the use of the law of the excluded middle; such differences are essentially philosophical and, in some sense, unresolvable. Even when mathematicians can agree on the acceptability of a proof technique, disagreement sometimes occurs when a purported proof is thought to contain some error of commission or omission. The existence of such disagreements indicates the difficulty of constructing and evaluating proofs.

There are no general algorithms for deciding whether an assertion is true or false; if there were, mathematicians would use them. The construction of proofs is a craft, and while we can offer a modicum of advice, the skill can only be learned by means of examples and practice. The proofs which occur throughout this text are intended to serve both as convincing arguments and as models of proof techniques. The exercises are intended to provide practice in the construction of proofs.
A proof of an assertion is a sequence of statements which represents an argument that the theorem is true. Some of the assertions which occur in a proof may be known to be true a priori; these include axioms or previously proved theorems. Other assertions may be hypotheses of the theorem, assumed to be true in the argument. Finally, some assertions may be inferred from other assertions which occurred earlier in the proof. Thus, to construct proofs, we need a means of drawing conclusions or deriving new assertions from old ones. This is done using rules of inference. Rules of inference specify conclusions which can be drawn from assertions known or assumed to be true.

Perhaps the most fundamental rules of inference are those which permit substitutions. Thus, we are generally allowed to replace any expression in an assertion by another expression which is equivalent to it; we consider the new assertion to be true if and only if the original assertion was true. We learn this rule of inference at an early age; it is sometimes expressed as “equals can be substituted for equals.” Other rules governing substitution are commonly used in mathematics, and we will apply them freely without explicitly stating them. For example, if S is a tautology in which propositional variables occur, substitution of propositions for the propositional variables in the usual way results in a new tautology.

Another rule of inference can be stated as follows: If it is known that a statement P is true, and also that the statement P ⇒ Q is true, then we can conclude that the statement Q is true.
Example Suppose we know “Samson is strong” and “If Samson is strong, then it will take a woman to do him in.” We can conclude “It will take a woman to do Samson in.” #
This rule of inference is called modus ponens; it is often presented in the form of an argument as follows:

    P
    P ⇒ Q
    ------
    ∴ Q

In such a tabular presentation of an argument, the assertions above the horizontal line are called hypotheses or premises; the assertion below the line is the conclusion. The symbol ∴ is read “therefore” or “it follows that,” or “hence.” An argument is said to be valid if, whenever all the premises are true, the conclusion is true. A rule of inference is an argument form which is taken to be valid in the same sense that an axiom is taken to be true. The rule of inference known as modus ponens is related to the tautology [P ∧ (P ⇒ Q)] ⇒ Q in the language of propositions. Other rules of inference have similar interpretations; we have listed some of the most important rules of inference in Table 1.4.1.
Table 1.4.1  RULES OF INFERENCE RELATED TO THE LANGUAGE OF PROPOSITIONS

In each rule the premises are listed first, separated by commas, and the conclusion follows the symbol ∴.

Rule of Inference                          Tautological Form                               Name
P  ∴ P ∨ Q                                 P ⇒ (P ∨ Q)                                     addition
P ∧ Q  ∴ P                                 (P ∧ Q) ⇒ P                                     simplification
P,  P ⇒ Q  ∴ Q                             [P ∧ (P ⇒ Q)] ⇒ Q                               modus ponens
¬Q,  P ⇒ Q  ∴ ¬P                           [¬Q ∧ (P ⇒ Q)] ⇒ ¬P                             modus tollens
P ∨ Q,  ¬P  ∴ Q                            [(P ∨ Q) ∧ ¬P] ⇒ Q                              disjunctive syllogism
P ⇒ Q,  Q ⇒ R  ∴ P ⇒ R                     [(P ⇒ Q) ∧ (Q ⇒ R)] ⇒ [P ⇒ R]                   hypothetical syllogism
P,  Q  ∴ P ∧ Q                                                                             conjunction
(P ⇒ Q) ∧ (R ⇒ S),  P ∨ R  ∴ Q ∨ S         [(P ⇒ Q) ∧ (R ⇒ S) ∧ (P ∨ R)] ⇒ [Q ∨ S]         constructive dilemma
(P ⇒ Q) ∧ (R ⇒ S),  ¬Q ∨ ¬S  ∴ ¬P ∨ ¬R     [(P ⇒ Q) ∧ (R ⇒ S) ∧ (¬Q ∨ ¬S)] ⇒ [¬P ∨ ¬R]     destructive dilemma
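The tautological forms listed in Table 1.4.1 can be checked mechanically by evaluating each one under every assignment of truth values. The following Python sketch does this; the helper names `implies` and `is_tautology` are our own and are not part of the text, and the rule of conjunction is omitted because no tautological form is listed for it.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_tautology(form, n_vars):
    """Return True if `form` is true under every assignment of truth values."""
    return all(form(*values) for values in product([False, True], repeat=n_vars))


# Tautological forms of Table 1.4.1, each paired with its number of variables.
TABLE_1_4_1 = {
    "addition": (lambda p, q: implies(p, p or q), 2),
    "simplification": (lambda p, q: implies(p and q, p), 2),
    "modus ponens": (lambda p, q: implies(p and implies(p, q), q), 2),
    "modus tollens": (lambda p, q: implies((not q) and implies(p, q), not p), 2),
    "disjunctive syllogism": (lambda p, q: implies((p or q) and (not p), q), 2),
    "hypothetical syllogism": (
        lambda p, q, r: implies(implies(p, q) and implies(q, r), implies(p, r)),
        3,
    ),
    "constructive dilemma": (
        lambda p, q, r, s: implies(
            implies(p, q) and implies(r, s) and (p or r), q or s
        ),
        4,
    ),
    "destructive dilemma": (
        lambda p, q, r, s: implies(
            implies(p, q) and implies(r, s) and ((not q) or (not s)),
            (not p) or (not r),
        ),
        4,
    ),
}

for name, (form, n_vars) in TABLE_1_4_1.items():
    print(f"{name}: {is_tautology(form, n_vars)}")
```

Each line printed should report True. The check is a truth table in disguise: with n propositional variables it examines 2ⁿ assignments, so it is practical only for small forms.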
Examples Fallacious arguments are often the result of incorrect inferences. Here we present some examples of common fallacies.

(a) The Fallacy of Affirming the Consequent. Consider the following argument:

If the butler did it, he will be nervous when he is interrogated. The butler was very nervous when he was interrogated. Therefore, the butler did it.

Presented in the form of our rules of inference, this argument can be represented as follows:

    P ⇒ Q
    Q
    ------
    ∴ P
The argument is not correct because the conclusion P can be false even though the hypotheses P ⇒ Q and Q are true; i.e., the assertion [(P ⇒ Q) ∧ Q] ⇒ P is not a tautology: the source of the butler’s discomfort may not have been guilt but rather the behavior of the stock market on the day that he was questioned.

(b) The Fallacy of Denying the Antecedent. This form of fallacious argument can be represented as

    P ⇒ Q
    ¬P
    ------
    ∴ ¬Q

The following example illustrates the fallacy: If the butler’s hands are covered with blood, then he did it. The butler is impeccably groomed. Therefore, the butler is innocent. The argument ignores the compulsive cleanliness of the butler, who always washes his hands immediately after committing a crime. From P ⇒ Q and ¬P, one can conclude neither Q nor ¬Q. #
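Both fallacies can be exposed by searching for an assignment of truth values that makes the corresponding implication false. The sketch below is an illustration only; the function names are hypothetical and chosen for this example.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def falsifying_assignments(form, n_vars):
    """List every assignment of truth values under which `form` is false."""
    return [
        values
        for values in product([False, True], repeat=n_vars)
        if not form(*values)
    ]


def affirming_the_consequent(p, q):
    # [(P => Q) and Q] => P
    return implies(implies(p, q) and q, p)


def denying_the_antecedent(p, q):
    # [(P => Q) and (not P)] => (not Q)
    return implies(implies(p, q) and (not p), not q)


print(falsifying_assignments(affirming_the_consequent, 2))  # [(False, True)]
print(falsifying_assignments(denying_the_antecedent, 2))    # [(False, True)]
```

In both cases the only falsifying assignment is P false and Q true: the butler is nervous (or did it) although the antecedent fails, which is exactly the situation each fallacious argument overlooks.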
The following example illustrates the correct application of some of the rules of inference given in Table 1.4.1.

Example Consider the following argument: If horses fly or cows eat artichokes, then the mosquito is the national bird. If the mosquito is the national bird, then peanut butter tastes good on hot dogs. But peanut butter tastes terrible on hot dogs. Therefore, cows don’t eat artichokes.

The first three assertions are the hypotheses of the argument; the last assertion is the conclusion. We are asked to determine whether the truth of the conclusion is implied by the truth of the hypotheses. We begin by representing the component propositions as follows:

    F denotes the proposition “horses fly”;
    A denotes the proposition “cows eat artichokes”;
    M denotes the proposition “the mosquito is the national bird”;
    P denotes the proposition “peanut butter tastes good on hot dogs.”

The argument can be represented as follows:

    1. (F ∨ A) ⇒ M
    2. M ⇒ P
    3. ¬P
    ------
    ∴ 4. ¬A
Assertions 1, 2, and 3 are the hypotheses, and 4 is the conclusion. One way to test whether the conclusion is implied by the hypotheses is to construct a truth table for
the implication which has the conjunction of the hypotheses as its antecedent and the conclusion as its consequent; in the present case this is the implication
    {[(F ∨ A) ⇒ M] ∧ (M ⇒ P) ∧ ¬P} ⇒ (¬A).

If the implication is a tautology, then we say the conclusion follows logically from the hypotheses and hence the argument is valid. If the implication is a contingency or contradiction, then the conclusion ¬A may be false even though all the hypotheses are true. (An assignment of truth values for the propositional variables which makes the implication false is called a “disproof by counterexample”; we will discuss this in more detail in the following section.) When several hypotheses and propositional variables are involved, construction of a truth table can become unwieldy. An alternative way to show an argument is valid is to construct a proof using the hypotheses, logical identities, and rules of inference. A proof is an expansion of the argument in which the hypotheses are augmented by additional assertions such as axioms, previously proved theorems, or assertions obtained by applying rules of inference. The conclusion of the argument must be shown to follow from these assertions by a rule of inference. The following is a proof of the argument given above.
Proof:

    Assertion              Reasons
    1. (F ∨ A) ⇒ M         Hypothesis 1
    2. M ⇒ P               Hypothesis 2
    3. (F ∨ A) ⇒ P         Steps 1 and 2 and hypothetical syllogism
    4. ¬P                  Hypothesis 3
    5. ¬(F ∨ A)            Steps 3 and 4 and modus tollens
    6. ¬F ∧ ¬A             Step 5 and DeMorgan’s law (identity 7, Table 1.1.1)
    7. ¬A ∧ ¬F             Step 6 and commutativity of ∧ (identity 4, Table 1.1.1)
    8. ¬A                  Step 7 and simplification
    ∎
Each assertion of the proof is considered true, either because it is a hypothesis, or because it is known to be logically equivalent to a preceding assertion of the proof, or because it is obtained by applying a rule of inference to preceding assertions of the proof. Since the last assertion of the proof is the conclusion, it follows that if the
hypotheses are true, then the conclusion is true.
#
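The truth-table test described above can also be carried out by enumeration: an argument is valid exactly when no assignment makes every hypothesis true and the conclusion false. The following sketch applies this test to the argument about horses and artichokes; `counterexamples` is a hypothetical helper written for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def counterexamples(hypotheses, conclusion, n_vars):
    """Assignments that make every hypothesis true and the conclusion false."""
    return [
        values
        for values in product([False, True], repeat=n_vars)
        if all(h(*values) for h in hypotheses) and not conclusion(*values)
    ]


# Propositional variables, in order: F, A, M, P (as defined in the example).
hypotheses = [
    lambda f, a, m, p: implies(f or a, m),  # 1. (F or A) => M
    lambda f, a, m, p: implies(m, p),       # 2. M => P
    lambda f, a, m, p: not p,               # 3. not P
]


def conclusion(f, a, m, p):
    return not a                            # 4. not A


print(counterexamples(hypotheses, conclusion, 4))  # [] : the argument is valid
```

The empty list agrees with the eight-step proof. Replacing the conclusion by A instead of ¬A would produce a nonempty list, each entry of which is a disproof by counterexample.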
Additional rules of inference are necessary to prove assertions involving predicates and quantifiers. A careful treatment of these rules is beyond our scope, but we will illustrate some of the techniques. The following four rules describe when the universal and existential quantifiers can be added to or deleted from an assertion.
The first rule is known as universal instantiation; it may be represented as follows:

    ∀xP(x)
    ------
    ∴ P(c)

where P is a predicate and c is some arbitrary element of the universe of discourse. As an example of the use of this rule, let the universe be the set of all humans, and let P(x) denote “x is mortal.” If we can establish ∀xP(x), that is, “all men are mortal,” then the rule of universal instantiation permits us to conclude, “Socrates is mortal.”

A second rule of inference, known as universal generalization, permits the universal quantification of assertions. If we can show that the assertion P(c) holds for every element c of the universe of discourse, then universal generalization allows us to conclude that the universally quantified assertion ∀xP(x) holds. Thus, the rule of inference

    P(x)
    ------
    ∴ ∀xP(x)

can be applied if we can show that the hypothesis P(x) is true for every possible value of x.

The third rule of inference is known as existential instantiation. It takes the form

    ∃xP(x)
    ------
    ∴ P(c)

where c is some element of the universe of discourse. However, the element c is not arbitrary (as it was in the case of universal instantiation), but must be one for which P(c) is true. It follows from the truth of ∃xP(x) that at least one such element must exist, but nothing more is guaranteed. This places constraints on the proper use of this rule of inference. For example, if we know that ∃xP(x) and ∃xQ(x) are true, we can conclude that the statement P(c) ∧ Q(d) is true for some choice of c and d, but we cannot conclude that P(c) ∧ Q(c) is true. For suppose P(x) represents “x is even” and Q(x) represents “x is odd” in the universe of integers. Then ∃xP(x) and ∃xQ(x) are true, but P(c) ∧ Q(c) is false for every c.

The last rule of inference we will describe is known as existential generalization. It is represented as

    P(c)
    ------
    ∴ ∃xP(x)

where c is an element of the universe. This rule asserts that if P(c) is true for some element c, then the assertion ∃xP(x) is true.

When quantifiers are involved, construction of proofs is more involved because of the care required in the application of the rules of inference. An exploration of the subtleties of proofs involving quantifiers is beyond the scope of this chapter, but the following simple example will illustrate the application of some of the rules of inference.
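The restriction on existential instantiation can be made concrete on a small universe. In the sketch below a finite range of integers stands in for the universe of integers (an assumption made only so that the quantifiers can be evaluated by enumeration); `P` and `Q` are the predicates “x is even” and “x is odd” from the discussion above.

```python
# A finite stand-in for the universe of integers (an assumption of this sketch).
universe = range(-5, 6)


def P(x):
    """x is even."""
    return x % 2 == 0


def Q(x):
    """x is odd."""
    return x % 2 != 0


# Over a finite universe, "there exists" is any() and "for all" is all().
exists_P = any(P(x) for x in universe)
exists_Q = any(Q(x) for x in universe)

# Existential instantiation: choose a witness for each assertion separately.
c = next(x for x in universe if P(x))
d = next(x for x in universe if Q(x))

print(exists_P and exists_Q)   # both existential assertions hold
print(P(c) and Q(d))           # true for separately chosen witnesses c and d
print(any(P(x) and Q(x) for x in universe))  # no single element serves for both

# Existential generalization: from P(4) we may conclude that some x satisfies P.
print(P(4) and any(P(x) for x in universe))
```

The third line printed is False: the witnesses supplied by two existential assertions need not coincide, which is why the same name c may not be reused when instantiating ∃xQ(x).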
Example Consider the following argument: Every man has two legs. John Smith is a man. Hence, John Smith has two legs. Let M(x) denote the assertion “x is a man,” L(x) denote the assertion “x has two legs,” and J denote John Smith. Expressed in logical notation, the argument is

    1. ∀x[M(x) ⇒ L(x)]
    2. M(J)
    ------
    ∴ 3. L(J)

A formal proof is as follows:

    Assertion                 Reasons
    1. ∀x[M(x) ⇒ L(x)]        Hypothesis 1
    2. M(J) ⇒ L(J)            Step 1 and universal instantiation
    3. M(J)                   Hypothesis 2
    4. L(J)                   Steps 2 and 3 and modus ponens
    ∎
#
In this section we have dealt with the problem of logical inference, i.e., inferring the truth of one statement from the known or assumed truth of others. A rule of inference is an explicit statement of when such an inference can be made. We commonly apply rules of inference in mathematical arguments without explicit reference to them; this is one reason why mathematical arguments are sometimes difficult to follow. By treating these rules explicitly, we aim to provide a basis for the understanding, construction, and description of mathematical arguments.

Problems: Section 1.4

1. For each of the following sets of premises, list the relevant conclusions which can be drawn and the rules of inference used in each case.
(a) I’m either fat or thin. I’m certainly not thin.
(b) If I run I get out of breath. I’m not out of breath.
(c) If the butler did it, then his hands are dirty. The butler’s hands are dirty.
(d) Blue skies make me happy and gray skies make me sad. The sky is either blue or gray.
(e) If my program runs, then I am happy. If I am happy, the sun shines. It’s 11:00 p.m. and very dark.
(f) All trigonometric functions are periodic functions and all periodic functions are continuous functions.
(g) All cows are mammals. Some mammals chew their cud.
(h) All even integers are divisible by 2. The integer 4 is even but 3 is not.
(i) What’s good for the auto industry is good for the country. What’s good for the country is good for you. What’s good for the auto industry is for you to buy an expensive car.
2. Show that the tautological forms of the following rules of inference given in Table 1.4.1 are tautologies:
(a) modus tollens
(b) disjunctive syllogism
(c) constructive dilemma
(d) destructive dilemma

3. Construct a proof for each of the following arguments, giving all necessary additional assertions. Specify the rules of inference used at each step. (The word “or” denotes the “logical or” rather than the “exclusive or.”)
(a) It is not the case that IBM or Xerox will take over the copier market. If RCA returns to the computer market, then IBM will take over the copier market. Hence, RCA will not return to the computer market.
(b) (My program runs successfully) or (the system bombs and I blow my stack). Furthermore, (the system does not bomb) or (I don’t blow my stack and my program runs successfully). Therefore, my program runs successfully.

4. Supply the missing assertions to prove the following argument. Justify the inclusion of each assertion in the proof.
(P \ Q)=(RAS) (T>Q) A (S>U)
(W=> P) A(T > VU) —R
Wo 5.
mT
Determine which of the following arguments are valid. Construct proofs for the valid arguments. For those which are not valid, show why the conclusion does not follow from the hypotheses.

(a) A ∧ B
    A ⇒ C
    ------
    ∴ C ∧ B

(b) A ∨ B
    A ⇒ C
    ------
    ∴ C ∨ B

(c) A ⇒ B
    A ⇒ C
    ------
    ∴ C ⇒ B

(d)
A>(BV
C)
D>—7C
B=
A
7A
Pp
“BA
7B
6. Determine which of the following are valid arguments. Construct proofs for those that are valid and describe the fallacies of those that are not.
(a) If today is Tuesday, then I have a test in Computer Science or a test in Econ. If my Econ professor is sick, then I will not have a test in Econ. Today is Tuesday and my Econ professor is sick. Therefore, I have a test in Computer Science.
(b) I am happy if my program runs. My happiness is a necessary condition for me to enjoy life. Hence, if my program runs, then, if I enjoy life, then I am happy.
(c) It is not the case that some trigonometric functions are not periodic. Some periodic functions are continuous. Therefore, it is not true that all trigonometric functions are not continuous.
(d) Some trigonometric functions are periodic. Some periodic functions are continuous. Therefore, some trigonometric functions are continuous.
7. Consider the implication

    ∀x[P(x) ∨ Q(x)] ⇒ [∀xP(x) ∨ ∀xQ(x)].

(a) Show that this implication is not valid.
(b) The following is an argument which purports to prove the above implication. Find and explain the flaw.

    ∀x[P(x) ∨ Q(x)] ⇔ ¬∃x¬[P(x) ∨ Q(x)]
                    ⇔ ¬∃x[¬P(x) ∧ ¬Q(x)]
                    ⇒ ¬[∃x¬P(x) ∧ ∃x¬Q(x)]
                    ⇔ [¬∃x¬P(x) ∨ ¬∃x¬Q(x)]
                    ⇔ ∀xP(x) ∨ ∀xQ(x)

8. One must exercise care in the application of rules of inference to avoid fallacious conclusions. In the following argument, locate and explain all misapplications of rules of inference. Let the universe of discourse be the set of integers I. The assertion that there is no smallest integer can be put into logical notation as follows:

    ∀x∃y[x > y].

It follows by universal instantiation that for arbitrary d,

    ∃y[d > y].

Now applying existential instantiation we conclude that for some element c

    d > c.

Since d was arbitrary it follows by universal generalization that

    ∀x[x > c].

By universal instantiation, we can conclude

    c > c,

and by universal generalization,

    ∀x[x > x].

1.5 METHODS OF PROOF
In the preceding section, we described the use of rules of inference to infer the truth of one assertion from others. Rules of inference are characterizations of the syntactic constraints which a proof must obey; in a formal mathematical system, where the structure of proofs is precisely specified, the rules of inference enable us to determine if an argument is a proof. In this section, we are concerned with the structure of proofs as well as strategies for their construction. Although it is not possible to consider all proof techniques, we will describe some of the most common ones, give examples of their use, and relate them to the rules of inference described in the previous section.

The most elementary form of theorem is the tautology. A tautology is a theorem because of its sentential structure rather than its content; its truth is
actually independent of the interpretation or meaning of any of the propositions involved. For this reason, tautologies are easily proved: one need only construct a truth table.

Example Consider the universe of integers. Denote by E(x) the assertion “x is even” and by O(x) the assertion “x is not even”; i.e., O(x) ⇔ ¬E(x). If we read O(x) as “x is odd,” then we can prove the theorem

    The integer 3 is either even or odd.

The theorem is stated as

    E(3) ∨ O(3),

or alternatively

    E(3) ∨ ¬E(3),

which, if we use the letter P to denote E(3), can be written

    P ∨ ¬P.

From the truth table of the proposition P ∨ ¬P, we know it is a tautology, and the theorem is established. #
A theorem is often expressed as a propositional form which is not a tautology. The truth of such an assertion is dependent on both the logical structure of the assertion and the meaning of the component propositions. Because the component propositions cannot assume all possible truth values, certain lines of the truth table cannot occur; the theorem is proved by showing that all the lines which can occur result in a value of true. We will treat such theorems by considering the most important of the logical operators.

Let T be an assertion of the form ¬P, where P is a proposition. In order to prove T, we must establish that P is false. Similarly, if T is of the form P ∧ Q, then we must show that both P and Q are true. An assertion of the form P ∨ Q is often established by proving the logically equivalent proposition ¬P ⇒ Q (or, by symmetry, ¬Q ⇒ P). A truth table can be used to show the logical equivalence of P ∨ Q and ¬P ⇒ Q.

A variety of proof techniques are used for proving implications, and because these techniques are so common, they are frequently referred to by name. Recall that the truth table for P ⇒ Q has the following form (F denotes false and T denotes true):

    P    Q    P ⇒ Q
    F    F      T
    F    T      T
    T    F      F
    T    T      T
The four most common techniques for proving implications are the following:

1. Vacuous Proof of P ⇒ Q. The truth value of P ⇒ Q is true if that of P is false. Consequently, if we can establish that P is false, only the first two lines of the above truth
table can possibly apply, and it follows that the assertion P ⇒ Q is true. A vacuous proof of P ⇒ Q is constructed by establishing that the truth value of P is false. While vacuous proofs appear to be of little value, they are often important in establishing limiting or special cases. We will point out many examples of vacuous proofs in the next chapter.

2. Trivial Proof of P ⇒ Q. If it is possible to establish that Q is true, only the second and fourth lines of the truth table for implication can apply, and it follows that the theorem P ⇒ Q is true. Construction of a trivial proof of P ⇒ Q requires showing that the truth value of Q is true. Like the vacuous proof, the trivial proof has limited applicability and yet is extremely important. It is frequently used to establish special cases of assertions.

3. Direct Proof of P ⇒ Q. A direct proof of P ⇒ Q shows that the truth of Q follows logically from the truth of P, i.e., the third line of the truth table for implication cannot hold. Such a proof begins by assuming P is true. Then, using whatever information is available, such as previously proved theorems, it is shown that Q must be true. Since all the lines of the truth table except the third have the value true assigned to P ⇒ Q, the assertion is established.
The following examples illustrate the use of direct proofs.

Examples
(a) Theorem: If 6x + 9y = 101, then either x or y is not an integer.

Proof: Assume 6x + 9y = 101. This can be rewritten as 3(2x + 3y) = 101. But 101/3 is not an integer; therefore, 2x + 3y is not an integer and hence either x or y is not an integer. ∎

(b) Theorem: Let S be a set of one- and two-digit integers such that each of the digits 0 through 9 occurs exactly once in the set S. Then the sum of the elements of S is divisible by 9.

Proof: Assume that the hypothesis of the theorem is true. The digits 0 through 9 sum to 45. In any set S, some of the digits will occur in the 10’s position and the remainder will occur in the 1’s position. Let T denote the sum of digits which occur in the 10’s position. Then the sum of the elements of S can be expressed as 10T + 45 - T, which can be put in the form 9T + 45. Since both terms of this sum are divisible by 9, the sum is also divisible by 9, regardless of the value of T. ∎ #
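Neither theorem needs a computer, but a numerical spot check is a useful habit before attempting a proof. The sketch below searches a bounded range for integer solutions of 6x + 9y = 101 and tests theorem (b) on randomly generated sets S. The helper `random_digit_set`, the search bounds, and the random seed are assumptions of this illustration; a passing check is evidence, not a proof.

```python
import random


def random_digit_set(rng):
    """Split the digits 0-9 at random into one- and two-digit integers,
    using each digit exactly once."""
    digits = list(range(10))
    rng.shuffle(digits)
    numbers = []
    while digits:
        # Form a two-digit integer only when the tens digit is nonzero.
        if len(digits) >= 2 and digits[-1] != 0 and rng.random() < 0.5:
            tens = digits.pop()
            ones = digits.pop()
            numbers.append(10 * tens + ones)
        else:
            numbers.append(digits.pop())
    return numbers


# Theorem (a): 6x + 9y = 3(2x + 3y) is a multiple of 3, but 101 is not.
search = range(-200, 201)
has_solution = any(6 * x + 9 * y == 101 for x in search for y in search)
print(has_solution)  # False

# Theorem (b): the sum of the elements of S equals 9T + 45.
rng = random.Random(1977)
trials = [random_digit_set(rng) for _ in range(1000)]
print(all(sum(s) % 9 == 0 for s in trials))  # True
```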
4. Indirect Proof of P ⇒ Q (Proof of the contrapositive). The implication P ⇒ Q is logically equivalent to the implication ¬Q ⇒ ¬P. Consequently, we can establish the truth of P ⇒ Q by establishing that ¬Q ⇒ ¬P. The latter implication is usually shown by means of a direct proof, i.e., by showing that if Q is false, then P is necessarily false. Hence, the third line of the truth table for P ⇒ Q cannot occur.
Example A perfect number is an integer which is equal to the sum of all its divisors except the number itself. Thus, 6 is a perfect number, since 6 = 1 + 2 + 3, and so is 28. We will prove the following theorem by establishing the contrapositive.

Theorem: A perfect number is not a prime.

Proof: The contrapositive is the following: A prime number is not a perfect number. Suppose p is a prime number. Then p ≥ 2 and p has exactly two divisors: 1 and p. The sum of all its divisors less than p is therefore 1, and it follows that p is not perfect. ∎
#
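The key step of the proof, that the divisors of a prime p other than p itself sum to 1, can be observed directly for small numbers. The sketch below is an illustration with hypothetical helper names and an arbitrary bound of 1000; it does not replace the proof.

```python
def proper_divisor_sum(n):
    """Sum of all positive divisors of n except n itself."""
    return sum(d for d in range(1, n) if n % d == 0)


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


numbers = range(2, 1000)
perfect = [n for n in numbers if proper_divisor_sum(n) == n]
primes = [n for n in numbers if is_prime(n)]

print(perfect)                                          # [6, 28, 496]
print(all(proper_divisor_sum(p) == 1 for p in primes))  # True
print(any(is_prime(n) for n in perfect))                # False
```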
In summary, to establish P ⇒ Q by a proof of the contrapositive,

1. Assume that Q is false;
2. Show on the basis of that assumption and other available information that P is false.

If the premise is a conjunction and we wish to show

    (P₁ ∧ P₂ ∧ ··· ∧ Pₙ) ⇒ Q,

the contrapositive of the assertion is

    ¬Q ⇒ (¬P₁ ∨ ¬P₂ ∨ ··· ∨ ¬Pₙ).

To establish this assertion, it suffices to show that ¬Q implies ¬Pᵢ for at least one value of i.

We frequently wish to establish implications of the form

    (P₁ ∨ P₂ ∨ ··· ∨ Pₙ) ⇒ Q.

These implications are usually handled using a technique called proof by cases, a method justified by the following tautology:

    [(P₁ ∨ P₂ ∨ ··· ∨ Pₙ) ⇒ Q] ⇔ [(P₁ ⇒ Q) ∧ (P₂ ⇒ Q) ∧ ··· ∧ (Pₙ ⇒ Q)].
A proof by cases requires proving each “case,” Pᵢ ⇒ Q, for each i from 1 to n. Often proofs by cases are not presented in full; if several of the implications, Pᵢ ⇒ Q, have similar proofs, then usually only one case is treated explicitly.

Example Let “⊔” denote the operation “max” on the set of integers I; if a ≥ b then a ⊔ b = b ⊔ a = a. For example, 4 ⊔ 2 = 4 and 1 ⊔ 3 = 3.

Theorem: The binary operation “max” is associative; that is, for any integers a, b, and c, (a ⊔ b) ⊔ c = a ⊔ (b ⊔ c).

Proof: For any three integers a, b and c, one of the following six cases must hold: a ≥ b ≥ c, a ≥ c ≥ b, b ≥ a ≥ c, b ≥ c ≥ a, c ≥ a ≥ b, c ≥ b ≥ a.

Case 1: Assume a ≥ b ≥ c. Then (a ⊔ b) ⊔ c = a ⊔ c = a and a ⊔ (b ⊔ c) = a ⊔ b = a.

Case 2: Assume a ≥ c ≥ b. Then (a ⊔ b) ⊔ c = a ⊔ c = a and a ⊔ (b ⊔ c) = a ⊔ c = a.

There are four other cases; the proofs are all similar. ∎
#
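Because the six cases depend only on the relative order of a, b, and c, an exhaustive check over a small range of integers exercises every case. The sketch below uses Python's built-in `max` for the operation; the helper `is_associative_on` and the range are choices made for this illustration.

```python
from itertools import product


def is_associative_on(op, values):
    """Check (a op b) op c == a op (b op c) for every triple drawn from values."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(values, repeat=3)
    )


values = range(-10, 11)

print(is_associative_on(max, values))                  # True
print(is_associative_on(lambda a, b: a - b, values))   # False: subtraction fails
```

The second call shows that the check is not vacuous: an operation which is not associative is rejected, for instance (1 - 1) - 1 differs from 1 - (1 - 1).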
The last logical operator we will consider is logical equivalence. Theorems of the form P ⇔ Q are usually handled in one of two ways. Most commonly, the separate implications P ⇒ Q and Q ⇒ P are proven and the assertion P ⇔ Q is inferred. Sometimes a more economical proof is possible, beginning with a true assertion of the form R ⇔ S and proceeding through a sequence of “if and only if” statements such that each statement is logically equivalent to the one preceding. If the last statement in the sequence is P ⇔ Q, the theorem is established. This technique will be used frequently in the next chapter.

Other proof techniques can be used to establish the truth of a proposition P. A proof by contradiction, or reductio ad absurdum, assumes that P is false and derives a contradiction, such as the proposition Q ∧ ¬Q; this establishes ¬P ⇒ (Q ∧ ¬Q). Taking the contrapositive of this implication and applying one of DeMorgan’s laws, we obtain (¬Q ∨ Q) ⇒ P. Since the premise of this implication is true and we have shown the implication to be true, we conclude that P is true.

Examples
(a) Theorem: There is no largest prime number.

The proof is by contradiction; we begin by assuming that a largest prime number exists, and then show how to construct another which is larger.

Proof: Assume a largest prime exists; call it p. Because all primes are greater than 1 and none are greater than p, there must be a finite number of them. Form the product of all these primes and call it r; r = 2 · 3 · 5 · 7 ··· p. We now assert that r + 1 is a prime. For if we divide r + 1 by any prime between 2 and p, the remainder is 1, which means that r + 1 cannot be expressed as a product of any two integers other than r + 1 and 1. Since r > p, r + 1 is a prime number greater than p. This contradicts the assumption that p is the largest prime number, and the theorem is proved. ∎
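The arithmetic at the heart of the proof can be examined for small values of p. The sketch below forms r as the product of all primes up to p and confirms that r + 1 leaves remainder 1 on division by each of them. Note that the step “r + 1 is a prime” depends on the assumption that no primes exceed p; for an actual prime p the number r + 1 need not be prime, but every prime factor of r + 1 is larger than p, which is all the contradiction requires. The helper names are our own.

```python
from math import prod


def primes_up_to(n):
    """All primes k with 2 <= k <= n, found by trial division."""
    return [
        k
        for k in range(2, n + 1)
        if all(k % d != 0 for d in range(2, int(k ** 0.5) + 1))
    ]


def smallest_prime_factor(n):
    """Smallest prime dividing n (n itself when n is prime), for n >= 2."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


for p in (2, 3, 5, 7, 11, 13):
    primes = primes_up_to(p)
    r = prod(primes)
    remainders = [(r + 1) % q for q in primes]
    print(p, r + 1, remainders, smallest_prime_factor(r + 1))
```

For p = 13 the program reports r + 1 = 30031 with smallest prime factor 59, so 30031 is composite; its prime factors nevertheless exceed 13, in agreement with the remark above.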
The logical structure of the preceding proof can be described as follows. Let P denote “there is no largest prime number,” and Q denote “p is the largest prime number.” The proof proceeds by assuming the theorem is false:

    (i) ¬P

It follows that (for some particular integer p),

    (ii) ¬P ⇒ Q

We then show how to construct a prime greater than p, i.e., we show

    (iii) Q ⇒ ¬Q

From (ii) and (iii), applying the rule of hypothetical syllogism, we conclude

    (iv) ¬P ⇒ ¬Q

From (i) and (ii) and modus ponens, it follows that

    (v) Q

and from (i) and (iv) and modus ponens,

    (vi) ¬Q

Then from the rule of conjunction applied to (v) and (vi), we conclude

    (vii) Q ∧ ¬Q

This is a contradiction. We conclude that the hypothesis (i) is false and the theorem is proved.

(b) Consider the problem of determining whether a program P will terminate normally, i.e., not as the result of such things as exceeding its allotted execution time or register overflow. It is conceivable that a computer program could be written which would decide, for any program P, whether P will halt; such a program would be a “decision procedure” to solve what is known as the halting problem. We can show by means of a proof by contradiction that no procedure exists which will solve the halting problem.

For ease of exposition, we restrict our discussion to procedures which do not read any input, although they may call other procedures. This corresponds to a subproblem of the original problem; if we cannot devise a decision procedure for the input-free procedures, then we clearly cannot devise one for arbitrary procedures. Let P be an input-free procedure. We assume (as a hypothesis to be proved false) that there exists a decision procedure HALT such that the value of HALT(P) is “yes” if the procedure P halts and otherwise the value of HALT(P) is “no.” Then the following procedure could be executed:

    procedure ABSURD:
        if HALT(ABSURD) = “yes” then
            while true do print “ha”
Now consider the behavior of the procedure ABSURD. Suppose ABSURD halts. Then HALT(ABSURD) will return “yes,” causing execution of the while loop. The while loop prints “ha” as long as true has the truth value true; thus, execution of the while loop results in (unending) gales of laughter. We conclude that if ABSURD halts, then ABSURD does not halt. Now suppose ABSURD does not halt. Then HALT will return “no,” causing the test of the if-then statement to fail, and ABSURD will halt. Thus, if ABSURD does not halt, then ABSURD will halt. The assumption that HALT can decide whether an arbitrary program P terminates has led to an absurdity, and we conclude that no procedure has the behavior assumed for HALT. Note that we do not infer that it would be very difficult to write HALT, or that we don’t know how to write it; we conclude the much stronger statement that no procedure exists which has the behavior ascribed to HALT. #
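The construction of ABSURD can be imitated in Python for any particular candidate decision procedure. In the sketch below, `make_absurd` builds ABSURD from a candidate `halt`, and each “ha” is modelled as one step of a generator so that a run can be cut off after a bounded number of steps. All names here are hypothetical, and the bounded run is only a stand-in for observing nontermination: the sketch shows how two specific candidates fail, whereas the argument above shows that every candidate must fail.

```python
def make_absurd(halt):
    """Build the procedure ABSURD from a candidate decision procedure `halt`.

    The result is a generator function; every value it yields is one "ha".
    """
    def absurd():
        if halt(absurd) == "yes":
            while True:
                yield "ha"

    return absurd


def halts_within(procedure, max_steps):
    """Run `procedure` for at most max_steps steps; True if it finished."""
    run = procedure()
    for _ in range(max_steps):
        try:
            next(run)
        except StopIteration:
            return True
    return False


def always_yes(procedure):
    return "yes"


def always_no(procedure):
    return "no"


for candidate in (always_yes, always_no):
    absurd = make_absurd(candidate)
    verdict = candidate(absurd)
    observed = halts_within(absurd, 1000)
    print(candidate.__name__, "says", verdict, "| halted within bound:", observed)
```

The candidate that answers “yes” produces an ABSURD that is still laughing after 1000 steps, and the candidate that answers “no” produces an ABSURD that stops at once; in each case the verdict is wrong about the very procedure built from it.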
The proof methods described so far are often inadequate for proving quantified assertions. We now describe some additional proof techniques based on the rules of inference for quantified statements. We will discuss techniques for proving assertions in each of the following forms:

    ¬∃xP(x), ∃xP(x), ¬∀xP(x), and ∀xP(x).

An assertion of the form ¬∃xP(x) is most often proved by contradiction: to show something does not exist, we assume it does and arrive at a contradiction.

†This program and those in the remainder of the book will be written in an informal ALGOL-like language described in the Appendix.
This technique was used in our earlier proof that there is no largest prime number; we assumed there was a largest prime and derived a contradiction of the form Q ∧ ¬Q. We also note that ¬∃xP(x) is equivalent to ∀x¬P(x). Hence, our later remarks on proving universally quantified statements will sometimes apply.

Proofs of assertions of the form ∃xP(x) are referred to as existence proofs. Existence proofs are classified as either constructive or nonconstructive. A constructive existence proof establishes the assertion by exhibiting a value c such that P(c) is true. By applying the rule of existential generalization, we conclude that ∃xP(x) is true. Sometimes, rather than exhibiting a specific value of c, a constructive existence proof specifies an algorithm for obtaining such a value. A nonconstructive existence proof establishes the assertion ∃xP(x) without indicating how to find a value c such that P(c) is true. Such a proof most commonly involves a proof by contradiction; it shows that ¬∃xP(x) implies an absurdity or the negation of some previous result. A constructive existence proof specifies an element precisely, while a nonconstructive proof may not provide any information other than an assertion of existence. Some results in mathematics fall between these two extremes. For example, the mean value theorem of differential calculus asserts the existence of a parameter value with a special property. Although the proof places bounds on the parameter value (and thus provides useful information), the exact value of the parameter is not specified. Theorems of this character are common in numerical analysis.

Assertions of the form ¬∀xP(x) are often most naturally proved by proving the equivalent assertion ∃x¬P(x). Both constructive and nonconstructive existence proofs can then be used. A constructive existence proof involves finding an element c of the universe of discourse such that P(c) is false; such an element is called a counterexample to the assertion ∀xP(x). The element c forms the basis of a proof by counterexample of the assertion ¬∀xP(x). Counterexamples can also be used to show that assertions involving predicate variables are not valid.
Construction of such a counterexample requires that we exhibit a universe of discourse and an interpretation of the predicate variables which makes the assertion false.

Example Construct a counterexample to show the following assertion is not valid:

    ∃x[P(x) ⇒ Q(x)] ⇒ [∃xP(x) ⇒ ∃xQ(x)].

A disproof requires that we exhibit a universe and predicates P and Q such that the assertion is false; to disprove the above assertion we must find a universe and interpretations for predicates P and Q such that

    (a) ∃x[P(x) ⇒ Q(x)] is true and
    (b) ∃xP(x) ⇒ ∃xQ(x) is false.

From (b) it must happen that

    (c) ∃xP(x) is true and
    (d) ∃xQ(x) is false.
Let the universe consist of the integers 1 and 2, and let P(x) denote “x = 1” and Q(x) denote “x ≠ 1 ∧ x ≠ 2.” With these predicates, the conditions of (a) through (d) are satisfied; consequently, these choices of universe and interpretations for P and Q constitute a counterexample to the assertion. #
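Since the universe of the counterexample is finite, conditions (a) through (d) can be verified by direct evaluation. The following sketch does so; the variable names are ours and simply mirror the labels used in the example.

```python
def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


universe = [1, 2]


def P(x):
    return x == 1


def Q(x):
    return x != 1 and x != 2


cond_a = any(implies(P(x), Q(x)) for x in universe)   # (a): true, witnessed by x = 2
cond_c = any(P(x) for x in universe)                  # (c): true, witnessed by x = 1
cond_d = any(Q(x) for x in universe)                  # (d): false, Q holds nowhere
cond_b = implies(cond_c, cond_d)                      # (b): false

assertion = implies(cond_a, cond_b)
print(cond_a, cond_b, cond_c, cond_d)  # True False True False
print(assertion)                       # False: the assertion fails here
```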
A universally quantified assertion, ∀xP(x), is generally proved by applying the rule of universal generalization described in the previous section. We first show that P(x) is true for an arbitrary element x of the universe. Once this has been established, the rule of universal generalization can be applied to conclude ∀xP(x).
Example Theorem: For all integers x, x is even if and only if x² is even.
Proof: Using logical notation, the theorem can be expressed as ∀x[x is even ⇔ x² is even].
We prove the theorem by first establishing x is even ⇔ x² is even for an arbitrary element x of the universe of discourse.
(a) First, we show the implication from left to right (the "only if" part, or "necessity") by a direct proof. If x is even, then x = 2k for some integer k. Then x² = (2k)² = 4k² = 2(2k²), which is an even number.
(b) We next show the implication from right to left (the "if" part, or "sufficiency") by showing the contrapositive: x is not even ⇒ x² is not even. If x is not even, then x = 2k + 1 for some integer k, and x² = (2k + 1)² = 4k² + 4k + 1. Since the first two summands are even, the sum is odd and the contrapositive is established.
This completes the proof of x is even ⇔ x² is even. Since the proof was for arbitrary x, we can apply universal generalization to conclude that
∀x(x is even ⇔ x² is even).
#
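The algebra in parts (a) and (b) can be spot-checked numerically. The sketch below is only a finite test over an arbitrarily chosen range, not a substitute for the proof; the function name `is_even` and the range bounds are assumptions made for illustration.

```python
def is_even(n: int) -> bool:
    """An integer n is even iff n = 2k for some integer k."""
    return n % 2 == 0


counterexamples = []
for x in range(-1000, 1001):
    if is_even(x):
        k = x // 2                               # witness with x = 2k
        assert x * x == 2 * (2 * k * k)          # part (a): x^2 = 2(2k^2)
    else:
        k = (x - 1) // 2                         # witness with x = 2k + 1
        assert x * x == 4 * k * k + 4 * k + 1    # part (b): x^2 = 4k^2 + 4k + 1
    # Record any x for which "x is even" and "x^2 is even" disagree.
    if is_even(x) != is_even(x * x):
        counterexamples.append(x)

print(counterexamples)
```

No counterexample appears in the tested range, which is consistent with the theorem but establishes it only for the values tried.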
The forms of mathematical argument we have considered are common and are widely accepted, but by no means exhaustive; indeed new proof techniques are still being devised. In future chapters we will develop additional proof techniques and apply them.

Our discussion of proof techniques has been "informal" in the sense that we have not worked within a formal system in which all axioms and rules of inference have been explicitly stated. The advantage of a formal system is that a characterization of the axioms and rules of inference implicitly defines the set of theorems: it is the set of all statements which can be obtained from the axioms by applying the rules of inference in all possible ways. In such a system it becomes possible to distinguish between assertions which are true and those which are provable. An
assertion is provable if it is a theorem, i.e., if a proof of the assertion exists. (Note that the definition does not require that we be able to construct the proof.) The truth of an assertion may depend on the choice of universe of discourse and the interpretation of the predicates; we have seen examples of assertions which are true in some universes and not in others. Thus we can ask two things of a formal system:
(a) That it be powerful enough to prove all valid assertions, that is, all those assertions which are true regardless of the universe of discourse and the interpretation of the predicate symbols.
(b) That it be powerful enough to prove all assertions which are true of some particular universe with a specified interpretation of certain predicate symbols. An example would be the universe of natural numbers with predicates corresponding to equality and identities of arithmetic.
Without going into detail, we can say that mathematics has been rather successful with (a), but not with (b). It has been established that, to a considerable extent, our lack of success in (b) is inherent in our mathematical methods. For example, a result due to Gödel asserts that if a formal system is powerful enough to express assertions about integer arithmetic but permits only true assertions about arithmetic to be proved, then there are other assertions which are true of arithmetic but cannot be proved in the system. The development of an understanding of the distinction between assertions which are true and those which are formally provable was a magnificent accomplishment of mathematics; the work has profound implications for both philosophy and mathematics. To explore further in this area, the student should consult the excellent book of DeLong [1970].

When an argument is presented within a formal system, whether it is a proof can be decided algorithmically, but formal systems do not encompass all of mathematics. When an argument is presented outside a formal system, as most proofs are, its validity must be determined by mathematicians; they must decide whether the argument is convincing. Thus, the question is usually decided by consensus; an argument is accepted as a proof if no one can perceive any flaws in its structure. Agreement in such matters is very good, but the mechanism is not foolproof. Although mathematical proofs are intended to be the quintessence of careful argument, perceiving the flaws of an alleged proof can be a profoundly difficult task. Examples exist of arguments which were widely accepted as proofs for many years but were then shown to be fallacious by someone who discovered a possibility which had been overlooked in the original argument. Sometimes such a discovery results in a new argument being devised, which is then accepted as a proof of the original assertion.
But it is not uncommon for the overlooked possibility to provide a basis for a counterexample to the original assertion, thus disproving it. In summary, while a purported proof which is generally accepted is rarely shown to be fallacious, examples of such occurrences do exist, and we must conclude that “proof” is not a label which can never be removed.
Problems: Section 1.5

1. Prove or disprove each of the following assertions. Indicate the proof technique employed. Consider the universe to be the set of integers I. Put each assertion into logical notation. You may assume the following five definitions and properties of integers.
(i) An integer n is even if and only if n = 2k for some integer k.
(ii) An integer n is odd if and only if n = 2k + 1 for some integer k.
(iii) The product of two nonzero integers is positive if and only if the integers have the same sign.
(iv) For every pair of integers x and y, exactly one of the following holds: x > y, x = y, or x < y.
…
… Q. A common proof method is to assume ¬Q as an additional hypothesis and deduce a contradiction, i.e.,
(A₁ ∧ A₂ ∧ ··· ∧ Aₙ ∧ ¬Q) ⇒ C
where C is a contradiction.
Example Theorem: Show that if P and P ⇒ Q are true, then Q is true.
Proof:
    Assertion     Reasons
1.  P             Premise 1
2.  P ⇒ Q         Premise 2
3.  ¬Q            Assumption (negation of conclusion)
4.  ¬P ∨ Q        2, implication
5.  ¬P            3, 4, disjunctive syllogism
6.  P ∧ ¬P        1, 5, conjunction
But P ∧ ¬P is a contradiction. Therefore, Q follows logically from the hypotheses.
#
(a) Justify the above technique using truth tables (assume only two hypotheses H₁ and H₂).
(b) Explain how this proof technique relates to proof by contradiction.

1.6 PROGRAM CORRECTNESS
Writing good computer programs is not a well-defined process, and criteria for the evaluation of programs are often vague and ill-formed. There are, however, three questions that are commonly used to assess the quality of a program:
(a) Is the program "well written"?
(b) Is the program efficient?
(c) Does the program do what it is supposed to do?
The first question addresses the matters of style, clarity, and ease of modification; evaluation of these properties will probably always be difficult and, to some degree, subjective. The second question concerns the cost of program execution, usually measured in terms of storage requirements and program execution time; the study of program efficiency, often called algorithm analysis, will be treated in Chapter 5. To answer the third question, we must first specify precisely what task is to be performed. Then we must prove that the program is correct in the sense that it performs the specified task. Establishing that a program is correct, also known as program verification, is generally more difficult than writing the program, but the costs which result from an incorrect program can easily exceed the cost of verification. As a consequence, techniques for establishing program correctness are of singular importance to the computer scientist.

Most program errors can be classified as either syntactic or logical. A syntactic error is one which violates the definition of a well-formed program in the given programming language. Syntactic errors are generally detected by the language translator program (i.e., the compiler or interpreter) and can usually be corrected easily. After all syntactic errors have been eliminated, a program is usually tested for errors in logic by executing the program on a selected set of input data. But correct performance of a program on test data does not guarantee that the program is correct unless the program is tested with every possible input. Because it is usually impractical to test all possible inputs, logical errors may remain even if the program produces the correct results for the test data. As a consequence, program verification usually requires the use of proof methods similar to those described earlier in this chapter.

In this section we will describe a method for program verification based on assertions about the program variables before, during, and after program execution; we will call such assertions program assertions. For simplicity we will restrict our examples to integer arithmetic, that is, the universe of discourse for numerical variables is taken to be the integers. Furthermore, as is customary in treatments of
this topic, we will ignore such potential problems as storage limitations and register overflow.

Program assertions characterize properties of program variables and relationships between them at various stages of program execution. These assertions can utilize whatever predicates are appropriate, such as "x is nonnegative" and x = y.
…
… the rules are propositions which relate two program assertions Q₁ and Q₂. These implications are proved using the techniques of the previous sections of this chapter; this is done independently of any consideration of the program.
Example If the program segment A1: true x2>
y, we can conclude that
Al: true xy
is correct.
#
We next treat the rules of inference which are concerned with some of the control statements of our programming language. The control statements include conditional branches and loops; they can cause program statements to be executed in an order different from that in which they appear in the program text. We will treat three fundamental types: "if condition then S," "if condition then S₁ else S₂," and "while condition do S." In each statement type, condition is an assertion (but not a program assertion) about the values of the program variables; whenever condition is evaluated, it is either true or false. For each statement type, the portion of the program to be executed next is determined by the truth value of condition. The precise effect of executing each statement type is characterized by a rule of inference.

When the statement "if condition then S" is executed, the program statement S is executed if and only if condition is true. (Note that S can be a single statement or a sequence of statements enclosed in a begin...end pair.) A rule of inference for this statement type must involve preceding and following program assertions which will be true whether or not the statement S is executed. The rule, called the if-then rule, is the following:

(Q₁ ∧ condition) {S} Q₂
(Q₁ ∧ ¬condition) ⇒ Q₂
∴ Q₁ {if condition then S} Q₂
The if-then Rule
Note that the implication (Q₁ ∧ ¬condition) ⇒ Q₂ is a proposition which must be proved without reference to the program. The if-then rule can be interpreted using flowcharts in the following way. (Note that when edges of a flowchart converge, the point of convergence is treated as a node and different assertions can appear on the edges which enter and leave it.)

"If we can show that
(Q₁ ∧ ¬condition) ⇒ Q₂
and
[Flowchart with entry assertion Q₁ ∧ condition]
is known to be correct, then we can infer that
[Flowchart]
is correct."

In terms of programs, the if-then rule can be stated as follows:

"If the implication
(Q₁ ∧ ¬condition) ⇒ Q₂
is true and
A1: Q₁ ∧ condition
S
A2: Q₂
is correct, then
A1: Q₁
if condition then S
A2: Q₂
is correct."
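Example To show that
A1: true
if x < 0 then y ← 0
A2: x ≥ 0 ∨ y = 0
is correct, it suffices to show that the implication
[true ∧ ¬(x < 0)] ⇒ [x ≥ 0 ∨ y = 0]
is true and that
A1′: true ∧ x < 0
y ← 0
A2: x ≥ 0 ∨ y = 0
is correct. The implication follows by simplification, the definition of ≥, and addition. To show that the segment beginning with A1′ is correct, we first observe that, since y is assigned the value 0 and the value of x is not changed, y = 0 holds after the assignment; the assertion x ≥ 0 ∨ y = 0 then follows by addition, so the segment is correct.
#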
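The two premises of the if-then rule, and its conclusion, can be tested on a finite sample of program states. The sketch below assumes the program segment "if x < 0 then y ← 0" with Q₁ = true and Q₂ = (x ≥ 0 ∨ y = 0); the function names and the range of test values are illustrative assumptions. As the section notes, testing on selected inputs does not establish correctness; it only fails to find a violation.

```python
def condition(x, y):
    return x < 0

def S(x, y):
    """The statement y <- 0; x is unchanged."""
    return x, 0

def Q1(x, y):
    return True

def Q2(x, y):
    return x >= 0 or y == 0

def if_then(x, y):
    """Execute 'if condition then S' on the state (x, y)."""
    return S(x, y) if condition(x, y) else (x, y)


states = [(x, y) for x in range(-20, 21) for y in range(-20, 21)]

# Premise 1: (Q1 and condition) {S} Q2
premise_1 = all(Q2(*S(x, y)) for x, y in states
                if Q1(x, y) and condition(x, y))
# Premise 2: (Q1 and not condition) => Q2
premise_2 = all(Q2(x, y) for x, y in states
                if Q1(x, y) and not condition(x, y))
# Conclusion: Q1 {if condition then S} Q2
conclusion = all(Q2(*if_then(x, y)) for x, y in states if Q1(x, y))

print(premise_1, premise_2, conclusion)
```

Keeping the premises separate from the conclusion mirrors the structure of the rule: each premise is checked on its own, and the conclusion is then checked against the whole statement.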
When the statement "if condition then S₁ else S₂" is executed, if condition is true, then S₁ is executed; otherwise S₂ is executed. The if-then-else rule of inference is the following:

(Q₁ ∧ condition) {S₁} Q₂
(Q₁ ∧ ¬condition) {S₂} Q₂
∴ Q₁ {if condition then S₁ else S₂} Q₂
The if-then-else Rule
We leave the flowchart and program formulations of the if-then-else rule as exercises. Example In order to establish that
A1: true
if x < 0 then y ← −1 else y ← 1
Aa
AEC
(C)
[ACBABEC]IZ>AEC
(b)
[AE
A € C? Is it always
BABEC] AE >C
Briefly describe the difference between the sets {2} and {{2}}. List the elements and all the subsets of each set.
Briefly describe the difference between the sets ∅, {∅}, and {∅, {∅}}. List the elements and all the subsets of each of these sets.
7. Is it possible that A ⊂ B and A ∈ B? Prove your assertion.

Programming Problem
Write a program which decides if two input sets are equal or if one is contained in the other. Assume all sets are finite subsets of the set of natural numbers N.

2.4 OPERATIONS ON SETS
An operation on sets uses given sets (called the operands) to specify a new set (called the resultant). We will first treat binary operations; a binary operation combines two operands to produce a resultant. As in the previous sections, we assume that all sets are constructed from some implicitly specified universe of discourse U.
Definition 2.4.1: Let A and B be sets.
(a) The union of A and B, denoted A ∪ B, is the set
A ∪ B = {x | x ∈ A ∨ x ∈ B}.
(b) The intersection of A and B, denoted A ∩ B, is the set
A ∩ B = {x | x ∈ A ∧ x ∈ B}.
(c) The difference of A and B, or relative complement of B with respect to A, denoted A − B, is the set
A − B = {x | x ∈ A ∧ x ∉ B}.
Examples Let A = {0, 1, 2} and B = {1, 2, 3}. Then
(a) A ∪ B = {0, 1, 2, 3}
(b) A ∩ B = {1, 2}
(c) A − B = {0}
(d) B − A = {3}
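The three operations can be written directly from their defining predicates. The following sketch assumes a small finite universe U chosen only for illustration; the function names are not from the text. Python's built-in set operators `|`, `&`, and `-` compute the same results and are used here as a cross-check.

```python
# A small finite universe of discourse, assumed for illustration.
U = set(range(10))

def union(A, B):
    return {x for x in U if x in A or x in B}

def intersection(A, B):
    return {x for x in U if x in A and x in B}

def difference(A, B):
    return {x for x in U if x in A and x not in B}


A = {0, 1, 2}
B = {1, 2, 3}

# Each definition agrees with the corresponding built-in operator.
print(union(A, B) == A | B)
print(intersection(A, B) == A & B)
print(difference(A, B) == A - B)
print(difference(B, A) == B - A)
```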
#
Definition 2.4.2: If A and B are sets and A ∩ B = ∅, then A and B are disjoint. If C is a collection of sets such that any two distinct elements of C are disjoint, then C is a collection of (pairwise) disjoint sets.
Example If C = {{0}, {1}, {2}, ...} = {{i} | i ∈ N}, then C is a collection of disjoint sets.
#
We next define some important classes of binary operations. Note that the following definition is not restricted to operations on sets.

Definition 2.4.3: Let □ denote a binary operation, and let x □ y denote the resultant obtained by applying the operation □ to the operands x and y. Then
(a) The operation □ is commutative if x □ y = y □ x.
(b) The operation □ is associative if (x □ y) □ z = x □ (y □ z).
Examples For the integers, the binary operation of addition is commutative and associative since for all integers x, y and z,
x + y = y + x
(x + y) + z = x + (y + z)
However, the operation of subtraction is neither commutative nor associative, e.g.,
6 − 4 ≠ 4 − 6
(6 − 4) − 2 ≠ 6 − (4 − 2)
#
Theorem 2.4.1: The set operations of union and intersection are commutative and associative, i.e., for arbitrary sets A, B, and C,
(a) A ∪ B = B ∪ A
(b) A ∩ B = B ∩ A
(c) (A ∪ B) ∪ C = A ∪ (B ∪ C)
(d) (A ∩ B) ∩ C = A ∩ (B ∩ C)
The proofs of assertions (a)-(d) use the commutativity and associativity of the logical operators ∨ and ∧. We will illustrate by proving assertions (a) and (c).
Proof: (a) Let x be an arbitrary element of the universe U. Then
x ∈ A ∪ B ⇔ x ∈ A ∨ x ∈ B        Definition of ∪
          ⇔ x ∈ B ∨ x ∈ A        Commutativity of ∨
          ⇔ x ∈ B ∪ A            Definition of ∪
Since x was arbitrary, it follows that
∀x[x ∈ A ∪ B ⇔ x ∈ B ∪ A].
Hence, A ∪ B = B ∪ A.
(c) Let x be an arbitrary element. Then
x ∈ A ∪ (B ∪ C) ⇔ x ∈ A ∨ x ∈ (B ∪ C)          Definition of ∪
                ⇔ x ∈ A ∨ (x ∈ B ∨ x ∈ C)      Definition of ∪
                ⇔ (x ∈ A ∨ x ∈ B) ∨ x ∈ C      Associativity of ∨
                ⇔ x ∈ (A ∪ B) ∨ x ∈ C          Definition of ∪
                ⇔ x ∈ (A ∪ B) ∪ C              Definition of ∪
Since x was arbitrary, it follows that
∀x[x ∈ A ∪ (B ∪ C) ⇔ x ∈ (A ∪ B) ∪ C].
Hence, A ∪ (B ∪ C) = (A ∪ B) ∪ C.
The following definition is not restricted to operations on sets.
Definition 2.4.4: Let △ and □ be binary operations. Then △ distributes over □ if the following hold:
x △ (y □ z) = (x △ y) □ (x △ z)
(y □ z) △ x = (y △ x) □ (z △ x)
(Note that if △ is a commutative operation, then each of these "distributive laws" implies the other.)
Examples For the set of integers, multiplication distributes over addition:
x · (y + z) = x · y + x · z
Addition does not distribute over multiplication, e.g.,
4 + (6 · 2) ≠ (4 + 6) · (4 + 2)
#
Theorem 2.4.2: The set operations of union and intersection distribute over each other, i.e., for arbitrary sets A, B and C,
(a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(b) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
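Both distributive laws can be tested exhaustively on a small finite domain. The sketch below assumes a three-element universe for the set operations and a short range of integers for the arithmetic operations; `subsets` and `distributes` are hypothetical helper names. A finite check of this kind illustrates the theorem but does not prove it for arbitrary sets.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def distributes(op1, op2, domain):
    """Check both distributive laws of op1 over op2 on a finite domain."""
    values = list(domain)
    return all(
        op1(x, op2(y, z)) == op2(op1(x, y), op1(x, z))
        and op1(op2(y, z), x) == op2(op1(y, x), op1(z, x))
        for x in values for y in values for z in values
    )


all_sets = subsets({0, 1, 2})
union = lambda a, b: a | b
inter = lambda a, b: a & b

union_over_inter = distributes(union, inter, all_sets)
inter_over_union = distributes(inter, union, all_sets)

ints = range(-3, 4)
mul_over_add = distributes(lambda a, b: a * b, lambda a, b: a + b, ints)
add_over_mul = distributes(lambda a, b: a + b, lambda a, b: a * b, ints)

print(union_over_inter, inter_over_union, mul_over_add, add_over_mul)
```

The same `distributes` function handles sets and integers because the definition of distributivity is not restricted to operations on sets.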
Proof: (a) Let x be an arbitrary element. Then
x ∈ A ∪ (B ∩ C) ⇔ x ∈ A ∨ x ∈ (B ∩ C)                  Definition of ∪
                ⇔ x ∈ A ∨ (x ∈ B ∧ x ∈ C)              Definition of ∩
                ⇔ (x ∈ A ∨ x ∈ B) ∧ (x ∈ A ∨ x ∈ C)    Distributivity of ∨ over ∧
                ⇔ (x ∈ A ∪ B) ∧ (x ∈ A ∪ C)            Definition of ∪
                ⇔ x ∈ (A ∪ B) ∩ (A ∪ C)                Definition of ∩
Hence, A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). The proof of part (b) is left as an exercise.

Theorem 2.4.3: Let A, B, C and D be arbitrary subsets of a universe U. Then the following assertions are true.
(a) A ∪ A = A
(b) A ∩ A = A
(c) A ∪ ∅ = A
(d) A ∩ ∅ = ∅
(e) A − B ⊂ A
(f) If A ⊂ B and C ⊂ D, then (A ∪ C) ⊂ (B ∪ D)
(g) If A ⊂ B and C ⊂ D, then (A ∩ C) ⊂ (B ∩ D)
(h) A ⊂ A ∪ B
(i) A ∩ B ⊂ A
(j) …
… x ∈ S]}.
These operations are natural generalizations of the union and intersection operations defined previously; if x ∈ ⋃_{S∈C} S, then x is an element of at least one subset S ∈ C, and if x ∈ ⋂_{S∈C} S, then x is a member of every subset S ∈ C. Note that C is required to be nonempty for ⋂_{S∈C} S; if C were empty, the condition ∀S[S ∈ C ⇒ x ∈ S] would hold vacuously for every x, and the intersection would be the universal set U. By requiring that C ≠ ∅, this possibility is eliminated.

If D is a set and a set A_d has been defined for each d ∈ D, then d is called the index of A_d, the collection C = {A_d | d ∈ D} is called an indexed collection of sets, and D is called the index set of the collection. When D is the index set of a collection C, the notation ⋃_{d∈D} A_d denotes ⋃_{S∈C} S, and ⋂_{d∈D} A_d denotes ⋂_{S∈C} S.

If C is a finite indexed collection of sets and the index set is a set of natural numbers {0, 1, 2, ..., n}, then the union and intersection of the members of C can be denoted by using notation similar to the summation notation. Let C = {A_0, A_1, ..., A_n}; then
⋃_{S∈C} S = ⋃_{0≤i≤n} A_i = ⋃_{i=0}^{n} A_i
⋂_{S∈C} S = ⋂_{0≤i≤n} A_i = ⋂_{i=0}^{n} A_i
In general the set of indices need not be a subset of N, but can be an arbitrary set.
Examples Let the universe be the set of real numbers R.
(a) If C = {{0}, {0, 1}, {0, 1, 2}}, then ⋃_{S∈C} S = {0, 1, 2}, and ⋂_{S∈C} S = {0}.
(b) Let (a, b) denote the open interval from a to b, i.e., (a, b) = {x | a < x < b}. … < 5}. If … 0}, then … S = (−1, 1).
(c) … Then ⋃_{i∈{a,b,c}} A_i = {0, 1, 2, 4, 5, 6} and ⋂_{i∈{a,b,c}} A_i = ∅.
#
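For a finite collection, the generalized union and intersection can be computed by folding the binary operations over the members of the collection. The sketch below uses the collection C = {{0}, {0, 1}, {0, 1, 2}} from example (a); the function names `big_union` and `big_intersection` are assumptions made for illustration. The intersection function rejects an empty collection, reflecting the requirement that C be nonempty.

```python
from functools import reduce


def big_union(collection):
    """Union of all sets in a (possibly empty) finite collection."""
    return reduce(lambda acc, s: acc | s, collection, frozenset())


def big_intersection(collection):
    """Intersection of all sets in a nonempty finite collection."""
    sets = list(collection)
    if not sets:
        # With no sets to constrain membership, the result would be
        # the whole universe, so the empty case is excluded.
        raise ValueError("intersection requires a nonempty collection")
    return reduce(lambda acc, s: acc & s, sets)


C = [frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
print(sorted(big_union(C)))
print(sorted(big_intersection(C)))
```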
We will often refer to the set of subsets of a set. Since the set of subsets of a given set A is unique, we can define a unary operation on sets whose value is the set of subsets of the operand.
Definition 2.4.7: Let A be a set. The power set of A, denoted 𝒫(A), is the set of all subsets of A.
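For a finite set, the power set can be enumerated by collecting the subsets of each size. This is a minimal sketch; the function name `power_set` is illustrative, and subsets are represented as frozensets so that they can themselves be elements of a set.

```python
from itertools import combinations


def power_set(A):
    """Return the set of all subsets of the finite set A."""
    items = list(A)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


for A in (set(), {1}, {1, 2}):
    subsets_of_A = sorted(sorted(s) for s in power_set(A))
    print(sorted(A), "has", len(subsets_of_A), "subsets:", subsets_of_A)
```

A set with n elements has 2ⁿ subsets, so this enumeration is practical only for small sets.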
Examples
(a) If A = ∅, then 𝒫(A) = {∅}.
(b) If A = {1}, then 𝒫(A) = {∅, {1}}.
(c) If A = {1, 2}, then 𝒫(A) = {∅, {1}, {2}, {1, 2}}.
(d) If A is any (finite or infinite) set of natural numbers, then A ∈ 𝒫(N).
#
If A is finite, then 𝒫(A) is finite; otherwise, 𝒫(A) is infinite.

Problems: Section 2.4

1. (a) Construct Venn diagrams for the following:
(i) A ∪ B
(ii) A ∩ B
(iii) A − (B ∪ C)
(iv) A ∩ (B ∪ C)
(b) Give a formula which denotes the shaded portion of each of the following Venn diagrams.
[Venn diagrams (i), (ii), (iii)]
2. Let A, B and C be arbitrary sets. Express A ∪ B ∪ C as a union of disjoint sets.
3. Prove parts (b) and (d) of Theorem 2.4.1.
4. Let A, B, and C be sets.
(a) Show that if C ⊂ A and C ⊂ B, then C ⊂ A ∩ B (i.e., A ∩ B is the largest set contained in both A and B).
(b) Show that if C ⊃ A and C ⊃ B, then C ⊃ A ∪ B (i.e., A ∪ B is the smallest set which contains both A and B).
5. Prove part (b) of Theorem 2.4.2.
6. Suppose A ≠ ∅ and A ∪ B = A ∪ C. Show that it does not follow that B = C. Suppose in addition that A ∩ B = A ∩ C. Can you conclude that B = C?
7. (a) Show that "relative complement" is not a commutative operation; that is, there exist universes which contain sets A and B such that A − B ≠ B − A.
(b) Is it possible that A − B = B − A? Characterize all conditions under which this occurs.
(c) Is "relative complement" an associative operation? Prove your assertion.
8. Prove the remaining parts of Theorem 2.4.3.
9. Prove Theorem 2.4.4.
10. Prove the following identities.
(a) A ∪ (A ∩ B) = A
(b) A ∩ (A ∪ B) = A
(c) A − B = A ∩ B̄
(d) A ∪ (Ā ∩ B) = A ∪ B
(e) A ∩ (Ā ∪ B) = A ∩ B
11. In each of the following, find ⋃_{S∈C} S and ⋂_{S∈C} S.
(a) C = {∅}
(b) C = {∅, {∅}}
(c) C = {{a}, {b}, {a, b}}
(d) C = {{i} | i ∈ …
12. Let A, B, and C be subsets of some universe U, and let D be the following collection.
D = {A ∩ B ∩ C, A ∩ B ∩ C̄, A ∩ B̄ ∩ C, A ∩ B̄ ∩ C̄, Ā ∩ B ∩ C, Ā ∩ B ∩ C̄, Ā ∩ B̄ ∩ C, Ā ∩ B̄ ∩ C̄}
(a) Construct a Venn diagram for the elements of the collection D.
(b) Prove that A ∩ B ∩ C and A ∩ B ∩ C̄ are disjoint. Is D a disjoint collection of sets?
(c) Prove that ⋃_{S∈D} S = U.
13. Let C be a nonempty collection of subsets of some universe U. Prove the following generalization of DeMorgan's laws.
(a) The complement of ⋃_{S∈C} S is ⋂_{S∈C} S̄.
(b) The complement of ⋂_{S∈C} S is ⋃_{S∈C} S̄.
14. Specify the power set for each of the following sets.
(a) {a, b, c}
(b) {{a, b}, {c}}
(c) {{a, b}, {b, a}, {a, b, b}}
15. Let S_n = {a_0, a_1, ..., a_n} and S_{n+1} = {a_0, a_1, ..., a_n, a_{n+1}}. Describe how 𝒫(S_{n+1}) is related to 𝒫(S_n). (Hint: 𝒫(S_{n+1}) contains 𝒫(S_n).)
16. Let x and y be real numbers and define the operation x △ y to be x^y (x raised to the power y).
(a) Show that the operation △ is neither commutative nor associative.
(b) Let ∘ represent multiplication. Determine which of the following distributive laws hold.
(i) x ∘ (y △ z) = (x ∘ y) △ (x ∘ z)
(ii) (y △ z) ∘ x = (y ∘ x) △ (z ∘ x)
(iii) x △ (y ∘ z) = (x △ y) ∘ (x △ z)
(iv) (y ∘ z) △ x = (y △ x) ∘ (z △ x)

Programming Problems
1. Write a program to generate the power set of {0, 1, 2, ..., n} for any natural number n given as input.
2. (a) Write a program which accepts specifications of two finite sets A and B, where A, B ⊂ N, and prints a nonredundant list of the elements of A ∪ B and A ∩ B.
(b) Write a program to determine for a given set A and an arbitrary n ∈ N whether n ∈ A.

2.5 INDUCTION

Inductive Definition of Sets

Earlier in this chapter, we described how finite sets can be defined either explicitly by listing the elements of the set, or implicitly by using a predicate with free variables; we also observed that infinite sets can only be specified implicitly. But predicates do not always provide a convenient means of characterizing an infinite set. For example, there is no convenient or obvious predicate to specify the set of ALGOL, PL/I, or FORTRAN programs, or even such a basic structure as the set of natural numbers N. Such sets are often most naturally defined using an inductive definition.† An inductive definition of a set always consists of three distinct components.
1. The basis, or basis clause, of the definition establishes that certain objects are in the set. This part of the definition has the dual function of establishing that the set being defined is not empty and of characterizing the "building blocks" which will be used to construct the remainder of the set.
2. The induction, or inductive clause, of an inductive definition establishes the ways in which elements of the set can be combined to obtain new elements. The inductive clause always asserts that if objects x, y, ..., z are elements of the set, then they can be combined in certain specified ways to create other objects which are also in the set. Thus, while the basis clause describes the building blocks of the set, the inductive clause describes the operations which can be performed on objects in order to construct new elements of the set.
3. The extremal clause asserts that unless an object can be shown to be a member of the set by applying the basis and inductive clauses a finite number of times, then the object is not a member of the set. The extremal clause of an inductive definition of a set S has a variety of forms, such as
(i) "No object is a member of S unless its being so follows from a finite number of applications of the basis and inductive clauses."
(ii) "The set S is the smallest set which satisfies the basis and inductive clauses."
(iii) "The set S is the set such that S satisfies the basis and inductive clauses and no proper subset of S satisfies them (i.e., if T is a subset of S such that T satisfies the basis and inductive clauses, then T = S)."
(iv) "The set S is the intersection of all sets which satisfy the properties specified by the basis and inductive clauses."

In fact, all these forms of the extremal clause are equivalent in consequence though not in form, and all serve the purpose of establishing that nothing is a member of the set being defined unless it is required to be so by the first two steps of the definition. Often the extremal clause is not stated explicitly in an inductive definition; this rarely leads to misunderstandings.

†The term "recursive definition" is often used to denote what we call an "inductive definition."

Example If the universe of discourse is the set of integers I, then a predicate definition of the set E of even nonnegative integers can be given as follows:
E = {x | x ≥ 0 ∧ ∃y[x = 2y]}
The same set can be defined inductively as follows:
1. (Basis) 0 ∈ E,
2. (Induction) If n ∈ E, then (n + 2) ∈ E,
…
… ((P ∧ Q) ⇒ R) is a propositional form over V. This can be established as follows: From the basis clause, it follows that P, Q, and R are all propositional forms. Applying the induction clause to P and Q, it follows that (P ∧ Q) is a propositional form, and by another application of the inductive clause, this time to (P ∧ Q) and R, it follows that ((P ∧ Q) ⇒ R) is a propositional form. Thus one can show that an element is a member of an inductively defined set by exhibiting a sequence of applications of the basis and inductive steps which produces the element in question.
#
Recursive Procedures

Inductive definitions form a subclass of a more general class known as recursive definitions. As the term is commonly used in computer science, the salient characteristic of a recursive definition is "self-reference" as in the induction clause of an inductive definition. As we use the terms,† not all recursive definitions are inductive; we will give examples to illustrate the difference in a later chapter.

†A distinct but related meaning of the term "recursive" is used in mathematical logic and the theory of computable functions, but a discussion of the relationship between the two uses is beyond our scope. We will only use the term in the informal sense described above.

In programming, a recursive procedure, or recursive subroutine, is one which can call itself, either directly or indirectly. Recursive procedures are based on recursive definitions, although the definition need not be of a set. If a recursive procedure is based on an inductive definition, the segments of the procedure often correspond in a natural way to the basis and induction clauses of the definition. It is often necessary to write procedures to determine whether an input has a specified property. If the set of elements which have the property is defined inductively, a recursive procedure is a natural and powerful mechanism for determining set membership.

Examples
(a) Consider the universe I, and let E be the set of nonnegative even integers defined inductively in the first example of this section. The recursive procedure EVEN(n) given in Fig. 2.5.1 returns "yes" if an input n ∈ I is an element of the set E; otherwise it returns "no." The procedure has three parts. The first part causes "no" to be returned if the input is too small; this part of the procedure does not correspond to any part of the inductive definition of E. The second part of the procedure tests if n = 0; this corresponds to the basis clause of the definition of E. The third part corresponds to the inductive clause of the definition and causes EVEN to call itself with the parameter n − 2.

    procedure EVEN(n):
    comment: If n is even and n ≥ 0, then return "yes."
             Otherwise, return "no."
    if n < 0 then return "no"
    else if n = 0 then return "yes"
    else return EVEN(n − 2)

Fig. 2.5.1 Recursive procedure EVEN to determine if n is a nonnegative even integer

(b) Consider the problem of recognizing whether a string of symbols is an arithmetic expression, where the set of arithmetic expressions is defined inductively in part (a) of the preceding example of this section. A recursive procedure ARITH(exp) based on this definition is given in Fig. 2.5.2. This procedure returns "yes" if the input expression exp is generated by the inductive definition of arithmetic expressions; otherwise, the procedure returns "no." The procedure first checks to see if exp is generated by the basis clause, that is, if exp is a string of digits. If so, the procedure returns "yes." If exp is not a string of digits, then ARITH breaks exp into nonoverlapping substrings to determine if exp is generated from other arithmetic expressions by the induction clause. If this is not the case, ARITH concludes that exp is not an arithmetic expression and returns "no."

    procedure ARITH(exp):
    comment: If exp is an arithmetic expression, then return "yes."
             Otherwise return "no."
    begin
        comment: Determine if exp is generated by the basis clause.
        if exp is a string of digits then return "yes"
        else begin
            comment: Determine if exp is generated by the inductive clause.
            if exp contains a substring exp_1 such that either exp = (+exp_1)
                or exp = (−exp_1) then return ARITH(exp_1)
            else if exp contains substrings exp_1 and exp_2 such that
                exp = (exp_1 □ exp_2)
                where □ is an operation symbol (+, −, / or *)
                and ARITH(exp_1) = "yes" and ARITH(exp_2) = "yes"
                then return "yes"
        end;
        comment: exp is not produced by either basis or inductive clauses.
        return "no"
    end

Fig. 2.5.2 Recursive procedure ARITH to determine whether a string of symbols is an arithmetic expression
#
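The procedures of Fig. 2.5.1 and Fig. 2.5.2 translate fairly directly into a modern language. The sketch below is one possible Python rendering, not the text's own code. It assumes, as read off Fig. 2.5.2, that the basis clause for arithmetic expressions admits strings of digits and that the inductive clause admits (+e), (−e), and (e₁ □ e₂) for □ among +, −, /, *; the ASCII hyphen stands for the minus sign, and the names `even` and `arith` are illustrative.

```python
def even(n: int) -> bool:
    """Membership in E, the nonnegative even integers."""
    if n < 0:
        return False        # guard: no clause of the definition applies
    if n == 0:
        return True         # basis clause: 0 is in E
    return even(n - 2)      # inductive clause: n in E if n - 2 in E


OPERATORS = "+-/*"
DIGITS = "0123456789"


def arith(exp: str) -> bool:
    """Decide whether exp is an arithmetic expression (assumed grammar)."""
    # Basis clause: a nonempty string of digits.
    if exp != "" and all(ch in DIGITS for ch in exp):
        return True
    # Inductive clause: every compound expression is parenthesized.
    if len(exp) >= 3 and exp[0] == "(" and exp[-1] == ")":
        inner = exp[1:-1]
        # Unary case: (+exp_1) or (-exp_1).
        if inner[0] in "+-" and arith(inner[1:]):
            return True
        # Binary case: try every position as the operator (exhaustive test).
        for i in range(1, len(inner) - 1):
            if inner[i] in OPERATORS and arith(inner[:i]) and arith(inner[i + 1:]):
                return True
    # exp is produced by neither the basis nor the inductive clause.
    return False


print(even(10), even(7), even(-4))
print(arith("((1+2)*(-3))"), arith("1+2"))
```

Like the procedures in the figures, this version favors clarity over efficiency: `arith` tries every split point, and `even` recurses once per two units of n, so very large inputs would exceed Python's default recursion depth.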
When a recursive procedure to decide if an element is in a set is based on an inductive definition of the set, it is necessary to provide a mechanism for returning a negative answer. The procedure given in Fig. 2.5.1 contains a test for a negative input. Without this test, the procedure would return "yes" if the input was nonnegative and even, but would not terminate for other inputs. The procedure ARITH of Fig. 2.5.2 must determine if the input can be broken into substrings of operands and operators. All the possibilities can be considered by exhaustive test, and if none are successful, the procedure returns "no." (In fact, there are much faster ways of determining this information than by exhaustive testing; our examples are suitably illustrative, but they are not efficient algorithms.)

Inductive Proofs

Inductive definitions not only provide a method of defining infinite sets, but they also form the basis of some powerful techniques for proving theorems. If a set is finite, a statement of the form ∀xP(x) can in principle be established by an exhaustive proof by cases. But for infinite sets, some other device must be used. Proofs by induction are proofs of universally quantified assertions where the universe of discourse is an inductively defined set. Suppose we wish to establish that all the elements of an inductively defined set S have a property P; i.e., we wish to establish ∀xP(x) for the universe S. A proof by induction usually consists of two parts corresponding to the basis and induction clauses of the definition of S:
1. The basis step establishes that P(x) is true for every element x ∈ S specified in the basis clause of the definition of S.
2. The induction step establishes that each element constructed using the induction clause of the definition of S has the property P if all the elements used in its construction have the property P.

Note that there is no step in an inductive proof which corresponds to the extremal clause of the definition of S, but its role is crucial to proofs by induction. The extremal clause guarantees that all elements of S can be constructed using only the basis and induction clauses of the definition. An inductive proof establishes that every element x constructed in this way has some property P. It follows from the extremal clause that the assertion P(x) holds for all elements of S, and we can therefore conclude ∀xP(x).
To illustrate the technique of inductive proof, consider the set of well-formed, or balanced, strings of parentheses. (For clarity, we will represent parentheses by square brackets.)
Definition 2.5.3: Let $\Sigma$ be the alphabet $\{[, ]\}$. The set $B$ of well-formed parenthesis strings is the subset of $\Sigma^*$ such that
1. (Basis) $[\,]$ is an element of $B$.
2. (Induction) If $x$ and $y$ are elements of $B$, then
(i) $[x]$ is an element of $B$, and
(ii) $xy$ is an element of $B$.
3. (Extremal) The set $B$ consists of all symbol strings which can be constructed using a finite number of applications of clauses 1 and 2.
The set $B$ is the set of all parenthesis sequences which can occur in algebraic formulas, such as $[\,]$, $[[\,]]$, $[\,][\,]$, $[[\,]][\,]$, and $[[\,][\,]]$. We now show that in any well-formed parenthesis string, the number of left parentheses is equal to the number of right parentheses.
Theorem 2.5.1: Let $x$ be an element of $B$. If $L(x)$ denotes the number of left parentheses in $x$ and $R(x)$ denotes the number of right parentheses in $x$, then $L(x) = R(x)$.
Proof: The theorem asserts $\forall x[x \in B \Rightarrow L(x) = R(x)]$. The proof follows the definition of $B$. Let $x$ be an arbitrary element of $B$.
1. (Basis) If $x = [\,]$, then $L(x) = R(x) = 1$.
2. (Induction) Let $x$ and $y$ be elements of $B$, and suppose they have the property that $L(x) = R(x)$ and $L(y) = R(y)$. We show that any element $z$ which can be constructed from $x$ and $y$ has the property $L(z) = R(z)$.
(i) If $z = [x]$, then $L(z) = L(x) + 1 = R(x) + 1 = R(z)$.
(ii) If $z = xy$, then $L(z) = L(x) + L(y) = R(x) + R(y) = R(z)$.
This completes the inductive proof and establishes the theorem. ∎
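The inductive definition of $B$ and the property proved in Theorem 2.5.1 can be explored with a short program. The following is a minimal sketch, not part of the original text: the function names `generate_b`, `left_count`, and `right_count` are hypothetical, and the ASCII characters `[` and `]` stand for the two symbols of the alphabet. It applies the basis and induction clauses a fixed number of times and checks $L(x) = R(x)$ on every string produced. Such a check covers only finitely many strings; the theorem itself still rests on the inductive proof.

```python
from itertools import product


def generate_b(rounds):
    """Build elements of B by applying the induction clause `rounds` times."""
    b = {"[]"}                                    # basis clause: [ ] is in B
    for _ in range(rounds):
        wrapped = {"[" + x + "]" for x in b}      # induction clause (i): [x]
        joined = {x + y for x, y in product(b, repeat=2)}  # clause (ii): xy
        b = b | wrapped | joined
    return b


def left_count(x):
    """L(x): the number of left brackets in x."""
    return x.count("[")


def right_count(x):
    """R(x): the number of right brackets in x."""
    return x.count("]")


sample = generate_b(2)
# Every generated string has as many left brackets as right brackets.
assert all(left_count(x) == right_count(x) for x in sample)
```

Each round of `generate_b` corresponds to one more permitted application of clause 2, so the extremal clause is reflected in the fact that nothing enters the set by any other route.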
Most commonly, proofs by induction deal with the natural numbers. In order to discuss these proofs, it will be useful to have the following inductive characterization of $\mathbf{N}$.
1. (Basis) $0 \in \mathbf{N}$.
2. (Induction) If $n \in \mathbf{N}$, then $(n + 1) \in \mathbf{N}$.
3. (Extremal) If $S \subset \mathbf{N}$, and $S$ has the properties
(i) $0 \in S$,
(ii) For every $n \in \mathbf{N}$, if $n \in S$ then $(n + 1) \in S$,
then $S = \mathbf{N}$.
In fact, this does not suffice to define the natural numbers because we have not carefully specified what is meant by the basis and inductive steps; we will present a proper definition of $\mathbf{N}$ in the next section. However, the above characterization will enable us to discuss inductive proofs for the universe $\mathbf{N}$. The extremal clause in the above characterization of $\mathbf{N}$ is the form customarily used in definitions of the natural numbers; it is called the First Principle of Mathematical Induction. This form of the extremal clause implies the procedure to be used for inductive proofs of assertions of the form $\forall x P(x)$ for the universe of natural numbers. Such a proof proceeds as follows:
1. (Basis) We first show that $P(0)$ is true, using whatever proof technique is appropriate.
2. (Induction) We next show $\forall n[P(n) \Rightarrow P(n + 1)]$.
The inductive step of the proof is usually a direct proof of the implication $P(n) \Rightarrow P(n + 1)$, where the implication is established for arbitrary $n \in \mathbf{N}$. The assertion $P(n)$ is known as the induction hypothesis. The induction hypothesis is often stated as “Assume $P(n)$ is true for arbitrary $n \in \mathbf{N}$”. Note that this is not equivalent to assuming the truth of the theorem; $P(n)$ is assumed only for the purpose of proving the universally quantified assertion $\forall n[P(n) \Rightarrow P(n + 1)]$. Once $P(n) \Rightarrow P(n + 1)$ has been proven for arbitrary $n$, it follows (by the rule of inference known as Universal Generalization) that $\forall n[P(n) \Rightarrow P(n + 1)]$. Then from the First Principle of Mathematical Induction we can conclude $\forall x P(x)$. For suppose $S$ is the subset of $\mathbf{N}$ such that $P(n)$ is true for every $n \in S$. The basis step of the proof establishes that $0 \in S$. The inductive step establishes that for every $n \in \mathbf{N}$, if $n \in S$, then $(n + 1) \in S$. By the extremal clause of the definition of $\mathbf{N}$, it follows that $S = \mathbf{N}$, i.e., $\forall x P(x)$.
To illustrate proofs by induction over N, we will prove the following.
Theorem 2.5.2: For all $n \in \mathbf{N}$,

$$\sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$$

The theorem is of the form $\forall n P(n)$, where $P(n)$ is the assertion

$$\sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$$

Proof:
1. We first establish the basis step $P(0)$:

$$\sum_{i=0}^{0} i = \frac{0(0+1)}{2}.$$

The proof consists simply of evaluating each side, giving $0 = 0$.
2. The induction step establishes $\forall n[P(n) \Rightarrow P(n + 1)]$. To prove this assertion, we give a direct proof of the assertion $P(n) \Rightarrow P(n + 1)$ for arbitrary $n \in \mathbf{N}$. In a direct proof of $P(n) \Rightarrow P(n + 1)$, the induction hypothesis $P(n)$ is assumed to be true. $P(n)$ asserts

$$\sum_{i=0}^{n} i = \frac{n(n+1)}{2}.$$

We wish to show $P(n + 1)$, i.e.,

$$\sum_{i=0}^{n+1} i = \frac{(n+1)(n+2)}{2}.$$

But,

$$\begin{aligned}
\sum_{i=0}^{n+1} i &= (n+1) + \sum_{i=0}^{n} i \\
&= (n+1) + \frac{n(n+1)}{2} \qquad \text{(by the induction hypothesis)} \\
&= \frac{2(n+1) + n(n+1)}{2} \\
&= \frac{(n+1)(n+2)}{2}.
\end{aligned}$$

Since $n$ was arbitrary, it follows that $\forall n[P(n) \Rightarrow P(n + 1)]$. By the First Principle of Mathematical Induction we conclude that $\forall x P(x)$. ∎
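The two steps of this proof can be mirrored in a few lines of code. The sketch below is an illustration only (the names `p` and `step_identity` are hypothetical): `p(n)` evaluates the assertion $P(n)$ directly, and `step_identity(n)` checks the algebraic identity used in the induction step, cleared of the denominator 2. A finite check of this kind supports the theorem for the tested values but does not replace the induction.

```python
def p(n):
    """P(n): 0 + 1 + ... + n equals n(n+1)/2."""
    return sum(range(n + 1)) == n * (n + 1) // 2


def step_identity(n):
    """Induction-step algebra, times 2: 2(n+1) + n(n+1) = (n+1)(n+2)."""
    return 2 * (n + 1) + n * (n + 1) == (n + 1) * (n + 2)


assert p(0)                                        # basis step
assert all(step_identity(n) for n in range(1000))  # algebra of the induction step
assert all(p(n) for n in range(1000))              # the conclusion, for small n
```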
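The following theorem gives algebraic expressions for two more finite sums which will occur in Chapter 5 when we treat the analysis of algorithms. The proofs are by induction and are left as exercises.

Theorem 2.5.3: Let $r$ be a real number. Then for all $n \in \mathbf{N}$,

(a)
$$\sum_{i=0}^{n} r^i = \begin{cases} n + 1 & \text{if } r = 1, \\[4pt] \dfrac{r^{n+1} - 1}{r - 1} & \text{if } r \neq 1. \end{cases}$$

(b)
$$\sum_{i=0}^{n} i\,r^i = \begin{cases} \dfrac{n(n+1)}{2} & \text{if } r = 1, \\[4pt] \dfrac{n r^{n+2} - (n+1) r^{n+1} + r}{(r - 1)^2} & \text{if } r \neq 1. \end{cases}$$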
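Since the proofs of Theorem 2.5.3 are left as exercises, it can be reassuring to test the closed forms numerically before attempting them. The following sketch is an assumption-laden illustration: the function names are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used in place of real numbers so that equality tests are exact.

```python
from fractions import Fraction


def geometric_closed(r, n):
    """Closed form of part (a): sum of r**i for i = 0..n."""
    if r == 1:
        return Fraction(n + 1)
    return (r ** (n + 1) - 1) / (r - 1)


def weighted_closed(r, n):
    """Closed form of part (b): sum of i * r**i for i = 0..n."""
    if r == 1:
        return Fraction(n * (n + 1), 2)
    return (n * r ** (n + 2) - (n + 1) * r ** (n + 1) + r) / (r - 1) ** 2


def check(r, n):
    """Compare both closed forms with the sums computed term by term."""
    part_a = sum(r ** i for i in range(n + 1))
    part_b = sum(i * r ** i for i in range(n + 1))
    return part_a == geometric_closed(r, n) and part_b == weighted_closed(r, n)


# Try a small grid of rational values of r, including r = 1 and r = 0.
assert all(
    check(Fraction(num, den), n)
    for num in range(-3, 4)
    for den in (1, 2, 3)
    for n in range(8)
)
```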
In many proofs by induction, the assertions involve properties of the natural numbers only indirectly. The following theorem is an example; we wish to prove an assertion relating finite sets to their power sets, but the natural technique is an inductive proof.

Theorem 2.5.4: If $S$ is a finite set with $n$ elements, then $S$ has $2^n$ distinct subsets.

The assertion in logical notation is the following:

$$\forall n\, \forall S[S \text{ has } n \text{ elements} \Rightarrow \mathcal{P}(S) \text{ has } 2^n \text{ elements}].$$

Proof: The inductive proof has two parts.
1. (Basis) We must show $\forall S[S \text{ has } 0 \text{ elements} \Rightarrow \mathcal{P}(S) \text{ has } 1 \text{ element}]$. Let $S$ be an arbitrary set with 0 elements. Then $S = \emptyset$ and it follows that $\mathcal{P}(S) = \{\emptyset\}$. Since $2^0 = 1$, the assertion is established for $n = 0$.
2. (Induction) We must show that if the assertion is true for sets with $n$ elements, then it is true for sets with $(n + 1)$ elements. Let $S = \{a_1, a_2, a_3, \ldots, a_n\}$, and $S' = S \cup \{a\}$ where $a \notin S$. If $A$ is the set $\{B \cup \{a\} \mid B \in \mathcal{P}(S)\}$, then $\mathcal{P}(S') = \mathcal{P}(S) \cup A$, since every subset of $S'$ is either a subset of $S$ or is formed by adding the element $a$ to a subset of $S$. By the induction hypothesis, $\mathcal{P}(S)$ has $2^n$ elements. Since each subset of $S$ corresponds to exactly one element of $A$, it follows that $A$ also has $2^n$ elements. Since $\mathcal{P}(S)$ and $A$ are disjoint and $\mathcal{P}(S') = \mathcal{P}(S) \cup A$, it follows that $\mathcal{P}(S')$ has $2^n + 2^n = 2 \cdot 2^n = 2^{n+1}$ elements. ∎
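The induction step of this proof is constructive, so it translates directly into a recursive procedure for building a power set. The sketch below is one possible rendering, with the hypothetical name `power_set`; it assumes the input is a finite set of hashable, distinct elements, and it represents subsets as `frozenset` objects so that they can themselves be members of a set.

```python
def power_set(elements):
    """Return the power set of `elements` as a set of frozensets."""
    items = list(elements)
    if not items:
        return {frozenset()}            # basis: P(empty set) = {empty set}
    a, rest = items[0], items[1:]
    p_s = power_set(rest)               # P(S), where S' = S U {a}
    a_sets = {b | {a} for b in p_s}     # A = {B U {a} | B in P(S)}
    # P(S) and A are disjoint and have the same number of elements.
    assert p_s.isdisjoint(a_sets) and len(a_sets) == len(p_s)
    return p_s | a_sets                 # P(S') = P(S) U A


for n in range(8):
    assert len(power_set(set(range(n)))) == 2 ** n
```

The internal assertion records the two facts the proof relies on: disjointness of $\mathcal{P}(S)$ and $A$, and the one-to-one correspondence between them.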
Often sets which have been inductively defined are used as a base for other inductive definitions. Such “secondary” inductive definitions require no extremal clause because the extremal clause of the underlying set fulfills the appropriate function.

Example
The following is an inductive definition of the exponential $a^n$ for nonnegative integer values of $n$. The underlying inductively defined set is $\mathbf{N}$.

Definition 2.5.4: Let $a \in \mathbf{R}^+$ and $n \in \mathbf{N}$. The value of $a^n$ is defined inductively as follows:
1. (Basis) $a^0 = 1$.
2. (Induction) $a^{n+1} = a^n a$.
The inductive definition can be used to establish the following:

Theorem 2.5.5: $\forall m\, \forall n[a^m a^n = a^{m+n}]$
Although the above assertion involves two universal quantifiers, it can be proved by induction by letting $m$ be arbitrary and proving the assertion $\forall n[a^m a^n = a^{m+n}]$ by induction on $n$. Since $m$ was arbitrary, the theorem will follow by universal generalization.

Proof: Let $m$ be arbitrary.
1. (Basis) If $n = 0$, then

$$a^m a^n = a^m a^0 = a^m(1) = a^m = a^{m+0} = a^{m+n}.$$

2. (Induction) Assume $a^m a^n = a^{m+n}$ for arbitrary $n$. Then

$$\begin{aligned}
a^m a^{n+1} &= a^m(a^n a) && \text{Definition of } a^n \\
&= (a^m a^n)a && \text{Associativity of multiplication} \\
&= (a^{m+n})a && \text{Induction hypothesis} \\
&= a^{(m+n)+1} && \text{Definition of } a^n \\
&= a^{m+(n+1)}. && \text{Associativity of addition}
\end{aligned}$$
∎
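Definition 2.5.4 is itself a recursive procedure in disguise. The following sketch (the name `power` is hypothetical) evaluates $a^n$ exactly as the definition prescribes and then spot-checks Theorem 2.5.5. Integers and exact fractions are used here, as an assumption made for illustration, so that the equality tests are not disturbed by floating-point rounding.

```python
from fractions import Fraction


def power(a, n):
    """a**n for a natural number n, following Definition 2.5.4."""
    if n == 0:
        return 1                    # basis: a^0 = 1
    return power(a, n - 1) * a      # induction: a^(n+1) = a^n * a


# Spot-check Theorem 2.5.5: a^m * a^n = a^(m+n).
for a in (2, 3, Fraction(5, 7)):
    for m in range(6):
        for n in range(6):
            assert power(a, m) * power(a, n) == power(a, m + n)
```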
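Expressed as a rule of inference, the First Principle of Mathematical Induction is

$$\begin{array}{l} P(0) \\ \forall n[P(n) \Rightarrow P(n + 1)] \\ \hline \therefore\ \forall x P(x) \end{array}$$

We often wish to prove that a predicate $P$ holds for all $x \geq k$ for some integer $k$. A proof by induction is still appropriate but the basis step must be changed to prove $P(k)$. The rule of inference is then

$$\begin{array}{l} P(k) \\ \forall n[P(n) \Rightarrow P(n + 1)] \\ \hline \therefore\ \forall x[(x \geq k) \Rightarrow P(x)] \end{array}$$

Thus to prove that $P(x)$ holds for all integers equal to or greater than $k$, it suffices to show $P(k)$ is true as the basis step, and then show the inductive step $\forall n[P(n) \Rightarrow P(n + 1)]$.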
Another form of proof by induction over the natural numbers uses the Second Principle of Mathematical Induction to prove assertions of the form $\forall x P(x)$. The induction step of a proof using the Second Principle assumes $P(k)$ is true for all $k < n$. The rule of inference is

$$\begin{array}{l} \forall n[\forall k[k < n \Rightarrow P(k)] \Rightarrow P(n)] \\ \hline \therefore\ \forall x P(x) \end{array}$$

The induction hypothesis for a proof using this rule of inference is $\forall k[k < n \Rightarrow P(k)]$;
from this hypothesis, we must establish $P(n)$. If $P(n)$ can be shown on the assumption that the induction hypothesis holds, then we can conclude $\forall x P(x)$.

Note that if $n = 0$, the assertion $k < 0$ is false for every $k \in \mathbf{N}$, and therefore the implication $[k < 0 \Rightarrow P(k)]$ is true and hence $\forall k[k < 0 \Rightarrow P(k)] \Rightarrow P(0)$ is equivalent to $P(0)$. Thus the basis step of the First Principle is implied by the hypothesis of the Second Principle.

An application of the Second Principle requires only that we establish a single hypothesis, but this often requires a proof by cases. This most commonly is in the form of proving the special case $P(0)$, and then proving that for any $n > 0$, if $P(k)$ holds for all $k < n$, then $P(n)$ holds. Such a proof using the Second Principle …

… 3. (Hint: If $n > 3$, the polygon can be divided into two parts by connecting nonadjacent vertices.)

9. Find predicates $P$ and $Q$ over the natural numbers which will establish that the basis step and the induction step of an inductive proof are independent, i.e., neither logically implies the other. Specifically, find a predicate $P$ such that $P(0)$ is true and $\forall n[P(n) \Rightarrow P(n + 1)]$ is false and a predicate $Q$ such that $Q(0)$ is false and $\forall n[Q(n) \Rightarrow Q(n + 1)]$ is true.

10. What is wrong with the following proof that all people are the same size? We purport to prove that for all $n$ and for all $S$, if $S$ is a set with $n$ people, then all people in $S$ are the same size.
1. (Basis) Let $S$ be an empty set of people. Then for all $x$ and $y$, if $x \in S$ and $y \in S$, then $x$ is the same size as $y$.
2. (Induction) Assume the assertion is true for all sets containing $n$ people. We show it is true for sets containing $n + 1$ people. Any set consisting of $n + 1$ people contains two nonequal subsets of $n$ people which must overlap. Denote these sets by $S'$ and $S''$. Then by the induction hypothesis, all people in $S'$ are the same size and all people in $S''$ are the same size. Since $S'$ and $S''$ overlap, all people in $S = S' \cup S''$ are the same size.

11. Let $\{A_1, A_2, \ldots, A_n\}$ be a nonempty collection of sets. Prove the following generalizations of DeMorgan’s Laws by induction on $n$.

(a) $\overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i}$

(b) $\overline{\bigcap_{i=1}^{n} A_i} = \bigcup_{i=1}^{n} \overline{A_i}$
12. A binary operation $\Box$ is said to be associative if $a \mathbin{\Box} (b \mathbin{\Box} c) = (a \mathbin{\Box} b) \mathbin{\Box} c$. From this “associative law” we infer a much stronger result, namely that in any expression involving only the operation $\Box$, the placement of parentheses does not affect the result; that is, only the operands and the order in which they occur in the expression are important. In order to prove this “generalized associative law,” we define the “set of $\Box$ expressions” as follows:
1. (Basis) A single operand $a_i$ is a $\Box$ expression.
2. (Induction) Let $e_1$ and $e_2$ be $\Box$ expressions. Then $(e_1 \mathbin{\Box} e_2)$ is a $\Box$ expression.
3. (Extremal) There are no $\Box$ expressions other than those which can be constructed from 1 and 2 in a finite number of steps.
The generalized associative law can now be stated as follows: Let $e$ be a $\Box$ expression with $n$ operands $a_1, a_2, \ldots, a_n$ which appear in that order in the expression $e$. Then

$$e = (a_1 \mathbin{\Box} (a_2 \mathbin{\Box} (a_3 \mathbin{\Box} ( \cdots (a_{n-1} \mathbin{\Box} a_n) \cdots )))).$$

Prove this generalized associative law. (Hint: Use the Second Principle of Mathematical Induction.)
2.6 THE NATURAL NUMBERS

In this section, we will exhibit a careful set theoretic definition of the natural numbers. In the previous section, we used the operation of addition to give an inductive characterization of $\mathbf{N}$. Since the definition of addition of natural numbers must be based on the set $\mathbf{N}$, the characterization we gave is circular and hence unacceptable as a formal definition of $\mathbf{N}$. To avoid this circularity, $\mathbf{N}$ must be defined without using addition. The following is a better (but not yet successful) characterization of $\mathbf{N}$ which uses $n'$ to denote the “successor” of a natural number $n$; informally, we interpret $n'$ as $n + 1$.
1. (Basis) $0 \in \mathbf{N}$.
2. (Induction) If $n \in \mathbf{N}$, then $n' \in \mathbf{N}$.
3. (Extremal) If $S \subset \mathbf{N}$ and $S$ satisfies clauses 1 and 2, then $S = \mathbf{N}$.
The inadequacy of the above characterization stems from our not having specified exactly what is meant either by 0 in the basis step or by $n'$ (which must be defined in terms of $n$) in the inductive step. As a result, models can be constructed which satisfy the inductive characterization given above, but do not have the structure of $\mathbf{N}$. The structure we want to characterize can be diagrammed as follows:

$$0 \longrightarrow 0' \longrightarrow 0'' \longrightarrow 0''' \longrightarrow \cdots$$

where $a \longrightarrow b$ means $b$ is a successor of $a$; in the diagram, $0'$ represents 1, $0''$ represents 2, etc. If we can find a model of the above inductive characterization of $\mathbf{N}$ which has a different structure, then we will have established the inadequacy of the characterization as a definition of $\mathbf{N}$. The simplest “unintended” model is formed by making 0 its own successor, i.e., $0 = 0'$. In this model, the set $\mathbf{N}$ is the singleton set $\{0\}$ and the structure is diagrammed as follows:

[Diagram: the single node 0 with an arc from 0 to itself.]

In order to rule out such a model, the set $\mathbf{N}$ must be defined so as to guarantee that 0 is not the successor of any natural number. This change alone is not sufficient, however. Let $\mathbf{N}$ be the set of nodes of an “infinite rooted binary tree.” The root denotes 0, and each natural number has a successor; in fact, it has two successors. This unintended model can be represented as follows:

[Diagram: an infinite rooted binary tree whose root is 0 and in which every node has two successors.]
Consequently, an adequate characterization of $\mathbf{N}$ must guarantee that the successor of a natural number is unique. Even with this condition satisfied, however, it is still possible to construct models which do not have the intended structure. In the following diagram, 0 is not a successor of any natural number, and every natural number has a unique successor. However, two distinct natural numbers have the same successor.

[Diagram: a model in which 0 is not a successor, every element has a unique successor, and two distinct elements share the same successor.]
To rule out such models, the definition of $\mathbf{N}$ must guarantee that if $x' = y'$ then $x = y$, that is, a natural number can have at most one predecessor.

A definition of $\mathbf{N}$ which satisfies all of these constraints can be constructed using set theory. Each natural number will be a set. The first natural number is defined to be $\emptyset$, changing the basis step to
1. (Basis) $\emptyset$ is a natural number.
For each natural number $n$, its successor, $n'$, is constructed as follows.
2. (Induction) If $n$ is a natural number, then $n \cup \{n\}$ is a natural number.
The extremal step remains unchanged. The result is the following definition.

Definition 2.6.1: The set of natural numbers $\mathbf{N}$ is the set such that
1. (Basis) $\emptyset \in \mathbf{N}$,
2. (Induction) If $n \in \mathbf{N}$, then $n \cup \{n\} \in \mathbf{N}$,
3. (Extremal) If $S \subset \mathbf{N}$ and $S$ satisfies clauses 1 and 2, then $S = \mathbf{N}$.

The set of natural numbers, according to this definition, has as its elements the sets

$$\emptyset,\ \{\emptyset\},\ \{\emptyset, \{\emptyset\}\},\ \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\},\ \ldots$$

which we denote by the numerals $0, 1, 2, 3, \ldots$ Many of the familiar properties of the natural numbers can now be established, including the following theorems. (The proofs can be found in Chapter 1 of Cohn [1965].)
Theorem 2.6.1:
0 is not the successor of any natural number.
Theorem 2.6.2:
The successor to any natural number is unique.
Theorem 2.6.3:
If $n' = m'$, then $n = m$.
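The set-theoretic construction of Definition 2.6.1 can be carried out literally for the first few natural numbers. The sketch below is illustrative only: the names `ZERO`, `successor`, and `naturals` are hypothetical, and Python's `frozenset` stands in for a set. It then checks, on this finite sample, the content of Theorems 2.6.1 and 2.6.3 and the fact that the set denoting $k$ has $k$ elements. (Theorem 2.6.2 holds trivially here because `successor` is a function.)

```python
ZERO = frozenset()                       # basis: the empty set denotes 0


def successor(n):
    """n' = n U {n}."""
    return n | frozenset({n})


def naturals(count):
    """The sets denoting 0, 1, ..., count - 1."""
    result, n = [], ZERO
    for _ in range(count):
        result.append(n)
        n = successor(n)
    return result


nums = naturals(8)
succs = [successor(n) for n in nums]
assert all(s != ZERO for s in succs)                   # 0 is not a successor
assert len(set(succs)) == len(succs)                   # n' = m' only if n = m
assert all(len(n) == k for k, n in enumerate(nums))    # k has k elements
```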
If these theorems are added as axioms to the inadequate inductive characterization of $\mathbf{N}$ given at the beginning of this section, we obtain the well-known Peano Postulates for the natural numbers. These postulates, which characterize the natural numbers without using sets, can be stated as follows:
(a) 0 is a natural number.
(b) For each natural number $n$, there exists exactly one natural number $n'$, which we call the successor of $n$.
(c) 0 is not the successor of any natural number.
(d) If $n' = m'$, then $n = m$.
(e) If $S$ is a subset of $\mathbf{N}$ such that
(i) $0 \in S$,
(ii) if $n \in S$, then $n' \in S$,
then $S = \mathbf{N}$.

Problems: Section 2.6

1. Construct a series of models for the axiom systems obtained from the Peano postulates by deleting each of the axioms a through e in turn. None of the models should have the structure of the natural numbers.
2. The definition we have given of the natural numbers only involves the notion of “successor.” Relations such as “less than” and operations such as addition and multiplication must be defined in terms of the concept of “successor.” For example, the operation of addition can be defined inductively as follows:
1. For every integer $m$, $m + 0 = m$.
2. For every pair of integers $m$ and $n$, $m + n' = (m + n)'$.
(a) Show (using the above definition) that addition is associative.
(b) Define multiplication inductively in an analogous manner. You can use the (previously defined) operation of addition.
(c) Define exponentiation inductively, using the operation of multiplication.
(d) Give an inductive definition of the relation “less than.”

3. Construct an alternate model of $\mathbf{N}$ using sets. The alternate model need not have the property that the set which denotes the number $k$ has $k$ elements.

2.7 SET OPERATIONS ON $\Sigma^*$

… 0} denotes the set $\{\Lambda, ab, aabb, aaabbb, \ldots\}$.
We often wish to treat collections of strings rather than individual strings. For example, in programming language specification, we must characterize the entire set of programs which can be written in a language. Similarly, a compiler must be written so that it can handle all programs written in the language. Because of the importance of such sets, a considerable body of terminology and notation has been developed to deal with them.

Definition 2.7.3: Let $\Sigma$ be a finite alphabet. A language over $\Sigma$ is a subset of $\Sigma^*$.

Examples
(a) The set $\{a, ab, abb\}$ is a language over $\Sigma = \{a, b\}$.
(b) The set of strings consisting of sequences of $a$’s followed by sequences of $b$’s, $\{a^n b^m \mid n, m \in \mathbf{N}\}$, is a language over $\{a, b\}$.
(c) The set of ALGOL programs is a language over the alphabet consisting of the ALGOL character set. #
Since every language is a set, the usual collection of set operations introduced earlier in this chapter can be applied to languages. However, because they are collections of strings, other important operations on languages can be defined as well, many of which are based on the operation of concatenation. The principal goal of this section is to introduce these operations on languages and describe some of their properties. These operations are important in a variety of application areas as well as for the study of models of computation.

Definition 2.7.4: Let $A$ and $B$ be languages over $\Sigma$. The set product of $A$ with $B$, denoted $A \cdot B$, or simply $AB$, is the language

$$AB = \{xy \mid x \in A \wedge y \in B\}.$$

The language $AB$ consists of all strings which are formed by concatenating an element of $A$ with an element of $B$.

Example
Let $\Sigma = \{a, b\}$, $A = \{\Lambda, a, ab\}$ and $B = \{a, bb\}$. Then

$$AB = \{a, bb, aa, abb, aba, abbb\},$$
$$BA = \{a, aa, aab, bb, bba, bbab\}.$$
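Note that, in general, $AB \neq BA$; that is, set product is not commutative. #

Theorem 2.7.1: Let $A$, $B$, $C$, and $D$ be arbitrary languages over $\Sigma$. The following relationships hold:
(a) $A\emptyset = \emptyset A = \emptyset$
(b) $A\{\Lambda\} = \{\Lambda\}A = A$
(c) $(AB)C = A(BC)$
(d) If $A \subset B$ and $C \subset D$, then $AC \subset BD$
(e) $A(B \cup C) = AB \cup AC$
(f) $(B \cup C)A = BA \cup CA$
(g) $A(B \cap C)$ …

… 1, then $z = w_1 w_2 \cdots w_n$, where $w_i \in A$ for each $i$ from 1 to $n$.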
Example
Let $\Sigma = \{a, b\}$ and $A = \{\Lambda, a, ab\}$. Then $A^0 = \{\Lambda\}$, $A^1 = A = \{\Lambda, a, ab\}$, and $A^2 = AA = \{\Lambda, a, aa, aab, ab, aba, abab\}$.
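For finite languages, the set product and the powers $A^n$ can be computed directly. The following sketch is an illustration under stated assumptions: the names `set_product` and `language_power` are hypothetical, strings are ordinary Python strings, and the empty string `""` plays the role of $\Lambda$. Powers are formed by taking $A^0 = \{\Lambda\}$ and repeatedly multiplying by $A$, in agreement with the example above.

```python
def set_product(a, b):
    """AB = {xy | x in A and y in B}."""
    return {x + y for x in a for y in b}


def language_power(a, n):
    """A^n, built from A^0 = {empty string} by n multiplications by A."""
    result = {""}
    for _ in range(n):
        result = set_product(result, a)
    return result


A = {"", "a", "ab"}
B = {"a", "bb"}

assert set_product(A, B) == {"a", "bb", "aa", "abb", "aba", "abbb"}
assert set_product(B, A) == {"a", "aa", "aab", "bb", "bba", "bbab"}
assert set_product(A, B) != set_product(B, A)       # not commutative
assert language_power(A, 2) == {"", "a", "aa", "aab", "ab", "aba", "abab"}
```

The assertions reproduce the values of $AB$, $BA$, and $A^2$ given in the examples of this section.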
Theorem 2.7.2: Let $A$ and $B$ be subsets of $\Sigma^*$ and let $m$ and $n$ be arbitrary elements of $\mathbf{N}$. Then
(a) $A^m A^n = A^{m+n}$
(b) $(A^m)^n = A^{mn}$
(c) $A \subset B \Rightarrow A^n \subset B^n$

Proof: The proofs of parts (a) and (b) are left as exercises. The proof of part (c) is by induction on $n$:
1. (Basis) Since $A^0 = \{\Lambda\}$ and $B^0 = \{\Lambda\}$, it follows that $A^n \subset B^n$ if $n = 0$.
2. (Induction) We wish to prove that for all $n$, if $A^n \subset B^n$, then $A^{n+1} \subset B^{n+1}$. By Theorem 2.7.1(d), if $A^n \subset B^n$ and $A \subset B$, then $A^n A \subset B^n B$, i.e., $A^{n+1} \subset B^{n+1}$. ∎

We have used the notation $\Sigma^*$ to denote the set of all finite strings formed by concatenating elements of $\Sigma$. This notation can be extended in a natural way to any subset of $\Sigma^*$. We use the symbols “$*$” and “$+$” to denote unary operations (called closure operations) on languages.
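Definition 2.7.6: Let $A$ be a subset of $\Sigma^*$. Then the set $A^*$ (read “$A$ star”) is defined to be

$$A^* = \bigcup_{n \in \mathbf{N}} A^n,$$

i.e.,

$$A^* = A^0 \cup A^1 \cup A^2 \cup A^3 \cup \cdots = \{\Lambda\} \cup A \cup A^2 \cup A^3 \cup \cdots$$

The set $A^*$ is often called the star closure, Kleene closure, or simply the closure of $A$. The set $A^+$ (read “$A$ plus”) is defined to be

$$A^+ = \bigcup_{n=1}^{\infty} A^n,$$

i.e.,

$$A^+ = A \cup A^2 \cup A^3 \cup \cdots$$

The set $A^+$ is often called the positive closure of $A$. Note that $x \in A^+$ if and only if $x \in A^n$ for some positive $n \in \mathbf{N}$, and $x \in A^*$ if and only if $x \in A^n$ for some arbitrary $n \in \mathbf{N}$.

Examples
(a) If $A = \{a\}$, then
$$A^+ = \{a\} \cup \{aa\} \cup \{aaa\} \cup \cdots = \{a^n \mid n \geq 1\};$$
$$A^* = \{\Lambda\} \cup A^+ = \{a^n \mid n \geq 0\}.$$
(b) $\emptyset^* = \{\Lambda\} \cup \emptyset \cup \emptyset^2 \cup \cdots = \{\Lambda\}$; $\emptyset^+ = \emptyset$. #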
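Because $A^*$ and $A^+$ are usually infinite, a program can only enumerate them up to some bound. The sketch below makes that assumption explicit: `star_up_to(a, max_len)` and `plus_up_to(a, max_len)` (hypothetical names) return only those strings of $A^*$ and $A^+$ whose length is at most `max_len`. The helper `plus_up_to` forms the product of the bounded star with $A$, which is a way of obtaining $A^+$ that agrees with the definition.

```python
def set_product(a, b):
    """AB = {xy | x in A and y in B}."""
    return {x + y for x in a for y in b}


def star_up_to(a, max_len):
    """The strings of A* whose length is at most max_len."""
    result = {""}                     # A^0 = {empty string}
    frontier = {""}
    while frontier:
        # Append one more element of A to each string found most recently.
        longer = {w for w in set_product(frontier, a) if len(w) <= max_len}
        frontier = longer - result
        result |= frontier
    return result


def plus_up_to(a, max_len):
    """The strings of A+ whose length is at most max_len."""
    return {w for w in set_product(star_up_to(a, max_len), a)
            if len(w) <= max_len}


assert star_up_to({"a"}, 3) == {"", "a", "aa", "aaa"}
assert plus_up_to({"a"}, 3) == {"a", "aa", "aaa"}
assert star_up_to(set(), 3) == {""}       # the star closure of the empty set
assert plus_up_to(set(), 3) == set()      # its positive closure
```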
The following theorem characterizes some important properties of the language closure operations.

Theorem 2.7.3: Let $A$ and $B$ be languages over $\Sigma$ and let $n \in \mathbf{N}$. Then the following relationships hold.
(a) $A^* = \{\Lambda\} \cup A^+$
(b) $A^n \subset A^*$ for $n \geq 0$
(c) $A^n \subset A^+$ for $n \geq 1$
(d) $A \subset AB^*$
(e) $A \subset B^*A$
(f) $(A \subset B) \Rightarrow (A^* \subset B^*)$
(g) $(A \subset B) \Rightarrow (A^+ \subset B^+)$
(h) $AA^* = A^*A = A^+$
(i) $\Lambda \in A \Leftrightarrow A^+ = A^*$
(j) $(A^*)^* = (A^+)^* = A^*$
(k) $A^*A^+ = A^+A^* = A^+$
(l) $(A^*B^*)^* = (A \cup B)^* = (A^* \cup B^*)^*$
Proof: Parts (a), (b), and (c) are immediate from the definition of $A^n$, $A^+$, and $A^*$.
(d) ($A \subset AB^*$.) By part (a), $B^* = \{\Lambda\} \cup B^+$. Therefore, $AB^* = A(\{\Lambda\} \cup B^+) = A \cup AB^+$, which contains $A$. A similar proof establishes (e).
(f) ($A \subset B \Rightarrow A^* \subset B^*$.) If $x \in A^*$, then $x \in A^n$ for some $n \geq 0$. But $A \subset B$ so by Theorem 2.7.2, $A^n \subset B^n$. Therefore, $x \in B^n$ and from part (b) it follows that $x \in B^*$. A similar argument holds for part (g).
(h) We show only $A^*A = A^+$. An intuitively appealing argument can be constructed by noting $A^* = A^0 \cup A^1 \cup A^2 \cup A^3 \cup \cdots$ and therefore

$$\begin{aligned}
A^*A &= (A^0 \cup A^1 \cup A^2 \cup A^3 \cup \cdots)A \\
&= A^0A \cup A^1A \cup A^2A \cup \cdots \\
&= A^1 \cup A^2 \cup A^3 \cup \cdots \\
&= A^+.
\end{aligned}$$

The preceding argument, while valid, uses the fact that set product distributes over infinite unions, which we have not proved. The following alternative argument does not use this fact.

$$\begin{aligned}
x \in A^*A &\Leftrightarrow x = yz \text{ for some } y \in A^* \text{ and } z \in A \\
&\Leftrightarrow x = yz \text{ for some } y \in A^n \text{ and } z \in A \text{ and } n \in \mathbf{N}
\end{aligned}$$
…
B, we can substitute $B$ for $X$ in the right side of $X \supset AX$ and conclude that $X \supset AB$. Repeating the substitution, we can conclude $X \supset AAB$, $X \supset AAAB$, etc., and in general $X \supset A^nB$. Thus $X \supset A^*B$.

Now consider a string $x \in X$. Since $X = AX \cup B$, and all strings in $A$ are nonempty, it follows that either $x \in B$, or else $x$ has a nonempty prefix such that the prefix is in $A$ and removal of the prefix yields another (shorter) string in $X$. By the same reasoning, this shorter string has the same property; either it is in $B$ or we can remove another nonempty prefix and obtain another string in $X$. Since the original string was of finite length, after stripping off a sufficient number of nonempty prefixes we will eventually obtain a string in $B$. It follows that the original string must have consisted of a (possibly empty) sequence of prefixes, each of which is in $A$, followed by a suffix which is in $B$. Thus the original string must have been a member of $A^*B$. The following proof of the theorem is a formalization of these arguments.
Proof: Let $X$ denote an arbitrary solution to the equation. We will show $X = A^*B$.
(a) We show $X \supset A^*B$ by establishing that if $X$ is a solution, then $X \supset A^nB$ for all $n \in \mathbf{N}$.
1. (Basis) For $n = 0$, $A^n = \{\Lambda\}$, and $A^0B = B$. Since $X \supset B$, it follows that $X \supset A^0B$.
2. (Induction) Assume $X \supset A^nB$. Since $X \supset AX$, it follows that $X \supset A(A^nB) = A^{n+1}B$.
This completes the inductive proof that $X \supset A^nB$ for all $n \in \mathbf{N}$. It is left as an exercise to show that $A^*B = \bigcup_{n=0}^{\infty} A^nB$. Hence, $X \supset A^*B$.
(b) We show $X \subset A^*B$ using the Second Principle of Mathematical Induction on the length of strings in $\Sigma^*$. We wish to show that if $x \in X$, then $x \in A^*B$. The induction hypothesis asserts that every string shorter than $x$ has this property. Let $\|x\|$ denote the length of $x \in \Sigma^*$. Then the induction hypothesis is the following quantified implication.

$$\forall w[\|w\| < \|x\| \Rightarrow [w \in X \Rightarrow w \in A^*B]]$$

We use this hypothesis in a direct proof that if $x \in X$ then $x \in A^*B$. Since $X = AX \cup B$, if $x \in X$ then either $x \in AX$ or $x \in B$.
Case 1: If $x \in B$, then $x \in A^*B$.
Case 2: Suppose $x \in AX$. Then $x = yz$ where $y \in A$ and $z \in X$. But $\Lambda \notin A$ so $y \neq \Lambda$ and hence $\|z\| < \|x\|$. By the induction hypothesis, it follows that $z \in A^*B$. Thus $x = yz \in AA^*B \subset A^*B$.
This completes the inductive proof that if $x \in X$ then $x \in A^*B$, and establishes that $X \subset A^*B$.
Parts (a) and (b) of the proof establish that if $X$ is any solution to $X = AX \cup B$, then $X = A^*B$. However, the proof of the theorem is not yet complete, since we have not shown that a solution always exists. We leave it to the reader to show that $X = A^*B$ is a solution to the equation $X = AX \cup B$. ∎
Examples
(a) If $A = \{a\}$ and $B = \emptyset$, then the equation $X = AX \cup B$ has the unique solution $X = A^*B = \emptyset$.
(b) If $A = \{a, ab\}$ and $B = \{cc\}$, then the equation $X = AX \cup B$ has the solution $X = \{a, ab\}^*\{cc\}$. #
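Example (b) can be checked mechanically if we agree to look only at strings up to a fixed length. The sketch below rests on that assumption: every language is truncated to strings of length at most `N`, and the names `set_product`, `star_up_to`, and `truncate` are hypothetical helpers. Truncation is harmless here because the elements of $A$ are nonempty, so any string of $AX$ of length at most `N` arises from a string of $X$ of length at most `N`.

```python
def set_product(a, b):
    """AB = {xy | x in A and y in B}."""
    return {x + y for x in a for y in b}


def truncate(lang, max_len):
    """Keep only the strings of length at most max_len."""
    return {w for w in lang if len(w) <= max_len}


def star_up_to(a, max_len):
    """The strings of A* whose length is at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = truncate(set_product(frontier, a), max_len) - result
        result |= frontier
    return result


A = {"a", "ab"}
B = {"cc"}
N = 8

X = truncate(set_product(star_up_to(A, N), B), N)     # A*B, truncated
rhs = truncate(set_product(A, X) | B, N)              # AX U B, truncated
assert X == rhs                                       # X = AX U B up to length N
assert "cc" in X and "aabcc" in X and "bcc" not in X
```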
Problems: Section 2.7

1. Let $A = \{\Lambda, a\}$, $B = \{ab\}$. List the elements of the following sets.
(a) $A^2$
(b) $B^3$
(c) $AB$
(d) $A^*$
(e) $B^*$

2. Let $A$, $B$, and $C$ be languages over $\Sigma$. Prove the following relationships.
(a) $A(BC) = (AB)C$
(b) $A^mA^n = A^{m+n}$ for all $m, n \geq 0$. (This implies that $\{\Lambda\}A = A\{\Lambda\} = A$.)
(c) $(A^m)^n = A^{mn}$ for all $m, n \geq 0$

3. Let $A$ and $B$ be languages such that $A^2 = B^2$. Does it follow that $A = B$? Prove your assertion.
4. While $A^* = A^+ \cup \{\Lambda\}$, it is not generally true that $A^+ = A^* - \{\Lambda\}$. For $\Sigma = \{a\}$, find the smallest set $A$ such that $A^+ \neq A^* - \{\Lambda\}$.

5. (a) Prove that the operation of set product distributes over infinite union, i.e., show that

$$A\Bigl(\bigcup_{i \in \mathbf{N}} B_i\Bigr) = \bigcup_{i \in \mathbf{N}} (AB_i).$$

A similar proof can be used to show the other distributive law,

$$\Bigl(\bigcup_{i \in \mathbf{N}} B_i\Bigr)A = \bigcup_{i \in \mathbf{N}} (B_iA).$$

(b) Prove that $A^*B = \bigcup_{i=0}^{\infty} A^iB$.

6. Let $A$ and $B$ be arbitrary languages over $\Sigma$. Prove the following.
(a) $(A^*)^* = A^*$
(b) $\Lambda \in A \Leftrightarrow A^+ = A^*$
(c) $(A^*)^+ = A^*$
(d) $A^*A^+ = A^+$
(e) $(A^*B^*)^* = (A^* \cup B^*)^*$
7. Show that if $A \neq \emptyset$ and $A^2 = A$, then $A^* = A$.

8. Let $A$, $B$, and $C$ be languages over $\Sigma$. Determine which of the following assertions are true and give counterexamples for those that are false.
(a) $(A^*)^n = (A^n)^*$ for any $n \in \mathbf{N}$
(b) $(AB)^* = (BA)^*$
(c) $(A - B)C = AC - BC$
(d) $A^* \subset B^* \Rightarrow A \subset B$
(e) $(A^*B^*)^* = (B^*A^*)^*$
(f) $A \cup B \cup C \subset A^+B^+C^+$
(g) $(A^+)^* = A^+$
(h) $(\overline{A})^* = \overline{(A^*)}$, where $\overline{B} = \Sigma^* - B$
(i) $(AB)^*A = A(BA)^*$
(j) $(A^*B)^*A^* = (A^* \cup B^*)^*$
(k) $A^+ = A^+A^+$

9. Let $E_1, E_2, \ldots, E_n$ be subsets of $\Sigma^*$. Is it always true that

$$(E_1 \cup E_2 \cup \cdots \cup E_n)^* = (E_1^* E_2^* \cdots E_n^*)^*?$$

Prove your assertion.

10. Complete the proof of Theorem 2.7.4 by showing that $X = A^*B$ is a solution to the equation $X = AX \cup B$.

11. Assume the same hypotheses on $A$ and $B$ as in Theorem 2.7.4. Find the solutions to the equation $X = XA \cup B$. Prove your assertion.
12. Suppose $X = AX \cup B$ and $\Lambda \in A$. Show that if $C \supset B$ then $X = A^*C$ is a solution.

13. Let $A = \{a\}$, $B = \{b\}$. Using Theorem 2.7.4, find subsets $X_1$, $X_2$ of $\{a, b\}^*$ which solve the following set of simultaneous set equations. (Hint: Solve for one variable in terms of the remaining variables and then substitute.)
(a) $X_1 = AX_1 \cup BX_2$
(b) $X_2 = AX_2$

14. Use finite sets and set operations to characterize the following languages over $\Sigma = \{a, b\}$. For example, the set of strings of even length is $\{aa, ab, ba, bb\}^*$.
(a) The set of strings of odd length.
(b) The set of strings which contain exactly one occurrence of $a$.
(c) The set of strings which either begin with an $a$ or end with 2 $b$’s or both.
(d) The set of strings which contain at least 3 consecutive $a$’s.
(e) The set of strings which contain the substring “$bbab$.”
Suggestions for Further Reading

The book by Halmos [1960] is an excellent introduction to set theory as well as many of the mathematical topics we treat in Chapters 3, 4, and 6. Axiomatic treatments of set theory can be found in Suppes [1960] and Monk [1969]. Wilder [1965] discusses the set theory paradoxes and their role in the development of axiomatic set theory. The classical development of the natural numbers from the Peano axioms, followed by a development of the rational, real, and complex numbers, is given by Landau [1951]; it is an excellent introduction to formal mathematics. The work by Knuth [1974] follows two young lovers on an uninhabited shore of the Indian Ocean as they consider some of the same foundational questions as Landau. Knuth’s book is readable and it conveys the spirit of how one goes about doing mathematics; the reader also learns something about the natural numbers. The first use of Backus-Naur Form for describing the syntax of a programming language occurs in the Revised Report on the Algorithmic Language ALGOL 60, which is reprinted in Rosen [1967]. This notation is often used in presenting context-free grammars; the reader is referred to Aho and Ullman [1972].
3 BINARY RELATIONS

3.0 INTRODUCTION

Relations characterize structure. In the last chapter we studied sets and their elements. In this section we will study some basic forms of structure which can be represented by relationships between elements of sets. Relations are of fundamental importance to both the theory and applications areas of computer science. A composite data structure, such as an array, list, or tree, is generally used to represent a set of data objects together with a relation which holds between members of the set. Relations which are a part of a mathematical model are often implicitly represented by relations within a data structure. Numerical applications, information retrieval, and network problems are examples of application areas where relations occur as a part of the problem description, and manipulation of the relations is important in solution procedures. Relations also play an important role in the theory of computation, including program structure and analysis of algorithms. In this chapter we will develop some of the fundamental tools and concepts associated with relations.
3.1 BINARY RELATIONS AND DIGRAPHS

The mathematical concept of relation is based on the common notion of relationships among objects. Some relations describe comparisons between elements of a set: one box is heavier than another, one man is richer than another, one event occurred prior to another, etc. Other relations involve elements of different sets, such as “$x$ lives in $y$” where $x$ is a human and $y$ is a city, “$x$ is owned by $y$” where $x$ is a building and $y$ is a corporation, or “$x$ was born in the country $y$ in the year $z$.”
The examples we have given are all relationships between either two or three objects, but in principle we can describe relationships which hold for $n$ objects, where $n$ is any positive integer. When making an assertion that a relationship holds among $n$ objects, it is often necessary to specify not only the objects themselves but also an ordering of the objects; for example, only the relative positions of 6 and 4 differ in the two assertions “$6 < 4$” and “$4 < 6$”, yet one assertion is false and the other is true. We will use “ordered $n$-tuples of elements” to specify a finite sequence of not necessarily distinct objects; the relative positions of the objects in the sequence will provide the necessary ordering of the objects.

Definition 3.1.1: For $n > 0$, an ordered $n$-tuple (or simply $n$-tuple) with $i$th component $a_i$ is a sequence of $n$ objects denoted by $\langle a_1, a_2, \ldots, a_n \rangle$. Two ordered $n$-tuples are equal if and only if their $i$th components are equal for all
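… $R$. Suppose $R''$ is a reflexive relation on $A$ and $R'' \supset R$. We must show $R'' \supset R'$. Consider an arbitrary $\langle a, b \rangle \in R'$. Then, since $R' = R \cup E$, either $a = b$ or $\langle a, b \rangle \in R$. If $a = b$, then $\langle a, b \rangle \in R''$ since $R''$ is reflexive. If $\langle a, b \rangle \in R$, then $\langle a, b \rangle \in R''$ since $R'' \supset R$. Thus, if $\langle a, b \rangle \in R'$, then $\langle a, b \rangle \in R''$. Consequently, the conditions of Definition 3.5.1 are satisfied and $R' = r(R)$. ∎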
Examples
(a) The reflexive closure of the relation < on the integers I is $\le$.
(b) The converse of the relation $\subset$ on a collection of sets is the relation $\supset$.
The following theorem states some of the properties of converses.
#
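Theorem 3.5.3: Let $R$, $R_1$, and $R_2$ be binary relations from A to B. Then each of the following holds.
(a) $(R^c)^c = R$
(b) $(R_1 \cup R_2)^c = R_1^c \cup R_2^c$
(c) $(R_1 \cap R_2)^c = R_1^c \cap R_2^c$
(d) $(A \times B)^c = B \times A$
(e) $\emptyset^c = \emptyset$
(f) $(\overline{R})^c = \overline{(R^c)}$, where $\overline{R}$ denotes $(A \times B) - R$
(g) $(R_1 - R_2)^c = R_1^c - R_2^c$
(h) If A = B, then $(R_1 R_2)^c = R_2^c R_1^c$
(i) $R_1 \subset R_2 \Rightarrow R_1^c \subset R_2^c$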
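Proof:
(a) ($(R^c)^c = R$.) Let $\langle x, y \rangle$ be an arbitrary element of R. Then $\langle x, y \rangle \in R \Leftrightarrow \langle y, x \rangle \in R^c \Leftrightarrow \langle x, y \rangle \in (R^c)^c$; therefore $(R^c)^c = R$.
(b) ($(R_1 \cup R_2)^c = R_1^c \cup R_2^c$.) $\langle x, y \rangle \in (R_1 \cup R_2)^c \Leftrightarrow \langle y, x \rangle \in R_1 \cup R_2 \Leftrightarrow \langle y, x \rangle \in R_1 \vee \langle y, x \rangle \in R_2 \Leftrightarrow \langle x, y \rangle \in R_1^c \vee \langle x, y \rangle \in R_2^c \Leftrightarrow \langle x, y \rangle \in R_1^c \cup R_2^c$.
(f) ($(\overline{R})^c = \overline{(R^c)}$.) $\langle x, y \rangle \in (\overline{R})^c \Leftrightarrow \langle y, x \rangle \in \overline{R} \Leftrightarrow \langle y, x \rangle \notin R \Leftrightarrow \langle x, y \rangle \notin R^c \Leftrightarrow \langle x, y \rangle \in \overline{(R^c)}$.
(g) ($(R_1 - R_2)^c = R_1^c - R_2^c$.) Using the identity $R_1 - R_2 = R_1 \cap \overline{R_2}$, we have $(R_1 - R_2)^c = (R_1 \cap \overline{R_2})^c = R_1^c \cap (\overline{R_2})^c = R_1^c \cap \overline{(R_2^c)} = R_1^c - R_2^c$.
The proofs of the remaining parts are left as exercises. ∎
Theorem 3.5.4: Let R be a binary relation on A. Then R is symmetric if and only if $R = R^c$.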
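The properties of converses can be spot-checked on small finite relations. The following is a minimal sketch, assuming a relation is stored as a Python set of ordered pairs; the helper names `converse`, `compose`, and `is_symmetric` and the sample relations are illustrative choices, not notation from the text. A check on one example is of course not a proof.

```python
def converse(r):
    """Return the converse of r: every ordered pair reversed."""
    return {(b, a) for (a, b) in r}


def compose(r1, r2):
    """Return the composite r1r2: <a, b> with <a, c> in r1 and <c, b> in r2."""
    return {(a, b) for (a, c) in r1 for (c2, b) in r2 if c == c2}


def is_symmetric(r):
    """Test symmetry by comparing r with its converse (Theorem 3.5.4)."""
    return r == converse(r)


# Hypothetical sample relations on {1, 2, 3}, chosen only for illustration.
r1 = {(1, 2), (2, 3)}
r2 = {(2, 2), (3, 1)}

print(converse(converse(r1)) == r1)                               # part (a)
print(converse(r1 | r2) == (converse(r1) | converse(r2)))         # part (b)
print(converse(r1 - r2) == (converse(r1) - converse(r2)))         # part (g)
print(converse(compose(r1, r2)) == compose(converse(r2), converse(r1)))  # part (h)
print(is_symmetric(r1), is_symmetric(r1 | converse(r1)))
```

Note that part (h) reverses the order of the factors, which the check on `compose` makes visible.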
We leave the proof as an exercise. The converse of a relation R is closely related to s(R), the symmetric closure of R. Let R be a binary relation on a set A and let D be the digraph associated with R. The digraph of the symmetric closure of R can be obtained from D by making all the arcs of D into “two-way” edges so that if there is an arc from a to b, then there is also one from b to a. Expressed in terms of $R^c$, this becomes the following theorem.
Theorem 3.5.5: Let R be a relation on a set A. Then $s(R) = R \cup R^c$.
Proof: We must show that $R \cup R^c$ is the smallest symmetric relation which contains R. We first observe that $R \cup R^c$ contains R. Furthermore, by Theorem 3.5.3, $R \cup R^c$ is symmetric since $(R \cup R^c)^c = R^c \cup (R^c)^c = R^c \cup R$. Now suppose $R'$ is symmetric and $R' \supset R$. We must show $R' \supset R \cup R^c$. Let
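… n > 0.
1. (Basis) From Definition 3.5.1, part (ii), it is immediate that $R \subset t(R)$.
2. (Induction) Suppose $R^n \subset t(R)$, $n \ge 1$, and let $\langle a, b \rangle \in R^{n+1}$. Since $R^{n+1} = R^n R$, there exists some $c \in A$ such that $\langle a, c \rangle \in R^n$ and $\langle c, b \rangle \in R$. By the induction hypothesis and the basis step, $\langle a, c \rangle \in t(R)$ and $\langle c, b \rangle \in t(R)$. Because t(R) is transitive it follows that $\langle a, b \rangle \in t(R)$, thus establishing that $R^{n+1} \subset t(R)$.
Since $R^n \subset t(R)$ for all $n \ge 1$, we conclude that $\bigcup_{i=1}^{\infty} R^i \subset t(R)$.
(ii) $t(R) \subset \bigcup_{i=1}^{\infty} R^i$. We first show that $\bigcup_{i=1}^{\infty} R^i$ is transitive. Let $\langle a, b \rangle$ and $\langle b, c \rangle$ be arbitrary elements of $\bigcup_{i=1}^{\infty} R^i$. Then for some integers $s \ge 1$ and $t \ge 1$, $\langle a, b \rangle \in R^s$ and $\langle b, c \rangle \in R^t$. Then $\langle a, c \rangle \in R^s R^t$, and by Theorem 3.4.3, $R^s R^t = R^{s+t}$. Thus $\langle a, c \rangle \in \bigcup_{i=1}^{\infty} R^i$ and therefore $\bigcup_{i=1}^{\infty} R^i$ is transitive. Since t(R) is contained in every transitive relation which contains R, it follows that $t(R) \subset \bigcup_{i=1}^{\infty} R^i$. ∎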
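The characterization of t(R) as the union of the powers of R suggests a direct computation for finite relations. The sketch below assumes a relation is stored as a Python set of ordered pairs; the function names and the sample relation are illustrative only. It accumulates R, R², R³, ... and stops as soon as a power contributes no new pairs, which is safe because every later power is then already contained in the accumulated union.

```python
def compose(r1, r2):
    """Return the composite r1r2: <a, b> with <a, c> in r1 and <c, b> in r2."""
    return {(a, b) for (a, c) in r1 for (c2, b) in r2 if c == c2}


def transitive_closure(r):
    """Return the union of the powers R, R^2, R^3, ... of a finite relation."""
    closure = set(r)
    power = set(r)                 # current power of r, starting with r itself
    while True:
        power = compose(power, r)  # next power: (previous power) composed with r
        if power <= closure:       # nothing new, so later powers add nothing
            return closure
        closure |= power


# Hypothetical sample relation: a chain 1 -> 2 -> 3 -> 4.
chain = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(chain)))
```

This is not an efficient method for large relations; it is meant only to mirror the statement of the theorem.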
… on S as
$S_i \rightarrow S_j \Leftrightarrow S_i$ calls $S_j$.
Some procedure of the set S is recursive if the digraph $\langle S, \rightarrow \rangle$ contains a directed cycle of nonzero length. Let S = {A, B, C, D, E} and suppose
A calls B and E,
B calls C,
C calls E,
D calls C, and
E calls B.
Does the set S contain any recursive procedures?
12. Let $A = \{a\}^* = \{a^n \mid n \ge 0\}$, and B be the singleton set B = {z} where z is an infinite string of a’s: B = {aaaa...}. Let R be the relation on $A \cup B$ defined as follows:
$\langle x, y \rangle \in R \Leftrightarrow y = xa$.
Prove or disprove that … $\in R^+$.
13. Let $A = \{a_1, a_2, \ldots, a_n\}$ be a set with n elements and let $R'$ and $R''$ be binary relations on A. The incidence matrix $M'$ of the relation $R'$ is the $n \times n$ matrix defined as follows:
$M'[i, j] = 1$ if $a_i R' a_j$,
$= 0$ otherwise.
The matrix $M''$ is defined in the analogous way. Let the operations of matrix addition and multiplication be defined in the usual way but using the following operations on matrix entries: $0 = 0 \cdot x = x \cdot 0 = 0 + 0$ and $1 = 1 + x = x + 1 = 1 \cdot 1$, where x = 0 or x = 1.
(a) Find the incidence matrix for $R' \cup R''$ in terms of $M'$, $M''$, and the operations of matrix addition and multiplication.
(b) Find the incidence matrix for $R'R''$.
Let M be the incidence matrix for R.
(c) Find the incidence matrix for $R^c$.
(d) Find the incidence matrices for $R^+$ and $R^*$.
(e) Find the smallest relation R on the set {a, b, c}, for which the incidence matrix for $R^+$ is
1 1 1
1 1 1
1 1 1
14.
(For students with an understanding of elementary probability.) Consider the following four dice, which we will call A, B, C, and D.
If two dice x and y are chosen and rolled, we say “x beats y” if a higher number shows on x than y.
(a) For each pair of dice x and y, calculate the probability that x beats y. Present your results as a two-dimensional array whose entries are probabilities.
Let R denote the binary relation “is more likely to win than” on the set {A, B, C, D} where R is defined as follows: $xRy \Leftrightarrow$ the probability that x beats y is greater than 1/2.
(b) Give the digraph associated with the relation R.
(c) Find the transitive closure of R.
(d) Is the relation R transitive?
(e) Suppose someone proposes the following game. You may choose whichever die you like from the set {A, B, C, D}. After your selection, your opponent will select a die from the remaining three dice. You then roll the two dice; the winner is the person whose die beats the other. The loser pays the winner $1. Assuming your moral character is such that this proposal does not make your skin crawl, would you accept, and why?
Programming Problem
Write a program which, when given a set of integers S and a relation R on S (specified as a set of ordered pairs), produces r(R), s(R) and t(R).
3.6 ORDER RELATIONS
An order relation is a transitive relation on a set which provides a means to compare elements of the set, although such a relation may not permit a comparison of any two elements of the set. We will consider several types of order relations in this section.
Definition 3.6.1: A binary relation R on a set A is a partial order if R is reflexive, antisymmetric, and transitive. The ordered pair $\langle A, R \rangle$ is a partially ordered set, or a poset. The relation R is said to be a partial order on A.
It follows from the preceding definition that a partially ordered set is also a digraph whose relation is a partial order on the set of nodes. We will use the symbol $\le$ to denote an arbitrary partial order; thus, if R is an unspecified partial order, we will usually write either $a \le b$ or $b \ge a$ rather than aRb.
Examples
(a) The relation of set containment is a partial order on any collection of subsets of a set A; that is, $\subset$ is a partial order on P(A) and …
… (0, 1) by f(x) = x + 1/4. Then $fg: [0, 1] \to (0, 1)$ is the injection fg(x) = x/2 + 1/4. The image of [0, 1] is the interval [1/4, 3/4] which is contained in the open interval (0, 1). #
The converse of each part of Theorem 4.2.1 is false, but the following theorem provides a “partial converse” to each of its assertions; its proof is left as an exercise.
Theorem 4.2.2: Let fg be a composite function.
(a) If fg is surjective, then f is surjective.
(b) If fg is injective, then g is injective.
(c) If fg is bijective, then f is surjective and g is injective.
The following classes of functions are also useful.
Definition 4.2.2: A function $f: A \to B$ is a constant function if there exists some $b \in B$ such that f(a) = b for every $a \in A$, i.e., f(A) = {b}.
Definition 4.2.3: The identity function on A, denoted $1_A$, is the function on A such that $1_A(a) = a$ for all $a \in A$.
Note that every identity function $1_A$ is a bijection. The next theorem asserts that if $f: A \to B$, then the identity function on A is a “right identity” for f and the identity function on B is a “left identity” for f.
Theorem 4.2.3: Let $f: A \to B$. Then $f = f \circ 1_A = 1_B \circ f$.
The proof is left as an exercise. The following commutative diagram represents Theorem 4.2.3.
[Commutative diagram: $f: A \to B$ together with the identity functions $1_A$ and $1_B$]
Definition 4.2.4: A permutation on A is a bijective function on A.
Examples
(a) The identity function on a set A is a permutation on A.
(b) The function $f: \{0, 1, 2\} \to \{0, 1, 2\}$, where f(0) = 1, f(1) = 0 and f(2) = 2, is a permutation.
(c) The function $f: I \to I$, where f(x) = x + 3, is a permutation on the integers. #
The result of applying a permutation f on A to the entire domain A is a “rearrangement” of A where $a \in A$ is replaced by f(a). A rearrangement of A is often called a permutation of the set A. Since every permutation is a bijection and the composite of two bijections is a bijection, it follows that the composite of two permutations is a permutation. This can be expressed by saying that permutations are closed under (the operation of) composition.
When the domain and codomain of a function are linearly ordered, the following special terminology is used to describe functions which preserve or reverse the order of elements of the domain. We will state the definitions for functions from R to R, but the concept generalizes in a straightforward way to other linearly ordered sets.
Definition 4.2.5: A function $f: R \to R$ is monotone increasing if $x \le y \Rightarrow f(x) \le f(y)$; f is strictly monotone increasing if $x < y \Rightarrow f(x) < f(y)$; f is monotone decreasing if $x \le y \Rightarrow f(x) \ge f(y)$; and f is strictly monotone decreasing if $x < y \Rightarrow f(x) > f(y)$.
If f is strictly monotone increasing, then f is monotone increasing; if f is strictly monotone decreasing, then f is monotone decreasing.
Examples
(a) Let $f: N \to N$ and f(x) = x + 1. Then f is strictly monotone increasing.
(b) Any constant function on R is both monotone increasing and monotone decreasing.
(c) The function $f: R \to R$ such that $f(x) = x^2$ is neither monotone increasing nor monotone decreasing. #
Inverse Functions
If f is a bijection from A to B, then f consists of a set of ordered pairs with the property that every element $a \in A$ appears exactly once as the first element of a pair and every element $b \in B$ appears exactly once as the second element of a pair. The converse relation, formed by reversing the ordered pairs of f, is a relation with the same properties, i.e., the converse of f is a bijection from B to A.
Definition 4.2.6: Let $f: A \to B$ be a bijection from A to B. The inverse function of f, denoted $f^{-1}$, is the converse relation of f.
Note that the inverse function $f^{-1}$ is defined only if f is a bijection.
Theorem 4.2.4: Let f be a bijective function, $f: A \to B$. Then $f^{-1}$ is a bijective function and $f^{-1}: B \to A$.
Proof: Consider the sets of ordered pairs corresponding to f and $f^{-1}$.
$f = \{\langle a, b \rangle \mid a \in A \wedge b \in B \wedge f(a) = b\}$,
$f^{-1} = \{\langle b, a \rangle \mid \langle a, b \rangle \in f\}$.
Since f is surjective, every $b \in B$ occurs in an ordered pair …
… B, where A = {X, Y}, B = {1, {1}} and f(X) = 1, f(Y) = {1}. Then, using the inverse function of the bijection f,
$f^{-1}(\{1\}) = Y$.
But, using the induced function from P(B) to P(A),
$f^{-1}(\{1\}) = \{X\}$.
#
If $A \ne \emptyset$ and $f: A \to B$, then the collection of sets $\{f^{-1}(\{b\}) \mid b \in B\}$ forms a partition of A, and the associated equivalence relation is known as the equivalence relation induced by f. Two elements are equivalent under this relation if the function f maps them to the same element of B.
Theorem 4.2.6: Let $f: A \to B$ and define the binary relation ~ on A as follows:
$a \sim b \Leftrightarrow f(a) = f(b)$.
Then ~ is an equivalence relation on A.
The proof is left as an exercise.
Example
Let A = {1, 2, 3, 4}, B = {a, b, c}, and $f: A \to B$. If f(1) = a, f(2) = b, f(3) = c and f(4) = c, then the equivalence relation on A induced by f has equivalence classes {1}, {2}, and {3, 4}. #
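The equivalence classes induced by a function on a finite domain can be computed by grouping elements according to their images. This is a minimal sketch; the name `induced_partition` and the dictionary encoding of f are assumptions made for illustration.

```python
from collections import defaultdict


def induced_partition(domain, f):
    """Group the elements of domain by their image under f.

    Two elements land in the same block exactly when f maps them
    to the same value, i.e. when they are equivalent under ~.
    """
    blocks = defaultdict(set)
    for a in domain:
        blocks[f(a)].add(a)
    return {frozenset(block) for block in blocks.values()}


# The function of the example above, encoded as a dictionary.
f_table = {1: "a", 2: "b", 3: "c", 4: "c"}
classes = induced_partition(f_table.keys(), f_table.get)
print(sorted(sorted(block) for block in classes))
```

Only nonempty blocks are produced, so elements of B outside the image of f contribute nothing.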
Definition 4.2.8: Let R be an equivalence relation on a set A. The function
$g: A \to A/R$, $g(a) = [a]_R$,
is the canonical map from A to the quotient set A/R.
Example
Let A = {1, 2, 3} and let ~ be an equivalence relation on A with equivalence classes {1, 2} and {3}. Then the canonical map from A to A/~ is the function g defined as follows:
$g: \{1, 2, 3\} \to \{\{1, 2\}, \{3\}\}$, g(1) = {1, 2}, g(2) = {1, 2}, g(3) = {3}.
#
The following definitions give us additional facilities for creating and modifying functions. The first definition allows us to form a new function by deleting part of the domain of a given function.
Definition 4.2.9: Let $f: A \to B$, and let $A'$ be a subset of the domain of f. The restriction of f to $A'$ is the function denoted $f|_{A'}$ and defined as
$f|_{A'}: A' \to B$, $f|_{A'}(x) = f(x)$.
The next definition enables us to enlarge the domain of a function.
Definition 4.2.10: Let $f: A' \to B$, $g: A \to B$, and $A \supset A'$. Then g is an extension of f to the domain A if $g|_{A'} = f$.
Examples
Let f and g be defined by the following digraphs.
[Figure: digraphs of the functions f and g]
… {0, 1},
$\chi_{A'}(a) = 1$ for $a \in A'$,
$\chi_{A'}(a) = 0$ for $a \notin A'$.
The domain of a characteristic function is not specified by the notation $\chi_{A'}$ and is usually implicit in the discussion.
Examples
(a) Let A = {a, b, c} and let $A'$ = {a}. Then $\chi_{A'}(a) = 1$, $\chi_{A'}(b) = 0$, $\chi_{A'}(c) = 0$.
(b) Let A = [0, 1] and $A'$ = [1/2, 1]. The following is a graph of the function $\chi_{A'}$.
[Figure: graph of $\chi_{A'}$ on the interval [0, 1]]
#
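A characteristic function is easy to model in code. The sketch below is one possible encoding, assuming the subset is given as a finite collection; the factory name `characteristic_function` is hypothetical. As in the text, the domain is left implicit: the returned function accepts any argument and reports 0 for anything outside the subset.

```python
def characteristic_function(subset):
    """Return the function that maps members of subset to 1 and everything else to 0."""
    members = frozenset(subset)

    def chi(a):
        return 1 if a in members else 0

    return chi


# Example (a) above: A = {a, b, c} and A' = {a}.
chi = characteristic_function({"a"})
print([chi(x) for x in ("a", "b", "c")])
```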
†One-Sided Inverse Functions
Earlier in this section we established that if $f: A \to B$ is a bijective function, then an inverse function $f^{-1}$ is defined and $f^{-1}f = 1_A$ and $ff^{-1} = 1_B$. In the first case above, we say $f^{-1}$ is acting as a left inverse and in the second case as a right inverse. Because $f^{-1}$ acts as both a left inverse and a right inverse, it is sometimes called, for emphasis, a two-sided inverse. Only bijections have a two-sided inverse, but some other functions possess one-sided inverses. The existence of a left or a right inverse is determined by whether the function is injective or surjective.
Definition 4.2.12: Let $h: A \to B$ and $g: B \to A$. If $gh = 1_A$, then g is a left inverse of h and h is a right inverse of g.
A function g is a left inverse of h if applying the function g will “undo” the effect of the function h; thus, the composite function gh maps each element of the domain of h to itself. Similarly, a function h is a right inverse for g if applying h before g will nullify the effect of g.
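For finite sets, a left inverse of an injective function can be built explicitly. The following is a minimal sketch under stated assumptions: h is stored as a dictionary, the domain is a nonempty list, and elements of the codomain outside the image of h are sent to an arbitrary fixed element of the domain. The name `left_inverse` is illustrative.

```python
def left_inverse(h, domain, codomain):
    """Build g: codomain -> domain with g(h(a)) = a for every a in domain.

    h is a dict assumed to be injective on a nonempty domain; elements of
    the codomain that h never reaches are mapped to an arbitrary default.
    """
    domain = list(domain)
    if len({h[a] for a in domain}) != len(domain):
        raise ValueError("h is not injective, so no left inverse exists")
    default = domain[0]
    g = {b: default for b in codomain}
    for a in domain:
        g[h[a]] = a        # undo the effect of h on its image
    return g


# Hypothetical injective function from {1, 2} to {x, y, z}.
h = {1: "x", 2: "y"}
g = left_inverse(h, [1, 2], ["x", "y", "z"])
print(all(g[h[a]] == a for a in h))
```

The arbitrary default shows why a left inverse is generally not unique when h is not surjective.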
Theorem 4.2.7: Let $f: A \to B$, with $A \ne \emptyset$. Then
(a) f has a left inverse if and only if f is injective.
(b) f has a right inverse if and only if f is surjective.
(c) f has a left and right inverse if and only if f is bijective.
(d) If f is bijective, then the left and right inverses of f are equal.
The following illustration is appropriate to part (a) of the theorem.
[Figure: illustration for part (a) of Theorem 4.2.7]
… Let $f: A \to B$ where $C \subset A$.
(a) Prove $f(A) - f(C) \subset f(A - C)$.
Under what conditions do the following equalities hold?
(b) $f^{-1}(B - D) = A - f^{-1}(D)$.
(c) $f(C \cap f^{-1}(D)) = f(C) \cap D$.
12. Let $f: A \to B$, $B' \subset B$, $A' \subset A$. Show that
(a) $f(f^{-1}(B')) \subset B'$.
(b) If f is surjective, then $f(f^{-1}(B')) = B'$.
(c) $f^{-1}(f(A')) \supset A'$.
(d) If f is injective, then $f^{-1}(f(A')) = A'$.
13. Complete the proof of Theorem 4.2.4.
14. Prove Theorem 4.2.5.
15. Let $f_1$, $f_2$, $f_3$, $f_4$ be the following functions from R to R: $f_1(x) = 1$ if x …, $= -1$ if x …
… 2 elements. State necessary conditions on B and f for which the rank of the equivalence relation induced by f on A is
(a) 1
(b) 2
(c) |A|
17.
Let R be an equivalence relation on a set A. Under what conditions is the canonical map $g: A \to A/R$ a bijection?
18. Prove Theorem 4.2.6.
19. (a) Prove that if $f: A \to B$ is injective and $A'$ is any subset of A, then $f|_{A'}: A' \to B$ is an injection.
(b) Suppose $f: A' \to B$ is a surjection. Prove that if g is an extension of f to $A \supset A'$, then $g: A \to B$ is a surjection.
(c) Prove if $f: A \to B$ is a surjection, then there exists $A' \subset A$ such that $f|_{A'}: A' \to B$ is a bijection.
20.
Verify the following for the characteristic functions of subsets A and B of C.
(a) $\chi_{\overline{A}}(x) = 1 - \chi_A(x)$.
(b) $\chi_{A \cup B}(x) = \chi_A(x) + \chi_B(x) - \chi_{A \cap B}(x)$.
(c) $\chi_{A \cap B}(x) = \chi_A(x)\chi_B(x)$.
†21.
Determine left and/or right inverses for the following functions when they exist. Specify the equivalence relation induced on the domain by the function. In each case, construct the canonical map.
[Figures: digraphs of the functions for parts (a) through (e)]
†22. Complete the proof of Theorem 4.2.7.
Suggestions for Further Reading
The material in this chapter is classical and treated, at least briefly, in a number of books. The first two chapters of the text by MacLane and Birkhoff [1967] will provide a distinct but related development of much of the material of our Chapters 2, 3, and 4, along with some of the material of our Chapter 7.
5 COUNTING AND ALGORITHM ANALYSIS
5.0 INTRODUCTION
In order to compare, evaluate, and predict, we must often count the objects in a
finite set. For example, one way to compare the cost of applying two algorithms is to determine, or at least estimate, how many operations each of them executes when solving a problem. This is often done by counting only certain kinds of operations which are executed by the algorithms. Thus, the cost of a direct method for solving sets of simultaneous linear equations can be estimated by counting the number of multiplications and divisions executed by the algorithm. The cost of some sorting algorithms can be estimated by counting the number of comparisons made between data items. The cost of using a particular data structure for a file can be estimated by determining the average and maximum lengths of searches for items stored in the data structure. Problems such as these ultimately involve either counting (exactly or approximately) the elements of a set or enumerating the elements of a set which have a common property. This chapter first introduces some basic techniques for counting and enumerating the elements of finite sets; we then illustrate how these techniques can be applied to the analysis of algorithms.
5.1 BASIC COUNTING TECHNIQUES
In this section, we will introduce some basic techniques of counting. We begin by introducing the concept of the cardinality of a finite set. The cardinality of a finite set is simply the number of elements in the set. The definition we give below is chosen so that it can be extended to infinite sets as well.
Definition 5.1.1: A set A is finite if there is some natural number $n \in N$ such that there is a bijection from the set {0, 1, 2, ..., n - 1} to the set A. The integer n is called the cardinality of A, and we say “A has n elements,” or “n is the cardinal number of A.” The cardinality of A is denoted by |A|.
Example
Let A = {a, b, c}. Then the cardinal number of A is 3, i.e., |A| = 3, since the function
$f: \{0, 1, 2\} \to A$, f(0) = a, f(1) = b, f(2) = c,
is a bijection from the first three natural numbers to A. #
The special case of the cardinality of the empty set deserves mention. As we noted in Section 4.2, an “empty” function (consisting of the empty set of ordered pairs) is an injection from the empty set to any set A, and if A is empty, then this function is a bijection. Consequently, our definition states that a set A has cardinality 0 if there is a bijection from the first zero natural numbers to A. But the set consisting of the first zero natural numbers is empty, and a bijection will exist if and only if A is empty. We conclude that |A| = 0 if and only if $A = \emptyset$.
We now introduce a fundamental rule of counting known as the “pigeonhole principle.” Informally, the pigeonhole principle asserts that if m objects are placed in n boxes (or pigeonholes) and m > n, then some box will contain more than one object. This principle, which we will not prove, can be stated more formally as follows.
Pigeonhole Principle: If A and B are finite sets with |A| = m and |B| = n and m > n, then no injection exists from A to B.
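For very small sets the pigeonhole principle can be observed by brute force. The sketch below enumerates every injection between two finite collections; the name `injections` and the dictionary encoding of functions are illustrative assumptions. It relies on `itertools.permutations(b, r)`, which yields the ordered selections of r distinct elements of b and yields nothing when r exceeds the size of b.

```python
from itertools import permutations


def injections(a, b):
    """Return every injective function from a to b as a dict."""
    a = list(a)
    return [dict(zip(a, image)) for image in permutations(list(b), len(a))]


print(len(injections(range(3), "xy")))   # three objects, two boxes
print(len(injections(range(2), "xyz")))  # two objects, three boxes
print(len(injections([], "xyz")))        # the empty function
```

The last line reflects the remark above that the empty function is an injection from the empty set to any set.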
When an intuitive notion, such as the size of a set, is characterized by means of a mathematical definition, it is important to verify that the properties of the mathematical characterization agree with our intuitive concept. The next theorem has this purpose; it uses the pigeonhole principle to prove that a finite set has only one cardinal number.
Theorem 5.1.1: Let A be a finite set. Then the cardinality of A is unique.
Proof: Suppose |A| = m and |A| = n; we will show that m = n. Assume that m > n. Then by the pigeonhole principle, there is no injection from A to A. But $1_A$ is a bijection from A to A. Thus, the assumption that m > n leads to a contradiction. Similarly, the assumption that n > m will lead to a contradiction. Hence, m = n. ∎
The proof of the following theorem is left as an exercise.
Theorem 5.1.2: Let A and B be finite sets, and suppose there is a bijection from A to B. Then |A| = |B|.
Two additional principles are fundamental for counting sets which have been formed by using the operations of union and cartesian product. We have implicitly used these principles in earlier chapters, but for the sake of completeness, we will state them as theorems about the cardinalities of sets; their proofs are left as exercises. The first principle is called the Rule of Sum.
Theorem 5.1.3: If A and B are finite disjoint sets with cardinalities m and n respectively, then $|A \cup B| = m + n$.
The second fundamental principle of counting is known as the Rule of Product.
Theorem 5.1.4: If A and B are finite sets with cardinalities m and n respectively, then $|A \times B| = mn$.
Examples
(a) Suppose statement labels in a programming language must be either a single alphabetic symbol or a single decimal digit. The first set, {A, B, C, ..., Z}, has 26 elements, and the second set, {0, 1, 2, ..., 9}, has ten elements. Because the two sets are disjoint, the rule of sum can be applied, and we conclude that there are 26 + 10 = 36 possible statement labels.
(b) A variable name in the programming language BASIC must be either an alphabetic symbol or an alphabetic symbol followed by a single decimal digit. If S denotes the set of alphabetic symbols and D denotes the set of digits, there is a one-to-one correspondence between the variable names and the set $S \cup (S \times D)$. By the rule of product, there are $26 \cdot 10 = 260$ elements in $S \times D$ and hence by the rule of sum there are 26 + 260 = 286 possible variable names in BASIC.
(c) Consider the puzzle sometimes called the “four cubes problem.” It involves four cubes such that each face of every cube is painted one of four colors. The problem is to stack the cubes in such a way that each vertical side of the stack contains squares of all four colors.
The order of the cubes in the stack is clearly unimportant, and we do not wish to distinguish between arrangements which are identical except for rotation. We can count the number of significantly different arrangements as follows:
1. The first cube can be positioned in any of three different ways because there are three pairs of faces which can be made the top and bottom surfaces.
2. For each remaining cube, one of the six faces must be chosen as the bottom and then one of four possible rotational positions must be chosen. This gives 24 different ways to position each of the last three cubes in the stack.
Thus there are $3 \cdot 24 \cdot 24 \cdot 24 = 41{,}472$ different arrangements, making an exhaustive search costly. For a discussion of how to solve the problem (easily!) by constructing a graph with 4 nodes and 12 edges, the reader is referred to Deo [1974], p. 18, or Busacker and Saaty [1965], p. 153. #
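The counts in these examples can be confirmed by building the sets explicitly. This is a small sketch; encoding the pair of a letter and a digit as a two-character string is an assumption made so that the one-to-one correspondence with variable names is visible.

```python
import string
from itertools import product

letters = set(string.ascii_uppercase)   # the 26 alphabetic symbols
digits = set(string.digits)             # the 10 decimal digits

# Example (a): the two sets are disjoint, so the rule of sum applies.
statement_labels = letters | digits

# Example (b): S together with S x D, each pair written as a two-character name.
two_char_names = {s + d for s, d in product(letters, digits)}
variable_names = letters | two_char_names

# Example (c): 3 positions for the first cube, 24 for each of the other three.
cube_arrangements = 3 * 24 ** 3

print(len(statement_labels), len(two_char_names), len(variable_names), cube_arrangements)
```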
We will now develop several basic counting results, all of which are based on the rules of sum and product.
Theorem 5.1.5: Let A and B be finite sets with cardinalities m and n respectively. There are $n^m$ functions from A to B, i.e.,
$|B^A| = |B|^{|A|}$.
Proof: If $A = \emptyset$, then the assertion holds since we define $n^0 = 1$ for all $n \in N$. No functions exist from A to B if B is empty and A is not. If both A and B are nonempty, then index the elements of A in some arbitrary fashion with the first m natural numbers: $a_0, a_1, a_2, \ldots, a_{m-1}$. Each element of A can be mapped to any of n elements of B. Thus, there are n possible values of $f(a_0)$, n possible values of $f(a_1)$, etc. It follows that there are $n \cdot n \cdot n \cdots n$ (m factors) or $n^m$ functions. Hence, $|B^A| = |B|^{|A|}$. ∎
Example
Assume we wish to represent integers using sequences of n digits, where each digit is one of b distinct symbols, $b \ge 2$. Choosing the symbol set to be {0, 1, 2, ..., b - 1}, each n digit sequence of symbols can be associated in a natural way with exactly one function $f: \{0, 1, 2, \ldots, n - 1\} \to \{0, 1, 2, \ldots, b - 1\}$. Thus, there is a bijection from the set of all such sequences to $\{0, 1, 2, \ldots, b - 1\}^{\{0, 1, 2, \ldots, n - 1\}}$. By Theorem 5.1.5, there are $b^n$ functions from {0, 1, 2, ..., n - 1} to {0, 1, 2, ..., b - 1} and therefore we can represent $b^n$ distinct integers. In the case of the standard positional number notation in base b, where the sequence
$a_{n-1} a_{n-2} a_{n-3} \cdots a_1 a_0$
represents the number
$a_{n-1} b^{n-1} + a_{n-2} b^{n-2} + \cdots + a_1 b^1 + a_0 b^0$,
each sequence of length n represents an integer greater than or equal to 0 and less than $b^n$. #
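The correspondence between digit sequences and integers can be checked by enumeration for small b and n. In this sketch each tuple produced by `itertools.product` plays the role of one digit sequence, written most significant digit first; the function name `represented_integers` is illustrative.

```python
from itertools import product


def represented_integers(b, n):
    """Return the value of every n-digit sequence over the digits 0..b-1."""
    values = []
    for digits in product(range(b), repeat=n):
        value = 0
        for d in digits:           # most significant digit first
            value = value * b + d  # Horner evaluation of the positional sum
        values.append(value)
    return values


values = represented_integers(3, 4)
print(len(values), min(values), max(values))
```

For b = 3 and n = 4 the sequences number 3⁴ = 81 and their values are exactly the integers from 0 through 80.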
We proved the following assertion inductively in Section 2.5. Here the result follows as a special case of the preceding theorem.
Corollary 5.1.5: If A is a finite set, there are $2^{|A|}$ distinct subsets of A.
Proof: For each subset $A' \subset A$, let $\chi_{A'}$ be the characteristic function of $A'$:
$\chi_{A'}: A \to \{0, 1\}$,
$\chi_{A'}(x) = 1$ if $x \in A'$,
$= 0$ otherwise.
For every pair of subsets B, C contained in A, $\chi_B = \chi_C$ if and only if B = C. Hence, there are as many subsets of A as there are characteristic functions defined on A, and by Theorem 5.1.5, this number is $2^{|A|}$. ∎
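The proof idea translates directly into an enumeration: each assignment of 0 or 1 to the elements of A is a characteristic function, and each one picks out a different subset. A minimal sketch, with the hypothetical name `subsets_via_characteristic_functions`:

```python
from itertools import product


def subsets_via_characteristic_functions(a):
    """List the subsets of a, one for each function from a to {0, 1}."""
    elements = list(a)
    subsets = []
    for chi in product((0, 1), repeat=len(elements)):
        # chi[i] is the value of the characteristic function at elements[i]
        subsets.append(frozenset(x for x, bit in zip(elements, chi) if bit == 1))
    return subsets


subsets = subsets_via_characteristic_functions({"a", "b", "c"})
print(len(subsets), len(set(subsets)))
```

The two printed numbers agree, which reflects the fact that distinct characteristic functions determine distinct subsets.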
Permutations and Combinations
Recall that a permutation of a set is a bijection from the set to itself and that the set of permutations of a set is closed under function composition, i.e., if f and g are permutations of a set A, then $f \circ g$ and $g \circ f$ are permutations of A.
Theorem 5.1.6: Let A be a finite set with n elements. The number of distinct permutations of A is n!
Proof: If $A = \emptyset$, then there is one bijection of A to A, namely the empty function. Thus, if |A| = 0, there is 0! = 1 permutation of A. If A is not empty, then let $a_0, a_1, a_2, \ldots, a_{n-1}$ be an arbitrary arrangement of the elements of A. A function $f: A \to A$ can be defined by first choosing $f(a_0)$, then $f(a_1)$, and so on. If f is a bijection on A, then there are n choices for $f(a_0)$, n - 1 choices for $f(a_1)$, n - 2 choices for $f(a_2)$ and in general n - i choices for $f(a_i)$. Applying the rule of product, it follows that the number of possible bijections is
$n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 = n!$ ∎
The permutations of a set can be put in a one-to-one correspondence with the ordered arrangements of the elements of the set. Let $a_0, a_1, \ldots, a_{n-1}$ be some arbitrary but fixed arrangement of the elements of a finite set A. Then any arrangement of the elements of A can be associated with a bijection from A to A; the arrangement $a_{i_0}, a_{i_1}, \ldots, a_{i_{n-1}}$ corresponds to the permutation $f: A \to A$ where
$f(a_j) = a_{i_j}$.
Examples
(a) Suppose a list is to be formed from n distinct items. If we distinguish between different orderings of the items, then there are n! different lists which can be formed.
(b) Let A and B be finite sets. How many bijections are there from A to B? If $|A| \ne |B|$, then no bijections exist from A to B. If |A| = |B| = n, then there are n! bijections. #
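Example (b) can be illustrated by listing the bijections between two small sets. In this sketch a bijection is encoded as a dictionary and each ordering of B produced by `itertools.permutations` gives one bijection; the name `bijections` is an illustrative choice.

```python
from itertools import permutations
from math import factorial


def bijections(a, b):
    """Return every bijection from a to b as a dict (none if the sizes differ)."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return []
    return [dict(zip(a, image)) for image in permutations(b)]


print(len(bijections("abc", [1, 2, 3])), factorial(3))
print(len(bijections("ab", [1, 2, 3])))
print(len(bijections([], [])))   # the empty function, matching 0! = 1
```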
Consider a process which selects r objects sequentially from a set of n objects. If each element of the set is eligible to be chosen repeatedly, then the process is said to be a selection with replacement. Thus, if one were drawing items with replacement from a jar, each time an item is drawn from the jar, its identity would be noted and then it would be replaced in the jar, making it a candidate for future draws. If r drawings are made from a jar with n objects and the output of the process is taken to be the resulting sequence of r objects (i.e., an r-tuple of objects from the set), then a selection with replacement has $n^r$ possible values, each of which is an r-tuple, $\langle a_1, a_2, \ldots, a_r \rangle$. (Note that if r = 0, then $n^r = 1$; there is only one sequence of 0 length.)
Now suppose the selection process is one in which each item can be selected at most once; in this case, the process is said to be a selection without replacement. The sequence which results from a selection without replacement of r objects from n objects where $r \le n$, is called a permutation of n objects taken r at a time. A permutation of n objects taken r at a time is an r-tuple, $\langle a_1, a_2, \ldots, a_r \rangle$ such that each $a_i$ is one of n objects and if $i \ne j$, then $a_i \ne a_j$.
Theorem 5.1.7: The number of permutations of n objects taken r at a time, denoted P(n, r), is equal to n(n - 1)(n - 2)...(n - r + 1):
$P(n, r) = \frac{n!}{(n - r)!}$
Proof: If r = 0, then P(n, r) = 1 because there is only one empty sequence. Suppose r > 0. Then there are n possible values for the selection of the first of r objects from n objects. Since selection is without replacement and one object has been chosen, there are only n - 1 possible values for the selection of the second object. Similarly, there are n - i + 1 possible values for the selection of the ith object for all i, $1 \le i \le r$ …
… (a) P(n, n) = n!
(b) P(n, n) = P(n, r)P(n - r, n - r) where $0 \le r \le n$ …
… An algorithm of complexity $O(c^n)$, c > 1, is said to be of exponential complexity.
7. f is O(n!).
The following theorem establishes that the classes we have listed are given in order of increasing complexity.
Theorem 5.2.4: Consider the class F of all functions from N to R. Then for $c \in R$ such that c > 1,
$O(1) \subset O(\log n) \subset O(n) \subset O(n \log n) \subset O(n^2) \subset O(c^n) \subset O(n!)$,
and all containments are proper.
Proof: The proofs that containments are proper are left as exercises. We will prove the first, second, and fifth containments and leave the others as exercises.
By Theorem 5.2.3, in order to show $O(f) \subset O(g)$ it suffices to show that f is O(g).
(a) ($O(1) \subset O(\log n)$.) Let f(n) = 1 and $g(n) = \log_2 n$. For all $n \ge 2$, 1 …
… 0, log n < n, and therefore log n is O(n). It follows that $O(\log n) \subset O(n)$.
(c) ($O(n^2) \subset O(c^n)$ for c > 1.) We will show that for sufficiently large n, $n^2$ …
… is O(n).
Proof: Part (a) is straightforward; parts (b) and (c) follow from Corollary 5.2.5 and the identities $\sum_{i=0}^{n} i = n(n + 1)/2$ and $\sum_{i=0}^{n} i^2 = n(n + 1)(2n + 1)/6$ respectively. ∎
In practice, any algorithm can be executed on small problems; that is, when n is small enough, but the asymptotic behavior of a complexity function provides important information about whether it will be feasible to execute an algorithm for moderate or large values of n. This point is illustrated in Tables 5.2.1 and 5.2.2.
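A growth comparison of this kind can be regenerated with a few lines of code. The sketch below assumes logarithms to base 2 and rounds log n up and n log n to the nearest integer; these rounding choices are assumptions, so individual entries may differ slightly from a printed table. Values above 10¹⁰⁰ are replaced by an asterisk.

```python
from math import ceil, factorial, log2

LIMIT = 10 ** 100  # larger entries are shown as "*"


def growth_row(n):
    """Return the entries log n, n, n log n, n^2, 2^n, n! for problem size n."""
    exact = [ceil(log2(n)), n, round(n * log2(n)), n ** 2, 2 ** n, factorial(n)]
    return ["*" if value > LIMIT else value for value in exact]


for n in (5, 10, 100, 1000, 10_000):
    print(n, growth_row(n))
```

Python integers have arbitrary precision, so the comparison against 10¹⁰⁰ is exact even for 2ⁿ and n! at the largest problem sizes.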
Table 5.2.1 A COMPARISON OF THE GROWTH OF SOME COMMON COMPLEXITY FUNCTIONS. THE TABLE ENTRIES ARE PROPORTIONAL TO THE TIME REQUIRED TO SOLVE A PROBLEM OF SIZE n. AN ASTERISK INDICATES THAT THE NUMBER IS GREATER THAN 10^100.

Problem size n    log n    n       n log n       n^2     2^n            n!
5                 3        5       12            25      32             120
10                4        10      33            10^2    1024           3 x 10^6
10^2              7        10^2    664           10^4    1.3 x 10^30    *
10^3              10       10^3    9965          10^6    *              *
10^4              14       10^4    1.4 x 10^5    10^8    *              *

Table 5.2.2 A COMPARISON OF THE MAXIMUM SIZES OF PROBLEMS WHICH CAN BE SOLVED USING ALGORITHMS WITH SOME COMMON COMPLEXITY FUNCTIONS. AN AVERAGE EXECUTION TIME OF ONE OPERATION PER MICROSECOND (10^-6 SEC) IS ASSUMED. PROPORTIONAL VALUES HOLD FOR OTHER SPEEDS.

Complexity function    1 sec       1 min          1 hour
log_2 n                2^(10^6)    2^(6 x 10^7)   2^(3.6 x 10^9)
n                      10^6        6 x 10^7       3.6 x 10^9
n log n                62746       2.8 x 10^6     1.3 x 10^8
n^2                    10^3        7746           60,000
2^n                    23          26             32
n!                     9           11             12

Comparing algorithms on the basis of their asymptotic behavior is a powerful and convenient technique, but it must be used with caution. Thus, while we would
expect an O(n) algorithm to be “better” than one which is $O(n^2)$, in fact we cannot choose between them without more information. For example, suppose that algorithms F and G have complexity functions f(n) = cn and $g(n) = dn^2$. If the values of the constants are c = 50 and d = 1, then F is a more attractive algorithm only if n, the problem size, exceeds 50. Since this value of n may be larger than most of the problems of interest, it may be that the $O(n^2)$ algorithm is the best choice. Thus in order to choose between algorithms, it is generally necessary to know the specific complexity functions and the problem size as well as the asymptotic behaviors.
By extending the way in which order notation is used, we can characterize algorithm performance more precisely than is possible with the notation we have developed thus far. In the extended usage, the notation O(f) is used on the right side of an equation to denote a member of the set O(f). For example, the assertion that the algorithm F has asymptotic complexity f, where
$f(n) = 1.6n^2 + O(n \log n)$
is interpreted as meaning $f(n) = 1.6n^2 + g(n)$, where g(n) is a member of O(n log n). This is a stronger assertion than
$f(n) = O(n^2)$;
the second is implied by the first but not vice versa. Using this extended notation, the complexity function of different algorithms can be compared with one another on the basis of the coefficients of dominating summand functions as well as less important summands. Thus, for sufficiently large n, an algorithm with a complexity function f(n) = 1.6n? + O(n) will probably be less costly than one who se complexity function is g(n) = 2n? + O(n), which in turn will probably be less costly than one whose complexity function is A(n) = 2n? + O(n log n). Problems: 1.
Section
5.2
Let F be the class of functions from N to R, and let $f, g \in F$. Define the binary relation $\equiv$ as follows: $f \equiv g$ if and only if f and g asymptotically dominate each other.
(a) Show that $\equiv$ is an equivalence relation. (This is part (b) of Theorem 5.2.1.)
(b) Let $[f]$ denote the equivalence class of f under the relation $\equiv$. Show that the binary relation
$[f] \le [g]$ if and only if f is asymptotically dominated by g
is a partial order on the quotient set $F/\equiv$.
2. Give an example of a function in $O(1)$ which is not a constant function.
3. Find a pair of functions f and g from N to R such that $f \notin O(g)$ and $g \notin O(f)$.
4. Define a function $f: N \to R$ to be bounded if there exists some $r \in R$ such that for all $n \in N$, $f(n)$ [...]
5. For each of the following pairs of functions $f: N \to R$ and $g: N \to R$, determine if and how f and g are related in terms of asymptotic domination. (a) f(n) = 1 for n even, for n odd.
g(n) = =
for n even,
for n odd.
=
for n even,
g(n) =
for n even,
(b) f@
=] =)
for n odd.
for n odd.
(c)
fi) =n.
(a)
Using logical notation, write out the definition of “f does not asymptotically dominate g.” Using the assertion of part (a), argue that if f does not asymptotically dominate g, then for any m there exists an infinite number of arguments 1 such that
(b)
(c)
g(n) = n/100 == 107 1%,2
if n 4 10* for some k, if n = 1, 10, 100, etc.
lg(n) > mf@).
Determine whether the following assertion is true. “If f does not asymptotically dominate g, then for all m > 0, if n is sufficiently large, then  g() > mf(n).”
7. Let $f_1$ and $f_2$ be functions such that $f_1$ is $O(g_1)$ and $f_2$ is $O(g_2)$.
(a) Prove that if $g_1(n)$ and $g_2(n)$ are nonnegative for all arguments $n \in N$, then $f_1 + f_2$ is $O(g_1 + g_2)$.
(b) Prove that $f_1 + f_2$ may not be $O(g_1 + g_2)$.
8. Let f and g be functions from N to R, and denote by $f \cdot g$ the product function: $f \cdot g(n) = f(n) \cdot g(n)$.
(a) Prove that if f is $O(h_1)$ and g is $O(h_2)$, then $f \cdot g$ is $O(h_1 \cdot h_2)$.
(b) Find a function $f: N \to R$ such that $O(f)$ is not closed under multiplication of functions.
9. Prove Corollary 5.2.3.
10.
Show that each of the following containments is proper: (a) $O(1) \subset O(\log n)$. (b) $O(\log n) \subset O(n)$.
(c)
O(n?) < OO”),
for alld > 1.
11.
Prove the following assertions and show that each of the containments is proper. (a) $O(n) \subset O(n \log n)$. (b) $O(n \log n) \subset O(n^d)$, for all $d > 1$. (c) $O(c^n) \subset O(n!)$, for all $c > 1$.
12.
Show that for all integer values of $k, n > 0$, $O(\log n) = O(\log (n + k))$.
13.
Prove Corollary 5.2.5.
14.
Consider the class of functions F, where
$$F = \{f \mid f: N \to R \text{ and } f(N) \subset N\},$$
i.e., the image of every member of F is a subset of N. Let f and g be members of F. Prove or disprove the following. Conjecture: If f and g are $O(h)$, then $f \circ g$ is $O(h)$ (i.e., the set $O(h)$ is closed under composition of functions).
15.
Prove Theorem 5.2.7.
16.
Suppose two algorithms F and G have time complexity functions
$$f(n) = n^2 - n + 550 \quad \text{and} \quad g(n) = 59n + 50$$
respectively. Determine those values of $n \in N$ for which F takes less time to execute than G.
17.
Determine which of the following functions asymptotically dominate others. Present your answer as a labelled digraph.
fi(n) = 528 Si(n) = 3n* logn + logn
1 AM=F+5
Si(n) = log login
fs(n) = (log n)?
= 208 fol)
fr(n) = log (n + +) Ss(n) = log (n?) fo(n) = 3nt4 18.
From Theorem 5.2.8 we might make the following conjecture for $k \in N$:
$$\sum_{i=0}^{n} i^k \text{ is } O(n^{k+1}).$$
Prove or disprove the conjecture.
5.3 RECURRENCE SYSTEMS

The expressions for permutations and combinations developed in Section 5.1 are the most fundamental tools for counting the elements of finite sets. They often prove to be inadequate, however, and many problems of computer science require a different approach. An important alternate approach uses recurrence equations (often called difference equations or recurrence relations) to define the terms of a sequence. A formal definition of recurrence equations is difficult because of the wide variety of forms in which such equations can be written, but the concept is straightforward. We have already seen an example of a recurrence equation in the definition of the Fibonacci sequence, where for $n \ge 2$, the term $a_n$ is defined by the recurrence equation
$$a_n = a_{n-1} + a_{n-2}.$$
The salient characteristic of a recurrence equation is the specification of the term $a_n$ as a function of the terms $a_0, a_1, \ldots, a_{n-1}$. By itself, however, a recurrence equation is not sufficient to define the terms of a sequence; we must also specify the values of some initial terms of the sequence. Thus, in our definition of the Fibonacci sequence, we set $a_0 = 0$ and $a_1 = 1$. These are called the boundary conditions or initial conditions of the sequence. A recurrence equation together with boundary conditions is a form of recursive definition, although the terminology used is different from that introduced earlier. The topics of recursive definitions and recurrence equations are not coextensive; many classes of recursive definitions do not use recurrence equations, and the solution of recurrence equations uses techniques which are not applicable to the broader class of recursive definitions.

A recurrence system is a set of boundary conditions and recurrence equations which specify a unique sequence or a function (or sometimes a partial function) from $N^k$ to R, where $k \in I+$. Recurrence systems provide a powerful tool for investigating many classes of problems, including counting and enumeration problems. A solution to a recurrence system is a function $f: N^k \to R$ such that f satisfies both the boundary conditions and the recurrence equations.

Examples
(a)
The number of permutations of n objects can be expressed using the following recurrence system:
$$P(0) = 1, \qquad P(n) = nP(n - 1) \quad \text{for } n > 0.$$
The correctness of this system can be established as follows:
1. The objects of an empty set can be arranged in a sequence in exactly one way. Thus, the boundary condition is $P(0) = 1$.
2. Given n objects, $n > 0$, we can choose the first object of a sequence in any of n ways and then arrange the remaining elements in $P(n - 1)$ ways. Thus, the recurrence equation is $P(n) = nP(n - 1)$ for $n > 0$.
It can be shown by induction that $n!$ is a solution to this system, where $0! = 1$ and for $n > 0$, $n! = \prod_{i=1}^{n} i$.
(b)
Let $f(h, k)$ be the maximum number of leaves of a tree of height h, where each node has outdegree k or less. This function can be expressed as the following recurrence system:
$$f(0, k) = 1, \qquad f(h, k) = k \cdot f(h - 1, k) \quad \text{for } h > 0.$$
The system is based on the following arguments.
1. A tree of height 0 has a single node which is a leaf, so $f(0, k) = 1$. This gives the boundary condition.
2. A tree of height $h > 0$ will have the maximum number of leaves if its root has k sons, each of which is the root of a subtree of height $h - 1$ with $f(h - 1, k)$ leaves. A tree of height h can therefore have up to $k \cdot f(h - 1, k)$ leaves.
It can be shown by induction on h that $k^h$ is a solution to this system.
(c)
Pascal derived the following recurrence system to evaluate $\binom{n}{k}$, the number of subsets of k objects in a set of n objects.
$$\binom{n}{0} = \binom{n}{n} = 1, \qquad \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \quad \text{for } n > k > 0.$$
The argument is as follows:
1. The number of ways of choosing 0 things from n things is 1, and the number of ways of choosing n things from n things is 1. These two assertions provide the boundary conditions.
2. Suppose $n > k > 0$. We choose some element and delete it from the set, leaving $n - 1$ elements. A subset of $k > 0$ elements can now be chosen from the original n elements in two distinct ways: one can choose $k - 1$ elements from the remaining $n - 1$ elements and then add the deleted element, or one can choose all k elements from the remaining $n - 1$ elements. These possibilities are mutually exclusive and exhaustive. It follows that
$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}.$$
It can be shown by direct substitution that $n!/[k!(n - k)!]$ is a solution to this system. #
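The three recurrence systems of examples (a) through (c) can be evaluated directly by recursion. The following Python sketch is illustrative only; the function names `perms`, `max_leaves`, and `choose` are assumptions introduced here and do not appear in the text. Each function follows its boundary condition and recurrence equation literally, and the loop at the end compares the results with the closed forms $n!$, $k^h$, and $n!/[k!(n-k)!]$ for small arguments.

```python
from math import comb, factorial


def perms(n: int) -> int:
    """P(0) = 1, P(n) = n * P(n - 1) for n > 0."""
    return 1 if n == 0 else n * perms(n - 1)


def max_leaves(h: int, k: int) -> int:
    """f(0, k) = 1, f(h, k) = k * f(h - 1, k) for h > 0."""
    return 1 if h == 0 else k * max_leaves(h - 1, k)


def choose(n: int, k: int) -> int:
    """C(n, 0) = C(n, n) = 1; C(n, k) = C(n-1, k-1) + C(n-1, k) for n > k > 0."""
    if k == 0 or k == n:
        return 1
    return choose(n - 1, k - 1) + choose(n - 1, k)


# Compare each recurrence system with its closed-form solution.
for n in range(9):
    assert perms(n) == factorial(n)
    assert max_leaves(n, 3) == 3 ** n
    for k in range(n + 1):
        assert choose(n, k) == comb(n, k)
```

The recursive `choose` recomputes many subproblems and is meant only to mirror the recurrence; it is not an efficient way to compute binomial coefficients.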
The number of injections and bijections from a set S to a set T can easily be expressed in terms of permutations involving $|S|$ and $|T|$; these expressions were given in examples in Section 5.1. The number of surjections from one set to another is difficult to characterize using only permutations and combinations, but can be easily expressed using a recurrence system.

Theorem 5.3.1: Let A and B be finite nonempty sets with $|A| = m$, $|B| = n$, where $m \ge n > 0$. The number of surjections, $S(m, n)$, from A to B is given by the following recurrence system.
$$S(m, 1) = 1, \qquad S(m, n) = n^m - \sum_{j=1}^{n-1} \binom{n}{j} S(m, j) \quad \text{for } m \ge n > 1.$$

Proof: If $n = 1$, then there is exactly one surjection from A to B; this establishes the boundary condition $S(m, 1) = 1$. Suppose $n > 1$. The number of surjections from A to B is equal to the number of functions from A to B minus the number of functions whose images are proper subsets of B. If $B' \subset B$ and $|B'| = j$, then there are $S(m, j)$ functions from A to B whose image is $B'$. Furthermore, there are $\binom{n}{j}$ different subsets of B of cardinality j. Thus, there are $\binom{n}{j} S(m, j)$ different functions from A to B which have an image of cardinality j, where $j < n$. Then the total number of functions from A to B which are not surjections is $\sum_{j=1}^{n-1} \binom{n}{j} S(m, j)$. Since there are a total of $n^m$ functions from A to B,
$$S(m, n) = n^m - \sum_{j=1}^{n-1} \binom{n}{j} S(m, j).$$
This establishes the recurrence system. ∎
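The recurrence system of Theorem 5.3.1 translates almost line for line into a program. The sketch below is a hedged illustration in Python: the names `surjections` and `surjections_brute` are assumptions made for this example, and the memoization via `lru_cache` is a convenience that is not part of the theorem. The brute-force function enumerates every function from an m-element set to an n-element set and counts those whose image is the whole codomain, which gives an independent check for small m and n.

```python
from functools import lru_cache
from itertools import product
from math import comb


@lru_cache(maxsize=None)
def surjections(m: int, n: int) -> int:
    """S(m, 1) = 1; S(m, n) = n**m - sum_{j=1}^{n-1} C(n, j) * S(m, j)."""
    if n == 1:
        return 1
    return n ** m - sum(comb(n, j) * surjections(m, j) for j in range(1, n))


def surjections_brute(m: int, n: int) -> int:
    """Count surjections by enumerating all n**m functions (small inputs only)."""
    return sum(
        1
        for images in product(range(n), repeat=m)  # images[i] is the value at i
        if len(set(images)) == n                   # every element of B is hit
    )


for m in range(1, 7):
    for n in range(1, m + 1):
        assert surjections(m, n) == surjections_brute(m, n)
```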
It is obvious that a recurrence system can be used to obtain any term of the associated sequence by iteratively solving the recurrence. Alternatively, it is sometimes possible to find an expression for the solution which can be evaluated directly for any argument n to find the value of the nth term.

Examples
The following are examples of solutions to recurrence systems. In each case the expression can be shown to be a solution by direct substitution. All of these solutions are unique, but we will not prove this.
(a) The following system describes a function which grows exponentially:
$$a_0 = k, \qquad a_n = c\,a_{n-1} \quad \text{for } n > 0.$$
The solution is $a_n = kc^n$.
(b) The following system describes the Fibonacci sequence:
$$a_0 = 0, \qquad a_1 = 1, \qquad a_n = a_{n-1} + a_{n-2} \quad \text{for } n > 1.$$
The solution is
$$a_n = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^n.$$
(c) Consider the following recurrence system:
$$f(0) = 0, \qquad f(1) = f(2) = 1, \qquad f(n) = 2f(n - 1) + f(n - 2) - 2f(n - 3) \quad \text{for } n > 2.$$
The solution is $f(n) = [(-1)^{n+1} + 2^n]/3$.
#
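The claim that these expressions are solutions can be spot-checked numerically. The following Python sketch compares iterative evaluation of systems (b) and (c) with their stated solutions. The function names are assumptions made for this illustration. The Fibonacci closed form is evaluated in floating point and rounded, so the comparison is only trusted here for modest n; the solution of (c) is evaluated in exact integer arithmetic.

```python
from math import sqrt


def fib_iter(n: int) -> int:
    """a_0 = 0, a_1 = 1, a_n = a_{n-1} + a_{n-2} for n > 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed(n: int) -> int:
    """Closed form of example (b), rounded to absorb floating-point error."""
    s = sqrt(5)
    return round((((1 + s) / 2) ** n - ((1 - s) / 2) ** n) / s)


def f_iter(n: int) -> int:
    """f(0) = 0, f(1) = f(2) = 1, f(n) = 2f(n-1) + f(n-2) - 2f(n-3) for n > 2."""
    vals = [0, 1, 1]
    for i in range(3, n + 1):
        vals.append(2 * vals[i - 1] + vals[i - 2] - 2 * vals[i - 3])
    return vals[n]


def f_closed(n: int) -> int:
    """Solution of example (c): ((-1)^(n+1) + 2^n) / 3, which is always an integer."""
    return ((-1) ** (n + 1) + 2 ** n) // 3


for n in range(40):
    assert fib_iter(n) == fib_closed(n)
    assert f_iter(n) == f_closed(n)
```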
A treatment of the many techniques for solving recurrence systems is beyond the scope of this text, but we will illustrate one which is both easy and useful. Later in this section we will use this procedure to find solutions for some important classes of recurrence systems. The technique begins with the specification of $a_n$ and repeatedly applies the recurrence relation to evaluate the terms which appear on the right side. To illustrate, consider a recurrence system of the form
$$a_0 = b_0, \qquad a_n = c_n a_{n-1} + b_n,$$
where the value of the coefficients $b_n$ and $c_n$ may be functions of n. The value of the general term $a_n$ can be expressed as a sum by adding both sides of the following sequence of equations, where each equation is obtained by using the recurrence relation to express a term in the preceding equation.
$$\begin{aligned}
a_n &= c_n a_{n-1} + b_n \\
c_n a_{n-1} &= c_n c_{n-1} a_{n-2} + c_n b_{n-1} \\
c_n c_{n-1} a_{n-2} &= c_n c_{n-1} c_{n-2} a_{n-3} + c_n c_{n-1} b_{n-2} \\
&\;\;\vdots \\
\Bigl(\prod_{i=0}^{n-2} c_{n-i}\Bigr) a_1 &= \Bigl(\prod_{i=0}^{n-1} c_{n-i}\Bigr) a_0 + \Bigl(\prod_{i=0}^{n-2} c_{n-i}\Bigr) b_1 \\
\Bigl(\prod_{i=0}^{n-1} c_{n-i}\Bigr) a_0 &= \Bigl(\prod_{i=0}^{n-1} c_{n-i}\Bigr) b_0
\end{aligned}$$
Note that the right side of the last equation only involves the coefficients and boundary conditions of the sequence. Forming separate sums of the left and right sides of this set of equations and then cancelling common summands yields
$$a_n = b_n + c_n b_{n-1} + c_n c_{n-1} b_{n-2} + \cdots + \Bigl(\prod_{i=0}^{n-2} c_{n-i}\Bigr) b_1 + \Bigl(\prod_{i=0}^{n-1} c_{n-i}\Bigr) b_0.$$
In many cases, standard summation identities can be applied to derive an expression for the value of $a_n$.

Example
Consider the recurrence system
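$$a_0 = b, \qquad a_n = c\,a_{n-1} + b.$$
We form the set of equations
$$\begin{aligned}
a_n &= c\,a_{n-1} + b \\
c\,a_{n-1} &= c^2 a_{n-2} + cb \\
c^2 a_{n-2} &= c^3 a_{n-3} + c^2 b \\
&\;\;\vdots \\
c^{n-1} a_1 &= c^n a_0 + c^{n-1} b \\
c^n a_0 &= c^n b
\end{aligned}$$
Summing the left and right sides and cancelling gives
$$a_n = b + cb + c^2 b + \cdots + c^{n-1} b + c^n b = b \sum_{i=0}^{n} c^i.$$
Applying Theorem 2.5.3 gives the solution
$$a_n = \begin{cases} (n + 1)b & \text{if } c = 1; \\[4pt] b\left(\dfrac{c^{n+1} - 1}{c - 1}\right) & \text{if } c \ne 1. \end{cases}$$
#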
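The solution just derived can be compared with direct iteration of the recurrence. The sketch below is a minimal Python illustration under stated assumptions: the names `a_iter` and `a_closed` are introduced here, and `fractions.Fraction` is used so that the division in the closed form is exact rather than floating point.

```python
from fractions import Fraction


def a_iter(n: int, b: Fraction, c: Fraction) -> Fraction:
    """Iterate a_0 = b, a_n = c * a_{n-1} + b."""
    a = b
    for _ in range(n):
        a = c * a + b
    return a


def a_closed(n: int, b: Fraction, c: Fraction) -> Fraction:
    """(n + 1) * b if c = 1; otherwise b * (c^(n+1) - 1) / (c - 1)."""
    if c == 1:
        return (n + 1) * b
    return b * (c ** (n + 1) - 1) / (c - 1)


# Check several coefficient choices, including the special case c = 1.
for b in (Fraction(3), Fraction(1, 2)):
    for c in (Fraction(1), Fraction(2), Fraction(-1, 3)):
        for n in range(12):
            assert a_iter(n, b, c) == a_closed(n, b, c)
```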
The importance of obtaining an expression for the nth term of a sequence defined by a recurrence system is mitigated by the possibility of obtaining the terms of the sequence iteratively. But a general expression for the nth term often provides additional insights. For example, if the nth term of the sequence describes the cost of applying an algorithm to a problem of size n, then an expression for the nth term will enable us to determine the asymptotic behavior of the complexity function of the algorithm.

Example
The following procedure returns the sum of the first n entries of an array A.

procedure SUM(n):
begin
    total ← 0;
    for i ← 1 to n step 1 do
        total ← total + A[i];
    return total
end

Suppose we define the complexity function f of SUM to be the number of additions performed by SUM. This function is characterized by the following recurrence system.
$$f(1) = 1, \qquad f(n) = f(n - 1) + 1 \quad \text{for } n > 1.$$
By adjusting indices, the expression developed in the preceding example can be applied by setting $b = c = 1$. It follows that the solution is $f(n) = n$ for $n \ge 1$. Hence SUM is an $O(n)$ algorithm and has linear complexity. #
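The complexity function of SUM can be observed directly by instrumenting the procedure. The following Python sketch is an illustration rather than the text's procedure verbatim: the name `sum_with_count` and the explicit counter are assumptions added here, and the array is indexed from 0 in the usual Python manner instead of from 1.

```python
from typing import Sequence, Tuple


def sum_with_count(A: Sequence[int], n: int) -> Tuple[int, int]:
    """Return (sum of the first n entries of A, number of additions performed)."""
    total = 0
    additions = 0
    for i in range(n):       # the text's loop index runs 1..n; Python's runs 0..n-1
        total = total + A[i]
        additions += 1       # one addition per pass through the loop
    return total, additions


# The addition count equals n, matching the solution f(n) = n.
data = [4, 1, 7, 2, 9, 5]
for n in range(1, len(data) + 1):
    total, additions = sum_with_count(data, n)
    assert total == sum(data[:n])
    assert additions == n
```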
In the remainder of this section we will consider some special classes of recurrence systems which are especially important for characterizing the performance of recursive programs. While we will obtain solutions of some of these recurrence systems, the primary goal will be to determine the asymptotic behavior of a broad class of systems without actually finding solutions for the systems.

Divide and Conquer Algorithms

It is sometimes possible to divide a problem into smaller subproblems, solve the subproblems, and then combine their solutions to obtain the solution to the original problem. This general approach, often referred to as “divide and conquer,” is a powerful technique in algorithm design. Since the subproblems are usually of the same type as the original problem, a divide and conquer strategy can often be implemented as a recursive algorithm. In the remainder of this section we will consider some classes of recurrence systems which are useful for describing the complexity functions of recursive algorithms, including many which use a divide and conquer strategy. Our treatment will proceed from the specific to the general. We begin by solving some special classes of recurrence relations explicitly and characterizing their asymptotic behaviors. Then, using the solutions to these systems as bounds, we will show how to determine the asymptotic behavior of the solutions to a larger class of recurrence systems without actually solving the systems.

In general, a divide and conquer algorithm will solve a small problem directly and will solve a larger problem by dividing it into a set of subproblems of approximately the same size. These algorithms are easiest to describe if one assumes the subproblems are equal in size. For example, if an algorithm divides a problem of size $n > 1$ into two subproblems of approximate size $n/2$, then the algorithm can most easily be described if we assume n is a power of 2. This will enable us to divide a problem of size n into two problems of size $n/2$, then divide each of the problems of size $n/2$ into problems of size $n/4$, etc. The recurrence system for such an algorithm will specify the values of the complexity function only for arguments which are powers of the appropriate integer $b > 1$. We will consider the class of divide and conquer algorithms which obey the following constraints:
1. The cost of solving a problem of size $n = 1$ is c, where c is a nonnegative constant.
2. For $k > 0$, problems of size $n = b^k$ are divided into $a$ different subproblems of size $n/b$.
3. For all problems of size $n > 1$, the cost of breaking the problem into subproblems plus the cost of combining the solutions of the subproblems to obtain a solution to the original problem is $h(n)$, a function of n.

These conditions yield recurrence systems of the following form:
$$f(1) = c, \qquad f(n) = a f(n/b) + h(n) \quad \text{for } n = b^k,\ k > 0.$$
Since $f(1) = f(b^0)$, a recurrence system of this form specifies a value of f for all arguments which are (nonnegative) integer powers of b. Because the values of f are not specified for other arguments, the system will not have a unique solution $f: N \to R$, but we will see that this does not detract from its usefulness if the cost of solving a problem of size n is monotone increasing with n. We first treat recurrence systems in which $f(n) = a f(n/b) + h(n)$ for the special case where $h(n) = c$. The solutions to these systems for $n = b^k$ will then be used to characterize the asymptotic behavior for all arguments of a large class of recursive algorithms.

Lemma 5.3.2a: Let a, b, and c be integers such that $a \ge 1$, $b > 1$, and $c > 0$, and let $f: N \to R$ be any function whose values obey the recurrence system
$$f(1) = c, \qquad f(n) = a f(n/b) + c \quad \text{for } n = b^k \text{ where } k > 0.$$
For all arguments which are powers of b,
(a) if $a = 1$, then $f(n) = c(\log_b n + 1)$;
(b) if $a \ne 1$, then $f(n) = \dfrac{c(a n^{\log_b a} - 1)}{a - 1}$.

Proof: Let $n = b^k$, $k \ge 1$. Then
$$\begin{aligned}
f(n) &= a f(n/b) + c \\
a f(n/b) &= a^2 f(n/b^2) + ac \\
a^2 f(n/b^2) &= a^3 f(n/b^3) + a^2 c \\
&\;\;\vdots \\
a^{k-1} f(n/b^{k-1}) &= a^k f(n/b^k) + a^{k-1} c.
\end{aligned}$$
Summing both sides of the above sequence of equations and cancelling common summands, and noting that $f(n/b^k) = f(1) = c$, we have
$$f(n) = c a^k + c \sum_{i=0}^{k-1} a^i = c \sum_{i=0}^{k} a^i.$$
(a) If $a = 1$, then $f(n) = c(k + 1)$. But $k = \log_b n$, so $f(n) = c(\log_b n + 1)$.
(b) If $a \ne 1$, then from Theorem 2.5.3 we have
$$f(n) = c\left(\frac{a^{k+1} - 1}{a - 1}\right).$$
But $k = \log_b n$, and $a^{\log_b n} = n^{\log_b a}$. Therefore,
$$f(n) = \frac{c(a \cdot a^{\log_b n} - 1)}{a - 1} = \frac{c(a n^{\log_b a} - 1)}{a - 1}.$$
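Lemma 5.3.2a can be checked numerically on powers of b. The Python sketch below is a hedged illustration: the names `f_rec` and `f_closed` are assumptions made for this example. To stay in exact arithmetic, the closed form is written in terms of $k = \log_b n$, using $a^{k+1} = a \cdot n^{\log_b a}$, rather than by evaluating a floating-point logarithm.

```python
from fractions import Fraction


def f_rec(n: int, a: int, b: int, c: int) -> int:
    """f(1) = c, f(n) = a * f(n / b) + c, defined for n a power of b."""
    if n == 1:
        return c
    assert n % b == 0, "n must be a power of b"
    return a * f_rec(n // b, a, b, c) + c


def f_closed(k: int, a: int, c: int) -> Fraction:
    """Value at n = b**k: c(k + 1) if a = 1, else c(a^(k+1) - 1)/(a - 1)."""
    if a == 1:
        return Fraction(c * (k + 1))
    return Fraction(c * (a ** (k + 1) - 1), a - 1)


# Compare the recurrence with the closed form for several parameter choices.
for a in (1, 2, 3, 4):
    for b in (2, 3):
        for c in (1, 5):
            for k in range(8):
                assert f_rec(b ** k, a, b, c) == f_closed(k, a, c)
```

With $a = b = 2$ the values grow linearly in n, while with $a = 1$ they grow as $\log_b n$, in agreement with the two cases of the lemma.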
From Lemma 5.3.2a we can determine the asymptotic behavior of the function f for those arguments which are powers of b. The following definition is a generalization of the concepts introduced in Section 5.2. This generalization permits us to discuss the asymptotic behavior of functions on a subset of the domain N.

Definition 5.3.1: Let f and g be functions from N to R, and let S be an infinite subset of N. Then f is $O(g)$ on S if there exist $k \ge 0$ and $m \ge 0$ such that $|f(n)| \le m|g(n)|$ for all $n \in S$ such that $n \ge k$.

Example
Let $f: N \to R$ be defined as follows:
$$f(x) = \begin{cases} 1 & \text{if } x \text{ is even,} \\ x & \text{if } x \text{ is odd.} \end{cases}$$
Then f is $O(1)$ on the set of even integers, but f is not an $O(1)$ function.
#
It is easy to see that if g is $O(h)$ and $S \subset N$, then g is $O(h)$ on S. Moreover, the properties of asymptotic behavior we have considered extend in a natural way to asymptotic behavior on S. For example, if c is a constant and f and g are $O(h)$ on S, then $cf$ and $f + g$ are $O(h)$ on S. The next lemma is an immediate consequence of Lemma 5.3.2a; its proof is left as an exercise.

Lemma 5.3.2b: Let a, b, and c be integers such that $a \ge 1$, $b > 1$, and $c > 0$, and let $f: N \to R$ be a function such that
$$f(1) = c, \qquad f(n) = a f(n/b) + c \quad \text{for } n = b^k \text{ where } k > 0.$$
Let $S = \{b^k \mid k \in N\}$.
(a) If $a = 1$, then f is $O(\log n)$ on S.
(b) If $a \ne 1$, then f is $O(n^{\log_b a})$ on S.
We now use the preceding lemma to characterize the asymptotic behavior for arguments which are powers of b for a large class of recurrence systems.

Theorem 5.3.2: Let a, b, and c be integers such that $a \ge 1$, $b > 1$, and $c > 0$, and let $f: N \to R$ be any function such that
$$f(1) \le c, \qquad f(n) \le a f(n/b) + c \quad \text{for } n = b^k \text{ where } k > 0.$$
Let $S = \{b^k \mid k \in N\}$.
(a) If $a = 1$, then f is $O(\log n)$ on S.
(b) If $a \ne 1$, then f is $O(n^{\log_b a})$ on S.

Proof: Let g be the solution to the recurrence system where equality holds for both conditions of the recurrence system; that is,
$$g(1) = c, \qquad g(n) = a g(n/b) + c \quad \text{for } n = b^k \text{ where } k > 0.$$
By Lemma 5.3.2b, the function g is $O(\log n)$ on S if $a = 1$ and $O(n^{\log_b a})$ on S if $a \ne 1$. It is easy to show by induction that any function f which satisfies the following inequalities
$$f(1) \le c, \qquad f(n) \le a f(n/b) + c \quad \text{for } n = b^k \text{ where } k > 0,$$
is bounded by the function g for all arguments which are powers of b, that is, if $n \in S$, then $f(n) \le g(n)$.
We conclude that the function f is $O(\log n)$ on S if $a = 1$ and f is $O(n^{\log_b a})$ on S if $a \ne 1$.

Example
The procedure MAXMIN given in Fig. 5.3.1 applies a divide and conquer strategy to return the maximum and minimum values of the entries $A[i], \ldots, A[j]$ of a vector A. MAXMIN first determines if there is a single entry, i.e., if $i = j$; in this case, MAXMIN returns the ordered pair [...] If i [...] 1.
(a) If g is $O(\log n)$, then f is $O(\log n)$.
(b) If g is $O(n \log n)$, then f is $O(n \log n)$.
(c) If g is $O(n^d)$, then f is $O(n^d)$ for $d \in R$, $d > 0$.

Proof of (a): Let $S = \{n \mid n = b^k\}$; then f is $O(g)$ on S. Since g is $O(\log n)$, it follows that f is $O(\log n)$ on S. Hence there exist numbers $r \in N$ and $K \in R+$ such that if $n \ge r$ and $n = b^k$, then $f(n) \le K \log n$. Consider any $m \in N$ such that r [...] 0, and let $f: N \to R+$ be a monotone increasing function such that
fM 8, then fis O(n'*?).
Proof: Suppose F(@) as follows:
n= b*, where
k © N
and
k > 0. Then
we
can
bound
[...] $b$, then $\log_b a > 1$ and therefore $O(n) \subset O(n^{\log_b a})$. It follows that if $a > b$, then f is $O(n^{\log_b a})$. ∎

Example
Suppose S is an arbitrary sequence of n distinct elements and we wish to build a binary search tree of minimum height which contains the elements of S as node values. The following algorithm can be used.
1. Find the median element m of S. (The median is the element of S that would appear in the $\lceil n/2 \rceil$th position if the sequence S were sorted.) The root of the tree is assigned the value m.
2. Form two sequences $S_1$ and $S_2$ such that $S_1$ consists of those elements of S which are less than m and $S_2$ consists of those elements of S which are greater than m.
3. Apply this procedure recursively to $S_1$ to construct the left subtree of the root, and to $S_2$ to construct the right subtree.
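The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the text's algorithm in full: the names `Node`, `build_bst`, and `height` are introduced here, and the median is found by sorting, which is simpler than a linear-time selection algorithm but costs more than $O(n)$ per call. Height is counted so that a single node has height 0, as in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_bst(S: Sequence[int]) -> Optional[Node]:
    """Build a minimum-height binary search tree from distinct elements."""
    if not S:
        return None
    # Step 1: the median, i.e. the ceil(n/2)-th element in sorted order.
    # Sorting is an assumption made for brevity; it is not linear time.
    m = sorted(S)[(len(S) - 1) // 2]
    # Step 2: split the remaining elements around the median.
    S1 = [x for x in S if x < m]
    S2 = [x for x in S if x > m]
    # Step 3: recurse on each part to form the left and right subtrees.
    return Node(m, build_bst(S1), build_bst(S2))


def height(t: Optional[Node]) -> int:
    """Height of a tree; a single node has height 0, an empty tree -1."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


tree = build_bst([5, 3, 1, 6, 0, 2, 4])
assert tree is not None and tree.value == 3
assert height(tree) == 2
```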
An $O(n)$ algorithm exists for finding the kth largest (and therefore the $\lceil n/2 \rceil$th largest, or median) element of any sequence; it follows that there exists some integer c such that the median of any set with n elements can be found with no more than $cn$ comparisons. (A careful description of a linear algorithm to find the median of a sequence of elements is beyond our scope. The reader is referred to Aho, Hopcroft and Ullman [1974], page 97.) Thus, step 1 can be performed with at most $cn$ comparisons. After the median m has been found, the sequences $S_1$ and $S_2$ can be formed by comparing m with every element $a_i \in S - \{m\}$; we add $a_i$ to $S_1$ if $a_i < m$ and add it to $S_2$ if $a_i > m$. Thus step 2 can be accomplished with $n - 1$ comparisons. Consequently, we can characterize the number of comparisons necessary to build the binary search tree from S as follows:
$$f(n) \le 2f(n/2) + cn + (n - 1).$$
Therefore,
S(n) 0.
il
i"
n > 0. for
Yn == Vant
Find a solution for each of the following recurrence systems and determine the asymptotic complexity of the solution. (The symbols a and b denote arbitrary positive constants.)
(a)
xo = 1,
(c)
x, =1,
Xn == Xq1 ta X_ = 2x,1 —1
(ce)
x, =1,
(g)
xo
(a)
(b) (a)
Xn == AXq1 =
3,
Xn = 3x1
+07
fora > 0. fornm>
1,
forn>
1.
(b)
xo =a,
(d)
xo = 1,
(f)
xo =9,
Xy = Xy1 + BP Xn = (H+
Xn
Dx,1
= X_p1 tn—1
forn>
0.
for n>
0.
= forn>0.
for n > 0.
(a) Find a recurrence system to describe the number of moves that must be made in a Tower of Hanoi problem with n discs, where $n > 0$. (See problem 5.1.21(a).)
(b) Solve the recurrence system of part (a).
(a) Consider n coplanar straight lines, no two of which are parallel and no three of which pass through a common point. Find a recurrence system to describe the number of disjoint areas into which the lines divide the plane. Show that $(n^2 + n + 2)/2$ is a solution.
(b) Suppose that $n \ge 3$ and exactly three of the lines pass through a common point. Find a recurrence system for the number of regions into which the lines divide the plane.
A derangement of n objects Thus, if fis a derangement Stk) & k for all k 2.
(Hint: A derangement either interchanges the first element with another, or it does not.)
(a) The total path length of a tree is the sum of the lengths of all simple directed paths from the root of the tree to a node. Find a recurrence system for the minimum total path length of a complete n-ary tree of height h.
(b) Find the solution to the recurrence system of part (a).
(c) The external path length of a tree is the sum of the lengths of all simple directed paths from the root of the tree to a leaf. Find a recurrence system for the minimum external path length of a complete n-ary tree of height h.
(d) Find the solution to the recurrence system of part (c).
Prove Lemma 5.3.2b. Let f:
N— R be a function which satisfies the following relations where b, c > 0:
fO) 1. (Hint: Handle n = 1 and n = 2 as special cases, and make sure your algorithm does not divide an array with an even number of entries into two arrays both of which have an odd number of entries.) 11.
(a) Construct a recursive procedure MAX2 to implement a divide and conquer strategy for finding the largest element in the entries $A(i), \ldots, A(j)$ of an array A. Your procedure should divide the array into two approximately equal subarrays.
(b) State the recurrence system which characterizes the complexity function f for MAX2 if $f(n)$ is defined to be the number of comparisons made between entries of an n element array A, where n is a power of 2.
(c) Find the solution of the recurrence system of part (b).
(d) Determine the asymptotic behavior of the complexity function.
(e) Design a procedure MAX3 to find the largest element in an array by dividing the array into 3 approximately equal subproblems.
(f) Describe how you could generalize this procedure to one which creates k subproblems. Discuss the asymptotic complexity of this class of algorithms.
12.
(a) Design a recursive procedure TWOMAX which finds the largest two elements of an array A.
(b) State the recurrence system for the complexity function f of TWOMAX where $f(n)$ is defined as in problem 11(b).
(c) Solve the recurrence system of part (b) and determine the asymptotic complexity of f.
13.
(a) A binary search such as that given in Fig. 5.3.2 can be viewed as an implementation of a tree search algorithm such as that given in Fig. 3.2.2. Describe how the entries of the array correspond to node values of the tree and how to find the values of the left son and right son of the root.
(b) The tree search algorithm given in a recursive form in Fig. 3.2.2 can also be given in an iterative form, as in Fig. 3.2.1. Write an iterative form of BINSEARCH (Fig. 5.3.2).

5.4 ANALYSIS OF ALGORITHMS
The evaluation and comparison of algorithms is a central concern of computer science. Two kinds of questions predominate:
(a) What is the cost of using a given algorithm to solve a problem of a specified class?
(b) What is the least costly algorithm which will solve the problems of a specified class?
By choosing an appropriate measure of cost, we can often answer such questions. If the same measure of cost is applied to different algorithms for the same task, we can compare algorithms and choose from among them. In some cases, we can establish a lower bound on the cost of solving the problems of a specified class; such a bound provides a measure of the inherent difficulty of solving those problems. Furthermore, if the cost of applying an algorithm is equal to the lower bound,
then we can conclude that the algorithm is optimal for this measure, that is, no
algorithm exists which will solve the problems of the class with a lower cost. The topics of algorithm analysis and computational complexity are concerned with the construction, evaluation, and comparison of algorithms. The cost of applying an algorithm can be measured in a variety of ways. It is often inappropriate to measure the cost of operations using real programs run on real machines because of the difficulty of generalizing such results. We usually
prefer to measure the cost using a mathematical model based on an idealized
programming language or computing machine. However, in any such analysis, the set of operations which can be performed must be specified and the cost associated with performing each operation must be given. For example, we may assume that all arithmetic operations cost the same or we may assume (more accurately, for most computers) that multiplication is more costly than addition. Alternatively, we may choose to ignore the cost of some operations. For example, the cost of applying some sorting algorithms is essentially proportional to the number of comparisons made between elements of the set being sorted. In the analysis of such sorting algorithms, it is common to ignore operations such as assignments, arithmetic operations, and comparisons of loop indices.

In this section we will consider some algorithms and discuss their cost of execution. In some cases we will also comment on the optimality of these algorithms. Optimality can be discussed in a variety of ways, of which two will be important to us here. First, we can investigate the absolute optimality of an algorithm with respect to a specified set of operations. If an algorithm is optimal in the absolute sense, then if the primitive operations are restricted appropriately, no algorithm can perform the task using fewer operations than the optimal algorithm. Second, there is the weaker concept of asymptotic optimality. Suppose f is the complexity function of an algorithm A which solves a specified problem. Then A is asymptotically optimal if for every other algorithm B that solves the problem, if the complexity function of B is g, then f is $O(g)$. Thus for sufficiently large arguments, the value of f is bounded by a multiple of the value of g. Informally, we say $O(f)$ is a lower bound on the asymptotic complexity of the class of algorithms. Note that two algorithms with distinct complexity functions can both be asymptotically optimal. In contrast, if f and g are complexity functions of algorithms for some problem class, and if f is optimal in the absolute sense, then $f(n) \le g(n)$ for every argument $n \in N$.

Table 5.2.1 describes how the growth of the cost of an algorithm is determined by its asymptotic behavior. As a rule of thumb, we can say that it is usually feasible to execute algorithms of $O(n)$ and $O(n \log n)$ complexity for fairly large values of n. Time or space limitations often make it difficult or impossible to execute $O(n^2)$ and $O(n^3)$ algorithms for even moderate values of n. Exponential algorithms (those of $O(a^n)$ where $a > 1$) cannot generally be executed except for small values of n.

We will now analyze several algorithms, characterize their complexity functions, and consider their optimality. We will describe algorithms for finding the maximum element of a set, algorithms for searching for a specified element in a set, and algorithms for sorting the elements of a set. All of the algorithms we describe are based on comparisons; that is, the result of applying the algorithm is determined by a sequence of comparisons between elements of a set. We will treat the question of optimality only for the class of algorithms based on comparisons where the number of outcomes of any comparison is bounded. (Most algorithms of interest have either two or three possible outcomes for each comparison, e.g., $<$ and $\ge$, or $<$, $=$, and $>$.) Thus, our claims that certain algorithms are asymptotically optimal depend on our considering only a restricted class of algorithms; the claims may not hold if we consider algorithms which are not based on comparisons or algorithms in which the number of outcomes of a comparison is not bounded.
Finding the largest element of an array: an O(n) algorithm
Let A[1:n] be a vector with n ≥ 1 entries. We are to find the largest entry in A and set the variable max equal to its value. Let f be the complexity function such that f(n) is the number of comparisons made between entries of A if A has n entries.

    procedure MAX:
    begin
        max ← A[1];
        for i ← 2 until n do
            if max < A[i] then max ← A[i]
    end

Fig. 5.4.1 An algorithm to find the maximum entry in an array A[1:n] where n ≥ 1

We consider the algorithm MAX of Fig. 5.4.1. Each comparison in MAX occurs within a loop which is traversed for loop index values of i = 2, 3, ..., n. Hence the procedure MAX makes n − 1 comparisons of entries of A, and its complexity function is the following:

    f: N → R,
    f(0) = 0,
    f(n) = n − 1    for n ≥ 1.

Clearly f is O(n), and therefore the algorithm is linear. We now show that MAX is, in fact, optimal in the absolute sense and therefore in the asymptotic sense as well.
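The procedure MAX can be sketched in Python with an explicit counter for the comparisons between entries. This is a minimal illustration, not the text's own notation: the name `find_max` and the returned pair are assumptions made for the example, and the comparison of the loop index with n is deliberately not counted.

```python
def find_max(a):
    """Return (largest entry, number of comparisons between entries of a)."""
    if not a:
        raise ValueError("the array must have at least one entry")
    largest = a[0]            # max <- A[1]
    comparisons = 0
    for x in a[1:]:           # i = 2, 3, ..., n
        comparisons += 1      # one comparison "max < A[i]"
        if largest < x:
            largest = x
    return largest, comparisons


if __name__ == "__main__":
    print(find_max([3, 9, 2, 7]))   # 4 entries, so 3 comparisons
```

For an array of n entries the counter always ends at n − 1, whatever the order of the entries.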
Theorem 5.4.1: Any algorithm to find the maximum element of a set with n members, n > 0, must make at least n − 1 comparisons.

Proof: Each comparison establishes that one element is not larger than another. In order to find the maximum element, each of n − 1 elements must be shown (by means of a comparison) to be no larger than some other element. Hence n − 1 comparisons are necessary to find the maximum of n elements. ∎
It follows immediately from Theorem 5.4.1 that if the number of comparisons between elements of an array is used to measure the cost of applying an algorithm, then MAX is optimal in the absolute sense. The procedure MAX uses more comparisons than just those between elements of A, since each execution of the loop will be preceded by a comparison of the value of the loop index i with n. If the algorithm were implemented as a decision tree for a particular n, or if the data items were read sequentially, then the comparisons associated with the loop index would be eliminated. Thus these additional comparisons are a consequence of the algorithm implementation. Since we are interested in the operations performed by the algorithms rather than their implementations, these comparisons are usually ignored.
Alternative optimal methods exist for finding the maximum of a sequence of n elements. In a sports tournament, players are often paired off for each round of contests, with the winners of round i competing against each other in round i + 1. The following graph represents this method for finding the best of eight players; it uses seven comparisons. This approach generalizes easily to values of n which are not powers of two.

[Figure: tournament tree. The leaves are the contestants 1 through 8, an internal node is labeled "best of {1, 2, 3, 4}", and the root is labeled "the winner".]
After the winner has been found, the resulting labeled tree provides some help in finding the second best player, since he must have been one of the three players who lost to the winner. Thus, only two more matches need be played to find the second place winner.

The algorithms we have described for finding the largest element of a sequence have the property that the cost is uniform over all problems of size n. In general, however, the cost of applying an algorithm to a problem of size n may depend on the particular problem solved. Consider, for example, sorting a list of n entries. If all the entries are distinct, then there are n! different permutations of the n entries and consequently n! different lists with the same set of entries. The cost of applying a particular sorting algorithm to a list with these n entries will usually depend on the order in which the entries appear; for example, if the list is nearly sorted, then the algorithm may have to do very little work.

The cost of applying an algorithm to a problem of size n is usually based on either a worst case or an average case analysis. A worst case analysis defines the cost of applying an algorithm to a problem of size n as the maximum cost over all problems of size n. Thus, if f is a complexity function based on a worst case analysis, then for every problem of size n, the cost of applying the algorithm is no greater than f(n). In an average case analysis, a probability distribution is assumed over the set of problems of size n and the average cost is calculated based on this probability distribution. Such an analysis often assumes all problems of size n are equally likely; in this case, the value of f(n) is equal to the sum of the costs of applying the algorithm to all problems of size n divided by the number of problems of that size. Of the two kinds of analysis, worst case is usually simpler because it only requires that we determine how bad things can be and then analyze that single case, whereas an average case analysis must account for all possible cases and then weight them appropriately.
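The tournament method for finding the best and second best players, described earlier in this section, can be sketched as follows. This is a hedged illustration under stated assumptions: the function name `tournament`, the bookkeeping of "players beaten" lists, and the handling of an odd player by a bye are choices made for the example, and the entries are assumed to be distinct.

```python
def tournament(players):
    """Return (winner, second best, comparisons) for distinct comparable items."""
    if not players:
        raise ValueError("need at least one player")
    comparisons = 0
    # Pair each surviving player with the list of players it has beaten so far.
    current = [(p, []) for p in players]
    while len(current) > 1:
        nxt = []
        for i in range(0, len(current) - 1, 2):
            (a, beaten_a), (b, beaten_b) = current[i], current[i + 1]
            comparisons += 1
            if a < b:
                nxt.append((b, beaten_b + [a]))
            else:
                nxt.append((a, beaten_a + [b]))
        if len(current) % 2 == 1:       # an unpaired player advances with a bye
            nxt.append(current[-1])
        current = nxt
    winner, losers = current[0]
    # The second best player must have lost directly to the winner.
    second = None
    for p in losers:
        if second is None:
            second = p
        else:
            comparisons += 1
            if second < p:
                second = p
    return winner, second, comparisons


if __name__ == "__main__":
    print(tournament([4, 7, 1, 8, 3, 6, 2, 5]))
```

With eight players the sketch uses seven comparisons to find the winner and two more among the three players who lost to the winner, nine in all.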
Searching Algorithms
Sequential searching: an O(n) algorithm.
Consider the problem of accessing the records of a file. We assume that each record includes a search key which is used to retrieve records from the file. For example, if the file consists of information about individuals, the search key of a record might be the individual's name or social security number. In order to locate a record in a file, a search argument is specified. The result of a search is the set of records whose keys are equal to the search argument; if no such records are in the file, this set is empty. We will treat only the special case where each record has a unique key; thus, each search will return at most one record.

The simplest file organization for this problem stores the records in a linear list or vector. If the file has n records, then the list will have n entries. The simplest search procedure is a "sequential search"; this procedure examines the records in the order in which they appear on the list until either a record whose key is equal to the search argument is found or it is established that no such record is in the file. We define the cost of a search to be the number of records examined, i.e., the number of times the search argument is compared with the search key of a record. In the worst case, either the record sought will be the last record of the file or the record will not be in the file; all n records of the file must be examined to establish either of these possibilities. Thus, for a worst case analysis, the complexity function of a linear search is f(n) = n, and hence the search is of O(n) complexity.

We now analyze the average case performance of a sequential search with the assumptions that each record of the list is equally likely to be the object of a search, and every search results in a record being found; thus, there are no unsuccessful searches. A search for the ith record will occur approximately one out of every n times and will require that i records be examined. Since the complexity function f is defined to be the average number of records examined,

    f(n) = (1/n) Σ_{i=1}^{n} i = (1/n) · n(n + 1)/2 = (n + 1)/2.
Thus, an average case analysis (with the assumptions stated above) leads us to conclude that "on the average," each search of the file will examine about half the records. Note that the value of f(n) for a worst case analysis is about twice that for an average case analysis, but both analyses yield complexity functions which are O(n).
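The worst case and average case counts for a sequential search can be checked with a short sketch. The names `sequential_search` and `average_cost` are illustrative, the keys are assumed to be unique, and the average is taken over successful searches only, each key being equally likely, as in the analysis above.

```python
from fractions import Fraction


def sequential_search(keys, target):
    """Return (index of target or None, number of records examined)."""
    examined = 0
    for index, key in enumerate(keys):
        examined += 1                 # one comparison of target with a search key
        if key == target:
            return index, examined
    return None, examined


def average_cost(keys):
    """Average number of records examined over one search for each key."""
    total = sum(sequential_search(keys, k)[1] for k in keys)
    return Fraction(total, len(keys))


if __name__ == "__main__":
    keys = [17, 4, 23, 8, 15, 42, 16]
    print(sequential_search(keys, 99))   # unsuccessful search examines all n records
    print(average_cost(keys))            # equals (n + 1)/2
```

For the seven keys shown, the average is (7 + 1)/2 = 4 records, while an unsuccessful search examines all 7.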
Searching with a binary search tree: an O(log n) algorithm
In Section 3.2 we described the use of binary search trees for storing and accessing the records of a file. If the height of a binary search tree is h, it was shown that searching for a record using the tree would require examining no more than h + 1 records in the file. If a file contains n records, n > 1, then the height h of the binary search tree satisfies the inequality
log n [...]

[...] B be an enumeration of B, and for each positive k ∈ N define the set F_k as follows:

    F_k = {f | f ∈ B^A and f(A) ⊆ g({0, 1, 2, ..., k − 1})}.

Then F_k includes every function whose image is contained in the set consisting of the first k elements of the enumeration of B; |F_k| = k^|A|. Since A is finite, for each function f: A → B there exists some m ∈ N such that if k ≥ m, then f ∈ F_k; therefore B^A = ∪_{k∈N} F_k. But each set F_k is finite and therefore countable. Hence, by Theorem 6.2.3, we conclude that ∪_{k∈N} F_k is countable. ∎

As our definitions have suggested, not all infinite sets are countably infinite. The next theorem establishes that we need another infinite cardinal number.

Theorem 6.2.5: The subset of real numbers, [0, 1], is not countably infinite.
Proof: Recall that [0, 1] denotes the set {x | x ∈ R ∧ 0 ≤ x ≤ 1}. [...]

    [...] → R,    g(x) = (1/2 − x) / (x(1 − x))

The function g has the following graph.

[Figure: graph of g.]

Since f of the preceding example is a bijection from [0, 1] to (0, 1), and g is a bijection from (0, 1) to R, the composite function gf is a bijection from [0, 1] to R. Hence, |R| = c. ∎
Problems: Section 6.2

1. Show that each of the following sets is countably infinite.
   (a) Σ*, where Σ = {a}.
   (b) {<x1, x2, x3> | x_i ∈ I}.
   (c) The set of all finite subsets of {a, b}*.
   (d) The set of all first-degree polynomials with integer coefficients.
   (e) The set of all finite digraphs with nodes in N.

2. Show that each of the following sets has cardinality c by constructing a bijection from [0, 1] to the set.
   (a) (a, b), where a < b and a, b ∈ R.
   (b) {x | x ∈ R ∧ x > 0}.
   (c) {<x, y> | x, y ∈ R ∧ x² + y² = 1}.

3. Let |A| = c, |B| = c, |D| = ℵ0, |E| = n > 0, where A, B, D, and E are disjoint. Prove each of the following.
   (a) |A ∪ B| = c.
   (b) |A ∪ D| = c.
   (c) |D × E| = ℵ0.

4. Try to find a set S such that |𝒫(S)| = ℵ0. If you do not succeed, describe the difficulties encountered.

5. Prove part (a) of Theorem 6.2.4.

6. (a)
In Theorem 6.2.5, suppose we use a binary expansion for f(i) and define the digits of y in the obvious way:

       y_i = 0 if x_ii = 1,
       y_i = 1 if x_ii = 0.

   Show that y may be equal to f(j) for some j ∈ N.
   (b) Explain what difficulties might arise because of the nonuniqueness of the decimal representation of some numbers in [0, 1]. How does this influence the selection of the values for y_i in the real number y in the proof of Theorem 6.2.5?

7.
Joe Cool, a student at Silo Tech, has suggested the following proof that no bijection exists from N to N. Assume f is a bijection from N to N, with f(k) = i_k. For each i_k, construct a number in [0, 1] by reversing the digits of i_k and putting a decimal point to the left. For example, if i_k = 123, the number constructed becomes .321000... This defines a map g from N to [0, 1] which is injective, e.g., g(123) = .321000... Apply the Cantor diagonal technique to the array

       g∘f(0) = .x00 x01 x02 ...
       g∘f(1) = .x10 x11 x12 ...
       ...

   to construct the number y ∈ [0, 1]. Now reverse the digits of y and put the decimal point to the right. The result is a number which does not appear in the list f(0), f(1), ..., which contradicts the assertion that f is surjective. Hence, no bijection can exist from N to N. Should we promote Joe to full professor or suggest he find a job as a COBOL programmer (assuming the two are mutually exclusive)?

6.3 COMPARISON OF CARDINAL NUMBERS
The preceding sections introduced the finite cardinal numbers, the cardinal number ℵ0 for a countable infinity, and the cardinal number c for some sets of an uncountable infinity. In each case, the cardinality of a set A was established by constructing a bijection from a standard set to A. This allows us to show that two sets have the same cardinality, but so far, we have not defined an order relation which will enable us to assert that one set is larger than another. In this section, we develop the order relations ≤ and < on cardinal numbers and show that they have properties similar to the usual order relations over the real numbers. The following definition formalizes the concept of two sets having the same cardinality even when a standard set has not been specified.
Definition 6.3.1: Let A and B be sets. Then A and B are equipotent or have the same cardinality, denoted by |A| = |B|, if there is a bijection from A to B.

Example: Let E be the set of positive even integers. Then |I+| = |E| because the function

    f: I+ → E,    f(x) = 2x

is a bijection from I+ to E.
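The bijection f(x) = 2x from the positive integers to the positive even integers can be spot-checked on a finite range. This is only an illustrative sketch: a finite check cannot prove that f is a bijection on the infinite sets, and the names `f_inverse` and `check_bijection_on_prefix` are assumptions introduced for the example.

```python
def f(x):
    """Map a positive integer to a positive even integer."""
    return 2 * x


def f_inverse(y):
    """Map a positive even integer back to the positive integer it came from."""
    if y <= 0 or y % 2 != 0:
        raise ValueError("y must be a positive even integer")
    return y // 2


def check_bijection_on_prefix(limit):
    """Check injectivity and surjectivity of f for arguments 1..limit."""
    domain = range(1, limit + 1)
    evens = range(2, 2 * limit + 1, 2)
    injective = len({f(x) for x in domain}) == limit
    surjective = all(f(f_inverse(y)) == y for y in evens)
    round_trip = all(f_inverse(f(x)) == x for x in domain)
    return injective and surjective and round_trip


if __name__ == "__main__":
    print(check_bijection_on_prefix(1000))
```

Exhibiting an explicit inverse, as `f_inverse` does, is the usual way to show that a function is a bijection.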
Because bijections are closed under composition, and inverses of bijections are bijections, the relation of equipotence has the following property.
Theorem 6.3.1: Equipotence is an equivalence relation over any collection of sets.
The proof is straightforward and left as an exercise. It follows from the preceding theorem that to show a set S has cardinality α, it suffices to choose any set S' which we know has cardinality α and establish the existence of a bijection from S to S' or from S' to S. In general, we choose the set S' to make the proof as easy as possible.

We now consider order relations on sets of cardinal numbers. Our goal is to be able to compare the sizes of sets. For example, our intuition tells us that sets with cardinality c are "larger" than countable sets. Before we formally define the order relation for arbitrary collections of sets, we make the following observations concerning finite sets and their cardinal numbers. Let A and B be finite sets with |A| = n, |B| = m.
(a) If there exists an injection from A to B, then n ≤ m.
(b) If there exists a bijection from A to B, then n = m.
(c) If there exists an injection from A to B, but no bijection exists, then n < m.
These relationships between functions and cardinalities can be extended in a natural way to apply to arbitrary sets.

Definition 6.3.2: The cardinality of A is no greater than (or is less than or equal to) the cardinality of B, denoted |A| ≤ |B|, [...]

[...] One bit is reserved for an indication that overflow has occurred. (We will use 0 for no overflow, 1 for overflow, and use the leftmost bit as the overflow indicator.) For all numbers less than 2^(k−1), the numbers are represented in their k − 1 digit binary representation and the overflow bit is set to 0. If n ≥ 2^(k−1), then f(n) consists of the digit 1 followed by the k − 1 least significant digits of the binary representation of n; e.g., if k = 3, then f(12) = 100. Thus f(3) = 011, f(2) = 010, and f(3 + 2) = 101 = 011 ⊕ 010 = f(4n + 1) for all n ∈ N.
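The k-bit representation with an overflow indicator described above can be sketched as a small function. This is a hedged illustration: the name `represent` and the choice of a bit string as the return value are assumptions made for the example.

```python
def represent(n, k):
    """Return the k-bit string for n: an overflow bit followed by k - 1 value bits."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if n < 0:
        raise ValueError("n must be a natural number")
    width = k - 1
    # Overflow bit: 0 if n fits in k - 1 binary digits, 1 otherwise.
    overflow = "0" if n < 2 ** width else "1"
    # Keep only the k - 1 least significant binary digits of n.
    low_bits = format(n % (2 ** width), "0{}b".format(width))
    return overflow + low_bits


if __name__ == "__main__":
    for n in (2, 3, 5, 12):
        print(n, represent(n, 3))
```

With k = 3 this reproduces the values quoted in the text: f(12) = 100, f(3) = 011, f(2) = 010, and f(3 + 2) = 101.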
[...] 0', k> and let h be a homomorphism from A to A'.
0, Let A = °, ), '(7
0’, Show that if ps = rq => (ps)(tu) = (ray(tu) => (pt)(su) = (rt)(qu)
= (Gz) ~ (i)
This establishes that for any fractions a, b, and c, if a ~ b, then ac ~ bc.

In fact, equivalence of fractions is preserved under the other operations as well; if a, b, and c are fractions and a ~ b, then the following assertions hold:

    c + a ~ c + b        a + c ~ b + c
    c − a ~ c − b        a − c ~ b − c
    ca ~ cb              ac ~ bc
    −a ~ −b

Because ~ is an equivalence relation which is preserved under these operations, we say ~ is a congruence relation with respect to the binary operations of +, −, ·, and the unary operation −.

Rather than define congruence relations for operations of arbitrary arities, [...]
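The claim that equivalence of fractions is preserved by the arithmetic operations can be checked exhaustively on a small sample. This sketch makes several assumptions that are not in the text: a fraction is modeled as a pair (numerator, denominator) with a positive denominator, a ~ b is taken to mean ps = rq for a = p/q and b = r/s, and all function names are illustrative.

```python
from itertools import product


def equivalent(a, b):
    """a ~ b for fractions a = p/q and b = r/s means p*s == r*q."""
    (p, q), (r, s) = a, b
    return p * s == r * q


def add(a, b):
    (p, q), (r, s) = a, b
    return (p * s + r * q, q * s)


def sub(a, b):
    (p, q), (r, s) = a, b
    return (p * s - r * q, q * s)


def mul(a, b):
    (p, q), (r, s) = a, b
    return (p * r, q * s)


def neg(a):
    return (-a[0], a[1])


def congruence_holds(sample):
    """Check that a ~ b implies op(a, c) ~ op(b, c) and op(c, a) ~ op(c, b)."""
    for a, b, c in product(sample, repeat=3):
        if not equivalent(a, b):
            continue
        for op in (add, sub, mul):
            if not equivalent(op(a, c), op(b, c)):
                return False
            if not equivalent(op(c, a), op(c, b)):
                return False
        if not equivalent(neg(a), neg(b)):
            return False
    return True


if __name__ == "__main__":
    sample = [(p, q) for p in range(-3, 4) for q in range(1, 4)]
    print(congruence_holds(sample))
```

A finite check of this kind illustrates the assertions but does not replace the algebraic argument.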
, ,°
,
where 0 mE TAP T CVE P “iE C
E)
1. hypothesis
2. hypothesis
3. hypothesis
4. 3, simplification
5. 1, 4, modus ponens
6. 3, simplification
7. 2, 6, modus ponens
8. 5, 7, disjunctive syllogism
T(x): x is a trigonometric function.
P(x): x is a periodic function.
C(x): x is a continuous function.

    ¬∃x[T(x) ∧ ¬P(x)]
    ∃x[P(x) ∧ C(x)]
    ∴ ¬∀x[T(x) ⇒ ¬C(x)]

The argument is invalid. For a different interpretation consider a universe consisting of a (round, glass) marble and a (round, rubber) ball. Define the predicates as follows:
T(x) denotes "x is a marble."
P(x) denotes "x is a round object."
C(x) denotes "x is made of rubber."
7.
(b)
The third step, which asserts
3x{P(x) \ 7O(Q)) > Ix
7 PQ) A ax 7909]
is fallacious, although

    ∃x[A(x) ∧ B(x)] ⇒ [∃x A(x) ∧ ∃x B(x)]

is true. Thus, if R ⇒ S, we cannot conclude that ¬R ⇒ ¬S. The faulty step corresponds to the fallacy of denying the antecedent.
8. The error is in applying universal generalization to d. Although d was arbitrary when it was chosen, the value of c was constrained by that of d, and choosing a new value for d may violate these constraints.
Section 1.5
1. (a) ∀x[x² is odd ⇒ x is odd].
   Proof: (Indirect) Let x be an arbitrary integer and assume x is not odd. Then x is even. In an example in the text, we showed

       x is even ⇔ x² is even.

   Negating both sides, it follows that

       x is odd ⇔ x² is odd

   and therefore

       x² is odd ⇒ x is odd.

   By universal generalization

       ∀x[x² is odd ⇒ x is odd]. ∎

   (b) ∀x ∀y[(x is even ∧ y is even) ⇒ x + y is even].
   Proof: (Direct) Assume x and y are arbitrary even integers. Then x = 2m and y = 2n for some integers m and n. Therefore, x + y = 2m + 2n = 2(m + n). It follows that x + y = 2k where k = m + n. Hence, x + y is even. ∎

   (d) ∃x ∃y[x is odd ∧ y is odd ∧ x + y is odd].
   The assertion is false. To show this it suffices to prove that the negation

       ∀x ∀y[(x is odd ∧ y is odd) ⇒ x + y is even]

   is true.
   Proof: (Direct) Let x and y be arbitrary odd integers. Then x = 2m + 1 and y = 2n + 1 for some integers m and n. Therefore,

       x + y = (2m + 1) + (2n + 1) = 2(m + n + 1).

   Hence, x + y is even. ∎

   (f) [...] 0].
   Proof: (By cases) By property (iv), one of the following holds: x > y, x = y, or x < y. If x > y, then (by property (v)), x − y is positive and therefore nonnegative. If x = y, then x − y = 0 and is therefore nonnegative. If x < y, then y − x is positive and therefore nonnegative. Thus, in each of the three cases, either x − y or y − x is nonnegative. ∎
   1 = 3 ⇒ ∀x[x² < 0].
   Proof: (Vacuous) The assertion 1 = 3 is false. Hence, the implication is true. ∎

   (m) ∃x[x² ...] ⇒ 1 = 1.
   Proof: (Trivial) The assertion 1 = 1 is true. Hence, the assertion is true. ∎
(a) The proposition

       [(H1 ∧ H2 ∧ ... ∧ Hn ∧ ¬Q) ⇒ 0] ⇒ [(H1 ∧ H2 ∧ ... ∧ Hn) ⇒ Q]

   is a tautology.
(b) To prove by contradiction that (H1 ∧ H2 ∧ ... ∧ Hn) ⇒ Q, we would assume the negation of the assertion, i.e.,

       ¬[¬(H1 ∧ H2 ∧ ... ∧ Hn) ∨ Q]

   or

       H1 ∧ H2 ∧ ... ∧ Hn ∧ ¬Q.

   Then (by applying rules of inference), we would derive a contradiction. The proof technique described in the problem is a straightforward variation. A proof by contradiction assumes an assertion A and proves that the contradiction B follows by rules of inference; the proof technique described simply establishes A ⇒ B,
Section 1.6 2.
Any program which does not halt is a solution (see Definition 1.6.1). The following program is correct, because if it halts (it won’t), the final assertion false will be true. Al: true
while true do x [xX’
[Ix=x
A
mx
> [X’
OQAi=1
[...] Σ_{i=0}^{n} (2i + 1) = (n + 1)². Using the properties of summation and Theorem 2.5.3 we have

    Σ_{i=0}^{n} (2i + 1) = 2 Σ_{i=0}^{n} i + Σ_{i=0}^{n} 1 = n(n + 1) + (n + 1)
                         = n² + 2n + 1 = (n + 1)². ∎

Note that a proof by induction is not required.

(d) Basis: For n = 0, we have 1 + 2n = 1 and 3^n = 1. Therefore, 1 + 2n ≤ 3^n for n = 0.
    Induction: Assume 1 + 2n ≤ 3^n for arbitrary n. The inequality 1 ≤ 3^n holds for all n; hence 2 ≤ 2·3^n, and therefore,

        1 + 2(n + 1) = (1 + 2n) + 2 ≤ 3^n + 2 ≤ 3^n + 2·3^n = 3^(n+1). ∎

10. The induction step of the proof is fallacious. In particular, if n = 1 or n = 2, it is not true that the set S contains two nonequal subsets of n people which must overlap.
Section 2.6

1. (a) The empty set is a model of axioms (b) through (e). (This postulate plays the same role as the basis step in an inductive definition.)
   (b) The "infinite rooted binary tree" example of this section suffices as an example which satisfies all postulates but (b).
   (c) The set {0} where 0' = 0 satisfies all the postulates but (c).
   (d) Let S = {0, 1, 2} where 0' = 1, 1' = 2, and 2' = 1. Then S satisfies all postulates but (d).
   (e) Let S = {0, x_1, x_2, ..., y_1, y_2, ...}, where 0' = x_1, x_i' = x_{i+1}, and y_i' = y_{i+1} for i ∈ I+. Then S satisfies all postulates but (e).
2. (a) We show

       ∀p ∀q ∀r[(p + q) + r = p + (q + r)].

   Let p and q be arbitrary natural numbers. We establish

       ∀r[(p + q) + r = p + (q + r)]

   by induction.
   Basis: Let r = 0. Then we have

       (p + q) + 0 = p + q

   by the basis step of the definition with m = p + q. Also by the basis step with m = q we have

       p + (q + 0) = p + q.

   Hence

       (p + q) + r = p + (q + r)

   if r = 0.
   Induction: By the inductive step of the definition of addition,

       p + (q + r') = p + (q + r)'
                    = (p + (q + r))'
                    = ((p + q) + r)'      (Induction Hypothesis)
                    = (p + q) + r'.

   Thus the assertion holds for all r ∈ N. ∎

Section 2.7

1. (a) A* = {Λ, a, aa}.
   (e) B* = {(ab)^n | n ≥ 0} = {Λ, ab, abab, ...}.
2. (b) A^m A^n = A^(m+n) for all m, n ≥ 0.
   Proof: Let m be an arbitrary integer. We show ∀n[A^m A^n = A^(m+n)] by induction on n.
   Basis: n = 0.

       A^m A^0 = A^m {Λ} = A^m = A^(m+0).

   Induction: Assume the assertion is true for arbitrary n ∈ N. Then,

       A^m A^(n+1) = A^m (A^n A)        (definition of A^n)
                   = (A^m A^n) A        (Theorem 2.7.1c)
                   = A^(m+n) A          (induction hypothesis)
                   = A^((m+n)+1)        (definition of A^n)
                   = A^(m+(n+1))        (associativity of +)

   Hence the assertion is true for n + 1, and we conclude

       ∀n[A^m A^n = A^(m+n)].

   Since m was arbitrary, it follows by universal generalization that

       ∀m ∀n[A^m A^n = A^(m+n)]. ∎
ANSWERS
(a)
xe
A
igN
(b)
(c)
forsomek
¢ AB, LJ
@exe
iéN
By Theorem 2.7.3a, A* ie, A € A" for somen Conversely, if A* = At, AEA, We apply parts (a) and
y €
some
for
xX = yz
k A g() >
fal k A lg@) >
Vklk > O> Va[m > O0> dann e NA (b)
ml f@OI
The universal quantifiers in the first expression of (a) can be interchanged. Therefore, for any fixed nonnegative value of m, the following assertion holds:

    ∀k[k ≥ 0 ⇒ ∃n[n ∈ N ∧ n > k ∧ |g(n)| > m|f(n)|]].

Choose an arbitrary k_1; by the above assertion there is an n_1 > k_1 such that |g(n_1)| > m|f(n_1)|. Let k_2 = n_1 + 1. Again applying the above assertion there is an n_2 > k_2 such that |g(n_2)| > m|f(n_2)|. Let k_3 = n_2 + 1. Then there is an n_3 > k_3 such that |g(n_3)| > m|f(n_3)|. Continuing in this manner, we can construct an infinite set S = {n_1, n_2, n_3, ...} such that |g(n_i)| > m|f(n_i)| for all n_i ∈ S. ∎

(c) The assertion is not true in general. Suppose f(n) = 1 for all n and

    g(n) = n²   for n even,
         = 0    for n odd.

Then f does not asymptotically dominate g but g(n) < f(n) for all n which are odd.
(i) Corollary 5.2.3, part (a)
    (only if) Suppose f is O(g) and g is O(f). Then by Theorem 5.2.3, O(f) ⊆ O(g) and O(g) ⊆ O(f); hence O(f) = O(g).
    (if) Suppose O(f) = O(g). Then O(f) ⊆ O(g) and O(g) ⊆ O(f). Thus, by Theorem 5.2.3, f is O(g) and g is O(f). ∎
(ii) Corollary 5.2.3, part (b)
    Suppose f is O(g) and g is O(h). Then by Theorem 5.2.3, O(f) ⊆ O(g) and O(g) ⊆ O(h). Thus O(f) ⊆ O(h). Hence by Theorem 5.2.3, f is O(h). ∎

10. (a) We show that log n ∉ O(1). Suppose to the contrary that log n ∈ O(1); then there must be some k, m ≥ 0 such that if n ≥ k, then log n ≤ m·1 = m. But if n > 2^m, then log n > m; thus log n is not asymptotically dominated by g(n) = 1 and hence log n is not O(1). By Theorem 5.2.4, O(1) ⊆ O(log n) and it follows from the above argument that the containment is proper. ∎
    (c) We show that if d > 1, then d^n ∉ O(n²). Suppose d^n is O(n²). Then there exist k, m ≥ 0 such that if n ≥ k, then d^n ≤ mn². Then for these values of n, n log d ≤ log m + 2 log n, and for n > 1,

            n / log n ≤ 2 / log d + log m / ((log n)(log d)).

        But the ratio on the left grows arbitrarily large as n increases, whereas the first summand on the right is a constant, and the second term decreases as n increases. Thus the inequality can be violated by choosing n sufficiently large. We conclude that d^n ∉ O(n²). From this result and Theorem 5.2.4, we conclude that the containment is proper. ∎
11. Let K and n be arbitrary positive integers such that K ≥ ⌈c⌉ and n > max(c^K, K). Then

        n! = n(n − 1)(n − 2)...(K + 1)K(K − 1)...2·1.

    Since K ≥ ⌈c⌉,

        (n − 1)(n − 2)...(K + 1)K ≥ c^(n−K).

    Since n > c^K,

        n! > c^K c^(n−K) = c^n.

    Hence n! > c^n if n is sufficiently large, and therefore O(c^n) ⊆ O(n!). To show the containment is proper, it suffices to show that for any m > 0, the value of n can be chosen large enough that n! > mc^n. Without loss of generality, we can assume m > 1. We showed above that if n is chosen large enough, n! > (mc)^n. But for n ≥ 2, (mc)^n > mc^n; hence n! > mc^n for n sufficiently large. It follows that n! is not O(c^n). ∎
13.
If P is a polynomial of degree k, then P(n) = a_0 + a_1 n + a_2 n² + ... + a_k n^k, where a_k ≠ 0. By Theorems 5.2.5, 5.2.2(b), and 5.2.3, a_i n^i ∈ O(n^k) for each i,
0 [...]

    [...] ⇒ (ā ∘ a) ∘ x = (ā ∘ a) ∘ y
          ⇒ 1 ∘ x = 1 ∘ y
          ⇒ x = y.

In the same way, we can show that if x ∘ a = y ∘ a, then x = y. ∎

(b) By definition, a ∘ S = {a ∘ x | x ∈ S}. Since S is closed under ∘, a ∘ S ⊆ S. Now suppose y is an arbitrary element of S. Then for some x ∈ S, namely x = ā ∘ y,

        a ∘ x = a ∘ (ā ∘ y) = (a ∘ ā) ∘ y = 1 ∘ y = y;

    hence a ∘ S ⊇ S. Therefore a ∘ S = S; similarly, S = S ∘ a. ∎

(c) Let x be the inverse of a; then [...]
    Part    Variety      Cardinality
    (a)     group        1
    (b)     semigroup    1
    (c)     semigroup    2
    (d)     group        4
    (e)     semigroup    3
[...] is a monoid if and only if R^k R^j = R^j for all positive j. This holds if and only if R^k R = R. Thus a necessary and sufficient condition is that there exist a k such that R^(k+1) = R. (Note that there need not be a k such that R^k = R^0: an example is [...].)
16.
(a)
(b)
Since k binary digits are used to represent each representable integer, the carrier has 2^k elements. The variety is a group, because 0 is an additive identity and if 2^k − x is added to any representable integer x, the result will be 0. The carrier still has 2^k elements, but the variety is a monoid with identity element 0. For every representable x and y, the operation ⊕ of the monoid is
defined by
Section
R={,, where o and (] have the operation Let A, = and A, = , Az = (S2, 1, ky> and A; =