Introduction to Probability and Random Variables 9783031318153, 9783031318160

English, 240 pages, 2023.

Table of contents :
Preface
Contents
Chapter 1: Experiments, Sample Spaces, Events, and Probability Laws
1.1 Fundamental Definitions: Experiment, Sample Space, Event
1.2 Operations on Events
1.3 Probability and Probabilistic Law
1.4 Discrete Probability Law
1.5 Joint Experiment
1.6 Properties of the Probability Function
1.7 Conditional Probability
Chapter 2: Total Probability Theorem, Independence, Combinatorial
2.1 Total Probability Theorem, and Bayes' Rule
2.1.1 Total Probability Theorem
2.1.2 Bayes' Rule
2.2 Multiplication Rule
2.3 Independence
2.3.1 Independence of Several Events
2.4 Conditional Independence
2.5 Independent Trials and Binomial Probabilities
2.6 The Counting Principle
2.7 Permutation
2.8 Combinations
2.9 Partitions
2.10 Case Study: Modeling of Binary Communication Channel
Problems
Chapter 3: Discrete Random Variables
3.1 Discrete Random Variables
3.2 Defining Events Using Random Variables
3.3 Probability Mass Function for Discrete Random Variables
3.4 Cumulative Distribution Function
3.5 Expected Value (Mean Value), Variance, and Standard Deviation
3.5.1 Expected Value
3.5.2 Variance
3.5.3 Standard Deviation
3.6 Expected Value and Variance of Functions of a Random Variable
3.7 Some Well-Known Discrete Random Variables in the Mathematics Literature
3.7.1 Binomial Random Variable
3.7.2 Geometric Random Variable
3.7.3 Poisson Random Variable
3.7.4 Bernoulli Random Variable
3.7.5 Discrete Uniform Random Variable
Problems
Chapter 4: Functions of Random Variables
4.1 Probability Mass Function for Functions of a Discrete Random Variable
4.2 Joint Probability Mass Function
4.3 Conditional Probability Mass Function
4.4 Joint Probability Mass Function of Three or More Random Variables
4.5 Functions of Two Random Variables
4.6 Conditional Probability Mass Function
4.7 Conditional Mean Value
4.8 Independence of Random Variables
4.8.1 Independence of a Random Variable from an Event
4.8.2 Independence of Several Random Variables
Problems
Chapter 5: Continuous Random Variables
5.1 Continuous Probability Density Function
5.2 Continuous Uniform Random Variable
5.3 Expectation and Variance for Continuous Random Variables
5.4 Expectation and Variance for Functions of Random Variables
5.5 Gaussian or Normal Random Variable
5.5.1 Standard Random Variable
5.6 Exponential Random Variable
5.7 Cumulative Distribution Function
5.7.1 Properties of Cumulative Distribution Function
5.8 Impulse Function
5.9 The Unit Step Function
5.10 Conditional Probability Density Function
5.11 Conditional Expectation
5.12 Conditional Variance
Problems
Chapter 6: More Than One Random Variable
6.1 More Than One Continuous Random Variable for the Same Continuous Experiment
6.2 Conditional Probability Density Function
6.3 Conditional Expectation
6.3.1 Bayes' Rule for Continuous Distribution
6.4 Conditional Expectation
6.5 Conditional Variance
6.6 Independence of Continuous Random Variables
6.7 Joint Cumulative Distribution Function
6.7.1 Three or More Random Variables
6.7.2 Background Information: Reminder for Double Integration
6.7.3 Covariance and Correlation
6.7.4 Correlation Coefficient
6.8 Distribution for Functions of Random Variables
6.9 Probability Density Function for Function of Two Random Variables
6.10 Alternative Formula for the Probability Density Function of a Random Variable
6.11 Probability Density Function Calculation for the Functions of Two Random Variables Using Cumulative Distribution Function
6.12 Two Functions of Two Random Variables
Bibliography
Index


Orhan Gazi

Introduction to Probability and Random Variables


Orhan Gazi
Electrical and Electronics Engineering Department, Ankara Medipol University, Altındağ/Ankara, Türkiye

ISBN 978-3-031-31815-3    ISBN 978-3-031-31816-0 (eBook)
https://doi.org/10.1007/978-3-031-31816-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

One of the first books on probability and random variables was written in 1937. Although probability has been known for a long time, its compilation together with random variables into written material goes back little more than a hundred years. In fact, most scientific developments in human history have occurred in the last 100 years; developments in the basic sciences, in particular, took a very long time. The findings in random variables and probability affected the other sciences as well. Physicists focused on deterministic modeling for a long time, but as probability and random variables matured, physical events came to be modeled probabilistically. Physicists once modeled the flow of electrons around an atom as deterministic circular paths; developments in probability and random variables led them instead to probabilistic models of electron movement. Improvements in the basic sciences directly affect the other sciences. Modern electronic devices owe their existence to the science of probability and random variables. Without the concept of probability, it would not have been possible to develop information and communication theory; modern electronic communication devices are built on the fundamental concepts of probability and random variables. In 1948, Shannon published his famous paper on information theory using probabilistic modeling, and it led to the development of modern communication devices. Developments in probability also caused the science of statistics to emerge. Many disciplines, from the medical sciences to engineering, benefit from statistics: medical doctors measure the effects of drugs by extracting statistical data from patients, and engineers model physical phenomena using statistical measures.

In this book, we explain the fundamental concepts of probability and random variables in a clear manner, covering the basic topics. The first chapter is devoted to experiments, sample spaces, events, and probability laws. It can be considered the foundation of the random variable topic, since it is not possible to comprehend random variables without mastering the concept of events, the definition of probability, and the probability axioms. Students have always considered probability a difficult subject compared to other mathematical subjects. We believe the reason for this perception is unclear and overloaded explanations of the subject, so we have tried to be brief and clear while explaining the topics. The concepts of joint experiments, writing the sample spaces of joint experiments, and determining events from a given problem statement are important for solving probability problems.

In Chap. 2, using the basic concepts introduced in Chap. 1, we introduce some classical probability subjects such as the total probability theorem, independence, permutations and combinations, the multiplication rule, partitions, etc. Chapter 3 introduces discrete random variables. We introduce the probability mass function of discrete random variables using the event concept; expected value and variance calculations are also covered, along with some well-known probability mass functions. It is easier to deal with discrete random variables than with continuous ones, and we advise the reader to study discrete random variables first. Functions of random variables are explained in Chap. 4, where the joint probability mass function, cumulative distribution function, conditional probability mass function, and conditional mean value are covered as well. Continuous random variables are covered in Chap. 5; they can be considered the integral form of discrete random variables, and a reader who has comprehended the preceding chapters will not find Chap. 5 hard. In Chap. 6, we mainly explain the calculation of the probability density, cumulative distribution, conditional probability density, and conditional mean value for the case of more than one random variable. Correlation and covariance of two random variables are also covered in Chap. 6.

This book can be used as a textbook for a one-semester probability and random variables course, and it can be read by anyone interested in the subject. While writing it, we drew on many years of teaching experience. We tried to provide original examples that are as simple as possible while conveying succinct information, and we kept the textual part of the book short: long passages of text reduce the reader's concentration, so we aimed to present the fundamental concepts quickly, without getting lost in details.

I dedicate this book to my lovely daughter Vera Gazi.

Altındağ/Ankara, Türkiye

Orhan Gazi


Chapter 1
Experiments, Sample Spaces, Events, and Probability Laws

1.1 Fundamental Definitions: Experiment, Sample Space, Event

In this section, we provide some definitions very widely used in probability theory. We first consider discrete probability experiments and give definitions of discrete sample spaces to understand the concept of probability in an easy manner. Later, we consider continuous experiments.

Set: A set in its most general form is a collection of objects. These objects can be physical objects, like pencils or chairs, or nonphysical objects, like integers, real numbers, etc.

Experiment: An experiment is a process used to measure a physical phenomenon.

Trial: A trial is a single performance of an experiment. If we perform an experiment once, then we have a trial of the experiment.

Outcome, Simple Event, Sample Point: After the trial of an experiment, we have an outcome, which can be called a simple event, sample point, or simple outcome.

Sample Space: A sample space is defined for an experiment, and it is a set consisting of all the possible outcomes of the experiment.

Event: A sample space is a set, and it has subsets. A subset of a sample space is called an event. A discrete sample space, i.e., a countable sample space, consisting of N outcomes, or simple events, has 2^N events, i.e., subsets.


Example 1.1: Consider the coin-toss experiment. This experiment is a discrete experiment, i.e., it has a countable number of different outcomes. Then, we have the following items for this experiment.

Experiment: Coin toss.
Simple Events, or Experiment Outcomes: {H} and {T}, where H indicates "head" and T denotes "tail."
Sample Space: S = {H, T}.
Events: Events are nothing but the subsets of the sample space. Thus, we have the events {H}, {T}, {H, T}, ϕ. That is, we have 2^2 = 4 events for this experiment.

Example 1.2: Consider a rolling-a-die experiment. We have the following identities for this experiment.

Experiment: Rolling a die.
Simple Events: {1}, {2}, {3}, {4}, {5}, {6}.
Sample Space: S = {1, 2, 3, 4, 5, 6}.
Events: Events are nothing but the subsets of the sample space. Thus, we have 2^6 = 64 events for this experiment.

We wrote that an event is nothing but a subset of the sample space. A subset is also a set, and it may include more than one simple event. Let's assume that A is an event for an experiment including a number of simple events, such that A = {a, b, ⋯}. After a trial of the experiment, if a simple outcome x appears such that x ∈ A, then we say that the event A occurs.

Example 1.3: For the rolling-a-die experiment, the sample space is S = {1, 2, 3, 4, 5, 6}. Let's define two events of this experiment as A = {1, 3, 5}, B = {2, 4, 6}. Assume that we roll a die and "3" appears at the top face of the die. Since 3 ∈ A, we say that the event A has occurred.

Example 1.4: For the rolling-a-die experiment, the sample space is S = {1, 2, 3, 4, 5, 6}. Let A = {1, 2, x}, B = {2, 4, 7}. Are the sets A and B events for the die experiment?

Solution 1.4: An event is a subset of the sample space of an experiment. For the given sample space, it is obvious that

A ⊄ S,  B ⊄ S,

so we can say that A and B are not events for the rolling-a-die experiment.

Example 1.5: For the rolling-a-die experiment, the sample space is S = {1, 2, 3, 4, 5, 6}. Write three events for the rolling-a-die experiment.

Solution 1.5: We can write any three subsets of the sample space, since events are nothing but the subsets of the sample space. Then, we can write three arbitrary events as

A = {1, 2, 4},  B = {5},  C = {1, 6}.

1.2 Operations on Events

Since events are nothing but subsets of the sample space, the operations defined on sets are also valid on events. If A and B are two events, then we can define the following operations on the events:

A ∪ B = A + B → Union of A and B
A ∩ B = AB → Intersection of A and B
A^c → Complement of A.

The complement of A, i.e., A^c, is calculated as A^c = S - A.

Note: A - B = A ∩ B^c.

Mutually Exclusive Events or Disjoint Events: Let A and B be two events. If A ∩ B = ϕ, then A and B are called mutually exclusive events, or disjoint events.
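Because events are ordinary sets, these operations map directly onto set operations in a programming language. The following Python sketch (an illustrative aside, not part of the original text; the variable names mirror the die example) checks the identities A^c = S - A and A - B = A ∩ B^c, and counts the 2^N events of a six-outcome sample space:

```python
from itertools import chain, combinations

S = frozenset({1, 2, 3, 4, 5, 6})   # sample space of the die experiment
A = frozenset({1, 3, 5})
B = frozenset({2, 4, 6})

# Union, intersection, complement, and difference as set operations
print(A | B == S)                   # A ∪ B = S
print(A & B == frozenset())         # A ∩ B = ϕ, so A and B are disjoint
print(S - A == B)                   # A^c = S - A
print(A - B == A & (S - B))         # A - B = A ∩ B^c

# A discrete sample space with N outcomes has 2^N events (subsets)
events = list(chain.from_iterable(combinations(S, r) for r in range(len(S) + 1)))
print(len(events))                  # 2^6 = 64
```

Enumerating all subsets this way is only feasible for small N, which is why the event concept, not brute-force enumeration, carries the theory.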

1.3 Probability and Probabilistic Law

Probability is a real-valued function, and it is usually denoted by P(·). The inputs of the probability function are the events of experiments, and the outputs are real numbers between 0 and 1. Thus, we can say that the probability function is nothing but a mapping between events and real numbers in the range 0–1. The use of the probability function P(·) is illustrated in Fig. 1.1.

Probabilistic Law: The probability function P(·) is not an ordinary real-valued function. For a real-valued function to be used as a probability function, it should obey some axioms, and these axioms, called the probability axioms, are outlined as follows.

Probability Axioms: Let S be the sample space of an experiment, and let A and B be two events for which the probability function P(·) is used such that

Fig. 1.1 The mapping of the events by the probability function: the events Event-1, Event-2, ⋯, Event-N of an experiment are mapped by P(·) into real numbers between 0 and 1.

P(A) → Probability of Event A
P(B) → Probability of Event B.

Then, we have the following axioms:

1. The probability function is a non-negative function, i.e., for every event A,

P(A) ≥ 0.  (1.1)

2. If A ∩ B = ϕ, i.e., A and B are disjoint sets, then the probability of A ∪ B satisfies

P(A ∪ B) = P(A) + P(B)  (1.2)

which is called the additivity axiom of the probability function.

3. The probability of the sample space equals 1, i.e.,

P(S) = 1  (1.3)

which is called the normalization axiom.

1.4 Discrete Probability Law

For a discrete experiment, assume that the sample space is

S = {s1, s2, ⋯, sN}.

Let A be an event of this discrete experiment, i.e., A ⊂ S, such that

A = {a1, a2, ⋯, ak}.

The probability of the event A can be calculated as P(A) = P{a1, a2, ⋯, ak}, where, employing (1.2), since the simple events are also disjoint events, we get

P(A) = P(a1) + P(a2) + ⋯ + P(ak).  (1.4)

If the simple events are equally probable, i.e., P(si) = p, then according to (1.3), we have

P(S) = 1 → P(s1) + P(s2) + ⋯ + P(sN) = 1 → Np = 1 → p = 1/N.

That is, the probability of a simple event happens to be P(si) = 1/N. Then, in this case, the probability of the event given in (1.4) can be calculated as

P(A) = P(a1) + P(a2) + ⋯ + P(ak) → P(A) = 1/N + 1/N + ⋯ + 1/N → P(A) = k/N

which can also be stated as

P(A) = (Number of elements in event A) / (Number of elements in sample space S).  (1.5)

Note: Equation (1.5) is valid only if the simple events are all equally likely, i.e., simple events have equal probabilities of occurrence.
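For equally likely outcomes, (1.5) reduces probability to counting. The short Python sketch below (our illustration; the function name prob is ours, not the book's) applies Eq. (1.5) to the die experiment and checks the normalization and additivity axioms on it:

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(A) = |A| / |S|, valid only when all simple events are equally likely."""
    assert set(event) <= set(sample_space), "an event must be a subset of S"
    return Fraction(len(set(event)), len(set(sample_space)))

S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}
B = {2, 4, 6}

print(prob(S, S))   # normalization axiom: P(S) = 1
print(prob(A, S))   # 3/6 = 1/2
# additivity axiom for the disjoint events A and B:
print(prob(A | B, S) == prob(A, S) + prob(B, S))
```

Using exact fractions rather than floating point keeps the axiom checks free of rounding artifacts.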

1.5 Joint Experiment

Assume that we perform two different experiments. Let experiment-1 have the sample space S1 and experiment-2 have the sample space S2. If both experiments are performed at the same time, we can consider both experiments as a single experiment, which can be considered as a joint experiment. In this case, the sample space of the joint experiment becomes equal to

S = S1 × S2,

i.e., the Cartesian product of S1 and S2. Similarly, if more than two experiments with sample spaces S1, S2, ⋯ are performed at the same time, then the sample space of the joint experiment can be calculated as

S = S1 × S2 × ⋯

If S1 = {a1, a2, a3, ⋯}, S2 = {b1, b2, b3, ⋯}, S3 = {c1, c2, c3, ⋯}, ⋯, then a single element of S will be of the form si = aj bl cm ⋯, and, assuming the experiments are independent of each other, the probability of si can be calculated as

P(si) = P(aj)P(bl)P(cm)⋯  (1.6)

That is, the probability of a simple event of the combined experiment equals the product of the probabilities of the simple events appearing in it.

Example 1.6: For the fair coin-toss experiment, the sample space is S = {H, T}, and the simple events are {H}, {T}. The probabilities of the simple events are P(H) = 1/2 and P(T) = 1/2.

Example 1.7: We toss a coin twice. Find the sample space of this experiment.

Solution 1.7: For a single toss of the coin, the sample space is S1 = {H, T}. If we toss the coin twice, we can consider it as a combined experiment, and the sample space of the combined experiment can be calculated as

S = S1 × S1 → S = {H, T} × {H, T} → S = {HH, HT, TH, TT}.

Example 1.8: We toss a coin three times. Find the sample space of this experiment.

Solution 1.8: The three tosses of the coin can be considered a combined experiment. For a single toss of the coin, the sample space is S1 = {H, T}. For three tosses, the sample space can be calculated as

S = S1 × S1 × S1 → S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
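The sample spaces of Examples 1.7 and 1.8 are Cartesian powers of S1 = {H, T}, which in Python is exactly itertools.product. A brief sketch (ours; note the enumeration order may differ from the listing above, but the set of outcomes is the same):

```python
from itertools import product

S1 = ["H", "T"]

# Two tosses: S = S1 × S1
S2 = ["".join(t) for t in product(S1, repeat=2)]
print(S2)        # ['HH', 'HT', 'TH', 'TT']

# Three tosses: S = S1 × S1 × S1, with 2^3 = 8 outcomes
S3 = ["".join(t) for t in product(S1, repeat=3)]
print(len(S3))   # 8
print(sorted(S3) == sorted(["HHH", "HHT", "HTH", "THH",
                            "HTT", "THT", "TTH", "TTT"]))  # True
```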


Example 1.9: For the fair die-toss experiment, the sample space is S = {f1, f2, f3, f4, f5, f6}, and the simple events are {f1}, {f2}, {f3}, {f4}, {f5}, {f6}. The probabilities of the simple events are

P(f1) = P(f2) = P(f3) = P(f4) = P(f5) = P(f6) = 1/6.

Example 1.10: We flip a fair coin and toss a fair die at the same time. Find the sample space of the combined experiment, and find the probabilities of the simple events of the combined experiment.

Solution 1.10: For the coin-flip experiment, we have the sample space

S1 = {H, T}

where H denotes the head and T denotes the tail. For the fair die-toss experiment, we have the sample space

S2 = {f1, f2, f3, f4, f5, f6}

where the fi indicate the faces of the die. For the combined experiment, the sample space S can be calculated as

S = S1 × S2 → S = {Hf1, Hf2, Hf3, Hf4, Hf5, Hf6, Tf1, Tf2, Tf3, Tf4, Tf5, Tf6}.

The simple events of the combined experiment are {Hf1}, {Hf2}, ⋯, {Tf6}. The probabilities of the simple events of the combined experiment, according to (1.6), can be calculated as

P(Hf1) = P(H)P(f1) → P(Hf1) = 1/2 × 1/6 → P(Hf1) = 1/12

and, in exactly the same way, each of the twelve simple events Hf1, ⋯, Hf6, Tf1, ⋯, Tf6 has probability 1/12.
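The twelve equal probabilities of Solution 1.10 can be verified mechanically. The sketch below (ours, using exact fractions) builds the product distribution of Eq. (1.6) for the coin-and-die joint experiment and confirms that the probabilities sum to one:

```python
from fractions import Fraction
from itertools import product

coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
die = {f"f{i}": Fraction(1, 6) for i in range(1, 7)}

# P(cf) = P(c)P(f) for every simple event of the combined experiment, per (1.6)
joint = {c + f: coin[c] * die[f] for c, f in product(coin, die)}

print(len(joint))                                         # 12 simple events
print(all(p == Fraction(1, 12) for p in joint.values()))  # True
print(sum(joint.values()))                                # total probability 1
```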

Example 1.11: A biased coin is flipped. The sample space is S1 = {Hb, Tb}. The probabilities of the simple events are

P(Hb) = 2/3,  P(Tb) = 1/3.

Assume that the biased coin is flipped twice. Consider the two flips as a single experiment. Find the sample space of the combined experiment, and determine the probabilities of the simple events for the combined experiment.

Solution 1.11: The sample space of the combined experiment can be found using S = S1 × S1 as

S = {HbHb, HbTb, TbHb, TbTb}.

The simple events for the combined experiment are {HbHb}, {HbTb}, {TbHb}, {TbTb}. The probabilities of the simple events of the combined experiment are calculated as

P(HbHb) = P(Hb)P(Hb) → P(HbHb) = 2/3 × 2/3 → P(HbHb) = 4/9
P(HbTb) = P(Hb)P(Tb) → P(HbTb) = 2/3 × 1/3 → P(HbTb) = 2/9
P(TbHb) = P(Tb)P(Hb) → P(TbHb) = 1/3 × 2/3 → P(TbHb) = 2/9
P(TbTb) = P(Tb)P(Tb) → P(TbTb) = 1/3 × 1/3 → P(TbTb) = 1/9.

Example 1.12: We have a three-faced biased die and a biased coin. For the three-faced biased die, the sample space is S1 = {f1, f2, f3}, and the probabilities of the simple events are

P(f1) = 1/6   P(f2) = 1/6   P(f3) = 2/3.

For the biased coin, the sample space is S2 = {Hb, Tb}, and the probabilities of the simple events are

P(Hb) = 1/3   P(Tb) = 2/3.

We flip the coin and toss the die at the same time. Find the sample space of the combined experiment, and calculate the probabilities of the simple events.

Solution 1.12: For the combined experiment, the sample space can be calculated using S = S1 × S2 as

S = {f1, f2, f3} × {Hb, Tb} → S = {f1Hb, f1Tb, f2Hb, f2Tb, f3Hb, f3Tb}.

The probabilities of the simple events of the combined experiment can be computed as

P(f1Hb) = P(f1)P(Hb) → P(f1Hb) = 1/6 × 1/3 → P(f1Hb) = 1/18
P(f1Tb) = P(f1)P(Tb) → P(f1Tb) = 1/6 × 2/3 → P(f1Tb) = 2/18
P(f2Hb) = P(f2)P(Hb) → P(f2Hb) = 1/6 × 1/3 → P(f2Hb) = 1/18
P(f2Tb) = P(f2)P(Tb) → P(f2Tb) = 1/6 × 2/3 → P(f2Tb) = 2/18
P(f3Hb) = P(f3)P(Hb) → P(f3Hb) = 2/3 × 1/3 → P(f3Hb) = 2/9


P(f3Tb) = P(f3)P(Tb) → P(f3Tb) = 2/3 × 2/3 → P(f3Tb) = 4/9.
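The product construction used in Solutions 1.11 and 1.12 is easy to check numerically. Below is a minimal sketch (ours, not from the book; the dictionary names are illustrative) that builds the combined sample space of Example 1.12 with Python's exact fractions:

```python
from fractions import Fraction
from itertools import product

# Probability laws of Example 1.12: a three-faced biased die and a biased coin.
die = {"f1": Fraction(1, 6), "f2": Fraction(1, 6), "f3": Fraction(2, 3)}
coin = {"Hb": Fraction(1, 3), "Tb": Fraction(2, 3)}

# Combined experiment: S = S1 x S2 with P(f c) = P(f) P(c), as in (1.6).
joint = {f + c: die[f] * coin[c] for f, c in product(die, coin)}

print(joint["f3Tb"])        # matches the hand calculation, 4/9
print(sum(joint.values()))  # the simple-event probabilities sum to 1
```

Summing to 1 confirms that the product of two probability laws is again a valid probability law on the product sample space.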

Example 1.13: We toss a coin three times. Find the probabilities of the following events.

(a) A = {ρi ∈ S | ρi includes at least two heads}.
(b) B = {ρi ∈ S | ρi includes at least one tail}.

Solution 1.13: For three tosses, the sample space can be calculated as

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

The events A and B can be written explicitly as

A = {HHH, HHT, HTH, THH}
B = {HHT, HTH, THH, HTT, THT, TTH, TTT}.

The probability of the event A can be computed as

P(A) = P(HHH) + P(HHT) + P(HTH) + P(THH) → P(A) = 1/8 + 1/8 + 1/8 + 1/8 → P(A) = 4/8.

In a similar manner, the probability of the event B can be found as

P(B) = 7/8.

Example 1.14: For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the simple events for the biased coin flip experiment are

P(Hb) = 2/3   P(Tb) = 1/3.

Assume that a biased coin and a fair coin are flipped together. Consider the two flips as a single experiment. Find the sample space of the combined experiment. Determine the probabilities of the simple events for the combined experiment, and determine the probabilities of the following events. (a) A = {Biased head appears in the simple event.} (b) B = {At least two heads appear.}


Solution 1.14: For the fair coin flip experiment, the sample space is

S2 = {H, T}

and the probabilities of the simple events are

P(H) = 1/2   P(T) = 1/2.

For the combined flip of the biased and the fair coin, the sample space can be calculated as

S = S1 × S2 → S = {HbH, HbT, TbH, TbT}.

The probabilities of the simple events for the combined experiment are calculated as

P(HbH) = P(Hb)P(H) → P(HbH) = 2/3 × 1/2 → P(HbH) = 1/3
P(HbT) = P(Hb)P(T) → P(HbT) = 2/3 × 1/2 → P(HbT) = 1/3
P(TbH) = P(Tb)P(H) → P(TbH) = 1/3 × 1/2 → P(TbH) = 1/6
P(TbT) = P(Tb)P(T) → P(TbT) = 1/3 × 1/2 → P(TbT) = 1/6.

The events A and B can be explicitly written as

A = {HbH, HbT}   B = {HbH, HbT, TbH}

whose probabilities can be calculated as

P(A) = P(HbH) + P(HbT) → P(A) = 1/3 + 1/3 → P(A) = 2/3
P(B) = P(HbH) + P(HbT) + P(TbH) → P(B) = 1/3 + 1/3 + 1/6 → P(B) = 5/6.

Exercises: 1. For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the simple events for the biased coin toss experiment are


P(Hb) = 2/3   P(Tb) = 1/3.

Assume that a biased coin is flipped and a fair die is tossed together. Consider the combined experiment, and find the sample space of the combined experiment. Determine the probabilities of the simple events for the combined experiment, and determine the probabilities of the events:

(a) A = {Biased head and odd numbers appear in the simple event.}
(b) B = {Biased tail and a number divisible by 3 appear in the simple event.}

2. For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the simple events for the biased coin toss experiment are

P(Hb) = 2/3   P(Tb) = 1/3.

Assume that the biased coin is tossed three times. Find the sample space, and find the probabilities of the simple events. Calculate the probabilities of the events

A = {At least two heads appear in the simple event.}
B = {At most two tails appear in the simple event.}

1.6 Properties of the Probability Function

Let A, B, and C be events of an experiment, and let P(·) be the probability function defined on the events of the experiment. We have the following properties for the probability function P(·):

(a) If A ⊂ B, then P(A) ≤ P(B)
(b) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
(c) P(A ∪ B) ≤ P(A) + P(B)
(d) P(A ∪ B ∪ C) = P(A) + P(Ac ∩ B) + P(Ac ∩ Bc ∩ C)

We will prove some of these properties in examples.

Example 1.15: Prove the property P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof 1.15: We should keep in mind that events are nothing but subsets. Then, any operation that can be performed on sets is also valid on events. Let S be the sample space. The event A ∪ B can be written as

A ∪ B = S ∩ (A ∪ B)

in which using S = A ∪ Ac, we get


A ∪ B = (A ∪ Ac) ∩ (A ∪ B)

which can be written as

A ∪ B = A ∪ (Ac ∩ B)   (1.7)

where the events A and Ac ∩ B are disjoint events, i.e., A ∩ (Ac ∩ B) = ϕ. According to probability axiom-2, the probability of the event A ∪ B in (1.7) can be written as

P(A ∪ B) = P(A) + P(Ac ∩ B).   (1.8)

The event B can be written as

B = S ∩ B

in which using S = A ∪ Ac, we obtain

B = (A ∪ Ac) ∩ B

which can be written as

B = (A ∩ B) ∪ (Ac ∩ B)   (1.9)

where A ∩ B and Ac ∩ B are disjoint events, i.e., (A ∩ B) ∩ (Ac ∩ B) = ϕ. According to probability axiom-2 in (1.2), the probability of the event B in (1.9) can be written as

P(B) = P(A ∩ B) + P(Ac ∩ B)   (1.10)

from which, we get

P(Ac ∩ B) = P(B) − P(A ∩ B).   (1.11)

Substituting (1.11) into (1.8), we obtain

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).   (1.12)

Note: If A and B are disjoint, i.e., mutually exclusive events, then we have

P(A ∪ B) = P(A) + P(B).

This is due to A ∩ B = ϕ → P(A ∩ B) = 0.
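The identity P(A ∪ B) = P(A) + P(B) − P(A ∩ B) can also be verified numerically. A minimal sketch (ours; the fair-die law and the helper `prob` are illustrative assumptions, not from the text):

```python
from fractions import Fraction

def prob(event, pmf):
    """P(E): sum of the simple-event probabilities in E (discrete law)."""
    return sum(pmf[s] for s in event)

pmf = {f: Fraction(1, 6) for f in range(1, 7)}  # a fair six-sided die
A = {2, 4, 6}   # even faces
B = {1, 2, 4}   # faces that are powers of 2

# Check property (b): P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
lhs = prob(A | B, pmf)
rhs = prob(A, pmf) + prob(B, pmf) - prob(A & B, pmf)
print(lhs == rhs)  # True
```

Using exact fractions avoids floating-point noise, so the equality check is exact.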


Fig. 1.2 Venn diagram illustration of the events

Since events of an experiment are nothing but subsets of the sample space of the experiment, it may sometimes be easier to manipulate the events using Venn diagrams.

Venn Diagram Illustration of Events

In Fig. 1.2, Venn diagram illustrations of the events are depicted. As can be seen from Fig. 1.2, we can take the intersection and union of the events.

Example 1.16: Show that

P(A ∪ B) ≤ P(A) + P(B).

Proof 1.16: We showed that

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).   (1.13)

According to probability axiom-1 in (1.1), the probability function is a non-negative function, and we have

P(A ∩ B) ≥ 0.   (1.14)

If we omit P(A ∩ B) from the right-hand side of (1.13), we can write

P(A ∪ B) ≤ P(A) + P(B).

Example 1.17: A is an event of an experiment, and Ac is the complement of A. Show that

P(Ac) = 1 − P(A).

Proof 1.17: We know that

A ∪ Ac = S   (1.15)

where S is the sample space, and A ∩ Ac = ϕ.


According to probability law axioms 3 and 2 in (1.3) and (1.2), we have

P(S) = 1 and P(A ∪ Ac) = P(A) + P(Ac).

Then, from (1.15), we can write

P(A) + P(Ac) = 1

which leads to

P(Ac) = 1 − P(A).

Theorem 1.1: If the events A1, A2, ⋯, Am are mutually exclusive events, then we have

P(A1 ∪ A2 ∪ ⋯ ∪ Am) = P(A1) + P(A2) + ⋯ + P(Am).

Example 1.18: For a biased die, the probabilities of the simple events are given as

P(f1) = 1/12   P(f2) = 1/12   P(f3) = 1/6   P(f4) = 1/6   P(f5) = 2/6   P(f6) = 1/6.

The events A and B are defined as

A = {Even numbers appear}   B = {Numbers that are powers of 2 appear}.

Find P(A), P(B), P(A ∪ B), P(A ∩ B).

Solution 1.18: The events A, B, A ∪ B, and A ∩ B can be written as

A = {f2, f4, f6}   B = {f1, f2, f4}
A ∪ B = {f1, f2, f4, f6}   A ∩ B = {f2, f4}.

The probabilities of the events A, B, A ∪ B, and A ∩ B can be computed as

P(A) = P(f2) + P(f4) + P(f6) → P(A) = 1/12 + 1/6 + 1/6 → P(A) = 5/12
P(B) = P(f1) + P(f2) + P(f4) → P(B) = 1/12 + 1/12 + 1/6 → P(B) = 4/12


P(A ∪ B) = P(f1) + P(f2) + P(f4) + P(f6) → P(A ∪ B) = 1/12 + 1/12 + 1/6 + 1/6 → P(A ∪ B) = 6/12
P(A ∩ B) = P(f2) + P(f4) → P(A ∩ B) = 1/12 + 1/6 → P(A ∩ B) = 3/12.

The probability of A ∪ B can also be calculated as

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) → P(A ∪ B) = 5/12 + 4/12 − 3/12 → P(A ∪ B) = 6/12.

Example 1.19: A and B are two events of an experiment. Show that if A ⊂ B then

P(A) ≤ P(B).

Proof 1.19: If A ⊂ B, then we have

B = A ∪ B

which can be written as

B = (A ∪ B) ∩ S

in which substituting A ∪ Ac for the sample space, we get

B = (A ∪ B) ∩ (A ∪ Ac)

which can be expressed as

B = A ∪ (Ac ∩ B)

ð1:16Þ

where the events A and Ac ∩ B are disjoint events, i.e., A ∩ (Ac ∩ B) = ϕ. Using probability law axiom-2 in (1.2) and equation (1.16), we have

P(B) = P(A) + P(Ac ∩ B).   (1.17)

Since probability is a non-negative quantity, (1.17) implies that

P(A) ≤ P(B).

1.7 Conditional Probability

Assume that we perform an experiment and obtain an outcome of the experiment. Let the outcome of the experiment belong to an event B, and consider the question: What is the probability that the outcome of the experiment also belongs to another event A? To calculate this probability, we should first determine the sample space, then identify the event and calculate the probability of the event. Assume that the experiment is a fair one, so that

P(Event) = Number of elements in Event / Number of elements in Sample Space → P(Event) = N(Event)/N(Sample Space).

Let's denote by E the conditional event

{The outcome of the experiment belongs to A given that it also belongs to B}

where the condition "given that it also belongs to B" implies that the sample space equals B, i.e., S′ = B. Then, the probability of the event E can be calculated using

P(E) = N(E)/N(S′) → P(E) = N(A ∩ B)/N(B)   (1.18)

which can be written as

P(E) = [N(A ∩ B)/N(S)] / [N(B)/N(S)]

leading to

P(E) = P(A ∩ B)/P(B).

If we show this special event E by the special notation A|B, then the conditional event probability can be written as


P(A|B) = P(A ∩ B)/P(B)   (1.19)

which can be called conditional probability for short, instead of the conditional event probability. In fact, we will use the term "conditional probability" for (1.19) throughout the book. From the conditional probability expression in (1.19), we can obtain the following identities:

P(A ∩ B) = P(A|B)P(B)   P(A ∩ B) = P(B|A)P(A).   (1.20)
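Definition (1.19) translates directly into code. The sketch below is our illustration (the fair-die events are assumptions, not from the text):

```python
from fractions import Fraction

def cond_prob(A, B, pmf):
    """P(A|B) = P(A ∩ B) / P(B), per (1.19)."""
    p_b = sum(pmf[s] for s in B)
    p_ab = sum(pmf[s] for s in A & B)
    return p_ab / p_b

pmf = {f: Fraction(1, 6) for f in range(1, 7)}  # fair six-sided die
A = {4, 5, 6}   # the face is greater than 3
B = {2, 4, 6}   # the face is even

print(cond_prob(A, B, pmf))  # two of the three even faces exceed 3
```

Conditioning simply renormalizes by P(B): here the result is (2/6)/(3/6) = 2/3.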

Properties

1. If A1 and A2 are disjoint events, then we have

P(A1 ∪ A2 | B) = P(A1|B) + P(A2|B)

2. If A1 and A2 are not disjoint events, then we have

P(A1 ∪ A2 | B) ≤ P(A1|B) + P(A2|B)

Let's now see the proof of these properties.

Proof 1: The conditional probability P(A1 ∪ A2 | B) can be written as

P(A1 ∪ A2 | B) = P((A1 ∪ A2) ∩ B)/P(B)

where using (A1 ∪ A2) ∩ B = (A1 ∩ B) ∪ (A2 ∩ B), we obtain

P(A1 ∪ A2 | B) = P((A1 ∩ B) ∪ (A2 ∩ B))/P(B).   (1.21)

Since the events A1 and A2 are disjoint, we have A1 ∩ A2 = ϕ, which also implies that

(A1 ∩ B) ∩ (A2 ∩ B) = ϕ

leading to


P((A1 ∩ B) ∪ (A2 ∩ B)) = P(A1 ∩ B) + P(A2 ∩ B).   (1.22)

Substituting (1.22) into (1.21), we get

P(A1 ∪ A2 | B) = [P(A1 ∩ B) + P(A2 ∩ B)]/P(B)

leading to

P(A1 ∪ A2 | B) = P(A1 ∩ B)/P(B) + P(A2 ∩ B)/P(B)

which can be written as

P(A1 ∪ A2 | B) = P(A1|B) + P(A2|B).

Proof 2: In (1.21), we got

P(A1 ∪ A2 | B) = P((A1 ∩ B) ∪ (A2 ∩ B))/P(B)   (1.23)

in which employing the property

P(A ∪ B) ≤ P(A) + P(B)

for the numerator of (1.23), we get

P(A1 ∪ A2 | B) ≤ [P(A1 ∩ B) + P(A2 ∩ B)]/P(B)

leading to

P(A1 ∪ A2 | B) ≤ P(A1 ∩ B)/P(B) + P(A2 ∩ B)/P(B)

which can be written as

P(A1 ∪ A2 | B) ≤ P(A1|B) + P(A2|B).

Example 1.20: Two students, A and B, take an exam. The following information is available about the students.


(a) The probability that student A is successful in the exam is 5/8.
(b) The probability that student B is successful in the exam is 1/2.
(c) The probability that at least one student is successful is 3/4.

After the exam, it was announced that only one student was successful in the exam. What is the probability that student A was successful in the exam?

Solution 1.20: For each student, taking the exam can be considered as an experiment. The sample spaces of the individual experiments are

SA = {As, Af}   SB = {Bs, Bf}

where As, Af are the success and fail outcomes for student A, and Bs, Bf are the success and fail outcomes for student B. If we consider both students taking the exam together, i.e., as a joint experiment, the sample space can be formed as

S = SA × SB → S = {AsBs, AsBf, AfBs, AfBf}.

Let's define the events

EA = {Student A is successful} → EA = {AsBs, AsBf}
EB = {Student B is successful} → EB = {AsBs, AfBs}
E1 = {At least one student is successful} → E1 = {AsBs, AsBf, AfBs}.

From the given information in the question, we can write the following equations:

P(EA) = 5/8 → P(AsBs) + P(AsBf) = 5/8
P(EB) = 1/2 → P(AsBs) + P(AfBs) = 4/8
P(E1) = 3/4 → P(AsBs) + P(AsBf) + P(AfBs) = 6/8

which can be solved for P(AsBs), P(AsBf), P(AfBs) as

P(AsBs) = 3/8   P(AsBf) = 2/8   P(AfBs) = 1/8.

Now, let's define the event

Eo = {Only one student is successful in the exam} → Eo = {AsBf, AfBs}.


In our question, P(EA|Eo) is asked. We can calculate P(EA|Eo) as

P(EA|Eo) = P(EA ∩ Eo)/P(Eo)

where P(Eo) and P(EA ∩ Eo) can be calculated as

P(Eo) = P(AsBf) + P(AfBs) → P(Eo) = 2/8 + 1/8 → P(Eo) = 3/8
P(EA ∩ Eo) = P(AsBf) → P(EA ∩ Eo) = 2/8.

Then, P(EA|Eo) is evaluated as

P(EA|Eo) = (2/8)/(3/8) → P(EA|Eo) = 2/3.
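The three conditions of Example 1.20 form a small linear system; solving it numerically reproduces the result. A sketch of ours, with the book's numbers (variable names are illustrative):

```python
from fractions import Fraction

# Given: P(EA) = P(AsBs) + P(AsBf)           = 5/8
#        P(EB) = P(AsBs) + P(AfBs)           = 1/2
#        P(E1) = P(AsBs) + P(AsBf) + P(AfBs) = 3/4
p_ea, p_eb, p_e1 = Fraction(5, 8), Fraction(1, 2), Fraction(3, 4)

p_afbs = p_e1 - p_ea   # third equation minus the first
p_asbf = p_e1 - p_eb   # third equation minus the second
p_asbs = p_ea - p_asbf

# P(EA | Eo): A succeeded, given that exactly one student succeeded.
p_eo = p_asbf + p_afbs
print(p_asbf / p_eo)  # 2/3
```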

Example 1.21: A fair coin is tossed three times. The events A and B are defined as

A = {The first two tosses are different from each other}
B = {The second toss is a tail}

Find P(A), P(B), P(A|B), and P(B|A).

Solution 1.21: For a single toss of the coin, the sample space is S1 = {H, T}. For three tosses, the sample space is found using S = S1 × S1 × S1 as

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Then, the events A and B described in the question can be written as

A = {HTH, HTT, THH, THT}   B = {HTH, HTT, TTH, TTT}.

Since the coin is a fair one and the simple events have the same probability, the probabilities of the events A and B can be calculated using

P(A) = N(A)/N(S)   P(B) = N(B)/N(S)

where N(A), N(B), and N(S) indicate the number of elements in the events A, B, and S, respectively. Then, P(A) and P(B) are found as

P(A) = N(A)/N(S) → P(A) = 4/8   P(B) = N(B)/N(S) → P(B) = 4/8.

The conditional probability P(A|B) can be calculated using

P(A|B) = P(A ∩ B)/P(B)   (1.24)

where evaluating P(A ∩ B) as

P(A ∩ B) = N(A ∩ B)/N(S) → P(A ∩ B) = 2/8   (1.25)

and employing (1.25) in (1.24), we get

P(A|B) = P(A ∩ B)/P(B) → P(A|B) = (2/8)/(4/8) → P(A|B) = 2/4.

In a similar manner, P(B|A) = P(A ∩ B)/P(A) → P(B|A) = (2/8)/(4/8) → P(B|A) = 2/4.
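Example 1.21 can also be checked by brute-force enumeration of the eight outcomes (our sketch, not from the book):

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=3))   # eight equally likely outcomes
A = [s for s in S if s[0] != s[1]]  # first two tosses differ
B = [s for s in S if s[1] == "T"]   # second toss is a tail
AB = [s for s in A if s in B]       # A ∩ B

# Counting form of (1.18): P(A|B) = N(A ∩ B)/N(B), P(B|A) = N(A ∩ B)/N(A).
print(Fraction(len(AB), len(B)))  # 1/2
print(Fraction(len(AB), len(A)))  # 1/2
```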

Example 1.22: Consider a metal detector security system in an airport. The probability of the security system giving an alarm in the absence of a metal is 0.02, the probability of the security system giving an alarm in the presence of a metal is 0.95, and the probability of the security system not giving an alarm in the presence of a metal is 0.03. The probability of metal being present is 0.02.

(a) Express the false alarm event mathematically, and calculate the probability of false alarm.
(b) Express the missed detection event mathematically, and calculate the probability of missed detection.

Solution 1.22: Considering the given information in the question, we can define the events and their probabilities as

A = {Metal exists}   Ac = {Metal does not exist}   B = {Alarm}
C = {False alarm}   D = {Missed detection}

(a) The false alarm event, i.e., an alarm in the absence of metal, can be written as

C = Ac ∩ B

whose probability can be calculated as

1.7

Conditional Probability

23

P(C) = P(Ac ∩ B) → P(C) = P(B|Ac)P(Ac) → P(C) = 0.02 × 0.98 → P(C) = 0.0196.

(b) The missed detection event can be written as

D = A ∩ Bc

whose probability can be calculated as

P(D) = P(A ∩ Bc) → P(D) = P(Bc|A)P(A) → P(D) = 0.03 × 0.02 → P(D) = 0.0006.
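Both probabilities in Example 1.22 follow from the identities in (1.20); a quick numeric check (ours; the variable names are illustrative):

```python
p_metal = 0.02             # P(A): metal is present
p_alarm_given_no = 0.02    # P(B | A^c): alarm without metal
p_miss_given_metal = 0.03  # P(B^c | A): no alarm despite metal

p_false_alarm = p_alarm_given_no * (1 - p_metal)  # P(A^c ∩ B) = P(B|A^c) P(A^c)
p_missed = p_miss_given_metal * p_metal           # P(A ∩ B^c) = P(B^c|A) P(A)

print(round(p_false_alarm, 4), round(p_missed, 4))  # 0.0196 0.0006
```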

Example 1.23: A box contains three white and two black balls. We pick a ball from this box. Find the sample space of this experiment and write the events for this sample space.

Solution 1.23: The sample space can be written as

S = {w1, w2, w3, b1, b2}.

The events are subsets of S, and there are in total 2^5 = 32 events. These events are

{}, {w1}, {w2}, {w3}, {b1}, {b2},
{w1, w2}, {w1, w3}, {w1, b1}, {w1, b2}, {w2, w3}, {w2, b1}, {w2, b2}, {w3, b1}, {w3, b2}, {b1, b2},
{w1, w2, w3}, {w1, w2, b1}, {w1, w2, b2}, {w2, w3, b1}, {w2, w3, b2}, {w1, w3, b1}, {w1, w3, b2}, {w1, b1, b2}, {w2, b1, b2}, {w3, b1, b2},
{w1, w2, w3, b1}, {w1, w2, w3, b2}, {w1, w2, b1, b2}, {w2, w3, b1, b2}, {w1, w3, b1, b2},
{w1, w2, w3, b1, b2}.

Example 1.24: A box contains two white and two black balls. We pick two balls from this box without replacement. Find the sample space of this experiment.

Solution 1.24: We perform two experiments consecutively. The sample space of the first experiment can be written as


S1 = {w1, w2, b1, b2}.

The sample space of the second experiment depends on the outcome of the first experiment. If the outcome of the first experiment is w1, the sample space of the second experiment is

S21 = {w2, b1, b2}.

If the outcome of the first experiment is w2, the sample space of the second experiment is

S22 = {w1, b1, b2}.

If the outcome of the first experiment is b1, the sample space of the second experiment is

S23 = {w1, w2, b2}.

If the outcome of the first experiment is b2, the sample space of the second experiment is

S24 = {w1, w2, b1}.

Accordingly, the sample space of the combined experiment is S = S1 × S21 if the outcome of the first experiment is w1, S = S1 × S22 if it is w2, S = S1 × S23 if it is b1, and S = S1 × S24 if it is b2.


Continuous Experiment

For continuous experiments, the sample space includes an uncountable number of simple events. For this reason, the sample space of a continuous experiment is usually expressed either as an interval, if a one-dimensional representation is sufficient, or as an area in the two-dimensional plane. Let's illustrate the concept with an example.

Example: A telephone call may occur at a time t which is a random point in the interval [8, 18].

(a) Find the probabilities of the following events:

A = {A call occurs between 10 and 16}
B = {A call occurs between 8 and 16}.

(b) Calculate P(B|A).

Solution: The sample space of the experiment is the interval [8, 18], i.e.,

S = [8, 18].

The events A and B are subsets of S, and they are nothing but the intervals

A = [10, 16]   B = [8, 16].

(a) The probabilities of the events can be calculated as

P(A) = Length(A)/Length(S) → P(A) = (16 − 10)/(18 − 8) → P(A) = 6/10
P(B) = Length(B)/Length(S) → P(B) = (16 − 8)/(18 − 8) → P(B) = 8/10.

(b) P(B|A) can be calculated as

P(B|A) = P(B ∩ A)/P(A) → P(B|A) = P([8, 16] ∩ [10, 16])/P([10, 16]) → P(B|A) = P([10, 16])/P([10, 16]) → P(B|A) = 1

since A ⊂ B; conditioned on A, the event B is certain.
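For the continuous case, the "length ratio" law is a one-liner. A sketch of ours (the helper name is illustrative):

```python
from fractions import Fraction

def p_interval(lo, hi, s_lo=8, s_hi=18):
    """Probability of a subinterval for a call time uniform on [8, 18]."""
    return Fraction(hi - lo, s_hi - s_lo)

p_A = p_interval(10, 16)
p_B = p_interval(8, 16)
print(p_A, p_B)  # 3/5 4/5
```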


Problems

1. State the three probability axioms.
2. What is the probability function? Is it an ordinary real-valued function?
3. What do mutually exclusive events mean?
4. The sample space of an experiment is given as

   S = {s1, s2, s3, s4, s5, s6}.

   Find three mutually exclusive events E1, E2, E3 such that S = E1 ∪ E2 ∪ E3. Find the probability of each mutually exclusive event.
5. The sample space of an experiment is given as

   S = {s1, s2, s3, s4, s5, s6, s7, s8}.

   The event E is defined as

   E = {s1, s3, s5, s6, s8}.

   Write the event E as the union of two mutually exclusive events E1 and E2, i.e., E = E1 ∪ E2.
6. The sample space of an experiment is given as

   S = {s1, s2, s3}

   where the probabilities of the simple events are provided as

   P(s1) = 1/4   P(s2) = 2/4   P(s3) = 1/4.

Write all the events for this sample space, and calculate the probability of each event.

7. The sample space of an experiment is given as

   S = {s1, s2, s3}

   where the probabilities of the simple events are provided as

   P(s1) = 1/6   P(s2) = 1/3   P(s3) = 1/2.

We perform the experiment twice. Consider the two performances of the same experiment as a single experiment, i.e., combined experiment. Find the simple


events of the combined experiment, and calculate the probability of each simple event of the combined experiment.

8. The sample spaces of two experiments are given as

   S1 = {a, b, c}   S2 = {d, e}

   where the probabilities of the simple events are provided as

   P(a) = 1/3   P(b) = 1/6   P(c) = 1/2
   P(d) = 3/4   P(e) = 1/4.

We perform the first experiment once and the second experiment twice. Consider the three trials of the experiment as a single experiment, i.e., combined experiment. Find the simple events of the combined experiment, and calculate the probability of each simple event of the combined experiment.

Chapter 2
Total Probability Theorem, Independence, Combinatorial

2.1 Total Probability Theorem, and Bayes' Rule

Definition (Partition): Let A1, A2, ⋯, AN be events of a sample space such that Ai ∩ Aj = ϕ for all i, j ∈ {1, 2, ⋯, N} with i ≠ j, and S = A1 ∪ A2 ∪ ⋯ ∪ AN. We say that the events A1, A2, ⋯, AN form a partition of S. The partition of a sample space is graphically illustrated in Fig. 2.1.

2.1.1 Total Probability Theorem

Let A1, A2, ⋯, AN be disjoint events that form a partition of a sample space S, and let B be any event. Then, the probability of the event B can be written as

P(B) = P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(AN)P(B|AN).

ð2:1Þ

The total probability theorem is illustrated in Fig. 2.2.

Proof: If A1, A2, ⋯, AN are disjoint events that form a partition of a sample space S, then we have

S = A1 ∪ A2 ∪ ⋯ ∪ AN.

For any event B, we can write

B = B ∩ S

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 O. Gazi, Introduction to Probability and Random Variables, https://doi.org/10.1007/978-3-031-31816-0_2



Fig. 2.1 The partition of a sample space

Fig. 2.2 Illustration of total probability theorem

in which substituting S = A1 ∪ A2 ∪ ⋯ ∪ AN, we get

B = B ∩ (A1 ∪ A2 ∪ ⋯ ∪ AN)

where distributing ∩ over ∪, we obtain

B = (B ∩ A1) ∪ (B ∩ A2) ∪ ⋯ ∪ (B ∩ AN).

ð2:2Þ

In (2.2), the events (B ∩ Ai) and (B ∩ Aj), i, j ∈ {1, 2, ⋯, N}, i ≠ j, are disjoint events. Then, according to probability law axiom-2 in (1.2), P(B) can be written as

P(B) = P(B ∩ A1) + P(B ∩ A2) + ⋯ + P(B ∩ AN)

in which employing the property P(B ∩ Ai) = P(Ai)P(B|Ai), we get

P(B) = P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(AN)P(B|AN)

ð2:3Þ

which is the total probability equation.

Example 2.1: In a chess tournament, there are 100 players. Of these 100 players, 20 are at an advanced level, 30 are at an intermediate level, and 50 are at a beginner level. You randomly choose an opponent and play a game.

(a) What is the probability that you will play against an advanced player?
(b) What is the probability that you will play against an intermediate player?
(c) What is the probability that you will play against a beginner player?


Fig. 2.3 Partition of the sample space for Example 2.1

Solution 2.1: The experiment here can be considered as playing a chess game against an opponent. The sample space is

S = {100 players}

and the events are

A = {20 advanced players}   B = {30 intermediate players}   C = {50 beginner players}.

The probabilities P(A), P(B), and P(C) can be calculated as

P(A) = N(A)/N(S) → P(A) = 20/100
P(B) = N(B)/N(S) → P(B) = 30/100
P(C) = N(C)/N(S) → P(C) = 50/100.

The sample space and its partition are depicted in Fig. 2.3.

Example 2.2: In a chess tournament, there are 100 players. Of these 100 players, 20 are at an advanced level, 30 are at an intermediate level, and 50 are at a beginner level. Your probability of winning against an advanced player is 0.2, against an intermediate player it is 0.5, and against a beginner player it is 0.7. You randomly choose an opponent and play a game. What is the probability of winning?

Solution 2.2: The sample space is S = {100 players}, and the events are

A = {20 advanced players}   B = {30 intermediate players}   C = {50 beginner players}
W = {The players you can beat}.


It is clear that

S = A ∪ B ∪ C

and the events A, B, and C are disjoint events. In the previous example, the probabilities P(A), P(B), and P(C) were calculated as

P(A) = 20/100   P(B) = 30/100   P(C) = 50/100.

And in this example, the following information is given:

P(W|A) = 0.2   P(W|B) = 0.5   P(W|C) = 0.7.

Using the total probability law

P(W) = P(A)P(W|A) + P(B)P(W|B) + P(C)P(W|C)

the probability of winning against a randomly chosen opponent can be calculated as

P(W) = 0.2 × 0.2 + 0.3 × 0.5 + 0.5 × 0.7 → P(W) = 0.54.

Exercise: There is a box, and inside the box there are 100 question cards. Of these 100 mathematics questions, 10 are difficult, 50 are normal, and 40 are easy. Your probability of solving a difficult question is 0.2, it is 0.4 for normal questions, and it is 0.6 for easy questions. You randomly choose a card; what is the probability that you can solve the question on the card?
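The computation in Example 2.2 is a direct instance of (2.1). A sketch of ours with the example's numbers (the dictionary names are illustrative):

```python
# P(Ai): the partition of opponents, and P(W | Ai): winning probabilities.
priors = {"advanced": 0.2, "intermediate": 0.3, "beginner": 0.5}
win_given = {"advanced": 0.2, "intermediate": 0.5, "beginner": 0.7}

# Total probability theorem (2.1): P(W) = sum over i of P(Ai) P(W | Ai).
p_win = sum(priors[k] * win_given[k] for k in priors)
print(round(p_win, 2))  # 0.54
```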

2.1.2 Bayes' Rule

Let A1, A2, ⋯, AN be disjoint events that form a partition of a sample space S. The conditional probability P(Ai|B) can be calculated using

P(Ai|B) = P(Ai ∩ B)/P(B)

which can be written as

P(Ai|B) = P(Ai)P(B|Ai)/P(B)

in which using the total probability theorem for P(B), we obtain

P(Ai|B) = P(Ai)P(B|Ai) / [P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(AN)P(B|AN)]

ð2:4Þ

which is called Bayes' rule.

Example 2.3: In a chess tournament, there are 100 players. Of these 100 players, 20 are at an advanced level, 30 are at an intermediate level, and 50 are at a beginner level. Your probability of winning against an advanced player is 0.2, against an intermediate player it is 0.5, and against a beginner player it is 0.7. You randomly choose an opponent, play a game, and you win. What is the probability that you won against an advanced player?

Solution 2.3: The sample space is S = {100 players}, and the events are

A = {20 advanced players}   B = {30 intermediate players}   C = {50 beginner players}
W = {The players you can beat}.

In the question, we are required to find P(A|W), which can be calculated using

P(A|W) = P(A)P(W|A)/P(W)

in which using

P(W) = P(A)P(W|A) + P(B)P(W|B) + P(C)P(W|C)

with

P(A) = 20/100   P(B) = 30/100   P(C) = 50/100
P(W|A) = 0.2   P(W|B) = 0.5   P(W|C) = 0.7

we obtain

P(A|W) = (0.2 × 0.2)/0.54 → P(A|W) = 0.074

which shows that, given that you won, it is very unlikely that your opponent was an advanced player.
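Bayes' rule (2.4) simply rescales each term of the total-probability sum by the denominator P(W). A sketch with Example 2.3's numbers (variable names are ours):

```python
priors = {"advanced": 0.2, "intermediate": 0.3, "beginner": 0.5}     # P(Ai)
win_given = {"advanced": 0.2, "intermediate": 0.5, "beginner": 0.7}  # P(W|Ai)

p_win = sum(priors[k] * win_given[k] for k in priors)  # denominator of (2.4)
posterior = {k: priors[k] * win_given[k] / p_win for k in priors}

print(round(posterior["advanced"], 3))  # 0.074
```

Note that the posterior probabilities sum to 1 over the partition, as they must.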


Exercise: An electronic device is produced by three factories: F1, F2, and F3. The factories F1, F2, and F3 have market shares of 30%, 30%, and 40%, respectively, and the probabilities of F1, F2, and F3 producing a defective device are 0.02, 0.04, and 0.01, respectively. Assume that you purchased an electronic device produced by these factories, and you found that the device is defective. What is the probability that the defective device was produced by the second factory, i.e., by F2?

Example 2.4: A box contains two regular coins and one two-headed coin, i.e., a biased coin. You pick a coin and flip it, and a head shows up. What is the probability that the chosen coin is the two-headed coin?

Solution 2.4: The experiment for this example can be considered as choosing a coin and flipping it. Since the box contains two fair coins and one two-headed coin, we can write the sample space as

S = {H1, T1, H2, T2, Hb1, Hb2}

where H1, T1, H2, T2 correspond to the fair coins, and Hb1, Hb2 correspond to the two-headed coin. Let's define the events

A = {a head shows up} → A = {H1, H2, Hb1, Hb2}
B = {a biased head shows up} → B = {Hb1, Hb2}

In our example, the conditional probability P(B|A) is asked. We can calculate P(B|A) as

P(B|A) = P(B ∩ A)/P(A) → P(B|A) = P({Hb1, Hb2} ∩ {H1, H2, Hb1, Hb2})/P({H1, H2, Hb1, Hb2}) → P(B|A) = P({Hb1, Hb2})/P({H1, H2, Hb1, Hb2}) → P(B|A) = (2/6)/(4/6) → P(B|A) = 2/4.

In fact, if we inspect the event A = {H1, H2, Hb1, Hb2}, we see that half of the heads are biased.

2.2 Multiplication Rule

Multiplication Rule

For N events of an experiment, we have PðA1 \ A2 ⋯ \ AN Þ = PðA1 ÞPðA2 jA1 ÞPðA3 jA1 \ A2 Þ⋯PðAN jA1 \ A2 ⋯AN - 1 Þ

ð2:5Þ

which can be written mathematically in a more compact manner as N

P \Ni = 1 Ai =

i=1

1 P Ai \ij = 1 Aj :

ð2:6Þ

Proof: We can show the correctness of (2.5) using the definition of the conditional probability as in PðA1 \ A2 ⋯ \ AN Þ = PðA1 Þ 

PðA1 \ A2 Þ PðA3 \ A1 \ A2 Þ PðAN \ A1 \ A2 ⋯AN - 1 Þ ⋯ PðA1 Þ PðA1 \ A2 Þ PðA1 \ A2 ⋯AN - 1 Þ

in which canceling the common terms, we get PðA1 \ A2 ⋯ \ AN Þ = PðAN \ A1 \ A2 ⋯AN - 1 Þ which is a correct equality. Example 2.5: There is a box containing 6 white and 6 black balls. We pick 3 balls from the box without replacement, i.e., without putting them back to the box, in a sequential manner. What is the probability that all the drawn balls are white in color? Solution 2.5: Let’s define the events A1 = {The first drawn ball is white:g} A2jA1 = {The second drawn ball is white assuming that the first drawn ball is white:g.} A3jA1, A2 = {The third drawn ball is white assuming that the first and second drawn balls are white:g.} Note here that A2 j A1 or A3 j A1, A2 are just notations; they are used to express the conditional occurrence of events in a short way.


Before starting the draws, the initial sample space is

S1 = {6 White Balls, 6 Black Balls}.

Since the experiment is a fair one, the probability of the event A1 can be calculated as

P(A1) = N(A1)/N(S1)   (2.7)

where N(A1) and N(S1) are the numbers of simple events in A1 and S1, respectively. The probability in (2.7) can be calculated as

P(A1) = N(A1)/N(S1) → P(A1) = 6/12.

After the first draw, the sample space has one missing element, and it can be written as

S2 = {5 White Balls, 6 Black Balls}.

The probability of A2 given A1 can be calculated as

P(A2|A1) = N(A2|A1)/N(S2) = 5/11.

Similarly, the probability P(A3|A1 ∩ A2) is calculated as

P(A3|A1 ∩ A2) = 4/10.

In the question, P(A1 ∩ A2 ∩ A3) is asked. We can calculate P(A1 ∩ A2 ∩ A3) as

P(A1 ∩ A2 ∩ A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2) → P(A1 ∩ A2 ∩ A3) = 6/12 × 5/11 × 4/10.
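The chain of conditional probabilities in Example 2.5 can be computed in a loop (our sketch, not from the book):

```python
from fractions import Fraction

white, total = 6, 12     # 6 white and 6 black balls
p_all_white = Fraction(1)
for k in range(3):       # multiplication rule (2.5): P(A1) P(A2|A1) P(A3|A1 ∩ A2)
    p_all_white *= Fraction(white - k, total - k)

print(p_all_white)  # 6/12 * 5/11 * 4/10 reduces to 1/11
```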

2.3 Independence

The events A and B are said to be independent events if the occurrence of the event B does not change the probability of the occurrence of event A. That is, if


P(A | B) = P(A)    (2.8)

then the events A and B are said to be independent. The independence condition in (2.8) can alternatively be expressed as

P(A | B) = P(A) → P(A ∩ B) / P(B) = P(A) → P(A ∩ B) = P(A) P(B).

Namely, the events A and B are independent of each other if

P(A ∩ B) = P(A) P(B)

is satisfied.

Note: For disjoint events A and B, we have P(A ∩ B) = 0, and for independent events A and B, we have P(A ∩ B) = P(A) P(B).

Example 2.6: Show that two disjoint events A and B with nonzero probabilities can never be independent.

Proof 2.6: Let A and B be two disjoint events such that

P(A) > 0,  P(B) > 0,  and  P(A ∩ B) = 0.

It is clear that P(A) P(B) > 0. This means that

P(A ∩ B) ≠ P(A) P(B).

Thus, two disjoint events can never be independent.

Example 2.7: A three-sided fair die is tossed twice.

(a) Write the sample space of this experiment.
(b) Consider the following events:

A = {The first toss shows up f1}
B = {The second toss shows up f3}.

Decide whether the events A and B are independent or not.


Solution 2.7: The sample space of a single toss is S1 = {f1, f2, f3}. The sample space of the two tosses can be calculated as

S = S1 × S1 → S = {f1f1, f1f2, f1f3, f2f1, f2f2, f2f3, f3f1, f3f2, f3f3}.

The events A and B can be written as

A = {f1f1, f1f2, f1f3}    B = {f1f3, f2f3, f3f3}

whose probabilities are evaluated as

P(A) = 3/9    P(B) = 3/9.    (2.9)

The event A ∩ B can be found as

A ∩ B = {f1f3}

whose probability is

P(A ∩ B) = 1/9.    (2.10)

Since

P(A ∩ B) = P(A) × P(B)

is satisfied, we can conclude that the events A and B are independent of each other.
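Independence checks like the one in Example 2.7 reduce to counting outcomes in a finite sample space, so they can be automated. A minimal Python sketch (the event encodings below are hypothetical, not from the book):

```python
from fractions import Fraction
from itertools import product

faces = ["f1", "f2", "f3"]
sample_space = list(product(faces, faces))  # 9 equally likely outcomes

def prob(event):
    # Probability of an event given as a predicate on outcomes.
    return Fraction(sum(1 for w in sample_space if event(w)), len(sample_space))

A = lambda w: w[0] == "f1"   # first toss shows f1
B = lambda w: w[1] == "f3"   # second toss shows f3
AB = lambda w: A(w) and B(w)

print(prob(A), prob(B), prob(AB))  # 1/3 1/3 1/9
```

Since P(A ∩ B) = P(A) P(B) holds, the check confirms the independence found above.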

2.3.1 Independence of Several Events

Let A1, A2, ⋯, AN be events of an experiment. The events A1, A2, ⋯, AN are independent of each other if

P(∩_{i∈B} Ai) = ∏_{i∈B} P(Ai)  for every subset B of {1, 2, ⋯, N}.    (2.11)

Example 2.8: If the events A1, A2, and A3 are independent of each other, then all of the following equalities must be satisfied:

1. P(A1 ∩ A2) = P(A1) P(A2)
2. P(A1 ∩ A3) = P(A1) P(A3)


3. P(A2 ∩ A3) = P(A2) P(A3)
4. P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3)

Exercise: For two independent tosses of a fair die, the following events are defined:

A = {First toss shows up 1 or 2}
B = {Second toss shows up 2 or 4}
C = {The sum of the two numbers is 8}.

Decide whether the events A, B, and C are independent of each other or not.

2.4 Conditional Independence

The events A and B are said to be conditionally independent given an event C if

P(A ∩ B | C) = P(A | C) P(B | C)    (2.12)

is satisfied. The left-hand side of the conditional independence condition in (2.12) can be written as

P(A ∩ B | C) = P(A ∩ B ∩ C) / P(C)

in which, using the property

P(A ∩ B ∩ C) = P(C) P(B | C) P(A | B ∩ C),

we obtain

P(A ∩ B | C) = P(C) P(B | C) P(A | B ∩ C) / P(C)

leading to

P(A ∩ B | C) = P(B | C) P(A | B ∩ C).    (2.13)


Substituting (2.13) into the left-hand side of (2.12), we get

P(B | C) P(A | B ∩ C) = P(A | C) P(B | C)

where canceling the common term from both sides, we get

P(A | B ∩ C) = P(A | C).    (2.14)

Conditional independence implies that, given that the event C has occurred, the additional occurrence of the event B does not have any effect on the probability of occurrence of the event A.

Example 2.9: For the two tosses of a fair coin, the following events are defined:

A = {First flip shows up a Head}
B = {Second flip shows up a Tail}
C = {At least one Head appears in the two flips}.

Decide whether the events A and B are conditionally independent given the event C.

Solution 2.9: The events A, B, and C can be written as

A = {HH, HT}    B = {HT, TT}    C = {HT, TH, HH}

and the sample space is S = {HH, HT, TH, TT}. For the conditional independence of A and B given C, we must have

P(A | B ∩ C) = P(A | C)

which can be written as

P(A ∩ B ∩ C) / P(B ∩ C) = P(A ∩ C) / P(C).    (2.15)

Using the given events, the probabilities in (2.15) can be calculated as

P(A ∩ B ∩ C) = P({HT}) → P(A ∩ B ∩ C) = 1/4

P(B ∩ C) = P({HT}) → P(B ∩ C) = 1/4


P(A ∩ C) = P({HH, HT}) → P(A ∩ C) = 2/4.

Then, from (2.15) we would need

(1/4) / (1/4) = (2/4) / (3/4) → 1 = 2/3

which is not correct. Thus, for the given events, we have

P(A | B ∩ C) ≠ P(A | C)

which means that the events A and B are not conditionally independent given C.

Example 2.10: Show that if A and B are independent events, then so are A and B^c.

Proof 2.10: If A and B are independent events, then we have

P(A ∩ B) = P(A) P(B).

The event A can be written as

A = A ∩ S

in which, substituting S = B ∪ B^c, we get

A = A ∩ (B ∪ B^c) → A = (A ∩ B) ∪ (A ∩ B^c)

where employing probability law axiom-2, we obtain

P(A) = P(A ∩ B) + P(A ∩ B^c)

in which, using P(A ∩ B) = P(A) P(B), we get

P(A) = P(A) P(B) + P(A ∩ B^c)

leading to

P(A) − P(A) P(B) = P(A ∩ B^c) → P(A)(1 − P(B)) = P(A ∩ B^c) → P(A ∩ B^c) = P(A) P(B^c)

since 1 − P(B) = P(B^c).
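The conditional-independence test of Example 2.9 can also be verified numerically by enumerating the four equally likely outcomes of the two coin flips. A short Python sketch (the predicate encodings are hypothetical):

```python
from fractions import Fraction
from itertools import product

sample_space = list(product("HT", repeat=2))  # HH, HT, TH, TT

def prob(event):
    # Probability of an event given as a predicate on outcomes.
    return Fraction(sum(1 for w in sample_space if event(w)), len(sample_space))

A = lambda w: w[0] == "H"   # first flip is a head
B = lambda w: w[1] == "T"   # second flip is a tail
C = lambda w: "H" in w      # at least one head appears

# Compare P(A | B ∩ C) with P(A | C).
p_A_given_BC = prob(lambda w: A(w) and B(w) and C(w)) / prob(lambda w: B(w) and C(w))
p_A_given_C = prob(lambda w: A(w) and C(w)) / prob(C)

print(p_A_given_BC, p_A_given_C)  # 1 and 2/3
```

The two conditional probabilities differ, confirming that A and B are not conditionally independent given C.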


Exercise: Show that if A and B are independent events, so are A^c and B^c. Hint: A^c = A^c ∩ S and S = B ∪ B^c.

Exercise: Show that if A and B are independent events, so are A^c and B. Hint: A^c = A^c ∩ S and S = B ∪ B^c, and use the result of the previous example.

2.5 Independent Trials and Binomial Probabilities

Assume that we perform an experiment, and at the end of the experiment, we wonder whether an event has occurred or not; for example, the flip of a fair coin and the occurrence of a head, success or failure in an exam, winning or losing a game, whether it rains or not, the toss of a die and the occurrence of an even number, etc. Let's assume that such experiments are repeated N times in a sequential manner, for instance, flipping a fair coin ten times, playing 10 chess games, etc. We wonder about the probability of the same event occurring k times out of N trials. Let's explain the topic with an example.

Example 2.11: Consider the flip of a biased coin. The sample space is S1 = {H, T}, and the simple events have the probabilities

P(H) = p    P(T) = 1 − p.

Let's say that we flip the coin 5 times. In this case, the sample space is calculated by taking the 5-fold Cartesian product of S1 with itself, i.e.,

S = S1 × S1 × S1 × S1 × S1

which includes 32 elements, and each element of S is a sequence of 5 simple events, for instance, HHHHH, HHHHT, etc. Now think about the question: what is the probability of seeing 3 heads and 2 tails after 5 flips of the coin? The event A of having 3 heads and 2 tails can be written as

A = {HHHTT, HHTTH, HTTHH, TTHHH, THTHH, HTHTH, HHTHT, THHTH, HTHHT, THHHT}.

The probability of any simple event containing 3 heads and 2 tails equals p^3(1 − p)^2; for instance, P(HHHTT) can be calculated as

P(HHHTT) = P(H) P(H) P(H) P(T) P(T) → P(HHHTT) = p^3 (1 − p)^2.


The probability of the event A can be calculated by summing the probabilities of the simple events appearing in A. Since there are 10 simple events in A, each having probability of occurrence p^3(1 − p)^2, the probability of A can be calculated as

P(A) = p^3(1 − p)^2 + p^3(1 − p)^2 + ⋯ + p^3(1 − p)^2 → P(A) = 10 × p^3(1 − p)^2

which can be written as

P(A) = C(5, 3) p^3 (1 − p)^2.

Thus, the probability of seeing 3 heads and 2 tails after 5 tosses of the coin is C(5, 3) p^3(1 − p)^2.

Now consider the events

A0 = {0 Heads, 5 Tails}
A1 = {1 Head, 4 Tails}
A2 = {2 Heads, 3 Tails}
A3 = {3 Heads, 2 Tails}
A4 = {4 Heads, 1 Tail}
A5 = {5 Heads, 0 Tails}

It is obvious that the events A0, A1, A2, A3, A4, A5 are disjoint, i.e., Ai ∩ Aj = ϕ, i, j = 0, 1, ⋯, 5, i ≠ j, and we have

S = A0 ∪ A1 ∪ A2 ∪ A3 ∪ A4 ∪ A5.

According to probability law axioms 2 and 3 in (1.2) and (1.3), we have

P(S) = 1 → P(A0) + P(A1) + P(A2) + P(A3) + P(A4) + P(A5) = 1

leading to

C(5, 0) p^0(1 − p)^5 + C(5, 1) p^1(1 − p)^4 + C(5, 2) p^2(1 − p)^3 + C(5, 3) p^3(1 − p)^2 + C(5, 4) p^4(1 − p)^1 + C(5, 5) p^5(1 − p)^0 = 1

which can be written as

∑_{k=0}^{5} C(5, k) p^k (1 − p)^{5−k} = 1.
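The normalization above can be checked directly with exact arithmetic. A minimal Python sketch (the value p = 1/3 is an arbitrary illustration, not from the book):

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 3)  # an arbitrary head probability for illustration

# P(Ak) = C(5, k) p^k (1 - p)^(5 - k) for k = 0, ..., 5
pmf = [comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(6)]

print(sum(pmf))                           # 1
print(pmf[3] == 10 * p**3 * (1 - p)**2)   # True: C(5, 3) = 10
```

The six probabilities sum to 1, as the binomial theorem guarantees for any p between 0 and 1.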

Example 2.12: Consider the flip of a coin with P(H) = p. The sample space is S1 = {H, T}. Let's say that we flip the coin N times. In this case, the sample space is calculated by taking the N-fold Cartesian product of S1 with itself, i.e.,

S = S1 × S1 × ⋯ × S1.

What is the probability of seeing k heads at the end of N trials? Following a similar approach as in the previous example, we can write

P(Ak) = C(N, k) p^k (1 − p)^{N−k}    (2.16)

where Ak = {k heads appear in N flips}. Considering the disjoint events A0, A1, ⋯, AN such that S = A0 ∪ A1 ∪ ⋯ ∪ AN, we can write

∑_{k=0}^{N} C(N, k) p^k (1 − p)^{N−k} = 1.

Now consider the event that the number of heads appearing in N tosses is a number between k1 and k2. This event can be written as

A_{k1} ∪ A_{k1+1} ∪ ⋯ ∪ A_{k2}

where the events A_{k1}, A_{k1+1}, ⋯, A_{k2} are disjoint. Its probability can be calculated as

P(the number of heads in N flips is between k1 and k2)
= P(A_{k1} ∪ A_{k1+1} ∪ ⋯ ∪ A_{k2})
= P(A_{k1}) + P(A_{k1+1}) + ⋯ + P(A_{k2})
= ∑_{k=k1}^{k2} C(N, k) p^k (1 − p)^{N−k}.    (2.17)


Note: Let x and y be two simple events of a sample space; then we have x ∪ y = {x, y}, and for the Cartesian product, we can write

(x ∪ y) × (x ∪ y) → {x, y} × {x, y} = {xx, xy, yx, yy}

thus

(x ∪ y) × (x ∪ y) = {xx, xy, yx, yy}.

A similar approach can be considered for two events A and B of an experiment.

Example 2.13: A fair die is tossed 5 times. What is the probability that a number divisible by 3 appears 4 times?

Solution 2.13: For the fair die toss experiment, the sample space is

S1 = {1, 2, 3, 4, 5, 6}.

The event "a number divisible by 3 appears" can be written as

A = {3, 6}.

In terms of the events A and B, we can write the sample space as

S2 = A ∪ B

where B = {1, 2, 4, 5}. The probabilities of the events A and B are

P(A) = 2/6    P(B) = 4/6.

When the fair die is tossed 5 times, the sample space of the combined experiment happens to be

S = S2 × S2 × S2 × S2 × S2

which includes 32 elements, i.e.,

S = {AAAAA, AAAAB, AAABA, ⋯, BBBBB}.

The event of S in which A appears 4 times, i.e., a number divisible by 3 appears 4 times, is


C = {AAAAB, AAABA, AABAA, ABAAA, BAAAA}.

The probability of the event C can be calculated as

P(C) = P(AAAAB) + P(AAABA) + P(AABAA) + P(ABAAA) + P(BAAAA)

where each term equals (1/3)^4 (2/3), leading to

P(C) = 5 × (1/3)^4 × (2/3)

which can be written as

P(C) = C(5, 4) × (1/3)^4 × (2/3).

In fact, using the formula

C(N, k) p^k (1 − p)^{N−k}

directly for the given example, we get the same result.

Theorem 2.1: Let S be the sample space of an experiment, and let A be an event with p = P(A). Assume that the experiment is performed N times. The probability of the event A occurring k times in N trials can be calculated as

P_N(A_k) = C(N, k) p^k (1 − p)^{N−k}.    (2.18)

Example 2.14: A biased coin has the simple event probabilities

P(H) = 2/3    P(T) = 1/3.

Assume that the biased coin is flipped and a fair die is tossed together 8 times. What is the probability that a tail and an even number appear together 5 times?

Solution 2.14: The sample space of the biased coin flip experiment is

S1 = {H, T}.

The sample space of the die toss experiment is

2.5

Independent Trials and Binomial Probabilities

47

S2 = {1, 2, 3, 4, 5, 6}.

The combined experiment has the sample space

S = S1 × S2 → S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}.

The event

A = {A tail and an even number appear}

can be written as

A = {T2, T4, T6}

and the sample space S3, considering the experimental outcomes given in the question, is

S3 = A ∪ B

where

B = {H1, H2, H3, H4, H5, H6, T1, T3, T5}.

The probability of the event A can be calculated using

P(A) = P(T2) + P(T4) + P(T6) → P(A) = P(T)P(2) + P(T)P(4) + P(T)P(6)

leading to

P(A) = 1/3 × 1/6 + 1/3 × 1/6 + 1/3 × 1/6 → P(A) = 1/6.

Now consider the combined experiment, i.e., the biased coin is flipped and the fair die is tossed together 8 times. The probability that the event A occurs 5 times in 8 trials of the experiment can be calculated using

P_N(A_k) = C(N, k) p^k (1 − p)^{N−k} → P_8(A_5) = C(8, 5) (1/6)^5 (5/6)^3.
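The final value of Example 2.14 can be computed exactly with a few lines of Python (a sketch, not part of the book):

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)  # P(tail and even number) from the example
# P_8(A_5) = C(8, 5) p^5 (1 - p)^3
p_5_of_8 = comb(8, 5) * p**5 * (1 - p)**3
print(p_5_of_8, float(p_5_of_8))
```

Exact fractions avoid the rounding error that floating-point powers of 1/6 would introduce.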

Example 2.15: We flip a biased coin and draw a ball with replacement from a box that contains 2 red, 3 yellow, and 2 blue balls. For the biased coin, the probabilities of the head and tail are


P(Hb) = 1/4    P(Tb) = 3/4.

If we repeat the experiment 8 times, what is the probability of seeing a tail and drawing a blue ball together 5 times?

Solution 2.15: For the biased coin flip experiment, the sample space can be written as

S1 = {Hb, Tb}

and for the ball drawing experiment, we can write the sample space as

S2 = {R1, R2, Y1, Y2, Y3, B1, B2}.

If we consider the two experiments at the same time, i.e., the combined experiment, the sample space can be formed as

S = S1 × S2 → S = {Hb, Tb} × {R1, R2, Y1, Y2, Y3, B1, B2} →
S = {HbR1, HbR2, HbY1, HbY2, HbY3, HbB1, HbB2, TbR1, TbR2, TbY1, TbY2, TbY3, TbB1, TbB2}.

Let's define the event A as

A = {seeing a tail and drawing a blue ball}.

We can write the elements of A explicitly as

A = {TbB1, TbB2}.

The probability of A can be calculated as

P(A) = P(TbB1) + P(TbB2) → P(A) = P(Tb)P(B1) + P(Tb)P(B2) → P(A) = 3/4 × 1/7 + 3/4 × 1/7 → P(A) = 3/14.

In our question, the experiment is repeated 8 times, and the probability of seeing a tail and drawing a blue ball together 5 times is asked. We can calculate the asked probability as

P(A5) = C(8, 5) (3/14)^5 (11/14)^3.


Exercise: A biased coin is flipped and a 4-sided biased die is tossed together 8 times. The probabilities of the simple events for the separate experiments are

P(H) = 2/3    P(T) = 1/3

P(f1) = 2/6    P(f2) = 1/6    P(f3) = 2/6    P(f4) = 1/6.

What is the probability of seeing a head and an odd number 3 times in 8 tosses?

Example 2.16: An electronic device is produced by a factory. The probability that a produced device is defective equals 0.1. We purchase 1000 of these devices. What is the probability that the total number of defective devices is a number between 50 and 150?

Solution 2.16: Consider the coin toss experiment with sample space S1 = {H, T}. If you flip the coin N times, you calculate the sample space using the N-fold Cartesian product

S = S1 × S1 × ⋯ × S1.

The given question can be considered in a similar manner. Purchasing an electronic device can be considered an experiment. The simple events of this experiment are the defective and non-defective devices, i.e.,

S2 = {D, N}

where D and N refer to the purchase of a defective and a non-defective device, respectively. Purchasing 1000 electronic devices can be considered as repeating the experiment 1000 times, and the sample space of this combined experiment can be calculated in a similar manner to the coin toss experiment as

S = S2 × S2 × ⋯ × S2.

The probability of a simple event with N letters in which D appears k times can be calculated as

p^k (1 − p)^{N−k}

and the probability of the event including the simple events of S in which D appears k times is calculated as


C(N, k) p^k (1 − p)^{N−k}.

And if k is a number between k1 and k2, the sum of the probabilities of all these events equals

∑_{k=k1}^{k2} C(N, k) p^k (1 − p)^{N−k}.

For our question, the probability that the total number of defective devices is a number between 50 and 150 can be calculated as

∑_{k=50}^{150} C(1000, k) (0.1)^k (0.9)^{1000−k}.

2.6 The Counting Principle

Assume that there are M experiments with sample spaces S1, S2, ⋯, SM. The numbers of elements in the sample spaces S1, S2, ⋯, SM are N1, N2, ⋯, NM, respectively. If the experiments are all considered together as a single experiment, then the sample space of the combined experiment is calculated as

S = S1 × S2 × ⋯ × SM    (2.19)

and the number of elements in the sample space S equals

N = N1 × N2 × ⋯ × NM.    (2.20)

Example 2.17: Consider the integer set Fq = {0, 1, 2, ⋯, q − 1}. Assume that we construct integer vectors

v = [v1 v2 ⋯ vL]

using the integers in Fq. How many different integer vectors can we have?

Solution 2.17: Selecting a number from the integer set Fq = {0, 1, 2, ⋯, q − 1} can be considered as an experiment, and the sample space of this experiment is

S1 = {0, 1, 2, ⋯, q − 1}.

To construct an integer vector v including L integers, we need to repeat the experiment L times. The sample space of the combined experiment can be obtained by taking the L-fold Cartesian product of S1 with itself as


S = S1 × S1 × ⋯ × S1.

The elements of S are integer vectors containing L numbers. The number of vectors in S is calculated as

q × q × ⋯ × q (L times) = q^L.

Example 2.18: Consider the integer set F3 = {0, 1, 2}. Assume that we construct integer vectors v = [v1 v2 ⋯ v10] including 10 integers using the elements of F3. How many different integer vectors can we have?

Solution 2.18: The answer is

3 × 3 × ⋯ × 3 (10 times) = 3^10.
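The counting principle can be confirmed by generating the vectors explicitly for small parameters. A minimal Python sketch (the values q = 3, L = 4 are chosen small for illustration; the example above uses L = 10):

```python
from itertools import product

q, L = 3, 4  # small stand-ins for the parameters of Example 2.18
# The combined sample space: all L-fold Cartesian products of {0, ..., q-1}.
vectors = list(product(range(q), repeat=L))
print(len(vectors))  # 81 = 3**4
```

The count matches q^L, the product of the sizes of the individual sample spaces.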

2.7 Permutation

Consider the integer set S1 = {1, 2, ⋯, N}. Assume that we draw an integer from the set without replacement, and we repeat this experiment k times in total. The sample space of the kth draw, i.e., the kth experiment, is indicated by Sk. The sample space of the combined experiment

S = S1 × S2 × ⋯ × Sk

contains

N × (N − 1) × ⋯ × (N − k + 1)

k-digit integer sequences; this number is called the k-permutation of N, and it is shortly expressed as

N! / (N − k)!    (2.21)

The sample space S of the combined experiment contains simple events consisting of k distinct integers chosen from S1. Thus, at the end of the kth trial, we obtain a sequence of k distinct integers. The number N × (N − 1) × ⋯ × (N − k + 1) indicates the total number of integer sequences containing k distinct integers, i.e., the number of elements in the sample space S.


The discussion given above can be extended to any set containing objects rather than integers. In that case, while forming the distinct sequences of objects, we pay attention to the indices of the objects.

Example 2.19: The set S1 = {1, 2, 3} is given. We draw 2 integers from the set without replacement. Write the possible generated sequences.

Solution 2.19: If 1 is selected at the first trial, then at the end of the second trial we can get the sequences

1 × {2, 3} → 12, 13.

If 2 is selected at the first trial, then at the end of the second trial we can get the sequences

2 × {1, 3} → 21, 23.

If 3 is selected at the first trial, then at the end of the second trial we can get the sequences

3 × {1, 2} → 31, 32.

Hence, the possible 2-digit sequences containing distinct elements are

{12, 13, 21, 23, 31, 32}.

The number of 2-digit sequences is 6, which can be obtained by taking the 2-permutation of 3 as

3! / (3 − 2)! = 6.

Example 2.20: In the English language, there are 26 letters. How many words can be formed consisting of 5 distinct letters?

Solution 2.20: You can consider this question as the draw of letters from the alphabet box without replacement, repeating the experiment 5 times. Then, the number of words that contain 5 distinct letters can be calculated using

26 × 25 × 24 × 23 × 22

which is nothing but the 5-permutation of 26.
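The k-permutation count N!/(N − k)! can be checked against direct enumeration. A minimal Python sketch (the 5-letter alphabet is a small hypothetical stand-in for the 26-letter one):

```python
from itertools import permutations
from math import factorial

letters = "abcde"  # a small stand-in for the 26-letter alphabet
k = 3
# All ordered sequences of k distinct letters.
seqs = list(permutations(letters, k))
print(len(seqs))  # 60 = 5!/(5-3)!
```

The enumeration agrees with the formula: 5 × 4 × 3 = 60.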

2.8 Combinations

Assume that there are N people, and we want to form a group consisting of k persons selected from the N people. How many different groups can we form? The answer to this question passes through a permutation calculation. We can start by calculating the k-permutation of N. However, since humans are considered while forming the sequences, some of the sequences include the same persons although their order in the sequence is different. For instance, the sequences abcd and bcda contain the same persons, and they are considered the same group. The elements of a sequence containing k distinct elements can be reordered in k! different ways.

Example 2.21: The sequence aec can be reordered as

ace  eac  eca  cea  cae.

Hence, for a sequence including 3 distinct elements, it is possible to obtain 3! = 6 sequences, and if each letter indicates a person, then it is obvious that all these groups are the same.

Considering all the above discussion, we can conclude that in the k-permutation of N, each sequence appears k! times including its reordered versions. Then, the total number of unique sequences without any reordered replicas equals

N! / ((N − k)! × k!)    (2.22)

which is called the k-combination of N, and it is shortly indicated as

C(N, k).    (2.23)

Example 2.22: Consider the sample space S = {a, b, c, d}. The number of different sequences containing 2 distinct letters from S can be calculated using the 2-permutation of 4 as

4! / (4 − 2)! = 12

and these sequences can be written as

ab  ac  ad  ba  bc  bd  ca  cb  cd  da  db  dc.

On the other hand, if reordered replicas are not wanted, then the number of sequences containing 2 distinct letters can be calculated using


4! / ((4 − 2)! × 2!) = 6

and the sequences can be written as

ab  ac  ad  bc  bd  cd.

Example 2.23: A box contains 60 items, and of these 60 items, 15 are defective. Suppose that we select 23 items randomly. What is the probability that 8 of these 23 items are defective?

Solution 2.23: Let's formulate the solution as follows. The sample space is

S = {Selecting 23 items out of 60 items}.

The sample space contains

N(S) = C(60, 23)

different elements, i.e., sequences. Let's define the event A as

A = {Of the 23 selected items, 8 are defective and 15 are robust}.

In fact, the event A can be written as

A = A1 × A2

where the events A1 and A2 are defined as

A1 = {Choose 8 defective items out of 15 defective items}
A2 = {Choose 15 robust items out of 45 robust items}.

The event A contains

N(A) = C(15, 8) × C(45, 15)

elements. The probability of the event A is calculated as

P(A) = N(A) / N(S) → P(A) = [C(15, 8) × C(45, 15)] / C(60, 23).


Example 2.24: An urn contains 3 red and 3 green balls, each of which is labeled with a different number. A sample of 4 balls is drawn without replacement. Find the number of elements in the sample space.

Solution 2.24: Let's show the content of the urn by the set {R1, R2, R3, G1, G2, G3}. After the drawing of the 4 balls, we can get the combinations

R1R2R3G1  R1R2R3G2  R1R2R3G3  R1R2G1G2  R1R2G1G3  R1R2G2G3
R1R3G1G2  R1R3G1G3  R1R3G2G3  R2R3G1G2  R2R3G1G3  R2R3G2G3
R1G1G2G3  R2G1G2G3  R3G1G2G3

The total number of combinations is 15, which is equal to the 4-combination of 6. That is,

N(S) = C(6, 4).

Example 2.25: For the previous example, consider the event A defined as

A = {2 red balls are drawn, 2 green balls are drawn}.

Find the probability of the event A.

Solution 2.25: The event A can be written as

A = A1 × A2

where the events A1 and A2 are defined as

A1 = {2 red balls are drawn}    A2 = {2 green balls are drawn}.

We have

N(A1) = C(3, 2)    N(A2) = C(3, 2)

and

N(A) = N(A1) × N(A2) → N(A) = C(3, 2) × C(3, 2).

The probability of the event A can be calculated as

P(A) = N(A) / N(S) → P(A) = [C(3, 2) × C(3, 2)] / C(6, 4).
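Example 2.25 is small enough to check by listing every 4-ball draw. A minimal Python sketch (the string labels are a hypothetical encoding of the urn):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

urn = ["R1", "R2", "R3", "G1", "G2", "G3"]
draws = list(combinations(urn, 4))  # all C(6, 4) = 15 unordered draws
# Keep draws with exactly 2 red balls (and hence 2 green balls).
favorable = [d for d in draws if sum(x.startswith("R") for x in d) == 2]

p = Fraction(len(favorable), len(draws))
print(p)  # 3/5
```

The enumeration gives 9/15 = 3/5, matching C(3, 2) × C(3, 2) / C(6, 4).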

2.9 Partitions

Suppose that we have N distinct objects, and N = N1 + N2 + ⋯ + Nr. We first draw N1 objects without replacement and make a group with these N1 objects, then draw N2 objects from the remaining objects without replacement and make another group with these N2 objects, and go on like this until the formation of the last group containing Nr objects. Each draw can be considered a separate experiment; let's denote the sample space of the kth experiment by Sk. The sample space of the combined experiment can be formed using the Cartesian product

S = S1 × S2 × ⋯ × Sr

and the size of the sample space S, denoted by |S|, which indicates the number of ways these groups can be formed, can be calculated using

|S| = |S1| × |S2| × ⋯ × |Sr|

leading to

C(N, N1) × C(N − N1, N2) × C(N − N1 − N2, N3) × ⋯ × C(N − N1 − ⋯ − N_{r−1}, Nr)

which can be simplified as

[N! / ((N − N1)! × N1!)] × [(N − N1)! / ((N − N1 − N2)! × N2!)] × ⋯ × [(N − N1 − ⋯ − N_{r−1})! / ((N − N1 − ⋯ − Nr)! × Nr!)]

where canceling the same terms in the numerators and denominators, we obtain

N! / (N1! × N2! × ⋯ × Nr!)    (2.24)
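The cancellation leading to (2.24) can be verified numerically: the product of successive binomial coefficients equals the multinomial coefficient. A minimal Python sketch (the group sizes 3, 4, 5 are arbitrary illustration values):

```python
from math import comb, factorial

# Arbitrary group sizes for illustration.
N1, N2, N3 = 3, 4, 5
N = N1 + N2 + N3

# Left side: sequential draws, C(N, N1) C(N-N1, N2) C(N-N1-N2, N3).
product_of_binomials = comb(N, N1) * comb(N - N1, N2) * comb(N - N1 - N2, N3)
# Right side: the multinomial coefficient N!/(N1! N2! N3!).
multinomial = factorial(N) // (factorial(N1) * factorial(N2) * factorial(N3))

print(product_of_binomials, multinomial)  # both 27720
```

Both sides evaluate to 27720 for these sizes, as the algebraic simplification predicts.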


The idea of partitions can also be interpreted in a different way considering the permutation law. If there are N distinct objects available, the number of N-object sequences that can be formed from these N objects can be calculated as

N × (N − 1) × ⋯ × 1 = N!.    (2.25)

That is, the total number of permutations of N objects equals N!. In fact, the result in (2.25) is nothing but the number of elements in the sample space of the combined experiment; there are N experiments in total, and the sample space of the kth, k = 1, ⋯, N, experiment contains N − k + 1 elements.

Note: |S| indicates the number of elements in the set S.

If N1 of the objects are the same, then the total number of permutations is

N! / N1!.

If N1 objects are the same, and another N2 objects are the same, then the total number of permutations is

N! / (N1! N2!).

In a similar manner, if N1 objects are the same, N2 objects are the same, and so on until the last Nr objects are the same, the total number of permutations is

N! / (N1! N2! ⋯ Nr!) < N!    (2.26)

Example 2.26: In the English language, there are 26 letters. How many words can be formed consisting of 5 distinct letters?

Solution 2.26: You can consider this question as the draw of letters from the alphabet box without replacement, repeating the experiment 5 times. Then, the number of words that contain 5 distinct letters can be calculated using

26 × 25 × 24 × 23 × 22

which is nothing but the 5-permutation of 26.


Example 2.27: The total number of permutations of the sequence abc is 3! = 6, and these sequences are

abc  acb  bac  bca  cab  cba.

On the other hand, the total number of permutations of the sequence aab is 3!/2! = 3, and these sequences are

aab  aba  baa

i.e., 6/2! sequences. The reason for this reduction can be seen by relabeling the letter c of abc as a; the permutations of abc then collapse in pairs:

aab ← acb, cab    aba ← abc, cba    baa ← bac, bca.
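The collapse of permutations with repeated letters can be checked by brute force: generating all orderings and discarding duplicates leaves exactly n!/(repetition factorials) sequences. A minimal Python sketch (not from the book):

```python
from itertools import permutations
from math import factorial

# Distinct orderings of abc: all 3! permutations are different.
n_abc = len(set(permutations("abc")))
# Distinct orderings of aab: the two a's are interchangeable, so 3!/2!.
n_aab = len(set(permutations("aab")))

print(n_abc, n_aab)  # 6 and 3
```

The `set` deduplicates orderings that differ only in the positions of identical letters, which is exactly the k! collapse described above.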

Example 2.28: The total number of permutations of the sequence abcd is 4! = 24. That is, by reordering the items in abcd, we can write 24 distinct sequences in total. On the other hand, the total number of permutations of the sequence abac is 4!/2! = 12, and these sequences are

abac  acab  abca  acba  aabc  aacb  cbaa  bcaa  baca  caba  caab  baac

Exercise: For the sequences abcde and aaabbc, write all the possible permutations, and show the relation between the permutation numbers of the two sequences.

Example 2.29: How many different letter sequences can be formed by reordering the letters in the word TELLME?

Solution 2.29: The number of different letter sequences equals

6! / (2! × 2!).

Partitions Continued

Let S be the sample space of an experiment, and let A1, A2, A3, ⋯, Ar be disjoint sets forming a partition of S such that Ai ∩ Aj = ϕ, i ≠ j, and A1 ∪ A2 ∪ ⋯ ∪ Ar = S. The probabilities of the disjoint events A1, A2, A3, ⋯, Ar are

p1 = P(A1)    p2 = P(A2)    ⋯    pr = P(Ar)

such that


p1 + p2 + ⋯ + pr = 1.

The sample space can be written as

S = {A1, A2, ⋯, Ar}.

Assume that we repeat the experiment N times, with N = N1 + N2 + ⋯ + Nr, and consider the event B defined as

B = {A1 occurs N1 times, A2 occurs N2 times, ⋯, Ar occurs Nr times}.

The probability of the event B can be calculated as

P(B) = [N! / (N1! × N2! × ⋯ × Nr!)] × p1^N1 × p2^N2 × ⋯ × pr^Nr    (2.27)

where

p1^N1 × p2^N2 × ⋯ × pr^Nr    (2.28)

denotes the probability of a single element of B, and

N! / (N1! × N2! × ⋯ × Nr!)    (2.29)

is the total number of elements in B. Every element of B has the same probability of occurrence.

Example 2.30: A fair die is tossed 15 times. What is the probability that the numbers 2 or 4 appear 5 times and 3 appears 4 times?

Solution 2.30: For the given experiment, the sample space is S1 = {1, 2, 3, 4, 5, 6}. We can define the disjoint events A1, A2, and A3 as

A1 = {2, 4}    A2 = {3}    A3 = {1, 5, 6}.

Considering the experimental outcomes expected in the question, we can write the sample space of the experiment as

S1 = A1 ∪ A2 ∪ A3.

The probabilities of the events A1, A2, and A3 can be calculated as

P(A1) = 2/6    P(A2) = 1/6    P(A3) = 3/6.


We perform the experiment 15 times. The sample space of the repeated combined experiment can be found by taking the 15-fold Cartesian product of S1 with itself as

S = S1 × S1 × ⋯ × S1

whose elements consist of 15 letters, i.e.,

S = {A1A1A1A1A1A1A1A1A1A1A1A1A1A1A1, A1A1A1A1A1A1A1A1A1A1A1A1A1A1A2, ⋯}.

Let's define the event B for the combined experiment as

B = {A1 occurs 5 times, A2 occurs 4 times, A3 occurs 6 times}.

The probability of the event A1 occurring 5 times, the event A2 occurring 4 times, and the event A3 occurring 6 times, i.e., the probability of the event B, can be calculated as

P(B) = [15! / (5! × 4! × 6!)] × (2/6)^5 × (1/6)^4 × (3/6)^6

where the coefficient

15! / (5! × 4! × 6!)

indicates the number of elements in the event B, and the multiplication

(2/6)^5 × (1/6)^4 × (3/6)^6

refers to the probability of an element appearing in B.

Example 2.31: Assume that a dart board has 4 regions, and the probabilities for a thrown dart to fall into these regions are 0.1, 0.4, 0.1, and 0.4, respectively. If we throw the dart 12 times, find the probability that each region is hit 3 times.

Solution 2.31: Throwing a dart at a dart board can be considered an experiment. The sample space of this experiment can be considered as hitting the targeted regions. Let's indicate hitting the 4 different targeted regions by the letters T1, T2, T3, and T4. Then, the sample space can be written as

S1 = {T1, T2, T3, T4}.

The probabilities of the simple events T1, T2, T3, T4 are given in the question as

P(T1) = 0.1    P(T2) = 0.4    P(T3) = 0.1    P(T4) = 0.4.

Throwing the dart 12 times can be considered as repeating the same experiment 12 times, and the sample space of the combined experiment in this case can be calculated by taking the 12-fold Cartesian product of S1 with itself, i.e.,

S = S1 × S1 × ⋯ × S1.

The sample space S contains elements consisting of 12 letters, i.e.,

S = {T1T1T1T1T1T1T1T1T1T1T1T1, T1T1T1T1T1T1T1T1T1T1T1T2, ⋯}.

Let's define an event B of S as follows:

B = {T1 appears 3 times, T2 appears 3 times, T3 appears 3 times, T4 appears 3 times}

i.e.,

B = {T1T1T1T2T2T2T3T3T3T4T4T4, T1T2T1T1T2T2T3T3T3T4T4T4, ⋯}.

The probability of the event B can be calculated using

P(B) = [N! / (N1! × N2! × ⋯ × Nr!)] × p1^N1 × p2^N2 × ⋯ × pr^Nr

as

P(B) = [12! / (3! × 3! × 3! × 3!)] × 0.1^3 × 0.4^3 × 0.1^3 × 0.4^3.

Exercise: A fair die is tossed 8 times. Determine the probability that an odd number appears 2 times and 4 appears 3 times.
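The multinomial formula (2.27) used in Examples 2.30 and 2.31 can be wrapped in a small helper for exact evaluation. A minimal Python sketch (the helper name `multinomial_prob` is ours, not from the book):

```python
from fractions import Fraction
from math import factorial

def multinomial_prob(counts, probs):
    """P(B) = N!/(N1! ... Nr!) * p1^N1 * ... * pr^Nr, as in (2.27)."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    p = Fraction(1)
    for c, pr in zip(counts, probs):
        p *= Fraction(pr) ** c
    return coef * p

# Example 2.30: A1 = {2, 4}, A2 = {3}, A3 = {1, 5, 6} in 15 tosses.
p_die = multinomial_prob([5, 4, 6], [Fraction(2, 6), Fraction(1, 6), Fraction(3, 6)])

# Example 2.31: four dart regions hit 3 times each in 12 throws.
p_dart = multinomial_prob([3, 3, 3, 3],
                          [Fraction(1, 10), Fraction(4, 10), Fraction(1, 10), Fraction(4, 10)])

print(float(p_die), float(p_dart))
```

Working with `Fraction` keeps the coefficient 15!/(5! × 4! × 6!) and the probability powers exact until the final conversion to float.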

2.10 Case Study: Modeling of Binary Communication Channel

The binary communication channel is shown in Fig. 2.4. We define the following events for the binary symmetric channel:

T0 = {Transmitting a 0}    T1 = {Transmitting a 1}
R0 = {Receiving a 0}    R1 = {Receiving a 1}
E = {Error at receiver}


Fig. 2.4 The binary communication channel: the transmitter symbols T0 and T1 are connected to the receiver symbols R0 and R1 through the branches labeled with the transition probabilities P(R0 | T0), P(R1 | T0), P(R0 | T1), and P(R1 | T1).

The events T0 and T1 are disjoint events, i.e., T0 ∩ T1 = ∅. The error event E can be written as

E = (T0 ∩ R1) ∪ (T1 ∩ R0)

where T0 ∩ R1 and T1 ∩ R0 are disjoint events. Then, using probability axiom 2, we get

P(E) = P(T0 ∩ R1) + P(T1 ∩ R0)

which can be written as

P(E) = P(R1|T0)P(T0) + P(R0|T1)P(T1).   (2.30)

Since S = T0 ∪ T1, we can write R1 as R1 = R1 ∩ S → R1 = R1 ∩ (T0 ∪ T1), leading to

R1 = (R1 ∩ T0) ∪ (R1 ∩ T1)

from which we get

P(R1) = P(R1 ∩ T0) + P(R1 ∩ T1)

which can also be written as

P(R1) = P(R1|T0)P(T0) + P(R1|T1)P(T1).   (2.31)


Fig. 2.5 Binary symmetric channel: P(R0|T0) = 0.95, P(R1|T0) = 0.05, P(R0|T1) = 0.1, P(R1|T1) = 0.90

Proceeding in a similar manner, we can write

P(R0) = P(R0|T0)P(T0) + P(R0|T1)P(T1).   (2.32)

Example 2.32: For the binary symmetric channel shown in Fig. 2.5, if the bits "0" and "1" have an equal probability of transmission, calculate the probability of error at the receiver side.

Solution 2.32: Since the bits "0" and "1" have equal transmission probabilities, we have

P(T0) = P(T1) = 1/2.

Using the channel transition probabilities, the error probability can be calculated as

P(E) = P(R1|T0)P(T0) + P(R0|T1)P(T1)

leading to

P(E) = 0.05 × 1/2 + 0.1 × 1/2 → P(E) = 0.075.
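The total-probability computation of P(E) can be sketched in a few lines of Python (function and argument names are illustrative):

```python
# P(E) = P(R1|T0)P(T0) + P(R0|T1)P(T1)  (total probability over T0, T1).
def channel_error_prob(p_r1_given_t0, p_r0_given_t1, p_t0=0.5):
    p_t1 = 1.0 - p_t0
    return p_r1_given_t0 * p_t0 + p_r0_given_t1 * p_t1

# Channel of Fig. 2.5 with equally likely transmitted bits.
print(channel_error_prob(0.05, 0.1))  # ≈ 0.075
```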

Example 2.33: For the binary symmetric channel shown in Fig. 2.6, the bits "0" and "1" have equal probability of transmission. If a "1" is received at the receiver side:

(a) What is the probability that a "1" was sent?
(b) What is the probability that a "0" was sent?

Solution 2.33: (a) We are asked to find P(T1|R1), which can be calculated as

P(T1|R1) = P(T1 ∩ R1) / P(R1)
= P(R1|T1)P(T1) / [P(R1|T1)P(T1) + P(R1|T0)P(T0)]
= (0.95 × 0.5) / (0.95 × 0.5 + 0.05 × 0.5)
= 0.95.

Fig. 2.6 Binary symmetric channel for Example 2.33: P(R0|T0) = 0.95, P(R1|T0) = 0.05, P(R0|T1) = 0.1, P(R1|T1) = 0.90

(b) We are asked to find P(T0|R1), which can be calculated as

P(T0|R1) = P(T0 ∩ R1) / P(R1)
= P(R1|T0)P(T0) / [P(R1|T1)P(T1) + P(R1|T0)P(T0)]
= (0.05 × 0.5) / (0.95 × 0.5 + 0.05 × 0.5)
= 0.05.

Note that the two posteriors sum to 1, i.e., P(T1|R1) + P(T0|R1) = 0.95 + 0.05 = 1, as they must.
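Both posteriors of this example can be sketched with Bayes' rule in Python (names are illustrative; P(T0) = 0.5 is the equal-transmission assumption of the example):

```python
# Bayes' rule for the channel of Fig. 2.6, given that a "1" is received:
# P(Ti|R1) = P(R1|Ti)P(Ti) / [P(R1|T0)P(T0) + P(R1|T1)P(T1)].
def posterior_given_r1(p_r1_t0, p_r1_t1, p_t0=0.5):
    p_t1 = 1.0 - p_t0
    p_r1 = p_r1_t0 * p_t0 + p_r1_t1 * p_t1  # total probability theorem
    return p_r1_t0 * p_t0 / p_r1, p_r1_t1 * p_t1 / p_r1

p_t0_r1, p_t1_r1 = posterior_given_r1(p_r1_t0=0.05, p_r1_t1=0.95)
print(round(p_t1_r1, 4), round(p_t0_r1, 4))  # 0.95 0.05
```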

Problems

1. The sample space of an experiment is given as S = {s1, s2, s3, s4, s5, s6, s7, s8}. Partition S as the union of three disjoint events.

2. The sample space of an experiment is given as S = {s1, s2, s3, s4, s5, s6, s7, s8}, where the simple events have equal probability of occurrence. The events A, B, C are given as

A = {s1, s3, s6}   B = {s2, s4, s5}   C = {s7, s8}

such that S = A ∪ B ∪ C. The event D is defined as D = {s1, s4, s5, s7}.

Verify that

P(D) = P(A)P(D|A) + P(B)P(D|B) + P(C)P(D|C).

3. In a tennis tournament, there are 80 players. Of these 80 players, 20 are at an advanced level, 40 are at an intermediate level, and 20 are at a beginner level. You randomly choose an opponent and play a game.

(a) What is the probability that you will play against an advanced player?
(b) What is the probability that you will play against an intermediate player?
(c) What is the probability that you will play against a beginner player?
(d) You randomly choose an opponent and play a game. What is the probability of winning?
(e) You randomly choose an opponent and play a game and you win. What is the probability that you won against an advanced player?

4. A box contains two regular coins, one two-headed coin, and three two-tailed coins. You pick a coin and flip it, and a head shows up. What is the probability that the chosen coin is a regular coin?

5. A box contains a regular coin and a two-headed coin. We randomly select a coin with replacement and flip it. We repeat this procedure twice.

(a) Write the sample space of the single-coin-flip experiment.
(b) Write the sample space of the experiment of flipping two coins in a sequential manner.
(c) The events A, B, and C are defined as

A = {First flip results in a head}
B = {Second flip results in a head}
C = {In both flips, the regular coin is selected}.

Write these events explicitly, and decide whether the events A and B are independent of each other. Decide whether the events A and B are conditionally independent given the event C.

6. Assume that you get up early and take the bus service to your job every morning. The probability that you miss the bus service is 0.1. Calculate the probability that you miss the bus service 5 times in 30 days, i.e., in a month.

7. A three-sided biased die is tossed. The sample space of this experiment is given as S = {f1, f2, f3}, where the simple events have the probabilities

P(f1) = 1/4, P(f2) = 2/4, P(f3) = 1/4.


Assume that we toss the die 8 times. What is the probability that f1 appears 5 times out of these 8 tosses?

8. Using the integers in the integer set F4 = {0, 1, 2, 3}, how many different integer vectors consisting of 12 integers can be formed?

9. From a group of 10 men and 8 women, 6 people will be selected to form a jury for a court. It is required that the jury contain at least 2 men. In how many different ways can we form the jury?

Chapter 3

Discrete Random Variables

3.1

Discrete Random Variables

Let S = {s1, s2, ⋯, sN} be the sample space of a discrete experiment, and let X(·) be a real-valued function that maps the simple events of the sample space to real numbers. This is illustrated in Fig. 3.1.

Example 3.1: Consider the coin-flip experiment. The sample space is S = {H, T}. A random variable X(·) can be defined on the simple events, i.e., outcomes, as

X(H) = 3.2, X(T) = -2.4.

Then, X is called a discrete random variable.

Example 3.2: One fair and one biased coin are flipped together. The sample space of the combined experiment can be written as

S = {HHb, HTb, THb, TTb}.

Let's define a real-valued function X(·) on the simple outcomes of the combined experiment as

X(si) = 1 if si contains Hb; 3 if si contains Tb.   (3.1)

According to (3.1), we can write

X(HHb) = 1, X(HTb) = 3, X(THb) = 1, X(TTb) = 3.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 O. Gazi, Introduction to Probability and Random Variables, https://doi.org/10.1007/978-3-031-31816-0_3

Fig. 3.1 The operation of the random variable function: the experiment produces the sample space S = {s1, s2, ⋯, sN}, and the real-valued function X(·) maps each simple event to a real number

Fig. 3.2 The graph of the random variable function for Example 3.2: X(s1) = X(s3) = 1 and X(s2) = X(s4) = 3

If we denote the simple events HHb, HTb, THb, TTb by s1, s2, s3, s4, we can draw the graph of X(·) as in Fig. 3.2.

Example 3.3: Consider the toss of a fair die experiment. The sample space of this experiment can be written as S = {s1, s2, s3, s4, s5, s6}. Let's define the random variable X(·) on the simple events of S as

X(si) = 2i - 1 if i is odd; 2i + 1 if i is even.

The random variable function can be explicitly written as

X(s1) = 2 × 1 - 1 → X(s1) = 1
X(s2) = 2 × 2 + 1 → X(s2) = 5
X(s3) = 2 × 3 - 1 → X(s3) = 5
X(s4) = 2 × 4 + 1 → X(s4) = 9
X(s5) = 2 × 5 - 1 → X(s5) = 9
X(s6) = 2 × 6 + 1 → X(s6) = 13.


Then, we can state that the random variable function X(·) takes values from the set {1, 5, 9, 13}, which is called the range set of the random variable X and can be denoted as

R_X = {1, 5, 9, 13}.

3.2

Defining Events Using Random Variables

An event, i.e., a subset of the sample space S, can be defined using

{si | X(si) = x}   (3.2)

which indicates the subset, i.e., event, of S consisting of those si that satisfy X(si) = x.

Example 3.4: For the toss-of-a-die experiment in the previous question, the random variable is defined as

X(si) = 2i - 1 if i is odd; 2i + 1 if i is even.

Then, the event A = {si | X(si) = 5} can be explicitly written as

A = {s2, s3}

since X(s2) = 5 and X(s3) = 5.

Example 3.5: Consider the toss-of-a-fair-die experiment. The sample space is

S = {s1, s2, s3, s4, s5, s6}.

The random variable X(·) is defined on the simple events of S as

X(si) = 1 if i is odd; -1 if i is even.

The event A is defined as


A = {si | X(si) = -1}.

Write the elements of A explicitly.

Solution 3.5: The random variable function can be explicitly written as

X(s1) = 1, X(s2) = -1, X(s3) = 1, X(s4) = -1, X(s5) = 1, X(s6) = -1.

Since

X(s2) = -1, X(s4) = -1, X(s6) = -1

the event A = {si | X(si) = -1} can be explicitly written as

A = {s2, s4, s6}.

Example 3.6: Consider two independent flips of a fair coin. The sample space of this experiment can be written as S = {HH, HT, TH, TT}. Let's define the random variable X(·) on the simple events of S as

X(si) = {number of heads in si}

where si is one of the simple events of S. The random variable function can be explicitly written as

X(s1) = X(HH) → X(HH) = 2
X(s2) = X(HT) → X(HT) = 1
X(s3) = X(TH) → X(TH) = 1
X(s4) = X(TT) → X(TT) = 0.

We define the events

A = {si | X(si) = 1}
B = {si | X(si) = -1}
C = {si | X(si) = 2}
D = {si | X(si) = 1 or X(si) = 0}.

Write the events A, B, C, and D explicitly.


Solution 3.6: The expression {si | X(si) = x} means finding those si that satisfy X(si) = x and forming an event of S from all such si. Since

X(HT) = 1, X(TH) = 1

the event A = {si | X(si) = 1} can be explicitly written as

A = {HT, TH}.

For the event B = {si | X(si) = -1}, since there is no si satisfying X(si) = -1, we can write event B as

B = { }.

In a similar manner, for the event C = {si | X(si) = 2}, since X(si) = 2 is satisfied only for si = HH, i.e., X(HH) = 2, the event C can be written as

C = {HH}.

The event D = {si | X(si) = 1 or X(si) = 0} can be explicitly written as

D = {HT, TH, TT}

since

X(HT) = 1, X(TH) = 1, X(TT) = 0.

The expression {si | X(si) = x} represents an event, and this representation can be written in short form as

{X = x}.   (3.3)

That is, the mathematical expressions {si | X(si) = x} and {X = x} mean the same thing, i.e.,

{X = x} means {si | X(si) = x}.   (3.4)

The expression {si | X(si) ≤ x} means forming a subset of S, i.e., an event, from those si satisfying X(si) ≤ x, and the event {si | X(si) ≤ x} can also be represented by

{X ≤ x}.   (3.5)

Example 3.7: Consider the roll-of-a-die experiment. The sample space of this experiment can be written as S = {s1, s2, s3, s4, s5, s6}. The random variable X(·) on the simple events of S is defined as

X(si) = 4 × i

which can explicitly be written as

X(s1) = 4 × 1 → X(s1) = 4
X(s2) = 4 × 2 → X(s2) = 8
X(s3) = 4 × 3 → X(s3) = 12
X(s4) = 4 × 4 → X(s4) = 16
X(s5) = 4 × 5 → X(s5) = 20
X(s6) = 4 × 6 → X(s6) = 24.


The events A, B, C, and D are defined as

A = {si | X(si) ≤ 10}
B = {si | X(si) ≤ 14}
C = {si | X(si) ≤ 20}
D = {si | X(si) ≤ 25}.

Write the events A, B, C, and D explicitly.

Solution 3.7: For the event A = {si | X(si) ≤ 10}, since the simple events s1, s2 satisfy

X(s1) = 4 ≤ 10 and X(s2) = 8 ≤ 10

the event A can be written as

A = {s1, s2}.

Proceeding in a similar manner, we can write the events B, C, and D as

B = {s1, s2, s3}
C = {s1, s2, s3, s4, s5}
D = {s1, s2, s3, s4, s5, s6}.

For ease of illustration, we can use {X ≤ x} for the expression {si | X(si) ≤ x}, i.e.,

{X ≤ x} means {si | X(si) ≤ x}.
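The event-forming notation {X ≤ x} can be sketched in Python using the die mapping X(si) = 4 × i of Example 3.7 (the outcome names s1, ..., s6 are illustrative strings):

```python
# Sample space and random variable X(si) = 4*i of Example 3.7.
S = ["s1", "s2", "s3", "s4", "s5", "s6"]
X = {s: 4 * (i + 1) for i, s in enumerate(S)}

def event_le(x):
    # The event {X <= x} = {si | X(si) <= x}.
    return {s for s in S if X[s] <= x}

print(sorted(event_le(10)))  # ['s1', 's2']
print(sorted(event_le(25)))  # the whole sample space
```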

Example 3.8: The range set of the random variable X is given as

R_X = {-1, 1, 3}.

Verify that

S = {X = -1} ∪ {X = 1} ∪ {X = 3}.

Fig. 3.3 The random variable function mapping the disjoint events {X = -1}, {X = 1}, and {X = 3} of the sample space to the values -1, 1, and 3

Solution 3.8: The random variable function X(·) is defined on the simple outcomes of the sample space S, and it maps each simple event to exactly one real number. The events

{X = -1}, {X = 1}, {X = 3}

are therefore disjoint events, and their union gives S. This is illustrated in Fig. 3.3.

Example 3.9: Consider two independent tosses of a fair coin. The sample space of this experiment can be written as S = {HH, HT, TH, TT}. Let's define the random variable X(·) on the simple events of S as

X(si) = {number of heads in si}

where si is one of the simple events of S. The random variable function can be explicitly written as

X(s1) = X(HH) → X(HH) = 2
X(s2) = X(HT) → X(HT) = 1
X(s3) = X(TH) → X(TH) = 1
X(s4) = X(TT) → X(TT) = 0.

Show that {X = 0}, {X = 1}, and {X = 2} form a partition of S, i.e.,

S = {X = 0} ∪ {X = 1} ∪ {X = 2}

and {X = 0}, {X = 1}, and {X = 2} are disjoint events.


Solution 3.9: The events {X = 0}, {X = 1}, and {X = 2} can be written as

{X = 0} = {TT}
{X = 1} = {HT, TH}
{X = 2} = {HH}

where it is clear that {X = 0}, {X = 1}, and {X = 2} are disjoint events, i.e.,

{X = 0} ∩ {X = 1} = ∅
{X = 0} ∩ {X = 2} = ∅
{X = 1} ∩ {X = 2} = ∅
{X = 0} ∩ {X = 1} ∩ {X = 2} = ∅

and we have

S = {X = 0} ∪ {X = 1} ∪ {X = 2}.

Example 3.10: The sample space of an experiment is given as

S = {s1, s2, s3, s4, s5, s6}.

The random variable X on S is defined as

X(s1) = -2, X(s2) = -2, X(s3) = -2, X(s4) = 3, X(s5) = 4, X(s6) = 4.

(a) Find the events {X = -2}, {X = 3}, and {X = 4}.
(b) Are the events {X = -2}, {X = 3}, and {X = 4} disjoint?
(c) Show that {X = -2} ∪ {X = 3} ∪ {X = 4} = S.

Solution 3.10: (a) The events {X = -2}, {X = 3}, and {X = 4} can be explicitly written as

{X = -2} = {s1, s2, s3}

{X = 3} = {s4}
{X = 4} = {s5, s6}.

(b) Considering the explicit form of the events in part (a), it is obvious that the events {X = -2}, {X = 3}, and {X = 4} are disjoint, i.e.,

{X = -2} ∩ {X = 3} = ∅
{X = -2} ∩ {X = 4} = ∅
{X = 3} ∩ {X = 4} = ∅
{X = -2} ∩ {X = 3} ∩ {X = 4} = ∅.

(c) The union of {X = -2}, {X = 3}, and {X = 4} is found as

{X = -2} ∪ {X = 3} ∪ {X = 4} → {s1, s2, s3} ∪ {s4} ∪ {s5, s6} = S

which is the sample space. The partition of the sample space is depicted in Fig. 3.4.
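The disjointness and covering checks of Solution 3.10 can be sketched directly on finite sets (outcome names as in the example):

```python
from itertools import combinations

# Events {X = -2}, {X = 3}, {X = 4} of Example 3.10 as sets of outcomes.
S = {"s1", "s2", "s3", "s4", "s5", "s6"}
events = [{"s1", "s2", "s3"}, {"s4"}, {"s5", "s6"}]

disjoint = all(a & b == set() for a, b in combinations(events, 2))
covers = set().union(*events) == S
print(disjoint, covers)  # True True
```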

3.3

Probability Mass Function for Discrete Random Variables

The probability mass function p(x) for a discrete random variable X is defined as

p(x) = Prob{X = x}

where x is a value of the random variable function X(·). The probability mass function can also be indicated as

p_X(x) = Prob{X = x}   (3.6)

where the subscript of p(x), i.e., X, points to the random variable to which the probability mass function belongs. For ease of notation, we will not use the subscript in the probability mass function expression unless otherwise indicated. Let's illustrate the concept of the probability mass function with an example.


Fig. 3.4 The partition of the sample space for Example 3.10: {X = -2} = {s1, s2, s3}, {X = 3} = {s4}, {X = 4} = {s5, s6}

Example 3.11: Consider the experiment of two independent flips of a fair coin. The sample space of this experiment can be written as S = {HH, HT, TH, TT}. Let's define the random variable X(·) on the simple events of S as

X(si) = {number of heads in si}

where si is one of the simple events of S. The random variable function can be explicitly written as

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.

(a) Write the range set of the random variable X.
(b) Obtain the probability mass function of the discrete random variable X.

Solution 3.11: (a) Considering the distinct values generated by the random variable, the range set of the random variable can be written as

R_X = {0, 1, 2}.

(b) The probability mass function of the random variable X is defined as

p(x) = Prob{X = x}

where x takes one of the values from the set R_X = {0, 1, 2}, i.e., x can be 0, 1, or 2. We consider each distinct value of x in the calculation of p(x). For x = 0, the probability mass function p(x) is calculated as


p(x = 0) = Prob{X = 0}

where the event {X = 0} equals {TT}, i.e., {X = 0} = {TT}. Then, we have

p(x = 0) = P{TT} → p(x = 0) = 1/4.

For x = 1, the probability mass function p(x) is calculated as

p(x = 1) = Prob{X = 1}

where the event {X = 1} equals {HT, TH}, i.e., {X = 1} = {HT, TH}. Then, we have

p(x = 1) = P{HT, TH} → p(x = 1) = P{HT} + P{TH} → p(x = 1) = 1/2.

For x = 2, the probability mass function p(x) is calculated as

p(x = 2) = Prob{X = 2}

where the event {X = 2} equals {HH}, i.e., {X = 2} = {HH}. Then, we have

p(x = 2) = P{HH} → p(x = 2) = 1/4.

Hence, the values of the probability mass function p(x) are found as

p(x = 0) = 1/4, p(x = 1) = 1/2, p(x = 2) = 1/4.

We can draw the graph of the probability mass function p(x) with respect to x as in Fig. 3.5.
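The tabulation of p(x) in this example can be sketched in Python, grouping the four equally likely outcomes by their X value (exact fractions are used to avoid rounding):

```python
from collections import Counter
from fractions import Fraction

# Two independent flips of a fair coin; X counts the heads in each outcome.
S = ["HH", "HT", "TH", "TT"]
X = {s: s.count("H") for s in S}

pmf = Counter()
for s in S:                      # each simple event has probability 1/4
    pmf[X[s]] += Fraction(1, len(S))

print(pmf[0], pmf[1], pmf[2])  # 1/4 1/2 1/4
```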

Fig. 3.5 The probability mass function for Example 3.11: p(0) = 1/4, p(1) = 1/2, p(2) = 1/4

For this example,

p(x = 0) + p(x = 1) + p(x = 2) = 1/4 + 1/2 + 1/4 → p(x = 0) + p(x = 1) + p(x = 2) = 1.

That is,

Σ_x p(x) = 1.

Theorem 3.1: The probability mass function of a discrete random variable X satisfies

Σ_x p(x) = 1.   (3.7)

Proof 3.1: Let the range set of the random variable X be

R_X = {x1, x2, x3}.

We know that the events {X = x1}, {X = x2}, and {X = x3} form a partition of the sample space S. That is,

S = {X = x1} ∪ {X = x2} ∪ {X = x3}   (3.8)

and {X = x1}, {X = x2}, and {X = x3} are disjoint events. If the probability of both sides of (3.8) is calculated, we get

P(S) = P{X = x1} + P{X = x2} + P{X = x3} = 1

where P{X = x1} = p(x1), P{X = x2} = p(x2), and P{X = x3} = p(x3), so that

p(x1) + p(x2) + p(x3) = 1 → Σ_x p(x) = 1.

The same argument applies to a range set with any number of elements.

3.4

Cumulative Distribution Function

The cumulative distribution function of a random variable X is defined as

F(x) = Prob{X ≤ x}

which can also be written as

F_X(x) = Prob{X ≤ x}

where x is a real number. Note that {X ≤ x} is an event, and F(x) is nothing but the probability of the event {X ≤ x}. Let the range set of the random variable X be R_X = {a1, a2, ⋯, aN} such that a1 < a2 < ⋯ < aN. To find the cumulative distribution function F(x), we consider the following steps:

1. We first form the x-intervals on which the cumulative distribution function F(x) is to be calculated:

-∞ < x < a1
a1 ≤ x < a2
a2 ≤ x < a3
⋮
a(N-1) ≤ x < aN
aN ≤ x < ∞

2. For each interval decided in step 1, we calculate the cumulative distribution function

F(x) = Prob{X ≤ x}.   (3.9)
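A minimal sketch of step 2, assuming the PMF of Example 3.11 (two fair coin flips, X = number of heads), sums p(a) over all range-set values a ≤ x:

```python
from fractions import Fraction

# PMF of Example 3.11: two fair coin flips, X = number of heads.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    # F(x) = Prob{X <= x}: sum p(a) over range-set values a <= x.
    return sum(p for a, p in pmf.items() if a <= x)

print(cdf(-0.5), cdf(0), cdf(1.5), cdf(2))  # 0 1/4 3/4 1
```

Note how F(x) is constant between consecutive range-set values and jumps by p(a) at each a, which is exactly the staircase shape obtained in step 2.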


Example 3.12: Consider again the experiment of two independent tosses of a fair coin. The sample space of this experiment can be written as S = {HH, HT, TH, TT}. Let's define the random variable X(·) on the simple events of S as

X(si) = {number of heads in si}

where si is one of the simple events of S. The random variable function can be explicitly written as

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.

Calculate and draw the cumulative distribution function, i.e., F(x), of the random variable X.

Solution 3.12: The range set of the random variable X can be written as

R_X = {0, 1, 2}.

To draw the cumulative distribution function F(x), we first determine the x-intervals considering the values in the range set of X. The x-intervals can be written as

-∞ < x < 0