English, 540 pages, 1980
Table of contents :
Preface
Preface to The 1968 Book
Table of Contents
Chapter 1. Random Variables
1. Introduction
2. Random Variables
3. Multivariate Probability Distributions
4. Mathematical Expectation
4.1. A useful inequality
4.2. Conditional expectation
5. Moments, Variances and Covariances
5.1. Variance of a linear function of random variables
5.2. Covariance between two linear functions of random variables
5.3. Variance of a product of random variables
5.4. Approximate variance of a function of random variables
5.5. Conditional variance and covariance
5.6. Correlation coefficient
6. Chebyshev’s Inequality and Laws of Large Numbers
6.1. Chebyshev’s inequality
6.2. Bernoulli’s theorem
6.3. Laws of large numbers
6.4. The central limit theorem
7. Problems for Solution
Chapter 2. Probability Generating Functions
1. Introduction
2. General Properties
3. Convolutions
4. Examples
4.1. Binomial distribution
4.2. Poisson distribution
4.3. Geometric and negative binomial distributions
5. The Continuity Theorem
6. Partial Fraction Expansions
7. Multivariate Probability Generating Functions
7.1. Probability generating functions of marginal distributions
7.2. Probability generating functions of conditional distributions
8. Sum of a Random Number of Random Variables
9. Problems for Solution
Chapter 3. Exponential-Type Distributions and Maximum Likelihood Estimation
1. Introduction
2. Gamma Function
3. Convolutions
4. Moment Generating Functions
4.1. The central limit theorem
5. Sum of Non-Identically Distributed Random Variables
6. Sum of Consecutive Random Variables
7. Maximum Likelihood Estimation
7.1. Optimum properties of the m.l. estimator
8. Problems for Solution
Chapter 4. Branching Process, Random Walk and Ruin Problem
1. A Simple Branching Process
1.1. Probability of extinction
2. Random Walk and Diffusion Process
2.1. Random walk
2.2. Diffusion process
3. Gambler’s Ruin
3.1. Expected number of games
3.2. Ruin probability in a finite number of games
4. Problems for Solution
Chapter 5. Markov Chains
1. Introduction
2. Definition of Markov Chains and Transition Probabilities
3. Higher Order Transition Probabilities, pij(n)
4. Classification of States
5. Asymptotic Behavior of pij(n)
6. Closed Sets and Irreducible Chains
7. Stationary Distribution
8. An Application to Genetics
8.1. Two loci each with two alleles
9. Problems for Solution
Chapter 6. Algebraic Treatment of Finite Markov Chains
1. Introduction
2. Eigenvalues of Stochastic Matrix P and a Useful Lemma
2.1. A useful lemma
3. Formulas for Higher Order Transition Probabilities
4. Limiting Probability Distribution
5. Examples
6. Problems for Solution
Chapter 7. Renewal Processes
1. Introduction
2. Discrete Time Renewal Processes
2.1. Renewal probability and classification of events
2.2. Probability of occurrence of E
2.3. Probability generating functions of {f(n)} and {p(n)}
2.4. An application to random walk
2.5. Delayed renewal processes
3. Continuous Time Renewal Processes
3.1. Some specific distributions
3.2. Number of renewals in a time interval (0, t]
3.3. Renewal function and renewal density
3.4. Illustrative examples
4. Problems for Solution
Chapter 8. Some Stochastic Models of Population Growth
1. Introduction
2. The Poisson Process
2.1. The method of probability generating functions
2.2. Some generalizations of the Poisson process
3. Pure Birth Process
3.1. The Yule process
3.2. Time dependent Yule process
3.3. Joint distribution in the time dependent Yule process
4. The Polya Process
5. Pure Death Process
6. Migration Process
6.1. A simple immigration-emigration process
6.1.1. An alternative derivation
7. An Appendix – First Order Differential Equations
7.1. Ordinary differential equations
7.2. Partial differential equations
7.2.1. Preliminaries
7.2.2. Auxiliary equations
7.2.3. The general solution
8. Problems for Solution
Chapter 9. A General Birth Process, an Equality and an Epidemic Model
1. Introduction
2. A General Birth Process
3. An Equality in Stochastic Processes
4. A Simple Stochastic Epidemic – McKendrick’s Model
4.1. Solution for the probability p1,n(0,t)
4.2. Infection time and duration of an epidemic
5. Problems for Solution
Chapter 10. Birth-Death Processes and Queueing Processes
1. Introduction
2. Linear Growth
3. A Time-Dependent General Birth-Death Process
4. Queueing Processes
4.1. Differential equations for an M/M/s queue
4.2. The M/M/1 queue
4.2.1. The length of queue
4.2.2. Service time and waiting time
4.2.3. Interarrival time and traffic intensity
4.3. The M/M/s queue
4.3.1. The length of queue
4.3.2. Service time and waiting time
5. Problems for Solution
Chapter 11. A Simple Illness-Death Process – Fix-Neyman Process
1. Introduction
2. Health Transition Probability Pαβ(0,t) and Death Transition Probability Qαδ(0,t)
3. Chapman-Kolmogorov Equations
4. Expected Duration of Stay
5. Population Sizes in Health States and Death States
5.1. The limiting distribution
6. Generating Function of Population Sizes
7. Survival and Stages of Disease
7.1. Distribution of survival time
7.2. Maximum-likelihood estimates
8. Problems for Solution
Chapter 12. Multiple Transition Probabilities in the Simple Illness-Death Process
1. Introduction
2. Identities and Multiple Transition Probabilities
2.1. Formula for the multiple passage probabilities Pαβ(m)(0,t)
2.2. Three equalities
2.3. Inductive proof of formula for Pαβ(m)(0,t)
2.4. Formula for the multiple renewal probabilities Pαα(m)(0,t)
3. Differential Equations and Multiple Transition Probabilities
4. Probability Generating Functions
4.1. Multiple transition probabilities
4.2. Equivalence of formulas
5. Verification of the Stochastic Identities
6. Chapman-Kolmogorov Equations
7. Conditional Probability Distribution of the Number of Transitions
8. Multiple Transitions Leading to Death
9. Multiple Entrance Transition Probabilities pαβ(n)(0,t)
10. Problems for Solution
Chapter 13. Multiple Transition Time in the Simple Illness-Death Process – an Alternating Renewal Process
1. Introduction
2. Density Functions of Multiple Transition Times
3. Convolution of Multiple Transition Time
4. Distribution Function of Multiple Transition Time
4.1. Distribution function of the mth renewal time
4.2. Distribution function of the mth passage time
5. Survival Time
6. A Two-state Stochastic Process
6.1. Multiple transition probabilities
6.2. Multiple transition time
6.3. Number of renewals and renewal time
7. Problems for Solution
Chapter 14. The Kolmogorov Differential Equations and Finite Markov Processes
1. Markov Processes and the Chapman-Kolmogorov Equation
1.1. The Chapman-Kolmogorov equation
2. Kolmogorov Differential Equations
2.1. Derivation of the Kolmogorov differential equations
2.2. Examples
3. Matrices, Eigenvalues, and Diagonalization
3.1. Eigenvalues and eigenvectors
3.2. Diagonalization of a matrix
3.3. A useful lemma
3.4. Matrix of eigenvectors
4. Explicit Solutions for Kolmogorov Differential Equations
4.1. Intensity matrix V and its eigenvalues
4.2. First solution for individual transition probabilities Pij(0,t)
4.3. Second solution for individual transition probabilities Pij(0,t)
4.4. Identity of the two solutions
4.5. Chapman-Kolmogorov equations
4.6. Limiting probabilities
5. Problems for Solution
Chapter 15. Kolmogorov Differential Equations and Finite Markov Processes – Continuation
1. Introduction
2. First Solution for Individual Transition Probabilities Pij(0,t)
3. Second Solution for Individual Transition Probabilities Pij(0,t)
4. Problems for Solution
Chapter 16. A General Illness-Death Process
1. Introduction
2. Transition Probabilities
2.1. Health transition probabilities Pαβ(0,t)
2.2. Transition probabilities leading to death Qαδ(0,t)
2.3. An equality concerning transition probabilities
2.4. Limiting transition probabilities
2.5. Expected duration of stay in health and death states
2.6. Population sizes in health states and death states
3. Multiple Transition Probabilities
3.1. Multiple exit transition probabilities, Pαβ(m)(0,t)
3.2. Multiple transition probabilities leading to death, Qαδ(m)(0,t)
3.3. Multiple entrance transition probabilities, pαβ(n)(0,t)
4. An Annual Health Index
5. Problems for Solution
Chapter 17. Migration Processes and Birth-Illness-Death Processes
1. Introduction
2. Emigration-Immigration Processes – Poisson-Markov Processes
2.1. First approach
2.2. Second approach
2.2.1. Differential equation of the probability generating function
2.2.2. Solution of the probability generating function
3. A Birth-Illness-Death Process
4. Problems for Solution
Bibliography
Author Index
Subject Index
Errata
An Introduction to Stochastic Processes and Their Applications
An Introduction to Stochastic Processes and Their Applications
CHIN LONG CHIANG Professor of Biostatistics University of California, Berkeley
ROBERT E. KRIEGER PUBLISHING COMPANY HUNTINGTON, NEW YORK
Original Edition 1980, based upon previous title Introduction to Stochastic Processes in Biostatistics Printed and Published by ROBERT E. KRIEGER PUBLISHING CO., INC. 645 NEW YORK AVENUE HUNTINGTON, NEW YORK 11743
Basic edition © Copyright 1968 by JOHN WILEY & SONS, INC. transferred to Chin Long Chiang 1978 New Edition, © Copyright 1980 by Robert E. Krieger Pub. Co., Inc.
All rights reserved. No reproduction in any form of this book, in whole or in part (except for brief quotation in critical articles or reviews), may be made without written authorization from the publisher. Printed in the United States of America
Library of Congress Cataloging in Publication Data
Chiang, Chin Long An introduction to stochastic processes and their applications. Bibliography: p. 1. Applied mathematics 2. Probability 3. Stochastic processes 4. Mathematical statistics I. Title. QH323.5.C5 1975 574'.01'82 ISBN 0882752006
74-14821
To Fu Chen and William, Robert, Harriet and Carol
Preface
This book is intended to be a textbook in stochastic processes or in applied probability. It may also be used as a reference book for courses in mathematical statistics, engineering (reliability theory), biostatistics (survival analysis), and demography (migration processes). The volume consists of an extensive revision of eight chapters of Part I of Introduction to Stochastic Processes in Biostatistics (Wiley, 1968) and a collection of nine new chapters on various topics on the subject. The biostatistics component of the 1968 book is eliminated. The 1968 book went out of print within five years of the date of publication; I only just managed to complete the revision, almost six years after the last copy was sold.

The evolution of stochastic processes is envisioned here in a particular sequence. The material is arranged in the order of discrete processes, one-state continuous processes, two-state processes, and multiple-state processes. The preliminaries now include a new chapter (Chapter 3) on exponential-type distributions because of their importance and their utility in stochastic processes, in survival analysis, and in reliability theory. Discrete processes and renewal processes in Chapters 4 through 7 have been added in this volume as they are an essential part of stochastic processes. The algebraic treatment in Chapter 6 is a variation of conventional methods that one may find, for example, in Feller [1968]. The simple and more explicit formulas for the higher order transition probabilities presented in this chapter may help the reader achieve a better understanding of Markov chains and facilitate their application.

Chapters 8 and 10 contain well-known continuous processes describing population growth and queueing processes. The general birth process and the equality in stochastic processes in Chapter 9 provide two additional means for deriving explicit formulas for any increasing process or decreasing process. The epidemic model serves as an example to demonstrate the potential of the equality in resolving difficulties.
The first chapter (Chapter 11) on the simple illness-death process is basically unaltered except for the addition of a section on generating functions and a section on survival and stages of diseases. The material on multiple transition probability and multiple transition time in Chapters 12 and 13 is essentially new. The corresponding formulas in these chapters are now of closed form and thus the process can more easily be subject to practical application. When there are no absorbing states, the two-state model generates an alternating renewal process. Parts of the general renewal theory have been extended to this case.

Explicit solutions for the Kolmogorov differential equations for the case where the intensity function matrix (infinitesimal generator) V has distinct eigenvalues were presented in the previous edition and are reproduced in Chapter 14. The solution has been extended in Chapter 15 to the general case where matrix V has multiple and complex eigenvalues. Chapters 16 and 17 are reproductions of the corresponding chapters in the old book but with some minor changes.

I place the significance of stochastic processes on their potential as an analytic tool for scientific research rather than on the theoretical development of the subject. I believe that this volume as well as the 1968 book reflect this point of view. I have used the material in this volume in courses that I have taught at the University of California, Berkeley and at Harvard University. I have once again benefitted from comments and encouragement from several friends who have read selected chapters. They include B. J. van den Berg, J. Deming, J. Emerson, J. P. Hsu, E. Peritz, P. Rust, S. Selvin, R. Wong and G. L. Yang. Solutions to the problems at the end of each chapter are in preparation but will not be published together with the text in order to meet the publication date of the book.

Finally, my deep appreciation is due to Ms. Bonnie Hutchings, who provided secretarial assistance during the course of the revision and expert typing from my handwritten pages to the final version with peerless skill and patience.

Chin Long Chiang
University of California, Berkeley
September, 1979
Preface to The 1968 Book
Time, life, and risks are three basic elements of stochastic processes in biostatistics. Risks of death, risks of illness, risks of birth, and other risks act continuously on man with varying degrees of intensity. Long before the development of modern probability and statistics, men were concerned with the chance of dying and the length of life, and they constructed tables to measure longevity. But it was not until the advances in the theory of stochastic processes made in recent years that empirical processes in the human population have been systematically studied from a probabilistic point of view. The purpose of this book is to present stochastic models describing these processes. Emphasis is placed on specific results and explicit solutions rather than on the general theory of stochastic processes. Those readers who have a greater curiosity about the theoretical arguments are advised to consult the rich literature on the subject. A basic knowledge of probability and statistics is required for a profitable reading of the text. Calculus is the only mathematics presupposed, although some familiarity with differential equations and matrix algebra is needed for a thorough understanding of the material.

The text is divided into two parts. Part I begins with one chapter on random variables and one on probability generating functions for use in succeeding chapters. Chapter 3 is devoted to basic models of population growth, ranging from the Poisson process to the time-dependent birth-death process. Some other models of practical interest that are not included elsewhere are given in the problems at the end of the chapter.

Birth and death are undoubtedly the most important events in the human population, but what is statistically more complex is the illness process. Illnesses are potentially concurrent, repetitive, and reversible, and consequently analysis is more challenging. In this book illnesses are treated as discrete entities, and a population is visualized as consisting of discrete states of illnesses. An individual is said to be
in a particular state of illness if he is affected with the corresponding diseases. Since he may leave one illness state for another or enter a death state, consideration of illness opens up a new domain of interest in multiple transition probability and multiple transition time. A basic and important case is that in which there are two illness states. Two chapters (Chapters 4 and 5) are devoted to this simple illness-death process.

In dealing with a general illness-death process that considers any finite number of illness states, I found myself confronted with a finite Markov process. To avoid repetition and to maintain a reasonable graduation of mathematical involvement, I have interrupted the development of illness processes to discuss the Kolmogorov differential equations for a general situation in Chapter 6. This chapter is concerned almost entirely with the derivation of explicit solutions of these equations. For easy reference a section (Section 3) on matrix algebra is included. Once the Kolmogorov differential equations are solved in Chapter 6, the discussion of the general illness-death process in Chapter 7 becomes straightforward; however, the model contains sufficient points of interest to require a separate chapter. The general illness-death process has been extended in Chapter 8 to account for the population increase through immigration and birth. These two possibilities lead to the emigration-immigration process and the birth-illness-death process, respectively. But my effort failed to provide an explicit solution for the probability distribution function in the latter case.

Part II is devoted to special problems in survival and mortality. The life table and competing risks are classical and central topics in biostatistics, while the follow-up study dealing with truncated information is of considerable practical importance. I have endeavored to integrate these topics as thoroughly as possible with probabilistic and statistical principles. I hope that I have done justice to these topics and to modern probability and statistics.

It should be emphasized that although the concept of illness processes has arisen from studies in biostatistics, it has a wide application to other fields. Intensity of risk of death (force of mortality) is synonymous with "failure rate" in reliability theory; illness states may be alternatively interpreted as geographic locations (in demography), compartments (in compartment analysis), occupations, or other defined conditions. Instead of the illness of a person, we may consider whether a person is unemployed, or whether a gene is a mutant gene, a telephone
line is busy, an elevator is in use, a mechanical object is out of order, etc.

The book was written originally for students in biostatistics, but it may be used for courses in other fields as well. The following are some suggestions for teaching plans:

1. As a year course in biostatistics: Chapters 1 and 2 followed by Chapters 10 through 12, and then by Chapters 3 through 8. In this arrangement, a formal introduction of the pure death process is necessary at the beginning of Chapter 10.
2. For a year course in demography: Plan 1 above may be followed, except that the term "illness process" might be more appropriately interpreted as "internal migration process."
3. As a supplementary text for courses in biostatistics or demography: Chapters 9 through 12.
4. For a one-semester course in stochastic processes: Chapters 2 through 8. As a general reference book, Chapter 9 may be omitted.

The book is an outgrowth partly of my own research, some of which appears here for the first time (e.g., Chapter 5 and parts of Chapter 6), and partly of lecture notes for courses in stochastic processes, for which I am grateful to the many contributors to the subject. I have used the material in my teaching at the Universities of California (Berkeley), Michigan, Minnesota, North Carolina; Yale and Emory Universities; and at the London School of Hygiene, University of London.

The work could not have been completed without incurring indebtedness to a number of friends. It is my pleasure to acknowledge the generous assistance of Mrs. Myra Jordan Samuels and Miss Helen E. Supplee, who have read early versions and made numerous constructive criticisms and valuable suggestions. Their help has tremendously improved the quality of the book. I am indebted to the School of Public Health, University of California, Berkeley, and the National Institutes of Health, Public Health Service, for financial aid under Grant No. 5S01FR054406 to facilitate the work.
An invitation from Peter Armitage to lecture in a seminar course at the London School of Hygiene gave me an opportunity to work almost exclusively on research projects associated with this book. I also wish to express my appreciation to Richard J. Brand and Geoffrey S. Watson who read some of the chapters and provided useful suggestions. My thanks are also due to Mrs. Shirley A. Hinegardner
for her expert typing of the difficult material, to Mrs. Dorothy Wyckoff for her patience with the numerical computations, and to Mrs. Cynthia P. Debus for secretarial assistance.

C.L.C.
University of California, Berkeley
Spring, 1968
CHAPTER 1

Random Variables

1. INTRODUCTION
A large body of probability and statistical theory has been developed to facilitate the study of phenomena arising from random experiments. Some studies take the form of mathematical modeling of observable events, while others are for making statistical inference regarding random experiments. A familiar random experiment is the tossing of dice. When a die is tossed, there are six possible outcomes; it is not certain which one will occur. Similarly, in a laboratory determination of antibody titer, the results vary from one trial to another even if the same blood specimen is used and the laboratory conditions remain constant. Examples of random experiments can be found almost everywhere. In fact, one may extend the concept of random experiments so that any phenomenon may be thought of as the result of some random experiment, be it real or hypothetical.

As a framework for discussing random phenomena, it is convenient to represent each conceivable outcome of a random experiment by a point, called a sample point, denoted by s. The totality of all sample points for a particular experiment is called the sample space, S. Subsets of S represent certain events, such that an event A consists of a certain collection of possible outcomes s. If two subsets contain no points s in common, they are said to be disjoint, and the corresponding events are said to be mutually exclusive. Mutually exclusive events cannot occur as a result of a single experiment. The probability of occurrence of the various events in S is the starting point for analysis of the experiment represented by S. Using Pr{A} to denote the probability of an event A, we may state the three fundamental assumptions.
(i) The probabilities of events are nonnegative:

Pr{A} ≥ 0 for any event A.  (1.1)

(ii) The probability of the whole sample space is unity:

Pr{S} = 1.  (1.2)

(iii) The probability that one of a sequence of mutually exclusive events {Ai} will occur is given by

Pr{A1 or A2 or ...} = Σ(i=1,∞) Pr{Ai}.  (1.3)
The last assumption is called the assumption of countable additivity.
2. RANDOM VARIABLES
Any single-valued numerical function X(s) defined on a sample space S is called a random variable. We associate a unique real number with each point s in the sample space and call this number the value of X at s. We interpret the probability of an event in S as the probability that the value of the random variable X(s) will lie within a certain interval or in a set of real numbers.

The most common types of random variables are discrete random variables and continuous random variables. A discrete random variable X(s) assumes a finite or denumerable number of values. For each possible value xi there is a unique probability that the random variable assumes the value xi:

Pr{X(s) = xi} = pi,  i = 0, 1, ....  (2.1)

The sequence {pi} is called the probability distribution of X(s), and the cumulative probability

Pr{X(s) ≤ x} = Σ(xi ≤ x) pi = F(x)

is the distribution function of X(s).

In a binomial distribution with n trials and success probability p, the quantity p^k (1 − p)^(n−k) represents the probability that k specified trials will result in success and the remaining n − k trials in failure; for example, the first k trials may be successes and the last n − k trials failures, as represented by the sequence (SS ∙∙∙ S FF ∙∙∙ F). The combinatorial factor, or binomial coefficient, (n choose k) = n!/[k!(n − k)!],
is the number of possible ways in which k successes can occur in n trials.

A random variable X is called a continuous random variable if there exists a nonnegative function f such that, for any a ≤ b,

Pr{a < X ≤ b} = ∫(a to b) f(x) dx.  (2.7)

The function f(x) is called the probability density function (or simply density function) of X. The distribution function of X is

F(x) = ∫(−∞ to x) f(t) dt,  (2.8)

and

Pr{a < X ≤ b} = F(b) − F(a).  (2.9)
Note that for a continuous random variable X, Pr{X = x} = 0 for any x, and that the values of the density function f(x) are not probabilities; while they are nonnegative, they need not be less than unity. The density function is merely a tool which, used in conjunction with (2.7), yields the probability that X lies in any interval. As in the discrete case, a continuous random variable is proper if its density function satisfies ∫(−∞ to ∞) f(x) dx = 1, and is improper if the integral is less than one.
Example 4. The exponential distribution has the density function

f(x) = μ e^(−μx),  x > 0;  f(x) = 0,  x ≤ 0.  (2.10)

3. MULTIVARIATE PROBABILITY DISTRIBUTIONS

Let X and Y have the joint density function h(x, y), with marginal density functions f(x) of X and g(y) of Y. For g(y) > 0 and f(x) > 0, the conditional densities satisfy the relations

f(x|y) = h(x, y)/g(y)  and  f(x) = ∫ f(x|y) g(y) dy.

4. MATHEMATICAL EXPECTATION

4.1. A Useful Inequality. If a random variable X assumes only positive values, then

E(1/X) ≥ 1/E(X).  (4.11)
The reverse inequality holds if X assumes only negative values.

Proof. Let

X = E(X)[1 + (X − E(X))/E(X)] = E(X)(1 + Δ),  (4.12)

so that

E(Δ) = E[(X − E(X))/E(X)] = 0.  (4.13)

From (4.12) we have

E(1/X) = E{1/[E(X)(1 + Δ)]} = [1/E(X)] E[1/(1 + Δ)].  (4.14)

Since

1/(1 + Δ) = 1 − Δ + Δ²/(1 + Δ)  (4.15)

and since Δ² and 1 + Δ are positive random variables, the expectation

E(1/X) = [1/E(X)] E[1 − Δ + Δ²/(1 + Δ)] = [1/E(X)] {1 + E[Δ²/(1 + Δ)]} ≥ 1/E(X),  (4.16)
and we recover (4.11).

4.2. Conditional Expectation. The conditional expectation of Y given x is the expectation of Y with respect to the conditional distribution of Y given x. For the discrete case,

E(Y|xi) = Σj yj pij/pi,  pi > 0,  (4.17)

and for the continuous case,

E(Y|x) = ∫ y h(x, y)/f(x) dy,  f(x) > 0.  (4.18)
For these conditional expectations to be meaningful, it is necessary that pi > 0 in (4.17) and f(x) > 0 in (4.18). The conditional expectations in (4.17) and (4.18) are constants for a given value x (or xi). If we regard x (or xi) as the value of a random variable X, then we write these expectations as E(Y|X), and they themselves are random variables. When X assumes the value xi, E(Y|X) takes on the value E(Y|xi). We define conditional expectations of functions, such as E[φ(X, Y)|X], analogously. When calculating the expectation, we treat the random variable on which the expectation is conditioned as a constant. Thus, for example,

E[(X + Y)|X] = X + E(Y|X)  (4.19)

and

E[XY|X] = X E(Y|X).  (4.20)

Since expressions like E(Y|X) are random variables, they themselves have expectations.
Theorem 4.¹ For any two random variables X and Y,

E[E(Y|X)] = E(Y),  (4.21)

and

E(XY) = E[X E(Y|X)].  (4.22)

¹ To aid in interpreting repeated expectations, remember that E(Y|X) is a function of X; therefore the outer expectation refers to the distribution of X.

Proof.
For the discrete case,

E[E(Y|X)] = Σi E(Y|xi) pi = Σi Σj yj (pij/pi) pi = Σj yj Σi pij = Σj yj p.j = E(Y).  (4.23)

The proof of the continuous case is similar. Formula (4.22) follows from (4.20) and (4.21).

An important consequence of the independence of random variables is that their expectations are multiplicative.
Theorem 5. If X1, ..., Xn are independent random variables, then

E(X1 ... Xn) = E(X1) ... E(Xn).  (4.24)

Proof. When X1 and X2 are independent, E(X2|X1) = E(X2). Thus, for the case of two random variables, (4.22) implies that E(X1X2) = E(X1)E(X2). We may extend the proof to the general case by induction.
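Theorems 4 and 5 lend themselves to a quick numerical check. The Python sketch below uses a made-up joint distribution (Y equals X plus a fair coin, so E(Y|X) = X + 1/2) together with a pair of independent uniform variables for the product rule; none of the particular values come from the text.

```python
import random

random.seed(1)
N = 200_000

# Y = X + B, with X uniform on {0,1,2} and B a fair coin,
# so E(Y|X) = X + 1/2 and E[E(Y|X)] = E(X) + 1/2 = 1.5.
xs = [random.choice([0, 1, 2]) for _ in range(N)]
ys = [x + random.choice([0, 1]) for x in xs]
mean_y = sum(ys) / N
assert abs(mean_y - 1.5) < 0.02          # Theorem 4: E[E(Y|X)] = E(Y)

# Independent X1, X2 uniform on (0,1): E(X1 X2) = E(X1) E(X2) = 1/4.
x1 = [random.random() for _ in range(N)]
x2 = [random.random() for _ in range(N)]
prod_mean = sum(a * b for a, b in zip(x1, x2)) / N
assert abs(prod_mean - 0.25) < 0.01      # Theorem 5 with n = 2
print(round(mean_y, 2), round(prod_mean, 2))
```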
5. MOMENTS, VARIANCES, AND COVARIANCES
Expectations of a random variable X that are of particular interest are (i) the moments

E[X^r],  r = 1, 2, ...;  (5.1)

(ii) the factorial moments

E[X(X − 1) ... (X − r + 1)],  r = 1, 2, ...;  (5.2)

and (iii) the central moments

E[X − E(X)]^r,  r = 1, 2, ....  (5.3)

The first moment, which is the expectation E(X), is called the mean
of the distribution. It is a commonly used index of its location. The second central moment

σx² = E[X − E(X)]²  (5.4)
    = E(X²) − [E(X)]²  (5.4a)

is the variance of the distribution, and its square root σx is the standard deviation. We will also use the notation Var(X) to indicate the variance. The variance and standard deviation are often used to measure the spread, or dispersion, of the distribution. It is therefore reasonable that the variance of a degenerate random variable, or a constant, is equal to zero:

σc² = 0.  (5.5)

The variance of a random variable depends on the scale of measurement but not on the origin of measurement. Thus

σ(a+bX)² = b² σx².

For any ε > 0, we choose 0 < δ < ε/(1 + ε), so that lim(n→∞) |pk,n − pk| < ε. This proves lim(n→∞) pk,n = pk, for k = 0, 1, ....
Example 1. Binomial distribution. In a binomial distribution with probability p, if n → ∞ in such a way that np → λ, then

lim(n→∞) (n choose k) p^k (1 − p)^(n−k) = e^(−λ) λ^k / k!  (5.11)

and

lim(n→∞) [1 − p + ps]^n = e^(−λ(1−s)).  (5.12)

Example 2. Negative binomial distribution. In the probability function (4.20), if r → ∞ in such a way that rq → λ, then

lim(r→∞) [p/(1 − qs)]^r = e^(−λ(1−s)).  (5.13)
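The convergence in (5.11) can be watched numerically: the largest pointwise gap between binomial and Poisson probabilities shrinks roughly like λ²/n. The parameters below are illustrative choices, not taken from the text.

```python
from math import comb, exp, factorial

# With np = lam held fixed, the binomial probabilities in (5.11)
# approach the Poisson probabilities as n grows; 'worst' is the
# largest pointwise gap over k = 0..7.
lam = 2.0
for n in (10, 100, 10_000):
    p = lam / n
    worst = max(
        abs(comb(n, k) * p**k * (1 - p)**(n - k)
            - exp(-lam) * lam**k / factorial(k))
        for k in range(8)
    )
    print(n, round(worst, 6))
```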
6. PARTIAL FRACTION EXPANSIONS
We may obtain the individual probabilities of a random variable from the corresponding p.g.f. by taking repeated derivatives, as in formula (2.4). In many cases the necessary calculations are so laborious that this approach is impractical, and one must resort to some other means of expanding the p.g.f. in a power series. The most useful method is that of partial fraction expansion, which we present for the case where the p.g.f. is a rational function. Suppose that the p.g.f. of a random variable can be expressed as a ratio

G(s) = U(s)/V(s),  (6.1)

where U(s) and V(s) are polynomials with no common roots. If U(s) and V(s) have a common factor, it should be cancelled before
the method is applied. By division, if necessary, we may assume that the degree of V(s) is m and the degree of U(s) is less than m. Suppose that the equation V(s) = 0 has m distinct real roots, s1, ..., sm, so that

V(s) = (s − s1)(s − s2) ... (s − sm).  (6.2)

Then the ratio in (6.1) can be decomposed into partial fractions

G(s) = a1/(s1 − s) + a2/(s2 − s) + ... + am/(sm − s),  (6.3)

where the constants a1, ..., am are determined from

ar = −U(sr)/V′(sr),  r = 1, ..., m.  (6.4)

Equation (6.3) is well known in algebra. To prove (6.4) for r = 1, we substitute (6.2) into (6.1) and multiply by (s1 − s) to obtain

(s1 − s) G(s) = −U(s)/[(s − s2)(s − s3) ... (s − sm)].  (6.5)
It is evident from (6.3) that, as s → s1, the left-hand side of (6.5) tends to a1; the numerator on the right-hand side tends to −U(s1), and the denominator to (s1 − s2)(s1 − s3) ... (s1 − sm), which is the same as V′(s1). Hence (6.4) is proven for r = 1. Since the same argument applies to all roots, (6.4) is true for all r. Now we use formula (6.3) to derive the coefficient of s^k in G(s). We first write
1/(sr − s) = (1/sr) · 1/(1 − s/sr) = (1/sr) [1 + (s/sr) + (s/sr)² + ...],  (6.6)

so that the coefficient of s^k in 1/(sr − s) is 1/sr^(k+1). Substituting (6.6) for r = 1, ..., m in (6.3), for s such that |s/sr| < 1, we find an exact expression for the coefficient pk:

pk = a1/s1^(k+1) + a2/s2^(k+1) + ... + am/sm^(k+1),  k = 0, 1, ....  (6.7)
Strict application of this method requires calculation of all m roots, which is usually prohibitive. However, we see from (6.7) that the value of pk is dominated by the term corresponding to the smallest root in absolute value. Suppose s1 is smaller in absolute value than any other root. Then, for large k, pk is well approximated by the single term a1/s1^(k+1).
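A small sketch of the method on a made-up rational p.g.f., G(s) = 2/[(s − 2)(s − 3)], whose roots s1 = 2 and s2 = 3 are known in advance; it checks (6.4) and (6.7) and the dominant-root approximation.

```python
# Partial-fraction expansion (6.3)-(6.7) for G(s) = U(s)/V(s)
# with U(s) = 2 and V(s) = (s - 2)(s - 3), so G(1) = 1.
def U(s):
    return 2.0

def Vp(s):
    return 2 * s - 5                 # V'(s)

roots = [2.0, 3.0]
a = [-U(r) / Vp(r) for r in roots]   # (6.4): a_r = -U(s_r)/V'(s_r)

def p(k):                            # (6.7): p_k = sum a_r / s_r^(k+1)
    return sum(ar / r ** (k + 1) for ar, r in zip(a, roots))

# p_0 should equal G(0) = 2/6 = 1/3, and the p_k should sum to one.
assert abs(p(0) - 1 / 3) < 1e-12
assert abs(sum(p(k) for k in range(200)) - 1.0) < 1e-12
# The dominant-root term a_1/s_1^(k+1) alone gives p_10 to about 1%.
print(round(p(10), 8), round(a[0] / 2 ** 11, 8))
```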
7. MULTIVARIATE PROBABILITY GENERATING FUNCTIONS
Let X = (X1, ..., Xm) be a random vector with the joint probability distribution

Pr{X1 = k1, ..., Xm = km} = p(k1, ..., km),  ki = 0, 1, ...;  i = 1, ..., m.  (7.2)

X is called a proper random vector if

Σ(k1) ... Σ(km) p(k1, ..., km) = 1,  (7.3)

and an improper random vector if the sum is less than one. The p.g.f. of X is defined by

Gx(s1, ..., sm) = Σ(k1) ... Σ(km) s1^k1 ... sm^km p(k1, ..., km),  (7.4)

for |s1| ≤ 1, ..., |sm| ≤ 1.

Example 1. Multinomial distribution. For the multinomial distribution with parameters n and (p0, p1, ..., pm), p0 + p1 + ... + pm = 1, the p.g.f. is

Gx(s1, ..., sm) = (p0 + p1s1 + ... + pmsm)^n.  (7.22)
This is most easily seen by writing the multinomial random vector as the sum of n independent and identically distributed random vectors and applying (7.9). Using (7.6) and (7.10), we have the p.g.f. of the sum Zm = X1 + ... + Xm,

Gx(s, ..., s) = [p0 + p1s + ... + pms]^n = [p0 + (1 − p0)s]^n,  (7.23)

and the p.g.f. of the marginal distribution of Xi,

Gx(1, ..., si, ..., 1) = [1 − pi + pisi]^n.  (7.24)

Thus, both Zm and Xi are binomial random variables. The marginal distribution of Xi and Xj is trinomial, with the p.g.f.

Gx(1, ..., si, ..., sj, ..., 1) = [(1 − pi − pj) + pisi + pjsj]^n.  (7.25)
It is easily shown that

E(Xi) = npi,  E(XiXj) = n(n − 1)pipj,  and  Cov(Xi, Xj) = −npipj,  (7.26)

which agrees with Equation (5.23) in Chapter 1. For the conditional distribution of X2 given X1 = k1, we apply (7.20) to (7.22) to obtain a binomial distribution with the expectation

E(X2|k1) = (n − k1) p2/(1 − p1).  (7.28)
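The moment formulas (7.26) are easy to check by simulation; the trinomial parameters below are illustrative choices, not from the text.

```python
import random

random.seed(7)
# Trinomial: each of n trials falls in category 1 w.p. p1, in
# category 2 w.p. p2, and otherwise in category 0.  Check
# E(X1) = n*p1 and Cov(X1, X2) = -n*p1*p2 from (7.26).
n, p1, p2, N = 12, 0.2, 0.3, 100_000
x1s, x2s = [], []
for _ in range(N):
    c1 = c2 = 0
    for _ in range(n):
        u = random.random()
        if u < p1:
            c1 += 1
        elif u < p1 + p2:
            c2 += 1
    x1s.append(c1)
    x2s.append(c2)

m1 = sum(x1s) / N
m2 = sum(x2s) / N
cov = sum(a * b for a, b in zip(x1s, x2s)) / N - m1 * m2
assert abs(m1 - n * p1) < 0.05        # E(X1) = 2.4
assert abs(cov + n * p1 * p2) < 0.1   # Cov(X1, X2) = -0.72
print(round(m1, 2), round(cov, 2))
```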
Example 2. Negative multinomial distribution (multivariate Pascal distribution). Consider a sequence of independent trials, each of which results in a success, S, or in one of the failures, F1, F2, ..., or Fm, with corresponding probabilities p, q1, q2, ..., qm, so that

p + q1 + ... + qm = 1.  (7.29)
Let Xi be the number of failures of type Fi, for i = 1, ..., m, preceding the rth success in an infinite sequence of trials. The joint probability distribution of X1, ..., Xm is
Pr{X1 = k1, ..., Xm = km} = [(k + r − 1)!/(k1! ... km! (r − 1)!)] q1^k1 ... qm^km p^r,  (7.30)

where k = k1 + ... + km, and the corresponding p.g.f. is

Gx(s1, ..., sm) = Σ(k1=0,∞) ... Σ(km=0,∞) s1^k1 ... sm^km [(k + r − 1)!/(k1! ... km! (r − 1)!)] q1^k1 ... qm^km p^r
               = [p/(1 − q1s1 − ... − qmsm)]^r.  (7.31)

Formula (7.31) shows that (X1, ..., Xm) is the sum of r i.i.d. random vectors, each of which has the p.g.f.

p/(1 − q1s1 − ... − qmsm).

We use (7.12), (7.13) and (7.14) to compute from (7.31) the expectations
E(Xi) = rqi/p,  i = 1, ..., m,  (7.32)

variances

Var(Xi) = rqi(p + qi)/p²,  i = 1, ..., m,  (7.33)

and covariances

Cov(Xi, Xj) = rqiqj/p²,  i ≠ j;  i, j = 1, ..., m.  (7.34)
The distribution of the sum Zm = X1 + ... + Xm and the marginal distributions of the individual random variables Xi are negative binomial distributions. This is evident from the definitions of these random variables; we may see it explicitly by writing the p.g.f. of the sum as

E(s^Zm) = Gx(s, ..., s) = [p/(1 − q1s − ... − qms)]^r = [p/(1 − qs)]^r,  (7.35)
where q = q1 + ... + qm, and the p.g.f. of Xi as

E(s^Xi) = Gx(1, ..., si, ..., 1) = [p/(1 − q + qi − qisi)]^r.  (7.36)
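A simulation sketch of (7.32) and (7.33), generating the trials directly as described at (7.29); the parameter values are arbitrary.

```python
import random

random.seed(3)
# Trials succeed w.p. p, fail as type F1 w.p. q1, as type F2 w.p. q2.
# X1 = number of F1 failures before the r-th success; check
# E(X1) = r*q1/p and Var(X1) = r*q1*(p + q1)/p**2.
p, q1, q2, r = 0.5, 0.3, 0.2, 4
N = 200_000
xs = []
for _ in range(N):
    succ = k1 = 0
    while succ < r:
        u = random.random()
        if u < p:
            succ += 1
        elif u < p + q1:
            k1 += 1                  # type-2 failures are simply ignored
    xs.append(k1)

mean = sum(xs) / N
var = sum(x * x for x in xs) / N - mean * mean
assert abs(mean - r * q1 / p) < 0.05               # 2.4
assert abs(var - r * q1 * (p + q1) / p**2) < 0.1   # 3.84
print(round(mean, 2), round(var, 2))
```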
E[e^((X1 + ... + Xn)s)] = E[e^(X1 s)] ... E[e^(Xn s)] = [mx(s)]^n,

as required.

Example 4. Gamma distribution with parameters μ and r:

f(x) = μ^r x^(r−1) e^(−μx) / Γ(r),  x > 0.  (4.10)

Example 1 shows that (4.10) is the density function of the sum of r i.i.d. variables, each having the exponential density function (4.5). Using the m.g.f. (4.6) and applying Theorem 1 yields the m.g.f.

M(s) = [μ/(μ − s)]^r.  (4.11)

We can obtain formula (4.11) directly from (4.2) with the density function (4.10).
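The identity between (4.10) and the sum of exponentials can be checked by estimating the m.g.f. at one point from simulated sums; the parameter values are arbitrary.

```python
import math
import random

random.seed(5)
# X = sum of r i.i.d. exponential(mu) variables has the gamma
# density (4.10); its m.g.f. should equal (mu/(mu - s))**r, as in (4.11).
mu, r, s = 2.0, 3, 0.5
N = 200_000
vals = [sum(random.expovariate(mu) for _ in range(r)) for _ in range(N)]
emp_mgf = sum(math.exp(s * v) for v in vals) / N
theo = (mu / (mu - s)) ** r
assert abs(emp_mgf - theo) < 0.05
print(round(emp_mgf, 3), round(theo, 3))
```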
4.1. The Central Limit Theorem.

Theorem 2. If X1, ..., Xn are a sample of n independent and identically distributed random variables with E(Xi) = μ and Var(Xi) = σ², then as n → ∞ the standardized sample mean has the normal distribution with mean zero and variance one.

This theorem was given in Section 6 of Chapter 1. We now prove it by using the moment generating function.

Proof. Let m(s) be the moment generating function of (Xi − μ)/σ,

m(s) = E[e^((Xi − μ)s/σ)],  for i = 1, ..., n,  (4.12)

and let Mz(s) be the moment generating function of the standardized sample mean Zn.  (4.13)

We need to show that, as n → ∞,

Mz(s) → e^(s²/2).  (4.14)
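Theorem 2 can be illustrated by simulating standardized means; the uniform parent distribution and the sample sizes below are arbitrary choices.

```python
import random

random.seed(11)
# Standardized means of n i.i.d. uniform(0,1) variables
# (mu = 1/2, sigma^2 = 1/12) should be close to N(0,1).
n, N = 200, 20_000
mu, sigma = 0.5, (1 / 12) ** 0.5
zs = []
for _ in range(N):
    xbar = sum(random.random() for _ in range(n)) / n
    zs.append((xbar - mu) * n ** 0.5 / sigma)

frac = sum(z <= 1 for z in zs) / N   # should be near Phi(1) = 0.8413
var = sum(z * z for z in zs) / N     # should be near 1
assert abs(frac - 0.8413) < 0.015
assert abs(var - 1.0) < 0.05
print(round(frac, 3), round(var, 3))
```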
We rewrite Zn as

Zn = (1/√n) Σ(i=1,n) (Xi − μ)/σ  (4.13a)

to establish a relationship between the m.g.f. of Zn and m(s) in (4.12). Using (4.13a), we write the m.g.f. of Zn as

Mz(s) = E[e^(Zn s)] = [m(s/√n)]^n.

The distribution function of T2 is easily computed from the density f2(0, t) in (6.8):

F2(0, t) = ∫(0 to t) f2(0, x) dx.  (6.9)

As t → ∞, F2(0, t) tends to a limit determined by λ1 and λ2.
4. Show that

f(x) = β^α x^(α−1) e^(−βx) / Γ(α),  x > 0,

is the density function of a gamma distribution. Find the mean and the variance of the distribution.

5. Continuation. Derive the distribution function

F(x) = ∫(0 to x) β^α t^(α−1) e^(−βt) / Γ(α) dt,  x > 0,

for a positive integer α.

6. Continuation. Let X1, ..., Xn be independent random variables having the density functions

fi(x) = β^(αi) x^(αi−1) e^(−βx) / Γ(αi),  i = 1, ..., n.

Find the density function of the sum Zn = X1 + ... + Xn. [HINT: Use induction and the identity in problem 8.]

7. The Chi-square Distribution. When β = 1/2 and α = n/2, the gamma density function in problem 4 becomes

f(x) = (1/2)^(n/2) x^(n/2−1) e^(−x/2) / Γ(n/2),  x > 0,

which is known as the chi-square distribution with n degrees of freedom. (a) Show that, if Z is a standard normal random variable with mean zero and variance unity, then Z² has a chi-square distribution with one degree of freedom. (b) Let Z1, ..., Zn be n i.i.d. standard normal random variables; then the sum of squares

Z1² + ... + Zn²

has a chi-square distribution with n degrees of freedom.

8. The Beta Functions. Show that

∫(0 to 1) t^m (1 − t)^n dt = m! n! / (m + n + 1)!.
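The beta-function identity of problem 8 is easy to confirm numerically with a midpoint-rule integral; the step count is chosen for accuracy and is not prescribed by the text.

```python
from math import factorial

# Midpoint-rule check of problem 8:
# integral_0^1 t^m (1 - t)^n dt = m! n! / (m + n + 1)!
def beta_mn(m, n, steps=100_000):
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** m * (1 - (i + 0.5) * h) ** n
               for i in range(steps)) * h

m, n = 3, 5
exact = factorial(m) * factorial(n) / factorial(m + n + 1)
assert abs(beta_mn(m, n) - exact) < 1e-6
print(round(exact, 8))
```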
The left-hand side quantity in the above equation is called the (complete) beta function. When the upper limit of the integral is less than one,

∫(0 to x) t^m (1 − t)^n dt,  0 < x < 1,

we have the incomplete beta function.

The constants of the partial fraction expansion in the ruin problem are

ar = 2√(pq) sin[rπ/(a + b)] sin[xrπ/(a + b)] / {(a + b) cos²[rπ/(a + b)]},  r = 1, ..., (a + b − 1).
(3.50)

One can expand each ratio on the right-hand side of (3.42) in a geometric series:

ar/(sr − s) = (ar/sr) [1 + (s/sr) + (s/sr)² + ...],  (3.51)

with sr being given in (3.44). Finally, introducing (3.50) and (3.51) in (3.45) and collecting the terms in powers of s, we obtain the desired formula for the probability πx,n:

πx,n = [2^n/(a + b)] p^((n−x)/2) q^((n+x)/2) Σ(r=1, a+b−1) sin[rπ/(a + b)] sin[xrπ/(a + b)] cos^(n−1)[rπ/(a + b)],  (3.52)
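Closed formulas such as (3.52) can be checked against a direct numerical iteration of the game, propagating the distribution of the gambler's fortune one step at a time; the stakes below (a = 3, b = 4, p = q = 1/2) are an arbitrary example.

```python
# States 0 and a+b are absorbing; the gambler starts with fortune a.
# Iterating the one-step recursion propagates the distribution of the
# fortune and accumulates the ruin probability at state 0.
p, q = 0.5, 0.5
a, b = 3, 4
M = a + b
probs = [0.0] * (M + 1)
probs[a] = 1.0
for n in range(200):
    nxt = [0.0] * (M + 1)
    nxt[0], nxt[M] = probs[0], probs[M]      # absorbing barriers
    for x in range(1, M):
        if probs[x]:
            nxt[x + 1] += p * probs[x]
            nxt[x - 1] += q * probs[x]
    probs = nxt

# For p = q the classical ruin probability is b/(a + b) = 4/7.
assert abs(probs[0] - b / M) < 1e-3
print(round(probs[0], 4))
```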
for 0 < x < a + b.

p(iα, iβ) = Pr{Xβ = iβ | Xα = iα},  (2.3)

with the conditions that

Σ(iα) Pr{Xα = iα} = Σ(iα) a(iα) = 1  (2.4)

and

Σ(iβ) p(iα, iβ) = 1.  (2.5)

Therefore, the joint probabilities of Xα, Xβ, Xγ, for α < β < γ, are given by

Pr{Xα = iα, Xβ = iβ, Xγ = iγ} = a(iα) p(iα, iβ) p(iβ, iγ)  (2.6)

and

Pr{Xα = iα, Xβ = iβ} = a(iα) p(iα, iβ).  (2.7)
Generally, for any collection of integers α < β < ...,  (3.16)

where π1 = p1p2 ... pm and π2 = p(m+1) ... p(m+n). Equations (3.13) through (3.16) can be shown with straightforward computations.

4. CLASSIFICATION OF STATES
Whether a transition from one state to another is possible depends upon the nature of the states involved. The state j is said to be reachable from state i if there exists some nonnegative integer n such that the probability pij(n) > 0, and we write i → j. For n = 0, we define
pii(0) = 1 and pij(0) = 0 for j ≠ i. If state j is reachable from state i and state i is reachable from state j, the two states are said to communicate, and we write i ↔ j. Suppose i → j and j → k; then there exist positive integers m and n such that

pij(m) > 0  and  pjk(n) > 0.
Using the Chapman-Kolmogorov equation in (3.7), we write

pik(m + n) = Σ(l) pil(m) plk(n) ≥ pij(m) pjk(n) > 0.

Therefore, state k is reachable from state i. It is clear that the communication relation has the following properties: (1) reflexivity: i ↔ i; (2) symmetry; (3) transitivity.

Let fii(n) denote the probability that, starting from state i, the first return to i occurs at the nth step. Then

μii = Σ(n=1,∞) n fii(n)  (4.12)

is the mean recurrence time for state i.

Recurrent null state and recurrent nonnull state. A recurrent state i is called a null state if the expectation μii = ∞, and a nonnull state if μii < ∞.

Periodic state and aperiodic state. A state i is periodic with period t > 1 if pii(n) = 0 except for n = t, 2t, ..., where t is the largest integer
with this property. A state which is not periodic is an aperiodic state. In the gambler's ruin problem, the event that a player breaks even has period t = 2: his winnings can be zero at the nth game only for n = 2, 4, .... The number of white balls drawn from an urn in the life table example is aperiodic.

Ergodic state. An aperiodic recurrent state with a finite mean recurrence time (μii < ∞) is called ergodic.

Absorbing state. A state i is an absorbing state if and only if pii(1) = 1. Clearly, if i is an absorbing state, fii = 1 and μii = 1.

The above definitions of states, which were introduced with respect to the first return probabilities, may alternatively be presented in terms of the transition probabilities. The following theorems in effect express further relations between fii(n) and pii(n) by way of these definitions.
Theorem 1. State j can be reached from state i if and only if fij > 0; states i and j communicate if and only if fij fji > 0.

Proof. The theorem follows directly from the inequalities

fij(n) ≤ pij(n) ≤ Σ(l=1,n) fij(l).
If j is a transient state, then, since the sum Σ(l=1,∞) fij(l) is bounded above by one,

Σ(l=n+1,∞) fij(l) → 0.  (5.6)

Therefore pij(n) → 0. If, on the other hand, j is ergodic, then according to (5.3)

pjj(n − l) → 1/μjj,

and by Theorem 4, Σ(l) fij(l) → fij = 1. Therefore

pij(n) → 1/μjj.
6. CLOSED SETS AND IRREDUCIBLE CHAINS

Closed set. A set C of states is closed if for every i in C

Σ(j∈C) pij = 1,  (6.1)

where the summation is taken over all states j belonging to the set C. Generally, for every n ≥ 1,

Σ(j∈C) pij(n) = 1,  (6.2)
since, starting from a state i in C, the system must be in one of the states of the set at the nth step. Thus, if C is a closed set, any state k outside the set cannot be reached from any state inside the set: if i belongs to C and k is outside C, then pik = 0 and, generally, pik(n) = 0 for all n ≥ 1. Therefore, any closed subset of a system can be studied independently of all other states. The totality of all states that can be reached from a given state i forms a closed set. A closed set may contain states which do not communicate. A closed set of communicating states is a class. If C is a class, then for every pair i and j in C there exists a positive integer n for which pij(n) > 0. An absorbing state is considered a class.
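Closed sets can be found mechanically by graph reachability over the positive entries of a transition matrix; the three-state matrix below is a made-up example in which {0, 1} is the only proper closed subset.

```python
# Reachability by depth-first search on the positive entries of a
# transition matrix; the matrix is a hypothetical example.
P = [
    [0.5, 0.5, 0.0],
    [0.4, 0.6, 0.0],
    [0.2, 0.3, 0.5],
]

def reachable(P, i):
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v, pr in enumerate(P[u]):
            if pr > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

# {0, 1} is closed: starting inside it, state 2 is never reached.
assert reachable(P, 0) == {0, 1}
# From state 2 every state can be reached, but not conversely,
# so the chain is not irreducible.
assert reachable(P, 2) == {0, 1, 2}
print(sorted(reachable(P, 0)))
```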
Irreducible chain. A Markov chain is called an irreducible chain if there exists no closed subset other than the set of all states.

Theorem 8. In an irreducible Markov chain every state can be reached from every other state.

Proof. Let C be the set of states of an irreducible Markov chain; then C contains no proper closed subset. Suppose that there are two states i and j in C such that j cannot be reached from i. Then either i is an absorbing state or else it belongs to a closed subset of which j is not a member; in either case C contains a proper closed subset, which is a contradiction.
Theorem 9. The states in a class are of the same type; they are either all transient or all recurrent null or all ergodic.
Proof. Let i and j be two states belonging to a class C, so that i ↔ j; then there exist integers l > 0 and n > 0 such that pij(l) > 0 and pji(n) > 0. Now for every m ≥ 0,

pii(l + m + n) ≥ pij(l) pjj(m) pji(n)

and

pjj(l + m + n) ≥ pji(n) pii(m) pij(l),

so that the two series Σ(m=0,∞) pii(m) and Σ(m=0,∞) pjj(m) either both converge or both diverge. Furthermore, as m → ∞, pii(m) → 0 if and only if pjj(m) → 0. Thus i and j are either both transient or both recurrent null or both ergodic. This shows that all states in a class are of the same type.
Corollary. The states in an irreducible Markov chain are of the same type. The corollary follows directly from Theorems 8 and 9. Theorem 10.
All states in a class have the same period.
Proof. Let i and j be two arbitrary states in a class C with periods ti and tj, respectively. There exist two numbers l > 0 and n > 0 such that pij(l) > 0 and pji(n) > 0. For every m > 0 for which pjj(m) > 0,

pii(l + m + n) ≥ pij(l) pjj(m) pji(n) > 0,  (6.4)

and, since pjj(m) > 0 implies pjj(2m) > 0, also pii(l + 2m + n) > 0. The period ti thus divides the difference (l + 2m + n) − (l + m + n) = m, this being true for every m > 0 for which pjj(m) > 0. Therefore ti divides the period tj. Since this argument holds when i and j are interchanged, it follows that tj divides ti, and thus ti = tj.

Corollary. All states in an irreducible Markov chain have the same period. The corollary follows directly from Theorems 8 and 10.
Ergodic chain. An irreducible Markov chain with ergodic states is called an ergodic chain. The above concepts are illustrated by an example.
Example 6. Consider a nine-state system whose one-step transition probabilities are given by the stochastic matrix (6.5).
The above stochastic matrix describes the transitions in a system of nine states numbered from 1 to 9. We divide these states into four subsets, C1, C2, C3 and C4.

πj ≥ Σ(i) [lim(n→∞) pki(n)] pij,  (7.10)

or

πj ≥ Σ(i) πi pij.  (7.11)
Suppose the strict inequality sign holds in (7.11) for some j; then

Σ(j) πj > Σ(j) Σ(i) πi pij = Σ(i) πi,

which is impossible. Therefore,

πj = Σ(i) πi pij.  (7.7)

Now, using (7.7), we write for n = 1, 2, ...,

πj = Σ(i) πi pij(n).  (7.13)

As n → ∞, (7.13) becomes

πj = Σ(i) πi πj,

and we recover

Σ(i) πi = 1.  (7.6)
Conversely, suppose a stationary distribution {πj} of an irreducible Markov chain exists with πj > 0 for every j, satisfying (7.6) and (7.7). By induction,

πj = Σ(i) πi pij(n)

for n = 1, 2, .... Since the chain is irreducible, all the states are of the same type. As n → ∞, we have either

lim(n→∞) pij(n) = 1/μjj  (7.14)

if all the states are ergodic, or

lim(n→∞) pij(n) = 0  (7.15)

if all the states are transient or recurrent null. The latter case is impossible, since Σ(j) πj = 1. Therefore the states must be ergodic, and

πj = 1/μjj,  (7.16)

so that the stationary distribution is the same as the limiting distribution of the chain.
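The stationary distribution of an ergodic chain can be computed by iterating π ← πP, which also illustrates (7.7); the transition matrix below is a made-up example.

```python
# Power iteration pi <- pi P converges to the stationary distribution
# of an ergodic chain; the matrix is a hypothetical example.
P = [
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
]
pi = [1 / 3] * 3
for _ in range(500):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

assert abs(sum(pi) - 1.0) < 1e-9                            # (7.6)
new = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
assert all(abs(a - b) < 1e-9 for a, b in zip(pi, new))      # (7.7)
print([round(x, 4) for x in pi])
```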
8. AN APPLICATION TO GENETICS
Genetics, the science of heredity, seeks to explain the heredity of living things, and the field provides an excellent opportunity for the application of probability concepts. Flower color, the shape of peas, and many other phenomena first observed by G. J. Mendel more than 100 years ago seem to follow specific probability laws. In the passing of heritable traits from one generation to the next, we see Markov chains at work in nature. The reader may consult textbooks on the subject (e.g., C. C. Li [1968] and Curt Stern [1960]) to gain knowledge in genetics; the present description is provided for the convenience of application of Markov chains. An important element of heredity is the chromosomes. In the nucleus of human somatic cells, for example, there are 23 pairs of chromosomes. One pair is called the sex chromosomes, designated by X and Y; the rest are autosomes. Each reproductive cell, sperm or egg, has only a set (a genome) of 23 single chromosomes, and is said to be haploid. A progeny which inherits two sets (or 23 pairs) of chromosomes, one set from each parent, is said to be diploid. Carriers of heritable traits are called genes, and they are located on the chromosomes. The location of a gene on the chromosome is called a locus. Genes appear in pairs; paired genes occupy the same locus of the paired chromosomes. Generally, each gene of a pair can assume two alternate forms (alleles), A and a. The proportions of genes in a population that are A and a are the gene frequencies, denoted by p and q (p + q = 1), respectively. The two alleles form three genotypes,
AA, Aa, aa (there is no distinction between Aa and aA) with respect to a particular locus. The genotypes AA and aa are called homozygotes; the genotype Aa, a heterozygote. If A is dominant over a, so that Aa has the same observable properties as AA, then there are two phenotypes. AA individuals produce only A-gametes, aa individuals produce only a-gametes, and Aa individuals produce A-gametes and a-gametes in equal number. A diploid progeny receives one gene from the father and one gene from the mother to form a pair. According to the Mendelian law, a progeny receives either one of the two genes of a parent with probability 1/2. Thus the mating Aa × Aa produces genotypes AA, Aa, aa with respective probabilities 1/4, 1/2, 1/4; the mating AA × Aa produces genotypes AA and Aa with equal probabilities of 1/2; and the mating AA × AA produces only the AA genotype.

Choice of mates in the human population generally is made through selection or by preference. To demonstrate the Mendelian theory, we consider an idealized situation of the natural environment in which pairings of mates are made at random and are independent of one another. This is called random mating. Under random mating, both the genes and the genotypes of a progeny are selected at random. In the case of two alleles, a pair of genes may be chosen from (A, a) × (A, a), so that there are 2 × 2 ways of choosing a pair. But since there are three genotypes, a genotype may be chosen from (AA, Aa, aa) × (AA, Aa, aa), so that there are 3 × 3 ways of choosing a genotype. A salient feature of the random mating model is that selection of genes and selection of genotypes, when made independently, yield the same result. The following example will elucidate the point.

Consider a population in which the males and females have the same genotype frequency distribution: AA : Aa : aa = d : 2h : r, with d + 2h + r = 1. The gene frequencies for A and a are
p = d + h  and  q = h + r.  (8.1)
Under random mating, a progeny receives one A gene from one parent with probability p, and two A genes (one from each parent) with probability p². That is, a progeny will carry genotype AA with probability p². Similar computations show that a progeny will carry genotype AA, Aa, or aa with probability p², 2pq, or q², respectively. These probabilities can be derived also through the selection of genotypes. For this purpose, we formulate the product (dAA + 2hAa + raa) × (dAA + 2hAa + raa), where one factor represents the father and the other represents the mother, and d, 2h and r are the numerical coefficients
of the three genotypes, respectively. Since a genotype AA offspring may come from the mating AA × AA with probability 1, from AA × Aa with probability 1/2, and from Aa × Aa with probability 1/4, a simple multiplication of the two factors shows that the probability of a genotype AA offspring is d² + 2dh + h², or

(d + h)² = p².  (8.2)
From the multiplication we also find the probability of a genotype Aa offspring:

2dh + 2dr + 2hr + 2h² = 2(d + h)(h + r) = 2pq,  (8.3)

and the probability of a genotype aa offspring:

h² + 2hr + r² = (h + r)² = q².  (8.4)
These are exactly the same probabilities determined from the selection of genes.

Hardy-Weinberg law of equilibrium. A careful inspection of the above results leads to an even more remarkable discovery. In equations (8.1) there are three genotype frequencies (d, 2h, r) determining two gene frequencies (p and q). Therefore, there are infinitely many values of (d, 2h, r) that satisfy equations (8.1); one of them is (d, 2h, r) = (p², 2pq, q²). However, equations (8.2), (8.3) and (8.4) show that, whatever the parent genotype frequency composition (d, 2h, r) may be, under the random mating assumptions the first-generation progenies will have the genotype composition (p², 2pq, q²), and this composition will remain in equilibrium forever. This law of equilibrium was established independently by G. H. Hardy and W. Weinberg in 1908. Since then, numerous experimental findings and actual observations on real phenomena have been used to prove the validity of this discovery. The most often cited example in the human population is the MN blood typing, which involves one pair of genes for which all three genotypes are distinguishable. In problem 14 we shall demonstrate the Hardy-Weinberg law in this case.

To describe the heredity process at one given locus in terms of Markov chains, we number the three genotypes AA, Aa, and aa by 1, 2, and 3, and denote by pij the probability that an offspring has genotype j given that a specified parent has genotype i. Using the mother-child pair as an example for clarity, we let
pij = Pr{a child has genotype j | his mother has genotype i},  i, j = 1, 2, 3.  (8.5)
When i = 2 and j = 1, p21 is the probability that a genotype Aa mother will produce a genotype AA son. Table 1 shows the entire 3 × 3 array of transitions and their probabilities:

Table 1. One-Step Transition Probabilities

                          Genotype of Son
Genotype of Mother     AA (1)   Aa (2)   aa (3)
  AA  1                 p11      p12      p13
  Aa  2                 p21      p22      p23
  aa  3                 p31      p32      p33

The one-step transition probability matrix is

         | p11  p12  p13 |
P(1) =   | p21  p22  p23 |   (8.6)
         | p31  p32  p33 |
We can determine the transition probabilities in (8.5) by enumerating all possible genotypes of the father, AA, Aa, aa, with the corresponding frequencies d, 2h, r. However, since selection of genes and selection of genotypes lead to the same result, we shall determine the probabilities on the basis of gene selection. Consider the probability

p21 = Pr{a child will have genotype AA | his mother has genotype Aa}.  (8.7)

For the child to have a pair of AA genes, he must inherit one A gene from his mother, with probability 1/2, and acquire another A gene from the male population (through his father), with probability p. Therefore p21 = p/2. Similar computations yield the other probabilities:
         | p    q    0   |
P(1) =   | p/2  1/2  q/2 |   (8.8)
         | 0    p    q   |
Each pij in equation (8.8) is interpreted as the conditional probability of a genotype of an individual given the genotype of a specified member of the preceding generation; but it is also equal to the conditional probability of a genotype of an individual given the genotype of a specified member of the succeeding generation. Let us verify the probability
p21 = Pr{a mother has genotype AA | her child has genotype Aa}.  (8.9)

The child could have inherited either the A gene or the a gene from his mother, each with probability 1/2. If he inherited the A gene from his mother, then she has one A gene, and the probability of her having another A gene is p. If he inherited the a gene from his mother, then the probability of her having genotype AA is zero. Therefore the required probability is (1/2)p + (1/2)(0) = p/2, which is equal to p21. The reader may verify the other probabilities in equation (8.8). This type of conditional probability is quite useful in genetic research when inference is to be made about the genotype of an ancestor on the basis of information about a present generation.

We have labeled pij a one-step transition probability, to be consistent with the term introduced earlier in this chapter. A two-step transition probability, denoted by pij(2), refers to a transition from a grandparent to a grandchild, or from a grandchild to a grandparent. For example,

p23(2) = Pr{a man has genotype aa | a specified grandfather has genotype Aa}  (8.10)

or

p23(2) = Pr{a man has genotype aa | a specified grandchild has genotype Aa}.  (8.10a)
Generally, a two-step transition probability can be derived from the equation

pij(2) = Σ(l=1,3) pil plj.

A general formula for fii(n) can be found. For each i, the sum

fii(1) + fii(2) + ... = fii  (8.24)
is the probability of eventual return of the original genotype i. We have established in equation (4.15), Section 4, a relationship between fii and the sum of the pii(n):

Σ(n=0,∞) pii(n) = (1 − fii)^(−1).  (8.25)
In the present case p_ii(n) > 0 for each n, so the infinite sum on the left-hand side of (8.25) diverges, and hence f_ii = 1. This means that for every i the sequence {f_ii(n)} forms a proper probability distribution, with the mean first return time given by the familiar formula

μ_ii = Σ_{n=1}^{∞} n f_ii(n).   (8.26)

However, we can find the value of μ_ii without explicit formulas for the probabilities f_ii(n). Equation (5.3) in Theorem 6 shows that the mean return time μ_ii and the corresponding limiting probability π_i are reciprocals of one another, and provides the following simple formula for μ_ii:

μ_ii = 1/π_i.   (8.27)
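Relations (8.26) and (8.27) can be checked numerically: recover f_ii(n) from p_ii(n) by inverting the first-return decomposition p_ii(n) = Σ_{k=1}^{n} f_ii(k) p_ii(n−k), then compare Σ n f_ii(n) with 1/π_i. The chain used below is the random-mating genotype chain with p = q = 1/2; its transition matrix is an assumption here, and any ergodic chain with known π would serve:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
pi = [0.25, 0.5, 0.25]          # stationary distribution (p^2, 2pq, q^2)

i, N = 1, 60                    # state Aa; truncate the series at N terms
Pn, p_ii = P, [1.0]             # p_ii(0) = 1
for _ in range(N):
    p_ii.append(Pn[i][i])
    Pn = mat_mul(Pn, P)

# f_ii(n) = p_ii(n) - sum_{k<n} f_ii(k) p_ii(n-k)  (first-return recursion)
f = [0.0]
for n in range(1, N + 1):
    f.append(p_ii[n] - sum(f[k] * p_ii[n - k] for k in range(1, n)))

mu = sum(n * f[n] for n in range(1, N + 1))
assert abs(sum(f) - 1.0) < 1e-9        # f_ii = 1: return is certain
assert abs(mu - 1.0 / pi[i]) < 1e-9    # mu_22 = 1/pi_2 = 2 generations
```

The truncation error is negligible here because f_ii(n) decays geometrically.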
AN APPLICATION TO GENETICS
Using the limiting probabilities π_i in (8.20), we obtain immediately the mean first return times for the genotypes AA, Aa, and aa:

μ_11 = 1/p²,   μ_22 = 1/(2pq),   and   μ_33 = 1/q²,   (8.28)
respectively. Thus the more A genes there are in a population, the smaller will be the expected number of generations needed for the return of genotype AA. However, the shortest expected return time for genotype Aa is μ_22 = 2 generations, attained when p = q = 1/2.

8.1. Two Loci Each with Two Alleles

Consider now two pairs of genes: genes (A, a) and genes (B, b) at two different loci on the same pair of chromosomes. The gene frequencies of A and a in the population are denoted by p_A and q_a (p_A + q_a = 1), and the genotype frequencies of AA, Aa, aa are denoted by d_a, 2h_a, r_a, with p_A = d_a + h_a and q_a = h_a + r_a. For genes (B, b) we use similar notation, with the subscript a replaced by b and p_A, q_a, etc. replaced by p_B, q_b, etc. When the four alleles (A, a; B, b) are considered jointly, there are nine genotypes in the population:
AABB,  AABb,  AAbb,
AaBB,  AaBb,  Aabb,
aaBB,  aaBb,  aabb,   (8.29)

with the corresponding frequencies:

f_AABB,  f_AABb,  f_AAbb,
f_AaBB,  f_AaBb,  f_Aabb,
f_aaBB,  f_aaBb,  f_aabb.   (8.30)
The sum of the frequencies is equal to one. We seek the condition that leads to an equilibrium genotype distribution in one generation. Suppose that the genotype distributions of (A, a) and of (B, b) in the population are mutually independent, so that the conditional genotype distribution of (A, a) is given by

(AA : Aa : aa | (B, b)) = d_a : 2h_a : r_a,   (8.31)

where (B, b) stands for the condition BB, or Bb, or bb, and the conditional genotype distribution of (B, b) is given by
MARKOV CHAINS
(BB : Bb : bb | (A, a)) = d_b : 2h_b : r_b,   (8.32)
where (A, a) stands for the condition AA, or Aa, or aa. Then the frequencies of the nine genotypes AABB, AABb, …, aabb, for males and for females in the population, are

d_a d_b,   2d_a h_b,   d_a r_b,
2h_a d_b,  4h_a h_b,   2h_a r_b,
r_a d_b,   2r_a h_b,   r_a r_b.   (8.33)
A random mating taking place in the population may be represented by the product

(d_a d_b AABB + 2d_a h_b AABb + ⋯ + r_a r_b aabb)
    × (d_a d_b AABB + 2d_a h_b AABb + ⋯ + r_a r_b aabb).   (8.34)

The first factor in (8.34) represents the father, the second factor represents the mother, and each factor contains nine terms corresponding to the nine genotypes in (8.29). The constants d_a d_b, 2d_a h_b, …, r_a r_b are numerical coefficients of the corresponding genotypes. We can derive the probabilities of the nine genotypes of the progeny directly from the multiplication of the two factors in (8.34). Thus we find that the probability that the progeny will have genotype AABb is

P(AABb) = 2(d_a² + 2d_a h_a + h_a²)(d_b h_b + h_b² + d_b r_b + h_b r_b)
        = 2(d_a + h_a)²(d_b + h_b)(h_b + r_b),   (8.35)

that is,

P(AABb) = 2 p_A² p_B q_b.
The probabilities for the entire nine genotypes are:

p_A² p_B²,    2p_A² p_B q_b,    p_A² q_b²,
2p_A q_a p_B²,  4p_A q_a p_B q_b,  2p_A q_a q_b²,
q_a² p_B²,    2q_a² p_B q_b,    q_a² q_b².   (8.36)
This means that, if the independence assumptions in (8.31) and (8.32) hold true, then whatever the parent population genotype composition in (8.33) may be, the first-generation progeny will have the genotype composition given in (8.36), and this composition will remain in equilibrium thereafter.
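The claim can be sketched computationally. Under the independence assumptions, gametes AB, Ab, aB, ab occur with frequencies p_A p_B, p_A q_b, q_a p_B, q_a q_b, and random union of gametes reproduces (8.36). The particular d, h, r values below are illustrative only:

```python
from fractions import Fraction as F

# Genotype frequencies d, 2h, r at each locus (each triple sums to one).
da, ha, ra = F(1, 2), F(1, 8), F(1, 4)
db, hb, rb = F(1, 4), F(1, 4), F(1, 4)
assert da + 2*ha + ra == 1 and db + 2*hb + rb == 1

pA, qa = da + ha, ha + ra
pB, qb = db + hb, hb + rb

# Gamete frequencies under the independence assumptions (8.31)-(8.32).
gam = {('A', 'B'): pA * pB, ('A', 'b'): pA * qb,
       ('a', 'B'): qa * pB, ('a', 'b'): qa * qb}

# Offspring genotype distribution from random union of two gametes.
prog = {}
for (g1, G1), w1 in gam.items():
    for (g2, G2), w2 in gam.items():
        key = (''.join(sorted((g1, g2))), ''.join(sorted((G1, G2))))
        prog[key] = prog.get(key, F(0)) + w1 * w2

assert sum(prog.values()) == 1
assert prog[('AA', 'Bb')] == 2 * pA**2 * pB * qb   # matches (8.36)
assert prog[('aa', 'bb')] == qa**2 * qb**2
```

Since the progeny frequencies in (8.36) depend only on p_A, q_a, p_B, q_b, repeating the computation with them as input returns the same distribution — the equilibrium asserted in the text.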
In this case of two loci with two alleles each, there are four kinds of gametes, AB, Ab, aB, and ab, with the respective frequencies and their sums shown below:

         B       b      Sums
A       f_AB    f_Ab    p_A
a       f_aB    f_ab    q_a
Sums    p_B     q_b     1

…

∏_{m=2}^{s} (1 − λ_m) = Σ_{k=1}^{s} A_kk(1).   (4.14)

To prove the equality in (4.14), we introduce the equation
LIMITING PROBABILITY DISTRIBUTION
|A(1) − ρI| = 0   (4.15)

and determine the roots ρ_1, ρ_2, …, ρ_s. It is well known (see also Chapter 14, equation (3.17)) that

ρ_1 ρ_2 ⋯ ρ_{s−1} + ρ_1 ρ_3 ⋯ ρ_s + ⋯ + ρ_2 ρ_3 ⋯ ρ_s = Σ_{k=1}^{s} A_kk(1).   (4.16)

Since A(1) = (I − P), we can write

|A(1) − ρI| = |(1 − ρ)I − P| = |λI − P|,   (4.17)

where ρ = 1 − λ, and rewrite equation (4.15) as

|λI − P| = 0.   (4.18)

The roots of equations (4.15) and (4.18) have the relationship

ρ_l = 1 − λ_l,   l = 1, 2, …, s.   (4.19)

Substituting (4.19) in (4.16) and recognizing that ρ_1 = 1 − λ_1 = 0, we recover the relation

(1 − λ_2)(1 − λ_3) ⋯ (1 − λ_s) = Σ_{k=1}^{s} A_kk(1)   (4.20)

and equation (4.12).
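Identity (4.20) can be checked on a concrete case. The sketch below uses exact rational arithmetic and a small 3×3 stochastic matrix whose eigenvalues happen to be 1, −1/2, −1/4 (the matrix is chosen here purely for illustration):

```python
from fractions import Fraction as F
from itertools import permutations

def det(M):
    """Determinant by permutation expansion (fine for small matrices)."""
    n, total = len(M), F(0)
    for perm in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n)
                  if perm[a] > perm[b])          # inversion count -> sign
        prod = F(1)
        for r, c in enumerate(perm):
            prod *= M[r][c]
        total += (-1) ** inv * prod
    return total

P = [[F(0),    F(1, 3), F(2, 3)],
     [F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 2), F(1, 2), F(0)]]
lams = [F(1), F(-1, 2), F(-1, 4)]

I = [[F(i == j) for j in range(3)] for i in range(3)]

def charpoly(lam):
    return det([[lam * I[i][j] - P[i][j] for j in range(3)]
                for i in range(3)])

assert all(charpoly(l) == 0 for l in lams)       # the stated eigenvalues

A1 = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]

def cof(M, i, j):
    minor = [[M[r][c] for c in range(3) if c != j]
             for r in range(3) if r != i]
    return (-1) ** (i + j) * det(minor)

lhs = (1 - lams[1]) * (1 - lams[2])              # (1 - λ2)(1 - λ3)
rhs = cof(A1, 0, 0) + cof(A1, 1, 1) + cof(A1, 2, 2)
assert lhs == rhs == F(15, 8)                    # identity (4.20) holds
```

Both sides equal 15/8 exactly, with no rounding involved.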
5. EXAMPLES

In this section we use the formulas described in the preceding sections to compute the higher-order transition probabilities and limiting probability distributions for Markov chains based on given stochastic matrices. These examples are presented not merely to review the algebraic formulas, but also to acquaint the reader with Markov chains through computations in practical examples. In passing, the reader may notice that in each example there is one eigenvalue λ = 1 and all other eigenvalues are no greater than unity in absolute value. While in some cases the characteristic equation |λI − P| = 0 admits complex roots, the formulas for the probabilities are nevertheless real functions. The limiting probabilities computed from the matrix A(1) using formula (4.2) are equal to those obtained by taking the limit of p_ij(n) as n → ∞. Some well-known examples are included to show that computations of p_ij(n) are simpler and more explicit using formula
ALGEBRAIC TREATMENT OF FINITE MARKOV CHAINS
(3.2) than using conventional formulas as given, for example, in Chapter XVI of Feller [1968].

Example 1. We use the subsets in the example in Section 6, Chapter 5, to compute the higher-order transition probabilities. The states in each subset are renumbered from 1 on in this illustration, in order to avoid confusion in notation.

(1) Subset C_2: {1, 2}

The characteristic matrix is

A(λ) = (λI − P_2) = (  λ   −1 )
                    ( −1    λ ).

The characteristic equation |A(λ)| = 0 has two roots: λ_1 = 1 and λ_2 = −1. Therefore

A(λ_1) = (  1  −1 )
         ( −1   1 )

and

A(λ_2) = ( −1  −1 )
         ( −1  −1 ).

In this case the system has s = 2 states; formula (3.2) for the transition probabilities becomes

p_ij(n) = A_ji(λ_1) λ_1^n / (λ_1 − λ_2) + A_ji(λ_2) λ_2^n / (λ_2 − λ_1),

and hence the numerical values are
p_11(n) = 1/2 + (1/2)(−1)^n
p_12(n) = 1/2 − (1/2)(−1)^n
p_21(n) = 1/2 − (1/2)(−1)^n
p_22(n) = 1/2 + (1/2)(−1)^n.
As n approaches infinity, we find

lim_{n→∞} P_2^{2n} = ( 1  0 )
                     ( 0  1 )

and

lim_{n→∞} P_2^{2n+1} = ( 0  1 )
                       ( 1  0 ).

Consequently, each of the two states in C_2 is periodic with period t = 2.

(2) Subset C_3: {1, 2, 3}
P_3 = (  1     0      0    )
      ( 3/8   1/6   11/24  )
      ( 3/8   1/2    1/8   ).

We find the eigenvalues λ_1 = 1, λ_2 = 5/8, and λ_3 = −1/3. The corresponding characteristic matrices are

A(λ_1) = (   0      0       0    )
         ( −3/8    5/6   −11/24  )
         ( −3/8   −1/2     7/8   ),

A(λ_2) = ( −3/8     0       0    )
         ( −3/8   11/24  −11/24  )
         ( −3/8   −1/2     1/2   ),

and

A(λ_3) = ( −4/3     0       0    )
         ( −3/8   −1/2   −11/24  )
         ( −3/8   −1/2   −11/24  ).
Using the formula

p_ij(n) = Σ_{l=1}^{3} A_ji(λ_l) λ_l^n / ∏_{m=1, m≠l}^{3} (λ_l − λ_m),   (3.2)

we find the matrix P_3^n. The reader may check these values for n = 1, 2. As n → ∞,

lim_{n→∞} P_3^n = ( 1  0  0 )
                  ( 1  0  0 )
                  ( 1  0  0 ),

which is consistent with the fact that states 2 and 3 are transient states while state 1 is absorbing.

(3) Subset C_4: {1, 2, 3}

P_4 = (  0    1/3   2/3 )
      ( 1/2   1/4   1/4 )
      ( 1/2   1/2    0  ).

We solve the characteristic equation
to find the eigenvalues λ_1 = 1, λ_2 = −1/2, and λ_3 = −1/4, and the corresponding characteristic matrices:

A(λ_1) = (   1    −1/3   −2/3 )
         ( −1/2    3/4   −1/4 )
         ( −1/2   −1/2     1  ),

A(λ_2) = ( −1/2   −1/3   −2/3 )
         ( −1/2   −3/4   −1/4 )
         ( −1/2   −1/2   −1/2 ),

and

A(λ_3) = ( −1/4   −1/3   −2/3 )
         ( −1/2   −1/2   −1/4 )
         ( −1/2   −1/2   −1/4 ).
Therefore we obtain the matrix P_4^n; its entries follow from (3.2). It is easy to check that each row sum equals one for every n = 1, 2, …. As n → ∞,

lim_{n→∞} P_4^n = ( 15/45  16/45  14/45 )
                  ( 15/45  16/45  14/45 )
                  ( 15/45  16/45  14/45 ),
with identical rows. Thus all three states in the set C_4 are ergodic. These limiting probabilities can also be computed from (4.2). Matrix A(1) is the same as A(λ_1), with the cofactors A_11(1) = 5/8, A_22(1) = 2/3, A_33(1) = 7/12, so that Σ_k A_kk(1) = 45/24. Therefore the limiting distribution is

(π_1, π_2, π_3) = (15/45, 16/45, 14/45),

which is the same as each row in the above matrix. Easy computations show that

(π_1, π_2, π_3) P_4 = (π_1, π_2, π_3),

and thus the distribution (15/45, 16/45, 14/45) is stationary.
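The limiting matrix and the stationarity claim can be checked by brute force. Taking P_4 = [[0, 1/3, 2/3], [1/2, 1/4, 1/4], [1/2, 1/2, 0]] (as reconstructed here from the stated cofactors; treat the matrix as an assumption), a high power of P_4 already has near-identical rows:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P4 = [[0.0, 1/3, 2/3],
      [0.5, 0.25, 0.25],
      [0.5, 0.5, 0.0]]
pi = [15/45, 16/45, 14/45]

Pn = P4
for _ in range(49):                      # compute P4^50
    Pn = mat_mul(Pn, P4)
for row in Pn:                           # every row approaches pi
    assert all(abs(row[j] - pi[j]) < 1e-12 for j in range(3))

# Stationarity: pi * P4 = pi
piP = [sum(pi[i] * P4[i][j] for i in range(3)) for j in range(3)]
assert all(abs(piP[j] - pi[j]) < 1e-12 for j in range(3))
```

The convergence is rapid because the subdominant eigenvalue has modulus 1/2.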
Example 2. A two-state Markov chain with the stochastic matrix …

The formula for p_ij(n) varies with n. For n = 3k, 3k + 1, and 3k + 2 we use (3.2) and compute
P^{3k} = ( 1/2  1/2   0    0 )
         ( 1/2  1/2   0    0 )
         (  0    0    1    0 )
         (  0    0    0    1 ),

P^{3k+1} = (  0    0    0    1 )
           (  0    0    0    1 )
           ( 1/2  1/2   0    0 )
           (  0    0    1    0 ),

and

P^{3k+2} = (  0    0    1    0 )
           (  0    0    1    0 )
           (  0    0    0    1 )
           ( 1/2  1/2   0    0 ).
Therefore, the corresponding Markov chain is periodic with period t = 3; the pattern of transition matrices repeats after every three steps.
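The period-3 behaviour can be confirmed by direct computation. The one-step matrix P below is an assumption, inferred from the stationary distribution and the P^{3k} pattern (states 1 and 2 lead to state 4, state 4 to state 3, and state 3 to states 1 or 2 with probability 1/2 each):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

P3 = mat_mul(P, mat_mul(P, P))
expected = [[0.5, 0.5, 0.0, 0.0],
            [0.5, 0.5, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
assert P3 == expected            # P^3 agrees with the displayed P^{3k}
assert mat_mul(P3, P3) == P3     # so P^{3k} = P^3 for every k >= 1
assert mat_mul(P3, P) == P       # and P^{3k+1} = P, closing the cycle
```

All the arithmetic involves only halves, so the float comparisons are exact.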
To find the stationary probability distribution, we formulate the matrix

A(1) = (   1     0     0   −1 )
       (   0     1     0   −1 )
       ( −1/2  −1/2    1    0 )
       (   0     0    −1    1 )

and compute the cofactors

A_11(1) = 1/2,   A_22(1) = 1/2,   A_33(1) = 1,   A_44(1) = 1.

Therefore, the stationary distribution is

(π_1, π_2, π_3, π_4) = (1/6, 1/6, 1/3, 1/3).
It is easy to check that

(1/6, 1/6, 1/3, 1/3) (  0    0    0    1 )  =  (1/6, 1/6, 1/3, 1/3).
                     (  0    0    0    1 )
                     ( 1/2  1/2   0    0 )
                     (  0    0    1    0 )
Example 4. In a 4-state cyclical random walk, the (one-step) transition probabilities are p_{j,j−1} = q and p_{j,j+1} = p, with p + q = 1, for j = 1, 2, 3, 4. When j = 1, j − 1 = 4, and when j = 4, j + 1 = 1, so that the stochastic matrix assumes the form

P = ( 0  p  0  q )
    ( q  0  p  0 )
    ( 0  q  0  p )
    ( p  0  q  0 ).

The roots of the equation |λI − P| = 0 are λ_1 = 1, λ_2 = −1, λ_3 = √−1 (q − p), and λ_4 = −√−1 (q − p), and hence
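The four stated roots can be verified numerically by evaluating |λI − P| at each of them, here at the illustrative values p = 0.3, q = 0.7:

```python
from itertools import permutations

def det(M):
    """Determinant by permutation expansion (works for complex entries)."""
    n, total = len(M), 0
    for perm in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n)
                  if perm[a] > perm[b])      # inversion count -> sign
        prod = 1
        for r, c in enumerate(perm):
            prod *= M[r][c]
        total += (-1) ** inv * prod
    return total

p, q = 0.3, 0.7                   # illustrative values with p + q = 1
P = [[0, p, 0, q],
     [q, 0, p, 0],
     [0, q, 0, p],
     [p, 0, q, 0]]

roots = [1, -1, 1j * (q - p), -1j * (q - p)]
for lam in roots:
    M = [[lam * (i == j) - P[i][j] for j in range(4)] for i in range(4)]
    assert abs(det(M)) < 1e-12    # each stated root annihilates |λI - P|
```

This is the circulant-matrix eigenvalue pattern: the roots are Σ_j c_j ω^j over the fourth roots of unity ω.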
6.5]
163
EXAMPLES
∏_{m=2}^{4} (λ_1 − λ_m) = 4(p² + q²),

∏_{m=1, m≠2}^{4} (λ_2 − λ_m) = −4(p² + q²),

∏_{m=1, m≠3}^{4} (λ_3 − λ_m) = 4√−1 (p⁴ − q⁴),

∏_{m=1, m≠4}^{4} (λ_4 − λ_m) = −4√−1 (p⁴ − q⁴).
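The four products above follow from p + q = 1 (for instance, 2(1 + (q − p)²) = 4(p² + q²)). A quick numerical check at p = 0.3, q = 0.7 — note that the minus signs on the second and fourth products are restored here on this algebraic basis:

```python
# Eigenvalues of the 4-state cyclical random walk at p = 0.3, q = 0.7.
p, q = 0.3, 0.7
lam = [1, -1, 1j * (q - p), -1j * (q - p)]

def prod_excluding(l):
    """Product of (lam[l] - lam[m]) over all m != l."""
    out = 1
    for m in range(4):
        if m != l:
            out *= lam[l] - lam[m]
    return out

assert abs(prod_excluding(0) - 4 * (p**2 + q**2)) < 1e-12
assert abs(prod_excluding(1) + 4 * (p**2 + q**2)) < 1e-12
assert abs(prod_excluding(2) - 4j * (p**4 - q**4)) < 1e-12
assert abs(prod_excluding(3) + 4j * (p**4 - q**4)) < 1e-12
```

These denominators are exactly what formula (3.2) needs to produce the n-step probabilities for this walk.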
RENEWAL PROCESSES

…