Introduction to Stochastic Processes in Biostatistics


CHIN LONG CHIANG


WILEY SERIES IN PROBABILITY AND MATHEMATICAL STATISTICS

ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS

Editors: Ralph A. Bradley, J. Stuart Hunter, David G. Kendall, Geoffrey S. Watson

Probability and Mathematical Statistics

ALEXANDER · Elements of Mathematical Statistics
ANDERSON · An Introduction to Multivariate Statistical Analysis
BLACKWELL and GIRSHICK · Theory of Games and Statistical Decisions
CRAMER · The Elements of Probability Theory and Some of Its Applications
DOOB · Stochastic Processes
FELLER · An Introduction to Probability Theory and Its Applications, Volume I, Third Edition
FELLER · An Introduction to Probability Theory and Its Applications, Volume II
FISHER · Contributions to Mathematical Statistics
FISZ · Probability Theory and Mathematical Statistics, Third Edition
FRASER · Nonparametric Methods in Statistics
FRASER · Statistics - An Introduction
FRASER · The Structure of Inference
GRENANDER and ROSENBLATT · Statistical Analysis of Stationary Time Series
HANSEN, HURWITZ, and MADOW · Sample Survey Methods and Theory, Volume II
HOEL · Introduction to Mathematical Statistics, Third Edition
KEMPTHORNE · The Design and Analysis of Experiments
LEHMANN · Testing Statistical Hypotheses
PARZEN · Modern Probability Theory and Its Applications
RAO · Linear Statistical Inference and Its Applications
RIORDAN · An Introduction to Combinatorial Analysis
SCHEFFE · The Analysis of Variance
WALD · Sequential Analysis
WILKS · Collected Papers: Contributions to Mathematical Statistics
WILKS · Mathematical Statistics

Applied Probability and Statistics

BAILEY · The Elements of Stochastic Processes with Applications to the Natural Sciences
BENNETT and FRANKLIN · Statistical Analysis in Chemistry and the Chemical Industry
BROWNLEE · Statistical Theory and Methodology in Science and Engineering, Second Edition
BUSH and MOSTELLER · Stochastic Models for Learning
CHAKRAVARTI, LAHA and ROY · Handbook of Methods of Applied Statistics, Vol. I
CHAKRAVARTI, LAHA and ROY · Handbook of Methods of Applied Statistics, Vol. II
CHERNOFF and MOSES · Elementary Decision Theory
CHEW · Experimental Designs in Industry
CHIANG · Introduction to Stochastic Processes in Biostatistics
CLELLAND, deCANI, BROWN, BURSK, and MURRAY · Basic Statistics with Business Applications
COCHRAN · Sampling Techniques, Second Edition
COCHRAN and COX · Experimental Designs, Second Edition
COX · Planning of Experiments
COX and MILLER · The Theory of Stochastic Processes
DEMING · Sample Design in Business Research
DODGE and ROMIG · Sampling Inspection Tables, Second Edition
DRAPER and SMITH · Applied Regression Analysis
GOLDBERGER · Econometric Theory
GUTTMAN and WILKS · Introductory Engineering Statistics
HALD · Statistical Tables and Formulas
HALD · Statistical Theory with Engineering Applications
HANSEN, HURWITZ, and MADOW · Sample Survey Methods and Theory, Volume I
HOEL · Elementary Statistics, Second Edition
JOHNSON and LEONE · Statistics and Experimental Design: In Engineering and the Physical Sciences, Volumes I and II
KEMPTHORNE · An Introduction to Genetic Statistics
MEYER · Symposium on Monte Carlo Methods
PRABHU · Queues and Inventories: A Study of Their Basic Stochastic Processes
SARHAN and GREENBERG · Contributions to Order Statistics
TIPPETT · Technological Applications of Statistics
WILLIAMS · Regression Analysis
WOLD and JUREEN · Demand Analysis
YOUDEN · Statistical Methods for Chemists

Tracts on Probability and Statistics

BILLINGSLEY · Ergodic Theory and Information
CRAMER and LEADBETTER · Stationary and Related Stochastic Processes
RIORDAN · Combinatorial Identities
TAKACS · Combinatorial Methods in the Theory of Stochastic Processes

Digitized by the Internet Archive in 2015: https://archive.org/details/introductiontostOOchia

Introduction to Stochastic Processes in Biostatistics

A WILEY PUBLICATION IN APPLIED STATISTICS

CHIN LONG CHIANG
Professor of Biostatistics
University of California, Berkeley

John Wiley & Sons, Inc.
New York · London · Sydney

Copyright © 1968 by John Wiley & Sons, Inc.

All rights reserved. No part of this book may be reproduced by any means, nor transmitted, nor translated into a machine language without the written permission of the publisher.

Library of Congress Catalog Card Number: 68-21178
GB 471 15500X
Printed in the United States of America

In Memory of My Parents

Preface

Time, life, and risks are three basic elements of stochastic processes in biostatistics. Risks of death, risks of illness, risks of birth, and other risks act continuously on man with varying degrees of intensity. Long before the development of modern probability and statistics, men were concerned with the chance of dying and the length of life, and they constructed tables to measure longevity. But it was not until the advances in the theory of stochastic processes made in recent years that empirical processes in the human population have been systematically studied from a probabilistic point of view.

The purpose of this book is to present stochastic models describing these empirical processes. Emphasis is placed on specific results and explicit solutions rather than on the general theory of stochastic processes.

Those readers who have a greater curiosity about the theoretical arguments are advised to consult the rich literature on the subject.

A basic knowledge of probability and statistics is required for a profitable reading of the text. Calculus is the only mathematics presupposed, although some familiarity with differential equations and matrix algebra is needed for a thorough understanding of the material.

The text is divided into two parts. Part 1 begins with one chapter on random variables and one on probability generating functions for use in succeeding chapters. Chapter 3 is devoted to basic models of population growth, ranging from the Poisson process to the time-dependent birth-death process. Some other models of practical interest that are not included elsewhere are given in the problems at the end of the chapter.

Birth and death are undoubtedly the most important events in the human population, but the illness process is statistically more complex. Illnesses are potentially concurrent, repetitive, and reversible, and consequently analysis is more challenging. In this book illnesses are treated as discrete entities, and a population is visualized as consisting of discrete states of illness. An individual is said to be in a particular state of illness if he is affected with the corresponding diseases. Since he may leave one illness state for another or enter a death state, consideration of illness opens up a new domain of interest in multiple transition probability and multiple transition time. A basic and important case is that in which there are two illness states. Two chapters (Chapters 4 and 5) are devoted to this simple illness-death process.

In dealing with a general illness-death process that considers any finite number of illness states, I found myself confronted with a finite Markov process. To avoid repetition and to maintain a reasonable graduation of mathematical involvement, I have interrupted the development of illness processes to discuss the Kolmogorov differential equations for a general situation in Chapter 6. This chapter is concerned almost entirely with the derivation of explicit solutions of these equations. For easy reference a section (Section 3) on matrix algebra is included.

Once the Kolmogorov differential equations are solved in Chapter 6, the discussion on the general illness-death process in Chapter 7 becomes straightforward; however, the model contains sufficient points of interest to require a separate chapter. The general illness-death process has been extended in Chapter 8 in order to account for the population increase through immigration and birth. These two possibilities lead to the emigration-immigration process and the birth-illness-death process, respectively. But my effort failed to provide an explicit solution for the probability distribution function in the latter case.

Part 2 is devoted to special problems in survival and mortality. The life table and competing risks are classical and central topics in biostatistics, while the follow-up study dealing with truncated information is of considerable practical importance. I have endeavored to integrate these topics as thoroughly as possible with probabilistic and statistical principles. I hope that I have done justice to these topics and to modern probability and statistics.

It should be emphasized that although the concept of illness processes has arisen from studies in biostatistics, the models have applications to other fields. Intensity of risk of death (force of mortality) is synonymous with “failure rate” in reliability theory; illness states may be alternatively interpreted as geographic locations (in demography), compartments (in compartment analysis), occupations, or other defined conditions. Instead of the illness of a person, we may consider whether a person is unemployed, or whether a gene is a mutant gene, a telephone line is busy, an elevator is in use, a mechanical object is out of order, and so on.

This book was written originally for students of biostatistics, but it may be used for courses in other fields as well. The following are some suggestions for teaching plans:

1. For a one-semester course in stochastic processes: Chapters 2 through 8.

2. For a year course in biostatistics: Chapters 1 and 2 followed by Chapters 10 through 12, and then by Chapters 3 through 8. In this arrangement, a formal introduction of the pure death process is necessary at the beginning of Chapter 10.

3. For a year course in demography: Plan 2 above may be followed, except that the term “illness process” might be more appropriately interpreted as “internal migration process.”

4. As a supplementary text for courses in biostatistics or demography: Chapters 9 through 12.

If it is used as a general reference book, Chapter 9 may be omitted.

The book is an outgrowth partly of my own research, some of which appears here for the first time (e.g., Chapter 5 and parts of Chapter 6), and partly of lecture notes for courses in stochastic processes, for which I am grateful to the many contributors to the subject. I have used this material in teaching at the Universities of California (Berkeley), Michigan, Minnesota, and North Carolina; at Yale and Emory Universities; and at the London School of Hygiene, University of London.

This work could not have been completed without the aid of a number of friends, to whom I am greatly indebted. It is my pleasure to acknowledge the generous assistance of Mrs. Myra Jordan Samuels and Miss Helen E. Supplee, who have read early versions and made numerous constructive criticisms and valuable suggestions. Their help has tremendously improved the quality of the book. I am indebted to the School of Public Health, University of California, Berkeley, and the National Institutes of Health, Public Health Service, for financial aid under Grant No. 5-SO1-FR-05441-06 to facilitate my work. An invitation from Peter Armitage to lecture in a seminar course at the London School of Hygiene gave me an opportunity to work almost exclusively on research projects associated with this book. I also wish to express my appreciation to Richard J. Brand and Geoffrey S. Watson who read some of the chapters and provided useful suggestions. My thanks are also due to Mrs. Shirley A. Hinegardner for her expert typing of the difficult material; to Mrs. Dorothy Wyckoff for her patience with the numerical computations; and to Mrs. Lois Karp for secretarial assistance.

Chin Long Chiang
University of California, Berkeley
May, 1968

Contents

PART 1

Chapter 1. Random Variables
  1. Introduction
  2. Random Variables
  3. Multivariate Probability Distributions
  4. Mathematical Expectation
    4.1. A Useful Inequality
    4.2. Conditional Expectation
  5. Moments, Variance, and Covariance
    5.1. Variance of a Linear Function of Random Variables
    5.2. Covariance Between Two Linear Functions of Random Variables
    5.3. Variance of a Product of Random Variables
    5.4. Approximate Variance of a Function of Random Variables
    5.5. Conditional Variance and Covariance
    5.6. Correlation Coefficient
  Problems

Chapter 2. Probability Generating Functions
  1. Introduction
  2. General Properties
  3. Convolutions
  4. Examples
    4.1. Binomial Distribution
    4.2. Poisson Distribution
    4.3. Geometric and Negative Binomial Distributions
  5. Partial Fraction Expansions
  6. Multivariate Probability Generating Functions
  7. Sum of a Random Number of Random Variables
  8. A Simple Branching Process
  Problems

Chapter 3. Some Stochastic Models of Population Growth
  1. Introduction
  2. The Poisson Process
    2.1. Method of Probability Generating Functions
    2.2. Some Generalizations of the Poisson Process
  3. Pure Birth Processes
    3.1. The Yule Process
    3.2. Time-Dependent Yule Process
    3.3. Joint Distribution in the Time-Dependent Yule Process
  4. The Polya Process
  5. Pure Death Process
  6. Birth-Death Processes
    6.1. Linear Growth
    6.2. A Time-Dependent General Birth-Death Process
  Problems

Chapter 4. A Simple Illness-Death Process
  1. Introduction
  2. Illness Transition Probability, $P_{\alpha\beta}(t)$, and Death Transition Probability, $Q_{\alpha\delta}(t)$
  3. Chapman-Kolmogorov Equations
  4. Expected Durations of Stay in Illness and Death States
  5. Population Sizes in Illness States and Death States
    5.1. The Limiting Distribution
  Problems

Chapter 5. Multiple Transitions in the Simple Illness-Death Process
  1. Introduction
  2. Multiple Exit Transition Probabilities
    2.1. Conditional Distribution of the Number of Transitions
  3. Multiple Transition Probabilities Leading to Death
  4. Chapman-Kolmogorov Equations
  5. More Identities for Multiple Transition Probabilities
  6. Multiple Entrance Transition Probabilities
  7. Multiple Transition Time
    7.1. Multiple Transition Time Leading to Death
    7.2. Identities for Multiple Transition Time
  Problems

Chapter 6. The Kolmogorov Differential Equations and Finite Markov Processes
  1. Markov Processes and the Chapman-Kolmogorov Equation
  2. The Kolmogorov Differential Equations
    2.1. Derivation of the Kolmogorov Differential Equations
    2.2. Examples
  3. Matrices, Eigenvalues, and Diagonalization
    3.1. Eigenvalues and Eigenvectors
    3.2. Diagonalization of a Matrix
    3.3. A Useful Lemma
    3.4. Matrix of Eigenvectors
  4. Explicit Solutions for Kolmogorov Differential Equations
    4.1. Intensity Matrix V and Its Eigenvalues
    4.2. First Solution for Individual Transition Probabilities $P_{ij}(t)$
    4.3. Second Solution for Individual Transition Probabilities $P_{ij}(t)$
    4.4. Identity of the Two Solutions
    4.5. Chapman-Kolmogorov Equations
  Problems

Chapter 7. A General Model of Illness-Death Process
  1. Introduction
  2. Transition Probabilities
    2.1. Illness Transition Probabilities, $P_{\alpha\beta}(t)$
    2.2. Transition Probabilities Leading to Death, $Q_{\alpha\delta}(t)$
    2.3. An Equality Concerning Transition Probabilities
    2.4. Limiting Transition Probabilities
    2.5. Expected Durations of Stay in Illness and Death States
    2.6. Population Sizes in Illness States and Death States
  3. Multiple Transition Probabilities
    3.1. Multiple Exit Transition Probabilities
    3.2. Multiple Transition Probabilities Leading to Death
    3.3. Multiple Entrance Transition Probabilities
  Problems

Chapter 8. Migration Processes and Birth-Illness-Death Process
  1. Introduction
  2. Emigration-Immigration Processes: Poisson-Markov Processes
    2.1. The Differential Equations
    2.2. Solution for the Probability Generating Function
    2.3. Relation to the Illness-Death Process and Solution for the Probability Distribution
    2.4. Constant Immigration
  3. A Birth-Illness-Death Process
  Problems

PART 2

*Chapter 9. The Life Table and Its Construction
  1. Introduction
  2. Description of the Life Table
  3. Construction of the Complete Life Table
  4. Construction of the Abridged Life Table
    4.1. The Fraction of Last Age Interval of Life
  5. Sample Variance of $q_i$, $p_{ij}$, and $e_\alpha$
    5.1. Formulas for the Current Life Table
    5.2. Formulas for the Cohort Life Table
  Problems

* This chapter may be omitted without loss of continuity.

Chapter 10. Probability Distributions of Life Table Functions
  1. Introduction
    1.1. Probability Distribution of the Number of Survivors
  2. Joint Probability Distribution of the Numbers of Survivors
    2.1. An Urn Scheme
  3. Joint Probability Distribution of the Numbers of Deaths
  4. Optimum Properties of $p_j$ and $q_j$
    4.1. Maximum Likelihood Estimator of $p_j$
    4.2. Cramer-Rao Lower Bound for the Variance of an Unbiased Estimator of $p_j$
    4.3. Sufficiency and Efficiency of $p_j$
  5. Distribution of the Observed Expectation of Life
    5.1. Observed Expectation of Life and Sample Mean Length of Life
    5.2. Variance of the Observed Expectation of Life
  Problems

Chapter 11. Competing Risks
  1. Introduction
  2. Relations Between Crude, Net, and Partial Crude Probabilities
    2.1. Relations Between Crude and Net Probabilities
    2.2. Relations Between Crude and Partial Crude Probabilities
  3. Joint Probability Distribution of the Numbers of Deaths and the Numbers of Survivors
  4. Estimation of Crude, Net, and Partial Crude Probabilities
  5. Application to Current Mortality Data
  Problems

Chapter 12. Medical Follow-up Studies
  1. Introduction
  2. Estimation of Probability of Survival and Expectation of Life
    2.1. Basic Random Variables and Likelihood Function
    2.2. Maximum Likelihood Estimators of the Probabilities $p_x$ and $q_x$
    2.3. Estimation of Survival Probability
    2.4. Estimation of the Expectation of Life
    2.5. Sample Variance of the Observed Expectation of Life
  3. Consideration of Competing Risks
    3.1. Basic Random Variables and Likelihood Function
    3.2. Estimation of Crude, Net, and Partial Crude Probabilities
    3.3. Approximate Formulas for the Variances and Covariances of the Estimators
  4. Lost Cases
  5. An Example of Life Table Construction for the Follow-up Population
  Problems

References

Author Index

Subject Index

Part 1

CHAPTER 1

Random Variables

1. INTRODUCTION

A large body of probability theory and statistics has been developed for the study of phenomena arising from random experiments. Some studies take the form of mathematical models constructed to describe observable events, while others are concerned with statistical inference regarding random experiments. A familiar random experiment is the tossing of dice. When a die is tossed, there are six possible outcomes and it is not certain which one will occur. In a laboratory determination of antibody titer the result will also vary from one trial to another, even if the same blood specimen is used and the laboratory conditions are kept constant. Examples of random experiments can be found almost everywhere; in fact, the concept of random experiment may be extended so that any phenomenon may be thought of as the result of some random experiment, be it real or hypothetical.

As a framework for discussing random phenomena, it is convenient to represent each conceivable outcome of a random experiment by a point, called a sample point, denoted by s. The totality of all sample points for a particular experiment is called the sample space, denoted by S. Events may be represented by subsets of S; thus an event A consists of a certain collection of possible outcomes s. If two subsets contain no points s in common, they are said to be disjoint, and the corresponding events are said to be mutually exclusive: they cannot both occur as a result of a single experiment.
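As an illustration of these definitions (not part of the original text), the sample space for one toss of a die can be sketched as a Python set, with events as subsets. The event names below are hypothetical and chosen only for this example.

```python
# Sample space for one toss of a die: each sample point s is a face value.
S = frozenset(range(1, 7))

# Events are subsets of S (the names are illustrative assumptions).
even = frozenset(s for s in S if s % 2 == 0)        # {2, 4, 6}
odd = frozenset(s for s in S if s % 2 == 1)         # {1, 3, 5}
at_least_five = frozenset(s for s in S if s >= 5)   # {5, 6}


def mutually_exclusive(a, b):
    """Two events are mutually exclusive when their subsets are disjoint."""
    return a.isdisjoint(b)


print(mutually_exclusive(even, odd))            # True
print(mutually_exclusive(even, at_least_five))  # False: both contain 6
```

The events "even" and "odd" share no sample point, so they cannot both occur on a single toss, whereas "even" and "at least five" both occur when the die shows 6.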

The probabilities of the various events in S are the starting point for analysis of the experiment represented by S. Denoting the probability of an event A by Pr{A}, we may state the three fundamental assumptions about these probabilities as follows.

(i) Probabilities of events are nonnegative:

$$\Pr\{A\} \ge 0 \quad \text{for any event } A. \tag{1.1}$$

(ii) The probability of the whole sample space is unity:

$$\Pr\{S\} = 1. \tag{1.2}$$

(iii) The probability that one of a sequence of mutually exclusive events $\{A_i\}$ occurs is

$$\Pr\{A_1 \text{ or } A_2 \text{ or } \cdots\} = \sum_{i=1}^{\infty} \Pr\{A_i\}. \tag{1.3}$$

This is called the countably additive assumption. In the case of two mutually exclusive events, $A_1$ and $A_2$, we have

$$\Pr\{A_1 \text{ or } A_2\} = \Pr\{A_1\} + \Pr\{A_2\}. \tag{1.4}$$
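The three assumptions can be checked numerically on a small example. The sketch below assumes a fair die, so that every face has probability 1/6; that equal-likelihood assignment is an assumption of this example, not something the text imposes. The function name `pr` and the events `a1` and `a2` are likewise hypothetical.

```python
from fractions import Fraction

# Sample space for one toss of a die.
S = frozenset(range(1, 7))


def pr(event):
    """Probability of an event (a subset of S), assuming equally likely faces."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(S))


a1 = {1, 2}
a2 = {5}

assert all(pr({s}) >= 0 for s in S)       # assumption (i), equation (1.1)
assert pr(S) == 1                         # assumption (ii), equation (1.2)
assert a1.isdisjoint(a2)                  # a1 and a2 are mutually exclusive
assert pr(a1 | a2) == pr(a1) + pr(a2)     # two-event case, equation (1.4)

print(pr(a1 | a2))  # 1/2
```

Exact fractions are used instead of floating-point numbers so that the additivity check in (1.4) holds exactly rather than up to rounding error.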

2. RANDOM VARIABLES

Any single-valued numerical function V(s) defined on a sample space S will be called a random variable; thus, a random variable associates with each point s in the sample space a unique real number, called its value at s.
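To make the definition concrete, here is a minimal sketch in which the random experiment is the tossing of two dice and the random variable is the total shown. The choice of experiment and of this particular function is an assumption made for illustration only.

```python
from itertools import product

# Sample space for tossing two dice: each sample point s is an ordered pair.
S = list(product(range(1, 7), repeat=2))


def V(s):
    """A random variable: the total shown by the two dice at sample point s."""
    return s[0] + s[1]


# Each sample point has exactly one value, but several points may share a value.
values = {s: V(s) for s in S}
points_with_total_7 = [s for s in S if V(s) == 7]

print(len(S))                    # 36
print(V((3, 4)))                 # 7
print(len(points_with_total_7))  # 6
```

The function is single-valued because every sample point maps to one number, even though the map is not one-to-one: six different sample points all have the value 7.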

The

probability that the- value of the

random

variables are discrete discrete

random

variable

that the

random

=

*,}=/>„

is

is