
Springer Series in Statistics Probability and its Applications A Series of the Applied Probability Trust Editors-Probability and its Applications J. Gani, C.C. Heyde Editors-Springer Series in Statistics J. Berger, S. Fienberg, J. Gani, K. Krickeberg, I. Olkin, B. Singer

Springer Series in Statistics Andrews/Herzberg: Data: A Collection of Problems from Many Fields for the Student and Research Worker. Anscombe: Computing in Statistical Science through APL. Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition. Brémaud: Point Processes and Queues: Martingale Dynamics. Brockwell/Davis: Time Series: Theory and Methods, 2nd edition. Daley/Vere-Jones: An Introduction to the Theory of Point Processes. Dzhaparidze: Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series. Farrell: Multivariate Calculation. Fienberg/Hoaglin/Kruskal/Tanur (Eds.): A Statistical Model: Frederick Mosteller's Contributions to Statistics, Science, and Public Policy. Goodman/Kruskal: Measures of Association for Cross Classifications. Grandell: Aspects of Risk Theory. Härdle: Smoothing Techniques: With Implementation in S. Hartigan: Bayes Theory. Heyer: Theory of Statistical Experiments. Jolliffe: Principal Component Analysis. Kres: Statistical Tables for Multivariate Analysis. Leadbetter/Lindgren/Rootzén: Extremes and Related Properties of Random Sequences and Processes. Le Cam: Asymptotic Methods in Statistical Decision Theory. Le Cam/Yang: Asymptotics in Statistics: Some Basic Concepts. Manoukian: Modern Concepts and Theorems of Mathematical Statistics. Miller, Jr.: Simultaneous Statistical Inference, 2nd edition. Mosteller/Wallace: Applied Bayesian and Classical Inference: The Case of The Federalist Papers. Pollard: Convergence of Stochastic Processes. Pratt/Gibbons: Concepts of Nonparametric Theory. Read/Cressie: Goodness-of-Fit Statistics for Discrete Multivariate Data. Reiss: Approximate Distributions of Order Statistics: With Applications to Nonparametric Statistics. Ross: Nonlinear Estimation. Sachs: Applied Statistics: A Handbook of Techniques, 2nd edition. Seneta: Non-Negative Matrices and Markov Chains. Siegmund: Sequential Analysis: Tests and Confidence Intervals. Tong: The Multivariate Normal Distribution. Vapnik: Estimation of Dependences Based on Empirical Data. West/Harrison: Bayesian Forecasting and Dynamic Models. Wolter: Introduction to Variance Estimation. Yaglom: Correlation Theory of Stationary and Related Random Functions I: Basic Results. Yaglom: Correlation Theory of Stationary and Related Random Functions II: Supplementary Notes and References.

Jan Grandell

Aspects of Risk Theory

Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Barcelona

Jan Grandell Department of Mathematics The Royal Institute of Technology 100 44 Stockholm Sweden Series Editors J. Gani Department of Statistics University of California Santa Barbara, CA 93106 USA

C.C. Heyde Department of Statistics Institute of Advanced Studies The Australian National University GPO Box 4, Canberra ACT 2601 Australia

Mathematics Subject Classification 60Gxx, 60G35 Printed on acid-free paper © 1991 Springer-Verlag New York Inc. Softcover reprint of the hardcover 1st edition 1991

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone. Camera-ready copy provided by the author.

987654321 ISBN-13: 978-1-4613-9060-2 e-ISBN-13: 978-1-4613-9058-9 DOI: 10.1007/978-1-4613-9058-9

Preface

Collective risk theory, as a part of insurance - or actuarial - mathematics, deals with stochastic models of an insurance business. In such a model the occurrence of the claims is described by a point process and the amounts of money to be paid by the company at each claim by a sequence of random variables. The company receives a certain amount of premium to cover its liability. The difference between the premium income and the (average) cost for the claims is the "safety loading." The company is furthermore assumed to have a certain initial capital u at its disposal. One important problem in collective risk theory is to investigate the "ruin probability," i.e., the probability that the risk business ever becomes negative. The simplest model - here called the "classical risk model" - is roughly as follows:

I. The point process is a Poisson process.

II. The costs of the claims are described by independent and identically distributed random variables.

III. The point process and the random variables are independent.

IV. The premiums are described by a constant (and deterministic) rate of income.

The classical risk model can be generalized in many ways.

A. The premiums may depend on the result of the risk business. It is natural to let the safety loading at a time t be "small" if the risk business, at that time, attains a large value and vice versa.

B. Inflation and interest may be included in the model.

C. The occurrence of the claims may be described by a more general point process than the Poisson process.

In the present study we focus exclusively on generalization C. The reason is my personal interest, and not because this is necessarily the most


important generalization. Dassios and Embrechts (1989) and Delbaen and Haezendonck (1987) are very readable studies focusing mainly on generalizations A and B. Furthermore, we consider only Ψ(u), i.e., the probability of ruin within infinite time. Some remarks on ruin within finite time are, however, given in the appendix.

This is a monograph on certain aspects of risk theory and not a textbook in risk theory. The word "aspects" in the title is almost as informative as the words "risk theory." The reader who wants a textbook is recommended to consult Gerber (1979). That book is a fine introduction to risk theory and almost perfect as a prerequisite for this monograph. While writing this monograph I have had two potential readers in mind. The actuary who has a good knowledge of classical risk theory and wants to get acquainted with this kind of generalization. Anyone with a knowledge of risk theory corresponding to Gerber's book is here regarded as an actuary. For the benefit of the actuary several "inserted surveys" are included. The probabilist who - reasonably simply - wants to get an introduction to modern ruin theory. Parts of the surveys on point processes may also be helpful for some probabilists.

Section 1.1 is devoted to the following four basic results, which go back to the pioneering works by Filip Lundberg and Harald Cramér:

$$\Psi(0) = \frac{1}{1+\rho}, \qquad (I)$$

where ρ is the "relative" safety loading;

$$\Psi(u) = \frac{1}{1+\rho}\, e^{-\rho u/(\mu(1+\rho))} \qquad (II)$$

when the claim costs are exponentially distributed with mean μ; the Cramér-Lundberg approximation

$$\lim_{u\to\infty} e^{Ru}\,\Psi(u) = C, \qquad (III)$$

where the Lundberg exponent R is given by a functional equation; the Lundberg inequality

$$\Psi(u) \le e^{-Ru}. \qquad (IV)$$

The Cramér-Lundberg approximation is proved by the aid of a "defective renewal equation" - a technique introduced by William Feller. The Lundberg inequality is proved by a "martingale approach" - introduced by Hans Gerber. Those methods are much simpler than the Wiener-Hopf methods used by Cramér (1955), albeit the results are less general and less detailed. Sections 1.2 and 1.3 deal with "practical calculations" of ruin probabilities and estimation of the Lundberg exponent, respectively. These sections


lie somewhat outside the main theme of the monograph. They are included since - in my opinion - a discussion related to applications naturally belongs in a presentation of risk theory.

In Chapter 2 the exposition of point processes starts. The chapter may be viewed as an introduction to point processes. It is - hopefully - suited to actuaries. One main deficiency in the classical risk model is that the possibility of an increase of the insurance business is not taken into account. Generally that possibility is taken care of by introduction of an "operational time scale." In Section 2.1 the martingale approach to point processes is discussed and a "stochastic operational time scale" is defined with the aid of the "compensator of the point process." The purpose of Section 2.2 is to discuss the choice of the point process describing the occurrence of the claims. That discussion is based on the general theory of point processes. An idea going back to Bertil Almer - one of the Swedish pioneers in risk theory - is taken up, and leads to considerations about thinning of point processes.

A natural - at least from an analytical point of view - generalization of Poisson processes are renewal processes. In Chapter 3 it is shown that (I) - (IV), essentially, hold also in that case. Chapter 3 has a similar relation to the investigations by Olof Thorin as Section 1.1 has to Cramér (1955). Another natural generalization of the Poisson process is the Cox process. A Cox process is a generalization in the sense that stochastic variation in the intensity is allowed. Intuitively we shall think of a Cox process N as generated in the following way: first a realization of an intensity process, i.e., a non-negative random process λ(t), is generated and, conditioned upon its realization, N is a non-homogeneous Poisson process with that realization as its intensity. Cox processes are very natural as models for "risk fluctuation." The generalization also seems natural from the discussions in Chapter 2 about thinning of point processes. Chapter 4 is devoted to risk models where the occurrence of the claims is described by a Cox process. In Section 4.1 analogs to (I) and (II) are studied when the intensity is Markovian. In Section 4.2 the following weaker version of (IV) is proved under general assumptions by martingale methods: there is an R > 0 such that for each

Then X(t) has a drift to +∞. We can now define the ruin probability Ψ(u) of a company facing the risk process (1) and having initial capital u.

DEFINITION 2. Ψ(u) = P{u + X(t) < 0 for some t > 0}.
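As a concrete illustration of this definition (a sketch of my own, not part of the book): for the classical risk process, with surplus u + ct minus the claims paid so far, Ψ(u) can be estimated crudely by simulating many paths up to a long but finite horizon and counting those that ever fall below zero. Ruin occurring after the horizon is missed, so the estimate is slightly biased downwards; the exponential claim sizes and all parameter values below are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def ruin_probability(u, c, alpha, mu, horizon=500.0, n_paths=5000):
    """Crude Monte Carlo estimate of Psi(u): simulate the classical risk process
    u + c*t - (sum of claims), with Poisson(alpha) claim arrivals and exponential
    claim sizes of mean mu, and count the paths that fall below zero before `horizon`."""
    ruined = 0
    for _ in range(n_paths):
        t, surplus = 0.0, u
        while t < horizon:
            w = rng.exponential(1.0 / alpha)   # waiting time until the next claim
            t += w
            surplus += c * w                   # premiums earned since the last claim
            surplus -= rng.exponential(mu)     # claim amount; ruin can only occur here
            if surplus < 0.0:
                ruined += 1
                break
    return ruined / n_paths

alpha, mu, rho = 1.0, 1.0, 0.2
c = (1 + rho) * alpha * mu                     # premium rate with safety loading rho > 0
print(ruin_probability(10.0, c, alpha, mu))
```

For the parameters above, formula (II) of the Preface gives Ψ(10) ≈ 0.157, which the simulated fraction should roughly reproduce.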

It is sometimes convenient to use the non-ruin probability Φ(u) = 1 − Ψ(u).

This means that the tail of dF decreases at least exponentially fast, and thus for example the lognormal and the Pareto distributions are not allowed. Further, the rather pathological case when h(r∞−) < ∞ and h(r) = ∞ for r > r∞ is excluded. The example F′(z) = C·z⁻²·e⁻ᶻ for z > 1 shows that such cases do exist.


1.1 Ruin probabilities for the classical risk process

DEFINITION 6. X is called a classical risk process or a Poisson model if N is a Poisson process.

When nothing else is said, we assume in this section that X is a classical risk process. A simple way to get an equation for Φ is to use a "differential" argument, see, e.g., Cramér (1930, p. 75). Then we consider X(t) in a small time interval (0, Δ] and separate the four possible cases as follows:

1. no claim occurs in (0, Δ],

2. one claim occurs in (0, Δ], but the amount to be paid does not cause ruin,

3. one claim occurs in (0, Δ], and the amount to be paid does cause ruin, and

4. more than one claim occurs in (0, Δ].

From the fact that X(t) has stationary and independent increments we get, provided Φ(u) is differentiable,

$$\Phi(u) = (1 - \alpha\Delta + o(\Delta))\,\Phi(u + c\Delta) + (\alpha\Delta + o(\Delta))\int_0^{u+c\Delta}\Phi(u + c\Delta - z)\,dF(z) + o(\Delta)$$

$$= (1 - \alpha\Delta)\,\Phi(u + c\Delta) + \alpha\Delta\int_0^{u+c\Delta}\Phi(u + c\Delta - z)\,dF(z) + o(\Delta)$$

$$= \Phi(u) + c\Delta\,\Phi'(u) - \alpha\Delta\,\Phi(u) + \alpha\Delta\int_0^{u}\Phi(u - z)\,dF(z) + o(\Delta), \qquad (2)$$

where as usual o(Δ) means that o(Δ)/Δ → 0 as Δ → 0. Thus we get

$$\Phi'(u) = \frac{\alpha}{c}\,\Phi(u) - \frac{\alpha}{c}\int_0^{u}\Phi(u - z)\,dF(z). \qquad (3)$$

The derivation of (3) is certainly not mathematically satisfying. Although Cramér (1955, pp. 60 - 61) gives a stringent "version" of the differential argument, we shall consider another approach. Following Feller (1971, p. 183) we shall now derive (3) by a "renewal" argument. Let S₁ be the epoch of the first claim. Then we have X(S₁) = cS₁ − Z₁. Since the Poisson process is a renewal process and since ruin cannot occur in (0, S₁) we have

$$\Phi(u) = \int_0^{\infty}\alpha e^{-\alpha s}\int_0^{u+cs}\Phi(u + cs - z)\,dF(z)\,ds.$$

The change of variables x = u + cs leads to

$$\Phi(u) = \frac{\alpha}{c}\,e^{\alpha u/c}\int_u^{\infty}e^{-\alpha x/c}\int_0^{x}\Phi(x - z)\,dF(z)\,dx.$$

Consequently Φ is differentiable and differentiation leads to (3). Integrating (3) over (0, t) yields

$$\Phi(t) - \Phi(0) = \frac{\alpha}{c}\int_0^{t}\Phi(u)\,du - \frac{\alpha}{c}\int_0^{t}\!\int_0^{u}\Phi(u-z)\,dF(z)\,du$$

$$= \frac{\alpha}{c}\int_0^{t}\Phi(u)\,du + \frac{\alpha}{c}\int_0^{t}\Big[\Phi(0)(1-F(u)) - \Phi(u) + \int_0^{u}\Phi'(u-z)(1-F(z))\,dz\Big]\,du$$

$$= \frac{\alpha}{c}\,\Phi(0)\int_0^{t}(1-F(u))\,du + \frac{\alpha}{c}\int_0^{t}(1-F(z))\,dz\int_z^{t}\Phi'(u-z)\,du$$

$$= \frac{\alpha}{c}\,\Phi(0)\int_0^{t}(1-F(u))\,du + \frac{\alpha}{c}\int_0^{t}(1-F(z))\big(\Phi(t-z) - \Phi(0)\big)\,dz.$$

Thus we have

$$\Phi(u) = \Phi(0) + \frac{\alpha}{c}\int_0^{u}\Phi(u-z)(1-F(z))\,dz. \qquad (4)$$

By monotone convergence it follows from (4), as u → ∞, that

$$\Phi(\infty) = \Phi(0) + \frac{\alpha\mu}{c}\,\Phi(\infty). \qquad (5)$$

It follows from the law of large numbers that lim_{t→∞} X(t)/t = c − αμ with probability one. In the case of positive safety loading, c > αμ, there exists a random variable T, i.e., a function of N and {Z_k}, such that X(t) > 0 for all t > T. Since only finitely many claims can occur before T it follows that inf_{t>0} X(t) is finite with probability one and thus Φ(∞) = 1. Thus 1 = (1 − Ψ(0)) + αμ/c, or

$$\Psi(0) = \frac{\alpha\mu}{c} = \frac{1}{1+\rho} \qquad \text{when } c > \alpha\mu. \qquad (I)$$

This is an insensitivity or robustness result, since Ψ(0) only depends on ρ and thus on F only through its mean.

EXAMPLE 7. EXPONENTIALLY DISTRIBUTED CLAIMS. Consider the simple case, illustrated in Figures 2 and 3, when Z_k is exponentially distributed. Then (3) is reduced to

$$\Phi'(u) = \frac{\alpha}{c}\,\Phi(u) - \frac{\alpha}{c\mu}\int_0^{u}\Phi(u-z)\,e^{-z/\mu}\,dz = \frac{\alpha}{c}\,\Phi(u) - \frac{\alpha}{c\mu}\int_0^{u}\Phi(z)\,e^{-(u-z)/\mu}\,dz.$$
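Equation (4) also lends itself to direct numerical solution. The sketch below (my own illustration, not from the book; all parameter values arbitrary) marches (4) forward on a grid for exponentially distributed claims, starting from Φ(0) = ρ/(1+ρ) as given by (I), and compares the resulting ruin probability with the explicit formula (II) quoted in the Preface.

```python
import numpy as np

alpha, mu, rho = 1.0, 1.0, 0.2            # claim intensity, mean claim size, safety loading
c = (1 + rho) * alpha * mu                 # gross risk premium rate

h, n = 0.01, 3000                          # grid step and number of grid points
u = np.arange(n + 1) * h
tail = np.exp(-u / mu)                     # 1 - F(z) for exponential claims with mean mu

# March the Volterra equation (4) forward:
#   Phi(u) = Phi(0) + (alpha/c) * int_0^u Phi(u - z) (1 - F(z)) dz
phi = np.zeros(n + 1)
phi[0] = rho / (1 + rho)                   # Phi(0) = 1 - Psi(0), with Psi(0) = 1/(1+rho) from (I)
for k in range(1, n + 1):
    # trapezoidal rule; the unknown phi[k] multiplies tail[0] and is moved to the left-hand side
    known = h * (phi[k - 1:0:-1] @ tail[1:k] + 0.5 * phi[0] * tail[k])
    phi[k] = (phi[0] + alpha / c * known) / (1.0 - alpha / c * 0.5 * h * tail[0])

psi_numeric = 1.0 - phi
psi_exact = np.exp(-rho * u / (mu * (1 + rho))) / (1 + rho)   # formula (II)
print(abs(psi_numeric - psi_exact).max())                      # small discretization error
```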


Differentiation leads to

All processes which we consider have right continuous trajectories and the filtrations are so simple that the condition of right continuity is of no problem.

DEFINITION 13. A random variable T: Ω → [0, ∞] is an F-stopping time if {T ≤ t} ∈ F_t for each t ≥ 0.

This means that, knowing the history up to time t, one can decide if T ::; t or not. Note that the outcome T = 00 is allowed. If T is a stopping time, so is t t\ T = min(t, T) for each t. The following simplified version of the "Optional Stopping Theorem" is essential for our applications.


THEOREM 14. Let T be a bounded stopping time, i.e., T ≤ t₀ < ∞, and M a right continuous F-martingale (F-supermartingale). Then

E[M(T)] = (≤) M(0).


0; some r> o.

where

< 00 for

Then for some function g(.).

If Y is a classical risk process with positive safety loading we have /3 = C -

CiJ-L. Further, we have E[e-rY(t)] = e- rct

L 00

(

t)k

Cik ! e-at(h(r)

+ l)k

k=O

= e-rct+at(h(r)+l)-at = et(ah(r)-rc) and thus g(r) = Cih(r) - rc. Note that Y may also, for example, be the risk process corresponding to life annuity insurance or the Wiener process with positive drift. Let Tu be the time of ruin, i.e.,

2': 0 I u + Y(t) < OJ.

Tu = inf{t

Obviously Tu is a FY-stopping time and note that w(u) Put e-r(u+Y(t)) Mu(t) etg(r)

= P{Tu


1

one gets F'(z) z(1/fJ)-l e-z/fJ - (31/fJ r(l/{3)

which is a r-distribution with JL 1973, p. 118)

z

= 1 and (1'2 = {3.

>0 Then we have (Thorin

p(l - (3R)e- Ru

w(u) + -1+(1+p)(R+{3R-1) .

p . 7r -sm7r{3 {3

1

x1/fJe-(11:+1)u/fJ dx

00

o

{x1/fJ

(26)

[l+(l+P)o/] -COSjr +sin2j'

where R is the positive solution of (1- {3r)-l/ fJ -1 = (1 + p)r for r < 1/{3. Note that the first term in (26) is the Cramer-Lundberg approximation. The second term can be used in order to obtain an upper bound for the error in the Cramer-Lundberg approximation. From Grandell and Segerdahl (1971, p. 147) it follows that for p = 0.1 the error is less than 10- 6 as soon as u > 7.26· {3. For (3 < 1 the r-distribution F does not have the representation (25). Thorin (1986) has given an expression for w(u), analogous to (26), also in this case. That expression is - in general - an extension of (26).

o

1.2 "Practical" evaluation of ruin probabilities

15

An alternative to numerical calculations is, of course, simulation. The straightforward simulation of w(u), by running N replicates of X(t) and calculating the fraction of runs with ruin will, in general, require an enormous number of random numbers. For a much more promising method of simulation of ruin probabilities we refer to Asmussen (1985). It is very natural to try to find "simple" and "good" approximations of w(u). Several approximations have been proposed. Some of them are more or less ad hoc and their merits can only be judged by numerical comparison. Others are based on limit theorems, and the limit procedure may give hints on their applicability. In that case numerical comparison may be needed in order to get information about the speed of convergence. The most famous approximation is, of course, the Cramer-Lundberg approximation which is good for large values of u and therefore small values of w(u). Practically it is somewhat difficult to apply, since it requires full knowledge of the claim distribution. The word "ruin" may sound very "dramatic," and one may imagine "old widows starving because they don't get their pension." Certainly it is more realistic to interpret "ruin" as a technical term meaning that some not too dramatic economical procedure must be done in the insurance company. Therefore it may be interesting to look for approximations which work for less small values of w(u). One way to express this is, if the CramerLundberg approximation is regarded to be related to "large deviations," to look for approximations related to "the central limit theorem." Therefore we shall consider diffusion approximations where the idea is to approximate the risk process with a Wiener process with drift. Mathematically such approximations are based on the theory of weak convergence of probability measures. Standard references well suited for our applications are Billingsley (1968) and Lindvall (1973).

BASIC FACTS ABOUT WEAK CONVERGENCE

Let D be the space offunctions on [0,00) that are right-continuous and have left-hand limits. Endowed with the Skorohod J 1 topology, D is a Polish space, i.e., separable and metrizable with a complete metric. A stochastic process X = {X(t); t ~ O} is said to be in D if all its realizations are in D. The distribution of X is a probability measure P on D. Let X, Xl, X 2 , .•• be processes in D. We say that Xn converges in distribution to X, and we write Xn ~ X, if E[f(X)] -+ E[f(X)] for all bounded and continuous realvalued functions f on D. Convergence in distribution of Xn to X implies, for example, and that infoo X(t)

Note that X(t) = ct - Set) and

-+ 00.

< -u}. Define Yn by

y. ( ) _ cnnt - Sent) n

t -

Vii

'

which means that we let the gross risk premium rate depend on n, and Y by and put Pn =

Cn;;;IJ.

Thus Y is a Wiener process with drift. Since

- aJ.tnt _ VaJ.t . / (2 +(1" 2) S-nt ( ) y.nt ( ) -- cnnt Vii = PnaJ.tVii t - Ja(J.t2 + (1"2) Sn(t) it follows (Grandell 1977, p. 52) that Yn ~ Y as n

-+ 00

if and only

if PnVii -+ , . It also follows from Grandell (1978) that inft>o Yn(t) ~ inft>o yet) and thus -

P{infYn(t) < -y} t~O

-+

P{infY(t) < -y}. t~O

Obviously P{inft~o Yn(t) < -y} = w(yVii) , with relative safety loading Pn. Further, we have

E[e-rY(t)]

= etg(r) = et[-"),al'r+a(1'2+0"2):r 1 r2

and thus, cf. (21) and Example 8 continued, we get the well-known result ~

P{infY(t) < -y} = e-Y"Y 1'2+0"2. t~O

1.2 "Practical" evaluation of ruin probabilities

17

This leads to the diffusion approximation

$$\Psi(u) \approx \Psi_D(u) = e^{-2\rho\mu u/(\mu^2+\sigma^2)} \qquad (27)$$

if p is small and U is large in such a way that U and p-l are of the same order. In queuing theory (27) is known as the "heavy traffic approximation." The relation between risk theory and queueing theory will be discussed in Remark 5.l. REMARK 17. When Assumption 4 holds, one can make a comparison with the Lundberg inequality (IV). Then we have

C~=h(R)~Jl.R+Jl.2~(72R2

or

....

R 0:j1., i.e., we have positive safety loading;

Consider the regular case and recall that R is the positive solution of h(r) = crlo:. It is practical to introduce the function g(r), defined by

g(r)

cr

= h(r) -

def

-

0:

=

1

00

0

cr

erz dF(z) - 1 - - , 0:

since then R is the positive solution of g(r) = o. Consider the risk process X(t) for t E [0, T] and define the random process 1

N(T)

GT(r) = N(T) ( ; erZk

-

1 - cr

(N(T))-1

r-

if N(T) >

o.

Replacing X(t) by an observation x(t) we can form the corresponding function gT(r). If x(T) > 0 and if at least one claim has occurred, this function has the same properties as g(r) and a natural estimate of R is given by the positive solution R* of gT(r) = o. In order to study the properties of R* we define the random variable RT as the positive solution of GT(r) = 0 when such a solution exists. REMARK 23. There is always a positive probability that

N(T)

c

1

~ r- N(T)

N(T)

L

},:=1

Z},:,

i.e., that X(T) ~ O,or that N(T) = o. In those cases we put RT = 0 and RT = +00, respectively. We make the corresponding convention for R*, although it is hardly necessary. In practice no one will try to make any estimation before claims have occurred. Further, if x(T) ~ 0 the company has probably more acute problems than statistical estimation, or wants to consider the ruin probability for a higher gross risk premium.

o

Our basic result is the following theorem. THEOREM 24. In the regular case

vT(RT - R) ~ Y

asT

-+

00,

where Y is a normally distributed random variable with E[Y] = 0 and (J"2

y

~f Var[Y]

=

g(2R) 0:(g'(R))2

=

h(2R) - 2cRlo: . o:(h'(R) - clo:P

Before proving the theorem we shall give a lemma.


LEMMA 25. In the regular case

GT(R) RT-R

~ -g'(R) P-a.s. as T ~ 00.

PROOF OF LEMMA 25: The proof is similar to the proof of asymptotic normality of maximum likelihood estimates given by Cramer (1945, pp. 500 - 503). All statements about random quantities are meant to hold P-a.s. Since N(T)IT ~ a as t ~ 00 and since E[e rZk ] < 00 for r ~ 2R it follows for r < 2R that GT(r) ~ g(r) and G~(r) ~ g'(r) as T ~ 00. In the regular case g'(R) > O. Choose f E (0, R) such that g'(R - f) > O. For T (depending on the realization of X(t) and on f) large enough we have GT(R - f) < 0, GT(R + f) > 0, and Gt(R - f) > O. Thus IRT - RI < f. Now GT(R) GT(R) - GT(RT) -(RT - R)G~(R + ()T(RT - R» for some ()T E (0,1). Thus

=

=

GT(R) , RT _ R = -GT(R+ ()T(RT - R»,

=

= R and then we just

provided GT(R) t o. If GT(R) 0 we have RT define the ratio as -Gt(R). Since N(T)

G"(r) = _1_ ""

N(T)

T

it follows that

~

G~(r)

~

k=1

>0

Z2erZk

k

is increasing in r and we have

Ig'(R - f) - g'(R)1

+ Ig'(R + f) -

g'(R)1

which can be made arbitrarily small by choosing

f

asT~oo

small enough. I

PROOF OF THEOREM 24: We have

GT(R) Let

1

N(T)

= GT(R)-g(R) = N(T)

(; eRZk -l-h(R)-cR

(T 1) N(T) - ~ .

=

denote the epoch of the kth claim and put So O. The variables are independent and exponentially distributed with mean 1/a. We have Sk

S1 - So, S2 - S1, S3 - S2, ...

N(T)

T =

SN(T)

+ (T -

SN(T»

=

L

(Sk - Sk-t)

+ (T -

SN(T»

k=1

and thus

T 1 1 N(T) N(T) - ~ = N(T) {;

1)

( (Sk - Sk-t) -

~

+

T - S N(T) N(T)


Thus

VT GT(R) =

fr1 yN(f)' -IN(T) .

I)}

N(T)

~ [{ (e rZ k L..,.;

-

1 - h(R» - cR ( (Sk - Sk-I) - ~

- cR T-SN(T)] N(T) .

k=l

Now N(T)/T -+ a and T-SN(T) ~ an exponentially distributed random variable. The random variables

{[e rZk

-

1 - h(R)] - CR[(Sk - Sk-d - l/a]},

k = 1, 2, ...

are independent with means zero and variances h(2R)

a

2cR + 1- (h(R) + 1)2 + ( CR)2 = h(2R) - -a= g(2R).

From all this and the classical generalization of the central limit theorem to sums of a random number of random variables, see, e.g., Renyi (1960, p. 98), it follows that

VT GT(R) ~

If

[v'g(2R). W

+ 0],

where W is a normally distributed random variable with mean zero and varIance one. Since 0 < g'(R) < 00 it follows from Lemma 25 that rr;:. rr;:. RT - R vT (RT - R) = vT GT(R). GT(R)

d

-+

1 Vfl~ V g(2R) . g'(R) .W r:::;;;r)\

which equals Y in distribution .• Theorem 24 can be used to form confidence intervals for R. In practice oy is unknown and we have to replace it by its natural estimate u

*

v'gT(2R*) - --=:::==-...:.----':... -IQ*gT(R*) '

Y -

where a* = n(T)/T. A one-sided approximate 95% confidence interval for R is thus given by ( R* _ 1~Y,

00 )

.

This interval leads us to the following empirical Lundberg inequality

w(u) :::;

e-(R*-1.6qY/VT)u

(36)

which holds for all u in approximately 95% of all investigations. In many situations we may be more interested in an estimate of the ruin probability than in an' inequality. When our interest is in large values of


u it is natural to use the Cramer-Lundberg approximation (III), which in the notation used here is given by c- up.

where C = ugl(R)

for such an estimate. A natural estimate of p. is p.* = (cT - x(T))/n(T) and thus a natural estimate of Cis C* = x(T)/[n(T)gT(R*)]. Define the estimate w*(u) and the random variable WT(U) by

>T'*( ) "" u and

x(T) = n(T)gT(R*)

def

e

-Rou

= C* e -Rou

def X(T) R R WT(U) = N(T)GT(RT) e- TU = CT e- TU.

Consider the "relative error"

£r (u) defined by

£T(U) = WT(U) - w(u) = WT(U) . WCL(U) _ 1 w(u) WCL(U) w(u) and note that £T (u) is a random variable "containing" both the error in the Cramer-Lundberg approximation and the "random" error. Thus we have

CT 10g(£T(u) + 1) = loge - U(RT - R)

+ log

WCL(U) w(u) .

Since 10g(CT/C) -+ 0 as T -+ 00 P-a.s. and 10g[wcL(u)/w(u)] -+ 0 as U -+ 00 it is natural to let T -+ 00 together with U in such a way that u/VT -+ it E (0,00). From Theorem 24 we then get as T

-+

(37)

00.

In the same way as (36) follows from Theorem 24 it follows from (37) that (38) is an approximate 95% confidence interval for w(u) when u and VT are of the same large order. As we have mentioned, the ruin probabilty highly depends on the "tail" of F(z) for large values of u. The larger T is the more information we get about the "tail," is formalized by the requirement that u and must be of the same order. Because of the construction of (38) we may consider all u larger than some Uo simultaneously without changing the level, provided that Uo and VT are of the same large order. To realize this we consider the random variable

VT

sup

u~uo

IVT 10g(£T(u) + 1)1 u


-IT CT ~ -IT WCL(U) = sup - l o g - - vT(RT - R)+ -log W( )

I

IVr

I

U~Uo

I

U

CTI ::; sup -log-c U~Uo

U

as T -+ 00, Thus

C

U

~ + IvT(RT -

R)I

+ sup

U~Uo

U

IVr

-log WCL(U) W() U

U

d -+

WI

Uo -+ 00, and u/-IT -+ Uo E (0,00).

0.95::::::

p{1 '7 10g(tT(u) + 1)1::; 2o"y

for all

U

~ uo}

= P{lIogW(u) -log(CTe-RTU)1 ::; 2uo"y/Vr for all u ~ uo} and it follows that all

u ~ Uo

may be considered simultaneously.

EXAMPLE 26. Consider the case when Zk is exponentially distributed. Then heR) < 00 for c < 2aJ1. or p < 100%. In that case we have

uf =

2

-..."...,..,..-----,..."...,..,..---,aJ1.2(1 + p)2(1 - p) .

It is natural to ask what happens if p > 100%. In this case Theorem 24 does not hold any more. Lemma 25 does, however, still hold. From the lemma and from the theory of stable distributions, see Feller (1971, pp. 570, 577, 581), it follows that

as T

-+

00,

where Yp has a stable distribution with exponent (1 + p)/ p. The characteristic function for Yp can be calculated, but it is so complicated that the result is of no practical interest.

o

REMARK 27. The fact that R* is the positive solution gT(r) = 0, where 1

gT(r) = - neT)

L

n(T} k=l

erzk

-

1- cr

(

(T))-l

_n_ T

,

may be regarded as a practical drawback of this method of estimation since the numerical problems of computing R* may be expected to be considerable. Rosenlund (1989) has applied this method of estimating R on real claim statistics, consisting of 182,342 claims, at the Swedish insurance company "Liinsf6rsiikringsbolagen." He solved the equation gT(r) = 0 with the secant method, i.e., the Newton-Raphson method with the derivative replaced by a difference ratio. With 9 computations of gT(r) the total CPU time on an IBM 3090 was only 14.6 sec. Thus the numerical problems are almost negligible.
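A sketch of this estimation procedure (my own illustration - the claims below are simulated, not the statistics referred to above): the claims z₁, …, z_n observed on [0, T] together with the premium rate c define the empirical function g_T(r), and R* is its positive root. Any bracketing root-finder does the job of the secant iteration mentioned above; brentq is used here.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(2)

# Simulated observation: Poisson(alpha) claims on [0, T] with exponential amounts of mean mu
alpha, mu, rho, T = 1.0, 1.0, 0.2, 2000.0
c = (1 + rho) * alpha * mu
n = rng.poisson(alpha * T)
z = rng.exponential(mu, size=n)

def g_T(r):
    # empirical version of g(r) = h(r) - c*r/alpha used in Section 1.3
    return np.exp(r * z).mean() - 1.0 - c * r * T / n

# g_T(0) = 0 and g_T is convex, so bracket the positive root and solve
upper = 0.5 / mu                        # heuristic bracket end; enlarge if g_T(upper) < 0
R_star = brentq(g_T, 1e-8, upper)
print(R_star)                            # the true R here is rho/(mu*(1+rho)) ≈ 0.167
```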

o


Let us consider the simplified, but less natural, situation where a is known and where the first n claims are observed. Then we consider the random process 1 Gn ( r ) -- n

l:n erZ" -1- cr a

k=1

and define Rn as the positive solution of Gn(r) purpose, the random variable Hk(r) by

Hk(r) = erZ"

- 1-

= o.

Define, for future

~ a

and note that

E[Hk(r)] = g(r)

and

O"i

def

= Var[Hk(R)] = g(2R) -

(

cR

~)

By obvious modifications of the proof of Theorem 24 it follows that r= d O"H as n -+ 00. V n(Rn - R) -+ - - . W g'(R)

2

,

(39)

Herkenrath (1986) considers estimation of R as a stochastic approximation problem and proposes a modified Robbins-Monro procedure for its solution. The idea behind this approach can roughly be described in the following way. Let flo be a starting value, or an "initial estimate" of R' when no claim has occurred. Let Rk be the estimate based on the first k claims. When the (k + l)th claim Zk+1 occurs, we want to form Rk+1 recursively, i.e., Rk+1 shall depend only on Rk and Zk+1. Consider now the function g(r). We know that

g(O)

= 0,

g(r) < 0 for 0 < r < R, 0< g(r)

g(R)

=0

and

< 00 for R < r < roo.

Assume that we can find, or believe in, an interval [Rrnin, Rrnax] such that

o < Rrnin < R < Rrnax < roo, to which the estimates are restricted. It then seems natural, forgetting for the moment about the restriction to [Rrnin, Rrnax], to put

= Rk - ak H k+1(Rk). 0 for r < ( » R it is natural to require that Rk+1

Since g( r) < ( » ak > o. Further, the "additional" information in Zk+1 compared to Rk decreases and thus it is natural to require that ak '\. 0 as k -+ 00. One such choice, which works well, is to choose ak = ajk where a > o. Finally we shall restrict the estimates to [Rrnin, Rrnax] and we are led to if Rk - ~Hk+1(Rk) < Rrnin

Rrnin Rk+1

=

{

Rk+1

Rrnax

= Rk -

~~k+1(Rk) if Rrnin 5 ~k - ~Hk+1(~k) 5 Rrnax . if Rrnax < Rk - JiHk+1(Rk)


Under some additional conditions on g(r) we have (Sacks 1958, p. 383)

'-(Rn -

yn

R)

d --+

aU'H

J2ag'(R) _ 1 .

W

as n

--+ 00

(40)

provided that a > 1/(2g'(R)). It is easily seen that a = l/g'(R) is (asymptotically) optimal. In that case, see (39) and (40), Rn and Rn have the same asymptotic behavior. Thus the estimate R* seems preferable compared to Rn. In our opinion this conclusion is intuitively natural, but it is not in agreement with the conclusions drawn by Herkenrath (1986). His conclusions are based on a simulation study.
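For completeness, a sketch of the modified Robbins-Monro recursion described above (my own illustration, with simulated exponential claims and α treated as known; R_min, R_max and the gain constant a are user-chosen):

```python
import numpy as np

rng = np.random.default_rng(3)

alpha, mu, rho = 1.0, 1.0, 0.2
c = (1 + rho) * alpha * mu
R_min, R_max = 0.01, 0.5            # interval assumed to contain R
a = 5.0                              # gain constant; the theory needs a > 1/(2*g'(R))
R = 0.5 * (R_min + R_max)            # starting value R_0

def H(r, z):
    # H_k(r) = exp(r*Z_k) - 1 - c*r/alpha, an unbiased observation of g(r)
    return np.exp(r * z) - 1.0 - c * r / alpha

for k in range(1, 100001):
    z = rng.exponential(mu)          # the k-th claim amount
    R = R - (a / k) * H(R, z)        # Robbins-Monro step with decreasing gain a/k
    R = min(max(R, R_min), R_max)    # project back onto [R_min, R_max]

print(R)                              # the true R here is rho/(mu*(1+rho)) ≈ 0.167
```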

CHAPTER 2

Generalizations of the classical risk model

There are certainly many directions in which the classical risk model needs generalization in order to become a reasonably realistic description of the actual behavior of a risk movement. We shall, almost solely, consider generalizations where the occurrence of the claims is described by point processes other than the Poisson process. This restriction is more a reflection of our personal interest than an ambition to cover the most important aspects of risk theory. There are, at least, two very different reasons for using other models for the claim occurrence than the Poisson process. First the Poisson process is stationary, which - among other things - implies that the number of policyholders involved in the portfolio cannot increase (or decrease). Few insurance managers would accept a model where the possibility of an increase of the business is not taken into account. We shall refer to this case as size fluctuation. Second there may be fluctuation in the underlying risk. Typical examples are automobile insurance and fire insurance. We shall refer to this as risk fluctuation.

2.1 Models allowing for size fluctuation

The simplest way to take size fluctuation into account is to let N be a non-homogeneous Poisson process. Let Λ(t) be a continuous non-decreasing function with Λ(0) = 0 and Λ(t) < ∞ for each t < ∞.

=

A point process N is called a (non-homogeneous) Poisson process with intensity measure A if

DEFINITION 1.

(i) (ii) REMARK

N(t) has independent increments; N(t) - N(s) is Poisson distributed with mean A(t) - A(s). 2. The function A(t) can be looked upon as the distribution


function corresponding to the measure A. The continuity of A(·) guarantees that N is simple, i.e., that N(.) increases exactly one unit at its epochs of increase.

D Define the inverse A -1 of A by

A- 1(t) = sup(s I A(s) :::; t).

(1)

A-1 is always right-continuous. Since A(.) is continuous, A-l is (strictly) increasing and for t DEFINITION 3. A Poisson process process.

N with a =

< A(oo).

1 is called a standard Poisson

The following obvious results are, due to their importance, given as lemmata. LEMMA

A(oo) = process.

4. Let N be a Poisson process with intensity measure A such that 00. Then the point process N ~f N 0 A-1 is a standard Poisson

PROOF: Since A-1 is increasing it follows that N has independent increments. Further, N(t) - N(s) = N(A-1(t)) - N(A-1(s)) is Poisson distributed with mean A 0 A-1(t) - A 0 A-1(s) = t - s .• LEMMA

N def = N-

5. Let N be a standard Poisson process. Then the point process A'IS a P' . . measure A . Olsson process WIt. h mtenslty

0

The proof is omitted. Without much loss of generality we may assume, although it is not at all necessary, that A has the representation

A(t) =

lot a(s) ds,

(2)

where a(·) is called the intensity function. It is natural to assume that a(s) is proportional to the number of policyholders at time s. When the premium is determined individually for each policyholder it is also natural to assume the gross risk premium to be proportional to the number of policyholders. If the relative safety loading p is constant we get c(t) (1 + p)p,a(t) and the corresponding risk process is given by, see (1.1), N(t)

X(t) = (1 + p)p,A(t) -

L

Zk,

k=l

where N is a Poisson process with intensity measure A such that A( 00) = 00.
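Lemma 5 also tells us how to simulate such a process: generate a standard Poisson process on [0, Λ(T)] and map its epochs back through the inverse operational time scale Λ⁻¹. A minimal sketch (my own illustration; the intensity function a(s) below is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)

def Lam(t):
    # intensity measure Lambda(t) = int_0^t a(s) ds for a(s) = 1 + 0.05*s (a growing portfolio)
    return t + 0.025 * t * t

def Lam_inv(x, T):
    # inverse of the continuous increasing function Lam, found by bisection on [0, T]
    lo, hi = 0.0, T
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if Lam(mid) < x else (lo, mid)
    return 0.5 * (lo + hi)

def claim_epochs(T):
    """Epochs on [0, T] of a Poisson process with intensity measure Lambda, obtained by
    mapping the epochs of a standard Poisson process through Lambda^{-1} (Lemma 5)."""
    total = Lam(T)
    gaps = rng.exponential(1.0, size=int(2 * total) + 20)   # more than enough standard epochs
    std_epochs = np.cumsum(gaps)
    std_epochs = std_epochs[std_epochs <= total]
    return np.array([Lam_inv(x, T) for x in std_epochs])

print(claim_epochs(10.0)[:5])
```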


Consider now the process XCt)

=X

def

0


X defined by A- 1 (t)

= (1 + p)J.Lt -

N(t) "

L.J Zk.

k=l

Thus

X is a classical risk process with a

= 1. Recall that

w(u) = P{inf X(t) < -u}. t~O

If A(.) is increasing, or if aCt) > 0, A-I is continuous and it is obvious that inft>o X(t) = inft> 0 X(t). Here it would only be a minor restriction to ass~me that A(·) isincreasing, but for the further discussion we do not want to make that restriction. Suppose that A-I has a jump at t. In the time interval (A-1(t-), A-1(t)) no claims occur, since N(A-1(t)) - N(A-1(t-)) is Poisson distributed with mean AoA-1(t) -AoA-1(t-) = t - (t-) = 0, and no premiums are recieved. Thus inft> 0 X(t) = inft> 0 X(t) and the problem of calculating the ruin probability is brought back to the classical situation. The time scale defined by A-I is generally called the operational time scale, see, e.g., Cramer (1955, p. 19). We have referred to this generalization as "size fluctuations," only because then the gross risk premium rate c(t) = (1 + p)J.La(t) is very natural. Obviously it is mathematically irrelevant why a(·) fluctuates, as long as those fluctuations are compensated by the premium in the above way. We shall now see that a kind of operational time scale can be defined for a very wide class of point processes. Those processes may very well more naturally correspond to "risk fluctuation" than to "size fluctuation." Before discussing this wide class we shall introduce Cox processes which are very natural as models for "risk fluctuation." In the sequel they will play an important role, although we shall here merely use them as an illustration.

=

=

DEFINITION 6. A stochastic process A {A(t); t ~ O} with P-a.s. A(O) 00 for each t < 00 and non-decreasing realizations is called a

0, A(t)
0 JJ{(O, t]} { if t = 0 . JJ(t) = 0 -JJ{(t,O]} ift < 0 The same notation will be used for the measure and its distribution function. Let M denote the set of Borel measures.

DEFINITION 21. Let JJ, JJl, JJ2, ... E M be given. We say that JJn converges vaguely to JJ and write JJn -+ JJ if JJn(t) -+ JJ(t) for all t E R such that JJ(.) is continuous at t. Endowed with the vague topology, i.e., the topology generated by vague convergence, M is a Polish space. Denote by B(M) the Borel algebra on M. Further, B(M) equals the IT-algebra generated by projections, 1.e.,

B(M)=IT{JJ(t)-=hm - , - - 1 0:

0+

s


1°=

we get

x dB(x) =

1= 0+

dA(s) -
1.

=

This result is - in our opinion - very interesting since it concerns an important renewal process and since it illustrates that transition between the "extreme" classes of Cox and top processes is not "continuous." Further it was very surprising - at least to the author - that such a simple renewal process can be a Cox process. Note that the Cox process cannot be in the - for Cox processes - natural class "b > 0 and 0 < c < 00." Therefore we shall consider this example in some detail.

O 00. Thus it follows from Theorem 40 that N is a top process. This result was first proved by Yannaros (1985) in the case l' = 2, 3, ... and generalized to arbitrary l' > 0 by Kolsrud (1986). Both these proofs are quite different from the one given here.

DO The claims faced by an insurance company is, of course, the sum of all claims caused by the policyholders. To policyholder number k we can associate an individual point process Nk which describes the epochs of the claims of that policyholder. In pure life insurance one "claim" can occur at most, namely the death of the policyholder. Thus Nk(t) is equal to o or 1. In non-life insurance the individual point processes may be more complicated.


Assume now that the individual point processes N l , N2, ... are independent. The point process N is thus the sum of these individual point processes. Relying on Theorem 31 it then seems natural to assume that N is a Poisson process with some intensity measure J..l. If we can disregard seasonal variation and other kinds of temporal variation, and if the variation of the number of policyholders involved in the portfolio is taken care of in the individual point processes, it is natural to put J..l proportional to the Lebesgue measure. Thus we have a motivation for the classical risk model. In some cases we may have a "direct" dependence between the individual point processes. With "direct" dependence we mean that a claim in one individual point process causes claims, or affects the probability of claims, in other individual point processes. As examples we may think of contagion and accidents in life and sickness insurance, the spread of fire to several buildings in fire insurance, and so on. We shall soon give an argument, where )lothing is assumed about how N is built up by individual point processes. Another kind of dependence may be called "indirect" dependence. We then think of cases where the whole risk situation may vary with variations in the environment. In, for example, automobile insurance important parts of the environment are weather conditions and traffic volume. If the individual point processes are independent conditioned upon the environment, we can again rely on Theorem 31 and it seems natural to assume that N is a Cox process. This is the reasoning we had in mind when we claimed that Cox processes are very natural as models for "risk fluctuation." Consider now pure life insurance, where the only random quantity is the time of death of the policyholder. In a rich country like Sweden few deaths are directly caused by infectious deseases. This is, at least now, still true if we take AIDS into account. Also big accidents, like plane and train accidents, cause few deaths compared to the total number of deaths. Thus the direct dependence between the individual point processes seems to be almost negligible. Further, there is no famine and - more due to the geographical position - no serious nature catastrophes. Thus the indirect dependence between the individual point processes also seems to be almost negligible. (We have consciously disregarded armed conflicts and wars, since those probably cannot be taken into account in a model. At least in a war the solvency of insurance companies is a minor problem.) Finally, we disregard from possible "seasonal" variation in the death frequency. Thus the classical model seems to work well for pure life insurance in rich countries. On the other hand, risk theoretic considerations are probably not too interesting in this case, since fluctuations in the interest and other economical variation are more important to the insurance company than the random variation of the risk business. Now we consider N but make no assumptions about the individual point processes and the relation between them. We shall now exploit an idea


which goes back to Almer (1957) and consider claims as caused by "risk situations" or incidents. To each incident we associate a claim probability P and we assume that incidents become claims independent of each other. Under these assumptions the point process describing the incidents is the pinverse of the "claim process" N and will therefore be denoted by D; 1 N. A rather general and realistic way to apply these ideas is to let the incidents be the "claims" in a population and P the proportion of the population insured in the insurance company under consideration. This indicates that it is highly unnatural to choose N among top processes. Anyone who has driven a car has certainly experienced incidents and, hopefully, only few of them have resulted in accidents. This is probably the every day use of the word "incident." Let us therefore again consider automobile insurance. Suppose we can specify the concept "incident" and a claim probability p. One problem is that the incidents must be so generally defined that all, or at least almost all, claims can be associated with an incident. Generally this means that P will be small. In principle we may have a series of definitions of "incidents" and a corresponding series of probabilities PI, P2, ... such that limn-+co Pn = o. Then it "follows" that N is a Cox process. Certainly this argument is very speculative and must not be taken too seriously. If every overtaking, every braking, every curve, and so on is regarded as an incident w¢ may look upon the "incident process" more like an intensity than a point process. Then it is highly reasonable to believe that the claim probability P depends on the environment and we are back in the reasoning about "risk fluctuation." In spite of all reservations, this "incident" argument, in our opinion, indicates that it is natural to choose N among Cox processes. From an analytical point of view it is natural to generalize the classical risk model to the "renewal model," i.e., where the occurrences of the claims are described by a renewal process. A natural characteristic of the interoccurrence time distribution is the coefficient of variation CV, defined by

CV = standard deviation / mean.

a Cox process if CV :2: 1 a top process if CV < 1.

Thus, by our arguments, the use of a f-renewal process with CV :2: 1 might be natural. Its representation as a Cox process gives, however, no information if, or when, it is a reasonable model. Another possible choice, which has been used, is to let /{D be a mixture of exponential distributions, i.e., /{D(s) = L~=1(1-e-9k8)pk wherepk:2: 0 and L~=I Pk = 1. From Theorem 38 it follows that this renewal process can be represented as a Cox process. Certainly it has been used because of its simplicity, and not because of its relation to Cox processes. It corresponds


to a Cox process where the intensity process A(t) alternates between the values 0 and Ct2 in such a way that A(t) is a two-state Markov process. We do believe that Cox processes corresponding to two-state Markov processes are of interest. We shall consider them later, but then we let A(t) alternate beween two-states Ctl and Ct2 where Ctl > 0 is allowed. In our opinion, it is rather difficult to find situations where Ctl = 0 is natural. Another of our arguments is that it sometimes might be natural to consider N as the sum of independent individual point processes. Relying on Theorem 31 we used this as an argument to choose N as a Poisson process. If we only assume that N is a sum of independent asymptotically negligible point processes we can only draw the conclusion that N must be infinitely divisible. From Theorem 32 it follows that - essentially - the Poisson process is the only infinitely divisible renewal process. Putting all this together, we do not find it very convincing that the occurrence of claims can be much more realistically described by renewal processes than with Poisson processes. This, however, does not mean that it is uninteresting to consider renewal models. One practical aspect is that we might be interested in whether ruin occurs only when it can be observed. In cases where the risk process is regularly observed we may want to consider a renewal model where KO is an one-point distribution, although the occurrences of the claims are described by a Poisson process. Then the ordinary renewal process is purely deterministic, and the "claims" are the "arrivals of the accountant." In our opinion, a much more important reason is mathematical clarity. By explicit use of an inter-occurrence time distribution, a better insight is achieved in how the ruin probability and the Lundberg exponent depend on the risk process.
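A small sketch of the two-state Markov case just mentioned (my own illustration; all rates are arbitrary): the intensity alternates between α₁ and α₂ according to a two-state Markov process and, given the realized intensity path, the claims form an inhomogeneous Poisson process - exactly the two-step description of a Cox process recalled in the Preface.

```python
import numpy as np

rng = np.random.default_rng(5)

def cox_two_state(T, a1=0.5, a2=2.0, q12=0.1, q21=0.3):
    """Claim epochs on [0, T] of a Cox process whose intensity alternates between
    a1 and a2 as a two-state Markov process (q12, q21 are the switching rates)."""
    t, state = 0.0, 0            # start in state 1 (could also be randomized)
    epochs = []
    while t < T:
        rate, switch = (a1, q12) if state == 0 else (a2, q21)
        sojourn = rng.exponential(1.0 / switch)            # time until the next switch
        seg_end = min(t + sojourn, T)
        # conditionally on the intensity, claims on [t, seg_end] form a Poisson(rate) stream
        n = rng.poisson(rate * (seg_end - t))
        epochs.extend(np.sort(rng.uniform(t, seg_end, size=n)))
        t, state = seg_end, 1 - state
    return np.array(epochs)

claims = cox_two_state(100.0)
print(len(claims), claims[:5])
```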

CHAPTER

3

Renewal models

We shall now consider the case where the occurrence of the claims is described by a renewal process N. Let Sk denote the epoch of the kth claim. Recall from Section 2.2 that a point process on R+ is called a renewal process (with inter-occurrence time distribution J{O) if the variables S1, S2 - S1, S3 - S2, ... are independent and if S2 - S1, S3 - S2, .. , have the same distribution J{o. Further, N is called an ordinary renewal process if Sl also has distribution J{o. N is called a stationary renewal process if J{o has finite mean 1/0: and if S1 has distribution f{ given by J{(t)

= 0:

lt

(1- f{O(s)) ds.

(1)

The first treatment of the ruin problem when the occurrence of the claims is described by a renewal process N, is due to Sparre Andersen (1957). After the publication of his paper this model has been considered in several works. In a series of papers Thorin has carried through a systematic study based on Wiener-Hopfmethods. Good references are Thorin (1974) and the review by Thorin (1982). Following Thorin we first consider the ordinary case.

3.1 Ordinary renewal models Let N be an ordinary renewal process and assume that f{o has finite mean 1/0:. N is not stationary, unless f{o is an exponential distribution, and E[N(t)] f:. o:t. The first problem is to define the relative safety loading. Therefore we consider the random variables X k , k = 1, 2, ... , defined by

(So ~f 0).

(2)

Obviously Xl, X 2 , .•• is a sequence of independent and identically distributed random variables.' This observation will be fundamental in the


analysis. The expected loss between two claims is

E[XkJ

= E[X1J = -E[X(SdJ = E[Zl -

CSl]

= P. -

c -

a

(3)

and it is natural to define the relative safety loading p by .£__ - P. = ___ c - ap. = _ c_1 p=_Ct

ap.

p.

ap.

which is formally the same definition as in the stationary case. This is very natural since the only difference between the ordinary and the stationary case is the distribution of S1. When nothing else is said we assume positive safety loading, i.e., p > o. Define the random variables Y n , n = 0,1,2, ... , by n

Yo = 0

and

Yn =

EXk

for n= 1,2, ...

(4)

k=l

and note that Yn = -X(Sn). Yn is thus the loss immediately after the nth claim. The ruin probability q,O(u), where the superscript 0 refers to the "ordinary case," is as always defined by

q,O(u) = P{u+X(t) < 0 for some t > O}. Since c

>

0 ruin can only occur at claim epochs, we have

q,O(u) = P{maxYn > u}. n~l

Let G denote the distribution function of X n , i.e., G(x) = P{Xn $ x}. Put 'Y ~f E[Xn] = -p.p < 0 (5) and

g.(r)

d~r

i:

erx dG(x)

= E[e rXt ] =

E[er(Zt-eSt)]

= (h(r) + l)kO(cr),

(6)

where h(r) is given by Definition 1.3, and where kO(v) = Iooo e- V $ dKO(s). ASSUMPTION 1.

G(O) < 1.

REMARK 2. The case G(O) = 1 is formally possible - take for example Sn - Sn-1 = l/a and Zn = p. P-a.s. - but uninteresting, since it implies that Zn $ c(Sn -Sn-l) P-a.s. and thus q,O(u) == o. In spite of its triviality it shows that (I) does not necessarily hold in the ordinary case.

D The function g(r) will be important. From Assumption 1.4 it follows that g(O) = 1, g'(O) = -p.p < 0, and that g is convex and continuous on [0, roo). Further g(r) ....... 00 when r I roo. For roo < 00 it is obvious since kO(croo) > o. If roo = 00 Assumption 1 must be used. Since Gis right-continuous there exists Xo > 0 such that G(xo) < 1, and thus

g(r)

~'erx0(1_

G(xo» ....... 00

as r .......

00.


From this argument it follows that Definition 3 is, at least mathematically, meaningful. DEFINITION 3. The Lundberg exponent R is in the renewal case the positive solution of (7) g(r) = 1. REMARK 4. If S1 is exponentially distributed with mean 1/a we have 1

kO( v) - ----::A

- 1+v/a

and thus

1

= g(R) = h(R) + 1

or

1+cR/a

h(R)

= cR. a

Thus, see (1.12), the definition of R in the Poisson case Definition 3. D

IS

included in

The process Y = {Yn ; n = 0, 1,2, ... } is a random walk. A random walk can be looked upon as the discrete time correspondence of a continuous time process with stationary and independent increments. In the classical risk model the risk process was a somewhat special process with stationary and independent increments. In the "martingale approach" those special properties were not used, and it is therefore not surprising - as we shall see - that the derivation goes through almost word for word. Consider the filtration FY = (F,;; n = 0,1,2, ... ) where F,;

= U{Yk; k = 0, ... , n}.

Let Nu be the number of the claim causing ruin, i.e., Nu = min{n

I

Yn

> u}.

As in the continuous time case Nu is a stopping time and

'Ji"°(u) = P{Nu

< oo}.

Put

REMARK 5. If we compare Mu(n) with Mu(t) as defined in Section 1.1 we observe the change of sign in the exponent. This is, of course, due to the fact that Yn = -X(Sn). The reader may be irritated by our, for the moment, quite unnecessary "change of sign." The reason is purely notational, and the choice is made in order to make the future application of random walk results easier. D


Exactly as in (1.17) it follows that Mu(n) is a martingale. Because of the "change of sign" we do, however, repeat the derivation. We have

y y [e-r(U-Yn)] y [e-r(U-Yk) er(Yn-Yk)] E:Fk [M (n)] - E:Fk - E:Fk . --:-.,...,-~ u g(r)n g(r)k g(r)(n-k) y

[er(Yn-Yk)]

= Mu(k) . E:Fk g(r)Cn-k) = Mu(k). Choose no < 00 and consider no 1\ Nu which is a bounded FY -stopping time. Since Theorem 1.14 also holds in this case, it follows from (1.18) that

e- ru

= Mu(O) = E[Mu(no 1\ N u )] ;::: E[Mu(Nu ) I Nu ::; no]P{Nu ::; no}

and thus, since u - YN" ::; 0 on {Nu < oo},

< When no

-+ 00

e- ru < e- ru max g(r)n. O~n~no E[g(r)-N,. I Nu ::; no] -

we get

WO(u)::; e- ru supg(r)n. n~O

The best choice ofr is the Lundberg exponent R, see (1.21). Thus we have Lundberg's inequality (8) Lundberg's inequality in the ordinary renewal case was first proved by Sparre Andersen (1957, p. 224) by completely different methods. With exactly the same arguments as in the derivation of (1.23) we get -Ru

WO(u) -

e

- E[e-R(u-YN,.) I Nu
u and Al ::; u we have

W-°(u) = A(oo) - A(u) +

l

u

for u 2: 0

W-°(u - y) dA(y)

(12)

which, cf. (1.6), is a defective renewal equation. Assume, cf. (1.7), that there exists a constant K such that

1

00

eKY dA(y) = 1.

Then, cf. (1.8),

eKUw-°(u) = eKU(A(oo) - A(u)) +

1 u

(13)

eK(u-y)w-°(u - y)e KY dA(y)

(14)

which is a proper renewal equation, and it then follows, cf. (1.9), that (15) where

CI

=

[00 eKY(A(oo) _ A(y)) dy = _1-_A-,-(oo-,-)

Jo

and C2 =

1

K

00 ye KY dA(y)

provided K, C I , and C 2 exist in (0,00) and that A is non-arithmetic, i.e., there exists no number d such that A is concentrated on d, 2d, ... . Formally, (15) looks like the Cramer-Lundberg approximation (III), but it is


- as it stands - almost useless, since the solution of (13) requires knowledge of A which is generally not known explicitly. From (8) and (15) it follows that K. ~ R and our main result will be that K. R. Assume now that K O is continuous. This assumption is not necessary, but notation is much simplified. We shall now rely on the presentation of random walks given by Feller (1971, pp. 385 - 412). The reader is strongly recommended to consult Feller's presentation. The idea to use Al in order to derive a renewal equation is due to Feller and (15) is formula (5.13) on p. 411 in Feller's book. The reason for the "change of sign" made is to facilitate comparison with Feller. Further proceedings to simplify the comparison are a double numbering of the formulas, (15) would have been called (5.13) - (15), and the following "Translation of notation" :

=

Translation of notation Our notation

Feller's notation

G(x)

F(x)

'Y

JL

Yn An An(y) An

Sn 'lt n Hn(y)

'l/Jn 'It;; pn(Y)

Vn Dn(y)

Put

'Ii = No. Al is called the first ascending ladder point. Define An by An(Y)

=P{YI

:::;

= P{'Ii = n,

0, ... , Yn -

l :::;

0,

°
Yj for j = 0, ... ,n - 1

and

Yn

~

(18)

y.

Put AD = AO* and let An(Y) be the probability of the set given by (18). Then we have

L An(Y)· 00

A(y) =

(3.2) - (19)

n=O

For fixed n we define n new variables by Xi X n , ..• , X~ = X 1 and let Yo*, Yt, ... , Y,:' be the corresponding random walk, i.e., Yo· = 0 and Y k* = 2:7=1 Xi· Obviously

(Yo·, Yt,···, Y,:') has the same distribution as (Yo, Y 1 , ••. , Yn). Further, we have Yt = Y n - Yn-k and thus we have, cf. (18),

{Y'; >

~*

for j = 0, ... , n - 1 and Y'; ~ y}

= {Yj > 0 for j = 1, ... , nand Y n

~

y}.

Thus we have

An(Y) = P{Yj > 0 for j = 1, ... , nand for n

~

Y n ~ y}

(3.1) - (20)

1 and the following very useful lemma follows.

Feller (1971, p. 395). The renewal mea.sure A admits of two interpretations. For every y > 0 the value A(y) equals

DUALITY LEMMA.

(a)

one plus the expected number of ladder points ~ y; and

(b)

one plus the expected number of events 0 Yk > 0 for k = 1, ... , n.

< Yn
into -00, when -00 < 'Y < 0. For y ::; we have

°

D n +1(Y) = P{Y1

=

2::

roo P{Y 2:: 0, ... , Yn- 1 2:: 0, Jo- 1 =

2::

0, ... , Yn

1~ dAn(z) G(y -

0, Yn +1

::;

y}

Yn E (z, z + dz), Yn+1 ::; y}

(3.5a) - (21)

z)

and, in the same way, for y > 0,

An+1(y) = P{Y1 2:: 0, ... , Yn 2:: 0, 0< Yn+1 ::; y}

=

roo P{Y1 2:: 0, ... , Yn- 1 2:: 0, Yn E (z, z + dz),

Jo-

=

1~ dAn(z)

0< Yn+1 ::; y} (3.5b) - (22)

(G(y - z) - G( -z».

Summing over n = 0,1, ... yields

D(y) =

1~ G(y -

and

A(y) - 1 =

=

1~ G(y -

z) dA(z)

1~ (G(y -

for

y::;

°

(3.7a) - (23)

z) - G( -z» dA(z)

z) dA(z) - D(O)

for y

> 0.

(3.7b) - (24)

The convolution equations (23) and (24) admit of exactly one probabilistically possible solution (D, A) (Feller 1971, pp. 401 - 402) where "probabilistically possible solution" means that D is a (possibly defective) distribution on (-00,0) and A - Ao a measure on (0,00) such that A(y) < 00 for 0< y < 00. Formulas analogous to (23) and (24) hold for (A, D). We will explicitly use the analog of (23), where we have, for y > 0, A n +1(Y) = P{Y1

=

::;

0, ... , Yn

::;

0, 0< Yn +1::; y}

roo P{Y1 ::; 0, ... , Yn- 1 ::; 0, Yn E (z, z + dz),

Jo-

=

1

0+

-':00

dDn(z) (G(y-z)-G(-z»

0< Yn+1 ::; y}

3.1 Ordinary renewal models

and, by summing,

A(y) =

1

0+

-00

(G(y - z) - G( -z)) dD(z)

for y>

o.

65

(25)

Now we introduce the associated random walk induced by the random variables ax i , ax2 , ••• with distribution aG given by

daG(x) = eRr dG(x), where, of course, R is the Lundberg exponent. Since g(r) follows from g(O) = g(R) and g'(O) < 0 that g'(R) > o. Thus

a,

~f E[aXkJ =

I:

xe Rr dG(x)

IS

convex it

= g'(R) > 0

which implies that the corresponding ascending ladder points have a proper distribution aA. It follows from Assumption 1.4 that a, < 00 and thus also E[aAIJ .::: 00. If we write (23) and (24) in differential form, i.e.,

dD(y) = and

dA(y) =

1~ G{dy- z} dA(z)

for y:::; 0

1~ G{dy- z}dA(z)

for y> 0

and multiply with eRy it follows that and The same argument goes through for (A, D) and thus daA(x) = eRr dA(x). Since aA is a proper distribution function we have, cf. (13),

1

00

and thus

I\,

eRy dA(y) = 1

= R. Now the Cramer-Lundberg approximation _ 1l· mR o. Let X be a classical risk process such that w(u) "" Ce- Ru . Now we are interested in the ruin probability EXAMPLE

Put Yo = 0, Y n = -X(Sn), and Xn = Y n -

Yn-l and note that Yo,

N(S)

Yl. ...

-

form a random walk. Obviously Xl = L:k=l' Zk - CSI and thus

g(r) ~f E[e rX1 ] = E[E{e-rX(s')

I Sd] =

E[eS,(ah(r)-rc)].

Since g(R) = E[e S, .O] = 1 it follows from (26) that wj((u) ""

Cj(e- Ru

for some constant C j(. Thus Wj( and W differ asymptotically "only" in the constant and not in the Lundberg exponent. Generally it seems difficult to relate C j( and C. The most interesting case is probably when Sl = ~ P-a.s., i.e., when K is a one-point distribution. The fact that K is not continuous is not important as long as F is nonarithmetic. In this case it follows from Cramer (1955, p. 75) that C-

K

~

C --::::--.,--

for large values of a~.

J.lpR.a~

a~ is the expected number of claims between inspections. The discussion in the appendix indicates that it is highly reasonable to consider large values of a~.

o

3.2 Stationary renewal models Now we let N be a stationary renewal process. Then the distribution J{ of ~ 0 and its Laplace transform is , a ' k(v) = -(1- kO(v)). (33)

S1 has density k(s) = a(1- J{O(s)) for s v

Let w(u), without superscript, denote the ruin probability in the stationary case, while WO( u) still de~otes the ruin probability in the ordinary case. Put (j)(u) 1- w(u) and (j)O(u) 1- WO(u). By the "renewal" argument used

=

=

68

3 Renewal models

in Section 1.1 we get the relation

(XJ

10

cp(u) =

r+c, cpO(u + cs -

k(s) 10

z) dF(z)ds.

(34)

It is tempting to try to do something similar to what we did in Section 1.1 in order to derive (1.4). (In this section we do not rely on Feller, and consequently formula (1.4) means formula (4) in Section 1.) By changing the order of integration we get

r+ c, cpO(u + cs -

to

cp(u) = a 10 (1 - KO(s)) 10

1

roo •

= a 10 = a

00

1

r

U C8 dKO(v) 10 + cpO(u + cs - z) dF(z) ds

dKO(v)

00

l

~ c

lou+c. cpO(u + cs -

v

The change of variables x = u

cp(u) =

z) dF(z) ds

z) dF(z) ds.

+ cs leads to

roo dKO(v)

10

r+ r cpO(x _ z) dF(z) dx. 1u 10 cv

(35)

Differentiation of both sides of (35), provided it is allowed, leads to

=

~

1

cp'(u)

00

dKO(v)

{l u+cV cpO(u + cv - z) dF(z) -loU cpD(u - z) dF(Z)} ~cpO(u) - ~

=

c

c

r cpO(u -

10

z) dF(z)

(36)

which corresponds to (1.3). The last equality follows by the "renewal argument" applied to the ordinary case. For future purpose the following simple result is given as a Lemma. LEMMA PROOF:

8. Differentiation of (35) is allowed. Put ip(X) =

1. Since

l

J; cpO(x-z) dF(z) and note that 0::; ip(x)::; F(x)::;

U+cV+.o. u+.o.

ip(x) dx -

lU+cV

ip(x) dx

::; 2A

U

the lemma follows by dominated convergence .• Exactly as (1.4) and (1.5) follows from (1.3) we get

al°

cp(u) = cp(O) + and

u

cpO(u - z)(l - F(z)) dz

(37)

c

(38)

3.2 Stationary renewal models

Since 0:J1..

(39)

Thus (I) holds - without any change - also in this case, a result due to Thorin (1975, p. 97). Exactly as (1.6) follows from (1.4) and (I),

W(u) =

!:

1

00 (1- F(z)) dz

+ -0:



u

(40) WO(u - z)(1 - F(z)) dz cue follows from (38) and (39). EXAMPLE 6. CONTINUED. Consider again the case when F(z) = 1- e- z / JJ and recall that WO(u) = (1- J1.R)e- Ru . From (40) we get .

W(u) =

1

00 Z JJ e- / dz +!: cue

!:

r (1- J1.R)e- R(u-z)e- z!JJ dz

Jo

+ !:(1- J1.R)e- Ru

= 0:J1. e-u/JJ c

c



u

e-z(l-JJR)!JJ dz

= 0:J1. e-u/ JJ + 0:J1. e-Ru (1 _ e-u(l-JJR)! JJ) = 0:J1. e- Ru .

o

c

c

c

(41)

Now we consider Lundberg's inequality. Note that



00

From (40) we get

W(u) ::;

!: c

::; !:

1

00



eRz(1- F(z)) dz = h(R). R

+!:

(1 - F(z)) dz

c

u

00



u

e- R(U-Z)(I_ F(z)) dz

e- R(U-Z)(I_ F(z)) dz =

~

h(R)e- RU

(42)

c cR and thus Lundberg's inequality holds, with the difference that the constant may be larger than one. In the Poisson case we have heR) = cR/o: and (42) reduces to (IV). Finally we consider the Cramer-Lundberg approximation. By dominated convergence we get

= lim !:e Ru [00(1 _ F(z)) dz tJ--+oo C

lim!: c

u .... oo



u

J

u

+

eR(u-z)wO(u - z)e RU (I- F(z)) dz

'= 0 + ~

cR

heR) CO ~f C,

(43)

70

3 Renewal models

a result due to Thorin (1975, p. 97). It follows immediately that 0

o. Further, limp_oo fp(r) foo(r) and thus limp_oo ro(j3) ro which means that ro(j3) = ro + 0(1). Put Tl(j3) prO(f3}2- ro . Then we have To(j3) TO + 0"2Tl(j3)/j3 which satisfies

=

=

=

o=

13 log (

_ 131

-

og

=

1+ h (~ + 0"2 ~2(f3)) ) _ (1 + CTo + CO"2 ~ (13) ) (1 + --;3 f.lTo j.l0" 2T l(j3) (0"2 + f.l2)rfi 0(1)) + 132 + 2132 + 7f2 log

[log(l

= f.lro+

j.l0"2T1 (f3)

13 =

+ CTO) + log

(1 + ~ C~~~~))]

f.l2rfi I (1 ) 1 C0"2 r1 (f3) 0(1) + (0"2 + f.l2)Tfi ---og +cro - +213 213 13 1 + cro 13 j.l0"2 rl (13) 13

0"2 Tfi

+ 2fi

1 CO"2 rl (13) 1 + cro

- 73

0(1)

+T

or crl (13) Tfi 1 - j.l Tl(f3) - -2

+ero

+ 0(1) =

O.

(49)

3.3 Numerical illustrations

73

Solving rlCS) in (49) we get

rl(fJ) = 2(

(1 + cro)r~ c -I' - Cl'ro

)

+ 0(1) =

rl

+ 0(1).

Thus we get

Rp

= ro(fJ) = ro + (T2 rl (fJ)lfJ = ro + ((T 2rl + o(l))lfJ fJ

fJ

fJ

• Now we also consider the case when the claims are r-distributed. For simplicity we put I' = 1. Then Rp is the positive solution of (1 - (T2r)-1/U 2 (1

or

1

(T2 log(1 - (T2r)

+ cfJr)-l/P = 1

1

+ p log(l + cfJr) = o.

(50)

In the "extreme" case (T2 = 0, which formally means that Zk = 1 P-a.s., (50) reduces to fJr -log(1 + cfJr) = 0 and thus, see (47), Rp = rol fJ. In the other "extreme" case fJ = 0 (50) reduces to log(1 - (T2r) + (T2cr 0

=

=

=

and thus Ro pol(T2 where Po is the positive solution of cr -log(1- r). If (T2 and fJ are of "the same order," we put (T2 kfJ. Then it follows from (50) that Rp = 'YklfJ where 'Yk is the positive solution of

=

log(1 - kr) + k log(1 + cr) =

o.

For c = 1.2 we have TO

=

Tl ::::t -

Po

=

0.354199, 0.397223, 0.313698.

In Table 1 we give the values of Rp for some values of fJ and (T2. For fJ 1 this is one of the cases considered in Section 1.3 and for (T2 1 there are exponentially distribut~d claims. In the last case we have

=

=

WO(u)

w(u) =c(1-R1 )

and the table also illustrates the difference in the ordinary and the stationary case. For c = 1.2 the f¥tor c(1- R 1 ) ranges from 0.77 when fJ -+ 0 to 1.2 when fJ -+ 00.

74

3 Renewal models Values of Rf3 for r-distributed inter-occurrence times and claims in the case a JL = 1 and c 1.2.

TABLE 1.

=

f3

(J'2

0.001 0.01 0.1 1 10 100 1000

=

=1

(J'2

0.3135 0.3110 0.2883 0.1667 0.03185 0.003503 0.000354

= 10

(J'2

= 100

0.003006 0.003114 0.003135 0.003110 0.002883 0.001667 0.000318

0.03130 0.03135 0.03110 0.02883 0.01667 0.003185 0.000350

If

Consider now the approximation + ~(U) = (a1 + 'I7d4>l(U) c4>~(u) =

(a2 + 'I72)4>2(U) -

:1 l a2

Jl

U 4>l(z)e-(U-Z)/1' dz - 'I714>2(U)}

r 4>2(z)e-(u-z)/1' dz - 'I724>1(U)

(22)

10

from which, of course, it follows that

c4>~(O) = (a1 c4>~(O)

+ 'I7d4>l(O) -

'I714>2(0)}.

(23)

= (a2 + '172)4>2(0) - '1724>1(0)

Differentiation of (22) leads to, compare Example 1.7, c c/ J.l:

=

=

(39) which formally means, for the chosen parameter values, that

R

= 0,

C

= q2 = 0.2,

-

R

fr1 = Rl = 1- , 1.2

C- _

fr1J.l _

fr1

- q1-- - - . c 6

4.1 Markovian intensity: Preliminaries

91

TABLE 2. Two-state Markov intensity and exponentially distributed claims in the case 0' = P. = 1, c = 1.2, and q2 = 0.2. E[u] = 00 indicates a mixed Poisson intensity.

0'1

0'2

E[u]

R

C

R

C

W(10)

0.2 0.2 0.2 0.2 0.6667

1 0.7917 0.5833 0.3750 0.1667

0 0.0417 0.0833 0.1250 0.1667

0.2 0.2000 0.2002 0.2029 0.1574

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

00

00

0.0000 0.0000 0.0000 0.0000 0.1667

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

1000 1000 1000 1000 1000

0.0001 0.0001 0.0002 0.0009 0.1667

0.8333 0.7894 0.7144 0.5569 0.8333

1.0000 0.7917 0.5835 0.3754 0.1729

0.0000 0.0439 0.1189 0.2765 0.0000

0.8328 0.7885 0.7130 0.5585 0.1574

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

100 100 100 100 100

0.0007 0.0012 0.0024 0.0083 0.1667

0.8333 0.7899 0.7164 0.5684 0.8333

1.0000 0.7923 0.5852 0.3793 0.2144

0.0000 0.0435 0.1169 0.2650 0.0000

0.8277 0.7807 0.6995 0.5293 0.1574

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

1 1 1 1 1

0.0480 0.0685 0.1004 0.1426 0.1667

0.8333 0.8135 0.8023 0.8160 0.8333

1.0000 0.8432 0.7128 0.6402 0.7068

0.0000 0.0199 0.0311 0.0174 0.0000

0.5154 0.4099 0.2939 0.1960 0.1574

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

0.1 0.1 0.1 0.1 0.1

0.1330 0.1457 0.1566 0.1640 0.1667

0.8333 0.8321 0.8323 0.8331 0.8333

1.0000 0.9512 0.9229 0.9191 0.9437

0.0000 0.0013 0.0010 0.0003 0.0000

0.2205 0.1938 0.1739 0.1616 0.1574

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

0.01 0.01 0.01 0.01 0.01

0.1625 0.1643 0.1656 0.1664 0.1667

0.8333 0.8333 0.8333 0.8333 0.8333

1.0000 0.9938 0.9906 0.9906 0.9937

0.0000 0.0000 0.0000 0.0000 0.0000

0.1641 0.1612 0.Ui91 0.1578 0.1574

00 00 00

These values, indicated by E[O'] = 00, are given in Table 2. In the Poisson case there is no obvious decomposition of C + C = aJ.l/ C in C and C but we have chosen to put C = Q2aJ.l/C and C = Q1aJ.l/C. Although Table 2 indicates that the values of Rand R for large values of E[O'] are close to those values in the mixed Poisson case, there is obviously no convergence of the

w( 00)

=0

in the two-state Markov case

c/ p,:

w(oo)

and

= q2

in the mixed Poisson case.

Mathematically this is nothing strange since (compare the survey of weak convergence given in Section 1.2 and the discussion in Section A.4) Xn ~ X does not imply inft~oXn(t) ~ inft~oX(t).

4.2 The martingale approach Our basic approach to Cox models, due to Bjork and Grandell (1988, pp. 79 - 84), is an extension of Gerber's "martingale approach" used in Section 1.1. Let us therefore recapitulate the main steps in that approach. Suppose we have a suitable filtration F, a positive F-martingale (or a positive F -supermartingale) M, and an F -stopping time T. Choose to < 00 and consider to/\T which is a bounded F-stopping time. Since M is positive, it follows from Theorem 1.14 - compare (1.18) - that

M(O) ~ E:Fo[M(to /\ T)] ~ E:FO[M(T) I T ~ to]P:Fo{T ~ to} and thus

P:Fo{T < t } < -

0

-

M(O) E:Fo[M(T) I T ~ to]

(40) (41)

Obviously M must be related to the risk process X. Let therefore X be adapted to F, i.e., F t ;2 Ff for all t ~ 0, and T = Tu be the time of ruin, i.e., Tu = inf {t ~ 0 I u + X (t) < O}. Then w(u) = E[P:FO{T < oo}] = P{T < oo}. Further, we must choose M such that it is possible to find a good lower bound for the denominator in (41). In the Poisson case we chose M (t) -

e-r(u+X(t)) E[e-rX(t)]

e-r(u+X(t))

-- ---:-;-..,....,....,..-..,. et(ah(r)-re)

(42)

which, since X has independent increments, was easily shown to be an FX-martingale. Using u + X(Tu) ~ 0 on {Tu < oo}, the lower bound was shown to be given by

E[M(Tu) I Tu

~

> E[e-Tu(a:h(r)-re) IT. < t ] > -

u -

0

-

to] inf

O~t~to

e-t(ah(r)-re).

(43)

4.2 The martingale approach

The last step was to let to

-+ 00.

93

The facts that

inf e-t(ah(r)-rc) = 1

(44)

t~O

~ R, where R is the positive solution of h(r) = crla, and that M(O) = e- ru led to Lundberg's inequality. Note that (44) holds for r ~ R and not only for r < R.

for r

Now we consider a Cox model where N is a Cox process with intensity process A(t). The intensity measure A is given by

fat A(S) ds.

A(t) =

A suitable filtration - compare Proposition 2.18 - is F given by F t F~ V Ff. Note that Fa = F~. We shall make strong use of Lemma 2.19 which says that

(i)N(t) has independent increments relative to (ii)

F~;

N(t)-N(s) is Poisson distributed with mean A(t)-A(s) relative to F~.

It seems very natural to try to find an F-martingale "as close as possible" to the one used in the Poisson case. Therefore we consider e-r(u+X(t))

(45)

M(t) -- eA(t)h(r)-tre '

where we quite simply have replaced at with A(t). It follows by almost obvious modifications of (1.17) that M is an F-martingale. Due to the importance of this result we give it as a lemma and "repeat" the proof. 18. The process M, given by (45), is an F-martingale where the filtration F is given by F t = F~ V Ff.

LEMMA

PROOF: The fact that N, and thus X, has independent increments relative to F~ is equivalent to that X(t)-X(s), for s ~ t, independent of Fs relative to Fa = F~. Since E:F! [e-rX(t)] = e- rct

A(t)k L -,_e-A(t)(h(r) + l)k 00

k=O

=

k.

e-rct+A(t)(h(r)+1)-A(t)

=

eA(t)h(r)-ret

we get, see (1.17), E:F· [M(t)] - E:F· [ -

- E:F· [ -

e-r(u+X(.)) eA(. )-re8

-- Ms· ( ) E:F·

[

e-r(u+X(t)) ] eA(t)h(r)-ret e-r(X(t)-X(S))]

. eA(t)-A(s )-(t-8 )re

e-r(X(t)-X(.)) ] eA(t)-A(s)-(t-s)re --

M (). s

I

94

4 Cox models

A lower bound is easily obtained in the same way as in (43): EFo [M(Tu)

I Tu

:::;

to]

> EFo [e-(A(T,,)h(r)-rcT,,) I T. < t ] > inf -

u -

0 -

__ 0 o Rp. Then

C( r) ~ sup E

h(r)O'

[eA(t)h(r )-rct]

t~O

> rc and we have

~ sup

[e t ( ah(r)-rC)]

=

00,

t~O

where the first inequality is trivial and the second follows from Jensen's inequality. Thus R ~ r and the theorem is proved. I

4.3 Independent jump intensity We now consider a class of intensity processes with "independent jumps." Our discussion is based on Bjork and Grandell (1988, pp. 84 - 96). Intuitively an independent jump intensity is a jump process where the jump times form a renewal process and where the value of the intensity between two successive jumps may depend only on the distance between these two jumps. More formally, let Ek, k = 1,2, ... denote the epoch of the kth jump of the intensity process and let Eo ~f O. Put

(Tn = En - E n Ln = A(En-d

1

(49)

n=1,2,3, ....

Here we understand that A has right-continuous realization so that A(t) = Ln for E n - 1 ~ t < En. These notations are illustrated in Figure 1. >- (t) L L

1 2

a

0'

1

1

3

L2

0'

2,

FIGURE 1.

~ 2

0'

22

3

Illustration of notation.

23

~

t

96

4 Cox models

DEFINITION 23. An intensity process ,.\ is called (i)

an independent jump intensity if the random vectors

(L1' (1), (L2' (2), (Ls, us), .. . are independent and if (L2' (2), (Ls, us), ... have the same distribution p; an ordinary independent jump intensity if (L1' (1) also has distribution p;

(ii) (iii)

a stationary independent jump intensity if the distribution of

(L1' (1) is chosen such that"\ is stationary. Let (L, u) be the generic vector for (Ln, un), n ;::: 2, i.e.,

Pr{L E A,

0"

E B} = peA

X

B)

The marginal distribution of L is denoted by PL, i.e., Assume that E[u] < 00 and let q be the distribution of (L1' ut). The following theorem is a consequence of Franken et al. (1981, p. 45). THEOREM 24. The intensity ,.\ is stationary for q(A

X

1 B) = E[u]

£

peA

X

(50)

(t, (0)) dt.

Furthermore, for this choice of q,

(51) for any measurable function f:

R~ ->

R+.

INDICATION OF PROOF: The result in Franken et al. (1981, p. 45) is much more general than Theorem 24. Instead of showing that Theorem 24 is a consequence we choose to indicate the theorem. It is natural that ,.\ becomes stationary when the renewal process of jumps is stationary. Consider the extended index space R. Let E-l be the epoch of the last jump before time o. Then U1, conditioned upon E = El - E-l' is uniformly distributed on [0, E]. Using this and (3.1) we get ds

E[u]p(R+

=

X

1

00

(s, (0))

ds = E[u]

P{UI E (s, s :;::

1

00

s

1

00

s

p(R+

+ ds) IE =

X

dy)

= P{UI E (s, s + ds)}

x}P{E E (x, x

ds -P{E E (x, x x

+ dx)}

+ dx)}

4.3 Independent jump intensity

and thus

~ ( d )} _ x p(R+ x dx) P{ L... E x,x+ x E[u]

97

(52)

which is a well-known result from renewal theory. Further, L1 is only dependent on I;, and it depends on I; in the same way as L depends on u. Therefore we have

P{L1 E (i,i + di)

I I; E (x, x + dx)} =

pede x dx)

p(R+ x dx) ,

(53)

where the ratio is interpreted as a Radon-Nikodym derivative. From (52) and (53) we get

P{L 1 E (e,i + de), U1 E (s, s + ds)}

= 1oOOp{L1 E (e,e+di), U1 E (s,s+ds)

= 10 00 P{L1

E (l,l+dl)

=

1

IE = x} P{/T1

I I; =

E (s,s+ds)

x} P{I; E (x,x+dx)}

I E = x}

P{E E (x,x+dx)}

00 pede x dx) . ds . x p(R+ x dx)

o

p(R+ x dx)

x

E[u]

ds [00 ds = E[u] is p(di x dx) = E[u] pede x (s, 00))

(54)

which is the differential version of (50). Further, we have

100 100 f(i, s) ds is[00 pede x dx) 100 io[00 ior f(i, s) ds p(di x dx) = E[u]

1 E[f(L1' ud] = E[u] 1

0

0

0

which is the same as (51). I EXAMPLE 25. An interesting special case is when u, conditioned upon L, is exponentially distributed, since then A is a Markov process with independent jumps. This means that

pede x ds) = PLed£) TJ(i)e-'1(l)s ds and thus

E[u]

=

100 100 o

0

PL(de) s TJ(e)e-'1(l)& ds

(55)

p __ (d£) . = 100 _L 0

TJ(i)

It follows from (50) that

(d O d) _ pede x (s,oo)) ds _ PL(di) -'1(l)' d E[u] - E[u] e s q {. x S -

(56)

98

4 Cox models

=

PL(df) TJ(f)e-f/(l)& ds = qL(df) TJ(f)e-f/(l)& ds, TJ(f)E[lT] where qL is given by (10). Note that we have only required E[lT] shall return to this example in Section 4.5.

(57)

< 00. We

o

In the stationary case it follows from (51) that a

E[LlT]

(58)

= E[Ld = E[lT]

and thus P=

cE[lT] - J.tE[LlT] J.tE[LlT] .

(59)

In the sequel we assume that E[lT] < 00 and E[LlT] < 00. We may note that p can be interpreted as the net profit (60) the net cost between two successive jumps is the intensity. Thus (59) is the natural definition of p also in the ordinary case. Certainly Cox processes with independent jump intensity are a very restricted class of Cox models. It is, however, general enough to include a number of non-trivial models while still allowing us to obtain fairly explicit results reasonably simple. Naturally, we shall rely on the renewal structure of the intensity. Like in the renewal case, we shall first consider the ordinary case and then the stationary case and we denote the ruin probabilities by '11 0 and'll, respectively. First, however, we shall consider an imbedded random walk.

4.3.1 An imbedded random walk Let N be a Cox process with an ordinary independent jump intensity A. Then the random variables Xk, k = 1,2, ... , defined by Xk ~f -[X(L:k) - X(L:k-d]

(L:o ~f 0)

(61)

are independent and identically distributed. Let X be the generic variable for Xk. Then E[X]

= -E[X(L:d = -E[E[X(L:l] ILl,

L:d]

= -E[J.tL1L: 1 - cL: 1] = -E[J.tLlT - ClT] = -pJ.tE[LlT] < 0 for p > O. When nothing else is said we assume that p> O. The process Y = {Yn ; n = 0,1,2, ... }, where n

Yo = 0

and

Yn =

EXk k=l

for n = 1,2, ...

(62)

4.3 Independent jump intensity

99

is thus a random walk. Yn = -X(En) is the loss immediately after the nth jump of the intensity. Put, compare (3.6),

tjJ(r) ~f E[e rX ] = E[e-rX(E , )]

= E[E[e-rcE,+L,E,h(r) ILl,

= E[e-rco+h(r)Lo],

E 1 ]]

(63)

where her) is given by Definition 1.3. The function tjJ(r) will play the same role as g(r) did in the renewal case. There are, however, two important differences. 1.

Ruin occurs between the jumps of the intensity, and therefore a study of this random walk needs, a priori, not be of any relevance. Its relevance will, however, follow from Section 4.3.2.

2.

It does not follow from Assumption 1.4 that tjJ(r) has corresponding regularity properties.

=

=

Since h(O) 0 and her) is convex it follows that tjJ(O) 0 and - see Lemma 5 in Bjork and Grandell (1988, p. 88) - tjJ(r) is convex. EXAMPLE 26. Consider the case where (T is exponentially distributed with mean 1, where L has positive mass arbitrarily far out, i.e.,

pL(£, (0)) and where

(T

>0

for all £ > 0,

(64)

and L are independent. Then

p(d£

X

ds) = PL(d£) e-' ds,

i.e., - see (55) - the intensity is a markovian jump process with 1](£) == 1. Condition (64) holds, for example, when L is exponentially distributed. Thus this example can definitely not be regarded as pathological. Consider any r > O. Since her) 2: rJ.l we have, for all £,

tjJ(r) 2: E[e-rco+rpLo] 2: E[e(rpt-rc)o]pt«£, (0)). For rJ.l£ - rc 2: 1 or for £ 2': (1 + rc)j(rJ.l) (which is the same) we have E[e(rpt-rc)o] = 00 and thus tjJ(r) = 00.

o

The Lundberg exponent Ro - the subscript 0 will get its explanantion by (68) - is defined by Ro = sup{r

2: 0 I tjJ(r)

~

1}

(65)

and the ruin probability wrw (u) - "rw" stands for "random walk" - by wrw(u) = P{maxYn n;::l

Obviously wrw(u) PROPOSITION

f>

0,

~

> u}.

Wo(u).

27. Suppose that tjJ(r) =

00

limsupeWwrW(u) = u-oo

for all r 00.

> o. Then, for all

100

4 Cox models

PROOF: See Proposition 13 in Bjork and Grandell (1988, pp. 92 - 93). The idea ofthat proof is to use the trivial inequality wrw(u) ~ P{Y1 > u} and then to prove that

limsupe w P{Y1 u_oo

> u} =

00.



It follows from Proposition 27 that we cannot get any Lundberg inequality unless we assume that 4>(r) < 00 for some r > o. Example 26 shows that this is not an "innocent regularity assumption." PROPOSITION

28. Assume that 4>(r)

< 00 for some r > 0 (and that p > 0).

Then Ro

> O.

PROOF:

It follows from Bjork and Grandell (1988, p. 90) that

4>'(x) = E [:x e~X] = E [Xe~X] for x < r, and in particular 4>'(0) proposition follows .• PROPOSITION

29. Assume that 4>(r)

1. Then

where 0
(0)

=0

the

< 00 for some r > 0 and that 4>(Ro) =

lim eRouwrw(u)

u_oo

PROOF:

= E[X]

= C rw ,

c rw < 00.

This follows from (3.26) .•

In "well-behaved" cases Ro is determined by 4>( r) to avoid in (b) below is when 4>(r) has a jump at r PROPOSITION

= 1. The case we want = Ro.

30.

= 1 for some r > o.

(a)

Suppose 4>(r)

(b)

Suppose that Ro 4>(Ro) = 1.

(c)

Suppose that 1 < 4>(r)

Then r

= Ro.

> 0 and that 4>(r) < 00 for some r > Ro. Then < 00 for some r > o. Then 4>(Ro) = 1.

PROOF: The result (a) follows from the strict convexity of 4>(r) and (b) is trivial. Choose r such that 1 < 4>(r) < 00. From (63) and dominated convergence it follows that

4>(r-)

= lim4>(x) = E[lime~X] = 4>(r). ~tr

~tr

Thus 4>( r-) < 1 is impossible, and (c) follows by convexity.• EXAMPLE 26. CONTINUED. Assume that (j is exponentially distributed with mean 1 and that (j and L are independent but that there exists lo such that pi«l,oo)) = 0 for l> lo.

4.3 Independent jump intensity

Then

¢(r)

~

E[e-rcO"+h(r)loO"]

< 00

for h(r)fo - rc

101

O. We find it somewhat surprising that the exponential decrease of \)frw (u) is "destroyed" as soon as (64) holds, independently of how fast PL«f, 0 as f --+ 00. D

(0» '\.

A natural question, under the assumption that ¢(r) < 00 for some r > 0, is if Ro may be defined as "the positive solution of ¢(r) = 1." Formally, this is not the case, since "pathological" cases of the kind discussed in connection with Assumption 1.4 can occur. To realize that we can choose IT = So p-a.s. and let L have a "pathological" distribution. We do not know if there exist any natural examples where ¢(r) has a jump at r = Ro. Anyhow, from Proposition 30 it follows that this is no big problem.

4.3.2 Ordinary independent jump intensity Recall from Section 4.2 that

R = sup{r I C(r) < oo},

where

C(r) = E [supeA(t)h(r)-rct] . t~O

We shall first consider conditions for C(r) < 00. Since>. is piecewise constant, A will be piecewise linear. Thus also A(t)h(r) - ret will be piecewise linear and it is enough to look at eA(t)h(r)-rct at the jump times of >.. Formally, we define the discrete time process Wand the random variable W* by Wn eA(En)h(r)-rcEn and W* sup Wn . (66)

=

=

n~O

Note that we have suppressed the dependence on r in Wand W*. Thus we have

C(r) = E[W*]

(67)

and we have reduced the problem of analyzing E[suPt>o eA(t)h(r)-rct] to the simpler problem of analyzing E[suPn>O W n ]. Put Yn(r) LnlTnh(r) - rClTn and Y(;:) LlTh(r) - rClT. Then ¢(r) = E[e YCr )]. We shall use the obvious facts that

=

=

Wn

n

n

j=l

j=l

= exp(EYn(r» = TI exp(Yn(r»

and that Yn(r), n = 1,2 ... , are independent and identically distributed random variables.

102

4 Cox models

31. Suppose that the distribution of Y(r) is not concentrated to one point. Then ¢(r) < 1 is a necessary condition for C(r) < 00.

PROPOSITION

See Proposition 5 in Bjork and Grandell (1988, p. 86). If ¢(r) > 1 the result is obvious since E[Wn] = ¢(r)n ,/ 00 as n --+ 00. We shall indicate the idea of the proof in the case ¢(r) = 1. Since 1 = E[eY(r)] it follows from Jensens' inequality that E[Y(r)] < 0 and thus that Wn --+ 0 P-a.s. as n --+ 00. Assume now that C(r) < 00. Then W is a uniformly integrable martingale (a concept that we have not discussed) and Wn E[Woo I FW] where Wn --+ Woo P-a.s. as n --+ 00. We have just proved that Woo = 0 and thus Wn == 0, which contradicts ¢(r) = 1. I We have not managed to give a sufficent condition for C(r) < 00 in terms of ¢(r). In order to give such a condition we consider ¢(6, r), defined by

PROOF:

=

¢(6, r) ~f E[e(1+ 6 )Y(r)] = E[e(1+6)(h(r)LO W~l+6) = (W*)(l+6) and using a standard martingale inequality it foll;;ws that P{W* ~ z} ~ Kz-(l+6) and that C(r) < 00. I Motivated by Propositions 31 and 32 we define the constants R6 and R+ by

(68) = sup{r ~ 0 I ¢(6,r) ~ I} 6 ~ 0, R+ = sup{r ~ 0 I ¢(6, r) ~ 1 for some 6 > OJ. (69) Note that (68) for 6 = 0 agrees with (65). The notation R+ is clarified by

R6

the following lemma. LEMMA

33. R6 is non-decreasing as a function of 6, and lim.s!o R6 = R+.

PROOF: See Lemma 1 in Bjork and Grandell (1988, pp. 87 - 88). The idea of the proof is to use that the Lp-norm - on a probability space - is nondecreasing in p. For any random variable ( the Lp-norm 1I(lIp E[I(IPP/p. Using ¢(6, r) E[e(1+ 6)Y(r)] lIeY(r)lIg!:~

=

=

=

the lemma follows. I From a computational point of view Ro is much easer to handle than R. It follows from Propositions 31 and 32 that

R+ ~ R~ Ro.

4.3 Independent jump intensity

103

(Since wrw (u) ::.; WO(u) the inequality R ::.; Ro - almost - also follows from Proposition 29.) We shall now show that we have in fact R = Ro by proving that R+ ~ Ro. At first this may seem surprising, since we have not managed to give necessary and sufficent conditions for C(r) < 00 in terms of ¢!(r). The following theorem is the main result of this section. THEOREM 34. Lundberg's inequality (in the version given by Theorem 20) holds with R = sup{r ~ 0 I ¢!(r) ::.; I}. PROOF: Although this is Theorem 5 in Bjork and Grandell (1988, pp. 88 - 89) we shall give the full proof. If Ro = 0 the result is trivial, so assume that Ro > O. Choose any r E (0, Ro). In order to prove that R+ ~ R o, it is sufficient to show the existence of 8> 0 such that ¢!(8, r) ::.; 1. Therefore we choose a 8 > 0 small enough to ensure r' ~f r( 1 + 8) < Ro. Since ¢!( r') ::.; 1, it is enough to show that ¢!(r') ~ ¢!(8, r). We have ¢!(r') - ¢!(8, r) = E[e- r (1H)eq(e- h(r(1H))Lq _ e-(1+ 8 )h(r)Lq)]. From the convexity of h, together with h(O) = 0, we get h(r(1 (1 + 8)h(r) and thus

¢!(r') - ¢!(8,r)

~ E[e- r (1H)eq(e-(1H)h(r)Lq - e-(1+ 8)h(r)Lq)]

+ 8))

>

= 0

and the theorem follows .• The Lundberg exponent R is the "right" exponent in the following sense. THEOREM 35. Assume that ¢!(r)

< 00 for some r > R>

lim e(R+f)UwO(u)

u-+oo

O. Then

= 00

for every € > O. PROOF: Since wrw(u) ::.; WO(u) the theorem follows from Propositions 29 and 30 .• REMARK 36. If R = 0 it follows from Proposition 28 that ¢!(r) = 00 for all r> O. Then it follows from Proposition 27 that limsupeWwO(u) =

00

U-+oo

for every { > 0, so R is (formally) the "right" exponent also in this case.

o

4.3.3 Stationary independent jump intensity Now we consider the case when (L1' lTd has distribution q given by (50). We shall show that Theorem 34 also holds in this case.

104

4 Cox models

THEOREM 37. Lundberg's inequality (in the version given by Theorem 20) holds with R = sup{r ~ 0 I tjJ(r) ~ I}. PROOF: This is Proposition 1 in Bjork and Grandell (1988, pp. 94 - 95), from which the proof is taken. If R = 0 the result is trivial, so assume that R> O. Put

2 if we, for example, separate between rain and snow. In this case it is not enough to consider 1'.1 = 3 - which might be the first idea - since then it is probably completely unrealistic to let () be a Markov chain. (If we are in a dry period, and know that the precipitation in a preceding period was snow, it is high probability that the precipitation in the next period also will be snow.) If we, however, extend the classification of the "risk types," for example, by including information about the temperature, much realism may be gained. Conditions (iii) - (v) say that the successive periods are independent, conditioned upon (), and that their stochastic properties only depend on the "risk type." For M = 1 we are back in independent jump intensities. Strictly speaking we are back in ordinary independent jump intensities, since (iv) implies that (Lk' Uk) has the same distribution for all k. If Li = (¥i a.s., u i is exponentially distributed with mean 1/r/i, and Pii = 0 for all i the intensity process is an M-state Markov process. Although () is defined as a discrete time process, it can, by the construction above, also be viewed as a continuous time process {(}(t); t 2: O} where (}(t) is "the risk type" at time t. REMARK

40. This remark is only about terminology, and may very well be

4.4 Markov renewal intensity

107

omitted. We have chosen the name "Markov renewal intensity" since (ii) is a Markov property and (v) is an independence - or renewal - property. The chosen name may be criticized since the intensity is not a Markov renewal process. In fact, a Markov renewal process is a marked point process, see for example Franken et al. (1981, p. 18) or Karr (1986, p. 344), highly related to semi-Markov processes. The process OCt) is a semi-Markov process with the special property that the distribution of the time between two successive jumps only depends on the state which the process jumps from. In a general semi-Markov process the time between two successive jumps may depend both on the state which the process jumps from and the state which the process jumps to.

o

Let us now introduce some notation: E~ ~ the time of the nth entrance of OCt) to state i;

Win--

eA(I:~)h(r)-reI:~.

,

= sUPn~O W~, i = 1, ... , M; yl(r) = A(EDh(r) - rcEi; y1(r) = (A(ED - A(Ei_1)) her) -

Wi*

rc (Ei - Ei_1)'

k

= 2,

3, ....

Observe that the random variables {Y1(r)}k'=1 are independent and that the random variables {Y1(r)}k'=2 are, furthermore, identically distributed. Let yi(r) denote the generic variable for Y1(r), k = 2, 3, .... From the piecewise linearity of A it follows, exactly as in the independent jump case, that G(r) E[max(Wh, ... , WM*)]. Since Wh, ... , WM* is a finite collection of non-negative random variables we have

=

G(r)

< 00 if and only if E[Wi*] < 00 for all

Using the fact that W~

i

= 1, ... , M.

= n exp(Y1(r)) we obtain, with exactly the same n

k=1

arguments as in the proof of Proposition 31, the following lemma. LEMMA

41. A necessary condition for G(r) i

< 00 is that

= 1, ... ,M,

k, i = 1, ... ,M,

(70) (71)

where

= E [eyi(r)], (72) «Pki = E[eY/(r) 10(0) = k]. (73) (Note that (72) agrees with (73) for k = i.) It is also easy to see that we «PH

can more or less copy the "~-reasoning" in Section 4.3.2. Thus we have the following - yet almost useless - analog of Theorem 34.

108

4 Cox models

PROPOSITION

42. Lundberg's inequality (in the version given by Theorem

20) holds with R = sup{ r

0 1 (70) and (71) are satisfied}.

~

The problem is thus to find conditions which ensure that (70) and (71) hold. Recall the matrix notation introduced in Section 4.1. Let A = (aij) be an M x M matrix with eigenvalues 1\:1, ... , I\:M. (The eigenvalues are solutions of the equation det(A - d) = 0.) The spectral radius of A, spr(A), is defined by spr(A) = max(11\:11, ... , II\:MI). If A is irreducible and non-negative it follows from Frobenius' theorem that A has a simple maximal eigenvalue, I\:[A], such that

I\:[A] = spr(A). To the maximal eigenvalue there corresponds strictly positive eigenvectors. If some component aij = 00 we define, as a convention, K[A] = 00. Let 4> be the vector with components

(74)

¢i(r) = E[e-rcu'+h(r)L'u']

< 00). 43. Ifspr(d(4»P) < 1 then (70) and (71) hold.

and put Roo = sup(r ~ 0 14> PROPOSITION

PROOF: This is Proposition 6 in Bjork and Grandell (1988, pp. 98 - 99), from which the proof is taken. We have

ki

= ¢k(r)Pki + 'L,¢k(r)Pkjji

or

M

ki

where Q

(75)

iti

= qki + 'L,qkjji = 'L, qkjji jcj:;

j=l

= (qki) = d(4))P = (¢k(r)Pki). ii =

+ qH(l -

;;),

(76)

In particular (76) implies

M

'L, %ji + qii(l

j=1

- ii).

(77)

Put cp = (H)' Then it follows from (76) that

cp = Qcp or spr(Q)

+ Qd(I -

cp)

(I - Q)cp = Qd(I - cp).

< 1 implies that

(78) (79)

det(I - Q) =F 0 and that 00

'L, Qn n=O

= (I _

Q)-1.

(80)

4.4 Markov renewal intensity

109

It follows from (79) that ip = (I - Q)-1Qd(1 - Q) and thus

d(ip) = d((1 - Q)-lQ)d(1 - Q).

(81)

Equation (80) implies that (I - Q)-1 is non-negative and thus it follows from (81) that ii = bi (I-ii) for some bi ;::: o. Thusii = b;f(I+b;) < 1, i.e., (70) holds. We will now show that (70) implies (71). Let i and j be fixed and assume that j; = 00. It follows from (75) that ki = 00 for all k thus that Pkj > o. Since P is irreducible Pkj > 0 for some k ::j; j. For such a k it follows from (77) that kk = 00 • • From Propositions 42 and 43 we get the following theorem, which is the main result of this section. THEOREM 44. Lundberg's inequality (in the version given by Theorem 20) holds with R = sup{r;::: 0 I spr(d(~)P) < I}. We shall now show that R - given by Theorem 44 - is the "right" exponent in the sense of Theorem 35. The crucial step in the proof is following lemma, which can be looked upon as a converse of Proposition 43. LEMMA 45. If r ::j; Roo then

;;


0 and to (26) when a1 = o.

o

4.5.2 An alternative approach We shall now consider an alternative to the martingale approach which will allow us to settle some questions when we can use the Lundberg inequality with (; = o.

114

4 Cox models

Recall Proposition 31, which says - in the independent jump case - that 1 implies C(R) 00 as soon as Y(r) LrTh(r) - rCrT is nondeterministic. Thus - compare the discussion in Remark 21 - the Poisson case is probably almost the only case where Theorem 20 holds with f = O. As pointed out in Remark 21 we know from (37) that there exist Cox cases where Lundberg's inequality holds with f = O. In the derivation of (37) we used a "backward differential argument" which essentially means that we took the different possible changes in both the intensity and the risk process into account. Basic in that derivation was that the vector process (X, A) = ((X(t), A(t»; t 2:: O} - and not only A - is markovian. In the martingale approach we used the filtration F given by

c/>(R)

=

=

=

:Ft = :F! V:Ff which means that the variation of A(t) was considered as already completely known at time t = o. A way to "combine" these approaches may be to consider the filtration F given by -r- ~f -r-A V -r-X _

.rt -.rt

.rt

-r-(X, A)

(90)

-.rt

and to base the analysis on some suitable F-martingale. The way to find such an F-martingale is to apply Proposition 5. Let H be the generator of the intensity process - see Lemma 7 - and G a the generator of the classical risk process (with intensity a) - see Lemma 8. Consider now the vector process (X, A), with state space S ~ R2, and denote its generator by A. Thus A acts on functions v = v(x,f). LEMMA

51. The generator, A, of (X, A) is given by (Av)(x,f) = (Glv)(x,f)

+ (Hv)(x,f),

where Gt operates on the x-variable and H on the f-variable.

PROOF: This is Proposition 6 in Bjork and Grandell (1988, p. 107). Since X(O) = 0 by definition, we consider, like in the proof of Lemma 8, yet) = y + X(t), which has the same generator. We have

I YeO) =

E[v(Y(~), A(~»

=

y, A(O) = f] - v(y,f)

E[v(Y(~), A(~» - v(Y(~), f)

=

E[v(Y(~),f) - v(y,f)

I YeO) = y, .-\(0) = f]

I YeO)

= y, .-\(0) = fl.

We have, since Y has right-continuous trajectories, 1

~ E[v(Y(~), .-\(~»

=

E[(Hv)(Y(~),f)

-

v(Y(~),f)

I YeO) =

I YeO)

y, .-\(0) = f]

and, since Y and A have no common jumps,

= y,

.-\(0)

= f]

+ 0(1) -+ (Hv)(y,f)

4.5 Markovian intensity

1

~ E[v(Y(~),l)

- v(y,f) I Y(O)

= y,

A(O)

115

= f] -+ (Gtv)(y,l)

and the lemma follows .• It now follows from Proposition 5 that M(t) = v(X(t), A(t)) for v such

that (91)

Av =0

is an F-martingale. We will, however, not study the martingale equation (91) in full generality. Instead, since we want to apply (40) and since we want to obtain exponential estimates, we restrict ourselves to functions v of the form

v(x,f) = g(f)e-r(u+x),

where g is a positive function. Like in Section 4.1 we let I]ft(u) denote the ruin probability when A(O) = f, i.e., when Po = 8t . Similarly Et denotes the expectation operator in that case. Thus we have, for any initial distribution, Ed . ] E[ . I A(O) f]. We can now formulate the version of Lundberg's inequality, where f = o. Let, like before, Tu be the time of ruin.

=

=

PROPOSITION 52. Let A be a markovian jump process, and suppose that R+ -+ R+ and R > 0 satisfy

g:

g(f)[fh(R) - Rc] + 77(f)

loco g(z)PL(f, dz) -

77(f)g(f)

=

O.

(92)

Then (93) PROOF: We get, from Lemma 8 since GI. operates on the x-variable,

(Gtv)(x, f) = g(f)e-r(u+x) ( -rc + f

loco erz dF(z) -

f)

= g(f)e-r(u+x) (fh(r) - rc) and, from Lemma 7 since H operates on the f-variable,

(Hv)(x,f) = e-r(u+x) (77(f)

loco g(z)PL(f, dz) -77(f)g(f))

and thus (92) is equivalent - Lemma 51 - to (91). This mean that

M(t) = g(A(t))e-r(u+X(t)) is an F-martingale and (93) follows from (40). I Now we consider the case when A is an independent jump markovian intensity. This means - Definition 11 - that PL(Y, B) = PL(B). As before we denote the ruin probability with I]fo in the ordinary case and with I]f in the stationary case. Thus we have

116

4 Cox models

and where

PL(df) qL(df) = 1](f)E[or

THEOREM 53. Let>. be an independent jump markovian process, and suppose there exist R > 0 and f3 > 0 such that

(a)

Rc + 1](f) - h(R)£

(b)

Jo

roo

> 0,

PL-a.s.;

1](£)

Rc + 1](£) _ h(R)£ PL(d£) = 1;

(c)

PL-a.s.

Then we have the Lundberg inequalities

RC+1]~\£~h(R)f (1+ ~c) e- Ru ,

Wt(u)::;

WO(u)::;

(1+ ~c) e-

and W(u)::; ,8;[0"]

(1+ ~c)

Ru ,

e- Ru •

PROOF: This is Theorem 8 in Bjork and Grandell (1988, pp. 108 - 109). Define g by

1](£)

g(£) = Rc + 1](£) - h(R)£ '

(94)

which due to (a) is positive. From (b) it follows that g satisfies (92). Thus (93) holds. It follows from (c) that 1

-= g(f)

Rc + 1](£) - h(R)£ Rc + 1](£) Rc <

1.

CASE 1: Consider the case when 0" = So p-a.s. and let L be exponentially distributed with E[L] = 1. This is a special case of the model studied by Ammeter (1948) and discussed in Example 38. Then

e- rC80 ¢(r) - --:--;-:--

1 - h(r)so

when her)

< l/so.

o We shall now give two examples of independent jump markovian intensities. CASE 2: We consider first the case when 1](£) == 1]. Then Land 0" are independent and E[O"] Ih. Recall from Example 26 that the distribution of L must have compact support, otherwise we have ¢(r) = 00 for all r > O. Let L be uniformly distributed on [0,2]. It follows from Proposition 49 that

=

1] ( 1 -2h(r) ¢(r) = ---log - -) 2h(r) rC+1]

o

rc + 1] when her) < - 2 - .

120

4 Cox models

CASE 3: Now we consider a case when P{L

PL(di)

={

> i} > 0 for aU i. Put

if i < 1/2 if i ~ 1/2

0

~i-2 di

TJ(i)

and

= TJi.

Then E[o-] = l/TJ. It follows from Proposition 49 that

q,(r) =

2~C log (1 + TJ ~~(r))

when her)

o Yn(t) < -y} -+ P{inft>o Y(t) < -y}, in the Cox case. However, even f~m such a proof it does not follow that RpD/ RD -+ Rp/ Rase -+ aJJ.

o

It follows from Asmussen (1987, p. 137) that 0- 2 _

A -

E[L 20-2] + a 2E[0-2] - 2aE[L0-2] _ E[(L - a)20-2] E[o-] E[o-]

for an independent jump intensity. In our cases we thus have

RpD RD

=

0-1

+ 1 + o-~ 1 + o-~ ,

where

2 0-A

=

E[(L - 1)20-2] E[o-] .

(102)

By simple calculations we get I 2 o-A=E[o-]. {

2/3 8/25 72/25

in in in in

Case 1 Cases 2 and 3 Case 4 Case 5

The simple approximation (102) holds reasonably well for E[o-] ~ 10 in the deterministic case, for E[o-] ~ 100 in the exponential case and for all E[o-] in the r-case. Taking the poor accuracy of the diffusion approximation into account it holds, in our opinion, surprisingly well. In Table 6 we consider the behavior of Rp / R for the "worst reasonable" values of E[o-]. The approximation RpD/ RD is indicated by "c = I" and the claim distributions by the values of o-~. In all cases, except in Case 4, Rp / R seems to increase or decrease to RpD / RD. In Case 4, with exponentially distributed claims, Rp / R has a maximum 20.233 at c = 1.236. Generally Rp / R seems to be relatively insensitive to variation in c, and that is probably the reason why approximation (102) works reasonably - or surprisingly - well.

4 Cox models

124

TABLE 6. Values of Rp/ R (for c in the case Q II1.

= =

> 1) and RpD/ RD

(indicated by c

= 1)

c

q~

E[CT]

Case 1

Case 2

Case 3

Case 4

Case 5

1.3 1.2 1.1

0 0 0

10 10 10

12.914 12.295 11.657

6.665 7.003 7.337

10.054 9.281 8.468

5.269 5.086 4.773

31.606 31.238 30.648

1

0

10

11.000

7.667

7.667

4.200

29.800

1.3 1.2 1.1

1 1 1

100 100 100

55.484 54.075 52.584

25.546 28.310 31.220

42.748 40.192 37.404

20.052 20.168 19.246

140.625 142.894 144.426

1

1

100

51.000

34.333

34.333

17.000

145.000

1.3 1.2 1.1

100 100 100

1000 1000 1000

10.777 10.811 10.852

5.591 6.177 6.840

8.395 8.167 7.905

4.428 4.495 4.422

26.341 27.438 28.507

1

100

1000

10.901

7.601

7.601

4.168

29.515

Although definite conclusions may not be drawn from Tables 1 - 6, they do - in our opinion - support the conclusion that it might be fatal to ignore random fluctuations in the intensity process.

CHAPTER

5

Stationary models

Recall W(O) = 0'.1' = _I_ e 1+ p

when e

> 0'.1',

(I)

which was proved in Chapter 1 for the Poisson case. As pointed out (I) is an insensitivity result, since w(O) only depends on p and thus on F only through its mean 1'. In Chapter 3 - see (3.39) - it was shown that (I) also holds for the stationary renewal model. Thus - in that case - (I) turned out to also depend on the inter-occurrence time distribution 1 u}. The virtual waiting time V(t) is the waiting time in the queue of a hypothetical customer arriving just after time t. When 'T] < 1 there exists - since KO is continuous - a random variable V such that V(t) ~ V as t -+ 00 and we have

w(u) = P{V > u}.

Note that V(t) = 0 if and only if the server is idle at time t. Elementary books on queueing theory, and we choose Allen (1978) as an example, emphasize the MIG/1 queue - which means that the customers

5 Stationary models

127

arrive according to a Poisson process - and the M/M/1 queue where the service times are exponentially distributed. In those cases W and V have the same distribution. These queues correspond to the Poisson case. From Allen (1978, p. 163) it follows that P[W

> u] = 1]e

-~ IJ

for the M/M/1 queue

which is the "queueing version" of (II) and from Allen (1978, p. 198) that P{V = O} = 1 - 1]

(3)

for the M/G/1 queue which is (I). In more advanced treatises on classical queueing theory - the most wellknown is probably Takacs (1962) - it is shown that (Takacs 1962, p. 142) (3) also holds for the GI/G/1 queue. Like in risk theory it may be disputed if the GI/G/1 queue really is the relevant.generalization of the M/G/1 queue. Franken et al. (1981) consider the much more general G/G/1 queue, where only certain stationarity properties of the arrivals and the services are assumed. It is, for example, not assumed that the arrivals and the services are independent. It follows from Franken et al. (1981, p. 108) that (3) still holds when the queueing system is ergodic. We will not go into details about the model and rely on the reader's intuition. In this generality (2) does not necessarily hold and therefore the relation between ruin probabilities and waiting times is not quite problem-free. This "problem" seems, however, not too serious, since a time reversal may change distributions but not expectations and (I) only depends on expectations. In spite of this we will give a direct proof which generalizes the proof in the renewal case. Bjork and Grandell (1985, p. 149) gave an example which they claimed to be a "counter example" of that relation. Although that was not too well expressed - i.e., wrong - we shall consider the example. Let the claim sizes Zt, Z2, Z3, ... be independent and exponentially distributed with mean 1 and let the claims be located at Zl, Zl + Z2, Zl + Z2 + Z3, ... Thus N is a Poisson process with Q = 1. In the queueing formulation (3) holds. Intuitively that is obvious, since the customers always arrive at an idle server. The nth customer arrives at time CSn-l. That customer's service is completed at time CSn-l + Zn while the next customer arrives at cSn = CSn-l +cZn . Thus the server is busy during (CSn-l,CSn-l + Zn) and idle during (CSn-l + Zn,cSn ), i.e., the server is idle the proportion (1 - c)/c 1 - 1] of the time. In the risk model formulation we have X(t) ~ (c - l)t for all t > 0 and thus w(O) = 0 for c > 1 which Bjork and Grandell (1985, p. 149) regarded as a contradiction of (I). This is, however (in reality), no contradiction since the risk process X(t) does not have stationary increments. In order to realize this, we consider an epoch to > > o. Then - formally when to -+ 00 - the time from to to the next claim is exponentially distributed with mean 1. The time from the previous claim

=

128

5 Stationary models

to to is also exponentially distributed with mean 1 and the two durations are independent. Thus the risk process X(t) gets stationary increments if the size of its first claim is changed to Zl + Z where Z is exponentially distributed with mean 1 and independent of all the ZkS. Then ruin can occur only at the first claim and we have

W(O) = P{Z ~ > (c - l)Zt} =

1

00

o

e-(c-1)ze- z dz =

1

00

0

e- cz dz = -1 c

which is in agreement with (I).

D As mentioned in Remark 1 we shall generalize the proof in the renewal case. In that case (I) followed from (3.37), which gave a relation between the ruin probabilities - or strictly speaking the non-ruin probabilities - in the ordinary and the stationary cases. The natural question is now: What is the correpondence to "the ordinary case" for a general stationary point process? In order to answer that question we shall need some basic facts about stationary point processes. A good reference is Franken et al. (1981), upon which the survey is highly based.

STATIONARY POINT PROCESSES

We start by recalling some basic definitions given in the survey "Point processes and random measures" in Section 2.2. Let N denote the set of integer or infinite valued Borel measures on R = (-00,00) and let B(N) denote the Borel algebra on N. The elements in N are usually denoted by v. A point process N is a measurable mapping from a probability space (n,:F, P) into (N, B(N)). Its distribution is a probability measure II on (N, B(N)). Put Ns = {v E N; v(t) - v(t-) = 0 or I}. Here we shall only consider simple point processes and therefore we omit the subscript S. With this convention any v EN can be looked upon as a realization of a simple point process. The shift operator Tx : N -+ N is defined by (Txv){A} = v{A + x} for A E B(R) and x E R where A + x = {t E R; t - x E A}. We put TxB = {v E N; T_xv E B} for any B E B(N). A point process is stationary if II {TxB} = II {B} for all x E Rand all B E B(N). From now on II is assumed to be the distribution of a stationary point process N with intensity Q' E (0,00). There always exists a random variable N with E[N] = Q', called the individual intensity, such that N(t)jt -+ N II-a.s. as t -+ 00. Let I be the

5 Stationary models

129

O'-algebra of invariant sets B E B(.N), i.e., of sets B such that B = T:r:B for all :I: E R. N is ergodic if 11 {B} 0 or 1 for all B E I. Since {v E .Nj lJ ~ :I:} E I for each :I: it follows that N = a 11-a.s. if N is ergodic.

=

v.

v.

For any B E I such that 0 < 11 {B} < 1 the conditional distribution 11{. I B} is stationary. Let denote the empty realization, i.e., {A} = 0 for all A E B(R). Thus 11 has the unique representation 11 = pAt

+ (1- p)11oo,

(4)

where 0 ~ p ~ 1 and A. and 1100 are probability measures on (.N, B(.N) such that A.{{v.}} 1 and 11oo{{v.}} o. A realization of a stationary point process contains 11-a.s. zero or infinitely many points, and thus 11oo{{vj v{oo) oo}} 1. The distribution of an ergodic point process cannot be a non-trivial mixure of stationary distribution, and therefore p = 0 since a > o.

=

=

=

=

Now we shall consider "the correspondence to the ordinary case" in the question above. In the case of renewal processes we started with an ordinary renewal process and obtained a stationary renewal process by choosing the distribution of 8 1 according to (3.1). If we start with a stationary renewal process the ordinary renewal process is obtained by conditioning upon the occurrence of a point at time o. In terms of a stationary point process this means that we want to consider probabilities of the form

11{B I N{{O}} = I}. The problem is thus to give such probabilities a precise meaning for a general stationary point process. Intuitively we consider an event Band successively shift the process so that its "points" fall at time o. If this had been a statistical problem - and not the question of a probabilistic definition - we had probably considered the proportion of times when the shifted point process belonged to B. Instead, we now consider the ratio of certain related intensities. Consider a set B E B(.N). Define the "B-thinned" process NB by

(5) where - as usual - IB(N)

={

I

if NEB

. ThIs. means that NB

conifNf/.B sists of those points in N for which the shifted point process belongs to B. Obviously NB is stationary.

o

Put .No = {v E.Nj v{ {O}} = I} and note that .No E B{.N) and that NXO = N. Let a{B} be the intensity of NB. It follows from Matthes et al. (1978, pp. 309 - 3q) that a{ . } is a measure, i.e., O'-additive, on (.No, B{.N0

».

130

5 Stationary models

DEFINITION 2. Let N be a stationary point process with distribution lJ. The distribution lJo, defined by

is called the Palm distribution.

lJO{B} is the strict definition of "lJ{B I N{{O}} = I}." For a precise interpretation of lJo as a conditional probability we refer to Franken et al. (1981, pp. 33 and 38). Define the (random) shift operator 0 by for II

i= 110

and recall that Sl (II) is the epoch of the first point - or claim - after time zero. It is sometimes convenient to extend lJo to (N, B(N) in the obvious way ~ follows: lJO{B} = lJO{B n N°} for all B E B(N). The point process NO with distribution lJo is called the Palm process. NO is not stationary but for all B E B(N). If BE I and lJ{B} = 1 it follows from (5) that lJO{B} = 1. This means especially that N° exists lJo-almost surely. Let U be the distribution of Nand UO the distribution of N°. Then we have, for B = {II; v::; x},

UO(x) = lJO{B} = a{B} = E[I[o,xJ(N)N] = a

a

For any non-negative B(N)-measurable function et al. 1981, pp. 26 - 27)

['leN)

Eoo[/(N)] = aooEo [Jo

I



x ydU(y)/a.

(6)

on N we have (Franken

]

I(Tt N ) dt ,

(7)

where "N" just stands for a point process whose distribution is indicated by the notation of the expectation and where a oo ~f Eoo[N(I)]

= a/(I- p).

For I == 1 we get E O[Sl] (= EO[Sl(N)] ) = l/a oo and thus (7) is an "inversion formula." (At least when p = 0, i.e., when lJ = lJoo , EO[Sl] = l/a is the strict definition of "the mean duration between two successive claims.") Since, in general,

E[t(N)] = PI(1I0) + (1 - p)Eoo[t(N)] we get

E[/(N)] == PI(1I0)

+ aE o

[l°

"I(N)

]

I(TtN) dt .

(8)

5 Stationary models

131

REMARK 3. In our attempt to give a heuristic motivation for Definition 2, we discussed the "proportion of times when the shifted point process belongs to B." In the ergodic case we have

lim NB(t) N(t)

t-oo

= lim

t_oo

NB(t)_t_ t N(t)

= a{B} a

II-a.s.

and thus (Matthes et al. 1978, p. 339) we get the "correct" result.

o

If N is a stationary renewal process then (Matthes et al. 1978, p. 367) NO is the corresponding ordinary renewal process. Note that these renewal processes are defined on R and that NO, as all Palm processes, has a point at O. The superscript 0 is standard for Palm processes, and therefore we also used it in connection with renewal processes in order to indicate an ordinary renewal process. EXAMPLE 4. We shall consider some examples of Palm processes. These examples will not be explicitly used, but they may support intuition. It is often convenient to withdraw the point at 0 and to consider the reduced Palm process N! with distribution II!. Formally N! is defined by

if 0 E A if 0 ¢ A

for A E 8(R).

If N is a Poisson process we have II = II! which is a characterization of the Poisson process. (This characterization also holds in the non-stationary case, although the Palm probability is somewhat differently defined.) Intuitively this means that knowledge of a point at 0 has no influence on the distribution of the rest of the process. This is quite natural, since the Poisson process is the only stationary point process with independent increments, and may be looked upon as a "Palm correspondence" to Theorem 2.11. Assume that N is a Cox process with distribution ITA given by, see (2.13), ITA J ITI' IT {dll}* and that A has the representation A(t) J~ >.( s) ds.

=

=

M

It follows from Kummer and Matthes (1970, p. 1636) that N! is a Cox process with IT~ = J ITI' IT!{dll} where (1l'(0) exists IT-a.s.) M

(9) (In Section 4.3 we considered "ordinary independent jump intensities" and "stationary independent jump intensities." Although the underlying ideas are related to Palm theory, the ordinary case is not the Palm process.)

*

Note that nand n!, in these comments about Cox processes, are distributions of random measures.

132

5 Stationary models

If N is a mixed Poisson process we have, of course, N = (9) are in agreement.

>. and (6) and

DO REMARK 1. CONTINUED. Assume that N is ergodic. Any customer who enters a queueing system also, hopefully, leaves it if 1] < 1. Then we ought to have P{V > O} . cx/e 1/1-'

......--------.---... arrival intensity busy server

=

,---""".........

_-

~ervice intensity'

=

P{V > O} cxl-'/e when e > CXI-' which is (I). or w(O) In the non-ergodic case, cx ought to be replaced by N. Obviously V and N are dependent. Then we ought to have

P{V> 0 I N}

N/e

.---... busy server

~rrival intensity

_-

,---_........

~ervice intensity

and "thus" P{V > 0 I N} = N I-'/e when e > N 1-'. "Thus" w(O) = E[P{V> 0 I N}l which is (1) if U is the distribution of N. Certainly this reasoning shall not be taken too seriously, but it may serve as an indication of the kind of results to be expected.

o

Now we consider the risk process. Let N be the restriction of a point process on R to R+. As in the survey II is the stationary distribution and IIo the Palm distribution. Recall that N(O) = 0 for all point processes on R+ and therefore we do not need to separate between the Palm process and the reduced Palm process. Let w( u, v) be the ruin probability when the claims are located according to the realization v of N. Thus

w(u) = E[w(u, N)l Put

WO(u) = EO[w(u,N)]

where E is with respect to II. where EO is with respect to IIo.

The following lemma may be of some independent interest. LEMMA 5. For any stationary risk model with 0

< cx < 00 we have

w(O) = CXI-' (1 - WO( 00)) + w( 00). e

PROOF: Put q,(u) = l-W(u), q,O(u) = l - WO(u), and q,(u, v) = l-W(u, v). For fixed v and t, 0 :::; t < Sl(V), we have - "standing" at time t -

q,(u,7tv) =

r+C($l(lJ)-t)

io

q,(u+e(sl(v)-t)-z,Bv)dF(z)

(10)

by a slight variant of the "renewal" argument used in Sections 1.1 and 3.2.

5 Stationary models

133

For t = 0 we get, denoting sl(N) by Sl, 0(u) = EO Using (8) with f(v)

[l + u

(u + CSl - z, ON) dF(Z)].

cS1

(11)

= (u,v) we get, since (u,ve) = 1,

(u) =p+aEo[l and by (10)

(u) = p+ aEO

[lSI l

u

S1

(u,JtN) dt]

+ (u + cv - z,ON) dF(z) dV]. clJ

(12)

The change of variables x = u + cv leads to

(u) = p+

~EO [l u +

cs1

1"'

(x - z,ON) dF(z) dX]

(13)

which is almost the same as (3.35). From Lemma 3.8 applied to

IV /-t, with probability one.

ApPENDIX

Finite time ruin probabilities

Up to now we have only considered the probability of ruin within infinite time, i.e., the probability that the risk business ever becomes negative. Let a time t be given and let - as usual - Tu denote the time of ruin. The finite time ruin probability w( u, t) is defined by

w(u,t)

= P{Tu:::; t} = P{u+X(s) < 0 for some s E (O,t]).

From a practical point of view, w(u, t), where t is related to the planning horizon of the company, may perhaps sometimes be regarded as more interesting than 'I ( u). Most insurance managers will closely follow the development of the risk business and increase the premium if the risk business behaves disquietingly bad. Also an orthodox probabilist will probably act in the same way, since he - despite a wish to keep his job - will believe that the underlying model is wrong. In this connection the planning horizon may be thought of as the sum of the following: the time until the risk business is found to behave "bad"; the time until the manager reacts; the time until a decision of a premium increase takes effect. It may therefore, in non-life insurance, be natural to regard t equal to four or five years as reasonable. Depending on the branch and the company it may be reasonable to consider a - when the time unit is years - of the orders 103 to 10 5 . Just to have some value in mind, we regard 50000 as a reasonable value of a . t. The intention of this appendix is to give some indication on when the infinite time ruin probability is also relevant for the finite time case. Intuitively one may expect that ruin - if ever - occurs as follows:

after a short time if u is small and p is large; after a long time if u is large and p is small.

136

Finite time ruin probabilities

More precisely, there exists - at least in some cases - a value Yo, such that

w(u,t) ""' w(u) when t > UYo while

w(u,t) < < w(u) when t < UYo for large values of u. Otherwise expressed, this means that either Tu = 00, i.e., no ruin, or Tu R: UYo, i.e., ruin. Our interpretation is that w(u) is most relevant when the planning horizon is longer than UYo. This does, however, not imply that we regard w(u) as irrelevant when the planning horizon is shorter than uYo. It seems quite natural to look beyond the first possibility to adjust the premium when the initial reserve u is determined. To start with we consider the classical model in some detail.

A.I The classical model Recall from (1.19) that

w(u, t) :S

e- ru

sup

e"(ah(r)-re)

=

e- ru

max (1 , et(ah(r)-re)).

(1)

0$"9

Obviously we can always choose l' = R, but it might be possible - at least sometimes - to choose a better, i.e., a larger, value of 1'. Put t = yu. Then (1) yields

w( u, yu) :S max( e- ru ,e-u(r-yah(r)+yre)) =

e- u min(r,r-yah(r)+yre)

and it seems natural to define the "time-dependent" Lundberg exponent Ry by Ry = sup min(1', l' - yah(1') + Y1'e) = sup(1' - yah(1') + Y1'e) r~O

r~R

and we have the "time-dependent" Lundberg inequality (Gerber 1973, p. 208) (2) Put

f(1') =

yah(1') + Y1'e

l' -

and note that f(R) = R, f(1') < Thus we have, since Ry ~ R,

l'

Ry ; R as

for

l'

> R and that f(1')

f'(R)

~

O.

IS

concave.

A.l The classical model

137

Since f'(R) = 1- yah'(R) + yc it follows that

= > 1 Ry > R as y:( ah'(R) _ c . The value Yo ~f ahl(k)

Ry

c

is called the critical value. For y

< Yo we have

= f(r ll ) where rll is the solution of f'(r) = o. f

(r)

R

y

r

FIGURE 1.

lllustration of notation when y

< Yo.

It follows from Arfwedson (1955, pp. 58 and 78) that

if yyo

and thus Ry is the "right" exponent. The constant Gil is given by C _ _ rll-TII rIlTIIv27ryah"(ry) ' II -

provided the claim distribution is non-arithmetic, where Ty is the negative solution of fer) - r = Ry - ry, and G is the constant in the Cramer-Lundberg approximation (III). Segerdahl (1955, p. 34) has shown that 'II ( u, t) '" N

t;;:o

(t -

u yo ) Ge- Ru . .,juvo

(4)

as U, t -+ 00, and is bounded where N(z) denotes the standard normal distribution function, i.e.,

N(:v)

= 1:1: .~e-z; -00

v 27r

dz

138

Finite time ruin probabilities

and

Vo =

ahll(R) (ahl(R) - C)3

(5)

-:--::--:-~-'--:-::-

(Our use of N' instead of the more usual

yo:

W(u yu)"" ,

6

_ (1+fJV)2

vY

..;2irii(1+ fJ y )

~

e ~u+1·e-62u""e-Ru.

Thus it is seen that (3) holds in this case. Assume now that t = YoU + f..jU. Then

W(u, t) "-'1-

N( 0VYou 2u + f3fVu ) + N( f3fVu ) e- Ru + fVu 0VYou + f..jU

t "" cons.

Vu

e

_ const. -Ru e +

- Vu

2u U+fJf u 62fJ2(U+fJf u)

N(

f

0/(33/2

)

+ e

N ( __f _ ) 0/(33/2

-Ru

"-'

N(

e f

-Ru

0/(33/2

)

e

-Ru

.

Thus (4) also holds.

o

From Table 1 it seems highly relevant to consider the infinite time ruin probability when P ~ 5%. Certainly no general conclusions may be drawn from this simple case. Consider now any claim distribution and assume that Rand W(uo) are known for some Uo so large that (III) is a good approximation. Then

w(uo) ~ Ce- Ruo

or

C ~ W(uo)e Ruo .

Specifying w(u) we thus get w(u) ~ W(uo)e-R(u-u o) or

u ~ Uo + 10g[W(u~/W(u)1.

(11)

Next we observe, compare (III) and (1.26), that C = ayo . PI-l or C w(uo) R ayo = - = - - e Uo PI-l PI-l

and thus

(12)

W(uo) ( 10g[W(uo)/W(u))) Ru a . uYo = - - Uo + Reo. (13) PI-l Naturally it is desirable to choose Uo such that w(uo) ~ W(u). If Uo = u (13) is reduced to u R u a· uYo = -w(u)e u ~ - , (14) PI-l PI-l where the inequality follo~s from (IV).

142

Finite time ruin probabilities

It may be natural to exploit the simple De Vylder approximation, discussed in Section 1.1, which worked so well in the infinite time case. Recall that the idea was to replace the risk process X with a risk process X having exponentially distributed claims and parameters

and where (k = E[Zj]. Applying (6) to

_ 9(~ a = 2(§a,

X we get

_ _ 10g[(1 a· uyo = -

+ p)w(u)] -2

P

and, if UYo is a good approximation of uyo, a . uyo

~

_

a . uyo =

a _ _

-=a . a . uyo =

(210g[(1

10g[(1

+ p)w(u)] S!.p2 ()(

+ ~p)w(u)]

(15)

2(fP 2

=

EXAMPLE 3. r-DISTRIBUTED CLAIMS. We consider the case with p 10% and where the claims are r-distributed with J.I 1 and /7 2 100, which was discussed in Section 1.2. From Table 1.1 it follows that W(1200) = 0.10834. Since R 0.0017450 it follows from (11) and (13) that w( u) 0.1 corresponds to

=

=

= u

=

~

1246

and

a· uyo

~

10957.

Applying the De Vylder approximation (15) we get

a . UYo

= 10999,

which is almost perfect.

o

EXAMPLE 4. MIXED EXPONENTIALLY DISTRIBUTED CLAIMS. Consider now the claim distribution (1.35)

F(z) = 1 - 0.0039793e-o.014631z_ 0.1078392e-O.190206z - 0.8881815e-5.514588z

for z

2:

0

discussed in Section 1.2. Using Tables 1.2 and 1.3 we get from (12), (10), and (14) the values of u, a . UYo and a . UYo given in Table 2. We have, however, used more accurate R-values, than those given in Table 1.2. For p 5% and 10% we used Uo 1000 and otherwise Uo 100.

=

=

=

A.1 The classical model

143

TABLE 2. Values of u, 0'. UYo, 0'. Uiio, and 0'. y'UVa for mixed exponentially distributed claims when w(u) = O.l.

p

u

5% 10% 15% 20% 25% 30% 100%

1068 567 398 312 258 222 72

0' •

uYo

18703 4386 1830 967 582 381 16

0' .

UYo

0'.

18778 4447 1878 1006 615 408 21

y'UVa 19214 4946 2253 1293 841 592 54

As for exponentially distributed claims we see that 0:' • UYo and 0:' • Juvo are of the same order for the values chosen. As mentioned in Section 1.2, this claim distribution has been considered by Wikstad (1971). The values in Table 3 are t.aken from Wikst.ad (1971, p. 151). TABLE 3.

u

Mixed exponentially distributed claims.

p

0'.

UYo

w(u, 10)

w(u, 100)

w(u, 1000)

W(u)

100 100 100 100 100 100 100

5% 10% 15% 20% 25% 30% 100%

1751 773 460 310 225 172 23

0.0094 0.0094 0.0093 0.0093 0.0092 0.0092 0.0087

0.0896 0.0863 0.0833 0.0804 0.0777 0.0751 0.0497

0.4115 0.3618 0.3186 0.2813 0.2493 0.2219 0.0723

0.7144 0.5393 0.4247 0.3455 0.2886 0.2461 0.0724

1000 1000 1000 1000

5% 10% 15% 20%

17505 7734 4600 3105

0.0000 0.0000 0.0000 0.0000

0.0000 0.0000 0.0000 0.0000

0.0004 0.0003 0.0002 0.0002

0.1149 0.0210 0.0054 0.0018

Like in Example 1 we consider p = 0.05. For u = 100 we have 0:' • Juvo = 5878 and 0:' • Yo (u) = 3000. First we note that correction term in (6) has as to be expected - high influence. Further, we have

.NCOO~;;:751) '11(100) ~ 0.449·0.7144 =

0.321

144

Finite time ruin probabilities

which shall be compared with W(100, 1000) = 0.4115. This indicates that (4) does not hold with good accuracy in this case. Especially we note 1000 < 0: • 100yo while W(100, 1000) > ~ W(100). Our conclusion is that u = 100 is "too small" in this case. For u = 1000 we have 0: . .juvo = 18589. Then

N(100~~:;505) W(1000) = N(-0.89)W(1000) ~ 0.187·0.1149 = 0.021 which is of a different order than W(1000,1000) = 0.0004. We do admit that this total lack of accuracy is very surprising, especially since N( -0.89) is not a "tail value." The reader's - and certainly our - first reaction is probably that there is a computional error. The following crude estimates do, however, indicate that this is not the case. We have, for R = 0.002,

h'(R) ~ (1 + R· (2 +

R2

"2 . (3

~ 1.102

and

h"(R) ~ (2 + R· (3 ~ 58.63

and thus, for u = 1000,

UYo

1

~ 19300 and 0:' .juvo ~ 20500 1.102 - 1.05 which indicates that the lack of accuracy is not due to a computional error. For these two choices of u and t we have 0: •

~

= 10, = 0.002651, Ry = 0.002167, Ty = - 0.000742, Cy = 25.303321, y

y

Ty

Ty

= =

1, 0.007864,

Ry = 0.005578, Ty = - 0.011365, Cy = 4.387626

and thus

WA(100, 1000)

=

= 2.037,

WA(1000, 1000) = 0.000525.

Obviously u 100 is also "too small" in this case while u a reasonable - although not very good - approximation.

= 1000 leads to

o

It is, as always, difficult to draw conclusions from a few numerical illustrations. It seems, however, as the statement that "either Tu = 00 or Tu ~ uYo" require very large values of u in order to be "true." This, in turn, makes UYo large. In order to claim that w( u, t) ~ w( u) it is, of course, enough that P{Tu > t I Tu < oo} is small, which ought to be the case more generally as soon as t > > uYo. Our way to first choose p and then determine u by specifying w( u) is natural from a "risk theoretic" point of view. For a dangerous risk distribution we thus get a large value of u and consequently a large value of uYo. We could have argued in an "opposite" way, i.e., to first choose u and then determine p by specifying W(u). Certainly the approximations (3)

A.2 Renewal models

145

and (4) had not worked, but P{Tu > t I Tu < oo} had probably become small. In "the theoretical" practice it seems natural to allow p, u, and w( u) to be large for a dangerous risk distribution. In "the real" practice p and u are probably chosen more by economical - than by risk theoretic considerations. Some practical working actuaries consider u = a . J.l to be reasonable. Then it follows from (14) that uyo ~ 1/ p. If p = 20%, which also is regarded as practically reasonable, we have uyo ~ 5 years ~ the planning horizon.

A.2 Renewal models Let N be an ordinary renewal process and let Sn denote the epoch of the nth claim. The Laplace transform of the inter-occurrence time distribution is denoted by ko. Recall from Chapter 3 that R is the positive solution of (h(r) + l)k O(cr) = 1. The asymptotic expressions (4) and (3) have been generalized by von Bahr (1974) and Hoglund (1990), respectively. Their analysis is based on the two-dimensional random walk

((X(Sn), Sn); n = 0, 1,2, ... }. Put, in order to simplify notation, if k = 0 if k

>0

and

where (k) denotes the kth derivative. Note that HoI 0, i.e., ()(r) is convex. Note that, since Tu = SN",

> 0 while

Mu(Tu) = e-r(u+X(T.. »-B(r)T.. on {Tu

~'«()(r)+cr)

1

- HoKI

y:( ()'(R) = HIK o + cHoK I = Yo·

< Yo we have Ry = f(ry) where ry is the solution of f'(r) = O.

EXAMPLE 6. MIXED EXPONENTIALLY DISTRIBUTED INTER-OCCURRENCE TIMES. Consider the inter-occurrence times distribution

KO(t) = 1 - PIe-BIt - P2e-B2t with Laplace transform

for t :::: 0

(17)

A.2 Renewal models

Since

Klc

149

0

0

1 k!P2 2 ) = (-1) lc (k!P1 (0 1 + cR)lc+1 + (0 2 + cR)lc+1 '

Yo and vo follows from (16). Now we consider Yo in some detail. It follows from (16) that Yo

=

1

H K -~

-c

or

1 H1KO - = - H K - c. Yo 0 1

(18)

Since (19) we get

Ko = 0102 + (p1 th

+ P2 02)cR + cR)

(01 + CR)(02

and, by differentiation of (19),

K1 and thus

= (P1 01 + P2 02) -

Ko(2cR+ 01 + O2) (01 + CR)(02 + cR)

1 H1[0102 + (P101 + P202)cR] - -c . Yo - H O[P1 01 + P202] - [2cR+ 01 + O2]

Recall from Theorem 2.38 and Example 2.37 that N is a Cox process. In Example 2.37 we considered a Cox process N whose intensity process ..\(t) was a two-state Markov processes with a1 = O. It was shown that N is a renewal process with KO given by (17). It follows from (19) and the form of kO given in Example 2.37 that

P1 01 + P202 = a2, 0102 = a2771 ,

01 + O2 = a2 + 771 + 772·

Using these relations we get 1 -Ht[a2771 + a2cR] = -c Yo HOa2 - (2cR + a2 + 771 + 772)

-

-h'(R)a2[cR+ 77d = h(R)a2 -c - (2cR+ 77~ + 772) .

(20)

Like in Section 3.3 we now consider the special choice

P1 = 0.25,

p~

= 0.75,

01

= 0.4,

and

O2

=2

150

Finite time ruin probabilities

which corresponds to a2

= 1.6,

771

= 0.5,

and

772

= 0.3.

Recall from (3.10) that

WO(u) = (1- JJR)e- Ru when the claims are exponentially distributed. (The superscript 0 refers to the "ordinary case.") In Table 4 we give u, a . UYo, and a· Juvo for the same values of p and w( u) as given in Table 1. TABLE 4. Values of u, 0/ • UYo, and 0/ • y'iiVii for exponentially distributed claims when I' = 1 and mixed exponentially distributed inter-occurrence times.

p

WO(u)

U

0/ •

UYo

y'iiVii

0/ •

1% 1% 1%

0.01 0.05 0.10

811 527 405

80512 52340 40206

53276 42955 37648

5% 5% 5%

0.01 0.05 0.10

166 107 82

3205 2078 1593

2154 1734 1518

10% 10% 10%

0.01 0.05 0.10

85 55 42

798 515 394

546 439 383

20% 20% 20%

0.01 0.05 0.10

45 29 22

198 127 97

140 112 98

Roughly speaking, Table 4 and Table 1 give a very similar impression. As we did in Example 1 we consider approximation (4). For p = 0.05, a 1, and JJ 1 we have (Wikstad 1971, p. 150)

=

=

WO(100, 1000)

= 0.0196

and

WO(100)

= 0.0614.

For this choice we get UYo = 1935 and -Juvo = 1673 and thus

AfCOO~~~935) w°(100) ~ 0.29·0.0614 =

0.0177

which indicates that (4) holds with reasonable accuracy in this case. Consider now the claim distribution discussed in Example 4. Some illustrations of this combination of inter-occurrence time and claim distribution were given in Section 3.3.

A.2 Renewal models

151

The values in Table 5, which shall be compared to Table 3, are taken from Wikstad (1971, p. 151).

TABLE 5. Mixed exponentially distributed claims and mixed exponentially distributed inter-occurrence times. 1.1

p

Q' •

1.IYo

1760 780 465 315 229 174 23 17596 7804 4654 3148

100 5% 100 10% 100 15% 100 20% 100 25% 100 30% 100 ·100% 1000 5% 1000 10% 1000 15% 20% 1000

1l1 o(1.1,10)

111 0 (1.1,100)

1l1 0 ( 1.1,1000)

1l1 0 (1.I)

0.0103 0.0103 0.0103 0.0102 0.0102 0.0101 0.0096 0.0000 0.0000 0.0000 0.0000

0.0932 0.0898 0.0867 0.0837 0.0809 0.0782 0.0519 0.0000 0.0000 0.0000 0.0000

0.4209 0.3710 0.3274 0.2897 0.2572 0.2293 0.0754 0.0005 0.0003 0.0002 0.0002

0.7231 0.5502 0.4356 0.3557 0.2978 0.2544 0.0754 0.1225 0.0232 0.0061 0.0020

Naturally we can give a table corresponding to Table 2. Since Tables 3 and 5 are almost identical, this would be of very restricted interest.

o

EXAMPLE 7. LIFE ANNUITY INSURANCE. Like in Example 1.8 we now consider the case c < O. (We do not assume F(O) = 1, as was done in Example 1.8.) Ruin does not need to occur at claim epochs and consequently the martingale approach used in Chapter 3 does not work. Note that in that approach - only c ~ 0, and not F(O) = 0, is of importance. The martingale approach used here does, however, also work in this case. Assume that (h(r) + l)kO(cr) = 1, where - in this example - h(r) = f~oo erz dF(z) - 1, has a solution R > o. Since (}(R) = 0 we have, for

r=R,

Mu(t) = Noting that c


1 this inequality is slightly weaker than in the classical case. Although fe°(eR) can be replaced by the constant one (Thorin 1971, p. 141) when F(O) = 1 and the interoccurrence times are mixed exponentially distributed, this can (Thorin 1971, pp. 139 - 140) - in general - not be done. Consider now the stationary case. Then Vo is a random variable with density a(1- l{°(v)), and we have

W(u) < E[Mu(O)] < e-RUE[ eeRVo [00 e-eR'kO(s) - E[Mu(Tu) I Tu < 00] 1 - l{O(Vo) Jvo

=

dS]

a(1- fe°(eR)) -Ru ah(R) e- Ru eR e = -eR- -:-h"(R""):-+----::-1

From (3.8) and (3.42) it is seen that there is the same relation between the ordinary case and the stationary case as when e ~ O.

o

A.3

COX

models

Very little seems to be known in the Cox case, except when the intensity process A(t) is an M-state Markov process. Put, as in Remark 4.56, H(r) = h(r)d(o:) - reI + H, where 0: is the "state vector" and H the intensity matrix. Recall that 0 is an eigenvalue of H(R) with maximal real part. Let K(r) be the eigenvalue of H(r) with maximal real part. Asmussen (1989, p. 80) has shown that K(r) is convex and further K(O) K(R) O.

=

=

A.3 Cox models

153

Thus R is the positive solution of K(r) = O. As mentioned in the introduction to Chapter 4, Asmussen (1989) considers the Cramer-Lundberg approximation in this case. More precisely, Asmussen (1989, p. 92) shows that lim eRu"l}ii(u) = Ci, u-oo

where "l}ii(U) denotes the ruin probability when A(O) = O:i. Further Asmussen (1989, p. 94) shows that (4) holds with Yo

1

= K'(R)

and

Vo

K"(R)

= K'(R)3

(21)

for all initial values of A. Now we restrict ourselves to the case M = 2, i.e., the case where the intensity is a two-state Markov process. Then H = (0-4111 . a2 this case K(r) = ~ (h(r)(O:l + 0:2) - 2cr - (111 + 112) +

111). In -112

y'[h(r)(O:l - 0:2) - (111 -112)]2 + 4111112 ) and

K'(r)

=! (h'(r)(O:l + 0:2) _ 2c+ [h(r)(O:l -

0:2) - (111 -112)]h'(r)(0:1 - 0: 2») y'[h(r)(O:l - 0:2) - (111 -112»)2 + 4111'72

2

and thus we get

K'(R) -_ -1 (h'(R)( 0:1 2

=

2C - .::........;'--';-7=:.;---'-~--=-~~;'--':....!.[h(R)(O:l - 0:2) - ('71 - '72)]h'(R)(0:1___ - -..:.. 0:2») h(R)(O:l + 0:2) - 2cR - (111 + '72)

2h(R)h'(R)0:10:2 - h'(R)[cR(O:l + 0:2) + 0:1'72 + 0:2'71] h(R)(O:l + 0:2) - 2cR - (111 + '72) - c.

REMARK

K'(r)

+ 0:2 ) -

=

= =

8. For 0:1 0:2 0:, i.e., in the Poisson case, we have '21 (h'(r)20: - 2c - 0) h'(r)o: - c and 1I:"(r) = h"(r)o:,

=

i.e.,

1

Yo=""""'''''''''''--

h'(r)o: - c

and

o:h"(R)

Vo

= (o:h'(R) _ c)3 .

For 0:1 = 0, i.e., in the renewal case, II:'(R) reduces to

II:'(R) = which equals (20).

o

-h'(R)0:2[cR + 111] _ c h(R)0:2 - 2cR - ('71 + 112)

154

Finite time ruin probabilities

EXAMPLE 9. Consider the case when the claims are exponentially distributed with J1. = 1. As in Section 4.1 - see Tables 4.1 and 4.2 - we specify a model by a1 and E[u]. Then

= 0.0003125, = 0.03125, 711 = 3.125,

= 0.00125 = 0.125 712 = 12.5

= 1000; = 10; for E[u] = 0.1.

'171

712

for E[u]

'171

712

for E[u]

The use of E[u] as a characteristic came from the representation of a two-state Markov process as a Markov process with independent jumps.

TABLE 6. Two-state Markov intensity and exponentially distributed claims in the case a = IL = 1.

a1

a2

E[u]

YO for p=5%

Yo for p= 10%

Yo for p= 15%

Yo for p 20%

Yo for p= 25%

=

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

1000 1000 1000 1000 1000

19.3006 19.0889 18.6955 17.7308 19.0477

9.3471 9.1682 8.8600 8.2580 9.0909

6.0566 5.9062 5.6687 5.3424 5.7971

4.4296 4.3042 4.1266 4.0232 4.1667

3.4664 3.3633 3.2374 3.3253 3.2000

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

10 10 10 10 10

19.2817 19.0816 18.7682 18.4236 19.0477

9.3275 9.1547 8.8932 8.5970 9.0909

6.0363 5.8873 5.6707 5.4254 5.7971

4.4085 4.2805 4.1028 3.9071 4.1667

3.4445 3.3352 3.1915 3.0417 3.2000

0.00 0.25 0.50 0.75 1.00

5 4 3 2 1

19.0580 19.0477 19.0458 19.0471 19.0477

9.1012 9.0911 9.0892 9.0904 9.0909

5.8074 5.7974 5.7954 5.7966 5.7971

4.1769 4.1670 4.1650 4.1662 4.1667

3.2102 3.2004 3.1983 3.1995 3.2000

0.1 0.1 0.1 0.1 0.1

The most striking impression of Table 6 is certainly that Yo essentially only depends on p.

o

Let us now consider the case when >.(t) is a markovian jump process. The discussion here will be closely related to the discussion in Section 4.5.2 and notation used there will not be redefined here. Let us, however, recall that in Section 4.5.1 - where we applied our general method - we considered

A.3 Cox models

155

the filtration F given by F t = F! V F{ and the F-martingale

e-r(,,+X(t» M(t) = eA(t)h(r)-trc' In Section 4.5.2 we considered the vector process (X, A), the filtration F given by F t = Fl- V F{ , and an F(X, >'Lmartingale of the form

M,,(t) = g(A(t))e-R(,,+X(t)), where g is a positive function. The way to find g was to use Proposition 4.5 which says that ifY is a Markov process with generator A and v a function in the domain of A such that Av == 0, then v(Y(t)) is an FY -martingale. Neither of these approaches seem quite applicable in the finite time case. In M(t) the dependence of t is probably too complicated while in M,,(t) we cannot vary r. We will here generalize the approach in Section 4.5.2. We shall make use of the following special case* of an observation by Davis (1984, p. 370):

Let Y = {Yi; t ~ O} be a (homogeneous) piecewise-deterministic Markov process with generator A, v a function in the domain of A, and {} a differentiable function such that

-{}' . v + Av == O. Then M, defined by M(t) = e-!?(t)v(Yi), is an FY -martingale. INDICATION OF PROOF: We have

E:Fi [e-!?(t+.l)v(Yi+.l)] - e-!?(t)v(Yi)

+ ~(Av)(Yi) + o(~)) - e-!?(t)v(Yi) e-!?(t) [((1- ~{}(t) + o(~))(v(Yi) + ~(Av)(Yi) + o(~)))l - e-!?(t)v(Yi) = e-!?(t+.l) (v(Yi)

=

= ~e-!?(t)v(Yi) (-{}'(t)v(Yi) + ~(Av )(Yi) + 0(1)).

I

Now we apply this to (X, A), which is a piecewise-deterministic Markov process. For any fixed r < we look for a positive F-martingale ( F F(X, >.) ) of the form

roo

=

M,,(t) = e- 8 (r)t g(A(t))e- r(,,+X(t)), where g(O) = 1 and g(f) > O. It follows from Proposition 4.52 that M" is an F-martingale if g and OCr) satisfy

g(f)[fh(r) -

*

rc - OCr)] + TJ(f) loo g(z)p£Cf, dz) -

TJ(f)g(f)

== O.

(22)

At least when {} is monotone, this special case follows directly from Proposition 4.5 (Dynkin's theorem) applied to the vector process ({), V), which then is homogeneous. We will emphasize that the "Davis observation" applies to processes with more "genuine" inhomogeneity. Further Davis explicitly calculated the (generalized) generator.

156

Finite time ruin probabilities

When A is an independent jump markovian intensity with state space 5, it follows - compare Theorem 4.53 - that Mu is an F-martingale for

gel) _

- rc + OCr)

7](£)

+ 17(£) -

h(r)£

provided rc + OCr)

(a)

fOO

(b)

Jo

+ 7](£) -

rc + OCr)

h(r)£ > 0,

7](£)

+ 7](£) -

h(r)£

PL-a.s.; pL(d£) = 1.

If, for example, 17(£) == 1 and PL((£, 00)) > 0 for all £ > 0 - compare Example 4.26 - there exists no OCr) satisfying (a) for r > O. Thus the existence of F-martingales of the required form is not guaranteed in this case. In order to avoid this and some other technical problems we assume that R - defined by Proposition 4.49 - is positive. Further we assume for all r E [0, ro), where ro is some value larger than R - that (a) and (b) have a differentiable and convex solution OCr). REMARK 10. We will indicate that the above assumptions are natural and ought to hold in "kind" cases. Since h is infinitely differentiable 0 ought to be differentiable. The convexity seems natural since - see (4.86) - (a) and (b) are equivalent with

1=

11 00

00

e- 8 [rc+8(r)+f/(l)-h(r)l] 7](£) ds pLCd£).

Provided we may change the order of integration and differentiation we get 0= d22

fOO foo e-·[rc+8(r)+f/(l)-h(r)l] 7](£) ds PL(d£)

Jo Jo

dr

=!!....

foo foo -s[c + O'(r) _ h'(r)£]e- s[rc+8(r)+I](l)-h(r)l] 7](£) ds PL(d£)

dr Jo

Jo

=fl°O (s2[c + O'(r) -

h'(r)£]2 - s[O"(r) - h"(r)£]).

·e-·[rc+8(r)+I](l)-h(r)l] 7](£) ds pLCd£), which is possible only if O"(r) > o. When 0 can be explitly found, we can - of course - directly check the assumptions. One such example is Case 2 in Section 4.6, where

PL(d£) = { Then we have

1= __

17_log 2h(r)

~

(1 ~

if 0 ~ £ ~ 2 otherwise

2h(r)

rc + OCr)

+ 7]

)

and

h

w en

h() rc + O( r) r < 2

+ 17

A.3 Cox models

157

and thus

O(r) -

2h(r)

- 1 - e- 2h (r)/f/

- rc - 11.

Obviously 0 is differentiable. In order to check if 0 is convex we consider functions a, b: R+ -+ R. If a and b are convex and a' ~ 0 then a(b) is also convex, since

a(b)"

= a"(b)(b')2 + a'(b)b" ~ o.

Put

z

a ( -z ) --1- e-It:

b(r) = 2h(r)

and

11

and note that O(r) = 11 . a(b(r)) - rc - 11. Thus it is enough to check that a' , a" ~ o. Straightforward derivation yields

a'(O)

1

= "2'

and

Since Z

+ 2 + zeit: -

00

1:+1

1:=0

=

00

X

00

-=2

1:

2elt: = x + 2 + ~ _x_ - ~ L...J k! L...J k! kx1:

+2 +z + ~ - L...J k! 1:=2

1:=0

00 2z1: 2 - 2x - ~ -

>0

L...J k! -

1:=2

the convexity follows.

o

We can now - in principle - proceed as in the classical and renewal cases. Nevertheless, since the martingale is slightly more complicated in this case, we give a detailed derivation. Let r < ro be fixed and let O(r) be the solution of (a) and (b). From (4.40) we get, in the "usual" way,

< -

g(A(O))e- ru EFo [inf~$t$!lu e- 8 (r)tg(A(t)) I Tu :$ yu]

158

Finite time ruin probabilities

~

g(A(o»e-umin(r, r-y/J(r» E:Fo [info~t~yu g(A(t» I Tu ~ yu] .

The problem - which was the reason for giving details - is that we, like in Theorem 4.53, must ensure that 1

~~~------~~~~---, o.

Then we have 1

E:Fo [info~t~yu g(A(t) ) I Tu

~ yu]

re+O(r)+11(f)-h(r)f < 1

~ sup lES

1

9(0) {.

re+O(r)

= ~~~ 11(f) - + (3 . Since O(R) = 0, which follows from Proposition 4.30 (b), we get, exactly as

in previous cases,

11(A(O»

(

~ Rye+O(Ry)+11(A(O»-h(Ry)A(O) 1+ where

Ry=R

>

and, for y

as

y

>

o. Any fixed r > 0, fulfilling (23), is the Lundberg exponent in a modified risk process with e replaced by e + Since r > 0 we have, also in the modified risk process, positive safety loading. Thus Remark 4.46 is applicable, which means that 0 is an eigenvalue of H(r) - O(r) with maximal real part. Since, for any eigenvalue ~ of H(r), ~ - O(r) is an eigenvalue of H(r) - O(r), it follows especially that O(r) = ~(r). The "usual" martingale argument leads to Yo = 1/~'(R) which is in agreement with (21). '

¥.

A.4 Diffusion approximations

159

AA Diffusion approximations If little is known in the Cox case, nothing is known in the general case. The only method - known to us - which works for a very general class of underlying point processes is the "diffusion approximation." This approximation was discussed for the Poisson model in Section 1.2 and for the Cox model in Section 4.6. Recall, however, that its accuracy is not very good. Let, as usual, the occurrence of the claims be described by a point process N. Assume that . Var[N(t)] 2 11m (24) =O'N t ..... oo

t

and that

as n where

Nn(t)

(25)

-00,

= N(n~ ant

and W a standard Wiener process. Strictly speaking, only (25) is necessary for the diffussion approximation, but with (24) O'N gets a natural interpretation. The assumptions do not seem too restrictive. We have seen that they hold in the Poisson case with O'iv a and in the Cox case, compare Section 4.6, with O'iv a + O'X. In the renewal case, see Billingsley (1968, p. 148), they hold with O'iv O'ja 3 , where O'j is the variance of the inter-occurrence time distribution. Define Sn by - aj.tnt S- n (t) -_ S( nt)yrn ,

=

=

-

where S(t) =

N(t) Ek=1 Zk.

Sn

~

=

Then, see for example Grandell (1977, p. 47),

Jj.t20'iv + a0'2 . W

as n -

00.

(26)

REMARK 11. In order to make (26) probable, we will indicate the proof of its "one-dimensional version," i.e., that

Sn(t) Put S(k)

= E;=1 Zj

~

Jj.t20'iv + a0'2 . W(t)

and note that S(t)

as

= S(N(t)).

n -

00.

We have

Sn(t) = S(N(n~- aj.tnt =O'.VN(nt). S(N(nt))-j.tN(nt) +j.t. N(nt)-ant n O'VN(nt) Vii

160

Finite time ruin probabilities

4

J

D:0- 2 + f..l 20-Jv . W(t),

where W 1 and W 2 are independent standard Wiener processes. The notation 4 means "equality in distribution." D Define Yn and Y by

Y. ( ) _ cnnt - S(nt) n

t -

Vii

and

Then, see Section 4.6, d

Yn->Y

n->oo

as

if and only if

. r:::

Pn V n =

D:f..l r::: V n -> / D:f..l

Cn -

as

n

-> 00.

Recall from the survey "Basic facts about weak convergence" in Section 1.2 that Yn ~ Y implies inf Yn(t) ~

0~t90

inf Y(t)

09~to

for any to

< 00

but not necessarily inft>o Yn(t) ~ inft>o Y(t). Define w;'(uo, to) and wD(uo, to) by-

w;' (uo, to) = P{ and

inf Yn(t) 0990

< -uo}

wD(uo,to) = P{ inf Y(t) < -uo}. 0990

Then and, see (9),

(27) where

R_ -

2/D:f..l f..l2 uJv D:U2 •

+

Consider now a risk process X with relative safety loading p and the corresponding ruin probability w( u, t). Then we have, for each n,

w(u, t)

= p{

inf X(s)

0~'9

< -u}

= p{

inf cs - 5(s)

o~.~t

< -u}

A.4 Diffusion approximations

_p{. f

-

cs-S(s) < - u

m o~89..;n

..;n

}_p{ .f -

161

cns-S(ns) < - u} .

m O~.9In..;n

..;n

Assume now that p is small, u is large, and t is very large in such a way that p-l, u, and ..;t are of the same large order. Put

/ = p..;n,

Uo

u =..;n,

and

to

= -nt

(28)

where n is chosen such that /, Uo, and to are of the same moderate order. This leads to the diffusion approximation

w(u, t) ~ WD(uo, to) = WD(U, t),

and

R D

=

2pcxl-' + cxu 2

1-'2 uJv

=

(29)



=

=

As an illustration of (29) we consider p 5%, u 100, and t 1000 in the four combinations of exponentially /mixed exponentially distributed claims and Poisson/renewal case discussed in Examples 1, 4, and 6. In all cases we have cx = I-' = 1. Note that the diffusion approximation does not differ in the ordinary and stationary case. Then we have

u2 = 1 for exponentially distributed claims; u 2 = 42.1982 for mixed exponentially distributed claims;* uJv = 1 in the Poisson case; uJv = 2.5 in the renewal case. ** TABLE 7. lllustration of the diffusion approximation. The values of 0'2 and O'Jv indicate the model.

0'2

O'Jv

W(100, 1000)

WD(100, 1000)

1 42.1982 1 42.1982

1 1 2.5 2.5

0.0019 0.4115 0.0196 0.4209

0.0013 0.5565 0.0170 0.5640

* F(z) = 1_0.0039793e-o.014631z _0.1078392e-O.190206z _0.8881815e-5.514588z. ** KO(t) = 1 - 0.25e- 0.4t - O.75e- 2t .

162

Finite time ruin probabilities

From Table 7 it is seen that the accuracy of the diffusion approximation is, as was to be expected, not very good. Define, for any zED and any u ~ 0 the function tu: D -+ [0,00] by

tu(z) = inf{s

~

0 I z(s)

< -u}.

Note that tu(X) = Tu. Further, Yn ~ Y

implies

(30)

For I, Uo, and to given by (28) and Xn defined by Xn(t) = X(nt)/Vn we get

tuo(Xn) = inf{s

~ 0 I X(ns) < uovn} =

..!.tuov'n(X) = ..!.tu(X). n

n

Thus we have (31)

and Put, see (9) and the definition of Y, D def 1 Y - -a - alJl

1

and

aJlPVn

v

D def Jl2 uJv + au 2 Jl 2U2 + au 2 - ~....;N:..:....,-::---,C"i""::"" a (a I Jl)3 (aJlp)3 n 3/2 .

Equations (29), (30), and (31) lead to the approximations Yo ~ YO D

def

1

= -aJlp

and

As an illustration we consider the same cases as in Table 7.

TABLE 8. illustration of the diffusion approximation for p = 5%. The values of 0'2 and O'Jv indicate the model. 0'2

O'Jv

Yo

YOD

..jVOD

Fa

1 42.1982 1 42.1982

1 1 2.5 2.5

19.05 17.51 19.35 17.60

20 20 20 20

126.49 587.84 167.33 597.96

126.49 587.87 167.33 597.98

The values of Yo in Table 8 indicate that

YO D

works reasonably well for

P = 5%. This is also true in the Cox case when A(t) is the two-state Markov process illustrated in Table 6. The accuracy of VO D is almost perfect. Generally speaking, the approximations YO D and VO D seem to work better than

A.4 Diffusion approximations

163

expected from the poor accuracy of the diffusion approximations of ruin probabilities. In Section 4.6 we had a similar experience when using the diffusion approximation as a motivation for certain approximations of the Lundberg exponent. In that case those approximations also worked rather well for larger values of p. It is tempting to see if this is also the case here, and therefore we consider in Table 9 p = 20%.

TABLE 9. lllustration of the diffusion approximation for p = 20%. The values of (12 and (lJ.r indicate the model.

(12

(lJv

Yo

YOD

.JVO D

Fa

1 42.1982 1 42.1982

1 1 2.5 2.5

4.17 3.10 4.44 3.15

5 5 5 5

15.81 73.25 20.92 74.51

15.81 73.48 20.92 74.75

Table 9 indicates that YO D does not work so well for p = 20% but the accuracy of VO D is still almost perfect. Out of sheer curiosity - this is probably only of little interest - we consider in Table 10 p = 100%. The figures do not require any comments.

TABLE 10. lllustration of the diffusion approximation for p = 100%. The values of (12 and (lJv indicate the model.

(12

(lJ.r

Yo

YO D

1 42.1982 1 42.1982

1 1 2.5 2.5

0.50 0.23 0.65 0.23

1 1 1 1

.JVO D

Fa

1.41 6.37 1.90 6.47

1.41 6.57 1.87 6.69

Let us go back and consider the Poisson case. It follows from (14) that This indicat~s that the our final remarks in that case, which • were based on (14), seem to hold more generally. Yo ~ Yo D

164

Finite time ruin probabilities

REMARK 12. In the Poisson case we exploited the De Vylder approximation. Applying that idea to Yo - rather than to Q • uyo - and Vo we get

_

1

Yo

1

= iijip(l + p) = QIlP(l + 23~13 p)

The fact that Vo = VOD may partially explain why so well for larger values of p.

VOD

seems to also work

o

d

13. Some readers are perhaps puzzled by the fact that Yn Y implies tu(Yn) ~ tu(Y) but not inft~o Yn(t) ~ inft~o Y(t). We will therefore consider weak convergence in slightly more detail than we did in Section 1.2. Recall that D is the space of right-continuous functions with left-hand limits. endowed with the Skorohod J 1 topology. Let C denote the subspace of continuous functions. Let X, Xl, X2, ... be functions in D. For x E C the convergence Xn - x is equivalent to REMARK

for all to

< 00.

If x rt C the definition of Xn - x is more complicated in that sense that a sequence of time transformations is introduced. Define, for any zED and any to < 00, the functions ito and i: D [-00,00] by

i(z) = inf z(s).

and

.~o

The function ito is continuous on C (and on D): Put d n = sUPo o. Denote the distribution of Xn (X) by Pn (P) and note that P{C} = 1. The "main theorem of weak convergence" states that X ~ X implies

J(Xn ) ~ J(X) for any measurable and P-a.s. continuous function J. Especially this means that it is enough to show that J is continuous on C when the limit process X is in C. Assume that Xn ~ X. Then ito(Xn) ~ ito(X) which, in turn, implies Pn{ito(Xn) < u} ~ P{ito(X) < u} for those u where P{ito(X) u} o. For a Wiener process with positive drift this, see (9), holds in fact for all u < 00. Since

= =

{Z E D I tu(z)

~

to} = {z ED I ito(z) < -u}

it follows that Pn{tu(Xn) ~ to} ~ P{tu(X) ~ to}. Since this argument goes through for all to < 00 (30) follows. Equation (30) does, however, not imply that Pn{tu(Xn) < oo} ~ P{tu(X) < oo} since P{tu(X) = oo} > o. Note that (30) followed from properties of the Wiener process. These properties are not enough to guarantee that i(Xn) ~ i(X), as seen by the following example: Put ift < n Xn(t) = { X(t) ift> n -t which implies that Xn ~ X. This statement really does require some more properties of weak convergence in D than given here. In fact, a basic result in Lindvall (1973) is that convergence in distribution of processes on [0,00) can be brought back to processes restricted to [0, tk], k = 1, 2, ... , such that tk ~ 00. The reader may believe this from our discussion of convergence in D or - better - consult Lindvall (1973). Obviously i(Xn) = -00 while P{i(X) > -oo} = 1. Further

tu(Xn) = min[tu(X), max(u, n)]

~

tu(X)

while If we are to prove i(Xn) ~ i(X) we therefore must use some special properties of X n . As mentioned in Section 1.2, this can be done for Yn , as defined above, in the Poisson case. Our conjecture is that i(Yn) ~ iCY) holds rather generally. An argument for this is that Yn contains a contraction of time, while our counterexamples are based on a drift of the time of ruin to infinity. Finally we will emphasize that, although the approximations YO D and VO D were motivated by weak convergence, they do not follow from any limit theorem. The "parameters" Yo and Vo are defined by limit theorems

166

Finite time ruin probabilities

=

as (3) and (4) where the limit procedure - u, t -+ 00 and t O(u) - is different from the one here. Such limit theorems are, furthermore, only known for special models. Therefore YO D and VO D must be looked upon as based on ad hoc reasoning.

References and author index

ALLEN, A. (1978) Probability, Statistics and Queueing Theory with Computer Science Applications. Academic Press, New York. [126, 127]* ALMER, B. (1957) Risk analysis in theory and practical statistics. Transactions XVth International Congress of Actuaries, New York, II, 314 - 370. [55] AMMETER, H. (1948) A generalization of the collective theory of risk in regard to fluctuating basic probabilities. Skand. AktuarTidskr., 171 - 198. [77, 104, 105, 119] ANDERSEN, E. SPARRE (1957) On the collective theory of risk in the case of contagion between the claims. Transactions XVth International Congress of Actuaries, New York, II, 219 - 229. [57, 60, 61, 75] ARFWEDSON, G. (1955) Research in collective risk theory. Part 2. Skand. AktuarTidskr., 53 - 100. Part 1 in SAT (1954, pp. 191 - 223.) [137, 138] ASMUSSEN, S. (1984) Approximations for the probability of ruin within finite time. Scand. Actuarial I., 31 - 57. Erratum in SAl (1985, p. 64). [25, 139] ASMUSSEN, S. (1985) Conjugate processes and the simulation of ruin [15] problems. Stochastic Proc. Applic. 20, 213 - 229. ASMUSSEN, S. (1987) Applied Probability and Queues. John Wiley & Sons, New York. [123] ASMUSSEN, S. (1989) Risk theory in a markovian environment. Scand. [77, 117, 119, 152, 153] Actuarial I., 66 - 100. VON BAHR, B. (1974) Ruin probabilities expressed in terms of ladder height distributions. Scand. Actuarial I., 190 - 204. [145] BEEKMAN, J. (1969) A ruin function approximation. Trans. of the Soc. of Actuaries 21, 41 - 48 and 275 - 279. [18] BENCKERT, L.-G. and JUNG, J. (1974) Statistical models of claim distribution in fire insurance. Astin Bulletin VII, 1 - 25. [23]

* Pages on which references are cited are given in brackets.

168

References and author index

BERG, C. (1981) The Pareto distribution is a generalized f-convolution - a new proof. Scand. Actuarial J., 117 - 119. [48] BILLINGSLEY, P. (1968) Convergence of Probability Measures. John Wiley & Sons, New York. [15, 159] BJORK, T. and GRANDELL, J. (1985) An insensitivity property of the ruin probability. Scand. Actuarial J., 148 - 156. [125, 127] BJORK, T. and GRANDELL, J. (1988) Exponential inequalities for ruin probabilities in the Cox case. Scand. Actuarial J., 77 - Ill. [77, 92, 95, 99, 100, 102 - 105, 108, 109, 112, 114, 116] BREMAUD, P. (1972) A Martingale Approa'ch to Point Processes. Ph.D. Thesis, Memo ERL-M345, Dept. of EECS, Univ. of Calif., Berkeley. [38, 40] BREMAUD, P. (1981) Point Processes and Queues. M ariingale Dynamics. [38] Springer-Verlag, New York. CRAMER, H. (1930) On the Mathematical Theory of Risk. Skandia Jubilee Volume, Stockholm. [4, 13] CRAMER, H. (1945) Mathematical Methods of Statistics. Almqvist & Wiksell, Stockholm and Princeton University Press, Princeton. [19, 27] CRAMER, H. (1955) Collective Risk Theory. Skandia Jubilee Volume, Stockholm. [vi, vii, 4, 13,21,35,65,67, 140] DALEY, D. J. and VERE-JONES, D. (1988) An Introduction to the Theory of Point Processes. Springer-Verlag, New York. [41] DASSIOS, A. and EMBRECHTS, P. (1989) Martingales and insurance risk. Commun. Statist. - Stochastic models 5, 181 - 217. [vi, 146, 147] DAVIS, M. H. A. (1984) Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. J. R. Statist. Soc. B 46, 353 - 388. [147,155] DELBAEN, F. and HAEZENDONCK, J. (1987) Classical risk theory in an economic environment. Insurance: Mathematics and Economics 6, 85 116. [vi] DE VYLDER, F. (1977) Martingales and ruin in a dynamical risk process. [40] Scand. Actuarial J., 217 - 225. DE VYLDER, F. (1978) A practical solution to the problem of ultimate ruin probability. Scand. Actuarial J., 114 - 119. [19, 20, 24] ELLIOTT, R. J. (1982) Stochastic Calculus and Applications. Springer[9, 39] Verlag, New York. EMBRECHTS, P. and VERAVERBEKE, N. (1982) Estimates for the probability of ruin with special emphasis on the possibility of large claims. [23] Insurance: Mathematics and Economics 1, 55 - 72.

References and author index

169

FELLER, W. (1971) An Introduction to Probability Theory and its Applications. Vol II. 2nd ed. John Wiley & Sons, New York. [4, 6, 30, 52, 62 - 64, 66, 77, 79, 81, 125] FRANKEN, P., KONIG, D., ARNDT, U., and SCHMIDT, V. (1981) Queues and Point Processes. Akademie-Verlag, Berlin and John Wiley & Sons, New York. [41, 96, 107, 110, 127, 128, 130] GERBER, H. U. (1973) Martingales in risk theory. Mitt. Ver. Schweiz. Verso Math. 73, 205 - 216. [8, 136] GERBER, H. U. (1979) An Introduction to Mathematical Risk Theory. S. S. Heubner Foundation monograph series 8, Philadelphia. [vi, 14,37] GRANDELL, J. (1976) Doubly Stochastic Poisson Processes. Lecture Notes in Math. 529, Springer-Verlag, Berlin [35, 36,43, 47, 122] GRANDELL, J. (1977) A class of approximations of ruin probabilities. Scand. Actuarial J. Suppl., 38 - 52. [16, 20, 25, 122, 159] GRANDELL, J. (1978) A remark on 'A class of approximations of ruin probabilities.' Scand. Actuarial J., 77 - 78. [16, 20] GRANDELL, J. (1979) Empirical bounds for ruin probabilities. Stochastic Proc. Applic. 8, 243 - 255. [25] GRANDELL, J. and SEGERDAHL, C.-O. (1971) A comparison of some approximations of ruin probabilities. Skand. AktuarTidskr., 144 - 158. [14, 20, 21, 74] GRIGELIONIS, B. (1963) 0 CXO,lI.HMOCTH CyMM cTyneHqaTbIX CJIyqaAHLIX npo~eccoB K nyaccoHoBCKOMY. «Teop. BepojJT. H npHMeH.» 8, 189 - 194. English translation: On the convergence of step processes to a Poisson process. Theor. Prob. Appl. 8, 177 - 182. [44] GRIGELIONIS, B. (1975) CJIyqaAHbIe TOqeqHLIe npo~eCCbI H MapTHHraJIbI. Liet. Matem. Rink. 15, 101 - 114. English translation: Random point processes and martingales. Lithuanian Math. J. 15,444 - 453. [40] HABERLAND, E. (1976) Infinitely divisible stationary recurrent point processes. Math. Nachr. 70, 259 - 264. [45] HERKENRATH, U. (1986) On the estimation of the adjustment coefficient in risk theory by means of stochastic approximation procedures. Insurance: Mathematics and Economics 5, 305 - 313. [31, 32] HOGLUND, T. (1990) An asymptotic expression for the probability of ruin within finite time. Ann. Prob. 18, 378 - 389. [145] IGLEHART, D. L. (1969) Diffusion approximations theory. J. Appl. Prob. 6, 285 - 292. [16]

In

collective risk

KALLEN BERG , O. (1975) Limits of compound and thinned point processes. J. Appl. Prob. 12, 269 - 278. [46]