English. 252 pages. Year 2021.
Table of contents :
Contents
Preface
Introduction
Chapter 1
Combinatorics
Chapter 2
Random Variables
2.1. The Simplest Probabilistic Models
2.2. Discrete Generalizations of the Classical Scheme
2.3. The General Construction of the Probability Space
2.4. Random Variables and Distribution Functions
2.5. Examples of Random Variables
Chapter 3
Generating and Characteristic Functions
3.1. Generating Functions and Their Properties
3.2. Characteristic Functions and Their Properties
3.2.1. Characteristic Functions and Polya Criterion
3.2.2. Polya Criterion
Chapter 4
Some Univariate Continuous Probability Distributions
4.1. Arcsine Distribution
4.2. Beta Distribution
4.3. Cauchy Distribution
4.4. Chi-Square Distribution
4.5. Exponential Distribution
4.6. Gamma Distribution
4.7. Inverse Gaussian Distribution
4.8. Laplace Distribution
4.9. Lévy Distribution
4.10. Log-Logistic Distribution
4.11. Logistic Distribution
4.12. Normal Distribution
4.13. Pareto Distribution
4.14. Power Function Distribution
4.15. Rayleigh Distribution
4.16. Stable Distribution
4.17. Student's t-Distribution
4.18. Uniform Distribution
4.19. Weibull Distribution
Chapter 5
Order Statistics: From Minimum to Maximum
5.1. Definitions and Examples
5.2. Distributions of Order Statistics
5.3. Representations for Uniform and Exponential Order Statistics
5.4. Extreme Order Statistics
Chapter 6
Record Values and Probability Theory of Records
6.1. Definitions
6.2. Record Times and Record Values in the Case of Continuous Distributions
6.3. Records in the Sequences of Discrete Random Variables
Chapter 7
Characterizations of Continuous Distributions by Independent Copies
Introduction
7.1. Arcsine Distribution
7.2. Beta Distribution
7.3. Chi-Square Distribution
7.4. Lévy Distribution
7.5. Lindley Distribution
7.6. Log-Logistic Distribution
7.7. Normal Distribution
7.8. Pearson’s Random Walk
7.9. Power Function Distribution
7.10. Skew Distribution
7.11. Uniform Distribution
7.12. Von Mises Distribution
7.13. Weibull Distribution
7.14. Wald Distribution
Chapter 8
Characterizations of Distributions by Order Statistics
8.1. Introduction
8.2. Characterizations of Distributions by Conditional Expectations
8.3. Characterizations by Identical Distribution
8.4. Characterizations by Independence Property
Chapter 9
Characterizations of Distributions by Record Values
9.1. Characterizations Using Conditional Expectations
9.2. Characterization by Independence Property
9.3. Characterizations by Identical Distribution
Chapter 10
Extreme Value Distributions
Introduction
10.1. The PDF of Extreme Value Distributions
10.1.1. The PDF of Type 1 Extreme Value Distribution for Xn,n (Given in Figure 10.1.1)
10.1.2. The PDF of Type 2 Extreme Value Distributions for Xn,n
10.1.3. The PDF of Type 3 Distribution of Xn,n
10.2. Domain of Attraction
10.2.1. Domain of Attraction of Type 1 Extreme Value Distribution for Xn,n
10.2.2. Domain of Attraction of Type 2 Extreme Value Distribution for Xn,n
10.2.3. Domain of Attraction of Type 3 Extreme Value Distribution for Xn,n
10.3. Extreme Value Distributions for X1,n
10.3.1. The PDF of Type 1 Extreme Value Distribution for X1,n
10.3.2. The PDF of Type 2 Extreme Value Distribution for X1,n
10.3.3. The PDF of Type 3 Extreme Value Distribution for X1,n
10.4. Domain of Attraction of Extreme Value Distributions for X1,n
10.4.1. Domain of Attraction of Type 1 Extreme Value Distribution for X1,n
10.4.2. Domain of Attraction of Type 2 Distribution for X1,n
10.4.3. Domain of Attraction of Type 3 Distribution for X1,n
10.5. Asymptotic Distribution of the kth Largest Order Statistics
Chapter 11
Random Filling of a Segment with Unit Intervals
11.1. Random Filling. Continuous Case
11.2. Discrete Version of the Parking Problem
Appendix
A.1. Cauchy’s Functional Equations
A.2. Lemmas
References
About the Authors
Index
APPLIED STATISTICAL SCIENCE
PROBABILITY THEORY A LOGIC OF SCIENCE
No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.
APPLIED STATISTICAL SCIENCE. SERIES EDITOR: MOHAMMAD AHSANULLAH, PROFESSOR, RIDER UNIVERSITY, LAWRENCEVILLE, NEW JERSEY, UNITED STATES
Probability Theory: A Logic of Science. Valery B. Nevzorov and Mohammad Ahsanullah (Editors). 2021. ISBN: 9781536191738 (Hardcover)
Characterizations of Exponential Distribution by Ordered Random Variables. Mohammad Ahsanullah (Editor). 2019. ISBN: 9781536154023 (Softcover)
Applied Statistical Theory and Applications. Mohammad Ahsanullah (Editor). 2014. ISBN: 9781633218581 (Hardcover)
Research in Applied Statistical Science. Mohammad Ahsanullah (Editor). 2014. ISBN: 9781632218185 (Hardcover)
The Future of Post-Human Probability: Towards a New Theory of Objectivity and Subjectivity. Peter Baofu (Author). 2014. ISBN: 9781629486710 (Hardcover)
Sequencing and Scheduling with Inaccurate Data. Yuri N. Sotskov and Frank Werner (Editors). 2014. ISBN: 9781629486772 (Hardcover)
Dependability Assurance of Real-Time Embedded Control Systems. Francesco Flammini (Author). 2010. ISBN: 9781617285028 (Softcover)
VALERY NEVZOROV MOHAMMAD AHSANULLAH SERGEI ANANJEVSKIY
Copyright © 2021 by Nova Science Publishers, Inc. All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical, photocopying, recording or otherwise, without the written permission of the Publisher. We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to reuse content from this publication. Simply navigate to this publication's page on Nova's website and locate the "Get Permission" button below the title description. This button is linked directly to the title's permission page on copyright.com. Alternatively, you can visit copyright.com and search by title, ISBN, or ISSN. For further questions about using the service on copyright.com, please contact: Copyright Clearance Center. Phone: +1 (978) 750-8400. Fax: +1 (978) 750-4470. Email: [email protected]
NOTICE TO THE READER The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works. Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the Publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS. Additional color graphics may be available in the ebook version of this book.
Library of Congress Cataloging-in-Publication Data
Published by Nova Science Publishers, Inc. New York
To My wife, Ludmila – VBN To My wife, Masuda – MA To My Family – SA
CONTENTS

Preface ix
Introduction xi
Chapter 1 Combinatorics 1
Chapter 2 Random Variables 13
Chapter 3 Generating and Characteristic Functions 35
Chapter 4 Some Univariate Continuous Probability Distributions 61
Chapter 5 Order Statistics: From Minimum to Maximum 85
Chapter 6 Record Values and Probability Theory of Records 103
Chapter 7 Characterizations of Continuous Distributions by Independent Copies 119
Chapter 8 Characterizations of Distributions by Order Statistics 139
Chapter 9 Characterizations of Distributions by Record Values 151
Chapter 10 Extreme Value Distributions 175
Chapter 11 Random Filling of a Segment with Unit Intervals 207
Appendix 217
References 219
About the Authors 231
Index 233
PREFACE

Probability theory helps us make important decisions, find optimal behaviour in our work and life, and better understand and describe various phenomena of the world around us. The authors of this edition are ready to acquaint the reader, without delving too deeply into theoretical research, with a number of useful and interesting probabilistic facts, with some interesting areas of probability theory, and with applied problems that require the ability to work with various probabilistic models. Reading and understanding the material presented in this book does not require deep knowledge of modern probability theory. In presenting the theoretical material, the authors start from the simplest examples and gradually come to the consideration of mathematical schemes in which an important place is given to extreme observations and record results in various fields of human activity. A significant part of the book is devoted to the description of just such models and schemes. Every year many new results related to the subject of records appear, more and more record schemes are considered, and new methods of studying records are proposed. Therefore, we offer our readers a book that introduces the theoretical foundations of describing record moments and record values. This monograph will help to clarify the current state of the
theory of records. Acquaintance with the theoretical part is facilitated by a fairly large number of relevant examples presented in the book. Note also that the theory of records is very closely related to the theory of order statistics. In a sense, our book can also be viewed as a useful addition to the materials presented in the monographs by M. Ahsanullah, V. B. Nevzorov and M. Shakil (2013) and M. Ahsanullah and V. B. Nevzorov (2015).
INTRODUCTION

Dear Reader! Since your birth you have been coming across many random events. Often you have to assess the likelihood that some outcome of an event, favourable or not so pleasant for you, will come about. Sometimes you need to assess the chances of success in an adventure. Even a few months before you were born, some probabilities of events related to you were being evaluated. For example, your parents, having already given birth to two girls, evaluated their chances of giving them a brother. They studied various statistical reports and surveys and found out that on average in the country there are 515 boys and 485 girls among every 1000 newborns. Consequently, they found that the corresponding probability of giving birth to a boy was 0.515. They also learned that not everything is so simple. This probability, taken from various sources, represented the national average, and the probability of interest to them could differ from it. It turned out that it was also necessary to take into account various factors connected with the heredity of the parents, their way of life and even, perhaps, the planned season of birth (winter or summer) of the child. It was necessary to know not only simple probabilities, but also the so-called conditional probabilities, say, the probability of giving birth to a boy given that the first two children turned out to be girls. The time came, and you (a girl or a boy) were born.
It was now necessary to insure your life and health. The insurance company, already selected by your parents, evaluated various insurance events and offered the appropriate conditions and tariffs, trying not to forget about its own benefit and, at the same time, to minimize the probability of its possible ruin. The child safely grew up. The parents chose a specialized school for him. All is well, but this educational institution is located too far from home, and getting there by public transport is not very convenient. Why not take him to school in their own car? However, the family income does not allow them to buy even the cheapest model. An idea comes up: to try to get hold of this essential household thing by participating in a lottery advertised by all the media. It is clear that with one or two lottery tickets the chances of winning the main prize, the car, are scanty. How many tickets does one need to buy so that the probability of this event is not too small? The company that conducts the lottery also has its own reasons for carrying out probabilistic calculations. On the one hand, the ticket price should not be set too high, so as not to deter potential participants. On the other hand, the expected revenue from the lottery, which also depends on a number of random factors, should ensure the existence and even, if only modest, prosperity of the company. Specialists are needed who can take all these random factors into account and guarantee some average value of the lottery profits. Some time passes. The child has grown up and is preparing for an examination in mathematics. The exam program contains 36 questions, any three of which can appear on a randomly selected examination card. As always, there was not enough time to become familiar with all the material. Imagine that you know the answers to only 2/3 of all the questions. The parents estimate your chances of drawing three questions you know, which is necessary for an excellent answer.
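The exam estimate just described reduces to counting combinations; here is a minimal sketch, assuming the student knows 24 of the 36 program questions (i.e., 2/3 of them):

```python
from math import comb

known, total, drawn = 24, 36, 3  # 2/3 of the 36 program questions

# An "excellent" card: all three drawn questions are among the known ones.
p_excellent = comb(known, drawn) / comb(total, drawn)
print(f"P(all three questions known) = {p_excellent:.4f}")
```

The chance comes out just above 0.28, so an excellent answer is far from guaranteed even when two thirds of the material is known.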
Then your father, tired of these calculations, sits down in front of the TV, preparing to watch a report about a hockey match in which the Boston Bruins and the Chicago Blackhawks are to meet. He is offered the chance to participate in a tote, choosing one of the three possible outcomes of the game (a Boston victory, a Chicago victory, a draw). Naturally, the corresponding payouts to those participants who correctly guess the outcome of the match are fixed even before the referee's starting whistle. In this
case, the organizer of the totalizator must be able to assess the probabilities of these three outcomes, based on various statistical data. Children grow up and face similar problems themselves. Parents, already wise from the experience of solving such problems, share with their children their knowledge of how to assess the chances of success in a given situation and how to find the probabilities of vital events. The authors of this edition are ready to acquaint the reader, without going deeply into theoretical research, with a number of useful and interesting probabilistic facts, with the most important directions in probability theory and with some applied problems requiring the ability to work with different probabilistic models. From time to time, fragments of the text and a number of definitions requiring some knowledge of mathematical analysis will be given. The corresponding parts of the book can be skipped by a reader not comfortable with this material. The authors thank Nova Science Publishers for their willingness to publish the book.
Chapter 1
COMBINATORICS

The theory of probability, in fact, began with the solution of a great number of combinatorial problems. Several classical situations are connected with the need to choose at random m objects from a set of n available items (m ≤ n).

1) If we are interested in how many different groups of m items can be formed under the condition that the order of their arrival into the group matters, then this number of arrangements N(m, n) is given as follows:

N(m, n) = A_n^m = n(n − 1)(n − 2) ⋯ (n − m + 1) = n!/(n − m)!.

2) Here we should especially note the case m = n. In this situation we get the number of permutations of n items on n places:

N(n, n) = A_n^n = n!.
3) We are not always interested in the order of arrival of these m items. In this case the number N of possible groups (without taking into account the order of the chosen objects within a group) coincides with the corresponding number of combinations:

N = C_n^m = n!/(m!(n − m)!).

4) Very often we have to deal with so-called "samples with replacement," when the selected object is returned to the original group and can be selected repeatedly (even more than once). In this situation, the number N of possible different ways of selecting m objects (here m can be greater than n) is given by

N = n^m.

5) In the situation when n objects are randomly divided into r groups in such a way that group number k consists of m_k items, where m_1 + m_2 + ... + m_r = n, the corresponding number of different variants N(m_1, m_2, ..., m_r; n) is obtained from the equality

N(m_1, m_2, ..., m_r; n) = n!/(m_1! m_2! ... m_r!).
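These five counting rules map directly onto Python's standard library; a small sketch (the numeric examples are chosen arbitrarily, not taken from the text):

```python
from math import comb, factorial, perm

n, m = 7, 3

# 1) Arrangements: ordered selections of m out of n items.
assert perm(n, m) == factorial(n) // factorial(n - m)

# 2) Permutations of all n items on n places.
assert perm(n, n) == factorial(n)

# 3) Combinations: unordered selections of m out of n items.
assert comb(n, m) == factorial(n) // (factorial(m) * factorial(n - m))

# 4) Samples with replacement: each of m draws has n possible outcomes.
assert n ** m == 343

# 5) Multinomial: split n objects into groups of sizes m_1, ..., m_r.
def multinomial(*groups):
    result = factorial(sum(groups))
    for g in groups:
        result //= factorial(g)
    return result

assert multinomial(3, 2, 2) == 210  # 7! / (3! 2! 2!)
print("all five counting rules check out")
```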
Let us consider several examples of applications of the above relations.

a) Suppose that 30 different tickets are on the examiner's table. Each of the first 5 students chooses one ticket. What is the number N of different possible options for choosing these tickets, taking into account their concrete distribution among the students? In this situation

N = A_30^5 = 30 · 29 · 28 · 27 · 26.

b) What, in this situation, is the number of possible combinations of 25 tickets remaining on the examiner's table? In this case, it does not
matter which of the five tickets each concrete student received. Therefore, the number of different possible combinations corresponding to this random choice of five tickets is given as

N = C_30^5 = C_30^25 = 30!/(25! · 5!).
c) Now consider the situation when the students mentioned above come one by one, choose a ticket at random, copy the questions from this ticket and return it to the examiner's table. In this case it is possible for the same ticket to be selected more than once, and the number of all variants of the possible choices is determined by the relation

N = 30^5.

In the various classical probabilistic problems we very often have to count the number of variants associated with a particular random event, using one or another of the relations given above. The very origin of the theory of probability was tied to such calculations of options associated with various combinations in gambling, in which it was necessary to get a certain winning sample of cards, or some combination favorable for the player appearing when one or more dice are thrown. Historically, the impetus for the development of the future probability theory was given in the middle of the 17th century by a passionate gambler, the French nobleman the Chevalier de Mere. The story of this Chevalier de Mere is given in practically all textbooks on probability theory. His problems were discussed in 1654 at the highest scientific level, in the correspondence between Blaise Pascal, to whom de Mere had addressed his complaints, and Pierre Fermat. Let us recall the question that aroused the interest of these scientists, whose correspondence was published 25 years later, in 1679. They discussed how the odds of winning and losing compare in two slightly different versions of a gambling game in which de Mere offered participation to his rivals. Now any student familiar with the main principles of
probability theory would immediately help the cavalier to solve his problem, but three and a half centuries ago this problem was solved only by such outstanding scientists. Discussion of this problem led to an understanding (and the definition) of probability as a certain relation connecting the numbers of favourable and unfavourable outcomes of an experiment, as a result of which the corresponding event of interest to the player may or may not appear. It is assumed that the results of this experiment can be presented by n equal (or, as they would later be qualified, equiprobable) outcomes. Let an event A, whose appearance interests us, occur if any of the m outcomes favorable to this event is observed. Then the probability P(A) of the occurrence of the event A is given by the equality

P(A) = m/n,

that is, it is determined by the ratio of the number of favorable outcomes to the number of all possible outcomes. The first task, related to de Mere's problem, required assessing the chances of success in a game based on four throws of a fair die, the goal being to get a "six" at least once. In this case an arbitrary result of the experiment can be written in the form {a_1, a_2, a_3, a_4}, where each of the four symbols can take any of the values 1, 2, 3, 4, 5, 6. If the die is symmetric, then all n = 6^4 possible outcomes of the experiment have equal chances to occur. It is easy to see that the number of outcomes in which no sixes appear equals 5^4. Therefore, the number m of favorable variants is given by the equality m = 6^4 − 5^4. The ratio of m to n gives the corresponding probability of the occurrence of the required event A:

P(A) = (6^4 − 5^4)/6^4 = 1 − (5/6)^4 ≈ 0.518.
If we denote by Ā the event opposite to A, i.e., the event that none of the required "sixes" appeared, then similarly we get that
P(Ā) = (5/6)^4 ≈ 0.482.

We can see that in every single game the chances of de Mere winning and losing were almost the same, but even this small difference in his favour led, after a large number of games (here worked the so-called law of large numbers, which we will study later), to the steady growth of this gentleman's well-being. All this began to scare off his potential partners, and de Mere decided to diversify the methods of growing his capital by offering a somewhat different form of the game, which, as it seemed to de Mere, was very similar to the previous one. Now he offered his partners to throw a pair of dice simultaneously, making not four attempts, as in the previous version of the game, but 24. His win was the appearance, at least once in these 24 attempts, of two "sixes" simultaneously. In some cases de Mere continued to win, but more and more games ended in his failure, which quickly consumed the funds that the previous version of the game had brought him. In this case a simple calculation, similar to the procedure used above, shows that a single simultaneous throw of two dice has 36 different possible outcomes. Therefore, 24 attempts lead to n = 36^24 possible equally likely ways of completing this game. The number m of outcomes favourable for de Mere is determined in this situation by the equality m = 36^24 − 35^24. This value m, as it turns out, is less (though not by much) than the number n − m = 35^24 of outcomes for which the stake went to his rivals. The law of large numbers, not yet known to de Mere's contemporaries, did its work, and he quickly squandered the wealth won in the previous games. The results of these two different variants of the game created the appearance of a paradox, for an explanation of which de Mere turned to Blaise Pascal.
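The two de Mere games can be checked numerically; a small sketch, using exact fractions to avoid any floating-point doubt:

```python
from fractions import Fraction

# Game 1: at least one "six" in four throws of a single die.
p1 = 1 - Fraction(5, 6) ** 4

# Game 2: at least one double-six in 24 throws of a pair of dice.
p2 = 1 - Fraction(35, 36) ** 24

print(f"four throws, one die: P = {float(p1):.4f}")  # slightly above 1/2
print(f"24 throws, two dice:  P = {float(p2):.4f}")  # slightly below 1/2
```

The first game wins with probability about 0.5177, the second only about 0.4914, which is exactly the asymmetry that the law of large numbers turned into de Mere's losses.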
This action subsequently led to the appearance and rapid development of a new mathematical science: the theory of probability. Somewhat later than Pascal and
Fermat, but independently of them, Christiaan Huygens engaged in the study of such problems and, even before his colleagues, published in 1657 a work that contained the rudiments of the future probability theory. Note that the above-mentioned correspondence between Pascal and Fermat, dating back to 1654, was published only in 1679. Several similar problems, related to situations that arise when dice are thrown, were solved by Isaac Newton. For example, he was interested in the following question. Which of the following situations is more probable:

a) to get at least one "six" when throwing 6 dice;
b) to get at least two "sixes" when throwing 12 dice;
c) to get at least three "sixes" when throwing 18 dice?

It must be said that the solution of these problems represents Newton's only contribution to the theory of probability. Now we consider one more simple combinatorial problem requiring the ability to calculate the number of possible subdivisions of n objects into r different groups in such a way that the group with number k (k = 1, 2, ..., r) consists of m_k objects, where m_1 + m_2 + ... + m_r = n. Let us take a traditional deck of 36 cards. The cards are randomly distributed to three players (12 cards to each of them). What is the probability of the event A consisting in the fact that each of them received three cards of the spade suit? Recalling the above formula

N(m_1, m_2, ..., m_r; n) = n!/(m_1! m_2! ... m_r!)

for the number of different subdivisions of n objects into groups consisting of m_1, m_2, ..., m_r objects respectively, we use two of its particular cases to solve the presented problem. We see that the total number N of subdivisions of 36 items into 3 groups (with equal numbers of cards in each) is given in this situation by the following equation:
N = N(12, 12, 12; 36) = 36!/(12!)^3.

Then k_1, the number of partitions of the 9 spades in the deck into three groups of three, is determined by the relation

k_1 = N(3, 3, 3; 9) = 9!/(3!)^3.

To each of these k_1 partitions there correspond k_2 options to distribute the remaining 27 cards equally among the 3 players. Indeed,

k_2 = N(9, 9, 9; 27) = 27!/(9!)^3.

Combining all possible situations, we obtain that the total number M of favorable combinations under consideration is given as follows:

M = k_1 · k_2 = 27!/((3!)^3 (9!)^2).

Thus, the corresponding probability of the event A is given by the equality

P(A) = M/N = 27! (12!)^3 / ((3!)^3 (9!)^2 36!).
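The probability just derived can be cross-checked in two independent ways; a sketch, where the second route deals the spades player by player with hypergeometric factors:

```python
from fractions import Fraction
from math import comb, factorial

# Closed form from the multinomial counts above:
# P(A) = 27! (12!)^3 / ((3!)^3 (9!)^2 36!)
p_formula = Fraction(
    factorial(27) * factorial(12) ** 3,
    factorial(3) ** 3 * factorial(9) ** 2 * factorial(36),
)

# Cross-check by dealing hand by hand: each hand of 12 must contain
# exactly 3 of the spades still in play.
p_deal = (
    Fraction(comb(9, 3) * comb(27, 9), comb(36, 12))    # player 1
    * Fraction(comb(6, 3) * comb(18, 9), comb(24, 12))  # player 2
)                                                       # player 3 is forced

assert p_formula == p_deal
print(f"P(each player gets 3 spades) = {float(p_formula):.4f}")
```

Both routes give the same exact fraction, about 0.113.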
It should be noted that while at the beginning and even in the middle of the 20th century many people were interested in the chances of getting various combinations in card games (the presence of a certain number of trump cards in an opponent's hands, the probability of a straight or a flush in poker, and so on), with the progress of the Internet there appeared a large number of participants in sweepstakes (how can one not put some money on a Boston Bruins victory in the next game against the Chicago Blackhawks!) and lovers of various lotteries. Consider one of the classical variants of a lottery. You pay 100 dollars and choose five different numbers from the set {1, 2, ..., 35, 36}. For the next drawing, a large group of gamblers is recruited, each of whom fixes one or
several analogous combinations. The payments together give a fairly large sum, half of which the organizers keep for themselves, while the second half forms the prize fund. A certain percentage of the prize fund goes to the gamblers who guess exactly three numbers of the winning combination, and a certain percentage is distributed equally among those who guess four of the 5 winning numbers. The greater part of the prize fund is distributed among the gamblers who guess all 5 numbers. If this combination is not guessed, the corresponding part of the winning sum is transferred to the next draw of the lottery, which makes it possible to form a super prize. With the help of a lottery machine or a random number generator, the organizers fix a winning combination of 5 numbers, take away half of the payments they collected, and distribute the second half of the money, according to the lottery terms, among the participants who guessed 3, 4 or 5 elements of the winning combination. This lottery scheme provides an a priori "average winning" for a participant, in each attempt to present a combination of 5 numbers, equal to minus 50 dollars. This expected loss does not scare the potential participants. They are more interested in the possibility, a rather unlikely event, of guessing all five elements of the winning combination and getting a sum of millions as the main prize. Of course, such winning sums are possible only in games with a huge number of participants; but the fact that almost every country organizes not one but several different lotteries (Mega Millions, Power Ball, EuroMillions, etc.), and that these lotteries have produced (over the long time of their existence) hundreds of new millionaires, means that there are plenty of gambling people in the world.
In the above lottery scheme it is not difficult to evaluate the participant's chances to guess all 5 numbers of a winning combination. The corresponding probability is defined by the equality p = 1/n, where n coincides with the number of all possible ways to form five different numbers from the numbers 1 to 36, i.e., n = C_36^5. In this way,

p = 1/n = 5! 31!/36! = 1/376992 ≈ 0.00000265.
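Both this probability and the PowerBall odds quoted below can be recomputed in one line each; a quick sketch:

```python
from math import comb

# "5 out of 36" lottery from the text: one chance in C(36, 5).
n_36 = comb(36, 5)
print(f"5-of-36 lottery: 1 in {n_36}")   # 1 in 376992

# PowerBall-style draw: 5 of 69 white balls plus 1 of 26 red balls.
n_pb = comb(69, 5) * 26
print(f"PowerBall:       1 in {n_pb}")   # about 292 million
```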
This small probability of winning the maximal prize does not scare the great number of people who want to become millionaires. As a result, roughly one in 400,000 participants achieves this goal and strengthens the desire of the others to follow him. All media immediately report record wins. Recently, three Americans, buying tickets worth $2.00 each, divided among themselves the main prize in the PowerBall lottery, which amounted to about $1.5 billion. In this lottery you must simultaneously guess five numbers out of 69 possible white balls and one number out of the 26 red ones. The chance of hitting such a complex combination is about one in 292 million. There is an interesting psychological moment here. Organizers of such lotteries regularly publish a set of the six most frequently appearing (over all time, or recently) winning numbers, as well as a set of six rare numbers. Participants begin to guess which of these two sets of six numbers should be used in the next draw. Maybe the lottery has its "favorites" and one should focus on the first six? Others object: "No! The numbers from the second six will try to overtake their colleagues from the first six. We must use them!" As a result, the proportion of those who use numbers from these lists increases significantly. Since the random number generators and lottery machines do not react to the results of previous draws, new winning combinations can, with equal chances, contain numbers from both of these publicized sets as well as numbers that entered neither of the two sixes. It should be noted that if a winning combination does contain some of these 12 specially singled-out numbers, then the winnings will be divided among a significantly larger contingent of participants who have bet on these combinations advertised on the Internet. Therefore, it makes sense to compose your five numbers from those not included in these Internet groups.
The chances of success will be the same, but the prize amount will be distributed among a smaller number of winners.

Another popular probabilistic problem is connected with the so-called "lucky" tram, trolleybus or bus tickets. There are two classical definitions of a lucky ticket. A six-digit number makes the ticket "lucky" if the sum of its first three digits coincides with the sum of its last three. Sometimes in Russia it is said that this is the "Moscow" definition of a lucky
Valery Nevzorov, Mohammad Ahsanullah and Sergei Ananjevskiy
ticket, as opposed to the so-called "Leningrad" (or "St. Petersburg") definition, which requires that the sums of the three digits in the even and in the odd positions coincide. These two definitions are, from the point of view of combinatorial theory, at least formally different, but they lead to the same chances of getting a lucky ticket. Suppose for completeness that in a six-digit number (a, b, c, d, e, f) any of the six places can hold any digit from zero to nine, i.e., we assume the existence of a ticket whose number consists of six zeros. It is clear that the total number of possible tickets is exactly one million. Before evaluating the chances of obtaining a lucky ticket, consider the following problem, the solution of which is described in detail in the classic book of N. Ya. Vilenkin [N. Ya. Vilenkin, Popular Combinatorics, Moscow: Nauka, 1975, 208 pp.] (see also [N. Ya. Vilenkin, A. N. Vilenkin, P. A. Vilenkin, Combinatorics, Moscow: FIMA, MCNMO, 2006, 400 pp.]). We represent the number N as the sum N = r₁ + r₂ + ⋯ + r_n, where each of the n terms can take one of the m values, namely 0, 1, …, m − 1. It turns out that the number of different variants T(N, n, m) by which this representation can be performed is found by the formula

T(N, n, m) = ∑_{k=0}^{s} (−1)^k C_{n}^{k} C_{n+N−km−1}^{n−1},
(1.1)
where s = [N/m]. Let us return to our tickets. The purchased ticket has a number (a, b, c, d, e, f), where each of the n = 6 places can contain any of m = 10 digits. The ticket is lucky if a + b + c = d + e + f. To connect formula (1.1) with the problem that interests us, we use the following method. Instead of the ticket (a, b, c, d, e, f), let us consider the accompanying ticket with the number (a, b, c, g, h, i), in which g = 9 − d, h = 9 − e, i = 9 − f. The number of tickets in the "lucky" group and the number in this newly formed group coincide. Look at the numbers (a, b, c, g, h, i): for them a + b + c + g + h + i = a + b + c + 27 − (d + e + f), so the original ticket is lucky exactly when the digits of the accompanying number sum to 27. Thus, we can now use formula (1.1), in which we need to take m = 10, N = 27, n = 6 and s = [27/10] = 2. We get that the number of tickets, the
sum of whose digits equals 27, which is also the number of the lucky tickets we are interested in, equals

T(27, 6, 10) = ∑_{k=0}^{2} (−1)^k C_{6}^{k} C_{32−10k}^{5} = C_{32}^{5} − C_{6}^{1} C_{22}^{5} + C_{6}^{2} C_{12}^{5} = 55252.
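Formula (1.1) can be checked numerically. A minimal sketch, with a helper name T of our choosing:

```python
import math

def T(N, n, m):
    """Number of ways to write N as an ordered sum of n terms,
    each term taking one of the values 0, 1, ..., m-1 (formula (1.1))."""
    s = N // m
    return sum((-1) ** k * math.comb(n, k) * math.comb(n + N - k * m - 1, n - 1)
               for k in range(s + 1))

print(T(27, 6, 10))   # 55252, six-digit numbers with digit sum 27
```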
Of course, various computer programs allow one to run through all million numbers and obtain the same value 55252 for the number of lucky tickets. Thus, whichever of the two above definitions we use, we see that the probability of obtaining the desired ticket is 0.055252, i.e., on average approximately one of about 18 tickets can be considered "lucky." Having solved this classical problem, one can consider the following nonclassical problem close to it: how to find the number of tickets that fit both definitions? For each of these "doubly lucky" tickets (a, b, c, d, e, f), the relations a + b + c = d + e + f and a + c + e = b + d + f must be satisfied. Comparing these two equalities, we get that this number is determined by the conditions b = e and a + c = d + f. The common digit b = e can take any of the 10 values from zero to nine. It remains to find the number of ways which allow us to obtain the equality a + c = d + f. Applying the same method as in the classical situation described above, we find that this number is given as follows:

T(18, 4, 10) = ∑_{k=0}^{1} (−1)^k C_{4}^{k} C_{21−10k}^{3} = C_{21}^{3} − C_{4}^{1} C_{11}^{3} = 670.
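Both counts, the 55252 tickets with a + b + c = d + e + f and the 670 four-digit groups with a + c = d + f, can be confirmed by brute-force enumeration, for example as follows:

```python
from itertools import product

lucky = 0   # six-digit numbers with a+b+c == d+e+f (the "Moscow" definition)
pairs = 0   # four-digit groups (a, c, d, f) with a+c == d+f

for a, b, c, d, e, f in product(range(10), repeat=6):
    if a + b + c == d + e + f:
        lucky += 1

for a, c, d, f in product(range(10), repeat=4):
    if a + c == d + f:
        pairs += 1

print(lucky, pairs)   # 55252 670
```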
Thus, there are 10 · 670 = 6700 "doubly lucky" numbers among each million tickets, and the corresponding probability of getting such a ticket is 0.0067. In this section, we presented various combinatorial relations that allow us to estimate the chances of the occurrence of a random event in different situations.
Chapter 2
RANDOM VARIABLES

2.1. THE SIMPLEST PROBABILISTIC MODELS

For the simplest situations discussed before, methods were proposed to rate the chances of the occurrence of events. It would be quite natural to introduce some characteristic which makes it possible to compare the chances of success in carrying out various experiments. Such a sufficiently convenient characteristic, a certain measure of the success of the experiment (the probability of occurrence of the desired event), turned out to be the ratio m/n, where n is the number of possible outcomes of this experiment, and m is the number of outcomes that suit us. In order to consider more complex situations in which this measure of success can be evaluated for various events of interest to us, we will try to give some scientific form to the classical model already considered, in which this measure is determined by the ratio m/n. So, we are conducting some experiment which can result (with equal chances for each of them!) in n outcomes. We treat these outcomes as elementary events and denote them ω₁, ω₂, …, ω_n. Thus, we define the first element of the probability space, the so-called set of elementary events Ω = {ω₁, ω₂, …, ω_n}.
For example, in a single throw of a die we have n = 6 and Ω = {ω₁, ω₂, …, ω₆}, where ω_k means the appearance of the face with the digit k, k = 1, 2, 3, 4, 5, 6. If a coin is thrown three times, then n = 2³ = 8, ω₁ = {r, r, r}, ω₂ = {r, r, f}, …, ω₈ = {f, f, f}, where the symbol "r" corresponds to the appearance of the reverse and "f" to the appearance of the face on the first, second or third coin toss, respectively. Along with the elementary situations, we may be interested in more complex outcomes of the experiment. For example, it may be important for us to get an even face when throwing a die, or the event consisting in the appearance of at least one reverse when three coins are thrown. What kinds of more cumbersome structures can be built from the original "bricks," the elementary outcomes that we have already fixed? To construct these complex events, we can take the different groups {ω_{α(1)}, ω_{α(2)}, …, ω_{α(r)}}, r = 1, 2, …, n, which are composed from our "bricks." The number of such possible groups is C_{n}^{1} + C_{n}^{2} + ⋯ + C_{n}^{n}. Adding also the so-called impossible event (we denote it ∅) to these groups, we find that the total number N of the events so constructed is given as follows: N = C_{n}^{0} + C_{n}^{1} + C_{n}^{2} + ⋯ + C_{n}^{n} = (1 + 1)^n = 2^n. Separately, along with the impossible event ∅, we single out also the important case r = n. We call this event Ω = {ω₁, ω₂, …, ω_n}, which coincides with the complete set of all elementary events, the certain event. It takes place for any outcome of the experiment. Let

F = {A_k}, k = 1, 2, …, 2^n
denote the set of events that can be created by such unions of our elementary outcomes. We can note the following easily verifiable assertions, which are valid for this set F of events.

1) If the event A ∈ F, then also its complement Ā = Ω ∖ A ∈ F.

2) The certain event Ω ∈ F and the impossible event ∅ ∈ F.

3) If some events A₁, A₂, …, A_r belong to F, then their union A₁ ∪ A₂ ∪ … ∪ A_r and their intersection A₁ ∩ A₂ ∩ … ∩ A_r are also elements of F.

Comment. In what follows, we will consider situations in which the set of events does not necessarily consist of a finite number of events. Relations 1), 2), 3), satisfied by the elements of the space F, testify that this space is a σ-algebra of events. To the two already created elements of our probability space we must add one more, the so-called probability measure P. To define it in our case, to each elementary outcome ω_k, k = 1, 2, …, n, we attach its weight p_k = 1/n, and then to each event A = {ω_{α(1)}, ω_{α(2)}, …, ω_{α(r)}} we can attach the weight (probability) P(A) = p_{α(1)} + p_{α(2)} + ⋯ + p_{α(r)} = r/n. One can note that the introduced probability measure P satisfies the following elementary properties:

1) 0 ≤ P(A) ≤ 1 for any random event A.

2) P(∅) = 0, P(Ω) = 1.

3) The probabilities of any random event A and its complement are connected by the equality
P(Ā) = 1 − P(A).

4) If A₁, A₂, …, A_r are incompatible events, i.e., they are constructed from different elementary events, then P(A₁ ∪ A₂ ∪ … ∪ A_r) = ∑_{k=1}^{r} P(A_k).
Thus, we have completed the construction of a probability space corresponding to the classical probability model, in which n equally probable outcomes of an experiment are possible, by adding the third element, the probability measure P, to its first two elements (the set of elementary events Ω and the σ-algebra of events F). Such a probability space (Ω, F, P), constructed for the classical scheme under consideration, looks rather artificial and not very necessary, but it can be taken as a starting point for the creation of the essentially more complex and natural probability spaces to which we will pass in the next section.
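For a small experiment this whole construction can be written out explicitly. The sketch below (variable and function names are ours) builds all 2⁶ = 64 events for a single throw of a die and the classical measure P(A) = r/n:

```python
from fractions import Fraction
from itertools import combinations

omega = {1, 2, 3, 4, 5, 6}   # elementary outcomes of one throw of a die

# All unions of elementary outcomes, including the impossible event (empty set):
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

def P(A):
    """Classical probability measure: P(A) = r / n."""
    return Fraction(len(A), len(omega))

print(len(events))               # 64 = 2^6 events in the sigma-algebra
print(P(frozenset({2, 4, 6})))   # 1/2, probability of an even face
```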
2.2. DISCRETE GENERALIZATIONS OF THE CLASSICAL SCHEME

In the classical probabilistic model considered above, we are dealing with n outcomes of some experiment having equal chances of appearing. The simplest examples of such classical schemes are connected, for example, with throwing "correct" dice or symmetric coins, as well as with the random selection of one or several playing cards from a well-shuffled deck. However, there are substantially more situations when the possible outcomes of the experiment are not equally probable. For example, imagine that two "correct" dice are thrown, but we are interested only in the sum of the readings of the two fallen faces; then the outcomes of this experiment ω₂, ω₃, …, ω₁₂, where ω_k corresponds to the sum equal to k, will no longer be equally
probable. Therefore, the first and simplest generalization of the classical probability model presented above is fairly obvious. Now let us consider the set of elementary outcomes Ω = {ω₁, ω₂, …, ω_n} in the case when each outcome ω_k has its own (not necessarily equal to 1/n) weight p_k, and the sum of all these n nonnegative weights equals one. Then the total weight (probability) P(A) = p_{α(1)} + p_{α(2)} + ⋯ + p_{α(m)} corresponds to the event A, which is formed from the "bricks" (elementary outcomes) {ω_{α(1)}, ω_{α(2)}, …, ω_{α(m)}}. We note that the probabilities of the impossible event and the certain event remain equal to zero and to one, respectively. Going further along the path of generalizations, we can start with the following example. Let us return to our symmetric coin, for which the chances of the obverse or the reverse falling are the same and the corresponding probabilities are equal to ½. We will now throw the coin until the first reverse appears and count the number of obverses that fell out. It is evident that we can no longer confine ourselves to a finite number n of elementary outcomes. Suppose that ω_k, k = 0, 1, 2, …, is the outcome of this experiment in which a series of k obverses was obtained. Note, slightly ahead of time, that the probability p_k corresponding to the elementary event ω_k is equal to 1/2^{k+1}, k = 0, 1, 2, …. In such cases, the space of elementary events Ω = {ω₀, ω₁, ω₂, …} and the set of random events constructed from these "bricks" will already consist of an infinite number of elements. It is clear that in such situations the very description of the set of random events can be difficult, and not all events included in this set may be of interest to us. For example, it is unlikely that anyone will need in this experiment to find the probability that the number of obverses falling before the appearance of the first reverse
is divisible by 173. Therefore, the set of random events F can be simplified by forming it not from all possible combinations of elementary outcomes, but by choosing only some specific sets of such outcomes. For example, if we are interested in the appearance of a concrete event A in the experiment under consideration, then we can confine ourselves to a space F consisting of only four events A, Ā, Ω, ∅, where Ω is the certain event and ∅ is the impossible event. For example, if A consists in the appearance of an even number of obverses before the first reverse, then Ā corresponds to an odd number of obverses. In this case, the third element of the probability space will consist of the four probabilities presented below:

P(∅) = 0, P(Ω) = 1, P(A) = 1/2 + 1/8 + 1/32 + ⋯ = 2/3, P(Ā) = 1/4 + 1/16 + ⋯ = 1/3.

Of course, having such a "poor" space of events, we have no chance to get information about the probabilities of other events (not contained in F). All these reflections lead us to the following reasonable description of the discrete probability space (Ω, F, P). In the capacity of Ω = {ω₁, ω₂, …} we take a finite or countable set of elementary outcomes of some experiment. The set F (the σ-algebra of random events) consists of the events A = {ω_{α(1)}, ω_{α(2)}, …, ω_{α(r)}}, r = 1, 2, …, obtained by combining various elementary events into groups, and also includes the impossible event as well as the certain event. The following conditions must be satisfied.

a) The impossible event and the certain event are included in F.

b) If some event A belongs to F, then its complement Ā = Ω ∖ A also belongs to F.
c) For any finite or countable collection A₁, A₂, … of events in F, their union A₁ ∪ A₂ ∪ … must also be included in F.

We note that from conditions b) and c) one can obtain that any intersection of events A₁ ∩ A₂ ∩ … also belongs to F. Now we can pass to the third element of the probability space, the probability measure P. To each event A = {ω_{α(1)}, ω_{α(2)}, …} we will match the probability P(A) = p_{α(1)} + p_{α(2)} + ⋯, where p_{α(k)} is the weight of the elementary outcome ω_{α(k)}. The introduced probability measure P will have the following properties:

1) for any random event A the inequalities 0 ≤ P(A) ≤ 1 are valid;

2) the equalities P(∅) = 0, P(Ω) = 1 hold;

3) the probabilities of any random event A and its complement Ā are connected by the equality P(Ā) = 1 − P(A);

4) if A₁, A₂, … are incompatible events, i.e., if A_i ∩ A_j = ∅ for i ≠ j, then for any unions of these random events the following equalities hold: P(A₁ ∪ A₂ ∪ …) = ∑_k P(A_k).

Thus, the transition from the classical probabilistic model to the essentially more general probability models, which are described by the constructed discrete probability space (Ω, F, P), is completed.
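The weights p_k = 1/2^{k+1} and the value P(A) = 2/3 from the coin example above are easy to check numerically by truncating the series:

```python
# p_k = 1 / 2^(k+1): probability of exactly k obverses before the first reverse
total = sum(1 / 2 ** (k + 1) for k in range(200))          # should be 1
even = sum(1 / 2 ** (k + 1) for k in range(0, 200, 2))     # event A: even k

print(round(total, 12))   # 1.0: the weights form a probability distribution
print(round(even, 12))    # 0.666666666667, i.e., P(A) = 2/3
```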
2.3. THE GENERAL CONSTRUCTION OF THE PROBABILITY SPACE

We have considered variants of probability spaces in situations where the number of outcomes of some experiment is finite or even countable. It should be noted that such schemes are very popular. Elementary events in such situations may be, for example, the following: "the appearance of the six when throwing a die," "getting the ticket with the number 7 during a random selection of 24 examination tickets," "three defeats of a football team before its first victory in the championship," "the five-time appearance of the letter "s" on the first page of a newspaper," "the winning combination of numbers (2, 8, 11, 22, 27, 31) falls out in the draw of a lottery."
However, many experiments do not fit into these discrete schemes. For example, the result of some experiment may be the coordinate of a randomly thrown point on the real line or the coordinates of a randomly thrown point in the unit square. Therefore, a further generalization of our construction of probability spaces will be useful. Now let Ω = {ω} be an arbitrary (not necessarily finite or countable) set of elementary events. When moving from Ω to a set of random events, questions may arise of the type: which combinations of elementary outcomes can be taken as elements of F? The examples from the previous section suggest that this choice is fairly arbitrary. The only condition is that the elements (random events) contained in F must form a configuration that could be called a σ-algebra. The "poorest" and very exotic σ-algebra includes only two elements: the impossible event ∅ and the certain event Ω. The next in simplicity, but already actually used, is the σ-algebra composed of the 4 events A, Ā, Ω and ∅, where as the event A one can take an arbitrary union of elementary
outcomes. Naturally, to solve specific problems we must work with some more eventful set F. The only condition, as already noted, is that this set must form a σ-algebra. For example, if Ω contains all the points of the real axis, then it is convenient (but not at all necessary!) to take the Borel σ-algebra, containing all intervals and their various combinations. We recall that in any case the set F must satisfy the following conditions:

1) Ω and ∅ must be present in F;

2) if a random event A is in F, then its complement Ā also belongs to F;

3) for any finite or countable group A₁, A₂, … of elements of F, their union A₁ ∪ A₂ ∪ … must also be contained in F.

Once again, it can be noted that conditions 1), 2) and 3) also imply that the intersections A₁ ∩ A₂ ∩ … of any finite or countable number of events in F belong to F. Note also that if random events A and B are in F, then A ∖ B = A ∩ B̄ ∈ F and B ∖ A = B ∩ Ā ∈ F. In some situations the following fact is helpful: if A ∈ F, B ∈ F and A ⊂ B, then B can be represented as the union of two incompatible events A and B ∖ A belonging to F: B = A ∪ (B ∖ A). After we have fixed the set of random events that make up F, we can proceed to the description of the third element of the probability space, the probability measure P. To note some features associated with the construction of P, let us consider the following simple example.

Example. We throw a coin one time. In this case Ω = {r, f}, where these two elementary events correspond to the appearance of the reverse and the face.
We will compose a σ-algebra containing 4 events: A, Ā, Ω and ∅, where
A means the appearance of the reverse and Ā indicates the appearance of the face. The probabilities of the two events P(∅) = 0 and P(Ω) = 1 are uniquely defined. As for the remaining two events, only the sum of their nonnegative probabilities P(A) + P(Ā) = 1 is fixed. Depending on the shape of the coin, one can deal with different values of the probabilities of these two events, fixing P(A) = p and P(Ā) = 1 − p, 0 < p < 1. This example shows that the probabilities of events included in the σ-algebra F can be chosen in one way or another depending on the situation considered. The requirements for these probabilities are given by the following conditions:

1) each event A ∈ F is accompanied by its probability 0 ≤ P(A) ≤ 1;

2) the probabilities of events A and Ā are related by the relation P(A) + P(Ā) = 1;

3) if A₁, A₂, … are pairwise incompatible events, i.e., A_i ∩ A_j = ∅, i ≠ j, then for the union of these events the following relation is satisfied: P(A₁ ∪ A₂ ∪ …) = ∑_k P(A_k).

Thus, the general form of an arbitrary probability space has been described here.
2.4. RANDOM VARIABLES AND DISTRIBUTION FUNCTIONS

Any constructed probability space is inherently a card file in which a certain set of events and the set of corresponding probabilities are stored, the latter determining the degrees of possibility of the appearance of these events. In many cases, these probabilities could be found without building such a heavy construction as a probability space, but it turns
out that this construction is necessary for defining and working with such an important probabilistic object as a random variable. The fact is that random outcomes of some experiment, completely unrelated to any numbers or numbering, can very often determine certain numerical characteristics depending on these outcomes. Let us give the simplest example. The International Football Federation (FIFA) is going to use a lot to determine where the qualifying match of the world championship between the teams of Russia and Finland will take place. The drum contains three cards with the names of stadiums in St. Petersburg, Helsinki and the neutral field in Berlin. A randomly selected card must determine the city in which this match will take place. A fan from St. Petersburg who is going to visit this game without fail assesses his future expenses (depending on the choice of one of these three stadiums) as 3,000, 10,000 and 20,000 rubles, respectively. For him, before the draw, the future cost is a random variable taking one of these three values with equal probabilities 1/3. Let us consider another example. A symmetric coin is thrown three times. The possible outcomes of this experiment are expressed in terms of the appearance of the reverse or the face in each of these three tosses: ω₁ = {r, r, r}, ω₂ = {r, r, f}, ω₃ = {r, f, r}, ω₄ = {f, r, r}, ω₅ = {r, f, f}, ω₆ = {f, r, f}, ω₇ = {f, f, r}, ω₈ = {f, f, f}. On the set represented by these 8 elementary outcomes one can specify various real functions. Say we are interested in the number of reverses (we denote this number ξ) that fell out during the experiment under consideration. We get that this function, defined on the set of elementary outcomes, takes the following values: ξ(ω₁) = 3, ξ(ω₂) = ξ(ω₃) = ξ(ω₄) = 2, ξ(ω₅) = ξ(ω₆) = ξ(ω₇) = 1, ξ(ω₈) = 0.
If we take into account that this coin is symmetric, so that all 8 outcomes have equal chances of occurrence, then we find that the function ξ under consideration takes its values 0, 1, 2 and 3 with probabilities 1/8, 3/8, 3/8 and 1/8. Now consider another situation associated with the same set of elementary events. Suppose that for a player throwing the coin the appearance of at least two reverses is important. Consider the function which presents the indicator of the occurrence of such an event: let I(ω) = 1 if the outcome ω corresponds to this event, and I(ω) = 0 if this event does not occur. We get that I(ω₁) = I(ω₂) = I(ω₃) = I(ω₄) = 1 and I(ω₅) = I(ω₆) = I(ω₇) = I(ω₈) = 0. Thus, the considered function I(ω) takes the two values 0 and 1 with equal probabilities 4/8 = 1/2. Of course, for an asymmetric coin, different probabilities would be obtained for these two values of the indicator. After considering the presented examples we can pass to the definitions of random variables and related concepts. If we have a set of elementary events Ω = {ω}, then we can consider any real function ξ(ω) defined on this set. However, to call it a random variable it is necessary to make it measurable, i.e., for each possible value of this function it is necessary to be able to determine the chances (probability) of obtaining this value. Of course, if the set Ω = {ω} consists of a finite or countable number of elementary outcomes, then it suffices for us to know only the weights p_k assigned to each outcome ω_k. Then the probability ∑_{k: ξ(ω_k)=x_m} p_k will correspond to the possible value x_m of the function ξ(ω); here this sum is taken over all values of k for which the equality ξ(ω_k) = x_m holds. A much more complicated situation arises when the set of elementary outcomes Ω is not countable, for example, when it coincides with the interval
[0, 1]. In this case, it is required that the σ-algebra F be rich enough to allow us to describe the chances of the appearance of all possible values of the function ξ(ω). It is necessary for this that for any set B belonging to the Borel σ-algebra on the real line it be possible to determine the chances (probabilities) of the function ξ having values in this set. This means that the preimage of any such set B, i.e., the set of outcomes ξ⁻¹(B) = {ω: ξ(ω) ∈ B}, belongs to F, and therefore for each Borel set B on the real line the probability Pξ(B) = P(ξ ∈ B) will be determined in this case. The function Pξ(B) defined on the Borel sets of the real line is called the distribution of the random variable ξ. Note that this function, which defines the probability measure of any Borel set B, is too complicated to work with and even to describe and store. We can use the following fact. Any such set B, with the help of different operations using unions, intersections or differences of sets, can be expressed, for example, through the initial intervals (−∞, x), −∞ < x < ∞. This fact allows us to obtain all the necessary information about the distribution of the random variable ξ knowing only the so-called distribution function Fξ(x) = P(ξ < x), −∞ < x < ∞. Having this function, it is possible to find the probability P(ξ ∈ B) for any Borel set B. For example, if B = [x, y), x < y, then P(ξ ∈ B) = Fξ(y) − Fξ(x). Similarly, P(ξ > x) = Fξ(∞) − Fξ(x + 0) = 1 − Fξ(x + 0),
where Fξ(x + 0) = lim_{ε↓0} Fξ(x + ε).
Note also that P(ξ = x) = Fξ(x + 0) − Fξ(x) is nonzero only if the distribution function Fξ has a jump at the point x. In what follows, the various types of distribution functions, their properties and the methods for finding various probabilities using these functions will be considered in detail. For now, we fix that we are dealing with the following concepts. There is a certain probability space (Ω, F, P), where Ω = {ω} is the set of elementary outcomes of some experiment, and a function (random variable) ξ(ω) which associates to each outcome ω a certain point on the real axis. This mapping must be measurable. It means that for each Borel set B on the real axis the set of those elementary outcomes ω for which ξ(ω) ∈ B is an event, that is, this set must be included in the σ-algebra F. As we know, to each event A ∈ F there corresponds a certain probability P(A), and this means that any probability Pξ(B) = P{ξ ∈ B} is defined.
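For the three-coin example this machinery reduces to a direct enumeration; the sketch below (names are ours) recovers the distribution of ξ, the number of reverses:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("rf", repeat=3))   # the 8 equally likely elementary events

def xi(omega):
    """The random variable: number of reverses in the outcome."""
    return omega.count("r")

# P(xi = x) = (number of favorable outcomes) / 8
dist = {x: Fraction(sum(1 for w in outcomes if xi(w) == x), len(outcomes))
        for x in range(4)}

print(dist)   # probabilities 1/8, 3/8, 3/8, 1/8 for the values 0, 1, 2, 3
```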
2.5. EXAMPLES OF RANDOM VARIABLES

Thus, above we fixed the probability space (Ω, F, P) and defined a random variable ξ = ξ(ω) on it. For each random variable ξ the distribution function F(x) = Fξ(x) = P(ξ < x) is determined. As an example of such a function one can take any function F(x) that satisfies the following properties:

0 ≤ F(x) ≤ 1;
F(−∞) = lim_{x→−∞} F(x) = 0, F(+∞) = lim_{x→+∞} F(x) = 1;
F(x) is a nondecreasing function; F(x) is left continuous at each point x, −∞ < x < ∞.

There are three types of distributions: discrete, absolutely continuous and singular. A discrete random variable ξ is presented by some (finite or countable) set of its possible values −∞ < x₁ < x₂ < ⋯ < x_n < ⋯ < ∞ and the corresponding set of probabilities p_n = P(ξ = x_n), n = 1, 2, …. These probabilities are nonnegative and their sum equals 1. The distribution function of a discrete random variable has a jump p_n = F(x_n + 0) − F(x_n) at each point x_n, n = 1, 2, …. On any interval (x_n, x_{n+1}] the function F(x) is constant and takes the value p₁ + p₂ + ⋯ + p_n. To be precise, it should be noted that the values of a discrete random variable may represent a sequence of points that is not necessarily bounded on the left. For example, there may be a situation when the values x_n form, say, the sequence of numbers 0, −1, −2, …, tending to −∞. In this case, F(x) > 0 for any x, −∞ < x < ∞, but lim_{x→−∞} F(x) = 0.
In the case of absolutely continuous distributions there exists some nonnegative function 𝑝(𝑥), which is called the distribution density of a random
variable ξ. The density function p(x) and the distribution function F(x) are related by the equality

F(y) = ∫_{−∞}^{y} p(t) dt, −∞ < y < ∞.

Naturally, the following equality holds for the distribution:

∫_{−∞}^{∞} p(t) dt = F(∞) = 1.

There are also the so-called singular distributions, when the set of all points of growth of the continuous distribution function has zero Lebesgue measure. Usually such (in a certain sense exotic) distributions are only mentioned to describe all possible types of probability distributions. It remains only to add that any arbitrary distribution function F(x) can be represented as the mixture F(x) = p₁F₁(x) + p₂F₂(x) + p₃F₃(x) with weights p₁ ≥ 0, p₂ ≥ 0, p₃ ≥ 0, p₁ + p₂ + p₃ = 1, where F₁, F₂, F₃ represent respectively the three types of distribution functions mentioned above. However, we usually work with the two "pure" representatives of distributions, discrete or absolutely continuous, leaving aside the singular distributions and the mixtures mentioned above. Let us now present a selection of the most popular probability distributions. Often one can come across the concepts of the so-called "family of probability distributions" and "standard representative of a given family of distributions." All random variables η obtained from a certain variable ξ by the linear transformations η = a + bξ, where −∞ < a < ∞ and b > 0 are some constants corresponding to the shift and the scale change, are assigned to a common family of distributions. As the standard representative of this family of distributions one chooses, as a rule, the distribution with which it is easier to work or which is more often
encountered in various probabilistic problems. Note that the distribution functions Fη(x) and Fξ(x) of the random variables connected by the relation η = a + bξ satisfy the equality Fη(x) = Fξ((x − a)/b), −∞ < x < ∞. Let us begin by listing the distributions most frequently encountered in the probabilistic literature.

1) Degenerate distribution. In this case there is some unique value a, −∞ < a < ∞, of a random variable ξ, for which P(ξ = a) = 1. The distribution function F(x) has a unique growth point: here F(x) = 0 for x ≤ a and F(x) = 1 for x > a.

2) Two-point distribution. In this situation, a random variable η can take two values a and b, −∞ < a < b < ∞, with nonzero probabilities P(η = a) = p and P(η = b) = q = 1 − p, 0 < p < 1. Two-point random variables with values 0 and 1 are encountered more often than others; such variables usually serve as indicators of the appearance of some event A in the corresponding experiments.

3) Uniform distribution on a set consisting of a finite number of points. There are n values −∞ < x₁ < x₂ < ⋯ < x_n < ∞, each of which is taken with the same probability 1/n.

7) Uniform distribution on an interval. The standard representative of this family is the uniform distribution on [0, 1]; the linear transformation η = c + dξ with d > 0 leads to the uniform distribution on the interval [c, c + d].

8) Exponential E(θ, λ) distribution with the parameters −∞ < θ < ∞, λ > 0. The distribution density in this case is given as follows: p(x) = 0 if x < θ, and p(x) = exp(−(x − θ)/λ)/λ if x ≥ θ. The distribution function is given by the equality F(x) = max{0, 1 − exp(−(x − θ)/λ)}. The standard representative of this family is the E(0, 1) distribution with the density function p(x) = 0 if x < 0, and p(x) = exp(−x) if x ≥ 0. Often, in the probabilistic literature, simply the one-parameter family of exponential E(λ) distributions with the parameter λ > 0 is considered; in this case the distribution function has the form F(x) = max{0, 1 − exp(−x/λ)}.
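The relation between the exponential density and the distribution function can be checked numerically; in the sketch below the value λ = 2 is our arbitrary choice, and a simple Riemann sum of the density over [0, x] is compared with 1 − exp(−x/λ):

```python
import math

lam = 2.0            # scale parameter lambda of the E(lam) distribution
x, dx = 3.0, 1e-5    # evaluation point and integration step

# Riemann sum of the density p(t) = exp(-t/lam)/lam over [0, x]
integral = 0.0
t = 0.0
while t < x:
    integral += math.exp(-t / lam) / lam * dx
    t += dx

F = 1 - math.exp(-x / lam)   # the distribution function F(x) = 1 - exp(-x/lam)
print(abs(integral - F) < 1e-4)   # True: the integral of the density equals F(x)
```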
9) Gamma distribution with parameter α > 0.
This Gamma(α) distribution, α > 0, has the standard density function p_α(x) = 0 if x ≤ 0, and p_α(x) = x^(α−1) exp(−x)/Γ(α) if x > 0. Here

Γ(α) = ∫₀^∞ x^(α−1) e^(−x) dx

represents the Gamma function with the parameter α > 0. If α = 1, we deal with the standard E(0, 1) distribution.

10) Cauchy distribution

The standard representative of this family of distributions has the density function p(x) = 1/(π(1 + x²)), −∞ < x < ∞, and the corresponding distribution function F(x) = 1/2 + arctan(x)/π, −∞ < x < ∞.

11) Normal N(a, σ²) distribution with parameters a and σ², −∞ < a < ∞, σ² > 0

One of the most popular distributions in probability theory and in its applications is the normal one. First of all, we are talking about the standard representative of this family, the normal N(0, 1) distribution. This is largely due to its role in the so-called central limit theorem, which describes the asymptotic behavior of sums of random variables. A random variable has the standard normal distribution if its density function is given by the equality
φ(x) = (1/√(2π)) exp(−x²/2), −∞ < x < ∞.
Valery Nevzorov, Mohammad Ahsanullah and Sergei Ananjevskiy
In this case the distribution function is the following one:

Φ(x) = ∫₋∞^x φ(t) dt = (1/√(2π)) ∫₋∞^x exp(−t²/2) dt.
For an arbitrary N(a, σ²) representative of this family of distributions, the density function and the distribution function are expressed, respectively, as (1/σ)φ((x − a)/σ) and Φ((x − a)/σ).
The tables of the functions φ(x) and Φ(x) are given in all probabilistic reference books. The distributions most often used in various probabilistic models were presented above. Later we will get acquainted with other (less popular) probability distributions while considering some specific probabilistic models in which they appear.
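In place of the printed tables, Φ(x) can be evaluated through the error function, since Φ(x) = (1 + erf(x/√2))/2. A small illustrative sketch (this identity is standard, not taken from the text above):

```python
import math

def Phi(x: float) -> float:
    # Standard normal distribution function via the error function:
    # Phi(x) = (1 + erf(x / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

assert abs(Phi(0.0) - 0.5) < 1e-12
assert abs(Phi(1.96) - 0.975) < 1e-3           # the familiar 95% quantile value
assert abs(Phi(-1.0) + Phi(1.0) - 1.0) < 1e-12  # symmetry of N(0, 1)
```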
Chapter 3
GENERATING AND CHARACTERISTIC FUNCTIONS

3.1. GENERATING FUNCTIONS AND THEIR PROPERTIES

For the discrete random variables considered before, which take integer nonnegative values, it is convenient to use the so-called generating functions. If ξ takes the values 0, 1, 2, … with probabilities p₀, p₁, p₂, …, then its generating function P(s) at each point s is defined as the mathematical expectation of the random variable s^ξ, i.e., of the random variable which takes the values sᵏ with probabilities p_k, k = 0, 1, 2, …. Thus for each fixed value of s (usually limited to s lying in the interval [0, 1]) the corresponding generating function of a random variable ξ is given as the sum

P(s) = Es^ξ = Σ_k p_k sᵏ
(3.1.1)
Obviously, for values 𝑠 ∈ [−1,1] the convergence of the series on the righthand side of equality (3.1.1) is guaranteed. Later we will return to a more detailed discussion of the properties of generating functions. We first consider these functions for some discrete distributions. 1) Binomial distribution
We are dealing with a random variable ξ ∼ B(n, p). In this case the generating function (we denote it by P_n(s), taking into account one of the parameters of this distribution) is found as follows:

P_n(s) = Σ_{m=0}^n C_n^m pᵐ(1 − p)^(n−m) sᵐ = Σ_{m=0}^n C_n^m (ps)ᵐ(1 − p)^(n−m) = (1 − p + ps)ⁿ.    (3.1.2)
A special case (n = 1) of relation (3.1.2) allows us, for a two-point random variable ξ taking values 0 and 1 with probabilities q = 1 − p and p respectively, to immediately find its generating function, which in this situation has the form P₁(s) = 1 − p + ps
(3.1.3)
Note (and we will return to this relation) that from (3.1.2) and (3.1.3) it follows that

P_n(s) = (P₁(s))ⁿ.
(3.1.4)
2) Geometric distribution

In this case p_n = (1 − p)pⁿ, n = 0, 1, 2, …, 0 < p < 1, and

P(s) = Σ_{n=0}^∞ (1 − p)(ps)ⁿ = (1 − p)/(1 − ps).    (3.1.5)

3) Poisson distribution
If ξ ∼ π(λ) has a Poisson distribution with some parameter λ > 0 then

p_n = P{ξ = n} = e^(−λ) λⁿ/n!, n = 0, 1, 2, …, and

P(s) = Σ_{n=0}^∞ e^(−λ)(λs)ⁿ/n! = e^(−λ) e^(λs) = e^(λ(s−1)).
(3.1.6)
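The closed forms (3.1.2), (3.1.5) and (3.1.6) can be checked mechanically. A quick sketch with sympy (the check itself is ours, not part of the text):

```python
from sympy import Sum, binomial, exp, factorial, oo, simplify, symbols

s, p, lam = symbols('s p lam', positive=True)
m, n = symbols('m n', integer=True, nonnegative=True)

# Binomial B(5, p), relation (3.1.2) with n = 5
P5 = Sum(binomial(5, m) * (p*s)**m * (1 - p)**(5 - m), (m, 0, 5)).doit()
assert simplify(P5 - (1 - p + p*s)**5) == 0

# Poisson pi(lam), relation (3.1.6)
P_pois = Sum(exp(-lam) * (lam*s)**n / factorial(n), (n, 0, oo)).doit()
assert simplify(P_pois - exp(lam*(s - 1))) == 0

# Geometric, relation (3.1.5), checked numerically at p = 0.4, s = 0.9
geo = sum(0.6 * (0.4 * 0.9)**k for k in range(500))
assert abs(geo - 0.6 / (1 - 0.36)) < 1e-12
```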
Here are some of the most important properties of generating functions, which greatly simplify the work with discrete random variables.

1) Using the generating function, the distribution of the corresponding discrete random variable is uniquely restored. Indeed, the following relations are valid, from which we can deduce this fact:

p₀ = P(0) and p_k = P⁽ᵏ⁾(0)/k!, k = 1, 2, ….    (3.1.7)
We get that if two random variables have the same generating functions, then their distributions coincide.

2) Derivatives of the generating function allow us to obtain not only probabilities but also moment characteristics (if they exist) of the discrete random variables under study. The fact is that for any 0 ⩽ s ⩽ 1 and any k = 1, 2, … the following equalities are true:

P⁽ᵏ⁾(s) = Σ_{n=k}^∞ n(n − 1)(n − 2)…(n − k + 1) p_n s^(n−k),    (3.1.8)

moreover, these derivatives do not decrease with increasing s. This means that there exist finite or infinite limits of the functions P⁽ᵏ⁾(s) as s → 1. We get that the following sum (finite or infinite)

Σ_{n=k}^∞ n(n − 1)(n − 2)…(n − k + 1) p_n
coincides with the value of the corresponding derivative of the generating function, no longer at zero, as was the case for the probabilities, but at unity, i.e.,

Σ_{n=k}^∞ n(n − 1)(n − 2)…(n − k + 1) p_n = P⁽ᵏ⁾(1), k = 1, 2, ….    (3.1.9)
Later, when we study other moment characteristics of discrete random variables, we will return to the last equality. Now we restrict ourselves to the case k = 1 and obtain the relation

Σ_{n=0}^∞ n p_n = P′(1),
(3.1.10)
from which it follows that to find the mathematical expectations of discrete random variables taking integer nonnegative values, we can use the equality 𝐸𝜉 = 𝑃′ (1).
(3.1.11)
Comment. Sometimes, to simplify the calculations, instead of the generating function, its logarithm 𝑄(𝑠) = 𝑙𝑛(𝑃(𝑠)) is used. In this case, we obtain 𝑄 ′ (1) = 𝑃′ (1)⁄𝑃 (1) = 𝑃′ (1) = 𝐸𝜉.
(3.1.12)
The use of the last equality can be illustrated by the example of the Poisson π(λ) distribution, whose generating function has the form P(s) = exp(λ(s − 1)) and whose logarithm Q(s) = λ(s − 1) has a simpler form. Applying relation (3.1.12), we immediately get that in this case Eξ = λ.

3) The following property of generating functions is often used: if ξ and η are independent random variables and ν = ξ + η
then the generating function of the sum coincides with the product of the generating functions of the terms, i.e., in this case 𝑃𝜈 (𝑠) = 𝐸𝑠 𝜈 = 𝐸𝑠 𝜉+𝜂 = 𝐸𝑠 𝜉 ⋅ 𝐸𝑠 𝜂 = 𝑃𝜉 (𝑠) ⋅ 𝑃𝜂 (𝑠).
(3.1.13)
Naturally, this property extends to generating functions for sums of any number of independent random variables. We now present a problem whose solution, although it may look too formal, uses all three of the properties of generating functions cited above.

Problem 1. Let independent random variables ξ_k ∼ π(λ_k), k = 1, 2, …, n, have Poisson distributions and let ν = ξ₁ + ξ₂ + ⋯ + ξₙ represent their sum. We want to find the distribution of the random variable ν and its mathematical expectation.

Solution. The generating function of each of the random terms ξ_k in the presented sum has the form P_k(s) = exp(λ_k(s − 1)), k = 1, 2, …, n. According to the third of the above properties of generating functions, using the independence of our random terms, we obtain that P_ν(s), the generating function of the sum, coincides with the product of the generating functions of the terms and has the form P_ν(s) = exp(μ(s − 1)), where μ = λ₁ + λ₂ + ⋯ + λₙ. We see that this function coincides with the generating function of the Poisson π(μ) distribution. Given the first of the properties above, we immediately get that the random variable ν has just such a Poisson distribution and
P{ν = k} = e^(−μ) μᵏ/k!, k = 0, 1, 2, ….
It remains to recall the second property, which has already helped us find out that the mathematical expectation of a Poisson random variable coincides with its parameter. It follows that Eν = μ = λ₁ + λ₂ + ⋯ + λₙ.

Consider the following problem and use another useful property of generating functions that will make it easy to solve.

Problem 2. A player rolls a fair die until the first appearance of a six, each time scoring as many points as the die shows. The game stops as soon as the first six appears, and these six points are not added to the total. How many points will the player score on average?

Comment. The total number of points in this case can be represented as ν = ξ₁ + ξ₂ + ⋯ + ξ_N, where N = 0, 1, 2, … is the number of successes (the number of faces with numbers 1–5) until the first failure (represented by the appearance of the six). Each of the independent random variables ξ₁, ξ₂, … takes the values 1, 2, 3, 4 and 5 with equal chances. It is necessary to find the mathematical expectation of a sum of a random number N of random terms, where N is independent of these terms. Let us consider a more general situation, when ν = ξ₁ + ξ₂ + ⋯ + ξ_N is a sum of independent identically distributed integer-valued random variables, each of which has a generating function P(s), and the number of terms N in this sum is a random variable taking values 0, 1, 2, … and having its own generating function

Q(s) = Σ_{n=0}^∞ P{N = n} sⁿ.
Let 𝑅(𝑠) be the generating function of the sum 𝜈. Denote 𝑇(0) = 0 and 𝑇(𝑛) = 𝜉1 + 𝜉2 + ⋯ + 𝜉𝑛 , 𝑛 = 1,2, ….
Relation (3.1.13) and its generalizations allow us to write that for each fixed n = 0, 1, 2, … the generating function P_n(s) = Es^(T(n)) has the form P_n(s) = (P(s))ⁿ. Since by the formula of the total probability

P{ν = m} = P{ξ₁ + ⋯ + ξ_N = m} = Σ_{n=0}^∞ P{ξ₁ + ⋯ + ξₙ = m} P{N = n}, m = 0, 1, …,
(3.1.14)
one gets that

R(s) = Σ_{m=0}^∞ P{ν = m} sᵐ = Σ_{m=0}^∞ Σ_{n=0}^∞ P{ξ₁ + ⋯ + ξₙ = m} P{N = n} sᵐ = Σ_{n=0}^∞ (Σ_{m=0}^∞ P{ξ₁ + ⋯ + ξₙ = m} sᵐ) P{N = n}

and

Σ_{n=0}^∞ (Σ_{m=0}^∞ P{ξ₁ + ⋯ + ξₙ = m} sᵐ) P{N = n} = Σ_{n=0}^∞ P_n(s) P{N = n} = Σ_{n=0}^∞ P{N = n}(P(s))ⁿ = Q(P(s)).

Having established that the generating function R(s) can be represented as this superposition of two generating functions, we can immediately infer that the mathematical expectation of the sum ν of a random number N of random terms has the form

Eν = R′(1) = Q′(P(1)) P′(1) = Q′(1) P′(1) = EN · Eξ,
(3.1.15)
that is, it represents the product of the mean values of ξ and N.

Problem 2 (solution). Since in the situation under consideration ξ₁, ξ₂, … take the values 1, 2, 3, 4 and 5 with equal probabilities, it is necessary to substitute Eξ = 3 into formula (3.1.15). The mathematical expectation of the geometrically distributed random variable N can be found using the generating function (3.1.5) at p = 5/6: EN = p/(1 − p) = 5. We finally obtain that Eν = 15.
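As an independent sanity check of this answer (not part of the original text), one can simulate the game of Problem 2 directly:

```python
import random

def play(rng: random.Random) -> int:
    # Roll a fair die until the first six; sum the points of the non-six rolls.
    total = 0
    while True:
        roll = rng.randint(1, 6)
        if roll == 6:
            return total
        total += roll

rng = random.Random(1)
n_games = 200_000
avg = sum(play(rng) for _ in range(n_games)) / n_games
# By (3.1.15): E nu = EN * E xi = 5 * 3 = 15; the sample mean should be close.
assert abs(avg - 15) < 0.3
```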
3.2. CHARACTERISTIC FUNCTIONS AND THEIR PROPERTIES

In the previous section we introduced the reader to generating functions. They make it possible to simplify the solution of a number of probabilistic problems. In particular, generating functions help to obtain relatively easily various moment characteristics of random variables, which will be used later when studying variances and other moments. However, it should be noted that these functions have a limited field of application: they are defined only for a rather restricted, albeit quite important, class of discrete random variables. The main convenient properties of generating functions are inherited by the so-called characteristic functions, which are defined for arbitrary probability distributions. We note that almost always the relations obtained for characteristic functions remain valid (suitably amended) for generating functions. Therefore, we restrict ourselves to a more detailed study of the useful properties of characteristic functions. Consider an arbitrary random variable ξ having some distribution function F(x). The characteristic function ψ(t) of this random variable is given by

ψ(t) = E(e^(itξ)) = ∫₋∞^∞ e^(itx) dF(x), −∞ < t < ∞.
(3.2.1)
For a discrete random variable ξ taking values x_k, k = 1, 2, …, with the corresponding probabilities p_k = P{ξ = x_k}, definition (3.2.1) can be rewritten in the form

ψ(t) = Σ_k exp(itx_k) p_k.    (3.2.2)

Equality (3.2.2) allows, for random variables taking only integer nonnegative values, i.e., having some generating function P(s), to connect these two related functions as follows:

ψ(t) = P(exp(it)).    (3.2.3)
If ξ has some distribution density f(x) then we obtain that

ψ(t) = ∫₋∞^∞ exp(itx) f(x) dx.
(3.2.4)
The reader may be confused by the fact that, introducing random variables, we defined them as measurable functions taking real values, while when considering this new object we are already dealing with the complex-valued functions exp(itξ). One could specifically introduce such an object as complex-valued random variables, but it is easier, when giving the definition of a characteristic function, to give its alternative notation in the form ψ(t) = E cos(tξ) + iE sin(tξ), where the real and imaginary parts of ψ(t) are presented in the form of mathematical expectations of completely “legitimate” (real-valued for each fixed value of the parameter t) random variables cos(tξ) and sin(tξ). Here are a few properties of characteristic functions that often make it easier to work with random variables and their moment characteristics. We note first of all the following easily obtained relations:
ψ(0) = ∫₋∞^∞ dF(x) = 1 and |ψ(t)| ⩽ 1.

As in the case of generating functions, a specific characteristic function ψ(t) uniquely determines the distribution of the corresponding random variable ξ. In the general case, when it is necessary to recover the distribution of ξ from its characteristic function ψ(t), one can use the following rather cumbersome inversion formula:

F(y) − F(x) = (1/2π) ∫₋∞^∞ ((exp(−itx) − exp(−ity))/it) ψ(t) dt,    (3.2.5)

which is valid for any continuity points x and y of the distribution function F(x) of the random variable ξ. Of course, it can be difficult to use formula (3.2.5), but its main advantage is that it implies a one-to-one correspondence between characteristic functions and distributions of random variables. Note that if the condition

∫₋∞^∞ |ψ(t)| dt < ∞    (3.2.6)

holds, then this fact guarantees the existence of a distribution density f(x) for the random variable ξ, which in this case is given by the simpler equality

f(x) = (1/2π) ∫₋∞^∞ exp(−itx) ψ(t) dt.    (3.2.7)
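Formula (3.2.7) can be illustrated numerically. The sketch below (our own, with an arbitrarily chosen truncation of the integral) inverts ψ(t) = 1/(1 + t²), which, as shown later in (3.2.32), corresponds to the standard Laplace density e^(−|x|)/2:

```python
import numpy as np

# Numerically invert psi(t) = 1/(1 + t^2) via (3.2.7), truncating the
# integral to [-1000, 1000] and using a plain Riemann sum.
t = np.linspace(-1000.0, 1000.0, 2_000_001)
dt = t[1] - t[0]
psi = 1.0 / (1.0 + t**2)

def density(x: float) -> float:
    return float((np.exp(-1j * t * x) * psi).real.sum() * dt / (2 * np.pi))

# Compare with the Laplace density f(x) = exp(-|x|)/2 at a few points.
for x in (0.0, 0.5, 2.0):
    assert abs(density(x) - 0.5 * np.exp(-abs(x))) < 1e-3
```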
We also give the following useful properties of characteristic functions. 1) If 𝜉, 𝜂 are independent random variables and 𝜈 = 𝜉 + 𝜂 then 𝜓𝜈 (𝑡) = 𝜓𝜉 (𝑡) ⋅ 𝜓𝜂 (𝑡).
(3.2.8)
Naturally this property extends to any number of independent random
terms.

2) If we pass from a random variable ξ, using a linear transformation, to a random variable η = aξ + b, then the characteristic functions of these random variables are related by

ψ_η(t) = E exp(it(aξ + b)) = exp(itb) E exp(itaξ) = exp(itb) ψ_ξ(at).
(3.2.9)
In particular, it follows from (3.2.9) that if η = −ξ, then

ψ_η(t) = ψ₋ξ(t) = ψ_ξ(−t) = E cos(tξ) − iE sin(tξ) = ψ̄_ξ(t),    (3.2.10)

where the bar denotes complex conjugation.

3) Taking into account equality (3.2.10), we find that in order for a random variable to have a distribution symmetric with respect to zero, it is necessary and sufficient that its characteristic function be real.

Comment. Property 3 helps to state that if, after some operations with random variables, we get an output with a real characteristic function, then the mathematical expectation of this quantity (if it exists!) is zero.

4) The following relation allows one, knowing the characteristic function, to find various moment characteristics of a random variable. For any n = 1, 2, …, if the corresponding derivative of the characteristic function exists, the following equality holds:

ψ⁽ⁿ⁾(t) = iⁿ ∫₋∞^∞ xⁿ exp(ixt) dF(x).
(3.2.11)
We get that if a random variable has a moment of some order 𝑛 = 1,2, … then the following equality holds
Eξⁿ = ψ⁽ⁿ⁾(0)/iⁿ.
(3.2.12)
In particular, we note the often used fact that if a random variable 𝜉 has a mathematical expectation, then the characteristic function has a first derivative and the following useful relation holds: 𝐸𝜉 = −𝑖𝜓 ′ (0).
(3.2.13)
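Relations (3.2.12) and (3.2.13) are easy to verify symbolically, for instance for the E(1) exponential law with ψ(t) = 1/(1 − it). A sketch of ours using sympy:

```python
from sympy import I, symbols, diff, simplify

t = symbols('t', real=True)
psi = 1 / (1 - I*t)          # characteristic function of the E(1) exponential law

# E xi^n = psi^(n)(0) / i^n, relation (3.2.12)
moments = [simplify(diff(psi, t, n).subs(t, 0) / I**n) for n in (1, 2, 3)]
assert moments == [1, 2, 6]  # E xi^n = n! for the E(1) distribution
```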
In some situations it is more convenient to take the first derivative of the function g(t) = ln(ψ(t)), since in this case, as is easily seen, the equality Eξ = −ig′(0) holds.

Comment. As in the case of discrete random variables, where relation (3.1.15) was obtained, which made it possible to find the generating function for the sum of a random number of independent identically distributed random variables, we can consider the corresponding situation and apply the method used to prove equality (3.1.15) to similar sums whose terms are no longer necessarily discrete. So let the sum ν = ξ₁ + ξ₂ + ⋯ + ξ_N consist of independent identically distributed random variables, each of which has a characteristic function ψ(t), and let the number of terms N in this sum be a random variable (independent of the terms ξ) taking values 0, 1, 2, … and having a generating function

Q(s) = Σ_{n=0}^∞ P{N = n} sⁿ.
It turns out that the characteristic function 𝑔(𝑡) of the random variable 𝜈 is given (compare with relation (3.1.15)) by the equality 𝑔(𝑡) = 𝑄(𝜓(𝑡)).
(3.2.14)
Comment. Having obtained equality (3.2.14), it is not difficult, as in the case considered earlier when discrete random terms were summed, to derive, for example, that the mathematical expectation of the sum ν of a random number of terms has the form

Eν = −ig′(0) = −iQ′(ψ(0))ψ′(0) = −iQ′(1)ψ′(0) = EN · Eξ,    (3.2.15)

that is, it represents the product of the mean values of ξ and N. We must not forget that the existence of each of these two mathematical expectations is assumed. We give below the form of the characteristic functions of the most common probability distributions.

1) Degenerate distribution

If ξ has a distribution degenerate at some point c then we immediately get that

ψ_ξ(t) = e^(itc).
(3.2.16)
In particular, if c = 0 then ψ_ξ(t) = 1 for −∞ < t < ∞.

2) Two-point distribution

Let ξ take two values and P{ξ = a} = p, P{ξ = b} = 1 − p, 0 < p < 1. In this case

ψ(t) = p exp(ita) + (1 − p) exp(itb).    (3.2.17)

We separate a special case of two-point distributions when
a = −1, b = 1 and p = 1/2. In this situation

ψ_ξ(t) = (exp(it) + exp(−it))/2 = cos(t).
(3.2.18)
For a number of discrete distributions the generating functions are already known. In order not to duplicate the calculations, we can recall equality (3.2.3), ψ_ξ(t) = P(e^(it)), which allows us to obtain immediately the form of the characteristic functions of several distributions.

3) Binomial B(n, p) distribution

From relation (3.1.2), in which the generating function of this distribution is presented, using the “hint” (3.2.3) we immediately obtain that in this case

ψ_ξ(t) = (1 − p + pe^(it))ⁿ.
(3.2.19)
4) Poisson distribution

Let ξ ∼ π(λ) have a Poisson distribution with parameter λ > 0. In this case, recalling equality (3.1.6) for the generating function, we obtain

ψ_ξ(t) = exp(λ(e^(it) − 1)).
(3.2.20)
5) Geometric distribution We consider a discrete random variable 𝜉 taking values 0,1,2, . .. with probabilities
p_n = P{ξ = n} = (1 − p)pⁿ, n = 0, 1, 2, …, 0 < p < 1, and the generating function P(s) = (1 − p)/(1 − ps). We get that the characteristic function in this case has the form

ψ_ξ(t) = (1 − p)/(1 − p exp(it)).
(3.2.21)
We now turn to the most common random variables that have absolutely continuous distributions.

6) Uniform distribution

Let ξ have the uniform distribution on a segment [a, b], a < b. This means that the random variable has the distribution density

p_ξ(x) = 1/(b − a) if x ∈ [a, b], and p_ξ(x) = 0 if x ∉ [a, b].

Then

ψ_ξ(t) = ∫_a^b exp(itx)/(b − a) dx = (e^(itb) − e^(ita))/(it(b − a)).    (3.2.22)

Separately we note here the case of the distribution that is uniform on a symmetric interval [−a, a], a > 0. In this case

ψ_ξ(t) = (e^(ita) − e^(−ita))/(2ita) = sin(at)/at.
(3.2.23)
In particular, for the uniform 𝑈([−1,1]) distribution we obtain the characteristic function
ψ_ξ(t) = sin(t)/t
(3.2.24)
7) Triangular distribution

The uniform distribution is closely related to the so-called triangular distribution, the standard representative of which takes values on the interval [−1, 1] and has the distribution density

f_η(x) = 1 − |x| if −1 ≤ x ≤ 1, and f_η(x) = 0 if |x| > 1.    (3.2.25)

Note that this random variable can be represented as the sum η = ξ₁ + ξ₂ of two independent random variables ξ₁ and ξ₂ uniformly distributed over the interval [−1/2, 1/2], each of which has (see equality (3.2.23) for a = 1/2) the characteristic function ψ_ξ(t) = 2 sin(t/2)/t. Hence

ψ_η(t) = (2 sin(t/2)/t)² = 4(sin(t/2))²/t².    (3.2.26)

Comment. The form of the characteristic function (3.2.26) for the density (3.2.25) allows us to carry out the following reasoning. Recalling formula (3.2.7), we arrive at the relation

f_η(x) = (1/2π) ∫₋∞^∞ exp(−itx) ψ_η(t) dt.    (3.2.27)

This equality in this case can be rewritten as

f_η(x) = ∫₋∞^∞ exp(−itx) (2(sin(t/2))²/(πt²)) dt.    (3.2.28)
We also note that ψ_η(0) = 1. Taking into account the definition of characteristic functions and relation (3.2.28), we carry out the following reasoning. Let us consider the distribution density

g(x) = 2(sin(x/2))²/(πx²), −∞ < x < ∞.
The fact that this function is the distribution density of some random variable follows from the fact that it is nonnegative and

∫₋∞^∞ g(x) dx = ∫₋∞^∞ (2(sin(x/2))²/(πx²)) dx = 1.

Then from relations (3.2.25) and (3.2.28) we obtain that its characteristic function ψ(t) has the form

ψ(t) = 1 − |t| if |t| ⩽ 1 and ψ(t) = 0 if |t| > 1.
(3.2.29)
Here it is precisely the fact that the function presented in (3.2.29) is a characteristic function that is important to us. This fact will allow us to obtain a number of other useful results for characteristic functions.

8) Exponential distribution

We start with a random variable ξ having the standard E(1) exponential distribution with the distribution density function

f_ξ(x) = e^(−x) if x ≥ 0, and f_ξ(x) = 0 if x < 0.

One gets that

ψ_ξ(t) = 1/(1 − it).
(3.2.30)
For an arbitrary E(λ)-distributed random variable η with the distribution density f_η(x) = exp(−x/λ)/λ, x ≥ 0, λ > 0, the characteristic function has the form

ψ_η(t) = ψ_ξ(λt) = 1/(1 − iλt).
(3.2.31)
9) Laplace distribution

The distribution density of a random variable η has in this case the form

p_η(x) = (1/2) e^(−|x|), −∞ < x < ∞.

Note that its distribution coincides with the distribution of the difference η = ξ₁ − ξ₂ of two independent E(1)-distributed random variables. In terms of the characteristic function ψ_ξ(t) given in (3.2.30), and using the equality ψ₋ξ(t) = ψ_ξ(−t), this fact allows us to obtain immediately that

ψ_η(t) = ψ_ξ₁(t) ψ₋ξ₂(t) = ψ_ξ(t) ψ_ξ(−t) = 1/(1 + t²).
(3.2.32)
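Relation (3.2.32) can also be confirmed by simulation: the empirical characteristic function of a difference of two independent E(1) variables should approach 1/(1 + t²). A rough sketch (sample size chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
# eta = xi1 - xi2 with independent standard exponential xi1, xi2
eta = rng.exponential(size=500_000) - rng.exponential(size=500_000)

# Empirical characteristic function E exp(it eta) vs 1/(1+t^2)
for t in (0.5, 1.0, 2.0):
    emp = np.exp(1j * t * eta).mean()
    assert abs(emp - 1 / (1 + t**2)) < 0.01
```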
10) Gamma distribution

The standard representative of the Gamma(α) distribution with parameter α > 0 has the density function f_α(x) = x^(α−1) exp(−x)/Γ(α), x > 0, where

Γ(α) = ∫₀^∞ x^(α−1) exp(−x) dx.

We also note that for integer values α = n, n = 1, 2, …, the equality Γ(n) = (n − 1)! holds. In many applications of probability theory the Gamma(n) distribution with an integer parameter α = n appears when one needs to work with sums ηₙ = ξ₁ + ξ₂ + ⋯ + ξₙ of independent exponential random variables. Say, in classical queueing theory ηₙ can represent the time required to serve a group of n customers.
Namely, if ηₙ = ξ₁ + ξ₂ + ⋯ + ξₙ, where ξ_k, k = 1, 2, …, n, are the service times of claims in some queueing system, each having, say, the E(1) distribution, then ηₙ represents the time required to serve a group of n claims, and ηₙ has the Gamma(n) distribution. The presented representation of the Gamma(n)-distributed random variable ηₙ in terms of independent random terms allows us to obtain immediately, using equality (3.2.31), its characteristic function:

ψ_ηₙ(t) = E exp(itηₙ) = 1/(1 − it)ⁿ.
(3.2.33)
In the general case, when it comes to the Gamma distribution with an arbitrary (not necessarily integer) parameter α > 0, not very complicated calculations lead us to the more general equality

ψ(t) = 1/(1 − it)^α
(3.2.34)
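The representation of a Gamma(n) variable as a sum of n independent E(1) terms, whose characteristic function is then the product of n copies of (3.2.31), is easy to check empirically. A short sketch (parameters chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# Gamma(5) built as a sum of five independent E(1) terms
eta = rng.exponential(size=(200_000, n)).sum(axis=1)

t = 0.7
emp = np.exp(1j * t * eta).mean()   # empirical characteristic function at t
assert abs(emp - 1 / (1 - 1j * t)**n) < 0.01
```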
for the characteristic function of a Gamma(α)-distributed random variable.

Comment. It is clear that the exponential E(1) distribution is a special case (for α = 1) of the Gamma distributions. It has already been noted that Gamma(n) distributions can be represented as sums of the corresponding number of independent identically distributed exponential random terms. In turn, a comparison of the characteristic functions defined by equalities (3.2.30) and (3.2.34) allows us to state that the exponential E(1) distribution itself can be represented, for any n = 1, 2, …, as a sum of n independent random terms, each of which has a Gamma distribution with parameter α = 1/n.

11) Cauchy distribution

It has already been noted above that in the world of classical probability distributions a special place is occupied by the Cauchy family of distributions, whose standard representative has the distribution density p(x) = 1/(π(1 + x²)), −∞ < x < ∞.
Turning to a probabilistic mixture

ψ(t) = p₁ψ_A(t) + p₂ψ_B(t),

where p₁ + p₂ = 1 and ψ_A(t) and ψ_B(t) are two characteristic functions, we see that ψ(t) will also be a characteristic function. Mixtures of this kind of a finite number of different triangular characteristic functions are considered, and then, passing with their help from a finite to an infinite number of component functions, one can obtain any function described in the Polya statement. If we recall the above remark, in which condition (3.2.42) appears, then, comparing two obviously characteristic functions, one of which tends to zero at infinity while the second tends to some p (0 < p < 1), we can get an example of characteristic functions that coincide only on some finite interval. On the other hand, returning to the characteristic functions ψ_A(t) and ψ_B(t) indicated above, 0 < A < B, we see that the values of these functions coincide everywhere except on the two finite intervals (−B, 0) and (0, B). The above constructions of characteristic functions allow one to obtain an interesting result.

Statement 3.2.1. If we consider three independent random variables ξ, η and ν, then from the distributional equality

ξ + η =ᵈ η + ν

the seemingly natural relation

ξ =ᵈ ν    (3.2.43)

does not follow.
To construct the corresponding example, instead of relation (3.2.43), it suffices to consider the equivalent (in terms of characteristic functions) equality 𝜓𝜉 (𝑡)𝜓𝜂 (𝑡) = 𝜓𝜂 (𝑡)𝜓𝜈 (𝑡).
(3.2.44)
Here the constructions of the necessary characteristic functions given above within the framework of the Polya criterion will help. Indeed, considering the above example of two functions (denote them ψ_ξ(t) and ψ_ν(t)) that coincide only on some finite interval [−A, A], it is enough to take the triangular function ψ_A(t) as the third characteristic function ψ_η(t). In this case relation (3.2.44) is satisfied, but the random variables ξ and ν have different characteristic functions and, therefore, their distributions do not coincide.
Chapter 4
SOME UNIVARIATE CONTINUOUS PROBABILITY DISTRIBUTIONS In this chapter we will present some univariate continuous distributions.
4.1. ARCSINE DISTRIBUTION

A random variable X is said to have the arcsine distribution if its pdf f(x) is as follows:

f(x) = 1/(π√(x(1 − x))), 0 < x < 1.

The pdf of the arcsine distribution is given in Figure 4.1.
Figure 4.1. The pdf of the arcsine distribution.

Mean = 1/2
Variance = 1/8
E(Xⁿ) = (2n)!/(4ⁿ(n!)²)
Moment generating function M(t) = Σ_{k=0}^∞ (1/4ᵏ)((2k)!/(k!k!))(tᵏ/k!)
Characteristic function φ(t) = Σ_{k=0}^∞ (1/4ᵏ)((2k)!/(k!k!))((it)ᵏ/k!)
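The moment formula E(Xⁿ) = (2n)!/(4ⁿ(n!)²) can be verified numerically. A sketch of ours, using the substitution x = (1 + cos θ)/2, under which the arcsine moments become ordinary smooth integrals over (0, π):

```python
import math

def arcsine_moment(n: int, m: int = 100_000) -> float:
    # E X^n = (1/pi) * integral over (0, pi) of ((1 + cos(theta))/2)^n dtheta
    s = 0.0
    for j in range(m):
        theta = math.pi * (j + 0.5) / m   # midpoint rule
        s += ((1 + math.cos(theta)) / 2) ** n
    return s / m

assert abs(arcsine_moment(1) - 0.5) < 1e-6
assert abs(arcsine_moment(2) - 0.375) < 1e-6   # so Variance = 3/8 - 1/4 = 1/8
assert abs(arcsine_moment(3) - math.factorial(6) / (4**3 * math.factorial(3)**2)) < 1e-6
```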
4.2. BETA DISTRIBUTION

A random variable X is said to have the Beta distribution with parameters m and n if its pdf f(x) is of the following form:
f(x) = (1/B(m, n)) x^(m−1)(1 − x)^(n−1), 0 < x < 1, m > 0, n > 0, where

B(m, n) = ∫₀¹ x^(m−1)(1 − x)^(n−1) dx.

We will denote the beta distribution with the above pdf as BE(m, n). The pdfs BE(2,3), BE(3,3) and BE(3,4) are given in Figure 4.2.
Figure 4.2. pdfs BE(2,3) solid, BE(3,3) dash and BE(3,4) dot.
Mean = m/(m + n)
Variance = mn/((m + n)²(m + n + 1))
Moment about the origin E(Xᵏ) = Π_{j=0}^{k−1} (m + j)/(m + n + j), k = 1, 2, ….
Moment generating function M(t) = ₁F₁(m, m + n, t)
Characteristic function φ(t) = ₁F₁(m, m + n, it), where

₁F₁(a, b, x) = 1 + (a/b)x + (a(a + 1)/(b(b + 1)))(x²/2!) + …

is the Kummer confluent hypergeometric function.
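The product formula for E(Xᵏ) is easy to confirm numerically; a sketch of ours (the parameters m = 3, n = 4 are chosen arbitrarily):

```python
import math

def beta_moment(k: int, m: float, n: float, steps: int = 100_000) -> float:
    # E X^k for BE(m, n) by midpoint-rule integration of x^k * pdf(x)
    B = math.gamma(m) * math.gamma(n) / math.gamma(m + n)
    s = 0.0
    for j in range(steps):
        x = (j + 0.5) / steps
        s += x**k * x**(m - 1) * (1 - x)**(n - 1) / B
    return s / steps

m, n = 3, 4
# Product formula: E X^k = prod over j = 0..k-1 of (m+j)/(m+n+j)
for k in (1, 2, 3):
    prod = math.prod((m + j) / (m + n + j) for j in range(k))
    assert abs(beta_moment(k, m, n) - prod) < 1e-5
```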
4.3. CAUCHY DISTRIBUTION

The pdf f(x) of the Cauchy distribution with location parameter μ and scale parameter σ is

f_{μ,σ}(x) = (1/(πσ)) · 1/(1 + ((x − μ)/σ)²), −∞ < x < ∞, −∞ < μ < ∞, σ > 0.

We denote the Cauchy distribution whose pdf is given above as CA(μ, σ). Figure 4.3 gives the pdfs of CA(0,1), CA(0,2) and CA(0,3).
Figure 4.3. pdfs CA(0,1) solid, CA(0,2) dash and CA(0,3) dot.
Mean: does not exist.
Variance: does not exist.
Moment generating function M(t): does not exist.
Characteristic function φ(t) = e^(iμt − σ|t|).
4.4. CHI-SQUARE DISTRIBUTION

A random variable X is said to have the chi-square distribution with location parameter μ, scale parameter σ and n degrees of freedom if its pdf f(x) is of the following form:

f(x) = (1/(2^(n/2) Γ(n/2) σ)) ((x − μ)/σ)^(n/2 − 1) e^(−(x−μ)/(2σ)), x > μ, σ > 0, and n is a positive integer.

We denote the chi-square distribution with the above pdf as CH(μ, σ, n). The pdfs CH(0,1,4), CH(0,1,8) and CH(0,1,10) are given in Figure 4.4.

Mean = n and Variance = 2n (for CH(0, 1, n))
E(X − μ)ʳ = (2σ)ʳ Γ(r + n/2)/Γ(n/2)

The moment generating function
M(t) = e^(μt)(1 − 2σt)^(−n/2), t < 1/(2σ).

4.5. EXPONENTIAL DISTRIBUTION

A random variable X is said to have the exponential distribution with location parameter μ and scale parameter σ if its pdf f(x) is of the following form:

f(x) = (1/σ) e^(−(x−μ)/σ), x > μ, σ > 0.

We denote E(μ, σ) as the exponential distribution that has the above pdf. The cumulative distribution function F(x) is

F(x) = 1 − e^(−(x−μ)/σ), x ≥ μ.
The pdfs of E(0,1), E(0,2) and E(0,4) are given in Figure 4.5.

Mean = μ + σ
Variance = σ²
Moment about zero E(Xᵏ) = Σ_{j=0}^k (k!/(k − j)!) μ^(k−j) σʲ
Moment generating function M(t) = e^(μt)/(1 − σt), t < 1/σ
Characteristic function φ(t) = e^(iμt)/(1 − iσt)
Figure 4.5. pdfs E(0,1) solid, E(0,2) dash and E(0,4) dot.
4.6. GAMMA DISTRIBUTION

A random variable X is said to have the gamma distribution with parameters a and b if its pdf f(x) is of the following form:

f(x) = (1/(Γ(a) bᵃ)) x^(a−1) e^(−x/b), x ≥ 0, a > 0, b > 0.

The pdfs of GA(2,1), GA(5,1) and GA(10,1) are given in Figure 4.6.

Mean = ab
Variance = ab²
Moment about the origin E(Xᵏ) = (Γ(a + k)/Γ(a)) bᵏ
Moment generating function M(t) = (1 − bt)⁻ᵃ, t < 1/b
Characteristic function φ(t) = (1 − ibt)⁻ᵃ.
Figure 4.6. pdfs GA(2,1) solid, GA(5,1) dash and GA(10,1) dot.
4.7. INVERSE GAUSSIAN DISTRIBUTION

A random variable X is said to have the inverse Gaussian distribution if its pdf f(x) is as follows:

f(x) = (λ/(2πx³))^(1/2) e^(−λ(x−μ)²/(2μ²x)), x > 0, λ > 0, μ > 0.
We denote the inverse Gaussian distribution with the above pdf as IG(λ, μ). The pdfs of IG(1,1), IG(1,3) and IG(3,2) are given in Figure 4.7.

Figure 4.7. pdfs IG(1,1) solid, IG(1,3) dash and IG(3,2) dot.

Mean = μ
Variance = μ³/λ
E(Xᵏ) = μᵏ Σ_{i=0}^{k−1} ((k − 1 + i)!/(i!(k − 1 − i)!)) (μ/(2λ))ⁱ
Moment generating function: $M(t) = e^{\frac{\lambda}{\mu}\left(1-\left(1-\frac{2\mu^2 t}{\lambda}\right)^{1/2}\right)}$
Characteristic function: $\varphi(t) = e^{\frac{\lambda}{\mu}\left(1-\left(1-\frac{2\mu^2 it}{\lambda}\right)^{1/2}\right)}$.
The IG(1,1) is known as Wald distribution.
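The IG(λ, μ) moments can be checked with `scipy.stats.invgauss`; the parameter mapping used below (shape μ/λ, scale λ) is an assumption of this sketch.

```python
import numpy as np
from scipy import stats
from math import factorial

lam, mu = 3.0, 2.0  # IG(lambda, mu) in the text's notation
dist = stats.invgauss(mu / lam, scale=lam)  # assumed mapping to IG(lam, mu)

assert np.isclose(dist.mean(), mu)
assert np.isclose(dist.var(), mu ** 3 / lam)

# E(X^k) = mu^k * sum_{i=0}^{k-1} (k-1+i)! / (i! (k-1-i)!) * (mu / (2 lam))^i
k = 2
moment = mu ** k * sum(
    factorial(k - 1 + i) / (factorial(i) * factorial(k - 1 - i))
    * (mu / (2 * lam)) ** i
    for i in range(k)
)
assert np.isclose(dist.moment(k), moment)
```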
4.8. LAPLACE DISTRIBUTION

A random variable X is said to have the Laplace distribution if its pdf f(x) is of the following form:

$$f(x) = \frac{1}{2\sigma}\,e^{-|x-\mu|/\sigma},\quad -\infty<x<\infty,\ -\infty<\mu<\infty,\ \sigma>0.$$
Figure 4.8. pdfs of LAP(0,1/2) solid, LAP(0,1) dash and LAP(0,2) dot.
We will denote the Laplace distribution with the above pdf as LAP($\mu,\sigma$). Figure 4.8 gives the pdfs of LAP(0,1/2), LAP(0,1) and LAP(0,2).

Mean: $\mu$
Variance: $2\sigma^2$
Moments about the mean: $E(X-\mu)^k = 0$ for $k = 1,3,5,\dots$; $= k!\,\sigma^k$ for $k = 2,4,6,\dots$
Moment generating function: $M(t) = \frac{e^{\mu t}}{1-\sigma^2t^2}$, $|t|<1/\sigma$.
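A numerical check of the Laplace central moments, assuming `scipy.stats.laplace(loc=mu, scale=sigma)` corresponds to LAP(μ, σ):

```python
import numpy as np
from scipy import stats
from math import factorial

mu, sigma = 0.0, 1.5
dist = stats.laplace(loc=mu, scale=sigma)  # assumed to be LAP(mu, sigma)

assert np.isclose(dist.var(), 2 * sigma ** 2)

# Central moments: 0 for odd k, k! sigma^k for even k (mu = 0, so raw = central).
assert np.isclose(dist.moment(3), 0.0)
assert np.isclose(dist.moment(4), factorial(4) * sigma ** 4)
```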
4.9. LEVY DISTRIBUTION

A random variable X is said to have the Levy distribution if its pdf f(x) is of the following form:

$$f(x) = \left(\frac{\sigma}{2\pi}\right)^{1/2}\frac{e^{-\sigma/(2(x-\mu))}}{(x-\mu)^{3/2}},\quad x>\mu,\ \sigma>0.$$

We will denote the Levy distribution with the above pdf as LEV($\mu,\sigma$). Figure 4.9 gives the pdfs of LEV(0,1), LEV(0,3) and LEV(0,8).
Figure 4.9. pdfs of LEV(0,1) solid, LEV(0,3) dash and LEV(0,8) dot.
Mean: does not exist.
Variance: does not exist.
Moment generating function: does not exist.
Characteristic function: $\varphi(t) = e^{i\mu t-\sqrt{-2i\sigma t}}$.
4.10. LOGLOGISTIC DISTRIBUTION

A random variable X is said to have the loglogistic distribution if its pdf f(x) is of the following form:

$$f(x) = \frac{\alpha x^{\alpha-1}}{(1+x^{\alpha})^{2}},\quad x\ge0,$$

where $\alpha$ is a positive real number. We will denote the loglogistic distribution by
LL($\alpha$) if its pdf is as given above. Figure 4.10 gives the pdfs of LL(2), LL(4) and LL(8).
Figure 4.10. pdfs of LL(2) solid, LL(4) dash and LL(8) dot.
Mean: $\frac{\pi}{\alpha}\csc\frac{\pi}{\alpha}$, $\alpha>1$.
Variance: $\frac{2\pi}{\alpha}\csc\frac{2\pi}{\alpha}-\left(\frac{\pi}{\alpha}\csc\frac{\pi}{\alpha}\right)^{2}$, $\alpha>2$.
$E(X^k) = \frac{k\pi}{\alpha}\csc\frac{k\pi}{\alpha}$, $\alpha>k$.
Moment generating function: $M(t) = \sum_{k=0}^{\infty}\frac{t^k}{k!}\,B\!\left(\frac{\alpha+k}{\alpha},\frac{\alpha-k}{\alpha}\right)$
Characteristic function: $\varphi(t) = \sum_{k=0}^{\infty}\frac{(it)^k}{k!}\,B\!\left(\frac{\alpha+k}{\alpha},\frac{\alpha-k}{\alpha}\right)$.
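SciPy implements the loglogistic distribution under the name `fisk`; assuming `fisk(alpha)` corresponds to LL(α), the cosecant formulas above can be checked numerically:

```python
import numpy as np
from scipy import stats

alpha = 4.0
dist = stats.fisk(alpha)  # assumed to be LL(alpha)

def csc(x):
    return 1.0 / np.sin(x)

# Mean = (pi/alpha) csc(pi/alpha), alpha > 1.
assert np.isclose(dist.mean(), (np.pi / alpha) * csc(np.pi / alpha))

# E(X^k) = (k pi/alpha) csc(k pi/alpha), alpha > k.
k = 2
assert np.isclose(dist.moment(k), (k * np.pi / alpha) * csc(k * np.pi / alpha))

# Variance = (2pi/alpha) csc(2pi/alpha) - ((pi/alpha) csc(pi/alpha))^2, alpha > 2.
var = (2 * np.pi / alpha) * csc(2 * np.pi / alpha) \
    - ((np.pi / alpha) * csc(np.pi / alpha)) ** 2
assert np.isclose(dist.var(), var)
```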
4.11. LOGISTIC DISTRIBUTION

The pdf f(x) of the logistic distribution is

$$f(x) = \frac{1}{\sigma}\,\frac{e^{-(x-\mu)/\sigma}}{\left(1+e^{-(x-\mu)/\sigma}\right)^{2}},\quad -\infty<x<\infty,\ -\infty<\mu<\infty,\ \sigma>0.$$

We denote the logistic distribution with the above pdf as LO($\mu,\sigma$). The graphs of the pdfs of LO(0,1/2), LO(0,1) and LO(0,3) are given in Figure 4.11.
Figure 4.11. pdfs of LO(0,1/2) solid, LO(0,1) dash and LO(0,3) dot.
Mean: $\mu$
Variance: $\frac{\pi^2\sigma^2}{3}$
$E(X-\mu)^{2m+1} = 0$, $m = 0,1,2,\dots$
$E(X-\mu)^{2m} = 2\sigma^{2m}\pi^{2m}\left(2^{2m-1}-1\right)|B_{2m}|$, $m = 1,2,3,\dots$,
where $B_{2m}$ is the 2m-th Bernoulli number.
Moment generating function: $M(t) = e^{\mu t}\,\Gamma(1-\sigma t)\,\Gamma(1+\sigma t)$
Characteristic function: $\varphi(t) = e^{i\mu t}\,\Gamma(1-i\sigma t)\,\Gamma(1+i\sigma t)$.
4.12. NORMAL DISTRIBUTION

The pdf f(x) of the normal distribution is given below:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}},\quad -\infty<x<\infty,\ -\infty<\mu<\infty,\ \sigma>0.$$

We will denote the normal distribution with the above pdf as N($\mu,\sigma$). The graphs of N(0,1/2), N(0,1) and N(0,4) are given in Figure 4.12.
Figure 4.12. pdfs of N(0,1/2) solid, N(0,1) dash and N(0,4) dot.
Mean: $\mu$
Median: $\mu$
Variance: $\sigma^2$
Moments about the mean: $E(X-\mu)^k = 0$ for $k = 1,3,5,\dots$; $= \sigma^k(k-1)!!$ for $k = 2,4,6,\dots$
Moment generating function: $M(t) = e^{\mu t+\sigma^2t^2/2}$
Characteristic function: $\varphi(t) = e^{i\mu t-\sigma^2t^2/2}$.
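A numerical check of the central moment formula $\sigma^k(k-1)!!$, using `scipy.stats.norm`:

```python
import numpy as np
from scipy import stats

mu, sigma = 0.0, 2.0
dist = stats.norm(loc=mu, scale=sigma)  # N(mu, sigma)

assert np.isclose(dist.var(), sigma ** 2)

# Central moments: 0 for odd k, sigma^k (k-1)!! for even k (mu = 0 here).
assert np.isclose(dist.moment(3), 0.0)
assert np.isclose(dist.moment(4), sigma ** 4 * 3)   # (4-1)!! = 3
assert np.isclose(dist.moment(6), sigma ** 6 * 15)  # (6-1)!! = 15
```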
4.13. PARETO DISTRIBUTION

The pdf f(x) of the Pareto distribution is given below:

$$f(x) = \frac{\delta\sigma^{\delta}}{x^{\delta+1}},\quad x\ge\sigma>0,\ \delta>0.$$

We denote the Pareto distribution having the above pdf as PA($\sigma,\delta$). The pdfs of PA(1,1/2), PA(1,1) and PA(1,3) are given in Figure 4.13.

Mean: $\frac{\delta\sigma}{\delta-1}$, $\delta>1$
Variance: $\frac{\delta\sigma^2}{(\delta-2)(\delta-1)^2}$, $\delta>2$
Moment about the origin: $E(X^k) = \frac{\delta\sigma^k}{\delta-k}$, $\delta>k$.
The moment generating function does not exist.
Characteristic function: $\varphi(t) = \delta(-i\sigma t)^{\delta}\,\Gamma(-\delta,-i\sigma t)$, where $\Gamma(\cdot,\cdot)$ is the upper incomplete gamma function.
Figure 4.13. pdfs of PA(1,1/2) solid, PA(1,1) dash and PA(1,3) dot.
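Assuming `scipy.stats.pareto(delta, scale=sigma)` corresponds to PA(σ, δ), the mean, variance and moment formulas above can be checked as follows:

```python
import numpy as np
from scipy import stats

sigma, delta = 1.0, 3.0
dist = stats.pareto(delta, scale=sigma)  # assumed to be PA(sigma, delta)

assert np.isclose(dist.mean(), delta * sigma / (delta - 1))
assert np.isclose(dist.var(),
                  delta * sigma ** 2 / ((delta - 1) ** 2 * (delta - 2)))

# E(X^k) = delta sigma^k / (delta - k), delta > k.
k = 2
assert np.isclose(dist.moment(k), delta * sigma ** k / (delta - k))
```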
4.14. POWER FUNCTION DISTRIBUTION

The pdf f(x) of the power function distribution is given below:

$$f(x) = \frac{\delta}{b-a}\left(\frac{x-a}{b-a}\right)^{\delta-1},\quad a\le x\le b,\ \delta>0.$$

We denote the power function distribution with the above pdf as PO(a, b, $\delta$).
The pdfs of PO(0,1,2), PO(0,1,3) and PO(0,1,5) are given in Figure 4.14.
Figure 4.14. pdfs of PO(0,1,2) solid, PO(0,1,3) dash and PO(0,1,5) dot.
Mean: $a+\frac{\delta}{\delta+1}(b-a)$
Variance: $\frac{\delta(b-a)^2}{(\delta+2)(\delta+1)^2}$
$E(X^r) = \sum_{i=0}^{r}\frac{r!}{i!\,(r-i)!}\,a^{r-i}(b-a)^{i}\,\frac{\delta}{\delta+i}$
Moment generating function: $M(t) = e^{at}\sum_{k=0}^{\infty}\frac{t^k(b-a)^k}{k!}\,\frac{\delta}{k+\delta}$
Characteristic function: $\varphi(t) = e^{iat}\sum_{k=0}^{\infty}\frac{(it)^k(b-a)^k}{k!}\,\frac{\delta}{k+\delta}$.
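Under the assumptions of this sketch, PO(a, b, δ) can be represented as `scipy.stats.powerlaw(delta, loc=a, scale=b-a)`; the listed moments then check out numerically:

```python
import numpy as np
from scipy import stats
from math import comb

a, b, delta = 1.0, 3.0, 2.0
dist = stats.powerlaw(delta, loc=a, scale=b - a)  # assumed PO(a, b, delta)

assert np.isclose(dist.mean(), a + delta * (b - a) / (delta + 1))
assert np.isclose(dist.var(),
                  delta * (b - a) ** 2 / ((delta + 2) * (delta + 1) ** 2))

# E(X^r) = sum_i C(r, i) a^{r-i} (b-a)^i delta / (delta + i)
r = 3
moment = sum(comb(r, i) * a ** (r - i) * (b - a) ** i * delta / (delta + i)
             for i in range(r + 1))
assert np.isclose(dist.moment(r), moment)
```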
Some Univariate Continuous Probability Distributions
79
4.15. RAYLEIGH DISTRIBUTION The pdf f9x) pf the Rayleigh distribution is given below f(x) =
𝑥−𝜇 −1 𝑥−𝜇 .ecp( 2 exp( 𝜎 )2), 𝜎
∞ < 𝜇 < 𝑥 > ∞.
We denote the Rayleigh distribution with pdf as given above as RA( , ).
Figure 4.15. pdfs of RA(0,1/2) solid, RA(0,1) dash and RA(0,2) dot.
Mean: $\mu+\sigma\sqrt{\pi/2}$
Variance: $\frac{4-\pi}{2}\,\sigma^2$
Moments about $\mu$: $E(X-\mu)^k = \sigma^k\,2^{k/2}\,\Gamma\!\left(1+\frac{k}{2}\right)$
Moment generating function:

$$M(t) = e^{\mu t}\left[1+\sigma t\,e^{\sigma^2t^2/2}\sqrt{\frac{\pi}{2}}\left(\operatorname{erf}\!\left(\frac{\sigma t}{\sqrt{2}}\right)+1\right)\right],$$

where erf is the error function.

Characteristic function:

$$\varphi(t) = e^{i\mu t}\left[1-\sigma t\,e^{-\sigma^2t^2/2}\sqrt{\frac{\pi}{2}}\left(\operatorname{erfi}\!\left(\frac{\sigma t}{\sqrt{2}}\right)-i\right)\right],$$

where erfi is the imaginary error function.
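Assuming `scipy.stats.rayleigh(loc=mu, scale=sigma)` corresponds to RA(μ, σ), the mean, variance and moment formulas can be verified:

```python
import numpy as np
from scipy import stats
from scipy.special import gamma as G

mu, sigma = 0.0, 1.0
dist = stats.rayleigh(loc=mu, scale=sigma)  # assumed to be RA(mu, sigma)

assert np.isclose(dist.mean(), mu + sigma * np.sqrt(np.pi / 2))
assert np.isclose(dist.var(), (4 - np.pi) / 2 * sigma ** 2)

# E[(X - mu)^k] = sigma^k 2^{k/2} Gamma(1 + k/2)  (mu = 0, so raw = central)
for k in (1, 2, 3):
    assert np.isclose(dist.moment(k), sigma ** k * 2 ** (k / 2) * G(1 + k / 2))
```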
4.16. STABLE DISTRIBUTION

A random variable X is said to have a stable distribution if for any two independent copies X1, X2 of X and for any two positive numbers a and b there exist a positive number c and a real number d such that aX1 + bX2 and cX + d are identically distributed. If d = 0, we say X is strictly stable. An alternative definition is as follows: suppose X1, X2, …, Xn are independent copies of X; then for every n ≥ 2 there exist constants cn > 0 and dn such that X1 + X2 + … + Xn and cnX + dn are identically distributed. The normal, Cauchy and Levy distributions are the stable distributions with closed-form densities.

If X is standard normal, then (X1 + X2 + … + Xn)/√n is standard normal.
If X is standard Cauchy, then (X1 + X2 + … + Xn)/n is standard Cauchy.
If X is standard Levy, then (X1 + X2 + … + Xn)/n² is standard Levy.

Thus for the standard normal cn = √n and dn = 0, for the standard Cauchy cn = n and dn = 0, and for the standard Levy cn = n² and dn = 0.
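The Cauchy stability property (cₙ = n) can be illustrated by simulation: a Kolmogorov–Smirnov test should not reject the hypothesis that the average of n standard Cauchy variables is again standard Cauchy. The sample sizes and seed below are arbitrary choices of this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 10, 20000

# Average of n standard Cauchy variables: (X1 + ... + Xn)/n.
samples = stats.cauchy.rvs(size=(reps, n), random_state=rng).mean(axis=1)

# Kolmogorov-Smirnov test against the standard Cauchy cdf.
ks = stats.kstest(samples, stats.cauchy.cdf)
assert ks.pvalue > 1e-3  # consistent with the average being standard Cauchy
```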
4.17. STUDENT T-DISTRIBUTION

A random variable X has the Student t-distribution with n degrees of freedom if its pdf is as follows:

$$f(t) = \frac{1}{\sqrt{n}\,B\!\left(\frac{1}{2},\frac{n}{2}\right)}\left(1+\frac{t^2}{n}\right)^{-(n+1)/2},\quad -\infty<t<\infty,\ n>0.$$

Here n is known as the degrees of freedom. We denote the Student t-distribution with n degrees of freedom as ST(n). The pdfs of ST(1), ST(3) and ST(24) are given in Figure 4.17.
Figure 4.17. pdfs of ST(1) solid, ST(3) dash and ST(24) dot.
Mean: 0 if n > 1; undefined for n = 1.
Variance: $\frac{n}{n-2}$, $n>2$.
Moments about the origin:
$E(X^k) = 0$ if k is odd, $k<n$;
$E(X^k) = \prod_{i=1}^{k/2}\frac{(2i-1)\,n}{n-2i}$ if k is even, $k<n$.
The moment generating function does not exist.
Characteristic function:

$$\varphi_n(t) = \frac{\left(\sqrt{n}\,|t|\right)^{n/2}K_{n/2}\!\left(\sqrt{n}\,|t|\right)}{\Gamma(n/2)\,2^{n/2-1}},$$

where $K_{n/2}(\cdot)$ is the modified Bessel function of the second kind.
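A numerical check of the variance and even-moment product formula for ST(n), using `scipy.stats.t`:

```python
import numpy as np
from scipy import stats

n = 10  # degrees of freedom, ST(n)
dist = stats.t(df=n)

assert np.isclose(dist.mean(), 0.0)
assert np.isclose(dist.var(), n / (n - 2))

# Even raw moments: E(X^k) = prod_{i=1}^{k/2} (2i - 1) n / (n - 2i), k even, k < n.
k = 4
moment = np.prod([(2 * i - 1) * n / (n - 2 * i) for i in range(1, k // 2 + 1)])
assert np.isclose(dist.moment(k), moment)
```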
4.18. UNIFORM DISTRIBUTION

The pdf f(x) of the uniform distribution is given below:

$$f(x) = \frac{1}{b-a},\quad a\le x\le b.$$

The uniform distribution with the above pdf is denoted by U(a, b). The pdf of U(0,5) is given in Figure 4.18.

Mean: $\frac{a+b}{2}$
Variance: $\frac{(b-a)^2}{12}$
nth moment about 0: $E(X^n) = \frac{1}{n+1}\sum_{i=0}^{n}a^{i}b^{n-i}$
Moment generating function: $M(t) = \frac{e^{bt}-e^{at}}{t(b-a)}$
Characteristic function: $\varphi(t) = \frac{e^{ibt}-e^{iat}}{it(b-a)}$

Figure 4.18. pdf of U(0,5).
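A numerical check of the uniform moment formulas, using `scipy.stats.uniform(loc=a, scale=b-a)` for U(a, b):

```python
import numpy as np
from scipy import stats

a, b = 1.0, 5.0
dist = stats.uniform(loc=a, scale=b - a)  # U(a, b)

assert np.isclose(dist.mean(), (a + b) / 2)
assert np.isclose(dist.var(), (b - a) ** 2 / 12)

# E(X^n) = (1/(n+1)) sum_{i=0}^{n} a^i b^{n-i}
n = 3
moment = sum(a ** i * b ** (n - i) for i in range(n + 1)) / (n + 1)
assert np.isclose(dist.moment(n), moment)
```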
4.19. WEIBULL DISTRIBUTION

The pdf f(x) of the Weibull distribution is given below:

$$f(x) = \frac{\delta}{\sigma}\left(\frac{x-\mu}{\sigma}\right)^{\delta-1}e^{-\left(\frac{x-\mu}{\sigma}\right)^{\delta}},\quad -\infty<\mu<x<\infty,\ \sigma>0,\ \delta>0.$$

We denote the Weibull distribution with the above pdf as WE($\mu,\sigma,\delta$). The pdfs of WE(0,1,2), WE(0,1,3) and WE(0,1,5) are given in Figure 4.19.
Figure 4.19. pdfs of WE(0,1,2) solid, WE(0,1,3) dash and WE(0,1,5) dot.
Mean: $\mu+\sigma\,\Gamma\!\left(1+\frac{1}{\delta}\right)$
Variance: $\sigma^2\left[\Gamma\!\left(1+\frac{2}{\delta}\right)-\left(\Gamma\!\left(1+\frac{1}{\delta}\right)\right)^{2}\right]$
Moments about $\mu$: $E(X-\mu)^k = \sigma^k\,\Gamma\!\left(1+\frac{k}{\delta}\right)$
Moment generating function: $M(t) = e^{\mu t}\sum_{k=0}^{\infty}\frac{(t\sigma)^k}{k!}\,\Gamma\!\left(\frac{k+\delta}{\delta}\right)$
Characteristic function: $\varphi(t) = e^{i\mu t}\sum_{k=0}^{\infty}\frac{(it\sigma)^k}{k!}\,\Gamma\!\left(\frac{k+\delta}{\delta}\right)$.
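Assuming `scipy.stats.weibull_min(delta, loc=mu, scale=sigma)` corresponds to WE(μ, σ, δ), the listed moments can be checked:

```python
import numpy as np
from scipy import stats
from scipy.special import gamma as G

mu, sigma, delta = 0.0, 1.0, 2.0
dist = stats.weibull_min(delta, loc=mu, scale=sigma)  # assumed WE(mu, sigma, delta)

assert np.isclose(dist.mean(), mu + sigma * G(1 + 1 / delta))
assert np.isclose(dist.var(),
                  sigma ** 2 * (G(1 + 2 / delta) - G(1 + 1 / delta) ** 2))

# E[(X - mu)^k] = sigma^k Gamma(1 + k/delta)  (mu = 0, so raw = central)
k = 3
assert np.isclose(dist.moment(k), sigma ** k * G(1 + k / delta))
```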
Chapter 5
ORDER STATISTICS: FROM MINIMUM TO MAXIMUM

5.1. DEFINITIONS AND EXAMPLES

Until now we have mostly dealt with random objects built from independent random variables. For example, to work with a sequence of independent variables ξ1, ξ2, … it is often enough to know their distribution functions or the corresponding characteristic functions. The reader will recall that in this case working with sums of independent summands presents no essential difficulty: such a sum corresponds to a product of characteristic functions, and if a product of independent variables is given, the expected value of the product is expressed simply as the product of the individual means (if they exist) of the factors. It is essentially less pleasant to deal with dependent random variables, even when the dependence is not complicated. In this case, for example, to find the variance of a sum of several dependent summands one must know not only the variances of the individual summands but also all the mixed moments, that is, the covariances of these random variables. So at first sight the situation looks rather strange when, having a set of independent variables, we deliberately pass to random-variable
transformations that lead to deliberately dependent random objects. Order statistics are an example of such objects.

1) Suppose there is a batch of some items (for example, incandescent light bulbs) that must satisfy some standard (for example, the mean working time should be not less than some value t). The batch is to be transferred from a seller to a customer. How can the seller convince the customer that the production complies with the standard? The simplest approach is to put all the items on the test bench, obtain all the data on their longevity, compute the average lifetime, and report that this particular batch, now destroyed in testing, satisfied all the necessary requirements. Here two problems arise. One problem: there is nothing left to buy. Another problem: the spread of the lifetimes of the bulbs can be very large, and the customer may not be willing to wait for the end of the experiment. What can be done in this situation? First, it is known that the methods of mathematical statistics allow us to test only some representatives of the production, say n of them. We can obtain the exact lifetimes t1, …, tn of these n items and use the corresponding statistical methods to verify the hypothesis about the required quality of the whole production. However, a "time" problem also arises here: some of the lamps may burn out soon, while others may last very long, and the customer has already arranged the delivery of the production to his enterprise. It is enough that some of the bulbs (but not all — say, m of the n bulbs) have already burned out. Let t1,n …
Theorem 7.13.2. Suppose an absolutely continuous random variable X has cdf F(x) and pdf f(x), $0\le x<\infty$. We assume f′(x) exists for all x, $0\le x<\infty$, and that E(X) exists. Then

$$E(X\mid X\le x) = h(x)\,r(x),\quad\text{where }r(x) = \frac{f(x)}{F(x)}\ \text{and}\ h(x) = \frac{\gamma\!\left(\frac{\lambda+1}{\lambda},x^{\lambda}\right)}{\lambda x^{\lambda-1}e^{-x^{\lambda}}},$$

if and only if $f(x) = \lambda x^{\lambda-1}e^{-x^{\lambda}}$, $0<x<\infty$, $\lambda>0$. Here $\gamma(\cdot,\cdot)$ denotes the lower incomplete gamma function.

Proof. For the above f(x) it can be shown that

$$\frac{f'(x)}{f(x)} = \frac{\lambda-1}{x}-\lambda x^{\lambda-1}.$$

By Lemma A2 given in the appendix,

$$\frac{h'(x)-x}{h(x)} = -\frac{f'(x)}{f(x)}.$$

Hence

$$\frac{f'(x)}{f(x)} = \frac{\lambda-1}{x}-\lambda x^{\lambda-1}.$$

Integrating both sides of the above equation, we obtain $f(x) = c\,x^{\lambda-1}e^{-x^{\lambda}}$, where c is a constant. Using the condition $\int_0^{\infty}f(x)\,dx = 1$, we will have

$$f(x) = \lambda x^{\lambda-1}e^{-x^{\lambda}},\quad 0<x<\infty,\ \lambda>0.$$
7.14. WALD DISTRIBUTION

The following theorem was given by Ahsanullah and Kirmani (1984).

Theorem 7.14.1. Let X be a nonnegative random variable with pdf f(x). Suppose that $xf(x) = x^{-2}f(x^{-1})$ and that $X^{-1}$ has the same distribution as $X+\lambda^{-1}Z$, where $\lambda>0$ and Z is independent of X and is distributed as chi-square with one degree of freedom. Then X has the Wald distribution with pdf f(x) given by

$$f(x) = \left(\frac{\lambda}{2\pi x^{3}}\right)^{1/2}e^{-\frac{\lambda(x-1)^2}{2x}},\quad x>0.$$
Chapter 8
CHARACTERIZATIONS OF DISTRIBUTIONS BY ORDER STATISTICS

8.1. INTRODUCTION

Characterizations of probability distributions play an important role in probability and statistics. Before a probability distribution is applied to a real set of data, it is necessary to know, by characterization, that the distribution fits the data. A probability distribution can be characterized by various methods; see, for example, Ahsanullah et al. (2017). In this chapter we will characterize probability distributions by various properties of ordered data. We will consider order statistics and record values to characterize the probability distributions. We assume that we have n (a fixed number of) observations from an absolutely continuous distribution with cdf F(x) and pdf f(x). Let X1:n < X2:n < … < Xn:n be the corresponding order statistics.
8.2. CHARACTERIZATIONS OF DISTRIBUTIONS BY CONDITIONAL EXPECTATIONS

We assume that E(X) exists and consider the condition $E(X_{j:n}\mid X_{i:n}=x) = ax+b$, $j>i$. Fisz (1958) considered the characterization of the exponential distribution for j = 2, i = 1 and a > 1. Rogers (1963) characterized the exponential distribution by considering j = i+1 and a = 1. Ferguson (1963) characterized several distributions with j = i+1. The following theorem is given by Gupta and Ahsanullah (2004).

Theorem 8.2.1. Under some mild conditions on $\psi(x)$ and g(x), the relation

$$E(\psi(X_{i+s:n})\mid X_{i:n}=x) = g(x)\tag{8.1}$$

uniquely determines the distribution F(x).

The relation (8.1) for s = 1 leads to the equation

$$r(x) = \frac{g'(x)}{(n-i)(g(x)-\psi(x))}.\tag{8.2}$$

Here $r(x) = f(x)/(1-F(x))$ is the hazard rate of X. If $\psi(x) = x$ and $g(x) = ax+b$, then we obtain from (8.2)

$$r(x) = \frac{a}{(n-i)((a-1)x+b)}.\tag{8.3}$$

From (8.3) we have: if a = 1, then r(x) is constant and X has the exponential distribution with $F(x) = 1-e^{-\lambda(x-\mu)}$, $x\ge\mu$, where
$$\lambda = \frac{1}{b(n-i)}.$$
Wesolowski and Ahsanullah (1997) extended the result of Ferguson (1963); we give it in the following theorem.

Theorem 8.2.2. Suppose that X is an absolutely continuous random variable with cumulative distribution function F(x) and probability density function f(x). If $E(X_{k+1:n})<\infty$, $1\le k\le n-2$, $n>2$, then $E(X_{k+2:n}\mid X_{k:n}=x) = ax+b$ if and only if one of the following holds:

(i) $a>1$ and $F(x) = 1-\left(\frac{\mu+\delta}{x+\delta}\right)^{\theta}$, $x\ge\mu$, $\theta>1$, where $\mu$ is a real number, $\delta = b/(a-1)$ and

$$\theta = \frac{a(2n-2k-1)+\sqrt{a^{2}+4a(n-k)(n-k-1)}}{2(a-1)(n-k)(n-k-1)};$$

(ii) $a = 1$ and $F(x) = 1-e^{-\lambda(x-\mu)}$, $x\ge\mu$, with

$$b = \frac{2n-2k-1}{\lambda(n-k)(n-k-1)};$$

(iii) $a<1$ and $F(x) = 1-\left(\frac{\nu-x}{\nu-\mu}\right)^{\theta}$, $\mu\le x\le\nu$, where $\nu = \frac{b}{1-a}$ and

$$\theta = \frac{a(2n-2k-1)+\sqrt{a^{2}+4a(n-k)(n-k-1)}}{2(1-a)(n-k)(n-k-1)}.$$
Dembinska and Wesolowski (1998) gave a more general result, which we present in the following theorem.
Theorem 8.2.3. Suppose that X is an absolutely continuous random variable with cumulative distribution function F(x) and probability density function f(x). If $E(X_{i:n}^{2})<\infty$, $1\le i<n$, $n>2$, then $E(X_{k+r:n}\mid X_{k:n}=x) = ax+b$ if and only if one of the following holds:

(i) $a>1$, $F(x) = 1-\left(\frac{\mu+\delta}{x+\delta}\right)^{\theta}$, $x>\mu$, $\theta>1$, with

$$a = \frac{\theta(n-k)!}{(n-k-r)!}\sum_{m=0}^{r-1}\frac{(-1)^{m}}{m!\,(r-1-m)!\,(\theta(n-k-r+1+m))^{2}},\qquad b = \delta\sum_{m=0}^{r-1}\frac{(-1)^{m}}{m!\,(r-1-m)!\,(\theta(n-k-r+1+m)+1)};$$

(ii) $a = 1$, $F(x) = 1-e^{-\lambda(x-\mu)}$, $x\ge\mu$, with

$$b = \frac{(n-k)!}{\lambda(n-k-r)!}\sum_{m=0}^{r-1}\frac{(-1)^{m}}{m!\,(r-1-m)!\,(n-k-r+1+m)^{2}};$$

(iii) $a<1$, $F(x) = 1-\left(\frac{\nu-x}{\nu-\mu}\right)^{\theta}$, $\mu\le x\le\nu$, with

$$a = \frac{\theta(n-k)!}{(n-k-r)!}\sum_{m=0}^{r-1}\frac{(-1)^{m}}{m!\,(r-1-m)!\,(\theta(n-k-r+1+m))^{2}},\qquad b = \nu\sum_{m=0}^{r-1}\frac{(-1)^{m}}{m!\,(r-1-m)!\,(\theta(n-k-r+1+m)+1)}.$$

Consider now the extended sample case. Suppose that in addition to the n sample observations we take another m observations from the same distribution and order all m+n observations. The combined order statistics are $X_{1:m+n}<X_{2:m+n}<\dots<X_{m+n:m+n}$. We assume F(x) is the cdf of the observations. Ahsanullah and Nevzorov (1999) proved that if $E(X_{1:n}\mid X_{1:m+n}=x) = x+m(x)$, then
for a = 1, F(x) is exponential with F(x) = 1exp(x),x>0 and m(x) = 𝑛(𝑚+𝑛)
8.3. CHARACTERIZATIONS BY IDENTICAL DISTRIBUTION

It is known that for the exponential distribution $nX_{1:n}$ and X are identically distributed. Desu (1971) proved that if $nX_{1:n}$ and X are identically distributed for all n, then the distribution of X is exponential. Ahsanullah (1975) proved that if for a fixed n, $nX_{1:n}$ and X are identically distributed, then the distribution of X is exponential. Ahsanullah (1978a) proved that the identical distribution of the normalized spacing $D_{i+1:n}$ ($= (n-i)(X_{i+1:n}-X_{i:n})$) and X characterizes the exponential distribution.

We will call a distribution function "new better than used" (NBU) if $1-F(x+y)\le(1-F(x))(1-F(y))$ for all $x,y\ge0$, and "new worse than used" (NWU) if $1-F(x+y)\ge(1-F(x))(1-F(y))$ for all $x,y\ge0$. We say that F belongs to the class $C_0$ if F is either NBU or NWU. The following theorem by Ahsanullah (1976) gives a characterization of the exponential distribution based on the equality of two spacings.

Theorem 8.3.1. Let X be a nonnegative random variable with an absolutely continuous cumulative distribution function F(x) that is strictly increasing on $[0,\infty)$ and has probability density function f(x). Then the following two conditions are equivalent:

(a) F(x) is exponential: $F(x) = 1-\exp(-\sigma x)$, $\sigma>0$;
(b) for some i, j with $i<j<n$, the statistics $D_{j+1:n}$ and $D_{i+1:n}$ are identically distributed and F belongs to the class $C_0$.

Proof. We have already seen that (a) implies (b). We will give here the proof that (b) implies (a).
The conditional pdf of $D_{j+1:n}$ given $X_{i:n}=x$ is given by

$$f_{D_{j+1:n}}(d\mid X_{i:n}=x) = k\int_0^{\infty}\left(\frac{\bar F(x)-\bar F(x+s)}{\bar F(x)}\right)^{j-i-1}\left(\frac{\bar F\!\left(x+s+\frac{d}{n-j}\right)}{\bar F(x)}\right)^{n-j-1}\frac{f(x+s)}{\bar F(x)}\,\frac{f\!\left(x+s+\frac{d}{n-j}\right)}{\bar F(x)}\,ds,\tag{8.4}$$

where $\bar F(x) = 1-F(x)$ and $k = \frac{(n-i)!}{(j-i-1)!\,(n-j)!}$. Integrating the above equation with respect to d from d to $\infty$, we obtain

$$1-F_{D_{j+1:n}}(d\mid X_{i:n}=x) = k\int_0^{\infty}\left(\frac{\bar F(x)-\bar F(x+s)}{\bar F(x)}\right)^{j-i-1}\left(\frac{\bar F(x+s)}{\bar F(x)}\right)^{n-j}\left(\frac{\bar F\!\left(x+s+\frac{d}{n-j}\right)}{\bar F(x+s)}\right)^{n-j}\frac{f(x+s)}{\bar F(x)}\,ds.$$

The conditional pdf of $D_{i+1:n}$ given $X_{i:n}=x$ is

$$f_{D_{i+1:n}}(d\mid X_{i:n}=x) = \frac{\left(\bar F\!\left(x+\frac{d}{n-i}\right)\right)^{n-i-1}f\!\left(x+\frac{d}{n-i}\right)}{\left(\bar F(x)\right)^{n-i}},$$

and the corresponding conditional survival function is

$$1-F_{D_{i+1:n}}(d\mid X_{i:n}=x) = \left(\frac{\bar F\!\left(x+\frac{d}{n-i}\right)}{\bar F(x)}\right)^{n-i}.$$

Using the relation

$$\frac{1}{k} = \int_0^{\infty}\left(\frac{\bar F(x+s)}{\bar F(x)}\right)^{n-j}\left(\frac{\bar F(x)-\bar F(x+s)}{\bar F(x)}\right)^{j-i-1}\frac{f(x+s)}{\bar F(x)}\,ds$$

and the equality of the distributions of $D_{j+1:n}$ and $D_{i+1:n}$ given $X_{i:n}=x$, we obtain
$$\int_0^{\infty}\left(\frac{\bar F(x)-\bar F(x+s)}{\bar F(x)}\right)^{j-i-1}\left(\frac{\bar F(x+s)}{\bar F(x)}\right)^{n-j}\frac{f(x+s)}{\bar F(x)}\,G(x,d,s)\,ds = 0,\tag{8.5}$$

where

$$G(x,d,s) = \left(\frac{\bar F\!\left(x+s+\frac{d}{n-j}\right)}{\bar F(x+s)}\right)^{n-j}-\left(\frac{\bar F\!\left(x+\frac{d}{n-i}\right)}{\bar F(x)}\right)^{n-i}.\tag{8.6}$$

We have

$$\frac{\partial}{\partial s}G(x,d,s) = (n-j)\left(\frac{\bar F\!\left(x+s+\frac{d}{n-j}\right)}{\bar F(x+s)}\right)^{n-j}\left(r(x+s)-r\!\left(x+s+\frac{d}{n-j}\right)\right).\tag{8.7}$$

(i) If F has increasing failure rate (IFR), then $r(x+s)\le r\!\left(x+s+\frac{d}{n-j}\right)$, so G(x,d,s) is decreasing in s, and for (8.5) to hold we must have $G(x,0,d)\ge0$. On the other hand, if F has IFR then $\ln\bar F$ is concave, and since

$$x+\frac{d}{n-i} = \frac{(j-i)x+(n-j)\left(x+\frac{d}{n-j}\right)}{n-i},$$

we obtain

$$\left(\bar F\!\left(x+\frac{d}{n-i}\right)\right)^{n-i}\ge\left(\bar F(x)\right)^{j-i}\left(\bar F\!\left(x+\frac{d}{n-j}\right)\right)^{n-j},$$

that is, $G(x,0,d)\le0$. Thus for (8.5) to be true we must have G(x,0,d) = 0 for all d and any given x.
(ii) If F has decreasing failure rate (DFR), then similarly we get G(x,0,d) = 0.

Taking x = 0 and using $\bar F(0) = 1$, we obtain from G(0,0,d) = 0

$$\left(\bar F\!\left(\frac{d}{n-i}\right)\right)^{n-i} = \left(\bar F\!\left(\frac{d}{n-j}\right)\right)^{n-j}\tag{8.8}$$

for all $d\ge0$ and some i, j, n with $1\le i<j<n$. Using $\phi(d) = \ln\bar F(d)$, we obtain

$$(n-i)\,\phi\!\left(\frac{d}{n-i}\right) = (n-j)\,\phi\!\left(\frac{d}{n-j}\right).$$

Putting $t = \frac{d}{n-i}$, we obtain

$$\phi(t) = \frac{n-j}{n-i}\,\phi\!\left(\frac{n-i}{n-j}\,t\right).\tag{8.9}$$

The nonzero solution of (8.9) is

$$\phi(x) = -\lambda x\tag{8.10}$$

for all x > 0. Using the boundary conditions F(0) = 0 and $F(\infty) = 1$, we obtain

$$F(x) = 1-e^{-\lambda x},\quad x\ge0,\ \lambda>0.\tag{8.11}$$

In the above theorem, under the assumption of finite expectation of X, the equality of the spacings can be replaced by the equality of their expectations.

Let X be a random variable with cdf F(x). We will say X is symmetric about zero if F(−x) = 1 − F(x) for all x. If f(x) is the pdf, then f(−x) = f(x) for all x.
The following theorem about symmetric distributions was proved by Ahsanullah (1992).

Theorem 8.3.2. Suppose $X_1, X_2,\dots,X_n$ ($n\ge2$) are independent and identically distributed continuous random variables. If $X_{1:n}^2$ and $X_{n:n}^2$ are identically distributed, then the X's are symmetric about zero.

Let Q(x) be the quantile function of a random variable X having cdf F(x), i.e., F(Q(x)) = x, 0 < x < 1. Akhundov et al. (2004) proved that for 0 < x < 1 the relation

$$E(\lambda X_{1:3}+(1-\lambda)X_{3:3}\mid X_{2:3}=x) = x$$

characterizes a family of probability distributions with quantile function

$$Q(x) = \frac{c\,(x-\lambda)}{(1-x)^{\lambda}\,x^{1-\lambda}}+d,\quad 0<x<1,$$

where $0<c<\infty$ and $-\infty<d<\infty$. Let us call this family the Q family. It contains the Student t distribution with 2 degrees of freedom, whose quantile function is

$$Q_1(x) = \frac{2^{1/2}\left(x-\frac{1}{2}\right)}{x^{1/2}(1-x)^{1/2}},\quad 0<x<1.$$
Nevzorov et al. (2003) proved that if E(X) < ∞, then the random variable X has a Student t distribution with two degrees of freedom if and only if

$$E(X_{2:3}-X_{1:3}\mid X_{2:3}=x) = E(X_{3:3}-X_{2:3}\mid X_{2:3}=x).$$

The following characterization of the Student t-distribution is due to Yanev and Ahsanullah (2011).
Theorem 8.3.3. A continuous random variable with finite variance has a Student t distribution with location parameter 0, scale parameter 1 and $\nu$ ($\ge3$) degrees of freedom if and only if for $n\ge3$, $\nu\ge3$,

$$E\!\left[\frac{1}{k-1}\sum_{i=1}^{k-1}\left(\frac{\nu-1}{2}X_{k:n}-(\nu-2)X_{i:n}\right)^{2}\,\Bigg|\,X_{k:n}=x\right] = E\!\left[\frac{1}{n-k}\sum_{j=k+1}^{n}\left((\nu-2)X_{j:n}-\frac{\nu-1}{2}X_{k:n}\right)^{2}\,\Bigg|\,X_{k:n}=x\right].$$
8.4. CHARACTERIZATIONS BY INDEPENDENCE PROPERTY

Fisz (1958) gave the following characterization theorem based upon an independence property.

Theorem 8.4.1. If $X_1, X_2$ are independent and identically distributed with continuous cumulative distribution function F(x), then $X_{2:2}-X_{1:2}$ is independent of $X_{1:2}$ if and only if $F(x) = 1-\exp(-\lambda x)$, $x\ge0$, $\lambda>0$.

Proof. The "if" part is easy to show; we will show the "only if" part. By the independence property,

$$P(X_{2:2}-X_{1:2}>y\mid X_{1:2}=x) = P(X_{2:2}>x+y\mid X_{1:2}=x) = \frac{1-F(x+y)}{1-F(x)}.$$

Since $\frac{1-F(x+y)}{1-F(x)}$ is independent of x, we must have
$$\frac{1-F(x+y)}{1-F(x)} = g(y),$$

where g(y) is a function of y only. Letting $x\to0$, we obtain $g(y) = 1-F(y)$, and hence

$$1-F(x+y) = (1-F(x))(1-F(y))\tag{8.12}$$

for all $x\ge0$ and almost all $y\ge0$. The nonzero solution of the above equation with the boundary conditions F(0) = 0 and $F(\infty) = 1$ is $F(x) = 1-\exp(-\lambda x)$, $x\ge0$, for some $\lambda>0$.

The following theorem is a generalization of Fisz's (1958) result.

Theorem 8.4.2. If $X_1, X_2,\dots,X_n$ are independent and identically distributed random variables with continuous cumulative distribution function F(x), then $X_{n:n}-X_{1:n}$ and $X_{1:n}$ are independent if and only if $F(x) = 1-\exp(-\lambda x)$, $x\ge0$, $\lambda>0$. The proof of the "only if" part leads to the functional equation

$$\left(\frac{F(x+y)-F(x)}{1-F(x)}\right)^{n-1} = g(y),$$

where g(y) is a function of y only, for all $x\ge0$ and almost all $y\ge0$.

The following theorem gives a more general result.

Theorem 8.4.3. Suppose that $X_1, X_2,\dots,X_n$ are independent and identically distributed random variables with continuous cumulative distribution function F(x). Then $X_{j:n}-X_{i:n}$ and $X_{i:n}$ are independent if and only if $F(x) = 1-\exp(-\lambda x)$, $x\ge0$, $\lambda>0$.
Suppose $X_i$, $i = 1, 2,\dots,n$, are independent and identically distributed random variables. Kakosyan et al. (1984) showed that for a random number n, the identical distribution of $nX_{1:n}$ and $X_1$ characterizes the exponential distribution. They conjectured that if n is a random variable with $P(n = k) = p(1-p)^{k-1}$, $k = 1,2,\dots$, $0<p<1$, then the identical distribution of $X_{1:n}$ and $p\sum_{i=1}^{n}X_i$ characterizes the exponential distribution. Ahsanullah (1988c) proved that the conjecture is true.
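The characterizing identity that nX₁:ₙ has the same distribution as X for the exponential distribution can be illustrated by simulation; the Kolmogorov–Smirnov test below compares n times the minimum of n exponential draws against the exponential cdf. Sample sizes and the seed are arbitrary choices of this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 5, 20000

# n * X_{1:n} for standard exponential samples of size n.
mins = stats.expon.rvs(size=(reps, n), random_state=rng).min(axis=1)
scaled = n * mins

# Should be indistinguishable from a single standard exponential draw.
ks = stats.kstest(scaled, stats.expon.cdf)
assert ks.pvalue > 1e-3
```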
Chapter 9
CHARACTERIZATIONS OF DISTRIBUTIONS BY RECORD VALUES

In this chapter, we will discuss the characterizations of univariate continuous distributions by record values.
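Upper records are easy to simulate. The sketch below illustrates the classical representation, used repeatedly in this chapter, that for standard exponential observations the n-th upper record value X(n) is distributed as a sum of n standard exponentials, i.e. Gamma(n, 1). The simulation parameters are arbitrary choices of this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def nth_upper_record(n, rng):
    """Draw standard exponentials until the n-th upper record appears."""
    record = rng.exponential()  # the first observation is X(1) by convention
    count = 1
    while count < n:
        x = rng.exponential()
        if x > record:
            record = x
            count += 1
    return record

n, reps = 3, 2000
records = np.array([nth_upper_record(n, rng) for _ in range(reps)])

# X(n) for exponential records should be Gamma(n, 1).
ks = stats.kstest(records, stats.gamma(n).cdf)
assert ks.pvalue > 1e-3
```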
9.1. CHARACTERIZATIONS USING CONDITIONAL EXPECTATIONS

Let $\{X_i, i = 1,2,\dots\}$ be a sequence of independent and identically distributed random variables with cdf F(x) and pdf f(x). We assume $E(X_i)$ exists. Let X(n), $n\ge1$, be the corresponding upper records. We have the following theorem for the determination of F(x) based on a conditional expectation.

Theorem 9.1.1. The condition

$$E(\psi(X(k+s))\mid X(k)=z) = g(z),$$
where $k, s\ge1$ and $\psi(x)$ is a continuous function, determines the distribution F(x) uniquely.

Proof. We have

$$E(\psi(X(k+s))\mid X(k)=z) = \int_z^{\infty}\psi(x)\,\frac{(R(x)-R(z))^{s-1}}{(s-1)!}\,\frac{f(x)}{\bar F(z)}\,dx,\tag{9.1.1}$$

where $R(x) = -\ln\bar F(x)$ and $\bar F(x) = 1-F(x)$.

Case s = 1. Using equation (9.1.1), we obtain

$$\int_z^{\infty}\psi(x)f(x)\,dx = g(z)\,\bar F(z).\tag{9.1.2}$$

Differentiating both sides of (9.1.2) with respect to z and simplifying, we obtain

$$r(z) = \frac{g'(z)}{g(z)-\psi(z)},\tag{9.1.3}$$

where $r(z) = f(z)/\bar F(z)$ is the failure rate function. Hence the result.

If $\psi(x) = x$ and $g(x) = ax+b$, $a,b\ge0$, then

$$r(x) = \frac{a}{(a-1)x+b}.\tag{9.1.4}$$

If $a\ne1$, then $F(x) = 1-((a-1)x+b)^{-\frac{a}{a-1}}$, which is the power function distribution for a < 1 and the Pareto distribution for a > 1. For a = 1, (9.1.4) gives the exponential distribution. Nagaraja (1977) gave the following characterization theorem.
Theorem 9.1.2. Let F be a continuous cumulative distribution function. If for some constants a and b, $E(X(n)\mid X(n-1)=x) = ax+b$, then, except for a change of location and scale,

$F(x) = 1-(-x)^{\theta}$, $-1<x<0$, if $0<a<1$;
$F(x) = 1-e^{-x}$, $x\ge0$, if $a = 1$;
$F(x) = 1-x^{\theta}$, $x>1$, if $a>1$,

where $\theta = a/(1-a)$. Here $a>0$.

Proof of Theorem 9.1.1 for s = 2. In this case we obtain

$$\int_z^{\infty}\psi(x)(R(x)-R(z))f(x)\,dx = g(z)\,\bar F(z).\tag{9.1.5}$$

Differentiating both sides of the above equation with respect to z, we obtain

$$\int_z^{\infty}\psi(x)f(x)\,dx = g(z)\,\bar F(z)-\frac{g'(z)\,(\bar F(z))^{2}}{f(z)}.\tag{9.1.6}$$

Differentiating both sides of (9.1.6) with respect to z and using the relation

$$\frac{f'(z)}{f(z)} = \frac{r'(z)}{r(z)}-r(z),$$

we obtain on simplification

$$g'(z)\,\frac{r'(z)}{r(z)} = (r(z))^{2}(g(z)-\psi(z))-2g'(z)\,r(z)+g''(z).\tag{9.1.7}$$

Thus r′(z) is expressed in terms of r(z) and known functions. The solution for r(z) is unique (for details see Gupta and Ahsanullah (2004)).
Putting $\psi(x) = x$ and $g(x) = ax+b$, we obtain from (9.1.7)

$$a\,\frac{r'(z)}{r(z)} = (r(z))^{2}((a-1)z+b)-2a\,r(z).\tag{9.1.8}$$

The solution of (9.1.8) is

$$r(z) = \frac{a+\sqrt a}{(a-1)z+b}.$$

Thus X will have (i) the exponential distribution if a = 1, (ii) the power function distribution if a < 1 and (iii) the Pareto distribution if a > 1. Ahsanullah and Wesolowski (1998) extended the result of Theorem 9.1.2 to nonadjacent record values. Their result is given in the following theorem.

Theorem 9.1.3. Suppose $E(X(n+2)\mid X(n)) = aX(n)+b$, $n\ge1$, for some constants a and b. Then:
a) if a = 1, $X_i$ has the exponential distribution;
b) if a < 1, $X_i$ has the power function distribution;
c) if a > 1, $X_i$ has the Pareto distribution.

For s > 2 the proof of Theorem 9.1.1 becomes more complicated because of the nature of the resulting differential equation. Lopez-Blazquez and Moreno-Rebollo (1997) also gave characterizations of distributions by using the linear property

$$E(X(k)\mid X(k+s)=z) = az+b,\quad s,k\ge1.$$
Raqab and Wu (2004) considered this problem for nonadjacent record values under some stringent smoothness assumptions on the distribution function F(·). Dembinska and Wesolowski (2000) characterized the distribution by means of the relation

$$E(X(s+k)\mid X(k)=z) = az+b,\quad k,s\ge1.$$

They used a result of Rao and Shanbhag (1994) which deals with the solution of an extended version of the integrated Cauchy functional equation. It may be pointed out that Rao and Shanbhag's result is applicable only when the conditional expectation is a linear function. Bairamov et al. (2005) gave the following characterization.

Theorem 9.1.4. Let X be an absolutely continuous random variable with cdf F(x), with F(0) = 0 and F(x) > 0 for all x > 0, and pdf f(x). Then:

a) for $1\le k\le n-1$,

$$E(X(n)\mid X(n-k)=u,\ X(n+1)=v) = \frac{u+kv}{k+1},\quad 0<u<v<\infty,$$

if and only if $F(x) = 1-e^{-\lambda x}$, $x\ge0$, $\lambda>0$;

b) for $2\le k\le n-1$,

$$E(X(n)\mid X(n-k+1)=u,\ X(n+2)=v) = \frac{2u+(k-1)v}{k+1},\quad 0<u<v<\infty,$$

if and only if $F(x) = 1-e^{-\lambda x}$, $x\ge0$, $\lambda>0$.
9.2. CHARACTERIZATIONS BY INDEPENDENCE PROPERTY

Theorem 9.2.1. Let $\{X_n, n\ge1\}$ be an i.i.d. sequence of nonnegative continuous random variables with cdf F(x) and pdf f(x). We assume F(0) = 0 and F(x) > 0 for all x > 0. Then $X_n$ has the cdf $F(x) = 1-e^{-\sigma x}$, $x\ge0$, $\sigma>0$, if and only if X(2) − X(1) and X(1) are independent.

Proof. The necessity is easy to establish; we give here the proof of sufficiency. The independence of X(2) − X(1) and X(1) leads to the functional equation

$$\bar F(x+y) = \bar F(x)\,\bar F(y),\quad 0\le x,y<\infty,\tag{9.2.1}$$

where $\bar F = 1-F$. The continuous solution of this functional equation with the boundary conditions F(0) = 0 and $F(\infty) = 1$ is $F(x) = 1-e^{-\sigma x}$, $x\ge0$, $\sigma>0$.

The following generalization was given by Ahsanullah (1979).
Theorem 9.2.2. Let $\{X_n, n\ge1\}$ be a sequence of i.i.d. random variables with common distribution function F which is absolutely continuous with pdf f. Assume that F(0) = 0 and F(x) > 0 for all x > 0. Then $X_n$ has the cdf $F(x) = 1-e^{-x/\sigma}$, $x\ge0$, $\sigma>0$, if and only if X(n) − X(n−1) and X(n−1) are independent.

Proof. It is easy to establish that if $X_n$ has the cdf $F(x) = 1-e^{-x/\sigma}$, $x\ge0$, $\sigma>0$, then X(n) − X(n−1) and X(n−1) are independent. Suppose that X(n+1) − X(n) and X(n), $n\ge1$, are independent. The joint pdf f(z,u) of Z = X(n+1) − X(n) and U = X(n) can be written as

$$f(z,u) = \frac{[R(u)]^{n-1}}{\Gamma(n)}\,r(u)\,f(u+z),\quad 0<u,z<\infty,\tag{9.2.2}$$

and f(z,u) = 0 otherwise. The pdf $f_n(u)$ of X(n) can be written as

$$f_n(u) = \frac{[R(u)]^{n-1}}{\Gamma(n)}\,f(u),\quad 0<u<\infty,\tag{9.2.3}$$

and $f_n(u) = 0$ otherwise. Since Z and U are independent, we get from (9.2.2) and (9.2.3)

$$\frac{f(u+z)}{\bar F(u)} = g(z),\tag{9.2.4}$$

where g(z) is the pdf of Z. Integrating (9.2.4) with respect to z from 0 to $z_1$, we obtain on simplification

$$\bar F(u)-\bar F(u+z_1) = \bar F(u)\,G(z_1).\tag{9.2.5}$$
Here $G(z_1) = \int_0^{z_1}g(z)\,dz$. Letting $u\to0^{+}$ and using the boundary condition $\bar F(0) = 1$, we see that $G(z_1) = F(z_1)$. Hence we get from (9.2.5)

$$\bar F(u+z_1) = \bar F(u)\,\bar F(z_1).\tag{9.2.6}$$

The only continuous solution of (9.2.6) with the boundary condition F(0) = 0 is

$$\bar F(x) = e^{-x/\sigma},\quad x>0,\tag{9.2.7}$$

where $\sigma$ is an arbitrary positive real number.

The following theorem (Theorem 9.2.3) is a generalization of Theorem 9.2.2.

Theorem 9.2.3. Let $\{X_n, n\ge1\}$ be independent and identically distributed with common distribution function F which is absolutely continuous, with F(0) = 0 and F(x) < 1 for all x > 0. Then $X_n$ has the cdf $F(x) = 1-e^{-x/\sigma}$, $x\ge0$, $\sigma>0$, if and only if X(n) − X(m) and X(m), n > m, are independent.

Proof. The necessary condition is easy to establish. To prove the sufficient condition we need the following lemma.

Lemma 9.2.1. Let F(x) be a continuous distribution function with $\bar F(x)>0$ for all x > 0. Suppose that $\bar F(u+v)(\bar F(v))^{-1} = \exp\{-q(u,v)\}$ and

$$h(u,v) = \{q(u,v)\}^{r}\exp\{-q(u,v)\}\,\frac{\partial}{\partial u}q(u,v),\quad r\ge0.$$

Further suppose that $h(u,v)\ne0$ and $\frac{\partial}{\partial u}q(u,v)\ne0$ for any positive u and v. If h(u,v) is independent of v, then q(u,v) is a function of u only.

Proof of the sufficiency of Theorem 9.2.3. The conditional pdf of Z = X(n) − X(m) given X(m) = x is
f(z | X(m) = x) = ([R(z + x) − R(x)]^{n−m−1}/Γ(n − m)) f(z + x)/F̄(x), 0 < z < ∞, 0 < x < ∞.

Since Z and X(m) are independent, we will have, for all z > 0,

([R(z + x) − R(x)]^{n−m−1}/Γ(n − m)) f(z + x)/F̄(x) (9.2.8)

independent of x. Now let

R(z + x) − R(x) = −ln(F̄(z + x)/F̄(x)) = q(z, x), say. (9.2.9)

Writing (9.2.8) in terms of q(z, x), we get

[q(z, x)]^{n−m−1} exp{−q(z, x)} (∂/∂z) q(z, x)

independent of x. Hence, by Lemma 9.2.1, we have

−ln(F̄(z + x)(F̄(x))^{−1}) = q(z, x) = c(z), (9.2.10)

where c(z) is a function of z only. Thus

F̄(z + x)(F̄(x))^{−1} = c_1(z), (9.2.11)

where c_1(z) is a function of z only. The relation (9.2.11) holds for all z > 0 and any arbitrarily fixed positive x. The continuous solution of (9.2.11) with the boundary conditions F̄(0) = 1 and F̄(∞) = 0 is
F̄(x) = exp(−σx), (9.2.12)
for x ≥ 0 and an arbitrary positive real number σ. The assumption of absolute continuity of F(x) in the theorem can be replaced by the continuity of F(x).

Cheng (2007) gave a characterization of the Pareto distribution. Unfortunately, the statement and the proof of his theorem were wrong. Here we will give a correct statement and proof of his theorem.

Theorem 9.2.4. Let {X_n, n ≥ 1} be independent and identically distributed with common distribution function F which is continuous with F(1) = 0 and F(x) < 1 for all x > 1. Then for X_n to have the cdf F(x) = 1 − x^{−θ}, x ≥ 1, θ > 0, it is necessary and sufficient that X(n)/(X(n+1) − X(n)) and X(n), n ≥ 1, are independent.

Proof. If F(x) = 1 − x^{−θ}, x ≥ 1, θ > 0, then the joint pdf f_{n,n+1}(x, y) of X(n) and X(n+1) is
f_{n,n+1}(x, y) = (θ^{n+1}(ln x)^{n−1} x^{θ−1})/(Γ(n) y^{θ+1}), 1 < x < y < ∞, θ > 0. (9.2.13)

Using the transformation U = X(n) and V = X(n)/(X(n+1) − X(n)), so that y = u(1 + v)/v with Jacobian u/v², the joint pdf f_{U,V}(u, v) can be written as

f_{U,V}(u, v) = (θ^{n+1}(ln u)^{n−1}/(Γ(n) u)) · v^{θ−1}/(1 + v)^{θ+1}, 1 < u < ∞, 0 < v < ∞.

Thus U and V are independent.

Proof of sufficiency. For a general continuous F, the joint pdf of U and V can be written as

f_{U,V}(u, v) = ([R(u)]^{n−1}/Γ(n)) r(u) f(u(1 + v)/v) · u/v², 1 < u < ∞, 0 < v < ∞, (9.2.14)
where R(x) = −ln(1 − F(x)) and r(x) = (d/dx) R(x). We have the pdf f_U(u) of U as

f_U(u) = ([R(u)]^{n−1}/Γ(n)) f(u).

Since U and V are independent, the pdf f_V(v) of V must satisfy

f_V(v) = u f(u(1 + v)/v)/(v² F̄(u)), 0 < v < ∞. (9.2.15)

Integrating the above pdf with respect to v from v_0 to ∞, we obtain

F_V(v_0) = (1 − F(u(1 + v_0)/v_0))/(1 − F(u)). (9.2.16)

Since F_V(v_0) is independent of u, we must have

(1 − F(u(1 + v_0)/v_0))/(1 − F(u)) = G(v_0), (9.2.17)

where G(v_0) is independent of u. Letting u → 1, we obtain G(v_0) = 1 − F((1 + v_0)/v_0). We can rewrite (9.2.17) as

1 − F(u(1 + v_0)/v_0) = (1 − F((1 + v_0)/v_0))(1 − F(u)). (9.2.18)

The continuous solution of the above equation with the boundary conditions F(1) = 0 and F(∞) = 1 is

F(x) = 1 − x^{−θ}, x ≥ 1, θ > 0. (9.2.19)
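The ratio independence in Theorem 9.2.4 is easy to probe numerically. The sketch below is not part of the book's text; NumPy and all variable names are our own choices. It uses the standard representation of record values of a continuous F through sums of unit exponentials (via R(x) = −ln(1 − F(x))) to draw consecutive Pareto records, and then checks that a rank correlation between U = X(n) and V = X(n)/(X(n+1) − X(n)) is near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 3, 50_000

# For a continuous F, the n-th upper record equals F^{-1}(1 - e^{-S_n}),
# where S_n is a sum of n i.i.d. Exp(1) variables. For the Pareto cdf
# F(x) = 1 - x^{-theta}, x >= 1, this gives X(n) = exp(S_n / theta).
s_n = rng.gamma(shape=n, scale=1.0, size=reps)   # S_n ~ Gamma(n, 1)
e = rng.exponential(size=reps)                   # extra Exp(1) step to X(n+1)
xn = np.exp(s_n / theta)                         # X(n)
xn1 = np.exp((s_n + e) / theta)                  # X(n+1)

u = xn
v = xn / (xn1 - xn)                              # the ratio of Theorem 9.2.4

# Under the theorem U and V are independent; a rank correlation near zero
# across many replications is consistent with that.
ru = np.argsort(np.argsort(u))
rv = np.argsort(np.argsort(v))
rho = float(np.corrcoef(ru, rv)[0, 1])
```

Note that in this representation V = 1/(e^{E/θ} − 1) depends only on the inter-record exponential step E, while U depends only on S_n, which makes the independence transparent.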
The following theorem is a generalization of Theorem 9.2.4.

Theorem 9.2.5. Let {X_n, n ≥ 1} be independent and identically distributed with common distribution function F which is continuous with F(1) = 0 and F(x) < 1 for all x > 1. Then for X_n to have the cdf F(x) = 1 − x^{−θ}, x ≥ 1, θ > 0, it is necessary and sufficient that X(m)/(X(n) − X(m)), 1 ≤ m < n, and X(m) are independent.

Proof. The joint pdf f_{m,n}(x, y) of X(m) and X(n), n > m, is

f_{m,n}(x, y) = ([R(x)]^{m−1}/Γ(m)) ([R(y) − R(x)]^{n−m−1}/Γ(n − m)) r(x) f(y), x < y. (9.2.20)

For F(x) = 1 − x^{−θ} we have R(x) = θ ln x and r(x) = θ/x; thus we obtain

f_{m,n}(x, y) = (θ^{n}(ln x)^{m−1}(ln y − ln x)^{n−m−1})/(Γ(m) Γ(n − m) x y^{θ+1}), (9.2.21)

where 1 < x < y < ∞, θ > 0. Using the transformation U = X(m) and V = X(m)/(X(n) − X(m)), so that y = u(1 + v)/v with Jacobian u/v², we obtain the joint pdf f_{U,V}(u, v) of U and V as

f_{U,V}(u, v) = (θ^{n}(ln u)^{m−1}(ln((1 + v)/v))^{n−m−1} v^{θ−1})/(Γ(m) Γ(n − m) u^{θ+1}(1 + v)^{θ+1}), 1 < u < ∞, 0 < v < ∞.

Thus X(m) and X(m)/(X(n) − X(m)) are independent.

Proof of sufficiency. Using U = X(m) and V = X(m)/(X(n) − X(m)), we can obtain the joint pdf f_{U,V} of U and V from (9.2.20) as

f_{U,V}(u, v) = ([R(u)]^{m−1}/Γ(m)) ([R(u(1 + v)/v) − R(u)]^{n−m−1}/Γ(n − m)) r(u) f(u(1 + v)/v) · u/v². (9.2.22)

We can write the conditional pdf f_{V|U}(v|u) of V given U = u as

f_{V|U}(v|u) = ([R(u(1 + v)/v) − R(u)]^{n−m−1}/Γ(n − m)) u f(u(1 + v)/v)/(v² F̄(u)), 1 < u < ∞, 0 < v < ∞.

The remainder of the proof follows as in Theorem 9.2.4.
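The exponential spacing properties of Theorems 9.2.2 and 9.2.3 can be checked in the same spirit. The following sketch (again our own, not the book's; NumPy assumed) uses the fact that for F(x) = 1 − e^{−σx} the record values are X(k) = S_k/σ with S_k a partial sum of unit exponentials, so X(n) − X(m) and X(m) come from disjoint blocks of the same sum.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, m, n, reps = 1.5, 2, 5, 50_000

# With F(x) = 1 - exp(-sigma*x), R(x) = sigma*x, so the k-th upper record
# is X(k) = S_k / sigma where S_k is a cumulative sum of Exp(1) steps.
steps = rng.exponential(size=(reps, n))
s = np.cumsum(steps, axis=1)
xm = s[:, m - 1] / sigma      # X(m)
xn = s[:, n - 1] / sigma      # X(n)
spacing = xn - xm             # X(n) - X(m)

# Theorem 9.2.3: the spacing and X(m) should be independent.
ru = np.argsort(np.argsort(xm))
rv = np.argsort(np.argsort(spacing))
rho = float(np.corrcoef(ru, rv)[0, 1])
```

The spacing is built from steps m+1, ..., n only, while X(m) uses steps 1, ..., m, so the sample rank correlation should fluctuate around zero.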
9.3. CHARACTERIZATIONS BY IDENTICAL DISTRIBUTION

Theorem 9.3.1. Let {X_n, n ≥ 1} be a sequence of i.i.d. random variables which have an absolutely continuous distribution function F with pdf f and F(0) = 0. Assume that F(x) < 1 for all x > 0. If X_n belongs to the class C1 and I_{n−1,n} = X(n) − X(n−1), n > 1, has the same distribution as X_k, k ≥ 1, then X_k has the cdf F(x) = 1 − e^{−σx}, x ≥ 0, σ > 0.

Proof. The "if" part is easy to establish. We will prove here the "only if" part. By the assumption of the identical distribution of I_{n−1,n} and X_k, we must have

∫_0^∞ ([R(u)]^{n−1}/Γ(n)) r(u) f(u + z) du = f(z), for all z > 0. (9.3.1)

Substituting

∫_0^∞ [R(u)]^{n−1} f(u) du = Γ(n), (9.3.2)

we have

∫_0^∞ [R(u)]^{n−1} r(u) f(u + z) du = f(z) ∫_0^∞ [R(u)]^{n−1} f(u) du. (9.3.3)

Using r(u) = f(u)/F̄(u), we obtain

∫_0^∞ [R(u)]^{n−1} f(u)[f(u + z)(F̄(u))^{−1} − f(z)] du = 0, z > 0. (9.3.4)

Integrating the above expression with respect to z from z_1 to ∞, we get from (9.3.4)

∫_0^∞ [R(u)]^{n−1} f(u)[F̄(u + z_1)(F̄(u))^{−1} − F̄(z_1)] du = 0, z_1 > 0. (9.3.5)

If F(x) is NBU, then (9.3.5) is true if

F̄(u + z_1)(F̄(u))^{−1} = F̄(z_1), z_1 > 0. (9.3.6)

The only continuous solution of (9.3.6) with the boundary conditions F̄(0) = 1 and F̄(∞) = 0 is F̄(x) = exp(−σx), where σ is an arbitrary positive real number. Similarly, if F is NWU, then (9.3.5) is true if (9.3.6) is satisfied, and X_k has the cdf F(x) = 1 − e^{−σx}, x ≥ 0, σ > 0.

Theorem 9.3.2. Let {X_n, n ≥ 1} be a sequence of independent and identically distributed nonnegative random variables with absolutely continuous distribution function F(x) and corresponding probability density function f(x). If F ∈ C2 and for some fixed n, m, 1 ≤ m < n < ∞, I_{m,n} =d X(n − m − 1), where I_{m,n} = X(n) − X(m), then X_k has the cdf F(x) = 1 − e^{−σx}, x ≥ 0, σ > 0.

Proof. The pdfs f_1(x) of R_{n−m} and f_2(x) of I_{m,n} (= R_n − R_m) can be written as
f_1(x) = (1/Γ(n − m)) [R(x)]^{n−m−1} f(x), 0 < x < ∞, (9.3.7)

and

f_2(x) = ∫_0^∞ ([R(u)]^{m−1}/Γ(m)) ([R(u + x) − R(u)]^{n−m−1}/Γ(n − m)) r(u) f(u + x) du, 0 < x < ∞. (9.3.8)

Integrating (9.3.7) and (9.3.8) with respect to x from 0 to x_0, we get

F_1(x_0) = 1 − g_1(x_0), (9.3.9)

where F_1(x) is the cdf of X(n − m − 1) and

g_1(x_0) = Σ_{j=1}^{n−m} ([R(x_0)]^{j−1}/Γ(j)) e^{−R(x_0)},

and

F_2(x_0) = 1 − ∫_0^∞ ([R(u)]^{m−1}/Γ(m)) f(u) g_2(x_0, u) du, (9.3.10)

where F_2(x) is the cdf of I_{m,n} and

g_2(x_0, u) = Σ_{j=1}^{n−m} ([R(u + x_0) − R(u)]^{j−1}/Γ(j)) exp{−(R(u + x_0) − R(u))}.

Now equating (9.3.9) and (9.3.10), we get

∫_0^∞ ([R(u)]^{m−1}/Γ(m)) f(u)[g_2(x_0, u) − g_1(x_0)] du = 0, x_0 > 0. (9.3.11)

Now g_2(x_0, 0) = g_1(x_0) and

(∂/∂u) g_2(x_0, u) = ([R(u + x_0) − R(u)]^{n−m−1}/Γ(n − m)) exp{−(R(u + x_0) − R(u))}[r(u) − r(u + x_0)].

Thus if F ∈ C2, then (9.3.11) is true if

r(u + x_0) = r(u) (9.3.12)

for almost all u and any fixed x_0 > 0. Hence X_k has the cdf F(x) = 1 − e^{−σx}, x ≥ 0, σ > 0, where σ is an arbitrary positive real number. Substituting m = n − 1, we get I_{n−1,n} =d X_1 as a characteristic property of the exponential distribution.

Theorem 9.3.3. Let {X_n, n ≥ 1} be a sequence of independent and identically distributed nonnegative random variables with absolutely continuous distribution function F(x) and corresponding density function f(x). If F belongs to C2 and for some m, m ≥ 1, X(m) and X(m − 1) + U are identically distributed, where U is independent of X(m) and X(m − 1) and is distributed as the X_n's, then X_k has the cdf F(x) = 1 − e^{−σx}, x ≥ 0, σ > 0.

Proof. The pdf f_m(y) of R_m, m ≥ 1, can be written as

f_m(y) = ([R(y)]^m/Γ(m + 1)) f(y), 0 < y < ∞,

with cdf

F_m(y) = ∫_0^y ([R(x)]^m/Γ(m + 1)) f(x) dx.
The pdf f_2(y) of X(m − 1) + U can be written as

f_2(y) = ∫_0^y ([R(x)]^{m−1}/Γ(m)) f(x) f(y − x) dx.

Equating the two distributions of X(m) and X(m − 1) + U, we get on simplification

∫_0^y ([R(x)]^{m−1}/Γ(m)) f(x) H_1(x, y) dx = 0, (9.3.13)

where H_1(x, y) = F̄(y − x) − F̄(y)(F̄(x))^{−1}, 0 < x < y < ∞. Since F ∈ C2, for (9.3.13) to be true we must have

H_1(x, y) = 0, (9.3.14)

for almost all x, 0 < x < y < ∞. This implies that

F̄(y − x) F̄(x) = F̄(y), (9.3.15)

for almost all x, 0 < x < y < ∞. The only continuous solution of (9.3.15) with the boundary conditions F̄(0) = 1 and F̄(∞) = 0 is

F̄(x) = e^{−σx}, (9.3.16)
where σ is an arbitrary positive number.

Theorem 9.3.4. Let {X_n, n ≥ 1} be a sequence of independent and identically distributed nonnegative random variables with absolutely continuous distribution function F(x) and corresponding probability density function f(x). We assume F(0) = 0 and F(x) > 0 for all x > 0. Then the following two conditions are equivalent:

a) the X's have an exponential distribution with F(x) = 1 − e^{−θx}, x ≥ 0, θ > 0;
b) X(n) and X(n − 2) + W are identically distributed, where W has the pdf f_W(w) = (θ²/Γ(2)) w e^{−θw}, w > 0, θ > 0.

For the proof, see Ahsanullah and Aliev (2008).

Theorem 9.3.5. Let X_1, X_2, ..., X_m be independent and identically distributed random variables with probability density function f(x), x ≥ 0, where m is an integer-valued random variable independent of the X's with P(m = k) = p(1 − p)^{k−1}, k = 1, 2, ..., 0 < p < 1. Then the following two properties are equivalent:

a) the X's are distributed as E(0, σ), where σ is a positive real number;
b) p Σ_{j=1}^{m} X_j =d I_{n−1,n} for some fixed n, n ≥ 2, where the X_j belong to C2 and E(X_j) < ∞.
Proof. It is easy to verify that (a) ⇒ (b). We will prove here that (b) ⇒ (a). Let Φ_1(t) be the characteristic function of I_{n−1,n}; then

Φ_1(t) = ∫_0^∞ ∫_0^∞ e^{itx} ([R(u)]^{n−1}/Γ(n)) r(u) f(u + x) du dx
= 1 + it ∫_0^∞ ∫_0^∞ e^{itx} ([R(u)]^{n−1}/Γ(n)) r(u) F̄(u + x) du dx. (9.3.17)

The characteristic function Φ_2(t) of p Σ_{j=1}^{m} X_j can be written as

Φ_2(t) = E(e^{itp Σ_{j=1}^{m} X_j}) = Σ_{k=1}^∞ [Φ(tp)]^k p(1 − p)^{k−1} = p Φ(tp)(1 − q Φ(tp))^{−1}, q = 1 − p, (9.3.18)

where Φ(t) is the characteristic function of the X's. Equating (9.3.17) and (9.3.18), we obtain

(Φ(pt) − 1)/(it(1 − q Φ(pt))) = ∫_0^∞ ∫_0^∞ e^{itx} ([R(u)]^{n−1}/Γ(n)) r(u) F̄(u + x) du dx. (9.3.19)

Now taking the limit of both sides of (9.3.19) as t goes to zero, we have

Φ′(0)/i = ∫_0^∞ ∫_0^∞ ([R(u)]^{n−1}/Γ(n)) r(u) F̄(u + x) du dx. (9.3.20)

Since the X's belong to C1, we must have

F̄(u + x) = F̄(x) F̄(u), (9.3.21)

for almost all x, u, 0 < u, x < ∞. The only continuous solution of (9.3.21) with the boundary conditions F̄(0) = 1 and F̄(∞) = 0 is

F(x) = 1 − e^{−σx}, x ≥ 0, (9.3.22)
where σ is an arbitrary positive real number.

It is known (see Ahsanullah and Holland (1994), p. 475) that for the Gumbel distribution

X*(m) =d X − (W_1 + W_2/2 + ... + W_{m−1}/(m − 1)), m ≥ 1,

where X*(m) is the m-th lower record from the Gumbel distribution, W_0 = 0 and W_1, W_2, ..., W_{m−1} are independently distributed as exponential with F(w) = 1 − e^{−w}, w ≥ 0, and X*(1) = X. Thus S(m) = m(X*(m − 1) − X*(m)), m = 2, ..., are identically distributed as exponential. Similarly, if we consider the upper records from the distribution F(x) = 1 − e^{−e^{x}}, −∞ < x < ∞, then for any m ≥ 1, S_m = m(X(m + 1) − X(m)), where X(m) is the m-th upper record, are identically distributed as exponential. It can be shown that for one fixed m, S(m) or S_m being distributed as exponential does not characterize the Gumbel distribution.

Arnold and Villasenor (1997) raised the question: if S_1 and 2S_2 are i.i.d. exponential with unit mean, can we conclude that the X_j's are (possibly translated) Gumbel variables? Here, we will prove that for a fixed m ≥ 1, the condition X*(m) = X*(m + 1) + W/m, where W is distributed as exponential with mean unity, characterizes the Gumbel distribution.

Theorem 9.3.6. Let {X_j, j = 1, 2, ...} be a sequence of independent and identically distributed random variables with absolutely continuous (with respect to Lebesgue measure) distribution function F(x). Then the following two statements are equivalent:

(a) F(x) = e^{−e^{−x}}, −∞ < x < ∞;
(b) for a fixed m ≥ 1, X*(m) =d X*(m + 1) + W/m, where W is distributed as exponential with mean unity.

Proof. It is enough to show that (b) ⇒ (a). Suppose that for a fixed m ≥ 1, X*(m) =d X*(m + 1) + W/m; then

F_{(m)}(x) = ∫_{−∞}^{x} P(W ≤ m(x − y)) f_{(m+1)}(y) dy
= ∫_{−∞}^{x} [1 − e^{−m(x − y)}] f_{(m+1)}(y) dy
= F_{(m+1)}(x) − ∫_{−∞}^{x} e^{−m(x − y)} f_{(m+1)}(y) dy. (9.3.23)

Thus

e^{mx}[F_{(m+1)}(x) − F_{(m)}(x)] = ∫_{−∞}^{x} e^{my} f_{(m+1)}(y) dy. (9.3.24)

Using the relation

F_{(m+1)}(x) − F_{(m)}(x) = e^{−H(x)} [H(x)]^m/Γ(m + 1), H(x) = −ln F(x),

we obtain

e^{mx} F(x)(H(x))^m/Γ(m + 1) = ∫_{−∞}^{x} e^{my} f_{(m+1)}(y) dy. (9.3.25)

Taking the derivatives of both sides of (9.3.25), we obtain

(d/dx){e^{mx} F(x)(H(x))^m/Γ(m + 1)} = e^{mx} f_{(m+1)}(x). (9.3.26)

This implies that

F(x)(d/dx){e^{mx}(H(x))^m/Γ(m + 1)} = 0. (9.3.27)

Thus

(d/dx){e^{mx}(H(x))^m/Γ(m + 1)} = 0. (9.3.28)

Hence H(x) = c e^{−x}, −∞ < x < ∞, and thus

F(x) = e^{−c e^{−x}}, −∞ < x < ∞. (9.3.29)

Since F(x) is a distribution function, we must have c positive. Assuming F(0) = e^{−1}, we obtain

F(x) = e^{−e^{−x}}, −∞ < x < ∞. (9.3.30)
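The lower-record spacing property of the Gumbel law can be illustrated by simulation. The sketch below is ours, not the book's (NumPy and the variable names are assumptions): if X has cdf exp(−e^{−x}), then T = e^{−X} is unit exponential, so lower records of X are upper records of T and X*(m) = −ln(S_m) with S_m ~ Gamma(m, 1). We then check that m(X*(m) − X*(m+1)) looks like a unit exponential.

```python
import numpy as np

rng = np.random.default_rng(2)
m, reps = 4, 100_000

# Lower records of the Gumbel law via T = exp(-X) ~ Exp(1):
# X*(m) = -ln(S_m), with S_m a Gamma(m, 1) variate, and the next lower
# record adds one more Exp(1) step to S_m.
s_m = rng.gamma(shape=m, scale=1.0, size=reps)
e = rng.exponential(size=reps)
x_m = -np.log(s_m)          # X*(m)
x_m1 = -np.log(s_m + e)     # X*(m+1) <= X*(m)

w = m * (x_m - x_m1)        # should be Exp(1) for every m
mean_w = float(w.mean())
tail_w = float((w > 1.0).mean())   # should be close to exp(-1) ~ 0.3679
```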
Ahsanullah and Malov (2004) proved the following characterization theorem.

Theorem 9.3.7. Let X_1, X_2, ... be a sequence of independent and identically distributed random variables with absolutely continuous cdf F(x) such that, for some fixed m > 2,

X*(m − 2) =d X*(m) + W_1/(m − 2) + W_2/(m − 1),

where W_1 and W_2 are independent and distributed as exponential with unit mean. Then F(x) = e^{−e^{−x}}, −∞ < x < ∞.
Chapter 10
EXTREME VALUE DISTRIBUTIONS

INTRODUCTION

In this chapter some distributional properties of extreme values will be presented. Extreme value distributions arise in probability theory as the limiting distributions of the maximum or minimum of n independent and identically distributed random variables under suitable normalizing constants. For example, let X_1, X_2, ..., X_n be n independent and identically distributed random variables. If the largest order statistic X_{n,n} has a nondegenerate limiting distribution, then with some normalizing constants its distribution tends to one of the limiting extreme value distributions as n → ∞. Fisher and Tippett (1928) identified the three types of limiting cumulative distributions given below.

Type 1 (Gumbel): T10(x) = exp(−e^{−x}), −∞ < x < ∞;

Type 2 (Frechet): T2δ(x) = exp(−x^{−δ}), x > 0, δ > 0;

Type 3 (Weibull): T3δ(x) = exp(−(−x)^{δ}), x ≤ 0, δ > 0.

Since the smallest order statistic X_{1,n} = −Y_{n,n}, where Y = −X, X_{1,n} with appropriate normalizing constants will also converge to one of the above three limiting distributions if we change x to −x in the three types. Gumbel (1958) has given various applications of these distributions. The corresponding pdfs are given below.

Type 1 (Gumbel): t10(x) = e^{−x} exp(−e^{−x}), −∞ < x < ∞;

Type 2 (Frechet): t2δ(x) = δ x^{−δ−1} exp(−x^{−δ}), x > 0, δ > 0;

Type 3 (Weibull): t3δ(x) = δ(−x)^{δ−1} exp(−(−x)^{δ}), x ≤ 0, δ > 0.

Suppose X_1, X_2, ..., X_n are i.i.d. random variables having the distribution function F(x) = 1 − e^{−x}. Then with normalizing constants a_n = ln n and b_n = 1,

P(X_{n,n} ≤ a_n + b_n x) = P(X_{n,n} ≤ ln n + x) = (1 − e^{−(ln n + x)})^n = (1 − e^{−x}/n)^n → exp(−e^{−x}) as n → ∞.

Thus the limiting distribution of X_{n,n} when the X's are distributed as exponential with unit mean is the Type 1 extreme value distribution given above. It can be shown that the Type 1 (Gumbel) distribution is the limiting distribution of X_{n,n} when F(x) is normal, lognormal, logistic, gamma, etc. We will denote the Type 1 distribution as T10 and the Type 2 and Type 3 distributions as T2δ and T3δ respectively. If X_{n,n} of n independent random variables from a distribution F has the limiting distribution T, then we will say that F belongs to the domain of attraction of T and write F ∈ D(T). The extreme value distributions were originally introduced by Fisher and Tippett (1928). These distributions have been used in the analysis of data concerning floods, extreme sea levels and air pollution problems; for details see Gumbel (1958), Horwitz (1980), Jenkinson (1955) and Roberts (1979).
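The exponential example above is easy to reproduce empirically. The following sketch is ours, not the book's (NumPy and the parameter choices are assumptions): it compares the simulated probability P(X_{n,n} ≤ ln n + x) against the Gumbel limit exp(−e^{−x}).

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, x = 500, 10_000, 0.5

# For Exp(1) samples, P(X_{n,n} <= ln n + x) = (1 - e^{-x}/n)^n, which
# approaches the Gumbel limit exp(-exp(-x)) as n grows.
maxima = rng.exponential(size=(reps, n)).max(axis=1)
empirical = float((maxima <= np.log(n) + x).mean())
limit = float(np.exp(-np.exp(-x)))
```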
10.1. THE PDF OF EXTREME VALUE DISTRIBUTIONS

10.1.1. The Probability Density Function of Type 1 Extreme Value Distribution (T10)

The probability density function of T10 is given in Figure 10.1.1.

Figure 10.1.1. PDF of T10.
Mean = γ, Euler's constant
Median = −ln(ln 2)
Variance = π²/6
Moment generating function: M(t) = Γ(1 − t), t < 1.
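The Gumbel moments listed above can be verified by direct numerical integration of the pdf t10(x) = e^{−x} exp(−e^{−x}). This check is ours, not the book's (NumPy and the grid choices are assumptions):

```python
import math

import numpy as np

# Numerical check of the T10 moments via a fine Riemann sum of the pdf.
x = np.linspace(-10.0, 40.0, 1_000_001)
dx = x[1] - x[0]
pdf = np.exp(-x - np.exp(-x))   # t10(x) = e^{-x} exp(-e^{-x})

mass = float((pdf * dx).sum())                      # should be ~ 1
mean = float((x * pdf * dx).sum())                  # should be ~ Euler's gamma
var = float(((x - mean) ** 2 * pdf * dx).sum())     # should be ~ pi^2 / 6

euler_gamma = 0.5772156649015329
```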
10.1.2. The PDF of Type 2 Extreme Value Distributions for X_{n,n}

The probability density functions of T22, T25 and T2,10 are given in Figure 10.1.2.

Figure 10.1.2. PDFs: T22 solid, T25 dash and T2,10 dash-dot.

Mean = Γ(1 − 1/δ), δ > 1
Variance = Γ(1 − 2/δ) − (Γ(1 − 1/δ))², δ > 2
The kth moment exists if δ > k.
10.1.3. The PDF of Type 3 Distribution of X_{n,n}

The probability density functions of T3δ for δ = 2, 5 and 10 are given in Figure 10.1.3. Note that for δ = 1, T31 is the reversed exponential distribution.

Figure 10.1.3. PDFs: T32 solid, T35 dash and T3,10 dash-dot.

Mean = −Γ(1 + 1/δ)
Variance = Γ(1 + 2/δ) − (Γ(1 + 1/δ))²
kth moment = (−1)^k Γ(1 + k/δ)
Moment generating function: M(t) = Σ_{k=0}^∞ (−t)^k Γ(1 + k/δ)/k!.
10.2. DOMAIN OF ATTRACTION

In this section we will study the domain of attraction of various distributions. The maximum order statistic X_{n,n} of n independent and identically distributed random variables will be considered. We will say that X_{n,n} belongs to the domain of attraction of T(x) if

lim_{n→∞} P(X_{n,n} ≤ a_n + b_n x) = T(x)

for some sequences of normalizing constants a_n and b_n. For example, consider the uniform distribution with pdf f(x) = 1, 0 < x < 1. Then for t < 0, P(X_{n,n} ≤ 1 + t/n) = (1 + t/n)^n → e^{t}. Thus X_{n,n} from the uniform distribution belongs to the domain of attraction of T(x) = e^{x}, −∞ < x < 0.

The following lemma will be helpful in proving the theorems on domains of attraction.

Lemma 10.2.1. Let {X_n, n ≥ 1} be a sequence of independent and identically distributed random variables with distribution function F. Consider a sequence {e_n, n ≥ 1} of real numbers. Then for any τ, 0 ≤ τ < ∞, the following two statements are equivalent:

(i) lim_{n→∞} n F̄(e_n) = τ;
(ii) lim_{n→∞} P(X_{n,n} ≤ e_n) = e^{−τ}.

Proof. Suppose (i) is true; then

lim_{n→∞} P(X_{n,n} ≤ e_n) = lim_{n→∞} F^n(e_n) = lim_{n→∞} (1 − F̄(e_n))^n = lim_{n→∞} (1 − τ/n + o(1/n))^n = e^{−τ}.

Suppose (ii) is true; then

e^{−τ} = lim_{n→∞} P(X_{n,n} ≤ e_n) = lim_{n→∞} (1 − F̄(e_n))^n.

Taking the logarithm of the above expression and using ln(1 − F̄(e_n)) = −F̄(e_n)(1 + o(1)), we get lim_{n→∞} n F̄(e_n) = τ.

Note that the above lemma remains true if τ = ∞.
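Lemma 10.2.1 can be illustrated with the uniform example above. The sketch below is ours, not the book's (NumPy and the parameter choices are assumptions): for U(0, 1) and e_n = 1 + t/n with t < 0, n F̄(e_n) = −t = τ, so P(X_{n,n} ≤ e_n) should approach e^{−τ} = e^{t}.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, t = 500, 10_000, -1.0   # t < 0 so that e_n = 1 + t/n < 1

# Lemma 10.2.1: n * F_bar(e_n) -> tau implies P(X_{n,n} <= e_n) -> e^{-tau}.
# Here F(x) = x on (0, 1), so n * (1 - e_n) = -t, i.e. tau = -t.
e_n = 1.0 + t / n
maxima = rng.random((reps, n)).max(axis=1)
empirical = float((maxima <= e_n).mean())
limit = float(np.exp(t))   # e^{-tau}
```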
10.2.1. Domain of Attraction of Type 1 Extreme Value Distribution for X_{n,n}

The following theorem is due to Gnedenko (1943).

Theorem 10.2.1. Let X_1, X_2, ... be a sequence of i.i.d. random variables with distribution function F and e(F) = sup{x : F(x) < 1}. Then F ∈ D(T10) if and only if there exists a positive function g(t) such that

lim_{t→e(F)} F̄(t + x g(t))/F̄(t) = e^{−x}, F̄ = 1 − F, for all x.

Proof. We choose the normalizing constants a_n and b_n of X_{n,n} as

a_n = inf{x : F(x) ≥ 1 − 1/n}, b_n = g(a_n);

then a_n → e(F) as n → ∞. Suppose lim_{t→e(F)} F̄(t + x g(t))/F̄(t) = e^{−x}; then

lim_{n→∞} n F̄(a_n + b_n x) = lim_{n→∞} n F̄(a_n)(F̄(a_n + b_n x)/F̄(a_n)) = e^{−x} lim_{n→∞} n F̄(a_n) = e^{−x},

and by Lemma 10.2.1, P(X_{n,n} ≤ a_n + b_n x) → e^{−e^{−x}}.

Conversely, suppose P(X_{n,n} ≤ a_n + b_n x) → e^{−e^{−x}}. By Lemma 10.2.1,

lim_{n→∞} n F̄(a_n + b_n x) = e^{−x},

and hence

e^{−x} = lim_{n→∞} n F̄(a_n)(F̄(a_n + b_n x)/F̄(a_n)) = lim_{n→∞} F̄(a_n + b_n x)/F̄(a_n),

which gives lim_{t→e(F)} F̄(t + x g(t))/F̄(t) = e^{−x}.

The following lemma (see Von Mises (1936)) gives a sufficient condition for the domain of attraction of the Type 1 extreme value distribution for X_{n,n}.

Lemma 10.2.2. Suppose the distribution function F has a derivative on [c_0, e(F)) for some c_0, 0 < c_0 < e(F). If lim_{x→e(F)} f(x)/F̄(x) = c, c > 0, then F ∈ D(T10).

Example 10.2.1. The exponential distribution F(x) = 1 − e^{−x} satisfies the sufficient condition, since f(x)/F̄(x) = 1. For the logistic distribution F(x) = 1/(1 + e^{−x}),

lim_{x→∞} f(x)/F̄(x) = lim_{x→∞} 1/(1 + e^{−x}) = 1.

Thus the logistic distribution also satisfies the sufficient condition.

Example 10.2.2. For the standard normal distribution, with x > 0 (see Abramowitz and Stegun (1968), p. 932),

F̄(x) = (e^{−x²/2}/(x√(2π))) h(x), where h(x) = 1 − 1/x² + 1·3/x⁴ − ... + (−1)^n (1·3·...·(2n − 1))/x^{2n} + R_n,

and the remainder R_n is less in absolute value than the first neglected term. It can be shown that g(t) = 1/t + O(t^{−3}). Thus

lim_{t→∞} F̄(t + x g(t))/F̄(t) = lim_{t→∞} e^{−x m(t,x)} = e^{−x},

where m(t, x) = g(t)(t + (1/2) x g(t)) → 1 as t → ∞. Thus the normal distribution belongs to the domain of attraction of the Type 1 distribution. Since

lim_{x→∞} e^{−x²/2}/(√(2π) x F̄(x)) = lim_{x→∞} 1/h(x) = 1,

the hazard ratio f(x)/F̄(x) grows like x, so the standard normal distribution does not satisfy the Von Mises sufficient condition for the domain of attraction of the Type 1 distribution. We can take

a_n = (2 ln n)^{1/2} − (ln ln n + ln 4π)/(2(2 ln n)^{1/2}) and b_n = (2 ln n)^{−1/2}.

However, this choice of a_n and b_n is not unique. The rate of convergence of P(X_{n,n} ≤ a_n + b_n x) to T10(x) depends on the choices of a_n and b_n.
10.2.2. Domain of Attraction of Type 2 Extreme Value Distribution for X_{n,n}

Theorem 10.2.2. Let X_1, X_2, ... be a sequence of i.i.d. random variables with distribution function F and e(F) = sup{x : F(x) < 1}. If e(F) = ∞, then F ∈ D(T2δ) if and only if

lim_{t→∞} F̄(tx)/F̄(t) = x^{−δ}, for x > 0 and some constant δ > 0.

Proof. Let a_n = inf{x : F(x) ≥ 1 − 1/n}; then a_n → ∞ as n → ∞. Thus

lim_{n→∞} n F̄(a_n x) = lim_{n→∞} n F̄(a_n)(F̄(a_n x)/F̄(a_n)) = x^{−δ} lim_{n→∞} n F̄(a_n).

It is easy to show that lim_{n→∞} n F̄(a_n) = 1. Thus lim_{n→∞} n F̄(a_n x) = x^{−δ}, and the proof of the theorem follows from Lemma 10.2.1.

Example 10.2.3. For the Pareto distribution with F̄(x) = x^{−δ}, δ > 0, x ≥ 1, lim_{t→∞} F̄(tx)/F̄(t) = x^{−δ}. Thus the Pareto distribution belongs to D(T2δ).
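The Pareto case of Theorem 10.2.2 can be checked by simulation. The sketch below is ours, not the book's (NumPy and the parameter choices are assumptions): with a_n = 0 and b_n = n^{1/δ}, P(X_{n,n} ≤ b_n x) should approach the Frechet limit exp(−x^{−δ}).

```python
import numpy as np

rng = np.random.default_rng(5)
delta, n, reps, x = 2.0, 500, 10_000, 1.5

# Pareto tail F_bar(y) = y^{-delta}, y >= 1, via inverse-cdf sampling.
u = rng.random((reps, n))
sample = (1.0 - u) ** (-1.0 / delta)
maxima = sample.max(axis=1)

empirical = float((maxima <= n ** (1.0 / delta) * x).mean())
limit = float(np.exp(-x ** (-delta)))   # T2_delta(x)
```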
The following theorem gives a necessary and sufficient condition for the domain of attraction of the Type 2 distribution for X_{n,n} when e(F) < ∞.

Theorem 10.2.3. Let X_1, X_2, ... be a sequence of i.i.d. random variables with distribution function F and e(F) = sup{x : F(x) < 1}. If e(F) < ∞, then F ∈ D(T2δ) if and only if

lim_{t→∞} F̄(e(F) − (tx)^{−1})/F̄(e(F) − t^{−1}) = x^{−δ}, for x > 0 and some constant δ > 0.

Proof. Similar to Theorem 10.2.2.

Example 10.2.4. For the truncated Pareto distribution with pdf

f(x) = δ x^{−δ−1}/(1 − b^{−δ}), 1 < x < b, b > 1,

we have e(F) = b and

lim_{t→∞} F̄(b − (tx)^{−1})/F̄(b − t^{−1}) = x^{−1}.

Thus the truncated Pareto distribution belongs to the domain of attraction of the Type T21 distribution.

The following lemma (see Von Mises (1936)) gives a sufficient condition for the domain of attraction of the Type 2 extreme value distribution for X_{n,n}.

Lemma 10.2.3. Suppose the distribution function F is absolutely continuous on [c_0, e(F)) for some c_0, 0 < c_0 < e(F). If

lim_{x→e(F)} x f(x)/F̄(x) = δ, δ > 0, then F ∈ D(T2δ).

Example 10.2.5. For the truncated Pareto distribution with pdf f(x) = δ x^{−δ−1}/(1 − b^{−δ}), 1 ≤ x < b, b > 1,

lim_{x→b} x f(x)/F̄(x) = lim_{x→b} δ x^{−δ}/(x^{−δ} − b^{−δ}) = ∞.

Thus the truncated Pareto distribution does not satisfy the Von Mises sufficient condition. However, it belongs to the domain of attraction of the Type 2 extreme value distribution, because it satisfies the necessary and sufficient condition of Theorem 10.2.3 with δ = 1.
10.2.3. Domain of Attraction of Type 3 Extreme Value Distribution for X_{n,n}

The following theorem gives a necessary and sufficient condition for the domain of attraction of the Type 3 distribution for X_{n,n}.

Theorem 10.2.4. Let X_1, X_2, ... be a sequence of i.i.d. random variables with distribution function F and e(F) = sup{x : F(x) < 1}. If e(F) < ∞, then F ∈ D(T3δ) if and only if

lim_{t→0} F̄(e(F) + tx)/F̄(e(F) − t) = (−x)^{δ}, for x < 0 and some constant δ > 0.

Proof. Similar to Theorem 10.2.3.

Suppose X is a negative exponential distribution truncated at x = b > 0, with pdf f(x) = e^{−x}/F(b), 0 < x < b. Since

lim_{t→0} F̄(b + tx)/F̄(b − t) = lim_{t→0} (e^{−(b + tx)} − e^{−b})/(e^{−(b − t)} − e^{−b}) = −x,

the truncated exponential distribution satisfies the necessary and sufficient condition for the domain of attraction of the Type 3 distribution for the maximum (with δ = 1).

The following lemma gives the Von Mises sufficient condition for the domain of attraction of the Type 3 distribution for X_{n,n}.

Lemma 10.2.4. Suppose the distribution function F is absolutely continuous on [c_0, e(F)] for some c_0, 0 < c_0 < e(F) < ∞. If

lim_{x→e(F)} (e(F) − x) f(x)/F̄(x) = δ, δ > 0, then F ∈ D(T3δ).

Proof. Similar to Lemma 10.2.3.

Example 10.2.6. Suppose X is a negative exponential distribution truncated at x = b > 0, with pdf f(x) = e^{−x}/F(b), 0 < x < b. Now

lim_{x→b} (e(F) − x) f(x)/F̄(x) = lim_{x→b} (b − x) e^{−x}/(e^{−x} − e^{−b}) = 1.

Thus the truncated exponential distribution satisfies the Von Mises sufficient condition for the domain of attraction of the Type 3 distribution.

A distribution that belongs to the domain of attraction of the Type 2 distribution cannot have finite e(F). A distribution that belongs to the domain of attraction of the Type 3 distribution must have finite e(F). The normalizing constants of X_{n,n} are not unique for any distribution. From the table it is evident that two different distributions (exponential and logistic) belong to the domain of attraction of the same distribution and have the same normalizing constants. The normalizing constants depend on F and the limiting distribution.

It may happen that X_{n,n} with any normalizing constants does not converge in distribution to a nondegenerate limiting distribution, but W_{n,n}, where W = u(X) is a function of X, may with some normalizing constants converge in distribution to one of the three limiting distributions. We can easily verify that the rv X with pdf

f(x) = 1/(x(ln x)²), x ≥ e,

does not satisfy the necessary and sufficient conditions for the convergence in distribution of X_{n,n} to any of the extreme value distributions. Suppose W = ln X; then F_W(x) = 1 − 1/x for x > 1, and with a_n = 0 and b_n = n, P(W_{n,n} ≤ b_n x) → T21(x) as n → ∞.

Suppose e(F) = sup{x : F(x) < 1}, and let F(x) and G(x) be two distribution functions with e(F) = e(G) = ∞. If

lim_{x→∞} (1 − F(x))/(1 − G(x)) = c,

where c, 0 < c < ∞, is a constant, then the cumulative distributions F(x) and G(x) belong to the same domain of attraction. Further, if c = 1, then they will have the same normalizing constants.

Example 10.2.7. Consider the exponential distribution with cdf F(x) = 1 − e^{−x}, x ≥ 0, and the logistic distribution with cdf G(x) = 1/(1 + e^{−x}), −∞ < x < ∞. We have

lim_{x→∞} (1 − G(x))/(1 − F(x)) = lim_{x→∞} (e^{−x}/(1 + e^{−x}))/e^{−x} = lim_{x→∞} 1/(1 + e^{−x}) = 1.

Thus the exponential and logistic distributions belong to the same domain of attraction of the extreme value distribution, and they have the same normalizing constants.
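The exponential/logistic tail-equivalence computation above is easy to verify numerically. This check is ours, not the book's (NumPy assumed): the tail ratio (1 − G(x))/(1 − F(x)) = 1/(1 + e^{−x}) increases monotonically to 1.

```python
import numpy as np

# Tail-equivalence check: F exponential, G standard logistic.
xs = np.array([2.0, 5.0, 10.0, 20.0])
exp_tail = np.exp(-xs)                     # 1 - F(x)
logistic_tail = 1.0 / (1.0 + np.exp(xs))   # 1 - G(x)
ratios = logistic_tail / exp_tail          # equals 1/(1 + e^{-x})
```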
and the cdf G((x) = 1 = , 1
1−𝐺(𝑥) 𝑥→= 1−𝐹(𝑥)
lim
+
. We have
1 1
1 = 𝑎𝑟𝑐𝑡𝑎𝑛𝑥) 1 2 𝜋 =𝜋 1/𝜋 𝑥→=
=lim
Thus Xn,n from F(x) and G(x) have the same limiting distribution, T21. The normalizing constant for Cauchy distribution is = 0 and . Since the normalizing constant of Cauchy distribution is . For large n, we can take
=
. The normalizing
constants for G(x) will be n. Following Pickands (1975), the following theorem gives a necessary and sufficient condition for the domain of attraction of Xn,n from a continuous distribution.
Theorem 10.2.5. For a continuous random variable, the necessary and sufficient condition for X_{n,n} to belong to the domain of attraction of the extreme value distribution of the maximum is

lim_{c→0} (F^{−1}(1 − c) − F^{−1}(1 − 2c))/(F^{−1}(1 − 2c) − F^{−1}(1 − 4c)) = 1 if F ∈ D(T10),
= 2^{1/δ} if F ∈ D(T2δ), and
= 2^{−1/δ} if F ∈ D(T3δ).

Example 10.2.9. For the exponential distribution E(0, σ) with pdf f(x) = (1/σ) e^{−x/σ}, x > 0, we have F^{−1}(x) = −σ ln(1 − x) and

lim_{c→0} (F^{−1}(1 − c) − F^{−1}(1 − 2c))/(F^{−1}(1 − 2c) − F^{−1}(1 − 4c)) = lim_{c→0} (−ln c + ln 2c)/(−ln 2c + ln 4c) = 1.

Thus the domain of attraction of X_{n,n} from the exponential distribution E(0, σ) is T10.

For the Pareto distribution P(0, 0, δ) with pdf f(x) = δ x^{−(δ+1)}, x ≥ 1, δ > 0, we have F^{−1}(x) = (1 − x)^{−1/δ} and

lim_{c→0} (F^{−1}(1 − c) − F^{−1}(1 − 2c))/(F^{−1}(1 − 2c) − F^{−1}(1 − 4c)) = lim_{c→0} (c^{−1/δ} − (2c)^{−1/δ})/((2c)^{−1/δ} − (4c)^{−1/δ}) = 2^{1/δ}.

Hence the domain of attraction of X_{n,n} from the Pareto distribution P(0, 0, δ) is T2δ.

For the uniform distribution U(−1/2, 1/2) with pdf f(x) = 1, −1/2 ≤ x ≤ 1/2, we have F^{−1}(x) = x − 1/2 and

lim_{c→0} ((1 − c) − (1 − 2c))/((1 − 2c) − (1 − 4c)) = 1/2 = 2^{−1}.

Consequently the domain of attraction of X_{n,n} from the uniform distribution U(−1/2, 1/2) is T31.

For the arcsin distribution with F(x) = 1/2 + (1/π) arcsin x, it can be shown that

lim_{c→0} (F^{−1}(1 − c) − F^{−1}(1 − 2c))/(F^{−1}(1 − 2c) − F^{−1}(1 − 4c)) = 2^{−2}.

Thus X_{n,n} from the arcsin distribution belongs to the domain of attraction of the Type 3 distribution with δ = 1/2.

It may happen that X_{n,n} from a continuous distribution does not belong to the domain of attraction of any one of the three distributions; in that case X_{n,n} has a degenerate limiting distribution. Suppose the rv X has the pdf

f(x) = 1/(x(ln x)²), x ≥ e.

Then F̄(x) = 1/ln x and F^{−1}(1 − c) = e^{1/c}, 0 < c < 1. Then

lim_{c→0} (e^{1/c} − e^{1/(2c)})/(e^{1/(2c)} − e^{1/(4c)}) = lim_{c→0} e^{1/(2c)}(e^{1/(2c)} − 1)/(e^{1/(4c)}(e^{1/(4c)} − 1)) = ∞.

Thus the limit does not exist, and the rv X does not satisfy the necessary and sufficient condition given in Theorem 10.2.5. The following table gives the normalizing constants for some well-known distributions belonging to the domain of attraction of the extreme value distributions.
Table 10.2. Normalizing Constants for X_{n,n} (here D_n = ln ln n + ln 4π)

Distribution | f(x) | a_n | b_n | Domain
Beta | c x^{α−1}(1 − x)^{β−1}, 0 < x < 1, c = Γ(α + β)/(Γ(α)Γ(β)) | 1 | (β/(nc))^{1/β} | T3β
Cauchy | 1/(π(1 + x²)), −∞ < x < ∞ | 0 | cot(π/n) | T21
Exponential | σ e^{−σx}, 0 < x < ∞, σ > 0 | (ln n)/σ | 1/σ | T10
Gamma | x^{α−1} e^{−x}/Γ(α), 0 < x < ∞ | ln n + (α − 1) ln ln n − ln Γ(α) | 1 | T10
Laplace | (1/2) e^{−|x|}, −∞ < x < ∞ | ln(n/2) | 1 | T10
Logistic | e^{−x}/(1 + e^{−x})², −∞ < x < ∞ | ln n | 1 | T10
Lognormal | (1/(x√(2π))) e^{−(ln x)²/2}, 0 < x < ∞ | e^{d_n}, d_n = (2 ln n)^{1/2} − D_n/(2(2 ln n)^{1/2}) | a_n(2 ln n)^{−1/2} | T10
Normal | (1/√(2π)) e^{−x²/2}, −∞ < x < ∞ | (2 ln n)^{1/2} − D_n/(2(2 ln n)^{1/2}) | (2 ln n)^{−1/2} | T10
Pareto | δ x^{−(δ+1)}, x ≥ 1, δ > 0 | 0 | n^{1/δ} | T2δ
Power function | δ x^{δ−1}, 0 ≤ x ≤ 1, δ > 0 | 1 | (nδ)^{−1} | T31
Rayleigh | (2x/σ²) e^{−x²/σ²}, x > 0 | σ(ln n)^{1/2} | σ/(2(ln n)^{1/2}) | T10
t distribution | c_k(1 + x²/k)^{−(k+1)/2}, c_k = Γ((k + 1)/2)/(√(kπ)Γ(k/2)) | 0 | (n c_k k^{(k−1)/2})^{1/k} | T2k
Truncated exponential | C e^{−x}, C = 1/(1 − e^{−e(F)}), 0 < x < e(F) < ∞ | e(F) | (e^{e(F)} − 1)/n | T31
Type 1 | e^{−x} e^{−e^{−x}}, −∞ < x < ∞ | ln n | 1 | T10
Type 2 | δ x^{−(δ+1)} e^{−x^{−δ}}, x > 0, δ > 0 | 0 | n^{1/δ} | T2δ
Type 3 | δ(−x)^{δ−1} e^{−(−x)^{δ}}, x ≤ 0, δ > 0 | 0 | n^{−1/δ} | T3δ
Uniform | 1/θ, 0 < x < θ | θ | θ/n | T31
Weibull | δ x^{δ−1} e^{−x^{δ}}, x ≥ 0, δ > 0 | (ln n)^{1/δ} | (1/δ)(ln n)^{1/δ−1} | T10
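The normal entry in Table 10.2 can be checked without simulation, since P(X_{n,n} ≤ a_n + b_n x) = Φ(a_n + b_n x)^n is exactly computable. This check is ours, not the book's (the helper function and parameters are our own); the convergence for the normal is famously slow, so even at n = 10^6 the agreement is only approximate.

```python
import math

n = 10 ** 6
x = 0.5

# Table 10.2 constants for the standard normal maximum:
# a_n = (2 ln n)^{1/2} - D_n / (2 (2 ln n)^{1/2}), b_n = (2 ln n)^{-1/2},
# with D_n = ln ln n + ln 4*pi.
r = math.sqrt(2.0 * math.log(n))
d_n = math.log(math.log(n)) + math.log(4.0 * math.pi)
a_n = r - d_n / (2.0 * r)
b_n = 1.0 / r

def std_normal_cdf(t: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

exact = std_normal_cdf(a_n + b_n * x) ** n   # P(X_{n,n} <= a_n + b_n x)
limit = math.exp(-math.exp(-x))              # T10(x)
```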
10.3. EXTREME VALUE DISTRIBUTIONS FOR X1,N Let us consider X1,n of n i.i.d random variables. Suppose P(X1n < cn + dn x) H(x) as n , then the following three types of distributions are possible for H(x). x Type 1 distribution H10(x) = 1 e e , x : ;
Type 2 distribution H2(x) = 1 e
( x)
Type 3 distribution H3(x) = 1 e
x
, x 0, 0 .
, x 0, 0.
The corresponding pdfs are as given below. Type 1 (minimum) h19(x) = Type 2 (minimum)
(x) =
Type 3 (minimum)
(x) =
.,=
, x 0,
Valery Nevzorov, Mohammad Ahsanullah and Sergei Ananjevskiy
194
It may happen that X_{n,n} and X1,n belong to different types of extreme value distributions. For example, consider the exponential distribution with f(x) = e^{-x}, x > 0. Here X_{n,n} belongs to the domain of attraction of the Type 1 distribution of the maximum, T10. Since P(X1,n > n^{-1}x) = e^{-x}, X1,n belongs to the domain of attraction of the Type 3 distribution of the minimum, H31.

It may also happen that X_{n,n} does not belong to any of the three limiting distributions of the maximum while X1,n belongs to the domain of attraction of one of the limiting distributions of the minimum. Consider the random variable X with pdf f(x) = 1/(x(ln x)^2), x ≥ e. We have seen that F does not satisfy the necessary and sufficient conditions for the convergence in distribution of X_{n,n} to any of the extreme value distributions. However, it can be shown that P(X1,n > α_n + β_n x) → e^{-x} as n → ∞ for α_n = e and β_n = e/n. Thus X1,n belongs to the domain of attraction of H31.

If X is a symmetric random variable and X_{n,n} belongs to the domain of attraction of Ti(x), then X1,n will belong to the domain of attraction of the corresponding Hi(x), i = 1, 2, 3.

The following lemma is needed to prove the necessary and sufficient conditions for the convergence of X1,n to one of the limiting distributions H(x).

Lemma 10.3.1. Let {X_n, n ≥ 1} be a sequence of independent and identically distributed random variables with distribution function F. Consider a sequence {e_n, n ≥ 1} of real numbers. Then for any τ, 0 < τ < ∞, the following two statements are equivalent:

(i) lim_{n→∞} n F(e_n) = τ;

(ii) lim_{n→∞} P(X1,n > e_n) = e^{-τ}.

Proof. The proof of the Lemma follows from Lemma 2.1.1 on considering the fact that P(X1,n > e_n) = (1 - F(e_n))^n.
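Lemma 10.3.1 can be illustrated numerically. The sketch below (an illustration, not part of the book's argument; the choice F(x) = 1 - e^{-x} and τ = 2 is arbitrary) takes e_n = τ/n, so that statement (i) holds, and evaluates the probability in statement (ii) directly:

```python
import math

# Numerical illustration of Lemma 10.3.1 for the exponential cdf
# F(x) = 1 - e^{-x}: with e_n = tau/n we get n*F(e_n) -> tau, while
# P(X_{1,n} > e_n) = (1 - F(e_n))^n equals e^{-tau} here for every n.
def F(x):
    return 1.0 - math.exp(-x)

tau = 2.0
for n in (10, 100, 10000):
    en = tau / n
    lhs = n * F(en)                  # statement (i): -> tau
    prob = (1.0 - F(en)) ** n        # statement (ii): P(X_{1,n} > e_n)
    print(n, round(lhs, 4), round(prob, 4))
print(math.exp(-tau))                # the limit e^{-2} ≈ 0.1353
```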
10.4. PDFS OF THE EXTREME VALUE DISTRIBUTIONS FOR X1,n

10.4.1. Type 1 Extreme Value Distribution for X1,n

The pdf h10(x) = e^{x} e^{-e^{x}} is shown in Figure 10.3.1.
Figure 10.3.1. The pdf h10(x).
Mean = -γ, where γ is Euler's constant.
Median = ln(ln 2).
Variance = π^2/6.
Moment generating function: M(t) = Γ(1 + t), t > -1.
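These values can be checked by simulation (an illustrative sketch; the sample size and seed are arbitrary). If U is uniform on (0, 1), then X = ln(-ln U) has exactly the cdf 1 - e^{-e^x}, so sample moments should match the quantities above:

```python
import math
import random

# Simulation sketch of the Type 1 (minimum) moments: if U ~ U(0,1), then
# X = ln(-ln U) has cdf P(X <= x) = 1 - e^{-e^x}, i.e. exactly H_10.
random.seed(2)
N = 200_000
xs = [math.log(-math.log(random.random())) for _ in range(N)]
mean = sum(xs) / N
var = sum((v - mean) ** 2 for v in xs) / N
gamma = 0.57721566490                  # Euler's constant
print(mean, -gamma)                    # mean -> -gamma ≈ -0.5772
print(var, math.pi ** 2 / 6)           # variance -> pi^2/6 ≈ 1.6449
```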
10.4.2. Type 2 Extreme Value Distribution for X1,n

The pdfs of h2δ(x) for δ = 2, 5, and 10 are given in Figure 10.3.2.
Figure 10.3.2. The pdfs h2,2 (solid), h2,5 (dash) and h2,10 (dot-dash).
Mean = -Γ(1 - 1/δ), δ > 1.
Variance = Γ(1 - 2/δ) - (Γ(1 - 1/δ))^2, δ > 2.
The kth moment exists if δ > k, and then E|X|^k = Γ(1 - k/δ).
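A quick simulation check of the mean (a sketch; δ = 5, the sample size and seed are arbitrary choices): if U is uniform on (0, 1), then X = -(-ln U)^{-1/δ} has exactly the cdf 1 - exp(-(-x)^{-δ}), x < 0:

```python
import math
import random

# Simulation sketch of the Type 2 (minimum) mean: X = -(-ln U)^{-1/delta}
# has cdf 1 - exp(-(-x)^{-delta}) for x < 0, i.e. H_2delta, and
# E X = -Gamma(1 - 1/delta) for delta > 1 (here delta = 5).
random.seed(3)
delta, N = 5.0, 200_000
xs = [-((-math.log(random.random())) ** (-1.0 / delta)) for _ in range(N)]
mean = sum(xs) / N
print(mean, -math.gamma(1.0 - 1.0 / delta))   # both ≈ -1.1642
```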
10.4.3. Type 3 Extreme Value Distribution for X1,n

The pdfs of h3δ(x) for δ = 2, 5, and 10 are given in Figure 10.3.3.
Mean = Γ(1 + 1/δ).
Variance = Γ(1 + 2/δ) - (Γ(1 + 1/δ))^2.
kth moment = Γ(1 + k/δ).
Moment generating function: M(t) = Σ_{k=0}^{∞} (t^k/k!) Γ(1 + k/δ).
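The moment formulas can again be checked by simulation (a sketch; δ = 2, the sample size and seed are arbitrary). X = (-ln U)^{1/δ} with U uniform on (0, 1) has exactly the cdf 1 - e^{-x^δ}, x ≥ 0:

```python
import math
import random

# Simulation sketch of the Type 3 (minimum) moments: X = (-ln U)^{1/delta}
# has cdf 1 - e^{-x^delta}, x >= 0 (H_3delta), with E X^k = Gamma(1 + k/delta).
random.seed(4)
delta, N = 2.0, 200_000
xs = [(-math.log(random.random())) ** (1.0 / delta) for _ in range(N)]
m1 = sum(xs) / N
m2 = sum(v * v for v in xs) / N
print(m1, math.gamma(1.0 + 1.0 / delta))   # ≈ Gamma(1.5) ≈ 0.8862
print(m2, math.gamma(1.0 + 2.0 / delta))   # ≈ Gamma(2) = 1
```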
Figure 10.3.3. The pdfs h3,2 (solid), h3,5 (dash) and h3,10 (dot-dash).
10.5. DOMAIN OF ATTRACTION OF EXTREME VALUE DISTRIBUTIONS FOR X1,n

10.5.1. Domain of Attraction for the Type 1 Extreme Value Distribution for X1,n

The following theorem gives a necessary and sufficient condition for the convergence of X1,n to H10(x).

Theorem 10.3.1. Let X1, X2, ... be a sequence of i.i.d. random variables with distribution function F. Assume further that E(X | X ≤ t) is finite for some t > α(F), and put h(t) = E(t - X | X ≤ t). Then F ∈ D(H10) if and only if

lim_{t→α(F)} F(t + x h(t)) / F(t) = e^{x} for all real x.
Proof. Similar to Theorem 10.2.1.

Example 10.3.1. Suppose the logistic distribution with F(x) = 1/(1 + e^{-x}), -∞ < x < ∞. Now

h(t) = E(t - X | X ≤ t) = t - (1 + e^{-t}) ∫_{-∞}^{t} x e^{x} (1 + e^{x})^{-2} dx = (1 + e^{-t}) ln(1 + e^{t}).

It can easily be shown that h(t) → 1 as t → -∞. We have

lim_{t→α(F)} F(t + x h(t)) / F(t) = lim_{t→-∞} [e^{t + x h(t)} / (1 + e^{t + x h(t)})] · [(1 + e^{t}) / e^{t}] = lim_{t→-∞} e^{x h(t)} (1 + e^{t}) / (1 + e^{t + x h(t)}) = e^{x}.
Thus X1,n from the logistic distribution belongs to the domain of attraction of H10.
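The two limits in Example 10.3.1 are easy to confirm numerically (an illustrative sketch; the evaluation points t and x = 1 are arbitrary):

```python
import math

# Numerical check of Example 10.3.1: for the logistic cdf
# F(x) = 1/(1 + e^{-x}), h(t) = (1 + e^{-t}) ln(1 + e^{t}) tends to 1 and
# F(t + x*h(t))/F(t) tends to e^x as t -> -infinity (here x = 1).
def F(x):
    return 1.0 / (1.0 + math.exp(-x))

def h(t):
    return (1.0 + math.exp(-t)) * math.log1p(math.exp(t))

x = 1.0
for t in (-10.0, -20.0, -30.0):
    ratio = F(t + x * h(t)) / F(t)
    print(t, h(t), ratio)    # h -> 1, ratio -> e ≈ 2.71828
```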
10.5.2. Domain of Attraction of the Type 2 Distribution for X1,n

The following theorem gives a necessary and sufficient condition for the convergence of X1,n to H2δ(x).

Theorem 10.3.2. Let X1, X2, ... be a sequence of i.i.d. random variables with distribution function F. Then F ∈ D(H2δ) if and only if α(F) = -∞ and

lim_{t→-∞} F(tx)/F(t) = x^{-δ} for all x > 0.

Proof. Suppose H2δ(x) = 1 - e^{-(-x)^{-δ}}, x ≤ 0, δ > 0. Then we have

lim_{t→-∞} F(tx)/F(t) = lim_{t→-∞} (1 - e^{-(-tx)^{-δ}}) / (1 - e^{-(-t)^{-δ}}) = lim_{t→-∞} (-tx)^{-δ} / (-t)^{-δ} = x^{-δ}, δ > 0.
Conversely, let lim_{t→-∞} F(tx)/F(t) = x^{-δ}, δ > 0. Put a_n = inf{x : F(x) ≥ 1/n}; then a_n → -∞ as n → ∞. Thus

lim_{n→∞} n F(a_n x) = lim_{n→∞} n F(a_n) · F(a_n x)/F(a_n) = x^{-δ} lim_{n→∞} n F(a_n).

It is easy to show that lim_{n→∞} n F(a_n) = 1. Thus lim_{n→∞} n F(a_n x) = x^{-δ}, and the proof follows from Lemma 10.3.1.

Example 10.3.2. For the Cauchy distribution, F(x) = 1/2 + (1/π) tan^{-1}(x). By L'Hospital's rule,

lim_{t→-∞} F(tx)/F(t) = lim_{t→-∞} x(1 + t^{2}) / (1 + (tx)^{2}) = x^{-1}.

Thus F belongs to the domain of attraction of H21.
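The Cauchy tail-ratio limit can be verified numerically (a sketch; the point x = 3 and the values of t are arbitrary choices):

```python
import math

# Numerical check of Example 10.3.2: for the Cauchy cdf
# F(x) = 1/2 + arctan(x)/pi, the ratio F(tx)/F(t) tends to x^{-1}
# as t -> -infinity (here x = 3).
def F(x):
    return 0.5 + math.atan(x) / math.pi

x = 3.0
for t in (-1e3, -1e5, -1e7):
    r = F(t * x) / F(t)
    print(t, r)              # -> 1/3 ≈ 0.3333
```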
10.5.3. Domain of Attraction of the Type 3 Distribution for X1,n

The following theorem gives a necessary and sufficient condition for the convergence of X1,n to H3δ(x).

Theorem 10.3.3. Let X1, X2, ... be a sequence of i.i.d. random variables with distribution function F. Then F ∈ D(H3δ) if and only if α(F) is finite and

lim_{t→0+} F(α(F) + tx)/F(α(F) + t) = x^{δ}, δ > 0, for all x > 0.

Here α(F) = inf{x | F(x) > 0}.

Proof. The proof is similar to that of Theorem 10.3.2.

Example 10.3.3. Suppose X has the uniform distribution with F(x) = x, 0 < x < 1. Then α(F) = 0 and

lim_{t→0+} F(tx)/F(t) = x. Thus F ∈ D(H31).
If α(F) is the same for two distribution functions F(x) and G(x) and

lim_{x→α(F)} F(x)/G(x) = c, where 0 < c < ∞,

then the normalized distributions of X1,n from these two distributions will converge to the same limiting distribution. If c = 1, then the normalizing constants of X1,n for these two distributions can be taken to be the same.

Example. Consider the exponential distribution with F(x) = 1 - e^{-x}, 0 ≤ x < ∞, and the uniform distribution with G(x) = x, 0 < x < 1. We have lim_{x→0} F(x)/G(x) = 1. Thus the normalized X1,n from the exponential and uniform distributions have the same limiting distribution H31, and they can be given the same normalizing constants.

Following Pickands (1975), the following theorem gives necessary and sufficient conditions for the domain of attraction of X1,n from a continuous distribution.

Theorem 10.3.4. For a continuous random variable, the necessary and sufficient conditions for X1,n to belong to the domain of attraction of an extreme value distribution of the minimum are

lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = 1 if F ∈ D(H10),
lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = 2^{1/δ} if F ∈ D(H2δ), and

lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = 2^{-1/δ} if F ∈ D(H3δ).

Example 10.3.4. For the logistic distribution with F(x) = 1/(1 + e^{-x}), we have F^{-1}(x) = ln x - ln(1 - x), and
lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = lim_{c→0} [ln c - ln(1 - c) - ln 2c + ln(1 - 2c)] / [ln 2c - ln(1 - 2c) - ln 4c + ln(1 - 4c)] = 1.
Thus the domain of attraction of X1,n from the logistic distribution is H10.

For the Cauchy distribution with F(x) = 1/2 + (1/π) tan^{-1}(x), we have F^{-1}(x) = tan(π(x - 1/2)) ≈ -1/(πx) for small x. Thus

lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = lim_{c→0} [-1/(πc) + 1/(2πc)] / [-1/(2πc) + 1/(4πc)] = 2 = 2^{1/1}.

Thus the domain of attraction of X1,n from the Cauchy distribution is H21.
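The three quantile-ratio limits of Theorem 10.3.4 can be evaluated directly (a sketch; the cutoff c = 1e-6 is an arbitrary small value):

```python
import math

# Numerical check of Theorem 10.3.4: as c -> 0 the quantile ratio
# (F^{-1}(c) - F^{-1}(2c)) / (F^{-1}(2c) - F^{-1}(4c)) tends to
# 1 for the logistic (H_10), 2 for the Cauchy (H_21, delta = 1) and
# 1/2 for the exponential (H_31, delta = 1).
def ratio(Finv, c):
    return (Finv(c) - Finv(2 * c)) / (Finv(2 * c) - Finv(4 * c))

logistic = lambda p: math.log(p) - math.log(1.0 - p)
cauchy = lambda p: math.tan(math.pi * (p - 0.5))
expo = lambda p: -math.log(1.0 - p)

c = 1e-6
print(ratio(logistic, c))   # -> 1
print(ratio(cauchy, c))     # -> 2
print(ratio(expo, c))       # -> 0.5
```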
For the exponential distribution E(0, σ) with pdf f(x) = σ^{-1} e^{-x/σ}, x > 0, we have F^{-1}(x) = -σ ln(1 - x) and

lim_{c→0} [F^{-1}(c) - F^{-1}(2c)] / [F^{-1}(2c) - F^{-1}(4c)] = lim_{c→0} [ln(1 - c) - ln(1 - 2c)] / [ln(1 - 2c) - ln(1 - 4c)] = 2^{-1}.

Thus the domain of attraction of X1,n from the exponential distribution E(0, σ) is H31.

We can use the following lemma to calculate the normalizing constants for various distributions belonging to the domains of attraction of H(x).

Lemma 10.3.2. Suppose P(X1,n < c_n + d_n x) → H(x) as n → ∞. Then

(i) c_n = F^{-1}(1/n), d_n = F^{-1}(1/n) - F^{-1}(1/(ne)) if H(x) = H10(x);

(ii) c_n = 0, d_n = -F^{-1}(1/n) if H(x) = H2δ(x);

(iii) c_n = α(F), d_n = F^{-1}(1/n) - α(F) if H(x) = H3δ(x).
We have seen (Lemma 10.1.6) that the normalizing constants are not unique for X_{n,n}. The same is also true for X1,n.

Example 10.3.5. For the logistic distribution with F(x) = 1/(1 + e^{-x}), X1,n when normalized converges in distribution to the Type 1 (H10) distribution. The normalizing constants are

c_n = F^{-1}(1/n) = ln[(1/n)/(1 - 1/n)] = -ln(n - 1) and d_n = F^{-1}(1/n) - F^{-1}(1/(ne)) = ln[(ne - 1)/(n - 1)] → 1.

For the Cauchy distribution with F(x) = 1/2 + (1/π) tan^{-1}(x), X1,n when normalized converges in distribution to the Type 2 (H21) distribution. The normalizing constants are

c_n = 0 and d_n = -F^{-1}(1/n) = tan(π(1/2 - 1/n)) ≈ n/π, n ≥ 2.

For the uniform distribution with F(x) = x, X1,n when normalized converges in distribution to the Type 3 (H31) distribution. The normalizing constants are

c_n = 0, d_n = F^{-1}(1/n) = 1/n.
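The uniform case is easy to check by simulation (a sketch; the sample size, replication count and seed are arbitrary choices):

```python
import math
import random

# Simulation sketch of the uniform-minimum normalization c_n = 0, d_n = 1/n:
# n * X_{1,n} should be approximately H_31-distributed with delta = 1,
# i.e. P(n X_{1,n} <= x) -> 1 - e^{-x}.
random.seed(5)
n, reps = 500, 4000
hits = sum(
    1 for _ in range(reps)
    if n * min(random.random() for _ in range(n)) <= 1.0
)
frac = hits / reps
print(frac, 1.0 - math.exp(-1.0))   # both ≈ 0.632
```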
Table 10.3. Normalizing Constants for X1,n

Distribution | f(x) | c_n | d_n | Domain
Beta | c x^{α-1}(1 - x)^{β-1}, c = 1/B(α, β) = Γ(α + β)/(Γ(α)Γ(β)), 0 < x < 1, α > 0, β > 0 | 0 | (α/(cn))^{1/α} | H3α
Chibisov also stated the necessary and sufficient conditions for the above weak convergence. Smirnov (1967) specified the appropriate normalizing constants for the above convergence. The following theorem is due to Galambos (1995, p. 64).

Theorem 10.4.3. There are only three types of limiting distributions of the suitably normalized sequence, where F is a cdf and the normalizing constants are suitable constants.
Chapter 11
RANDOM FILLING OF A SEGMENT WITH UNIT INTERVALS

11.1. RANDOM FILLING. CONTINUOUS CASE

The task of randomly filling a segment with intervals has a long history. The beginning was laid in Renyi's work "On a one-dimensional problem concerning space-filling," published in 1958, in which he investigated the asymptotic behavior of the average number of randomly placed intervals of unit length on a long segment. Later, in 1964, the work of Dvoretzky and Robbins and of other authors continued the study of the distributional properties of the number of placed intervals.

The initial statement of the problem is as follows. On a segment [0, x], if x ≥ 1, an interval of unit length is randomly placed and occupies the position (t, t + 1). The expression "randomly" means that the beginning t of the placed interval is a random variable with the uniform distribution law on the segment [0, x - 1]. After placing the first interval, two unoccupied segments are formed, on which in turn unit intervals are placed, independently of each other and also with the uniform distribution law on the corresponding segments. This filling process continues until all
distances between the placed unit intervals become less than one. The number of placed intervals is denoted by ν(x); if 0 ≤ x < 1, then ν(x) = 0.

We will be interested in the properties of the distribution of the random variable ν(x); in particular, we will study the asymptotic behavior of the mathematical expectation and of the moments of higher orders of this random variable as the length x of the segment increases. In Renyi's work this statement of the problem found the following interpretation. On a street of length x, cars of unit length are randomly parked in accordance with the rule described above. Then ν(x) means the number of parked cars. This interpretation gave the problem its name: the "parking" problem.

In his work, Renyi showed that

E ν(x) = λ x + O(x^{-s})   (11.1.1)

for any s > 0. Renyi also obtained an expression for the constant λ, whose value is expressed by the integral
λ = ∫_0^∞ exp{-2 ∫_0^t (1 - e^{-v})/v dv} dt ≈ 0.748.   (11.1.2)

In 1964, Dvoretzky and Robbins obtained a refinement of relation (11.1.1) and also investigated the asymptotic behavior of the higher-order moments of the random variable ν(x). They showed that for the mathematical expectation, the relation
E ν(x) = λ x + λ - 1 + O((2e/x)^{x-3})   (11.1.3)

holds, and for the variance they proved that there exists a constant σ^2 > 0 such that

D ν(x) = σ^2 x + o(x), x → ∞.   (11.1.4)

The same authors proved the asymptotic normality of the random variable ν(x), namely, they showed that

(ν(x) - E ν(x)) / (D ν(x))^{1/2} converges in distribution to N(0, 1) as x → ∞.

Other authors studied more general formulations of the problem, in which the placed intervals may in turn have a random length, or their random arrangement may differ from uniform.
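The constant in (11.1.2) can be evaluated numerically. The sketch below (an illustration; the truncation point T, Simpson step counts and the use of the large-t asymptotics are my own choices) uses the fact that the inner integral equals γ + ln t + o(1) for large t, so the outer integrand decays like e^{-2γ}/t^2 and the tail beyond T can be added analytically:

```python
import math

# Numerical evaluation of the Renyi constant (11.1.2),
#   lambda = int_0^inf exp{-2 int_0^t (1 - e^{-v})/v dv} dt,
# via composite Simpson quadrature plus an analytic tail e^{-2*gamma}/T.
GAMMA = 0.5772156649015329      # Euler's constant

def simpson(f, a, b, m):
    # composite Simpson's rule with m (even) subintervals
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def inner(t):
    if t == 0.0:
        return 0.0
    f = lambda v: 1.0 if v == 0.0 else (1.0 - math.exp(-v)) / v
    return simpson(f, 0.0, t, 400)

g = lambda t: math.exp(-2.0 * inner(t))
T = 30.0
lam = simpson(g, 0.0, T, 600) + math.exp(-2.0 * GAMMA) / T  # tail correction
print(round(lam, 4))   # ≈ 0.7476
```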
11.2. DISCRETE VERSION OF THE PARKING PROBLEM

Recently, interest in the parking problem has revived. There are works, for example, in which discrete analogues of this problem are considered. Here we describe one model of a discrete analogue of the parking problem, which has been called the selfish parking problem.

Let n ≥ 0 be an integer. We will place an open interval (X, X + 1) of unit length on the segment [0, n], where X is a random variable taking each of its admissible integer values with equal probability. If no admissible value exists, we say that the interval does not fit. After placing the first interval, two free segments [0, X] and [X + 1, n] are formed, which are filled with intervals of unit length according to the same rule, etc.
The interpretation of this filling process can be summarized as follows. In a marked parking lot, cars of the same length stop at random, and each next car stops so that, immediately after its parking, there is a free parking space on at least one side. In particular, if at some point the place next to an edge place turns out to be occupied while the edge place is still free, then in the future no car can park at that edge place. Moreover, the next car can park so that some previously parked car becomes blocked, that is, the parking spaces next to the previously stopped car become occupied. This explains the appearance of the word "selfish" in the name of the model.

At the end of the process of filling the segment [0, n] with unit intervals, the distance between any two adjacent intervals is no more than one. Let ν_n denote the number of unit intervals placed. We will be primarily interested in the mathematical expectation and the higher-order moments of the random variable ν_n. We prove the following theorem, which describes the behavior of the first three moments of ν_n.

Theorem 11.2.1. For the model described above,
if
(11.2.1)
if
(11.2.2)
if
and
(11.2.3)
if
(11.2.4)
if
(11.2.5)
We give a simple proof of this theorem. We introduce the notation ν_n^{(i)} for the number of unit intervals placed on the segment [0, n], provided that the first interval occupies the place (i, i + 1). Then the equality

(11.2.6)

holds, where the summands on its right-hand side are independent random variables. Considering that the place of the first interval is a random variable taking its admissible values with equal probabilities, we obtain from (11.2.6) the relation

(11.2.7)

Insofar as
and
then for the first values
we have:
and
(11.2.8)
We find a solution for relation (11.2.7) with initial conditions (11.2.8). We use the following notation
Then equality (11.2.7) can be represented as
or
The last equality is more convenient for us to write in the following form
(11.2.9)

Let

and

Then from (11.2.8) we obtain
Since
then
In this way
(11.2.10) From equalities (11.2.10) and
follow the equalities
We notice, that
which implies equality
Given the last relations for
Since
for all
we get
then statement (11.2.1) is finally proved.
Next, we denote by μ_k(n) the central moment of order k of the random variable ν_n, where k is a nonnegative integer. We notice that
Therefore, taking into account equality (11.2.6) and the independence of random variables
and
in this equality we obtain
=
= For
the last equality is as follows
Note that if
then
We solve the relation
(11.2.11) under initial conditions and
(11.2.12)
Using the notation
we get
or
Then
Applying this result to
and
we get
(11.2.13) and
(11.2.14) The quantities
and
are calculated directly, because we know
the distribution law of a random variable
:
Substituting these values into (11.2.13) and (11.2.14), we obtain (11.2.3), (11.2.4) and (11.2.5).

Comment. We saw that in the continuous case of filling a long segment with unit intervals, the occupied fraction of the segment averages about 0.748, while in the discrete case considered here this fraction is approximately 2/3.
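The continuous-case figure of about 0.748 quoted in the Comment can be reproduced by direct simulation of Renyi's filling process (a sketch; the segment length x, replication count and seed are arbitrary choices):

```python
import random

# Monte Carlo sketch of Renyi's continuous filling process: unit intervals
# are dropped uniformly at random into each remaining gap of [0, x]; the
# mean occupied fraction E nu(x)/x is close to 0.748 for moderately large x.
random.seed(6)

def fill(length):
    # number of unit intervals packed into a gap of the given length
    if length < 1.0:
        return 0
    t = random.uniform(0.0, length - 1.0)     # left end of the new interval
    return 1 + fill(t) + fill(length - t - 1.0)

x, reps = 200.0, 400
avg = sum(fill(x) for _ in range(reps)) / reps
print(avg / x)    # ≈ 0.748
```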
APPENDIX

A.1. CAUCHY'S FUNCTIONAL EQUATIONS

Let g(x) be a finite measurable real function (with respect to Lebesgue measure) on the finite axis. We present here the following four Cauchy functional equations.

1. g(xy) = g(x) + g(y), x, y > 0
2. g(x + y) = g(x) g(y)
3. g(x + y) = g(x) + g(y)
4. g(xy) = g(x) g(y)

The solutions are (see Aczel (1966) for details):

1. g(x) = c ln x, where c is a constant
2. g(x) = e^{cx}, where c is a constant
3. g(x) = cx, where c is a constant
4. g(x) = x^{c}, where c is a constant.
A.2. LEMMAS

Assumption A. Suppose the random variable X is absolutely continuous with cumulative distribution function F(x) and probability density function f(x). We assume that α = inf{x | F(x) > 0} and β = sup{x | F(x) < 1}. We also assume that f(x) is differentiable for all x, α < x < β, and that E(X) exists.

Lemma A.1. Under Assumption A, if E(X | X ≤ x) = g(x) τ(x), where τ(x) = f(x)/F(x) and g(x) is a continuous differentiable function of x with the condition that ∫_α^x (u - g'(u))/g(u) du is finite for x > α, then

f(x) = c e^{∫ (x - g'(x))/g(x) dx},

where c is a constant determined by the condition ∫_α^β f(x) dx = 1.

Lemma A.2. Under Assumption A, if E(X | X ≥ x) = h(x) r(x), where r(x) = f(x)/(1 - F(x)) and h(x) is a continuous differentiable function of x, α < x < β, with the condition that ∫_x^β (u + h'(u))/h(u) du is finite for α < x < β, then

f(x) = c e^{-∫ (x + h'(x))/h(x) dx},

where c is a constant determined by the condition ∫_α^β f(x) dx = 1. For the proofs, see Ahsanullah (2017).
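Lemma A.1 is easy to verify on a concrete case (an illustrative sketch, not part of the appendix). For the unit exponential pdf one can compute g explicitly and confirm that the lemma's exponent is constant:

```python
import math

# Numerical check of Lemma A.1 for the unit exponential pdf f(x) = e^{-x}:
# here E(X | X <= x) = g(x) tau(x) with tau = f/F and g(x) = e^x - 1 - x,
# so (x - g'(x))/g(x) = -1 and the lemma recovers f(x) = c e^{-x}.
f = lambda x: math.exp(-x)
F = lambda x: 1.0 - math.exp(-x)
g = lambda x: math.exp(x) - 1.0 - x
gp = lambda x: math.exp(x) - 1.0               # g'(x)

for x in (0.5, 1.0, 3.0):
    # closed form: int_0^x u e^{-u} du = 1 - (1 + x) e^{-x}
    lhs = (1.0 - (1.0 + x) * math.exp(-x)) / F(x)   # E(X | X <= x)
    rhs = g(x) * f(x) / F(x)                        # g(x) * tau(x)
    print(x, lhs, rhs, (x - gp(x)) / g(x))          # last column equals -1
```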
REFERENCES

Abramowitz, M. and Stegun, I. (1972). Handbook of Mathematical Functions. Dover, New York, NY.
Aczel, J. (1966). Lectures on Functional Equations and Their Applications. Academic Press, New York, NY.
Ahsanullah, M. (1977). A characteristic property of the exponential distribution. Ann. of Statist.
Ahsanullah, M. (2004). Record Values - Theory and Applications. University Press of America, Lanham, MD.
Ahsanullah, M. (1989). Characterizations of normal distribution by properties of linear statistics and chi-square. Pakistan Journal of Statistics, Vol. 5(3), 267-276.
Ahsanullah, M. (2015). Some inferences of arcsine distribution. Moroccan J. Pure and Appl. Anal., vol. 1(2), 70-75.
Ahsanullah, M. (2017). Characterizations of Univariate Continuous Distributions. Atlantis Press, Paris, France.
Ahsanullah, M. (2019). Characterizations of Exponential Distribution by Ordered Random Variables. Applied Statistical Science, Nova Publishers, New York, USA.
Ahsanullah, M. and Alzaatreh, A. (2018). Some characterizations of log-logistic distribution. Stochastics and Quality Control, 23-29.
Ahsanullah, M. and Bairamov, I. (2000). A note on the characterizations of the uniform distribution. Statisk, 2(2), 145-151.
Ahsanullah, M., Ghitany, M. E. and Al-Mutairi (2017). Communications in Statistics, vol. 46, no. 12, 6222-6227.
Ahsanullah, M., Hamedani, G. G. and Shakil, M. (2010a). On record values of univariate exponential distributions. Journal of Statistical Research, vol. 22, No. 2, 267-288.
Ahsanullah, M., Hamedani, G. G. and Shakil, M. (2010b). On record values of univariate exponential distributions. Journal of Statistical Research, vol. 22, No. 2, 267-288.
Ahsanullah, M. and Holland, B. (1994). On the use of record values to estimate the location and scale parameters of the generalized extreme value distribution. Sankhya, 56, A, 480-499.
Ahsanullah, M. and Kirmani, S. N. U. A. (1991). Characterizations of the exponential distribution by lower record values. Comm. Statist. Theory-Methods, 20(4), 1293-1299.
Ahsanullah, M. and Malov, S. On some characterizations via distributional properties of records. Journal of Statistical Theory and Applications, Vol. 3, 2, 135-140.
Ahsanullah, M. and Nevzorov, V. B. (1996a). Distributions of order statistics generated by records. Zapiski Nauchn. Semin. POMI, 228, 24-30 (in Russian). English transl. in J. Math. Sci.
Ahsanullah, M. and Nevzorov, V. B. (1997). On limit relation between order statistics and records. Zapiski Nauchnyh Seminarov POMI (Notes of Sci. Seminars of POMI), v. 244, 218-226 (in Russian).
Ahsanullah, M. and Nevzorov, V. B. (1999). Spacings of order statistics from extended samples. 259-267.
Ahsanullah, M. and Nevzorov, V. B. (2000). Some distributions of induced records. Biometrical Journal, 42, 153-165.
Ahsanullah, M. and Nevzorov, V. B. (2001a). Ordered Random Variables. New York, NY: Nova Science Publishers Inc.
Ahsanullah, M. and Nevzorov, V. B. (2001b). Distribution between uniform and exponential. In: M. Ahsanullah, J. Kenyon and S. K. Sarkar, eds., Applied Statistical Science IV, 9-20.
Ahsanullah, M. and Nevzorov, V. B. (2004). Characterizations of distributions by regressional properties of records. J. Appl. Statist. Science, vol. 13, N1, 33-39.
Ahsanullah, M. and Nevzorov, V. B. (2005). Order Statistics. Examples and Exercises. New York, NY: Nova Science Publishers.
Ahsanullah, M. and Nevzorov, V. B. (2011). Record statistics. International Encyclopedia of Statistical Science, part 18, 1195-1202.
Ahsanullah, M. and Nevzorov, V. B. (2014). Some inferences on the Levy distribution. Journal of Statistical Theory and Applications, Vol. 13, no. 3, 205-211.
Ahsanullah, M. and Nevzorov, V. B. (2017). Some characterizations of skew t-distribution with 2 degrees of freedom. Journal of Statistical Theory and Applications, vol. 16(1), 419-426.
Ahsanullah, M. and Nevzorov, V. B. (2019). Some characterizations of skew t-distribution with three degrees of freedom. Calcutta Statistical Association Bulletin, 1-9.
Ahsanullah, M., Nevzorov, V. B. and Shakil, M. (2013). An Introduction to Order Statistics. Paris, France: Atlantis Press.
Ahsanullah, M., Nevzorov, V. B. and Yanev, G. P. (2010). Characterizations of distributions via order statistics with random exponential shift. Journal of Applied Statistical Science, 18, 3, 297-305.
Ahsanullah, M. and Shakil, M. (2011a). On record values of Rayleigh distribution. International Journal of Statistical Sciences, Vol. 11 (special issue), 111-123.
Ahsanullah, M. and Shakil, M. (2012). A note on the characterizations of Pareto distribution by upper record values. Commun. Korean Mathematical Society, 27(4), 835-842.
Ahsanullah, M., Shakil, M. and Golam Kibria, B. M. (2016). Characterizations of continuous distributions by truncated moment. Journal of Modern Applied Statistical Methods, Vol. 15, issue 1, 316-331.
Ahsanullah, M. and Wesolowski, J. (1998). Linearity of non-adjacent record values. Sankhya, B, 231-237.
Ahsanullah, M., Yanev, G. P. and Onica, C. (2012). Characterizations of logistic distribution through order statistics with independent exponential shifts. Economic Quality Control, 27(1), 85-96.
Akhundov, I. and Nevzorov, V. (2006). Conditional distributions of record values and characterizations of distributions. Probability and Statistics, 10 (Notes of Sci. Semin. POMI, v. 339), 5-14.
Akhundov, I. and Nevzorov, V. (2008). Characterizations of distributions via bivariate regression on differences of records. Records and Branching Processes. Nova Science Publishers, 27-35.
Ananjevskii, S. M. (1999). The "parking" problem for segments of different length. Journal of Mathematical Sciences, Vol. 93, 259-264. https://doi.org/10.1007/BF02364808.
Ananjevskii, S. M. (2016). Some generalizations of "parking" problem. Vestnik St. Petersburg University, Ser. 1, issue 4, 525-532. https://doi.org/10.3103/S1063454116040026.
Ananjevskii, S. M. and Kryukov, N. A. (2018). The problem of selfish parking. Vestnik St. Petersburg University: Mathematics, Volume 51, issue 4, 322-326. https://doi.org/10.21638/11701/spbu01.2018.402.
Arnold, B. C., Balakrishnan, N. and Nagaraja, H. N. (1998). Records. John Wiley & Sons Inc., New York, NY.
Arnold, B. C. and Villasenor, J. A. (2013). Exponential characterizations motivated by the structure of order statistics in samples of size two. Statist. Probab. Lett., 83, 2, 596-601.
Azlarov, T. A. and Volodin, N. A. (1986). Characterization Problems Associated with the Exponential Distribution. Springer-Verlag, New York, NY (original Russian appeared in 1982, Tashkent).
Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian J. Statistics, 12, 171-178.
Azzalini, A. (2014). The Skew-Normal and Related Families. Cambridge University Press, Cambridge, United Kingdom.
Bairamov, I. G. (1997). Some distribution free properties of statistics based on record values and characterizations of the distributions through a record. J. Applied Statist. Science, 5, no.
1, 17-25.
Bairamov, I. G. and Ahsanullah, M. (2000). Distributional relations between order statistics and the sample itself and characterizations of exponential distribution. J. Appl. Statist. Sci., Vol. 10(1), 1-16.
Bairamov, I. G., Ahsanullah, M. and Pakes, A. G. (2005). A characterization of continuous distributions via regression on pairs of record values. Australian and New Zealand J. Statistics, Vol. 47(4), 243-247.
Balakrishnan, N., Ahsanullah, M. and Chan, P. S. (1995). On the logistic record values and associated inference. J. Appl. Statist. Science, 2, no. 3, 233-248.
Bernoulli, J. (1713). Ars Conjectandi. Basel: Thurnisiorum. English translation by E. D. Sylla, 2006.
Chandler, K. N. (1952). The distribution and frequency of record values. J. Roy. Statist. Soc., Ser. B, 14, 220-228.
Cheng, S. K. (2007). Characterizations of Pareto distribution by the independence of record values. Chengchungang Mathematical Society, 20(1), 51-57.
Chibisov, D. M. (1964). On limit distributions of order statistics. Theory Probab. Appl., 9, 142-148.
Cramer, H. (1936). Über eine Eigenschaft der normalen Verteilungsfunktion. Mathematische Zeitschrift, 41, 405-413. [On a property of the normal distribution function]
David, H. A. and Nagaraja, H. N. (2003). Order Statistics. Third Edition, Wiley, Hoboken, New Jersey.
Deheuvels, P. (1983). The complete characterization of the upper and lower class of record and inter-record times of i.i.d. sequence. Zeit. Wahrscheinlichkeitstheorie Verw. Geb., 62, 1-6.
Deheuvels, P. (1984a). The characterization of distributions by order statistics and record values - a unified approach. J. Appl. Probability, 21, 326-334. Correction in: J. Appl. Probability, 22 (1985), 997.
Dembińska, A. and Wesolowski, J. (1998). Linearity of regression for non-adjacent order statistics. Metrika, 48, 215-222.
Dembińska, A. and Wesolowski, J. (2000). Linearity of regression for non-adjacent record values. J. of Statistical Planning and Inference, 90, 195-205.
Darmois, G. Sur diverses propriétés caractéristiques de la loi de Laplace-Gauss. Bulletin de l'Institut International de Statistique, 33(2), 79-82. [On various characteristic properties of the Laplace-Gauss law]
Desu, M. M. (1971). A characterization of the exponential distribution by order statistics. Ann. Math. Statist., 42, 837-838.
Dvoretzky, A. and Robbins, H. (1964). On the "parking" problem. Publ. of the Math. Inst. of Hungarian Acad. of Sciences, Vol. 9, 209-226.
Feller, W. (1957). An Introduction to Probability Theory and Its Applications, Vol. I, 2nd Edition. Wiley & Sons, New York, NY.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. Wiley & Sons, New York, NY.
Ferguson, T. S. (1964). A characterization of the negative exponential distribution. Ann. Math. Statist., 35, 1199-1207.
Ferguson, T. S. (1967). On characterizing distributions by properties of order statistics. Sankhya, Ser. A, 29, 265-277.
Fisher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Cambridge Phil. Soc., 24, 180-190.
Fisz, M. (1958). Characterizations of some probability distributions. Skand. Aktuarietidskr., 65-67.
Franco, M. and Ruiz, J. M. (1997). On characterizations of distributions by expected values of order statistics and record values with gap. Metrika, 45, 107-119.
Galambos, J. (1987). The Asymptotic Theory of Extreme Order Statistics. Robert E. Krieger Publishing Co., Malabar, FL.
Galambos, J. and Kotz, S. (1978). Characterizations of Probability Distributions. Lecture Notes in Mathematics, Vol. 675, Springer-Verlag, New York, NY.
Gnedenko, B. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math., 44, 423-453.
Gumbel, E. J. (1958). Statistics of Extremes. Columbia University Press, New York, USA.
Gupta, R. C. and Ahsanullah, M. (2004). Characterization results based on the conditional expectation of a function of non-adjacent order statistic (record value). Annals of Statistical Mathematics, Vol. 56(4), 721-732.
Hamedani, G. G., Ahsanullah, M. and Sheng, R. (2008). Characterizations of certain continuous univariate distributions based on the truncated moment of the first order statistic. Aligarh Journal of Statistics, 28, 75-81.
Horowitz, J. (1980). Extreme values from a nonstationary stochastic process: an application in air quality analysis. Technometrics, 22, 468-482.
Huygens, Christiaan. The Value of Chances in Games of Fortune. English translation, 1714. https://math.dartmouth.edu/~doyle/docs/huygens/huygens.pdf
Jenkinson, A. F. (1955). The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Quarterly Journal of the Royal Meteorological Society, 81, 158-171.
Kagan, A. M., Linnik, Y. V. and Rao, C. R. (1973). Characterization Problems in Mathematical Statistics. Wiley, New York, USA.
Kagan, A. M. and Salaevskii, O. V. (1967). Characterization of the normal law by a property of the noncentral chi-square distribution. Litovskii Matem. Sbornik, VII, 57-58.
Kagan, A. M. and Wesolowski, J. (2000). An extension of the Darmois-Skitovitch theorem to a class of dependent variables. Statistics and Probability Letters, vol. 47, 69-73.
Kakosyan, A. V., Klebanov, L. B. and Melamed, J. A. (1984). Characterization of Distributions by the Method of Intensively Monotone Operators. Lecture Notes in Math. 1088, Springer-Verlag, New York, NY.
Kirmani, S. N. U. A. and Beg, M. I. (1984). On characterization of distributions by expected records. Sankhya, Ser. A, 46, no. 3, 463-465.
Klebanov, L. B. and Melamed, J. A. (1983). A method associated with characterizations of the exponential distribution. Ann. Inst. Statist. Math., A, 35, 105-114.
Kolmogorov, A. N. (2019). Foundations of the Theory of Probability, Second Edition. Translated by Nathan Morrison. Dover Publications, New York, USA.
Kotz, S. and Nadarajah, S. (2019). Extreme Value Distributions, Theory and Applications. Imperial College Press, London, U.K.
Laha, R. G. and Lukacs, E. (1960). On the characterizations of normal distribution by the independence of a sample central moment and the sample mean. Annals of Mathematical Statistics, vol. 31, 1628-1633.
Leadbetter, M. R., Lindgren, G. and Rootzen, H. (1982). Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York.
Lin, G. D. (1987). On characterizations of distributions via moments of record values. Probab. Th. Rel. Fields, 74, 479-483.
Lopez-Blaquez, F. and Moreno-Rebollo, J. L. (1997). A characterization of distributions based on linear regression of order statistics and random variables. Sankhya, Ser. A, 59, 311-323.
Lukacs, E. (1942). A characterization of the normal distribution. Annals of Mathematical Statistics, vol. 13, 91-93.
Mohan, N. R. and Nayak, S. S. (1982). A characterization based on the equidistribution of the first two spacings of record values. Z. Wahr. Verw. Geb., 60, 219-221.
Nagaraja, H. N. (1977). On a characterization based on record values. Austral. J. Statist., 19, 70-73.
Nagaraja, H. N. (1978). On the expected values of record values. Austral. J. Statist., 20, 176-182.
Nagaraja, H. N. (1988b). Some characterizations of continuous distributions based on regression of adjacent order statistics of random variables. Sankhya, Ser. A, 50, 70-73.
Nagaraja, H. N. and Nevzorov, V. B. (1997). On characterizations based on record values and order statistics. J. of Statist. Plann. and Inference, v. 63, 271-284.
Nevzorov, V. B. (1984a). Representations of order statistics, based on exponential variables with different scaling parameters. Zap. Nauchn. Sem. Leningrad, 136, 162-164. English translation (1986), J. Soviet Math., 33, 797-798.
Nevzorov, V. B. (1984b). Record times in the case of nonidentically distributed random variables. Theory Probab. and Appl., v. 29, 808-809.
Nevzorov, V. B. (1985). Record and inter-record times for sequences of nonidentically distributed random variables. Zapiski Nauchn. Semin. LOMI, 142, 109-118 (in Russian). Translated version in J. Soviet Math., 36 (1987), 510-516.
Nevzorov, V. B. (1986a). Two characterizations using records. Stability Problems for Stochastic Models (V. V. Kalashnikov, B. Penkov and V. M. Zolotarev, Eds.), Lecture Notes in Math. 1233, 79-85. Berlin: Springer-Verlag.
Nevzorov, V. B. (1986c). Record times and their generalizations. Theory Probab. and Appl., v. 31, 629-630.
Nevzorov, V. B. (1987). Moments of some random variables connected with records. Vestnik of the Leningrad Univ., 8, 33-37 (in Russian).
Nevzorov, V. B. (1988). Records. Theo. Prob. Appl., 32, 201-228.
Nevzorov, V. B. (1988). Centering and normalizing constants for extrema and records. Zapiski Nauchnyh Seminarov LOMI (Notes of Sci. Seminars of LOMI), v. 166, 103-111 (in Russian). English version: J. Soviet Math., v. 52 (1990), 2830-2833.
Nevzorov, V. B. (1992). A characterization of exponential distributions by correlation between records. Mathematical Methods of Statistics, 1, 49-54.
Nevzorov, V. B. (2001). Records: Mathematical Theory. Translations of Mathematical Monographs, Volume 194. American Mathematical Society, Providence, RI, 164 p.
Nevzorov, V. B., Balakrishnan, N. and Ahsanullah, M. (2003). Simple characterizations of Student's t2-distribution. The Statistician, 52(3), 395-400.
Nevzorov, V. B. and Saghatelyan, V. (2009). On one new model of records. Proceedings of the Sixth St. Petersburg Workshop on Simulation, 2, 981-984.
Nevzorov, V. B. and Stepanov, A. V. (2014). Records with confirmations. Statistics and Probability Letters, 95, 39-47.
Nevzorova, L. N. and Nevzorov, V. B. (1999). Ordered random variables. Acta Appl. Math., 58, no. 1-3, 217-219.
Nevzorova, L. N., Nevzorov, V. B. and Balakrishnan, N. (1997). Characterizations of distributions by extreme and records in Archimedean copula processes. In Advances in the Theory and Practice of Statistics: A Volume in Honor of Samuel Kotz (N. L. Johnson and N. Balakrishnan, Eds.), 469-478. John Wiley and Sons, New York, NY.
Noor, Z. E. and Athar, H. (2014). Characterizations of probability distributions by conditional expectations of record statistics. Journal of the Egyptian Mathematical Society, 22, 275-279.
Oncei, S. Y., Ahsanullah, M., Gebizlioglu, O. I. and Aliev, F. A. (2001). Characterization of geometric distribution through normal and weak record values. Stochastic Modelling and Applications, 4(1), 53-58.
Pearson, K. (1905). The problem of the random walk. Nature, 72, 294.
Pickands, J. (1975). Statistical inference using extreme order statistics. Ann. Statist., 3, 119-131.
Puri, P. S. and Rubin, H. (1970). A characterization based on the absolute difference of two i.i.d. random variables. Ann. Math. Statist., 41, 2113-2122.
Rao, C. R. and Shanbhag, D. N. (1995a). Characterizations of the exponential distribution by equality of moments. Allg. Statist. Archiv, 76, 122-127.
Raqab, M. Z. (2002). Characterizations of distributions based on conditional expectations of record values. Statistics and Decisions, 20, 309-319.
Raqab, M. Z. and Ahsanullah, M. (2003). On moment generating functions of records from extreme value distributions. Pak. J. Statist., 19(1), 1-13.
Renyi, A. (1958a). On a one-dimensional problem concerning random space filling. Publ. Math. Inst. Hung. Acad. Sci., 3, 109-127.
Renyi, A. (1958b). On a one-dimensional problem concerning space filling. Publ. Math. Inst. Hung. Acad. Sci., 3, 109-127.
Renyi, A. (1962). Theorie des elements saillants d'une suite d'observations [Theory of the salient elements of a sequence of observations]. Ann. Fac. Sci. Univ. Clermont-Ferrand, 8, 1-13.
Roberts, E. M. (1979). Review of statistics of extreme values with application to air quality data. Part II: Application. J. Air Pollution Control Assoc., 29, 733-740.
Rossberg, H. J. (1972). Characterization of the exponential and Pareto distributions by means of some properties of the distributions which the differences and quotients of order statistics are subject to. Math. Operationsforsch. Statist., 3, 207-216.
Sarhan, A. E. and Greenberg, B. G. (Eds.) (1962). Contributions to Order Statistics. Wiley, New York.
Shakil, M. and Ahsanullah, M. (2011). Record values of the ratio of Rayleigh random variables. Pakistan Journal of Statistics, 27(3), 307-325.
Skitovitch, V. P. (1954). Linear forms of independent random variables and the normal distribution law. Doklady Akademii Nauk SSSR, 18(2), 217-219.
Smirnov, N. V. (1952). Limit theorems for the sum of variational series. American Mathematical Society Translations, 67, 1-67.
Tata, M. N. (1969). On outstanding values in a sequence of random variables. Z. Wahrscheinlichkeitstheorie verw. Geb., 12, 9-20.
Von Mises, R. (1936). La distribution de la plus grande de n valeurs [The distribution of the largest of n values]. Reprinted (1954) in Selected Papers II, 271-294. American Mathematical Society, Providence, RI.
Wesolowski, J. and Ahsanullah, M. (1997). On characterizations of distributions via linearity of regression for order statistics. Aust. J. Statist., 39(1), 69-78.
Wesolowski, J. and Ahsanullah, M. (2000). Linearity of regression for non-adjacent weak records. Statistica Sinica, 11, 30-52.
Yanev, G. P. and Ahsanullah, M. (2009). On characterizations based on regression of linear combinations of record values. Sankhya, 71, Part 1, 100-121.
Yanev, G. P. and Ahsanullah, M. (2012). Characterizations of Student's t distribution via regression of order statistics. Statistics, 46(4), 429-435.
Yanev, G., Ahsanullah, M. and Beg, M. I. (2007). Characterizations of probability distributions via bivariate regression of record values. Metrika, 68, 51-64.
ABOUT THE AUTHORS

Valery B. Nevzorov is a Professor of Statistics at the Department of Mathematics and Mechanics, St. Petersburg State University, St. Petersburg, Russia. He earned his PhD from Leningrad State University, St. Petersburg, Russia. He has authored and coauthored 9 books and more than 200 papers in reputable journals. His areas of research include Record Values, Order Statistics, Combinatorial Methods, and Limit Theorems.
Mohammad Ahsanullah is a Professor Emeritus of Rider University, Lawrenceville, New Jersey, USA. He earned his PhD from North Carolina State University, Raleigh, North Carolina, USA. He has authored or coauthored more than 45 books and has published more than 350 papers in reputable journals. His areas of research include Record Values, Order Statistics, Characterizations, Distribution Theory, and Statistical Inference.
Sergei Ananjevskiy is an Associate Professor at the Department of Mathematics and Mechanics, St. Petersburg State University, St. Petersburg, Russia. He earned his PhD from Leningrad State University, St. Petersburg, Russia. He has coauthored a book and has published more than 20 papers in reputable journals. His areas of research include Probability Theory, Statistical Inference, and Distribution Theory.
INDEX

A
arcsine, 61, 62, 119, 120, 219
arcsine distribution, 61, 62, 119, 120, 219

B
beta, 62, 63, 121, 122, 191, 202
beta distribution, 62, 63, 121, 122

C
Cauchy, 33, 53, 54, 64, 80, 155, 188, 191, 199, 201, 202, 217
Cauchy distribution, 33, 53, 54, 64, 80, 188, 199, 201, 202
characteristic function, vii, 35, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 62, 63, 65, 66, 67, 68, 70, 71, 72, 73, 75, 76, 77, 78, 80, 82, 83, 84, 85, 169, 170
characterization, 119, 121, 123, 128, 132, 139, 140, 143, 147, 148, 152, 155, 156, 160, 173, 222, 223, 224, 225, 226, 227, 228, 229
chi-square distribution, 65, 123, 124, 225
combinatorial, 1, 6, 10, 11
combinatorics, vii, 1, 10
conditional probabilities, xi, 107

D
discrete random variables, 27, 35, 37, 38, 42, 46, 108
distributed random variables, 46, 52, 54, 93, 95, 97, 107, 113, 115, 116, 117, 128, 129, 149, 150, 151, 169, 171, 173, 175, 180, 194, 227
distribution, 2, 22, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 42, 43, 44, 45, 47, 48, 49, 50, 51, 52, 53, 54, 56, 58, 66, 69, 71, 76, 80, 81, 85, 89, 90, 91, 92, 93, 94, 96, 98, 99, 100, 101, 104, 105, 106, 108, 110, 111, 113, 114, 115, 116, 118, 119, 126, 128, 129, 134, 136, 137, 138, 139, 140, 141, 142, 143, 144, 147, 148, 149, 150, 152, 153, 155, 157, 158, 160, 162, 164, 165, 167, 169, 170, 171, 173, 175, 176, 177, 178, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 207, 208, 215, 218, 220, 221, 222, 223, 224, 225, 227, 228, 229, 230

E
exponential distribution, 51, 66, 99, 114, 140, 143, 150, 152, 154, 156, 167, 169, 171, 173, 178, 182, 186, 187, 188, 189, 194, 200, 201, 219, 220, 222, 223, 224, 225, 227, 228

G
gamma distribution, 52, 53, 68
geometric, 31, 36, 48, 118, 228

I
inverse Gaussian distribution, 69

L
Laplace distribution, 52, 70, 71
Levy, 71, 80, 125, 126, 221
Levy distribution, 71, 80, 125, 126, 221
Lindley distribution, 126
logistic, 74, 176, 182, 187, 188, 191, 198, 200, 201, 202, 203, 222, 223
logistic distribution, 74, 182, 188, 198, 200, 201, 202, 222
loglogistic, 72, 127, 219
loglogistic distribution, 72, 127, 219

M
mathematical schemes, ix
mean, 42, 47, 62, 63, 65, 67, 68, 69, 71, 72, 73, 74, 76, 78, 79, 81, 82, 84, 85, 86, 171, 173, 176, 177, 178, 179, 195, 196, 210, 226
moment generating function, 62, 63, 65, 67, 68, 70, 71, 72, 73, 75, 76, 77, 78, 79, 82, 83, 84, 177, 179, 195, 197, 228

N
NBU, 143, 165
normal, 33, 54, 55, 75, 80, 114, 128, 129, 176, 182, 183, 192, 203, 219, 222, 223, 225, 226, 228, 229
normal distribution, 33, 54, 55, 75, 128, 182, 183, 219, 223, 226, 229
NWU, 143, 165

O
order statistics, vii, x, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 103, 104, 105, 111, 113, 139, 142, 179, 204, 220, 221, 222, 223, 224, 226, 227, 228, 229

P
Pareto, 76, 100, 152, 154, 160, 184, 185, 189, 191, 192, 203, 221, 229
Pareto distribution, 76, 100, 152, 154, 160, 184, 185, 189, 221, 229
Poisson, 31, 36, 37, 38, 39, 40, 48
power function, 77, 132, 133, 152, 154, 192, 203
power function distribution, 77, 132, 133, 152, 154
probability distributions, vii, 28, 42, 47, 53, 61, 119, 139, 224, 228, 230
probability theory, i, iii, vii, ix, xiii, 3, 4, 6, 31, 33, 52, 54, 103, 175, 224

Q
quantile, 147

R
random variables, vii, 13, 22, 24, 26, 28, 29, 33, 37, 38, 39, 40, 42, 43, 44, 45, 49, 50, 52, 54, 56, 58, 59, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 96, 97, 98, 99, 100, 101, 104, 105, 106, 108, 110, 111, 112, 113, 114, 116, 125, 128, 129, 135, 142, 147, 156, 157, 164, 165, 167, 168, 175, 176, 181, 183, 184, 186, 193, 197, 198, 199, 211, 214, 219, 220, 226, 227, 228, 229
Rayleigh distribution, 79, 221
record values, vii, ix, 103, 104, 105, 106, 109, 110, 111, 112, 113, 114, 117, 139, 151, 154, 155, 156, 219, 220, 221, 222, 223, 224, 226, 228, 229, 230

S
skew distribution, 134
stable, 80
stable distribution, 80
Student t, 147
Student t-distribution, 81, 147

U
uniform distribution, 29, 30, 32, 49, 50, 82, 91, 113, 135, 136, 180, 189, 190, 199, 200, 202, 207

V
variance, 62, 63, 65, 67, 68, 69, 71, 72, 73, 74, 76, 78, 79, 81, 82, 84, 148, 177, 178, 179, 195, 196, 209
Von Mises distribution, 136

W
Wald distribution, 70, 138
Weibull, 83, 137, 175, 176, 193, 204
Weibull distribution, 83, 137