English, 400 pages, 1995
THE STOCK MARKET AND FINANCE FROM A PHYSICIST’S VIEWPOINT
M.F.M. OSBORNE
Copyright © 1977 M.F.M. Osborne
First Printing 1977
Second Printing 1995
All rights reserved
Printed in Canada
10 9 8 7 6
ISBN 0-9646292-0-8
Library of Congress Catalogue Number 95-69502
Design by Kristen McDougall
Mathematics set by Elizabeth Dolan
Typeset with PCTeX using Times Roman font
Crossgar Press, Inc.
2116 West Lake of the Isles Parkway
Minneapolis, MN 55405-2425
Books may be ordered directly from the Publisher at the above address
by check ($19.95 plus $3.00 for shipping, handling and tax).
CONTENTS

VOL. I
MARKET MAKING AND RANDOM WALKS IN SECURITY DATA

Foreword  xi
Preface  xiii

1  Introduction  1
1.1  Some Basic Principles of Investigation  1
1.2  Some Fantasies of Wall Street  3

2  Market Making  6
2.1  Does a Capital Market Raise Capital?  6
2.2  Gambling Games and Liquidity  11
2.3  Supply and Demand. Economics vs. Reality  13
2.4  Supply and Demand in the Real World  22
2.5  Continuity and Derivatives of Real World Supply and Demand  29
2.6  Continuity of a Variable vs. Continuity of a Function  32
2.7  Comments on Continuity and Continuous Markets  34
2.8  The Limit Order Only Interrupted Market (LOOIM) as a Foundation for Auction Markets  34
2.9  The Continuous in Time Single Market Maker  45
2.10  Definition of a Market Maker  46
2.11  Principles of Market Making. Required Information and Criteria. Limits or Boundaries
2.12  Definition of Profit in Market Making
2.13  The Effect of Lower Limits Only on Inventory
2.14  The Effect of Upper Inventory Limits on Prices
2.15  The Return to Optimum Inventory When the Limits are Exceeded
2.16  Tactical Inventory Zones
2.17  Tactics with the Minimum (1/8) Spread and a Size Greater Than One Round Lot
2.18  Alternate Methods of Balancing Inventory Without a Direct Suspension
2.19  Comments on Continuity. Methods of Preserving the Image of a Continuous Market Maker
2.20  Continuous Market Making with Fixed Minimum and Optimum Inventory, but No Maximum  68
2.21  A Digression on Boundaries, Limits and the Definition of a Theory
2.22  Some Comments on Tape and Chart Reading
2.23  Properties of Transacted Price Sequences
2.24  Several Market Makers in Competition
2.25  Commodity Markets, with No "Official" Market Maker
2.26  An Official Market Maker with a Book of Limit Orders. The Exchange Specialist
2.27  An Estimate of the NYSE Specialist Trading Income
2.28  The Exchange Specialist and the Third Market
2.29  Summary of Market Making. Limits on the Existence of Markets

3  Price Changes Over Fixed Time Intervals, Day, Week or Month
3.2  The Dark Cloud Axiom. The Silver Lining  103
3.3  Impossibility Theorems or Statements from Natural Science  105
3.4  Impossibility Theorems from Mathematics  107
3.5  Godel's Theorem  109
3.6  Examples of Random Walks  113
3.7  The Mean and Variance of One Step in a Random Walk
3.8  The Expected Advance and Dispersion for a Sum of j Steps in a Random Walk  118
3.9  The Estimated Advance and Dispersion for a Given Random Walk  122
3.10  Tests for Properties of Random Walks. The Square Root of Time Diffusion Law  127
3.11  Estimates of Advance h and Dispersion δ. Testing Statistical Hypotheses  137
3.12  Evidence for Departures from a Simple Random Walk in Stock Prices  138
3.13  The Formula for Random Walks vs. Standard Deviation of the Mean  140
3.14  Random Walks with a Dispersion per Step Small Compared to the Expected Advance  143
3.15  Sequential vs. Across the Market Dispersion  146
3.16  Brownian Motion as the Continuous Limit of a Random Walk  157
3.17  Velocity vs. Derivative of Position  160
3.18  Random Walks Whose Properties Change with Time  162
3.19  Random Walks in Dollars of Price vs. Log of Price  166
3.20  The Effect of Dispersion on Arithmetic vs. Geometric Mean  172
3.21  "Equality of Opportunity" in a Growing Economy  173
3.22  Alternative Forms of Percentage Return and Logarithmic Random Walks  178
3.23  A Hybrid Random Walk, Return Including the Dividends  181
VOL. II
STATISTICAL METHODS AND SEQUENTIAL ANALYSIS OF STOCK MARKET DATA  185

Preface to Volume II  187

4  Measurement and Methods of Statistical Analysis  189
4.1  A Digression on Measurement  189
4.2  The Approximation of a Functional Relationship in Measurement  194
4.3  Some Mathematical Truths Which are not Necessarily So  195
4.4  Pryxytzl, Baloney and Red Herring
4.5  Examples of Infinite Moments  202
4.6  Time of Flight of Gas Molecules  208
4.7  Transformations of the Independent Variable  209
4.8  Order Statistics  215
4.9  Skew Random Walks. Fluctuations in a Panel of a Histogram  220
4.10  An Example in Which the Histogram Panel Test Doesn't Work  224
4.11  A Comment on the Form of the Chi-square Test  226
4.12  Counting of Events with a Poisson Distribution  228
4.13  Relation of Correlation, Regression, Contingency by Chi-square and the Method of Coincident Events  236
4.14  Seasonality in Stock Prices. An Example of a Problem with Three Variables. Good and Bad Practice in Statistics  241
4.15  Some General Methods of Attacking Large Batches of Data. Preliminaries of Analysis of Sequential Data
4.16  Extreme Observations. Some Biases from the History of Science  257
4.17  Discoveries in Noise  259
4.18  Examining Two Variables at Once  260
4.19  The Quantitative Expression of "Eyeball" Statistics  262

5  Statistics of Sequential Data. Power Spectra  265
5.1  Examining Two Variables at Once Where One is the Time. Sequential Data  265
5.2  The Definition of Strict Stationarity  267
5.3  The Scatter Diagrams of Autocovariance  269
5.4  Comments on Auto- and Cross-Covariance  278
5.5  Definition of Weak Stationarity  279
5.6  The Definition of Ergodicity  282
5.7  Examples Illustrating Strict Stationarity, Weak Stationarity, and Ergodicity
5.8  The Doctoring of Sequential Data Prior to Statistical Analysis, in Particular for Power Spectra  287
5.9  A Comment on the Disparate Mathematical Concepts Which Can Be Encompassed in Fourier Analysis
5.10  Formulas for Fourier Expansion, Expansion Coefficients, Power Spectrum and Autocovariance  297
5.11  A Digression on the Orthogonality Relation  301
5.12  A Digression on Aliasing  303
5.13  The Interpretation of the Mean and Linear Trend as Fourier Expansion Coefficients. Mathematical Paradoxes
5.14  The Spectrum of White Noise, a Sequence of Independent in Probability Values of f(t)  309
5.15  The "Solidity" and "Smoothness" of a Power Spectrum  322
5.16  How to Inspect and Check a Published Power Spectrum  324
5.17  White Noise as a Standard of Comparison  329
5.18  The "Spectrum" of a Random Walk Itself  331
5.19  The Approximate Evaluation of a Random Walk Spectrum  333
5.20  The Exact Spectrum of a Random Walk  344
5.21  A Comment on the Identity of Expected Spectra for Random Walks, Linear Trends and Steps  350
5.22  The Running Summation and Difference Operators  355
5.23  The Structure Function and Interquartile Range of Differences
5.24  The Relation of the Shape of the Structure Function and Power Spectrum. Fractional Random Walks
5.25  Some Unsolved Problems
Some Unexplored Problems in Security Market Data  364
FIGURES

Chapter 2
3-1  Prices as Function of Supply and Demand  15
3-2  Supply and Demand, as Functions of Price. The Economist's Diagram Turned Over  21
4-1  Mrs. Jones' Demand Function
4-2  Supply and Demand Functions of Price, for Buyers and Sellers
4-3  Supply and Demand of Apple Sauce in the Grocery Store
13-1  Examples of Simple (Symmetric) Market Making
17-1  Examples of Quotes with Minimum Spread and Effective Size k of the Quote
18-1  Trading with a Ratchet
23-1  Quotes of Competing Market Makers

Chapter 3
5-1  A Line and a Point Not on a Line
6-1  Parchesi as a Set of Interacting Random Walks
7-1  Probability Distribution for One Step of a Random Walk
8-1  The Parabolic Sleeve of a Random Walk
8-2  Confidence Belt for the Mean of a Sum of i Steps, or i Observations
9-1  Price Charts for IBM and Reynolds Metals for the Era of Table 3.9-1. Charts from Securities Research Company  126
10-1  Determination of Fractile Ranges  129
10-2  Price Charts for Stocks of Problem 4. Charts from Securities Research Co.  131
13-1  Histogram of Student SAT Scores (Schematic)  141
14-1  Random Walk of Student to the Parking Lot. The Dispersion per Step, δ, is Small Compared to Expected Advance h  144
15-1  Across the Market Dispersion for Fixed Income Security Prices, Earnings, and Earnings per Dollar of Price  148
19-1  The Gambler's Random Walk in Roulette  169
19-2  The Financier's Random Walk in Roulette  169
19-3  The Financier's Walk on a $ Scale of Value (Left) and Initial % Scale (Right)  170
19-4  The Gambler's Walk on a Logarithmic Scale of Value  170

Chapter 4
5-1  An Experiment Giving the Cauchy Distribution  206
6-1  Experiment to Determine Molecular Velocities  210
6-2  Distribution of Velocity and Time of Flight  211
7-1  Transformation to y = x² of a Uniform Distribution of x  217
9-1  A Skew Random Walk  221
9-2  Schematic Histograms with an Excess or Deficiency in One Panel  222
12-1  The Poisson Distribution for an Expected Number λ = 0.05
12-2  The Poisson Distribution p(k), λ = Expected Number of Events  235
13-1  Scatter Diagram of Weight vs. Height (Schematic)  236
14-1  Cumulated Distribution of Monthly Changes in the Dow  246
15-1  A "Picture" of a Column of Data on Prices  250
15-2  Distribution Function of Closing Prices for July 31, 1956 (All Items, NYSE)  251
15-3  Distribution for log_e P on July 31, 1956 (All Items, NYSE Common and Preferred)  252
15-4  Distribution Function of log_e P for Preferred Stocks (NYSE, July 31, 1956)  253
15-5  Distribution Function of log_e P for Common Stocks (NYSE, July 31, 1956)  254
15-6  Distribution Function of log_e P for Common Stocks (ASE, July 31, 1956)  254
15-7  Cumulated Distributions of log_e P for NYSE and ASE (Common Stocks)  255
15-8  Histogram of Measurements of Machine Parts (Schematic)  256
18-1  Scatter Diagram of Volume vs. Price Change (Schematic)  261

Chapter 5
3-1  Random Telegraph Signal  274
3-2  Shot Noise  275
7-1  Autocovariance and Autocorrelation of Random Digits  284
7-2  Autocovariance and Autocorrelation of a Running Sum of Two Independent Variables  285
7-3  Autocovariance and Autocorrelation for a Running Sum of Five Independent Variables  285
7-4  Observed Autocorrelation of a Random Walk (Schematic)  286
8-1  Test of Data for the Third Condition of Ergodicity  288
8-2  Sound Pressures vs. Time  291
12-2  Fourier Representation of f(t) = const  304
14-1  A Random Sequence of Normal Variables of Unit Variance and Zero Mean  312
14-2  A Random Sequence of Variables of Unit Variance and Zero Mean: +1 for Even Digits, −1 for Odd Digits  314
14-3  The Distribution of Real Component b, or Imaginary Component c, of a Fourier Expansion  316
14-4  The Joint Distribution of Fourier Expansion Coefficients in Cartesian Coordinates  318
14-5  The Rayleigh Distribution for the Modulus |a_j| = sqrt(b_j² + c_j²) of a Fourier Expansion Coefficient a_j = b_j + i c_j  319
14-6  The Distribution of the Power Spectrum Values P = a_j a_j* = b_j² + c_j²  321
14-7  The Distribution of the Power Spectrum Values P, on a Log Scale  322
19-1  Power Spectrum of First Differences of Wheat Futures Prices, Monthly, 1950-65 (from Labys and Granger)  336
19-2  Power Spectrum of Wheat Price Futures, Monthly, 1950-65 (from Labys and Granger)  337
19-3  Power Spectrum of Composite Weekly SEC Index 1939-62, Semi-Log Plot (from Granger and Morgenstern)  339
19-4  Power Spectrum of Woolworth Stock Prices 1946-60, Semi-Log Plot (from Granger and Morgenstern)  340
19-5  Fig. 5.19-3 on a Log-Log Plot  341
19-6  Fig. 5.19-4 on a Log-Log Plot  342
19-7  Power Spectrum of the Radial Solar Wind Velocity. From Annual Review of Astronomy and Astrophysics, Vol. 11, 1973, Ed. L. Goldberg  343
20-1  The Exact Fourier Composition of Spectrum of a Random Walk of Non-Zero Expected Advance  347
21-1  Power Spectrum of a Jump (from Granger and Morgenstern)  353
21-2  Power Spectrum for Linear Trend (from Granger and Morgenstern)  354
TABLES

3.9  Expected Advance h and Dispersion δ for Weekly Changes in log_e Price, Δ log_e P(t); Data from Cootner, p. 286
3.10  Data on Expected Advance h and Dispersion δ of Problem 4 (Figs. 3.10-2)
3.21  Table of Payoffs for Investors in a Growing Economy  174
3.21  Arnold Bernhard's Track Record for OTC Stocks  176
3.22  Table of Alternative Forms of Random Walks with Different Measures for Utility Function of Money  180
4.14.1  Data for Problem 6  247
4.14.2  Advances and Declines, February vs. August  248
4.14.3  Comparison of Different Observers  248
4.14.4  Comparison of Different Observers  248
4.14.5  Comparison of Different Observers  248
4.14.6  Monthly Advance and Decline for Nonoverlapping Eras  248
5.16  Given χ² with Probabilities to Determine Confidence Belts  328
PROBLEMS

Mr. Martin's Statement. Does a Capital Market Raise Capital?  10
How Economists Get Supply and Demand Curves  21
Examples of LOOIM  41
Sequential Dispersion σ and Expected Advance h of Boom and Bust Stocks  135
Birth and Death of Securities  165
Seasonality in Stock Prices
Histograms with Respect to Eighths
FOREWORD
Major discoveries in any field very often are made by outsiders, persons who bring new perspective to the subject, or apply tools as yet unused from another discipline.
This occurred at least twice
in the annals of investment markets.
The first was Louis Bachelier's application of the Brownian Motion model to French Rentes in his doctoral thesis in mathematics in 1900, which preceded Einstein's paper on Brownian motion by a half dozen years. The second was Maury Osborne's seminal 1959 article, "Brownian Motion in the Stock Market." Bachelier was a mathematician; Osborne is a physicist.
Osborne was a graduate student in astronomy at the University of California, Berkeley, who came to work at the U.S. Naval Research Laboratory, Washington, D.C., in June of 1941, and spent his entire career there. During World War II, he worked on problems of underwater sound, submarine detection and underwater explosions. After
the war, physicists were encouraged to work on a variety of topics of their own choosing.
Osborne examined, among other topics, the
aerodynamics of insect flight, the hydrodynamical performance of migrating salmon, and the stock market as a slow motion source of
random noise, a subject of considerable interest in many different fields.
This resulted in his 1959 paper, "Brownian Motion in the Stock Market." Its thesis, that common stock prices followed a random walk, was a bombshell within the field. It was the first of a series of investigations that led to a complete revision of view of the stock market, although perhaps not always in the way Osborne would have
anticipated or approved. More than a decade later, Osborne was asked to teach a course in
the stock market and finance at the University of California, Berkeley. The course he taught was not the typical course in investments, but something quite different. Its content bore directly on that subject,
but in a divergent and critical way. It raised issues of the nature of the underlying data, how it was to be examined, how theories about it should be derived, which current economic theories were useful and which were not, and the very nature, even, of appropriate methods
of statistical analysis and scientific investigation. This book is the second printing of the content of that course. The words and content of the original have not been changed; the student problems and references are as they appeared in the original course.
Even the figures, as drawn by Osborne, have been reproduced in their original form. The retention of the original was done, not for historical purposes, but because Osborne’s approach and insights are not restricted to past eras, but apply as well today as they did in the past.
Overall, the book is a tour de force, instructive, playful, serious, ranging from easy to extremely difficult, always clearly written, and full of insightful viewpoints. Throughout, Osborne states matters as
he sees them, whether it be finding fault with common but misleading statistical practices, pointing out the proper investigative approach or the only appropriate way to describe the relationship between supply, demand and price. The book relates the views of an unusually
objective and thoughtful mind on a wide range of topics of economic and intellectual importance as they bear on the stock market and finance.
Joseph E. Murphy
PREFACE TO THE FIRST PRINTING
In the fall of 1972 the author was a visiting lecturer in the graduate school of business administration at the University of California in Berkeley, teaching two courses labeled "Security Markets and Investment Policies" and "Seminar in Investments." As a professional physicist and amateur pedagogue, I found preparing and delivering these lectures the most concentrated and prodigious mental effort I had ever made.
I would never have believed a group of students
could make the professor trot so quick. On returning to Washington
I thought that the record of such a labor should be better preserved than the rather disorderly lecture notes I had accumulated, so I sat
down with a tape recorder and talked them into it. These notes are the result after several typings, considerable expansion certainly, and
clarification I hope. These lectures were originally produced for graduate students in business administration, and I have tried to preserve this level of
presentation.
So it would be appropriate to point out what level
of mathematical preparation is required. Most of the students had
had, and needed, an elementary course in statistics: calculation and interpretation of correlation and linear regression; expected values, significance probabilities, how to make a chi-square test.
Perhaps
half of them had had an exposure to the basic ideas of calculus: derivatives, integrals and limits.
This too was very useful, but not
absolutely essential.
Some of the better students without a background in the exact sciences felt a little "snowed" by the use of sines
and cosines when we discussed power spectra. Some had trouble with logs to the base e, and log paper.
The original lectures on power spectra were primarily a qualitative discussion of when and when not to use power spectrum analysis; the present writeup on this topic goes into considerably more mathematical detail.
It would be literally correct, and at the same time very misleading to say that the really important mathematics the students needed is taught nowadays in the eighth and ninth grade.
Here the students
are introduced to the idea of a function, and to the different classes of numbers; integers, rational numbers and real numbers. I was briefly
exposed to these ideas as a graduate student. I used them without understanding, or even being aware of what I was doing, until my children went through the eighth and ninth grade. It took me several years of repeated passage, doing their homework, before I finally caught on that these were the mathematical concepts I needed to
bring sense and order to my understanding of a speculative market. Pure mathematicians should find a wry satisfaction, and mathemati
cal economists may be a little startled to see that modern ninth grade mathematics is what you need to make sense of the law of supply and demand in a speculative market, and to expose in elementary terms
the mystique of the NYSE specialist or over the counter trader who “makes a market.” I have not tried to avoid subtle mathematical points when they
are needed.
Rather I have tried to emphasize that these subtle
ideas are used unknowingly, repeatedly, in everyday life. In the simplest terms, the most important mathematical ideas needed are those which distinguish between a solid and a dotted line. The material covered divides itself into roughly three parts: 1) a discussion of the process of market making; 2) a discussion of random walks; 3) a discussion of alternative methods of attacking a statistical problem and the relation between these methods. Included in this third category is an introduction to power spectrum methods. The reader might get the impression, and it would be correct, that
I am not overly enthusiastic about power spectra as the method of choice for examining sequential data. This prejudice is not restricted to economic time series. Readers with a serious interest in this question might examine closely sections 5.9, 5.13, and 5.21 of this text,
and also the reference footnoted¹, as an example of alternatives to spectral methods in describing data on turbulence in fluids. Spectral methods are normally de rigueur in this subject, and I admit to
being a small minority.
If I have seemed overly critical of spectral
methods, it is for the same reason that I have criticized other theoretical ideas. You understand a theory best when you appreciate its
limitations—the conditions under which it doesn’t work. When it comes to theoretical developments (as opposed to the
analysis of data) my attitude toward Fourier analysis is similar to that of Winston Churchill toward a democratic form of government. It is a terrible method, but the best one we know of. For those readers who like to browse, or might want to use this material for a course or seminar, I should point out that this material
was not delivered in the logical and perhaps pedagogically appropriate order that is found here. One class got a rather complete treatment of market making, and very little of random walk. The other got a much fuller treatment of random walk, and no market making.
The discussion of alternative methods of analyzing data does
not require either market making or random walks as prerequisites. It may be appropriate to warn readers on the multiple and spe
cialized meaning of certain words.
“Real” in ordinary language
means concrete and actual, rather than abstract or fictional. “Real”
in mathematics as referring to numbers or functions has a number of different connotations. The basic one is that of natural numbers (integers) plus fractions plus irrational numbers, positive and negative. If I say the real part of a number or function, it implies there may be a second socalled imaginary part. Historically both parts are the product of the most imaginative thinking of mathematicians, for at
least two millennia. “Trend” does not have the loose meaning of some kind of slow
change, ascribed to this word in general language or economics. For us it is usually, given a plotted finite set of sequential data, the slope
of a straight line from the first to the last data point, regardless of what the data may be in between. This is exactly the mean difference of the data between successive intervals, taken as unit time.
If the
ordinate has been transformed, say from prices to log prices, it is the slope of such a line on the transformed data.
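The equivalence asserted above can be checked directly: for observations x₀, …, x₍ₙ₋₁₎ at unit time spacing, the slope of the line from the first to the last point is (x₍ₙ₋₁₎ − x₀)/(n − 1), and the successive differences telescope to the same value. A minimal sketch in Python (the price series is invented purely for illustration):

```python
# "Trend" in the book's sense: the slope of the straight line from the
# first to the last data point, with observations one time unit apart.
def trend(xs):
    return (xs[-1] - xs[0]) / (len(xs) - 1)

# The mean difference of the data between successive intervals.
def mean_diff(xs):
    diffs = [b - a for a, b in zip(xs, xs[1:])]
    return sum(diffs) / len(diffs)

prices = [50.0, 52.5, 49.0, 55.0, 54.0, 58.0]  # made-up series
# The differences telescope: their sum is xs[-1] - xs[0], so the two
# quantities agree.
assert abs(trend(prices) - mean_diff(prices)) < 1e-12
print(trend(prices))  # 1.6
```

The telescoping is also why the trend, so defined, ignores everything the data does between the two endpoints.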
¹ M. F. M. Osborne, "The Observation and Theory of Fluctuations in Deep Ocean Currents," Ergänzungsheft zur Deutschen Hydrographischen Zeitschrift, Series A, No. 13, 1973, pp. 1-58.
"Represent" in mathematics means "stand for," just as a V (with the Romans), five knots on a string (with the Aztecs), or the Arabic
means “fit.” In the same sense that Ptolemaic epicycles vs. a theory of gravitation are used to fit or represent the observation of positions
of the Moon, or random walk theory represents some property of stock price sequences; or a Fourier series or a power series represents,
or fits, a sequence of observed numbers. I would venture to say that most people (the author included)
who buy a book about the stock market, do so because they would like to make money, either for themselves or as an advisor or man
ager.
We can express this mathematically by saying that they are
interested in improving their estimate of the expected value of the change in price of some stock, or an index, over some interval of time t − t₀, t > t₀, in the future, beginning now (t₀). The reader should be warned that there is more to the stock market than this ultimate objective, fascinating as it may be. I do not wish to denigrate the profit motive, but understanding the market may require an examination
of phenomena that have nothing to do with price, or profit. I can illustrate the narrowness of this “profit motive” point of view with a story which appeared in the Wall Street Journal some years ago. There was a man who loved horses, and happily earned his living running a stable, buying, selling, boarding and renting horses. There was a byproduct of this enterprise which in fact rendered it
profitable, the uniform production of horse manure, for which there was a good market for all that he could produce.
You can easily
see that so long as profit was regarded as an incidental byproduct
of this business, all would go well, but if he tried to increase profit by diet or laxatives, he would surely come to grief. You will see, for
example, when we discuss market making, that if we regard profit as
an incidental byproduct of the process of making a market, profits will flow steadily and at a relatively small risk, whereas if you focus on profit as the objective you increase your risks enormously. When we discuss the random walk model of stock prices, you will see that a simple expression of this model is that the expected change in price, or expected “profit” or “return” is zero, no matter when you start or how long you wait.
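The zero-expected-change statement can be verified exactly on a toy version of the model: assume, purely for illustration, steps of ±1 with probability 1/2 each, and average the net change over every possible path rather than sampling:

```python
from itertools import product

# Toy zero-drift random walk: each step is +1 or -1 with probability 1/2.
# Enumerating all 2**k equally likely k-step paths gives the expected
# change exactly, with no sampling error.
def expected_change(k):
    paths = list(product([+1, -1], repeat=k))
    return sum(sum(path) for path in paths) / len(paths)

# Zero for every horizon: it does not matter how long you wait.
print([expected_change(k) for k in (1, 3, 5, 8)])  # [0.0, 0.0, 0.0, 0.0]
```

Because the steps are independent of the starting time, the same calculation applies no matter when you start.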
For someone who wants to
make money, this is a very depressing picture.
When we examine
the definition of profit more closely, and bring in the question of amount of money vs. its utility, in the economist's sense, you will see that "profit" is at times a rather fuzzy and ill-defined concept. There are some unsolved problems here. In all of this I merely want to emphasize that there are other ways of looking at capital markets
in general than that adopted by most people who have a practical and urgent interest in them.
If you back off from the nitty gritty
problem of protecting and increasing your capital, you may get a
more enlightening view of what it is all about. Finally, to those readers (and I hope they are numerous) who take seriously the guiding principle of Chicken Little and Zero, "it ain't necessarily so," I make this request: I too may well make statements which are not necessarily so. If so, please tell me about them. There are doubtless places in the text where I have made statements, or used mathematical conclusions, which are not at all obvious. Feel free to ask me for enlightenment; I would be happy to comply.
PREFACE TO SECOND PRINTING
I owe the opportunity to teach a course on finance and the stock market to my friend and colleague Victor Niederhoffer.
I owe the
publication of a second printing of these notes to Joe Murphy, a collaborator in several subsequent papers. Without him this second
printing would never have come to pass.
I have often thought that the stock market, just as a phenomenon of purely scientific interest, was one of the best documented (in terms of available recorded data), and worst described, aspects of human life. The reason for this glaring dichotomy is that the overwhelming majority of interest in the market centers on how to make money, a very narrow, and very compelling, practical interest. There are historical analogies of this situation. Medicine and agriculture are very practical subjects of interest. Yet only in the last 200 years have
they been studied as purely scientific, rather than practical topics of interest. The major emphasis in this book is twofold: 1) reinterpretation of conventional wisdom, and 2) unanswered questions about finance
and economics. Data and methods exist for answering some of these
questions. This book can be imagined as an account by an unbiased anthropologist (a physicist) of an exotic society composed of enlightened financiers and economists, describing their standards of behavior and belief.
Some of the exotic beliefs which this society holds are:
1) Supply and demand are functions of price, but prices are not
functions of supply and demand.
2) A capital market (i.e., the stock market) is totally unnecessary for the raising of capital and in fact raises very little capital (less than 1/10th of the capital raised in any year in the U.S.).
3) A slightly controversial belief of the members of this society has it that the natural logarithms of stock prices in $ per share follow approximately a random walk of zero expected advance. This belief has as a curious consequence (1) that the expected price itself, in $ per share, increases slightly and steadily in time, and it also follows (2) that the expected reciprocal of price, shares per $, also increases with time in an identical fashion. The first conclusion has been repeatedly tested with real data. The second conclusion has never been tested with real data, to my knowledge.
Note that
“expected” has the technical meaning of the sum of values times associated probability.
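The curious consequence in belief 3 can be demonstrated in exactly this sum-of-values-times-probability sense. Suppose, purely for illustration, that log P moves by ±δ per step with probability 1/2 each, so the expected advance in log price is zero; enumerating every path shows that the expected price ratio and its expected reciprocal are identical and both grow with t (each equals cosh(δ)^t):

```python
import math
from itertools import product

DELTA = 0.05  # illustrative step size in log price (an assumed value)

# Enumerate all 2**t equally likely t-step paths of a +/-DELTA walk in
# log price, and average the price ratio P(t)/P(0) = exp(sum of steps)
# and its reciprocal over those paths.
def expectations(t):
    ratios = [math.exp(sum(p)) for p in product([+DELTA, -DELTA], repeat=t)]
    e_price = sum(ratios) / len(ratios)                    # E[P(t)/P(0)]
    e_recip = sum(1.0 / r for r in ratios) / len(ratios)   # E[P(0)/P(t)]
    return e_price, e_recip

for t in (1, 5, 10):
    e_price, e_recip = expectations(t)
    assert e_price > 1.0                                # price drifts upward
    assert abs(e_price - e_recip) < 1e-9                # reciprocal identical
    assert abs(e_price - math.cosh(DELTA) ** t) < 1e-9  # closed form
```

The growth is a pure convexity effect: exp is convex, so a walk that is symmetric in log price drifts upward in price, and by the same symmetry so does the reciprocal of price.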
There are numerous other paradoxical beliefs of this society, consequent to the difference between the discrete numbers (multiples of 1/8 $) in which data is recorded and the real numbers (in the mathematical definition of "real") in which the theoreticians of this society tend to think. The distinction between "real" and "discrete" matters considerably for the stock market.
There is an analogy to all this in the foreign exchange market, an analogy which has never been explored, to my knowledge.
Historically, foreign currencies were traded against a single standard of value, an ounce of gold, or pounds sterling. In the U.S. stock market the $ is compared against a multitude of other "currencies" or vehicles of value, the stocks themselves. Whether prices of currencies (or their logs), like stock prices, follow a random walk, or have in past eras, is not known to me.
In the footnote of this preface are listed a number of papers which bear on these and related questions, some of which appeared after these notes were written.*

* Osborne, M.F.M., "Brownian Motion in the Stock Market," Operations Research, vol. 7, 1959, pp. 145–173 and p. 807.
Osborne, M.F.M. and Murphy, J.E., "Financial Analogs of Physical Brownian Motion, as Illustrated by Earnings," The Financial Review, vol. 19, no. 2, 1984, pp. 153–172.
Murphy, J.E. and Osborne, M.F.M., "Brownian Motion in the Bond Market," The Review of Futures Markets, vol. 6, no. 3, December 1987, pp. 306–326.
1 INTRODUCTION

1.1 Some Basic Principles of Investigation
The objectives of these lectures are to stretch your minds as regards imagination and a new point of view, to excite your curiosity for new things, and to develop your skepticism concerning what you may have already read about the stock market or security data.
As models for the kind of thinking I am trying to bring out, I will give three well-known examples. The first is the model scientist, physicist if you like, who is typified by Chicken Little, a character of whom I am sure you have all heard. Chicken Little was a little bird freshly hatched from an egg, with no preconceived notions as to what the world of nature was all about. He was out in the woods pecking for bugs and worms and other delightful tidbits when an acorn fell on him. He immediately concluded that the sky was falling, because he saw it with his eyes, heard it with his ears, and felt it when it hit him on the tail. So he organized a following or lobby of Henny Penny, Turkey Lurkey, Ducky Wucky and other barnyard birds to go and tell the king about this impending disaster. On the way they met Foxy Loxy, who saw a good thing coming for himself, so he said he knew just where the king lived and he would take them there. He led Chicken Little and his entourage right to his own den. He went in and said he would make the arrangements and they should come in after him. When they did, he bit their heads off one by one.
I think a few of them may have escaped, but the point is that Chicken Little exhibited all the good and even heroic qualities, and also the stupidities, of a scientist. First of all, no preconceived notions or model as to what the world was like, and complete reverence for the data. He accepted his sense data literally when something hit him from above. Like all good scientists, he immediately made a theory directly related to his data, and then he started to do something about it, even if it killed him. Scientists these days, I suppose, go to Washington "to do something about it." The fact that he happened to be wrong is really nothing against him.
Science is nothing but a systematic way of making mistakes.
A good theory is nothing but a shorthand way of remembering the
data. He had reverence for the observations; that is the important point. He didn't have any preconceived notions as to what was going on. He just took the data of his senses straight.

The second model is that of a mathematician, and our model in this case is a character from the funny papers named Zero, who figures in the comic strip "Beetle Bailey."
Zero is usually noted
for getting into difficult situations because he takes the meanings of words and the grammatical construction absolutely literally. This is exactly what a mathematician does.
In the comic strip it usually
happens to be a way of his getting into trouble.
Nevertheless, this
is the correct attitude for a mathematician. This fits perfectly with
the above definition of a scientist, and explains why mathematics and science go so well together. I think there is something significant about giving the name “Zero” to someone with these characteristics,
because zero is perhaps one of the greatest mathematical inventions that has ever been made.
Zero is a symbol for nothing, yet it is a nothing with useful and even remarkable properties.

The third attribute which I would like to bring out and develop in you is expressed in a hymn which they sang in a church I used to go to. I call it a hymn because it was a song and because it was sung in a church, here in Berkeley, although I doubt if you will find it in any hymn book. The line goes something like this: "Old Pharaoh's daughter found Moses in the bulrushes, she said, but it ain't necessarily so, no, it ain't necessarily so." Well, this is the kind of skepticism I want to inculcate in you. Just because you read something in some authoritative source, it doesn't mean that it is necessarily so. It may not be so.
1.2 Some Fantasies of Wall Street
In order to cultivate a sense of astonishment, such as Chicken Little had, about the truly fantastic things that go on on Wall Street, and in the securities market generally, let me describe some of the properties of a stock market as seen by some imaginary characters from ancient history. The first of these is an ancient Roman or Greek. He is taken to the stock exchange, and it is explained to him that this is a market for synthetic slaves (corporations) who have been created, you might say, by a magician (lawyer). The slave is sawed up into several million pieces. The pieces are bought and sold, and nevertheless the slave continues to work and make money for the owners. In addition, even more astonishing to the Roman or Greek, free real citizens are hired by and work for these fictitious slaves.
The next visitor to the stock market is an oriental. It is explained to him that these pieces of slaves are bought and sold, but he is astonished to learn that the prices are upside down in terms of the standards that he is accustomed to. In other words, the prices are in terms of dollars per share, or dollars per piece. In the Orient prices are just upside down from this. There you do not buy rice or wheat in terms of dollars per bushel, but in terms of bushels per dollar, or pounds per rupee. The reason is, I suppose, that they are more interested in what they are buying, because they are going to eat it, than in the price they could get for it if they were to sell it. Food is more important than money, for many orientals. If you have two annas, you want to know how much rice you can get to eat for them, rather than how many annas you can get for a pound of rice you already have. I think the same thing sometimes occurs with those Arabian sultans who have lots of oil. They buy Cadillacs seventeen at a time. They look on the price of a Cadillac as how many Cadillacs per gold brick, rather than how many gold bricks per Cadillac. The oriental is further astonished to observe that the smallest unit is a coin which doesn't exist, the eighth of a dollar or bit. There is a remnant of this in our language, where a quarter is two bits. It is rather astonishing to these people of an ancient civilization that the hairy barbarians of the West buy and sell pieces of slaves on this market, with a price which is upside down, and with a coin which doesn't exist.
The third gentleman whom we introduce to this market is an ancient Egyptian. It is frequently said that the French are a nation of shopkeepers. Well, in those days the Egyptians were a nation of undertakers. All of life was a preparation for death. When he is shown the newspaper with the names and prices of all these actively living slaves, he immediately asks: where is the obituary column, where they describe the virtues and accomplishments of the people who are dead? Where is the necropolis where all the marvelous imaginary people have been buried? He is not interested in the present properties of living people, because a person isn't complete until he is dead, so he wants to know where all the dead ones are. His guide is rather astonished at this, because it is really quite vulgar to speak of canal stocks and buggy whip manufacturers, and all the disasters which have overcome these imaginary slaves of the past. The Egyptian thinks that the American point of view is really quite barbaric, and totally uncivilized.

The preceding fantasies on Wall Street seem apropos today, and there are some equally fantastic phenomena going on right now. For example, for many years one might have thought that the NYSE brokerage business was the ideal business setup. It was a strictly closed corporation; it cost a lot of money to be a member of the NYSE, and there were only a certain number of people who could belong. They had a monopoly on the market sanctioned by the government; they had a fixed set of minimum prices (brokerage) that everybody charged. As time went on, say from 1920, prices (that is, the brokerage fees) have risen, and the amount of business (the daily volume) has also risen. It was a protected, growing market, and a monopoly.
What could have been a sweeter setup? As a result of all this, a great many of them, astonishingly enough, got into serious difficulties and even went broke. Things were so good and they were flooded by so much business that they went out of business, despite the fact that it was a near monopoly with fixed minimum prices. To me as a simpleminded physicist, this is an utterly incredible phenomenon. It got so good it was terrible.
In some respects this phenomenon is the mirror image of 1929. In those days it was the public that was taken for a dreadful bath. Now (1968–72), as a result of tremendous participation by the public, as institutions rather than individuals, it is the operators of the market who are being taken for a bath.

Finally, I can describe an astonishing discovery which I made myself, although you will find it quite humorous!
I was making a telephone call to the Securities and Exchange Commission to get some data on the relation of the volume of trading on any given day to the actual number of transactions. So I called up the young lady on the switchboard at the SEC and asked her who at the SEC had data on this directly from the tape, the individual components of volume. She said, "We don't have anything like that over here." I said, "This is the SEC, isn't it?" "Oh yes." "Don't you have data from the NYSE?" "Oh," she said, "I know what you want. You want the CAPITAL markets section."
That was like a blinding flash of light to me, as if I were an astronaut on the dark side of the Moon. I had never thought of the stock market as a market where they bought and sold money. I always thought of it as a place where they bought and sold stock. Apparently, in the textbooks it is primarily a place where capital is bought and sold. The stock certificates, the pieces of paper that are red and yellow and blue, are only incidental to the buying and selling of money. This was a totally new point of view to me.

I might say that the definition of a stock market which we will use for the purposes of these lectures is somewhat different from what you will read in the textbooks.
To me it is a very ingenious social device in which people's inveterate and insatiable propensity to gamble or take chances is satisfied. As a byproduct it provides a useful social function, which is the provision for the liquidity of capital. I don't look on it at all as a capital-raising device, but as something to satisfy people's instinct to gamble. Only as a side benefit is this legalized gambling put to a useful economic purpose. If the market is primarily a gambling phenomenon, this puts in an underlying structure (random walks) which determines in large measure how prices behave. We shall also return to some of the unconventional ideas of our visitors to the stock market. The finite life of a stock puts a limit on the concept of a random walk. Thinking in terms of upside-down prices makes a difference in a market maker's strategy, and also in the way, or alternate ways, in which "return" might be evaluated.
2 MARKET MAKING

2.1 Does a Capital Market Raise Capital?
At this point I would like to give you the first problem, which is intended to look a little more closely at this question of whether a capital market is in fact a market for capital, and just how important it is as a source of capital. I am frankly a little suspicious, just as old Pharaoh might have been about his daughter's story. I might say that frequently, when I give you a problem, it is one for which I don't know the answer. I am sure that collectively and individually you students know more economics than I do, and so collectively we'll just find out about this question. Also, if you think the question, or parts of it, is ill put, do not hesitate to criticize it, or the conception of the question itself, or to substitute a more germane or interesting question of your own. Problems are thrown at the class to be defeated collectively. You may collaborate, ask professors or other students. I don't care where or how you get information or answers, as long as you acknowledge your sources.
I am somewhat skeptical of the source of authority of these remarks (in the problem), primarily because Mr. Martin himself is a former president of the NYSE, and also because of the conclusion from this study (which doesn't appear in the quote of the problem) that all of the business should be concentrated in one market, which would naturally be the NYSE. I am particularly skeptical of any authority who has (or has had) a vested interest in the answer, and this definitely includes Mr. Martin. We will see in this class a number of examples of vested interest. Let me give you a few. You can all understand why firemen on diesel locomotives have a vested interest in keeping the firemen there, and you can be a little skeptical when they say the jobs are not for the primary benefit of the union or themselves, but for the safety of the public. You can also understand why the trial lawyers have a vested interest in the adversary system of settling accident claims, and take a very strong position against no-fault insurance. The same thing happened when socialized medicine first appeared
on the scene.
The doctors had a vested interest in the status quo.
The same thing can be said of educators who have a vested interest in requiring students to take certain courses, or in the concept of tenure.
You can say with equal justice that dentists have a vested
interest in cavities.
You can say with exactly the same point of
view that social service and welfare workers have a vested interest in preserving poverty and a submerged class.
Likewise the Bureau
of Indian Affairs. They would all be out of jobs if they ever solved these problems.
The pope has a vested interest in the population
explosion and is against birth control, because then there are that
many more souls or brownie points for the Catholic Church. You should not understand that I criticize Mr. Martin for having his vested interest.
Vested interest explains to me a lot of other
wise contradictory and inexplicable phenomena.
An evangelist for
example, like Billy Graham, has a vested interest in the concept of Hell, Sin and Damnation. He would not be at all sympathetic to the anthropologist’s or psychologist’s point of view that these are just arbitrary taboos and mental inventions set up to govern human be
havior (and incidentally give the organized church power), because if you abolish the notions of hell, sin and damnation, the evangelist has no justification for being. It takes away his business as a broker
for real estate in paradise.
I once had a chief of police tell me he
thought an appreciable crime rate was a sign of a vigorous, aggres
sive, healthy population. We all understand why the narcotics law enforcement agencies are not in favor of legalizing any drugs. As students of finance, I will tell you quite flatly, you have or will develop a vested interest in a concept; the concept of profit. It may he just as blinding and narrow as those I have just mentioned.
I hope in this course to at least make you aware of this. I can't hope to prevent it. I don't make any exception in this regard for myself. When I am not a professor of finance, I am an employee of the Defense Department, a member of the military-industrial establishment, and I have a vested interest in warfare. Other people getting killed are just so much cold turkey in my deep freeze. I am dovish on the Vietnamese war for very hawkish reasons. I think we have gotten all the mileage (and by we, I mean, of course, the military-industrial establishment) that can be obtained in terms of training and trying out new weapons and tactics. What this country needs, and you can see my vested interests, is another war, and I really don't care where it is. MacArthur unfortunately got the Japanese out of the business. The Koreans have been so unaccommodating as to start to make peace. There are still possibilities in the Near East or Africa, or anywhere; it really doesn't matter.
So in suggesting that Mr. Martin has or had a vested interest, that is really no criticism at all. It just explains the way he thinks. I might say that his conclusion that all the markets should be concentrated into one, the NYSE, is hotly disputed by its competitors, the over-the-counter market for listed securities, or third market. Mr. Weedon of Weedon and Co. is the principal exponent of this viewpoint. It is by no means obvious that Mr. Martin's statement, that a capital-raising market is essential to an economy such as ours, is true, or that the conclusion he draws concerning a centralized market is valid, either. This is just what I want you to look into, and we'll find out more about it.
The students attacked this problem in a number of different ways, using different sources of data and for different years. Some confined themselves strictly to capital raised on the NYSE, and got data from the NYSE Year Book. Others used data from the National Accounts or National Budget and endeavored to sift out from these figures a number which represented the different components of capital raised in any given year. In view of the diversity of methods, points of view and sources, some looking at the question as an accountant's problem and others as an economist's problem, there was a rather surprising agreement (I use that word agreement as a physicist, in a rather rough sense) as to where most new capital came from. It was generated from internal sources within the corporations which were needing capital. As a rough figure, I would say from two-thirds to nine-tenths of the capital was generated in this way, although it was different in different years. Using the same method of calculation for different years, the fraction of capital raised "in the market" as Mr. Martin specified fluctuated from say 2 to perhaps 8% of the total capital raised. There was one high figure of 12%, from a slightly different viewpoint. Nevertheless, these figures indicated that the capital actually raised by a market is a relatively small source.

There were certain difficulties of conception, as to exactly how you distinguish between new capital which is being raised and capital which is essentially being rolled over, where a new stock or bond issue is being used to retire a previous debt. There was also a difficulty in assessing what might be called the unseen part of the iceberg, in that a great deal of capital is raised and placed privately, so that it never really appears on the public market.
Still another difficulty was in assessing how much capital is raised by taxation, whether by federal, state or local government. It was pointed out by one of the students that in an authoritarian government like the Republic of China the problem would be very easy. You simply look at the budget: they have a certain amount or fraction of income or GNP that is specified for capital, and that is it. There is no question of a market at all. There was also the question of how one raises capital where there is no capital market, or a very imperfect or not visible one, as in Europe, where a great deal of what is transacted publicly in markets in this country is transacted privately between banks.
There was also the problem of distinguishing between “new” capital and “used” capital. That is, how much of the money lying around
is being swept into piles like dirt in a room from one place to another, and when the pile is big enough, for a long enough time, you call it “capital.”
I was also attracted as a physicist by the question of the proper "dimension" of capital, because as a physicist, knowing exactly what the dimensions are of a quantity you are thinking of is often very revealing about what the quantity is and how it is measured. I had inadvertently thought of capital as simply dollars, whereas it was evident that some of the students, and perhaps this is a more correct economic viewpoint, thought in terms of dollars per year. In one of the questions it is asked: when money is saved, when is it used as capital and when is it used as "not capital"? What was going on in the back of my mind, and the students couldn't read my mind, was the following viewpoint. If I put a thousand dollars in the bank for a year, I just think of it as a thousand dollars. The banker thinks of it, and may well use it as capital for the purposes of a loan, as dollars per year, so that concept has a different dimension, and I suspect a preferable one. In physics this is the difference between thinking of a quantity and the flux or flow of that quantity per unit time, like charge vs. current, or gallons of water vs. gallons per minute.
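The charge-vs.-current point can be put in a two-line sketch (the deposit and the lending rate below are hypothetical numbers, not from the text): the same sum carries the dimension of dollars to the depositor, while what the banker earns by lending it out is a flux, dollars per year.

```python
deposit = 1_000.0             # dollars: the depositor's view, a quantity
rate = 0.05                   # per year: a hypothetical lending rate
income_flux = deposit * rate  # dollars per year: the banker's view, a flow
print(deposit, income_flux)   # 1000.0 dollars vs. 50.0 dollars per year
```

The two numbers are not comparable without a time unit attached, which is exactly the dimensional distinction being drawn.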
We will see that the question of dimensions is very important when we examine the modifications that have to be made in the simpleminded picture of supply and demand, when we try to apply this idea to prices on an auction market.
Problem 1. Mr. Martin's Statement — Does a Capital Market Raise Capital?

The purpose of this problem is to try to determine how important the securities market (and the relative importance of various components of it: bonds, stock exchanges and OTC, etc.) is as a means of raising capital. Mr. William M. Martin said, in a press release (August 7, 1971) connected with his stock market study, "the public interest dictates that the primary purpose of a security market is to raise capital to finance the economy." I am skeptical of the validity of this statement and also ignorant as to just what it means.

An associated question is: is it possible to distinguish, both practically and conceptually, between "new capital" actually "created" or "raised" in the securities market, and conversion or liquidity of "used," or second-hand, capital? How do you distinguish, conceptually and practically, between money saved and used as "capital," or used as "not capital"?

Possible quantitative estimates of answers to this problem are:

1) The net change in value (for the stock market), say over one year, of all securities listed, possibly discounted (if considered as security for loans) and corrected for brokerage losses.

2) The net foreign investment over one year. Note that in cases 1 and 2 the net "creation" of capital might be negative.

3) The total value of new stocks or bonds issued. How (if it is possible) can this be "corrected" for liquidation of old issues?

4) Is it possible to identify and make a relative comparison for capital not raised in the securities market? Where does "capital" come from, or how is it raised, where there is a very small (or even nonexistent) securities market?

2.2 Gambling Games and Liquidity
The preceding problem and the discussion of its answer will give you a feeling as to just how important the security market is as a method of raising capital. Contrary to the implication of Mr. Martin's statement, like the refrain of the hymn, it ain't necessarily so. But there are a number of questions and comments that we can make. First of all, we can see that it is important for the governors of the exchanges, and the community of brokers who make a living out of the market, to have the public believe, and in particular those segments of the public who are influential, that it is important, in exactly the same sense that it is important to the Brotherhood of Trainmen to have the public believe that it is important to have firemen on diesel locomotives. In both cases there are very influential segments of society who have believed it, and the public in general seems to have been conveniently docile in its belief for a very long time.
One might well ask where the picture of the stock market which I gave you, as a legalized form of gambling with a side benefit of liquidity, came from. Also, just how important is this liquidity? We will look into this latter question, and it will hinge a great deal on exactly what the definition of liquidity is. Is it important to be liquid in a matter of minutes, as the advocates of the exchanges would have you believe, or is it sufficient to have hours or even a day for achieving liquidity, and at what cost?

I can tell you where the picture of the securities market as a form of legalized gambling came from. It came from my own Chicken Little mind, impressions of what I saw and understood when I first began to get seriously interested in the stock market. That was about 1956, and I began, as all good scientists do, by looking around and collecting as many impressions and as much information about what was going on as I could. My sources were the obvious noisy ones: the newspapers, the various financial presses such as Barron's and the Wall Street Journal. I also laid my hands on a number of books which you can find in the reading list at the end of these notes. I also subscribed to a number of financial advisory services. I recommend to you that this is a very good investment, not so much for the quality of their specific recommendations, because once you get on the mailing list of a few of these organizations, you get on the mailing list of all of them, and they will bombard you with all sorts of literature that is interesting, at least in the first instance. If you do subscribe, I suggest that you save the initial explanations of exactly how they do what they do, because they will explain to you the jargon of the market in a way that is very difficult to acquire from any other source. Often these witch doctors of folklore, if you read them carefully, can suggest to you some very interesting lines of thought and lines of understanding.
I just about wore out a copy of Magee and Edwards' book on Technical Analysis. There is a lot in it which enlightens what is otherwise a strange phenomenon indeed. You can readily understand that my listening to and observing the noisy sources of the market quickly led me to believe that, figuratively at least, the market was a place where people with green paper, that is, money, were madly scurrying around and trading it off with people who had other colored pieces of paper, the stock certificates.
It was quickly apparent, at least from these sources of information, which were producing the most noise if not information about the stock market, that the participants were indeed after a buck. Therefore, with everybody competing with everyone else, it was a game of competitive gambling. In it some were smart and some were not so smart, and the players changed sides so often that it was a picture of financial chaos or bedlam. As I had had some experience in molecular chaos as a physicist studying statistical mechanics, the analogies were very clear to me indeed. However, I should warn you that this was a lopsided picture. I didn't realize this until I spoke to the lady at the SEC. Then I vaguely remembered that there were some other things I had read in the more scholarly and academic volumes which took a different viewpoint. But like Chicken Little, I had reverence for the data as I saw it happening, and skepticism for the authorities. I believe what I see rather than what somebody tells me about an interpretation, and so that is where my particular viewpoint came from.
I should warn you again that this approach may produce a lopsided impression, in much the same way as the story of the six blind men and the elephant. Each one felt a different part and reached a totally different image in his mind of what an elephant was. One can give an analogy concerning the University of California at Berkeley and its students. If one were to listen only to the students who made the most noise and got the most publicity in the newspaper, one would have a very distorted picture, perhaps, of what the actual student body at Berkeley was like. That does not mean to imply that the noisy ones are unimportant. I think the noisy ones have had a very profound effect on the conduct of the University of California. It is certainly true that the noisy people in the stock market have a very profound effect on how it is operated too.
2.3 Supply and Demand. Economics vs. Reality
From the above approximate (and I emphasize that it is approximate) picture of the stock market, it seems evident that the vast majority of the (noisy) participants are indeed interested in making a dollar. This is confirmed indirectly not only because the price in dollars is in the numerator, where it has the position of greatest interest, or top billing (although it should be in the denominator for the oriental), but also because, at least in 1956, at least half the column in the newspaper was devoted to this dollar figure, price. There was a high and a low for the year, an open and a close for the day, and a high and a low for the day.

So the question then arose: just how are these prices generated? I had already determined that they had the properties of chaos. The question then was: what are the details of the mechanism which produces them? "Everybody knows" that prices are determined by supply and demand. So I thought I had better refresh my mind and look sharply at the details of what supply and demand were all about. I got an elementary textbook, Samuelson I believe, and read very carefully the description of these two lines that are drawn as a function of supply and demand. I looked at the schedules of actually fictitious but supposedly representative data, which were then plotted up, the points joined with either straight line segments or smooth lines, and how the derivative was evaluated, the logarithmic derivative actually, to describe the elasticity of supply and demand.
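The logarithmic derivative just mentioned can be made concrete with a small sketch (the demand schedule below is invented for illustration, not taken from Samuelson): elasticity is d(ln Q)/d(ln P), estimated here by differencing logs between adjacent points of the schedule.

```python
import numpy as np

# Hypothetical demand schedule: quantity demanded at each price
price = np.array([1.0, 2.0, 4.0, 8.0])          # $ per bushel
quantity = np.array([100.0, 50.0, 25.0, 12.5])  # bushels: halves as price doubles

# Elasticity of demand as the logarithmic derivative d(ln Q)/d(ln P),
# approximated between adjacent schedule points
elasticity = np.diff(np.log(quantity)) / np.diff(np.log(price))
print(elasticity)  # each entry is close to -1: unit-elastic demand
```

Because the schedule halves quantity every time price doubles, every segment gives an elasticity of about -1, which is why the log derivative, rather than the ordinary derivative, is the natural measure: it is independent of the units of either axis.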
little skepti
cal. No matter how hard I looked, I never could see any actual real data that showed that these lines which were so thoroughly discussed actually could be observed in nature. This accounts for the second problem which I have assigned to the class.
(See Problem 2). I got
very unhappy about this because I also read in some of the more academic volumes that the laws of supply and demand didn’t seem
to work very well when they were applied to a security market.
In
order to get this situation straightened out, we will have to call on
14
the talents of our mathemati cian Zero to see exactly wha t the words mea
n. To do that we are going to nee d some elementar y concepts of
mathematics, and apply them very closely indeed.
The first mathematical idea I want to bring out is that of a function, in the mathematical sense. You can read in the newspapers that price is a function of supply and demand. If you examine exactly what these words mean to a mathematician, you will see, as I will show you, that it isn't necessarily so, although it is so that supply and demand (but with different dimensions) are functions of price. The one statement does not imply the other, nor conversely.
Going back to what you probably learned in the eighth or ninth grade, let me define carefully the notion of a function. Actually you start out with two sets. I remind you that a set is simply a collection of objects with a definition to specify what the objects are, such that you can always make an unambiguous decision about any object as to whether it is or is not a member of the set. It is then a well-defined set. The set can be concrete objects or they can be abstract ideas, or the objects can be themselves sets of other objects, either abstract or concrete. The word function is defined in the following way. It is a relation between the objects of the two sets such that if you have the two sets of objects Y = {y} and X = {x}, you write the functional relation (x, y) such that if you pick uniquely an object from the set X, the functional relation tells you uniquely with what object y in the set Y the first choice of x is associated.
It does not follow that you can go the other way. It may be false that if you pick a y, that the rule tells you uniquely what object you get in X. So if we have that y is equal to a function of x, or over the set X = {x}, frequently written y = F(x), then, given x, we can uniquely determine a y. There are various alternative words in mathematics to express this. Sometimes x is called the argument and y the function, sometimes x is called the independent variable, and y the dependent variable. The language is that the function (a relationship) is defined over the domain, which is the set X, and the function itself, the y, is specified on the range set Y. So there are very definite limits (the sets X and Y) over which the functional relationship holds.
I can give you two examples to illustrate this functional relationship between sets. Consider the set of congressmen (House of Representatives) and the set of all voters. There are five hundred congressmen and several million voters. One can say that a congressman is a function over the set of voters, the voters being the domain and the five hundred congressmen being the range.
If you
pick a voter, he has a unique congressman. The converse is not true, because if you pick a congressman, you do not determine a unique
voter. Note that senators, since there are two each per state, are not a function of, or over the voters, nor conversely.
So that although
congressmen are a function of the voters, the voters are not a function
of the congressmen. A second example: consider the set of all mothers and the set of
all children. In this case it is quite possible for the same individual to be a mother and also a child. The mothers are a function of the children. If you pick a child it has a unique mother, but the children are not a function of their mothers.
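The distinction can be sketched as a mapping; a minimal sketch, with invented names:

```python
# A function in the mathematical sense: each element of the domain
# (children) is associated with exactly one element of the range (mothers).
child_to_mother = {
    "Alice": "Ruth",
    "Bob": "Ruth",    # two children may share a mother
    "Carol": "Sue",
}

def mother_of(child):
    """Mothers are a function of the children: a unique y for each x."""
    return child_to_mother[child]

# The converse relation is NOT a function: "Ruth" is associated with two
# children, so picking a mother does not pick a unique child.
children_of = {}
for child, mother in child_to_mother.items():
    children_of.setdefault(mother, []).append(child)

print(mother_of("Bob"))       # a unique mother for every child
print(children_of["Ruth"])    # but not a unique child for every mother
```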
Given a mother you cannot uniquely determine the child that goes with it. If you want to cause a commotion in a PTA meeting you can make this statement: “Mothers are a function of their children, but the children are not a function of their mothers.” This just shows that the common usage of a word (function) may be different from its mathematical usage. Let us now return to our supply-demand diagram, plotted with price as the function, and see what this picture says, literally, to Zero our mathematician.
[Figure: p(D) and p(S) curves, price on the vertical axis, items per unit time on the horizontal.]
Fig. 2.31 Prices as Function of Supply and Demand.
It is a mathematical convention that the vertical axis is used for plotting the function, or range set, and the horizontal axis the domain
set. So we have in this diagram three different kinds of prices: p(D) on the demand curve, p(S) on the supply curve, and the intersection, which I will call p_R. The subscript R means that it is supposedly the price which appears in the real world of buying and selling. You will note that, unlike p(D) and p(S), I have not indicated of what independent variables or domain set (if any) p_R is a function in the mathematical sense.
The big problem when we read the textbooks, or speak of “price” in everyday language, is to determine which one of these three prices is intended. We will need to examine the context and the grammatical construction very carefully in order to decide what variables are intended in any statement to be the domain and what variables the range or function set. Right away from Fig. 2.31 we notice two properties. As drawn, by the mathematical convention, p(D) and p(S) are the functions. Since p(D) slopes monotonically down, and p(S) slopes monotonically up, it is on the diagram possible to go uniquely from a particular p to a unique D or S. So the figure implies that the functional relationship
can be made in either direction. As we saw from the definition of a
function, this is not necessarily true.
Mathematicians have a word
to describe this possibility. If there is a “formula” (i.e.
some sort of
algebraic expression) to express p = p(D), then it is frequently (not always) possible to invert, and find D = D(p), over some limited
interval now of p. The domain and range sets have exchanged their roles.
The second property we notice is that we have drawn thin solid smooth curves. The word “thin” implies ordinate and abscissa (p, D or S) are sharply defined numerically. This may not be necessarily so. The word solid implies a continuous function, smooth implies the existence of a derivative, and more than that, a continuous derivative. None of these ideas appeared in our definition of a function.
Obviously, we are implying a great deal here not only about the sets D, S and p, but also the relation between them. Unconscious assumptions or implications are very often the most difficult to detect. It is a very common practice to represent phenomena by thin solid smooth graphs. Sometimes the implications of “thinness,” “solidity” and “smoothness” are justified. When they are not, assuming them can cause endless controversy and confusion.
At this point a beady-eyed Chicken Little might object to these characteristics of our diagram, and say, “Look here, you can’t have solid lines on that picture because there is always a smallest unit of money, be it a penny or an eighth of a dollar or a dollar, in which you buy things, and in addition there is always a unit of something that you buy: tires, automobiles or wheat. There is always a smallest amount, usually a bushel for wheat, but in any event we should have just whole numbers of some sort on that diagram on both axes. The lines should be dotted.”
Well, this is reasonable enough, and one
might understand that we just draw solid lines to get a picture of it. Then our mathematician Zero will have an objection on the grounds that if we are going to have dotted lines instead of solid lines on the curve then there does not exist any such thing as a slope, or a
derivative, or a logarithmic derivative either. Although the objections which Chicken Little and Zero have raised to our supply and demand figure seem quite trivial and even picayunish, these points are absolutely essential in understanding the modifications of the supply-demand relationship to make it applicable to an auction market, and to the world in general, as I see it.
Let us now examine the textbook language which describes the supply-demand diagram. If you read the fine print in the textbooks
you will notice there is a specification that the supply and demand
are those quantities which people would continue to buy (demand) or sell (supply), at the given price. Hence the actual dimensions of supply and demand are items of the commodity, tires or wheat, per unit of time, a month or year.
Note that the specification “at the
given price” implies from the grammatical construction that price is the independent variable, or domain set, which is contrary to the construction of the figure as drawn. Everyday language confirms this implication. “Everybody knows” that a supply increases with increasing price, and demand increases with decreasing price.
The language indicates that price is considered the independent variable, or domain set. Conversations with economists confirmed this. It is only an historical accident that it has become a convention in economics to plot price on the vertical axis.
A different statement from “everyday language” seems to contradict the above plausible statements about demand: “If demand increases, the price goes up.” This word “price” is not the “price” of the preceding paragraph. It is p_R, the intersection of the supply and demand curves. The statement means that if the entire demand curve is shoved to the right, the intersection point (p_R) slides up on the supply curve, which is assumed to be held fixed. But the grammar implies that demand is the domain set.
These peculiarities were horribly confusing to me, and I became even more confused and suspicious when I was looking around for a
real
example of a supply or demand curve. I received in the mail an
advertisement of some pamphlets, and the statement said, “We are
prepared to supply according to the following schedule the numbers of pamphlets at the indicated prices.” I thought “AHA,” at last a real example of a supply schedule. I plotted up the schedule using
the figures which they gave. Up to five pamphlets at 10 cents apiece.
Up to 100 at 2 cents apiece. A thousand or more at 1 cent apiece. I discovered that this “schedule of supply” was exactly the opposite in
its “slope” to the one which appeared in the textbook. So I was foiled again. It was not a real supply schedule, by the textbook definition.
Being both skeptical and Chicken Little minded, I tended to discount the textbook.
It will also be noticed if you read the fine print in the textbooks carefully, that the authors are careful to specify that these thin solid smooth line supply and demand curves are drawn with the understanding that “other things” are to be equal, i.e., held constant. What those “other things” are came out in the course of the answers of the students to the problem on this subject. This problem caused a good deal of commotion, not only amongst the students, but also amongst the faculty to whom I directed the students to address their questions, since Barrows Hall is infested with economists. Some at least were skeptical that anyone would have the gall to seriously question the legitimate activities of economists, who apparently spend a lot of time and effort computing supply and demand curves. Over in the marketing department they said they had been trying to compute these things, but there was too much noise in the data. It’s my feeling that it is precisely the noise in the data which is the interesting part of the phenomenon.
In sum and substance the consensus of expert opinion seemed to be that it is indeed very difficult to extract from real data what a real life supply or demand curve is like. The condition of holding all other things fixed is practically impossible to achieve in practice. The “other things” are such variables as weather (for demand of cosmetics, or fertilizer), the advertising budget, the preference of the consumer, the supply and demand of competing products and so on. So it is only in a very idealized sense that these curves are supposed
to have any meaning for purposes of understanding the real world. It might be well to summarize what the economists actually do when they try to carry out this calculation. They suppose a function exists which involves D, S, p, and all the other variables which are supposed to be held constant, and plug in as much real data as they can for these variables. They make a multiple regression (linear usually) to evaluate in the simplest case the “tangent hyperplane” through the center of gravity of their data in a space of as many dimensions as they have variables. If this regression calculation comes out “significant” in some sense, then by holding constant all the variables they want fixed they then get a regression “curve” (tangent line) of price against supply or demand. This process has behind it some sort of a model, for which hopefully, if they are say evaluating a demand curve, the supply curve is “naturally” held fixed. The uncertainty or arbitrariness of the model, and the difficulties of interpreting multiple and partial regression coefficients, are still present. Needless to say, this is a rather far cry from the simple-minded
picture in the elementary textbooks. The notion of a mathematical function is completely wiped out in this stochastic picture. The regression of y on x is not a functional relationship, a fortiori not invertible. The supply or demand curve “derived” is actually a regression line, plus some sort of a preconceived model to specify what the germane variables are.
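What this procedure amounts to can be sketched with ordinary least squares on invented data; the variable names, coefficients and noise level below are illustrative assumptions, not anything from the text:

```python
import numpy as np

# Invented data: log demand driven by log price plus one "other thing"
# (say an advertising budget), plus noise -- the part the marketing
# department complained about.
rng = np.random.default_rng(0)
n = 200
log_p = rng.uniform(0.0, 1.0, n)          # log price
adv = rng.uniform(0.0, 1.0, n)            # the "other thing"
log_d = 2.0 - 1.5 * log_p + 0.8 * adv + rng.normal(0.0, 0.05, n)

# Multiple linear regression: fit the "tangent hyperplane" through the
# center of gravity of the data, one dimension per variable.
X = np.column_stack([np.ones(n), log_p, adv])
coef, *_ = np.linalg.lstsq(X, log_d, rcond=None)

# Holding the other variable fixed, the coefficient on log price is the
# regression "curve" (tangent line) of demand against price; being a
# logarithmic derivative, it plays the role of an estimated elasticity.
print(coef)   # roughly [2.0, -1.5, 0.8] for this synthetic data
```

Note that the fitted line is a regression, not a function in the mathematical sense: it does not associate a unique demand with each price, and it is not invertible.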
Here we have a rather typical picture of what passes for scientific procedure in the social sciences. Social scientists for the most part don’t seem to have learned that the theory is always required to fit the data, and that it is an incorrect procedure that data should be
made to fit the theory.
From a pedagogical standpoint, science is
frequently taught this way. You do an experiment in the laboratory in which you know what the answer is by theory, and so you do the experiment to confirm it.
Even more misleading, scholarly papers
are often written this way. Even these lectures are written up in this manner. In the real world the procedure of discovery is exactly the opposite.
As a class, social scientists have never caught on to this.
As
a result they very often won’t even undertake an investigation and
collect data unless they have some sort of a theory or model to fit the data to.
This is not the way significant discoveries are made, and
it is just unfortunate that social scientists to a large extent carry
on their research in this manner. This is probably an explanation, among other reasons, of why economics is called the dismal science, but that doesn’t prevent economics from being important.
Despite the scornful and even patronizing tone of the preceding remarks, we really cannot dismiss out of hand the economist’s ideas as the ravings of an academic lunatic in his private dream world. There certainly must be some measure, perhaps, of distorted truth in his point of view. The Chicken Little in me sees quite plainly that economists with influence do make important decisions, so we can’t just brush off their ideas.
I now want to show that we can take into account the objections of Chicken Little and Zero. By turning the diagram over so that price is the domain set or independent variable, and a slight change in the dimensions of the diagram, we can bring the textbook ideas, with modifications, into rather close agreement with what actually does happen in the real world. The correct choice of dimensions, the correct choice of the dependent and the independent variable, and the discreteness of the domain set and the range set: these ideas are of fundamental importance.
Let us redraw the economist’s supply-demand picture turned over and comment on its qualities, and then I will show you that its properties, appropriately translated, do actually occur. It might seem impossible to preserve the idea of solid lines and smooth lines, which translates mathematically into continuity and existence and continuity of derivative, but it turns out that, within certain limitations, even this is possible, with discrete variables.
So, looking at the new replotted diagram we see that the demand curve slopes down to the right, the supply curve slopes up to the right. D(p) and S(p) are in mathematical language decreasing and increasing functions of p. The domain set, or independent variable, is price, and the dimensions of supply and demand are, until we change them, items per unit time. That last definition we will have to change. We also have to understand that in these plots of demand and supply against price, there are some other variables that have to be held constant, just as before, where this was expressed, “other things being equal.” We will also have this idea. We also note that the supply and demand curves intersect at a point, and presumably this is the price which appears in the real world. Economists may be misguided (who is not?) but they are not ignorant and they are not stupid. So we really should pay attention to what they have to say. Scientists who are misguided are in the best tradition of Chicken Little.
[Figure: D(p) and S(p) plotted against price p, dimensions items per unit time.]
Fig. 2.32 Supply and Demand, as Functions of Price. The Economist’s Diagram Turned Over.

Problem 2
How Economists Get Supply and Demand Curves
In almost any introductory economics text (e.g., Samuelson’s book) one can find diagrams of price plotted as a function of supply and demand. There are also schedules, or tables from which points can be plotted, and joined by straight line segments, or “smoothed” curves to give these diagrams. There is extensive discussion of this kind of (supposedly real, or at least representative) data, including an evaluation of quantities from the slope of these curves, e.g., elasticity of supply
ε = (d log p)/(d log s) = (dp/ds)(s/p) (I may have this upside down.)
The question here is, is it possible to actually go out into the real world and collect real, quantitative data from which these supply and demand curves or schedules can be constructed? I have heard that it is possible, but I am both skeptical and ignorant. So come back with as explicit and detailed information on this question as you can, with especial reference to the difficulties and uncertainties involved.
It has been commented by some academicians (cf. Granger and Morgenstern, pp. 8, 196) that conventional ideas on supply and demand don’t seem to apply to speculative markets. I am sufficiently iconoclastic to believe that the conventional presentation doesn’t apply, except in a very crude and approximate sense, to the real world, period. The changes necessary to make them applicable require close attention to the mathematical definition of function, continuity of a function, and existence and continuity of a derivative. So you would do well to review your knowledge of these. The importance of these notions can be illustrated by the following statement (which is not at all obvious): price as a function of time, which is a Brownian motion, is a continuous function for which the derivative does not exist (anywhere in the range).
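That statement can be checked numerically. For a Brownian increment over an interval dt, the difference quotient has typical size proportional to 1/sqrt(dt), so the would-be slope diverges as dt shrinks instead of settling toward a derivative. A sketch, with illustrative simulation parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def brownian_slope_scale(dt, n_samples=100_000):
    """Typical size of the difference quotient |B(t+dt) - B(t)| / dt."""
    # Brownian increments over dt are Gaussian with standard deviation sqrt(dt).
    increments = rng.normal(0.0, np.sqrt(dt), n_samples)
    return np.mean(np.abs(increments)) / dt

# As dt -> 0 the average "slope" grows like dt**(-1/2): the path is
# continuous, yet no derivative exists at any point.
for dt in (1e-2, 1e-4, 1e-6):
    print(dt, brownian_slope_scale(dt))
```

Each hundredfold shrinking of dt multiplies the typical slope by about ten, rather than converging, which is the numerical signature of a continuous but nowhere-differentiable path.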
If you try to draw a picture of a continuous function with this property, you may find it conceptually very difficult. We shall explain how to do it in detail in the lectures.

2.4 Supply and Demand in the Real World
Let us imagine a situation which, unlike the economists’ theoretical construct, does in fact occur in the real world many times every
day.
Mrs. Jones has heard of a dress she wants to buy, a green dress
with red applique alligators and she’s determined to have one. More
than that, she is willing to pay up to $25 for it, because that is the amount of money her husband has said she could have for the
project.
So in this instance we can draw her demand function for
this dress as indicated. (Fig. 2.41)
The independent variable is price; it is a discrete variable.
If
she is willing to pay $25, she is willing to pay less. The ordinate is also discrete and the demand is in fact a function of the discrete set of prices as drawn. The word function is used in the mathematical sense. A unique price gives a unique demand, but the converse is not true. That is why we turned the S–D diagram over.
Demand is a function of some other variables as well, just as the
economist’s demand curve is, which are not shown in the diagram.
In our case, the other variables which are held fixed for the purpose of the picture are the longitude, latitude and altitude of Mrs. Jones. The demand is unity for p ≤ 25, i.e., not zero, and extends over a region of space as far as Mrs. Jones can see, about 50 feet. The demand exists and is zero outside this range. Fifty feet is her distance limit of information and communication. There is also a fourth variable which is held fixed in this picture, which is the time. That is, if Mrs. Jones decides she doesn’t want a dress after all, at that time the function collapses to zero. It is still a function of course, of all those variables, five in all. If she falls asleep or forgets about it, the demand again drops to zero.
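Mrs. Jones’s demand, so described, is a non-increasing step function of a discrete price variable, with the other four variables (position and time) held fixed. A minimal sketch, assuming her $25 limit is inclusive since she is willing to pay up to $25:

```python
def mrs_jones_demand(price_dollars, limit=25, wants_dress=True):
    """Demand in items (dresses), not items per unit time.

    Unity for prices up to her $25 limit, zero above it; collapses to
    zero entirely if she changes her mind (the time variable shifting).
    """
    if not wants_dress:
        return 0
    return 1 if price_dollars <= limit else 0

# A unique price gives a unique demand, but not conversely:
# many different prices share the demand value 1.
print([mrs_jones_demand(p) for p in (23, 24, 25, 26)])   # [1, 1, 1, 0]
```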
[Figure: demand, in dresses, as a step function of price in dollars: unity up to $25, zero above.]
Fig. 2.41 Mrs. Jones Demand Function.
There is another slight difference from the economist’s picture, but there is even a mathematical notation and language for that. In the economist’s demand curve, demand is, as we have drawn it in Fig. 2.32, a decreasing function of price. In the case of Mrs. Jones’ demand curve it is a nonincreasing function of price. The difference is that if there are two neighboring points on the economist’s curve, the one with the larger price is always less in demand than the other.
In the case of Mrs. Jones’ curve, the point with the larger price is not greater in demand than the other. This is the difference between a decreasing and a nonincreasing function. They are not quite the same thing. This all fits very nicely with what we have said about the importance of discreteness and the importance of the concept of a function. In Mrs. Jones’ case the demand curve really is a function in the mathematical sense (over five domain sets) of all the variables which we have specified, although we have only drawn one, the price. Mrs. Jones goes downtown and shops around and her nonzero
demand function moves around in space and time until she comes to the second floor of the department store and asks about this dress
she wants.
The clerk tells her, “No, we don’t have any here.”
She
tells Mrs. Jones that the supply function is zero at that altitude,
but down in the bargain basement she might find a nonzero supply
function.
[Figure: discrete supply and demand step functions of price, cases A and B.]
Fig. 2.42 Supply and Demand Functions of Price, for Buyers and Sellers.
So she goes down to the basement and lo and behold, there on the rack are two green dresses with red applique alligators, and they are marked with a price of $15. Now we can also draw the supply function for this particular item. This we have indicated (case A) and you will notice that, just as in the case of Mrs. Jones’ demand function, it fits the economist’s picture, to a degree. It is a nondecreasing function of price. It is quantized at unit elements of supply, rising from zero to two at $15. It has other variables, held constant. The latitude, longitude, altitude and time of the store have the same value as Mrs. Jones’ variables, when she sees it. So just as in the economist’s case, “other things” will be equal. Supply is different from zero only when the store is open. At other hours, the variable time having changed, supply collapses to zero so far as Mrs. Jones
is concerned.
So when we superpose these two “curves” of supply
and demand we see that they do in fact “intersect” at a point, just as the economist’s diagram does, and so there is a transaction. Note that supply and demand are both altered by the transaction. This is assumed not to be true in the ideal economist’s case. We could also draw this situation (Fig.
2.42, Case B) in the
hypothetical case that there was only one dress on the rack. You will also note that there is the tacit assumption in drawing the supply curve, that if the store keeper is willing to sell a dress for $15 he is
also willing to sell it for more. So if Mrs. Jones was so broadminded and liberal that she would be willing to pay more than $15 and there
was only one dress, then the curve would be drawn as in case B. Supply and demand would intersect not at a point but over a “line”
segment (actually closely spaced dots). In this case, there is an interesting asymmetry of information.
Mrs. Jones knows the supply function of the store, but the merchant does not know the demand function of Mrs. Jones. So in the normal
course of events, she will immediately slide her demand function (case A) back down to $15 and in that case the supply and demand curves will intersect at a point ($15, 1 dress). It might appear that they then intersect on a vertical segment, but the strict definition of a function does not allow these vertical segments to be considered as part of a function, as then the uniqueness part of the definition of a function would not be met.
So with the intersection of the supply
and demand functions they can have a transaction.
It sometimes
happens that when the supply and demand curves intersect on a horizontal segment, that the people are friendly, and they may split the difference, or they may not.
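The superposition of the two step functions can be written out directly. A sketch of case A, with prices in whole dollars and the transaction convention as an illustrative assumption:

```python
# Case A sketched numerically: Mrs. Jones's demand (one dress at prices
# up to $25) and the store's supply (two dresses at $15 and above), both
# functions of a discrete price.
def demand(p):
    return 1 if p <= 25 else 0

def supply(p):
    return 2 if p >= 15 else 0

# The "intersection": prices at which somebody will both buy and sell.
overlap = [p for p in range(0, 41) if demand(p) >= 1 and supply(p) >= 1]
print(overlap[0], overlap[-1])   # overlap runs from 15 to 25

# Knowing the store's supply function, Mrs. Jones slides her demand down
# to the lowest price in the overlap, and the transaction occurs there.
transaction_price = overlap[0]
print(transaction_price)         # 15
```

The asymmetry of information in the text appears here as who gets to pick the point inside the overlap: she knows his supply function, so she picks its low end.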
You can see from the above example that with this interpretation of supply and demand, the dimensions of supply and demand are not items (demanded or supplied) per unit time, but items.
There are now a great many examples in the real world. In fact, you cannot go into a store or have any sort of a transaction without precisely these elements of the supply and demand curve occurring. For example, suppose Mrs. Jones is going to the grocery store to buy ten cans of applesauce, and she is willing to pay 15 cents a can. She sees on the shelf three cans at 10 cents and 12 cans at 12 cents. The demand and supply curves (as lines, really dots) look as we have drawn them. So she buys the three cheaper and the seven more expensive ones to get her ten cans of applesauce.
[Figure: supply S(p) and demand D(p) step functions for applesauce, price in cents per can.]
Fig. 2.43 Supply and Demand of Apple Sauce in the Grocery Store.
There is nothing special about having the demand curve of Mrs. Jones level off at just one unit. It could well be that she wanted more at a lower price, which corresponds closely, but with significant difference, to the decreasing demand function of the economist. In reference to the applesauce example, there are really several possibilities rather than just the simple-minded interpretation of the demand line at ten cans intersecting the supply at ten cans, at a single point (12 cents, 10 cans).
The simple interpretation is that
she would pay 12 cents for all ten cans. In fact she would first cut back her demand to a price of 10 cents, take the three cans and then get the remaining seven at 12 cents. If very crafty, she might ask the merchant if he had any more 10 cent cans first. Again we see here the significance of who has information and who does not. She knows his supply function and uses this information to make a two-stage purchase.
If he withholds the information that he has more
cans at 10 cents that might be added to the supply function, then he comes out a little ahead of what he would get if he gave out that information.
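The two-stage purchase against the store’s supply schedule can be checked by a small computation; the `buy` helper is invented for illustration, the quantities and prices are those of the example:

```python
# Store's supply schedule: (price in cents per can, cans available).
supply_schedule = [(10, 3), (12, 12)]

def buy(schedule, wanted, limit):
    """Buy `wanted` cans, cheapest first, paying at most `limit` per can."""
    cost, bought = 0, 0
    for price, available in sorted(schedule):
        if price > limit or bought == wanted:
            break
        take = min(available, wanted - bought)
        cost += take * price
        bought += take
    return bought, cost

# Mrs. Jones wants ten cans at up to 15 cents each: three at 10 cents,
# the remaining seven at 12 cents.
print(buy(supply_schedule, wanted=10, limit=15))   # (10, 114)

# Versus the simple-minded single-point interpretation, 12 cents x 10:
print(10 * 12)                                     # 120
```

The 6-cent difference is exactly what the merchant gains by withholding information about any further 10-cent cans.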
We are now almost in a position where we can begin to apply
these ideas to a real auction market. Before doing so let me illustrate a few other examples which will bring out certain aspects that are important for the auction market, in particular the availability of information to the different participants. Let us suppose that Mrs.
Jones, having bought in the conventional way her green dress with the red applique alligators, is so enchanted with it that she wants another dress, this time a red dress with green applique alligators. She carries her demand function of five variables around through all the possible sources of supply without finding a supply function which is different from zero. Then she hears of a very fancy haute couture establishment, and so she is determined to go there. Now this store is run in a slightly different fashion. A character in the funny papers, I think it was Henry, set up a junk or perhaps a lemonade stand which operated on exactly
the same principles. When she goes in, there in the back is a red dress with green applique alligators, but there isn’t any price on the label. So she asks the clerk about it. The clerk explains: “Oh, we can’t have anything so vulgar as prices on our dresses. We don’t do
business that way. If you want that dress, dearie, you just take it up to the counter and you write on the label what you're willing to pay. The manager will look and see whether this is an acceptable price on our list, which we don’t publish because it’s vulgar to have numbers and discuss prices. If your price is higher than what we consider a fair price, we will let you have it at the price which you have specified.”
Mrs. Jones is a little nonplussed at this, but on second thought she decides maybe that’s reasonable, so she goes and does just as the clerk told her, puts down $25 and gets it for $25. Now you will
notice the essential asymmetry in this from the preceding case. In this case, it was the seller who knew the demand function of the
buyer, but not conversely. This situation also occurs in the labor
market. A buyer of personal services will advertise what he will buy
to sell his services for less, nevertheless accepts the buyer’s bid, so that this situation is by no means unprecedented. Again you will notice that all of the previous conditions concerning the existence of
real mathematical functions and what variables (five in all) are to
be held fixed and must be equal for both supply and demand, are exactly the same.
Finally we have a third situation.
Mrs.
Jones has now decided
that she wants a black dress with white applique alligators, and it seems hopeless. Her demand function is up to $50 on this item, but the supply functions are only zero. Then she hears from her operator
in the beauty parlor that there is a store where for a price they will sell you anything. “It’s run a little differently from that bargain basement place, or the high fashion store that you went to a while ago, but nevertheless if you want it, they’ve got it. It’s up there on
Wall Street, and it’s called the New York Stock Exchange.” So she
goes up there on Wall Street and sure enough, there on the rack is a
black dress with white applique alligators. She is about to go and get it, when she is stopped by a man in a sandwich board with the word “broker” on it. He says, “Just a minute, lady, we don’t do business that way here.” She says, “You don’t, well how do you do business?”
He said, “You write down on a piece of paper what you are willing to pay for that dress, and then I’ll go to the owner and find out what
he’s willing to sell it for, and if the two prices meet or overlap, then
I will decide what the price is which will be satisfactory to you both.”
Mrs. Jones thinks about this a while. Then she does just what the man in the sandwich board says and puts down $25.
It turns out
that the owner was willing to sell it for $15, so the broker, being an
honest man, comes out and says: “Madam, you get it at a bargain for $20,” and she goes away happy having saved $5, she thinks. The
owner is happy since he got five dollars more than he was willing to
take.
You will see that this is a third situation in which neither the buyer nor the seller knew the other’s demand or supply function, but with the aid of a go-between whom they trusted, a transaction was effected. This situation, which held only for one unit, would hold for as many units as you like, nor does there have to be the same number of units on either side.
The prices need not be the
same, they might be different for each unit. You can simply plot up
the demand and supply functions in a discrete manner, a series of steps going down and a series of steps going up, and if they overlap then there is a transaction for an equal number of buyers and sellers at a single price. The reliability of the intermediary, the broker, is
The reliability of the intermediary, the broker, is
essential to this kind of a market. You can easily see that the man
in the sandwich board, had he been so minded, could have told the owner of the dress that the bid was $15, told Mrs. Jones she would have to pay the full $25, and pocketed the difference. No one would have been the wiser, since the final information is not available to all parties.
It is seen that those who have information that others have not are in a position of advantage, and it takes a degree of faith on the part of the participants and a degree of honesty on the part of the intermediary to assure what might be called fair play. Now perhaps the man in the sandwich board took out a little piece for himself, but maybe that’s regulated.

If you will examine the details of the above picture by contrast to what appears in an elementary textbook, you will see that there
are indeed a great many similarities.
The principal differences are brought about, as I mentioned, by making the price the independent rather than the dependent variable, by taking into account explicitly the discreteness of both the domain and the range sets, and by taking into account explicitly that the supply and demand were now items rather than items per unit time. Demand and supply functions change with each transaction. They are definitely functions of time.
With these modifications it is seen that there is quite noticeable similarity, but essential differences in detail, between the economists’ theoretical picture and the data of observation.
2.5
Continuity and Derivatives of Real World Supply and Demand

There is one aspect of the theoretical curve which does not appear in the observational curve. That is, the solidness and smoothness of the theoretical curves as opposed to the experimental ones. It doesn’t seem possible that the ideas of continuity, and of existence and continuity of the derivative, which are essential for the notions of elasticity of supply and demand and all the various theoretical developments which economists make when they use these derivatives, can possibly
exist in this framework. However, it turns out that if you really examine the definitions of function, of continuity, and of derivative very carefully indeed, even these properties of continuity and existence of the derivative can be satisfied (in part) by the discrete sets of the domain and the range. So far as I can see, this demonstration is more a mathematical feat at the moment, and a tribute to the careful definitions of the mathematicians, than a situation which allows the notions of continuity and elasticity and the existence of the derivative to be applied in the same way that the economists do it. Nevertheless, continuity and derivative can be defined when the definitions are followed carefully. This curious situation caused a good deal of discussion in the lectures at the time.
Remember now that the supply and demand curves as redrawn are, rigorously, functions of their respective independent variables: price, latitude, longitude, altitude, and time. Of these we are specifically going to concern ourselves only with price. The others are “to be equal” or held constant.
The “peasant” definition of continuity of a function D(p) is: D(p) is continuous at p = p₀ if

lim_{p→p₀} [D(p) − D(p₀)] = 0.   (2.5.1)
The next question is, exactly what does the word limit mean? For the case of continuity of a function, it is as follows. This is the “aristocratic” definition of continuity of a function. Note particularly the order in which the small quantities ε, δ are picked, and the specifications on them. ε is picked first, must be positive and not zero, and may be as small as you please. ε must exist. δ simply has to exist, and be positive, not zero. It may or may not be “small.” D(p), D(p₀), p and p₀ must all exist, with p ≠ p₀.

D(p) is continuous at p = p₀ if for any given ε > 0 there exists a δ > 0 such that

a) |D(p) − D(p₀)| < ε

if

b) 0 < |p − p₀| < δ.   (2.5.2)
As an aside, we have uniform continuity over a “region” of p’s if the same δ can be used throughout the region.

The above is the conventional definition of continuity of a function from any calculus book. It is usually, if not invariably, applied in a universe of discourse of the real numbers: integers including zero, fractions or rational numbers, irrationals including the transcendentals. This universe is precisely what Chicken Little objected to when the economist drew solid lines.
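Since what follows turns on constraining ε and δ to the same discrete universe as the function itself, the ε–δ definition can be exercised mechanically. A minimal sketch (my own code and example function, not the book's): on an integer domain, δ = 1 admits no points with 0 < |p − p₀| < 1, so condition a) holds vacuously, and even a wildly jumpy D(p) passes the test at every p₀.

```python
def is_continuous_at(D, p0, domain, eps_values, delta_values):
    """Brute-force the epsilon-delta test, with eps and delta drawn
    from the same discrete universe as the domain and range."""
    for eps in eps_values:
        found_delta = False
        for delta in delta_values:
            near = [p for p in domain if 0 < abs(p - p0) < delta]
            if all(abs(D(p) - D(p0)) < eps for p in near):
                found_delta = True      # this delta certifies this eps
                break
        if not found_delta:
            return False
    return True

def D(p):                   # a deliberately jumpy integer "demand" function
    return (7 * p * p) % 13

domain = range(-50, 51)
# delta = 1 leaves no integers with 0 < |p - p0| < 1, so condition a)
# holds vacuously, and D passes at every point:
print(all(is_continuous_at(D, p0, domain, range(1, 4), range(1, 4))
          for p0 in domain))   # -> True
```

This vacuous satisfaction is one concrete way to see how the “mathematical feat” mentioned above can come about on discrete sets.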
Demand is in integers (units purchased); likewise price ($, 1/8’s of a dollar, or even pennies). Now if the universes of discourse for the domain and range are discrete sets (let us just use the integers including zero for simplicity), then the δ’s and ε are also so constrained. So let us go back over the definition and see under what conditions D(p) is continuous. ε must be greater than zero; the smallest such ε which exists is 1. So we have for a):
[The continuation of this derivation, the remainder of Sections 2.5–2.10 (pp. 32–50 of the original), and two limit-order-book tables (priority numbers, limit buy and sell orders, and the order functions D(p) and S(p) at prices $41–$46) are not recoverable from this scan. The text resumes mid-sentence in Section 2.11, on the market maker’s required information and limits.]
0, and lower limits to the absolute difference of bid or offer from the most recent price. This latter lower limit may of course be zero. There are upper limits also to all these quantities; how these are determined we shall shortly discuss. But that there are upper limits is more important than exactly what they are, just as in the case of the inventory limits. All these limits, plus his present inventories and the most recent price, summarize his input information.

It is evident that, just as in the case of the used car dealer, his profit in dollars depends on the size of the spread and the number of transactions which he makes. Every time he makes two transactions, one a buy and one a sell, he gains a spread if the quote hasn’t changed. If he keeps his inventories near their optimum values and stays as far as possible away from his limits, then in the course of time he must move the quotes, and hence transactions, in such a way that the total number of his buys frequently equals the total number of his sells. The sum of these, divided by two, times the average spread he has maintained, represents his dollar profit.
(Not exactly,
see below.) We can see that if in the course of trading he tends to buy more than he sells, so that his stock inventory gets large, he will want to
lower prices. If he sells more than he buys so that his inventory of stock tends to approach its lower limit, he will tend to raise prices.
The question is how much can he raise or lower prices, and how far apart can the spread be to enable him to keep the market open and
stay away from his inventory limits.
2.12
Definition of Profit in Market Making
If the money piles up while the stock inventory stays near its optimum, he can legitimately label the excess above his upper cash inventory limit as profit*, and spend it for something else: wine, women and song, buy a house, or put it in the bank.
He should not use it for market making, because correct thinking for this market maker requires boundaries on the money to be used for market making.
As we shall see, it is equally possible for stock to pile up while he is at or near the optimum point of his cash inventory. The extra stock certificates above his upper bound of stock inventory should be regarded as “profit,” but “profit” measured in shares of ownership, not their equivalent dollar value. This is “correct thinking” but not according to accepted accounting principles. He can use these “profits” to paper his office, or celebrate with a bonfire.
He should not
* I used the word “profit” in the lectures in order to emphasize the symmetry between cash and stock. “Surplus inventory of cash” might be a better term. The important point is that “profit” should not be regarded as part of the “market making capital equipment.”
use them in market making. In practice he is more likely to arrange a special “bargain sale” outside his regular market making activities.
There are some real life analogs to this situation.
Examples
are private placement, off floor or special distributions, sale as letter
stock, or sales to the corporation itself, for treasury stock. He may try to get rid of it by any method that does not immediately affect his regular market making.
These methods are applications of the
advantage of using temporarily withheld information.⁸ Note that he is not a market maker when he does these things; he is a salesman. It is not good practice, as a market maker, to deliberately get into this position of “profit” in stock certificates. He is then trying, indirectly, to make a profit in dollars on his stock inventory.
It is an indirect
violation of the basic principles of market making. Some of the students raised questions about this peculiar concept
of “profit,” and asked why it was inappropriate, for purposes of calculating your net position, and hence net profit or loss, to convert
your stock inventory to some equivalent cash figure. Presumably you might have to do this to satisfy an external auditor or tax collector
who did not share these peculiar views. I admit the practical necessity of having to compute profit or loss in the conventional fashion, but for the purposes of setting the strategy of a market maker, it
should not be done. Let me explain why. First of all, it is the market maker himself who is setting the prices. It is inappropriate, and in a sense a conflict of interest (with
his customers, or another side of his life, as an investor), to think
of his inventory as an equivalent amount of cash, when he to a very considerable degree controls the price.
The difference between a market maker and a speculator or investor is that the latter tries to make money on his stock inventory, whereas to the former the stock inventory is a tool of his trade, to be kept as nearly constant, and preferably as small, as possible. But by his basic principles, he must have some stock to stay in business, his primary concern. “Profit” (in dollars or stock) is, surprisingly enough, as we shall see, an incidental byproduct of his activity, just as (in our view) liquidity is a byproduct of the primarily gambling aspect of security markets in general.
⁸ See Section 2.18 on direct and reverse ratchets. The specialist can stash this “surplus stock” in either his segregated investment account or his omnibus account. His “inventory for market making” is called his trading account. See Richard
Ney, The Wall Street Jungle.
Finally you should note the essential symmetry of the market maker’s activities. Stock certificates and dollars are just two kinds of paper which it is his job to render exchangeable, and this symmetry in viewpoint is quite essential to carrying out the basic principles under which he operates. The two separate viewpoints of “profit” are needed to preserve this symmetry.
We have seen that the lower bounds of inventory for this market
maker are zero, both for cash and stock. Let us now set the upper
bounds in a number of different ways, and see over what limits of
price he can make a market.
2.13
The Effect of Lower Limits Only on Inventory,
on Price Limits
Let us assign our market maker a starting price of $30/share. This may be from the most recent transaction, or the previous day’s close, from an underwriter, or from a LOOIM after some crisis and resulting trading halt. We give him 20 round lots, to be specified as the optimum inventory, and also give him $60,000, specified as the optimum equivalent to the stock inventory at $30/share. $1/2 is specified as an acceptable upper limit of the price change between transactions. We note for future reference that it will make a great deal of difference whether this specification is never to be exceeded, or not on the average (and what kind of an average), or “not very often,” and just how often is not often.
Let us suppose our market maker, like Zero, is literal minded and cautious. Being told to make a market, and not told to make a profit (yet), he focuses just on market making. He adopts the rule of just one round lot as the size on either side of his quote. He starts with a quote of (29 1/2, 30). For every buy he makes (a market order to sell into the market), he lowers both quotes 1/2. We define the “drift” d as the change Δp̄_i = p̄_i − p̄_{i−1} = d of the mean quote p̄_i = (b_i + o_i)/2 (b = bid, o = offer) from one quote (i − 1) (with an intervening transaction) to the next (i). In this case the drift is exactly equal to the spread. For every sell he makes, he raises his mean quote in exactly the same way. (See Fig. 2.13-1A.)
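The rule just stated, one round lot on each side with both quotes lowered 1/2 after every buy and raised 1/2 after every sell, can be sketched as follows (my own code and names; inventory limits are not enforced in this sketch):

```python
def run(orders, bid=29.5, offer=30.0, drift=0.5, lots=20, cash=60_000.0):
    """orders: 'B' = maker buys 100 shares at his bid,
    'S' = maker sells 100 shares at his offer."""
    for side in orders:
        if side == 'B':                # a market order to sell hits his bid
            cash -= 100 * bid
            lots += 1
            bid, offer = bid - drift, offer - drift
        else:                          # a market order to buy lifts his offer
            cash += 100 * offer
            lots -= 1
            bid, offer = bid + drift, offer + drift
    return bid, offer, lots, cash

# Equal buys and sells return quote and inventories to their start, in any order:
print(run('BBSSSB'))   # -> (29.5, 30.0, 20, 60000.0)
# Twenty straight sells exhaust his 20 round lots, the last sale at 39 1/2:
print(run('S' * 20))
```

With drift equal to spread, a balanced round trip earns nothing: the quote is simply a mirror of the inventory.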
Fig. 2.13-1 Examples of Simple (Symmetric) Market Making. (A) Transacted prices for three buys by the market maker followed by three sells; spread = drift = 1/2, profit = 0. (B) Transacted prices for the same sequence with spread = 1, drift = 1/2; profit = (No. of transactions/2)(spread − drift) = (6/2)(1/2) = 1 1/2 per share. o = offer, b = bid. [The plots themselves are not recoverable from the scan.]
With this “strategy” he can make a market up to 39 1/2, where he runs out of stock, and down to about 18, where he runs out of money.
We can also see that his inventory exactly determines his
quote, or vice versa.
The order in which market orders to buy or sell arrive at the market makes no difference in this regard.
If we
imagine a “tape” showing his transactions only, the prices will be
either 1/2 or zero apart, the latter indicating a buy was followed by a sell, or vice versa. For any interval for which the total number of
buys equals the total number of sells, the quote and his inventories are returned to exactly the same values at the beginning and end
of the interval.
He can make a market and stay in business over a quite respectable price range. Transacted prices never change by more than 1/2. If that is “close” enough to satisfy an acceptable definition of “market” continuity in transacted price, he has made an acceptable market.
It is not a “deep” market (only one round lot
on both sides of the quote). An incoming order for three round lots would have to be satisfied by three successive quotes.
Quote prices are, but transacted prices are not, a function of stock inventory.

Let us now introduce a little profit, but stick literally to the other specifications: no transactions more than 1/2 apart, and make a market over as wide a price range as possible (see Fig. 2.13-1B). With a “most recent” price of 30, he quotes 29 1/2 to 30 1/2, and drifts the quotes 1/2 after each one-round-lot transaction, just as before. The drift (1/2) is now one half the spread (unity). He can make a market over exactly the same price range as before, 40 to 18. The transactions on the tape look much the same except that there
will be no consecutive transactions at the same price. The average (and every) transacted price change (absolute) will be exactly 1/2; it was slightly less in the previous case.
As before the quote is a
function of the inventory, and vice versa. How much is the profit (in dollars)? You can readily verify from
the figure that for any time interval for which the initial and final quote prices, hence also the inventories, are the same, the profit is:

Profit = (N/2)(spread − drift)(“size” of market),  N = no. of transactions,   (2.13.5)

where the “size” of market = 100 shares in this case.
Let us suppose that intelligent investors with the advice of experienced security analysts recognize that the intrinsic value of this stock is indeed somewhere in the price range $18 to $39, and the price remains in that range for the course of a year.
Trading at a modest volume of 10 round lots per day for 250 trading days per year, Zero the mathematical market maker turns a profit of:

Profit = (N/2)(spread − drift)(size) = ((10 × 250)/2)(1 − 1/2)(100 shares) = $62,500,   (2.13.6)

or slightly over 50% per year⁹ gross return on “capital,” with conventional accounting.
If he can stay in business for a year, he can write down his troublesome stock inventory to zero, and stick to conventional accounting principles for profit calculation thereafter. Not bad, Zero, not bad at all.

What is the least non-zero profit he can make? This is set by the smallest discrete price steps. Assuming his first quote is 29 7/8 to 30 1/2, a spread of 5/8 and a drift of 1/2 give the same price range of market making, 18 to 40, as before. The annual profit is one fourth as much. The average price change between transactions is only slightly greater than for the case of zero profit, and transaction price changes are never greater than 1/2.
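Formula (2.13.5) can be checked by simulation (a sketch of my own, not the book's; ‘B’ means the maker buys at his bid, ‘S’ that he sells at his offer, and the interval is assumed to end with inventory back where it began):

```python
import random

def profit(orders, mid=30.0, spread=1.0, drift=0.5, size=100):
    """Dollar profit of the drifting maker over a 'B'/'S' order string."""
    cash = 0.0
    for side in orders:
        if side == 'B':                      # maker buys at his bid
            cash -= size * (mid - spread / 2)
            mid -= drift
        else:                                # maker sells at his offer
            cash += size * (mid + spread / 2)
            mid += drift
    return cash    # equals dollar profit when inventory ends where it began

orders = list('BS' * 10)          # ten buys and ten sells
random.shuffle(orders)            # arrival order does not matter here
expected = (len(orders) / 2) * (1.0 - 0.5) * 100   # (N/2)(spread - drift)(size)
print(profit(orders), expected)   # -> 500.0 500.0
```

Every down-move of the quote is eventually matched by an up-move at the same level, so each buy–sell pair nets exactly (spread − drift) per share, whatever the arrival order.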
2.14
The Effect of Upper Inventory Limits, on Price Limits
Let us make a few comments on the above calculations, which will
suggest some alternate strategies.
We have specified the minimum
and optimum inventories, but not the maximum allowed.
We just calculated the transacted price range at the ends of which one inventory or the other hits its minimum (zero) and ended his continuous in time market making. Presumably as these transacted prices were
approached, he might want to take some corrective action. What are the possibilities?
Before going into this, let us note the following.
Note that the profit, unlike that of the used car dealer, is not given by the spread alone, but by (spread minus drift) × (No. of transactions/2) × “size.” It is the drift which gives him protection as a market maker over as wide a range as $40 to $18 in the price. We have specified the lower and optimum inventories, so follow the principle that he should stay away from both bounds as far as
possible. Let us set upper limits of 40 round lots (twice the optimum) and, for cash, $120,000, also twice the initial optimum. These limits, or the corresponding prices, are reached slightly inside the previously calculated absolute price limits.
They can be calculated as follows, using a drift of $1/2. For the number n_u of steps up,

$60,000 = 100 Σ_{j=1}^{n_u} (30 + (1/2)j) = 100[30 n_u + (1/4) n_u(n_u + 1)].

From this, n_u ≈ 17, and p_u (upper transacted price) = 30 + 17(1/2) = $38.50. The allowed number of steps down is n_d = 20, where the inventory becomes 40 round lots, the maximum allowed. Hence p_l (lower transacted price) = 30 − n_d(1/2) = $20.

⁹ For specialists on NYSE, studies have indicated a return of 30 to 40% per year, so our crude calculation seems correct in order of magnitude. (See, e.g., the Cohen report.)
Note that p_u (upper) and p_l (lower) are not quite symmetric about the starting price of $30, and when they are reached he still has about two round lots to sell (at the upper limit of price) and enough cash to buy about four round lots at the lower limit.
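The step count n_u can be found by the same accumulation the text performs (my own sketch and names; the maker sells one round lot per 1/2-point step up until the next sale would exceed the $60,000 of cash headroom between his optimum and maximum cash):

```python
def steps_up(headroom=60_000, start=30.0, drift=0.5, size=100):
    """Count 1/2-point steps up, selling one round lot (100 sh) per step,
    stopping before accumulated cash would exceed `headroom`."""
    cash, n = 0.0, 0
    while cash + size * (start + drift * (n + 1)) <= headroom:
        n += 1
        cash += size * (start + drift * n)
    return n, cash

print(steps_up())   # -> (17, 58650.0)
```

Seventeen steps up, i.e. p_u = 30 + 17(1/2) = $38.50, in agreement with the text.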
2.15
The Return to Optimum Inventory, When the Limits are Exceeded
The simplest procedure when these prices, p_u (upper) and p_l (lower), or the equivalent inventory limits, are reached is to declare a halt in
trading, to be reopened via LOOIM. Technically, this is a failure as a continuous in time market maker, but it won’t happen very often
relative to the number of trades he does make as a continuous in time market maker. Even so, there are ways to prevent this failure or to
camouflage the failure with a continuous “image.”
The suspension
will not last long in time. What are the possibilities when he reaches these inventory limits?
On the downside he suspends at $20, declares a LOOIM, and lets the orders pile up. If there are at least 20 round-lot incoming buy orders available¹⁰ to him (at any price), then, since he sees the orders before setting the opening price, he can always put in 20 of his sell orders at a price such that at least his will be satisfied. He is then exactly at his
optimum position as a market maker relative to the opening price, of twenty round lots, plus cash to buy 20 more round lots at the opening
price. He actually has a little extra cash, since he suspended before he quite ran out of cash.
Suppose there are fewer than 20 incoming buy orders. Then he takes what there are, and sets new inventory limits. Just the number of dollars that he got for his sale is his new optimum cash.
Just
the number of shares that he sold is his new definition of optimum stock inventory. With this new set of limits he goes into business again. He makes a new continuous market in time. In this case, where there were fewer than 20 round-lot incoming buy orders, he has some surplus round lots, “profit” or surplus inventory. A bonfire or (not recommended) a special bargain sale (off the “market”) is indicated.

¹⁰ We have to qualify the words “available” and “participate” in case there are “ground rules” of the market (as on NYSE) which may give precedence to orders from outside over those of the market maker.

The suspension on the upside has similar consequences. He participates in the opening, just to the extent that he can buy 20 round
lots or fewer, using only half his cash for the purpose.
Then he
would have remaining cash to buy an equivalent amount of stock at the opening price. If he buys fewer than 20 round lots, these numbers determine his new inventory limits, and he is again at his optimum inventory point
for both cash and stock. He may have some “profit,” and it will be in real dollars.
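The downside reopening rule, as I read it, reduces to a couple of lines (a sketch with hypothetical names; the $20 opening price is the one from the example, and NYSE-style precedence rules are ignored):

```python
def reopen_downside(incoming_buy_lots, open_price, target_lots=20):
    """New (optimum stock, optimum cash) after a downside LOOIM reopening."""
    sold = min(target_lots, incoming_buy_lots)   # he sells into the book
    new_opt_lots = sold                          # shares sold -> new optimum stock
    new_opt_cash = sold * 100 * open_price       # dollars received -> new optimum cash
    return new_opt_lots, new_opt_cash

print(reopen_downside(20, 20.0))   # deep book: full re-equilibration -> (20, 40000.0)
print(reopen_downside(12, 20.0))   # thin book: a new, smaller optimum -> (12, 24000.0)
```

Whatever he actually sold, and the dollars it brought in, become the new balanced inventory point around which he reopens.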
We have described this somewhat artificial example of market making in order to show that consequent to the three principles, there are purely mechanical rules which can be followed.
Once the limits on inventory and limits on closeness to the most recent price have been set, the discreteness of the price (1/8) and the discreteness of the size (1 or more round lots) play a basic role in determining the price limits over which the market can be made. The peculiar concept of
“profit” (surplus inventory) emphasizes that profit making must be subordinated to market making, and this puts an inherent symmetry on the way stock and cash should be regarded.
The properties of this market suggest certain properties of real markets, and also ways to modify it in a more realistic direction. Over a predetermined interval of price (20 to 38 in the example)
quote prices are a function of inventory. The price changes between trades are either the drift, or spread — drift.
At these price limits
(20, 38) there are jumps in price (the LOOIM), probably but not necessarily larger than the previous upper limit of 1/2 in price change. The process then repeats itself, with a new “origin,” or stock inventory balance point price. Even if at each LOOIM Zero succeeds in equilibrating at 20 round lots and cash to buy 20 more, the succeeding prices at which big jumps (the LOOIM) are likely to occur are not the same. Thus, with a suspension at 20, if the price ultimately moves up to 28 there will be another suspension, whereas the previous equilibrium point price was at 30, his original starting price.
In other words, after the inventories are equilibrated by LOOIM, the quote is again a function of inventory, but not the same function that it was before. We also note one slight asymmetry. On a downside suspension (at
$20 in our example) Zero will always have enough stock to equilibrate at a new optimum stock inventory of 20 round lots.
(This assumes
there are enough buy orders “available to him” at the LOOIM.) On
the upside (at $38) he may not have enough cash to buy 20 round lots, especially if the price jumps up with the LOOIM reopening. A series of LOOIM upward suspensions will force him to diminish the chosen size of 20 round lots as his optimum stock inventory.
This is inherently reasonable, a market maker is not expected to make as deep a market in round lots with high as opposed to low priced stocks.
Put another way, if $120,000 is the absolute upper
limit of cash inventory, independent of price, then his optimum and
maximum stock inventory must get less with increasing price.
2.16
Tactical Inventory Zones
The preceding market making strategy might be called a “two-zone” strategy.
Inside his inventory limits he has a simple rule to
give his quotes.
Outside it, he has a rule (the LOOIM) for getting
inside and at the center of the first zone again. The first zone may then have redefined limits. The greatest “threat” to being driven out of the first zone is a long, unbroken sequence of market orders of the same sign, and this statement is true for any market making strategy. The limit to the number of these he can accommodate without being driven out of the
first zone is set by his preassigned inventory limits. The transacted price range is set by this number (the inventory limit) times the
allowed drift.
It makes no difference (at this stage) whether these
incoming orders are from separate customers or in a big block,¹¹ since the market maker is only quoting a size of one round lot per quote, against incoming orders of unknown size.

Let us divide this inventory range (optimum ± 20 round lots) into subzones, at the boundaries of which the market maker makes increasingly drastic changes in his strategy to protect himself against being driven outside his absolute inventory limits to a suspension.
The numbers given are arbitrary, for illustrative
purposes. There are many alternates to the specific choices we give.
1). In the first zone, optimum ± 3 round lots, he quotes a spread of 1/4 and a drift of zero (29 3/4, 30; size one on both sides). Note that this gives a depth to the market of at most three round lots without changing the quote. But this depth is not given in quite the same way as quoting a size of 3 at the optimum inventory point, and then dropping to sizes 2, 1 as the inventory goes off balance. Quoting a size of more than one tells the inquirer how many more orders the quote might absorb before moving, whereas only quoting a size of one on each quote withholds this information. This is advantageous to the market maker.

¹¹ If he is given this information (a big block on an incoming order, as on NYSE for a big block accumulation or distribution) he can use this information advantageously. See the section on ratchets. Information on a big block he is not required to execute immediately represents a temporary addition to his inventory (stock or cash) which he has to work off.
2). In the second zone, from optimum ± 4 to ± 8 round lots, he drifts 1/8, maintaining a spread of 1/4.
Note that the profit
per share per pair of buys and sells in the first zone is 1/4, the full spread, whereas in the second zone it is only 1/8, the spread (1/4)
minus the drift (1/8). The “virtue” of “stabilizing prices” and not letting them drift, or “ironing out temporary unbalance of supply and
demand”
is also relatively twice as profitable.
In the second zone,
transacted prices change by 1/8, the least possible. The market is as
“continuous” (close) in transacted prices as possible. It should now be clear what the quotes will be like in succeeding
zones away from optimum, as the inventory limits are approached. Increase the drift, so that prices move with the fewest possible num
ber of trades to what ever new price is considered to be an equilibrium or fair price to outside buyers and sellers. Increase the spread,
so that profit (spread minus drift) is still positive.
3). From optimum ± 9 to ± 20 round lots of stock inventory he can quote a spread of 5/8 and a drift of 1/2. These tactics in these three zones will get him to a lower limit of price of

30 − 3 × (0) − 5 × (1/8) − 12 × (1/2) = $23.375,

where he reaches his upper limit of stock inventory, 40 round lots.
The upper limit of price, where he has $120,000 in money is reached
at about $36.
With these tactics alone, for all zones, the prices he
quotes will still be a function of inventory, regardless of the order in
which buy and sell orders enter the market.
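The three illustrative zones can be encoded as a single lookup (my own sketch and names), which also reproduces the $23.375 lower price limit computed above:

```python
def zone_policy(unbalance):
    """(spread, drift) quoted at a given stock unbalance from optimum,
    measured in round lots."""
    u = abs(unbalance)
    if u <= 3:
        return (0.25, 0.0)      # zone 1: spread 1/4, drift 0
    if u <= 8:
        return (0.25, 0.125)    # zone 2: spread 1/4, drift 1/8
    return (0.625, 0.5)         # zone 3: spread 5/8, drift 1/2

# Walking the inventory from optimum out to +20 lots, summing the drift
# applied at each step, reproduces 30 - 3(0) - 5(1/8) - 12(1/2):
price = 30.0
for lot in range(1, 21):
    price -= zone_policy(lot)[1]
print(price)   # -> 23.375
```

The quote is a pure function of inventory: the maker need only look up which zone his unbalance falls in.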
It will be seen that the strategy of the market maker is exceedingly simple and flexible. Once the zone boundaries of inventory have been set, and a spread and drift chosen for each zone, quote prices are a function of his stock inventory.
All he has to do is to
look at this inventory to know what quote to give. Note that transacted prices are not, precisely, a function of inventory.
The profit
in any zone is just the (spread minus drift) times N/2, times size, for any segment of prices with equal numbers of buys and sells and equal quotes at the beginning and end of the segment. N is the total
number of transactions in the segment. The above market is not very “deep.”
Except in the first zone
there is just one round lot on either side of the quote. The conclusions
can easily be generalized to deeper markets, with more than one round lot on either side of the quote.
At the moment, we prefer
to view this rather as a series of identical quotes, each for a size of one round lot, since this tactic holds back a little information which favors the market maker.
2.17
Tactics With the Minimum (1/8) Spread and a Size Greater Than One Round Lot

Let us suppose our market maker, either from pride, regulation, or force of competition, wants to quote the narrowest possible spread,
1/8. Since 1/8 is also the least drift, if he drifts his quote 1/8 after each one lot transaction, spread minus drift is zero and he makes no
profit. However, if he holds his quote fixed until it is net k longer or shorter than at the start of the holding period, and then drifts 1/8 up or down according to his previous policy, his mean drift under this policy is d = 1/(8k). He drifts again when his inventory changes by k more round lots. The formula for the profit then is (N/2)(spread − d). This is an average, not exact as in the previous case, but fairly accurate.
We can see by means of numerical examples (see Figure 2.17-1) that if there are runs of buys and sells in the orders of less than k, the market maker makes the full spread, even
though his policy is to drift by an average d between trades. Note with this policy that quote is still a function of stock inventory, as listed on the left side of each diagram.
Although the
inventory is not a function of the quote, as it was when the size, or k, was just unity. Note for example there are & buys in the first zone,
although the inventory rule says the quote holds for unbalance k —1. This is according to the rule. At inventory unbalance k — 1 the mar
ket maker doesn’t know that the next order will throw him more off balance or not (i.e., whether it will be a buy or sell). Strict attention to the functional relationship, “inventory determines quote” means
that for a string of buys followed by a string of sells equal in number, the transacted prices (*) do not quite repeat themselves, in just such a way that the formula for profit (spread − mean drift) holds, on the average. This is true regardless of the order in which the buy orders
and sell orders come in.
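The k-lot holding tactic can be sketched as follows (my own code and names; only the quote midpoint and the frequency of quote moves are tracked, not the spread bookkeeping):

```python
def k_holding_mid(orders, k, mid=30.0, step=0.125):
    """Quote midpoint under the rule: hold the quote until the inventory
    is a net k lots away from where it stood at the last move, then
    drift 1/8 against the unbalance."""
    anchor = inv = moves = 0
    for side in orders:
        inv += 1 if side == 'B' else -1
        if inv - anchor == k:       # net k buys since last move: drift down
            mid -= step
            anchor = inv
            moves += 1
        elif anchor - inv == k:     # net k sells since last move: drift up
            mid += step
            anchor = inv
            moves += 1
    return mid, moves / len(orders)   # midpoint, and moves per transaction

mid, move_rate = k_holding_mid('B' * 24, k=3)   # an unbroken run of buys
print(mid, move_rate * 0.125)   # drifts 1/8 every 3rd lot: mean d = 1/(8k)
```

A long one-sided run moves the quote 1/8 per k lots, a mean drift of 1/(8k) per transaction, as in the text; runs shorter than k leave the quote, and hence the full spread, intact.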
Case (a). Effective size k = 3; five buys followed by five sells; number of transactions = 10.

Case (b). Effective size k = 3; five buys and five sells arriving in the order b b b s s b b s s s; number of transactions = 10.

Case (c). Effective size k; t buys followed by t sells; number of transactions = 2t.

[Fig. 2.17-1: the diagrams for these cases and their exact and approximate profit figures, together with the intervening pages of the original (the beginning of Section 2.18 on ratchets), are not recoverable from this scan. The caption of Fig. 2.18-1, “Trading with a ratchet,” survives: the final quote is independent of the order in a sequence of equal numbers of buy and sell orders, but the profit is not; drift d_s = 0 after a sell; drift d_b = −1/8 after a buy; spread = 1/2; b = bid, o = offer, * = transaction price.]

It should be noted that with any type of ratcheting the quote
prices are not a function of inventory, but depend on the detailed order in which the buy and sell orders come in. So the market maker has to look not only at his inventory but also at the actual price, or price target, that he wants.
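The claim in Fig. 2.181's caption can be illustrated with a toy simulation. This is only a sketch: the ratchet rule below (quote drifts down 1/8 after each market-maker buy, no drift after a sell, spread 1/2) is my reading of the caption, and the function name and starting price are my own illustration, not the author's bookkeeping.

```python
from fractions import Fraction

def ratchet_trade(orders, spread=Fraction(1, 2), drift=Fraction(1, 8),
                  mid=Fraction(30)):
    """Simulate a direct-ratchet market maker (a sketch of Fig. 2.181).

    'b' = public sell hits the bid (market maker buys, quote ratchets
    down by `drift`); 's' = public buy lifts the offer (no drift, per
    the caption d_u = 0).  Returns (realized profit, final midpoint).
    """
    profit = Fraction(0)
    for side in orders:
        if side == 'b':                 # market maker buys at the bid
            profit -= mid - spread / 2
            mid -= drift                # ratchet the whole quote down
        else:                           # market maker sells at the offer
            profit += mid + spread / 2
    return profit, mid

p1, q1 = ratchet_trade('bbbbbsssss')    # five buys then five sells
p2, q2 = ratchet_trade('bsbsbsbsbs')    # same orders, alternating
print(p1, p2, q1 == q2)                 # profits differ, final quote agrees
```

Under this rule, five buys followed by five sells yields a profit of 5/8, while the alternating sequence yields 15/8, yet both end at the same final quote, exactly the behavior the figure describes.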
Let us define a direct ratchet as one which pushes prices down when the stock inventory is long from optimum, or up if short. Reverse ratcheting pushes the price up when the inventory is long from optimum and down if short. Since direct ratcheting pushes the quote, and also the balance point price, down when the stock inventory is long from optimum, if continued long enough it will (presumably) bring out the buy orders to bring his inventory into balance. Similarly on the upside; so direct ratcheting is intrinsically inventory stabilizing, which is of course what it is intended to do. Reverse ratcheting does just the opposite. It tends to push the price up when the inventory is long; if continued too far, it will presumably bring out more sell orders and throw the inventory even farther out of balance. So reverse ratcheting should be used with caution, preferably when the ordinary stock inventory is close to balance anyhow, in the first zone, and not be allowed to push the price too far.
As we have indicated, it is a violation of first principles of market making to try to make a profit on your inventory. However, reverse ratcheting is particularly helpful when (as on NYSE, or in the first stages of making a market after underwriting) the market maker has a special big order to handle. See the section on the specialist (2.25). Any order for which the market maker has information as to its size, and on which he is not required to make an immediate execution, gives him special information advantageous to him, and can be regarded as temporary inventory. He uses this information to preserve his image of making close (market continuity) transacted prices, and possibly a profit too. There are analogs to this situation in other aspects of economic life. "List prices" are stable and continuous in time, but there are numerous occasions of side deals both above and below these prices, which are not given the publicity which would disturb the "image" of fixed and continuous list prices. The section on the specialist describes some of these "side deals" with big orders.
We can make a comparison between the direct ratchet, when used to bring inventory to its optimum point, and the LOOIM, called or overnight, used to accomplish the same end. The LOOIM determines a new price, set by the combined public judgment expressed by all the limit order prices, plus the judgment of the market maker after he sees the orders. At this new price the market maker's inventory is at optimum. With a direct ratchet, the market maker tries to guess (perhaps several guesses in succession) what this price will be, and ratchets to it as a target price. So we can say a LOOIM is "one big ratchet trade" to equilibrate inventory. Alternatively we can say equilibrating inventory with a direct ratchet is a "quasi LOOIM," but smeared out in time, using the market maker's estimate (rather than specific limit orders) of what the public is willing to accept as an inventory balance point price.
2.19
Comments on Continuity. Methods of Preserving the Image of a Continuous Market Maker

Conventionally, one thinks of continuous increasing and decreasing supply and demand functions of price intersecting at a point, so conventionally one might think it was up to the market maker to
find that point. The notion of “continuity” of prices, or closeness to the most recent price, carefully fostered by the professional market
makers and unconsciously subscribed to by the public, coincides well
with this conventional point of view. One might even say that since prices are close (“continuous”) most (not all) of the time, the conventional point of view must be correct, since it is in agreement with
observation. Even (or especially) Chicken Little should be impressed
by this argument. A closer look at the phenomenon shows some small differences from the above picture, but those small differences have big consequences.
Supply and demand, as we have redefined them, are steplike functions of price, piecewise continuous, nondecreasing and nonincreasing, respectively. They intersect not at points but on finite horizontal segments. These segments are long, consequent to the properties of market orders. So a single intersection price is ambiguous (transacted price is not a function of supply and demand). The market maker picks prices, and sequences of prices, which bring satisfaction to the customer and profit to him, just as the gentleman in the sandwich board did when Mrs. Jones bought her dress.
The tactics for inventory balancing which we have given for the continuous market maker to stay in business during business hours¹⁴ (increasing spread and drift, jumps, ratcheting) all imply an underlying assumption that must be true for them to be successful.

¹⁴ The overnight suspension is a big help here.

It is that when the price moves far enough, orders of the appropriate type, buy or sell, will come in soon enough (before the close of business) to keep him from exceeding his inventory limits. This is another reason,
especially in near crises, for keeping the size on the quote small. In ratcheting, one might have enough stock to have five round lots on one side of the quote, but feeding it out one lot at a time takes a
little longer.
The greatest threat to the continuous in time market maker occurs when the above assumption is not true. He is then confronted
with an unbroken sequence of market orders all of the same sign.
There are some steps he can take to preserve his “image” of continuous (in time) market making. One of these is a very large spread. There is always an upper limit of spread beyond which people will become disgusted and not do business with him. This is one way of actually suppressing trading, and yet say he is in business without having to do much business. I can stand here and say I will make a
firm market for IBM. I will buy at a dollar a share and I will sell at $1,000/share. I am, strictly, a market maker, but I don’t have any
danger that people will do business with me. This is a ridiculous extreme of the strategy of the Bank of England in a crisis. They might well discount the assets of the provincial banks at some horrible rate, 15% or 20%, but the important point was that the market or the bank stayed open, a market was to be had. This strategy of market makers
is indeed adopted in a crisis in the over the counter market today. Stocks may be ordinarily quoted at 3% of the price apart; after some
particularly bad piece of news they may quote them 15% apart, and with a big jump too. This is not considered good market making as a regular practice, nevertheless the image and indeed the reality of
market making is preserved. Exactly the same phenomenon occurs on NYSE, but to a lesser degree than a 15% spread.
There are other artifices by which a market maker may preserve his image in a crisis. For the OTC market, the simplest one is to take the phone off the hook. This is effectively a suspension, but is not
labeled as such. It gives more time for orders to come in, cuts down
on the number of transactions, and gets him closer to the close of
the business day. A similar practice on the NYSE is called “walking away from the market.” The specialist can say, “Help, I have to go to the bathroom,” and he simply disappears. If it is 3:15 P.M. and he stays there until 3:30, the close, he has not officially weaseled out
as a market maker. There have been times when banks have done similar things when there is a run on them, and they want to preserve the image of staying open.
They are racing against the clock when the bank can close
anyhow. To slow the run down they will send out a few employees to get in line with bags of money to make pseudo deposits, or arrange for the employees to take a long time dickering and haggling at the cashier’s window. So the image of the bank being open during regular
business hours is preserved.
In fact, the banking process is slowed
down enough so that by the next day perhaps they can scrape up enough money to keep the bank open. A suspension for a market or bank, called, or naturally overnight, gives the participants time to
assemble information and make decisions which continuous in time trading does not give either them or the market maker. The LOOIM or its variants just give all concerned more time, and all the same amount of time.
2.20
Continuous Market Making With Fixed Minimum and Optimum Inventory, but No Maximum

In the preceding discussion of the single market maker without competition, handling market orders only, we set separate and predetermined limits on the cash and stock inventory. There were also constant limits set for each "zone" in the course of continuous trading, for allowable drift and spread. The size was limited to one round lot for simplicity. This is not an essential restriction. All these limits could be violated whenever there was an interruption of continuous trading, either naturally overnight, or with a crisis suspension, a technical failure of the continuous market maker. We saw that these interruptions, naturally occurring once
a day, and rarely more often, were of the greatest importance to the market maker for safety and enabling him to stay in business. These “discontinuities” are an essential part of the “continuous in time” market making process. We saw that the separately specified optimum cash and stock inventories were “compatible” at the starting price of $30. The optimum cash would just “buy” the optimum inventory of stock at the starting price.
The separately specified limits on the two inventories became incompatible as the price moved away from $30. The
incompatibility was noticeable but not large at the extreme limiting
prices of market making allowed without any attempt to adjust the inventories. This suggested rather naturally that at least one and possibly both of the upper inventory limits of cash and stock might preferably not be predetermined separate constants, but depend a little on the price.
We now want to show that by relaxing some of the above predetermined constants, we can devise a market which can be operated continuously in time without any interruptions whatever, or jumps in price greater than a predetermined amount whatever. This market will operate over a much wider but still finite price range than before. The price limits of market making will be set by the discrete units of money ($1/8 or $.01, as you may prefer) and the discrete units of stock, one round lot, or one share, if you prefer. This market will bring out the essential symmetry of stock vs. cash, and some similarities to a real market, notably the British market where the customers' orders are given in dollars (or pounds), not number of round lots, and the number of shares transacted is adjusted accordingly. It will also bring out similarities between this British market and the American odd lot, round lot and big block market, essentially markets of increasing size of trade. We want to bring out a fuzziness of distinction between small and large price changes. Discreteness will be an underlying and unifying concept, just as it was in going from the economist's to the market maker's ideas on supply and demand. Finally, we have seen the practical necessity to the market maker for interruptions in time, and big price jumps. We shall see that they are also a logical necessity. This logical necessity is established by giving the conditions for eliminating them, and showing that this condition cannot be achieved. This last point has a bearing on a question of some theoretical and practical interest. Big price jumps, equivalently, long tails in the distribution of price changes, are the subject of considerable attention not only from academicians and speculators, but also the regulators and supervisors of the market.
With this market, Zero the mathematical market maker is given 20 round lots, $60,000, and a starting price of $30, just as before. The lower inventory limits are zero, just as before, but no specification is given on the upper limits. He is told to make a market with changes in transacted prices not greater than 5% of the price (5% to the nearest 1/8) between successive transactions. This specification is a little gross for OTC stocks unless they are not traded actively, but it is not intolerable. Note that 5% as a decimal is just the reciprocal of the number of round lots he was given. With 100 round lots he could make a nice close market of 1% price changes. Zero has to make this "continuous" market in price (5% price jumps) 24 hours a day, 365 days in the year, leap years no exception, and positively no suspensions or price jumps more than 5%, ever. Continuity in time of the quote is also required, without any exceptions.
The formula for the sum of a finite geometric progression is:

1 + r + r^2 + r^3 + … + r^N = (1 − r^(N+1)) / (1 − r)

This finite form works both when r is greater or less than 1. It even works when r is equal to 1, if you are careful in how you express it, as indicated here:

lim_(r→1) (1 − r^(N+1)) / (1 − r)
= lim_(r→1) (1 − e^((N+1) log r)) / (1 − e^(log r))
= lim_(r→1) [(N+1) log r + ((N+1) log r)^2/2 + …] / [log r + (log r)^2/2 + …]
= N + 1

Moreover for r less than one you can sum an infinite number of terms:

sum_(j=0 to ∞) r^j = 1 / (1 − r)
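These three formulas are easy to check numerically; the helper name below is mine, purely for illustration.

```python
# Numerical check of the geometric-progression formulas above.
def geometric_sum(r, N):
    """Sum 1 + r + r**2 + ... + r**N term by term."""
    return sum(r**j for j in range(N + 1))

N = 10
for r in (0.5, 2.0):                       # works for r < 1 and r > 1
    closed_form = (1 - r**(N + 1)) / (1 - r)
    print(r, geometric_sum(r, N), closed_form)

# Near r = 1 the closed form tends to N + 1 terms of 1 each:
r = 1.0 + 1e-6
print((1 - r**(N + 1)) / (1 - r))          # close to N + 1 = 11

# For r < 1, summing many terms approaches 1/(1 - r):
print(geometric_sum(0.5, 200), 1 / (1 - 0.5))
```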
Fig. 3.154 Across the Market Dispersion (x2) for Industrial Stock Prices. B.F. Goodrich to Hudson Bay Mining. × t₀ = 1945.5, N = 12. ○ t₀ = 1951.5, N = 10–12 (two "vanished" from sample in 1956). Data from Securities Research Company.

* These figures are from M. F. M. Osborne, "Random Walks in Earnings and Fixed Income Securities," paper given to Institute for Quant. Res. in Finance, Apr. 1968.
The sequential dispersion of these vehicles has never been investigated to my knowledge, except for the original work of Bachelier. He found that for "rente" or perpetual French bonds, the sequential dispersion increased as the square root of the time up to intervals of 45 days, the longest tested. What happens for longer intervals is unknown to me. It seems inconceivable that the sequential dispersion of such vehicles would increase indefinitely. Other qualitative factors must enter into the description, such as the probability of default. For me this is an unexplored problem.

[Figure: log-log plot of across-the-market dispersion vs. time t − t₀ in years.]

Fig. 3.155 Across the Market "Dispersion" (x2) for Fractional Change of Earnings of Industrial Stocks, Goodrich to Hudson Bay Mining. △ t₀ = 1945.5, N = 12. △ t₀ = 1951.5, N = 10–12. See Fig. 3.154 (two in sample vanished).
[Figure: across-the-market dispersion vs. time; the points are roughly flat.]

Fig. 3.156 Across the Market "Dispersion" (x2) of Earnings per Dollar of Price for Industrial Stocks, Goodrich to Hudson Bay Mining. ×(t₀) = 1945.5. ○(t₀) = 1951.5. See Figures 3.154, 3.155.
One can also examine other sequences of data from the securities market, Figures 3.154 to 3.159. For the earnings sequence of common stocks, the fractional or % change in earnings increases in its dispersion across the market like the square root of the time interval, but changes in the logs of earnings do not. Earnings per dollar (reciprocal of the P/E ratio) are roughly constant; they do not spread out at all. The volume sequence has never been explored for sequential or across the market dispersion.

[Figure: across-the-market dispersion of log price vs. time interval t − t₀ in years.]

Fig. 3.157 Across the Market Dispersion (x2) of Log Price for Utility Stocks, American Gas & Electric to Indiana Power & Light. Data from Securities Research Company. ○ t₀ = 1946.5, N = 14. × t₀ = 1956.5, N = 14 (one stock changed in sample).
[Figure: across-the-market dispersion vs. time, rising roughly as the square root of the time interval.]

Fig. 3.158 Across the Market "Dispersion" (x2) of Fractional Change of Earnings, (E(t) − E(t₀))/E(t₀), for Utility Stocks. Data as in Fig. 3.157.
[Figure: across-the-market dispersion vs. time t − t₀ in years for industrials (t₀ = 1951.5, t₀ = 1945.5) and utilities (t₀ = 1946.5, t₀ = 1958.5); the curves are roughly flat.]

Fig. 3.159 Across the Market "Dispersion" (x2) of Earnings per Dollar of Price for Utilities. Data as in Fig. 3.157.
3.16
Brownian Motion as the Continuous Limit of a Random Walk
I now want to describe the situation called Brownian motion
which is a limit of a random walk, or a whole assembly of random walks, depending on whether you like to look at sequential or across
the market distributions.
Suppose instead of an increment labeled
by an integer which is the number of steps taken, we speak of the time since the walk started, and we take the number of steps which
are taken in any minute as given. So instead of expressing the position (price) as a function of the number of steps, we express it as a function of the time interval after some chosen starting time. Then we increase the number of steps per minute indefinitely. The walker steps more and more rapidly, but we shorten the length for each step taken in such a way that in the limit of infinitesimally small steps taken infinitely rapidly, we still have some finite process going on.
This can be done in the following way. Let us suppose we take n = 1/ε steps per minute. ε is the small time between steps, and each one is of expected length h. We can express the number of steps after time t as i = (Int) tn = (Int) t/ε. (Int) means the closest smaller integer to t/ε. So our random walk in terms of i:

P(i) "=" P₀ + ih + √i δ

is expressed in terms of the time variable t as follows. We replace h by Vε, where V is a constant, independent of ε. Similarly we replace δ by D√ε. V (velocity) is the drift per unit time, D the dispersion or standard deviation developed after unit time. Since the ε's cancel out, we have:

P(t) "=" P₀ + V(Int t) + D√(Int t)

Note that we have not yet taken the limit ε → 0. The word (Int) means that P(t) is spaced ε apart in time, and also in the vertical or price coordinate h ± δ with small steps. As ε → 0 the dots get closer and closer together in both coordinates. V is called the drift velocity, D the diffusion constant. This is a "continuous in time" description of a random walk, and
it is called Brownian Motion.
It is actually this continuous form
which originally appeared in Einstein’s paper on Brownian Motion.
It is also in Bachelier’s thesis, and in Feller’s book in the references.
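The scaling h = Vε, δ = D√ε can be checked numerically. The sketch below (my own illustration, with arbitrary V and D) stops the limit at a finite ε and verifies that the mean displacement comes out near Vt and the dispersion near D√t, whichever ε we choose.

```python
import math
import random

def walk(t, eps, V=1.0, D=2.0):
    """One random walk to time t with step time eps: each step has
    expected length h = V*eps and dispersion delta = D*sqrt(eps)."""
    n = int(t / eps)                      # (Int) t/eps steps
    h, delta = V * eps, D * math.sqrt(eps)
    return sum(h + random.choice((-1, 1)) * delta for _ in range(n))

random.seed(0)
t = 4.0
for eps in (0.1, 0.01):                   # a coarse walk and a finer one
    xs = [walk(t, eps) for _ in range(4000)]
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    print(eps, round(mean, 2), round(sd, 2))   # both near V*t = 4, D*sqrt(t) = 4
```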
Brownian motion was discovered by a botanist who was looking through his microscope about 1820 or thereabouts at very small spores, I think of ferns or a fungus. In this image in the microscope
he saw what he called a swarming seething motion, like an agitated
hive of bees, or gnats on a summer evening.
At first he thought
this might have been the manifestation of life at its most elementary level. After some reflection and investigations by other people, they decided that this chaotic motion probably was caused by the individual molecules colliding with the somewhat larger but still very small spores. Molecules had been suspected to exist since the days
of Democritus, but no one had ever proved that they did. They examined Brownian motion in dust particles in drops of water that had been sealed up in crystals for millions of years and were still jumping like mad. They showed that the motion changed
with the temperature and with the size of the particles. Finally Einstein related the mass of the particles and the viscosity of the
liquid in which the particles moved to the number of molecules that were hitting them, and the temperature.
Then Perrin observed in
an experiment looking at small droplets of resin, gamboge I believe,
that the formula for the way in which these particles moved fit the square root of time diffusion law. In many respects our individual
observations of prices are an almost one to one analog with what Perrin did when he looked at his particles through a microscope, and observed their position.

I strongly urge you, if you have any friends in the biology department or medical sciences, to go and look at tobacco smoke under 500 power magnification. This will show you Brownian motion very
nicely.
It is quite spectacular to watch.
They really do jump in a
mad fashion.
I want to point out that this limiting case of Brownian motion has a very curious property which I mentioned earlier. It is continuous in a mathematical sense but it doesn’t have a derivative. The
definition of continuity adopted for this case is not quite the same as it was in the case of the calculus, or for the discrete variable case, where the universe of discourse was the integers, and you could in the aristocratic sense still satisfy the definition of continuity.
Before I give this definition, let me point out what is actually going on when I compute the expected value of the position and dispersion of a random walk after i steps.

Let us suppose we have a coin flipping random walk. Prob(s = h ± δ) = 1/2. After 50 steps the position was 50h ± √50 δ. At each step there were two possibilities, so after fifty steps there are 2^50 ≈ 10^15 different possible random walks. 50h ± √50 δ is the "position" after an average over this huge number or ensemble of random walks. In physics this is called an ensemble average.

When we estimated from a single random walk given to us what the advance h and dispersion δ per step were, we assumed that we had just one out of this monster population of 10^15 different walks, and with a high degree of likelihood that it wasn't a particularly freakish member of this huge population of possibilities, and that we could estimate h and δ from it.
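The ensemble average is easy to sketch by simulation: draw many coin-flipping walks (the values of h and δ below are arbitrary illustrations) and compare the ensemble mean and dispersion after 50 steps with 50h and √50 δ.

```python
import math
import random

h, delta, steps, walks = 0.1, 1.0, 50, 20000
random.seed(0)
# Each walk: 50 steps of h + delta or h - delta, each with probability 1/2.
finals = [sum(h + random.choice((-delta, delta)) for _ in range(steps))
          for _ in range(walks)]
mean = sum(finals) / walks                     # ensemble average of position
disp = math.sqrt(sum((x - mean) ** 2 for x in finals) / walks)
print(round(mean, 2), round(disp, 2))          # near 50*h = 5, sqrt(50)*delta ≈ 7.07
```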
Now what does it mean to ask for continuity? You can see from the way we carefully adjusted the length of the step and its dispersion, as they got smaller and more rapid, that the particle did have a definite expected position.
Since a particle does have a position
at any given instant of time, if you are only adding infinitesimal elements to the position, the property of continuity should be preserved.
For this limiting Brownian case we have not 10^15 but an infinite number of different random walks, of which, for estimation, we pick just one. I think you can see that a plausible definition of continuity (it is not the only possible choice) is this one. It is called mean square continuity. Instead of saying P(t) − P(t₀) approaches zero as t approaches t₀, you say that the expected value of the square approaches zero, because it is that quantity which is measured with uncertainty, or variance. Certainly if P(t) approached P(t₀), then the square would too. So the stochastic definition of continuity is this one. A Brownian motion P(t) is continuous at t = t₀ if:

lim_(t→t₀) E(P(t) − P(t₀))^2 = 0

You can also give a delta-epsilon definition if you don't like the peasant definition and you don't like the word limit.
In mathematical language P(t) is described as continuous "almost surely almost everywhere," or everywhen, if the variable is time. The extra words in quotations mean that the probability distribution of P(t), instead of being binomial, is continuous in the limit ε → 0. Being "normal" it extends from minus infinity to plus infinity, so off in the tails, which is a very unlikely state of affairs, P(t) and P(t₀) might be different with t close to t₀, but this is a very improbable state of affairs. So we have those words, "almost surely almost everywhere," i.e., for the entire span of observation or domain of t₀. This is not quite the
same definition for continuity as in the calculus.

If you try to generalize, by asking about the derivative, the obvious thing to say is, let the following expression, if P(t) is going to have a derivative, approach a limit:

("derivative" P(t))^2 = lim_(t→t₀) E(P(t) − P(t₀))^2 / (t − t₀)^2

If you work this out, you will see that as t − t₀ gets small, the limit doesn't exist; it becomes infinite. Here is a function which has the peculiar property of being continuous (almost surely almost everywhere), but it doesn't have a derivative (squared) almost surely almost anywhere. It is a freak, a continuous function without a derivative.
If you look in a microscope at Brownian motion of tobacco smoke you can easily see that, whereas you can follow with your eye one particular particle of smoke, it jumps and whirls and dances and jiggles about so that it doesn't seem to really have, in any understandable or measurable sense, a rate of change of position with time.
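The divergence can be seen arithmetically. For the Brownian form above, E(P(t) − P(t₀))^2 = V^2(t − t₀)^2 + D^2(t − t₀), so the squared difference quotient behaves like D^2/(t − t₀). A sketch, with illustrative V and D of my own choosing:

```python
# E(P(t) - P(t0))^2 = V**2 * dt**2 + D**2 * dt for the Brownian walk
# with drift velocity V and diffusion constant D (values illustrative).
V, D = 1.0, 2.0
for dt in (1.0, 0.1, 0.01, 0.001):
    mean_sq = (V * dt) ** 2 + D ** 2 * dt   # E(P(t0+dt) - P(t0))^2
    print(dt, mean_sq / dt ** 2)            # grows like D**2/dt as dt -> 0
```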
3.17
Velocity vs. Derivative of Position.
There are some subtleties and even paradoxes in the above discussion, and they matter, even or especially for the stock market. There are three mathematical processes going on. 1) The limit as the steps get small and the rate of taking them increases. This is the passage from discrete steps numbered i to continuous time t. 2) Calculating the expected value. 3) The limit t → t₀. We had the concept of a drift velocity V but we don't have a derivative (by our definition). Velocity is supposed to be just the derivative of position. How can this be? The drift velocity is

V = lim_(t→t₀) E(P(t) − P(t₀)) / (t − t₀)

which is not the same as the square root of the preceding "stochastic" definition of derivative squared. If we don't use squares, then the dispersion δ of one step never appears in our calculation of expected value, and the stochastic nature of the process is lost entirely. The moral is that definitions, and the order in which operations are carried out, matter. The square of an expected value is different from the expected "value squared," and the difference matters. "Velocity" and derivative of position with time may or may not be the same thing, depending on how you define them.
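A one-line illustration of that order-of-operations point (a sketch with h = 0, δ = 1): for a coin-flip step s = ±1, the square of the expected value is near zero, while the expected value of the square is exactly one.

```python
import random

random.seed(0)
# One coin-flip step: s = +1 or -1 with probability 1/2 (h = 0, delta = 1).
steps = [random.choice((-1, 1)) for _ in range(100000)]
mean = sum(steps) / len(steps)
mean_of_square = sum(s * s for s in steps) / len(steps)
print(mean ** 2, mean_of_square)   # (E s)^2 is near 0; E(s^2) is exactly 1
```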
I am sure you have noticed on a theater marquee a "velocity" in a moving line of lights. If you look casually there is motion, but there isn't anything moving. The lights (cf. transacted prices) just flash on and off in such a way as to make you think so. Not noticing this distinction in the stock market can cost you a lot of money, if you unconsciously accept the idea that transacted prices are continuous with a continuous derivative, in time.
You have bought a high flier, and you put a stop loss sell order under it, in case it starts to go down. So probably have some other people. On the chart, when you get stopped out, it is almost always at the lower end of the little continuous streak of prices that you get your execution. If you look at the record there probably weren't any transactions near and below the price where your stop was set. The "streak" on the chart is not on the tape. There was a jump, with or maybe without a suspension, and all those stop loss orders got swept into a pile for execution at the lower price. Continuity and smoothness were lacking just at the time when you wanted these properties.

Not infrequently stop loss orders are banned in anticipation of just such a situation. If you hear of such an announcement, it's time to get out of the market.
I can make another familiar analogy. The "motion" in a "moving picture" is not real, but it is exceedingly realistic. A movie is, strictly, a rapid discrete sequence of "stills." Rarely, the veil of illusion is torn; the wagon goes one way, the wheels momentarily spin in an opposite direction. More often very realistic illusions are created. A ghost passes through a wall, Dr. Jekyll turns into Mr. Hyde, we have the illusions of "Fantasia" created by that master craftsman, Walt Disney. In the stock market the reality is a set of discrete "stills," the transacted prices, with a double valued quote in between (sometimes). If the "viewer" (speculator, investor), like a small child
at his first movie, really believes he is seeing a continuous and smooth process, he can and does develop all sorts of delusions concerning the market.
It is in fact very difficult to wash your brain of this false
belief, and the false conclusions this belief engenders. The specialist is licensed and expected to foster this illusion of continuity, most but not all of the time. It can be a very profitable license.
3.18
Random Walks Whose Properties Change With Time

I have pointed out in the preceding sections that it was perfectly possible to have a random walk in which the expected advance (and also dispersion) varied with the index number of the step, or in terms of Brownian motion, to vary with the time. If the dispersion was small compared to the expected advance per step, the walk became almost but not quite indistinguishable from some well defined and even analytic function of time.

I now want to point out that all of those possibilities when the dispersion per step is small compared to the expected advance per step could equally be true when the dispersion per step is large compared to the expected advance. Parenthetically, Brownian motion is a rather special case of this, since

δ/h = D√ε / (Vε) = (D/V)(1/√ε)

gets large for small ε, but separately both advance and dispersion approach zero.
For a random walk, the expected advance may change with time, but if it is small compared to the dispersion it might be very hard to find this time variation. The examples of Problem 4 illustrated this for stock prices. You might well have a bull market and you might well have a bear market, and lose a lot or gain a lot, but the dispersion was large compared to the expected advance, so it was hard to show the expected advance as being different from zero. It was possible for intervals of more than a year, and in favorable cases.
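This difficulty is easy to reproduce numerically. In the sketch below (all numbers my own illustration), the standard error of the estimated advance per step is δ/√n, which dwarfs a small true drift h even after hundreds of steps:

```python
import math
import random

h, delta, n = 0.01, 1.0, 250      # true drift, dispersion, number of steps
random.seed(0)
steps = [h + random.gauss(0, delta) for _ in range(n)]
h_hat = sum(steps) / n            # estimated advance per step
stderr = delta / math.sqrt(n)     # its standard error, ~0.063 here
print(round(h_hat, 3), round(stderr, 3))
# stderr >> h, so the estimated drift is statistically
# indistinguishable from zero over a sample this size.
```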
The fact that the parameters h, δ of a random walk can change with the time makes it important to see if there are other ways in which the stock market, as represented by random walks, changes with the time, which are a little more detectable. The market and its properties do change with the time. We can make an analogy of the following sort. Brownian motion is usually discussed for inert, long-lived particles, but bacteria also show Brownian motion. You can watch one for twenty minutes and he will follow a random walk, and then he splits in two. A subsidiary is spun off. Now which piece
do you look at?
Or another bacterium may swim by and eat him
up (an acquisition, or merger).
Now what do you do? So there is certainly a change with the time here which a simple square root of time diffusion law takes no cognizance of whatever.
You may recall that I said the problem of the stock market was like the inverse of playing parcheesi. Instead of knowing the rules for playing the moves of the game, you are given a lot of games with moves, and you try to figure out the rules. In the case of the
stock market the “rules” may change because there is actually a legal
change, or the properties of the game may change because peoples’ tastes change as to what is important to buy and to sell.
Such
changes do occur in the securities market. You can see that trying
to figure out what the rules are, if they are going to change from one year to the next, is going to be difficult. If you aren’t told when the rules change, you have to infer it from various other sources, or
study the moves very closely, to see whether or not the rules have changed.
Let me give a brief history, giving examples of how the rules and preferences have changed.
Common stocks have not always been
regarded as popular and appropriate vehicles of public investment.
They used to be restricted to very disreputable financial pirates, like the Rockefellers and the Vanderbilts, the Harrimans, Astors, and Goulds, and Leland Stanford.
No respectable person would own
these common things. That’s why they were called common stock. People of property and class owned bonds and mortgages, real estate
and slaves. Common stocks were very definitely a vulgar way to play with money. Respectable people who wanted to take risks took them in conservative real objects like ships, commodities or race horses.
Nevertheless the vulgarians made so much money that finally people had to pay attention to them, and so they became respectable.
The standards by which common stocks are evaluated certainly have changed over the years.
You can read about these changes in
Benjamin Graham’s account in the references.
In 1915, there were
certain standards by which common stocks were evaluated, but they were not what they are now.
Changes are going on even now.
We
have always known that earnings have something to do with prices
going up and down.
Up to about ten years ago no one made a
business of systematically forecasting earnings, but as people appreciated, or at least believed that you could forecast the earnings and their effect on prices, whole books of earnings forecasts were prepared.
For some stocks the earnings forecast was more important than the earnings themselves. We have the astonishing phenomenon
that when the earnings don’t come up to the popularly and widely touted forecast, the stock price will crash, even though the company
may be in perfectly respectable condition at the time. I am quite certain that the publication in the newspaper every day of price-earnings ratios is going to have an effect on people's estimate of what they will pay for a stock that hasn't occurred before, simply because the information is presented to them without their having to work nearly as hard for it.

As I mentioned in the example of bacteria, where they split and merge, the concept of Brownian motion of an identifiable particle really doesn't apply for indefinitely long intervals of time. It is just the purpose of Problem 5 to put some quantitative figures on this
phenomenon. If particles are not identifiable after ten years, then it doesn’t make much sense to extend the concept of a random walk to periods of time as long as that unless you take explicit account of the fact that the particle may vanish or new ones are created.
Stocks become delisted or liquidated or bankrupt or eaten up, so the phenomenon is much more a set of random walks with appearance and disappearance than it is of a simple inert particle following a random walk or Brownian motion.
In the problem the students found that the exponential decay time (both birth and death) for a stock on the NYSE was of the order of 20 to 30 years, and possibly more. For the Toronto Exchange it was three, four or five years. For the OTC market, about ten years. There were also mixes of different lives too, so that the birth or mortality curves didn't necessarily always follow a simple exponential law. Some of the students were quite careful to point out what happened to their stocks as they followed them. Some were liquidated, some were promoted to some other exchange, or were simply listed or dropped by editorial policy, which was different in different newspapers. Bonds had different activity and life in different eras. They tended to be reborn as maturity approached. This life limit on the concept of the random walk will never appear unless you look for it. You can't write a price down if it isn't there, and so if you just look for identifiable prices and vehicles you will never see that this appearance and disappearance phenomenon is a limit on the simple concept of the random walk itself.
I think you may see that the preceding discussion may have a
bearing, though perhaps in an unexpected fashion, on our Dark Cloud Axiom: "It is impossible to systematically exercise good judgment in the securities market." "Systematically" implies you can devise a set of rules or precepts for an appreciable period of time, some
fraction of your life at least. If the rules of the game change, it will be hard to find such precepts.
Maybe you can devise a systematic
procedure for updating your precepts, and maybe not. It obviously
isn’t going to be easy to do it systematically, especially if the qualitative properties of the market change. These are sometimes difficult to appraise, or even recognize when they start to occur, let alone
forecast.
Ordinary life, not just financial or economic life, is full of examples of this sort.
Problem 5. Birth and Death of Securities

In this problem we want to examine quantitatively some aspects
of Brownian motion in stock prices which have a counterpart in the Brownian motion of bacteria, but not of inert particles. There is also something of an analogy for Brownian motion or diffusion in a nuclear pile, in which "creation" and "annihilation" of particles also occurs. Bacteria behave like inert particles, say, for intervals less than 20 minutes, but for longer intervals they split, merge, die, or may be eaten up. This happens to stocks also.
For simplicity of definition we will take "existence" of a security as having its name listed in a news organ, "birth" its first appearance, and "death" its disappearance from the sample as represented by that news organ and its editorial policy. Thus by definition the Texas Co. died on an unambiguous date; Texaco was born the next day, a totally new entity by our definition.
Take some news organ (e.g., WSJ, NYT, Barron's, etc.) and a sample of, say, 50 consecutively listed items, starting under some category: say the letter G, bank stocks of OTC, bonds, etc. Use a date 10 to 20 years ago. Look under the same heading a year, 2, 3, 4, etc. years later. How many of your original 50 survived after 2, 3, 4, etc. years? (You may have to look a little farther than 50 for these.) Then plot the number, or rather fraction, of "survivors" of your original 50 as a function of time.

This problem can be done blindly and mechanically, but I urge you not to do it this way. Read the headings on the news organ
as to what is actually listed. The “OTC Market” on Monday used to be different from Friday, and has changed its complexion. Check
for stocks not traded if (as for NYSE) they are given. Look out for
unexpected phenomena.
Repeat the problem where you now start with your last date, and take steps backward in time. You will now be evaluating a "birth" rather than a "death" distribution in time. They might or might not be similar. It would probably be a good idea to examine data at the two end dates first, before taking the data, to ensure the span of time is long enough to have produced an appreciable "attenuation." This might take 20 years for NYSE, but only 12 years for the Toronto Exchange. Use about 8 to 10 intermediate dates. They need not be exactly uniformly spaced.
The purpose of this problem is twofold: 1) to deliberately focus on a hitherto neglected aspect of Brownian motion or random walks in the stock market, which has in the past been concerned only with identified, permanent "particles" (e.g., stocks); 2) to give you a sense of how the complexion of the market changes with time.
Obviously this method overestimates the rates of the birth and death process, except for a simple-minded investor for whom a new name is a new stock, and if the name disappears, it isn't there. The method may also be contaminated by the occurrence of days (or weeks) of no trading.
As a wild guess I suspect your surviving fractions (birth or death) may follow an exponential law, A exp(−t/T), which should give a straight line on semilog paper. Try it.

There are obviously many variants to your categories, or ways of
doing this problem. One would not expect public utilities to show the same birth and death properties as ASE science and electronics stocks, and the behavior may well be different for pre- vs. post-SEC data.
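The semilog fit can be rehearsed on synthetic data before collecting real listings. A minimal sketch, where the decay time T = 25 years, the sample size, and the observation dates are hypothetical stand-ins, not measurements from any exchange:

```python
import math
import random

random.seed(1)

T = 25.0    # hypothetical decay time in years (not measured from any exchange)
n0 = 500    # initial sample of listed securities
lifetimes = [random.expovariate(1.0 / T) for _ in range(n0)]

dates = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]      # observation dates, years
frac = [sum(t > d for t in lifetimes) / n0 for d in dates]

# On semilog paper exp(-t/T) is a straight line of slope -1/T, so a
# least-squares fit of log(fraction) vs. time recovers T.
logs = [math.log(f) for f in frac]
mean_d = sum(dates) / len(dates)
mean_l = sum(logs) / len(logs)
slope = (sum((d - mean_d) * (l - mean_l) for d, l in zip(dates, logs))
         / sum((d - mean_d) ** 2 for d in dates))
T_est = -1.0 / slope
print(round(T_est, 1))   # the fitted decay time, near the true T = 25
```

With real survivor counts you would replace the simulated lifetimes by the fractions read from the news organ at each date.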
Recommended Reading (on reserve for 233 or 236): Osborne, "Random Walks in Earnings and Fixed Income Securities." B. Graham, "The New Speculation in Common Stocks," in Wu and Zakon, Elements of Investments, p. 169. Also in Graham, The Intelligent Investor, 2nd ed.

3.19 Random Walks in Dollars of Price vs. Log of Price
I now want to discuss a slightly different aspect of the random walk which I have briefly touched on previously. There is a difference between a "linear" random walk in dollars of price, versus a random
walk in the logarithm of price. As we shall see, you can have a random walk in almost any function of the price you like. In a rough sense the relation between the price and the logarithm of the price corresponds to the economist's notion of amount and utility, or dollars of an amount of money vs. its "value." There is already an assumption implicit in the relationship between amount vs. utility (cf. price vs. value): the assumption that the relationship is a functional one. We shall see that this assumption is an enormous oversimplification.
Beginning with simple things first, let us take the example of roulette. Starting with a stack of p_0 chips and betting one each time on red vs. black, we saw that the height of the stack was a binary, or coin-flipping, random walk of negative expected advance: Prob(s_i = −1/37 ± 1) = 1/2.^6 This random walk is described by

Δp_i = p_i − p_{i−1} = s_i
Prob(s_i = h ± δ) = 1/2,    p_0 = starting position
p_i ≈ p_0 + hi ± δ√i
h = −1/37, δ = 1 for roulette
which we have drawn in Fig. 3.19-1. You will note that with this approximation of equal probabilities (1/2) for the payoff, the probability distribution of the position after i plays or steps is symmetric; the mean is equal to the median. After p_0/|h| bets the player starting with 100 chips (p_0) has, with probability one-half, gone broke. This occurs after 3,700 bets. This is the median number of plays to ruin. The mean number of plays to ruin is considerably larger than this. The horizontal cross section of the parabolic sleeve is highly skewed, whereas the vertical cross section is symmetric about the mean = median line of the sleeve.
Let us suppose our gambler (an occidental financier, or economist on vacation) decides he would like to get a little longer ride for his money, so he says, "Instead of betting one chip of my stack each time, I will bet just k% of the stack I have after each play." Assuming that the chips can be broken up into smaller units, you can see that you can bet k% of what you have left indefinitely. In practice, chips or money being in discrete units, the time will come when the smallest unit you are allowed to bet is more than k% of your remaining stack, so you will have to change your rule, or just stop betting.
^6 I have approximated a one-starred wheel, for which P(s_i = +1) = 18/37, P(s_i = −1) = 19/37, by equal probabilities, 1/2, and a drift h = −1/37 and dispersion δ = 1 per step.

The discreteness of money or casino rules will put a "boundary," just as in market making, on the delightful prospect of always being able to bet k% of your money, no matter how little you have. The equation which governs this random walk is:

Δp_i = p_i − p_{i−1} = (k/100) s_i p_{i−1},    Prob(s_i = h ± δ) = 1/2,    p_0 = starting position, in $ or chips.
We can write this in the form

Δp_i / p_{i−1} = k s_i / 100.

Define s_i* = k s_i / 100. It is just the percentage, as a decimal, of the money he has at each bet that our gambler bets. It will be less than one in any event, credit betting not being allowed (he is not allowed to bet more than his stack). It will usually be fairly small compared to unity. Make the change of variable y_i = log_e p_i:

Δy_i = y_i − y_{i−1} = log_e(1 + s_i*),    h* = kh/100,    δ* = kδ/100.

If you like you can say our gambler, now a "financier," "evaluates" his money, or thinks in terms of natural logarithms, which are close to percentages if less than about 15% = 0.15.
This random walk, now in the variable y = log_e p, is quite similar to the previous one. Just as s_i has an expected value h and a dispersion δ, so also has the function log_e(1 + s*) an expected value and a dispersion, which we could work out numerically, using the probabilities 1/2. Then on a y scale the financier's random walk would be similar in structure to that of the straight gambler. It would be a straight line of slope E[log_e(1 + s*)] and a parabolic sleeve ±√i σ(log_e(1 + s*)). The gambler's sleeve is cut off horizontally at p = 0. This corresponds to y = minus infinity for the financier's sleeve. In practice we must cut his sleeve when kp_i/100 = smallest unit of money allowed (s.u.m.a.). This occurs at y = log_e(100 s.u.m.a./k). We can evaluate the slope and dispersion in y for the financier, in terms of the corresponding quantities for the gambler, by expanding the logarithm. This is quite accurate for s_i* = k s_i/100 small compared to unity:

E(Δy²) − (E(Δy))² = σ² = k²δ²/100²
We drop cubic and higher terms. So we can write

y_i ≈ y_0 + h*i ± √i δ*,    y_0 = log_e p_0.

p_0 is the starting "capital".
[Figure: p(i), the number of chips after i bets, plotted against i.]
^8 The worst, most irreverent and commonest sin is to throw out the discordant observations. This method is guaranteed to prevent discoveries which the theory, if there is one, does not predict.
Molecules of negative velocity don't come out the little hole. The time required to traverse the span D is t = D/v, so the distribution of the time of flight, recorded as a distance distribution on the circumference of the outer drum, is

F(t) dt = (2D/(α√π)) (1/t²) exp(−D²/α²t²) dt

This has a long tail, going down, as you can see, as 1/t², and the expected time of flight is, if you just feed the formula, infinite; logarithmically, actually:

E(t) = ∫_{v=0}^{∞} (D/v) g(v) dv = ∫_0^∞ t F(t) dt ~ log_e t → ∞

If you insist on writing down arabic numbers for the observed distribution of molecules on the wall, and take means, you will of course get a finite mean time. What you should test is not this mean time at all, but the shape of the distribution, say by chi-square, and see if it fits the predicted distribution.
[Figure: molecular beam apparatus with rotating outer drum.]

... a tail, but not an enormously long one, and you have these two parameters α, β, which you estimate instead of estimating, as you usually do, the mean and standard deviation. The method of estimating them is called the maximum likelihood method, which you can read about in standard texts. It reduces to least squares, if you happen to be estimating moments from a normal distribution instead of alpha and beta from this gamma distribution. This kind of procedure is appropriate for skewed distributions of data, when you want to wring the last bit of information out of a limited sample of data that you possibly can. I don't think the gamma distribution has ever been used to represent stock market data, and it probably would not have any profound underlying significance. It would just be an economy in case you wanted to use less data and work it a little harder.
4.9 Skew Random Walks. Fluctuations in a Panel of a Histogram
I now want to discuss two subjects which are somewhat related to improbable events or long-tailed distributions, but in a different context. The first of these is the skew random walk. We considered random walks of equal probability 1/2 for a step h ± δ. One can devise a random walk which has a probability 0.9, or any fraction near one you like, of a certain step length, and the remaining small fraction for the probability of a different step length. Such a step has a mean and a variance, and for many steps the walk will approach, albeit rather slowly, a normal distribution for the end point. It may take quite a few steps. We can express this walk as follows.
Prob(s_i = h + δ) = 1 − 1/M,    M > 1
Prob(s_i = h − (M−1)δ) = 1/M
E(s_i) = h,    E(s_i²) = h² + δ²(M−1)
σ² = δ²(M−1)    (4.9.3)
You can see that for M > 1 this will be a walk which drifts in one direction most of the time, but on improbable occasions it jumps the
^9 The log normal distribution can also be used in this rainfall problem.
other way, so that the general sketch of the walk is as in Fig. 4.9-1. This has some analogues in the stock market. Big jumps, though they may occur infrequently, tend to cover a lot of ground compared to the little ones. One can have a market of negative expected advance which is in fact going forward most of the time. Because of discreteness you cannot exploit this peculiar situation to make a profit (on the long side); the expected advance is negative. This kind of a situation actually happens in the market. If you look at the monthly record of the crash of 1929 to 1932, a majority of the months in the Dow index were months of advance, but the net progress from '29 to '32 was substantially in the opposite direction. When it did fall, it fell with a big jump. So this is one kind of a random walk which has its analog in the market, and in other phenomena as well. Note that this is skewness of a sequential distribution. One can have an across-the-market distribution in which the skewness is frequently the other way. The advance minus decline plot, roughly the median of stock prices, will be sinking, yet an average of stock prices, either the Dow, or a more broad-based average, will be advancing.
Fig. 4.9-1 A Skew Random Walk. [Figure: p(i) vs. number of steps, or time.]

A second formula which is of considerable usefulness expresses how much of a fluctuation you can expect in one particular panel of
a histogram. You make a frequency distribution of some variable, and you see a relative excess of members of the population, events or counts, in one particular panel, as in Fig. 4.9-2 (also Figs. 4.15-2, 6), and you wonder: is this a statistical accident, or is this fluctuation so large that you really have to pay attention to it? You can see examples of this in Cootner, p. 103 and p. 287. The same arguments would apply to a deficiency, if one panel was much shorter than its neighbors.
Fig. 4.9-2 Schematic Histograms with an Excess or Deficiency in One Panel.
We can throw this problem into the form of a random walk, in fact a rather skewed random walk, in the following way. Let us consider a histogram, and let us divide it into just two parts: one panel, with a small probability p, and the other panel consisting of all the other members and therefore a rather large probability 1 − p = q. We withdraw the members of the population, say the students in the freshman class, one at a time. The particular panel, let us say, is for the SAT scores from 600 to 620, and we make a random walk of the following type. We step one forward if our random choice from the student population fell in the panel, and we step zero if it does not. Say this panel contains 5% of the population, p = .05. You will see, as we run through the population, that when we get through the entire class of 2,000 students or 2,000 steps (95% being of zero length), the expected advance is just the number in that panel of the histogram.
So what is the mean and what is the variance? If you work it out, it has a pleasant and simple answer. For one step

E(s) = p·1 + q·0 = p,    E(s²) = p·1² + q·0² = p
σ² = E(s²) − (E(s))² = p − p² = p(1−p) = pq

No. in panel ≈ Np ± √(Npq). Since q ≈ 1, No. in panel ≈ Np ± √(Np) = 100 ± 10.    (4.9.4)
The expected number in the panel is just as you picked it, 5% of 2,000 = 100, and the standard deviation, or square root of the variance, is just the square root of the expected number, 10, since the other probability q is very nearly one. So we have the simple rule of thumb, derived from the skew random walk but applied in a very different context: if you have a certain frequency of occurrence of a particular event, small compared to the total number of all possible events, the expected fluctuation or dispersion is just the square root of the number of times the particular event occurred. Approximately, not exactly, you will note.
Suppose you have a histogram with either an apparent excess or deficiency in one panel, and you wonder whether it is an accident, relative to the neighboring probabilities. Then you can use the neighboring probabilities as an estimate of what to expect in the one under question. If the fluctuation is larger than the square root of the expected number, perhaps more than two standard deviations away, then you can say with an estimable degree of probability that it is unlikely to have occurred by chance. If the expected number in the panel is, say, ten or more, the normal distribution is a fair approximation to how that number is distributed. This rule of thumb works equally well for distributions over more than one variable, a histogram or table with two or more variables. If you have a contingency table of two or more variables, and one box seems more or less populated than you might have expected, you can examine this in exactly the same way.
For example, there is a very ancient problem^10 for which the chi-square test was invented. Population statisticians suspected there was a relation between color of the eyes and color of the hair. Blonds tended to be blue-eyed and brunettes tended to be dark-eyed. The question was: is this common observation in fact supported by the evidence? You simply make a histogram, or as it is called, a frequency table of four boxes, a two-dimensional histogram labelled by attributes. You assume there isn't any statistical relation between the attributes, and on this assumption estimate the number to be expected in any given box. If the observed number is much greater or less (the expected fluctuation being just the square root of the expected number), then you can conclude there is a probability connection between color of the eyes and color of the hair.

^10 The details of this example are in Kenney and Keeping.
The general test is the chi-square test, but as a quick and dirty test you can use this expected number and its square root.
4.10 An Example in Which the Histogram Panel Test Doesn't Work

In the previous lecture (Section 4.9) I described how you could examine a histogram for significant peaks, jumps or gaps in it, using the square root of the expected number in a panel as the standard deviation. This was just a crude way of making a chi-square test.
Let me describe a problem where this didn't work, so you can see the importance of the limitations on this idea. The particular phenomenon I wanted to investigate was resistance and support. You can read chapters about this in the book of Magee and Edwards, with lots of examples illustrated on charts. It is also frequently illustrated in market advisory letters. So I thought I would just put this commonly accepted idea to a test. If people can see these things with their eyeballs, then they should show up by an objective test, maybe. So I picked a couple of stocks for which I had about two years of daily data, and also longer weekly and monthly charts, and on which I could see, or thought I saw, very clearly where the resistance and support levels were. One of them was the Air Reduction Co.; I don't recall the other. I plotted up a histogram of daily closing prices for two years, that's about 500 points. I wasn't really sure whether resistance and support would show up as bumps, jumps or holes in the histograms, but something peculiar ought to. The histogram was a sequential distribution of the prices themselves. There were bumps and holes in the histogram all right, pronounced and significant ones by the square-root-of-the-number criterion. I couldn't see that these bumps and holes bore any relation to the resistance and support levels that I could "read" off the charts. So I tried again, this time weighting each point with the amount of daily volume, and spreading it out over the daily high-low range. This gave a different histogram, still rather irregular, though I wasn't sure just
how to apply the square-root-of-the-number significance tests.

In doing all this, I noticed something rather peculiar. One of the ground rules, or conventional practices, in making histograms is this. If an observation falls precisely on a class boundary (say the classes are at 40 to 41, 41 to 42, etc.), you divide the point and plot half in each adjoining class. Or just flip a coin to decide on which side of the boundary to put it. But with these histograms an inordinate percentage, perhaps one fourth of them, were falling exactly on the integral boundaries. It made quite a difference in the shape of the histogram if you shoved all of them to one side or the other. I had never seen data behave that way before. Shifting the class boundaries to half integers didn't seem to help much; they still piled up appreciably on the half integers.
At this point I started looking at the closing prices with respect to eighths, going across the market, or down the column in the newspaper. Sure enough, there was a preference for even eighths in the prices (Cootner p. 287). Even when you smeared out this effect, the histograms still looked most peculiar: three or four "significant" peaks in the sequential distribution of closing prices, with or without volume weighting, for a single stock. I was beginning to get skeptical of my square-root-of-number rule. So I thought I would just generate a few synthetic random walks, and see what the histograms of "prices" generated this way were like. I generated a few of 500 steps each, from random number tables, plus one for the even digits, minus one for the odd. These histograms too showed two or three highly significant peaks, just as the real price data on Air Reduction did, and each histogram was different from the others.
At this point I could see that my square-root-of-n rule just wasn't applicable for a histogram created in this way: the position or price in a random walk. I did a little homework, and dug around in the textbooks to find out just what was the answer to this problem. You can find a discussion in Feller's book, pp. 82 and 231. My old histogram rule is indeed not applicable. You may recall from its derivation in an earlier lecture, using the skew random walk, that the probability of a step or count in a histogram panel was independent of previous steps there, or elsewhere. So adjoining panels are probability independent in their count. This is most definitely not the case for a histogram created in the manner described above. A step at one price definitely increases the probability of adjoining prices also occurring. This puts clusters or "peaks" in the histogram. The panels are not independent in their counts. The distribution in a panel constructed this way is not Poisson or normal for large counts, but instead approaches what is called a truncated normal distribution. It looks like the positive half of a normal distribution, but its dependence on the total number
of observations or steps in a random walk is very different from "normal." For an ordinary histogram, the mean or expected number in a particular panel increases linearly with n; so also does its variance. So the expected number in a panel is ≈ pn ± √(pn), where p is the small probability of falling in a panel, and n is the total number in the sample. For the random walk histogram, the truncated normal distribution indicates that the expected number in a "panel," or at a particular price level, or position, is of order √n*. n* is here the total number of steps after you first land, in the walk, in that panel or price level. This is called the occasion of first passage. The variance of this number increases like n*, that is, at a faster rate. So the expected number in a panel, plus or minus its standard deviation, is ≈ √n* ± √n*.
Thus the "standard deviation" is of the same order as the number in the panel. This means that the histogram has big fluctuations, or highly significant peaks, using the conventional (and here incorrect) formula. This was exactly what I observed, both with the manufactured random walk and also the real price series.

The question which I originally asked myself, how to demonstrate resistance and support levels of the sort you can trade against and make a buck maybe, is still unanswered. Such levels exist at integers and half integers, and prices tend to bounce up slightly (support) at 1/8 and 5/8, and to bounce down at 3/8 and 7/8 (resistance), but this phenomenon is not much help to peasant investors not on the floor of the exchange, or analysts trying to make a living reading charts for themselves or their clients.
I think this problem might be explored by talking to a cooperative specialist about the distribution of orders on his book in the past, at supposed resistance and support levels in specific stocks, and by examining both the quotes and the transactions at these times. If you can get knowledge of an exchange or floor distribution at the time it occurs, and monitor the quotes and transactions, I suspect you might see evidence of resistance and support in a single stock sequence. This is still an open question. You might well find unambiguous positive evidence, but still an effect not large enough to make your fortune.
4.11 A Comment on the Form of the Chi-square Test

The expression (4.9.4) for the expected standard deviation of the number in the panel of a histogram, or box in a frequency table, can be used to give a plausible explanation of the form of the chi-square test. If we have an experimental single-variable distribution, or histogram, the expression for chi-square is:
χ² = Σ_panels (f_o − f_e)² / f_e    (4.11.1)
Here f_o are the observed numbers, or frequencies, and f_e the calculated or expected numbers from the theoretical distribution which is being tested. In the case of a two-dimensional "histogram" or frequency table, exactly the same formula applies, the sum being over the boxes or cells of the table. In this case the f_e, or theoretical frequencies, are calculated from the observed frequencies f_o under the assumption of independence of the two (or more) attributes or variables being tested. Thus f_{e,ij} for the "ij-th" box in a two-dimensional table is

f_{e,ij} = N_total (R_i / N_total)(C_j / N_total) = R_i C_j / N_total    (4.11.2)

R_i and C_j are the row and column sums, the so-called marginal frequencies.
The approximate expression which we derived in the preceding section, σ² ≈ f_e, enables us to put the expression for chi-square in a "random walk" form, as a sum of "steps," one step for each box in the table. Suppose we number the boxes from k = 1 to M; M = number of rows times number of columns, the total number of cells. Then if we consider f_{o,k} = x_k as a random variable, the kth step length, and f_e as E(x_k), its expected value, we have as the expression for chi-square:

χ² = Σ_cells (f_o − f_e)²/f_e = Σ_{k=1}^{M} (x_k − E(x_k))² / σ_k²    (4.11.3)
Thus the expression for chi-square can be expressed as the sum of squares of steps in a random walk of zero expected advance and unit variance, one step for each box in the table (see Weatherburn, "Mathematical Statistics," p. 169). The number of "independent steps," or degrees of freedom, is not quite equal to the number of cells in the table. The theoretical f_e's are derived from the observed row and column sums. So the theoretical f_e's are not quite independent. In a given row, say, we could assign all but one of them independently, but then the last one must be chosen
so that the row of f_e's adds up to the observed row sum. There is one constraint (the fixed row sum) on the otherwise arbitrary f_e's in each row, likewise for columns. So the number of degrees of freedom is not M but (n_rows − 1)(n_cols − 1).
The expression for chi-square is a sum of squares, so it is essentially positive. For a "large" (> 30) number of degrees of freedom it can be shown that chi-square itself approaches a normal distribution of mean $n - 2$ and variance $2n$, where $n$ is the number of degrees of freedom. So the variable $(\chi^2 - (n-2))/\sqrt{2n}$ is itself a normal variable of zero mean and unit variance. This is a convenient formula for extending a chi-square table. There are other expressions, involving either the square root or cube root of chi-square, for transforming chi-square into a normally distributed variable for large $n$, which are slightly more accurate than the above expression. See e.g. Cramér, "Mathematical Methods of Statistics," p. 251, or Kendall and Stuart, Vol. 1, p. 371, for a discussion of these alternate forms. So long as the standard deviation $\sqrt{2n}$ of chi-square is sufficiently smaller than its expected value $n - 2$, the negative tail of the normal distribution will be so small as to not distort its representation of chi-square as an essentially positive quantity. At $n = 30$, $\sigma_{\chi^2} = \sqrt{2n} = 7.8$. Thus $\chi^2 = 0$ is more than three standard deviations away from the expected value $(n - 2)$ of $\chi^2$, and the "negative tail" of the normal distribution is a negligible error.
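The simple normal approximation just described, and the more accurate cube-root (Wilson-Hilferty) transform alluded to above, can be sketched numerically. This is an illustrative calculation, not from the text; the function names are my own, and only the standard library is used.

```python
from math import sqrt, erfc

def normal_tail(z):
    # Upper-tail probability P(Z > z) for a standard normal variable.
    return 0.5 * erfc(z / sqrt(2.0))

def chi2_tail_normal(chi2, n):
    # Approximation from the text: treat chi-square as normal
    # with mean n - 2 and variance 2n.
    z = (chi2 - (n - 2)) / sqrt(2.0 * n)
    return normal_tail(z)

def chi2_tail_wilson_hilferty(chi2, n):
    # Cube-root transform: (chi2/n)^(1/3) is approximately normal
    # with mean 1 - 2/(9n) and variance 2/(9n).
    z = ((chi2 / n) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * n))) / sqrt(2.0 / (9.0 * n))
    return normal_tail(z)

# Extending a chi-square table: 43.77 is the tabulated 5% point for n = 30 d.f.
print(round(chi2_tail_normal(43.77, 30), 3))          # about 0.021
print(round(chi2_tail_wilson_hilferty(43.77, 30), 3)) # about 0.05
```

Notice that the cube-root form recovers the tabulated 5% level almost exactly, while the simpler mean $n-2$, variance $2n$ form is off by a factor of two or so here, which is what "slightly more accurate" means in practice.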
4.12 Counting of Events with a Poisson Distribution
In the preceding discussion we saw that if the frequency in a histogram panel or contingency table were, say, 9, then a zero count was $9/\sqrt{9} = 3$ standard deviations away from the expected value 9. So the normal distribution could represent the fluctuations in the count in one cell fairly well. The probability of the absurd observation of a negative count in the table was really quite small.

Now suppose the expected frequency is small: zero or one or two or three only, as the count in a box or panel. The expected count doesn't have to be an integer, but the observed count must be. How do you examine and describe this situation? You obviously can't use the normal distribution, because that is a continuous distribution, and goes off to minus infinity.
Let me give you an example of this sort. Suppose you are keeping a record of the barometer. You write down once a day for a year the height of the mercury column. It averages around thirty inches, and on five occasions you note that the barometer is less than 28 inches of mercury. So it is a fairly improbable event, to have the pressure that low. You also observe the weather once a day and record the velocity of the wind. You note that the wind speed is greater than 50 miles per hour on four days in the year, so that also is an improbable event.
This has divided the frequencies of speed and barometer into two classes or histogram panels, each.
Now suppose you make the observation that on two occasions the low barometer and the high wind speed occurred on the same day. Does this suggest there is a connection, in the probability sense, between the barometer and the weather? Just thinking intuitively you would suspect there is a connection.
Let us see whether it is an improbable coincidence to
have two days in the year when both these improbable events occur.
As we have given the data, we could make a two by two contingency
table, with four boxes, and apply the chisquare test. There would be a count or frequency of only two in one of the cells, so it would be a rather lopsided table, not constructed in the best way to test
for independence by chi square of barometer and wind velocity. We
will do it a little differently.
First let me make a rough calculation, and then do it in a little more refined way. Suppose there is no connection between the barometer reading on one day and any other day. The barometric events, $p < 28$, occur independently in time. The estimated probability of a barometric event is 5/365 on any one day, or trial, so the estimated expected number in a year, or 365 trials, is of course $(5/365) \cdot 365 = 5$. We make similar assumptions for the wind event. Note that these assumptions of time independence of wind and barometric events, separately, conflict with common experience, but we make them anyhow.
If we further assume that barometer and wind are independent of each other in the probability sense, then the probability of both occurring on one day is $(4/365)(5/365)$, and the expected number of such coincidences in 365 trials is just $(4/365)(5/365) \cdot 365 = 20/365 = 0.055$. We would get this occurring on the average once in eighteen years, $1/0.055$. In fact we have observed it twice in one year. Is it sufficiently convincing evidence that the actual number of coincidences (2) is greater than the number to be expected by chance (0.055), and hence would we reject the hypothesis that wind and barometer events are unconnected, in the probability sense?
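The rough calculation above is short enough to check directly; this little sketch just reproduces the arithmetic of the text.

```python
# Rough calculation from the text: 5 low-barometer days and 4 high-wind
# days in a 365-day year, assumed independent of each other and in time.
days = 365
p_barometer = 5 / days
p_wind = 4 / days

# Expected number of same-day coincidences in 365 trials.
expected = p_barometer * p_wind * days
print(round(expected, 3))   # about 0.055
print(round(1 / expected))  # about once in 18 years
```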
To make this significance probability calculation we have to use
a distribution valid for small integral numbers, rather than large ones. We can’t use a continuous distribution at all. The appropriate
distribution is called the Poisson distribution.
You can see offhand that two events (coincidences) when only 0.055 are expected should make you very suspicious. I might translate these numbers to a different situation as follows. You eat your dinner in some number of different restaurants, every day in the year. On five occasions you have eaten in a Chinese restaurant. You have four stomach aches a year, and two of these were after you ate the Chinese food. You would be very suspicious that Chinese food had something to do with your stomach aches.

Suppose there were 365 people chosen at random. Four of them
were smokers and five had lung cancer, and it turned out that two of the smokers also had lung cancer. You would be a little suspicious that smoking might have something to do with the lung cancer. Note that such data is not necessarily an implication of cause of cancer by smoking. The data just suggests a probability of connection. Tendency to smoke and have cancer might have a common genetic, environmental or physiological factor. At least the tobacco interests hope so.
This kind of information on connections or causes is presented to you every day, and you make decisions on the basis of this kind of evidence without going through the mathematics. A rather tragic example of this kind of coincidence was first noticed in connection with birth defects and the use of thalidomide. Occurrence of babies with these defects is very rare. People who were taking thalidomide were relatively rare, yet there were quite a number of coincidences. There were thousands of babies born, yet the coincidences, though very few in terms of the numbers of babies born or of users of thalidomide, were much larger than independence would indicate. The doctor looking over the records of these defects was struck by the embarrassing number of coincidences.
So this kind of statistical testing is quite useful and common. It doesn't involve large numbers of actual occurrences of events, yet you can get significant results if you have a lot of trials. In the above example of the barometer and the weather, just one occurrence might alert you at the 6% level: you would not expect this but once in eighteen years. Similarly, just one stomach ache after a Chinese meal might make you feel cautious.
Before deriving the Poisson distribution, let me sketch it for the above example, where the average or expected number of events was 0.055. In the examples the single "event" was actually defined as a (binary) coincidence event: a smoker with cancer, a Chinese dinner with stomach ache, or low barometer and high winds. The largest probability is for none at all, so Prob$(n = 0) \approx 0.945$. For Prob$(n = 1)$ we had 0.055 approximately, and you will notice that the expected value using just these two probabilities alone, which are most, but not all, of the probability, is again approximately

$$E(n) = P(0) \cdot 0 + P(1) \cdot 1 + P(2) \cdot 2 + \cdots \approx 0.945 \cdot 0 + 0.055 \cdot 1 + \cdots \approx 0.055 \qquad (4.12.1)$$

So the probabilities for $n = 2, 3, 4$ must be very small indeed. The probability distribution must be represented as a nonuniform rake, with by far the largest probability tooth at $n = 0$.
Our significance probability, since we actually observed two such events, is $p(n \ge 2) = p(2) + p(3) + p(4) + \cdots$, which in any event is going to be a pretty small probability. From the original data we see we couldn't possibly have gotten more than four coincidences. The distribution we derive will not take note of this restriction, but the probabilities we calculate for four or more events will be indeed very small. You can see that the discreteness of the independent variable is of fundamental importance. It would not do at all to approximate this situation, where we are dealing with small integers, the numbers of events (coincidence events here), with a continuous distribution, or a distribution which allowed negative counts.
Fig. 4.12-1 The Poisson Distribution for an Expected Number λ = 0.055.
The derivation of the Poisson distribution for a small number of events in a large number of trials proceeds in the following manner. Let us imagine the calendar as a long series of "boxes" or panels, each box a year in length. I sprinkle at random 50 "beans" (events) over ten years or ten boxes, so that the average number per box, for example, is $\lambda = 50/10 = 5$ ($\lambda = 0.055$ for our previous example). We want to derive the probability $P(k)$, $k = 0, 1, 2, 3$, etc., of exactly $k$ beans in one box or year, where we are given that the average or expected number in a box (year) is $\lambda$.

We imagine we divide each box into $n$ cells; $n$ is just a large number which will eventually go to infinity. We imagine $n$ so large that the probability of a bean in any one cell, $\lambda/n$, is so small that the number of beans in any one cell is almost certainly either one or zero. The probability of exactly $k$ beans among the $n$ cells is then the binomial expression $\binom{n}{k}(\lambda/n)^k (1 - \lambda/n)^{n-k}$. As $n \to \infty$, the factors of the form $(1 - \text{const}/n)$ approach 1, and $(1 - \lambda/n)^n \to e^{-\lambda}$. Note that $\lambda$ and $k$ are fixed as $n \to \infty$. So we have
$$P(k) = \lambda^k e^{-\lambda}/k! \qquad (4.12.7)$$
This is the Poisson distribution. I have sketched a few examples of it in Figure 4.12-2. You will notice that when the average or expected number $\lambda$ in a box is considerably less than one, as for our example of low barometer plus wind, or food plus stomach ache, most of the probability is concentrated at $k = 0$, an empty box. Moreover the probability of just one event in a box is closely $\lambda$ itself:

$$P(k = 0) \approx 1 - \lambda \approx 1, \qquad P(k = 1) \approx \lambda$$

$$P(k \ge 2) = \sum_{k \ge 2} P(k) \approx \lambda^2/2 = 0.0015, \quad \text{for } \lambda = 0.055$$

This is the significance probability. Roughly once in 700 years ($1/0.0015$) would we expect 2 or more coincidences if the expected number is 0.055. We conclude at the significance level 0.0015 that low barometer and wind, food and stomach ache, etc., are not independent.

As the expected number in a box increases to 1 or more, the maximum probability moves out from $k = 0$, an empty box, and begins to develop a hump at $k \approx \lambda$. By $\lambda = 9$ one can see the teeth of the rake are being trimmed to approximate a normal distribution, a continuous distribution whose area could then approximate the probability in a number of adjoining teeth.
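The significance probability above can be computed exactly from the Poisson formula (4.12.7) rather than from the $\lambda^2/2$ approximation; this is a check of the text's arithmetic, using only the standard library.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # P(k) = lam^k e^(-lam) / k!
    return lam ** k * exp(-lam) / factorial(k)

lam = 0.055  # expected number of coincidences in a year
# Significance probability of observing two or more coincidences.
p_2_or_more = 1.0 - poisson_pmf(0, lam) - poisson_pmf(1, lam)
print(round(p_2_or_more, 4))  # about 0.0015, i.e. once in roughly 700 years
```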
One can define and then examine a great variety of events by the Poisson distribution. For example, we assumed that in the barometer series the barometric events ($p < 28$ inches of mercury) were independent in time. We could test this assumption alone, if we suspected that a low barometer reading on one day enhances the probability that succeeding days will also have low barometer readings, i.e., that barometric events were clustered. We define a delayed coincidence as two days in succession ($P(t-1) < 28$, $P(t) < 28$) of low barometer. By definition this coincidence (of two events) just occurs once, on the second day. We expect, with time independence (or no clustering), just $(5/365)(5/365) \cdot 365$ (pairs) $\approx 0.068$ of these delayed coincidences in a year. If we had two such coincidences in a year (this would be either three low barometers in a row, or two separated pairs, four low barometer events in all), we would conclude that low barometer tended to significantly persist, or cluster, for more than one day. You could also define events as big or small barometric changes, $(p(t) - p(t-1))$. Again these must be dated, preferably as of the second day, so you only have one event occurring of this type at a time.
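The delayed-coincidence count, dated on the second day so each pair is counted once, can be sketched on a hypothetical daily record; the particular event days below are made up for illustration.

```python
# Hypothetical daily record: True marks a "barometric event" (p < 28).
low = [False] * 365
for day in (40, 41, 42, 200, 300):  # three in a row, plus two isolated days
    low[day] = True

# A delayed coincidence is two successive event days, dated on the
# second day, so a run of three low days counts as two coincidences.
delayed = sum(1 for t in range(1, len(low)) if low[t - 1] and low[t])
print(delayed)  # 2: the pairs (40, 41) and (41, 42)

# Expected number under time independence, as in the text:
expected = (5 / 365) * (5 / 365) * 365
print(round(expected, 3))  # about 0.068
```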
What has all this got to do with data on securities? If for barometric pressure you will read price, and for wind velocity $v(t)$ read volume, you will see you can analyze events in the volume and price sequences in exactly the same manner. This is called technical analysis if you are an investor, or monitoring the tape if you are a market supervisor or regulator. There are lots of events that can be defined in these sequences: big price changes, big daily range, and similarly for the volume. You can read about examples of this in my Jour. Am. Stat. Assoc. paper in 1967. Much of the folklore of the market place can be examined this way, and some of it is confirmed.
Maxima and minima of stock prices are signaled by large volume, so here is a chance to make a buck, maybe. You should note that highly significant "signals" in the probability sense are by no means infallible. The barometer was significantly related (Prob. = 0.0015) to the wind velocity. Nevertheless, as a storm indicator it was wrong more than half the time.
Events, both simple and multiple, can be defined in lots of ways, and not just in numerical sequences. The dates on which the board of directors of a corporation, or the Federal Reserve Board, met might be known. If you suspected leaks of information you might look for coincidences or delayed coincidences of price and volume events with these meeting dates. The possibilities are limited only by your interest and ingenuity.

Fig. 4.12-2 The Poisson Distribution p(k). λ = Expected Number in One Box.
4.13 Relation of Correlation, Regression, Contingency by Chi-square and the Method of Coincident Events
It might be well to take a problem which I am sure you have worked out in your statistics classes, and show that it can be cast in these other forms which I have just discussed. Consider the calculation of correlation coefficients or linear regression coefficients. You can take a problem of that type and cast it either as a problem in chi-square, or a test with the Poisson distribution. I think it will be more easily illustrated if we take a particular case where you have some idea of what to expect. Suppose I am a man from Mars, and I have a lot of data on the weight and height of a population of students at the University of California. I plot weight against height, not knowing what to expect, and I get a scatter diagram. I have perhaps a thousand points, or members of the population, and I can see in a general way that the height and weight increase together. I am sure you have formulas in your statistics books that show you how to calculate the regression of height on weight, which I remind you is the curve of the mean height against weight. The other regression is the mean weight against height. They are not exactly the same
Fig. 4.13-1 Scatter Diagram of Weight vs. Height (Schematic). (Axes: height in feet vs. weight in lbs; quadrants marked tall/not tall and heavy/not heavy.)

thing. If we approximate by linear (straight line) regression they
are two lines, like scissors. If there is strong correlation they are close together; if there is no correlation they are at right angles, parallel to the axes. The regression actually tells you the relationship of mean height against weight, or vice versa. If you just want to answer the question, do weight and height depend on each other in some systematic and monotonic fashion, then you calculate the correlation coefficient. It is the following expression, and it has a standard deviation which is roughly one over the square root of the number of observations. So if you want to determine a significant correlation coefficient of ±0.1, you must have at least 100 observations.
$$r = \frac{\frac{1}{N}\sum_i (h_i - \bar h)(w_i - \bar w)}{\sigma_h \, \sigma_w} = \frac{\overline{hw} - \bar h \, \bar w}{\sigma_h \, \sigma_w}$$
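The correlation coefficient of height against weight can be sketched directly from its definition; the heights and weights below are made-up numbers for illustration, and the population (1/N) form of the standard deviations is used.

```python
from math import sqrt

def correlation(h, w):
    # r = mean((h - hbar)(w - wbar)) / (sigma_h * sigma_w),
    # with population (1/N) standard deviations.
    n = len(h)
    hbar = sum(h) / n
    wbar = sum(w) / n
    cov = sum((hi - hbar) * (wi - wbar) for hi, wi in zip(h, w)) / n
    sh = sqrt(sum((hi - hbar) ** 2 for hi in h) / n)
    sw = sqrt(sum((wi - wbar) ** 2 for wi in w) / n)
    return cov / (sh * sw)

# Made-up heights (feet) and weights (lbs):
heights = [5.4, 5.6, 5.8, 6.0, 6.2, 6.4]
weights = [130, 145, 150, 170, 180, 210]
r = correlation(heights, weights)
print(round(r, 2))  # close to 1 for this nearly linear data
```

With only six points the rough standard deviation $1/\sqrt{N} \approx 0.4$ reminds you that a real test needs far more observations, as the text says.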
Let me describe a way by which you could have examined this data in terms of events. Divide this population with two variables into two classes with respect to each variable, one class small and the other large, for each variable. Take all the people who are taller than six feet six inches, those to the right of the vertical line, and all those to the left who are not so tall. We also divide into two classes, heavy and light, those who are above and below the horizontal line at 220 lbs. You can imagine there are 365 people in the sample: five are tall, four are heavy, and there are only two who are both tall and heavy, to the right of and above the two lines. You line the people up in any order and run them through a machine which is a door frame with a bell and a light on it, in front of some scales. Every time a tall man hits his head on the door frame the light flashes, and every time the scales go over 220 lbs. the bell rings. We count the light flashes, the bell ringings, and the coincidences between them. You can see this is exactly the same problem as the storm and barometer, and we conclude as before that height and weight are not independent.

We could also cast this problem in the form of a chi-square test. Just divide the data up into boxes by vertical and horizontal lines, and count the number in each box. You would have four boxes if you used median vertical and horizontal lines, but you don't have to have the same number of divisions on the vertical and horizontal coordinates, nor do you have to have equal numbers in each class with respect to each variable, though it is usually desirable to do this at least approximately.
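A 2 by 2 chi-square test of this kind is short enough to sketch in full. The median-split counts below are hypothetical (a balanced division of 1,000 students), not from the text; the second call uses the March vs. August counts of Table 4.14-2.

```python
def chi_square_2x2(a, b, c, d):
    # Counts arranged [[a, b], [c, d]], e.g. rows = heavy / not heavy,
    # columns = tall / not tall. Returns chi-square with 1 d.f.
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for obs, row, col in ((a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)):
        exp = row * col / n  # expected count under independence
        chi2 += (obs - exp) ** 2 / exp
    return chi2

# Hypothetical median split of 1,000 students:
# 300 tall-and-heavy, 200 tall-not-heavy, 200 heavy-not-tall, 300 neither.
print(round(chi_square_2x2(300, 200, 200, 300), 1))  # 40.0, far past the 1% point (6.63)

# March vs. August advances and declines of the DJI (Table 4.14-2):
print(round(chi_square_2x2(26, 47, 48, 26), 1))      # about 12.6, also past the 1% point
```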
You will notice that the chi-square test can tell you about independence or dependence where a calculation of the correlation coefficient might not. The variables don't have to be distributed in an elliptic fashion, with one variable systematically increasing or decreasing with the other. We might have a V-shaped regression line, or an arrangement of the scatter diagram like an X, or a wave like an M or a W. Correlation or linear regression would not reveal this kind of statistical dependence.
You can see that a numerical variable can always be put in attribute form: tall, short, light, heavy. You can also go the other way, but it is a little more treacherous doing this. Some attributes don't lend themselves very well to numerical assignments. If it is a simple attribute, like sex, you might call male plus one and female minus one, and calculate a correlation coefficient of sex vs. height or sex vs. weight. If you were studying the effect of dye color on fabric strength, you could test data by chi-square. A regression or correlation would require assigning numbers to colors. This is not so easy.

With three variables there are six correlation coefficients (or equivalent chi-square or event tests): three different pairs of variables with the third summed over, and three more pairs with the third held fixed. If you want to be really thorough, you should in addition test all three at once by chi-square or events. This is the problem the economists run into when they try to regress price on supply or demand, when other variables enter in. Sometimes you can sort out a situation of this kind, say with four or five variables, by examining just two at a time, and there are contingency tables with three or more variables. The same remarks apply to calculations of regression or correlation coefficients.

One booby trap which it seems financiers are particularly inclined
to fall in, is this one. They have a batch of data, with "measurements" on half a dozen or more variables for each member of the population: prices or price changes, earnings, sales, yield, capitalization, book value, cash flow, volume, etc., etc. They pick just one, the price, or "profit" (price change) or change in value (change of log price), and try to explain just that one in terms of all the others. This is like a committee of the six blind men who felt the elephant. For some reason they pick the tail, or the ear, as the "important" part and pool their information or impressions to explain the properties and behavior of just that part, defined arbitrarily as "the elephant." This is as arbitrary as the man from Mars deciding that just height, or just weight, is the only significant and definitive attribute of a man, and trying to explain just that one in terms of all the other physiological, environmental, or genetic attributes a man might have. You can read a most critical and enlightened account of this approach in the paper by Keenan, "The Great SERM (single equation regression model) Bubble," in Frederickson's book (second ed.). We all understand why financiers think price or profit is important (a vested interest (!) in the profit concept), but concentration on trying to explain just this one variable is a very narrow-minded, tunnel vision approach to the problem.

When the variables which describe a population get numerous, as they certainly do in finance, economics, or psychology and anthropology, there is a different kind of approach. Rather than specifying the germane variables as they are given you from the observations, and trying to get the effect of each, you instead ask of the data: how many different variables are there? If you are clever, you can even let the data try to tell you what they are. You do not start out by naming and identifying the variables in advance. Nor do you ask how each variable operates, i.e., calculate a regression coefficient. Let me give you just two examples of this approach.
You have data on price sequences in 1,000 stocks. You suspect there are variables which influence price changes in these sequences. You want the data to tell you how many, and possibly what, they are. There are believed to be "industry factors or components" such as transportation, retailing, utility, steel, oil, etc., etc., in addition to whatever you may know about the individual companies. Can the data on the price sequences alone tell you how many, and possibly what they might be?
Brealey, in his first book (Chap. 5), describes a procedure given by B. F. King for sorting out contributions to the variance of price changes. The method is semiautomatic: you do not have to make preliminary guesses as to what the significant variables are, which might bias your results. Interestingly enough, the industry groupings which result are quite close to what you might have suspected from what you know by observation of what industry these companies are in.
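The essence of King's sorting, common variance within an industry exceeding that across industries, can be illustrated on synthetic data. This is not King's method itself, just a toy model in which each stock's daily change is an industry factor plus its own noise; the industry names and all numbers are made up.

```python
import random

random.seed(1)

def make_stock(factor, noise=1.0):
    # A stock's daily changes: common industry factor plus its own noise.
    return [f + random.gauss(0, noise) for f in factor]

def corr(x, y):
    # Population correlation coefficient, as in the formula above.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / (vx * vy) ** 0.5

n_days = 250
oil = [random.gauss(0, 1) for _ in range(n_days)]   # hypothetical "oil" factor
rail = [random.gauss(0, 1) for _ in range(n_days)]  # hypothetical "rail" factor

oil_a, oil_b = make_stock(oil), make_stock(oil)
rail_a, rail_b = make_stock(rail), make_stock(rail)

within = (corr(oil_a, oil_b) + corr(rail_a, rail_b)) / 2
across = (corr(oil_a, rail_a) + corr(oil_b, rail_b)) / 2
print(round(within, 2), round(across, 2))  # within-industry correlation much larger
```

Grouping stocks by which correlations are large, without naming the industries in advance, is the semiautomatic idea.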
An early and famous historical example of this process of variable sorting and identification occurred in intelligence testing. People knew from common observation that there are different ways to be intelligent. You can be smart in some ways and stupid in others. Memory, verbal ability, numerical ability, logical ability, aural, ocular and manual skill, persuasiveness, imagination or ingenuity are aspects or attributes of this subtle and complicated quality we call intelligence. It was, and still is, not completely known how many independent components of intelligence there are, or just what they are. It was also suspected that "tests" which result in a single numerical "score" might have something to do with intelligence, just as the numerical score of "price" has something to do with the "value" assigned to an object by a buyer or seller, whether it is a painting or an engraved stock certificate.
Thurstone (an engineer at the University of Chicago) wrote a book called "The Vectors of Mind," in which he described one way in which the scores might be used to disentangle and enumerate the components of intelligence. The methods are somewhat related to those which King used to identify the industrial components which go into the changes of a stock's price.
Similar problems to these arise in anthropology and archaeology, where you try to identify and perhaps trace the historical components of a culture or a race of people, or, in biology, of related species. The same kind of problem arises in linguistics, where you try to identify what the relations of different languages are by properties of pronunciation, vocabulary and grammatical construction. Research on the best methods of doing these things is still going on. Evidently there are similar problems in economics and finance, where you try to identify what and how many the significant variables are.
4.14 Seasonality in Stock Prices. An Example of a Problem with Three Variables. Good and Bad Practice in Statistics
At this point the students got Problem 6, for which I have given in these notes the data and explanatory comments from the sources (see Problem 6, Table 4.14-1). These sets of data can be examined in a number of different ways. I want to discuss this problem in some detail in order to illustrate what is good and bad practice in handling data statistically.
The yearly chi-square test as given by Osborne does not distinguish between 12-, 6-, 4-, 3-, or 2-month periodicity. It just says that the attribute "advancing or declining index" (the Dow Jones Industrial, monthly) does depend in a barely significant fashion (statistically, not practically) on the attribute "month of the year." By grouping and adding the count of the data for January with July, February with August, etc., the 12 by 2 table can be reduced to a 6 by 2 table, and tested for a mix of 6-, 3-, and 2-month periodicities. Similar groupings enable tests for 4-, 3-, and 2-month periodicities. In this way the students found "significant" periodicity (barely, at P = 0.05) only at 6 and 3 months, and that primarily in the DJI.

From the above discussion you should see there is nothing sacred
about examining the data this way, using a twelve-month period only. We could have used a 13-month period, 14-month, or any and all intervals you like, and sorted the original monthly data on index changes into a 2 by n-month table and tested by chi-square. Most of them would not have given a "significant" result, but some might. It would not be necessary to have the same number of observations in each month column.
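The grouping step, reducing a 12 by 2 table to a 6 by 2 table by adding month m to month m + 6, is mechanical enough to sketch. The monthly counts below are hypothetical, chosen only so that March and August match Table 4.14-2.

```python
# Hypothetical monthly advance/decline counts, Jan..Dec.
advances = [40, 33, 26, 41, 38, 35, 42, 47, 30, 36, 34, 36]
declines = [34, 41, 48, 33, 36, 39, 32, 26, 44, 38, 40, 38]

# Group January with July, February with August, etc. The resulting
# 6 x 2 table is blind to a pure 12-month cycle but still sensitive
# to 6-, 3-, and 2-month periodicities.
adv6 = [advances[m] + advances[m + 6] for m in range(6)]
dec6 = [declines[m] + declines[m + 6] for m in range(6)]
print(adv6)  # [82, 80, 56, 77, 72, 71]
print(dec6)  # [66, 67, 92, 71, 76, 77]
```

The condensed table can then be tested by chi-square with (6 − 1)(2 − 1) = 5 degrees of freedom instead of 11.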
There are still other possibilities. We might have grouped all the data from January to June, and July to December, and tested the resulting 2 by 2 table. If you are looking for significant results, grouping April to September and comparing against the other six consecutive months would very likely give you a significant result. We could also just compare months two at a time. March against August gives a two by two contingency table, for the Dow, which is certainly significant by chi-square at the 1% level. When a contingency table has more than four boxes, you can see there are lots of ways to either group the data and condense the table, or examine only parts of it. Some of these tests can be done approximately and quickly, using the rule of thumb for the fluctuations
in a cell as the square root of the expected number, or more exactly $\sqrt{Npq}$.

But you should also see there is something a little fishy about making a large number of these tests, and accepting uncritically the verdict "significant" or "not significant." If you devise twenty different groupings or tests, either of all the data or just parts of it, then one of them almost surely is going to come out "significant" at the five percent level, even when there is nothing but pure randomness and independence in the original data. This is sometimes called an "error" of the first kind: rejecting the null hypothesis when in fact it is correct. This is a booby trap which people who have to analyze a lot of data must constantly watch out for.
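The twenty-tests warning is a one-line calculation: if the tests were independent, each with a five percent chance of a spurious "significant" verdict, the chance of at least one false alarm is substantial.

```python
# Probability that at least one of twenty independent tests comes out
# "significant" at the 5% level on pure chance alone.
p_at_least_one = 1 - 0.95 ** 20
print(round(p_at_least_one, 2))  # 0.64, i.e. better than even odds
```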
There is one grouping of this data which it is instructive to explore. Considering all three sets of data, they form a three-variable contingency table: 1) advance-decline, with two values; 2) month, with twelve values; 3) the "observer," with three values (Osborne, Zinbarg, Brealey). If we wanted, we could test the entire table at once for dependence by chi-square, with (2 − 1)(12 − 1)(3 − 1) = 22 degrees of freedom (Table 4.14-1). The grouping possibilities here are enormous. (The data, given in percentages, has to be converted to frequencies; Osborne's data, from Granville, was also originally given in percents.) Evidently there is a significant variation of advance and decline proportions with observer, but who is contributing the most to the chi-square, or the dependence? The individual contributions to chi-square from Table 4.14-3 are shown in Table 4.14-4, so Osborne's data (taken from Granville) is obviously the maverick, contributing one half of the total "significant" chi-square. This was data from the DJI.

One could also ask the question: do Zinbarg and Brealey differ significantly? Zinbarg used the Standard and Poor index; Brealey, using a longer interval of time, used both S&P and Cowles Commission indices. Does the admixture of Cowles data make a significant difference? Apparently not, although the ratio of advances to declines for Zinbarg, 326/214 = 1.52, appears to be noticeably different from that of Brealey, 660/500 = 1.32. By chi-square (Table 4.14-5)
Zinbarg and Brealey are not significantly different, on this score of relative number of advances and declines. They had half their data in common anyhow, so this is not too surprising. Whether or not the Cowles Commission data is significantly different from the S&P data (for different years) would have to be determined by examining that data separately. We can make a comparison for different years, and also different indices, by subtracting the count for Zinbarg from that of Brealey. This gives a contingency table for different non-overlapping years, although the years are not consecutive. This grouping choice is given in Table 4.14-6.

So at the five percent level (barely) there is a statistical dependency between the two attributes listed in the above table. We should note carefully just what this conclusion does and does not say. Evidently the years are different in the two columns, but we cannot necessarily conclude that the dependence of proportion of advances and declines is primarily a calendar or "era" effect. It might be due in part to the way in which the indices were computed. The Cowles index was an average of 8 to 18 stocks. The S&P had many more. The formula was changed and the index recomputed in 1957, to weight prices with the number of shares outstanding for each stock. Which S&P formula was used for what years is not known to me. The chi-square test just says there is a difference, without pinpointing with what attribute of the grouping the difference is primarily associated.

The important point of the above discussion is not just that the data samples are significantly different, which suggests nonstationarity. It is rather that a significance probability should always be computed if it is at all possible to do so. Without it you have no more than intuition or bias as to what to believe or how strongly to believe it. Brealey and Zinbarg "looked" different when we compared ratios of advances to declines, but actually they were not. Significance probabilities at least tell you what is or is not likely to be believable; without them you are just navigating in a fog, not knowing north from south or east from west. Significance probabilities give you a little sense of direction, though they are far from indicators of perfect truth. I must confess I regard statisticians who give data and draw conclusions from data without significance probabilities, or at the least, error estimates, in the same class with people who use plumbing and don't pull the chain. It's an incomplete, sloppy job.
There are some booby traps even when significance probabilities are given. I mentioned one: errors of the first kind. A second kind of booby trap depends on the reliability of the distribution from which the significance probability is calculated. For statistical problems for which the basic data is simply counts, as in the above example, the distribution (theoretical, for no dependence) is usually known: chi-square, binomial, Poisson, etc. If the data is not given by counting, but by measurement of the amount of price change, then the shape of the distribution of the data matters a great deal in calculating the significance probability.
For example, instead of giving the number of months of advance and decline, we might have tabulated the actual change of index in its units, or as a percentage, for each month of each year, and given the mean and standard deviation for each month of each year. These standard deviations might or might not be reliably convertible to a significance probability for whatever was being tested: periodicity, difference of observers or eras. This is one of the points of the papers by Mandelbrot and Fama. Unless you know and have checked out the distribution as being approximately normal, a variance or standard deviation does not tell you very reliably about the probability of significance of your results. It is very easy to be grossly misled on this score. Sometimes the best you can do is assume normality (or transform the data to approximate normality) and give a standard deviation. But not to give any "error" or fluctuation estimate, or data by which it can be estimated, is poor practice indeed.
It is true that counting methods (order statistics) lose "information" in the technical sense, relative to analyzing the quantitative measurements, as might have been done as described above. This isn't serious when you have lots of data, as you usually do in securities markets. The compensation for this loss of information is that when you do discover, or think you have discovered, something, you can test much more reliably how much to believe it.
If you give statistical results in graphical form or in tables, you should include in the legends and labels of the axes complete information as to exactly what the tabulated or plotted quantities are, including the dimensions: price in dollars per specified unit, time in weeks, etc. To me it is really maddening to look at tables or graphs where this information is lacking and you have to hunt through the text to find these quantities, and frequently not find them. A graph in particular is devised to enable the eye and mind to encompass the data as a whole. If the axes aren't labeled with dimensions and units, this purpose is defeated.
[Tail of Table 4.141, monthly advances and declines of the DJI, not reproduced here: totals 438 advances, 450 declines; P(Chi-square > 24.6) = 0.011, n = 11 d.f.]
Table 4.142 Advances and Declines of DJI for March vs. August (from Table 4.141)

             March    August
  Advance      26       47
  Decline      48       26
Table 4.143 Months of Advance and Decline of an Index. Decimals in Parentheses Are Probabilities of Advance and Decline

  Index          Era          A             D             N(2)
  DJI(1)         1886-1960    438 (.49)     450 (.51)      888
  S.P.           1918-1962    326 (.605)    214 (.395)     540
  Cowles S.P.    1871-1965    660 (.57)     500 (.43)     1160
  Totals                     1424 (.55)    1164 (.45)     2588

(1) Data from Granville. (2) N = total number of months.
Table 4.144 Chi-square Contributions from Table 4.143

             DJI    S.P.   Cowles S.P.
  Advance    3.4    3.0       0.7
  Decline    6.5    3.5       1.0

Chi-square = 18, P(Chi-square > 18) << 0.01, n = 2 d.f.
Table 4.145 Comparison of Data of Zinbarg and Brealey (from Table 4.143)

      Index          Era          Advance       Decline       Total
  Z   S.P.           1918-62      326 (.605)    214 (.395)     540
  B   Cowles S.P.    1871-1968    660 (.57)     500 (.43)     1160
  Totals                          986 (.58)     714 (.42)     1700

Chi-square = 1.78, P(Chi-square > 1.78) = 0.4, n = 1 d.f.
Table 4.146 Monthly Index Advances and Declines for Non-overlapping Eras of Time (from Table 4.145)

             1871-1917 (1)    1918-62 (2)
  Advance    334 (.54)        326 (.605)
  Decline    286 (.46)        214 (.395)

(1) Cowles data. (2) S.P. data. [A third era, 1963-1968, appears in the source header, but its counts are not legible.] Chi-square = 4.1, P(Chi-square > 4.1) = 0.04, n = 1 d.f.
4.15
Some General Methods of Attacking Large Batches of Data. Preliminaries of Analysis of Sequential Data
I now want to take up some general methods of attacking the
description of a large amount of data, with the ultimate end in view of describing sequential data, or time series. Initially, we can ignore the sequential aspect. I want to point out that the history of science and statistics has somewhat biased the way in which people tend to
look at large batches of data. I want to make a special effort to point out these biases and how to take account of them.
First a story, apocryphal perhaps, which illustrates what can be and is done, and how not to do it. There was an Indian student at Oxford who was introduced to the freshman course in mathematics. After about two weeks he went to the professor and said he had better drop the course. He didn't feel competent to manage it. The professor asked him why. He said, "We have been studying logarithms for the last two weeks, and I have only been able to memorize the first two pages of the logarithm table. It is just beyond my power to learn them all. I think I had better just drop this course. I am a failure." This illustrates, needless to say, the wrong way to go at it.
If you want to understand logarithms you don't try to remember every one of them. Instead you try to be labor saving, and remember and hopefully understand a few properties of logarithms. The table is just something that you use, but you certainly don't try to remember it. Just remembering that log10 2 is 0.3 is enough to get you most logarithms, for graphical purposes at least. On linear paper from 0.0 to 1.0, the values 1, 2, 4, 8, 10 are plotted at 0.0, 0.3, 0.6, 0.9, 1.0, and you have log paper, with as many "cycles" as you want.
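The remark about log paper is easy to check; a small sketch comparing the positions obtained from log10 2 being about 0.3 with the exact logarithms:

```python
import math

# One "cycle" of log paper from the single fact log10(2) ~ 0.3:
# each doubling moves the pencil 0.3 along the unit interval,
# and log10(10) = 1.0 closes the cycle.
approx = {1: 0.0, 2: 0.3, 4: 0.6, 8: 0.9, 10: 1.0}

for value, position in approx.items():
    print(value, position, round(math.log10(value), 3))
```

The approximate positions are within about 0.003 of the exact ones, well inside the width of a pencil line.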
For large batches of observational data (as opposed to mathematical tables), just writing the data down and saving it is not very labor saving either.
The purpose of describing data statistically is
to be labor saving, so that instead of having to remember or keep a record of ten thousand numbers, you only have to remember or
think about a few. A mean, or median, a regression coefficient, one or two parameters which describe the shape of a distribution, may be all you really need.
What the student tried to do with the logarithm table is in some respects analogous to what has and still does go on to some extent in such sciences as astronomy, meteorology and oceanography, for which I can speak from personal familiarity. The end point objective
for many, not all, practitioners in these sciences, is the faithful and continuous recording of large amounts of data. There is a saying among astronomers that the best astronomers and the most discoveries come from the British and the Dutch, because they have the worst weather and the worst instruments. The implication is that they are forced to take the time and thought to analyze other peoples' data, since they can't get much of their own, or else wring dry what little they themselves can get. I am sure there are similar practices in economics and finance. As a Chicken Little-minded physicist, I have reverence for data faithfully recorded, but one shouldn't stop there.
So we have a batch of data, a sheet of paper with columns of figures, one column for each variable. The stock page in the newspaper is good enough for illustrative purposes. The first step is to look at it. Doing simple things first, one thing at a time, just take one variable, say the closing price column, and look at that. "Look at it" does not mean just read off the numbers, one at a time. The eyeball and the brain, given a chance, can look at the whole column of figures at once and draw conclusions which would otherwise take a bit of computing. Just plot that column of numbers as dots on a straight line, stacking them up on top of each other if two or more happen to coincide. So the "picture" which the eyeball and brain gets of the column of figures looks something like this. Computers can even make them for you.

Fig. 4.151 A "Picture" of a Column of Data on Prices.

From this "picture" of the column of figures the eyeball and brain
can read off a lot of calculations. You can read off the mean by asking where this one-dimensional "cloud" of points would balance on the point of a pencil. This means you have calculated, by adding and dividing in your head, without use of numbers, the quantity (1/N) ∑ᵢ Pᵢ. You can actually do it quite accurately, well within the probable error of the mean, which is all you should really ask. Try it
if you don't believe it. You can get even greater accuracy by doing it twice, once with the picture turned upside down, so as to cancel out systematic errors which the eye and brain make in the slight asymmetry of perception from right to left. I might mention that if you try this with the line of dots running vertically, you will make much bigger systematic errors. As a computer, the eye and the brain add and divide with much larger systematic errors in the vertical direction. That is why, when people try to divide a length with the eye, they stretch the length horizontally. What this irregular one-dimensional cloud of dots represents to the eye is a frequency distribution, and one can give quantitative substance to this point of view by dividing the axis, price in this case, up into segments, which need not be of equal length, and dividing the number of points in each segment by the length of the segment.
This is a histogram, a plot of the density of dots along the axis. I have done just this in Figure 4.152 from my first paper, in Cootner (pp. 102-104). It was the very first thing I did when I became seriously interested in the stock market.
A histogram can
show you quite a bit, if you know how to look at it.
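The construction just described, counts divided by segment lengths, with segments not necessarily equal, can be sketched as follows (the function name and the data are mine, purely for illustration):

```python
def density_histogram(values, edges):
    """Histogram as dot density along the axis: the count in each
    segment divided by the segment's length.  Segments are half-open
    [edges[i], edges[i+1]) and need not be of equal length."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]

# Illustrative closing prices, with a wider segment at the sparse high end.
prices = [8, 12, 15, 18, 22, 25, 27, 31, 38, 44, 47, 62, 95]
edges = [0, 20, 40, 100]
print(density_histogram(prices, edges))   # dots per dollar in each segment
```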
Fig. 4.152 Distribution Function of Closing Prices for July 31, 1956 (All Items, NYSE). (75 ± 8.7)/5 = 15 ± 1.7.

If you want to be thorough and crafty, it pays to plot the histogram with different choices of the independent variable.
I have given the same data both for price p and loge p (Figure 4.153), which are standard choices; p², or the square root of p, would give a quite different shape to the histogram. You will note that on the loge p histogram the preferred stocks stand out very clearly, whereas they are almost invisible in the histogram of p itself. To be really thorough I should have put vertical error bars on each panel of the histogram,
using the square root of n rule; they would be very different in length for different panels.
Fig. 4.153 Distribution for logeP on July 31, 1956 (All Items, NYSE Common and Preferred).
I have calculated this error bar for an insignificant peak at the panel from $45 to $50, for NYSE data (Figures 4.152 and 4.155), using measurements from the histogram itself. You will see that it is less than one standard deviation away from its neighbors (see also Figs. 4.152, 3, 4) and so probably not significant. A zero-minded statistician would have left out the word "probably," but I didn't, because there is a similar structure right at the middle of the histogram of the common stocks of the ASE (Fig. 4.156), from $15 to $20, and in this case the bump is more than 2 standard deviations
(Prob. ...).

Fig. 4.155 Distribution Function of logeP for Common Stocks (NYSE, July 31, 1956).
Fig. 4.156 Distribution Function of logeP for Common Stocks (ASE, July 31, 1956). [Annotation in the original figure: a "significant" bump.]
Kendall also has given a most instructive example of unbelievable evidence from a histogram. A machine was turning out metal parts to a certain dimension. All those greater were to be rejected or reworked. The histogram of the original measurements from the inspectors' notebooks showed a sharp drop at the tolerance limit, indicated schematically as follows (Figure 4.158). Neither the owners of the plant nor the inspectors could believe the statistical evidence that knowledge of the tolerance limit had biased the measurements. Observers are frequently biased in the digits they can see or estimate. (Kendall and Stuart, Vol. I, pp. 210-216.)
Fig. 4.157 Cumulated Distributions of logP for NYSE and ASE (Common Stocks). [Curve labels legible in the original: ASE 7/31/56, 505; NYSE 6/30/56, 1086; NYSE 7/31/56, 978.]
Fig. 4.158 Histogram of Measurements of Machine Parts (Schematic). [Axes in the original: relative frequency vs. dimension in inches, with the tolerance limit marked.]
Problem 7
Histograms with Respect to Eighths. "Quote" Interpretation.
1) Read Cootner, pp. 286-290 (Op. Res. 10, 345, 1962), with especial attention to Figs. 11, 12 and exactly how they were constructed. Execute the "verification" of p. 287 with 100 consecutive stocks from any newspaper, any date.
2) Read Frederickson, Frontiers of Investment Analysis (2nd Ed.), p. 729 (last paragraph) to p. 730, IX, conclusions. Do you see any contradictions to reading 1), or internal contradictions in this reading? Explain them if you can.
3) Read "How to read quotations" (by a professional tape reader, on reserve). Spell out exactly what you don't understand, and then dig out an explanation. In particular, pp. 11, 12 concerning the implications of stopped stock. In this connection note the remark about quotes, p. 7. What in general is the "message" comparing "quotes" vs. "transactions"?
4.16
Extreme Observations. Some Biases from the History of Science
There is one other observation which should be made when you have numerical data on a variable plotted out in a line. Focus your attention on the three or four points which are plotted at either extreme of the data, and ask yourself the question: is there anything special about these members of the population, or the way or time at which this data was taken? This question is motivated by elemental animal curiosity; these members are more likely than others to be freakish or unusual in their other properties, and should excite your Chicken Little curiosity. They also mark out the limits of the population so far as this variable is concerned, and we have pointed out that knowing and studying the limits are of basic importance in understanding processes and properties.
In the particular example of market prices on an exchange, the very low priced members are likely to be mangy dogs, which may be delisted and lose some marketability. Or they might recover, and are real bargains. At the high price end of the scale we have a mix of high-flying hot stocks, and maybe blue chips, candidates for splits, which, it is said, improves their marketability. You can see analogies to this in many other situations. If the variable happens to be height or weight of a human population, either extreme is marking limits to just being a human being. A practical example illustrating the importance of noticing extremes is the price-earnings ratio, which they now print in the paper every day for every stock. Extremely large or extremely small values are a signal that something is unusual, and not necessarily "good," whether the ratio is very large or very small. The Equity Fund had a very small P/E ratio just before it went bankrupt. The ratio can be extremely large just before a hot stock loses all its glamour and crashes.
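The advice to pick off the three or four extreme members can be made mechanical; a sketch, with invented tickers and price-earnings ratios (none of these names or numbers are from the text):

```python
def extremes(records, key, k=4):
    """Return the k lowest and k highest records on one variable,
    the members most worth special attention."""
    ordered = sorted(records, key=key)
    return ordered[:k], ordered[-k:]

# Invented tickers and P/E ratios, purely for illustration.
stocks = [("AAA", 4.2), ("BBB", 11.0), ("CCC", 9.5), ("DDD", 55.0),
          ("EEE", 14.3), ("FFF", 2.1), ("GGG", 98.0), ("HHH", 12.7),
          ("III", 8.8), ("JJJ", 31.0)]

low, high = extremes(stocks, key=lambda s: s[1], k=3)
print("unusually low P/E:", low)    # trouble, or possibly real bargains
print("unusually high P/E:", high)  # hot stocks that may lose their glamour
```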
From a historical standpoint, the above procedure, focusing on the extremes of a distribution, is almost exactly the opposite of what was considered acceptable procedure: throw the extreme observations out. This is still a very common practice. The history of how this mischievous practice originated is interesting, and also shows the origin of a deep-seated bias in some of our ideas.
The "normal" or gaussian distribution was originally used by Gauss to describe "errors" in astronomical observations, primarily of star and especially planetary positions; latitude and longitude on the sky, you might say. The sources of errors and corrections are numerous and complicated: errors in clocks, distortions due to refraction of light in the atmosphere, and the bending and other imperfections of the telescope. The astronomers could sort them out and apply corrections, using the method of least squares to evaluate regression coefficients which determine the corrections.
There was much competition between astronomers and observatories to put out data with the smallest "probable error," which is 2/3 the standard deviation. Now in such a least-sum-of-squares method, it only takes a few big deviations to make the final error get big too, so there was an incentive to throw out, preferably with an objective criterion, the observations that made an astronomer or an observatory look bad. The same incentives were present when least squares spread to the other exact sciences. Everybody likes to be accurate, and while it is cheating just to throw the discordant observations out without saying anything about it, there are more subtle ways of accomplishing the same thing. A not very subtle method of diminishing errors is to take more observations. Nominally the error goes down with the square root of the number of observations, as we saw in connection with random walks, so piling up the data to beat down the error was and still is a very common procedure in astronomy. The financial analog is using a 500-stock index to describe "the market"; ten would be enough.
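The square-root rule is easy to see in a quick Monte Carlo sketch (the function name and trial counts are my own choices):

```python
import random

random.seed(1)

def std_error_of_mean(n, trials=2000):
    """Empirical standard deviation of the mean of n unit-variance
    gaussian observations, estimated over many repeated samples."""
    means = []
    for _ in range(trials):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    grand = sum(means) / trials
    return (sum((m - grand) ** 2 for m in means) / trials) ** 0.5

# Quadrupling the number of observations roughly halves the error:
for n in (25, 100, 400):
    print(n, round(std_error_of_mean(n), 3))   # ~ 1/sqrt(n)
```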
The historical context of science during the era of Gauss and subsequently put an extremely strong bias on the way people look at data of all sorts, even to this day. Newton and Leibniz had invented the calculus, which was admirably devised to describe the detailed motions of the planets under gravitation, and for many other phenomena as well. Calculus is fine for describing solid smooth "thin" lines. A mechanistic philosophy took over: if only you were accurate enough in your observations and took enough of them, you could describe theoretically the smallest details, in principle. The unconscious bias of this viewpoint is manifested when we draw a smooth solid thin line through an elongated cloud of points plotted on a sheet of paper, and unconsciously think of this smooth thin solid line as the ultimate essence of reality underlying the data, which has noise and errors in it. I have spent a long time trying to disabuse you of this unconscious bias in discussing supply and demand. You can see this bias in favor of continuity in representing data when people draw frequency polygons instead of step histograms or "rakes." They may represent pure "white noise" by a violently zigzagging line
(continuous) between isolated points. The subjective impression to
the eye and the brain of these different representations is appreciably different.
A “line” may be safely considered as a representation of the underlying reality when the dots are observed positions vs. time of a satellite circling the earth, but if the dots are height vs. weight of students, or stock prices or price differences vs. time, then the “line” is a very misleading representation of underlying reality. We saw that the variable advance and dispersion random walk is one way (not the only way) of interpolating your understanding between these two viewpoints.
Astronomers and other exact scientists are not the only ones who like to have their results come out “smooth” “lines” with small “errors” or real fluctuations. Gauss laid the foundation for this rational approach to original sin in science, with the normal distribution and the method of least squares. I am sure you are aware that some financial managers like to see a “smooth” line on a calendar chart for the earnings of their company, and there are discretionary and legitimate accounting practices which enable the line to be smoothed to achieve the image of uniform earnings and uniform growth. Earnings is an intrinsically fuzzy number, which “officially” exists once every three months. It can be made more or less fuzzy to a degree which is under the financier’s control. It is just the problem of the security analyst or external tax auditor to detect just how much the earnings have been doctored up or down.
4.17
Discoveries in Noise
I mentioned in an earlier lecture that the marketing department had difficulty in calculating demand curves because of noise in the data. I thought perhaps the noise might be the most interesting part of the phenomena. Let me give you some examples where the "noise," "residuals" or "errors," depending on the terminology of your discipline, really contain a lot of information. This is the honest data you might want to throw out in order to improve your agreement with theory.
Beginning with astronomy, geophysicists have been digging through that data to evaluate the irregular motions of the earth itself. If the earth wobbles or changes its rate of rotation in an irregular fashion, which it does, if continents drift or tilt, these are unexpected and unaccounted for errors in the positions of the stars and the planets.
At the time the data was taken, these possibilities were not part of the theory.
Back at the turn of the century, when electromagnetic phenomena were being actively investigated, physicists tried to develop perfect insulators, or bottles to hold the electric fluid. No matter how they tried, there was always just a little leakage in air. There was absolutely no theoretical reason why dry air should conduct electricity, but it did, just a very little. So they just put this down to experimental limitations, or "residual ionization." Fifteen years later it was discovered to be due to cosmic radiation, an extraterrestrial source of energetic particles which are not yet completely understood.
In the middle 1930's Karl Jansky at the Bell Telephone Laboratories was studying noise in telephone circuits and radio communication. This was something the telephone company wanted to get rid of, just like the economist in the marketing department or the astronomers with their precise observations. Some of these communication noise sources were known: thunderstorms and magnetic disturbances, earth and atmospheric electric currents associated with the aurora. But there was a gentle hiss, really not very loud, that at first he couldn't track down. It was noise as radio waves coming from the center of the galaxy, and thus the new science of radio astronomy was born.
In all these cases of noise, the data was trying to tell you something you didn't ask of it. In a sense stock market data might be called economic noise in a rather pure and even amplified form. You can ignore this noise if you want to, and economists did for a long time. You can try to draw underlying trend curves, or smooth out the data by least squares, moving averages and seasonal adjustments, but you just might be ignoring some interesting phenomena.
4.18
Examining Two Variables at Once
The preceding remarks refer to methods of examining the statistical properties of just one variable. In summary the moral is: look at the data all at once with your eyeballs, and with several changes of scale (x, log x, x², √x, etc.). You can ask a computer to check out what you might see, but don't omit the eyeball inspection.
Exactly the same remarks refer to variables taken two at a time. The eyeball inspection of a scatter diagram will enable you to estimate the marginal distributions and to calculate with your eye the two regression lines. You can do this by inspection of the data in
uncovered (not overlapping) strips, and estimating means. I do this with data all the time. The product of the two regression slopes gives you the square of the correlation coefficient. Eyeball inspection of the data will tell you whether it makes sense to calculate a correlation or linear regression, i.e., whether they are well defined and appropriate quantities. You can also pick off from the scatter diagram extreme members of the sample of various sorts, for special attention. A couple of examples might be instructive. If you just go down the column of the newspaper and plot price vs. dividend, you will get a very lopsided scatter diagram concentrated near the origin. On log-log paper (i.e., log p vs. log D) it appears much more like a two-variable normally distributed scatter diagram. But unless you are careful to put them in as special cases, at the edge of the paper, the zero dividend stocks just won't appear. You have to decide whether a zero dividend "exists" and is zero, or just doesn't exist. Just using logs tends to eliminate these "freaks" unless you are careful. A computer won't tell you about these special cases, unless you are careful to ask. A regression calculation of D on P, D on log P, log D on log P, or other possible choices, is very misleading if you don't inspect the data first.
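The claim that the product of the two regression slopes is the square of the correlation coefficient is easy to check numerically; a sketch with invented data:

```python
def slope(xs, ys):
    """Least-squares slope of the regression of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    return cov / var_x

# Invented scatter-diagram data, nearly but not exactly linear.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.6, 5.1, 5.6]

b_yx = slope(xs, ys)      # slope of the regression of y on x
b_xy = slope(ys, xs)      # slope of the regression of x on y
r_squared = b_yx * b_xy   # product of the two slopes = r squared
print(b_yx, b_xy, r_squared)
```

The identity is exact, since each slope is the covariance divided by one of the two variances, so their product is cov² / (var x · var y).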
Fig. 4.181 Scatter Diagram of Volume vs. Price Change (Schematic). [Axes in the original: volume or log v vs. Δ log p.]
As a second example, consider the plot of Δ log p vs. volume, either sequentially or across the market. Across the market, for a single day or week, these two variables seem nearly independent. If you make a sequential plot, using a single stock price sequence, the scatter diagram looks like a butterfly, very roughly. The regression of v or log v on Δ log p is approximately a V or a U. A calculation of correlation gives you zero (Figure 4.181). The sequential correlation of Δ log p vs. volume or log volume is very slightly positive.
Finally, just a word about trying to do three or more variables at once. In a sense the eyeball starts to fail here. It can't really "look" at a three-dimensional distribution all at once. But you can plot two at a time (three different "projections") and also plot up slabs, or cross sections, where the third variable is restricted to a finite range, say 1/2 or 1/4 of the data span of the third variable. If you have lots of data, these two-dimensional projections and sections can be quite instructive to look at, before you put the data into a computer and start some statistical routine. Three or more variables at once can vary with each other in many complicated and subtle ways, but with enough projections and cross sectional "slabs" you can sort out some of the relationships. I would refer you to Kendall or some of the more advanced statistical texts on this kind of a problem.
4.19
The Quantitative Expression of "Eyeball" Statistics
In the previous sections I have urged you to "look at" your raw data with your eyeballs, and pointed out some of the ways of doing this: the histogram with panel error bars for one variable at a time, the scatter diagram for two variables at a time. You should take note of clustering or bumps, jumps, asymmetries and extreme values of the observations. Different choices of the independent variable on a histogram made the shape of the histogram change quite a bit.
The "classical" way of describing the detailed shape of a distribution is by the moments, the first two (mean and variance) being sufficient to describe the normal distribution. Properties such as asymmetry (skewness) and humpedness (kurtosis) are described by functions of the third and fourth moments. All these quantities have uncertainty or fluctuation estimates which involve moments of twice the order of the given moment. Thus the variance determines the uncertainty of the mean. The fourth moment determines the
uncertainty of the variance (second moment). The sixth moment determines the uncertainty of the third moment, and so on. This is all very fine when moments from the data are well defined, but when they are not, as is frequently the case, then you can use the percentile points (partition values) for the quantitative expression of some properties which you can "see with your eyeballs." Suppose we have the histogram of a variable x. Denote the percentile points or partition values by x_p. Thus x.5 is the median, and x.75 - x.25 is the interquartile range, or (for a normal distribution) twice the "probable error." The following quantities can be used to quantify the various subjective impressions of the histogram. There is nothing unique about the particular values of percentile chosen in any of the criteria 1 to 6.
  Property                          Expressed by Percentiles                         Expressed by Moments
  1. "central tendency"             x.5                                              mean
  2. fluctuation                    (1/2)(x.83 - x.17)                               standard deviation = sqrt(variance)
  3. asymmetry                      [(x.75 - x.5) - (x.5 - x.25)] / (x.75 - x.25)    skewness
  4. concentration ("humpedness")   [formula garbled in the original]                kurtosis
Number 4 will approach unity for a distribution with a high thin peak or central concentration, and approach zero for a U-shaped distribution, or absence of a central concentration (a central "hole"). Criteria 5 and 6 below express the extent to which there are "outliers" to the right and left in the observed distribution. The number of observations can be compared to the number expected with a normal distribution of the same median = mean and intersextile range x.83 - x.17, or 2σ. For this comparison one should use the Poisson distribution to compute the significance probability, since the number of observations refers to a small and positive integral count. An excessive number of these extra outliers (actually it only takes a few) is responsible for the ill-behavior of the moments, when these are not well defined.
5. Number of outliers to the right, N_r: No. of observations ≥ x.5 + m(1/2)(x.83 - x.17), m = 2, 3, or 4.
6. Number of outliers to the left, N_l: No. of observations ≤ x.5 - m(1/2)(x.83 - x.17), m = 2, 3, or 4.
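Criteria 1 through 3 can be computed mechanically; a sketch, with a simple interpolation rule for the partition values (the rule, the function names, and the data are my choices, not the text's):

```python
def percentile(xs, p):
    """Partition value x_p, by linear interpolation in the sorted sample.
    (One of several common conventions; the text does not fix one.)"""
    s = sorted(xs)
    pos = p * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def eyeball_stats(xs):
    """Percentile versions of criteria 1-3 above."""
    central = percentile(xs, 0.5)                                # median
    fluct = 0.5 * (percentile(xs, 0.83) - percentile(xs, 0.17))  # fluctuation
    q1, q2, q3 = (percentile(xs, p) for p in (0.25, 0.5, 0.75))
    asym = ((q3 - q2) - (q2 - q1)) / (q3 - q1)                   # asymmetry
    return central, fluct, asym

data = [2, 3, 3, 4, 4, 5, 6, 7, 9, 12, 20]   # illustrative, right-skewed
print(eyeball_stats(data))
```

Note that these quantities exist and behave sensibly even when the moments of the data are ill-defined, which is the whole point of the percentile criteria.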
Criteria 1 through 4 can be assigned uncertainties or standard deviations using formulae for the standard deviation of partition values (see Kendall, Advanced Theory of Statistics). I commented that histograms of the data could be made to look quite different for different choices of the independent variable, and that interesting properties could thereby be revealed, cf. Figs. 4.152 and 4.153. It is also true that transformations can conceal properties, transforming to the uniform distribution being the extreme case. As another example, consider the stickiness of integers in the case of stock prices. This is easy to see in data on prices and price differences in dollars. If you transform to log prices or percent changes, the effect is still present but much harder to find. See Cootner, pp. 289-93, and especially Figure 13. The moral is that transforming data in statistics can cause trouble and confusion, just as transformations caused trouble for the Munchkin boy in the Land of Oz. If you are not careful and observant of the raw data, you may transform away some interesting and unexpected properties. In physics, and especially astronomy, taking data in polar coordinates, and then transforming to the (supposedly) simpler cartesian coordinates, causes all sorts of statistical trouble. Financiers and economists should be thankful they don't have this problem.
5
STATISTICS OF SEQUENTIAL DATA. POWER SPECTRA
5.1
Examining Two Variables at Once Where One is the Time. Sequential Data
In the preceding discussion of methods of initially examining data, one or two variables at a time, we did not specifically take into account that the values of one or perhaps both variables might have appeared sequentially in time, as in the case of price and volume. For purposes of eyeball inspection, we can plot up one variable against the time it appeared, and even use some of the simple-minded methods we used for a scatter diagram (two variables). But you should remember that the second variable (time, conventionally the x axis) is not a random variable in this case. The days don't come off the calendar even approximately at random. But for our purposes, and to a degree, we can and frequently do assume that they do. Let me show what I mean by this seemingly preposterous remark, and the trouble that assuming it unconsciously can engender. Imagine we have our two variables (y, t), N pairs of them plotted up, and we simple-mindedly go ahead and start to analyze with our eyeballs this "scatter diagram" in exactly the same mechanical way we did before.
One of the calculations we did optically was to divide the abscissa
into intervals and calculat e the mean of the points in each vertical (y)
strip. Formerly we would have called this set of mea ns as points on the regression of y on t. Now there is a new word for exactly
the same process. We say we have calculated (points on a) “trend”
“line.”
You will see that we have ignored the order of the date s
(averaged them) within each stri p or segment of the axis of ¢. Sometimes this process is just done by drawing a “best fitting”
smooth solid line, which is really a running average over some time
interval of unspecified length. In such case the same data is used or
“counted” more than once whic h improves the “image , of smoo th
ness.
Strictly one should use the data in each strip just once and
the trend “line” is evaluate d just once for one “point” in each strip, the middle of the ¢ interval. Strictly this, is a regression, but not a
regression line. Just a set of points.
You will notice the subtle bias which Newton and his calculus have put on this, the conventional procedure. We are assuming there “exists” an underlying continuous and smooth process. This is just what a random walk of expected advance (variable or not) small compared to the dispersion per step does not have. I suspect the same is true of a good many economic time series.
Ordinarily one might suppose that the abscissa had been divided into a fairly large number of intervals (20 or more) for this “trend” calculation. In fact, for purposes of deciding the question of whether a trend exists or not (mathematically, the property of strict stationarity as defined below), it is more instructive to use two, three, or at most four segments of equal divisions of the total time span of the data, and correspondingly just derive 2, 3 or 4 trend points. The question you then ask: are these four samples of data significantly different or not? Thus if you had a sequence of 1,000 y values, with four divisions you have 4 samples of 250 each.
Evidently one can compare these four samples of data in many different ways: by means, variance or, in general, the shape of their distributions by chi-square. It is also evident that if we just want to decide whether the four samples are different from each other or not, a decision can frequently be reached by eyeball. Uncover the four pieces one at a time, and just read off the means and variances (or medians and intersextile ranges). If they are significantly different it will usually be easy to see. Note that in this comparison we are randomizing the dates of the calendar within each block of 250 observations.
You will notice that the above is a mechanical and formal procedure for deciding whether or not the data is changing significantly in its statistical properties with the time. “Trend” usually implies that just the mean, out of all possible statistics, is changing with time. If the mean goes up or down significantly from one block to the next, we have trend. With only four blocks of data, there are only 2³ = 8 different ways in which the mean can jump around up or down from the value it has for the first block. The most wiggly way it can behave is like the letter N, or N backward or upside down, which, for those of you who have been exposed to sines and cosines, is the lowest frequency or longest complete “period” you can get into the total interval of observation. This is essentially the reason I picked four as the largest number of intervals to use to make this test for stationarity. We will return to this point when we discuss power spectra.
All of the above remarks are intended to give a quick answer to
the question of whether or not the statistical properties of the first to fourth (if we used four) divisions of the data changed from sample to sample. If they do, and eyeball inspection is usually sufficient to show this, then the sequence is not stationary; it changes with the time.
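The two-to-four-block comparison described above can be sketched in a few lines of modern code. This is an illustrative Python sketch, not anything from the text: the simulated drifting series and the helper name `block_stats` are assumptions made for the example.

```python
import random

random.seed(1)

# Hypothetical stand-in for a 1,000-point price-like series:
# a random walk with a small upward drift per step.
y = []
level = 0.0
for _ in range(1000):
    level += random.gauss(0.05, 1.0)
    y.append(level)

def block_stats(seq, n_blocks=4):
    """Split the sequence into equal blocks and return (mean, variance)
    of each block, for an eyeball comparison."""
    size = len(seq) // n_blocks
    stats = []
    for b in range(n_blocks):
        block = seq[b * size:(b + 1) * size]
        mean = sum(block) / size
        var = sum((v - mean) ** 2 for v in block) / size
        stats.append((mean, var))
    return stats

for i, (m, v) in enumerate(block_stats(y), start=1):
    print(f"block {i}: mean {m:8.2f}  variance {v:8.2f}")
```

If the four block means march steadily up or down from one block to the next, the eyeball verdict is trend, i.e. the sequence is not stationary.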
5.2
The Definition of Strict Stationarity
A more precise mathematical definition in terms of ensembles of functions, of which your original data f(t), t = 1, 2, ..., 1000, was a sample, is this. An ensemble of functions {f(t)} is strictly stationary if (see Rosenblatt, Random Processes) the joint distribution over the ensemble of f(t_1), f(t_2), ..., f(t_k), expressed as Φ(f(t_1), f(t_2), ..., f(t_k)), t_1 < t_2 < ... < t_k, is unchanged when every t_i is replaced by t_i + τ for any common shift τ.
Fig. 5.72 Autocovariance and Autocorrelation of a Running Sum of Two Independent Variables.
Fig. 5.73 Autocovariance and Autocorrelation for a Running Sum of Five Independent Variables.
values of f(t) and f(t + 1), then the autocorrelation drops linearly from one to zero over a range τ = 0 to k, and is zero for larger values of τ, τ > k. Now suppose I just let k get large. In any event, for a real sample of data I only have a finite number of values of f(t), say from t = 1 to 1,000. So f(t) can be imagined as follows:
f(1) = r_1 + const
f(2) = r_1 + r_2 + c
f(3) = r_1 + r_2 + r_3 + c
. . .
f(k) = r_1 + r_2 + r_3 + ... + r_k + c
. . .
f(1000) = r_1 + r_2 + r_3 + ... + r_1000 + c   (end of sample)   (5.7.5)

(each value is the sum of all “previous” r’s)
You can see that in this case f(t) is really a random walk, starting at the origin of f, of which my data is one finite sample of length 1,000. The autocorrelation drops to zero over the length of the interval of observation, but I really can’t evaluate it very well from the data for the total interval, say at τ = T/4 = 250 or T/2 = 500 at the most. So the autocorrelation from the data looks something like Figure 5.74.
You can see that this sequence, a random walk as it sits, is neither strictly stationary nor ergodic, and not just for the reason that it is not strictly stationary. The expected moments get larger the longer the interval of observation. The distributions of f(t) and f(t + τ) never approach independence no matter how large τ gets, since both f(t) and f(t + τ) have all the random variables of f(t) prior to t in them. The third condition of ergodicity is not satisfied either.
Fig. 5.74 Observed Autocorrelation of a Random Walk (Schematic).
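The contrast between independent steps and their running sum is easy to check numerically. A Python sketch follows; the simulated walk and the helper name `autocorr` are illustrative assumptions, not the author’s data:

```python
import random

def autocorr(seq, lag):
    # Sample autocorrelation at a given lag, computed about the sample mean.
    n = len(seq)
    mean = sum(seq) / n
    var = sum((v - mean) ** 2 for v in seq) / n
    cov = sum((seq[t] - mean) * (seq[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var

random.seed(7)
steps = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Running sum: a random walk f(t) = r_1 + r_2 + ... + r_t.
walk = []
total = 0.0
for r in steps:
    total += r
    walk.append(total)

for lag in (1, 50, 250):
    print(f"lag {lag:4d}: steps {autocorr(steps, lag):+.2f}  "
          f"walk {autocorr(walk, lag):+.2f}")
```

The independent steps decorrelate at once; the walk’s autocorrelation falls off only over a large fraction of the whole record, as in the schematic figure.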
5.8
The Doctoring of Sequential Data Prior to Statistical Analysis, in Particular for Power Spectra
In the previous lectures I have described, with examples, the properties of strict stationarity, weak stationarity and ergodicity.
Strict stationarity was in fact quite restrictive. It required the time-translational invariance of the distribution not only of the dependent variable, but of all possible multiple joint distributions of the dependent variable f(t) at several different times. Weak stationarity relaxed these conditions to apply them only to those properties given by the first two moments, including cross moments of f(t) at just two different times. These determine the autocovariance, autoregression and autocorrelation. These are the standard elementary two-variable statistics.
We also introduced the concept of ergodicity, which was devised as a theoretical property to enable statistics to be determined from a single time sequence. It required strict stationarity, a finite first moment, and asymptotic probability independence of values of the dependent variable for large time separation. Let us imagine we have some observed sequence, T = 1,000 in number, prices for example, and describe how it can be “doctored” to allow a statistical examination. We have f(t); t= 1,2,...1000 =T.


Let us further suppose that f̄ = (1/1000) Σ_{t=1}^{1000} f(t) = 0, or if this was not true for the original data, that we have subtracted f̄ out. This gets rid of at least one possible cause of nonergodicity.
Let us further suppose we have no linear trend, here defined as the nonzero slope of a straight line from f(t = 1) to f(t = 1000). This slope, too, might have been subtracted from the raw data. So we have also diminished at least this obvious source of non-strict stationarity and nonergodicity.
Note that “subtracting out” a linear trend and constant is different for different transformations of the dependent variable f(t). Take the case of price contra log price as a function of time. For the constant term, log p̄ is not the same as the mean of log p. If you have decided for some reason to transform the dependent variable before analyzing it, the subtraction of constants should be done after the transformation, not before it. Subtracting “linear” trend from the log p series corresponds to factoring out e^{const·t} from the original data. So “linear” trend and constant removal, if the data is transformed first, has different effects on the data for different transformations. In any event, with either raw or transformed data, “linear” trend (mean difference) and mean removal removes most of the most obvious sources of non-(strict) stationarity and nonergodicity.
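The two removals in the narrow sense used here, the mean and the straight line from the first observation to the last, can be written out directly. A minimal Python sketch; the function name is hypothetical:

```python
def remove_mean_and_trend(f):
    """Subtract the mean, then the 'linear trend' in the narrow sense used
    in the text: the straight line from f(t = 1) to f(t = T)."""
    n = len(f)
    mean = sum(f) / n
    centered = [v - mean for v in f]
    # End-to-end slope of the centered data (same as of the raw data).
    slope = (centered[-1] - centered[0]) / (n - 1)
    # Subtract the line through the first point with that slope.
    return [centered[t] - centered[0] - slope * t for t in range(n)]

# A pure constant-plus-line series should be reduced to (numerical) zeros.
data = [3.0 + 0.5 * t for t in range(10)]
print(remove_mean_and_trend(data))
```

Anything left after this operation is what the power spectrum machinery below is actually asked to describe.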
How can we test our data, roughly at least, for the third property of ergodicity,

lim_{τ→∞} Prob.(f(t), f(t + τ)) = Prob.(f(t)) · Prob.(f(t + τ))   (5.8.1)

where the two probability distributions on the right are, if we have strict stationarity, the same?
One simple test is to plot the product moment covariance (1/T) Σ_t f(t + τ) f(t) against τ. The expected value of f(t + τ) f(t) for large τ, by the above formula (5.8.1), is

lim_{τ→∞} E f(t + τ) f(t) = E f(t) · E f(t + τ) = (E f(t))²

the square of the mean (zero in this case). Note that to be really convincing, the product moment has to become and remain near its limit value (E f(t))² = 0 appreciably short of τ = T/4. Also note that we are really testing, by use of the first moment, only the independence of the distributions of f(t + τ) and f(t). A second moment test would require comparing E f²(t) f²(t + τ) and (E f²(t))². So the preceding is not an exhaustive test for the third property of ergodicity.
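As a sketch of this first-moment test, in Python, with simulated mean-zero noise standing in for some doctored data set (the series and the helper name are illustrative assumptions):

```python
import random

def product_moment(f, tau):
    # (1/(T - tau)) * sum of f(t + tau) * f(t): should settle near the
    # squared mean for large tau if the third ergodicity condition holds.
    n = len(f)
    return sum(f[t] * f[t + tau] for t in range(n - tau)) / (n - tau)

random.seed(3)
# Independent, mean-zero noise satisfies the condition trivially.
f = [random.gauss(0.0, 1.0) for _ in range(2000)]

for tau in (0, 10, 100, 400):
    print(f"tau {tau:4d}: product moment {product_moment(f, tau):+.3f}")
```

The tau = 0 value is the mean square; for larger tau the values should hover near (mean)² = 0 well short of τ = T/4.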
Fig. 5.81 Test of Data for the Third Condition of Ergodicity.

The above procedures are massaging techniques on the data to render it fit for statistical analysis.
In particular we are looking
toward power spectrum analysis, for which weak stationarity is required. Removing the mean and the linear trend can also be achieved by differencing. Differencing removes the mean (or any constant); removing the mean from the differences removes the linear trend. It is worth remembering these preliminary steps, and the reasons for them, before carrying out a power spectrum analysis.

In practically any computer you use, there are programs already built in for the calculation of power spectra. These programs do not necessarily simply take out the mean and mean differences, equivalently a linear trend. They frequently have other procedures of filtering, averaging, “tapering,” smoothing or truncating the data. These procedures sometimes amount to the same thing as mean and linear trend removal, and sometimes not. If you just hand the computer a batch of data and ask for a power spectrum, the computer or its staff may not tell you the data is in an inappropriate form. In fact what the program may do is to make you think it is in an appropriate form. An unspecified power spectrum program may act like a garbage disposal unit: it can grind up your false teeth and your diamond rings if you put them in there. Unless you examine your data first, and sift some things out, you might lose what you are looking for in the data, overlook some interesting property you were not looking for, and get quite misled as to what the data says.
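The differencing route to the same end can be sketched as follows (Python; the saw-tooth series is a made-up example, and the helper names are hypothetical):

```python
def difference(f):
    # First differences: kills any additive constant in f.
    return [f[t + 1] - f[t] for t in range(len(f) - 1)]

def demean(f):
    m = sum(f) / len(f)
    return [v - m for v in f]

# A hypothetical series: constant + linear trend + a saw-tooth wiggle.
f = [10.0 + 0.3 * t + (t % 4) for t in range(101)]

d = difference(f)          # the constant 10.0 is gone
d0 = demean(d)             # the mean difference carries the trend slope
print(sum(d0) / len(d0))   # mean of the result is (numerically) zero
```

The mean of the differences is just the end-to-end slope (f(T) − f(1))/(T − 1), which is why removing it removes the linear trend in the narrow sense used here.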
You may recall that when I was describing an engineering approach to sequential data, I said that it was in principle possible to fit a polynomial of 1,000 terms to 1,000 values of sequential data. Nobody ever chooses to do this. In practice, for describing sequential properties, you might fit a quadratic, cubic, or even fourth order polynomial. In principle you could go to order 1,000 and fit (represent) every observation exactly. A power spectrum analysis is something like this process of fitting a 1,000 term polynomial, only you don’t
use terms of increasing powers t, t², t³, ..., t¹⁰⁰⁰. Instead
you use a different kind of function: sines and cosines. These are essentially functions which wiggle in a uniform way. You have functions of all degrees of wiggliness, from zero wiggles or loops³ in the span of data 1 to 1,000, to 1,000 wiggles in the span of the independent variable, usually the time. This is the largest number of wiggles you need to fit, exactly if you wish, 1,000 data values.

³ This terminology is somewhat imprecise. If you consider that sines and cosines have two loops or wiggles in one period, the statements are quantitatively correct.
The idea is to find out the proportionate mix of all these different wiggling functions which will represent the data. This method has the advantage over the polynomial fitting method in that you can group together as a single term several functions of slightly different wiggliness without altering the picture, or representation, very much.
We have talked a lot about the use of the eyeballs to carry out analyses, in particular, sets of scatter diagrams of f(t) vs. f(t + k). A record of sound pressure, which the ear interprets, oscillates at varying frequencies intermediate between the two extremes, and with varying amplitude, whether speech, music or din. The varying intervals between zeros of pressure (at the mean or ambient atmospheric pressure) and the amplitude, or departure from the mean between these zeros, are just what convey pleasure for music, information for speech, or the displeasure of din.
One can break up this wiggling line which shows the pressure variations into what might be called a set of pure tones. This is very simple for an orchestra. One pure tone may be a single clear note on a flute, or a single note from a piano, or a steady toot from a horn. Music, speech or din can be represented as a superposition, or
adding up of all these different single tones to give the actual pressure
which your ear feels at any instant. The nature of the description of a power spectrum is to take whatever the irregular pressure vs. time curve is and break it down into all these different uniformly wiggling components, which we identify with our ears in the simplest case as pure tones.
Fig. 5.82 Sound Pressure vs. Time.
This procedure which the ear does with pressure vs. time can also be applied to other sequences. It might be prices or price differences
vs. time, the height of waves coming in from the sea, sales of airline tickets per day. It really doesn’t matter. But you must take out, just as the ear does so far as simple hearing is concerned, the constant mean or atmospheric pressure, and any steady increase or decrease with time (the linear trend) which might exist if you were going up or down in an elevator or airplane. These contributions of the pressure have physiological effects of their own, but they are not strictly part of hearing.
So we can write⁴ this function of the time, whatever it might be, as a sum of the following sort. Note that the function f(t) in this particular case is defined and exists only at a discrete, finite number of values of the time, t = 1, 2, ..., T, so we only need a finite number of terms to represent it exactly over this finite interval. The mathematical expressions are simpler using complex numbers rather than real sines and real cosines, even if, as with real data, f(t) is a real function.⁵

⁴ At this point, and in a number of subsequent lectures, I feel obliged to assume a somewhat greater degree of mathematical sophistication on the part of the reader than hitherto required. I regret this; it is a mark of my shortcomings as an expositor that I cannot carefully and exactly express the ideas I need in a simple notation. For this I ask the nonmathematical readers’ indulgence. I have done the best I could in summarizing what can be found in more extensive and perhaps intelligible form in books on Fourier analysis, complex numbers and power spectra. Readers who may feel a little “snowed” by the ensuing mathematics are referred to the comments in Section 5.9, and the last three paragraphs of Sections 5.13 and 5.21. They will be reminded that impressive and complex mathematical concepts and notations have their limitations and absurd implications.
f(t) = Σ_{j=0}^{T−1} a_j e^{2πijt/T},   t = 1, 2, 3, ..., T   (5.8.2)
This is called a Fourier expansion or representation for f(t); the a_j are called the expansion coefficients. Alternatively we might use

f(t) = Σ_{j=1}^{T} a_j e^{2πijt/T},   t = 1, 2, 3, ..., T   (5.8.3)
In either case there are exactly T (complex) coefficients a_j to represent the T observed values of f(t). With either choice above, the a_j can be calculated from the given values of f(t) by

a_j = (1/T) Σ_{t=1}^{T} f(t) e^{−2πijt/T},   j an integer   (5.8.4)
As will be noted from the alternate forms (5.8.2) or (5.8.3), j is positive in (5.8.3), and positive and zero in (5.8.2). A value of a_j, j positive, is a complex quantity which we can write as a_j = b_j + i c_j. When f(t) is real we have a_j = a*_{−j}, where * denotes the complex conjugate (change i to −i). Moreover, since e^{2πijt/T} = e^{2πi(j+T)t/T}, we have a*_j = a_{−j} = a_{T−j} for f real. This means that for f real, of the 2T real and imaginary parts of the T values of a_j, only T of them are different; a_j is “symmetric” about j = T/2 for f real. T even requires that a_0 and a_{T/2} are pure real; T odd just requires a_0 pure real. For T odd or even, a_j with f real satisfies a_j = a*_{T−j}.
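These symmetry statements, and the exactness of the fit, are easy to verify numerically for a small real series. A Python sketch of Eqs. (5.8.2) and (5.8.4); the sample values are arbitrary:

```python
import cmath

def fourier_coeffs(f):
    # a_j = (1/T) * sum_t f(t) * exp(-2*pi*i*j*t/T), Eq. (5.8.4), t = 1..T
    T = len(f)
    return [sum(f[t] * cmath.exp(-2j * cmath.pi * j * (t + 1) / T)
                for t in range(T)) / T for j in range(T)]

def reconstruct(a):
    # f(t) = sum_j a_j * exp(+2*pi*i*j*t/T), Eq. (5.8.2)
    T = len(a)
    return [sum(a[j] * cmath.exp(2j * cmath.pi * j * (t + 1) / T)
                for j in range(T)) for t in range(T)]

f = [1.0, 3.0, 2.0, 5.0, 4.0, 2.0]      # arbitrary real data, T = 6
a = fourier_coeffs(f)
T = len(f)

# For real f: a_j = conjugate(a_{T-j}); a_0 and a_{T/2} (T even) are real.
for j in range(1, T):
    assert abs(a[j] - a[T - j].conjugate()) < 1e-9
assert abs(a[0].imag) < 1e-9 and abs(a[T // 2].imag) < 1e-9

# The T coefficients reproduce the T data values exactly.
g = reconstruct(a)
assert all(abs(g[t] - f[t]) < 1e-9 for t in range(T))
print("symmetry and exact reconstruction verified")
```

Note that a_0 is just the mean of the data, in agreement with (5.10.8) below.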
⁵ You will note that the independent variable, t, is discrete, but we have not made the same restriction on f(t), or the range set specification, contrary to previous specification on this subject when f(·) was supply and demand. This is a defect.
The combinations ω = 2πj/T are called the frequencies⁶, so the ω’s run (at discrete values) from 0 to 2π(T − 1)/T in (5.8.2), or from ω = 2π/T to 2π in (5.8.3). The function f(t) is said to be represented by a sum of these frequencies ω_j, each with a complex amplitude a_j, or equivalently a(ω_j). The power spectrum is the positive function of j:

P_j = a_j a*_j = a_j a_{T−j}   (5.8.5)
    = a(ω) a(2π − ω)   (5.8.6)
Because of the symmetry in a_j mentioned above for real functions f(t), in practice the power spectrum is usually given only from j = 0 to T/2, or equivalently from ω = 0 to π. The remaining part is just the mirror image in the ordinate at j = T/2, or ω = π (on a linear abscissa scale).
It is sometimes convenient in examining power spectra to think in terms of “periods,” or the repetitive interval of the wiggling functions which compose the spectrum, rather than the (radian) frequencies ω_j = 2πj/T. The periods are P_j = 2π/ω_j = T/j. Longer periods than T, or periods other than T/j with j integral, just don’t exist in the data. I emphasize “in the data” because economists and others think and talk about long term trends, parts of long cycles, slowly varying means, so I suppose they must believe in them. None of these things exist in a finite sample of discrete time data no matter how long, if you are going to represent it with a Fourier series. There is a rather special place for the linear trend as I have narrowly defined it (the mean difference) and also a constant in a Fourier expansion. All the rest of the terms in the expansion are for complete and unambiguous periods.
For example, suppose we have weekly data for an interval of two years, T = 104 weeks. The longest period, and hence smallest frequency, is for j = 1, P_1 = 104 weeks. The shortest period (j = T) is unity, one week, for which the radian frequency is 2π week⁻¹. We mentioned that in practice the power spectrum is usually only given from ω = 0 to π, or j = 0 to j = T/2, so the shortest effective period plotted is P_{j=T/2}, or two weeks. The corresponding (radian) frequency is called the Nyquist or folding frequency, π (radian) week⁻¹ or 0.5 cycles week⁻¹. The word “folding” frequency means that the total spectrum ω = 0 to 2π, being mirror symmetric at this frequency, π, can be folded into coincidence with the range ω = 0 to π. In general one can remember the formula Period = 2/(fraction of π) in units of the time (a week in this case) corresponding to the interval between observations. It is very helpful to mark a few of these points on a given power spectrum so you can, in familiar terms, “see what you are looking at.”

⁶ The word “frequency” is also commonly used for the combination j/T, the number of “complete” cycles (less than unity if j < T) per day, week, etc., if T is measured in days or weeks. ω = 2πj/T is also distinguished by calling it “circular frequency” or “angular frequency,” in radians per unit time.
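The bookkeeping between j, period and frequency for this weekly example can be tabulated directly (Python; the particular j values chosen are illustrative):

```python
# Periods P_j = T / j for two years of weekly data, T = 104,
# up to the Nyquist (folding) point at j = T/2 = 52.
T = 104
for j in (1, 2, 4, 13, 26, 52):
    period_weeks = T / j
    cycles_per_week = j / T
    print(f"j = {j:3d}: period {period_weeks:6.1f} weeks, "
          f"{cycles_per_week:.4f} cycles/week")
```

At j = 52 the period is 2 weeks and the frequency 0.5 cycles per week, i.e. the radian frequency 2πj/T = π per week, the folding frequency of the text.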
spectra, rather than autocorrelation, or the equivalent, autocovariance. Both power spectrum and autocovariance come directly from the successive scatter diagrams of f(t + k) vs. f(t) which I described in a previous lecture. In fact the autocovariance and power spectrum are closely related, and in exactly the same way that f(t) and its expansion coefficients a_j are related. There are some reasons for this preference for power spectrum over autocovariance, some scientifically sound and others primarily cosmetic. I remind you that either method contains exactly the same information and has exactly the same limitations: statistical relations expressible by second moments and linear regression. The limitations, ambiguities and shortcomings show up in the successive scatter diagrams which you make to get the autocovariance, but you don’t see these limitations, although they are still there, when you only look at the final calculation of the power spectrum, or only at the autocovariance as a function of lag k. You can put any sequence of numbers you like into a computer and punch the button for power spectrum, and the computer will grind out an answer. Unless you know exactly what the computer did (and the routines are not all alike) the emitted power spectrum may be very misleading.

In practice power spectrum analysis has its greatest strength when you are trying to find some fairly periodic signal or wiggle function immersed in a great many others. You are interested in just one or two or three of these periodic signals. Then the power spectrum is a very powerful method indeed for sifting out just one or two of these periodic signals from all the other uninteresting (?!!) noise which may be present. Power spectra can also be used to describe functions of time which have no particularly dominant or even slightly periodic property but just wander back and forth the way stock market prices or other types of time signals sometimes do. The interpretation of the spectra of such sequences is much more difficult than those with just two or three dominant frequencies. If there are concentrated and intermittent bursts of activity like rainfall in a dry climate or storms in a weather record of wind velocity, temperature and humidity, electrical noise in semiconductors, intermittent turbulence in a fluid, galactic radio noise, or bursts of activity as in a speculative market, the power spectrum representation of the
data can be very treacherous in its interpretation. (See Section 5.21)
We will return to this interpretation problem when we discuss the spectra of random walks. You may recall from earlier lectures that I warned you about the misconceptions of thinking in terms of thin solid smooth lines when you are discussing data, or some underlying theory. “Thin” implies a sharpness of definition in a functional relationship. If you think of
the relation of height against weight just as a thin (regression) line
you are ignoring the obvious and sometimes important fact that real heights can be different for the same weight. Solid lines imply mathematical continuity of the variables. Real numbers are implied to be possible in the data. In fact there always has to be a smallest unit
change which you can detect or express, in real data. Smoothness implies the existence of a derivative, also implied to be continuous. We saw that sometimes these implied and fallacious assumptions could give trouble, whether in supply demand relations for an auction market, the surveying experiment of Chicken Little and Zero, or in the distinction of a rake vs. a brick for the uniform distribution. The normal distribution had all these convenient and fictitious properties, which were however most inappropriate for describing experiments with small counts, where the Poisson distribution was more appropriate.
These same difficulties crop up again in a discussion of power spectra, which are frequently thought of as thin, solid and (less frequently) smooth curves, plotted as a function of ω from zero to π. They are frequently drawn, for cosmetic purposes, as piecewise smooth, more particularly as continuously joined straight solid line segments. As we shall see, the realities of power spectra are rather far from the subjective impression of this cosmetically improved picture. Finally you might note the technical distinction between a Fourier
expansion, or “representation,” or fitting of the observed data (5.8.4), and the power spectrum representation (5.8.5). There are T values of f(t) to be represented, and there are T different components to the expansion coefficients a_j (2T real and imaginary parts, but only T are different). So the Fourier expansion a_j (5.8.4) can “fit” or represent the T values of f(t) exactly. The power spectrum (5.8.5) “representation” has only T/2 different values. So there must be some “information” “lost” in going from f(t) to its power spectrum, but not in going to the Fourier expansion. This “lost” information may be quite visible to the eye, if you look for it, as we saw in comparing the random telegraph signal and shot noise scatter diagrams.
5.9
A Comment on the Disparate Mathematical Concepts Which Can Be Encompassed in Fourier Analysis
You may recall from an earlier lecture that I discussed the conceptual differences between a random walk with very large vs. very small dispersion per step compared with the expected advance per step. When the dispersion per step was large compared to the expected advance, as for roulette, or stock market prices, the stochastic, or probabilistic, aspects of the problem dominated. When the dispersion per step was very small compared to the expected advance, as in the student’s “random walk” to the parking lot, or the integration of a differential equation numerically with an accumulating rounding-off error, the stochastic aspects of the problem became insignificantly small. We could safely use the concepts of functionality and the continuous and smooth concepts of the calculus. The random walk concept enables us to interpolate between the concept of a stochastic vs. a functional relationship. The numerical concepts of continuity and smoothness (existence and continuity of derivative) were not quite as simple as intuition might indicate.
The use of a Fourier representation, especially in the complex number form, is essentially a method of simultaneously handling in a general way these conceptually quite different ideas: stochastic vs. functional relationships, and discrete vs. real, continuous numbers. The use of complex numbers, the additional algebraic concept of i = √−1, gives an added flexibility and versatility to the mathematics.
This added generality, where either stochastic or functional relations or both, and discrete or continuous properties or both, can be handled by one set of mathematical notations, does cause some complications. Historically, sines and cosines were first used by Fourier to build up solutions of the heat flow or diffusion equation. They were quickly discovered to be quite useful in representing solutions of many other types of problems in the calculus. In the 20th century their utility in conjunction with complex numbers in describing stochastic processes has been more and more widely appreciated. Because of the generality of a notation which encompasses these disparate concepts, there are numerous booby traps encountered when you use Fourier analysis, either for theoretical developments or in analyzing real data. I will point out some of them; I doubt if they have all been discovered even now.
5.10
Formulas for Fourier Expansion, Expansion Coefficients, Power Spectrum and Autocovariance
Let us examine more closely the general expression for the Fourier expansion of a set of data and its power spectrum. Imagine we have a function f(t) defined at discrete instants t = 1, 2, 3, ..., T over a finite interval T. We may regard this as a finite sample of data f(t) generated in any manner whatever. In the following discussion we are not supposing either mean or trend removal has been carried out. At the moment, the formulae are just straight data fitting. Just as it is possible to represent exactly these T values of f(t) with a T term polynomial, so it is also possible to represent them exactly by T terms of sines plus cosines, or equivalently complex exponentials, as we mentioned. For the moment we will use the form labeling the coefficients with j from 0 to T − 1. The alternate, from j = 1 to T, and other possible choices we shall return to shortly. Repeating our basic formulae, we had:

f(t) = Σ_{j=0}^{T−1} a_j e^{2πijt/T},   t = 1, 2, ..., T   (5.10.1)

a_j = (1/T) Σ_{t=1}^{T} f(t) e^{−2πijt/T}   (5.10.2)
where j may be any integer: positive, negative, zero, greater or less than T.
The power spectrum as a function of j is

P(j) = a_j a*_j = a_j a_{T−j}   (5.10.3)

and as a function of ω_j = 2πj/T

P(ω) = a(ω) a*(ω) = a(ω) a(2π − ω) = P(2π − ω)
The power spectrum is symmetric about ω = π (ω = 0 if you prefer to think of ω in the range −π to π). The above formulae are exact. The following formulae are also exact, provided the following peculiar and somewhat unrealistic definition is given to f(t) outside the domain t = 1 to T for which f(t) is defined. If we use the formula for f(t) given above (5.10.1) outside the domain t = 1 to T, we will find that

f(0) = f(T),   f(1) = f(T + 1),   f(h) = f(T + h), etc.

In other words f(t), as given by its Fourier representation, repeats itself; it is strictly periodic with period T. This peculiar behavior is not a shortcoming of the expansion for f(t) (5.10.1), which is only intended for the domain set 1 to T, but this peculiar property is needed for the formulae below to be exact. For real data you can see that this behavior is a physical absurdity. It is a property which is somewhat suppressed by various computing routines which are used to calculate autocovariance and power spectra from real data. I also make a distinction below between the autoproduct moment APM(k) and the autocovariance cov(k), the latter being just the former minus the square of the mean. This is in accord with standard statistical terminology.
Auto product moment

APM(k) = (1/T) Σ_{t=1}^{T} f(t) f(t + k),   k = 0, 1, 2, ..., T − 1   (5.10.4)

Mean

f̄ = (1/T) Σ_{t=1}^{T} f(t)   (5.10.5)

Mean Square

(1/T) Σ_{t=1}^{T} f²(t) = APM(k = 0)   (5.10.6)

Autocovariance

cov(k) = APM(k) − (f̄)²,   k = 0, 1, 2, ..., T − 1   (5.10.7)

In terms of the expansion coefficients a_j, j = 0, 1, 2, ..., T − 1:

Mean

f̄ = a_0   (5.10.8)

Mean Square

(1/T) Σ_{t=1}^{T} f²(t) = Σ_{j=0}^{T−1} a_j a*_j   (5.10.9)

Variance of f

(1/T) Σ_{t=1}^{T} f²(t) − (f̄)² = cov(k = 0) = Σ_{j=1}^{T−1} a_j a*_j = Σ_{ω=2π/T}^{2π−2π/T} a(ω) a*(ω)   (5.10.10)

The autoproduct moment in terms of the power spectrum P_j = a_j a*_j is

APM(k) = Σ_{j=0}^{T−1} a_j a*_j e^{2πijk/T}   (5.10.11)

If we subtract a_0² the above formula becomes

cov(k) = Σ_{j=1}^{T−1} a_j a*_j e^{2πijk/T}   (5.10.12)

The power spectrum in terms of the autoproduct moment is

P(j) = a_j a*_j = (1/T) Σ_{k=0}^{T−1} APM(k) e^{−2πijk/T}   (5.10.13)

At j = 0 this gives

(f̄)² = a_0² = (1/T) Σ_{k=0}^{T−1} APM(k)   (5.10.14)
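Several of these identities, (5.10.9), (5.10.11) and (5.10.14), can be checked numerically, provided the autoproduct moment is computed circularly, i.e. with the strict periodicity f(t + T) = f(t) that the formulae assume. A Python sketch with arbitrary sample values:

```python
import cmath

def coeffs(f):
    # Eq. (5.10.2): a_j = (1/T) sum_t f(t) exp(-2*pi*i*j*t/T)
    T = len(f)
    return [sum(f[t] * cmath.exp(-2j * cmath.pi * j * t / T)
                for t in range(T)) / T for j in range(T)]

def apm(f, k):
    # Circular auto product moment, matching the strict periodicity
    # f(t + T) = f(t) assumed by Eqs. (5.10.11)-(5.10.13).
    T = len(f)
    return sum(f[t] * f[(t + k) % T] for t in range(T)) / T

f = [2.0, -1.0, 4.0, 0.5, -3.0, 1.5]
T = len(f)
a = coeffs(f)
power = [abs(aj) ** 2 for aj in a]

# (5.10.9): mean square equals the sum of |a_j|^2.
mean_sq = sum(v * v for v in f) / T
assert abs(mean_sq - sum(power)) < 1e-9

# (5.10.11): APM(k) = sum_j |a_j|^2 exp(2*pi*i*j*k/T).
for k in range(T):
    rhs = sum(power[j] * cmath.exp(2j * cmath.pi * j * k / T)
              for j in range(T))
    assert abs(apm(f, k) - rhs) < 1e-9

# (5.10.14): (mean)^2 = (1/T) sum_k APM(k).
mean = sum(f) / T
assert abs(mean ** 2 - sum(apm(f, k) for k in range(T)) / T) < 1e-9
print("Parseval and the APM/spectrum transform pair check out")
```

The circularity of `apm` is exactly the “physical absurdity” noted above: the exact formulae hold only for the periodically extended data.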
The preceding formulae express the basic properties of a Fourier expansion, which is by definition (5.10.1). Given f(t) as data, there are computer programs for calculating numerical values for all of these. Formulae (5.10.4, 5 and 6) are definitions or follow immediately from the definitions. (5.10.8) is just a particular case (j = 0) of (5.10.2), showing the simple relation of the mean f̄ = (1/T) Σ_t f(t) to the first Fourier coefficient, a_0.

(5.10.9) and in particular (5.10.10) are frequently expressed verbally: “The variance of f(t) is the sum of the variances (modulus squared) of the individual Fourier components a_j, except the zero frequency component a_0, which is of course (mean f̄)².” Note correspondingly that the term j = 0 or ω = 0 is omitted from (5.10.10) but not from (5.10.9). All this is in agreement with conventional statistical terminology, where the variance of a sum (the Fourier series for f(t), (5.10.1)) is the sum of the variances of the terms in the sum, if they are all independent in the probability sense. So the contributions a_j of each frequency ω = 2πj/T to f(t) are considered independent in the above sentence. You might note the similarity of the language, but the enormous difference of the context, to the sentence: “The variance of the end point of a random walk (sum of steps) is the sum of the variances of the steps, if the steps are independent in the probability sense.” It is one of the beauties of mathematics that the same ideas appear in widely different contexts.
(5.10.11, 12, 13) express the fact that the autoproduct moment and power spectrum are Fourier transforms of each other, a relationship identical to that expressed in (5.10.1) and (5.10.2) between f(t) and its expansion coefficients a_j. Subtracting the mean squared (f̄)² = (a_0)² from (5.10.11) and (5.10.13) gives exactly the same result. (5.10.14) relates the squared mean (f̄)² = (a_0)² to the sum (cf. "area") of the autoproduct moment. With (f̄)² = (a_0)² = 0 this sum ("area") must be zero.

It should be noted that some of the properties of these exact formulae (5.10.5 to 14) violate some of the specifications that we laid down in order to permit analysis of sequential data statistically. In particular you will note that the autocovariance, Eqs. (5.10.4) or (5.10.7), is strictly periodic in k with period T. cov(k) is also symmetric about k = 0 and k = T/2, just as a_j a_j* was. We required for ergodicity that E f(t+k) f(t) → (E f(t))² for k large. So this
ergodicity condition is obviously violated. A second troublesome property is (5.10.14). This requires that if the mean (f̄ = a_0) is zero, the sum (cf. "area") of the autocovariance is zero. Equivalently this requires that the autocovariance, when summed, have equal positive and negative contributions. There is nothing in the definitions of either strict or weak stationarity to require this. In fact one of the commonest types of autocovariance, ∼ e^{−a|k|}, does not have this zero sum property. It is just the purpose of the various smoothing, averaging and truncating processes built into a computer to ameliorate these theoretical deficiencies. They doctor the data to make it fit the theory of functions having a power spectrum. In our discussion of the successive scatter diagrams to obtain the covariance, we stopped making them at k = T/4, or T/2 at the most, in order to avoid having to discuss these embarrassing properties.
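The exact identities above are easy to check numerically. The following sketch is not from the original text; it assumes NumPy's FFT with the normalization a_j = fft(f)/T, which matches the convention f̄ = a_0 of (5.10.8), and verifies (5.10.8), (5.10.9), (5.10.11) and (5.10.14) on arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 256
f = rng.normal(size=T)                      # arbitrary data f(t), t = 1..T

# Coefficients normalized so that a_0 is the mean, as in (5.10.8)
a = np.fft.fft(f) / T

# (5.10.8): mean = a_0
assert np.isclose(f.mean(), a[0].real)

# (5.10.9): mean square = sum over j of a_j a_j*  (Parseval)
assert np.isclose(np.mean(f**2), np.sum(np.abs(a)**2))

# (5.10.4): autoproduct moment, cyclic in t (strict periodicity in T)
APM = np.array([np.mean(f * np.roll(f, -k)) for k in range(T)])

# (5.10.11): APM(k) = sum over j of a_j a_j* exp(2 pi i j k / T)
APM_from_P = T * np.fft.ifft(np.abs(a)**2).real
assert np.allclose(APM, APM_from_P)

# (5.10.14): (mean)^2 = (1/T) sum over k of APM(k)
assert np.isclose(f.mean()**2, APM.mean())
```

Note that the check of (5.10.11) only holds with the cyclic (strictly periodic) autoproduct moment, which is exactly the point made in the text.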
It is not uncommon for a rigorous mathematical and logical description of a phenomenon to appear to have absurd implications which contradict common sense and simple observations. Fourier analysis is no exception in this regard. We saw examples of this in connection with various "impossibility" theorems.
5.11
A Digression on the Orthogonality Relation
All the formulae in Section 5.10 can be derived from the definition of a Fourier expansion (5.10.1), using the following relation:

Σ_{j=0}^{T−1} e^{2πij(k−ℓ)/T} = T,  for k = ℓ + mT, m an integer
                               = 0,  for k ≠ ℓ + mT

These two results are summarized in the notation:

Σ_{j=0}^{T−1} e^{2πij(k−ℓ)/T} = T δ_{k, ℓ+mT}    (5.11.1)

This is called the orthogonality relation. It can be considered as a geometrical construction, a sum of unit vectors, the complex exponentials, which add up to a straight line of length T for k = ℓ + mT. For k ≠ ℓ + mT, the sum curls up to a polygon, star or rosette which closes on itself (sum = 0). k − ℓ is just the number of complete (2π) revolutions of the unit vector in drawing the sum as a geometric figure.
To derive it, write out the sum of a finite geometric series:

S = 1 + r + r² + r³ + ... + r^{T−1} = Σ_{j=0}^{T−1} r^j    (5.11.2)

Multiplying by r gives

rS = r + r² + r³ + ... + r^T

Subtracting, S − rS = 1 − r^T. Hence,

S = (1 − r^T)/(1 − r)    (5.11.3)

This is an exact formula; it works for any r except r = 1, where S is just a sum of T ones.

Now suppose we take for r the complex number r = e^{2πik/T}. This has a magnitude of one, but it is not equal to one (real), unless k = 0, ±T, ±2T etc., etc. It is a unit vector, or arrow, with an angle of 2πk/T to the x axis. This angle is zero at k = 0, ±T etc. From (5.11.3),

Σ_{j=0}^{T−1} r^j = Σ_{j=0}^{T−1} (e^{2πik/T})^j = Σ_{j=0}^{T−1} e^{2πijk/T} = (1 − e^{2πik})/(1 − e^{2πik/T})    (5.11.4)

The numerator of this expression is zero for all k, k an integer. The denominator is not zero, unless k is 0, ±T, ±2T etc., etc. When this indeterminacy occurs, we just go back to our definition of the sum as a sum of T ones. So we have:

Σ_{j=0}^{T−1} e^{2πijk/T} = T,  k = 0, ±T, ±2T etc.
                          = 0,  all other k.    (5.11.5)
This is just our orthogonality relation with the single index k replacing the difference of two integral indices (k − ℓ). The geometrical interpretation of this sum is that of a polygon, star or rosette. You can check it out by using a box of toothpicks, with T = 5, 6, 7, or 8 as an example.
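If no toothpicks are at hand, the closure of the rosette, Eq. (5.11.5), can also be checked numerically; this is an illustrative sketch, not part of the original text:

```python
import cmath

for T in (5, 6, 7, 8):
    for k in range(-2 * T, 2 * T + 1):
        # Sum of T unit vectors at angles 2*pi*j*k/T, j = 0, ..., T-1
        s = sum(cmath.exp(2j * cmath.pi * j * k / T) for j in range(T))
        # (5.11.5): the polygon closes (sum = 0) unless k = 0, +-T, +-2T, ...
        expected = T if k % T == 0 else 0
        assert abs(s - expected) < 1e-9
```

The loop over negative and positive k confirms the periodicity in k as well as the closure of each star or rosette.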
5.12
A Digression on Aliasing
I want to emphasize that the above representation of f(t), (5.8.2) or (5.10.1), is not restricted to functions for which f̄ = (1/T) Σ_{t=1}^{T} f(t) is zero, nor is it restricted to functions with a zero linear trend. I emphasized that the constant and linear trend had to be taken out before we could hope to examine data on f(t) statistically, but calculating the power spectrum, either directly or via the exact covariance (5.10.13), will not make these removals for you.

There is moreover a peculiarity in the above Fourier representation which brings out the difference between thinking of f(t) as a function of a discrete vs. a continuous variable t. Let us suppose f(t) = constant, t = 1, 2, ..., T, and not defined for any other values of t. If we use the form for f(t) in (5.10.1), complex exponentials,

f(t) = Σ_{j=0}^{T−1} a_j e^{2πijt/T}    (5.12.1)

then for this representation the first coefficient a_0 = constant, and all the remaining a_j, j = 1, 2, ..., T−1, are zero. So we have

f(t) = const · e^{2πi(j=0)t/T} = const (real for f(t) real)    (5.12.2)

If we choose to consider t as a continuous variable (it was not such in the definition of f(t)), then the Fourier representation of f(t), t now continuous, looks like Figure 5.12.1. f(t) is correctly represented where it originally existed, at the integers, and is the constant between them, plausibly enough.
We can equally well use for f(t) the representation (5.8.3), j = 1 to T:

f(t) = Σ_{j=1}^{T} a_j e^{2πijt/T}    (5.12.3)

With this choice we find a_1, a_2, ..., a_{T−1} all = 0, and only the last coefficient a_T of the series is different from zero, a_T = const. For this case we have

f(t) = const · e^{2πi(j=T)t/T} = const · e^{2πit}    (5.12.4)

as a representation of f(t) = constant as a Fourier expansion. If we plot this up with t now considered as a continuous variable (using only the real part of e^{2πit}), we find Figure 5.12.2.

Fig. 5.12.1 Fourier Representation of f(t) = const. by a_0 e^{2πi(j=0)t/T}.

Fig. 5.12.2 Fourier Representation of f(t) = const. by a_T e^{2πi(j=T)t/T} (real part).
The above phenomenon, in which a constant function f(t) (defined only at integers) can be represented continuously by either a "brick" or a "rake," is a simple example of a general phenomenon called aliasing of frequencies. A frequency j = 0 or ω = 0 is "alias" j = T or ω = 2π. In fact any function defined only at T consecutive integers of its argument can be represented by any consecutive set of T Fourier coefficients and associated frequencies, j = 0 to T−1, or j = 1 to T. In general, j = k to j = T + k − 1. Correspondingly we had to suppose strict periodicity in T for f(t) for equations (5.10.4 to 14) to be valid. In terms of ω, any span of length 2π(T−1)/T ≈ 2π is equally valid. Many treatments prefer to consider ω from −π to π, but give only the interval 0 to π, as the spectrum is symmetric about ω = 0 or ω = π for real functions f(t). In what follows we will just stick to the representation j = 0 to j = T−1. The graphs only cover j = 0 to T/2. Any consecutive T values of j and the corresponding ω range would serve as well.
Aliasing is a mathematical curiosity, a property of Fourier representations for an f(t) defined and existing only at t integral. Aliasing is practically significant if there are sound reasons for believing that a function f(t) exists and has significant properties between the values t = 1, 2, 3, ... for which it was observed. For many physical measurements (position of a satellite, temperature on a weather record) this is a reasonable assumption, i.e., the mathematical continuity of the variable t and the function f(t). For other types of sequences, notably economic ones, we have indicated that assumptions for any sort of existence of f(t) between the observations are much more debatable. Nor for that matter are the observations spaced uniformly in time.
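The "brick" and "rake" of Figures 5.12.1 and 5.12.2 can be reproduced numerically. The following sketch (not from the original text) checks that the j = 0 and j = T representations of a constant agree everywhere f(t) was actually defined, and disagree between the integers:

```python
import numpy as np

T = 8
const = 3.0
t = np.arange(1, T + 1)          # f(t) defined only at the integers

# j = 0 "brick" and j = T "rake" representations of f(t) = const
f_brick = (const * np.exp(2j * np.pi * 0 * t / T)).real
f_rake  = (const * np.exp(2j * np.pi * T * t / T)).real   # e^{2 pi i t} = 1 at integers

# Identical wherever f(t) was defined:
assert np.allclose(f_brick, const)
assert np.allclose(f_rake, const)

# Between the integers the alias oscillates: e^{2 pi i t} at t = 1.5 is -1
t_half = 1.5
assert np.isclose((const * np.exp(2j * np.pi * T * t_half / T)).real, -const)
```

The two representations carry exactly the same information about the data; only the (unobservable) interpolation between sample points differs.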
We can see in the preceding discussion the rather peculiar and special role which a nonzero mean f̄ = (1/T) Σ_{t=1}^{T} f(t) = a_0 plays in the Fourier series and power spectrum representation of the data. The mean may appear either as a zero frequency component, j = 0, ω = 0, or, as j = T, ω = 2π, as a high frequency component, oscillating precisely at the interval one (day, week, month) at which we take (or see or hear) our data. It might seem simpler to just subtract it out once and for all and forget about it. However, we saw that it was not acceptable to subtract the same mean from all the different scatter diagrams for autocovariance, if we are using real data. Only if we made the physically absurd assumption of strict periodicity over an interval T is a single mean subtraction acceptable. We are committing the mortal sin of making assumptions about the data to fit the theory of the Fourier expansion.

One way or another, the mean is "mean," and a source of trouble and uncertainty. We need to get rid of it for ergodicity, but we can never do so completely. When we discuss the structure function and the closely related interquartile range of the differences, we shall see a way around this difficulty, but with power spectra it makes trouble.
5.13
The Interpretation of the Mean and Linear Trend as Fourier Expansion Coefficients, Mathematical Paradoxes
Although we have not mentioned it in the preceding discussion, the difficulties and uncertainties associated with linear trend (as we have defined it, the nonzero mean difference) are, from a mathematical standpoint, of the same type as those associated with a nonzero mean.

To show this connection, let us drop for the moment our previously emphasized assertion that f(t) only exists at discrete values of t = 1, 2, 3, ..., T. Instead of adhering to the gospel according to the modern prophets, Chicken Little and Zero, we adopt the viewpoint of those ancient sages Leibniz and Newton. We imagine our "data" f(t) really exists for all real numbers t, not just integers (we didn't take any data there). Putting f(t) = 0 outside some finite domain of t, so that f(t) "exists" for all t, even if zero for most of it, is really quite as arbitrary and absurd as supposing, as we did for the sake of Fourier analysis, that it was strictly periodic in T.

In any event, the expression for this situation, corresponding to our Fourier series expression (5.10.1), instead of a sum of separate amplitudes a_j or a(ω_j), is an integral of superposition,

f(t) = ∫_{−∞}^{∞} a(ω) e^{iωt} dω    (5.13.1)

This is a superposition of wiggling functions e^{iωt} with an amplitude (density) a(ω) along the ω axis. a(ω) dω is the amount of this wiggling function from ω to ω + dω. Now just what is the nature of this wiggling function which we
superpose (in the limit infinitely many, each infinitesimally small) to get f(t)? It is a solution of the differential equation for simple harmonic motion,

d²y/dt² + ω²y = 0    (5.13.2)

The amplitudes a(ω) correspond to the constants of integration:

y = A(ω) sin ωt + B(ω) cos ωt

or we may write

y = A(ω) e^{iωt} + B(ω) e^{−iωt}    (5.13.3)
The amplitudes A(ω), B(ω) can be considered as functions of ω. For ω = 0 the solution of (5.13.2) is

y = A_0 t + B_0    (5.13.4)

which is just a constant and a linear trend. The ω = 0 "components" of the spectrum are the constants of integration, exactly as they are amplitudes for nonzero frequency ω.
The limit as ω → 0 is also instructive of the difficulties encountered when the constant and linear trend are not removed from data. For t fixed, B(ω) cos ωt approaches a constant as ω → 0, if B(ω) is continuous in ω and not zero. There is nothing in the mathematics that says B(ω) has to be continuous in ω; there are important cases when it is not. A(ω) sin ωt approaches (t fixed) a linear trend as ω → 0 only if A(ω)·ω → const. So a linear trend puts an "infinite amplitude" A(ω) ∼ const/ω component as ω → 0. In the power spectrum this is a contribution of order const/ω². In numerical analysis this can cause considerable ambiguities at low frequencies (i.e. ω close to zero), if and only if you believe the f(t) which you are representing with sines and cosines has mathematical continuity as a function of t. If f(t) doesn't exist between t and t + 1, ω between 0 and 2π/T doesn't exist either.
One can note an analogy of sorts with the experiment of Chicken
Little and Zero, and the Cauchy distribution. With linear trend or a nonzero mean, plus a belief in real numbers and mathematical continuity, you are using a Fourier representation or power spectrum
analysis in circumstances where it isn’t entirely appropriate.
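The const/ω² contribution of an unremoved linear trend is easy to exhibit numerically. For the discrete case f(t) = t one can show the exact result |a_j|² = 1/(4 sin²(πj/T)), which behaves like const/j², i.e. const/ω², at low frequencies. The sketch below is illustrative and not from the original text:

```python
import numpy as np

T = 1024
f = np.arange(1, T + 1, dtype=float)   # pure linear trend f(t) = t

a = np.fft.fft(f) / T                  # expansion coefficients
P = np.abs(a)**2                       # power spectrum

# Exact discrete result: P[j] = 1 / (4 sin^2(pi j / T)) for j != 0,
# which is approximately (T / (2 pi j))^2, i.e. const / omega^2.
j = np.arange(1, 6)
assert np.allclose(P[j], 1 / (4 * np.sin(np.pi * j / T)**2))

# Low-frequency power falls off like 1/j^2:
assert np.allclose(P[1] / P[j], j.astype(float)**2, rtol=1e-3)
```

A few low-frequency spectrum values of trending data can therefore dwarf everything else in the spectrum, which is exactly the ambiguity described above.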
From a more advanced mathematical standpoint we can point out the following in connection with these zero frequency terms, which have to be handled separately in using sines and cosines to represent a function. The name of Fourier is usually associated with expansions in sines and cosines. There are numerous other families of such expansion functions; each comes with an eigenvalue, and the associated function is called the eigen or proper function. For these other kinds of wiggling functions, the zero eigenvalue and associated eigenfunction are special cases which, if needed (which is seldom in most physical problems where such functions are used), have to be handled separately, just as the constant and linear trend in Fourier analysis.

The purpose of these digressions on aliasing, means and linear trends is to remind you that the theory of Fourier representations and power spectra, like all theories, has its limitations. Hopefully they will make you pay attention to the limitations of your mathematics.
You may recall that in the 8th grade you learned in algebra to manipulate letters which stood for numbers, instead of manipulating numerals as in arithmetic. Suppose you had this problem: The difference of two boys' ages is two years. The product is 24 (years)². How old are they? If you translate this sentence in English into algebraic symbols, and manipulate them according to the rules, you find the boys are 6 and 4 years old, or −6 and −4. If you translate these answers back into English, one (possibly absurd) implication of the second answer is that they haven't been born yet, or, in a science fiction world, that you can live backward in time. You could add to the mathematical specification of your problem that age must be a positive number; the algebra alone won't do that for you. The theory of sets of numbers will, if you use it.
The paradoxes of Fourier analysis are similar to the above, but much more subtle. You have to pay close attention as you translate back and forth from the mathematical or numerical conclusions to everyday language and concepts. You can find in Section 5.21 an example of a paradox of Fourier analysis, consequent to its ability to handle in a general way both functional and stochastic relations.
The “expected” power spectra of random walks, linear trends and a step function, all look exactly alike.
5.14
The Spectrum of White Noise. A Sequence of Independent in Probability Values of f(t)

Let us take up a specific example of power spectra, so-called
white noise.⁷ This is one of the commonest and simplest cases, and is frequently used as a comparison standard or benchmark against which other sequences giving different power spectra are compared. We commented earlier that the power spectrum was frequently thought of and depicted as a thin, solid and perhaps piecewise smooth line. The same could be said of the autocovariance. In fact it was one of the conditions for the existence of a power spectrum (at least for a real function of a real variable t) that the (expected) autocovariance be continuous, i.e., a solid line. So it seems inherently reasonable that

7 The term "white" as applied to a spectrum means that all frequencies (in practice over a finite domain of frequency) are present in equal amounts, in analogy to white light (sunlight) containing all colors in equal amounts, also over a finite domain (red to violet, the range of the eyeball). To the ear, white noise is approximated by the sound of a hail storm on a slate or tin roof: all frequencies to which the ear responds are present, in equal numbers.
the (expected) power spectrum, being the Fourier transform of the autocovariance, should have some degree of solidity (i.e., continuity) too. We saw that if f(t) was dotted, or defined at discrete values of t only, the coefficients a_j or a(ω) were dotted too. Yet it seems reasonable that such dotted functions might be approximated by solid and piecewise smooth lines. As we saw, this was a common enough practice, and works well in some cases, but not all, as we have gone to considerable pains to show. So we are going to take the case of white noise, and see just how, in practice with real data, the power spectrum of white noise is not very well represented by a thin solid smooth line.

Thinking just in terms of a thin solid smooth line is like thinking of the relation of weight y to height x just as a line. There are real and important fluctuations around this (regression) line. The expected value of weight y might well be, in some approximate or special sense, smooth and continuous in x, but it would not make sense to draw a zigzag line connecting successive values of observed weight with increasing height. It doesn't make sense with power spectra either, but it is a very common practice nevertheless.

So let us imagine our f(t) to be created in the following way. We make t = 1, 2, 3, ..., T withdrawals in order from a normal (Gaussian) random number table of zero mean and unit variance. These ordered choices are our f(t). We might repeat the experiment to get another, different f(t). We would certainly expect its statistical properties to be much the same as for the first f(t). These two ordered choices, of T = 1000 say members each, are two members of the ensemble of all possible f(t)'s. We will use E (expectation) to calculate expectations over this ensemble, and bars to indicate averages over one member of such an ensemble. We can see that as T gets large, f̄ = (1/T) Σ_t f(t) → E(f) = 0, so this system satisfies the second condition of ergodicity, and the other conditions of both strict and weak stationarity as well.⁸

8 Except, as we have noted, mathematical continuity of the autocovariance for all values of the delay k.

For what follows there is no requirement that this table have a normal or Gaussian distribution. Random digits valued at plus one for even digits, minus one for odd, would serve equally well; these also have zero expected value and unit variance.

In order to determine what you might expect to get from this
experiment, you simply calculate the expected value (over the ensemble) of our Eqs. (5.10.1 to 14) describing Fourier expansions and power spectra, and compare this with what you actually get in an experiment or repeated experiments. Beginning with simple things first, take just f(t) itself. Since E f(t) = 0, this is just a row of dots on the axis of abscissas. We have given this in Fig. 5.14.1(a), also the experimental results, Fig. 5.14.1(b).

We have plotted the experimental results in a number of different ways, in order to illustrate the cosmetic effect of different plotting methods. These plotting methods also illustrate the implied and unconscious assumptions behind them which the reader inadvertently adopts. There are numerous examples in the literature.⁹ At the right, twisted through 90 degrees, we have sketched the appropriate distribution and given its analytic expression. As we shall see, this distribution has an appreciable effect on some of the cosmetic impressions.

Figure 5.14.1(a) is just the plot of E f(t) itself, a row of dots on the t axis. (b) is what I call the Chicken Little Zero plot, a picture, no more and no less than what the data says. If the time scale is highly compressed, as it frequently is for data of this type, it is difficult to read off the time order of the points on a graph of this type. The subsequent plots ameliorate this difficulty, at the price of some unconscious and misleading assumptions.

In Figure 5.14.1(c) and (d) the dots of (b) are extended, backward in (d) and forward in (c), to form short horizontal lines. These are connected by vertical segments to succeeding and preceding horizontal lines. The effect of these vertical strips is to make clear what the order in the sequence is, but they should not be considered part of the process which generates f(t). There is nothing which moves along these vertical segments; their actual existence would destroy functionality for f(t).

There is also nothing existing or "moving" along the horizontal segments either. They provide an impression of piecewise continuity which is not actually the case. Note that plot (d) seems to "lead" plot (c), simply because of the way they are plotted.

Figure 5.14.1(e) is a very common plot for data of this type. It "supposes" that f(t) exists at all t, and has to "get" to its successive observed values, at the integers, as quickly as possible, i.e., by straight lines. It is continuous everywhere and piecewise smooth.

9 See e.g., Granger and Morgenstern, p. 53.
Fig. 5.14.1 (a) Expected f(t); (b), (c), (d), (e) Observed f(t), Plotted in Various Ways.

Fig. 5.14.1(f) Smoothed or Running Average of the Observations of Fig. 5.14.1(b).
There is one subjective impression of (e) which is correct, and shows up more clearly than in (b), (c) or (d), and not at all in (a), the "expected behavior." Subjectively (e) seems excessively zigzagged. A majority of upslants are followed by downslants, a majority of downslants are followed by upslants. This states the fact that if successive values of f(t) are independent in the probability sense, successive differences of f(t), Δf(t) = f(t) − f(t−1), are not independent, and in fact negatively correlated. Upslants are positive differences, downslants negative differences. There are even formulae for computing their negative correlation by counting peaks, or crossings of f = 0 of this zigzag line.
Examples of financial data which have approximately the properties of our random sequences are annual earnings, quarterly earnings to somewhat less degree; daily, weekly or monthly closing stock price
changes, daily advances and declines. The volume sequences (daily,
weekly and monthly) have noticeable departures from the properties of f(t). All of them can be compared to the random sequence as a standard. You can read about such comparisons in the books by Granger and Morgenstern and by Granger and Labys. Finally, you can imagine drawing, either by eye or computer, a
smoothed or moving average of the original data on f(t), Figure 5.14.1(b). We give this in Figure 5.14.1(f). This is a very common practice, if you believe unconsciously that there is a continuous and smooth process underlying the data, like an idealized random walk, a continuous line (almost surely, almost everywhere) with a slope almost surely almost nowhere. Imaginable, but not depictable.
Fig. 5.14.2 A Random Sequence of Variables of Zero Mean and Unit Variance: +1 for Even Digits, −1 for Odd Digits, from a Random Digit Table.

The expected statistical properties of the two sequences
are exactly the same. The differences in appearance are due to the differences of the distributions of f, plotted at the right of Figures 5.14.1 and 5.14.2.

Note that the expected value of f(t) in both cases is zero. It is an amusing and confusing semantic paradox that with the f(t) = ±1 sequences you will never observe (in one observation) what you mathematically and theoretically expect as the best statistical estimate, and almost never with the random normal sequence for f(t). So much for what the data on f(t) looks like. What do the coefficients a_j or a(ω) look like, and the power spectrum? The expected values of a_j are easy to evaluate, and they are all zero. We had
a_j = (1/T) Σ_{t=1}^{T} f(t) e^{−2πijt/T},   j integral    (5.14.1)

Remembering that the successive f(t) are independent, the expectation of a sum is the sum of the expectations, in the above case times the complex numerical factors e^{−2πijt/T}. So

E(a_j) = (1/T) Σ_{t=1}^{T} E f(t) e^{−2πijt/T} = 0    (5.14.2)
Note that a; is usually a complex number so E(aj) = 0 applies both to real and imaginary parts of a;.
In order to discuss what the probability distribution of the coefficients a_j looks like (Figure 5.14.3), we have to take account of the fact that the a_j are complex. The two parts can be considered in two different ways, corresponding to the cartesian and polar coordinates of a point. Let us write out the expression for a_j explicitly as a complex number a_j = b_j + i c_j; b_j, c_j being the real and imaginary parts:

a_j = b_j + i c_j = (1/T) Σ_{t=1}^{T} f(t) cos(2πjt/T) − i(1/T) Σ_{t=1}^{T} f(t) sin(2πjt/T)    (5.14.3)

The cosine and sine sums give separately b_j and (minus) c_j; the sign of c_j will not matter in what follows.
Fig. 5.14.3 The Distribution of the Real Component b_j or Imaginary Component c_j of a Fourier Expansion Coefficient a_j = b_j + √−1 c_j.

You can see from the above formula that the b's and c's separately will each have (in the limit) a normal distribution, simply from the operation of the central limit theorem. Since we have, e.g.,

b_j = (1/T) Σ_{t=1}^{T} f(t) cos(2πjt/T)

b_j is a sum of independent random variables, the f's times coefficients (1/T) cos(2πjt/T), which are in absolute value less than or equal to (1/T). The f's must have the first two moments finite and hence be "well behaved" (cf. Sections 4.5-7); the actual details of the distribution of the f's are unimportant. With well behaved moments for f(t) the b's will be "normally" distributed (a continuous approximation) independent of these details. We could see these details of the distribution of f(t) in the successive scatter diagrams of f(t)
vs. f(t+k), but this information is "lost" if we only look at the Fourier expansion coefficients, or power spectrum. The mean and mean square of the distribution of f(t) affect the (expected value of the) power spectrum; no other properties of the distribution of f(t) enter in.
So regardless of whether f(t) is Gaussian or just ±1, b_j is going to be normally distributed quite closely, and in fact for white noise all the different b_j will have the same distribution.¹⁰ Since E b_j = 0, the variance of b_j is

E b_j² = E[(1/T) Σ_{t=1}^{T} f(t) cos(2πjt/T)]²
       = (1/T²) E[Σ_{t=1}^{T} f(t)² cos²(2πjt/T) + Σ_{t≠t'} f(t) f(t') (cosines...)]    (5.14.4)

Since E f(t) f(t') = 0 for all t ≠ t' (independence of f(t), f(t')),

E b_j² = (T/T²) E(f²)(1/2) = σ_f²/2T = 1/2T   for all j, j ≠ 0, for σ_f = 1    (5.14.5)

The probability distribution is, for all b_j, j ≠ 0,

p(b) db = (1/√(2π) σ) e^{−b²/2σ²} db,   σ² = σ_f²/2T    (5.14.6)

In a similar fashion, for the sines of (2πjt/T),

E c_j² = σ_f²/2T,   p(c) dc = (1/√(2π) σ) e^{−c²/2σ²} dc    (5.14.7)

We have plotted this distribution on the right in Figure 5.14.3. If we choose, we can plot the two distributions of b_j and c_j together on a plane. This gives just a bivariate and uncorrelated normal distribution, Figure 5.14.4.

10 Except, as you might guess, the coefficient for the cosine of zero frequency, b_0.
Fig. 5.14.4 The Joint Distribution of Fourier Expansion Coefficients in Cartesian Coordinates for White Noise.

P(b, c) db dc = (1/2πσ²) e^{−(b² + c²)/2σ²} db dc,   with σ² = σ_b² = σ_c² = σ_f²/2T    (5.14.8)

An alternate plot of the distribution of the b's and c's is in polar coordinates. In that case we define

|a_j| = r = √(b² + c²),   φ = tan⁻¹(c/b)
The corresponding distribution is

ψ(φ, r) dφ dr = Φ(φ) dφ · R(r) dr = (dφ/2π) · (r/σ²) e^{−r²/2σ²} dr    (5.14.9)

Fig. 5.14.5 The Rayleigh Distribution for the Modulus |a_j| = √(b_j² + c_j²) of a Fourier Expansion Coefficient a_j = b_j + i c_j for White Noise.

The distribution of r = |a_j| alone is called the Rayleigh distribution (Figure 5.14.5). In physics this is the distribution of the speed s = √(v_1² + v_2²) of a molecule in a two dimensional gas, whose individual velocity components are individually and normally distributed. In statistics it is the distribution of √(x_1² + x_2²), the analogue of the standard deviation (note, not variance) σ_s = √(σ_1² + σ_2²) of the sum s = x_1 + x_2 of two independent and identically normally distributed variables x_1, x_2; σ_1 = σ_2.
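As a numerical check (not part of the original text), the moduli |a_j| of the Fourier coefficients of simulated white noise do follow the Rayleigh distribution with σ² = σ_f²/2T of (5.14.8); here we verify its first two moments, using the known Rayleigh results: mean = σ√(π/2), mean square = 2σ²:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n = 256, 2000
f = rng.normal(size=(n, T))          # ensemble of white-noise members, sigma_f = 1

a = np.fft.fft(f, axis=1) / T        # coefficients a_j = b_j + i c_j
r = np.abs(a[:, 1:T//2]).ravel()     # moduli |a_j|, excluding j = 0 and j = T/2

sigma2 = 1 / (2 * T)                 # sigma^2 = sigma_f^2 / 2T  (5.14.8)

# Rayleigh moments: E(r) = sigma * sqrt(pi/2), E(r^2) = 2 sigma^2
assert np.isclose(r.mean(), np.sqrt(sigma2 * np.pi / 2), rtol=0.02)
assert np.isclose((r**2).mean(), 2 * sigma2, rtol=0.02)
```

The j = 0 and j = T/2 coefficients are excluded because, as footnote 10 notes for b_0, they are special cases (purely real).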
Finally, what is the expected value and distribution of the power spectrum itself, |a_j|² = a_j a_j*? We already have the expected value of each: E(a_j a_j*) = E(b_j² + c_j²) = 2σ² = σ_f²/T. The sum of all of them (or "area" under the "curve," doubled if you have only plotted ω from 0 to π, or j = 0 to T/2) is just the variance of f itself, σ_f². This is often referred to as the total "power" in the spectrum.

The probability distribution of a_j a_j* = |a_j|² can be obtained directly from the Rayleigh distribution R(r) dr of (5.14.9). Defining p (power) = r² = b² + c² = a_j a_j*, we have E(p) = E b² + E c² = 2σ² = σ_f²/T. We find, since r dr = dp/2, the probability distribution of p:

U(p) dp = (1/2σ²) e^{−p/2σ²} dp    (5.14.10)

This distribution is exponential, strongly and asymmetrically skewed. The exponential distribution gives some interesting cosmetic properties when we plot up an experimental power spectrum a_j a_j* as a function of frequency.

Fig. 5.14.6 The Distribution of the Power Spectrum Values P = a_j a_j* = b_j² + c_j² for White Noise.

Small values of p are the most probable, but large values from the exponential tail can and must occasionally occur. So there is an unsymmetric excess of high sharp peaks in the plot of the power spectrum, not matched by a corresponding number of deep troughs. As a landscape profile of frequency, there are quite a few sharp peaks, and no deep ravines.

It is very common practice to plot the power spectrum not on a
linear scale, as in Figure 5.14.6, but on a semilog scale, i.e., log for the vertical coordinate a_j a_j* but linear for the frequency, or j. For assessment by the eyeball this is equivalent to making a further transformation of the power spectrum to the variable z = log_e(p/p̄). You can easily work out that the distribution of this variable is

F(z) dz = e^{−e^z} e^z dz    (5.14.11)

plotted to the right of Figure 5.14.7. This variable is also strongly skewed, but in the opposite sense.

Fig. 5.14.7 The Distribution of the Power Spectrum Values P_j, on a Log Scale for Ordinate, for White Noise (Schematic).

On a log scale there are very few sharp mountain peaks, but many deep ravines. It is the profile of a deeply eroded plateau.
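The expected values derived in this section, the 1/2T variance of (5.14.5) and the exponential power distribution (5.14.10) with its widely skewed spread (a 90% belt roughly a factor of 60 wide), can all be checked by simulation. This is an illustrative sketch, not from the original text:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 512, 400
f = rng.normal(size=(n, T))            # ensemble of white-noise f(t), sigma_f = 1

a = np.fft.fft(f, axis=1) / T
b = a.real[:, 1:T//2]                  # real parts b_j, excluding j = 0, T/2

# (5.14.5): variance of b_j is sigma_f^2 / 2T
assert np.isclose(b.var(), 1 / (2 * T), rtol=0.05)

p = (np.abs(a[:, 1:T//2])**2).ravel()  # power spectrum values

# (5.14.10): exponential with mean sigma_f^2 / T, so std = mean
assert np.isclose(p.mean(), 1 / T, rtol=0.05)
assert np.isclose(p.std(), p.mean(), rtol=0.1)

# Skewed 90% confidence belt: 95th / 5th percentile of an exponential
# is -ln(0.05) / -ln(0.95), roughly a factor of 58
ratio = np.quantile(p, 0.95) / np.quantile(p, 0.05)
assert 40 < ratio < 90
```

The last assertion is the "row of dots with a factor-of-60 belt" picture in numerical form: the expected spectrum is flat, but individual ordinates scatter over more than an order of magnitude.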
5.15
The "Solidity" and "Smoothness" of a Power Spectrum

For example, just how "solid" and "smooth" is the observed power spec
trum of white noise, or any noise for that matter? You should note the similarity, or rather identity of the issues if this same question is
applied to f(t) itself. f(t) is withdrawn from the smooth solid normal distribution, but plotted as a function of ¢, it is pretty ragged. The power spectrum comes from a smooth solid exponential distribution. Plotted as a function of w or 7 it is also pretty ragged, in an asymmetric way. We saw that € f(t) was a horizontal row of dots of ordinate zero. About as smooth, solid and simple as the limitation placed by a row of dots can be. The actual observations of f(t) bounced around this “line” or row of dots by an amount and in a way set by the distribution of the population from which f(¢) was withdrawn, normal or +1, zero mean and unit variance in either case.
Exactly the same type of statement can be made of the power spectrum of f(t) when f(t) is “white noise.” The expected power spectrum is a horizontal row of dots—white noise. The observed power spectrum bounces around this expected value with a skewed 90%. confidence belt a factor of 60 wide.
One method of smoothing out the observations of f(t) is to take successive non-overlapping groups of k values of t and average them. Thus you calculate f̄(t) = (1/k) Σ_{t'} f(t') over each group and plot that. We did this once before, to calculate points on a trend with randomized dates (Section 5.1).
If by contrast you let the groups overlap (a moving average, possibly with unequal weights), you get more points to plot and it will “look” smoother and more continuous, (Figure 5.141f) but you are just kidding yourself with this smoothness. The successive points are not independent just to the extent that they have several values of f in common. If you put k in your group to average, the fluctuation of f(t) as plotted go down like the square root of k, but you have only T/k instead of T independent observations. What you gain in “continuity” you have lost in information about f (t). Exactly the same treatment can and frequently is applied to the power spectrum. You take k = four, five or more adjacent aja; and average them. The resulting plots “looks” more continuous. It doesn’t tell you any more, but rather less in the information sense about the power spectrum. The distribution of these average points
on the power spectrum is that of chisquare with 2k degrees of freedom. Instead of T/2 independent spectrum values, you have only T/2k. If you make it a moving average you get more points and they look smoother and more continuous, just as for the moving average
a2
for f(t).
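The averaging trade-off can be sketched as follows (my own illustration): averaging k adjacent spectrum points leaves the mean level unchanged but shrinks the fluctuations like 1/√k, at the cost of having fewer independent points.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 4096
k = 5                                   # number of adjacent points averaged

f = rng.standard_normal(T)
a = np.fft.fft(f) / T
p = (a * a.conj()).real[1:T // 2] * T   # scaled so E p_j = 1

# Non-overlapping averages of k adjacent points: chi-square with 2k d.f.
# (divided by 2k), so the mean stays 1 but the scatter drops like 1/sqrt(k).
m = len(p) // k
pk = p[:m * k].reshape(m, k).mean(axis=1)

print(p.std(), pk.std())                # second is roughly 1/sqrt(k) of first
```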
You deceive yourself if you think this prettied-up spectrum is giving you a clearer picture of the spectrum, or of the process which generated it. It is purely a cosmetic effect, pandering to your prejudices. Alternatively you can smooth this and weight that, and Fourier transform. Alternatively you can Fourier transform, directly or via the covariance, an f(t) which has been smoothed, tapered or truncated. All these procedures are in my opinion just different slices of baloney from the same continuous smooth slab! I might warn you that this is the biased opinion of a rather small minority.
5.16 How to Inspect and Check a Published Power Spectrum

If the units on the axis are 10^5, or 10^0, $ (bushels)^-1 (week)^-1, something is funny; the author must have changed the scale to fit. Corn prices don't change by such amounts in a week. Without units and dimensions, you just can't tell.

The texts are usually using such data to approximate, or estimate, some underlying "true," "continuous in t" (mathematical sense) process. We take the opposite position, that it is the "continuous process" which approximates or estimates the "true" discrete process which generates the data.
You should be able to read off a rough estimate of dispersion of these weekly price changes (take one-half their inter-sextile range). This squared is (for a Gaussian distribution) the variance, of dimensions $^2 (bushels)^-2 (week)^-2. It is often called the "total power." The power spectrum has these dimensions. If it is directly plotted as individual points a_j a_j^* as a function of j, the spectrum should then average about [1/(number of coefficients)] times this variance. If it does not, then look for a missing factor on the power spectrum scale. The power spectrum may have been normalized by dividing by the variance; in that case the scale on the power spectrum values as plotted should be such that the values add up to unity.

Sometimes the power spectrum is plotted as a density; in that case its dimensions should be labeled $^2 (bushels)^-2 (frequency)^-1. Check what kind of frequency: is it cycles week^-1 or 2π cycles week^-1? In either case the area^12 under the spectrum, with the proper units, should reproduce the previously estimated total power. You can make all these checks by rough measurements on the graphs themselves, but only if the units and dimensions are given. You can use such an area calculation to estimate and check the fraction of variance, or fraction of total power, in any specific frequency band. Finally you should be able to check, at least approximately, the distribution of the spectrum values a_j a_j^* and hence the width of the confidence belt. These are determined by the number of degrees of freedom.
The basic recipe for the number of degrees of freedom is 2 × (number of spectrum points averaged together on the spectrum plot). If there is no averaging (i.e., a_j a_j^* was calculated directly and exactly by either (5.10-2, 3) or (5.10-13)), the recipe gives just 2 d.f., as we previously derived (5.14-10). The "number of points averaged together" can be produced in a number of different ways; one has to examine the text rather carefully to see what the computing routine was. There are at least three ways this number can be produced.

1) The simplest method is that of a running average (possibly of unequal weights) of the exact spectrum a_j a_j^* (5.10-2, 3). See e.g., Granger and Morgenstern, p. 70. Unfortunately the actual number used in the running average is not given, so here we don't know the number of degrees of freedom.

12 Note that "area" refers to a linear plot of the spectrum. On a semi-log or log-log plot you have to convert, i.e., read off the linear dimensions. See Figure 5.19-7 for an example.
2) The second method for determining the number of spectrum points averaged together is to use the following formula, a variant of the "exact" formula eq. (5.10-13):

  P_j = estimated power spectrum = (1/M) Σ_{k=0}^{M} APM_est(k) e^{2πijk/M},   j = 0, 1, 2, ..., M   (5.16.1)

Here M is some integer which is a fraction, 1/3 or less, of T, the total number of sequential observations; APM_est(k) is usually, but not always, an expression of the type

  APM_est(k) = (1/(T - k)) Σ_{t=1}^{T-k} f(t) f(t + k)   (5.16.2)

Note that APM_est(k) does not use the unrealistic property of strict periodicity of f(t), but stops summing short of where f(t) becomes strictly periodic in t, appropriate to the "exact" expression (5.10-13). It can be shown (see e.g., the argument following (5.10-5 to 14)) that (5.16.1) gives just M independent estimates of the spectrum. So the number of spectrum points averaged together is T/M. The number of degrees of freedom is 2(T/M). See examples in Figures 5.19-2 to 6.
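A minimal numerical sketch of this second method follows. It is my own code, with my own function names, and the 1/M normalization is an assumption on my part rather than the book's exact prefactor.

```python
import numpy as np

def apm_est(f, k):
    """Estimated lag-k product moment, summing only to T - k (cf. eq. 5.16.2)."""
    T = len(f)
    return np.dot(f[:T - k], f[k:]) / (T - k)

def bt_spectrum(f, M):
    """Spectrum estimate from M lagged covariances (cf. eq. 5.16.1).

    Gives roughly M independent spectrum points, about 2T/M degrees of
    freedom each; it can occasionally go negative, as the text warns.
    """
    k = np.arange(M + 1)
    cov = np.array([apm_est(f, kk) for kk in k])
    w = np.ones(M + 1)
    w[1:] = 2.0                         # count interior lags twice (real series)
    return np.array([np.sum(w * cov * np.cos(2 * np.pi * j * k / M))
                     for j in range(M + 1)]) / M

rng = np.random.default_rng(2)
f = rng.standard_normal(3000)
P = bt_spectrum(f, M=300)               # M is a small fraction (1/10) of T
print(P.mean())                         # about (total power)/M for white noise
```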
A variant of (5.16.1) is

  P_j = (1/M) Σ_{k=0}^{M} λ_k APM_est(k) e^{2πijk/M},   j = 1, 2, 3, ..., M   (5.16.3)

Here λ_k is a set of unequal weights starting at λ_{k=0} = 1 and tapering monotonically to zero at k = M. Different sets of weights, the different choices of λ_k, are called windows. Four such different windows are described in Blackman and Tukey, pp. 95-99. Naturally, for a given APM_est(k) they give slightly different independent estimates for the spectrum at different values. The effect of such windows is to approximately average together T/M adjacent spectrum points of the exact spectrum. The approximate number of degrees of freedom is, as before, 2T/M.
3) Finally one can combine both methods 1) and 2) above, and smooth with a running average (two or three points at the most here) a spectrum calculated from (5.16.3). The number of d.f. is increased over 2T/M corresponding to the number n (2 or 3) of points averaged together, i.e., d.f. = 2nT/M.
There is a difficulty with method 2) above for calculating a spectrum, which is a good illustration of the difficulties sometimes encountered in deciding what constitutes "good" scientific procedure. Method 2) occasionally gives negative values for the spectrum, which is impossible by the "exact" method ((5.10-3 or 13), or Method 1), which smooths the exact values. Now a chi-square distribution does not have negative values of its argument, so the distribution of spectrum values calculated by Method 2) cannot be exactly chi-square. This means confidence limits on the spectrum assuming a chi-square distribution must be in error by some unknown amount. There has been considerable effort in the past to devise averaging and weighting procedures, the λ_k, for calculating from the APM_est(k), to prevent these negative values in the spectrum from occurring. I regard this as simply sweeping the dirt under the rug. Forcing the estimated spectrum to be positive does not necessarily force the distribution to be chi-square with a determinable number of degrees of freedom. So the confidence limits are uncertain, and correspondingly the reliability of statistical conclusions drawn from them.

You should now feel the prod of the horns of a dilemma, when you try to decide: a) whether to use the exact spectrum (5.10-3 or 13) or Method 1), which simply smooths the exact spectrum; or b) a spectrum from an estimated APM(k), Methods 2) or 3). With choice a), from the Fourier expansion coefficients, you represent (fit) the data exactly, with complete reverence for it. You know the distribution of the spectrum values (chi-square with known number of d.f.) and hence where the confidence limits are. You have solved, neatly and exactly, an absurd and unrealistic statistical problem, involving a strictly periodic f(t) and strictly periodic APM(k). With choice b), using APM_est(k), you do not assume these absurd and unrealistic properties which allowed a known distribution and number of d.f. of the spectrum values. Instead you solve a realistic and reasonable statistical problem, but are now uncertain about the distribution of the spectrum values, and hence the confidence limits. You need these to interpret your results with some significance. You have, in short, a choice between an unrealistic problem solved exactly and certainly (in the statistical sense), and a realistic problem solved approximately and with greater uncertainty about the (significance of the) conclusions.
There are two escapes from this dilemma, one particular and one general. In a particular example, if you are uncertain as to how much to believe some interesting property of the data as shown by its spectrum, calculate the spectrum both ways. If you can find the same conclusion significantly either way, you are probably safe. In general this dilemma could be resolved experimentally. Calculate a number of different white noise spectra, using both normal, ±1, random, or even a strongly skewed distribution, of zero mean. Calculate the spectrum of these different white noises by both methods, and compare the distributions of spectrum values you actually obtain. I might add, from my own experience, that this problem of the distribution of spectrum values is particularly exacerbated when you study 1/ω noise, generated naturally in a semiconductor, or artificially (e.g., 5.24-6, with δ = -1/2).
Table 5.16-1  GIVEN χ², WITH PROBABILITIES, TO DETERMINE CONFIDENCE BELT WIDTH FOR POWER SPECTRA

  No. of   Total width of 90%    Factor down    χ²,           χ²,          χ²,           Factor up
  d.f.     conf. belt (factor)   from median    Prob = 0.05   Prob = 0.5   Prob = 0.95   from median
    1         1000               (115)^-1        0.0039         0.45         3.84          8.5
    2           58               (13)^-1         0.103          1.39         5.99          4.3
    3           22               (6.7)^-1        0.352          2.37         7.81          3.3
    4           13               (5.0)^-1        0.711          3.36         9.49          2.8
    5            9.7             (3.8)^-1        1.145          4.35        11.07          2.5
    6            7.8             (3.3)^-1        1.635          5.35        12.59          2.4
    7            6.5             (2.9)^-1        2.17           6.35        14.07          2.2
    8            5.7             (2.7)^-1        2.73           7.34        15.51          2.1
    9            5.1             (2.5)^-1        3.33           8.34        16.92          2.0
   10            4.7             (2.4)^-1        3.92           9.34        18.31          2.0
   15            3.5             (2.0)^-1        7.26          14.30        24.90          1.7
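For 2 d.f. the chi-square distribution is a simple exponential, so that row of the table can be checked in closed form; the snippet below is my own verification, not part of the book.

```python
import math

# For 2 d.f., chi-square is exponential with mean 2, so its quantiles are
# available in closed form: q(p) = -2 ln(1 - p).
q05 = -2 * math.log(1 - 0.05)   # ~ 0.103
q50 = -2 * math.log(1 - 0.50)   # ~ 1.39  (median)
q95 = -2 * math.log(1 - 0.95)   # ~ 5.99

print(q95 / q05)                # total width of 90% belt, ~ 58
print(q50 / q05)                # factor down from median, ~ 13.5
print(q95 / q50)                # factor up from median, ~ 4.3
```

The other rows require chi-square quantiles with more degrees of freedom, which have no such elementary closed form.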
It is possible, although I have never seen it done in a statistical problem, to calculate a Fourier representation of the data, and hence a power spectrum, using sines only or cosines only. In that case (for white noise) the (coefficients)² have a χ² distribution with one degree of freedom. Its shape is of order

  g(y) dy ~ e^{-y/2} dy / y^{1/2},   with y = chi-square

5.17 White Noise as a Standard of Comparison
I commented previously that the spectrum of white noise served as a benchmark against which other spectra could be compared. There is a corresponding standard procedure in a great many elementary statistical problems. The same considerations apply in comparing some power spectrum to that of white noise. Let me illustrate by means of an example. Suppose we have measurements on two variables x, y: x_i, y_i, i = 1 to N. We ask, are x and y statistically related? One elementary answer is to calculate the correlation coefficient. If it isn't zero you have to ask, is it significantly different from zero? In this case the comparison standard or benchmark is a population of uncorrelated and normally distributed pairs of variables. You calculate, from this hypothetical population with zero correlation, the probability of finding, with a sample equal in number to yours, a correlation at least as large as the one you actually observed. If this probability (the significance probability) is small enough (.05, .01, etc.) you say that at this level of significance the observed x and y are correlated.
A great many statistical efforts to answer our original question stop at this point.
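A small simulated version of this recipe can be sketched as follows; the sample, its assumed relation, and the number of benchmark trials are my own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100

# Hypothetical sample: x and y weakly related.
x = rng.standard_normal(N)
y = 0.3 * x + rng.standard_normal(N)
r_obs = np.corrcoef(x, y)[0, 1]

# Benchmark: uncorrelated normal pairs of the same sample size.  The
# significance probability is the chance of a correlation at least this
# large arising from the zero-correlation population.
r_null = np.array([np.corrcoef(rng.standard_normal(N),
                               rng.standard_normal(N))[0, 1]
                   for _ in range(5000)])
p_sig = np.mean(np.abs(r_null) >= abs(r_obs))
print(r_obs, p_sig)
```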
If you want to do such a problem more carefully, you should have first asked: is the above comparison standard appropriate? And if not, doctor the data a little until its distribution is appropriate to the chosen standard or benchmark, at least approximately, just as we doctored our sequential data to make it appropriate first for just statistical analysis, and then more specifically for spectral analysis, with white noise (one way of describing T uncorrelated and independent variables instead of just two) as a comparison standard. For example, x, y might be any one of the following pairs.

• 1) height and weight of students
• 2) barometric pressure and wind velocity
• 3) price change and volume, one stock or an index
• 4) IQ of a student and his family income
• 5) price and dividend of a stock
• 6) price and earnings of a stock
• 7) price and book value of a stock
• 8) price change and earnings change for a stock.
You can and should test whether the comparison standard of two normal and uncorrelated variables is appropriate by simply examining the distribution of x and y separately (the marginal distributions, if the data are plotted on a scatter diagram or assembled in a two-way table). If these two distributions are approximately normal, then you can use this comparison standard. This will probably be true for pairs like 1), but probably not for pairs like 4). In such cases it would be reasonable to transform one or both of the variables. For IQ vs. income, IQ is probably close enough to normal, but income might well not be; log income might come close enough to normality. Other transformations may well be suggested by the observed distribution for other particular pairs. As we saw from the discussion of our magic transformation word, log
of prices is usually sensible, but log of earnings would not be, since earnings have a finite probability of being zero or even negative.

The preceding remarks about the use of the normal or bivariate normal distribution as a comparison standard have some close analogs in the discussion of power spectra, where the comparison standard is white noise, one way of describing T independent variables. We saw that, given some experimental sequence f(t), differencing and removing the linear trend (as we narrowly defined it, the mean difference) went a rather long way toward massaging the sequence to make it nearly white in its spectrum. Just as in the case of a comparison standard for a single bivariate distribution, we don't need to get our data to an exactly white spectrum, just close enough to examine significant departures from pure white. So our transformed sequence is compared to the white spectrum, and we can then examine whether certain frequencies appear in significant excess or deficiency relative to the white noise distribution. To make this comparison carefully you have to know the distribution of the white noise spectrum values, chi-square of 2 d.f. supposedly, but we saw that this was not a completely settled question; the distribution is different for different computing procedures.
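This massaging can be sketched numerically (my own illustration): difference a drifting cumulative series, remove the mean difference, and check that the resulting spectrum is roughly flat.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 2048
steps = rng.standard_normal(T) + 0.05      # drift: non-zero expected advance
prices = np.cumsum(steps)                  # random-walk-like "price" series

# Massage toward whiteness: difference, then remove the mean difference
# (the narrowly defined linear trend).
d = np.diff(prices)
d = d - d.mean()

a = np.fft.fft(d) / len(d)
p = (a * a.conj()).real[1:len(d) // 2] * len(d)   # flat level ~ variance of d

# A white spectrum is flat: low- and high-frequency halves have similar means.
half = len(p) // 2
print(p[:half].mean(), p[half:].mean())
```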
5.18 The "Spectrum" of a Random Walk Itself

We have discussed in considerable detail the spectrum, the distribution of the coefficients, and confidence belts of a sequence t = 1, ..., T of independent variables f(t) of zero mean and finite variance which we called "white noise." These are just the steps of a random walk of zero expected advance. So we could easily use such data to generate a random walk R(t) of zero expected advance.

DEFINITION

  R(t) = Σ_{t'=1}^{t} f(t')   (5.18.1)
We estimated in an earlier discussion (Section 5.7) what the autocorrelation of such a cumulative sum would look like. So, using the relation that the power spectrum is the Fourier transform of the autocovariance, one might evaluate an expected power spectrum from that. We also pointed out that such a random walk was neither strictly nor weakly stationary, nor ergodic. The moments increased the longer the interval over which the data was taken. The distribution of R(t_1) was not independent of R(t_2), no matter how big t_1 - t_2 might be.
Nevertheless one can calculate a "spectrum" for such a process. It amounts to finding out what you would get if you punched the data into a computer and told it to calculate a spectrum anyhow. Such experiments have been done, both on artificial sequences and by Granger and Hatanaka, Granger and Morgenstern, or Labys and Granger. A characteristic type of "spectrum" does indeed emerge. Let us see what it is like.
Reviewing briefly, we had f(t), t = 1, ..., T, as successive choices from a table of zero mean and unit variance. Our Fourier expansion was

  f(t) = Σ_{j=0}^{T-1} a_j e^{2πijt/T},   t = 1, 2, ..., T   (5.18.2)

The real and imaginary parts separately of a_j were normally distributed with zero expected mean and variance σ_f²/2T = Ef²/2T. The separate, unsmoothed points on the power spectrum, a_j a_j^*, were distributed exponentially (chi² of 2 d.f.), independent of j, with expected value E a_j a_j^* = σ_f²/T.
We define a new function R(t), as above (5.18.1). Now the data R(t) (one member from the ensemble of all possible R(t)'s) must have a Fourier representation of the form

FOURIER REPRESENTATION

  R(t) = Σ_{j=0}^{T-1} w_j e^{2πijt/T},   t = 1, 2, ..., T   (5.18.3)

If we can find a relation between the a_j's and the w_j's, then since we know the former we can evaluate the latter. We shall derive the w_j's at first approximately, and then exactly, anticipating some properties, excluding and including a constant (not necessarily the mean) and a linear trend in R(t). These two contributions of R(t) are precisely what we have given arguments for removing from sequential data before carrying out any statistical analysis, and in particular a spectrum calculation. We will be able to see exactly what removing them does to the spectrum.

You should also note that for Ef(t) = 0, the expected constant and linear trend of R(t) as defined above in (5.18.1) are zero. In a particular single experiment or sample, they will almost certainly not be zero, nor even small, if the walk has several hundred steps.
5.19 The Approximate Evaluation of a Random Walk Spectrum
You will notice from the above definition of R(t) (5.18.1) that the differences of R(t), ΔR(t) = R(t) - R(t - 1), are exactly equal to f(t) except for the first one, ΔR(1) = R(1) - R(0), which is not defined, since R(0) is not given by its definition. However, if we take, arbitrarily, R(0) as given by the Fourier representation (5.18.3) (not the definition), then R(0) = R(T) = Σ_{t=1}^{T} f(t). With that definition

  ΔR(1) = R(1) - R(0) = f(1) - Σ_{t=1}^{T} f(t) = -Σ_{t=2}^{T} f(t)   (5.19.1)

This gives us a definition of ΔR(1) to be sure, but this definition does not fit with the property of all the other ΔR's:

  ΔR(t) = f(t),   t = 2, ..., T

It will in most cases be much larger in absolute value than all the other ΔR's, being the sum of all the f's except the first, and with opposite sign. If we are considering real data, we can contrive to make ΔR(1) as defined above fit into the definition of all the others in either of the following ways. Suppose the R's are a price sequence, say of commodities or common stock. We can pick a span of data such that the initial R(0) and final R(T) are the same. Just draw a horizontal line on the chart and pick data between any two points in time where the data crosses the horizontal line. In that case then R(0) = R(T), and ΔR(1) = R(1) - R(0) = R(1) - R(T) = f(1); by definition, ΔR(1) is no longer an exceptional case. We subtract from all the data R(0) = R(T). Our random walk is about this horizontal line and begins and ends on it. You will see that this procedure picks data of observed zero linear trend (zero estimated expected advance) and zero starting point. A constant (not necessarily R̄ = (1/T) Σ_{t=1}^{T} R(t)) has been removed from R(t).
It may be that the data does not allow this procedure. Suppose it has a non-zero linear trend in it and does not return to its starting value. Possibly it is a random walk of non-zero expected advance. In such a case you can modify the data a little, and achieve the same property for ΔR(1) as before. In such a case draw a straight line from one point before the first, (R(1)), i.e., R(0), to the last value of R(t), R(T), and then redefine R(t) as the deviation from this straight line. This procedure subtracts the mean difference f̄ = (1/T) Σ_{t=1}^{T} f(t) from all the differences f(t). It also subtracts some constant (not necessarily R̄) from all the R(t). We again achieve our objective that ΔR(t) = f(t), as modified by the above procedure, for all values of t. f(t) has also been slightly redefined by subtracting the experimental mean f̄ from f(t).
So we have, with this modified data R(t) (corrected),

BY DEFINITION

  R_corr(t) = Σ_{t'=1}^{t} [f(t') - f̄],   R_corr(0) = 0

FOURIER REPRESENTATION

  R_corr(t) = Σ_{j=0}^{T-1} w_j e^{2πijt/T}   (5.19.2)

  f(t) = ΔR_corr(t) = R_corr(t) - R_corr(t - 1)
       = Σ_{j=0}^{T-1} w_j (e^{2πijt/T} - e^{2πij(t-1)/T})
       = Σ_{j=0}^{T-1} w_j (1 - e^{-2πij/T}) e^{2πijt/T}   (5.19.3)

We also had as Fourier representation of f(t)

  f(t) = Σ_{j=0}^{T-1} a_j e^{2πijt/T}   (5.19.4)

The equality f(t) = ΔR_corr(t) now holds for all values of t. Without our subtraction procedure it would hold for all values of t except just one, t = 1. What we subtracted from the data, a constant and a linear term, are just Fourier expansion terms of the zero eigenvalue and eigenfunctions. It does not follow from our new definition for R(t) that R̄_corr = (1/T) Σ_{t=1}^{T} R_corr(t) = 0. There may still be some cosine of zero frequency left in our data for R(t), but not for f(t). Matching coefficients in our two expressions for f(t) (5.19.3 and 4), we find
  w_j (1 - e^{-2πij/T}) = a_j,   or

  w_j = a_j / (1 - e^{-2πij/T})   (5.19.5)

You will note that the above expression for w_j is effective except for j = 0, in which case it yields an indeterminacy, 0/0, since a_0, the mean of the corrected f(t), is zero, and (1 - e^0) = 0. w_0 has a determinate value, however. The general formula for finding Fourier coefficients gives

  w_0 = (1/T) Σ_{t=1}^{T} R_corr(t) = R̄_corr   (5.19.6)

So the spectrum of our modified R(t), R(t) (corrected), is

  w_0 w_0^* = R̄_corr²,   j = 0
  w_j w_j^* = a_j a_j^* / (2(1 - cos(2πj/T))),   j = 1, 2, ..., T - 1   (5.19.7)
So the expected spectrum is given by

  j = 0:   E w_0 w_0^* = E(R̄_corr)² ≈ σ_f² T/3
  j ≠ 0:   E w_j w_j^* = E a_j a_j^* / (2(1 - cos(2πj/T))) = (σ_f²/T) / (2(1 - cos(2πj/T)))   (5.19.8)

Note that the relation (5.19.7) of the spectrum w_j w_j^* of the corrected random walk (trend removed, about the straight line from the first to the last step) to the spectrum of white noise a_j a_j^* (modified by subtraction of a_0 = f̄) is an exact numerical relation, holding separately for each point on both spectra. Each individual zig and zag of the white noise spectrum is reproduced exactly for the random walk R_corr(t), around the "line" 1/(2(1 - cos 2πj/T)). If you imagine the white noise spectrum plotted around a line of ordinate unity on a log ordinate scale, the abscissa running linearly from ω = 2π/T to π (j = 1 to T/2), then "stretching" or "shearing" this line to the shape of log[1/(2(1 - cos 2πj/T))] will give you the spectrum (on a log ordinate scale) of your random walk R_corr(t) exactly. All of the statements about the distribution of the power spectrum coefficients of white noise apply to the distribution of the random walk coefficients about the mean or median line 1/(2(1 - cos 2πj/T)), exactly.

We have illustrated the above with spectra of wheat price futures (Figures 5.19-1 and 2) and their differences, from Granger and Labys¹³ (pp. 71, 73). The match-up seems almost perfect. There are slight discrepancies at about j = 45.
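The exact point-by-point relation can also be verified numerically. The sketch below is mine, not the book's: it builds a corrected walk whose steps have the mean difference removed, so that the cyclic differences reproduce the corrected white noise exactly, and then compares the two spectra.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 512
f = rng.standard_normal(T)

# Subtract the mean difference; the corrected walk then ends where it
# started, and Delta R_corr(t) = f(t) - fbar for every t, cyclically.
fc = f - f.mean()
R = np.cumsum(fc)                        # R_corr, with R_corr(T) = 0

a = np.fft.fft(fc) / T                   # white-noise coefficients a_j
w = np.fft.fft(R) / T                    # random-walk coefficients w_j

j = np.arange(1, T)
ratio = np.abs(w[1:]) ** 2 / np.abs(a[1:]) ** 2
shape = 1.0 / (2.0 * (1.0 - np.cos(2.0 * np.pi * j / T)))
print(np.max(np.abs(ratio / shape - 1.0)))   # agreement to rounding error
```

Every zig and zag of the white-noise spectrum is reproduced in the walk's spectrum, scaled by the "line" 1/(2(1 - cos 2πj/T)).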
¹³ There is a little uncertainty about the confidence limits because of uncertainty in the original reference as to how the spectrum was smoothed: with a window only (Granger and Hatanaka, p. 43), or with a window plus smoothing of 3 points of unequal weight (Granger and Hatanaka, p. 60). Note that where Granger et al. call the limits upper and lower 95% confidence limits, I have referred to a 90% band, since it has 90% of the probability inside it, 5% above it and 5% below it.

Fig. 5.19-1 Power Spectrum of First Differences of Wheat Futures Prices Monthly, 1950-65. From Labys and Granger.
Fig. 5.19-2 Power Spectrum of Wheat Price Futures Monthly, 1950-65. From Labys and Granger.
One can then examine whether certain frequencies might be in significant excess or deficiency relative to white noise (a flat spectrum) or its sum, the random walk spectrum ~ 1/(2(1 - cos 2πj/T)). There is another type of plot for which both the power spectrum and the frequency scales are logarithmic. Figures 5.19-3, 4 are the original semi-log plots. Figures 5.19-5 and 6 are log-log. The log-log plot expands the lower half of the frequency scale enormously. You will note that one half of the observed spectrum points, from ω = π/2 to π, or j = T/4 to j = T/2, fall in a small range (a factor of 2) at the extreme right. The low frequency half of the random walk spectrum, from j = 1 (ω = 2π/T) to j = T/4, is closely 1/(2(1 - cos 2πj/T)) ≈ T²/(2πj)² = 1/ω², a line of slope -2. This kind of plot is not symmetric about ω = π. The entire unplotted symmetric part of the spectrum from ω = π to 2π would also be compressed into a factor of 2. This type of double logarithmic plot for power spectra is very common in many physical applications. Such a plot tends to bring out power-law spectra as straight lines. The random telegraph signal and shot noise, both with cov ~ e^{-μτ}, have power spectra of the form ~ 1/(μ² + ω²). So for ω ≫ μ their expected power spectra are also indistinguishable from that for a random walk.
Fig. 5.19-3 Power Spectrum of Composite Weekly SEC Index 1939-62, Semi-Log Plot. From Granger and Morgenstern.

Fig. 5.19-4 Power Spectrum of Woolworth Stock Prices 1946-60, Semi-Log Plot. From Granger and Hatanaka.
Fig. 5.19-5 The Data of Fig. 5.19-3 on a Log-Log Plot.
Fig. 5.19-6 The Data of Fig. 5.19-4 on a Log-Log Plot.
Fig. 5.19-7 Power Spectrum of the Radial Solar Wind Velocity V_R (Mariner 2, 29 Aug-29 Sep 1962 and 30 Sep-31 Oct 1962). Wavenumber scale as in (Coleman 1968). From Annual Review of Astronomy and Astrophysics, Vol. II, 1973, Ed. L. Goldberg, et al.
5.20 The Exact Spectrum of a Random Walk

The preceding discussion of the expected spectrum of a "doctored" random walk R_corr(t) presents an interesting contrast to what you get, derived below, without any doctoring. Interestingly enough, the "shape" of the expected spectrum of the uncorrected random walk is exactly the same as for the corrected walk. The magnitude and the distribution of the coefficients are different.

We begin as before with the Fourier expansion of f(t), as independent elements. In this case we allow for a possibly non-zero but finite expected value Ef ≠ 0, Ea_0 ≠ 0. The Fourier representation of f(t) is:

  f(t) = Σ_{j=0}^{T-1} a_j e^{2πijt/T},   t = 1, 2, ..., T   (5.20.1)

The random walk for non-zero expected advance is:

DEFINITION

  R_unc(t) = Σ_{t'=1}^{t} f(t')   (5.20.2)

FOURIER REPRESENTATION

  R_unc(t) = Σ_{j=0}^{T-1} R_j e^{2πijt/T}   (5.20.3)
We want to evaluate the R_j in terms of the a_j. We substitute (5.20.1) into (5.20.2):

  R_unc(t) = Σ_{t'=1}^{t} Σ_{j=0}^{T-1} a_j e^{2πijt'/T}   (5.20.4)

We can commute the summation order and sum the t' series directly. To carry out the t' summation we need the following formula, and note some properties. As a geometric series,

  Σ_{t=1}^{n} r^t = r(1 - r^n)/(1 - r),   r ≠ 1   (5.20.5)
                 = n,   if r = 1 or r^T = 1 (real)   (5.20.6)

For these formulae we will take r = e^{2πij/T}, and the general forms are valid except when j = 0, ±T, ±2T, etc., or r = r^T = 1, real. These special cases follow from the definition. We have already quoted this special result and given both a geometric derivation and an algebraic derivation (Section 5.11). From the above expressions, R(t) becomes

  R_unc(t) = Σ_{j=0}^{T-1} a_j Σ_{t'=1}^{t} e^{2πijt'/T}   (5.20.7)

Splitting off the term j = 0, one obtains

  R_unc(t) = a_0 t + Σ_{j=1}^{T-1} a_j Σ_{t'=1}^{t} e^{2πijt'/T}

Using (5.20.5),

  R_unc(t) = a_0 t                                                        (A)
             + Σ_{j=1}^{T-1} a_j e^{2πij/T} / (1 - e^{2πij/T})            (B)
             - Σ_{j=1}^{T-1} [a_j e^{2πij/T} / (1 - e^{2πij/T})] e^{2πijt/T}   (C)
     (5.20.8)

a_0 is the mean slope of the random walk. It may or may not have an expected value zero.
You will notice that this expression consists of a linear trend, term A, a constant independent of t, term B, and an expansion term C whose coefficients differ only slightly from those derived for a random walk with a constant and linear trend subtracted. The contribution of the terms in C alone to a power spectrum is identical to that derived in the preceding section for a "doctored" random walk, R_corr.

At t = 1 we find, exactly,

  R_unc(t = 1) = a_0 + Σ_{j=1}^{T-1} a_j e^{2πij/T} = f(1)   (5.20.9)
In order to express the entire expression for R_unc(t) as a Fourier expansion in e^{2πijt/T}, we need to expand the linear term a_0 t in e^{2πijt/T}. So we need an expression of the form

  a_0 t = Σ_{j=0}^{T-1} ζ_j e^{2πijt/T},   ζ_j = (1/T) Σ_{t=1}^{T} a_0 t e^{-2πijt/T}   (5.20.10)

This sum for ζ_j is of the form, with ρ = e^{-2πij/T},

  ρ + 2ρ² + 3ρ³ + ... + Tρ^T

We can use our previous expressions for geometric series to evaluate this sum as follows:

  ρ + 2ρ² + ... + Tρ^T = (ρ + ρ² + ... + ρ^T) + (ρ² + ... + ρ^T) + ... + ρ^T
                       = Σ_{k=1}^{T} ρ^k (1 - ρ^{T-k+1})/(1 - ρ)

Note that with ρ = e^{-2πij/T}, j ≠ 0, we have ρ^T = 1, so the sum collapses:

  Σ_{t=1}^{T} t ρ^t = Tρ/(ρ - 1),   j ≠ 0   (5.20.11)

and hence, from (5.20.10),

  ζ_j = a_0 ρ/(ρ - 1) = a_0 / (1 - e^{2πij/T}),   j ≠ 0   (5.20.12)

For j = 0, ρ = 1, and directly

  ζ_0 = (1/T) Σ_{t=1}^{T} a_0 t = a_0 (T + 1)/2

Putting all this together we have

  R_unc(t) = a_0 (T + 1)/2                                               (from A)
             + Σ_{j=1}^{T-1} a_j e^{2πij/T} / (1 - e^{2πij/T})            (B)
             + Σ_{j=1}^{T-1} [(a_0 - a_j e^{2πij/T}) / (1 - e^{2πij/T})] e^{2πijt/T}   (from A and C)
     (5.20.13)
A, B, C refer to source terms in (5.20.8). This is now in the form of a Fourier expansion (5.20.3), a constant plus a sum of terms in e^{2πijt/T}:

  R_0 (uncorrected) = a_0 (T + 1)/2 + Σ_{j=1}^{T-1} a_j e^{2πij/T} / (1 - e^{2πij/T})

  R_j (uncorr.), j ≠ 0 = (a_0 - a_j e^{2πij/T}) / (1 - e^{2πij/T})   (5.20.14)

We have sketched in Figure 5.20-1 the various contributions of these terms to the Fourier expansion of the corrected and uncorrected random walk, which we have deliberately drawn with an appreciable non-zero expected advance, or linear trend. Finally we give the power spectrum, and then take its expected value, in order to see what we might get on the average if we repeated the experiment several times.

Fig. 5.20-1 The Exact Fourier Composition of the Spectrum of a Random Walk of Non-Zero Expected Advance.
For the power spectrum of an uncorrected walk (5.20.3), from (5.20.14):

    j = 0:       R_0^2 = \left[ \frac{a_0(T+1)}{2} - \sum_{j=1}^{T-1} \frac{a_j}{1 - e^{-2\pi i j/T}} \right]^2    (5.20.15)

Uncorrected walk; (5.20.15) is pure real.

    j \neq 0:    R_j R_j^* = \frac{(a_j - a_0 e^{-2\pi i j/T})(a_j^* - a_0 e^{2\pi i j/T})}{2(1 - \cos(2\pi j/T))}    (5.20.16)
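The denominator identity used in (5.20.16), |1 - e^{-2\pi i j/T}|^2 = 2(1 - \cos(2\pi j/T)), is a one-line numerical check (my own sketch; T = 24 is arbitrary):

```python
import cmath
import math

T = 24
for j in range(1, T):
    w = cmath.exp(-2 * cmath.pi * 1j * j / T)
    # |1 - w|^2 = (1 - w)(1 - w*) = 2 - 2 cos(2 pi j / T)
    assert abs(abs(1 - w) ** 2 - 2 * (1 - math.cos(2 * math.pi * j / T))) < 1e-12
print("denominator identity verified")
```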
We can compare the above results with those for the corrected random walk, Equations 5.19.6, 7. Those results hold for the original random walk whether Ea_0 = 0 or not. For zero expected advance, the spectrum of the corrected walk is (5.19.8):

Corrected Walk, Zero Expected Advance

    j = 0:       E w_0 = E(R_{0,corr.})^2 = E\left[ \frac{1}{T} \sum_t R_corr.(t) \right]^2 \approx \sigma^2 T/3    (5.20.17)

    j \neq 0:    E w_j = E R_j R_j^* = \frac{E a_j a_j^*}{2(1 - \cos(2\pi j/T))} = \frac{\sigma^2/T}{2(1 - \cos(2\pi j/T))}    (5.20.18)

For the uncorrected walk, for the particular case of zero expected advance, Ea_0 = 0, the expected spectrum (the expected value of (5.20.15, 16)) is:

Uncorrected Walk, Zero Expected Advance

    j = 0:       E(R_0^2) \approx \sigma^2 T/2 + E(R_{0,corr.})^2 \approx \sigma^2 T (5/6)

    j \neq 0:    E R_j R_j^* = \frac{E a_0^2 + E a_j a_j^* + E a_0 (a_j + a_j^*)}{2(1 - \cos(2\pi j/T))}    (5.20.19)
Note that in this particular case, for Ea_0 = 0,

    E a_0^2 = E a_j a_j^* = \sigma^2/T = E f^2/T    (5.20.20)
The expected value E a_0 (a_j + a_j^*) is zero. This is sometimes expressed as the independence of coefficients of differing j, in the probability sense. It is a consequence of the peculiar properties of a geometric series sum when r^T = 1, real:

    E a_0 a_j^* = E\left[ \frac{1}{T} \sum_{t=1}^{T} f(t) \right] \left[ \frac{1}{T} \sum_{t'=1}^{T} f(t') e^{-2\pi i j t'/T} \right]
                = \frac{1}{T^2} \left[ \sum_{t=1}^{T} E f^2(t) \, e^{-2\pi i j t/T} + \sum_{t \neq t'} E f(t) f(t') \, e^{-2\pi i j t'/T} \right] = 0, \quad j \neq 0    (5.20.21)

since \sum_{t=1}^{T} e^{-2\pi i j t/T} = 0 for j \neq 0, and E f(t) f(t') = 0 for t \neq t'.
So for a random walk of zero expected advance, Ea_0 = 0, the frequency-dependent part (j \neq 0) of the (expected) power spectrum of the uncorrected walk is just twice that for the corrected random walk, since E a_0^2 = E a_j a_j^*. The j = 0 term for the uncorrected walk (expected advance zero) is slightly more than twice (5/6 vs. 1/3) that for the corrected walk.
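The factor of two shows up in a small Monte Carlo experiment. The sketch below is my own, not part of the original text (T, the trial count M, and the seed are arbitrary): for many zero-trend walks it averages the j \neq 0 spectrum of the uncorrected coefficients against the trend-removed coefficients a_j / (1 - e^{-2\pi i j/T}), and checks that the ratio of the averages is near 2.

```python
import cmath
import random

random.seed(0)
T, M = 8, 1500
unc = [0.0] * T   # accumulated |R_j|^2, uncorrected walk
cor = [0.0] * T   # accumulated |R_j|^2, linear trend removed

for _ in range(M):
    f = [random.gauss(0.0, 1.0) for _ in range(T)]          # zero expected advance
    a = [sum(f[t - 1] * cmath.exp(-2 * cmath.pi * 1j * j * t / T)
             for t in range(1, T + 1)) / T for j in range(T)]
    for j in range(1, T):
        w = cmath.exp(-2 * cmath.pi * 1j * j / T)
        unc[j] += abs((a[j] - a[0] * w) / (1 - w)) ** 2     # uncorrected coefficient
        cor[j] += abs(a[j] / (1 - w)) ** 2                  # trend-corrected coefficient

for j in (1, 2, 3):
    ratio = unc[j] / cor[j]
    assert 1.5 < ratio < 2.5   # expected value of the ratio is 2
print("uncorrected/corrected spectrum ratio is about 2")
```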
Although the terms E a_0 (a_j + a_j^*) do not contribute to the expected
value of the power spectrum of the uncorrected walk, they do have
an effect on the distribution of the coefficients, or in other words on how the actual observed spectrum values scatter around the expected value E(a_j a_j^* + a_0^2). We saw that a_j a_j^* had an exponential distribution, or chi-square for two degrees of freedom. The distribution of a_0 (a_j + a_j^*)
is the distribution of the product of two uncorrelated normal variables. This is a symmetric distribution, rather concentrated about the origin. If u is the product of two such variables, its
distribution is of order -\log(u/\sigma_u) for u < \sigma_u and of order e^{-u/\sigma_u} for u > \sigma_u.
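Both qualitative features of this product distribution, symmetry and concentration at the origin, are easy to confirm by sampling. This is my own sketch (sample size and seed arbitrary); for two unit normals \sigma_u = 1, and a normal variable with the same spread would put only about 8% of its mass within 0.1\sigma of zero.

```python
import random

random.seed(1)
M = 20000
# u = product of two independent N(0, 1) variables, so sigma_u = 1
u = [random.gauss(0.0, 1.0) * random.gauss(0.0, 1.0) for _ in range(M)]

mean = sum(u) / M
near_zero = sum(1 for v in u if abs(v) < 0.1) / M

assert abs(mean) < 0.05    # symmetric about the origin
assert near_zero > 0.15    # vs. about 0.08 for a normal with the same sigma
print("product-of-normals distribution checked")
```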
In summary the uncorrected random walk of zero expected advance has an expected spectrum identical in shape (j # 0) to that for the corrected random walk, but it is twice as large. The probability distribution of the power spectrum values is distinctly different for the two cases.
For a random walk of non-zero expected advance, Ea_0 \neq 0, the corrected random walk has exactly the same shape and size of expected spectrum as the corrected random walk for zero expected advance. This is understandable: the "correction" just takes out the
non-zero expected advance. The uncorrected random walk (non-zero advance, Ea_0 \neq 0) has exactly the same shape (j \neq 0 terms) of its expected spectrum as the corrected walk (trend removed), but the level, or magnitude, of the terms for the uncorrected walk is a great deal higher. The distribution also is different. The magnitude in fact increases with the square of the non-zero expected advance. You can
verify these statements from the expected value of Equations (5.20.15, 16) with Ea_0 \neq 0, by replacing a_0 by Ea_0, as large as you please, in the "uncorrected" formulae above; Ea_0 would be just the slope of the linear trend, or non-zero expected advance. You can see that in such a plot, except for the j = 0 term (frequently omitted from a power spectrum), the expected spectra of the uncorrected and corrected random walks, zero or non-zero expected advance, all have the same shape.
5.21

A Comment on the Identity of Expected Spectra for Random Walks, Linear Trends and Steps

One might ask, how can it be that the expected power spectrum of a linear trend a_0 t, t = 1, 2, ..., T, and the expected power spectrum of a random walk can have the same shape? Let us look at some details.

For the linear trend, a slanted row of dots emanating from the origin. Definition: a_0 t, t = 1, 2, ..., T.

FOURIER EXPANSION