The Stock Market and Finance from a Physicist's Viewpoint











Copyright © 1977 M. F. M. Osborne

First Printing 1977
Second Printing 1995

All rights reserved
Printed in Canada
10 9 8 7 6

ISBN 0-9646292-0-8
Library of Congress Catalogue Number 95-69502

Design by Kristen McDougall
Mathematics set by Elizabeth Dolan
Typeset with PCTeX using Times Roman font

Crossgar Press, Inc.
2116 West Lake of the Isles Parkway
Minneapolis, MN 55405-2425

Books may be ordered directly from the Publisher at the above address by check ($19.95 plus $3.00 for shipping, handling and tax).










CONTENTS

  Some Basic Principles of Investigation
  Some Fantasies of Wall Street
  Market Making
  Does a Capital Market Raise Capital?
  Gambling Games and Liquidity
  Supply and Demand. Economics vs. Reality
  Supply and Demand in the Real World
  Continuity and Derivatives of Real World Supply and Demand
  Comments on Continuity and Continuous Markets
  Continuity of a Variable vs. ...
  The Limit Order Only Interrupted Market (LOOIM) as a Foundation for Auction Markets
  The Continuous in Time Single Market Maker
  Definition of a Market Maker
  Principles of Market Making: Required Information and Criteria; Limits or Boundaries
  Definition of Profit in Market Making
  The Effect of Lower Limits Only on Inventory; The Effect of Upper Inventory Limits on Price
  The Return to Optimum Inventory When the Limits are Exceeded
  Tactical Inventory Zones
  Tactics With the Minimum (1/8) Spread and a Size Greater Than One Round Lot
  Alternate Methods of Balancing Inventory Without a Direct Suspension
  Comments on Continuity. Methods of Preserving the Image of a Continuous Market Maker
  Continuous Market Making With Fixed Minimum and Optimum Inventory, but No Maximum
  A Digression on Boundaries, Limits and the Definition of a Theory
  Some Comments on Tape and Chart Reading
  Properties of Transacted Price Sequences
  Several Market Makers in Competition
  Commodity Markets, With No "Official" Market Maker
  An Official Market Maker With a Book of Limit Orders. The Exchange Specialist
  An Estimate of the NYSE Specialist Trading ...
  The Exchange Specialist and the Third Market
  Summary of Market Making. Limits on the Existence of Markets
  Price Changes Over Fixed Time Intervals: Day, Week or Month
  The Dark Cloud Axiom. The Silver Lining
  Impossibility Theorems or Statements from Natural Science
  Impossibility Theorems from Mathematics. Godel's Theorem
  Examples of Random Walks
  The Mean and Variance of One Step in a Random Walk
  The Expected Advance and Dispersion for a Sum of j Steps in a Random Walk
  The Estimated Advance and Dispersion for a Given Random Walk
  Tests for Properties of Random Walks. Square Root of Time Diffusion Law
  Estimates of Advance h and Dispersion δ. Testing Statistical Hypotheses
  Evidence for Departures From a Simple Random Walk in Stock Prices
  The Formula for Random Walks vs. Standard Deviation of the Mean
  Random Walks with a Dispersion per Step Small Compared to the Expected Advance
  Sequential vs. Across the Market Dispersion
  Brownian Motion as the Continuous Limit of a Random Walk
  Velocity vs. Derivative of Position
  Random Walks Whose Properties Change With Time
  Random Walks in Dollars of Price vs. Log of Price
  The Effect of Dispersion on Arithmetic vs. Geometric Mean
  "Equality of Opportunity in a Growing Economy"
  Alternative Forms of Percentage Return and Logarithmic Random Walks
  A Hybrid Random Walk, Return Including the Dividends

VOLUME II. ANALYSIS OF STOCK MARKET DATA
  Preface to Volume II

Chapter 4. Measurement and Methods of Statistical Analysis
  Digression on Measurement
  The Approximation of a Functional Relationship in Measurement
  Some Mathematical Truths Which are not Necessarily So
  Pryxytzl, Baloney and Red Herring
  Examples of Infinite Moments
  Time of Flight of Gas Molecules
  Transformations of the Independent Variable
  Order Statistics
  Skew Random Walks. Fluctuations in a Panel of a Histogram
  An Example in Which the Histogram Panel Test Doesn't Work
  A Comment on the Form of the Chi-square Test
  Counting of Events with a Poisson Distribution
  Relation of Correlation, Regression, Contingency by Chi-square and the Method of Coincident Events
  Seasonality in Stock Prices. An Example of a Problem with Three Variables. Good and Bad Practice in Statistics
  Some General Methods of Attacking Large Batches of Data. Preliminaries of Analysis of Sequential Data
  Extreme Observations. Discoveries in Noise. Some Biases from the History of Science
  Examining Two Variables at Once
  The Quantitative Expression of "Eyeball" Statistics

Chapter 5. Statistics of Sequential Data. Power Spectra
  Examining Two Variables at Once Where One is the Time. Sequential Data
  The Definition of Strict Stationarity
  The Scatter Diagrams of Autocovariance
  Comments on Auto- and Cross-Covariance
  Definition of Weak Stationarity
  The Definition of Ergodicity
  Examples Illustrating Strict Stationarity, Weak Stationarity, and Ergodicity
  The Doctoring of Sequential Data Prior to Statistical Analysis, in Particular for Power Spectra
  A Comment on the Disparate Mathematical Concepts Which Can Be Encompassed in Fourier Analysis
  Formulas for Fourier Expansion, Expansion Coefficients, Power Spectrum and Autocovariance
  A Digression on the Orthogonality Relation
  A Digression on Aliasing
  The Interpretation of the Mean and Linear Trend as Fourier Expansion Coefficients. Mathematical Paradoxes
  The Spectrum of White Noise, a Sequence of Independent in Probability Values of f(t)
  The "Solidity" and "Smoothness" of a Power Spectrum
  How to Inspect and Check a Published Power Spectrum
  White Noise as a Standard of Comparison
  The "Spectrum" of a Random Walk Itself
  The Approximate Evaluation of a Random Walk Spectrum
  The Exact Spectrum of a Random Walk
  A Comment on the Identity of Expected Spectra for Random Walks, Linear Trends and Steps
  The Running Summation and Difference Operators
  The Structure Function and Interquartile Range of Differences
  The Relation of the Shape of the Structure Function and Power Spectrum. Fractional Random Walks
  Some Unsolved Problems
  Some Unexplored Problems in Security Market Data

FIGURES

Chapter 2
  3-1 Prices as Function of Supply and Demand
  Supply and Demand, as Functions of Price. The Economist's Diagram Turned Over
  Mrs. Jones Demand Function
  Supply and Demand Functions of Price, for Buyers and Sellers
  Supply and Demand of Apple Sauce in the Grocery Store
  13-1 Examples of Simple (Symmetric) Market Making
  17-1 Examples of Quotes with Minimum Spread and Effective Size k of the Quote
  18-1 Trading with a Ratchet
  23-1 Quotes of Competing Market Makers

Chapter 3
  5-1 A Line and a Point Not on a Line
  Parchesi as a Set of Interacting Random Walks
  Probability Distribution for One Step of a Random Walk
  The Parabolic Sleeve of a Random Walk
  Confidence Belt for the Mean of a Sum of i Steps, or i Observations
  Price Charts for IBM and Reynolds Metals for the Era of Table 3.9-1. Charts from Securities Research Company
  Determination of Fractile Ranges
  10-2 Price Charts for Stocks of Problem 4. Charts from Securities Research Co.
  13-1 Histogram of Student SAT Scores (Schematic)
  14-1 Random Walk of Student to the Parking Lot. The Dispersion per Step, δ, is Small Compared to Expected Advance
  Across the Market Dispersion for Fixed Income Security Prices, Earnings, and Earnings per Dollar of Price
  19-1 The Gambler's Random Walk in Roulette
  19-2 The Financier's Random Walk in Roulette
  19-3 The Financier's Walk on a $ Scale of Value (Left) and Initial % Scale (Right)
  19-4 The Gambler's Walk on a Logarithmic Scale of Value

Chapter 4
  5-1 An Experiment Giving the Cauchy Distribution
  6-1 Experiment to Determine Molecular Velocities
  6-2 Distribution of Velocity and Time of Flight
  7-1 Transformation to y = x^2 of a Uniform Distribution of x
  9-1 A Skew Random Walk
  Schematic Histograms with an Excess or Deficiency in One Panel
  12-1 The Poisson Distribution for an Expected Number λ = 0.055
  12-2 The Poisson Distribution p(k), λ = Expected Number
  13-1 Scatter Diagram of Weight vs. Height (Schematic)
  14-1 Cumulated Distribution of Monthly Changes in the Dow
  15-1 A "Picture" of a Column of Data on Prices
  15-2 Distribution Function of Closing Prices for July 31, 1956 (All Items, NYSE)
  15-3 Distribution for log_e P on July 31, 1956 (All Items NYSE, Common and Preferred)
  15-4 Distribution Function of log_e P for Preferred Stocks (NYSE, July 31, 1956)
  15-5 Distribution Function of log_e P for Common Stocks (NYSE, July 31, 1956)
  15-6 Distribution Function of log_e P for Common Stocks (ASE, July 31, 1956)
  15-7 Cumulated Distributions of log_e P for NYSE and ASE (Common Stocks)
  15-8 Histogram of Measurements of Machine Parts (Schematic)
  18-1 Scatter Diagram of Volume vs. Price Change (Schematic)

Chapter 5
  3-1 Random Telegraph Signal
  Shot Noise
  Autocovariance and Autocorrelation of Random Digits
  Autocovariance and Autocorrelation of a Running Sum of Two Independent Variables
  Autocovariance and Autocorrelation for a Running Sum of Five Independent Variables
  Observed Autocorrelation of a Random Walk (Schematic)
  Test of Data for the Third Condition of Ergodicity. Sound Pressures vs. Time
  12-2 Fourier Representation of f(t) = const
  14-1 A Random Sequence of Normal Variables of Unit Variance and Zero Mean
  14-2 A Random Sequence of Variables of Unit Variance and Zero Mean: +1 for Even Digits, -1 for Odd Digits
  14-3 The Distribution of Real Component b, or Imaginary Component c, of a Fourier Expansion
  14-4 The Joint Distribution of Fourier Expansion Coefficients in Cartesian Coordinates
  14-5 The Rayleigh Distribution for the Modulus |a_j| = (b_j^2 + c_j^2)^(1/2) of a Fourier Expansion Coefficient a_j = b_j + i c_j
  14-6 The Distribution of the Power Spectrum Values P = a_j a_j* = b^2 + c^2
  14-7 The Distribution of the Power Spectrum Values P, on a Log Scale
  19-1 Power Spectrum of First Differences of Wheat Futures Prices, Monthly, 1950-65 (from Labys and Granger)
  19-2 Power Spectrum of Wheat Price Futures, Monthly, 1950-65 (from Labys and Granger)
  19-3 Power Spectrum of Composite Weekly SEC Index, 1939-62, Semi-Log Plot (from Granger and Morgenstern)
  19-4 Power Spectrum of Woolworth Stock Prices, 1946-60, Semi-Log Plot (from Granger and Morgenstern)
  19-5 Fig. 5.19-3 on a Log-Log Plot
  19-6 Fig. 5.19-4 on a Log-Log Plot
  19-7 Power Spectrum of the Radial Solar Wind Velocity (from Annual Review of Astronomy and Astrophysics, Vol. 11, 1973, Ed. L. Goldberg)
  20-1 The Exact Fourier Composition of Spectrum of a Random Walk of Non-Zero Expected Advance
  21-1 Power Spectrum of a Jump (from Granger and Morgenstern)
  21-2 Power Spectrum for Linear Trend (from Granger and Morgenstern)

TABLES

  Expected Advance h and Dispersion δ for Weekly Changes in log_e Price, Δ log_e P(t); Data from Cootner
  3.10 Data on Expected Advance h and Dispersion δ of Problem 4 (Figs. 3.10-2)
  3.21 Table of Payoffs for Investors in a Growing Economy
  3.21 Arnold Bernhard's Track Record for OTC Stocks
  3.22 Table of Alternative Forms of Random Walks with Different Measures for Utility Function of Money
  4.14 Data for Problem 6. Advances and Declines, February vs. August
  Comparison of Different Observers
  Comparison of Different Observers
  Comparison of Different Observers
  Monthly Advance and Decline for Nonoverlapping Eras
  5.16 Given χ^2 with Probabilities to Determine Confidence Belt
  Martin's Statement: Does a Capital Market Raise Capital?
  How Economists Get Supply and Demand Curves
  Examples of LOOIM
  Sequential Dispersion σ and Expected Advance h of Boom and Bust Stocks
  Birth and Death of Securities
  Seasonality in Stock Prices. Histograms with Respect to Eighths

Major discoveries in any field very often are made by outsiders, persons who bring a new perspective to the subject, or apply tools as yet unused from another discipline. This occurred at least twice in the annals of investment markets. The first was Louis Bachelier's application of the Brownian motion model to French Rentes in his doctoral thesis in mathematics in 1900, which preceded Einstein's paper on Brownian motion by a half dozen years. The second was Maury Osborne's seminal 1959 article, "Brownian Motion in the Stock Market." Bachelier was a mathematician; Osborne is a physicist.

Osborne was a graduate student in astronomy at the University of California, Berkeley, who came to work at the U.S. Naval Research Laboratory, Washington, D.C., in June of 1941, and spent his entire career there. During World War II he worked on problems of underwater sound, submarine detection and underwater explosions. After the war, physicists were encouraged to work on a variety of topics of their own choosing. Osborne examined, among other topics, the aerodynamics of insect flight, the hydrodynamical performance of migrating salmon, and the stock market as a slow motion source of random noise, a subject of considerable interest in many different fields. This resulted in his 1959 paper, "Brownian Motion in the Stock Market." Its thesis, that common stock prices followed a random walk, was a bombshell within the field. It was the first of a series of investigations that led to a complete revision of view of the stock market, although perhaps not always in the way Osborne would have anticipated or approved.

More than a decade later, Osborne was asked to teach a course in the stock market and finance at the University of California, Berkeley. The course he taught was not the typical course in investments, but something quite different. Its content bore directly on that subject, but in a divergent and critical way. It raised issues of the nature of the underlying data, how it was to be examined, how theories about it should be derived, which current economic theories were useful and which were not, and the very nature, even, of appropriate methods of statistical analysis and scientific investigation.

This book is the second printing of the content of that course. The words and content of the original have not been changed; the student problems and references are as they appeared in the original course. Even the figures, as drawn by Osborne, have been reproduced in their original form. The retention of the original was done not for historical purposes, but because Osborne's approach and insights are not restricted to past eras; they apply as well today as they did in the past.

Overall, the book is a tour de force: instructive, playful, serious, ranging from easy to extremely difficult, always clearly written, and full of insightful viewpoints. Throughout, Osborne states matters as he sees them, whether it be finding fault with common but misleading statistical practices, pointing out the proper investigative approach, or the only appropriate way to describe the relationship between supply, demand and price. The book relates the views of an unusually objective and thoughtful mind on a wide range of topics of economic and intellectual importance as they bear on the stock market and finance.

Joseph E. Murphy



In the fall of 1972 the author was a visiting lecturer in the graduate school of business administration at the University of California in Berkeley, teaching two courses labeled "Security Markets and Investment Policies" and "Seminar in Investments." As a professional physicist and amateur pedagogue, I found preparing and delivering these lectures the most concentrated and prodigious mental effort I had ever made. I would never have believed a group of students could make the professor trot so quick. On returning to Washington I thought that the record of such a labor should be better preserved than the rather disorderly lecture notes I had accumulated, so I sat down with a tape recorder and talked them into it. These notes are the result, after several typings, considerable expansion certainly, and clarification I hope.

These lectures were originally produced for graduate students in business administration, and I have tried to preserve this level of presentation. So it would be appropriate to point out what level of mathematical preparation is required. Most of the students had had, and needed, an elementary course in statistics: calculation and interpretation of correlation and linear regression; expected values, significance probabilities, how to make a chi-square test. About half of them had had an exposure to the basic ideas of calculus: derivatives, integrals and limits.

This too was very useful, but not absolutely essential. Some of the better students without a background in the exact sciences felt a little "snowed" by the use of sines and cosines when we discussed power spectra. Some had trouble with logs to the base e, and log paper. The original lectures on power spectra were primarily a qualitative discussion of when and when not to use power spectrum analysis; the present write-up on this topic goes into considerably more mathematical detail.

It would be literally correct, and at the same time very misleading, to say that the really important mathematics the students needed is taught nowadays in the eighth and ninth grade. Here the students are introduced to the idea of a function, and to the different classes of numbers: integers, rational numbers and real numbers. I was briefly exposed to these ideas as a graduate student. I used them without understanding, or even being aware of what I was doing, until my children went through the eighth and ninth grade. It took me several years of repeated passage, doing their homework, before I finally caught on that these were the mathematical concepts I needed to bring sense and order to my understanding of a speculative market. Pure mathematicians should find a wry satisfaction, and mathematical economists may be a little startled, to see that modern ninth grade mathematics is what you need to make sense of the law of supply and demand in a speculative market, and to expose in elementary terms the mystique of the NYSE specialist or over-the-counter trader who "makes a market."

I have not tried to avoid subtle mathematical points when they are needed. Rather, I have tried to emphasize that these subtle ideas are used unknowingly, repeatedly, in everyday life. In the simplest terms, the most important mathematical ideas needed are those which distinguish between a solid and a dotted line.

The material covered divides itself into roughly three parts: 1) a discussion of the process of market making; 2) a discussion of random walks; 3) a discussion of alternative methods of attacking a statistical problem and the relation between these methods. Included in this third category is an introduction to power spectrum methods.

The reader might get the impression, and it would be correct, that I am not overly enthusiastic about power spectra as the method of choice for examining sequential data. This prejudice is not restricted to economic time series. Readers with a serious interest in this question might examine closely sections 5.9, 5.13, and 5.21 of this text, and also the reference footnoted¹, as an example of alternatives to spectral methods in describing data on turbulence in fluids. Spectral methods are normally de rigueur in this subject, and I admit to being in a small minority. If I have seemed overly critical of spectral methods, it is for the same reason that I have criticized other theoretical ideas. You understand a theory best when you appreciate its limitations: the conditions under which it doesn't work.

When it comes to theoretical developments (as opposed to the analysis of data), my attitude toward Fourier analysis is similar to that of Winston Churchill toward a democratic form of government: it is a terrible method, but the best one we know of.

For those readers who like to browse, or might want to use this material for a course or seminar, I should point out that this material was not delivered in the logical and perhaps pedagogically appropriate order that is found here. One class got a rather complete treatment of market making, and very little of random walk. The other got a much fuller treatment of random walk, and no market making. The discussion of alternative methods of analyzing data does not require either market making or random walks as prerequisites.

It may be appropriate to warn readers of the multiple and specialized meanings of certain words. "Real" in ordinary language means concrete and actual, rather than abstract or fictional. "Real" in mathematics, as referring to numbers or functions, has a number of different connotations. The basic one is that of natural numbers (integers) plus fractions plus irrational numbers, positive and negative. If I say the real part of a number or function, it implies there may be a second, so-called imaginary, part. Historically both parts are the product of the most imaginative thinking of mathematicians, for at least two millennia.

"Trend" does not have the loose meaning of some kind of slow change ascribed to this word in general language or economics. For us it is usually, given a plotted finite set of sequential data, the slope of a straight line from the first to the last data point, regardless of what the data may be in between. This is exactly the mean difference of the data between successive intervals, taken as unit time. If the ordinate has been transformed, say from prices to log prices, it is the slope of such a line on the transformed data.
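The equivalence claimed here, that the first-to-last slope equals the mean of the successive differences, can be checked directly: the differences telescope, so their sum is just the last value minus the first. A minimal sketch (the price series is illustrative, not from the text):

```python
# Trend in this book's sense: slope of the straight line from the first
# to the last data point, with unit time per interval.
def trend(data):
    return (data[-1] - data[0]) / (len(data) - 1)

# Mean of the successive differences; their sum telescopes to
# data[-1] - data[0], so this always equals trend(data).
def mean_difference(data):
    diffs = [b - a for a, b in zip(data, data[1:])]
    return sum(diffs) / len(diffs)

prices = [50.0, 51.5, 49.0, 52.25, 53.0]   # illustrative sequential data
print(trend(prices))            # 0.75
print(mean_difference(prices))  # 0.75
```

Note that the data in between the endpoints has no effect at all on the trend, exactly as defined above.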

¹M. F. M. Osborne, "The Observation and Theory of Fluctuations in Deep Ocean Currents," Ergänzungsheft zur Deutschen Hydrographischen Zeitschrift, Series A, No. 13, 1973, pp. 1-58.


"Represent" in mathematics means "stand for," just as a V (with the Romans), five knots on a string (with the Aztecs), or the Arabic numeral 5 stands for the abstract concept of number also represented by the fingers on one hand. In discussing numerical data, "represent" means "fit," in the same sense that Ptolemaic epicycles vs. a theory of gravitation are used to fit or represent the observations of positions of the Moon; or random walk theory represents some property of stock price sequences; or a Fourier series or a power series represents, or fits, a sequence of observed numbers.

I would venture to say that most people (the author included) who buy a book about the stock market do so because they would like to make money, either for themselves or as an advisor or manager. We can express this mathematically by saying that they are interested in improving their estimate of the expected value of the change in price of some stock, or an index, over some interval of time t - t0 (t > t0) in the future, beginning now (t0). The reader should be warned that there is more to the stock market than this ultimate objective, fascinating as it may be. I do not wish to denigrate the profit motive, but understanding the market may require an examination of phenomena that have nothing to do with price, or profit.

I can illustrate the narrowness of this "profit motive" point of view with a story which appeared in the Wall Street Journal some years ago. There was a man who loved horses, and happily earned his living running a stable: buying, selling, boarding and renting horses. There was a byproduct of this enterprise which in fact rendered it profitable, the uniform production of horse manure, for which there was a good market for all that he could produce. You can easily see that so long as profit was regarded as an incidental byproduct of this business, all would go well; but if he tried to increase profit by diet or laxatives, he would surely come to grief. You will see, for example, when we discuss market making, that if we regard profit as an incidental byproduct of the process of making a market, profits will flow steadily and at a relatively small risk, whereas if you focus on profit as the objective you increase your risks enormously.

When we discuss the random walk model of stock prices, you will see that a simple expression of this model is that the expected change in price, or expected "profit" or "return," is zero, no matter when you start or how long you wait.

For someone who wants to make money, this is a very depressing picture. When we examine the definition of profit more closely, and bring in the question of the amount of money vs. its utility, in the economist's sense, you will see that "profit" is at times a rather fuzzy and ill-defined concept. There are some unsolved problems here. In all of this I merely want to emphasize that there are other ways of looking at capital markets in general than that adopted by most people who have a practical and urgent interest in them. If you back off from the nitty-gritty problem of protecting and increasing your capital, you may get a more enlightening view of what it is all about.

Finally, to those readers (and I hope they are numerous) who take seriously the guiding principle of Chicken Little and Zero, "it ain't necessarily so," I make this request. I too may well make statements which are not necessarily so. If so, please tell me about them. There are doubtless places in the text where I have made statements, or used mathematical conclusions, which are not at all obvious. Feel free to ask me for enlightenment; I would be happy to comply.




I owe the opportunity to teach a course on finance and the stock market to my friend and colleague Victor Niederhoffer. I owe the publication of a second printing of these notes to Joe Murphy, a collaborator in several subsequent papers. Without him this second printing would never have come to pass.

I have often thought that the stock market, just as a phenomenon of purely scientific interest, was one of the best documented (in terms of available recorded data) and worst described aspects of human life. The reason for this glaring dichotomy is that the overwhelming majority of interest in the market centers on how to make money, a very narrow and very compelling practical interest. There are historical analogies of this situation. Medicine and agriculture are very practical subjects of interest; yet only in the last 200 years have they been studied as purely scientific, rather than practical, topics of interest.

The major emphasis in this book is twofold: 1) reinterpretation of conventional wisdom, and 2) unanswered questions about finance and economics. Data and methods exist for answering some of these questions. This book can be imagined as an account by an unbiased anthropologist (a physicist) of an exotic society composed of enlightened financiers and economists, describing their standards of behavior and belief.


Some of the exotic beliefs which this society holds are:

1) Supply and demand are functions of price, but prices are not functions of supply and demand.

2) A capital market (i.e., the stock market) is totally unnecessary for the raising of capital, and in fact raises very little capital (less than 1/10th of the capital raised in any year in the U.S.).

3) A slightly controversial belief of the members of this society has it that the natural logarithms of stock prices in $ per share follow approximately a random walk of zero expected advance. This belief has a curious consequence: (1) that the expected price itself, in $ per share, increases slightly and steadily in time; and it also follows (2) that the expected reciprocal of price, in shares per $, also increases with time in an identical fashion. The first conclusion has been repeatedly tested with real data. The second conclusion has never been tested with real data, to my knowledge. Note that "expected" has the technical meaning of the sum of values times associated probability.

There are numerous other paradoxical beliefs of this society, consequent to the difference between the discrete numbers (multiples of 1/8 $) in which data is recorded and the real numbers (in the mathematical definition of "real") in which the theoreticians of this society tend to think. The distinction between "real" and "discrete" matters considerably for the stock market.

There is an analogy to all this in the foreign exchange market, an analogy which has never been explored, to my knowledge. Historically, foreign currencies were traded against a single standard of value: an ounce of gold, or pounds sterling. In the U.S. stock market the $ is compared against a multitude of other "currencies" or vehicles of value, the stocks themselves. Whether prices of currencies (or their logs), like stock prices, follow a random walk, or have in past eras, is not known to me.

In the footnote of this preface are listed a number of papers which bear on these and related questions, some of which appeared after these notes were written.*

*Osborne, M.F.M., "Brownian Motion in the Stock Market," Operations Research, vol. 7, 1959, pp. 145-173, p. 807.
Osborne, M.F.M. and Murphy, J.E., "Financial Analogs of Physical Brownian Motion, as Illustrated by Earnings," The Financial Review, vol. 19, no. 2, 1984, pp. 153-172.
Murphy, J.E. and Osborne, M.F.M., "Brownian Motion in the Bond Market," The Review of Futures Markets, vol. 6, no. 3, December 1987, pp. 306-326.




Some Basic Principles of Investigation

The objectives of these lectures are to stretch your minds as regards imagination and a new point of view, to excite your curiosity for new things, and to develop your skepticism concerning what you may have already read about the stock market or security data.

As models for the kind of thinking I am trying to bring out, I will give three well-known examples. The first is the model scientist, physicist if you like, who is typified by Chicken Little, a character of whom I am sure you have all heard. Chicken Little was a little bird freshly hatched from an egg, with no preconceived notions as to what the world of nature was all about. He was out in the woods pecking for bugs and worms and other delightful tidbits when an acorn fell on him. He immediately concluded that the sky was falling, because he saw it with his eyes, heard it with his ears, and felt it when it hit him on the tail. So he organized a following or lobby of Henny Penny, Turkey Lurkey, Ducky Wucky and other barnyard birds to go and tell the king about this impending disaster. On the way they met Foxy Loxy, who saw a good thing coming for himself, so he said he knew just where the king lived and he would take them there. He led Chicken Little and his entourage right to his own den. He went in and said he would make the arrangements and they should come in after him. When they did, he bit their heads off one by one.


I think a few of them may have escaped, but the point is that Chicken Little exhibited all the good and even heroic qualities, and also the stupidities, of a scientist. First of all, he had no preconceived notions or model as to what the world was like, and complete reverence for the data. He accepted his sense data literally when something hit him from above. Like all good scientists, he immediately made a theory directly related to his data, and then he started to do something about it, even if it killed him. Scientists these days, I suppose, go to Washington "to do something about it." The fact that he happened to be wrong is really nothing against him. Science is nothing but a systematic way of making mistakes. A good theory is nothing but a shorthand way of remembering the data. He had reverence for the observations; that is the important point. He didn't have any preconceived notions as to what was going on. He just took the data of his senses straight.

The second model is that of a mathematician, and our model in this case is a character from the funny papers named Zero, who figures in the comic strip "Beetle Bailey." Zero is usually noted for getting into difficult situations because he takes the meanings of words and the grammatical constructions absolutely literally. This is exactly what a mathematician does. In the comic strip it usually happens to be a way of his getting into trouble. Nevertheless, this is the correct attitude for a mathematician. It fits perfectly with the above definition of a scientist, and explains why mathematics and science go so well together. I think there is something significant about giving the name "Zero" to someone with these characteristics, because zero is perhaps one of the greatest mathematical inventions that has ever been made. Zero is a symbol for nothing, yet it is a nothing with useful and even remarkable properties.

The third attribute which I would like to bring out and develop in you is expressed in a hymn which they sang in a church I used to go to. I call it a hymn because it was a song and because it was sung in a church, here in Berkeley, although I doubt if you will find it in any hymn book. The line goes something like this: "Old Pharaoh's daughter found Moses in the bulrushes, she said, but it ain't necessarily so, no, it ain't necessarily so." Well, this is the kind of skepticism I want to inculcate in you. Just because you read something in some authoritative source, it doesn't mean that it is necessarily so. It may not be so.


Some Fantasies of Wall Street

In order to cultivate a sense of astonishment, such as Chicken Little had, about the truly fantastic things that go on, on Wall Street and in the securities market generally, let me describe some of the properties of a stock market as seen by some imaginary characters from ancient history. The first of these is an ancient Roman or Greek. He is taken to the stock exchange, and it is explained to him that this is a market for synthetic slaves (corporations), who have been created, you might say, by a magician (lawyer). The slave is sawed up into several million pieces. The pieces are bought and sold, and nevertheless the slave continues to work and make money for the owners. In addition, even more astonishing to the Roman or Greek, free real citizens are hired by and work for these fictitious slaves.

The next visitor to the stock market is an oriental. It is explained to him that these pieces of slaves are bought and sold, but he is astonished to learn that the prices are upside down in terms of the standards that he is accustomed to. In other words, the prices are in terms of dollars per share, or dollars per piece. In the Orient prices are just upside down from this. There you do not buy rice or wheat in terms of dollars per bushel, but in terms of bushels per dollar, or pounds per rupee. The reason is, I suppose, that they are more interested in what they are buying, because they are going to eat it, than in the price they can get for it if they are going to sell it. Food is more important than money, for many orientals. If you have two annas, you want to know how much rice you can get to eat for them, rather than how many annas you can get for a pound of rice you already have. I think the same thing sometimes occurs with those Arabian sultans who have lots of oil. They buy Cadillacs seventeen at a time. They look on the price of a Cadillac as how many Cadillacs per gold brick, rather than how many gold bricks per Cadillac. The oriental is further astonished to observe that the smallest unit is a coin which doesn't exist, the eighth of a dollar or bit. There is a remnant of this in our language, where a quarter is two bits. It is rather astonishing to these people of an ancient civilization that the hairy barbarians of the West buy and sell pieces of slaves on this market, with a price which is upside down, and with a coin which doesn't exist.

The third gentleman whom we introduce to this market is an ancient Egyptian. It is frequently said that the French are a nation of shopkeepers. Well, in those days the Egyptians were a nation of undertakers. All of life was a preparation for death. When he is shown the newspaper with the names and prices of all these actively living slaves, he immediately asks: where is the obituary column, where they describe the virtues and accomplishments of the people who are dead? Where is the necropolis where all the marvelous imaginary people have been buried? He is not interested in the present properties of living people, because a person isn't complete until he is dead, so he wants to know where all the dead ones are. His guide is rather astonished at this, because it is really quite vulgar to speak of canal stocks and buggy whip manufacturers, and all the disasters which have overcome these imaginary slaves of the past.

The Egyptian thinks that the American point of view is really quite barbaric, and totally uncivilized.

The preceding description of fantasies on Wall Street seems apropos today, and there are some equally fantastic phenomena going on right now. For example, for many years one might have thought that the NYSE brokerage business was the ideal business setup. It was a strictly closed corporation, to be a member of the NYSE. It cost a lot of money to be a member, and there were only a certain number of people who could belong. They had a monopoly on the market sanctioned by the government; they had a fixed set of minimum prices (brokerage) that everybody charged. As time went on, say from 1920, prices rose, that is, the brokerage fees, and the amount of business also rose (the daily volume). It was a protected, growing market, and a monopoly.

What could have been a sweeter setup? As a result of all this, a great many of them, astonishingly enough, got into serious difficulties and even went broke. Things were so good, and they were flooded by so much business, that they went out of business, despite the fact that it was a near monopoly with fixed minimum prices. To me as a simple-minded physicist, this is an utterly incredible phenomenon. It got so good it was terrible. In some respects this phenomenon is the mirror image of 1929. In those days it was the public that was taken for a dreadful bath. Now (1968-72), as a result of tremendous participation by the public, as institutions rather than individuals, it is the operators of the market who are being taken for a bath.

Finally, I can describe an astonishing discovery which I made myself, although you may find it quite humorous.

I was making a telephone call to the Securities and Exchange Commission to get some data on the relation of the volume of trading on any given day to the actual number of transactions. So I called up the young lady on the switchboard at the SEC and asked her who at the SEC had data on this directly from the tape, the individual components of volume. She said, "We don't have anything like that over here." I said, "This is the SEC, isn't it?" "Oh yes." "Don't you have data from the NYSE?" "Oh," she said. "I know what you want. You want the CAPITAL markets section." That was like a blinding flash of light to me, as if I were an astronaut on the dark side of the Moon. I had never thought of the stock market as a market where they bought and sold money. I always thought of it as a place where they bought and sold stock. Apparently, in the textbooks, it is primarily a place where capital is bought and sold. The stock certificates, the pieces of paper that are red and yellow and blue, are only incidental to the buying and selling of money. This was a totally new point of view to me.

I might say that the definition of a stock market which we will use for the purposes of these lectures is somewhat different from what you will read in the textbooks.

To me it is a very ingenious social device in which people's inveterate and insatiable propensity to gamble or take chances is satisfied. As a by-product it provides a useful social function, which is the provision for the liquidity of capital. I don't look on it at all as a capital raising device, but as something to satisfy people's instinct to gamble. Only as a side benefit is this legalized gambling put to a useful economic purpose. If the market is primarily a gambling phenomenon, this puts in an underlying structure (random walks) which determines in large measure how prices behave. We shall also return to some of the unconventional ideas of our visitors to the stock market. The finite life of a stock puts a limit on the concept of a random walk. Thinking in terms of upside-down prices makes a difference in a market maker's strategy, and also in the way, or alternate ways, in which "return" might be evaluated.




Does a Capital Market Raise Capital?

At this point I would like to give you the first problem, which is intended to look a little more closely at this question of whether a capital market is in fact a market for capital, and just how important it is as a source of capital. I am frankly a little suspicious, just as old Pharaoh might have been about his daughter's story. I might say that frequently when I give you a problem, it is one for which I don't know the answer. I am sure that collectively and individually you students know more economics than I do, and so collectively we'll just find out about this question. Also, if you think the question is ill put for parts of it, do not hesitate to criticize it, or the conception of the question itself, or to substitute a more germane or interesting question of your own. Problems are thrown at the class to be defeated collectively. You may collaborate, ask professors or other students. I don't care where or how you get information or answers, as long as you acknowledge your sources.

I am somewhat skeptical of the source of authority of these remarks (in the problem), primarily because Mr. Martin himself is a former president of the NYSE, and also because of the conclusions from this study (which don't appear in the quote of the problem), that all of the business should be concentrated in one market, which would naturally be the NYSE. I am particularly skeptical of any authority who has (or has had) a vested interest in the answer, and this definitely includes Mr. Martin.

We will see in this class a number of examples of vested interest. Let me give you a few. You can all understand why firemen on diesel locomotives have a vested interest in keeping the firemen there, and you can be a little skeptical when they say the jobs are not for the primary benefit of the union or themselves, but for the safety of the public. You can also understand why the trial lawyers have a vested interest in the adversary system of settling accident claims, and take a very strong position against no-fault insurance. The same thing happened when socialized medicine first appeared on the scene. The doctors had a vested interest in the status quo.

The same thing can be said of educators, who have a vested interest in requiring students to take certain courses, or in the concept of tenure. You can say with equal justice that dentists have a vested interest in cavities. You can say with exactly the same point of view that social service and welfare workers have a vested interest in preserving poverty and a submerged class. Likewise the Bureau of Indian Affairs. They would all be out of jobs if they ever solved these problems. The pope has a vested interest in the population explosion and is against birth control, because then there are that many more souls or brownie points for the Catholic Church.

You should not understand that I criticize Mr. Martin for having his vested interest. Vested interest explains to me a lot of otherwise contradictory and inexplicable phenomena. An evangelist, for example, like Billy Graham, has a vested interest in the concept of Hell, Sin and Damnation. He would not be at all sympathetic to the anthropologist's or psychologist's point of view that these are just arbitrary taboos and mental inventions set up to govern human behavior (and incidentally give the organized church power), because if you abolish the notions of hell, sin and damnation, the evangelist has no justification for being. It takes away his business as a broker for real estate in paradise. I once had a chief of police tell me he thought an appreciable crime rate was a sign of a vigorous, aggressive, healthy population. We all understand why the narcotics law enforcement agencies are not in favor of legalizing any drugs.

As students of finance, I will tell you quite flatly, you have or will develop a vested interest in a concept: the concept of profit. It may be just as blinding and narrow as those I have just mentioned. I hope in this course to at least make you aware of this. I can't hope to prevent it. I don't make any exception in this regard for myself.

When I am not a professor of finance, I am an employee of the Defense Department, a member of the military-industrial establishment, and I have a vested interest in warfare. Other people getting killed are just so much cold turkey in my deep freeze. I am dovish on the Vietnamese war for very hawkish reasons. I think we have gotten all the mileage (and by we, I mean, of course, the military-industrial establishment) that can be obtained in terms of training and trying out new weapons and tactics. What this country needs, and you can see my vested interests, is another war, and I really don't care where it is. MacArthur unfortunately got the Japanese out of the picture. The Koreans have been so unaccommodating as to start to make peace. There are still possibilities in the near east or Africa, or anywhere, it really doesn't matter.

So in suggesting that Mr. Martin has or had a vested interest, that is really no criticism at all. It just explains the way he thinks. I might say that his conclusion that all the markets should be concentrated into one, the NYSE, is hotly disputed by its competitors, the over-the-counter market for listed securities, or third market. Mr. Weeden of Weeden and Co. is the principal exponent of this viewpoint. It is by no means obvious that Mr. Martin's statement, that a capital-raising market is essential to an economy such as ours, is true, or that the conclusion he draws concerning a centralized market is valid, either. This is just what I want you to look into, and we'll find out more about it.

The students attacked this problem in a number of different ways, using different sources of data and for different years. Some confined themselves strictly to capital raised on the NYSE, and got data from the NYSE Year Book. Others used data from the National Accounts or National Budget and endeavored to sift out from these figures a number which represented the different components of capital raised in any given year. In view of the diversity of methods, points of view and sources, some looking at the question as an accountant's problem and others as an economist's problem, there was a rather surprising agreement (I use the word agreement as a physicist, in a rather rough sense) as to where most new capital came from. It was generated from internal sources within the corporations which were needing capital. As a rough figure, I would say from two-thirds to nine-tenths of the capital was generated in this way, although it was different in different years. Using the same method of calculation for different years, the fraction of capital raised "in the market" as Mr. Martin specified fluctuated from, say, 2 to perhaps 8% of the total capital raised. There was one high figure of 12%, from a different viewpoint. Nevertheless, these figures indicated that the actual capital raised by a market is a relatively small source.

There were certain difficulties of conception, as to exactly how you distinguish between new capital which is being raised, and capital which is essentially being rolled over, where a new stock or bond issue is being used to retire a previous debt. There was also a difficulty in assessing what might be called the unseen part of the iceberg, in that a great deal of capital is raised and placed privately, so that it never really appears on the public market.

Still another difficulty was in assessing how much capital is raised by taxation, whether by federal, state or local government. It was pointed out by one of the students that in an authoritarian government like the Republic of China, the problem would be very easy. You simply look at the budget; they have a certain amount or fraction of income or GNP that is specified for capital, and that is it. There is no question of a market at all. There was also the question of how one raises capital when there is no capital market, or a very imperfect one, or no visible one, as in Europe, where a great deal of what is transacted publicly in markets in this country is transacted privately between banks.

There was also the problem of distinguishing between "new" capital and "used" capital. That is, how much of the money lying around is being swept into piles, like dirt in a room, from one place to another, and when the pile is big enough, for a long enough time, you call it "capital."

I was also attracted as a physicist by the question of the proper "dimension" of capital, because as a physicist, knowing exactly what the dimensions are of a quantity you are thinking of is often very revealing about what the quantity is and how it is measured. I had inadvertently thought of capital as simply dollars, whereas it was evident that some of the students, and perhaps this is a more correct economic viewpoint, thought in terms of dollars per year. In one of the questions it is asked: when money is saved, when is it used as capital and when is it used as "not capital"? What was going on in the back of my mind, and the students couldn't read my mind, was the following viewpoint. If I put a thousand dollars in the bank for a year, I just think of it as a thousand dollars. The banker thinks of it, and may well use it as capital for the purposes of a loan, as dollars per year, so that concept has a different dimension, and I suspect a preferable one. In physics this is the difference between thinking of a quantity and the flux or flow of that quantity per unit time, like charge vs. current, or gallons of water vs. gallons per minute.
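The stock-versus-flow distinction can be sketched as a toy dimensional check (nothing here is from the text; the class and names are invented for illustration): if quantities carry their dimensions with them, mixing a stock of dollars with a flow of dollars per year becomes an explicit error, just as charge cannot be added to current.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # e.g. ("dollar",) for a stock, ("dollar", "per_year") for a flow

    def __add__(self, other):
        # Adding quantities of different dimensions is meaningless.
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} to {other.dims}")
        return Quantity(self.value + other.value, self.dims)

savings = Quantity(1000.0, ("dollar",))              # what I see: a stock of money
loanable = Quantity(1000.0, ("dollar", "per_year"))  # what the banker sees: a flow

try:
    savings + loanable
except TypeError as e:
    print(e)  # the dimensions don't match, so the sum is rejected
```

The same numeric value, 1000, denotes two different things once the dimension is attached, which is exactly the banker's point of view versus mine.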

We will see that the question of dimensions is very important when we examine the modifications that have to be made in the simple-minded picture of supply and demand, when we try to apply this idea to prices on an auction market.

Problem 1. Mr. Martin's Statement - Does a Capital Market Raise Capital?

The purpose of this problem is to try to determine how important the securities market is as a means of raising capital, and the relative importance of the various components of it (bonds, stock exchanges and OTC, etc.). William M. Martin said, in a press release (August 7, 1971) connected with his stock market study, "the public interest dictates that the primary purpose of a security market is to raise capital to finance the economy." I am skeptical of the validity of this statement and also ignorant as to just what it means.

An associated question is: is it possible to distinguish, both practically and conceptually, between "new capital" actually "created" or "raised" in the securities market, and conversion or liquidity of "used," or second-hand, capital? How do you distinguish, conceptually and practically, between money saved and used as "capital," or used as "not capital"?

Possible quantitative estimates of answers to this problem are:

1) The net change in value (for the stock market), say over one year, of all securities listed, possibly discounted (if considered as security for loans) and corrected for brokerage losses.

2) The net foreign investment over one year. Note that in cases 1 and 2 the net "creation" of capital might be negative.

3) The total value of new stocks or bonds issued. How (if it is possible) can this be "corrected" for liquidation of old issues?

4) Is it possible to identify and make a relative comparison for capital not raised in the securities market? Where does "capital" come from, or how can it be raised, where there is a very small (or even nonexistent) securities market?

Gambling Games and Liquidity

The preceding problem and the discussion of its answer will give you a feeling as to just how important the security market is as a method of raising capital. Contrary to the implication of Mr. Martin's statement, like the refrain of the hymn, it ain't necessarily so. But there are a number of questions and comments that we can make. First of all, we can see that it is important for the governors of the exchanges and the community of brokers who make a living out of the market to have the public, and in particular those segments of the public who are influential, believe that it is important, in exactly the same sense that it is important to the Brotherhood of Trainmen to have the public believe that it is important to have firemen on diesel locomotives. In both cases, there are very influential segments of society who have believed it, and the public in general seems to have been conveniently docile in its belief for a very long time.

One might well ask where the picture of the stock market which I gave you, as a legalized form of gambling with a side benefit of liquidity, came from. Also, just how important is this liquidity? We will look into this latter question, and it will hinge a great deal on exactly what the definition of liquidity is. Is it important to be liquid in a matter of minutes, as the advocates of the exchanges would have you believe, or is it sufficient to have hours or even a day for achieving liquidity, and at what cost?

I can tell you where the picture of the securities market as a form of legalized gambling came from. It came from my own Chicken Little impressions of what I saw and understood when I first began to get seriously interested in the stock market. That was about 1956, and I began, as all good scientists do, by looking around and collecting as many impressions and as much information about what was going on as I could. My sources were the obvious noisy ones: the newspapers, and the various financial presses such as Barron's and the Wall Street Journal. I also laid my hands on a number of books, which you can find in the reading list at the end of these notes. I also subscribed to a number of financial advisory services. I recommend to you that this is a very good investment, not so much for the quality of their specific recommendations, because once you get on the mailing list of a few of these organizations, you get on the mailing list of all of them, and they will bombard you with all sorts of interesting, at least in the first instance, literature. If you do subscribe, I suggest that you save the initial explanations of exactly how they do what they do, because it will explain to you the jargon of the market in a way that is very difficult to acquire from any other source. Often these witch doctors of folklore, if you read them carefully, can suggest to you some very interesting lines of thought and lines of understanding.

I just about wore out a copy of Magee and Edwards' book on Technical Analysis. There is a lot in it which enlightens what is otherwise a strange phenomenon indeed. You can readily understand that my listening to and observing the noisy sources of the market quickly led me to believe that, figuratively at least, the market was a place where people with green paper, that is, money, were madly scurrying around and trading it off with people who had other colored pieces of paper, the stock certificates.

It was quickly apparent, at least from those sources which were producing the most noise if not the most information about the stock market, that the participants were indeed after a buck. Therefore, with everybody competing with everyone else, it was a game of competitive gambling. In it some were smart and some were not so smart, and the players changed sides so often that it was a picture of financial chaos or bedlam. As I had had some experience in molecular chaos as a physicist studying statistical mechanics, the analogies were very clear to me indeed. However, I should warn you that this was a lopsided picture. I didn't realize this until I spoke to the lady at the SEC. Then I vaguely remembered that there were some other things I had read in the more scholarly and academic volumes which took a different viewpoint. But like Chicken Little, I had reverence for the data as I saw it happening, and skepticism for the authorities. I believe what I see rather than what somebody tells me about an interpretation, and so that is where my particular viewpoint came from.

I should warn you again that this approach may produce a lopsided impression, in much the same way as the story of the six blind men and the elephant. Each one felt a different part and reached a totally different image in his mind of what an elephant was. One can give an analogy concerning the University of California at Berkeley, and its students. If one were to listen only to the students who made the most noise and got the most publicity in the newspaper, one would have a very distorted picture, perhaps, of what the actual student body at Berkeley was like. That does not mean to imply that the noisy ones are unimportant. I think the noisy ones have had a very profound effect on the conduct of the University of California. It is certainly true that the noisy people in the stock market have a very profound effect on how it is operated too.


Supply and Demand. Economics vs. Reality

From the above approximate (and I emphasize that it is approximate) picture of the stock market, it seems evident that the vast majority of the (noisy) participants are indeed interested in making a dollar. This is confirmed indirectly not only because the price in dollars is in the numerator, where it has the position of greatest interest, or top billing (although it should be in the denominator for the oriental), but also because, at least in 1956, at least half the column in the newspaper was devoted to this dollar figure, price. There was a high and a low for the year, an open and a close for the day, and a high and a low for the day.

So the question then arose: just how are these prices generated? I had already determined that they had the properties of chaos. The question then was: what are the details of the mechanism which produces them? "Everybody knows" that prices are determined by supply and demand. So I thought I had better refresh my mind and look sharply at the details of what supply and demand were all about. I got an elementary textbook, Samuelson I believe, and read very carefully the description of these two lines that are drawn as a function of supply and demand. I looked at the schedules of actually fictitious but supposedly representative data, which were then plotted up, the points joined with either straight line segments or smooth lines, and I saw how the derivative was evaluated, the logarithmic derivative actually, to describe the elasticity of supply and demand.

But then the Chicken Little in me began to get a little skeptical. No matter how hard I looked, I never could see any actual real data showing that these lines, which were so thoroughly discussed, could actually be observed in nature. This accounts for the second problem which I have assigned to the class (see Problem 2). I got very unhappy about this because I also read, in some of the more academic volumes, that the laws of supply and demand didn't seem to work very well when they were applied to a security market.
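The textbook construction of elasticity can be sketched numerically (the demand schedule below is invented for illustration, not taken from any real source): the elasticity is the logarithmic derivative d ln Q / d ln P, which can be estimated from a schedule by finite differences between adjacent points.

```python
import math

# A made-up demand schedule: price -> quantity demanded (illustrative only).
prices = [1.0, 2.0, 4.0, 8.0]
quantities = [100.0, 70.0, 49.0, 34.3]

# Elasticity = d(ln Q)/d(ln P), estimated between adjacent schedule points.
for (p0, q0), (p1, q1) in zip(zip(prices, quantities),
                              zip(prices[1:], quantities[1:])):
    elasticity = (math.log(q1) - math.log(q0)) / (math.log(p1) - math.log(p0))
    print(f"between P={p0} and P={p1}: elasticity = {elasticity:.3f}")
```

This particular schedule was built so that each doubling of price cuts quantity by the factor 0.7, giving the same elasticity, ln 0.7 / ln 2, between every pair of points; real schedules, if one could observe them, need not be so obliging.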


In order to get this situation straightened out, we will have to call on the talents of our mathematician Zero to see exactly what the words mean. To do that we are going to need some elementary concepts of mathematics, and apply them very closely indeed.

The first mathematical idea I want to bring out is that of a function, in the mathematical sense. You can read in the newspapers that price is a function of supply and demand. If you examine exactly what these words mean to a mathematician, you will see, as I will show you, that it isn't necessarily so, although it is so that supply and demand (but with different dimensions) are functions of price. The one statement does not imply the other, nor conversely.

Going back to what you probably learned in the eighth or ninth grade, let me define carefully the notion of a function. Actually you start out with two sets. I remind you that a set is simply a collection of objects with a definition to specify what the objects are, such that you can always make an unambiguous decision about any object as to whether it is or is not a member of the set. It is then a well-defined set. The set can be of concrete objects, or they can be abstract ideas, or the objects can be themselves sets of other objects, either abstract or concrete. The word function is defined in the following way. It is a relation between the objects of the two sets such that if you have the two sets of objects Y = {y} and X = {x}, you write the functional relation (x, y) such that if you pick uniquely an object from the set X, the functional relation tells you uniquely with what object y in the set Y the first choice of x is associated. It does not follow that you can go the other way. It may be false that if you pick a y, the rule tells you uniquely what object you get in X.

So if we have that y is equal to a function of x, or over the set X = {x}, frequently written y = F(x), then, given x, we can uniquely determine a y. There are various alternative words in mathematics to express this. Sometimes x is called the argument and y the function; sometimes x is called the independent variable, and y the dependent variable. The language is that the function (a relationship) is defined over the domain, which is the set X, and the function itself, the y, is specified on the range set Y. So there are very definite limits (the sets X and Y) over which the functional relationship holds.
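To make the uniqueness requirement concrete, here is a minimal sketch in code; the names in the pairs below are invented for illustration, and a relation is represented simply as a list of (x, y) pairs.

```python
def is_function(pairs):
    """Check that a relation (a list of (x, y) pairs) is a function:
    each x must be associated with exactly one y."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # the same x is paired with two different y's
        seen[x] = y
    return True

# Voter -> congressman: each voter has one congressman (a function).
voters = [("alice", "rep_1"), ("bob", "rep_1"), ("carol", "rep_2")]

# Congressman -> voter: the turned-around relation is not a function,
# because rep_1 is paired with two different voters.
inverse = [(rep, voter) for voter, rep in voters]

print(is_function(voters))   # True
print(is_function(inverse))  # False
```

This is exactly the asymmetry of the congressman and voter example that follows.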

I can give you two examples to illustrate this functional relationship between sets. Consider the set of congressmen (House of Representatives) and the set of all voters. There are five hundred congressmen, and the congressman is a function over the set of voters, the voters being the domain and the five hundred congressmen being the range.

If you

pick a voter, he has a unique congressman. The converse is not true, because if you pick a congressman, you do not determine a unique

voter. Note that senators, since there are two each per state, are not a function of, or over the voters, nor conversely.

So that although

congressmen are a function of the voters, the voters are not a function

of the congressmen. A second example: consider the set of all mothers and the set of

all children. In this case it is quite possible for the same individual to be a mother and also a child.

The mothers are a function of the children. If you pick a child it has a unique mother, but the children are not a function of their mothers.

Given a mother you

cannot uniquely determine the child that goes with it. If you want to cause a commotion in a PTA meeting you can make this statement.

“Mothers are a function of their children, but the children are not a function of their mothers.” This just shows that the common usage

of a word (function) may be different from its mathematical usage. Let us now return to our supply-demand diagram, plotted with price as the function, and see what this picture says, literally, to Zero our mathematician.




Fig. 2.3-1 Prices as Function of Supply and Demand.

It is a mathematical convention that the vertical axis is used for plotting the function, or range set, and the horizontal axis the domain



set. So we have in this diagram three different kinds of prices: p(D) on the demand curve, p(S) on the supply curve, and the intersection which I will call p_R. The subscript R means that it is supposedly the price which appears in the real world of buying and selling. You will note that, unlike p(D) and p(S), I have not indicated of what independent variables or domain set (if any) p_R is a function in the mathematical sense.

The big problem when we read the textbooks, or speak of “price” in everyday language, is to determine which one of these three prices is intended. We will need to examine the context and the grammatical construction very carefully in order to decide what variables are intended in any statement to be the domain and what variables the range or function set. Right away from Fig. 2.3-1 we notice two properties. As drawn, by the mathematical convention, p(D) and p(S) are the functions. Since p(D) slopes monotonically down, and p(S) slopes monotonically up, it is on the diagram possible to go uniquely from a particular p to a unique D or S. So the figure implies that the functional relationship can be made in either direction. As we saw from the definition of a function, this is not necessarily true.

Mathematicians have a word to describe this possibility. If there is a “formula” (i.e., some sort of algebraic expression) to express p = p(D), then it is frequently (not always) possible to invert, and find D = D(p), over some limited interval now of p. The domain and range sets have exchanged their roles.
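Where such an inversion is possible, it can be sketched directly; the schedule below is invented, and the only property that matters is that it is strictly monotone, so that no price repeats.

```python
# A hypothetical, strictly decreasing schedule p = p(D): each demand
# level D is paired with a distinct price p.
p_of_D = {10: 9.0, 20: 7.0, 30: 5.0, 40: 3.0}

# Strict monotonicity is what lets us turn the table around and read
# it as D = D(p): the roles of domain and range are exchanged.
D_of_p = {p: D for D, p in p_of_D.items()}

print(D_of_p[5.0])                 # 30
print(len(D_of_p) == len(p_of_D))  # True: no pair was lost
```

If two demand levels shared a price, the turned-around table would lose a pair, which is the uniqueness failure described above.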

The second property we notice is that we have drawn thin solid smooth curves. The word “thin” implies that ordinate and abscissa (p, D or S) are sharply defined numerically. This may not be necessarily so. The word solid implies a continuous function; smooth implies the existence of a derivative, and more than that, a continuous derivative. None of these ideas appeared in our definition of a function. Obviously, we are implying a great deal here not only about the sets D, S and p, but also the relation between them. Unconscious assumptions or implications are very often the most difficult to detect. It is a very common practice to represent phenomena by thin solid smooth graphs. Sometimes the implications of “thinness,” “solidity” and “smoothness” are justified. When they are not, assuming them can cause endless controversy and confusion.

At this point a beady eyed Chicken Little might object to these characteristics of our diagram, and say, “Look here, you can’t have solid lines on that picture because there is always a smallest unit of


money, be it a penny or an eighth of a dollar or a dollar, in which you

buy things, and in addition there is always a unit of something that you buy: tires, automobiles or wheat. There is always a smallest amount, usually a bushel for wheat, but in any event we should have just whole numbers of some sort on that diagram on both axes. The lines should be dotted.”

Well, this is reasonable enough, and one

might understand that we just draw solid lines to get a picture of it. Then our mathematician Zero will have an objection on the grounds that if we are going to have dotted lines instead of solid lines on the curve then there does not exist any such thing as a slope, or a

derivative, or a logarithmic derivative either. Although the objections which Chicken Little and Zero have raised to our supply and demand figure seem quite trivial and even picayunish, these points are absolutely essential in understanding the modifications of the supply-demand relationship to make them applicable to an auction market, and to the world in general, as I see it.

Let us now examine the textbook language which describes the supply-demand diagram. If you read the fine print in the textbooks

you will notice there is a specification that the supply and demand

are those quantities which people would continue to buy (demand) or sell (supply), at the given price. Hence the actual dimensions of supply and demand are items of the commodity, tires or wheat, per unit of time, a month or year.

Note that the specification “at the

given price” implies from the grammatical construction that price is the independent variable, or domain set, which is contrary to the construction of the figure as drawn. Everyday language confirms this implication. “Everybody knows” that a supply increases with increasing price, and demand increases with decreasing price.

The language indicates that price is considered the independent variable, or domain set. Conversations with economists confirmed this. It is only an historical accident that it has become a convention in economics to plot price on the vertical axis.

A different statement from “everyday language” seems to contradict the above plausible statements about demand: “If demand increases, the price goes up.” This word “price” is not the “price” of the preceding paragraph. It is p_R, the intersection of the supply and demand curves. The statement means that if the entire demand curve is shoved to the right, the intersection point (p_R) slides up on the supply curve, which is assumed to be held fixed. But the grammar implies that demand is the domain set.

These peculiarities were horribly confusing to me, and I became even more confused and suspicious when I was looking around for a real example of a supply or demand curve. I received in the mail an advertisement of some pamphlets, and the statement said, “We are prepared to supply according to the following schedule the numbers of pamphlets at the indicated prices.” I thought “A-HA,” at last a real example of a supply schedule. I plotted up the schedule using the figures which they gave. Up to five pamphlets at 10 cents apiece. Up to 100 at 2 cents apiece. A thousand or more at 1 cent apiece. I discovered that this “schedule of supply” was exactly the opposite in its “slope” to the one which appeared in the textbook. So I was foiled again. It was not a real supply schedule, by the textbook definition. Being both skeptical and Chicken Little minded, I tended to discount the textbook.

It will also be noticed, if you read the fine print in the textbooks carefully, that the authors are careful to specify that these thin solid smooth line supply and demand curves are drawn with the understanding that “other things” are to be equal, i.e., held constant. What those “other things” are came out in the course of the answers of the students to the problem on this subject. This problem caused a good deal of commotion, not only amongst the students, but also amongst the faculty to whom I directed the students to address their questions, since Barrows Hall is infested with economists.

Some at least were skeptical that anyone would have the gall to seriously question the legitimate activities of economists, who apparently spend a lot of time and effort computing supply and demand curves. Over in the marketing department they said they had been trying to compute these things, but there was too much noise in the data. It’s my feeling that it is precisely the noise in the data which is the interesting part of the phenomenon.

In sum and substance the consensus of expert opinion seemed to be that it is indeed very difficult to extract from real data what a real life supply or demand curve is like. The condition of holding all other things fixed is practically impossible to achieve in practice. The “other things” are such variables as weather (for demand of cosmetics, or fertilizer), the advertising budget, the preference of the consumer, the supply and demand of competing products, and so on. So it is only in a very idealized sense that these curves are supposed





to have any meaning for purposes of understanding the real world. It might be well to summarize what the economists actually do

when they try to carry out this calculation.

They suppose a func-

tion exists which involves D, S, p, and all the other variables which

are supposed to be held constant, and plug in as much real data as they can for these variables.

They make a multiple regression

(linear usually) to evaluate in the simplest case the “tangent hyperplane” through the center of gravity of their data in a space of as many dimensions as they have variables.

If this regression calcula-

tion comes out “significant” in some sense, then by holding constant all the variables they want fixed they then get a regression “curve”

(tangent line) of price against supply or demand. This process has behind it some sort of a model, and for which hopefully, if they are

say evaluating a demand curve, the supply curve is “naturally” held fixed. The uncertainty or arbitrariness of the model, and the difficulties of interpreting multiple and partial regression coefficients, are still present. Needless to say, this is a rather far cry from the simpleminded picture in the elementary textbooks. The notion of a mathematical function is completely wiped out in this stochastic picture. The regression of y on x is not a functional relationship, a fortiori not invertible. The supply or demand curve “derived” is actually a regression line, plus some sort of a preconceived model to specify what the germane variables are.
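A minimal sketch of this procedure, on synthetic data with invented variable names (the “true” coefficients are built into the data here, a luxury no real investigation enjoys): a multiple regression of demand on price and one “other thing,” from which a regression line of demand against price is read off by holding the other variable fixed.

```python
import random

random.seed(7)

# Synthetic observations: "true" demand = 50 - 2*price + 5*weather + noise.
n = 400
rows = []
for _ in range(n):
    price = random.uniform(1.0, 10.0)
    weather = random.uniform(0.0, 1.0)       # one of the "other things"
    demand = 50.0 - 2.0 * price + 5.0 * weather + random.gauss(0.0, 1.0)
    rows.append((1.0, price, weather, demand))

# Least squares via the normal equations (X'X) b = X'y, three unknowns,
# solved by Gaussian elimination: the fitted "tangent hyperplane".
k = 3
A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
b = [sum(r[i] * r[3] for r in rows) for i in range(k)]
for col in range(k):                      # forward elimination
    piv = A[col][col]
    for row in range(col + 1, k):
        f = A[row][col] / piv
        A[row] = [a - f * p for a, p in zip(A[row], A[col])]
        b[row] -= f * b[col]
coef = [0.0] * k
for row in reversed(range(k)):            # back substitution
    s = b[row] - sum(A[row][j] * coef[j] for j in range(row + 1, k))
    coef[row] = s / A[row][row]
b0, b_price, b_weather = coef

# Holding weather fixed gives the regression "curve" (really a line)
# of demand against price; b_price should land near the true slope, -2.
def demand_line(p):
    return b0 + b_price * p + b_weather * 0.5

print(round(b_price, 1))  # near -2.0
```

Even with the model known exactly, the recovered slope carries sampling noise; with the model itself in doubt, the difficulties described above only multiply.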

Here we have a rather typical picture of what passes for scientific procedure in the social sciences. Social scientists for the most part don’t seem to have learned that the theory is always required to fit the data, and that it is an incorrect procedure that data should be

made to fit the theory.

From a pedagogical standpoint, science is

frequently taught this way. You do an experiment in the laboratory in which you know what the answer is by theory, and so you do the experiment to confirm it.

Even more misleading, scholarly papers

are often written this way. Even these lectures are written up in this manner. In the real world the procedure of discovery is exactly the opposite.

As a class, social scientists have never caught on to this. As a result they very often won’t even undertake an investigation and collect data unless they have some sort of a theory or model to fit the data to.

This is not the way significant discoveries are made, and

it is just unfortunate that social scientists to a large extent carry on their research in this manner. This is probably an explanation, among other reasons, of why economics is called the dismal science, but that doesn’t prevent economics from being important.

Despite the scornful and even patronizing tone of the preceding remarks, we really cannot dismiss out of hand the economist’s ideas as the ravings of an academic lunatic in his private dream world. There certainly must be some measure, perhaps, of distorted truth in his point of view. The Chicken Little in me sees quite plainly that economists with influence do make important decisions, so we can’t just brush off their ideas.

I now want to show that we can take into account the objections of Chicken Little and Zero. By turning the diagram over so that price is the domain set or independent variable, and with a slight change in the dimensions of the diagram, we can bring the textbook ideas, with modifications, into rather close agreement with what actually does happen in the real world. The correct choice of dimensions, the correct choice of the dependent and the independent variable, and the discreteness of the domain set and the range set: these ideas are of fundamental importance.

Let us redraw the economist’s supply-demand picture turned over and comment on its qualities, and then I will show you that its properties, appropriately translated, do actually occur. It might seem impossible to preserve the idea of solid lines and smooth lines, which translates mathematically into continuity and existence and continuity of a derivative, but it turns out that, within certain limitations, even this is possible, with discrete variables.

So, looking at the new replotted diagram we see that the demand curve slopes down to the right, the supply curve slopes up to the right. D(p) and S(p) are in mathematical language decreasing and increasing functions of p. The domain set, or independent variable, is price, and the dimensions of supply and demand are, until we change them, items per unit time. That last definition we will have to change. We also have to understand that in these plots of demand and supply against price, there are some other variables that have to be held constant, just as before, where this was expressed, “other things being equal.” We will also have this idea. We also note that the supply and demand curves intersect at a point, and presumably this is the price which appears in the real world. Economists may be misguided (who is not?) but they are not ignorant and they are not stupid. So we really should pay attention to what they have to say. Scientists who are misguided are in the best tradition of Chicken Little.


Fig. 2.3-2 Supply and Demand, as Functions of Price. Economist’s Diagram Turned Over.

Problem 2

How Economists Get Supply and Demand Curves

In almost any introductory economics text (e.g., Samuelson’s book) one can find diagrams of price plotted as a function of supply and demand. There are also schedules, or tables, from which points can be plotted and joined by straight line segments, or “smoothed” curves, to give these diagrams. There is extensive discussion of this kind of (supposedly real, or at least representative) data, including an evaluation of quantities from the slope of these curves, e.g., elasticity of supply

E = (d log p)/(d log s) = (s dp)/(p ds) (I may have this upside down.)

The question here is: is it possible to actually go out into the real world and collect real, quantitative data from which these supply and demand curves or schedules can be constructed? I have heard that it is possible, but I am both skeptical and ignorant. So come back with as explicit and detailed information on this question as you can, with especial reference to the difficulties and uncertainties involved.

It has been commented by some academicians (cf. Granger and Morgenstern, pp. 8, 196) that conventional ideas on supply and demand don’t seem to apply to speculative markets. I am sufficiently iconoclastic to believe that the conventional presentation doesn’t apply, except in a very crude and approximate sense, to the real world, period. The changes necessary to make them applicable require close attention to the mathematical definition of function, continuity of a function, and existence and continuity of a derivative. So you would do well to review your knowledge of these. The importance of these notions can be illustrated by the following statement (which is not at all obvious). Price as a function of time, which is a Brownian motion, is a continuous function for which the derivative does not exist (anywhere in the range).
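A minimal numerical sketch of this claim, using a simple random walk as a discrete stand-in for Brownian motion: the typical difference quotient over one step grows like dt to the minus one-half, so the estimated “slope” diverges as the time step shrinks.

```python
import random

random.seed(1)

def typical_slope(dt):
    """Average |difference quotient| over one step of a random-walk
    'price' path whose steps are +/- sqrt(dt), observed over a total
    time of 1."""
    n = int(round(1.0 / dt))
    total = 0.0
    for _ in range(n):
        step = random.choice((-1.0, 1.0)) * dt ** 0.5
        total += abs(step) / dt          # |quotient| = dt ** -0.5
    return total / n

# Each quotient has size dt**-0.5, so refining the time scale by 100
# multiplies the apparent "slope" by 10: no derivative in the limit.
print(typical_slope(0.01))    # about 10
print(typical_slope(0.0001))  # about 100
```

The path itself stays continuous (its increments shrink like the square root of dt), which is exactly the combination the statement asserts.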

If you try to draw a picture of a continuous function with this property, you may find it conceptually very difficult. We shall explain how to do it in detail in the lectures.

2.4 Supply and Demand in the Real World

Let us imagine a situation which, unlike the economists’ theoretical construct, does in fact occur in the real world many times every day. Mrs. Jones has heard of a dress she wants to buy, a green dress with red applique alligators, and she’s determined to have one. More than that, she is willing to pay up to $25 for it, because that is the amount of money her husband has said she could have for the dress. So in this instance we can draw her demand function for this dress as indicated. (Fig. 2.4-1)

The independent variable is price; it is a discrete variable. If she is willing to pay $25, she is willing to pay less. The ordinate is also discrete, and the demand is in fact a function of the discrete set of prices as drawn. The word function is used in the mathematical sense. A unique price gives a unique demand, but the converse is not true. That is why we turned the S-D diagram over.

Demand is a function of some other variables as well, just as the economist’s demand curve is, which are not shown in the diagram. In our case, the other variables which are held fixed for the purpose of the picture are the longitude, latitude and altitude of Mrs. Jones. The demand is unity for p ≤ 25, i.e., not zero, and extends over a region of space as far as Mrs. Jones can see, about 50 feet. The demand exists and is zero outside this range. Fifty feet is her distance

limit of information and communication. There is also a fourth variable which is held fixed in this picture, which is the time. That is, if Mrs. Jones decides she doesn’t want a dress after all, at that time the function collapses to zero. It is still a function, of course, of all those variables, five in all. If she falls asleep or forgets about it, the

demand again drops to zero.

Fig. 2.4-1 Mrs. Jones’ Demand Function.
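Mrs. Jones’ demand can be sketched as a genuine function of all five variables. The $25 limit and the 50-foot radius come from the text; the coordinate origin and the time units are invented for illustration.

```python
def mrs_jones_demand(price, x, y, z, t):
    """Mrs. Jones' demand (in dresses) as a function of five variables:
    price (whole dollars), her position (x, y, z in feet from an
    invented origin), and time t (invented units: she shops during
    0 <= t <= 1). She will pay up to $25, within her 50-foot
    information radius."""
    within_sight = (x * x + y * y + z * z) ** 0.5 <= 50.0
    shopping = 0.0 <= t <= 1.0
    if price <= 25 and within_sight and shopping:
        return 1  # one dress demanded
    return 0      # demand collapses to zero

print(mrs_jones_demand(24, 0, 0, 0, 0.5))  # 1
print(mrs_jones_demand(26, 0, 0, 0, 0.5))  # 0
print(mrs_jones_demand(24, 0, 0, 0, 2.0))  # 0  (she has gone home)
```

Note that the function is non-increasing, not decreasing, in price: raising the price from $20 to $24 leaves demand unchanged at one dress, which is exactly the distinction drawn next.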

There is another slight difference from the economist’s picture, but there is even a mathematical notation and language for that. In the economist’s demand curve, demand is, as we have drawn it in Fig. 2.3-2, a decreasing function of price. In the case of Mrs. Jones’ demand curve it is a non-increasing function of price. The difference is that if there are two neighboring points on the economist’s curve, the one with the larger price is always less in demand than the other. In the case of Mrs. Jones’ curve, the point with the larger price is not greater in demand than the other. This is the difference between a decreasing and a non-increasing function. They are not quite the same thing. This all fits very nicely with what we have said about the importance of discreteness and the importance of the concept of a function. In Mrs. Jones’ case the demand curve really is a function in the mathematical sense (over five domain sets) of all the variables which we have specified, although we have only drawn one, the price.

Mrs. Jones goes downtown and shops around and her non-zero

demand function moves around in space and time until she comes to the second floor of the department store and asks about this dress she wants. The clerk tells her, “No, we don’t have any here.” This tells Mrs. Jones that the supply function is zero at that altitude, but down in the bargain basement she might find a non-zero supply function.


Fig. 2.4-2 Supply and Demand Functions of Price, for Buyers and Sellers.

So she goes down to the basement and lo and behold, there on the rack are two green dresses with red applique alligators, and they are marked with a price of $15. Now we can also draw the supply function for this particular item. This we have indicated (case A) and you will notice that, just as in the case of Mrs. Jones’ demand function, it fits the economist’s picture, to a degree. It is a non-decreasing


function of price. It is quantized at unit elements of supply, rising from zero to two at $15. It has other variables, held constant. The latitude, longitude, altitude and time of the store have the same value as Mrs. Jones’ variables, when she sees it. So just as in the economist’s case, “other things” will be equal. Supply is different from zero only when the store is open. At other hours, the variable time having changed, supply collapses to zero so far as Mrs. Jones

is concerned.

So when we superpose these two “curves” of supply

and demand we see that they do in fact “intersect” at a point, just as the economist’s diagram does, and so there is a transaction. Note that supply and demand are both altered by the transaction. This is assumed not to be true in the ideal economist’s case. We could also draw this situation (Fig.

2.4-2, Case B) in the

hypothetical case that there was only one dress on the rack. You will also note that there is the tacit assumption in drawing the supply curve, that if the store keeper is willing to sell a dress for $15 he is

also willing to sell it for more. So if Mrs. Jones was so broadminded and liberal that she would be willing to pay more than $15 and there

was only one dress, then the curve would be drawn as in case B. Supply and demand would intersect not at a point but over a “line”

segment (actually closely spaced dots). In this case, there is an interesting asymmetry of information.

Mrs. Jones knows the supply function of the store, but the merchant does not know the demand function of Mrs. Jones. So in the normal

course of events, she will immediately slide her demand function (case A) back down to $15, and in that case the supply and demand curves will intersect at a point ($15, 1 dress). It might appear that they then intersect on a vertical segment, but the strict definition of a function does not allow these vertical segments to be considered as

part of a function, as then the uniqueness part of the definition of a function would not be met.

So with the intersection of the supply

and demand functions they can have a transaction.

It sometimes happens, when the supply and demand curves intersect on a horizontal segment, that the people are friendly, and they may split the difference, or they may not.
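The case A situation can be sketched with demand and supply as step functions of whole-dollar prices; the overlap segment, and the single dress that trades, fall out directly.

```python
def demand(p):
    # Mrs. Jones, case A: one dress demanded at any whole-dollar
    # price up to $25, zero above.
    return 1 if p <= 25 else 0

def supply(p):
    # The bargain basement: two dresses, offered at $15 and up.
    return 2 if p >= 15 else 0

# Scan whole-dollar prices for the segment where both are non-zero.
overlap = [p for p in range(0, 31) if demand(p) > 0 and supply(p) > 0]

print(overlap[0], overlap[-1])                      # 15 25
print(min(demand(overlap[0]), supply(overlap[0])))  # 1 dress trades
```

The overlap runs from $15 to $25; Mrs. Jones, knowing the store’s supply function, transacts at the bottom of it, and one dress (the smaller of the two quantities there) changes hands.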

You can see from the above example that with this interpretation of supply and demand, the dimensions of supply and demand are not items (demanded or supplied) per unit time, but items.

There are now a great many examples in the real world. In fact, you cannot go into a store or have any sort of a transaction without precisely these



elements of the supply and demand curve occurring. For example, suppose Mrs. Jones is going to the grocery store to buy ten cans of applesauce, and she is willing to pay 15 cents a can. She sees on the shelf three cans at 10 cents and 12 cans at 12 cents. The demand and supply curves (as lines, really dots) look as we have drawn them. So she buys the three cheaper and the seven more expensive ones to get her ten cans of applesauce.

Fig. 2.4-3 Supply and Demand of Applesauce in the Grocery Store.

There is nothing special about having the demand curve of Mrs. Jones level off at just one unit. It could well be that she wanted more units at lower prices, a demand which corresponds closely, but with significant difference, to the decreasing demand function of the economist. In reference to the applesauce example, there are really several possibilities rather than just the simple minded interpretation of the demand line at ten cans intersecting the supply at ten cans, at a single point (12 cents, 10 cans).

The simple interpretation is that

she would pay 12 cents for all ten cans. In fact she would first cut back her demand to a price of 10 cents, take the three cans and then get the remaining seven at 12 cents. If very crafty, she might ask the merchant if he had any more 10 cent cans first. Again we see here the significance of who has information and who does not. She knows his supply function and uses this information to make a two-

stage purchase.

If he withholds the information that he has more

cans at 10 cents that might be added to the supply function, then he comes out a little ahead of what he would get if he gave out that information.
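The arithmetic of the two-stage purchase (prices in cents, quantities as in the text) shows exactly what Mrs. Jones’ knowledge of the supply function is worth:

```python
# The store's visible supply schedule, as (quantity, price-in-cents).
shelf = [(3, 10), (12, 12)]
wanted = 10  # cans of applesauce

# Two-stage purchase: take the cheaper cans first.
cost, remaining = 0, wanted
for quantity, price in sorted(shelf, key=lambda lot: lot[1]):
    take = min(quantity, remaining)
    cost += take * price
    remaining -= take

single_price_cost = wanted * 12  # paying 12 cents for all ten cans

print(cost)               # 114 cents: 3 at 10 plus 7 at 12
print(single_price_cost)  # 120 cents
```

The 6-cent difference is precisely the value of knowing the full supply schedule, and the amount the merchant keeps if he withholds any cheaper cans from it.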

We are now almost in a position where we can begin to apply

these ideas to a real auction market. Before doing so let me illustrate a few other examples which will bring out certain aspects that are important for the auction market, in particular the availability of information to the different participants. Let us suppose that Mrs.

Jones, having bought in the conventional way her green dress with the red applique alligators, is so

enchanted with it that she wants another dress, this time a red dress with green applique alligators. She carries her demand function of five variables around through all the possible sources of supply without finding a supply function which is different from zero. Then she hears of a very fancy haute couture establishment, and so she is de-

termined to go there. Now this store is run in a slightly different fashion. A character in the funny papers, I think it was Henry, set up a junk or perhaps a lemonade stand which operated on exactly

the same principles. When she goes in, there in the back is a red dress with green applique alligators, but there isn’t any price on the label. So she asks the clerk about it. The clerk explains: “Oh, we can’t have anything so vulgar as prices on our dresses. We don’t do

business that way. If you want that dress, dearie, you just take it up to the counter and you write on the label what you're willing to pay. The manager will look and see whether this is an acceptable price on our list, which we don’t publish because it’s vulgar to have numbers and discuss prices. If your price is higher than what we consider a fair price, we will let you have it at the price which you have specified.”

Mrs. Jones is a little non-plussed at this, but on second thought she decides maybe that’s reasonable, so she goes and does just as the clerk told her, puts down $25 and gets it for $25. Now you will


notice the essential asymmetry in this from the preceding case. In this case, it was the seller who knew the demand function of the

buyer, but not conversely. This situation also occurs in the labor

market. A buyer of personal services will advertise what he will buy at a certain wage, and the seller, the employee who might be willing

to sell his services for less, nevertheless accepts the buyer’s bid, so that this situation is by no means unprecedented. Again you will notice that all of the previous conditions concerning the existence of

real mathematical functions and what variables (five in all) are to

be held fixed and must be equal for both supply and demand, are exactly the same.

Finally we have a third situation. Mrs. Jones has now decided that she wants a black dress with white applique alligators, and it

seems hopeless. Her demand function is up to $50 on this item, but the supply functions are only zero. Then she hears from her operator

in the beauty parlor that there is a store where for a price they will sell you anything. “It’s run a little differently from that bargain basement place, or the high fashion store that you went to a while ago, but nevertheless if you want it, they’ve got it. It’s up there on Wall Street, and it’s called the New York Stock Exchange.” So she goes up there on Wall Street and sure enough, there on the rack is a black dress with white applique alligators. She is about to go and get it, when she is stopped by a man in a sandwich board with the word “broker” on it. He says, “Just a minute, lady, we don’t do business that way here.” She says, “You don’t? Well, how do you do business?”

He said, “You write down on a piece of paper what you are willing to pay for that dress, and then I’ll go to the owner and find out what

he’s willing to sell it for, and if the two prices meet or overlap, then

I will decide what the price is which will be satisfactory to you both.”

Mrs. Jones thinks about this a while. Then she does just what the man in the sandwich board says and puts down $25.

It turns out

that the owner was willing to sell it for $15, so the broker, being an

honest man, comes out and says: “Madam, you get it at a bargain for $20,” and she goes away happy having saved $5, she thinks. The owner is happy since he got five dollars more than he was willing to sell for. You will see that this is a third situation in which neither the buyer nor the seller knew the other’s demand or supply function,

but with the aid of a go-between whom they trusted, a transaction

was effected.

This situation, which held only for one unit, would hold for as many units as you like, nor does there have to be the same number of units on either side. The prices need not be the same; they might be different for each unit. You can simply plot up the demand and supply functions in a discrete manner, a series of steps going down and a series of steps going up, and if they overlap then there is a transaction for an equal number of buyers and sellers at a single price.

The reliability of the intermediary, the broker, is essential to this kind of a market. You can easily see that the man in the sandwich board, had he been so minded, could have told the owner of the dress that the bid was $15, told Mrs. Jones she would have to pay the full $25, and pocketed the difference. No one would have been the wiser if the final information is not available to all parties. It is seen that those who have information that others have not are in a position of advantage, and it takes a degree of faith on the part of the participants and a degree of honesty on the part of the intermediary to assure what might be called fair play. Now perhaps the man in the sandwich board took out a little piece for himself, but maybe that’s regulated.

If you will examine the details of the above picture by contrast to what appears in an elementary textbook, you will see that there are indeed a great many similarities.

The principal differences are brought about, as I mentioned, by making the price the independent rather than the dependent variable, by taking into account explicitly the discreteness of both the domain and the range sets, and by taking into account explicitly that the supply and demand were now items rather than items per unit time. Demand and supply functions change with each transaction. They are definitely functions of time. With these modifications it is seen that there is quite noticeable similarity, but essential differences in detail, between the economists’ theoretical picture and the data of observation.


Continuity and Derivatives of Real World Supply and Demand

There is one aspect of the theoretical curve which does not appear in the observational curve. That is, the solidness and smoothness of the theoretical curves as opposed to the experimental ones. It doesn’t seem possible that the idea of continuity, and the existence and continuity of the derivative, which are essential for the notions of elasticity

of supply and demand and all the various theoretical developments which economists make when they use these derivatives, can possibly


exist in this framework. However, it turns out that if you really ex-

amine the definitions of function, and of continuity and of derivative

very carefully indeed, that even these properties of continuity and

existence of derivative can be satisfied (in part) by the discrete sets of the domain and the range. So far as I can see, this demonstration is more a mathematical feat at the moment, and a tribute to the careful definitions of the mathematicians, rather than providing

a situation which allows the notions of continuity and elasticity and

the existence of the derivative to be applied in the same way that

the economists do it.

Nevertheless, continuity and derivative can

be defined when the definitions are followed carefully. This curious

situation caused a good deal of discussion in the lectures at the time.

Remember now that the supply and demand curves as redrawn are, rigorously, functions of their respective independent variables: price, latitude, longitude, altitude, and time. Of these we are specifically going to concern ourselves only with price. The others are “to be equal,” or held constant.

The “peasant” definition of continuity of a function D(p) is: D(p) is continuous at p = p₀ if

    lim (p → p₀) |D(p) − D(p₀)| = 0



The next question is, exactly what does the word limit mean? For the case of continuity of a function, it is as follows. This is the “aristocratic” definition of continuity of a function. Note particularly the order in which the small quantities ε, δ are picked, and the specifications on them. ε is picked first, must be positive and not zero, and may be as small as you please; ε must exist. δ simply has to exist, and be positive, not zero. It may or may not be “small.” D(p), D(p₀), p and p₀ must all exist, with p ≠ p₀.

D(p) is continuous at p = p₀ if for any given ε > 0 there exists a δ > 0 such that

    |p − p₀| < δ  implies  |D(p) − D(p₀)| < ε


As an aside, we have uniform continuity over a “region” of p’s if the same δ can be used throughout the region. The above is the conventional definition of continuity of a function from any calculus book. It is usually, if not invariably, applied in


a universe of discourse of the real numbers: integers including zero, fractions or rational numbers, irrationals including the transcendentals. This universe is precisely what Chicken Little objected to when the economist drew solid lines. Demand is in integers (units purchased); likewise price ($, 1/8 of dollars, or even pennies). Now if the universes of discourse for the domain and range are discrete sets (let us just use the integers including zero for simplicity) then the δ’s and ε are also so constrained. So let us go back over the definition and see under what conditions D(p) is continuous. ε must be greater than zero, and the smallest such integer which exists is 1. Take δ = 1. Then |p − p₀| < δ forces p = p₀, so that |D(p) − D(p₀)| = 0 < ε, and the condition is satisfied. On an integer domain, every function is continuous at every point.
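The point that, with integer ε and δ, the smallest admissible ε is 1 and δ = 1 leaves p = p₀ as the only point to test, can be checked mechanically. A sketch (the helper `is_continuous_at` and the sample step function are my own illustrative assumptions):

```python
# Continuity check over an integer domain.  With the universe of
# discourse restricted to integers, epsilon = 1 is the smallest positive
# choice, and delta = 1 works for every function: |p - p0| < 1 forces
# p = p0, so |D(p) - D(p0)| = 0 < epsilon always holds.

def is_continuous_at(D, p0, domain, eps=1, delta=1):
    """Verify the epsilon-delta condition at p0 over a discrete domain."""
    return all(abs(D(p) - D(p0)) < eps
               for p in domain if abs(p - p0) < delta)

# A step function: discontinuous over the reals, yet "continuous"
# everywhere once the domain is the integers.
D = lambda p: 5 if p < 30 else 2
domain = range(20, 41)
assert all(is_continuous_at(D, p0, domain) for p0 in domain)
```

This is exactly the “mathematical feat” discussed above: the definitions are satisfied, but in a way that gives the derivative-based machinery of elasticity nothing to work with.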




































[Figure: limit order buy function D(p) and limit order sell function S(p), drawn as step functions over the discrete sets of prices and units.]













0, and lower limits to the absolute difference of bid or offer from the most recent price. This latter lower limit may of course be zero. There are upper limits also to all these quantities; how these are determined we shall shortly discuss. But that there are upper limits is more important than exactly what they are, just as in the case of the inventory limits. All these limits, plus his present inventories and the most recent price, summarize his input information.

It is evident that, just as in the case of the used car dealer, his profit in dollars depends on the size of the spread and the number of transactions which he makes. Every time he makes two transactions, one a buy and one a sell, he gains a spread if the quote hasn’t changed. If he keeps his inventories near their optimum values and stays as far as possible away from his limits, then in the course of time he must move the quotes, and hence transactions, in such a way that the total number of his buys frequently equals the total number of his sells. The sum of these, divided by two, times the average spread he has maintained represents his dollar profit. (Not exactly; see below.)

We can see that if in the course of trading he tends to buy more than he sells, so that his stock inventory gets large, he will want to

lower prices. If he sells more than he buys so that his inventory of stock tends to approach its lower limit, he will tend to raise prices.

The question is how much can he raise or lower prices, and how far apart can the spread be to enable him to keep the market open and

stay away from his inventory limits.


Definition of Profit in Market Making

If the money piles up while the stock inventory stays near its optimum, he can legitimately label the excess above his upper cash inventory limit as profit*, and spend it for something else: wine, women and song, buy a house, or put it in the bank.

He should not use it

for market making, because correct thinking for this market maker requires boundaries on the money to be used for market making.

As we shall see it is equally possible for stock to pile up while he is at or near the optimum point of his cash inventory. The extra

stock certificates above his upper bound of stock inventory should be regarded as “profit,” but “profit” measured in shares of ownership, not their equivalent dollar value. This is “correct thinking” but not according to accepted accounting principles. He can use these “prof-

its” to paper his office, or celebrate with a bonfire.

He should not

* I used the word “profit” in the lectures in order to emphasize the symmetry between cash and stock. “Surplus inventory of cash” might be a better term. The important point is that “profit” should not be regarded as part of the “market making capital equipment.”


use them in market making. In practice he is more likely to arrange a special “bargain sale” outside his regular market making activities.

There are some real life analogs to this situation. Among them are private placement, off-floor or special distributions, sale as letter stock, or sales to the corporation itself, for treasury stock. He may try to get rid of it by any method that does not immediately affect his regular market making.

These methods are applications of the advantage of using temporarily withheld information.⁸ Note that he is not a market maker when he does these things; he is a salesman. It is not good practice, as a market maker, to deliberately get into this position of “profit” in stock certificates. He is then trying, indirectly, to make a profit in dollars on his stock inventory. It is an indirect violation of the basic principles of market making.

Some of the students raised questions about this peculiar concept

of “profit,” and asked why it was inappropriate, for purposes of calculating your net position, and hence net profit or loss, to convert

your stock inventory to some equivalent cash figure. Presumably you might have to do this to satisfy an external auditor or tax collector

who did not share these peculiar views. I admit the practical neces-

sity of having to compute profit or loss in the conventional fashion, but for the purposes of setting the strategy of a market maker, it

should not be done. Let me explain why. First of all, it is the market maker himself who is setting the prices. It is inappropriate, and in a sense a conflict of interest (with

his customers, or another side of his life, as an investor), to think

of his inventory as an equivalent amount of cash, when he to a very considerable degree controls the price.

The difference between a market maker, and a speculator or investor, is that the latter tries to make money on his stock inventory,

whereas to the former the stock inventory is a tool of his trade, to

be kept as nearly constant, and preferably as small as possible. But by his basic principles, he must have some stock to stay in business,

his primary concern.

“Profit” (in dollars or stock) is surprisingly

enough, as we shall see, an incidental byproduct of his activity, just

as (in our view) liquidity is a byproduct of the primarily gambling aspect of security markets in general.

⁸ See Section 2.18 on direct and reverse ratchets. The specialist can stash this “surplus stock” in either his segregated investment account, or his omnibus account. His “inventory for market making” is called his trading account. See Richard Ney, The Wall Street Jungle.

Finally you should note the essential symmetry of the market maker’s activities. Stock certificates and dollars are just two kinds of paper which it is his job to render exchangeable, and this symmetry in viewpoint is quite essential to carrying out the basic principles under which he operates. The two separate viewpoints of “profit” are needed to preserve this symmetry.

We have seen that the lower bounds of inventory for this market

maker are zero, both for cash and stock. Let us now set the upper

bounds in a number of different ways, and see over what limits of

price he can make a market.


The Effect of Lower Limits Only on Inventory,

on Price Limits

Let us assign our market maker a starting price of $30/share. This may be from the most recent transaction, or the previous day’s close, from an underwriter, or from a LOOIM after some crisis and resulting trading halt.

We give him 20 round lots, to be specified as the optimum inventory, and also give him $60,000, specified as optimum, equivalent to the stock inventory at $30/share. $1/2 is specified as an acceptable upper limit of the price change between transactions. We note for future reference that it will make a great deal of difference whether this specification is never to be exceeded, or not on the average (and what kind of an average), or “not very often,” and just how often is not often.

Let us suppose our market maker, like Zero, is literal minded and cautious. Being told to make a market, and not told to make a profit (yet), he focuses just on market making. He adopts, as a rule, just one round lot as the size of either side of his quote. He starts with a quote of (29 1/2, 30). For every buy he makes (a market order to sell into the market), he lowers both quotes 1/2. We define the “drift” d as the change in the mean quote, d = p̄ᵢ − p̄ᵢ₋₁, where p̄ᵢ = (offer + bid)/2 for quote i, from one quote (i − 1) (with an intervening transaction) to the next (i). In this case then the drift is exactly equal to the spread. For every sell he makes, he raises his mean quote in exactly the same way. (See Fig. 2.13-1A.)
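Zero’s rules are simple enough to write out in full. A minimal sketch of this symmetric strategy (the function `simulate` and its encoding of incoming orders are my own, not the author’s): one round lot per side, spread and drift both 1/2, with the starting figures from the text.

```python
from fractions import Fraction

# Sketch of Zero's literal-minded rules: buy one round lot on his bid
# when a market sell arrives, sell one on his offer when a market buy
# arrives, and move both quotes 1/2 after every trade.

H = Fraction(1, 2)  # both the spread and the drift

def simulate(orders, bid=Fraction(59, 2), ask=Fraction(30),
             lots=20, cash=60_000):
    """orders: 'S' = market sell hits his bid (he buys one round lot),
    'B' = market buy lifts his offer (he sells one round lot)."""
    for o in orders:
        if o == 'S':                        # he buys; quotes drift down
            lots, cash = lots + 1, cash - 100 * bid
            bid, ask = bid - H, ask - H
        else:                               # he sells; quotes drift up
            lots, cash = lots - 1, cash + 100 * ask
            bid, ask = bid + H, ask + H
    return bid, ask, lots, cash

# Three sells into the market followed by three buys return quote,
# inventory, and cash exactly to their starting values: spread equals
# drift, so this balanced segment yields zero profit.
assert simulate('SSSBBB') == (Fraction(59, 2), Fraction(30), 20, 60_000)
```

Note that the quote is a pure function of the stock inventory here, which is the property the text emphasizes.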







[Fig. 2.13-1 Examples of Simple (Symmetric) Market Making. (A) Transacted prices (*) for three buys by the market maker followed by three sells; spread = drift = 1/2; profit = zero. (B) Transacted prices for the same case, with spread = 1, drift = 1/2; profit = (1/2 no. transactions)(spread − drift) = (6/2)(1 − 1/2) = 1 1/2. o = offer, b = bid.]

With this “strategy” he can make a market up to 39 1/2, where he runs out of stock, and down to about 18, where he runs out of money. We can also see that his inventory exactly determines his quote, or vice versa. The order in which market orders to buy or sell arrive at the market makes no difference in this regard. If we imagine a “tape” showing his transactions only, the prices will be either 1/2 or zero apart, the latter indicating a buy was followed by a sell, or vice versa. For any interval for which the total number of buys equals the total number of sells, the quote and his inventories are returned to exactly the same values at the beginning and end of the interval. He can make a market and stay in business over a quite respectable price range. Transacted prices never change by more than 1/2. If that is “close” enough to satisfy an acceptable definition of “market” continuity in transacted price, he has made an acceptable market. It is not a “deep” market (only one round lot on both sides of the quote). An incoming order for three round lots would have to be satisfied by three successive quotes.
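For instance, a market order to sell three round lots into this one-lot-deep market would execute against three successively lowered bids. A hypothetical sketch (the function name and fill rule encoding are my own assumptions, following the text’s drift of 1/2):

```python
from fractions import Fraction

# A market sell order for several round lots against a one-lot-deep
# book: each lot executes at a successively drifted bid.

def fill_sell(lots, start_bid=Fraction(59, 2), drift=Fraction(1, 2)):
    """Return the sequence of execution prices for a multi-lot sell."""
    prices = []
    bid = start_bid
    for _ in range(lots):
        prices.append(bid)   # this lot trades at the current bid
        bid -= drift         # the quote drifts down before the next lot
    return prices

# Three lots execute at 29 1/2, 29, 28 1/2 -- three successive quotes.
assert fill_sell(3) == [Fraction(59, 2), Fraction(29), Fraction(57, 2)]
```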

Quote prices are, transacted prices are not, a function of stock inventory.

Let us now introduce a little profit, but stick literally to the other specifications: no transactions more than 1/2 apart, and make a market over as wide a price range as possible. (See Fig. 2.13-1B.) With a “most recent” price of 30, he quotes 29 1/2 to 30 1/2, and drifts the quotes 1/2 after each one round lot transaction, just as before. The drift (1/2) is now one half the spread (unity). He can make a market over exactly the same price range as before, 40 to 18. The transactions on the tape look much the same except that there will be no consecutive transactions at the same price. The average (and every) transacted price change (absolute) will be exactly 1/2; it was slightly less in the previous case.

As before, the quote is a function of the inventory, and vice versa. How much is the profit (in dollars)? You can readily verify from the figure that for any time interval for which the initial and final quote prices, hence also the inventories, are the same, the profit is:

    (N/2) × (spread − drift) × (“size” of market),  N = no. of transactions,

“size” of market = 100 shares in this case.

Let us suppose that intelligent investors with the advice of experienced security analysts recognize that the intrinsic value of this stock is indeed somewhere in the price range $18 to $39, and the price remains in that range for the course of a year. Trading at a modest volume of 10 round lots per day for 250 trading days per year, Zero the mathematical market maker turns a profit of:

    (N/2 = 10 × 250/2)(spr. − dr. = 1 − 1/2)(size = 100 shares) = $62,500    (2.13.6)

or slightly over 50% per year⁹ gross return on “capital,” with conventional accounting.

If he can stay in business for a year, he can

write down his troublesome stock inventory to zero, and stick to conventional accounting principles for profit calculation thereafter. Not bad, Zero, not bad at all. What is the least non-zero profit he can make?

This is set by the smallest discrete price steps. Assuming his first quote is 29 7/8 to 30 1/2, a spread of 5/8 and a drift of 1/2 gives the same price range of market making, 18 to 40, as before. This annual profit is one fourth as much. The average price change between transactions is only slightly greater than for the case of zero profit, and transaction price changes are never greater than 1/2.
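Both profit figures follow from the formula (N/2)(spread − drift)(size). A sketch, with illustrative function names of my own:

```python
from fractions import Fraction

# Profit for a balanced segment (equal buys and sells, matching quotes
# at both ends), per the formula in the text.

def annual_profit(n_transactions, spread, drift, size=100):
    """(N/2) * (spread - drift) * size, all prices in dollars."""
    return Fraction(n_transactions, 2) * (spread - drift) * size

N = 10 * 250  # 10 round lots/day, 250 trading days/year

# Spread 1, drift 1/2: the $62,500 case.
assert annual_profit(N, Fraction(1), Fraction(1, 2)) == 62_500

# The least non-zero profit: spread 5/8, drift 1/2 -- one fourth as much.
assert annual_profit(N, Fraction(5, 8), Fraction(1, 2)) == Fraction(62_500, 4)
```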


The Effect of Upper Inventory Limits, on Price Limits

Let us make a few comments on the above calculations, which will

suggest some alternate strategies.

We have specified the minimum

and optimum inventories, but not the maximum allowed.

We just calculated the transacted price range at the ends of which one inventory or the other hits its minimum (zero) and ends his continuous in time market making. Presumably as these transacted prices were approached, he might want to take some corrective action. What are the possibilities?
approached, he might want to take some corrective action. What are the possibilities?

Before going into this, let us note the following.

Note that the

profit, unlike that of the used car dealer, is not given by the spread alone, but by (spread minus drift) × (1/2)(No. transactions) times “size.” It is the drift which gives him protection as a market maker over as wide a range as $40 to $18 in the price. We have specified the lower and optimum inventories, so follow the principle that he should stay away from both bounds as far as possible. Let us set upper limits of 40 round lots (twice the optimum) and for cash $120,000, also twice the initial optimum. These limits, or the corresponding prices, are reached slightly inside the previously calculated absolute price limits.

They can be calculated as follows, using a drift of $1/2. For the number n_u of steps up,

    $60,000 = 100 Σ_{j=1}^{n_u} (30 + (1/2)j) = 100(30 n_u + (1/4) n_u(n_u + 1)).

From this, n_u ≈ 17, so p_u (upper transacted price) = 30 + 17(1/2) = $38 1/2. The allowed number of steps down is n_d = 20, where the stock inventory becomes 40 round lots, the maximum allowed. Hence p_l (lower transacted price) = 30 − 20(1/2) = $20.

⁹ For specialists on the NYSE, studies have indicated returns of 30 to 40% per year, so our crude calculation seems correct in order of magnitude. (See, e.g., the Cohen report.)
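The count of upward steps can be verified numerically. A sketch (the function `steps_up` is my own illustration): accumulate sale proceeds at prices 30 + j/2 until the extra $60,000 of cash room, the gap between optimum and maximum cash, is exhausted.

```python
from fractions import Fraction

# How many one-lot sells, at prices drifting up 1/2 per step, fit
# inside the $60,000 of remaining cash capacity?

def steps_up(start=30, drift=Fraction(1, 2), cash_room=60_000, lot=100):
    total, n = 0, 0
    # Take another step only if the next sale still fits in the room.
    while total + lot * (start + drift * (n + 1)) <= cash_room:
        n += 1
        total += lot * (start + drift * n)
    return n

n_u = steps_up()
assert n_u == 17                              # matches the text
assert 30 + n_u * Fraction(1, 2) == Fraction(77, 2)   # p_u = $38 1/2

# Downside: 20 one-lot buys take stock from 20 to its 40-lot cap,
# so p_l = 30 - 20*(1/2) = $20, independent of the cash constraint.
```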

Note that py. (upper) and p; (lower) are not quite symmetric about the starting price of $30, and when they are reached he still has about two round lots to sell (at the upper limit of price) and enough cash to buy about four round lots at the lower limit.


The Return to Optimum Inventory, When the Limits are Exceeded

The simplest procedure when these prices, p (upper) and p (lower) or equivalent inventory limits are reached, is to declare a halt in

trading, to be reopened via LOOIM. Technically, this is a failure as a continuous in time market maker, but it won’t happen very often

relative to the number of trades he does make as a continuous in time market maker. Even so, there are ways to prevent this failure or to

camouflage the failure with a continuous “image.”

The suspension

will not last long in time. What are the possibilities when he reaches these inventory limits?

On the downside he suspends at $20, declares a LOOIM and lets the orders pile up. If there are at least 20 round lot incoming buy orders available¹⁰ to him (at any price), then since he sees the orders before setting the opening price, he can always put in 20 of his sell orders at a price such that at least his will be satisfied. He is then exactly at his optimum position as a market maker relative to the opening price: twenty round lots, plus cash to buy 20 more round lots at the opening price. He actually has a little extra cash, since he suspended before he quite ran out of cash.

Suppose there are fewer than 20 incoming buy orders. Then he takes what there are, and sets new inventory limits. Just the number of dollars that he got for his sale is his new optimum cash, and the number of shares that he sold is his new definition of optimum stock inventory. With this new set of limits he goes into business again. He makes a new market, continuous in time. In this case where there were fewer than 20 round lot incoming buy orders, he has some surplus round lots, “profit” or surplus inventory. A bonfire or (not recommended) a special bargain sale (off the “market”) is indicated.

¹⁰ We have to qualify the words “available” and “participate” in case there are “ground rules” of the market (as on NYSE) which may give precedence of orders from outside over those of the market maker.

The suspension on the upside has similar consequences. He participates in the opening, just to the extent that he can buy 20 round

lots or fewer, using only half his cash for the purpose. Then he would have remaining cash to buy an equivalent amount of stock at the opening price. If he buys fewer than 20 round lots, these numbers determine his new inventory limits, and he is again at his optimum inventory point for both cash and stock. He may have some “profit” and it will be in real dollars.

We have described this somewhat artificial example of market making in order to show that consequent to the three principles, there are purely mechanical rules which can be followed.

Once the

limits on inventory, and limits on closeness to the most recent price, have been set, the discreteness of the price (1/8) and discreteness of the size (1 or more round lots) play a basic role in determining the price limits over which the market can be made. The peculiar concept of

“profit” (surplus inventory) emphasizes that profit making must be subordinated to market making, and this puts an inherent symmetry on the way stock and cash should be regarded.

The properties of this market suggest certain properties of real markets, and also ways to modify it in a more realistic direction. Over a predetermined interval of price (20 to 38 in the example)

quote prices are a function of inventory. The price changes between trades are either the drift, or the spread minus the drift.

At these price limits

(20, 38) there are jumps in price, (the LOOIM) probably but not necessarily larger than the previous upper limit of 1/2 in price change. The process then repeats itself, with a new “origin,” or stock inven-

tory balance point price. Even if at each LOOIM Zero succeeds in equilibrating at 20 round lots and cash to buy 20 more, the succeed-

ing prices at which big jumps (the LOOIM) are likely to occur, are not the same. Thus with a suspension at 20 if the price ultimately moves up to 28, there will be another suspension, whereas the previous equilibrium point price was at 30, his original starting price.

In other words, after the inventories are equilibrated by LOOIM, the quote is again a function of inventory, but not the same function that it was before. We also note one slight asymmetry. On a downside suspension (at

$20 in our example) Zero will always have enough stock to equilibrate at a new optimum stock inventory of 20 round lots.

(This assumes


there are enough buy orders “available to him” at the LOOIM.) On

the upside (at $38) he may not have enough cash to buy 20 round lots, especially if the price jumps up with the LOOIM reopening. A series of LOOIM upward suspensions will force him to diminish the chosen size of 20 round lots as his optimum stock inventory.

This is inherently reasonable, a market maker is not expected to make as deep a market in round lots with high as opposed to low priced stocks.

Put another way, if $120,000 is the absolute upper

limit of cash inventory, independent of price, then his optimum and

maximum stock inventory must get less with increasing price.


Tactical Inventory Zones

The preceding market making strategy might be called a “two

zone” strategy.

Inside his inventory limits he has a simple rule to

give his quotes.

Outside it, he has a rule (the LOOIM) for getting

inside and at the center of the first zone again. The first zone may then have redefined limits. The greatest “threat” to being driven out of the first zone is a long, unbroken sequence of market orders of the same sign, and this statement is true for any market making strategy. The limit to the number of these he can accommodate without being driven out of the

first zone is set by his preassigned inventory limits. The transacted price range is set by this number (the inventory limit) times the

allowed drift.

It makes no difference (at this stage) whether these incoming orders are from separate customers, or in a big block¹¹, since the market maker is only quoting a size of one round lot per quote, against incoming orders of unknown size. Let us divide this inventory range (optimum = 20 round lots, ± optimum) into subzones, at the boundaries of which the market maker makes increasingly drastic changes in his strategy to protect himself against being driven outside his absolute inventory limits to a suspension. The numbers given are arbitrary, for illustrative purposes. There are many alternates to the specific choices we give.

1). In the first zone, optimum ± 3 round lots, he quotes a spread of 1/4, and a drift of zero. (29 3/4, 30; size one on both sides.) Note that this gives a depth to the market of at most three round lots without changing the quote. But this depth is not given in quite the same way as quoting a size of 3 at the optimum inventory point, and then dropping to sizes 2, 1 as the inventory goes off balance. Quoting a size of more than one tells the inquirer how many more orders the quote might absorb before moving, whereas only quoting a size of one on each quote withholds this information. This is advantageous to the market maker.

¹¹ If he is given this information (a big block on an incoming order, as on NYSE for a big block accumulation or distribution) he can use this information advantageously. See the section on ratchets. Information on a big block he is not required to execute immediately represents a temporary addition to his inventory (stock or cash) which he has to work off.


2). In the second zone, from optimum ± 4 to ± 8 round lots, he drifts 1/8, maintaining a spread of 1/4. Note that the profit per share per pair of buys and sells in the first zone is 1/4, the full spread, whereas in the second zone it is only 1/8, the spread (1/4) minus the drift (1/8). The “virtue” of “stabilizing prices” and not letting them drift, or “ironing out temporary unbalance of supply and demand,” is also relatively twice as profitable.

In the second zone,

transacted prices change by 1/8, the least possible. The market is as

“continuous” (close) in transacted prices as possible. It should now be clear what the quotes will be like in succeeding

zones away from optimum, as the inventory limits are approached. Increase the drift, so that prices move with the fewest possible number of trades to whatever new price is considered to be an equilibrium or fair price to outside buyers and sellers. Increase the spread, so that profit (spread minus drift) is still positive.

3). From optimum ± 9 to ± 20 round lots of stock inventory he can quote a spread of 5/8 and a drift of 1/2. These tactics in these three zones will get him to a lower limit of price of

    30 − 3 × 0 − 5 × 1/8 − 12 × 1/2 = $23 3/8,

where he reaches his upper limit of stock inventory, 40 round lots.

The upper limit of price, where he has $120,000 in money, is reached at about $36.

With these tactics alone, for all zones, the prices he

quotes will still be a function of inventory, regardless of the order in

which buy and sell orders enter the market.
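The zone arithmetic above reduces to a short check. A sketch (the zone table restates the text’s numbers; the helper itself is my own illustration):

```python
from fractions import Fraction

# Zone schedule from the text: (number of one-lot drift steps in the
# zone, drift per step), moving away from the optimum inventory.
zones = [(3, Fraction(0)),       # zone 1: +/- 3 lots, no drift
         (5, Fraction(1, 8)),    # zone 2: +/- 4 to 8, drift 1/8
         (12, Fraction(1, 2))]   # zone 3: +/- 9 to 20, drift 1/2

def price_limit(start, zones, direction=-1):
    """Cumulative drift through all zones; direction -1 = down, +1 = up."""
    return start + direction * sum(n * d for n, d in zones)

# Lower limit: 30 - 3*0 - 5*(1/8) - 12*(1/2) = $23 3/8, as in the text.
assert price_limit(30, zones) == Fraction(187, 8)
```

The upward limit is not the mirror image of this sum, since on the upside it is the $120,000 cash bound (about $36, per the text) that binds first.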

It will be seen that the strategy of the market maker is exceedingly simple and flexible. Once the zone boundaries of inventory have been set, and a spread and drift chosen for each zone, quote prices are a function of his stock inventory.

All he has to do is to

look at this inventory to know what quote to give. Note that transacted prices are not precisely a function of inventory.

The profit

in any zone is just the (spread minus drift) times N/2, times size,


for any segment of prices with equal numbers of buys and sells and equal quotes at the beginning and end of the segment. N

is the total

number of transactions in the segment. The above market is not very “deep.”

Except in the first zone

there is just one round lot on either side of the quote. The conclusions

can easily be generalized to deeper markets, with more than one round lot on either side of the quote.

At the moment, we prefer

to view this rather as a series of identical quotes, each for a size of one round lot, since this tactic holds back a little information which favors the market maker.


Tactics With the Minimum (1/8) Spread and a Size Greater Than One Round Lot

Let us suppose our market maker, either from pride, regulation, or force of competition, wants to quote the narrowest possible spread, 1/8. Since 1/8 is also the least drift, if he drifts his quote 1/8 after each one lot transaction, spread minus drift is zero and he makes no profit. However, if he holds his quote fixed until his inventory is a net k round lots longer or shorter than at the start of the holding period, and then drifts 1/8 up or down according to his previous policy, his mean drift under this policy is d = 1/(8k). He drifts again when his inventory changes by k more round lots. The formula for the profit then is: (N/2)(spread − d). This is an average, not exact as in the previous case, but fairly accurate.

We can see by means of numerical examples

(see Figure 2.17-1) that if there are runs of buys and sells in the orders, of less than k, the market maker makes the full spread, even

though his policy is to drift by an average d between trades. Note with this policy that quote is still a function of stock inventory, as listed on the left side of each diagram.
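The hold-k policy can be summarized numerically. A sketch (function names are illustrative): mean drift 1/(8k) and the approximate profit formula from the text.

```python
from fractions import Fraction

# Hold the quote fixed until inventory has moved a net k round lots,
# then drift 1/8: one 1/8 move per k trades gives a mean drift of
# 1/(8k), leaving an average profit of (N/2)(spread - 1/(8k)) * size
# per balanced segment.

def mean_drift(k):
    return Fraction(1, 8 * k)

def approx_profit(n_transactions, k, spread=Fraction(1, 8), size=100):
    return Fraction(n_transactions, 2) * (spread - mean_drift(k)) * size

assert mean_drift(3) == Fraction(1, 24)
# Ten transactions with k = 3: (10/2)(1/8 - 1/24)(100) = 125/3 dollars.
assert approx_profit(10, 3) == Fraction(125, 3)
# With k = 1 the policy degenerates to drifting every trade: no profit.
assert approx_profit(10, 1) == 0
```

In units of eighths this is the figure’s (10/2)(1 − 1/3) = 10/3 per share; runs of buys and sells shorter than k earn the full spread, which is why the formula is only an average.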

Although the

inventory is not a function of the quote, as it was when the size, or k, was just unity. Note for example there are & buys in the first zone,

although the inventory rule says the quote holds for unbalance k —1. This is according to the rule. At inventory unbalance k — 1 the mar-

ket maker doesn’t know that the next order will throw him more off balance or not (i.e., whether it will be a buy or sell). Strict attention to the functional relationship, “inventory determines quote” means

that for a string of buys followed by a string of sells equal in number, the transacted prices(*) do not quite repeat themselves, in just such a way that the formula for profit (spread - mean drift) holds, on the average. This is true regardless of the order in which the buy orders


and sell orders come in.



Effective size k=3.

Five buys,

for the market maker. Number of transactions followed b; = lo.)


cee= 45 ° gOS ATH Wee oe”



LDypteRaog-) 9056 axact: profit

oy a


= “2

ivy bee oon ox bo b

pried bd ales rope - bugs st

= 4 e~aaes-) — fo Tsay






Approximate profit (/e/2) (1- '/3) = /0/3



Effective size k = 3,

market maker in the order

Dysek% Ika] ons Tuskrat: 34S


Five buys and five sells

ss bb sss.

for th

Number of transactions= Big

u eo 3 Fol 0% 03 Aenc¥? - 9 by bet oy b br beorb bo b

sales oT off


bursa vised bd



Exact profit = (3a+21j— F) =3 Approximate profit




Effective size.k.

of transactions = 2t,

we OF



so/s , ascm eae (a)

t buys followedLy, t selle, Number

t is such that

[bel ieIstetg si

Tux breif


BY seven sells followed by Seven buys 21

of OF OP oF


a¥ GF

oO e

wy 2b bbb b bbb i]

%® “0, Shere


= - ~ ae mete











time > Fig.

2.18-1 Trading with a ratchet.

The final quote is inde-

pendent of the order in a sequence of equal numbers of buy and sell orders, but the profit is not independent of the order. Drift = dy =

0 after sell. Drift down dg = — 1/8 after buy. Spread = 1/2. b=

bid, o = offer, * = transaction price. It should be noted that with any type of ratcheting the quote


prices are not a function of inventory, but depend on the detailed order in which the buy and sell orders come in. So the market maker

has to look not only at his inventory but also the actual price, or price target that he wants.

Let us define a direct ratchet as one which pushes prices down when the stock inventory is long of optimum, and up when it is short. Reverse ratcheting pushes the price up when the inventory is long of optimum and down when it is short. Since direct ratcheting pushes the quote, and also the balance point price, down when the stock inventory is long of optimum, if continued long enough it will (presumably) bring out the buy orders needed to bring his inventory into balance. Similarly on the upside, so direct ratcheting is intrinsically inventory stabilizing, which is of course what it is intended to do. Reverse ratcheting does just the opposite. It tends to push the price up when the inventory is long; if continued too far, it will presumably bring out more sell orders and throw the inventory even farther out of balance. So reverse ratcheting should be used with caution, preferably when the ordinary stock inventory is close to balance anyhow, in the first zone, and should not be allowed to push the price too far.
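The direct-ratchet rule just described can be put in a minimal sketch. All names, the step size, and the inventory numbers below are illustrative assumptions, not from the text; the point is only the sign convention: a direct ratchet moves the quote against the side of the book that is overloaded, a reverse ratchet moves it with that side.

```python
# Sketch of direct vs. reverse ratcheting (illustrative parameters):
# a direct ratchet pushes the quote DOWN when inventory is long of
# optimum (to bring out buy orders) and up when short; a reverse
# ratchet flips the sign and is therefore inventory-destabilizing.

def ratchet_quote(mid, inventory, optimum, step=0.125, direct=True):
    """Return the quote midpoint after one ratchet adjustment."""
    unbalance = inventory - optimum          # positive means long of optimum
    if unbalance == 0:
        return mid                           # in balance: no ratchet
    move = -step if unbalance > 0 else step  # direct: price down when long
    if not direct:                           # reverse ratchet: opposite sign
        move = -move
    return mid + move

mid = 30.0
assert ratchet_quote(mid, inventory=25, optimum=20) < mid   # long: price down
assert ratchet_quote(mid, inventory=15, optimum=20) > mid   # short: price up
# The reverse ratchet pushes the price up while long, inviting still
# more sell orders, which is why the text says to use it with caution.
assert ratchet_quote(mid, inventory=25, optimum=20, direct=False) > mid
```

The `step` of 1/8 matches the price grid used elsewhere in the chapter, but nothing in the sketch depends on that choice.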

As we have indicated, it is a violation of first principles of market making to try to make a profit on your inventory. However, reverse ratcheting is particularly helpful when (as on NYSE, or in the first stages of making a market after underwriting) the market maker has a special big order to handle. See the section on the specialist (2.25). Any order whose size the market maker knows, and which he is not required to execute immediately, gives him special information advantageous to him, and can be regarded as temporary inventory. He uses this information to preserve his image of making close (market continuity) transacted prices, and possibly a profit too.

There are analogs to this situation in other aspects of economic life. "List prices" are stable and continuous in time, but there are numerous occasions of side deals both above and below these prices, which are not given the publicity that would disturb the "image" of fixed and continuous list prices. The section on the specialist describes some of these "side deals" with big orders.

We can make a comparison between the direct ratchet, when used to bring inventory to its optimum point, and the LOOIM, called or overnight, used to accomplish the same end. The LOOIM determines a new price, set by the combined public judgment expressed by all the limit order prices, plus the judgment of the market maker after he sees the orders. At this new price the market maker's inventory is at optimum. With a direct ratchet, the market maker tries to guess (perhaps several guesses in succession) what this price will be, and ratchets to it as a target price. So we can say a LOOIM is "one big ratchet trade" to equilibrate inventory. Alternatively we can say that equilibrating inventory with a direct ratchet is a "quasi LOOIM" smeared out in time, using the market maker's estimate (rather than specific limit orders) of what the public is willing to accept as an inventory balance point price.


Comments on Continuity. Methods of Preserving the Image of a Continuous Market Maker

Conventionally, one thinks of continuous increasing and decreasing supply and demand functions of price intersecting at a point, so conventionally one might think it was up to the market maker to find that point. The notion of "continuity" of prices, or closeness to the most recent price, carefully fostered by the professional market makers and unconsciously subscribed to by the public, coincides well with this conventional point of view. One might even say that since prices are close ("continuous") most (not all) of the time, the conventional point of view must be correct, since it is in agreement with observation. Even (or especially) Chicken Little should be impressed by this argument. A closer look at the phenomenon shows some small differences from the above picture, but those small differences have big consequences.

Supply and demand, as we have redefined them, are step-like functions of price, piecewise continuous, non-decreasing and non-increasing, respectively. They intersect not at points but on finite horizontal segments. These segments are long, consequent to the properties of market orders. So a single intersection price is ambiguous (the transacted price is not a function of supply and demand). The market maker picks prices, and sequences of prices, which bring satisfaction to the customer and profit to him, just as the gentleman in the sandwich board did when Mrs. Jones bought her dress.
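The point that step functions intersect on a segment, not at a point, can be seen in a toy sketch. The particular step functions and quantities below are invented for illustration; the only structural claims taken from the text are that cumulative demand is non-increasing, cumulative supply non-decreasing, and that they overlap on a horizontal segment of candidate prices.

```python
# Toy step-function supply and demand (illustrative numbers): between
# $29 and $31 both sides want to trade 10 shares, so EVERY eighth in
# that open interval "clears" the market. The curves do not pick a
# unique price; the market maker does.

def demand(p):
    """Shares the public would buy at price p (non-increasing in p)."""
    return 30 if p <= 29 else (10 if p <= 31 else 0)

def supply(p):
    """Shares the public would sell at price p (non-decreasing in p)."""
    return 0 if p < 29 else (10 if p < 31 else 30)

eighths = [29 + k / 8 for k in range(17)]   # prices 29, 29 1/8, ..., 31
clearing = [p for p in eighths if demand(p) == supply(p) == 10]
print(clearing[0], clearing[-1], len(clearing))  # a whole segment clears
```

Running this lists fifteen equally valid "intersection" prices, from 29 1/8 to 30 7/8, which is exactly the ambiguity the paragraph describes.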

The tactics for inventory balancing which we have given for the continuous market maker to stay in business during business hours¹⁴ (increasing spread and drift, jumps, ratcheting) all imply an underlying assumption that must be true for them to be successful. It is that when the price moves far enough, orders of the appropriate type, buy or sell, will come in soon enough (before the close of business) to keep him from exceeding his inventory limits. This is another reason, especially in near crises, for keeping the size on the quote small. In ratcheting, one might have enough stock to have five round lots on one side of the quote, but feeding it out one lot at a time takes a little longer.

¹⁴ The overnight suspension is a big help here.

The greatest threat to the continuous in time market maker occurs when the above assumption is not true. He is then confronted with an unbroken sequence of market orders all of the same sign. There are some steps he can take to preserve his "image" of continuous (in time) market making. One of these is a very large spread. There is always an upper limit of spread beyond which people will become disgusted and not do business with him. This is one way of actually suppressing trading while still saying he is in business, without having to do much business. I can stand here and say I will make a firm market for IBM: I will buy at a dollar a share and I will sell at $1,000 a share. I am, strictly, a market maker, but I am in no danger that people will do business with me. This is a ridiculous extreme of the strategy of the Bank of England in a crisis. They might well discount the assets of the provincial banks at some horrible rate, 15% or 20%, but the important point was that the market or the bank stayed open; a market was to be had. This strategy of market makers is indeed adopted in a crisis in the over the counter market today. Stocks may ordinarily be quoted 3% of the price apart; after some particularly bad piece of news they may be quoted 15% apart, and with a big jump too. This is not considered good market making as a regular practice; nevertheless the image, and indeed the reality, of market making is preserved. Exactly the same phenomenon occurs on NYSE, but to a lesser degree than a 15% spread.

There are other artifices by which a market maker may preserve his image in a crisis. For the OTC market, the simplest one is to take the phone off the hook. This is effectively a suspension, but is not labeled as such. It gives more time for orders to come in, cuts down on the number of transactions, and gets him closer to the close of the business day. A similar practice on the NYSE is called "walking away from the market." The specialist can say, "Help, I have to go to the bathroom," and he simply disappears. If it is 3:15 P.M. and he stays there until 3:30, the close, he has not officially weaseled out as a market maker. There have been times when banks have done similar things when there is a run on them, and they want to preserve the image of staying open. They are racing against the clock until the bank can close anyhow. To slow the run down they will send out a few employees to get in line with bags of money to make pseudo-deposits, or arrange for the employees to take a long time dickering and haggling at the cashier's window. So the image of the bank being open during regular business hours is preserved. In fact, the banking process is slowed down enough so that by the next day perhaps they can scrape up enough money to keep the bank open. A suspension for a market or bank, called or naturally overnight, gives the participants time to assemble information and make decisions which continuous in time trading does not give either them or the market maker. The LOOIM or its variants just give all concerned more time, and all the same amount of time.


Continuous Market Making With Fixed Minimum and Optimum Inventory, but No Maximum

In the preceding discussion of the single market maker without competition, handling market orders only, we set separate and predetermined limits on the cash and stock inventory. There were also constant limits set for each "zone" in the course of continuous trading, for allowable drift and spread. The size was limited to one round lot for simplicity. This is not an essential restriction. All these limits could be violated whenever there was an interruption of continuous trading, either naturally overnight or with a crisis suspension, a technical failure of the continuous market maker. We saw that these interruptions, naturally occurring once a day, and rarely more often, were of the greatest importance to the market maker for safety and for enabling him to stay in business. These "discontinuities" are an essential part of the "continuous in time" market making process. We saw that the separately specified optimum cash and stock inventories were "compatible" at the starting price of $30. The optimum cash would just "buy" the optimum inventory of stock at the starting price. The separately specified limits on the two inventories became incompatible as the price moved away from $30. The incompatibility was noticeable but not large at the extreme limiting prices of market making allowed without any attempt to adjust the limits. This suggested rather naturally that at least one and possibly both of the upper inventory limits of cash and stock might preferably not be predetermined separate constants, but depend a little on the price.

We now want to show that by relaxing some of the above predetermined constants, we can devise a market which can be operated continuously in time without any interruptions whatever, or jumps in price greater than a predetermined amount whatever. This market will operate over a much wider but still finite price range than before. The price limits of market making will be set by the discrete units of money ($1/8 or $.01, as you may prefer) and the discrete units of stock, one round lot, or one share, if you prefer. This market will bring out the essential symmetry of stock vs. cash, and some similarities to a real market, notably the British market, where the customers' orders are given in dollars (or pounds), not number of round lots, and the number of shares transacted is adjusted accordingly. It will also bring out similarities between this British market and the American odd lot, round lot and big block markets, essentially markets of increasing size of trade. We want to bring out a fuzziness of distinction between small and large price changes. Discreteness will be an underlying and unifying concept, just as it was in going from the economist's to the market maker's ideas on supply and demand. Finally, we have seen the practical necessity to the market maker of interruptions in time, and big price jumps. We shall see that they are also a logical necessity. This logical necessity is established by giving the conditions for eliminating them, and showing that these conditions cannot be achieved. This last point has a bearing on a question of some theoretical and practical interest. Big price jumps, or equivalently long tails in the distribution of price changes, are the subject of considerable attention not only from academicians and speculators, but also from the regulators and supervisors of the market.

With this market, Zero the mathematical market maker is given 20 round lots, $60,000, and a starting price of $30, just as before. The lower inventory limits are zero, just as before, but no specification is given on the upper limits. He is told to make a market with changes in transacted prices not greater than 5% of the price (5% to the nearest 1/8) between successive transactions. This specification is a little gross for OTC stocks unless they are not traded actively, but it is not intolerable. Note that 5% as a decimal is just the reciprocal of the number of round lots he was given. With 100 round lots he could make a nice close market of 1% price changes. Zero has to make this "continuous" market in price (5% price jumps) 24 hours a day, 365 days in the year, leap years no exception, and positively no suspensions or price jumps of more than 5%, ever. Continuity in time of the quote is also required, without any exceptions.

The formula for the sum of a finite geometric progression is:

$$1 + r + r^2 + r^3 + \cdots + r^N = \frac{r^{N+1} - 1}{r - 1}$$

This finite form works both when r is greater or less than 1. It even works when r is equal to 1, if you are careful in how you express it, as indicated here:

$$\frac{r^{N+1} - 1}{r - 1} = \frac{e^{(N+1)\log_e r} - 1}{e^{\log_e r} - 1} = \frac{(N+1)\log_e r + \tfrac{1}{2}\left((N+1)\log_e r\right)^2 + \cdots}{\log_e r + \tfrac{1}{2}\left(\log_e r\right)^2 + \cdots} \to N + 1 \quad \text{as } r \to 1$$

Moreover for |r| less than one you can sum an infinite number of terms:

$$\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}$$
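The geometric-sum formula and its careful r = 1 limit can be checked numerically. This is my sketch, not from the text; the variable names are arbitrary.

```python
# Numerical check of the finite geometric sum (r**(N+1) - 1)/(r - 1),
# including the limit N + 1 as r -> 1 derived from the log_e expansion.

def geom_sum(r, N):
    """Sum 1 + r + r**2 + ... + r**N via the closed form."""
    if r == 1.0:
        return N + 1                      # the limiting value as r -> 1
    return (r ** (N + 1) - 1) / (r - 1)

N = 10
# The closed form matches term-by-term addition for r below and above 1.
for r in (0.5, 1.05, 2.0):
    direct = sum(r ** i for i in range(N + 1))
    assert abs(geom_sum(r, N) - direct) < 1e-9

# As r approaches 1 the closed form approaches N + 1 = 11.
gaps = [abs(geom_sum(r, N) - (N + 1)) for r in (1.01, 1.001, 1.0001)]
assert gaps[0] > gaps[1] > gaps[2]       # distance from 11 shrinks
```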



Fig. 3.15-4 Across the Market Dispersion (χ²) for Industrial Stock Prices, B.F. Goodrich to Hudson Bay Mining. × t₀ = 1945.5, N = 12. ○ t₀ = 1951.5, N = 10-12 (two "vanished" from sample in 1956). Data from Securities Research Company.

These figures are from M. F. M. Osborne, "Random Walks in Earnings and Fixed Income Securities", paper given to the Institute for Quantitative Research in Finance, Apr. 1968.


The sequential dispersion of these vehicles has never been investigated to my knowledge, except for the original work of Bachelier. He found that for “rente” or perpetual French bonds, the sequential

dispersion increased as the square root of the time up to intervals of



Fig. 3.15-5 Across the Market "Dispersion" (χ²) for Fractional Change of Earnings of Industrial Stocks, Goodrich to Hudson Bay Mining. △ t₀ = 1945.5, N = 12. △ t₀ = 1951.5, N = 10-12 (two in sample vanished). See Fig. 3.15-4. [Axis: time t − t₀ in years; lower data points scaled by a factor 2.9.]


45 days, the longest tested.

What happens for longer intervals is

unknown to me. It seems inconceivable that the sequential dispersion of such vehicles would increase indefinitely. Other qualitative factors must enter into the description, such as the probability of default. For me this is an unexplored problem.






Fig. 3.15-6 Across the Market "Dispersion" (χ²) of Earnings per Dollar of Price for Industrial Stocks, Goodrich to Hudson Bay Mining. ×(t₀) = 1945.5. ○(t₀) = 1951.5. See Figures 3.15-4, 5.


One can also examine other sequences of data from the securities market, Figures 3.15-4 to 3.15-9. For the earnings sequence of common stocks, the fractional or % change in earnings increases in its dispersion across the market like the square root of the time interval, but changes in the logs of earnings do not. Earnings per dollar (the reciprocal of the P/E ratio) are roughly constant; they do not spread out at all. The volume sequence has never been explored for sequential or across the market dispersion.

Fig. 3.15-7 Across the Market Dispersion (χ²) of Log Price for Utility Stocks, American Gas & Electric to Indiana Power & Light. Data from Securities Research Company. ○ t₀ = 1946.5, N = 14. × t₀ = 1956.5, N = 14 (one stock changed in sample).

Fig. 3.15-8 Across the Market "Dispersion" (χ²) of Fractional Change of Earnings, (E(t) − E(t₀))/E(t₀), for Utility Stocks. Data as in Fig. 3.15-7. [Axis: time t − t₀ in years.]


Fig. 3.15-9 Across the Market "Dispersion" (χ²) of Earnings per Dollar of Price for Utilities. Data as in Fig. 3.15-7. [Legend fragments: industrials t₀ = 1945.5 and t₀ = 1951.5; t₀ = 1958.5; axis: time t − t₀ in years.]


Brownian Motion as the Continuous Limit of a Random Walk

I now want to describe the situation called Brownian motion, which is a limit of a random walk, or of a whole assembly of random walks, depending on whether you like to look at sequential or across the market distributions. Suppose instead of labeling an increment by an integer which is the number of steps taken, we speak of the time since the walk started, and we take as given the number of steps which are taken in any minute. So instead of expressing the position (price) as a function of the number of steps, we express it as a function of the time interval after some chosen starting time. Then we increase the number of steps per minute indefinitely. The walker steps more and more rapidly, but we shorten the length of each step taken in such a way that in the limit of infinitesimally small steps taken infinitely rapidly, we still have some finite process going on.

This can be done in the following way. Let us suppose we take n = 1/ε steps per minute; ε is the small time between steps, and each step is of expected length h. We can express the number of steps after time t as i = Int(t/ε), where Int means the closest smaller integer to t/ε. So our random walk in terms of i,

$$P(i) \;\text{“=”}\; P_0 + ih + \sqrt{i}\,\delta,$$

is expressed in terms of the time variable t as follows. We replace h by Vε, where V is a constant, independent of ε. Similarly we replace δ by D√ε. V (velocity) is the drift per unit time, D the dispersion or standard deviation developed after unit time. Since the ε's cancel out, we have:

$$P(t) \;\text{“=”}\; P_0 + Vt + \sqrt{t}\,D$$

Note that we have not yet taken the limit ε → 0. The Int means that P(t) is spaced ε apart in time, and also in small steps h + δ in the vertical or price coordinate. As ε → 0 the dots get closer and closer together in both coordinates. V is called the drift velocity, D the diffusion constant. This is a "continuous in time" description of a random walk, and it is called Brownian Motion.

It is actually this continuous form

which originally appeared in Einstein’s paper on Brownian Motion.

It is also in Bachelier’s thesis, and in Feller’s book in the references.
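The rescaling h = Vε, δ = D√ε can be tried out in a small simulation. This is my sketch under those stated substitutions (the function name, parameter values, and trial counts are all illustrative): as ε shrinks, the position at time t keeps a finite mean near Vt and a finite standard deviation near D√t.

```python
# Sketch of the continuous limit: n = 1/eps steps per minute, each with
# drift h = V*eps and dispersion delta = D*sqrt(eps). Finer steps, same
# limiting law: mean position ~ V*t, standard deviation ~ D*sqrt(t).
import math
import random

def walk_position(t, eps, V=1.0, D=2.0, rng=random):
    """Position at time t of a rescaled coin-flip random walk."""
    steps = int(t / eps)                     # Int(t/eps) steps by time t
    h, delta = V * eps, D * math.sqrt(eps)   # rescaled step parameters
    return sum(h + rng.choice((-1.0, 1.0)) * delta for _ in range(steps))

random.seed(0)
t, trials = 1.0, 4000
for eps in (0.1, 0.01):                      # halve-and-halve the step time
    ps = [walk_position(t, eps) for _ in range(trials)]
    mean = sum(ps) / trials
    sd = (sum((p - mean) ** 2 for p in ps) / trials) ** 0.5
    print(eps, round(mean, 3), round(sd, 3)) # mean ~ V*t = 1, sd ~ D*sqrt(t) = 2
```

For both values of ε the sample mean hovers near 1 and the sample dispersion near 2, which is the "finite process" the limit is designed to preserve.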

Brownian motion was discovered by a botanist who was looking through his microscope, about 1820 or thereabouts, at very small spores, I think of ferns or a fungus. In this image in the microscope he saw what he called a swarming, seething motion, like an agitated hive of bees, or gnats on a summer evening. At first he thought this might have been the manifestation of life at its most elementary level. After some reflection and investigations by other people, they decided that this chaotic motion probably was caused by the individual molecules colliding with the somewhat larger but still very small spores. Molecules had been suspected to exist since the days of Democritus, but no one had ever proved that they did. They examined Brownian motion in dust particles in drops of water that had been sealed up in crystals for millions of years and were still jumping like mad. They showed that the motion changed with the temperature and with the size of the particles. Finally Einstein related the mass of the particles and the viscosity of the liquid in which the particles moved to the number of molecules that were hitting them, and the temperature. Then Perrin observed, in an experiment looking at small droplets of resin, gamboge I believe, that the formula for the way in which these particles moved fit the square root of time diffusion law. In many respects our individual observations of prices are an almost one to one analog with what Perrin did when he looked at his particles through a microscope and observed their positions.

I strongly urge you, if you have any friends in the biology department or medical sciences, to go and look at tobacco smoke under 500 power magnification. This will show you Brownian motion very well. It is quite spectacular to watch. The particles really do jump in a mad fashion.

I want to point out that this limiting case of Brownian motion has a very curious property which I mentioned earlier. It is continuous in a mathematical sense but it doesn't have a derivative. The definition of continuity adopted for this case is not quite the same as it was in the case of the calculus, or for the discrete variable case, where the universe of discourse was the integers, and you could in the aristocratic sense still satisfy the definition of continuity. Before I give this definition, let me point out what is actually going on when I compute the expected value of the position and dispersion of a random walk after i steps.

Let us suppose we have a coin flipping random walk, with Prob(s = h + δ) = Prob(s = h − δ) = 1/2. After 50 steps the position was 50h ± √50 δ. At each step there were two possibilities, so after fifty steps there are 2⁵⁰ ≈ 10¹⁵ different possible random walks. 50h ± √50 δ is the "position" after an average over this huge number or ensemble of random walks. In physics this is called an ensemble average. When we estimated, from a single random walk given to us, what the advance h and dispersion δ per step were, we assumed that we had just one out of this monster population of 10¹⁵ different walks, with a high degree of likelihood that it wasn't a particularly unusual member of this huge population of possibilities, and that we could estimate h and δ from it.
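The estimation step described above can be sketched in a few lines. The numbers h = 0.5 and δ = 2.0 are illustrative assumptions, not from the text; the sketch generates one member of the 2⁵⁰ possible walks and recovers noisy estimates of h and δ from its step increments.

```python
# One 50-step coin-flip walk (steps h + delta or h - delta, each with
# probability 1/2) standing in for the whole ensemble: estimate the
# advance h and dispersion delta per step from that single realization.
import random

random.seed(1)
h, delta, n = 0.5, 2.0, 50                  # illustrative true parameters
steps = [h + random.choice((-1.0, 1.0)) * delta for _ in range(n)]

h_hat = sum(steps) / n                                         # sample mean step
delta_hat = (sum((s - h_hat) ** 2 for s in steps) / n) ** 0.5  # sample dispersion
print(round(h_hat, 3), round(delta_hat, 3))  # noisy estimates of 0.5 and 2.0
```

With only 50 steps the standard error of the estimate of h is about δ/√50 ≈ 0.28, comparable to h itself, which is why a single short walk is such an unreliable witness.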

Now what does it mean to ask for continuity? You can see from the way we carefully adjusted the length of the step and its dispersion, as the steps got smaller and more rapid, that the particle did have a definite expected position. Since a particle does have a position at any given instant of time, if you are only adding infinitesimal elements to the position, the property of continuity should be preserved. For this limiting Brownian case we have not 10¹⁵ but an infinite number of different random walks, of which, for estimation, we pick just one. I think you can see that a plausible definition of continuity (it is not the only possible choice) is this one. It is called mean square continuity. Instead of saying P(t) − P(t₀) approaches zero as t approaches t₀, you say that the expected value of the square approaches zero, because it is that quantity which is measured with uncertainty, or variance. Certainly if P(t) approached P(t₀), then the square would too. So the stochastic definition of continuity is this one. A Brownian motion P(t) is continuous at t = t₀ if:

$$\lim_{t \to t_0} E\left(P(t) - P(t_0)\right)^2 = 0$$

You can also give a delta-epsilon definition if you don't like the peasant definition and you don't like the word limit.

In mathematical language P(t) is described as continuous "almost surely, almost everywhere," or everywhen, if the variable is time. The extra words in quotations mean that the probability distribution of P(t), instead of being binomial, is continuous in the limit ε → 0. Being "normal," it extends from minus infinity to plus infinity, so off in the tails, which is a very unlikely state of affairs, P(t) and P(t₀) might be different with t close to t₀. Hence we have those words, "almost surely almost everywhere," i.e., for the entire span of observation or domain of t₀. This is not quite the same definition of continuity as in the calculus.

If you try to generalize by asking about the derivative, the obvious thing to say is: if P(t) is going to have a derivative, let the following expression approach a limit:

$$\left(\text{“derivative”}\,P(t)\right)^2 = \lim_{t \to t_0} \frac{E\left(P(t) - P(t_0)\right)^2}{(t - t_0)^2}$$

If you work this out, you will see that as t − t₀ gets small, the limit doesn't exist; it becomes infinite. Here is a function which has the peculiar property of being continuous (almost surely almost everywhere), but it doesn't have a derivative (squared) almost surely almost anywhere. It is a freak, a continuous function without a derivative. If you look in a microscope at Brownian motion of tobacco smoke you can easily see that, whereas you can follow with your eye one particular particle of smoke, it jumps and whirls and dances and jiggles about so that it doesn't seem to really have, in any understandable or measurable sense, a rate of change of position with time.

Velocity vs. Derivative of Position

There are some subtleties and even paradoxes in the above discussion, and they matter, even or especially for the stock market. There are three mathematical processes going on: 1) the limit as the steps get small and the rate of taking them increases (the passage from discrete steps numbered i to continuous time t); 2) calculating the expected value; 3) the limit t → t₀. We had the concept of a drift velocity V, but we don't have a derivative (by our definition). Velocity is supposed to be just the derivative of position. How can this be? The drift velocity is

$$V = \lim_{t \to t_0} \frac{E\left(P(t) - P(t_0)\right)}{t - t_0}$$

which is not the same as the square root of the preceding "stochastic" definition of the derivative squared. If we don't use squares, then the dispersion δ of one step never appears in our calculation of expected value, and the stochastic nature of the process is lost entirely. The moral is that definitions, and the order in which operations are carried out, matter. The square of an expected value is different from the expected value of the square, and the difference matters.


“Velocity” and derivative of position with time may or may not be the same thing, depending on how you define them.
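The order-of-operations point can be checked with one symmetric step. The numbers below are illustrative: for a step s = h ± δ with equal probabilities, (E[s])² = h² while E[s²] = h² + δ², so dropping the square loses the stochastic term δ² entirely.

```python
# Square of the expectation vs. expectation of the square, for a
# symmetric two-outcome step s = h + delta or h - delta (prob 1/2 each).
h, delta = 0.5, 2.0                          # illustrative step parameters
outcomes = (h + delta, h - delta)

mean = sum(outcomes) / 2                     # E[s] = h
mean_sq = sum(x * x for x in outcomes) / 2   # E[s^2] = h^2 + delta^2
print(mean ** 2, mean_sq)                    # 0.25 vs 4.25: the order matters

assert mean ** 2 == h ** 2
assert mean_sq == h ** 2 + delta ** 2
```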

I am sure you have noticed on a theater marquee a "velocity" in a moving line of lights. If you look casually there is motion, but there isn't anything moving. The lights (cf. transacted prices) just flash on and off in such a way as to make you think so. Not noticing this distinction in the stock market can cost you a lot of money, if you unconsciously accept the idea that transacted prices are continuous in time, with a continuous derivative.

You have bought a high flier, and you put a stop loss sell order under it, in case it starts to go down. So, probably, have some other people. On the chart, when you get stopped out, it is almost always at the lower end of a little continuous streak of prices that you get your execution. But if you look at the record, there probably weren't any transactions near and below the price where your stop was set. The "streak" on the chart is not on the tape. There was a jump, with or maybe without a suspension, and all those stop loss orders got swept into a pile for execution at the lower price. Continuity and smoothness were lacking just at the time when you wanted these properties.

Not infrequently stop loss orders are banned in anticipation of just such a situation. If you hear of such an announcement, it's time to get out of the market.

I can make another familiar analogy. The "motion" in a "moving picture" is not real, but it is exceedingly realistic. A movie is, strictly, a rapid discrete sequence of "stills." Rarely, the veil of illusion is torn: the wagon goes one way, the wheels momentarily spin in the opposite direction. More often, very realistic illusions are created. A ghost passes through a wall, Dr. Jekyll turns into Mr. Hyde, we have the illusions of "Fantasia" created by that master craftsman, Walt Disney. In the stock market the reality is a set of discrete "stills," the transacted prices, with a double valued quote in between (sometimes). If the "viewer" (speculator, investor), like a small child at his first movie, really believes he is seeing a continuous and smooth process, he can and does develop all sorts of delusions concerning the market. It is in fact very difficult to wash your brain of this false belief, and of the false conclusions this belief engenders. The specialist is licensed and expected to foster this illusion of continuity, most but not all of the time. It can be a very profitable license.



Random Walks Whose Properties Change With Time

I have pointed out in the preceding sections that it was perfectly possible to have a random walk in which the expected advance (and also dispersion) varied with the index number of the step, or, in terms of Brownian motion, varied with the time. If the dispersion was small compared to the expected advance per step, the walk became almost but not quite indistinguishable from some well defined and even analytic function of time.

I now want to point out that all of those possibilities when the dispersion per step is small compared to the expected advance per step could equally be true when the dispersion per step is large compared to the expected advance. Parenthetically, Brownian motion is a rather special case of this, since the ratio of dispersion to advance per step,

$$\frac{\delta}{h} = \frac{D\sqrt{\varepsilon}}{V\varepsilon} = \frac{D}{V\sqrt{\varepsilon}},$$

gets large for small ε, but separately both advance and dispersion approach zero.

For a random walk, the expected advance may change with time, but if it is small compared to the dispersion it might be very hard to find this time variation. The examples of Problem 4 illustrated this for stock prices. You might well have had a bull market, and you might well have had a bear market, and lost a lot or gained a lot, but the dispersion was large compared to the expected advance, so it was hard to show the expected advance as being different from zero. It was possible only for intervals of more than a year, and in favorable cases.

The fact that the parameters h, δ of a random walk can change with the time makes it important to see if there are other ways in which the stock market, as represented by random walks, changes with the time, which are a little more detectable. The market and its properties do change with the time. We can make an analogy of the following sort. Brownian motion is usually discussed for inert, long-lived particles, but bacteria also show Brownian motion. You can watch one for twenty minutes and he will follow a random walk, and then he splits in two. A subsidiary is spun off. Now which piece do you look at?

Or another bacterium may swim by and eat him up (an acquisition, or merger). Now what do you do? So there is certainly a change with the time here of which a simple square root of time diffusion law takes no cognizance whatever.


You may recall that I said the problem of the stock market was like the inverse of playing parcheesi. Instead of knowing the rules for playing the moves of the game, you are given a lot of games with moves, and you try to figure out the rules. In the case of the stock market the "rules" may change because there is actually a legal change, or the properties of the game may change because people's tastes change as to what is important to buy and to sell. Such changes do occur in the securities market. You can see that trying to figure out what the rules are, if they are going to change from one year to the next, is going to be difficult. If you aren't told when the rules change, you have to infer it from various other sources, or study the moves very closely, to see whether or not the rules have changed.

Let me give a brief history, giving examples of how the rules and preferences have changed. Common stocks have not always been regarded as popular and appropriate vehicles of public investment. They used to be restricted to very disreputable financial pirates, like the Rockefellers and the Vanderbilts, the Harrimans, Astors, and Goulds, and Leland Stanford. No respectable person would own these common things. That's why they were called common stock. People of property and class owned bonds and mortgages, real estate and slaves. Common stocks were very definitely a vulgar way to play with money. Respectable people who wanted to take risks took them in conservative real objects like ships, commodities or race horses. Nevertheless the vulgarians made so much money that finally people had to pay attention to them, and so they became respectable.

The standards by which common stocks are evaluated certainly have changed over the years. You can read about these changes in Benjamin Graham's account in the references. In 1915, there were certain standards by which common stocks were evaluated, but they were not what they are now. Changes are going on even now. People have always known that earnings have something to do with prices going up and down. Up to about ten years ago no one made a business of systematically forecasting earnings, but as people appreciated, or at least believed, that you could forecast the earnings and their effect on prices, whole books of earnings forecasts were prepared.

For some stocks the earnings forecast was more important than the earnings themselves. We have the astonishing phenomenon that when the earnings don't come up to the popularly and widely touted forecast, the stock price will crash, even though the company may be in perfectly respectable condition at the time. I am quite certain that the publication in the newspaper every day of price-earnings ratios is going to have an effect on peoples' estimate of what they will pay for a stock that hasn't occurred before, simply because the information is presented to them without their having to work nearly as hard for it.

As I mentioned in the example of bacteria, where they split and merge, the concept of Brownian motion of an identifiable particle really doesn't apply for indefinitely long intervals of time. It is just the purpose of Problem 5 to put some quantitative figures on this phenomenon. If particles are not identifiable after ten years, then it doesn't make much sense to extend the concept of a random walk to periods of time as long as that unless you take explicit account of the fact that the particle may vanish or new ones are created.

Stocks become delisted or liquidated or bankrupt or eaten up, so the phenomenon is much more a set of random walks with appearance and disappearance than it is of a simple inert particle following a random walk or Brownian motion.

In the problem the students found that the exponential decay time (both birth and death) for a stock on the NYSE was of the order of 20 to 30 years, and possibly more. For the Toronto Exchange it was three, four or five years. For the OTC market about ten years. There were also mixes of different lives too, so that the birth or mortality curves didn't necessarily always follow a simple exponential law. Some of the students were quite careful to point out what happened to their stocks as they followed them. Some were liquidated, some were promoted to some other exchange, or were simply listed or dropped by editorial policy, which was different in different newspapers. Bonds had different activity and life in different eras. They tended to be reborn as maturity approached. This life limit on the concept of the random walk will never appear unless you look for it. You can't write a price down if it isn't there, and so if you just look for identifiable prices and vehicles you will never see that this appearance and disappearance phenomenon is a limit on the simple concept of the random walk itself.

I think you may see that the preceding discussion may have a bearing, though perhaps in an unexpected fashion, on our Dark Cloud Axiom: "It is impossible to systematically exercise good judgment in the securities market." "Systematically" implies you can devise a set of rules or precepts for an appreciable period of time, some fraction of your life at least. If the rules of the game change, it will be hard to find such precepts. Maybe you can devise a systematic procedure for updating your precepts, and maybe not. It obviously isn't going to be easy to do it systematically, especially if the qualitative properties of the market change. These are sometimes difficult to appraise, or even recognize when they start to occur, let alone anticipate. Ordinary life, not just financial or economic life, is full of examples of this sort.

Problem 5. Birth and Death of Securities

In this problem we want to examine quantitatively some aspects of Brownian motion in stock prices which have a counterpart in the Brownian motion of bacteria, but not of inert particles. There is also something of an analogy for Brownian motion or diffusion in a nuclear pile, in which "creation" and "annihilation" of particles also occurs. Bacteria behave like inert particles, say, for intervals less than 20 minutes, but for longer intervals they split, merge, die, or may be eaten up. This happens to stocks also.

For simplicity of definition we will take "existence" of a security as having its name listed in a news organ, "birth" its first appearance, and "death" its disappearance from the sample as represented by that news organ and its editorial policy. Thus by definition the Texas Co. died on an unambiguous date; Texaco was born the next day, a totally new entity by our definition.

Take some news organ (e.g., WSJ, NYT, Barron's, etc.) and a sample of say 50 consecutively listed items, starting under some category, say the letter G, bank stocks of OTC, bonds, etc. Use a date 10 to 20 years ago. Look under the same heading a year, 2, 3, 4, etc. later. How many of your original 50 survived after 2, 3, 4, etc. years? (You may have to look a little farther than 50 for these.) So plot the number, or rather fraction, of "survivors" of your original 50, as a function of time. This problem can be done blindly and mechanically, but I urge you not to do it this way.

Read the headings on the news organ as to what is actually listed. The "OTC Market" on Monday used to be different from Friday, and has changed its complexion. Check for stocks not traded if (as for NYSE) they are given. Look out for unexpected phenomena.

Repeat the problem where you now start with your last date, and take steps backward in time. You will now be evaluating a "birth" rather than "death" distribution in time. They might or might not be similar. It would probably be a good idea to examine data at the two end dates first, before taking the data, to ensure the span of time is long enough to have produced an appreciable "attenuation." This might take 20 years for NYSE, but only 1-2 years for the Toronto Exchange. Use about 8-10 intermediate dates. They need not be exactly uniformly spaced.

The purpose of this problem is twofold: 1) to deliberately focus on a hitherto neglected aspect of Brownian motion or random walks in the stock market, which has in the past been concerned only with identified, permanent "particles" (e.g., stocks); 2) to give you a sense of how the complexion of the market changes with time. Obviously this method overestimates the rates of the birth and death process, except for a simple-minded investor for whom a new name is a new stock, and if the name disappears, it isn't there. The method may also be contaminated by the occurrence of days (or weeks) of no trading.

As a wild guess I suspect your surviving fractions (birth or death) may follow an exponential law, A exp(-t/T), which should give a straight line on semi-log paper. Try it. There are obviously many variants to your categories, or ways of doing this problem. One would not expect public utilities to show the same birth and death properties as ASE science and electronic stocks, and the behavior may well be different for pre- vs. post-SEC eras.

Recommended Reading (on reserve for 233 or 236): Osborne, "Random Walks in Earnings and Fixed Income Securities"; B. Graham, "The New Speculation in Common Stocks," in Wu and Zakon, Elements of Investments; also in Graham, The Intelligent Investor, 2nd ed.

Random Walks in Dollars of Price vs. Log of Price

I now want to discuss a slightly different aspect of the random walk which I have briefly touched on previously. There is a difference between a "linear" random walk in dollars of price, versus a random walk in the logarithm of price. As we shall see, you can have a random walk in almost any function of the price you like. In a rough sense the relation between the price and the logarithm of the price corresponds to the economist's notion of amount and utility, or dollars of an amount of money vs. its "value." There is already an assumption implicit in the relationship between amount vs. utility (cf. dollars vs. value), the assumption that the relationship is a functional one. We shall see that this assumption is an enormous over-simplification.

Beginning with simple things first, let us take the example of roulette. Starting with a stack of p_0 chips and betting one each time on red vs. black, we saw that the height of the stack was a binary, or coin-flipping, random walk of negative expected advance, Prob(s_i = -1/37 ± 1) = 1/2.⁶ This random walk is described by

Δp_i = p_i - p_{i-1} = s_i,   Prob(s_i = h ± δ) = 1/2,   p_0 = starting position,

with h = -1/37, δ = 1 for roulette,

which we have drawn in Fig. 3.19-1. You will note that with this approximation of equal probabilities (1/2) for the payoff, the probability distribution of the position after i plays or steps is symmetric; the mean is equal to the median. After p_0/|h| bets the player starting with 100 chips (p_0) has, with probability one-half, gone broke. This occurs after 3,700 bets. This is the median number of plays to ruin. The mean number of plays to ruin is considerably larger than this. The horizontal cross section of the parabolic sleeve is highly skewed, whereas the vertical cross section is symmetric about the mean = median line of the sleeve.
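A numerical sketch of the equal-probability approximation. The walkers here are followed without an absorbing barrier, so what is checked is the median-position statement (about half the walkers at or below ruin after 3,700 plays), not the full first-passage distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

p0, h, delta = 100.0, -1.0 / 37.0, 1.0
n_bets = int(p0 / abs(h))           # 3,700 plays
n_walkers = 100_000

# Each play advances h + delta or h - delta with probability 1/2, so the
# final position is p0 + n*h + (2*wins - n)*delta, wins ~ Binomial(n, 1/2).
wins = rng.binomial(n_bets, 0.5, size=n_walkers)
final = p0 + n_bets * h + (2 * wins - n_bets) * delta

# The accumulated drift n*h = -100 exactly cancels the starting stack, so
# the median position is 0 and about half the walkers sit at or below ruin.
print(f"median final position: {np.median(final):.1f}")
print(f"fraction at or below 0: {(final <= 0).mean():.3f}")
```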

Let us suppose our gambler (an occidental financier, or economist on vacation) decides he would like to get a little longer ride for his money, so he says, "Instead of betting one chip of my stack each time, I will bet just k% of the stack I have after each play." Assuming that the chips can be broken up into smaller units, you can see that you can bet k% of what you have left indefinitely. In practice, chips or money being in discrete units, the time will come when the smallest unit you are allowed to bet is more than k% of your remaining stack, so you will have to change your rule, or just stop betting.

⁶ I have approximated a single-zero wheel, for which P(s_i = +1) = 18/37, P(s_i = -1) = 19/37, by equal probabilities 1/2, with a drift h = -1/37 and dispersion δ = 1 per step.

The discreteness of money or casino rules will put a "boundary," just as in market making, to the delightful prospect of always being able to

bet k% of your money, no matter how little you have. The equation which governs this random walk is:

Δp_i = p_i - p_{i-1} = (k/100) p_{i-1} s_i,   Prob(s_i = h ± δ) = 1/2,
p_0 = starting position, in $ or chips.

We can write this in the form

p_i = p_{i-1} (1 + k s_i / 100).

Define s*_i = k s_i / 100. It is just the percentage of the money he has at each bet, as a decimal, that our gambler bets. It will be less than one in any event, credit betting not being allowed (he is not allowed to bet more than his stack). It will usually be fairly small compared to unity. Make the change of variable y_i = log_e p_i:

Δy_i = y_i - y_{i-1} = log_e(1 + s*_i),   h* = kh/100,   δ* = kδ/100.

If you like you can say our gambler, now a "financier," "evaluates" his money, or thinks in terms of natural logarithms, which are close to percentages if less than about 15% = 0.15.

This random walk, now in the variable y = log_e p, is quite similar to the previous one. Just as s_i has an expected value h and a dispersion δ, so also has the function log_e(1 + s*) an expected value and a dispersion, which we could work out numerically, using the probabilities 1/2. Then on a y scale the financier's random walk would be similar in structure to that of the straight gambler. It would be a straight line of slope E log_e(1 + s*) and a parabolic sleeve ±√(i Var(log_e(1 + s*))). The gambler's sleeve is cut off horizontally at p = 0. This corresponds to y = minus infinity for the financier's sleeve. In practice we must cut his sleeve when kp_i/100 = smallest unit of money allowed (s.u.m.a.). This occurs at y = log_e(100 s.u.m.a./k). We can evaluate the slope and dispersion in y for the financier, in terms of the corresponding quantities for the gambler, by expanding the logarithm. This is quite accurate for |s*_i| = |k s_i/100| less than about 0.15:

E(Δy) ≈ h* - (h*² + δ*²)/2,
E(Δy²) - (E(Δy))² = σ² ≈ δ*² = k²δ²/100².


We drop cubic and higher terms. So we can write

y_i ≈ y_0 + h* i ± √i δ*,   y_0 = log_e p_0,

where p_0 is the starting "capital," in $ or number of chips.
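The change of variable can be checked numerically. A sketch with an illustrative stake (k = 5%); note that keeping the quadratic term of the expansion, the slope of y is E log_e(1 + s*) ≈ h* - δ*²/2, not h* alone:

```python
import numpy as np

rng = np.random.default_rng(1)

k, h, delta = 5.0, -1.0 / 37.0, 1.0   # bet 5% of the stack each play
n_steps = 1_000_000

# s_i = h + delta or h - delta with probability 1/2; each play multiplies
# the stack by (1 + k*s_i/100), so y = ln p advances by ln(1 + s*).
s = rng.choice([h + delta, h - delta], size=n_steps)
dy = np.log1p(k * s / 100.0)          # exact increments of y = ln p

h_star = k * h / 100.0
d_star = k * delta / 100.0
print(f"mean dy = {dy.mean():+.6f}, expansion h* - d*^2/2 = {h_star - d_star**2 / 2:+.6f}")
print(f"std  dy = {dy.std():.6f},  expansion d* = {d_star:.6f}")
```

The measured drift and dispersion of y agree with the expanded-logarithm values to the quoted accuracy.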





⁸ The worst, most irreverent and commonest sin is to throw out the discordant observations. This method is guaranteed to prevent discoveries which the theory, if there is one, does not predict.


Molecules of negative velocity don’t come out the little hole.


The time required to traverse the span D is t = D/v, so the distribution of the time of flight, recorded as a distance distribution on the circumference of the outer drum, is

F(t) dt ∝ (D/t²) exp(-D²/α²t²) dt.

This has a long tail, going down as you can see as 1/t², and the expected time of flight is, if you just feed the formula, infinite; logarithmically, actually:

E(t) = ∫₀^∞ (D/v) f(v) dv = ∫₀^∞ t F(t) dt ~ log_e t → ∞, the divergence coming from v → 0.



If you insist on writing down arabic numbers for the observed distribution of molecules on the wall, and take means, you will of course get a finite mean time. What you should test is not this mean time at all, but the shape of the distribution, say by chi-square, and see if it fits the predicted distribution.
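The divergence is easy to see numerically. The sketch below takes a half-normal speed distribution (in units where the thermal speed is 1, an assumption for illustration) and forms t = D/v; the sample median is stable while the sample mean is dominated by a few slow molecules:

```python
import numpy as np

rng = np.random.default_rng(2)

# t = D/v with v drawn from a half-normal speed distribution, so F(t)
# falls off as 1/t^2: the mean time diverges logarithmically with sample
# size while the median stays finite.
D = 1.0
v = np.abs(rng.standard_normal(1_000_000))
t = D / v

print(f"median time: {np.median(t):.2f}")   # stable, of order 1
print(f"mean time:   {np.mean(t):.1f}")     # dominated by a few slow molecules
```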




-1 tail, but not an enormously long one, and you have these two parameters α, β, which you estimate instead of estimating, as you usually do, the mean and standard deviation. The method of estimating them is called the maximum likelihood method, which you can read about in standard texts. It reduces to least squares if you happen to be estimating moments from a normal distribution instead of α and β from this gamma distribution. This kind of procedure is appropriate for skewed distributions of data when you want to wring the last bit of information out of a limited sample of data that you possibly can. I don't think the gamma distribution has ever been used to represent stock market data, and it probably would not have any profound underlying significance. It would just be an economy in case you wanted to use less data and work it a little harder.
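As a sketch of the procedure: the closed-form approximation below to the gamma maximum-likelihood equations (shape α, scale β) avoids the iterative solve. It is a standard textbook approximation, not taken from this text, and the data are drawn from a known gamma purely to check that the recipe recovers the parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative data: in practice these would be your skewed market
# observations; here we draw from a known gamma to check the recipe.
alpha_true, scale_true = 2.0, 1.5
x = rng.gamma(alpha_true, scale_true, size=200_000)

# Closed-form approximation to the gamma maximum-likelihood equations,
# based on s = ln(mean) - mean(ln x); accurate to about 1% for moderate
# shape values.
s = np.log(x.mean()) - np.log(x).mean()
alpha_hat = (3.0 - s + np.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
beta_hat = x.mean() / alpha_hat       # scale estimate

print(f"alpha ~ {alpha_hat:.2f}, beta ~ {beta_hat:.2f}")
```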


Skew Random Walks. Fluctuations in a Panel of a Histogram

I now want to discuss two subjects which are somewhat related to improbable events or long-tailed distributions, but in a different context. The first of these is the skew random walk. We considered random walks of equal probability 1/2 for a step h ± δ. One can devise a random walk which has a probability 0.9, or any fraction near one you like, of a certain step length, and the remaining small fraction for the probability of a different step length. Such a step has a mean and a variance, and for many steps the walk will approach, albeit rather slowly, a normal distribution for the end point. It may take quite a few steps. We can express this walk as follows.








Prob(s_i = h - δ) = (M - 1)/M,   Prob(s_i = h + (M - 1)δ) = 1/M
E(s_i) = h
E(s_i²) = h² + δ²(M - 1)
Var(s_i) = E(s_i²) - (E(s_i))² = δ²(M - 1)


You can see that for M > 1 this will be a walk which drifts in one direction most of the time, but on improbable occasions it jumps the other way, so that the general sketch of the walk is as in Fig. 4.9-1. This has some analogues in the stock market. Big jumps, though they may occur infrequently, tend to cover a lot of ground compared to the little ones. One can have a market of negative expected advance which is in fact going forward most of the time. Because of discreteness you cannot exploit this peculiar situation to make a profit (on the long side); the expected advance is negative. This kind of a situation actually happens in the market. If you look at the monthly record of the crash of 1929 to 1932, a majority of the months in the Dow index were months of advance, but the net progress from '29 to '32 was substantially in the opposite direction. When it did fall, it fell with a big jump. So this is one kind of a random walk which has its analog in the market, and in other phenomena as well. Note that this is skewness of a sequential distribution. One can have an across-the-market distribution in which the skewness is frequently the other way. The advance minus decline plot, roughly the median of stock prices, will be sinking, yet an average of stock prices, either the Dow, or a more broad-based average, will be advancing.

⁹ The log normal distribution can also be used in this rainfall problem.


Fig. 4.9-1 A Skew Random Walk.

A second formula which is of considerable usefulness expresses how much of a fluctuation you can expect in one particular panel of a histogram. You make a frequency distribution of some variable, and you see a relative excess of members of the population, events or counts, in one particular panel, as in Fig. 4.9-2, also Figs. 4.15-2, -6, and you wonder if this is a statistical accident, or is this fluctuation so large that you really have to pay attention to it. You can see examples of this in Cootner, p. 103 and p. 287. The same arguments would apply to a deficiency, if one panel was much shorter than its neighbors.






Fig. 4.9-2 Schematic Histograms with an Excess or Deficiency in One Panel.

We can throw this problem in the form of a random walk, in fact a rather skewed random walk, in the following way. Let us consider a histogram, and let us divide it into just two parts: one panel, with a small probability p, and the other panel consisting of all the other members, and therefore a rather large probability 1 - p = q. We withdraw the members of the population, say the students in the freshman class, one at a time. The particular panel, let us say, is for the SAT scores from 600 to 620, and we make a random walk of the following type. We step one forward if our random choice from the student population fell in the panel, and we step zero if it does not. Say this panel contains 5% of the population, p = .05. You will see as we run through the population, when we get through the entire class of 2,000 students or 2,000 steps (95% being of zero length), the expected advance is just the number in that panel of the histogram. So what is the mean, and what is the variance? If you work it out it has a pleasant and simple answer. For one step:




E(s) = p,
σ² = E(s²) - (E(s))² = p - p² = p(1 - p) = pq.

No. in panel ≈ Np ± √(Npq). Since q ≈ 1, No. in panel ≈ Np ± √(Np) = 100 ± 10.


The expected number in the panel is just as you picked it, 5% of 2,000 = 100, and the standard deviation, or square root of the variance, is just the square root of the expected number, 10, since the other probability q is very nearly one. So we have the simple rule of thumb, derived from the skew random walk but applied in a very different context: if you have a certain frequency of occurrence of a particular event, small compared to the total number of all possible events, the expected fluctuation or dispersion is just the square root of the number of times the particular event occurred. Approximately, not exactly, you will note.
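A simulation of the SAT-panel example confirms the rule of thumb (N = 2,000 draws with p = 0.05, repeated many times):

```python
import numpy as np

rng = np.random.default_rng(5)

N, p = 2000, 0.05          # class of 2,000; panel holds 5% of the population
trials = 50_000

# Count how many of the N draws land in the panel, repeated many times.
counts = rng.binomial(N, p, size=trials)

expected = N * p                          # 100
rule_of_thumb = np.sqrt(expected)         # ~10, since q = 1 - p is near 1
print(f"mean count: {counts.mean():.1f}")
print(f"std of count: {counts.std():.2f} vs sqrt(Np) = {rule_of_thumb:.2f}")
```

The measured standard deviation is √(Npq) ≈ 9.75, "approximately, not exactly" the square root of the expected number 100.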

Suppose you have a histogram with either an apparent excess or deficiency in one panel, and you wonder whether it is an accident, relative to the neighboring probabilities. Then you can use the neighboring probabilities as an estimate of what to expect in the one under question. If the fluctuation is larger than the square root of the expected number, perhaps more than two standard deviations away, then you can estimate the probability that it occurred by chance. If the expected number in the panel is say ten or more, the normal distribution is a fair approximation to how that number is distributed. This rule of thumb works equally well for distributions over more than one variable, a histogram or table with two or more variables. If you have a contingency table of two or more variables, and one box seems more or less populated than you might have expected, you can examine this in exactly the same way.

For example, there is a very ancient problem¹⁰ for which the chi-square test was invented. Population statisticians suspected there was a relation between color of the eyes and color of the hair. Blonds tended to be blue-eyed and brunettes tended to be dark-eyed. The question was, is this common observation in fact supported by the data? You simply make a histogram, or as it is called, a frequency table of four boxes: a two-dimensional histogram labelled by the two attributes. You assume there isn't any statistical relation between the attributes, and on this assumption estimate the number to be expected in any given box. If the observed number is much greater or less (the expected fluctuation being just the square root of the expected number), then you can conclude there is a probability connection between color of the eyes and color of the hair. The general test is the chi-square test, but as a quick and dirty test you can use this expected number and its square root.

¹⁰ The details of this example are in Kenney and Keeping.
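The quick test is a few lines of arithmetic. The 2×2 counts below are hypothetical, made up only to illustrate the recipe: expected box counts R_i C_j / N, with deviations measured in units of the square root of the expected count:

```python
import numpy as np

# Hypothetical 2x2 frequency table of hair color (rows) vs. eye color
# (columns); the numbers are made up for illustration.
observed = np.array([[30, 10],
                     [10, 50]])
N = observed.sum()
rows = observed.sum(axis=1)       # marginal row totals R_i
cols = observed.sum(axis=0)       # marginal column totals C_j

# Under independence the expected count in box (i, j) is R_i * C_j / N.
expected = np.outer(rows, cols) / N

# Quick-and-dirty test: is the deviation in a box much more than the
# square root of its expected count?
z = (observed - expected) / np.sqrt(expected)
print(expected)
print(np.round(z, 1))
```

Here every box deviates by more than two rule-of-thumb standard deviations, so the suspected association would be flagged; the full chi-square test sums the squares of these z's, with (2-1)(2-1) = 1 degree of freedom.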


An Example in Which the Histogram Panel Test Doesn't Work

In the previous lecture (Section 4.9) I described how you could examine a histogram for significant peaks, jumps or gaps in it, using the square root of the expected number in a panel as the standard deviation. This was just a crude way of making a chi-square test. Let me describe a problem where this didn't work, so you can see the importance of the limitations on this idea.

The particular phenomenon I wanted to investigate was resistance and support. You can read chapters about this in the book of Magee and Edwards, with lots of examples illustrated on charts. It is also frequently illustrated in market advisory letters. So I thought I would just put this commonly accepted idea to a test. If people can see these things with their eyeballs, then they should show up by an objective test, maybe. So I picked a couple of stocks for which I had about two years of daily data, and also longer weekly and monthly charts, and on which I could see, or thought I saw, very clearly where the resistance and support levels were. One of them was the Air Reduction Co.; I don't recall the other. I plotted up a histogram of daily closing prices for two years, that's about 500 points. I wasn't really sure whether resistance and support would show up as bumps, jumps or holes in the histograms, but something peculiar ought to. The histogram was a sequential distribution of the prices themselves. There were bumps and holes in the histogram all right, pronounced and significant ones by the square root of the number criterion. I couldn't see that these bumps and holes bore any relation to the resistance and support levels that I could "read" off the charts. So I tried again, this time weighting each point with the amount of daily volume, and spreading it out over the daily high-low range. This gave a different histogram, still rather irregular, though I wasn't sure just

how to apply the square root of the number significance tests.

In doing all this, I noticed something rather peculiar. One of the ground rules, or conventional practices, in making histograms is this: if an observation falls precisely on a class boundary (say the classes are at 40 to 41, 41 to 42, etc.), you divide the point and plot half in each adjoining class. Or just flip a coin to decide on which side of the boundary to put it. But with these histograms an inordinate percentage, perhaps one-fourth of them, were falling exactly on the integral boundaries. It made quite a difference in the shape of the histogram if you shoved all of them to one side or the other. I had never seen data behave that way before. Shifting the class boundaries to half integers didn't seem to help much; they still piled up appreciably on the half integers.

At this point I started looking at the closing prices with respect to eighths, going across the market, or down the column in the newspaper. Sure enough, there was a preference for even eighths in the prices (Cootner p. 287). Even when you smeared out this effect, the histograms still looked most peculiar: three or four "significant" peaks in the sequential distribution of closing prices, with or without volume weighting, for a single stock. I was beginning to get skeptical of my square root of number rule. So I thought I would just generate a few synthetic random walks, and see what the histograms of "prices" generated this way were like. I generated a few of 500 steps each, from random number tables, plus one for the even digits, minus one for the odd. These histograms too showed two or three highly significant peaks, just as the real price data on Air Reduction did, and each histogram was different from the others.
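The effect is easy to reproduce. The sketch below compares panel-count fluctuations for independent draws against the occupation histogram of a ±1 random walk, summarizing each with the variance-to-mean ratio (Fano factor) of the panel counts: near 1 means the square-root rule applies, much larger means it fails. The parameters (500 steps, 50 price levels, 200 repetitions) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def fano(counts):
    # Variance-to-mean ratio of the nonzero panel counts; ~1 when the
    # panels fluctuate independently (Poisson-like).
    counts = counts[counts > 0]
    return counts.var() / counts.mean()

walk_fano, iid_fano = [], []
for _ in range(200):
    # 500-step +/-1 random walk: histogram of positions ("prices") visited.
    walk = np.cumsum(rng.choice([-1, 1], size=500))
    walk_fano.append(fano(np.bincount(walk - walk.min())))
    # 500 independent "prices" spread over 50 levels, for comparison.
    draws = rng.integers(0, 50, size=500)
    iid_fano.append(fano(np.bincount(draws)))

print(f"random-walk histogram Fano factor ~ {np.mean(walk_fano):.1f}")
print(f"independent-draws Fano factor   ~ {np.mean(iid_fano):.1f}")
```

The walk's panel counts fluctuate far more than the square-root rule allows, because a visit to one price raises the chance of visits to its neighbors.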

At this point I could see that my square root of n rule just wasn't applicable for a histogram created in this way: the position or price in a random walk. I did a little homework, and dug around in the textbooks to find out just what was the answer to this problem. You can find a discussion in Feller's book, pp. 82 and 231. My old histogram rule is indeed not applicable. You may recall from its derivation in an earlier lecture, using the skew random walk, that the probability of a step or count in a histogram panel was independent of previous steps there, or elsewhere. So adjoining panels are probability independent in their count. This is most definitely not the case for a histogram created in the manner described above. A step at one price definitely increases the probability of adjoining prices also occurring. This puts clusters or "peaks" in the histogram. The panels are not independent in their counts. The distribution in a panel constructed this way is not Poisson or normal for large counts, but instead approaches what is called a truncated normal distribution. It looks like the positive half of a normal distribution, but its dependence on the total number
of observations or steps in a random walk is very different from "normal." For an ordinary histogram, the mean or expected number in a particular panel increases linearly with n; so also does its variance. So the expected number in a panel is ≈ pn ± √(pn), where p is the small probability of falling in a panel, and n is the total number in the sample. For the random walk histogram the truncated normal distribution indicates that the expected number in a "panel," or at a particular price level, or position, is of order √n*. Here n* is the total number of steps after you first land, in the walk, in that panel or price level. This is called the occasion of first passage. The variance of this number increases like n*, that is, at a faster rate. So the expected number in a panel, plus or minus its standard deviation, is ≈ √n* ± √n*. Thus the "standard deviation" is of the same order as the number in the panel. This means that the histogram has big fluctuations, or highly significant peaks, using the conventional (and here incorrect) formula. This was exactly what I observed, both with the manufactured random walk and also the real price series.

The question which I originally asked myself, how to demonstrate resistance and support levels of the sort you can trade against and make a buck maybe, is still unanswered. Such levels exist at integers and half integers, and prices tend to bounce up slightly (support) at 1/8 and 5/8, and to bounce down at 3/8 and 7/8 (resistance), but this phenomenon is not much help to peasant investors not on the floor of the exchange, or analysts trying to make a living reading charts for themselves or their clients.

I think this problem might be explored by talking to a cooperative specialist about the distribution of orders on his book in the past, at supposed resistance and support levels in specific stocks, and by examining both the quotes and the transactions at these times. If you can get knowledge at the time it occurs of an exchange or floor distribution, and monitor the quotes and transactions, I suspect you might see evidence of resistance and support in a single stock sequence. This is still an open question. You might well find unambiguous positive evidence, but still an effect not large enough to make your fortune.


A Comment on the Form of the Chi-square Test

The expression (4.9-2) for the expected standard deviation of the number in the panel of a histogram, or box in a frequency table, can be used to give a plausible explanation of the form of the chi-square test. If we have an experimental single-variable distribution, or histogram, the expression for chi-square is:

χ² = Σ_panels (f_o - f_e)² / f_e.

Here f_o are the observed numbers, or frequencies, and f_e the calculated or expected number from the theoretical distribution which is being tested.

In the case of a two-dimensional "histogram" or frequency table, exactly the same formula applies, the sum being over the boxes or cells of the table. In this case the f_e, or theoretical frequencies, are calculated from the observed frequencies f_o under the assumption of independence of the two (or more) attributes or variables being tested. Thus f_e,ij for the "ij-th" box in a two-dimensional table is

f_e,ij = N_total (R_i/N_total)(C_j/N_total) = R_i C_j / N_total,

where R_i and C_j are the row and column sums, the so-called marginal totals.


The approximate expression which we derived in the preceding section, σ² ≈ f_e, enables us to put the expression for chi-square in a "random walk" form, as a sum of "steps," one step for each box in the table. Suppose we number the boxes from k = 1 to M; M = number of rows times number of columns, the total number of cells. Then if we consider f_o,k = x_k as a random variable, the k-th step length, and f_e as E(x_k), its expected value, we have as the expression for chi-square:

χ² = Σ_{k=1}^{M} (f_o,k - f_e)² / f_e = Σ_{k=1}^{M} (x_k - E(x_k))² / σ_k².

Thus the expression for chi-square can be expressed as the sum of squares of steps in a random walk of zero expected advance and

unit variance, one step for each box in the table (see Weatherburn, "Mathematical Statistics," p. 169). The number of "independent steps," or degrees of freedom, is not quite equal to the number of cells in the table. The theoretical f_e's are derived from the observed row and column sums. So the theoretical f_e's are not quite independent. In a given row, say, we could assign all but one of them independently, but then the last one must be chosen so that the row of f_e's adds up to the observed row sum. There is one constraint (the fixed row sum) on the otherwise arbitrary f_e's in each row, likewise for columns. So the number of degrees of freedom is not M but (m_row - 1)(m_col - 1).

The expression for chi-square is a sum of squares, so it is essentially positive. For a "large" (> 30) number of degrees of freedom it can be shown that chi-square itself approaches a normal distribution of mean n - 2 and variance 2n, where n is the number of degrees of freedom. So the variable (chi-square - (n - 2))/sqrt(2n) is itself a normal variable of zero mean and unit variance. This is a convenient formula for extending a chi-square table. There are other expressions, involving either the square root or cube root of chi-square, for transforming chi-square into a normally distributed variable for large n, which are slightly more accurate than the above expression. See e.g. Cramér, "Mathematical Methods of Statistics," p. 251, or Kendall and Stuart, Vol. 1, p. 371, for a discussion of these alternate forms. So long as the standard deviation sqrt(2n) of chi-square is sufficiently smaller than its expected value n - 2, the negative tail of the normal distribution will be so small as not to distort its representation of chi-square as an essentially positive quantity. At n = 30, the standard deviation of chi-square is sqrt(2n) = 7.8. Thus chi-square = 0 is more than three standard deviations away from the expected value (n - 2) of chi-square, and the "negative tail" of the normal distribution is a negligible error.
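The table-extending formula can be put to work directly. A sketch, assuming the text's approximation (mean n - 2, variance 2n):

```python
import math

def chi2_as_normal(chi2, n):
    """Map chi-square with n degrees of freedom to an approximately
    standard normal variable, per the large-n approximation in the
    text (mean n - 2, variance 2n)."""
    return (chi2 - (n - 2)) / math.sqrt(2 * n)

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# For n = 30 the standard deviation is sqrt(60) ~ 7.75, so chi2 = 0
# sits about 3.6 standard deviations below the mean n - 2 = 28:
# the "negative tail" is negligible.
z = chi2_as_normal(0.0, 30)
```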


Counting of Events with a Poisson Distribution

In the preceding discussion we saw that if the frequency in a histogram panel or contingency table were say 9, then a zero count was 9/sqrt(9) = 3 standard deviations away from the expected value 9. So the normal distribution could represent the fluctuations in the count in one cell fairly well. The probability of the absurd observation of a negative count in the table was really quite small.

Now suppose the expected frequency is small: zero or one or two or three only, as the count in a box or panel. The expected count doesn't have to be an integer, but the observed count must be. How do you examine and describe this situation? You obviously can't use the normal distribution, because that is a continuous distribution, and goes off to minus infinity.

Let me give you an example of this sort. Suppose you are keeping a record of the barometer. You write down once a day for a year the height of the mercury column. It averages around thirty inches, and on five occasions you note that the barometer is less than 28 inches of mercury. So it is a fairly improbable event, to have the pressure that low. You also observe the weather once a day and record the velocity of the wind. You note that the wind speed is greater than 50 miles per hour on four days in the year, so that also is an improbable event. This has divided the frequencies of speed and barometer into two classes or histogram panels, each. Now suppose you make the observation that on two occasions the low barometer and the high wind speed occurred on the same day. Does this suggest there is a connection in the probability sense between the barometer and the weather? Just thinking intuitively you would suspect there is a connection. Let us see whether it is an improbable coincidence to have two days in the year when both these improbable events occur.

As we have given the data, we could make a two by two contingency table, with four boxes, and apply the chi-square test. There would be a count or frequency of only two in one of the cells, so it would be a rather lopsided table, not constructed in the best way to test for independence by chi-square of barometer and wind velocity. We will do it a little differently.

First let me make a rough calculation, and then do it in a little more refined way. Suppose there is no connection between the barometer reading on one day and any other day; the barometric events, p < 28, occur independently in time. The estimated probability of a barometric event is 5/365 on any one day, or trial, so the estimated expected number in a year, or 365 trials, is of course (5/365) x 365 = 5. We make similar assumptions for the wind event. Note that these assumptions of time independence of wind and barometric events, separately, conflict with common experience, but we make them anyhow.

If we further assume that barometer and wind are independent of each other in the probability sense, then the probability of both occurring on one day is (4/365) x (5/365), and the expected number of such coincidences in 365 trials is just (4/365) x (5/365) x 365 = 4 x 5/365 = 0.055. We would get this occurring on the average once in eighteen years, 1/0.055. In fact we have observed it twice in one year. Is it sufficiently convincing evidence, that the actual number of coincidences (2) is greater than the number to be expected by chance (0.055), that we would reject the hypothesis that wind and barometer events are unconnected, in the probability sense?


To make this significance probability calculation we have to use

a distribution valid for small integral numbers, rather than large ones. We can’t use a continuous distribution at all. The appropriate

distribution is called the Poisson distribution.

You can see offhand that two events (coincidences) when only 0.055 are expected should make you very suspicious. I might translate these numbers to a different situation as follows. You eat your dinner in some number of different restaurants, every day in the year. On five occasions you have eaten in a Chinese restaurant. You have four stomach aches a year; two of these were after you ate the Chinese food. You would be very suspicious that Chinese food had something to do with your stomach aches.

Suppose there were 365 people chosen at random. Four of them

were smokers and five had lung cancer, and it turned out that two of

the smokers also had lung cancer. You would be a little suspicious

that smoking might have something to do with the lung cancer. Note that such data is not necessarily an implication of cause of cancer by smoking. The data just suggests a probability of connection. Tendency to smoke and to have cancer might have a common genetic,

environmental or physiological factor. At least the tobacco interests hope so.

This kind of information on connections or causes is presented to you every day, and you make decisions on the basis of this kind of evidence without going through the mathematics. A rather tragic example of this kind of coincidence was first noticed in connection with birth defects and the use of thalidomide. Occurrences of babies with these defects are very rare. People who were taking thalidomide were relatively rare, yet there were quite a number of coincidences. There were thousands of babies born, yet the coincidences, though very few in terms of the numbers of babies born, or users of thalidomide, were much larger than independence would indicate. The doctor looking over the records of these defects was struck by the embarrassing number of coincidences.

So this kind of statistical testing is quite useful and common. It doesn't involve large numbers of actual occurrences of events, yet you can get significant results if you have a lot of trials.

In the above example of the barometer and the weather, just one occurrence might alert you at the 6% level. You would not expect this but once in eighteen years. Similarly, just one stomach ache after a Chinese meal might make you feel cautious.


Before deriving the Poisson distribution, let me sketch it for the above example, where the average or expected number of events was 0.055. In the examples the single "event" was actually defined as a (binary) coincidence event: a smoker with cancer, a Chinese dinner with stomach ache, or low barometer and high winds. The largest probability is for none at all, so Prob(n = 0) ~ 0.945. For Prob(n = 1) we had 0.055 approximately, and you will notice that the expected value using just these two probabilities alone, which are most, but not all, of the probability, is again approximately

E(n) = P(0) x 0 + P(1) x 1 + P(2) x 2 + ... ~ 0.945 x 0 + 0.052 x 1 + ... ~ 0.05

So the probabilities for n = 2, 3, 4 must be very small indeed. The probability distribution must be represented as a nonuniform rake, with by far the largest probability tooth at n = 0.

Our significance probability, since we actually observed two such events, is P(n >= 2) = P(2) + P(3) + P(4) + ..., which in any event is going to be a pretty small probability. From the original data we see we couldn't possibly have gotten more than four coincidences. The distribution we derive will not take note of this restriction, but the probabilities we calculate for four or more events will indeed be very small. You can see that the discreteness of the independent variable is of fundamental importance. It would not do at all to approximate this situation, where we are dealing with small integers, the numbers of events (coincidence events here), with a continuous distribution, or a distribution which allowed negative counts.


Fig. 4.12-1 The Poisson Distribution for an Expected Number lambda = 0.055.

The derivation of the Poisson distribution for a small number of events in a large number of trials proceeds in the following manner. Let us imagine the calendar as a long series of "boxes" or panels, each box a year in length. I sprinkle at random 50 beans (events) over ten years, or ten boxes, so that the average number per box, for example, is lambda = 50/10 = 5 (lambda = 0.055 for our previous example). We want to derive the probability P(k), k = 0, 1, 2, 3, etc., of exactly k beans in one box or year, where we are given that the average or expected number in a box (year) is lambda.

We imagine we divide each box into n cells; n is just a large number which will eventually go to infinity. We imagine n so large that the number of beans in any one cell is almost certainly either one or zero, i.e., the probability of a bean in any one cell is lambda/n. The probability of exactly k beans among the n cells of one box is then the binomial expression

P(k) = [n!/(k!(n - k)!)] (lambda/n)^k (1 - lambda/n)^(n-k)

As n goes to infinity, the factor (1 - lambda/n)^n approaches e^(-lambda), and the factors of the form (1 - const/n) approach 1. Note that lambda and k are fixed as n goes to infinity. So we have

P(k) = lambda^k e^(-lambda) / k!


This is the Poisson distribution. I have sketched a few examples of it in Figure 4.12-2. You will notice that when the average or expected number lambda in a box is considerably less than one, as for our example of low barometer plus wind, or food plus stomach ache, most of the probability is concentrated at k = 0, an empty box. Moreover, the probability of just one event in a box is closely lambda. The significance probability for two or more events is

P(k >= 2) = SUM(k >= 2) P(k) ~ lambda^2/2 = 0.0015, for lambda = 0.055.

Roughly once in 700 years (1/0.0015) would we expect 2 or more coincidences if the expected number is 0.055. We conclude at the significance level 0.0015 that low barometer and wind, food and stomach ache, etc., are not independent.

As the expected number in a box increases to 1 or more, the maximum probability moves out from k = 0, an empty box, and begins to develop a hump at k ~ lambda. By lambda = 9 one can see the teeth of the rake are being trimmed to approximate a normal distribution, a continuous distribution whose area could then approximate the probability in a number of adjoining teeth.
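These numbers are easy to check numerically. A sketch (hypothetical code, not the author's), covering both the binomial-to-Poisson limit from the derivation and the significance probability for lambda = 0.055:

```python
import math

def poisson(k, lam):
    """P(k) = lam^k e^(-lam) / k!"""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def binomial(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# The limit in the derivation: a box divided into n cells, each occupied
# with probability lam/n, approaches the Poisson form as n grows.
lam = 5.0
exact = poisson(3, lam)
approx = binomial(3, 10_000, lam / 10_000)

# Significance probability for the barometer-and-wind example:
# expected number of coincidences ~0.055, observed 2 or more.
lam_small = (4 / 365) * (5 / 365) * 365
p_tail = 1.0 - poisson(0, lam_small) - poisson(1, lam_small)   # ~0.0015
```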


One can define and then examine a great variety of events by the Poisson distribution. For example, we assumed that in the barometer series the barometric events (p < 28 inches of mercury) were independent in time. We could test this assumption alone, if we suspected that a low barometer reading on one day enhances the probability that succeeding days will also have low barometer readings, i.e., that barometric events were clustered. We define a delayed coincidence as two days in succession (p(t - 1) < 28, p(t) < 28) of low barometer. By definition this coincidence (of two events) just occurs once, on the second day. We expect, with time independence (or no clustering), just (5/365 x 5/365) x (365 pairs) = 0.068 of these delayed coincidences in a year. If we had two such coincidences in a year (this would be either three low barometers in a row, or two separated pairs, four low barometer events in all), we would conclude that low barometer tended to significantly persist, or cluster, for more than one day.
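The clustering test can be sketched the same way; a hypothetical check, reusing the Poisson tail:

```python
import math

# Expected number of delayed coincidences (two successive low-barometer
# days) in a year of ~365 ordered pairs, under the no-clustering
# hypothesis of the text.
lam = (5 / 365) * (5 / 365) * 365          # ~0.068

# Significance probability of observing two or more such pairs,
# P(k >= 2) = 1 - e^(-lam)(1 + lam) ~ lam^2 / 2.
p_two_or_more = 1.0 - math.exp(-lam) * (1.0 + lam)
```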

You could also define events as big or small barometric changes, (p(t) - p(t - 1)). Again these must be dated, preferably as of the second day, so you only have one event of this type occurring at a time.


What has all this got to do with data on securities? If for barometric pressure you will read price, and for wind velocity v(t) read volume, you will see you can analyze events in the volume and price sequences in exactly the same manner. This is called technical analysis if you are an investor, or monitoring the tape if you are a market supervisor or regulator. There are lots of events that can be defined in these sequences: big price changes, big daily range, and similarly for the volume. You can read about examples of this in my Jour. Am. Stat. Assn. paper in 1967. Much of the folklore of the market place can be examined this way, and some of it is confirmed.
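A sketch of this kind of event bookkeeping on price and volume sequences (the series and thresholds here are hypothetical, chosen only for illustration):

```python
# Count "events" and their coincidences in parallel price and volume
# sequences, in the same spirit as the barometer-and-wind example.
prices  = [50, 51, 50, 55, 54, 54, 60, 59, 58, 64]
volumes = [10, 11,  9, 30, 12, 10, 35,  9, 11, 40]

# Price event: a big daily change; volume event: a big volume day.
# Both are dated as of day t, so each day yields at most one of each.
big_change = [abs(prices[t] - prices[t - 1]) >= 4 for t in range(1, len(prices))]
big_volume = [volumes[t] >= 25 for t in range(1, len(volumes))]

n_price_events = sum(big_change)
n_volume_events = sum(big_volume)
n_coincidences = sum(1 for p, v in zip(big_change, big_volume) if p and v)
```

With real data, the observed coincidence count would be compared against the Poisson expectation for independent events, exactly as above.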

Maxima and minima of stock prices are signaled by large volume, so here is a chance to make a buck, maybe. You should note that highly significant "signals" in the probability sense are by no means infallible. The barometer was significantly related (Prob. = 0.0015) to the wind velocity. Nevertheless, as a storm indicator it was wrong more than half the time.

Events, both simple and multiple, can be defined in lots of ways, and not just in numerical sequences. The dates on which the board of directors of a corporation, or the Federal Reserve Board, met might be known. If you suspected leaks of information you might look for coincidences or delayed coincidences of price and volume events with these meeting dates. The possibilities are limited only by your interest and ingenuity.

Fig. 4.12-2 The Poisson Distribution p(k). lambda = Expected Number in One Box.


Relation of Correlation, Regression, Contingency by Chi-square, and the Method of Coincident Events

It might be well to take a problem which I am sure you have worked out in your statistics classes, and show that it can be cast in these other forms which I have just discussed. Consider the calculation of correlation coefficients or linear regression coefficients. You can take a problem of that type and cast it either as a problem in chi-square, or a test with the Poisson distribution. I think it will be more easily illustrated if we take a particular case where you have some idea of what to expect. Suppose I am a man from Mars, and I have a lot of data on the weight and height of a population of students at the University of California. I plot weight against height, not knowing what to expect, and I get a scatter diagram. I have perhaps a thousand points, or members of the population, and I can see in a general way that the height and weight increase together. I am sure you have formulas in your statistics books that show you how to calculate the regression of height on weight, which I remind you is the curve of the mean height against weight. The other regression is the mean weight against height. They are not exactly the same thing.

Fig. 4.13-1 Scatter Diagram of Weight vs. Height (Schematic). (Axes: height in feet, weight in lbs.)

If we approximate by linear (straight line) regression they


are two lines, like scissors. If there is strong correlation they are close together; if there is no correlation they are at right angles, parallel to the axes. The regression actually tells you the relationship of mean height against weight, or vice versa. If you just want to answer the question, do weight and height depend on each other in some systematic and monotonic fashion, then you calculate the correlation coefficient. It is the following expression, and it has a standard deviation which is roughly one over the square root of the number of observations. So if you want to detect a significant correlation coefficient of +/- 0.1, you must have at least 100 observations.

rho = (1/N) SUM_i (h_i - hbar)(w_i - wbar) / (sigma_h sigma_w) = (mean of hw - hbar x wbar) / (sigma_h sigma_w)
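In code, the correlation coefficient and the rough 1/sqrt(N) significance rule look like this (a sketch with hypothetical height/weight pairs, not the author's data):

```python
import math

def correlation(h, w):
    """Sample correlation coefficient rho of two equal-length lists."""
    n = len(h)
    h_bar = sum(h) / n
    w_bar = sum(w) / n
    cov = sum((hi - h_bar) * (wi - w_bar) for hi, wi in zip(h, w)) / n
    s_h = math.sqrt(sum((hi - h_bar) ** 2 for hi in h) / n)
    s_w = math.sqrt(sum((wi - w_bar) ** 2 for wi in w) / n)
    return cov / (s_h * s_w)

# Hypothetical height (feet) and weight (lbs) pairs.
heights = [5.5, 5.8, 6.0, 6.2, 6.5]
weights = [140, 160, 175, 190, 230]
rho = correlation(heights, weights)

# rho's standard deviation is roughly 1/sqrt(N), so resolving
# rho ~ 0.1 takes on the order of (1/0.1)^2 = 100 observations.
n_needed = (1 / 0.1) ** 2
```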



Let me describe a way by which you could have examined this data in terms of events. Divide this population with two variables into two classes with respect to each variable, one class small and the other large, for each variable. Take all the people who are taller than six feet six inches, those to the right of the vertical line, and all those to the left who are not so tall. We also divide into two classes, heavy and light: those who are above and below the horizontal line at 220 lbs. You can imagine there are 365 people in the sample: five are tall, four are heavy, and there are only two who are both tall and heavy, to the right of and above the two lines. You line the people up in any order and run them through a machine which is a door frame with a bell and a light on it, in front of some scales. Every time a tall man hits his head on the door frame the light flashes, and every time the scales go over 220 lbs. the bell rings. We count the light flashes, the bell ringings, and the coincidences between them. You can see this is exactly the same problem as the storm and barometer, and we conclude as before that height and weight are not independent.

We could also cast this problem in the form of a chi-square test. Just divide the data up into boxes by vertical and horizontal lines, and count the number in each box. You would have four boxes if you used median vertical and horizontal lines, but you don't have to have the same number of divisions on the vertical and horizontal coordinates, nor do you have to have equal numbers in each class with respect to each variable, though it is usually desirable to do this at least approximately.
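The same numbers cast as a 2 by 2 chi-square test can be sketched directly from the text's example (365 people, 5 tall, 4 heavy, 2 both); the table layout is my own:

```python
# 2 x 2 contingency table: rows = tall / not tall, cols = heavy / not heavy.
both = 2
tall_only = 5 - both                               # tall but not heavy
heavy_only = 4 - both                              # heavy but not tall
neither = 365 - both - tall_only - heavy_only      # 358
table = [[both, tall_only], [heavy_only, neither]]

total = 365
row_sums = [sum(row) for row in table]
col_sums = [table[0][j] + table[1][j] for j in range(2)]
chi2 = 0.0
for i in range(2):
    for j in range(2):
        f_e = row_sums[i] * col_sums[j] / total
        chi2 += (table[i][j] - f_e) ** 2 / f_e
# One degree of freedom.  The count of 2 in the "both" cell against an
# expectation of ~0.055 makes the table very lopsided, as the text warns.
```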

You will notice that the chi-square test can tell you about independence or dependence where a calculation of the correlation coefficient might not. The variables don't have to be distributed in an elliptic fashion, with one variable systematically increasing or decreasing with the other. We might have a V shaped regression line, or an arrangement of the scatter diagram like an X, or a wave like an M or a W. Correlation or linear regression would not reveal this kind of statistical dependence.

You can always convert numerical data to attribute form: tall, short, light, heavy. You can also go the other way, but it is a little more treacherous doing this. Some attributes don't lend themselves very well to numerical assignments. If it is a simple attribute, like sex, you might call male plus one and female minus one, and calculate a correlation coefficient of sex vs. height, or sex vs. weight. If you were studying the effect of dye color on fabric strength, you could test the data by chi-square. A regression or correlation would require assigning numbers to colors. This is not so easy.

With three variables there are six possible correlation coefficients (or equivalent chi-square or event tests): three different pairs of variables with the third summed over, and three more pairs with the third held fixed. If you want to be really thorough, you should in addition test all three at once by chi-square or events. This is the problem the economists run into when they try to regress price on supply or demand, when other variables enter in. Sometimes you can sort out a situation of this kind, say with four or five variables, by examining just two at a time, and there are contingency tables with three or more variables. The same remarks apply to calculations of regression or correlation coefficients.

One booby trap which it seems financiers are particularly inclined to fall into is this one. They have a batch of data, with "measurements" on half a dozen or more variables for each member of the population: prices or price changes, earnings, sales, yield, capitalization, book value, cash flow, volume, etc., etc. They pick just one, the price, or "profit" (price change), or change in value (change of log price), and try to explain just that one in terms of all the others. This is like a committee of the six blind men who felt the elephant. For some reason they pick the tail, or the ear, as the "important" part and pool their information or impressions to explain the properties and behavior of just that part, defined arbitrarily as "the elephant." This is as arbitrary as the man from Mars deciding that just height, or just weight, is the only significant and definitive attribute of a man, and trying to explain just that one in terms of all the other physiological, environmental, or genetic attributes a man might have. You can read a most critical and enlightened account of this approach in the paper by Keenan, "The Great SERM (single equation regression model) Bubble," in Frederickson's book (second ed.). We

all understand why financiers think price or profit is important (a vested interest in the profit concept!) but concentration on trying to explain just this one variable is a very narrow-minded, tunnel-vision approach to the problem. When the variables which describe a population get numerous, as they certainly do in finance, economics, or psychology and anthropology, there is a different kind of approach. Rather than specifying the germane variables as they are given to you from the observations, and trying to get the effect of each, you instead ask of the data: how many different variables are there? If you are clever, you can even let the data try to tell you what they are. You do not start out by naming and identifying the variables in advance. Nor do you ask how each variable operates, i.e., calculate a regression coefficient. Let me give you just two examples of this approach.

You have data on price sequences in 1,000 stocks. You suspect there are variables which influence price changes in these sequences. You want the data to tell you how many, and possibly what they are. There are believed to be "industry factors or components" such as transportation, retailing, utility, steel, oil, etc., etc., in addition to whatever you may know about the individual companies. Can the data on the price sequences alone tell you how many, and possibly what they might be?


Brealey, in his first book (Chap. 5), describes a procedure given by B. F. King for sorting out contributions to the variance of price changes. The method is semi-automatic: you do not have to make preliminary guesses as to what the significant variables are, which might bias your results. Interestingly enough, the industry groupings which result are quite close to what you might have suspected from what you know by observation of what industry these companies are in.
An early and famous historical example of this process of variable sorting and identification occurred in intelligence testing. People knew from common observation that there are different ways to be intelligent. You can be smart in some ways and stupid in others. Memory, verbal ability, numerical ability, logical ability, aural, ocular and manual skill, persuasiveness, imagination or ingenuity are aspects or attributes of this subtle and complicated quality we call intelligence. It was, and still is, not completely known how many independent components of intelligence there are, or just what they are. It was also suspected that "tests" which result in a single numerical "score" might have something to do with intelligence, just as the numerical score of "price" has something to do with the "value" assigned to an object by a buyer or seller, whether it is a painting or an engraved stock certificate.

Thurstone (an engineer at the University of Chicago) wrote a book called "The Vectors of Mind," in which he described one way in which the scores might be used to disentangle and enumerate the components of intelligence. The methods are somewhat related to that which King used to identify the industrial components which go into the changes of a stock's price.

Similar problems to these arise in anthropology and archaeology, where you try to identify and perhaps trace the historical components of a culture or a race of people, or, in biology, of related species. The same kind of problem arises in linguistics, where you try to identify what the relations of different languages are by properties of pronunciation, vocabulary and grammatical construction. Debate on the best methods of doing these things is still going on. Evidently there are similar problems in economics and finance, where you try to identify what and how many the significant variables are.



Seasonality in Stock Prices. An Example of a Problem with Three Variables. Good and Bad Practice in Statistics

At this point the students got Problem 6, for which I have given in these notes the data and explanatory comments from the sources (see Problem 6, Table 4.14-1). These sets of data can be examined in a number of different ways. I want to discuss this problem in some detail in order to illustrate what is good and bad practice in handling data statistically.

The yearly chi-square test as given by Osborne does not distinguish between 12, 6, 4, 3, or 2 month periodicity. It just says that the attribute "advancing or declining index" (the Dow Jones Industrial index, monthly) does depend in a barely significant fashion (statistically, not practically) on the attribute "month of the year." By grouping and adding the count of the data for January with July, March with September, etc., the 12 by 2 table can be reduced to a 6 by 2 table, and tested for a mix of 6, 3, 2-month periodicities. Similar groupings enable tests for 4, 3, and 2-month periodicities. In this way the students found "significant" periodicity (barely, at P = 0.05) only at 6 and 3 months, and that primarily in the DJI. From the above discussion you should see there is nothing sacred

about examining the data this way, using a twelve-month period only. We could have used a 13-month period, a 14-month, or any and all intervals you like, and sorted the original monthly data on index changes into a 2 by n month table and tested by chi-square. Most of them would not have given a "significant" result, but some might. It would not be necessary to have the same number of observations in each month column.

There are still other possibilities. We might have grouped all the data from January to June, and July to December, and tested the resulting 2 by 2 table. If you are looking for significant results, grouping April to September and comparing against the other six consecutive months would very likely give you a significant result. We could also just compare months two at a time.
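The grouping-and-adding step is mechanical; a sketch with hypothetical monthly counts (the real counts are in Table 4.14-1):

```python
# Collapse a 12 x 2 (month x advance/decline) table into a 6 x 2 table
# by adding January with July, February with August, etc., to test for
# a mix of 6, 3, and 2-month periodicities.  Counts are hypothetical.

advances = [25, 20, 28, 22, 19, 24, 26, 21, 18, 23, 27, 22]
declines = [12, 17, 10, 15, 18, 13, 11, 16, 19, 14, 11, 15]

adv6 = [advances[m] + advances[m + 6] for m in range(6)]
dec6 = [declines[m] + declines[m + 6] for m in range(6)]

def chi_square_2xn(row1, row2):
    """Pearson chi-square for a 2 x n table, (n - 1) degrees of freedom."""
    total = sum(row1) + sum(row2)
    chi2 = 0.0
    for a, d in zip(row1, row2):
        col = a + d
        for obs, row_sum in ((a, sum(row1)), (d, sum(row2))):
            f_e = row_sum * col / total
            chi2 += (obs - f_e) ** 2 / f_e
    return chi2

chi2_6 = chi_square_2xn(adv6, dec6)   # 5 degrees of freedom
```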


Comparing March against August gives a two by two contingency table, for the Dow, which is certainly significant by chi-square at the 1% level. When a contingency table has more than four boxes, you can see there are lots of ways to either group the data and condense the table, or examine only parts of it. Some of these tests can be done approximately and quickly using the rule of thumb for the fluctuations in a cell as the square root of the expected number, or more exactly, sqrt(Npq). But you should also see there

is something a little fishy about making a large number of these tests, and accepting uncritically the verdict "significant" or "not significant." If you devise twenty different groupings or tests, either of all the data or just parts of it, then one of them almost surely is going to come out "significant" at the five percent level, even when there is nothing but pure randomness and independence in the original data. This is sometimes called an "error" of the first kind: rejecting the null hypothesis when in fact it is correct. This is a booby trap which people who have to analyze large batches of data can easily fall into.
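The twenty-tests trap is easy to quantify; a quick check, not in the text:

```python
# If twenty independent tests are each run at the five percent level on
# purely random data, the chance that at least one comes out
# "significant" is 1 - 0.95^20, roughly 64 percent.
p_single = 0.05
n_tests = 20
p_at_least_one = 1 - (1 - p_single) ** n_tests
```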

There is one grouping of this data which it is instructive to explore. Considering all three sets of data, they form a three variable contingency table: 1) advance-decline, with two values; 2) month, with twelve values; 3) the "observer," with three values (Osborne, Zinbarg, Brealey). If we wanted, we could test the entire table at once for dependence by chi-square, with (2 - 1)(12 - 1)(3 - 1) = 22 degrees of freedom (Table 4.14-1). The grouping possibilities here are enormous. (Some of the data, given in percentages, has to be converted to frequencies; Osborne's data, from Granville, was also originally given in percents.) Evidently there is a significant variation of advance and decline proportions with observer, but who is contributing the most to the chi-square? The individual contributions to chi-square from Table 4.14-3 are shown in Table 4.14-4, so Osborne's data (taken from Granville) is obviously the maverick, contributing one half of the total "significant" chi-square. This was data from the DJI.

One could also ask the question: Do Zinbarg and Brealey differ significantly? Zinbarg used the Standard and Poor Index; Brealey, using a longer interval of time, used both S&P and Cowles Commission indices. Does the admixture of Cowles data make a significant difference? Apparently not, although the ratio of advances to declines for Zinbarg, 326/214 = 1.52, appears to be noticeably different from that of Brealey.
Zinbarg and Brealey are not significantly different, on this score of relative number of advances and declines. They had half their data in common anyhow, so this is not too surprising. Whether or not the Cowles commission data is significantly different from the S&P data (for different years) would have to be determined by examining that data separately. We can make a comparison for different years and also different indices by subtracting the count for Zinbarg from that. of Brealey. This gives a contingency table for different non- overlapping years, although the years are not consecutive. This grouping choice is given in Table 4.14-6.

So at the five percent level (barely) there is a statistical depen-

dency between the two attributes listed in the above table. We should note carefully just what this conclusion does and does not say. Evidently the years are different in the two columns, but we cannot necessarily conclude that the dependence of proportion of advances and declines is primarily a calendar or “era” effect. It might be due in part to the way in which the indices were computed. The Cowles index was an average of 8 to 18 stocks. The S&P had many more. The formula was changed and the index recomputed in 1957 to weight prices with the number of shares outstanding for each stock. Which SP formula was used for what years is not known to me. The chi-square test just says there is a difference without pinpointing primarily with what attribute of the grouping the difference is associated.

The important point of the above discussion is not just that the data samples are significantly different, which suggests nonstationarity. It is rather that a significance probability should always be computed if it is at all possible to do so. Without it you have no more than intuition or bias as to what to believe or how strongly to believe it. Brealey and Zinbarg "looked" different when we compared ratios of advances to declines, but actually they were not. Significance probabilities at least tell you what is or is not likely to be believable; without them you are just navigating in a fog, not knowing north from south or east from west. Significance probabilities give you a little sense of direction, though they are far from indicators of perfect truth. I must confess I regard statisticians who give data and draw conclusions from data without significance probabilities, or at the least error estimates, in the same class with people who use plumbing and don't pull the chain. It's an incomplete, sloppy job.

There are some booby traps even when significance probabilities are given. I mentioned one: errors of the first kind. A second kind of booby trap depends on the reliability of the distribution from which the significance probability is calculated. For statistical problems for which the basic data is simply counts, as in the above example, the distribution (theoretical, for no dependence) is usually known: chi-square, binomial, Poisson, etc. If the data is not given by counting, but by measurement of the amount of price change, then the shape of the distribution of the data matters a great deal in calculating the significance probability.

For example, instead of giving the number of months of advance and decline, we might have tabulated the actual change of index in its units, or as a percentage, for each month of each year, and given the mean and standard deviation for each month of each year. These standard deviations might or might not be reliably convertible to a significance probability for whatever was being tested: periodicity, difference of observers, or eras. This is one of the points of the papers by Mandelbrot and Fama. Unless you know and have checked out the distribution as being approximately normal, a variance or standard deviation does not tell you very reliably about the probability of significance of your results. It is very easy to be grossly misled on this score. Sometimes the best you can do is assume normality (or transform the data to approximate normality) and give a standard deviation. But not to give any "error" or fluctuation estimate, or data by which it can be estimated, is poor practice indeed.

It is true that counting methods (order statistics) lose "information" in the technical sense, relative to analyzing the quantitative measurements, as might have been done as described above. This loss isn't serious when you have lots of data, as you usually do in securities markets. The compensation for this loss of information is that when you do discover, or think you have discovered something, you can test much more reliably how much to believe it.
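As a concrete sketch of the kind of test being urged here, a 2x2 comparison of counts, of the sort used in the advance/decline tables, can be checked in a few lines. The counts below are hypothetical, not taken from the tables in this chapter; for 1 degree of freedom the significance probability follows from the complementary error function, P(Chi-square > x) = erfc(sqrt(x/2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table [[a, b], [c, d]] of counts,
    with expected counts formed from the row and column totals."""
    n = a + b + c + d
    table = [[a, b], [c, d]]
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

def p_value_1df(chi2):
    """Significance probability P(Chi-square > chi2) for 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts: advances/declines in era 1 vs. era 2.
chi2 = chi_square_2x2(30, 20, 20, 30)
print(round(chi2, 2), round(p_value_1df(chi2), 3))  # 4.0 0.046
```

A result like P = 0.046 is the "sense of direction" spoken of above: believable, but hardly perfect truth, and with many tables one must still expect errors of the first kind.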

If you give statistical results in graphical form or in tables, you should include in the legends and labels of the axes complete information as to exactly what the tabulated or plotted quantities are, including the dimensions, price in dollars per specified unit, time in weeks, etc. To me it is really maddening to look at tables or graphs where this information is lacking and you have to hunt through the text to find these quantities, and frequently not find them. A graph in particular is devised to enable the eye and mind to encompass the data as a whole. If the axes aren't labeled with dimensions and with the quantities plotted, the graph fails in this purpose.
P(Chi-square > 24.6) = 0.011, n = 11 d.f.

Table 4.14-2 Advances and Declines of DJI for March vs. August (from Table 4.14-1)

Table 4.14-3 Months of Advance and Decline of an Index. (1) Data from Granville. (2) N = total number of months. Chi-square = 18, P(Chi-square > 18) << 0.01.

(From Table 4.14-3) Advance and Decline totals, 1871-1917 (1) vs. 1963-1968 (2). (1) Cowles data. (2) S.P. data. Chi-square = 1.78, P(Chi-square > 1.78) = 0.4, n = 1 d.f.; Chi-square = 4.1, P(Chi-square > 4.1) = 0.04, n = 1 d.f.



Some General Methods of Attacking Large Batches of Data. Preliminaries of Analysis of Sequential Data

I now want to take up some general methods of attacking the

description of a large amount of data, with the ultimate end in view of describing sequential data, or time series. Initially, we can ignore the sequential aspect. I want to point out that the history of science and statistics has somewhat biased the way in which people tend to

look at large batches of data. I want to make a special effort to point out these biases and how to take account of them.

First a story, apocryphal perhaps, which illustrates what can be and is done, and how not to do it. There was an Indian student at Oxford who was introduced to the freshman course in mathematics. After about two weeks he went to the professor and said he had better drop the course. He didn't feel competent to manage it. The professor asked him why. He said, "We have been studying logarithms for the last two weeks, and I have only been able to memorize the first 2 pages of the logarithm table. It is just beyond my power to learn them all. I think I better just drop this course. I am a failure." This illustrates, needless to say, the wrong way to go at it.


If you want to understand logarithms you don't try to remember every one of them. Instead you try to be labor saving, and remember and hopefully understand a few properties of logarithms. The table is just something that you use, but you certainly don't try to remember it.

Just remembering that log₁₀ 2 is 0.3 is enough to get you most logarithms, for graphical purposes at least. On linear paper from 0.0 to 1.0, the numbers 1, 2, 4, 8, 10 are plotted at 0.0, 0.3, 0.6, 0.9, 1.0, and you have log paper, as many "cycles" as you want.
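The log₁₀ 2 ≈ 0.3 trick can be checked in a couple of lines (an illustrative sketch, not from the text):

```python
import math

# Positions of 1, 2, 4, 8, 10 on one "cycle" of log paper,
# using only the fact that log10(2) is very nearly 0.3.
log2 = 0.3  # the single value worth memorizing
positions = {1: 0.0, 2: log2, 4: 2 * log2, 8: 3 * log2, 10: 1.0}
for n, approx in positions.items():
    print(n, round(approx, 4), round(math.log10(n), 4))
```

The memorized value is good to about 0.003 of a cycle, well inside the accuracy of any eyeball plot.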

For large batches of observational data (as opposed to mathematical tables), just writing the data down and saving it is not very labor saving either.

The purpose of describing data statistically is

to be labor saving, so that instead of having to remember or keep a record of ten thousand numbers, you only have to remember or

think about a few. A mean, or median, a regression coefficient, one or two parameters which describe the shape of a distribution, may be all you really need.

What the student tried to do with the logarithm table is in some respects analogous to what has and still does go on to some extent in such sciences as astronomy, meteorology and oceanography, for which I can speak from personal familiarity. The end point objective


for many, not all practitioners in these sciences, is the faithful and continuous recording of large amounts of data. There is a saying among astronomers that the best astronomers and the most discoveries come from the British and the Dutch, because they have the worst weather and the worst instruments. The implication is they are forced to take the time and thought to analyze other peoples' data, since they can't get much of their own, or else wring dry what little they themselves can get. I am sure there are similar practices in economics and finance. As a Chicken Little-minded physicist, I have reverence for data faithfully recorded, but one shouldn't stop there.

So we have a batch of data, a sheet of paper with columns of figures, one column for each variable. The stock page in the newspaper is good enough for illustrative purposes. The first step is to look at it. Doing simple things first, one thing at a time, just take one variable, say the closing price column, and look at that. To "look at it" does not mean just read off the numbers, one at a time. The eyeball and the brain, given a chance, can look at the whole column of figures at once and draw conclusions which would otherwise take a bit of computing. Just plot that column of numbers as dots, on a straight line, stacking them up on top of each other if two or more happen to coincide. So the "picture" which the eyeball and brain gets of the column of figures looks something like this. A computer can even make them for you.



Fig. 4.15-1 A "Picture" of a Column of Data on Prices.

From this "picture" of the column of figures the eyeball and brain can read off a lot of calculations. You can read off the mean, by asking where this one dimensional "cloud" of points would balance on the point of a pencil. This means you have calculated, by adding and dividing in your head, without use of numbers, the quantity (1/N) Σ Pᵢ. You can actually do it quite accurately, well within the probable error of the mean, which is all you should really ask. Try it


if you don't believe it. You can get even greater accuracy by doing it twice, once with the picture turned upside down, so as to cancel out systematic errors which the eye and brain make in the slight asymmetry of perception from right to left. I might mention that if you try this with the line of dots running vertically, you will make much bigger systematic errors. As a computer, the eye and the brain add and divide with much larger systematic errors in the vertical direction. That is why when people try to divide a length with the eye, they stretch the length horizontally. What this irregular one dimensional cloud of dots represents to the eye is a frequency distribution, and one can give quantitative substance to this point of view by dividing the axis, price in this case, up into segments, which need not be of equal length, and dividing the number of points in each segment by the length of the segment.

This is a histogram, a plot of the density of dots along

the axis. I have done just this in Figure 4.15-2 from my first paper,

in Cootner (pp. 102-104). It was the very first thing I did when I became seriously interested in the stock market.

A histogram can

show you quite a bit, if you know how to look at it.


Fig. 4.15-2 Distribution Function of Closing Prices for July 31, 1956 (all items, NYSE). No. in panel = (75 ± 8.7)/5 = 15 ± 1.7.

If you want to be thorough and crafty, it pays to plot the histogram with different choices of the independent variable. I have given the same data both for price p and log_e p (Figure 4.15-3), which are standard choices; p², or the square root of p, would give a quite different shape to the histogram. You will note that on the log_e p histogram the preferred stocks stand out very clearly, whereas they are almost invisible in the histogram of p itself. To be really thorough I should have put vertical error bars on each panel of the histogram,


using the square root of n rule, they would be very different in length

for different panels.
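The square-root-of-n rule for panel error bars can be sketched mechanically. The prices below are made up for illustration (the book's own figures use actual NYSE closings), and the helper name is my own:

```python
import math

def histogram_with_errors(values, edges):
    """Count values into panels [edges[i], edges[i+1]) and attach
    a sqrt(n) error bar to each panel count."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return [(edges[i], edges[i + 1], c, math.sqrt(c))
            for i, c in enumerate(counts)]

# Hypothetical closing prices in dollars.
prices = [12, 14, 17, 18, 22, 23, 24, 26, 31, 33, 35, 38, 41, 44, 47, 52]
for lo, hi, n, err in histogram_with_errors(prices, [10, 20, 30, 40, 50, 60]):
    print(f"${lo}-${hi}: {n} +/- {err:.1f}")
```

A bump is worth taking seriously only when it stands more than a standard deviation or two above its neighbors; the 75 ± 8.7 annotation on Fig. 4.15-2 is exactly this rule, sqrt(75) ≈ 8.7.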




Fig. 4.15-3 Distribution for log_e P on July 31, 1956 (All Items, NYSE Common and Preferred).

I have calculated this error bar for an insignificant peak at the panel from $45 to $50, for NYSE data (Figures 4.15-2 and 4.15-5), using measurements from the histogram itself. You will see that it is less than one standard deviation away from its neighbors (see also Figs. 4.15-2, -3, -4) and so probably not significant. A zero-minded statistician would have left out the word "probably," but I didn't, because there is a similar structure right at the middle of the histogram of the common stocks of the ASE (Fig. 4.15-6) from $15 to $20, and in this case the bump is more than 2 standard deviations

Fig. 4.15-5 Distribution Function of log_e P for Common Stocks (NYSE, July 31, 1956).

Fig. 4.15-6 Distribution Function of log_e P for Common Stocks (ASE, July 31, 1956), showing a "significant" bump.

Kendall also has given a most instructive example of unbelievable evidence from a histogram. A machine was turning out metal parts to a certain dimension. All those greater were to be rejected or reworked. The histogram of the original measurements from the inspectors' notebooks showed a sharp drop at the tolerance limit, indicated schematically as follows (Figure 4.15-8). Neither the owners of the plant nor the inspectors could believe the statistical evidence that knowledge of the tolerance limit had biased the measurements. Observers are frequently biased in the digits they can see or estimate. (Kendall and Stuart, Vol. I, pp. 210-216.)


Fig. 4.15-7 Cumulated Distributions of log_e P for NYSE and ASE (Common Stocks). (ASE 7/31/56, N = 505; NYSE 6/30/56, N = 1086; NYSE 7/31/56, N = 978.)



Fig. 4.15-8 Histogram of Measurements of Machine Parts (Schematic); frequency vs. inches.

Problem 7

Histograms with Respect to Eighths. "Quote" Interpretation.

1) Read Cootner, pp. 286-290 (Op. Res. 10, 345, 1962) with especial attention to Figs. 11, 12 and exactly how they were constructed. Execute the "verification" p. 287 with 100 consecutive stocks from any newspaper, any date.

2) Read Frederickson, Frontiers of Investment Analysis (… Ed.), p. 729 (last paragraph) to p. 730, IX, conclusions. Do you see any contradictions to reading 1), or internal contradictions in this reading? Explain them if you can.

3) Read "How to read quotations" (by professional tape reader, on reserve). Spell out exactly what you don't understand, and then dig out an explanation. In particular, pp. 11, 12 concerning the implications of stopped stock. In this connection note the remark about quotes, p. 7. What in general is the "message" comparing "quotes" vs. "transactions"?



Extreme Observations. Some Biases from the History of Science

There is one other observation which should be made, when you have numerical data on a variable plotted out in a line. Focus your attention on the three or four points which are plotted at either extreme of the data, and ask yourself the question: is there anything special about these members of the population, or the way or time at which this data was taken? This question is motivated by elemental animal curiosity; these members are more likely than others to be freakish or unusual in their other properties, and should excite your Chicken Little curiosity. They also mark out the limits of the population so far as this variable is concerned, and we have pointed out that knowing and studying the limits are of basic importance in understanding processes and properties.

In the particular example of market prices on an exchange, the very low priced members are likely to be mangy dogs, which may be delisted and lose some marketability. Or, they might recover, and

are real bargains. At the high price end of the scale we have a mix of high flying hot stocks, and maybe blue chips, candidates for splits,

which, it is said, improves their marketability. You can see analogies

to this in many other situations. If the variable happens to be height or weight of a human population, either extreme is marking limits to just being a human being. A practical example illustrating the importance of noticing extremes is the price earnings ratio, which they now print in the paper every day for every stock. Extremely large or extremely small values are a signal that something is unusual, and not necessarily “good” whether the ratio is very large or very small. The Equity Fund had a very small P/E ratio just before it went bankrupt. The ratio can be extremely large just before a hot stock loses all its glamour and crashes.

From a historical standpoint, the above procedure, focusing on

the extremes of a distribution, is almost exactly the opposite of what was considered acceptable procedure—throw the extreme observations out. This is still a very common practice. The history of how this mischievous practice originated is interesting and also shows the origin of a deep-seated bias in some of our ideas.

The “normal” or gaussian distribution was originally used by

Gauss to describe "errors" in astronomical observations, primarily of star and especially planetary positions; latitude and longitude on the sky, you might say. The sources of errors and corrections are numerous and complicated: errors in clocks, distortions due to refraction of light in the atmosphere, and the bending and other imperfections of the telescope. The astronomers could sort them out and apply corrections, using the method of least squares to evaluate regression coefficients which determine the corrections.

There was much competition between astronomers and observatories to put out data with the smallest "probable error," which is 2/3 the standard deviation. Now in such a least sum of squares method, it only takes a few big deviations to make the final error get big too, so there was an incentive to throw out, preferably with an objective criterion, the observations that made an astronomer or an observatory look bad. The same incentives were present when least squares spread to the other exact sciences. Everybody likes to be accurate, and while it is cheating just to throw the discordant observations out without saying anything about it, there are more subtle ways of accomplishing the same thing. A not very subtle method of diminishing errors is to take more observations. Nominally the error goes down with the square root of this number of observations, as we saw in connection with random walks, so piling up the data to beat down the error was and still is a very common procedure in astronomy. The financial analog is using a 500 stock index to describe "the market;" ten would be enough.

The historical context of science during the era of Gauss and subsequently put an extremely strong bias on the way people look at data of all sorts, even to this day. Newton and Leibniz had invented the calculus, which was admirably devised to describe the detailed motions of the planets under gravitation, and for many other phenomena as well. Calculus is fine for describing solid smooth "thin" lines. A mechanistic philosophy took over: if only you were accurate enough in your observations and took enough of them, you could describe theoretically the smallest details, in principle. The unconscious bias of this viewpoint is manifested when we draw a smooth solid thin line through an elongated cloud of points plotted on a sheet of paper, and unconsciously think of this smooth thin solid line as the ultimate essence of reality underlying the data, which has noise and errors in it. I have spent a long time trying to disabuse you of this unconscious bias in discussing supply and demand. You can see this bias in favor of continuity in representing data when people draw frequency polygons instead of step histograms or "rakes." They may represent pure "white noise" by a violently zigzagging line (continuous) between isolated points. The subjective impression to the eye and the brain of these different representations is appreciably different.

A “line” may be safely considered as a representation of the underlying reality when the dots are observed positions vs. time of a satellite circling the earth, but if the dots are height vs. weight of students, or stock prices or price differences vs. time, then the “line” is a very misleading representation of underlying reality. We saw that the variable advance and dispersion random walk is one way (not the only way) of interpolating your understanding between these two viewpoints.

Astronomers and other exact scientists are not the only ones who like to have their results come out “smooth” “lines” with small “errors” or real fluctuations. Gauss laid the foundation for this rational approach to original sin in science, with the normal distribution and the method of least squares. I am sure you are aware that some financial managers like to see a “smooth” line on a calendar chart for the earnings of their company, and there are discretionary and legitimate accounting practices which enable the line to be smoothed to achieve the image of uniform earnings and uniform growth. Earnings is an intrinsically fuzzy number, which “officially” exists once every three months. It can be made more or less fuzzy to a degree which is under the financier’s control. It is just the problem of the security analyst or external tax auditor to detect just how much the earnings have been doctored up or down.


Discoveries in Noise

I mentioned in an earlier lecture that the marketing department had difficulty in calculating demand curves because of noise in the data. I thought perhaps the noise might be the most interesting part of the phenomena. Let me give you some examples where the "noise," "residuals" or "errors," depending on the terminology of your discipline, really contain a lot of information. This is the honest data you might want to throw out in order to improve your agreement with theory.

Beginning with astronomy, geophysicists have been digging through that data to evaluate the irregular motions of the earth itself. If the earth wobbles or changes its rate of rotation in an irregular fashion, which it does, if continents drift or tilt, these are unexpected and unaccounted for errors in the positions of the stars and the planets.

At the time the data was taken, these possibilities were not part of the theory.

Back at the turn of the century, when electromagnetic phenomena were being actively investigated, physicists tried to develop perfect insulators, or bottles to hold the electric fluid. No matter how they tried, there was always just a little leakage in air. There was absolutely no theoretical reason why dry air should conduct electricity, but it did, just a very little. So they just put this down to experimental limitations, or "residual ionization." Fifteen years later it was discovered as due to cosmic radiation, an extraterrestrial source of energetic particles which are not yet completely understood.

In the middle 1930's Karl Jansky at the Bell Telephone Laboratories was studying noise in telephone circuits and radio communication. This was something the telephone company wanted to get rid of, just like the economist in the marketing department or the astronomers with their precise observations. Some of these communication noise sources were known: thunderstorms and magnetic disturbances, earth and atmospheric electric currents associated with the aurora. But there was a gentle hiss, really not very loud, that at first he couldn't track down. It was noise as radio waves coming from the center of the galaxy, and thus the new science of radio astronomy was born.

In all these cases of noise, the data was trying to tell you something you didn't ask of it. In a sense stock market data might be called economic noise in a rather pure and even amplified form. You can ignore this noise if you want to, and economists did for a long time. You can try to draw underlying trend curves, or smooth out the data by least squares, moving averages and seasonal adjustments, but you just might be ignoring some interesting phenomena.

4.18 Examining Two Variables at Once

The preceding remarks refer to methods of examining the statistical properties of just one variable. In summary the moral is: look at the data all at once with your eyeballs, and with several changes of scale (x, log x, x², √x, etc.). You can ask a computer to check out what you might see, but don't omit the eyeball inspection. Exactly the same remarks refer to variables taken two at a time. The eyeball inspection of a scatter diagram will enable you to estimate the marginal distributions and to calculate with your eye the two regression lines. You can do this by inspection of the data in

uncovered (not overlapping) strips, and estimating means. I do this with data all the time. The product of the two regression slopes gives you the square of the correlation coefficient. Eyeball inspection of the data will tell you whether it makes sense to calculate a correlation or linear regression, i.e., if they are well defined and appropriate quantities. You can also pick off from the scatter diagram extreme members of the sample of various sorts, for special attention. A couple of examples might be instructive. If you just go down the column of the newspaper and plot price vs. dividend you will get a very lopsided scatter diagram concentrated near the origin. On log log paper (i.e., log p vs. log D) it appears much more like a two variable normally distributed scatter diagram. But unless you are careful to put them in as special cases, at the edge of the paper, the zero dividend stocks just won't appear. You have to decide whether a zero dividend "exists" and is zero or just doesn't exist. Just using logs tends to eliminate these "freaks" unless you are careful. A computer won't tell you about these special cases, unless you are careful to ask it. A regression calculation of D on P, D on log P, log D on log P, or other possible choices is very misleading if you don't inspect the data first.

Fig. 4.18-1 Scatter Diagram of Volume (or log Volume) vs. Price Change (Δ log p).


As a second example, consider the plot of Δ log p vs. volume, either sequentially or across the market. Across the market, for a single day or week, these two variables seem nearly independent. If you make a sequential plot, using a single stock price sequence, the scatter diagram looks like a butterfly, very roughly. The regression of v or log v on Δ log p is approximately a V or a U. A calculation of correlation gives you zero. (Figure 4.18-1) The sequential correlation of |Δ log p| vs. volume or log volume is very slightly positive.

Finally, just a word about trying to do three or more variables at once. In a sense the eyeball starts to fail here. It can't really "look" at a three dimensional distribution all at once. But you can plot two at a time (three different "projections") and also plot up slabs, or cross sections, where the third variable is restricted to a finite range, say 1/2 or 1/4 of the data span of the third variable. If you have lots of data, these two dimensional projections and sections can be quite instructive to look at, before you put the data into a computer and start some statistical routine. Three or more variables at once can vary with each other in many complicated and subtle ways, but with enough projections and cross sectional "slabs" you can sort out some of the relationships. I would refer you to Kendall or some of the more advanced statistical texts on this kind of a problem.


The Quantitative Expression of "Eyeball"


In the previous sections I have urged you to "look at" your raw data with your eyeballs, and pointed out some of the ways of doing this: the histogram with panel error bars for one variable at a time, the scatter diagram for two variables at a time. You should take note of clustering or bumps, jumps, asymmetries and extreme values of the observations. Different choices of the independent variable on a histogram made the shape of the histogram change quite a bit. The "classical" way of describing the detailed shape of a distribution is by the moments, the first two (mean and variance) being sufficient to describe the normal distribution. Properties such as asymmetry (skewness) and humpedness (kurtosis) are described by functions of the third and fourth moments. All these quantities have uncertainty or fluctuation estimates which involve moments of twice the order of the given moment. Thus the variance determines the uncertainty of the mean. The fourth moment determines the

uncertainty of the variance (second moment). The sixth moment determines the uncertainty of the third moment, and so on. This is all very fine when moments from the data are well defined, but when they are not, as is frequently the case, then you can use the percentile points (partition values) for the quantitative expression of some properties which you can "see with your eyeballs." Suppose we have the histogram of a variable x. Denote the percentile points or partition values by x.p. Thus x.50 is the median, and x.75 − x.25 is the interquartile range, or (for a normal distribution) twice the "probable error." The following quantities can be used to quantify the various subjective impressions of the histogram. There is nothing unique about the particular values of percentile chosen in any of the criteria 1 - 6.

Property / Expressed by

1. The "central tendency" / x.50 (the median)
2. dispersion / (1/2)(x.83 − x.17)
3. asymmetry (skewness) / …
4. concentration / …

Number 4 will approach unity for a distribution with a high thin peak or central concentration, and approach zero for a U shaped distribution, or absence of a central concentration (a central "hole"). 5 and 6 below express the extent to which there are "outliers" to the right and left in the observed distribution. The number of observations can be compared to the number expected with a normal distribution of the same median = mean and intersextile range x.83 − x.17, or 2σ. For this comparison one should use the Poisson distribution to compute the significance probability, since the number of observations refers to a small and positive integral count. An excessive number of these extra outliers (actually it only takes a few) is responsible for the ill-behavior of the moments, when these are not well defined.


5. Number of outliers to the right, Nr: No. of observations > x.50 + m(1/2)(x.83 − x.17), m = 2, 3, or 4

6. Number of outliers to the left, Nl: No. of observations < x.50 − m(1/2)(x.83 − x.17), m = 2, 3, or 4

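Criteria 1, 2, 5 and 6 above are simple to compute; here is a minimal sketch (the helper names and the sample are my own invention, with linear interpolation standing in for whatever percentile convention the original used):

```python
def percentile(data, p):
    """Percentile point x_p by linear interpolation on the sorted sample."""
    s = sorted(data)
    k = p * (len(s) - 1)
    i = int(k)
    frac = k - i
    if i + 1 < len(s):
        return s[i] + frac * (s[i + 1] - s[i])
    return s[i]

def shape_summary(data, m=2):
    """Criteria 1, 2, 5 and 6: central tendency, dispersion from the
    intersextile half-range, and outlier counts beyond m half-ranges."""
    med = percentile(data, 0.50)
    half_range = 0.5 * (percentile(data, 0.83) - percentile(data, 0.17))
    right = sum(1 for x in data if x > med + m * half_range)
    left = sum(1 for x in data if x < med - m * half_range)
    return med, half_range, right, left

# Hypothetical sample with one wild observation on the right.
sample = [9, 10, 10, 11, 11, 12, 12, 12, 13, 13, 14, 15, 40]
med, hr, n_right, n_left = shape_summary(sample)
print(med, hr, n_right, n_left)
```

The point of using partition values is visible at once: the wild observation at 40 is counted as an outlier but leaves the median and the half-range untouched, whereas it would wreck a variance.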
Criteria 1 through 4 can be assigned uncertainties or standard deviations using formulae for the standard deviation of partition values (see Kendall, Advanced Theory of Statistics). I commented that histograms of the data could be made to look quite different for different choices of the independent variable, and that interesting properties could thereby be revealed, cf. Figs. 4.15-2 and 4.15-3. It is also true that transformations can conceal properties, transforming to the uniform distribution being the extreme case. As another example, consider the stickiness of integers in the case of stock prices. This is easy to see in data on prices and price differences in dollars. If you transform to log prices or percent changes, the effect is still present but much harder to find. See Cootner pp. 289-93 and especially Figure 13. The moral is that transforming data in statistics can cause trouble and confusion, just as transformations caused trouble for the Munchkin boy in the Land of Oz. If you are not careful and observant of the raw data, you may transform away some interesting and unexpected properties. In physics, and especially astronomy, taking data in polar coordinates, and then transforming to the (supposedly) simpler cartesian coordinates, causes all sorts of statistical trouble. Financiers and economists should be thankful they don't have this problem.





Examining Two Variables at Once Where One is the Time. Sequential Data

In the preceding discussion of methods of initially examining data, one or two variables at a time, we did not specifically take into account that the values of one or perhaps both variables might have appeared sequentially in time, as in the case of price and volume. For purposes of eyeball inspection, we can plot up one variable against the time it appeared, and even use some of the simple minded methods we used for a scatter diagram (two variables). But you should remember that the second variable (time, conventionally the x axis) is not a random variable in this case. The days don't come off the calendar even approximately at random. But for our purposes, and to a degree, we can and frequently do assume that they do. Let me show what I mean by this seemingly preposterous remark, and the trouble that assuming it unconsciously can engender. Imagine we have our two variables (y,t), N pairs of them plotted up, and we simple-mindedly go ahead and start to analyze with our eyeballs this "scatter diagram" in exactly the same mechanical way we did before.


One of the calculations we did optically was to divide the abscissa into intervals and calculate the mean of the points in each vertical (y) strip. Formerly we would have called this set of means points on the regression of y on t. Now there is a new word for exactly the same process. We say we have calculated (points on a) "trend."

You will see that we have ignored the order of the dates (averaged them) within each strip or segment of the axis of t. Sometimes this process is just done by drawing a "best fitting" smooth solid line, which is really a running average over some time interval of unspecified length. In such a case the same data is used or "counted" more than once, which improves the "image" of smoothness. Strictly one should use the data in each strip just once, and the trend "line" is evaluated just once for one "point" in each strip, the middle of the t interval. Strictly this is a regression, but not a regression line. Just a set of points. You will notice the subtle bias which Newton and his calculus have put on this, the conventional procedure. We are assuming there "exists" an underlying continuous and smooth process. This is just what a random walk of expected advance (variable or not) small compared to the dispersion per step does not have. I suspect the same is true of a good many economic time series.

Ordinarily one might suppose that the abscissa had been divided into a fairly large number of intervals (20 or more) for this "trend" calculation. In fact for purposes of deciding the question of whether a trend exists or not (mathematically, the property of strict stationarity as defined below) it is more instructive to use two, three, or at most four segments of equal divisions of the total time span of the data, and correspondingly just derive 2, 3 or 4 trend points. The question you then ask: are these four samples of data significantly different or not? Thus if you had a sequence of 1,000 y values, with four divisions you have 4 samples of 250 each.

Evidently one can compare these four samples of data in many different ways; by means, variance, or in general the shape of their distributions by chi-square. It is also evident that if we just want to decide whether the four samples are different from each other or not, a decision can frequently be reached by eyeball. Uncover the four pieces one at a time, and just read off the means and variances (or medians and intersextile ranges). If they are significantly different it will usually be easy to see. Note that in this comparison we are randomizing the dates of the calendar within each block of 250 observations.

You will notice that the above is a mechanical and formal procedure for deciding whether or not the data is changing significantly in its statistical properties with the time. "Trend" usually implies that just the mean, out of all possible statistics, is changing with time. If the mean goes up or down significantly from one block to the next, we have trend. With only four blocks of data, there are only 2^3 = 8 different ways in which the mean can jump around up or down from the value it has for the first block. The most wiggly way it can behave is like the letter N, or N backward or upside down, which, for those of you who have been exposed to sines and cosines, is the lowest frequency or longest complete "period" you can get into the total interval of observation. This is essentially the reason I picked four as the largest number of intervals to use to make this test for stationarity. We will return to this point when we discuss power spectra.


All of the above remarks are intended to give a quick answer to the question of whether or not the statistical properties of the first to fourth (if we used four) divisions of the data changed from sample to sample. If they do, and eyeball inspection is usually sufficient to show this, then the sequence is not stationary; it changes with the time.
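As a modern aside (no such tool existed when these lectures were given), the four-block test can be sketched in a few lines of Python. The random walk fed to it is simulated, and the function names are my own invention:

```python
import random

def block_summaries(seq, n_blocks=4):
    """Split a sequence into equal blocks and summarize each one:
    a mechanical version of the eyeball test described in the text."""
    size = len(seq) // n_blocks
    out = []
    for b in range(n_blocks):
        block = seq[b * size:(b + 1) * size]
        mean = sum(block) / size
        var = sum((x - mean) ** 2 for x in block) / size
        out.append((mean, var))
    return out

random.seed(1)
# A random walk: the cumulative sum of independent steps.
walk, level = [], 0.0
for _ in range(1000):
    level += random.gauss(0.0, 1.0)
    walk.append(level)

# If the block means (or variances) differ markedly from block to
# block, the sequence is not stationary.
for i, (m, v) in enumerate(block_summaries(walk), 1):
    print(f"block {i}: mean {m:8.2f}  variance {v:8.2f}")
```

For the random walk the four block means typically wander far apart, which is exactly the non-stationarity the text describes.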


The Definition of Strict Stationarity

A more precise mathematical definition in terms of ensembles of functions, of which your original data f(t), t = 1, 2, ..., 1000 was a sample, is this. An ensemble of functions {f(t)} is strictly stationary if (see Rosenblatt, Random Processes) the joint distribution over the ensemble of f(t_1), f(t_2), ..., f(t_k), expressed as Φ(f(t_1), f(t_2), ..., f(t_k)), t_1 < t_2 < ... < t_k, is invariant when all the times t_i are translated by the same amount τ, for every k and every τ.


Fig. 5.7-2 Autocovariance and Autocorrelation of a Running Sum of Two Independent Variables.

Fig. 5.7-3 Autocovariance and Autocorrelation for a Running Sum of Five Independent Variables.


For a running sum of k independent variables, successive values of f(t) and f(t + 1) overlap, and the autocorrelation drops linearly from one to zero over the range τ = 0 to k and is zero for larger values of τ, τ > k. Now suppose I just let k get large. In any event, for a real sample of data I only have a finite number of values of f(t), say from t = 1 to 1,000. So f(t) can be imagined as follows:




f(1) = r_1 + const
f(2) = r_1 + r_2 + const
...
f(t) = r_1 + r_2 + r_3 + ... + r_t + const
...
f(1000) = r_1 + r_2 + r_3 + ... + r_1000 + const  (end of sample)   (5.7-5)
(each value of f is the sum of all "previous" r's)

You can see that in this case f(t) is really a random walk, starting at the origin of f, of which my data is one finite sample of length 1,000. The autocorrelation drops to zero over the length of the interval of observation, but I really can't evaluate it very well from the data for the total interval, say T/4 = 250 or T/2 = 500 at the most. So the autocorrelation from the data looks something like Figure 5.7-4.

You can see that this sequence, a random walk as it sits, is neither strictly stationary nor ergodic, and not just for the reason that it is not strictly stationary. The expected moments get larger the longer the interval of observation. The distributions of f(t) and f(t + τ) never approach independence no matter how large τ gets, since both f(t) and f(t + τ) have all the random variables of f(t) prior to t in them. The third condition of ergodicity is not satisfied either.

Fig. 5.7-4 Observed Autocorrelation of a Random Walk (Schematic).
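To make Figure 5.7-4 concrete, a short Python sketch (again a modern illustration, not the author's) compares the sample autocorrelation of a simulated random walk with that of its independent steps:

```python
import random

def autocorr(seq, lag):
    """Sample autocorrelation of seq at a given lag (mean removed)."""
    n = len(seq)
    mean = sum(seq) / n
    var = sum((x - mean) ** 2 for x in seq) / n
    cov = sum((seq[t] - mean) * (seq[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var

random.seed(2)
walk, level = [], 0.0
for _ in range(1000):
    level += random.gauss(0.0, 1.0)
    walk.append(level)
steps = [walk[t + 1] - walk[t] for t in range(999)]

# The walk's autocorrelation stays high far out in lag; the steps'
# autocorrelation hovers near zero at every lag.
for lag in (1, 50, 250):
    print(f"lag {lag:3d}: walk {autocorr(walk, lag):+.2f}  "
          f"steps {autocorr(steps, lag):+.2f}")
```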



The Doctoring of Sequential Data Prior to Statistical Analysis, in Particular for Power Spectra

In the previous lectures I have described, with examples, the properties of strict stationarity, weak stationarity and ergodicity. Strict stationarity was in fact quite restrictive. It required the translational invariance of the distribution not only of the dependent variable, but of all possible multiple joint distributions of the time-dependent variable f(t) at several different times. Weak stationarity relaxed these conditions to apply them only to those properties given by the first two moments, including cross moments of f(t) at just two different times. These determine the autocovariance, autoregression and autocorrelation. These are the standard elementary two-variable statistics.

We also introduced the concept of ergodicity, which was devised as a theoretical property to enable statistics to be determined from a single time sequence. It required strict stationarity, a finite first moment, and asymptotic probability independence of values of the dependent variable for large time separation. Let us imagine we have some observed sequence, T = 1,000 in number, prices for example, and describe how it can be “doctored” to allow a statistical examination. We have f(t); t= 1,2,...1000 =T.



Let us further suppose that f̄ = (1/1000) Σ_{t=1}^{1000} f(t) = 0, or if this was not true for the original data, we have subtracted f̄ out. This gets rid of at least one possible cause of nonergodicity.

Let us further suppose we have no linear trend, here defined as the non-zero slope of a straight line from f(t = 1) to f(t = 1000). This slope, too, might have been subtracted from the raw data. So we have also diminished at least this obvious source of non-strict stationarity and non-ergodicity.

Note that "subtracting out" a linear trend and constant is different for different transformations of the dependent variable f(t). Take the case of price contra log price as a function of time. For the constant term, the log of the mean price is not the same as the mean of the log price. If you have decided for some reason to transform the dependent variable before analyzing it, the subtraction of constants should be done after the transformation, not before it. Subtracting "linear" trend from the log p series corresponds to factoring out e^{const t} from the original data. So "linear" trend and constant removal, if the data is transformed first, has different effects on the data for different transformations. In any event, with either raw or transformed data, "linear" trend (mean difference) and mean removal removes most of the most obvious sources of non-(strict) stationarity and non-ergodicity.
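A minimal Python sketch of this doctoring step, under the text's narrow definition of "linear trend" as the straight line from the first observation to the last; the exponential price series is invented for illustration:

```python
import math

def remove_mean_and_trend(seq):
    """Subtract the "linear trend" (the straight line from the first
    observation to the last) and then the mean, as in the text."""
    n = len(seq)
    slope = (seq[-1] - seq[0]) / (n - 1)
    detrended = [x - slope * t for t, x in enumerate(seq)]
    mean = sum(detrended) / n
    return [x - mean for x in detrended]

# Transform first, then subtract: removing the linear trend from the
# log-price series factors e^(const * t) out of the original prices.
prices = [100.0 * math.exp(0.001 * t) for t in range(200)]
adjusted = remove_mean_and_trend([math.log(p) for p in prices])
print(max(abs(x) for x in adjusted))  # pure growth flattens to ~0
```

Had the trend been subtracted from the raw prices first and the log taken afterward, the result would not flatten, which is the ordering point made above.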

How can we test our data, roughly at least, for the third property of ergodicity,

lim_{τ→∞} Prob(f(t), f(t + τ)) = Prob(f(t)) · Prob(f(t + τ))   (5.8-1)

where the two probability distributions on the right are, if we have strict stationarity, the same?


One simple test is to plot the product moment (1/T) Σ_t f(t + τ) f(t) against τ. The expected value of f(t + τ) f(t) for large τ by the above formula (5.8-1) is

lim_{τ→∞} E f(t + τ) f(t) = E f(t) · E f(t + τ) = (E f(t))^2

the square of the mean (zero in this case). Note that to be really convincing, the product moment has to become and remain near its limit value (E f(t))^2 = 0 appreciably short of τ = T/4. Also note that we are really testing, by use of the first moment only, the independence of the distributions of f(t + τ) and f(t). A second moment test would require comparing E f^2(t) f^2(t + τ) and (E f^2(t))^2. So the preceding is not an exhaustive test for the third property of ergodicity.

Fig. 5.8-1 Test of Data for the Third Condition of Ergodicity.

The above procedures are massaging techniques on the data to render it fit for statistical analysis. In particular we are looking

toward power spectrum analysis, for which weak stationarity is required. Removing the mean and the linear trend can also be achieved by differencing. This removes the mean (or any constant). Removing the mean from the differences removes the linear trend. It is worth remembering these preliminary steps, and the reasons for them, before carrying out a power spectrum analysis. In practically any computer you use, there are programs already built in for the calculation of power spectra. These programs do not necessarily simply take out the mean and mean differences, equivalently a linear trend. They frequently have other procedures of filtering, averaging, “tapering” smoothing or truncating the data. These procedures sometimes amount to the same thing as mean and linear trend removal, and sometimes not. If you just hand the computer a batch of data and ask for a power spectrum, the computer or its staff may not tell you the data is in an inappropriate form. In fact what the program may do is to make you think it is in an appropriate form. An unspecified power spectrum program may act like a garbage disposal unit, it can grind up your false teeth and your diamond rings if you put them in there. Unless you examine your data first, and sift some things out, you might lose what you are looking for in the data, overlook some interesting property you were not looking for, and get quite misled as to what the data says.

You may recall that when I was describing an engineering approach to sequential data, I said that it was in principle possible to fit a polynomial of 1,000 terms to 1,000 values of sequential data. Nobody ever chooses to do this. In practice for describing sequential properties you might fit a quadratic, cubic, or even fourth order polynomial. In principle you could go to order 1,000 and fit (represent) every observation exactly. A power spectrum analysis is something like this process of fitting a 1,000 term polynomial, only you don't use terms of increasing powers t, t^2, t^3, ..., t^1000. Instead you use a different kind of function: sines and cosines. These are essentially functions which wiggle in a uniform way. You have functions of all degrees of wiggliness, from zero wiggles or loops³ in the span of data 1 to 1,000, to 1,000 wiggles in the span of the independent variable, usually the time. This is the largest number of wiggles you need to fit, exactly if you wish, 1,000 data values.

³ This terminology is somewhat imprecise. If you consider that sines and cosines have two loops or wiggles in one period, the statements are quantitatively correct.


The idea is to find out the proportionate mix of all these different wiggling functions which will represent the data. This method has the advantage over the polynomial fitting method in that you can group together as a single term several functions of slightly different wiggliness without altering the picture, or representation, very much.


We have talked a lot about the use of the eyeballs to carry out analyses, in particular, sets of scatter diagrams of f(t) vs. f(t + k).

Consider what the ear does with sound. The pressure it must interpret oscillates at varying frequencies intermediate between the two extremes, and with varying amplitude, whether speech, music or din. The varying intervals between zeros of pressure (at the mean or ambient atmospheric pressure), and the amplitude, or departure from the mean between these zeros, are just what convey pleasure for music, information for speech, or the displeasure of din.

One can break up this wiggling line which shows the pressure variations into what might be called a set of pure tones. This is very simple for an orchestra. One pure tone may be a single clear note on a flute, or a single note from a piano, or a steady toot from a horn. Music, speech or din can be represented as a superposition, or adding up, of all these different single tones to give the actual pressure which your ear feels at any instant. The nature of the description of a power spectrum is to take whatever the irregular pressure vs. time curve is and break it down into all these different uniformly wiggling components, which we identify with our ears in the simplest case as pure tones.

Fig. 5.8-2 Sound Pressure vs. Time.

This procedure which the ear does with pressure vs. time can also be applied to other sequences. It might be prices or price differences vs. time, the height of waves coming in from the sea, or sales of airline tickets per day. It really doesn't matter. But you must take out, just as the ear does so far as simple hearing is concerned, the constant mean or atmospheric pressure, and any steady increase or decrease with time (the linear trend) which might exist if you were going up or down in an elevator or airplane. These contributions of the pressure have physiological effects of their own, but they are not strictly part of hearing.

So we can write⁴ this function of the time, whatever it might be, as a sum of the following sort. Note that the function f(t) in this

⁴ At this point, and in a number of subsequent lectures, I feel obliged to assume a somewhat greater degree of mathematical sophistication on the part of the reader than hitherto required. I regret this; it is a mark of my shortcomings as an expositor that I cannot carefully and exactly express the ideas I need in a simple notation. For this I ask the non-mathematical readers' indulgence. I have done the best I could in summarizing what can be found in more extensive and perhaps more intelligible form in books on Fourier analysis, complex numbers and power spectra. Readers who may feel a little "snowed" by the ensuing mathematics are referred to the comments in Section 5.9, and the last three paragraphs of Sections 5.13 and 5.21. They will be reminded that impressive and complex mathematical concepts and notations have their limitations and absurd implications.


particular case is defined and exists only at a discrete, finite number of values of the time, t = 1, 2, ..., T, so we only need a finite number of terms to represent it exactly over this finite interval. The mathematical expressions are simpler using complex numbers rather than real sines and real cosines, even if, as with real data, f(t) is a real function.⁵

f(t) = Σ_{j=0}^{T-1} a_j e^{2πijt/T},  t = 1, 2, ..., T   (5.8-2)




This is called a Fourier expansion or representation for f(t); the a_j are called expansion coefficients. Alternatively we might use

f(t) = Σ_{j=1}^{T} a_j e^{2πijt/T}   (5.8-3)



In either case there are exactly T (complex) coefficients a_j to represent the T observed values of f(t). With either choice above, the a_j can be calculated from the given values of f(t) by

a_j = (1/T) Σ_{t=1}^{T} f(t) e^{-2πijt/T},  j an integer   (5.8-4)
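Formulae (5.8-2) and (5.8-4) can be checked numerically. The Python sketch below (mine, with a made-up eight-point series) computes the coefficients a_j and then reconstructs f(t) exactly:

```python
import cmath

def fourier_coeffs(f):
    """a_j = (1/T) sum over t of f(t) e^(-2 pi i j t / T), eq. (5.8-4);
    f[0] holds f(t = 1), matching the text's t = 1, ..., T."""
    T = len(f)
    return [sum(f[t] * cmath.exp(-2j * cmath.pi * j * (t + 1) / T)
                for t in range(T)) / T
            for j in range(T)]

def reconstruct(a):
    """f(t) = sum over j of a_j e^(2 pi i j t / T), eq. (5.8-2)."""
    T = len(a)
    return [sum(a[j] * cmath.exp(2j * cmath.pi * j * t / T)
                for j in range(T)).real
            for t in range(1, T + 1)]

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
a = fourier_coeffs(data)
back = reconstruct(a)
print(all(abs(x - y) < 1e-9 for x, y in zip(data, back)))  # prints True
```

Note that a_0 comes out equal to the mean of the data, a point the text returns to later as (5.10-8).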


As will be noted from the alternate forms 5.8-2 or 5.8-3, j is positive and zero in 5.8-2, and positive in 5.8-3, and a value of a_j, j positive, is a complex quantity which we can write as a_j = b_j + ic_j. When f(t) is real we have a_j = a*_{-j}, where * denotes complex conjugate (change i to -i). Moreover, since e^{-2πijt/T} = e^{2πi(T-j)t/T}, we have a*_j = a_{-j} = a_{T-j} for f real. This means that for f real, of the 2T real and imaginary parts of all the T values of a_j, only T of them are different; a_j is "symmetric" about j = T/2, for f real. T even requires that a_0 and a_{T/2} are pure real. T odd just requires a_0 pure real. For T odd or even, a_j with f real satisfies a_j = a*_{T-j}.

⁵ You will note that the independent variable, t, is discrete, but we have not made the same restriction on f(t), or the range set specification, contrary to previous practice on this subject when f(t) is supply and demand. This is a defect

The combinations ω = 2πj/T are called the frequencies⁶, so the ω's run (at discrete frequencies) from 0 to 2π(T - 1)/T in (5.8-2), or from ω = 2π/T to 2π in (5.8-3). The function f(t) is said to be represented by a sum of these frequencies ω_j, each with a complex amplitude a_j, or equivalently a(ω_j). The power spectrum is the positive function of j:

P_j = a_j a*_j = a*_j a_j = a_j a_{T-j} = a(ω) a(2π - ω)   (5.8-5)

Because of the symmetry in a_j mentioned above for real functions f(t), in practice the power spectrum is usually given only from j = 0 to T/2, or equivalently from ω = 0 to π. The remaining part is just the mirror image in the ordinate at j = T/2, or ω = π (on a linear abscissa scale).
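The mirror symmetry is easy to verify numerically; the sketch below (mine, reusing definition (5.8-4) on an invented real series) checks P_j = P_{T-j}:

```python
import cmath

def power_spectrum(f):
    """P_j = a_j a*_j, eq. (5.8-5), with a_j from eq. (5.8-4)."""
    T = len(f)
    a = [sum(f[t] * cmath.exp(-2j * cmath.pi * j * (t + 1) / T)
             for t in range(T)) / T
         for j in range(T)]
    return [abs(aj) ** 2 for aj in a]

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
P = power_spectrum(data)
# For real data the spectrum mirrors about j = T/2, so only
# j = 0 .. T/2 need be plotted.
print(all(abs(P[j] - P[len(data) - j]) < 1e-9
          for j in range(1, len(data))))  # prints True
```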

It is sometimes convenient in examining power spectra to think in terms of "periods," or the repetitive interval of the wiggling functions which compose the spectrum, rather than the (radian) frequencies ω_j = 2πj/T. The periods are P_j = 2π/ω_j = T/j. Longer periods than T, or periods other than T/j with j integral, just don't exist in the data. I emphasize "in the data" because economists and others think and talk about long term trends, parts of long cycles, slowly varying means, so I suppose they must believe in them. None of these things exist in a finite sample of discrete time data no matter how long, if you are going to represent it with a Fourier series. There is a rather special place for the linear trend as I have narrowly defined it (the mean difference) and also a constant in a Fourier expansion. All the rest of the terms in the expansion are for complete and unambiguous periods.

For example, suppose we have weekly data for an interval of two years, T = 104 weeks. The longest period, and hence smallest frequency, is for j = 1, P_1 = 104 weeks. The shortest period (j = T) is unity, one week, for which the radian frequency is 2π week^{-1}. We mentioned in practice the power spectrum is usually only given from

⁶ The word "frequency" is also commonly used for the combination j/T, the number of "complete" cycles (less than unity if j < T) per day, week, etc., if T is measured in days or weeks. ω = 2πj/T is also distinguished by calling it "circular frequency," or "angular frequency," in radians per unit time.

j = 0 to T/2, so the shortest effective period plotted is P_{j=T/2}, or two weeks. The corresponding (radian) frequency is called the Nyquist or folding frequency, π (radian) week^{-1} or 0.5 cycles week^{-1}. The word "folding" frequency means that the total spectrum ω = 0 to 2π, being mirror symmetric at this frequency π, can be folded into coincidence with the range ω = 0 to π. In general one can remember the formula Period = 2/(fraction of π) in units of time (a week in this case) corresponding to the interval between observations. It is very helpful to mark a few of these points on a given power spectrum so you can, in familiar terms, "see what you are looking at."
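The bookkeeping between j, period and frequency is easy to tabulate; a small Python sketch (mine) for the weekly example:

```python
import math

def period_table(T, js):
    """P_j = T / j and omega_j = 2 pi j / T for selected j values."""
    return [(j, T / j, 2 * math.pi * j / T) for j in js]

# Two years of weekly data, T = 104: longest period 104 weeks (j = 1),
# shortest plotted period 2 weeks at the Nyquist frequency (j = T/2).
for j, period, omega in period_table(104, (1, 2, 13, 52)):
    print(f"j = {j:2d}  period = {period:6.1f} weeks  "
          f"omega = {omega:.3f} rad/week")
```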

In practice, sequential data is more often described by power spectra, rather than autocorrelation, or the equivalent, autocovariance. Both power spectrum and autocovariance come directly from the successive scatter diagrams of f(t + k) vs. f(t) which I described in a previous lecture. In fact the autocovariance and power spectrum are closely related, and in exactly the same way that f(t) and its expansion coefficients a_j are related. There are some reasons for this preference for power spectrum over autocovariance, some scientifically sound and others primarily cosmetic. I remind you that either method contains exactly the same information and has exactly the same limitations: statistical relations expressible by second moments and linear regression.

The limitations, ambiguities and shortcomings show up in the successive scatter diagrams which you make to get the autocovariance, but you don't see these limitations, although they are still there, when you only look at the final calculation of the power spectrum, or only at the autocovariance as a function of lag k. You can put any sequence of numbers you like into a computer and punch the button for power spectrum, and the computer will grind out an answer. Unless you know exactly what the computer did (and the routines are not all alike) the emitted power spectrum may be very misleading. In practice power spectrum analysis has its greatest strength when you are trying to find some fairly periodic signal or wiggle function immersed in a great many others. You are interested in just one or two or three of these periodic signals.


Then the power spectrum is a very powerful method indeed for sifting out just one or two of these periodic signals from all the other uninteresting (?!!) noise which may be present. Power spectra can also be used to describe functions of time which have no particularly dominant or even slightly periodic property but just wander back and forth the way stock market prices or other types of time signals sometimes do. The interpretation of the spectra of such sequences is much more difficult than those with just two or three dominant frequencies. If there are concentrated and intermittent bursts of activity, like rainfall in a dry climate, storms in a weather record of wind velocity, temperature and humidity, electrical noise in semiconductors, intermittent turbulence in a fluid, galactic radio noise, or bursts of activity as in a speculative market, the power spectrum representation of the

data can be very treacherous in its interpretation. (See Section 5.21)

We will return to this interpretation problem when we discuss the spectra of random walks.

You may recall from earlier lectures that I warned you about the misconceptions of thinking in terms of thin solid smooth lines when you are discussing data, or some underlying theory. "Thin" implies a sharpness of definition in a functional relationship. If you think of the relation of height against weight just as a thin (regression) line, you are ignoring the obvious and sometimes important fact that real heights can be different for the same weight. Solid lines imply mathematical continuity of the variables. Real numbers are implied to be possible in the data. In fact there always has to be a smallest unit change which you can detect or express in real data. Smoothness implies the existence of a derivative, also implied to be continuous. We saw that sometimes these implied and fallacious assumptions could give trouble, whether in supply-demand relations for an auction market, the surveying experiment of Chicken Little and Zero, or in the distinction of a rake vs. a brick for the uniform distribution. The normal distribution had all these convenient and fictitious properties, which were however most inappropriate for describing experiments with small counts, where the Poisson distribution was more appropriate.

These same difficulties crop up again in a discussion of power spectra, which are frequently thought of as thin, solid and less frequently smooth curves, plotted as a function of ω from zero to π. They are frequently drawn, for cosmetic purposes, as piecewise smooth, more particularly, continuously joined straight solid line segments. As we shall see, the realities of power spectra are rather far from the subjective impression of this cosmetically improved picture. Finally you might note the technical distinction between a Fourier


expansion, or fitting, of the observed data (5.8-4), and the power spectrum representation (5.8-5). There are T values of f(t) to be represented, and there are T different components to the expansion coefficients a_j (2T real and imaginary parts, but only T are different). So the Fourier expansion (5.8-4) can "fit" or represent the T values of f(t) exactly. The power spectrum (5.8-5) "representation" has only T/2 different values. So there must be some "information" "lost" in going from f(t) to its power spectrum, but not in going to the Fourier expansions. This "lost" information may be quite visible to the eye, if you look for it, as we saw in comparing the random telegraph signal and shot noise scatter diagrams.


A Comment on the Disparate Mathematical Concepts Which Can Be Encompassed in Fourier Analysis

You may recall from an earlier lecture that I discussed the conceptual differences between a random walk with very large vs. very small dispersion per step compared with the expected advance per step. When the dispersion per step was large compared to expected advance, as for roulette, or stock market prices, the stochastic, or probabilistic, aspects of the problem dominated. When the dispersion per step was very small compared to the expected advance, as in the student's "random walk" to the parking lot, or the integration of a differential equation numerically with an accumulating rounding-off error, the stochastic aspects of the problem became insignificantly small. We could safely use the concepts of functionality and the continuous and smooth concepts of the calculus. The random walk concept enables us to interpolate between the concept of a stochastic vs. a functional relationship. We also saw that the distinctions between discrete and real numbers (continuity), and smoothness (existence and continuity of derivative), were not quite as simple as intuition might indicate.

The use of a Fourier representation, especially in the complex number form, is essentially a method of simultaneously handling in a general way these conceptually quite different ideas: stochastic vs. functional relationships, and discrete vs. real, continuous numbers. The use of complex numbers, the additional algebraic concept of i = √-1, gives an added flexibility and versatility to the mathematics.

This added generality, where either stochastic or functional relations or both, discrete or continuous or both properties, can be handled by one set of mathematical notations, does cause some complications. Historically, sines and cosines were first used by Fourier to build up solutions of the heat flow or diffusion equation. They were quickly discovered to be quite useful in representing solutions of many other types of problems in the calculus. In the 20th century their utility in conjunction with complex numbers in describing stochastic processes has been more and more widely appreciated. Because of the generality of a notation which encompasses these disparate concepts, there are numerous booby traps encountered when you use Fourier analysis either for theoretical developments or in analyzing real data. I will point out some of them; I doubt if they have all been discovered even now.


Formulas for Fourier Expansion, Expansion Coefficients, Power Spectrum and Autocovariance

Let us examine more closely the general expression for Fourier expansion of a set of data and its power spectrum. Imagine we have a function f(t) defined at discrete instants t = 1, 2, 3, ..., T over a finite interval T. We may regard this as a finite sample of data f(t) generated in any manner whatever. In the following discussion we are not supposing either mean or trend removal has been carried out. At the moment, the formulae are just straight data fitting. Just as it is possible to represent exactly these T values of f(t) with a T-term polynomial, so it is also possible to represent them exactly by T terms of sines plus cosines, or equivalently complex exponentials, as we mentioned. For the moment we will use the form labeling the coefficients with j from 0 to T - 1. The alternate, from j = 1 to T, and other possible choices we shall return to shortly. Repeating our basic formulae, we had:

f(t) = Σ_{j=0}^{T-1} a_j e^{2πijt/T},  t = 1, 2, ..., T   (5.10-1)

a_j = (1/T) Σ_{t=1}^{T} f(t) e^{-2πijt/T}   (5.10-2)


where j may be any integer, positive, negative, zero, greater or less than T.

The power spectrum as a function of j is

P(j) = a_j a*_j = a_j a_{T-j}   (5.10-3)

and as a function of ω_j = 2πj/T,

P(ω) = a(ω) a*(ω) = a(ω) a(2π - ω)

The power spectrum is symmetric about ω = π (ω = 0 if you prefer to think of ω in the range -π to π). The above formulae are exact. The following formulae are also exact provided the following peculiar and somewhat unrealistic definition is given to f(t) outside the domain t = 1 to T for which f(t) is defined. If we use the formula for f(t) given above (5.10-1) outside the domain t = 1 to T, we will find that

f(0) = f(T),  f(1) = f(T + 1),  ...,  f(h) = f(T + h), etc.

In other words f(t), as given by its Fourier representation, repeats itself; it is strictly periodic with period T. This peculiar behavior is not a shortcoming of the expansion for f(t) (5.10-1), which is only intended for the domain set 1 to T, but this peculiar property is needed for the formulae below to be exact. For real data you can see that this behavior is a physical absurdity. It is a property which is somewhat suppressed by various computing routines which are used to calculate autocovariance and power spectra from real data. I also make a distinction below between the autoproduct moment APM(k) and the autocovariance cov(k), the latter being just the former minus the square of the mean. This is in accord with standard statistical terminology.

Autoproduct moment:

APM(k) = (1/T) Σ_{t=1}^{T} f(t) f(t + k),  k = 0, 1, 2, ..., T - 1   (5.10-4)

Mean:

f̄ = (1/T) Σ_{t=1}^{T} f(t)   (5.10-5)

Mean square:

(1/T) Σ_{t=1}^{T} f²(t) = APM(k = 0)   (5.10-6)

Autocovariance:

cov(k) = APM(k) - (f̄)²   (5.10-7)

In terms of the expansion coefficients a_j:

Mean:

f̄ = a_0   (5.10-8)

Mean square:

APM(k = 0) = Σ_{j=0}^{T-1} a_j a*_j = Σ_ω a(ω) a*(ω)   (5.10-9)

Variance of f:

APM(k = 0) - (f̄)² = cov(k = 0) = Σ_{j=1}^{T-1} a_j a*_j = Σ_{ω≠0} a(ω) a*(ω)   (5.10-10)
The autoproduct moment in terms of power spectrum P; = aja} is

APM(k) = ba ajate 2miik/T



If we subtract a3 the above formula becomes

cov(k -¥ ajayeeT 1



The power spectrum in terms of the autoproduct moment is

P(j) = aja% = (1/T)= APM(k)e2TM5#/T



At j =0

this gives

(Ff)? = 46 = (1/T) = APM(S. a



The preceding formulae express the basic properties of a Fourier expansion, which is by definition (5.10-1). Given f(t) as data, there are computer programs to calculate numerical values for all of these formulae. (5.10-4, 5 and 6) are definitions or follow immediately from the definitions. (5.10-8) is just a particular case (j = 0) of (5.10-2), showing the simple relation of the mean f̄ = (1/T) Σ_t f(t) to the first Fourier coefficient, a_0. (5.10-9) and in particular (5.10-10) are frequently expressed verbally: "The variance of f(t) is the sum of the variances (modulus squared) of the individual Fourier components a_j, except the zero frequency component a_0, which is of course (mean f̄)²." Note correspondingly that the term j = 0 or ω = 0 is omitted from (5.10-10) but not from (5.10-9).

All this is in agreement with conventional statistical terminology, where the variance of a sum (the Fourier series for f(t), (5.10-1)) is the sum of the variances of the terms in the sum, if they are all independent in the probability sense. So the contributions a_j of each frequency ω = 2πj/T to f(t) are considered independent in the above sentence. You might note the similarity of the language, but the enormous difference of the context, to the sentence: "The variance of the end point of a random walk (sum of steps) is the sum of the variances of the steps, if the steps are independent in the probability sense." It is one of the beauties of mathematics that the same ideas appear in widely different contexts.

(5.10-11, 12, 13) express the fact that the autoproduct moment and power spectrum are Fourier transforms of each other, a relationship identical to that expressed in (5.10-1) and (5.10-2) between f(t) and its expansion coefficients a_j. Subtracting the mean squared (f̄)² = (a_0)² from (5.10-11) and (5.10-13) gives exactly the same result. (5.10-14) relates the squared mean (f̄)² = (a_0)² to the sum (cf. "area") of the autoproduct moment. With (f̄)² = (a_0)² = 0, this sum ("area") must be zero.

It should be noted that some of the properties of these exact formulae (5.10-5 to 14) violate some of the specifications that we laid down in order to permit analysis of sequential data statistically. In particular you will note that the autocovariance, Eqs. (5.10-4) or (5.10-7), is strictly periodic in k with period T. Cov(k) is also symmetric about k = 0 and k = T/2, just as a_j a*_j was. We required for ergodicity that E f(t + k) f(t) = (E f(t))² for k large. So this

ergodicity condition is obviously violated. A second troublesome property is (5.10-14). This requires that

if the mean (f = ao) is zero, the sum (cf “area”) of the autoco-

variance is zero. Equivalently this requires that the autocovariance, when summed, have equal positive and negative contributions. There is nothing in the definitions of either strict or weak stationarity to require this. In fact one of the commonest types of autocovariance, ∼ e^{−α|k|}, does not have this zero sum property. It is just the purpose of the various smoothing, averaging and truncating processes built into a computer to ameliorate these theoretical deficiencies. They doctor the data to make it fit the theory of functions having a power spectrum. In our discussion of the successive scatter diagrams to obtain the covariance, we stopped making them at k = T/4, or T/2 at the most, in order to avoid having to discuss these embarrassing properties.
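The zero-sum property (5.10-14) that these procedures must doctor away is easy to verify numerically. A minimal sketch in Python (the length T and the seed are arbitrary choices): once the mean is removed, the strictly periodic (circular) autocovariance sums to zero identically, whatever the data.

```python
import random

random.seed(1)
T = 64
f = [random.gauss(0.0, 1.0) for _ in range(T)]
mean = sum(f) / T
g = [x - mean for x in f]                 # mean removed, so a_0 = 0

def gamma(k):
    # circular (strictly periodic) autocovariance, period T, as in (5.10-4)
    return sum(g[t] * g[(t + k) % T] for t in range(T)) / T

area = sum(gamma(k) for k in range(T))    # the "area" of the autocovariance
print(abs(area) < 1e-9)                   # True: the area vanishes identically
```

The cancellation is exact, not statistical: summing γ(k) over a full period factors into (Σ g)²/T, which is zero by construction.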

It is not uncommon for a rigorous mathematical and logical description of a phenomenon to appear to have absurd implications which contradict common sense and simple observations. Fourier analysis is no exception in this regard. We saw examples of this in connection with various “impossibility” theorems.


A Digression on the Orthogonality Relation

All the formulae in Section 5.10 can be derived from the definition of a Fourier expansion (5.10-1), using the following relation:

Σ_{j=0}^{T−1} e^{2πij(k−ℓ)/T} = T   for k = ℓ + mT, m an integer,
                              = 0   for k ≠ ℓ + mT.

These two results are summarized in the notation:

Σ_{j=0}^{T−1} e^{2πij(k−ℓ)/T} = T δ_{k, ℓ+mT}


This is called the orthogonality relation. It can be considered as a geometrical construction, - a sum of unit vectors, the complex exponentials, which add up to a straight line of length T for k = ℓ + mT. For k ≠ ℓ + mT, the sum curls up to a polygon, star or rosette which closes on itself (sum = 0). |k − ℓ| is just the number of complete (2π) revolutions of the unit vector in drawing the sum as a geometric figure.



Consider the sum of powers of an arbitrary number r,

V = 1 + r + r² + ⋯ + r^{T−1} = Σ_{j=0}^{T−1} r^j.

We can write this as

Σ_{j=0}^{T−1} r^j = (1 − r^T)/(1 − r).

This is an exact formula; it works for any r except r = 1, where the sum is just a sum of T ones.

Now suppose we take for r the complex number r = e^{2πik/T}. This has a magnitude of one, but it is not equal to one (real), unless k = 0, ±T, ±2T etc. It is a unit vector, or arrow, with an angle of 2πk/T to the x axis. This angle is zero at k = 0, ±T etc. Then

Σ_{j=0}^{T−1} r^j = Σ_{j=0}^{T−1} (e^{2πik/T})^j = (1 − e^{2πik})/(1 − e^{2πik/T}).

The numerator of this expression is zero for all k, k an integer. The denominator is not zero, unless k is 0, ±T, ±2T etc. When this indeterminacy occurs, we just go back to our definition of the sum as a sum of T ones. So we have:

Σ_{j=0}^{T−1} e^{2πijk/T} = T,   k = 0, ±T, ±2T etc.,
                          = 0,   all other k.     (5.11-5)


This is just our orthogonality relation with the single integer k replacing the difference of two integral indices (k − ℓ). The geometrical interpretation of this sum is that of a polygon, star or rosette. You can check it out by using a box of toothpicks, and use T = 5, 6, 7, or 8 as an example.
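If toothpicks are not handy, the rosette can be checked numerically as well. A minimal sketch in Python (T = 8 and the particular k, ℓ values are arbitrary choices):

```python
import cmath

T = 8
def ortho_sum(k, l):
    # sum over j = 0 .. T-1 of exp(2*pi*i*j*(k - l)/T)
    return sum(cmath.exp(2j * cmath.pi * j * (k - l) / T) for j in range(T))

print(abs(ortho_sum(3, 3)))       # k = l:        a sum of T ones -> 8
print(abs(ortho_sum(3 + T, 3)))   # k = l + mT:   again 8 (within rounding)
print(abs(ortho_sum(5, 2)))       # k != l + mT:  the rosette closes -> ~0
```

The third sum traces the closed polygon: eight unit arrows whose directions advance by 2π(k−ℓ)/T each step, ending where they started.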


A Digression on Aliasing

I want to emphasize that the above representation of f(t), (5.8-2) or (5.10-1), is not restricted to functions for which f̄ = (1/T) Σ_{t=1}^{T} f(t) is zero, nor is it restricted to functions with a zero linear trend. I emphasized that the constant and linear trend had to be taken out before we could hope to examine data on f(t) statistically, but calculating the power spectrum, either directly or via the exact covariance (5.10-13), will not make these removals for you.

There is moreover a peculiarity in the above Fourier representation which brings out the difference between thinking of f(t) as a function of a discrete vs. a continuous variable t. Let us suppose f(t) = constant, t = 1, 2, . . . , T, and not defined for any other values of t. If we use the form for f(t) in (5.10-1), complex exponentials,

f(t) = Σ_{j=0}^{T−1} a_j e^{2πijt/T},

then for this representation the first coefficient a_0 = constant, and all the remaining a_j, j = 1, 2, . . . , T−1, are zero. So we have

f(t) = const · e^{2πi(j=0)t/T} = const (real for f(t) real).

If we choose to consider t as a continuous variable (it was not such in the definition of f(t)), then the Fourier representation of f(t), t now continuous, looks like Figure 5.12-1. f(t) is correctly represented where it originally existed, at the integers, and is the constant between them, plausibly enough.

We can equally well use for f(t) the representation (5.8-3), j = 1 to T:

f(t) = Σ_{j=1}^{T} a_j e^{2πijt/T}.

With this choice we find a_1, a_2, . . . , a_{T−1} all = 0, and only the last coefficient a_T of the series is different from zero, a_T = const. For this case we have

f(t) = C e^{2πi(j=T)t/T}

as a representation of f(t) = constant as a Fourier expansion. If we plot this up with t now considered as a continuous variable (using only the real part of e^{2πit}), we find Figure 5.12-2.

[Figure 5.12-1]

Fig. 5.12-1 Fourier Representation of f(t) = const. by a_0 (the j = 0 term).

[Figure 5.12-2]

Fig. 5.12-2 Fourier Representation of f(t) = const. by a_T e^{2πi(j=T)t/T} (real part).

The above phenomenon, in which a constant function f(t) (defined only at integers) can be represented continuously by either a “brick” or a “rake,” is a simple example of a general phenomenon called aliasing of frequencies. A frequency j = 0 or ω = 0 is “alias” j = T or ω = 2π. In fact any function defined only at T consecutive integers of its argument can be represented by any consecutive set of T Fourier coefficients and associated frequencies, j = 0 to T−1, or j = 1 to T. In general, j = k to j = T + k − 1. Correspondingly we had to suppose strict periodicity in T for f(t) for equations (5.10-4 to 14) to be valid. In terms of ω, any span of length 2π(T−1)/T ≈ 2π is equally valid. Many treatments prefer to consider ω from −π to π, but give only the interval 0 to π, as the spectrum is symmetric about ω = 0 or ω = π for real functions f(t). In what follows we will just stick to the representations j = 0 to j = T−1. The graphs only cover j = 0 to T/2. Any consecutive T values of j and the corresponding ω range would serve as well.
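The brick/rake alias of the constant can be exhibited by computing the coefficients directly. A sketch in Python (T = 8 is arbitrary; the coefficients are taken as a_j = (1/T) Σ_t f(t) e^{−2πijt/T}, matching the expansion above):

```python
import cmath

T = 8
f = [1.0] * T                      # f(t) = const, defined only at t = 1..T

def coeff(j):
    # a_j = (1/T) sum_t f(t) exp(-2*pi*i*j*t/T)
    return sum(f[t - 1] * cmath.exp(-2j * cmath.pi * j * t / T)
               for t in range(1, T + 1)) / T

print([round(abs(coeff(j)), 6) for j in range(T)])         # only a_0 survives
print([round(abs(coeff(j)), 6) for j in range(1, T + 1)])  # only a_T survives
```

Both coefficient sets reproduce f(t) exactly at the integers; they disagree only between them, where f(t) was never defined.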

Aliasing is a mathematical curiosity, a property of Fourier representations for an f(t) defined and existing only at t integral. Aliasing is practically significant if there are sound reasons for believing that a function f(t) exists and has significant properties between the values t = 1, 2, 3 . . . for which it was observed. For many physical measurements (position of a satellite, temperature on a weather record) this is a reasonable assumption - the mathematical continuity of the variable t and the function f(t). For other types of sequences, notably economic ones, we have indicated that assumptions for any sort of existence of f(t) between the observations are much more debatable. Nor for that matter are the observations spaced uniformly in time.

We can see in the preceding discussion the rather peculiar and special role which a non-zero mean f̄ = (1/T) Σ_{t=1}^{T} f(t) = a_0 plays in the Fourier series and power spectrum representation of the data. The mean may appear either as a zero frequency component, j = 0, ω = 0, or as j = T, ω = 2π, a high frequency component, oscillating precisely at the interval one (day, week, month) at which we take (or see or hear) our data. It might seem simpler to just subtract it out once and for all and forget about it. However, we saw that it was not acceptable to subtract the same mean from all the different scatter diagrams for autocovariance, if we are using real data. Only if we made the physically absurd assumption of strict periodicity over an interval T is a single mean subtraction acceptable. We are committing the mortal sin of making assumptions about the data to fit the theory of the Fourier expansion.

One way or another, the mean is “mean,” and a source of trouble and uncertainty. We need to get rid of it for ergodicity, but we cannot get rid of it completely. When we discuss the structure function and the closely related interquartile range of the differences, we shall see a way around this difficulty, but with power spectra it makes trouble.


The Interpretation of the Mean and Linear Trend as Fourier Expansion Coefficients. Mathematical Paradoxes

Although we have not mentioned it in the preceding discussion, the difficulties and uncertainties associated with linear trend (as we have defined it—the non-zero mean difference) are, from a mathematical standpoint, of the same type as those associated with a non-zero mean.

To show this connection, let us drop for the moment our previously emphasized assertion that f(t) only exists at discrete values of t = 1, 2, 3, . . . , T. Instead of adhering to the gospel according to the modern prophets, Chicken Little and Zero, we adopt the view of those ancient sages Leibniz and Newton. We imagine our “data” f(t) really exists for all real numbers t, not just the integers where we took our data. Putting f(t) = 0 outside some finite domain of t, so that f(t) “exists” for all t, even if zero for most of it, is really quite as arbitrary and absurd as supposing, as we did for the sake of Fourier analysis, that it was strictly periodic in T.

In any event, the expression for this situation, corresponding to our Fourier series expression (5.10-1), instead of a sum of separate amplitudes a_j or a(ω_j), is an integral of superposition,

f(t) = ∫ a(ω) e^{iωt} dω.     (5.13-1)

This is a superposition of wiggling functions e^{iωt} with an amplitude (density) a(ω) along the ω axis; a(ω)dω is the amount of this wiggling function from ω to ω + dω. Now just what is the nature of this wiggling function which we superpose (in the limit infinitely many, each infinitesimally small) to get f(t)? It is a solution of the differential equation for simple harmonic motion,

d²y/dt² + ω²y = 0.     (5.13-2)

The amplitudes a(ω) correspond to the constants of integration:

y = A(ω) sin ωt + B(ω) cos ωt,

or we may write

y = A(ω) e^{iωt} + B(ω) e^{−iωt}.

The amplitudes A(ω), B(ω) can be considered as functions of ω. For ω = 0 the solution of (5.13-2) is

y = A₀ t + B₀,

which is just a constant and a linear trend. The ω = 0 “components” of the spectrum are the constants of integration, exactly as they are the amplitudes for non-zero frequency ω.

The limit as ω → 0 is also instructive of the difficulties encountered when the constant and linear trend are not removed from data. For t fixed, B(ω) cos ωt approaches a constant as ω → 0, if B(ω) is continuous in ω and not zero. There is nothing in the mathematics that says B(ω) has to be continuous in ω. There are important cases when it is not. A(ω) sin ωt approaches (t fixed) a linear trend as ω → 0 only if A(ω)ω → const. So a linear trend puts an “infinite amplitude” A(ω) ∼ const/ω component as ω → 0. In the power spectrum this is a contribution of order const/ω². In numerical analysis this can cause considerable ambiguities at low frequencies (i.e. ω close to zero) if and only if you believe the f(t) which you are representing with sines and cosines has mathematical continuity as a function of t. If f(t) doesn't exist between t and t + 1, ω between 0 and 2π/T doesn't exist either.
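The const/ω² behaviour shows up already in the discrete expansion of a pure trend. A sketch in Python (T = 64 is arbitrary): for f(t) = t the low-frequency power |a_j|² is close to T²/(4π²j²), so j²|a_j|² is roughly constant while |a_j|² itself blows up toward ω = 0.

```python
import cmath

T = 64
f = [float(t) for t in range(1, T + 1)]       # a pure linear trend, no noise

def power(j):
    # |a_j|^2 with a_j = (1/T) sum_t f(t) exp(-2*pi*i*j*t/T)
    a = sum(f[t - 1] * cmath.exp(-2j * cmath.pi * j * t / T)
            for t in range(1, T + 1)) / T
    return abs(a) ** 2

# j^2 |a_j|^2 stays near T^2/(4 pi^2), i.e. roughly constant at low j
print([round(j * j * power(j), 1) for j in (1, 2, 4, 8)])
```

For this finite sum the exact value is |a_j|² = 1/(2 − 2 cos(2πj/T)), which reduces to T²/(4π²j²) for small j.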

One can note an analogy of sorts with the experiment of Chicken

Little and Zero, and the Cauchy distribution. With linear trend or a non-zero mean, plus a belief in real numbers and mathematical continuity, you are using a Fourier representation or power spectrum

analysis in circumstances where it isn’t entirely appropriate.


From a more advanced mathematical standpoint we can point out the following in connection with these zero frequency terms which have to be handled separately in using sines and cosines to represent a function. The name of Fourier is usually associated with expansions in sines and cosines. There are numerous other families of expansion functions; each member of such a family is associated with a characteristic number or eigenvalue, and the associated function is called the eigen or proper function. For these other kinds of wiggling functions, the zero eigenvalue and associated eigenfunction are special cases, which, if needed (which is seldom in most physical problems where such functions are used), have to be handled separately, just as the constant and linear trend in Fourier analysis.

The purpose of these digressions on aliasing, means and linear trends is to remind you that the theory of Fourier representations and power spectra, like all theories, has its limitations. Hopefully they have drawn your attention to the limitations of your mathematics.

You may recall that in the 8th grade you learned in algebra to manipulate letters which stood for numbers, instead of manipulating numerals as in arithmetic. Suppose you had this problem: the difference of two boys' ages is two years; the product is 24 (years)². How old are they? If you translate this sentence in English into algebraic symbols, and manipulate them according to the rules, you find the boys are 6 and 4 years old, or −6 and −4. If you translate these answers back into English, one (possibly absurd) implication of the second answer is that they haven't been born yet, or, in a science fiction world, that you can live backward in time. You could add to the mathematical specification of your problem that age must be a positive number; the algebra alone won't do that for you. The theory of sets of numbers will, if you use it.

The paradoxes of Fourier analysis are similar to the above, but much more subtle. You have to pay close attention as you translate back and forth from the mathematical or numerical conclusions to every day language and concepts. You can find in Section 5.21 an example of a paradox of Fourier analysis, consequent to its ability to handle in a general way both functional and stochastic relations.

The “expected” power spectra of random walks, linear trends and a step function, all look exactly alike.


The Spectrum of White Noise. A Sequence of Independent in Probability Values of f(t)

Let us take up a specific example of power spectra - so-called white noise.⁷ This is one of the commonest and simplest cases, and is frequently used as a comparison standard or bench mark against which other sequences giving different power spectra are compared. We commented earlier that the power spectrum was frequently thought of and depicted as a thin, solid and perhaps piecewise smooth line. The same could be said of the autocovariance. In fact it was one of the conditions for the existence of a power spectrum (at least for a real function of a real variable t) that the (expected) autocovariance be continuous, i.e., a solid line. So it seems inherently reasonable that

⁷ The term “white” as applied to a spectrum means that all frequencies (in

of the conditions for the existence of a power spectrum (at least for a real function of a real variable t) that the (expected) autocovariance be continuous, i.e., a solid line. So it seems inherently reasonable that 7 The term “white” as applied to a spectrum means that all frequencies (in

practice over a finite domain of frequency) are present in equal amounts, in analogy to white light (sunlight) containing all colors in equal amounts—also over a

finite domain (red to violet, the range of the eyeball). To the ear, white noise is approximated by the sound of a hail storm on a slate or tin roof—all frequencies to which the ear responds are present, in equal numbers.

the (expected) power spectrum, being the Fourier transform of the autocovariance, should have some degree of solidity (i.e., continuity). We saw that if f(t) was dotted, or defined at discrete values of t only, the coefficients a_j or a(ω) were dotted too. Yet it seems reasonable that such dotted functions might be approximated by solid and piecewise smooth lines. As we saw, this was a common enough practice, and works well in some cases, but not all, as we have gone to considerable pains to show. So we are going to take the case of white noise, and see just how, in practice with real data, the power spectrum of white noise is not very well represented by a thin solid smooth line.

Thinking just in terms of a thin solid smooth line is like thinking of the relation of weight y to height x just as a line. There are real and important fluctuations around this (regression) line. The expected value of weight y might well be in some approximate or special sense smooth and continuous in x, but it would not make sense to draw a zigzag line connecting successive values of observed weight with increasing height. It doesn't make sense with power spectra either, but it is a very common practice nevertheless.

So let us imagine our f(t) to be created in the following way. We make t = 1, 2, 3, . . . , T withdrawals in order from a normal (Gaussian) random number table⁸ of zero mean and unit variance. These ordered choices are our f(t). We might repeat the experiment to get another, different f(t). We would certainly expect its statistical properties to be much the same as for the first f(t). These two ordered choices of T = 1000, say, members each, are two members of the ensemble of all possible f(t)'s. We will use E (expectation) to calculate expectations over this ensemble, and bars to indicate averages over one member of such an ensemble.

We can see that as T gets large, f̄ = (1/T) Σ_t f(t) → E(f) = 0, so this system satisfies the second condition of ergodicity and the other conditions of both strict and weak stationarity as well.⁹

⁸ For what follows there is no requirement that this table have a normal or Gaussian distribution. Digits valued at plus one for even digits, minus one for odd, would serve equally well; these also have zero expected value and unit variance.
⁹ Except, as we have noted, mathematical continuity of the autocovariance for all values of delay, k.

In order to determine what you might expect to get from this experiment, you simply calculate the expected value (over the ensemble) of our Eqs. (5.10-1 to 14) describing Fourier expansions and power spectra, and compare this with what you actually get in an experiment or repeated experiments. Beginning with simple things first, take just f(t) itself. Since Ef(t) = 0, this is just a row of dots on the axis of abscissa. We have given this in Fig. 5.14-1(a), also the experimental results, Fig. 5.14-1(b).

We have plotted the experimental results in a number of different ways, in order to illustrate the cosmetic effect of different plotting methods. These plotting methods also illustrate the implied and unconscious assumptions behind them which the reader inadvertently adopts. There are numerous examples in the literature.⁶ At the right, twisted through 90 degrees, we have sketched the appropriate distribution and given its analytic expression. As we shall see, this distribution has an appreciable effect on some of the cosmetic impressions.

Figure 5.14-1(a) is just the plot of Ef(t) itself, a row of dots on the t axis. (b) is what I call the Chicken Little Zero plot, a picture, no more and no less than what the data says. If the time scale is highly compressed, as it frequently is for data of this type, it is difficult to read off the time order of the points on a graph of this type. The subsequent plots ameliorate this difficulty, at the price of some unconscious and misleading assumptions.

In Figure 5.14-1(c) and (d) the dots of (b) are extended, backward in (d) and forward in (c), to form short horizontal lines. These are connected by vertical segments to succeeding and preceding horizontal lines. The effect of these vertical strips is to make clear what the order in the sequence is, but they should not be considered part of the process which generates f(t). There is nothing which moves along these vertical segments; their actual existence would destroy functionality for f(t).

There is also nothing existing or “moving” along the horizontal segments either. They provide an impression of piecewise continuity which is not actually the case. Note that plot (d) seems to “lead” plot (c), simply because of the way they are plotted.

Figure 5.14-1(e) is a very common plot for data of this type. It “supposes” that f(t) exists at all t, and has to “get” to its successive observed values, at the integers, as quickly as possible, i.e., by straight lines. It is continuous everywhere and piecewise smooth.

⁶ See e.g., Granger and Morgenstern, p. 53.


[Figure 5.14-1]

Fig. 5.14-1 a: Expected; b, c, d, e: Observed.

Fig. 5.14-1, f: Smoothed or Running Average of Observations of Fig. 5.14-1 b.

There is one subjective impression of (e) which is correct, and shows up more clearly than in (b), (c) or (d) and not at all in (a), the “expected behavior.” Subjectively (e) seems excessively zigzagged. A majority of upslants are followed by downslants, a majority of downslants are followed by upslants. This states the fact that if successive values of f(t) are independent in the probability sense, successive differences of f(t), Δf(t) = f(t) − f(t − 1), are not independent, and in fact negatively correlated. Upslants are positive differences, downslants negative differences. There are even formulae for computing their negative correlation by counting peaks, or crossings of f = 0 of this zigzag line.
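The negative correlation of the slants can also be estimated directly from a simulated sequence. A sketch in Python (the length and seed are arbitrary): for independent f(t), the theoretical correlation between successive differences is −1/2.

```python
import random

random.seed(2)
T = 20000
f = [random.choice((-1.0, 1.0)) for _ in range(T)]    # the +-1 white-noise sequence

d = [f[t] - f[t - 1] for t in range(1, T)]            # successive differences
m = sum(d) / len(d)
var = sum((x - m) ** 2 for x in d) / len(d)
cov = sum((d[t] - m) * (d[t + 1] - m) for t in range(len(d) - 1)) / (len(d) - 1)
print(round(cov / var, 2))                            # close to -0.5
```

Each f(t) enters two adjacent differences with opposite signs, which is the whole source of the −1/2.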

Examples of financial data which have approximately the properties of our random sequences are annual earnings, quarterly earnings to somewhat less degree; daily, weekly or monthly closing stock price

changes, daily advances and declines. The volume sequences (daily,

weekly and monthly) have noticeable departures from the properties

smoothed or moving average of the original data on f(t), Figure

5.14-1(b). We give this in Figure 5.14-1(f). This is a very common

practice, if you believe unconsciously that there is a continuous and smooth process underlying the data. Contrast this with the random walk, a continuous line (almost surely, almost everywhere) with a slope almost surely almost nowhere: imaginable, but not depictable.




[Figure 5.14-2]

Fig. 5.14-2 A Random Sequence of Variables of Unit Variance and Zero Mean: +1 for even digits, −1 for odd digits from a random digit table.

The expected values and variances of the two sequences (Figures 5.14-1 and 5.14-2) are exactly the same. The differences in appearance are due to the differences of the distributions of f, plotted at the right of Figures 5.14-1 and 5.14-2.

Note that the expected value of f(t) in both cases is zero. It is an amusing and confusing semantic paradox that with the f(t) = ±1 sequences you will never observe (in one observation) what you mathematically and theoretically expect as the best statistical estimate, and almost never with the random normal sequence for f(t).

So much for what the data on f(t) looks like. What do the coefficients a_j or a(ω) look like, and the power spectrum? The expected values of a_j are easy to evaluate, and they are all zero. We had

a_j = (1/T) Σ_{t=1}^{T} f(t) e^{−2πijt/T},   j integral.

Remembering that the successive f(t) are independent, the expectation of a sum is the sum of the expectations, in the above case times the complex numerical factors e^{−2πijt/T}. So

E(a_j) = (1/T) Σ_{t=1}^{T} E f(t) e^{−2πijt/T} = 0.

Note that a_j is usually a complex number, so E(a_j) = 0 applies both to the real and imaginary parts of a_j.

In order to discuss what the probability distribution of the coefficients a_j looks like, Figure 5.14-3, we have to take account of the fact that the a_j are complex. The two parts can be considered in two different ways, corresponding to the cartesian and polar coordinates of a point. Let us write out the expression for a_j explicitly as a complex number a_j = b_j + ic_j; b_j, c_j being the real and imaginary parts:

a_j = b_j + ic_j = (1/T) Σ_t f(t) cos(2πjt/T) − i (1/T) Σ_t f(t) sin(2πjt/T).

The two sums give separately b_j and c_j.



[Figure 5.14-3]

Fig. 5.14-3 The Distribution of Real Component b_j or Imaginary Component c_j of a Fourier Expansion Coefficient a_j = b_j + √−1 c_j.

You can see from the above formula that the b's and c's separately will each have (in the limit of large T) a normal distribution, simply from the operation of the central limit theorem. Since we have, e.g.,

b_j = (1/T) Σ_{t=1}^{T} f(t) cos(2πjt/T),

b_j is a sum of independent random variables, the f's times coefficients (1/T) cos(2πjt/T), which are in absolute value less than or equal to (1/T). The f's must have the first two moments finite and hence be “well behaved” (cf. Sections 4.5-7); the actual details of the distribution of the f's are unimportant. With well behaved moments for f(t) the b's will be “normally” distributed (a continuous approximation) independent of these details. We could see these details of the distribution of f(t) in the successive scatter diagrams of f(t) vs. f(t+k), but this information is “lost” if we only look at the Fourier expansion coefficients, or power spectrum. The mean and mean square of the distribution of f(t) affect the (expected value of the) power spectrum; no other properties of the distribution of f(t) enter in.

So regardless of whether f(t) is Gaussian or just ±1, b_j is going to be normally distributed quite closely, and in fact for white noise all the different b_j will have the same distribution.¹⁰ Since Eb_j = 0, the variance of b_j is

E b_j² = E[(1/T) Σ_t f(t) cos(2πjt/T)]²
       = (1/T²) E[Σ_t f²(t) cos²(2πjt/T) + Σ_{t≠t'} f(t)f(t')(cosines . . .)].

Since E f(t)f(t') = 0 for all t ≠ t' (independence of f(t), f(t')), this is

E b_j² = (T/T²) E(f²)(1/2) = σ_f²/2T = 1/2T   for all j, j ≠ 0, since σ_f² = 1.

The probability distribution is, for all b_j, j ≠ 0,

φ(b) db = (1/√(2πσ_b²)) e^{−b²/2σ_b²} db,   σ_b² = σ_f²/2T.

In a similar fashion, for the sines of (2πjt/T),

E c_j² = σ_f²/2T,
ψ(c) dc = (1/√(2πσ_c²)) e^{−c²/2σ_c²} dc.


We have plotted this distribution on the right in Figure 5.14-3. If we choose we can plot the two distributions of b_j and c_j together on a plane. This gives just a bivariate and uncorrelated normal distribution, Figure 5.14-4.

¹⁰ Except, as you might guess, the coefficient for cosine of zero frequency, b₀.


[Figure 5.14-4]

Fig. 5.14-4 The Joint Distribution of Fourier Expansion Coefficients in Cartesian Coordinates for White Noise.

P(b, c) db dc = (1/2πσ²) e^{−(b² + c²)/2σ²} db dc,   with σ² = σ_b² = σ_c² = σ_f²/2T.
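The σ_f²/2T variance of each component is easy to confirm by simulation. A sketch in Python (T, the number of trials, the seed and the choice j = 5 are all arbitrary): the sample variance of b_j, multiplied by 2T, should come out near σ_f² = 1.

```python
import math
import random

random.seed(3)
T, trials, j = 32, 4000, 5

def b_coeff(f):
    # b_j = (1/T) sum_t f(t) cos(2*pi*j*t/T)
    return sum(f[t - 1] * math.cos(2 * math.pi * j * t / T)
               for t in range(1, T + 1)) / T

bs = [b_coeff([random.gauss(0.0, 1.0) for _ in range(T)]) for _ in range(trials)]
var_b = sum(b * b for b in bs) / trials       # E b_j = 0, so this is the variance
print(round(2 * T * var_b, 2))                # near 1 = sigma_f^2
```

Replacing the Gaussian draws with ±1 values gives the same answer, as the central limit argument above promises.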


An alternate plot of the distribution of b's and c's is in polar coordinates. In that case we define

r = √(b² + c²),   θ = tan⁻¹(c/b).

The corresponding distribution is

Θ(θ) dθ = dθ/2π,   R(r) dr = (r/σ²) e^{−r²/2σ²} dr.     (5.14-9)



[Figure 5.14-5]

Fig. 5.14-5 The Rayleigh Distribution for the Modulus |a_j| = √(b_j² + c_j²) of a Fourier Expansion Coefficient a_j = b_j + ic_j for White Noise.

The distribution of r = |a| alone is called the Rayleigh distribution (Figure 5.14-5). In physics this is the distribution of the speed s = √(v₁² + v₂²) of a molecule in a two dimensional gas, whose individual velocity components are individually and normally distributed. In statistics it is the distribution of the standard deviation (note, not variance) σ_s = √(σ₁² + σ₂²) of the sum s = x₁ + x₂ of two independent and identically normally distributed variables x₁, x₂; σ₁ = σ₂.


Finally, what is the expected value and distribution of the power spectrum itself, |a_j|² = a_j a_j*? We already have the expected value: E|a_j|² = σ_f²/T. The sum of all of them (or “area” under the “curve,” doubled if you have only plotted ω from 0 to π, or j = 0 to T/2) is just the variance of f itself, σ_f². This is often referred to as the total “power” in the spectrum.

The probability distribution of a a* = |a|² can be obtained directly from the Rayleigh distribution (5.14-9). Defining

p = (power) = r² = b² + c²,   E(p) = p̄ = σ_b² + σ_c² = 2σ² = σ_f²/T,

we find, since r dr = dp/2, the probability distribution of p, Ψ(p)dp:

Ψ(p) dp = (1/p̄) e^{−p/p̄} dp.

This is the exponential distribution, drastically skewed. The exponential distribution gives some interesting cosmetic properties when we plot up an experimental power spectrum a_j a_j* as a function of frequency.

[Figure 5.14-6]

Fig. 5.14-6 The Distribution of the Power Spectrum Values P = a_j a_j* = b_j² + c_j² for White Noise.

The probability of small values of p is the largest, but large values from the exponential tail can and must occasionally occur. So there is an unsymmetric excess of high sharp peaks in the plot of the power spectrum not matched by a corresponding number of deep troughs. As a landscape profile of frequency, there are quite a few sharp peaks, and no deep ravines.

It is very common practice to plot the power spectrum not on a linear scale as in Figure 5.14-6 but on a semi-log scale, i.e., log for the vertical coordinate a_j a_j* but linear for the frequency, or j. For the assessment by the eyeball this is equivalent to making a further transformation of the power spectrum to a variable z = log_e(p/p̄). You can easily work out that the distribution of this variable is

F(z) dz = e^{−e^z} e^z dz.     (5.14-11)

This is plotted to the right of Figure 5.14-7. This variable is also strongly skewed.

Fig. 5.14-7 The Distribution of the Power Spectrum Values P, on a Log Scale for Ordinate, for White Noise (Schematic).

On the log scale there are very few sharp mountain peaks, but many deep ravines. It is the profile of a deeply eroded plateau.

The “Solidit y” and “Smo othness”


(For example,

of a Power


trum of white noise, or any noise for that matter? You should note the similarity, or rather identity of the issues if this same question is

applied to f(t) itself. f(t) is withdrawn from the smooth solid normal distribution, but plotted as a function of ¢, it is pretty ragged. The power spectrum comes from a smooth solid exponential distribution. Plotted as a function of w or 7 it is also pretty ragged, in an asymmetric way. We saw that € f(t) was a horizontal row of dots of ordinate zero. About as smooth, solid and simple as the limitation placed by a row of dots can be. The actual observations of f(t) bounced around this “line” or row of dots by an amount and in a way set by the distribution of the population from which f(¢) was withdrawn, normal or +1, zero mean and unit variance in either case.

Exactly the same type of statement can be made of the power spectrum of f(t) when f(t) is “white noise.” The expected power spectrum is a horizontal row of dots—white noise. The observed power spectrum bounces around this expected value with a skewed 90% confidence belt a factor of 60 wide.

One method of smoothing out the observations of f(t) is to take successive non-overlapping groups of t values and average them. Thus you calculate f̄(t) = (1/k) Σ f(t') over each group and plot that. We did this once before to calculate points on a trend with randomized dates (Section 5.1). If by contrast you let the groups overlap (a moving average, possibly with unequal weights), you get more points to plot and it will “look” smoother and more continuous (Figure 5.14-1f), but you are just kidding yourself with this smoothness. The successive points are not independent, just to the extent that they have several values of f in common. If you put k in your group to average, the fluctuations of f(t) as plotted go down like the square root of k, but you have only T/k instead of T independent observations. What you gain in “continuity” you have lost in information about f(t).

Exactly the same treatment can and frequently is applied to the power spectrum. You take k = four, five or more adjacent a_j a_j* and average them. The resulting plot “looks” more continuous. It doesn't tell you any more, but rather less in the information sense about the power spectrum. The distribution of these averaged points on the power spectrum is that of chi-square with 2k degrees of freedom. Instead of T/2 independent spectrum values, you have only T/2k. If you make it a moving average you get more points and they look smoother and more continuous, just as for the moving average for f(t).
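Both effects, the reduced spread and the reduced number of independent values, show up in a small simulation. A sketch in Python (T, k and the seed are arbitrary): averaging k adjacent spectrum values shrinks the relative spread by roughly 1/√k, while leaving only T/2k values.

```python
import cmath
import random

random.seed(4)
T, k = 512, 4
f = [random.gauss(0.0, 1.0) for _ in range(T)]

def p(j):
    # raw spectrum value a_j a_j^* for one observed sequence
    a = sum(f[t] * cmath.exp(-2j * cmath.pi * j * t / T) for t in range(T)) / T
    return abs(a) ** 2

raw = [p(j) for j in range(1, T // 2)]
avg = [sum(raw[i:i + k]) / k for i in range(0, len(raw) - k + 1, k)]

def cv(xs):
    # coefficient of variation: standard deviation relative to the mean
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5 / m

print(round(cv(raw) / cv(avg), 1))   # near sqrt(k) = 2
```

The raw values are (approximately) exponential, coefficient of variation 1; the k-averages are chi-square with 2k degrees of freedom scaled by 2k, coefficient of variation 1/√k.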

You deceive yourself if you think this prettied up spectrum is giving you a clearer picture of the spectrum or the process which generated it. It is purely a cosmetic effect, pandering to your prejudice for smoothness and continuity. You can weight this and weight that, and Fourier transform. Alternatively you can Fourier transform, directly or via the covariance, an f(t) which has been smoothed, tapered or truncated. All these procedures are in my opinion just different slices of baloney from the same continuous smooth slab. I might warn you that this is the biased opinion of a rather small minority.

How to Inspect and Check a Published Spectrum

If the units on the axis are 10^{-5}, or 10^{0}, (bushels)^{-1}(week)^{-1}, and something looks funny, the author must have changed the scale: corn prices don't change by such amounts in a week. Without units and dimensions, you just can't tell.

The texts are usually using such data to approximate or estimate some underlying "true," "continuous in t" (mathematical sense) process. We take the opposite position, that it is the "continuous process" which approximates or estimates the "true" discrete process which generates the data.

You should be able to read off a rough estimate of the dispersion of these weekly price changes (take one-half their intersextile range). This squared is (for a Gaussian distribution) the variance, of dimensions $^2 (bushel)^{-2} (week)^{-2}. It is often called the "total power." The power spectrum has these dimensions. If it is directly plotted as individual points a_j a_j^* as a function of j, the spectrum should then average about [1/(number of coefficients)] times this variance. If it does not, then look for a missing factor on the power spectrum scale. The power spectrum may have been normalized by dividing by the variance; in that case the scale on the power spectrum values as plotted should be such that the values add up to unity.

Sometimes the power spectrum is plotted as a density; in that case its dimensions should be labeled $^2 (bushel)^{-2} (frequency)^{-1}. Check what kind of frequency: is it cycles week^{-1} or 2π cycles week^{-1}? In either case the area^{12} under the spectrum, with the proper units, should reproduce the previously estimated total power. You can make all these checks by rough measurements on the graphs themselves, but only if the units and dimensions are given. You can use such an area calculation to estimate and check the fraction of variance, or fraction of total power, in any specific frequency band. Finally you should be able to check, at least approximately, the distribution of the spectrum values a_j a_j^* and hence the width of the confidence belt. These are determined by the number of degrees of freedom.
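The "adds up to the variance" check is just Parseval's theorem, and it is easy to verify numerically. A sketch, assuming Python with numpy (names and sizes are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 512
f = rng.standard_normal(T)
f -= f.mean()                      # remove the mean (the a_0 term)

a = np.fft.fft(f) / T              # Fourier coefficients a_j
spectrum = (a * a.conj()).real     # individual points a_j a_j*

total_power = f.var()              # variance ("total power") of the data
# Parseval: the spectrum points add up to the variance, so each point
# averages (1/number of coefficients) times the variance.
assert np.isclose(spectrum.sum(), total_power)
```

If a published spectrum fails this check by a large round factor, suspect a missing normalization on the plot's scale.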

The basic recipe for the number of degrees of freedom is 2 × (number of spectrum points averaged together on the spectrum plot). If there is no averaging (i.e., a_j a_j^* was calculated directly and exactly by either (5.10-2, 3) or (5.10-13)), the recipe gives just 2 d.f., as we previously derived (5.14-10). The "number of points averaged together" can be produced in a number of different ways; one has to examine the text rather carefully to see what the computing routine was. There are at least three ways this number can be produced.

1) The simplest method is that of a running average (possibly of unequal weights) of the exact spectrum a_j a_j^* (5.10-2, 3). See e.g., Granger and Morgenstern, p. 70. Unfortunately the actual number

12 Note that "area" refers to a linear plot of the spectrum. On a semi-log or log-log plot you have to convert, i.e., read off the linear dimensions. See Figure 5.19-7 for an example.

used in the running average is not given, so here we don't know the number of degrees of freedom.

2) The second method for determining the number of spectrum points averaged together is to use the following formula, a variant of the "exact" formula, eq. (5.10-13):

P_j = power spectrum = (1/T) \sum_{k=0}^{M} APM_est(k) e^{-2\pi ijk/M},   j = 1, ..., M   (5.16-1)

Here M is some integer which is a fraction, 1/3 or less, of T, the total number of sequential observations. APM_est(k) is usually, but not always, an expression of the type

APM_est(k) = (1/(T - k)) \sum_{t=1}^{T-k} f(t) f(t + k)   (5.16-2)

Note that APM_est(k) does not use the unrealistic property of strict periodicity of f(t), but stops summing short of where f(t) becomes strictly periodic in t, appropriate to the "exact" expression (5.10-13). It can be shown (see e.g., the argument following (5.10-5 to 14)) that (5.16-1) gives just M independent estimates of the spectrum. So the number of spectrum points averaged together is T/M. The number of degrees of freedom is 2(T/M). See examples in Figures 5.19-2 to 6.

A variant of (5.16-1) is

P_j = (1/T) \sum_{k=0}^{M} \lambda_k APM_est(k) e^{-2\pi ijk/M},   j = 1, ..., M   (5.16-3)

Here \lambda_k is a set of unequal weights, starting at \lambda_{k=0} = 1 and tapering monotonically to zero at k = M. Different choices of the set of weights \lambda_k are called different windows. Four such different windows are described in Blackman and Tukey, pp. 95-99. Naturally, for a given APM_est(k), they give slightly different independent estimates for the spectrum at different values of j.

The effect of such windows is again to average together adjacent spectrum points of the exact spectrum. The approximate number of degrees of freedom is, as before, 2(T/M).

3) Finally one can combine both methods 1) and 2) above, and smooth with a running average (two or three points at the most here) a P_j from (5.16-3). The number of d.f. is increased over 2(T/M) corresponding to the number n (2 or 3) of points averaged together, i.e., to 2n(T/M).
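The contrast between the "exact" periodogram and a Method 2) style estimate can be sketched as follows. This is an illustrative sketch in Python with numpy; the choice M = T/4, the triangular (Bartlett-type) taper, and all names are my own, not a prescription from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 400
f = rng.standard_normal(T)
f -= f.mean()

# "Exact" periodogram from the Fourier coefficients (chi-square, 2 d.f. points)
exact = (np.abs(np.fft.fft(f)) ** 2 / T ** 2)[: T // 2]

# Method 2): estimate the covariance up to lag M << T, taper, transform.
M = T // 4
apm_est = np.array([np.mean(f[: T - k] * f[k:]) for k in range(M + 1)])
lam = 1.0 - np.arange(M + 1) / M          # a simple triangular taper (my choice)
freqs = 2.0 * np.pi * np.arange(T // 2) / T
lags = np.arange(1, M + 1)
bt = np.array([
    (apm_est[0] + 2.0 * np.sum(lam[1:] * apm_est[1:] * np.cos(w * lags))) / T
    for w in freqs
])
```

For white noise both estimates scatter around the same flat level, but the tapered-covariance version is visibly smoother, with correspondingly more degrees of freedom per plotted point.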

There is a difficulty with method 2) above for calculating a spectrum, which is a good illustration of the difficulties sometimes encountered in deciding what constitutes "good" scientific procedure. Method 2) occasionally gives negative values for the spectrum, which is impossible by the "exact" method (5.10-3 or 13), or Method 1), which smooths the exact values. Now a chi-square distribution does not have negative values of its argument, so the distribution of spectrum values calculated by Method 2) cannot be exactly chi-square. This means confidence limits on the spectrum assuming a chi-square distribution must be in error by some unknown amount. There has been considerable effort in the past to devise averaging and weighting procedures, the \lambda_k, for calculating from the APM_est(k), to prevent these negative values in the spectrum from occurring. I regard this as simply sweeping the dirt under the rug. Forcing the estimated spectrum to be positive does not necessarily force the distribution to be chi-square with a determinable number of degrees of freedom. So the confidence limits are uncertain, and correspondingly the reliability of statistical conclusions drawn from them.

You should now feel the prod of the horns of a dilemma, when you try to decide: a) whether to use the exact spectrum (5.10-3 or 13) or Method 1), which simply smooths the exact spectrum; or b) a spectrum from an estimated APM(k), Methods 2 or 3. With choice a), from the Fourier expansion coefficients, you represent (fit) the data exactly, with complete reverence for it. You know the distribution of the spectrum values (chi-square with known number of d.f.) and hence where the confidence limits are. You have solved, neatly and exactly, an absurd and unrealistic statistical problem, involving a strictly periodic f(t) and strictly periodic APM(k). With choice b), using APM_est(k), you do not assume these absurd and unrealistic properties which allowed a known distribution and number of d.f. of the spectrum values. Instead you solve a realistic and reasonable statistical problem, but are now uncertain about the distribution of the spectrum values, and hence the confidence limits. You need these to interpret your results with some significance. You have, in short, a choice between an unrealistic problem solved exactly and certainly (in the statistical sense), and a realistic problem solved approximately and with greater uncertainty about the (significance of the) conclusions.

There are two escapes from this dilemma, one particular and one general. In a particular example, if you are uncertain how much to believe some interesting property of the data as shown by its spectrum, calculate the spectrum both ways. If you can find the same conclusion significantly either way, you are probably safe. In general this dilemma could be resolved experimentally. Calculate a number of different white noise spectra, using both normal, ±1, random, or even a strongly skewed distribution, of zero mean. Calculate the spectrum of these different white noises by both methods.

I might add, from my own experience, that this problem of the distribution of spectrum values is particularly exacerbated when you study 1/ω noise, generated naturally in a semiconductor, or artificially (e.g., 5.24-6, with δ = −1/2).

Table 5.16-1 Given χ², the Width of the 90% Confidence Belt as a Factor Up and Down From the Median.

It is possible, although I have never seen it done in a statistical problem, to calculate a Fourier representation of the data, and hence a power spectrum, using sines only or cosines only. In that case (for white noise) the (coefficients)² have a χ² distribution with one degree of freedom. Its shape is of order

g(y) dy ~ e^{-y/2} dy/\sqrt{y},   with y = chi-square

White Noise as a Standard of Comparison

I commented previously that the spectrum of white noise served

as a benchmark against which other spectra could be compared. There is a corresponding standard procedure in a great many elementary statistical problems.

The same considerations apply in comparing

some power spectrum to that of white noise.

Let me illustrate by

means of an example. Suppose we have measurements on two variables x, y: x_i, y_i, i = 1 to N.

We ask, are x and y statistically related?

One elementary

answer is to calculate the correlation coefficient. If it isn't zero you have to ask, is it significantly different from zero? In this case the

comparison standard or benchmark is a population of uncorrelated and normally distributed pairs of variables. You calculate, from this hypothetical population with zero correlation, the probability of finding, with a sample equal in number to yours, a correlation at least as large as the one you actually observed.

If this probability (the significance probability) is small enough (.05, .01, etc.) you say at this level of significance the observed x and y are correlated.

A great many statistical efforts to answer our original question stop at this point.
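The elementary procedure just described can be carried out by simulation, using shuffling to realize the zero-correlation benchmark population. A sketch, assuming Python with numpy (the synthetic x, y and all parameters are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
x = rng.standard_normal(N)
y = 0.6 * x + rng.standard_normal(N)      # a genuinely related pair

r_obs = np.corrcoef(x, y)[0, 1]

# Benchmark: shuffle y to destroy any relation, mimicking the hypothetical
# uncorrelated population. The significance probability is the fraction of
# shuffles giving at least as large a |correlation|.
n_perm = 2000
count = 0
for _ in range(n_perm):
    r = np.corrcoef(x, rng.permutation(y))[0, 1]
    if abs(r) >= abs(r_obs):
        count += 1
p_value = count / n_perm
```

A small p_value (.05, .01, etc.) is the "significance probability" of the text; note that the shuffled benchmark sidesteps the normality assumption of the usual tables.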

If you want to do such a problem more carefully you should have first asked: is the above comparison standard appropriate? If not, doctor the data a little until its distribution is appropriate to the chosen standard or benchmark, at least approximately, just as we doctored our sequential data to make it appropriate first for just statistical analysis, and then more specifically for spectral analysis, with white noise (one way of describing T uncorrelated and independent variables instead of just two) as a comparison standard. For example, x, y might be any one of the following pairs.

1) height and weight of students
2) barometric pressure and wind velocity
3) price change and volume, one stock or an index
4) IQ of a student and his family income
5) price and dividend of a stock
6) price and earnings of a stock
7) price and book value of a stock
8) price change and earnings change for a stock.
You can and should test whether the comparison standard of two normal and uncorrelated variables is appropriate by simply examining the distribution of x and y separately (the marginal distributions, if the data are plotted on a scatter diagram or assembled in a two-way table). If these two distributions are approximately normal, the standard is appropriate. This will probably be true for pairs like 1) but probably not for pairs like 4). In such a case it would be reasonable to transform one or both of the variables. For IQ vs. income, IQ is probably close enough to normal, but income might well not be; log income might come close enough to normality. Other transformations may well be suggested from the observed distributions for other particular pairs. As we saw from the discussion of our magic transformation word, log: log of prices is usually sensible, but log of earnings would not be, since earnings have a finite probability of being zero or even negative.

The preceding remarks about the use of the normal or bivariate normal distribution as a comparison standard have some close analogs in the discussion of power spectra, where the comparison standard is white noise, one way of describing T independent variables. We saw that, given some experimental sequence f(t), differencing and removing the linear trend (as we narrowly defined it, the mean difference) went a rather long way toward massaging the sequence to make it nearly white in its spectrum. Just as in the case of a comparison standard for a single bivariate distribution, we don't need to get our data to an exactly white spectrum, just close enough to examine significant departures from pure white. So our transformed sequence is compared to the white spectrum, and we can then examine whether certain frequencies appear in significant excess or deficiency relative to the white noise distribution. To make this comparison carefully you have to know the distribution of the white noise spectrum values, supposedly chi-square of 2 d.f., but we saw that this was not a completely settled question; the distribution is different for different computing procedures.

The “Spectrum” of a Random Walk ltself We have discussed in considerable detail the spectrum, the dis5.18

tribution of the coefficients and confidence belts of a sequence ¢ = 1,...,T of independent variables f(t) of zero mean and finite vari-

ance which we called “white noise.” These are just the steps of a random walk of zero expected advance. So we could easily use such data to generate a random walk R(t) of zero expected advance. DEFINITION


R(t) = Of)



We estimated in an earlier discussion (Section 5.7) what the autocorrelation of such a cumulative sum would look like. So, using the relation that the power spectrum is the Fourier transform of the autocovariance, one might evaluate an expected power spectrum from that. We also pointed out that such a random walk was neither strictly nor weakly stationary, nor ergodic. The moments increased the longer the interval over which the data was taken. The distribution of R(t₁) was not independent of R(t₂), no matter how big t₁ − t₂ might be.

Nevertheless one can calculate a "spectrum" for such a process. It amounts to finding out what you would get if you punched the data into a computer and told it to calculate a spectrum anyhow. Such experiments have been done, both on artificial sequences and on real data, by Granger and Hatanaka, Granger and Morgenstern, or Labys and Granger. A characteristic type of "spectrum" does indeed emerge. Let us see what it is like.

Reviewing briefly, we had f(t), t = 1, ..., T, as successive choices from a table of zero mean and unit variance. Our Fourier expansion was

f(t) = \sum_{j=0}^{T-1} a_j e^{2\pi ijt/T},   t = 1, 2, ..., T

The real and imaginary parts separately of a_j were normally distributed with zero expected mean, and variance \sigma_f^2/2T = E f^2/2T. The separate, unsmoothed points on the power spectrum, a_j a_j^*, were distributed exponentially (χ² of 2 d.f.), independent of j, with expected value

E a_j a_j^* = \sigma_f^2/T
We define a new function R(t), as above (5.18-1). Now the data R(t) (one member from the ensemble of all possible R(t)'s) must have a Fourier representation of the form:

FOURIER REPRESENTATION

R(t) = \sum_{j=0}^{T-1} w_j e^{2\pi ijt/T},   t = 1, 2, ..., T   (5.18-2)


If we can find a relation between the a_j's and the w_j's, then since we know the former we can evaluate the latter. We shall derive the w_j's at first approximately, and then exactly, anticipating some properties, both excluding and including a constant (not necessarily the mean) and linear trend in R(t). These two contributions to R(t) are precisely what we have given arguments for removing from sequential data, before carrying out any statistical analysis, and in particular a spectrum calculation. We will be able to see exactly what removing them does to the spectrum.

You should also note that for E f(t) = 0, the expected constant and linear trend of R(t) as defined above in (5.18-1) are zero. In a particular single experiment or sample, they will almost certainly not be zero, nor even small, if the walk has several hundred steps.

5.19 The Approximate Evaluation of a Random Walk Spectrum

You will notice from the above definition of R(t) (5.18-1) that the differences of R(t), ΔR(t) = R(t) − R(t − 1), are exactly equal to f(t), except for the first one, ΔR(1) = R(1) − R(0), which is not defined, since R(0) is not given by its definition. However, if we take, arbitrarily, R(0) as given by its Fourier representation (5.18-2) (not the definition), then R(0) = R(T) = \sum_{t=1}^{T} f(t). With that definition

ΔR(1) = R(1) − R(0) = f(1) − \sum_{t=1}^{T} f(t) = −\sum_{t=2}^{T} f(t)

This gives us a definition of ΔR(1), to be sure, but this definition does not fit with the property of all the other ΔR's, ΔR(t) = f(t). It will in most cases be much larger in absolute value than all the other ΔR's, being the sum of all the f's except the first, and with opposite sign. If we are considering real data, we can contrive to make ΔR(1) as defined above fit into the definition of all the others in either of the following ways. Suppose the R's are a price sequence, say of commodities or common stock. We can pick a span of data such that the initial R(0) and final R(T) are the same. Just draw a horizontal line on the chart and pick data between any two points in time where the data crosses the horizontal line. In that case then R(0) = R(T), and ΔR(1) = R(1) − R(0) = R(1) − R(T) = f(1); by definition, ΔR(1) is no longer an exceptional case. We subtract from all the data R(0) = R(T). Our random walk is about this horizontal line and begins and ends on it. You will see that this procedure picks data of observed zero linear trend (zero estimated expected advance) and zero starting point. A constant (not necessarily \bar R = (1/T) \sum_{t=1}^{T} R(t)) has been removed from R(t).

It may be that the data does not allow this procedure. Suppose it has a non-zero linear trend in it and does not return to its starting value. Possibly it is a random walk of non-zero expected advance. In such a case you can modify the data a little, and achieve the same property for ΔR(1) as before. In such a case draw a straight line from one point before the first (R(1)), i.e., R(0), to the last value R(T) of R(t), and then redefine R(t) as the deviation from this straight line. This procedure subtracts the mean difference \bar f = (1/T) \sum_{t=1}^{T} f(t) from all the differences f(t). It also subtracts some constant (not necessarily \bar R) from all the R(t). We again achieve our objective that ΔR(t) = f(t), as modified by the above procedure, for all values of t. f(t) has also been slightly redefined by subtracting the experimental mean \bar f from each f(t). So we have, with this modified data R(t) (corrected),

BY DEFINITION

R_corr.(t) = \sum_{t'=1}^{t} [f(t') − \bar f],   R_corr.(0) = 0   (5.19-1)
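The correction procedure can be checked mechanically. A sketch, assuming Python with numpy (names are mine): subtracting the straight line from R(0) to R(T) makes every difference, including the first, equal to f(t) − \bar f.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 250
f = rng.standard_normal(T) + 0.1            # steps with a non-zero expected advance
R = f.cumsum()                              # R(t), taking R(0) = 0

# Subtract the straight line from R(0) = 0 to R(T): deviation from the trend.
t = np.arange(1, T + 1)
R_corr = R - t * R[-1] / T

# Every difference of the corrected walk, including the first, is f(t) - fbar.
dR = np.diff(np.concatenate(([0.0], R_corr)))
assert np.allclose(dR, f - f.mean())
```

The corrected walk begins and ends on the trend line, so ΔR(1) is no longer an exceptional case.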







FOURIER REPRESENTATION

R_corr.(t) = \sum_{j=0}^{T-1} w_j e^{2\pi ijt/T}   (5.19-2)

f(t) = ΔR_corr.(t) = R_corr.(t) − R_corr.(t − 1)
 = \sum_{j=0}^{T-1} w_j (e^{2\pi ijt/T} − e^{2\pi ij(t−1)/T})
 = \sum_{j=0}^{T-1} w_j (1 − e^{−2\pi ij/T}) e^{2\pi ijt/T}   (5.19-3)



We also had, as Fourier representation of f(t),

f(t) = \sum_{j=0}^{T-1} a_j e^{2\pi ijt/T}   (5.19-4)

The equality f(t) = ΔR(t) now holds for all values of t. Without our subtraction procedure it would hold for all values of t except just one, t = 1.

What we subtracted from the data, a constant and a linear term, are just Fourier expansion terms of the zero eigenvalue and eigenfunctions. It does not follow from our new definition for R(t) that \bar R_corr. = (1/T) \sum_{t=1}^{T} R_corr.(t) = 0. There may still be some cosine of zero frequency left in our data for R(t), but not for f(t). Matching coefficients in our two expressions for f(t) (5.19-3 and 4), we find

w_j (1 − e^{−2\pi ij/T}) = a_j,   or   w_j = a_j/(1 − e^{−2\pi ij/T})   (5.19-5)

You will note that the above expression for w_j is effective except for j = 0, in which case it yields an indeterminacy, w_0 = 0/0, since a_0 = \bar f = (1/T) \sum_t f(t) = 0 for the corrected data. w_0 has a determinate value, however. The general formula for finding Fourier coefficients gives

w_0 = (1/T) \sum_{t=1}^{T} R_corr.(t) = \bar R_corr.   (5.19-6)


So the spectrum of our modified R(t), R(t) (corrected), is

w_0 w_0^* = \bar R_corr.^2,   j = 0
w_j w_j^* = a_j a_j^*/[2(1 − \cos(2\pi j/T))],   j = 1, 2, ..., T − 1   (5.19-7)

So the expected spectrum is given from

j = 0:   E w_0 w_0^* = E(\bar R_corr.)^2 ≈ \sigma_f^2 T/3
j ≠ 0:   E w_j w_j^* = E a_j a_j^*/[2(1 − \cos(2\pi j/T))] = (\sigma_f^2/T)/[2(1 − \cos(2\pi j/T))]   (5.19-8)

Note that the relation (5.19-7, 8) of the spectrum w_j w_j^* of the corrected random walk (trend removed, about the straight line from the first to the last step) to the spectrum of white noise a_j a_j^* (modified by subtraction of a_0 = \bar f) is an exact numerical relation, holding separately for each point on both spectra. Each individual zig and zag of the white noise spectrum is reproduced exactly for the random walk R_corr.(t), around the "line" 1/[2(1 − \cos 2\pi j/T)]. If you imagine the white noise spectrum plotted around a line of ordinate unity on a log ordinate scale, the abscissa running linearly from ω = 2π/T to π (j = 1 to T/2), then "stretching" or "shearing" this line to the shape of log 1/[2(1 − \cos 2\pi j/T)] will give you the spectrum (on a log ordinate scale) of your random walk R_corr.(t) exactly. All of the statements about the distribution of the power spectrum coefficients of white noise apply to the distribution of the random walk coefficients about the mean or median line 1/[2(1 − \cos 2\pi j/T)], exactly.

We have illustrated the above with spectra of wheat price futures (Figures 5.19-1 and 2) and their differences, from Labys and Granger¹³ (p. 71, 73). The matchup seems almost perfect. There are slight discrepancies at about j = 45.

Fig. 5.19-1 Power Spectrum of First Differences of Wheat Futures Prices Monthly, 1950-65. From Labys and Granger.

13 There is a little uncertainty about the confidence limits because of uncertainty in the original reference as to how the spectrum was smoothed: with a window only (Granger and Hatanaka, p. 43), or with a window plus a smoothing of 3 points of unequal weight (Granger and Hatanaka, p. 60). Note that when Granger et al. call limits upper and lower 95% confidence limits, I have referred to the band as a 90% band, since it has 90% of the probability inside it, 5% above it and 5% below.


Fig. 5.19-2 Power Spectrum of Wheat Price Futures Monthly, 1950-65. From Labys and Granger.

Any given frequency band might be in significant excess or deficiency relative to white noise (a flat spectrum) or its sum, the random walk spectrum ~ 1/[2(1 − \cos 2\pi j/T)]. There is another type of plot for which both the power spectrum and the frequency scales are logarithmic. Figures 5.19-3, 4 are the original semi-log plots. Figures 5.19-5 and 6 are log-log. The log-log plot expands the lower half of the frequency scale. You will note that one half of the observed spectrum points, from ω = π/2 to π, or j = T/4 to j = T/2, fall in a small range (a factor of 2) at the extreme right. The low frequency half of the random walk spectrum, from j = 1 to T/4 (ω = 2π/T to π/2), is closely 1/[2(1 − \cos 2\pi j/T)] ≈ T²/(2\pi j)² = 1/ω², a line of slope −2. This kind of plot is not symmetric about ω = π. The entire unplotted symmetric part of the spectrum from ω = π to 2π would also be compressed into a factor of 2.

This type of double logarithmic plot for power spectra is very common in many physical applications. Such a plot tends to bring out power-law behavior. The random telegraph signal and shot noise, both with covariance cov_τ ~ e^{−μτ}, have power spectra of the form ~ 1/(μ² + ω²). So for ω ≫ μ their expected power spectra are also indistinguishable from that for a random walk.
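The slope −2 approximation can be checked directly: for j small compared to T, 2(1 − cos 2πj/T) is close to (2πj/T)² = ω². A sketch in Python with numpy:

```python
import numpy as np

T = 1024
j = np.arange(1, T // 4)               # the low-frequency half, j = 1 to T/4
omega = 2.0 * np.pi * j / T

exact = 1.0 / (2.0 * (1.0 - np.cos(omega)))   # random-walk spectrum shape
approx = 1.0 / omega ** 2                      # the slope -2 line on log-log

rel_err = np.abs(exact - approx) / exact
```

The approximation is essentially perfect at the lowest frequencies and drifts to roughly a 20% discrepancy by ω = π/2, which is why the log-log plots hug a straight line of slope −2 over the expanded low-frequency decades.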


Fig. 5.19-3 Power Spectrum of Composite Weekly SEC Index

1939-62 - Semi-Log Plot. From Granger and Morgenstern.



Fig. 5.19-4 Power Spectrum of Woolworth Stock Prices 1946-60, Semi-Log Plot. From Granger and Hatanaka.



Fig. 5.19-5 The Data of Fig. 5.19-3 on a Log-Log Plot.




Fig. 5.19-6 The Data of Fig. 5.19-4 on a Log-Log Plot.









Fig. 5.19-7 Power Spectrum of the Radial Solar Wind Velocity Vp. Wavenumber scale as in (Coleman 1968). From Annual Review of Astronomy and Astrophysics, Vol. II, 1973, Ed. L. Goldberg, et. al.





5.20 The Exact Spectrum of a Random Walk

The preceding discussion of the expected spectrum of a "doctored" random walk R_corr.(t) presents an interesting contrast to what you get, derived below, without any doctoring. Interestingly enough, the "shape" of the expected spectrum of the uncorrected random walk is exactly the same as for the corrected walk. The magnitude and the distribution of the coefficients are different.

We begin as before with the Fourier expansion of f(t), as independent elements. In this case we allow for a possible non-zero but finite expected value E f ≠ 0, E a_0 ≠ 0. The Fourier representation of f(t) is:


F(t) = Vaje2/T j=0



This random walk for non-ze ro expected ad vance is: DEFINITION t

Runcomected(t) = > f(#") t=1



Panc.(t) = S~ Rye2*i t/T j=0


We want to evaluate the R_j in terms of the a_j. We substitute (5.20-1) into (5.20-2):

R_unc.(t) = \sum_{t'=1}^{t} \sum_{j=0}^{T-1} a_j e^{2\pi ijt'/T}   (5.20-4)

We can commute the summation order and sum the t' series directly. To carry out the t' summation we need the following formula, and note some properties. As a geometric series,

\sum_{t'=0}^{n-1} r^{t'} = (1 − r^n)/(1 − r),   r ≠ 1   (5.20-5)
 = n,   if r = 1 or r^T = 1 (real)   (5.20-6)

For these formulae we will take r = e^{2\pi ij/T}, and the general forms are valid except when j = 0, ±T, ±2T, etc., or r = r^T = 1, real. These special cases follow from the definition. We have already quoted this special result and given both a geometric derivation and an algebraic derivation (Section 5.11). From the above expressions, R(t) becomes

R_unc.(t) = \sum_{j=0}^{T-1} a_j \sum_{t'=1}^{t} e^{2\pi ijt'/T}   (5.20-7)

Splitting off the term j = 0, one obtains

R_unc.(t) = a_0 t + \sum_{j=1}^{T-1} a_j \sum_{t'=1}^{t} e^{2\pi ijt'/T}

Using (5.20-5),

R_unc.(t) = a_0 t + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T}(1 − e^{2\pi ijt/T})/(1 − e^{2\pi ij/T})
 = a_0 t  [A]  + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T}/(1 − e^{2\pi ij/T})  [B]  + \sum_{j=1}^{T-1} [a_j/(1 − e^{−2\pi ij/T})] e^{2\pi ijt/T}  [C]   (5.20-8)

a_0 is the mean slope of the random walk. It may or may not have an expected value of zero.

You will notice that this expression consists of a linear trend, term A; a constant independent of t, term B; and an expansion term C, whose coefficients differ only slightly from those derived for a random walk with a constant and linear trend subtracted. The contribution of the terms in C alone to a power spectrum is identical to that derived in the preceding section for a "doctored" random walk, R_corr..

At t = 1 we find, exactly,

R_unc.(t = 1) = a_0 + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T} = f(1)   (5.20-9)


In order to express the entire expression for R_unc.(t) as a Fourier expansion in e^{2\pi ijt/T}, we need to expand the linear term a_0 t in e^{2\pi ijt/T}. So we need an expansion of the form

a_0 t = \sum_{j=0}^{T-1} \xi_j e^{2\pi ijt/T}   (5.20-10)

with

\xi_j = (a_0/T) \sum_{t=1}^{T} t e^{−2\pi ijt/T}   (5.20-11)

This sum for \xi_j is of the form \sum_{t=1}^{T} t r^t with r = e^{−2\pi ij/T}. We can use our previous expressions for geometric series to evaluate this sum: for r ≠ 1 with r^T = 1 (real),

(1 − r) \sum_{t=1}^{T} t r^t = \sum_{t=1}^{T} r^t − T r^{T+1} = 0 − T r,   so   \sum_{t=1}^{T} t r^t = −T r/(1 − r)

Note that with r = e^{−2\pi ij/T}, r^T = 1, we have, from (5.20-11), for j ≠ 0,

\xi_j = −a_0 e^{−2\pi ij/T}/(1 − e^{−2\pi ij/T}) = a_0/(1 − e^{2\pi ij/T})

while for j = 0,

\xi_0 = (a_0/T) \sum_{t=1}^{T} t = a_0(T + 1)/2   (5.20-12)

Putting all this together we have

R_unc.(t) = a_0(T + 1)/2   [from A]
 + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T}/(1 − e^{2\pi ij/T})   [from B]
 + \sum_{j=1}^{T-1} [a_0/(1 − e^{2\pi ij/T}) + a_j/(1 − e^{−2\pi ij/T})] e^{2\pi ijt/T}   [from A and C]   (5.20-13)

A, B, C refer to source terms in (5.20-8).

This is now in the form of a Fourier expansion (5.20-3), a constant plus a sum of terms in e^{2\pi ijt/T}:

R_0 (uncorrected) = a_0(T + 1)/2 + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T}/(1 − e^{2\pi ij/T})

R_j (uncorr.), j ≠ 0:   R_j = a_0/(1 − e^{2\pi ij/T}) + a_j/(1 − e^{−2\pi ij/T}) = (a_j − a_0 e^{−2\pi ij/T})/(1 − e^{−2\pi ij/T})   (5.20-14)

We have sketched in Figure 5.20-1 the various contributions of these terms to the Fourier expansion of the corrected and uncorrected random walk, which we have deliberately drawn with an appreciable non-zero expected advance, or linear trend.

Finally we give the power spectrum, and then take its expected value in order to see what we might get on the average if we repeated the experiment several times.



Fig. 5.20-1 The Exact Fourier Composition of the Spectrum of a Random Walk of Non-Zero Expected Advance.





For the power spectrum of the uncorrected walk, from (5.20-3) and (5.20-14):

R_0 = a_0(T + 1)/2 + \sum_{j=1}^{T-1} a_j e^{2\pi ij/T}/(1 − e^{2\pi ij/T})   (5.20-15)

(5.20-15) is pure real. For j ≠ 0, since |1 − e^{−2\pi ij/T}|^2 = 2(1 − \cos(2\pi j/T)),

R_j R_j^* = [a_0^2 + a_j a_j^* − a_0(a_j e^{2\pi ij/T} + a_j^* e^{−2\pi ij/T})]/[2(1 − \cos(2\pi j/T))]   (5.20-16)

We can compare the above results with those for the corrected random walk, Equations (5.19-6, 7). The corrected walk has the same expected spectrum whether the original random walk has Ea_0 = 0 or ≠ 0, i.e., whether or not it has zero expected advance. For zero expected advance we repeat, as expected values, the spectrum of the corrected walk (5.19-8):

R_corr., Zero Expected Advance:

j = 0:   E w_0 w_0^* = E(\bar R_corr.)^2 = E((1/T) \sum_t R_corr.(t))^2 ≈ \sigma_f^2 T/3
j ≠ 0:   E w_j w_j^* = E a_j a_j^*/[2(1 − \cos(2\pi j/T))] = (\sigma_f^2/T)/[2(1 − \cos(2\pi j/T))]

For the uncorrected walk, for the particular case at the moment of zero expected advance, Ea_0 = 0, the expected spectrum (the expected value of (5.20-15, 16)) is:

Uncorrected Walk, Zero Expected Advance:

j = 0:   E(R_0)^2 ≈ \sigma_f^2 T/2 + E(\bar R_corr.)^2 ≈ \sigma_f^2 T (5/6)
j ≠ 0:   E R_j R_j^* = [E a_0^2 + E a_j a_j^* − E a_0(a_j e^{2\pi ij/T} + a_j^* e^{−2\pi ij/T})]/[2(1 − \cos(2\pi j/T))]

Note that in this particular case, for Ea_0 = 0,

E a_0^2 = E a_j a_j^* = \sigma_f^2/T = E f^2/T

The expected value E a_0(a_j e^{2\pi ij/T} + a_j^* e^{−2\pi ij/T}) is zero. This is sometimes expressed as the independence of coefficients of differing j, in the probability sense. It is a consequence of the peculiar properties of a geometric series sum, when r^T = 1, real:

E a_0 a_j^* = E[(1/T) \sum_{t=1}^{T} f(t)][(1/T) \sum_{t'=1}^{T} f(t') e^{2\pi ijt'/T}]^* = (1/T^2) \sum_{t=1}^{T} E f^2(t) e^{−2\pi ijt/T} = (E f^2/T^2) \sum_{t=1}^{T} e^{−2\pi ijt/T} = 0,   j ≠ 0

since the cross terms E f(t)f(t') vanish for t ≠ t', and the remaining geometric sum over a full period vanishes.
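The vanishing geometric sum at the heart of this argument can be confirmed numerically. A sketch, assuming Python with numpy:

```python
import numpy as np

T = 100
t = np.arange(1, T + 1)
# The geometric sum over a full period vanishes for every j not a multiple
# of T, which is what kills the cross term E a_0 a_j* term by term.
sums = np.array([np.exp(-2j * np.pi * k * t / T).sum() for k in range(1, T)])
assert np.abs(sums).max() < 1e-8
```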




So for a random walk of zero expected advance, Ea_0 = 0, the frequency-dependent part (j ≠ 0) of the expected power spectrum of the uncorrected walk is just twice that for the corrected random walk, since Ea_0^2 = E a_j a_j^*. The j = 0 term for the uncorrected walk (expected advance zero) is slightly more than twice (5/6 vs. 1/3) that for the corrected walk.

Although the cross terms E a_0(a_j e^{2\pi ij/T} + a_j^* e^{−2\pi ij/T}) do not contribute to the expected value of the power spectrum of the uncorrected walk, they do have an effect on the distribution of the coefficients, or in other words on how the actual observed spectrum values scatter around the expected value. We saw that a_j a_j^* had an exponential distribution, or chi-square for two degrees of freedom. The distribution of a_0(a_j e^{2\pi ij/T} + a_j^* e^{−2\pi ij/T}) is the distribution of the product of two uncorrelated and normal variables. This is a symmetric distribution, rather concentrated about the origin. If u is the product of two such variables, this distribution is of order −log(|u|/σ_u) for |u| < σ_u and of order e^{−|u|/σ_u} for |u| > σ_u.
In summary the uncorrected random walk of zero expected advance has an expected spectrum identical in shape (j # 0) to that for the corrected random walk, but it is twice as large. The probability distribution of the power spectrum values is distinctly different for the two cases.

For a random walk of non zero expected advance Eaj # 0 the corrected random walk has exactly the same shape and size of expected spectrum as the corrected random walk for zero expected advance. This is understandable, the “correction” just takes out the

non-zero expected advance. The uncorrected random walk (non zero advance,£ay # 0) has exactly the same shape (j # 0 terms) of its expected spectrum as the corrected walk (trend removed) but the level, or magnitude of the terms for the uncorrected walk is a great deal higher. The distribution also is different. The magnitude in fact increases with the square of the non-zero expected advance. You can



verify thes e statemen ts


expected 16) with value of Eq ap 4 0, by uations (5.20-15, replacing ag by £, £a s large as the “uncorre you please cted” formul . In a

above, “” wo uld bejust linear trend, the slope of on nonzero ex the pected advanc e.


You can Se e that

in such a plot, exce (frequently omit pt for the ted from a Powe 7 = 9 term r sp ec tr of the unco um ), th e expected spec rrected and Corrected ra trum

ndom walks, zero or nonzero ex-


A Comment on the Identity of Expected Spectra for Random Walks, Linear Trends and Steps

One might ask, how can it be that the expected power spectrum of a linear trend a_0 t, t = 1, ..., T, and the expected power spectrum of a random walk, or of a step, can have the same shape? Let us examine some details. For the linear trend, a slanted row of dots emanating from the origin. Definition: