
FINANCIAL INSTITUTIONS AND SERVICES

PROGRESS IN FINANCIAL MARKETS RESEARCH

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

FINANCIAL INSTITUTIONS AND SERVICES Additional books in this series can be found on Nova’s website under the Series tab.

Additional E-books in this series can be found on Nova’s website under the E-books tab.

FINANCIAL INSTITUTIONS AND SERVICES

PROGRESS IN FINANCIAL MARKETS RESEARCH

CATHERINE KYRTSOU AND

COSTAS VORLOW EDITORS

Nova Science Publishers, Inc. New York

Copyright © 2012 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher. For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175; Web Site: http://www.novapublishers.com

NOTICE TO THE READER: The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers' use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works. Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.

LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA
Progress in financial markets research / editors, Catherine Kyrtsou, Costas Vorlow.
p. cm.
Includes index.

ISBN: 978-1-61324-765-5 (eBook)

1. Finance--Research. I. Kyrtsou, Catherine. II. Vorlow, Costas. HG152.P76 2011 332.072--dc22 2010048379

Published by Nova Science Publishers, Inc., New York

CONTENTS

Editorial Introduction (Catherine Kyrtsou and Costas Vorlow) ... vii
Chapter 1. Learning and Conditional Heteroscedasticity in Asset Returns (Bruce Mizrach) ... 1
Chapter 2. Modelling and Measuring the Sovereign Borrower's Option to Default (Ephraim Clark) ... 15
Chapter 3. Success and Failure of Technical Analysis in the Cocoa Futures Market (Peter Boswijk, Gerwin Griffioen and Cars Hommes) ... 25
Chapter 4. When Nonrandomness Appears Random: A Challenge to Financial Economics (Anastasios G. Malliaris and Mary E. Malliaris) ... 71
Chapter 5. Finite Sample Properties of Tests for STGARCH Models and Application to the US Stock Returns (Gilles Dufrenot, Velayoudom Marimoutou and Anne Peguin-Feissolle) ... 83
Chapter 6. A Statistical Test of Chaotic Purchasing Power Parity Dynamics (Apostolos Serletis and Asghar Shahmoradi) ... 103
Chapter 7. A Methodology for the Identification of Trading Patterns (Athanasios Sfetsos and Costas Siriopoulos) ... 121
Chapter 8. Technical Rules Based on Nearest-Neighbour Predictions Optimised by Genetic Algorithms: Evidence from the Madrid Stock Market (Christian Gonzalez-Martel, Fernando Fernandez-Rodriguez and Simon Sosvilla-Rivero) ... 137
Chapter 9. Modern Analysis of Fluctuations in Financial Time Series and Beyond (Zbigniew R. Struzik) ... 167
Chapter 10. Synchronicity between Macroeconomic Time Series (Alvaro Escribano and Ana E. Sipols) ... 189
Chapter 11. Contagion Between the Financial Sphere and the Real Economy. Parametric and Non Parametric Tools: A Comparison (Dominique Guégan) ... 221
Chapter 12. A Macrodynamic Model of Real-Financial Interaction: Implications of Budget Equations and Capital Accumulation (Carl Chiarella, Peter Flaschel and Willi Semmler) ... 243
Chapter 13. Modelling Benchmark Government Bonds Volatility: Do Swaption Rates Help? (Christian L. Dunis and Freda L. Francis) ... 263
Chapter 14. Nonlinear Cointegration Using Lyapunov Stability Theory (Raphael N. Markellos) ... 289
Chapter 15. Active Portfolio Management: The Power of the Treynor-Black Model (Alex Kane, Tae-Hwan Kim and Halbert White) ... 311
Chapter 16. Stock Price Clustering and Discreteness: The "Compass Rose" and Complex Dynamics (Constantinos E. Vorlow) ... 333
Index ... 351

EDITORIAL INTRODUCTION

Catherine Kyrtsou
University of Macedonia, Thessaloniki, Greece

Costas Vorlow
Ph.D. Director, IMAR, International Markets & Risk, Athens, Greece
University of Macedonia, Thessaloniki, Greece
University of Strasbourg, BETA, France
University of Paris 10, EconomiX, France

Numerous empirical studies have analysed the identification and nature of the underlying process of an economic system, as well as the influence of information on financial time series. The standard financial theory of efficient markets assumes identical investors who hold rational expectations of future stock prices and instantaneously discount all market information into those prices. This implies that there are no opportunities for speculative profit and that neither trading volume nor price volatility is serially correlated. Even when anomalies emerge, they are attributed to chance, and market efficiency prevails on average (Fama, 1998). Contrary to this mainstream view, however, the intensity of the recent financial crisis has revealed that stock markets behave like complex systems, as documented in Kyrtsou and Terraza (2002, 2003, 2009), Kyrtsou et al. (2004, 2009) and Kyrtsou and Vorlow (2005, 2009), among others. As such, fluctuations invade the market even in the absence of external shocks. This complexity may be attributed to numerous factors, such as the reaction to public and private information, the role of investors' behaviour, structural change, or other more specific characteristics related to the microstructure of the market. The non-linearity that accompanies this complexity can lead to a loss of information, decreasing efficiency and distorting estimates of systematic risk. Through the chain of tranching and distributing risk, the fundamental values and risk profiles of underlying assets became impossible to reconstruct, even for the most informed investors. There may be a deep paradox here. In principle, financial innovation is meant to increase efficiency. But in real financial markets efficiency depends on the availability of information, and the kind of innovation that occurred was able to destroy information.


In such an unstable environment, predicting future returns appears to be a quite difficult task. Most importantly, ignoring these properties of stock returns risks seriously undermining any attempt at successful portfolio diversification. The collective volume "Progress in Financial Markets Research", with its papers employing time series and asset pricing methods, data mining, non-linear analysis, chaos and wavelet-based techniques, aims to contribute to a very promising field of research that emerged years ago but has received special attention in the light of the financial crisis of 2007-2010.

Chapter 1 analyses an attractive alternative hypothesis for the presence of heteroskedastic effects in stock markets. Overcoming the standard ad hoc specification of heteroskedasticity, Mizrach introduces GARCH disturbances that evolve naturally out of the decision problem of economic agents.

Chapter 2 goes deep into the study of sovereign credit risk. Within this context, Clark models the government's willingness to pay as an American-style call option. He shows that under certain hypotheses there is a ratio of debt to default costs at which it is optimal for the sovereign to default. This result permits the pricing of the option and the estimation of both the distance to default and the probability of default.

In Chapter 3, Boswijk, Griffioen and Hommes apply a large set of trend-following forecasting rules to cocoa futures prices quoted on the LIFFE and the CSCE and to the Pound-Dollar exchange rate. The implementation of a parametric bootstrap technique completes the methodology. The empirical findings reveal a difference in the effectiveness of technical analysis between the two cocoa futures markets. A possible explanation of this difference is the interplay of the demand and supply of cocoa beans with the variation in the Pound-Dollar exchange rate.

Chapter 4 builds on two well-known techniques for distinguishing between deterministic and random systems. A. Malliaris and M. Malliaris show that, given the principal limitation of small data sets in financial and economic applications, the correlation dimension and BDS techniques risk giving misleading results.

In Chapter 5, Dufrenot, Marimoutou and Péguin-Feissolle present Monte Carlo experiments that investigate the empirical power and size of LM tests for evaluating simple GARCH and threshold GARCH models. An application to US stock returns underlines the importance of incorporating threshold effects in the conditional variance rather than relying on simple GARCH alternatives.

In Chapter 6, Serletis and Shahmoradi use U.S. dollar-based, DM-based and Japanese yen-based real exchange rates for 21 OECD countries and test for the presence of deterministic chaotic dynamics. Applying the dominant Lyapunov exponent method, they find evidence in favour of underlying nonlinear dynamics in the real exchange rate series.

Chapter 7 introduces a methodology based on data mining techniques for identifying patterns in a time series and their probability of recurrence. Sfetsos and Siriopoulos use this methodology to classify trading behaviour in the Dow Jones index and the GB pound-US dollar exchange rate series. The empirical findings reveal a number of typical behaviours for each series examined.

In Chapter 8, González-Martel, Fernández-Rodríguez and Sosvilla-Rivero apply k-nearest-neighbour predictors, optimised by genetic algorithms, to the Madrid stock exchange index. They then transform their forecasts, acting as economic signals, into technical rules whose profitability is evaluated against a risk-adjusted buy-and-hold strategy. The results show that the trading rule built on the k-nearest-neighbour predictors outperforms the buy-and-hold strategy, unveiling the importance of nonlinear technical analysis in financial markets.

An alternative approach to fluctuations in financial time series is presented by Struzik in Chapter 9: multifractal analysis based on the Wavelet Transform Modulus Maxima. More precisely, deviations from the expected multifractal spectrum are taken into account in order to evaluate potential outliers. The application of this procedure to the S&P index series reveals an intriguing interaction between different time horizons before the biggest crashes of the stock index.

In Chapter 10, Escribano and Sipols develop model-free tests for cointegration that are robust to different deviations from the standard linear cointegration hypothesis. Besides a complete Monte Carlo exercise, they apply this new technique to the exchange rates of the US Dollar, the Deutsche Mark and the Japanese Yen against the Spanish Peseta. The empirical findings suggest that the methodology is robust to monotonic nonlinearities and serial correlation structure in the cointegration errors, as well as to certain types of level shifts in the cointegration relationship.

Guégan opens Chapter 11 with a mixture of parametric and non-parametric methods and studies the phenomenon of contagion between financial and real economic variables. The application of these techniques provides useful conclusions on the nature of the shocks hitting the economy and the way these shocks are propagated within the system. Guégan underlines that, owing to the sensitivity of the different tools, a robust application to real data requires their joint implementation.

In Chapter 12, Chiarella, Flaschel and Semmler investigate the interaction between the real and financial spheres using the Blanchard model, extended to include the capacity and financing effects of firms' investment decisions. Different scenarios are considered, producing quite rich dynamic effects. As is shown, the dynamic behaviour of the model changes according to the stock market adjustment: when the adjustment process is fast, the system loses stability and behaviour becomes explosive.

In Chapter 13, Dunis and Francis examine the forecasting ability of different stochastic volatility models applied to the 10-year US Treasury Bond and the 10-year German Bund. Comparison with the information provided by swaption rates leads to the conclusion that the simple models considered first add value in terms of forecasting accuracy.

Chapter 14 focuses on the notion of non-linear cointegration. Markellos extends it using Lyapunov stability theory and neural network estimators. The empirical application of this methodology to the UK Gilt-Equity ratio reveals the existence of non-linear cointegrated relationships.

Chapter 15 deals with the power of the Treynor-Black model. The goal of the chapter is first to identify and then to reduce the threshold level of profitable forecasting ability for portfolio management. Kane, Kim and White apply shrinkage estimation to beta coefficients and to the discount function for forecasts of abnormal stock returns. The evidence strongly suggests that the Least Absolute Deviations methodology outperforms the other techniques in the out-of-sample experiments.

Finally, Vorlow closes the volume with a chapter devoted to the analysis of the underlying dynamics of asset prices from the FTSE All-Share index. To unveil potential non-linear deterministic structures, he employs Recurrence Quantification Analysis. The empirical results show a strong presence of complex, non-stochastic and predictable patterns in the data-generating processes.

In conclusion, the chapters presented in this book combine both novel and traditional approaches to the analysis of financial phenomena, enabling us to obtain a deeper understanding of their deeply complex and volatile character.

References

Fama, E., "Market efficiency, long-term returns and behavioural finance," Journal of Financial Economics, 49, 283 (1998).
Kyrtsou, C. and Terraza, M., "Stochastic chaos or ARCH effects in stock series? A comparative study," International Review of Financial Analysis, 11, 407 (2002).
Kyrtsou, C. and Terraza, M., "Is it possible to study chaotic and ARCH behaviour jointly? Application of a noisy Mackey-Glass equation with heteroskedastic errors to the Paris Stock Exchange returns series," Computational Economics, 21, 257 (2003).
Kyrtsou, C. and Terraza, M., "Seasonal Mackey-Glass-GARCH process and short-term predictability," Empirical Economics, in press (2009).
Kyrtsou, C., Labys, W. and Terraza, M., "Noisy chaotic dynamics in commodity markets," Empirical Economics, 29(3), 489 (2004).
Kyrtsou, C., Malliaris, A.G. and Serletis, A., "Energy sector pricing: On the role of neglected nonlinearity," Energy Economics, 31(3), 492 (2009).
Kyrtsou, C. and Vorlow, C., "Complex Dynamics in Macroeconomics: A Novel Approach," in C. Diebolt and C. Kyrtsou (eds.), New Trends in Macroeconomics, Springer Verlag, pp. 225-251 (2005).
Kyrtsou, C. and Vorlow, C., "Modelling nonlinear comovements between time series," Journal of Macroeconomics, 31(1), 200-211 (2009).

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 1-14

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 1

LEARNING AND CONDITIONAL HETEROSCEDASTICITY IN ASSET RETURNS

Bruce Mizrach∗
Department of Economics, Rutgers University, New Brunswick, NJ, US

1.1. Introduction

Empirical researchers have uncovered three robust stylized facts about financial time series: (1) unconditional leptokurtosis; (2) serially correlated heteroscedasticity; (3) convergence to normality with temporal aggregation. These observations appear to be robust to time period¹, choice of asset², and country³. The autoregressive conditionally heteroscedastic (ARCH) regression model, developed by Engle (1982), and generalized (GARCH) by Bollerslev (1986), is consistent with all three stylized facts and has proved a popular model for financial modeling. In the GARCH models, the volatile episodes that characterize the tails of the return distribution are clustered: large changes in the absolute value of returns tend to follow other large changes.

There is an acronymous literature of extensions to the basic model. In the ARCH-in-mean or ARCH-M model developed by Engle, Lilien and Robins (1987), the conditional variance enters the conditional mean. Nelson's (1991) exponential GARCH or EGARCH models the log of the conditional variance. Because volatility appears to be quite persistent, Engle and Bollerslev (1986) have considered an integrated GARCH or IGARCH model. High frequency data appear to follow a fractionally integrated process, which Baillie, Bollerslev and Mikkelsen (1993) call FIGARCH. Bera, Higgins and Lee (1992) have introduced an augmented ARCH model with time-varying coefficients that they call AARCH. This model's statistical properties are quite similar to the model I develop here. The applied literature is simply enormous and still growing. The survey article by Bollerslev, Chou and Kroner (1992) lists several hundred references to almost every conceivable asset.

Despite the widespread use of the GARCH model, the specification of the conditional heteroscedasticity is essentially ad hoc. As Diebold (1988) notes, the literature is lacking a truly economic theory explaining the persistence in volatility. This paper's contribution is to develop a model of asset pricing and learning where GARCH disturbances evolve naturally out of the decision problem of economic agents. Agents try to learn about the parameters of a stochastic process. Their evolving beliefs are incorporated into asset returns, leading to time variation in the parameters. I show that one representation for the data generating mechanism is a Markov model with GARCH disturbances. The learning model suggests an extended GARCH specification in which variables from the conditional mean explain time variation in the coefficients of the conditional variance. I propose a time-varying GARCH specification which nests the standard model. In an empirical example with the Italian Lira-German Deutschemark exchange rate, I reject the standard framework in favor of the learning model.

The paper is organized in the following manner. Section 1 develops Bollerslev's generalization of the ARCH model. A review of the time series properties of the model highlights the three stylized facts. Section 2 describes the model of asset pricing and learning. The covariance structure of this model is described in Section 3. Finite sample properties are explored in Section 4 in four different designs, including one with structural change. The exchange rate example is presented in Section 5. Section 6 concludes and suggests some directions for future research.

∗ E-mail address: [email protected]
¹ Pagan and Schwert (1990) report volatility persistence in stock prices in data sets from the mid-19th century.
² These properties seem to hold for bonds (see e.g. Engle, Lilien and Robins (1987)), exchange rates (Hsieh (1988)), commodities (Mandelbrot (1963)) and derivative securities (Engle and Mustafa (1992)), as well as stocks. The references are merely representative. A comprehensive survey may be found in Bollerslev, Chou and Kroner (1992).
³ Lin, Engle and Ito (1994), for example, show that volatility persists throughout the day, moving across international borders.

1.2. GARCH in the Linear Regression Model

I work throughout the paper with the linear regression model

$$y_t = x_t\beta + \varepsilon_t. \tag{1.1}$$

$y_t$ is an observation on an endogenous variable which may be thought of as the fundamental, $x_t$ is a $1 \times k$ vector of explanatory variables, $\beta$ is a $k \times 1$ vector of unknown parameters, and $\varepsilon_t$ is a disturbance term distributed $N(0, \sigma^2_\varepsilon)$. Define $Y_t = [\,y_1 \cdots y_t\,]'$, $X_t = [\,x_1 \cdots x_t\,]'$, and $\xi_t = [\,\varepsilon_1 \cdots \varepsilon_t\,]'$.

Engle extended the model 1.1 to allow for an explicitly time varying conditional variance $h_t$:

$$\varepsilon_t = \sqrt{h_t}\,\eta_t, \qquad \eta_t \sim N(0, 1). \tag{1.2}$$

Engle called it the ARCH model because the conditional variance was expressed as a function of the lagged innovations to the conditional mean,

$$h_t = a_0 + \sum_{i=1}^{q} a_i\varepsilon_{t-i}^2. \tag{1.3}$$


Following Bollerslev (1986), I will call 1.3 the ARCH(q) model. Bollerslev generalized Engle's model to allow the conditional variance to depend on lagged conditional variances as well. The GARCH(p, q) model is written as

$$h_t = a_0 + \sum_{i=1}^{q} a_i\varepsilon_{t-i}^2 + \sum_{j=1}^{p} b_j h_{t-j}. \tag{1.4}$$

An equivalent, perhaps more intuitive formulation of 1.4, as in Bollerslev (1988), arises by substituting $\nu_t \equiv (\varepsilon_t^2 - h_t) = h_t(\eta_t^2 - 1)$ at each $\varepsilon_t^2$:

$$\varepsilon_t^2 = a_0 + \sum_{i=1}^{\max[p,q]} (a_i + b_i)\,\varepsilon_{t-i}^2 - \sum_{j=1}^{p} b_j\nu_{t-j} + \nu_t, \tag{1.5}$$

where $a_i = 0$ for $i > q$ and $b_i = 0$ for $i > p$. In this form, we can see that the GARCH(p, q) model is an ARMA(m, p) in the squared disturbances, where $m = \max[p, q]$.
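As an illustration (a minimal sketch with assumed parameter values, not code or estimates from the chapter), the GARCH(1,1) case of the recursion 1.4 can be simulated directly, and the resulting series exhibits the clustering and fat tails described above:

```python
import numpy as np

# Minimal sketch of the GARCH(1,1) case of equation 1.4:
#   h_t = a0 + a1*eps_{t-1}^2 + b1*h_{t-1},  eps_t = sqrt(h_t)*eta_t  (1.2).
# The coefficient values below are illustrative assumptions.
rng = np.random.default_rng(0)
a0, a1, b1 = 0.05, 0.10, 0.85        # a1 + b1 < 1 gives a finite variance (1.6)
T = 10_000

eps, h = np.empty(T), np.empty(T)
h[0] = a0 / (1.0 - a1 - b1)          # start from the unconditional variance
eps[0] = np.sqrt(h[0]) * rng.standard_normal()
for t in range(1, T):
    h[t] = a0 + a1 * eps[t - 1]**2 + b1 * h[t - 1]
    eps[t] = np.sqrt(h[t]) * rng.standard_normal()

# Two of the stylized facts: excess kurtosis and autocorrelated squared returns.
excess_kurtosis = np.mean(eps**4) / np.mean(eps**2)**2 - 3.0
s = eps**2 - np.mean(eps**2)
rho1_squared = np.sum(s[1:] * s[:-1]) / np.sum(s**2)
print(excess_kurtosis, rho1_squared)  # both positive in a typical draw
```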

1.2.1. GARCH and the Stylized Facts for Asset Returns

1.2.1.1. Leptokurtosis

Calculation of the kurtosis requires the second and fourth unconditional moments. The unconditional variance of the GARCH(p, q) model is given in Bollerslev (1986) as

$$\sigma^2_\varepsilon = a_0\Big/\Big(1 - \sum_{i=1}^{q} a_i - \sum_{j=1}^{p} b_j\Big). \tag{1.6}$$

Let $A(L)$ be the q-dimensional polynomial in the lag operator for the squared innovations and $B(L)$ be the p-dimensional polynomial for the lagged conditional variances. A sufficient condition for the existence of the second moment is $A(1) + B(1) < 1$. The fourth moment requires even stronger parameter restrictions. For the ARCH(q) case, Milhoj (1985) derives necessary and sufficient conditions. Pre-multiplying 1.5 by $\varepsilon_{t-j}^2$ and taking expectations yields the jth autocovariance,

$$\gamma_j \equiv \mathrm{cov}(\varepsilon_t^2, \varepsilon_{t-j}^2) = \sum_{i=1}^{q} a_i\gamma_{j-i}. \tag{1.7}$$

Dividing through by $\gamma_0$, he obtains the analog of the Yule-Walker equations,

$$\rho_j \equiv \gamma_j/\gamma_0 = \sum_{i=1}^{q} a_i\rho_{j-i}, \tag{1.8}$$

where the $a$'s can now be interpreted as the q partial autocorrelations of the GARCH process,

$$a_j = \mathrm{corr}(\varepsilon_t^2, \varepsilon_{t-j}^2 \mid \varepsilon_{t-1}^2, \ldots, \varepsilon_{t-j+1}^2). \tag{1.9}$$

The standard ARMA diagnostics apply here. The partial autocorrelations are zero for $j > q$. Milhoj stacks the equations 1.9 in matrix form as

$$A = (I - \Psi)\rho \tag{1.10}$$

where $A = [\,a_1 \cdots a_q\,]'$, $I$ is the $q \times q$ identity matrix, $\rho = [\,\rho_1 \cdots \rho_q\,]'$ and $\Psi$ is the $q \times q$ matrix with $\Psi_{ij} = a_{i+j} + a_{i-j}$, where $a_j = 0$ for $j \leq 0$ and $j > q$. He shows that a necessary and sufficient condition for the existence of the fourth moment⁴ is

$$3A'(I - \Psi)^{-1}A < 1. \tag{1.11}$$

⁴ Note that many stationary GARCH models will not have a fourth moment.

For the ARCH(1), this is simply $3a_1^2 < 1$. For the ARCH(2), 1.11 can be written as

$$3\,[\,a_1 \;\; a_2\,]\begin{bmatrix} 1 - a_2 & 0 \\ -a_1 & 1 \end{bmatrix}^{-1}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} < 1,$$

which, upon inverting, yields

$$a_2 + 3a_1^2 + 3a_2^2 + 3a_1^2 a_2 - 3a_2^3 < 1.$$

The kurtosis⁵ is

$$\kappa = 2\,\frac{3A'(I - \Psi)^{-1}A}{1 - 3A'(I - \Psi)^{-1}A}.$$

Given condition 1.11, this implies the kurtosis is positive. For the GARCH(1, 1) process, Bollerslev (1986) has derived a necessary and sufficient condition for the existence of the 2rth moment,

$$\mu(a_1, b_1, r) = \sum_{j=0}^{r}\binom{r}{j} c_j\, a_1^{j} b_1^{r-j} < 1, \tag{1.12}$$

where $c_0 = 1$ and $c_j = \prod_{i=1}^{j}(2i - 1)$. For the fourth moment, he obtains from 1.12

$$\mu_4 = \frac{3a_0^2\,(1 + a_1 + b_1)}{(1 - a_1 - b_1)(1 - b_1^2 - 2a_1 b_1 - 3a_1^2)}.$$

Given the existence of the second moment, the first term in parentheses is less than 1, making a necessary and sufficient condition⁶ for the existence of the fourth moment that

$$3a_1^2 + 2a_1 b_1 + b_1^2 < 1. \tag{1.13}$$

The kurtosis is then

$$\kappa = \frac{6a_1^2}{1 - b_1^2 - 2a_1 b_1 - 3a_1^2}, \tag{1.14}$$

which is positive if 1.13 is satisfied.

The leptokurtosis in asset returns is consistent with a number of other statistical models. Our next stylized fact will help rule out one important class.

⁵ The coefficient of kurtosis κ is defined as $(E[\varepsilon_t^4] - 3(\sigma^2_\varepsilon)^2)/(\sigma^2_\varepsilon)^2$. If this is positive, it indicates that the distribution is "fat-tailed" relative to the normal.
⁶ There are no general formulas for the existence of the fourth moment in other GARCH processes. Bollerslev has derived conditions for the GARCH(1, 2): $a_2 + 3a_1^2 + 3a_2^2 + b_1^2 + 2a_1 b_1 - 3a_2^3 + 3a_1^2 a_2 + 6a_1 a_2 b_1 + a_2 b_1^2 < 1$; and for the GARCH(2, 1): $b_2 + 3a_1^2 + b_1^2 + b_2^2 + 2a_1 b_1 - b_2^3 - a_1^2 b_2 + 2a_1 b_1 b_2 + b_1^2 b_2 < 1$. These sums all will appear in the denominator of 1.14, making those processes leptokurtic as well.
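As a quick numerical illustration (a sketch using assumed coefficients, not estimates from the chapter), the matrix condition 1.11 and the GARCH(1,1) formulas 1.13-1.14 can be evaluated directly:

```python
import numpy as np

# Sketch: check the fourth-moment conditions above with assumed coefficients.
# (a) ARCH(2): Milhoj's matrix condition 1.11 versus the expanded inequality.
a1, a2 = 0.20, 0.15
A = np.array([a1, a2])
I_minus_Psi = np.array([[1 - a2, 0.0],   # built from Psi_ij = a_{i+j} + a_{i-j},
                        [-a1,    1.0]])  # with a_j = 0 for j <= 0 and j > 2
m = 3.0 * A @ np.linalg.solve(I_minus_Psi, A)
e = a2 + 3*a1**2 + 3*a2**2 + 3*a1**2*a2 - 3*a2**3
print(m < 1, m, (e - a2) / (1 - a2))     # (e - a2)/(1 - a2) equals m

# (b) GARCH(1,1): condition 1.13 and the excess kurtosis 1.14.
a1, b1 = 0.10, 0.80
cond = 3*a1**2 + 2*a1*b1 + b1**2
kappa = 6*a1**2 / (1 - b1**2 - 2*a1*b1 - 3*a1**2)
print(cond < 1, kappa)                   # fourth moment exists; kappa > 0
```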


1.2.1.2. Normality with Temporal Aggregation

Consider a time series $\{y_t\}_{t=1}^{T}$ generated by 1.1 and 1.2. Denote by $Y_t^s$ the s-period temporal aggregate

$$Y_t^s = \sum_{t=1}^{s} y_t.$$

For instance, if $y_t$ was the change in the daily exchange rate, $Y_t^5$ would be the weekly change. A statistical representation for asset returns must account for our third stylized fact: as s increases, asset returns converge to normality. One of the strong objections to the Paretian densities⁷ is that they are stable under aggregation. If the tails are thick in the daily changes, they will also be thick in the weekly aggregated series.

The convergence to normality in the GARCH process is a fairly straightforward implication of central limit theory. The temporal aggregate sums over a very large number of draws of $y_t$ as s grows large. The only difficulty in proving convergence is to account for the dependence in the data. A central limit theory for dependent observations has been worked out under fairly weak assumptions: White (1984) shows that stationarity and ergodicity are sufficient. Diebold (1988) has proven the convergence using White's theorem. He shows that if $y_t$ is an AR(p), GARCH(p, q) process, the aggregated series $\{Y_t^s\}_{t=s}^{n}$, $n = \mathrm{int}[T/s]$, has an unconditional normal distribution as $n \to \infty$.⁸

⁷ For discussion on this family in the economics literature, see Mandelbrot (1963) or Fama and Roll (1968).
⁸ There is considerable discussion in the literature over the non-normality of the conditional distribution. Many studies find that $v_t$ is not normally distributed. Bollerslev (1987), for example, proposes a model where $v$ is Student-t.
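A short simulation (a sketch; the parameters are illustrative assumptions) makes the aggregation result concrete: the excess kurtosis of the s-period aggregate of a GARCH(1,1) series shrinks toward zero as s grows:

```python
import numpy as np

# Sketch: temporal aggregation of a simulated GARCH(1,1) series.
# Excess kurtosis of the s-period aggregate Y^s falls toward 0 as s grows.
rng = np.random.default_rng(2)
a0, a1, b1, T = 0.05, 0.15, 0.80, 500_000   # satisfies condition (1.13)
y, h = np.empty(T), a0 / (1 - a1 - b1)
e = np.sqrt(h) * rng.standard_normal()
for t in range(T):
    h = a0 + a1 * e**2 + b1 * h
    e = np.sqrt(h) * rng.standard_normal()
    y[t] = e                                # "daily" change

for s in (1, 5, 20):                        # daily, weekly, monthly aggregates
    Y = y[: T - T % s].reshape(-1, s).sum(axis=1)
    print(s, np.mean(Y**4) / np.mean(Y**2)**2 - 3)
```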

1.3. A Model of Asset Pricing and Learning

A representative agent determines the return, $r_t$, to a security based on his beliefs about the fundamental,

$$r_t = E[y_t]. \tag{1.15}$$

The agent is assumed to know the structure of the model but not its parameters. He updates his beliefs recursively using a least squares algorithm, $E[y_t] = x_t\widehat{\beta}_t$, where

$$\widehat{\beta}_t = (\widetilde{X}_t'\widetilde{X}_t)^{-1}\widetilde{X}_t'\widetilde{Y}_t = \beta + (\widetilde{X}_t'\widetilde{X}_t)^{-1}\widetilde{X}_t'\tilde{\xi}_t \equiv \beta + \widetilde{M}_t\tilde{\xi}_t. \tag{1.16}$$

I assume that the agent observes k pre-sample values of the fundamental and explanatory variables so that his beliefs start at time $t = 1$. I use the tilde to denote matrices that are augmented with these pre-sample values, e.g. $\widetilde{X}_t = [\,x_{-k+1} \cdots x_0 \cdots x_t\,]$.

Consider now the perspective of an econometrician trying to analyze the relationship between security returns and the fundamental. He does not directly observe the agent's


beliefs, but he does observe a time series of security returns,

$$R_T = \begin{bmatrix} r_1 \\ \vdots \\ r_T \end{bmatrix} = \begin{bmatrix} x_1\beta \\ \vdots \\ x_T\beta \end{bmatrix} + \begin{bmatrix} x_1\widetilde{M}_1\tilde{\xi}_1 \\ \vdots \\ x_T\widetilde{M}_T\tilde{\xi}_T \end{bmatrix} \equiv X_T\beta + V_T. \tag{1.17}$$

Now suppose the econometrician regressed the return series on the same vector of explanatory variables used by the agent,

$$\hat{r}_t = x_t\beta_T, \tag{1.18}$$

where

$$\beta_T = (X_T'X_T)^{-1}X_T'R_T = \beta + M_T V_T. \tag{1.19}$$

Both the agent and the econometrician have unbiased expectations,

$$E[\widehat{\beta}_t] = E[\beta_T] = \beta,$$

but the covariance structure is quite different from the standard Markov model. In particular, I will show that the residuals are conditionally heteroscedastic.
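The mechanics of 1.15-1.19 are easy to see in a minimal simulation (a sketch with a scalar regressor and illustrative values; it is not the author's code): the agent's recursively re-estimated coefficient sets returns, and the econometrician then fits a single fixed coefficient to those returns.

```python
import numpy as np

# Sketch of the learning model with a scalar x and illustrative values.
rng = np.random.default_rng(3)
beta, k, T = 0.5, 5, 250
x = rng.standard_normal(T + k)             # k pre-sample values, then x_1..x_T
y = x * beta + rng.standard_normal(T + k)  # the fundamental, equation 1.1

# Agent: recursive least squares beliefs (1.16) and returns (1.15).
r = np.empty(T)
for t in range(T):
    X, Y = x[: k + t + 1], y[: k + t + 1]  # all data through time t
    r[t] = x[k + t] * (X @ Y) / (X @ X)    # r_t = x_t * beta_hat_t

# Econometrician: one fixed coefficient for the whole sample (1.18)-(1.19).
xs = x[k:]
beta_T = (xs @ r) / (xs @ xs)
z = r - xs * beta_T                        # residuals z_t of equation 1.20
print(beta_T, np.var(z))                   # z_t^2 is what Section 1.4 studies
```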

1.4. The Covariance Structure of the Residuals

I first describe the covariance matrix of this process. In the second part, I focus on the conditional variance.

1.4.1. The Covariance Matrix

The residuals of 1.15 less 1.18 are given by

$$r_t - \hat{r}_t \equiv z_t = x_t(\widetilde{M}_t\tilde{\xi}_t - M_T V_T) \equiv z_{1t} - z_{2t}. \tag{1.20}$$

There will be four terms that make up each element of the covariance matrix $\Omega \equiv E[Z_T Z_T']$. For each term below, let $t = 1, \ldots, T$ and $j = 0, \ldots, t-1$. The first component is

$$E[z_{1t}z_{1t-j}'] = x_t\widetilde{M}_t E[\tilde{\xi}_t\tilde{\xi}_{t-j}']\widetilde{M}_{t-j}'x_{t-j}' = x_t\widetilde{M}_t\,\sigma^2_\varepsilon I_{t,t-j}\,\widetilde{M}_{t-j}x_{t-j}' = \sigma^2_\varepsilon\,x_t(\widetilde{X}_t'\widetilde{X}_t)^{-1}x_t', \tag{1.21}$$

where $I_{t,t-j}$ is a $t \times (t-j)$ identity matrix.

The second term is given by

$$E[z_{1t}z_{2t-j}'] = \begin{bmatrix} x_t\widetilde{M}_t E[\tilde{\xi}_t\tilde{\xi}_1']\widetilde{M}_1'x_1' \\ \vdots \\ x_t\widetilde{M}_t E[\tilde{\xi}_t\tilde{\xi}_T']\widetilde{M}_T'x_T' \end{bmatrix}' M_T'x_{t-j}' = \sigma^2_\varepsilon\,x_t\begin{bmatrix} (\widetilde{X}_t'\widetilde{X}_t)^{-1}x_1' \\ \vdots \\ (\widetilde{X}_t'\widetilde{X}_t)^{-1}x_t' \\ \vdots \\ (\widetilde{X}_T'\widetilde{X}_T)^{-1}x_T' \end{bmatrix}' M_T'x_{t-j}' \equiv \sigma^2_\varepsilon\,x_t W_t M_T'x_{t-j}'. \tag{1.22}$$

The second line uses the same substitution as in 1.21. $\widetilde{M}_t I_{t,k}\widetilde{M}_k$ reduces to $(\widetilde{X}_t'\widetilde{X}_t)^{-1}$ until $k > t$, when it then starts to cancel to the left. The third term is

$$E[z_{2t}z_{1t-j}'] = \sigma^2_\varepsilon\,x_t M_T\begin{bmatrix} x_1(\widetilde{X}_{t-j}'\widetilde{X}_{t-j})^{-1} \\ \vdots \\ x_{t-j}(\widetilde{X}_{t-j}'\widetilde{X}_{t-j})^{-1} \\ \vdots \\ x_T(\widetilde{X}_T'\widetilde{X}_T)^{-1} \end{bmatrix}x_{t-j}' = \sigma^2_\varepsilon\,x_t M_T W_{t-j}. \tag{1.23}$$

In this case, the elements $E[V_T\tilde{\xi}_{t-j}'\widetilde{M}_{t-j}']$ leave an $(\widetilde{X}_{t-j}'\widetilde{X}_{t-j})^{-1}$ until the left index exceeds $t - j$. The final term is a bit more involved because the cross product $V_T V_T'$ produces a $(T \times T)$ matrix,

$$E[z_{2t}z_{2t-j}'] = \sigma^2_\varepsilon\,x_t M_T\begin{bmatrix} x_1(\widetilde{X}_1'\widetilde{X}_1)^{-1}x_1' & \cdots & x_1(\widetilde{X}_1'\widetilde{X}_1)^{-1}x_T' \\ \vdots & \ddots & \vdots \\ x_T(\widetilde{X}_T'\widetilde{X}_T)^{-1}x_1' & \cdots & x_T(\widetilde{X}_T'\widetilde{X}_T)^{-1}x_T' \end{bmatrix} M_T'x_{t-j}'. \tag{1.24}$$

Overall, this covariance matrix is quite complex. The off-diagonal elements $z_t z_{t-j}$ are all non-zero, indicating that the residuals are serially correlated.⁹ Now consider the diagonal terms, $E[z_t^2]$, for $t = 1, \ldots, T$. 1.21-1.24 simplify for $j = 0$. 1.22 and 1.23 are identical, and because the residuals are orthogonal to the x's, $E[x_t z_t] = 0$, the fourth term cancels one of them. I am left with

$$\gamma_z(t) \equiv E[z_t^2] = E[z_{1t}^2 - z_{1t}z_{2t}'] = \sigma^2_\varepsilon\,[x_t(\widetilde{X}_t'\widetilde{X}_t)^{-1}x_t' - x_t W_t M_T'x_t']. \tag{1.25}$$

From 1.25, it is clear that the variances will be time varying and related to the explanatory variables. In the next section, I will show that the squared residuals are an ARMA process.

⁹ The presence of serial correlation poses some difficulties for GARCH hypothesis testing; see e.g. Bera, Higgins and Lee (1992) for an alternative approach to the standard Lagrange multiplier test of Engle (1982).


1.4.2. A GARCH Representation for the Conditional Variances

I will analyze in this section the autocorrelation and partial autocorrelation functions. To show that the model has a GARCH representation, I will show that they are both nonzero at all lags.

Let's begin with the autocovariance function for $z_t^2$,

$$\gamma_{zz}(t, t-j) \equiv E[(z_t^2 - E[z_t^2])(z_{t-j}^2 - E[z_{t-j}^2])].$$

Since $\varepsilon_t$ is Gaussian, the terms are just cross products of $\gamma_z$,

$$\gamma_{zz}(t, t-j) = E[z_{1t}^2 z_{1t-j}^2 - 2z_{1t}^2 z_{1t-j}z_{2t-j} + z_{1t}^2 z_{2t-j}^2 - 2z_{1t}z_{2t}z_{1t-j}^2 + 4z_{1t}z_{2t}z_{1t-j}z_{2t-j} - 2z_{1t}z_{2t}z_{2t-j}^2 + z_{2t}^2 z_{1t-j}^2 - 2z_{2t}^2 z_{1t-j}z_{2t-j} + z_{2t}^2 z_{2t-j}^2]. \tag{1.26}$$

Note that for large T, at any j, $E[z_{2t-j}^2] \approx 0$. To understand this intuitively, consider the case where x is a scalar,

$$z_{2t-j} = x_{t-j}\,\frac{\sum_{r=1}^{T} x_r\big(\sum_{s=1}^{r} x_s\varepsilon_s / \sum_{s=1}^{r} x_s^2\big)}{\sum_{r=1}^{T} x_r^2}. \tag{1.27}$$

The inner double sum converges quickly. Consequently,

$$\gamma_{zz}(t, t-j) \approx E[z_{1t}^2 z_{1t-j}^2] = E[x_t\widetilde{M}_t\tilde{\xi}_t\tilde{\xi}_t'\widetilde{M}_t'x_t'\;x_{t-j}\widetilde{M}_{t-j}\tilde{\xi}_{t-j}\tilde{\xi}_{t-j}'\widetilde{M}_{t-j}'x_{t-j}'].$$

This will be non-zero if the stochastic part of this term is non-zero,

$$E[\tilde{\xi}_t\tilde{\xi}_t'\,I_{t,t-j}\,\tilde{\xi}_{t-j}\tilde{\xi}_{t-j}'] \equiv \Sigma_{t-j}, \tag{1.28}$$

where $\Sigma_{t-j}$ is a symmetric $(t-j) \times (t-j)$ matrix whose rth diagonal element is

$$E\Big[\varepsilon_r^2\sum_{s=1}^{t-j}\varepsilon_s^2\Big] = \mu_4 + \sum_{s \neq r}^{t-j}(\sigma^2_\varepsilon)^2. \tag{1.29}$$

Because the inner sum runs from 1 to $t - j$, this expectation is non-zero for any j, and consequently, so is $\gamma_{zz}(t, t-j)$.

To assess the significance of the partial autocorrelation coefficients, consider the regression of $z_t^2$ on $z_{t-1}^2$ and call this coefficient $\phi_1$. The residuals from this regression,

$$z_t^2 - c - \phi_1 z_{t-1}^2,$$

will include the difference of the partial sums up to time $t - 1$,

$$\Delta\Big(\sum_{j=-k+1}^{t} x_j\varepsilon_j\Big)^2 = x_t\varepsilon_t\sum_{j=-k+1}^{t-1} x_j\varepsilon_j.$$

These will be correlated with any $z_{t-j}^2$ because the remaining portion still sums disturbances from the first time period.

Intuitively, the learning process induces parameter variation in the conditional mean. An econometrician who fails to model this structure leaves information in the residuals. An equivalent representation, this section demonstrates, is a Markov model with GARCH disturbances. Technically, we still need to show that the autocorrelations and partial autocorrelations decay sufficiently quickly for the process to be stationary. This unfortunately must be left for future research. In this paper, Monte Carlo simulations will have to suffice.


Table 1.1. Average autocorrelation coefficient (AC) and partial autocorrelation coefficient (PAC) at each lag in 250 replications. Simulation results for equation $y_t = 0.05 + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$

Lag   T=25 AC   T=25 PAC   T=100 AC   T=100 PAC   T=250 AC   T=250 PAC
1      0.397     0.397      0.643      0.643       0.724      0.724
2      0.156    -0.083      0.458     -0.009       0.569      0.078
3      0.063     0.039      0.356      0.071       0.477      0.078
4      0.012    -0.059      0.282     -0.036       0.410     -0.021
5     -0.032    -0.012      0.226      0.032       0.356      0.048
6     -0.045    -0.034      0.187     -0.004       0.320      0.009
7     -0.054    -0.015      0.157      0.029       0.289      0.042
8     -0.066    -0.060      0.132     -0.006       0.264     -0.001
9     -0.074    -0.024      0.112      0.022       0.243      0.032
10    -0.086    -0.064      0.092     -0.011       0.224     -0.004

1.5. Finite Sample Properties

I consider four finite sample exercises. The first design is a simple scalar case where the only regressor is a constant term. In this exercise, I hope to show the persistence coming solely from the disturbances. In the second case, I consider a multivariate regression model where the x's are independent. Since the heteroscedasticity varies with the explanatory variables, I want to exclude the effects of any dependence coming from the regressors. In the third simulation, I consider an AR(p) model. Finally, in a more realistic model, I consider structural change in the parameters. In each simulation, I draw samples of T + 5 observations on the disturbance term and regressors to generate the y's, essentially assuming the first 5 are visible only to the agent. I consider three sample sizes of length T = 25, 100 and 250. I then do 250 replications of each. I report sample averages of the first ten autocorrelations and partial autocorrelations of $z_t^2$ in Tables 1.1 through 1.4.
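A compact sketch of the first design (scalar constant regressor; this is a minimal reconstruction for illustration, not the author's code) conveys the flavour of Table 1.1:

```python
import numpy as np

# Sketch of the first Monte Carlo design: constant regressor, T = 100,
# 250 replications; average first-lag autocorrelation of z_t^2.
rng = np.random.default_rng(4)
T, reps, acs = 100, 250, []
for _ in range(reps):
    y = 0.05 + rng.standard_normal(T + 5)        # y_t = 0.05 + eps_t, x_t = 1
    # Agent's belief is the running mean of y (recursive OLS on a constant).
    r = np.array([y[: 5 + t + 1].mean() for t in range(T)])
    beta_T = r.mean()                            # econometrician's OLS (1.19)
    z = r - beta_T                               # residuals, equation 1.20
    s = z**2 - (z**2).mean()
    acs.append(np.sum(s[1:] * s[:-1]) / np.sum(s**2))
print(np.mean(acs))  # compare with the lag-1, T = 100 entries of Table 1.1
```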

1.5.1. Scalar Regression Model

In this first example, $y_t = 0.05 + \varepsilon_t$; $x_t$ is a scalar constant, so k = 1 and $x_t = 1$ for all t. ε is N(0, 1) and independent and identically distributed. The results are in Table 1.1. There is clear evidence of an AR process even in the small sample. The partial autocorrelations are large only at the first lag. An econometrician might readily conclude that this is an AR(1) in $\varepsilon_t^2$ or an ARCH(1) process. It appears that some structure in the independent variables is needed to generate a high order ARCH or GARCH. This motivates our next simulation.

Table 1.2. Average autocorrelation coefficient (AC) and partial autocorrelation coefficient (PAC) at each lag in 250 replications. Simulation results for equation $y_t = 0.25 + 0.5x_{1t} - 0.25x_{2t} + \varepsilon_t$, $x_t$ and $\varepsilon_t \sim N(0, 1)$

Lag   T=25 AC   T=25 PAC   T=100 AC   T=100 PAC   T=250 AC   T=250 PAC
1      0.107     0.107      0.246      0.246       0.292      0.292
2      0.084     0.040      0.206      0.117       0.255      0.159
3      0.051     0.045      0.164      0.113       0.235      0.148
4      0.018    -0.034      0.163      0.062       0.213      0.070
5      0.008     0.003      0.129      0.061       0.189      0.067
6     -0.013    -0.029      0.125      0.029       0.167      0.033
7     -0.022    -0.022      0.104      0.034       0.157      0.049
8     -0.021    -0.036      0.089      0.000       0.140      0.012
9     -0.028    -0.011      0.083      0.026       0.141      0.041
10    -0.046    -0.044      0.076      0.008       0.134      0.013

1.5.2. Multiple Independent Regressors

I use a three variable model, $y_t = 0.25 + 0.5x_{1t} - 0.25x_{2t} + \varepsilon_t$. Each x and the ε are independent draws from an N(0, 1). I again drop 5 pre-sample values for each of the three sample sizes. Results are reported in Table 1.2. By the mid-sized sample, T = 100, there is clear evidence of a higher order ARMA process. In the large sample, there is the nice smooth decay in both the AC and PAC that the theory predicts.

1.5.3. An AR(p) Model

I next consider the AR(2) model, $y_t = x_t = 0.05 + 0.5x_{t-1} - 0.25x_{t-2} + \varepsilon_t$. The ε's are again N(0, 1), and I set the initial values for the x's to zero, still dropping the first 5 observations. Results for this third exercise are in Table 1.3. The covariance structure suggests that dependence in the x's should induce additional dependence in the squared disturbances. This can be detected in this simulation. At lags 1 and 2, the AC and PAC are larger than in Table 1.2 for all three sample sizes. These coefficients again suggest a high order ARMA process.

1.5.4. Structural Change

If agents' beliefs settled once and for all on a single model, volatility would tend to die out. In the third finite sample exercise with T = 100, the unconditional variance in the first half of the sample is 35% larger on average than in the second half. The conditional heteroscedasticity remains significant in the sample as a whole, though, because small errors in the second half of the series are following other small errors. In financial market data, we see repeated bursts of volatility. In this framework, I model these volatility outbreaks as changes in the coefficients of the fundamental. In this last exercise, there is a 1% chance, each period, that the first AR parameter will shift up by 0.1. I constrain the parameter when necessary to keep the process stationary. The results are in Table 1.4. The AC and PAC are basically the same as in Table 1.3, even though the unconditional volatility is now equal in both halves of the sample.


Table 1.3. Average autocorrelation coefficient (AC) and partial autocorrelation coefficient (PAC) at each lag in 250 replications. Simulation results for equation $y_t = x_t = 0.05 + 0.5x_{t-1} - 0.25x_{t-2} + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$

Lag   T=25 AC   T=25 PAC   T=100 AC   T=100 PAC   T=250 AC   T=250 PAC
1      0.156     0.156      0.286      0.286       0.333      0.333
2      0.139     0.066      0.252      0.139       0.295      0.168
3      0.065     0.059      0.176      0.103       0.219      0.107
4      0.042    -0.031      0.161      0.030       0.205      0.046
5      0.010    -0.001      0.147      0.067       0.192      0.079
6     -0.009    -0.033      0.117      0.014       0.163      0.027
7     -0.015    -0.011      0.089      0.011       0.137      0.027
8     -0.025    -0.045      0.078     -0.009       0.127      0.005
9     -0.039    -0.020      0.067      0.023       0.117      0.029
10    -0.043    -0.041      0.063      0.004       0.111      0.013

Table 1.4. Average autocorrelation coefficient (AC) and partial autocorrelation coefficient (PAC) at each lag in 250 replications. Simulation results for AR(2) with 1% chance of shift in $\beta_2$

Lag   T=25 AC   T=25 PAC   T=100 AC   T=100 PAC   T=250 AC   T=250 PAC
1      0.157     0.157      0.287      0.287       0.333      0.333
2      0.138     0.063      0.251      0.140       0.297      0.173
3      0.063     0.058      0.180      0.109       0.226      0.114
4      0.043    -0.029      0.162      0.030       0.200      0.035
5      0.010    -0.003      0.148      0.068       0.193      0.079
6     -0.009    -0.011      0.119      0.012       0.174      0.039
7     -0.015    -0.047      0.092      0.015       0.148      0.036
8     -0.025    -0.036      0.078     -0.011       0.132      0.006
9     -0.038    -0.019      0.069      0.023       0.123      0.033
10    -0.043    -0.040      0.061     -0.001       0.111      0.001



Table 1.5. Estimates of the log of the conditional variance for the IL/DM exchange rate. Coefficient estimates for a GARCH(1,1) and an augmented GARCH(1,1) estimation. Newey-West standard errors are reported; t-statistics are shown in parentheses. The likelihood ratio statistic is distributed $\chi^2(4)$.

Explanatory Variable        Model 1           Model 2
Constant                    -0.261 (3.26)     -0.183 (4.32)
$\varepsilon_{t-1}^2$        0.176 (4.32)      0.195 (1.91)
$x_t^2\varepsilon_{t-1}^2$                    -0.004 (0.34)
$x_t x_{t-1}\varepsilon_{t-1}^2$               0.036 (0.77)
$h_{t-1}$                    0.853 (14.74)     0.917 (27.03)
$x_t^2 h_{t-1}$                               -0.017 (0.11)
$x_t x_{t-1} h_{t-1}$                         -0.244 (2.66)
Log likelihood               86.127            116.690
Likelihood ratio                               61.126

1.6. An Empirical Example

These simulation results do not constitute any direct evidence in favor of the learning model. In this section, I propose a flexible parameterization for the conditional heteroscedasticity. This functional form captures the parameter variation in the conditional variance and nests the standard model.

To avoid any problems with negative variances, I use logs. Let $h_t = E[\varepsilon_t^2 \mid \varepsilon_{t-1}^2, h_{t-1}, x_t, \ldots]$, let $x_t$ be a scalar, and write the functional form as

$$\log(h_t) = a_0 + \Big(\sum_{i=1}^{q}\big[a_{1,i} + \sum_{r=0}^{s} a_{2+r,i}\,x_t x_{t-r}\big]\Big)\varepsilon_{t-i}^2 + \Big(\sum_{j=1}^{p}\big[b_{1,j} + \sum_{r=0}^{s} b_{2+r,j}\,x_t x_{t-r}\big]\Big)\log(h_{t-j}). \tag{1.30}$$

The model lets the coefficients on the squared disturbances and the conditional variances vary with cross products of the x's. Also, if $\sum a_{2+r,i} = \sum b_{2+r,j} = 0$, it reduces to the standard model.

I obtained a series of 510 daily Italian Lira-German Deutschemark exchange rates for 1992-93. This period includes the withdrawal of the Lira from the European exchange rate grid. The leptokurtosis is over 22, and previous work, Mizrach (1995), suggested the possibility of a GARCH model. I use log differences for returns, and the conditional mean is AR(1). I then jointly estimated a log-GARCH(1,1) for $h_t$. Results are in the first column of Table 1.5. Overall,
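To make the specification concrete, here is a minimal sketch of the s = 1, p = q = 1 case of 1.30 (the coefficient values are invented for illustration and are not the Table 1.5 estimates):

```python
import numpy as np

# Sketch: one step of the time-varying log-GARCH(1,1) in equation 1.30 with
# s = 1. Coefficient values are illustrative assumptions, not estimates.
def log_h_next(eps_prev, h_prev, x_t, x_tm1,
               a0, a1, a2, a3, b1, b2, b3):
    arch = (a1 + a2 * x_t * x_t + a3 * x_t * x_tm1) * eps_prev**2
    garch = (b1 + b2 * x_t * x_t + b3 * x_t * x_tm1) * np.log(h_prev)
    return a0 + arch + garch   # log h_t; setting a2 = a3 = b2 = b3 = 0
                               # recovers the standard log-GARCH(1,1)

h_t = np.exp(log_h_next(0.5, 1.2, 0.3, -0.1,
                        a0=-0.2, a1=0.2, a2=0.0, a3=0.04,
                        b1=0.9, b2=0.0, b3=-0.2))
print(h_t)
```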


the model appears quite satisfactory, with significant¹⁰ AR and MA parameters. Nonetheless, I wanted to see if the learning model parameterization might locate additional structure.

The second column of Table 1.5 reports estimation for s = 1. The model appears to offer a good deal better fit. The lagged terms $x_t x_{t-1}$ are significant in both the squared disturbances and the lagged conditional variances. The likelihood improves by 25%, and the $\chi^2(4)$ likelihood ratio statistic is over 60. I can easily reject the standard model in favor of the learning specification.

¹⁰ I use Newey-West HAC standard errors with 4 lags.

Conclusion

The literature on financial market volatility has made great strides in the statistical modeling of conditional variances. A variety of parameterizations for volatility exist. This paper has not completely avoided offering yet another. Focusing on the source of volatility should help us choose among candidate models. Further research on the impact of learning on volatility should prove fruitful at least in this respect. A single empirical example was also provided in favor of the learning specification. Additional research will be needed to establish whether this model is as empirically robust as its precursors.

References

[1] Baillie, R., T. Bollerslev and H.O. Mikkelsen (1993), "Fractionally integrated generalized autoregressive conditional heteroscedasticity," Kellogg School Working Paper #168.
[2] Bera, A., M.L. Higgins and S. Lee (1992), "Interaction between autocorrelation and conditional heteroscedasticity: a random coefficient approach," Journal of Business and Economic Statistics 10, 133-142.
[3] Bollerslev, T. (1986), "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics 31, 307-327.
[4] Bollerslev, T. (1987), "A conditionally heteroskedastic time series model for speculative prices and rates of return," Review of Economics and Statistics 69, 542-547.
[5] Bollerslev, T. (1988), "On the correlation structure for the generalized autoregressive conditional heteroskedastic process," Journal of Time Series Analysis 9, 121-131.
[6] Bollerslev, T., R. Chou and K. Kroner (1992), "ARCH modeling in finance: a review of theory and empirical evidence," Journal of Econometrics 52, 5-59.
[7] Diebold, F. (1988), ARCH Modeling of Exchange Rates, New York: Springer-Verlag.
[8] Engle, R. (1982), "Autoregressive conditional heteroscedasticity with estimates of the variance of UK inflation," Econometrica 50, 987-1007.
[9] Engle, R., D. Lilien and R. Robins (1987), "Estimating time varying risk premia in the term structure: the ARCH-M model," Econometrica 55, 391-407.
[10] Engle, R. and C. Mustafa (1992), "Implied ARCH models from option prices," Journal of Econometrics 52, 289-311.
[11] Fama, E. and R. Roll (1968), "Some properties of symmetric stable distributions," Journal of the American Statistical Association, 817-835.
[12] Hsieh, D. (1988), "The statistical properties of daily foreign exchange rates: 1974-1983," Journal of International Economics 17, 173-184.
[13] Lin, W-L., R. Engle and T. Ito (1994), "Do bulls and bears move across borders? International transmission of stock returns and volatility," Review of Financial Studies 7, 507-538.
[14] Mandelbrot, B. (1963), "The variation of certain speculative prices," Journal of Business 36, 394-419.
[15] Milhoj, A. (1985), "The moment structure of ARCH processes," The Scandinavian Journal of Statistics 12, 281-292.
[16] Mizrach, B. (1995), "Target zone models with stochastic realignments: an econometric evaluation," Journal of International Money and Finance 14, 641-657.
[17] Nelson, D. (1991), "Conditional heteroscedasticity in asset returns: a new approach," Econometrica 59, 347-370.
[18] Pagan, A. and G.W. Schwert (1990), "Alternative models for conditional stock volatility," Journal of Econometrics 45, 267-290.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 15-23

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 2

MODELLING AND MEASURING THE SOVEREIGN BORROWER'S OPTION TO DEFAULT

Ephraim Clark∗
Middlesex University and SKEMA Business School, France

2.1. Introduction

The main approach to credit risk modeling is based on Merton (1974, 1977) and views bonds as contingent claims on the borrowers' assets. The credit event is modeled as timing risk when the assets of the borrower reach a threshold. In Merton (1974, 1977), Black and Cox (1976), Ho and Singer (1982), Chance (1990) and Kim, Ramaswamy and Sundaresan (1993), default is modeled as occurring at debt maturity if the assets of the borrower are less than the amount of the debt due. More recent models, starting with Longstaff and Schwartz (1995), have randomized the timing of the default event, determined by when the value of the assets hits a pre-determined barrier. Other papers, such as Jarrow and Turnbull (1995), Madan and Unal (1998) and Duffie and Singleton (1999), model the timing of the default event as a Poisson process or a doubly stochastic Poisson process (Lando, 1994).

These models are difficult to apply to sovereign credit risk because of what is called "national sovereignty", which lies at the heart of the political world order and endows countries with the de facto power to unilaterally abrogate or suspend contractual obligations when deemed in the national interest. This is a major factor that distinguishes sovereign debt from corporate debt. Thus, in the foregoing models, creditworthiness of a corporate borrower depends, for all practical purposes, on its ability to pay. Where a sovereign borrower is concerned, however, besides the ability to pay, creditworthiness depends on the government's willingness or unwillingness to pay even if it has the ability.

The literature on country risk has not overlooked the importance of the willingness factor. Eaton, Gersovitz and Stiglitz (1986), for example, argued that because a country's wealth is always greater than its foreign debts, the real key to default is the government's willingness to pay. Borensztein and Pennacchi (1990) suggest that besides other observable variables that are tested, the price of sovereign debt should be related to an unobservable variable that expresses the debtor country's willingness to pay. Clark (1991) suggests that the price of sovereign debt is related to a country's willingness to pay, which is motivated by a desire to avoid the penalties of default. Although analytically seductive, the problem with the concept of the willingness (unwillingness) to pay is that it is not readily observable. Consequently, empirical testing of the price of sovereign debt on the secondary market has tended to exclude this variable and focus on financial variables (Feder and Just (1977), Cline (1984), McFadden et al. (1985), Callier (1985)), structural variables (Berg and Sachs (1988)), or other phenomena such as prices (Huizinga (1989)) or debt overhang (Hajivassiliou (1989)).

In this paper I follow Clark and Zenaidi (1999), who model sovereign credit risk as an American style call option where the decision to default depends on the government optimizing the trade-off between the gains to be reaped through non-payment and the costs associated with not paying. This option measures the government's willingness to pay: the higher the value of the option, the less the government is willing to honor its obligation to service its foreign debt. I show that when both nominal levels of debt and the costs to the sovereign associated with default are stochastic, there is a ratio of debt to costs where it will be optimal for the sovereign to default. This makes it possible to price the option and estimate both the distance to default and the probability of default over any time horizon.

The rest of the paper is organized as follows. In Section 2, I develop the model. In Section 3, I show how it can be implemented. Section 4 concludes.

∗ E-mail address: [email protected]

2.2. Modeling Country Risk

In this section I develop a model of the value of the sovereign's option to default. Let x represent the value of the option to default. It is a function of two variables: total outstanding foreign debt, noted as D(t), and the cost to the country, in indemnities, penalties, reduced access to capital markets, restrictive covenants and higher borrowing costs, in the case of default, noted as C(t). More debt increases the value of the option and higher default costs decrease it, so that x is increasing in D and decreasing in C.

Define D as the par value or nominal amount of foreign debt outstanding. To the extent that local resources are inadequate to finance economic growth, foreign borrowing will have a deterministic element that depends on the economy's long term rate of growth. Because economic performance is subject to random fluctuations, foreign borrowing will have a random component as well.¹ With this in mind, and since total foreign liabilities can never be negative, geometric Brownian motion with trend can describe its evolution through time:

$$dD(t) = \alpha D(t)\,dt + \sigma D(t)\,dz(t) \tag{2.1}$$

where α is the expected rate of growth of the country's foreign debt, σ is the standard deviation of dD(t)/D(t), and dz(t) is a Wiener process with zero mean and variance equal to dt.

The cost of default is likely to have a deterministic element that depends to a certain extent on the size of the economy, its growth rate, the amount of debt outstanding and the


risk aversion of creditors.2 It will also vary stochastically over time, depending on a wide range of conditions and circumstances. The market value of assets, for example, is known to vary stochastically. Thus, the market value of recoverable assets available as de jure or de facto guarantees will have a stochastic element. The amount of indemnities that can practically be imposed on a defaulting borrower also has a stochastic element. It depends on the reactions of many players including politicians, businessmen, bankers, civil servants and the like. Typically, these reactions vary according to circumstances that tend to vary, in turn, according to the evolution of a complex set of economic, political, social, environmental, etc. variables at the international, regional and local levels. Because of its deterministic and stochastic properties and since it cannot be negative, the amount recoverable can also be described by geometric Brownian motion dC(t) = πC(t)dt + ωC(t)dw(t)

(2.2)

where π is the trend parameter, ω2 is the variance parameter of the percentage change in C(t) and dw(t) is a Wiener process with zero mean and variance equal to dt, with dz(t)dw(t) = ρd(t) where ρ is the instantaneous correlation coefficient between D and C. Consider a new variable g = D/C, the nominal amount of debt outstanding per dollar of cost, where the time arguments have been dropped for simplicity of notation. Using 2.1, 2.2 and Ito’s lemma gives: dg = µgdt + δgds (2.3) where µ = α − π − σωρ + ω2 2

(2.4)

2

δ = σ − 2σωρ + ω (2.5) σdz − ωdw (2.6) ds = δ Make the change of variables X(g, 1) = x(D,C)/C and assume time independence of debt.3 We note that experience has shown that as outstanding sovereign liabilities come due, they are either rolled over or new loans are contracted to pay them off. Thus, in practice, contracted maturities are more like reset dates on a revolving credit with no fixed expiration date than a strict term loan. Thus, with no explicit expiry date, neither x nor X depend explicitly on time even though individual liabilities have finite maturities. Assuming risk neutrality, a risk free interest rate that is constant at r and taking expectations, gives:4 1 2 2 00 0 δ g X + µgX − rX = 0 (2.7) 2 2 What is

recovered by creditors is likely to be different from what is lost by defaulting countries. See Eaton and Gersovitz (1981) and Bulow and Rogoff (1989) for a discussion of the costs accruing to countries as a result of default. 3 This is a common assumption, adopted, for example by Modigliani and Miller (1958), Merton (1974), Black and Cox (1976) and Leland (1994). Leland (1994) justifies this assumption based on the conclusions of Brennan and Schwartz (1978) whereby the optimal strategy is continuously rolled over short term debt under the constraint of positive net asset value. Thus, as long as the firm is able to repay its debt, the debt is automatically rolled over. 4 An alternative procedure involves defining a spanning asset for g and using it in an asset pricing model to determine g’s required rate of return. A hedge portfolio consisting of one unit of X(g(t)) and a short position of units of g is then constructed and Ito’s lemma applied to give the same differential equation as (4) with

The solution to 2.7 is:

X = K1 g^η1 + K2 g^η2    (2.8)

where η1 > 1 and η2 < 0 are the roots of the quadratic equation in η:

η1, η2 = [−(µ − δ²/2) ± √((µ − δ²/2)² + 2δ²r)] / δ²    (2.9)

The boundary conditions are as follows. 1) When the amount of debt outstanding is zero, the option has no value:

X(0) = 0    (2.10)

which makes K2 = 0. 2) There will be a level of g, call it g∗, where it will be optimal for the sovereign to exercise its option to default. At this point, the sovereign pays C/C = 1 and receives (D/C)∗ = g∗:

X(g∗) = g∗ − 1    (2.11)

3) The smooth pasting condition that precludes arbitrage opportunities is:

X′(g∗) = 1    (2.12)

Solving 2.11 and 2.12 simultaneously for g∗ and K1 gives the solution:

X = K1 g^η1    (2.13)

where

K1 = (1/(η1 − 1)) g∗^(−η1)    (2.14)

and

g∗ = η1/(η1 − 1)    (2.15)

Thus, the value of the option to default is given as:

x = C K1 g^η1    (2.16)
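To make the mechanics of equations 2.4-2.5, 2.9 and 2.13-2.16 concrete, the following minimal Python sketch computes the option value for one parameter set. The function name and all input values are illustrative assumptions, not figures from the chapter; the formula is valid while the option is alive, i.e. g < g∗.

```python
import math

def option_to_default(D, C, r, alpha, pi, sigma, omega, rho):
    """Value the sovereign's option to default via equations 2.4-2.5, 2.9, 2.13-2.16."""
    mu = alpha - pi - sigma * omega * rho + omega ** 2           # (2.4)
    delta2 = sigma ** 2 - 2 * sigma * omega * rho + omega ** 2   # (2.5), delta squared
    a = mu - delta2 / 2
    eta1 = (-a + math.sqrt(a ** 2 + 2 * delta2 * r)) / delta2    # positive root of (2.9)
    g_star = eta1 / (eta1 - 1)                                   # (2.15), optimal default point
    K1 = g_star ** (-eta1) / (eta1 - 1)                          # (2.14)
    g = D / C                                                    # debt per dollar of default cost
    x = C * K1 * g ** eta1                                       # (2.16), value of the option
    return x, g, g_star

# Purely illustrative inputs (r, alpha, pi, omega, rho as in the first column of Table 2.2):
x, g, g_star = option_to_default(D=50.0, C=100.0, r=0.1, alpha=0.04,
                                 pi=0.02, sigma=0.1, omega=0.1, rho=0.9)
print(f"g = {g:.2f}, g* = {g_star:.2f}, x = {x:.3f}")
```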

It turns out that besides α and σ, the expected growth rate and volatility of the debt itself, there are three other parameters that influence the value of the option to default: π, the trend parameter of default costs; ω, the volatility parameter of the percentage change in C; and ρ, the instantaneous correlation coefficient between D and C. Table 2.1 shows how changes in the parameters that drive the exercise price affect the value of x, the option to default. An increase in the trend parameter of D increases the value of the option; an increase in the trend parameter of C decreases the value of the option, as does an increase in the correlation of the changes in the exercise price with changes in the amount of debt outstanding.
The intuition behind these results is straightforward. Increases in the amount of debt outstanding (the underlying) increase the value of the call, increases in the exercise price reduce the value of the call, and the more the two move together, the more increases in the value of the debt outstanding are offset by increases in the cost of default. Interestingly, the effect of changes in σ and ω, the volatilities of D and C, is ambiguous. Depending on the levels of the other parameters, as well as on the levels of σ and ω themselves, the effect can be positive, negative or zero. In fact, because of the interaction of σ, ω and ρ in determining δ and µ, an increase in either σ or ω up to a certain critical level reduces the value of x; above this critical level, further increases raise the value of x. Table 2.2 shows the critical level of σ for different levels of ω and ρ, and Table 2.3 shows the critical level of ω for different levels of σ and ρ. When they are below the critical value, an increase decreases the value of the option; when they are above it, an increase increases the value of the option. In the first column of Table 2.2, for example, when σ is below 0.12, an increase in σ decreases the value of the option; above 0.12 it increases the option's value. Thus, with two sources of risk, the situation is considerably more complicated than the simple case of a single volatility parameter, where increases always increase the value of the option. This might be an important consideration in country risk analysis if, for some reason, the parameters of the variables themselves or the correlation between them are thought likely to change, thereby changing the probability of default and the value of the option. This is something we will come back to in Section 2.3.

Table 2.1. Comparative Statics on the Option to Default

∂Y/∂α > 0    ∂Y/∂π < 0    ∂Y/∂ρ < 0    ∂Y/∂σ ≤ 0 or ≥ 0    ∂Y/∂ω ≤ 0 or ≥ 0

Table 2.2. Critical Values of σ for different levels of ω and ρ.

                      (1)     (2)     (3)     (4)
r                     0.1     0.1     0.1     0.1
α                     0.04    0.04    0.04    0.04
π                     0.02    0.02    0.02    0.02
ω                     0.1     0.1     0.3     0.3
ρ                     0.9     0.3     0.9     0.3
Critical value of σ   0.12    0.05    0.41    0.61

Table 2.3. Critical Values of ω for different levels of σ and ρ.

                      (1)     (2)     (3)     (4)
r                     0.1     0.1     0.1     0.1
α                     0.04    0.04    0.04    0.04
π                     0.02    0.02    0.02    0.02
σ                     0.1     0.1     0.3     0.3
ρ                     0.9     0.3     0.9     0.3
Critical value of ω   0.07    0.02    0.21    0.06

2.3. Implementation

2.3.1. Parameter Estimation

The first step is to determine the base currency, which could theoretically be any freely convertible currency. From a practical point of view, the base currency should also be one

that is widely used internationally and supported by a large economy producing a wide range of industrial, agricultural, technological and financial products. The US dollar fills the bill, and thus I use it as the base currency in which all calculations are effected. The variable x, the value of the option to default, is calculated from equation 2.16. This involves estimating D and C and their parameters as presented in equations 2.1 and 2.2. For D, I take the time series of total foreign debt outstanding as reported in the World Debt Tables. I then use this data to estimate the historical growth rate and standard deviation of D. To adjust for risk neutrality, I estimate the average risk-free rate and the average return on foreign debt RD over the period. From footnote 4, the risk-neutral growth rate is then equal to the historical growth rate less RD − r, the risk premium. There is no reliable data source for observing C directly. However, two calculable measures have been proposed in the literature. Saunders (1986) suggests using the present value from time t to ∞ of GDP lost as a consequence of sanctions, seizures, indemnities, higher interest rates or being shut out of the international capital market altogether, while Clark (1991) suggests using the present value from time t to ∞ of net exports lost to creditors.⁵ Both of these measures are stock concepts. The advantage of the latter measure is that it is based on a theoretical rationale and a practical methodology for actually estimating these losses and the part that will accrue to creditors, while the former is not. Consequently, I use the Clark methodology. This methodology involves a period-by-period estimation of the expected present value of a country's net exports and of the percentage of this value, assumed constant for simplicity, that would be lost in the case of default. This percentage can be estimated as the percentage of foreign resources in gross fixed capital formation.⁶ I then proceed as before and use this data to estimate the historical growth rate and standard deviation of C. To adjust for risk neutrality, I estimate the average return on the present value of the country's net exports RC over the period. From footnote 4, the risk-neutral growth rate is then equal to the historical growth rate less RC − r, the risk premium. Finally, the time series of D and C are used to calculate the historical correlation between the two. Substituting all this information into equations 2.4 and 2.5 gives the required parameters for g, which make it possible to solve equations 2.13 and 2.16.

⁵ A brief restatement of the methodology can be found in Clark (2002, Chapter 12).
⁶ The proportion of foreign resources in gross fixed capital formation is measured as the sum of the current account balance over the period divided by the sum of gross fixed capital formation over the period.
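Before turning to risk monitoring, the estimation steps just described can be summarized in a short sketch. All series, return figures and the helper name below are hypothetical placeholders for the World Debt Tables data and the Clark net-export estimates, not data from the chapter.

```python
import numpy as np

def growth_and_volatility(series):
    """Historical growth rate and standard deviation of a series' log differences."""
    log_ret = np.diff(np.log(series))
    return log_ret.mean(), log_ret.std(ddof=1)

# Hypothetical annual series for foreign debt D and the estimated cost of default C:
D_series = np.array([40.0, 43.1, 47.0, 49.2, 54.8, 58.1])
C_series = np.array([90.0, 92.5, 96.0, 101.2, 103.0, 108.5])
R_D, R_C, r = 0.08, 0.07, 0.05            # average returns and risk-free rate (assumed)

alpha_hist, sigma = growth_and_volatility(D_series)
pi_hist, omega = growth_and_volatility(C_series)
alpha = alpha_hist - (R_D - r)            # risk-neutral growth rate of D (footnote 4)
pi = pi_hist - (R_C - r)                  # risk-neutral growth rate of C
rho = np.corrcoef(np.diff(np.log(D_series)),
                  np.diff(np.log(C_series)))[0, 1]   # historical correlation of D and C
```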

2.3.2. Risk Monitoring

The foregoing model can be used for measuring and monitoring sovereign risk. The distance to default can be measured as g∗ − g, the difference between the optimal ratio of D to C and the current value of this ratio. Default occurs when g = g∗. The probability of default, then, is the probability that g will reach g∗. To calculate it, the fact that g is lognormally distributed gives:

P[g < g∗] = 1 − P[g > g∗]    (2.17)

where

P[g > g∗] = (1/√(2Πδ²T)) ∫_{−∞}^{c1} (1/g) e^{−(ln g − m)²/(2δ²T)} dg    (2.18)

and

c1 = [ln(g/g∗) + (µ − δ²/2)T] / (δ√T)    (2.19)

To implement this procedure, we determine the time horizon or the family of time horizons T = T1, T2, ..., Tn deemed to be relevant and apply equations 2.17-2.19. The model is also a useful tool for assessing the effect of parameter changes on the probability of default. Expected changes in the parameters of D and C, due to political, structural or market-based events, and in their correlation can be plugged into equations 2.18-2.19 to measure their effect. One example would be how an expected rise (fall) in the US interest rate will affect the country's probability of default. Another, more complicated example is that changes in a country's economic structure due to privatization, protectionism, subsidies and the like can have an impact on the rate of growth and volatility of the country's foreign debt as well as on the cost of default. A higher (lower) rate of growth of debt and a lower (higher) rate of growth of the cost of default will raise (lower) the value of the option, decrease (increase) the distance to default, and increase (decrease) the probability of default. On the other hand, and contrary to intuition, the model shows that increases (decreases) in the volatility of these two variables do not necessarily imply an increase (decrease) in the value of the option, a decrease (increase) in the distance to default, and an increase (decrease) in the probability of default. Everything depends on the size of the change and on the starting and ending values with respect to the critical values of the volatilities. The changes can be both complementary and offsetting. The model makes it possible to measure the total net effect in a coherent and rigorous manner. It is clear from the foregoing discussion that the model and the parameters it generates could be applied to a wide range of situations and scenarios to supply important information for monitoring and managing sovereign default risk.
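The monitoring calculation in equations 2.17-2.19 reduces to evaluating the standard normal distribution at c1. The sketch below does this for a family of horizons; all parameter values are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def default_probability(g, g_star, mu, delta, T):
    """P[g_T > g*] for lognormal g: Phi(c1), with c1 from equation 2.19."""
    c1 = (log(g / g_star) + (mu - delta ** 2 / 2) * T) / (delta * sqrt(T))
    return NormalDist().cdf(c1)

g, g_star = 0.9, 1.32          # hypothetical current ratio and default trigger
mu, delta = 0.021, 0.045       # drift and volatility of g, from (2.4)-(2.5)
print("distance to default:", g_star - g)
for T in (1.0, 2.0, 5.0):      # a family of horizons T1, T2, ..., Tn
    print(f"T = {T}: P(default) = {default_probability(g, g_star, mu, delta, T):.4f}")
```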

Conclusion

This paper examines sovereign creditworthiness in a framework that recognizes the reality of national sovereignty and the importance of the government's willingness to pay as a determinant of sovereign default. Using standard methods of stochastic calculus, the government's willingness to pay is modeled as an American-style call option on the nominal amount of debt outstanding with a stochastic exercise price. I show that it depends on the

optimal exercise value of the government’s option to default, which can be broken down into six parameters: the riskless rate of interest, the rate of growth of outstanding debt and its standard deviation, the rate of growth of the government’s cost of default, its standard deviation and its correlation with the evolution of outstanding debt. The model can be used to estimate default probabilities over different time horizons. It can also be used to measure the effects of anticipated changes in the model’s parameters on the default probabilities and the distance to default.

References

[1] Berg, A. and J. Sachs (1988), "The debt crisis: structural explanations of country performance," Journal of Development Economics, 29(3), 271-306.
[2] Black, F. and J. Cox (1976), "Valuing corporate securities: some effects of bond indenture provisions," Journal of Finance, 31, May, 351-367.
[3] Borensztein, E. and G. Pennacchi (1990), "Valuing interest payment guarantees on developing country debt," IMF Staff Papers.
[4] Brennan, M. and E. Schwartz (1978), "Corporate income taxes, valuation and the problem of optimal capital structure," Journal of Business, 51, 103-114.
[5] Callier, P. (1985), "Further results on countries' debt servicing performance: the relevance of structural factors," Weltwirtschaftliches Archiv, 121, 105-115.
[6] Chance, D. (1990), "Default risk and the duration of zero coupon bonds," Journal of Finance, 45, 265-274.
[7] Clark, E. (1997), Cross Border Investment Risk, London: Euromoney Publications.
[8] Clark, E. (2002), International Finance, London: International Thomson Business Press.
[9] Clark, E. and A. Zenadi (1999), "Sovereign debt discounts and the unwillingness to pay," Revue Finance, 20(2), 185-199.
[10] Cline, W.R. (1984), International Debt: Systemic Risk and Policy Response, Washington, DC: Institute for International Economics.
[11] Duffie, D. and K. Singleton (1999), "Modeling term structures of defaultable bonds," Review of Financial Studies, 12, 687-720.
[12] Eaton, J. and M. Gersovitz (1981), "Debt with potential repudiation: theoretical and empirical analysis," Review of Economic Studies, 48(152), 289-309.
[13] Eaton, J., M. Gersovitz and J. Stiglitz (1986), "A pure theory of country risk," European Economic Review.
[14] Feder, G. and R. Just (1977), "A study of debt servicing capacity applying logit analysis," Journal of Development Economics, 4, March, 25-38.
[15] Hajivassiliou, U.A. (1989), "Do the secondary markets believe in life after debt?," Working Paper No. 252, International Economics Department, The World Bank, 1-42.
[16] Ho, T. and R. Singer (1982), "Bond indenture provisions and the risk of corporate debt," Journal of Financial Economics, 10, 375-406.
[17] Huizinga, H. (1989), "How has the debt crisis affected commercial banks?," Working Paper No. 195, International Economics Department, The World Bank, 1-32.
[18] International Financial Statistics, Washington, D.C.: The International Monetary Fund, several issues.
[19] Jarrow, R.A. and S.M. Turnbull (1995), "Pricing derivatives on financial securities subject to credit risk," Journal of Finance, 50(1), 53-85.
[20] Judge, G.G., et al. (1985), The Theory and Practice of Econometrics, Wiley Series in Probability and Mathematical Statistics, New York: John Wiley and Sons.
[21] Kim, J., K. Ramaswamy and S. Sundaresan, "Does default risk in coupons affect the valuation of corporate bonds?: A contingent claims model," Financial Management, 117-131.
[22] Lando, D. (1994), "On Cox processes and credit risky bonds," Working Paper, Institute of Mathematical Statistics, University of Copenhagen.
[23] Leland, H. (1994), "Corporate debt value, bond covenants, and optimal capital structure," Journal of Finance, 49, September, 1213-1252.
[24] Longstaff, F. and E. Schwartz (1995), "A simple approach to valuing risky fixed and floating rate debt," Journal of Finance, 50(3), 789-819.
[25] Madan, D. and H. Unal (1998), "Pricing the risks of default," Review of Derivatives Research, 2, 79-105.
[26] Merton, R. (1973), "Theory of rational option pricing," Bell Journal of Economics and Management Science, 4, 141-183.
[27] Merton, R. (1974), "On the pricing of corporate debt: the risk structure of interest rates," Journal of Finance, 29, May, 449-470.
[28] Merton, R. (1977), "On the pricing of contingent claims and the Modigliani-Miller theorem," Journal of Financial Economics, 5, 241-249.
[29] Modigliani, F. and M. Miller (1958), "The cost of capital, corporation finance and the theory of investment," American Economic Review, 48, June, 261-297.
[30] Palac-McMiken, E.D. (1995), Rescheduling, Creditworthiness and Market Prices, London: Avebury.
[31] Saunders, A. (1986), "The determinants of country risk," Studies in Banking and Finance, 3, 2-38.

In: Progress in Financial Markets Research
Editors: C. Kyrtsou and C. Vorlow, pp. 25-70

ISBN: 978-1-61122-864-9
© 2012 Nova Science Publishers, Inc.

Chapter 3

SUCCESS AND FAILURE OF TECHNICAL ANALYSIS IN THE COCOA FUTURES MARKET

Peter Boswijk, Gerwin Griffioen and Cars Hommes
Department of Quantitative Economics, CeNDEF, University of Amsterdam, Amsterdam

3.1. Introduction

Fundamental and technical analysis encompass a myriad of techniques that are used in financial practice to study and explain the movements of financial markets. Their main aim is to predict future price movements by making use of publicly available information. For example, fundamental analysis is based on macro and micro economic variables, while technical analysis is based on past price and volume data alone. In this paper we focus on the effectiveness of technical analysis. The study in this paper began as an attempt to answer questions raised by a financial practitioner, Guido Veenstra, employed at the Dutch cocoa-trading firm Unicom International. Unicom buys crops of cocoa beans in Africa, the Far East and South America. The cocoa beans are shipped to Europe, where they are ground and processed into cocoa butter, cocoa mass and cocoa powder. These semi-finished cocoa products serve as production factors in the chocolate industry. The first aim of Unicom is to trade the raw cocoa beans and sell the semi-finished cocoa products to chocolate manufacturers. To secure profit, a second important task of Unicom is to control currency and cocoa price risk. These risk exposures are hedged by using currency and cocoa futures contracts. In addition to companies that physically trade cocoa products, more and more speculators seem to be active in the cocoa futures market. These speculators use, among other tools, technical analysis for forecasting. When many speculators who control large amounts of money trade in a market, they may affect prices through their behavior. The question "Can cocoa futures prices be predicted by technical analysis?" thus becomes important from a practitioner's viewpoint. This question is not only important to the cocoa business, but in general to any company hedging the price risk of a commodity. Why should a company maintain a short trading posture in a futures contract to hedge its price risk exposure if it knows that many speculators using technical analysis are demanding long trading postures and are thus putting upward pressure on the price?

Knowledge of the behavior of speculators in the market may be useful for adapting a company's price-hedging strategy. Until fairly recently the academic literature paid little attention to technical analysis. The efficient markets hypothesis (EMH) was until the 1980s the dominant paradigm in finance; see e.g. Fama (1970) and Samuelson (1965). According to a strong version of the EMH, all information is immediately discounted in the price of a financial asset. The price will only adapt if new information becomes available. Because news enters the market randomly, the price will adapt randomly. Therefore financial asset prices should be unpredictable, and according to the EMH one should not be able to predict future price movements by using technical analysis. In the last decade, however, technical analysis has regained the interest of many economic researchers. Several authors have shown that financial asset prices are predictable to some extent, either from their own past or from some other publicly available information; see e.g. Fama and French (1988), Lo and MacKinlay (1988, 1997, 1999) and Pesaran and Timmermann (1995, 2000). In particular, it has been shown that simple technical forecasting rules can have statistically significant forecasting power and can yield statistically significant economic profits. For example, Brock, Lakonishok and LeBaron (1992) test 26 of these simple technical forecasting rules (based on moving averages and support & resistance) on daily data of the Dow-Jones Industrial Average (DJIA) in the period 1897-1986. They find that each forecasting rule predicts periods with positive returns better than periods with negative returns. Further, they find that returns following buy signals are less volatile than returns following sell signals. They were the first to extend standard statistical analysis with parametric bootstrap techniques, and they show that their results in favor of technical analysis are not consistent with data generating processes such as the random walk, the first-order autoregressive model, the GARCH-in-mean model and the exponential GARCH model. LeBaron (2000) extends the analysis of Brock et al. (1992) to the period 1988-1999. He finds that technical forecasting rules perform much worse in this period, but that volatility in returns remains different between periods following buy and sell signals. Levich and Thomas (1993) test technical forecasting rules (based on moving averages and Alexander's (1961) filters) on foreign currency futures prices in the period 1976-1990. By applying bootstrap techniques they find that profits of technical forecasting rules cannot be explained by a random walk model or by autocorrelation in the data. LeBaron (1993) tests technical forecasting rules (based on interest rate differentials, moving averages and volatility comparison) on exchange rates. He concludes that these forecasting rules can predict price changes. Several authors have emphasized the danger of data snooping: if one searches long enough through a data set, there will always appear one forecasting rule that seems to work. Many authors mitigate this problem by reporting the robustness of their results across different subperiods or by testing only forecasting rules that are reported to be frequently used in financial practice.
However, Sullivan, Timmermann and White (1999) noted that these forecasting rules could be the result of survivorship bias, since the forecasting rules currently used in practice can be the result of a continuous search for the best forecasting rule. Therefore they propose to use White's (2000) Reality Check bootstrap methodology to correct for data snooping. In their study, Sullivan et al. (1999) take a closer look at the results of Brock et al. (1992). Using the same 26 technical forecasting rules, they conclude that the results of Brock et al. (1992) in the period 1897-1986 are robust to data snooping. However, they find that in the period 1987-1997 the best technical forecasting rule does not perform statistically significantly better than a buy-and-hold trading strategy. Extending the set of 26 technical forecasting rules to a universe of 7846 technical forecasting rules, Sullivan et al. (1999) show that these findings do not change. They conclude that the worse performance of technical analysis in the period 1987-1997 may be explained by a change of the market mechanism, e.g. an increase in market efficiency due to lower transaction costs and increased liquidity. For a comprehensive survey of the history of technical analysis and an extensive study of the effectiveness of technical forecasting rules in financial markets, see Griffioen (2003). The present paper is empirical and tests the profitability and predictability of objective trend-following technical forecasting rules in the cocoa futures markets in the period January 1983 until June 1997. In order to avoid the problem of data snooping, our approach is to test a large set of more than 5000 technical forecasting rules, based on moving averages, support & resistance and Alexander's (1961) filters. We study the magnitude of the fraction of the forecasting rules that earn a statistically significantly positive excess return and show statistically significant forecasting power. Cocoa futures contracts are traded at two different exchanges, namely the Coffee, Sugar and Cocoa Exchange (CSCE) in New York and the London International Financial Futures Exchange (LIFFE)¹. The results of applying technical analysis to the prices of cocoa futures contracts traded at these two exchanges are strikingly different. It is found that 14% of all forecasting rules that are applied to the LIFFE quoted cocoa futures price earn a statistically significantly positive excess return, even when corrected for transaction costs. Furthermore, a large set of forecasting rules shows statistically significant forecasting power, with e.g. 27% showing a statistically significantly positive difference between the returns of the cocoa futures price in periods after a buy signal and in periods after a sell signal; for the 5-year subperiod January 1983 until December 1987 this fraction is even 47%. However, the same set of technical forecasting rules performs poorly when applied to the CSCE quoted cocoa futures price, with only 0.3% earning a statistically significantly positive net excess return and hardly any showing statistically significant forecasting power. This large difference in the effectiveness of technical analysis is surprising, because the underlying asset in both markets is more or less the same. Our findings may be attributed to a combination of demand and supply of cocoa beans and an accidental influence of the Pound-Dollar exchange rate. Due to a spurious relation between the level of the Pound-Dollar exchange rate and the demand and supply of cocoa beans, especially in the period January 1983 until December 1987, price trends were strengthened in the LIFFE quoted cocoa futures price, while the same price trends were weakened in the CSCE quoted cocoa futures price. Hence many technical forecasting rules were able to pick up the sufficiently strong trends in the LIFFE quoted cocoa futures price, but almost none of them were able to pick up the weaker trends in the CSCE quoted cocoa futures price. This paper is organized as follows.
In section 3.1.1. the data sets are described. It is also shown how a continuous time series of 15 years of daily data can be constructed out of the daily data of 160 temporarily existing futures contracts. Section 3.2. presents the technical forecasting rules that are studied; the parameterizations of these rules can be found in Appendix B. Section 3.3. describes how technical forecasting rules are implemented by trading strategies and how profits are measured. Section 3.4. focuses on the statistical significance of the profitability and predictability of the technical forecasting rules. First, the statistical tests are done under the assumption of independent and identically distributed (iid) returns. Thereafter a correction is made for dependence in the data. This is done firstly by estimating exponential GARCH models with a dummy for the trading posture in the regression equation. Secondly, in section 3.5. bootstrap techniques are applied. In section 3.6. a possible explanation is put forward for the large difference that is found in the predictability of the CSCE and LIFFE quoted cocoa futures prices. Finally, section 3.7. concludes.

¹ As to date the LIFFE is a subsidiary of Euronext and the CSCE is a subsidiary of the New York Board of Trade.

3.1.1. Data

3.1.2. Data Series

A cocoa futures contract that is traded at the CSCE and LIFFE is a standardized agreement between two parties to trade an amount of cocoa beans at some future date against a certain price. The party who is obliged to buy or sell the cocoa beans is said to keep the long or short trading posture in the futures contract. The contract specifies the quality and quantity of the cocoa beans as well as the time and place of delivery. The agreed price that is specified in the futures contract is called the futures price. This price is determined by demand and supply in the futures market. The expiry months of cocoa futures contracts are March, May, July, September and December. Each contract asks for the delivery of ten tons of cocoa. The LIFFE contract specifies that at each trading day ten expiry months are available for trading. The CSCE and LIFFE cocoa futures contracts differ somewhat in their specifications. First, cocoa is grown in many regions in Africa, Asia and Latin America, and therefore the crops differ in quality. In the futures contracts a benchmark is specified and the other crops are traded at premiums. The benchmark in the LIFFE contract has a higher quality than the benchmark in the CSCE contract. Therefore the benchmark in the LIFFE contract is traded at a $160/ton premium² over the benchmark in the CSCE contract. Second, the place of delivery in the CSCE contract is near New York, while the places of delivery in the LIFFE contract are nominated warehouses at different places in Europe. Third, the tick sizes of the CSCE and LIFFE quoted futures prices are respectively one Dollar and one Pound. We collected data on the settlement prices of the 160 cocoa futures contracts that expired in the period January 1982 through December 1997 at the CSCE and the LIFFE³. Furthermore, for the same period we collected data on the Pound-Dollar exchange rate (WM/Reuters) and on rates of 1-month UK and US certificates of deposit (COD).

² Contract specifications of January 26, 1998.
³ We thank the cocoa-trading firm Unicom International B.V. and ADP Financial Information Services for providing the data.

3.1.3. A Continuous Time Series of Futures Prices

Each cocoa futures contract is traded prior to expiration during a period of approximately 18 months. Thus there is no continuous time series of the cocoa futures price available encompassing a period of multiple years. This section describes how a continuous time series of daily prices covering multiple years can be constructed out of the daily prices of the separate, temporarily existing futures contracts. The well-known formula for the price of a futures contract at day t that expires at day T is

F_t = S_t exp{(r^f_t + u_t − y_t)(T − t)}.    (3.1)

Here S_t is the spot price of the underlying asset at time t, and r^f_t, u_t, y_t are respectively the expected daily risk-free interest rate, storage costs and convenience yield averaged over the period (t, T] at time t with continuous compounding. The convenience yield can be seen as the utility of having the asset in stock. The term (r^f_t + u_t − y_t) is called the one-period cost of carry and (r^f_t + u_t − y_t)(T − t) is called the total expected cost of carry of keeping the asset in stock until the time of expiration T. Equation (3.1) is also called the cost of carry relationship. The daily return r^F_t of the futures contract, expressed as the log difference, is given by

r^F_t = r^S_t + (∆r^f_t + ∆u_t − ∆y_t)(T − t) − (r^f_{t−1} + u_{t−1} − y_{t−1}),    (3.2)

where ∆z_t = z_t − z_{t−1} for z_t = r^f_t, u_t or y_t. This formula shows that a change in one of the factors of the one-period cost of carry has an impact on the futures price. Otherwise, the return of a futures contract is equal to the excess return of the underlying asset over the one-period cost of carry⁴. Assume that we have two futures contracts, 1 and 2, with expiry dates T1 < T2, futures prices F_{1,t}, F_{2,t} and one-period cost of carry variables r^f_{i,t}, u_{i,t}, y_{i,t} for i = 1, 2. The futures price of contract 2 can be expressed in terms of the futures price of contract 1:

F_{2,t} = F_{1,t} exp{(r^f_{2,t} + u_{2,t} − y_{2,t})(T2 − t) − (r^f_{1,t} + u_{1,t} − y_{1,t})(T1 − t)}.    (3.3)

If, as is usual, the total expected cost of carry of futures contract 2 is larger than the total expected cost of carry of futures contract 1, then the price of futures contract 2 is higher than the price of futures contract 1. However, sometimes the price of futures contract 2 is lower than the price of futures contract 1, because there is an expected shortage of the commodity in the short run, but not in the long run. In that case y_{1,t} is much larger than y_{2,t}, causing the total expected cost of carry of futures contract 2 to be smaller than the total expected cost of carry of futures contract 1. This shows that the prices of different futures contracts move at different price levels. Further, it follows from (3.3) that futures contract 2 inherits its price trend from futures contract 1. In this study a long continuous time series of cocoa futures prices is needed in order to be able to test technical forecasting rules with long memory. The continuous time series must be constructed out of the many price series of the different temporarily existing futures contracts that show the same price trends, but move at different price levels. In particular, roll over dates must be defined. These are dates at which we assume that a trader exchanges a trading posture in one futures contract for the same trading posture in another futures contract that has an expiry date further into the future. In practice most trading occurs in the second nearest contract, that is, the futures contract with the next-to-nearest expiration date. We investigated the liquidity of the cocoa futures contracts and decided to take as roll over dates the date one month before the contract with the nearest expiry date expires. This ensures that prices of highly liquid futures contracts are always used. Figure 3.1 exhibits the roll over scheme graphically.

⁴ Note that r^f_{t−1}, u_{t−1}, y_{t−1} are measured over the period (t−1, t]; that is, e.g., r^f_{t−1} is the interest that is earned in the period (t−1, t].

Figure 3.1. Roll over scheme. As an example the time axis shows the roll over dates from December 1, 1993 until March 1, 1995. The arrows above the time axis show in which period which futures contract, as identified by the month of expiration, is used in constructing the continuous futures price series.
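As a small numerical illustration of the cost of carry relationship (3.1), the sketch below prices a futures contract from a hypothetical spot price and hypothetical daily interest, storage and convenience-yield rates; none of these figures come from the chapter.

```python
import math

def futures_price(spot, r_f, u, y, days_to_expiry):
    """Equation (3.1): F_t = S_t * exp{(r_f + u - y)(T - t)}, with daily rates."""
    carry = (r_f + u - y) * days_to_expiry   # total expected cost of carry
    return spot * math.exp(carry)

# Hypothetical daily rates: risk-free interest, storage cost, convenience yield.
F = futures_price(spot=1000.0, r_f=0.10 / 252, u=0.02 / 252, y=0.01 / 252,
                  days_to_expiry=180)
print(f"futures price: {F:.2f}")   # above spot, since the cost of carry is positive
```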

Murphy (1986) suggests pasting prices of two successive futures contracts to study price movements over a long period of time. However, this method introduces price jumps in the continuous time series, because the prices of two different contracts move at different levels. These price jumps may trigger spurious signals if technical forecasting rules are tested. Furthermore, due to the price jumps, the returns of the continuous time series at the roll over dates no longer represent the "true" returns. Therefore a continuous time series must be constructed in another way. The holder of the long trading posture in a futures contract pays a time premium to the holder of the short trading posture. According to (3.1) the time premium paid at time t is

TP_t = F_t − S_t = (exp{(r^f_t + u_t − y_t)(T − t)} − 1) S_t.    (3.4)

Logically TP_T = 0. According to (3.4) the time premium that is paid will be less when the time until expiration is shorter, other things being equal. However, (3.4) also implies that if a continuous time series of futures prices is constructed by pasting the prices of futures contracts, then at each pasting date⁵ a new time premium is added to the time series, because at each pasting date a futures contract is exchanged with one that has a longer time until expiration. This causes price jumps and therefore an upward force in the global futures price development. In fact, even if over some period the return of the underlying asset is mostly not larger than the one-period cost of carry, which according to (3.2) should cause mostly negative futures returns and hence a downward sloping price trend, a spurious upward price trend can nevertheless be observed in the continuous price series.

⁵ The pasting date is equal to the roll over date.

Figure 3.2. Two continuous time series of CSCE quoted cocoa futures prices in the period January 1982 through June 1997. The upper time series is constructed by pasting the prices of futures contracts that are exchanged at the roll over dates. The lower continuous time series is constructed by pasting the returns. As starting value the price of the May futures contract at January 3, 1983 is chosen.

This effect is illustrated in figure 3.2. Thus pasting prices may not correctly reflect long-term price movements and may affect the performance of long-memory forecasting rules. Therefore in this study a continuous time series of futures prices is constructed by pasting the returns of the futures contracts that are exchanged at the roll over dates and by choosing an appropriate starting value; see figure 3.2. For this continuous series there are no discontinuous price jumps or spurious price trends. Any price trends that are present in the series reflect real profitability of trading postures in futures contracts.
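A minimal sketch of this pasting procedure, with purely hypothetical prices: each segment holds the active contract's prices from the day before its holding period starts, so every daily return is computed within a single contract and the level gap between contracts never enters the series.

```python
import numpy as np

def paste_returns(segments, start_value):
    """Paste daily log returns of successive futures contracts into one series.

    Each segment contains the active contract's prices from the day before its
    holding period (the previous roll over date) through its last active day.
    """
    log_ret = np.concatenate([np.diff(np.log(s)) for s in segments])
    return start_value * np.exp(np.cumsum(log_ret))   # no jumps at the roll over dates

# Hypothetical example with one roll over: the March contract is active first,
# then the May contract; the level gap (105 vs 98) never enters the pasted series.
march = [100.0, 101.0, 103.0, 105.0]
may = [98.0, 99.5, 97.0, 98.5]          # prices from the roll over date onward
series = paste_returns([march, may], start_value=100.0)
print(series)
```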

3.1.4. Summary Statistics

Figure 3.3 shows the continuation series of the CSCE and LIFFE quoted cocoa futures prices as well as the Pound-Dollar exchange rate (BPDo) in the period 1982:1-1997:6.⁶ The return series are also shown. The long-term and short-term price trends can be seen clearly. Each technical forecasting rule uses a different amount of past data to make its first prediction. Therefore the first 260 observations in each data set are used to initialize the forecasting rules, so that on January 3, 1983 each rule advises some trading posture. All forecasting rules will be compared from this date on. Table 1 shows the summary statistics of the daily returns in the period 1983:1-1997:6 and in three subperiods of five years. Returns are calculated as the natural log differences of the level of the data series. The first subperiod, 1983:1-1987:12, covers a period in which all three series exhibit first a long-term upward price trend and thereafter a downward price trend; see figure 3.3. It is remarkable that the upward and downward price trends in both cocoa continuation series (accidentally) coincide with similar trends in the Pound-Dollar exchange rate series. In the second subperiod, 1988:1-1992:12, the cocoa continuation series exhibit a downward price trend, while the Pound-Dollar series fluctuates upwards and downwards. The third subperiod, 1993:1-1997:6, covers a period in which both cocoa continuation series as well as the Pound-Dollar exchange rate no longer seem to show significant long-term price trends. From table 1 it can be seen that the mean daily returns are close to zero for all periods. The largest (absolute) mean daily return is negative 9.5 basis points per day, -21.2% per year, for the CSCE cocoa continuation series in the second subperiod. The daily standard deviation of the CSCE cocoa returns is slightly, but significantly⁷, larger than the daily standard deviation of the LIFFE cocoa returns in all periods. The daily volatility in the returns of the Pound-Dollar exchange rate is much smaller, by a factor of more than two as measured in standard deviations, than the volatility in the returns of both cocoa continuation series in all periods. All data series show excess kurtosis in comparison with a normal distribution and show some signs of skewness. The table also shows the maximum consecutive decline of the data series in each period. For example, the CSCE cocoa continuation series declined by 85.1% in the period May 23, 1984 until February 20, 1997. The Pound lost 47.5% of its value against the Dollar in the period February 27, 1985 until September 2, 1992. Hence, if objective trend-following technical forecasting rules can avoid being in the market during such periods of great depreciation, large profits can be made. Table 2 shows the estimated autocorrelation functions of the returns, up to order 20, for all data series over all periods. Typically autocorrelations are small, with only a few lags being significant⁸. The CSCE cocoa returns series shows little autocorrelation; only in the first subperiod is the second order autocorrelation significantly negative at the 5% significance level. The LIFFE cocoa returns series shows some signs of low order autocorrelation, significant at the 10% level in the first two subperiods. The Pound-Dollar returns series has significant first order autocorrelation at the 1% significance level, mainly in the first two subperiods.

⁶ From now on the "year:month" notation is used.

⁷ The null hypothesis H0: σ²r(csce) = σ²r(liffe) vs H1: σ²r(csce) ≠ σ²r(liffe) is tested with the test statistic F = S²r(csce)/S²r(liffe).
⁸ Because sample autocorrelation may be spurious in the presence of heteroskedasticity, we also tested for significance by computing Diebold (1986) heteroskedasticity-consistent estimates of the standard errors, that is se(k) = √[(1/n)(1 + γ(r², k)/σ⁴)], where n is the number of observations, γ(r², k) is the k-th order sample autocovariance of the squared returns, and σ is the standard error of the returns.

Figure 3.3. Time series, over the period 1983:1-1997:6, of the CSCE (top left) and LIFFE (middle left) cocoa continuation series, the Pound-Dollar exchange rate (bottom left) and corresponding returns series (right).

3.2. Forecasting Techniques in Technical Analysis

Murphy (1986) defines technical analysis as the study of past price movements with the aim of forecasting future price movements, perhaps with the aid of certain quantitative summary measures of past prices such as "momentum" indicators ("oscillators"), but without regard to any underlying economic, or "fundamental," analysis. Another description is given by Pring (1998), who defines technical analysis as the "art" of identifying trend changes at an early stage and maintaining an investment or trading posture until the weight of evidence shows or proves that the trend has reversed. There are three basic principles underlying the philosophy of technical analysis. The first is that all information is gradually discounted in the price of an asset. Eventually, the dreams, hopes and nightmares of all investors are reflected in the price through the market mechanism. A technical analyst argues that the best adviser you can get is the market itself and that there is no need to explore fundamental information. Second, technical analysis assumes that asset prices trend upward, downward or sideways. Therefore most technical forecasting rules are based on trend-following instruments. The third assumption is that history repeats itself. Under equal conditions investors will react in the same way, leading to certain geometrical patterns in a price chart. Technical analysts claim that if a pattern is recognized at an early stage, profitable trades can be made. In this study we confine ourselves to objective trend-following technical forecasting techniques that can easily be implemented on a computer. We test in total 5350 technical forecasting techniques based on moving averages (in total 2760), trading range break-out (in total 1990) and Alexander's (1961) filters (in total 600). These forecasting techniques are also tested by Brock, Lakonishok and LeBaron (1992), Levich and Thomas (1993) and Sullivan, Timmermann and White (1999)⁹. We use the parameterizations of Sullivan et al. (1999) as a starting point to construct our set of technical forecasting techniques. These parameterizations can be found in Appendix B.

3.2.1. The Moving-average Forecasting Rule

Forecasting techniques based on moving averages (MAs) are the most commonly applied forecasting rules in technical analysis¹⁰. A moving average is a recursively updated average of past prices. It smooths out an otherwise volatile series. Hence, it yields insight into the underlying price trend. In this study we use equally weighted moving averages

MA^n_t = (1/n) ∑_{j=0}^{n−1} P_{t−j},

where MA^n_t is the moving average at time t of the last n observed prices¹¹. Short (long) term price trends can be detected by choosing n small (large). The larger n, the slower the MA adapts and the more the volatility is smoothed out. Technical analysts therefore refer to a MA with a large n as a slow MA and to a MA with a small n as a fast MA.

⁹ Geometrically based technical forecasting techniques, such as head-and-shoulder pattern formation, are tested by e.g. Lo, Mamaysky and Wang (2000) using non-parametric methods.
¹⁰ From now on we denote technical forecasting techniques based on MAs as MA (forecasting) rules.
¹¹ In technical analysis, weighted and exponential MAs are also used. See e.g. Achelis (1995).

MA forecasting rules are based on one or two moving averages. A special case is the single crossover MA rule, which uses the price series itself and a MA of the price series. If the price crosses the MA upward (downward), this is considered an optimistic (pessimistic) signal. The double crossover MA rule, on the other hand, is based on two moving averages, a slow one and a fast one. The slow MA represents the long-run price trend and the fast MA represents the short-run price trend. If the fast MA crosses the slow MA upward (downward), an optimistic (pessimistic) signal is generated. We call the single and double crossover MA forecasting rules as described above the basic MA rules. These basic MA rules are extended with a %-band filter, a time delay filter, a fixed holding period and a stop-loss. The %-band filter and time delay filter are designed to reduce the number of false signals. In the case of the %-band filter, a band is introduced around the slow MA. If the price or fast MA crosses the slow MA by an amount greater than the band, a signal is generated; otherwise any trading posture is maintained. This forecasting rule does not generate signals as long as the fast MA is within the band around the slow MA. The basic MA rule extended with a b · 100% band filter is described by the trading posture generating model

Pos_{t+1} = 1, if MA^k_t > (1 + b)MA^n_t
Pos_{t+1} = Pos_t, if (1 − b)MA^n_t ≤ MA^k_t ≤ (1 + b)MA^n_t
Pos_{t+1} = −1, if MA^k_t < (1 − b)MA^n_t,

where k < n and Pos_{t+1} = −1 or 1 means keeping a pessimistic or optimistic trading posture in period t + 1¹²,¹³. Choosing b = 0, this model results in the basic MA forecasting rule. The single crossover MA rule is defined by denoting P_t = MA^0_t. According to the time delay filter, a signal must hold for d consecutive days before a trade is implemented. If within these d days different signals are generated, the trading posture will not be changed. A MA rule with a fixed holding period maintains an optimistic (pessimistic) trading posture for a fixed number of f days after a signal is generated. After f days any trading posture is liquidated and a neutral trading posture (i.e. Pos_{t+1} = 0) is maintained up to the next signal. This rule tests whether the market behaves differently in a time period after a crossing of MAs. All signals that are generated during the fixed holding period are ignored. The last extension is the stop-loss, which is based on the popular phrase "Let your profits run and cut your losses short." If a pessimistic (optimistic) trading posture is kept, then the stop-loss will liquidate the trading posture if the price rises (declines) from the most recent low (high) by at least x%. A neutral trading posture is maintained up to the next signal.
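A minimal sketch of the double crossover MA rule with a %-band filter, following the posture generating model above; the function name and the toy price series are illustrative assumptions.

```python
import numpy as np

def ma_band_postures(prices, k, n, b):
    """Double crossover MA rule with a b*100% band (basic rule when b = 0).

    k = 1 reduces to the single crossover rule, since the 1-day MA is the price.
    """
    prices = np.asarray(prices, dtype=float)
    positions = [0]                                  # start without a posture
    for t in range(n - 1, len(prices)):
        ma_fast = prices[t - k + 1:t + 1].mean()     # MA_t^k, k < n
        ma_slow = prices[t - n + 1:t + 1].mean()     # MA_t^n
        if ma_fast > (1 + b) * ma_slow:
            positions.append(1)                      # optimistic posture for t+1
        elif ma_fast < (1 - b) * ma_slow:
            positions.append(-1)                     # pessimistic posture for t+1
        else:
            positions.append(positions[-1])          # inside the band: keep posture
    return positions[1:]

# Illustrative use with a 2-day fast MA, 5-day slow MA and a 1% band:
postures = ma_band_postures([10, 11, 12, 11, 13, 14, 13, 12, 11, 10], k=2, n=5, b=0.01)
```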

3.2.2. The Trading Range Break-out Forecasting Rule

Our second group of technical forecasting techniques is based on trading range break-out (TRB), otherwise known as support-and-resistance¹⁴. The TRB forecasting rule makes use of support and resistance levels. If during a certain period of time the price does not fall below (rise beyond) a certain price level, this price level is called a support (resistance) level. According to technical analysts, there is a "battle between buyers and sellers" at these price levels. The market buys at the support level after a price decline and sells at the resistance level after a price rise. If the price breaks through the support (resistance) level, an important technical signal is generated. The sellers (buyers) have won the "battle". At the support (resistance) level the market has become a net seller (buyer). This indicates that the market will move to a subsequent lower (higher) level. The support (resistance) level will change into a resistance (support) level. To implement the TRB rule, support and resistance levels are defined as local minima and maxima of the closing prices. If the price falls (rises) through the local minimum (maximum), a pessimistic (optimistic) trading posture is initiated. If the price moves between the local minimum and maximum, the trading posture is maintained until there is a new breakthrough. The TRB rule is extended with a %-band filter, a time delay filter, a fixed holding period and a stop-loss. The trading posture generating model including the %-band filter is described by

Pos_{t+1} = 1, if P_t > (1 + b) max{P_{t−1}, P_{t−2}, ..., P_{t−n}}
Pos_{t+1} = Pos_t, if (1 − b) min{P_{t−1}, ..., P_{t−n}} ≤ P_t ≤ (1 + b) max{P_{t−1}, ..., P_{t−n}}
Pos_{t+1} = −1, if P_t < (1 − b) min{P_{t−1}, P_{t−2}, ..., P_{t−n}}

Choosing b = 0, this model results in the basic TRB forecasting rule.

¹² Note that trading postures are unchanged until the moving averages really cross.
¹³ Section 3.3. describes how optimistic and pessimistic trading postures are translated to real trading postures in a financial asset.
¹⁴ From now on we denote technical forecasting techniques based on TRB as TRB (forecasting) rules.

The Filter Forecasting Rule

The final group of technical forecasting techniques that we test are based on Alexander’s (1961) filters15 . When the price rises (falls) by at least x% from a previous low (high) a filter forecasting rule generates an optimistic (pessimistic) signal. In this study the filter rule is implemented by using a so-called moving stop-loss. In an upward trend the stoploss is placed below the price series. If the price goes up, the stop-loss will go up. If the price declines, the stop-loss will not be changed. If the price falls through the stop-loss, a pessimistic signal is generated and the stop-loss will be placed above the price series. If the price declines, the stop-loss will decline. If the price rises, the stop-loss is not changed. If the price rises through the stop-loss an optimistic signal is generated and the stop-loss is placed below the price series. Hence the stop-loss will follow the price series at most at a x% distance. The filter rule is extended with a time delay filter and a fixed holding period.

3.3. From Technical Forecasting Rule to Technical Trading Strategy If a technical forecasting rule generates a signal to initiate an optimistic or pessimistic trading posture, then we say that a buy or sell signal is generated. If the rule generates a signal to hold no trading posture, then we say that a neutral signal is generated. A forecasting rule divides the set of prices in three subsets. A buy or sell period is defined as the period following a buy or sell signal up to the next forecast signal. A neutral period is defined as 15 From

now on we denote technical forecasting techniques based on Alexander’s (1961) filters as filter (forecasting) rules.

Success and Failure of Technical Analysis in the Cocoa Futures Market

37

the period after a neutral signal up to the next buy or sell signal. The subsets consisting of buy, sell or neutral periods are called the set of buy, sell or neutral days. Our technical forecasting rules are applied to end of day data. If a buy or sell signal is generated at the end of day t, then it is assumed that a trading posture can be initiated against the settlement price at day t. We define a technical trading strategy as a technical forecasting rule combined with a certain trading strategy. The trading strategy formulates how buy, sell and neutral signals are translated to real trading postures in a financial asset. When applying a technical trading strategy to a cocoa continuation series, we define that on a buy or sell signal a long or short trading posture is initiated in the futures contract. On a neutral signal we define that any trading posture in the futures contract is liquidated. When trading a futures contract, it is required to keep some cash in a margin account with the broker. This is done to protect the broker against defaults of traders. Profits and losses are directly added and subtracted from the margin. We define that interest r f can be earned on the margin account. Further we define that the trader deposits an amount of cash M in the margin account that is equal to the price of the futures contract P. In this case the broker is fully protected against defaulting16. Thus, the margin at the end of day t is equal to Mt = (1 + rtf )Mt−1 + (Pt − Pt−1 )Post , where Mt−1 = Pt−1 if there is a forecast signal at the end of day t − 1, i.e. Post 6= Post−1 , f otherwise Mt−1 is the margin at the end of day t − 1. Note that rt is the interest earned in the period (t − 1,t]. Trading costs are computed as a fraction c of the futures price. Some forecasting rules generate signals very often, others not. If a forecasting rule does not generate signals very often and hence trading postures in a futures contract are maintained for a long time, then there are also trading costs due to the limited life span of the futures contract. In particular we assume that when a certain trading posture in a futures contract is maintained until 20 days after a roll over date, then a trade should take place since the trading posture has to be rolled over to the next futures contract and thus transaction costs must be paid. This approach leads to a fair comparison of the cost structure of forecasting rules that generate many signals with those that generate only a few signals. Finally, the gross return of a technical trading strategy (TTS) net of transaction costs in period t is computed as  M t  , if there is no trade;  Mt−1 (3.5) 1 + rtT T S = Mt 1 − c|Post−1 |   , if there is a trade. Mt−1 1 + c|Post |

Note that when Post = 0, then only interest is earned. When applying a technical trading strategy to the Pound-Dollar exchange rate, we define that on a buy signal Dollars are bought against price E. The Dollars are put in a US account that earns interest rUS . On a sell or neutral signal a trading posture is kept in Pounds. 16 In practice futures traders can deposit a margin of only 10% of the price of the contract. The broker issues frequently a margin call, that is to add money to the margin, if the trader is in a losing trading posture. However, to keep things as simple as possible we assume a fully protected trading posture by setting the required margin to 100% of the price of the futures contract.

38

Peter Boswijk, Gerwin Griffioen and Cars Hommes

These Pounds are put in an UK account that earns interest rUK . Trading costs are calculated as a fraction c of the exchange rate. The gross return of a technical trading strategy net of transaction costs in period t is computed as  1  1+c , if Dollars are bought; C= 1 − c, if Dollars are sold;  1, if there is no change in trading posture.    Et (1 + rUS if Dollars are kept t ) C, TTS Et−1 1 + rt = (3.6)   (1 + rUK ) C, if Pounds are kept. t As proxies for the US and UK interest rates we use data on interest rates of 1-month US and UK certificates of deposits (COD), which are recomputed to daily rates. Costs of trading cocoa futures contracts are set equal to 0.1% per trade, which is close to real transaction costs in futures trading. Costs of trading currency is also set to 0.1% per trade.

3.4. Effectiveness of Technical Analysis: Standard Statistical Tests 3.4.1.

The Best 5 Forecasting Rules

The effectiveness of a technical forecasting rule is studied by (1) testing the statistical significance of economic profits that are earned by an appropriate trading strategy and by (2) testing the statistical significance of the forecasting power, i.e. can periods with positive and negative returns be predicted by the forecasting rule. If a technical forecasting rule shows forecasting power, then this does not necessarily imply that it earns economic profits. Price changes could be to small to earn profits after paying transaction costs. Also, if a technical forecasting rule earns economic profits, then this does not necessarily imply that it shows forecasting power. Economic profits could be the result of chance, not of excellent forecasting power. Next we describe the statistics that are computed to measure the profitability and predictability of a technical forecasting rule. Firstly, the average logarithmic gross return of a technical trading strategy in excess of the logarithmic gross interest rate, is used as a measure of economic profits, i.e. the excess return at day t is computed as f

rte = ln(1 + rtT T S ) − ln(1 + rt ), where f = US or UK. This statistic we call average excess return (¯rexc ) in short. Secondly, as measures of forecasting power are used (1) the average return of the data series itself during buy days, (2) the average return of the data series itself during sell days and (3) the difference of (1) and (2), i.e. (1)−(2). For short hand notation we call (1) the average buy return (¯rbuy), (2) the average sell return (¯rsell ) and (3) the average buy-sell difference. The average buy or sell return measures whether periods with positive or negative returns can be predicted well. The average buy-sell difference measures whether the forecasting rule can distinguish periods with positive returns from periods with negative returns.


Under the assumption of iid returns, t-test statistics are used to make inferences about the effectiveness of technical analysis. It is tested whether the mean excess, mean buy and mean sell returns, as well as the mean buy-sell difference, are statistically significantly different from zero. The t-test statistics are computed as

$t_{exc} = \sqrt{N}\,\dfrac{\bar{r}_{exc}}{S_{exc}}, \quad t_{buy} = \sqrt{N_{buy}}\,\dfrac{\bar{r}_{buy}}{S_{buy}}, \quad t_{sell} = \sqrt{N_{sell}}\,\dfrac{\bar{r}_{sell}}{S_{sell}},$ and

$t_{buy-sell} = \dfrac{\bar{r}_{buy} - \bar{r}_{sell}}{\sqrt{S_{buy}^2/N_{buy} + S_{sell}^2/N_{sell}}},$

where $N$ is the number of data points in the data set, $N_{buy}$ and $N_{sell}$ are the numbers of buy and sell days, $S_{exc}$ is the standard error of the excess return, and $S_{buy}$ and $S_{sell}$ are the standard errors of returns in buy and sell periods. The $t_{buy-sell}$ test statistic is not Student-t distributed. However, Satterthwaite (1946) derived an approximation for the degrees of freedom, so that the critical values from the t-table can be used. If the number of observations is sufficiently large, this test statistic has a limiting standard normal distribution.
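For a single forecasting rule, these four t-ratios can be computed as in the following minimal NumPy sketch (our own illustration; the function name and the boolean-mask layout for buy and sell days are hypothetical):

```python
import numpy as np

def iid_t_ratios(excess, returns, buy, sell):
    """Simple t-ratios under the iid assumption for one forecasting rule.

    excess   : daily excess returns r_t^e of the trading strategy
    returns  : daily returns of the data series itself
    buy, sell: boolean arrays marking buy days and sell days
    """
    def t_mean(x):                      # t-ratio of a sample mean
        return np.sqrt(len(x)) * x.mean() / x.std(ddof=1)

    r_buy, r_sell = returns[buy], returns[sell]
    t_exc = t_mean(excess)
    t_buy = t_mean(r_buy)
    t_sell = t_mean(r_sell)
    # Welch-type statistic for the buy-sell difference (Satterthwaite d.o.f.)
    se = np.sqrt(r_buy.var(ddof=1) / len(r_buy) + r_sell.var(ddof=1) / len(r_sell))
    t_buy_sell = (r_buy.mean() - r_sell.mean()) / se
    return t_exc, t_buy, t_sell, t_buy_sell
```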


Cocoa Continuation Series

Panel A of Table 3 shows the results of the best five technical forecasting rules applied to the CSCE cocoa continuation series in the period 1983:1-1997:6. Panel B of the table lists the results of the best forecasting rule in each subperiod. The first column of the table lists the rules' parameters. MA, TRB and FR are abbreviations for the moving average, trading range break-out and filter forecasting rules respectively; %b, td, fhp and stl are abbreviations for the %-band filter, the time delay filter, the fixed holding period and the stop-loss respectively. For example, the best technical forecasting rule in the full sample period is based on trading range break-out, defining support and resistance as local minima and maxima over the past five days, extended with a two %-band filter and a 50-day fixed holding period. The third column lists $\bar{r}_{exc}$ net of 0.1% transaction costs, with $t_{exc}$ beneath these numbers. The second column shows the effective yearly excess return, that is, $\bar{r}^Y_{exc} = \exp\{252\,\bar{r}_{exc}\}$.17 The fourth and fifth columns list the numbers of buy and sell days, with the numbers of buy and sell signals beneath these numbers. The sixth and seventh columns show, for trading postures initiated by buy and sell signals, the total number of days in trading postures that earn a strictly positive excess return as a fraction of the total number of buy and sell days; the fractions of buy and sell signals that initiate trading postures earning a strictly positive excess return are listed beneath these numbers. The eighth and ninth columns list $\bar{r}_{buy}$ and $\bar{r}_{sell}$, with $t_{buy}$ and $t_{sell}$ beneath these numbers. The last column shows $\bar{r}_{buy} - \bar{r}_{sell}$, with $t_{buy-sell}$ beneath these numbers.

The best technical forecasting rule applied to the full sample earns a statistically significantly positive yearly effective excess return of 10.38%, which is considerable. The average buy and sell returns are equal to 0.056% and −0.101% daily, or 15.2% and −22.5% effectively yearly. The average sell return is significantly negative at the 5% significance level using a one-sided test, while the average buy return is not significantly positive. The average buy-sell difference is significantly positive at the 5% significance level and equal to 0.158% daily, or 48.9% effectively yearly. The four other forecasting rules yield similar results. The average excess return is significantly positive in all cases at the 10% significance level using a one-sided t-test. The average buy return is positive, but not significant, and the average sell return is significantly negative. For all five forecasting rules the average buy-sell differences are statistically significantly positive at the 5% significance level using a one-sided test. The sixth and seventh columns show that for all five listed forecasting rules more than 50% of the trading postures following buy and sell signals earn a strictly positive excess return. Furthermore, these trading postures encompass more than 50% of the buy and sell days. The findings above indicate that the best five technical forecasting rules applied to the CSCE cocoa continuation series in the period 1983:1-1997:6 earn profits and show forecasting power. Similar results are found for the three subperiods. However, in the subperiods the best five technical forecasting rules earn a higher average excess return than in the full sample period. In all subperiods the best forecasting rule earns a statistically significantly positive yearly effective excess return of about 20%.

Panel A of Table 4 shows the results of the best five technical forecasting rules applied to the LIFFE cocoa continuation series in the period 1983:1-1997:6. Now the best forecasting rules are all based on moving averages. The best MA forecasting rule compares the price series with a 40-day MA and is extended with a 0.5 %-band filter. The best five technical forecasting rules applied to the LIFFE cocoa continuation series prove to be more profitable than the best five applied to the CSCE cocoa continuation series. Furthermore, the t-test statistics suggest stronger statistical significance (in favor of technical analysis). However, compared to the CSCE cocoa continuation series, the numbers of trading postures with a strictly positive excess return are smaller. In most cases 20-40% of the buy and sell signals initiate trading postures that earn profits, but these trading postures encompass more than 70% of the number of buy and sell days. Thus most of the time the forecasting rules are earning profits, but there are many short-run trading postures that make a loss. Also for the LIFFE cocoa continuation series it is found that the best forecasting rules perform better in the subperiods than in the full sample. Moreover, in the subperiods more than 50% of the trading postures initiated by these forecasting rules are profitable, and the trading postures following buy and sell signals encompass more than 70% of the buy and sell days. These findings indicate that also for the LIFFE cocoa continuation series the best five technical forecasting rules earn profits and show forecasting power.

Pound-Dollar Exchange Rate

Panel A of Table 5 shows the results of the best five technical forecasting rules applied to the Pound-Dollar exchange rate in the full sample period. The best forecasting rule is a 100-day trading range break-out rule extended with a one %-band filter and a 50-day fixed holding period. This forecasting rule earns a statistically significantly positive yearly effective excess return of 1.64%. Notice that this is considerably smaller than for the CSCE and LIFFE cocoa continuation series. The average buy and sell returns are equal to 0.161% and −0.017% daily, or 50% and −4.2% effectively yearly. For all forecasting rules the average buy return is significantly positive. However, for most forecasting rules the average sell return is not significantly negative. The average buy-sell difference is significantly positive for each of the best five forecasting rules and is equal to 0.178% daily, or 56.6% effectively yearly, for the best forecasting rule. It is found that more than 50% of the trading postures following a buy signal are profitable, and that these trading postures encompass more than 50% of the buy days. The number of profitable trading postures following a sell signal is zero, because in the case of a sell signal the domestic currency is bought and the domestic interest rate is earned. Hence the trading strategy used in this paper earns an excess return during sell days that is always equal to zero. The results for the three subperiods are similar. Thus also for the Pound-Dollar exchange rate the findings indicate that the best five technical forecasting rules earn profits and show forecasting power. However, much smaller profits could be made than in the cocoa continuation series.

For the three financial data series studied in this paper we have found that we can select technical forecasting rules that prove to be effective. However, in a search for a good forecasting rule one can always find some rule that appears to be effective, see e.g. Jensen and Benington (1969). This is called the danger of data snooping. In the next section we study the effectiveness of the 5350 technical forecasting rules as a group in order to deal with the data snooping problem.

17 We assume that the number of trading days in a year is equal to 252.

3.4.2. The Set of 5350 Forecasting Rules

3.4.2.1. Statistical Significance Under the Assumption of Iid Returns: Simple T-ratios

Firstly, to measure the profitability of the group of 5350 technical forecasting rules as a whole we compute two statistics, namely the percentage of forecasting rules that earn (1) a statistically significantly positive excess return, i.e. $t_{exc} > t_c$, and (2) a statistically significantly negative excess return, i.e. $t_{exc} < -t_c$, as inferred from the corresponding t-test statistic. We denote these statistics by (1) $\%(t_{exc} > t_c)$ and (2) $\%(t_{exc} < -t_c)$. Secondly, to measure the forecasting power of the group of 5350 technical forecasting rules as a whole we compute eight additional statistics, namely the percentage of forecasting rules that show (3) a statistically significantly positive mean buy return, (4) a statistically significantly negative mean sell return, (5) a statistically significantly positive mean buy-sell difference, (6) both (3) and (4), (7) a statistically significantly negative mean buy return, (8) a statistically significantly positive mean sell return, (9) a statistically significantly negative mean buy-sell difference and (10) both (7) and (8), as inferred from the corresponding t-test statistics. For shorthand we denote these statistics by (3) $\%(t_{buy} > t_c)$, (4) $\%(t_{sell} < -t_c)$, (5) $\%(t_{buy-sell} > t_c)$, (6) $\%(t_{buy} > t_c \wedge t_{sell} < -t_c)$, (7) $\%(t_{buy} < -t_c)$, (8) $\%(t_{sell} > t_c)$, (9) $\%(t_{buy-sell} < -t_c)$ and (10) $\%(t_{buy} < -t_c \wedge t_{sell} > t_c)$. The statistics (3) through (6) measure whether periods with positive and negative returns are predicted well.
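Given the t-ratios of all 5350 rules, these group statistics are simple tallies. A sketch, assuming the t-ratios have been collected into arrays with one entry per rule (a hypothetical layout):

```python
import numpy as np

def significance_fractions(t_exc, t_buy, t_sell, t_bs, t_c=1.28):
    """Percentages of forecasting rules with significant t-ratios.

    Each argument holds one t-ratio per forecasting rule;
    t_c = 1.28 is the one-sided critical value at the 10% level.
    """
    pct = lambda mask: 100.0 * np.mean(mask)
    return {
        "%(t_exc > t_c)": pct(t_exc > t_c),
        "%(t_exc < -t_c)": pct(t_exc < -t_c),
        "%(t_buy > t_c)": pct(t_buy > t_c),
        "%(t_sell < -t_c)": pct(t_sell < -t_c),
        "%(t_buy-sell > t_c)": pct(t_bs > t_c),
        "%(t_buy > t_c and t_sell < -t_c)": pct((t_buy > t_c) & (t_sell < -t_c)),
        "%(t_buy < -t_c)": pct(t_buy < -t_c),
        "%(t_sell > t_c)": pct(t_sell > t_c),
        "%(t_buy-sell < -t_c)": pct(t_bs < -t_c),
        "%(t_buy < -t_c and t_sell > t_c)": pct((t_buy < -t_c) & (t_sell > t_c)),
    }
```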


Cocoa Continuation Series

For both the CSCE and LIFFE cocoa continuation series, Table 6 shows the values of these statistics. Excess returns are computed net of 0.1% transaction costs. The table lists only the results of inferences made on the basis of one-sided tests at the 10% significance level; in that case $t_c = 1.28$. The results at the 5% significance level are similar, but of course weaker. At the 1% significance level any results in favor of technical analysis disappear.

For the CSCE cocoa continuation series it is found that $\%(t_{exc} > t_c) = 0.30$ in the full sample period, while for the LIFFE cocoa continuation series this percentage is equal to 13.86. This difference can mainly be explained by the results in the first subperiod. It is remarkable that in this period for the LIFFE cocoa continuation series it is found that $\%(t_{exc} > t_c) = 34.52$, while for the CSCE cocoa continuation series this percentage is equal to 0.92. Indeed, a group of forecasting rules performs very badly when applied to the CSCE cocoa continuation series, resulting in $\%(t_{exc} < -t_c) = 24.17$. The percentages of technical forecasting rules earning a statistically significantly positive excess return decline to 0.45 and 2.13 for the CSCE and LIFFE cocoa continuation series in the third subperiod, while the percentages earning a statistically significantly negative excess return increase to 33.26 and 11.28. Thus the forecasting rules prove to be profitable mainly for the LIFFE cocoa continuation series in the first subperiod, but profitability declines over time.

When considering the forecasting power of the group of technical forecasting rules applied to the LIFFE cocoa continuation series, it is found that $\%(t_{buy-sell} > t_c) = 26.58$ in the full sample period. This percentage is only 1.38 for the CSCE cocoa continuation series. Again the difference can be explained by the first subperiod. For the LIFFE cocoa continuation series it is found that $\%(t_{buy-sell} > t_c) = 46.65$, while for the CSCE cocoa continuation series this percentage is equal to 1.46. Focusing on the LIFFE cocoa continuation series in the first subperiod, it is found that $\%(t_{buy} > t_c) = 26.73$, $\%(t_{sell} < -t_c) = 39.47$ and $\%(t_{buy} > t_c \wedge t_{sell} < -t_c) = 14.70$. These numbers are considerably smaller for the CSCE cocoa continuation series. Thus, in the first subperiod the forecasting rules seem to distinguish periods with positive returns from periods with negative returns better in the LIFFE than in the CSCE cocoa continuation series.

In the second subperiod the forecasting rules seem to predict periods with negative returns very well in both the CSCE and LIFFE cocoa continuation series. It is found that $\%(t_{sell} < -t_c) = 44.57$ for the CSCE continuation series, while this percentage is even larger, 54.62, for the LIFFE cocoa continuation series. The second subperiod is characterized by a long-term downward trend with short-term upward corrections, which explains the high percentages found for $\%(t_{sell} < -t_c)$. However, the good predictability of periods with negative returns does not result in large percentages for $\%(t_{exc} > t_c)$, nor are large percentages found for $\%(t_{buy-sell} > t_c)$. It appears that the forecasting rules do not predict periods with positive returns very well, resulting in $\%(t_{buy} < -t_c) = 26.55$ and $\%(t_{buy} < -t_c) = 31.96$ for the CSCE and LIFFE cocoa continuation series respectively. These results are in line with the advice of technical analysts to trade only in the direction of the main trend and not to reverse trading postures until there is enough weight of evidence that the trend has reversed. Apparently the short-term upward corrections in the second subperiod did not last long enough to be predictable or profitable.

The third subperiod is characterized by upward and downward price trends. Compared to the first two subperiods, no remarkable results in favor of technical analysis are found. However, compared to the second subperiod the percentages found for $\%(t_{buy} < -t_c)$ decline to 7.98 and 2.34 for the CSCE and LIFFE cocoa continuation series.


Pound-Dollar Exchange Rate

Table 6 shows the results of the group of MA, TRB and filter forecasting rules when applied to the Pound-Dollar exchange rate in the period 1983:1-1997:6. Excess returns are computed net of 0.1% transaction costs. Just as for the CSCE cocoa continuation series, the percentages found for $\%(t_{exc} > t_c)$ do not show remarkable results in favor of technical analysis. On the contrary, we even find strong evidence against the profitability of technical analysis, with $\%(t_{exc} < -t_c) = 62.32$ in the full sample period. It appears that the forecasting rules performed badly mainly in the third subperiod. The technical forecasting rules seem to distinguish periods with positive returns from periods with negative returns in the full sample period, with $\%(t_{buy-sell} > t_c) = 28.19$, $\%(t_{buy} > t_c) = 13.08$ and $\%(t_{sell} < -t_c) = 17.13$. This occurs mainly in the first subperiod, with $\%(t_{buy-sell} > t_c) = 41.9$. In this period it is found that periods with negative returns are predicted better than periods with positive returns, with $\%(t_{sell} < -t_c) = 44.29$ and $\%(t_{buy} > t_c) = 12.42$. In the second subperiod, on the contrary, periods with positive returns are predicted better than periods with negative returns, with $\%(t_{buy} > t_c) = 29.63$ and $\%(t_{sell} < -t_c) = 7.73$. The third subperiod is characterized by upward and downward price trends; now the forecasting rules show hardly any signs of forecasting power.

3.4.2.2. Statistical Significance Under the Assumption of Non-iid Returns: An Estimation Based Approach

In the previous subsection it appeared that technical forecasting rules seem to be effective when applied to the LIFFE cocoa continuation series in the period 1983:1-1987:12. This is the only period and data series for which good results in favor of technical analysis are found. So far the statistical tests have been carried out under the assumption of iid returns. However, in section 3.1.4. it was shown that the data series studied in this paper exhibit some signs of linear dependence as measured by autocorrelation. Moreover, it is well known that returns of financial series show dependence in the second moments, i.e. volatility clustering. Therefore we extend our analysis by building a time series model that captures autocorrelation and volatility clustering, and we add a dummy for the trading posture to the regression function. This model is estimated for each forecasting rule. We then compute the percentage of forecasting rules for which the dummy coefficients are significantly different from zero, and consider this a measure of the forecasting power of the group of forecasting rules as a whole. Several econometric time series models were estimated on the daily LIFFE cocoa futures returns in the period 1983:1-1987:12. It is found that the following exponential GARCH


model (EGARCH), developed by Nelson (1991),18 fits the data best:19

$r_t = \alpha + \phi_{16}\,r_{t-16} + \varepsilon_t$
$\varepsilon_t = \eta_t \sqrt{h_t}, \quad \eta_t \text{ iid } N(0,1)$
$\ln(h_t) = \alpha_0 + g(\eta_{t-1}) + \beta_1 \ln(h_{t-1}) \qquad (3.7)$
$g(\eta_t) = \theta \eta_t + \gamma\left(|\eta_t| - \sqrt{2/\pi}\right),$

where $r_t$ is the logarithmic return at day $t$, $h_t$ is the conditional variance of $\varepsilon_t$ and $\eta_t$ is noise drawn from a standard normal distribution. This model allows future volatility to depend differently on the sign of the current return. The coefficient $\theta$ measures the leverage effect: if $\theta$ is negative, then volatility in the returns is larger after a price decline than after a price increase; if $\theta$ is positive, then volatility in the returns is larger after a price increase than after a price decline.

Table 7 shows the estimation results. It turns out that the coefficient $\theta$ is significantly positive, which indicates a positive correlation between return and volatility. Note that this is in contrast with the results found in stock markets and exchange rates, where usually a negative correlation between return and volatility is found; see for example Nelson (1991). The estimate of $\gamma$ is significantly positive, indicating that there is volatility clustering in the returns. The (partial) autocorrelation function of the (squared) standardized residuals shows no sign of dependence in the (squared) standardized residuals. Hence it is concluded that model (3.7) fits the data well.

Table 7. Coefficient estimates EGARCH-model

  α           φ16         α0          θ           γ           β1
  -0.000339   0.066843    -0.194617   0.037536    0.125153    0.976722
  (-1.11)     (2.49)      (-2.83)     (2.11)      (3.41)      (97.58)

Estimates of the parameters in model (3.7) using daily returns of the LIFFE cocoa futures price in the period December 12, 1981 until December 31, 1987. The exponential GARCH model is estimated using maximum likelihood, the Marquardt iterative algorithm and Bollerslev-Wooldridge (1992) heteroskedasticity-consistent standard errors and covariance. The numbers within parentheses are t-ratios.
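To make the dynamics of (3.7) concrete, the following sketch simulates returns from the estimated EGARCH model. It is our own illustration (the function name is hypothetical, and the example call plugs in the Table 7 point estimates), not the authors' estimation code:

```python
import numpy as np

def simulate_egarch(T, alpha, phi16, a0, theta, gamma, beta1, seed=0):
    """Simulate T returns from the EGARCH model (3.7)."""
    rng = np.random.default_rng(seed)
    r = np.zeros(T)
    # E[g(eta)] = 0, so the unconditional mean of ln(h_t) is a0 / (1 - beta1)
    lnh = a0 / (1 - beta1)
    eta_prev = 0.0
    for t in range(16, T):
        g = theta * eta_prev + gamma * (abs(eta_prev) - np.sqrt(2 / np.pi))
        lnh = a0 + g + beta1 * lnh                  # log conditional variance
        eta = rng.standard_normal()
        r[t] = alpha + phi16 * r[t - 16] + eta * np.sqrt(np.exp(lnh))
        eta_prev = eta
    return r

# Example with the Table 7 point estimates:
r = simulate_egarch(1260, -0.000339, 0.066843, -0.194617,
                    0.037536, 0.125153, 0.976722)
```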

To study the effectiveness of the forecasting rules, the regression function in model (3.7) is replaced by

$r_t = \alpha + \delta_m D_{m,t} + \phi_{16}\,r_{t-16} + \varepsilon_t,$

where $m = B$ or $S$ indicates that a dummy for buy days or for sell days is inserted in the regression function. Thus $D_{B,t} = 1$ ($D_{S,t} = 1$) if day $t$ is a buy (sell) day. From now on we refer to $D_{B,t}$ ($D_{S,t}$) as the buy (sell) dummy.

18 Nelson (1991) replaces the normal distribution used here with a generalized error distribution.
19 We checked for significance of the estimated coefficients and did diagnostic checking on the standardized residuals to check whether there was still dependence, using the (partial) autocorrelation function, Ljung-Box (1978) Q-statistics and the Breusch-Godfrey LM-test. The Schwarz Bayesian criterion was used for model selection.


For each of the 5350 technical forecasting rules, an EGARCH model with a buy dummy and an EGARCH model with a sell dummy are estimated separately. As a measure of forecasting power we use the percentage of forecasting rules that show a statistically significantly (1) positive buy dummy coefficient, $\%(t_{D_B} > t_c)$, (2) negative sell dummy coefficient, $\%(t_{D_S} < -t_c)$, (3) both (1) and (2), $\%(t_{D_B} > t_c \wedge t_{D_S} < -t_c)$, (4) negative buy dummy coefficient, $\%(t_{D_B} < -t_c)$, (5) positive sell dummy coefficient, $\%(t_{D_S} > t_c)$, and (6) both (4) and (5), $\%(t_{D_B} < -t_c \wedge t_{D_S} > t_c)$.

Table 8 shows the results using one-sided t-tests at the 10% significance level. Again the results indicate that the technical forecasting rules have forecasting power in the first subperiod. It is found that $\%(t_{D_B} > t_c) = 40.6$, $\%(t_{D_S} < -t_c) = 27.4$ and $\%(t_{D_B} > t_c \wedge t_{D_S} < -t_c) = 22.8$. Relative to this large group of well performing forecasting rules there is a small group of badly performing forecasting rules, with $\%(t_{D_B} < -t_c) = 4$, $\%(t_{D_S} > t_c) = 6.4$ and $\%(t_{D_B} < -t_c \wedge t_{D_S} > t_c) = 1.6$. In comparison with the results under the assumption of iid returns, it now seems that the forecasting rules predict periods with positive returns better than periods with negative returns, while under the iid assumption it was the other way around for the LIFFE cocoa continuation series.

Table 8. Statistical significance: an estimation based approach. The percentage of forecasting rules for which we found a statistically significantly (1) positive buy dummy coefficient, (2) negative sell dummy coefficient, (3) both (1) and (2), (4) negative buy dummy coefficient, (5) positive sell dummy coefficient, and (6) both (4) and (5). Significance is determined at the 10% significance level using a one-sided test, that is $t_c = 1.28$. Results are reported for the LIFFE cocoa continuation series in the period 1983:1-1987:12.

  Statistic                             MA     TRB    Filter   All
  (1) %(t_DB > t_c)                     40.2   41.9   38.7     40.6
  (2) %(t_DS < -t_c)                    32.8   22.7   17.5     27.4
  (3) %(t_DB > t_c ∧ t_DS < -t_c)       29.6   16.6    9.8     22.8
  (4) %(t_DB < -t_c)                     3.6    5.2    2.1      4.0
  (5) %(t_DS > t_c)                      4.1    9.6    6.8      6.4
  (6) %(t_DB < -t_c ∧ t_DS > t_c)        1.5    1.9    0.7      1.6
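The following sketch illustrates the dummy test in a deliberately simplified form: it estimates the mean equation $r_t = \alpha + \delta_B D_{B,t} + \phi_{16} r_{t-16} + \varepsilon_t$ by ordinary least squares and returns the t-ratio of the buy dummy. This is a hypothetical stand-in for illustration only; the chapter estimates the full EGARCH specification by maximum likelihood rather than OLS.

```python
import numpy as np

def buy_dummy_tratio(returns, buy_days, lag=16):
    """t-ratio of the buy dummy in r_t = a + d*D_{B,t} + phi*r_{t-lag} + e_t (OLS)."""
    r = np.asarray(returns)
    y = r[lag:]
    X = np.column_stack([np.ones(len(y)),          # constant
                         buy_days[lag:].astype(float),  # buy dummy D_{B,t}
                         r[:-lag]])                 # lagged return r_{t-lag}
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])
```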

3.5. Effectiveness of Technical Analysis: The Bootstrap Method

3.5.1. Methodology

In this section we extend our analysis with parametric bootstrap techniques. We investigate whether any good results that are found in favor of technical analysis can be explained by some popular data generating processes, such as the random walk, an autoregressive or an EGARCH model. We focus on the LIFFE cocoa continuation series in the period 1983:1-1997:6. With the aid of the bootstrap methodology the null hypothesis is tested whether the value of a certain statistic can be explained by the characteristics exhibited by a certain data generating process. We choose this statistic to be one of the ten measures of profitability and predictability of the group of 5350 forecasting rules as a whole described in section 3.4.2.1. The bootstrap methodology compares the value of the statistic computed from the original data series with the values of the same statistic computed from simulated comparison series. In particular, the percentage of comparison series is computed for which the value of the statistic is larger than the value of the statistic in the original series. This number can be thought of as a simulated "p-value". Using a one-sided test, the null hypothesis is rejected at the 10% significance level if the "p-value" < 0.10 or if the "p-value" > 0.90. The comparison series are simulated under the null hypothesis of the data generating process, denoted as the null model. We choose the null model to be a random walk, an autoregressive model, an EGARCH model, or a model with a structural break in the price trend. Distributions of the statistic under the various null models are estimated using the bootstrap methodology inspired by Efron (1982), Freedman (1984), Freedman and Peters (1984a, 1984b), and Efron and Tibshirani (1986). According to the estimation based bootstrap methodology of Freedman and Peters (1984a, 1984b), the null model is fitted to the original data series. The estimated residuals are standardized and resampled with replacement to form a new residual series. This scrambled residual series is used together with the estimated model parameters to create a new, so-called "bootstrapped", data series that has the same properties as the null model. We bootstrap 500 comparison series under each null model.

Random Walk Process

If the null model is chosen to be the random walk with a drift, then the comparison series are bootstrapped by resampling the returns of the original price series with replacement. We compute returns as natural logarithmic differences of the prices. If $\{P_t : t = 1, 2, \dots, T\}$ is the original price series, then $\{r_t = \ln(P_t) - \ln(P_{t-1}) : t = 2, 3, \dots, T\}$ is the original return series. The bootstrapped comparison price series is $\{P_t^* = \exp(r_t^*)\,P_{t-1}^* : t = 2, 3, \dots, T\}$, where $r_t^*$ is the redrawn return series. The initial value of the bootstrapped price series is set equal to the initial original price, i.e. $P_1^* = P_1$. By construction the returns of the bootstrapped price series are iid. The bootstrap method thus ensures that there is no dependence in the comparison price series that can be exploited by technical forecasting rules; only by chance can a forecasting rule yield good forecasting results. Hence if the null model is the random walk with a drift, then it is tested whether the findings in favor of technical analysis are just the result of pure luck.
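A minimal sketch of this procedure (hypothetical names; `statistic` stands for any of the group measures, for example $\%(t_{exc} > t_c)$ computed over all rules):

```python
import numpy as np

def random_walk_bootstrap(prices, statistic, n_boot=500, seed=0):
    """Simulated p-value of `statistic` under the random-walk-with-drift null.

    statistic: function mapping a price series to a scalar group measure.
    """
    rng = np.random.default_rng(seed)
    r = np.diff(np.log(prices))                 # original log returns
    observed = statistic(prices)
    count = 0
    for _ in range(n_boot):
        r_star = rng.choice(r, size=len(r), replace=True)   # iid resampling
        p_star = prices[0] * np.exp(np.cumsum(r_star))
        p_star = np.concatenate(([prices[0]], p_star))      # P*_1 = P_1
        if statistic(p_star) > observed:
            count += 1
    return count / n_boot                       # the simulated "p-value"
```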

Autoregressive Process

The second null model we test upon is an autoregressive (AR) model:

$r_t = \alpha + \phi_{16}\,r_{t-16} + \varepsilon_t, \quad |\phi_{16}| < 1, \qquad (3.8)$

where $r_t$ is the logarithmic return on day $t$ and $\varepsilon_t$ is iid noise.20 The coefficients $\alpha$, $\phi_{16}$ and the residuals $\varepsilon_t$ are estimated using ordinary least squares (OLS). The estimated residuals are redrawn with replacement and the bootstrapped return series are generated using the estimated coefficients and residuals:

$r_t^* = \hat{\alpha} + \hat{\phi}_{16}\,r_{t-16}^* + \varepsilon_t^*, \quad t = 18, \dots, T.$

20 This model is found to fit the data best; see section 3.4.2.2.


Here $\varepsilon_t^*$ is the redrawn estimated residual at day $t$ and $r_t^*$ is the bootstrapped return at day $t$. For $t = 2, \dots, 17$ we set $r_t^* = r_t$. The bootstrapped price series is $\{P_t^* = \exp(r_t^*)\,P_{t-1}^* : t = 2, \dots, T\}$ and we set $P_1^* = P_1$. If the null model is an autoregressive model, then it is tested whether the findings in favor of technical analysis can be explained by the high order autocorrelation that is present in the data. Estimating the model using OLS and White's (1980) heteroskedasticity-consistent standard errors yields the following results, with t-ratios within parentheses:

  α: -0.000235 (-0.68)    φ16: 0.110402 (4.00)

The coefficient of the lagged return is significantly different from zero. This shows that the LIFFE cocoa continuation series contains high order autocorrelation.
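A sketch of the AR null-model bootstrap, assuming the 16th-lag specification of (3.8) (again an illustration with hypothetical naming, not the authors' code):

```python
import numpy as np

def ar_bootstrap_series(prices, lag=16, seed=0):
    """One bootstrapped comparison price series under the AR null model (3.8)."""
    rng = np.random.default_rng(seed)
    r = np.diff(np.log(prices))
    # OLS fit of r_t = alpha + phi * r_{t-lag} + eps_t
    y, x = r[lag:], r[:-lag]
    X = np.column_stack([np.ones(len(y)), x])
    (alpha, phi), *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ np.array([alpha, phi])
    r_star = r.copy()                       # first `lag` returns kept as original
    draws = rng.choice(resid, size=len(r) - lag, replace=True)
    for i in range(lag, len(r)):
        r_star[i] = alpha + phi * r_star[i - lag] + draws[i - lag]
    # rebuild prices from the bootstrapped returns, with P*_1 = P_1
    return prices[0] * np.exp(np.concatenate(([0.0], np.cumsum(r_star))))
```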

Exponential GARCH Process

The third null model that is tested is the EGARCH model as described by (3.7). This model is estimated using maximum likelihood. The estimated coefficients and standardized residuals are used to generate new bootstrapped price series. The estimated standardized residuals $\hat{\eta}_t$ are resampled with replacement to form the resampled standardized residual series $\{\eta_t^* : t = 18, \dots, T\}$. The bootstrapped log conditional variance series is

$\{\ln(h_t^*) = \hat{\alpha}_0 + g(\eta_{t-1}^*) + \hat{\beta}_1 \ln(h_{t-1}^*) : t = 19, \dots, T\}.$

We set $h_{18}^*$ equal to the unconditional variance. Under the assumption that the $\eta_t$ are iid $N(0,1)$, the unconditional variance of $\varepsilon_t$ is equal to

$E(h_t) = \left\{\exp(\alpha_0)\,E[\exp(g(\eta_{t-1}))]\right\}^{\frac{1}{1-\beta_1}},$

where21

$E[\exp(g(\eta_t))] = \left[\Phi(\gamma+\theta)\exp\!\left(\tfrac{1}{2}(\gamma+\theta)^2\right) + \Phi(\gamma-\theta)\exp\!\left(\tfrac{1}{2}(\gamma-\theta)^2\right)\right]\exp\!\left(-\gamma\sqrt{2/\pi}\right).$

Here $\Phi(\cdot)$ is the cumulative normal distribution. The bootstrapped residual series is $\{\varepsilon_t^* = \eta_t^* \sqrt{h_t^*} : t = 19, \dots, T\}$ and the bootstrapped return series is $\{r_t^* = \hat{\alpha} + \hat{\phi}_{16}\,r_{t-16}^* + \varepsilon_t^* : t = 19, \dots, T\}$. For $t = 2, \dots, 18$ we set $r_t^* = r_t$. The bootstrapped price series is $\{P_t^* = \exp(r_t^*)\,P_{t-1}^* : t = 2, \dots, T\}$ and we set $P_1^* = P_1$.

21 See Nelson (1995).
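A corresponding sketch for the EGARCH null model, taking the estimated coefficients and standardized residuals as given (hypothetical function; it follows the recursion above and starts the variance at its unconditional level):

```python
import numpy as np
from scipy.stats import norm

def egarch_bootstrap_returns(r, eta_hat, alpha, phi16, a0, theta, gamma, beta1, seed=0):
    """Bootstrapped return series under the EGARCH null model (3.7).

    r       : original returns (the start-up period is kept as original)
    eta_hat : estimated standardized residuals to resample from
    """
    rng = np.random.default_rng(seed)
    r_star = r.copy()
    # initial variance: E(h_t) = {exp(a0) * E[exp(g(eta))]}^(1/(1-beta1))
    Eg = (norm.cdf(gamma + theta) * np.exp(0.5 * (gamma + theta) ** 2)
          + norm.cdf(gamma - theta) * np.exp(0.5 * (gamma - theta) ** 2)) \
         * np.exp(-gamma * np.sqrt(2 / np.pi))
    h = (np.exp(a0) * Eg) ** (1.0 / (1 - beta1))
    eta_prev = rng.choice(eta_hat)
    for t in range(17, len(r)):
        g = theta * eta_prev + gamma * (abs(eta_prev) - np.sqrt(2 / np.pi))
        h = np.exp(a0 + g + beta1 * np.log(h))
        eta = rng.choice(eta_hat)
        r_star[t] = alpha + phi16 * r_star[t - 16] + eta * np.sqrt(h)
        eta_prev = eta
    return r_star
```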



Figure 3.4. On the left the CSCE (top) and LIFFE (middle) cocoa continuation series both on the same scale [800, 2200], and the Pound-Dollar exchange rate (bottom) on the scale [0.8, 2.2]. On the right the corresponding returns series all on the same scale [-0.08, 0.06]. All series are plotted in the period 1983:1-1987:12.


Structural Break in Trend

Figure 3.4 reveals that the LIFFE cocoa continuation series contains an upward price trend starting in January 1983 and ending in February 1985, followed by a downward price trend starting in February 1985 and ending in December 1987. The final bootstrap procedure that is considered simulates comparison series that have the same structural change in price trends. For the period showing the upward price trend we bootstrap the autoregressive model (3.8); we find no signs of volatility clustering in this period. However, we do find significant volatility clustering in the period showing the downward price trend, and therefore we bootstrap the following GARCH model for that period:

$r_t = \alpha + \phi_2\,r_{t-2} + \varepsilon_t$
$\varepsilon_t = \eta_t \sqrt{h_t}, \quad \eta_t \text{ iid } N(0,1)$
$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}.$

This model is found to fit the data best.22 Table 9 contains the estimation results of the autoregressive model in the period January 5, 1983 until February 4, 1985 and of the GARCH model in the period February 5, 1985 until December 31, 1987, with t-ratios within parentheses.

Table 9. Coefficient estimates structural break in trend model

The autoregressive model coefficient estimates, 1/5/1983 - 2/4/1985:
  α: 0.001213 (1.74)    φ16: 0.161887 (3.67)

The GARCH-model coefficient estimates, 2/5/1985 - 12/31/1987:
  α: -0.001511 (-3.95)   φ2: -0.113115 (-2.85)   α0: 3.85E-06 (1.48)   α1: 0.064247 (1.68)   β1: 0.905622 (18.6)

Coefficient estimates of an autoregressive model estimated on the daily return series of the LIFFE cocoa continuation series in the period 1-5-1983 until 2-4-1985 and of a GARCH model in the period 2-5-1985 until 12-31-1987. The autoregressive model is estimated using OLS and White's (1980) heteroskedasticity-consistent standard errors. The GARCH model is estimated using maximum likelihood, the Marquardt iterative algorithm and Bollerslev-Wooldridge (1992) heteroskedasticity-consistent standard errors and covariance. The numbers within parentheses are t-ratios.
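For the structural break null model, a comparison series is thus built in two pieces: the upward-trend segment can be bootstrapped from the AR model (3.8), as in the earlier sketch, and the downward-trend segment from the GARCH(1,1) model above. A sketch of the GARCH segment (hypothetical naming; the coefficients are as in Table 9):

```python
import numpy as np

def garch_bootstrap_segment(r_seg, eta_hat, alpha, phi2, a0, a1, b1, seed=0):
    """Bootstrap one GARCH(1,1) segment of the structural-break null model."""
    rng = np.random.default_rng(seed)
    r_star = r_seg.copy()
    h = a0 / (1 - a1 - b1)              # start at the unconditional variance
    eps_prev = 0.0
    for t in range(2, len(r_seg)):      # second-order autoregressive lag
        h = a0 + a1 * eps_prev ** 2 + b1 * h
        eps = rng.choice(eta_hat) * np.sqrt(h)
        r_star[t] = alpha + phi2 * r_star[t - 2] + eps
        eps_prev = eps
    return r_star
```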

In the first period the returns show significantly positive 16th order autocorrelation, while in the second period the returns show significantly negative second order autocorrelation. The constant is significantly positive at the 10% significance level in the first period, while it is significantly negative at the 1% significance level in the second period. This is an indication that the drift in price is first positive and then negative. Hence, with this final bootstrap procedure it is tested whether the findings in favor of technical analysis can be explained by the trend structure in the price series and the strong autocorrelation in the returns. We remark that the mean excess return of a technical trading strategy that is applied to the cocoa continuation series is approximately equal to the return of a trading posture without correcting for the interest that is earned on the margin account, because

$r_t^e = \ln\!\left(1 + r_t^f + \frac{P_t - P_{t-1}}{M_{t-1}}\,Pos_t\right) - \ln(1 + r_t^f) \approx r_t^f + \frac{P_t - P_{t-1}}{M_{t-1}}\,Pos_t - r_t^f = \frac{P_t - P_{t-1}}{M_{t-1}}\,Pos_t.$

Therefore the mean excess return of a technical trading strategy is calculated as the mean return of the trading postures, so that it is not necessary to bootstrap the interest rate simultaneously with the cocoa continuation series.

22 See footnote 19.

3.5.2. The Bootstrap Method: Empirical Results

We apply the bootstrap method to determine the statistical significance of the findings in favor of technical analysis in the LIFFE cocoa continuation series in the period 1983:1-1987:12. In particular, we focus on the question whether these results can be explained by characteristics captured by a random walk with a drift, an autoregressive, an EGARCH, or a structural break in the price trend model. Table 10 shows the results. The first column shows the statistics whose values we try to explain by the characteristics captured by the models listed in the first row. Columns two through five display the p-values.

Random Walk Process

Table 6 already showed that for the LIFFE cocoa continuation series 34.5% of the technical forecasting rules earn a statistically significantly positive mean excess return in the first subperiod. The entry in the second row and second column of Table 10 shows the p-value resulting from the bootstrap method testing the null hypothesis that the value of the statistic $\%(t_{exc} > t_c)$ can be explained by the characteristics exhibited by a random walk process. This number is equal to 0.002, which means that in only 0.2% of the 500 random walk simulations the value of the statistic $\%(t_{exc} > t_c)$ is larger than the 34.5 found in the original data series. Hence the null hypothesis that the considerable economic profits that are observed can be explained by the random walk model is rejected at the 1% significance level. It was also shown that 26.7% of the technical forecasting rules have a statistically significantly positive mean buy return. The p-value in the row $\%(t_{buy} > t_c)$ shows that in 3.2% of the simulations the value of the statistic $\%(t_{buy} > t_c)$ is larger than the 26.7 found in the original data series. However, the p-value in the row $\%(t_{sell} < -t_c)$ shows that in 14% of the simulations the value of the statistic $\%(t_{sell} < -t_c)$ is larger than the 39.5 found in the original data series; for this statistic the bootstrap method does not reject its null hypothesis. The random walk model thus seems to explain the significantly negative mean sell return. However, it was shown that 46.7% of the technical forecasting rules have a statistically significantly positive mean buy-sell difference, while the p-value in the row $\%(t_{buy-sell} > t_c)$ is equal to zero: in none of the random walk simulations is the value of the statistic $\%(t_{buy-sell} > t_c)$ larger than the 46.7 found in the original data series. Further, it was shown that 14.7% of the technical forecasting rules have a statistically significantly positive mean buy return as well as a statistically significantly negative mean sell return. The p-value in the row $\%(t_{buy} > t_c \wedge t_{sell} < -t_c)$, which is 0.006,


shows that in only 0.6% of the simulations the value of this statistic is larger than the 14.7 found in the original data series. Thus the random walk process cannot explain the finding that technical forecasting rules seem to have the ability to distinguish periods with positive returns from periods with negative returns. The null hypothesis that the values of the statistics $\%(t_{exc} < -t_c)$ and $\%(t_{buy-sell} < -t_c)$ can be explained by a random walk process is rejected at the 10% significance level; the results even indicate that the set of technical forecasting rules performs worse when applied to the random walk process. However, the null hypothesis that the values of the statistics $\%(t_{buy} < -t_c)$, $\%(t_{sell} > t_c)$ and $\%(t_{buy} < -t_c \wedge t_{sell} > t_c)$ can be explained by a random walk process is not rejected at the 10% significance level.

Autoregressive and EGARCH Process

The third and fourth columns in Table 10 show the p-values resulting from the bootstrap method testing the null hypothesis that the values of the statistics in the first column can be explained by the characteristics exhibited by an autoregressive and an EGARCH process. Now it can be studied whether the findings in favor of technical analysis can be explained by high order autocorrelation, or by volatility clustering and the leverage effect. The conclusions do not change compared to the random walk null model. Hence we conclude that neither the autoregressive process nor the EGARCH process can explain the findings in favor of technical analysis.

Structural Break in Trend

Finally, the last column in Table 10 shows the p-values resulting from testing the null hypothesis that the findings in favor of technical analysis can be explained by the structural break in the price trend. The results are strikingly different from those of the three models studied before: now the null hypothesis is rejected for almost none of the statistics listed in the first column. Hence the model that allows for a structural break in the price trend seems to explain the profitability and predictability of the set of technical forecasting rules when these are applied to the LIFFE cocoa continuation series in the period 1983:1-1987:12. Thus the strong change in the direction of the price trend seems to be the most probable cause of the trend-following technical forecasting techniques showing signs of forecasting power.

3.6. Success and Failure of Technical Analysis

Technical analysis seems to be a success when applied to the LIFFE cocoa continuation series in the period 1983:1-1987:12. On the other hand it seems to be a failure when applied to the CSCE cocoa continuation series in the same period. The cocoa futures contracts that are traded at these two exchanges differ in their specifications of quality, currency and place of delivery, but it is surprising that the differences in profitability and predictability are so large. Why are these differences so pronounced? The daily CSCE cocoa futures returns show somewhat stronger autocorrelation in the first two lags than the LIFFE cocoa futures returns.


Table 10. Statistical significance: a bootstrap based approach. The table shows the p-values (between 0 and 1) resulting from testing the null hypotheses that the values of the statistics in the first column can be explained by characteristics of the data generating processes that are listed in the first row. Results are reported for the LIFFE cocoa continuation series in the period 1983:1-1987:12. Here $t_c = 1.28$.

  Statistic                                  RW      AR      EGARCH   Trend
  %(t_exc > t_c) = 34.52                     0.002   0.038   0.03     0.414
  %(t_exc < -t_c) = 5.87                     0.964   0.936   0.942    0.96
  %(t_buy > t_c) = 26.73                     0.032   0.074   0.05     0.478
  %(t_sell < -t_c) = 39.47                   0.14    0.274   0.334    0.528
  %(t_buy-sell > t_c) = 46.65                0       0.012   0.002    0.248
  %(t_buy > t_c ∧ t_sell < -t_c) = 14.70     0.006   0.016   0.012    0.426
  %(t_buy < -t_c) = 3.46                     0.87    0.838   0.902    0.858
  %(t_sell > t_c) = 3.29                     0.572   0.502   0.428    0.776
  %(t_buy-sell < -t_c) = 3.29                0.968   0.95    0.942    0.952
  %(t_buy < -t_c ∧ t_sell > t_c) = 0.82      0.342   0.274   0.278    0.542

In contrast with our results, this would be suggestive of more predictability in the CSCE cocoa continuation series. However, the volatility of the CSCE cocoa continuation series is slightly larger across all subperiods than the volatility of the LIFFE cocoa continuation series. This would be an a priori indication of why trend-following technical forecasting techniques should have more difficulty in predicting the CSCE cocoa continuation series. Nevertheless, it seems that this somewhat higher volatility is not the explanation for the large differences that are observed. For example, in the second subperiod, when volatility is strongest across all subperiods in both cocoa continuation series, it was shown that the technical forecasting rules are effective in both cocoa continuation series; in particular, they predict periods with negative returns very well. Hence there must be some other explanation for the observed differences in the effectiveness of technical analysis.

It is already shown in figures 3.3 and 3.4 that the LIFFE cocoa continuation series exhibits an upward price trend from January 1983 until February 1985, whereas the CSCE cocoa continuation series exhibits an upward price trend from January 1983 until June 1984. Both cocoa continuation series exhibit a downward price trend from February 1985 until December 1987. The upward price trend until mid 1984 was due to excess demand in the physical cocoa market, whereas after January 1986 cocoa prices declined for several years due to excess supply; see for example the graphs of gross crops and grindings of cocoa beans from 1960-1997 in the International Cocoa Organization Annual Report 1997/1998 (e.g. p.15, Chart I).23 Thus, in the end, the demand and supply of cocoa beans caused the upward and downward price trends in cocoa futures prices in the period 1983:1-1987:12. However, figure 3.4 suggests that these price trends were more pronounced in the LIFFE than in the CSCE cocoa continuation series.

23 We would like to thank Guido Veenstra, employed at the Dutch cocoa trading firm Unicom, for pointing this out to us.

3.6.1. The Influence of the Pound-Dollar Exchange Rate

It is also shown in figures 3.3 and 3.4 that the Pound-Dollar exchange rate exhibits price trends similar to those of both cocoa continuation series in the period 1983:1-1987:12. More precisely, the Pound-Dollar exchange rate increased (the Pound weakened against the Dollar) from January 1983 to reach its high in February 1985. This caused an upward force on the LIFFE cocoa futures price in Pounds, and a downward force on the CSCE cocoa futures price in Dollars. Just like the Pound-Dollar exchange rate, the LIFFE cocoa futures price peaked in February 1985, while the CSCE cocoa futures price had already reached its high in June 1984. After February 1985 the Pound strengthened against the Dollar until April 1988, that is, the Pound-Dollar exchange rate declined. This caused a downward force on the LIFFE cocoa futures price in Pounds, but an upward force on the CSCE cocoa futures price in Dollars. Until January 1986 the LIFFE cocoa futures price declined, while the CSCE cocoa futures price rose slightly. After January 1986 cocoa futures prices fell on both exchanges for a long time due to excess supply of cocoa beans. We therefore conclude that, by coincidence, the upward and downward price trends in the cocoa continuation series coincided with the upward and downward price trends in the Pound-Dollar exchange rate. The price trends in the exchange rate hence strengthened the price trends in the LIFFE cocoa futures price, whereas the same trends in the exchange rate weakened the trends in the CSCE cocoa futures price.

Table 11 shows the cross-correlations between the levels of the three data series across all subperiods. It is well known that if two independently generated integrated time series of order one are regressed against each other in levels, then with probability one a spurious but significant relation between the two time series will be found (Phillips 1986). Although the Pound-Dollar exchange rate should develop independently of cocoa futures prices, it has some impact on the level of cocoa futures prices as described above. In particular, in the period 1983:1-1987:12 the table shows that the Pound-Dollar exchange rate is strongly correlated with the level of the LIFFE cocoa continuation series (cross-correlation coefficient 0.88) and also, although somewhat more weakly, with the CSCE cocoa continuation series (cross-correlation coefficient 0.58). In the other subperiods there is little cross-correlation between the Pound-Dollar exchange rate and the LIFFE and/or the CSCE cocoa continuation series. Apparently, due to an accidental correlation (spurious relation) in the period 1983:1-1987:12 between Pound-Dollar exchange rate movements and demand and supply in the physical cocoa market, price trends in the LIFFE cocoa futures price were strengthened and price trends in the CSCE cocoa futures price were weakened. Because the technical forecasting rules tested in this study are mainly trend-following techniques, this presents a possible explanation for the large differences in the effectiveness of our technical forecasting rules in the LIFFE and CSCE cocoa futures markets.
Table 11. Cross-correlations between the LIFFE and CSCE cocoa continuation series, and the Pound-Dollar exchange rate in the full sample period 1983:1-1997:6 and the subperiods 1983:1-1987:12, 1988:1-1992:12 and 1993:1-1997:6

  83:1-97:6    LIFFE   CSCE   BPDo         88:1-92:12   LIFFE   CSCE    BPDo
  LIFFE        1                           LIFFE        1
  CSCE         0.98    1                   CSCE         0.97    1
  BPDo         0.66    0.51   1            BPDo         0.08    -0.13   1

  83:1-87:12   LIFFE   CSCE   BPDo         93:1-97:6    LIFFE   CSCE    BPDo
  LIFFE        1                           LIFFE        1
  CSCE         0.87    1                   CSCE         0.93    1
  BPDo         0.88    0.58   1            BPDo         0.26    0.16    1

In order to explore further the possible impact of the Pound-Dollar exchange rate on the effectiveness of technical analysis, we now apply the set of technical forecasting rules to the LIFFE cocoa continuation series expressed in Dollars and to the CSCE cocoa continuation series expressed in Pounds. Table 12 shows computations of the statistics that were defined in section 3.4.2.1. The numbers in this table should be compared to the corresponding numbers in Table 6. In the full sample period the statistics yield slightly better results for the CSCE cocoa continuation series in Pounds than for the CSCE cocoa continuation series in Dollars. For example, now $\%(t_{buy-sell} > t_c) = 2.73$ (compared to 1.38).24 Periods with negative returns are predicted better, with $\%(t_{sell} < -t_c) = 14.25$ (compared to 5.92). For the LIFFE cocoa continuation series in Dollars the statistics yield poorer results in the full sample period than for the LIFFE cocoa continuation series in Pounds. Now the percentage of technical forecasting rules showing a statistically significantly positive mean excess return is only 1.31, while this percentage was 13.86 for the LIFFE cocoa continuation series in Pounds. Further, $\%(t_{buy} > t_c) = 5.10$ (compared to 26.58). The forecasting rules still predict periods with negative returns well, with $\%(t_{sell} < -t_c) = 25.97$, but not nearly as well as for the LIFFE cocoa continuation series in Pounds, for which it was found that $\%(t_{sell} < -t_c) = 50.53$.

In the first subperiod technical analysis proved not to be effective when applied to the CSCE cocoa continuation series in Dollars. However, when applied to the CSCE cocoa continuation series in Pounds, technical analysis shows to be more effective. For example, now the percentage of technical forecasting rules earning a statistically significantly positive mean excess return is equal to 8.33, while this percentage was only 0.92 for the CSCE cocoa continuation series in Dollars. Further, periods with negative returns are predicted much better, with $\%(t_{sell} < -t_c) = 19.65$ (compared to 0.77). Periods with positive returns are also predicted better, but the improvement is not as large as for the periods with negative returns, that is, $\%(t_{buy} > t_c) = 6.13$ (compared to 1.27).

24 Between brackets we repeat the results for the CSCE cocoa continuation series in Dollars and for the LIFFE cocoa continuation series in Pounds.


The forecasting rules also seem to distinguish periods with positive returns from periods with negative returns better, with $\%(t_{buy-sell} > t_c) = 19.41$ (compared to 1.46). Table 12 shows clearly that for the first subperiod the conclusion about the effectiveness of applying technical forecasting rules indeed changes when the LIFFE cocoa continuation series is expressed in the other currency. The strong results that were found in favor of technical analysis totally disappear. The percentage of technical forecasting rules that earn a statistically significantly positive mean excess return drops from 34.52 to 1.03. Periods with negative returns and periods with positive returns are predicted badly, with $\%(t_{sell} < -t_c)$ only equal to 1.18 (compared to 39.47) and $\%(t_{buy} > t_c)$ only equal to 1.70 (compared to 26.73). The forecasting rules have more difficulty in distinguishing periods with positive returns from periods with negative returns, with $\%(t_{buy-sell} > t_c) = 2.13$ (compared to 46.65) and $\%(t_{buy} > t_c \wedge t_{sell} < -t_c) = 0.11$ (compared to 14.70).

We conclude that the Pound-Dollar exchange rate had a strong influence on the profitability and forecasting power of the technical forecasting rules that were applied to the LIFFE cocoa continuation series in Pounds, especially in the period 1983:1-1987:12; there is a dramatic change in effectiveness when the LIFFE cocoa futures price is converted to Dollars. On the other hand, the profitability and forecasting power of the technical forecasting rules applied to the CSCE cocoa continuation series converted to Pounds are not as strong as those of the same forecasting rules applied to the LIFFE cocoa continuation series in Pounds. Thus, in addition to the demand and supply of cocoa beans, the Pound-Dollar exchange rate movements provide only a partial explanation of the profitability and predictability of technical analysis in the cocoa futures markets.

3.6.2. What Causes Success and Failure of Technical Analysis?

An important theoretical and practical question is: "What are the characteristics of speculative price series for which technical analysis can be successful?" In order to get some insight into this general question from our case study, it is useful to plot the price and returns series all on the same scale, as shown in figure 3.4. The returns series clearly show that the volatility of the Pound-Dollar exchange rate is lower than the volatility of both cocoa futures series. Furthermore, the price series show that the price trends in the LIFFE cocoa continuation series are much stronger than in the CSCE cocoa continuation series and in the Pound-Dollar exchange rate. One might characterize the three series as follows: (i) the CSCE cocoa continuation series has weak price trends and high volatility in returns; (ii) the LIFFE cocoa continuation series has strong price trends and high volatility in returns; and (iii) the Pound-Dollar exchange rate has weak price trends and low volatility in returns. Recall from section 3.4. that the effectiveness of our technical forecasting rules may be summarized as follows: (i) no profitability and no forecasting power in the CSCE cocoa continuation series; (ii) profitability and forecasting power in the LIFFE cocoa continuation series; and (iii) no profitability but signs of forecasting power in the Pound-Dollar exchange rate.

Our case study of cocoa futures prices and the Pound-Dollar exchange rate suggests the following connection between the effectiveness of technical analysis and the price trend and volatility in returns. When price trends are weak and volatility in returns is relatively high, as in the CSCE cocoa continuation series, technical analysis does not have much forecasting power and therefore also cannot lead to profits: volatility in returns is too high relative to the price trends, so that technical analysis is unable to uncover these price trends. When price trends are weak but volatility in returns is also relatively low, as in the Pound-Dollar exchange rate, technical analysis can have statistically significant forecasting power without being profitable: because volatility in returns is low, technical analysis can still pick up the weak trends, but the price changes, although predictable, are too small to cover transaction costs. Finally, when price trends are strong and volatility in returns is relatively high, as in the LIFFE cocoa continuation series, a large set of technical forecasting rules may have statistically significant forecasting power leading to profit opportunities. In that case, the price trends are strong enough to be picked up by technical analysis even though volatility in returns is high. Moreover, since volatility in returns is high, the magnitude of the (predictable) price changes is large enough to cover the transaction costs.

3.7. Concluding Remarks

Technical analysis encompasses a myriad of techniques to study, explain and predict the movements of financial markets. In this paper the effectiveness of a large set of 5350 technical forecasting rules has been tested; both profitability and predictability are studied, and a correction is made for transaction costs. The forecasting rules are applied to the prices of cocoa futures contracts that are traded at the CSCE in New York and the LIFFE in London, and to the Pound-Dollar exchange rate. We focus on the period 1983:1-1997:6. The technical forecasting rules that are studied are based on moving averages, support and resistance, and Alexander's (1961) filters. The forecasting rules prove to be more effective when applied to LIFFE quoted cocoa futures prices than when applied to CSCE quoted cocoa futures prices. This is especially the case in the period 1983:1-1987:12. In this period a large fraction of the forecasting rules applied to LIFFE quoted cocoa futures prices earn statistically significant profits and show statistically significant forecasting power. On the contrary, when these same forecasting rules are applied to CSCE quoted futures prices, they show little forecasting power and are no longer profitable. Furthermore, statistically significant forecasting power of technical forecasting rules is found in the Pound-Dollar exchange rate in the period 1983:1-1997:6, but unfortunately most forecasting rules earn no profits.

Technical analysis is more effective in the LIFFE cocoa futures market than in the CSCE cocoa futures market. This large difference may be explained by a combination of demand and supply of cocoa beans and price movements in the Pound-Dollar exchange rate. In the period 1983:1-1987:12 cocoa futures prices and the Pound-Dollar exchange rate were, accidentally, strongly correlated. This spurious relation strengthened upward and downward price trends in the LIFFE quoted cocoa futures price, while it weakened price trends in the CSCE quoted cocoa futures price. In the case of the LIFFE quoted cocoa futures price the price trends were strong enough to be picked up by a large class of technical forecasting rules. In the case of the CSCE quoted cocoa futures price most forecasting rules did not pick up the price trends, which were similar to the price trends in the LIFFE quoted cocoa futures price, but weaker.


We extend our standard statistical analysis of the effectiveness of technical analysis in the LIFFE cocoa futures market with parametric bootstrap techniques, focusing on the period 1983:1-1987:12. This period exhibits a long-term upward price trend followed by a long-term downward price trend. We test the null hypothesis that the results in favor of technical analysis can be explained by certain characteristics of the data as captured by a certain data generating process. It is found that a random walk with a drift, an autoregressive and an exponential GARCH process cannot explain the results in favor of technical analysis. However, a model that accounts for the structural break in the price trend cannot be rejected as an explanation of the effectiveness of technical analysis. Apparently many technical forecasting rules were able to pick up this structural break in the price trend. Further, it is found that technical analysis is less effective in the period 1993:1-1997:6. This is in line with many papers that found that the effectiveness of forecasting rules tends to disappear in the 1990s.

Although this study only documents the profitability and predictability of technical forecasting rules applied to a single commodity market, our case study suggests some general conclusions that may be useful for other financial series as well. First, in order to assess the success or failure of technical analysis it is useful to test a large class of forecasting rules, as done in this paper. A necessary condition for reliable success of technical analysis seems to be that a large class of forecasting rules, not just a few, should work well. If only a few forecasting rules are successful, then this may simply be due to "chance" or to data snooping. It should also be emphasized that even if a large class of forecasting rules has statistically significant forecasting power, this is not a sufficient condition for economically significant trading profits after correcting for transaction costs. An example is the Pound-Dollar exchange rate, for which a large fraction of forecasting rules exhibits statistically significant forecasting power, but these forecasting rules hardly yield any economic net profitability.

Our case study of cocoa futures prices and the Pound-Dollar exchange rate suggests a connection between the success or failure of technical analysis and the price trend and volatility in returns of the corresponding series. When price trends are weak and volatility in returns is relatively high, technical analysis does not have much forecasting power and therefore also cannot lead to economic profitability: technical analysis is unable to uncover the price trends, because volatility in returns is too high. When price trends are weak but volatility in returns is relatively low, technical analysis can have statistically significant forecasting power without being profitable: because volatility in returns is low, technical analysis can still pick up the weak price trends, but the price changes, although predictable, are too small to cover transaction costs. Finally, when price trends are strong and volatility in returns is relatively high, a large set of technical forecasting rules may have statistically significant forecasting power leading to economically significant profit opportunities; in that case, even though volatility in returns is high, the price trends are strong enough to be picked up by technical analysis. Moreover, since volatility in returns is high, the magnitude of the (predictable) price changes is large enough to cover the transaction costs. We emphasize that this connection between the predictive and the economic effectiveness of technical analysis is suggestive and documented only for the markets studied here. Further research, of interest from a theoretical as well as a practical viewpoint, is needed to uncover whether the success and failure of technical trading is explained by the

58

Peter Boswijk, Gerwin Griffioen and Cars Hommes

relative magnitudes of price trends and volatility in returns. Technical analysis may pick up sufficiently strong price trends in asset prices and even may pick up structural breaks in price trends, without knowing or understanding the economic forces behind these price trends. It seems wise, however, that a technical analyst does not trust his charts only, but also tries to trace economic fundamentals that may cause to weaken or strengthen any occurring price trend. In the LIFFE quoted cocoa futures price the price trends were caused by two forces, namely firstly by demand and supply of cocoa beans and secondly by exchange rate movements. Apparently, at the same time as the structural break in the price trend, these forces changed direction. If both the technical charts and fundamental indicators point in the same direction, then technical analysis can be successful; otherwise failure seems a real possibility.

Appendix A. Tables

Table 1.  Summary statistics of daily returns
Table 2.  Autocorrelation functions of daily returns
Table 3.  Results of the best strategies applied to the CSCE cocoa futures price
Table 4.  Results of the best strategies applied to the LIFFE cocoa futures price
Table 5.  Results of the best strategies applied to the Pound-Dollar exchange rate
Table 6.  Statistical significance: simple t-ratios
Table 7.  Coefficient estimates EGARCH model
Table 8.  Statistical significance: an estimation based approach
Table 9.  Coefficient estimates structural break in trend model
Table 10. Statistical significance: a bootstrap based approach
Table 11. Cross-correlations
Table 12. Statistical significance when LIFFE in Dollars and CSCE in Pounds: simple t-ratios

Table 1. Summary statistics of daily returns. Several statistics are presented for the full sample period 1983:1-1997:12 and three subperiods, 1983:1-1987:12, 1988:1-1992:12 and 1993:1-1997:6. Returns are calculated as the natural log differences of the prices. The maximum loss is the largest consecutive decline in percentage terms during a certain period. The t-statistics test whether the mean return is significantly different from zero.

CSCE                     Full sample        83:1-87:12         88:1-92:12         93:1-97:6
N                        3654               1254               1262               1136
Yearly effective return  -0.078914          -0.016746          -0.21364           0.020661
Mean                     -0.000326          -0.000067          -0.000954          0.000081
Std. Dev.                0.016616           0.015787           0.018842           0.014773
t-ratio                  -1.186689          -0.150324          -1.798208          0.185154
Skewness                 0.243951           -0.049036          0.341313           0.477601
Kurtosis                 4.971493           3.39366            5.199822           5.495766
Maximum loss             -0.8507            -0.4355            -0.7234            -0.3546
Period of maximum loss   05/23/84-02/20/97  05/23/84-12/08/87  01/22/88-06/24/92  07/18/94-02/20/97

LIFFE                    Full sample        83:1-87:12         88:1-92:12         93:1-97:6
N                        3673               1260               1264               1147
Yearly effective return  -0.073934          -0.035598          -0.198199          0.030482
Mean                     -0.000305          -0.000144          -0.000877          0.000119
Std. Dev.                0.014056           0.013538           0.015521           0.012851
t-ratio                  -1.314172          -0.377152          -2.007875          0.314005
Skewness                 0.08106            -0.249777          0.353273           0.040053
Kurtosis                 5.797402           5.85137            5.564294           5.721865
Maximum loss             -0.8919            -0.6115            -0.7513            -0.3749
Period of maximum loss   02/05/85-06/24/92  02/05/85-12/09/87  01/19/88-06/24/92  08/01/94-02/12/97

BPDo                     Full sample        83:1-87:12         88:1-92:12         93:1-97:6
N                        3780               1303               1304               1171
Yearly effective return  -0.0019            -0.028517          0.042569           -0.020163
Mean                     -0.000008          -0.000115          0.000165           -0.000081
Std. Dev.                0.006567           0.007056           0.007174           0.00515
t-ratio                  -0.070642          -0.587341          0.832657           -0.537031
Skewness                 -0.021897          -0.448886          0.391937           -0.086657
Kurtosis                 6.133925           6.487253           4.839026           6.362086
Maximum loss             -0.4748            -0.4397            -0.244             -0.1714
Period of maximum loss   02/27/85-09/02/92  02/27/85-12/31/87  06/15/89-09/02/92  02/15/93-12/31/96
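The measures reported in Table 1 are straightforward to reproduce. The sketch below is a minimal Python illustration of our own (not code from the paper); the input array `prices` is a hypothetical daily price series, returns are taken as natural log differences, and the maximum loss is computed as the largest cumulative decline from a running peak:

```python
import numpy as np

def summary_stats(prices):
    """Statistics in the spirit of Table 1 (illustrative sketch only)."""
    prices = np.asarray(prices, dtype=float)
    r = np.diff(np.log(prices))                # daily log returns
    n = r.size
    mean, std = r.mean(), r.std(ddof=1)
    t_ratio = mean / (std / np.sqrt(n))        # tests H0: mean return = 0
    yearly = np.exp(252 * mean) - 1            # effective yearly return

    cum = np.cumsum(r)                         # cumulative log-return path
    drawdown = cum - np.maximum.accumulate(cum)
    max_loss = np.exp(drawdown.min()) - 1      # largest consecutive decline

    skewness = ((r - mean) ** 3).mean() / std ** 3
    kurtosis = ((r - mean) ** 4).mean() / std ** 4
    return {"N": n, "yearly": yearly, "mean": mean, "std": std,
            "t-ratio": t_ratio, "skewness": skewness,
            "kurtosis": kurtosis, "max loss": max_loss}
```

The t-ratio here is the usual mean-over-standard-error statistic used in the table to test whether the mean return differs from zero.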


Table 2. Autocorrelation functions of daily returns. Autocorrelations are estimated up to order 20. a, b, c: significance at the 1%, 5%, 10% significance level with Bartlett (1946) standard errors. ***, **, *: significance at the 1%, 5%, 10% significance level with Diebold (1986) heteroskedasticity-consistent standard errors.

CSCE
 k   83:1-97:6     83:1-87:12   88:1-92:12   93:1-97:6
 1   -0.0007       0.0328       -0.0112      -0.0277
 2   -0.0515a***   -0.0611b*    -0.0524c     -0.0438
 3   0.0038        0.0004       0.0086       -0.0036
 4   -0.0023       -0.0007      0.0031       -0.017
 5   0.0106        -0.012       0.0141       0.0314
 6   -0.0192       -0.0263      0.0022       -0.0519c
 7   -0.0065       -0.0155      -0.0101      0.0115
 8   0.0062        -0.0499c     0.0255       0.0344
 9   -0.0072       0.005        -0.0167      -0.0078
 10  -0.0014       -0.0387      0.0094       0.0265
 11  -0.024        -0.0352      -0.022       -0.0162
 12  -0.018        0.0431       -0.0613b**   -0.0236
 13  -0.0135       -0.0112      -0.0008      -0.046
 14  0.0052        0.0372       0.0005       -0.0302
 15  0.0193        0.0024       0.0437       -0.0041
 16  -0.0141       0.0049       -0.0377      0.001
 17  -0.0076       0.0312       -0.0384      0.0011
 18  0.0156        -0.0295      0.0565b*     0.0003
 19  0.0093        -0.005       0.0135       0.0194
 20  0.0135        -0.0083      0.0475c      -0.0243

LIFFE
 k   83:1-97:6     83:1-87:12   88:1-92:12   93:1-97:6
 1   0.0300c       0.0083       0.0456       0.0253
 2   -0.0378b**    -0.0178      -0.0437      -0.0567c*
 3   0.0122        0.0538c*     0.0155       -0.047
 4   0.0368b*      -0.0065      0.0493c      0.0671b*
 5   0.0163        0.0605b*     -0.0027      -0.0048
 6   -0.0279c      0.0016       -0.026       -0.0704b**
 7   -0.0087       -0.0193      -0.036       0.0454
 8   0.0066        -0.0068      0.0188       -0.0063
 9   0.0217        0.0202       0.0293       0.0041
 10  0.0398b**     -0.0198      0.0662b**    0.0654b*
 11  0.0001        -0.0012      -0.0216      0.026
 12  -0.0173       0.0409       -0.0649b**   -0.0168
 13  -0.0011       0.0471c      -0.0131      -0.0426
 14  0.0176        0.0002       0.0444       -0.0098
 15  0.0151        0.0357       0.0239       -0.0223
 16  0.0098        0.1279a***   -0.0775a**   0.004
 17  -0.0193       -0.0307      -0.0257      0.0054
 18  0.004         -0.0209      0.0488c      -0.0287
 19  0.0399b**     0.0089       0.0433       0.0669b**
 20  0.0072        -0.0306      0.0221       0.0152

BPDo
 k   83:1-97:6     83:1-87:12   88:1-92:12   93:1-97:6
 1   0.0833a***    0.1025a***   0.1085a***   -0.0132
 2   0.0241        0.0201       0.0165       0.0477
 3   -0.0158       -0.0099      -0.0192      -0.0151
 4   0.0016        -0.0313      0.0359       -0.0029
 5   0.0343b*      0.0266       0.0958a***   -0.0605b
 6   -0.0034       0.0286       -0.0135      -0.0411
 7   -0.0303c      -0.0081      -0.0598b**   -0.022
 8   0.0280c       0.0479c      0.025        -0.0074
 9   0.0121        -0.0221      0.0357       0.0299
 10  -0.0048       -0.0570b*    0.0414       0.0158
 11  -0.0021       -0.0127      0.0203       -0.0246
 12  -0.0203       -0.0439      -0.0068      -0.0044
 13  -0.0079       -0.0087      0.0031       -0.0114
 14  0.0268        0.0211       0.0386       0.0128
 15  0.0305c       0.0527c      0.0478c      -0.0641b*
 16  -0.0009       -0.0305      0.0277       -0.0079
 17  0.0131        -0.0053      0.0085       0.0487
 18  -0.0341b*     -0.0051      -0.0635b**   -0.0059
 19  -0.0131       0.0143       -0.01        -0.0366
 20  0.0103        0.0232       0.0035       -0.0177

Table 3. Results of the best technical forecasting rules applied to the CSCE quoted cocoa futures price. Panel A shows the results of the best five technical forecasting rules applied to the CSCE cocoa continuation series in the period 1983:1-1997:6; Panel B lists the results of the best forecasting rule in each subperiod. The first column lists the rules' parameters; MA, TRB and FR abbreviate the moving average, trading range break-out and filter forecasting rules, while %b, td, fhp and stl abbreviate the %-band filter, the time delay filter, the fixed holding period and the stop-loss. The column r¯Yexc shows the effective yearly excess return, that is r¯Yexc = exp{252 r¯exc} − 1; the column r¯exc lists the mean daily excess return net of 0.1% transaction costs, with texc in parentheses. Nbuy and Nsell list the numbers of buy and sell days, with the numbers of buy and sell signals in parentheses. Buy>0 and Sell>0 show the fraction of buy and sell days on which the trading postures earn a strictly positive excess return, with the corresponding fractions of buy and sell signals in parentheses. The last three columns list r¯buy, r¯sell and r¯buy − r¯sell, with tbuy, tsell and tbuy−sell in parentheses.

Panel A: Full sample (1983:1-1997:6), best five strategies

Strategy              r¯Yexc  r¯exc              Nbuy        Nsell      Buy>0          Sell>0         r¯buy              r¯sell                r¯buy − r¯sell
[TRB 5, 2%, 3, 50]    0.1038  0.00039 (1.71767)  1450 (28)   950 (19)   0.517 (0.5)    0.737 (0.737)  0.00056 (1.20175)  -0.00101 (-1.96037)   0.00158 (2.25978)
[FR 1%]               0.0935  0.00036 (1.63535)  1150 (111)  1001 (97)  0.478 (0.477)  0.649 (0.649)  0.00068 (1.33110)  -0.00118 (-2.23745)   0.00187 (2.53246)
[TRB 15, 2%, 10, 50]  0.0832  0.00032 (1.60974)  1000 (20)   750 (15)   0.65 (0.65)    0.733 (0.733)  0.00046 (0.85819)  -0.00126 (-1.93654)   0.00172 (2.03745)
[FR 1.5%, 5, 25]      0.0782  0.00030 (1.37787)  1117 (46)   1050 (40)  0.62 (0.63)    0.69 (0.7)     0.00041 (0.81153)  -0.00105 (-2.04345)   0.00146 (2.02312)
[FR 8%, 3, 50]        0.0755  0.00029 (1.36795)  1270 (26)   752 (16)   0.567 (0.577)  0.801 (0.813)  0.00034 (0.71020)  -0.00117 (-1.80930)   0.00151 (1.87465)

Panel B: Subperiods, best strategy

1983-1987  [FR 0.5%, 25]   0.2016  0.00073 (1.82085)  429 (9)   630 (13)  0.767 (0.778)  0.635 (0.615)  0.00158 (1.99680)  -0.00057 (-0.94078)  0.00215 (2.15951)
1988-1992  [MA 1,2]        0.2156  0.00078 (1.45504)  652 (21)  560 (21)  0.617 (0.524)  0.732 (0.762)  0.00022 (0.29297)  -0.00221 (-2.76492)  0.00243 (2.22314)
1993-1997  [FR 1%, 3, 10]  0.2105  0.00076 (2.12162)  385 (38)  311 (30)  0.481 (0.5)    0.74 (0.7)     0.00157 (2.01350)  -0.00145 (-1.66264)  0.00302 (2.58211)

Table 4. Results of the best technical forecasting rules applied to the LIFFE quoted cocoa futures price. Panel A shows the results in the period 1983:1-1997:6. Panel B shows the results of the best technical forecasting rule in each of the three subperiods: 1983:1-1987:12, 1988:1-1992:12 and 1993:1-1997:6. For a description of the content see Table 3.

Panel A: Full sample (1983:1-1997:6), best five strategies

Strategy             r¯Yexc  r¯exc              Nbuy        Nsell       Buy>0          Sell>0         r¯buy              r¯sell               r¯buy − r¯sell
[MA 1,40, 0.5%]      0.1495  0.00055 (2.60018)  1505 (70)   2168 (69)   0.728 (0.357)  0.816 (0.391)  0.00059 (1.50434)  -0.00092 (-3.25995)  0.00151 (3.13221)
[MA 5,50, stl 7.5%]  0.148   0.00055 (3.05696)  1102 (33)   1914 (32)   0.691 (0.424)  0.869 (0.625)  0.00057 (1.32464)  -0.00104 (-3.59837)  0.00161 (3.10298)
[MA 1,30, 0.5%]      0.1459  0.00054 (2.55433)  1489 (41)   2184 (40)   0.666 (0.415)  0.748 (0.425)  0.00052 (1.31265)  -0.00087 (-3.10160)  0.00139 (2.85916)
[MA 1,30, stl 7.5%]  0.1442  0.00053 (2.50284)  1485 (130)  2170 (129)  0.692 (0.292)  0.796 (0.287)  0.00067 (1.72202)  -0.00097 (-3.36952)  0.00164 (3.38854)
[MA 10,75, 0.1%]     0.1422  0.00053 (2.46005)  1510 (118)  2163 (117)  0.689 (0.297)  0.791 (0.308)  0.00063 (1.61817)  -0.00096 (-3.34207)  0.00158 (3.28657)

Panel B: Subperiods, best strategy

1983-1987  [MA 2,40, stl 5%]  0.3547  0.00121 (3.88278)  388 (19)  691 (19)  0.866 (0.526)  0.925 (0.579)  0.00211 (2.79621)  -0.00146 (-3.19459)  0.00357 (4.04603)
1988-1992  [FR 4%, 5, 50]     0.2517  0.00089 (2.53028)  498 (10)  417 (9)   0.703 (0.7)    1 (1)          0.00078 (1.16035)  -0.00218 (-2.72535)  0.00296 (2.83447)
1993-1997  [FR 0.5%, 2, 50]   0.1923  0.00070 (1.98085)  400 (8)   675 (13)  0.75 (0.75)    0.75 (0.769)   0.00128 (1.92902)  -0.00069 (-1.46915)  0.00197 (2.42358)

Table 5. Results of the best technical forecasting rules applied to the Pound-Dollar exchange rate. Panel A shows the results in the period 1983:1-1997:6. Panel B shows the results of the best technical forecasting rule in each of the three subperiods: 1983:1-1987:12, 1988:1-1992:12 and 1993:1-1997:6. For a description of the content see Table 3.

Panel A: Full sample (1983:1-1997:6), best five strategies

Strategy             r¯Yexc  r¯exc              Nbuy      Nsell     Buy>0          Sell>0  r¯buy              r¯sell               r¯buy − r¯sell
[TRB 100, 1%, 50]    0.0164  0.00007 (1.93088)  215 (5)   250 (5)   0.767 (0.8)    0 (0)   0.00161 (2.75658)  -0.00017 (-0.36090)  0.00178 (2.38333)
[TRB 50, 1%, 50]     0.0127  0.00005 (1.42095)  350 (7)   400 (8)   0.571 (0.571)  0 (0)   0.00097 (2.55344)  -0.00008 (-0.20080)  0.00105 (1.94510)
[TRB 5, 1.5%, 10]    0.0126  0.00005 (1.68605)  160 (16)  160 (16)  0.563 (0.563)  0 (0)   0.00175 (2.51482)  -0.00060 (-0.87011)  0.00235 (2.40450)
[TRB 250, td 2, 25]  0.0115  0.00005 (1.95839)  125 (5)   125 (5)   0.6 (0.6)      0 (0)   0.00184 (2.66970)  -0.00108 (-1.80209)  0.00292 (3.19810)
[TRB 250, 0.1%, 25]  0.0115  0.00005 (1.95916)  125 (5)   125 (5)   0.8 (0.8)      0 (0)   0.00184 (2.67404)  -0.00119 (-1.95430)  0.00303 (3.29853)

Panel B: Subperiods, best strategy

1983-1987  [MA 20,40, stl 2%]  0.0333  0.00013 (1.34254)  398 (12)  268 (13)  0.827 (0.667)  0 (0)  0.00089 (2.79888)  -0.00040 (-0.96585)  0.00128 (2.47666)
1988-1992  [FR 0.5%, 5, 25]    0.0534  0.00021 (1.97336)  307 (13)  325 (13)  0.593 (0.615)  0 (0)  0.00141 (3.18590)  -0.00056 (-1.56473)  0.00197 (3.45882)
1993-1997  [MA 30,50, 10]      0.0221  0.00009 (1.53693)  130 (13)  130 (13)  0.615 (0.615)  0 (0)  0.00120 (2.38807)  -0.00013 (-0.25821)  0.00132 (1.88737)


Table 6. Statistical significance: simple t-ratios. The percentage of forecasting rules for which a statistically significantly (1) positive mean excess return, (2) negative mean excess return, (3) positive mean buy return, (4) negative mean sell return, (5) positive mean buy-sell difference, (6) both (3) and (4), (7) negative mean buy return, (8) positive mean sell return, (9) negative mean buy-sell difference and (10) both (7) and (8) is found. Significance is determined at the 10% significance level using a one-sided test, that is tc = 1.28. Results are reported for the CSCE and LIFFE cocoa continuation series and the Pound-Dollar exchange rate in the full sample period and the three subperiods.

Statistic                          Period   CSCE    LIFFE   BPDo
(1)  %(texc > tc)                  1        0.92    34.52   0.35
                                   2        1.85    6.31    4.78
                                   3        0.45    2.13    0.09
                                   Full     0.3     13.86   2.07
(2)  %(texc < -tc)                 1        24.17   5.87    27.11
                                   2        9.32    8.8     17.32
                                   3        32.26   11.28   66.02
                                   Full     33.72   7.4     62.32
(3)  %(tbuy > tc)                  1        1.27    26.73   12.42
                                   2        0.5     0.78    29.63
                                   3        0.86    4.46    0.52
                                   Full     0.62    6.86    13.08
(4)  %(tsell < -tc)                1        0.77    39.47   44.29
                                   2        44.57   54.62   7.73
                                   3        0.56    0.92    1.98
                                   Full     5.92    50.53   17.13
(5)  %(tbuy-sell > tc)             1        1.46    46.65   41.9
                                   2        1.96    8.13    26.13
                                   3        0.97    3.14    0.77
                                   Full     1.38    26.58   28.19
(6)  %(tbuy > tc ∧ tsell < -tc)    1        0.04    14.7    8.44
                                   2        0.02    0.49    4.11
                                   3        0.02    0.15    0.07
                                   Full     0.06    2.6     7.04
(7)  %(tbuy < -tc)                 1        9.42    3.46    6.35
                                   2        26.55   31.96   0.47
                                   3        7.98    2.34    17.84
                                   Full     32.13   4.76    1.96
(8)  %(tsell > tc)                 1        5.59    3.29    0.9
                                   2        1.01    3.38    13.26
                                   3        14.83   6.93    4.13
                                   Full     4.22    2.32    2.39
(9)  %(tbuy-sell < -tc)            1        13.54   3.29    2.34
                                   2        6.33    15      8.76
                                   3        19.91   5.96    16.64
                                   Full     15.09   2.76    2.35
(10) %(tbuy < -tc ∧ tsell > tc)    1        1.38    0.82    0.13
                                   2        0.28    1.27    0.32
                                   3        3.21    0.32    1.12
                                   Full     1.35    0.15    0.78


Table 12. Statistical significance when CSCE in Pounds and LIFFE in Dollars: simple t-ratios. The CSCE cocoa continuation series is changed to Pounds and the LIFFE cocoa continuation series is changed to Dollars. The table shows, for all groups of forecasting rules (MA, TRB, Filter, All), for the full sample period and for each of the three subperiods (1, 2 and 3), the values of the statistics that are listed in the first column. Significance is tested at the 10% significance level using a one-sided test, that is tc = 1.28.

                                     CSCE in Pounds                    LIFFE in Dollars
Statistic                  Period    MA      TRB     Filter  All       MA      TRB     Filter  All
%(texc > tc)               1         10.39   6.37    5.31    8.33      0.9     1.1     1.33    1.03
                           2         1.92    1.56    0.83    1.66      21.17   15.4    16.09   18.44
                           3         0.25    0.15    3.15    0.54      1.52    1.4     3.48    1.70
                           Full      0.8     0.6     1.16    0.77      1.19    0.95    2.99    1.31
%(texc < -tc)              1         6.01    9.23    13.6    8.07      14.66   8.83    13.43   12.37
                           2         11.07   13.45   20.23   12.96     3.08    7.53    14.1    5.94
                           3         30.84   39.14   16.09   32.3      13.68   16.86   13.76   14.89
                           Full      13.46   28.7    24.71   20.42     8.61    10.59   20.9    10.74
%(tbuy > tc)               1         5.21    5.92    10.95   6.13      1.38    2.16    1.66    1.70
                           2         0.62    0.45    1       0.6       0.47    3.16    4.64    1.94
                           3         0.36    0.4     2.82    0.65      2.79    4.62    7.96    4.05
                           Full      0.58    0.45    4.98    1.03      0.87    1.25    4.81    1.46
%(tsell < -tc)             1         28.95   11.09   5.14    19.65     1.01    1.15    1.99    1.18
                           2         22.11   15.4    20.23   19.39     81.58   49.67   47.43   65.91
                           3         1.12    0.25    2.49    0.95      0.76    0.1     1       0.54
                           Full      18.57   8.43    13.6    14.25     32.18   19.47   18.74   25.97
%(tbuy-sell > tc)          1         26.71   11.89   10.61   19.41     1.88    2.21    2.99    2.13
                           2         2.97    2.26    5.97    3.05      14.95   12.09   23.05   14.81
                           3         0.76    0.25    5.14    1.06      2.28    2.06    5.64    2.58
                           Full      2.82    1.51    6.3     2.73      5.5     3.11    9.78    5.10
%(tbuy > tc ∧ tsell < -tc) 1         0.9     0.2     0.17    0.56      0.14    0       0.33    0.11
                           2         0.11    0.05    0       0.07      0.07    1.86    0.83    0.82
                           3         0.04    0       0.5     0.07      0.18    0       0.66    0.17
                           Full      0.11    0.05    0.17    0.09      0.22    0.25    1.16    0.34
%(tbuy < -tc)              1         3.18    5.22    4.64    4.11      1.38    4.62    1.82    2.63
                           2         36.12   24.89   23.38   30.51     26.06   21.78   25.21   24.36
                           3         9.48    15.96   3.48    11.23     1.34    1.71    0.66    1.40
                           Full      13.54   22.98   15.09   17.24     7.82    7.23    13.6    8.26
%(tsell > tc)              1         2.46    5.47    6.3     4.02      1.7     1.76    3.32    1.91
                           2         0.54    7.43    1.82    3.25      0.25    3.31    0.66    1.44
                           3         3.98    11.74   9.45    7.49      9.7     19.92   16.92   14.33
                           Full      1.16    4.67    4.98    2.9       0.4     3.66    2.16    1.81
%(tbuy-sell < -tc)         1         3.66    5.67    4.48    4.5       5.36    3.51    5.64    4.71
                           2         16.58   10.89   9.78    13.71     4.42    5.17    7.96    5.10
                           3         17.41   26.84   8.46    19.93     7.09    9.48    8.13    8.11
                           Full      5.79    13.6    10.12   9.19      1.81    2.81    5.47    2.60
%(tbuy < -tc ∧ tsell > tc) 1         0.58    0.7     0.5     0.62      0.25    0.2     0.17    0.22
                           2         0.14    0.4     0.33    0.26      0.04    0.05    0       0.04
                           3         2.28    5.17    1.82    3.31      0.22    0.25    0.5     0.26
                           Full      0.4     0.75    0.83    0.58      0.07    0.05    0.33    0.09


B. Parameters of the Technical Forecasting Rules

This appendix presents the values of the parameters of the set of technical forecasting rules that are applied in this paper. Most parameter values are equal to those used by Sullivan et al. (1999). Each basic forecasting rule can be extended by a %-band filter (band), a time delay filter (delay), a fixed holding period (fhp) and a stop-loss (sl). The total set consists of 5353 different forecasting rules, including rules that generate only one buy, sell or neutral signal at the start of the sample period and no signal thereafter.

Moving-Average Rules

n     = number of days over which the price must be averaged
band  = %-band filter
delay = number of days a signal must hold if a time delay filter is implemented
fhp   = number of days a trading posture is held, ignoring all other signals
sl    = %-rise (%-fall) from a previous low (high) to liquidate a short (long) position

n     = [1, 2, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250]
band  = [0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05]
delay = [2, 3, 4, 5]
fhp   = [5, 10, 25, 50]
sl    = [0.02, 0.03, 0.04, 0.05, 0.075, 0.10]

With the 16 values of n we can construct (16 choose 2) = 120 basic moving-average (MA) forecasting rules. We extend these rules with %-band filters, time delay filters, fixed holding periods and stop-losses. The values chosen above yield in total: 120 + 120*8 + 120*4 + 120*4 + 120*6 = 2760 MA rules.
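To make the signal mechanics concrete, the sketch below is a minimal Python illustration of our own (function and parameter names are ours, not the authors' code) of a basic MA rule extended with a %-band filter: a buy signal requires the shorter moving average to exceed the longer one by more than the band, and vice versa for a sell signal.

```python
import numpy as np

def ma_signals(prices, short_n, long_n, band=0.0):
    """MA rule [short_n, long_n] with %-band filter: minimal sketch.

    Returns +1 (buy) when the short MA exceeds the long MA by more than
    `band` (a fraction, e.g. 0.005 for 0.5%), -1 (sell) in the opposite
    case, and 0 (neutral) inside the band.
    """
    prices = np.asarray(prices, dtype=float)

    def moving_average(x, n):
        return np.convolve(x, np.ones(n) / n, mode="valid")

    m = len(prices) - max(short_n, long_n) + 1   # common length of both MAs
    short_ma = moving_average(prices, short_n)[-m:]
    long_ma = moving_average(prices, long_n)[-m:]

    signal = np.zeros(m, dtype=int)
    signal[short_ma > (1.0 + band) * long_ma] = 1    # buy region
    signal[short_ma < (1.0 - band) * long_ma] = -1   # sell region
    return signal
```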

Trading Range Break-out Rules

n     = length of the period to find local minima (support) and maxima (resistance)
band  = %-band filter
delay = number of days a signal must hold if a time delay filter is implemented
fhp   = number of days a trading posture is held, ignoring all other signals
sl    = %-rise (%-fall) from a previous low (high) to liquidate a short (long) position

n     = [5, 10, 15, 20, 25, 50, 100, 150, 200, 250]
band  = [0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05]
delay = [2, 3, 4, 5]
fhp   = [5, 10, 25, 50]
sl    = [0.02, 0.03, 0.04, 0.05, 0.075, 0.10]

With the parameters and values presented above the following trading range break-out (TRB) rules can be constructed:

basic TRB rules:                         10*1    = 10
TRB with %-band filter:                  10*8    = 80
TRB with time delay filter:              10*4    = 40
TRB with fixed holding period:           10*4    = 40
TRB with stop-loss:                      10*6    = 60
TRB with %-band and time delay filter:   10*8*4  = 320
TRB with %-band and fixed holding:       10*8*4  = 320
TRB with %-band and stop-loss:           10*8*6  = 480
TRB with time delay and fixed holding:   10*4*4  = 160
TRB with time delay and stop-loss:       10*4*6  = 240
TRB with fixed holding and stop-loss:    10*4*6  = 240

This will yield in total 1990 TRB rules.
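The break-out logic itself can be sketched in the same illustrative style (again our own rendering under assumed names, not the authors' implementation): the maximum and minimum of the previous n days act as resistance and support levels, and a signal fires when the price breaks through either by more than the %-band.

```python
import numpy as np

def trb_signals(prices, n, band=0.0):
    """Trading range break-out rule [TRB n, %-band]: minimal sketch."""
    prices = np.asarray(prices, dtype=float)
    signals = np.zeros(prices.size, dtype=int)
    for t in range(n, prices.size):
        resistance = prices[t - n:t].max()   # local maximum of last n days
        support = prices[t - n:t].min()      # local minimum of last n days
        if prices[t] > (1.0 + band) * resistance:
            signals[t] = 1                   # upward break-out: buy
        elif prices[t] < (1.0 - band) * support:
            signals[t] = -1                  # downward break-out: sell
    return signals
```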

Filter Rules

filt  = %-rise (%-fall) from a previous low (high) to generate a buy (sell) signal
delay = number of days a signal must hold if a time delay filter is implemented
fhp   = number of days a trading posture is held, ignoring all other signals

filt  = [0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.12, 0.14, 0.16, 0.18, 0.2, 0.25, 0.3, 0.4, 0.5]
delay = [2, 3, 4, 5]
fhp   = [5, 10, 25, 50]

With the parameters and values presented above the following filter rules (FR) are constructed:

basic FR:                              24*1    = 24
FR with time delay:                    24*4    = 96
FR with fixed holding:                 24*4    = 96
FR with time delay and fixed holding:  24*4*4  = 384

This will yield in total 600 filter rules.
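The rule counts follow mechanically from the grid sizes above (16 MA lengths, 10 TRB periods, 24 filters, 8 bands, 4 delays, 4 holding periods, 6 stop-losses). A few lines of Python bookkeeping, a sketch of the combinatorics rather than anyone's production code, confirm the subtotals:

```python
from math import comb

band, delay, fhp, sl = 8, 4, 4, 6            # extension grid sizes

ma_total = comb(16, 2) * (1 + band + delay + fhp + sl)        # 120 * 23

trb_total = 10 * (1 + band + delay + fhp + sl                 # single filters
                  + band * delay + band * fhp + band * sl     # pairs with %-band
                  + delay * fhp + delay * sl + fhp * sl)      # remaining pairs

fr_total = 24 * (1 + delay + fhp + delay * fhp)

print(ma_total, trb_total, fr_total)   # 2760 1990 600
```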

References

[1] Achelis, S.B. (1995), Technical Analysis from A to Z, McGraw-Hill, New York.

[2] Alexander, S.S. (1961), Price Movements in Speculative Markets: Trends or Random Walks, Industrial Management Review 2, 7-26.

[3] Bartlett, M.S. (1946), On the Theoretical Specification of Sampling Properties of Autocorrelated Time Series, Journal of the Royal Statistical Society, Series B, 8, 27-41.

[4] Bera, A.K. and Higgins, M.L. (1993), ARCH Models: Properties, Estimation and Testing, Journal of Economic Surveys 7, 305-362.


[5] Bollerslev, T. and Wooldridge, J.M. (1992), Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time Varying Covariances, Econometric Reviews 11, 143-172.

[6] Brock, W.A., Lakonishok, J. and LeBaron, B. (1992), Simple Technical Trading Rules and the Stochastic Properties of Stock Returns, The Journal of Finance 47, 1731-1764.

[7] Curcio, R., Goodhart, C., Guillaume, D. and Payne, R. (1997), Do Technical Trading Rules Generate Profits? Conclusions from the Intra-Day Foreign Exchange Market, International Journal of Finance and Economics 2, 267-280.

[8] Dacorogna, M.M., Müller, U.A. and Pictet, O.V. (1991), A Measure of Trading Model Performance with a Risk Component, discussion paper by the O&A Research Group, MMD.1991-05-24.

[9] De Long, J.B., Shleifer, A., Summers, L.H. and Waldmann, R.J. (1989), The Size and Incidence of the Losses from Noise Trading, Journal of Finance 44, 681-696.

[10] De Long, J.B., Shleifer, A., Summers, L.H. and Waldmann, R.J. (1990), Noise Trader Risk in Financial Markets, Journal of Political Economy 98, 703-738.

[11] Diebold, F.X. (1986), Testing for Serial Correlation in the Presence of ARCH, Proceedings of the American Statistical Association, Business and Economic Statistics Section, 323-328.

[12] Edwards, R.D. and Magee, J. (1998), Technical Analysis of Stock Trends, Seventh Edition, second printing, John Magee, Inc.

[13] Efron, B. (1979), Bootstrap Methods: Another Look at the Jackknife, The Annals of Statistics 7, 1-26.

[14] Efron, B. and Tibshirani, R. (1986), Bootstrap Methods for Standard Errors, Confidence Intervals and Other Measures of Statistical Accuracy, Statistical Science 1, 54-77.

[15] Fama, E.F. (1970), Efficient Capital Markets: A Review of Theory and Empirical Work, Journal of Finance 25, 383-423.

[16] Fama, E. and French, K. (1988), Dividend Yields and Expected Stock Returns, Journal of Financial Economics 22, 3-27.

[17] Freedman, D. (1984), On Bootstrapping Two-Stage Least Squares Estimates in Stationary Linear Models, Annals of Statistics 12, 827-842.

[18] Freedman, D. and Peters, S. (1984a), Bootstrapping a Regression Equation: Some Empirical Results, Journal of the American Statistical Association 79, 97-106.

[19] Freedman, D. and Peters, S. (1984b), Bootstrapping an Econometric Model: Some Empirical Results, Journal of Business and Economic Statistics 2, 150-158.

[20] Griffioen, G.A.W. (2003), Technical Analysis in Financial Markets, PhD Thesis, University of Amsterdam, Tinbergen Institute Research Series no. 305.


[21] Hsieh, D.A. (1988), The Statistical Properties of Daily Foreign Exchange Rates: 1974-1983, Journal of International Economics 24, 129-145.

[22] Hull, J.C. (1998), Futures and Options Markets, 3rd edition, Prentice-Hall, London.

[23] Jensen, M.C. and Benington, G.A. (1969), Random Walks and Technical Theories: Some Additional Evidence, Journal of Finance 25, 469-482.

[24] LeBaron, B. (1993), Practical Comparisons of Foreign Exchange Forecasts, Neural Network World 6/93, 779-790.

[25] LeBaron, B. (2000a), The Stability of Moving-Average Technical Trading Rules on the Dow-Jones Index, Derivatives Use, Trading and Regulation 5, 324-338.

[26] Levich, R.M. and Thomas, L.R. (1993), The Significance of Technical Trading-Rule Profits in the Foreign Exchange Market: A Bootstrap Approach, Journal of International Money and Finance 12, 451-474.

[27] Lo, A.W. and MacKinlay, A.C. (1988), Stock Market Prices Do Not Follow Random Walks: Evidence from a Simple Specification Test, Review of Financial Studies 1, 41-66.

[28] Lo, A.W. and MacKinlay, A.C. (1997), Maximizing Predictability in the Stock and Bond Markets, Macroeconomic Dynamics 1, 102-134.

[29] Lo, A.W. and MacKinlay, A.C. (1999), A Non-Random Walk Down Wall Street, Princeton University Press, Princeton.

[30] Lo, A.W., Mamaysky, H. and Wang, J. (2000), Foundations of Technical Analysis: Computational Algorithms, Statistical Inference and Empirical Implementation, Journal of Finance 55, 1705-1722.

[31] Malkiel, B.G. (1973), A Random Walk Down Wall Street, W.W. Norton & Company, Inc., New York.

[32] Murphy, J.J. (1986), Technical Analysis of the Futures Markets, New York Institute of Finance, New York.

[33] Nelson, D.B. (1991), Conditional Heteroskedasticity in Asset Returns: A New Approach, Econometrica 59, 347-370.

[34] Pesaran, M.H. and Timmermann, A. (1995), Predictability of Stock Returns: Robustness and Economic Significance, Journal of Finance 50, 1201-1228.

[35] Pesaran, M.H. and Timmermann, A. (2000), A Recursive Modelling Approach to Predicting UK Stock Returns, Economic Journal 110, 159-191.

[36] Phillips, P.C.B. (1986), Understanding Spurious Regressions in Econometrics, Journal of Econometrics 33, 311-340.


[37] Pring, M. (1998), Introduction to Technical Analysis, McGraw-Hill, New York.

[38] Ready, M.J. (1997), Profits from Technical Trading Rules, Working paper, University of Wisconsin-Madison.

[39] Samuelson, P.A. (1965), Proof that Properly Anticipated Prices Fluctuate Randomly, Industrial Management Review 6, 41-49.

[40] Satterthwaite, F.E. (1946), Biometrics Bulletin 2, 110.

[41] Sullivan, R., Timmermann, A. and White, H. (1999), Data-Snooping, Technical Trading Rule Performance, and the Bootstrap, Journal of Finance 54, 1647-1691.

[42] Sullivan, R., Timmermann, A. and White, H. (1998), The Dangers of Data-Driven Inference: The Case of Calendar Effects in Stock Returns, London School of Economics, Discussion Paper 304.

[43] Taylor, M.P. and Allen, H. (1992), The Use of Technical Analysis in the Foreign Exchange Market, Journal of International Money and Finance 11, 304-314.

[44] White, H. (1980), A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica 48, 817-838.

[45] White, H. (2000), A Reality Check for Data Snooping, Econometrica 68, 1097-1126.

In: Progress in Financial Markets Research
Editors: C. Kyrtsou and C. Vorlow, pp. 71-82

ISBN: 978-1-61122-864-9
© 2012 Nova Science Publishers, Inc.

Chapter 4

WHEN NONRANDOMNESS APPEARS RANDOM: A CHALLENGE TO FINANCIAL ECONOMICS

Anastasios G. Malliaris 1,∗ and Mary E. Malliaris 2

1 Economics Area, Loyola University Chicago, 25 E. Pearson, Chicago, IL, US
2 Information Systems Area, Loyola University Chicago, 25 E. Pearson, Chicago, IL, US
∗ E-mail addresses: [email protected], [email protected]

4.1. Introduction

Determinism and randomness are the two pillars of scientific methodology. Ruhla (1992) argues that science, in its long historical evolution, has favored determinism. In other words, the search for an exact relationship between dependent and independent variables has received first priority from scientists who follow the deterministic tradition of Euclid, Newton and Leibniz. The probabilistic paradigm, which originated in the rigorous analysis of gambling games, has flourished during the past several decades as exact relationships have become more difficult to confirm.

Developments in operations research, management science and financial economics since World War II have reflected the evolution of scientific methodology in the physical sciences. The Marshallian static equilibrium price theory, the Walrasian dynamic tâtonnement general equilibrium, linear programming techniques, game theory and various other techniques all emphasized classical determinism. However, measurement errors, unobservable variables, incomplete models, the introduction of expectations and the admission of economic and business complexity, among other reasons, have swung the methodological pendulum towards probabilistic reasoning. This remarkable shift to probabilistic reasoning is quite evident in financial economics, with its key theories of market efficiency and derivatives pricing. The need to forecast an uncertain future variable for purposes of economic and financial planning has reinforced probabilistic methods. Such reasoning gave rise to statistical techniques and the establishment of the field of financial economics. The textbook by Campbell, Lo and MacKinlay (1997) on the econometrics of financial economics exemplifies the probabilistic reasoning in this area.

Although it is currently accepted by economists and financial analysts that there is a clear dichotomy between deterministic and probabilistic modeling, relatively recent developments in physical chaotic dynamics have shown that certain processes, while they appear to be random, need not in fact be random. It is the purpose of this paper, first, to review rapidly these ideas and, second, to consider a model that is deterministic and ask the fundamental question: "When does nonrandomness appear random?". Put differently, suppose that an exact, deterministic theoretical model is developed between certain variables: when or how can a financial economist conclude, by observing exact time series measurements of such variables, that these variables are random?

The remainder of the paper is organized as follows. Section 2 briefly contrasts the notion of deterministic and random models with an emphasis on financial economics, while section 3 presents the most famous deterministic system that behaves like a random one, i.e. the Lorenz equations. Our contribution is exposited in section 4, where we sample from the Lorenz equations and pose the question: when or how can a financial analyst uncover whether the model under analysis is deterministic or random? We illustrate in section 4 that it is possible for a financial economist to conclude that a model is random when actually it is not. An evaluation, summary and questions for further study are given in the last section.

4.2. Deterministic versus Random Models

Deterministic models consist of exact relationships. Abstracting from specific modelling considerations, the notion of determinism is clearly demonstrated in the relationship of a function:

    y = f(x)    (4.1)

where f denotes the set of ordered pairs (x, y). In other words, each x is unambiguously associated with a specific y, with such a y being equal to f(x). From simple calculus, where f : R → R, R denoting the real numbers, to multivariate calculus, differential equations, real analysis and functional analysis, the subject matter remains exact relationships between or among certain variables. These exact relationships can become quite complicated, particularly when such a relationship is between derivatives (i.e. differential equations) or even among functions themselves (i.e. functional analysis). Nevertheless, in all instances, such relationships are exact. From Euclid's geometry, to Newton's calculus and to today's advanced analysis, the subject matter of scientific investigations is determinism. Discovering, establishing, analyzing and understanding exact relationships among certain variables remains today's highest scientific goal, not only of mathematicians, but also of applied researchers, such as physicists and management scientists. Only after such a primary goal has not been reached do scientists consider second-best solutions by studying nondeterministic models. Such models are also called random or stochastic, and are mostly substitutes rather than competing alternatives for the deterministic truth.


Mathematics, which one could argue remains the most rigorous of human scientific efforts, demonstrates that, independent of its intrinsic interest, randomness is not an alternative of equal standing but a temporary substitute for determinism. From elementary probability, where one flips a coin, to measure-theoretic probability, the notion of a function prevails; what changes is the domain of the function. In probability, the domain is a random set, and a function that takes its values from a random set is called a random variable. Ruhla (1992) describes with great scientific care the relationship between these two methodologies by arguing that probability is a branch in the scientific tree of determinism.

The methodological debate between randomness and nonrandomness in financial economics has been extensive. Two very popular books, by Malkiel (2003) and Lo and MacKinlay (1999), review extensively the use of random and nonrandom techniques applied to the behavior of stock prices. A more rigorous approach to similar methodological issues is found in Campbell, Lo and MacKinlay (1997), cited in the previous section. While the efficient market hypothesis celebrated the methodology of random walks and martingales during the 1970s, studies such as Scheinkman and LeBaron (1989) and Hsieh (1989, 1991), followed by Lo (1991), Sengupta and Zheng (1995), Corazza and Malliaris (2002) and Kyrtsou and Terraza (2002, 2003), among others, have shown the merit of chaotic dynamics. Useful surveys of the chaotic dynamics methodology can be found in Brock and Malliaris (1989) and Brock, Hsieh and LeBaron (1992).

4.3. The Lorenz Equations

Our discussion thus far was carried out at the methodological level. In other words, in searching for causal relationships, a scientist may choose an exact or a random model. We have argued that exactness has been given priority in the applied sciences and in pure mathematics, while randomness is viewed as a temporary methodological substitute. How can we further strengthen our argument towards determinism?

Chaotic dynamics was developed precisely for this purpose: to demonstrate that there exist exact functions which generate very complicated trajectories that appear random. From the seminal work of Eckmann and Ruelle (1985) to the numerous texts about dynamics such as Devaney (1986) or Guckenheimer and Holmes (1983), scientists have exposited an exciting new branch of mathematics which reinforces determinism. Limitations of space do not allow us to describe in detail the key ideas, definitions and theorems of chaotic dynamics; Devaney (1986) presents the essential elements, while Guckenheimer and Holmes (1983) treat the subject at a more advanced level. Here, for the sake of continuity, we give the fundamental definition of chaotic dynamics. We say that a function f : R → R is chaotic if it satisfies three conditions: (a) f is topologically transitive, (b) f has sensitive dependence on initial conditions, and (c) f has periodic points that are dense in the real numbers. May (1976) gives several examples of chaotic maps, while Guckenheimer and Holmes (1983) discuss in detail the mathematical properties of such maps. The Lorenz (1963) equations are the most famous example of a system that generates chaotic dynamics. They are:

    xt = s(−xt−1 + yt−1)                  (4.2)
    yt = r xt−1 − yt−1 − xt−1 zt−1        (4.3)
    zt = −b zt−1 + xt−1 yt−1              (4.4)

This system of equations is represented here by difference equations. They can also be expressed as a system of differential equations, as initially derived by Lorenz (1963) in his meteorological study of a three-equation approximation to the motion of a layer of fluid heated from below. Observe that there are three parameters, s, r and b. More specifically, the parameter r corresponds to the Reynolds number and, as it varies, the system goes through remarkable qualitative changes. For the parameter values b = 2.667, r = 28.0 and s = 10.0, almost all solutions converge to a set called the strange attractor. Furthermore, once on the attractor, these solutions exhibit random-like behavior. An exhaustive analysis of the numerous properties of these equations may be found in Sparrow (1982).

Malliaris (1993) hypothesizes that the S&P 500 Index follows chaotic dynamics and uses neural networks to confirm this hypothesis. Malliaris and Stein (1999) give a detailed financial interpretation of the Lorenz system and perform an econometric estimation using futures data. Papers by Brock and Hommes (1998), Lux (1998) and Chiarella, Dieci and Gardini (2000) offer theoretical models of chaotic dynamics.

4.4. The Experiment

We are now in a position to describe our contribution. Using the software Phaser developed by Kocak (1989), we generate 5000 observations using the previous system of equations with the parameter values indicated above. The software generates these values for a choice of two numerical approximation methods, i.e. Euler and Runge-Kutta, and for certain values of the step size in the approximation. Initial values are also needed for the three variables; the values used in our experiment are x(0) = y(0) = z(0) = 5. Notice that for an interval [0, T], Phaser selects a finite number of points [0, t1, ..., tk, ..., T] which, for simplicity, are chosen to be equally spaced. The distance h = tk+1 − tk between two consecutive points is called the step size. By selecting a very small step size, say 0.01 instead of a larger one such as 0.1, the numerical approximation becomes more accurate. Of course, such accuracy depends on the particular numerical approximation; Kocak (1989) compares both methods and concludes that the Runge-Kutta approximation is more accurate than the Euler approach. Our calculations are performed using the Runge-Kutta approximation with a step size of 0.1, unless otherwise specified.

The next concept we wish to discuss is the idea of a jump. When, for example, the jump = 1, solutions are plotted at every step. If the jump = 10, then solutions are plotted at every tenth point. In other words, selecting a jump of 100 means that, although all the necessary calculations are performed, only each hundredth numerical value is sampled. In our experiment, we use jumps of 1, 10, or 100 to check whether the techniques used are capable of identifying the deterministic structure of the Lorenz map.
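For readers without access to Phaser, the design is easy to reproduce. The sketch below is a Python illustration of our own (not the Phaser code) that integrates the Lorenz system with the classical fourth-order Runge-Kutta scheme, using the parameter values and initial conditions stated above, and records every jump-th value of x:

```python
import numpy as np

def lorenz_rhs(u, s=10.0, r=28.0, b=2.667):
    """Right-hand side of the Lorenz system for state u = (x, y, z)."""
    x, y, z = u
    return np.array([s * (-x + y),
                     r * x - y - x * z,
                     -b * z + x * y])

def simulate_x(n_obs=5000, h=0.1, jump=1, u0=(5.0, 5.0, 5.0)):
    """Classical Runge-Kutta integration; keep every jump-th x value."""
    u = np.array(u0, dtype=float)
    xs = np.empty(n_obs)
    for i in range(n_obs * jump):
        k1 = lorenz_rhs(u)
        k2 = lorenz_rhs(u + 0.5 * h * k1)
        k3 = lorenz_rhs(u + 0.5 * h * k2)
        k4 = lorenz_rhs(u + h * k3)
        u = u + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        if (i + 1) % jump == 0:
            xs[(i + 1) // jump - 1] = u[0]
    return xs

detailed = simulate_x(jump=1)      # every value
frequent = simulate_x(jump=10)     # every tenth value
infrequent = simulate_x(jump=100)  # every hundredth value
```

Note that, as in the experiment, the three samples have the same length (5000 observations), so larger jumps correspond to longer time intervals.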


Before we describe our three data sets, and to further motivate our experiment, consider Figures 4.1 and 4.2. In Figure 4.1, we plot the time series of the x-variable for t in [0, 100] when the step size is chosen to be 0.001 and the jump is equal to 1. The strange attractor is clearly visible and the time series does not appear very random. On the other hand, in Figure 4.2, for a smaller step size 0.01 and a much larger jump = 100, both the time series of x and its strange attractor lose their structure and appear random-like. Although these two figures do not constitute evidence of randomness, it is instructive to observe that infrequent sampling, by jumping over detailed information, misses the underlying structure of the population data.

Having made the above clarifications, our experiment can now be described. Using the deterministic Lorenz equations, with a step size of 0.1, we generated three sets of 5000 observations each. The first set records each value of the variable x generated by the Lorenz equations; in other words, the first set has jump = 1. The jumps of the second and the third set are 10 and 100 respectively. The exact size of these two jumps is not critical; other numbers such as 20 and 50, or 250 and 500, etc., could have been chosen. What we wish to illustrate is three levels of information: all values, every tenth value and every hundredth value, where these three procedures correspond to detailed sampling, frequent sampling and infrequent sampling. Obviously, to keep the number of observations the same, the interval of the second set is longer than the first, and the third is longer than the second.

We next ask the fundamental question: what methods are available to the decision scientist to allow him/her to distinguish whether a data set of observations is generated by a deterministic or a random function? Scientists from various backgrounds have researched this question extensively; key references are Grassberger and Procaccia (1983), Takens (1985) and Brock, Hsieh and LeBaron (1992). For our purposes, we will briefly exposit the two main techniques, namely the correlation dimension and the BDS test. Then, in Tables 4.1 and 4.2, we will present the results of these two tests.

The correlation dimension was originally proposed by Grassberger and Procaccia (1983). Suppose that we are given a time series of price changes {x(t) : t = 0, 1, 2, ..., T}, where T is large enough so that a strange attractor has begun to take shape. Use this time series to create pairs, x2(t) = [x(t), x(t+1)], then triplets, and finally M-histories, xM(t) = [x(t), ..., x(t+M−1)], for t = 0, 1, 2, ..., T. In other words, we convert the original time series of singletons into vectors of dimension 2, 3, ..., M. In generating these vectors, we allow for overlapping entries. For example, if M = 3, we have a set of the form {[x(0), x(1), x(2)], [x(1), x(2), x(3)], ..., [x(T−2), x(T−1), x(T)]}. Such a set will have (T+1) − (M−1) vectors. Mathematically, the process of creating vectors of various dimensions from the original series is called an embedding.

Suppose that for a given embedding dimension, say M, we wish to measure whether these M-vectors fill the entire M-space or only a fraction of it. For a given ε > 0, define the correlation integral, denoted by CM(ε), to be

    CM(ε) = (number of pairs (s, t), 1 ≤ s, t ≤ TM, with ‖xM(s) − xM(t)‖ < ε) / TM²    (4.5)

where TM = (T+1) − (M−1) and, as before, xM(t) = [x(t), x(t+1), ..., x(t+M−1)].


Observe that ‖·‖ in (4.5) denotes the vector norm. Using the correlation integral, we can define the correlation dimension for an embedding dimension M as

    DM = lim(ε→0, T→∞)  ln CM(ε) / ln(ε)    (4.6)

where ln denotes the natural logarithm. Finally, the correlation dimension D is given by

    D = lim(M→∞) DM    (4.7)
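In applications the limits in (4.6) are replaced by finite approximations, as the remark below explains. The following sketch is a naive Python illustration of the Grassberger-Procaccia procedure of our own making (not the code behind Table 4.1): it estimates CM(ε) with the sup norm on M-histories and proxies DM by the slope of ln CM(ε) between two nearby radii.

```python
import numpy as np

def correlation_integral(x, M, eps):
    """C_M(eps) of (4.5): fraction of pairs of M-histories whose
    sup-norm distance is below eps (self-pairs included, matching the
    T_M**2 normalisation in the text).  Naive O(T_M^2) computation."""
    x = np.asarray(x, dtype=float)
    T_M = len(x) - M + 1
    emb = np.column_stack([x[j:j + T_M] for j in range(M)])  # M-histories
    count = 0
    for s in range(T_M):
        d = np.abs(emb - emb[s]).max(axis=1)  # sup-norm distances to row s
        count += int((d < eps).sum())
    return count / float(T_M ** 2)

def dimension_estimate(x, M, eps, shrink=0.5):
    """Finite-sample proxy for D_M in (4.6): slope of ln C_M(eps)
    against ln(eps) between the radii eps and shrink*eps."""
    c_big = correlation_integral(x, M, eps)
    c_small = correlation_integral(x, M, shrink * eps)
    return np.log(c_big / c_small) / np.log(1.0 / shrink)
```

For a deterministic series such as the jump = 1 Lorenz sample, this estimate stabilises at a low value as M grows, whereas for a genuinely random series it keeps increasing with M, which is the diagnostic exploited in Table 4.1.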

We remark that technical accuracy requires that DM in (4.6) is a double limit, first in terms of T → ∞ and then in terms of ε → 0. However, in practice T is given and it is impossible to increase it to infinity; thus the limit T → ∞ is meaningless in practice, and moreover M is practically bounded by T. Therefore, we only consider the limit ε → 0 in (4.6).

Table 4.1 collects the results of the correlation dimension analysis. Observe that there are three key columns of results corresponding to ε, 0.5ε and 0.1ε; these three columns attempt to numerically illustrate the limiting process in (4.6). Observe also that we offer seven rows for the embedding dimensions 2, 3, 4, 5, 10, 15 and 20. The results clearly demonstrate that for jump = 1 the correlation dimension analysis detects the deterministic structure, since the numbers for sample 1, for each of ε, 0.5ε and 0.1ε, are small and converge to a number between 1 and 2 (see the column 0.1ε for sample 1). On the other hand, the numbers for sample 10 are larger and, as ε decreases to 0.5ε and 0.1ε, these numbers do not converge (see the column 0.1ε for sample 10, where the number starts from 1.3611 and grows beyond 3.5810 to become indeterminate). Finally, for sampling every hundredth value, the numbers of the correlation dimension are even larger and diverge sooner than those for jump = 10. To summarize, a decision scientist would conclude that observations sampled from the x-variable of the Lorenz equations constitute a random set.

The second test we perform is the BDS, extensively presented in Brock, Hsieh and LeBaron (1992) and Brock, Dechert, Scheinkman and LeBaron (1996). These authors report that for an independent and identically distributed random process and for fixed M-histories and ε > 0,

    CM(ε, T) → [C1(ε)]^M as T → ∞    (4.8)

They further report that, as T approaches infinity,

    √T {CM(ε, T) − [C1(ε, T)]^M} → N(0, σM²(ε, T)),    (4.9)

where N denotes a normal distribution with mean zero and variance σM²(ε, T). From equations (4.8) and (4.9), it is concluded that

    √T {CM(ε, T) − [C1(ε, T)]^M} / σM(ε, T) → N(0, 1).    (4.10)
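A compact, if simplified, implementation of (4.10) is sketched below; this is our own Python illustration rather than the code behind Table 4.2. It uses the sup norm on M-histories and simple plug-in estimates of the constants entering the asymptotic variance of Brock, Dechert, Scheinkman and LeBaron (1996); memory is O(T²), which is fine for a few thousand observations.

```python
import numpy as np

def bds_statistic(x, M, eps):
    """BDS statistic of (4.10): illustrative sketch, not a full estimator."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    n = T - M + 1
    I = np.abs(x[:, None] - x[None, :]) < eps     # pairwise indicators, T x T

    within = np.ones((n, n), dtype=bool)          # joint event across M lags
    for j in range(M):
        within &= I[j:j + n, j:j + n]
    CM = within.mean()                            # C_M(eps, T)

    I1 = I[:n, :n]
    C = I1.mean()                                 # C_1(eps, T)
    K = (I1.sum(axis=1).astype(float) ** 2).sum() / n ** 3  # triple statistic

    # asymptotic variance (Brock et al., 1996), plug-in version
    V = 4.0 * (K ** M
               + 2.0 * sum(K ** (M - j) * C ** (2 * j) for j in range(1, M))
               + (M - 1) ** 2 * C ** (2 * M)
               - M ** 2 * K * C ** (2 * M - 2))
    return np.sqrt(n) * (CM - C ** M) / np.sqrt(V)

# Under the i.i.d. null the statistic is approximately N(0, 1); values
# outside [-1.96, 1.96] reject randomness at the 5% level.
```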


Table 4.2 has only two values which lie in the [-1.96, 1.96] interval of the standardized normal distribution. These are the values: -0.0107 corresponding to jump 10, 1.0ε and M = 3 and -1.6259 corresponding to jump 100, 1.0ε and M = 15. For all the other values, we reject the null hypothesis of randomness. Note that the BDS does not claim that our three samples of 5000 observations are deterministic because its alternative to the null hypothesis is not well specified. Thus, a researcher could not conclude that the underlying structure of our samples is deterministic; he or she could only reject, with two exceptions, the null hypothesis.

Figure 4.1. Time series of the x-variable of the Lorenz equations for t in [0, 100], with step size 0.001 and jump = 1.

Figure 4.2. Time series of the x-variable of the Lorenz equations and its attractor, with step size 0.01 and jump = 100.

Table 4.1. Dimension analysis DM for xt (5000 observations generated from the Lorenz equations with step 0.1). Note: ε = 0.2205 for sample = 1; ε = 0.2183 for sample = 10; ε = 0.2160 for sample = 100.

Times of ε             1.0ε                         0.5ε                         0.1ε
Value of ε             0.2205   0.2183   0.2160     0.11025  0.10915  0.108      0.02205  0.02183  0.0216
Sample                 1        10       100        1        10       100        1        10       100

Embedding dimension
2                      0.66157  0.90593  0.90765    0.84393  1.13420  1.17630    1.14340  1.36110  1.49650
3                      0.88411  1.35200  1.36670    1.03870  1.59770  1.77020    1.32140  1.77130  2.24430
4                      1.05840  1.76940  1.82400    1.18270  2.01190  2.36340    1.44140  2.08380  3.00400
5                      1.17740  2.13320  2.28270    1.27740  2.36910  2.95770    1.52080  2.36430  3.84110
10                     1.60950  3.99840  4.59000    1.69210  4.18990  5.93220    1.83130  3.58190  N/A
15                     2.00860  5.87520  6.85070    1.99530  5.97950  N/A        2.00990  N/A      N/A
20                     2.35560  7.68720  9.38890    2.26830  7.06120  N/A        2.16770  N/A      N/A

Table 4.2. BDS test for xt (5000 observations generated from the Lorenz equations with step 0.1). Note: ε = 0.2205 for sample = 1; ε = 0.2183 for sample = 10; ε = 0.2160 for sample = 100. * Reject null hypothesis of randomness.

Times of ε       1.0ε                                    0.5ε                                    0.1ε
Value of ε       0.2205      0.2183      0.2160          0.11025     0.10915     0.108           0.02205      0.02183     0.0216
Sample           1           10          100             1           10          100             1            10          100

Embedding dimension
2                1.5672E+2*  -2.4762*    -5.7504*        2.7225E+2*  2.1428E+1*  -5.9515*        6.1123E+2*   1.4718E+2*  -3.3343*
3                1.6288E+2*  -1.0658E-2  -5.0498*        4.3919E+2*  4.7617E+1*  -5.3501*        3.0973E+3*   4.8539E+2*  -1.9821*
4                1.9461E+2*  4.8161*     -4.1820*        7.9437E+2*  7.1756E+1*  -4.6143*        2.0468E+4*   1.7420E+3*  -4.0175*
5                2.5816E+2*  1.2546E+1*  -3.7614*        1.6643E+3*  1.0937E+2*  -4.1692*        1.7594E+5*   7.1521E+3*  -1.34E+1*
10               1.5363E+3*  2.2178E+1*  -2.8503*        1.2980E+5*  5.1375E+2*  -2.7778*        2.4482E+10*  3.2205E+7*  -1.12E+1*
15               1.2904E+4*  2.6654E+1*  -1.6259         2.0387E+7*  3.1108E+3*  -6.6069*        9.0690E+15*  -4.8504*    -5.2183*
20               1.3955E+5*  2.6288E+1*  2.6727*         4.0968E+9*  1.0592E+3*  -3.7957*        4.3345E+21*  -2.6627*    -2.8917*


Evaluation and Conclusion

This paper has reviewed the methodological foundations of deterministic and random modeling and argued that determinism remains the scientific goal of any investigation. Recent papers in economic theory, as in Brock and Hommes (1998), Lux (1998), Malliaris and Stein (1999) and Chiarella, Dieci and Gardini (2000), among others, show that even if the time series behavior of a given model looks random, its underlying structure may still be deterministic. Suppose that the underlying relationships are exact: what could account for our inability to detect such a structure and then to build models that would make such a structure explicit?

In this paper, we make a contribution by demonstrating that the currently available techniques for distinguishing between deterministic and random systems are not adequate. The correlation dimension performs well when every value of the Lorenz equation is sampled, but does poorly when the jump increases to 10 and then to 100. This illustrates that unless, in the real world, we can record information at high frequencies rather than at prespecified intervals, say end of the day, weekly, monthly, etc., we are bound to lose the underlying structure. Our experiment shows that infrequent sampling misses the deterministic relationship. Of course, data limitations may not allow a scientist to perform the tests we used; for example, using annual or quarterly data, one does not have enough observations to do dimension and BDS analysis. There is evidence provided by Ramsey, Sayers and Rothman (1990) that dimension calculations using small data sets are biased. However, the availability of massive data is rapidly becoming a reality, and such data can be conveniently analyzed by the techniques demonstrated. The BDS does very well rejecting randomness in our sample, but cannot specify the alternative.

Our overall conclusion is simply this: since WWII, the scientific pendulum in general, and in management science, financial economics and forecasting in particular, has been pulled away from determinism and brought towards stochasticity. But such stochasticity has not fully enriched our understanding of the real world, simply because what drives randomness often cannot be anticipated. Chaotic dynamics is not a totally new methodology, but rather a new way of affirming order, rationality and exactness despite the seemingly disorderly, unpredictable and random behavior of certain variables. This discovery of chaotic dynamics and our increasing understanding of the Lorenz equations offer valid alternatives to randomness.

References

[1] Brock, W. and Malliaris, A.G. (1989), Differential Equations, Stability and Chaos in Dynamic Economics, Elsevier, Amsterdam.

[2] Brock, W., Hsieh, D. and LeBaron, B. (1992), Nonlinear Dynamics, Chaos and Instability: Statistical Theory and Economic Evidence, Second Edition, The MIT Press, Cambridge, Massachusetts.

[3] Brock, W.A., Dechert, W.D., Scheinkman, J.A. and LeBaron, B. (1996), "A Test for Independence Based on the Correlation Dimension," Econometric Reviews 15, 197-235.


[4] Brock, W.A. and Hommes, C.H. (1998), "Heterogeneous Beliefs and Routes to Chaos in a Simple Asset Pricing Model," Journal of Economic Dynamics and Control 22, 1235-1274.

[5] Campbell, J., Lo, A. and MacKinlay, A. (1997), The Econometrics of Financial Markets, Princeton University Press, Princeton.

[6] Chiarella, C., Dieci, R. and Gardini, L. (2000), "Speculative Behavior and Complex Asset Price Dynamics," Journal of Economic Behavior and Organization 49, 173-197.

[7] Corazza, M. and Malliaris, A.G. (2002), "Multi-Fractality in Foreign Currency Markets," Multinational Finance Journal 6, 65-98.

[8] Devaney, R. (1986), An Introduction to Chaotic Dynamical Systems, Benjamin/Cummings, Menlo Park, California.

[9] Eckmann, J. and Ruelle, D. (1985), "Ergodic Theory of Chaos and Strange Attractors," Review of Modern Physics 57, 617-656.

[10] Grassberger, P. and Procaccia, I. (1983), "Measuring the Strangeness of Strange Attractors," Physica D 9, 189-208.

[11] Guckenheimer, J. and Holmes, P. (1983), Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer-Verlag, New York.

[12] Hsieh, D.A. (1989), "Testing Nonlinear Dependence in Daily Foreign Exchange Rates," Journal of Business, 339-368.

[13] Hsieh, D.A. (1991), "Chaos and Nonlinear Dynamics: Application to Financial Markets," The Journal of Finance 46, 1839-1877.

[14] Koçak, H. (1989), Differential and Difference Equations through Computer Experiments, Springer-Verlag, New York.

[15] Kyrtsou, C. and Terraza, M. (2002), "Stochastic Chaos or ARCH Effects in Stock Series? A Comparative Study," International Review of Financial Analysis 11, 407-431.

[16] Kyrtsou, C. and Terraza, M. (2003), "Is It Possible to Study Chaotic Dynamics and ARCH Behavior Jointly? Application of a Noisy Mackey-Glass Equation with Heteroskedastic Errors to the Paris Stock Exchange Returns Series," Computational Economics 21, 257-275.

[17] Lo, A. (1991), "Long-term Memory in Stock Market Prices," Econometrica 59, 1279-1313.

[18] Lo, A. and MacKinlay, A.C. (1999), A Non-Random Walk Down Wall Street, Princeton University Press, Princeton.


[19] Lorenz, E. (1963), "Deterministic Non-Periodic Flows," Journal of Atmospheric Sciences 20, 130-141.

[20] Lux, T. (1998), "The Socio-Economic Dynamics of Speculative Markets: Interacting Agents, Chaos and the Fat Tails of Return Distributions," Journal of Economic Behavior and Organization 33, 143-165.

[21] Malliaris, A.G. and Stein, J.L. (1999), "Methodological Issues in Asset Pricing: Random Walk or Chaotic Dynamics," Journal of Banking and Finance 23, 1605-1635.

[22] Malliaris, M. (1993), "Modeling the Behavior of the S&P 500 Index: A Neural Network Approach," Neurovest, Vol. 3, No. 3, May/June 1995, 16-21.

[23] Malkiel, B. (2003), A Random Walk Down Wall Street, 8th Edition, Norton, New York.

[24] May, R. (1976), "Simple Mathematical Models With Very Complicated Dynamics," Nature 261, 459-467.

[25] Ramsey, J.B., Sayers, C. and Rothman, P. (1990), "The Statistical Properties of Dimension Calculations Using Small Data Sets: Some Economic Applications," International Economic Review 31, 991.

[26] Rasmussen, D.R. and Mosekilde, E. (1988), "Bifurcations and Chaos in a Generic Management Model," European Journal of Operational Research 35, 80-88.

[27] Ruhla, C. (1992), The Physics of Chance: From Blaise Pascal to Niels Bohr, Oxford University Press, New York.

[28] Scheinkman, J.A. and LeBaron, B. (1989), "Nonlinear Dynamics and Stock Returns," Journal of Business 62, 311-337.

[29] Sengupta, J.K. and Zheng, Y. (1995), "Empirical Tests of Chaotic Dynamics in Market Volatility," Applied Financial Economics 5, 291-300.

[30] Sparrow, C. (1982), The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors, Springer-Verlag, New York.

[31] Sterman, J. (1989), "Modeling Managerial Behavior: Misperceptions of Feedback in a Dynamic Decision Making Experiment," Management Science 35, 321-339.

[32] Takens, F. (1985), "Distinguishing Deterministic and Random Systems," in Barenblatt, G., Iooss, G. and Joseph, D. (eds.), Nonlinear Dynamics and Turbulence, Pitman, Boston, 315-333.

In: Progress in Financial Markets Research
Editors: C. Kyrtsou and C. Vorlow, pp. 83-101

ISBN: 978-1-61122-864-9
© 2012 Nova Science Publishers, Inc.

Chapter 5

FINITE SAMPLE PROPERTIES OF TESTS FOR STGARCH MODELS AND APPLICATION TO THE US STOCK RETURNS

Gilles Dufrénot 1, Vêlayoudom Marimoutou 2 and Anne Péguin-Feissolle 3,∗

1 CEPII et DEFI, Université d'Aix-Marseille, France
2 GREQAM and Université d'Aix-Marseille, France
3 GREQAM, Centre de la Vieille Charité, 2 rue de la Charité, 13002 Marseille, France
∗ E-mail address: [email protected]

5.1. Introduction

Many papers have proposed variants of nonlinear GARCH models that take into account the presence of asymmetries in volatilities; among these models are the GJR model (Glosten, Jagannathan and Runkle (1993)), the QARCH model (Sentana (1995)), the model of Engle and Ng (1993), the volatility switching ARCH model (Fornari and Mele (1996, 1997)) and the Markov switching GARCH model (Hamilton and Lin (1996)). Recent proposals of asymmetric volatility models include Bekaert and Wu (2000), Wu (2001) and Wu and Xiao (2002). Despite their widespread use, it has been argued that many of these models lead to alternative and competing interpretations, and that one may want to construct a general framework that incorporates the many different facets of asymmetric and switching volatility models.

Among the different frameworks that have been proposed, smooth transition GARCH (STGARCH) models encompass many types of the asymmetries that are usually studied in the existing nonlinear GARCH and switching volatility literature. Since their introduction in the early nineties by Hagerud (1996a, 1996b), they have been applied to financial data (see for instance Hagerud (1997a, 1997b), Gonzales-Rivera (1998), Anderson, Nam and Vahid (1999) and Dufrénot, Marimoutou and Péguin-Feissolle (2002)). Lundbergh and Teräsvirta (1999, 2002) present a number of alternative misspecification tests for evaluating GARCH and STGARCH models (see also Hagerud (1996a and b, 1997a and b) and Gonzales-Rivera (1998)).

address: [email protected]


The aim of this paper is thus to assess the finite sample properties of Lagrange multiplier tests for STGARCH models. We extend the conclusions obtained so far in the literature in several ways. (i) We investigate the power and size of STGARCH tests by considering transition functions other than the logistic function; notably, the case of exponential functions is investigated. Logistic and exponential STGARCH (LSTGARCH and ESTGARCH) models accommodate regime shifts characterized by different levels of volatility, depending upon how far the value of the transition variable is from a given value. (ii) Following the approach commonly applied to STAR models, we use statistics based on Taylor expansions of the transition function of orders higher than the first two. The reason is that the nonlinear components of the transformed model must be rich enough to retain the characteristic features of the original model. We thereby derive test statistics based on first- and third-order approximations for the logistic STGARCH model and second- and fourth-order expansions for the exponential STGARCH model. (iii) Our results show a paradox, namely that the tests sometimes work best when the stationarity conditions are violated. For illustrative purposes, we further present an application to the S&P 500 stock returns. We depart from the empirical studies already done on the subject by assuming that the volatility response to shocks depends upon macroeconomic variables, and not on past innovations, because they convey more information about the transmission mechanism inducing asymmetry. We find that the LSTGARCH model outperforms other classical models such as the GARCH or GJR models in terms of forecasting. The rest of the paper is organized as follows. Section 2 briefly presents the STGARCH models and the proposed tests. Section 3 contains the results of our Monte Carlo experiments. Section 4 contains our empirical application.

5.2. STGARCH Models and Test Statistics

5.2.1. Smooth Transition GARCH Models

We deal with the basic specification of a smooth transition GARCH model. The reader interested in a more extensive presentation may consult the references cited above. We define the following smooth transition GARCH model (STGARCH(1,1)) for a time series {y_t}_{t=1}^T:

    y_t = E_{t−1}(y_t) + ε_t,    ε_t | ψ_{t−1}, z_t ∼ N(0, h_t),    (5.1)

where ψ_{t−1} is the conditioning information set up to t − 1, z_t is a variable described below, E_{t−1}(y_t) is the conditional mean of y_t, and h_t is the time-varying conditional variance:

    h_t = α_0 + [α_1 + α_2 F(z_t)] ε_{t−1}^2 + β h_{t−1},    (5.2)

where β and the α_i's are real parameters. F is a transition function, continuous and bounded, that is, a ≤ F(·) ≤ b with (a, b) ∈ R^2. Depending upon the formulation of the transition function, the system (5.1)-(5.2) defines either a logistic smooth transition GARCH (LSTGARCH(1,1)) or an exponential smooth transition GARCH (ESTGARCH(1,1)) model:

    logistic:     F(z_t) = [1 + exp(−γ(z_t − c))]^{−1} − 1/2    (γ > 0),    (5.3)

or

    exponential:  F(z_t) = 1 − exp(−γ(z_t − c)^2)    (γ > 0),    (5.4)

where c is a threshold parameter and γ a parameter that defines the smoothness of the transition between the different volatility regimes. The variable z_t is a control variable which drives the volatility behavior in the different regimes. Here, we assume that z_t is distributed as N(0, 1).
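To fix ideas, the following Python sketch (ours, not part of the original chapter; all function names are our own) implements the transition functions (5.3)-(5.4) and simulates ε_t and h_t from (5.1)-(5.2) under the simplifying assumptions of a zero conditional mean and z_t ∼ N(0, 1):

```python
import numpy as np

def logistic_F(z, gamma, c=0.0):
    # Logistic transition (5.3): bounded between -1/2 and 1/2.
    return 1.0 / (1.0 + np.exp(-gamma * (z - c))) - 0.5

def exponential_F(z, gamma, c=0.0):
    # Exponential transition (5.4): bounded between 0 and 1.
    return 1.0 - np.exp(-gamma * (z - c) ** 2)

def simulate_stgarch(T, alpha0, alpha1, alpha2, beta, gamma,
                     F=logistic_F, seed=0):
    """Draw (eps_t, h_t) from (5.1)-(5.2) with E_{t-1}(y_t) = 0.

    The parameters should satisfy the positivity conditions (5.13)-(5.14)
    given below, so that h_t stays strictly positive.
    """
    rng = np.random.default_rng(seed)
    nu = rng.standard_normal(T)   # nu_t ~ nid(0, 1)
    z = rng.standard_normal(T)    # z_t ~ N(0, 1), as assumed in the text
    eps, h = np.empty(T), np.empty(T)
    h[0] = alpha0 / max(1e-8, 1.0 - alpha1 - beta)  # start near the unconditional level
    eps[0] = nu[0] * np.sqrt(h[0])
    for t in range(1, T):
        h[t] = alpha0 + (alpha1 + alpha2 * F(z[t], gamma)) * eps[t-1]**2 + beta * h[t-1]
        eps[t] = nu[t] * np.sqrt(h[t])
    return eps, h
```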

This hypothesis has the following meaning with regard to finance theory. We consider that the shocks governing the switching of volatility are randomly distributed. Suppose that the STGARCH model is used by the economic agents to forecast future volatility. If z_t were known in advance, they could behave in such a way as to eliminate the changing regimes: one assumes that the market participants are rational; if a high volatility regime is anticipated, it is rational to hold one's present position. If all the agents adopt such a behavior, prices do not move much, thereby implying slow movements and thus low volatility dynamics. In other papers, it is assumed that z_t contains past information on the dynamics of the shocks. For instance, Hagerud (1996a and b, 1997a and b) and Gonzalez-Rivera (1998) consider that z_t = ε_{t−1}, with ε_t | ψ_{t−1} ∼ N(0, h_t), which would mean that the agents use naive or extrapolative forecasts. The problem is, however, that in this case they will seek to protect themselves against the negative effects of asymmetry (too high a volatility). If they do so, then their behavior will always yield the same regime (low volatility), and this is incompatible with the LSTGARCH specification.

Using He and Teräsvirta (1999)'s approach, we obtain the expressions of the second-order and fourth-order moments of ε_t:

• For the LSTGARCH(1,1) model:

    E(ε_t^2) = α_0 [1 − α_1 − max(α_2, 0) − β + α_2/2]^{−1},    (5.5)

    E(ε_t^4) = 3α_0^2 (1 + ξ_L + β) [(1 − ξ_L − β)(1 − 2βξ_L − 3ξ_L^2 − β^2)]^{−1},    (5.6)

where ξ_L = α_1 + max(α_2, 0) − α_2/2.

• For the ESTGARCH(1,1) model:

    E(ε_t^2) = α_0 [1 − α_1 − max(α_2, 0) − β]^{−1},    (5.7)

    E(ε_t^4) = 3α_0^2 (1 + ξ_E + β) [(1 − ξ_E − β)(1 − 2βξ_E − 3ξ_E^2 − β^2)]^{−1},    (5.8)

where ξ_E = α_1 + max(α_2, 0).

From these expressions, the second- and fourth-order stationarity conditions are easily deduced:

• For the LSTGARCH(1,1) model,

    E(ε_t^2) < ∞  if  α_1 + max(α_2, 0) + β − α_2/2 < 1,    (5.9)

    E(ε_t^4) < ∞  if  2βξ_L + 3ξ_L^2 + β^2 < 1.    (5.10)

• For the ESTGARCH(1,1) model,

    E(ε_t^2) < ∞  if  α_1 + max(α_2, 0) + β < 1,    (5.11)

    E(ε_t^4) < ∞  if  2βξ_E + 3ξ_E^2 + β^2 < 1.    (5.12)
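As a quick numerical illustration (our own Python sketch, continuing the conventions above), the conditions (5.9)-(5.10) can be checked directly for any parameter configuration; the two example calls use cases (1b) and (4b) of Table 2 below:

```python
def lstgarch_moment_conditions(alpha1, alpha2, beta):
    """Check the LSTGARCH(1,1) conditions (5.9) and (5.10)."""
    xi_L = alpha1 + max(alpha2, 0.0) - alpha2 / 2.0
    second = alpha1 + max(alpha2, 0.0) + beta - alpha2 / 2.0 < 1.0  # (5.9)
    fourth = 2.0 * beta * xi_L + 3.0 * xi_L**2 + beta**2 < 1.0      # (5.10)
    return second, fourth

print(lstgarch_moment_conditions(0.20, -0.40, 0.05))  # case (1b): (True, True)
print(lstgarch_moment_conditions(0.80, -1.00, 0.20))  # case (4b): (False, False)
```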

Further, the moments are strictly positive if the following sufficient conditions hold:

• For the LSTGARCH(1,1) model:

    α_0 > 0,  β ≥ 0,  α_1 ≥ 0,  α_1 ≥ max(α_2, 0) − α_2/2.    (5.13)

• For the ESTGARCH(1,1) model:

    α_0 > 0,  β ≥ 0,  α_1 ≥ 0,  α_1 ≥ max(α_2, 0).    (5.14)

5.2.2. LM Tests Based on Taylor Approximations

The hypothesis under consideration is the null of a GARCH(1,1) model against the alternative of a STGARCH(1,1) model (either LSTGARCH(1,1) or ESTGARCH(1,1)); the test can be formulated as follows: H_0: α_2 = 0 against H_1: α_2 ≠ 0. Under the null, the model is unidentified and we face a nuisance parameter problem. This problem can be treated in several ways. Here, we use a local equivalence approach based on an approximation procedure similar to the approach retained for STAR models by Luukkonen, Saikkonen and Teräsvirta (1988) (other methodologies, such as Davies (1977)'s or Hansen (1996)'s, can be applied to STGARCH models, as illustrated by Gonzalez-Rivera (1998)). Our approach allows the derivation of Lagrange multiplier (LM) statistics, which are simple to apply and computationally less expensive than likelihood ratio tests. Therefore, we consider Taylor expansions of orders higher than the first order for the functions (5.3) and (5.4). All the expansions are computed around z_t = c (for the sake of simplicity, we consider c = 0; relaxing this assumption does not change our results):

• For the LSTGARCH model, the first- and third-order approximations are:

    F(z_t) = γ z_t/4   and   F(z_t) = γ z_t/4 − γ^3 z_t^3/48.    (5.15)

• For the ESTGARCH model, the second- and fourth-order approximations are:

    F(z_t) = γ z_t^2   and   F(z_t) = γ z_t^2 − γ^2 z_t^4/2.    (5.16)

Inserting these expressions into the STGARCH models yields the following expressions of the conditional variances:


• For the LSTGARCH model, the first-order approximation yields:

    h̃_t = α_0 + [α_1 + μ z_t] ε_{t−1}^2 + β h̃_{t−1},   μ = α_2 γ/4,    (5.17)

and the third-order approximation gives:

    h̃_t = α_0 + [α_1 + μ z_t + φ z_t^3] ε_{t−1}^2 + β h̃_{t−1},   μ = α_2 γ/4,  φ = −α_2 γ^3/48.    (5.18)

• For the ESTGARCH model, the second-order approximation implies:

    h̃_t = α_0 + [α_1 + μ* z_t^2] ε_{t−1}^2 + β h̃_{t−1},   μ* = α_2 γ,    (5.19)

and the fourth-order approximation yields:

    h̃_t = α_0 + [α_1 + μ* z_t^2 + φ* z_t^4] ε_{t−1}^2 + β h̃_{t−1},   μ* = α_2 γ,  φ* = −α_2 γ^2/2.    (5.20)

Testing the null of a GARCH(1,1) against the alternative of a STGARCH(1,1) is now formulated as follows:

• If the alternative is a LSTGARCH(1,1), we have the following two tests:

    (i) H_0^1: μ = 0 against H_1^1: μ ≠ 0,
    (ii) H_0^2: μ = 0 and φ = 0 against H_1^2: μ ≠ 0 or φ ≠ 0.

• If the alternative is a ESTGARCH(1,1), we have the following two tests:

    (iii) H_0^3: μ* = 0 against H_1^3: μ* ≠ 0,
    (iv) H_0^4: μ* = 0 and φ* = 0 against H_1^4: μ* ≠ 0 or φ* ≠ 0.

For all the tests, the LM statistic has the general expression:

    LM = (1/2) [ Σ_{t=1}^{T} (ε_t^2 h̃_{0t}^{−1} − 1) h̃_{0t}^{−1} (∂h̃_t/∂α)′ ]
              × [ Σ_{t=1}^{T} h̃_{0t}^{−2} (∂h̃_t/∂α)(∂h̃_t/∂α)′ ]^{−1}
              × [ Σ_{t=1}^{T} (ε_t^2 h̃_{0t}^{−1} − 1) h̃_{0t}^{−1} (∂h̃_t/∂α) ]  ∼ᵃ  χ_k^2,    (5.21)

where k = 1 or 2, depending upon the number of parameters tested under the null hypothesis, α is the vector of parameters that enters the conditional variance equation, and h̃_{0t} is the conditional variance under the null hypothesis. The expressions of ∂h̃_t/∂α are given below and depend upon the test under consideration (all the parameters are estimated under the null hypothesis):

• test (i) (GARCH(1,1)/LSTGARCH(1,1)):

    ∂h̃_t/∂α = ( Σ_{i=1}^{t−1} β^{i−1},  Σ_{i=1}^{t−1} β^{i−1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} h̃_{0,t−i} )′,    (5.22)

• test (ii) (GARCH(1,1)/LSTGARCH(1,1)):

    ∂h̃_t/∂α = ( Σ_{i=1}^{t−1} β^{i−1},  Σ_{i=1}^{t−1} β^{i−1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1}^3 ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} h̃_{0,t−i} )′,    (5.23)

• test (iii) (GARCH(1,1)/ESTGARCH(1,1)):

    ∂h̃_t/∂α = ( Σ_{i=1}^{t−1} β^{i−1},  Σ_{i=1}^{t−1} β^{i−1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1}^2 ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} h̃_{0,t−i} )′,    (5.24)

• test (iv) (GARCH(1,1)/ESTGARCH(1,1)):

    ∂h̃_t/∂α = ( Σ_{i=1}^{t−1} β^{i−1},  Σ_{i=1}^{t−1} β^{i−1} ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1}^2 ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} z_{t−i+1}^4 ε_{t−i}^2,  Σ_{i=1}^{t−1} β^{i−1} h̃_{0,t−i} )′.    (5.25)

One can use an asymptotically equivalent statistic:

    T R^2 = T R_u^2 ∼ᵃ χ_k^2,   k = 1 or 2,    (5.28)

where R_u^2 is the squared multiple correlation of the auxiliary regression of y_t* on x_t*, with

    y_t* = ε_t^2 h̃_{0t}^{−1} − 1   and   x_t* = h̃_{0t}^{−1} ∂h̃_t/∂α.    (5.29)
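The TR^2 version of test (i) is straightforward to compute. The following Python sketch (ours; it takes the GARCH(1,1) estimates under the null as given rather than performing the QML estimation) builds h̃_{0t}, the gradient (5.22) in its recursive form (start-up effects at t = 0 are ignored), and the auxiliary regression (5.29):

```python
import numpy as np
from scipy import stats

def tr2_test_lstgarch(eps, z, alpha0, alpha1, beta):
    """TR^2 statistic (5.28) for test (i): GARCH(1,1) vs. LSTGARCH(1,1)."""
    T = len(eps)
    h0 = np.empty(T)
    h0[0] = np.var(eps)                      # simple start-up value
    for t in range(1, T):
        h0[t] = alpha0 + alpha1 * eps[t-1]**2 + beta * h0[t-1]

    # Recursive form of the gradient (5.22).
    grad = np.zeros((T, 4))
    for t in range(1, T):
        grad[t, 0] = 1.0 + beta * grad[t-1, 0]                 # w.r.t. alpha0
        grad[t, 1] = eps[t-1]**2 + beta * grad[t-1, 1]         # w.r.t. alpha1
        grad[t, 2] = z[t] * eps[t-1]**2 + beta * grad[t-1, 2]  # w.r.t. mu
        grad[t, 3] = h0[t-1] + beta * grad[t-1, 3]             # w.r.t. beta

    y = eps**2 / h0 - 1.0                    # y*_t in (5.29)
    x = grad / h0[:, None]                   # x*_t in (5.29)
    bhat, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ bhat
    r2 = 1.0 - (resid @ resid) / (y @ y)     # uncentered R^2 of the auxiliary regression
    stat = T * r2
    return stat, stats.chi2.sf(stat, df=1)   # k = 1 under test (i)
```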

5.3. Monte Carlo Experiment

5.3.1. Simulation Design

The data generating processes are set as follows. Under the null hypothesis, we consider the following GARCH(1,1) model, for t = 1, ..., T:

    y_t = a_1 + a_2 y_{t−1} + ε_t,
    ε_t = ν_t √h_t,
    h_t = α_0 + α_1 ε_{t−1}^2 + β_1 h_{t−1},    (5.30)
    ν_t ∼ nid(0, 1).

Under the alternative hypothesis, we simulate the following model, for t = 1, ..., T:

    y_t = a_1 + a_2 y_{t−1} + ε_t,
    ε_t = ν_t √h_t,
    h_t = α_0 + [α_1 + α_2 F(z_t)] ε_{t−1}^2 + β_1 h_{t−1},    (5.31)
    ν_t ∼ nid(0, 1),


Table 1. Simulated size for the test of GARCH(1,1) against the alternative LSTGARCH(1,1)

DGP    Level   (i) LMO1L   (i) TR2O1L   (ii) LMO3L   (ii) TR2O3L
(1a)   1%      0.9         1.0          1.5          1.4
       5%      5.1         5.2          6.2          5.7
       10%     10.3        11.1         10.5         10.6
(2a)   1%      1.7         1.7          2.2          2.3
       5%      6.9         7.1          8.0          7.8
       10%     14.8        15.1         12.6         13.5
(3a)   1%      0.6         0.7          1.2          1.6
       5%      3.3         3.2          5.0          5.4
       10%     8.1         7.6          9.2          9.2
(4a)   1%      1.0         1.1          1.4          1.2
       5%      5.9         6.0          5.6          5.4
       10%     11.0        10.7         11.7         11.9
(5a)   1%      0.9         0.9          1.3          1.3
       5%      5.5         5.2          4.4          4.0
       10%     10.0        9.9          8.6          8.2

Note: The table reports the rejection frequencies at the three theoretical significance levels of 1%, 5% and 10%. The simulations are based on 1000 replications and T = 500 observations. The data generating processes are given by (5.30) with different parameter values: a1 = 0.25, a2 = 0.30, and
(1a) α0 = 0.10, α1 = 0.50, β1 = 0.05,
(2a) α0 = 0.10, α1 = 0.05, β1 = 0.80,
(3a) α0 = 0.30, α1 = 0.30, β1 = 0.70,
(4a) α0 = 0.01, α1 = 0.60, β1 = 0.35,
(5a) α0 = 0.50, α1 = 0.80, β1 = 0.20.

where F(z_t) is either the logistic (5.3) or the exponential (5.4) function. The transition variable z_t is generated according to z_t ∼ nid(0, 1). We also considered autoregressive processes for z_t, and the results were similar to those obtained here. We consider different values of the parameters in order to account for a wide variety of GARCH(1,1) and STGARCH(1,1) models. We conducted many Monte Carlo experiments by comparing the results obtained for finite and large sample sizes. Our simulations revealed that large samples display better size and power than small samples. To avoid too many tables, we have selected a sample with T = 500 for illustration purposes (other results are available upon request from the authors). All the simulations are based upon 1000 replications.
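Putting the pieces together, a minimal sketch of the size experiment (again ours; it reuses tr2_test_lstgarch above and, for simplicity, evaluates the test at the true null parameters instead of QML estimates, which makes the rejection rates only indicative) looks as follows:

```python
import numpy as np

def size_experiment(n_rep=1000, T=500, a1=0.25, a2=0.30,
                    alpha0=0.10, alpha1=0.50, beta1=0.05,  # case (1a)
                    level=0.05, seed=0):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        nu = rng.standard_normal(T)
        z = rng.standard_normal(T)           # z_t ~ nid(0, 1)
        h = np.empty(T); eps = np.empty(T); y = np.empty(T)
        h[0] = alpha0 / (1.0 - alpha1 - beta1)
        eps[0] = nu[0] * np.sqrt(h[0]); y[0] = a1
        for t in range(1, T):                # the null DGP (5.30)
            h[t] = alpha0 + alpha1 * eps[t-1]**2 + beta1 * h[t-1]
            eps[t] = nu[t] * np.sqrt(h[t])
            y[t] = a1 + a2 * y[t-1] + eps[t]
        _, pval = tr2_test_lstgarch(eps, z, alpha0, alpha1, beta1)
        rejections += pval < level
    return rejections / n_rep                # close to `level` for a well-sized test
```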

Table 2. Simulated power for the test of GARCH(1,1) against the alternative LSTGARCH(1,1)

DGP    Level   (i) LMO1L   (i) TR2O1L   (ii) LMO3L   (ii) TR2O3L
(1b)   1%      53.7        52.6         53.7         50.6
       5%      78.8        77.6         75.9         74.2
       10%     87.1        87.0         83.6         83.7
(2b)   1%      51.3        49.0         48.1         45.5
       5%      73.3        72.6         70.2         68.8
       10%     82.1        82.0         79.7         79.5
(3b)   1%      87.2        81.5         87.3         79.6
       5%      95.5        93.1         95.1         91.5
       10%     98.0        95.5         97.7         94.7
(4b)   1%      94.0        90.6         92.5         88.6
       5%      98.8        96.8         98.6         96.2
       10%     99.6        98.6         99.6         98.2
(5b)   1%      90.1        88.2         88.4         86.2
       5%      97.0        96.6         96.3         95.5
       10%     98.6        98.3         98.2         97.8

Note: The table reports the rejection frequencies at the three theoretical significance levels of 1%, 5% and 10%. The simulations are based on 1000 replications and T = 500 observations. The data generating processes are given by (5.31) and (5.3) with different parameter values: a1 = 0.25, a2 = 0.30, and
(1b) α0 = 0.10, α1 = 0.20, α2 = −0.40, β1 = 0.05, γ = 20.00,
(2b) α0 = 0.10, α1 = 0.20, α2 = 0.40, β1 = 0.20, γ = 5.00,
(3b) α0 = 0.01, α1 = 0.80, α2 = −0.80, β1 = 0.50, γ = 10.00,
(4b) α0 = 0.50, α1 = 0.80, α2 = −1.00, β1 = 0.20, γ = 5.00,
(5b) α0 = 0.10, α1 = 0.50, α2 = 0.80, β1 = 0.05, γ = 5.00.

5.3.2. Simulation Results

1. Tables 1 and 2 give the empirical size and power of tests (i) and (ii). LMO1L and LMO3L refer to the LM tests based on the first- and third-order Taylor approximations; TR2O1L and TR2O3L are their TR^2 equivalents. Globally, the estimated sizes are close to the nominal sizes. We notice, however, that the performance of the tests depends upon the stationarity conditions. In table 1, we distinguish among second-order stationary GARCH processes (cases 1a, 2a), IGARCH processes (cases 3a, 5a) and GARCH processes with high persistence (case 4a). Comparing the different cases, we see that, in the stationary case 2a, the tests tend to overreject the null hypothesis when it is true. The sizes are sometimes better in the non-stationary cases. The results on power yield similar conclusions (table 2). The stationary cases (where both E(ε_t^2) and E(ε_t^4) are finite) correspond to 1b and 2b, and the nonstationary cases to 3b, 4b and 5b. The tests perform better in the latter cases. By comparing the LSTGARCH and the ESTGARCH models, it appears that the tests are more powerful for the LSTGARCH models; indeed, it can be noted in tables 3 and 4 that, at the 1% significance level, the test works badly for the ESTGARCH processes (the power is in general less than 50%).

Table 3. Simulated size for the test of GARCH(1,1) against the alternative ESTGARCH(1,1)

DGP    Level   (iii) LMO2E   (iii) TR2O2E   (iv) LMO4E   (iv) TR2O4E
(1c)   1%      1.6           1.3            2.2          2.3
       5%      6.4           7.1            6.6          6.8
       10%     12.2          12.9           11.2         11.9
(2c)   1%      1.1           1.1            1.3          1.8
       5%      5.6           5.7            5.7          5.7
       10%     11.0          11.5           9.4          10.6
(3c)   1%      0.8           0.9            2.3          2.0
       5%      4.3           4.1            4.6          4.9
       10%     9.4           9.1            8.2          9.0
(4c)   1%      1.1           1.3            2.4          2.3
       5%      5.1           4.9            5.2          4.7
       10%     9.0           8.3            8.9          9.1
(5c)   1%      1.4           1.5            2.7          2.7
       5%      6.0           5.6            7.1          7.2
       10%     10.1          10.7           12.6         12.8

Note: The table reports the rejection frequencies at the three theoretical significance levels of 1%, 5% and 10%. The simulations are based on 1000 replications and T = 500 observations. The data generating processes are given by (5.30) with different parameter values: a1 = 0.25, a2 = 0.30, and
(1c) α0 = 0.10, α1 = 0.05, β1 = 0.80,
(2c) α0 = 0.30, α1 = 0.25, β1 = 0.25,
(3c) α0 = 0.10, α1 = 0.50, β1 = 0.40,
(4c) α0 = 0.50, α1 = 0.80, β1 = 0.20,
(5c) α0 = 0.30, α1 = 0.25, β1 = 0.70.

2. Tables 2 and 4 suggest that, when considering Taylor approximations, the tests based on the higher orders have lower power. For the exponential case (table 4), the tests seem to be biased toward the null hypothesis, notably at the 1% confidence level. Here, we face the traditional trade-off encountered when supplementary variables are included in a regression. On the one hand, this reduces the number of degrees of freedom, thereby implying a less powerful test; on the other hand, adding more variables to an equation is more informative (in the sense that it allows one to take into account more nonlinearities), so the test should be more powerful. Our results suggest that augmenting the degree of the Taylor expansion does not generally increase the power of the tests.

Table 4. Simulated power for the test of GARCH(1,1) against the alternative ESTGARCH(1,1)

DGP    Level   (iii) LMO2E   (iii) TR2O2E   (iv) LMO4E   (iv) TR2O4E
(1d)   1%      27.8          26.3           26.8         24.8
       5%      62.3          60.5           56.6         55.2
       10%     77.2          76.1           70.6         70.2
(2d)   1%      46.5          44.7           47.3         45.2
       5%      79.2          77.3           74.9         72.9
       10%     88.5          86.9           86.4         85.4
(3d)   1%      44.5          40.4           57.8         52.9
       5%      76.6          74.7           80.5         79.4
       10%     86.7          86.1           89.4         88.3
(4d)   1%      68.1          63.9           69.1         65.4
       5%      91.3          89.4           90.0         88.5
       10%     96.4          95.7           95.7         94.0
(5d)   1%      45.7          43.1           43.4         39.8
       5%      75.2          73.7           68.0         66.9
       10%     85.3          84.3           80.4         78.9

Note: The table reports the rejection frequencies at the three theoretical significance levels of 1%, 5% and 10%. The simulations are based on 1000 replications and T = 500 observations. The data generating processes are given by (5.31) and (5.4) with different parameter values: a1 = 0.25, a2 = 0.30, and
(1d) α0 = 0.50, α1 = 0.50, α2 = −0.50, β1 = 0.05, γ = 1.00,
(2d) α0 = 0.003, α1 = 0.70, α2 = −0.65, β1 = 0.25, γ = 1.00,
(3d) α0 = 0.003, α1 = 0.70, α2 = −0.65, β1 = 0.05, γ = 3.00,
(4d) α0 = 0.003, α1 = 0.90, α2 = −0.85, β1 = 0.05, γ = 1.00,
(5d) α0 = 0.30, α1 = 0.80, α2 = −0.70, β1 = 0.35, γ = 0.60.

3. What would happen in larger samples? As shown in table 5, the power of the different tests is increasing with the number of observations T, both in the cases where the moment conditions are satisfied (cases (1e) and (3e)) and when they are violated (cases (2e) and (4e)) (the results are similar if z_t is generated as N(0, 1) instead of a stationary AR process).

5.4. An Application to the US Stock Returns

Given the mixed results of the testing procedure of GARCH(1,1) against the ESTGARCH(1,1) model, we shall consider only the LSTGARCH model and the corresponding tests. LM tests are applied to a monthly series of US stock returns over the period from 1988:01 to 1998:12. The original data is the S&P 500 index, and we compute the logarithmic returns as the first differences of the logarithmic prices.

1. Definition of the possible transition variables. The transition variables are different interest rate differentials:

Table 5. Some examples where the number of observations is increasing

        LSTGARCH: LMO3L        ESTGARCH: LMO4E
T       (1e)      (2e)         (3e)      (4e)
100     16.6      24.0         14.1      18.1
200     27.7      44.1         32.8      33.1
500     46.8      80.6         73.6      62.2
1000    80.0      98.5         97.0      92.4

Note: The table reports the rejection frequencies at the 10% theoretical significance level. The simulations are based on 1000 replications. The transition variable is generated by z_t = 0.5 z_{t−1} + ζ_t with ζ_t ∼ nid(0, 1), and the data by (5.31) and (5.3) for the LSTGARCH model and by (5.31) and (5.4) for the ESTGARCH model, with different parameter values: a1 = 0.25, a2 = 0.30, and
(1e) α0 = 0.10, α1 = 0.25, α2 = 0.50, β1 = 0.10, γ = 1.00,
(2e) α0 = 0.10, α1 = 0.50, α2 = 0.70, β1 = 0.50, γ = 1.00,
(3e) α0 = 0.50, α1 = 0.50, α2 = −0.50, β1 = 0.05, γ = 1.00,
(4e) α0 = 0.30, α1 = 0.80, α2 = −0.70, β1 = 0.35, γ = 8.00.

Table 6. Preliminary statistics on logarithmic returns: 1988:01-1998:12

mean        0.011        JB (p-value)         37.710 (0.00)
maximum     0.107        KS                   0.379
minimum    -0.084        ARCH(1) (p-value)    0.571 (0.44)
variance    0.000        ARCH(4) (p-value)    2.298 (0.68)
Sk          1.101
κ           1.433

Note: Sk is the skewness coefficient, κ is the kurtosis excess, JB is the Jarque-Bera normality test, KS is the Kolmogorov-Smirnov nonparametric normality test (the 5% critical value is 0.189), and ARCH denotes the Engle conditional heteroskedasticity test for the own squared returns. The p-values corresponding to the different test statistics are given in parentheses. We also applied unit root and long-memory tests; the results are not reported here to avoid an overabundance of tables. The logarithmic returns were I(0), and a similar conclusion was obtained for the series of interest rate differentials.

• USSPREAD is defined as the US domestic spread, that is, the difference ln(1 + i^{us}_{l,t}) − ln(1 + i^{us}_{s,t}), where i^{us}_{l,t} and i^{us}_{s,t} are respectively the US long-term and short-term interest rates. All short-term interest rates used in the application are 3-month interest rates, and the long-term interest rates are 10-year yields to maturity on long-term government bonds;

• UKDIFST and FRDIFST are the logarithmic differences between the American and English (resp. French) short-term interest rates, that is, ln(1 + i^{us}_{s,t}) − ln(1 + i^{j}_{s,t}) where j = U.K., France;

• UKDIFLT and FRDIFLT are similarly defined as ln(1 + i^{us}_{l,t}) − ln(1 + i^{j}_{l,t}), that is, the logarithmic differences between the long-term interest rates.

2. Some preliminary statistics. Table 6 shows some preliminary statistics for the US stock returns. There is overwhelming evidence that the returns have non-Normal distributions, as shown by the Kolmogorov-Smirnov statistic. This property, usually documented for financial data, is caused by skewness and heavy tails in the distributions (see the skewness and kurtosis coefficients, along with the Jarque-Bera statistic).

Table 7. Heteroscedasticity tests on homoscedastic model (p-values)

MCLL(1)      0.50        SIGN        0.32
MCLL(4)      0.56        SIGN−       0.00
ENGLE1(1)    0.50        SIGN+       0.00
ENGLE1(4)    0.40        SIGNJ       0.00
ENGLE2(1)    0.51        QARCH(1)    0.23
ENGLE2(4)    0.41        QARCH(4)    0.45
ANN(1)       0.10        LOG(1)      0.29
ANN(4)       0.21        LOG(4)      0.16
                         EXP(1)      0.36
                         EXP(4)      0.67

Note: For all these tests, the null hypothesis is represented by an AR homoscedastic model. MCLL is the McLeod and Li (1983) test, ENGLE1 and ENGLE2 are respectively the χ²-version and F-version of the Engle (1982) test, ANN is the neural network conditional heteroscedasticity test (Péguin-Feissolle (1999) and Caulet and Péguin-Feissolle (2000)), SIGN, SIGN−, SIGN+ and SIGNJ are the different formulations of the Engle and Ng (1993) tests, QARCH is the Sentana (1995) homoscedasticity test against the alternative of a quadratic ARCH model, and LOG and EXP are the tests developed by Hagerud (1997a) for testing homoscedasticity against logistic or exponential smooth transition ARCH (where the transition variable is a lagged noise).

3. Testing homoscedasticity against different conditional heteroscedasticity alternatives. First, we determine an AR(p) model using information criteria and tests for residual autocorrelation. We retain an optimal lag p = 1. We then use the estimated model to test the hypothesis of homoscedastic residuals against the alternative of asymmetric and nonlinear heteroscedastic residuals. The results, in table 7, clearly show a rejection of the null hypothesis when the conditional variance contains asymmetric patterns. This is suggested by the sign bias tests, where the p-values are less than 1%. It is also worth noting that when the transition variable consists of the lagged noise - as is the case with Hagerud (1997a)'s LOG and EXP statistics - the hypothesis of an STGARCH model is rejected. This could mean either that there are no STGARCH components in the residuals, or that the lagged noises are not informative enough to account for STGARCH effects. To discriminate between these alternatives, we shall consider the interest rate differentials as transition variables when testing the null of a GARCH(1,1) against an LSTGARCH(1,1) alternative.

4. Testing a GARCH(1,1) model against different conditional heteroscedastic models. Table 8 shows the results of similar tests applied to the residuals of a GARCH(1,1) model. There is some evidence of time-varying and asymmetric dynamics in the conditional variance. Indeed, the SIGN statistic is significant at the 10% significance level, and several statistics based on the parameter constancy tests are also significant. A further interesting question is whether these asymmetries and nonlinearities are satisfactorily captured by an STGARCH model.

Table 8. Diagnostic tests on GARCH(1,1) model (p-values)

NRARCH(1)     0.92        QGARCH          0.89
NRARCH(4)     0.41        LOG             0.07
HOARCH(1)     0.82        CONSTI(1)       0.49
HOARCH(4)     0.34        CONSTI(3)       0.00
HOGARCH(1)    0.82        CONSTA(1)       0.34
HOGARCH(4)    0.33        CONSTA(3)       0.19
SIGN          0.09        CONSTIA(1)      0.60
SIGN−         0.17        CONSTIA(3)      0.00
SIGN+         0.19        CONSTALL(1)     0.61
SIGNJ         0.40        CONSTALL(3)     0.00

Note: For all these tests, the null hypothesis is represented by a GARCH(1,1) model and q is the order of the tested conditional heteroscedasticity. NRARCH is the test for no remaining ARCH (Lundbergh and Teräsvirta (2002)), HOARCH and HOGARCH are the tests for higher-order ARCH and GARCH (close to Bollerslev (1986)), SIGN, SIGN−, SIGN+ and SIGNJ are the different formulations of the Engle and Ng (1993) tests, QGARCH and LOG are the tests developed by Hagerud (1997a) for testing homoscedasticity against, respectively, the alternative of a quadratic ARCH model and the alternative of a logistic smooth transition ARCH (where the transition variable is a lagged noise), and CONSTI, CONSTA, CONSTIA and CONSTALL are the tests developed in Lundbergh and Teräsvirta (2002) for testing parameter constancy, respectively, in the intercept, in the ARCH parameters, in the intercept and ARCH parameters, and in all parameters.

5. Testing a GARCH(1,1) model against a LSTGARCH(1,1) model. Table 9 shows the p-values corresponding to the test of a GARCH(1,1) model against the alternative hypothesis of a LSTGARCH(1,1) model. We retain up to 6 lags for the transition variables. Evidently, in many cases the null hypothesis is strongly rejected (many of the p-values are well below 5%; the results for the transition variable

Table 9. Testing GARCH(1,1) against LSTGARCH(1,1) (p-values)

        zt: USSPREAD             zt: UKDIFST
p       LMFO      LMTO           LMFO      LMTO
0       0.0001    0.0006         0.1415    0.0251
1       0.0000    0.0004         0.0956    0.0108
2       0.0000    0.0002         0.0895    0.0051
3       0.0001    0.0004         0.1089    0.0136
4       0.0000    0.0002         0.1050    0.0290
5       0.0000    0.0002         0.1017    0.0365
6       0.0000    0.0001         0.0620    0.0195

        zt: UKDIFLT              zt: FRDIFST
p       LMFO      LMTO           LMFO      LMTO
0       0.6812    0.0426         0.0006    0.0612
1       0.6320    0.0236         0.0063    0.0316
2       0.5165    0.0138         0.0286    0.0137
3       0.4764    0.0143         0.0728    0.0145
4       0.4694    0.0121         0.5078    0.0096
5       0.4612    0.0146         0.4507    0.0094
6       0.3727    0.0109         0.4756    0.0057

Note: LMFO and LMTO correspond to the first-order and third-order LM versions of the test; zt is the transition variable.

FRDIFLT are not given, since all the p-values were high). There thus seems to be overwhelming evidence that the changing regimes and asymmetries in the dynamics of the volatility of the US logarithmic returns could be caused by the different interest rate differential variables, as shown by the smallest p-values (these variables seem more informative about the presence of asymmetry and nonlinearity than the shocks).

6. Estimation of different LSTGARCH(1,1) models. It is interesting to interpret the preceding results, notably by studying what the estimation of STGARCH models suggests. Table 10 contains some of our best estimated models, using transition variables with p-values less than 5%. The parameters a0 and a1 are the coefficients of the conditional mean. The other parameters refer to the conditional variance equation (5.2) with the transition function given by (5.3). The results show that it is difficult to find many models confirming the findings of the tests when the data length is small. As is seen, the second extreme regime (corresponding to F(z_t) = 1) implies a higher correlation in volatility (since α2 is positive). However, the presence of two distinct regimes in volatility is established for only one model (with UKDIFST as the transition variable); indeed, α2 is statistically significant only in this case. Further, the results contradict the conclusions of the LM tests: we


Table 10. LSTGARCH(1,1) models

                 USSPREAD        UKDIFST         UKDIFLT         FRDIFST
(lag)            (2)             (2)             (0)             (2)
a0 (p-value)     0.010 (0.00)    0.009 (0.00)    0.011 (0.00)    0.011 (0.00)
a1 (p-value)     0.108 (0.00)    0.088 (0.00)    0.073 (0.00)    0.115 (0.00)
α0 (p-value)     0.000 (0.00)    0.000 (0.00)    0.000 (0.00)    0.000 (0.00)
α1 (p-value)     0.253 (0.00)    0.164 (0.00)    0.467 (0.00)    0.268 (0.00)
α2 (p-value)     0.084 (0.44)    0.036 (0.00)    0.117 (0.26)    0.110 (0.10)
β (p-value)      0.445 (0.00)    0.791 (0.00)    0.275 (0.00)    0.259 (0.00)
γ (p-value)      0.249 (0.16)    0.264 (0.00)    0.863 (0.02)    0.290 (0.00)
c (p-value)      0.249 (0.81)   -0.419 (0.61)   -0.047 (0.73)   -0.211 (0.92)
ln LT            2.160           2.170           2.150           2.158
GB(1) (p-value)  1.383 (0.23)    1.684 (0.19)    1.996 (0.15)    1.296 (0.25)
GB(5) (p-value)  6.419 (0.26)    6.818 (0.23)    7.203 (0.20)    6.293 (0.27)
sk               0.152           0.284           0.070           0.106
κ                2.166           2.148           2.065           2.165
JB (p-value)     25.921 (0.00)   26.772 (0.00)   23.219 (0.00)   25.654 (0.00)
White (p-value)  3.479 (0.17)    3.587 (0.17)    4.240 (0.12)    3.421 (0.18)

Note: The first line gives the names of the transition variables and the second line gives their number of lags. The p-values corresponding to the different parameter estimates and test statistics are given in parentheses. ln LT is the log-likelihood function, GB(q) denotes the Godfrey-Breusch statistic of the LM-type test for qth-order serial correlation in the residuals, sk is the skewness coefficient, κ is the kurtosis excess, JB is the Jarque-Bera normality test for the residuals, and White is the White heteroscedasticity test.

see that the parameter γ is not statistically significant when the transition variable is USSPREAD. To check the goodness of fit of our models, we consider several diagnostic tests on the estimated residuals: the Godfrey-Breusch test for qth-order serial autocorrelation, the White heteroskedasticity test and the Jarque-Bera normality test.

7. Some implications of the estimated LSTGARCH model with UKDIFST as the

transition variable. We split the data according to the different time periods identified by this model in order to detect regime-switching periods in the volatility. For each period, we report the mean excess return and the corresponding mean volatility (see table 11). Three main periods can be identified. The first period ranges from 1988:3 to 1989:7 and belongs to regime 2; the latter is characterized by a positive difference of the interest rate differential and positive returns. The other two periods cover the dates 1989:8 to 1994:3 (regime 1, with a negative interest rate differential and negative excess returns) and 1994:4 to 1998:11 (regime 2). Considering the period from 1989:8 to 1998:11, the highest volatility corresponds to positive excess returns (regime 2). One reason is certainly the role played by monetary policy. The negative value of z_t − ĉ over 1989:8-1994:3 reflects a less restrictive monetary policy in the US, and a resulting lower volatility of the stock returns. This is a consequence of the Fed's efforts to signal the gradually easing monetary policy (the belief in low policy volatility during this period had a stabilizing effect on the volatility of stock markets). Conversely, the second period, 1994:4-1998:11, was characterized by higher interest rates that tightened monetary policy with more surprises and thereby increased risk premia. This may explain why the observed volatility is higher (0.093%) as compared to regime 1 (0.071%).

Table 11. Summary indicators: zt = UKDIFST

Regime 1          Nobs    ε̂t_var %   (zt − ĉ)%   1 − F̂(zt)   (Rt − R̂t)%
1989:8-1994:3     56      0.071       -21.89      0.561        -0.431

Regime 2          Nobs    ε̂t_var %   (zt − ĉ)%   F̂(zt)       (Rt − R̂t)%
1988:3-1989:7     17      0.059        6.17       0.52          0.376
1994:4-1998:11    56      0.093       21.94       0.561         0.523

Note: Nobs is the number of observations, R̂t is the predicted return from the conditional mean equation, and ε̂t_var is the variance of the estimated residuals from the conditional mean equation.

8. Forecasting with the LSTGARCH model. We now evaluate the performance of the estimated LSTGARCH model with UKDIFST as the transition variable, by comparing its forecasts with those of the GJR and GARCH models. The GJR model is used because we saw, from tables 7 and 8, that the volatility of the stock returns showed overwhelming evidence that positive and negative shocks have different impacts on the volatility, as allowed by the GJR model. The choice of the GARCH model is straightforward. Forecasting performances are evaluated by considering out-of-sample forecasts based on the RMSE criterion: we reserve the period 1999:01 to 2001:01 for forecasts. From table 12, it is seen that, in general, the LSTGARCH model yields better out-of-sample forecasts of the returns, at least in the short and medium term. Further work would consist in finding economic metrics for forecast comparison, but this is beyond the scope of this paper.

Table 12. RMSE on the stock returns (×10³)

Horizon   LSTGARCH   GARCH    GJR
1         0.2222     0.2240   0.2457
2         0.2719     0.2787   0.2835
3         0.4557     0.4680   0.4664
4         0.3817     0.3884   0.3886
5         0.3614     0.3639   0.3656
6         0.4819     0.4887   0.4876
7         0.7772     0.7770   0.7793
8         0.7167     0.7148   0.7178
9         0.7070     0.7030   0.7069
10        0.9715     0.9727   0.9736
11        0.9068     0.9091   0.9092
12        0.8439     0.8452   0.8458
13        0.8994     0.8981   0.9000
14        0.8934     0.8939   0.8947
15        0.8341     0.8346   0.8353
16        0.8858     0.8842   0.8860
17        0.8591     0.8586   0.8598
18        0.8119     0.8114   0.8125
19        0.7694     0.7688   0.7699
20        0.7525     0.7511   0.7526

Concluding Remarks

In this paper, we have investigated the empirical power and size of LM tests that allow choosing between GARCH(1,1) and STGARCH(1,1) models. The simulations show that the tests generally behave well, but they also reveal a paradox: the power and size are sometimes better when the stationarity conditions on the moments are violated. The results on the example of US stock returns highlight the need to incorporate threshold effects in the conditional variance, rather than considering standard GARCH processes.

References

[1] Anderson, H.M., K. Nam and F. Vahid (1999). "Asymmetric nonlinear smooth transition GARCH models", in P. Rothman, ed., Nonlinear Time Series Analysis of Economic and Financial Data, Boston: Kluwer Academic Press, pp. 191-207.

[2] Bekaert, G. and G. Wu (2000). "Asymmetric volatility and risk in equity markets", Review of Financial Studies, 13, pp. 1-42.

[3] Bollerslev, T. (1986). "Generalized autoregressive conditional heteroskedasticity", Journal of Econometrics, 31, pp. 307-327.

[4] Caulet, R. and A. Péguin-Feissolle (2000). "Un test d'hétéroscédasticité conditionnelle inspiré de la modélisation en termes de réseaux neuronaux artificiels", Annales d'Économie et de Statistique, 59, pp. 177-197.

[5] Davies, R.B. (1977). "Hypothesis testing when a nuisance parameter is present only under the alternative", Biometrika, 64, pp. 247-254.

[6] Dufrénot, G., V. Marimoutou and A. Péguin-Feissolle (2002). "LSTGARCH effects in stock returns: the case of US, UK and France", International Conference on Forecasting Financial Markets, London, May.

[7] Engle, R.F. (1982). "Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation", Econometrica, 50, 4, pp. 987-1007.

[8] Engle, R.F. and V.K. Ng (1993). "Measuring and testing the impact of news on volatility", Journal of Finance, 48, pp. 1749-1777.

[9] Fornari, F. and A. Mele (1996). "Modeling the changing asymmetry of conditional variances", Economics Letters, 50, pp. 197-203.

[10] Fornari, F. and A. Mele (1997). "Sign- and volatility-switching ARCH models: theory and applications to international stock markets", Journal of Applied Econometrics, 12, pp. 49-66.

[11] Glosten, L.R., R. Jagannathan and D.E. Runkle (1993). "On the relation between the expected value and the volatility of the nominal excess return on stocks", Journal of Finance, 48, pp. 1779-1801.

[12] Gonzalez-Rivera, G. (1998). "Smooth transition GARCH models", Studies in Nonlinear Dynamics and Econometrics, 3, pp. 61-78.

[13] Hagerud, G.E. (1996a). "A smooth transition ARCH model for asset returns", Working Paper Series in Economics and Finance No. 162, Stockholm School of Economics.

[14] Hagerud, G.E. (1996b). "Discrete time hedging of OTC options in a GARCH environment: a simulation experiment", Working Paper Series in Economics and Finance No. 165, Stockholm School of Economics.

[15] Hagerud, G.E. (1997a). "Specification tests for asymmetric GARCH", Working Paper Series in Economics and Finance No. 163, Stockholm School of Economics.

[16] Hagerud, G.E. (1997b). "Modeling Nordic stock returns with asymmetric GARCH models", Working Paper Series in Economics and Finance No. 164, Stockholm School of Economics.

[17] Hamilton, J.D. and G. Lin (1996). "Stock market volatility and the business cycle", Journal of Applied Econometrics, 11, pp. 573-593.

[18] Hansen, B.E. (1996). "Inference when a nuisance parameter is not identified under the null hypothesis", Econometrica, 64, pp. 413-430.

[19] He, C. and T. Teräsvirta (1999). "Properties of moments of a family of GARCH processes", Journal of Econometrics, 92, pp. 173-192.

[20] Lundbergh, S. and T. Teräsvirta (1999). "Modelling economic high-frequency time series with STAR-STGARCH models", Working Paper, Stockholm School of Economics.

[21] Lundbergh, S. and T. Teräsvirta (2002). "Evaluating GARCH models", Journal of Econometrics, 110, pp. 417-435.

[22] Luukkonen, R., P. Saikkonen and T. Teräsvirta (1988). "Testing linearity against smooth transition autoregressive models", Biometrika, 75, 3, pp. 491-499.

[23] McLeod, A.J. and W.K. Li (1983). "Diagnostic checking ARMA time series models using squared-residual autocorrelations", Journal of Time Series Analysis, 4, pp. 269-273.

[24] Péguin-Feissolle, A. (1999). "A comparison of the power of some tests for conditional heteroskedasticity", Economics Letters, 63, 1, pp. 5-17.

[25] Sentana, E. (1995). "Quadratic ARCH models", Review of Economic Studies, 62, pp. 639-661.

[26] Wu, G. (2001). "The determinants of asymmetric volatility", Review of Financial Studies, 14, pp. 837-859.

[27] Wu, G. and Z. Xiao (2002). "A generalized partially linear model of asymmetric volatility", Journal of Empirical Finance, 9, pp. 287-319.

In: Progress in Financial Markets Research
Editors: C. Kyrtsou and C. Vorlow, pp. 103-120

ISBN: 978-1-61122-864-9
© 2012 Nova Science Publishers, Inc.

Chapter 6

A STATISTICAL TEST OF CHAOTIC PURCHASING POWER PARITY DYNAMICS

Apostolos Serletis 1,* and Asghar Shahmoradi 2
1 Department of Economics, University of Calgary, Calgary, Alberta T2N 1N4, Canada
2 Middle East and Central Asia Department, International Monetary Fund, Washington, D.C., 20431, US

6.1. Introduction

The theory of purchasing power parity (PPP) has attracted a great deal of attention and has been explored extensively in the recent economics and finance literature. Based on the law of one price, purchasing power parity asserts that relative goods prices are not affected by exchange rates — or, equivalently, that exchange rate changes will be proportional to relative inflation. The relationship is important not only because it has been a cornerstone of exchange rate models in international economics, but also because of its policy implications: it provides a benchmark exchange rate and hence has some practical appeal for policymakers and exchange rate arbitragers.

Empirical studies generally fail to find support for purchasing power parity during the recent floating exchange rate period (since 1973). In fact, the empirical consensus is that purchasing power parity does not hold over this period — see, for example, Adler and Lehman (1983), Mark (1990), Patel (1990), Grilli and Kaminski (1991), Flynn and Boucher (1993), Serletis (1994), Serletis and Zimonopoulos (1997), Coe and Serletis (2002) and Serletis and Gogas (2004). But there are also studies covering different groups of countries, studies covering periods of long duration or country pairs experiencing large differentials in price movements, studies using high-frequency (monthly) data, and studies that use panel methods that report evidence favorable to purchasing power parity — see Serletis and Gogas (2004) for a more detailed discussion.

* E-mail address: [email protected]


A sufficient condition for a violation of purchasing power parity is that the real exchange rate is characterized by a unit root. Recent econometric advances and empirical evidence seem to suggest that the real exchange rate does indeed have a unit root — see, for example, Serletis and Zimonopoulos (1997). In the present paper we follow Serletis and Gogas (2000), who use U.S. dollar-based real exchange rates for 17 OECD countries over the period from 1957:1 to 1995:4, and contrast the random walk behavior of the real exchange rate with nonlinear chaotic dynamics. As Serletis and Gogas (2000, p. 616) argue, "[t]his is motivated by the notion that the real exchange rate follows a deterministic, dynamic, and nonlinear process which generates output that mimics the output of stochastic systems." However, unlike Serletis and Gogas (2000), in the present paper we follow the recent contributions by Whang and Linton (1999), Shintani and Linton (2004), and Serletis and Shintani (2003) and construct the standard error of the Nychka, Ellner, Gallant, and McCaffrey (1992) dominant Lyapunov exponent, thereby providing a statistical test for chaos. Moreover, we use quarterly U.S. dollar-based, DM-based, and Japanese yen-based real exchange rates for 21 OECD countries (a total of sixty bilateral intercountry relations) over the recent floating exchange rate period, from 1973:1 to 1998:4. The countries involved are Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, the United Kingdom, and the United States.

Our analysis is organized as follows. Section 2 presents a brief methodological background with respect to real exchange rate tests of the theory of purchasing power parity. The next section discusses the key features of the Nychka et al. (1992) Lyapunov exponent estimator and its limit distribution. In Section 4 we discuss the data and present the results of the chaos tests. Section 5 investigates the robustness of the results to alternative dynamical diagnosis procedures, and the final section concludes.

6.2. PPP and the Real Exchange Rate

One approach to testing the theory of purchasing power parity is to compute a linear combination of the PPP theory variables, such as the real exchange rate, and investigate its univariate time series properties using the usual unit root testing procedures, as in Serletis and Zimonopoulos (1997) — see Serletis and Gogas (2004) for a discussion of other approaches to testing purchasing power parity. The real exchange rate, E_t, can be calculated as

    E_t = S_t P_t*/P_t,

where S_t denotes the nominal exchange rate (domestic currency value per unit of foreign currency), P_t the domestic price level (in domestic currency), and P_t* the foreign price level (in foreign currency). Taking logarithms of the above equation, the real exchange rate becomes a linear combination of the nominal exchange rate and the domestic and foreign price levels. That is,

    e_t = s_t + p_t* − p_t,

where e_t is the logarithm of E_t. Under long-run absolute purchasing power parity, the long-run equilibrium real exchange rate E_t is equal to 1 (at every point in time), which would imply e_t = 0. In the short run, however, we expect deviations from purchasing power parity, coming from stochastic shocks, and the question at issue is whether these deviations are permanent or transitory. A sufficient condition for a violation of absolute PPP is that the real exchange rate is characterized by a unit root.

A number of approaches have been developed to test for unit roots. Nelson and Plosser (1982), using augmented Dickey-Fuller (ADF) type regressions [see Dickey and Fuller (1981)], argue that most macroeconomic time series (including real exchange rates) have a unit root. Perron (1989), however, has shown that conventional unit root tests are biased against rejecting a unit root when there is a break in a trend stationary process. Motivated by these considerations, Serletis and Zimonopoulos (1997) and Serletis and Gogas (2004) test the unit root hypothesis in real exchange rates and show that it cannot be rejected even if allowance is made for the possibility of a one-time change in the mean of the series at an unknown point in time.

However, the (apparent) random walk behavior of the real exchange rate could be contrasted with chaotic dynamics. This is motivated by the notion that the real exchange rate follows a deterministic nonlinear process which generates output that mimics the output of stochastic systems. In other words, it is possible for the real exchange rate to appear to be random without really being random. In fact, Serletis and Gogas (2000) test for chaos, using the Nychka et al. (1992) test (for positivity of the dominant Lyapunov exponent), in the dollar-based real exchange rate series used by Serletis and Zimonopoulos (1997), and find evidence of nonlinear chaotic dynamics in seven out of fifteen real exchange rate series. This suggests that real exchange rate movements might not be really random and that it is perhaps possible to model (by means of differential/difference equations) the nonlinear chaotic generating mechanism and build a predictive model of real exchange rates — see Barnett and Serletis (2000) for some thoughts along these lines.

In the next section we follow the recent contribution by Whang and Linton (1999) and construct the standard error for the Nychka et al. (1992) dominant Lyapunov exponent for the real exchange rate series, thereby providing a statistical test for chaos. Moreover, as already noted, we test for chaos in a total of sixty bilateral intercountry relations.
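For readers who want to reproduce this kind of unit root evidence, a minimal Python sketch (ours, using statsmodels' ADF implementation; the random-walk series below is synthetic and purely illustrative) is:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def log_real_exchange_rate(S, P_foreign, P_domestic):
    """e_t = s_t + p*_t - p_t, the logarithm of E_t = S_t P*_t / P_t."""
    return np.log(S) + np.log(P_foreign) - np.log(P_domestic)

# Illustration on a synthetic random walk (the unit root should not be rejected):
rng = np.random.default_rng(0)
e = np.cumsum(0.01 * rng.standard_normal(104))   # ~26 years of quarterly observations
stat, pval, *_ = adfuller(e, regression="c")     # ADF test with a constant
print(f"ADF statistic: {stat:.2f}, p-value: {pval:.3f}")
```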

6.3. A Statistical Test for Chaos

Let {X_t}_{t=1}^T be a random scalar sequence generated by the following nonlinear autoregressive model

    X_t = θ(X_{t−1}, ..., X_{t−m}) + u_t,    (6.1)

where θ: R^m → R is a nonlinear dynamic map and {u_t}_{t=1}^T is a random sequence of iid disturbances with E(u_t) = 0 and E(u_t^2) = σ^2 < ∞. We also assume θ to satisfy a smoothness condition, and Z_t = (X_t, ..., X_{t−m+1})′ ∈ R^m to be strictly stationary and to satisfy a class of mixing conditions — see Whang and Linton (1999) and Shintani and Linton (2004) for details regarding these conditions. Let us express the model (6.1) in terms of a map

    F(Z_t) = (θ(X_{t−1}, ..., X_{t−m}), X_{t−1}, ..., X_{t−m+1})′    (6.2)

with U_t = (u_t, 0, ..., 0)′ such that Z_t = F(Z_{t−1}) + U_t, and let J_t be the Jacobian of the map F in (6.2) evaluated at Z_t. Then the dominant Lyapunov exponent of the system (6.1) is defined by

    λ ≡ lim_{M→∞} (1/2M) ln ν_1(T_M′ T_M),    (6.3)

where

    T_M = ∏_{t=1}^{M} J_{M−t} = J_{M−1} · J_{M−2} · ... · J_0,

and ν_i(A) is the i-th largest eigenvalue of a matrix A. Necessary conditions for the existence of the Lyapunov exponent are available in the literature. Usually, if max{ln ν_1(J_t′ J_t), 0} has a finite first moment with respect to the distribution of Z_t, then the limit in (6.3) almost surely exists and will be a constant, irrespective of the initial condition.

To obtain the Lyapunov exponent from observational data, Eckmann and Ruelle (1985) and Eckmann et al. (1986) proposed a method based on nonparametric regression which is known as the Jacobian method. The basic idea of the Jacobian method is to substitute θ in the Jacobian formula by its nonparametric estimator θ̂. In other words, it is the sample analogue estimator of (6.3). It should be noted that we distinguish between the 'sample size' T used for estimating the Jacobian Ĵ_t and the 'block length' M, which is the number of evaluation points used for estimating the Lyapunov exponent. Formally, the Lyapunov exponent estimator of λ can be obtained by

    λ̂_M = (1/2M) ln ν_1(T̂_M′ T̂_M),

where

    T̂_M = ∏_{t=1}^{M} Ĵ_{M−t} = Ĵ_{M−1} · Ĵ_{M−2} · ... · Ĵ_0,

and

    Ĵ_t = ∂F̂(Z_t)/∂Z′ =
        [ Δθ̂_{1t}  Δθ̂_{2t}  ···  Δθ̂_{m−1,t}  Δθ̂_{mt} ]
        [    1         0     ···       0          0    ]
        [    0         1     ···       0          0    ]
        [    ⋮         ⋮     ⋱        ⋮          ⋮    ]
        [    0         0     ···       1          0    ]

for t = 0, 1, ..., M − 1, and Δθ̂_{jt} = D_{e_j} θ̂(Z_t) for j = 1, ..., m, in which e_j = (0, ..., 1, ..., 0)′ ∈ R^m denotes the j-th elementary vector. In principle, any nonparametric derivative estimator D_{e_j} θ̂ can be used for the Jacobian method.
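Computationally, forming the matrix product T̂_M directly overflows quickly; a standard remedy (our sketch below, not the FUNFITS implementation) is a QR-based accumulation, which delivers the same dominant exponent as (6.3) in the limit:

```python
import numpy as np

def dominant_lyapunov(jacobians):
    """Largest Lyapunov exponent from Jacobians J_0, ..., J_{M-1},
    accumulated with repeated QR factorizations for numerical stability.
    The Jacobians would be built from nonparametric derivative estimates
    in the companion-matrix form shown above."""
    m = jacobians[0].shape[0]
    Q = np.eye(m)
    log_growth = 0.0
    for J in jacobians:
        Q, R = np.linalg.qr(J @ Q)
        log_growth += np.log(abs(R[0, 0]))   # stretch along the dominant direction
    return log_growth / len(jacobians)
```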


However, in practice, the Jacobian method based on the neural network estimation first proposed by Nychka et al. (1992) and Gençay and Dechert (1992) is the most widely used method in recent empirical analyses in economics. The neural network estimator θ̂ can be obtained by minimizing the least squares criterion

    S_T(θ_T) = (1/T) Σ_{t=1}^{T} (1/2)(X_t − θ_T(Z_{t−1}))^2,

where the neural network sieve θ_T: R^m → R is an approximation function defined by

    θ_T(z) = β_0 + Σ_{j=1}^{k} β_j ψ(a_j′ z + b_j),

where ψ is an activation function and k is the number of hidden units. For the neural network estimation, we use the FUNFITS program developed by Nychka et al. (1996). As an activation function ψ, this program uses a type of sigmoid function

    ψ(u) = u(1 + |u/2|) / (2 + |u| + u^2/2),

which was also employed by Nychka et al. (1992). The number of hidden units (k) is determined by minimizing the BIC criterion.
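For concreteness, the activation and the sieve are easy to code directly (a sketch of ours; FUNFITS itself is a separate statistical package and is not reproduced here):

```python
import numpy as np

def psi(u):
    """The sigmoid-type activation function given above."""
    u = np.asarray(u, dtype=float)
    return u * (1.0 + np.abs(u / 2.0)) / (2.0 + np.abs(u) + u**2 / 2.0)

def neural_sieve(z, beta0, beta, A, b):
    """theta_T(z) = beta_0 + sum_j beta_j * psi(a_j' z + b_j).
    A stacks the row vectors a_j' (k x m); beta and b have length k."""
    return beta0 + beta @ psi(A @ z + b)
```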

Using the argument in Whang and Linton (1999), Shintani and Linton (2004) showed that, under some reasonable conditions, the neural network estimator λ̂_M is asymptotically normal, and its standard error can be obtained using

    Φ̂ = Σ_{j=−M+1}^{M−1} ω(j/S_M) γ̂(j)   with   γ̂(j) = (1/M) Σ_{t=|j|+1}^{M} η̂_t η̂_{t−|j|},

where

    η̂_t = ξ̂_t − λ̂_M,

with

    ξ̂_1 = (1/2) ln ν_1(T̂_1′ T̂_1)   and   ξ̂_t = (1/2) ln[ ν_1(T̂_t′ T̂_t) / ν_1(T̂_{t−1}′ T̂_{t−1}) ]   for t ≥ 2.

Above, ω(·) and S_M denote a kernel function and a lag truncation parameter, respectively. Note that the standard error is essentially the heteroskedasticity and autocorrelation consistent covariance estimator of Andrews (1991) applied to η̂_t. We employ the QS kernel for ω(·), with S_M selected by the optimal bandwidth selection method recommended in Andrews (1991).
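A compact version of this variance computation (our sketch; the bandwidth S_M below is a simple rule of thumb rather than Andrews' data-dependent choice) is:

```python
import numpy as np

def qs_kernel(x):
    """Quadratic spectral kernel of Andrews (1991)."""
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    nz = x != 0
    d = 6.0 * np.pi * x[nz] / 5.0
    out[nz] = 3.0 / d**2 * (np.sin(d) / d - np.cos(d))
    return out

def lyapunov_std_error(xi, bandwidth=None):
    """Standard error of lambda_hat from the increments xi_t defined above."""
    M = len(xi)
    eta = xi - xi.mean()                     # eta_t = xi_t - lambda_hat
    S_M = bandwidth or 1.3 * M ** 0.2        # rule-of-thumb bandwidth (assumption)
    Phi = eta @ eta / M                      # j = 0 autocovariance
    for j in range(1, M):
        Phi += 2.0 * qs_kernel(j / S_M) * (eta[j:] @ eta[:-j]) / M
    return float(np.sqrt(Phi / M))
```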

6.4. Data and Results

The data, taken from the IMF International Financial Statistics, consist of quarterly nominal exchange rates and consumer price indices covering the period 1973:1 to 1998:4


for twenty-one OECD countries. A final demand price is used in the calculation of the real exchange rate instead of an output price, because the same quarterly output price is not available for each country. It should be noted, however, that Perron and Vogelsang (1992) and Serletis and Gogas (2004) argue that the results of purchasing power parity tests could depend on the price used. The countries involved are Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Japan, the Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, the United Kingdom, and the United States. In investigating purchasing power parity, however, we consider sixty bilateral intercountry relations: twenty relations between the United States as the home country and the other countries as the foreign countries; twenty relations between Germany as the home country and the other countries as the foreign countries; and twenty relations between Japan as the home country and the other countries as the foreign countries.

Lyapunov exponent point estimates are displayed in Table 6.1 for the first logged differences of the data. The results are presented for dimensions 1 through 6, with the optimal value of the number of hidden units (k) in the neural net being chosen by minimizing the BIC criterion. An asterisk indicates rejection of the null hypothesis H0: λ ≥ 0 at the 1% level. In general, for almost half of the cases we cannot reject the null hypothesis of chaotic behavior, irrespective of which country is used as the home country. Of course, the estimates depend on the choice of the dimension parameter, m. As m increases, the Lyapunov exponent point estimates increase in value, but even at low dimensions the null of chaotic behavior cannot be rejected.


Figure 6.1. Phase Portrait for the DM-Based Real Exchange Rate for Austria.


Figure 6.2. Phase Portrait for the DM-Based Real Exchange Rate for the Netherlands.


Figure 6.3. Phase Portrait for the Dollar-Based Real Exchange Rate for the U.K.


Figure 6.4. Phase Portrait for the Dollar-Based Real Exchange Rate for Canada.


Figure 6.5. Phase Portrait for the Yen-Based Real Exchange Rate for Portugal.


6.5. Robustness

In this section we investigate the robustness of our results to alternative dynamical diagnosis procedures. In particular, we present two-dimensional phase space portraits for six representative real exchange rate series: the DM-based exchange rates for Austria, the Netherlands and Spain, the U.S. dollar-based exchange rates for the United Kingdom and Canada, and the yen-based exchange rate for Portugal. The phase portraits reconstruct aspects of an attractor's geometry by plotting each data point in an observed series against an estimate of its derivative — see, for example, Sprott (1995). As Casdagli (1991) argues, the best evidence for low dimensional chaos is a complex but structured phase portrait; deterministic processes such as chaos have structured phase portraits because their attractors limit the area of phase space that they can visit, whereas stochastic processes can visit all areas of the phase space. The phase portraits in Figures 6.1-6.6 are complex but structured, thereby providing further evidence for the presence of low dimensional structure.


Figure 6.6. Phase Portrait for the DM-Based Real Exchange Rate for Spain.

6.6. Conclusion

We have applied tests from dynamical systems theory to contrast the apparent random walk behavior of the real exchange rate with chaotic dynamics. In doing so, we used quarterly U.S. dollar-based, DM-based, and Japanese yen-based real exchange rates for 21 OECD countries over the recent floating exchange rate period, from 1973:1 to 1998:4. We have found evidence consistent with deterministic chaotic dynamics in the real exchange rate, irrespective of which country is used as the home country. This is consistent with the evidence reported by Serletis and Gogas (2000), although they only use U.S. dollar-based real exchange rates for 17 OECD countries over the period from 1957:1 to 1995:4.

Table 6.1. Lyapunov exponent estimates for US$-, DM- and yen-based real exchange rates

US$-based series, k = 1

             m    BIC      λ̂
Australia    1   -5.97   -3.16∗
             2   -5.94   -1.62∗
             3   -5.84   -0.33∗
             4   -5.74   -0.70∗
             5   -5.71   -0.13∗
             6   -5.61   -0.06
Austria      1   -5.43   -3.57∗
             2   -5.43   -1.24∗
             3   -5.30   -0.41∗
             4   -5.36   -0.16∗
             5   -5.21   -0.25∗
             6   -5.15   -0.16∗
Belgium      1   -5.40   -3.65∗
             2   -5.43   -1.16∗
             3   -5.32   -0.38∗
             4   -5.31   -0.16∗
             5   -5.19   -0.13∗
             6   -5.11   -0.24∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

US$-based series, k = 1

             m    BIC      λ̂
Canada       1   -7.47   -2.75∗
             2   -7.42   -2.02∗
             3   -7.34   -0.05∗
             4   -7.50   -0.30∗
             5   -7.22   -0.50∗
             6   -7.48    0.06
Denmark      1   -5.47   -2.84∗
             2   -5.51   -1.25∗
             3   -5.36   -0.36∗
             4   -5.40   -0.16∗
             5   -5.26   -0.25∗
             6   -5.28   -0.27∗
Finland      1   -5.91   -2.10∗
             2   -5.88   -1.38∗
             3   -5.78   -0.56∗
             4   -5.77   -0.05
             5   -5.75   -0.24∗
             6   -5.64   -0.34∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

Table 6.1. (Continued)

US$-based series, k = 1

             m    BIC      λ̂
France       1   -5.57   -3.22∗
             2   -5.52   -1.47∗
             3   -5.46   -0.55∗
             4   -5.38   -0.31∗
             5   -5.36   -0.33∗
             6   -5.24   -0.20∗
Germany      1   -5.39   -3.40∗
             2   -5.39   -0.96∗
             3   -5.29   -0.52∗
             4   -5.25   -0.22∗
             5   -5.18   -0.29∗
             6   -5.90   -0.16∗
Greece       1   -5.61   -3.67∗
             2   -5.66   -1.83∗
             3   -5.48   -0.68∗
             4   -5.59    0.10
             5   -5.45   -0.17∗
             6   -5.47   -0.34∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

US$-based series, k = 1

             m    BIC      λ̂
Ireland      1   -5.55   -3.29∗
             2   -5.56   -1.20∗
             3   -5.45   -0.64∗
             4   -5.44   -0.28∗
             5   -5.32   -0.23∗
             6   -5.39   -0.12∗
Italy        1   -5.60   -2.77∗
             2   -5.55   -1.44∗
             3   -5.47   -0.89∗
             4   -5.41   -0.39∗
             5   -5.35   -0.30∗
             6   -5.31   -0.21∗
Japan        1   -5.32   -2.15∗
             2   -5.30   -1.42∗
             3   -5.30   -0.40∗
             4   -5.09   -0.17∗
             5   -5.12   -0.13∗
             6   -4.92   -0.30∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

Table 6.1. (Continued)

US$-based series, k = 1

              m    BIC      λ̂
Netherlands   1   -5.42   -3.64∗
              2   -5.40   -1.14∗
              3   -5.35   -0.59∗
              4   -5.25   -0.25∗
              5   -5.23   -0.23∗
              6   -5.20   -0.18∗
New Zealand   1   -5.60   -3.66∗
              2   -5.64   -1.31∗
              3   -5.58   -0.12∗
              4   -5.44   -0.77∗
              5   -5.43   -0.36∗
              6   -5.60   -0.10∗
Norway        1   -5.79   -3.16∗
              2   -5.81   -1.04∗
              3   -5.68   -0.27∗
              4   -5.79   -0.16∗
              5   -5.61   -0.22∗
              6   -5.56   -0.38∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

US$-based series, k = 1

             m    BIC      λ̂
Portugal     1   -5.54   -3.47∗
             2   -5.58   -1.51∗
             3   -5.41   -0.22∗
             4   -5.55   -0.30∗
             5   -5.31   -0.32∗
             6   -5.30   -0.29∗
Spain        1   -5.74   -3.09∗
             2   -5.60   -1.54∗
             3   -5.63   -0.26∗
             4   -5.47   -0.29∗
             5   -5.56   -0.40∗
             6   -5.44   -0.31∗
Sweden       1   -5.61   -2.61∗
             2   -5.80   -1.12∗
             3   -5.56   -0.33∗
             4   -5.79   -0.67∗
             5   -5.49   -0.14∗
             6   -5.52   -0.10∗

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0.

Table 6.1. (Continued)

US$-based series, k = 1

                 m    BIC      λ̂
Switzerland      1   -5.20   -3.17∗
                 2   -5.11   -1.08∗
                 3   -5.11   -0.46∗
                 4   -5.09   -0.12∗
                 5   -5.02   -0.29∗
                 6   -4.95   -0.13∗
United Kingdom   1   -5.66   -2.85∗
                 2   -5.65   -1.39∗
                 3   -5.54   -0.50∗
                 4   -5.60   -0.40∗
                 5   -5.40   -0.19∗
                 6   -5.48   -0.19∗
United States    1–6   —       —

Notes: The largest Lyapunov exponent estimates are presented. An asterisk indicates rejection of the null hypothesis H0 : λ ≥ 0. No US$-based entries exist for the United States.


References

[1] Adler, Michael, and Bruce Lehman. “Deviations from Purchasing Power Parity in the Long Run.” Journal of Finance 38 (1983), 1471-87.
[2] Andrews, Donald W.K. “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation.” Econometrica 59 (1991), 817-858.
[3] Barnett, William A. and Apostolos Serletis. “Martingales, Nonlinearity, and Chaos.” Journal of Economic Dynamics and Control 24 (2000), 703-724.
[4] Casdagli, M. “Chaos and Deterministic versus Stochastic Nonlinear Modeling.” Journal of the Royal Statistical Society, Series B, 54 (1991), 303-328.
[5] Coe, Patrick J. and Apostolos Serletis. “Bounds Tests of the Theory of Purchasing Power Parity.” Journal of Banking and Finance 26 (2002), 179-199.
[6] Dickey, David A., and Wayne A. Fuller. “Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root.” Econometrica 49 (1981), 1057-72.
[7] Eckmann, J.-P. and D. Ruelle. “Ergodic Theory of Chaos and Strange Attractors.” Reviews of Modern Physics 57 (1985), 617-656.
[8] Eckmann, J.-P., S.O. Kamphorst, D. Ruelle, and S. Ciliberto. “Liapunov Exponents from Time Series.” Physical Review A 34 (1986), 4971-4979.
[9] Flynn, N. Alston, and Janice L. Boucher. “Tests of Long-Run Purchasing Power Parity Using Alternative Methodologies.” Journal of Macroeconomics 15 (1993), 109-22.
[10] Gençay, R. and W.D. Dechert. “An Algorithm for the n Lyapunov Exponents of an n-Dimensional Unknown Dynamical System.” Physica D 59 (1992), 142-157.
[11] Grilli, Vittorio, and Graciela Kaminsky. “Nominal Exchange Rate Regimes and the Real Exchange Rate: Evidence from the United States and Great Britain, 1885-1986.” Journal of Monetary Economics 27 (1991), 191-212.
[12] Mark, Nelson C. “Real and Nominal Exchange Rates in the Long Run: An Empirical Investigation.” Journal of International Economics 28 (1990), 115-136.
[13] Nelson, Charles R. and Charles I. Plosser. “Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications.” Journal of Monetary Economics 10 (1982), 139-162.
[14] Nychka, D.W., S. Ellner, A.R. Gallant, and D. McCaffrey. “Finding Chaos in Noisy Systems.” Journal of the Royal Statistical Society, Series B, 54 (1992), 399-426.
[15] Patel, Jayendu. “Purchasing Power Parity as a Long-Run Relation.” Journal of Applied Econometrics 5 (1990), 367-379.
[16] Perron, Pierre. “The Great Crash, the Oil Price Shock, and the Unit Root Hypothesis.” Econometrica 57 (1989), 1361-1401.


[17] Serletis, Apostolos. “Maximum Likelihood Cointegration Tests of Purchasing Power Parity: Evidence from Seventeen OECD Countries.” Weltwirtschaftliches Archiv 130 (1994), 476-493.
[18] Serletis, Apostolos and Periklis Gogas. “Purchasing Power Parity, Nonlinearity and Chaos.” Applied Financial Economics 10 (2000), 615-622.
[19] Serletis, Apostolos and Periklis Gogas. “Long-Horizon Regression Tests of the Theory of Purchasing Power Parity.” Journal of Banking and Finance 28 (2004), 1961-1985.
[20] Serletis, Apostolos and Mototsugu Shintani. “No Evidence of Chaos But Some Evidence of Dependence in the U.S. Stock Market.” Chaos, Solitons, and Fractals 17 (2003), 449-454.
[21] Serletis, Apostolos, and Grigorios Zimonopoulos. “Breaking Trend Functions in Real Exchange Rates: Evidence from Seventeen OECD Countries.” Journal of Macroeconomics 19 (1997), 781-802.
[22] Shintani, Mototsugu and Oliver Linton. “Nonparametric Neural Network Estimation of Lyapunov Exponents and a Direct Test for Chaos.” Journal of Econometrics 120 (2004), 1-33.
[23] Sprott, J.S. Chaos Data Analyzer: User's Manual. Raleigh, USA: Physics Academic Software (1995).
[24] Whang, Yoon-Jae and Oliver Linton. “The Asymptotic Distribution of Nonparametric Estimates of the Lyapunov Exponent for Stochastic Time Series.” Journal of Econometrics 91 (1999), 1-42.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 121-136

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 7

A METHODOLOGY FOR THE IDENTIFICATION OF TRADING PATTERNS

Athanasios Sfetsos1,∗ and Costas Siriopoulos2,†
1 EREL, INTRP, NCSR Demokritos, 15310 Ag. Paraskevi, Greece
2 Department of Business Administration, University of Patras, University Campus, 26500 Patra, Greece

7.1. Introduction

Recent empirical studies in the international literature have documented substantial evidence on the predictability of asset returns and foreign exchange markets using technical analysis trading rules (Brock et al. 1992, Levich and Thomas 1993, Chan et al. 1996, Bessembinder and Chan 1998, Allen and Karjalainen 1999, Lo et al. 2000, Fang and Xu 2003). These studies report that asset returns are correlated and, hence, that predictability can be captured, at least to some degree, by technical trading rules, by certain time series models, or by their combination. The purpose of this paper is to provide a data-driven methodology for the identification of formations that appear in financial series with limited intervention by the user. Additionally, it provides the means of classifying formations of constant length into groups with similar characteristics. Technical analysis is defined as the attempt to identify regularities in the time series of price and volume information from a financial market, by extracting patterns from noisy data (Lo et al. 2000). Technical analysts use price charts to study market action for the purpose of forecasting future price trends (Murphy 1986). They believe that prices move in trends, and that history (price patterns) repeats itself (Murphy 1986). Price patterns are pictures or formations, which appear on price charts, that can be classified into different categories, and that have predictive value. Although there are potentially an infinite number of price formations, the technical analysis literature (Bulkowski 2000) suggests that certain price formations are widely identified in stock and foreign exchange markets. Price patterns are classified in two major categories, namely, reversal patterns and

∗ E-mail address: [email protected]
† E-mail address: [email protected]


continuation patterns. Reversal patterns indicate that an important reversal in trends is taking place. A continuation pattern is an indication that the existing trend will be resumed (Murphy 1986). Technical analysts use such price formations to find a minimum price objective. Weiss and Indurkhya (1998) define data mining as “the search for valuable information in large volumes of data. Predictive data mining is a search for very strong patterns in big data sets that can generalise to take accurate future decisions”. The application of data mining methodologies in financial forecasting aims to discover relationships between past and future formations of the series with limited intervention from the user. This technique also provides the opportunity to generate association rules that linguistically describe the examined process. There are two main types of data mining applications in financial modelling. In the first, some predefined structure of the series is assumed to exist, and an indexing (search-by-query) approach aims to identify similar behaviour within the dataset. The alternative is to apply these methodologies to identify the main patterns within a series and examine their reoccurrence characteristics for prediction purposes. Sarker et al. (2003) introduced a method of extracting sequences of symbols from time series data by using segmentation and clustering processes. The search space was reduced, and the speed of the process improved, by grouping the data. The experimental results, conducted on a shared-memory multiprocessor system, demonstrated the need for parallel techniques when mining huge amounts of data in the time series domain. Zeng et al. (2001) proposed a hybrid approach for the prediction of the stock market trend based on pattern recognition and classification. A set of vectors describing the future patterns is introduced, and these are classified based on a probabilistic relaxation algorithm. The approach is tested on the Qantas stock price, and a 68% success rate is reported. Leigh (2002) introduced a method of template matching to predict future trends of a series; the results resemble those produced by technical analysis. The results on the NYSE composite index showed about 68% success in the prediction of the 5-day pattern. Das et al. (1998) proposed an approach for the discovery of rules from sequential observations, using a discrete representation. The rules found local relationships in the series using primitive shapes of the series, initially found by a greedy clustering algorithm. Experimental results were presented for ten companies of the NASDAQ stock market over a period of 19 months. Lu et al. (1998) studied the stock market for extracting inter-transaction association rules. They defined events on the time scale and tried to extract relations between similar patterns of events. The algorithm was limited to a fixed “sliding window” (time interval); only associations among events within the same window were extracted. A demonstration on a single-dimensional stock market database was given, but without prediction results. Last et al. (2001) introduced a general methodology for knowledge discovery in time series databases. The process included cleaning and filtering of the data, identifying the most important predicting attributes, and extracting a set of association rules that can be used to predict the time series behaviour in the future. The computational theory of perception was used to reduce the set of extracted rules by fuzzification and aggregation. Povinelli and Feng (2003) proposed a method that identifies predictive temporal structures in reconstructed phase spaces. A genetic algorithm was employed to search such phase spaces for optimal heterogeneous clusters that are predictive of the desired events. They applied an optimisation approach to search for the optimal


temporal patterns, which match the specific goal of the problem. Their approach was able to classify time series events more effectively than Time Delay Neural Networks and the C4.5 algorithm. In this work a tool for identifying trading patterns of constant length is presented, consisting of two parts. First, a clustering algorithm is applied to identify groups with similar trading characteristics. The selected clustering algorithm is the k-means, combined with a clustering validity index and a statistical test for outliers. The transitional properties between successive trading patterns are also examined. Second, a data mining algorithm is applied to estimate formations, which can be used to identify segments of the series that reoccur frequently. The proposed approach is applied to two major financial series, the daily closing values of the Dow Jones Index and the GB Pound to US Dollar exchange rate. The analysis identified the presence of series pattern implication and revealed those patterns with a high probability of reoccurrence.

7.2. Methodology

The developed methodology contains all the necessary steps to identify trading patterns of predefined length within a given time series. The only requirement for the user of the algorithm is to specify the basic time interval. This can be either a date-defined period, e.g. a week or a month, or a number of successive days. The advantage of the former is that it can account for effects related to days of the week, but in the case of missing data, due to holidays, special arrangements are needed. The initial outcome of the methodology is groups of patterns, termed “classes”, which exhibit common trading characteristics. The estimated classes are then used as inputs in a data mining algorithm to generate rules with a high percentage of appearance in the data set.

Figure 7.1. Schematic Representation of Developed Methodology.

7.2.1. Input Selection

The process initiates with the selection of the period of the analysis. The data that belong to this period can be pre-processed using common financial transformations, such as percentage changes or returns over successive intervals. Here, the former is preferred, since results from different data sets can be compared directly. Thus a matrix x of n × m dimensions is formed that contains the values (raw or pre-processed) of the series. Here n reflects the selected period and m is the integer part of the total number of points in the series divided by the selected interval.
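As a concrete illustration, the following sketch — hypothetical code, not the authors' — forms the n × m matrix of daily percentage changes for a 5-day (weekly) interval from a vector of closing prices; the variable names are assumptions.

import numpy as np

def input_matrix(prices, n=5):
    """Build the n x m input matrix of daily percentage changes:
    each column holds one n-day period (e.g. one trading week)."""
    pct = 100.0 * np.diff(prices) / prices[:-1]   # percentage change between days
    m = len(pct) // n                             # integer part of the division
    return pct[: n * m].reshape(m, n).T           # n rows, m columns

# Example: 60 synthetic closing prices -> a 5 x 11 matrix
prices = 100 + np.cumsum(np.random.default_rng(1).normal(size=60))
x = input_matrix(prices)
print(x.shape)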

7.2.2. Clustering

The resulting inputs from the previous process are grouped into sets with similar characteristics using clustering algorithms. The operating principle is to partition a given data set into groups whose members are more similar to each other than to those in different groups. In this study the k-means algorithm was selected (Jain 1999). The data set is directly decomposed into groups through the iterative optimisation of an objective function that describes the distance between a point and the nearest cluster centre:

E = \sum_{i=1}^{c} \sum_{x \in C_i} d(x, p_i)    (7.1)

Here, p_i are the coordinates of the cluster centre C_i, and d is the Euclidean distance between the point x and p_i. The process starts by selecting the number of clusters c and randomly initialising the centres. Then an iterative process is performed that assigns all points to the cluster with the nearest centre and re-estimates the centres. This process is repeated until the positions of the centres stop changing. The most difficult issue in similar research is the selection of a number of clusters that gives meaningful results. In this study a modified compactness and separation criterion (CSC) is applied, as proposed by Kim et al. (2001). Two indices that estimate under-partition and over-partition of the data set are used:

U_u = \frac{1}{c} \sum_{i=1}^{c} MD_i, \qquad U_o = \frac{c}{d_{min}}    (7.2)

MD_i is the mean intra-cluster distance of the i-th cluster, and d_min is the minimum distance between cluster centres, which is a measure of inter-cluster separation. The optimum number is found from the minimisation of a combined expression of these two indices, each normalised to [0, 1]:

\bar{U}_u = \frac{U_u - \min(U_u)}{\max(U_u) - \min(U_u)}, \qquad \bar{U}_o = \frac{U_o - \min(U_o)}{\max(U_o) - \min(U_o)}, \qquad U_c = \bar{U}_u + \bar{U}_o    (7.3)
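To make the selection procedure concrete, here is a minimal sketch — an illustration under stated assumptions, not the chapter's implementation — that scans candidate cluster counts, computes U_u and U_o of Eq. (7.2), normalises them as in Eq. (7.3), and picks the c that minimises U_c. The use of scikit-learn's KMeans is an assumption.

import numpy as np
from sklearn.cluster import KMeans
from scipy.spatial.distance import pdist

def csc_select(X, c_range=range(2, 11), seed=0):
    """Pick the number of clusters c by the compactness/separation
    criterion: U_u = mean intra-cluster distance averaged over clusters,
    U_o = c / d_min; rescale both to [0,1] and minimise U_c = U_u + U_o."""
    uu, uo = [], []
    for c in c_range:
        km = KMeans(n_clusters=c, n_init=10, random_state=seed).fit(X)
        md = [np.linalg.norm(X[km.labels_ == i] - km.cluster_centers_[i], axis=1).mean()
              for i in range(c)]
        uu.append(np.mean(md))                           # under-partition index
        uo.append(c / pdist(km.cluster_centers_).min())  # over-partition index
    uu, uo = np.array(uu), np.array(uo)
    uc = (uu - uu.min()) / (uu.max() - uu.min()) + (uo - uo.min()) / (uo.max() - uo.min())
    return list(c_range)[int(np.argmin(uc))]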


Once the number of clusters is finalised, the data are assigned to the nearest cluster centre using the Euclidean distance, forming a “class”. Each cluster is checked for the presence of outliers using the Grubbs (1969) test, also known as the maximum normalised residual test. The test is iterated until no more outliers are detected. Finally, the transitional matrix (TM) is estimated for the remaining data. This matrix shows the frequency, in percentage terms, with which data belonging to class i are followed by data belonging to class j. The transitional matrix is a visual tool for identifying preferential and prohibited transitions using the historical information of the data set.

TM(i, j) = \frac{\sum \text{days of type } i \text{ followed by } j}{\sum \text{days of type } i}    (7.4)
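A sketch of the transition-matrix estimate of Eq. (7.4), assuming the class labels are already available as an integer sequence (the function and variable names are hypothetical):

import numpy as np

def transition_matrix(labels, n_classes):
    """TM(i, j): percentage of periods of class i that are
    immediately followed by a period of class j (Eq. 7.4)."""
    counts = np.zeros((n_classes, n_classes))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return 100.0 * counts / np.where(row_sums == 0, 1, row_sums)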

7.2.3. Data Mining

A data mining algorithm is derived to estimate association rules that describe sequential characteristics of the estimated trading patterns or “classes”. The association rules are of the generalised form

C(t − j) · · · C(t − 2) C(t − 1) ⇒ C(t) · · · C(t + k)    (7.5)

The antecedent part or Left Hand Side (LHS) contains historic information about the classes, C, ordered in a sequential manner. The consequent part or Right Hand Side (RHS) contains future values of the series. Two measures are used to determine the strength of a rule, using information contained in the data set:

• Support is the number of instances in which the rule appears within the data set.
• Confidence is the accuracy with which the rule predicts correctly, i.e., how often the LHS leads to the RHS.

These measures, whose thresholds are user-defined values, quantify the frequency with which a rule, and therefore a trading pattern, appeared in the data, and the predictability of the specified rule. The total number of possible rules is the number of different classes raised to the power of the combined LHS and RHS dimensions. The most elementary approach, taken by several investigators (Keogh 2002, Mannila 1997, Agrawal 1995), is to sequentially scan all the available data and to add up every occurrence of a rule, and the occurrences of every antecedent and consequent. This counting makes it possible to calculate the frequency of appearance and confidence of each rule, at the cost of limiting the format of the rules. Even moderately complex rule formats will make the task of counting all occurrences impractical. Previous attempts at solving the problem of mining predictive rules from time series can be classified into two main types. First are supervised methods, where the target-rule form is known in advance and used as input to the data mining algorithm. The objective of the analysis then becomes to generate rules for predicting these events based on the data available before the event occurred (Weiss 1998, Hetland 2002). In the second type, unsupervised methods, the inputs to the algorithms are only the data. The goal is to automatically extract informative rules from the series. In most cases this means that the rules should have some level of preciseness, be representative of the data, easy to interpret, and interesting to the financial analyst (Freitas 2002).

Data mining algorithm

Step 1: Select LHS
Step 2: Set flag = true for all data points
        Loop over all data points:
            if flag(point) = true:
                add new rule
                q = find data following the rule
                set flag(q) = false
        Remove rules below the support threshold
        Remove rules below the confidence threshold
        Save rules
        Extend the LHS and go to Step 1

Figure 7.2. Schematic representation of the data mining algorithm.

The developed data mining algorithm (Fig. 7.2) progresses sequentially, but the search in each loop is done with a decreasing number of samples. The entire process is iterated with an increasing antecedent part until the support of all estimated rules falls below a predefined threshold value. The finally presented rules are those that appear with more than a specified support and confidence threshold. These thresholds are selected to single out trading patterns that appeared frequently in the trading history and exhibited a high probability that the consequent part will appear in the future, thus enabling profitable trading decisions.
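The loop in Figure 7.2 can be sketched in a few lines. The following is a hedged illustration — not the authors' code — of support/confidence counting for fixed-length rules over a class sequence; the names and the fixed one-step consequent are simplifying assumptions.

from collections import Counter

def mine_rules(labels, lhs_len=2, min_support=3, min_confidence=0.7):
    """Count every (LHS -> next class) occurrence and keep the rules
    whose support and confidence clear the user-defined thresholds."""
    lhs_counts, rule_counts = Counter(), Counter()
    for t in range(len(labels) - lhs_len):
        lhs = tuple(labels[t:t + lhs_len])
        lhs_counts[lhs] += 1
        rule_counts[(lhs, labels[t + lhs_len])] += 1
    rules = []
    for (lhs, rhs), support in rule_counts.items():
        confidence = support / lhs_counts[lhs]
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return rules

# e.g. mine_rules(class_sequence, lhs_len=2) would return rules such as
# ((9, 14), 1, 3, 0.75) for the weekly DJI rule reported in Section 7.3.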

7.3. Application to the Dow Jones Index Closing Values

The first series to which the methodology was applied is the daily closing values of the Dow Jones index. The available data cover the period from 13th January 1980 to 20th December 2003. Two different configurations are tested: weekly data, with missing values replaced by the previous ones, and 4-day periods of successive data.


A. Weekly Data

The data set was formed using the percentage change between the days, thus forming a matrix of 5 rows. The total number of samples (weeks) for this interval was 1248.

[%P(F → M), %P(M → T), %P(T → W), %P(W → T), %P(T → F)] × 100    (7.6)

Using the process described in the previous section, 16 classes of typical trading patterns were identified, and 35 samples were discarded from further analysis as outliers. The most frequently observed pattern, covering 9.66% of the data, is C4, where an initial increase (Mon. to Wed.) is followed by a decrease in the last two days of the week, for an overall increase of 0.28%. Taking the closing value of the previous Friday as unity, 11 of the 16 classes exhibited a positive sign at the close of Friday, as against the 5 that correspond to a negative week. This is a result of the general upward trend exhibited by the DJI values from the 1980s until the mid 1990s.

Table 7.1. Weekly trading patterns

Class   Mon.    Tue.    Wed.    Thu.    Fri.    % App.
1       0.067   1.55    0.28   -0.17   -0.32    5.29
2      -0.57   -0.24    0.30    0.60    0.12    7.77
3      -0.16   -0.70   -0.52    0.22   -1.03    5.85
4       0.85    0.02    0.20   -0.28   -0.50    9.66
5       0.70   -0.53   -0.58    1.29    0.01    4.48
6      -0.16    0.62    1.43    0.02   -0.20    4.89
7      -0.18   -0.22   -0.13    0.29    1.39    7.53
8      -0.47    1.51    0.04    1.09    0.25    3.61
9       0.26   -1.37   -0.50    0.15    0.09    4.17
10      0.04   -0.16    1.16    1.51    0.09    4.37
11      0.06    0.29    0.28   -0.16    0.29    8.73
12      0.45   -0.16   -0.70   -0.54    0.38    6.09
13      1.46   -0.08    0.34    0.16    1.28    4.57
14      0.22   -0.70    1.13   -0.35    0.08    6.17
15     -0.05    0.07   -0.30   -1.22   -0.20    7.05
16     -0.11    0.75   -0.66    0.19   -0.50    7.05

Table 7.2 presents the most frequent transitions that occurred in the data set. The threshold for those shown here is a frequency higher than twice the equal-probability value for each class, i.e. 2 × 1/16. It can be observed that in four cases, C2, C4, C11 and C15, there is evidence of persistence in the behaviour of the DJI weekly trading characteristics. Also, C4 is a highly probable follow-on to five of the presented classes (C1, C2, C4, C6 and C7), all of which exhibit a total increase through the weekly period. Transitions C1 > C4, C6 > C4 and C7 > C4 demonstrate a strong increase followed by a moderate one in the following week, whereas1

1 Change refers to the overall weekly change, using the following signs: : strong increase, ↑: increase, ∼: constant, ↓: decrease, : strong decrease.

Table 7.2. Frequent transitions

Trans.    Change    %        Trans.     Change    %
1 > 4     >↑       15.15     7 > 4      >↑       13.82
2 > 2     ∼>∼      14.43     9 > 1      >        15.38
3 > 16    >↓       13.69     11 > 11    ↑>↑      20.18
4 > 4     ↑>↑      13.22     12 > 11    ↓>↑      13.15
6 > 4     >↑       13.11     15 > 15    >        12.5

transitions C4 > C4 and C11 > C11 show a succession of increasing periods. Periods of strong negative performance show mixed behaviour, as strong increase (C9 > C1), decrease (C3 > C16) and strong decrease (C15 > C15) all appear frequently in the data set.
The application of the developed data mining algorithm to the estimated classes, with threshold values of support = 3 and confidence = 70%, resulted in a single rule, whose antecedent appeared four times in total in the data set. For a rule with three dimensions there are 4096 different alternatives, which exceeds the number of available samples. The antecedent part of the rule shows decreasing behaviour in the initial stages followed by a stabilisation period, whereas the consequent part is increasing.

Rule: C(j-2) = 9 & C(j-1) = 14 > C(j) = 1. Support: 3. Confidence: 75%.

B. 4-day Period

The methodology was reapplied to data formed from 4-day periods of successive data, ignoring non-working days and the day-of-the-week effect. In total 1512 non-overlapping periods were considered, and the data were pre-processed using the percentage operator between successive days, as previously. The application of the clustering algorithm resulted in 12 classes (Table 7.3), with a total of 59 values removed as outliers. Here the most frequent trading pattern is C7, with 14.81%, which corresponds to an overall increase with a high value on the second day and fairly small variation on the rest, totalling 0.58%. As noted previously, the number of patterns that correspond to an overall increase in the DJI value during the 4-day period is 8, compared to the 4 that show a decrease. The most frequent transitions, above 15.5%, occurring in the data set are presented in Table 7.4. Only in two cases, C7 and C9, is there evidence of strong persistent reoccurrence; the latter is associated with strongly decreasing values over both intervals. Transition C4 > C7 represents a strong overall increase followed by a moderate one, whereas transitions from C1, C7 and C12 show increases of similar magnitude. The movement from C3 to C2 shows an intense increase in the second period after a period with successive negative and positive movements that is constant overall.
The application of the data mining algorithm to the selected classes, with similar threshold values, revealed the presence of two association rules (Fig. 7.3). Since the total dimension of the rules is now four, there are in total 20736 different possible combinations that2

2 Change refers to the overall change for the period, using the following signs: : strong increase, ↑: increase, ∼: constant, ↓: decrease, : strong decrease.


Table 7.3. 4-day period trading patterns

Class   Day 1   Day 2   Day 3   Day 4   % App.
1      -0.02    1.48    0.37   -1.2     4.37
2       0.03    0.43    1.58   -0.03    8.2
3       0.13   -1.44    0.11    1.3     5.75
4       1.4     0.65    0.11    0.43    7.28
5       0.14   -0.42   -1.09    0.21    9.33
6       1.08   -0.59    0.37   -0.11    9.26
7      -0.08    0.73   -0.15    0.08   14.81
8      -0.02    0.43   -0.07    2.04    4.63
9      -0.42   -0.47    0.38   -0.55   11.84
10      0.38   -0.1    -0.57   -1.51    7.08
11     -1.77   -0.03   -0.95    0       4.37
12     -0.5    -0.13    0.52    0.77    9.19

Table 7.4. Frequent transitions

Trans.   Change   %        Trans.    Change   %
1 > 7    ↑>↑      16.66    7 > 7     ↑>↑      22.76
3 > 2    ∼>       17.24    9 > 9     >        15.64
4 > 7    >↑       21.81    12 > 7    ↑>↑      17.98

Rule 1 Rule 2

C(j-3) 4 5

C(j-2) 7 11

C(j-1) 7 7

C(j) 7 7

Support 3 3

Confidence 75% 100%

7.4. Application to the Pound-dollar Exchange Rate Series The second examined series is the daily values of the GB Pound to US Dollar exchange rates, over a period from the 14th September 1980 to the 13th December of 2003. This series is characterised by milder fluctuations, an overall decreasing trend and more evident structural breaks. Again the same two configurations as for the DJI series were tested: well specified weekly intervals and 4-day periods of successive data.

130

Athanasios Sfetsos and Costas Siriopoulos 5 4 Rule 1

3

% Change

2 1 0 -1 Rule 2

-2 -3 -4 -5

0

2

4

6

8

10

12

14

16

Days

Figure 7.3. Rules of 4-day period.

A. Weekly Data

As in the previous series, the data set was formed using the percentage change between the days, thus forming a matrix of 5 rows. The total number of samples (weeks) for this interval was 1213.

[%P(F → M), %P(M → T), %P(T → W), %P(W → T), %P(T → F)] × 100    (7.7)

The application of the clustering algorithm indicated the presence of 21 different classes (Table 7.5), and 26 outliers were removed from the remaining analysis. The most frequently observed class is C15, which exhibits almost constant behaviour in the first three days and decreasing characteristics in the last two days of the week, resulting in a total loss of 0.92%. This class corresponds to 7.25% of the data set. For this series the majority of the identified classes, 13 out of 21, exhibit a positive overall balance. Table 7.6 displays the most frequent transitions between the previously identified classes within the data set. For this series, the most frequent transitions do not exhibit any sign of persistent characteristics. Another interesting observation is that outliers tend to be followed by C21. Transitions C1 > C5 and C6 > C13 show that a moderate increase is followed by a stronger increase the following week, whereas the transitions from C5 and C19 to C10 show that a strong increase is followed by a decrease. Weeks whose overall change is fairly constant are followed either by weeks with similar characteristics or by increasing ones.3

3 Change refers to the overall weekly change, using the following signs: : strong increase, ↑: increase, ∼: constant, ↓: decrease, : strong decrease.


Price (units)

4000 6000

9800 9600 9400 8495

Figure 7.4. Rules recognition in the series.

The data mining algorithm, with threshold values of support = 3 and confidence = 70%, returned only a single rule out of the 9261 possible combinations of the selected classes. The trading pattern formed from the association rule shows that if the antecedent part is a decrease followed by a strong increase, then the consequent is a decrease.

Rule: C(j-2) = 10 & C(j-1) = 9 > C(j) = 10. Support: 3. Confidence: 75%.

B. 4-day Period

The developed methodology was applied to the 4-day periods of successive data, ignoring non-working days and the day-of-the-week effect. In total 1516 non-overlapping periods were considered, and pre-processing similar to before was applied. Finally, 21 classes were identified, and 26 values corresponding to outliers were removed from further analysis. The most frequent trading pattern is C21, with 6.27%. It can be interpreted as varying behaviour, with a 0.44% increase on the second day and a -0.20% change on the fourth day, for a total increase of 0.32%. Of the estimated classes, 11 correspond to an increase and 10 to a decrease. Table 7.8 displays the most frequent transitions, above a threshold of twice the equal-probability value, which occurred in the data set. Transition C7 > C5 shows relatively smooth behaviour with few fluctuations, whereas C10 > C2 and C19 > C6 are successions of decreasing behaviour, the latter being more severe. Transition C12 > C4 is a mild increase followed by a strong decrease, and C13 > C8, C14 > C8, C17 > C13 and C18 > C21 show successions of increasing periods, the former being the strongest. The application of the data mining algorithm to the 4-day period data set revealed the presence of three rules (Fig. 7.5) that exceed the threshold values of minimum confidence (70%) and support (3). Rule 1 represents a negative trend in the data, although appearing to stabilise at the consequent stage. Rule 2 is interpreted as a starting period of low variation followed by increasing behaviour; this rule exhibits a confidence of 100%, which means that all four times its antecedent part appeared in the data set, it was followed by days classified under C8. Rule 3 depicts a succession of increasing and decreasing characteristics that ends in fairly constant behaviour. Figure 7.6 presents a demonstration

Table 7.5. Weekly trading patterns

Class   Mon.    Tue.    Wed.    Thu.    Fri.    % App.
1       0.01    0.5    -0.23    0.61   -0.46    2.8
2      -0.75   -0.19    0.12   -0.04   -0.52    4.86
3       1.05    0.39   -0.03    0.29    0       4.29
4       0.31   -0.03    0.9    -0.1    -0.41    3.13
5       0.07    0.21    0.66    0.69   -0.54    3.38
6       0.52   -0.12    0.02   -0.15    0.03    4.78
7       0.01   -0.01   -0.31    0.49    0.25    4.7
8      -0.53    0.52    0.57    0.3     0.14    2.97
9       0.1     0.37    0.33    0.04    0       4.45
10     -0.05   -0.1     0.28   -0.44    0.03    6.51
11     -0.55    0.1    -0.07    0.03    0.38    5.19
12      0.97   -0.51   -0.11    0.15   -0.27    3.05
13      0.27   -0.2     0.18    0.27    0.51    5.61
14     -0.38    1.08   -0.15    0.12   -0.44    3.71
15     -0.03    0.1    -0.08   -0.26   -0.66    7.25
16      0.11    0.9     0      -0.35    0.4     4.62
17      0.04   -1.05    0.42   -0.16   -0.35    5.28
18     -0.11   -0.51    0.02    0.24   -0.28    5.36
19     -0.09    0.2     0.11    0.3     1.31    3.79
20     -0.21    0.12   -0.17    0.14   -0.08    5.36
21      0.08   -0.32   -0.53   -0.24    0.48    6.76

Table 7.6. Frequent transitions

Trans.    Change    %        Trans.      Change    %
1 > 5     ↑>       14.70     12 > 20     ∼>∼      13.51
5 > 10    >↓       12.19     19 > 10     >↓       13.04
6 > 13    ↑>       15.51     20 > 9      ∼>↑      12.30
11 > 5    ∼>       12.69     Out > 21             15.38

Rule 1 Rule 2 Rule 3

4 Change

C(j-2) 1 10 14

C(j-1) 20 14 2

C(j) 7 8 20

Support 3 4 3

Confidence 75% 100% 75%

refers to the overall weekly change using the following signs: :strong increase, ↑:increase, ∼:constant, ↓:decrease :strong decrease.

A Methodology for the Identification of Trading Patterns

133

Table 7.7. 4-day period trading patterns

Class   Day 1   Day 2   Day 3   Day 4   % App.
1      -0.2     0.12   -0.89    0.25    4.22
2       0.24   -0.21   -0.46   -0.03    5.8
3       0.16    0.05   -0.37   -0.85    5.28
4      -0.43   -0.58   -0.36    0.27    5.67
5      -0.12    0.02   -0.14    0.28    5.34
6       0.25   -1.23    0.41   -0.23    3.96
7       0.19   -0.36    0.18    0.1     6.07
8       0.52    0.13    0.2    -0.03    5.87
9       0.35    0.48   -0.6     0.09    3.96
10     -0.12   -0.11    0.14   -0.27    5.61
11     -0.03   -0.1     0.47   -0.86    2.97
12     -0.47    0.02    0.8     0.09    3.89
13     -0.17    0.26    0.03    1.23    3.63
14      0.06    0.12    0.59    0.54    3.43
15      0.36    0.02    0.89   -0.18    4.09
16     -0.38    0.25   -0.06   -0.04    4.75
17      0.28    0.91    0.1     0.4     3.89
18      0.59   -0.06    0.11    0.77    3.76
19     -0.83    0.39    0.18   -0.87    3.3
20     -0.81    0       0.02    0.24    5.8
21      0.06    0.44    0.02   -0.2     6.27

Table 7.8. Frequent transitions

Trans.    Change    %        Trans.      Change    %
7 > 5     ∼>∼      13.04     14 > 8      >↑       13.46
10 > 2    ↓>↓      11.76     17 > 13     >        13.55
12 > 4    ↑>       13.55     18 > 21     >↑       12.28
13 > 8    >↑       12.72     19 > 6      >↓       12.00

134

Athanasios Sfetsos and Costas Siriopoulos 2.5 2.0 Rule 3

1.5 % Chnage

1.0 0.5 0

Rule 2

-0.5 -1.0 Rule 1 -1.5 -2.0

0

2

4

6 Days

8

10

12

Figure 7.5. Rules of 4-day period.

3225

3230 3235 Series Index

3240

3245

3250

3255

6610

6615

6620

6625 6630 Series Index

6635

6640

6645

6650

1.65 6880

6885

6890

6895

6900 6905 Series Index

6910

6915

6920

6925

7190

7195

7200

7205 7210 Series Index

7215

7220

7225

7230

1.65 7185

1.92 1.9 1.88 1.86 1.84 2860

Price (units)

Price (units)

3220

1.5 6605

Price (units)

3215

1.55

Price (units)

1.55 3210

Price (units)

Rule 3

Price (units)

Price (units)

Rule 2 1.6

2865

2870

2875

2880

2885

2890

2895

2900

2905

4240 1.58 1.56 1.54 1.52 1.5 1.48

4245

4250

4255

4260

4265

4270

4275

4280

4285

5825

5830

5835

5840

5845 5850 Series Index

5855

5860

5865

5870

1.68 1.66 1.64 1.62

Figure 7.6. Rules identification in the series.

The application of the methodology was presented for two major financial series, the daily closing values of the Dow Jones Index and the GB Pound to US Dollar exchange rate, over a period of more than 20 years. Two different intervals were selected: one a well-defined weekly interval, the other a 4-day period of the available data ignoring non-trading days. The analysis revealed a number of typical behaviours for each examined case of the series, which differ between them, following the requirements for the successful application of the clustering algorithm. The identified association rules for each case had a high probability of reoccurrence, in excess of 70%. Their small number is related


to the problem formulation, and more specifically the selection of constant length periods. The selection of a specific interval may not capture all the trading characteristics as these may span into longer intervals.

References

[1] Allen, F., and R. Karjalainen, “Using genetic algorithms to find technical trading rules,” Journal of Financial Economics, 51: 245-271, 1999.
[2] Bessembinder, H., and K. Chan, “Market efficiency and the returns to technical analysis,” Financial Management, 27: 5-17, 1998.
[3] Brock, W., Lakonishok, J. and B. Le Baron, “Simple technical trading rules and the stochastic properties of stock returns,” Journal of Finance, 47(5): 1731-1764, 1992.
[4] Bulkowski, T.N., Encyclopaedia of chart patterns, John Wiley and Sons, 2000.
[5] Chan, L., and J. Lakonishok, “Momentum strategies,” Journal of Finance, 51: 1681-1713, 1996.
[6] Das, G., Lin, K.I., Mannila, H., Renganathan, G., and P. Smyth, “Rule discovery from time series,” Proc. 4th Int. Conf. on Knowledge Discovery & Data Mining, 16-22, 1998.
[7] Fang, Y., and D. Xu, “The predictability of asset returns: an approach combining technical analysis and time series forecasts,” International Journal of Forecasting, 19(3): 369-385, 2003.
[8] Freitas, A.A., Data Mining and Knowledge Discovery with Evolutionary Algorithms, Berlin: Springer-Verlag, 2002.
[9] Grubbs, F.E., “Procedures for Detecting Outlying Observations in Samples,” Technometrics, 11(1): 1-21, 1969.
[10] Jain, A.K., Murty, M.N., and P.J. Flynn, “Data Clustering: A Review,” ACM Computing Surveys, 31(3): 264-323, 1999.
[11] Keogh, E.J., Lonardi, S. and B. Chiu, “Finding surprising patterns in a time series database in linear time and space,” in Proc. KDD, 550-556, 2002.
[12] Kim, D.J., Park, Y.W., and D.J. Park, “A novel validity index for determination of the optimum number of clusters,” IEICE Trans. Inf. & Syst., E84-D(2): 281-285, 2001.
[13] Last, M., Klein, Y., and A. Kandel, “Knowledge Discovery in Time Series Databases,” IEEE Trans. Systems, Man and Cybernetics, Part B, 31(1): 160-169, 2001.
[14] Leigh, W., Paz, M. and R. Purvis, “An analysis of a hybrid neural network and pattern recognition technique for predicting short-term increases in the NYSE composite index,” Omega, 30: 69-76, 2002.


[15] Levich, R., and L. Thomas, “The significance of technical trading-rule profits in the foreign exchange market: A bootstrap approach,” Journal of International Money and Finance, 12: 451-474, 1993.
[16] Lo, A.W., H. Mamaysky and J. Wang, “Foundations of technical analysis: Computational algorithms, statistical inference, and empirical implementation,” Journal of Finance, 55(4): 1705-1765, 2000.
[17] Lu, H., Han, J., and L. Feng, “Stock Movement Prediction and N-Dimensional Inter-Transaction Association Rules,” Proc. of 1998 SIGMOD, 12:1-12:7.
[18] Mannila, H., Toivonen, H. and A.I. Verkamo, “Discovery of frequent episodes in event sequences,” Data Mining and Knowledge Discovery, 1(3): 259-289, 1997.
[19] Murphy, J., Technical analysis of futures markets, New York Institute of Finance, Prentice-Hall, 1986.
[20] Povinelli, R., and X. Feng, “A New Temporal Pattern Identification Method for Characterisation and Prediction of Complex Time Series Events,” IEEE Transactions on Knowledge and Data Engineering, 15(2): 339-352, 2003.
[21] Sarker, B.K., Mori, T., Hirata, T., and K. Uehara, “Parallel algorithms for mining association rules in time series data,” Lecture Notes in Computer Science, 2745: 273-284, 2003.
[22] Weiss, S.M., and N. Indurkhya, Predictive Data Mining: A Practical Guide, Morgan Kaufmann, 1998.
[23] Zeng, Z., Yan, H., and A.M.N. Fu, “Time-series prediction based on pattern classification,” Artificial Intelligence in Engineering, 15(1): 61-69, 2001.
[24] Agrawal, R., and R. Srikant, “Mining sequential patterns,” in Proc. ICDE, 3-14, 1995.
[25] Weiss, G.M., and H. Hirsh, “Learning to predict rare events in event sequences,” in 4th Int. Conf. Knowledge Discovery and Data Mining (KDD'98), 359-363, 1998.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 137-165

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 8

TECHNICAL RULES BASED ON NEAREST-NEIGHBOUR PREDICTIONS OPTIMISED BY GENETIC ALGORITHMS: EVIDENCE FROM THE MADRID STOCK MARKET

Christian González-Martel†, Fernando Fernández-Rodríguez† and Simon Sosvilla-Rivero‡
† Universidad de Las Palmas de Gran Canaria, Spain
‡ Universidad Complutense de Madrid, Spain

8.1. Introduction

A considerable amount of work has provided support for the view that simple technical trading rules (TTRs) are capable of producing valuable economic signals [see 7, 2, 33, 17, among others]. However, the majority of these studies have ignored the issue of parameter optimisation, leaving them open to the criticism of data-snooping and the possibility of a survivorship bias [see 31, 8, respectively]. To avoid this criticism, a more objective and valid approach consists in choosing TTRs based on an optimisation procedure utilising in-sample data and testing the performance of these rules out-of-sample. In this sense, a genetic algorithm is an appropriate method for discovering TTRs, as shown in [1]. The aim of this paper is to investigate the profitability of the (non-linear) predictions from NN forecasting methods by transforming them into a simple trading strategy, whose profitability is evaluated against a risk-adjusted buy-and-hold strategy. Unlike previous empirical evidence, when evaluating trading performance we will consider transaction costs, as well as a wider set of profitability indicators than those usually examined. We have applied this investment strategy to the General Index of the Madrid Stock Exchange (IGBM), using data covering the period 2 January 1977 to 31 December 2002 (5665 observations). The paper is organised as follows. Section 2 briefly presents the KNN predictors, while in Section 3 we show how the local predictions are transformed into a simple trading strategy and how we assess the economic significance of predictable patterns in the stock market. The empirical results are shown in Section 4. Finally, Section 5 provides some concluding remarks.


8.2. KNN Predictions

The KNN method works by selecting geometric segments in the past of the time series similar to the last segment available before the observation we want to forecast [see 15, 16]. This approach is philosophically very different from the Box-Jenkins methodology. In contrast to Box-Jenkins models, where extrapolation of past values into the immediate future is based on correlation among lagged observations and error terms, NN methods select relevant prior observations based on their levels and geometric trajectories, not their location in time. The KNN forecast can be succinctly described as follows [see 17, for a more detailed account]:

1. We first transform the scalar series $x_t$ ($t = 1, \ldots, T$) into a series of m-dimensional vectors $x_t^m$, $t = m, \ldots, T$:
$$x_t^m = (x_t, x_{t-1}, \ldots, x_{t-m+1}), \qquad (8.1)$$
with m referred to as the embedding dimension. These m-dimensional vectors are often called m-histories.

2. Secondly, we select the k m-histories
$$x_{i_1}^m, x_{i_2}^m, x_{i_3}^m, \ldots, x_{i_k}^m \qquad (8.2)$$
that are most similar to the last available vector
$$x_T^m = (x_T, x_{T-1}, x_{T-2}, \ldots, x_{T-m+1}), \qquad (8.3)$$
where $k = \mathrm{int}(\lambda T)$ ($0 < \lambda < 1$).
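To make steps (8.1)-(8.3) concrete, the following minimal Python sketch builds the m-histories, selects the k = int(λT) nearest ones by Euclidean distance, and combines the values that followed them. The combination rule (a plain average) and the synthetic random-walk input are illustrative placeholders only: the chapter's actual forecasting step and the genetic-algorithm optimisation of the parameters are described in pages not reproduced in this excerpt.

```python
import numpy as np

def knn_forecast(x, m=3, lam=0.05):
    """One-step-ahead nearest-neighbour forecast of a scalar series x.

    m   : embedding dimension (length of the m-histories), eq. (8.1)
    lam : fraction of the sample used as neighbours, k = int(lam * T)
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    k = max(1, int(lam * T))
    # m-histories x_t^m = (x_t, x_{t-1}, ..., x_{t-m+1}); one row per t.
    hist = np.array([x[t - m + 1:t + 1] for t in range(m - 1, T)])
    last = hist[-1]                 # the last available vector x_T^m, eq. (8.3)
    candidates = hist[:-1]          # earlier m-histories with an observed successor
    dists = np.linalg.norm(candidates - last, axis=1)
    nearest = np.argsort(dists)[:k]            # the k most similar, eq. (8.2)
    successors = x[nearest + m]                # the value observed after each of them
    return successors.mean()        # illustrative local rule: plain average

rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(500))   # synthetic stand-in for the IGBM
print("one-step forecast:", round(knn_forecast(series), 3))
```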

[…]

Chapter 9

Modern Analysis of Fluctuations in Financial Time Series and Beyond

Zbigniew R. Struzik

[…] s > 0 for the continuous version (CWT). The 3D plot in figure 9.1 shows how the wavelet transform reveals more and more detail while going towards smaller scales, i.e. towards smaller log(s) values. The wavelet transform is sometimes referred to as the 'mathematical microscope' [4], due to its ability to focus on weak transients and singularities in the time series. The wavelet used determines the optics of the microscope; its magnification varies with the scale factor s.

Whether we want to use a continuous or discrete WT, see figure 9.2, is largely a matter of application. For coding purposes, one wants to use the smallest number of coefficients, which can be compressed by thresholding low values or using correlation properties. For this purpose a discrete (for example a dyadic) scheme of sampling the scale s, position b space is convenient. Such sampling often spans an orthogonal wavelet base. For analysis purposes, one is not so much concerned with numerical or transmission efficiency or representation compactness, but rather with the accuracy and adaptive properties of the analysing tool. Therefore, in analysis tasks, continuous wavelet decomposition is mostly used. The space of scale s and position b is then sampled semi-continuously, using the finest data resolution available.


Figure 9.1. Continuous Wavelet Transform representation of the random walk (Brownian process) time series. The wavelet used is the Mexican hat - the second derivative of the Gaussian kernel. The coordinate axes are: position x, scale in logarithm log(s), and the value of the transform W (s, x).

For decomposition, a simple base function is used. The wavelet ψ, see Eq. 9.1, took its name from its wave-like shape. It has to cross the zero-value line at least once, since its mean value must be zero. The criterion of zero mean is referred to as the admissibility of the wavelet, and is related to the fact that one wants to have the possibility of reconstructing the original function from its wavelet decomposition. This condition can be proven formally, but let us give a quick, intuitive argument. We have seen that wavelets work at smaller and smaller scales, covering higher and higher frequency bands of the signal being decomposed. This is the so-called band-pass filtering of the signal. Only a certain band of frequencies (level of detail) is captured by the wavelets working at one scale. Of course, at another scale a different set of details (band of frequencies) is captured. But other frequencies (in particular zero frequency) are not taken into the coefficients. This is the idea of decomposition. Kernels like the Gaussian smoothing kernel are low-pass filters, which means they evaluate the entire set of frequencies up to the current resolution. This is the idea of approximation at various resolutions. The reader may rightly guess here that it is possible to get band-pass information (wavelet coefficients) by subtracting two low-pass approximations at various levels of resolution. This is, in fact, the so-called multi-resolution scheme of decomposition into WT components. But back to admissibility - reconstruction from multiple resolution approximations would not be possible, since the same low-frequency detail would be described in


Figure 9.2. Continuous sampling of the parameter space (left) versus discrete (dyadic) sampling (right).

several coefficients of the low-pass decomposition. This, of course, is not the case for wavelets; they select only a narrow band of detail with very little overlap (in the orthogonal case, no overlap at all!). In particular, if one requires that the wavelet is zero for frequency zero, i.e. it fully blocks zero-frequency components, this corresponds with the zero mean admissibility criterion.

9.3. The Wavelet ψ

The only admissibility requirement for the wavelet ψ is that it has zero mean - it is a wave function, hence the name wavelet:
$$\int_{-\infty}^{\infty} \psi(x)\, dx = 0 . \qquad (9.2)$$
However, in practice, wavelets are often constructed with orthogonality to polynomials up to some degree n:
$$\int_{-\infty}^{\infty} x^n\, \psi(x)\, dx = 0 . \qquad (9.3)$$

This property of the wavelets - orthogonality to polynomials of degree n - has a very fine application in signal analysis. It is referred to as the number of vanishing moments. If the wavelet is orthogonal to polynomials of a degree up to and including n, we say that it has m = n + 1 vanishing moments. So one vanishing moment is good enough to filter away constants - polynomials of zero degree $P_0$. This can be done, for example, by the first derivative of the Gaussian kernel plotted in figure 9.3. Similarly, the second derivative of the same Gaussian kernel, which is often used upside down and then appropriately called the Mexican hat wavelet, has two vanishing moments and, in addition to constants, can also


Figure 9.3. Left: the smoothing function: the Gaussian. Centre: wavelet with one vanishing moment, the first derivative of the Gaussian. Right: wavelet with two vanishing moments, the second derivative of the Gaussian.

filter linear trends $P_1$. Of course, if the wavelet has m vanishing moments, it can filter polynomials of degree m − 1, m − 2, ..., 0.
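A quick numerical check of these statements (an illustration of ours, not part of the original text) is easy to set up: integrate $x^n \psi(x)$ for the two wavelets of figure 9.3 on a grid. The grid width and the unnormalised wavelet amplitudes are arbitrary choices; discretisation leaves the 'vanishing' integrals only approximately zero.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 4001)
g1 = -x * np.exp(-x**2 / 2)              # first derivative of the Gaussian
mexhat = (1 - x**2) * np.exp(-x**2 / 2)  # Mexican hat (second derivative, upside down)

for name, psi in [("Gaussian derivative", g1), ("Mexican hat", mexhat)]:
    # moments against the polynomials x^0, x^1, x^2
    moments = [np.trapz(x**n * psi, x) for n in range(3)]
    print(name, ["%+.2e" % m for m in moments])
# The first non-vanishing moment is n = 1 for the Gaussian derivative (one
# vanishing moment: constants are filtered) and n = 2 for the Mexican hat
# (two vanishing moments: constants and linear trends are filtered).
```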

9.4. The Hölder Exponent

The use of vanishing moments becomes apparent when we consider local approximations to the function describing our time series. Suppose we can locally approximate the function with some polynomial $P_n$, but the approximation fails for $P_{n+1}$. One can think of this kind of approximation as a Taylor series decomposition. In fact, the arguments to be given hold even if such a Taylor series decomposition does not exist, but it can serve as an illustration. For the sake of illustration, let us assume that the function f can be characterised by the Hölder exponent $h(x_0)$ in $x_0$, and f can be locally described as:
$$f(x)_{x_0} = c_0 + c_1 (x - x_0) + \cdots + c_n (x - x_0)^n + C|x - x_0|^{h(x_0)} = P_n(x - x_0) + C|x - x_0|^{h(x_0)} .$$
The exponent $h(x_0)$ is what 'remains' after approximating with $P_n$ and what does not yet 'fit' into an approximation with $P_{n+1}$. More formally, our function or time series f(x) is locally described by the polynomial component $P_n$ and the so-called Hölder exponent $h(x_0)$:
$$|f(x) - P_n(x - x_0)| \le C|x - x_0|^{h(x_0)} . \qquad (9.4)$$

It is traditionally considered to be important in economics to capture trend behaviour Pn . It is, however, widely recognised in other fields that it is not necessarily the regular polynomial background but quite often the transient singular behaviour which can carry important information about the phenomena / the underlying system ‘producing’ the time series. One of the main reasons for the focus on the regular component was that until the advent of multi-scale techniques (like WT) capable of locally assessing the singular behaviour, it was practically impossible to analyse singular behaviour. This is because the weak transient exponents h are usually completely masked by the much stronger Pn . However, wavelets provide a remedy in this case! The reader has perhaps already noted the link with the vanishing moments of the wavelets. Indeed, if the number of the vanishing


moments is at least as high as the degree of $P_n$, the wavelet coefficients will capture the local scaling behaviour of the time series as described by $h(x_0)$. In fact, the phrase 'filtering' with reference to the polynomial bias is not entirely correct. The actual filtering happens only for wavelets whose support is fully incorporated in the biased interval. If the wavelet is at the edge of such an interval (where bias begins), or if the current resolution of the wavelet is simply too large with respect to the biased interval, the wavelet coefficients will capture the information pertinent to the bias. This is understandable, since information does not get 'lost' or 'gained' in the process of WT decomposition. The entire decomposed function can be reconstructed from the wavelet coefficients, including the trends within the function. What wavelets provide in a unique way is the possibility to tame and manage trends in a local fashion, through localised wavelet components.

Above, we have suggested that the function can locally be described with Eq. 9.4. Its wavelet transform $W^{(n)} f$ with a wavelet with at least n vanishing moments now becomes:
$$W^{(n)} f(s, x_0) = \frac{1}{s} \int C|x - x_0|^{h(x_0)}\, \psi\!\left(\frac{x - x_0}{s}\right) dx = C|s|^{h(x_0)} \int |x'|^{h(x_0)}\, \psi(x')\, dx' .$$
Therefore, we have the following power law proportionality for the wavelet transform of the (Hölder) singularity of $f(x_0)$:
$$W^{(n)} f(s, x_0) \sim |s|^{h(x_0)} .$$
From the functional form of the equation, one can immediately attempt to extract the value of the local Hölder exponent from the scaling of the wavelet transform coefficients in the vicinity of the singular point $x_0$. This is indeed possible for singularities which are isolated or effectively isolated, that is, which can be seen as isolated at the current resolution of the analysing wavelet. A common approach to trace such singularities and to reveal the scaling of the corresponding wavelet coefficients is to follow the so-called maxima lines of the CWT converging towards the analysed singularity. This approach was first suggested by Mallat et al [11] and later used and further developed among others in Refs [12, 4, 13].

In figure 9.4, we plot the input time series, which is a part of the S&P index containing the crash of '87. In the same figure, we plot the corresponding maxima derived from the CWT decomposition with the Mexican hat wavelet. The maxima corresponding to the crash stand out both in the top view (they are the longest ones) and in the side log-log projection of all maxima (they have a value and slope different from the remaining bulk of maxima). The only maxima higher in value are the end-of-sample finite size effect maxima. These observations indicate that the crash of '87 can be viewed as an isolated singularity in the analysed record of the S&P index for practically the entire wavelet range used. This is, however, (luckily) an unusual event, and in general in time series we have densely packed singularities which cannot be seen as isolated cases for a wider range of wavelet scales. The related Hölder exponent can then be measured either by selecting smaller scales or by using some other approach. A possibility we would like to suggest is using the multifractal paradigm in order to estimate what we call the effective Hölder exponent. The detailed discussion of this approach can be found in [1, 2], but let us quickly


Figure 9.4. Left: the input time series with the WT maxima above it in the same figure. The strongest maxima correspond to the crash of ’87. The input time series is de-biased and L1 normalised. Right: we show the same crash related maxima highlighted in the projection showing the logarithmic scaling of all the maxima.

point out that the effective H¨older exponent captures local deviations from the mean scaling exponent of the decomposition coefficients related to the singularity in question. This approach has been quite successful in evaluating histograms of the scaling exponents, singularity spectra and collective properties of the local H¨older exponent. Dense singularities can be seen as evolving from a multiplicative cascading process which takes place across scales. The CWT was successfully used in revealing such a process and in recovering its characteristics. The canonical global methodology applied for this purpose has been introduced by Arneodo et al. [4] We describe it briefly in the following section in the context of outlier/crash detection and analysis.
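As a numerical aside (our own, not the chapter's), the power-law proportionality $W^{(n)}f(s, x_0) \sim |s|^{h(x_0)}$ can be verified on an isolated singularity with a brute-force CWT. The Mexican hat kernel, the L1 (1/s) normalisation, the scale grid and the cusp test signal $|t - x_0|^{0.5}$ are all illustrative choices.

```python
import numpy as np

def cwt(signal, s, psi):
    """CWT at scale s with 1/s (L1) normalisation, by direct convolution."""
    u = np.arange(-4 * int(s), 4 * int(s) + 1)
    return np.convolve(signal, psi(u / s) / s, mode="same")

mexhat = lambda u: (1 - u**2) * np.exp(-u**2 / 2)

N, x0, h_true = 2048, 1024, 0.5
t = np.arange(N)
f = np.abs(t - x0) ** h_true                 # isolated cusp singularity at x0

scales = np.geomspace(4, 64, 10)
coeffs = [abs(cwt(f, s, mexhat)[x0]) for s in scales]
slope = np.polyfit(np.log(scales), np.log(coeffs), 1)[0]
print(f"log-log slope at x0: {slope:.2f}  (imposed Hölder exponent {h_true})")
```

The fitted slope approaches the imposed exponent because the Mexican hat's two vanishing moments exceed the cusp's regularity, exactly the condition discussed above.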

9.5. Multifractal Formalism on the WTMM Tree

The WTMM tree lends itself very well to defining the partition-function-based multifractal formalism (MF) [4]. The MF takes the moments q of the measure distributed on the WTMM tree to obtain the dependence of the scaling function τ(q) on the moments q:
$$Z(s, q) \sim s^{\tau(q)} .$$
Here $Z(s, q)$ is the partition function of the q-th moment of the measure distributed over the wavelet transform maxima at the scale s considered:
$$Z(s, q) = \sum_{\Omega(s)} \left( W f \omega_i(s) \right)^q , \qquad (9.5)$$

where $\Omega(s) = \{\omega_i(s)\}$ is the set of all maxima $\omega_i(s)$ at the scale s, satisfying the constraint on their local logarithmic derivative in scale [14]. (The local slope bound used throughout this paper is $|\breve{h}| \le 2$.)
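A rough numerical sketch of Eq. 9.5 follows. For simplicity we take, at each scale, the local maxima of |W f| that lie away from the borders, rather than chaining them into proper maxima lines and imposing the slope bound of Ref [14]; this crude proxy, the Mexican hat CWT and the Brownian test signal are our own illustrative choices, so the fitted τ(q) only roughly tracks the monofractal reference qh − 1 with h = 0.5.

```python
import numpy as np

mexhat = lambda u: (1 - u**2) * np.exp(-u**2 / 2)

def cwt(signal, s):
    u = np.arange(-4 * int(s), 4 * int(s) + 1)
    return np.convolve(signal, mexhat(u / s) / s, mode="same")

def wtmm_values(w, s):
    """Values of |W f| at interior local maxima (crude stand-in for the WTMM tree)."""
    v = np.abs(w)
    peaks = np.where((v[1:-1] > v[:-2]) & (v[1:-1] > v[2:]))[0] + 1
    lo, hi = 4 * int(s), len(w) - 4 * int(s)       # discard border-affected maxima
    return v[peaks[(peaks >= lo) & (peaks < hi)]]

rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(4096))        # Brownian walk, h = 0.5
scales = np.geomspace(8, 64, 8)
for q in (1.0, 2.0, 3.0):
    logZ = [np.log(np.sum(wtmm_values(cwt(walk, s), s) ** q)) for s in scales]
    tau_q = np.polyfit(np.log(scales), logZ, 1)[0]  # Z(s,q) ~ s^tau(q), eq. (9.5)
    print(f"tau({q:.0f}) = {tau_q:+.2f}   (monofractal reference {q * 0.5 - 1:+.2f})")
```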


Since the moment q has the ability to select a desired range of values: small for q < 0, or large for q > 0, the scaling function τ(q) globally captures the distribution of the exponents h(x) - weak exponents are addressed with large negative q, while strong exponents are suppressed and effectively filtered out. For large positive q, the opposite takes place (strong exponents are addressed while weak exponents are effectively filtered out). This dependence may be linear, indicating that there is only one class of singular structures and related exponents, or it may have a slope changing non-linearly with q. In the latter case, the local tangent slope to τ(q) at q∗ gives the corresponding exponent h(q∗); the tangent line can be written τ(q) = h(q∗)q + C, and by Eq. 9.6 its ordinate-axis intercept satisfies C = −D(h(q∗)), so the related dimension can be read off the ordinate axis. The set of dimensions D(h(q∗)) for each value of h selected with q∗ is the so-called spectrum of the singularities D(h) of the fractal signal. Formally, the transformation from τ(q) to D(h) is referred to as the Legendre transformation:
$$h(q) = \frac{d\tau(q)}{dq} , \qquad D(h(q)) = q\, h(q) - \tau(q) . \qquad (9.6)$$

Figure 9.5. Left: ‘First Dirty’. Right: ‘First Clean’. The only difference between the two time series is a number of erroneous spikes which in ‘First Clean’ were localised using external information and were removed by hand. The plot of ‘First Clean’ still shows spikes which belong to the process investigated. The task of the methodology to be described is to detect the presence of erroneous spikes in time series and provide the means of localising them. The renormalised plot of ‘First Clean’ reveals its complex form. Spiky events are still present in the time series, but these belong to the process.

From this transformation, we can directly obtain expressions for the average h(q) and D(h(q)) in terms of the partition function over maxima values. From 9.6 and 9.5 we have:
$$h(q) = \frac{d\tau(q)}{dq} = \lim_{s\to 0} \frac{d}{dq} \frac{\log Z(s, q)}{\log s} = \lim_{s\to 0} \frac{1}{\log s} \sum_{\Omega(s)} P(s, q, \omega_i(s)) \log\left( W f \omega_i(s) \right) \qquad (9.7)$$

where
$$P(s, q, \omega_i(s)) = \frac{\left( W f \omega_i(s) \right)^q}{\sum_{\Omega(s)} \left( W f \omega_i(s) \right)^q} \qquad (9.8)$$
is the weighting measure for the statistical ensemble Ω(s) [15]. Similarly, we obtain the expression for D(h(q)):
$$D(h(q)) = \lim_{s\to 0} \frac{1}{\log s} \sum_{\Omega(s)} P(s, q, \omega_i(s)) \log\left( P(s, q, \omega_i(s)) \right) . \qquad (9.9)$$
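Eqs. (9.7)-(9.9) translate directly into code. The sketch below is ours, with the s → 0 limits replaced by finite-scale linear fits; it takes as input the maxima values per scale (for instance those produced by the previous sketch) and is here fed a synthetic monofractal set for which the expected answers are h(q) ≈ 0.5 and D(q) ≈ 1.

```python
import numpy as np

def h_and_D(maxima_by_scale, scales, q):
    """Finite-scale estimates of h(q) and D(q) from eqs. (9.7)-(9.9).

    maxima_by_scale : one 1-D array of WTMM values |W f omega_i(s)| per scale.
    """
    num_h, num_D = [], []
    for w in maxima_by_scale:
        wq = w ** q
        P = wq / wq.sum()                        # weighting measure, eq. (9.8)
        num_h.append(np.sum(P * np.log(w)))      # summand of eq. (9.7)
        num_D.append(np.sum(P * np.log(P)))      # summand of eq. (9.9)
    logs = np.log(scales)
    # the s -> 0 limits become slopes of the numerators against log s
    return np.polyfit(logs, num_h, 1)[0], np.polyfit(logs, num_D, 1)[0]

rng = np.random.default_rng(2)
scales = np.geomspace(8, 128, 8)
# synthetic monofractal maxima: values ~ s^0.5, count ~ N / s
fake = [s**0.5 * np.exp(0.1 * rng.standard_normal(int(4096 / s))) for s in scales]
hq, Dq = h_and_D(fake, scales, q=2.0)
print(f"h(2) = {hq:.2f} (expected 0.5),  D(2) = {Dq:.2f} (expected 1.0)")
```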

Usually this pair is used to obtain the D(h(q)) spectrum in a parametric form (q is a parameter here). We will, however, also show D(q) and h(q) separately, since the explicit dependence on q is beneficial for the purpose of illustration.

In figure 9.6, we show D(h(q)) evaluated for the 'Dirty' time series and for its cleaned version, where the outlier spikes were removed using external information (both cases shown in figure 9.5). The difference in the spectral information is striking and well reflects the high sensitivity of the partition function method to outliers. The spectrum for the clean version is narrow and focused around the mean value of singularity strength hmean = 0.4. For the dirty version, we have a very broad spectrum which gradually falls off to zero dimension values for decreasing h < hmean. This fall-off regime corresponds with positive q values, which have the ability to select exponents of a relatively lower value than hmean. Since (in this example) we are operating in the h = 0.4 range, capturing spikes for which h = −1 (in an isolated situation) is relatively easy.

Let us immediately remark that economic time series generally have a wide range of h among their characteristics. Such processes are generically called multifractal, as a fractal dimension D(h) is associated with each h. Hence, if there is a multitude of meaningful values D(h∗) (associated with some h∗) constituting the spectrum of the process, the process can be of multifractal type. The difference between such processes and the processes with outliers is in the relative values of D(q) and the spacing between the successive q's. For the process to have a meaningful multifractal spectrum, we will require dense coverage of h(q) values. Also, the D(q) values should be relatively large for meaningful spectra. Dense support on a line corresponds with dimension 1. Single points, that is, separated point-wise events, on the other hand have support 0. Therefore, if D(q) is near 0, this indicates very weakly supported events and therefore a high probability of an outlier. In view of the scale invariance pertinent to true multifractal, outlier-free processes, large crashes should not be different from small crashes at a smaller scale (higher resolution). The crashes, large or small, should, in view of the scale-invariant paradigm, be equally well characterised and predictable, and should follow the same mechanism. For the strongest crashes observed, obviously due to their economic impact, there is great interest and an ongoing debate about whether they can be classified as outliers or whether they actually belong to the dynamics of the economic system [16, 17, 23, 18, 19]. In the case of the crash of '87, there are indications that it resulted from the past history of the development of the index [18], in particular as it lacked any evident external reason for occurring.

In figure 9.7, we check D(q) and h(q) separately in order to analyse the dependence on q. The test data in figure 9.7 is contaminated with Dirac delta type outliers. For comparison, we also analyse an outlier-free time series. h(q), evaluated from Eq. 9.7, shows a strong


Figure 9.6. Left: D(h(q)) evaluated for the 'First Dirty' time series from figure 9.5. Right: the same for the 'Clean' version of the same time series. (The outlier spikes were removed using external information.) A clear difference in the D spectrum is visible.

crossover, and for q > 1 it quickly falls away from the average hmean value. Also in the case of D(q) evaluated from Eq. 9.9 for the contaminated time series, a clear difference in behaviour is visible: D(q) quickly approaches 0 for q larger than 1. The conclusion that we can derive from such test results is that comparing the values of both h(q) and D(q) for positive q's may be useful for detecting the presence of spikes in the time series. In particular, the second moment q = 2 seems to be suitable for use as a criterion in comparison with the reference q = 0 moment. The rule of thumb which we suggest is that if the value of h(q = 2) differs from h(q = 0) by some 0.5, and if D(q = 2) is about 0.5 or less, the probability of spike presence is relatively high. For the actual localisation of spikes, we need the local value of h(x) instead of the global average as defined in Eq. 9.7. Such a local h(x) will make it possible to separate outliers from the residue, using some threshold value for h. This will be discussed further in the next section.

9.6. Estimation of the Local, Effective Hölder Exponent Using the Multiplicative Cascade Model

Note that even though the partition function method (discussed thus far) uses the maxima tree containing full local information about the singularities, this information is lost at the very moment the partition function is computed. Therefore, there is no explicit local information present in the scaling estimates τ, h or D; all of these are global statistical estimates. This is also where the strength of the partition function method lies - global averages are much more stable than local information, and in some cases global information is all that it is possible to obtain. We have shown in the previous section that the wavelet transform, and in particular its maxima lines, can be used in evaluating the Hölder exponent of isolated singularities. In most real-life situations, however, the singularities in the time series are not isolated but


Figure 9.7. Left: h(q) evaluated from Eq. 9.7 for both the 'Clean' and 'Dirty' time series from figure 9.5. h shows a strong crossover, and for q > 1 quickly falls away from the average value of 0.4 for this process. Right: D(q) evaluated from Eq. 9.9 for the same time series. Again, a clear difference in behaviour is visible: D quickly approaches 0 for q larger than 1.

densely packed. The logarithmic rate of increase or decay of the corresponding wavelet transform maximum line is usually not stable but fluctuates, following the action of the (hypothetical, multiplicative) process involved. Indeed, it is generally not possible to obtain local estimates of the scaling behaviour from the WT other than in the case of isolated singular structures. This is why we introduced [1] an approach circumventing this problem while retaining local information: a local effective Hölder exponent, in which we capture the fluctuations and estimate the related exponents by modelling the singularities as created in some kind of collective process of a very generic class - the multiplicative cascade model. Each point in this cascade is uniquely characterised by the sequence of weights $(s_1 \ldots s_n)$, taking values from the (binary) set {1, 2}, and acting successively along a unique process branch leading to this point. Suppose that we denote the density of the cascade at generation level $F_i$ (i running from 0 to max) by $\kappa(F_i)$; we then have
$$\kappa(F_{max}) = p_{s_1} \cdots p_{s_n}\, \kappa(F_0) = \mathcal{P}_{F_0}^{F_{max}}\, \kappa(F_0) ,$$
and the local exponent is related to the rate of increase of the product $\mathcal{P}_{F_0}^{F_{max}}$ over the gained scale difference. In any experimental situation, the weights $p_i$ are not known and h has to be estimated. This can simply be done using the fact that, for the multiplicative cascade process, the effective product of the weighting factors is reflected in the difference of the logarithmic values of the densities at $F_0$ and $F_{max}$ along the process branch:
$$h_{F_0}^{F_{max}} = \frac{\log(\kappa(F_{max})) - \log(\kappa(F_0))}{\log\left((1/2)^{max}\right) - \log\left((1/2)^{0}\right)} .$$

The densities along the process branch can be estimated with the wavelet transform, using its remarkable ability to reveal the entire process tree of a multiplicative process [4, 20].
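A toy version of such a cascade (our own illustration, with deterministic binary weights) makes the formula above tangible: along the branch of all $p_1$ weights the exponent is $\log p_1 / \log(1/2)$, along the all-$p_2$ branch it is $\log p_2 / \log(1/2)$, and every other branch falls in between.

```python
import numpy as np

def binomial_cascade(p1=0.3, levels=12):
    """Density kappa of a deterministic binomial cascade after `levels` generations."""
    density = np.array([1.0])
    for _ in range(levels):                     # each point splits into two children
        density = np.concatenate([p1 * density, (1 - p1) * density])
    return density

levels = 12
kappa = binomial_cascade(0.3, levels)           # kappa(F_max) for all 2^levels branches
# local exponent per branch: log density gained over log scale gained; kappa(F_0) = 1
h = np.log(kappa) / np.log(0.5 ** levels)
print(f"h across branches: [{h.min():.3f}, {h.max():.3f}]")
print(f"expected extremes: {np.log(0.7)/np.log(0.5):.3f} and {np.log(0.3)/np.log(0.5):.3f}")
```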


It can be shown that the density $\kappa(F_i)$ corresponds with the value of the wavelet transform along the maxima lines belonging to the given process branch. The estimate of the effective Hölder exponent becomes:
$$\hat{h}_{s_{hi}}^{s_{lo}} = \frac{\log\left(W f \omega_{pb}(s_{lo})\right) - \log\left(W f \omega_{pb}(s_{hi})\right)}{\log(s_{lo}) - \log(s_{hi})} ,$$
where $W f \omega_{pb}(s)$ is the value of the wavelet transform at scale s along the maximum line $\omega_{pb}$ corresponding to the given process branch. Scale $s_{lo}$ corresponds with generation $F_{max}$, while $s_{hi}$ corresponds with generation $F_0$ (simply the largest available scale in our case).

Figure 9.8. Left: the projection of the maxima lines of the WT along time. The mean value of the Hölder exponent can be estimated from the log-log slope of the line shown. Also, the beginning of the cascade at the maximum scale $s_{hi}$ is indicated. Right: the maxima at the smallest scale considered are shown in the projection along time. The effective Hölder exponent can be evaluated for each point of the maximum line at the $s_{lo}$ scale. Two extremal exponent values are indicated, for minimum and maximum slope.

For a multiplicative cascade process, a mean value of the cascade at the scale s can be defined as:
$$M(s) = \frac{\sum_{\Omega(s)} \log\left(W f \omega_i(s)\right)}{Z(s, 0)} , \qquad (9.10)$$
where $Z(s, 0)$ is the partition function of Eq. 9.5 for q = 0 and corresponds with the number of maxima at the scale s considered. This mean is compatible with the canonical-formalism-based spectrum, see Eq. 9.7, and gives the direct possibility of estimating the mean value of the local Hölder exponent as a linear fit to M:
$$\log(M(s)) = \bar{h}\, \log s + C . \qquad (9.11)$$
Therefore, we estimate our mean Hölder exponent $\bar{h}$ from 9.11. The estimate of the local Hölder exponent, from now on to be denoted as $\hat{h}(x_0, s)$ or just $\hat{h}$, now becomes:
$$\hat{h}_{s_{SL}}^{s_{lo}} \cong \frac{\log\left(W f(s_{lo})\right) - \left(\bar{h}\, \log(s_{SL}) + C\right)}{\log(s_{lo}) - \log(s_{SL})} .$$
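In code, the mean-cascade fit of Eq. 9.11 and the local estimate above reduce to a few lines. One caveat about this sketch of ours: we read M(s) as the geometric mean of the maxima values, so that log M(s) is the plain average of the log values - the reading under which the linear fit of Eq. 9.11 is immediate - and the synthetic maxima are again a stand-in for a real WTMM tree.

```python
import numpy as np

def mean_cascade_fit(maxima_by_scale, scales):
    """Fit of eq. (9.11): average log maxima value against log s -> (h_mean, C)."""
    logM = [np.mean(np.log(w)) for w in maxima_by_scale]
    return tuple(np.polyfit(np.log(scales), logM, 1))

def effective_holder(w_at_slo, s_lo, s_max, h_mean, C):
    """Local effective Hölder exponent of one maxima line at scale s_lo,
    referenced to the mean cascade extrapolated from the largest scale s_max."""
    ref = h_mean * np.log(s_max) + C
    return (np.log(w_at_slo) - ref) / (np.log(s_lo) - np.log(s_max))

rng = np.random.default_rng(3)
scales = np.geomspace(2, 64, 6)
fake = [s**0.5 * np.exp(0.05 * rng.standard_normal(256)) for s in scales]
h_mean, C = mean_cascade_fit(fake, scales)
h_hat = effective_holder(fake[0][0], scales[0], scales[-1], h_mean, C)
print(f"h_mean = {h_mean:.2f},  h_hat of one line = {h_hat:.2f}")
```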


Such an estimated local $\hat{h}(x, s)$ is a function of the same x parameter (time, position) as the analysed function f(x), and can be analysed in a local fashion or histogrammed in order to study its distribution properties.

Figure 9.9. Left: the local Hölder exponent for the time series from figure 9.5, left. Right: the corresponding log-histograms of the local Hölder exponent. Thresholding on h separates outliers from the residue.

In figure 9.9, we depict it in a temporal fashion for the record contaminated with Dirac-type events, see figure 9.6. The local $\hat{h}(x, a = \log(10))$ plot is rather varied, but still clustered around hmean = 0.4. There are several 'drop down' events present, indicating the presence of strong singular events. These can be selected using an appropriate threshold on h. Generic criteria for the threshold choice can be obtained from the statistical distribution of the h exponent. For this purpose, the log-histogram of h can be analysed; see figure 9.9, right. For noisy time series, the corresponding histograms show considerable widening, often accompanied by visible fragmentation (discontinuity). Determining the 'noiseless' width of the histogram and the point above/below which 'noise' starts is a crucial task in this procedure. The histogram can provide good insight, but it is somewhat hard to automate the choice of the threshold level, which otherwise remains arbitrary. We have, therefore, resorted to a simple heuristic. A good threshold value which worked well in our tests (for an account of the experimental work we refer the reader to Ref [21]) was determined using the scaling exponent of (the square root of) the second moment of the measure Z(s, 2):
$$M'(s) = \sqrt{\frac{Z(s, 2)}{Z(s, 0)}} , \qquad (9.12)$$
and the thresholding $\hat{h}$ exponent is then determined from the linear fit:
$$\log(M'(s)) = \hat{h}\, \log(s) + C' . \qquad (9.13)$$
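The heuristic threshold of eqs. (9.12)-(9.13), i.e. the slope of the 'micro-canonical' geometric mean, is equally compact in code; applied to outlier-free synthetic maxima (a fabricated stand-in for real data) it reproduces the imposed exponent, while admixed spikes would pull it below the hmean mode, which is precisely the diagnostic used in the text.

```python
import numpy as np

def holder_threshold(maxima_by_scale, scales):
    """Thresholding exponent from M'(s) = sqrt(Z(s,2) / Z(s,0)),
    fitted against log s as in eq. (9.13)."""
    logMp = [0.5 * (np.log(np.sum(w ** 2)) - np.log(len(w)))   # log sqrt(Z2 / Z0)
             for w in maxima_by_scale]
    return np.polyfit(np.log(scales), logMp, 1)[0]

rng = np.random.default_rng(4)
scales = np.geomspace(2, 64, 6)
clean = [s**0.4 * np.exp(0.05 * rng.standard_normal(256)) for s in scales]
print(f"threshold exponent on clean synthetic maxima: "
      f"{holder_threshold(clean, scales):.2f}")
```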

In cases where there are outliers in the time series, this quantity, the 'micro-canonical' geometric mean $\hat{h}$, does not coincide with the hmean mode value of the D(h) distribution. In such cases it can be used as the threshold between the outliers and the residue, as in figure 9.9. In fact, the $\hat{h}$ value can also be used as an additional criterion (besides that


Figure 9.10. Local contributions to the multifractal spectrum of a healthy adult heartbeat record with all its monochromatic, i.e. mono-fractal, components separable. [2]

suggested in the previous section) for testing for the presence of outliers. In cases where the hmean differs significantly from $\hat{h}$, the probability of outliers is high. Finally, the histogram of the h exponent is shown in figure 9.9, with the tail towards the lower h values clearly visible. The same threshold $\hat{h}$ as in the local plot separates the residual bulk from the tail outlier events.


9.7. Employing the Local Effective Hölder Exponent in the Characterisation of Time Series

Such an estimated local $\hat{h}(x_0, s)$ can be depicted in a temporal fashion, for example with colour stripes, as we have done in figure 9.11. The colour of the stripes is determined by the value of the exponent $\hat{h}(x_0, s)$, and its location is simply the $x_0$ location of the analysed singularity (in practice this amounts to the location of the corresponding maximum line). Colour coding is done with respect to the mean value, which is set to the green colour central to our rainbow range. All exponent values lower than the mean value are given colours from the 'warmer' side of the rainbow, all the way towards dark red. All higher-than-average exponents get 'colder' colours, down to dark blue.

Figure 9.11. Left: example time series with the local Hurst exponent indicated in colour: the record of healthy heartbeat intervals and white noise. The background colour indicates the Hölder exponent locally, centred at the Hurst exponent at green; the colour goes towards blue for higher $\hat{h}$ and towards red for lower $\hat{h}$. Right: the corresponding log-histograms of the local Hölder exponent.

The first example time series is a record of the S&P500 index from the time period 1984-1988. There are significant fluctuations in colour in this picture, with the green colour centred at h = 0.5, indicating both smoother and rougher components. In particular, one can observe an extremal red value at the crash of '87 coordinate, followed by very rough behaviour. It has been previously recognised that post-crash behaviour may provide relevant means of completing financial time series analysis [23]. In a very recent shift of interest towards the analysis of intra- and post-crash behaviour, this counterpart of the traditional crash-precursory behaviour has been proven to constitute a meaningful 'diagnostic' tool [18]. The second example time series is a computer-generated sample of fractional Brownian


motion with H = 0.6. It shows almost monochromatic behaviour, centred at H = 0.6; the colour green is dominant. There are, however, several instances of darker green and light blue, indicating locally smooth components. It is important to notice that $h = H_{\text{Brownian walk}}$, the Hölder exponent value equal to the Hurst exponent of an uncorrelated Brownian walk, corresponds with no correlation in the time series (theoretically this is h = 0.5, but finite-size sample effects usually add some degree of correlation, slightly increasing this value). An ideal random walk would have only monochromatic components of this value. Of course, an ideal, infinitely long record of fractional Brownian motion of H = 0.6 is correlated, but this correlation would be stationary in such an ideal case and no fluctuations in correlation level (in colour) would be observed. By the same argument, we can interpret the variations in h as local fluctuations of correlation in the S&P index. The more red the colour, the more unstable and the more anti-correlated the index; the more blue, the more stable and correlated.

To the right of figure 9.11, the log-histograms of the Hölder exponent displayed in the colour panels are shown. They are made by taking the logarithm of the measure in each histogram bin. This conserves the monotonicity of the original histogram, but allows us to compare the log-histograms with the spectrum of singularities D(h). The log-histograms are actually closely related to the (multifractal) spectra of the Hölder exponent [2]. The multifractal spectrum of the Hölder exponent is the 'limit histogram' $D_{s\to 0}(h)$ of the Hölder exponent in the limit of infinite resolution. Of course, we cannot speak of such a limit other than theoretically and, therefore, a limit histogram (multifractal spectrum) has to be estimated from the evolution of the log-histograms along scale. For details see [2]. Let us point out that the width of the spectra alone is a relatively weak argument in favour of the hypothesis of the multifractality of the S&P index. The log-histogram of the S&P is only slightly wider than the log-histogram of a record of fractional Brownian motion of comparable length, see figure 9.11. An interesting observation is, however, that the crash of '87 is clearly an outlier in the sense of the log-histogram of the Hölder exponent (and therefore in the sense of the MF spectrum). The issue of crashes as outliers has been extensively discussed by Johansen [16] and by L'vov et al [24]. Here we support this observation from another point of view.
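The log-histogram itself is a one-liner. In the sketch below, with invented $\hat{h}$ values (a residual bulk around 0.5 plus a handful of spike-like exponents near −1, chosen only for illustration), the outlier tail appears as a thin, low plateau well separated from the bulk - which is how an outlying event shows up in such histograms.

```python
import numpy as np

def log_histogram(h_values, bins=30):
    """Logarithm of the bin counts; preserves the monotonicity of the raw histogram."""
    counts, edges = np.histogram(h_values, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0                     # empty bins carry no log value
    return centres[keep], np.log(counts[keep])

rng = np.random.default_rng(5)
h_bulk = 0.5 + 0.05 * rng.standard_normal(5000)   # residual bulk
h_tail = -1.0 + 0.10 * rng.standard_normal(10)    # spike-like outliers
centres, log_counts = log_histogram(np.concatenate([h_bulk, h_tail]))
for c, lc in zip(centres, log_counts):
    print(f"h = {c:+.2f}   log count = {lc:.1f}")
```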

9.8. Breaking with the Universality Picture: Reasoning from Non-stationarity

A true multifractal process, like the one suggested for both financial time series and heartbeat rate [25, 26, 27, 28], would share the same parameters (like the MF spectrum) for any sub-part of the record. Thus, for an ideal multifractal system, any new data recorded would not affect the spectrum already estimated. However, in real-life systems, which are not isolated, the interaction with the environment is reflected in the system's characteristics at any moment. The dynamical process in constant interaction with the environment will reflect the current mode of interaction with the environment in its characteristics. This holds for externally induced crashes - the decelerations due to events like obstruction of the umbilical cord in the case of the heartbeat of a fetus [29, 30], or financial crashes due


to a single piece of very bad news (the outbreak of war, etc.). But it also holds for the systems' characteristics at any moment, and can be related to external information in an instantaneous fashion. In particular, the short-range collective behaviour of the scaling exponent h contributing to the global MF spectrum can be studied by running a simple moving average (MA) filter, which may capture the collective behaviour of the local h characteristic.

Figure 9.12. The variability plot from a long run of experiments on adults where the test persons were given a placebo or a beta-blocker. Two runs of the MA filter were performed, with windows 100 and 1000 maxima long. An interesting pattern of response to food is evident.

An interesting pattern of 'surprising' features can be identified in the example (7-day-long) record of the heartbeat of an adult. Upon verification, it confirms a pattern of response to activity, suggesting novel links to external information. Without going into much detail of the record given, there is a particularly strong response of the person in question to food. The observed shift towards higher values as the result of eating (it is almost possible to estimate the volume of the meal!) may indicate a nearly pathologic response in this individual case [8]. A similar analysis of financial records can be done. In fact, linking roughness exponents with external influences has been postulated in Ref [6]. The roughness estimated there was, however, not linked to the multifractality model of the financial time series.

Universal descriptions of complex phenomena, like that using multifractal cascades, may prove to be inadequate for a complete description, even if valid in a restricted temporal range or in isolated or free-running conditions. The inadequacy of such models is often demonstrated by the non-stationarity of their characteristics. This aspect is usually neglected (through the selection of examples supporting the theory) or filtered out. However, failure of the model to explain the phenomenon fully may well provide significant insight into the dependence of the system on the external conditions. Structures emerging from the non-stationarity of the (most) sophisticated description available very likely indicate that a higher-dimensional embedding may be needed for a proper description of the system under study.

9.9. Discovering Structure Through the Analysis of Collective Properties of Non-stationary Behaviour

Non-stationarities are usually seen as the curse of the exact sciences, economics not excluded. Let us here present a different opinion: where the non-stationarities occur, interestingness begins! Non-stationarities can be seen as a departure from some (usually) simple 'model'. For example, this can be the failure of the stationarity of the effective Hölder exponent (see figure 9.11). In some sense, they indicate that the simple model used is not adequate, but this does not necessarily mean that one needs to patch or replace this low-level model. On the contrary, the information revealed by such a low-level model may be used to detect higher-order structures. In particular, correlations in the non-stationarities may indicate the existence of interesting structures. An intriguing example of such an approach in the financial domain is the work by A. Arneodo et al, where a correlation structure in the S&P index has been revealed [31].

The simplest way of detecting structure, we suggest, is detecting fluctuations or the collective behaviour of the local effective h. This has already been successfully applied in human heartbeat analysis [8]. Here we will present some preliminary results for the S&P index. The non-stationary behaviour in h can be quantified, and for this purpose we use a low-pass moving average (MA) filter to detect/enhance trends. This processing is, of course, done on the Hölder exponent value set $\{h_i(f(x))\}$, not on the input signal f(x). An MA filter with a window of n values is defined as follows:
$$h_{MA_n}(i) = \frac{1}{n} \sum_{j=i-n+1}^{i} h_j(f(x)) , \qquad (9.14)$$

where $h_j(f)$ are the subsequent values of the effective Hölder exponent of the time series f. Let us now go back to the S&P index and its effective Hölder exponent description. Different window lengths in our MA filter represent different horizons for the trader. If the index is all that is available, in order to evaluate the risk associated with trading (or, in other words, to predict the risk of an index crash), the trader might want to know how 'stable' the market/index is on a daily or monthly time scale. In fact, a comparison between the two indicators of stability might be even more indicative. This is exactly what we have done using two different time scales (two trading horizons) for the MA smoothing; see figure 9.13. The smoothed input is the effective Hölder exponent of the S&P index. It corresponds closely with the logarithm of the local volatility and as such it reflects the stability of the market. We made the following observations from this experiment: the short-time-horizon MA shows a strong oscillatory pattern in the collective behaviour of h. These oscillations have


already been observed by Y. Liu et al [22] and by N. Vandewalle et al. [6] This is, however, not log-periodic behaviour in our results, and it does not converge to a moment of crash. What can perhaps be used in order to help the trader in evaluating the growing risk is the interplay of the various time horizons. The second MA filter has a time horizon ten times longer and shows practically no oscillations. However, its value decays almost monotonically and, in the moment just before the crash, reaches the level of correlations characteristic of the random walk (see figure 9.13, right inserts). Note that the crashes themselves are not visible in the insert plots. Let us recall that the main advantage of the effective Hölder exponent over some traditional measures of volatility is that it describes the local level of correlation in the time series. If the value of h is below $h = H_{\text{Brownian walk}}$, this means we have an anti-correlated time series, which intuitively corresponds with a rather unstable process. h above $h = H_{\text{Brownian walk}}$ indicates the presence of correlations and generally can be associated with 'stability'. Please note that the oscillations in MA50 before the crashes bring the collective h up and down between the correlated and the anti-correlated regimes. Similarly, MA500 steadily decays towards the anti-correlated regime just before the crashes.
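The two-horizon smoothing is plain filtering once the $\hat{h}$ sequence is in hand. The sketch below runs Eq. 9.14 with windows of 50 and 500 values over a synthetic exponent sequence into which we have planted a slow pre-'crash' decay; the real S&P $\hat{h}$ record is of course not reproduced here.

```python
import numpy as np

def moving_average(h, n):
    """n-MA filter of eq. (9.14), run on the Hölder exponent sequence."""
    return np.convolve(h, np.ones(n) / n, mode="valid")

rng = np.random.default_rng(6)
h_seq = 0.5 + 0.08 * rng.standard_normal(4000)     # noisy exponents around 0.5
h_seq[3000:] -= np.linspace(0.0, 0.15, 1000)       # planted slow decay of correlation

ma_short, ma_long = moving_average(h_seq, 50), moving_average(h_seq, 500)
print(f"final MA50  = {ma_short[-1]:.3f}")
print(f"final MA500 = {ma_long[-1]:.3f}  (drifting towards the anti-correlated regime)")
```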

Figure 9.13. The effective Hölder exponent smoothed with two windows (MA50 and MA500) is shown. The two plots to the right show windows on the smoothed effective Hölder exponent just before crash #1 and crash #2. (The crashes are visible in the left figure but not in the windows.) Visible oscillations of MA50 and the decay of MA500 characterise the precursors of both crashes. The average level of the effective Hölder exponent for the uncorrelated Brownian walk is also indicated.


Conclusion

We have presented a modern approach to the multiscale analysis of fluctuations in economic time series: the Wavelet Transform Modulus Maxima based multifractal analysis has been discussed in both its global, partition-function-based and its local, Hölder exponent formulation. In particular, we have used the deviation from the expected multifractal spectrum to evaluate the potential presence of outliers. The local effective Hölder exponent has been applied to evaluate the correlation level of the S&P index locally, at an arbitrary position (time) and resolution (scale). In addition to this, we have analysed the collective properties of the local correlation exponent as perceived by a trader exercising various time horizon analyses of the index. A moving average filtering of Hölder exponent based variability estimates was used to mimic the various time horizon analyses of the index. We observed an intriguing interplay between different time horizons before the biggest crashes of the index.

References

[1] Z. R. Struzik, Local Effective Hölder Exponent Estimation on the Wavelet Transform Maxima Tree, in Fractals: Theory and Applications in Engineering, Eds: M. Dekking, J. Lévy Véhel, E. Lutton, C. Tricot, Springer Verlag, (1999).

[2] Z. R. Struzik, Determining Local Singularity Strengths and their Spectra with the Wavelet Transform, Fractals, 8, No 2, pp 163-179, (2000).

[3] M.H.R. Stanley, L.A.N. Amaral, S.V. Buldyrev, S. Havlin, H. Leschhorn, P. Maass, M.A. Salinger, H.E. Stanley, Can Statistical Physics Contribute to the Science of Economics?, Fractals 4, 3, pp 415-425, (1996). H.E. Stanley, L.A.N. Amaral, D. Canning, P. Gopikrishnan, Y. Lee, Y. Liu, Econophysics: Can Physicists Contribute to the Science of Economics?, Physica A, 269, 156-169, (1999).

[4] A. Arneodo, E. Bacry and J.F. Muzy, The Thermodynamics of Fractals Revisited with Wavelets, Physica A, 213, 232 (1995). J.F. Muzy, E. Bacry and A. Arneodo, The Multifractal Formalism Revisited with Wavelets, Int. J. of Bifurcation and Chaos 4, No 2, 245 (1994).

[5] R.N. Mantegna, H.E. Stanley, Scaling Behaviour in the Dynamics of an Economic Index, Nature 376, 46-49 (1995). R. Mantegna, H.E. Stanley, An Introduction to Econophysics (Cambridge University Press, Cambridge, 2000).

[6] N. Vandewalle, M. Ausloos, Coherent and Random Sequences in Financial Fluctuations, Physica A, 246, 454-459, (1997). N. Vandewalle, Ph. Boveroux, A. Minguet and M. Ausloos, Physica A, (1998).

[7] P.Ch. Ivanov, M.G. Rosenblum, C.-K. Peng, J. Mietus, S. Havlin, H.E. Stanley and A.L. Goldberger, Scaling Behaviour of Heartbeat Intervals Obtained by Wavelet-based Time-series Analysis, Nature, 383, 323 (1996). P.Ch. Ivanov, M.G. Rosenblum, L.A. Nunes Amaral, Z.R. Struzik, S. Havlin, A.L. Goldberger and H.E. Stanley, Multifractality in Human Heartbeat Dynamics, Nature 399, (1999).

[8] Z.R. Struzik, Revealing Local Variability Properties of Human Heartbeat Intervals with the Local Effective Hölder Exponent, to appear in Fractals, March 2001; see also CWI Report, INS-R0015, June 2000.

[9] I. Daubechies, Ten Lectures on Wavelets, (S.I.A.M., 1992).

[10] M. Holschneider, Wavelets - An Analysis Tool, (Oxford Science Publications, 1995).

[11] S.G. Mallat and W.L. Hwang, Singularity Detection and Processing with Wavelets, IEEE Trans. on Information Theory 38, 617 (1992). S.G. Mallat and S. Zhong, Complete Signal Representation with Multiscale Edges, IEEE Trans. PAMI 14, 710 (1992).

[12] S. Jaffard, Multifractal Formalism for Functions: I. Results Valid for all Functions, II. Self-Similar Functions, SIAM J. Math. Anal., 28(4): 944-998, (1997).

[13] R. Carmona, W.H. Hwang, B. Torrésani, Characterisation of Signals by the Ridges of their Wavelet Transform, IEEE Trans. Signal Processing 45, vol 10, 480-492, (1997).

[14] Z.R. Struzik, Removing Divergences in the Negative Moments of the Multi-Fractal Partition Function with the Wavelet Transformation, CWI Report, INS-R9803. Also see 'Fractals and Beyond - Complexities in the Sciences', M.M. Novak, Ed., World Scientific, 351 (1998).

[15] A. Arneodo, E. Bacry and J.F. Muzy, Wavelet Analysis of Fractal Signals: Direct Determination of the Singularity Spectrum of Fully Developed Turbulence Data, Physica A, 213, 232 (1995).

[16] A. Johansen, D. Sornette, Stock Market Crashes are Outliers, Eur. Phys. J. B 1, pp. 141-143 (1998).

[17] A. Johansen, D. Sornette, Large Stock Market Price Drawdowns are Outliers, Journal of Risk, 1 (4), pp. 5-32 (2002).

[18] D. Sornette, Y. Malevergne, J.F. Muzy, Volatility Fingerprints of Large Shocks: Endogeneous Versus Exogeneous, in "Application of Econophysics," Proceedings of the Second Nikkei Symposium on Econophysics, H. Takayasu, ed., Springer Verlag, (2003).

[19] X. Gabaix, P. Gopikrishnan, V. Plerou, H.E. Stanley, Understanding Large Movements in Stock Market Activity; X. Gabaix, P. Gopikrishnan, V. Plerou, H.E. Stanley, A Simple Theory of the 'cubic' Laws of Financial Fluctuations, working papers (2002).

[20] Z.R. Struzik, The Wavelet Transform in the Solution to the Inverse Fractal Problem, Fractals 3, No 2, 329 (1995).

[21] Z.R. Struzik, A. Siebes, Outlier Detection and Localisation with Wavelet Based Multifractal Formalism, CWI Report, INS-R0008 (2000).

[22] Y. Liu, P. Gopikrishnan, P. Cizeau, M. Meyer, C.-K. Peng, H.E. Stanley, The Statistical Properties of the Volatility of Price Fluctuations, Phys. Rev. E 60, pp 1390-1399, (1999).

[23] Z.R. Struzik, Wavelet Methods in (Financial) Time-series Processing, Physica A: Statistical Mechanics and its Applications, 296 (1-2), pp. 307-319, (2001).

[24] V.S. L'vov, A. Pomyalov, I. Procaccia, Outliers, Extreme Events and Multiscaling, Phys. Rev. E, 63, 056118 (2001).

[25] A. Fisher, L. Calvet, B.B. Mandelbrot, Multifractality of the Deutschmark/US Dollar Exchange Rate, Cowles Foundation Discussion Paper, (1997).

[26] M.E. Brachet, E. Taflin, J.M. Tchéou, Scaling Transformations and Probability Distributions for Financial Time Series, Chaos, Solitons and Fractals 11, pp. 2343-2348, (2000).

[27] F. Schmitt, D. Schertzer, S. Lovejoy, Multifractal Analysis of Foreign Exchange Data, Appl. Stochastic Models Data Anal. 15, pp. 29-53, (1999).

[28] P.Ch. Ivanov, M.G. Rosenblum, L.A. Nunes Amaral, Z.R. Struzik, S. Havlin, A.L. Goldberger and H.E. Stanley, Multifractality in Human Heartbeat Dynamics, Nature 399, (1999).

[29] Z. R. Struzik, Econonatology: The Physics of the Economy in Labour, Physica A, 324 (1-2), pp 344-351, (2003).

[30] Z. R. Struzik, Taking the Pulse of the Economy, Quantitative Finance 3, C78-C82 (2003).

[31] A. Arnéodo, J.-F. Muzy, D. Sornette, 'Direct' Causal Cascade in the Stock Market, Eur. Phys. J. B 2, 277-282 (1998).

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 189-220

ISBN: 978-1-61122-864-9 © 2012 Nova Science Publishers, Inc.

Chapter 10

Synchronicity between Macroeconomic Time Series∗

Alvaro Escribano1,† and Ana E. Sipols2,‡
1 Telefonica Chair of Economics of Telecommunications, Department of Economics, Universidad Carlos III de Madrid, Madrid, Spain
2 Department of Statistics, Universidad Rey Juan Carlos de Madrid, Madrid, Spain

∗ This paper is written in memory of Felipe M. Aparicio, who passed away. It is heavily based on the preliminary joint work done as a Working Paper of UC3M, "Synchronicity between macroeconomic time series: an exploratory analysis".
† The first author acknowledges the financial support received from the "Excellence Program in Education and Research of the Bank of Spain" and from the research grant MICIN-ECO2009-08308 of the Ministry of Innovation and Science, Spain.
‡ The second author acknowledges the financial support received from URJC-CM-2008-CET-3703.

10.1. Introduction

In this chapter we analyse the performance of a model-free cointegration testing device that we construct from functions of order statistics. First, we propose new exploratory techniques that consist in comparing the plots of the range and jump sequences of the original series. These plots suggest alternative cointegration testing schemes. Here we focus on one of them, which involves two complementary test statistics. We report on some promising results obtained from Monte Carlo experiments, as well as on some empirical applications of the new method to pairs of exchange rates, and to gold and silver prices. Our study concludes that the proposed methodology is potentially robust to monotonic nonlinearities and serial correlation structure in the cointegration errors, and to certain types of level shifts in the cointegration relationship.

Processes which exhibit common trends or similar long waves in their sample paths are often called cointegrated. The concept of cointegration is inherently linear and originated in macroeconomics and finance (Granger, 1981; Engle and Granger, 1987), where the theory suggests the presence of economic or institutional forces preventing two or more series


to drift too far apart from each other. Take, for example, series such as income and expenditure, the prices of a particular good in different markets, the interest rates in different countries, the velocity of circulation of money and short-run interest rates, etc. Cointegration relationships may also appear in engineering applications, for instance between the output signals from different sensing or processing devices having a limited storage capacity or memory, and driven by a common persistent input flow (Aparicio, 1995).

Underlying the idea of cointegration is that of a stochastic equilibrium relationship (i.e. one which, apart from deterministic elements, holds on the average) between two cointegrated variables, $y_t$, $x_t$. A strict equilibrium exists when, for some $\alpha \neq 0$, one has $y_t = \alpha x_t$. This unrealistic situation is replaced, in practice, by that of (linear) cointegration, in which the equilibrium error $z_t = y_t - \alpha x_t$ is different from zero but fluctuates around the mean much more frequently than the individual series.

So far, only a few attempts to extend the concept of cointegration beyond the assumption of linearity in the relationship have been considered. This is essentially due to the fact that a general null hypothesis of cointegration encompassing nonlinear relationships is too wide to be tested. Notwithstanding, the possibility of nonlinear cointegrating relationships is real, and it has therefore prompted some interesting definitions and ongoing research on the subject. The first of these attempts was due to Hallman (1990) and to Granger and Hallman (1991a). Following this, for a pair of series $y_t$, $x_t$ to have a nonlinear cointegration attractor, there must be nonlinear measurable functions f(.), g(.) such that $f(y_t)$ and $g(x_t)$ are both I(d), d > 0, and $w_t = f(y_t) - g(x_t) \sim I(d_w)$, with $d_w < d$ (see also Granger and Terasvirta (1993)). Assuming that f and g can be expanded as Taylor series up to some order $p \ge 2$ around the origin, we can write $w_t = c_0 + c_1 z_t + HOT(y_t, x_t)$, where $z_t = y_t - \alpha x_t$, and with HOT(., .) denoting higher-order terms. It follows that the linear approximation, $z_t$, to the true cointegration residuals differs from the latter by some higher-order terms, which express that the strength of attraction onto the cointegration line $y_t = \alpha x_t$ varies with the levels of both series, $y_t$ and $x_t$. As with linear cointegration, the case where $d_x = d_y = 1$, $d_z = 0$ and the cointegration residuals have finite variance is the most important in practice, since it allows a straightforward interpretation in terms of equilibrium concepts. Figure 10.1 illustrates the case of nonlinear cointegration, with simulated nonlinear transformations of random walks.

The most general distributional results on unit-root testing were first obtained by Phillips (1987). Suppose $x_t$ has mean $\mu_t$, and let $\Delta(x_t - \mu_t) = \varepsilon_t$, with $\Delta$ denoting the first differencing operator. The main assumption imposed to obtain the limit distribution of standard unit-root tests is the following, due to Herrndorf (1984):

Assumption 1 (AS1).
1. $E(\varepsilon_t) = 0$;
2. $\sup_t E(|\varepsilon_t|^{\gamma}) < \infty$ for some $\gamma > 2$;
3. $0 < \lim_{n\to\infty} E\left( n^{-1} \left( \sum_{t=1}^{n} \varepsilon_t \right)^2 \right) < \infty$;
4. $\varepsilon_t$ is strong mixing, with mixing coefficients $\alpha_i$ satisfying $\sum_{i=1}^{\infty} \alpha_i^{1-2/\gamma} < \infty$.

[…] $x_j$), see for example David (1981).


10.2. Cointegration Testing Using the Ranges

The objective of this section is twofold. First, based on a preliminary exploratory analysis, we propose a model-free procedure for testing cointegration. Second, we analyse its behaviour in finite samples and show promising results as regards its robustness to different departures from the standard framework, such as monotonic nonlinearities or serial correlation in the cointegration errors. Our concern is mainly exploratory, and as such, we will deliberately skip any asymptotic analysis. Important questions, such as the convergence rates and the limiting distributions of the standardised test statistic which we propose here, are still under research. For our purpose, it may be interesting to see the cointegration property in terms of the following two conditions:

1. There are informational events that have a permanent effect on the levels of the series (in the linear case, this amounts to saying that the series are integrated).

2. The relevant informational events for either series occur simultaneously up to a constant delay, and their effects on the levels can be related.

10.2.1. Exploratory Analysis Based on Ranges

Obviously, these conditions automatically raise the question of what should be considered as relevant informational events for a series. The definition that follows is based on a characterisation of the 'long wave' behaviour in terms of what we call Low-Frequency Features (LFF hereafter). Following Granger and Terasvirta (1993) and Anderson and Vahid (1998), a feature is essentially any dominating statistical property exhibited by a time series. Features may refer to either the behaviour of the mean or of higher-order moments of the series, such as heteroskedasticity. Here we are interested only in the former; that is, in those features that are potentially useful in revealing the presence of stochastic trends. These include the autocorrelation structure of the series, or of nonlinear transformations of the latter, any existing growth rate, and whatever measure of mean reversion. Features are endowed with some algebraic properties. For instance, if $x_t$ has a feature while $y_t$ has not, then both $\lambda x_t$ and $y_t + x_t$, as well as any delayed replica of $x_t$, say $x_{t-p}$ (where p is a positive integer), will have that feature. Roughly speaking, we could say that a time series has strong dependence in the mean if it exhibits an LFF.

Here we will focus on the class of LFFs that are obtained from the sequence of running ranges. The range of a data sample is defined in terms of its extremes. Formally, for a given time series $x_t$, the statistics $x_{1,i} = \min\{x_1, \cdots, x_i\}$ and $x_{i,i} = \max\{x_1, \cdots, x_i\}$ are called the i-th extremes (see for instance Galambos, 1984). When the sample comes from a time series $x_t$, a sequence of ranges can be obtained as $R_i^{(x)} = x_{i,i} - x_{1,i}$, for $i = 1, \cdots, n$, where n denotes the sample size. This sequence defines an integrated jump process, where the jumps $\Delta R_i^{(x)} = R_i^{(x)} - R_{i-1}^{(x)}$ are nonnegative quantities that will be different from zero each time i that a new maximum or a new minimum is reached.

Some statistical properties of the original time series $x_t$, such as general forms of serial dependence and dependence between series, could be more easily assessed in the dynamics of the running ranges $R_i^{(x)}$ (see Aparicio and Escribano, 1999). One important finding is that

One important finding is that the range sequence R_i^{(x)} for stationary time series is stochastically bounded², whereas it is not for null-recurrent time series such as integrated time series or those having monotonic trends. The implication of this is that checking for cointegration will basically consist in checking the synchronicity (up to a constant delay) of two sets of arrival times. In the sequel, we make this characterisation operative by using the sequences of running ranges and the associated integrated jump processes. Basically, a process defined by a sequence of ranges is an integrated jump process, where each jump corresponds to the arrival of what we call a relevant informational event, which for us will be one which contributes either a new maximum or a new minimum level in the series.

² A nonnegative sequence s_t is said to be stochastically bounded if, for every positive real number ε, there exists a finite positive constant δ_ε such that sup_t P(s_t ≤ δ_ε) ≥ 1 - ε.

Figure 10.2 shows the range sequences R_i^{(y)}, R_i^{(x)} for pairs of linearly cointegrated, nonlinearly cointegrated (cubic nonlinearity), non-cointegrated, and I(0) comoving time series. It can be seen that, for cointegrated series (either linear or nonlinear), the jumps occur at approximately the same instants, even though their amplitudes may not be related by a linear relationship (see Figures 10.3 and 10.4). On the contrary, the jump sequences corresponding to the non-cointegrated series show no apparent relation between the arrival times of the two sets of jumps, and a similar behaviour is obtained when the series are I(0) but comoving (Figures 10.5 and 10.6). Figure 10.7 shows the cross-plots of the range sequences for the four pairs of series. It is apparent that cointegration implies a sort of continuity (synchronicity) in these plots. For the pairs of independent random walks and the pairs of I(0) comoving series, the sequences of ranges evolve differently, which explains the discontinuities in the corresponding cross-plots. These discontinuities are more pronounced for pairs of independent random walks, since the sample paths of the series consist of long strides, while for the I(0) comoving series the different ways in which the range series evolve are hidden by high-frequency fluctuations. As a consequence, the discontinuities in the last cross-plot are not so outstanding.

10.2.2. Some Nonparametric Range Statistics for Cointegration Testing

As we mentioned before, our characterisation of cointegration requires that the following two conditions concerning the jumps (first differences of the ranges) are satisfied:

1. For each series, the jumps are persistent. This means that the probability of occurrence of a new maximum or minimum is constant along time and is nonzero. That is, the series can fluctuate wildly around their mean and their levels are not stochastically bounded. Somehow, this requirement amounts to the long-memory property.

2. The jumps for both series occur at time instants that are equal up to a constant delay.

If either of these conditions is not satisfied, then the series will not be cointegrated in the sense of our characterisation. For example, if for each series the jumps cluster along orthogonal supports, these series will not be cointegrated. Further, to distinguish between linear and nonlinear cointegration, it is enough to remark that, while for linear cointegration informational events having identical arrival times in both series will have approximately the same impact on their levels, for nonlinear or non-stationary cointegration these shocks may have quite different effects on each series.

Figure 10.2. Sequences of ranges for pairs of linearly cointegrated, nonlinearly cointegrated, non-cointegrated and I(0) comoving series: (a) linear cointegration, (b) nonlinear cointegration, (c) independent random walks, and (d) I(0) comoving series.

Assuming that no series lags behind the other, then under cointegration a cross-plot of the jumps for both series would show many points in the first quadrant, while for independent random walks or for I(0) comoving series the points in these plots would tend to lie very close to, or along, the nonnegative half-axes. Therefore, one would be inclined to believe that the quality of fit of a regression line from the origin to the points in these plots would necessarily be better under cointegration than under non-cointegration. Figure 10.8 shows these cross-plots obtained from 100 replications of each pair of series. In order to summarise in a statistic the information collected by the cross-plots of ΔR_i^{(y)} versus ΔR_i^{(x)}, we remark that, in the presence of a linear or a monotonic nonlinear cointegrating component in x_t, y_t, the sequences of ranges, R_1^{(x)}, ..., R_n^{(x)} and R_1^{(y)}, ..., R_n^{(y)}, will be approximately proportional. We expect a similar behaviour from the sequences of jumps, ΔR_1^{(x)}, ..., ΔR_n^{(x)} and ΔR_1^{(y)}, ..., ΔR_n^{(y)}. Thus, a non-parametric measure of cointegration could be provided by the following statistic, which provides a measure of the sample cross-correlation of the jump sequences ΔR_i^{(x)} and ΔR_i^{(y)}:

ρ^{(n)}_{x,y} = \frac{\sum_{i=2}^{n} ΔR_i^{(x)} ΔR_i^{(y)}}{\big(\sum_{i=2}^{n} (ΔR_i^{(x)})^2\big)^{1/2} \big(\sum_{i=2}^{n} (ΔR_i^{(y)})^2\big)^{1/2}}.        (10.5)

Figure 10.3. Sequences of jumps ΔR_i^{(y)} and ΔR_i^{(x)} for the linearly cointegrated series used in Figure 10.2.
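A direct computation of the statistic (10.5) takes only a few lines; the sketch below (our illustration) recomputes the jump sequences so as to be self-contained:

```python
import numpy as np

def range_jumps(x):
    # jumps of the running-range sequence of a series
    x = np.asarray(x, dtype=float)
    R = np.maximum.accumulate(x) - np.minimum.accumulate(x)
    return np.diff(R)

def rho_stat(x, y):
    # equation (10.5): uncentred sample cross-correlation of the jump sequences
    dx, dy = range_jumps(x), range_jumps(y)
    return (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))
```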

As will be shown in the next section, the previous test statistic cannot discriminate properly between model 1 of linear cointegration and model 3 of I(0) linearly comoving series. An inspection of the cross-plots for the jump series reveals that the points in these plots tend to cluster at the origin for pairs of I(0) comoving series, meaning that there are large time spells in which no relevant informational event appears for either series. Indeed, the very nature of the sample paths of I(0) series entails that all relevant features of the series are captured in a comparatively small time interval, whereas the long strides of the sample paths of integrated series preclude this possibility. This explains why the quality of fit of a regression line from the origin to the points in the cross-plots cannot be distinguished from that obtained for pairs of linearly cointegrated series. This calls for a complementary nonparametric test statistic that takes these features into account in order to discriminate between pairs of cointegrated series and pairs of I(0) series. Therefore we propose a second test statistic, R^{(n)}_{x,y}, which we define as

R^{(n)}_{x,y} = \frac{J^+}{N_J},        (10.6)

Figure 10.4. Sequences of jumps ΔR_i^{(y)} and ΔR_i^{(x)} for the nonlinearly cointegrated series used in Figure 10.2.

where J^+ denotes the number of points in the plots which occur on the positive half-axes, and N_J the number of points at the origin of these plots. In other words, R^{(n)}_{x,y} measures the proportion of informational events that are relevant to only one of the series relative to those which are relevant to neither. The variable selected for the numerator in this ratio ensures that, for pairs of independent random walks, R^{(n)}_{x,y} will take large values as compared to the cases of I(0) series and of linear cointegration.
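For concreteness, a minimal sketch of (10.6) follows; the numerical zero threshold `tol` is our own implementation choice, and the ratio is undefined when no point falls at the origin:

```python
import numpy as np

def r_stat(x, y, tol=1e-12):
    # equation (10.6): R = J+/N_J. J+ counts time points where exactly one of
    # the two jump sequences is nonzero (points on the positive half-axes of
    # the cross-plot); N_J counts points where neither series jumps (origin).
    def jumps(s):
        s = np.asarray(s, dtype=float)
        return np.diff(np.maximum.accumulate(s) - np.minimum.accumulate(s))
    jx, jy = jumps(x) > tol, jumps(y) > tol
    j_plus = int(np.sum(jx ^ jy))        # exactly one series jumps
    n_origin = int(np.sum(~jx & ~jy))    # neither series jumps
    return j_plus / n_origin if n_origin > 0 else np.inf
```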

10.2.3. Monte Carlo Simulations

In Table 10.1, the mean values and standard deviations (given in brackets) of the jump statistic ρ^{(n)}_{x,y} are reported for an experiment involving 5000 replications of cointegrated (linearly and nonlinearly, quadratic) and non-cointegrated series of length n = 1000. The nonlinearly cointegrated series were obtained as in Figure 10.1, using a quadratic transformation of a common random walk component plus an added independent white Gaussian noise. We also estimated the mean and standard deviation of the jump statistic on a pair of linearly comoving I(0) series generated with the following model:

x_t = 0.6 x_{t-1} + e_{t,1},
y_t = 2.0 x_t + e_{t,2},

where e_{t,1}, e_{t,2} are independent i.i.d. sequences of Gaussian random variables (Nid(0, 1)). In the sequel, LC will stand for linear cointegration, NLC for nonlinear cointegration, and IRW for independent random walks.

Figure 10.5. Sequences of jumps ΔR_i^{(y)} and ΔR_i^{(x)} for the independent random walks used in Figure 10.2.

Table 10.1. Mean values and standard deviations of the jump statistic ρ^{(n)}_{x,y}, evaluated on 5000 replications of LC, NLC, IRW and I(0) linearly comoving series, for a sample size n = 1000.

    Test statistic    LC              NLC (quadratic)    IRW            I(0) comoving
    ρ^{(n)}_{x,y}     0.2724 (0.09)   0.465 (0.1)        0.07 (0.04)    0.6 (0.12)

Clearly, the case of independent random walks can easily be discriminated using this statistic in a unilateral test. Figure 10.9 shows the histogram plots of ρ^{(n)}_{x,y}. Under the null hypothesis H_0 of IRW, we estimated the empirical density of ρ^{(n)}_{x,y} by smoothing techniques. More specifically, we chose the kernel density estimator with the Epanechnikov kernel function and bandwidth parameter h = 0.05 for different sample sizes (n = 100, 500, 1000) and for 5000 Monte Carlo simulated pairs of independent random walks with Nid(0, 1) errors (Model 0); Figure 10.10 shows this density. We observe that, as n tends to infinity, the shape of the estimated density is similar to that of a Chi-squared distribution. However, as the sample size increases, the critical values decrease quite fast, suggesting that the asymptotic null distribution of ρ^{(n)}_{x,y} could be degenerate.
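The smoothing step just described can be reproduced as follows; this is a generic Epanechnikov kernel density estimator, with the bandwidth h = 0.05 used above as default:

```python
import numpy as np

def epanechnikov_kde(samples, grid, h=0.05):
    # kernel density estimate f(x) = (1/(n*h)) * sum_j K((x - s_j)/h), with the
    # Epanechnikov kernel K(u) = 0.75*(1 - u^2) on |u| <= 1 and 0 elsewhere
    samples = np.asarray(samples, dtype=float)
    u = (np.asarray(grid, dtype=float)[:, None] - samples[None, :]) / h
    K = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return K.sum(axis=1) / (samples.size * h)
```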

Figure 10.6. Sequences of jumps ΔR_i^{(y)} and ΔR_i^{(x)} for the pair of I(0) comoving series used in Figure 10.2.

In this chapter we will not attempt to find the proper convergence rate of this statistic. Nevertheless, we analyse this testing device in finite samples by considering empirical critical values for different fixed sample sizes. We also estimated the empirical density of ρ^{(n)}_{x,y} for exponentially cointegrated variables, by Monte Carlo simulations as above with h = 0.15. The DGP was

w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},
y_t = 0.5 exp(w_t) + e_{t,2}.        (10.7)

Figure 10.11 shows the estimated density for different sample sizes. We observe that, as n tends to infinity, the estimated density approaches the shape of a Normal distribution. Under the null hypothesis H_0 of IRW, we computed the 5% right critical values of the empirical distribution of ρ^{(n)}_{x,y} for different sample sizes (n = 100, 224, 500, 1000, 2000 and 5000) and for 10000 simulated pairs of independent random walks with Nid(0, 1) distributed errors (Model 0).

Model 0 (Independent Random Walks)
x_t = x_{t-1} + e_{t,1},        (10.8)
y_t = y_{t-1} + e_{t,2},    e_{t,1}, e_{t,2} ∼ Nid(0, 1).        (10.9)
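The simulated critical values reported below in Table 10.2 can be approximated along the following lines (a sketch only; the replication count is kept small for speed, and the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

def rho_stat(x, y):
    # statistic (10.5) on the range jumps, as defined earlier
    def jumps(s):
        return np.diff(np.maximum.accumulate(s) - np.minimum.accumulate(s))
    dx, dy = jumps(x), jumps(y)
    return (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))

def right_critical_value(n, reps=2000, level=0.95):
    # 5% right critical value under H0: two independent Nid(0,1) random walks
    stats = [rho_stat(np.cumsum(rng.standard_normal(n)),
                      np.cumsum(rng.standard_normal(n))) for _ in range(reps)]
    return np.quantile(stats, level)

print(right_critical_value(100))   # compare with Table 10.2
```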


Figure 10.7. Cross-plots of the sequences of ranges for pairs of series that follow: (a) linear cointegration, (b) nonlinear cointegration, (c) independent random walks, and (d) I(0) comoving series.

The critical value (cv) corresponding to the sample size n = 224 is generated because it will be used later in one of the empirical applications of the test. The results are summarised in Table 10.2. First, we will show that cointegration can be defined by means of the synchronicity requirement when we allow for the appropriate time delay correction in the series. We consider 10000 replications of time series generated with the following linear model:

Model 1a (Linear Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.10)
y_t = a w_{t-m} + e_{t,2},        (10.11)

where e_{t,1}, e_{t,2} ∼ Nid(0, 1).
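A minimal Monte Carlo sketch of the delay-corrected test on Model 1a follows; the critical value cv has to be taken from Table 10.2 for the sample size used, and the start-up values for y_t are an implementation detail of ours:

```python
import numpy as np

rng = np.random.default_rng(7)

def rho_stat(x, y):
    # statistic (10.5) on the range jumps
    def jumps(s):
        return np.diff(np.maximum.accumulate(s) - np.minimum.accumulate(s))
    dx, dy = jumps(x), jumps(y)
    return (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))

def power_with_delay_correction(n, m, a=1.0, reps=1000, cv=0.16):
    # Simulates Model 1a and rejects when rho(x_{t-m}, y_t) exceeds cv; cv should
    # be the simulated right critical value for the sample size used (Table 10.2,
    # e.g. 0.16 for n = 1000). Assumes m >= 1.
    rejections = 0
    for _ in range(reps):
        w = np.cumsum(rng.standard_normal(n + m))
        x = w + rng.standard_normal(n + m)
        y = np.empty(n + m)
        y[:m] = rng.standard_normal(m)          # arbitrary start-up values
        y[m:] = a * w[:-m] + rng.standard_normal(n)
        if rho_stat(x[:-m], y[m:]) > cv:        # aligns x_{t-m} with y_t
            rejections += 1
    return rejections / reps
```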

Table 10.2. Simulated 5% right critical values of the empirical distribution of the test statistic ρ^{(n)}_{x,y} under the null hypothesis.

    n     100     224     500     1000    2000    5000
    cv    0.39    0.27    0.22    0.16    0.12    0.08

Figure 10.8. ΔR_i^{(y)} versus ΔR_i^{(x)} for pairs of: (a) linearly cointegrated series, (b) nonlinearly cointegrated series (quadratic transformation), (c) non-cointegrated series (independent random walks), and (d) I(0) linearly comoving series.

Table 10.3 shows the estimated power of the test based on ρ^{(n)}_{x,y}, the correlation between x_t and y_t, for different sample sizes (n = 100, 500 and 1000) and for different values of the delay parameter m (m = 1, 2 and 5). The results in Table 10.3 show that our test loses power when the correct constant delay m is misspecified. However, if we consider the appropriate delay m in the statistic ρ^{(n)}_{x_{t-m},y_t}, the correlation between x_{t-m} and y_t, for m = 1, 2, 5, we recover the previously obtained power results. Table 10.4 shows the values of the estimated power of the test based on ρ^{(n)}_{x_{t-m},y_t}. Therefore, in general, synchronicity is a stronger requirement than cointegration.

Table 10.3. Power of the test based on ρ^{(n)}_{x,y}.

    m \ n    100      500      1000
    1        0.123    0.324    0.6
    2        0.073    0.17     0.4
    5        0.038    0.07     0.2

Figure 10.9. Histogram plots for ρ^{(n)}_{x,y}, where the frequencies are estimated from 5000 replications of sample size 1000 of: (a) linearly cointegrated series, (b) nonlinearly (quadratic) cointegrated series, and (c) non-cointegrated series (independent random walks).

Table 10.4. Power of the test based on ρ^{(n)}_{x_{t-m},y_t}.

    m \ n    100    500     1000
    1        0.5    0.8     0.9
    2        0.5    0.74    0.9
    5        0.5    0.85    0.95

However, if we allow the former property to take place at any constant delay, then cointegration is properly defined in terms of synchronicity (up to a constant delay).

Next, we computed the power and size of the unilateral test that uses the estimated critical values under the different alternative models given below, for a = 0.5, b = 0.6, c = 1000, d = 100, where e_{t,0}, e_{t,1}, e_{t,2} are Nid(0, 1) distributed errors. Table 10.5 shows the estimated power for the fixed sample sizes n = 100, 500 and 1000.

Model 1 (Linear Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.12)
y_t = a w_t + e_{t,2}.        (10.13)

Model 2a (Quadratic Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.14)
y_t = a w_t^2 + e_{t,2}.        (10.15)

Model 2b (Logarithmic Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.16)
y_t = a log(w_t + c) + e_{t,2}.        (10.17)

Figure 10.10. Kernel density estimator of ρ^{(n)}_{x,y} for independent random walk series (n = 100, 500, 1000).

Model 2c (Exponential Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.18)
y_t = a exp(w_t / d) + e_{t,2}.        (10.19)

Model 2d (Exponential Cointegration)
w_t = w_{t-1} + e_{t,0},
x_t = w_t + e_{t,1},        (10.20)
y_t = a exp(w_t) + e_{t,2}.        (10.21)

Model 3 (I(0) Comoving)
x_t = b x_{t-1} + e_{t,1},
y_t = 2.0 x_t + e_{t,2}.        (10.22)

Model 4 (I(0) Independent)
x_t = b x_{t-1} + e_{t,1},
y_t = 0.8 y_{t-1} + e_{t,2}.        (10.23)

Figure 10.11. Kernel density estimator of ρ^{(n)}_{x,y} for exponentially cointegrated series (n = 100, 500, 1000). The DGPs are given in equation (10.7).

Model 5 (I(0) and I(1) Independent)
x_t = b x_{t-1} + e_{t,1},
y_t = y_{t-1} + e_{t,2}.        (10.24)
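The data-generating processes of Models 0-5 are easily reproduced; the following sketch collects them in a single function, with the parameter defaults a = 0.5, b = 0.6, c = 1000 and d = 100 used in the text:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(model, n, a=0.5, b=0.6, c=1000.0, d=100.0):
    # One draw of (x_t, y_t) from Models 0-5 above; all errors are Nid(0,1).
    # For model 2b the shift c keeps the log argument positive for moderate n.
    e0, e1, e2 = rng.standard_normal((3, n))
    w = np.cumsum(e0)

    def ar1(e, phi):
        s = np.zeros(n)
        for t in range(1, n):
            s[t] = phi * s[t - 1] + e[t]
        return s

    if model == "0":
        return np.cumsum(e1), np.cumsum(e2)          # independent random walks
    if model in ("1", "2a", "2b", "2c", "2d"):
        x = w + e1
        y = {"1":  a * w,                            # linear cointegration
             "2a": a * w ** 2,                       # quadratic
             "2b": a * np.log(w + c),                # logarithmic
             "2c": a * np.exp(w / d),                # exponential (damped)
             "2d": a * np.exp(w)}[model] + e2        # exponential
        return x, y
    if model == "3":
        x = ar1(e1, b)
        return x, 2.0 * x + e2                       # I(0) comoving
    if model == "4":
        return ar1(e1, b), ar1(e2, 0.8)              # I(0) independent
    if model == "5":
        return ar1(e1, b), np.cumsum(e2)             # I(0) and I(1) independent
    raise ValueError(model)
```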

As expected, the results in Table 10.5 tell us that this test statistic cannot discriminate properly between model 1 (linear cointegration) and model 3 of I(0) linearly comoving series. In fact, it yields significant values whenever the series are I(0), as shown by the fact that the power of the test against pairs of independent I(0) series is well above the 5% level of the critical values estimated under the null hypothesis, and increases rapidly with the sample size. Another drawback of this test is its varying power performance when different nonlinearities appear in the relationship. These results come in support of the findings in Park and Phillips (1999) by suggesting that there is no single convergence rate of the test statistic for all nonlinear transformations. A closer look at Figures 10.10 and 10.11 also reveals certain discrepancies with the power results found for model 2c of exponential cointegration. These discrepancies arise from the existence of a linear regime and a more nonlinear regime for transformations such as the exponential and the logarithmic. Therefore, for small values of the parameter d in model 2c, the exponential transformation behaves approximately linearly, contrary to what happens for values of this parameter close to d = 100, which we used for the power estimation.


Table 10.5. Estimated power for the test based on ρ^{(n)}_{x,y} against different alternatives.

    Model \ n    100      500      1000
    1            0.481    0.8      0.9
    2a           0.9      1        1
    2b           0.06     0.09     0.123
    2c           0.05     0.07     0.09
    2d           0.5      0.6      0.8
    3            1        1        1
    4            0.114    0.255    0.423
    5            0.07     0.08     0.098

Table 10.6. Simulated 5% left and right critical values of the empirical distribution of the test statistic R^{(n)}_{x,y} under the null hypothesis of linear cointegration (model 1).

    cv \ n         100        224        500        1000
    left (c_l)     0.10714    0.0798     0.05161    0.03575
    right (c_τ)    0.39063    0.17740    0.15854    0.10544

Finally, the nonlinear cointegration experiment was also run for pairs of nonlinearly cointegrated time series with square-integrable transformations such as

g(y_t) = 1 / (1 + y_t^2),
g(y_t) = sin(y_t),        (10.25)
g(y_t) = cos(y_t),

which are known to have a "stationarising" effect on the transformed series (since the variance of the transformed variable becomes bounded). As expected, in either of these cases the test has no power.

The previous problems of the correlation range statistic led us to investigate the finite-sample performance of a testing device which combines the information given by the correlation statistic ρ^{(n)}_{x,y} with that provided by the statistic R^{(n)}_{x,y} defined in (10.6). The 5% critical values of the empirical distribution of R^{(n)}_{x,y} were simulated from 10000 Monte Carlo replications of linear cointegration with Nid(0, 1) distributed errors, and for different fixed sample sizes (n = 100, 224, 500 and 1000). Table 10.6 shows the estimated left (c_l) and right (c_τ) 5% critical values.

The power of a test based on R^{(n)}_{x,y} was estimated for the different models considered above, from 10000 replications of each, and for the sample sizes n = 100, 224, 500 and 1000. For model 3, we analysed the power behaviour for different values of the AR(1) coefficient b, going from 0.6 to 0.99. We also studied the power of R^{(n)}_{x,y} against pairs of I(0) monotonically nonlinearly comoving series using the new model 6 defined below. Again, the critical values of this statistic decrease rapidly with increasing sample size, indicating that the test statistic needs to be corrected by the speed of convergence. Therefore, we will restrict the analysis to the performance of our test statistic for several fixed small samples.

Model 6a (I(0), Quadratic Comoving)
x_t = 0.6 x_{t-1} + e_{t,1},
y_t = 0.5 x_t^2 + e_{t,2}.

Model 6b (I(0), Logarithmic Comoving)
x_t = 0.6 x_{t-1} + e_{t,1},
y_t = log(x_t + 1000) + e_{t,2}.

Model 6c (I(0), Exponential Comoving)
x_t = 0.6 x_{t-1} + e_{t,1},
y_t = exp(x_t / 100) + e_{t,2}.

The results are shown in Table 10.7.

Table 10.7. Estimated power for the test based on R^{(n)}_{x,y} against different alternatives.
[The entries of this table could not be recovered from the source.]

[Pages are missing in the source here: the remainder of Chapter 10 and the opening of Chapter 12, "A Macrodynamic Model of Real-Financial Interaction" by Carl Chiarella, Peter Flaschel and Willi Semmler. The text resumes in the middle of the local stability analysis of the 3D dynamics (12.22)-(12.24):]

J_3 = \begin{vmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{vmatrix} > 0, \qquad
J_2 = \begin{vmatrix} J_{11} & J_{13} \\ J_{31} & J_{33} \end{vmatrix}
    = \begin{vmatrix} -(1-c)β_y & i(β_y - y) \\ β_{p_e}[(1-u) - k/h_1] & β_{p_e}[-r_0 - n/β_{p_e}] \end{vmatrix},

and the latter holds since the off-diagonal element can then be considered the dominant element in the determinant J_2 if the parameter h_1 is sufficiently small. For the determinant of J we furthermore find that

a_3 = -\det J = -\begin{vmatrix} - & 0 & + \\ 0 & 0 & - \\ - & + & - \end{vmatrix} = J_{32} J_{11} J_{23} > 0.

Finally, we have for the composite coefficient of the Routh-Hurwitz conditions for local asymptotic stability:

b = a_1 a_2 - a_3 = -(J_{11} + J_{33})(J_{32} J_{23} + \ldots) - J_{11} J_{23} J_{32} > 0,

i.e., the positive terms in the a_1 a_2 expression include the negative of the determinant of J and thus dominate the negative term -J_{32} J_{11} J_{23}. We thus in sum get:

Proposition 2. Assume β_y > y and h_1 < k/(1-u). Then: The interior steady state of the dynamics (12.22)-(12.24) is locally asymptotically stable.

Proof. See the above and Brock and Malliaris (1989) or Gantmacher (1959) for the formulation of the Routh-Hurwitz conditions for local asymptotic stability. □

Remark. The above proof of proposition 2 shows that stability problems can only arise due to the term J_2, or even more specifically due to -(1-u)i(β_y - y) < 0, in the condition a_2 > 0 if β_y > y holds true. This term can for example be made the dominant one in a_2 by choosing h_1 and β_y sufficiently large and the parameter c sufficiently close to 1. Another channel for overthrowing the stability of the dynamics is considered in the following proposition.

Proposition 3. Assume (1-u)(β_y - y) > i m_o/h_1 + (1-c)β_y r_o + (β_y - y)ik/h_1 by appropriate choices of the parameters h_1 and c. The local stability found in the preceding proposition gets lost by way of a Hopf bifurcation (i.e., in a cyclical fashion) if the parameter β_{p_e} is made sufficiently large.


Proof. The assumption made implies that the coefficient a_2 depends negatively on β_{p_e}. Since this functional dependence is a linear one, we thus get that a_2 must become negative for β_{p_e} chosen sufficiently large. We note that the Hopf bifurcation must occur before a_2 = 0 has been reached, since b must change sign before this situation occurs. The speed condition of the Hopf bifurcation theorem at the Hopf bifurcation point finally follows from Orlando's formula (see Gantmacher (1959)),

b = -(λ_1 + λ_2)(λ_1 + λ_3)(λ_2 + λ_3),

for the three eigenvalues λ_i of the Jacobian J. At the Hopf bifurcation we have (by choosing eigenvalues in an appropriate order) that λ_1 is negative and λ_2 + λ_3 = 0. Orlando's formula thus implies at the Hopf bifurcation point

b'(β_{p_e}) = -2|λ_1 + λ_2|^2(β_{p_e}) (Re λ_2)'(β_{p_e}),

where |·| denotes distance measurement in the complex plane and Re the real part of eigenvalues. Eigenvalues thus cross the imaginary axis with positive speed, (Re λ_2)'(β_{p_e}) > 0, if and only if b'(β_{p_e}) < 0 holds true. The latter condition is however obvious, since the parameters in front of β_{p_e} in the expressions forming b have to be negative due to the assumption made and the fact that trace and determinant of J are also linear functions of β_{p_e} with only negative parameter values. See Benhabib and Miyao (1981) for a related consideration of Orlando's formula, and Wiggins (1990) and Strogatz (1994) for presentations of the Hopf bifurcation theorem. □

Remark. In the case β_y < y we can always get a Hopf bifurcation as considered in proposition 3 by simply choosing h_1 sufficiently small.

Remark. As in the original approach of Blanchard (1981) we have the ẏ = 0 and q̇ = 0 isoclines, and now in addition an ṁ = 0 isocline, which are determined as follows:

ẏ = 0:   q = 1 + \frac{(1-c)y + c(δ+t) - (δ+n+g)}{(1 - y/β_y)i} = q_1(y),
q̇ = 0:   q = \frac{ρ}{r} = \frac{(1-u)y - δ}{r_o + (ky - m)/h_1} = q_2(y, m),
ṁ = 0:   q ≡ 1.

The first isocline is the IS-curve, while the second one represents the LM-curve of Blanchard's approach. Such isoclines however no longer matter in the following stability analysis of the present dynamics, which means that we do not need to care about the slopes of the IS- and LM-curves, so that the distinction between good news and bad news cases no longer arises as it did in Blanchard (1981).

Proposition 4. Assume the situation considered in proposition 2. The same proposition on local asymptotic stability then holds also for the 4D dynamics (12.22)-(12.25) for all parameters β_z sufficiently small.


Proof. The determinant of the 4D Jacobian J can be reduced to

|J| = \begin{vmatrix} J_{11} & J_{12} & J_{13} & J_{14} \\ J_{21} & J_{22} & J_{23} & J_{24} \\ J_{31} & J_{32} & J_{33} & J_{34} \\ 0 & 0 & 0 & J_{44} \end{vmatrix},

with J_{44} = β_z(β_{p_e} - (1 - κ_{p_e})) < 0 by assumption, by adding appropriate multiples of its second and third rows to its fourth row. There follows:

sign|J| = -\begin{vmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{vmatrix} > 0.

For small β_z we thus have three eigenvalues close to those of the Jacobian J of the first proposition, which thus must all have negative real parts. See Sontag (1990) for the theorem that eigenvalues depend continuously on the parameters of the dynamics. Since |J| > 0 for the 4D case, the fourth eigenvalue of J must then be negative in addition, since the determinant is given by the product of all eigenvalues. □

Proposition 5. The local stability found in the preceding proposition 4 gets lost by way of a Hopf bifurcation if β_{p_e} > 1 - κ_{p_e} holds and if β_z is made sufficiently large.

We conjecture from numerical experience with such dynamics that the limit cycles implied by the Hopf bifurcation only exist locally and give way to purely explosive dynamics fairly soon after the Hopf bifurcation point has been passed. The situation considered in proposition 5 will therefore basically be one of financial acceleration, with (rapid) departure from a situation of return parities towards more and more increasing return differentials. Sooner or later such an accelerating process must come to an end, in particular since agents expect it to turn and thus become more and more cautious in their market transactions, thereby reducing the speed of adjustment in the stock market. This may create the type of situation considered in the proposition, possibly with basins of attraction that are sufficiently large.

Proof. Same as in proposition 3, if one notes that J_{44} = β_z(β_{p_e} - (1 - κ_{p_e})) > 0 holds true in the assumed situation. □

Remark. Demonstrating the Routh-Hurwitz conditions for the characteristic polynomial of the Jacobian J at the steady state of the full 4D dynamics is generally not at all an easy task, since these conditions on the coefficients a_j of this polynomial λ^4 + a_1λ^3 + a_2λ^2 + a_3λ + a_4 then read

a_j > 0 (j = 1, 2, 3),    b_o = a_1 a_2 - a_3 > 0,    b_1 = a_3 b_o - a_1^2 a_4 > 0,

and since the principal minors to be calculated for the determination of the coefficients a_2 and a_3 are now 6 and 4 in number, respectively. It is however not difficult to show for the


dynamics considered here that all a_j must be positive (in fact all principal minors of order two and three are nonnegative in the considered situation) if β_y > y, h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e}, due to the fact that the sign structure of the Jacobian in this situation is given by

|J| = \begin{vmatrix} - & 0 & + & 0 \\ 0 & 0 & - & 0 \\ - & + & - & + \\ - & + & - & - \end{vmatrix}.

We furthermore note that in this case the positivity of b_o is implied by the positivity of b_1. Looking at the expression b_1 = a_3(a_1 a_2 - a_3) - a_1^2 a_4 for large β_z, we thus get the following proposition:

Proposition 6. Assume β_y > y, h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e}. The steady state of the 4D dynamics (12.22)-(12.25) is locally asymptotically stable for all parameters β_z sufficiently large.

We claim, but cannot prove this here, that the proven stability is in fact generally not restricted to be close to the steady state. Basins of attraction may indeed be quite large, and in particular at least sufficiently large such that a convergent process is established if the accelerator situations considered in the preceding proposition come to rest through a decline in the speed of adjustment of the stock market that allows for the application of proposition 5.

Proof. We have to show that b_1 = a_3(a_1 a_2 - a_3) - a_1^2 a_4 > 0 holds true in such a situation. We first note in this respect that a_4 is given by a_3^{3D}(-J_{44}), where a_3^{3D} denotes the corresponding Routh-Hurwitz coefficient of the 3D dynamics, see the proof of proposition 4, i.e.,

a_3^{3D} = -\begin{vmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{vmatrix}.

There follows that

\begin{vmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{vmatrix} (J_{44})^3

gives the dominant term for the β_z influence in a_1^2 a_4 when this parameter is made sufficiently large (J_{44} = β_z(β_{p_e} - (1 - κ_{p_e})) < 0). This term is however included among the product terms to be found in a_3 a_1 a_2 and can thus be removed from consideration. There are however further expressions of the form const·(β_z)^3 in the products that form a_3 a_1 a_2, and none of this type in -(a_3)^2. This implies that b_1 is a polynomial in β_z of degree three with a positive coefficient in front of (β_z)^3, which implies the assertion. □

Remark. It is also easy to show (at least for n equal to zero) that the Routh-Hurwitz condition b_o > 0 holds for the 4D dynamics without any restriction on the parameter β_z,


but of course with β_y > y, h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e}. To demonstrate this it suffices to note that three of the four minors of order 3 of the Jacobian J (the fourth is zero) reappear with opposite sign in the products that form the expression a_1 a_2, due to proposition 2 and its proof and due to the partial linear dependency that exists between the entries in the second plus third and the fourth rows of J, which allows us to express these minors as products of second-order and first-order minors (as they reappear in a_1 and a_2). Only the condition b_1 > 0 may thus cause problems for local asymptotic stability, and this only for values of β_z that are bounded away from zero and infinity. It can therefore be expected that the situation where β_y > y, h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e} holds is by and large one of (not only) local asymptotic stability for all speeds of adjustment of capital gains expectations.

Stability for all β_z under the assumptions of proposition 6 can easily be proved, for example, when the link from financial markets to real markets is made sufficiently weak, as the following proposition shows. We therefore indeed conjecture that the conditions β_y > y, h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e} are by and large sufficient to imply local asymptotic stability for all speeds of adjustment β_z. A dynamic multiplier that is sufficiently fast, a Keynes-effect that is sufficiently strong and a stock market adjustment that is sufficiently tranquil can therefore be expected to represent sufficient conditions for the convergence of our 4 state variables y, m, q, z back to the steady state in the case of shocks that throw the dynamics out of the steady state (and that are not too large). Of course, the admissible sizes of the shocks, and thus the basin of attraction of the interior steady state solution, can only be determined numerically for these 4D dynamics.

Proposition 7. Assume h_1 < k/(1-u), β_{p_e} < 1 - κ_{p_e}. The steady state of the 4D dynamics (12.22)-(12.25) is locally asymptotically stable for all parameters β_z if the investment parameter i is sufficiently small.

Proof. Assuming i = 0 implies for the characteristic polynomial of the Jacobian J at the steady state the form (here I is the identity matrix):

|λI - J| = \begin{vmatrix} λ - J_{11} & -J_{12} & 0 & 0 \\ -J_{21} & λ - J_{22} & 0 & 0 \\ -J_{31} & -J_{32} & λ - J_{33} & -J_{34} \\ -J_{41} & -J_{42} & -J_{43} & λ - J_{44} \end{vmatrix}
= \begin{vmatrix} λ - J_{11} & -J_{12} \\ -J_{21} & λ - J_{22} \end{vmatrix} \cdot \begin{vmatrix} λ - J_{33} & -J_{34} \\ -J_{43} & λ - J_{44} \end{vmatrix}.

This implies that the zeros of this polynomial are given by the zeros of the characteristic polynomials of the real and the financial part of the economy considered in isolation. Under the assumptions made, we thus get that three roots of the characteristic polynomial must be negative and one zero, due to the fact that the Jacobians of the real and the financial part of the economy are given by

J_{real} = \begin{pmatrix} - & 0 \\ - & 0 \end{pmatrix} \quad and \quad J_{financial} = \begin{pmatrix} - & + \\ - & - \end{pmatrix}.


Continuity of the eigenvalues with respect to the parameters of the dynamics and the fact that |J| > 0 for the full 4D system then again imply that small changes in i must leave the three negative real parts negative, while the zero eigenvalue must become negative as well in order to give rise to a positive determinant of the Jacobian of the full dynamics at the steady state. □

What has been shown above for large β_z is reflected in the following proposition 8 on the limit case of myopic perfect foresight:

Proposition 8. Assume β_y > y. In the special case of perfect foresight, i.e., for the system

ẏ = β_y(c(y - δ - t) + i(q-1) + δ + n + g - y) - i(q-1)y,
m̂ = -i(q-1),
q̂ = \frac{1}{1 - β_{p_e} - κ_{p_e}} \left\{ β_{p_e} \left[ \frac{(1-u)y - δ}{q} - \Big(r_0 + \frac{ky - m}{h_1}\Big) \right] + \frac{1-q}{q}\,(i(q-1) + n) \right\},

we have local asymptotic stability at the interior steady state of these dynamics if h_1 is sufficiently small and if β_{p_e} < 1 - κ_{p_e} holds true.

Proof. For the Jacobian of these dynamics at the steady state we have the sign structure

\begin{pmatrix} - & 0 & + \\ 0 & 0 & - \\ - & + & - \end{pmatrix},

where the only difference to the Jacobian in proposition 2 is given by the fact that the third row in this Jacobian is now multiplied by 1/(1 - β_{p_e} - κ_{p_e}).⁵ This implies the assertion of the proposition. □

⁵ Up to the term -n in the entry J_{32}.

Note that the isoclines of these perfect-foresight dynamics are the same as the ones discussed above and that they again are of not much importance in discussing the stability properties of these dynamics. Furthermore, perfect foresight is also not of central importance for these properties. Instead, as long as there is a sufficiently sluggish adjustment of share prices (a cautious stock market, combined with a fast dynamic multiplier and a strong Keynes-effect), we have asymptotic stability, and thus not the saddlepoint dynamics of the Blanchard (1981) paper. The question therefore has become whether markets in disequilibrium may react in this way or not. We suggest that they may not always react in this way, but may indeed be forced to adopt this type of behavior when disequilibria become too large.

Remark: The situation of explosive dynamics can be integrated with the situation of convergent dynamics by way of a regime switching methodology as considered in Chiarella, Flaschel and Semmler (2003). The system can thereby be made a viable one even in the case of myopic perfect foresight by allowing it to run through certain sequences of bull and bear markets which sometimes are tranquil, producing convergence, and sometimes activated and diverging dynamics.


Let us finally discuss the extreme limit case where both adjustment speeds β_{p_e} and β_z are set equal to infinity, i.e., where we have perfect foresight and perfect substitutes at one and the same time. The system then reads:

m̂ = -i(q-1),        (12.32)
ẏ = β_y(c(y - δ - t) + i(q-1) + δ + n + g - y) - i(q-1)y,        (12.33)
q̂ = r_0 + \frac{ky - m}{h_1} - \frac{(1-u)y - δ}{q} + \frac{1-q}{q}\,(i(q-1) + n).        (12.34)

In this case we get as Jacobian

J = \begin{pmatrix} 0 & 0 & -i m_o \\ 0 & β_y(c-1) & i(β_y - y) \\ -1/h_1 & k/h_1 - (1-u) & -r_o - n \end{pmatrix}.

It is easily shown that det J < 0 and trace J < 0 must be true if β_y is sufficiently large. Furthermore, J_3 = 0 and J_2 > 0 are always true, while the sign of

J_1 = β_y(c-1)(-r_o - n) - i(β_y - y)(k/h_1 - (1-u))

is ambiguous. We assume in the following that this determinant has a positive sign. This implies that all Routh-Hurwitz conditions for local asymptotic stability hold, since det J is part of the term trace J · (J_1 + J_2 + J_3). We thus get in sum the proposition:

Proposition 9. The limit case β_{p_e} = β_z = ∞ exhibits a unique steady state equilibrium which is locally asymptotically stable if the dynamic multiplier is sufficiently fast and if h_1 is sufficiently large.

The limiting dynamics investigated in Blanchard (1981) thus need not exhibit a problematic steady state situation (none or two steady states) and saddlepoint dynamics if the parameters of the dynamics are chosen in the above way. The dynamics of the capital stock may be slow relative to the other adjustment processes, but this framework avoids the need for a discussion of short-run equilibria (good news and bad news cases) as in Blanchard (1981) that may be surrounded by saddlepoint dynamics. Note however that the Keynes effect must be weak in the present case, instead of being strong as before, due to the fact that the case of perfect substitutes reverses the sign of the entry where the sign is ambiguous. Otherwise, the saddlepoint dynamics considered in Blanchard (1981) for the state variables y and q reappears in this extended dynamical system.
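Such Routh-Hurwitz checks are easy to automate. The sketch below evaluates the coefficients and the eigenvalues of the limit-case Jacobian numerically; the parameter values are entirely hypothetical, and whether the stability conditions hold depends on the constellation chosen:

```python
import numpy as np

# Hypothetical parameter values (purely illustrative, not from the chapter).
beta_y, c, i, y = 5.0, 0.8, 0.2, 0.5
h1, k, u, r0, n, m0 = 2.0, 0.25, 0.7, 0.05, 0.03, 0.25

# Limit-case Jacobian (beta_pe = beta_z = infinity), state ordering (m, y, q).
J = np.array([
    [0.0,        0.0,                 -i * m0],
    [0.0,        beta_y * (c - 1.0),  i * (beta_y - y)],
    [-1.0 / h1,  k / h1 - (1.0 - u),  -r0 - n],
])

# Routh-Hurwitz coefficients for lambda^3 + a1*lambda^2 + a2*lambda + a3:
a1 = -np.trace(J)
a2 = sum(np.linalg.det(J[np.ix_(p, p)]) for p in [(1, 2), (0, 2), (0, 1)])  # J1+J2+J3
a3 = -np.linalg.det(J)
b = a1 * a2 - a3
print(f"a1 = {a1:.4f}, a2 = {a2:.4f}, a3 = {a3:.4f}, b = {b:.4f}")
print("Routh-Hurwitz satisfied:", (a1 > 0) and (a2 > 0) and (a3 > 0) and (b > 0))
print("eigenvalues:", np.linalg.eigvals(J))
```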

12.5. Outlook: Jump-Variable Conundrum vs. Global Boundedness through Switching Phase Diagrams in the Real-Financial Interaction

In this section we briefly reconsider the 3D case of myopic perfect foresight (β_z = ∞) in the case where a saddlepoint dynamics is found to surround the steady state (where therefore proposition 8 does not apply), in which case conventional wisdom would apply the rational expectations solution procedure to find the actual dynamics of the model, i.e., would apply the jump-variable technique in order to place the economy always into the region where convergence back to the steady state is ensured after a shock has hit the economy. In Chiarella, Flaschel, Franke and Semmler (2001), see also Chiarella, Flaschel and Semmler (2003), we have discussed the 2D Blanchard (1981) model of the real-financial interaction, extended in this paper by the capacity effect of investment behavior and consistent financing conditions, in detail from its short-run perspective (excluding capital stock changes), obtaining there, in one of its typical scenarios, the type of phase diagram shown in Figure 12.1, exhibiting two interior steady states, one a saddle and one a stable node or focus. This phase diagram allows the application of the jump-variable technique only from a local perspective, and it also ignores the fact that there is one unstable arm in the upper equilibrium that leads the economy to the lower one, i.e., one that also allows the application of the jump-variable technique on the basis of its assumption that agents only choose converging paths when they adjust the share price p_e, or in this way Tobin's q, if a shock throws the economy out of its steady state. The short-run analysis of these earlier papers therefore implies, in the considered situation, that the jumps imposed by the jump-variable technique neither necessarily lead the economy towards E_1, nor are they really needed if the trajectories are already situated in the basin of attraction of E_2.

Figure 12.1. Dynamics in Blanchard’s good news case: saddle point and stable node or focus.

The jump-variable technique faces a variety of other problems; see for example Asada, Chiarella, Flaschel and Franke (2003) for a recent summary and also for references on further arguments concerning this issue. The present paper allows us to add one further aspect to our critique of the arbitrariness of this rational expectations solution of myopic perfect foresight assumptions and the saddlepoint behaviour they may imply. If capital accumulation is taken into account in the short-run Blanchard good news case briefly sketched above,


then the two depicted steady state situations are simply no longer there, but start moving more or less rapidly, depending on the strength of the investment behavior of the original Blanchard (1981) model. How would rational expectations agents behave in this extended situation? Would they ignore capital accumulation (and the lower stable equilibrium that may then exist) and attempt to jump onto one of the stable arms of the equilibrium E_1? This is now much less easily done, since this equilibrium, and thus the whole stable arm, is now moving in time. Agents are thus assumed to be even more capable of judging the whole situation they are living in than in the situations normally considered by the jump-variable technique. They completely control, in the usual way, the situation for any given value of m, and thus in the plane into which this restriction leads them. Furthermore, they know perfectly the movements to which this plane is subject and recalculate the dislocation of their stable arm during the movements of this 3D space. But which path are they then following, and to where will this path finally lead them? Being in each moment of time on a stable arm (one that is however fading away) does not directly imply anything for the resulting path they are following, in particular if further exogenous shocks occur. We conclude from this situation that the jump-variable technique should instead be applied to the full 3D dynamics and the stable arms these may exhibit. But this would imply that the original analysis of Blanchard (1981) is completely pointless, since the stable arms in the 3D case need not have anything in common with the ones of the original 2D dynamics. The jump-variable solution therefore depends critically on what dynamics agents take into consideration and what they leave out of their conception of the world. Simple changes concerning the agents' view of the dimension of the world in which they live (here from 2D to 3D or vice versa, just by adding or subtracting the role of capital accumulation that is intrinsic to the model) thus change the dynamics in a radical way and make the modeling of the economic world in which agents live very erratic or even futile. Needless to say, such radical overhauls of the dynamics of the real world are not supported by the facts. We thus believe that the model of the present paper, when applied to the case of myopic perfect foresight in a situation where saddlepoint dynamics can still be proved to exist from the local point of view, should not be analysed via the jump-variable technique, but instead with one of the alternatives we have introduced in Chiarella, Flaschel and Semmler (2003) and Chiarella, Flaschel, Franke and Semmler (2003), in order to obtain viable, globally bounded dynamics also in the case of local saddlepoint situations. We thus end this paper with a brief description of the phase diagram switching methodology first introduced in the two papers just cited.

Let us assume κ_{p_e} = 0 for reasons of simplicity. On the basis of propositions 6 and 7 and the observations accompanying them, we expect that the situation of a sufficiently strong dynamic multiplier and a sufficiently strong Keynes-effect (and possibly also a small parameter i) will imply convergence back to the steady state if β_{p_e} < 1 holds true, that is, when the stock market exhibits a cautious type of share price adjustment (due to the expectation of a turning point in stock price movements and a low trading volume in the stock market). Tobin's q and share prices p_e are then indeed slowly moving back to their steady state values, accompanied by nearly perfect foresight with respect to capital gains if adaptive expectations are adjusted with sufficient speed. Such a situation of tranquillity


may however slowly increase the trading volume in the stock market, since agents become less cautious. Due to fast adaptive expectations, this will imply an explosive movement accelerating away from the steady state as soon as the parameter β_{p_e} has become larger than 1. Though explosive, the dynamics are cyclical in nature and thus may produce turning points that induce economic agents to become cautious again, which re-establishes the stability of the dynamics. We stress that this scenario is not subject to arbitrary changes when the dimension of the considered dynamics is changed, but is of course subject to considerable change should the mood in the financial markets change in the way just described. In this way we may expect the dynamics to switch back and forth between periods of tranquillity and convergence and periods of accelerating activity and divergence, in an unpredictable way, through phase diagram switches as they were analyzed in detail for the isolated q, z dynamics in the works just quoted. Due to the higher dimension of the dynamics considered in the present paper, it is however not directly possible to repeat this analysis here in its details. Instead we may have to rely here on numerical demonstrations of the phase diagram switching methodology, to be applied on the basis of propositions 5, 6 and 7 of the preceding section.

12.6. Appendix: Adding the Dynamics of the Government Budget Constraint

We have so far assumed that taxes net of interest payments of the government are constant per unit of capital. This is a convenient assumption as long as fiscal policy and the dynamics of the GBR are considered a secondary issue and analysis is thus concentrated on the private sector of the economy. However, this distorts asset returns after taxes in a specific way that must be assumed to be unobserved by private agents, in particular in the case where perfect asset substitution is assumed. Sooner or later such an assumption on tax collection, which suppresses interest payments in the budget constraints of households and the government, must be removed. In this appendix⁶ we only show in this regard the two changes that are implied for the dynamics of the body of the paper, and leave the analysis of the resulting 5D dynamics for future research. In place of (T - rB)/K = const we now assume, in close correspondence to the assumption made on government expenditures, that only T/K = const holds. Fiscal policy is thus again treated by means of simple parameters in the intensive form of the dynamics, but no longer by ones that suppress interest payments of the government in the resulting laws of motion.

⁶ This appendix builds on the model in the body of the paper where the government interest payments to households and the dynamics of the GBR were still suppressed by means of an appropriate tax collection rule. Its aim therefore is to investigate the consequences of a full integration of all budget constraints, in particular that of the government, into a stock market augmented real-financial interaction of the IS-LM type.

12.6.1. Intensive Form of the Model

12.6.1.1. The State Variables

As in the body of the paper, we do not yet consider growth apart from the exogenous labor force growth at rate n in the present formulation of the real-financial interaction. We express everything in per-unit-of-capital form in order to derive the laws of motion of the dynamics extended by an explicit treatment of the interest payments of the government. The state variables are thus

y = Y/K,  m = M/K,  q = p_e E/K,  z = \hat{p}_e^e,  b = B/K.

Note that we have to make use again of the term Ê - K̂ = ((1-q)/q)K̂ in order to obtain the law of motion of Tobin's q from the law of motion of the share price p_e.

12.6.1.2. The Revised Laws of Motion

The laws of motion of the extended dynamics are the following ones:

ẏ = β_y(c(y - δ + rb - t) + i(q-1) + δ + n + g - y) - i(q-1)y,        (12.35)
m̂ = -i(q-1),        (12.36)
q̂ = β_{p_e}\left[\frac{(1-u)y - δ}{q} + z - \Big(r_0 + \frac{ky - m}{h_1}\Big)\right] + κ_{p_e} z + \frac{1-q}{q}\,(i(q-1) + n),        (12.37)
ż = β_z\left(β_{p_e}\left[\frac{(1-u)y - δ}{q} + z - \Big(r_0 + \frac{ky - m}{h_1}\Big)\right] - (1 - κ_{p_e})z\right),        (12.38)
ḃ = g + rb - t - μ̄m - [i(q-1) + n]b.        (12.39)

We now have interest income per unit of capital, rb, in the definition of disposable income used to determine the consumption expenditures of households. This simple change in the consumption function employed in equation (12.35) means that the dynamics of government debt now feeds back into the rest of the dynamics and can therefore no longer be ignored in their investigation. The dynamics of the GBR is shown in equation (12.39), which, up to the explicit representation of the interest payments of the government, is still of a fairly simple type, since g, t and μ̄ are given parameters of the model. The GBR in intensive form is derived from eqn. (4) in the body of the paper by making use of the relationship b̂ = (Ḃ/K)(1/b) - K̂, or ḃ = Ḃ/K - K̂b. Note with respect to this form of the GBR that μ̄m = Ṁ/K stands for that part of the government deficit g + rb - t that is financed by money (through open market operations of the monetary authority, which issues money in view of real growth n = μ̄ by buying government bonds). Note furthermore that the last item in this GBR is solely due to the fact that everything is expressed in per-unit-of-capital form. It implies that the dependence of ḃ on b is given by r_o - n at the steady state, and is thus an explosive one if the steady state rate of interest exceeds the natural rate of growth. This tendency towards accelerating government debt may however be checked by stabilizing forces deriving from the rest of the dynamics.
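As a purely numerical illustration of the extended dynamics (12.35)-(12.39), a forward-Euler sketch follows; all parameter values and the initial state are hypothetical, and no safeguards against divergence are included:

```python
import numpy as np

# Hypothetical parameter values for the 5D system (12.35)-(12.39).
p = dict(beta_y=0.8, beta_pe=0.5, beta_z=0.5, kappa_pe=0.2, c=0.8, i=0.2,
         delta=0.1, n=0.03, g=0.3, t=0.25, u=0.7, k=0.25, h1=0.2, r0=0.05, mu=0.03)

def f(s, p):
    y, m, q, z, b = s
    r = p["r0"] + (p["k"] * y - m) / p["h1"]          # nominal rate of interest
    rho_q = ((1 - p["u"]) * y - p["delta"]) / q       # profit rate per unit of q
    excess = p["beta_pe"] * (rho_q + z - r)           # return differential term
    ydot = p["beta_y"] * (p["c"] * (y - p["delta"] + r * b - p["t"])
                          + p["i"] * (q - 1) + p["delta"] + p["n"] + p["g"] - y) \
           - p["i"] * (q - 1) * y                     # (12.35)
    mdot = -p["i"] * (q - 1) * m                      # (12.36), m^ = -i(q-1)
    qdot = q * (excess + p["kappa_pe"] * z
                + (1 - q) / q * (p["i"] * (q - 1) + p["n"]))   # (12.37), q^ * q
    zdot = p["beta_z"] * (excess - (1 - p["kappa_pe"]) * z)    # (12.38)
    bdot = p["g"] + r * b - p["t"] - p["mu"] * m \
           - (p["i"] * (q - 1) + p["n"]) * b          # (12.39)
    return np.array([ydot, mdot, qdot, zdot, bdot])

s = np.array([1.0, 0.25, 1.0, 0.0, 0.5])   # hypothetical initial state (y,m,q,z,b)
dt = 0.01
for _ in range(20000):
    s = s + dt * f(s, p)
print("state after integration:", s)
```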


We stress again that the limit case considered on a less complete level (since the law of motion for real balances is disregarded) in Blanchard (1981) is recovered by assuming β_{p_e} = β_z = ∞, i.e., by assuming perfect substitutes and perfect foresight. The special case of perfect foresight considered in Blanchard (1981) reads for the present model:

ẏ = β_y(c(y - δ + rb - t) + i(q-1) + δ + n + g - y) - i(q-1)y,        (12.40)
m̂ = -i(q-1),        (12.41)
q̂ = \frac{1}{1 - β_{p_e} - κ_{p_e}}\left\{β_{p_e}\left[\frac{(1-u)y - δ}{q} - \Big(r_0 + \frac{ky - m}{h_1}\Big)\right] + \frac{1-q}{q}\,(i(q-1) + n)\right\},        (12.42)
ḃ = g - t - μ̄m - [i(q-1) + (n-r)]b.        (12.43)

Note that the steady state solution of these dynamics is identical to the one of the 5D system (the variable z is now disregarded). Assuming in addition bonds and equities to be perfect substitutes (i.e. β_{p_e} = ∞) furthermore implies the following special case of the above 4D dynamics:

ẏ = β_y(c(y - δ + rb - t) + i(q-1) + δ + n + g - y) - i(q-1)y,        (12.44)
m̂ = -i(q-1),        (12.45)
q̂ = r_0 + \frac{ky - m}{h_1} - \frac{(1-u)y - δ}{q} + \frac{1-q}{q}\,(i(q-1) + n),        (12.46)
ḃ = g - t - μ̄m - [i(q-1) + (n-r)]b.        (12.47)

12.6.2. Steady State Determination (m ≠ 0, q ≠ 0)

With regard to the steady states of the 5D dynamics (12.35)-(12.39), we have the following proposition.

Proposition 10. The interior steady state(s) of the laws of motion (12.35)-(12.39) is determined by the following set of equations:

q_0 = 1,  z_0 = 0,  m_0 = k y_0,
b_0 = \frac{g - t - n m_0}{n - r_0},
r_0 = (1-u)y_0 - δ,
y_0 = \frac{1}{1-c}\big(δ + n + g + c(r_0 b_0 - δ - t)\big).

Note that the last four of these steady state conditions are fully interdependent and not easily solved. They already indicate that an analysis of the dynamics with a GBR feedback into the private sector of the economy is not an easy matter, even at the level of steady state analysis. This difficulty becomes even more pronounced once the issue of the stability of the resulting steady state is approached.
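The interdependence just noted suggests solving the steady-state conditions numerically; a sketch using a standard root finder follows, with hypothetical parameter values (for some constellations no economically meaningful root exists):

```python
import numpy as np
from scipy.optimize import fsolve

# Hypothetical parameter values; whether a meaningful root (e.g. r0 != n, y0 > 0)
# exists depends entirely on this constellation.
c, delta, t, g, n, u, k = 0.8, 0.1, 0.25, 0.3, 0.08, 0.75, 0.25

def steady_state_conditions(v):
    y0, b0 = v
    m0 = k * y0                        # m0 = k*y0
    r0 = (1.0 - u) * y0 - delta        # r0 = (1-u)*y0 - delta
    return [b0 * (n - r0) - (g - t - n * m0),                              # b0 eq.
            (1.0 - c) * y0 - (delta + n + g + c * (r0 * b0 - delta - t))]  # y0 eq.

sol, info, ier, msg = fsolve(steady_state_conditions, x0=[1.0, 0.2], full_output=True)
print("y0, b0 =", sol, "| converged:", ier == 1, "| residual:", info["fvec"])
```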


References

[1] Asada, T., Chiarella, C., Flaschel, P. and R. Franke (2003): Open Economy Macrodynamics. An Integrated Disequilibrium Approach. Heidelberg: Springer.
[2] Benhabib, J. and T. Miyao (1981): Some new results on the dynamics of the generalized Tobin model. International Economic Review, 22, 589-596.
[3] Blanchard, O.J. (1981): Output, the stock market, and interest rates. American Economic Review, 71, 132-143.
[4] Blanchard, O.J. and S. Fischer (1989): Lectures on Macroeconomics. Cambridge, Mass.: The MIT Press.
[5] Brock, W.A. and A.G. Malliaris (1989): Differential Equations, Stability and Chaos in Dynamic Economics. Amsterdam: North-Holland.
[6] Chiarella, C., Flaschel, P., Franke, R. and W. Semmler (2001): Output, interest and the stock market. An alternative to the jump-variable technique. Bulletin of the Czech Econometric Society, 13, 1-30.
[7] Chiarella, C., Flaschel, P., Franke, R. and W. Semmler (2002): Stability analysis of a high-dimensional macrodynamic model of the real-financial interaction: a cascade of matrices approach. Working paper: UTS Sydney.
[8] Chiarella, C., Flaschel, P., Franke, R. and W. Semmler (2003): Output and the term structure of interest rates: Ways out of the jump-variable conundrum. Working paper: UTS Sydney.
[9] Chiarella, C., Flaschel, P. and W. Semmler (2004): Real-financial interaction. A reconsideration of the Blanchard model with a state-of-market dependent reaction coefficient. In: W. Barnett, C. Deissenberg and G. Feichtinger (Eds.): Economic Complexity: Nonlinear Dynamics, Multi-Agent Economies, and Learning. ISETE Series, 14, Amsterdam: Elsevier.
[10] Gantmacher, F.R. (1959): Applications of the Theory of Matrices. New York: Interscience Publishers.
[11] Sontag, E.D. (1990): Mathematical Control Theory: Deterministic Finite Dimensional Systems. New York: Springer.
[12] Strogatz, S.H. (1994): Nonlinear Dynamics and Chaos. New York: Addison-Wesley.
[13] Wiggins, S. (1990): Introduction to Applied Nonlinear Dynamical Systems and Chaos. Heidelberg: Springer.

In: Progress in Financial Markets Research
Editors: C. Kyrtsou and C. Vorlow, pp. 263-288
ISBN: 978-1-61122-864-9
© 2012 Nova Science Publishers, Inc.

Chapter 13

Modelling Benchmark Government Bonds Volatility: Do Swaption Rates Help?

Christian L. Dunis†* and Freda L. Francis‡
† Liverpool Business School and Centre for International Banking, Economics and Finance, JMU, Liverpool, UK
‡ Centre for International Banking, Economics and Finance, JMU, Liverpool, UK

13.1. Introduction

If volatility fluctuates in a forecastable way, then volatility forecasts are useful in the management of risk, e.g. for putting together option hedging programs, assessing Value at Risk, etc.; hence the interest in volatility forecasting in the risk management literature. A revolution in modelling and forecasting volatility began some two decades ago with Engle (1982). As that literature has matured, and as our abilities in computation and simulation have advanced, it has fuelled the development of powerful risk-management methods and software.

In the context of the use of bond options by market participants, having the best volatility prediction has become even more crucial. Precisely because there will never be such a thing as unanimous agreement on the future volatility estimate, market participants with a better view/forecast of the evolution of volatility will have an edge over their competitors. The higher the volatility perceived by market participants, the further away up or down the bond price may vary. In practice, those investors/market participants who can reliably predict volatility should be able to control financial risks better and, at the same time, profit from their superior forecasting ability.

The main motivation for this paper¹ is to focus on the forecasting ability of alternative volatility models applied to the US and the German 10-year benchmark Government bonds

and to check whether implied volatility data obtained from the swaps market, or 'swaption rates', add value in terms of forecasting accuracy². This research also differs from earlier work as, to the best of our knowledge, it is the first time that the potential added value from implied volatilities is tested in the context of government bonds. We retain three time horizons: 1-day, 5-day (or 1 trading week) and 21-day (or 1 trading month). We also wish to assess whether new nonlinear modelling techniques such as Neural Network Regression (NNR) models and model combination can help to forecast bond volatility better.

The forecasting ability of time series models such as GARCH(p,q), ARMA(p,q), stochastic variance and NNR models is assessed using traditional forecasting accuracy measures such as Root Mean Square Error (RMSE), Mean Absolute Error (MAE), the Theil-U statistic and correct directional change (CDC). These 'pure' time series models are supplemented with implied volatility data obtained from the swaps market thanks to Goldman Sachs, leading to the estimation of 'mixed' time series models. As can be seen from the charts in Appendix 1, there is a strong relationship between historical (realised) volatility and the implied volatility from the swap interest rate market³: over the period 27 January 1997 to 22 April 2002, the instantaneous correlation is 82.1% for the US 10-year T-Bond and 83.0% for the German 10-year Bund. As we aim to investigate whether implied volatility adds value in terms of forecasting accuracy, we add the swaption rate to the different time series models used.

Our volatility forecasts are also benchmarked against a RiskMetrics volatility computation and two naive 'random walk' models: the first naive model states that the best n-step forecast of the conditional variance is its current past n-day average, while the second naive model sets the n-step-ahead forecast of the conditional variance at the current implied volatility level (a small numerical sketch of these benchmark computations is given at the end of this section). Finally, based on the literature showing that forecast combinations often outperform forecasts from individual models, we produce some average and regression-based combinations to check whether this is also true for the bond time series under review and for the period concerned.

All models are developed using the same in-sample data, from 23 January 1997 to 7 June 2001, leaving the remainder period from 8 June 2001 to 22 April 2002 for out-of-sample forecasting. Overall, we conclude that, while it is impossible to identify a single volatility model as the 'best' overall, the evidence strongly indicates that most of our volatility forecasting models offer more precise indications about future bond volatility than swaption rates.

The paper is organised as follows: section 2 presents a brief review of some of the literature relevant to this research; section 3 describes our data, giving their statistical features; section 4 examines the different models that we estimate, giving the definition of both the time series models and the 'mixed' models investigated. Section 5 presents the estimation results for all our volatility models, focusing on out-of-sample results as they constitute the acid test of forecasting efficiency. Finally, section 6 closes this article with a summary of our conclusions.

* E-mail address: [email protected]
¹ This chapter draws heavily on 'The Informational Content of Swaption Rates for USD and EUR Government Bonds Volatility Models' by Dunis and Francis (2004), Derivatives Use, Trading & Regulation, 10, 3, 197-228. We are grateful to Professor John Thompson of Liverpool Business School for helpful comments on an earlier version of this paper. F. Francis also wishes to thank G. Book for his support. The usual disclaimer applies.
² Since the inception of EMU in January 1999, the German Bund is considered as the benchmark bond for the whole Euro area.
³ Swap rates are considered a good proxy for Government bond yields: over the review period, the instantaneous correlation bond yield/swap rate is 91.6% in levels and 75.5% in first differences for the US rates, and respectively 96.6% and 41.4% for the German rates.


Finally, section 13.6 closes this chapter with a summary of our conclusions.

13.2. Literature Review

Accurate volatility measures and good forecasts of future volatility are critical. In response, a voluminous literature has emerged on modelling the temporal dependencies in financial market volatility at different time frequencies using ARCH/GARCH and stochastic volatility type models. While countless time series volatility forecasting models have been proposed, in terms of actual usage by market professionals and textbook attention, two are overwhelmingly the most popular: (1) the sample variance or standard deviation over some recent period, and (2) Bollerslev's (1986) GARCH(1,1) model (RiskMetrics volatility can be considered a special case of GARCH models). The historical variance and GARCH(1,1) both belong to what might be called the linear squared deviation (LSD) class of estimators in that, in both models, the forecast variance is a linear combination of the squared deviations of recent returns from their expected value. In the case of the historical variance, each squared deviation (or observation) in the sample period is weighted equally, while observations prior to the chosen sample cutoff date receive a zero weight. In the GARCH(1,1) model, the weights decline exponentially and there is no cutoff date.

Although the forecasting abilities of the historical variance and ARCH/GARCH models have been compared in the past by Akgiray (1989), Jorion (1995), Brailsford and Faff (1996) and Figlewski (1997), no clear winner has emerged. Akgiray (1989) and Brailsford and Faff (1996) find that ARCH/GARCH models tend to forecast better, though the latter caution that this depends on the measure and time period. Figlewski (1997) finds that the GARCH(1,1) model has a lower root mean squared error in predicting long-period S&P 500 index volatilities, but that the historical variance forecasts better in the interest rate and foreign exchange markets. Jorion (1995) finds that neither dominates in forecasting foreign exchange volatility. While these studies find that the GARCH model's forecasting ability is low, Andersen and Bollerslev (1997) respond that better measures of forecasting ability can be obtained using more frequent sampling, such as intraday data, and that, measured in this manner, GARCH performs much better.

Patterns of bond market volatility forecastability appear to differ from those in equity and foreign exchange markets. Christoffersen and Diebold (1997) examined the forecastability of bond return volatility. Limited availability of historical daily international fixed income data forced them to focus exclusively on the 10-year U.S. Government bond. They used a sample from 1 January 1973 to 1 May 1997, resulting in 6350 daily observations. The results indicated substantially more volatility forecastability than in the equity or foreign exchange markets, with some forecastability out as far as 15-20 trading days. It is hard to determine whether the apparently greater bond market volatility predictability is real, or whether it is an artifact of the approximation used to calculate bond returns. At any rate, their finding that volatility is more forecastable in bond markets than elsewhere is consistent with other existing evidence, including Engle et al. (1987) and Andersen and Lund (1996), as well as the survey by Bollerslev et al. (1992). If volatility is forecastable at the horizons of interest, then volatility forecasts are relevant for risk management.


But their results indicate that, if the horizon of interest is more than ten or twenty days, depending on the bond, then volatility is effectively not forecastable. The assumptions embedded in popular risk management paradigms, by contrast, effectively assume highly persistent volatility dynamics: J.P. Morgan's (1996) RiskMetrics, for example, is built upon exponential smoothing of squared returns, which is roughly equivalent to forecasting volatility with an integrated GARCH specification. West and Cho (1995) found that volatility in foreign exchange markets is unforecastable beyond a 5-day horizon. Their results are consistent with those of Andersen and Bollerslev (1997), who study volatility over an interval, as is relevant, in particular, for options pricing, and question evidence of the sort provided by Figlewski (1994) and Jorion (1995), which seems to indicate that ARCH models provide poor volatility forecasts.

Ederington and Guan (1999) examine the forecasting ability of twenty time series volatility models. They compare forecasting ability across several markets, including stock market indices and individual equities, long- and short-term interest rates, and foreign exchange rates. Except for Figlewski (1997), all previous studies confine their attention to one of these, commonly the stock market or foreign exchange, so it is not clear whether their results can be generalised. Since derivative prices depend on expected volatility, they compare the models' forecasting ability over multi-period horizons, specifically 10, 20, 40, 80 and 120 trading days. They find that models based on absolute return deviations generally forecast better than equivalent models based on squared return deviations (for instance, the historical mean absolute deviation forecasts better than the historical standard deviation), yet GARCH-type models are overall better models. Furthermore, they also find that models like GARCH(1,1), in which the weights attached to older observations decline exponentially, tend to underweight the most recent and oldest observations and overweight those in between, but that efforts to correct this using more flexible lag structures introduce additional estimation error, which leads to poorer out-of-sample forecasting. Among the most popular time series models, they find that GARCH(1,1) generally yields better out-of-sample forecasts than the historical standard deviation, but between GARCH and EGARCH there is no clear winner. However, all are dominated by a simple nonlinear least squares model based on historical absolute deviations which they develop in their paper.

Dunis and Huang (2002) show that, compared with more traditional statistical methods, NNR models do add value in the field of foreign exchange volatility forecasting: their results are tested out-of-sample not only in terms of forecasting accuracy, but also in terms of trading efficiency, via the implementation of a trading strategy using option straddles once mispriced options have been identified.

13.3. Bond Return and Bond Volatility Data

13.3.1. The Bond Returns and Historical Volatility Series Databank

The return series we use for the 10-year US Treasury Bond and the 10-year German Bund were extracted from a historical benchmark Government bond databank provided by Goldman Sachs. Returns, defined as $s_t = (P_t / P_{t-1}) - 1$, are calculated for the two bond series on a daily basis.


As the implied volatility databank covers a span of 5 years, from 23 January 1997 to 22 April 2002, we used the same period for bond returns and historical volatility computations, giving 1364 datapoints for each return and volatility databank. We use a restricted sample from 23 January 1997 to 7 June 2001 for model estimation, which is 5/6 of the dataset (1137 datapoints). That leaves us with 227 datapoints for the out-of-sample volatility estimates, spanning 8 June 2001 to 22 April 2002.

Summary statistics for the daily returns over the whole data period are presented in the tables below. These tables clearly show that both our US Treasury Bond and German Bund return series are nonnormally distributed and fat-tailed. They also show that mean returns are not statistically different from zero. Other standard tests of autocorrelation, stationarity and heteroskedasticity (not reported here in order to conserve space) show that both T-Bond and Bund returns are stationary and heteroskedastic. Autocorrelation is present in the T-Bond return series, but not in the Bund return series.

Table 13.1. Summary statistics of daily T-Bond returns (23 January 1997 - 22 April 2002)

Mean         -0.000113
Median        0.000000
Maximum       0.054668
Minimum      -0.041448
Std. Dev.     0.010685
Skewness      0.348155
Kurtosis      5.405657
Jarque-Bera   356.1992

Table 13.2. Summary statistics of daily Bund returns (23 January 1997 - 22 April 2002)

Mean         -2.95E-05
Median       -0.000198
Maximum       0.084369
Minimum      -0.032663
Std. Dev.     0.008885
Skewness      1.031297
Kurtosis     10.50774
Jarque-Bera   3442.739

Because both T-Bond and Bund returns have a zero unconditional mean, we can use squared returns as a measure of their variance and absolute returns as a measure of their standard deviation. The use of absolute returns is common among market practitioners; moreover, as suggested by Schwert (1989a, b), the variance of a zero-mean normally distributed variable is π/2 times the square of the expected value of its absolute value.


Since the variables considered are not normally distributed, we can hence set this constant arbitrarily to 1. The standard tests of autocorrelation, nonstationarity and heteroskedasticity (again not reported here in order to conserve space) show that the squared and absolute bond return series for the whole data period are all nonnormally distributed, stationary, heteroskedastic and autocorrelated, with means that are not statistically different from zero.

Taking, as is usual practice, a 252-trading day year, we compute the 21-day historical volatility as the moving annualised standard deviation of our returns:

$$HVOL_{n,t} = \sqrt{252} \cdot \sqrt{\frac{1}{21} \sum_{\tau=t-20}^{t} s_\tau^2}$$

where s_τ² is the squared T-Bond or Bund return. HVOL_{n,t} is the realised T-Bond and Bund volatility over 21 days that we are interested in forecasting as accurately as possible, for risk management or portfolio management purposes.
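To make this computation concrete, here is a minimal Python sketch of the rolling 21-day annualised volatility; the simulated return series and its parameters are our own illustration, not the chapter's Goldman Sachs data.

```python
import numpy as np

def hist_vol_21d(returns):
    """21-day rolling annualised volatility, assuming zero-mean daily returns:
    HVOL_{n,t} = sqrt(252) * sqrt((1/21) * sum of squared returns over t-20..t)."""
    sq = np.asarray(returns, dtype=float) ** 2
    out = np.full(len(sq), np.nan)          # first 20 entries stay undefined
    for t in range(20, len(sq)):
        out[t] = np.sqrt(252.0 * sq[t - 20 : t + 1].mean())
    return out

# Illustration on simulated returns s_t = P_t / P_{t-1} - 1 with ~1% daily vol
rng = np.random.default_rng(0)
s = rng.normal(0.0, 0.01, size=1364)
print(round(hist_vol_21d(s)[-1], 4))        # about 0.16, i.e. roughly 16% annualised
```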

13.3.2. The Implied Volatility Series Databank

Volatility is now an observable and traded quantity in many financial markets. So far, most studies that deal with implied volatilities have used data from listed options on exchanges rather than over-the-counter (OTC) volatility data (see e.g. Chiras and Manaster (1978), Kroner et al. (1995), Latane and Rendleman (1976), Lamoureux and Lastrapes (1993) and Xu and Taylor (1996)). The volatility time series we use for the US T-Bond and German Bund were extracted from a market-quoted implied volatilities database, again provided by Goldman Sachs, for data until April 2002. These 3-month at-the-money forward, market-quoted swaption volatilities are obtained from brokers on a daily basis, at the close of business. Note that the strike of the option is the current 10-year rate 3 months forward and not the actual prevailing 10-year rate 4. The swaption rate is the annualised lognormal volatility of the swap rate: if the swaption rate is 12%, then the random variable dS/S (the percentage change in the swap) has a standard deviation of 0.12. Alternatively, if the 10-year swap is 5%, then the standard deviation of the swap rate is 0.05 × 0.12 = 0.006, or 60 basis points (bp) per annum 5.

Summary statistics for these implied volatilities for the whole data period are shown below (tables 13.3 and 13.4). As we can see from tables 13.3 and 13.4, all series are nonnormal and fat-tailed. Further tests of autocorrelation, stationarity and heteroskedasticity show that both 3-month implied volatility series display strong autocorrelation and heteroskedasticity, and they are both nonstationary.

4 This will matter little for any modelling purposes unless the data is being used to calibrate sophisticated term-structure models.
5 All volatilities can be made daily by dividing by √252. Thus, in our example, the swap rate is expected to move by 60/√252 ≈ 3.78 bp per day. This is possibly the most intuitive way of looking at this measure.


Table 13.3. Summary statistics of daily USD swaption rate (23 January 1997 - 22 April 2002)

Mean         16.19095
Median       14.90000
Maximum      29.25000
Minimum      10.00000
Std. Dev.     3.826101
Skewness      1.021169
Kurtosis      3.360062
Jarque-Bera   244.4283

Table 13.4. Summary statistics of daily EUR swaption rate (23 January 1997 - 22 April 2002)

Mean         12.51300
Median       11.90000
Maximum      19.00000
Minimum       7.900000
Std. Dev.     2.238135
Skewness      0.518379
Kurtosis      2.475349
Jarque-Bera   76.73207

13.4. Volatility and Benchmark Models

All the models used in this research are shown in table 13.5. These methods are well documented in the finance literature and are therefore only outlined below, starting with the popular GARCH model 6.

13.4.1. The ARCH/GARCH Time Series and 'Mixed' Models

The ARCH family is a very widely used group of time series models. The GARCH(1,1) model proposed by Bollerslev (1986) and Taylor (1986) is the most popular among these. Basically, it states that the conditional variance of asset returns in any given period depends upon a constant, the previous period's squared random component of the return and the previous period's variance. In the notation that has become standard, the GARCH(1,1) model is:

$$h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1} \tag{13.1}$$

where h_t denotes the variance of the return at time t conditional on ε_{t-1} and h_{t-1}. In fact, we tried alternative GARCH specifications in our in-sample period, but none managed to consistently outperform the more standard GARCH(1,1) specification for both the T-Bond and Bund return series.

6 The detailed model specifications and in-sample results are not reported here in order to conserve space. They are available from the authors upon request.


Table 13.5. List of 1-, 5- and 21-Day Volatility Models

Model  Description                                                                        Mnemonic
1      GARCH(p,q) based on variance of returns                                            GARCH
2      GARCH(p,q) based on variance of returns + implied volatility                       GARCH
3      AR(p) based on squared returns                                                     ARretsqr
4      AR(p) based on squared returns + implied volatility                                ARretsqr
5      AR(p) based on absolute returns                                                    ARretabs
6      AR(p) based on absolute returns + implied volatility                               ARretabs
7      SV(1) based on log of squared returns                                              Kalman
8      SV(1) based on log of squared returns + implied volatility                         Kalman
9      Implied volatility                                                                 ImpliedVol
10     RiskMetrics volatility                                                             RMVol
11     Historical volatility                                                              HistoricalVol
12     Neural Network + implied volatility                                                NNR
13     Average of all simple models except worst models in-sample                         Comb
14     Average of all mixed models except worst models in-sample                          Comb
15     Regression-weighted average of all simple models except worst models in-sample     GR
16     Regression-weighted average of mixed models except worst models in-sample          GR

Equation (13.1) immediately gives the 1-step ahead volatility forecast and, using recursive substitution, Campbell et al. (1997) give the n-step ahead forecast for a GARCH(1,1) process:

$$E_t[h_{t+n}] = (\alpha + \beta)^n \left[ h_t - \frac{\omega}{1 - \alpha - \beta} \right] + \frac{\omega}{1 - \alpha - \beta} \tag{13.2}$$

where n = 5, 21 in our case and h_t comes from equation (13.1) above. The formula applies when α + β < 1.

The 'mixed' version counterpart of the GARCH(1,1), integrating implied volatility IMP_t, yields the following formulation for the conditional variance (see, for instance, Kroner et al. (1995)):

$$h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1} + \gamma IMP_{t-1} \tag{13.3}$$

For the mixed model, for both the Bund and T-Bond return series, GARCH(1,1) again outperformed all the other specifications. We use the difference of implied volatility in order to get a good forecast model. The 'mixed' GARCH(1,1) model n-step ahead forecast becomes:

$$E_t[h_{t+n}] = (\alpha + \beta)^n \left[ h_t - \frac{\omega}{1 - \alpha - \beta} \right] + \frac{\omega}{1 - \alpha - \beta} \tag{13.4}$$

where n = 5, 21 in our case and h_t comes from equation (13.3) above, which already includes the implied volatility term. The formula applies when α + β < 1.
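As a numerical illustration, the following Python sketch implements the recursion behind equations (13.2) and (13.4); the parameter values are hypothetical and chosen only so that α + β < 1 holds.

```python
import numpy as np

def garch_nstep(h_t, omega, alpha, beta, n):
    """n-step ahead conditional variance forecast for a GARCH(1,1),
    as in equation (13.2); valid only when alpha + beta < 1.
    For the 'mixed' model, h_t is simply taken from equation (13.3)."""
    if alpha + beta >= 1:
        raise ValueError("forecast formula requires alpha + beta < 1")
    h_bar = omega / (1.0 - alpha - beta)               # unconditional variance
    return (alpha + beta) ** n * (h_t - h_bar) + h_bar

# Hypothetical parameters; annualise the result as sqrt(h) * sqrt(252)
h5 = garch_nstep(h_t=1.2e-4, omega=2e-6, alpha=0.05, beta=0.92, n=5)
print(np.sqrt(252 * h5))
```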


13.4.2. The AR(p) Time Series and 'Mixed' Models

In our analysis of T-Bond and Bund returns in section 13.3.1 above, we saw that squared returns could be used as a measure of their variance and absolute returns as a measure of their standard deviation. Furthermore, we mentioned that squared and absolute T-Bond and Bund returns were all nonnormally distributed, stationary, heteroskedastic and autocorrelated, with means not statistically different from zero. This stationarity allows us to apply traditional ARMA estimation procedures to our squared and absolute bond return series, provided we account for the presence of both heteroskedasticity and autocorrelation where appropriate. For a full discussion of the procedure, refer to Box et al. (1994), Gouriéroux and Monfort (1995) and Pindyck and Rubinfeld (1998). Following West and Cho (1995), we thus model the conditional variance either as:

$$h_t = \omega + \sum_{i=1}^{16} \alpha_i s_{t-i}^2 \tag{13.5}$$

or as:

$$h_t = \omega + \sum_{i=1}^{16} \alpha_i \, | s_{t-i} | \tag{13.6}$$

The n-step ahead forecasts for our AR(p) models become respectively:

$$h_{t+n} = \hat{\omega} + \sum_{i=1}^{16} \hat{\alpha}_i s_{t+n-i}^2 \tag{13.7}$$

and:

$$h_{t+n} = \hat{\omega} + \sum_{i=1}^{16} \hat{\alpha}_i \, | s_{t+n-i} | \tag{13.8}$$

Based on the AIC/SBC information criteria, the log-likelihood and the standard error of the estimation, we select in the end, for the conditional variance forecast based on squared returns, an AR(2) and a restricted AR(1,6) process for the Bund and T-Bond respectively 7. For the conditional variance forecast based on absolute returns, we choose restricted AR(2,5,7,14,16) and AR(1,2,4,9,15) processes for the Bund and T-Bond variances respectively. These lags represent a period of up to 3 trading weeks maximum.

The 'mixed' version counterparts of the AR(p) models integrating implied volatility data are then:

$$h_t = \omega + \sum_{i=1}^{16} \alpha_i s_{t-i}^2 + \gamma IMP_{t-1} \tag{13.9}$$

7 A restricted AR process only incorporates statistically significant lags (i.e. those in brackets).


and:

$$h_t = \omega + \sum_{i=1}^{16} \alpha_i \, | s_{t-i} | + \gamma IMP_{t-1} \tag{13.10}$$

The best outcome for the 'mixed' AR(p) was found to be an AR(1) for both the 'mixed' Bund and T-Bond squared returns. In the case of 'mixed' models based on absolute returns, the best outcome was an AR(1) process for the Bund and an AR(9) process for the T-Bond. The n-step ahead forecasts for equations (13.9) and (13.10) are then calculated in a recursive manner to yield:

$$h_{t+n} = \hat{\omega} + \sum_{i=1}^{16} \hat{\alpha}_i s_{t+n-i}^2 + \gamma IMP_{t-1} \tag{13.11}$$

and:

$$h_{t+n} = \hat{\omega} + \sum_{i=1}^{16} \hat{\alpha}_i \, | s_{t+n-i} | + \gamma IMP_{t-1} \tag{13.12}$$
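The sketch below shows one way to estimate such a restricted AR by ordinary least squares in Python. It is a simplified stand-in for the authors' procedure (which also corrects for heteroskedasticity and autocorrelation); the lag set and the optional implied-volatility regressor are supplied by the caller.

```python
import numpy as np

def fit_restricted_ar(x, lags, imp=None):
    """OLS fit of h_t = omega + sum_k alpha_k * x_{t-k} (+ gamma * IMP_{t-1}),
    where x is squared or absolute returns and `lags` lists the retained lags,
    as in equations (13.5)/(13.6) and their 'mixed' versions (13.9)/(13.10)."""
    x = np.asarray(x, dtype=float)
    p, T = max(lags), len(x)
    y = x[p:]                                               # regressand h_t proxy
    cols = [np.ones(T - p)] + [x[p - k : T - k] for k in lags]
    if imp is not None:
        cols.append(np.asarray(imp, dtype=float)[p - 1 : T - 1])  # IMP_{t-1}
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    omega, alphas = coef[0], coef[1 : 1 + len(lags)]
    gamma = coef[-1] if imp is not None else None
    return omega, alphas, gamma

# e.g. the restricted AR(1,6) retained for T-Bond squared returns:
# omega, alphas, _ = fit_restricted_ar(s ** 2, lags=[1, 6])
```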

13.4.3. The Stochastic Variance SV(1) Time Series and 'Mixed' Models

With the development of applications of the state space modelling procedure, stochastic volatility models have become more popular in recent years, if not among market practitioners, at least in academic circles: intuitively, there is a clear attraction in the idea that volatility and its time-varying nature could be stochastic rather than the result of some deterministic function. The seminal works of Harvey and Shepherd (1993), Harvey et al. (1994) and Hamilton (1994) have underlined the advantages of using state space modelling for representing dynamic systems where unobserved variables (the so-called 'state' variables) can be integrated within an 'observable' model. Following, amongst other contributions, So et al. (1999) and Dunis et al. (2000, 2003), it is thus possible to model volatility in state space form as a time-varying parameter model.

After several attempts at alternative specifications, our preferred approach was selected on the basis of the resulting log-likelihood and the standard error of the observation equation. In the end, we chose to model the logarithm of the conditional variance as a random walk plus noise 8. We further made the assumption that our random coefficient (our 'state' variable) was best modelled as an AR(1) process with a constant mean, implying that shocks would show some persistence, but that the random coefficient would eventually return to its mean level, an assumption compatible with the behaviour of bond volatility. The selected model for both the Bund and T-Bond variances was an SV(1) process with the following specification:

$$\log h_t = \omega + SV_t + \varepsilon_t, \qquad SV_t = \delta SV_{t-1} + n_t \tag{13.13}$$

where SV_t is our time-varying coefficient while ε_t and n_t are uncorrelated error terms.

8 Working in logarithms ensures that h_t is always positive.


The 'mixed' version counterpart of this system that integrates implied volatility data is then straightforwardly:

$$\log h_t = \omega + SV_t + \gamma \log IMP_{t-1} + \varepsilon_t, \qquad SV_t = \delta SV_{t-1} + n_t \tag{13.14}$$

In order to derive the n-step ahead forecast for system (13.13), we must compute E(SV_{t+n} | I_t), with I_t the information set available at time t. It is clear from (13.13) that we have:

$$E(SV_{t+1} \mid I_t) = \delta SV_t \tag{13.15}$$

By iterating equation (13.15), we can therefore compute E(SV_{t+n} | I_t):

$$E(SV_{t+n} \mid I_t) = \delta^n SV_t \tag{13.16}$$

We can now compute the n-step ahead forecast for the logarithm of squared returns as:

$$\log h_{t+n} = \omega + SV_{t+n} + \varepsilon_{t+n}, \qquad SV_{t+n} = \delta^n SV_t + n_{t+n} \tag{13.17}$$

Similarly, taking into account the fact that, in order to compute a truly out-of-sample forecast, the last information on implied volatility available at time t is IMP_t, the 'mixed' system n-step ahead forecast becomes:

$$\log h_{t+n} = \omega + SV_{t+n} + \gamma \log IMP_t + \varepsilon_{t+n}, \qquad SV_{t+n} = \delta^n SV_t + n_{t+n} \tag{13.18}$$
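A minimal sketch of the resulting forecast rule: given the filtered state SV_t (obtained, in practice, from a Kalman filter run over the in-sample period) and the estimated parameters, the n-step ahead forecast of log h_{t+n} follows directly from (13.17)-(13.18). The numbers below are hypothetical.

```python
import numpy as np

def sv_log_variance_forecast(sv_t, omega, delta, n, gamma=0.0, log_imp_t=0.0):
    """n-step ahead forecast of log h_{t+n} for the SV(1) system.
    The noise terms have zero conditional mean and drop out of the forecast;
    set gamma and log_imp_t for the 'mixed' system (13.18), leave them at 0
    for the pure system (13.17)."""
    return omega + (delta ** n) * sv_t + gamma * log_imp_t

# Hypothetical filtered state and parameters, 21-day horizon
log_h = sv_log_variance_forecast(sv_t=-0.4, omega=-9.2, delta=0.97, n=21)
h_21 = np.exp(log_h)   # 'anti-log' back to a conditional variance
```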

13.4.4. The 'Naïve' Random Walk Models and RiskMetrics Volatility

We proceed to 'estimate' two alternative types of 'naïve' random walk models, one which simply states that the best n-step ahead forecast of the variance is its current past n-day average, and the other which sets the n-step ahead forecast of the conditional variance at the current n-day implied volatility level. Consequently, the first type of 'naïve' model, based on historical volatility, yields the following n-step ahead forecast:

$$h_{t+n} = \frac{1}{\sqrt{252}} \, HVOL_{i,t} \tag{13.19}$$

where HVOL_{i,t} is the realised 1-month (i = 21) historical volatility. The second type of 'naïve' model is based on market-quoted swaption rates and yields the following n-step ahead forecast:

$$h_{t+n} = \frac{1}{\sqrt{252}} \, IMP_t^2 \tag{13.20}$$


where IMP_t is the 3-month swaption rate prevailing at time t, as we do not have 1-day, 5-day and 1-month implied volatilities.

RiskMetrics is based on the risk measurement methodology developed by J.P. Morgan for the measurement, management and control of market risks in its trading, arbitrage and own investment account activities. RiskMetrics is nothing more than a simple, high-quality tool for professional risk managers involved in financial markets and is not a guarantee of specific results: it is a set of tools that enable participants in financial markets to estimate their exposure to market risk under what has been called the Value-at-Risk framework. The RiskMetrics volatility is calculated using the standard formula:

$$\sigma_{t+1|t}^2 = b \, \sigma_{t|t-1}^2 + (1 - b) \, r_t^2 \tag{13.21}$$

where σ² is the T-Bond or Bund variance, r² is the T-Bond or Bund squared return and b = 0.94 for daily data. In this paper, we use the RiskMetrics volatility to forecast 1 day, 5 days and 21 days ahead over the out-of-sample period. The RiskMetrics volatility is calculated from equation (13.21) and we then use equation (13.22) below to calculate the n-step ahead forecast:

$$h_{t+n} = \frac{1}{\sqrt{252}} \, RMVOL_{i,t}^2 \tag{13.22}$$
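A short Python sketch of the RiskMetrics recursion (13.21); the 30-observation seed for the initial variance is our own assumption, as the chapter does not state how the recursion is initialised.

```python
import numpy as np

def riskmetrics_variance(returns, b=0.94):
    """EWMA variance, equation (13.21):
    sigma2_{t+1|t} = b * sigma2_{t|t-1} + (1 - b) * r_t^2."""
    r = np.asarray(returns, dtype=float)
    var = np.empty(len(r))
    var[0] = r[:30].var()               # seed (assumption: 30-day sample variance)
    for t in range(1, len(r)):
        var[t] = b * var[t - 1] + (1.0 - b) * r[t - 1] ** 2
    return var
```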

13.4.5. The Neural Network Models

A recent development in finance has been the application of nonparametric time series modelling techniques to volatility forecasts. Gaussian kernel regression is an example, as in West and Cho (1995). Neural Network Regression (NNR) models, in particular, have been applied with increasing success to economic and financial forecasting and would, according to some, constitute the state of the art in forecasting methods (see, for instance, Zhang et al. (1998)). It is well beyond the scope of this paper to give a complete overview of NNR models, their biological foundation and their many architectures and potential applications 9. For a full discussion of NNR models, refer to Haykin (1999), Kaastra and Boyd (1996), Kingdon (1997), and Zhang et al. (1998).

For our purpose, let it suffice to say that NNR models are a tool for determining the relative importance of an input (or a combination of inputs) for predicting a given outcome. They are a class of models made up of layers of elementary processing units, called neurons or nodes, which elaborate information by means of a nonlinear transfer function. Most of the computing takes place in these processing units. Theoretically, the advantage of neural networks over traditional forecasting methods is that, as is often the case, the model best adapted to a particular problem cannot be identified beforehand. It is then better to resort to a method that is a generalisation of many models than to rely on an a priori model. Successful applications in forecasting foreign exchange rates can be found in Deboeck (1994), Kuan and Liu (1995) and Franses and Van Homelen (1998), amongst others.

9 In this paper, we use exclusively the multilayer perceptron, a multilayer feedforward network trained by error backpropagation. Our application follows Dunis and Huang (2002).


If there are many articles on applications of NNR models to foreign exchange, stock and commodity markets, there are rather few concerning financial market volatility forecasting in general: the few publications in this field concern either stock market volatility, as in Donaldson and Kamstra (1997) and Bartlmae and Rauscher (2000), or foreign exchange volatility, as in Dunis and Huang (2002). It seems therefore that, as an alternative technique to the more traditional statistical forecasting methods, NNR models need further investigation to check whether or not they can add value in the field of bond volatility forecasting.

Developing NNR models is a rather difficult and time-consuming task. For this research, we need to develop, for both the T-Bond and the Bund variances, one NNR model per forecast horizon (1-, 5- or 21-step ahead), with slightly different input variables. In the circumstances, we only apply this modelling approach to 'mixed' models to check whether, if traditional 'mixed' models including swaption rates outperform their 'pure' time series counterparts, these results can still be improved by resorting to NNR modelling. Inputs are transformed into returns as, despite some contrary opinions, e.g. Balkin (1999), stationarity remains important if NNR models are to be assessed on the basis of the level of explained variance.

In the absence of an indisputable theory of bond volatility, we assumed that it could be explained by that bond's recent evolution, volatility spillovers from other financial markets, and macro-economic and monetary policy expectations as measured by the yield curve. Final inputs include the actual lagged returns of the Euro/US dollar exchange rate, lagged returns of the S&P500 and the FTSE100 stock indices 10, lagged returns of the gold price and of the Eurodollar 3-month interest rate, lagged swaption rates and RiskMetrics volatility, and lagged T-Bond and Bund returns, squared returns and absolute returns, depending on the bond volatility being modelled. Relevant lags are chosen with respect to the forecast horizon. All variables are normalised according to our choice of the sigmoid activation function.

Starting from a traditional linear correlation analysis, variable selection is achieved via a forward stepwise neural regression procedure: starting with lagged implied volatility, other potential input variables are progressively added, keeping the network architecture constant 11. If adding a new variable improves the level of explained variance over the previous 'best' model, the pool of explanatory variables is updated. If there is a failure to improve over the previous 'best' model after several attempts, variables in that model are alternated to check whether a better solution can be achieved. The model finally chosen is then kept for further tests and improvements.

Finally, conforming with standard heuristics, we partition our total data set into three subsets, using roughly 2/3 of the data for training the model, 1/6 for testing and the remaining 1/6 for validation. This partition into training, test and validation sets is made in order to control the error and reduce the risk of overfitting. Both the training and the following test period are used in the model tuning process: the training set is used to develop the model; the test set measures how well the model interpolates over the training set and makes it possible to check during the adjustment whether the model remains valid for the future. As the fine-tuned system is not independent from the test set, the use of a third validation set which was not involved in the model's tuning is necessary. The validation set is thus used

10 Adding Stoxx50 and/or Dax30 returns does not improve the results.
11 Note that, for instance, when adding S&P500 returns for the German Bund variance, we have to ensure that we adjust the lag structure to make up for the time difference between the US and Europe.


to estimate the actual performance of the model in a deployed environment. We use the same input space and architecture throughout (i.e. with only one hidden layer), mostly with 11 nodes in the hidden layer. On average, NNR models with a single hidden layer performed marginally better than models with 2 hidden layers, while at the same time requiring less processing time.

In our case, the 1134 return observations from 24 January 1997 to 7 June 2001 are considered as the in-sample period for the estimation of our models. We therefore retain the first 907 observations, from 24 January 1997 to 21 July 2000, for the training set, and the remainder of the in-sample period is used as the test set. The last 229 observations, from 8 June 2001 to 22 April 2002, constitute the validation set and serve as the out-of-sample forecasting period, as with the other modelling approaches.
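By way of illustration only, a roughly comparable network can be set up with scikit-learn's MLPRegressor: one hidden layer of 11 nodes with a sigmoid (logistic) activation, and a 907-observation training set, mirroring the architecture described above. The input matrix X and target y are placeholders; the chapter's actual inputs are the lagged market variables listed earlier.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# Placeholder data standing in for the lagged inputs and realised volatility target
rng = np.random.default_rng(1)
X, y = rng.normal(size=(1134, 8)), rng.normal(size=1134)

X_s = MinMaxScaler().fit_transform(X)        # inputs normalised for the sigmoid

nnr = MLPRegressor(hidden_layer_sizes=(11,), # one hidden layer, 11 nodes
                   activation="logistic",    # sigmoid transfer function
                   max_iter=2000, random_state=0)
nnr.fit(X_s[:907], y[:907])                  # training set (first 907 observations)
r2_test = nnr.score(X_s[907:], y[907:])      # test set guards against overfitting
```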

13.4.6. The Combined Time Series and 'Mixed' Models

As underlined by Dunis et al. (2001), many researchers in finance have now come to the conclusion that individual forecasting models are misspecified in some dimensions and that the identity of the 'best' model changes over time. In this situation, it is likely that a combination of forecasts will perform better over time than forecasts generated by any individual model that is kept constant. For a while now, the survey literature on forecast combinations, such as Clemen (1989) and Mahmoud (1984), has confirmed that combining different models generally provides more precise forecasts. This statement on the advantages of combining two or more forecasts into a composite forecast is consistent with findings by Makridakis et al. (1982), Granger and Ramanathan (1984) and Dunis et al. (2000), amongst others. These articles agree that the combination of several methods improves overall forecasting accuracy over and above that of the individual forecasting models used in the combination. Consequently, there is a strong case for combining the various successful models we have retained in our research.

Accordingly, we compute two different, yet simple, model combinations. The first forecast combination we retain is the simple average of each single forecasting model for time t + n, minus the p models which perform worst when we analyse the in-sample forecasting accuracy measures. We thus have:

$$h_{t+n} = \frac{1}{m} \sum_{i=1}^{m} h_{i,t+n} \tag{13.23}$$

where n = {1, 5, 21}, m is the number of models we have left after removing the p worst ones and h_{i,t+n} represents the forecast of each single forecasting model, except the p worst performing ones, for time t + n. The p worst models are taken out by analysing the forecasting accuracy 12 of the individual simple and 'mixed' models in-sample 13. On average, three to four models allow us to maximise in-sample forecasting accuracy. We name this model combination 'Comb' in the subsequent out-of-sample forecasting accuracy analysis. Models 13 and 14 of table 13.5 at the beginning of section 13.4 are estimated according to equation (13.23) for the three time horizons considered.

12 See section 13.5.2 below for more details on the forecasting accuracy measures used.
13 If the combination is to be used in a true out-of-sample forecasting exercise, its constituents must be determined on the basis of in-sample results.


For the 'pure' time series models and the 1-step ahead in-sample forecast for the Bund, the best combination contains the GARCH, the AR based on squared returns and historical volatility. The same combination applies for the US Treasury Bond. This means that we choose the best combination excluding the worst four models. Implied volatility and RiskMetrics volatility are close contenders, but in-sample forecasting accuracy is better if we drop them. The same combinations of three models are selected for the 5-step ahead forecasts of the Bund and T-Bond variances on the basis of their better in-sample forecasting accuracy. For the 21-step ahead forecast, the best combination for the Bund variance comprises the best four models (GARCH, AR based on squared returns, historical volatility and stochastic variance), whereas for the T-Bond variance the best combination is identical except for the SV model, which is excluded.

For the 'mixed' models, for the 1-step ahead forecast of the Bund and T-Bond variances, the results are better when we combine the 'mixed' GARCH, the 'mixed' AR based on squared returns and the NNR models. The 5-step and 21-step ahead forecasts again involve a combination of the same three types of models for both the Bund and the T-Bond variance.

The second forecast combination uses the linear regression weighting approach suggested by Granger and Ramanathan (1984), which yields:

$$h_{t+n} = \sum_{i=1}^{m} \hat{\alpha}_i h_{i,t+n} + \hat{b} \tag{13.24}$$

where n = {1, 5, 21}, m is the number of models and h_{i,t+n} represents the forecast of each single forecasting model for time t + n. Models 15 and 16 of table 13.5 are estimated according to equation (13.24) for the three time horizons considered. We call this model combination 'GR' in the subsequent out-of-sample forecasting accuracy analysis. For example, for the 1-step ahead forecast of the Bund variance with simple models, the single models retained are the AR based on squared returns, the SV model and the RiskMetrics variance. The models that are not statistically significant are removed and we are left with the statistically significant ones only. The final choice is based on the in-sample forecasting accuracy of alternative GR combinations 14.

14 Note that, as we were more interested in forecasting performance than in the value of individual coefficients, in some cases we left a marginally significant model (up to the 15% level) together with the more significant ones, as this greatly improved in-sample forecasting accuracy. As mentioned in footnote 6, details of the equations are not reported here in order to conserve space. They are available from the authors upon request.
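In Python, the GR weights of equation (13.24) reduce to an in-sample least-squares regression whose coefficients are then reused out-of-sample; a minimal sketch:

```python
import numpy as np

def gr_weights(in_sample_forecasts, realised):
    """Granger-Ramanathan combination: regress realised volatility on the
    individual in-sample forecasts (columns) plus a constant, eq. (13.24)."""
    F = np.asarray(in_sample_forecasts, dtype=float)
    X = np.column_stack([F, np.ones(len(F))])
    coef, *_ = np.linalg.lstsq(X, np.asarray(realised, dtype=float), rcond=None)
    return coef[:-1], coef[-1]               # (alpha_hat per model, b_hat)

def gr_combine(forecasts, alphas, b):
    """Apply the fitted weights to (out-of-sample) individual forecasts."""
    return np.asarray(forecasts, dtype=float) @ alphas + b
```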

13.5. The Out-of-Sample Estimation Results

13.5.1. The Out-of-Sample Estimation Procedure

As mentioned earlier, the in-sample period for model estimation covers the period 23 January 1997 to 7 June 2001, which represents 5/6 of the dataset (1137 data points). That leaves us with 227 observations for the out-of-sample volatility forecasts, from 8 June 2001 to 22 April 2002, during which period the models are kept constant.


This is obviously a very stringent test of model robustness as, in practice, such models would be re-estimated more frequently, and it is fair to say that our out-of-sample procedure certainly disadvantages estimated model forecasts compared to the benchmark 'forecasts' presented in section 13.4.4 above.

13.5.2. The Measures of Forecasting Accuracy

Having documented the 16 different models that we estimated for the Bund and the T-Bond variance, for the three time horizons considered, we present the measures of forecasting accuracy that we use. To start with, although this is a straightforward transformation, we must adjust the results of our forecasting models (which give us either a conditional variance forecast or the forecast of its logarithm) to give us an annualised volatility forecast 15. As is standard in the economic literature, we compute the Root Mean Squared Error (RMSE), the Mean Absolute Error (MAE) and the Theil U-statistic (Theil-U). These measures have already been presented in detail by, amongst others, Makridakis et al. (1983), Pindyck and Rubinfeld (1998) and Theil (1966). We also compute a 'correct directional change' (CDC) measure which checks whether the direction given by the forecast is the same as the actual change which has subsequently occurred (i.e. the direction of change implied by the forecast at time t for time t + n compared with the volatility level prevailing at time t).

Calling σ the actual volatility and σ̂ the forecast volatility at time τ, with a forecast period going from t + 1 to t + n, the forecast error statistics are respectively:

$$RMSE = \sqrt{\frac{1}{n} \sum_{\tau=t+1}^{t+n} (\hat{\sigma}_\tau - \sigma_\tau)^2}$$

$$MAE = \frac{1}{n} \sum_{\tau=t+1}^{t+n} | \hat{\sigma}_\tau - \sigma_\tau |$$

$$\text{Theil-U} = \sqrt{\frac{1}{n} \sum_{\tau=t+1}^{t+n} (\hat{\sigma}_\tau - \sigma_\tau)^2} \Bigg/ \left[ \sqrt{\frac{1}{n} \sum_{\tau=t+1}^{t+n} \hat{\sigma}_\tau^2} + \sqrt{\frac{1}{n} \sum_{\tau=t+1}^{t+n} \sigma_\tau^2} \right]$$

$$CDC = \frac{100}{n} \sum_{\tau=t+1}^{t+n} D_\tau$$

where D_τ = 1 if (σ_τ − σ_{τ−1}) · (σ̂_τ − σ_{τ−1}) > 0, else D_τ = 0.

The RMSE and MAE statistics are scale-dependent measures but give us a basis to compare our volatility forecasts with the realised volatility. The Theil-U and CDC statistics are independent of the scale of the variables: the Theil-U statistic is constructed in such a way that it necessarily lies between zero and one, with zero indicating a perfect fit.

15 This is easily done by taking the square root of the conditional variance forecast and multiplying it by √252; in the case of the stochastic variance forecasts, it is further necessary to take the exponential of the respective model's forecast, after having adjusted it by the log-forecast variance (we used the variance from the observation equation over the in-sample period since, in a true out-of-sample forecasting process, one does not know ex ante what the forecast error is going to be), to account for the transformation to 'anti-log' the forecasts.
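These four statistics translate directly into Python; a compact sketch, with f the forecast series and a the realised series over the forecast period:

```python
import numpy as np

def rmse(f, a):
    return np.sqrt(np.mean((f - a) ** 2))

def mae(f, a):
    return np.mean(np.abs(f - a))

def theil_u(f, a):
    """RMSE scaled so that the statistic lies between 0 (perfect fit) and 1."""
    return rmse(f, a) / (np.sqrt(np.mean(f ** 2)) + np.sqrt(np.mean(a ** 2)))

def cdc(f, a):
    """Correct directional change in %: D_tau = 1 when forecast and actual
    move in the same direction relative to the previous actual value."""
    d = (a[1:] - a[:-1]) * (f[1:] - a[:-1]) > 0
    return 100.0 * d.mean()
```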


The CDC, for its part, lies by construction between 0 and 100%, the latter indicating a perfect forecast of changes 16. For three of the error statistics retained (RMSE, MAE and Theil-U), the lower the output, the better the forecasting accuracy of the model concerned. However, rather than securing the lowest statistical forecast error, the profitability of a trading system critically depends on taking the right position, and therefore on getting the direction of changes right. RMSE, MAE and Theil-U are all important error measures, yet they may not constitute the best criterion from a profitability point of view. The CDC statistic addresses this issue and, for this measure, the higher the output, the better the forecasting accuracy of the model concerned.

Choosing the best models is not such a simple matter, as the best model depends upon the choice of criteria. In order to rank the models, we give a score to each forecasting accuracy measure: a score of 1 to 9 for our 9 simple models (1 to 10 in the case of the 'mixed' models) for each of RMSE, MAE and Theil-U, and a score twice that size for CDC 17. For example, the best model in terms of RMSE gets a score of 9, the second best a score of 8 and so on, while for the CDC the model with the highest CDC gets a score of 18, the second best a score of 16 and so on; in the end, the model with the most points is chosen as the best one.
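A sketch of this rank-based scoring in Python (our own illustration of the rule just described):

```python
import numpy as np

def score_models(rmse_v, mae_v, theil_v, cdc_v):
    """For m models, the best model on each of RMSE/MAE/Theil-U earns m points
    down to 1 for the worst; CDC points count double (2m down to 2). The model
    with the highest total is retained as the best one."""
    m = len(rmse_v)

    def points(values, higher_is_better=False, weight=1):
        order = np.argsort(values)           # indices from smallest to largest
        if higher_is_better:
            order = order[::-1]
        pts = np.empty(m)
        pts[order] = np.arange(m, 0, -1) * weight
        return pts

    return (points(rmse_v) + points(mae_v) + points(theil_v)
            + points(cdc_v, higher_is_better=True, weight=2))
```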

13.5.3. The Out-of-Sample Forecasting Results

Having documented the error measures that we use to gauge the forecasting accuracy of our different models, we now turn to the analysis of our out-of-sample empirical results. We basically wish to answer the following questions:

1. How do our models fit out-of-sample and is there a (or several) better forecasting model(s)?

2. Do implied volatility data, model combination and NNR models add value in terms of forecasting accuracy?

We start with pure time series models, with the simple 1-day out forecasts for the Bund and the T-Bond volatilities, and concentrate first on error levels. As can be seen from tables 13.6 and 13.7 of Appendix 2, most of our volatility models have indeed some forecasting power, as shown for instance by the Theil-U statistics, but the errors remain important, as evidenced for instance by the MAE measure.

The results are most interesting. For both the German Bund and the US T-Bond, the GR model combination performs best overall. The directional change is higher than for other models, close to 80% in the case of the T-Bond. In both cases the SV model (or Kalman) is the worst model, which is not really surprising as this is a time-varying parameter model and we have fixed the system, if not the parameters, at the end of the in-sample period. Among the 9 models, implied volatility is the sixth best 'model' for the Bund and seventh for the T-Bond.

16 Note that a CDC of 50% is the random result and values less than 50% imply a worse than random performance.
17 We attach double weight to CDC as it is our only measure of direction, a key criterion in financial markets.


Hence all the other better-ranked five and six volatility forecasting models respectively offer more precise indications about future volatility than implied volatilities.

Looking at the 5-day out forecasts for the simple models in tables 13.8 and 13.9, we see similar results to those of the 1-day forecasts. The best model is again the GR for both the Bund and the T-Bond, and the worst is the SV model. Implied volatility comes only sixth in both cases. For the 21-day out volatility forecasts of the simple models, GARCH(1,1) is the better model for both the Bund and the T-Bond, followed closely by GR 18. Among the 9 models, implied volatility comes only seventh for the Bund and sixth for the T-Bond.

Turning to the 'mixed' models, which include the swaption rate as an extra explanatory variable, for the 1-day out volatility forecasts NNR models outperform all other single modelling techniques for both the Bund and the T-Bond, as shown by tables 13.10 and 13.11. Implied volatility comes respectively seventh for the Bund and eighth for the T-Bond, with only the SV model and the AR model based on absolute returns performing worse in both cases. Yet, the best 'mixed' model does not manage to outperform the simple GR combination.

For the 5-day out 'mixed' volatility forecasts, tables 13.12 and 13.13 show that the NNR model and the simple average combination give the best predictions for the Bund volatility, closely followed by the AR model based on squared returns. For the T-Bond, the best results are given by the GR, closely followed by the GARCH and NNR models respectively. Implied volatility comes seventh for both the Bund and the T-Bond, with the RiskMetrics forecast, the AR model based on absolute returns and the SV model the only models performing worse in both cases. Again, the best 'mixed' model does not significantly outperform the simple GR combination.

For the 21-day out 'mixed' volatility forecasts, the AR model based on squared returns performs best for the Bund volatility, followed by the simple average combination and GR. The NNR model is only the fifth best model this time, although its performance is indeed very close to that of the better models. For the T-Bond, the NNR model comes as a very close second best after the GR combination. As for the 5-day horizon, implied volatility comes seventh for both the Bund and the T-Bond, with the RiskMetrics forecast, the AR model based on absolute returns and the SV model the only models performing worse in both cases: in other words, most of our volatility forecasting models offer more precise indications about future bond volatility than implied volatility. Here again, the best 'mixed' model does not manage to decisively outperform the best simple model.

In the end, our results indicate that, while it is impossible to identify a single volatility model as the 'best' overall, for the 'mixed' models including implied volatility as an extra input, NNR models clearly emerge as the best single modelling approach. Furthermore, they demonstrate that, more often than not, volatility models provide more precise indications about future volatility than swaption rates. The inclusion of an implied volatility term and/or model combination can also improve forecasting accuracy.

18 Full results are available from the authors upon request.


13.6. Conclusion

The basis of this research was to focus on the forecasting ability of alternative volatility models applied to Government benchmark bonds, namely the 10-year US Treasury Bond and the 10-year German Bund, which has become the benchmark bond for the Euro area. The main objective was to investigate whether, compared with the information provided by swaption rates, volatility models can add value in terms of forecasting accuracy, and whether adding an implied volatility term to the different time series models proves useful. The tables in Appendices 2 and 3 indicate this to be true, as most of the volatility models retained outperform implied volatility as an indication about future bond volatility.

Overall, the GR combination of simple models shows a remarkable performance that is difficult to beat even by the 'mixed' models. Nevertheless, if we consider, for both the T-Bond and the Bund and the 3 different time horizons of 1 day, 1 week and 1 month, the forecasting performance of the GARCH, AR and Comb models against the 'mixed' GARCH, 'mixed' AR and 'mixed' Comb models, one can see that, more often than not, the addition of an implied volatility term improves forecasting accuracy. This is not the case, however, for the Kalman SV model: the stochastic variance model's forecasting accuracy is better without the implied volatility term, which is consistent with the findings of Dunis et al. (2000, 2003), who argue that, for foreign exchange volatility models, the implied volatility term has least impact on SV models, where indeed 50% of the time it leads to a deterioration in forecasting accuracy. In our application, this occurs for both the Bund and the T-Bond volatility at all time horizons.

Our results show that it is impossible to unequivocally identify a single volatility model as the 'best' overall: yet, for the 'mixed' models including the swaption rate as an extra explanatory variable, NNR models evidently come out as the best single modelling technique. For the simple models, for all three time horizons, GR performed better, but GARCH models, AR models based on squared returns and the Comb simple average combination performed well too. For the 'mixed' models at the three time horizons, as already mentioned, NNR models were the best modelling approach, together with the GR and Comb combinations. AR models based on squared returns and GARCH models were good models too, and they are better when the implied volatility term is added to them. As expected from the literature on forecast combination, we were also able to show that, for both the simple and 'mixed' models, combinations do often add value in terms of forecasting accuracy, but the GR regression-weighted models generally perform better than those based on a simple statistical average. Overall, for both simple and 'mixed' volatility models for the Bund and T-Bond, the stochastic variance and AR models based on absolute returns were almost always the worst models.

Areas for further research might include using NNR models for the simple forecasting models, as we have only applied them here to the 'mixed' models. A more interesting line to follow would be to examine the use of our forecasting models for trading bond volatility: following the work of Dunis and Huang (2002) on the foreign exchange markets, the models could then be tested out-of-sample not only in terms of forecasting accuracy, as we have done here, but also in terms of trading efficiency, using Government bond option straddles once mispriced options have been identified.
It would be interesting to see whether superior models in terms of forecasting accuracy also produce superior trading strategies.


Appendix 1: Historical and Implied 10-Year Volatilities 19

Figure 13.1. Historical and Implied 10-year Euro Bund Volatilities. Source: Implied volatility data from Goldman Sachs and own computations for RiskMetrics volatility.

Figure 13.2. Historical and Implied 10-year US T-Bond Volatilities. Source: Implied volatility data from Goldman Sachs and own computations for RiskMetrics volatility.

19 The implied vol is derived from the 3-month ATM option on the 10-year swap rate, with the strike of the option the current 10-year rate 3 months forward (and not the actual 10-year rate). Swaption volatility is deemed a good proxy for Government bond volatilities.


Appendix 2: Out-Of-Sample Forecasting Accuracy (Simple Models)

Table 13.6. Euro Bund volatility forecasts (1-step ahead)

Rank  Euro Bund      RMSE      MAE       Theil-U   CDC
2     GARCH          0.000181  0.000082  0.644645  71.68
3     ARretsqr       0.000181  0.000085  0.64966   68.14
8     ARretabs       0.009281  0.009196  0.97215   50.44
9     Kalman         0.011405  0.0114    0.977061  50.44
7     RMVol          0.001251  0.001096  0.817902  50.44
6     ImpliedVol     0.000981  0.000933  0.790519  51.33
5     HistoricalVol  0.000691  0.000592  0.734971  50.89
4     Comb           0.000286  0.000232  0.576774  54.43
1     GR             0.000168  0.000077  0.585758  72.12

Table 13.7. US T-Bond volatility forecasts (1-step ahead)

Rank  US T-Bond      RMSE      MAE       Theil-U   CDC
2     GARCH          0.000418  0.000228  0.553979  72.12
3     ARretsqr       0.000339  0.000183  0.585375  70.8
8     ARretabs       0.012361  0.012136  0.955124  48.23
9     Kalman         0.013366  0.013361  0.95928   48.23
6     RMVol          0.003186  0.002755  0.841026  48.23
7     ImpliedVol     0.003038  0.002917  0.84369   48.23
5     HistoricalVol  0.00205   0.0016    0.787344  48.67
4     Comb           0.000752  0.000586  0.596505  51.33
1     GR             0.000267  0.000147  0.430866  77.43

Table 13.8. Euro Bund volatility forecasts (5-step ahead)

Rank  Euro Bund      RMSE      MAE       Theil-U   CDC
2     GARCH          0.000181  0.000082  0.643119  72.57
3     ARretsqr       0.000181  0.000085  0.647681  68.58
8     ARretabs       0.009523  0.009447  0.972554  50.44
9     Kalman         0.011427  0.011423  0.977106  50.44
7     RMVol          0.001269  0.001097  0.830413  50.89
6     ImpliedVol     0.000981  0.000935  0.790505  50.89
5     HistoricalVol  0.000689  0.00059   0.733497  51.33
4     Comb           0.000285  0.000232  0.574355  56.64
1     GR             0.000176  0.00008   0.632275  72.12

Table 13.9. US T-Bond volatility forecasts (5-step ahead)

Rank  US T-Bond      RMSE      MAE       Theil-U   CDC
2     GARCH          0.000395  0.000215  0.553996  73.01
3     ARretsqr       0.000343  0.000184  0.608606  67.7
8     ARretabs       0.010492  0.010482  0.948484  48.23
9     Kalman         0.013364  0.013359  0.959272  48.23
7     RMVol          0.003234  0.002771  0.851943  48.23
6     ImpliedVol     0.003027  0.002905  0.84284   48.23
5     HistoricalVol  0.002059  0.001613  0.790041  50.44
4     Comb           0.000736  0.000584  0.594429  51.77
1     GR             0.000334  0.000192  0.549305  67.7

Appendix 3: Out-Of-Sample Forecasting Accuracy ('Mixed' Models)

Table 13.10. Euro Bund 'mixed' volatility forecasts (1-step ahead)

Rank  Euro Bund      RMSE      MAE       Theil-U   CDC
5     GARCH          0.000179  0.000083  0.638731  69.03
3     ARretsqr       0.000176  0.000079  0.628286  70.35
9     ARretabs       0.005799  0.005599  0.954802  50.44
10    Kalman         0.012364  0.012166  0.978412  50.44
8     RMVol          0.001251  0.001096  0.817902  50.44
7     ImpliedVol     0.000981  0.000933  0.790519  51.33
6     HistoricalVol  0.000691  0.000592  0.734971  50.89
1     NNR            0.000177  0.000078  0.645975  71.24
1     Comb           0.000177  0.00008   0.63804   71.24
4     GR             0.000174  0.000091  0.58051   66.37

Table 13.11. US T-Bond 'mixed' volatility forecasts (1-step ahead)

Rank  US T-Bond      RMSE      MAE       Theil-U   CDC
3     GARCH          0.00034   0.000185  0.597339  71.24
5     ARretsqr       0.000347  0.000231  0.518508  63.72
9     ARretabs       0.012561  0.012373  0.956281  48.23
10    Kalman         0.022507  0.022137  0.97502   48.23
7     RMVol          0.003186  0.002755  0.841026  48.23
8     ImpliedVol     0.003038  0.002917  0.84369   48.23
6     HistoricalVol  0.00205   0.0016    0.787344  48.67
2     NNR            0.000356  0.000168  0.72765   73.01
4     Comb           0.000339  0.000189  0.590397  69.47
1     GR             0.000272  0.000181  0.385698  70.35


Table 13.12. Euro Bund 'mixed' volatility forecasts (5-step ahead)

Rank  Euro Bund      RMSE      MAE       Theil-U   CDC
5     GARCH          0.000179  0.000082  0.641012  68.58
3     ARretsqr       0.000177  0.00008   0.631681  70.35
9     ARretabs       0.005818  0.005656  0.955258  50.44
10    Kalman         0.012363  0.012165  0.978412  50.44
8     RMVol          0.001269  0.001097  0.830413  50.89
7     ImpliedVol     0.000981  0.000935  0.790505  50.89
6     HistoricalVol  0.000689  0.00059   0.733497  51.33
1     NNR            0.000177  0.000076  0.657547  72.12
1     Comb           0.000177  0.000079  0.643539  72.12
4     GR             0.000179  0.000085  0.617037  69.47

Table 13.13. US T-Bond 'mixed' volatility forecasts (5-step ahead)

Rank  US T-Bond      RMSE      MAE       Theil-U   CDC
2     GARCH          0.00034   0.000184  0.60125   70.8
5     ARretsqr       0.000356  0.000246  0.524334  63.72
9     ARretabs       0.012558  0.01237   0.956282  48.23
10    Kalman         0.02253   0.022166  0.975047  48.23
8     RMVol          0.003234  0.002771  0.851943  48.23
7     ImpliedVol     0.003027  0.002905  0.84284   48.23
6     HistoricalVol  0.002059  0.001613  0.790041  50.44
3     NNR            0.00034   0.000185  0.599681  70.35
4     Comb           0.00034   0.000201  0.565347  68.58
1     GR             0.000339  0.000184  0.595656  71.68

References

[1] Akgiray, V. (1989), 'Conditional Heteroskedasticity in Time Series of Stock Returns: Evidence and Forecasts', Journal of Business, 62, 55-80.

[2] Andersen, T. and Lund, J. (1996), Stochastic Volatility and Mean Drift in the Short Term Interest Rate Diffusion: Sources of Steepness, Level and Curvature in the Yield Curve, Manuscript, Kellogg School, Northwestern University.

[3] Andersen, T. and Bollerslev, T. (1997), Answering the Critics: Yes ARCH Models do Provide Good Volatility Forecasts, Working Paper 227, Kellogg School, Northwestern University.

[4] Balkin, S. D. (1999), Stationarity Concerns When Forecasting Using Neural Networks, Presentation at the INFORMS Conference, Philadelphia, PA.


[5] Bartlmae, K. and Rauscher, F. A. (2000), 'Measuring DAX Market Risk: A Neural Network Volatility Mixture Approach', Presentation at the FFM 2000 Conference, London, 31 May-2 June.

[6] Bollerslev, T. (1986), 'Generalised Autoregressive Conditional Heteroskedasticity', Journal of Econometrics, 31, 307-27.

[7] Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992), 'ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence', Journal of Econometrics, 52, 5-59.

[8] Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1994), Time Series Analysis: Forecasting and Control, Prentice-Hall, Englewood Cliffs, New Jersey.

[9] Brailsford, T. and Faff, R. (1996), 'An Evaluation of Volatility Forecasting Techniques', Journal of Banking and Finance, 20, 419-38.

[10] Campbell, J. Y., Lo, A. W. and MacKinlay, A. C. (1997), The Econometrics of Financial Markets, Princeton University Press.

[11] Chiras, D. P. and Manaster, S. (1978), 'The Information Content of Option Prices and a Test of Market Efficiency', Journal of Financial Economics, 6, 213-34.

[12] Christoffersen, P. F. and Diebold, F. X. (1997), How Relevant is Volatility Forecasting for Financial Risk Management?, Wharton Financial Institutions Center Working Paper 97-45, 1, 12-22.

[13] Clemen, R. T. (1989), 'Combining Forecasts: A Review and Annotated Bibliography', International Journal of Forecasting, 5, 559-83.

[14] Deboeck, G. J. (1994), Trading on the Edge - Neural, Genetic and Fuzzy Systems for Chaotic Financial Markets, John Wiley & Sons, New York.

[15] Donaldson, R. G. and Kamstra, M. (1997), 'An Artificial Neural Network-GARCH Model for International Stock Return Volatility', Journal of Empirical Finance, 4, 17-46.

[16] Dunis, C. and Huang, X. (2002), 'Forecasting and Trading Currency Volatility: An Application of Recurrent Neural Regression and Model Combination', Journal of Forecasting, 21, 317-54.

[17] Dunis, C., Laws, J. and Chauvin, S. (2000), 'FX Volatility Forecasts: A Fusion-Optimisation Approach', Neural Network World, 10, 1/2, 187-202.

[18] Dunis, C., Laws, J. and Chauvin, S. (2003), 'FX Volatility Forecasts and the Informational Content of Market Data for Volatility', European Journal of Finance, 9, 3, 242-72.

[19] Dunis, C., Moody, J. and Timmermann, A. [eds.] (2001), Developments in Forecast Combination and Portfolio Choice, John Wiley & Sons, Chichester.

Modelling Benchmark Government Bonds Volatility

287

[20] Ederington, L. H. and Guan, W. (1999), Forecasting Volatility, Working Paper, University of Oklahoma. [21] Engle, R. F. (1982), ’Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation’, Econometrica, 4, 50, 987-1007. [22] Engle, R. F., Lilien, D. M. and Robins, R. P. (1987), ’Estimating Time Varying Risk Premia in the Term Structure: The ARCH-M Model’, Econometrica, 55, 391-407. [23] Figlewski, S. (1994), Forecasting Volatility Using Historical Data, Manuscript, Stern School of Business, New York University. [24] Figlewski, S. (1997), Forecasting Volatility, Financial Markets, Institutions, and Instruments Vol. 6, Stern School of Business, Blackwell Publishers, Boston. [25] Franses, P. H. and Van Homelen, P. (1998), On Forecasting Exchange Rates Using Neural Networks, Applied Financial Economics, 8, 589-96. [26] Gourieroux, C. and Monfort, A. (1995), Time Series and Dynamic Models , translated and edited by G. Gallo, Cambridge University Press, Cambridge. [27] Granger, C. W. and Ramanathan, R. (1984), ’Improved Methods of Combining Forecasts’, Journal of Forecasting , 3, 197-204. [28] Hamilton, J. (1994), Time Series Analysis, Princeton University Press, Princeton. [29] Harvey, A. C. and Shepherd, N. (1993), The Econometrics of Stochastic Volatility, LSE Discussion Paper n?166. [30] Harvey, A. C., Ruiz, E. and Shepherd, N. (1994), ’Multivariate Stochastic Variance Models’, Review of Economic Studies, 61, 247-64. [31] Haykin, S. (1999),Neural Networks: A Comprehensive Foundation , 2nd edition, Prentice-Hall, Englewood Cliffs, New Jersey. [32] Jorion, P. (1995), ’Predicting Volatility in the Foreign Exchange Market’, Journal of Finance, 50, 507-28. [33] J.P. Morgan (1996), RiskMetrics - Technical Document, 4th Edition, New York. [34] Kaastra, I. and Boyd, M. (1996), ’Designing a Neural Network for Forecasting Financial and Economic Time Series’, Neurocomputing, 10, 215-236. [35] Kingdon, J. (1997),Intelligent Systems and Financial Forecasting , Springer, London. [36] Kroner, K. F., Kneafsey, K. P. and Claessens, S. (1995), ’Forecasting Volatility in Commodity Markets’, Journal of Forecasting, 14, 77-95. [37] Kuan, C. M. and Liu, T. (1995), ’Forecasting Exchange Rates Using Feedforward and Recurrent Neural Networks’, Journal of Applied Economics , 10, 347-64.

288

Christian L. Dunis and Freda L. Francis

[38] Lamoureux, C. G. and Lastrapes, W. D. (1993), ’Forecasting Stock-Return Variances: Toward an Understanding of Stochastic Implied Volatilities’, Review of Financial Studies, 6, 293-326. [39] Latane, H. A. and Rendleman, R. J. (1976), ’Standard Deviations of Stock Price Ratios Implied in Option Prices’, Journal of Finance, 31, 369-81. [40] Mahmond, E. (1984), ’Accuracy in Forecasting’, Journal of Forecasting , 3, 139-60. [41] Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandwski, R., Newton, J., Parzen, E. and Winkler, R. (1982), ’The Accuracy of Time Series (Extrapolative) Methods: Results of a Forecasting Competition’, Journal of Forecasting , 1, 111-53. [42] Pindyck, R. S. and Rubinfeld, D. L. (1998), Econometric Models and Economic Forecasts, McGraw-Hill, New York. [43] Schwert, G. W. (1989a), ’Business Cycles, Financial Crises and Stock Volatility’, in K. Brunner and A. H. Meltzer [eds.], IMF Policy Advice, Market Volatility, Commodity Price Rules and Other Essays , North-Holland, Amsterdam, 82-126. [44] Schwert , G. W. (1989b), ’Why Does Stock Market Volatility Change Over Time?’, Journal of Finance, 44, 1115-54. [45] So, M. K. P., Lam, K. and Li, W. K. (1999), ’Forecasting Exchange Rate Volatility Using Autoregressive Random Variance Model’, Applied Financial Economics, 9, 583-91. [46] Taylor, S. J. (1986), Modelling Financial Time Series, John Wiley & Sons, Chichester. [47] Theil, H. (1966), Applied Economic Forecasting, North-Holland, Amsterdam. [48] West, K. D. and Cho, D. (1995), ’The Predictive Ability of Several Models of Exchange Rate Volatility’, Journal of Econometrics, 69, 367-91. [49] White, H. (1980), ’A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity’, Econometrica, 48, 817-38. [50] Xu, X. and Taylor, S. J. (1996), ’Conditional Volatility and the Informational Efficiency of the PHLX Currency Options Market’, in C. Dunis [ed.], Forecasting Financial Markets, John Wiley & Sons, Chichester, 181-200. [51] Zhang, G., Patuwo, B. E. and Hu, M. Y. (1998), ’Forecasting with Artificial Neural Networks: The State of the Art’, International Journal of Forecasting , 14, 35-62.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 289-309

ISBN: 978-1-61122-864-9 © 2012 Nova Science Publishers, Inc.

Chapter 14

Nonlinear Cointegration using Lyapunov Stability Theory

Raphael N. Markellos∗
Department of Management Science and Technology, Athens University of Economics and Business, Athens, Greece

14.1. Introduction

Although most of the research in modelling long-run relationships has concentrated on cointegration analysis and linear model specifications, a number of recent studies have been concerned with the theoretical and empirical relevance of nonlinearities. Nonlinear error-correction mechanisms can be accommodated rather straightforwardly within the cointegration analysis framework, in that residuals from some linear cointegration relationship enter a nonlinear error-correction model (see, for example, Balke and Fomby, 1997; Escribano and Granger, 1998; Swanson, 1999). However, nonlinear cointegration and, in general, nonlinear relationships between nonstationary variables bring about several economic and econometric problems which appear to be different to those associated with nonlinear error-correction (for a general overview see Granger, 1997). Although a number of testing and modelling procedures have recently been proposed for nonlinear cointegration (Granger and Hallman, 1991; Bierens, 1997; Park and Phillips, 2001; Corradi, Swanson and White, 2000), issues such as fractional cointegration, nonstationarity of cointegration error variance, and multivariate nonlinear cointegration have not been addressed under a sufficiently general framework. An attempt is made in this paper by considering the nonlinear cointegration approach of Granger and Hallman (1991) and Sephton (1994) and extending it using stochastic Lyapunov stability theory and neural network estimators. The paper also introduces a class of so-called nonpredictive cointegration models, which involve nonlinear cointegration relationships that do not allow predictions to be made. Section 14.2 describes the methodology used in this paper. Section 14.3 discusses an application of the proposed methodology in testing and modelling nonlinear cointegration for the UK Gilt-Equity ratio. The final section concludes the paper.

∗ E-mail address: [email protected]

14.2. Methodology

14.2.1. Basic Concepts

Although the concept of stationarity is central to the theory of modern time series econometrics, it is almost impossible to test for directly and can be defined only in terms of individual properties such as constancy of moments and memory.¹

Definition of Process Memory (Granger and Hallman, 1991). Consider the conditional probability density function of x_{t+h} given the information set I_t: x_{t−j}, Q_{t−j}, j ≥ 0, where Q_t is a vector of other explanatory variables. The series x_t will be said to be short memory in distribution (SMD) with respect to I_t if

    |Prob(x_{t+h} ∈ A | I_t ∈ B) − Prob(x_{t+h} ∈ A)| → 0          (14.1)

as h → ∞ for all appropriate sets A, B such that Prob(I_t ∈ B) > 0. The complement of SMD processes are called long memory in distribution (LMD). A narrower definition of memory can be made with respect to the mean of a process. Define the conditional mean as

    E(x_{t+h} | I_t) = ξ_{t,h}                                      (14.2)

so that ξ_{t,h} is the optimum least squares forecast of x_{t+h} using I_t. Then x_t is said to be short memory in mean (SMM) if

    lim_{h→∞} ξ_{t,h} = Ξ_t                                         (14.3)

where Ξ_t is a random variable with distribution D and D does not depend on I_t. The most interesting case is when D is singular, so that Ξ_t just takes a single value µ, which is the unconditional mean of x_t, assumed to be finite. Other cases include limit cycles and processes with strange, possibly fractionally dimensional, attractors. If ξ_{t,h} depends on I_t for all h, x_t is said to be extended memory in mean (EMM).

Definition of Stationarity (Granger and Teräsvirta, 1993, p. 9). Denote by x_{t,k} the section of a real-valued series x_{t−j}, j = 0, ..., k−1, so that the section contains k consecutive terms, and let p_{t,k}(x) denote the distribution function of x_{t,k}. A series x_t is said to be strongly stationary if p_{t,k}(x) is not a function of t for every finite k. Thus, a series is stationary if its generating mechanism is time invariant and if the series is SMD. A process is said to have weak stationarity if its mean and variance are constant over time and its autocovariances cov(x_{t+l}, x_t) depend only on l.

¹ This section draws heavily from Granger and Hallman (1991) and Granger and Teräsvirta (1993, Ch. 1).

14.2.2. Cointegration as an Attractor

If x_t and y_t are both I(1) variables, then in general the linear combination

    y_t = αx_t + z_t,   or   z_t = y_t − αx_t                      (14.4)

will also be I(1). In the case where z_t ∼ I(0), the two variables x_t and y_t are said to be (linearly) cointegrated (Granger, 1981; Engle and Granger, 1987). A consequence of Granger's Representation Theorem (GRT, Engle and Granger, 1987) is that cointegrated systems such as (14.4) have an error-correction representation:

    ∆y_t = −αz_{t−1} + ∑_{i=1}^{n} α_{1i} ∆y_{t−i} + ∑_{i=1}^{k} α_{2i} ∆x_{t−i}    (14.5)

where z_{t−1} is the lagged residual from the cointegrating regression (14.4), x_t, y_t are I(1) variables and ∆ is the simple difference operator. Consider the variables x_t and y_t under the more general assumption that they are EMM processes. Then, in general, the nonlinear combination

    y_t = f(x_t) + ε_t,   or   ε_t = y_t − f(x_t)                  (14.6)

will also be EMM. In the case where x_t and y_t are EMM and ε_t ∼ SMM, the two variables x_t and y_t are said to be nonlinearly cointegrated. This is a definition of nonlinear cointegration that follows Granger and Hallman (1991). These authors introduced the term attractor to describe the cointegrating function f. Granger and Teräsvirta (1993, pp. 53-59) considered a set M, by analogy to f, composed of EMM variables, and defined the signed Euclidean distances z_t^M, by analogy to ε_t, from this set. They then defined M to be an attractor if z_t^M is SMM with zero mean. Granger and Hallman (1991) employed the nonparametric Alternating Conditional Expectations (ACE) technique in estimating the nonlinear cointegrating regression f. This was later extended by Sephton (1994), who employed the more sophisticated Multivariate Adaptive Regression Spline (MARS). In the tradition of residual-based cointegration tests, Granger and Hallman (1991) proposed testing nonlinear cointegration on the basis of the memory properties of the nonlinear cointegration residuals. The authors employed standard Dickey-Fuller unit root tests and adapted the critical values for the ACE algorithm. The same approach was followed by Sephton (1994). Granger (1997) later expressed his concerns about residual-based approaches in cointegration: "Unfortunately, the theory is based on a number of assumptions which are not tested in practice, such as that the variance of the cointegrating vector, z_t, is constant or, at least, not growing with time. It is unclear what it means to say that there is a long-run equilibrium when the equilibrium error has a variance that is exploding over time, for example, if the process is I(0) in mean but I(1) in variance". This concern is also related to the possibility of fractional cointegration, where the residuals of the cointegrating system have a fractional, rather than unit, root and infinite variance. Although fractional cointegration has been discussed in the literature, testing procedures have been somewhat arbitrary (e.g., see Baillie and Bollerslev, 1994). One of the major disadvantages of the Granger and Hallman (1991) approach was that the methodological and conceptual framework could only accommodate bivariate and single-equation cointegrations. No indication was given as to how the approach could be extended to more general systems. However, Granger and Hallman (1991) admitted a nonrigorous treatment and a descriptive exposition of the related issues, which aimed to be a starting point for generalisations of linear cointegration. Some of the topics that remain open in this field are the development of a robust methodological approach for nonlinear cointegration, the adoption of a multivariate framework, and the accommodation of fractional cointegration and nonstationarity of the variance of cointegration errors. An attempt at addressing these topics is made in the following sections.
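For concreteness, a minimal sketch of the residual-based testing idea discussed in this section is given below, in the linear case (14.4), using simulated data; all numerical settings are illustrative and are not taken from any of the cited studies. Note the caveat in the comments: standard Dickey-Fuller critical values are not strictly valid for estimated residuals.

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(0)
    T = 500
    x = np.cumsum(rng.normal(size=T))            # an I(1) regressor
    y = 0.8 * x + rng.normal(size=T)             # cointegrated with x, as in (14.4)

    # Step 1: OLS cointegrating regression y_t = const + alpha * x_t + z_t
    X = np.column_stack([np.ones(T), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    z = y - X @ beta                             # cointegration residuals

    # Step 2: unit root test on the residuals; dedicated residual-based
    # critical values (e.g. MacKinnon's) should be used in practice
    stat, pvalue, *_ = adfuller(z, regression="n")
    print(f"ADF statistic on residuals: {stat:.3f} (naive p-value {pvalue:.3f})")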

14.2.3. Lyapunov Stability Theory and Stationarity

The definition of stationarity given in Section 14.2.1 is sufficient only for linear processes. In the case of a nonlinear process, it will just contain some of the necessary conditions for stationarity. Dechert and Gençay (1992) and Granger and Teräsvirta (1993), among others, have suggested that the concept of stability may be more useful in assessing the stationarity of nonlinear processes. Stability is concerned with the properties of a process with a specific generating mechanism after it has been running for a long period. A system of difference equations is strongly stable if every trajectory converges to a stationary state which can depend on the initial state of the trajectory. A weaker, but more practical, notion of stability (quasi-stability) requires that every system trajectory approaches asymptotically a bounded set of equilibrium points. There are various ways in which a process can be unstable. For example, it could be that the underlying probability distribution is explosive in mean, variance or higher moments. In general, if a vector ỹ_{t,k} = (y_t, y_{t−1}, ..., y_{t−k+1}) has been running for a long time and ỹ_{t,k} is stable for all k, then it will also be stationary. A very useful tool in examining the global stability of nonlinear deterministic dynamical systems is the theory proposed by the Russian mathematician Alexander M. Lyapunov, who was the first to understand clearly, at the turn of the twentieth century, the connection between the eigenvalues of the Jacobian matrix of partial derivatives of a dynamical system at an equilibrium and the local stability of the system. The definition of Lyapunov stability on the basis of the so-called Jacobian approach is given in the remainder of this section.² Consider the following deterministic difference equation:

    x_t = g_0(x_{t−1}, x_{t−2}, ..., x_{t−k}),   t = 1, ..., T     (14.7)

where g_0: R^k → R is a nonlinear dynamic mapping. The model (14.7) can be expressed in terms of a state vector Z_t = (x_t, x_{t−1}, ..., x_{t−k+1})′ ∈ R^k and a function G: R^k → R^k such that:

    Z_t = G(Z_{t−1}),   t = 1, ..., T                              (14.8)

² This definition is based on that given by Whang and Linton (1999).

Let J_t be the Jacobian of the map G in (14.8) evaluated at Z_t. Specifically, define:

    J_t = DG(Z_t) =
        ⎡ ∆g_{1,t}  ∆g_{2,t}  ···  ∆g_{m−1,t}  ∆g_{m,t} ⎤
        ⎢    1         0      ···      0           0    ⎥
        ⎢    0         1      ···      0           0    ⎥         (14.9)
        ⎢    ⋮         ⋮               ⋮           ⋮    ⎥
        ⎣    0         0      ···      1           0    ⎦

for t = 0, 1, ..., T − 1, where ∆g_{j,t} = D_{e_j} g_0(Z_t) are the partial derivatives of g_0 with respect to the j-th variable, j = 1, 2, ..., m, in which e_j = (0, ..., 1, ..., 0)′ ∈ R^k denotes the j-th elementary (or selection) vector. The following limit, with ‖·‖ being any matrix norm,

    λ = lim_{T→∞} (1/T) ln ‖ ∏_{t=1}^{T} J_{T−t} ‖   a.s.          (14.10)

if it exists, gives the dominant or largest Lyapunov exponent (LLE) of the dynamical system. If we denote by α_i(T, Z) the i-th largest eigenvalue of the matrix product ∏_{t=1}^{T} J_{T−t}, then

    λ_i = lim_{T→∞} (1/T) ln ‖α_i(T, Z)‖   a.s.,   i = 1, ..., m   (14.11)

gives the Lyapunov spectrum of the dynamical system. The above definition of Lyapunov exponents (LEs) can be readily extended to the case of stochastic stability for dynamic systems with system noise:

    x_t = g_0(x_{t−1}, x_{t−2}, ..., x_{t−m}) + ε_t,   t = 1, ..., T    (14.12)

where {ε_t} is a sequence of zero-mean iid variables with variance σ² and g_0 is an unspecified regression function. Model (14.12) can be expressed in state space form, as in (14.8), with the addition of the error vector U_t = (ε_t, 0, ..., 0)′ ∈ R^k. Alternatively, it is possible to observe the series x_t of (14.7) with measurement noise:

    y_t = x_t + ε_t,   t = 1, ..., T                               (14.13)

where {ε_t} is defined as previously. In the case of measurement noise we can find a function g_0^* for which (14.12) holds with x replaced by the observed data y. This means that asymptotically there is no loss of generality in restricting attention to system noise assumptions. The above definitions of Lyapunov stability can readily be extended to multivariate systems. Essentially, LEs measure the local rate of divergence or convergence between two trajectories averaged over the attractor. They are generalisations to nonlinear systems of the eigenvalues or roots of linear systems. The sum of all LEs, i.e., the logarithm of the trace of the matrix of system derivatives, indicates the overall stability of the system. If this sum is positive, it implies that the system is explosive. A zero sum indicates that the system conserves a constant volume in phase space, i.e., it is a conservative system. If the sum of LEs is negative, the system attractor tends to shrink in time, i.e., the system is dissipative. If the LLE is positive, it implies that two nearby trajectories exponentially diverge in space. Although this is often associated with a characteristic property of chaotic systems, namely sensitive dependence on initial conditions, fractional Brownian motions also exhibit positive LLEs (see, for example, Tsonis and Elsner, 1992). Based on the sign patterns of the real part of LEs, we can classify attractors as equilibrium points (fixed, stationary, rest, singular or critical points), periodic orbits (limit cycles) and quasiperiodic orbits (tori). Fixed point attractors will have all LEs negative (−, −, ..., −), limit cycles will have one zero LE and the rest negative (0, −, ..., −), and κ-periodic orbit attractors (T^κ tori) will have κ zero LEs and the rest negative (0, ..., 0, −, ..., −), with κ zeros (Medio, 1993, pp. 121-122). In all cases the attractor must be Lyapunov stable, i.e., the sum of the LEs must be less than zero, to ensure that the system does not have explosive dynamics. From the point of view of time series econometrics, McCaffrey et al. (1992) argued that a negative LLE means that all roots lie within the unit circle and that the solutions settle into stationary oscillations with bounded variance. When the LLE is positive, the variance diverges and there is no convergence to a stationary distribution. Whang and Linton (1999) argue that stationary linear autoregressions will have a negative LLE, while unit root processes will have an LLE equal to zero. Explosive autoregressions will have a positive LLE, and these series will diverge to infinity.
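To make the Jacobian definitions above concrete, the following sketch computes the Lyapunov spectrum of the Hénon map, a system whose Jacobian is known in closed form, by accumulating QR factorisations of the Jacobians along a trajectory. The map and its standard parameter values are a textbook example, not taken from the chapter.

    import numpy as np

    a, b = 1.4, 0.3
    T = 10_000
    z = np.array([0.1, 0.1])                 # current state (x_t, y_t)
    Q = np.eye(2)
    log_r = np.zeros(2)

    for _ in range(T):
        # Jacobian of the Henon map (x, y) -> (1 - a x^2 + y, b x)
        J = np.array([[-2.0 * a * z[0], 1.0],
                      [b, 0.0]])
        z = np.array([1.0 - a * z[0] ** 2 + z[1], b * z[0]])
        # QR update: accumulate log stretching factors along orthogonal directions
        Q, R = np.linalg.qr(J @ Q)
        log_r += np.log(np.abs(np.diag(R)))

    print("Estimated Lyapunov spectrum:", log_r / T)   # roughly (0.42, -1.62)

The sum of the two exponents is negative (ln b ≈ −1.20), so the system is dissipative, while the positive LLE signals sensitive dependence on initial conditions, as discussed above.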

14.2.4. Estimation of Lyapunov Exponents

A powerful technique for estimating LEs using the Jacobian approach is based on a nonparametric regression estimate of g_0 (see Dechert and Gençay, 1992; McCaffrey et al., 1992; Nychka et al., 1992; Whang and Linton, 1999). More specifically, let ĝ(·) be a consistent smoothing-based estimate of g_0, e.g., neural network, kernel, nearest neighbours, spline, local polynomial, etc. Dechert and Gençay (1996) proved that the estimated function ĝ(·) is topologically conjugate to the true function g_0. Moreover, they proved that the m LEs of the function ĝ(·) are the LEs of the function g_0. Dechert and Gençay (1992) argued that the LEs of g_0 can be calculated from the eigenvalues of J_t using the QR decomposition algorithm. They also proposed using a multi-layer perceptron (MLP) neural network in estimating the function ĝ(·). They found that this technique had good small sample properties in the presence of measurement and system noise. McCaffrey et al. (1992) and Nychka et al. (1992) also found that MLPs were the best regression method for small samples of data from chaotic systems with low levels of noise. They found that other methods, such as splines, radial basis functions and projection pursuit, had the disadvantage of not being robust against incorrect choices of model embedding dimension (the number of lags in (14.12)). Finally, they suggested choosing amongst alternative models and dimensions on the basis of some parsimony criterion.

One of the most important problems in the estimation of LEs using the Jacobian approach has been the possibility of obtaining spurious exponents through an incorrect choice of dimension or because of statistical problems. Most authors have adopted a trial-and-error approach where the dimension of the model is increased until the LEs reach a plateau while spurious exponents diverge. Analytical results by Gençay and Dechert (1996) showed that it is possible to obtain spurious LEs that are even larger than the LLE of the original system. Simulation results by Tanaka, Aihara and Masao (1998) showed that the Jacobian method for estimating LEs can give spurious positive exponents when applied to finite samples of random time series. They derived an upper bound for the LLE estimated by the local linear regression Jacobian approach. Lai and Chen (1998) studied the asymptotic distribution of Lyapunov exponents based on the asymptotic distribution theory of random matrices. They established a limit theorem for the conditional least-squares estimator of Lyapunov exponents, as well as the √T-consistency of the estimated LEs by the Jacobian method using conditional least squares regression. Whang and Linton (1999) developed an asymptotic theory for nonparametric kernel estimators of LEs from stochastic time series. More specifically, the authors proved the asymptotic normality of these estimators and showed that the convergence rate is √T under a general scenario, or T^{1/3} for processes with chaotic-like behaviour. Whang and Linton argued that their theory is applicable to any nonparametric estimator that converges in probability uniformly at a certain rate, including the MLP approach of Gençay and Dechert (1992) and the spline methods of Nychka et al. (1992). They calculated sample standard errors based on the distribution of LEs across nonoverlapping sub-samples of the complete sample available. Gençay (1996) proposed a moving-block bootstrap estimator of the Lyapunov exponents. Bask and Gençay (1998) developed a bootstrap-based one-sided test statistic for the Lyapunov exponent.

LEs have been for many years one of the standard tools in the analysis of systems of nonlinear differential equations and in control theory (e.g., see Cook, 1994). Lyapunov theory has become very popular in recent years due to the central position it holds within the rapidly developing field of chaos theory. Likewise, Lyapunov theory became known in economics through studies that tested for chaos in mathematical models and economic time series. More recently, Lyapunov stability theory has been applied to mathematical economics (Russel and Zecevic, 1998) and game theory (Deissenberg, 1991; Bomze and Weibull, 1995). Lyapunov exponents have also had interesting applications in econometrics: Bougerol and Picard (1992) proved that a necessary and sufficient condition for stationarity of GARCH processes and of generalised (random coefficient) AR models with non-negative coefficients is the negativity of the associated LLE. Duan (1997) proved that a negative LLE is a sufficient, but not necessary, condition for strict stationarity of augmented GARCH processes. As mentioned previously, LEs can be used to test for stability of NAR and NVAR systems. This is very important, since stability conditions are available only for some specific nonlinear models (e.g., see Granger and Teräsvirta, 1993, Ch. 4). In a related application, Dechert and Gençay (1992) applied Lyapunov stability theory to test the stability and stationarity of an autoregressive process.
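A rough sketch of the regression-based Jacobian approach follows, in the one-lag case where the Jacobian reduces to a scalar derivative: a neural network is fitted to lagged data from the logistic map and the LLE is estimated as the average log absolute derivative of the fitted map. The network size, sample length and finite-difference step are illustrative choices, not those of the studies cited above.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    r, T = 4.0, 2000
    x = np.empty(T)
    x[0] = 0.3
    for t in range(1, T):                  # logistic map data, x_t = g0(x_{t-1})
        x[t] = r * x[t - 1] * (1 - x[t - 1])

    # nonparametric estimate of g0 from the lagged data
    net = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                       solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(x[:-1].reshape(-1, 1), x[1:])

    # finite-difference derivative of the fitted map along the trajectory
    h = 1e-4
    deriv = (net.predict((x[:-1] + h).reshape(-1, 1)) -
             net.predict((x[:-1] - h).reshape(-1, 1))) / (2 * h)
    lle = np.mean(np.log(np.abs(deriv)))
    print(f"Estimated LLE: {lle:.3f} (true value for r = 4 is ln 2 = 0.693)")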

14.2.5. Cointegration and Lyapunov Stability

Consider the following dynamic regression:

    y_t = f(x_{t−i}, y_{t−i}) + ε_t                                (14.14)

where f is a smooth nonlinear function, i = 1, ..., n, and ε_t is a zero-mean iid random variable. Rather than looking at the residuals of this regression, we could evaluate the stability of the model using the eigenvalues. In other words, rather than looking at the transients of the attractor, we can concentrate on its Lyapunov stability. As argued previously, Lyapunov stability can be used to draw inferences about the stationarity of a system. Furthermore, considering the results of Bougerol and Picard (1992) and Duan (1997) on the stability of GARCH processes, Lyapunov stability also conveniently addresses the stationarity of the variance process. More specifically, nonlinear cointegration can be defined as the condition whereby a set of EMM series have a nonlinear combination that is characterised by a negative sum of LEs. If the nonlinear function has a positive LLE but a negative sum of LEs, then the variables are fractionally cointegrated.³ If the sum of LEs is zero, the variables are not cointegrated. A positive sum of LEs implies that the system is explosive. The equalities and inequalities in the above definitions are meant in a statistical sense and the LEs are assumed to be non-spurious.

³ Strictly speaking, a positive sum of LEs may also imply that the system is chaotic or explosive with bounds (see the examples by Bougerol and Picard, 1992). However, the present study considers that these are only theoretical possibilities, especially chaos, and will not be encountered in economic systems.

Equation (14.14) can readily be extended to include an arbitrary number of variables. The interpretation of LEs is the same as before. In general this dynamical system will have a sum of LEs equal to zero. In the case when the sum of LEs is negative, the variables can be considered to be nonlinearly cointegrated. It is proposed that all the nonlinear functions involved in nonlinear cointegration analysis should be approximated nonparametrically using MLP neural network models. The MLP is preferable to estimators such as ACE and MARS since it can cope with multivariate nonlinearities and can approximate a wide class of smooth functions with arbitrary accuracy (for a comprehensive review of the celebrated universal approximation properties of MLPs, see Scarselli and Tsoi, 1998). As mentioned previously, MLPs will be advantageous in estimating LEs for cointegrated systems such as (14.14). Moreover, MLPs can be applied in estimating nonlinear cointegrating functions such as (14.6). More specifically, using the notation of Granger and Teräsvirta (1993, p. 105) for a simple version of the MLP, the cointegrating regression in (14.4) can be represented as:

    y_t = α′ + ∑_{j=1}^{q} γ_j φ(β_j · x_t) + z_t,
    or                                                             (14.15)
    z_t = y_t − [ α′ + ∑_{j=1}^{q} γ_j φ(β_j · x_t) ]

where φ(z) is typically a bounded, monotonic function, the so-called 'squashing function', q is the number of nonlinear terms, and α, β, γ are model parameters. Hereafter, the following simpler representation of MLP functions such as (14.15) will be used:

    y_t = MLP_q(x_t) + z_t,   or   z_t = y_t − MLP_q(x_t)          (14.16)
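As an illustration of (14.16), the following sketch fits a single-hidden-layer network to simulated data and extracts the cointegration residuals z_t. Scikit-learn's MLPRegressor stands in for the nonlinear-least-squares MLP of the chapter, and the cube-root data-generating function is an arbitrary monotonic example, not one used by the author.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(2)
    T = 400
    x = np.cumsum(rng.normal(size=T))             # an I(1)/EMM input series
    y = np.cbrt(x) + 0.1 * rng.normal(size=T)     # y_t = f(x_t) + eps_t, as in (14.6)

    q = 1                                         # number of hidden units
    mlp = MLPRegressor(hidden_layer_sizes=(q,), activation="logistic",
                       solver="lbfgs", max_iter=10_000, random_state=0)
    mlp.fit(x.reshape(-1, 1), y)
    z = y - mlp.predict(x.reshape(-1, 1))         # cointegration residuals z_t
    print("residual standard deviation:", z.std())

In practice the residuals would then be tested for short memory, with critical values obtained by simulation, as in Section 14.3.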

Assuming for now that a nonlinear cointegration is predictive, residuals from the MLP cointegrating regression (14.16) could be used in linear error-correction models such as (14.5). However, since it is possible that the correction of transient behaviour to equilibrium is nonlinear, it is instructive to also consider nonlinear generalisations of error-correction models. The MLP can be used again in obtaining a general approximation of the nonlinear version of (14.5), as in Haefke and Helmenstein (1996):

    ∆y_t = MLP_q(z_{t−1}, ∆y_{t−i}, ∆x_{t−i})                      (14.17)

where i = 1, ..., n and z_{t−1} is a lagged residual from the MLP cointegrating regression (14.16). Moreover, it is possible to model the error-correction mechanism and the cointegrating restriction jointly in one model:

    ∆y_t = MLP_q(y_{t−1}, x_{t−1}, ∆y_{t−i}, ∆x_{t−i})             (14.18)

No cointegration residual is included in the regression since this will be estimated simultaneously with the error-correction model. Finally, it must be noted that simple linear differences such as ∆y_t = y_t − y_{t−1} may be misspecified and that a more general nonlinear difference operator is needed. An approximation to this operator can be obtained by an MLP AR model:

    y_t = MLP_q(y_{t−i}) + δ_t                                     (14.19)

where δ_t are the differences that should be used in models such as (14.17) and (14.18).

Granger's Representation Theorem and Nonpredictive Cointegration

Several researchers have assumed that the GRT will also apply to nonlinear cointegration and have built error-correction models based on the residuals of nonlinear cointegrating regressions such as (14.6). Although it was accepted that the theory of error-correction models for nonlinear attractors was still incomplete, no specific evidence was given to contradict the GRT. Granger and Teräsvirta (1993, p. 60) pointed out that a relevant problem is that if a series x_t is EMM then its simple difference ∆x_t will not necessarily be SMM. They suggested estimating the difference operator using a nonparametric function (as in equation (14.19)). Strictly speaking, the GRT will be invalid in the case of nonlinear cointegration, since the proof given by Engle and Granger (1987) depends heavily on linear matrix algebra. One of the studies that has looked directly at violations of the GRT is Ogaki (1998). More specifically, Ogaki considered the general case of the Wold representation and showed that, if the dimension of the innovations is smaller than that of the original series, then one of the conditions of the GRT is violated and the theorem does not hold. Based on this observation, the author provided a counter-example to the theorem where two processes are cointegrated but do not have an error-correction representation. Intuitively, it is possible to imagine a case of nonlinear cointegration where the GRT does not hold. In the case of linear cointegration the attractor is a single point and the error-correction mechanism returns the system to that point. However, nonlinear cointegration will not have a single equilibrium, and this implies that the error-correction mechanism will have predictive value only if the next equilibrium state is known in advance. This is because a nonlinearly cointegrated system may deviate from one equilibrium and then return to another equilibrium. Nonlinearly cointegrated variables that cannot be predicted on the basis of lagged error-correction terms will be called nonpredictive cointegrations.

Examples of nonpredictive cointegration systems can readily be constructed. For the remainder of this section we assume that y_t is a unit root process generated by:

    y_t = y_{t−1} + ω_t                                            (14.20)

Define ω_t, ε_t and u_t as zero-mean iid random variables and consider the following model:

    x_t = { δy_{t−1} + ε_t,   if y_{t−1} > c_t
          { y_{t−1} + ε_t,    if y_{t−1} ≤ c_t                     (14.21)

This is a case of threshold cointegration where the strength of cointegration depends on c_t. If c_t is a constant or deterministic function, e.g., a seasonal, then the nonlinear cointegration is predictive since the evolution of the threshold c_t can be predicted. However, if c_t follows some random process and if the cointegrating coefficient δ is significantly different from unity, then it is possible that the cointegration will be nonpredictive. Although the system will return to one of the two regimes, it may not be possible to predict the random threshold and hence the correct equilibrium regime. Another possible case of nonpredictive cointegration is the following nonconstant-coefficient model:

    x_t = α_t y_{t−1} + ε_t                                        (14.22)

where α_t = γα_{t−1} + u_t, |γ| ≤ 1, i.e., the cointegrating coefficient follows an autoregressive or unit root process. Considering the typically high frequency of structural breaks and business cycle variations in economic systems, the two examples of nonpredictive cointegration given above should not be thought of as mathematical sophisms. Nonpredictive cointegration may be part of a theory of cointegration under parameter variation and structural breaks that accepts long-run equilibrium relationships but rejects long-term predictability. Interesting examples of nonpredictive cointegration can be built on the basis of "exotic", for econometrics, mathematical functions. For example, consider the following "look-back" cointegration function:

    x_t = α max(y_{t−1}, ..., y_{t−i}) + ε_t,   i = 1, ..., n      (14.23)

where max(y_{t−1}, ..., y_{t−i}) is a function that returns the maximum of the i preceding y_t's. An example of clearly unpredictable cointegration can be constructed using a cosine function:

    x_t = α cos(y_{t−1}) + ε_t                                     (14.24)

Moreover, all trigonometric cointegration functions will be nonpredictive. Finally, another example of a cointegration function that will also be unpredictable is the following:

    x_t = α mod(y_{t−1}) + ε_t                                     (14.25)

where mod(y_{t−1}) is the modulo function, which returns the decimal part of y_{t−1}. Similar results would be obtained if mod is replaced by a truncation function that returns (part of) the integer part of y_{t−1}. Functions such as cos and mod are highly problematic since they can transform an EMM process to an SMM process and thus confuse tests of cointegration. These problems will also confuse LE-based analysis of cointegration. It should be emphasised that the examples of nonpredictive cointegration given in this section indicate cases where the GRT is violated and do not constitute counter-examples.
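The threshold example (14.20)-(14.21) with a random threshold can be simulated directly, as in the sketch below; δ and the noise scales are arbitrary illustrative values.

    import numpy as np

    rng = np.random.default_rng(3)
    T, delta = 1000, 0.5
    y = np.cumsum(rng.normal(size=T))        # random walk, as in (14.20)
    c = np.cumsum(rng.normal(size=T))        # random, unobservable threshold c_t
    eps = rng.normal(scale=0.1, size=T)

    x = np.empty(T)
    x[0] = 0.0
    x[1:] = np.where(y[:-1] > c[1:],
                     delta * y[:-1] + eps[1:],   # regime 1: y_{t-1} > c_t
                     y[:-1] + eps[1:])           # regime 2: y_{t-1} <= c_t

    # x_t tracks one of two linear attractors, but which regime holds next
    # period depends on the unpredictable threshold, so the error-correction
    # term has no forecasting value even though x and y never drift apart.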

14.3. Empirical Application

14.3.1. The Gilt-Equity Yield Ratio and Nonlinear Cointegration

One of the oldest and most widely used stock market forecasting techniques is the so-called Gilt-Equity Yield Ratio (GEYR), defined as (see Levin and Wright, 1998):

    GEYR = R_b / (d_e / P_e)                                       (14.26)

where R_b is the bond yield, d_e is the income stream (dividend) from equity and P_e is the price of equity. This ratio is also known as the Confidence Factor, a name coined by the famous technical analyst Ellinger (1955). In practice, bond yields are based on those of long-term government bonds (gilts) while the equity prices and associated dividends are based on those of a broad stock market index. The use of the GEYR assumes that the underlying variables will converge to some equilibrium value. Forecasting and trading assume that any deviations from the equilibrium will eventually be corrected. Deviations from equilibrium could arise through a variety of reasons, such as shocks, noise trading, speculative bubbles, estimation errors, market microstructures, etc. However, the correction to equilibrium may not occur instantaneously, e.g., due to transaction costs, price rigidities, etc. This implies that forecasts and trading signals may be generated only after the deviation from equilibrium exceeds some critical level. As noted by Clare, Thomas and Wickens (1994), buy-sell thresholds are formulated on the assumption that the GEYR has a normal long-run range reflecting a long-run arbitrage relation between gilt and equity markets. Investment advisors Hoare Govett (1991) argued that a GEYR value of less than 2 is usually taken as a buy signal while a value greater than 2.4 is usually taken as a sell signal. Recent studies by Clare, Thomas and Wickens (1994) and Levin and Wright (1998) suggested that the GEYR is still widely used by financial practitioners. Both these studies showed how the GEYR can be nested within a more formal theoretical model of arbitrage relations. Econometric analysis by Clare, Thomas and Wickens (1994) confirmed the empirical value of the GEYR in forecasting equity returns. The authors attributed this predictive power to the actuarial practices that encourage a disproportionate emphasis on yields rather than capital gains. Levin and Wright (1998) argued that the GEYR alone is likely to have poor forecasting value and that it should be corrected to account for variations in inflation. They found that a modified version of the GEYR was successful in tactical asset allocation between bonds and equity. Mills (1991) studied the GEYR and argued that it is consistent with elements of the portfolio theory developed by Tobin and his co-researchers. Moreover, he noted that the GEYR implies that in equilibrium the underlying variables will be in constant proportion to each other and took a logarithmic transform of (14.26):

    CF = R_b − d_e + P_e                                           (14.27)

Mills then observed that (14.27) formed a cointegrating relationship since it involved a stationary combination of nonstationary variables. Moreover, he noted that the GEYR implied that the absolute value of the coefficients from this cointegrating regression should be equal to unity. The author then estimated the following cointegrating regression:

    ε_t = α_1 R_b − α_2 d_e − α_3 P_e                              (14.28)

and confirmed that the variables were indeed cointegrated with slopes not significantly different from unity. Mills then used the lagged residuals from (14.28) in an error-correction model and thus confirmed the usefulness of the GEYR in predicting equity returns. Later, Markellos and Mills (1998) generalised this approach and argued that cointegration analysis and financial ratio analysis and forecasting share the same principles and that the former can be used to develop more formal and consistent approaches. They also noted that the intuition behind financial ratio analysis is that of cointegration, in that the objectives of the former are to construct a time-invariant statistic from a set of trending variables and to express information about multiple variables in a single variable. It can be argued that there exist theoretical reasons for nonlinearities in the relationships captured by the GEYR. In general, during periods of economic expansion the demand for funds increases and the equity market normally rises faster than long-term interest rates. As the business cycle peak approaches, the return on the equity market exceeds the long-term rate. During the subsequent recession, the equity market falls and this fall is normally faster than that of the long-term rate. In other words, the spread between the equity market return and long-term interest rates will change over the business cycle. The expected difference in the speed of adjustment between the two returns implies that the relationship between them is not constant, i.e., linear, but that they are nonlinearly related. Brock (1988) linked the absence of arbitrage profits in financial equilibrium with the theory of economic growth to show how nonlinear dynamics in the "dividend" process can be transmitted through the equilibrating mechanism to equilibrium asset prices. Yuhn (1996) and Diba and Grossman (1988) have shown, on the basis of the self-fulfilling model of expectations, that the validity of the Present Value model requires nonlinear cointegration between stock prices and dividends. Christie and Huang (1994) found that although no exploitable systematic empirical relation exists between dividend yields and risk-adjusted returns, their functional relationship is characterised by a rich variety of nonlinear shapes. Given the importance associated with the GEYR by financial practitioners and academics for forecasting, it is justified to research further in this area. The next section will search for nonlinearities in the GEYR using the nonlinear cointegration approach developed in Section 14.2.
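As a small illustration of (14.26) and the buy-sell thresholds quoted above, the following sketch computes the GEYR and the associated signals on simulated placeholder series (not the FTA data used in this chapter); the scales are arbitrary.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 120
    bond_yield = 8 + np.cumsum(rng.normal(scale=0.1, size=n))          # R_b (%)
    equity_price = 100 * np.exp(np.cumsum(rng.normal(scale=0.03, size=n)))
    dividend = 0.04 * equity_price * np.exp(rng.normal(scale=0.05, size=n))

    # GEYR = bond yield divided by the dividend yield, as in (14.26)
    geyr = bond_yield / (dividend / equity_price * 100)
    signal = np.where(geyr < 2.0, "buy", np.where(geyr > 2.4, "sell", "hold"))
    print(list(zip(np.round(geyr[:5], 2), signal[:5])))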

Figure 14.1. Logarithms of prices, dividends and gilts from 1965-1995

14.3.2. Modelling the Gilt-Equity Ratio with Nonlinear Cointegration

Following Mills (1991), the GEYR was studied using end-month observations from January 1965 to December 1995 on the FT-Actuaries 500 equity index, the associated dividend index, and the Par Yield on 20 year British Government stocks. The variables were analysed in logarithms and are denoted as P, D and R, respectively. A graph of the co-scaled variables can be seen in Figure 14.1. Mills (1991) studied the GEYR using data after 1969 and found that the three variables were I(1) and cointegrated with a single cointegrating vector. He also identified and modelled ARCH effects in the error-correction model and considered the possibility of other nonlinearities in the process.

Table 14.1. Dickey Fuller and Augmented Dickey Fuller unit root tests. Star (*) denotes significant at the 5% level. The critical values are -3.4236 and -2.8695 for the trend and no trend case, respectively. The ADF test is performed for one lag.

              DF                          ADF(1)
         No Trend      Trend         No Trend      Trend
    P     -0.1262     -2.2645         -0.2297     -2.5658
    D      1.297      -2.3498          1.204      -2.3212
    R     -1.8172     -1.6485         -2.1962     -2.0544
    ∆P   -16.9718*   -16.9635*       -14.3669*   -14.3644*
    ∆D   -18.2849*   -18.3974*       -11.7082*   -11.8271*
    ∆R   -14.2467*   -14.3317*       -12.6073*   -12.7154*
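The statistics in Table 14.1 are standard DF/ADF regressions. A brief sketch of how such statistics can be computed, with and without a trend, is given below using statsmodels on a simulated placeholder series rather than the FTA data.

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(6)
    p = np.cumsum(rng.normal(size=372))      # placeholder log price level, I(1)

    for name, series in [("P", p), ("dP", np.diff(p))]:
        for lags, test in [(0, "DF"), (1, "ADF(1)")]:
            for reg, label in [("c", "no trend"), ("ct", "trend")]:
                stat = adfuller(series, maxlag=lags, autolag=None,
                                regression=reg)[0]
                print(f"{test:>6} {label:>8} {name}: {stat:9.4f}")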

The parametric Dickey Fuller (DF) and Augmented Dickey Fuller (ADF) unit root tests (Dickey and Fuller, 1979) were used to test for stationarity. The results, given in Table 14.1, indicate that all three variables are I(1) with I(0) differences. The ADF test was performed for one lag and similar results were obtained if additional lags were included. Since the DF and ADF tests may be misspecified, the semi-parametric Geweke/Porter-Hudak (1983) fractional integration test and the Dechert-Gençay (1992) stability test were also performed and gave similar results.⁴ A linear cointegrating relationship was first estimated using OLS. Then, following the attractor approach of Granger and Hallman (1991) and Sephton (1994), a nonlinear cointegration function was estimated using an MLP neural network. The MLP was estimated using nonlinear least squares and a single hidden unit. The τ_µ test was then used on the residuals from each regression to test the null hypothesis of non-cointegration (see Engle and Granger, 1987). As shown in Table 14.2, the null hypothesis cannot be rejected if a linear cointegration function is used. Moreover, the restrictions implied by the CF on the parameters of the linear cointegrating function are not satisfied. These results do not contradict the validity of those obtained by Mills (1991) since a longer sample is used here. As noted by Granger and Hallman (1991), the standard critical values of τ_µ cannot be used if the cointegration function is nonlinear. Granger and Hallman (1991) and Sephton (1994) used Monte Carlo simulation to estimate critical values for their specific nonlinear cointegration functions. This estimation was based on the assumption that the underlying variables were I(1).

⁴ Results are available upon request from the author.

Table 14.2. τ_µ residual tests of cointegration. * Significant at the 5% level. The critical values for the cointegration test statistic (Engle and Granger, 1987) were calculated for 3 variables and 372 observations using the software by MacKinnon (1996). The 5% and 1% critical values are -4.1505 and -4.7127, respectively. The test statistic of the linear cointegration function is significant at the 17.105% level.

    OLS linear cointegrating regression:
        P_t = 0.1018 + 0.8132 D_t − 0.1601 R_t + ε_t        τ_µ = -3.598474
    MLP nonlinear cointegrating regression:
        P_t = MLP_1(D_t, R_t) + z_t                         τ_µ = -4.55302*

Figure 14.2. Distribution of 50 bootstrap test statistics for MLP

Here, bootstrap simulation was used to estimate the critical values of the τ_µ test statistic for MLPs with a single hidden unit. This approach is more robust to deviations of the underlying variables from pure I(1) processes. Due to severe computational constraints, the critical values of τ_µ were computed from 50 bootstrap estimates. The MLPs were estimated via nonlinear least squares using 372 independently bootstrapped observations for each one of the three variables under study. The distribution of the bootstrap τ_µ test statistics, depicted in Figure 14.2, had a mean of -2.597958 and a variance of 1.239699. If a normal distribution is assumed for the bootstrap estimates of τ_µ, which is supported by Figure 14.2 and a Jarque-Bera normality test, then the standard error of the average test statistic can be calculated as 0.17532. This implies that the 5% and 1% critical values are -2.941585 and -3.049553, respectively. As an indication of the nonlinearities involved in the attractor, the response surface of the estimated MLP cointegration function is depicted in Figure 14.3. These results support the case of nonlinear cointegration for the log levels of the variables that compose the CF and justify further investigations. The possibility of nonlinear cointegration was further tested using the Lyapunov stability methodology proposed in this paper.
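Schematically, the bootstrap construction just described can be written as below. The function fit_mlp_and_tau is a hypothetical placeholder for the MLP cointegrating regression plus the residual τ_µ statistic (it is not defined in the chapter), and the normal-approximation cut-offs mirror the calculation in the text.

    import numpy as np

    def bootstrap_tau_cutoffs(p, d, r, fit_mlp_and_tau, n_boot=50, seed=0):
        """5% and 1% tau_mu cut-offs from independently resampled data."""
        rng = np.random.default_rng(seed)
        n = len(p)
        stats = np.array([
            fit_mlp_and_tau(p[rng.integers(0, n, n)],   # independent resampling
                            d[rng.integers(0, n, n)],   # of each variable,
                            r[rng.integers(0, n, n)])   # as in the text
            for _ in range(n_boot)
        ])
        se_mean = stats.std(ddof=1) / np.sqrt(n_boot)
        return stats.mean() - 1.96 * se_mean, stats.mean() - 2.576 * se_mean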

Figure 14.3. Response surface of MLP cointegration function

The LE spectrum was obtained from the following dynamic model:

    P_t = MLP_1(P_{t−1}, D_{t−1}, R_{t−1}) + ε_t                   (14.29)

The LEs were calculated as before using the Jacobian approach on an MLP with a single hidden unit. In order to obtain a sense of statistical significance, the block sampling estimation scheme of Whang and Linton (1999) was employed. More specifically, the LEs of (14.29) were estimated over 10 nonoverlapping sub-samples of the original 372 observations. The results of Whang and Linton (1999) suggest that the LEs obtained from this method will be asymptotically normally distributed and have a square root convergence rate. The LE estimates obtained were 0.035865 (0.068241), -0.17101 (0.04297) and -0.19513 (0.042108), with standard errors given in parentheses. These results imply that the regression (14.29) is Lyapunov stable with a sum of LEs significantly less than zero. Thus, following the definition given previously, the system is said to be nonlinearly cointegrated. Since the LLE is not significantly above zero, there is no evidence of fractional cointegration. The next step in the analysis examines whether the nonlinear cointegration relationship under study is predictive. This can be done by examining the predictive power of the lagged residuals z_t from the MLP cointegrating regression described in Table 14.2. Differenced FTA prices were chosen as the regressand in this error-correction model. This model assumes that the simple difference operator is applicable to the variables under study and that the FTA return is one of the variables that can be predicted using the error-correction mechanism. The error-correction model (ECM) employed the SIC criterion to choose the lag structure of the differenced gilts and dividends. The performance of the ECM was compared to a simple dynamic regression model (DRM) that excluded the error-correction term.
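A schematic of the nonoverlapping sub-sample scheme used here for the LE standard errors is sketched below. The function estimate_les is a hypothetical placeholder for the MLP/Jacobian estimator applied to one block, and the simple standard-error formula mirrors the description in the text rather than the exact Whang and Linton (1999) construction.

    import numpy as np

    def subsample_les(series, n_blocks, estimate_les):
        """Estimate LEs on each block; return their mean and standard error."""
        blocks = np.array_split(series, n_blocks)
        les = np.array([estimate_les(block) for block in blocks])  # (n_blocks, m)
        return les.mean(axis=0), les.std(axis=0, ddof=1) / np.sqrt(n_blocks)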

Initial OLS estimation of the two models faced problems of heteroskedastic residuals, and this led to the use of ARCH models in modelling the residual variance process. For both models, a GARCH(1,1) specification was found to offer the best description. The OLS/GARCH(1,1) estimates of the DRM and ECM, respectively, along with descriptive statistics, are presented in Table 14.3. The first conclusion that can be drawn is that the nonlinear cointegration under study appears to be predictive, since the lagged cointegration residuals enter with high significance in the ECM. Moreover, the ECM performs significantly better than the DRM in terms of all criteria considered. More specifically, the ECM explains more of the variation in FTA returns, with an R² almost 4% higher than that of the DRM. Moreover, the ECM has a sign prediction error around 3% smaller than that of the DRM. This is important because Satchell and Timmermann (1995) have proved that standard forecasting evaluation criteria, based on mean squared (MSE) and absolute (MAE) errors, are not necessarily suited to the evaluation of the economic value of predicting nonlinear processes. They also showed that the probability of predicting the sign of a stochastic variable need not be a decreasing function of the MSE if the predicted value and error are dependent. This implies that statistical forecast significance does not necessarily imply economic significance and that sign prediction accuracy is a more appropriate measure of economic performance.
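A hedged sketch of an error-correction mean equation with GARCH(1,1) errors, of the kind just described, is given below using the arch package on simulated placeholder data; the regressor set and scales are illustrative only and are not the chapter's FTA series.

    import numpy as np
    from arch import arch_model

    rng = np.random.default_rng(5)
    n = 372
    dy = rng.normal(scale=0.03, size=n)       # placeholder FTA returns
    z_lag = rng.normal(size=n)                # placeholder lagged ECM residual
    dx = rng.normal(size=(n, 2))              # placeholder lagged differences

    # least-squares mean equation with exogenous regressors, GARCH(1,1) errors
    exog = np.column_stack([z_lag, dx])
    model = arch_model(dy, x=exog, mean="LS", vol="Garch", p=1, q=1)
    result = model.fit(disp="off")
    print(result.summary())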

Table 14.3. In-sample comparison of model performance. Sign error measures the percentage error of predicting the sign of the next return; AIC is the Akaike (1974) Information Criterion; SIC is the Schwarz (1978) Information Criterion; JB is the Jarque-Bera (1980) normality test.

Also, the ECM provides a more parsimonious representation of the data according to both AIC and SIC. Finally, as suggested by the Jarque-Bera normality test statistic, the ECM appears better specified, with residuals that are much closer to a normal distribution than those of the DRM. Although these results suggest that the ECM has the better in-sample performance, it would be desirable to compare the out-of-sample performance of the models. This is because the nonlinear models underlying the ECM may have overfitted the estimation data and could thus have very poor out-of-sample performance. The first 310 observations were chosen as the initial estimation sample, and two different types of out-of-sample forecasts were then made using the next 60 observations (5 years). First, static forecasts were made using actual rather than forecasted values. Second, dynamic forecasts were made by calculating forecasts for periods after the first period in the sample using the previously forecasted values of the lagged dependent variable.

Table 14.4. Out-of-sample comparison of model performance

                                   Static forecasting      Dynamic forecasting
                                   DRM        ECM          DRM        ECM
    Root Mean Square Error         0.031934   0.030215     0.032563   0.030681
    Mean Absolute Error            0.025543   0.02423      0.026315   0.02463
    Mean Absolute Percent Error    220.86     232.8691     241.7545   244.3325
    Theil Inequality coefficient   0.568651   0.540833     0.582875   0.555275
      - Bias Proportion            0.003282   0.008801     0.002386   0.006642
      - Variance Proportion        0.171098   0.219178     0.169602   0.22829
      - Covariance Proportion      0.825619   0.772021     0.828012   0.765068

The results, given in Table 14.4, suggest that the performance of the ECM is not due to overfitting, since the model still outperforms the DRM out-of-sample on the basis of three out of the four evaluation criteria used. However, it is evident that the ECM performs only marginally better than the DRM.

Conclusions

This paper has discussed a general approach to testing and modelling nonlinear cointegration relationships. The methodology used is nonparametric and extends the attractor approach of Granger and Hallman (1991). The testing procedures proposed are based on Lyapunov stability theory for stochastic dynamical systems and have the advantage of being able to account for complicated nonlinear behaviour such as fractional cointegration, nonstationarity in variance and limit cycles. MLP neural network models are proposed as nonparametric estimators of nonlinear cointegration, error-correction and differencing operators, respectively. An empirical application of the proposed methodology looked at the gilt-equity ratio, as defined by the relationship between the FTA index, the associated dividend index and the par yields on 20 year British Government stocks, using monthly data from January 1965

to December 1995. The analysis found that these variables are nonstationary but have a nonlinear combination that is consistent with the Lyapunov stability definition of nonlinear cointegration. An analysis of the error-correction models from the nonlinear cointegration suggested that the cointegration was predictive and could be used to forecast FTA returns. Moreover, it was found that the in- and out-of-sample performance of the associated error-correction models was better than that of a simple dynamic regression.

References

[1] Akaike, H. (1974) "A New Look at the Statistical Model Identification", IEEE Transactions on Automatic Control, AC-19, pp. 716-723.
[2] Baillie, R.T., Bollerslev, T. (1994) "Cointegration, Fractional Cointegration, and Exchange Rate Dynamics", Journal of Finance, 49, pp. 737-745.
[3] Balke, N.S., Fomby, T.B. (1997) "Threshold Cointegration", International Economic Review, 38, pp. 627-645.
[4] Bask, M., Gençay, R. (1998) "Testing chaotic dynamics via Lyapunov exponents", Physica D, 114, pp. 1-2.
[5] Bierens, H.J. (1997) "Nonparametric cointegration analysis", Journal of Econometrics, 77, pp. 379-404.
[6] Bomze, I.M., Weibull, J.W. (1995) "Does neutral stability imply Lyapunov stability", Games and Economic Behaviour, 11, pp. 173-192.
[7] Bougerol, P., Picard, N. (1992) "Stationarity of GARCH processes and of some nonnegative time series", Journal of Econometrics, 52, pp. 115-127.
[8] Brock, W.A. (1988) "Nonlinearity and Complex Dynamics in Economics and Finance", in Anderson, P., Arrow, K., Pines, D., eds., The Economy as an Evolving Complex System, pp. 77-97, Reading, MA: SFI Studies in the Sciences of Complexity.
[9] Christie, W.G., Huang, R.D. (1994) "The changing functional relation between stock returns and dividend yields", Journal of Empirical Finance, 1, pp. 161-191.
[10] Clare, A.D., Thomas, S.H., Wickens, M.R. (1994) "Is the Gilt-Equity Yield Ratio Useful for Predicting Stock Returns?", Economic Journal, 104, pp. 303-315.
[11] Cook, P.A. (1994) Nonlinear Dynamical Systems, 2nd edition, New York: Prentice Hall.
[12] Corradi, V., Swanson, N.R., White, H. (2000) "Testing for Stationarity-Ergodicity and for Comovements Between Nonlinear Discrete Time Markov Processes", Journal of Econometrics, 96, pp. 39-73.
[13] Deissenberg, C. (1991) "Robust Lyapunov games - the continuous-time case", Lecture Notes in Economics and Mathematical Systems, 353, pp. 65-83.
[14] Diba, B.T., Grossman, H.I. (1988) "Explosive rational bubbles in stock prices?", American Economic Review, 78, pp. 520-529.
[15] Dickey, D.A., Fuller, W.A. (1979) "Distribution of the Estimators for Autoregressive Time Series with a Unit Root", Journal of the American Statistical Association, 74, pp. 427-431.
[16] Duan, J.C. (1997) "Augmented GARCH(p,q) process and its diffusion limit", Journal of Econometrics, 79, pp. 97-127.
[17] Ellinger, A.G. (1955) The Art of Investment, London: Bowes & Bowes.
[18] Engle, R.F., Granger, C.W.J. (1987) "Cointegration and error correction: Representation, estimation and testing", Econometrica, 55, pp. 251-276.
[19] Escribano, A., Granger, C.W.J. (1998) "Investigating the Relationship between Gold and Silver Prices", Journal of Forecasting, 17, pp. 81-107.
[20] Gençay, R. (1996) "A statistical framework for testing chaotic dynamics via Lyapunov exponents", Physica D, 89, pp. 261-266.
[21] Gençay, R., Dechert, W.D. (1996) "The Identification of Spurious Lyapunov Exponents in Jacobian Algorithms", Studies in Nonlinear Dynamics and Econometrics, 1, pp. 145-154.
[22] Geweke, J., Porter-Hudak, S. (1983) "The Estimation and Application of Long Memory Time Series Models", Journal of Time Series Analysis, 4, pp. 221-238.
[23] Granger, C.W.J. (1997) "On Modelling the Long Run in Applied Economics", Economic Journal, 107, pp. 169-177.
[24] Granger, C.W.J., Hallman, J.J. (1991) "Long-Memory Processes with Attractor", Oxford Bulletin of Economics and Statistics, 53, pp. 11-26.
[25] Granger, C.W.J., Teräsvirta, T. (1993) Modelling Nonlinear Economic Relationships, New York: Oxford University Press.
[26] Haefke, C., Helmenstein, C. (1996) "Forecasting Austrian IPOs: An Application of Linear and Neural Network Error-Correction Models", Journal of Forecasting, 15, pp. 237-251.
[27] Hoare Govett (1991) "UK Market Prospects for the Year Ahead", in Equity Market Strategy, London: Hoare Govett.
[28] Jarque, C.M., Bera, A.K. (1980) "Efficient Tests for Normality, Homoskedasticity and Serial Dependence of Regression Residuals", Economics Letters, 6, pp. 255-259.
[29] Lai, D., Chen, G. (1998) "Statistical Analysis of Lyapunov Exponents from Time Series: A Jacobian Approach", Mathematical and Computer Modelling, 27, pp. 1-9.
[30] Levin, E.J., Wright, R.E. (1998) "The information content of the gilt-equity yield ratio", The Manchester School Supplement, 25, pp. 89-101.
[31] MacKinnon, J.G. (1996) "Numerical Distribution Functions for Unit Root and Cointegration Tests", Journal of Applied Econometrics, 11, pp. 601-618.
[32] Markellos, R.N., Mills, T.C. (1998) "Complexity Reduction for Co-Trending Variables", Journal of Computational Intelligence in Finance, 6, pp. 6-13.
[33] McCaffrey, D., Ellner, S., Gallant, A.R., Nychka, D. (1992) "Estimating Lyapunov Exponents with Nonparametric Regression", Journal of the American Statistical Association, 87, pp. 682-695.
[34] Medio, A. (1993) Chaotic Dynamics: Theory and Applications to Economics, Cambridge: Cambridge University Press.
[35] Mills, T.C. (1991) "Equity Prices, Dividends and Gilt Yields in the UK: Cointegration, Error-correction and 'Confidence'", Scottish Journal of Political Economy, 38, pp. 242-255.
[36] Nychka, D., Ellner, S., Gallant, A.R., McCaffrey, D. (1992) "Finding Chaos in Noisy Systems", Journal of the Royal Statistical Society B, 54, pp. 399-426.
[37] Ogaki, M. (1998) "On the Granger Representation Theorem: a counter example?", Economics Letters, 60, pp. 19-21.
[38] Park, J.Y., Phillips, P.C.B. (2001) "Nonlinear Regression with Integrated Time Series", Econometrica, 69, pp. 117-161.
[39] Russel, T., Zecevic, A. (1998) "Lyapunov stability, regions of attraction, and indeterminate growth paths", Economics Letters, 58, pp. 319-324.
[40] Satchell, S., Timmermann, A. (1995) "An Assessment of the Economic Value of Nonlinear Exchange Rate Forecasts", Journal of Forecasting, 14, pp. 477-497.
[41] Scarselli, F., Tsoi, C. (1998) "Universal Approximation Using Feedforward Neural Networks: A Survey of Some Existing Methods and Some New Results", Neural Networks, 11, pp. 15-37.
[42] Schwarz, G. (1978) "Estimating the dimension of a model", Annals of Statistics, 6, pp. 461-464.
[43] Sephton, P.S. (1994) "Cointegration Tests on MARS", Computational Economics, 7, pp. 23-35.
[44] Swanson, N.R. (1999) "Finite Sample Properties of a Simple LM Test for Neglected Nonlinearity in Error-Correcting Regression Equations", Statistica Neerlandica, 53, pp. 76-95.
[45] Tanaka, T., Aihara, K., Masao, T. (1998) "Analysis of Lyapunov exponents from random time series", Physica D, 111, pp. 42-50.
[46] Tsonis, A.A., Elsner, J.B. (1992) "Nonlinear prediction as a way of distinguishing chaos from random fractal sequences", Nature, 358, pp. 217-220.
[47] Whang, Y.J., Linton, O. (1999) "The asymptotic distribution of nonparametric estimates of the Lyapunov exponents for stochastic time series", Journal of Econometrics, 91, pp. 1-42.
[48] Yuhn, K.-H. (1996) "Stock price volatility: tests for linear and nonlinear cointegration in the present value model of stock prices", Applied Financial Economics, 6, pp. 487-494.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 311-332

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 15

ACTIVE P ORTFOLIO M ANAGEMENT: T HE P OWER OF THE T REYNOR -B LACK M ODEL Alex Kane† , Tae-Hwan Kim‡,∗ and Halbert White §,† † Graduate School of International Relations and Pacific Studies (IR/PS) University of California, San Diego, La Jolla, CA, US ‡ School of Economics, Yonsei University, South Korea § Department of Economics, University of California, San Diego, La Jolla, CA, US

15.1. Introduction The presumption of market efficiency is inconsistent with the existence of a vast industry engaged in active portfolio management. Grossman and Stiglitz (1980) derive an informationinefficient capital market equilibrium based on the cost of information and the fact that portfolio managers cannot observe the asset allocations of competitors. Treynor and Black (1973) propose a model to construct an optimal portfolio under such conditions, when security analysts forecast abnormal returns on a limited number of securities. The optimal portfolio is achieved by mixing a benchmark portfolio with an active portfolio constructed from the securities covered by the analysts. The original model assumes that residuals from the market model are uncorrelated across stocks (the diagonal version), but it can easily be extended to account for non-zero covariance across residuals (the covariance version). The efficiency of the Treynor Black (TB) model depends critically on the ability to predict abnormal returns. Its implementation requires that security analyst forecasts be subjected to statistical analysis and that the properties of the forecasts be explicitly used when ∗ Kim’s research was supported by National Research Foundation of Korea - Grant funded by the Korean Government (NRF-2009-327-B00088. For further information, email: [email protected]; URL: http://web.yonsei.ac.kr/thkim/); tel: +82-2-2123-5461; fax: +82-2-2123-8638. † E-mail address: [email protected]

312

Alex Kane, Tae-Hwan Kim and Halbert White

new forecasts are input to the optimization process. It follows that security analysts must submit quantifiable forecasts and that they will be exposed to continuous, rigorous tests of their individual performance. The entire portfolio is also continuously subjected to performance evaluation that may engender greater exposure of managers to outside pressures. The TB model appears to have had little impact despite encouraging reports; e.g., Hodges and Brealey (1973), Ambachtsheer (1974, 1977), Ferguson (1975), Ambachtsheer and Farrell (1979), Dunn and Theisen (1983), Ippolito (1989), Goetzman and Ibbotson (1991), Kane et al. (1999), to mention a few of the listed references. Although theoretically compelling, the model has not been widely adopted by investment managers. We suspect that portfolio managers and security analysts are reluctant to subject their analysis to rigorous tests. This attitude may owe in no small measure to the belief of many renowned scholars that the forecasting ability of most analysts is below the threshold needed to make the model useful. This paper aims to identify this threshold, and to lower it by using effective statistical methods. Optimal portfolios are constructed with actual forecasts of abnormal returns and beta coefficients obtained from a financial institution. We apply various ways of shrinking a robust estimator toward a data-dependent point to identify and utilize predictive power. Since distributions of abnormal returns are fat-tailed, we choose the Least Absolute Deviations (LAD) estimator as a benchmark. The quality of estimates of stock betas affects the accuracy of the estimates of realized abnormal returns, needed to measure the bias and precision of the forecasts. We also use Dimson’s (1979) aggregate coefficients method to account for infrequent trading, The analyst forecasts in our database show correlations between forecasts and realizations of abnormal returns on the order of 0.04, and predictive ability generally declines over the sample period, perhaps due to changing market conditions. The forecasts are biased and forecast errors are asymmetric and correlated across stocks. Nevertheless, the application of shrinkage LAD estimation to the TB model results in superior performance in both the diagonal and covariance versions. Portfolios based on OLS estimates are dominated by the those that use LAD and shrinkage LAD estimators. The paper is organized as follows. Section 15.2. presents the TB framework and Section 15.3. describes the forecast data and sampling procedure. Section 15.4. elaborates on the estimation of beta coefficients and abnormal returns from realized stock returns, and Section 15.5. treats the calibration of forecasts from the history of forecasting records. Section 15.6. lays out the out-of-sample test procedures and Section 15.7. reports on portfolio performance. Section 15.8. provides a summary and conclusions.

15.2. The Treynor-Black Framework To fix ideas and introduce notation, we briefly describe the model 1 . Treynor and Black (1973) deal with a scenario in which the mean-variance criterion (the Sharpe ratio) is used by investors; a specified market index is taken as the default efficient (passive) strategy, and the security analysts of a portfolio management firm cover a limited number of securities. Under these conditions, securities that are not analyzed are assumed to be efficiently priced, 1A

more elaborate description can be found in Bodie et al. (2001).

Active Portfolio Management

313

and a portfolio of only the covered securities cannot be efficient. The optimal portfolio must be a mix of the covered securities and the index portfolio. TB identify the portfolio of only the covered securities (the efficient Active Portfolio, A) that can be mixed with the index (Passive Portfolio, M) to obtain the optimal risky portfolio 2. With a risk-free asset (or a zero beta portfolio) whose rate of return is denoted r f , the weights of A and M in the optimal risky portfolio, P, which maximizes the Sharpe ratio (SP = RP /σP ) are given by wA =

RA σ2M − RM Cov(RA , RM ) ; wM = 1 − wA , RA σ2M + RM σ2A − (RA + RM )Cov(RA , RM )

(15.1)

where R is the expected excess return, E(r) − r f . Assuming the diagonal version of the market model we have, Ri = αi + βi RM ; E(ei ) = Cov(Ri , R j ) = 0, ∀i 6= j; Var(ei ) = σ2i ,

(15.2)

where αi is the abnormal return expected by the analyst who covers the ith security and, except for σM , σi denotes residual standard deviation of the ith security (or portfolio). We assume that the security analysts cover n securities (i = 1, 2,. . . , n). Substituting (15.2) into (15.1) yields w0 wA = 1 + (1 − βA )w0 (15.3) 2 αA /σA , w0 = RM /σ2M where αA = ∑ni=1 wi αi , βA = ∑ni=1 wi βi and weight wi is given below in (15.5). Note that w0 is the optimal weight in the active portfolio when its beta is average, βA = 1. The intuition of (15.3) is that the larger the systematic risk of the active portfolio, the less effective is diversification with the index, and hence, the larger the weight in A. With wA from (15.3), the Sharpe ratio of the risky portfolio, P, is given by S2P =

[wA (αA + βA RM ) + (1 − wA )RM ]2 α2A 2 = S + , M w2A (β2A σ2M + σ2A ) + (1 − wA )2 σ2M + 2wA (1 − wA )βA σ2M σ2A

(15.4)

which reveals that the appraisal ratio (αA /σA ) of the active portfolio determines its marginal contribution to the Sharpe ratio of the passive strategy. This appraisal ratio, in turn, is maximized by choosing the weight, wi , on the ith covered security (out of n), to be wi =

αi /σ2i

αj

∑nj=1 σ2

.

(15.5)

j

The reason βi is absent in (15.5) is that a correction is made for βA in wA of (15.3). Applying this solution to (15.4) shows that the marginal contribution of an individual security to the risky portfolio’s squared Sharpe measure is equal to its own squared appraisal ratio S2P = S2M + 2 The

2 α2A α2 = S2M + ∑ 2i . 2 σA i=1 σi

existence of this portfolio can be inferred from Merton (1972) or Roll (1977).

(16.4a)

314

Alex Kane, Tae-Hwan Kim and Halbert White

Thus, if forecast quality exceeds some threshold, there are economies to scale in the coverage of securities that help explain large portfolios in the industry. Another important organizational implication is that portfolio management can be organized into three decentralized activities; macro forecasting to obtain R M , micro forecasting to obtain αi , and statistical analysis to obtain σM , βi and σi 3 . The assumption of the diagonal model is obviously suspect. Yet it’s not a priori clear that even if residuals are correlated across securities, the use of the generalized model will be profitable, since we face a trade off between a somewhat flawed model (with assumed non-correlation) and a correct model with estimation errors in the covariance matrix. We will put this tradeoff to the test; hence we need the optimal portfolio for the generalized model. The first stage of the optimization with a non-diagonal, residual covariance matrix is unchanged. We only need to redo the maximization of the appraisal ratio of the active portfolio (see the analysis in Merton (1972), and Roll (1977)). Using matrix notation and denoting the covariance matrix of residuals by Ω, the weight, wi , in the active portfolio is given by the ith element of the following vector: wc = [α0Ω−1 ι]−1 Ω−1 α,

(15.6)

where α is the vector of expected abnormal returns and ι is a vector of ones. When the covariance matrix is diagonal, wc reduces to wi in (15.5).

15.3. The Forecast Database and Sampling Procedures The forecast data set used in our study has been provided by an investment firm active in the U.S. in the early-mid 1990’s. We shall refer to this firm as ”XYZ corp” hereafter. In that period, the firm began extensively using artificial neural network-based statistical analysis to predict abnormal returns. They used the S&P500 as a performance benchmark and, consequently, they mostly held large company stocks that traded in relatively large volumes. The firm (XYZ) graciously provided us with monthly 4 forecasts of abnormal returns and beta coefficients for all stocks in its database for the period December, 1992December, 1995, 37 sets of monthly observations in all. These are the forecasts that XYZ used in constructing their portfolios. Nevertheless, XYZ did not reveal how it went about portfolio management. In December, 1992, XYZ had 711 stocks in its database and additional stocks were added regularly over the sample period, ending with 771 stocks in December, 1995. To simplify the test procedure, we eliminated any stock for which one or more forecasts were missing, leaving 646 stocks for the test databank. As the table below shows, the sampleperiod years, 1993-1995, were quite representative for the US stock market with one each of average, bad and good years. Annual Returns (%) Large Stocks Small Stocks 3 Obviously, 4 Forecasts

1993

1994

1995

9.87 20.30

1.29 -3.34

37.71 33.21

1926-1999 Average 12.50 19.02

SD (%) 20.39 40.44

there is room for improvement by exchanging ideas among the staffs of the various activities. were submitted by the last Friday of the month prior to the target month.

Active Portfolio Management

315

XYZ’s monthly forecasts of alpha were constrained to integer-values of percent per month, between -12% and 14% so that any more extreme forecast would have been set to the appropriate limit. Figure 15.1 and Figure 15.2 show histograms of the alpha and beta forecasts for the 646 stocks in the sample over the period, and Table 15.1 presents summary statistics for their location and dispersion 5. Alpha forecasts were right-skewed and negative on average. Beta forecasts were distributed around one, typical for large stocks.

Figure 15.1. Histogram of Alpha Forecasts.

We confine our study to 105 of the 646 stocks in the database to reduce the computation time needed for LAD estimation 6. Accordingly, we chose to work with a subset of 105 randomly selected stocks 7 . Among many others, Rosenberg et al. (1985) and Fama and French (1992) provide empirical evidence that the equity book-to-market ratio (BE/ME) and market capitalization (SIZE) help explain average stock returns 8 . To account for this effect, we seek to preserve 5 This is quite

unusual. Most financial firms produce forecasts in the form of a ranking variable which must be converted to a scale variable, using the Information Correlation Adjustment proposed by Ambachtsheer (1977). 6 While LAD estimators can be obtained by simplex linear programming proposed by Charnes and Lemke (1954), this method is not efficient as the parameter space grows along with the number of observations and requires a long search time. Barrodale and Roberts (1974) proposed a modified version of the simplex algorithm (BR-L1) which is more efficient and greatly reduces computation time. Nevertheless, computation time for 646 stocks with BR-L1 is excessive. 7 As shown in section 1, reducing the number of covered stocks significantly lowers the contribution of the active portfolio and hence does not diminish the force of our positive test results. 8 Fama and French argue that these variables are proxies for the part of risk premiums not captured by market beta.

316

Alex Kane, Tae-Hwan Kim and Halbert White

Figure 15.2. Histogram of Beta Forecasts.

the distribution of BE/ME and SIZE in our sub-sample. Annual data for these variables were obtained from Standard and Poor’s COMPUSTAT tape. Using 7 categories of BE/ME and SIZE, we allocate the 646 stocks into 49 groups of similar BE/ME and SIZE, with about 13 stocks in each group. We randomly draw as close as possible to an equal fraction of (2 or 3) stocks from each group, arriving at a random sample of 105 stocks that reflects the databank (population) distribution of the category variables. We use this sub-sample in the subsequent sections. Figure 15.3 and Figure 15.4 show the population and sample histograms for SIZE (Market Value) and BE/ME (Book/Market Value), respectively.

15.4. Estimation of Beta Coefficients and Realized Abnormal Returns Forecasting accuracy derived from records of past forecasts is a critical input when using security analysis to optimize portfolios. To determine the accuracy of abnormal return forecasts we need a time series of realized abnormal returns 9 . We obtained from DATASTREAM daily returns for the 105 stocks in the sample, the S&P500 index, and 3-month yields for T-bills for the period January 1, 1990 through March 31, 1996 with 1629 observations. 9 The term ‘realized’

for ex-post abnormal returns is somewhat misleading, since they are unobservable. We estimate them from realized returns, the market-model equation and estimated beta coefficients.

Active Portfolio Management

317

Table 15.1. Summary Statistics for Alpha & Beta Forecasts Mean Std Min 25% 50% 75% Max Skewness Kurtosis JB P-value

Beta 0.982 0.274 0.056 0.841 0.985 1.143 2.087 -0.003 3.268 67.878 0

Alpha -1.356 4.089 -12 -5 -1 0 14 0.491 3.32 1005.369 0

15.4.1. Beta Coefficients We use XYZ’s beta forecasts to compute realized abnormal returns. Nevertheless, if we confine ourselves to XYZ’s beta forecasts, we run the risk that tests of the quality of forecasts of abnormal returns will depend too greatly on the quality of these beta forecasts. To avoid this risk we also use standard estimates of betas from realized daily stock and market-index excess returns. For the in-sample estimate of realized abnormal returns in any given month, we estimate beta from daily returns over three years where the tested month is in the middle of the period. We estimate betas with the following three alternative procedures. (i) To account for infrequent trading we use Dimson’s (1979) aggregate coefficient method (AC). Beta estimates are the sum of the contemporaneous plus K lead and lag coefficients from the regression Ri,t = αi +

K



k=−K

K

bi,k Rm,t+k + ei,t ;

bi =



bi,k ;

t = 1, . . ., T.

(15.7)

k=−K

There is no obvious rule for selecting the number of lags and leads ( K). An appropriate value for K can be inferred from by regressing the market-model residuals (ex-post abnormal return minus alpha forecast from XYZ) on a constant and the market excess returns, and testing for a zero slope coefficient. If the correct number of lags and leads are used, then the slope coefficient from the regression should be zero. We pool the residuals for all stocks over all the months in the sample (approximately = 3,885 observations). Table 2 shows the estimation results for 0 to 2 lags and leads. We reject the hypothesis that the slope coefficient is equal to zero only for 0 and 1 lags/leads 10 and it becomes significantly different from zero for K ≥ 2. Among the two candidates 0 and 1, we choose K = 1 because our belief is that when K = 0, the infrequent trading problem may still be present. 10 This can

be explained by XYZ’s concentration in large-company stocks.

318

Alex Kane, Tae-Hwan Kim and Halbert White

Figure 15.3. Market Value: Population vs. Sample.

Table 15.2. Lag Selection for the Aggregate Coefficient Method. (Note: Standard errors are in parenthesis) K

Intercept

Slope

R2

R2a

Prob(F)

0

0.699 (0.123) 0.700 (0.123) 0.701 (0.123)

0.052 (0.054) -0.068 (0.054) -0.113 (0.054)

0.00024

-0.000012

0.334

Mean of estimated betas 0.8810

0.00041

0.000153

0.207

0.9995

0.00113

0.000876

0.036

1.0447

1 2

(ii) Vasicek (1973) proposes the Bayesian estimate ( V ): bV = wb + (1 − w)b∗ ; w =

1/v2b , 1/v2b∗ + 1/v2b

where • b = estimated market beta, • v2b = estimate of variance of b, • b∗ = mean of prior distribution of market beta, • v2b∗ = variance of prior distribution of market beta. Vasicek suggests the cross-sectional mean and variance for b∗ and v2 b∗ respectively.

(15.8)

Active Portfolio Management

319

Figure 15.4. Book/Market Value: Population vs. Sample.

(iii) An alternative approach is a James-Stein shrinkage estimator (JS): +  h ∗ , bJS = wb + (1 − w)b ; w = 1 − (b − b∗ )Var(b)−1(b − b∗ )

(15.9)

where h is a choice parameter and [a]+ = max(a, 0). Note that (b − b∗)Var(b)−1(b − b∗) is an F-statistic with (n − 1, 1) degrees of freedom, approximately a χ2 -statistic with one degree of freedom for the implicit hypothesis: H0 : b = b∗ , and the weight w is now given by [1 − Fh ]+ . When F is large, (that is, we are likely to reject the implicit null hypothesis), w is large (close to one), and we do not shrink. Instead of specifying the variance of the prior distribution as in Vasicek’s method, the shrinkage factor, h, needs to be specified. Noting that Prob[w ≥ 0] = Prob[F≥ h], it is easily seen that as h → 0, Prob[w ≥ 0]→ 1 while Prob[w ≥ 0]→ 0 as h → ∞. We choose h = 0.45 such that Prob[w ≥ 0] = 0.5. We apply both shrinkage methods to the AC estimates as two alternative estimates. To summarize, we use four alternatives to XYZ’s beta forecasts: OLS, AC, V and JS, where V and JS are Vasicek’s method and the JS method applied to the AC estimates. Figure 15.5 shows the distribution of beta estimates from the four methods. It is evident that AC shifts the OLS distribution upwards, correcting for the downward bias due to infrequent trading. V shrinks the tails toward the AC mean, while JS leaves the tails almost unchanged; instead, it shifts the central mass of the distribution toward the mean.

15.4.2. Realized Abnormal Returns The five methods of estimating betas yield 5 sets of estimates of realized abnormal returns. For each of the five methods, we have a 105 × 37 matrix, b(·) , of monthly betas for the 105 stocks. To each element in the matrix we obtain a corresponding element of the

320

Alex Kane, Tae-Hwan Kim and Halbert White

Figure 15.5. Distribution of Beta Estimates Based on 4 Methods.

matrix of realized abnormal returns from ∗(·)

αi,t

(·)

= Ri,t − bi,t RM,t ;

i = 1, . . ., 105; t = 1, . . ., 37; (·) =

(XYZ), (OLS), (AC), (V), (JS)

(15.10)

A first glimpse at the quality of abnormal return forecasts is shown in Figure 15.6. The scatter of forecasts and realization with the V specification (Dimson’s estimates with Vasiceck’s correction) show that the constraint on the range of alpha forecasts [-12,14] may have been costly, particularly for positive values.

15.5. Calibration of Alpha Forecasts TB point out that we must explicitly account for the quality of forecasts when optimizing the portfolio. This issue is taken up by Admati (1985), Dybvig and Ross (1985) and Kane and Marks (1990). Assume, for example, a simple case where the realization, α∗ , ˆ + ε where ε is white noise. Assuming that the ˆ are related by α∗ = f (α) and a forecast, α, ˆ = ρ2 αˆ , by discounting function f is linear, we obtain the unbiased forecast, αUB = E(α∗|α) ∗ ˆ We can obtain the appropriate the raw forecast using the correlation, ρ, between α and α. ˆ discount function for α by estimating a well specified regression of the forecasting record, e.g. α∗ = α + bαˆ + η. To assess overall forecast accuracy with various specifications and beta estimates, we pool all 37 × 105 = 3885 pairs of forecasts and realizations. We use three alternative speci-

Active Portfolio Management

321

Figure 15.6. Scatter Diagram of Ex-Post Abnormal Returns vs. Alpha Forecasts.

fications: Linear α∗L = αL + bL αˆ + ηL Parabolic

α∗P

= αP + b1P αˆ + b2P αˆ + ηP 2

ˆ 0] + ηK Kinked α∗K = αK + bK max[α,

(16.11a) (16.11b) (16.11c)

The kinked-linear specification is a ‘no short sales’ alternative that may be required for many institutions. Pooling the sample across stocks and time to estimate the forecast discount function would affect the test results in two important ways. First, applying a uniform discount function to all stocks is likely unrealistic and would significantly reduce the potential contribution of the active portfolio. Hence, pooling across stocks will, if anything, increase the power of the performance tests. However, pooling observations across time entails the use of data that was unknown at the time the forecast was made. For this reason we use pooling across time only to discuss issues of overall forecast quality. When we test performance month-by-month we use only data from past months to estimate the discount function, maintaining the pooling across stocks. Figure 15.7 shows the fitted lines from the regressions of realizations on abnormal return forecasts with the three alternative specifications. The linear specification reveals a severe correction for quality. The graph of the parabolic specification shows that the correction would be more severe at the low end of the forecast range. The extent of this differential overwhelms this specification. At the lowest end, the discount function would convert negative alpha forecasts into positive signals, calling for long positions in these stocks. The kinked line specification is a milder form of such correction, it converts all negative forecasts to zero, effectively taking these stocks out of the active portfolio.

322

Alex Kane, Tae-Hwan Kim and Halbert White

Figure 15.7. Fitted Lines Based on 3 Specifications.

Table 15.5. presents three panels for the regression results of the three specifications, with five beta estimators in each. The estimation results are almost identical for the various beta estimation methods. With zero P-value for all cases, White’s heteroscedasticity test rejects the homoscedasticity assumption. The standard errors in parenthesis in Table 15.5. are computed from the Heteroscedasticity-Consistent Covariance Matrix Estimator 11 (HCCME) proposed by White (1980). The adjusted R2 (R2A ) is highest for the kinked-linear specification suggesting that the parabolic specification is not flexible enough to handle the downward bias at the low end of the forecast range. The slope coefficients show the superior power of the positive alpha forecasts. The inadequacy of the parabolic specification is apparent from the insignificance of the slope coefficients on the squared forecast alpha. To complete this picture, we draw similar conclusions from the intercepts. These are significant in the linear specification and are smallest and insignificant in the kinked-linear specification. Finally, the R2A is quite small, never exceeding 0.00155. It is interesting to track the time-consistency of the quality of the forecasts. To do so, we regress (16.11) on the 105 stocks, one month at a time - 37 regressions for each specification and beta estimate. Figure 15.8 shows the plots of the resultant for the V beta estimate only, as the other four plots are very similar. The three plots reveal that, with the exception of the first two months, the quality of the forecasts consistently deteriorated over time. This observation is somewhat surprising since the best year of the sample period was the last, and Table 15.5. shows that positive forecasts fared best, indicating that with overall low predictive power we must be careful in making generalization. Armed with the various specifications for discounting individual forecasts we now turn are several ways of estimating the HCCME. We estimate: HCCME =(X 0 X)−1 X 0 Ω∗X(X 0 X)−1 , = Diag[et2 /(1 − ht )], et = regression residual and ht = t th element of the hat matrix (X(X 0 X)−1 X).

11 There

where

Ω∗

Active Portfolio Management

323

Table 15.3. Regression Results for the Calibration of Alpha Forecasts

to the construction of optimal portfolios and performance evaluations.

15.6. Out-of-Sample Test Procedures The steps comprising our performance test are as follows: 1. Estimate the discount function for forecasts of month t from paired observations of forecast and realization of abnormal returns in month t − 1. Abnormal returns of month t − 1 are computed from the market-model equation using beta coefficients estimated from past realized daily returns. 2. Obtain unbiased forecasts for month t by applying the discount function from month t − 1 to the forecasts of abnormal returns for month t. 3. Obtain macro forecasts of the mean and variance of the index portfolio.

324

Alex Kane, Tae-Hwan Kim and Halbert White

Figure 15.8. Predictive Power (Adjusted R2 ).

4. Estimate the covariance matrix of residuals from past daily returns to use in the diagonal and covariance versions of TB 5. Construct the active portfolio using the unbiased forecasts and estimates of the residual variances (covariance matrix) in the diagonal (covariance) version of TB. Construct the optimal risky portfolio from the active and index portfolios. 6. Compute the realized return of the optimal portfolio in month t, using stock and market realizations of excess returns in month t. 7. Use the realized monthly excess returns of the optimal risky TB portfolio and the market index (t = 2, . . ., 37) to evaluate performance.

15.6.1. Discount Functions for the Monthly Forecasts of Abnormal Returns We use forecasts and realizations available each month. For each test month; t, t = 2, . . ., 37, we first estimate the four sets of 105 beta coefficients from realized daily returns over three years preceding month t. XYZ’s 105 beta forecasts plus the four sets of estimates (·) up to month t-1, bi,t−1 ; i=1,. . .,105; t=2,. . .,37; (·) = OLS, AIT, AC, V, JS, are used in the ∗(·)

market model equation (15.1) to compute the realized abnormal returns, αi,t−1 ; i=1,. . .,105; t=2,. . .,37; (·) = OLS, XYZ, AC, V, JS. Each of the 3 × 5 = 15 sets of 105 abnormal returns are then paired with the 105 forecasts, αˆ i,t−1 ; i=1,. . ., 105; t=2,. . .,37. For each set of the 15 combinations of (◦) =L, P, K and (·) = OLS, XYZ, AC, V, JS, we run the following pooled

Active Portfolio Management

325

regression (across i) for each t: ∗(·)

(·)

(·)

(·)

α(◦)i,t−1 = a(◦),t−1 + b(◦),t−1 αˆ i,t−1 + η(◦),t−1

(15.12)

t = 2, .., 37 (◦) = L, P, K (·) = OLS, XYZ, AC, V, JS where L,P,K stand for the linear, parabolic and kinked specifications. Hence, in the above regression, we have 105 observations for t = 2, 210 observations for t = 3, and so on. Once (·) (·) the coefficients a (◦),t−1 , b(◦),t−1 are estimated for each test month t, the discounted alpha (·)

(·)

forecast for the ith security is given by a (◦),t−1 + b(◦),t−1 αˆ i,t . The critical role of the discount function motivates an elaborate estimation scheme. We estimate (15.12) in 5 different ways: (i) OLS, (ii) LAD, (iii) Non-Random Combination LAD (NRLAD), (iv) James-Stein LAD (JSLAD), and (v) Optimal Weighting Scheme LAD (OWLAD) 12 . The NRLAD, JSLAD and OWLAD estimators (Kim and White (2001)) are obtained by optimally mixing the OLS and LAD estimators. These estimators have smaller mean squared error than the OLS and LAD estimators and are expected to improve out-of-sample performance of the forecasts. We evaluate performance with these five estimation methods.

15.6.2. Residual Variances and Covariances The optimal weights (15.5) of the diagonal version of TB require forecasts of residual variances, and those of the covariance version (15.6) require a forecast of the full covariance matrix of residuals. Ideally, these would be extrapolated from past forecasting errors. However, since we do not have a sufficient number of observations, we estimate the covariance matrix from daily returns over three years ending in the last trading day of month t-1. The five estimates of beta for each stock, obtained from the daily returns of the prior 3-year (·) (·) period, yield five sets of deviations from the market model, ei,d = Ri,d − bi,m RM,d ; d= day in the 3-year period; m = month in the 3-year period (the subscript for the month for which the estimates are prepared is dropped for clarity). These daily residuals are used to estimate the covariance matrix of daily residuals for the 3-year period prior to month t. The matrix is multiplied by nt, the number of days in month t, to obtain the forecast of the residual covariance matrix in month t. 12 Let

b be the LAD estimator and g be the OLS estimator. Then, the JSLAD, NRLAD and OWLAD estimators are defined as follows: JSLAD

=

(1 − c1 /kb − gk2)(b − g) + g

NRLAD

=

(1 − c2 )(b − g) + g

OWLAD

=

(1 − λ1 − λ2 /kb − gk2 )(b − g) + g,

where kb − gk2 = (b − g)0Q(b − g) and Q is a weighting matrix and the combination parameters (c1 , c2 , λ1 , λ2 ) are chosen to minimize the asymptotic risk of the corresponding estimator. See Kim and White (2001) for a detailed discussion.

326

Alex Kane, Tae-Hwan Kim and Halbert White

15.6.3. Macro Forecasts It appears from (16.4a) that macro forecasts of the mean and variance of the indexportfolio are not more important than those of a single security. This appearance led Ferguson (1975) to argue that effort spent on macro forecasting should not exceed that spent on any individual security. But this argument is false. The Sharpe ratio in (16.4a) is conditional on optimal weights of all securities in the risky portfolio. To the extent that a forecast for a security is of low quality and not properly discounted, (16.4a) does not apply. The shortfall from the maximum Sharpe ratio will depend on the weight of the security in the portfolio. Since it is likely that the weight on M will be substantially higher than that of any individual security, the quality of macro forecasts will be substantially more important. The literature on estimating the market mean is sparse (see Merton (1980)), while the literature on market volatility is quite rich. We use the AR(0)-GARCH(1,1) specification as in Engle et al. (1993) to forecast the market excess return and variance as follows: RMt = E(RM ) + εt

(15.13)

2 + bσ2M,t−1 ; t = 1, . . ., T. σ2Mt = w + αεt−1

Daily returns and 3-year rolling estimation window are used in (15.13) so that T is the final day of the 3-year period window. Once the estimation step for one iteration is completed, then the k-step ahead volatility prediction ( σ2M,T,T +k ) standing at time T is generated by σ2M,T,T +1 = w + ασ2M,T σ2M,T,T +k

=

w + (α + b)σ2M,T,T +k−1;

(15.14) k = 2, . . ., nm

m and the target month variance estimate is σ2M,m = ∑nk=1 σ2M,T +k−1 where nm is the number of days in the target month. Figure 15.9 shows the forecasts and realizations of the market index excess returns. The RMSE of the excess-return forecasts is 2.20% per month.

15.6.4. Restrictions on Portfolio Weights Despite the discount of the forecasts, the weights on the active portfolio derived from (15.3) and (15.6) turn out to be excessive and volatile. It appears that the size and volatility of the active portfolio weights result from the dynamics of the discounted abnormal-return forecasts. While many portfolio mangers may take short positions in individual stocks (shorting against the box, in Wall Street terms), fewer can use index futures to emulate short positions in the index. We therefore restrict the active portfolio weight to the range [0,1] in order to rule out any unrealistic positions in the active portfolio that generate superior performance.

15.6.5. Performance Measures The TB model maximizes the Sharpe measure. However, most users have little intuition about the value of an incremental improvement in the Sharpe measure. Modigliani and Modigliani (1997) propose a transformation of the measure to a rate of return equivalent.

Active Portfolio Management

327

Figure 15.9. Forecasting Monthly S&P 500 Index.

This measure, which has come to be known as M 2 , is computed from the Sharpe measure of portfolio P by (15.15) MP2 = SP σM − RM The first term SP σM gives the excess return on a mix of P with the risk free asset that would yield the same risk (SD) as the market index portfolio M. By subtracting the average excess return on M we get the risk adjusted return premium of P over M. M 2 provides better intuition than S and has become popular with practitioners; hence we include the measure in the performance evaluation reports. We track the monthly optimal weights of the individual stocks and the index in the risky portfolios derived from the various estimators and specifications. We use the daily returns on the stocks, index and bills to compute realized daily excess returns on the risky portfolios over the 36 months of the forecast period (January, 1992-December, 1995). 13 The daily excess returns of each risky portfolio are paired with the index excess returns to compute S and M 2 for each portfolio over the entire period.

15.7. Portfolio Performance Evaluation Table 15.4 reports performance of portfolios constructed from the diagonal version of TB. It shows values for the Sharpe and M 2 measures for selected estimators and specifications. As could be predicted by the poor results of the parabolic specification reported in Table 15.5. and Figure 15.7, portfolios derived from the parabolic specification performed poorly and were eliminated from Table 15.4. We also eliminated the inferior OLS and AC beta estimators. The major implications from Table 15.4 are as follows: 13 The

first month is lost since we have no estimate of the discount function for this month.

328

Alex Kane, Tae-Hwan Kim and Halbert White

(i) All portfolios, except those derived from OLS estimation of the forecast-discount functions, outperformed the passive strategy. The superior performance is of economic significance, providing strong testimony to the value of even miniscule predictive ability of security analysts, and the power of the TB model. (ii) Portfolios derived from LAD estimators uniformly outperform portfolios from the OLS specification of the discount functions. This indicates the existence of fat tails in the return distributions and the value of better estimation methods. (iii) The portfolios derived from the kinked specification of the discount functions uniformly outperform the linear specification. This result may be unique to the unusual nature of the XYZ forecasts that were better in the positive range. Still, the adjusted R-square of this specification (see Table 15.5.) shows that even positive forecasts were of low quality, and yet, with no short sales, portfolios derived from these forecasts were significantly superior to the passive strategy. Table 15.4. Sharpe Ratio and M 2 -measure: Diagonal Model (S&P500 Sharpe Ratio = 0.909) Note: * indicates the best estimator in the row Sharpe Ratio NRLAD JSLAD 0.922 1.033 1.095 1.235

OWLAD 0.921 1.095

LAD 1.025 1.235

OLS -0.131 1.501

NRLAD 0.112 1.667

M 2 -measure JSLAD OWLAD 1.033 0.107 2.920 1.668

Beta

Line Kinked

OLS 0.895 1.076

LAD 1.036 * 2.921 *

Vasicek

Line Kinked

0.905 1.094

0.957 1.119

1.058 1.146

0.955 1.118

1.058 1.146

-0.037 1.654

0.427 1.886

1.337 * 2.124 *

0.413 1,878

1.337 * 2.123

JS

Line Kinked

0.904 1.095

0.965 1.12

0.941 1.145

0.963 1.12

0.94 1.145

-0.05 1.665

0.497 * 1.894

0.287 2.114 *

0.481 1.890

0.279 2.112

Table 15.5 presents the risk-return data of the optimal portfolio. The right-hand panel shows that with the [0,1] restriction on the active portfolio weights, the managed portfolio’s risk is only slightly larger than that of the passive strategy. Moreover, the risk of the kinkedspecification portfolios is actually slightly lower than that of the index portfolio. This means that superior performance is achieved by the improvement in average returns from the identification of non-zero alpha stocks in the linear specification, and positive-alpha stocks in the ’no short sales’ (kinked) specification. Table 15.5. TBP Return and Risk: Diagonal Model (S&P500 Return = 8.166: S&P500 Risk = 8.980) TBP Return (Mean) NRLAD JSLAD 9.286 10.706 9.581 10.794

OWLAD 9.262 9.585

LAD 10.71 10.796

OLS 9.833 8.861

TBP Risk (Std. Dev.) NRLAD JSLAD OWLAD 10.072 10.451 10.054 8.750 8.744 8.752

Beta

Line Kinked

OLS 8.799 9.539

LAD 10.452 8.744

Vasicek

Line Kinked

8.872 9.725

9.294 9.623

11.637 9.403

9.277 9.618

11.641 9.4

9.8 8.893

9.713 8.597

10.996 8.206

9.712 8.600

11 8.205

JS

Line Kinked

8.899 9.758

9.323 9.601

11.577 9.323

9.307 9.601

11.583 9.32

9.846 8.914

9.664 8.570

12.299 8.144

9.666 8.574

12.317 8.143

15.7.1. The Covariance Version of the TB Model In using the covariance version we face a trade off between a theoretically-advised improvement and (low) estimation precision of the residual covariance matrix. Table 15.6

Active Portfolio Management

329

presents the performance measures of the portfolios derived from the covariance version. Most portfolios (20 out of 30), and all ‘no short sales’ (kinked) specification portfolios, show improved performance over the portfolios from the diagonal version. Here, too, the LAD-estimator portfolios perform best. This is another indication that improved estimation will further increase the effectiveness of the model and the contribution of security analysts. Table 15.7 shows that risk of the managed portfolios was increased relative to portfolios from the diagonal version in most cases. This should be expected since the full covariance model provides better utilization of forecasts and hence, larger positions in the active portfolio at the expense of diversification. Our sub-sample of 105 stocks out of the 646 stocks in XYZ’s databank was chosen randomly. This suggests that the forecasting quality of the stocks that were left out is similar to the 105 stocks we used. In that case, we can easily assess the incremental performance that would be obtained with the diagonal model by expanding the universe of covered securities from 105 to 646. Table 15.4 shows the value of the Sharpe measure of the JSLAD portfolio as 1.145, compared with .909 for the S&P500 index, resulting in M 2 = 2.114%. Using (15.3) we obtain the contribution of the active portfolio to the squared Sharpe measure by 1.1452 − 0.9092 = 0.485. This contribution would √ be expected to grow to 0.485 × 646 ÷ 105 = 2.982, resulting in a Sharpe measure of 2.982 + .9092=1.951, and M 2 =1.951 × 8.980 − 8.166= 9.354%. Obviously, the complexity of managing an organization 6 times as large with the same production quality, and the difficulty in various statistical analyses can substantially cut down the potential gain. At the same time, there can be a distinct diversification-like advantage in estimation procedures with a larger universe of stocks. Table 15.6. Ratio and M 2 -measure: Covariance Model (S&P500 Sharpe Ratio = 0.909) Note: * indicates the M 2 -measure is greater than its counterpart in the diagonal model in Table 15.4 Sharpe Ratio NRLAD JSLAD 0.744 1.019 1.232 1.435

OWLAD 0.747 1.232

LAD 1.019 1.435

OLS -1.293 2.659 *

NRLAD -1.482 2.897 *

M 2 -measure JSLAD OWLAD 0.980 -1.460 4.717 * 2.898 *

Beta

Line Kinked

OLS 0.765 1.205

LAD 0.984 4.718 *

Vasicek

Line Kinked

0.792 1.21

0.867 1.244

1.077 1.261

0.865 1.242

1.077 1.261

-1.054 2.703 *

-0.378 3.003 *

1.504 * 3.158 *

-0.399 * 2.985 *

1.507 * 3.157 *

JS

Line Kinked

0.789 1.214

0.866 1.248

1.015 1.267

0.864 1.247

1.015 1.267

-1.082 2.732 *

-0.388 3.042 *

0.948 * 3.216 *

-0.407 3.030 *

0.951 * 3.214 *

Table 15.7. Return and Risk: Covariance Model (S&P500 Return = 8.166: S&P500 Risk = 8.980). TBP Return (Mean) NRLAD JSLAD 8.17 11.972 11.632 13.827

Beta

Line Kinked

OLS 7.958 11.532

Vasicek

Line Kinked

8.163 11.526

8.818 11.39

JS

Line Kinked

8.137 11.55

8.811 11.365

TBP Risk (Std. Dev.) NRLAD JSLAD OWLAD 10.976 11.754 10.930 9.442 9.638 9.445

OWLAD 8.162 11.638

LAD 11.983 13.829

OLS 10.398 9.567

LAD 11.76 9.639

12.456 10.978

8.794 11.378

12.464 10.976

10.307 9.523

10.168 9.158

11.568 8.706

10.167 9.162

11.571 8.705

11.054 10.927

8.788 11.358

11.057 10.925

10.315 9.517

10.173 9.106

10.891 8.621

10.171 9.110

10.892 8.621

330

Alex Kane, Tae-Hwan Kim and Halbert White

Summary and Conclusions The objective of this paper is to identify and reduce the threshold of profitable forecasting ability for portfolio management. We suspect that the low precision of the forecasts of security analysts contributes to the dearth of portfolio managers that efficiently use the security analysis afforded by the Treynor Black model. We find that the threshold of profitable forecasts of abnormal returns is extremely low, that is, security analysis that results with a correlation between forecasts and realizations of abnormal returns as low as 0.04 can still be profitable in an organization that covers more than 100 stocks. Nevertheless, this requires that econometric methods are utilized. We experiment with a database of monthly forecasts of abnormal returns for 105 stocks over 37 months, provided by an investment firm that actually used these forecasts to construct its portfolios. Using a database of realized returns on these stocks and the market index, we estimate forecast discount functions and apply them to the forecasts prior to the construction of the risky portfolios. In the process, we use the Dimson method to account for infrequent trading, and shrinkage Bayesian estimators for beta coefficients. The discount functions are estimated with LAD estimators to account for fat tails. Various specifications of the discount functions are used to account for the quality of the forecasts. These methods significantly improve the performance of the risky portfolio. We show that the key to profitability of low-precision forecasts is the use of sophisticated econometric methods. Using OLS to estimate the market model and the forecastdiscount functions will not do. Shrinkage estimators for beta and LAD estimates of the discount functions, while imposing organization-driven restrictions on the weights of the active portfolio, can endow a portfolio derived from low-quality forecasts with profitability. Our experiment was performed under adverse conditions. The forecast records were short, requiring that we assign equal quality to all forecasts. Macro forecasting was not utilized and the substitute extrapolation techniques were not as powerful as they could be. Finally, the estimates of residual variances in the diagonal version and the covariance matrix in the covariance version of the TB model can be improved. All this indicates that the threshold to profitability of forecast precision can be further lowered. These findings lend a real meaning to the concept of nearly-efficient capital markets. If XYZ was representative of competitive investments firms, this suggests that competition leads to a degree of information efficiency that reduces the forecast precision of supermarginal firms to a level as low as what we observe in this experiment.

References [1] Admati, A.R. (1985), A Noisy Rational Expectations Equilibrium for Multi-asset Securities Markets, Econometrica, 53, 629-657. [2] Ambachtsheer, K. (1974), Profit Potential in an Almost Efficient Market, Journal of Portfolio Management, Fall, 84-87. [3] Ambachtsheer, K. (1977), Where Are the Customers’ Alphas?, Journal of Portfolio Management, Fall, 53-56.

Active Portfolio Management

331

[4] Ambachtsheer, K. and Farrell, Jr., J.L. (1979), Can Active Management Add Value?, Financial Analysts Journal, November-December, 39-47. [5] Barrodale, I. and Roberts, F.D.K. (1974), Algorithm 478: Solution of an Overdetermined System of Equations in the L1 Norm, Communications of the Association for Computing Machinery, 17, 319-320. [6] Bodie, Z., Kane, A. and Marcus, A.J. (2001), Investment, fifth Edition, McGraw-Hill. [7] Charnes, A. and Lemke, C.E. (1954), Computational Theory of Linear Programming: The Bounded Variables Problem, Graduate School of Ind. Administration, Carnegie Institute Of Technology, Pittsburgh, Pennsylvania. [8] Dimson, E. (1979), Risk Measurement When Shares Are Subject to Infrequent Trading, Journal of Financial Economics 7, 197-226. [9] Dunn, P. and Theisen, R. (1983), How Consistently Do Active Managers Win?, Journal of Portfolio Management, 9, 47-50. [10] Dybvig, P.H. and Ross, S.A. (1985), The Analytics of Performance-Measurement Using a Security Market Line, Journal of Finance, 40, 401-416. [11] Engle, F.R., Kane, A. and Noh, J. (1993), Index-Option Pricing with Stochastic Volatility and The Value of Accurate Variance Forecasts, NBER Working Paper, No 4519. [12] Fama, E.F. and French, K.R. (1992), The Cross-Section of Expected Stock Returns, Journal of Finance, Vol. XLVII, 427-465. [13] Ferguson, R. (1975), Active Portfolio Management: How to Beat The Index Funds, Financial Analysts Journal, May-June, 63-72. [14] Goetzmann W. and Ibbotson, R. (1991), Do Winners Repeat? Patterns in Mutual Fund Behavior, Working Paper, Yale School of Organization and Management. [15] Grossman, S.J., and Stiglitz, L. E. (1980), On the Impossibility of Informationally Efficient Markets, American Economic Review, 70, 393-408. [16] Hodges, S.D. and Brealey, R.A. (1973), Portfolio Selection in a Dynamic and Uncertain World, Financial Analysts Journal, March-April, 50-65. [17] Ippolito, R. (1989), Efficiency with Costly Information: A Study of Mutual Fund Performance 1965-1984, Quarterly Journal of Economics, 104, 1-23. [18] Kane, A. and Marks, S. (1990), The Delivery of Market Timing Services: Newsletters versus Market Timing Funds, Journal of Financial Intermediation , 1, 150-166. [19] Kane, A., Marcus, A.J., and Trippi, R.R., (1999), The Valuation of Security Analysis, Journal of Portfolio Management, Spring, 25-36.

332

Alex Kane, Tae-Hwan Kim and Halbert White

[20] Kim, T and White, H. (2001), James-Stein Estimators in Large Samples with Application to the Least Absolute Deviations Estimator, Journal of the American Statistical Association, 96, 697-705. [21] Merton, R.C. (1972), An Analytical Derivation of the Efficient Portfolio Frontier, Journal of Financial and Quantitative Analysis, 7, 1851-1872. [22] Merton, R.C. (1980), On Estimating The Expected Return on The Market: An Exploratory Investigation, NBER Working Paper Series, No. 444. [23] Modigliani, F. and Modigliani, L. (1997), Risk Adjust Performance, Journal of Portfolio Management, Winter, 45-54. [24] Roll, R. (1977), A Critique of the Asset Pricing Theory’s Tests: Part I: On past and Potential Testability of the Theory, Journal of Financial Economics, 4, 129-176. [25] Rosenberg, B., Reid, K., and Lanstein, R. (1985), Persuasive Evidence of Market Inefficiency, Journal of Portfolio Management, 11, 9-17. [26] Treynor, J.L. and Black, F. (1973), How to Use Security Analysis to Improve Portfolio Selection, Journal of Business, 46, 66-86. [27] Vasicek, O.A. (1973), A Note on Using Cross-Sectional Information in Bayesian Estimation of Security Betas, Journal of Finance, 28, 1233-1239. [28] White, H. (1980), A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica, 48, 847-838.

In: Progress in Financial Markets Research Editors: C. Kyrtsou and C. Vorlow, pp. 333-349

ISBN: 978-1-61122-864-9 c 2012 Nova Science Publishers, Inc.

Chapter 16

S TOCK P RICE C LUSTERING AND D ISCRETENESS : T HE “C OMPASS R OSE ” AND C OMPLEX DYNAMICS Constantinos E. Vorlow∗ Durham Business School, University of Durham, Durham, UK

Abstract We investigate the “compass rose” (Crack, T.F. and Ledoit, O. (1996), Journal of Finance, 51(2), pg. 751-762) patterns revealed in phase portraits i.e., scatter diagrams of daily stock returns against their lagged values. The patterns that appear have been attributed to price discreteness and the tick size. We examine a novel manifestation of price clustering and discreteness and show that the actual patterns observed may contain information on the complexity of the sequences examined, that possibly enhance the predictability of stock returns dynamics.

Keywords: Price Clustering and Discreteness; Microstructure; Compass Rose; Recurrence Quantification Analysis.

16.1. Introduction The “compass rose” pattern was identified in literature by Crack and Ledoit Crack and Ledoit [1996]. It was revealed in scatter diagrams of daily percentage stock returns against their lagged values, also called “phase portraits”, “lag-plots” or “delay plots” among other (see Pemberton and Rau [2001] for more definitions). Usually one can discern rays which emanate from the center that resemble a compass rose (Fig. 16.1a). The formation of these rays was attributed to price clustering and discreteness and especially the tick size.1 The ∗ E-mail

address: [email protected] in stock market prices has been highlighted by Niederhoffer [1965] and Niederhoffer [1966], motivated by the original findings of Osborne [1962]. Niederhoffer and Osborne [1966] also investigated dependencies related to clustering and discreteness. Their research was followed by Rosenfeld [1980] who conducted simulations on price rounding and discreteness and showed that the hypothesis of a geometric Brownian motion for daily and weekly frequency price dynamics could be rejected. Harris [1991, 1990] has treated these issues more rigorously. He found that clustering varies with price level, volatility and exchange listing. See also in Gottlieb and Kalay [1985], Cho and Frees [1988], Ball et al. [1985] for further research on this area. 1 Clustering

334

Constantinos E. Vorlow

above factors are considered an important part of the “market microstructure” literature with serious implications on risk evaluation, the optimal design of securities and market efficiency. It has been suggested that the appearance of the compass rose pattern is to a considerable degree subjective and of no use to forecasting (Crack and Ledoit [1996]). In this paper, following the approach Koppl and Nardone [2001] and extending the findings of Antoniou and Vorlow [2004a,b] and Vorlow [2004], we show how from a simple recoding of the visual information provided from phase portraits we can obtain results that can be used to enhance our knowledge on the complex dynamical character of financial prices-returns. To achieve this we employ Recurrence Quantification Analysis (RQA), which especially focuses on the examination of complex dynamics and was originally introduced by Zbilut and Webber [1992] on Recurrence Plots (Eckmann et al. [1987], Casdagli [1997]). The suitability of RQA is warranted by its extreme flexibility and its minimal assumptions on stationarity and underlying distributional properties of the data under examination. At the same time we demonstrate a new way to display the effects of price clustering and discreteness, that can be more clear and precise than the compass rose. In the following section, we briefly review the literature around the compass rose. In section 16.3. we discuss our methodology and results. We also introduce a new type of diagram that reveals as well the effect of price clustering and discreteness and combines both returns and price levels. In section 16.3.1. we briefly discuss the RQA methodology and in section 16.3.2. produce both qualitative and quantitative evidence on the complex character of the dynamics of the sequences analyzed. Finally in section 16.4. we provide our conclusions and pointers for future research.

16.2. The Compass Rose in Scientific Literature

The discovery of the “compass rose” formation during the nineties by Crack and Ledoit [1996] triggered an intensive investigation of the patterns observed mainly in phase portraits (delay plots) of daily returns sequences. Such plots are a standard approach for depicting time dependencies within the structure of deterministic-chaotic sequences and can provide qualitative evidence for the existence of attractors (Kantz and Schreiber [1997], Kaplan and Glass [1995]). As mentioned in the introduction, Crack and Ledoit [1996] first documented a distinctive pattern that arises in these phase portraits: rays emanating from the center, generating a compass-rose-like formation (Fig. 16.1). This appears not only in two-dimensional phase portraits but also in three dimensions. The pattern is due to the price fluctuations being small integer multiples of a tick size (see also McKenzie and Frino [2003]), imposed by market practices or regulations. According to Crack and Ledoit [1996], given that $R_t$ and $P_t$ denote returns and closing prices respectively at time $t$, and $h$ the tick size, one can generate the following approximation:

\[
\frac{R_{t+1}}{R_t} = \frac{(P_{t+1}-P_t)/P_t}{(P_t-P_{t-1})/P_{t-1}} \tag{16.1}
\]
\[
\frac{R_{t+1}}{R_t} \approx \frac{(P_{t+1}-P_t)/P_{t-1}}{(P_t-P_{t-1})/P_{t-1}} = \frac{P_{t+1}-P_t}{P_t-P_{t-1}} = \frac{n_{t+1}h}{n_t h} = \frac{n_{t+1}}{n_t}, \tag{16.2}
\]


where $n_t \equiv (P_t - P_{t-1})/h$ is the integer number of ticks by which the prices $P_t$ and $P_{t-1}$ differ. According to eq. (16.2), for reasonably close values of the closing prices $P_t$ and $P_{t-1}$, subsequent returns generate pairs of coordinates for points in the phase portrait that align along the ray joining the origin with the integer pair $(n_t, n_{t-1})$.

Other past attempts to produce informative phase portraits failed to reveal a compass-rose-like structure. This failure is mainly attributed to inappropriately formatted diagrams, unfortunate choices of graphic resolution and the connection of the dots of the portrait (see, for example, Enright [1992], Chen [1993], Papaioannou and Karytinos [1995], Franses [1998], Franses and Dijk [2000], Andreou et al. [2000]). According to Crack and Ledoit [1996], three conditions are necessary for the realization of a compass rose pattern:

1. the daily price changes of the stock should be small relative to the price level;
2. daily price changes should be effected in discrete jumps of a small number of ticks; and
3. the price of the stock should vary over a relatively wide range.

It was also reported that phase portraits of the returns of some particular assets, as well as of index and portfolio returns, may not necessarily reveal such a characteristic pattern. The absence of a compass rose pattern has always been related to the violation of any or all of the three conditions mentioned above. Szpiro [1998] argued that the only necessary condition is that price changes are realized in discrete jumps, and that the approximation of Crack and Ledoit [1996] through which we obtain eq. (16.2) is unnecessary. The requirement of keeping $h$ and $n_t$ small in relation to prices was relaxed, and hence eq. (16.1) could be written as:

\[
\frac{R_{t+1}}{R_t} = \frac{n_{t+1}}{n_t}\left(1 - \frac{n_t h}{P_t}\right). \tag{16.3}
\]

Using eq. (16.3), Szpiro [1998] showed that when share prices span values between $P$ and $P(1+1/\lambda)$, where $n_{t+1} = \lambda\delta$, $n_t = \lambda\varepsilon$ and $\delta/\varepsilon$ determines the ray slope, clusters of points appear to connect, whereas when the span range is wider these clusters overlap. All clusters connect when prices vary between $P$ and $2P$. This view was criticized by Wang et al. [2000] through the introduction of the element of time. Using a microstructure approach based on intra-daily UK stock market data, Wang et al. [2000] suggested that the compass rose should become more apparent as prices are sampled at higher frequencies. In the same work it was also proposed that the original conditions of Crack and Ledoit [1996] would still be valid, assuming that all the potential levels of returns necessary for a complete phase portrait have historically occurred and been collected. Thus Wang et al. [2000] suggest the compass rose is most likely to occur if the following two conditions are fulfilled:

1. the effective tick size is large compared to the standard deviation of the returns; and
2. the frequency of the observations increases.
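To make the mechanism of eqs. (16.1)–(16.3) concrete, the following minimal Python sketch simulates a price path that moves by small integer multiples of a tick; all parameter values (tick size, starting price, sample length) are hypothetical choices of ours, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

h, P0, T = 0.125, 100.0, 2000   # tick size, starting price, sample length

# Daily moves of a small integer number of ticks, as in conditions 1 and 2
# of Crack and Ledoit [1996].
n = rng.integers(-3, 4, size=T)      # tick counts n_t of eq. (16.2)
P = P0 + h * np.cumsum(n)            # discrete price path
R = np.diff(P) / P[:-1]              # simple returns

# Each point (R_t, R_{t+1}) lies near a ray through the origin with slope
# n_{t+1}/n_t, which is exactly the alignment described by eq. (16.2).
x, y = R[:-1], R[1:]
# A scatter of (x, y), e.g. plt.plot(x, y, ","), reveals the compass rose.
```

Enlarging h relative to the standard deviation of the returns sharpens the rays, in line with the first condition of Wang et al. [2000].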


Various other papers have appeared since 1996 on the compass rose theme. Chen [1997] investigated formations with data from Taiwan. This research revealed a new type of compass rose, containing overlapping square patterns attributed to the existence of daily price limits. Clustering in some areas of the phase portraits was also discovered, and it was suggested that these areas are directly linked with the predictability of stock returns. The “square compass rose” patterns were assumed to provide useful information for forecasting with ARMA or GARCH models and for trading purposes. Lee et al. [1999] examined the compass rose in futures markets data and concluded that not all contracts show the pattern. They also found that the frequency of the data plays a significant role in the appearance of the compass rose. Continuing, Gleason et al. [2000] compared intra-day with daily Forex returns and concluded that the pattern was visible only when the tick/volatility ratio was above some threshold level; they suggested that price discreteness alone was not a determinant in their case. More recently, Wang and Wang [2002] introduced a quality factor and showed that, with an adequate length of the returns sequence, one can obtain strong compass rose patterns. In Crack and Ledoit [1996] it was also suggested that price discreteness, as manifested in the compass rose, could affect the power of various statistical tests. Fang [2002] showed that, as autocorrelation estimates are biased because of price discreteness, a number of random walk tests can also be biased and the underlying asymptotic theory can be rendered invalid when transactions data are used. Earlier research by Kramer and Runde [1997] had also found that price discreteness can cause the BDS test (Brock et al. [1987]) to reject a correct null in 80% of the cases. More recently, Amilon [2003], examining low-priced Australian stocks, has shown that GARCH models are misspecified when applied to returns calculated from discrete prices. Price clustering and discreteness, as manifested in the compass rose, have undoubtedly occupied the financial literature since Crack and Ledoit [1996]. However, the compass rose as a phase portrait is effectively a tool that illustrates time-based interdependencies in a sequence of observations. In nonlinear-dynamical time series analysis it is used for revealing the dynamics of the data generating processes (DGP) in 2-dimensional or 3-dimensional phase spaces. Nonlinear science articles are replete with such diagrams, which provide qualitative evidence for the existence of complex dynamics in the DGP.2 It has been argued in Crack and Ledoit [1996] that the compass rose per se does not offer a great deal of information; rather, it confirms the clustering and discreteness of stock closing prices. Koppl and Nardone [2001] have shown that, contrary to Crack and Ledoit [1996], there is a way to provide a more objective view of the discretized dynamics through the compass rose. From a simple manipulation of the information in the phase portraits, they obtain angular distributions of its points and produce evidence towards the non-randomness and predictability of the dynamics.
Influenced mainly by Szpiro [1998] and Koppl and Nardone [2001] and the broader bibliography on price clustering and discreteness, in the following section we examine the case of the compass rose, revealing even more information than was originally claimed.

2 For more information and background theory refer to texts such as Abarbanel [1995], Kaplan and Glass [1995], Kantz and Schreiber [1997] and Urbach [2000], among others. The main limitation of phase portraits is that they can only provide a visualization of the dynamics of a sequence up to 3 dimensions. For higher-dimensional dynamics one should resort to other kinds of diagrams, such as recurrence plots. Refer to Eckmann et al. [1987] and Antoniou and Vorlow [2000, 2004a] for more on this area.

16.3. Methodology and Results

Our data set refers to companies from the FTSE ALL SHARE index, spanning the period January 1st, 1970 to May 30th, 2003 (a maximum of 8717 daily observations). Due to space limitations, we provide graphical results here for a few stocks only. Generating a compass rose is a trivial matter. In Fig. 16.1 we have generated the compass rose for the logarithmic returns (not percentage returns, as in Crack and Ledoit [1996]) of British-American Tobacco (subfigure a). In subfigure (b) we have generated the scatter diagram of the respective closing prices against their corresponding logarithmic returns. Figure 16.1(b) depicts, to our knowledge for the first time in the financial literature, manifestations of price clustering and discreteness such as the ones in the compass rose. We can see very clearly the trajectories of points converging to the horizontal axis. The curvature of these trajectories may be attributed to the slight curvature observed in the compass rose rays, as explained in Szpiro [1998]. In physics or mathematics, the plot of a function against its first derivative is called a “phase plot”; we can thus loosely term Fig. 16.1(b) a phase plot. What we observe in this kind of diagram is a set of correlation and anticorrelation patterns at various price-return levels, a formation that appears in all the daily returns-prices sequences we examined. In Antoniou and Vorlow [2004a] it is suggested that there may be a more regular structure in both compass roses and phase plots, which is revealed once a non-systematic component is extracted from the returns sequences using wavelet-based methods. Preliminary results have shown that when we reshuffle returns sequences using a constrained randomization scheme used in the detection of nonlinear dynamics (Theiler et al. [1992]), both the compass rose and phase plot patterns disappear (Vorlow [2004]). This is often an adequate indication of the presence of nonlinear-deterministic dynamics in the DGPs of the returns processes (Kaplan and Glass [1995]).
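For reference, a minimal sketch of the two constructions used here, assuming a numpy array P of strictly positive daily closing prices (the function names are ours and purely illustrative); the surrogate below is a plain permutation, a simpler stand-in for the constrained randomization of Theiler et al. [1992]:

```python
import numpy as np

def compass_rose_data(P):
    """(lagged log-return, log-return) pairs, as in Fig. 16.1(a)."""
    r = np.diff(np.log(P))           # logarithmic returns, as in the text
    return r[:-1], r[1:]

def phase_plot_data(P):
    """(closing price, log-return) pairs, as in Fig. 16.1(b)."""
    r = np.diff(np.log(P))
    return P[1:], r

def shuffled_surrogate(r, seed=0):
    """Permutation surrogate: same marginal distribution, destroyed
    temporal ordering; under reshuffling both patterns should vanish."""
    return np.random.default_rng(seed).permutation(r)
```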

16.3.1. The Compass Rose and Complex Dynamics

Following Koppl and Nardone [2001], Antoniou and Vorlow [2004b] and Vorlow [2004], we investigated the angles of the rays (lines) formed by connecting each point of the compass rose with its center, measured against the horizontal axis (henceforth referred to as “arcs”). As we can see for the BP stock in Fig. 16.3(b), the distribution of these arc values is multimodal. This is often linked to the presence of nonlinear deterministic dynamics (Kantz and Schreiber [1997], Kaplan and Glass [1995]). The time sequence of the arcs is depicted in Fig. 16.3(c) and a sorted version of the arc values in Fig. 16.3(d). In all subfigures we can discern the levels and positions of the clustering of the arc values. This provides an alternative view of price discreteness and clustering as manifested in the compass rose, one which can be regarded as more objective (Koppl and Nardone [2001]). In Antoniou and Vorlow [2004b] and Vorlow [2004], evidence is provided in support of predictable non-stochastic dynamics. Here we employ a different methodology in independent support of the same hypothesis. We apply Recurrence Quantification Analysis (RQA: Zbilut and Webber [1992], Zbilut et al. [1998, 2000]) on the time sequence of arcs (Fig. 16.3c).
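A minimal sketch of the arc computation (the naming is ours, not the chapter's original code), assuming a numpy array r of logarithmic returns:

```python
import numpy as np

def compass_rose_arcs(r):
    """Angle, in radians, of the ray joining the origin with each point
    (r_t, r_{t+1}) of the compass rose (cf. Koppl and Nardone [2001])."""
    return np.arctan2(r[1:], r[:-1])

# A histogram of the arcs exposes the multimodality discussed in the text:
# counts, edges = np.histogram(compass_rose_arcs(r), bins=64)
```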

Figure 16.1. Compass rose and phase diagram: (a) compass rose (returns against lagged returns); (b) phase plot (BAT closing prices against logarithmic returns).

Our approach is not without precedent (e.g., refer to Gilmore [1996], Antoniou and Vorlow [2000], Hołyst et al. [2001], Belaire-Franch et al. [2002]; for the applicability of and intuition behind RQA refer also to McGuire et al. [1997], Thiel et al. [2004a]). RQA was born as an attempt to quantify the information obtained through Recurrence Plots, introduced by Eckmann et al. [1987] (see also Casdagli [1997] for general information on applications and extensions). Recurrence plots are effectively a graphical representation of nonlinear time correlation matrices.

Figure 16.2. A detail of the phase plot of BP logarithmic prices against logarithmic returns (BP closing prices on the horizontal axis, BP returns on the vertical).

In order to generate such a plot, we first need to embed the time series (Takens [1981], Packard et al. [1980], Sauer et al. [1991]) and reconstruct the phase space dynamics. More precisely, the following steps are needed to create a recurrence plot:

1. Generate $m$ embedded vectors $y$ from the original sequence $x$, for time delay $\tau$ and embedding dimension $d_E$:

\[
y_1, y_2, y_3, \ldots, y_m, \tag{16.4}
\]

where $m = N - (d_E - 1)\tau$ and

\[
y_k = [x_k, x_{k+\tau}, x_{k+2\tau}, \ldots, x_{k+(d_E-1)\tau}] = y(k), \tag{16.5}
\]

for $k = 1, 2, \ldots, m$. This is called the “embedding” of the sequence $x$.

2. Choose a threshold (also termed radius or resolution) $\varepsilon$ and find all pairs of vectors $y$ that are closer than this distance:

\[
\| y(i) - y(j) \| < \varepsilon. \tag{16.6}
\]

Here $i$ corresponds to the horizontal axis and $j$ to the vertical one. After the determination of $i$ and $j$, the recurrence plot can be generated, with a black dot at coordinates $(i, j)$ denoting closeness of the embedded vectors $y(i)$ and $y(j)$.3 Hence, the analytical definition of a recurrence plot is

\[
R_{i,j} = \Theta(\varepsilon - \| y(i) - y(j) \|), \qquad i, j = 1, \ldots, m, \tag{16.7}
\]

where $y$ denotes the vectors obtained from the embedding in eq. (16.5), $\Theta(\cdot)$ is the Heaviside function and $\varepsilon$ is the predefined threshold as in eq. (16.6), usually measured in units of the standard deviation of the time series $x$.

3 Such recurrence plots are referred to as “thresholded”.
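A minimal Python sketch of the two steps above (the function names and default parameter values are illustrative assumptions of ours; the no-embedding defaults dE = τ = 1 follow the discussion below):

```python
import numpy as np

def embed(x, dE=1, tau=1):
    """Time-delay embedding, eqs. (16.4)-(16.5): m = N - (dE - 1) * tau
    vectors y(k) = [x_k, x_{k+tau}, ..., x_{k+(dE-1)tau}]."""
    N = len(x)
    m = N - (dE - 1) * tau
    return np.column_stack([x[j * tau : j * tau + m] for j in range(dE)])

def recurrence_matrix(x, dE=1, tau=1, eps=0.1):
    """Thresholded recurrence matrix, eq. (16.7): R[i, j] = 1 whenever
    ||y(i) - y(j)|| < eps (Euclidean norm). Uses O(m^2) memory: a sketch,
    not an implementation tuned for long series."""
    y = embed(np.asarray(x, dtype=float), dE, tau)
    d = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    return (d < eps).astype(np.uint8)
```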


Interpreting a recurrence plot may be a difficult task and may require some experience. Recurrence plots are, by construction, symmetric around the 45-degree main diagonal. Usually, any line segments parallel to the main diagonal are evidence of recurrent, predictable and possibly chaotic dynamics; the largest of these segments is inversely proportional to the largest Lyapunov exponent. Complex, nonlinear deterministic dynamics are typically characterized by relatively short line segments, whereas very long ones imply pure determinism. These segments refer to the time and point in the dynamics of the system where the attractor revisits the same area of the phase space. A lack of structure, with points scattered uniformly all over the plot, can be regarded as evidence of random dynamics. The density of the dark areas differs around the main diagonal when the dynamics exhibit trends, drifts and nonstationarity. Vertical and horizontal line segments refer to the time the system remains in a stable state or exhibits slowly drifting dynamics (“laminar states”). Line segments perpendicular to the main diagonal can be an artifact of incorrect embedding.

There are two important points in RQA and recurrence plots in general. Firstly, the parameters of time delay τ and embedding dimension dE are obviously crucial for the reconstruction of the phase space dynamics (according to Takens [1981], we require an infinite amount of noise-free data of infinite accuracy). Due to the noisy character of many financial time series it is very difficult to obtain a correct embedding, as dE usually cannot be estimated accurately. However, we follow here Iwanski and Bradley [1998] and Thiel et al. [2004b], who show that even without embedding (dE = τ = 1) the RQA results can reveal qualitatively the same dynamics as those obtained with the correct embedding. The second point is the choice of the radius ε. This is usually determined on the basis of the standard deviation of the x sequence; when the noise component is known, thresholds (radii) up to five times its standard deviation may be appropriate (Thiel et al. [2002], Matassini et al. [2002]). A threshold of the magnitude of the lower 10% of the entire distance range is suggested in Webber Jr. and Zbilut [1994], and we also employ this strategy in the analysis that follows (see section 16.3.2.). In Atay and Altıntaş [1999] it is suggested that the time delay τ is a more crucial parameter than the embedding dimension dE, and that the average length of line segments parallel to the main diagonal can be insensitive to the choice of the latter. According to Thiel et al. [2004a], the most crucial statistic for quantifying the predictability of the system is the distribution of diagonals. It is also suggested that the topological reconstruction of an attractor from the thresholded recurrence plot is possible, whether the series exhibits deterministic or stochastic dynamics or even a mixture of these. Hence, current developments in RQA and recurrence plots suggest that even with inaccurate embedding and a loose determination of the threshold level, we can obtain an almost complete image of the dynamical information on the system examined through the time series.
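Of the threshold heuristics above we adopt the 10% rule of Webber Jr. and Zbilut [1994]; a sketch, reusing the embed helper from the previous listing (the minimum pairwise distance is zero, since self-distances on the main diagonal vanish, so 10% of the range reduces to 10% of the maximum):

```python
import numpy as np

def radius_10pct(y):
    """Threshold at the lower 10% of the entire distance range
    (Webber Jr. and Zbilut [1994]) for embedded vectors y of shape (m, dE)."""
    d = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    return 0.10 * d.max()        # min distance is 0 on the main diagonal
```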

16.3.2. RQA Results

In Fig. 16.4 we provide details of the recurrence plot of the arcs of the BP stock compass rose, as well as details from recurrence plots of other known processes. We can clearly see in Fig. 16.4(a) and (b) that the BP plots are very different from that of a Gaussian white noise (Fig. 16.4e) or a Brownian motion (Fig. 16.4d).

Figure 16.3. Arc distribution for the BP stock (in radians): (a) compass rose; (b) arc distribution; (c) time sequence of the BP compass rose arcs; (d) sorted BP compass rose arcs.

In Fig. 16.4(c) we have a recurrence plot of the Lorenz system, which is known to exhibit chaotic (nonlinear deterministic) dynamics, whereas in Fig. 16.4(f) we have a recurrence plot of a sine function, which is purely deterministic (non-chaotic). The dynamics revealed in Fig. 16.4(a,b) are much more consistent with those of the Lorenz system of equations than with any other process. Indeed, the BP recurrence plot is replete with structures that can be regarded as evidence of predictable and complex dynamics. However, it becomes clear that the visual inspection of a recurrence plot can provide a very subjective view of the dynamics. For this reason Zbilut and Webber [1992] have suggested a quantification of the visual information, which includes entropy-based criteria, so as to avoid any ambiguities in interpretation. For our analysis, following the considerations outlined in the previous section, we employed for eq. (16.6) a range of threshold levels (radii ε) starting from one standard deviation (s = 1.8) of the BP compass rose arcs up to 2.3 (see Table 16.1). We concentrate only on a subset of the RQA measures (Zbilut and Webber [1992], Zbilut et al. [1998], Marwan et al. [2002]), chosen for their clarity of interpretation in the current exercise. More precisely, in Table 16.1 we report the values of the RQA on the BP compass rose arcs for the following

Figure 16.4. Recurrence Plots: (a), (b) BP arcs (details); (c) Lorenz; (d) Brownian motion; (e) white noise; (f) sine function.

statistics, according to the radius ε:

• %REC (Recurrence): the percentage of recurrent points in the plot.

• %DET (Determinism): the percentage of recurrent points which form lines parallel to the main diagonal. In the case of a deterministic system, these parallel lines indicate trajectories that remain close in phase space for time scales equal to the length of these lines.

• MAXL (Maximum Line): the length of the longest diagonal line, in points. The inverse (1/MAXL) of this measure is related to the maximal (positive) Lyapunov exponent of the sequence.

• ENT (Entropy): the entropy of the distribution of line lengths in the plot, measured in bits. This measure does not refer to the entropy of the sequence analyzed but is intended as an indicator of the “structureness” of the recurrence plot and the complexity of the deterministic structure of the dynamics.

• TREND: a measure of the paling of the plot towards the edges. It is estimated as the slope of the linear regression of %REC on the displacement from the main diagonal (excluding the last 10% of the total range) and is expressed in units of the percentage of local recurrence per 1000 points. It provides information on the stationarity of the process and the presence of any trend or drift.

• %LAM (Laminarity): analogous to %DET above; it measures the percentage of vertical lines, which indicate the occurrence of laminar states, i.e., periods of tranquility or slowly drifting dynamics (Marwan et al. [2002]).

• TRAP (Trapping Time): the average length of all vertical lines, indicating the average time the system is “trapped” in a laminar state as defined above (Marwan et al. [2002]).

Table 16.1. Recurrence Quantification Analysis of the arcs sequence from the compass rose of BP returns. Results for a range of radii based on the standard deviation of the process. Standard deviation of process analyzed: 1.8; type of norm in eq. (16.7): Euclidean.

Radius  %REC   %DET   MAXL   ENT    TREND   %LAM   TRAP
1.80    49.36  36.61  30.00  2.48   −0.01   21.78  6.20
1.83    49.99  37.27  30.00  2.50   −0.01   22.47  6.24
1.85    50.47  37.72  30.00  2.51   −0.01   23.20  6.26
1.88    50.86  38.09  30.00  2.52    0.00   23.72  6.29
1.90    51.78  39.00  30.00  2.54   −0.01   25.02  6.33
1.93    52.23  39.39  30.00  2.55   −0.01   25.50  6.35
1.95    52.68  39.79  30.00  2.56    0.00   26.15  6.38
1.98    53.24  40.38  30.00  2.57    0.03   26.91  6.40
2.00    53.61  40.67  30.00  2.58    0.04   27.36  6.42
2.02    54.02  41.05  30.00  2.59    0.03   27.91  6.45
2.05    55.14  42.30  30.00  2.63   −0.02   29.66  6.53
2.08    55.51  42.60  30.00  2.63   −0.02   30.26  6.55
2.10    55.94  43.04  30.00  2.64   −0.01   30.92  6.57
2.13    56.52  43.56  30.00  2.66    0.00   31.69  6.61
2.15    56.92  43.95  30.00  2.67   −0.00   32.26  6.64
2.17    57.65  44.78  31.00  2.69   −0.01   33.49  6.70
2.20    58.08  45.18  31.00  2.70   −0.01   34.17  6.74
2.23    58.67  45.79  31.00  2.71   −0.00   35.18  6.78
2.25    59.11  46.27  31.00  2.73    0.01   35.93  6.83
2.27    59.54  46.70  31.00  2.74    0.02   36.55  6.86
2.30    59.94  47.10  31.00  2.75    0.03   37.21  6.89

Table 16.2. Recurrence Quantification Analysis of the arcs sequence from the compass rose of BP returns. Results for a range of radii based on the percentage of the maximum rescaled distance. Standard deviation of process analyzed: 1.8; type of norm in eq. (16.7): maximum norm.

Radius  %REC    %DET    MAXL  ENT    TREND    %LAM   TRAP
1%      2.787   0.082   7     0.728  -0.119   0      N/A
2%      4.316   0.138   7     0.696  -0.05    0      N/A
3%      6.085   0.281   8     0.702  -0.02    0.008  5
4%      8.156   0.592   9     0.859  -0.011   0.023  5
5%      9.608   0.937   9     0.929   0.012   0.076  5
6%      11.676  1.648   10    1.057  -0.014   0.123  5
7%      13.227  2.286   11    1.166   0.055   0.227  5.018
8%      15.469  3.37    13    1.283  -0.03    0.36   5.127
9%      17.048  4.33    14    1.355  -0.003   0.603  5.225
10%     18.868  5.56    15    1.435  -0.003   0.949  5.25
11%     20.551  6.809   17    1.509   0.027   1.315  5.303
12%     21.851  7.895   21    1.567   0.082   1.672  5.355
13%     24.674  10.436  21    1.696  -0.114   2.56   5.509
14%     25.799  11.544  21    1.738  -0.065   3.028  5.538
15%     27.352  12.996  21    1.794  -0.005   3.738  5.598
16%     29.099  14.836  21    1.858  -0.005   4.624  5.64
17%     30.562  16.316  22    1.915   0.013   5.289  5.687
18%     32.619  18.684  22    1.99   -0.039   6.528  5.727
19%     33.819  19.934  22    2.026  -0.008   7.219  5.754
20%     35.219  21.412  24    2.069   0.004   8.242  5.777

Table 16.1 shows the values of the above statistics for the range of radii chosen. It is straightforward to see that the recurrence plot of the BP compass rose arcs exhibits significant structure (confirmed also by Figs. 16.4a and 16.4b). Usually, for independently and identically distributed (iid) data, one would expect a value of %DET close to 0 and similarly low levels of %REC. The MAXL shows that there is a maximum period of 30 days where the dynamics of the arc sequence are concentrated in the same region of the phase space, providing a lower bound for the maximum Lyapunov exponent of 1/30 ≈ 0.033. This value, although positive, is fairly small and could suggest some “containment” of the dynamics, in the sense that a “large” MAXL indicates less instability than a smaller one. Concentrating on the %LAM and TRAP measures, we see that the former ranges between 21.78% and 37.21%, a high value indicating the level of laminar states in the sequence. The latter lies between 6 and 7 days, indicating the average length of time that the sequence dynamics were “trapped” in a stable condition or exhibited slowly drifting dynamics. The evidence of RQA here corroborates the qualitative information obtained from the recurrence plot itself (Fig. 16.4). In this case we cannot exclude that the sequence of arcs of the BP compass rose exhibits significant non-random structure and possibly nonlinear deterministic complexity.

To view the RQA results through a different resolution, we also produce in Table 16.2 the same quantification measures for radii ranging from 1% to 20% of the maximum rescaled distance in the recurrence plot of the sequence. Moreover, we experimented with a different type of norm for eq. (16.7). This produces an effectively almost identical recurrence plot, but because of the different magnitudes of ε and the type of norm, the RQA measures differ. However, we can observe again that for a radius of 10%, as in Webber Jr. and Zbilut [1994], we have a considerable %REC of 18.87% and a low entropy (ENT), which suggests strong structure and predictability of the dynamics. The levels of trapping time (TRAP) are not that different from those of Table 16.1. It is obvious that even with a different distance measure and different threshold levels, we still obtain evidence of complex and predictable dynamics. On the basis of our qualitative and quantitative results, we also cannot refute the presence of nonlinear determinism.
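For completeness, a minimal sketch of how three of the reported measures can be computed from a thresholded recurrence matrix R (reusing recurrence_matrix from the earlier listing; lmin, the minimum accepted line length, is an illustrative parameter of ours, not a value taken from the chapter):

```python
import numpy as np

def diagonal_lengths(R, lmin=2):
    """Lengths of diagonal line segments parallel to the main diagonal,
    counted once in the upper triangle (the plot is symmetric)."""
    lengths = []
    for k in range(1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            else:
                if run >= lmin:
                    lengths.append(run)
                run = 0
        if run >= lmin:
            lengths.append(run)
    return np.array(lengths)

def rqa_summary(R, lmin=2):
    """%REC, %DET and MAXL as defined in section 16.3.2."""
    mask = ~np.eye(R.shape[0], dtype=bool)              # drop trivial diagonal
    rec = 100.0 * R[mask].mean()                        # %REC
    L = diagonal_lengths(R, lmin)
    det = 100.0 * 2 * L.sum() / max(R[mask].sum(), 1)   # %DET (both triangles)
    maxl = int(L.max()) if L.size else 0                # MAXL
    return rec, det, maxl
```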

16.4. Conclusion and Future Research

In the previous sections we have demonstrated how the information obtained from the compass rose and RQA can be used to obtain a view of the complex dynamical character of stock returns processes. Our RQA results suggest the presence of complex non-stochastic dynamics and predictable structures in the data generating processes, in line with Antoniou and Vorlow [2004a,b], Vorlow [2004] and Hołyst et al. [2001]. An obvious problem with financial time series is that there is no clear information on the noise component of the sequences (its level and type). While recurrence plots and RQA are quite robust to the presence of noise, it would be an interesting exercise to experiment with blind signal separation techniques, such as Independent Component Analysis or wavelets, so as to preprocess the time series before analysis (e.g., see Antoniou and Vorlow [2004a]). This could eventually provide an alternative strategy for estimating the systematic and unsystematic components of stock prices and could be of potential use in risk modelling and forecasting. With respect to the last point, the approach followed here suggests that RQA can form the basis of a forecasting exercise. This is an area we are currently investigating.

Acknowledgments

We wish to thank James Ramsey, Enrico Capobianco, Abhay Abyankar, Timothy Crack, Tassos Malliaris and the participants of the Microstructure workshop (organized by CentER, University of Tilburg (NL), April 2004) for their useful comments and suggestions during the initial stages and the progress of this work. We also wish to acknowledge the valuable help and support of Duncan Rand and the University of Durham “High Performance Computing Service”. The author retains responsibility for any errors.


References

H.D.I. Abarbanel. Analysis of Observed Chaotic Data. Springer-Verlag, New York, 1995.
Henrik Amilon. GARCH estimation and discrete stock prices: an application to low-priced Australian stocks. Economics Letters, 81(2):215–222, 2003.
A. S. Andreou, G. Pavlides, and A. Karytinos. Nonlinear time-series analysis of the Greek exchange-rate market. International Journal of Bifurcation and Chaos, 10(7):1729–1759, 2000.
Antonios Antoniou and Constantinos E. Vorlow. Recurrence plots and financial time series analysis. Neural Network World, 10(1-2):131–146, 2000.
Antonios Antoniou and Constantinos E. Vorlow. Recurrence quantification analysis of wavelet pre-filtered index returns. Physica A: Statistical Mechanics and Its Applications, 2004a. Forthcoming.
Antonios Antoniou and Constantinos E. Vorlow. Price Clustering and Discreteness: Is there Chaos behind the Noise? Physica A: Statistical Mechanics and Its Applications, 2004b. URL http://arxiv.org/abs/cond-mat/0407471. Forthcoming.
F. M. Atay and Y. Altıntaş. Recovering smooth dynamics from time series with the aid of recurrence plots. Physical Review E, 59(6):6593–6598, 1999. DOI: 10.1103/PhysRevE.59.6593.
C.A. Ball, W.N. Torous, and A.E. Tschoegl. The Degree of Price Resolution: The Case of the Gold Market. The Journal of Futures Markets, 5(1):29–43, 1985.
J. Belaire-Franch, D. Contreras, and L. Tordera-Lledó. Assessing nonlinear structures in real exchange rates using recurrence plot strategies. Physica D, 171(4):249–264, 2002. DOI: 10.1016/S0167-2789(02)00625-5.
W.A. Brock, W. Dechert, and J. Scheinkman. A test for independence based upon the correlation dimension. Working paper, University of Wisconsin, 1987.
M. C. Casdagli. Recurrence plots revisited. Physica D: Nonlinear Phenomena, 108(1-2):12–44, 1997.
An-Sing Chen. The square compass rose: the evidence from Taiwan. Journal of Multinational Financial Management, 7(2):127–144, 1997.
Ping Chen. Searching for economic chaos: A challenge to econometric practice and nonlinear tests. In Richard H. Day and Ping Chen, editors, Nonlinear Dynamics and Evolutionary Economics, pages 217–253. Oxford University Press, Oxford; New York; Toronto and Melbourne, 1993.
David Chinhyung Cho and Edward W. Frees. Estimating the Volatility of Discrete Stock Prices. Journal of Finance, 43(2):451–466, 1988.
Timothy Falcon Crack and Olivier Ledoit. Robust structure without predictability: The “compass rose” pattern of the stock market. Journal of Finance, 51(2):751–762, 1996.
J.P. Eckmann, S. Oliffson Kamphorst, and D. Ruelle. Recurrence Plots of Dynamical Systems. Europhysics Letters, 4:973–977, 1987.
Arthur J. Enright. Searching for chaotic components in financial time-series. PhD thesis, Pace University, 1992.
Yue Fang. The compass rose and random walk tests. Computational Statistics & Data Analysis, 39(3):299–310, 2002.
Philip Hans Franses. Time Series Models for Business and Economic Forecasting. Cambridge University Press, 1998.
Philip Hans Franses and Dick van Dijk. Nonlinear Time Series Models in Empirical Finance. Cambridge University Press, 2000.
C. G. Gilmore. Detecting linear and nonlinear dependence in stock returns: New methods derived from chaos theory. Journal of Business Finance & Accounting, 23(9–10):1357–1377, 1996.
Kimberly C. Gleason, Chun I. Lee, and Ike Mathur. An explanation for the compass rose pattern. Economics Letters, 68(2):127–133, 2000.
Gary Gottlieb and Avner Kalay. Implications of the Discreteness of Observed Stock Prices. Journal of Finance, 40(1):135–153, 1985.
Lawrence Harris. Estimation of Stock Price Variances and Serial Covariances from Discrete Observations. Journal of Financial and Quantitative Analysis, 25(3):291–306, 1990.
Lawrence Harris. Stock Price Clustering and Discreteness. Review of Financial Studies, 4(3):389–415, 1991.
J. A. Hołyst, M. Zebrowska, and K. Urbanowicz. Observations of deterministic chaos in financial time series by recurrence plots, can one control chaotic economy? The European Physical Journal B, 20(4):531–535, 2001.
J. S. Iwanski and E. Bradley. Recurrence plots of experimental data: To embed or not to embed? Chaos, 8(4):861–871, 1998.
H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Number 7 in Cambridge Nonlinear Science Series. Cambridge University Press, UK, 1997.
Daniel T. Kaplan and Leon Glass. Understanding Nonlinear Dynamics. Textbooks in Mathematical Sciences. Springer-Verlag, New York, 1995.
Roger Koppl and Carlo Nardone. The Angular Distribution of Asset Returns in Delay Space. Discrete Dynamics in Nature and Society, 6:101–120, 2001.
Walter Kramer and Ralf Runde. Chaos and the compass rose. Economics Letters, 54(2):113–118, 1997.
Chun I. Lee, Kimberly C. Gleason, and Ike Mathur. A comprehensive examination of the compass rose pattern in futures markets. The Journal of Futures Markets, 19(5):541–564, 1999.
N. Marwan, N. Wessel, U. Meyerfeldt, A. Schirdewan, and J. Kurths. Recurrence-plot-based measures of complexity and their application to heart rate variability data. Physical Review E, 66:026702.1–026702.8, 2002.
Lorenzo Matassini, Holger Kantz, Janusz Holyst, and Rainer Hegger. Optimizing of recurrence plots for noise reduction. Physical Review E, 65(1):021102, 2002.
G. McGuire, N. B. Azar, and M. Shelhamer. Recurrence matrices and the preservation of dynamical properties. Physics Letters A, 237(1-2):43–47, 1997. DOI: 10.1016/S0375-9601(97)00697-X.
Michael D. McKenzie and Alex Frino. The tick/volatility ratio as a determinant of the compass rose: empirical evidence from decimalisation on the NYSE. Accounting & Finance, 43(3):331–331, 2003.
Victor Niederhoffer. Clustering of Stock Prices. Operations Research, 13(2):258–265, 1965.
Victor Niederhoffer. A New Look at Clustering of Stock Prices. The Journal of Business, 39(2):309–313, 1966.
Victor Niederhoffer and M.F.M. Osborne. Market Making and Reversal on the Stock Exchange. Journal of the American Statistical Association, 61(316):897–916, 1966.
M.F.M. Osborne. Periodic Structure in the Brownian Motion of Stock Prices. Operations Research, 10(3):345–379, 1962.
N.H. Packard, J.P. Crutchfield, J.D. Farmer, and R.S. Shaw. Geometry from a time series. Physical Review Letters, 45:712–715, 1980.
G. Papaioannou and A. Karytinos. Nonlinear time series analysis of the stock exchange: The case of an emerging market. International Journal of Bifurcation and Chaos, 5(6):1557–1585, 1995.
Malcolm Pemberton and Nicholas Rau. Mathematics for Economists. Manchester University Press, 2001.
Eric Rosenfeld. Stochastic Processes of Common Stock Returns: An Empirical Examination. PhD thesis, MIT Sloan School of Management, 1980.
T. Sauer, J.A. Yorke, and M. Casdagli. Embedology. Journal of Statistical Physics, 65(3-4):579–616, 1991.
George G. Szpiro. Tick size, the compass rose and market nanostructure. Journal of Banking & Finance, 22(12):1559–1569, 1998.
F. Takens. Detecting strange attractors in turbulence. In D. Rand and L.-S. Young, editors, Dynamical Systems and Turbulence, page 366. Springer, 1981.
James Theiler, Stephen Eubank, A. Longtin, Bryan Galdrikian, and J. Doyne Farmer. Testing for nonlinearity in time series: the method of surrogate data. Physica D: Nonlinear Phenomena, 58(1-4):77–94, 1992.
Marco Thiel, M. Carmen Romano, Jürgen Kurths, Riccardo Meucci, Enrico Allaria, and F. Tito Arecchi. Influence of observational noise on the recurrence quantification analysis. Physica D: Nonlinear Phenomena, 171(3):138–152, 2002.
Marco Thiel, M. Carmen Romano, and Jürgen Kurths. How much information is contained in a recurrence plot? Physics Letters A, 2004a. Forthcoming.
Marco Thiel, M. Carmen Romano, Jürgen Kurths, and P. Read. Estimation of dynamical invariants without embedding by recurrence plots. Chaos, 14(2):234–243, 2004b.
Richard Urbach. Footprints of Chaos in the Markets: Analyzing Non-Linear Time Series in Financial Markets and Other Real Systems. Financial Times/Prentice Hall, Philadelphia and London, 2000.
Constantinos E. Vorlow. Stock Price Clustering and Discreteness: The “Compass Rose” and Predictability. 2004. URL http://arxiv.org/abs/cond-mat/0408013. Under review in Economics Letters.
Eliza Wang, Robert Hudson, and Kevin Keasey. Tick size and the compass rose: further insights. Economics Letters, 68(2):119–125, 2000.
Huaiqing Wang and Chen Wang. Visibility of the compass rose in financial asset returns: A quantitative study. Journal of Banking & Finance, 26(6):1099–1111, 2002.
Charles L. Webber Jr. and Joseph P. Zbilut. Dynamical assessment of physiological systems and states using recurrence plot strategies. Journal of Applied Physiology, 76:965–973, 1994.
Joseph P. Zbilut and Charles L. Webber. Embeddings and delays as derived from quantification of recurrence plots. Physics Letters A, 171(3-4):199–214, 1992.
Joseph P. Zbilut, Alessandro Giuliani, and Charles L. Webber Jr. Recurrence quantification analysis and principal components in the detection of short complex signals. Physics Letters A, 237(3):131–135, 1998.
Joseph P. Zbilut, Alessandro Giuliani, and Charles L. Webber Jr. Recurrence quantification analysis as an empirical test to distinguish relatively short deterministic versus random number series. Physics Letters A, 267(2-3):174–178, 2000.
