
Mathematics Textbooks for Science and Engineering

Charles K. Chui Qingtang Jiang

Applied Mathematics Data Compression, Spectral Methods, Fourier Analysis, Wavelets, and Applications

Mathematics Textbooks for Science and Engineering Volume 2

For further volumes: http://www.springer.com/series/10785

Charles K. Chui · Qingtang Jiang

Applied Mathematics Data Compression, Spectral Methods, Fourier Analysis, Wavelets, and Applications

Charles K. Chui
Department of Statistics
Stanford University
Stanford, CA, USA

Qingtang Jiang
Department of Mathematics and Computer Science
University of Missouri
St. Louis, MO, USA

ISBN 978-94-6239-008-9
ISBN 978-94-6239-009-6 (eBook)
DOI 10.2991/978-94-6239-009-6

Library of Congress Control Number: 2013939577
Published by Atlantis Press, Paris, France
www.atlantis-press.com
© Atlantis Press and the authors 2013
This book, or any parts thereof, may not be reproduced for commercial purposes in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system known or to be invented, without prior permission from the Publisher.
Printed on acid-free paper

Series Information Textbooks in the series ‘Mathematics Textbooks for Science and Engineering’ will be aimed at the broad mathematics, science and engineering undergraduate and graduate levels, covering all areas of applied and applicable mathematics, interpreted in the broadest sense. Series Editor Charles K. Chui Stanford University, Stanford, CA, USA Atlantis Press 8 square des Bouleaux 75019 Paris, France For more information on this series and our other book series, please visit our website www.atlantis-press.com

Editorial

Recent years have witnessed an extraordinarily rapid advance in the direction of information technology within both the scientific and engineering disciplines. In addition, the current profound technological advances of data acquisition devices and transmission systems contribute enormously to the continuing exponential growth of data information that requires much better data processing tools. To meet such urgent demands, innovative mathematical theory, methods, and algorithms must be developed, with emphasis on such application areas as complex data organization, contaminated noise removal, corrupted data repair, lost data recovery, reduction of data volume, data dimensionality reduction, data compression, data understanding and visualization, as well as data security and encryption. The revolution of the data information explosion as mentioned above demands early mathematical training with emphasis on data manipulation at the college level and beyond. The Atlantis book series, ‘‘Mathematics Textbooks for Science and Engineering (MTSE)’’, is founded to meet the needs of such mathematics textbooks that can be used for both classroom teaching and self-study. For the benefit of students and readers from the interdisciplinary areas of mathematics, computer science, physical and biological sciences, and various engineering specialties, contributing authors are requested to keep in mind that the writings for the MTSE book series should be elementary and relatively easy to read, with sufficient examples and exercises. We welcome submission of such book manuscripts from all who agree with us on this point of view. This second volume is intended to be a comprehensive textbook in ‘‘Contemporary Applied Mathematics’’, with emphasis in the following five areas: spectral methods with applications to data analysis and dimensionality reduction; Fourier theory and methods with applications to time-frequency analysis and solution of partial differential equations; Wavelet time-scale analysis and methods, with indepth study of the lifting schemes, wavelet regularity theory, and convergence of cascade algorithms; Computational algorithms, including fast Fourier transform, fast cosine transform, and lapped transform; and Information theory with applications to image and video compression. This book is self-contained, with writing
style friendly towards the teacher and reader. It is intended to be a textbook, suitable for teaching in a variety of courses, including: Applied Mathematics, Applied Linear Algebra, Applied Fourier Analysis, Wavelet Analysis, and Engineering Mathematics, both at the undergraduate and beginning graduate levels.

Charles K. Chui Menlo Park, CA

Preface

Mathematics was coined the ‘‘queen of science’’ by the ‘‘prince of mathematicians,’’ Carl Friedrich Gauss, one of the greatest mathematicians of all time. Indeed, the name of Gauss is associated with essentially all areas of mathematics. It is therefore safe to assume that to Gauss there was no clear boundary between ‘‘pure mathematics’’ and ‘‘applied mathematics’’. To ensure financial independence, Gauss decided on a stable career in astronomy, which is one of the oldest sciences and was perhaps the most popular one during the eighteenth and nineteenth centuries. In his study of celestial motion and orbits and a diversity of disciplines later in his career, including (in chronological order): geodesy, magnetism, dioptrics, and actuarial science, Gauss has developed a vast volume of mathematical methods and tools that are still instrumental to our current study of applied mathematics. During the twentieth century, with the exciting development of quantum field theory, with the prosperity of the aviation industry, and with the bullish activity in financial market trading, and so forth, much attention was paid to the mathematical research and development in the general discipline of partial differential equations (PDEs). Indeed, the non-relativistic modeling of quantum mechanics is described by the Schrödinger equation; the fluid flow formulation, as an extension of Newtonian physics by incorporating motion and stress, is modeled by the NavierStokes equation; and option stock trading with minimum risk can be modeled by the Black-Scholes equation. All of these equations are PDEs. In general, PDEs are used to describe a wide variety of phenomena, including: heat diffusion, sound wave propagation, electromagnetic wave radiation, vibration, electrostatics, electrodynamics, fluid flow, and elasticity, just to name a few. For this reason, the theoretical and numerical development of PDEs has been considered the core of applied mathematics, at least in the academic environment. On the other hand, over the past two decades, we have witnessed a rapidly increasing volume of ‘‘information’’ contents to be processed and understood. With the recent advances of various high-tech fields and the popularity of social networking, the trend of exponential growth of easily accessible information is certainly going to continue well into the twenty-first century, and the bottleneck created by this information explosion will definitely require innovative solutions
from the scientific and engineering communities, particularly those technologists with better understanding of, and strong background in, applied mathematics. Today, ‘‘big data’’ is among the most pressing research directions. ‘‘Big data’’ research and development initiatives have been created by practically all Federal departments and agencies of the United States. It is noted, in particular, that on May 29, 2012, the Obama administration unveiled a ‘‘big data’’ initiative, announcing $200 million new R&D investments to help solve some of the nation’s most pressing challenges, by improving our ability to extract knowledge and insights from large and complex collections of digital data. To launch the initiative, six Federal departments and agencies later announced more than $200 million in new commitments that promised to greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data. ‘‘Mathematics of big data’’ is therefore expected to provide innovative theory, methods, and algorithms to virtually every discipline, far beyond sciences and engineering, for processing, transmitting, receiving, understanding, and visualizing datasets, which could be very large or live in some high-dimensional spaces. Of course the basic mathematical tools, particularly PDE models and methods, are always among the core of the mathematical tool-box of applied mathematics. But other theory and methods have been integrated in this tool-box as well. One of the most essential ideas is the notion of ‘‘frequency’’ of the data information. A contemporary of Gauss, by the name of Joseph Fourier, instilled this important concept to our study of physical phenomena by his innovation of trigonometric series representations, along with powerful mathematical theory and methods, which significantly expanded the core of the tool-box of applied mathematics. The frequency content of a given dataset facilitates the processing and understanding of the data information. Another important idea is the ‘‘multi-scale’’ structure of datasets. Less than three decades ago, with the birth of another exciting mathematical subject, called ‘‘wavelets’’, the dataset of information can be put in the wavelet domain for multi-scale processing as well. About half of this book is devoted to the study of the theories, methods, algorithms, and computational schemes of Fourier and wavelet analyses. In addition, other mathematical topics, which are essential to information processing but not commonly taught in a regular applied mathematics course, are discussed in this book. These include information coding, data dimensionality reduction, and data compression. The objective of this textbook is to introduce the basic theory and methods in the tool-box of the core of applied mathematics, with a central scheme that addresses information processing with emphasis on manipulation of digital image data. Linear algebra is presented as linear analysis, with emphasis on spectral representation and principal component analysis (PCA), and with applications to data estimation and data dimensionality reduction. For data compression, the notion of entropy is introduced to quantify coding efficiency as governed by Shannon’s Noiseless Coding theorem. Discrete Fourier transform (DFT), followed by the discussion of an efficient computational algorithm, called fast Fourier transform (FFT), as well as a real-valued version of the DFT, called discrete cosine
transform (DCT), are studied, with application to extracting frequency content of the given discrete dataset that facilitates reduction of the entropy and thus significant improvement of the coding efficiency. While DFT is obtained from discretization of the Fourier coefficient integral, four versions of discretization of the Fourier cosine coefficients yield the DCT-I, DCT-II, DCT-III and DCT-IV that are commonly used in applications. The integral version of DCT and Fourier coefficient sequences is called the Fourier transform (FT). Analogous to the Fourier series, the formulation of the inverse Fourier transform (IFT) is derived by applying the Gaussian function as sliding time-window for simultaneous timefrequency localization, with optimality guaranteed by the Uncertainty Principle. Both the Fourier series and Fourier transform are applied to solving certain PDEs. In addition, local time-frequency basis functions are introduced in this textbook by discretization of the frequency-modulated sliding time-window function at the integer lattice points. Replacing the frequency modulation by modulation with the cosines avoids the Balian-Low stability restriction on the local time-frequency basis functions, with application to elimination of blocky artifacts caused by quantization of tiled DCT in image compression. In Chap. 8, multi-scale data analysis is introduced and compared with the Fourier frequency approach; the architecture of multiresolution approximation and analysis (MRA) is applied to the construction of wavelets and formulation of the multi-scale wavelet decomposition and reconstruction algorithms; and the lifting scheme is also introduced to reduce the computational complexity of these algorithms as well as implementation of filter banks. The final two chapters of this book are devoted to an in-depth study of wavelet analysis, including construction of bi-orthogonal wavelets, their regularities, and convergence of the cascade algorithms. The last three chapters alone, namely Chaps. 8–10, constitute a suitable textbook for a one-semester course in ‘‘Wavelet Analysis and Applications’’. For more details, a teaching guide is provided on pages xvi–xix. Most chapters of this Applied Mathematics textbook have been tested in classroom teaching at both the undergraduate and graduate levels, and the final revision of the book manuscript was influenced by student feedback. The authors are therefore thankful to have the opportunity to teach from the earlier drafts of this book in the university environment, particularly at the University of Missouri-St. Louis. In writing this textbook, the authors have benefited from assistance of several individuals. In particular, they are most grateful to Margaret Chui for typing the earlier versions of the first six chapters, to Tom Li for drawing many of the diagrams and improving the artworks, and to Maryke van der Walt for proofreading the entire book and suggesting several changes. In addition, the first author would like to express his appreciation to the publisher, Atlantis Press, for their enthusiasm in publishing this book, and is grateful to Keith Jones and Zeger Karssen, in particular, for their unconditional trust and lasting friendship over the years. He would also like to take this opportunity to acknowledge the generous support from the U.S. Army Research Office and the National Geospatial-Intelligence Agency of
his research in the broad mathematical subject of complex and high-dimensional data processing. The second author is indebted to his family for their sacrifice, patience, and understanding, during the long hours in preparing and writing this book.

Charles K. Chui, Menlo Park, California
Qingtang Jiang, St. Louis, Missouri

Contents

Teaching Guide  xv
Figures  xxi

1 Linear Spaces  1
1.1 Vector Spaces  3
1.2 Sequence and Function Spaces  14
1.3 Inner-Product Spaces  25
1.4 Bases of Sequence and Function Spaces  34
1.5 Metric Spaces and Completion  54

2 Linear Analysis  63
2.1 Matrix Analysis  64
2.2 Linear Transformations  79
2.3 Eigenspaces  89
2.4 Spectral Decomposition  105

3 Spectral Methods and Applications  115
3.1 Singular Value Decomposition and Principal Component Analysis  116
3.2 Matrix Norms and Low-Rank Matrix Approximation  132
3.3 Data Approximation  146
3.4 Data Dimensionality Reduction  157

4 Frequency-Domain Methods  171
4.1 Discrete Fourier Transform  172
4.2 Discrete Cosine Transform  179
4.3 Fast Fourier Transform  190
4.4 Fast Discrete Cosine Transform  199

5 Data Compression  207
5.1 Entropy  208
5.2 Binary Codes  217
5.3 Lapped Transform and Compression Schemes  231
5.4 Image and Video Compression  255

6 Fourier Series  263
6.1 Fourier Series  266
6.2 Fourier Series in Cosines and Sines  276
6.3 Kernel Methods  288
6.4 Convergence of Fourier Series  295
6.5 Method of Separation of Variables  305

7 Fourier Time-Frequency Methods  317
7.1 Fourier Transform  319
7.2 Inverse Fourier Transform and Sampling Theorem  329
7.3 Isotropic Diffusion PDE  339
7.4 Time-Frequency Localization  351
7.5 Time-Frequency Bases  360
7.6 Appendix on Integration Theory  373

8 Wavelet Transform and Filter Banks  379
8.1 Wavelet Transform  381
8.2 Multiresolution Approximation and Analysis  390
8.3 Discrete Wavelet Transform  405
8.4 Perfect-Reconstruction Filter Banks  419

9 Compactly Supported Wavelets  433
9.1 Transition Operators  436
9.2 Gramian Function Gφ(ω)  445
9.3 Compactly Supported Orthogonal Wavelets  451
9.4 Compactly Supported Biorthogonal Wavelets  465
9.5 Lifting Schemes  479

10 Wavelet Analysis  499
10.1 Existence of Refinable Functions in L2(R)  501
10.2 Stability and Orthogonality of Refinable Functions  510
10.3 Cascade Algorithms  517
10.4 Smoothness of Compactly Supported Wavelets  535

Index  547

Teaching Guide

This book is an elementary and yet comprehensive textbook that covers most of the standard topics in Applied Mathematics, with a unified theme of applications to data analysis, data manipulation, and data compression. It is a suitable textbook, not only for most undergraduate and graduate Applied Mathematics courses, but also for a variety of Special Topics courses or seminars. The objective of this guide is to suggest several samples of such courses.

(1) A general “Applied Mathematics” course: Linear Spaces, Linear Analysis, Fourier Series, Fourier Transform, Partial Differential Equations, and Applications
(2) Applied Mathematics: with emphasis on computations and data compression
(3) Applied Mathematics: with emphasis on data analysis and representation
(4) Applied Mathematics: with emphasis on time-frequency analysis
(5) Applied Mathematics: with emphasis on time-frequency and multi-scale methods
(6) Applied Linear Algebra
(7) Applied Fourier Analysis
(8) Applied and Computational Wavelet Analysis
(9) Fourier and Wavelet Analyses

1. Teaching Guide (for a general Applied Mathematics Course)
Chapter 1
Chapter 2
Chapter 3
Chapter 6
Chapter 7: 7.1, 7.2, 7.3
Chapter 4: 4.1, 4.2
Supplementary materials on computations

2. Teaching Guide (Emphasis on computations and data compression)
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 1: Background material, if needed

3. Teaching Guide (Emphasis on data analysis and data representations)
Chapter 2
Chapter 3
Chapter 4
Chapter 6
Chapter 7: 7.1, 7.2, 7.3
Chapter 1: Background material, if needed

4. Teaching Guide (Emphasis on time-frequency analysis)
Chapter 6
Chapter 7: 7.1, 7.2, 7.4, 7.5
Chapter 8: 8.1 – 8.2
Chapter 4: 4.1 on background material, if needed

5. Teaching Guide (Emphasis on time-frequency/time-scale methods)
Chapter 4: 4.1 – 4.2
Chapter 6: 6.1 – 6.2
Chapter 7: 7.1, 7.2, 7.4, 7.5
Chapter 8
Chapter 9: 9.5 only

6. Teaching Guide (Applied Linear Algebra)
Chapter 1: 1.2 – 1.5
Chapter 2
Chapter 3
Chapter 4
Chapter 5

7. Teaching Guide (Applied Fourier Analysis)
Chapter 6: 6.1 – 6.4
Chapter 4: 4.1 – 4.2
Chapter 5
Chapter 7: 7.1, 7.2, 7.4, 7.5

8. Teaching Guide (Applied and Computational Wavelet Analysis)
Chapter 7: 7.1, 7.2, 7.4
Chapter 8
Chapter 9
Chapter 10

9. Teaching Guide (Fourier and Wavelet Analyses)
Chapter 6: 6.1 – 6.4
Chapter 7: 7.1, 7.2, 7.4, 7.5
Chapter 8
Chapter 9
Chapter 10

Figures

Fig. 1.1  Plots of c and c̄ in the complex plane
Fig. 1.2  Areas under y = f(x) and x = f⁻¹(y) in the proof of Young's inequality
Fig. 1.3  Orthogonal projection Pv in W
Fig. 3.1  Top: centered data; Bottom: original data, dimension-reduced data, and principal components v1, v2
Fig. 3.2  Top: centered data; Bottom: dimension-reduced data with the first and second principal components, and original data
Fig. 4.1  Top: real part of the DFT of S1; Bottom: imaginary part of the DFT of S1
Fig. 4.2  Top: real part of the DFT of S2; Bottom: imaginary part of the DFT of S2
Fig. 4.3  Top: DCT of S1; Bottom: DCT of S2
Fig. 4.4  FFT signal flow chart for n = 4
Fig. 4.5  FFT signal flow chart for n = 8
Fig. 4.6  Lee's fast DCT implementation
Fig. 4.7  Hou's fast DCT implementation
Fig. 4.8  Fast DCT implementation of Wang, Suehiro and Hatori
Fig. 5.1  Encoder: Q = quantization; E = entropy encoding
Fig. 5.2  Decoder: Q⁻¹ = de-quantization; E⁻¹ = de-coding
Fig. 5.3  Quantizers: low compression ratio
Fig. 5.4  Quantizers: high compression ratio
Fig. 5.5  Zig-zag ordering
Fig. 6.1  Coefficient plot of Dn(x)
Fig. 6.2  Dirichlet's kernels Dn(x) for n = 4 (on left) and n = 16 (on right)
Fig. 6.3  Coefficient plot of σn(x)
Fig. 6.4  Fejér's kernels σn(x) for n = 4 (on left) and n = 16 (on right)
Fig. 6.5  From top to bottom: f(x) = |x| and its 2π-extension, S10 f, S50 f and S100 f
Fig. 6.6  From top to bottom: f(x) = χ[−π/2, π/2](x), −π ≤ x ≤ π, and its 2π-extension, S10 f, S50 f and S100 f
Fig. 6.7  Gibbs phenomenon
Fig. 6.8  Insulated region
Fig. 7.1  Diffusion with delta heat source (top) and arbitrary heat source (bottom)
Fig. 7.2  Admissible window function u(x) defined in (7.5.8)
Fig. 7.3  Malvar wavelets ψ0,4 (top) and ψ0,16 (bottom); the envelope of cos(k + 1/2)πx is the dotted graph of y = ±√2 u(x)
Fig. 8.1  Hat function φ (on left) and its refinement (on right)
Fig. 8.2  D4 scaling function φ (on left) and wavelet ψ (on right)
Fig. 8.3  Top: biorthogonal 5/3 scaling function φ (on left) and wavelet ψ (on right); Bottom: scaling function φ̃ (on left) and wavelet ψ̃ (on right)
Fig. 8.4  From top to bottom: original signal, details after 1-, 2-, 3-level DWT, and the approximant
Fig. 8.5  From top to bottom: (modified) forward and backward lifting algorithms of Haar filters
Fig. 8.6  From top to bottom: (modified) forward and backward lifting algorithms of 5/3-tap biorthogonal filters
Fig. 8.7  Ideal lowpass filter (on left) and highpass filter (on right)
Fig. 8.8  Decomposition and reconstruction algorithms with PR filter bank
Fig. 9.1  D6 scaling function φ (on left) and wavelet ψ (on right)
Fig. 9.2  D8 scaling function φ (on left) and wavelet ψ (on right)
Fig. 9.3  Top: biorthogonal 9/7 scaling function φ (on left) and wavelet ψ (on right); Bottom: scaling function φ̃ (on left) and wavelet ψ̃ (on right)
Fig. 10.1  φ1 = Qp φ0, φ2 = Qp² φ0, φ3 = Qp³ φ0 obtained by the cascade algorithm with the refinement mask of the hat function and φ0 = χ[0,1)(x)
Fig. 10.2  φ1 = Qp φ0, φ2 = Qp² φ0, φ3 = Qp³ φ0 obtained by the cascade algorithm with refinement mask {1/2, 0, 1/2} and φ0 = χ[0,1)(x)
Fig. 10.3  φ1 = Qp φ0, φ4 = Qp⁴ φ0 obtained by the cascade algorithm with the refinement mask of φ = (1/3) χ[0,3) and φ0 being the hat function
Fig. 10.4  Piecewise linear polynomials φn = Qpⁿ φ0, n = 1, . . ., 4, approximating the D4 scaling function φ by the cascade algorithm

Chapter 1

Linear Spaces

The two most basic operations on a given set V of functions are addition of any two functions in V and multiplication of any function in V by a constant. The concept of “vector spaces”, to be discussed in some detail in the first section of this chapter, is to specify the so-called closure properties of the set V, in that these two operations never create any function outside the set V. More precisely, the set of constants that are used for multiplication with the functions in V must be a “scalar field”, denoted by F, and V will be called a vector space over the field F. The only (scalar) fields considered in this book are the set R of real numbers, as well as its subset Q of rational numbers and its superset C of complex numbers. Typical examples of vector spaces over F include the vector spaces Rm,n and Cm,n of m × n matrices of real numbers and of complex numbers, over the scalar fields F = R and F = C, respectively, where m and n are integers with m, n ≥ 1. If m = 1, then the vector space of matrices becomes the familiar space of row vectors, while for n = 1 we have the familiar space of column vectors. The vector space of infinite sequences and that of piecewise continuous functions are studied in the second section, Sect. 1.2. The term “vector” will be used throughout the entire book for any member of the vector space under consideration. Hence, a vector could be a matrix, an infinite sequence, a function, and so forth. For any p with 0 ≤ p ≤ ∞, in order to show that the sequence subspace ℓp and function subspace L̃p(J) are closed under addition, we will establish Minkowski's inequalities, also called the triangle inequality in the general setting. For this purpose, the inequalities of Hölder are first derived by applying Young's inequality, which is also discussed in this section. For this preliminary discussion, the functions in L̃p(J) are restricted to piecewise continuous functions on an interval J, which is allowed to be either bounded or unbounded. Extension to “measurable functions” is delayed to the last section, Sect. 1.5, of the chapter.


The third section, Sect. 1.3, of this chapter is devoted to the study of an important measurement tool for vector spaces, namely: the inner product. Since the inner product of any two vectors is a constant in the scalar field, it is also called the “scalar product”, and particularly the “dot product” for the Euclidean spaces Rm = Rm,1 and Cm = Cm,1. A vector space V endowed with some inner-product measurement is called an inner-product space. The notion of “norm” of a vector in an inner-product space is introduced to facilitate our discussion of the most important property of the inner-product measurement, namely: the Cauchy-Schwarz inequality, which is the generalization of Hölder's inequality for ℓ2 and L̃2(J), derived in Sect. 1.2, to the general abstract setting. When the norm is used to measure the length of a vector in a real inner-product space V over the scalar field R, the Cauchy-Schwarz inequality can be applied to define the angle between any two vectors in V. This concept motivates the definition of orthogonality, which obviously extends to all inner-product spaces V, including those over the scalar field C. As an application, the notion of “orthogonal complement” of any subspace W ⊂ V is introduced for the study of applications in later chapters of this book.

The concepts of linear independence and bases from an elementary course in Linear Algebra are reviewed and extended in Sect. 1.4 to infinite-dimensional vector spaces, such as the vector space ℓp of infinite sequences and the function space L̃p(J), for any p with 0 ≤ p ≤ ∞, studied in Sect. 1.2. Another objective of Sect. 1.4 is to apply the inner product of an inner-product space V, discussed in Sect. 1.3, to introduce the notion of orthonormal basis of V and that of orthogonal projection from V to any of its subspaces W, as well as best approximation of vectors in V from W. In addition, we will give an in-depth discussion of the so-called Gram-Schmidt process for constructing an orthonormal set of vectors from any linearly independent collection, while preserving the algebraic span of the linearly independent set.

The notion of the norm measurement defined by an inner product in Sect. 1.3 is extended to the general abstract setting in Sect. 1.5. A vector space endowed with a norm is called a normed space (or normed linear space). Important examples of normed spaces include ℓp and L̃p(J) for any p with 1 ≤ p ≤ ∞, introduced in Sect. 1.2. While the length of a vector as defined by its norm can also be used to measure the distance between any two vectors in a normed space, the notion of “metric” measurement from an elementary course in Set Theory, for measuring the distance between two points in a point-set, can be applied without the vector space structure. Important examples of metric spaces that are not normed spaces are ℓp and L̃p(J) for any p with 0 ≤ p < 1. With the metric, the definition of Cauchy sequences can be used to introduce the concept of completeness. For example, the completion of L̃p(J) is Lp(J), for any p with 0 ≤ p ≤ ∞, obtained by extending the collection of piecewise continuous functions to the (Lebesgue) measurable functions. A complete normed space is called a Banach space, and a complete inner-product space is called a Hilbert space.


1.1 Vector Spaces

For the definition of a vector space, we need the notion of its companion scalar field. Throughout this book, the set of all real numbers is denoted by R. Under the operations of addition/subtraction, multiplication, and division by non-zero real numbers, the set R is a field. For the purpose of our discussion of vector spaces, any field will be called a scalar field, and any element of the field will be called a scalar. In this book, the set of natural numbers is denoted by N. Both N and the notation Z = {. . . , −2, −1, 0, 1, 2, . . .} for the set of all integers will be used frequently in our discussions. For example, the set of all fractions can be represented by

Q = { p/q : p ∈ Z, q ∈ N }.

A fraction is also called a rational number, and the set Q of rational numbers is also a scalar field, and hence a sub-field of R, since addition/subtraction and multiplication of two fractions, as well as division of a fraction by any non-zero fraction, remain to be a fraction. Hence, the set Q can also be used as a scalar field in the discussion of vector spaces.

To better understand the set Q of rational numbers, let us first recall that a real number with terminating decimal representation is in Q. Indeed, as an example, the real number 2.31 can be written as 2 + 31/100 = 231/100, which is a fraction. A less obvious fraction representation of certain real numbers is one that has infinitely many digits in its decimal representation, but after writing out finitely many digits, the decimal representation repeats indefinitely the same pattern of some finite sequence. This pattern therefore constitutes a period of the decimal representation. To illustrate this statement, consider the real number b = 2.3121212 . . ., with the periodic pattern of 12 tacked on to the decimal representation 2.3. To see that b can be expressed as a fraction, let us first consider another real number x = 0.1212 · · · . Since the pattern of the two digits 12 in x repeats indefinitely, we may multiply x by 10² = 100 to obtain the equation 100x = 12.121212 · · · = 12 + x, with unique solution given by

x = 12/(100 − 1) = 12/99,

which is a fraction. Returning to b, we can compute its fraction representation as follows:


b = 2.3 + 0.0121212 · · · = 2.3 + x/10
  = 2 + 3/10 + 12/990 = 2289/990.

More generally, it can be shown that if x = 0.c1c2 · · · cs c1c2 · · · cs c1c2 · · · cs · · · (and the notation x = 0.c1c2 · · · cs, with a bar over the repeating block, is used to indicate that the pattern c1c2 · · · cs is repeated indefinitely), then x may be written as the fraction:

0.c1c2 · · · cs = (c1c2 · · · cs) / (10^s − 1)

(see Exercise 3). Thus, if a real number y has the decimal representation y = a1 · · · an.b1 · · · bm c1c2 · · · cs c1c2 · · · cs c1c2 · · · cs · · · , then y can be written as a fraction by adding three terms as follows:

a1 · · · an + (b1 · · · bm)/10^m + (c1c2 · · · cs)/((10^s − 1) 10^m).
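The three-term formula above is easy to verify with exact rational arithmetic. The following short Python sketch is ours, not the book's; the function name and the convention of passing the three digit blocks as strings are ad hoc choices made only for illustration.

```python
from fractions import Fraction

def repeating_decimal_to_fraction(integer_part, non_repeating, repeating):
    """Exact value of  integer_part . non_repeating repeating repeating ...,
    using the three-term formula  a + b/10^m + c/((10^s - 1) 10^m)."""
    m, s = len(non_repeating), len(repeating)
    value = Fraction(int(integer_part))
    if m > 0:
        value += Fraction(int(non_repeating), 10 ** m)
    value += Fraction(int(repeating), (10 ** s - 1) * 10 ** m)
    return value

# b = 2.3121212... = 2289/990; Fraction prints the reduced form 763/330
print(repeating_decimal_to_fraction("2", "3", "12"))
```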

On the other hand, the decimal representation for most real numbers does not have a periodic pattern, and hence “almost all” real numbers do not have fraction representations and therefore are not rational numbers. A real number which is not a rational is called an irrational number. For example, the numbers e (in honor of Euler, and used to define the natural logarithm), π (the ratio of the circumference of a circle to its diameter, introduced by Archimedes of Syracuse), and √2 are real numbers that are not in Q. In fact, it is not difficult to show that if a > 1 is not a perfect square (that is, a ≠ b² for any integer b > 1), then √a is an irrational number (or equivalently, different from a fraction). These numbers do not have periodic patterns in their decimal representations. For example, the first few decimals of e, π, and √2 are:

e = 2.718281828459045 · · · ,
π = 3.141592653589793 · · · ,
√2 = 1.414213562373095 · · · .

Since the decimal representations of “almost all” real numbers are so irregular, it seems to be impossible to enumerate or “count” the set of all real numbers. In the mathematical language, a set is said to be countable if either it has a finite number of members (also called elements) or it can be counted by assigning each member of the set to one and only one natural number, while a set is said to be uncountable if the set is not countable. More precisely, a set S is countable if and only if it can be listed by assigning some natural number to each of its members as follows: S = {s1, s2, s3, . . . , sn} or S = {s1, s2, s3, . . .}, for some n ∈ N.

The set Q of rational numbers is countable. To show this, we first point out that the union of two countable sets is also countable (see Exercise 4), so that it is sufficient to prove that the set of positive rational numbers is countable. Allowing redundancy, we display the set of all positive fractions p/q in a triangular array in the figure below, where p runs over all natural numbers 1, 2, 3, . . ., from northwest to southeast, along each line of the array with slope −1, and q runs over all natural numbers 1, 2, 3, . . ., from the top along each vertical line of the array.

1/1
1/2  2/1
1/3  2/2  3/1
1/4  2/3  3/2  4/1
1/5  2/4  3/3  4/2  5/1
1/6  2/5  3/4  4/3  5/2  6/1
· · ·

Then by deleting redundant representation of the same fractions, all positive rational numbers are listed in a unique manner on this triangular array (counting row by row from the top) as follows:

1/1, 1/2, 2/1, 1/3, 3/1, 1/4, 2/3, 3/2, 4/1, 1/5, 5/1, 1/6, 2/5, . . . ,

where redundancy is avoided by skipping 2/2, 2/4, . . . , 3/3, 3/6, . . . , 4/4, 4/6, . . ., and so forth.

Remark 1 The set Q of rational numbers has measure = 0 This statement means that the set Q is contained in the union of some intervals with arbitrarily small total length. To show this, let Q = {r1, r2, . . .} be the set of all rational numbers, listed in any desirable order, and place all the rational numbers on the real line (or x-axis). Now let ε be an arbitrarily small positive number. Then for each k = 1, 2, . . ., consider the interval Ik = [rk − ε 2^{−(k+1)}, rk + ε 2^{−(k+1)}] with center at rk. Hence, the set Q of all rational numbers is contained in the union of the intervals I1, I2, . . .. Therefore, since the total length of these intervals is equal to

ε/2 + ε/2² + ε/2³ + · · · = ε,


the set Q of all rational numbers is contained in a set of “measure” less than the arbitrarily small positive number ε. Such sets are said to have measure = 0. Observe that since the real line has infinite length, the set R\Q of irrational numbers must be uncountable, and so is the superset R of R\Q. So, in comparison with Q, the set of irrational numbers is much larger.

Remark 2 The notions of “almost all” and “almost everywhere” In view of the fact that the set of rational numbers has measure zero, we say that “almost all” real numbers are irrational numbers. Extending this concept to functions defined on an interval J, we say that two functions f and g are equal almost everywhere on J, if f(x) = g(x) for all x ∈ J\S, for some subset S of the interval J with measure zero, and we write f = g a.e. Of course S could be the empty set.

Remark 3 Real numbers that are not fractions must be quantized (for example, by truncation followed by rounding off the last digit) for fixed point computation, to avoid using symbolic implementation. Unfortunately, the quantization process is irreversible. For example, to multiply √2 by 3 without symbolic calculation, √2 can be quantized to 1.4142 before multiplying by 3, yielding:

√2 × 3 ≈ 1.4142 × 3 = 4.2426.

But when this number is later squared, we will only obtain 4.2426² = 17.99965476, which is different from the exact value (√2 × 3)² = 2 × 9 = 18. On the other hand, the quantization process is a key step in such data compression applications that require high compression ratio (or low bit-rate). This topic will be discussed in Chap. 5.

The scalar field R of real numbers can be extended to the field C of complex numbers, so that C is a super-field of R, or equivalently, R is a sub-field of C. For this purpose, recall the imaginary number i defined by i² = −1; that is, i and −i are the two roots of the equation x² + 1 = 0. Then a complex number c ∈ C is defined by c = a + ib, a, b ∈ R. Hence, while a real number a ∈ R is customarily indicated by a “point” on the “real line”, the plot of a complex number c is a point on a plane, called the complex plane, also denoted by C, with the horizontal axis to be called the real axis and the vertical axis to be called the imaginary axis. For the complex number c = a + ib, the real number a is called the real part of c, and the other real number b is called the imaginary part of c, denoted, respectively, by a = Re c, b = Im c. The magnitude |c| of c is defined by


|c| = √(a² + b²).
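Both Remark 3 above and the magnitude formula are easy to check numerically. The short Python sketch below is our own illustration (not part of the text); it quantizes √2 to four decimals, as in Remark 3, and evaluates |c| for c = 3 + 4i.

```python
import math

# Remark 3: quantizing sqrt(2) to 1.4142 is irreversible
q = round(math.sqrt(2), 4)        # 1.4142
print((q * 3) ** 2)               # 17.99965476..., not the exact value 18

# Magnitude of c = a + ib for c = 3 + 4i
c = 3 + 4j
print(math.sqrt(c.real ** 2 + c.imag ** 2))   # 5.0, by |c| = sqrt(a^2 + b^2)
print(abs(c))                                 # 5.0, Python's built-in magnitude agrees
```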

If c ≠ 0, it can be written in the polar form c = ρ e^{iθ} for ρ = |c| and θ defined by cos θ = a/ρ, sin θ = b/ρ, with the value of θ restricted to the interval [0, 2π). Observe that without this restriction of θ, the above definition yields Euler's identity:

e^{iθ} = cos θ + i sin θ,

which is valid for all real numbers θ. For c = a + ib, the complex number c̄ = a − ib, with b in c replaced by −b, is called the complex conjugate of c. See Fig. 1.1. By applying the definition i² = −1, we may conclude that

|c|² = c c̄ = a² + b²;
1/c = c̄/|c|² = (a − ib)/(a² + b²) = a/(a² + b²) − i b/(a² + b²).

While the “real line”, also denoted by R, is the set of real numbers that constitute all the “points” that lie on some horizontal line, its extension to the “plane” R2 = R × R is the set of ordered pairs x = (x1, x2), where x1, x2 ∈ R. For example, x = (1.5, −3) is a “point” in the plane R2 with first component (also called coordinate) being 1.5 and the second component being −3. Observe that when the two components are interchanged, we have a different point on the plane, namely:

Fig. 1.1 Plots of c and c̄ in the complex plane


(1.5, −3) ≠ (−3, 1.5).

Similarly, the plane R2 is extended to the three-dimensional space R3 by x = (x1, x2, x3) ∈ R3, where x1, x2, x3 ∈ R. In general, for each integer n ≥ 2, the n-dimensional space Rn is defined by

x = (x1, . . . , xn) ∈ Rn,   (1.1.1)

where x1, . . . , xn ∈ R. Analogously, the n-dimensional space of “vectors” with complex-valued components is denoted by Cn, with

z = (z1, . . . , zn) ∈ Cn,   (1.1.2)

where z1, . . . , zn ∈ C.

In the above discussion, we have used the words “vector” and “space” loosely, with the intention of motivating the introduction of the concept of “vector spaces”, as follows.

Definition 1 Vector space Let F be a scalar field such as C, R or Q. A nonempty collection V of elements (called “vectors”), together with two operations (one called “addition” and the other “scalar multiplication”), is said to be a vector space over the scalar field F, if the following operations are satisfied.

(a) Closure under vector addition: x, y ∈ V ⇒ x + y ∈ V.
(b) Closure under scalar multiplication: x ∈ V and a ∈ F ⇒ ax ∈ V.
(c) Rules for vector addition and scalar multiplication:
  (i) x + y = y + x, for all x, y ∈ V,
  (ii) (x + y) + z = x + (y + z), for all x, y, z ∈ V,
  (iii) there is a zero element 0 (called zero vector) in V such that x + 0 = x, for all x ∈ V,
  (iv) for each x ∈ V, −x = (−1)x satisfies x + (−x) = 0, and a(x + y) = (x + y)a = ax + ay for all a ∈ F and x, y ∈ V,
  (v) (a + b)x = ax + bx, for all a, b ∈ F and x ∈ V,
  (vi) a(bx) = (ab)x, for all a, b ∈ F and x ∈ V, and
  (vii) 1x = x for all x ∈ V.


Remark 4 The closure properties (a) and (b) of a vector space V over a scalar field F can be unified as a single condition, namely:

a, b ∈ F and x, y ∈ V ⇒ ax + by ∈ V.   (1.1.3)

Indeed, by setting a = b = 1 in (1.1.3), we have (a) in Definition 1; and by setting b = 0, (b) is satisfied. Conversely, for a, b ∈ F, x, y ∈ V, it follows from (b) that ax, by ∈ V. This, together with (a), implies that ax + by ∈ V, as desired.

From (iv) in (c) in Definition 1, we have y − x = y + (−x) is in V, and this operation is called the “difference” of y and x. The following result should be easy to verify.

Theorem 1 Vector spaces Rn and Cn Let Cn, where n ≥ 1 is a positive integer, be defined as in (1.1.2), with vector addition and scalar multiplication defined by

x + y = (x1, . . . , xn) + (y1, . . . , yn) = (x1 + y1, . . . , xn + yn)

and

ax = a(x1, . . . , xn) = (ax1, . . . , axn).

Then Cn is a vector space over the scalar field C. The same statement is valid if C and Cn are replaced by R and Rn, respectively.

The vector spaces Rn and Cn are called Euclidean spaces. Next, let m and n be integers with m, n ≥ 1. Then the definition of the vector space Cn in (1.1.2) can be extended to the set Cm,n of m × n matrices, defined by

        ⎡ a11  a12  · · ·  a1n ⎤
        ⎢ a21  a22  · · ·  a2n ⎥
    A = ⎢  ⋮    ⋮   · · ·   ⋮  ⎥ ∈ Cm,n,   (1.1.4)
        ⎣ am1  am2  · · ·  amn ⎦

where ajk = aj,k, 1 ≤ j ≤ m and 1 ≤ k ≤ n, are complex numbers (observe that the comma that separates j and k is often omitted for convenience). Similarly, Rm,n denotes the set of m × n real matrices (matrices whose entries consist entirely of real numbers). By extending the operations of vector addition and scalar multiplication for Cn in Theorem 1 to Cm,n, namely:


        ⎡ a11  a12  · · ·  a1n ⎤   ⎡ b11  b12  · · ·  b1n ⎤
A + B = ⎢ a21  a22  · · ·  a2n ⎥ + ⎢ b21  b22  · · ·  b2n ⎥
        ⎢  ⋮    ⋮   · · ·   ⋮  ⎥   ⎢  ⋮    ⋮   · · ·   ⋮  ⎥
        ⎣ am1  am2  · · ·  amn ⎦   ⎣ bm1  bm2  · · ·  bmn ⎦

        ⎡ a11 + b11   a12 + b12   · · ·  a1n + b1n ⎤
      = ⎢ a21 + b21   a22 + b22   · · ·  a2n + b2n ⎥   (1.1.5)
        ⎢     ⋮           ⋮       · · ·      ⋮     ⎥
        ⎣ am1 + bm1   am2 + bm2   · · ·  amn + bmn ⎦

and

         ⎡ a11  · · ·  a1n ⎤   ⎡ aa11  · · ·  aa1n ⎤
aA = a   ⎢  ⋮   · · ·   ⋮  ⎥ = ⎢  ⋮    · · ·    ⋮  ⎥ ,   (1.1.6)
         ⎣ am1  · · ·  amn ⎦   ⎣ aam1  · · ·  aamn ⎦

we have the following result. Theorem 2 Vector spaces Cm,n and Rm,n The set Cm,n of m × n matrices of complex numbers defined by (1.1.4) and operations as in (1.1.5) and (1.1.6) is a vector space over the scalar field C. This statement remains valid if Cm,n and C are replaced by Rm,n and R, respectively. Definition 2 Subspaces A nonempty subset W of a vector space V over F is called a subspace of V, if W is also a vector space over F with the same addition and scalar multiplication operations of V. Since W ⊂ V, the operations in (i)–(ii) and (iv)–(vii) of Definition 1 are already valid for vectors in W. In addition, since W is nonempty, there exists v0 ∈ W. Thus, from (v) in Definition 1, we see 0 = 0v0 ∈ W. That is, the zero vector 0 is always in W. Therefore, the operation (iii) applies to W. Hence, to find out if W is a subspace of V, it is sufficient to verify if W satisfies the closure properties (a) and (b) in Definition 1, or equivalently, the condition (1.1.3). Example 1 Let W1 = {(c, 0, d) : c, d ∈ R} and W2 = {(c, 1, d) : c, d ∈ R}. Show that W1 is a subspace of R3 but W2 is not (although it is a subset of R3 ). Solution Clearly, W1 is nonempty. As remarked above, it is enough to show that W1 satisfies (1.1.3). For all a, b ∈ R and any x = (x1 , 0, x3 ), y = (y1 , 0, y3 ) in W1 , we have ax + by = a(x 1 , 0, x3 ) + b(y1 , 0, y3 ) = (ax1 , 0, ax3 ) + (by1 , 0, by3 ) = (ax1 + by1 , 0, ax3 + by3 ), which is in W1 , by setting c = ax1 + by1 and d = ax3 + by3 .


On the other hand, since a(x1, 1, x3) = (ax1, a, ax3) ∉ W2 for any real number a ≠ 1, it follows that W2 is not a subspace of R3.

Let A = [ajk]1≤j,k≤n ∈ Cn,n be a square matrix. Then A is called an upper-triangular matrix if ajk = 0 for all j > k; and A is called a lower-triangular matrix if ajk = 0 for all j < k. The set of all n × n upper-triangular matrices is a subspace of Cn,n; and so is the set of all n × n lower-triangular matrices. A square matrix An = [ajk]1≤j,k≤n is called a diagonal matrix, if all entries that are not on the main diagonal (with j = k) are equal to 0. The notation for a diagonal matrix A is A = diag{a11, . . . , ann}. Clearly, a diagonal matrix is an upper-triangular matrix and also a lower-triangular matrix.

Let F be the scalar field Q, R, or C, and let P(t) = an tⁿ + an−1 tⁿ⁻¹ + · · · + a1 t + a0 denote a polynomial in the variable t, where aj ∈ F for j = 0, . . . , n and an ≠ 0. Then P(t) is called a polynomial of degree n with leading coefficient an, and the notation deg P = n is used for the degree of the polynomial. We also denote by Π the set of all polynomials. Then it is easy to show that Π is a vector space over F under the standard definitions of addition and scalar multiplication of functions. For each integer n ≥ 0, let Πn = {0} ∪ {all polynomials with degree ≤ n}. Then it is also easy to show that Πn is a subspace of Π.

We conclude this section by discussing the sum and intersection of two subspaces. Let U and W be two subspaces of a vector space V. Define U + W, called the sum of U and W, by U + W = {u + w : u ∈ U, w ∈ W}; and the intersection of U and W by U ∩ W = {v : v ∈ U and v ∈ W}. Then both U + W and U ∩ W are subspaces of V. Here, we give the proof for U + W, and leave the proof for U ∩ W as an exercise (see Exercise 14). First, since 0 ∈ U and 0 ∈ W, 0 = 0 + 0 ∈ U + W. Thus U + W is not empty. As remarked above, it suffices to prove that U + W satisfies (1.1.3). Suppose a, b ∈ F and x, y ∈ U + W. Then x, y can be written as x = u1 + w1, y = u2 + w2, where u1, u2 ∈ U, w1, w2 ∈ W. Since U and W are subspaces of V, they satisfy (1.1.3). Thus, au1 + bu2 ∈ U, aw1 + bw2 ∈ W, so that


ax + by = a(u1 + w1) + b(u2 + w2) = (au1 + bu2) + (aw1 + bw2) ∈ U + W.

This shows that U + W is closed under the operations of vector addition and scalar multiplication, and hence, it is a subspace of V.

Exercises

Exercise 1 Convert the following decimal representations of real numbers to fractions:
(a) a = 2.12,
(b) b = 0.999 · · · ,
(c) c = 3.333 · · · ,
(d) d = 3.141414 · · · ,
(e) e = 3.14151515 · · · ,
(f) f = 1.22022022 · · · .

Exercise 2 In each of the following, determine which of the two numbers is smaller than the other, and which two are equal: (a) 5/6 and 4/5, (b) 0.83 and 5/6, (c) 0.444 · · · and 4/9, (d) 0.82999 · · · and 5/6.

Exercise 3 Suppose x = 0.c1c2 · · · cs c1c2 · · · cs c1c2 · · · cs · · · = 0.c1c2 · · · cs (with a bar over the repeating block). Show that

x = (c1c2 · · · cs) / (10^s − 1).

Exercise 4 Show that the union of two countable sets is also countable.

Exercise 5 Simplify the following complex numbers to the form of α + iβ, where α, β ∈ R, and plot them on the complex plane:
(a) a = 1 + 2i, −a, ā, −ā, a + ā, a − ā,
(b) b = 3 + 4i, b̄, 1/b̄, −b, −b̄,
(c) c = 5 + 12i, |c|, |c|² − Re c, |c|² + Im c, 1/c.

Exercise 6 Follow the procedure described below to show that √2 ∉ Q:
Step 1. Assume √2 ∈ Q. Then √2 = a/b, where a and b are positive integers with no common divisor different from 1. Verify that a² is an even integer.
Step 2. Show that if a² is an even integer, then a is also an even integer.
Step 3. Apply the result from Step 2 to show that b² is an even integer.
Step 4. Conclude from Step 2 that b is also an even integer.


Step 5. Apply the results from Step 2 and Step 4 to conclude that a and b have a common divisor (namely 2) different from 1.
Step 6. Conclude that √2 is not a rational number, since the assumption in Step 1 is violated.

Exercise 7 Modify the procedure described in Exercise 6 to show that for any prime number p, its square-root √p is not a rational number.

Exercise 8 Verify that the subset

W = {(a, b, c) ∈ R3 : a + c = 0}

of R3 is a subspace of the vector space R3.

Exercise 9 Justify that the set

W = {(a, b, c) ∈ R3 : a + c = 1}

is not a subspace of the vector space R3.

Exercise 10 Verify that the subset

W = { ⎡ a  b  0 ⎤ : a + b = d }
      ⎣ c  0  d ⎦

of R2,3 is a subspace of the vector space R2,3.

Exercise 11 Justify that the subset

W = { ⎡ a  b  0 ⎤ : a + b = d + 1 }
      ⎣ c  0  d ⎦

of R2,3 is not a subspace of the vector space R2,3.

Exercise 12 Verify that the subset

W = { A = [ajk] ∈ Cn,n : a11 + a22 + · · · + ann = 0 }

of Cn,n is a subspace of the vector space Cn,n.

Exercise 13 Let 1 ≤ m < n be integers, and extend each x = (x1, . . . , xm) ∈ Rm to x̃ = (x1, . . . , xm, 0, . . . , 0) ∈ Rn by tacking on n − m zeros to x. If x̃ is identified as x, namely Rm is identified as {(x1, . . . , xm, 0, . . . , 0) : xj ∈ R}, show that Rm is a subspace of Rn. Similarly, fill in the details to identify and conclude that Cm is a subspace of Cn.


Exercise 14 Let U and W be two subspaces of a vector space V. Show that U ∩ W is a subspace of V.
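Before moving on to sequence and function spaces, the closure condition (1.1.3) can also be spot-checked numerically. The randomized Python sketch below is ours (not from the book); it uses the two sets W1 and W2 of Example 1, and it is only a sanity check, not a proof.

```python
import random

def in_W1(v):                      # W1 = {(c, 0, d) : c, d real}
    return v[1] == 0

def in_W2(v):                      # W2 = {(c, 1, d) : c, d real}
    return v[1] == 1

def closed_under_1_1_3(membership, middle, trials=1000):
    """Check condition (1.1.3): ax + by stays in the set for random a, b, x, y."""
    for _ in range(trials):
        a, b = random.uniform(-5, 5), random.uniform(-5, 5)
        x = (random.uniform(-5, 5), middle, random.uniform(-5, 5))
        y = (random.uniform(-5, 5), middle, random.uniform(-5, 5))
        v = tuple(a * xi + b * yi for xi, yi in zip(x, y))
        if not membership(v):
            return False
    return True

print(closed_under_1_1_3(in_W1, 0))   # True: W1 is a subspace of R3
print(closed_under_1_1_3(in_W2, 1))   # False (almost surely): W2 is not
```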

1.2 Sequence and Function Spaces

In the first half of this section, the Euclidean space Cn is extended to the vector spaces of infinite (or more precisely, bi-infinite) sequences x = {xj}, where xj ∈ C and j runs from −∞ to ∞. The definitions and results to be presented in this section remain valid for (one-sided) infinite sequences x = {xj}, where j runs from any integer, such as 0, to ∞ (or from −∞ to any integer). We will study the sequence spaces ℓp, where 0 ≤ p ≤ ∞. For this purpose, we quantify the infinite sequences x = {xj} by introducing the following measurements ‖x‖p of x.

(a) For 1 ≤ p < ∞,

‖x‖p = ( ∑_{j=−∞}^{∞} |xj|^p )^{1/p}.   (1.2.1)

(b) For p = ∞,

‖x‖∞ = sup_j |xj|,   (1.2.2)

where sup_j |xj| (with sup standing for the word “supremum”) is the least upper bound of a bounded sequence x, and sup_j |xj| = ∞ for unbounded x. Namely, sup_j |xj| denotes the smallest M such that |xj| ≤ M, for all j = . . . , −1, 0, 1, . . . .

(c) For 0 < p < 1,

‖x‖p = ∑_{j=−∞}^{∞} |xj|^p.   (1.2.3)

(d) For p = 0,

‖x‖0 = # of non-zero terms in x.   (1.2.4)

By using the above measurements for infinite sequences, we now introduce the following subsets of the set of infinite sequences of complex numbers:

ℓp = { x = {xj} : ‖x‖p < ∞, xj ∈ C }.

When there is no ambiguity, the same notation ℓp will be used for the subset of infinite sequences x of real numbers, for each p, where 0 ≤ p ≤ ∞.
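The four measurements (1.2.1)–(1.2.4) are easy to experiment with on sequences that have only finitely many non-zero terms (so that all the sums are finite). The following Python sketch is ours and only illustrates the definitions; the sequence is given as the list of its terms on a finite index window, with all other terms understood to be zero.

```python
def measurement(x, p):
    """The l^p measurement of a finitely supported sequence x (a list of terms)."""
    if p == 0:
        return sum(1 for t in x if t != 0)        # (1.2.4): number of non-zero terms
    if p == float("inf"):
        return max(abs(t) for t in x)             # (1.2.2): sup of |x_j|
    s = sum(abs(t) ** p for t in x)
    return s if p < 1 else s ** (1.0 / p)         # (1.2.3) for p < 1, (1.2.1) for p >= 1

x = [3, 0, -4]
for p in (0, 0.5, 1, 2, float("inf")):
    print(p, measurement(x, p))
# 0 -> 2,  0.5 -> sqrt(3) + 2 ≈ 3.732,  1 -> 7,  2 -> 5,  inf -> 4
```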


It follows from the above definition of measurements that ∞ is precisely the set of all bounded infinite sequences, and 0 is the set of all infinite sequences, each of which has at most finitely many non-zero terms. We remark that the 0 -measurement in (1.2.4) is important to the study of “sparse data representation”, and particularly to the concept of “compressed sensing”. Theorem 1 Vector space  p With the operations of vector addition and scalar multiplication defined analogously as those for Cn , namely: {x j } + {y j } = {x j + y j } and c{x j } = {cx j } for any c ∈ C, the set  p , where 0 ≤ p ≤ ∞, is a vector space over the scalar field C. The same statement is valid for infinite sequences of real numbers over the scalar field R. Since  p is a subset of the set of all infinite sequences and since the closure condition on scalar multiplication of sequences in  p is obvious, it is sufficient to prove the closure property of vector addition, which is a consequence of the following result to be called “triangle inequality” for sequences in  p . Theorem 2 Triangle inequality for  p For each real number p, 0 ≤ p ≤ ∞, the  p -measurement · p for the set  p , as defined by (1.2.1)–(1.2.4), has the following property: (1.2.5)

$$\|x + y\|_p \le \|x\|_p + \|y\|_p, \quad \text{for all sequences } x, y \in \ell^p. \tag{1.2.5}$$

The proof of (1.2.5) is simple for $p = 0, 1, \infty$. For example, for $p = 0$, since the number of non-zero terms of the sequence $x + y$ does not exceed the sum of the number of non-zero terms of $x$ and that of $y$, for $x, y \in \ell^0$, the triangle inequality (1.2.5) holds for $p = 0$. The verification for $p = \infty$ is left to the reader as an exercise. For $0 < p < \infty$, the triangle inequality (1.2.5) can be written out precisely as follows: For $0 < p < 1$,
$$\sum_j |x_j + y_j|^p \le \sum_j |x_j|^p + \sum_j |y_j|^p; \tag{1.2.6}$$
and for $1 \le p < \infty$,
$$\Big(\sum_j |x_j + y_j|^p\Big)^{1/p} \le \Big(\sum_j |x_j|^p\Big)^{1/p} + \Big(\sum_j |y_j|^p\Big)^{1/p}. \tag{1.2.7}$$

To prove (1.2.6), it is sufficient to show that if $0 < p < 1$, then
$$|a + b|^p \le |a|^p + |b|^p \tag{1.2.8}$$
for all complex numbers $a$ and $b$, since (1.2.6) follows by summing both sides of the inequality $|x_j + y_j|^p \le |x_j|^p + |y_j|^p$ over all $j$. To prove (1.2.8), we may assume that $a \ne 0$, so that (1.2.8) becomes
$$|1 + x|^p \le 1 + |x|^p, \tag{1.2.9}$$
by setting $b = ax$ and cancelling $|a|^p$. Let us first assume that $x$ is real. In this case, the inequality (1.2.9) trivially holds for $x < 0$, since $|1+x|^p < \max(1, |x|^p) < 1 + |x|^p$. For $x \ge 0$, we consider the function
$$f(x) = 1 + |x|^p - |1 + x|^p = 1 + x^p - (1 + x)^p,$$
with derivative given by
$$f'(x) = p x^{p-1} - p(1 + x)^{p-1} = p\left(\frac{1}{x^{1-p}} - \frac{1}{(1 + x)^{1-p}}\right).$$
But $x < 1 + x$ is equivalent to
$$\frac{1}{x} > \frac{1}{1 + x},$$
so that since $1 - p > 0$, we have
$$\frac{1}{x^{1-p}} > \frac{1}{(1 + x)^{1-p}},$$
yielding $f'(x) > 0$ for all $x > 0$. Hence, since $f(0) = 0$, we may conclude that $f(x) \ge 0$ for all $x \ge 0$, which is equivalent to (1.2.9). In general, suppose that $x$ is complex. Then since $|1 + x| \le 1 + |x|$, the inequality (1.2.9) for real $|x|$ can be applied to prove that (1.2.9) remains valid for complex $x$ (see Exercise 10). $\square$

Inequality (1.2.7) is called Minkowski's inequality for sequences. Since (1.2.7) trivially holds for $p = 1$, we only consider $1 < p < \infty$, and will derive (1.2.7) for $1 < p < \infty$ by applying Hölder's inequality (to be stated later). But let us first establish the following inequality of Young.

Theorem 3 Young's inequality Let $1 < p < \infty$ and $q = \dfrac{p}{p-1}$, called the conjugate of $p$. That is, the pair $\{p, q\}$ satisfies the duality condition:

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{1.2.10}$$
Then for all nonnegative real numbers $a$ and $b$,
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}, \tag{1.2.11}$$
where equality holds if and only if $b = a^{p-1}$.

Fig. 1.2 Areas under $y = f(x)$ and $x = f^{-1}(y)$ in the proof of Young's inequality

Proof Young's inequality (1.2.11) can be proved as follows. Let $f(x) = x^{p-1}$, for $x \in [0, \infty)$. Then $y = f(x)$ is an increasing function of $x$ for $x \ge 0$, with $f(0) = 0$ and the inverse of $f$ given by $x = f^{-1}(y) = y^{\frac{1}{p-1}}$. Now, it is clear from Fig. 1.2 that
$$\int_0^a f(x)\,dx + \int_0^b f^{-1}(y)\,dy \ge ab,$$
where equality holds if and only if $b = f(a) = a^{p-1}$. By direct calculation, we have
$$\int_0^a f(x)\,dx = \int_0^a x^{p-1}\,dx = \frac{a^p}{p},$$
and
$$\int_0^b f^{-1}(y)\,dy = \int_0^b y^{\frac{1}{p-1}}\,dy = \frac{b^{\frac{1}{p-1}+1}}{\frac{1}{p-1}+1} = \frac{b^q}{q},$$
since
$$\frac{1}{p-1} + 1 = \frac{p}{p-1} = \frac{1}{1 - \frac{1}{p}} = q.$$

This completes the proof of Theorem 3. We are now ready to state and prove the following inequality of Hölder.

Theorem 4 Hölder's inequality for sequences For any real number $p$ with $1 < p < \infty$, let $q$ be its conjugate as defined by (1.2.10). Then for any two sequences $x = \{x_j\} \in \ell^p$ and $y = \{y_j\} \in \ell^q$, the sequence $\{x_j y_j\}$ is a sequence in $\ell^1$, and
$$\sum_j |x_j y_j| \le \Big(\sum_j |x_j|^p\Big)^{1/p} \Big(\sum_j |y_j|^q\Big)^{1/q}. \tag{1.2.12}$$

Proof To prove this theorem, we first observe that if at least one of the two sequences $\{x_j\}$ and $\{y_j\}$ is the zero sequence (that is, all terms of the sequence are equal to zero), then (1.2.12) trivially holds, since both sides of (1.2.12) would be equal to zero. Hence, we may assume that
$$\|x\|_p = \Big(\sum_j |x_j|^p\Big)^{1/p} \ne 0, \qquad \|y\|_q = \Big(\sum_j |y_j|^q\Big)^{1/q} \ne 0,$$
so that (1.2.12) can be written as
$$\sum_j \frac{|x_j|}{\|x\|_p}\,\frac{|y_j|}{\|y\|_q} \le 1.$$
Now, since
$$\Big\|\frac{x}{\|x\|_p}\Big\|_p = 1, \qquad \Big\|\frac{y}{\|y\|_q}\Big\|_q = 1,$$
it is sufficient to prove (1.2.12) for $\|x\|_p = 1$ and $\|y\|_q = 1$, or equivalently $\sum_j |x_j|^p = 1$ and $\sum_j |y_j|^q = 1$. To do so, we apply Young's inequality to yield
$$|x_j y_j| \le \frac{|x_j|^p}{p} + \frac{|y_j|^q}{q}.$$
Then, by taking summation of both sides, we have
$$\sum_j |x_j y_j| \le \sum_j \frac{|x_j|^p}{p} + \sum_j \frac{|y_j|^q}{q} = \frac{1}{p} + \frac{1}{q} = 1.$$
This completes the proof of (1.2.12). $\square$
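As a quick numerical sanity check (our own illustration, not part of the text), Young's inequality (1.2.11) and Hölder's inequality (1.2.12) can be tested on finitely supported sequences; the helper names below are ours, and only Python's standard library is assumed.

```python
import random

def young_holds(a, b, p):
    """Check Young's inequality (1.2.11): ab <= a^p/p + b^q/q for a, b >= 0 and 1 < p < infinity."""
    q = p / (p - 1)                                 # conjugate exponent, so 1/p + 1/q = 1
    return a * b <= a**p / p + b**q / q + 1e-12     # small tolerance for rounding error

def holder_holds(x, y, p):
    """Check Hölder's inequality (1.2.12) for finitely supported sequences x and y."""
    q = p / (p - 1)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = sum(abs(a)**p for a in x)**(1/p) * sum(abs(b)**q for b in y)**(1/q)
    return lhs <= rhs + 1e-12

random.seed(0)
p = 1.7
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]
print(young_holds(0.3, 2.5, p), holder_holds(x, y, p))   # True True
```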



We remark that the special case of $p = q = 2$ in (1.2.12) is called the Cauchy-Schwarz inequality, to be derived in the next section for the general inner-product setting.

We are now ready to derive Minkowski's inequality (1.2.7) by applying Hölder's inequality (1.2.12). To prove (1.2.7), let us first observe that it trivially holds for $p = 1$. So, let us consider $p$ for $1 < p < \infty$ and its conjugate $q$, as defined by (1.2.10). Then by applying (1.2.12), we have
$$\begin{aligned}
\sum_j |x_j + y_j|^p &= \sum_j |x_j + y_j|\,|x_j + y_j|^{p-1} \\
&\le \sum_j |x_j|\,|x_j + y_j|^{p-1} + \sum_j |y_j|\,|x_j + y_j|^{p-1} \\
&\le \Big(\sum_j |x_j|^p\Big)^{1/p} \Big(\sum_j |x_j + y_j|^{(p-1)q}\Big)^{1/q} + \Big(\sum_j |y_j|^p\Big)^{1/p} \Big(\sum_j |x_j + y_j|^{(p-1)q}\Big)^{1/q} \\
&= \big(\|x\|_p + \|y\|_p\big)\Big(\sum_j |x_j + y_j|^p\Big)^{1/q},
\end{aligned}$$
in view of $(p-1)q = p$. This yields (1.2.7) when both sides are divided by $\big(\sum_j |x_j + y_j|^p\big)^{1/q}$, again by applying (1.2.10). $\square$
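For finitely supported sequences, the measurements (1.2.1)–(1.2.4) and the triangle inequality (1.2.5) are easy to evaluate directly. The sketch below is our own illustration (the function name is ours, and only plain Python is assumed); it treats a list as a sequence whose remaining terms are zero.

```python
def lp_measure(x, p):
    """The l^p measurement (1.2.1)-(1.2.4) of a finitely supported sequence x."""
    if p == 0:
        return sum(1 for t in x if t != 0)       # (1.2.4): number of non-zero terms
    if p == float("inf"):
        return max(abs(t) for t in x)            # (1.2.2): sup of |x_j|
    s = sum(abs(t)**p for t in x)
    return s**(1/p) if p >= 1 else s             # (1.2.1) for p >= 1, (1.2.3) for 0 < p < 1

x = [1.0, -2.0, 0.0, 0.5]
y = [0.5, 1.0, -3.0, 0.0]
z = [a + b for a, b in zip(x, y)]
for p in (0, 0.5, 1, 2, float("inf")):
    assert lp_measure(z, p) <= lp_measure(x, p) + lp_measure(y, p) + 1e-12   # (1.2.5)
    print(p, lp_measure(x, p))
```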

Example 1 Let $\{k_i\}$ be a finite sequence of (integer) indices. Then for each $p$, $0 \le p \le \infty$, the set
$$S_p = \Big\{x = \{x_j\} \in \ell^p : \sum_i x_{k_i} = 0\Big\} \tag{1.2.13}$$
is a subspace of $\ell^p$.

Solution Let $x, y \in S_p$ and $z = ax + by$, where $a, b \in \mathbb{C}$. Then since
$$\sum_i x_{k_i} = 0 \quad \text{and} \quad \sum_i y_{k_i} = 0,$$
we have
$$\sum_i z_{k_i} = \sum_i (a x_{k_i} + b y_{k_i}) = a \sum_i x_{k_i} + b \sum_i y_{k_i} = a \times 0 + b \times 0 = 0.$$
Hence, $z \in S_p$ and we conclude that $S_p$ is a subspace of $\ell^p$. $\square$



Let us now turn to the study of vector spaces of functions. For this elementary textbook, we only consider functions that are either continuous or piecewise continuous, although a brief discussion of extension to "measurable" functions will be given in the last section, Sect. 1.5, of this chapter. For a bounded interval $J$, we say that a function $f$ is piecewise continuous on $J$ if $f$ is continuous on $J$, with the exception of at most a finite number of points on $J$ at which $f$ has finite jumps, meaning that both the left-hand limit $f(x_0-)$ and right-hand limit $f(x_0+)$ exist and are finite for each $x_0$ in $J$. If $f(x_0-) \ne f(x_0+)$, then $x_0$ is called a discontinuity of $f$, and the value $f(x_0+) - f(x_0-)$ is called the jump of $f(x)$ at $x = x_0$. If $J$ is an unbounded interval, we say $f$ is piecewise continuous on $J$ if $f$ is piecewise continuous on any bounded subinterval of $J$.

Let $-\infty < a < b < \infty$. Then the interval $J$ under consideration may be one of the following intervals:
$$J = (a, b),\ [a, b],\ (a, b],\ [a, b),\ (-\infty, b),\ (-\infty, b],\ (a, \infty),\ \text{or}\ [a, \infty). \tag{1.2.14}$$
In the following, we introduce the necessary notations for the sets of functions defined on $J$ to be studied in this book.

(i) $C(J)$ denotes the set of functions continuous on $J$.
(ii) $PC(J)$ denotes the set of piecewise continuous functions on $J$.
(iii) $\widetilde{L}^0(J) = \{f \in PC(J) : f(x) = 0,\ \text{if}\ x \notin [c, d] \subset J,\ \text{for some}\ c < d\}$.

It is easy to see that under the standard operations of addition of two functions and multiplication of a function by a constant, the sets $C(J)$, $PC(J)$ and $\widetilde{L}^0(J)$ are vector spaces over the scalar field $\mathbb{F} = \mathbb{C}$, $\mathbb{R}$ or $\mathbb{Q}$. Next we study the function classes $\widetilde{L}^p(J) \subset PC(J)$ for $0 < p \le \infty$, by introducing the following measurements:

(a) For $1 \le p < \infty$,
$$\|f\|_p = \Big(\int_J |f|^p\Big)^{1/p}. \tag{1.2.15}$$

(b) For $p = \infty$,
$$\|f\|_\infty = \operatorname*{ess\,sup}_x |f(x)|. \tag{1.2.16}$$

(c) For $0 < p < 1$,
$$\|f\|_p = \int_J |f|^p. \tag{1.2.17}$$

In (1.2.16), "ess sup" (which reads: "essential supremum") means that the values of $f$ at discontinuities are ignored in taking the least upper bound. Also, it should be understood that the notation $f(x) = 0$, $x \in J$,

is used for functions $f$ with $f(x) = 0$ for any $x \in J$ at which $f(x)$ is continuous (by ignoring the values at the discontinuities). In this regard, we remark that when the class of piecewise continuous functions is extended to measurable functions (to be discussed in Sect. 1.5), then "ess sup" means that the values of $f$ on a subset of $J$ with measure zero are ignored in taking the least upper bound (see the notions of "almost all" and "almost everywhere", with abbreviation "a.e.", in Sect. 1.1 on p. 3).

Example 2 Let $J = [0, 1]$. Define $f$ and $g$ in $PC(J)$ as follows:
$$f(x) = \begin{cases} 1, & \text{for } 0 < x < 1, \\ 2, & \text{for } x = 0 \text{ and } x = 1; \end{cases} \qquad g(x) = \begin{cases} -2x, & \text{for } 0 \le x < 1, \\ 3, & \text{for } x = 1. \end{cases}$$
Determine $\|f\|_\infty$ and $\|g\|_\infty$.

Solution By ignoring the values of $f(x)$ at $x = 0$ and $x = 1$, $|f(x)|$ is bounded by 1, namely $|f(x)| \le 1$ for $0 < x < 1$. In addition, it is clear that 1 is the smallest upper bound. Thus $\|f\|_\infty = 1$. Ignoring the value of $g(x)$ at $x = 1$, we see that $|g(x)| = 2x$ is bounded by 2 on $[0, 1)$. Since $\lim_{x \to 1^-} |g(x)| = 2$, 2 is the smallest upper bound. Thus $\|g\|_\infty = 2$. $\square$
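The essential supremum in Example 2 can be mimicked numerically by sampling only interior points, so that the values assigned at the discontinuities are never seen; this is a rough grid-based illustration of ours (assuming NumPy), not a substitute for the definition.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100001)[1:-1]       # interior points only: x = 0 and x = 1 are ignored
f = np.ones_like(x)                           # f(x) = 1 on (0, 1); the values 2 at the endpoints never enter
g = -2.0 * x                                  # g(x) = -2x on (0, 1); the value 3 at x = 1 never enters
print(np.max(np.abs(f)), np.max(np.abs(g)))   # approximately 1.0 and 2.0, matching ||f||_inf = 1, ||g||_inf = 2
```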

We are now ready to introduce the function classes $\widetilde{L}^p(J)$, for all real numbers $p$ with $0 < p \le \infty$, as follows:
$$\widetilde{L}^p(J) = \{f \in PC(J) : \|f\|_p < \infty\}.$$

Example 3 Let $f$, $g$ and $h$ be functions defined on the interval $J = [0, \infty)$ as follows:
$$f(x) = e^{-x}, \quad x \ge 0; \qquad g(x) = \begin{cases} \dfrac{1}{\sqrt{x}}, & \text{for } 0 < x \le 1, \\ 0, & \text{for } x = 0 \text{ and } x > 1; \end{cases} \qquad h(x) = \begin{cases} x, & \text{for } 0 \le x \le 1, \\ 0, & \text{for } x > 1. \end{cases}$$
Verify that

$f \in C(J)$ and $f \in \widetilde{L}^p(J)$ for any $0 < p \le \infty$, but $f \notin \widetilde{L}^0(J)$;

$g \notin PC(J)$, and hence $g \notin \widetilde{L}^p(J)$ for any $0 \le p \le \infty$;

$h \in \widetilde{L}^p(J)$ for any $0 \le p \le \infty$.

Solution The claim for the function $f$ should be easy to verify by direct computation. To verify the claim for the function $g$, observe that $g$ has an infinite jump at $x = 0$, since $\lim_{x \to 0^+} g(x)$ does not exist. Thus, $g \notin PC(J)$ and therefore $g$ is not in $\widetilde{L}^p(J)$ for any $0 \le p \le \infty$, although the improper integral $\int_J |g(x)|^p\,dx < \infty$ for $0 < p < 2$. As to the function $h$, it is clear that $h \in \widetilde{L}^p(J)$ for any $0 < p \le \infty$ by direct computation. To see that $h$ is also in $\widetilde{L}^0(J)$, we simply choose $[c, d] = [0, 1]$, so that $h(x) = 0$ for $x \notin [c, d]$. Thus, by the definition of $\widetilde{L}^0(J)$, we may conclude that $h \in \widetilde{L}^0(J)$. $\square$

Theorem 5 Vector spaces $\widetilde{L}^p$ For each $p$, $0 < p \le \infty$, the collection of functions $\widetilde{L}^p(J)$ is a vector space over the scalar field $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$.

To prove this theorem, it is sufficient to derive the additive closure property of $\widetilde{L}^p(J)$, by showing that the sum of any two functions in $\widetilde{L}^p(J)$ remains in $\widetilde{L}^p(J)$. This follows from the triangle inequality to be established in the following theorem.

Theorem 6 Triangle inequality for $\widetilde{L}^p$ Let $\|\cdot\|_p$ be the $\widetilde{L}^p$-measurement for functions defined by (1.2.15)–(1.2.17). Then for any functions $f, g \in \widetilde{L}^p(J)$,

$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{1.2.18}$$

The proof of (1.2.18) for $p = \infty$ is simple (see Exercise 14). As to $0 < p < \infty$, we give the proof of (1.2.18) by considering $0 < p < 1$ and $1 \le p < \infty$ separately, as follows. For $0 < p < 1$,
$$\int_J |f + g|^p \le \int_J |f|^p + \int_J |g|^p; \tag{1.2.19}$$
and for $1 \le p < \infty$,
$$\Big(\int_J |f + g|^p\Big)^{1/p} \le \Big(\int_J |f|^p\Big)^{1/p} + \Big(\int_J |g|^p\Big)^{1/p}. \tag{1.2.20}$$

The proof of (1.2.19) follows immediately from the inequality (1.2.8) with $a = f(x)$ and $b = g(x)$, simply by integrating both sides of $|f(x) + g(x)|^p \le |f(x)|^p + |g(x)|^p$ over the interval $J$. Inequality (1.2.20) is called Minkowski's inequality. Since the proof of (1.2.20) is trivial for $p = 1$, we will only consider $1 < p < \infty$ in the following discussions. Also, analogous to the proof of Minkowski's inequality for sequences, the inequality (1.2.20) is a consequence of Hölder's inequality, in the following theorem.

Theorem 7 Hölder's inequality

Let $p$ be any real number with $1 < p < \infty$ and $q$ be its conjugate, as defined in (1.2.10). Then for any $f \in \widetilde{L}^p(J)$ and $g \in \widetilde{L}^q(J)$, their product $fg$ is in $\widetilde{L}^1(J)$, and
$$\int_J |fg| \le \Big(\int_J |f|^p\Big)^{1/p} \Big(\int_J |g|^q\Big)^{1/q}. \tag{1.2.21}$$

The special case of (1.2.21), where $p = q = 2$, is called the Cauchy-Schwarz inequality, which is an example of the Cauchy-Schwarz inequality for the general inner-product setting to be derived in the next section. The proof of (1.2.21) is provided in the following.

First, dividing $f$ and $g$ by $\big(\int_J |f|^p\big)^{1/p}$ and $\big(\int_J |g|^q\big)^{1/q}$ respectively, we may assume that
$$\int_J |f|^p = \int_J |g|^q = 1.$$
Next, by Young's inequality (Theorem 3), we have
$$|fg| \le \frac{|f|^p}{p} + \frac{|g|^q}{q}.$$
Hence, integration of both sides over the interval $J$ yields
$$\int_J |fg| \le \int_J \frac{|f|^p}{p} + \int_J \frac{|g|^q}{q} = \frac{1}{p} + \frac{1}{q} = 1,$$

as desired. $\square$

Since the proof of (1.2.20) by applying (1.2.21) is essentially the same as that of (1.2.7) by applying (1.2.12), it is safe to leave the derivation as an exercise (see Exercise 12).

Exercises

Exercise 1 Let $W_a = \{f \in C(-\infty, \infty) : f(0) = a\}$, where $a \in \mathbb{R}$ is fixed. Determine the value of $a$ for which $W_a$ is a subspace of $C(-\infty, \infty)$. Justify your answer with rigorous reasoning.

Exercise 2 Let $V_a = \{f \in C[-2, 2] : f(-1) + f(1) = a\}$, where $a \in \mathbb{R}$ is fixed. Determine the value of $a$ for which $V_a$ is a subspace of $C[-2, 2]$. Justify your answer with rigorous reasoning.

Exercise 3 In each of the following, determine the range of the real number $\alpha$ for which the infinite sequence is in $\ell^1$.
(a) $x = \{(1 + k^2)^\alpha\}$, $k = \ldots, -1, 0, 1, \ldots$.
(b) $y = \{2^{\alpha k}\}$, $k = \ldots, -1, 0, 1, \ldots$.
(c) $z = \{2^{\alpha |k|}\}$, $k = \ldots, -1, 0, 1, \ldots$.

Exercise 4 Repeat Exercise 3 for $\ell^p$, where $1 < p \le \infty$. (Note that $\alpha$ may depend on $p$.)

Exercise 5 In each of the following, determine the range of the real number $\alpha$ for which the function is in $\widetilde{L}^1(J)$, where the interval $J$ may be different in different problems.
(a) $f(x) = (1 + x^2)^\alpha$, $J = (0, 1)$.
(b) $f(x) = (1 + x^2)^\alpha$, $J = (0, \infty)$.
(c) $g(x) = e^{\alpha x}$, $J = (0, \infty)$.

Exercise 6 Repeat Exercise 5 for $\widetilde{L}^p(J)$, where $1 < p \le \infty$. (Note that $\alpha$ may depend on $p$.)

Exercise 7 Let $W = \{x = \{x_j\} \in \ell^p : \sum_j x_{2j} = 0\}$. Show that $W$ is a subspace of $\ell^p$, $0 < p < \infty$.

Exercise 8 Let $U = \{x = \{x_j\} \in \ell^p : \sum_j (-1)^j x_j = 0\}$. Is $U$ a subspace of $\ell^p$? Justify your answer with rigorous reasoning.

Exercise 9 Let $J_1 \subset J$ and $W$ be a subset of $\widetilde{L}^p(J)$, $0 < p < \infty$, where $\int_{J_1} f = 0$ for all $f \in W$. Is $W$ a subspace of $\widetilde{L}^p(J)$? Justify your answer with rigorous reasoning.

Exercise 10 Extend the inequality (1.2.9) to the complex setting. That is, for $0 < p < 1$, show that $|1 + z|^p \le 1 + |z|^p$ for all $z \in \mathbb{C}$. Hint: Use the facts that $|1 + z| \le 1 + |z|$ and that $f(x) = x^p$ is increasing for $x > 0$. Then apply (1.2.9) to $(1 + |z|)^p$.

Exercise 11 Let $p > 1$ and consider the function $f(x) = 1 + x^p - (1 + x)^p$. Follow the derivation of (1.2.9) to show that $f(x) < 0$ for all $x > 0$. Conclude that the inequality (1.2.9) is reversed if $0 < p \le 1$ is replaced by $p > 1$.

Exercise 12 Follow the derivation of (1.2.7) and apply (1.2.21) to derive the inequality (1.2.20).

Exercise 13 Prove (1.2.5) for $p = \infty$.

Exercise 14 Prove that $C(J)$, $PC(J)$, $\widetilde{L}^0(J)$ and $\widetilde{L}^\infty(J)$ are vector spaces over $\mathbb{C}$.

1.3 Inner-Product Spaces

In this section, the notions of the "product" of two vectors and the "angle" between them will be introduced. More precisely, the "product" of two vectors (called "inner product") is some scalar, and the angle between them is defined in terms of their inner product, after the vectors are normalized.

Definition 1 Inner-product space Let $V$ be a vector space over some scalar field $\mathbb{F}$, which is either $\mathbb{C}$ or any subfield of $\mathbb{C}$, such as the field $\mathbb{R}$ of real numbers. Then $V$ is called an inner-product space if there is a function $\langle \cdot, \cdot \rangle$ defined on $V \times V$ with range in $\mathbb{F}$, with the following properties:
(a) Conjugate symmetry: $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in V$;
(b) Linearity: $\langle ax + by, z \rangle = a\langle x, z \rangle + b\langle y, z \rangle$ for all $x, y, z \in V$, $a, b \in \mathbb{F}$;
(c) Positivity: $\langle x, x \rangle \ge 0$ for all $x \in V$, and $\langle x, x \rangle = 0 \Leftrightarrow x = 0$.

From the definition of $\langle \cdot, \cdot \rangle$, we have $\langle ax, y \rangle = a\langle x, y \rangle$ and $\langle x, ay \rangle = \bar{a}\langle x, y \rangle$. The function $\langle \cdot, \cdot \rangle$ from $V \times V$ to $\mathbb{F}$ is called an inner product, also called a scalar product since its range is the scalar field $\mathbb{F}$.

Recall from an elementary course in Linear Algebra that for $V = \mathbb{R}^n$ and $\mathbb{F} = \mathbb{R}$, the function $\langle \cdot, \cdot \rangle$ defined on $\mathbb{R}^n \times \mathbb{R}^n$ by
$$\langle x, y \rangle = x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n, \tag{1.3.1}$$
for all $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$, is called the "dot product" between $x$ and $y$. For the complex field $\mathbb{C}$, we may extend the "dot product" to vectors in $V = \mathbb{C}^n$ by defining
$$\langle x, y \rangle = x \cdot y = x_1 \overline{y_1} + x_2 \overline{y_2} + \cdots + x_n \overline{y_n}, \tag{1.3.2}$$
for all $x = (x_1, \ldots, x_n),\ y = (y_1, \ldots, y_n) \in \mathbb{C}^n$, to satisfy the conjugate symmetry condition in the above definition of the inner product. Although there are other ways to define various inner products for the vector spaces $\mathbb{R}^n$ and $\mathbb{C}^n$, the dot products in (1.3.1) and (1.3.2) are called the standard (or ordinary) inner products for $\mathbb{R}^n$ and $\mathbb{C}^n$. In this book, unless specified otherwise, the inner product for $\mathbb{R}^n$ and $\mathbb{C}^n$ is always the standard inner product, as defined by the dot product in (1.3.1) and (1.3.2).

Example 1 Let $V = \mathbb{C}^n$ and $\mathbb{F} = \mathbb{C}$. Then the function $\langle \cdot, \cdot \rangle$ defined by (1.3.2) is an inner product for $\mathbb{C}^n$.

Solution We verify (a)–(c) of Definition 1, as follows:
(a) For $x, y \in \mathbb{C}^n$,
$$\langle x, y \rangle = x_1 \overline{y_1} + \cdots + x_n \overline{y_n} = \overline{y_1}x_1 + \cdots + \overline{y_n}x_n = \overline{y_1 \overline{x_1} + \cdots + y_n \overline{x_n}} = \overline{\langle y, x \rangle}.$$
(b) For all $a, b \in \mathbb{C}$ and $x, y, z \in \mathbb{C}^n$,
$$\langle ax + by, z \rangle = (a x_1 + b y_1)\overline{z_1} + \cdots + (a x_n + b y_n)\overline{z_n} = (a x_1 \overline{z_1} + \cdots + a x_n \overline{z_n}) + (b y_1 \overline{z_1} + \cdots + b y_n \overline{z_n}) = a\langle x, z \rangle + b\langle y, z \rangle.$$
(c) $\langle x, x \rangle = |x_1|^2 + \cdots + |x_n|^2 \ge 0$, and $\langle x, x \rangle = 0$ if and only if $|x_1|^2 + \cdots + |x_n|^2 = 0$, or $x_1 = 0, \ldots, x_n = 0$, or $x = 0$. $\square$

Since the proofs of the following two theorems are straightforward, it is safe to leave them as exercises (see Exercises 2–3).

Theorem 1 Inner-product space $\ell^2$ For the vector space $\ell^2$ of infinite sequences over the scalar field $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$, the function

$$\langle x, y \rangle = \sum_{k=-\infty}^{\infty} x_k \overline{y_k}, \tag{1.3.3}$$
defined on $\ell^2 \times \ell^2$, is an inner product. Consequently, endowed with this inner product, $\ell^2$ is an inner-product space.

Theorem 2 Inner-product space $\widetilde{L}^2$ For the vector space $\widetilde{L}^2(J)$ over the scalar field $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$, the function
$$\langle f, g \rangle = \int_J f(t)\overline{g(t)}\,dt, \tag{1.3.4}$$
defined on $\widetilde{L}^2(J) \times \widetilde{L}^2(J)$, is an inner product. Consequently, endowed with this inner product, $\widetilde{L}^2(J)$ is an inner-product space.

It follows from Hölder's inequality for sequences (1.2.12) on p. 17, for $p = q = 2$, that
$$\Big|\sum_k x_k \overline{y_k}\Big| \le \sum_k |x_k||y_k| \le \Big(\sum_k |x_k|^2\Big)^{\frac12}\Big(\sum_k |y_k|^2\Big)^{\frac12}, \tag{1.3.5}$$
so that $\langle x, y \rangle$ in (1.3.3) is well defined. Similarly, by Hölder's inequality (1.2.21) on p. 22, for $p = q = 2$, we also have

$$\Big|\int_J f(t)\overline{g(t)}\,dt\Big| \le \int_J |f(t)|\,|g(t)|\,dt \le \Big(\int_J |f(t)|^2\,dt\Big)^{\frac12}\Big(\int_J |g(t)|^2\,dt\Big)^{\frac12}, \tag{1.3.6}$$
so that $\langle f, g \rangle$ in (1.3.4) is well defined. The inequalities (1.3.5) and (1.3.6) are also called the Cauchy-Schwarz inequality, to be discussed later in this section.

The inner products defined by (1.3.3) and (1.3.4) are called the standard (or ordinary) inner products for $\ell^2$ and $\widetilde{L}^2$. In the discussions throughout this book, the inner products for $\ell^2$ and $\widetilde{L}^2$ are always the standard inner products, unless specified otherwise.

We next define some measurement, called the "norm" (or "length") of a vector in an inner-product space $V$.

Definition 2 Norm for inner-product space Let $V$ be an inner-product space, with inner product $\langle \cdot, \cdot \rangle$ as in Definition 1. Then
$$\|x\| = \sqrt{\langle x, x \rangle} \tag{1.3.7}$$

is called the norm (or length) of $x \in V$.

The inner product and its corresponding norm as defined above have the following relation.

Theorem 3 Cauchy-Schwarz inequality Let $V$ be an inner-product space over some scalar field $\mathbb{F}$, where $\mathbb{F} = \mathbb{C}$ or a subfield of $\mathbb{C}$. Then for all $x, y \in V$,
$$|\langle x, y \rangle| \le \|x\|\,\|y\|, \tag{1.3.8}$$
with $\|\cdot\|$ defined by (1.3.7). Furthermore, equality in (1.3.8) holds if and only if $x = cy$, or $y = cx$, for some scalar (also called constant) $c \in \mathbb{F}$.

Proof We only prove this theorem for $\mathbb{F} = \mathbb{R}$ and leave the proof for $\mathbb{F} = \mathbb{C}$ as an exercise (see Exercise 6). Let $a \in \mathbb{R}$ be any constant. We compute, according to (1.3.7),
$$\begin{aligned}
0 \le \|x - ay\|^2 &= \langle x - ay, x - ay \rangle = \langle x - ay, x \rangle + \langle x - ay, -ay \rangle \\
&= \langle x, x \rangle - a\langle y, x \rangle - a\langle x, y \rangle + a^2\langle y, y \rangle \\
&= \langle x, x \rangle - 2a\langle x, y \rangle + a^2\langle y, y \rangle = \|x\|^2 - 2a\langle x, y \rangle + a^2\|y\|^2.
\end{aligned} \tag{1.3.9}$$
If $y = 0$, then the theorem trivially holds. So we may assume $y \ne 0$. Then by setting
$$a = \frac{\langle x, y \rangle}{\|y\|^2},$$
we have
$$0 \le \|x\|^2 - 2\frac{\langle x, y \rangle}{\|y\|^2}\langle x, y \rangle + \frac{|\langle x, y \rangle|^2}{\|y\|^4}\|y\|^2 = \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2},$$
or $0 \le \|x\|^2\|y\|^2 - |\langle x, y \rangle|^2$, which is the same as (1.3.8). Moreover, the equality in (1.3.8) holds if and only if $0 = \|x - ay\|^2$ in (1.3.9) with $a = \dfrac{\langle x, y \rangle}{\|y\|^2}$, or $x = ay$. $\square$

Next, observe that for a real inner-product space $V$ over the field $\mathbb{R}$ of real numbers, with norm defined by (1.3.7), it follows from the Cauchy-Schwarz inequality (1.3.8) that
$$-1 \le \Big\langle \frac{x}{\|x\|}, \frac{y}{\|y\|} \Big\rangle \le 1$$
for all non-zero $x, y \in V$, since $\langle x, y \rangle \in \mathbb{R}$. Hence, there is precisely one value $\theta$, with $0 \le \theta \le \pi$, such that
$$\cos\theta = \Big\langle \frac{x}{\|x\|}, \frac{y}{\|y\|} \Big\rangle = \frac{\langle x, y \rangle}{\|x\|\,\|y\|},$$
or
$$\theta = \cos^{-1}\Big(\frac{\langle x, y \rangle}{\|x\|\,\|y\|}\Big), \tag{1.3.10}$$

which is called the "angle" between the two non-zero real-valued vectors $x$ and $y$.

Remark 1
(a) For $\theta = 0$, $x = cy$ for some constant $c > 0$. In other words, $x$ and $y$ are parallel and "pointing in the same direction".
(b) For $\theta = \pi$, $x = cy$ for some $c < 0$. In other words, $x$ and $y$ are parallel, but "pointing in opposite directions".
(c) For $\theta = \pi/2$, $\langle x, y \rangle = 0$. In other words, $x$ and $y$ are perpendicular to each other.
(d) In view of the previous remark, there is no need to consider the notion of the "angle" between two complex-valued vectors $x$ and $y$ to convey that they are perpendicular to each other. All that is needed is that their inner product is zero: $\langle x, y \rangle = 0$. In the literature, the term "orthogonality" is used most frequently.

Definition 3 Orthogonal vectors Let $V$ be an inner-product space over $\mathbb{C}$ or a subfield of $\mathbb{C}$. Then $x, y \in V$ are said to be orthogonal to each other if $\langle x, y \rangle = 0$.

Observe that since $\langle x, 0 \rangle = 0$ for all $x \in V$, the zero vector $0$ is orthogonal to all vectors in the inner-product space $V$. For a subspace $W \subset V$, the notation $W^\perp$ (which reads $W$ "perp", for the word "perpendicular") is defined by
$$W^\perp = \{x \in V : \langle x, w \rangle = 0,\ \text{for all}\ w \in W\}. \tag{1.3.11}$$
It can be verified that $W^\perp$ is a vector space and $W \cap W^\perp = \{0\}$ (see Exercises 13–14). In this book, $W^\perp$ is called the orthogonal complement of $W$ in $V$, or the orthogonal complementary subspace of $V$ relative to $W$.

Example 2 Let $x = (1, 1, 4),\ y = (2, -1, 2) \in \mathbb{R}^3$. Find the angle between $x$ and $y$. Repeat this for $u = (1, -2, 2),\ v = (-2, 1, 2)$.

Solution We have
$$\langle x, y \rangle = 1(2) + 1(-1) + 4(2) = 9,$$
$$\|x\| = \sqrt{1^2 + 1^2 + 4^2} = \sqrt{18} = 3\sqrt{2}, \qquad \|y\| = \sqrt{2^2 + (-1)^2 + 2^2} = \sqrt{9} = 3.$$
Thus,
$$\cos\theta = \frac{\langle x, y \rangle}{\|x\|\,\|y\|} = \frac{9}{3\sqrt{2}\,(3)} = \frac{\sqrt{2}}{2}.$$
The angle $\theta$ between $x$ and $y$ is
$$\theta = \cos^{-1}\frac{\sqrt{2}}{2} = \frac{\pi}{4}.$$
For $u$ and $v$, we have $\langle u, v \rangle = 1(-2) + (-2)(1) + 2(2) = 0$. Thus, $u$ and $v$ are orthogonal to each other, and the angle between them is $\frac{\pi}{2}$. $\square$
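The numbers in Example 2 are easy to reproduce; the short sketch below is our own illustration of (1.3.10), assuming NumPy is available.

```python
import numpy as np

def angle(x, y):
    """Angle between two nonzero real vectors, computed via (1.3.10)."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, -1.0, 1.0))      # clip guards against rounding slightly outside [-1, 1]

print(angle(np.array([1, 1, 4]), np.array([2, -1, 2])) / np.pi)    # 0.25, i.e. theta = pi/4
print(angle(np.array([1, -2, 2]), np.array([-2, 1, 2])) / np.pi)   # 0.5,  i.e. theta = pi/2
```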



Example 3 Let $f_1(x) = 1$, $f_2(x) = x$, $x \in [0, 1]$, be the functions in $\widetilde{L}^2[0, 1]$. Find the angle between $f_1$ and $f_2$. Repeat this for $g_1(x) = 1$, $g_2(x) = \cos \pi x$.

Solution We have
$$\langle f_1, f_2 \rangle = \int_0^1 f_1(x) f_2(x)\,dx = \int_0^1 x\,dx = \frac12,$$
$$\|f_1\|^2 = \int_0^1 |f_1(x)|^2\,dx = \int_0^1 1\,dx = 1, \qquad \|f_2\|^2 = \int_0^1 |f_2(x)|^2\,dx = \int_0^1 x^2\,dx = \frac13;$$
and
$$\cos\theta = \frac{\langle f_1, f_2 \rangle}{\|f_1\|\,\|f_2\|} = \frac{\sqrt{3}}{2}.$$
Thus, the angle $\theta$ between $f_1$ and $f_2$ is
$$\theta = \cos^{-1}\frac{\sqrt{3}}{2} = \frac{\pi}{6}.$$
For $g_1$ and $g_2$, we have
$$\langle g_1, g_2 \rangle = \int_0^1 \cos \pi x\,dx = \frac{1}{\pi}\sin \pi x \Big|_{x=0}^{1} = 0.$$
Therefore $g_1$ and $g_2$ are orthogonal to each other, and the angle between them is $\frac{\pi}{2}$. Observe that there is no geometric meaning for the angle between two functions. $\square$

Theorem 4 Pythagorean theorem Let $V$ be an inner-product space over some scalar field $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$. If $x, y \in V$ are orthogonal, then

$$\|x\|^2 + \|y\|^2 = \|x + y\|^2. \tag{1.3.12}$$

When $x, y \in \mathbb{R}^2$, equality (1.3.12) is the Pythagorean theorem from middle/high-school geometry: the square of the length of the hypotenuse of a right-angled triangle is equal to the sum of the squares of the lengths of the other two sides ("legs"). The equality (1.3.12) can be derived as follows:
$$\|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle = \|x\|^2 + \|y\|^2,$$
where the last equality follows from the orthogonality assumption: $\langle x, y \rangle = 0$, $\langle y, x \rangle = 0$. $\square$

Example 4 Let $u = (1, -2, 2),\ v = (-2, 1, 2) \in \mathbb{R}^3$ be the vectors considered in Example 2. Verify the Pythagorean theorem by computing $\|u\|^2$, $\|v\|^2$ and $\|u + v\|^2$ directly.

Solution We have
$$\|u\|^2 = 1^2 + (-2)^2 + 2^2 = 9, \qquad \|v\|^2 = (-2)^2 + 1^2 + 2^2 = 9,$$
and
$$\|u + v\|^2 = \|(-1, -1, 4)\|^2 = (-1)^2 + (-1)^2 + 4^2 = 18,$$
which is indeed the sum of $\|u\|^2$ and $\|v\|^2$. $\square$
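The same two vectors give a one-line numerical confirmation of (1.3.12); the snippet below is ours (NumPy assumed).

```python
import numpy as np

u, v = np.array([1., -2., 2.]), np.array([-2., 1., 2.])
print(np.dot(u, v))                                         # 0.0: u and v are orthogonal
print(np.dot(u, u) + np.dot(v, v), np.dot(u + v, u + v))    # 18.0 18.0, confirming (1.3.12)
```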



Example 5 Let $g_1(x) = 1$, $g_2(x) = \cos \pi x$ be the functions in $\widetilde{L}^2[0, 1]$ considered in Example 3. Verify the Pythagorean theorem with $g_1(x)$ and $g_2(x)$ by computing $\|g_1\|^2$, $\|g_2\|^2$ and $\|g_1 + g_2\|^2$ directly.

Solution We have
$$\|g_1\|^2 = \int_0^1 |g_1(x)|^2\,dx = \int_0^1 1^2\,dx = 1,$$
$$\|g_2\|^2 = \int_0^1 |g_2(x)|^2\,dx = \int_0^1 \cos^2 \pi x\,dx = \int_0^1 \frac12(1 + \cos 2\pi x)\,dx = \frac12.$$
On the other hand, we have
$$\|g_1 + g_2\|^2 = \int_0^1 |1 + \cos \pi x|^2\,dx = \int_0^1 \big(1 + 2\cos \pi x + \cos^2 \pi x\big)\,dx = 1 + 0 + \frac12 = \frac32.$$
Thus, (1.3.12) holds for $g_1(x)$ and $g_2(x)$. $\square$
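Integrals such as those in Examples 3 and 5 can also be checked by numerical quadrature; the following sketch (our own, assuming NumPy) approximates the $\widetilde{L}^2[0, 1]$ inner product on a fine grid with the trapezoidal rule.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10001)

def ip(f, g):
    """Approximate <f, g> = int_0^1 f(t) g(t) dt by the trapezoidal rule."""
    return np.trapz(f(x) * g(x), x)

g1 = lambda t: np.ones_like(t)
g2 = lambda t: np.cos(np.pi * t)

print(ip(g1, g2))                            # ~0: g1 and g2 are orthogonal (Example 3)
print(ip(g1, g1) + ip(g2, g2))               # ~1.5
print(ip(lambda t: g1(t) + g2(t),
         lambda t: g1(t) + g2(t)))           # ~1.5, confirming (1.3.12) as in Example 5
```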

A vector $x$ in a vector space over $\mathbb{F}$ is said to be a finite linear combination of vectors $x_1, x_2, \ldots, x_n$ if there exist scalars (also called constants) $c_1, \ldots, c_n \in \mathbb{F}$ such that
$$x = \sum_{j=1}^n c_j x_j = c_1 x_1 + \cdots + c_n x_n. \tag{1.3.13}$$
This concept is used to introduce the notion of the "span", or more precisely, "linear span", of a set of vectors, as follows.

Definition 4 Linear span Let $S = \{x_j\}$ be a (finite or infinite) set of vectors in a vector space $V$ over a scalar field $\mathbb{F}$. Then the (linear) span of $S$, denoted by $\operatorname{span}S$ or $\operatorname{span}\{x_j\}$, is the set of all finite linear combinations of vectors in $S$. In particular, the span of a finite set $S = \{x_j : 1 \le j \le n\}$ is
$$\operatorname{span}S = \Big\{x = \sum_{j=1}^n c_j x_j : c_1, \ldots, c_n \in \mathbb{F}\Big\}.$$
It is easy to show that $\operatorname{span}\{x_j\}$ is a subspace of $V$ (see Exercise 15). Thus, alternatively, the set $\{x_j\}$ is said to span the vector space $\operatorname{span}\{x_j\}$.

Example 6 Let $x_1 = (1, 1, 3, -5),\ x_2 = (2, -1, 2, 4) \in \mathbb{R}^4$ and $W = \operatorname{span}\{x_1, x_2\}$. Find $W^\perp$.

Solution A vector $c = (c_1, c_2, c_3, c_4) \in \mathbb{R}^4$ is in $W^\perp$ if and only if $\langle c, x_1 \rangle = 0$ and $\langle c, x_2 \rangle = 0$. That is,
$$\begin{cases} c_1 + c_2 + 3c_3 - 5c_4 = 0, \\ 2c_1 - c_2 + 2c_3 + 4c_4 = 0. \end{cases}$$
Solving the above linear system, we have
$$c_1 = -\frac53 c_3 + \frac13 c_4, \qquad c_2 = -\frac43 c_3 + \frac{14}{3} c_4.$$
Thus,
$$c = \Big(-\frac53 c_3 + \frac13 c_4,\ -\frac43 c_3 + \frac{14}{3} c_4,\ c_3,\ c_4\Big) = c_3\Big(-\frac53, -\frac43, 1, 0\Big) + c_4\Big(\frac13, \frac{14}{3}, 0, 1\Big) = \frac{c_3}{3}(-5, -4, 3, 0) + \frac{c_4}{3}(1, 14, 0, 3).$$
Therefore $W^\perp = \operatorname{span}\{(-5, -4, 3, 0), (1, 14, 0, 3)\}$. $\square$
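The computation in Example 6 amounts to finding the null space of the matrix whose rows are $x_1$ and $x_2$. The sketch below is our own illustration and assumes SciPy is available; null_space returns an orthonormal basis of $W^\perp$, which spans the same subspace as the vectors $(-5, -4, 3, 0)$ and $(1, 14, 0, 3)$ found above.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1, 1, 3, -5],
              [2, -1, 2, 4]], dtype=float)   # rows are x1 and x2
N = null_space(A)                            # columns form an orthonormal basis of W-perp
print(N.shape)                               # (4, 2): W-perp is two-dimensional
print(np.allclose(A @ N, 0.0))               # True: every column is orthogonal to x1 and x2
```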



Exercises

Exercise 1 Let $\langle \cdot, \cdot \rangle$ be the inner product as introduced in Definition 1. Show that $\langle x, ay \rangle = \bar{a}\langle x, y \rangle$, where $\bar{a}$ is the complex conjugate of $a \in \mathbb{C}$.

Exercise 2 Provide a proof of Theorem 1.

Exercise 3 Provide a proof of Theorem 2.

Exercise 4 In each of the following, compute $\langle x, y \rangle$, $\|x\|$, and $\|y\|$. Then illustrate the Cauchy-Schwarz inequality (1.3.8) with these examples.
(a) $x, y \in \mathbb{R}^3$, with $x = (1, 2, -1)$ and $y = (0, -1, 4)$.
(b) $x, y \in \ell^2$, with $x = \{x_j\}$ and $y = \{y_j\}$, where $x_j = y_j = 0$ for $j \le 0$, and $x_j = \dfrac{1}{j}$, $y_j = (-1)^j\dfrac{1}{j}$ for $j > 0$.
Hint: $\displaystyle\sum_{j=1}^{\infty}\frac{1}{j^2} = \frac{\pi^2}{6}$. Compute $\displaystyle\sum_{j=1}^{\infty}\frac{1}{(2j-1)^2}$ by observing that
$$\sum_{j=1}^{\infty}\frac{1}{j^2} = \sum_{j=1}^{\infty}\frac{1}{(2j-1)^2} + \sum_{j=1}^{\infty}\frac{1}{(2j)^2}.$$
Apply the same trick of writing the summation as the sum over the even indices and the sum over the odd indices to compute $\displaystyle\sum_{j=1}^{\infty}\frac{(-1)^j}{j^2}$.

Exercise 5 In each of the following, compute $\langle f, g \rangle$, $\|f\|$, and $\|g\|$ for the space $\widetilde{L}^2[0, 1]$. Then illustrate the Cauchy-Schwarz inequality (1.3.8) with these examples.
(a) $f(x) = 1 + x$, $g(x) = 1 - x$.
(b) $f(x) = \sin \pi x$, $g(x) = \cos \pi x$.
(c) $f(x) = 1$, $g(x) = \cos 2\pi x$.
(d) $f(x) = \cos 2\pi m x$, $g(x) = \cos 2\pi n x$, where $m$ and $n$ are integers.

Exercise 6 Extend the proof of Theorem 3 from F = R to F = C. Hint: If the scalar a ∈ R defined in the proof of the theorem is replaced by a ∈ C, then a¯ = y, x/ y 2 . Exercise 7 As a continuation of Exercise 4, compute the angle θ between the vectors x and y. (Note that θ is restricted to 0 ≤ θ ≤ π.) Exercise 8 As a continuation of Exercise 5, compute the angle θ between the functions f and g in  L 2 [0, 1]. Exercise 9 Let V be an inner-product space over some scalar field F and let the norm · of vectors in V be defined as in (1.3.7). Derive the following “parallelogram law” for all vectors x, y ∈ V:

x + y 2 + x − y 2 = 2 x 2 + 2 y 2 . Exercise 10 In Exercise 9, consider the special case of V = R2 and F = R. Recall that x = [x1 , x2 ]T ∈ V is also considered as a point x = (x1 , x2 ) on the plane R2 . Let y = (y1 , y2 ) be another point on the plane R2 . Provided that x, y and x + y are different from 0, then the four points 0 = (0, 0), x = (x1 , x2 ), y = (y1 , y2 ), and


x + y = (x1 + y1 , x2 + y2 ) constitute the four vertices of a parallelogram. Show that the lengths of the two diagonals are given by x + y and x − y . Hence, the parallelogram law in Exercise 9 says that the sum of the squares on the two diagonals is the same as the sum of the squares on the four sides of any parallelogram. Illustrate this conclusion with the following examples: (a) x = (1, 2), y = (2, 1), (b) x = (−1, 0), y = (1, 1), (c) x = (a, 0), y = (0, b), where a  = 0 and b  = 0. Exercise 11 Illustrate the Pythagorean theorem, namely (1.3.12), with the following pairs of vectors x, y ∈ R2 after verifying that x, y are perpendicular to each other (that is, x, y = 0): (a) x = (1, −1), y = (2, 2), (b) x = (4, 1), y = (−2, 8), (c) x = (a, 0), y = (0, b) with a  = 0 and b  = 0, as in (c) of Exercise 10. Exercise 12 Illustrate the Pythagorean theorem, namely (1.3.12), with the following pairs of functions f, g ∈  L 2 [0, 1] after verifying that  f, g = 0: (a) f (x) = sin 2πx, g(x) = 1, (b) f (x) = sin 2πx, g(x) = cos 2πx, (c) f (x) = cos 2π j x, g(x) = cos 2πkx, where j and k are distinct integers. Exercise 13 Show that if W is a subspace of an inner-product space V, then W⊥ , as defined in (1.3.11), is also a subspace of V. Exercise 14 As a continuation of Exercise 13, show that W ∩ W⊥ = {0}. Exercise 15 Let {x j } be a set of vectors in a vector space V. Show that span{x j } is a subspace of V. Exercise 16 Let x = (1, −1, 1) ∈ R3 and W = span{x}. What is the best you can say about W⊥ ? Exercise 17 Let x1 = (1, −1, 3, −1), x2 = (1, 2, 2, 1) ∈ R4 and denote W = span{x1 , x2 }. What is the best you can say about W⊥ ?

1.4 Bases of Sequence and Function Spaces

The concepts of (algebraic) linear dependency and independency, bases, orthogonality, orthogonal projection and the Gram-Schmidt process from elementary Linear Algebra are reviewed in this section. In addition, these concepts will be extended to infinite-dimensional vector spaces such as $\ell^p$ and $\widetilde{L}^p(J)$.


Let $V$ be a vector space over some scalar field $\mathbb{F}$. Recall that a finite set $S = \{v_k : 1 \le k \le n\} \subset V$ is said to be linearly dependent if there exist $c_1, \ldots, c_n \in \mathbb{F}$, not all zero, such that
$$\sum_{k=1}^n c_k v_k = 0.$$
If $S$ is not linearly dependent, then it is said to be linearly independent. In other words, $S$ is linearly independent if and only if the only solution of the equation
$$\sum_{k=1}^n c_k v_k = 0$$
for $c_1, \ldots, c_n$ in $\mathbb{F}$ is the zero solution; that is, $c_1 = \cdots = c_n = 0$. Hence, if the zero vector $0$ is in $S$, then $S$ is a linearly dependent set of vectors.

Example 1 Consider the vectors $x_1 = (1, 0, 2),\ x_2 = (0, 1, 1),\ x_3 = (-1, 1, 0)$ in the vector space $\mathbb{R}^3$. Determine whether or not the set $\{x_1, x_2, x_3\}$ is linearly independent.

Solution Consider the equation $c_1 x_1 + c_2 x_2 + c_3 x_3 = 0$, or $c_1(1, 0, 2) + c_2(0, 1, 1) + c_3(-1, 1, 0) = (0, 0, 0)$; that is,
$$\begin{cases} c_1 - c_3 = 0, \\ c_2 + c_3 = 0, \\ 2c_1 + c_2 = 0. \end{cases} \tag{1.4.1}$$
Substituting $c_1 = c_3$ from the first equation into the third equation yields
$$\begin{cases} c_2 + c_3 = 0, \\ c_2 + 2c_3 = 0, \end{cases}$$
so that $c_3 = 2c_3$, or $c_3 = 0 \Rightarrow c_2 = 0 \Rightarrow c_1 = 0$. Hence, $c_1 = c_2 = c_3 = 0$, and the set $\{x_1, x_2, x_3\}$ is linearly independent. $\square$

Definition 1 Basis for finite-dimensional space A finite set $S = \{v_k : 1 \le k \le n\}$ of vectors in a vector space $V$ is said to be a basis of $V$ if $S$ is linearly independent and any $x \in V$ can be expressed as
$$x = \sum_{k=1}^n c_k v_k$$

for some ck ∈ F. In this case, V is said to be a finite-dimensional space, and n is called the dimension of V. For example, Rn , Cn and Rm,n , Cm,n are finite-dimensional vector spaces. In particular, the set of e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 0, 1)

(1.4.2)

is a basis for Rn and Cn . We leave the proof as an exercise (see Exercise 4). There is rich theory on the linear independence and bases for finite-dimensional spaces V in elementary Linear Algebra. For example, it can be shown that the dimension n of V is independent of the choices of bases; and for V = Rn , S = {vk : 1 ≤ k ≤ n} is a basis for Rn if and only if S is linearly independent, which is equivalent to that the n × n matrix A = [v1 v2 · · · vn ], with v1 , . . . , vn as columns of A, is nonsingular. A vector space is called an infinite-dimensional vector space if it is not finitedimensional. To extend the notion of basis to an infinite-dimensional vector space, let us first give the following equivalent definition of basis for a finite-dimensional vector space V. Remark 1 A set S = {vk : 1 ≤ k ≤ n} in a finite-dimensional vector space V is a basis of V if and only if each x ∈ V can be written uniquely as a linear combination of vk , 1 ≤ k ≤ n. Indeed, if S is a basis of V, then for any x ∈ V, writing x=

n 

ck vk

k=1

as well as x=

n 

dk vk ,

k=1

and taking the difference of these two linear combinations, we have n 

(ck − dk )vk = 0.

k=1

Hence, the linear independence of {vk : 1 ≤ k ≤ n} implies that ck = dk , k = 1, . . . , n. That is, the linear combination representation is unique. Conversely, suppose each x ∈ V can be expressed uniquely as a linear combination of vk , 1 ≤ k ≤ n. Then in the first place, the algebraic span of S is all of V. In addition, n  suppose ck vk = 0. Then this is a linear combination representation of the zero k=1


vector 0, which is a vector in V. But since 0 =

n 

0vk , it follows from the unique

k=1

linear combination representation of the zero vector 0 that ck = 0, k = 1, . . . , n. That is, the set {vk : 1 ≤ k ≤ n} is linearly independent, and is therefore a basis of V.  Motivated by the above remark, we introduce the notion of (algebraic) basis for an infinite-dimensional vector space as follows. An infinite set Definition 2 Algebraic basis for infinite-dimensional space S = {vk : k = 1, 2, . . .} of vectors in an infinite-dimensional vector space V is said to be an algebraic basis for V, if each x ∈ V can be expressed uniquely as a finite linear combination of S; that is, each x ∈ V is written uniquely as x=

J 

c j vk j

j=1

for some integer J ≥ 1, constants c j ∈ F, and vectors vk j ∈ S that depend on x. To give an example, let us first introduce the “Kronecker delta” symbol, defined by δj =

1, if j = 0, 0, if j  = 0,

(1.4.3)

where j is any integer. In addition, for each fixed integer k ∈ Z, the delta sequence δ k is defined by δ k = {δk− j } j=··· ,−1,0,1,... ; that is, δ k = (. . . , 0, . . . , 0, 1, 0, . . .), where the value 1 occurs at the kth entry, while all the other entries are equal to zero. It is clear that each delta sequence δ k is in  p for any p with 0 ≤ p ≤ ∞. In the following, we will see that while S = {δ k : k = 0, ±1, ±2, . . .} is an algebraic basis of 0 , it is not an algebraic basis of  p for 0 < p ≤ ∞. Example 2 Verify that S = {δ k : k = 0, ±1, ±2, · · · } is an algebraic basis for 0 . Solution Indeed, for any x ∈ 0 , there are only finitely many entries xk of x that are nonzero. Thus, we have xk = 0 for all |k| > K for some positive integer K . Therefore x can be written as a finite linear combination of S, namely: x = (. . . , 0, x−K , . . . , x K , 0, . . .) =

K  j=−K

xk δ k .


To show that the above expression is unique, let x =

K 

dk δ k , and take the differ-

j=−K

ence of these two sums to arrive at K 

(xk − dk )δ k = x − x = 0.

j=−K

That is, (. . . , 0, x−K − d−K , . . . , x K − d K , 0, . . .) = 0 = (. . . , 0, 0, . . . , 0, 0, . . .), so that xk = dk , −K ≤ k ≤ K . Therefore, every x ∈ 0 can be written uniquely as a finite linear combination of S = {δ k : k = 0, ±1, ±2, . . .}. That is, S is an  algebraic basis of 0 . Example 3 Give an example to illustrate that S = {δ k : k = 0, ±1, ±2, · · · } is not an algebraic basis of  p for 0 < p ≤ ∞. Solution For each real number p with 0 < p < ∞, the sequence y p = {yk : −∞ < k < ∞}, where yk = (1 + k 2 )−1/ p , is in  p , since ∞ 

|yk | p =

k=−∞

∞  k=−∞

1 < ∞. 1 + k2

However, since each term yk of the sequence is non-zero, it is clear that y p cannot be written as a finite linear combination of S = {δ k : k = 0, ±1, ±2, . . .}. For p = ∞, consider y∞ = (. . . 1, 1, . . . , 1 . . .), where each term of the sequence y∞ is equal to 1. Hence, y∞ ∈ ∞ but cannot be written as a finite linear combination  of S = {δ k : k = 0, ±1, ±2, . . .} either. The notion of “algebraic bases”, as introduced in Definition 2, is limited only to “finite” linear combinations. To extend to “infinite linear combinations” which are actually infinite series, it is clear that the concept of convergence must be taken into consideration. In other words, the notion of “measurement” of vectors in V is again needed. For convenience, we will only consider the norm measurement induced by some inner product. Therefore, in what follows in this section, the vector space V to be studied is always an inner-product √ space over some scalar field F = C or R with inner product ·, · and norm || · || = ·, ·. ∞  Let vk , k = 1, 2, . . ., be vectors in V. We say that the infinite series ck vk , k=1

where ck ∈ F, is convergent in V, if the sequence of its partial sums converges to some vector x ∈ V; that is, n    ck vk  = 0. lim x −

n→∞

k=1

(1.4.4)


In this case, x is called the limit of the series, and we write x=

∞ 

ck vk .

(1.4.5)

k=1

Of course, it should be obvious that the one-sided infinite sum (or infinite series) in the above formulation can be easily extended to the two-sided infinite sum without further discussion, but only with minimum change of notation, such as ∞ 

ck vk ,

k=−∞

if the set {vk : k = 1, 2, . . .} is replaced by {vk : k = 0, ±1, ±2, . . .}. We will encounter such situations quite frequently, particularly in Chapters 6-10 of this book. The following definitions are valid for all normed (linear) spaces to be introduced in Sect. 1.5, but we will only apply it to the norm induced by some inner product in this section. Definition 3 Complete set and basis of infinite-dimensional space Let V be a normed space over some scalar field F with norm · . A set S = {vk , k = 1, 2, · · · } of vectors in V is said to be complete if every x ∈ V can be represented as in (1.4.5) for some ck ∈ F. If in addition, the representation is unique, then S is called a basis for V. These definitions apply to {vk : k = 0, ±1, ±2, · · · }, with the one-sided infinite sum in (1.4.5) replaced by the two-sided infinite sum. In the above definition of bases, the uniqueness of the infinite series representation for x ∈ V means that the coefficients ck , k = 1, 2, . . . are uniquely determined by x. This notion is equivalent to that of linear independence of an infinite set of vectors, introduced in the following definition. Definition 4 Linear independence of infinite set A set S = {vk : k = 1, 2, . . .} of vectors in a normed space V over a scalar field F is said to be linearly dependent if there exist ck ∈ F, k = 1, 2, . . ., not all equal to zero, such that ∞ 

ck vk = 0.

k=1

A set of vectors in V is linearly independent if it is not linearly dependent. In other words, a set {vk : k = 1, 2, . . .} of vectors in V is linearly independent, if ∞  ck vk = 0 (meaning that the series converges to 0) for some ck ∈ F, k = 1, 2, . . ., k=1

then ck must be zero, for all k.


In view of the above definition of linear independence for an infinite set of vectors in a normed space V, an equivalent definition of basis for V can be stated as follows: Remark 2 Linear independence + completeness = basis A set S = {vk : k = 1, 2, . . .} of vectors in a normed space V is a basis of V if S is linearly independent and complete in V.  For an inner-product space V, the notion of orthogonal basis for a finitedimensional space in an elementary course of Linear Algebra can also be extended to infinite-dimensional inner-product spaces, and in fact, to the notion of a pair of “dual bases”, as follows: Definition 5 Orthogonal and dual bases Let V be an inner-product space over √ some scalar field F with inner product ·, · and norm || · || = ·, ·. (a) A basis S = {vk : k = 1, 2, . . .} of V is called an orthogonal basis of V if vk , v j  = 0, j  = k, j, k = 1, 2, . . . . If, in addition, vk = 1 for all k; that is, vk , v j  = δk− j , j, k = 1, 2, . . . , then S is called an orthonormal basis of V. vk : k = 1, 2, . . .} of V are said to be dual (b) Two bases {vk : k = 1, 2, . . .} and { bases of V, if v j  = δk− j , j, k = 1, 2, . . . . vk , Remark 3 Orthogonality + Completeness ⇔ Orthogonal basis It is not difficult to show that if a set S = {vk : k = 1, 2, . . .} of nonzero vectors vk in an inner-product space V is orthogonal, then S is linearly independent (see Exercise 9), and hence, S is an orthogonal basis for V if and only if S is complete.  S = { v1 , v2 , v3 } of R3 are dual Example 4 Identify whether S = {v1 , v2 , v3 } and  bases or not, where v1 = (2, 1, −1), v2 = (1, 0, −1), v3 = (1, 1, 1),  v1 = (−1, 2, −1), v2 = (2, −3, 1), v3 = (1, −1, 1). Solution It is easy to verify that both S and  S are bases of R3 . By simple computation, we have v1  = 2(−1) + 1(2) + (−1)(−1) = 1, v1 , v1 , v2  = 2(2) + 1(−3) + (−1)(1) = 0, v3  = 2(1) + 1(−1) + (−1)(1) = 0. v1 ,


Similarly, we can obtain vk , v j  = δk− j for k = 2, 3, 1 ≤ j ≤ 3. Thus, S and  S are  dual bases of R3 . Example 5 Verify that the set S = {δ k : k = 0, ±1, ±2, . . .} as discussed in Example 2 is an orthonormal basis of 2 . Solution Let x = {xk } be a sequence in 2 . It follows from the definition of δ k that x=

∞ 

xk δ k ,

k=−∞

which converges in the sense of (1.4.4), since   2  x − xk δ k 2 = |xk |2 → 0 |k|≤N

|k|>N

(see Exercise 5). Clearly this representation for x is unique, so that S is a basis for 2 . In addition, for j, k = 0, ±1, ±2, . . ., δ k , δ j  =

∞ 

δk−m δ j−m = δ j−k ,

m=−∞

and hence, S is an orthonormal basis for 2 .



Next, we introduce the concept of orthogonal projection. Let $w_1, \ldots, w_n$ be nonzero orthogonal vectors in an inner-product space $V$ over some scalar field $\mathbb{F}$, with inner product $\langle \cdot, \cdot \rangle$ and norm $\|x\| = \sqrt{\langle x, x \rangle}$, $x \in V$, and consider the algebraic span $W = \operatorname{span}\{w_1, \ldots, w_n\}$. For any $v \in V$, the vector $Pv \in W$, defined by
$$Pv = c_1(v)w_1 + c_2(v)w_2 + \cdots + c_n(v)w_n, \tag{1.4.6}$$
with
$$c_j(v) = \frac{\langle v, w_j \rangle}{\|w_j\|^2}, \qquad j = 1, 2, \ldots, n, \tag{1.4.7}$$
is called the orthogonal projection of $v$ onto $W$. Let $W^\perp$ be the orthogonal complement of $W$ in $V$ defined by (1.3.11) on p. 29.

Theorem 1 Properties of orthogonal projection Let $Pv$ be the orthogonal projection of $v$ onto $W$ as defined by (1.4.6). Then
(a) $Pv = v$ if $v \in W$.
(b) $Pv = 0$ if $v \in W^\perp$.


(c) $v - Pv \in W^\perp$; that is, $\langle v - Pv, w \rangle = 0$ for all $w \in W$.

Since (a) and (b) are at least intuitively clear, it is safe to leave the proof as an exercise. To prove (c), it is sufficient to verify that $\langle v - Pv, w_k \rangle = 0$ for $1 \le k \le n$. This indeed holds, since
$$\langle v - Pv, w_k \rangle = \langle v, w_k \rangle - \langle Pv, w_k \rangle = \langle v, w_k \rangle - \sum_{j=1}^n c_j(v)\langle w_j, w_k \rangle = \langle v, w_k \rangle - c_k(v)\langle w_k, w_k \rangle = \langle v, w_k \rangle - \frac{\langle v, w_k \rangle}{\|w_k\|^2}\langle w_k, w_k \rangle = 0. \qquad \square$$



Example 6 Let $w_1 = (0, 4, 2),\ w_2 = (-5, -1, 2)$, and $W = \operatorname{span}\{w_1, w_2\}$. For the given vector $v = (-5, 4, 12)$, compute the orthogonal projection $Pv$ of $v$ onto $W$. Repeat this for $u = (-10, 8, 9)$.

Solution We must first verify that the two vectors $w_1$ and $w_2$ are orthogonal to each other, so that (1.4.6) and (1.4.7) can be applied to compute $Pv$ and $Pu$. In this regard, we remark that if they are not orthogonal, then the Gram-Schmidt process to be discussed later in this section can be followed to replace $w_2$ by another nonzero vector in $\operatorname{span}\{w_1, w_2\}$ which is orthogonal to $w_1$, before carrying out the computations below. For the two vectors $w_1$ and $w_2$ in this example, it is clear that they are orthogonal. By applying (1.4.6) and (1.4.7), we have
$$Pv = \frac{\langle v, w_1 \rangle}{\|w_1\|^2}w_1 + \frac{\langle v, w_2 \rangle}{\|w_2\|^2}w_2 = \frac{0 + 16 + 24}{20}w_1 + \frac{25 - 4 + 24}{30}w_2 = 2(0, 4, 2) + \frac32(-5, -1, 2) = \Big(-\frac{15}{2}, \frac{13}{2}, 7\Big).$$
Similarly, we have, for $u = (-10, 8, 9)$,
$$Pu = \frac{\langle u, w_1 \rangle}{\|w_1\|^2}w_1 + \frac{\langle u, w_2 \rangle}{\|w_2\|^2}w_2 = \frac{0 + 32 + 18}{20}w_1 + \frac{50 - 8 + 18}{30}w_2 = \frac52(0, 4, 2) + 2(-5, -1, 2) = (-10, 8, 9).$$
Observe from the above computation that $Pu = u \in W$. Hence, we may conclude that $\operatorname{dist}(u, W) = 0$ without any work. Note that $Pu$ is a linear combination of $w_1$ and $w_2$, from the above computation, namely $u = \frac52 w_1 + 2w_2$. This conclusion is also assured by statement (a) of Theorem 1.

To illustrate $Pv$ geometrically, the vectors $v$, $w_1$ and $w_2$ are in the Euclidean space $V = \mathbb{R}^3$, and the subspace $W$ is a plane in $\mathbb{R}^3$ that contains the two vectors $w_1$ and $w_2$ (see Fig. 1.3). The projection $Pv$ of $v$ onto $W$ is the vector $\big(-\frac{15}{2}, \frac{13}{2}, 7\big)$ on $W$, and
$$v - Pv = (-5, 4, 12) - \Big(-\frac{15}{2}, \frac{13}{2}, 7\Big) = \frac52(1, -1, 2)$$
is the vector orthogonal to $W$. $\square$
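Formulas (1.4.6)–(1.4.7) translate directly into code when the $w_j$ are orthogonal; the sketch below is our own (NumPy assumed) and reproduces the numbers of Example 6.

```python
import numpy as np

def project(v, ws):
    """Orthogonal projection of v onto span{w_1, ..., w_n} for mutually orthogonal w_j, via (1.4.6)-(1.4.7)."""
    return sum((np.dot(v, w) / np.dot(w, w)) * w for w in ws)

w1, w2 = np.array([0., 4., 2.]), np.array([-5., -1., 2.])
v, u = np.array([-5., 4., 12.]), np.array([-10., 8., 9.])
print(project(v, [w1, w2]))   # [-7.5  6.5  7. ], i.e. (-15/2, 13/2, 7)
print(project(u, [w1, w2]))   # [-10.  8.  9.], so Pu = u and u already lies in W
```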



Fig. 1.3 Orthogonal projection $Pv$ in $W$

Example 7 Let the functions $g_1$, $g_2$, $g_3$ be defined as follows:
$$g_1(x) = \frac{1}{\sqrt{\pi}}, \qquad g_2(x) = \frac{\sqrt{2}}{\sqrt{\pi}}\cos x, \qquad g_3(x) = \frac{\sqrt{2}}{\sqrt{\pi}}\cos 2x.$$
These functions are in $\widetilde{L}^2[0, \pi]$ and constitute an orthonormal basis of their span $W = \operatorname{span}\{g_1, g_2, g_3\}$ (see Exercise 6). Compute the orthogonal projection $Pf$ of the function $f(x) = x^2$ onto $W$.

Solution It is easy to verify that the functions $g_j$, $1 \le j \le 3$, are orthogonal to one another and that they have unit length, namely $\|g_j\| = \|g_j\|_2 = 1$ for $j = 1, 2, 3$. Therefore, by the definition of $Pf$, we have
$$\begin{aligned}
Pf(x) &= \frac{\langle f, g_1 \rangle}{\|g_1\|^2}g_1(x) + \frac{\langle f, g_2 \rangle}{\|g_2\|^2}g_2(x) + \frac{\langle f, g_3 \rangle}{\|g_3\|^2}g_3(x) \\
&= \Big(\int_0^\pi x^2\frac{1}{\sqrt{\pi}}\,dx\Big)g_1(x) + \Big(\int_0^\pi x^2\frac{\sqrt{2}}{\sqrt{\pi}}\cos x\,dx\Big)g_2(x) + \Big(\int_0^\pi x^2\frac{\sqrt{2}}{\sqrt{\pi}}\cos 2x\,dx\Big)g_3(x) \\
&= \frac{1}{\sqrt{\pi}}\frac{\pi^3}{3}\,g_1(x) + \frac{\sqrt{2}}{\sqrt{\pi}}(-2\pi)\,g_2(x) + \frac{\sqrt{2}}{\sqrt{\pi}}\frac{\pi}{2}\,g_3(x) \\
&= \frac{\pi^2}{3} - 4\cos x + \cos 2x. \qquad \square
\end{aligned}$$



Next we consider the problem of best approximation of any vector $v \in V$ by vectors from $W$; namely, for any given $v \in V$, we wish to find some $w_0 \in W$ such that $w_0$ is the closest to the given $v$ among all $w \in W$. The following result solves this problem by assuring that the orthogonal projection $Pv$ is the desired $w_0 \in W$.

Theorem 2 Orthogonal projection yields best approximation Let $W$ be any finite-dimensional subspace of an inner-product space $V$ and $v$ any vector in $V$. Then the projection $Pv$ of $v$ onto $W$ provides the best approximation of $v$ from the subspace $W$. Precisely,
$$\|v - Pv\| = \min_{w \in W}\|v - w\|.$$

Proof To prove this result, let $w_1, \ldots, w_n$ be any orthogonal basis of $W$. (Later in this section, we will derive a procedure, called the Gram-Schmidt process, for computing an orthonormal basis of $W$ from any given basis of $W$.) Let $w \in W$ be arbitrarily chosen. Then by (c) of Theorem 1, since $Pv - w \in W$, the vector $v - Pv$ is orthogonal to $Pv - w$. Thus, it follows from the Pythagorean theorem (see Theorem 4 in Sect. 1.3 on p. 24) that
$$\|v - w\|^2 = \|(v - Pv) + (Pv - w)\|^2 = \|v - Pv\|^2 + \|Pv - w\|^2,$$
so that
$$\|v - Pv\|^2 \le \|v - w\|^2 \quad \text{for all } w \in W. \qquad \square$$



The non-negative real number minw∈W v − w is called the distance of v from W, denoted by dist(v, W). The above theorem assures that this distance is attained by the projection Pv. To derive an explicit formula of dist(v, W), we rely on the orthogonal basis w1 , . . . , wn , namely: its square is given by

1.4 Bases of Sequence and Function Spaces

,

v − Pv

2

= v−

n 

c j (v)w j , v −

j=1

n 

= v 2 −

j=1

= v 2 −

n 

c j (v)w j , v +

j=1

|v, w j |2 −

w j 2

n  |v, w j |2 j=1

c j (v)w j

c j (v)v, w j  −

j=1 n 

-

j=1

n 

= v, v −

45

w j 2

n  j=1

n 

c j (v) 2 w j , w j 

j=1

|v, w j |2 +

w j 2

n  j=1

|v, w j |2

w j 2

.

(1.4.8)

Let us summarize this result in the following theorem.

Theorem 3 Bessel's inequality Let $W$ be any subspace of an inner-product space $V$ with orthogonal basis $w_1, \ldots, w_n$. Then for any $v \in V$,
(a)
$$\|v - Pv\|^2 = \big(\operatorname{dist}(v, W)\big)^2 = \|v\|^2 - \sum_{j=1}^n \frac{|\langle v, w_j \rangle|^2}{\|w_j\|^2}. \tag{1.4.9}$$
(b) Bessel's inequality:
$$\sum_{j=1}^n \frac{|\langle v, w_j \rangle|^2}{\|w_j\|^2} \le \|v\|^2. \tag{1.4.10}$$

(1.4.10)

It is clear that Bessel’s inequality (1.4.10) is a trivial consequence of (1.4.9), since dist(v, W) ≥ 0. Example 8 Let v and W be the vector and space considered in Example 6. Find the distance dist(v, W) of v to W. Solution By applying (1.4.9), we have 

2 |v, w1 |2 |v, w2 |2 = v 2 − − dist(v, W)

w1 2

w2 2 2 2 40 45 75 = 185 − − = , 20 30 2 √ 5 6 . Alternatively, we may compute dist(v, W) by observing so that dist(v, W) = 2 it is the same as v − Pv , namely:

46

1 Linear Spaces

     15 13  dist(v, W) = v − Pv =  , , 7 (−5, 4, 12) − −   2 2 √    5 6 5  =  2 (1, −1, 2) = 2 . dist(v, W) is the length v − Pv of the vector v − Pv and it is the distance from point (−5, 4, 12) to the plane W (see Fig. 1.3).  Example 9 Let f (x) and W be the function and subspace of  L 2 [0, π] considered in Example 7. Compute the distance, dist( f, W), of f to W. Solution It follows from the definition (1.4.9) of the distance dist( f, W) of f to W that 

2

dist( f, W)

= f 2 −

3  | f, g j |2

g j 2

j=1



π

=

 x dx − 4

0

1 π3 √ π 3

2

"√ #2 #2 " √ 2 2π − √ (−2π) − √ π π2

4 5 17 π − π, 45 2 . 4 5 17 π − π.  so that dist( f, W) = 45 2 The importance of Bessel’s inequality is that the inequality remains valid for all n-dimensional subspaces of V. In particular, if V is an infinite-dimensional innerproduct space, then the inequality (1.4.10) holds even by allowing n to go to infinity. Therefore, for any infinite family {w1 , w2 , . . .} of non-zero orthogonal vectors in an infinite-dimensional inner-product space V, we have the following formulation of Bessel’s inequality: ∞  |v, w j |2 ≤ v 2 . (1.4.11)

w j 2 =

j=1

In the following, we replace w j by w j / w j in (1.4.11), and show that the inequality in (1.4.11) becomes equality if and only if the normalized family {w1 , w2 , . . .} is an orthonormal basis for V. Theorem 4 Orthonormal basis ⇔ Parseval s identity An orthonormal set W = {w1 , w2 , . . .} in an inner-product space V is complete in V, or equivalently an orthonormal basis for V, if and only if it satisfies Parseval’s identity: ∞  j=1

|v, w j |2 = v 2 , for all v ∈ V.

(1.4.12)

1.4 Bases of Sequence and Function Spaces

47

Proof To prove (1.4.12), we observe that since W is an orthonormal basis of V, any v ∈ V can be written as ∞  cjwj, v= j=1

with c j = v, w j , j = 1, 2, . . . (see Exercise 11). In other words,  2   ∞     0 = v − v, w j w j     j=1  2   n     = lim v − v, w j w j   n→∞   j=1 = lim v 2 − n→∞

n 

|v, w j |2

j=1

= v 2 −

∞ 

|v, w j |2 ;

j=1

yielding Parseval’s identity (1.4.12). To prove the converse, suppose that v ∈ V. Then for any integer n > 0, it follows from (1.4.8) that n  v, w j w j 2

v − j=1

= v − 2

n 

→ v 2 −

|v, w j |2

j=1 ∞ 

|v, w j |2 as n → ∞

j=1

= 0, by (1.4.12). That is, v =

∞  v, w j w j . Since this holds for all v ∈ V, W is comj=1

plete in V (see Definition 3). This, together with Remark 3, implies that W is an orthonormal basis of V.  To conclude this section, we derive a procedure, called the Gram-Schmidt (orthogonalization) process, for converting a linearly independent family of vectors in an


inner-product space to an orthonormal family, while preserving the same linear span. We will also give three examples for demonstrating this orthogonalization procedure.

Gram-Schmidt process Let $U = \{u_1, u_2, \ldots, u_n\}$ be a linearly independent set of vectors in an inner-product space $V$, with inner product $\langle \cdot, \cdot \rangle$. The formulation of the Gram-Schmidt process for obtaining an orthogonal set $\{v_1, v_2, \ldots, v_n\}$ with $\operatorname{span}\{v_1, \ldots, v_n\} = \operatorname{span}U$ is as follows:
$$\begin{aligned}
v_1 &= u_1, \\
v_2 &= u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2}v_1, \\
v_3 &= u_3 - \frac{\langle u_3, v_1 \rangle}{\|v_1\|^2}v_1 - \frac{\langle u_3, v_2 \rangle}{\|v_2\|^2}v_2, \\
&\ \ \vdots \\
v_n &= u_n - \frac{\langle u_n, v_1 \rangle}{\|v_1\|^2}v_1 - \frac{\langle u_n, v_2 \rangle}{\|v_2\|^2}v_2 - \cdots - \frac{\langle u_n, v_{n-1} \rangle}{\|v_{n-1}\|^2}v_{n-1}.
\end{aligned}$$
Then the vectors
$$w_1 = \frac{v_1}{\|v_1\|}, \quad w_2 = \frac{v_2}{\|v_2\|}, \quad \ldots, \quad w_n = \frac{v_n}{\|v_n\|}$$
constitute an orthonormal basis for $\operatorname{span}U$. $\square$

v2

vn

constitute an orthonormal basis for spanU .



Observe that for each integer j > 1, the new vector v j is the difference of u j and the orthogonal projection of u j onto the subspace span{v1 , . . . , v j−1 }. Thus by statement (c) of Theorem 1, v j is orthogonal to v1 , . . . , v j−1 . In particular, if U = {u1 , u2 , . . . , un } is a basis of V, the set V = {v1 , v2 , . . . , vn }, obtained by following the Gram-Schmidt process, is an orthogonal basis for V. Remark 4 In the above formulation of the Gram-Schmidt process, observe that the vector v j+1 does not change even if the previous vectors {v1 , v2 , . . . , v j } are replaced by {c1 v1 , c2 v2 , . . . , c j v j } for arbitrarily chosen non-zero constants {c1 , c2 , . . . , c j }. This should reduce computational complexity if multiplication of v j by some appropriate positive integer changes the fractional numerical expression of v j to integer representation for computing the norm of c j v j for the formulation of vk for k > j. 1 For example, if v2 = (−6, 3, 0, 5), then we may replace it by 5v2 = (−6, 3, 0, 5) 5 to simplify the formulation of v j for j > 2. The Gram-Schmidt process can be extended to orthogonalization of an infinite set of linearly independent vectors. In particular, if U = {u1 , u2 , . . .} is a basis of an infinite-dimensional inner-product space V, then the Gram-Schmidt process can be followed to obtain an orthonormal basis V = {v1 , v2 , . . .} of V. This will be an important tool for computing orthonormal families in the later chapters of this book.  Example 10 Find an orthonormal basis of span{u1 , u2 , u3 }, where

1.4 Bases of Sequence and Function Spaces

49

u1 = (1, 2, 0, 0), u2 = (−1, 1, 0, 1), u3 = (1, 0, 2, 1). Solution Following the Gram-Schmidt process, we first set v1 = u1 and compute

v1 2 = 12 + 22 + 02 + 02 = 5. This yields u2 , v1  1 v1 = u2 − (−1, 1, 0, 1), (1, 2, 0, 0)v1 2

v1 5 1 1 = (−1, 1, 0, 1) − (1)(1, 2, 0, 0) = (−6, 3, 0, 5). 5 5

v2 = u2 −

Now, following Remark 4, we compute 5v2 2 instead of v2 2 , to obtain

5v2 2 = (−6, 3, 0, 5) 2 = 70. Then it follows from the formula for v3 in the statement of the Gram-Schmidt process that u3 , v1  u3 , 5v2  v1 − 5v2 2

v1

5v2 2 1 1 = u3 − (1, 0, 2, 1), (1, 2, 0, 0)v1 − (1, 0, 2, 1), (−6, 3, 0, 5)5v2 5 70 1 1 = (1, 0, 2, 1) − (1)(1, 2, 0, 0) − (−1)(−6, 3, 0, 5) 5 70 1 1 = (50, −25, 140, 75) = (10, −5, 28, 15). 70 14

v3 = u3 −

This yields the orthogonal basis v1 , 5v2 , 14v3 of span{u1 , u2 , u3 }. Finally, the desired orthonormal basis w1 , w2 , w3 is obtained by normalization of each of v1 , v2 , v3 to have unit length, namely: v1 1 = √ (1, 2, 0, 0),

v1 5 v2 5v2 1 = = = √ (−6, 3, 0, 5),

v2

5v2 70 v3 14v3 = = = (10, −5, 28, 15)/ (10, −5, 28, 15)

v3

14v3 1 = √ (10, −5, 28, 15). 9 14

w1 = w2 w3

50

1 Linear Spaces

 Example 11 Find an orthonormal basis of span{x1 , x2 , x3 }, where x1 = (3, 4, 0, 0), x2 = (0, 1, 1, 0), x3 = (0, 0, −3, 4). Solution Since x1 , x3  = 0, we choose u1 = x1 , u2 = x3 and u3 = x2 to save one computational step. In doing so, we already have v1 = u1 = x1 , v2 = u2 = x3 without any work, so that

v1 2 = 25, v2 2 = 25. Then following the Gram-Schmidt procedure, we obtain u3 , v1  u3 , v2  v1 − v2 2

v1

v2 2 1 1 = x2 − (0, 1, 1, 0), (3, 4, 0, 0)v1 − (0, 1, 1, 0), (0, 0, −3, 4)v2 25 25 4 3 = (0, 1, 1, 0) − (3, 4, 0, 0) + (0, 0, −3, 4) 25 25 1 = (−12, 9, 16, 12). 25

v3 = u3 −

What remains is only to normalize the vector v3 , yielding the orthonormal basis {w1 , w2 , w3 } of span{x1 , x2 , x3 }, where v1 1 = (3, 4, 0, 0),

v1 5 v2 1 = = (0, 0, −3, 4),

v2 5 v3 25v3 (−12, 9, 16, 12) = = =√

v3

25v3 144 + 81 + 256 + 144

w1 = w2 w3

=

1 (−12, 9, 16, 12). 25

1 Here, as pointed out in Remark 4, we have ignored the multiple in the normal25 ization of v3 , since it can be replaced by 25v3 .  L 2 [−1, 1], where f 1 (x) = 1, f 2 (x) = x, Example 12 Let W = span{ f 1 , f 2 , f 3 } ⊂  f 3 (x) = x 2 . Find an orthonormal basis of W by following the Gram-Schmidt process. Solution Following the Gram-Schmidt process, we set g1 (x) = f 1 (x) = 1. Then since

1.4 Bases of Sequence and Function Spaces

51

  f 2 , g1  =

1 −1

xd x = 0,

the two functions g1 and g2 , with g2 (x) = f 2 (x) = x, are orthogonal to each other. To obtain g3 , we first compute   f 3 , g1  =

1

−1 1

x 2d x =

  f 3 , g2  = 

g1 2 =

g2 2 =

1 1

−1  1 −1

2 , 3

x 3 d x = 0,

1d x = 2, x 2d x =

2 . 3

Then by the Gram-Schmidt process, we have  f 3 , g1   f 3 , g2  g1 (x) − g2 (x) 2

g1

g2 2 1 1 = x 2 − g1 (x) − 0 × g2 (x) = x 2 − . 3 3

g3 (x) = f 3 (x) −

Finally, the desired orthonormal basis h 1 , h 2 , h 3 of W is obtained by normalization of each of {g1 , g2 , g3 } to have unit length, namely: g1 (x) 1 =√ ,

g1 2 √ 3 g2 (x) h 2 (x) = = √ x,

g2 2

h 1 (x) =

h 3 (x) =

√ 5 g3 (x) 3x 2 − 1 = /! = √ (3x 2 − 1). 1

g3 2 2 2 2 −1 (3x − 1) d x  Exercises

Exercise 1 Let {x j } be a (finite or infinite) set of non-zero and mutually orthogonal vectors in an inner-product space V over some scalar field F. Show that {x j } is a linearly independent set as defined in Definition 4. Exercise 2 Determine which of the following sets of vectors in R2 or R3 are linearly dependent or linearly independent: (a) x1 = (1, 3), x2 = (4, 1), x3 = (−1, 5),

52

1 Linear Spaces

(b) y1 = (0, 1, 2), y2 = (1, 2, −2), (c) z1 = (1, 0, 1), z2 = (0, 1, 1), z3 = (1, 1, 0), (d) w1 = (−1, −2, 3), w2 = (1, 2, 3), w3 = (2, 5, 7), w4 = (0, 1, 0). Exercise 3 As a continuation of Exercise 2, determine the dimensions of the following linear spans: (a) (b) (c) (d)

span{x1 , x2 , x3 }, span{y1 , y2 }, span{z1 , z2 , z3 }, span{w1 , w2 , w3 , w4 }.

Exercise 4 Let n ≥ 2 be an integer. Show that the set {e1 , e2 , . . . , en } introduced in (1.4.2) is an orthonormal basis of Cn over C, and also of Rn over R. Exercise 5 Fill in the details in the solution of Example 2 by showing that

x −



xk δ j 22 =

|k|≤N



|xk |2

|k|>N

and that if x = {xk } ∈ 2 , then 

|xk |2 → 0, as N → ∞.

|k|>N

1 √2

Exercise 6 Verify that the infinite set √ , √ cos j x : j = 1, 2, . . . is an orthoπ π normal basis of its linear span, as a subspace of  L 2 [0, π]. Hint: Recall the trigonometric identity:  1 cos(α − β) + cos(α + β) . 2

√2 Exercise 7 Verify that the infinite set √ sin j x : j = 1, 2, . . . is an orthonorπ mal basis of its linear span, as a subspace of  L 2 [0, π]. Hint: Recall the trigonometric identity: cos α cos β =

sin α sin β =

 1 cos(α − β) − cos(α + β) . 2

Exercise 8 Identify which pairs of bases of C2 or R2 in the following are dual bases. Justify your answers. (a) {(1, 1), (1, −1)}, {(1, 1), ( 21 , −1 2 )}. 1 1 1 −1 (b) {(1, 1), (1, −1)}, {( 2 , 2 ), ( 2 , 2 )}.

1.4 Bases of Sequence and Function Spaces

53

(c) {(1, 1 + i), (0, i)}, {(1, 0), (1 + i, −1)}. (d) {(1, 1 + i), (0, 1)}, {(1, 0), (−1 − i, 1)}. Exercise 9 Let S = {vk : k = 1, 2, . . .} be a set of nonzero vectors vk in an inner-product space V. Show that if S is orthogonal, then it is linearly independent. Exercise 10 Prove (a) and (b) Theorem 1. Exercise 11 Let W = {w1 , w2 , . . .} be an orthonormal basis for V. Suppose v ∈ V ∞  is represented with W as v = c j w j . Show that c j = v, w j , j = 1, 2, . . .. j=1

Exercise 12 Let w1 = (1, 1, 2), w2 = (1, 1, −1) and W = span{w1 , w2 }. For v = (−3, 1, 2), find the orthogonal projection Pv of v onto W and compute the distance dist(v, W) of v to W. Exercise 13 Repeat Exercise 12 with w1 = (1, 1, 0, 1), w2 = (1, 2, −1, −3) and v = (1, 4, 2, −3). √ √ 2 2 1 Exercise 14 Let g1 (x) = √ , g2 (x) = √ cos x, g3 (x) = √ cos 2x be orthoπ π π normal functions in  L 2 [0, π]. Denote W = span{g1 (x), g2 (x), g3 (x)}. Find the orthogonal projection P f of f (x) = 1 + x onto W and compute the distance dist ( f, W) of f to W. Exercise 15 Repeat Exercise 14 with f (x) = sin 3x. √ √ 2 2 Exercise 16 Repeat Exercise 14 with g1 (x) = √ sin x, g2 (x) = √ sin 2x, π π √ 2 g3 (x) = √ sin 3x and f (x) = 1 + x. π Exercise 17 g3 (x) =

√ √2 π

Repeat Exercise 14 with g1 (x) =

√ √2 π

sin x, g2 (x) =

√ √2 π

sin 2x,

sin 3x and f (x) = cos 4x.

Exercise 18 Let u1 , u2 ∈ R3 be given by u1 = (0, 0, −1, 1), u2 = (1, 0, 0, 1). Find an orthonormal basis of span{u1 , u2 } by following the Gram-Schmidt process. Exercise 19 Let w1 , w2 , w3 ∈ R4 be given by w1 = (0, 0, −1, 1), w2 = (1, 0, 0, 1), w3 = (0, 1, 1, 0) and V = span{w1 , w2 , w3 }. Show that dim V = 3 by verifying that w1 , w2 , and w3 are linearly independent. Then find an orthonormal basis of V.

54

1 Linear Spaces

Hint: Apply the Gram-Schmidt process to u1 = w2 , u2 = w3 and u3 = w1 . L 2 [0, 1], where f 1 (x) = 1, Exercise 20 Let W = span{ f 1 , f 2 , f 3 } be a subspace of  f 2 (x) = x, f 3 (x) = x 2 . Find an orthonormal basis of W by following the Gram-Schmidt process. Exercise 21 Let J be an interval in R and w(x) ≥ 0, with w(x)  ≡ 0 on any subintervalof J . Define the inner-product space  L 2 (J, w(x)d x) of functions f ∈ PC(J ) ! 2 with | f (x)| w(x)d x < ∞, and with inner product  f, g = J f (x)g(x)w(x)d x. J

For J = [0, ∞) and w(x) = e−x , follow the Gram-Schmidt process to comL 2 ([0, ∞), e−x d x), where pute an orthonormal basis of W = span{ f 1 , f 2 , f 3 } ∈  2 f 1 (x) = 1, f 2 (x) = x, f 3 (x) = x .

1.5 Metric Spaces and Completion In this section we introduce the notion of metric and normed spaces, and discuss the concept of “complete metric spaces”. In particular, the completion of  L p (J ), by extending the piecewise continuous functions  L p (J ) to include all measurable functions, will be denoted by L p (J ). Definition 1 Metric space A set U is called a metric space if there exists a function d(x, y), called metric (or “distance measurement”), defined for all x, y ∈ U , that has the following properties. (a) Positivity: d(x, y) ≥ 0 for all x, y ∈ U , and d(x, y) = 0 ⇔ x = y. (b) Symmetry: d(x, y) = d(y, x) for all x, y ∈ U . (c) Triangle inequality: d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ U. For example, the set R of real numbers is a metric space with metric defined by the absolute value, namely: d(x, y) = |x − y|. More generally, for any positive integer n, the Euclidean space Rn is a metric space with metric defined by 1  2 d(x, y) = (x1 − y1 )2 + (x2 − y2 )2 + · · · + (xn − yn )2 . It is important to point out that a metric space is only a set, which may not be a vector space. In fact, the concept of scalar field is not required in the definition of metric spaces. As an example, any bounded interval J ∈ R is a metric space, with metric defined by the absolute value as the metric space R as above, though J is not closed under addition. Another example of metric spaces that are not vector spaces is the set U [a, b] of continuous functions f on an interval [a, b] with f (a) = 1 with metric

1.5 Metric Spaces and Completion

55

defined by d( f, g) = max | f (x) − g(x)|, x∈[a,b]

f, g ∈ U [a, b].

(1.5.1)

Note that 0 = 0 × f (a)  = 1. Thus g(x) = 0 f (x) = 0  ∈ U [a, b] since g(a)  = 1. Theorem 1  p is metric space with · p For each p, 0 ≤ p ≤ ∞, the vector space  p (with · p as defined by (1.2.1)–(1.2.4) on p.14) is a metric space with metric d p (x, y) defined by d p (x, y) = x − y p , x, y ∈  p .

(1.5.2)

Indeed, the positivity and symmetry properties (a) and (b) for the metric d p are consequences of the following properties of  p :

x p ≥ 0 for x ∈  p ; x p = 0 ⇔ x = 0; and − x p = x p , while the triangle inequality property (c) for d p follows from the triangle inequality (1.2.5) on p.15, namely: d p (x, y) = (x − z) + (z − y) p ≤ x − z p + z − y p = d p (x, z) + d p (z, y).  Similarly, for any 0 < p ≤ ∞, it follows from the definition of f p in (1.2.15)– (1.2.17) and the triangle inequality (1.2.18) on p.22, that the following result, analL p (J ). ogous to Theorem 1 for  p , is also valid for the function space  L p is metric space with · p Theorem 2  metric space with metric defined by d p ( f, g) = f − g p ,

For each p, 0 < p ≤ ∞,  L p (J ) is a f, g ∈  L p (J ).

(1.5.3)

Since the proof is analogous to that of Theorem 1, it is safe to leave it as an exercise (see Exercise 3). We next introduce the notion of normed spaces (also called normed linear spaces). Definition 2 Normed space A vector space V over some scalar field F is called a normed space, if there is a function · defined on V that has the following properties. (a) Positivity: x ≥ 0 for all x ∈ V, and x = 0 ⇔ x = 0. (b) Scale preservation: ax = |a| x for all x ∈ V, a ∈ F. (c) Triangle inequality:

56

1 Linear Spaces

x + y ≤ x + y for all x, y ∈ V. For a normed space, the function · is called the “norm”. Theorem 3  p ,  L p are normed spaces with · p for1 ≤ p ≤ ∞ For each p with 1 ≤ p ≤ ∞, let x = x p and f = f p , where x = {x j } ∈  p and f ∈  L p (J ), be defined by (1.2.1)–(1.2.2) on p.14 and (1.2.15)–(1.2.16) on p.20, L p (J ) are normed spaces. However, for 0 < p < 1,  p respectively. Then  p and  and  L p (J ), with x p and f p defined by (1.2.3) on p.14 and (1.2.17) on p.20, are not normed spaces. The proof of this theorem for 1 ≤ p ≤ ∞ follows from the triangle inequalities (1.2.5) on p.15 and (1.2.18) on p.22, since the scale-preservation condition 

cx p = |c| x p ,

c f p = |c| f p

is satisfied due to the cancellation of the power p in |c| p by 1/ p. On the other hand, for 0 < p < 1, although the triangle inequality holds for · p as defined by (1.2.3) and (1.2.17), the scale-preservation condition is not satisfied (unless the scale c satisfies |c| = 1), without pth root, (1/ p)th power, to cancel the L p (J ) are not pth power in |c| p . In other words, for each p with 0 < p < 1,  p and  normed spaces.  The definition of  p measurement in (1.2.1)–(1.2.3) on p.14 for infinite sequences is valid for finite sequences, which are nothing but vectors in the Euclidean space Cn . Precisely, for x = (x1 , . . . , xn ) in Cn , we define (a) if 1 ≤ p < ∞,

⎛ ⎞1/ p n 

x p = ⎝ |x j | p ⎠ ;

(1.5.4)

j=1

(b) if p = ∞,

x ∞ = max{|x j | : j = 1, . . . , n};

(c) if 0 < p < 1,

x p =

n 

|x j | p .

j=1

Then by (1.2.5), we have the triangle inequality

x + y p ≤ x p + y p , for any x, y ∈ Cn , which yields the following result.

(1.5.5)

(1.5.6)

1.5 Metric Spaces and Completion

Theorem 4 np space

57

Let np be the vector space Cn over scalar field C with

measurement · p defined by (1.5.4)–(1.5.6). Then for each p with 0 < p ≤ ∞, np is a metric space with metric defined by d p (x, y) = x − y p , x, y ∈ np . Moreover, for each p with 1 ≤ p ≤ ∞, np is a normed space with norm given by

· p . Furthermore, the same statement is valid if Cn is replaced by Rn and C by R. Cn

We remark that for the choice of p = 2, the metric space n2 is the Euclidean space (or Rn ).

Theorem 5 Normed space is metric space If V is a normed vector space with norm · , then V is a metric space with metric defined by d(x, y) = x − y , x, y ∈ V. Clearly d(x, y) defined above satisfies (a) and (b) of metric space. Furthermore, we have d(x, y) = (x − z) + (z − y) ≤ x − z + z − y = d(x, z) + d(z, y).

(1.5.7)

That is, the triangle inequality holds for d(x, y). Therefore, V is a metric space with metric defined as the distance between two vectors. However, a metric space which is also a vector space is not necessarily a normed L p (J ), for 0 < p < 1, are not normed spaces, space. For example, the spaces  p and  although they are metric spaces and vector spaces. If V is an inner-product Theorem 6 Inner-product space is normed space √ space with inner product ·, ·, then V is a normed space, with norm · = ·, ·. Indeed, the property (a) follows from (c) in the definition of an inner product in Definition 1 on p. 8; while the property (b) follows from (b) in this definition by setting b = 0 and z = ax, namely: ¯ x = |a|2 x 2

ax 2 = ax, ax = a ax, so that ax = |a| x . To complete the proof of Theorem 6, we simply apply Cauchy-Schwarz’s inequality (1.3.8) on p.27 in Sect. 1.3 to derive the triangle inequality (c) in Definition 2, as follows:

x + y 2 = |x + y, x + y| = |x, x + y + y, x + y| ≤ |x, x + y| + |y, x + y|

58

1 Linear Spaces

≤ x x + y + y x + y , so that the triangle inequality x + y ≤ x + y is obtained by canceling out

x+y . Observe that this proof is precisely the same as the derivation of Minkowski’s inequality by applying Hölder’s inequality in Sect. 1.2.  We next introduce the concept of “completeness” of a metric space (such as a normed space or an inner-product space). For this purpose, we first recall the notion of Cauchy sequences from Calculus, as follows. Let V be a metric space with metric d(·, ·). A sequence {xk }∞ k=1 ⊂ V is called a Cauchy sequence, if d(xm , xn ) → 0 as m, n → ∞, independently. In other words, for any given positive  > 0, there exists a positive integer N , such that d(xm , xn ) < , for all m, n ≥ N . Definition 3 Complete metric space A metric space V is said to be complete with respect to some metric d(·, ·), if for every Cauchy sequence {xk } in V, there exists some x ∈ V, called the limit of {xk }, such that d(xk , x) → 0, as k → ∞. A metric space with metric d(·, ·) is said to be incomplete, if it is not a complete metric space with this metric. The set R of real numbers, considered as a metric space with metric induced by the absolute value, is complete. Furthermore, any closed and bounded interval [a, b] ∈ R is also a complete metric space, but the open and bounded interval (a, b) ∈ R is incomplete, since a Cauchy sequence that converges to one of the end-points a or b does not have a limit in (a, b). As to functions, the metric space C[a, b] of continuous functions on [a, b] with metric d( f, g) defined by (1.5.1) is complete. This is a well-known result in Advanced Calculus, which states that the limit of a uniformly convergent sequence of continuous functions on a closed and bounded interval is a continuous function on the interval (see the hint in Exercise 6 for its proof). On the other hand, the fact that the subset Q ⊂ R of rational numbers is incomplete, as already alluded to in Sect. 1.1, is shown in the following. Example 1 Give an example to show that the set Q of rational numbers, with metric d(x, y) = |x − y|, x, y ∈ R, is incomplete. Solution Let {x n } be the sequence of fractions defined by x1 = 2, xn =

1 1 , n = 2, 3, . . . . xn−1 + 2 xn−1

1.5 Metric Spaces and Completion

59

Being fractions, each xn , n = 1, 2, √ . . . , is a positive rational number. In addition, in view of the inequality a + b ≥ 2 ab for non-negative real numbers, it follows that 1 1 ≥2 xn = xn−1 + 2 xn−1

0

1 xn−1 2



1



xn−1

=



2,

so that for all n = 2, 3, . . ., we have xn − xn−1 =

1 xn−1

2 2 − xn−1 1 − xn−1 = ≤ 0, or xn ≥ xn−1 . 2 2xn−1

√ Thus, as a non-increasing sequence which is bounded from below by 2, {xn } must be convergent, and hence, is a Cauchy sequence, with√ limit denoted by x. According to the definition of the sequence xn , we see that x ≥ 2 and satisfies the equation 1 1 x+ , 2 x √ or x 2 = 2, with positive solution x = 2, which is not in Q. This example shows that Q is incomplete.  x=

Example 2  L p (J ) is incomplete for each p, 0 < p ≤ ∞. L p (J ), each function of this sequence, Solution For a sequence of functions f n ∈  being piecewise continuous, is assumed to have at most a finite number of jump discontinuities. Let us assume that f n has exactly n jump discontinuities in the open interval (a, b). Then if the sequence f n has some limit f , it should be at least intuitively clear that the limit function f has infinitely many jumps of discontinuities, and is therefore not in  L p (J ). To be precise, consider the example J = [0, 1] and the sequence of functions 1 < x ≤ k1 , k = 1, . . . , n, 1/k s , for k+1 1 , 0, for 0 ≤ x ≤ n+1

f n (x) =

for some s with sp > −1 when 0 < p < ∞, and s > 0 for p = ∞ (for example one may just let s = 1), where n = 1, 2, . . . . Hence, we have (a) for 0 < p < ∞,

 0

(b) for p = ∞,

1

| fn | p =

n  k=1

1 k ps k(k

+ 1)

ess sup | f n (x)| = 1. x∈[0,1]

< ∞;

60

1 Linear Spaces

Furthermore, for 0 < p < ∞ and s with ps > −1, we have, for m > n, " d p ( fm , fn ) = fm − fn p =

m 

k=n+1

1 ps k k(k + 1)

#1/ p → 0,

(1.5.8)

as m, n → ∞ independently. For p = ∞, we also have, for any s > 0, d∞ ( f m , f n ) = ess sup

n 1. Hint: Show that  N  N +1 N  dx 1 dx < < 1 + , α α α x k 1 1 x k=1

and apply these inequalities to study the convergence or divergence of the infinite series. Exercise 12 Apply the result in Exercise 11 to verify (1.5.8).

Chapter 2

Linear Analysis

In the previous chapter, the theory and methods of elementary linear algebra were extended to linear spaces, in that finite-dimensional vector spaces were extended to include infinite-dimensional sequence and function spaces, that the inner product was applied to derive such mathematical tools as orthogonal projection and the GramSchmidt orthogonalization process, and that the concept of completeness was studied to introduced Hilbert and Banach spaces. In this chapter, the basic theory of linear analysis is studied in some depth with the goal of preparing for the later discussion of several application areas, including: solution and least-squares approximation for linear systems and reduction of data dimension (in Chap. 3) and image/video compression (in Chap. 5), as well as for the study of anisotropic diffusion and compressed sensing in a forthcoming publication of this book series. In the first section, Sect. 2.1, we give a brief review of the basic theory and methods of elementary linear algebra, including matrix-matrix multiplication, block matrix operations, row echelon form, and matrix rank. An application to the study of the rank of a matrix is also discussed at the end of this first section, with the objective of preparing for the study of spectral methods and their applications in the next chapter. In the next Sect. 2.2, matrix multiplication to column vectors is extended to the notion of linear transformations, particularly linear functionals and linear operators, defined on vector spaces. A representation formula for any bounded linear functional defined on an inner-product space is derived and applied to prove the existence and uniqueness of the adjoint of an arbitrary bounded linear operator on a complete innerproduct space. In addition, when a square matrix A of dimension n is considered as a linear operator on the Euclidean space Cn , it is shown that its adjoint is given by complex conjugation of the transpose of A, which is different from the matrix adjoint of A, studied in elementary matrix theory or linear algebra for the inversion of the matrix A, if it happens to be nonsingular.

C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_2, © Atlantis Press and the authors 2013

63

64

2 Linear Analysis

In Sect. 2.3, the subject of eigenvalues and eigenvectors of square matrices, studied in elementary linear algebra, is reviewed and extended to linear operators on arbitrary vector spaces. To prepare for the solution of the (isotropic) heat diffusion PDE (partial differential equation) in Sect. 6.5 of Chap. 6, an example of the eigenvalue/eigenvector problem for certain differential operators is presented. This example is also used to demonstrate the notion of self-adjoint and positive semidefinite linear operators on inner-product spaces. It is shown that eigenvalues of a self-adjoint linear operator T (meaning that T is its own adjoint) are real numbers, and that these real eigenvalues are non-negative, if the self-adjoint operator T is positive semi-definite, meaning that the inner-product of T v with v, for an arbitrary v in the inner-product space, is non-negative. This section ends with a discussion of a certain minimization problem in terms of the Rayleigh quotient for the computation of eigenvalues, that will be useful for numerical solution of the (anisotropic) heat diffusion PDE, and serves as a motivation for the study of the spectral decomposition problem in the next section. The notion of eigenvalues of square matrices is extended to that of spectra for linear operators in Sect. 2.4. A linear operator is said to be normal, if it commutes with its adjoint. The main result established in this section is the Spectral (Decomposition) Theorem for normal matrices, with application to the derivation of the property of preservation of distances, norms, and angles, under normal matrix transformation. Examples of normal matrices include self-adjoint (Hermitian) matrices and unitary matrices (which are also called orthogonal matrices, for those with real entries).

2.1 Matrix Analysis Let A ∈ Cm,n and B ∈ Cn, p . Then their product, AB ∈ Cm, p , is defined by ⎡

⎤⎡ ⎤ a11 . . . a1n b11 . . . b1 p ⎦⎣ ⎦ ... ... AB = ⎣ am1 . . . amn bn1 . . . bnp ⎡ ⎤ c11 . . . c1 p ⎦ = C, ... =⎣ cm1 . . . cmp with c jk =

n 

(2.1.1)

a j bk .

=1

It is important to emphasize that for the validity of the matrix multiplication AB = C, the number n of “columns” of A must be the same as the number n of “rows” of B.

2.1 Matrix Analysis

65

Example 1 Consider ⎤ 1 2 −1 1 −1 0 A= , B = ⎣0 1 2 ⎦. 2 3 1 40 1 





Then AB is well defined but B A does not make sense, since m = 2, n = 3 and p = 3. Compute AB. Solution By applying the rule of matrix-matrix multiplication defined in (2.1.1), we have ⎡ ⎤   1 2 −1 1 −1 0 ⎣ 01 2 ⎦ AB = 2 3 1 40 1   c c c = 11 12 13 = C, c21 c22 c23 where, according to (2.1.1), c11 = 1(1) + (−1)(0) + 0(4) = 1 c12 = 1(2) + (−1)(1) + 0(0) = 1 c13 = 1(−1) + (−1)(2) + 0(1) = −3 c21 = 2(1) + 3(0) + 1(4) = 6 c22 = 2(2) + 3(1) + 1(0) = 7 c23 = 2(−1) + 3(2) + 1(1) = 5.

Hence,



 1 1 −3 AB = C = . 67 5 

Remark 1 It is convenient to identify Cn,1 with the Euclidean space Cn by agreeing on the notation Cn,1 = Cn . Recall from (1.1.2) on p.8 that x = (x1 , . . . , xn ) ∈ Cn . Hence, by identifying Cn with the set Cn,1 of n × 1 matrices, x is also a column vector, namely: ⎡ ⎤ x1 ⎢ .. ⎥ n,1 ⎣ . ⎦∈C . xn

66

2 Linear Analysis

⎤ x1 ⎢ ⎥ Hence-forth, we will therefore use both notations (x1 , . . . , xn ) and ⎣ ... ⎦ for the ⎡

Cn

same x ∈ = In particular, for A ∈ the convention x ∈ Cn,1 , Cn,1 .



a11 ⎢ .. Ax = ⎣ . ⎡

am1

Cm,n

and x ∈

Cn ,

xn we have, by adopting

⎤⎡ ⎤ x1 a1n .. ⎥ ⎢ .. ⎥ . ⎦⎣ . ⎦ . . . amn xn ... .. .

⎤ a11 x1 + · · · + a1n xn ⎢ ⎥ .. =⎣ ⎦. . am1 x1 + · · · + amn xn

(2.1.2)

 On the other hand, in some occasions, it is more convenient to write a matrix A as a block matrix or a partitioned matrix, such as  A = [A1 A2 ] , A =

   A11 A12 C1 ,A= , C2 A21 A22

where A1 , A2 , C1 , C2 , A11 , A12 , A21 , A22 are submatrices of A. As an example, the matrices A and B in Example 1 may be written as ⎡ ⎢ 1 A=⎢ ⎣... 2





.. ⎢ 1 . −1 0 ⎥ ⎢... ⎢ ... ... ...⎥ , B = ⎢ ⎦ ⎢ 0 .. ⎣ . 3 1 4

.. . ... .. . .. .

⎤ 2 −1 ⎥ ... ...⎥ ⎥ ⎥. 1 2 ⎥ ⎦ 0 1

(2.1.3)

In other words, we have partitioned the matrices A and B as block matrices A = [A jk ]1≤ j,k≤2 , B = [B jk ]1≤ j,k≤2 , where A11 = [1], A12 = [ −1 0 ], A21 = [2], A22 = [ 3 1 ],     0 12 B11 = [1], B12 = 2 −1 , B21 = , B22 = . 4 01 Remark 2 Block-matrix multiplication The operation of matrix-matrix multiplication (2.1.1) remains valid for multiplication of block matrices. For example, let     B11 B12 A11 A12 ,B = , A= A21 A22 B21 B22

2.1 Matrix Analysis

67

where A jk and B jk are the ( j, k)-blocks of A and B with appropriate sizes. Then we have     A11 B11 + A12 B21 A11 B12 + A12 B22 C11 C12 = . AB = C21 C22 A21 B11 + A22 B21 A21 B12 + A22 B22 Also, even non-square matrices can be written as diagonal block matrices. For example, if A and B are diagonal block matrices, denoted by A = diag{A1 , A2 , . . . , As }, B = diag{B1 , B2 , . . . , Bs }, with main diagonal blocks A j , B j , 1 ≤ j ≤ s of appropriate sizes and the other block matrices not on the “diagonal” being zero submatrices, then by applying the above multiplication operation for block matrices, we have AB = diag{A1 B1 , A2 B2 , . . . , As Bs }. We emphasize again that this multiplication operation applies, even if A, B are not square matrices.  Observe that by writing any A ∈ Cm,n as a block matrix A = [a1 a2 · · · an ], of vector column sub-blocks, with a j being the jth column of A, then for any x ∈ Cn , the matrix-vector multiplication of A and x can be written as ⎤ x1 ⎢ ⎥ Ax = [a1 a2 · · · an ] ⎣ ... ⎦ ⎡

xn = x1 a1 + x2 a2 + · · · + xn an .

(2.1.4)

The matrix partitioning of A into blocks of column-vectors for the reformulation (2.1.2) of matrix-vector multiplication will facilitate our discussions below and in later chapters, particularly in Chap. 4 in the discussion of DFT and DCT. Example 2 Let A and B be the matrices in Example 1, and consider the partition of these two matrices into block matrices as given in (2.1.3). Compute the product AB by multiplication of block matrices. Solution In terms of the notation for block matrix multiplication introduced in the above discussion, it follows that   0 C11 = A11 B11 + A12 B21 = [1][1] + −1 0 = [1], 4

68

2 Linear Analysis

  12 C12 = A11 B12 + A12 B22 = [1] 2 −1 + −1 0 01 = 2 −1 + −1 −2 = 1 −3 ,   0 = [6], C21 = A21 B11 + A22 B21 = [2][1] + 3 1 4   12 C22 = A21 B12 + A22 B22 = [2] 2 −1 + 3 1 01 = 4 −2 + 3 7 = 7 5 . Thus, we have



C11 C12 C21 C22





 1 1 −3 = , 67 5

which agrees with the matrix AB obtained by the matrix-matrix multiplication in Example 1.  Next we recall the notions of transpose and transpose-conjugate of matrices. The transpose of an m × n matrix A = [a jk ]1≤ j≤m,1≤k≤n , denoted by A T , is the n × m matrix defined by ⎡

A T = [ak j ]1≤ j≤n,1≤k≤m

a11 ⎢ a12 ⎢ =⎢ . ⎣ .. a1n

⎤ a21 . . . am1 a22 . . . am2 ⎥ ⎥ .. .. ⎥ . . ... . ⎦ a2n . . . amn

The transpose-conjugate of A, denoted by A∗ , is defined by

T A∗ = A ,

(2.1.5)

where A = [a jk ]. For real matrices A, it is clear that A∗ = A T . A square matrix A is said to be symmetric if A T = A, and Hermitian if A∗ = A. We also recall that

T A T = A,

∗ ∗ A = A,



AB AB

T ∗

= B T AT , = B ∗ A∗ , 

which can be easily verified. For block matrices A =

 B C , the above operations D E

can be extended to  AT =

B T DT CT ET



, A∗ =



 B ∗ D∗ . C∗ E∗ 

2.1 Matrix Analysis

69

In an elementary course of Linear Algebra or Matrix Theory, the method of Gaussian elimination is introduced to solve a system of linear equations, to invert nonsingular matrices, study matrix rank, and so forth. The matrices are then reduced to the so-called “row echelon form” or “reduced row echelon form”, defined as follows. Definition 1 Row echelon form the following properties.

A matrix is in row echelon form if it satisfies

(a) All nonzero rows appear above any rows of all zeros. (b) The first nonzero entry from the left of a nonzero row is a 1, and is called the leading 1 of this row. (c) All entries in each column below a leading 1 are zeros. A matrix in row echelon form is said to be in reduced row echelon form if it satisfies the following additional condition: (d) Each leading 1 is the only nonzero entry in the column that contains this leading 1. For example, the following matrices are in row echelon form, where ∗ denotes any number: ⎡ ⎤ ⎡ ⎤ 01∗∗∗∗∗ 1∗∗∗ ⎢ ⎥ ⎣ 0 0 1 ∗ ⎦, ⎢ 0 0 1 ∗ ∗ ∗ ∗ ⎥, ⎣0 0 0 0 1 ∗ ∗⎦ 0001 000001∗ and in particular, the following two matrices ⎡ ⎤ 01 1∗00 ⎢0 0 ⎣ 0 0 1 0 ⎦, ⎢ ⎣0 0 0001 00 ⎡

0 1 0 0

∗ ∗ 0 0

0 0 1 0

0 0 0 1

⎤ ∗ ∗⎥ ⎥ ∗⎦ ∗

are in reduced row echelon form. A matrix can be transformed into a matrix in row echelon form or reduced row echelon form by performing finitely many steps of the elementary row operations. There are three permissible elementary row operations on a matrix A, to be called types I-III, defined as follows: (i) Type I: Interchange two rows of A. (ii) Type II: Multiply a row of A by a nonzero constant. (iii) Type III: Add a constant multiple of a row of A to another row of A. For example, to study the solution of system of linear equations Ax = b

(2.1.6)

(also called a linear system for simplicity), where A is an m × n (coefficient) matrix, and b is a given m × 1 column vector, and x is the unknown column vector in Cn , let

70

2 Linear Analysis

  . C .. d be the matrix in row echelon form or reduced row echelon form obtained by   .. applying elementary row operations on the “augmented matrix” A . b of the linear system (2.1.6). Then Cx = d has the same solutions as the original linear system (2.1.6). Thus, by solving Cx = d, which is easy to solve since C is in row echelon form or reduced row echelon form,  we get  the solution for (2.1.6). This method is .. called the Gaussian elimination if C . d is in row echelon form, and it is called the   .. Guass-Jordan reduction if C . d is in reduced row echelon form.

An n × n matrix R is called an elementary matrix of type I, type II or type III (each is called an elementary matrix) if it is obtained by performing one step of type I, type II, or type III elementary row operation on the identity matrix In . For example, ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 100 100 1d0 R1 = ⎣ 0 0 1 ⎦ , R2 = ⎣ 0 c 0 ⎦ , R2 = ⎣ 0 1 0 ⎦ , 010 001 001 where c = 0, d = 0, are type I, type II, and type III elementary matrices, respectively. Let A = [a jk ] be a 3 × 3 matrix. Observe that ⎤ ⎤ ⎡ a11 a12 a13 a11 a12 a13 R1 A = ⎣ a31 a32 a33 ⎦ , R2 A = ⎣ ca21 ca22 ca23 ⎦ , a21 a22 a23 a31 a32 a33 ⎡ ⎤ a11 + da21 a12 + da22 a13 + da23 ⎦. a21 a22 a23 R3 A = ⎣ a31 a32 a33 ⎡

Hence, R1 A is the matrix obtained by interchanging the second and third rows of A, and this is exactly the same operation applied to I3 to obtain R1 (namely, R1 is the resulting matrix after interchanging the second and third rows of I3 ). Similarly, R2 A and R3 A are the matrices obtained by performing the corresponding type II and type III elementary row operations on A as they are performed on I3 to yield R2 and R3 , respectively. More generally, for an elementary matrix R, we have the following fact: RA is the matrix obtained by performing the corresponding elementary row operation on A . Thus, if B is the matrix obtained by performing finitely many steps of elementary row operations on A, then B = E A,

2.1 Matrix Analysis

71

where E = Rk . . . R2 R1 with each R j an elementary matrix. In addition, E is nonsingular, and A = E −1 B, where E −1 = R1−1 R2−1 . . . Rk−1 with each R −1 j being also an elementary matrix.  For an m × n matrix A, let a j denote the jth column of A, namely: A = [a1 a2 · · · an ]. The column space of A is defined by span{a1 , . . . , an }; that is, the vector space spanned by the columns of A. The dimension of this vector space is called columnrank of the matrix A. Next we apply the above discussion to find a basis for the column space of A. To find a basis of vectors among a1 , . . . , an for span{a1 , . . . , an }, we need to take linear dependence of a1 , . . . , an into consideration, by investigating the linear system c1 a1 + · · · + cn an = 0, where c1 , . . . , cn ∈ C. From (2.1.4), this linear system has matrix formulation: Ac = 0, where c = [c1 , . . . , cn ]T , with coefficient matrix A. With elementary row operations, we may transform A into its reduced row echelon form B. If Bc = 0 has only the trivial solution, then a1 , . . . , an are linearly independent and therefore constitute a basis for the column space. In this case B must have n leading 1’s and that n ≤ m. Next we may assume Bc = 0 has a nontrivial solution. For simplicity of presentation, we assume that the columns of B with a leading 1 precede any columns without a leading 1. More precisely, B can be written as ⎤ ⎡ 1 0 . . . 0 b1 r +1 . . . b1n ⎢ 0 1 . . . 0 b2 r +1 . . . b2n ⎥ ⎥ ⎢ ⎢ .. .. . . .. .. .. .. ⎥ ⎥ ⎢. . . . . . . ⎥ ⎢ ⎥ . . . b 0 0 . . . 1 b (2.1.7) B=⎢ r r +1 rn ⎥ , ⎢ ⎢0 0 0 0 0 ... 0 ⎥ ⎥ ⎢ ⎢ .. .. .. .. .. .. .. ⎥ ⎣. . . . . . . ⎦ 0 0 ... 0 0 ... 0 where 0 ≤ r ≤ n. Let b j , 1 ≤ j ≤ n, denote the jth column of B. Clearly the first r columns b1 , . . . , br of B are linearly independent. In addition, other columns can be written as a linear combination of them: b j = b1 j b1 + b2 j b2 + · · · + br j br , r + 1 ≤ j ≤ n. Thus, b1 , . . . , br form a basis for the column space of B.

(2.1.8)

72

2 Linear Analysis

From the fact that A = E B, where E is a nonsingular matrix which is a product of elementary matrices, we see a j = Eb j , 1 ≤ j ≤ n. Thus, by multiplying both sides of (2.1.8) with E, we have Eb j = b1 j Eb1 + b2 j Eb2 + · · · + br j Ebr , r + 1 ≤ j ≤ n, or a j = b1 j a1 + b2 j a2 + · · · + br j ar , r + 1 ≤ j ≤ n. That is, a j ∈ span {a1 , . . . , ar }, r + 1 ≤ j ≤ n. On the other hand, from the linear independence of b1 , . . . , br and the relation b j = Ea j between a j and b j , it is easy to see that a1 , . . . , ar are linearly independent (see Exercise 7). Thus, {a1 , . . . , ar } is a basis of the column space of A. To summarize, we have shown that the column-rank of A is r , which is the number of the leading 1’s in B, and the columns of A corresponding to the columns B with a leading 1 form a basis for the column space of A. More generally, we have the following theorem. Theorem 1 Basis for column space echelon form. Then

Let A be an m × n matrix and B be its row

column-rank A = # of leading 1’s in B, and the columns of A corresponding to the columns B with a leading 1 constitute a basis of the column space of A. For a set of vectors v1 , . . . , vn in Cm or Rm , to find a basis for its span, we only need to apply Theorem 1 to the matrix with v j , 1 ≤ j ≤ n (or v Tj if v j are row vectors) as its column vectors. Example 3 Determine some columns of the following matrix A which constitute a basis for its column space: ⎡ ⎤ 0121 3 ⎢2 4 2 0 8 ⎥ ⎥ A=⎢ ⎣ 3 7 5 1 15 ⎦ . 2301 3 Solution We first transform A into its row echelon form. A → Multiply 2nd row by

1 2

2.1 Matrix Analysis



0 ⎢1 −→ ⎢ ⎣3 2 ⎡ 1 ⎢0 −→ ⎢ ⎣3 2 ⎡ 1 ⎢0 −→ ⎢ ⎣0 0 ⎡ 1 ⎢0 −→ ⎢ ⎣0 0 ⎡ 1 ⎢0 ⎢ −→ ⎣ 0 0

73

⎤ 3 4 ⎥ ⎥ → Interchange 1st and 2nd rows 15 ⎦ 3 ⎤ 210 4 121 3 ⎥ ⎥ → Add multiple of 1st row by (-3) to 3rd row; 7 5 1 15 ⎦ and add multiple of 1st row by (-2) to 4th row 301 3 ⎤ 2 1 0 4 1 2 1 3 ⎥ ⎥ → Add multiple of 2nd row by (-1) to 3rd row; 1 2 1 3 ⎦ and add 2nd row to 4th row −1 −2 1 −5 ⎤ 210 4 1 121 3 ⎥ ⎥ → Multiply 4th row by − 2 ; ⎦ 000 0 and then interchange 3rd and 4th rows 0 0 2 −2 ⎤ 210 4 121 3 ⎥ ⎥ = B, 0 0 1 −1 ⎦ 000 0 1 2 7 3

2 1 5 0

1 0 1 1

which is in row echelon form. Since there are 3 leading 1’s in B, the column-rank of A is 3. In addition, because the first, second and fourth columns of B are the columns containing leading 1’s, the first, second and fourth columns of A form a basis for the column space of A; that is ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 0 1 1 ⎢2⎥ ⎢4⎥ ⎢0⎥ ⎢ ⎥, ⎢ ⎥, ⎢ ⎥ ⎣3⎦ ⎣7⎦ ⎣1⎦ 2 3 1 constitute a basis for the column space of A.



Similarly, the row space of an m × n matrix A ∈ Cm,n is defined by the vector space spanned by the row vectors [a j1 . . . a jn ], j = 1, . . . , m of A. The “row-rank” of A is defined by the dimension of its row space. In other words, the row-rank of A is precisely the number of linearly independent rows of A. As shown in the following theorem, the row-rank and column-rank of any matrix are the same. Hence, the rank of a matrix A is defined by rank(A) := row-rank(A) = column-rank(A).

74

2 Linear Analysis

Theorem 2 row-rank = column-rank matrix are the same.

The row-rank and column-rank of any

Proof Suppose A is an m × n matrix and B is a row echelon form of A. Thus B = E A, where E is a nonsingular matrix. Therefore, every row of B is a linear combination of the rows of A. Hence, span{rows of B} ⊂ span{rows of A}, which implies row-rank(B) ≤ row-rank(A). Similarly, since A = E −1 B, we also have row-rank(A) ≤ row-rank(B). Hence row-rank(B) = row-rank(A). On the other hand, it is obvious that being in row echelon form, the nonzero rows of B constitute a basis for the row space of B. Hence, row-rank(B) = # of nonzero rows of B = # of leading 1’s in B. This, together with Theorem 1, leads to column-rank(A) = row-rank(A) = # of leading 1’s in B. This completes the proof of the theorem.



Clearly for an m×n matrix A, we have rank(A) ≤ min{m, n}, rank(A) = rank(A T ) = rank(A∗ ), and that a nonsingular linear transformation does not alter the rank of a matrix (see Exercise 16). Furthermore, for two matrices B, C, rank(B + C) ≤ rank(B) + rank(C),

(2.1.9)

rank(BC) ≤ min{rank(B), rank(C)}.

(2.1.10)

Next, we recall that the null space of an m × n matrix A is defined by



N0 (A) := {x ∈ Cn : Ax = 0}. The dimension of the null space N0 (A) is called nullity of A, denoted by ν(A). Observe that if B is a row echelon form or the reduced row echelon form of A, then N0 (A) = N0 (B) since the two linear systems Ax = 0 and Bx = 0 have the same solution space. If Bx = 0 has the trivial solution only, then N0 (A) = {0}. In this case, N0 (A) has no basis and ν(A) = 0. If Bx = 0 has a nontrivial solution, then from Bx = 0, it is not difficult to find a basis for N0 (B) (and hence for N0 (A) also). The dimensionof N0 (B) is equal to the number of the free parameters in the

2.1 Matrix Analysis

75

solutions for Bx = 0. For example, for B given in (2.1.7), xr +1 , . . . , xn are the free parameters in the solutions. Hence, in this case the nullity of B is n − r . Observe for an m × n matrix B in row echelon form or reduced row echelon form, the number of the free parameters in solutions for Bx = 0 is equal to the number of columns of B minus the number of leading 1’s in B. Hence, we obtain the following formula on the rank and nullity of an arbitrary matrix studied in an elementary course in Linear Algebra or Matrix Theory. Theorem 3 Rank and nullity formula

Let A be an m × n matrix. Then

rank(A) + ν(A) = n.

(2.1.11)

Example 4 Let A be the matrix in Example 3. Find a basis for N0 (A). Solution Let B be the matrix in row echelon form in Example 3. To find a basis for N0 (A), we only need find a basis for N0 (B). Since the solution of the linear system Bx = 0 is given by ⎧ ⎪ ⎨x1 = 3x3 + 4x5 , x2 = −2x3 − 4x5 , ⎪ ⎩ x4 = x5 , with two free parameters, x3 and x5 , the dimension of the null space of A is 2, namely ν(A) = 2. The general solution of Ax = 0 can be written as ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 3x3 + 4x5 x1 3 4 ⎢ x2 ⎥ ⎢ −2x3 − 4x5 ⎥ ⎢ −2 ⎥ ⎢ −4 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎢ ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ x3 x = ⎢ x3 ⎥ = ⎢ ⎥ = x3 ⎢ 1 ⎥ + x5 ⎢ 0 ⎥ . ⎣ x4 ⎦ ⎣ ⎦ ⎣ ⎣ ⎦ 0 1 ⎦ x5 0 1 x5 x5 ⎡

Thus, [3, −2, 1, 0, 0]T , [4, −4, 0, 1, 1]T form a basis of N0 (A). Observe that B has 3 nonzero rows. Thus, rank(A) =rank(B) = 3, so that rank(A) + ν(A) = 3 + 2 = 5, which satisfies the formula (2.1.11) with n = 5.



76

2 Linear Analysis

Example 5 Compute the rank and nullity of the matrix ⎡

2 ⎢ −1 A=⎢ ⎣ 1 3

1 −1 2 4

⎤ 4 −1 ⎥ ⎥. −1 ⎦ 1

Solution Since A has more rows than columns, we perform elementary row operations on A T as follows to reduce the number of elementary row operations: ⎡

⎤ 2 −1 1 3 A T = ⎣ 1 −1 2 4 ⎦ → Interchange 1st and 2nd rows 4 −1 −1 1 ⎡ ⎤ 1 −1 2 4 Add (-2)-multiple of 1st row to 2nd row −→ ⎣ 2 −1 1 3 ⎦ → and add (-4)-multiple of 1st row to 3rd row 4 −1 −1 1 ⎡ ⎤ 1 −1 2 4 −→ ⎣ 0 1 −3 −5 ⎦ → Add (-3)-multiple of 2nd row to 3rd row 0 3 −9 −15 ⎡ ⎤ 1 −1 2 4 −→ ⎣ 0 1 −3 −5 ⎦ = B, 0 0 0 0 which is in row echelon form. Thus, we obtain rank(A) = column-rank(A) = row-rank(A T ) =2. In addition, since the dimension of A is 4 × 3 (with n = 3), it follows from (2.1.11) that the nullity of A is ν(A) = 3 − 2 = 1.  We conclude this section by showing that B B ∗ , B ∗ B and B have the same rank. Theorem 4 B, BB∗ , and B∗ B have the same rank

For any matrix B,

rank(B B ∗ ) = rank(B ∗ B) = rank(B). Proof For an arbitrary matrix B ∈ Cm,n , set C = B ∗ B, and observe that N0 (C) = N0 (B).

(2.1.12)

Indeed, if x ∈ N0 (B), then Cx = B ∗ (Bx) = B ∗ 0 = 0, so that x ∈ N0 (C). On the other hand, for any x ∈ N0 (C), we have

2.1 Matrix Analysis

77

0 = x, 0 = x, Cx = x, B ∗ Bx = Bx, Bx = ||Bx||2 , so that Bx = 0; that is, x ∈ N0 (B). Now, since C is an n × n matrix, it follows from (2.1.11) and (2.1.12) that rank(C) = n − ν(C) = n − ν(B) = rank(B), or rank(B ∗ B) = rank(B). Finally, since B B ∗ = (B ∗ B)∗ , we may conclude that  rank(B B ∗ ) = rank(B ∗ B), which agrees with rank(B). Exercises Exercise 1 Consider the matrices ⎤ 1 −1 0 101 20 A= , B= , C = ⎣0 1 0⎦. 011 11 1 0 1 

C2









Which of the following nine multiplications AB, AC, B A, BC, C A, C B, A2 , B 2 , do not make sense? Why? Compute those that are well defined.

Exercise 2 Let A, B, C be the matrices given in Exercise 1. Which of the following matrix operations do not make sense? Why? Compute those that are well defined. (a) (b) (c) (d)

AC + 2B. AC − A. B A + 3C. B A + 2 A.

Exercise 3 Let B, C, D, E be 2 × 2 matrices. Show that      B O D O BD O = , O C O E O CE where O is the zero matrix (with appropriate size). Exercise 4 Repeat Exercise 3, where B, C, D and E are 2 × 3, 2 × 2, 3 × 2 and 2 × 3 matrices, respectively. Exercise 5 Use the block matrix multiplication formula in Exercise 3 to compute the matrix product diag{B, C}diag{D, E}, where  B=

       0.5 3 1 −1 2 0 1 1 , C= , D= , E= . −1.5 1 1 2 0 −1 −1 0

Exercise 6 Use the block matrix multiplication formula in Exercise 4 to calculate the matrix product diag{B, C}diag{D, E}, where

78

2 Linear Analysis

⎡ ⎤      3 1 12 1 −1 1 0 1 3 ⎣ ⎦ B= , C= , D = −1 0 , E = . 1 2 −1 0 1 1 −1 2 −2 −1 

Exercise 7 Suppose that the vectors a j and b j , 1 ≤ j ≤ r in Cn satisfy a j = Eb j , where E is a nonsingular matrix. Show that a1 , . . . , ar are linearly independent if and only if b1 , . . . , br are linearly independent. Exercise 8 Determine the vectors from ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 1 2 1 2 1 ⎢2⎥ ⎢4⎥ ⎢2⎥ ⎢5⎥ ⎢4⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎥ v1 = ⎢ ⎢ 2 ⎥ , v2 = ⎢ 4 ⎥ , v3 = ⎢ 2 ⎥ , v4 = ⎢ 4 ⎥ , v5 = ⎢ 2 ⎥ ⎣3⎦ ⎣6⎦ ⎣3⎦ ⎣7⎦ ⎣5⎦ 2 4 1 3 3 that form a basis of the vector space span{v1 , . . . , v5 }. Exercise 9 Repeat Exercise 8 with v1 = (0, 2, 3, 5, 6), v2 = (2, 4, −6, 0, −2), v3 = (3, 2, 1, −2, 1), v4 = (2, −2, 1, 3, 4). Exercise 10 Perform elementary row operations on each of the following matrices to (i) compute its row-rank and (ii) find a basis for its column space. Then verify that the column-rank and row-rank for each of these matrices are the same. ⎡ ⎤ −3 2 −1 0 (a) A1 = ⎣ 1 0 2 1 ⎦. 0 2 5 3 ⎡ ⎤ 1 −1 2 ⎢ −1 2 1 ⎥ ⎥ (b) A2 = ⎢ ⎣ 0 1 3 ⎦. 1 0 5 ⎡ ⎤ 0 2 1 0 ⎢1 1 0 1⎥ ⎢ ⎥ ⎥ (c) A3 = ⎢ ⎢ 1 −1 −1 1 ⎥. ⎣1 5 4 0⎦ 4 2 1 3 Exercise 11 For each of the matrices A1 , A2 , A3 in Exercise 10, perform elementary row operations to the transpose of the matrix to (i) compute its column-rank and (ii) find a basis for its row space. Then verify that the column-rank and row-rank for each of A1 , A2 , A3 are the same. Exercise 12 Show that for any matrix A, rank(A) = rank(A T ).

2.1 Matrix Analysis

79

Exercise 13 Suppose A and B are two matrices of sizes m×n and n×m, respectively. Determine whether rank(AB) = rank(B A) always holds or not. If yes, give a proof for this formula. Otherwise, give a counter-example. Exercise 14 Show that for two matrices B and C of the same size, the inequality (2.1.9) holds. Exercise 15 Show that for two matrices B and C with appropriate sizes, the inequality (2.1.10) holds. Exercise 16 Show that a nonsingular matrix transformation of a rectangular matrix B does not change its rank; that is, rank(B) = rank(AB) = rank(BC), where A and C are nonsingular matrices. Exercise 17 Let A1 , A2 , A3 be the matrices given in Exercise 10. Answer the following questions. (a) (b) (c) (d)

Is rank(A1 A2 ) = rank(A2 A1 )? Is rank(A1 A3T ) = rank(A1 )? Is rank(A1 A3T ) = rank(A3 )? Is rank(A2T A3T ) ≤ min{rank(A2 ), rank(A3 )}?

2.2 Linear Transformations Multiplication of an m × n matrix A to a column vector x ∈ Cn results in a column vector z ∈ Cm . Hence, the matrix A can be considered as a mapping (to be called a transformation in this book) from the Euclidean space Cn to the Euclidean space Cm . This mapping has the important property that A(ax + by) = a Ax + b Ay for all vectors x, y ∈ Cn and all scalars a, b ∈ C. To be called the linearity property of the matrix A, this concept extends to transformations defined on finite or infinitedimensional vector spaces, including fairly general differential and integral operators on certain appropriate subspaces of the function spaces  L p (J ) as studied in the previous chapter. Definition 1 Linear transformations Let V and W be two vector spaces over some scalar field F, such as C or R. A transformation T from V to W is said to be linear, if it satisfies: T (ax + by) = aT x + bT y (2.2.1) for all x, y ∈ V and a, b ∈ F.

80

2 Linear Analysis

It follows from (2.2.1) that a linear transformation T satisfies T (x + y) = T x + T y; T (ax) = aT x

(2.2.2) (2.2.3)

for all x, y ∈ V and a ∈ F. Conversely, (2.2.1) follows from (2.2.2) and (2.2.3) as well (see Exercise 1). Example 1 As mentioned above, any matrix A ∈ Cm,n is a linear transformation from Cn to Cm . This statement remains valid if Cn , C are replaced by Rn , R, respectively. Validity of the statement in this example follows from the rule of matrix multiplication, with B being an n × 1 matrix; and scalar multiplication as defined in Sect. 1.1 of Chap. 1. Details will be left as an exercise (see Exercise 2).  Observe that the scalar field F can be considered as a vector space over F itself. For instance, in the above example, the matrix A ∈ C1×n (with m = 1), is a linear transformation from V = Cn to the vector space W = C, which is the scalar field C. This matrix transformation, from a vector space to a scalar field, is called a linear functional. More generally, we have the following. A linear transformation Definition 2 Linear functionals, linear operators from a vector space V (over some scalar field F, such as C or R) to F is called a linear functional on V. A linear transformation T from a vector space V to itself is called a linear operator on V. If the vector spaces V and W are both normed (linear) spaces, then a linear transformation T from V to W is said to be bounded, provided that the norm of T x is uniformly bounded for all unit vectors x ∈ V. For bounded linear transformations, we introduce the notion of operator norm, as follows. Definition 3 Operator norm Let V and W be two normed spaces (over some scalar field F, such as C or R) with norms denoted by · V and · W , respectively. The norm of a linear transformation T from V to W is defined by |||T |||V→W = sup

 T x

W

x V

 : 0 = x ∈ V .

(2.2.4)

If |||T |||V→W is finite, then the transformation T is said to be a bounded transformation. For convenience, if W = V (that is, for linear operators T on V), we will adopt the abbreviated notation (2.2.5) |||T |||V = |||T |||V→V .

2.2 Linear Transformations

81

Example 2 Let V =  L 1 (J ), where J is an interval on R. Then  Tf =

f

(2.2.6)

J

is a bounded linear functional on V. Example 3 Let V = C(J ) be the vector space of continuous functions on the interval J over C or R, as introduced in Sect. 1.2, and let b ∈ J . Then T f = f (b), f ∈ C(J ),

(2.2.7)

is a bounded linear functional on C(J ). Example 4 Let V be an inner-product space with inner product ·, · , and let x0 ∈ V. Then T x = x, x0 , x ∈ V, is a bounded linear functional from V to C with |||T |||V→C = ||x0 ||. Verifications of Examples 2–4 are left as exercises (see Exercises 8–10). The converse of Example 4 is also valid by employing a pair of dual bases, and in particular, an orthonormal basis. Recall from Definition 5 on p.39, that two bases vk , k = 1, 2, . . .} of an inner-product space V are said to {vk , k = 1, 2, . . .} and { constitute a dual pair, if v j = δ j−k , j, k = 1, 2, . . . , vk , where ·, · is the inner product for the space V. vk }, Theorem 1 Let V be a complete inner-product space with dual bases {vk } and { and let T be a bounded linear functional on V, such that xT , defined by xT =



(T v j ) vj,

(2.2.8)

j

is in V. Then T : V → F can be formulated as: T x = x, xT for all x ∈ V. Furthermore, xT in (2.2.8), called the representer of T , is unique. Proof To derive the representation (2.2.9) for each x ∈ V, write x=

 k

ck vk ,

(2.2.9)

82

2 Linear Analysis

where the series converges in the sense of (1.4.4) on p.38, with · = by linearity and taking the limit, we have (see Exercise 12) Tx =



ck T vk .

√ ·, · . Then

(2.2.10)

k

On the other hand, again by linearity and taking limit, we also have, by the definition of xT given in (2.2.8), x, xT =



ck vk ,

k

=



ck

k

=







T v j vj



j

T v j vk , vj

j

ck

k



T v j δk− j =

j



ck T vk .

k

Hence, it follows from (2.2.10) that (2.2.9) is established. To prove that xT in (2.2.9) is unique, let y0 ∈ V be another representer of T . Then T x = x, xT = x, y0 , for all x ∈ V so that x, xT − y0 = 0. By choosing x = xT − y0 , we have

xT − y0 2 = 0, or y0 = xT .



Remark 1 We remark that xT , as defined in (2.2.8), is always in V. Refer to Exercise vk } (that is, {vk } is an orthonormal basis of V). 11 for the proof in the case {vk } = { However, the proof for the general case of dual bases is beyond the scope of this book.  Throughout the entire book, the inner product ·, · for the Euclidean space Cn or is always the ordinary dot product, so that the corresponding norm · of z in Cn or Rn is its Euclidean norm (or length of the vector z):

Rn

1

z = (|z 1 |2 + · · · + |z n |2 ) 2 . S = { v1 , v2 , v3 } be dual bases of R3 as given Example 5 Let S = {v1 , v2 , v3 } and  in Example 4 on p.40. Consider the linear functional T , defined by

2.2 Linear Transformations

83

Tx =

3 

xj,

(2.2.11)

j=1

where x = (x1 , x2 , x3 ) ∈ R3 . Write down the representer xT of T with dual bases S and  S, and verify that T x = x, xT for all x ∈ R3 . Solution In view of (2.2.8) in Theorem 1, the required representer is given by xT =

3 

(T v j ) vj

j=1

= (T v1 ) v1 + (T v2 ) v2 + (T v3 ) v3 = 2 v1 + 0  v2 + 3  v3 = 2(−1, 2, −1) + (0, 0, 0) + 3(1, −1, 1) = (1, 1, 1). Hence, for each x = (x1 , x2 , x3 ) ∈ R3 , x, xT = x1 (1) + x2 (1) + x3 (1) =

3 

xj.

j=1

Thus, x, xT is exactly T x in (2.2.11). On the other hand, if we use the standard basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), v j = e j , we again have of R3 , then by applying (2.2.8) with v j =  xT =

3  (T e j )e j j=1

= (T e1 )e1 + (T e2 )e2 + (T e3 )e3 = e1 + e2 + e3 = (1, 1, 1), which agrees with the above result. This example illustrates the independence of the  representer xT in terms of the choice of dual bases in (2.2.8). Example 6 For n ≥ 1, show that any square matrix A ∈ Cn,n is a bounded linear operator on the vector space Cn .

84

2 Linear Analysis

In view of the linearity property, it is sufficient to show that Ax is uniformly  bounded for all x ∈ Cn with ||x|| = 1 (see Exercise 6). We next introduce the notion of adjoints of linear operators, but would like to point out in advance that this concept differs from the definition of (matrix) adjoints of square matrices in an elementary course of Linear Algebra or Matrix Theory (see Remark 2 on p.85). Theorem 2 Adjoint of linear operators Let V be a complete inner-product space over the scalar field C or R, such that certain dual bases of V exist. Then corresponding to any bounded linear operator T on V, there exists a linear operator T ∗ defined on V, called the adjoint of T , that satisfies the property T x, y = x, T ∗ y , for all x, y ∈ V.

(2.2.12)

Furthermore, T ∗ is uniquely determined by (2.2.12). Proof In view of Theorem 1 and Remark 1, we observe that for any fixed y ∈ V, the linear functional L = L y , defined by L y x = T x, y ,

(2.2.13)

is bounded, and hence, has a unique representer x L y , in that L y x = x, x L y for all x ∈ V. Since x L y is a function of y, we may write x L y = F(y), so that T x, y = x, F(y)

(2.2.14)

for all x ∈ V, where (2.2.14) holds for all y ∈ V. Next, let us verify that F is a linear operator on V. From its definition, F is a transformation from V into itself. That F is a linear operator on V follows from the linearity of the inner product. Indeed, for any fixed y, z ∈ V and fixed a, b ∈ C = F, it follows from (2.2.14) that, for all x ∈ V, x, F(ay + bz) = T x, ay + bz = T x, ay + T x, bz = aT x, y + bT x, z = ax, F(y) + bx, F(z) = x, a F(y) + x, bF(z) , so that x, F(ay + bz) − (a F(y) + bF(z)) = 0 for all x ∈ V. By setting x = F(ay + bz) − (a F(y) + bF(z)), we have

F(ay + bz) − (a F(y) + bF(z)) 2 = 0;

2.2 Linear Transformations

85

that is, F(ay + bz) = a F(y) + bF(z). Hence, by setting T ∗ = F, we have derived (2.2.12). Furthermore, since the representer in Theorem 1 is unique, T ∗ = F is unique.  Example 7 Consider an n × n matrix A ∈ Cn,n as a linear operator on the vector space V = Cn . Determine the adjoint A∗ of A. Solution For any x, y ∈ Cn , consider x and y as column vectors: ⎤ x1 ⎢ ⎥ x = ⎣ ... ⎦ ,



⎤ y1 ⎢ ⎥ y = ⎣ ... ⎦



xn

yn

so that x T = [x1 , . . . , xn ] and yT = [y1 , . . . , yn ] are row vectors (where the superscript T denotes, as usual, the transpose of the matrix). Hence, it follows from the definition of the inner product for Cn that Ax, y = (Ax)T y = x T A T y   

T T = x T A y = x, A y , so that from the uniqueness of the adjoint A∗ of A, we have

T A∗ = A .

(2.2.15)

In other words, the adjoint of A is the transpose-conjugate A∗ of A defined in (2.1.5) on p.68.  As pointed out previously, the operator adjoint is different from the matrix adjoint being studied in an elementary course of matrix theory or linear algebra, as follows. Remark 2 Matrix adjoints In Matrix Theory, the adjoint (which we call matrix adjoint) of a square matrix A = a j,k 1≤ j,k≤n is the matrix 

A j,k

T

1≤ j,k≤n

,

where A j,k denotes the cofactor of a j,k . Hence in general, the matrix adjoint of A is

T different from the operator adjoint A∗ = A , when A is considered as an operator on the vector space Cn .  Definition 4 Self-adjoint operators adjoint if T ∗ = T .

A linear operator T is said to be self-

Remark 3 Hermitian matrices Recall that in an elementary course on Linear Algebra or Matrix Theory, a square matrix A ∈ Cn,n is said to be Hermitian, if

86

2 Linear Analysis

T A = A . Thus, in view of (2.2.15), when considered as a linear operator, a square matrix A ∈ Cn,n is self-adjoint if and only if it is Hermitian. Henceforth, for linear operators T , which may not be square matrices, we say that T is Hermitian if it is self-adjoint. Clearly, if a matrix A ∈ Cn×n is Hermitian, then all the diagonal entries of A are real. It will be shown in the next section that all eigenvalues of self-adjoint operators are real in general.  Example 8 Write down the (operator) adjoints A∗1 , . . . , A∗4 for the following matrices:     14 1 −i , A2 = , A1 = 32 i 2     1i 2 1−i , A4 = . A3 = i 2 1+i 3 Solution By applying (2.2.15), we have A∗1 and A∗3



 13 = , 42

A∗2



 1 −i = , i 2



   1 −i 2 1−i ∗ = , A4 = . −i 2 1+i 3

Observe that A3 and A∗3 are both symmetric, but they are not Hermitian. On the other hand, A∗2 = A2 and A∗4 = A4 . Thus A2 and A4 are Hermitian, but they are not  symmetric. Finally, A1 and A∗1 are neither symmetric nor Hermitian. Exercises Exercise 1 Show that (2.2.1) is equivalent to the totality of (2.2.2) and (2.2.3). Exercise 2 Complete the solution of Example 1 by filling in more details. Exercise 3 Recall that R can be considered as a vector space V over R itself. Hence, any function f : R → R (that is, y = f (x) ∈ R for all x ∈ R) is a transformation of V = R to W = R. Identify all linear transformations among the functions in the following by verifying the condition (2.2.1) in Definition 1. Justify your answers with reasoning. (a) (b) (c) (d)

f (x) = x 2 , for x ∈ R. g(x) = 2x, for x ∈ R. h(x) = −5x, for x ∈ R. k(x) = x + 2, for x ∈ R.

Exercise 4 Let V = Cm,n be the set of m × n matrices with complex entries, with scalar multiplication, matrix addition, and matrix multiplication defined in Sect. 2.1.

2.2 Linear Transformations

87

In particular, for V = Cn , W = Cm , and A ∈ Cm,n , y = Ax is a vector in W = Cm for all x ∈ V = Cn . Identify all linear transformations from C2 to C3 among the following operations T1 , . . . , T4 , by verifying the condition (2.2.1) in Definition 1. Justify your answers with rigorous arguments. ⎡ ⎤ ⎡ ⎤ −1 0 0 (a) T1 x = ⎣ 1 2 ⎦ x + ⎣ 0 ⎦ , for x ∈ C2 . 0 0 1 ⎡ ⎤ ⎡ ⎤ −1 0 1 0 (b) T2 x = ⎣ 0 0 ⎦ x + ⎣ 2 1 ⎦ x, for x ∈ C2 . 1 2 0 −1 ⎡ ⎤ 1 0 (c) T3 x = 3 ⎣ 0 i ⎦ x, for x ∈ C2 . −i 1 ⎡ ⎤ 1 0 (d) T4 x = ⎣ 2 − i 3i ⎦ x, for x ∈ C2 . −i 4 Exercise 5 As a continuation of Exercise 4, identify all linear transformations from C3 to C2 among S1 , . . . , S4 by verifying (2.2.1) in Definition 1. Justify your answers with rigorous arguments.     −1 1 0 0 (a) S1 x = x+ , for x ∈ C3 . 0 20 1     i −1 0 001 (b) S2 x = x− x, for x ∈ C3 . 0 1 i 100   1 0 −i (c) S3 x = 3 x, for x ∈ C3 . 0i 0   1 2−i i −1 (d) S4 x = x, for x ∈ C3 . 0 3i 4 Exercise 6 Verify that any square matrix A is a bounded linear operator when considered as a linear transformation. Hint: Apply the Cauchy-Schwarz inequality to show that |||A|||Cn is bounded by the summation of the Euclidean norms of the row vectors of A. Exercise 7 Let T be the set of all linear transformations from a vector space V over some scalar field F to a vector space W over the same scalar field F. For any T, S ∈ T and any a ∈ F, define T + S and aT by (T + S)x = T x + Sx, for x ∈ V and (aT )x = a(T x), for x ∈ V. Then T can be considered as a vector space (of linear transformations) over F. Justify this statement by considering the set Cm,n of m × n matrices.

88

2 Linear Analysis

Exercise 8 Verify that the transformation T defined in (2.2.6) is a bounded linear functional on  L 1 (J ), with |||T |||V→C = 1. Exercise 9 Verify that the transformation T defined in (2.2.7) is a bounded linear functional on C(J ), with |||T |||V→C = 1. Exercise 10 Verify that the transformation T defined in (2.2.9), where xT is fixed, is a bounded linear functional on V. This is the solution of Example 4 with x0 = xT . Hint: Analogous to Exercise 6, apply the Cauchy-Schwarz inequality to derive an upper bound of the operator norm of T . Exercise 11 Let T be a bounded linear functional on a complete inner-product space V, namely |T x| ≤ c x for all x ∈ V, where c > 0 is a constant. Also, let {vk }∞ k=1 be an orthonormal basis for V. N (a) Show that k=1 x, vk vk 2 ≤ x , for any N ∈ N, x ∈ V. N (b) Apply (a) and the boundedness of T to show that |x, k=1 (T vk )vk | ≤ c x , for any N∈ N, x ∈ V, where c is a constant. N N (T vk )v in (b), show that k=1 |T vk |2 ≤ c2 for any N ∈ N, and (c) For x = k=1 k ∞ 2 hence conclude that k=1 |T vk | < ∞.  (d) Apply (c) to show that the partial sums of xT := ∞ k=1 (T vk )vk is a Cauchy sequence, and hence conclude that xT is in V. Exercise12 Fill in the details in the proof of Theorem 1, by   showing that

T x − c T v

→ 0, as N → ∞, and that c v , vj = k k k k |k|≤N k j (T v j )   v j is valid. k ck j T v j vk , Exercise 13 Let T be the functional on C2 or R2 defined by  T

x1 x2

 = 2x1 − x2 .

(a) Verify that T is linear. (b) Apply each pair of dual bases, as identified in Exercise 8 on p.52 in Sect. 1.4 of Chap. 1, to formulate the representer xT of T . Exercise 14 Write down B ∗j of the following matrices B j , and identify those that are Hermitian.     1 i 1 1−i (a) B1 = . (b) B2 = . 1 + i −i 1+i i     2 1+i −1 2 − i (c) B3 = . (d) B4 = . 1−i 3 2 + i −3

2.3 Eigenspaces

89

2.3 Eigenspaces Corresponding to a linear transformation T from a vector space V to some vector space W, the set of x ∈ V which is mapped to the zero vector 0 in W, namely: N0 = {x ∈ V : T x = 0}

(2.3.1)

is a subspace of V, called the “null space” of T . Indeed, for any x, y ∈ N0 and scalars a, b ∈ F, it follows from the linearity of T in (2.2.1) that T (ax + by) = aT x + bT y = a 0 + b 0 = 0. In this section, we extend the study of the null space to “eigenspaces” for linear operators; that is, for W = V (see Definition 2 on p.80). Since every vector space contains the zero vector 0 (and is therefore never an “empty” set), the vector space N0 is said to be trivial (as opposed to being empty) if T x = 0 implies x = 0. In general, a vector space is said to be nontrivial if it contains at least one non-zero x. Also, for all practical purposes, only the scalar fields C and R are used in our study of eigenspaces. Definition 1 Eigenvalues Let V be a vector space over the scalar field F = C or R, and let I denote the identity operator, meaning that I x = x for all x ∈ V. Then a scalar λ ∈ F (that is, λ is a real or complex number) is called an eigenvalue of a linear operator T on V, if the null space Nλ = {x ∈ V : (T − λI ) x = 0}

(2.3.2)

of the linear operator Tλ = T − λI is nontrivial. Furthermore, if λ is an eigenvalue of T , then any non-zero x ∈ Nλ is called an eigenvector corresponding to (or associated with) the eigenvalue λ. In addition, Nλ is called the eigenspace corresponding to λ. Eigenvalues for square matrices In particular, if T = A is an n × n square matrix, then λ is an eigenvalue of A, if there is a non-zero v ∈ Cn (called a λeigenvector or an eigenvector associated with λ), such that Av = λv; namely, the system (2.3.3) (A − λIn )x = 0 of n linear equations has a nontrivial solution x = v. This agrees with the definition of eigenvalues in an elementary course of Linear Algebra or Matrix Theory, since the (linear) system (2.3.3) has a nontrivial solution x = v, if and only if the matrix A − λIn is singular; or equivalently, det(λIn − A) = 0. Let

90

2 Linear Analysis

P(λ) := det(λIn − A).

(2.3.4)

Then P(λ) is a polynomial of degree n, called the characteristic polynomial of the given matrix A. Thus, λ is an eigenvalue of A, if and only if λ is a (real or complex) root of its characteristic polynomial P(λ). By the Fundamental Theorem of Algebra, if multiplicities of roots are counted, then P(λ) has exactly n (real or complex) roots λ1 , λ2 , . . . , λn ; that is, A has exactly n eigenvalues λ1 , λ2 , . . . , λn , counting multiplicities. When a linear transformation is some square matrix, then the trace of the matrix, defined by the sum of its main diagonal entries, is given by the sum of its eigenvalues, as follows. Definition 2 Trace of matrices The trace of a square matrix A = a jk 1≤ j,k≤n is defined by the sum of its main diagonal entries; namely, Tr(A) =

n 

ajj.

j=1

Theorem 1 Trace of square matrix is sum of its eigenvalues n matrix. Then n  Tr(A) = λj,

Let A be an n × (2.3.5)

j=1

where λ j , 1 ≤ j ≤ n (with multiplicities being listed) are the eigenvalues of A. Let P(λ) be the characteristic polynomial of A defined by (2.3.4). Then P(λ) can be written as P(λ) = λn − (a11 + a22 + · · · + ann )λn−1 + · · · + (−1)n det(A). On the other hand, P(λ) can also be written as P(λ) = (λ − λ1 )(λ − λ2 ) · · · (λ − λn ) = λn − (λ1 + λ2 + · · · + λn )λn−1 + · · · + (−1)n λ1 λ2 . . . λn , where λ j ∈ C, 1 ≤ j ≤ n are the eigenvalues of A. Therefore, equating the coefficient of λn−1 in the above two expressions of P(λ) yields n  j=1

ajj =

n  j=1

λj.



Example 1 Determine the eigenvalues and corresponding eigenspaces of the following 2 × 2 matrices:

2.3 Eigenspaces

91



 14 , 32   1i A3 = , i 2



A1 =

A2 =  A4 =

 1 −i , i 2

 2 1−i . 1+i 3

Solution To compute the eigenvalues and study the eigenspaces of A1 , . . ., A4 , we may compute the determinants of A j − λI , for j = 1, . . . , 4. (a) For the linear operator A1 over R, we have det(A1 − λI ) = (1 − λ)(2 − λ) − 12 = λ2 − 3λ + 2 − 12 = λ2 − 3λ − 10, so that λ=



√ 3 ± 49 3±7 9 + 40 = = , 2 2 2



3+7 3−7 which yields the eigenvalues λ1 = = 5 and λ2 = = −2 of A1 . To 2 2 formulate the corresponding eigenspaces Nλ1 and Nλ2 , consider the matrices 

−4 4 A 1 − λ1 I = 3 −3





 34 and A1 − λ2 I = , 34

with null spaces given by Nλ1 = span

1 1

and Nλ2 = span

 4  , −3

respectively, since [−4 4]

    1 4 = 0 and [3 4] = 0. 1 −3

Observe that Nλ1 and Nλ2 are 1-dimensional vector spaces. (b) For A2 , we have det(A2 − λI ) = (1 − λ)(2 − λ) − (−i)(i) = λ2 − 3λ + 2 − 1 = λ2 − 3λ + 1, so that λ=



that is, the eigenvalues of A2 are

√ √ 3± 5 9−4 = ; 2 2

92

2 Linear Analysis

√ √ 1 1 (3 + 5) and λ2 = (3 − 5). 2 2

λ1 =

We leave the formulation of the eigenspaces Nλ1 and Nλ2 as an exercise (see Exercise 3). (c) For A3 , we have det(A3 − λI ) = (1 − λ)(2 − λ) − (−i)2 = λ2 − 3λ + 3, so that λ=





√ 1 9 − 12 = (3 ± i 3); 2 2

that is, the eigenvalues of A3 are λ1 =

√ √ 1 1 (3 + i 3) and λ2 = (3 − i 3). 2 2

The formulation of the corresponding eigenspaces Nλ1 and Nλ2 is left as an exercise (see Exercise 4). (d) For A4 , we have det(A4 − λI ) = (2 − λ)(3 − λ) − (1 − i)(1 + i) = λ2 − 5λ + 6 − 2 = λ2 − 5λ + 4 = (λ − 4)(λ − 1), so that the eigenvalues of A4 are λ1 = 1 and λ2 = 4. Observe that     1 1−i 2 − λ1 1 − i = A 4 − λ1 I = 1 + i 3 − λ1 1+i 2   1 1−i = 1 + i (1 + i)(1 − i) and

   −2 1 − i 2 − λ2 1 − i = A 4 − λ2 I = 1 + i 3 − λ2 1 + i −1   (i − 1)(1 + i) (i − 1)(−1) = ; 1+i −1 

so that the two eigenspaces are given by Nλ1 = span

1 − i  −1

and Nλ2 = span

 1  , 1+i

2.3 Eigenspaces

93



since [1 1 − i]

1−i −1



 = 0 and [1 + i − 1]

1 1+i

 = 0. 

Let us now turn to the study of linear operators that are not necessarily square matrices. Example 2 Let V = C ∞ (−∞, ∞) be the vector space of infinitely differentiable real-valued functions on R = (−∞, ∞) over R; that is, f ∈ V if and only if f is real-valued and the derivatives f, f  , f  , . . . are continuous functions on R. Consider the linear operator T defined by taking the second derivative. More precisely, let T f = −D 2 f = − f  . Determine the eigenvalues of T and exhibit their corresponding eigenspaces. Solution Observe that since V = C ∞ (−∞, ∞) is not a finite-dimensional space, T cannot be represented as a (finite) square matrix. On the other hand, it is easy to verify that T is a linear operator on V, and the eigenspace corresponding to λ is the null space of T − λI ; namely, the vector space of all x = f (t), for which T x − λx = − f  (t) − λ f (t) = 0 (see Definition 1). In other words, we have to solve the ordinary differential equation y  + λy = 0

(2.3.6)

(for nontrivial solutions) to compute the eigenvalues λ of T . For convenience, the discussion below is divided into three separate cases. (i) Consider λ = 0. Then the solution of (2.3.6) is simply y = a + bt, a, b ∈ R. Hence, λ = 0 is an eigenvalue of T = −D 2 and the corresponding eigenspace is the null space of T , namely: N0 = span{1, t},

(2.3.7)

the space of linear polynomials with algebraic basis {1, t}. (ii) Let λ < 0, and write λ = −μ2 , where μ > 0. Then the differential equation (2.3.6) has two linearly independent solutions: f 1 (t) = e−μt , f 2 (t) = eμt .

(2.3.8)

94

2 Linear Analysis

Hence, every negative number λ = −μ2 is an eigenvalue of −D 2 , with corresponding eigenspace Nλ = N−μ2 = span{e−μt , eμt }.

(2.3.9)

(iii) Let λ > 0, and write λ = μ2 , where μ > 0. Then the differential equation (2.3.6) has two linearly independent solutions g1 (t) = cos μt, g2 (t) = sin μt.

(2.3.10)

Hence, every positive number λ = μ2 is an eigenvalue of T = −D 2 , with corresponding eigenspace Nλ = Nμ2 = span{cos μt, sin μt}.

(2.3.11) 

Remark 1 Eigenfunction For function spaces, eigenvectors are also called eigenfunctions. For example, for a given linear (differential or integral) operator T , a nonzero function f that satisfies T f = λ f , where λ is a scalar, is called a λ-eigenfunction or an eigenfunction associated with the eigenvalue λ of T .  In the following example, we replace the infinite interval (−∞, ∞) in Example 2 by a finite interval and impose certain boundary conditions. Example 3 Let J = [0, c], where c > 0, and consider the set C0∞ (J ) of infinitely differentiable functions f (t) on J with the two-point boundary condition f  (0) = f  (c) = 0. Determine the eigenvalues and corresponding eigenspaces of the linear operator T = −D 2 on C0∞ (J ), as introduced in Example 2. Solution It is clear that C0∞ (J ) is a vector space, since for all a, b ∈ R and f, g ∈ C0∞ (J ), the function h = a f + bg is in C ∞ (J ) and satisfies the two-point boundary condition h  (0) = a f  (0) + bg  (0) = 0, h  (c) = a f  (c) + bg  (c) = 0. We again consider three cases. (i) Consider λ = 0. Then the solution of the two-point boundary value problem 

y  = 0; y  (0) = y  (c) = 0

has the general solution y = q, q ∈ R. Hence, λ0 = 0 is an eigenvalue of T = −D 2 and the corresponding eigenspace is

2.3 Eigenspaces

95

Nλ0 = N0 = span{1}.

(2.3.12)

(ii) Consider λ = −μ2 , μ > 0. Then the solution of the two-point boundary value problem   y − μ2 y = 0; y  (0) = y  (c) = 0 is given by y = f (t) = ae−μt + beμt μ with f  (0) = (−a + b)μ = 0, or a = b; and in addition, f  (c) = (−ae−μc + beμc ) = 0, or a = be2μc . Hence, a = b = 0. That is, T has no negative eigenvalue (since Nλ = N−μ2 = {0}). (iii) Consider λ = μ2 , μ > 0. Then the solution of the two-point boundary value problem   y + μ2 y = 0; y  (0) = y  (c) = 0 is given by y = f (t) = a cos μt + b sin μt, with f  (0) = −aμ sin 0 + bμ cos 0 = bμ = 0, or b = 0; and in addition, f  (c) = −aμ sin μc = 0, or μc = kπ, where k is any integer. Therefore, the eigenvalues of T = −D 2 are λk =

 kπ 2 c

, k = 0, 1, 2, . . . ,

(2.3.13)

since λ0 = 0 has been taken care of in Case (i). The eigenspace Nλk corresponding to each λk , k = 0, 1, 2, . . . , is given by  πkt  , Nλk = span cos c

(2.3.14)

which includes (2.3.12), because cos 0 = 1.  Remark 2 In Sect. 6.5 of Chap. 6, the result in Example 3 will be applied as “spatial” basis functions to solving the heat diffusion partial differential equations, after the spatial and time components are separated.  We now return to Example 1 and observe that the eigenvalues of the Hermitian T T (self-adjoint) matrices A2 and A4 (that is, A∗2 = A2 = A2 and A∗4 = A4 = A4 ) are T real, but the eigenvalues of the symmetric matrix A3 (with A3 = A3 , but A∗3 = A3 ) are not real numbers. This example serves as a motivation of the following result.

96

2 Linear Analysis

Theorem 2 Eigenvalues of self-adjoint operators are real Eigenvalues of a self-adjoint (or Hermitian) linear operator T on any inner-product space over F = C or R are real numbers. Proof The proof is quite simple. Indeed, T ∗ = T implies that λxλ , xλ = λxλ , xλ = T xλ , xλ = xλ , T ∗ xλ = xλ , T xλ = xλ , λxλ = λxλ , xλ , where λ is any eigenvalue and xλ ∈ Nλ is a corresponding eigenvector. Now, being an eigenvector, xλ = 0 or xλ , xλ = xλ 2 > 0, so that λ = λ. Hence, λ is real.  Theorem 3 Orthogonality of eigenspaces Let T be a self-adjoint (or Hermitian) operator on an inner-product space V over F = C or R. Suppose that λ1 and λ2 are two distinct eigenvalues of T . Then the two corresponding eigenspaces Nλ1 and Nλ2 are orthogonal to each other; that is, x, y = 0 for all x ∈ Nλ1 and y ∈ Nλ2 . Proof The proof is similar to that of Theorem 2 . Let x ∈ Nλ1 and y ∈ Nλ2 . Then λ1 x, y = λ1 x, y = T x, y = x, T ∗ y = x, T y = x, λ2 y = λ2 x, y = λ2 x, y , since λ2 is real according to Theorem 2. Therefore, we have (λ1 − λ2 )x, y = 0, which implies that x, y = 0, since λ1 = λ2 .



Remark 3 Orthogonality of left and right eigenvectors If T = A is a Hermitian matrix, then by Theorems 2 and 3, all eigenvalues of A are real, and two (right) eigenvectors associated with two distinct eigenvalues are orthogonal. Corresponding to the second property, we have the following result for general real matrices. For any eigenvalue-eigenvector pair (λ, v) of a square matrix A, the row vector u = v T satisfies u A T = λu. We will call u a left eigenvector of the matrix A T . For convenience, we may also call the eigenvector v a right eigenvector of A. If A is a real matrix, then since A T is the adjoint of A, the above proof of Theorem 3 can be adopted to show that a left eigenvector corresponding to some eigenvalue λ1 and a right eigenvector corresponding

2.3 Eigenspaces

97

to an eigenvalue λ2 are orthogonal to each other, provided that λ1 = λ2 . This result extends Theorem 3 to non-self-adjoint (but real) matrices (see Exercise 17 ).  Example 4 Let A1 and A4 be the matrices  A1 =

 14 , 32

 A4 =

2 1−i 1+i 3



considered in Example 1. Apply A1 to illustrate the above Remark 3, and A4 to illustrate Theorem 3. Solution For A1 , recall from Example 1 that the eigenvectors corresponding to the eigenvalues λ1 = 5 and λ2 = −2 are v1 =

    1 4 , v2 = , 1 −3

respectively. Since A1 is not self-adjoint, we do not expect v1 to be orthogonal to v2 , although the two eigenvalues are different. However, for the eigenvalue λ1 = 5, we have   −4 4 ; A 1 − λ1 I 2 = 3 −3 and it is easy to verify that the row vector u = [3, 4] is a left eigenvector associated with λ1 = 5. From Remark 3, since λ1 = λ2 , this vector u must be orthogonal to the (right) eigenvector v2 associated with the eigenvalue λ2 = −2. This is indeed the case, namely:   4 = 12 − 12 = 0. uv2 = [3, 4] −3 As to the matrix A4 , it is clear that it is self-adjoint. From Example 1, we already know that the eigenvectors are (non-zero) constant multiples of  v1 =

   1−i 1 , v2 = −1 1+i

associated with the corresponding eigenvalues λ1 = 1 and λ2 = 4. Since λ1 = λ2 and A4 is Hermitian, we expect v1 and v2 to be orthogonal to each other, as assured by Theorem 3. That this is indeed the case can be verified by direct calculation, namely:  v1 , v2 = v1T v2 = [1 − i, −1]

1 1−i

 = 1 − i − (1 − i) = 0. 

98

2 Linear Analysis

Example 5 As a continuation of Example 3, where the eigenvalues of the differential linear operator T = −D 2 are λk =

 kπ 2 c

, k = 0, 1, 2, . . . ,

and their corresponding eigenspaces given by  πkt  Nλk = span cos , k = 0, 1, 2, 3, . . . , c in view of (2.3.12) and (2.3.14), verify that T is self-adjoint on C0∞ (J ) and that the eigenspaces Nλk are mutually orthogonal. Solution For f, g ∈ C0∞ (J ), it follows by the two-point boundary condition and integration by parts that 

 − f  (t) g(t)dt 0  c  c = − f  (t)g(t) + f  (t)g  (t)dt 0 0  c  c  c   = − f (t)g(t) + f (t)g (t) − f (t)g  (t)dt 0 0 0  c

 = f (t) − g  (t) dt =  f, T g , c

T f, g =

0

since f  (0) = f  (c) = 0 and g  (0) = g  (c) = 0. That is, T is self-adjoint. In addition, observe that for all j, k = 0, 1, . . . , we have cos

πkt π jt , cos = c c



c

cos

πkt π jt cos dt c c

0  c π = cos jt cos ktdt π 0   π  c  π cos( j − k)tdt + cos( j + k)tdt = 2π 0 0 c  sin( j − k)t π  sin( j + k)t π  = 0, + = 0 0 2π j −k j +k

which is valid whenever j = k.



Of course an equivalent statement of Theorem 3 can be formulated as follows: “For self-adjoint operators on an inner-product space, eigenvectors corresponding to different eigenvalues are orthogonal to one another”. Hence, to derive an orthonormal family from all the eigenvectors, it is sufficient to replace the algebraic basis of every

2.3 Eigenspaces

99

eigenspace by an orthonormal basis. The procedure to achieve this goal is to apply the Gram-Schmidt orthogonalization process introduced in Sect. 1.4 of Chap. 1. Next, we observe that the self-adjoint matrices A2 and A4 in Example 1 also satisfy the property: (2.3.15) x, Ax ≥ 0, for all x ∈ C2 , where A = A2 , A4 . We only verify the case A = A2 and leave the case A = A4 as an exercise (see Exercise 9). Indeed, for x = [a, b], where a, b ∈ C, we have 

1 −i x, A2 x = [a b] i 2

     a a − ib = [a b] b ia + 2b

= a(a + ib) + b(−ia + 2b) = aa + iab − iab + 2bb = (a − ib)(a − ib) + |b|2 = |a − ib|2 + |b|2 ≥ 0. Motivated by this example, we introduce the concept of A selfDefinition 3 Self-adjoint positive semi-definite (spd) operators adjoint linear operator T on an inner-product space V over the scalar field F = C or R is said to be positive semi-definite if T x, x ≥ 0 for all x ∈ V. Furthermore, an spd operator is called positive definite, if T x, x = 0 implies that x = 0. In the literature, “spd” stands for both “self-adjoint positive semi-definite” and “self-adjoint positive definite” operators. Example 6 As discussed above, the matrices A2 and A4 in Example 1 are spd operators on the inner-product space C2 over the scalar field C. Example 7 The linear operator T = −D 2 in Example 3 is spd on the inner-product space C∞ 0 (J ) over R, where J = [0, c]. Solution Indeed, for any f ∈ C0∞ (J ), an application of integration by parts, as in Example 5 with g = f , immediately yields: 

c

f  (t) f (t)dt  c  c  = − f (t) f (t) + f  (t) f  (t)dt 0 0  c   2  f (t) dt ≥ 0. =

T f, f = −

0

0



100

2 Linear Analysis

There is a distinction between the above two examples. While A2 x, x = |a − ib|2 + |b|2 = 0 implies a = b = 0 in Example 6, the differential opera2 7 annihilates the nonzero constant function q, in that tor T = −D  c in Example   d 2 T q, q =  q  dt = 0 although q = 0. Hence, A2 is positive definite while 0 dt T = −D 2 is positive semi-definite, though both operators are called spd, being self-adjoint operators. Theorem 4 Eigenvalues of spd operators are ≥ 0 Let T be an spd linear operator on an inner-product space over F = C or R. Then all eigenvalues λ of T are non-negative. The proof is evident. Indeed, if λ is an eigenvalue of an spd operator T with corresponding eigenvector xλ , then λxλ , xλ = T xλ , xλ ≥ 0 so that λ ≥ 0, since xλ , xλ = xλ 2 > 0.



However, computation of eigenvalues is not an easy task in general. Even for square matrices A, computing the determinant of A − λI , followed by finding the roots (also called zeros) of the characteristic polynomial det(A − λI ), is a very difficult task for square matrices of high dimensions. The task would not seem to be feasible for linear operators on infinite-dimensional inner-product spaces in general. Fortunately, there are numerical algorithms available in the literature, particularly for spd linear operators T , for which 0 ≤ λx, x = λx, x = T x, x (provided that λ ≥ 0 is an eigenvalue and x a corresponding eigenvector), so that λ=

T x, x .

x 2

From the information that all eigenvalues of an spd linear operator T are non-negative, it is at least intuitively clear that the smallest eigenvalue is given by λ0 = inf

x=0

T x, x ,

x 2

(2.3.16)

where “inf” stands for “infimum”, which means the “greatest lower bound”. The T x, x quotient in (2.3.16) is called the “Rayleigh quotient” for the spd operator T .

x 2 For a subspace W ⊂ V, let W⊥ denote the orthogonal complement of W in V, as defined by ( 1.3.11) on p.28. Hence, it follows from Theorem 3 that the second smallest eigenvalue  λ1 satisfies   T x, x ⊥  : 0  = x ∈ N λ1 = inf λ0 .

x 2

(2.3.17)

2.3 Eigenspaces

101

Observe that since the infimum in (2.3.17) is taken over a proper subset of V as compared to the set V in (2.3.16), we have  λ1 > λ0 . In general, by listing only the distinct eigenvalues of the spd linear operator T in increasing order; namely, λ1 <  λ2 < . . . 0 ≤ λ0 <  (where  λ0 := λ0 ), we have  T x, x  ⊥  : 0  = x ∈ N , j = 0, . . . , n . λn+1 = inf  λj

x 2

(2.3.18)

The minimization model in (2.3.16)–(2.3.18) provides a platform for computing numerical approximations of the distinct eigenvalues of spd operators. In the above λ0 ,  λ1 , . . . are discussion, we use the notation  λ j instead of λ j to emphasize that  distinct. Remark 4 Observe that the minimization model (2.3.16)– (2.3.18) allows for the computation of infinitely many eigenvalues for a general spd operator in increasing order, starting with λ0 ≥ 0. However, if the spd operator is an n × n matrix An , then we may modify (2.3.16)– (2.3.18) to allow for computation of the eigenvalues of An in decreasing order, namely: λ1 = sup

 A x, x  n : x = 0 2

x

(2.3.19)

(see Theorem 1 and Theorem 2 to be derived in Sect. 3.2 of the next chapter). Furthermore, in order to match the indices with those of the eigenvectors, multiplicities of the eigenvalues must be taken into consideration. To do so, we change the above indexing notation to λ1 = · · · = λr1 > λr1 +1 = · · · = λr2 +r1 > · · · > · · · = λn ≥ 0, where r1 = dim Nλ1 , r2 = dim Nλr1 +1 , . . ., and λr1 +1 = sup λr1 +r2 +1 = sup

  A x, x n ⊥ : 0  = x ∈ N λ1 ;

x 2

  A x, x n ⊥ ⊥ : 0  = x ∈ N ∪ N λ λ r1 +1 r1 +1

x 2

etc., and finally λn = sup

  A x, x An x, x n ⊥ = inf : 0  = x ∈ N , λ > λ , n λ 2 x=0 x 2

x

where (2.3.16) is applied to formulate the last quantity in (2.3.20).

(2.3.20) 

102

2 Linear Analysis

Remark 5 Let An be an n × n spd matrix, with eigenvalues λ1 , . . . , λn , listed in non-increasing order according to multiplicities as determined by the dimensions of the eigenspaces (see Remark 4 ), so that λ1 ≥ λ2 ≥ · · · ≥ λn ≥ 0. According to Theorem 3 and by applying the Gram-Schmidt orthonormalization process studied in Sect. 1.4 of Chap. 1, there is an orthonormal basis {v1 , . . . , vn } of Cn , such that (λ1 , v1 ), (λ2 , v2 ), . . . , (λn , vn ) are eigenvalue-eigenvector pairs. Hence, An v1 = λ1 v1 , An v2 = λ2 v2 , . . . , An vn = λn vn , and in (compact) matrix formulation, ⎡ ⎢ An V = V ⎣

λ1 O

O ..

.

⎤ ⎥ ⎦,

(2.3.21)

λn

where V denotes the n × n matrix [v1 . . . vn ] with the jth column being the  vector v j . Remark 6 In the next section, the formulation (2.3.21) is extended from spd matrices to “normal” matrices and provides the so-called “spectral decomposition” of all normal matrices.  T

Remark 7 If X is an m × n rectangular matrix, then An = X X is an n × n spd matrix (see Exercise 15). Hence, we may apply Theorem 3 to each eigenspace of An to obtain the formulation (2.3.21). Since λ j ≥ 0 for j = 1, . . . , n, we may also write λ j = σ 2j , where σ j ≥ 0. Then σ1 ≥ . . . ≥ σn ≥ 0 are called the “singular values” of the m × n matrix X . Note that if n > m, then it is necessary that, σm+1 = . . . = σn = 0. In Sect. 3.1 of Chap. 3, we will apply this result to introduce and study the notion of singular value decomposition of the matrix X .  Exercises Exercise 1 Determine the null spaces of the following linear transformations. ⎡ ⎤ 1 −1 1 (a) T1 = ⎣ 2 1 0 ⎦ : R3 → R3 . 1 2 −1 ⎡ ⎤ i 1 + i 3 −1 + 3i 2i ⎦ : C3 → C3 . (b) T2 = ⎣ 2i 2i 5 0 2 1 −2 + 4i

2.3 Eigenspaces

103

(c) (T3 f )(x) = f  (x) − 2 f  (x): C 2 [0, 1] → C[0, 1], where C 2 [0, 1] denotes the vector space of functions f defined on [0, 1] such that f, f , f  are continuous on [0, 1]. (d) (T4 f )(x) = f  (x) + f (x): C 2 [0, 1] → C[0, 1], where C 2 [0, 1] is defined as in (c). Exercise 2 Compute the eigenvalues of the following operators:   −1 1 (a) C1 = , 1 −1   0 2 (b) C2 = , −1 1   1 i , (c) C3 = −i 3 ⎡ ⎤ 10 0 (d) C4 = ⎣ 0 2 0 ⎦. 0 0 −3 Exercise 3 Determine the two eigenspaces of the matrix A2 in Example 1. Exercise 4 Determine the two eigenspaces of the matrix A3 in Example 1. Exercise 5 Determine the eigenspaces of C1 , . . . , C4 in Exercise 2 by exhibiting their bases. Exercise 6 Let V = C ∞ (−∞, ∞) be a subspace of  L 2 (−∞, ∞), as defined in Example 2. Determine the eigenvalues of the following linear operators on V: (a) (b) (c) (d)

T1 f T2 f T3 f T4 f

= 3 f , = −( f  + f ), = − f  + 2 f , = f −4f.

Exercise 7 Let C0∞ (J ), J = [0, c] be the subspace of  L 2 (J ) as defined in Example 3. Determine the eigenvalues, if any, of the linear operators T1 , . . . , T4 , in Exercise 6, on C0∞ (J ). For those that have eigenvalues, determine the corresponding eigenspaces. Exercise 8 As a continuation of Exercise 6, determine all the eigenvalues of the operator T f = f (4) on the subspace C ∞ (−∞, ∞) of the inner product space  L 2 (−∞, ∞). In particular, exhibit a basis of the null space of T (that is, the eigenspace of the zero eigenvalues). Exercise 9 Verify that the matrix A = A4 in Example 1 satisfies (2.3.15). Exercise 10 Consider the subspace W0 = f ∈ C ∞ [−1, 1] : f  (−1) = f  (−1) = f  (1) = f  (1) = 0

!

104

2 Linear Analysis

of the inner-product space  L 2 [−1, 1], and the operator T on W0 defined by T f = f (4) for f ∈ W0 . Show that T is a self-adjoint operator. Exercise 11 Let λ = ν 4 , where ν > 0, and set f 1 (x) = cos νx, f 2 (x) = sin νx, f 3 (x) = cosh νx, and f 4 (x) = sinh νx. (a) Show that f 1 , f 2 , f 3 , f 4 are linearly independent. (b) Show that f 1 , f 2 , f 3 , f 4 are eigenvectors of the operator T defined by T f = f (4) associated with the eigenvalue λ = ν 4 . (c) Let gλ (x) = c1 f 1 (x)+c2 f 2 (x)+c3 f 3 (x)+c4 f 4 (x). Verify that for each ν = kπ, where k is an arbitrary integer = 0, (λ, gλ ) is an eigenvalue-eigenvector pair of the linear operator T on the inner-product subspace W0 in Exercise 10, if and only if c3 = c4 = 0. (d) Repeat the problem in (c) for each ν = (k + 21 )π, where k is an arbitrary integer. Exercise 12 As a continuation of (c) and (d) in Exercise 11, determine the eigenspaces of T for the eigenvalues λ = (kπ)4 , where k is an arbitrary non-zero integer. 4

Exercise 13 Repeat Exercise 12 for the eigenvalues λ = (k + 21 )π , where k is an arbitrary integer. Exercise 14 Verify (2.3.16) for the matrix A = A4 in Example 1. Hence, A4 is an spd (self-adjoint positive definite) linear operator (see Definition 3). Exercise 15 Let X be any m × n (rectangular) matrix of complex numbers, and consider the n × n and m × m square matrices T

T

An = X X, Bm = X X . Show that both An and Bm are spd (self-adjoint positive semi-definite) matrices. Exercise 16 Compute the eigenvalues and determine the corresponding eigenspaces of the matrix ⎡ ⎤ 110 A = ⎣1 1 0⎦. 000 Then apply the Gram-Schmidt process studied in Sect. 1.4 of Chap. 1 to compute an orthonormal basis of R3 in terms of the eigenvectors of A. Exercise 17 Let A be an n × n real matrix, and u ∈ C1,n , v ∈ Cn,1 be vectors that satisfy Av = λ1 v, u A = λ2 u. Prove that if λ1 = λ2 , then u and v are orthogonal to each other; that is, u, v = nk=1 u k vk = 0. Exercise 18 Show that for any n × n matrices A and B, Tr(AB) = Tr(B A).

2.4 Spectral Decomposition

105

2.4 Spectral Decomposition Loosely speaking, the set of all eigenvalues of a linear operator T on an inner-product space V is called the spectrum σ(T ) of T . Indeed, this is a correct statement if V is finite-dimensional, but σ(T ) could be a larger set in general. Definition 1 Spectrum Let T be a linear operator on an inner-product space V over F = C or R. Then the spectrum of T , denoted by σ(T ), is the set of λ ∈ F, for which the operator (2.4.1) Tλ := T − λI is singular (that is, not invertible). Example 1 Recall from the previous section that λ ∈ C is an eigenvalue of a matrix A ∈ Cn,n , if it is a root of the characteristic polynomial P(λ) = det(A − λIn ) (see (2.3.4) on p.90). In other words, λ is an eigenvalue of A, if and only if the  (matrix) operator A − λIn is singular, or λ ∈ σ(A). Example 2 Let  2 be the inner-product space of the one-sided infinite sequences x = (x0 , x1 , . . .), with

∞ 

|x j |2 < ∞,

j=0

and with the inner product defined, analogous to 2 , by x, y =

∞ 

xj y j,

j=0

where y = (y0 , y1 , . . .) ∈  2 as well. It is clear that the right-shift operator L on  2 , defined by (2.4.2) Lx = L(x0 , x1 , . . .) = (0, x0 , x1 , . . .), is a linear operator on  2 . Verify that L has no eigenvalues but 0 ∈ σ(L). Solution Let us first observe that λ0 = 0 is not an eigenvalue. Indeed, if Lx = 0x = 0,

106

2 Linear Analysis

then (0, x0 , x1 , . . .) = 0, or x0 = x1 = · · · = 0, or x = 0, so that Nλ0 = {0}. Next, if λ1 = 0 would be an eigenvalue of L, then Lx = λ1 x can be written as (0, x0 , x1 , x2 , . . .) = λ1 (x0 , x1 , x2 , x3 , . . .), or equivalently, 0 = λ1 x0 , x0 = λ1 x1 , . . . , x j = λ1 x j+1 , . . . . Hence, λ1 = 0 implies that x0 = 0, 0 = x1 , 0 = x2 , . . . , 0 = x j+1 , . . .; that is, x = 0. In other words, Nλ1 = {0}. Therefore, the operator L does not have any eigenvalue. On the other hand, to show that 0 ∈ σ(L), it is equivalent to verifying that L − 0I = L is not invertible on  2 . Indeed, if M would be an inverse of L, then M must be a left-shift operator: (2.4.3) M(y0 , y1 , y2 , . . .) = (y1 , y2 , . . .), in order to yield M L(x0 , x2 , . . .) = (x0 , x1 , . . .), for all (0, x0 , x1 , . . .) in the range of L. However, if y0 = 0, then y = (y0 , y1 , . . .) cannot be recovered from My in (2.4.3) since the value y0 is lost. That is, M does not have an inverse. Hence, L cannot be an inverse of M (or M cannot be an inverse of L). In summary, although L does not have any eigenvalue, the spectrum σ(L) of L is not empty.  Definition 2 Normal operators and normal matrices A linear operator T on an inner-product space V over C or R said to be normal, if it commutes with its adjoint T ∗ ; that is, (2.4.4) T T ∗ = T ∗ T. In particular, a square matrix A is called a normal matrix if it satisfies A A∗ = A∗ A. Clearly, if A is Hermitian, namely A∗ = A, then A is normal. Note that  2 in Example 2 is an infinite-dimensional vector space. In view of Examples 1–2, we will only discuss finite-dimensional inner-product spaces V over F = C or R in this section. First we recall the definitions of orthogonal and unitary matrices.

2.4 Spectral Decomposition

107

Definition 3 Unitary and orthogonal matrices numbers is called a unitary matrix, if

An n × n matrix A of complex

A A ∗ = In .

(2.4.5)

In particular, if A is a real matrix, then the unitary matrix A is also called an orthogonal matrix; and in this case, A A T = In .

(2.4.6)

From the definition, we see that a unitary (and particularly an orthogonal) matrix T is nonsingular with inverse given by A−1 = A∗ = A (and A−1 = A T , if A is real), so that A∗ A = A A∗ = In . Theorem 1 For an n × n matrix A, write ⎤  a1 ⎢ ⎥ A = [a1 , a2 , . . . , an ] = ⎣ ... ⎦ , ⎡

 an

where a j is the j-th column of A and  a j is the j-th row of A. Suppose that A is a unitary matrix. Then a j , ak = δk− j , 1 ≤ j, k ≤ n. a j , ak = δk− j ,  The proof follows from the definition (2.4.5) and the formula ⎤ ⎡ T a¯ 1T a¯ 1 a1 a¯ 1T a2 ⎥ ⎢ ⎢ .. A∗ A = ⎣ ... ⎦ [a1 , a2 , . . . , an ] = ⎣ ... . a¯ nT a¯ nT a1 a¯ nT a2 ⎡

⎤ . . . a¯ 1T an .. .. ⎥ , . . ⎦ T . . . a¯ n an

so that the ( j, k)-entry of A∗ A is given by a¯ Tj ak = a j , ak , which is also equal to a j , ak = δk− j , 1 ≤ j, δk− j by (2.4.5). Similarly, from A A∗ = In , one can obtain  k ≤ n.  From the above theorem, we see that each row of a unitary matrix A ∈ Cn,n is a unit vector, and that all the row vectors are orthogonal to one another; that is, the row vectors of a unitary matrix in Cn,n constitute an orthonormal basis of Cn . Similarly, the column vectors of a unitary matrix in Cn,n also constitute an orthonormal basis of Cn . In particular, the set of row vectors and the set of column vectors of an orthogonal matrix in Rn,n are both orthonormal bases of Rn . Theorem 2 Unitary invariance property the property:

Every unitary matrix U ∈ Cn,n has

108

2 Linear Analysis

U x, U y = x, y , for all x, y ∈ Cn ,

(2.4.7)

U x − U y = x − y , for all x, y ∈ Cn .

(2.4.8)

or equivalently, In particular, if U ∈ Rn,n is an orthogonal matrix, then (2.4.7) and (2.4.8) hold for all x, y ∈ Rn . If U is unitary, then U x, U y = x, U ∗ U y = x, y which is (2.4.7); and

U x − U y 2 = U x − U y, U x − U y = U (x − y), U (x − y) = x − y, x − y = x − y 2 , so that (2.4.8) is equivalent to (2.4.7).



The property (2.4.7) can be interpreted as invariance property of “angles” between vectors under unitary transformation, and the property (2.4.8) says that the distance between any two vectors is invariant under unitary transformation as well. In particular, by setting y = 0 in (2.4.8), we see that unitary transformations also preserve vector lengths, in that

U x = x . Of course, these invariant properties are valid for any real vectors under orthogonal transformation. Theorem 3 Spectral theorem Let A be an n × n normal matrix. Then there is a unitary matrix U and a diagonal matrix  = diag{λ1 , . . . , λn }, such that A = U U ∗ .

(2.4.9)

The factorization of the matrix A in (2.4.9) is called the “spectral decomposition” of A. The spectral decomposition (2.4.9) can be re-formulated as U −1 AU = diag{λ1 , . . . , λn }.

(2.4.10)

Hence, any normal matrix A is “diagonizable”, in the sense that A is “similar” to a diagonal matrix, namely: the diagonal matrix  = diag{λ1 , . . . , λn }. Moreover, from the equivalent formulation AU = U ,

2.4 Spectral Decomposition

109

it follows that each of λ1 , . . . , λn , on the main diagonal of , is an eigenvalue of A, and the jth column of U is an eigenvector of A associated with the eigenvalue λ j , for all j = 1, . . . , n. Examples of normal matrices, as defined by (2.4.4), include all unitary matrices and all Hermitian matrices. To prove the spectral theorem, let us first establish the following weaker result, where the diagonal matrix  is replaced by an upper-triangular matrix T = [t jk ]1≤ j,k≤n , with t jk = 0 for all j > k. Theorem 4 Unitary-triangular matrix decomposition For any given n × n matrix A, there exist a unitary matrix U and an upper-triangular matrix T , such that A = U T U ∗.

(2.4.11)

Proof This theorem can be proved by mathematical induction, since the theorem trivially holds for n = 1. For n > 1 and A ∈ Cn×n , let v1 be some unit eigenvector associated with an eigenvalue λ1 of A. We then formulate the unitary matrix V = [v1 v2 · · · vn ], by extending v1 to an orthonormal basis {v1 , v2 , . . . , vn } of Cn . Since v2 , . . . , vn are orthogonal to v1 , and Av1 = λ1 v1 , we have ⎡ ⎤ v1∗ ⎢ λ1 ⎢ ⎥ V ∗ AV = ⎣ ... ⎦ [ λ1 v1 ∗ · · · ∗ ] = ⎢ ⎣...

⎤ .. . ∗ ... ∗ ⎥ ... ... ... ...⎥ ⎦, .. O . B



vn∗

where O is a zero column vector and B is an (n − 1) × (n − 1) matrix, which, by applying the induction hypothesis, can be written as B = W T1 W ∗ , where W is some (n − 1) × (n − 1) unitary matrix and T1 an (n − 1) × (n − 1) upper-triangular matrix. Therefore, by introducing the matrix ⎡

⎤ .. 1 . O ⎢ ⎥ ⎥ U=V⎢ ⎣... ... ...⎦, .. O . W we have the following decomposition formulation:

110

2 Linear Analysis

⎤⎡ ⎤ ⎤⎡ .. .. .. λ 1 . O 1 . O . ∗ . . . ∗ ⎥⎢ 1 ⎥ ⎢ ⎥⎢ ⎥⎢ ⎥ ⎥⎢ U ∗ AU = ⎢ ⎣... ... ... ⎦⎣... ... ... ... ...⎦⎣... ... ...⎦ . .. . O .. W O . W∗ O .. B ⎡ ⎤ ⎤ ⎡ .. .. ⎢ λ1 . ∗ . . . ∗ ⎥ ⎢ λ1 . ∗ . . . ∗ ⎥ ⎥ ⎥ ⎢ =⎢ ⎣............... ⎦ = ⎣... ... ... ... ...⎦, .. . O . W ∗ BW O .. T ⎡

1

where the matrix on the right is an upper-triangular matrix. Since U is unitary (because both V and W are unitary), we may conclude that A has the desired matrix decomposition as in (2.4.11), completing the induction argument.  We are now ready to prove Theorem 3, by applying Theorem 4. Proof of Theorem 3 By Theorem 4, we may write A = U T U ∗ , where U is unitary and T = [t jk ]1≤ j,k≤n upper-triangular, with t jk = 0 for all j > k. In the following, we will prove that T is actually a diagonal matrix, and thereby complete the proof of the theorem. Indeed, from the assumption that A is normal, we have U (T ∗ T )U ∗ = (U T ∗ U ∗ )(U T U ∗ ) = A∗ A = A A∗ = (U T U ∗ )(U T ∗ U ∗ ) = U (T T ∗ )U ∗ , so that

T ∗ T = T T ∗.

(2.4.12)

Observe that the (1, 1)-entry of T ∗ T is |t11 |2 , while the (1, 1)-entry of T T ∗ is |t11 |2 + |t12 |2 + · · · + |t1n |2 . Hence, from (2.4.12), we have |t11 |2 = |t11 |2 + |t12 |2 + · · · + |t1n |2 , and therefore t12 = 0, t13 = 0, . . . , t1n = 0. This enables us to write

⎤ .. ⎢ 1 . O⎥ ⎥ T =⎢ ⎣ . . . . . . . . . ⎦, . O .. T1 ⎡

where T1 = [t jk ]2≤ j,k≤n is an (n − 1) × (n − 1) upper-triangular matrix. By applying (2.4.12) again, we obtain T1∗ T1 = T1 T1∗ , which yields t23 = t24 = . . . = t2n = 0,

2.4 Spectral Decomposition

111

by the same argument as above. Hence, repeating the same process again and again, we may conclude that T =  is a diagonal matrix, so that A = U T U ∗ = U U ∗ , completing the proof of Theorem 3.  Next we derive the following property of operator norms for normal operators. Theorem 5 Norm preservation by adjoint of normal operators normal operator on an inner-product space V. Then T satisfies

T x = T ∗ x , for any x ∈ V;

Let T be a

(2.4.13)

or equivalently, |||T |||V = |||T ∗ |||V . Proof The proof of (2.4.13) is evident from the definition of normal operators. Indeed, for any v ∈ V, since T ∗ T = T T ∗ , we have

T v 2 = T v, T v = T ∗ T v, v = T T ∗ v, v = T ∗ v, T ∗ v = T ∗ v 2 . Consequently, by the definition of operator norms in (2.2.4) on p.80, we have, by  using the notation in (2.2.5) on the same page, that |||T |||V = |||T ∗ |||V . Exercises Exercise 1 Let A be an n × m and B an n × p matrix. Denote A∗ B = C = [c j,k ]. Show that if A = [a1 , . . . , an ] and B = [b1 , . . . , b p ], then c j,k = a j , bk = a j , bk . Exercise 2 Let A⎡be an⎤m × n and⎡B a ⎤ p × n matrix. Denote AB ∗ = C = [c j,k ]. b1 a1 ⎢ .. ⎥ ⎢ .. ⎥ Show that if A = ⎣ . ⎦ and B = ⎣ . ⎦, then c j,k = a j , bk . am

bp

Exercise 3 For each of the following matrices, compute its eigenvalues and corresponding eigenvectors with unit norm; and compute the required unitary (or orthogonal) matrix to formulate its spectral decomposition.   21 (a) A1 = . 11   2 −4 (b) A2 = . 1 1 ⎡ ⎤ 1 0 1 (c) A3 = ⎣ 0 −1 0 ⎦. 1 0 0 ⎡ ⎤ 01 0 (d) A4 = ⎣ 1 1 0 ⎦. 0 0 −1

112

2 Linear Analysis

Exercise 4 Identify which of the following matrices are diagonalizable. Justify your answers with rigorous arguments. ⎡ ⎤ 101 (a) B1 = ⎣ 0 2 0 ⎦. 003 ⎡ ⎤ −1 0 0 (b) B2 = ⎣ 2 1 0 ⎦. 0 01 ⎡ ⎤ 0 12 (c) B3 = ⎣ 0 0 0 ⎦. −1 0 0 ⎡ ⎤ 0 0 0 0 ⎢ 0 0 −1 0 ⎥ ⎥ (d) B4 = ⎢ ⎣ 0 −1 0 0 ⎦ . 0 0 0 0 Exercise 5 In the following, identify those matrices that are normal, those that are unitary, and those that are Hermitian (that is, self-adjoint). Justify your answers with rigorous arguments.   1 −i 1 (a) C1 = √ . 2 −i 1   0 1 (b) C2 = . −1 0   1 −i . (c) C3 = √1 2 i 1 ⎡ ⎤ 1 −2 2 (d) C4 = 13 ⎣ 2 2 1 ⎦ . 2 −1 −2 Exercise 6 Compute the trace of each of the matrices A1 , A2 , A3 , A4 (in Exercise 3) and C1 , C2 , C3 , C4 (in Exercise 5) Exercise 7 In the following, identify all unitary (or orthogonal) matrices. For those that are not unitary (or orthogonal), normalize the appropriate column vectors (by dividing them with their norms, called normalization constants), so that the modified matrices become unitary (or orthogonal).   2 −2 (a) D1 = . 2 2 ⎡ ⎤ 1 2 2 (b) D2 = ⎣ 2 −2 1 ⎦. 2 1 −2

2.4 Spectral Decomposition

113



⎤ 1 1 1 1 ⎢ 1 −1 −1 1 ⎥ ⎥ (c) D3 = 21 ⎢ ⎣ 1 1 −1 −1 ⎦. 1 −1 1 −1 ⎡ ⎤ 1 1 1 1 (d) D4 = ⎣ 1 i −1 −i ⎦ . 1 −i −1 i Exercise 8 For each of D1 , D2 , D3 , D4 in Exercise 7, without performing matrix multiplications, compute the distance between the transformed vectors, by applying (2.4.8) for unitary matrices. Hint: If D j is not unitary but its modification is, as in Exercise 7, then the normalization constants can be transferred to the corresponding components of the given vectors x and y. (a) (b) (c) (d)

D1 x − D1 y , where x = (1, 2), y = (−1, 1).

D2 x − D2 y , where x = (1, −1, 1)), y = (0, 1, 0).

D3 x − D3 y , where x = (1, 0, −1, 1), y = (0, 1, 1, 1).

D4 x − D4 y , where x = (i, 0, −i, 1), y = (0, 1, 0, 0).

Exercise 9 As a continuation of Exercises 7–8, compute the angles between the two vectors D j x and D j y, for each j = 1, . . . , 4, where x, y are given in (a), (b), (c), or (d) in Exercise 8

Chapter 3

Spectral Methods and Applications

The study of linear transformations, eigenvalue problems, and spectral decomposition in Chap. 2 is extended in this chapter to singular value decomposition, principal component analysis, and various operator norms for two areas of applications to data representations. In Sect. 3.1, the spectral decomposition studied in Chap. 2 is applied to extending the eigenvalue problem for square matrices to the discussion of the singular value problem for rectangular matrices. As a result, eigenvalues of square matrices are extended to singular values of rectangular matrices, and the spectral decomposition of normal matrices studied in Sect. 2.3 is applied to the derivation of the singular value decomposition (SVD) of arbitrary matrices, particularly those that are not square. It is shown that singular values are preserved under unitary transformations, and this result is applied to introduce the concept of principal components as well as the method of principal component analysis (PCA). When the matrix represents a dataset, in that each row vector is a data point, the method of PCA provides a new coordinate system to facilitate better analysis of the given data. We end this section by proving that the singular values of a square matrix are precisely the absolute values of its eigenvalues, if and only if the matrix is a normal matrix. In Sect. 3.2, the operator norm introduced in Sect. 2.1 of Chap. 2 is elaborated and new operator norms for matrix transformations are introduced. In particular, when the linear transformation is a matrix B ∈ Cm,n and the Euclidean norm is used for both Cn and Cm , then the operator norm |||B|||2 of B is shown to be the largest singular value of the matrix B. In addition, the Frobenius norm is introduced for quantifying the error of approximation of a given matrix by matrices with an arbitrarily pre-assigned lower rank. More generally, the operator norm |||B|||2 and Frobenius norm of B are extended, from p = ∞ and 2, to the Schattern p-norm for arbitrary 0 ≤ p ≤ ∞, defined by the  p -measurement of the singular values of B. This more general Schattern norm, particularly for p = 0 and p = 1, will be instrumental to the study of the subject of compressed (or compressive) sensing in a C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_3, © Atlantis Press and the authors 2013

115

116

3 Spectral Methods and Applications

forthcoming publication of this book series. It is shown in this section that lower-rank matrix approximation of a given matrix B with rank = r is accomplished by writing B as a sum  of rank-1 matrices by re-formulating the SVD of B, while its best approximant with pre-assigned rank not exceeding d < r is unique and is given by the partial sum, consisting of the first d terms of . The last two sections of this chapter, Sects. 3.3 and 3.4, are devoted to the application of SVD and PCA to the study of two areas in data representations, with the first on least-squares data estimation and solving (or finding the closest solution of) an arbitrary system of linear equations, which could be over-determined or underdetermined; and the second on representation of data in a lower-dimensional space. In Sect. 3.3, the notion of pseudo-inverse of a possibly rectangular matrix is introduced in terms of the singular values. Several key properties of the pseudo-inverse are derived, including its agreement with the matrix inverse for non-singular square matrices. The other properties are shown to uniquely determine the definition of the pseudo-inverse. With the pseudo-inverse, the solution (or closest solution) of an arbitrary system of linear equations can be explicitly formulated. In addition, leastsquares estimation of a given data set is formulated in terms of the pseudo-inverse, and it is shown that this least-squares solution is unique under the constraint of minimum 2 -norm. In Sect. 3.4, motivation of the need for high-dimensional data is discussed by considering spectral curves of digital images. An explicit formula of the dimension-reduced data is derived by applying SVD and PCA; and a result that describes the best approximation in the 2 -norm is derived with two demonstrative toy-examples.

3.1 Singular Value Decomposition and Principal Component Analysis In this section, we extend the concept of eigenvalues for square matrices to the notion of singular values that applies even to rectangular matrices in general. We will then introduce and derive the matrix factorization, called singular value decomposition (SVD), which has a host of applications, such as those to be discussed in Sect. 3.3 and Sect. 3.4 of this chapter. Let B be an m × n matrix, and consider its corresponding Gram matrix A, defined by (3.1.1) A = B B∗,  T where B ∗ = B . Then A is self-adjoint and positive semi-definite (spd) (see Remark 5 on p.102), and is therefore normal. Hence, A has the spectral decomposition A = U U ∗

(3.1.2)

by Theorem 3 on p.108, with diagonal matrix  = diag{λ1 , . . . , λm } and unitary matrix

3.1 Singular Value Decomposition and Principal Component Analysis

117

U = [u1 · · · um ], where for each j = 1, . . . , m, (λ j , u j ) is an eigenvalue-eigenvector pair of the matrix A. Furthermore, since A is spd, we may, and will, write λ j = σ 2j , where σ1 ≥ σ2 ≥ · · · ≥ σr > σr +1 = · · · = σm = 0

(3.1.3)

for some r , with 0 ≤ r ≤ m. Hence, the diagonal matrix  in the spectral decomposition (3.1.2) has the more precise formulation  = diag{σ12 , . . . , σr2 , 0, . . . , 0},

(3.1.4)

where we have adopted the standard convention that {σr +1 , . . . , σm } is an empty set if r = m. Furthermore, recall from Theorem 4 on p.76 that rank(B) = rank(B B∗ ) = rank() = r, so that r ≤ min{m, n}. Let r = diag{σ1 , . . . , σr }

(3.1.5)

and consider the m × n matrix ⎡

⎤ .. . O  r ⎢ ⎥ ⎥ S=⎢ ⎣··· ··· ···⎦, .. O . O

(3.1.6)

where O denotes the zero matrix (possibly with different dimensions), so that ⎡

⎤ n

S = ⎣ · · · ⎦ or S = m ... O O

(3.1.7)

if r = n < m or r = m < n, respectively. Observe that the diagonal matrix  in (3.1.4) can be written as  = SS T = SS ∗ , (3.1.8) and the spectral decomposition of A in (3.1.2) can be re-formulated as A = U SS ∗ U ∗ = (U S)(U S)∗ .

(3.1.9)

Since A = B B ∗ , the formulation in (3.1.9) may suggest a factorization of the matrix B as some unitary transformation U of the “diagonal” rectangular matrix S. Unfor-

118

3 Spectral Methods and Applications

tunately, this is not quite correct yet, but an additional unitary transformation V of S will do the job. More precisely, we will show the existence of two unitary matrices U and V , of dimensions m × m and n × n, respectively, such that B = U SV ∗

(3.1.10)

(see Theorem 2 on p.121 to be derived later in this section). To understand the factorization in (3.1.10), let us write U = [u1 , . . . , um ], V = [v1 , . . . , vn ],

(3.1.11)

where {u1 , . . . , um } and {v1 , . . . , vn } are orthonormal bases of Cm and Cn , respectively. Then it follows from (3.1.10) that BV = U S and B ∗ U = V S T , so that (i) if n < m, then Bv j = σ j u j , B ∗ u j = σ j v j , for j = 1, . . . , n, B ∗ u j = 0, for j = n + 1, . . . , m;

(3.1.12)

B ∗ u j = σ j v j , for j = 1, . . . , m,

(3.1.13)

(ii) if n ≥ m, then Bv j = σ j u j ,

Bv j = 0, for j = m + 1, . . . , n (see Exercise 1). Definition 1 Singular values In (3.1.10), the diagonal entries σ1 , . . . , σr of r in (3.1.6) are called the (non-zero) singular values of the matrix B, and the pair (v j , u j ) of vectors in (3.1.12) or (3.1.13) is called a singular-vector pair associated with the singular value σ j . Clearly if (v j , u j ) is a singular-vector pair of B associated with σ j , then (u j , v j ) is a singular-vector pair of B ∗ associated with the same σ j . Example 1 For the matrix

B1 =

0 1 0 1 0 −1

√ with m = 2 and n = 3, verify that σ1 = 2, σ2 = 1 are singular values of B1 , with corresponding singular-vector pairs (v1 , u1 ), (v2 , u2 ), and B1 v3 = 0, where

3.1 Singular Value Decomposition and Principal Component Analysis

119

T 0 − √1 , 2  T v2 = 0 1 0 ,

1 1 T v3 = √2 0 √2 ,  T  T u1 = 0 1 and u2 = 1 0 .

v1 =

Solution (i) For σ1 =





√1 2

2, 01 0 B1 v1 = 1 0 −1

B1∗ u1



√1 2



0 ⎥ ⎢ √ ⎣ 0 ⎦= 2 − √1 2 √ 0 = 2 = σ1 u1 ; ⎡ ⎤1 ⎡ ⎤ 0 1 1 0 = ⎣1 0 ⎦ =⎣ 0 ⎦ 1 0 −1 −1 ⎤ ⎡

√1

√ ⎢ 2 ⎥ = 2 ⎣ 0 ⎦ = σ1 v1 ; − √1 2

(ii) for σ2 = 1, ⎡ ⎤

0 01 0 1 ⎣ ⎦ 1 = B1 v2 = = σ2 u2 ; 1 0 −1 0 0 ⎡ ⎤ ⎡ ⎤ 0 1 0 0 ∗ ⎣ ⎦ ⎣ 1 0 B1 u2 = = 1 ⎦ = σ2 v2 ; 1 0 −1 0

(iii) for v3 ,

01 0 B1 v3 = 1 0 −1



√1 2



0 ⎥ ⎢ ⎣ 0 ⎦ = 0 = 0. 1 √ 2

Observe that {v1 , v2 , v3 } is an orthonormal basis of R3 (and of C3 ), and {u1 , u2 } is an orthonormal basis of R2 (and of C2 ). In addition, in this example 2 = m < n = 3,  and (3.1.14) holds with B = B1 .

120

3 Spectral Methods and Applications

Theorem 1 Reduced SVD theorem Let B be an m × n matrix with rank(B) = r . Then there exists an m × r matrix U1 and an n × r matrix V1 , with U1∗ U1 = Ir , V1∗ V1 = Ir ,

(3.1.14)

such that B has the reduced SVD B = U1 r V1∗ ,

(3.1.15)

where r = diag{σ1 , . . . , σr } with σ1 ≥ σ2 ≥ · · · ≥ σr > 0. Furthermore, if B is a real-valued matrix, then U1 and V1 in (3.1.15) can be chosen to be real-valued matrices. Proof To prove Theorem 1, let A = B B ∗ . Then A has the spectral decomposition 2 } with (3.1.2) for some m × m unitary matrix U and  = diag{σ12 , . . . , σm σ1 ≥ σ2 ≥ · · · ≥ σr > σr +1 = · · · = σm = 0, where 0 ≤ r ≤ min{m, n}. Furthermore, by Theorem 4 on p.76, we have rank(B) = rank(BB∗ ), so that rank(B) = rank(A) = rank() = r . . Write U as U = [U .. U ], with U being the m × r matrix consisting of the 1

2

1

first r columns of U . Then U1∗ U1 = Ir and U1∗ U2 = O, where Ir denotes, as usual, the r × r identity matrix. Observe from (3.1.2), that U1∗ B B ∗ U1 = U1∗ U U ∗ U1 = [Ir O][Ir O]∗ = (r )2 .

(3.1.16)

Similarly, it can be shown that U2∗ B B ∗ U2 = O, yielding

Next, we define V1 by

U2∗ B = O.

(3.1.17)

V1 = B ∗ U1 r−1 .

(3.1.18)

Then V1 satisfies (3.1.15). Indeed, from (3.1.17) and the definition of V1 in (3.1.18), we have ∗ U1 B I − r r V1∗ U ∗ (B − U1 r V1∗ ) = O U2∗ B

∗ U1 B − r V1∗ = O. = O Thus, since U ∗ is nonsingular, we have B − U1 r V1∗ = O; that is, B = U1 r V1∗ , as desired.

3.1 Singular Value Decomposition and Principal Component Analysis

121

Furthermore, for real matrices B, the unitary matrix U in the spectral decomposition of A = B B ∗ = B B T in (3.1.2) can be chosen to be an m × m orthogonal  matrix, and hence the matrix V1 defined by (3.1.18) is also real. Theorem 2 Full SVD theorem Let B be an m × n matrix with rank(B) = r . Then there exist m × m and n × n unitary matrices U and V , respectively, such that B = U SV ∗ ,

(3.1.19)

where S is an m × n matrix given by (3.1.5) for r = diag{σ1 , . . . , σr } with σ1 ≥ σ2 ≥ · · · ≥ σr > 0. Furthermore, if B is a real matrix, then the unitary matrices U and V in (3.1.19) can be chosen to be orthogonal matrices. Proof To prove Theorem 2, let us again write A = B B ∗ and consider the matrix . V defined by (3.1.18) so that (3.1.15) holds. Also let U = [U .. U ], where U 1

1

2

1

and U2 are the matrices introduced in the proof of Theorem 1. Then by (3.1.18) and (3.1.16), we have V1∗ V1 = r−1 U1∗ B B ∗ U1 r−1 = r−1 (r )2 r−1 = Ir . Hence, the columns of V1 constitute an orthonormal family in Cn , and we may . extend V1 to a unitary matrix V = [V1 .. V2 ] by introducing another matrix V2 with orthonormal column vectors. Thus, it follows from (3.1.15) that B=

U1 r V1∗

r O V ∗ = U SV ∗ , = U1 [r O][V1 V2 ] = [U1 U2 ] O O ∗



completing the proof of (3.1.19). Furthermore, for real matrices B, the unitary matrix U in the spectral decomposition (3.1.2) of A = B B T can be chosen to be an orthogonal matrix. In addition, as already shown above, the columns of V1 defined by (3.1.18) constitute an orthonormal family of Rn , so that V1 can be extended to an orthogonal matrix V ∈ Rn,n that  satisfies B = U SV T . From a full SVD of B in (3.1.19), the construction of a reduced SVD (3.1.15) of B is obvious, simply by keeping only the first r columns of U and V to obtain U1 and V1 , respectively. Conversely, from a reduced SVD (3.1.15) of B, it is also possible to recover a full SVD of B in (3.1.19) by extending r to S, defined in (3.1.5), as well as extending U1 and V1 to unitary matrices U and V , respectively. In the literature, both the reduced SVD (3.1.15) and full SVD (3.1.19) are called SVD of B. Remark 1 To compute the singular value decomposition (SVD) of a rectangular matrix B, the first step is to compute the eigenvalues of B B ∗ . Then the non-zero singular values of B are the positive square-roots of the non-zero eigenvalues of B B ∗ . The unitary matrix U in the SVD of B is the unitary matrix in the spectral

122

3 Spectral Methods and Applications

decomposition of B B ∗ . It is important to emphasize that eigenvectors of B B ∗ associated with the same eigenvalue must be orthogonalized by applying the Gram-Schmidt process, and that all eigenvectors must be normalized to have unit norm. Of course U and V are not unique, although the singular values are unique.  Remark 2 To compute the singular value decomposition (SVD) of a rectangular matrix B of dimension m × n with n < m, the computational cost can be reduced by computing the spectral decomposition of the n×n matrix B ∗ B instead of B B ∗ , which has larger dimension. To do so, simply replace B by B ∗ and consider A = (B ∗ )(B ∗ )∗ . Hence, the reduced SVD and full SVD are given by B ∗ = U1 r V1∗ and B ∗ = U V ∗ , respectively; so that B = V1 r U1∗ = V U ∗ .  Example 2 Let B = B1 in Example 1; that is, B=

01 0 . 1 0 −1

Compute the SVD of B. Solution Since B ∗ B is 3 × 3 and B B ∗ is 2 × 2, we compute the eigenvalues of one with lower dimension, namely: B B∗ = B BT =



01 0 1 0 −1





0 1 ⎣1 0 ⎦ = 1 0 , 02 0 −1



1−λ 0 , yielding the eigenvalues σ12 = 2 and 0 2−λ σ22 = 1 (since they are arranged in decreasing order). Then the (non-zero) singular values of B are: √ σ1 = 2, σ2 = 1. by taking the determinant of

To compute u1 and u2 , note that  BB



− σ 2j I2

=

 1 − σ 2j 0 , j = 1, 2; 0 2 − σ 2j





−1 0 00 that is, and . Hence, we may select 0 0 01 u1 = [0 1]T and u2 = [1 0]T ,

3.1 Singular Value Decomposition and Principal Component Analysis

yielding

123



01 U= . 10

Since r = m = 2, in this case U1 = U , and V1 as defined by (3.1.18) is simply ⎡ 1 ⎤ ⎤ √ 0

 1  0 1 2 √ 0 01 ⎢ ⎥ 2 = ⎣ 0 1⎦. = ⎣1 0 ⎦ 10 0 1 0 −1 − √1 0 ⎡

V1 = B ∗ U1 2−1

2

Hence, the reduced SVD of B is given by B=

U1 2 V1∗



01 = 10



√  1 √ 0 − √1 20 2 2 . 0 1 0 1 0

To obtain a full SVD of B, V can be obtained by extending V1 to a 3 × 3 orthogonal matrix: ⎡ 1 ⎤ √ 0 √1 2 2 ⎢ ⎥ V = ⎣ 0 1 0 ⎦. − √1 0 √1 2

2

Therefore the SVD of B is given by B=

01 10



200 0 10

⎤ 0 − √1 2 ⎥ ⎢ ⎣ 0 1 0 ⎦. √1 0 √1 ⎡

√1 2

2

2

 Example 3 Compute the SVD of B, where ⎡

1 ⎢ 1 B=⎢ ⎣ −1 0

0 −i 0 1

⎤ i 0 ⎥ ⎥. −1 ⎦ i

Solution Since B is 4 × 3 with m = 4 > n = 3, we consider the SVD of B ∗ first and compute the spectral decomposition of A = (B ∗ )(B ∗ )∗ = B ∗ B: ⎡ ⎤ 1 1 1 −1 0 ⎢ 1 A = ⎣ 0 i 0 1 ⎦⎢ ⎣ −1 −i 0 −1 −i 0 ⎡

0 −i 0 1

⎤ ⎡ ⎤ i 3 −i 1 + i ⎥ 0 ⎥ ⎣ i 2 i ⎦. = −1 ⎦ 1 − i −i 3 i

124

3 Spectral Methods and Applications

To compute the eigenvalues of A, we evaluate the determinant of the matrix λI3 − A and factorize the characteristic polynomial, yielding (λ − 2)(λ2 − 6λ + 5) = (λ − 2)(λ − 5)(λ − 1), so that λ1 = 5, λ2 = 2, λ3 = 1, when arranged in the decreasing order. Hence, the (non-zero) singular values of B are σ1 =



5, σ2 =

√ 2, σ3 = 1.

To compute the eigenvectors, we simply solve the three homogeneous linear systems (A − λ j I3 )x = 0, j = 1, 2, 3. After dividing each solution by its Euclidean norm, we obtain the normalized eigenvectors u j associated with λ j , for j = 1, 2, 3, listed as follows: ⎤ ⎤ ⎡ ⎡ ⎡ ⎤ 1 − 3i 1 1 1 ⎣ 1 1 ⎦ , u2 = √ ⎣ −1 ⎦ , u3 = ⎣ 1 − i ⎦ . 2 u1 = √ 2 2 6 −1 − 3i 3 −1 i √ √ Let U = [u1 , u2 , u3 ]. Since r = 3, we have U1 = U and 3 = diag{ 5, 2, 1}. Applying (3.1.18), we have V1 = (B ∗ )∗ U1 3−1 = BU 3−1 ⎡ 1−3i √1 √ ⎡ ⎤ 3 2 6 1 0 i ⎢ ⎢ 1 −i 0 ⎥ ⎢ ⎥ ⎢ √1 − √1 =⎢ ⎣ −1 0 −1 ⎦ ⎢ 6 3 ⎢ ⎣ 0 1 i −1−3i √ − √1 2 6 3 ⎤ ⎡ 2−2i 1−i √ √ 0 6 ⎥ ⎢ 30 ⎥ ⎢ ⎥ ⎢ 1−5i i 1+i √ −2 ⎥ ⎢ √ 6 ⎥ ⎢ 2 30 ⎥. =⎢ ⎥ ⎢ 1+i ⎥ ⎢ √3i ⎢ 30 0 − 2 ⎥ ⎥ ⎢ ⎦ ⎣ 5−i √ √ − 1+i − 2i 2 30

1 2 1−i 2

⎤ ⎥   ⎥ 1 1 ⎥ ⎥ diag √ , √ , 1 ⎥ 2 5 ⎦

i 2

6

Thus, the reduced SVD for B ∗ is given by B ∗ = U 3 V1∗ , from which we arrive at the following reduced SVD for B: B = V1 3 U ∗ . To obtain the full SVD for B, we must extend

3.1 Singular Value Decomposition and Principal Component Analysis

125

V1 = [v1 , v2 , v3 ] ∈ C4,3 to a unitary matrix V = [v1 , v2 , v3 , v4 ] ∈ C4,4 . To do so, we may select any vector w4 ∈ C4 which is not a linear combination of v1 , v2 , v3 and apply the Gram-Schmidt orthogonalization process, introduced in Sect. 1.4 of Chap. 1, to the linearly independent set v1 , v2 , v3 , w4 . In this example, we may choose w4 = [1, 0, 0, 0]T , and compute:  w4 = w4 − w4 , v1 v1 − w4 , v2 v2 − w4 , v3 v3 2 + 2i 1+i = w4 − √ v1 − √ v2 − 0v3 30 6 1 = [2, −1 − i, 1 − i, −1 + i]T , 5 followed by normalization of  w4 : v4 =

1  w4 = √ [2, −1 − i, 1 − i, −1 + i]T .

 w4

20

With this unitary matrix V = [v1 , v2 , v3 , v4 ], we obtain the full SVD B ∗ = U [3 , O] V ∗ of B ∗ , yielding the full SVD B=V

3 U ∗, O

after taking complex conjugation of the transpose. In the next two theorems we provide some properties of singular values.



Theorem 3 Let B ∈ Cm,n . Then the matrices B T , B, B ∗ have the same singular values as B. Proof We first show that B T and B have the same (non-zero) singular values. Indeed, let B = U SV ∗ be the full SVD for B, where U and V are unitary matrices, and S is an m × n matrix given by (3.1.6) with r = diag{σ1 , . . . , σr } and where σ1 ≥ σ2 ≥ · · · ≥ σr > 0 are the non-zero singular values of B. Thus, ∗

BT = V ST U , where S T is an n × m matrix given by

rT O S = O O T



r O = . O O

126

3 Spectral Methods and Applications ∗

Since V and U are unitary matrices, B T = V S T U is an SVD of B T . Therefore, the nonzero entries on the main diagonal of r in S T are the singular values of B T . That is, B T and B have the same non-zero singular values: σ1 , σ2 , . . . , σr . It is easy to see that B and B have the same singular values. Since B ∗ = (B)T , B ∗ has the same singular values as B, which has the same singular values as B. Thus,  B ∗ and B have the same singular values. Theorem 4 Unitary transformations preserve singular values Let B ∈ Cm,n , and W and R be unitary matrices of dimensions m and n, respectively. Then the singular values of W B R are the same as the singular values of B. More precisely, if r = rank(B), then rank(W B R) = rank(B) = r and σ j (W B R) = σ j (B), j = 1, . . . , r, where σ j (W B R) and σ j (B) denote the singular values of W B R and B, respectively, listed in non-increasing order. Proof The proof of the above theorem follows from the observation that (W B R)(W B R)∗ = W B R R ∗ B ∗ W ∗ = W B B ∗ W ∗ , which implies that (W B R)(W B R)∗ and B B ∗ are similar matrices, and hence have the same eigenvalues, λ 1 ≥ · · · ≥ λm ≥ 0. Therefore, we may conclude that  σ j (W B R) = σ j (B) = λ j , for all j = 1, . . . , r . In the next section, Sect. 3.2, we will show that the SVD of a matrix B can be applied to derive a lower-rank matrix approximation of B. But first let us introduce the concept of principal components, as follows. Definition 2 Principal components Let B = U SV ∗ be the SVD of an m × n matrix B in Theorem 2, where S is given by (3.1.5) with singular values σ1 , . . . , σr of B in the sub-block r of S, arranged in non-increasing order as in (3.1.4). Then the singular vector pair (v1 , u1 ) associated with the largest singular value σ1 is called the principal component of B. Furthermore, the singular vector pairs (v2 , u2 ), . . . , (vr , ur ), associated with the corresponding singular values σ2 , . . . , σr , are called the second, . . . , r th principal components of B, respectively. Remark 3 Uniqueness of principal components Let B = U1 r V1∗ be the reduced SVD of an m × n matrix B in Theorem 1, where r = diag{σ1 , . . . , σr } with σ1 ≥ σ2 ≥ · · · ≥ σr > 0 being the singular values of B. Since U1∗ U1 = Ir , V1∗ V1 = Ir , we have B B ∗ = U1 r2 U1∗ , B ∗ B = V1 r2 V1∗ , and hence,

B B ∗ U1 = U1 r2 , B ∗ BV1 = V1 r2 .

3.1 Singular Value Decomposition and Principal Component Analysis

127

Thus, for each j = 1, . . . , r , u j is an eigenvector of B B ∗ associated with eigenvalue σ 2j and v j is an eigenvector of B ∗ B associated with σ 2j . Therefore, if σ1 > σ2 > · · · > σs for s, 2 ≤ s ≤ r , then σ 2j , 1 ≤ j ≤ s are simple eigenvalues of B B ∗ and B ∗ B as well, and hence, the normalized corresponding eigenvectors u j and v j are unique.  Remark 4 In application to data analysis, when the matrix B ∈ Cm,n is a data matrix, with the data in Cn given by the m rows b1 , . . . , bm of B, then the principal components of B provide a new “coordinate system”, with origin given by the average bav =

m 1  bj. m j=1

This coordinate system facilitates the analysis of the data, called principal component analysis (PCA), to be discussed in some details in Sect. 3.3.  Recall that singular values are always non-negative by definition. In the following, let us investigate the relationship between absolute values of eigenvalues and singular values for square matrices. More precisely, let B be an n ×n (square) matrix, with singular values σ1 , . . . , σn (including zero, if any) and eigenvalues λ1 , . . . , λn , arranged in the order: σ1 ≥ σ2 ≥ · · · ≥ σn ≥ 0 and |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. A natural question to ask is whether or not σ1 = |λ1 |, . . . , σn = |λn |?

(3.1.20)

In general the answer to the question (3.1.20) is negative, as illustrated by the matrices



21 1 1 B1 = , B2 = , 01 −1 1 for which we have B1 B1∗





51 20 ∗ = , B2 B2 = , 11 02

√ √ with eigenvalues 3 + 5, 3 − 5 for B1 B1∗ , and eigenvalues 2, 2 for B2 B2∗ . Hence, the singular values of B1 are  σ1 =

√ 3+ 5=

√ √  √ 5+1 5−1 √ , σ2 = 3 − 5 = √ , 2 2

√ √ while the singular values of B2 are 2, 2. On the other hand, the eigenvalues of B1 are 2, 1, while the eigenvalues of B2 are 1 + i, 1 − i. It is easy to verify that both B1 and B2 provide negative answers to (3.1.20). On the other hand, consider the following example.

128

3 Spectral Methods and Applications

Example 4 Compare the singular values and (absolute values of) the eigenvalues for the two matrices ⎡ ⎤ ⎡ ⎤ 1 1 2 1 1 2 B = ⎣ 2 −1 1 ⎦ , C = ⎣ 1 0 −1 ⎦ . 1 0 −1 2 −1 1 Solution First, the determinants of the matrices λI3 − B and λI3 − C are given by (λ + 2)(λ2 − λ − 3), (λ − 3)(λ + 2)(λ − 1), respectively. Thus, the eigenvalues of B are λ1 =

1+

√ √ 13 1 − 13 , λ2 = −2, λ3 = , 2 2

and the eigenvalues of C are 3, −2, 1. On the other hand, since ⎡

⎤ 6 −1 3 B B = C C = ⎣ −1 2 1 ⎦ , 3 1 6 T

T

the two matrices B and C have the same singular values: σ1 = 3, σ2 = 2, σ1 = 1. Therefore, the answer to the question (3.1.20) is positive for the matrix C, but negative for the matrix B.  But what is the difference between the two matrices in Example 4? Observe that C is the matrix resulted by exchanging the second and third rows of B; and so we have C = U B, where ⎡ ⎤ 100 U = ⎣0 0 1⎦ 010 is unitary. Hence, in view of Theorem 4, the matrices B and C have the same singular values, as demonstrated by the above example. However, while singular values are preserved under unitary transformations, eigenvalues do not enjoy this invariance property. Also observe that the matrix C in Example 4 is symmetric; and being real, C is selfadjoint and is therefore a normal matrix. This provides an example for the following theorem.

3.1 Singular Value Decomposition and Principal Component Analysis

129

Theorem 5 Eigenvalues“=”singular values for normal matrix Let B ∈ Cn,n be any (square) matrix with eigenvalues: λ1 , . . . , λn , and singular values: σ1 , . . . , σn , where multiplicities are listed and magnitudes are arranged in non-increasing order. Then B has the property |λ j | = σ j , j = 1, 2, . . . , n, if and only if B is a normal matrix. Proof To prove this theorem, let us first consider normal matrices B and apply the spectral decomposition (2.4.9) of Theorem 3 on p.108 to write B = U U ∗ , with unitary matrix U and  = diag{λ1 , . . . , λn }, where λ1 , . . . , λn are the eigenvalues of B, with multiplicities listed and arranged with magnitudes in non-increasing order; that is, |λ1 | ≥ |λ2 | ≥ |λr | > |λr +1 | = · · · = |λn | = 0, where r denotes the rank of B. For each j = 1, . . . , r , write λ j = |λ j |eiθ j for some θ j ∈ [0, 2π). Then since U is unitary and the product of two unitary matrices remains to be unitary, the matrix V = U diag{e−iθ1 , . . . , e−iθr , 1, . . . , 1} is a unitary matrix. Hence, B = U diag{|λ1 |, . . . , |λr |, 0, . . . , 0}V ∗ , is an SVD for B, with singular values |λ1 |, . . . , |λr |. To prove the converse, assume that the eigenvalues λ1 , . . . , λn and singular values σ1 , . . . , σn of B satisfy |λ j | = σ j , for j = 1, . . . , n. Then by Theorem 4 on p.109, the matrix B has the decomposition B = U T U ∗ , where U is unitary and T is uppertriangular. It is evident that the diagonal entries t j j of T are the eigenvalues of B, so that n n   |t j j |2 = |λ j |2 . j=1

j=1

On the other hand, since σ 2j are the eigenvalues of B B ∗ , it follows from Theorem 1 on p.90 that  j

σ 2j = Tr(B B ∗ ) = Tr(T T ∗ ) =

n  n  j=1

=1

 t j t j =



|t j |2 ,

1≤ j,≤n

where the second equality follows from the fact that B B ∗ is similar to T T ∗ . Thus, from the assumption |λ j | = σ j , we have

130

3 Spectral Methods and Applications n 

|t j j |2 =

j=1



|t j |2 ,

1≤ j,≤n

which implies that t j = 0 for all j = . Hence, T is a diagonal matrix, so that T T ∗ = T ∗ T ; and therefore, B B ∗ = U T T ∗ U ∗ = U T ∗ T U = B ∗ B. That is, B is normal, as we intended to prove.



Exercises Exercise 1 Let B be an m × n matrix which can be represented as B = U SV ∗ in (3.1.10), where S is defined in (3.1.6), and U, V be given by (3.1.11), with {u1 , . . . , um }, {v1 , . . . , vn } being orthonormal bases of Cm , Cn , respectively. Verify the validity of the statements (i) and (ii), as formulated in (3.1.12) and (3.1.13), respectively. Exercise 2 For the rectangular matrix B=

1 −1 0 , 0 0 2

√ verify that σ1 = 2, σ2 = 2 are singular values of B, with corresponding singularvector pairs (v1 , u1 ), (v2 , u2 ), and that Bv3 = 0, where  T v1 = 0 0 1 ,

T 1 1 v2 = √2 √2 0 ,

T 1 −1 v3 = √2 √2 0 , u1 = [ 0 1 ]T , u2 = [ 1 0 ]T . Exercise 3 Compute the SVD of the following matrices:

−1 1 0 (a) X 1 = , 0 10

1 01 (b) X 2 = , −1 0 0

0 0 1 , (c) X 3 = 1 −1 0

2 1 −2 (d) X 4 = . −2 1 2

3.1 Singular Value Decomposition and Principal Component Analysis

131

Exercise 4 Apply Theorem 1 to compute the reduced SVD of the following matrices: ⎡ ⎤ −1 1 0 0 (a) B1 = ⎣ 0 1 0 0 ⎦ , −1 2 0 0 ⎡ ⎤ 0 01 (b) B2 = ⎣ 1 0 1 ⎦, −1 0 0 ⎡ ⎤ 1 −1 0 (c) B3 = ⎣ 1 −1 1 ⎦. 0 0 1 Exercise 5 Fill in the details in the derivation of (3.1.17) in the proof of Theorem 1. Exercise 6 Suppose that B is an m × n matrix with rank = r . Let B ∗ B = U U ∗ be the spectral decomposition of B ∗ B, where  is the n × n diagonal matrix . diag{σ11 , σ22 , . . . , σr2 , 0, . . . , 0}. Denote U = [U1 .. U2 ] with U1 consisting of the first r columns of U . Let V1 be defined by (3.1.18). Show that (a) V1∗ V1 = Ir ;

(b) B = V1 r U1∗ .

. Hint: Apply U1∗ U = [Ir .. O] to show that U1∗ B ∗ BU1 = r2 . Exercise 7 In Theorem 1, show that if B ∈ Rm,n is real, then in the reduced SVD (3.1.15) of B, the matrices U1 and V1 can be chosen to be real matrices also. Exercise 8 In Theorem 2, show that if B ∈ Rm,n is real, then in the full SVD (3.1.19) of B, the unitary matrices U and V can be chosen to be orthogonal matrices (that is, real unitary matrices). Exercise 9 Compute the singular values of the matrices X, X T , X ∗ , and X individually, where

1i 0 X= . i 0 −1 Observe that they have the same singular values, as assured by Theorem 3. Exercise 10 Let X 1 , X 2 , X 3 be the matrices in Exercise 3, and

3 4 A= . −4 3 Apply Theorem 4 and the solution of Exercise 3 to compute the singular values of the following matrices:

132

3 Spectral Methods and Applications







3 4 −1 1 0 −3 7 0 = , −4 3 0 10 4 −1 0





3 4 1 01 −1 0 3 (b) B2 = AX 2 = = , −4 3 −1 0 0 −7 0 −4





3 4 0 0 1 4 −4 3 (c) B3 = AX 3 = = . −4 3 1 −1 0 3 −3 −4 (a) B1 = AX 1 =

Hint: Observe that the matrix

= 1 3 4 A 5 −4 3 is an orthogonal matrix. Exercise 11 Let X 1 , X 2 , X 3 be the matrices in Exercise 3, and ⎡

⎤ 1 −1 √0 C = ⎣0 0 2⎦. 1 1 0 Apply Theorem 4 and the solution of Exercise 3 to compute the singular values of the matrices X 1 C, X 2 C, and X 3 C. Hint: Consider the orthogonal matrix √1 C in the application of Theorem 4. 2

Exercise 12 Apply Theorem 5 to determine the singular values of the following square matrices simply by computing their eigenvalues. Justify the validity of this approach by applying Theorem 5.

1 i . (a) A1 = −i 0

0 1+i (b) A2 = . 1−i 0

−1 2 (c) A3 = . 2 1

3.2 Matrix Norms and Low-Rank Matrix Approximation When a matrix B ∈ Cm,n is considered as a linear transformation from V = Cn to W = Cm , the norm of B is the operator norm |||B|||V→W (see (2.2.4) on p.180). Of course this definition depends on the choice of norms for the spaces V = Cn and W = Cm . If there is no indication of such choice, then it is a common practice to n assume that the Euclidean (that is, m 2 and 2 ) norms are used. More generally, recall from Sect. 1.5 of Chap. 1, that for any p with 1 ≤ p ≤ ∞, the normed (linear) space np is the vector space Cn over C with norm · p , defined by

3.2 Matrix Norms and Low-Rank Matrix Approximation

133

1  n p p; (a) for 1 ≤ p < ∞ : x p = j=1 |x j | (b) for p = ∞ : x ∞ = max{|x j | : 1 ≤ j ≤ n}, where x = (x1 , . . . , xn ) ∈ Cn . Of course for the real-valued setting, Cn is replaced by Rn and C by R. For simplicity, we will replace the notation |||B|||np →mp , in (2.2.4) and (2.2.5) on p.80, by |||B||| p for matrix transformations B ∈ Cm,n , as follows. Definition 1 Operator norm of matrices Let B ∈ Cm,n . For 1 ≤ p ≤ ∞, the  p -operator norm of B is defined by  |||B||| p = max Bx p : x ∈ Cn , x p = 1}  

Bx p n = max : x ∈ C , x = 0 .

x p

(3.2.1)

Similarly, for B ∈ Rm,n , |||B||| p is defined as in (3.2.1) with Cn replaced by Rn . To be precise, we point out that the  p -norm in (3.2.1) for Bx and x are somewhat different, since it stands for the mp -norm of Bx ∈ Cm and the np -norm of x ∈ Cn . Again, for the real-valued setting, we simply replace Cn by Rn . From the definition (3.2.1), it is clear that

Bx p ≤ |||B||| p x p , for all x ∈ Cn ,

(3.2.2)

and that |||B||| p ≤ C, for any constant C > 0 for which

Bx p ≤ C x p , x ∈ Cn .

(3.2.3)

In other words, |||B||| p in (3.2.2) is the smallest constant C in (3.2.3). In the following, we focus our attention on |||B||| p , only for p = 1, 2, ∞. For the Euclidean space, we have the following result. Theorem 1 |||B|||2 norm = largest singular value of B For any B ∈ Cm,n , its operator norm |||B|||2 , as defined by (3.2.1) with p = 2, is precisely the largest singular value of B; that is, (3.2.4) |||B|||2 = σ1 , where σ1 ≥ σ2 ≥ · · · ≥ σr > σr +1 = · · · = σn = 0 are the singular values of B, as introduced in (3.1.3) on p.117. Proof To prove (3.2.4), consider the spectral decomposition B ∗ B = V V ∗ of B ∗ B, with unitary matrix V and  = diag{σ11 , σ22 , . . . , σn2 }, where σ1 ≥ σ2 ≥ · · · ≥ σn . Then for any x ∈ Cn , we have Bx, Bx = x, B ∗ Bx = x, V V ∗ x = V ∗ x, V ∗ x .

134

3 Spectral Methods and Applications

Hence, applying the Cauchy-Schwarz inequality, we obtain

Bx 22 = Bx, Bx = V ∗ x, V ∗ x ≤ V ∗ x 2 V ∗ x 2 ≤ V ∗ x 2 σ12 V ∗ x 2 = σ12 x 22 , where the second inequality follows from the fact that y 2 ≤ σ12 y 2 for any y ∈ Cn (see Exercise 12) and the last equality follows from the length invariant property of unitary transformations (see Theorem 2 on p.107). This yields Bx 22 ≤ σ12 x 22 for any x ∈ Cn ; and hence, it follows from (3.2.3) that |||B|||2 ≤ σ1 . On the other hand, by a suitable choice of the first column of V , namely: x0 = V e1 , where e1 = [1, 0, . . . , 0]T , we have x0 2 = 1 and V ∗ x0 = e1 , so that

Bx0 22 = V ∗ x0 , V ∗ x0 = e1 , e1 = σ12 = σ12 x0 22 , which yields |||B|||2 ≥ σ1 , in view of (3.2.2). Therefore, by combining the two inequalities, we have proved (3.2.4).  In view of Theorem 1, it is natural to introduce the following definition. Definition 2 Spectral norm The operator norm |||B|||2 of a matrix B ∈ Cm,n is called the spectral norm of B. For Hermitian matrices, the spectral norm is the largest eigenvalue (see Theorem 5 on p.130), which is characterized by the maximum of the Rayleigh quotients, as already pointed out in Remark 4 (see (2.3.19)) on p.101 in Sect. 2.2 of Chap. 2. This fact is a consequence of Theorem 1 and the following result. Theorem 2 If B is a Hermitian matrix, then |||B|||2 = max |Bx, x |.

x 2 =1

Proof To prove Theorem 2 for square matrices B ∈ Cm,m , we first apply the Cauchy-Schwarz inequality to obtain |Bx, x | ≤ Bx x ≤ |||B|||2 x 2 |||B|||2 , which holds for any x ∈ Cm with x = 1, so that max |Bx, x | ≤ |||B|||2 .

x 2 =1

To derive that |||B|||2 is also a lower bound, we apply the spectral decomposition B = U U ∗

3.2 Matrix Norms and Low-Rank Matrix Approximation

135

of the Hermitian matrix B, with unitary matrix U and  = diag{λ1 , . . . , λm }, where |λ1 | ≥ |λ2 | ≥ · · · ≥ |λm |. Then since U is unitary, we have, for all x ∈ Cm ,

Bx 2 = U U ∗ x 2 = U ∗ x 2 ≤ |λ1 | U ∗ x 2 = |λ1 | U ∗ x 2 = |λ1 | |x . Thus, it follows from the definition of the operator norm that |||B|||2 ≤ |λ1 |. On the other hand, if v1 is a unit eigenvector associated with λ1 , then |Bv1 , v1 | = |λ1 v1 , v1 | = |λ1 | v1 2 = |λ1 |, which implies that |λ1 | ≤ max |Bx, x |.

x 2 =1

The above two derivations together yield |||B|||2 ≤ |λ1 | ≤ max |Bx, x |.

x 2 =1

Hence, |||B|||2 is also a lower bound, completing the proof of the theorem.



We remark that the above theorem can also be proved by applying Theorem 1 and the fact that |λ1 | = σ1 , which is assured by Theorem 5 on p.129 for normal matrices. The spectral radius ρ(B) of a square matrix B is defined as the largest eigenvalue (in modulus) of B: ρ(B) := |λ1 |. Theorem 3 For a square matrix B, 1

ρ(B) = lim |||B k ||| k , k→∞

(3.2.5)

for any operator norm ||| · |||. The proof of (3.2.5) is beyond the scope of this book. Here, we only give the proof for the special case that B is diagonalizable; that is, there is a nonsingular matrix U such that B = U −1 DU, where D = diag{λ1 , . . . , λn } with λ j being eigenvalues of B. Suppose |λ1 | ≥ · · · ≥ |λn |. From B k = U −1 D k U,

136

and

3 Spectral Methods and Applications

c1 |||D k ||| ≤ |||U −1 D k U ||| ≤ c2 |||D k |||,

where c1 , c2 are some positive numbers, we have 1 1 √ √ k c1 |||D k ||| k ≤ |||B k ||| ≤ k c2 |||D k ||| k . 1 √ √ From this and the facts |||D k ||| k = |λ1 |, limk→∞ k c1 = limk→∞ k c2 = 1, we conclude that (3.2.5) holds.  Next we will show that the operator norm |||B||| p , for p = 1 and p = ∞, can be computed in terms of the 1 -norms of the columns and of the rows, of the matrix B, respectively.

Theorem 4 For any matrix B = [b jk ] ∈ Cm,n , |||B|||1 = max

1≤k≤n

|||B|||∞ = max

1≤ j≤m

m 

 |b jk | ;

(3.2.6)

 |b jk | .

(3.2.7)

j=1 n  k=1

For the sake of easier reading, we first demonstrate the computation of |||B||| p , for p = 1, 2, ∞, with the following example, and delay the derivations of (3.2.6) and (3.2.7) to the end of this section. Example 1 Compute the operator norms |||B|||2 , |||B|||1 and |||B|||∞ of the matrix ⎡

⎤ 0 1 1 B = ⎣ 3 −1 1 ⎦ . 3 1 −1 Solution By Theorem 1, to determine the spectral norm |||B|||2 , we may compute the largest singular value σ1 of B. For this purpose, first compute A = B B ∗ , namely: ⎡

⎤⎡ ⎤ ⎡ ⎤ 0 1 1 0 3 3 2 0 0 A = B B T = ⎣ 3 −1 1 ⎦ ⎣ 1 −1 1 ⎦ = ⎣ 0 11 7 ⎦ . 3 1 −1 1 1 −1 0 7 11 The eigenvalues of A can be computed from the determinant of λI3 − A, which is    λ − 11 −7   = (λ − 2)(λ2 − 22λ + 72) (λ − 2)  −7 λ − 11  = (λ − 2)(λ − 4)(λ − 18),

3.2 Matrix Norms and Low-Rank Matrix Approximation

137

with eigenvalues of A given by the roots λ1 = 18, λ2 = 4, λ3 = 2. Then the singular values of B are the positive square-roots of the eigenvalues of A, namely: √ √ √ √ σ1 = 18 = 3 2, σ2 = 4 = 2, σ3 = 2. √ Hence, the largest singular value is σ1 = 3 2, so that √ |||B|||2 = σ1 = 3 2 by Theorem 1. To compute the operator norm |||B|||1 , we simply compute the 1 -norm of each column of B, which is the sum of each column of the matrix ⎡ ⎤ 011  B = [|b jk |] = ⎣ 3 1 1 ⎦, 311 obtained from B by taking the absolute value of its entries. The result is the set of three numbers 6, 3, 3. By (3.2.6) of Theorem 4, we have |||B|||1 = max{6, 3, 3} = 6. Similarly, to find the operator norm |||B|||∞ , we compute the sum of each row of the matrix  B associated with B. The result is the set of three numbers 2, 5, 5. Hence by (3.2.7) of Theorem 4, we have |||B|||∞ = max{2, 5, 5} = 5.  Next we introduce the notion of Frobenius norm (or Hilbert-Schmidt norm). In this case, a matrix B = [b jk ] ∈ Cm,n is considered as an mn × 1-vector b ∈ Cmn,1 , in that all the entries of B are arranged as a finite sequence, such as b = (b11 , . . . , bm1 , b12 , . . . , bm2 , . . . , b1n , . . . , bmn ). Then the Frobenius norm of B is defined by the Euclidean norm of the vector b, as follows. Definition 3 Frobenius norm The Frobenius norm (also called Hilbert-Schmidt norm) of an m × n matrix B = [b jk ] is defined by

138

3 Spectral Methods and Applications

B F =

m  n 

|b jk |2

1/2

.

j=1 k=1

Of course the definition of the Frobenius norm can be generalized from the Euclidean 2 -norm to  p , for any 1 ≤ p ≤ ∞. It can even be generalized to 0 < p < 1, but without taking the power of 1/ p, for the validity of the triangle inequality, as discussed in Sect. 1.2 of the first chapter. But we are only interested in the case p = 2, as in the above definition of the Frobenius norm, since it occurs to be the only useful consideration for the study of data dimensionality reduction, to be studied in the final section of this chapter. Theorem 5 Frobenius norm = 2 -norm of singular values Let B ∈ Cm,n with rank(B) = r . Then r 1/2 

B F = σ 2j , (3.2.8) j=1

where σ1 , . . . , σr are the singular values of B. Proof To derive (3.2.8), let A = B B ∗ and observe that the Frobenius norm B F of B agrees with the trace of A = [a jk ], j = 1, . . . , n, introduced in Definition 2 on p.93. This fact is a simple consequence of the definition of the trace, namely: Tr(A) =

n 

ak,k =

k=1

=

n  m 

n  m  k=1

 bk, bk,

=1

|b,k |2 = B 2F .

k=1 =1

On the other hand, recall from (2.3.5) of Theorem 1 on p.90 that Tr(A) is the sum of the eigenvalues of A. By the definition of the singular values of B, these eigenvalues λ1 = σ12 , λ2 = σ22 , . . . , λn = σn2 of A are squares of the singular values of the matrix B. Hence, we may conclude that n r 1/2   1/2  1/2   = σ 2j = σ 2j ,

B F = Tr(A) j=1

since σr +1 = · · · = σm = 0.

j=1



Example 2 Let B be the matrix considered in Example 1. Compute the Frobenius norm B F of B.

3.2 Matrix Norms and Low-Rank Matrix Approximation

139

Solution The Frobenius norm B F can be computed directly by applying the definition, namely:

B 2F

=

3  3 

|b jk |2 = 02 + 32 + 32 + 12 + 12 + 12 + 12 + 12 + 12 = 24,

j=1 k=1

√ √ so that B F = 24 = 2 6. Since the singular values of B have already been obtained in Example 1, we may also apply (3.2.8) of Theorem 5 to compute B F , namely: ⎛ ⎞1 2 3  √  1 2⎠ ⎝ σj = 18 + 4 + 2 2 = 2 6,

B F = j=1

which agrees with the above answer obtained by using the definition.  Applying Theorem 5 to generalize the Frobenius norm, from the 2 -norm to the general  p -norm, of the sequence of singular values B, as opposed to the sequence of all of the entries of the matrix B, we introduce the following notion of Schatten norm. Definition 4 Schatten p-norm Let B ∈ Cm,n with rank(B) = r . For 1 ≤ p ≤ ∞, the Schatten p-norm of B is defined as

B ∗, p

⎛ ⎞1/ p r  p =⎝ σ ⎠ , j

j=1

where σ1 , . . . , σr are the singular values of B. Remark 5 The Schatten p-norm B ∗, p not only generalizes the Frobenius norm, but also the spectral norm as well. Indeed, B ∗,2 = B F and B ∗,∞ = |||B|||2 , since B ∗,∞ = σ1 and by Theorem  1, |||B|||2 = σ1 also. In addition, for p = 1, the Schatten 1-norm B ∗,1 = rj=1 σ j is called the nuclear norm (also called trace norm and Ky Fan norm in the literature). Since the nuclear norm is useful for low-rank and sparse matrix decomposition, a subject to be studied in a forthcoming publication of this book series, we use the abbreviated notation

B ∗ = ||B||∗,1 =

r 

σj

(3.2.9)

j=1

for simplicity.



Remark 6 When Cm,n is considered as a vector space over the scalar field C, both the operator norm |||B||| p and Schatten norm B ∗, p , for any 1 ≤ p ≤ ∞, provide

140

3 Spectral Methods and Applications

norm measurements of B ∈ V, meaning that they satisfy the norm properties (a)–(c) of Definition 2 on p.55. While verification of these norm properties for the operator norm is fairly simple for all 1 ≤ p ≤ ∞ (see Exercise 14–15), proof of the triangle inequality:

A + B ∗, p ≤ A ∗, p + B ∗, p where A, B ∈ Cm,n , for the Schatten p-norm, with p different from 1, 2 and ∞, is somewhat tedious and not provided in this book. We also mention that the operator norm satisfies the sub-multiplicative condition, meaning that |||AB||| p ≤ |||A||| p |||B||| p for all A ∈ C,m and B ∈ Cm,n (see Exercise 16). Of course, all of the above statements remain valid for the real valued setting, if we replace Cm,n by Rm,n , C,m by R,m , and C by R. In the following, we establish the unitary invariance property of the Schatten p-norm. Theorem 6 Unitary invariance of Schatten p-norm For any p, with 1≤ p≤ ∞, and any m × n matrix B, the Schatten p-norm of arbitrary unitary transformations of B remains the same as the Schatten p-norm of B. More precisely,

W B R ∗, p = B ∗, p for all unitary matrices W and R of dimension m × m and n × n, respectively. The proof of Theorem 6 is accomplished simply by applying the unitary invariance property of singular values in Theorem 4 on p.126.  Since B ∗,2 = B F (see Remark 1), Theorem 6 is valid for the Frobenius norm, namely: for any m × n matrix B,

W B R F = B F ,

(3.2.10)

where W and R are arbitrary unitary matrices. The Frobenius norm will be used to assess the exact error in the approximation of matrices B with rank r , by matrices C with rank ≤ d, for any desired d < r . Approximation by lower-rank matrices is the first step to the understanding of data dimensionality reduction, a topic of applications to be discussed in the next section. The basic tool for this study is the following decomposition formula of any matrix B ∈ Cm,n with rank = r , obtained by applying singular value decomposition (SVD), namely: r  σ j u j v∗j , (3.2.11) B = U1 r V1∗ = U SV ∗ = j=1

3.2 Matrix Norms and Low-Rank Matrix Approximation

141

where U, V are unitary matrices of dimensions m, n respectively, and U1 = [u1 , . . . , ur ], V1 = [v1 , . . . , vr ] are obtained from U, V by keeping only the first r columns, where σ1 ≥ · · · ≥ σr > 0 are the non-zero singular values of B (with multiplicities being listed), and u∗j , v∗j denote the complex conjugate of the transpose of the j th columns u j , v j of U1 , V1 in (3.1.15) on p.120 or U, V in (3.1.19) on p.121, respectively. To derive (3.2.11), we simply apply the reduced SVD of B in (3.1.15) on p.120, re-formulate the two matrix components U1 r and V1∗ , and finally multiply the corresponding column and row sub-blocks, as follows: ⎤ v1∗ ⎢ ⎥ B = U1 r V1∗ = [σ1 u1 · · · σr ur ] ⎣ ... ⎦ = σ1 u1 v1∗ + · · · + σr ur vr∗ . ⎡

vr∗

For each j = 1, . . . , r , observe that u j v∗j is an m × n matrix with rank = 1. Such matrices are called rank-1 matrices. We remark that an m × n matrix B is a rank-1 matrix, if and only if B = vw T , where v and w are m-dimensional and n-dimensional column vectors, respectively (see Exercises 6–7), and that the rank of the sum of r rank-1 m × n matrices does not exceed r (see Exercises 8–10). Theorem 7 Lower-rank matrix approximation Let B be any m × n matrix of complex (or real) numbers with rank(B) = r and with singular values σ1 ≥ · · · ≥ σr > σr +1 = · · · = 0. Then for any integer d, with 1 ≤ d ≤ r , the d th partial sum B(d) =

d 

σ j u j v∗j ,

(3.2.12)

j=1

of the rank-1 matrix series representation (3.2.11) of B, provides the best approximation of B by all matrices of rank ≤ d under the Frobenius norm, with precise 2 error given by σd+1 + · · · + σr2 ; that is,

B − B(d) 2F =

r 

σ 2j ,

(3.2.13)

B − B(d) 2F ≤ B − C 2F

(3.2.14)

j=d+1

and for all m × n matrices C with rank ≤ d. Furthermore, B(d) in (3.2.12) is the unique best approximant of B under the Frobenius norm. Proof Let Sd denote the matrix obtained from the matrix S ∈ Rm,n , introduced in (3.1.6)–(3.1.7) on p.117, by replacing each of σ1 , . . . , σd with 0. Then we have B − B(d) = U Sd V ∗

142

3 Spectral Methods and Applications

(see Exercise 13). Hence, it follows from (3.2.10) that

B − B(d) 2F = U Sd V ∗ 2F = Sd 2F =

r 

σ 2j ,

j=d+1

completing the derivation of (3.2.13). To prove (3.2.14), assume that C ∈ Cm,n with rank = k ≤ d provides the best approximation to B under the Frobenius norm · F , so that B − C 2F ≤

B − B(d) 2F . Hence, it follows from (3.2.13) that

B − C 2F ≤

r 

σ 2j .

(3.2.15)

j=d+1

Let U, V be the unitary matrices in the SVD of B in (3.2.11) and set G = U ∗ C V , so that G has the same rank k as C and that C = U GV ∗ . Hence, by applying (3.2.10), we have

B − C F = U SV ∗ − U GV ∗ F = U (S − G)V ∗ F = S − G F .

Set G = [g j, ] and S = [s j, ], where in view of (3.1.6)–(3.1.7) on p.117, s j, = 0 for all j different from  and s j, j = 0 for j > r . Since the Frobenius norm S − G F is defined by the 2 sequence norm of the sequence consisting of all the entries s j, − g j, of the matrix S − G, it should be clear (see Exercise 17) that for S − G F to be minimum among all G ∈ Cm,n , this optimal G is given by

r 0 G= , 0 0 where r = diag{g1 , g2 , . . . , gr }, with each g j ≥ 0, so that

S − G 2F =

r  j=1

|s j, j − g j |2 =

r 

|σ j − g j |2 .

j=1

Now, since the rank of G is k ≤ d ≤ r , only k of the r diagonal entries g1 , g2 , . . . , gr of r are non-zero, and the minimum of the above sum is achieved only when these k non-zero entries match the largest k values of σ1 , . . . , σr (see Exercise 18). In other words, we have g1 = σ1 , . . . , gk = σk , and g j = 0 for j > k, and that

3.2 Matrix Norms and Low-Rank Matrix Approximation

143 r 

B − C 2F = S − G 2F =

σ 2j

(3.2.16)

j=k+1

(see Exercise 19). Hence, by combining (3.2.15) and (3.2.16), we may conclude that k = d, r  2

B − C F = σ 2j , j=d+1

and that C = U GV ∗ , with



r 0 G= , 0 0

where r = diag{σ1 , . . . , σd }. This implies that C = B(d) in (3.2.12), completing the proof of the theorem.  We conclude this section by providing the following proof of Theorem 4. Proof of Theorem 4 To derive the formula (3.2.6), let c0 be defined by c0 :=

m 

|b jk0 |,

j=1

and among all the column-norms m assume that this column-norm is the largest n , we have |b |, for 1 ≤ k ≤ m. Then for any x ∈ C j=1 jk

Bx 1 = =

m  n m  n      b jk xk  ≤ |b jk | |xk |  j=1 k=1 n  m 

j=1 k=1 n 

|b jk | |xk | ≤

k=1 j=1

c0 |xk | = c0 x 1 .

k=1

Since |||B|||1 is the minimum among all upper bounds, we have |||B|||1 ≤ c0 . On the other hand, by selecting the coordinate unit vector x0 = ek0 = [0, . . . , 0, 1, 0, . . . , 0]T ∈ Rn , we obtain

Bx0 1 = [b1k0 , b2k0 , . . . , bmk0 ]T 1 =

m  j=1

|b jk0 | = c0 x0 1 .

144

3 Spectral Methods and Applications

Thus, it follows from the definition of the operator norm ||| · |||1 that |||B|||1 ≥ c0 . Since the value c0 , defined above, provides both an upper bound and a lower bound of |||B|||1 , we have m   |||B|||1 = c0 = max |b jk | . 1≤k≤n

j=1

To derive the formula (3.2.7) for |||B|||∞ , let d0 be defined by d0 :=

n 

|b j0 k |

k=1

and assume that this row-norm is the largest among all the row-norms for 1 ≤ j ≤ m. Then for any x ∈ Cn , we have

Bx ∞

n

k=1 |b jk |,

n n        = max  b jk xk  ≤ max |b jk | |xk | 1≤ j≤m

≤ max

1≤ j≤m

k=1 n 

1≤ j≤m

k=1



|b jk | x ∞ = d0 x ∞ .

k=1

Since |||B|||∞ is the minimum among all upper bounds, we have |||B|||∞ ≤ d0 . Next, for any complex number z, we introduce the notation  conj(z) =

z¯ /|z|, if z = 0 0, if z = 0,

so that z conj(z) = |z|. Then by selecting the vector x0 = [conj(b j0 1 ), conj(b j0 2 ), . . . , conj(b j0 n )]T , we obtain

Bx0 ∞ = max

1≤ j≤m

n      b jk conj(b j0 k )  k=1

n n      ≥ b j0 k conj(b j0 k ) = |b j0 k | = d0 x0 ∞ , k=1

k=1

since x0 ∞ = 1. Thus, it follows from the definition of operator norm that |||B|||∞ ≥ d0 . Since the value d0 defined above is both an upper bound and a lower bound, we may conclude that

3.2 Matrix Norms and Low-Rank Matrix Approximation

|||B|||∞ = d0 = max

1≤ j≤m

n 

145

 |b jk | .

k=1

 Exercises Exercise 1 Compute the operator norm ||| · |||2 of each of the following matrices:

2 1 1 (a) B = , 1 −1 1 ⎡ ⎤ 2 1 1 (b) C = ⎣ 1 −1 1 ⎦. 2 −1 −3 Exercise 2 Determine the spectral norms ||| · |||1 and ||| · |||∞ of each of the matrices in Exercise 1. Exercise 3 Find the Frobenius norm · F of each of the matrices in Exercise 1. Exercise 4 Determine the nuclear norm · ∗ of each of the matrices in Exercise 1. Exercise 5 What is the spectral radius of the matrix C in Exercise 1? Exercise 6 Let x ∈ Cm and y ∈ Cn be arbitrarily given non-zero column vectors. Show that the m × n matrix B = xyT is a rank-1 matrix; that is, rank(B) = 1. Exercise 7 Let B be an m×n matrix of complex numbers. Show that if rank(B) = 1, then B = xyT for some vectors x ∈ Cm and y ∈ Cn . Exercise 8 Let B1 and B2 be rank-1 matrices. Show that rank(B1 + B2 ) ≤ 2. Exercise 9 As an extension of Exercise 8, let B be a rank-1 matrix and A = B + C. Then show that rank(A) ≤ rank(C) + 1. Exercise 10 As another extension of Exercise 8 (as well as Exercise 9), show that rank

r 

 x j yTj ≤ r

j=1

for all x1 , . . . , xr ∈ Cm and y1 , . . . , yr ∈ Cn . Exercise 11 Compute the rank of each of the following matrices. A1 = x1 y1T + x2 y2T ,

(a) where

x1 = [ 1 0 −1 ]T , x2 = [ 0 1 2 ]T , y1T = [ 2 1 ], y2T = [ 1 −1 ].

146

(b)

3 Spectral Methods and Applications

A2 = A1 + x3 y3T , where A1 is given in (a), and x3 = [ 2 −1 0 ]T , y3T = [ 0 1 ]T .

Exercise 12 Let A = diag{λ1 , λ2 , . . . , λn }, where λ j ∈ C and |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Show that

Ax ≤ |λ1 | x

for any x ∈ Cn , and that |||A|||2 = |λ1 |. Exercise 13 Let Sd be the matrix obtained from the matrix S in (3.1.6)–(3.1.7) on p.117 by replacing each of σ1 , . . . , σd with 0. Show that B − B(d) = U Sd V ∗ . Exercise 14 For each p with 1 ≤ p ≤ ∞, show that the vector space of all matrices B ∈ Cm,n endowed with the operator norm |||B||| p is a normed vector space, by verifying the norm properties (a)–(c) of Definition 2 on p.55. Exercise 15 For each p with 1 ≤ p ≤ ∞, verify that the Schatten p-norm B ∗, p of matrices B ∈ Cm,n satisfy the two properties (a) and (b) of Definition 2 on p.55. Exercise 16 Show that the operator norm satisfies the sub-multiplicative condition: |||AB||| p ≤ |||A||| p |||B||| p , for all matrices A ∈ C,m and B ∈ Cm,n . Exercise 17 Let D ∈ Cn,n be a diagonal matrix with rank r > 0 and 0 ≤ k < r . Suppose that C provides the best approximation of D from the collection of all matrices in Cn,n with rank ≤ k under the Frobenius norm; that is, D − C F ≤

D − A F for all matrices A ∈ Cn,n with rank ≤ k. Show that C is also a diagonal matrix. Exercise 18 As a continuation of Exercise 17, show that the entries of the diagonal matrix C, which best approximates D, are precisely the k entries of D with largest absolute values. Also show that all the diagonal entries of D are non-zero and that the rank of C is equal to k. Exercise 19 As a continuation of Exercises 17 and 18, show that the approximant C of D is unique, and formulate the error of approximation D − C F in terms of the entries of the given matrix D.

3.3 Data Approximation In this section, the mathematical development studied in the previous sections is applied to the problem on generalization of matrix inversion with applications

3.3 Data Approximation

147

to solving arbitrary systems of linear equations and least-squares data estimation. Consider the matrix formulation Bx = b of a given system of linear equations, with an arbitrary m × n coefficient matrix B, the (unknown) column x, and any m-dimensional (known) column vector b. Of course if B is a non-singular square matrix, then the solution of this system is simply x = B −1 b. But if m > n (or m < n), the system is “over-determined” (or “under-determined”) in the sense that there are more equations than unknowns (or more unknowns than equations), and the system could even be “inconsistent” in both cases. In the following, we provide the “optimal” solution of the (possibly inconsistent) system, by introducing the notion of “pseudo-inverse” B † of B based on the SVD of the matrix B. Definition 5 Pseudo-inverse Let B be an m × n matrix of real or complex numbers and B = U SV ∗ be its SVD, with unitary matrices U, V and S as given by (3.1.6) on p.117 with diagonal sub-block r = diag{σ1 , . . . , σr }. Set S by r−1 = diag{σ1−1 , . . . , σr−1 } and define the n × m matrix  ⎡

⎤ .. . O⎥  · · · · · · ···⎥ . S=⎢ ⎣ ⎦ .. O . O n×m −1 ⎢ r

Then the n × m matrix

B† = V  SU ∗

(3.3.1)

is called the pseudo-inverse of the given matrix B. As in the above definition, the subscript n × m of the matrix of  S is, and will be, used to indicate the matrix dimension. Observe that ⎤ ⎡ .. . O I ⎥ ⎢ r ⎥ U∗ B B † = (U SV ∗ )(V S + U ∗ ) = U ⎢ ⎣··· ··· ···⎦ . O .. O m×m

and



⎤ .. ⎢ Ir . O ⎥ ⎥ B † B = (V S + U ∗ )(U SV ∗ ) = V ⎢ V∗ ⎣··· ··· ···⎦ . O .. O n×n

are m ×m and n×n square matrices, respectively, with r ×r identity matrix sub-block Ir , where r = rank(B). Hence, for non-singular square matrices, the pseudo-inverse

148

3 Spectral Methods and Applications

agrees with the inverse matrix. For this reason, the pseudo-inverse is also called generalized inverse. To “solve” the following (possibly over-determined, under-determined, or even inconsistent) system of linear equations Bx = b,

(3.3.2)

(to be called a “linear system” for convenience), where B is an m × n coefficient matrix and b an m-dimensional (known) column vector, we may apply the pseudoinverse B † of B to b to formulate the vector: x♦ = B † b,

(3.3.3)

and obtain the following result, where for a vector y ∈ Cn , its norm y denotes the Euclidean (or n2 ) norm of y. Theorem 8 Pseudo-inverse provides optimal solution For the linear system (3.3.2) with coefficient matrix B ∈ Cm,n and (known) b ∈ Cm , the vector x♦ defined in (3.3.3) has the following properties: (i) for all x ∈ Cn ,

Bx♦ − b ≤ Bx − b ;

(ii) the linear system (3.3.2), with unknown x, has a solution if and only if the pseudo-inverse B † of B satisfies the condition: B B † b = b, namely, x = x♦ is a solution; (iii) if (3.3.2) has a solution, then the general solution of (3.3.2) x ∈ Cn is given by x = x♦ + (In − B † B)w, for all w ∈ Cn ; (iv) if (3.3.2) has a solution, then among all solutions, x♦ is the unique solution with the minimal Euclidean norm, namely:

x♦ ≤ x

for any solution x of (3.3.2); and (v) if (3.3.2) has a unique solution, then rank(B) = n. The above statements remain valid, when Cm,n , Cn , Cm are replaced by Rm,n ,Rn ,Rm , respectively. To prove the above theorem, we need the following properties of the pseudoinverse. Theorem 9 Properties of pseudo-inverse Let B ∈ Cn,m or B ∈ Rn,m . Then

3.3 Data Approximation

(i) (ii) (iii) (iv)

149

(B B † )∗ = B B † ; (B † B)∗ = B † B; B B † B = B; and B† B B† = B†.

Furthermore, B † as defined by (3.3.1), is the only n × m matrix that satisfies the above conditions (i)–(iv). Proof Derivation of the properties (i)–(iv) of B † is left as exercises (see Exercise 1). To show that B † is unique, let A ∈ Cm,n satisfy (i)–(iv); that is, (i) (ii) (iii) (iv)

(B A)∗ = B A; (AB)∗ = AB; B AB = B; and AB A = A.

In view of the definition B † = V  SU ∗ of B † in (3.3.1), we introduce four matrix sub-blocks A11 , A12 , A21 , A22 of dimensions r × r, r × (n − r ), (m − r ) × r, (m − r ) × (n − r ), respectively, defined by ⎤ .. . A A 11 12 ⎥ ⎢ ⎥ V ∗ AU = ⎢ ⎣ · · · · · · · · · ⎦. .. . A A ⎡

21

22

Then by the SVD formulation B = U SV ∗ of the given matrix B, it follows from the assumption B AB = B in (iii) that (U ∗ BV )(V ∗ AU )(U ∗ BV ) = U ∗ (B AB)V = U ∗ BV. Hence, by the definition (3.1.5) of S on p.117, we have

r O O O



A11 A12 A21 A22



r O O O



r O = , O O

which is equivalent to r A11 r = r . This yields A11 = r−1 . By applying the assumptions (i) and (ii) on A above, respectively, it can be shown that A12 = O and A21 = O (see Exercises 2 and 3). Finally, by applying these results along with the assumption AB A = A in (iv), a similar derivation yields A22 = O (see Exercise 4). Hence,

−1 A11 A12 r O ∗ U =V U ∗ = B†. A=V A21 A22 O O This completes the proof of the theorem.



Proof of Theorem 1 To prove Theorem 1, write Bx−b=(Bx−Bx♦ )+(Bx♦ −b), and observe that the two vectors Bx − Bx♦ and Bx♦ − b are orthogonal to each

150

3 Spectral Methods and Applications

other. The reason is that (Bx♦ − b)∗ (Bx − Bx♦ ) = (B B † b − b)∗ B(x − x♦ )    = b∗ (B B † )∗ − I B(x − x♦ ) = b∗ B B † B − B)(x − x♦ ) = 0, where the last two equalities follow from (i) and (iii) of Theorem 2, respectively. Thus, it follows from the Pythagorean theorem that

Bx − b 2 = Bx − Bx♦ 2 + Bx♦ − b 2 ≥ Bx♦ − b 2 , establishing statement (i) in Theorem 1. Statement (ii) follows immediately from statement (i), since if the system (3.3.2) has some solution, then x♦ in (3.3.1) is also a solution of (3.3.2). To derive statement (iii), we first observe, in view of B B † B = B in (iii) of Theorem 2, that for any w ∈ Cn , the vector x = x♦ + (In − B † B)w is a solution of (3.3.2), since (3.3.2) has a solution, namely, x♦ . On the other hand, suppose that x is a solution. Then by setting w = x − x♦ , we have Bw = 0, so that x♦ + (In − B † B)w = x♦ + w − B † Bw = x♦ + w = x; which establishes statement (iii). To prove statement (iv), we apply (ii) and (iv) of Theorem 2 to show that in statement (iii) of the theorem, the vector x♦ is orthogonal to (In − B † B)w, namely:     ∗  (In − B † B)w x♦ = w∗ In − (B † B)∗ B † b = w∗ B † − B † B B † b = 0. Hence, since every solution x can be written as x = x♦ + (In − B † B)w, we may apply the Pythagorean theorem to conclude that

x 2 = x♦ + (In − B † B)w 2 = x♦ 2 + (In − B † B)w 2 ≥ x♦ 2 . Thus, x♦ is the unique solution of (3.3.2) with minimal norm. Finally, to see that statement (v) holds, we simply observe that for the solution x♦ of (3.3.2) to be unique, the matrix (In − B † B) in the general solution (in statement (iii)) must be the zero matrix; that is, B † B = In or B † is the right inverse of B, so that rank(B) = n.  Example 3 As a continuation of Example 2 on p.118 in Sect. 3.1, compute the pseudo-inverse of the matrix

0 1 0 B= . 1 0 −1

3.3 Data Approximation

151

Solution Recall from Example 2 in Sect. 3.1 that the SVD of B is given by ⎤ ⎡

√12 0 − √12

√ 01 200 ⎢ ⎥ B = U SV ∗ = ⎣ 0 1 0 ⎦. 10 0 10 1 1 √ 0 √

2

2

Hence, by the definition (3.3.1) of the pseudo-inverse B † of B, we have ⎡

⎤⎡ ⎡ 1 ⎤ ⎤ 0 √1 √1 0

0 2 2 ⎥⎣ 2 ⎦ 0 1 1 0 ⎦ 0 1 = ⎣1 0 ⎦. 10 0 − 21 0 √1 0 0 2

√1 2

⎢ B† = V  SU ∗ = ⎣ 0 − √1

2

 Example 4 Write the system of linear equations #

x2 = 3, x1 − x3 = −1,

in matrix formulation Bx = b with x = [x1 , x2 , x3 ]T and b = [3, −1]T . Apply the result from Example 1 to obtain the solution x♦ = B † b and verify that ||x♦ || ≤ ||x|| for all solutions x of the linear system. Solution Since the coefficient matrix B is the matrix in Example 1, we may apply B † computed above to obtain the solution ⎡ 1⎤ ⎤

−2 0 21 3 = ⎣ 3 ⎦. x♦ = B † b = ⎣ 1 0 ⎦ −1 1 0 − 21 2 ⎡

Furthermore, it is clear that the general solution of the system is x = [a − 1, 3, a]T , for any real number a, so that ||x||2 = (a − 1)2 + 32 + a 2 = a 2 − 2a + 1 + 9 + a 2   = 2(a 2 − a) + 10 = 2 a 2 − a + 41 + 10 − 2    2  1 2  = 2 a − 21 + 32 + −1 + 2 2  2 = 2 a − 21 + ||x♦ ||2 ≥ ||x♦ ||2 ,

2 4

152

3 Spectral Methods and Applications

with ||x|| = ||x♦ || if and only if a = 21 , or x = x♦ .



Example 5 Consider the inconsistent system of linear equations ⎧ ⎪ ⎨x2 = 1, x1 = −1, ⎪ ⎩ −x2 = 1, with matrix formulation Bx = b, where ⎤ ⎡ ⎤ ⎡ 0 1 1 B = ⎣ 1 0 ⎦ and b = ⎣ −1 ⎦ . 0 −1 1 What would be a “reasonable” solution of the system? Solution From Example 1 (see Example 2 in Sect. 3.1), since the coefficient matrix B is the transpose of the matrix B in Example 1, we have the SVD, B = U SV ∗ with ⎡

√1 2

⎢ U =⎣ 0 − √1

2

⎤ 0 √1 2 ⎥ 1 0 ⎦, 0 √1 2

⎡√

2 01 V = , S=⎣ 0 10 0

⎤ 0 1⎦. 0

Hence, the pseudo-inverse B † of B is given by

˜ ∗= 01 B † = V SU 10



√1 2

00 0 10





⎤ 0 − √1 2 ⎢ ⎥ ⎣ 0 1 0 ⎦; √1 0 √1 √1 2

2

2

so that

01 x♦ = B † b = 10



√1 2

00 0 10

 ⎡ 0 ⎤





0 −1 ⎣ −1 ⎦ = 0 1 = . √ 10 −1 0 2

That is, x1 = −1 and x2 = 0 is the “reasonable” solution of the inconsistent system. Observe that the average of the inconsistency x2 = 1 and −x2 = 1 is the “ reasonable”  solution x2 = 0. Next, we apply Theorem 1 to study the problem of least-squares estimation.

3.3 Data Approximation

153

Let V be an inner-product space over the scalar field C or R and Sn = {v1 , . . . , vn } be a (possibly linearly dependent) set of vectors in V with W = span {v1 , . . . , vn }. Since the cardinality n of the set Sn can be very large, to find a satisfactory representation of an arbitrarily given v ∈ V from W, it is often feasible to acquire only a subset of measurements v, v for  ∈ {n 1 , . . . , n m } ⊂ {1, . . . , n}. Let b = [b1 , . . . , bm ]T = [v, vn 1 , . . . , v, vn m ]T

(3.3.4)

be the data vector in Cm associated with v (where m ≤ n). The least-squares estimation problem is to identify the “best” approximants w=

n 

xjvj ∈ W

j=1

of nthe vector v, based only on the measurement b in (3.3.4). Now, since w, vn  = j=1 v j , vn j x j is supposed to “match” the data component v, vn  = b for  = 1, . . . , m, we consider the system of linear equations n  v j , vn  x j = b = v, vn  ,  = 1, . . . , m,

(3.3.5)

j=1

or in matrix formulation, Bx = b,

(3.3.6)

  B = v j , vn  ,

(3.3.7)

where x = [x1 , . . . , xn ]T and

with 1 ≤  ≤ m and 1 ≤ j ≤ n, is the m × n coefficient matrix. Therefore, by Theorem 1, the “solution” to (3.3.6) is given by x♦ = B † b (where B † is the pseudo-inverse of B) in that ||Bx♦ − b|| ≤ ||Bx − b|| for all x ∈ Cn (or x ∈ Rn ) and that ||x♦ || ≤ ||y|| if ||By − b|| = ||Bx♦ − b||. Of course, by setting x♦ = (x1♦ , . . . , xn♦ ), the (unique) optimal minimum-norm least-squares representation of v ∈ V is given by

154

3 Spectral Methods and Applications n 

x♦ j vj.

(3.3.8)

j=1

Remark 1 Matching the inner product  with the data vector v in (3.3.5) is a consequence of the variational method, when j x j v j is required to be the best approxi mation of v in the Euclidean (or 2 ) norm. Indeed, for the quantity ||v − j x j v j ||2 to be the smallest for all choices of coefficients x1 , . . . , xn , the partial derivatives with respect to each x1 , . . . , xn must be zero. For convenience, we only consider the real setting, so that for each  = 1, . . . , n, 0= =

∂ ∂x ||v ∂ ∂x



 j

x j v j ||2

     ||v||2 − 2 nj=1 x j v, v j + nj=1 nk=1 x j xk v j , vk

= −2v, v + or

n j=1

x j v j , v +

n 

n

k=1 x k v , vk ,

x j v j , v = v, v ,

(3.3.9)

j=1

which is (3.3.5), when n  is replaced by  (see Exercise 9).  Remark 2 For computational efficiency and stability, the coefficient matrix B in (3.3.6) should be “sparse” by choosing locally supported vectors (or functions) v j in V. For example, when piecewise polynomials (or splines) are used, it is best to use B-splines. In particular, when piecewise linear polynomials with equally spaced continuous “turning points” (called simple knots) are considered, then the linear B-splines are “hat” functions and the full matrix Bh = [v j , vk ], for 1 ≤ j, k ≤ n, is the banded square matrix ⎡

2 1 0 0 0 ⎢1 4 1 0 0 ⎢ ⎢0 1 4 1 0 1 ⎢ ⎢ ··· ··· ··· ··· Bh = 6h ⎢ ⎢0 0 0 0 0 ⎢ ⎣0 0 0 0 0 0 0 0 0 0

··· 0 0 0 ··· 0 0 0 ··· 0 0 0 ··· ··· ··· ··· ··· 0 0 0 ··· 0 1 4 ··· 0 0 1

⎤ 0 0⎥ ⎥ 0⎥ ⎥ ⎥, ⎥ 0⎥ ⎥ 1⎦ 2

(3.3.10)

where h > 0 is the distance between two adjacent knots (see Exercise 10). 

3.3 Data Approximation

155

Exercises Exercise 1 Show that the matrix B † introduced in (3.4.1) satisfies (i)–(iv) in Theorem 2. Exercise 2 In the proof of Theorem 2, show that the matrix sub-block A12 = O by applying the assumption (B A)∗ = B A in (i). Exercise 3 In the proof of Theorem 2, show that the matrix sub-block A21 = O by applying the assumption (AB)∗ = AB in (ii). Exercise 4 In the proof of Theorem 2, show that the matrix sub-block A22 = O by applying the assumption AB A = A in (iv). Exercise 5 Compute the pseudo-inverse of each of the following matrices:

102 (a) A1 = , 010 ⎡ ⎤ −1 0 (b) A2 = ⎣ 0 2 ⎦ , 1 1 ⎡ ⎤ −1 1 0 (c) A3 = ⎣ 1 0 1 ⎦ . 0 11 Exercise 6 Apply the results from Exercise 5 to find the general solutions of the following systems of linear equations. If there is more than one solution, then identify the solution with minimum 2 -norm. # x1 + x3 = 3, (a) x2 = −1. ⎧ ⎪ ⎨−x1 = 2, (b) 2x2 = 4, ⎪ ⎩ x + x2 = 0. ⎧ 1 ⎪ ⎨−x1 + x2 = 0, (c) x1 + x3 = 5, ⎪ ⎩ x2 + x3 = 5. Exercise 7 Apply the results from Exercise 5 to “solve” each of the following inconsistent systems of linear equations. Why is your “solution” reasonable? ⎧ ⎪ ⎨−x1 = 2, (a) 2x2 = 4, ⎪ ⎩ x1 + x2 = 1.

156

3 Spectral Methods and Applications

⎧ ⎪ ⎨−x1 + x2 = 0, (b) x1 + x3 = 5, ⎪ ⎩ x2 + x3 = 1. Exercise 8 For any real number a, we introduce the notation a+ = max(a, 0). Consider the function B(x) = (1−|x|)+ and its integer translates B j (x) = B(x − j). Show that {B0 (x), . . . , Bn (x)} constitutes a basis of the vector space S1 [0, n] of continuous piecewise linear polynomials on [0, n] with break-points (called knots) at the integers 1, . . . , n − 1, by showing that (a) every f ∈ S1 [0, 1] has the formulation f (x) =

n 

f (k)B j (x);

k=0

(b) {B0 (x), . . . , Bn (x)} is linearly independent. Exercise 9 As a continuation of Exercise 8, show that for any function g ∈ L 2 [0, n], min

x1 ,...,xn ∈R

n  ( ( (g(x) − x j B j (x)(

2

j=1

is achieved by Sg (x) =

n 

x j B j (x),

j=1

where (x0 , . . . , xn ) is the solution of the system of linear equations  (2 ∂ ( (g(x) − x j B j (x)(2 = 0, ∂x n

j=1

for  = 0, . . . , n; or equivalently, ⎤ ⎡ ⎤ g, B0 x0 ⎥ ⎢ ⎥ ⎢ B ⎣ ... ⎦ = ⎣ ... ⎦ , ⎡

xn where

g, Bn



B = b j , bk

0≤ j, k≤n

,

and ·, · denotes the inner product of L 2 [0, n]. Exercise 10 Compute the matrix B in the above Exercise 9.

3.3 Data Approximation

157

Exercise 11 Derive the formulation of the coefficient matrix Bh for v j (x) = B(hx − j + 1) for j = 1, . . . , n + 1 in (3.3.10) on p.154.

3.4 Data Dimensionality Reduction The notion of principal components, introduced in Definition 2 on p.126 in Sect. 3.1, is instrumental to the study of data analysis. As mentioned in Remark 4 on p.128, the principal components provide a new coordinate system for the given data with the “hierarchal structure” in the decreasing order of the singular values for the data matrix. When applied to such problems as data management and data dimensionality reduction, current linear methods based on this data-dependent coordinate system are often collectively called “PCA”, which stands for “principal component analysis". On one hand, it is at least intuitively clear that effective data management facilitates the data reduction process, with the goal of eliminating the somewhat irrelevant data to reduce the data volume. In laymen’s terms, perhaps “finding a needle in a haystack” pretty much describes the essence of “data mining”. On the other hand, the notion of “data dimension” is certainly not a household word. For this reason, it is hoped that the following discussion will shed some light for understanding this important topic. When the dimension must be very high to represent the “data geometry”, reduction of the data dimension is necessary (as long as the data geometry is not lost), for such applications as data feature extraction, data understanding, and data visualization. So what is data dimension? Perhaps the most simple example to understand is color digital images, with data dimension equal to 3, for the primary color components R (red), G (green) and B (blue). Since human vision is “trichromatic”, meaning that our retina contains three types of color cone cells with different absorption spectra, the three primary color components are combined, in various ratios, to yield a wide range of colors for human vision. However, with the rapid technological advancement in high-quality digital image and video display, more accurate color profiles for consistent imaging workflow require significantly more sophisticated color calibration, by using spectro-colorimeters that take narrow-band measurements, even below 10 nm (nano meter) increments. Hence, for the visible light (electromagnetic radiation, EMR) range of 400–700 nm, even a 10 nm increment requires 31 readings, a 10-fold increase over the 3 RGB readings. In other words, the “spectral curve” dimension for every single image pixel goes up from dimension 3 (for RGB) to dimension 31, and even higher if sub 10 nm increments are preferred. For many applications, including medical imaging, homeland security screening, satellite imaging for agriculture control and warfare, and so forth, the EMR range well beyond visible light is used. For example, a typical hyperspectral image (HSI), with 5 nm incremental reading on the EMR range between 320 nm (for ultraviolet) to 1600 nm (for infrared), already yields spectral curves in R257 . Therefore, to facilitate computational efficiency, memory usage, data understanding and visualization, it is often necessary to reduce the data dimension while preserving data similarities (and dis-similarities), data geom-

158

3 Spectral Methods and Applications

etry, and data topology. In this elementary writing, we only discuss the linear PCA approach to dimensionality reduction. Let B denote a dataset of m vectors b1 , . . . , bm in Cn . The objective for the problem of dimensionality reduction is to reduce n for the purpose of facilitating data understanding, analysis and visualization, without loss of the essential data information, such as data geometry and topology. For convenience, the notation for the set B is also used for the matrix B ∈ Cm,n , called data-matrix, with row vectors: bTj = [b j,1 , . . . , b j,n ], or column vectors b j , where j = 1, . . . , m; that is, ⎡

⎤ b1T ⎢ ⎥ B = ⎣ ... ⎦ = [b1 · · · bm ]T . T bm Observe that the inner product of the j th and k th rows of B, defined by b j , bk =

n 

b j, bk, ,

(3.4.1)

=1

reveals the “correlation” of the data b j and bk in terms of the ratio of its magnitude with respect to the product of their norms. √ Recall from the Cauchy-Schwarz inequality (1.4.4) on p.38 that, with · = ·, · , |b j , bk | ≤ b j bk , and equality holds, if and only if bk is a constant multiple of b j . Hence, b j and bk are a good “match” of each other if the ratio |b j , bk |

b j bk

(3.4.2)

(which is between 0 and 1) is close to 1, and a poor “match”, if the ratio in (3.4.2) is close to 0. Since the inner product b j , bk is the ( j, k)th entry of the m × m square matrix A = B B ∗ , called the Gram matrix of B, as defined by (3.1.1) on p.116, the Gram matrix of a dataset is often used to process the data. On the other hand, even if b j is only a minor “shift” of bk , the ratio in (3.4.2) could be much smaller than 1, which is achieved when b j = bk (that is, without any shift). Hence, to provide a better measurement of data correlation, all data vectors are shifted by their average, as follows. For b1 , . . . , bm ∈ Cn , their average is defined by


$$\mathbf{b}^{av} = \frac{1}{m}\sum_{j=1}^{m} \mathbf{b}_j, \qquad (3.4.3)$$
and the centered data-matrix $\widetilde{B}$, associated with the given data-matrix $B$, is defined by
$$\widetilde{B} = \big[\widetilde{\mathbf{b}}_1, \ldots, \widetilde{\mathbf{b}}_m\big]^T = \big[\mathbf{b}_1 - \mathbf{b}^{av}, \cdots, \mathbf{b}_m - \mathbf{b}^{av}\big]^T. \qquad (3.4.4)$$
We remark that the matrix
$$\frac{1}{m}\,\widetilde{B}\,(\widetilde{B})^{*} \qquad (3.4.5)$$
is the covariance matrix of the dataset $B$, if $\mathbf{b}_1, \ldots, \mathbf{b}_m$ are observations of an $n$-dimensional random variable. Clearly, both Gram matrices $\widetilde{B}(\widetilde{B})^*$ and $BB^*$ are spd matrices. Again, when the same notation $B$ of the dataset is used for the data-matrix $B = [\mathbf{b}_1 \cdots \mathbf{b}_m]^T$, the principal components of $B$, as introduced in Definition 2 on p.126, provide a new "coordinate system", with origin at $\mathbf{b}^{av}$ as defined in (3.4.3), which facilitates the analysis of the data $B$. All linear methods based on this data-dependent coordinate system are collectively called methods of principal component analysis (PCA). We remark that for PCA, because
$$\sum_{j=1}^{m} \widetilde{\mathbf{b}}_j = \sum_{j=1}^{m} (\mathbf{b}_j - \mathbf{b}^{av}) = \mathbf{0},$$
both the centered matrix $\widetilde{B}$ and its Gram matrix $\widetilde{B}(\widetilde{B})^*$ are centered at $\mathbf{0}$, in the sense that the sum of all row vectors (of $\widetilde{B}$, and also of $\widetilde{B}(\widetilde{B})^*$) is the zero vector $\mathbf{0}$. In addition, the geometry and topology of the dataset $B$ are unchanged when $B$ is replaced by $\widetilde{B}$, since for all $j, k = 1, \ldots, m$,
$$\widetilde{\mathbf{b}}_j - \widetilde{\mathbf{b}}_k = \mathbf{b}_j - \mathbf{b}_k. \qquad (3.4.6)$$

In view of these nice properties of the centered matrix, when we say that PCA is applied to $B$, what we mean is that PCA is applied to the centered matrix $\widetilde{B}$ defined in (3.4.4). To discuss PCA for data dimensionality reduction, let us first introduce the notion of "orthogonal rectangular matrices", as follows.

Definition 1 Orthogonal rectangular matrices Let $1 \le d \le q$ be integers. The notation $\mathcal{O}_{q,d}$ is used for the collection of all $q \times d$ (complex) matrices $W = [\mathbf{w}_1 \cdots \mathbf{w}_d]$ with orthonormal column vectors; that is, $W^*W = I_d$, or $\langle \mathbf{w}_j, \mathbf{w}_k\rangle = \delta_{j-k}$, for all $1 \le j, k \le d$.


For any $W = [\mathbf{w}_1 \cdots \mathbf{w}_d] \in \mathcal{O}_{n,d}$, its column vectors constitute an orthonormal basis of its algebraic span, $\mathrm{span}\,W = \mathrm{span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_d\}$, which is a $d$-dimensional subspace of $\mathbb{C}^n$, so that any $\mathbf{w} \in \mathrm{span}\,W$ has a unique representation
$$\mathbf{w} = \sum_{j=1}^{d} c_j \mathbf{w}_j = W \begin{bmatrix} c_1 \\ \vdots \\ c_d \end{bmatrix},$$
for some $c_j \in \mathbb{C}$. The components of the vector $[c_1 \cdots c_d]^T$ will be called the "coordinates" of the vector $\mathbf{w}$, in the "coordinate system" with coordinate axes determined by the unit vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_d$. In this book, all coordinate systems are restricted to orthogonal systems, with orthonormal vectors as unit vectors for the coordinate axes.

We are now ready to study the topic of PCA-based dimensionality reduction. For the dataset $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\} \subset \mathbb{C}^n$, consider the SVD of the data-matrix
$$B = [\mathbf{b}_1 \cdots \mathbf{b}_m]^T = USV^*, \qquad (3.4.7)$$
for some unitary matrices $U \in \mathbb{C}^{m,m}$ and $V \in \mathbb{C}^{n,n}$ and an $m \times n$ matrix $S$, given by
$$S = \begin{bmatrix} \Sigma_r & O \\ O & O \end{bmatrix},$$
with $\Sigma_r = \mathrm{diag}\{\sigma_1, \ldots, \sigma_r\}$ and $\sigma_1 \ge \cdots \ge \sigma_r > 0$, where $r = \mathrm{rank}(B)$. Write $U, V$ as $U = [\mathbf{u}_1 \cdots \mathbf{u}_m]$, $V = [\mathbf{v}_1 \cdots \mathbf{v}_n]$, where $\mathbf{u}_j, \mathbf{v}_j$ are column vectors. Then the dimensionality reduction problem is to reduce $r = \mathrm{rank}(B)$ to an arbitrarily chosen integer $d$, with $1 \le d < r$. The PCA approach is to consider the truncated matrices
$$U_d = [\mathbf{u}_1 \cdots \mathbf{u}_d], \quad V_d = [\mathbf{v}_1 \cdots \mathbf{v}_d], \quad \Sigma_d = \mathrm{diag}\{\sigma_1, \ldots, \sigma_d\}. \qquad (3.4.8)$$
Then, in view of the SVD $B = USV^*$ in (3.4.7), or equivalently $BV = US$, we consider the $m \times d$ matrix representation $BV_d = U_d\Sigma_d$,


and apply this $m \times d$ matrix $Y_d = BV_d = U_d\Sigma_d$ to introduce the notion of dimension-reduced data.

Definition 2 Dimension-reduced data Let $0 \le d < r$. The column vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$ of the matrix $Y_d^T = (BV_d)^T = (U_d\Sigma_d)^T$; or equivalently,
$$\begin{bmatrix} \mathbf{y}_1^T \\ \vdots \\ \mathbf{y}_m^T \end{bmatrix} = Y_d = BV_d = U_d\Sigma_d, \qquad (3.4.9)$$
are said to constitute the dimension-reduced data of the given dataset $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\} \subset \mathbb{C}^n$.

The main result on dimensionality reduction in this chapter is the following theorem, which states that the dimension-reduced dataset $\mathbf{y}_1, \ldots, \mathbf{y}_m$ of the given data $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\} \subset \mathbb{C}^n$ is the optimal choice, in the sense that, when measured in the "coordinate system" $\mathbf{v}_1, \ldots, \mathbf{v}_d$ in $\mathbb{C}^n$, this set provides the best $\ell^2$-approximation of $B$ among all possible choices $\mathbf{q}_1, \ldots, \mathbf{q}_m \in \mathbb{C}^d$ and all possible "coordinate systems" $W_d$ of $\mathbb{C}^d$.

Theorem 1 Dimension-reduced data are optimal Let $B = [\mathbf{b}_1 \cdots \mathbf{b}_m]^T$ be an arbitrary $m \times n$ data-matrix with SVD representation given by (3.4.7). Then the dimension-reduced data $\mathbf{y}_1, \ldots, \mathbf{y}_m$ of $B$, defined by (3.4.9), lie in a $d$-dimensional subspace of $\mathbb{C}^n$ and satisfy the following best approximation property:
$$\sum_{j=1}^{m} \|V_d\mathbf{y}_j - \mathbf{b}_j\|^2 \le \sum_{j=1}^{m} \|W_d\mathbf{q}_j - \mathbf{b}_j\|^2, \qquad (3.4.10)$$
for all $W_d = [\mathbf{w}_1 \cdots \mathbf{w}_d] \in \mathcal{O}_{n,d}$ and all $\{\mathbf{q}_1, \ldots, \mathbf{q}_m\} \subset \mathbb{C}^d$, where $V_d, \Sigma_d$ are defined by (3.4.8), with $V_d \in \mathcal{O}_{n,d}$.

In (3.4.10), for each $j = 1, \ldots, m$, the vector $V_d\mathbf{y}_j$ lies in a $d$-dimensional subspace of $\mathbb{C}^n$ with basis $\mathbf{v}_1, \ldots, \mathbf{v}_d$, and the set $V_d\mathbf{y}_1, \ldots, V_d\mathbf{y}_m$ is an approximation of the given dataset $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$. Observe that $\mathbf{y}_1, \ldots, \mathbf{y}_m$ are the coordinates of the approximants $V_d\mathbf{y}_1, \ldots, V_d\mathbf{y}_m$ in the coordinate system $\mathbf{v}_1, \ldots, \mathbf{v}_d$. Hence, the inequality (3.4.10) guarantees that the first $d$ principal components $\mathbf{v}_1, \ldots, \mathbf{v}_d$ of $B$ provide the best coordinate system (in hierarchical order) for optimal dimensionality reduction of the dataset $B \subset \mathbb{C}^n$ to a $d$-dimensional subspace, for any choice of dimension $d < n$. In particular, to reduce $B$ to a 1-dimensional subspace, the first principal component $\mathbf{v}_1$ of $B$ should be used as the generator of this subspace for computing and representing the best 1-dimensional reduced data; to reduce $B$ to a 2-dimensional subspace, the first and second principal components $\mathbf{v}_1, \mathbf{v}_2$ of $B$ should be used for computing and representing the best 2-dimensional reduced data; and so forth. In general, to reduce the dimension of a given dataset $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\} \subset \mathbb{C}^n$


to a $d$-dimensional subspace of $\mathbb{C}^n$, for any $d < n$, the best replacement of $B$ is given by
$$\begin{bmatrix} (V_d\mathbf{y}_1)^T \\ \vdots \\ (V_d\mathbf{y}_m)^T \end{bmatrix} = Y_dV_d^* = U_d\Sigma_dV_d^* = BV_dV_d^*. \qquad (3.4.11)$$

Proof of Theorem 1 In view of (3.4.9) and the above dimension-reduced data representation formula (3.4.11), we may apply the formulation of the best rank-$d$ approximation $B^{(d)}$ of $B$ from Sect. 3.2 to write
$$\begin{bmatrix} (V_d\mathbf{y}_1)^T \\ \vdots \\ (V_d\mathbf{y}_m)^T \end{bmatrix} - B = Y_dV_d^* - B = [\mathbf{u}_1 \cdots \mathbf{u}_d]\,\Sigma_d\,[\mathbf{v}_1 \cdots \mathbf{v}_d]^* - B = B^{(d)} - B.$$
Hence, since the left-hand side of (3.4.10) can be written as $\sum_{j=1}^{m}\|\mathbf{y}_j^TV_d^* - \mathbf{b}_j^T\|^2$, which is precisely the square of the Frobenius norm of $Y_dV_d^* - B$ (see (3.2.8) on p.138), we have
$$\sum_{j=1}^{m} \|V_d\mathbf{y}_j - \mathbf{b}_j\|^2 = \|B^{(d)} - B\|_F^2.$$
Let $Q$ be the $m \times d$ matrix with $j$th row given by $\mathbf{q}_j^T$ for $1 \le j \le m$. Then the right-hand side of (3.4.10) can also be written as
$$\sum_{j=1}^{m} \|W_d\mathbf{q}_j - \mathbf{b}_j\|^2 = \sum_{j=1}^{m} \|\mathbf{q}_j^TW_d^T - \mathbf{b}_j^T\|^2 = \|R - B\|_F^2,$$
where
$$R = QW_d^T. \qquad (3.4.12)$$
Since $W_d \in \mathcal{O}_{n,d}$, the rank of the matrix $R$ in (3.4.12) does not exceed $d$. Therefore, the desired inequality (3.4.10) follows from the low-rank approximation result in Theorem 7 on p.141. Furthermore, again by this theorem, (the square of) the error of the dimensionality reduction from $B \subset \mathbb{C}^n$ to $\mathbf{y}_1, \ldots, \mathbf{y}_m$ is given by
$$\sum_{j=1}^{m} \|V_d\mathbf{y}_j - \mathbf{b}_j\|^2 = \|B^{(d)} - B\|_F^2 = \sum_{j=d+1}^{r} \sigma_j^2. \qquad (3.4.13)$$




Remark 1 The following comments should facilitate a better understanding of the above PCA-based data reduction definition and result.

(i) Theorem 1 remains valid if the SVD of $B$ in (3.4.7) is replaced by the reduced SVD
$$B = U_r\Sigma_rV_r^*, \qquad (3.4.14)$$
where $U_r$ and $V_r$ are $m \times r$ and $n \times r$ matrices, respectively, with orthonormal columns (that is, $U_r^*U_r = I_r$, $V_r^*V_r = I_r$), $r = \mathrm{rank}(B)$, and $\Sigma_r$ is the diagonal matrix of the (non-zero) singular values $\sigma_1 \ge \cdots \ge \sigma_r > 0$ of $B$.

(ii) If $B$ has more rows than columns, it is more effective to compute the unitary matrix $V$ for the SVD $B = USV^*$ from the spectral decomposition of $B^*B = V\Lambda V^*$ (instead of $BB^*$). In this undertaking, we may obtain the $d$-dimensional reduced data $\mathbf{y}_1, \ldots, \mathbf{y}_m$ from $Y_d = BV_d$ and the approximation of $B$ by considering $BV_dV_d^*$ in (3.4.11).

(iii) Since the dataset $B = \{\mathbf{b}_1 \cdots \mathbf{b}_m\}$ in $\mathbb{C}^n$ is not centered in general, the dimension-reduced data $\widetilde{\mathbf{y}}_1, \ldots, \widetilde{\mathbf{y}}_m$ of the centered dataset $\widetilde{B}$ (defined by (3.4.4), where $\mathbf{b}^{av}$ is the average of $B$ given by (3.4.3)) should be computed first. Then the required dimension-reduced dataset for the given dataset $B$ is $V_d\widetilde{\mathbf{y}}_1 + \mathbf{b}^{av}, \ldots, V_d\widetilde{\mathbf{y}}_m + \mathbf{b}^{av}$, and the error of approximation of $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ can be formulated as
$$\Big\|\widetilde{B}V_dV_d^{*} + \begin{bmatrix} (\mathbf{b}^{av})^T \\ \vdots \\ (\mathbf{b}^{av})^T \end{bmatrix} - B\Big\|_F \quad\text{or}\quad \Big\|U_d\Sigma_dV_d^{*} + \begin{bmatrix} (\mathbf{b}^{av})^T \\ \vdots \\ (\mathbf{b}^{av})^T \end{bmatrix} - B\Big\|_F,$$
where $U_d, \Sigma_d, V_d$ are computed from the centered matrix $\widetilde{B}$. Observe that the dimension-reduced data lie in a $d$-dimensional subspace of $\mathbb{C}^n$, but of course, this subspace is usually not the same as $\mathbb{C}^d$.

(iv) In the above discussion, including (3.4.9) of the definition of dimension-reduced data and Theorem 1, we have only considered reduction of the data dimension $n$ of the dataset $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\} \subset \mathbb{C}^n$. Analogous arguments (with appropriate new notation and definitions) also apply to the problem of data reduction (in reducing the cardinality $m$ of $B$), simply by applying the duality formulation of the SVD theorem: $B^* = VS^TU^*$.

Example 1 Let $B = [\mathbf{b}_1\ \mathbf{b}_2\ \mathbf{b}_3]^T \subset \mathbb{R}^2$, where
$$\mathbf{b}_1 = \begin{bmatrix} 2.5 \\ 4.5 \end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix} 9 \\ 4 \end{bmatrix}, \quad \mathbf{b}_3 = \begin{bmatrix} -1 \\ -1 \end{bmatrix}.$$


Compute the dimension-reduced data $\{\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3\}$ of $B$, which lie in a 1-dimensional subspace of $\mathbb{R}^2$, by considering the dimension-reduced data $\{\widetilde{y}_1, \widetilde{y}_2, \widetilde{y}_3\}$ of the corresponding centered dataset $\widetilde{B} = [\widetilde{\mathbf{b}}_1\ \widetilde{\mathbf{b}}_2\ \widetilde{\mathbf{b}}_3]^T$ of $B$. In addition, compute the $2 \times 1$ (real) matrix transformation $V_1 = [\mathbf{v}_1]$ in (3.4.8), for which
$$\sum_{j=1}^{3} \|\widetilde{y}_j\mathbf{v}_1 - \widetilde{\mathbf{b}}_j\|^2 \le \sum_{j=1}^{3} \|q_j\mathbf{w}_1 - \widetilde{\mathbf{b}}_j\|^2$$
for any unit vector $\mathbf{w}_1 \in \mathbb{R}^2$ and all $q_1, q_2, q_3 \in \mathbb{R}$.

Solution Since the average $\mathbf{b}^{av}$ of $B$ is
$$\mathbf{b}^{av} = \begin{bmatrix} 3.5 \\ 2.5 \end{bmatrix},$$
we have
$$\widetilde{\mathbf{b}}_1 = \mathbf{b}_1 - \mathbf{b}^{av} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}, \quad \widetilde{\mathbf{b}}_2 = \mathbf{b}_2 - \mathbf{b}^{av} = \begin{bmatrix} 5.5 \\ 1.5 \end{bmatrix}, \quad \widetilde{\mathbf{b}}_3 = \mathbf{b}_3 - \mathbf{b}^{av} = \begin{bmatrix} -4.5 \\ -3.5 \end{bmatrix},$$
so that the centered dataset is given by
$$\widetilde{B} = \begin{bmatrix} \widetilde{\mathbf{b}}_1^T \\ \widetilde{\mathbf{b}}_2^T \\ \widetilde{\mathbf{b}}_3^T \end{bmatrix} = \begin{bmatrix} -1 & 2 \\ 5.5 & 1.5 \\ -4.5 & -3.5 \end{bmatrix}.$$
Since $\widetilde{B}$ has more rows than columns, we may consider the spectral decomposition of $\widetilde{B}^*\widetilde{B}$ (instead of $\widetilde{B}\widetilde{B}^*$) to obtain $V_1$ in the reduced SVD of $\widetilde{B}$, and apply $\widetilde{Y}_1 = \widetilde{B}V_1$ to obtain the reduced data for $\widetilde{B}$, as follows. By direct calculation, we have
$$\widetilde{B}^*\widetilde{B} = \begin{bmatrix} 51.5 & 22 \\ 22 & 18.5 \end{bmatrix},$$
with eigenvalues $125/2$ and $15/2$, and corresponding eigenvectors given by
$$\mathbf{v}_1 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix}.$$
Hence, it follows from (3.4.9) that


$$\widetilde{Y}_1 = \widetilde{B}V_1 = \widetilde{B}\mathbf{v}_1 = \begin{bmatrix} 0 \\ \tfrac{5\sqrt{5}}{2} \\ -\tfrac{5\sqrt{5}}{2} \end{bmatrix}.$$
That is, the dimension-reduced dataset for $\widetilde{B}$ is
$$\{\widetilde{y}_1, \widetilde{y}_2, \widetilde{y}_3\} = \Big\{0,\ \frac{5\sqrt{5}}{2},\ -\frac{5\sqrt{5}}{2}\Big\}.$$
To verify that this solution is correct, we may compute (the square of) its error, as follows:
$$\sum_{j=1}^{3} \|\widetilde{y}_j\mathbf{v}_1 - \widetilde{\mathbf{b}}_j\|^2 = \|\widetilde{y}_1\mathbf{v}_1 - \widetilde{\mathbf{b}}_1\|^2 + \|\widetilde{y}_2\mathbf{v}_1 - \widetilde{\mathbf{b}}_2\|^2 + \|\widetilde{y}_3\mathbf{v}_1 - \widetilde{\mathbf{b}}_3\|^2$$
$$= \Big\|\,0 - \begin{bmatrix} -1 \\ 2 \end{bmatrix}\Big\|^2 + \Big\|\begin{bmatrix} 5 \\ 2.5 \end{bmatrix} - \begin{bmatrix} 5.5 \\ 1.5 \end{bmatrix}\Big\|^2 + \Big\|\begin{bmatrix} -5 \\ -2.5 \end{bmatrix} - \begin{bmatrix} -4.5 \\ -3.5 \end{bmatrix}\Big\|^2$$
$$= (1^2 + 2^2) + (0.5^2 + 1^2) + (0.5^2 + 1^2) = 7.5 = 15/2 = \sigma_2^2,$$
as expected from (3.4.13) in the proof of Theorem 1. Hence, the dimension-reduced dataset $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3$ of $B$ can be obtained by adding $\mathbf{b}^{av}$ to $\widetilde{y}_1\mathbf{v}_1, \widetilde{y}_2\mathbf{v}_1, \widetilde{y}_3\mathbf{v}_1$, namely:
$$\mathbf{y}_1 = \begin{bmatrix} 3.5 \\ 2.5 \end{bmatrix}, \quad \mathbf{y}_2 = \begin{bmatrix} 8.5 \\ 5 \end{bmatrix}, \quad \mathbf{y}_3 = \begin{bmatrix} -1.5 \\ 0 \end{bmatrix}.$$
Observe that the three points $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3$ lie on a straight line that passes through the point $\mathbf{b}^{av} = (3.5, 2.5)$ with slope 0.5, and hence they are in a 1-dimensional space. The centered dataset $\widetilde{B}$ is shown in the top box of Fig. 3.1; the dimension-reduced dataset $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3$ of $B$ (represented by •), along with $B$, is shown in the bottom box of Fig. 3.1, while the first principal component $\mathbf{v}_1$ and the second principal component $\mathbf{v}_2$ of $B$ are also displayed in the bottom box of Fig. 3.1.

In the above discussion, the matrix $\widetilde{Y}_d = \widetilde{B}V_d$ is applied to obtain the dimension-reduced data. If the reduced SVD of $\widetilde{B}$ is applied, we may use $\widetilde{Y}_d = U_d\Sigma_d$ to obtain the reduced data instead. In the following, let us verify this fact. The singular values of $\widetilde{B}$ are computed by taking the square roots of the eigenvalues of $\widetilde{B}^*\widetilde{B}$, namely
$$\sigma_1 = \frac{5\sqrt{10}}{2}, \qquad \sigma_2 = \frac{\sqrt{30}}{2}.$$
The principal left eigenvectors $\mathbf{u}_1, \mathbf{u}_2$ can be obtained by applying (3.1.18) on p.120 with $r = 2$, as follows:
$$[\mathbf{u}_1\ \mathbf{u}_2] = \widetilde{B}\,[\mathbf{v}_1\ \mathbf{v}_2]\,\mathrm{diag}(\sigma_1^{-1}, \sigma_2^{-1}) = \begin{bmatrix} 0 & 2/\sqrt{6} \\ 1/\sqrt{2} & -1/\sqrt{6} \\ -1/\sqrt{2} & -1/\sqrt{6} \end{bmatrix},$$

3 Spectral Methods and Applications 3 2 1 0 −1 −2 −3 −4 −6

−4

−2

0

2

4

6

6

8

10

6 5 4

v2

3

v1

2 1 0 −1 −2

0

2

4

Fig. 3.1. Top Centered data (represented by ∗); Bottom original data (represented by ∗), dimensionreduced data (represented by •), and principal components v1 , v2

where the last equality is obtained by direct calculations. This enables us to compute ⎡ ⎤ ⎡ √ 0 5 10 ⎢ √1 ⎥ ⎢ Y1 = [u1 ][σ1 ] = ⎣ 2⎦=⎣ 2 √1 2

0

√ 5 5 2√ −525

which agrees with Y1 =  BV1 from the above computation.

⎤ ⎥ ⎦,



Example 2 Repeat Example 1 by considering the dataset matrix B = [b1 · · · b5 ]T with 5 data points in R3 and reduce the dimension of the set to a 2-dimensional plane in R3 , where

3.4 Data Dimensionality Reduction

167

4 3 2 1 0 −1 −2 −3 4 2

2 1

0

0

−2

−1 −4 −2

4 3 2 v

1 0

v

−1

1

2

−2 −3 4 2 0 −2 −4 −1.5

−1

−0.5

0

0.5

1

1.5

2

Fig. 3.2. Top centered data (represented by ∗); bottom dimension-reduced data (represented by •) with the first and second principal components and original data (represented by ∗)



⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 2 −1.2 0.4 2 −1.2 b1 = ⎣ 2.9 ⎦ , b2 = ⎣ 1 ⎦ , b3 = ⎣ 0.9 ⎦ , b4 = ⎣ 0.2 ⎦ , b5 = ⎣ −3.5 ⎦. 4 0.3 −0.4 −0.5 −2.4 Solution Since the average bav of B is bav = [0.4, 0.3, 0.2]T , b5 } is given by (see Fig. 3.2) the matrix  B of the centered dataset { b1 , . . . , 

168

3 Spectral Methods and Applications



⎤ 1.6 2.6 3.8 ⎢ −1.6 0.7 0.1 ⎥ ⎢ ⎥  0 0.6 −0.6 ⎥ B=⎢ ⎢ ⎥. ⎣ 1.6 −0.1 −0.7 ⎦ −1.6 −3.8 −2.6 To reduce computational cost, we compute the spectral decomposition of 3×3 matrix ⎡

⎤ 512 448 448 1 ⎣  448 1103 977 ⎦ B∗  B= 50 448 977 1103 B∗  B are 1152/25, 144/25, (instead of the 4 × 4 matrix  B B ∗ ). The eigenvalues of  and 63/25, with the corresponding eigenvectors v1 , v2 and v3 , given by √ ⎤ ⎤ ⎤ ⎡ ⎡ 0√ −4/(3√ 2) 1/3 v1 = ⎣ 2/3 ⎦ , v2 = ⎣ 1/(3√2) ⎦ , v3 = ⎣ 1/ √2 ⎦ . 2/3 −1/ 2 1/(3 2) ⎡

Thus, it follows from (3.4.9) and Theorem 1 that ⎡

⎤ 24/5 √0 ⎢ 0 6 2/5 ⎥ ⎢ ⎥ ⎢ ⎥.  0 Y2 = B[v1 v2 ] = ⎢ 0 ⎥ √ ⎣ 0 −6 2/5 ⎦ −24/5 0 Therefore the dimension-reduced data for  b1 , . . . ,  b5 are:  24  0 y1 y 0 y3 2 = 5 , = 6√2 , = , z1 z2 z 0 0 3 5 

 0√ y5 y4 − 24 5 ; = = , z4 z5 0 −652 so that the dimension-reduced data for the given dataset B are y j v1 + z j v2 + bav , for j = 1, . . . , 5; namely, ⎡

⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 2 −1.2 0.4 2 −1.2 ⎣ 3.5 ⎦ , ⎣ 0.7 ⎦, ⎣ 0.3 ⎦, ⎣ −0.1 ⎦, ⎣ −2.9 ⎦. 3.4 0.6 0.2 −0.2 −3

3.4 Data Dimensionality Reduction

169

These points lie on a 2-dimensional plane, generated by v1 and v2 , that passes through  the point bav = (0.4, 0.3, 0.2) as displayed in Fig. 3.2. Exercises Exercise 1 Consider the data points



2 1 0 1 b1 = , b2 = , b3 = , b4 = , −1 0 −1 1 b4 ]T be the matrix of the centered data  b j = b j − bav , and let  B = [ b1 · · ·  1 av 1 ≤ j ≤ 4, where b = 4 (b1 + · · · + b4 ). Compute the dimension-reduced data b1 ,  b2 ,  b3 ,  b4 }, and determine the PCA-based dimension{y1 , y2 , y3 , y4 } ⊂ R of { reduced data y1 , . . . , y4 of b1 , . . . , b4 ⊂ R2 from R2 to a 1-dimensional subspace. Hint: Follow the procedure in the solution of Example 1. Exercise 2 Repeat Exercise 1 by considering ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 0 −1 0 2 b1 = ⎣ 1 ⎦, b2 = ⎣ 3 ⎦, b3 = ⎣ 0 ⎦, b4 = ⎣ 3 ⎦. 2 4 1 −2 Exercise 3 As in Exercise 2, let  B = [ b1 · · ·  b4 ]T be the centered data matrix of the data points ⎡

⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ −1 −1 0 1 b1 = ⎣ 1 ⎦, b2 = ⎣ 2 ⎦, b3 = ⎣ 1 ⎦, b4 = ⎣ 0 ⎦. 1 3 0 1 Compute the dimension-reduced set { y1 , y2 , y3 , y4 } with  y j = [ y j , z j ]T ∈ R2 of b2 ,  b3 ,  b4 }, and determine the PCA-based dimension-reduced data {y1 , . . . , y4 } { b1 ,  of {b1 , . . . , b4 } ⊂ R3 from R3 to a 2-dimensional subspace. Hint: Follow the procedure in the solution of Example 2.

Chapter 4

Frequency-Domain Methods

Spectral methods based on the singular value decomposition (SVD) of the data matrix, as studied in the previous chapter, Chap. 3, apply to the physical (or spatial) domain of the data set. In this chapter, the concepts of frequency and frequency representation are introduced, and various frequency-domain methods along with efficient computational algorithms are derived. The root of these methods is the Fourier series representation of functions on a bounded interval, which is one of the most important topics in Applied Mathematics and will be investigated in some depth in Chap. 6. While the theory and methods of Fourier series require knowledge of mathematical analysis, the discrete version of the Fourier coefficients (of the Fourier series) is simply matrix multiplication of the data vector by some n ×n square matrix Fn . This matrix is called the discrete Fourier transformation (DFT), which has the important property that with the multiplicative factor of √1n , it becomes a unitary matrix, so that the inverse discrete Fourier transformation matrix (IDFT) is given by the n1 - multiple of the adjoint (that is, transpose of the complex conjugate) of Fn . This topic will be studied in Sect. 4.1. For the investigation of the frequency contents of n-dimensional real-valued data vectors, it is certainly more effective to apply some real-valued version of the DFT. In Sect. 4.2, the notion of discrete cosine transform (DCT), to be denoted by Cn , is introduced; and the most popular version, DCT-II, which is one of the four types of DCT’s, is derived from the DFT matrix, F2n , introduced in Sect. 4.1. The formulation of this DCT includes the √1n multiplicative factor, and is therefore an orthogonal (that is, a real-valued unitary) matrix, with the corresponding inverse DCT (IDCT) given by the transpose of Cn . The reason for the choice of DCT, as opposed to an analogous discrete sine transform (DST), for the transformation of data vectors from the spatial domain to the frequency domain, is that DCT trivially “encodes” the zero-frequency contents of the data vectors, while the computational cost for encoding the zerofrequency contents by DST is very high.

C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_4, © Atlantis Press and the authors 2013

171

172

4 Frequency-Domain Methods

Since the computational complexity of the n-point DFT is of order O(n 2 ), which is unacceptably high for large values of n, the notion of fast Fourier transform (FFT) is introduced in Sect. 4.3 via Lanczos’ matrix factorization formula for Fn provided that n is an even integer, in order to decrease the complexity count. This will be Theorem 1 of the section. Consequently, a full factorization formula of the DFT, Fn for n = 2m , can be established and will be stated in Theorem 2, to reduce the computation count of the DFT, F2m , to O(nm) = O(n log n) arithmetic operations. Of course this count is meaningless for small values of n > 0. For this reason, precise counts are derived for n = 4, 8, 16 in this section also. The reason for inclusion of F16 is that it is used to formulate the DCT-II matrix, C8 , as discussed in Sect. 4.2. In addition, signal flow charts are also included at the end of this section. In Sect. 4.4, the other three types of DCT’s, namely: DCT-I, DCT-III, and DCT-IV, are formulated. In addition, Theorem 2 of Sect. 4.3 is applied to formulate the fast DCT (FDCT) for all four types of DCT’s. This final section of Chap. 4 ends with the display of various data flow charts for FDCT implementation of the 8-point DCT-II, which is used for both JPEG digital image, as well as MPEG video, compressions, to be discussed in Sect. 5.4 of the next chapter.

4.1 Discrete Fourier Transform In view of the invariance properties of unitary matrices, as described in Sect. 2.3, unitary matrices are often used in applications, such as transformation of a digital signal (from the physical domain) to the “frequency domain”. The notion of “frequency” is associated with that of “wavelength”, assuming that the digital signal is obtained by discretization (that is, sampling followed by quantization) of an analog signal in the form of a traveling wave (since the analogous concept of standing waves is not discussed in this context). There are various definitions of wavelengths that are somewhat equivalent. In this book, the distance (on the time axis) between two consecutive local maxima of the waveform is used as the definition of wavelength, say, η m (where m stands for “meters”). Let us assume that the wave under consideration travels at the speed of v0 m/s (which stands for “meters per second”), then the frequency is defined by ω0 =

v0 Hz, η

(4.1.1)

where 1 Hz (which stands for “Hertz”) is one cycle per second. Typical examples include electro-magnetic waves that travel at the speed of light, namely, c = 299, 792, 458 m/s (in vacuum) and sound waves, whose speed depends not only on the medium of wave propagation, but also on the temperature. For a repeating event, the frequency is the reciprocal of the period. Mathematical formulations of traveling waves depend on sinusoidal functions, in terms of cosines and/or sines. For example, a single-frequency signal represented as

4.1 Discrete Fourier Transform

173

w(t) = a cos 2πωt + b sin 2πωt for some a, b ∈ R, has the frequency = ω Hz, if the unit of measurement in the time-domain t is in seconds. Since both cos √2πωt and sin 2πωt can be formulated in terms of ei2πωt and e−i2πωt , where i = −1 is the imaginary unit, via “Euler’s identity” ei x = cos x + i sin x, namely: ei x + e−i x ; 2 ei x − e−i x , sin x = 2i

cos x =

it is important to acquire the basic skill for manipulating the function ei x . The following three formulas are particularly useful. (i) Geometric sum: 1 + ei x + · · · + ei(n−1)x =

1 − einx . 1 − ei x

(4.1.2)

(ii) Orthonormal sequences: n−1 1  i2πk(− j)/n e = δ− j , 0 ≤ , j ≤ n − 1. n

(4.1.3)

k=0

In other words, by introducing the set {f  },  = 0, . . . , n − 1, of sequences f , defined by (4.1.4) f = (a,0 , . . . , a,n−1 ), a,k = ei(2π/n)k , we have  f , f j  = δ− j ,

(4.1.5)

where  ,  is the normalized inner product defined by  f , f j  =

n−1 1 a,k a k, j . n

(4.1.6)

k=0

(iii) Orthonormal basis functions: 1 2π

 0



ei(− j)x d x = δ− j ,

(4.1.7)

174

4 Frequency-Domain Methods

for all integers  and j. In other words, the family {g } of functions g , where  runs over all integers, defined by g (x) = eix , 0 ≤ x ≤ 2π, is an orthonormal family with respect to the normalized inner product  ,  1 g , g j  = 2π





0

g (x)g j (x)d x = δ− j .

Definition 1 Discrete Fourier transformation (DFT) Let n ≥ 2 be any given integer. Then the n-point discrete Fourier transformation (DFT) is the n × n matrix Fn of complex numbers, defined by ⎡

Fn

  = (z n ) jk

0≤ j,k≤n−1



1 1 ⎢ 1 zn =⎢ ⎣ 1 z nn−1 where

1 ⎢1 =⎢ ⎣

z n0 z n1

1 z nn−1 ⎤

... 1 ... z nn−1 ⎥ ⎥, ⎦ ... (n−1)(n−1) . . . zn z n = e−i2π/n .

⎤ ... z n0 ... z nn−1 ⎥ ⎥ ⎦ ... (n−1)(n−1) . . . zn (4.1.8)

(4.1.9)

Furthermore, the n-point inverse discrete Fourier transformation (IDFT) is defined n = 1 Fn∗ , where Fn∗ is the adjoint (that is, transpose and complex conjugate) by F n of Fn . For a vector x ∈ Cn , its discrete Fourier transform (DFT), denoted by x, is defined as

x = Fn x. Theorem 1 For each n ≥ 2, let Fn be defined as in (4.1.8)–(4.1.9). Then the matrix Un , defined by 1 Un = √ Fn , (4.1.10) n is a unitary matrix. Hence,

n Fn = In , n = F Fn F

where In = diag{1, . . . , 1} is the identity matrix.

(4.1.11)

4.1 Discrete Fourier Transform

175

n , it is Since (4.1.11) immediately follows from (4.1.10) and the definition of F sufficient to verify (4.1.10). To verify the validity of (4.1.10), observe, according to (4.1.8) – (4.1.9) and (4.1.4), that ⎤ ⎡ a0,0 a1,0 . . . an−1,0 ⎥ ⎢ .. .. .. Fn = ⎣ ... ⎦. . . . a0,n−1 a1,n−1 . . . an−1,n−1

Hence, for 0 ≤ , j ≤ n − 1, the (, j)th entry of the matrix UU ∗ is precisely n−1  n−1  1  1  1 = a,k a k, j = δ− j , a a √ ,k √ k, j n n n k=0

k=0

where the last equality follows from (4.1.6) and (4.1.5) consecutively. That is, we  have shown that UU ∗ = In . Example 1 Formulate and simplify the 4-point DFT and IDFT (matrix) operators, 4 , respectively. Verify (4.1.11) for n = 4; that is, F4 and F 4 = I4 . F4 F Solution For n = 4, we have z 4 = e−i2π/4 = e−iπ/2 = cos(− π2 ) + i sin(− π2 ) = −i. Hence, it follows from (4.1.8) that ⎡

1 ⎢1 F4 = ⎢ ⎣1 1 ⎡ 1 ⎢1 ⎢ =⎣ 1 1

⎤ 1 1 1 (−i) (−i)2 (−i)3 ⎥ ⎥ (−i)2 (−i)4 (−i)6 ⎦ (−i)3 (−i)6 (−i)9 ⎤ 1 1 1 −i −1 i ⎥ ⎥. −1 1 −1 ⎦ i −1 −i

In addition, by the definition of IDFT, we have ⎡

4 = 1 F ∗ F 4

1 1 1⎢ 1 i = ⎢ 4 ⎣ 1 −1 1 −i ⎡ 1 1 1⎢ 1 i = ⎢ 4 ⎣ 1 −1 1 −i

1 −1 1 −1 1 −1 1 −1

⎤T 1 −i ⎥ ⎥ −1 ⎦ i ⎤ 1 −i ⎥ ⎥. −1 ⎦ i

176

4 Frequency-Domain Methods

Finally, to verify (4.1.11) for n = 4, we simply compute ⎡

1 1 ⎢ 1 −i 1 4 = ⎢ F4 F 4 ⎣ 1 −1 1 i ⎡

1 −1 1 −1

⎤ 1 i ⎥ ⎥ −1 ⎦ −i



1 ⎢1 ⎢ ⎣1 1

1 i −1 −i

1 −1 1 −1

⎤ 1 −i ⎥ ⎥ −1 ⎦ i

⎤ 4 (1 + i − 1 − i) (1 − 1 + 1 − 1) (1 − i − 1 + i) 1 ⎢ (1 − i − 1 + i) (1 + 1 + 1 + 1) (1 + i − 1 − i) (1 − 1 + 1 − 1) ⎥ ⎥ = ⎢ 4 ⎣ (1 − 1 + 1 − 1) (1 − i − 1 + i) (1 + 1 + 1 + 1) (1 + i − 1 − i) ⎦ (1 + i − 1 − i) (1 − 1 + 1 − 1) (1 − i − 1 + i) (1 + 1 + 1 + 1) ⎡ ⎤ ⎡ ⎤ 4000 1000 ⎢ ⎥ 1 ⎢0 4 0 0⎥ ⎥ = ⎢0 1 0 0⎥. = ⎢ ⎣ ⎦ ⎣ 0 0 1 0⎦ 4 0040 0004 0001 

Example 2 Apply F4 from Example 1 to compute the DFT of the following digital signals (formulated as column vectors for convenience). (i) x = [1, 1, 1, 1]T . (ii) y = [0, 1, 2, 3]T . (iii) z = [0, 1, 0, 1]T . Interpret the frequency content of x = F4 x, y = F4 y, and z = F4 z. Solution For (i), ⎡

1 1 ⎢ 1 −i

x = F4 x = ⎢ ⎣ 1 −1 1 i

1 −1 1 −1

⎤ 1 i ⎥ ⎥ −1 ⎦ −i

⎡ ⎤ ⎡ ⎤ 1 4 ⎢1⎥ ⎢0⎥ ⎢ ⎥ = ⎢ ⎥. ⎣1⎦ ⎣0⎦ 1 0

Since the signal x is a constant, there is no high-frequency content. Hence, the DC (direct current) term is 4 and the AC (alternate current) contents are 0, 0, 0. For (ii), ⎡

1 1 ⎢ 1 −i

y = F4 y = ⎢ ⎣ 1 −1 1 i ⎡

1 −1 1 −1

⎤ 1 i ⎥ ⎥ −1 ⎦ −i

⎡ ⎤ 0 ⎢1⎥ ⎢ ⎥ ⎣2⎦ 3

⎤ ⎡ ⎤ ⎡ ⎤ 6 6 0 ⎢ 2(−1 + i) ⎥ ⎢ −2 ⎥ ⎢ 2 ⎥ ⎥=⎢ ⎥ ⎢ ⎥ =⎢ ⎣ ⎦ ⎣ −2 ⎦ + i ⎣ 0 ⎦ . −2 2(−1 − i) −2 −2

4.1 Discrete Fourier Transform

177

Since the signal y does not oscillate, the DC term (= 6) is the largest. Observe that both the real and imaginary parts of the AC terms, being [−2, −2, −2] and [2, 0, −2], have constant slopes (0 and −2, respectively). Hence, the “phase” of

y = F4 y is linear. For (iii), ⎡

1 1 ⎢ 1 −i

z = F4 z = ⎢ ⎣ 1 −1 1 i

1 −1 1 −1

⎤ 1 i ⎥ ⎥ −1 ⎦ −i

⎤ ⎡ ⎤ ⎡ 2 0 ⎢1⎥ ⎢0 ⎥ ⎥ ⎢ ⎥=⎢ ⎣ 0 ⎦ ⎣ −2 ⎦ . 0 1

The DC term is small (only = 2), while only one high-frequency term is non-zero. Indeed, the signal (0, 1, 0, 1) has only one single period.  Example 3 Let S1 , S2 ∈ R64 be signals given by ⎧ 1, ⎪ ⎪ ⎨ 2, S1 ( j) = ⎪ −1, ⎪ ⎩ 0.5,

0 ≤ j ≤ 15, 16 ≤ j ≤ 31, 32 ≤ j ≤ 47, 48 ≤ j ≤ 63;

and S2 ( j) = j/8, 0 ≤ j ≤ 63. The real part and imaginary part of the DFT of S1 are shown in Fig. 4.1, those of the  DFT of S2 are shown in Fig. 4.2. By using the column vector notation for the finite sequences f0 , . . . , fn−1 introduced in (4.1.4), we may formulate the n-point DFT of any x = [x0 , . . . , xn−1 ]T ∈ C as follows: ⎤ ⎤ ⎡ ⎡ x, f0 

x0 ⎥ ⎥ ⎢ ⎢ ..

x = ⎣ ... ⎦ = Fn x = ⎣ ⎦. .

xn−1

x, fn−1 

Hence, to compute the DFT x of x ∈ Cn , it is necessary to compute each of the n inner products: n−1  xk e−i2πk/n ,  = 0, . . . , n − 1. (4.1.12) x, f  = k=0

Let us assume that cos(2πk/n) and sin(2πk/n) have been pre-computed and stored in a look-up table. Then the number of multiplications in (4.1.12) is 2(n − 1). Taking into account of n additions in (4.1.12), the number of operations required to compute x of x x, f  is 3n − 2; so that the total number of operations to compute the DFT is (3n 2 − 2n). Hence, the amount of computations is quite intensive for large n.

178

4 Frequency-Domain Methods 40 35 30 25 20 15 10 5 0 −5

0

10

20

30

40

50

60

10

20

30

40

50

60

40 30 20 10 0 −10 −20 −30 −40

0

Fig. 4.1 Top real part of the DFT of S1 ; Bottom imaginary part of the DFT of S1 250 200 150 100 50 0 0

10

20

30

40

50

60

0

10

20

30

40

50

60

80 60 40 20 0 −20 −40 −60 −80

Fig. 4.2 Top real part of the DFT of S2 ; Bottom imaginary part of the DFT of S2

4.1 Discrete Fourier Transform

179

The fast Fourier transform (FFT) to be discussed in a later section reduces the order of operations from O(n 2 ) to O(n log n), for n = 2m , m = 1, 2, . . .. Exercises Exercise 1 Compute the frequency for each of the following electromagnetic (EM) waves w j (t), j = 1, 2, 3, 4, with given wavelengths η j . (Hint: EM waves travel at the speed of light. Also 1 nm (nanometer) = 10−9 m (meter)). (a) (b) (c) (d)

w1 (t) with η1 w2 (t) with η2 w3 (t) with η3 w4 (t) with η4

= 600 nm (visible green light). = 350 nm (ultraviolet). = 1, 999 nm (infrared). = 0.1 nm (X-ray).

Exercise 2 Write down the following complex numbers in the form of a +ib, where a and b are real numbers. Simplify your answers. (a) (b) (c) (d)

ei2πk/3 , k = 0, 1, 2. ei2πk/4 , k = 0, 1, 2, 3. ei2πk/6 , k = 0, 1, 2, 3, 4, 5. e−iπk/2 , k = 0, 1, 2, 3.

Exercise 3 Fill in the details in the derivation of (4.1.2), (4.1.3), (4.1.6), and (4.1.7). Exercise 4 Apply the results from Exercise 2 to write down the DFT and IDFT 3 , F4 , F 4 , F6 , F 6 , with precise (not approximate) values of all entries matrices F3 , F 4 .) in the form of a + ib, where a, b ∈ R. (Hint: See Example 1 for F4 and F Exercise 5 Compute the DFT of the given vectors x and y. Verify your answers by computing the corresponding IDFT. (a) F3 x and F3 y, for x = [1, −1, 1]T and y = [i, 0, −i]T . (b) F4 x and F4 y, for x = [0, i, 0, −i]T and y = [1, 0, −1, 1]T . Exercise 6 Give explicit formulation of the 8-point DFT matrix F8 in terms of F4 by applying Theorem 1 and the matrix F4 in Example 1. Exercise 7 Repeat Exercise 6 for the 6-point DFT matrix F6 in terms of F3 from Exercise 4.

4.2 Discrete Cosine Transform To introduce the discrete cosine transform (DCT), let us recall the DFT in Sect. 4.1, namely the n-point DFT in (4.1.8)–(4.1.9) of Definition 1 on p.174. In applications

180

4 Frequency-Domain Methods

to audio, image, and video compression, since complex numbers in DFT are more costly to code (or more precisely to “encode”), the real DCT is preferred. Let x = [x0 , . . . , xn−1 ]T ∈ Rn , and define  x =

x , for  = 0, . . . , n − 1, x2n−−1 , for  = n, . . . , 2n − 1,

(4.2.1)

x = [ x0 , . . . , x2n−1 ] in R2n , such that to extend the vector x from Rn to the vector the extended vector x is symmetric with respect to the center “n − 21 ”, as follows: xn−1 = xn−1 xn = xn+1 = xn−2 = xn−2 xn+2 = xn−3 = xn−3 ............... x2n−1 = x0 = x0 . Now, apply (4.1.8) with n replaced by 2n to formulate the 2n-point DFT x of x, with jth component (of x) given by  

x

j

= =

2n−1  =0 n−1 

j

x e−i2π 2n

x e−i

jπ n

+

2n−1 

=0

x2n−−1 e−i

jπ n

(4.2.2)

=n

for j = 0, . . . , 2n − 1, where the definition of x in (4.2.1) is used. But the last summation in (4.2.2) can be simplified to be 2n−1 

x2n−−1 e

−ijπ n

=n

=

n−1 

x e

i(+1) jπ n

,

=0

since e−i2nπ/n = e−i2π = 1. Hence, putting this back into (4.2.2) yields  

x

j

=

n−1 

 jπ  jπ jπ x e−i n + ei n ei n

=0 i jπ

= e 2n

n−1 

  2+1 2+1 x e−i j ( 2n )π + ei j ( 2n )π

=0 n−1 

i jπ

= 2e 2n

=0

x cos

π j (2 + 1)π = d j ei j 2n , 2n

(4.2.3)

4.2 Discrete Cosine Transform

181

where the notation dj = 2

n−1 

x cos

=0

j ( + 21 )π n

(4.2.4)

has been introduced to facilitate the discussion below. Now, by the definition of the 2n-point DFT in (4.1.8), we have   x , =

 

x

2n− j

so that

 

x

j

π

π

2n− j

= d j ei j 2n = d j e−i j 2n ,

from which it follows by returning to (4.2.3), with j replaced by 2n − j, that   π

x d2n− j = e−i(2n− j) 2n =e =e Thus, by setting j = n, we have

π −i(2n− j) 2n

−iπ

dje

2n− j π −i j 2n

d j = −d j .

dn = 0.

(4.2.5)

(4.2.6)

Next, to recover x from x by taking the inverse 2n-point DFT, as introduced also in Definition 1 on p.174 (see Theorem 1 on p.186), we have, by (4.2.3),   x



= = =

2n−1 1    i2π j/2n

x e j 2n

1 2n 1 2n

j=0 2n−1  j=0 2n−1 

d j ei jπ/2n ei jπ/n 1

d j ei j (+ 2 )π/n .

j=0

Here, let us separate the summation (4.2.7) into two sums: 2n−1  j=0

with

=

n−1  j=0

+

2n−1  j=n

,

(4.2.7)

182

4 Frequency-Domain Methods 2n−1 

=

j=n

2n−1 

1

d j ei j (+ 2 )π/n

j=n

=

2n−1 

1

d j ei j (+ 2 )π/n

j=n+1

=

n−1 

1

d2n−k ei(2n−k)(+ 2 )π/n ,

k=1

 2n−1 where the term dn = 0 in (4.2.6) reduces 2n−1 j=n to j=n+1 , and the change of summation index k = 2n − j is applied to arrive at the final sum. Now observe that 1

1

1

ei(2n−k)(+ 2 )π/n = ei2(+ 2 )π e−ik(+ 2 )π/n 1 = −e−ik(+ 2 )π/n . Hence, by (4.2.5) with j replaced by k, the expression in (4.2.7) becomes   x



=

n−1 n−1 1 1 1 1  1  d0 + d j ei j (+ 2 )π/n + (−dk )(−e−ik(+ 2 )π/n ) 2n 2n 2n

=

1  d0 + n 2

j=1

n−1 

k=1

d j cos

j=1

j ( + n

1  2 )π

.

(4.2.8)

Therefore, since x = x = ( x) ,  = 0, . . . , n − 1 by (4.2.1), we may return to the definition of d j in (4.2.4) to re-formulate (4.2.8) as: n−1 

xk δk− =

k=0

n−1 n−1   n−1  j ( + 21 )π  j (k + 21 )π  1 cos xk + 2 xk cos n n 2 k=0

=

1 n

j=1

n−1  

n−1 

k=0

j=1

1+2

k=0

cos

j ( + 21 )π  j (k + 21 )π cos xk n n

for all  = 0, . . . , n − 1. Since x0 , . . . , xn−1 are arbitrarily chosen, it follows that n−1  j ( + 21 )π  j (k + 21 )π 1 1+2 cos = δk− cos n n n j=1

for all k,  = 0, . . . , n − 1. In other words, the n vectors a0 , . . . , an−1 ∈ Rn , defined by

4.2 Discrete Cosine Transform

 ak =

1 √ n



183



(k + 21 )π 2 cos n n

...

(n − 1)(k + 21 )π 2 cos n n

T ,

(4.2.9)

where k = 0, . . . , n − 1, constitute an orthonormal basis of Rn ; or equivalently, the n × n matrix (4.2.10) Cn = [a0 . . . an−1 ], with column vectors a0 , . . . , an−1 , is an n × n orthogonal (or unitary) matrix (see Exercise 1). Definition 1 DCT & IDCT For each integer n ≥ 2, the unitary matrix Cn , as introduced in (4.2.10), is called the n-point discrete cosine transformation (DCT), and its transpose CnT is called the n-point inverse discrete cosine transformation (IDCT). Since Cn is orthogonal (or unitary), we have CnT Cn = In , where In is the n × n identity matrix so that Cn−1 = CnT . Example 1 Formulate and simplify the n-point DCT for n = 2 and 3. Solution (i) For n = 2, we have, from (4.2.9),  ak =

k + 21 1 π √ cos 2 2

T , k = 0, 1.

Hence, the 2-point DCT is ⎡ C2 = ⎣

√1 2

√1 2

cos π4 cos 3π 4





⎦=⎣



  ⎦ = √1 1 1 −1 2 1 −1 √

√1 √1 2 2 √1 2

2

with IDCT C2−1 = C2T . Observe that ⎡ C2T C2 = ⎣

√1 √1 2 2 −1 √1 √ 2 2

⎤⎡ ⎦⎣

√1 2



  ⎦ = 1 0 = I2 . −1 01 √

√1 √1 2 2 2

(ii) For n = 3, we have, from (4.2.9),  ak =

1 √ 3



k + 21 2 cos π 3 3



2(k + 21 ) 2 cos π 3 3

T ,

184

4 Frequency-Domain Methods

for k = 0, 1, 2. Hence, the 3-point DCT is ⎡

√1 3

√1 3



√1 3

⎢ ⎥   ⎢ 2 π 2 3π 2 5π ⎥ ⎢ ⎥ cos cos cos C3 = ⎢ 3 6 3 6 3 6 ⎥ ⎣ ⎦   2 2π 2 6π 2 10π cos cos 3 cos 6 ⎡ 13 1 6 13 ⎤ 6 √

⎢ 3 ⎢ √1 =⎢ ⎢ 2 ⎣ √1 6



√ 3

3

⎥ ⎥ 0 − √1 ⎥ . 2⎥ ⎦  − 23 √1 6

Observe that ⎡

√1 3

⎢ ⎢ T 1 C3 C3 = ⎢ ⎢ √3 ⎣ √1 3

√1 2

√1 6

√1 3

⎢  ⎥ ⎥⎢ 2 ⎥ ⎢ √1 − 3 ⎥⎢ 2 ⎦⎣

0 − √1

⎤⎡

2

√1 6

√1 6

√1 3

√1 3

0  − 23

− √1 2 √1 6



⎡ ⎤ ⎥ 100 ⎥ ⎥ = ⎣0 1 0⎦. ⎥ ⎦ 001 

Example 2 Compute the DCT of the vector x = [1 − 3 3]T ∈ R3 . Solution By the solution of Example 1, we have ⎡

√1 3

⎢ ⎢ √1

x = C3 x = ⎢ ⎢ 2 ⎣ ⎡ ⎢ =⎢ ⎣ Observe that the IDCT of x is

√1 3

√1 3

0  − 23

− √1 2



⎡ ⎤ ⎥ 1 ⎥ ⎥ ⎣ −3 ⎦ ⎥ ⎦ 3

√1 √1 6 6 √1 (1 − 3 + 3) 3 √1 (1 + 0 − 3) 2  √1 (1 + 3) + 3 2 3 6





√1 3√



⎥ ⎢ ⎥ = ⎣− 2⎥ ⎦. ⎦ 10 √ 6

4.2 Discrete Cosine Transform

185

6 4 2 0 −2 −4 −6 35 30 25 20 15 10 5 0 −5 −10 −15 −20

0

0

10

20

30

40

50

60

10

20

30

40

50

60

Fig. 4.3 Top DCT of S1 ; Bottom DCT of S2

⎡ C3T x

√1 3

⎢ ⎢ = ⎢ √1 ⎣ 3 ⎡ ⎢ =⎢ ⎣

√1 2

0 − √1

√1 2 6 1 10 ⎤ − 1 + 3 6 √ ⎥ 2 √10√ ⎥ 1 3 − 3 · 2 3⎦ 1 10 3 +1+ 6 √1 3

⎤⎡

⎤ √1 3 ⎥  ⎥ ⎢ −√2 ⎥ ⎥ − 23 ⎥ ⎢ ⎦ ⎦⎣ √1 6



10 √ 6

⎤ 1 = ⎣ −3 ⎦ = x. 3 

R64

be signals considered in Example 3 on p.177. The Example 3 Let S1 , S2 ∈  DCTs of S1 and S2 are shown in Fig. 4.3. Remark 1 Let x ∈ Rn . Since the n-point DCT of x, defined by

x = Cn x, is computed by taking the inner products of the row vectors of Cn with x, it is more convenient to re-formulate the DCT (orthogonal, i.e. unitary matrix), defined in (4.2.10), in terms of row vectors; namely,

186

4 Frequency-Domain Methods

⎤ c0T ⎥ ⎢ Cn = [a0 · · · an−1 ] = ⎣ ... ⎦ , T cn−1 ⎡

(4.2.11)

T where c0T is the first row of Cn , c1T is the second row of Cn , . . . , and cn−1 is the nth T row of Cn . More precisely, c j is the transpose of the column vector c j defined as follows:   1 1 1 T = √ [1 . . . 1]T , (4.2.12) c0 = √ · · · √ n n n

and for all j = 1, . . . , n − 1,    2 jπ j3π j (2n − 1)π T cj = . cos cos · · · cos n 2n 2n 2n

(4.2.13)

Then it follows from (4.2.10) and (4.2.11) that the corresponding IDCT can be re-formulated as (4.2.14) Cn−1 = CnT = [c0 . . . cn−1 ] with column vectors c0 , · · · , cn−1 introduced in (4.2.12) and (4.2.13).



Theorem 1 Let x ∈ Rn . The n-point DCT of x is given by ⎤ x, c0  ⎥ ⎢ ..

x = Cn x = ⎣ ⎦, . x, cn−1  ⎡

and x can be written as x=

n−1  x, c j c j .

(4.2.15)

(4.2.16)

j=0

Indeed, (4.2.15) follows from matrix multiplication by using the formulation of Cn in (4.2.11), and (4.2.16) is a re-formulation of ⎤ x, c0  ⎥ ⎢ .. x = CnT (Cn x) = [c0 · · · cn−1 ] ⎣ ⎦, . x, cn−1  ⎡

which is obtained by applying (4.2.14) and (4.2.15).



Definition 2 DCT as inner product For any x ∈ Rn , where n ≥ 2 is an arbitrary (fixed) integer, x as defined by (4.2.15) is called the n-point DCT of x and the (finite) series expansion in (4.2.16) is called the discrete cosine series representation

4.2 Discrete Cosine Transform

187

of x, with cosine coefficients c0 , x, . . . , cn−1 , x. Furthermore, the first cosine coefficient c0 , x is called the DC (direct current) term and the remaining cosine coefficients are called the AC (alternate current) terms of x. Remark 2 The DC term of x for any given x = (x0 , . . . , xn−1 ) ∈ Rn is given by n−1 1  xk , x, c0  = √ n k=0

which is simply the “average” of the components of x, while the AC terms are governed by some non-trivial (oscillating) cosines (see Remarks 3–4 to follow).  Remark 3 As a continuation of Remark 2, consider the n-point DCT of a vector f ∈ Rn obtained by “sampling” a continuous function f (t) for 0 ≤ t ≤ 1, at the sample points 2k + 1 , k = 0, . . . , n − 1; (4.2.17) tk = 2n namely, f = ( f (t0 ), . . . , f (tn−1 )) ∈ Rn . Then by applying the finite cosine series representation (4.2.16), we have b0  b j cos( jπt ), for all  = 0, . . . , n − 1, + 2 n−1

f (t ) =

(4.2.18)

j=1

where bj =

n−1 2 f (tk ) cos( jπtk ), j = 0, . . . , n − 1 n

(4.2.19)

k=0



(see Exercise 8).

Remark 4 Since cos 2πt is periodic with period = 1, the frequency of cos( jπt) is 1 2 j Hz (see (4.1.1)). Hence, when f (t) in Remark 3 is considered as a traveling wave, then the finite cosine series representation of f (t) (at the time samples t = t as in (4.2.17)) reveals the various frequency components of f (t): with stationary (or DC) component b0 /2 and traveling (or AC) components b1 , b2 , . . . , bn−1 at frequencies 1 2 n−1  2 , 2 , . . . , 2 , respectively. Remark 5 The continuous (or analog) version of the finite cosine series representation in (4.2.18), obtained by replacing t in (4.2.18) with t ∈ [0, 1] is called the (Fourier) cosine series of f ∈ C[0, 1]; namely, ∞

f (t) =

b0  b j cos jπt. + 2 j=1

188

4 Frequency-Domain Methods

In addition, when the sum in (4.2.19) is replaced by the integral over [0, 1], the coefficients b0 , b1 , . . . (of the above cosine series) are called (Fourier) cosine coefficients. That is, the continuous version of (4.2.19) is given by 

1

bj = 2

f (t) cos jπtdt.

0

In other words, the n-point DCT of an n-vector x can be considered as discretization of the cosine coefficients of x(t) at t0 , . . . , tn−1 given by (4.2.17), where x = [x(t0 ), . . . , x(tn−1 )]T . The topic of Fourier cosine series will be studied in Sect. 6.2.  Exercises Exercise 1 Apply the n-point DCT Cn , for n = 2 and 3, in Example 1 to compute the discrete cosine transforms of the following data sets. (a) x j = C2 x j for j = 1, 2, 3, where x1 = [ 1 2 ]T , x2 = [ 2 1 ]T , x3 = [ −1 1 ]T . (b) y j = C3 y j for j = 1 and 2, where y1 = [ 1 2 3 ]T , y2 = [ −1 2 1 ]T . Exercise 2 Apply the n-point IDCT Cn−1 = CnT , as introduced in Definition 1, to verify your answers in Exercise 1, by computing C2−1 x j for j = 1, 2, 3 in (a) above, and C3−1 y j for j = 1, 2 in (b) above. Exercise 3 Let x = Cn x be the n-point DCT of x ∈ Rn , where Cn = [a0 · · · an−1 ] T denote with the n column vectors a0 , . . . , an−1 defined as in (4.2.9). Let c0T , . . . , cn−1 the row vectors of Cn ; that is, ⎡ T ⎤ c0 ⎥ ⎢ Cn = ⎣ ... ⎦ T cn−1

as in (4.2.11). Verify that the vectors c0 , . . . , cn−1 are given by (4.2.12)–(4.2.13) and illustrate this result by writing out such rows for the cases n = 2 and 3. Exercise 4 As a continuation of Exercise 3, compute the following DCT, namely: (a) x j = C2 x j , j = 1, 2, 3, for x1 , x2 , x3 given in Exercise 1, by taking inner products with c0 , c1 , c2 as in Theorem 1; that is,

4.2 Discrete Cosine Transform

189

x j = [c0 , x j  c1 , x j  c2 , x j ]T ; (b) y j = C3 y j , j = 1, 2 for y1 , y2 given in Exercise 1, by taking inner products with the vectors c j ’s instead of the a j ’s; that is,

y j = [c0 , y j  c1 , y j ]T . y j simply Exercise 5 As a continuation of Exercise 4, compute the IDCT of x j and by writing out and adding the following sums: (a) x j = c0 , x j c0 + c1 , x j c1 + c2 , x j c2 for j = 1, 2, 3; and (b) y j = c0 , y j c0 + c1 , y j c1 for j = 1, 2. (c) Compare the results in (a) and (b) with those in Exercise 2. Note that the vectors c j ’s in (a) are different from those in (b), since the ones in (a) are for 2-point DCT, while those in (b) are for 3-point DCT. Exercise 6 For each of the following continuous functions f (x) on [0, 1], compute its discrete cosine coefficients bj =

2  j (2k + 1)π  2   2k + 1  cos f 3 6 6 k=0

for j = 0, 1, 2. Then compute the finite cosine Fourier series of f , namely: 2  j (2 + 1)π   2 + 1  b  0 = + b j cos f 6 2 6 j=0

for  = 0, 1, 2. (a) f (x) = |x − 21 |. (b) f (x) = 2x 2 − 2x + 1.

  with Exercise 7 As a continuation of Exercise 6, compare the values of f 2+1 6   f 2+1 ,  = 0, 1, 2, for the two functions f (x) = |x− 21 | and f (x) = 2x 2 −2x+1. 6 (Note that only the special case n = 3 is considered in this example. For large values of n, the finite cosine Fourier series introduced in (4.2.18) of Remark 3 provide good approximation of these two functions. See Exercise 8 below.) Exercise 8 Use MATLAB to compute the discrete cosine coefficients bj =

n−1  j (2k + 1)π  2   2k + 1  cos f n 2n 2n k=0

190

4 Frequency-Domain Methods

for the functions f (x) = |x − 21 |, for n ≥ 100. Then compute the values of the finite cosine Fourier series n−1  j (2 + 1)π   2 + 1  b  0 = + b j cos f 2n 2 2n j=0

with the corresponding values f

 2 + 1  2n

 2 + 1 1    = −  2n 2

for  = 0, . . . , n − 1. Exercise 9 Repeat Exercise 8 for the function f (x) = 2x 2 − 2x + 1 in Exercise 6 (b).

4.3 Fast Fourier Transform To study the FFT computational scheme, we need the following result due to Cornelius Lanczos to reduce a 2n-point DFT to an n-point DFT via multiplication by certain sparse matrices. Denote Dn = diag{1, e−iπ/n , . . . , e−i(n−1)π/n }, and let ⎡

⎤ 1 0 0 0 ··· 0 ⎢0 0 1 0 ··· 0⎥ ⎥; Pne = [δ2 j−k ] = ⎢ ⎣ ⎦ ··· 0 0 0 ... 1 0 ⎡ ⎤ 0 1 0 0 ··· 0 ⎢0 0 0 1 ··· 0⎥ ⎥, Pno = [δ2 j−k+1 ] = ⎢ ⎣ ⎦ ··· 0 0 ··· 0 1 where 0 ≤ j ≤ n − 1 and 0 ≤ k ≤ 2n − 1, be n × (2n) matrices. Also, let In denote the n × n identity matrix and ⎤ .. . D I n n ⎥ ⎢ ⎥ =⎢ ⎣...... ... ... ⎦. .. . −Dn In ⎡

E 2n

(4.3.1)

4.3 Fast Fourier Transform

191

For example, with the above notations, we have 

 1 1 = [1 0], = [0 1], E 2 = , D1 = [1], 1 −1       1 0 1000 0100 , P2e = , P2o = , D2 = 0 −i 0010 0001 ⎡ ⎤ 10 1 0 ⎢ 0 1 0 −i ⎥ ⎥ E4 = ⎢ ⎣ 1 0 −1 0 ⎦ . 01 0 i P1e

P1o

Theorem 1 Factorization of DFT matrices The 2n-point DFT F2n can be factored out in terms of two diagonal blocks of the n-point DFT Fn as follows: ⎡

F2n

⎤ .. ⎡ e⎤ F . O n ⎢ ⎥ Pn ⎥⎣ ⎦ = E 2n ⎢ ⎣ . . . . . . . . . . . . ⎦ . . o. , .. Pn O . Fn

(4.3.2)

where O denotes the n × n zero matrix. Proof To derive (4.3.2), consider x = [x0 , . . . , x2n−1 ]T ∈ C2n , with DFT given x0 , . . . , x2n−1 ]T , where by x = F2n x = [

x = =

2n−1  k=0 n−1  j=0

=

n−1  j=0

xk e−i2πk/(2n)

x2 j e−i2π(2 j)/(2n) +

n−1 

x2 j+1 e−i2π(2 j+1)/(2n)

j=0

x2 j e−i2π j/n + e−iπ/n

n−1 

x2 j+1 e−i2π j/n .

(4.3.3)

j=0

Set xe = [x0 , x2 , . . . , x2n−2 ]T , xo = [x1 , x3 , . . . , x2n−1 ]T . Then xe = Pne x, xo = Pno x. Using the notation (v) for the th component of the vector v, the result in (4.3.3) is simply

192

4 Frequency-Domain Methods

x = (Fn xe ) + e−iπ/n (Fn xo )

= (Fn Pne x) + e−iπ/n (Fn Pno x)  (Fn Pne x) + e−iπ/n (Fn Pno x) , for 0 ≤  ≤ n − 1, = (Fn Pne x) − e−iπ(−n)/n (Fn Pno x) , for n ≤  ≤ 2n − 1.

(4.3.4)

In (4.3.4), we have used the fact that e−iπ/n = −e−iπ(−n)/n . Thus,  (F2n x) =

! (Fn Pne + Dn Fn Pno )x! , for 0 ≤  ≤ n − 1, (Fn Pne − Dn Fn Pno )x  , for n ≤  ≤ 2n − 1, 

which yields (4.3.2).

m m Let n = 2m with m ≥ 1. For each k, 0 ≤ k ≤ m − 1, let G m k be the 2 × 2 matrix defined by

⎡ ⎢ Gm m−k , . . . , E 2m−k } = ⎣ k = diag{E "2 #$ % 2k copies

E 2m−k

O ..

.

O

⎤ ⎥ ⎦,

(4.3.5)

E 2m−k

where, according to (4.3.1), E 2m−k is a 2m−k × 2m−k matrix: ⎤ .. ⎢ I2m−k−1 . D2m−k−1 ⎥ ... ⎥ =⎢ ⎦. ⎣...... ... .. I2m−k−1 . −D2m−k−1 ⎡

E 2m−k

Furthermore, let

⎤ P2em−1 ⎢ ... ⎥ ⎥ =⎢ ⎦ ⎣ P2om−1 ⎡

P2m

denote the 2m × 2m “permutation matrix” in (4.3.2) with n = 2m ; and define, 20 , P 2 = P 21 , P 4 = P 22 , . . . , P 2m by 1 = P inductively, the permutation matrices P 1 = [1], . . . , P 2 = P where  = 1, 2, . . . , m.



2−1 O P 2−1 O P

 P2 ,

4.3 Fast Fourier Transform

193

Example 1 Consider n = 4, m = 2. We have G 20 = E 22 = E 4 ,  G 21 = diag{E 2 , E 2 } =

E2 O O E2





1 ⎢1 =⎢ ⎣0 0

1 −1 0 0

0 0 1 1

⎤ 0 0 ⎥ ⎥, 1 ⎦ −1

and  10 = , = 01       1 0 P 10 10 = 1 P2 = 0 1 P2 = P2 = 0 1 , 0 P ⎡ ⎤ 1000  e ⎢0 0 1 0⎥ P2 ⎥ =⎢ = o ⎣0 1 0 0⎦, P2 0001     2 O P I2 O P P4 = P4 . = = 4 2 O I2 O P 

P2 2 P

P4

4 P

P1e P1o







Next we state the fast Fourier transform (FFT) computational scheme as follows. Theorem 2 Full factorization of DFT matrices Let n = 2m , where m ≥ 1 is an integer. Then the n-point DFT has the formulation m m Fn = F2m = G m 0 G 1 . . . G m−1 P2m .

(4.3.6)

Proof Proof of the FFT scheme (4.3.6) can be carried out by mathematical induction on m = 1, 2, . . . . For m = 1, since   1 1 G 10 = E 2 = 1 −1 2 = I2 as shown in Example 1, it follows that and P 2 = G 10 = G 10 P which is (4.3.6) for m = 1.



 1 1 = F2 , 1 −1

194

4 Frequency-Domain Methods

In general, by Theorem 1 and the induction hypothesis, we have  F

2m

=E

2m

 = E 2m = E 2m

F2m−1 O O F2m−1

 P2m

 m−1 O P2m−1 G 0m−1 . . . G m−2 P2m m−1 O G 0m−1 . . . G m−2 P2m−1  m−1   m−1   2m−1 O G m−2 O O G0 P . . . m−1 2m−1 P2m O P O G 0m−1 O G m−2

m m = E 2m G m 1 G 2 . . . G m−1 P2m m m m 2m , = G 0 G 1 . . . G m−1 P m since G m 0 = E 2 by (4.3.5), and



m−1 O G k−1 m−1 O G k−1

 = Gm k ,

for 1 ≤ k ≤ m − 1 (see Exercise 2).

(4.3.7) 

n = 1 Fn∗ , it follows from (4.3.6) that for Remark 1 Since the IDFT is given by F n m n=2 , ∗ m ∗ 2m = 2−m P n = F 2Tm (G m (4.3.8) F m−1 ) . . . (G 0 ) . Example 2 Let us verify (4.3.6) for n = 4, n = 8. First, by Theorem 1,  F4 = E 4

F2 O O F2



 e

P2 P2o



1 ⎢1 ⎢ = E4 ⎣ 0 0

1 −1 0 0

0 0 1 1

⎤ 0 0⎥ ⎥ P = G2G2 P 0 1 4, 1⎦ 4 −1

4 where the last equality follows from G 20 = E 4 , the expression of G 21 and P4 = P as shown in Example 1. This is (4.3.6) for n = 4. For n = 8, 

  2 2  4 O P4e G0G1 P = E 8 4 P8 P4o O G 20 G 21 P  2  2   4 O G0 O G1 O P = E8 4 P8 O P O G 20 O G 21 8 , = G 30 G 31 G 32 P

F8 = E 8

F4 O O F4



where the last equality follows from (4.3.7).



4.3 Fast Fourier Transform

195

Remark 2 Complexity of FFT Computational complexity is the operation count, including the number of long operations (i.e. multiplication and division) and short operations ' (i.e. addition and subtraction). To compute the n-point DFT, for & Fn = (z n ) jk , if we count (z n ) jk c for a c ∈ C as one (complex) multiplication operation even if (z n ) jk = 1 (when jk = 0 or jk = n for some integer  > 0), then the number of multiplication operations in computing an n-point DFT is n 2 . The significance of the special case of n = 2m is that the FFT computational scheme, as formulated in (4.3.6) of Theorem 2, can be applied to compute the n-point DFT with only O(mn) = O(n log n) operations (see Exercise 1). The count in terms of the order of magnitudes O(n 2 ) and O(n log n) is valid only for large values of n. In many applications, a signal is partitioned into segments, each of which is of length n, where n is relatively small. Hence, a precise count is more valuable, particularly for hardware implementation. In the following, we give a precise count and implementation-ready algorithms for n = 4, 8, 16 for computing the n-point DFT via the FFT scheme. The 16-point DFT, F16 , is particularly important, since fast computation of the 8-point DCT depends on the computation of F16 . The topic of fast discrete cosine transform (FDCT) will be studied in the next section. Application of 8-point DCT to image and video compressions will be the topic of discussion in Sect. 5.4.  Operation count for 4-point FFT For x = [x0 , x1 , x2 , x3 ]T ∈ C4 , let g = x denote its DFT: g = F4 x = E 4 diag{E 2 , E 2 } P4 x. The role of P4 is to interchange the orders of x j in x, namely: ⎤ x0 ⎢ x2 ⎥ ⎥ P4 x = ⎢ ⎣ x1 ⎦ . x3 ⎡

Denote d = diag{E 2 , E 2 } g = x = F4 x is  d0 d  1 d2 d3

P4 x. Then g = E 4 d, and the FFT algorithm to compute = x0 + x2 = x0 − x2 = x1 + x3 = x1 − x3

⎧ g0 ⎪ ⎪ ⎪ ⎨g 1 ⎪ g 2 ⎪ ⎪ ⎩ g3

= d0 + d2 = d1 − id3 = d0 − d2 = d1 + id3 .

(4.3.9)

Observe that we need one complex multiplication id3 in (4.3.9) for computing g1 ; but complex multiplication is not required to compute g3 since id3 has already been

196

4 Frequency-Domain Methods

computed. Thus, the FFT for n = 4 requires a total of one complex multiplication and 8 additions.  Operation count for 8-point FFT For n = 8, denote W = z 8 = e−i √ 2 2

√ − i 22 .

2π 8

=

From Theorem 2 (see also Example 2), we have 8 , F8 = G 30 G 31 G 32 P

where  = E8 =

I 4 D4 I4 −D4



with D4 = diag{1, W, −i, −W }, ⎡ ⎤ 10 1 0 ⎢ 0 1 0 −i ⎥ ⎥ G 31 = diag{G 20 , G 20 } with G 20 = E 4 = ⎢ ⎣ 1 0 −1 0 ⎦ , 01 0 i ⎡ ⎤ 1 1 0 0 ⎢ 1 −1 0 0 ⎥ ⎥ G 32 = diag{G 21 , G 21 } with G 21 = ⎢ ⎣0 0 1 1 ⎦, 0 0 1 −1 G 30



and

1 ⎢0 ⎢ ⎢0 ⎢ ⎢0 P8 = ⎢ ⎢0 ⎢ ⎢0 ⎢ ⎣0 0

0 0 0 0 1 0 0 0

0 0 1 0 0 0 0 0

0 0 0 0 0 0 1 0

0 1 0 0 0 0 0 0

0 0 0 0 0 1 0 0

0 0 0 1 0 0 0 0

⎤ 0 0⎥ ⎥ 0⎥ ⎥ 0⎥ ⎥. 0⎥ ⎥ 0⎥ ⎥ 0⎦ 1

(4.3.10)

8 is to interchange the orders of x j in x = [x0 , x1 , . . . , x7 ]T ∈ C8 The role of P such that 8 x = [x0 , x4 , x2 , x6 , x1 , x5 , x3 , x7 ]T . P Let d, h, g ∈ C8 denote 8 x, h = G 31 d, g(= x) = G 30 h. d = G 32 P √



Then, with W = 22 − i 22 , the FFT algorithm for computing the 8-point DFT g = x = F8 x can be realized as follows:

4.3 Fast Fourier Transform

   

d0 = x0 + x4 d1 = x0 − x4 d2 = x2 + x6 d3 = x2 − x6 d4 = x1 + x5 d5 = x1 − x5 d6 = x3 + x7 d7 = x3 − x7

197

⎧ ⎪ ⎪h 0 ⎪ ⎨h 1 ⎪h 2 ⎪ ⎪ ⎩ h ⎧ 3 h4 ⎪ ⎪ ⎪ ⎨h 5 ⎪ h 6 ⎪ ⎪ ⎩ h7

= d0 + d2 = d1 − id3 = d0 − d2 = d1 + id3 = d4 + d6 = d5 − id7 = d4 − d6 = d5 + id7

⎧ ⎪ g0 ⎪ ⎪ ⎪ ⎪ g ⎪ 1 ⎪ ⎪ ⎪ ⎪ g ⎪ 2 ⎪ ⎪ ⎨g 3 ⎪ g 4 ⎪ ⎪ ⎪ ⎪ ⎪g5 ⎪ ⎪ ⎪ ⎪ g6 ⎪ ⎪ ⎪ ⎩ g7

= h0 + h4 = h1 + W h5 = h2 − i h6 = h3 − W h7 = h0 − h4 = h1 − W h5 = h2 + i h6 = h3 + W h7.

(4.3.11)

Observe that we need five complex multiplications id3 , id7 , W h 5 , i h 6 and W h 7 in (4.3.11) for computing h 1 , h 5 , g1 , g2 , g3 , and that the values of id3 , id7 , W h 5 , i h 6 and W h 7 are stored in the memory RAM for computing h 3 , h 7 , g5 , g6 , g7 . Thus, the computational complexity for computing the 8-point DFT via FFT consists of 5 complex multiplications and 24 additions.  π

Operation count for 16-point FFT For n = 16, denote w = z 16 = e−i 8 . √

Thus, W = 22 − i 1 and 2, we have



2 2

in the computation of 8-point FFT is w 2 . Following Theorems

⎤ P8e = E 16 diag{F8 , F8 } ⎣. . .⎦ Po8 ⎡ e⎤   P8 I 8 D8 8 , P 8 } ⎣. . .⎦ , = diag{G 30 , G 30 } diag{G 31 , G 31 } diag{G 32 , G 32 } diag{ P I8 −D8 Po8 ⎡

F16

where D8 = diag{1, w, w2 , w 3 , −i, −w 3 , −w 2 , −w}, 8 are provided above for FFT with n = 8. and G 30 , G 31 , G 32 and P The role of ⎡ e⎤ P8 8 , P 8 } ⎣. . .⎦ 16 = diag{ P P Po8 is to interchange the orders of x j in x = [x0 , x1 , . . . , x15 ]T ∈ C16 such that 16 x = [x0 , x8 , x4 , x12 , x2 , x10 , x6 , x14 , x1 , x9 , x5 , x13 , x3 , x11 , x7 , x15 ]T (4.3.12) P

198

4 Frequency-Domain Methods

(see Exercise 6). Denote 16 x, h = diag{G 31 , G 31 }d, d = diag{G 32 , G 32 } P g = diag{G 30 , G 30 }h, u(= x) = E 16 g. π

Then, with w = e−i 8 , the FFT algorithm for n = 16 to obtain u = x = F16 x is  ⎧ d0 = x0 + x8 ⎧ ⎪ h = d + d ⎪ 0 0 2 ⎪g0 = h 0 + h 4 ⎪ ⎪ ⎨h = d − id ⎪ d1 = x0 − x8 ⎪ ⎪g1 = h 1 + w 2 h 5  1 1 3 ⎪ ⎪ ⎪ ⎪ d2 = x4 + x12 ⎪ ⎪ h = d − d ⎪ 2 0 2 ⎪ ⎪g2 = h 2 − i h 6 ⎪ ⎩ ⎨g = h − w 2 h d3 = x4 − x12 h 3 = d1 + id3 ⎪ 3 3 7  ⎧ d4 = x2 + x10 ⎪h 4 = d4 + d6 ⎪ g = h − h 4 0 4 ⎪ ⎪ ⎪ ⎪ ⎪ d = x2 − x10 ⎨h 5 = d5 − id7 ⎪ ⎪ g5 = h 1 − w 2 h 5 ⎪  5 ⎪ ⎪ ⎪ d6 = x6 + x14 ⎪ h 6 = d4 − d6 ⎪ g6 = h 2 + i h 6 ⎪ ⎪ ⎪ ⎪ ⎩ ⎩ 2 d = x6 − x14 h 7 = d5 + id7  7 ⎧g7 = h 3 + w h 7 ⎧ h 8 = d8 + d10 ⎪ d8 = x1 + x9 ⎪ ⎪g8 = h 8 + h 12 ⎪ ⎪ ⎪ ⎨h = d − id ⎪ ⎪g9 = h 9 + w 2 h 13 d = x1 − x9 9 9 11 ⎪ ⎪  9 ⎪ ⎪ ⎪ h = d − d g10 = h 10 − i h 14 ⎪ 10 8 10 d10 = x5 + x13⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎨g = h − w 2 h h = d + id 9 11 11 11 15 d = x5 − x13⎧ 11  11 ⎪ h = d + d g = h − h 12 12 14 12 8 12 ⎪ d12 = x3 + x11⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪g13 = h 9 − w 2 h 13 ⎪ d13 = x3 − x11 h 13 = d13 − id15⎪ ⎪  ⎪ ⎪ ⎪ h = d − d ⎪ 14 12 14 ⎪ ⎪g14 = h 10 + i h 14 d14 = x7 + x15⎪ ⎪ ⎩ ⎩ 2 h = d + id 13 15 15 g15 = h 11 + w h 15 d =x −x 15

7

15

⎧ u 0 = g0 + g8 ⎪ ⎪ ⎪ ⎪ ⎪ u ⎪ 1 = g1 + wg9 ⎪ ⎪ ⎪ ⎪ u 2 = g2 + w 2 g10 ⎪ ⎪ ⎪ ⎪ ⎪ u 3 = g3 + w 3 g11 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪u 4 = g4 − ig12 ⎪ ⎪ ⎪ u 5 = g5 − w 3 g13 ⎪ ⎪ ⎪ ⎪ ⎪u 6 = g6 − w 2 g14 ⎪ ⎪ ⎪ ⎨u = g − wg 7 7 15 ⎪ u = g − g 8 0 8 ⎪ ⎪ ⎪ ⎪ ⎪ u = g − wg 9 1 9 ⎪ ⎪ ⎪ 2 ⎪ ⎪u 10 = g2 − w g10 ⎪ ⎪ ⎪ ⎪u 11 = g3 − w 3 g11 ⎪ ⎪ ⎪ ⎪ ⎪ u 12 = g4 + ig12 ⎪ ⎪ ⎪ ⎪ ⎪ u 13 = g5 + w 3 g13 ⎪ ⎪ ⎪ ⎪ ⎪ u 14 = g6 + w2 g14 ⎪ ⎪ ⎩ u 15 = g7 + wg15 .

(4.3.13) Observe that the computational complexity for computing 16-point DFT via FFT consists of 17 complex multiplications and 64 additions. The algorithm in (4.3.13) is ready to be implemented easily.  The signal flow charts for 4-point FFT and 8-point FFT are provided in Figs. 4.4 and 4.5, respectively.

Fig. 4.4 FFT Signal flow chart for n = 4

4.3 Fast Fourier Transform

199

Fig. 4.5 FFT signal flow chart for n = 8

Exercises Exercise 1 For n = 2m , where m is an integer, verify that the number of operations required to compute the n-point DFT of any vector x ∈ Cn is of order O(mn) = O(nlogn), when the FFT procedure, as described in (4.3.6) of Theorem 2, is followed. Exercise 2 Derive the formula (4.3.7). Exercise 3 Verify the computational scheme (4.3.9). Exercise 4 Verify the validity of the 8 × 8 permutation matrix (4.3.10). Exercise 5 Verify the computational scheme (4.3.11). 16 and verify (4.3.12). Exercise 6 Write out the permutation matrix P Exercise 7 Fill in the details in the derivation of the computational scheme of the 16-point FFT in (4.3.13). Exercise 8 Attempt to draw the signal flow chart for the 16-point FFT by applying (4.3.13).

4.4 Fast Discrete Cosine Transform The DCT introduced in Sect. 4.2 (see Definition 1 on p.183) is called DCT-II. It is the DCT used in Matlab and certain compression standards, such as JPEG image compression and MPEG video compression to be discussed in the next chapter

200

4 Frequency-Domain Methods

(see Sect. 5.4). Recall that this DCT is derived from the DFT in (4.2.1)–(4.2.10) in Sect. 4.2, and observe that the derivation of this DCT CnI I = Cn in (4.2.10) on p.183 from the DFT F2n can be formulated as the product of the matrix F2n , a matrix P of the 4nth roots of unity, and a permutation matrix Q = Q 2n×n , as follows: CnI I = Cn = P ∗ F2n Q,

(4.4.1)

where F2n is given by (4.1.8) on p.174 with n replaced by 2n, ⎡√

P = P2n×n



2

⎢ ⎥ ωn ⎢ ⎥ ⎢ ⎥ . .. ⎢ ⎥ ⎢ ⎥ n−1 ⎢ 1 ωn ⎥ ⎥ = √ ⎢ ⎥ 2 2n ⎢ ⎢ 0 ... ... 0 ⎥ n−1 ⎢ ωn ⎥ ⎢ ⎥ ⎢ ⎥ . ⎣ ⎦ .. 0 ωn

with

(4.4.2)

2n×n

π

ωn = ei 2n , ⎡

and

Q = Q 2n×n

1

(4.4.3) ⎤

⎥ ⎢ 1 ⎥ ⎢ ⎢ .. ⎥ ⎥ ⎢ . ⎥ ⎢ ⎢ 1⎥ ⎥ ⎢ =⎢ , 1⎥ ⎥ ⎢ ⎢ . ⎥ ⎢ .. ⎥ ⎥ ⎢ ⎦ ⎣ 1 1 2n×n

(4.4.4)

where the zero entries in P and Q are not displayed. There are other types of DCT that are commonly used, including DCT-I, DCT-III, and DCT-IV. For convenience, we introduce the notation ⎧ 1 ⎪ ⎨ √2 , b(k) = 1, ⎪ ⎩ 0,

for = 0, n, for ≤ k ≤ n − 1, for < 0 or k > n,

(4.4.5)

for formulating the DCT matrices of DCT-I, DCT-II, and DCT-III, as follows. (i) DCT-I is defined by the (n + 1) × (n + 1) orthogonal matrix

4.4 Fast Discrete Cosine Transform

201

  I I Cn+1 = cn+1 (k, ) with the (k, )-entry  I (k, ) cn+1

= b(k)b()

2 kπ cos , n n

(4.4.6)

where k,  = 0, . . . , n. (ii) DCT-II is defined by the n × n orthogonal matrix   CnI I = cnI I (k, ) with the (k, )-entry  cnI I (k, )

= b(k)

k( + 21 )π 2 cos , n n

(4.4.7)

where k,  = 0, . . . , n − 1 (see (4.2.9) on p.174). (iii) DCT-III is the transpose of CnI I , defined by   CnI I I = cnI I I (k, ) with the (k, )-entry  cnI I I (k, )

= b()

(k + 21 )π 2 cos , n n

(4.4.8)

where k,  = 0, . . . , n − 1 (note that cnI I I (k, ) = cnI I (, k)). (iv) DCT-IV is defined by the n × n orthogonal matrix   CnI V = cnI V (k, ) with the (k, )-entry  cnI V (k, )

=

(k + 21 )( + 21 )π 2 cos , n n

(4.4.9)

where k,  = 0, . . . , n − 1. I , C I I , C I I I , and C I V is an orthogonal Remark 1 Since each of the matrices Cn+1 n n n matrix (see Exercises 1–4), the corresponding inverse matrix is its transpose. Observe I and CnI V are symmetric matrices, they are their own inverses, namely: that since Cn+1

202

4 Frequency-Domain Methods

 2 2  I Cn+1 = In+1 , CnI V = In , where In+1 and In are identity matrices of dimensions n + 1 and n, respectively.   Also, as mentioned above, since CnI I I is the transpose of CnI I , we have CnI I   CnI I I = In .  In (4.4.1), we give the matrix representation of CnI I in terms of the DFT matrix F2n . Since CnI I I is the transpose of CnI I , it can be written as CnI I I = Q T F2n P,

(4.4.10)

in view of the fact that the matrix F2n is symmetric (see (4.1.8) on p.183). As to I and CnI V , we have Cn+1 1 I Cn+1 = √ U T F2n U, (4.4.11) 2 2n ⎡√

where

U = U2n×(n+1)

⎢ ⎢ ⎢ ⎢ ⎢ ⎢ =⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣



2 1

⎥ ⎥ ⎥ ⎥ . ⎥ 1√ ⎥ ⎥; 0 2⎥ ⎥ ⎥ 1 ⎥ ⎥ . . ⎦ .

..

(4.4.12)

0 1 and

1 CnI V = √ e−iπ/4n R T F2n R, 2 2n

where



R = R2n×n



1

⎢ ωn ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ = ⎢ 0 ... ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ . ⎣ .. ωn

(4.4.13)

..

. ω n−2 n

... ..

.

0 ωnn−1

⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ n−1 ωn ⎥ ⎥ i ⎥ ⎥ 0 ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦

(4.4.14)

4.4 Fast Discrete Cosine Transform

203

π

with ωn = ei 2n . The importance of the matrix-product formulation of the DCT, in (4.4.1), (4.4.10), (4.4.11), and (4.4.13), is that the FFT, as introduced in Sect. 4.3, can be applied to achieve fast computation of the DCT (called FDCT), with O(n log n) operations, as opposed to O(n 2 ) operations, provided that n = 2m for some integer m > 0. The reason is that multiplication by the matrices P, Q, U and R (in (4.4.2), (4.4.4), (4.4.12) and (4.4.14), respectively), or by their inverses, does not increase the order O(n log n) of operations (see Exercise 8). DCT-II is more often used than DCT-I and DCT-IV. For example, for JPEG image and MPEG video compressions, the digital image is divided into 8×8 pixel blocks and 8-point DCT-II is applied to both the horizontal and vertical directions of each block. In Chap. 5, we will describe the general compression scheme in Sect. 5.3 and JPEG image compression standard in Sect. 5.4. In Figs.4.6, 4.7, 4.8 we include DCT-II via FDCT computations without going into any details. The interested reader is referred to the vast literature on JPEG image compression, the topic to be studied in the next chapter (see Exercises 9-11). Exercises I Exercise 1 Verify that matrix Cn+1 with entries given by (4.4.6) is orthogonal for n = 3.

Exercise 2 Verify that matrix CnI V with entries given by (4.4.9) is orthogonal for n = 3 and n = 4.

Fig. 4.6 Lee’s fast DCT implementation

204

4 Frequency-Domain Methods

Fig. 4.7 Hou’s Fast DCT implementation

Fig. 4.8 Fast DCT implementation of Wang, Suehiro and Hatori

I Exercise 3 Prove that matrix Cn+1 is an orthogonal matrix for all positive integers n, ! 2 I by showing that Cn+1 = In+1 .

Exercise 4 Prove that matrix CnI V is an orthogonal matrix for all positive integers n, !2 by showing that CnI V = In . Exercise 5 Verify for n = 3 and n = 4, that matrix CnI I with entries given by (4.4.7) satisfies (4.4.1), where P = P2n×n and Q = Q 2n×n are defined by (4.4.2) and (4.4.4), respectively.

4.4 Fast Discrete Cosine Transform

205

I Exercise 6 Verify for n = 3, that matrix Cn+1 with entries given by (4.4.6) satisfies (4.4.11), where U = U2n×(n+1) is defined by (4.4.12).

Exercise 7 Verify for n = 3 and n = 4, that CnI V with entries given by (4.4.9) satisfies (4.4.13), where R = R2n×n is defined by (4.4.14). Exercise 8 Apply the complexity count of FFT from Sect. 4.3 to show that the I , C I I , and C I V . (Note that since C I I I is FDCT has O(n log n) operations for Cn+1 n n n the transpose of CnI I , the FDCT for CnI I I has the same number of operations as CnI I .) Exercise 9 Verify the correctness of Lee’s FDCT implementation (in Fig. 4.6) for n = 8. Exercise 10 Verify the correctness of Hou’s FDCT implementation (in Fig. 4.7) for n = 8. Exercise 11 Verify the correctness of FDCT implementation (in Fig. 4.8) by Wang, Suehiro and Hatori, for n = 8.

Chapter 5

Data Compression

The difference between data reduction, including data dimensionality reduction studied in Chap. 3, and the topic of data compression to be investigated in this chapter is that compressed data must be recoverable, at least approximately. The most commonly used representation of data (particularly compressed data) for communication and storage is a string of numbers consisting only of zeros, 0’s, and ones, 1’s, without using any punctuation. This string of 0’s and 1’s is called a “binary code” of the data. For the recovery of the data, whether compressed or not, the binary code must be decipherable by referring to the corresponding code-table. The length of the binary code depends on coding efficiency, which is governed by the “entropy” of the source data. On the other hand, by manipulating the source data in a clever way, the entropy could be decreased, and hence allows for shorter binary codes. In addition, if recovery of a fairly good approximation of the source data is sufficient, the entropy could be significantly decreased to allow for very high compression ratios. This is called “lossy” (or irreversible) compression. Lossy compression of photographs (particularly, digital images) and videos are most effective, since the human eye is not very sensitive to the high-frequency contents of imagery data. To quantify the efficiency of coding schemes, the notion of “entropy” is introduced in Sect. 5.1, in terms of the probability distribution of the data. In addition to establishing the theoretical results, examples are given to illustrate the feasibility of decreasing this governing quantity. Binary codes are studied in the second section. While Kraft’s inequality governs the length of individual code-words, the average code length is shown to be bounded below by the entropy. The most commonly used (entropy) coding algorithm, called Huffman coding, is discussed in some details in Sect. 5.2. It is also demonstrated in this section that the Huffman code does not exceed the upper bound of the Noiseless Coding Theorem. The topic of data compression schemes is discussed in Sect. 5.3, and specifically, the application of DCT, studied in Chap. 4, is applied to segments (also called blocks or tiles) of the data sequence to facilitate efficient computation and memory saving, particularly for long C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_5, © Atlantis Press and the authors 2013

207

208

5 Data Compression

data sequences. Unfortunately, if the DCT segments are quantized, the inverse DCT, again applied to individual segments, reveals “blocky artifacts”, called “ quantization noise”. To suppress the quantization noise, the data segments can be pre-processed in such a way that the DCT applied to each segment is changed to a “windowed DCT” of the same segment, along with some data in its two adjacent segments. Since DCT is an orthogonal matrix and the windowed DCT extends the original DCT to operate on some data from the adjacent segments, the windowed DCT is also called a lapped orthogonal transform, with abbreviation “LOT”. The more general “lapped transform”, without the orthogonality restriction, is also introduced in Sect. 5.3 for the purpose of constructing the general LOT, while the notion of “dual lapped transform” is also introduced to yield the inverse of the corresponding LOT, via data post-processing. Digital image compression, and in particular the JPEG compression standard, is the central theme of investigation in Sect. 5.1, that includes the study of color formats and extension to video compression.

5.1 Entropy Information Theory is an area of Applied Mathematics which is concerned with such topics as the quantification, coding, communication, and storage of information. Here, the term “information” should not be confused with the word “meaning”, since in Information Theory, “pure nonsense” could be considered as an information source. In other words, “information” should not be measured in terms of “what is to be said”, but only of “what could be said”. As Claude Shannon, the father of Information Theory, emphasized in his 1948 pioneering paper, the semantic aspects of the communication of information are irrelevant to mathematical and engineering considerations. On the other hand, coding of an information source is essential for communication and storage of the information. Here, the term “coding” consists of two operations: “encoding” of the information source to facilitate effective communication and storage, and “decoding” to recover the original information source. In other words, decoding is the inverse operation of encoding. In this chapter, we only consider “binary coding”, meaning that the encoded information, called a binary code, is a (finite) “sequence” of numbers consisting only of zeros, 0’s, and ones, 1’s; and that all “code-words” and “instructions” that constitute this so-called sequence can be identified from some “code-table”, even though no punctuation are used in this “sequence” to separate them. Furthermore, all elements of the set of the information source are represented as code-words in the code-table. Here, the term “sequence” is in quotation, since no “commas” are allowed, as opposed to the usual mathematical definition of sequences. Hence, if each element of the given set of information source is assigned some non-negative integer, all of such integers with the “code-words” contained in some “code-table”, then decoding to receive or recover the original information source from the code is possible by using the code-table. To quantify the integer representation

5.1 Entropy

209

of (the elements of the set of) the information source, the unit “bit”, coined by John Tukey as the abbreviation of “binary digits” is used. For example, if the largest (nonnegative) integer used in the integer representation of the information source is 255, we say that the source is an 8-bit information (since the binary representation of 255 is 11111111, a string of 8 ones). In other words, one bit (or 1b) is one 0 or 1. In addition, the measurement in the unit of “bytes” is also used. One byte (1B) is equal to 8 bits (1B = 8b). When a binary code is stored in some memory device (such as hard disk, flash memory, or server), the length of the “sequence” of 0’s and 1’s is called the “file size” of the (compressed) data. When the binary code is transmitted (such as in broadcasting or video streaming), the length of the sequence of 0’s and 1’s is called the length of the bit-stream, and the speed of transmission is called the “bit-rate”. While file sizes are measured in kilo-bytes, mega-bytes, giga-bytes, and tera-bytes (or kB, MB, GB, and TB, respectively), bit-rates are measured in kilo-bits per second, mega-bits per second, and giga-bits per second (or kb/s, Mb/s, Gb/s, respectively). A typical example is an 8-bit gray-scale digital image, with the “intensity” of each pixel (considered as an element of the set of the information source, which is the image) being calibrated in increasing integer steps from 0 to 255, with 0 representing “black” (or no light) and 255 representing “white”, while for j = 1, . . . , 254, the increase in intensity yields increasingly lighter shades (of gray). Another example is a 24-bit color digital image. But instead of using 24 bits in the binary expression of the largest integer used in the integer representation of the information source (which is a color image), the convention is to assign 8-bits to each of the three primary color components, “red” (R), “green” (G), and “blue” (B), in that a triple (i, j, k) of integers, with each of i, j, and k, ranging from 0 to 255, for (R, G, B) represents increasing intensities of the red, green, and blue (visible) lights, respectively. Recall that as primary additive colors, addition of different intensities for R, G, B (that is, different values of i, j, k) yield 28 × 28 × 28 = 224 colors of various shades. In particular, a gray-scale digital image is obtained by setting i = j = k, where 0 ≤ i ≤ 255. Therefore, the term “24-bit color” is justified. It must be understood that the meaning of a 12-bit novel only indicates an upper bound of the number of words in the novel. The actual encoded file size is significantly larger, and often quantified in megabytes. The typical compressed file size of a novel is about 1 MB. Similarly, by a 24-bit color picture, we only mean that the quality (in terms of shades of color) is limited to 24-bit. The file size of a JPEG compressed image usually exceeds several kilo-bytes and occasionally even over 1 MB, depending on the image resolution (that is, the number of pixels). In the above examples, it will be clear that the notion of “probability”, in the sense of percentages of equal pixel values for a digital image and percentages of the same word being used in the novel, should play an important role in information coding. For instance, in a picture with blue sky and blue water in the background, the percentages of RGB pixels with values of (0, 0, k), where 50 ≤ k ≤ 150, are much higher than those with (i, j, 0) for all 0 ≤ i, j ≤ 255. 
Also, for the novel example mentioned above, the frequency of occurrence of the word “the” is much higher than just about all of the other words in the novel. In fact, according to some study, less

210

5 Data Compression

than half of the vocabulary used in a typical novel constitutes over 80 % of all the words in the book. Furthermore, it should be emphasized that the “entropy” of the information source (to be defined next in terms of the probabilities of occurrence as discussed above) often decreases when some suitable mathematical transformation is applied to the integer representation of the information source. Typical transforms include RLE and DPCM, to be discussed briefly in Example 1 on p.210 and again in Sects. 5.3 and 5.4 on digital image compression. Since the encoded file size and length of an encoded bit-stream are governed by the entropy, it is important to understand this concept well. Definition 1 Entropy Let X n = {x1 , . . . , xn } be an information source (or some mathematical transformation of a given information source). Let Z = {z 1 , . . . , z m } be the subset of all distinct elements of the set X n . Corresponding to each z j ∈ Z , j = 1, . . . , m, let p j denote its probability of occurrence in X n . Then P = { p1 , . . . , pm } is a discrete probability distribution, and the function H (P) = H ( p1 , . . . , pm ) =

m 

p j log2

j=1

1 pj

(5.1.1)

is called the entropy of the probability distribution P, or more specifically, the entropy of the pair (X n , P). Remark 1 For most applications, since the # X n = n is very large and since many (or even most) information sources of the same application are quite similar, the same discrete probability distribution P is used for all such information sources. This P is usually obtained from large volumes of previous experiments. Of course, the precise values of p1 , . . . , pm that constitute P can be defined (and computed) in terms of the “histogram” of X n ; namely, pj = for each j = 1, . . . , m.

1 #{xi ∈ X n : xi = z j }, n

(5.1.2) 

Example 1 Let the information source be the data set X = {0, 1, . . . , 255}. Introduce some transformation of X to another set Y that facilitates binary coding which must be reversible. Solution The set X can be written as X = {x0 , . . . , x255 } with x j = j for j = 0, . . . , 255. Since the binary representation of 255 is a string of 8 ones; namely x255 = 11111111, and since no commas are allowed to separate x j from its neighbor x j+1 , the straight forward and naive way is to assign 8 bits to each of j = 0, . . . , 255; namely, x0 = 00000000, x1 = 00000001, . . . , x255 = 11111111. Hence, the code is 00000000000000010 · · · 011111111, which has length of 256 × 8 = 2048 bits. On the other hand, if we set y0 = 0 and

5.1 Entropy

211

y j = x j − x j−1 − 1, for j = 1, . . . , 255,

(5.1.3)

then we have y j = 0, for all j = 0, . . . , 255. Therefore, the code for Y = {y0 , . . . , y255 } = {0, . . . , 0} is simply 0 · · · 0, a string of 256 zeros. The transformation from X to Y is reversible, since, for y j = 0, j = 0, . . . , 255, we have x j = y j + x j−1 + 1, j = 1, . . . , 255,

(5.1.4)

with initial condition x0 = 0. Of course the formula (5.1.4) along with x0 = 0 must be coded, but this requires only a few bits. In addition, instead of coding a string of 256 zeros in (5.1.3) with y0 = 0, one may “encode” the size of a block of 256 zeros. This is called “run-length encoding (RLE)”. Also, the code of the transformation in (5.1.3) is called the “differential pulse code modulation (DPCM)”. Both RLE and DPCM are commonly used, and are part of the JPEG coding scheme for image compression. We will elaborate on RLE and DPCM in Sect. 5.4 later.  Example 2 As a continuation of Example 1 on p. 210 compute the entropy of the information source X n = {x0 , . . . , x255 } with x j = j and n = 256 and the DPCM transformation Y256 = {y0 , . . . , y255 }, with y0 = 0 and y j = x j − x j−1 − 1, j = 1, . . . , 255. 1 , Solution For X 256 , since all its elements are distinct, we have p0 = · · · = p255 = 256 and hence, the entropy of (X 256 , P) is given by

H ( p0 , . . . , p255 ) =

255  j=0

1 1  = log2 256 = 8. pj 256 255

p j log2

j=0

On the other hand, since Y256 = {0, . . . , 0}, the probability distribution is P = {1}, so that H (1) = 0.



In the following, we show that with the given probability distribution as weights to define the weighted average of the logarithm of another probability distribution Q, then P is the optimal choice among all Q. Theorem 1 Let P = { p1 , . . . , pm } be a given discrete probability distribution with entropy H (P) defined by (5.1.1) of Definition 1. Then for all probability distributions Q = {q1 , . . . , qm }; that is, 0 ≤ q j ≤ 1 for j = 1, . . . , m, and q1 + · · · + qm = 1, the quantity m  1 p j log2 G(q1 , . . . , qm ) = qj j=1

212

5 Data Compression

satisfies G(q1 , . . . , qm ) ≥ G( p1 , . . . , pm ) = H (P). Furthermore, G(q1 , . . . , qm ) p1 , . . . , q m = pm .

=

(5.1.5)

H (P) in (5.1.5), if and only if q1

=

The result in (5.1.5) can be reformulated as H (P) = inf G(Q),

(5.1.6)

Q

where inf (which stands for “infimum”) is interpreted as the greatest lower bound over all probability distributions Q. Proof of Theorem 1 To prove (5.1.5), let us consider the constraint optimization problem m  x j = 1}. (5.1.7) min{G(x1 , . . . , xm ) : j=1

By introducing a parameter λ, called the Lagrange multiplier, the problem (5.1.7) can be solved by applying the “method of Lagrange multipliers”, namely: min

x1 ,...,xm ,λ

G λ (x1 , . . . , xm , λ),

where G λ (x1 , . . . , xm , λ) = G(x1 , . . . , xm ) + λ

m 

 xj − 1 .

(5.1.8)

j=1

To find the local minima of (5.1.8), we may compute the partial derivatives of G λ with respect to each of x1 , . . . , xm , λ and set them to be zero, yielding: 1 ∂ ∂ Gλ = p log2 +λ ∂x ∂x x  1  = p − + λ,  = 1, . . . , m, x ln 2

0=

and

 ∂ Gλ = x j − 1. ∂λ

(5.1.9)

m

0=

(5.1.10)

j=1

By (5.1.9), we have

so that, in view of (5.1.10),

p = λln 2,  = 1, . . . , m, x

(5.1.11)

5.1 Entropy

213

1=

m 

p = λln 2

m 

=1

or λ =

1 ln 2 ;

x = λln 2

=1

and it follows from (5.1.11) that ln 2 p = 1, = x ln 2

or equivalently x = p (i.e. q = p ), for  = 1, . . . , m. This completes the proof of Theorem 1 or (5.1.6).  In the following, we list two useful properties of the entropy function. Theorem 2 The entropy function H (P) defined on the collection of all discrete probability distributions P satisfies 0 ≤ H (P) ≤ log2 m,

(5.1.12)

where m denotes the number of positive masses of P; namely, P = { p1 , . . . , pm } with p j > 0 for all j = 1, . . . , m. Furthermore, H (x1 , . . . , xm ) is a continuous function on the simplex m 

Dm = {(x1 , . . . , xm ) :

x j = 1, 0 ≤ x1 , . . . , xm ≤ 1},

j=1

and has continuous partial derivatives in the interior of D. We remark that the lower bound in (5.1.12) is attained at the discrete probability distribution P = {1} with a single positive mass, while the upper bound in (5.1.12) is attained at the discrete probability distribution P = { m1 , . . . , m1 } with m equal masses p1 = · · · = pm = m1 (see Example 2 on p.211). Proof of Theorem 2 That the lower bound in (5.1.12) is valid follows from the fact that log2 ( p1j ) ≥ 0 for 0 < p j ≤ 1, since H (P) =

m  j=1

p j log2

1 ≥ 0. pj

To derive the upper bound in (5.1.12), we may apply the method of Lagrange multipliers introduced in the proof of Theorem 1 , by considering the function G(x1 , . . . , xm , λ) =

m  j=1

m   1 x j log2 +λ xj − 1 , xj j=1

214

5 Data Compression

and setting all first order partial derivatives of G(x1 , . . . , xm , λ) to be zero; namely, 1 ∂  ∂ x log2 +λ G= ∂x ∂x x  1 1  = log2 + λ,  = 1, . . . , m, + x − x x ln 2

0=

and

 ∂ G= 0= x j − 1. ∂λ

(5.1.13)

m

(5.1.14)

j=1

From (5.1.13), we have, for all  = 1, . . . , m, log2 x = λ −

1 ln 2

1

or x1 = x2 = · · · = xm = 2(λ− ln 2 ) , so that (5.1.14) implies that x1 = · · · = xm =

1 . m

Hence, H (x1 , . . . , xm ) =

m  j=1

=m

x j log2

1 xj

1 1 log2 = log2 m, m 1/m

which is the upper bound in (5.1.12). To complete the proof of Theorem 2, we observe that the function f (x) = x log2 x is continuous on the closed interval [0, 1] and has continuous derivatives for 0 < x ≤ 1.  For discrete probability distributions P = { p1 , . . . , pm } with a large number m of positive masses, the following result allows the partitioning of P into arbitrarily many disjoint subsets of probability distributions (obtained after appropriate normalization) to facilitate computing the entropy H (P) of P. To facilitate the statement of this result, let n < m be arbitrarily chosen and write { p1 , . . . , pm } = { pk0 +1 , . . . , pk1 ; pk1 +1 , . . . , pk2 ; . . . ; pkn−1 +1 , . . . , pkn }, (5.1.15) with 0 = k0 < k1 < · · · < kn = m, where k1 , . . . , kn−1 are arbitrarily selected so that

5.1 Entropy

215 k+1 −k



pk + j = a+1 > 0

(5.1.16)

j=1

for each  = 0, . . . , n − 1. Observe that A = {a1 , . . . , an },

(5.1.17)

and all of  = { pk +1 /a+1 , pk +2 /a+1 , . . . , pk+1 −1 /a+1 , pk+1 /a+1 } P

(5.1.18)

for  = 0, . . . , n, are discrete probability distributions (see Exercise 7). Theorem 3 Let P = { p1 , . . . , pm } be a discrete probability distribution, partitioned n−1 be the 0 , . . . , P arbitrarily as in (5.1.15) such that (5.1.16) itholds. Also, let A, P discrete probability distributions defined by (5.1.17) and (5.1.18). Then the entropy of P can be partitioned as follows: H (P) = H (A) +

n−1 

 ). a+1 H (P

(5.1.19)

=0

Proof To derive the formula (5.1.19), we observe, for each  = 0, . . . , n − 1, that  ) = a+1 a+1 H (P

k+1 −k

 j=1

pk  + j a+1 log2 a+1 pk +1

k+1 −k



=

pk + j log2

j=1 k+1 −k

=



pk + j log2

j=1

1 pk  + j



−k  k+1 

 pk + j log2

j=1

1 a+1

1 1 − a+1 log2 . pk  + j a+1

Therefore, summing both sides of the above formula over all  = 0, . . . , n − 1 yields n−1  =0

 ) = a+1 H (P

n−1 

k+1 −k

=0

j=1



pk + j log2

1 pk  + j

− H (A) = H (P) − H (A),

completing the proof of (5.1.19). Example 3 Compute the entropy of the discrete probability distribution



216

5 Data Compression

P=

1 1 1 1 1 1 , , , , , 6 3 6 12 12 6

by applying Theorem 3.

1 1 1 1 1 1 Solution Write P = 12 , 12 , 6 , 6 , 6 , 3 . Then with m = 6 and n = 3 in (5.1.15), and 1 1 1 + = ; 12 12 6 3 1 1 1 a2 = + + = ; 6 6 6 6 1 2 a3 = = , 3 6

a1 =

we have H (P) = H (a1 , a2 , a3 ) + a1 H

 1  1 1  1 1  + a2 H , , , 12a1 12a1 6a2 6a2 6a2

 1  3a3 1 3 2 1 1 1 3 1 1 1 2 + H , + H , , + H (1) =H , , 6 6 6 6 2 2 6 3 3 3 6 1  6 2 6 1 1 3 = log2 6 + log2 + log2 + 2 log2 2 6 6 3 6 2 6 2  3 1 + 3 log2 3 + 0 6 3 1  1 3 1 3 2 = log2 3 + + + log2 3 + + log2 3 6 6 6 6 6 6 5 = + log2 3. 6 + a3 H

 Exercises Exercise 1 Let P5 = { p1 , . . . , p5 } be a discrete probability distribution, where p j denotes the probability of occurrence of the integer j in a given information source X n for j = 1, . . . , 5. Compute P5 = P5 (X n ) for each of the following information sources in terms of the histogram of X n . (a) X 8 = {2, 2, 3, 5, 4, 1, 1, 1}. (b) X 14 = {5, 5, 3, 4, 2, 5, 4, 4, 3, 2, 1, 3, 5, 1}. (c) X 16 = {1, 3, 2, 1, 3, 2, 1, 3, 3, 3, 4, 5, 4, 3, 3, 1}. Exercise 2 As a continuation of Exercise 1, compute the entropy of each of the information sources X 8 , X 14 , X 16 (or more precisely, the entropy H (P5 ) for each information source).

5.1 Entropy

217

Exercise 3 Let P8 = { p1 , . . . , p8 } be a discrete probability distribution, where p j denotes the probability of occurrence of the source word a j in a given information X n ) for each of the following  Xn. source  X n , for j = 1, . . . , 8. Compute P8 = P8 (   X 16 = {a8 , a6 , a4 , a2 , a2 , a4 , a1 , a5 , a3 , a7 , a2 , a8 , a6 , a8 , a8 , a2 }.  X 24 = {a1 , a1 , a1 , a7 , a6 , a5 , a7 , a5 , a4 , a8 , a4 , a8 , a2 , a2 , a3 , a1 , a2 , a3 , a7 , a5 , a6 , a7 , a1 , a8 }. (c)  X 32 = {a3 , a7 , a7 , a7 , a3 , a1 , a5 , a5 , a3 , a1 , a8 , a6 , a5 , a4 , a2 , a2 , a1 , a3 , a4 , a7 , a3 , a5 , a3 , a1 , a1 , a1 , a3 , a7 , a7 , a8 , a6 , a1 }.

(a) (b)

Exercise 4 As a continuation of Exercise 3, compute the entropy of each discrete X n ) for  Xn =  X 16 ,  X 24 ,  X 32 . probability distribution P8 = P8 (  Exercise 5 For the discrete probability distributions P5 = P5 (X n ) in Exercises 1–2, let a1 = p1 + p2 , a2 = p3 + p4 + p5 , and consider the discrete probability distributions A = {a1 , a2 }, Q 1 = { p1 /a1 , p2 /a1 }, and Q 2 = { p3 /a2 , p4 /a2 , p5 /a2 }. Compute the entropies of A, Q 1 , Q 2 for each of the information sources X n = X 8 , X 14 , X 16 in Exercise 1a–c. Then apply the formula (5.1.19) of Theorem 3 to compute H (P8 ) = H (P8 (X n )). Compare with the answers for Exercise 2. Exercise 6 For the discrete probability distributions P8 = P8 (  X n ) in Exercises 3–4, let a1 = p1 + · · · + p4 , a2 = p5 + · · · + p8 , and consider the probability distri1 = { p1 /a1 , . . . , p4 /a1 }, and Q 2 = { p5 /a2 , . . . , p8 /a2 }.  = {a1 , a2 }, Q butions A 2 ) for each of the information sources  H (Q 1 ), H ( Q Compute the entropies H ( A),  X 16 ,  X 24 ,  X 32 in Exercise 3a–c. Then apply (5.1.19) of Theorem 3 to compute Xn =  X n )). Compare with the answers for Exercise 4. H (P8 ) = H (P8 (   for  = 0, . . . , n be defined by (5.1.18). Show that they are Exercise 7 Let P discrete probability distributions.

5.2 Binary Codes In this section, we introduce the concept of binary codes and the Huffman coding scheme. As mentioned earlier, an encoded file or bit-stream is a “sequence” of numbers consisting of only zeros (0) and ones (1), but without separation of any punctuation such as commas, periods, hyphens, parentheses, quotation marks, etc. A typical code looks like 0101100110011101111000110001, but is much much longer. By using a “code-table” that consists of “code-words”, the message (such as a digital image, an audio recording, a document, etc.) can be recovered by “decoding” the bit-stream. A code-word is a binary number such as 0, 10, 110, 1111. If a table consists only of these four code-words; that is,

218

5 Data Compression

K 4 = {0, 10, 110, 1111},

(5.2.1)

then to encode a sentence by using the table K 4 in (5.2.1), we can only use at most 4 different words in the sentence. For example, by using K 4 to represent the four words: 0 = it 10 = good 1111 = bad 110 = is then the four bit-streams “110010”, “01101111”, “011010”, “101101111” uniquely represent four different messages, without ambiguity. Try to decode (or decipher) them by yourself. (The answers are : “is it good”, “it is bad”, “it is good”, and “good is bad”.) Again, there is absolutely no ambiguity to use the code-table (5.2.1) to represent 4 different words to compose any sentence. Hence, the table K 4 is called a code-table. On the other hand, the table T4 = {0, 10, 1, 11} is not a code-table. Indeed, by using T4 to represent the four words: 0 10 1 11

= you = not = should = go

then the same bit-stream “011011” represents the following five different messages: “you should not go”, “you go you go”, “you should should you go”, “you should should you should should”, and “you go you should should”. Moreover, even among code-tables of the same size, some are better than others. For example, both 2 = {0, 01} K 2 = {0, 10} and K 2 ) to represent the are code-tables. By applying Table K 2 (and respectively Table K two colors b (black) and w (white); namely, Table K 2 2 Table K

0 = w, 10 = b, 0 = w, 01 = b,

the two bit-streams 0001000101000

(5.2.2)

0000100010100

(5.2.3)

5.2 Binary Codes

219

2 for decoding (5.2.3)), represent (with Table K 2 for decoding (5.2.2), and Table K the same color sequence w w w b w w b b w w. (5.2.4) But there is some difference between them. To decode (5.2.2) by using Table K 2 to yield (5.2.4), it is “instantaneous” to identify the 3 black (b) pixels in (5.2.4), in that when reading from left to right the first 1 already determines the code-word 10 (and hence the color b), since the only code-word containing 1 is 10. On the other hand, 2 and again reading from left to right, identifying to decode (5.2.3) by using Table K the color b is not instantaneous, since 0 does not indicate the code 01 without reading the second digit 1 in 01. In any case, a binary code-word is always represented as a “sequence” of 0 and/or 1 without using “commas” in-between. Definition 1 Length of code-word The length of a code-word c j in some codetable K N = {c1 , . . . , c N } is the number of digits 0 and/or 1 in the binary representation of the “integer” c j , and will be denoted by  j = length (c j ).

(5.2.5)

In constructing a code-table K N with N code-words c1 , . . . , c N , it is of course most desirable to achieve the shortest lengths of the code-words. However, the restriction on the lengths is governed by the following Kraft’s inequality: N 

2− j ≤ 1.

(5.2.6)

j=1

Observe that if x1 = 0 and x2 = 1 are both in the code-table K N , then there cannot be any other code-word in K N , since we would already have attained the upper bound in (5.2.6); namely 2−1 + 2−2 =

1 1 + = 1, 2 2

so that the size of K N is N = 2. We will not prove (5.2.6) in this book, but mention that Kraft only proved (5.2.6) for instantaneous codes in his 1949 MIT Master’s Thesis in Electrical Engineering, but McMillan removed the “instantaneous” restriction to allow the validity of (5.2.6) for all decodable (or decipherable) tables. Remark 1 To assess the suitability of the code-tables K N = {c1 , . . . , c N } in Definition 1 for encoding a given information source X n = {x1 , . . . , xn } with a relatively shorter bit-stream (or smaller file size), recall the notion of the discrete probability distribution P = { p1 , . . . , pm } associated with X n . Assuming that the code-table K N is constructed solely for X n with probability distribution P, then since each

220

5 Data Compression

code-word c j ∈ K N is constructed according to the value of the corresponding p j ∈ P, as compared with all pk ∈ P, k = 1, . . . , m, the cardinality m = #P agrees with the number N of code-words in K N . Under this assumption, observe that the precise value of p j ∈ P, for each fixed j, as defined by the histogram of X n in (5.1.2), increases proportionally as the frequency of occurrence of the same x j ∈ X according to (5.1.2) increases. Hence, for this discussion with m = N , the length  j of the code-word c j ∈ K N should be relatively shorter for larger relative values of p j for the code-table K N to be suitable for encoding X n . For this reason, the values p1 , . . . , pm ∈ P are used as weights to define the weighted average: avglength (K N ) = avglength {c1 , . . . , c N } =

N 

pjj,

(5.2.7)

j=1

called “average code-word length” of K N , where  j = length (c j ) as introduced in (5.2.5).  Remark 2 In applications, as mentioned in Remark 1, the same probability distribution P, and hence the same code-table K N , is constructed for a large class of information sources X n with different cardinalities n = # X n . In particular, N is usually larger than m = #P, where P is the discrete probability distribution associated with X n . For example, the same code-table K N , called the “Huffman table”, to be discussed later in this section, is used for most (if not all) JPEG compressed digital images, a topic of investigation in Sect. 5.4 of this chapter.  Let us return to Kraft’s inequality (5.2.6) and observe that it governs the necessity of long code-word lengths, when the number of code-words must be sufficiently large for practical applications. However, it does not give a quantitative measurement of the code-word lengths. In the following, we will see that the entropy H (P) of a given discrete probability distribution P = { p1 , . . . , p N }, with N equal to the size of the desired code-table K N = {c1 , . . . , c N }, provides a perfect measurement stick. Theorem 1 Let K N = {c1 , . . . , c N } be a code-table with code-word lengths  j = length {c j }, j = 1, . . . , N , as introduced in Definition 1. Then the average codeword length of K N is bounded below by the entropy of the desired discrete probability distribution that governs the construction of the code-table; namely, H (P) ≤ avglength (K N ),

(5.2.8)

where avglength (K N ) is defined in (5.2.7). Furthermore, equality in (5.2.8) is achieved if and only if both of the following conditions are satisfied: pj =

1 , j = 1, . . . , N , 2n j

for some positive integers n 1 , . . . , n N ; and

(5.2.9)

5.2 Binary Codes

221

 j = n j , j = 1, . . . , N ,

(5.2.10)

so that Kraft’s inequality becomes equality. Proof The proof of this theorem is an application of Theorem 1 and Kraft’s inequality (5.2.6). Indeed, let 1 − j qj = 2 , j = 1, . . . , N , C where C=

N 

2−k .

k=1

Then since q1 + . . . + q N = 1 and 0 ≤ q j ≤ 1 for all j, we may apply (5.1.5) on p.212 to conclude that H (P) ≤ G(q1 , . . . , q N ) =

N 

  p j log2 C 2 j

j=1

=

N 

  p j  j + log2 C

j=1

= avglength (K N ) + log2 C, N according to (5.2.7) and j=1 p j = 1. In view of Kraft’s inequality, we have log2 C ≤ 0, and hence we have completed the derivation of (5.2.8). Furthermore, equality in (5.2.8) holds if and only if log2 C = 0, or C = 1 and H (P) = G(q1 , . . . , qn ), which, according to the last statement in Theorem 1 of the previous section on p.211, is equivalent to q 1 = p1 , . . . , q N = p N , or equivalently, p1 =

1 1 , . . . , pN =  . 2 1 2 N

This completes the proof of (5.2.9)–(5.2.10), with n j =  j , which is an integer, being  the length of the code-word c j , j = 1, . . . , N . Remark 3 Since discrete probability distributions are seldom positive integer powers of 21 , one cannot expect to achieve a code-table with minimum average code-word lengths in general. On the other hand, there are fairly complicated coding schemes,

222

5 Data Compression

such as “arithmetic coding”, that could reach the entropy lower bound as close as desired.  In introducing the notion of entropy, Claude Shannon also showed that the entropy can be used as a measuring stick in that there exist instantaneous code-tables with average code-word lengths bounded above by the entropy plus 1. In other words, we have the following result, called “noiseless coding” by Shannon. Theorem 2 Noiseless Coding Theorem For any discrete probability distribution P = { p1 , . . . , p N }, H (P) ≤ min avglength (K N ) < H (P) + 1, KN

where the minimum is taken over all instantaneous code-tables. We will not prove the above theorem in this book, but proceed to discuss the most popular coding scheme introduced by David Huffman in his 1952 paper. In Professor Fano’s undergraduate class at MIT in 1951, in lieu of taking the final examination, the students were given the option of turning in a project paper on developing a coding scheme that achieves entropy bounds in the Noiseless Coding Theorem. One of the students, David Huffman, met the challenge and developed the following simple, yet effective, scheme based on building a binary tree by using the given probability distribution P. We remark that since only P and not the information source is used to construct the code-table, there is no need to refer to the information source X n as in Definition 1. Algorithm 1 Huffman coding scheme Let P N0 = { p1 , . . . , p N } be a given discrete probability distribution. For convenience, assume that p1 , . . . , p N have been re-arranged so that 0 < p1 ≤ p2 ≤ · · · ≤ p N . Step 1. Use the two smallest numbers p1 , p2 to build the first branch of the Huffman tree, with labels p1 and p2 for the two branches and p1 + p2 for the “stem” of the branch. Then assign the partial codes 0 to the smaller p1 and 1 to the larger p2 , as follows. (If p1 = p2 , the assignment is arbitrary.) 0 p1

p1 + p2

1 p2 Step 2. Let P N1 −1 = { p11 , . . . , p 1N −1 } be the discrete probability distribution consisting of p1 + p2 , p3 , . . . , p N ,

5.2 Binary Codes

223

but re-arranged in non-decreasing order, so that 0 < p11 ≤ p21 ≤ · · · ≤ p 1N −1 . (i) If p1 + p2 = p11 , p21 , repeat Step 1 to build a new branch with analogous labels and partial codes as follows. 0 p11 = p3 p11 + p21 = p3 + p4 1 p21 = p4 (ii) If p1 + p2 = p11 , then the same labeling along with partial code assignments yields the following branch. 0 p1 0 p1 + p2 = p11 1 p2 p11 + p21 = p1 + p2 + p3 1 p21 = p3 (iii) Similarly, if p1 + p2 = p21 , then the same labeling along with partial code assignments yields the following branch. 0 p11 = p3 0 p1

p11 + p21 = p1 + p2 + p3 1 p1 + p2 = p21

1 p2 Step 3. Let P N2 −2 = { p12 , . . . , p 2N −2 } be the discrete probability distribution consisting of p11 + p21 , p31 , · · · , p 1N −1 ,

224

5 Data Compression

but re-arranged in non-decreasing order, so that 0 < p12 ≤ p22 ≤ . . . ≤ p 2N −2 . Repeat Step 2 to build the following sub-branch and assign partial codes as follows. 0 p12 p12 + p22 1 p22 (i) If p12 = p11 + p21 = p3 + p4 as in Step 2 (i), the full sub-branch becomes 0 p3 0 p3 + p4 1 p4 1 p22 for p3 + p4 ≤ p22 ; or 0 p21 0 p3 1 p3 + p4 1 p4 for p3 + p4 ≥ p21 . (ii) If p12 = p3 + p4 , then by extending the sub-branches in Step 2 (ii) or (iii), we have one of the following full sub-branches.

5.2 Binary Codes

225

0 p1 0 p1 + p2 1 p2

0 p1 + p2 + p3 1 p3 1 p22

0 p12 0 p1 0 p1 + p2 1 p2

1 p1 + p2 + p3 1 p3

0 p3 0 p1

0 p1 + p2 + p3 1 p1 + p2

1 p2 1 p22

226

5 Data Compression

0 p12 0 p3 1 p1 + p2 + p3

0 p1 1 p1 + p2 1 p2

Step 4. Repeat the above steps till we arrive at P2N −2 = { p1N −2 , p2N −2 } with re-arrangement: p1N −2 ≤ p2N −2 . Then this final partial sub-branch has the trunk of the full tree as its stem, as follows. 0 p1N −2 p1N −2 + p2N −2 = p1 + . . . + p N = 1 1 p2N −2 Step 5. Formulate the code-word corresponding to each p j , for j = 1, . . . , N , by tracing from the branch that starts with the label p j all the way to the trunk (with label p1 + · · · + p N = 1), by putting the partial codes 0 or 1 on each branch together, but in reverse order. For example, the code-word corresponding to p j in the following tree is 01101 (not 10110). Remark 4 The re-arrangement in non-decreasing order in P N0 and each step of P N1 −1 , . . . , P2N −2 is not necessary, but is stated in Algorithm 1 for convenience to describe the coding scheme. Furthermore, the code-table (called the Huffman table) corresponding to P = P N0 is not unique, since some of the p j ’s and/or p j ’s could be the same.   1 6 2 4 1 2 8 , 24 , 24 , 24 , 24 , 24 , 24 . Follow the steps in the Huffman Example 1 Let P = 24 coding scheme of Algorithm 1 to construct a Huffman table. Verify that the codetable satisfies both the equality in Kraft’s inequality and the Noiseless Coding entropy bounds.

5.2 Binary Codes

227

Solution Re-arrange the probability distributions in non-decreasing order. That is, set 1 1 2 2 4 6 8 , , , , , , = { p1 , . . . , p7 }. P = P70 = 24 24 24 24 24 24 24 Step 1. 0 p1 p1 + p2 = 1 p2

2 24

Step 2. P61 = { p1 + p2 , p3 , . . . , p7 } = { p11 , p21 , . . . , p61 } 2 2 2 4 6 8 = , , , , , 24 24 24 24 24 24

0 p1 0 1 p1 + p2 + p3 =

p2 1 p21 = p3

4 24

Step 3. P52 = { p12 , p22 , p32 , p42 , p52 } = { p4 , p1 + p2 + p3 , p5 , p6 , p7 } 2 4 4 6 8 , , , , = 24 24 24 24 24

228

5 Data Compression

0 p4 0 p1 0 p1 + p2 + p3 + p4

p11 1

1

p2

p22 1 p3

Step 4. P43 = { p13 , p23 , p33 , p43 } = { p5 , p1 + . . . + p4 , p6 , p7 } 4 6 6 8 , , , = 24 24 24 24

0 p5 0 p4

p1 + . . . + p5

0 p1 0

1

p11

p23

1

1

p2

p22 1 p3

5.2 Binary Codes

229

Step 5. P34 = { p14 , p24 , p34 } = { p6 , p7 , p1 + p2 + p3 + p4 + p5 }  6 8 10 , , = 24 24 24

0 p6 1 1 p7 0 p5 1

0

p4

p34

0 p1 0 1 p2

1 1

1 p3 In conclusion, tracing the 0’s and 1’s in reverse order for each of p1 , . . . , p7 , the code-table is given by K 7 = {01100, 01101, 0111, 010, 00, 10, 11}, with code lengths given by 1 = 5, 2 = 5, 3 = 4, 4 = 3, 5 = 2, 6 = 2, 7 = 2. Hence,

n  j=1

2− j =

1 1 1 1 1 1 1 + + + + + + = 1, 32 32 16 8 4 4 4

which attains the equality in Kraft’s inequality. Finally, the entropy of P is :

230

5 Data Compression

H (P) =

7 

p j log2

j=1

 1 1 6 log2 24 + log2 24 − log2 6 = pj 24 24

  2 4 1 log2 24 − log2 2 + log2 24 − log2 4 + log2 24 24 24 24   2 8 + log2 24 − log2 2 + log2 24 − log2 8 24 24  1 = log2 24 − 0 + 6(log2 6) + 2 + 4 log2 4 + 0 + 2 + 8(log2 8) 24 = 2.4387, +

and the average code-word length is: 1 2 2 4 6 8 1 ·5+ ·5+ ·4+ ·3+ ·2+ ·2+ ·2 24 24 24 24 24 24 24 1 60 = (5 + 5 + 8 + 6 + 8 + 12 + 16) = = 2.5. 24 24

avglength (K 7 ) =

Hence, H (P) = 2.4387 < 2.5 < 3.4387 = H (P) + 1, as required by the Noiseless Coding Theorem. 6 We remark that, since the probabilities p1 , . . . , p7 , with the exception of 24 = 41 , 1 are not integer powers of 2 , it follows from Theorem 1 that the average code-word length cannot attain the entropy lower bound of 2.4387, although 2.5 is much closer to the lower bound than the upper bound of H (P) + 1 = 3.4387. 

Exercises Exercise 1 Let f (x) be the function defined by f (x) = x log2 x, 0 < x ≤ 1, and f (0) = 0. Show that f (x) is continuous on the closed interval [0, 1] and has continuous derivative for 0 < x ≤ 1. Exercise 2 For the discrete probability distributions P5 = P5 (X n ) in Exercise 1 on p.216, apply Algorithm 1 to construct a Huffman table for each of X n = X 8 , X 14 , X 16 . Also, compute the average code-word length, avglength (K 5 (X n )), again for each of X n = X 8 , X 14 , X 16 . Finally, compare with the entropy bounds of the Noiseless Coding Theorem by using the results of Exercise 2 on p.216 Exercise 3 For the discrete probability distributions P8 = P8 (  X n ) in Exercise 3 on p.217, apply Algorithm 1 to construct a Huffman table for each of  Xn =  X 24 ,  X 32 . Also, compute the average code-word length, avglength (K 8 (  X n )), X 16 ,  X 16 ,  X 24 ,  X 32 . Finally, compare with the entropy bounds of the for each of  Xn =  Noiseless Coding Theorem by using the results of Exercise 4 on p.217.

5.2 Binary Codes

231

Exercise 4 Show that the Huffman coding schemes achieve the lower bound for any probability distribution { p1 , . . . , pm } with p j = 2−n j , where n j is a positive integer, for j = 1, . . . , m. Exercise 5 (For advanced students) Provide a proof of Kraft’s inequality (5.2.6). Exercise 6 (For advanced students) Provide a proof of the upper bound H (P) + 1 in Theorem 2, the Noiseless Coding Theorem.

5.3 Lapped Transform and Compression Schemes In this current era of “information revolution”, data compression is a necessity rather than convenience or luxury. Without the rapid advancement of the state-of-the-art compression technology, not only the internet highway is unbearably over-crowded, but data management would be a big challenge also. In addition, data transmission would be most competitive and data storage would be very costly. On the other hand, for various reasons, including strict regulations (such as storage of medical images), industry standards (for the capability of data retrieval by nonproprietary hardware and software installed in PC’s or hand-held devices), and the need for a user-friendly environment, the more complex and/or proprietary solutions are not used by the general public, though they usually have better performance. For instance, ASCII is the preferred text format for word processors and Huffman coding is the most popular binary coding scheme, even though arithmetic coding is more optimal. In general, there are two strategies in data compression, namely: lossless compression and lossy compression. Lossless compression is reversible, meaning that the restored data file is identical to the original data information. There are many applications that require lossless compression. Examples include compression of executable codes, word processing files, and to some extent, medical records, and medical images. On the other hand, since lossless compression does not satisfactorily reduce file size in general, lossy compression is most widely used. Lossy compression is non-reversible, but allows insignificant loss of the original data source, in exchange for significantly smaller compressed file size. Typical applications are image, video, and audio compressions. For compression of digital images and videos, for instance, compressed imagery is often more visually pleasing than the imagery source, which inevitably consists of (additive) random noise due to perhaps insufficient lighting and non-existence of “perfect sensors” for image capture. Since random noise increases entropy, which in turn governs the lower bound of the compressed file size according to the Noiseless Coding Theorem (see Theorem 2 on p.222), lossy compression via removing a fair amount of such additive noise is a preferred approach. The topic of noise removal will be studied in a forthcoming publication of this book series, “Mathematics Textbooks for Science and Engineering (MTSE)”.

232

5 Data Compression

There are three popular methods for lossless data compression: (i) run-length encoding (RLE, as briefly mentioned in the introduction of Sect. 3.3), (ii) delta encoding, and (iii) entropy coding. RLE simply involves encoding the number of the same source word that appears repeatedly in a row. For instance, to encode a one-bit line drawing along each scan-line, only very few dots of the line drawing are on the scanline. Hence, if “1” is used for the background and “0” for the line drawing, there are long rows of repeating “1” ’s before a “0” is encountered. Therefore, important applications of RLE include graphic images (cartoons) and animation movies. It is also used to compress Windows 3.x bitmap for the computer startup screen. In addition, RLE is incorporated with Huffman encoding for the JPEG compression standard to be discussed in some details later in Sect. 5.4 Delta encoding is a simple idea for encoding data in the form of differences. In Example 1 on p.210, the notion of “differential pulse code modulation (DPCM)” is introduced to create long rows of the same source word for applying RLE. In general, delta encoding is used only as an additional coding step to further reduce the encoded file size. In video compression, “delta frames” can be used to reduce frame size and is therefore used in every video compression standard. We will also mention DPCM in JPEG compression in the next section. Perhaps the most powerful stand-alone lossless compression scheme is LZW compression, named after the developers A. Lempel, J. Ziv, and T. Welch. Typical applications are compression of executable codes, source codes, tabulated numbers, and data files with extreme redundancy. In addition, LZW is used in GIF image files and as an option in TIFF and PostScript. However, LZW is a proprietary encoding scheme owned by Unisys Corporation. Let us now focus on the topic of lossy compression, to be applied to the compression of digital images and video, the topic of discussion in the next section. The general overall lossy compression scheme consists of three steps: (i) Transformation of source data, (ii) Quantization, (iii) Entropy coding. Both (i) and (iii) are reversible at least in theory, but (ii) is not. To recover the information source, the three steps are: (i) De-coding by applying the code-table, (ii) De-quantization, (iii) Inverse transformation. We have briefly discussed, in Example 2 of Sect. 5.1, that an appropriate transformation could be introduced to significantly reduce the entropy of certain source data, without loss of any data information. This type of specific transformations is totally data-dependent and therefore not very practical. Fortunately, there are many transformations that can be applied for sorting data information without specific knowledge of the data content, though their sole purpose is not for reduction of the data entropy. When data information is properly sorted out, the less significant content can be suppressed and the most insignificant content can be eliminated, if desired, so as to

5.3 Lapped Transform and Compression Schemes

233

reduce the entropy. As a result, shortened binary codes, as governed by the Noiseless Coding Theorem studied in the previous section, can be constructed. The most common insignificant data content is (additive) random noise, with probability values densely distributed on the (open) unit interval (0, 1). Hence, embedded with such noise, the source data has undesirably large entropy. Fortunately, partially due to the dense distribution, random noise lives in the high-frequency range. Therefore, all transformations that have the capability of extracting high-frequency contents could facilitate suppressing the noise content by means of “quantization”, to be discussed in the next paragraph. Such transformations include DCT and DFT studied in Chap. 4, DST (discrete sine transform), Hardamard transform, and DWT (discrete wavelet transform), which will be introduced and studied in some depth in Chaps. 8 and 9. Among all of these transformations, DCT, particularly DCT-II, and DCT-IV introduced in Sect. 4.4 of Chap. 4, is the most popular for compression of digital images, videos, and digitized music. The key to the feasibility of significant file size reduction for lossy compression is the “quantization” process that maps a “fine” set of real numbers to a “coarse” set of integers. Of course such mappings are irreversible. For any real numbers x, sgn x (called the sign of x) is defined by ⎧ ⎪ for x > 0, ⎨1, sgn x = 0, for x = 0, ⎪ ⎩ −1, for x < 0. Also, for any non-negative real number x, x will denote the largest integer not exceeding x. Then the most basic quantization process is the mapping of x ∈ R to an integer  x , defined by  x = round

x Q

= (sgn x)

 |x|  Q

,

(5.3.1)

where Q is a positive integer, called the “quantizer” of the “round-off” function defined in (5.3.1). Hence,  x is an integer and Q x is an approximation of x, in that |x − Q x | < 1. A better approximation of the given real number x could be achieved by Q x , with  x defined by    x ±  Q   2 , (5.3.2)  x = (sgn x) Q where the “+” sign or “−” sign is determined by whichever choice yields the smaller |x − Q x |. In any case, it is clear that the binary representation of  x (or of  x ) requires fewer bits for larger integer values of the quantizer Q. In applications to audio and image compressions, since the human ear and human eye are less sensitive to higher

234

5 Data Compression

frequencies, larger values of the quantizer Q can be applied to higher-frequency DCT terms to save more bits. Moreover, since additive random noise lives in the higher frequency range, such noise could be suppressed, often resulting in more pleasing audio and imagery quality. In summary, lossy compression is achieved by binary encoding (such as Huffman coding) of the quantized values of the transformed data; and recovery of the source data is accomplished by applying the inverse transformation to the de-quantized values of the decoded data. Since the quantization process is irreversible, the compression scheme is lossy. As mentioned above, DCT is the most popular transformation, particularly for audio, image, and video compressions. However, for such large data sets, it is not feasible to apply DCT to the entire data source. The common approach is to partition the data set into segments (also called tiles or blocks) and apply DCT to each segment. Unfortunately, quantization of the DCT terms for individual segments introduces “blocky artifacts”, called “quantization noise”. To suppress such “noise”, an appropriate pre-processing scheme can be incorporated with the discrete cosine transformation for each segment. We will only consider n-point DCT, and in particular, DCT-I, DCT-II, DCT-III, and DCT-IV, as studied in Sect. 4.4 of Chap. 4. Since quantization noise occurs only at or near the boundaries (i.e. end-points) of the partitioned data segments, data pre-processing should be applied only to the boundaries while leaving the interior data content unaltered. More precisely, let us consider any (finite) data sequence U = {u 0 , . . . , u N −1 } of real numbers. For convenience, we assume that N is divisible by the length n of the segments (5.3.3) U = {u n , . . . , u (+1)n−1 },  = 0, . . . , L − 1, where n > 0. In other words, we have N = Ln and U = {U0 , . . . , U L−1 }.

(5.3.4)

Consider any desired positive even integer m ≤ n, and two appropriate preprocessing filters   H1 = h − m2 , . . . , h m2 −1 ; H3 = h n− m2 , . . . , h n+ m2 −1

(5.3.5)

of real numbers for the purpose of pre-processing the data segments U in (5.3.3), with the first filter applied to the left boundary and the second filter applied to the right boundary, respectively. Depending on the choice of the DCT, the two filters must be so chosen that the original data segments U ,  = 0, . . . , L − 1, in (5.3.3) can be recovered from the DCT of the pre-processed data.

5.3 Lapped Transform and Compression Schemes

235

Example 1 Let Cn = [c0 , . . . , cn−1 ] denote the n-point DCT-IV introduced in Sect. 4.4 of Chap. 4, with column vectors ck =



2 n

cos

1 1 2 (k+ 2 )π

n

,



2 n

cos

3 1 2 (k+ 2 )π

n

,...,



2 n

cos

(n− 21 )(k+ 21 )π n

T

for k = 0, . . . , n − 1. Construct a pre-processing scheme so that the original data can be recovered via the inverse DCT, Cn−1 = CnT , from the DCT of the pre-processed data. Solution For the interior segments U ,  = 1, . . . , L − 2, let  = { u n , . . . ,  u (+1)n−1 } U be defined as follows: ⎧  u n = h 0 u n + h −1 u n−1 , ⎪ ⎨ .. . ⎪ ⎩  u n+ m2 −1 = h m2 −1 u n+ m2 −1 + h − m2 u n− m2 , ⎧  u n+ m2 = u n+ m2 , ⎪ ⎨ .. . ⎪ ⎩  u (+1)n− m2 −1 = u (+1)n− m2 −1 , ⎧ u (+1)n− m2 = h n− m2 u (+1)n− m2 − h n+ m2 −1 u (+1)n+ m2 −1 , ⎪ ⎨ .. . ⎪ ⎩  u (+1)n−1 = h n−1 u (+1)n−1 − h n u (+1)n ,

(5.3.6)

(5.3.7)

(5.3.8)

for  = 1, . . . , L − 2. For the first segment U0 , let 0 = {u 0 , . . . , u n− m −1 ,  u n− m2 , . . . ,  u n−1 }, U 2 u n−1 are defined by (5.3.8) with  = 0, and for the last segment where  u n− m2 , . . . ,  U L−1 , let L−1 = { u (L−1)n , . . . , u (L−1)n+ m2 −1 , u (L−1)n+ m2 , . . . , u Ln−1 }, U u (L−1)n+ m2 −1 are defined by (5.3.6) with  = L − 1. To recover where  u (L−1)n , . . . ,  the original data U = {U1 , . . . , U L−1 } in (5.3.4) from the DCT content  ,  = 0, . . . , L − 1, Cn U  is also considered as a where Cn is the n-point DCT-IV matrix and the sequence U column vector, namely:

236

5 Data Compression

T   =  u n , . . . ,  u (+1)n−1 , U    = vn , . . . , v(+1)n−1 T for  = 0, . . . , L − 1, we proceed as follows. Write Cn U and consider the DCT data sequence V = {v0 , . . . , v Ln−1 } = {v0 , . . . , v N −1 },

(5.3.9)

partitioned into the following L disjoint segments of length n: V = {vn , . . . , v(+1)n−1 },  = 0, 1, . . . , L − 1.

(5.3.10)

In the following, we consider these sequences as column vectors. Next, we extend the inverse Cn−1 = CnT to an (n + m) × n matrix B by tacking T , . . . , c T to the top; and m rows, c T , . . . , c T , to the bottom, on m2 rows, c− m n −1 2 n+ m2 −1 2 where ck denotes the same kth column of the n-point DCT-IV matrix, but extended beyond 0 ≤ k ≤ n − 1. Precisely, the transpose B T of the (n + m) × n matrix B is given by   B T = c− m2 , . . . , c−1 , Cn , cn , . . . , cn+ m2 −1   = c− m2 , . . . , cn+ m2 −1 . We will apply B to each V ,  = 1, . . . , L − 2, and again consider V as a column vector. For  = 0 and  = L − 1, we only need   B0T = Cn , cn , . . . , cn+ m2 −1 ,   (5.3.11) B1T = c− m2 , . . . , c−1 , Cn , respectively. From the definition of V and V in (5.3.9)–(5.3.10), we have, for 1 ≤  ≤ L − 2, ⎡  u n− m2   ⎢ V = c− m2 , . . . , c−1 , Cn , cn , . . . , cn+ m2 −1 ⎣ ...

⎤ ⎥ ⎦

 u (+1)n+ m2 −1

 T u n− m2 , . . . ,  = BT  u n−1 ,  u n , . . . ,  u (+1)n−1 ,  u (+1)n , . . . ,  u (+1)n+ m2 −1 . Hence, when B is applied to V for 1 ≤  ≤ L − 2, B0 to V0 , and B1 to V−1 , we obtain  T  T 0 w00 , . . . , wn+ = B0 B0T u 0 , . . . , u n− m2 −1 ,  u n− m2 , . . . , u n+ m2 −1 , m −1 2

(5.3.12)

5.3 Lapped Transform and Compression Schemes

237

 T  L−1 L−1 w(L−1)n− u (L−1)n− m2 , . . . , = B1 B1T  u (L−1)n+ m2 −1 , m , . . . , w Ln−1 2 T u (L−1)n+ m2 , . . . , u Ln−1 ,  T  T   T m , . . . , m wn−  u = B B u m,...,w m n− 2 (+1)n+ 2 −1 , (+1)n+ −1 2

2

for  = 1, . . . , L − 2. For DCT-IV, observe that for 1 ≤ k ≤ is c( j, −k) = cos

m 2,

the jth entry of the column vector c−k

( j + 21 )(−k + 21 )π ( j + 21 )(k − 1 + 21 )π = cos = c( j, k − 1), n n

so that c−k = ck−1 , while for k = n, . . . , n +

m 2

− 1, the j th entry of ck is

( j + 21 )(k + 21 )π ( j + 21 )(2n − k − 21 )π = − cos n n = − c( j, 2n − 1 − k),

c( j, k) = cos

so that cn+k = −cn−k−1 for k = 0, . . . , m2 − 1. Hence, we have the following results on the computation of the (m + n) square matrix B B T : (i) For − m2 ≤ −k ≤ −1, and − m2 ≤ − j ≤ −1, T T c−k c− j = ck−1 c j−1 = δ j−k .

(ii) For − m2 ≤ −k ≤ −1, and 0 ≤ j ≤

m 2

− 1,

T T c−k c j = ck−1 c j = δ j−k+1 .

(iii) For − m2 ≤ −k ≤ −1, and

m 2

≤ j ≤n+

m 2

− 1,

T c−k c j = 0.

(iv) For n ≤ k ≤ n +

m 2

− 1, and n ≤ j ≤ n +

m 2

− 1,

T ckT c j = c2n−k−1 c2n− j−1 = δ j−k .

(v) For n ≤ k ≤ n +

m 2

− 1, and n −

m 2

≤ j ≤ n − 1,

T ckT c j = −c2n−k−1 c j = −δ2n−k− j−1 .

(vi) For n ≤ k ≤ n +

m 2

− 1 and − m2 ≤ j ≤ n −

m 2

− 1,

238

5 Data Compression

ckT c j = 0. Therefore, since the (n + m)-dimensional square matrix B B T is given by B B T = [ckT c j ], − m2 ≤ k, j ≤ n + m2 − 1, and Cn is an orthogonal matrix, ⎡

I m2 ⎢... ⎢ B BT = ⎢ ⎢O ⎣... O

J m2 ... In ... −J m2

⎤ O ...⎥ ⎥ O⎥ ⎥ ...⎦ I m2

with identity diagonal, where J m2 denotes the matrix ⎡ ⎢ ⎣

..

.

1

⎤ ⎥ ⎦

1 with 1’s on the northeast-southwest diagonal (anti-diagonal). Similarly, the same argument also yields  B0 B0T =

In O O O −J m2 I m2



 , B1 B1T =

I m2 J m2 O O O In



(see Exercise 5). Hence, the output (inverse DCT-IV) data in (5.3.12) are ⎧ 0 0 w0 = u 0 , . . . , wn− = u n− m2 −1 , m ⎪ ⎪ 2 −1 ⎪ 0 ⎨ 0 wn− m =  u n− m2 , . . . , wn−1 = u n−1 , 2 0 0 ⎪ un −  u n−1 , wn+1 =  u n+1 −  u n−2 , . . . , ⎪ wn =  ⎪ ⎩ w0 m m =  u −  u . n+ 2 −1 n− 2 n+ m −1 2

⎧ For  = 1, . . . , L − 2, ⎪ ⎪  ⎪ w m =  ⎪ u n− m2 +  u n+ m2 −1 , . . . , wn−1 = u n−1 +  u n , ⎪ n− 2 ⎪ ⎪ ⎪ ⎪ ⎨  =  wn u n , . . . , w(+1)n−1 = u (+1)n−1 , ⎪ ⎪ ⎪ ⎪ ⎪  ⎪ = u (+1)n −  u (+1)n−1 , . . . , ⎪ w(+1)n ⎪ ⎪ ⎩  =  u (+1)n+ m2 −1 −  u (+1)n− m2 , w(+1)n+ m −1 2

5.3 Lapped Transform and Compression Schemes

239

⎧ L−1 w(L−1)n− m =  u (L−1)n− m2 +  u (L−1)n+ m2 −1 , . . . , ⎪ ⎪ 2 ⎪ ⎪ L−1 ⎪ ⎪ u (L−1)n−1 +  u (L−1)n , w(L−1)n−1 =  ⎪ ⎪ ⎪ ⎨ (5.3.13)

L−1 L−1 ⎪ w(L−1)n = u (L−1)n , . . . , w(L−1)n+ = u (L−1)n+ m2 −1 , m ⎪ ⎪ 2 −1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ L−1 ⎩w L−1 = u (L−1)n+ m2 , . . . , w Ln−1 = u Ln−1 . (L−1)n+ m 2

Finally, to recover the original data set U = {u 0 , . . . , u N −1 } = {u 0 , . . . , u Ln−1 }, u n , . . . ,  u (+1)n−1 }, we apply the result in (5.3.13) to the pre-processed data  u  = {  = 0, . . . , L − 1, as defined in (5.3.6)–(5.3.8), and simply invert 2 × 2 matrices, when necessary, as follows: (i) 0 u 0 = w00 , . . . , u n− m2 −1 = wn− m −1 2

(ii) For k = n − m2 , . . . , n − 1, apply the second result in the first set and first result in the second set of equations in (5.3.13) to yield ⎡ 0⎤ ⎡ ⎤ ⎤⎡ wk hk uk −h 2n−k−1 ⎣ ⎦=⎣ ⎦ ⎦⎣ (h k + h k−n ) (h n−k−1 − h 2n−k−1 ) u 2n−k−1 wk1 (iii) For k = n, . . . , n + m2 − 1: ⎤⎡ ⎤ ⎡ 0⎤ ⎡ (h k−n + h k ) (h n−k−1 − h 2n−k−1 ) uk wk ⎦⎣ ⎦ ⎣ ⎦=⎣ 1 h n−k−1 u 2n−k−1 h k−n wk (iv) For  = 1, . . . , L − 2; k = − m2 , . . . , −1: 





 wn+k = (h −k−1 − h n−k−1 ) (h k + h n+k ) ⎣

(v) For  = 1, . . . , L − 2; k = 0, . . . , m2 − 1: 





 wn+k = h −k−1 h k ⎣

(vi) For  = 1, . . . , L − 2; k =

m 2 ,...,n



m 2

u n−k−1 u n+k

− 1:

⎤ ⎦

u n−k−1 u n+k

⎤ ⎦

240

5 Data Compression  wn+k = u n+k

(vii) For  = 1, . . . , L − 2; k = n −

m 2 ,...,n



− 1:





 wn+k = h k −h 2n−k−1 ⎣

(viii) For  = 1, . . . , L − 2; k = n, . . . , n +

m 2



u n+k



u n+2n−k−1

− 1: 





 wn+k = (h k−n + h k ) (h n−k−1 − h 2n−k−1 ) ⎣

u n+k

⎤ ⎦

u n+2n−k−1

(ix) For  = L − 1; k = − m2 , . . . , −1: ⎡ ⎣

−1 wn+k  wn+k





⎦=⎣

−h n−k−1

h n+k

⎤⎡ ⎦⎣

(h −k−1 − h n−k−1 ) (h n−k + h k )

u n−k−1

⎤ ⎦

u n+k

(x) For  = L − 1; k = 0, . . . , m2 − 1: ⎡ ⎣

−1 wn+k  wn+k





⎦=⎣

(h n−k−1 + h −k−1 ) (h k − h n−k ) h −k−1

hk

⎤⎡ ⎦⎣

u n−k−1

⎤ ⎦

u n+k

(xi) L−1 L−1 w(L−1)n+ m = u (L−1)n+ m , . . . , w Ln−1 = u Ln−1 . 2 2



Remark 1 To understand why proper data pre-processing would help in suppressing blocky artifacts of quantized DCT, we may extend the notation of the two filter sequences H1 and H3 in (5.3.5) to formulate a “window sequence” H = {H1 , H2 , H3 }, with H2 = {1, . . . , 1} being the constant sequence of 1’s and with length = n − m. Hence, the length of H is n + m. When the n-point DCT is applied to the pre defined in (5.3.6)–(5.3.8), for DCT-IV, we have, for processed data segment, say U  = 1, . . . , L − 2,

5.3 Lapped Transform and Compression Schemes

241

 T   = c0 , . . . , cn−1  u n , . . . ,  Cn U u (+1)n−1  T  = c− m2 , . . . , cn+ m2 −1 h − m2 u n− m2 , . . . , h n+ m2 −1 u (+1)n+ m2 −1 ,

since c− m2 = c m2 −1 , . . . , c−1 = c0 , cn = −cn−1 , . . . , cn+ m2 −1 = −cn− m2 . Hence, by introducing the notion of “lapped orthogonal transform (LOT)”   A = h − m2 c− m2 , . . . , h n+ m2 −1 cn+ m2 −1 , the n-point DCT of the pre-processed data is the same as the LOT of the original data, namely: T   . A u n− m2 , . . . , u (+1)n+ m2 −1 = Cn U



To better understand the lapped orthogonal transform, we next introduce the more general transformation without the orthogonality restriction. This will be called the “lapped transform”. Let n be any positive integer, and {x j }, j ∈ Z, be a given bi-infinite data sequence. We partition this sequence into disjoint segments X  = {xn , . . . , x(+1)n−1 },

(5.3.14)

 = 0, ±1, . . . , and extend each segment X  by tacking on m data values x(+1)n , . . . , x(+1)n+m−1 (with m ≤ n being any desirable positive integer) to formulate X ext = {X 1 , X 2 , X 3 }, where X 1 = {xn , . . . , xn+m−1 }, X 2 X 3

(5.3.15)

= {xn+m , . . . , x(+1)n−1 },

(5.3.16)

= {x(+1)n , . . . , x(+1)n+m−1 }.

(5.3.17)

Observe that X  = {X 1 , X 2 } and X 3 constitutes the extension X ext of X  . Definition 1 Let 0 < m ≤ n be integers and A = [A1 , A2 , A3 ]

242

5 Data Compression

be an n × (n + m) matrix of real numbers, where A1 , A2 , A3 are matrix sub-blocks of sizes n×m, n×(n−m), n×m, respectively. Suppose that there exists a corresponding (n + m) × n matrix ⎡ ⎤ B1 B = ⎣ B2 ⎦ , B3 with matrix sub-blocks of B1 , B2 , B3 of sizes m × n, (n − m) × n, m × n, respectively, such that B j Ak = O, for j = k; B2 A2 = In−m ,

(5.3.18)

B1 A1 + B3 A3 = Im , where O denotes a zero matrix (with appropriate matrix dimensions depending on j, k = 1, 2, 3). Then A is called a lapped transform with dual transform B, and the pair (A, B) is called a (perfect reconstruction) dual pair. When a lapped transform is applied to the extensions X ext of the disjoint segments X  of any sequence {x j }, as discussed above, the dual transform can be used to recover X  for all  ∈ Z, and hence the entire given sequence {x j }. Indeed, since AX ext = A1 X 1 + A2 X 2 + A3 X 3 , for such  ∈ Z, we have ⎡

Y1





B1 A1 X 1 + B1 A2 X 2 + B1 A3 X 3



⎢ ⎢ ⎥ ⎥ ⎢ ⎢ 2⎥ ⎥ ⎢ Y ⎥ = B(AX ext ) = ⎢ B2 A1 X 1 + B2 A2 X 2 + B2 A3 X 3 ⎥    ⎥ ⎢ ⎢ ⎥ ⎣ ⎣ ⎦ ⎦ 3 1 2 3 Y B3 A1 X  + B3 A2 X  + B3 A3 X  ⎡

B1 A1 X 1



⎢ ⎥ ⎢ ⎥ 2 ⎢ = ⎢ B2 A2 X  ⎥ ⎥, ⎣ ⎦ 3 B3 A3 X  or equivalently, Y1 = B1 (A1 X 1 ), Y2 = In−m X 2 = X 2 , Y3 = B3 (A3 X 3 ),

5.3 Lapped Transform and Compression Schemes

243

for all  ∈ Z. On the other hand, in view of the definition of the extension X ext , it is clear that 3 ,  ∈ Z; X 1 = X −1 so that, in view of the third identity in (5.3.18), 3 3 = B1 A1 X 1 + B3 A3 X −1 = (B1 A1 + B3 A3 )X 1 = X 1 . Y1 + Y−1 3 , and Therefore, while X 2 is identical to Y2 , X 1 can be recovered from Y1 + Y−1 hence each segment X  = {X 1 , X 2 },  ∈ Z, of the sequence {x j } can be perfectly reconstructed. We summarize this result in the following.

Theorem 1 Perfect reconstruction of lapped transform Let m ≤ n be positive integers, {x j } be any bi-infinite sequence of real numbers, and X  = {xn , . . . , xn−1 },  ∈ Z. Also, let X ext = {X  , X 3 } = {X 1 , X 2 , X 3 }, where X 1 = {xn , . . . , xn+m−1 }, 1 = {x(+1)n , . . . , x(+1)n+m−1 }. Suppose X 2 = {xn+m , . . . , x(+1)n−1 }, X 3 = X +1 that A is a lapped transform with dual B, as introduced in Definition 1, then {x j } can be recovered from the transformed segments AX ext ,  ∈ Z, by X 1 = B1 (A1 X 1 ) + B3 (A3 X 3 ) and X 2 = B2 (A2 X 2 ), where X  = {X 1 , X 2 }.



In Example 1, we have studied in some detail the recovery of the n-point DCTIV of some pre-processed data, and in Remark 1, we have shown that such n-point DCT gives rise to the LOT of the original data. We will next study the more general windowed DCT by applying the dual lapped transform. Let ck = [c(0, k), . . . , c(n − 1, k)]T , k = 0, . . . , n − 1, be the kth column of an n × n DCT matrix, such as DCT-I, DCT-II, DCT-III, or DCT-IV, introduced and studied in Sect. 4.4 of I so as to match the Chap. 4. (Recall that for DCT-I, CnI should be replaced by Cn−1 n × n dimension of the DCT matrix.) Now, extend the indices of the columns from k = 0, . . . , n − 1 to k = − m2 , . . . , n + m2 − 1, where m is any desired positive even integer, with 2 ≤ m ≤ n, and apply a window sequence  h − m2 , . . . , h n+ m2 −1 of real numbers with width (or length) n + m to the extended DCT columns to formulate the n × (n + m) matrix

244

5 Data Compression

  A = h − m2 c− m2 , . . . , h n+ m2 −1 cn+ m2 −1

(5.3.19)

(see Remark 1). Note that the kth column of A is h k ck = h k [c(0, k), . . . , c(n − 1, k)]T ,

(5.3.20)

where k = − m2 , . . . , n + m2 − 1. Before going ahead to restriction of the window sequence {h − m2 , . . . , h m2 +n−1 } so that A, as defined in (5.3.19), is a lapped transform, let us first study the extension of the orthogonality property of the columns ck of the DCT matrices, from 0 ≤ k ≤ n − 1 to − m2 ≤ k ≤ n + m2 − 1, as follows. This will extend the study in Example 1 to include DCT-I, DCT-II, DCT-III. Theorem 2 Let m be an even integer and 2 ≤ m ≤ n. Denote by Cn = [c0 , . . . , cn−1 ], with columns c0 , . . . , cn−1 , the DCT matrix for DCT-I, DCT-II, DCT-III, or DCT-IV (with n replaced by n − 1 for DCT-I, so that Cn is an n × n matrix). Then ckT c = ck , c = δk−

(5.3.21)

for all 0 ≤ k,  ≤ n − 1. Furthermore, the n + m columns of the n × (n + m) matrix   Cnext = c− m2 . . . c0 . . . cn−1 . . . cn+ m2 −1 satisfy the following properties: (i) For DCT-I, with ck = ckI , ck = c , for k +  = 0 or k +  = 2(n − 1), where − m2 ≤ k,  ≤ n + m2 − 1. (ii) For DCT-II, with ck = ckI I , ck = c , for k +  = −1 or k +  = 2n − 1, where − m2 ≤ k,  ≤ n + m2 − 1. (iii) For DCT-III, with ck = ckI I I , ck = c , for k +  = 0; ck = −c , for k +  = 2n, where − m2 ≤ k,  ≤ n + m2 − 1. (iv) For DCT-IV, with ck = ckI V ,

(5.3.22)

5.3 Lapped Transform and Compression Schemes

245

ck = c , for k +  = −1; ck = −c , for k +  = 2n − 1, where − m2 ≤ k,  ≤ n +

m 2

− 1.

The orthogonality property (5.3.21) for the column vectors of DCT-I, . . . , DCT-IV is guaranteed since such DCT matrices are orthogonal matrices (see Sect.4.4 of Chap. 4). The symmetry properties at x = 0 for DCT-I and DCT-III and at x = − 21 for DCT-II and DCT-IV are easy to verify. For the symmetry property at x = n −1 for DCT-I and at x = n − 21 for DCT-II, see Exercise 6. For the anti-symmetry property at x = n for DCT-III and at x = n − 21 for DCT-IV (see Exercise 7 and Example 1). In view of these symmetry or anti-symmetry properties, the columns ck of the matrix Cnext for − m2 ≤ k ≤ −1 and n ≤ k ≤ n + m2 − 1 inherit the orthogonality properties of the columns of Cn for 0 ≤  ≤ m2 − 1 and n − m2 ≤  ≤ n − 1, respectively; of course except for k +  = 0 or k +  = −1, and for k +  = 2n − 2, or k +  = 2n, or k +  = 2n − 1.  We now turn to the study of the n × (n + m) matrix A in (5.3.19), with column vectors m m h k ck , k = − , . . . , n + − 1, 2 2 given by (5.3.20), where ck = [c(0, k), . . . , c(n − 1, k)]T , and c( j, k), j = 0, . . . , n −1, denotes the ( j, k)-entry of the n-point DCT matrix Cn , which includes DCT-I, DCT-II, DCT-III, or DCT-IV. Our approach is to characterize the window sequence {h − m2 , . . . , h n+ m2 −1 } in the definition of the n × (n + m) matrix A in (5.3.19) such that A is a lapped T , where transform with dual B = A     1 , A 2 , A 3 =  (5.3.23) h n+ m2 −1 cn+ m2 −1 = A A h − m2 c− m2 , . . . , 

1 , A 2 , A 3 are n ×m, n ×(n −m), and for some sequence  h − m2 , . . . ,  h n+ m2 −1 , and A

n×m sub-matrix blocks, respectively. For convenience, we call  h − m2 , . . . ,  h n+ m2 −1 a dual window sequence. Observe that by setting ⎡

⎤ B1 B = ⎣ B2 ⎦ , B3 we have

246

5 Data Compression

T  B1 =  h − m2 c− m2 , . . . ,  h m2 −1 c m2 −1 , T  B2 =  h m2 c m2 , . . . ,  h n− m2 −1 cn− m2 −1 ,

(5.3.24)

T  B3 =  h n− m2 cn− m2 , . . . ,  h n+ m2 −1 cn+ m2 −1 , and

1T A1 , B2 A2 = A 2T A2 , B3 A3 = A 3T A3 . B1 A1 = A

T ) to be a perfect reconstruction From Definition 1, for the pair (A, B) = (A, A

dual lapped transform pair, the window sequence h − m2 , . . . , h n+ m2 −1 and its dual

 h − m2 , . . . ,  h n+ m2 −1 must be so chosen that the three matrix equations in (5.3.18) are satisfied. That is, by setting T A = [a j,k ],  = 1, 2, 3, A

(5.3.25)

the window and its corresponding dual window sequences must be selected to satisfy the following three conditions: a 2j,k = δ j−k , 1 ≤ j, k ≤ n − m;

(5.3.26)

a 1j,k + a 3j,k = δ j−k , 1 ≤ j, k ≤ m;

(5.3.27)

1T A2 = 0, A 1T A3 = 0, A 2T A3 = 0. A

(5.3.28)

Let us first observe that (5.3.28) always holds, independent of the choice of h − m2 , . . . ,  h n+ m2 −1 }, since {h − m2 , . . . , h n+ m2 −1 } and {  h k ck , h  c =  h k h  ck , c = 0 for − m2 ≤ k ≤ m2 − 1 and m2 ≤  ≤ n − m2 − 1, or for − m2 ≤ k ≤ m2 − 1 and n − m2 ≤  ≤ n + m2 − 1, or for m2 ≤ k ≤ n − m2 − 1 and n − m2 ≤  ≤ n + m2 − 1, respectively, by applying the extension property in (iii)–(iv) in Theorem 2 and the orthogonality property (5.3.21). To study the Eq. (5.3.26), we again apply (5.3.21) in Theorem 2 to conclude that   T A2 =  h k ck , h  c A 2   =  h k h  ck , c =  h k h  δk− for

m 2

≤ k,  ≤ n −

m 2

− 1. Hence, (5.3.26) is satisfied, if and only if

5.3 Lapped Transform and Compression Schemes

 h n− m2 −1 h n− m2 −1 = 1. h m2 h m2 = . . . = 

247

(5.3.29)

T A1 and A T A3 individually, as follows. To study the Eq. (5.3.27), we compute A 1 3 For 1 ≤ j, k ≤ m, we have h j− m2 −1 c j− m2 −1 , h k− m2 −1 ck− m2 −1 a 1j,k =  = h j− m2 −1 h k− m2 −1 c j− m2 −1 , ck− m2 −1 , and

(5.3.30)

a 3j,k =  h j+n− m2 −1 c j+n− m2 −1 , h k+n− m2 −1 ck+n− m2 −1 = h j+n− m2 −1 h k+n− m2 −1 c j+n− m2 −1 , ck+n− m2 −1 .

(5.3.31)

Observe that in view of (5.3.21) and properties in (i)–(iv) in Theorem 2, the inner products of the DCT column vectors in (5.3.30) and (5.3.31) are equal to 0, unless the two column vectors are the same, or unless one is an extended column in Cnext and the other is a column of Cn and they are either equal (due to symmetry) or the negative of each other (due to anti-symmetry). Hence, it is necessary to investigate each of the DCTs separately. (i) For DCT-I, it follows from (i) in Theorem 2 that symmetry at x = 0 implies 

j−

   m m − 1 + k − − 1 = 0, 2 2

or j + k = m + 2, and symmetry at x = n − 1 implies 

j +n−

   m m − 1 + k + n − − 1 = 2(n − 1), 2 2

or j + k = m. Hence, in view of (5.3.21), we have

a 1j,k + a 3j,k =

⎧  m  ⎪ m m m ⎪ ⎪h j− 2 −1 h j− 2 −1 + h j+n− 2 −1 h j+n− 2 −1 , for k = j, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ h n+ j− m2 −1 h m2 +n− j+1 , h j− m2 −1 h m2 − j+1 +  ⎪ ⎪ ⎪ ⎪ ⎪ for k = − j + m + 2, ⎨ ⎪ ⎪ ⎪   ⎪ ⎪h j− m2 −1 h m2 − j−1 + h n+ j− m2 −1 h m2 +n− j−1 , ⎪ ⎪ ⎪ for k = − j + m, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩0, otherwise.

248

5 Data Compression

Consequently, for the window sequences {h − m2 , . . . , h n+ m2 −1 } and { h − m2 , . . . ,  h n+ m2 −1 } to satisfy (5.3.27), it is necessary and sufficient that  h j− m2 −1 h j− m2 −1 +  h j+n− m2 −1 h j+n− m2 −1 = 1, for j = 2, . . . , m − 1;  h j− m2 −1 h m2 − j+1 +  h j+n− m2 −1 h m2 +n− j+1 = 0, for j = 1, . . . , m;  h j+n− m2 −1 h m2 +n− j−1 = 0, for j = 1, . . . , m. h j− m2 −1 h m2 − j−1 + 

(5.3.32)

(ii) For DCT-II, it follows from (ii) in Theorem 2 that symmetry at x = − 21 implies 

j−

   m m − 1 + k − − 1 = −1, 2 2

or j + k = m + 1, and that symmetry at x = n − 

j +n−

1 2

implies

   m m − 1 + k + n − − 1 = 2n − 1, 2 2

or j + k = m + 1 as well. Hence, in view of (5.3.21), we have

a 1j,k + a 3j,k

⎧  h j− m2 −1 h j− m2 −1 +  h j+n− m2 −1 h j+n− m2 −1 , for k = j, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ h j+n− m2 −1 h m2 +n− j , h j− m2 −1 h m2 − j +  = ⎪ for k = − j + m + 1, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 0, otherwise.

h − m2 , . . . , Consequently, for the window sequences {h − m2 , . . . , h n+ m2 −1 } and {  h n+ m2 −1 } to satisfy (5.3.27), it is necessary and sufficient that  h j− m2 −1 h j− m2 −1 +  h j+n− m2 −1 h j+n− m2 −1 = 1, for j = 2, . . . , m − 1;  h j+n− m2 −1 h m2 +n− j = 0, for j = 1, . . . , m. h j− m2 −1 h m2 − j + 

(5.3.33)

(iii) For DCT-III, it follows from (iii) in Theorem 2 that symmetry at x = 0 implies 

j−

   m m − 1 + k − − 1 = 0, 2 2

or j + k = m + 2, and that anti-symmetry at x = n implies 

j +n−

   m m − 1 + k + n − − 1 = 2n, 2 2

5.3 Lapped Transform and Compression Schemes

249

or j + k = m + 2 as well. Hence, in view of (5.3.21), we have

a 1j,k + a 3j,k

⎧  h j− m2 −1 h j− m2 −1 +  h j+n− m2 −1 h j+n− m2 −1 ⎪ ⎪   ⎪ ⎪ ⎪ ⎪ × 1 − δ j− m2 −1 , for k = j, ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ =  h j− m2 −1 h m2 − j+1 −  h j+n− m2 −1 h m2 +n− j+1 , ⎪ ⎪ ⎪ ⎪ for k = − j + m + 2, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 0, otherwise.

We remark that the factor (1 − δ j− m2 −1 ) of the term  h j+n− m2 −1 h j+n− m2 −1 (1 − δ j− m2 −1 ) is needed, since the column cn of Cnext for DCT-III is the zero-column (due ( j+ 1 )nπ

2 to cos = 0, j = 0, . . . , n − 1). Consequently, for the sequences n m m h − m2 , . . . ,  h n+ m2 −1 } to satisfy (5.3.27), it is neces{h − 2 , . . . , h n+ 2 −1 } and { sary and sufficient that

m  + 1; h j+n− m2 −1 h j+n− m2 −1 = 1, for j = h j− m2 −1 h j− m2 −1 +  2 m  + 1; h 0 h 0 = 1, for j = 2  h j+n− m2 −1 h m2 +n− j+1 , for j = 1, . . . , m. h j− m2 −1 h m2 − j+1 = 

(5.3.34)

(iv) For DCT-IV, it follows from (iv) in Theorem 2 that symmetry at x = − 21 implies 

j−

   m m − 1 + k − − 1 = −1, 2 2

or j + k = m + 1, and that anti-symmetry at x = n − 

j +n−

1 2

implies

   m m − 1 + k + n − − 1 = 2n − 1, 2 2

or j + k = m + 1 as well. Hence, in view of (5.3.21), we have

250

5 Data Compression

a 1j,k + a 3j,k

⎧  ⎪ h j− m2 −1 h j− m2 −1 +  h j+n− m2 −1 h j+n− m2 −1 , for k = j, ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ =  h j+n− m2 −1 h m2 +n− j , h j− m2 −1 h m2 − j −  ⎪ ⎪ ⎪ for k = − j + m + 1, ⎪ ⎪ ⎪ ⎩ 0, otherwise.

h − m2 , . . . ,  h n+ m2 −1 } Consequently, for the sequences {h − m2 , . . . , h n+ m2 −1 } and { to satisfy (5.3.27), it is necessary and sufficient that  h j+n− m2 −1 h j+n− m2 −1 = 1, for j = 2, . . . , m − 1; h j− m2 −1 h j− m2 −1 +   h j+n− m2 −1 h m2 +n− j , for j = 1, . . . , m. h j− m2 −1 h m2 − j = 

(5.3.35)

Remark 2 To apply the theory and methods of lapped transform to the data sequence U = {u 0 , . . . , u N −1 }, it is extended to a bi-infinite sequence by tacking on 0’s both to the left and to the right of U . In addition, to adopt the notation of AX ext = A1 X 1 + A2 X 2 + A3 X 3 so that Theorem 1 can be applied, we set xk = u k− m2 , for k = xk = 0, for k
0. More precisely, we have the following theorem. Theorem 3 Fourier series of functions in L2 [−d, d] For d > 0, the family {eikπx/d }, k = 0, ±1, ±2, . . ., is an orthonormal basis of L 2 [−d, d] with inner prod1 ·, · L 2 [−d,d] . Consequently, any function f in L 2 [−d, d] can be represented uct 2d by its Fourier series, namely: f (x) = (S f )(x) =

∞  k=−∞

ck ( f )eikπx/d ,

(6.1.11)

272

6 Fourier Series

which converges to f in L 2 [−d, d], where 1 ck ( f ) = 2d



d

−d

f (x)e−ikπx/d d x.

Furthermore, the Fourier coefficients ck = ck ( f ), k = 0, ±1, ±2, . . ., satisfy Parseval’s identity, namely: 1 2d



d −d

∞ 

| f (x)|2 d x =

|ck |2 , for all f ∈ L 2 [−d, d].

k=−∞

In the rest of this section, we apply Parseval’s identity (6.1.10) for some suitably chosen functions to compute the precise values of certain interesting series. Example 3 Apply Parseval’s identity (6.1.10) to the function f1 (x) defined in Example 1 to compute the precise value of the infinite series ∞  1 . k2 k=1

Solution Since f 1 22 = 2π, it follows from (6.1.5) that 1 =

∞    =−∞

4 = 2 π =

4 π2

=

4 π2

−2i 2 π(2 + 1)

∞ 

−1  1 1 + 2 (2 + 1) (2 + 1)2 =0 =−∞ ∞  ∞   1 1 + (2 + 1)2 (2 − 1)2 =0 =1 ∞  ∞   1 1 + (2 + 1)2 (2 + 1)2 =0

∞ 1 8  = 2 , π (2 + 1)2



=0

=0

or

∞  =0

1 π2 . = (2 + 1)2 8

(6.1.12)

To complete the solution of Example 3, we simply partition the sum over k into two sums, with one over even k’s and the other over odd k’s, namely:

6.1 Fourier Series

273 ∞ ∞ ∞    1 1 1 = + 2 2 k (2) (2 + 1)2 k=1

=

=1 ∞ 

1 4

=1

=0

1 π2 + , 2 8

(6.1.13)

by applying (6.1.12), so that ∞

 4 π2 π2 1 1 −1 π 2 = = . = 1 − k2 4 8 3 8 6

(6.1.14)

k=1

 Remark 2 Example 3 is called the “Basel problem”, posed by P. Mengoli in the year 1644 and solved by L. Euler in 1735, before the introduction of Fourier series by J. Fourier in the early 1800’s. In fact, Euler has derived the exact formula ζ(2n) =

∞  1 (−1)n−1 22n−1 π 2n b2n , = 2n k (2n)!

(6.1.15)

k=1

for all n = 1, 2, . . . , where the b2n ’s are the Bernoulli numbers, whose exact values can be easily computed recursively. The function ζ(z) defined for the complex variable z is the famous “Riemann zeta function”. The first three Bernoulli numbers b2n are: 1 −1 1 , b6 = . (6.1.16) b2 = , b4 = 6 30 42 On the other hand, the problem of finding the exact values of ζ(2n + 1) =

∞  k=1

1 k 2n+1

, n = 1, 2, . . .

remains unsolved, although Euler was able to compute at least ζ(3) and ζ(5) fairly accurately, of course without the computer (in the eighteenth century).  Remark 3 Putting the value of b2 in (6.1.16) into the formula (6.1.15) yields ζ(2) = π 2 /6, which is precisely the solution (6.1.14) of Example 3; where the Fourier series of the function f 1 (x) in Example 1 is used in applying Parseval’s identity. In the following examples, we further demonstrate the power of the Fourier methods for computing ζ(2n) without using Euler’s formula (6.1.15) and Bernoulli numbers b2n . The idea is to find functions f n (x) that satisfy f n (−π) = f n (π) such that f n (x) can be expressed in terms of f n−1 (x), so that the Fourier coefficients ck ( f n ) can be computed from ck ( f n−1 ) simply by integration by parts. 

274

6 Fourier Series

In the following two examples, we illustrate the idea proposed in Remark 3, but leave the detailed calculations as exercises. Example 4 Apply Parseval’s identity to compute ζ(4) by selecting an appropri . Verify the answer with Euler’s formula (6.1.15) and the ate function f 2 ∈ PC2π Bernoulli number b4 in (6.1.16). Solution Following the idea introduced in Remark 3, we set f 2 (x) = |x| for x ∈ [−π, π] and extend this function from [−π, π] to R by setting f 2 (x) = f 2 (x + 2π),  . Observe that f (x) = f (x) in Example 1. Hence, by integration so that f 2 ∈ PC2π 1 2 by parts, we have, for k = 0, 1  e−ikx π 1 ck ( f 2 ) = f 2 (x) − 2π −ik −π 2π −i = ck ( f 1 ), k



π −π

f 1 (x)

e−ikx dx −ik (6.1.17)

where the first term vanishes due to the fact that f 2 (−π) = f 2 (π). Since c0 ( f 2 ) = and c2 ( f 1 ) = 0 for all  = 0, it follows from (6.1.5) and (6.1.17) that ∞ 

|ck ( f 2 )|2 =

k=−∞

π 2

∞  4 π2 + 4 π 2 (2 + 1)4 =−∞ ∞ 

1 π2 8 + 2 4 π (2 + 1)4 =0   ∞ 1 −1 8  1 π2 + 1− 4 = , 4 2 π2 k4 =

k=1

where the trick introduced in the derivation of (6.1.13) is applied to arrive at the quantity in the last line. Hence, by Parseval’s identity, we have ∞

1  1 π 2 24 π 2 f 2 22 − = 4 k 2π 4 24 − 1 8 k=1

=

π2 3



π 2 24 π 2 π4 , = 4 24 − 1 23 90

(6.1.18)

which exactly matches the value of ζ(4) in (6.1.15) for n = 2 by using the value of b4 in (6.1.16) (see Exercise 6).  Example 5 Follow the instruction in Example 4 to compute ζ(6) without using Euler’s formula (6.1.15). Solution Following the idea introduced in Remark 3, we consider the odd function extension of

6.1 Fourier Series

275

f 3 (x) =

1

π 2 π2 − x− , 0 ≤ x ≤ π, 8 2 2

(6.1.19)

by setting f 3 (x) = − f 3 (−x) for −π ≤ x < 0. Then extend f 3 (x) from [−π, π]  . Observe that periodically to f 3 ∈ PC2π f 3 (x) =

π − f 2 (x), 2

f 3 (−π) = f 3 (π) = 0, and that 1 π4 f 3 22 = 2π 120

(6.1.20)

(see Exercise 8 for the computation of (6.1.20)). Hence, by applying (6.1.17) and by following the same method of derivation for (6.1.18), we have ∞  1 26 π 2 1 f 3 22 = 6 6 k 2 − 1 8 2π k=1

=

π6 8π 2 π 4 · = , 63 120 945

(6.1.21)

which exactly matches the value of ζ(6) in (6.1.15) for n = 3 (where the value of b6 in (6.1.16) is used).  Exercises  . Show that Exercise 1 Let f ∈ PC2π



a+π

 f (x)d x =

a−π

π

−π

f (x) d x

for all real numbers a. Exercise 2 Compute the Fourier series (S f )(x) of the “saw-tooth” function f (x) defined on [−π, π] by f (x) = x and extended periodically to R. Exercise 3 Let g(x) = π + x for −π ≤ x ≤ 0 and g(x) = π − x for 0 < x ≤ π. Compute the Fourier series (Sg)(x) of g(x). Exercise 4 Compute the Fourier series (6.1.4) of each of the following functions: (a) (b) (c) (c)

g1 (x) = cos2 x, g2 (x) = 1 + 2 sin2 x, g3 (x) = sin3 x, g4 (x) = g1 (x) − g2 (x) + g3 (x).

276

6 Fourier Series

Exercise 5 Let d > 0. Verify that the functions constitute an orthonormal family in L 2 [−d, d].

√1 eikπx/d , 2d

k = 0, ±1, ±2, . . .,

Exercise 6 Fill in the computational details in (6.1.18) by computing f 2 22 and splitting the infinite sum into the sum over all even integers and the sum over all odd integers. Exercise 7 Verify that f 3 (x), as defined in (6.1.19), satisfies f 3 (−π) = f 3 (π) and (6.1.20). Exercise 8 As a continuation of Exercises 6–7, fill in the computational details to derive the formula (6.1.21). Exercise 9 Apply Parseval’s identity to show that the Fourier series of any f ∈  determines f uniquely, in the sense that if two functions in PC  have the PC2π 2π same Fourier series, then these two functions are equal almost everywhere (see the notation of “almost everywhere” in Sect. 1.1). Exercise 10 Show that the sequence {ck } of Fourier coefficients of any f ∈ L 2 [−d, d] converges to 0, as k → ±∞. Hint: Apply Parseval’s identity in Theorem 3.

6.2 Fourier Series in Cosines and Sines For various important reasons, including the study of (Fourier) cosine series and facilitation of implementation, we convert the Fourier series expansion of functions  to (Fourier) trigonometric series in terms of cosines and sines by getting f ∈ PC2π rid of the imaginary unit, i, in Euler’s formula, as follows. Theorem 1 Fourier series in cosines and sines The Fourier series S f of f ∈  in (6.1.4) on p.267 can be re-formulated as PC2π ∞  a0   + (S f )(x) = ak cos kx + bk sin kx , 2

(6.2.1)

k=1

where

and

1 ak = π



1 bk = π

π

f (x) cos kxd x, k = 0, 1, 2, . . . ,

−π



π −π

f (x) sin kxd x, k = 1, 2, . . . .

Furthermore, the nth partial sum of (S f )(x) in (6.1.6) on p.270 can be written as

6.2 Fourier Series in Cosines and Sines

(Sn f )(x) =

277

n  a0   + ak cos kx + bk sin kx , 2

(6.2.2)

k=1

and converges to f (x) in L 2 [−π, π], in the sense that f − Sn f 2 → 0, as n → ∞.

(6.2.3)

In addition, the Fourier cosine and sine coefficients in (6.2.1) satisfy Parseval’s identity; that is, ∞

 1  1 |a0 |2 + |ak |2 + |bk |2 = 4 2 2π k=1



π

−π

| f (x)|2 d x,

 . for all f ∈ PC2π

Proof Observe that

1 a0 = 2 2π



π

−π

f (x)d x = c0 ,

where c0 is defined in (6.1.3) on p.267 for k = 0. For k = 1, 2, . . .,  π 1 1 f (x)e−ikx d x = (ak − ibk ); 2π −π 2  π 1 1 = f (x)eikx d x = (ak + ibk ), 2π −π 2

ck = c−k so that (S f )(x) = c0 +

∞ 

ck eikx +

k=1

= c0 + =

∞ 

ck eikx +

k=1 ∞ 

a0 + 2

k=1 ∞

−1 

ck eikx

k=−∞ ∞ 

c−k e−ikx

k=1

1 1 (ak − ibk )eikx + (ak + ibk )e−ikx 2 2

a0  1 1 + ak (eikx + e−ikx ) + (−ibk )(eikx − e−ikx ) = 2 2 2 = This proves (6.2.1).

a0 + 2

k=1 ∞  k=1

ak cos kx + bk sin kx.

(6.2.4)

278

6 Fourier Series

Following the above derivation, we also have (Sn f )(x) =

n n   a0   + ak cos kx + bk sin kx = ck eikx . 2 k=1

k=−n

Thus, the convergence property in (6.2.3) follows from (6.1.9) on p.270. Furthermore, Parseval’s identity (6.2.4) follows from (6.1.10) on p.271, since c0 = a0 /2 and 1 1 |ak − ibk |2 + |ak + ibk |2 4 4  1 2 2 = |ak | + |bk | , k = 1, 2, . . . , 2

|ck |2 + |c−k |2 =

where the fact that  |a − ib|2 + |a + ib|2 = 2 |a|2 + |b|2 ),

(6.2.5)

which is valid for all complex numbers a and b (see Exercise 2), is applied.  Since PC[−π, π] is dense in L 2 [−π, π], it follows from Theorem 1 on p.270 that the orthonormal family { √1 eikx : k = 0, ±1, . . .} is an orthonormal basis of 2π L 2 [−π, π]. As an alternative, it follows from Theorem 1 that the orthonormal family

1 1 1 √ , √ cos kx, √ sin kx : π π 2π

 k = 1, 2, . . .

of real-valued functions is another orthonormal basis of L 2 [−π, π]. In general, we have the following theorem. Theorem 2 Fourier cosine and sineseries for L2 [−d, d] Let d > 0. Then the family

W =

1 1 kπx 1 kπx , √ sin : k = 1, 2, . . . √ , √ cos d d 2d d d

 (6.2.6)

is an orthonormal basis of L 2 [−d, d]. Consequently, any f in L 2 [−d, d] can be represented by its Fourier series in cosines and sines, namely: ∞

f (x) = (S f )(x) =

a0  + 2 k=1

  kπx kπx ak cos + bk sin , d d

which converges to f in L 2 [−d, d], where

(6.2.7)

6.2 Fourier Series in Cosines and Sines



1 ak = d

f (x) cos

−d

and bk =

d

1 d



279

d

kπx d x, for k = 0, 1, 2, . . . , d

(6.2.8)

kπx d x, for k = 1, 2, . . . . d

(6.2.9)

f (x) sin

−d

Furthermore, the Fourier cosine and sine coefficients, defined in (6.2.8)–(6.2.9), satisfy Parseval’s identity, namely: ∞

 |a0 |2 1  1 + |ak |2 + |bk |2 = 4 2 2d k=1



d

−d

| f (x)|2 d x.

Example 1 Let f (x) = 1 for −d ≤ x ≤ 0, and f (x) = 0 for 0 < x ≤ d. Compute the Fourier cosine and sine series S f of f in (6.2.7). Solution For k = 0, 1 a0 = d



d

−d

1 f (x)d x = d



0

−d

1d x =

d = 1, d

and for k ≥ 1,  1 0 kπx kπx dx = dx f (x) cos 1 · cos d d −d d −d 0  1 sin kπx 1  d = sin 0 − sin(−kπ) = 0; = kπ d kπ d −d  d  1 0 1 kπx kπx bk = dx = dx f (x) sin 1 · sin d −d d d −d d  0  1 cos kπx 1  d = − cos 0 − cos(−kπ) =− d nπ kπ d

1 ak = d 



d

−d

 1  =− 1 − (−1)k kπ Thus, the Fourier cosine and sine series of f is given by ∞

kπx 1  1 (1 − (−1)k ) sin (S f )(x) = − 2 kπ d k=1 ∞

(2n + 1)πx 2 1  sin . = − 2 (2n + 1)π d n=0



280

6 Fourier Series

For the Fourier series expansion (S f )(x) in terms of both cosines and sines in (6.2.1) of Theorem 1 or in (6.2.7) in Theorem 2, we observe that the computational complexity can be reduced by eliminating either the sine series or the cosine series. Recall from the first paragraph of Sect. 6.1 that any function f (x), defined on an interval [a, b], can be treated as a function defined on the interval [0, d], by the d (x − a). We then proceed by considering the Fourier change of variables t = b−a trigonometric series of f (t) (with f (t) = f (x)) on [0, d] (instead of f (x) on [a, b]), while recovering both f (x) and its Fourier cosine and sine series by the change of variable x = b−a d t + a. Hence, without loss of generality, we only consider functions f (x) in L 2 [0, d], d > 0. For f ∈ L 2 [0, d], we may extend f to an even function f e on [−d, d] by setting

f (x), for 0 ≤ x ≤ d, f (−x), for − d ≤ x < 0.

f e (x) =

Then, since sin( kπx d ) is an odd function, we have bk ( f e ) = 0 in (6.2.9), so that ∞

(S f e )(x) =

a0 ( f e )  kπx + , ak ( f e ) cos 2 d

(6.2.10)

k=1

where 1 ak ( f e ) = d



d −d

kπx 2 f e (x) cos dx = d d



d

f (x) cos

0

kπx dx d

for k = 0, 1, 2, . . . . Let a0 ( f e )  kπx + ak ( f e ) cos 2 d n

(Sn f e )(x) =

k=1

denote the nth partial sum of (S f e )(x). Then it follows from Theorem 2 that (Sn f e )(x) converges to f e in L 2 [−d, d]. Thus, when restricted to [0, d], (Sn f e )(x) converges to f in L 2 [0, d]. The series in (6.2.10) is called the cosine series of f , and is denoted by S c f . For functions defined on [0, d], the collection   1 2 kπx cos : k = 1, 2, . . . W1 = √ , d d d

(6.2.11)

is an orthonormal family in L 2 [0, d] (see Exercise 5). Thus, W1 is an orthonormal basis of L 2 [0, d]. To summarize, we have the following theorem. Theorem 3 Fourier cosine series Let d > 0. Then the set W1 of cosine functions defined by (6.2.11) is an orthonormal basis of L 2 [0, d]. Thus, any function f ∈ L 2 [0, d] can be expanded as its cosine series, namely:

6.2 Fourier Series in Cosines and Sines

281 ∞

f (x) = (S c f )(x) =

a0  kπx ak cos + , 2 d

(6.2.12)

k=1

which converges to f (x) in L 2 [0, d], where the cosine coefficients are given by ak =

2 d



d

f (x) cos

0

kπx d x, k = 0, 1, . . . . d

(6.2.13)

Example 2 Let f (x) = 1 + 2 cos2 x, 0 ≤ x ≤ π. Find the cosine series (S c f ) of f in (6.2.12) with d = π. Solution By the trigonometric identity cos2 x = 21 (1 + cos 2x), we may write f (x) as 1 f (x) = 1 + 2 · (1 + cos 2x) = 2 + cos 2x. 2 Hence, applying (6.2.13), we can compute  2 π (2 + cos 2x) d x = 4; π 0  π 2 (2 + cos 2x) cos 2xd x a2 = π 0  π 2π 2 cos2 2xd x = = = 1, π 0 π2

a0 =

and for k = 3, 4, . . ., ak = 0, since cos kx is orthogonal to 1 and cos 2x. Thus, by (6.2.12), the cosine series S c f of f (x) = 1 + 2 cos2 x is (S c f )(x) =

a0 + a2 cos 2x = 2 + cos 2x. 2

Observe that there is really no need to compute ak , since the cosine polynomial 2 + cos 2x is already a cosine series of cos kx, k = 0, 1, . . ..  In general, for integer powers of the cosine and sine functions, we may apply Euler’s formula to avoid integration. Example 3 Let f (x) = cos5 x, 0 ≤ x ≤ π. Find the cosine series (S c f ) of f in (6.2.12) with d = π. Solution By Euler’s formula, we have 5 ei x + e−i x cos x = 2   1 i5x e + 5ei3x + 10ei x + 10e−i x + 5e−i3x + e−i5x = 32 

5

282

6 Fourier Series

1 i5x e + e−i5x + 5(ei3x + e−i3x ) + 10(ei x + e−i x ) 32  1 cos 5x + 5 cos 3x + 10 cos x . = 16

=

Since the above trigonometric polynomial is already a “series” in terms of cosine basis functions cos kx, k = 0, 1, . . ., we know that its cosine series is itself. Thus, the cosine series S c f of f (x) = cos5 x is given by (S c f )(x) =

5 1 5 cos x + cos 3x + cos 5x, 8 16 16 

where the terms are arranged in increasing order of “frequencies”.

Similarly, a function f ∈ L 2 [0, d] can be expanded as a sine series. The trick is to consider the odd extension f o of f to [−d, d], namely: ⎧ for 0 < x ≤ d, ⎨ f (x), f o (x) = − f (−x), for − d ≤ x < 0, ⎩ 0, for x = 0. Then, since cos kπx d is an even function for each k = 0, 1, . . ., we have ak ( f o ) = 0 in (6.2.8), and bk ( f o ) =

1 d



d −d

f o (x) sin

2 kπx dx = d d



d

f (x) sin

0

kπx dx d

for k = 1, 2, . . .. Thus, the Fourier series of f o is given by (S f o )(x) =

∞  k=1

bk ( f o ) sin

kπx , d

which converges to f in L 2 [0, d]. In addition, it is not difficult to verify that W2 =

 2 d

sin

 kπx : k = 1, 2, . . . d

(6.2.14)

is an orthonormal family in L 2 [0, d] (see Exercise 7). Thus, W2 is an orthonormal basis of L 2 [0, d]. We summarize the above discussion in the following theorem. Theorem 4 Fourier sine series Let d > 0. Then the family W2 of sine functions defined by (6.2.14) is an orthonormal basis of L 2 [0, d]. Consequently, any function f ∈ L 2 [0, d] can be represented by its sine series, namely:

6.2 Fourier Series in Cosines and Sines

283

f (x) = (S s f )(x) =

∞ 

bk sin

k=1

kπx , d

(6.2.15)

which converges to f (x) in L 2 [0, d], where the sine coefficients bk are given by bk =

2 d



d

f (x) sin

0

kπx d x, k = 1, 2, . . . . d

(6.2.16)

Example 4 Let f (x) = x, 0 ≤ x ≤ 1. Compute the sine series S s f of f in (6.2.15) with d = 1. Solution For d = 1 in (6.2.16),  bk = 2

1



1

f (x) sin kπxd x = 2 x sin kπxd x 0 0    1  − cos kπx −2 1 =2 x dx = x (cos kπx) d x kπ kπ 0 0 1  1  sin kπx 1   −2  −2  . x cos kπx − cos kπ − 0 − = cos kπxd x = 0 0 kπ kπ kπ 0  sin 0  −2  −2  sin kπ (−1)k − + = (−1)k − 0 + 0 = kπ kπ kπ kπ k+1 (−1) 2 2 (−1)k+1 = = kπ π k

Thus, by (6.2.15), the Fourier sine series of f (x) = x is given by ∞ 2  (−1)k+1 sin kπx. π k k=1

 In applications, there are other types of cosine and sine series for f ∈ L 2 [0, d] that are more frequently used. Let g(x) be the odd extension of f around x = d (instead of the even extension around 0 as in the derivation of (6.2.10) that leads to (6.2.12)–(6.2.13) of Theorem 3), namely: ⎧ for 0 ≤ x < d, ⎨ f (x), g(x) = − f (2d − x), for d < x ≤ 2d, ⎩ 0, for x = d. Then by Theorem 3, as a function in L 2 [0, 2d], g(x) can be expanded as its cosine series, namely:

284

6 Fourier Series ∞   a0 (g)  kπx g(x) = S c g (x) = ak (g) cos + , 2 2d k=1

where  2d 2 kπx g(x) cos dx 2d 0 2d    1 2d  kπx kπx 1 d dx + dx f (x) cos − f (2d − x) cos = d 0 2d d d 2d  d  d   1 kπ(2d − y) kπx 1 = − f (y) cos f (x) cos dx + dy d 0 2d d 0 2d    1 d kπx d x. = f (x) 1 − (−1)k cos d 0 2d

ak (g) =

Thus, 2 a2k (g) = 0, a2k+1 (g) = d



d

f (x) cos

0

(2k + 1)πx d x, k = 0, 1, . . . . 2d

Since the cosine series S c g of g converges to g in L 2 [0, 2d], it converges to f in L 2 [0, d]. In addition, it is not difficult to verify that  W3 =

(2k + 1)πx 2 cos : k = 0, 1, . . . d 2d

 (6.2.17)

is an orthonormal family in L 2 [0, d] (see Exercise 9). Hence, W3 is an orthonormal basis of L 2 [0, d]. We summarize the above discussion in the following theorem. Theorem 5 Fourier cosine series of Type II Let d > 0. The family W3 defined by (6.2.17) is an orthonormal basis of L 2 [0, d]. Consequently, any f ∈ L 2 [0, d] can be represented by the cosine series f (x) = ( S c f )(x) =

∞  k=0

ak ( f ) cos

(2k + 1)πx , 2d

(6.2.18)

which converges to f (x) in L 2 [0, d], where the cosine coefficients ak ( f ) are given by  2 d (2k + 1)πx d x, k = 0, 1, . . . . (6.2.19) ak ( f ) = f (x) cos d 0 2d Remark 1 It is important to observe that the DCT studied in Sect. 4.3 of Chap. 4 and applied to JPEG image compression in Chap. 5 is the discrete version of the cosine

6.2 Fourier Series in Cosines and Sines

285

coefficients ak ( f ), k = 0, 1, . . ., in (6.2.19), and that the inverse DCT is the discrete  version of the cosine series S c f in (6.2.18). Example 5 Let f (x) = x, 0 ≤ x ≤ 1. Compute the cosine series S c f of f in (6.2.18) with d = 1. Solution For d = 1 in (6.2.19),      1 1 1 πxd x = 2 πxd x ak ( f ) = 2 f (x) cos k + x cos k + 2 2 0 0      1  sin k + 1 πx 1 sin k + 21 πx 2 =2 x  dx − 2    0 k + 21 π k + 21 π 0       1 2 sin k + 21 π 2 1 =  + πx cos k +    1 2 0 2 k+2 π (k + 21 )π 

=

1

2(−1)k (k +

1 2 )π

2 − 2 . (k + 21 )π

Thus, by (6.2.18), the Fourier cosine series of f (x) = x is given by c

( S f )(x) =

∞  k=0



2(−1)k 2   −   2 1 k+2 π k + 21 π



  1 πx. cos k + 2 

Finally, let us consider another type of sine series for f ∈ L 2 [0, d]. Let h(x) be the even extension of f around x = d (instead of the odd extension around 0 as in the derivation of (6.2.15)–(6.2.16) of Theorem 4), namely:

h(x) =

f (x), for 0 ≤ x ≤ d, f (2d − x), for d < x ≤ 2d.

Then, by Theorem 4, as a function in L 2 [0, 2d], h(x) can be expanded as its sine series, namely: ∞    kπx , h(x) = S s h (x) = bk (h) sin 2d k=1

where, by the symmetry property of h(x),  2d 2 kπx dx h(x) sin 2d 0 2d    1 d kπx = dx f (x) 1 − (−1)k sin d 0 2d

bk (h) =

(6.2.20)

286

6 Fourier Series

(see Exercise 11). Thus, 2 b2k (h) = 0, b2k+1 (h) = d



d

f (x) sin

0

(2k + 1)πx d x, k = 0, 1, . . . . 2d

Since the sine series S s h of h converges to h in L 2 [0, 2d], it converges to f in L 2 [0, d]. In addition, it is not difficult to verify that  W4 =

(2k + 1)πx 2 sin : k = 0, 1, . . . d 2d

 (6.2.21)

is an orthonormal family in L 2 [0, d] (see Exercise 7). Thus, W4 is an orthonormal basis of L 2 [0, d]. We may summarize the above discussion in the following theorem. Theorem 6 Fourier sine series of Type II Let d > 0. The family W4 defined by (6.2.21) is an orthonormal basis of L 2 [0, d]. Consequently, any f ∈ L 2 [0, d] can be represented by the sine series f (x) = ( S s f )(x) =

∞  k=0

bk ( f ) sin

(2k + 1)πx , 2d

(6.2.22)

which converges to f (x) in L 2 [0, d], where the sine coefficients bk ( f ) are given by 2 bk ( f ) = d



d

f (x) sin

0

(2k + 1)πx d x, k = 0, 1, . . . . 2d

(6.2.23)

Example 6 Let f (x) = x, 0 ≤ x ≤ 1. Compute the sine series S s f of f in (6.2.22) with d = 1. Solution By (6.2.23) with d = 1,      1 1 1 πxd x = 2 πxd x f (x) sin k + x sin k + 2 2 0 0    1  cos k + 1  πx 1 cos k + 21 πx 2 = −2 x  dx +2    0 k + 21 π k + 21 π 0     1 2 1 πx = 0 +  sin k +   2 0 2 k + 21 π 

bk ( f ) = 2

1

2(−1)k =   2 . k + 21 π Thus, by (6.2.22), the Fourier sine series of f (x) = x is given by

6.2 Fourier Series in Cosines and Sines

( S s f )(x) =

∞  k=0

287

  2(−1)k 1 πx. sin k +   2 2 k + 21 π  Exercises

Exercise 1 Compute the Fourier cosine and sine series (6.2.1) of each of the following functions: (a) (b) (c) (d)

g1 (x) = cos2 x, g2 (x) = 1 + 2 sin2 x, g3 (x) = sin3 x, g4 (x) = g1 (x) − g2 (x) + g3 (x).

Exercise 2 Verify (6.2.5) for any complex numbers a, b ∈ C. Exercise 3 Verify that W , defined by (6.2.6), is an orthonormal family of L 2 [−d, d]. Exercise 4 Compute the cosine series (6.2.12) with d = 1 for each of the following functions: (a) (b) (c) (d)

f 1 (x) = 1, 0 ≤ x ≤ 1, f 2 (x) = 1, 0 ≤ x ≤ 21 ; and f 2 (x) = −1, 21 < x ≤ 1, f 3 (x) = x 2 , 0 ≤ x ≤ 1, f 4 (x) = sin2 πx, 0 ≤ x ≤ 1.

Exercise 5 Verify that W1 , defined by (6.2.11), is an orthonormal family in L 2 [0, d]. Exercise 6 Find the sine series (6.2.15) with d = 1 for each of the functions in Exercise 4 Exercise 7 Show that W2 , defined by (6.2.14), is an orthonormal family in L 2 [0, d]. Exercise 8 Compute the cosine series (6.2.18) with d = 1 for each of the functions in Exercise 4. Exercise 9 Verify that W3 , defined by (6.2.17), is an orthonormal family in L 2 [0, d]. Exercise 10 Compute the sine series (6.2.22) with d = 1 for each of the functions in Exercise 4. Exercise 11 Provide the details for the derivation of (6.2.20). Exercise 12 Verify that W4 , defined by (6.2.21), is an orthonormal family in L 2 [0, d]. Exercise 13 Let f ∈ L 2 [0, d] and {ak ( f )} be the sequence defined by (6.2.19). Show that ak ( f ) → 0, as k → ∞.

288

6 Fourier Series

Hint: Apply Parseval’s identity with the orthonormal basis W3 , defined by (6.2.17), for L 2 [0, d]. Exercise 14 Let f ∈ L 2 [0, d] and {bk ( f )} be the sequence defined by (6.2.23). Show that bk ( f ) → 0, as k → ∞. Hint: Apply Parseval’s identity with the orthonormal basis W4 , defined by (6.2.21), for L 2 [0, d].

6.3 Kernel Methods  , the nth partial sum (S f )(x) of the Fourier series expansion (6.1.4) For f ∈ PC2π n on p. 267 can be formulated as

(Sn f )(x) =

n  k=−n n 

ck eikx = 

 π n

 1 f (t)e−ikt dt eikx 2π −π

k=−n π

1 f (t)eik(x−t) dt 2π −π k=−n  π 1 f (t)Dn (x − t)dt, = 2π −π

=

where

n 

Dn (x) =

eikx

(6.3.1)

(6.3.2)

k=−n

is called the nth order “Dirichlet kernel” .  with inner product Definition 1 Convolution For the inner-product space PC2π defined in (6.1.1), the operation

 ( f ∗ g)(x) =

π −π

f (t)g(x − t)dt

(6.3.3)

 with g ∈ PC  . is called the convolution of f ∈ PC2π 2π

Remark 1 The convolution operator defined by (6.3.3) is commutative, since 

π

−π

 f (t)g(x − t)dt =

π

−π

f (x − t)g(t)dt,

6.3 Kernel Methods

289

1.5

1

0.5

0

−n −n+1

−1 0 1

n−1 n

−0.5

Fig. 6.1 Coefficient plot of Dn (x)

which can be easily verified by applying the 2π-periodicity property of f (x) and g(x) in the change of variable (x − t) to t in the integration (where x is fixed) (see Exercise 1).  Remark 2 As shown in (6.3.1), the nth partial sum (Sn f )(x) of the Fourier series  is a 1 -multiple of the convolution of f with the nth order Dirichlet of f ∈ PC2π 2π 1 1 f ∗ Dn . Hence, if 2π Dn (x) is considered as a (convolution) kernel, namely, Sn f = 2π  filter of the signal f ∈ PC2π , then it is an “ideal lowpass filter” in that the highfrequency content, ck ( f ) for |k| > n, is removed, while the low-frequency content, ck ( f ) for k = −n, . . . , n, is unaltered. Indeed, the Fourier coefficients ck of Dn (x) are given by ck = 1 for |k| ≤ n, and ck = 0 for |k| > n, as shown in Fig. 6.1. This concept will be elaborated upon in Sect. 7.2.  Theorem 1 Let f and g be 2π-periodic functions such that f ∈ L 2 [−π, π] and g ∈ L 1 [−π, π]. Then f ∗ g ∈ L 2 [−π, π] and f ∗ g 2 ≤ f 2 g 1 .

(6.3.4)

Although (6.3.4) is valid for Lebesgue integrable functions as stated in the theo∗ in this elementary textbook. Hence, both rem, we only consider functions in PC2π of the functions f ∗ g and g can be approximated by finite Riemann sums. More precisely, lim

n→∞

lim

n→∞

Now, by setting

n−1 

 f

k=0 n−1   k=0

2πk x− n

   2πk 2π g = ( f ∗ g)(x); n n

   π  2πk  2π  g = |g(t)|dt = g 1 .  n  n −π

(6.3.5)

(6.3.6)

290

6 Fourier Series

 Fn,k (x) = f

x−

2πk n

   2πk 2π g n n

in (6.3.5), we have

lim

n→∞

n−1 

⎛ = ⎝ lim

Fn,k

k=0

n→∞

2

⎛  ⎝ =

   −π 



n−1 π  k=0

2 ⎞ 21   Fn,k (x) d x ⎠ 

n−1 2 ⎞ 21   π   lim  Fn,k (x) d x ⎠ = n→∞   −π

f ∗g

2

.

k=0

Here, the justification for the validity of interchanging limit and integration is that  n−1 2  Fn,k (x) is nonnegative. It can also be proved rigorously by applying k=0

Fatou’s lemma, which is one of the most basic theorems in a course of Real Analysis (see Theorem 3 in Sect. 7.6 of Chap. 7). On the other hand, it follows from the triangle inequality for the L 2 -norm, that n−1  k=0



Fn,k 2

n−1 

Fn,k

2

=

k=0

n−1  

−π

k=0

=

 n−1  k=0

=

π

  2    2πk  dx f x −  n  −π π

   21  2πk 2π 2 g( )   n n 

n−1   k=0

1    Fn,k (x)2 d x 2

π −π

 1       f (x)2 d x 2 g 2πk  2π  n  n

 n−1     2πk  2π   → f 2 g 1 , = f 2 g n  n k=0

for n → ∞, by (6.3.6). Here, we have applied the 2π-periodic property of f (x) to obtain 2  π  π       f x − 2πk  d x =  f (x)2 d x.   n −π −π This completes the proof of (6.3.4).



We remark that Theorem 1 can also be proved by applying the more general Minkowski inequality for integrals to be discussed in the Appendix of Chap. 7. In the following, we derive an explicit formulation of the Dirichlet kernel.

6.3 Kernel Methods

291

Theorem 2 For n = 1, 2, 3, . . ., the Dirichlet kernel Dn (x), as defined in (6.3.2), is given by sin(n + 21 )x . (6.3.7) Dn (x) = sin x2 The formula (6.3.7) can be derived as follows: Dn (x) = e−inx

2n 

eikx = e−inx

k=0

e−inx

1 − ei(2n+1)x 1 − ei x 1

1

e−i(n+ 2 )x − ei(n+ 2 )x

− ei(n+1)x

= x x x x x ei 2 (e−i 2 − ei 2 ) e−i 2 − ei 2     (−2i) sin n + 21 x sin n + 21 x = = . (−2i) sin x2 sin x2 =



Remark 3 From the definition in (6.3.2), it is clear that Dn (0) = 2n + 1. Hence, the formula in (6.3.7) of Dn (x) for x = 0 also applies to x = 0 by applying L’Hospital’s rule, namely: lim

x→0

sin(n + 21 )x (n + 21 ) cos(n + 21 )x = lim x 1 x x→0 sin 2 2 cos 2 cos 0 = 2n + 1. = (2n + 1) cos 0

Examples of graphs of Dn (x) are plotted in Fig. 6.2.



Next, we introduce the notion of Césaro means (that is, averages) of Dirichlet’s kernels D j (x) to formulate the nth order Fejér’s kernel, namely: 35

8

30 25

6

20

4

15

2

10 5

0 −2 −3

0

−2

−1

0

1

2

3

−5 −3

−2

−1

0

1

Fig. 6.2 Dirichlet’s kernels Dn (x) for n = 4 (on left) and n = 16 (on right)

2

3

292

6 Fourier Series 1.5

1

0.5

0

− n− n + 1

−1 0 1

n−1 n

−0.5

Fig. 6.3 Coefficient plot of σn (x)

σn (x) =

n 1  D j (x), n+1

(6.3.8)

j=0

which is another lowpass filter with coefficient plot shown in Fig. 6.3. For convenience, we set z = ei x and observe that D j (x) =

  1 1 − z 2 j+1 1 j+1 . = z − z j (1 − z) z−1 zj

Thus, we have (n + 1)σn (x) = D0 (x) + D1 (x) + · · · + Dn (x)     1 1 1 1 1 (z − 1) + z2 − + ··· + z n+1 − n = z−1 z−1 z z−1 z  1 1 1  z + z 2 + · · · + z n+1 − 1 − − · · · − n = z−1 z z   1

1 1 + z + z2 + · · · + zn z− n = z−1 z   1 1 z − n (z n+1 − 1) = (z − 1)2 z

n+1 n+1 2 2  n+1 z 2 − z− 2 −1 z = n =

1 1 2 z (z − 1)2 z 2 − z− 2  2 sin (n+1)x 2 = . sin x2 In the following, we summarize the important properties of the Fejér kernel.

6.3 Kernel Methods

293

Theorem 3 Fej´er’s kernel For n = 1, 2, . . . , the n th order Fejér kernel σn (x), as defined in (6.3.8), can be formulated as σn (x) =

1 sin n+1 sin

(n+1)x 2 x 2

2

.

(6.3.9)

Furthermore, the sequence {σn (x)} constitutes a positive “approximate identity”, meaning that it has the following properties: (a) σn (x)  π≥ 0, f or all x; 1 (b) σn (x)d x = 1; 2π −π (c) For any positive number δ with 0 < δ < π, sup σn (x) → 0, as n → ∞;

(6.3.10)

δ≤|x|≤π

(d) For any f ∈ L 2 [−π, π] with 2π-periodic extension also denoted by f , 1 f ∗ σn 2 ≤ f 2 . 2π The first property is evident from (6.3.9), and the second property follows from the definitions of Dk (x) and σn (x), while the third property follows from the fact that | sin y| ≤ 1 for all y ∈ R and | sin x2 | ≥ | sin 2δ | > 0 for δ ≤ |x| ≤ π (see Exercise 2). The fourth property is a consequence of Theorem 1 together with properties (a) and  (b). Examples of graphs of σn (x) are illustrated in Fig. 6.4. In view of the convolution formulation (in terms of the Dirichlet kernels and f ) for the partial sum (Sn f )(x) of the Fourier series (S f )(x) in (6.3.1), we consider the Césaro means of (Sn f )(x) as convolution with the Fejér kernels σn (x), as follows.

5

18

4.5

16

4

14

3.5

12

3

10

2.5

8

2

6

1.5 1

4

0.5

2

0

−3

−2

−1

0

1

2

3

0

−3

−2

−1

0

1

Fig. 6.4 Fejer’s kernels σn (x) for n = 4 (on left) and n = 16 (on right)

2

3

294

6 Fourier Series

 Definition 2 C´esaro means The nth-order Césaro means (Cn f )(x) of f ∈ PC2π is defined by (S0 f )(x) + · · · + (Sn f )(x) (Cn f )(x) = . (6.3.11) n+1  . Since Observe that V2n+1 , defined by (6.1.2) on p.267, is a subspace of PC2π (Cn f )(x) is a linear combination of (S j f )(x), 0 ≤ j ≤ n, and since each (S j f )(x) is in V2n+1 , the Césaro means (Cn f )(x) is also in V2n+1 .

Remark 4 It follows from (6.3.1) and (6.3.8) that 1 (Cn f )(x) = 2π



π

−π

f (t)σn (x − t)dt =

1 ( f ∗ σn )(x), 2π

(6.3.12)

where σn (x) is the nth order Fejér kernel. Hence, by applying property (d) of Theorem  , 3, we also have, for any g ∈ PC2π Cn g 2 =

1 g ∗ σn 2 ≤ g 2 . 2π

(6.3.13) 

Exercises  is comExercise 1 Show that the convolution operation defined in (6.3.3) for PC2π mutative.

Exercise 2 Show that, for 0 < δ ≤ π,    sin

x    ≥  sin 2

δ  δ ≥ 2 π

for all x, with δ ≤ x ≤ π.



Exercise 3 Let Dn (t) be the Dirichlet kernel. Show that

π

Dn (t)dt = π.

0

Exercise 4 Let (Cn f )(x) be the nth order Césaro means of the partial sums S0 f, . . . , Sn f of the Fourier series (S f )(x), as defined in (6.3.11). Derive the convo1 ( f ∗ σn )(x) in (6.3.12), where σn (x) is the nth order lution formula (Cn f )(x) = 2π Fejér kernel. Exercise 5 From the formula (6.3.9) of the Fejér kernel, compute the value of σn (x) at x = 0. Exercise 6 Does the sequence of Césaro means of the divergent sequence {xn }, xn = (−1)n , converge? If so, compute its limit.

6.3 Kernel Methods

295

Exercise 7 As a continuation of Exercise 6, let sn =

n 

xk . Show that {sn } is diver-

k=0

gent although the sequence of its Césaro means converges to 21 . Exercise 8 (For advanced students) If a sequence {xn } of complex numbers converges to some limit c ∈ C as n → ∞, show that the sequence of its Césaro means yn =

x0 + · · · + xn n+1

also converges to c.

6.4 Convergence of Fourier Series As mentioned previously, when a function f ∈ L 2 [−π, π] is considered as an analog signal, its Fourier coefficients ck = ck ( f ), k ∈ Z, reveal its frequency content, with the coefficient ck representing the content at (k/2π)Hz. Therefore, the separation of f (x) into frequency components ck eikx , ak cos kx, or bk sin kx facilitates the application of digital signal processing (DSP) of f (x). However, it is important that the (modified) function f (x) can be recovered from the (processed) sequence {ck }, or equivalently, from the Fourier series (S f )(x). The first theorem in this section  , (S f )(x ) converges to f (x ), under the assumption that assures that for f ∈ PC2π 0 0 f (x) has both one-sided derivatives at x = x0 . Theorem 1 Pointwise convergence Let f (x) ∈ PC[−π, π] and x0 ∈ [−π, π], and suppose that both of the one-sided derivatives f (x0+ ) = lim

h→0+

f (x0 + h) − f (x0+ ) − f (x0 − h) − f (x0− ) , f (x0 ) = lim , h −h h→0+

of f (x) at x0 ∈ [−π, π] exist. Then the Fourier series of f converges to  f (x0+ ) at x = x0 .

1 2



f (x0− ) +

In the above theorem, f (x0+ ) and f (x0− ) denote the right-hand and left-hand limits of f at x0 . Moreover, in view of the 2π-periodic extension of f (x), we have 

f (π + ) = f (−π + ), f (−π − ) = f (π − ); f (π + ) = f (−π + ), f (−π − ) = f (π − ).

(6.4.1)

Proof of Theorem1 Consider the 2π-periodic extension of f ∈ PC[−π, π] to  . Then by (6.3.1) on p.288, we have f ∈ PC2π

296

6 Fourier Series

(Sn f )(x) =

1 2π



π

−π

 0

f (x − t)Dn (t)dt

 π   1 f (x − t)Dn (t) dt + 2π −π 0  π   1 = f (x + t) + f (x − t) Dn (t)dt, 2π 0

=

where the change of the variable of integration u = −t for the integral over [−π, 0] and the fact that Dn (−t) = Dn (t) have been applied. Notice that 1 2π



π

Dn (t)dt =

0

1 2

(see Exercise 3 of previous section). Thus, for any x = x0 , we have  1 f (x0− ) + f (x0+ ) (Sn f )(x0 ) −  π 2    1 f (x0 + t) + f (x0 − t) − f (x0− ) + f (x0+ ) Dn (t)dt = 2π    π 0 1 tdt, h(t) sin n + = 2 0 where h(t) =

1  + −  t f (x 0 + t) − f (x 0 ) + f (x 0 − t) − f (x 0 ) . 2π sin 2

By the assumption that f (x0+ ) and f (x0− ) exist, and the fact that πt ≤ sin 2t for all 0 ≤ t ≤ π (see Exercise 2), we may conclude that h(t) is bounded on [0, π] (see Exercise 3). Thus, h(t) ∈ L 2 [0, π]. Hence, lim (Sn f )(x0 ) −

n→∞

 1 f (x0− ) + f (x0+ ) = lim n→∞ 2



π

0

1 h(t) sin(n + )tdt = 0, 2

wherethe identity for h(t) with the orthonormal % last equality follows from Parseval’s  2 1 basis π sin(k + 2 )x : k = 0, 1, . . . of L 2 [0, π] (see Exercise 14 of Sect. 6.2).  Since h(t) is also in L 1 [0, π], the last equality in the above proof also follows from the Riemann-Lebesgue lemma, which states that  lim

ω→∞ 0

π

g(t)e−iωt dt = 0

(6.4.2)

for any g(t) ∈ L 1 [0, π] (see Exercise 4 for the proof  of (6.4.2)).  If f (x) is continuous at x ∈ (−π, π), then 21 f (x − ) + f (x + ) = f (x); and if f (x) is continuous at π and −π and f (−π) = f (π), then

6.4 Convergence of Fourier Series

297

  1 1 f (π − ) + f (π + ) = f (π), f (−π − ) + f (−π + ) = f (−π). 2 2 Furthermore, if f (x) is differentiable at x, then f (x + ) and f (x − ) exist. Therefore, ∗ , the Fourier series for any f ∈ PC2π ∞ 

(S f )(x) =

ck ( f )eikx

k=−∞

or



(S f )(x) =

a0 ( f )  + ak ( f ) cos kx + bk ( f ) sin kx 2 k=1

converges to f (x) at any x, where f (x) is continuous, provided that both of the one-sided derivatives f (x + ) and f (x − ) exist. Example 1 Let f (x) = |x|, −π ≤ x ≤ π. Discuss the pointwise convergence of the Fourier series of f . Solution It is clear that the function f (x) = |x| is continuous on [−π, π] and differentiable on (−π, 0) and (0, π), with f (x) = 1 for 0 < x < π; f (x) = −1 for −π < x < 0. Furthermore, the one-sided derivatives of f (x) also exist at x = −π, 0, π, with f (0+ ) = 1, f (0− ) = −1; f (π − ) = 1, f (−π + ) = −1, while, by the second line in (6.4.1), f (−π − ) = f (π − ) = 1, f (π + ) = f (−π + ) = −1. Therefore, by Theorem 1, the Fourier series of f (x) = |x| converges to f (x) for all x ∈ [−π, π]. Here, we have used the fact that  1 f (−π − ) + f (−π + ) = f (−π); 2  1 f (π − ) + f (π + ) = f (π), 2 since the 2π-periodic extension of f (x) = |x| is continuous at the two end-points x = −π, π. See Fig. 6.5 for the partial sums of the Fourier series. 

Fig. 6.5 From top to bottom: $f(x) = |x|$ and its $2\pi$-periodic extension, $S_{10}f$, $S_{50}f$ and $S_{100}f$

Example 2 Let $f(x) = \chi_{[-\pi/2,\pi/2]}(x)$, $-\pi \le x \le \pi$, be the function $g_1(x)$ considered in Example 2 on p.269; that is, $f(x) = 1$ for $|x| \le \frac{\pi}{2}$ and $f(x) = 0$ for $\frac{\pi}{2} < |x| \le \pi$. Discuss the pointwise convergence of the Fourier series of $f$.

Solution The function $f(x)$ is clearly in $PC[-\pi,\pi]$ and differentiable, with $f'(x) = 0$ for all $x \in [-\pi,\pi]$ except $x = -\frac{\pi}{2}, \frac{\pi}{2}$, and with
$$f'\bigl((-\pi/2)^-\bigr) = f'\bigl((-\pi/2)^+\bigr) = 0, \qquad f'\bigl((\pi/2)^-\bigr) = f'\bigl((\pi/2)^+\bigr) = 0$$
as well. In addition, the $2\pi$-periodic extension is continuous at the two end-points $x = -\pi, \pi$. Therefore, it follows from Theorem 1 that the Fourier series $(Sf)(x)$ of $f(x)$ converges to $f(x)$ at each $x \in [-\pi,\pi]$ with $x \ne \pm\frac{\pi}{2}$, and to
$$\frac{1}{2}\bigl[f\bigl((\pi/2)^-\bigr)+f\bigl((\pi/2)^+\bigr)\bigr] = \frac{1}{2}(1+0) = \frac{1}{2}, \qquad \frac{1}{2}\bigl[f\bigl((-\pi/2)^-\bigr)+f\bigl((-\pi/2)^+\bigr)\bigr] = \frac{1}{2}(0+1) = \frac{1}{2}$$
at $x = \frac{\pi}{2}$ and $x = -\frac{\pi}{2}$, respectively. See Fig. 6.6 for the partial sums of the Fourier series. $\square$

Fig. 6.6 From top to bottom: $f(x) = \chi_{[-\pi/2,\pi/2]}(x)$, $-\pi \le x \le \pi$, and its $2\pi$-periodic extension, $S_{10}f$, $S_{50}f$ and $S_{100}f$

Observe that in the graphs of $(S_n f)(x)$ for $n = 10, 50, 100$, as plotted in Fig. 6.6, there are ripples near the points of jump discontinuity of $f(x)$ at $x = -\frac{\pi}{2}$ and $x = \frac{\pi}{2}$. This is due to the Gibbs phenomenon, which is explained as follows. Recall from Example 2 on p.269 that the Fourier coefficients $c_k = c_k(f)$ are given by
$$c_0 = \frac{1}{2}, \qquad c_{2\ell} = 0, \qquad c_{2\ell-1} = \frac{(-1)^{\ell-1}}{(2\ell-1)\pi}, \quad \ell = 1, 2, \ldots.$$
Thus, the $(2n-1)$th- and $(2n)$th-order partial sums of the Fourier series $(Sf)(x)$ of $f$ are the same, namely:
$$(S_{2n}f)(x) = (S_{2n-1}f)(x) = \frac{1}{2} + \frac{2}{\pi}\sum_{\ell=1}^{n}(-1)^{\ell-1}\,\frac{\cos(2\ell-1)x}{2\ell-1},$$
so that
$$(S_{2n}f)\Bigl(\frac{\pi}{2}-\frac{\pi}{2n}\Bigr) = \frac{1}{2} + \frac{2}{\pi}\sum_{\ell=1}^{n}\frac{\sin\frac{2\ell-1}{2n}\pi}{2\ell-1} = \frac{1}{2} + \frac{1}{\pi}\,\frac{\pi}{n}\sum_{\ell=1}^{n}\frac{\sin\frac{2\ell-1}{2n}\pi}{\frac{2\ell-1}{2n}\pi}.$$
On the other hand, observe that
$$\frac{\pi}{n}\sum_{\ell=1}^{n}\frac{\sin\frac{2\ell-1}{2n}\pi}{\frac{2\ell-1}{2n}\pi}$$


is a Riemann sum of the integral $\int_0^\pi \frac{\sin x}{x}\,dx$. Thus, we have
$$\lim_{n\to\infty}(S_{2n}f)\Bigl(\frac{\pi}{2}-\frac{\pi}{2n}\Bigr) = \frac{1}{2} + \frac{1}{\pi}\int_0^\pi\frac{\sin x}{x}\,dx.$$
Similarly, it can be shown that
$$\lim_{n\to\infty}(S_{2n-1}f)\Bigl(\frac{\pi}{2}-\frac{\pi}{2n-1}\Bigr) = \frac{1}{2} + \frac{1}{\pi}\int_0^\pi\frac{\sin x}{x}\,dx.$$
Therefore, for large values of $n$, we have
$$(S_n f)\Bigl(\frac{\pi}{2}-\frac{\pi}{n}\Bigr) \approx \frac{1}{2} + \frac{1}{\pi}\int_0^\pi\frac{\sin x}{x}\,dx = \frac{1}{2} + \frac{1}{2} + 0.089490\cdots \approx 1 + 0.09.$$
Notice that $f\bigl((\frac{\pi}{2})^-\bigr) = 1$ and the discontinuity jump of $f$ at $\frac{\pi}{2}$ is $1$. Therefore, for large $n$, we have
$$(S_n f)\Bigl(\frac{\pi}{2}-\frac{\pi}{n}\Bigr) - f\Bigl(\Bigl(\frac{\pi}{2}\Bigr)^-\Bigr) \approx 1\times 0.09.$$

This means that the graph of $(S_n f)(x)$ has an overshoot of about 9% of the jump $|f(x_0^+)-f(x_0^-)| = 1$ near the point $x_0 = \frac{\pi}{2}$ of jump discontinuity of $f(x)$ (see Fig. 6.7). On the other hand, there is an undershoot of the same amount on the other side of $x_0 = \frac{\pi}{2}$, in the other direction. These overshoot and undershoot never disappear as the order of the partial sums $S_n f$ increases. This is called the Gibbs phenomenon.

Fig. 6.7 Gibbs phenomenon: $(S_n f)(x)$ overshoots to about $1.09$ near $x = \pi/2 - \pi/n$ and undershoots to about $-0.09$ near $x = \pi/2 + \pi/n$
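The 9% overshoot is easy to reproduce numerically. The sketch below (an illustration, not part of the text; the sampling grid and the orders n are arbitrary choices) evaluates the partial sums $(S_{2n}f)(x)$ of Example 2 near $x = \pi/2$ and prints their extreme values, which approach $\tfrac{1}{2} + \tfrac{1}{\pi}\int_0^\pi \frac{\sin x}{x}\,dx \approx 1.0895$ and $\approx -0.0895$ rather than 1 and 0.

```python
import numpy as np

def S2n(x, n):
    """Partial sum S_{2n} f of the Fourier series of f = chi_[-pi/2, pi/2]."""
    s = np.full_like(x, 0.5)
    for l in range(1, n + 1):
        s += (2.0 / np.pi) * (-1) ** (l - 1) * np.cos((2 * l - 1) * x) / (2 * l - 1)
    return s

x = np.linspace(np.pi / 2 - 0.5, np.pi / 2 + 0.5, 20001)   # zoom in near the jump
for n in (10, 100, 1000):
    y = S2n(x, n)
    print(f"n = {n:5d}   max S_2n f = {y.max():.5f}   min S_2n f = {y.min():.5f}")
# The maximum tends to about 1.08949 and the minimum to about -0.08949,
# independently of n: the overshoot and undershoot do not go away.
```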

In general, if $f$ is discontinuous at a point $x_0 \in [-\pi,\pi]$ with the discontinuity jump $J = f(x_0^+)-f(x_0^-)$, then
$$\lim_{n\to\infty}\Bigl[(S_n f)\Bigl(x_0+\frac{\pi}{n}\Bigr) - f(x_0^+)\Bigr] = J\times 9\,\%, \qquad \lim_{n\to\infty}\Bigl[(S_n f)\Bigl(x_0-\frac{\pi}{n}\Bigr) - f(x_0^-)\Bigr] = -J\times 9\,\%.$$

We now turn to the study of convergence of Fourier series in the $\|\cdot\|_2$-norm, induced by the inner product $\langle\cdot,\cdot\rangle$, via the density of trigonometric polynomials (that is, the union of the subspaces $V_{2n+1}$ defined by (6.1.2) on p.267 for $n = 1, 2, \ldots$) in $PC^*_{2\pi}$. In the following, we will prove the density result for a stronger norm than the inner-product space norm $\|\cdot\|_2$, namely the supremum norm (also called uniform norm), defined by
$$\|f\|_\infty = \sup_x |f(x)| \tag{6.4.3}$$
for $f \in C^*_{2\pi}$, where $C^*_{2\pi}$ denotes the normed linear space of $2\pi$-periodic continuous functions with norm $\|\cdot\|_\infty$ defined in (6.4.3) (see (1.2.16) on p.20 for the more general notion of ess sup norm). To derive the density result mentioned above, we use the properties of Fejér's kernel as stated in Theorem 3 on p.293 of the previous section.

Theorem 2 (Convergence of Cesàro means) For any $f \in PC^*_{2\pi}$, its $n$th Cesàro mean $C_n f$ is a trigonometric polynomial in $V_{2n+1}$, and
$$\|f - C_n f\|_2 \to 0, \quad\text{as } n \to \infty. \tag{6.4.4}$$
Moreover, if $f \in C^*_{2\pi}$, then
$$\|f - C_n f\|_\infty \to 0, \quad\text{as } n \to \infty. \tag{6.4.5}$$

Proof Since the proof of (6.4.4) will depend on (6.4.5), we first prove (6.4.5). Let $\varepsilon > 0$ be arbitrarily given. Then for $f \in C^*_{2\pi}$, since $[-\pi,\pi]$ is a compact interval, $f(x)$ is bounded and uniformly continuous on $[-\pi,\pi]$; that is, $M = \|f\|_\infty < \infty$, and there is a $\delta > 0$ such that $|f(x) - f(t)| < \varepsilon/2$ whenever $|x - t| < \delta$.
6.5 Method of Separation of Variables

In this section we solve the (heat) diffusion PDE with zero Neumann boundary condition,
$$\begin{cases} \dfrac{\partial}{\partial t}u(\mathbf{x},t) = c\,\nabla^2 u(\mathbf{x},t), & \mathbf{x} \in D,\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial n}u(\mathbf{x},t) = 0, & \mathbf{x} \in \partial D,\ t \ge 0, \end{cases} \tag{6.5.1}$$
where $c > 0$ is a constant (called thermal conductivity), and $D$ is a bounded region in $\mathbb{R}^s$, $s \ge 1$, with piecewise smooth boundary $\partial D$. Here and throughout this book,
$$\nabla^2 = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_s^2} \tag{6.5.2}$$
denotes the Laplace operator in the spatial variable $\mathbf{x} = (x_1,\ldots,x_s) \in D$, and $\frac{\partial}{\partial n}$ denotes the normal derivative with unit normal $\mathbf{n} = \mathbf{n}_{\mathbf{x}}$ for each $\mathbf{x} \in \partial D$. Recall that the normal derivative is the directional derivative in the direction of $\mathbf{n}$; namely,
$$\frac{\partial}{\partial n}u(\mathbf{x},t) = \nabla u(\mathbf{x},t)\cdot\mathbf{n}, \tag{6.5.3}$$
where $\nabla u$ is the gradient of $u(\mathbf{x},t)$, defined by
$$\nabla u(\mathbf{x},t) = \Bigl(\frac{\partial}{\partial x_1}u(\mathbf{x},t),\ \ldots,\ \frac{\partial}{\partial x_s}u(\mathbf{x},t)\Bigr), \tag{6.5.4}$$
and $\nabla u(\mathbf{x},t)\cdot\mathbf{n}$ is the dot product (or Euclidean inner product) of $\nabla u(\mathbf{x},t)$ and the unit normal $\mathbf{n}$.

(i) For the 1-dimensional setting, we only consider the closed and bounded interval $D = [0,M]$, $M > 0$, and observe that $\nabla^2 u(x,t) = \frac{\partial^2}{\partial x^2}u(x,t)$, $\partial D = \{0, M\}$, and $\frac{\partial}{\partial n}u(x,t)$, $x \in \partial D$, becomes $\frac{\partial}{\partial x}u(0,t)$ and $\frac{\partial}{\partial x}u(M,t)$.

(ii) For the 2-dimensional setting, we only focus on the rectangular region $D = [0,M]\times[0,N]$, $M > 0$, $N > 0$, and observe that
$$\nabla^2 u(\mathbf{x},t) = \nabla^2 u(x,y,t) = \frac{\partial^2}{\partial x^2}u(x,y,t) + \frac{\partial^2}{\partial y^2}u(x,y,t),$$
with $\mathbf{x} = (x,y)$, and $\frac{\partial}{\partial n}u(\mathbf{x},t)$, $\mathbf{x} = (x,y) \in \partial D$, becomes
$$\frac{\partial}{\partial y}u(x,0,t),\quad \frac{\partial}{\partial y}u(x,N,t),\quad \frac{\partial}{\partial x}u(0,y,t),\quad \frac{\partial}{\partial x}u(M,y,t).$$

Fig. 6.8 Insulated region

Returning to the PDE in (6.5.1), we remark that the boundary condition
$$\frac{\partial}{\partial n}u(\mathbf{x},t) = 0, \quad\text{for } \mathbf{x} \in \partial D,\ t \ge 0,$$
in (6.5.1) describes that the "heat content" $u(\mathbf{x},t)$ for $\mathbf{x} \in D$ is not allowed to diffuse outside $D$ at any time $t \ge 0$. In other words, the region $D$ is perfectly insulated (see Fig. 6.8). Any condition imposed on $\frac{\partial}{\partial n}u(\mathbf{x},t)$ is called Neumann's condition. Hence, the PDE described by (6.5.1) is called a Neumann problem, with Neumann's condition equal to zero.

We now turn to the derivation of the general solution of the Neumann (heat) diffusion PDE (6.5.1), by introducing the method of "separation of variables". We will first consider the setting of one spatial variable $\mathbf{x} = x$, then extend it to the

setting of two variables $\mathbf{x} = (x,y)$, and finally conclude with the general $s$-variable setting.

For $\mathbf{x} = x \in [0,M]$, $M > 0$, the PDE in (6.5.1) becomes
$$\begin{cases} \dfrac{\partial}{\partial t}u(x,t) = c\,\dfrac{\partial^2}{\partial x^2}u(x,t), & x \in [0,M],\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial x}u(0,t) = \dfrac{\partial}{\partial x}u(M,t) = 0, & t \ge 0. \end{cases} \tag{6.5.5}$$
Hence, if we write $u(x,t) = T(t)X(x)$, then (6.5.5) becomes
$$\begin{cases} T'(t)X(x) = c\,T(t)X''(x), & x \in [0,M],\ t \ge 0,\\ X'(0) = X'(M) = 0. \end{cases} \tag{6.5.6}$$
Thus, dividing the PDE in (6.5.6) by $cT(t)X(x)$, we arrive at
$$\frac{T'(t)}{cT(t)} = \frac{X''(x)}{X(x)},$$
where the left-hand side is a function independent of the variable $x$ and the right-hand side is a function independent of the variable $t$. Therefore both sides must be the same constant, say $\lambda$, and we now have two ordinary differential equations (ODE), namely:
$$T'(t) = \lambda c\,T(t) \tag{6.5.7}$$
and
$$X''(x) = \lambda X(x), \qquad X'(0) = X'(M) = 0. \tag{6.5.8}$$
The solution of (6.5.7) is clearly
$$T(t) = a_0 e^{\lambda c t}$$

for any constant $a_0$. To write down the solution of (6.5.8), let us first consider the case $\lambda > 0$, for which the general solution is clearly
$$X(x) = a_1 e^{\sqrt{\lambda}\,x} + a_2 e^{-\sqrt{\lambda}\,x}, \quad 0 \le x \le M.$$
For this solution to satisfy the condition $X'(0) = X'(M) = 0$, we have
$$a_1\sqrt{\lambda} - a_2\sqrt{\lambda} = 0, \qquad a_1\sqrt{\lambda}\,e^{\sqrt{\lambda}\,M} - a_2\sqrt{\lambda}\,e^{-\sqrt{\lambda}\,M} = 0,$$
which implies that $a_1 = a_2 = 0$ (since $\sqrt{\lambda} > 0$). In other words, for (6.5.8) to have a nontrivial solution, the only possibility is $\lambda \le 0$.

For $\lambda = 0$, the ODE in (6.5.8) becomes $X'' = 0$, and hence the general solution is given by $X(x) = a_1 + a_2 x$, where $a_1, a_2$ are constants. To satisfy the boundary condition $X'(0) = 0$, we have $a_2 = 0$, so that the solution of (6.5.8) is $X(x) = a_1$, for some constant $a_1$.

Now let us consider $\lambda < 0$, so that $\lambda = -\mu^2$, with $\mu > 0$. Then the general solution of the ODE in (6.5.8) is $X(x) = a_1\cos\mu x + a_2\sin\mu x$, where $a_1, a_2$ are arbitrary constants. Hence, $X'(x) = -a_1\mu\sin\mu x + a_2\mu\cos\mu x$, so that $X'(0) = a_2\mu = 0$ implies $a_2 = 0$; that is, $X(x) = a_1\cos\mu x$ and $X'(M) = -a_1\mu\sin\mu M = 0$. Thus, to avoid obtaining the trivial solution $a_1 = a_2 = 0$ once again, we must have $\mu = \frac{k\pi}{M}$ for $k = 1, 2, \ldots$. To summarize, the only values of $\lambda$ in (6.5.7) and (6.5.8) for which nontrivial solutions exist are
$$\lambda = -\mu^2 = -\Bigl(\frac{k\pi}{M}\Bigr)^2, \quad k = 0, 1, 2, \ldots.$$
We pause for a moment to remark that (6.5.8) is precisely the eigenvalue problem studied in Example 3 of Sect. 2.3 in Chap. 2, but with the operator $T = -\frac{d^2}{dx^2}$ instead of $\frac{d^2}{dx^2}$ in (6.5.8). Observe that the eigenvalue $-\bigl(\frac{k\pi}{M}\bigr)^2$ is the negative of $\bigl(\frac{k\pi}{M}\bigr)^2$ in (2.3.13) on p.95, with $c = M$ in Chap. 2.

Returning to our discussion, since $T(t) = e^{\lambda ct} = e^{-c(\frac{k\pi}{M})^2 t}$ and
$$X(x) = \cos\mu x = \cos\frac{k\pi x}{M}$$

(where the reason for not including the coefficients $a_0$ and $a_1$ will be clear later), we may conclude that the general solution of the PDE (6.5.5) can be formulated as
$$u(x,t) = \frac{b_0}{2} + \sum_{k=1}^{\infty} b_k\, e^{-c(\frac{k\pi}{M})^2 t}\cos\frac{k\pi x}{M} \tag{6.5.9}$$
by the principle of superposition, as described in the beginning of this section. Notice that for $t = 0$, $u(x,0)$ is a cosine Fourier series in $x$. Therefore, we have obtained the following result.

Theorem 1 (Diffusion on an insulated bounded interval) Let $u_0 \in L^2[0,M]$, $M > 0$. Then the solution of the Neumann diffusion PDE (6.5.5) with initial heat content $u(x,0) = u_0(x)$, $x \in [0,M]$, is given by (6.5.9), with
$$b_k = \frac{2}{M}\int_0^M u_0(x)\cos\frac{k\pi x}{M}\,dx, \quad k = 0, 1, 2, \ldots. \tag{6.5.10}$$

For $u_0 \in L^2[0,M]$, it follows from Theorem 3 of Sect. 6.2 on p.280 (with proof assured by Theorem 2 of Sect. 6.4 on p.302) that the cosine Fourier series
$$u_0(x) = \frac{b_0}{2} + \sum_{k=1}^{\infty} b_k\cos\frac{k\pi x}{M},$$
with the Fourier coefficients $b_0, b_1, \ldots$ given by (6.5.10) (see (6.2.13) of Theorem 3 of Sect. 6.2), converges in the $L^2[0,M]$-norm to $u_0(x)$. Furthermore, if the one-sided derivatives of $u_0(x)$ exist at $x = x_0$, then the series converges to $\bigl[u_0(x_0^+)+u_0(x_0^-)\bigr]/2$, by Theorem 1 of Sect. 6.4 on p.295, provided that $u_0 \in PC[0,M]$. In particular, the series representation (6.5.9) of $u(x,t)$ converges, at least in the $L^2[0,M]$-norm. $\square$

Example 1 Solve the heat diffusion PDE
$$\begin{cases} \dfrac{\partial}{\partial t}u(x,t) = c\,\dfrac{\partial^2}{\partial x^2}u(x,t), & 0 \le x \le \pi,\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial x}u(0,t) = \dfrac{\partial}{\partial x}u(\pi,t) = 0, & t \ge 0,\\[1ex] u(x,0) = \dfrac{x}{\pi}, & 0 \le x \le \pi. \end{cases}$$

Solution We first compute the Fourier cosine series

$$\frac{x}{\pi} = \frac{b_0}{2} + \sum_{k=1}^{\infty} b_k\cos kx,$$
where
$$b_0 = \frac{2}{\pi}\int_0^\pi \frac{x}{\pi}\,dx = \frac{2}{\pi^2}\cdot\frac{\pi^2}{2} = 1,$$
and, for $k \ge 1$,
$$\begin{aligned} b_k &= \frac{2}{\pi}\int_0^\pi \frac{x}{\pi}\cos kx\,dx = \frac{2}{\pi^2}\Bigl[\frac{x\sin kx}{k}\Big|_0^\pi - \int_0^\pi\frac{\sin kx}{k}\,dx\Bigr] = \frac{2}{\pi^2}\Bigl[0 - \Bigl(-\frac{\cos kx}{k^2}\Big|_0^\pi\Bigr)\Bigr]\\ &= \frac{2}{\pi^2 k^2}\bigl(\cos k\pi - \cos 0\bigr) = \frac{2}{\pi^2 k^2}\bigl((-1)^k - 1\bigr) = \begin{cases} 0, & \text{for } k = 2j,\\ \dfrac{-4}{\pi^2(2j-1)^2}, & \text{for } k = 2j-1, \end{cases} \end{aligned}$$
yielding
$$\frac{x}{\pi} = \frac{1}{2} - \frac{4}{\pi^2}\sum_{j=1}^{\infty}\frac{1}{(2j-1)^2}\cos\bigl((2j-1)x\bigr), \quad 0 \le x \le \pi.$$
Therefore, the solution of the required heat diffusion PDE is given by
$$u(x,t) = \frac{1}{2} - \frac{4}{\pi^2}\sum_{j=1}^{\infty}\frac{1}{(2j-1)^2}\,e^{-c(2j-1)^2 t}\cos\bigl((2j-1)x\bigr),$$
where $0 \le x \le \pi$, $t \ge 0$. $\square$

Example 2 Solve the heat diffusion PDE
$$\begin{cases} \dfrac{\partial}{\partial t}u(x,t) = c\,\dfrac{\partial^2}{\partial x^2}u(x,t), & 0 \le x \le \pi,\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial x}u(0,t) = \dfrac{\partial}{\partial x}u(\pi,t) = 0, & t \ge 0,\\[1ex] u(x,0) = 1 + \cos 2x, & 0 \le x \le \pi. \end{cases}$$



Solution Since the initial function $u_0(x) = 1 + \cos 2x$ is already a Fourier cosine series, with $b_0 = 2$, $b_2 = 1$, and $b_k = 0$ for $k \ne 0, 2$, the solution of the required heat diffusion PDE is simply:
$$u(x,t) = \frac{b_0}{2} + 0\cdot e^{-c(\frac{\pi}{\pi})^2 t}\cos\frac{\pi x}{\pi} + 1\cdot e^{-c(\frac{2\pi}{\pi})^2 t}\cos\frac{2\pi x}{\pi} + 0\cdot e^{-c(\frac{3\pi}{\pi})^2 t}\cos\frac{3\pi x}{\pi} + \cdots = 1 + e^{-4ct}\cos 2x.$$
To verify this answer, we observe that
$$\text{(i)}\ \ \frac{\partial}{\partial t}u(x,t) = -4c\,e^{-4ct}\cos 2x \quad\text{and}\quad \text{(ii)}\ \ \frac{\partial^2}{\partial x^2}u(x,t) = -4\,e^{-4ct}\cos 2x \quad\Longrightarrow\quad \frac{\partial}{\partial t}u(x,t) = c\,\frac{\partial^2}{\partial x^2}u(x,t);$$
$$\text{(iii)}\ \ \frac{\partial}{\partial x}u(x,t) = -2\,e^{-4ct}\sin 2x \quad\Longrightarrow\quad \frac{\partial}{\partial x}u(0,t) = -2\,e^{-4ct}\sin 0 = 0, \quad \frac{\partial}{\partial x}u(\pi,t) = -2\,e^{-4ct}\sin 2\pi = 0.$$
Finally, $u(x,0) = 1 + e^0\cos 2x = 1 + \cos 2x = u_0(x)$. $\square$
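Theorem 1 translates directly into a few lines of code. The sketch below is illustrative only: the quadrature rule, the grid sizes, the truncation order, and the value c = 1 are arbitrary choices, and the coefficient integral (6.5.10) is approximated numerically rather than evaluated in closed form. It computes the cosine coefficients of the initial function of Example 1, u0(x) = x/π on [0, π], and evaluates the truncated series (6.5.9); at t = 0 it reproduces u0, and for t > 0 it gives the diffused profile.

```python
import numpy as np

M, c = np.pi, 1.0           # interval [0, M] and thermal conductivity (illustrative values)
K = 60                      # truncation order of the cosine series

def u0(x):                  # initial heat content of Example 1
    return x / np.pi

# coefficients b_k from (6.5.10), approximated by the trapezoidal rule
xq = np.linspace(0.0, M, 4001)
b = [2.0 / M * np.trapz(u0(xq) * np.cos(k * np.pi * xq / M), xq) for k in range(K + 1)]

def u(x, t):
    """Truncated series solution (6.5.9) of the Neumann heat problem."""
    s = b[0] / 2.0 * np.ones_like(x)
    for k in range(1, K + 1):
        s += b[k] * np.exp(-c * (k * np.pi / M) ** 2 * t) * np.cos(k * np.pi * x / M)
    return s

x = np.linspace(0.0, M, 9)
print("u(x, 0)  :", np.round(u(x, 0.0), 4))    # close to x/pi
print("u(x, 0.5):", np.round(u(x, 0.5), 4))    # profile flattens toward the mean value 1/2
```

As t grows, every mode with k >= 1 decays and the computed profile approaches the constant b0/2 = 1/2, the average of the initial heat content, exactly as the perfectly insulated boundary suggests.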



The method of separation of variables extends to any bounded region $D \subset \mathbb{R}^s$, $s > 1$, as follows. We begin by writing $u(\mathbf{x},t) = T(t)U(\mathbf{x})$, $\mathbf{x} \in D$. Then, for $u(\mathbf{x},t)$ in the PDE in (6.5.1), we have
$$T'(t)U(\mathbf{x}) = c\,T(t)\,\nabla^2 U(\mathbf{x}),$$
so that
$$\frac{T'(t)}{cT(t)} = \frac{\nabla^2 U(\mathbf{x})}{U(\mathbf{x})},$$
which must be a constant $\lambda$ (since the left-hand side is independent of the variable $\mathbf{x} \in D$ and the right-hand side is a function independent of the variable $t > 0$). Hence, we have one ODE given by
$$T'(t) - \lambda c\,T(t) = 0, \tag{6.5.11}$$
which is the same as (6.5.7), and one ODE given by
$$\nabla^2 U(\mathbf{x}) = \lambda U(\mathbf{x}), \tag{6.5.12}$$
which is an eigenvalue problem. Since $T(t) = a_0 e^{\lambda ct}$, as in the 1-dimensional diffusion problem, and by keeping in mind that the PDE in (6.5.1) describes a diffusion process, we may conclude that $\lambda \le 0$, as discovered earlier. Unfortunately, for an arbitrary region $D \subset \mathbb{R}^s$, $s \ge 2$, the eigenvalue problem $\nabla^2 U(\mathbf{x}) = -\mu^2 U(\mathbf{x})$, $\mathbf{x} \in D$ (with $\lambda = -\mu^2$, $\mu \ge 0$), does not have an explicit solution (see the discussion on numerical approximation via the Rayleigh quotient with $T = -\nabla^2$ in Sect. 2.3 of Chap. 2). To apply the Fourier series method, we only consider
$$D = [0,M_1]\times\cdots\times[0,M_s], \tag{6.5.13}$$

where $M_1, \ldots, M_s > 0$. Let us first discuss the case $s = 2$, and let $M = M_1$, $N = M_2$ for convenience. Then, by setting $x = x_1$ and $y = x_2$, the Neumann diffusion PDE (6.5.1) becomes
$$\begin{cases} \dfrac{\partial}{\partial t}u(x,y,t) = c\,\nabla^2 u(x,y,t), & x \in [0,M],\ y \in [0,N],\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial x}u(0,y,t) = 0,\quad \dfrac{\partial}{\partial x}u(M,y,t) = 0, & y \in [0,N],\ t \ge 0,\\[1ex] \dfrac{\partial}{\partial y}u(x,0,t) = 0,\quad \dfrac{\partial}{\partial y}u(x,N,t) = 0, & x \in [0,M],\ t \ge 0. \end{cases} \tag{6.5.14}$$
Then for $U(x,y) = X(x)Y(y)$, we have
$$\nabla^2 U(x,y) = X''(x)Y(y) + X(x)Y''(y),$$
and hence $\nabla^2 U = \lambda U$ becomes $X''(x)Y(y) + X(x)Y''(y) = \lambda X(x)Y(y)$, or equivalently,
$$\frac{X''(x)}{X(x)} = -\frac{Y''(y)}{Y(y)} + \lambda = \widetilde{\lambda}$$
for some constant $\widetilde{\lambda}$ (since $-Y''(y)/Y(y) + \lambda$ is independent of $x$ and $X''(x)/X(x)$ is independent of $y$). Now, by using the two-point boundary condition $X'(0) = X'(M) = 0$, it can be shown that $\widetilde{\lambda} = -\bigl(k\pi/M\bigr)^2$ for $k = 0, 1, 2, \ldots$, as in the setting of one spatial variable, so that
$$X(x) = \text{constant multiple of } \cos\frac{k\pi x}{M}.$$
Similarly, for $Y''(y)/Y(y) = \lambda - \widetilde{\lambda}$, the two-point boundary condition $Y'(0) = Y'(N) = 0$ implies that $\lambda - \widetilde{\lambda} = -\bigl(\ell\pi/N\bigr)^2$ for $\ell = 0, 1, 2, \ldots$, so that
$$\lambda = \widetilde{\lambda} - \Bigl(\frac{\ell\pi}{N}\Bigr)^2 = -\Bigl(\frac{k\pi}{M}\Bigr)^2 - \Bigl(\frac{\ell\pi}{N}\Bigr)^2,$$
which is $\le 0$, as expected in view of the diffusion process described by the PDE. Also,
$$Y(y) = \text{constant multiple of } \cos\frac{\ell\pi y}{N}.$$
Hence, by the principle of superposition, we have
$$u(x,y,t) = \sum_{k=0}^{\infty}\sum_{\ell=0}^{\infty} d(k,\ell)\,b_{k,\ell}\,e^{-c\bigl[(\frac{k\pi}{M})^2+(\frac{\ell\pi}{N})^2\bigr]t}\cos\Bigl(\frac{k\pi}{M}x\Bigr)\cos\Bigl(\frac{\ell\pi}{N}y\Bigr), \tag{6.5.15}$$
where we have introduced the notation
$$d(k,\ell) = 2^{-(\delta_k+\delta_\ell)}, \quad k, \ell = 0, 1, 2, \ldots, \tag{6.5.16}$$
by using the Kronecker delta symbol $\delta_j$ (for $j = k$ and $j = \ell$), and where $b_{k,\ell}$ are certain constants. The function $u(x,y,t)$ in (6.5.15) gives the general solution of the PDE (6.5.14) before the initial condition is considered. To impose the initial condition, we have
$$u_0(x,y) = u(x,y,0) = \sum_{k=0}^{\infty}\sum_{\ell=0}^{\infty} d(k,\ell)\,b_{k,\ell}\cos\Bigl(\frac{k\pi}{M}x\Bigr)\cos\Bigl(\frac{\ell\pi}{N}y\Bigr),$$

which is the Fourier cosine series representation of $u_0(x,y)$. Thus, $b_{k,\ell}$, $k = 0, 1, \ldots$, $\ell = 0, 1, \ldots$, in (6.5.15) are the coefficients of the cosine series of the initial function $u_0(x,y)$, namely:
$$b_{k,\ell} = \frac{4}{MN}\int_0^M\!\int_0^N u_0(x,y)\cos\Bigl(\frac{k\pi}{M}x\Bigr)\cos\Bigl(\frac{\ell\pi}{N}y\Bigr)dx\,dy, \quad k, \ell = 0, 1, 2, \ldots. \tag{6.5.18}$$
For $s \ge 3$, we may consider
$$U(\mathbf{x}) = U(x_1,\ldots,x_{s-1},x_s) = V(x_1,\ldots,x_{s-1})X(x_s),$$
so that
$$\nabla_s^2 U = X\,\nabla_{s-1}^2 V + X''\,V$$
(where the subscript of $\nabla^2$ denotes the dimension of the Laplace operator). Hence, by an induction argument, we have the following result, which extends Theorem 1 to any dimension $s \ge 2$. For convenience, the notation $d(k,\ell)$ in (6.5.16) is extended to an arbitrary $s \ge 2$; namely,
$$d(k_1,\ldots,k_s) = 2^{-\sum_{j=1}^{s}\delta_{k_j}}.$$

Theorem 2 (Diffusion in higher dimensions) Let $u_0 \in L^2\bigl([0,M_1]\times\cdots\times[0,M_s]\bigr)$, where $M_1, \ldots, M_s > 0$. Then the solution of the Neumann diffusion PDE (6.5.1) for $D = [0,M_1]\times\cdots\times[0,M_s]$ with initial heat content $u(\mathbf{x},0) = u_0(\mathbf{x})$, $\mathbf{x} \in D$, is given by
$$u(\mathbf{x},t) = \sum_{k_1=0}^{\infty}\cdots\sum_{k_s=0}^{\infty} d(k_1,\ldots,k_s)\,b_{k_1\ldots k_s}\,e^{-c\sum_{j=1}^{s}\bigl(\frac{k_j\pi}{M_j}\bigr)^2 t}\cos\Bigl(\frac{k_1\pi}{M_1}x_1\Bigr)\cdots\cos\Bigl(\frac{k_s\pi}{M_s}x_s\Bigr), \tag{6.5.19}$$
with
$$b_{k_1\ldots k_s} = \frac{2^s}{M_1\cdots M_s}\int_D u_0(\mathbf{x})\cos\Bigl(\frac{k_1\pi}{M_1}x_1\Bigr)\cdots\cos\Bigl(\frac{k_s\pi}{M_s}x_s\Bigr)d\mathbf{x},$$
where the convergence of the series in (6.5.19) is in the $L^2(D)$-norm.

Example 3 Solve the Neumann diffusion PDE (6.5.14) for $M = N = \pi$ with initial heat content
$$u(\mathbf{x},0) = u_0(\mathbf{x}) = 1 + 2\cos x\cos y + \cos x\cos 3y.$$

Solution Since the representation of $u_0(x,y)$ is already its Fourier cosine series representation, it follows from Theorem 2 that the solution is given by
$$u(x,y,t) = 1 + 2e^{-c(1^2+1^2)t}\cos x\cos y + e^{-c(1^2+3^2)t}\cos x\cos 3y = 1 + 2e^{-2ct}\cos x\cos y + e^{-10ct}\cos x\cos 3y. \qquad\square$$
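For initial data that are not already finite cosine sums, the coefficients (6.5.18) can be approximated numerically and the series (6.5.15) truncated. The following sketch is illustrative only: the truncation order, the quadrature grid, and the choice c = 1 are arbitrary, and M = N = π is taken as in Example 3. As a sanity check it reproduces the closed-form answer of Example 3 when applied to that example's initial data.

```python
import numpy as np

M = N = np.pi
c = 1.0
K = 8                                    # truncation order in each variable

x = np.linspace(0.0, M, 201)
y = np.linspace(0.0, N, 201)
X, Y = np.meshgrid(x, y, indexing="ij")

def u0(X, Y):                            # initial heat content of Example 3
    return 1.0 + 2.0 * np.cos(X) * np.cos(Y) + np.cos(X) * np.cos(3.0 * Y)

U0 = u0(X, Y)

def b(k, l):
    """Coefficient b_{k,l} of (6.5.18), by a 2-D trapezoidal rule."""
    integrand = U0 * np.cos(k * np.pi * X / M) * np.cos(l * np.pi * Y / N)
    return 4.0 / (M * N) * np.trapz(np.trapz(integrand, y, axis=1), x)

B = np.array([[b(k, l) for l in range(K + 1)] for k in range(K + 1)])

def u(t):
    """Truncated series (6.5.15); d(k,l) = 2^{-(delta_k + delta_l)}."""
    s = np.zeros_like(X)
    for k in range(K + 1):
        for l in range(K + 1):
            d = 2.0 ** (-((k == 0) + (l == 0)))
            lam = (k * np.pi / M) ** 2 + (l * np.pi / N) ** 2
            s += d * B[k, l] * np.exp(-c * lam * t) * np.cos(k * np.pi * X / M) * np.cos(l * np.pi * Y / N)
    return s

t = 0.3
exact = 1.0 + 2.0 * np.exp(-2 * c * t) * np.cos(X) * np.cos(Y) + np.exp(-10 * c * t) * np.cos(X) * np.cos(3 * Y)
print("max |series - closed form| =", np.abs(u(t) - exact).max())
```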

Exercises

Exercise 1 Solve the heat diffusion PDE (6.5.5) for $c = 2$, $M = \pi$, and $u(x,0) = u_0(x) = \bigl|x - \frac{\pi}{2}\bigr|$, $0 \le x \le \pi$.

Exercise 2 Apply the method of "separation of variables" to separate the following PDE's into ODE's.
(a) $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$.
(b) $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial u}{\partial x} + \dfrac{\partial u}{\partial y} = 0$.
(c) $\dfrac{\partial^2 u}{\partial x^2} + f(x)\dfrac{\partial u}{\partial x} + \dfrac{\partial^2 u}{\partial y^2} = 0$.

Exercise 3 Solve the heat diffusion PDE (6.5.14) for $c = 3$, $M = N = \pi$, and $u_0(x,y) = 2 + \cos 2x\cos 3y + 4\cos 5x\cos 2y$.

Exercise 4 Repeat Exercise 3 for
$$u_0(x,y) = \sum_{k=0}^{10}\sum_{j=0}^{5}\frac{k}{j+1}\cos kx\cos jy.$$

Exercise 5 Repeat Exercise 3 for $u_0(x,y) = \bigl|x - \frac{\pi}{2}\bigr| + \bigl|y - \frac{\pi}{2}\bigr|$, $0 \le x, y \le \pi$.

Exercise 6 Repeat Exercise 3 for $u_0(x,y) = \bigl|(x - \frac{\pi}{2})(y - \frac{\pi}{2})\bigr|$, $0 \le x, y \le \pi$.

Exercise 7 Apply the method of "separation of variables" to solve the 1-dimensional PDE of a "vibrating string" with no movement at the two end-points, given by
$$\begin{cases} \dfrac{\partial^2}{\partial t^2}u(x,t) = c\,\dfrac{\partial^2}{\partial x^2}u(x,t), & x \in [0,M],\ t \ge 0,\\[1ex] u(0,t) = u(M,t) = 0, & t \ge 0, \end{cases}$$
where $c > 0$ is a constant.

Exercise 8 Apply the method of "separation of variables" to separate the PDE of the "wave equation of two spatial variables" for $u = u(x,y,t)$, given by
$$\frac{\partial^2 u}{\partial t^2} = c^2\,\nabla^2 u, \quad c > 0 \text{ constant},$$
where $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$, into a PDE in the space variables $x, y$ and an ODE in the time variable $t$.

Exercise 9 The PDE model of a "vibrating membrane" is the wave equation with zero boundary condition. As a continuation of Exercise 8, solve the following PDE that describes a vibrating membrane on $[0,M]\times[0,N]$ with the given Fourier sine series representation of the initial vibration:
$$\begin{cases} \dfrac{\partial^2}{\partial t^2}u(x,y,t) = c^2\,\nabla^2 u(x,y,t), & x \in [0,M],\ y \in [0,N],\ t \ge 0,\\[1ex] u(x,y,0) = 0, & x \in [0,M],\ y \in [0,N],\\[1ex] \dfrac{\partial}{\partial t}u(x,y,0) = \displaystyle\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} b_{m,n}\sin\frac{m\pi x}{M}\sin\frac{n\pi y}{N}, & x \in [0,M],\ y \in [0,N], \end{cases}$$
where $\displaystyle\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}|b_{m,n}| < \infty$.

Chapter 7

Fourier Time-Frequency Methods

When non-periodic functions $f \in PC(-\infty,\infty)$ are considered as analog signals with time-domain $(-\infty,\infty)$, the notion of the Fourier transform (FT)
$$\widehat{f}(\omega) = \int_{-\infty}^{\infty} f(x)\,e^{-ix\omega}\,dx$$
is introduced in Sect. 7.1 to reveal the frequency content of $f \in L^2(\mathbb{R})$, where the most useful properties of the FT are derived. In addition, the FT of the Gaussian function is computed, and it will be clear that the FT of the Gaussian function is again another Gaussian. Analogous to Fejér's kernels introduced in Sect. 6.3 of the previous chapter, it is shown that the family of Gaussian functions,
$$g_\sigma(x) = \frac{1}{2\sigma\sqrt{\pi}}\,e^{-\frac{1}{4\sigma^2}x^2},$$
with parameter $\sigma$, also constitutes a positive approximate identity, but now for the entire real line $\mathbb{R}$, as compared with the interval $[-\pi,\pi]$ for the sequence of Fejér's kernels.

The Fourier transform defined for functions $f \in L^1(\mathbb{R})$ in Sect. 7.1 is extended to $L^2(\mathbb{R})$ in Sect. 7.2, where Plancherel's formula (also called Parseval's identity to match that of the Fourier series established in Sect. 6.1 of Chap. 6) is derived by applying the property of positive approximate identity of the Gaussian function $g_\sigma$ and the Gaussian formulation of the FT of $g_\sigma$. Under the assumption that the Fourier transform $\widehat{f}$ of the given function $f$ is also in $L^1(\mathbb{R})$, it is shown in this section that $f$ can be recovered from $\widehat{f}(\omega)$ by the inverse Fourier transform (IFT), defined by
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\widehat{f}(\omega)\,e^{i\omega x}\,d\omega.$$


For the time-domain $(-\infty,\infty)$, convolution of two functions $f(x)$ and $h(x)$ is defined by
$$(f*h)(x) = \int_{-\infty}^{\infty} f(t)\,h(x-t)\,dt,$$
and one of the most important properties of the FT is that it maps convolutions to products, namely:
$$\widehat{(f*h)}(\omega) = \widehat{f}(\omega)\,\widehat{h}(\omega).$$
Hence, for applications to signal processing, by selecting an appropriate filter function $h(x)$, the convolution operation on an analog signal $f(x)$ results in modulation of its frequency content $\widehat{f}(\omega)$ with the desired filter characteristic $\widehat{h}(\omega)$ in the frequency-domain. As an application to ideal lowpass filtering, we will derive the sampling theorem at the end of Sect. 7.2.

Section 7.3 can be considered as a continuation of Sect. 6.5 on the solution of the diffusion PDE, but now for the entire Euclidean space $\mathbb{R}^s$ instead of the bounded rectangular regions of Sect. 6.5. Since it is not a boundary value problem, we may now consider the diffusion PDE as an initial value problem, with initial heat content given by a function $u_0$ defined on $\mathbb{R}^s$ as the initial condition. For $\mathbb{R}^1$, it will be shown in this section that when the square of the parameter $\sigma$ of the family of Gaussian functions $g_\sigma$ is replaced by $ct$, where $c > 0$ is a constant and $t$ denotes the time parameter, then spatial convolution of the initial heat content $u_0$ with $g_\sigma$ yields the solution of the initial-value heat diffusion PDE, with the positive constant $c$ being the thermal conductivity of the diffusion media. This derivation is extended to $\mathbb{R}^s$, for any spatial dimension $s \ge 2$. Observe that since the Gaussian function is well localized, the initial function $u_0$ is allowed to be of polynomial growth as well as being a periodic function, such as a Fourier cosine or sine series. In fact, for the region $\mathbb{R}^2$, when a Fourier cosine series defined on the rectangular region $[0,M]\times[0,N]$ is considered as the initial condition for the diffusion PDE for the entire spatial domain $\mathbb{R}^2$, it is shown in the final example, Example 6, of this section that the solution is the same as that of the diffusion PDE for the perfectly insulated bounded region $[0,M]\times[0,N]$, with zero Neumann condition, obtained by applying the method of separation of variables.

In many applications, it is desirable to localize both in time and in frequency. The choice of filter functions $h(x)$ to simultaneously localize an analog signal in both the time-domain and the frequency-domain is studied in Sect. 7.4, where an identity for simultaneous time-frequency localization is established. In addition, the general formulation of the Gaussian
$$g_{c,\gamma,b}(x) = c\,e^{-\gamma(x-b)^2}$$
with $\gamma > 0$, $c \ne 0$ and $b \in \mathbb{R}$, is shown to provide the optimal time-frequency localization window function, as governed by the uncertainty principle, also to be derived in Sect. 7.4. If a time-window function $u(x)$ is used to define the localized (or short-time) Fourier transform $(\mathcal{F}_u f)(x,\omega)$ of any $f \in L^2(\mathbb{R})$, then under the conditions that both $u(x)$ and its Fourier transform $\widehat{u}(\omega)$ are in $(L^1\cap L^2)(\mathbb{R})$ and that $u(0) \ne 0$, it will be shown also in Sect. 7.4 that $f(x)$ can be recovered from $(\mathcal{F}_u f)(x,\omega)$ by applying the inverse FT operation. On the other hand, for practical applications, the time-frequency coordinates $(x,\omega)$ are sampled at $(x,\omega) = (ma, 2\pi kb)$, where $a, b > 0$ are fixed and $m, k$ run over the set $\mathbb{Z}$ of all integers. Then the localized Fourier transform $(\mathcal{F}_u f)(x,\omega)$, introduced in Sect. 7.4, at these sample points can be expressed as an inner product, namely:
$$(\mathcal{F}_u f)(ma, 2\pi kb) = \langle f, h_{ma,kb}\rangle, \quad\text{where } h_{ma,kb}(x) = u(x-ma)\,e^{i2\pi kbx}, \quad m, k \in \mathbb{Z},$$
and the functions $h_{ma,kb}$ could be viewed as some time-frequency "basis" functions, with $m$ running over the discrete time-domain $\mathbb{Z}$ and $k$ running over the discrete frequency-domain $\mathbb{Z}$. Unfortunately, for good time-frequency localization (that is, for the window function $u$ to have finite time-window and frequency-window widths), it is necessary to sample at $(ma, 2\pi kb)$ with $ab < 1$ for the existence of such window functions $u(x)$. This is the so-called Balian–Low constraint. On the other hand, the choice of $a = b = 1$ significantly facilitates computational efficiency. The good news is that the Balian–Low constraint can be removed by replacing $e^{i2\pi kx}$ (for $a = b = 1$) with cosine and/or sine functions. Such time-frequency bases will be studied in Sect. 7.5.

For the book to be self-contained, a discussion of the integration theory from Real Analysis, including Fubini's theorem, Minkowski's integral inequality, Fatou's lemma, Lebesgue's dominated convergence theorem, and Lebesgue's integration theorem, is presented in the last section to facilitate the derivation of certain theorems in this chapter.

7.1 Fourier Transform

The discrete Fourier transform (DFT) introduced in Chap. 4 allows us to study the frequency behavior of digital data. More precisely, when $\mathbf{x} = (x_0, \ldots, x_{n-1}) \in \mathbb{C}^n$ is considered as a digital signal with time-domain $(0, \ldots, n-1)$, then its $n$-point DFT
$$\widehat{\mathbf{x}} = (\widehat{x}_0, \ldots, \widehat{x}_{n-1}) = F_n\mathbf{x}$$
reveals the frequency content of the digital signal $\mathbf{x}$ (see Example 2 on p.176). In Sect. 6.1, the notion of Fourier coefficients is introduced to study the frequency behavior of analog signals with time-domain given by a bounded interval $[a,b]$, where $-\infty < a < b < \infty$. By a simple change of variables as discussed in Sect. 6.1, we may replace $[a,b]$ by $[-\pi,\pi]$. Hence, by periodic extension of functions in $PC[-\pi,\pi]$ to $PC^*_{2\pi}$, the Fourier series can be considered as the "inverse" transform of the sequence of Fourier coefficients to recover the analog signal. In this section, we introduce the notion of the Fourier transform to study functions in $PC(-\infty,\infty)$. Hence, the Fourier transform of an analog signal with time-domain $(-\infty,\infty)$ reveals its entire frequency content, and the inverse Fourier transform to be studied in Sect. 7.2 recovers the analog signal from the frequency content.


Definition 1 (Fourier transform) Let $f \in L^1(\mathbb{R})$. The Fourier transform (FT) of $f$, denoted by $\widehat{f}$ or $\mathcal{F}f$, is defined by
$$\widehat{f}(\omega) = (\mathcal{F}f)(\omega) = \int_{-\infty}^{\infty} f(x)\,e^{-i\omega x}\,dx, \quad \omega \in \mathbb{R}. \tag{7.1.1}$$

Sometimes it may be more convenient to write $\widehat{f}(\omega)$ as $(f)^{\wedge}(\omega)$. See, for example, (7.1.10)–(7.1.12). Observe that since $e^{-ix\omega} = \cos x\omega - i\sin x\omega$, the FT of $f(x)$ reveals the frequency content
$$\widehat{f}(\omega) = \int_{-\infty}^{\infty} f(x)\cos\omega x\,dx - i\int_{-\infty}^{\infty} f(x)\sin\omega x\,dx$$
of $f(x)$ in terms of the oscillation of the cosine and sine functions, with frequency defined by $\omega/2\pi$ Hz.

Example 1 Compute the Fourier transform $\widehat{f}(\omega)$ of $f(x) = e^{-a|x|}$, where $a > 0$.

Solution Clearly $f \in L^1(\mathbb{R})$. Since $f(x)$ is even,
$$\widehat{f}(\omega) = \int_{-\infty}^{\infty} e^{-a|x|}e^{-i\omega x}\,dx = 2\int_{0}^{\infty} e^{-ax}\cos\omega x\,dx = 2\,\frac{e^{-ax}}{\omega^2+a^2}\bigl(\omega\sin\omega x - a\cos\omega x\bigr)\Big|_{0}^{\infty} = \frac{2a}{\omega^2+a^2}. \qquad\square$$

The Fourier transform $\widehat{f} = \mathcal{F}f$ of a function $f \in L^1(\mathbb{R})$ enjoys the following properties:

Theorem 1 Let $f \in L^1(\mathbb{R})$ with Fourier transform $\widehat{f}(\omega)$. Then
(i) $\widehat{f} \in L^\infty(\mathbb{R})$ with $\|\widehat{f}\|_\infty \le \|f\|_1$.
(ii) $\widehat{f}(\omega)$ is uniformly continuous on $\mathbb{R}$.
(iii) $\widehat{f}(\omega) \to 0$ as $|\omega| \to \infty$.
(iv) If $f'(x)$ exists on $\mathbb{R}$ and $f' \in L^1(\mathbb{R})$, then $(\mathcal{F}f')(\omega) = i\omega\,\widehat{f}(\omega)$.
(v) If $x^k f(x) \in L^1(\mathbb{R})$ for some $k \ge 1$, then $\widehat{f} \in C^k(\mathbb{R})$ with
$$\frac{d^k}{d\omega^k}\widehat{f}(\omega) = (-i)^k\int_{-\infty}^{\infty} x^k f(x)\,e^{-i\omega x}\,dx, \quad \omega \in \mathbb{R}. \tag{7.1.2}$$

7.1 Fourier Transform

321

     ∞   sup   f (x) e−ix − 1 e−iωx d x  f (ω + ) −  f (ω) = sup  ω ω −∞  ∞   ≤ | f (x)| e−ix − 1d x → 0, −∞

for  → 0, since

  | f (x)| e−ix − 1 ≤ 2| f (x)|

for any  and f (x) ∈ L 1 (R), so that the “Lebesgue’s dominated convergence theorem” from Real Analysis (see Theorem 4 in the Appendix of this chapter) can be applied to interchange limit and integration. Property (iii) is called the “Riemann-Lebesgue lemma”. To prove (iii), we may assume n that f is a compactly supported piecewise constant function, namely: f (x) = ck χ[ak−1 ,ak ) (x). The reason is that an L 1 (R) function can be approximated k=1 arbitrarily well by such functions, with Fourier transform that satisfies (i). Then  f (ω) =

n k=1



ak

ck

e ak−1

−iωx

n 1 dx = ck (e−iωak − e−iωak−1 ) → 0, −iω k=1

as ω → ∞. The proof of (iv) follows from “integration by parts” which can be shown rigorously, under the assumption that f  ∈ L 1 (R) and by observing that f (x) → 0 as |x| → ∞. For (v), we first observe that if f (x) ∈ L 1 (R) and x k f (x) ∈ L 1 (R) as well, f ∈ Ck (R) can be verified then x j f (x) ∈ L 1 (R) for 1 ≤ j ≤ k − 1. The property  j iteratively by showing that  f ∈ C (R), with j = 1, . . . , k. Here, we may consider only the case k = 1, namely:  d  f (ω + ξ) −  f (ω) f (ω) = lim ξ→0 dω ξ  ∞ e−i(ω+ξ)x − e−iωx dx = lim f (x) ξ→0 −∞ ξ  ∞ e−iξx − 1 = dx f (x)e−iωx lim ξ→0 ξ −∞  ∞ = f (x)(−i x)e−iωx d x, −∞

where the interchange of limit and integration in the third equality is allowed by applying Lebesgue’s dominated convergence theorem to the function q(x, ξ) = f (x)

e−i(ω+ξ)x − e−iωx e−iξx − 1 = f (x)e−iωx , ξ ξ

322

7 Fourier Time-Frequency Methods

since |q(x, ξ)| ≤ |x| | f (x)| for any real ξ (see Exercise 14); that is, q(x, ξ) is dominated by an L 1 (R) function |x| | f (x)|. d  d  The continuity of dω f (ω) follows from the expression of dω f (ω) in (7.1.2) and (ii) with f (x) replaced by x f (x).  In the following, we list some important operations of functions in L 1 (R) or L 1 [0, ∞). (a) Even extension. Let f ∈ L 1 [0, ∞). The even extension f e (x) of f (x) is defined by f e (x) = f (x) for x ≥ 0 and f e (x) = f (−x), forx < 0.

(7.1.3)

Note that f e ∈ L 1 (R). (b) Translation. Let f ∈ L 1 (R) and b ∈ R. The translation operator Tb is defined by (7.1.4) (Tb f )(x) = f (x − b). (c) Dilation. Let f ∈ L 1 (R) and a > 0. The dilation operator Da is defined by (Da f )(x) = f (ax).

(7.1.5)

Note that Da f ∈ L 1 (R). (d) Frequency modulation. Let f ∈ L 1 (R) and 0 = c ∈ R. Then the (frequency) modulation operator Mc is defined by (Mc f )(x) = f (x)eicx .

(7.1.6)

  Note that (Mc f )(x) = | f (x)|. (e) Convolution. Let f be a function on R. Then for a given “filter”, which is another function on R, the convolution of f (x) with the filter function h(x) is defined by  ( f ∗ h)(x) =



−∞

f (t)h(x − t)dt.

(7.1.7)

Remark 1 By applying the Cauchy-Schwarz inequality in Theorem 3 on p.27 (or Hölder’s inequality for p = q = 2 in (1.2.21) on p.22), it is clear that f ∗ h ∈ L ∞ (R) provided that f, h ∈ L 2 (R) (see Exercise 4). If f, h ∈ L 1 (R), then f ∗ h ∈ L 1 (R) by applying Fubini’s theorem in Theorem 1 of the Appendix (see Exercise 5). More general, by the Minkowski inequality for integrals, it can be shown that for 1 ≤ p, q < ∞, f ∗ h r ≤ f p h q , where r > 0 is given by

1 r

+1=

1 p

+ q1 .




Remark 2 By the change of variables of integration $u = x - t$, we see that $f*h = h*f$. Thus, the convolution operator defined by (7.1.7) is commutative. Observe that the convolution operator $*$ defined in (6.3.3) of Definition 1 on p.288 for $PC^*_{2\pi}$ differs from that in (7.1.7) for $L^2(\mathbb{R})$ only in the limits of integration. $\square$

(7.1.8)

0

(ii) Let f ∈ L 1 (R) and b ∈ R. Then (Tb f )∧ (ω) = e−ibω  f (ω);

(7.1.9)

that is, FTb = M−b F. (iii) Let f ∈ L 1 (R) and a = 0. Then 1  ω ; f a a

(7.1.10)

f (ω − c); (Mc f )∧ (ω) = 

(7.1.11)

(Da f )∧ (ω) = that is, FDa = a1 Da −1 F. (iv) Let f ∈ L 1 (R) and 0 = c ∈ R. Then

that is, FMc = Tc F. (v) Let f, h ∈ L 1 (R). Then f ∗ h ∈ L 1 (R) and f (ω) h(ω). ( f ∗ h)∧ (ω) = 

(7.1.12)

Proofs of (7.1.8)–(7.1.11) are straightforward (see Exercises 15–18), and derivation of (7.1.12) is a consequence of interchanging the order of integrations by applying Fubini’s theorem, as follows.

324

7 Fourier Time-Frequency Methods

( f ∗ h)∧ (ω) =





−∞ ∞

e−iωx

 =

−∞ ∞

f (t)

 =

−∞ ∞

 =

−∞

f (t)







−∞ ∞

−∞ ∞



−∞

f (t)h(x − t)dt d x

e−iωx h(x − t)d x dt

e−iω(y+t) h(y)dy dt

f (t)e−iωt  h(ω)dt

=  f (ω) h(ω).  In the rest of this section, we consider the Gaussian function gσ (x) =

x 2 1 √ e−( 2σ ) , 2σ π

(7.1.13)

where σ > 0. Remark 3 When the Gaussian function gσ is used as the time localization window function u in (7.4.1) on p.351, the radius of this window function gσ is shown to be gσ = σ in Theorem 2 of Sect. 7.4 on p.355. In addition, since the Fourier transform 2 2 gσ (ω) = e−σ ω , as shown in Theorem 3, the radius of the frequency of gσ is simply  1 as shown in Theorem 2 of Sect. 7.4. On the localization window function  gσ is 2σ other hand, in the Statistics literature, the Gaussian function f σ (x) =

1 −( √x )2 2σ √ e σ 2π

is often adopted, since σ is the “standard derivation” of the probability distribution function y = f σ (x), with “variance" σ 2 . Hence, in view of f σ (x) = gσ/√2 (x), it is clear that when gσ is used as a probability distribution function, its variance is var(gσ ) = 2σ 2 .  √ The division by 2σ π in (7.1.13) is to assure the integral to be equal to 1, namely: 



−∞

gσ (x)d x = 1, all σ > 0.

(7.1.14)

7.1 Fourier Transform

325

To prove (7.1.14), observe that by changing the Cartesian coordinates to polar coordinates, we have





−∞

e−x d x 2

2

 =







−∞ −∞  ∞ π

=

−π

0

which yields





−∞

e−(x

2 +y 2 )

d xd y

e−r r dr dθ = 2π 2





e−r r dr = π, 2

0

e−x d x = 2



π

(7.1.15)

after taking the square-root. Hence, for any α > 0, 



−∞

e

−αx 2

 dx =

π , α

(7.1.16)

which implies that (7.1.14) holds, by choosing α = 1/(4σ 2 ). 2 To compute the Fourier transform of gα (x), we first consider v(x) = e−x and formulate its Fourier transform as  ∞ 2 e−x e−iωx d x G(ω) =  v (ω) = −∞  ∞ 2 = e−(x +iωx) d x. −∞

Then for y ∈ R, we have 



e−(x +yx) d x −∞  ∞ √ 2 2 y 2 /4 =e e−(x+y/2) d x = πe y /4 .

G(−i y) =

2

−∞

(7.1.17)

Hence, when the function H (z) = G(z) −

√ −z 2 /4 πe

(7.1.18)

is considered as a function of a complex variable z, H (z) is analytic for all z ∈ C (or H (z) is called an entire function) and H (−i y) = 0 for all y ∈ R. Recall that if an analytic function vanishes on a set with a finite number of accumulation points, then the function vanishes identically in the domain of analyticity. Hence, H (z) = 0 for all z ∈ C, so that √ 2 G(ω) − πe−ω /4 = 0, ω ∈ R;

326

7 Fourier Time-Frequency Methods



or  v (ω) =



−∞

e−x e−iωx d x = 2



πe−ω

2 /4

.

(7.1.19)

This enables us to compute the Fourier transform  gσ (ω) of gσ (x); namely, in view of (7.1.13) and (7.1.19), we have  ∞ x 2 1  gσ (ω) = e−( 2σ ) e−iωx d x √ 2σ π −∞  ∞ 1 2 = e−y e−i(2σω)y (2σ)dy √ 2σ π −∞ 1 √ −(2σω)2 /4 2 πe = e−(σω) . =√ π Theorem 3 Gaussian is preserved under Fourier transform The Fourier transform of the Gaussian function gσ (x), σ > 0, defined in (7.1.13), is  gσ (ω) = e−σ

2 ω2

.

(7.1.20)

Remark 4 By (7.1.19) and the change of variables of integration, we have 



−∞

e−σ

2 ω2

√ ei xω dω =

  π − x 2 e 2σ = 2πgσ (x). σ

The detailed verification is left as an exercise (see Exercise 21).

(7.1.21) 

By allowing σ > 0 in gσ (x) to be a free parameter, we have a powerful mathematical tool, called “positive approximate identity” {gσ } for L 2 (R) (see the family {σn }  , of Fejér’s kernels in Theorem 3 on p.293 as a positive approximate identity for PC2π which obviously extends to all periodic functions in L 2 [a, b], −∞ < a < b < ∞). Theorem 4 Positive approximate identity in (7.1.13). Then

Let gσ (x) be the Gaussian function

(a)  gσ (x) ≥ 0 for all x ∈ R; ∞ gσ (x)d x = 1, all σ > 0; and (b) −∞

(c) for any δ > 0,

 lim

σ→0 |x|>δ

gσ (x)d x = 0.

Clearly gσ (x) > 0 for any x ∈ R, while the property (b) has been proved above. The proof of the result (c) is left as an exercise (see Exercise 20).

7.1 Fourier Transform

327

The property of positive approximate identity of gσ (x) is applied in the following to show that convolution of gσ (x) preserves all bounded continuous functions, as σ → 0+ . In other words, the limit of gσ (x) is the “delta function". Theorem 5 Let f ∈ L ∞ (R) be continuous at x = x0 . Then lim ( f ∗ gσ )(x0 ) = f (x0 ).

σ→0+

Proof Let  > 0 be arbitrarily given. Since there exists some δ > 0, such that | f (x) − f (x0 )|
δ

| f (x0 ) − f (x0 − t)| gσ (t)dt

 δ | f (x0 ) − f (x)| gσ (t)dt |x−x0 |≤δ −δ  gσ (t)dt + 2 f ∞ |t|>δ   gσ (t)dt. (7.1.22) ≤ + 2 f ∞ 2 |t|>δ

≤ max

Hence, since f ∈ L ∞ (R), we may apply (c) of Theorem 4 to deduce the existence of some σ0 > 0, such that   gσ (t)dt ≤ 4 f ∞ |t|>δ for all 0 < σ < σ0 . This completes the proof of Theorem 5. Exercises 

Exercise 1 Let h 1 (x) =

1, if 0 ≤ x ≤ 1, 0, otherwise.



328

7 Fourier Time-Frequency Methods

Compute the Fourier transform  h 1 (ω) of h 1 (x). f = F f ∈ L ∞ (R) and  f ∞ ≤ f 1 . Exercise 2 Show that if f ∈ L 1 (R), then  Exercise 3 Show that if f ∈ L 1 (R), f  (x) exists for all x ∈ R and f  ∈ L 1 (R), f (ω) by applying integration by parts. then (F f  )(ω) = iω  Exercise 4 If f, g ∈ L 2 (R), show that f ∗ g ∈ L ∞ (R). Exercise 5 If f, g ∈ L 1 (R), show that f ∗ g ∈ L 1 (R). Exercise 6 Let h 2 (x) be defined by h 2 (x) = (h 1 ∗h 1 )(x), where ∗ is the convolution operation and h 1 (x) is defined in Exercise 1. (a) (b) (c) (d)

Show that h 2 (x) = 0 for x < 0 or x > 2. Compute h 2 (x) for 0 ≤ x ≤ 1. Compute h 2 (x) for 1 ≤ x ≤ 2. Graph y = h 2 (x).

Exercise 7 By applying the result in Exercise 1 and formula (7.1.12) of Theorem 2, compute the Fourier transform  h 2 (ω) of h 2 (x). Exercise 8 By applying (7.1.9) of Theorem 2 and the result in Exercise 7, compute 2 (ω) of the function H2 (x) = h 2 (x + 1). the Fourier transform H Exercise 9 Extend the  formula (7.1.12)  of Theorem 2 to write out the Fourier transform of g(x) = f 1 ∗ f 2 ∗ ( f 3 ∗ f 4 ) (x), where f j ∈ L 1 (R), 1 ≤ j ≤ 4. Exercise 10 In Exercise 9, show that the Fourier transform  g (ω) of g(x) is independent of the permutation of f 1 , f 2 , f 3 , f 4 in the convolution formulation, and hence conclude that g(x) may be written as g(x) = ( f 1 ∗ f 2 ∗ f 3 ∗ f 4 )(x). Exercise 11 Verify the result in Exercise 10 directly from the definition of convolution without appealing to Fourier transform. Exercise 12 As a result of Exercise 10, let h m (x) = (h 1 ∗ h 1 ∗ · · · ∗ h 1 )(x),    m copies of h 1

h m (ω). where h 1 (x) is defined in Exercise 1. Computer the Fourier transform  Exercise 13 For the function h m (x) in Exercise 12, define Hm (x) = h m (x + m (ω) and show that H m (ω) is a real-valued function. Compute H Exercise 14 Show that for t ∈ R, |e−it − 1| ≤ |t|. Exercise 15 Derive the formula (7.1.8).

m 2 ).

7.1 Fourier Transform

329

Exercise 16 Derive the formula (7.1.9). Exercise 17 Derive the formula (7.1.10). Exercise 18 Derive the formula (7.1.11). Exercise 19 Derive the formulas (7.1.16) and (7.1.14) by applying (7.1.15). Exercise 20 Prove the property (c) of Theorem 4. Exercise 21 Derive the formula (7.1.21).

7.2 Inverse Fourier Transform and Sampling Theorem

The definition of the Fourier transform for $L^1(\mathbb{R})$ in (7.1.1) on p.320 can be extended to functions $f \in L^2(\mathbb{R})$ by considering the truncated functions
$$f_N(x) = \begin{cases} f(x), & \text{for } |x| \le N,\\ 0, & \text{for } |x| > N, \end{cases} \tag{7.2.1}$$

for N = 1, 2, · · · . Observe that each f N is compactly supported. Thus, from the f N is well-defined. In assumption f ∈ L 2 (R), we have f N ∈ (L 2 ∩ L 1 )(R), so that  addition, since { f N } converges to f in L 2 (R), { f N } is a Cauchy sequence in L 2 (R). In this section, by applying the Gaussian function and its Fourier transform, we will prove that for N = 1, 2, · · · ,  f N 22 = 2π f N 22

(7.2.2)

(see (7.2.4)–(7.2.6)). It then follows from (7.2.2) and the fact that { f N } is a Cauchy f N (ω)} N is also a Cauchy sequence in L 2 (R), sequence in L 2 (R), that the sequence {  and its limit, being a function in L 2 (R), can be used as the definition of the Fourier transform of f . Definition 1 Fourier transforms of L 2 (R) functions Let f ∈ L 2 (R) and f N (x) be the truncations of f defined by (7.2.1). Then the Fourier transform  f (ω) of f is defined as the limit of {  f N (ω)} N in L 2 (R). Remark 1 The Fourier transform  f for f ∈ L 2 (R) defined in Definition 1 is independent of the choice of { f N } in the sense that if {g N } with g N ∈ (L 2 ∩ L 1 )(R) g N } has the same limit as {  f N }. Indeed, by (7.2.2), converges to f in L 2 (R), then { we have f N 2 = 2π g N − f N 2  gN −  ≤ 2π g N − f 2 + 2π f − f N 2 → 0

330

7 Fourier Time-Frequency Methods

as N → ∞. Thus, { g N } and {  f N } have the same limit.



 Recall from (6.1.10) of Theorem 2 on p.271 that the “energy” of any f ∈ PC2π is preserved by the energy of the sequence {ck } of Fourier coefficients of f , called  . The analogue of this result extends to the Fourier Parseval’s identity for PC2π transform via the identity (7.2.2) to be established in the following theorem.

Theorem 1 Parseval’s theorem Let  f (ω) be the Fourier transform of f ∈ L 2 (R). Then (7.2.3)  f 22 = 2π f 22 . In other words, the “energy” of f (x) is “preserved” by its Fourier transform  f (ω). The identity (7.2.3) is commonly called Plancherel’s formula but will be called Parseval’s identity in this book for uniformity. Proof of Theorem 1 By the definition of the Fourier transform for functions in L 2 (R), it is sufficient to derive the identity (7.2.3) for the truncated functions; and in view of Remark 1, we may assume that f ∈ (L 1 ∩ L 2 )(R) and consider its corresponding “autocorrelation function” F(x), defined by  F(x) = By setting



−∞

f (t) f (t − x)dt.

(7.2.4)

f − (x) = f (−x),

the autocorrelation can be viewed as the convolution of f and f − , namely: F(x) = ( f ∗ f − )(x), so that it follows from Remark 1 on p.322 that F(x) is in both L ∞ and L 1 (R). Furthermore, by (v) in Theorem 2 on p.323, we have  f (ω) = |  f (ω)|2 F(ω) =  f (ω) 

(7.2.5)

(see Exercise 1). Therefore, by applying (7.2.5) together with (7.1.20) on p.326, we obtain  ∞  ∞ 2 2  | f (ω)|2 e−σ ω dω = f (ω)  f (ω)  gσ (ω)dω −∞ −∞  ∞  ∞  ∞

= f (y)e−iω y f (x)eiωx  gσ (ω)d xd y dω −∞ −∞ −∞  ∞  ∞  ∞

= f (y) f (x)  gσ (ω)ei(x−y)ω dω d xd y −∞

−∞

−∞

7.2 Inverse Fourier Transform and Sampling Theorem

 = 2π = 2π = 2π



−∞  ∞ −∞  ∞ −∞

 f (y) f (y)



−∞  ∞ −∞

331

f (x)gσ (x − y)d xd y f (y + t)gσ (t)dtdy

F(−t)gσ (t)dt = 2π(F ∗ gσ )(0),

where the 4th equality follows from (7.1.21) on p.326 and the change of variables of integration t = x − y is applied to derive the second last line. In addition, since F(x) is a continuous function (see Exercise 2), it follows from Theorem 5 on p.327 with x0 = 0 that by taking the limit as σ → 0,  f 22 = 2π F(0) = 2π f 22 . This completes the proof of Theorem 1. As a consequence of Theorem 1, we have the following result. Theorem 2 Parseval’s formula

(7.2.6) 

Let f, g ∈ L 2 (R). Then

f, g =

1  f , g . 2π

(7.2.7)

We only derive (7.2.7) for real-valued functions and leave the complex-valued setting as an exercise (see Exercise 3). For real-valued f (x) and g(x), since f ± g 22 = f 22 ± 2 f, g + g 22 , we have

f, g =

1

f + g 22 − f − g 22 . 4

Hence, it follows from (7.2.3) (with f replaced by ( f + g) and ( f − g), respectively) that 1 1  f, g = f + g 22 −  f − g 22 4 2π 1  f , g . = 2π  The following result, though similar to the formulation of (7.2.7), is more elementary and can be proved by a straightforward change of order of integrations (that is, by applying Fubini’s theorem). Theorem 3 Let f, g ∈ L 2 (R). Then g , f .  f , g = 

(7.2.8)

332

7 Fourier Time-Frequency Methods

To prove this theorem, it is sufficient to derive (7.2.8) for f, g ∈ (L 1 ∩ L 2 )(R), since the completion of (L 1 ∩ L 2 )(R) in L 2 (R) is L 2 (R). Under this additional assumption, we may apply Fubini’s theorem to interchange the order of integrations. Thus, we have  f , g =





−∞ ∞



−∞ ∞

= =





−∞

f (x)e−i yx d x g(y)dy −∞  ∞

f (x) g(y)e−i x y dy d x ∞

−∞

f (x) g (x)d x =  g , f .

 To be able to recover f (x) from its Fourier transform  f (ω) = (F f )(ω), we need to introduce the inverse F−1 of the Fourier transform operation F. To accomplish this goal, let us first consider the companion transform F# of F, defined by 1 (F g)(x) = 2π #





−∞

g(ω)ei xω dω

(7.2.9)

for g ∈ L 1 (R). Remark 2 Observe that F# introduced in (7.2.9) is related to the Fourier transform F in (7.1.1) on p.320 by (F# g)(x) =

1 1 (Fg)(−x) = (Fg)(x). ¯ 2π 2π

(7.2.10) 

Under the additional assumption that  f ∈ L 1 (R), we can recover f ∈ L 2 (R) from  f (ω) by applying F# , as in the following theorem. Theorem 4 Inverse Fourier transform Let  f (ω) be the Fourier transform of f ∈ L 1 (R). Then f ∈ L 2 (R). Suppose that    1 f (x) = F#  f (x) = 2π



∞ −∞

 f (ω)ei xω dω.

(7.2.11)

That is, F# = F−1 , to be called the inverse Fourier transform (or IFT). Remark 3 There exist functions f ∈ L p (R) for all p, 1 ≤ p ≤ ∞, such that  f ∈ L 1 (R), as illustrated in the following example. Hence, F# = F−1 in Theorem 4  is restricted only to operation on functions  f ∈ L 1 (R). Example 1 Let f 0 (x) be defined by

7.2 Inverse Fourier Transform and Sampling Theorem

 f 0 (x) =

333

e−x , for x ≥ 0, 0, for x < 0,

and  f 0 (ω) be its Fourier transform. Then f 0 ∈ L p (R) for all p, 1 ≤ p ≤ ∞, but  f 0 ∈ L 1 (R). Solution It is clear that f 0 ∈ L p (R), since 





−∞

| f 0 (x)| p d x =



e− px d x =

0

1 < ∞, p

for 1 ≤ p < ∞ and f 0 ∞ = 1. On the other hand,  f 0 (ω) =





e

−x

e

−i xω



0

=



dx =

e−x(1+iω) d x

0

1 , 1 + iω

which is not in L 1 (R). Indeed, 



−∞

   f 0 (ω)dω =





(1 + ω 2 )−1/2 dω

−∞  ∞

 ∞ (1 + ω 2 )−1/2 dω > 2 (1 + ω 2 )−1/2 dω 0 1 ∞ √ √  ∞ 2 −1/2  (ω ) dω = 2 ln ω  = ∞. > 2

=2

1

ω=1



Proof of Theorem 4 To prove Theorem 4, we set g =  f and apply (7.2.10), Theorem 3, and Theorem 2, consecutively, to compute f − F# g 22 = f 22 − f, F# g − F# g, f + F# g 22 1 1 1 f, Fg − Fg, = f 22 − ¯ f + ( )2 Fg 22 2π 2π 2π 1 1 1 Fg, f − = f 22 − Fg, ¯ f¯ + ( )2 (2π) g 22 2π 2π 2π 1  1  1 f , g − g 22 = f 22 − f , g + 2π 2π 2π 1   1   1 2 f, f − f 2 = f 22 − f, f + 2π 2π 2π 1 2 1 2 1 2 f 2 − f 2 + f 2 = f 22 − 2π 2π 2π 1 2 f 2 = f 22 − f 22 = 0. = f 22 − 2π

334

7 Fourier Time-Frequency Methods

¯ 2 = h 2 and g =  Here, we have used the fact that h f . Hence, ( f − F# g)(x) = 0 for almost all x, which implies the validity of (7.2.11).



Next, let us recall the concept of “ideal lowpass filtering” introduced in Remark  . By considering Dirichlet’s kernels D (x) defined 2 on p.289 for functions in PC2π n in (6.3.2) on p.288 as the “convolution filters” of the signal f (x), the output is the nth-order partial sum (Sn f ) of the Fourier series S f . In other words, this ideal lowpass filtering process retains the low-frequency coefficients ck ( f ) with |k| ≤ n but removes the high-frequency content ck ( f ) for all k with |k| > n from S f . Hence, the precise analogy of ideal lowpass filtering of signals f ∈ L 2 (R) is removing the frequency content  f (ω) of f (x) for all ω with |ω| > η, where η > 0 is any desired “cut-off” frequency. More precisely, let  χ[−η,η] (ω) =

1, for|ω| ≤ η, 0, for|ω| > η,

(7.2.12)

denote the characteristic function of the interval [−η, η]. Then ideal lowpass filtering of f (x) is mapping  f (ω) −→  f (ω)χη (ω). Therefore, in view of the result in (7.1.12) of Theorem 2, it is clear that the ideal lowpass filter is the function h η (x) with Fourier transform given by  h η (ω) = χ[−η,η] (ω).

(7.2.13)

Hence, since χη ∈ (L 1 ∩ L 2 )(R), it follows from Theorem 4 that h η (x) =

1 2π





−∞  η

 h η (ω)ei xω dω

1 eiηx − e−iηx 1 ei xω dω = 2π −η 2π ix

sin ηx η sin ηx = = . πx π ηx =

Definition 2 Sinc function

(7.2.14)

The “sinc” function is defined by sinc(x) =

sin πx . πx

(7.2.15)

Remark 4 Let η > 0. Then by applying (7.2.14), the ideal lowpass filter h η (x) can be formulated as

7.2 Inverse Fourier Transform and Sampling Theorem

335

η η sinc x . π π

h η (x) =

(7.2.16)

Thus by (7.2.13),

π π η ∧ h η (ω) = χ[−η,η] (ω). sinc( x) (ω) =  π η η

(7.2.17)

It follows from (7.1.12) of Theorem 2 on p. 323 that the convolution operation  ( f ∗ h η )(x) =



−∞

f (t)h η (x − t)dt

applied to any f ∈ L 2 (R) is an ideal lowpass filtering of f (x), considered as a signal, in the sense that f (ω)χ[−η,η] (ω). ( f ∗ h η )∧ (ω) =   We end this section by applying Theorem 4 to derive the following so-called “sampling theorem” which assures perfect reconstruction of band-limited continuous functions (considered as analog signals) from certain discrete samples. Definition 3 Band-limited function A function f ∈ L 2 (R) is said to be bandlimited if its Fourier transform  f (ω) vanishes for all ω with |ω| > σ for some σ > 0. The bandwidth of a band-limited function is 2σ, where σ > 0 is the smallest value for which  f (ω) = 0 for |ω| > σ. Let f be a band-limited function with  f (ω) = 0 for |ω| > σ. Since f ∈ L 2 (R), we have  f ∈ L 2 (R). In addition,  R

| f (ω)|dω =



σ −σ

| f (ω)|dω ≤

√  2σ

σ

−σ

| f (ω)|2 dω

1/2

< ∞.

That is,  f ∈ L 1 (R). Thus it follows from Theorem 4 that f (x) =

1 2π



σ

−σ

 f (ω)ei xω dω.

(7.2.18)

By (ii) in Theorem 1 on p.320, we observe that the function defined by the integral on the right-hand side of (7.2.18) is uniformly continuous on R, since  f ∈ L 1 (R). Thus by re-defining the original f at certain points, if necessary, f is continuous on R. In the following, a band-limited function is a continuous function with representation (7.2.18) for some function g(ω) on (−σ, σ) (which happens to f (ω) ∈ (L 2 ∩ L 1 )(R), by (v) in Theorem be  f (ω)). In fact, since for any k ≥ 1, ω k 

336

7 Fourier Time-Frequency Methods

1 on p.320, we see that f given by (7.2.18) is in C k (R). In summary, we have the following theorem. Theorem 5 A band-limited function is in C ∞ (R). Next we introduce the sampling theorem, which is also called the NyquistShannon Sampling Theorem. Let f ∈ L 2 (R) be band-limited with bandwidth

Theorem 6 Sampling theorem

not exceeding 2σ. Then f (x) can be recovered from the discrete samples f ( kπ σ ), k ∈ Z, by applying the formula ∞

f (x) =

f

k=−∞

σ

kπ sinc x − k , x ∈ R, σ π

(7.2.19)

where the series converges to f (x) both uniformly in R and in L 2 (R). Proof Since f ∈ L 2 (R), we have  f ∈ L 2 [−σ, σ]. Thus, by Theorem 3 on p.271,  f can be represented by its Fourier series  f (ω) =



ck (  f )eikπω/σ , ω ∈ [−π, π],

k=−∞

which converges to  f (ω), in the sense that the sequence of the partial sums (Sn  f )(ω) =

n

ck (  f )eikπω/σ

k=−n

of the Fourier series converges to  f (ω) in L 2 [−σ, σ]. Here, ck (  f ), k ∈ Z, are the  Fourier coefficients of f (ω) given by  σ 1   f (ω) e−ikπω/σ dω ck ( f ) = 2σ −σ π −kπ , = f σ σ where (7.2.18) is applied to yield the last equality. Let L n (x) be the nth partial sum of the series in (7.2.19): L n (x) =

n k=−n

Observe that

f

kπ σ

sinc

σ π

x −k .

7.2 Inverse Fourier Transform and Sampling Theorem

337

 σ sin σ(x − kπ )

σ kπ 1 σ = sinc x − k = ei(x− σ )ω dω. kπ π 2σ −σ σ(x − σ ) Thus,  n 1 kπ σ i(x− kπ )ω σ L n (x) = f e dω 2σ σ −σ k=−n  σ n

kπ 1 e−ikπω/σ ei xω dω f = 2σ −σ σ k=−n  σ n

− jπ π 1 ei jπω/σ ei xω dω = f 2π −σ σ σ j=−n  σ 1 = (Sn  f )(ω)ei xω dω. 2π −σ This, together with (7.2.18), leads to L n (x) − f (x) =

1 2π



σ −σ

(Sn  f )(ω) −  f (ω) ei xω dω.

Therefore, |L n (x) − f (x)| ≤

1 2π



σ −σ

   (Sn  f )(ω) −  f (ω) ei xω dω

 2 21 1 √ σ  ≤ (Sn  2σ f )(ω) −  f (ω) dω → 0, 2π −σ as n → ∞ for all x ∈ R, since (Sn  f )(ω) converges to  f (ω) in L 2 [−σ, σ]. This shows that the series in (7.2.19) converges to f (x) uniformly in R. Next we show that L n (x) converges to f (x) in L 2 (R). By (7.2.17), we have (see Exercise 7)

∧ σ π sinc( x − k) (ω) = χ[−σ,σ] (ω)e−ikπω/σ . (7.2.20) π σ Thus, n π kπ  χ[−σ,σ] (ω)e−ikπω/σ = (Sn  L n (ω) = f f )(ω)χ[−σ,σ] (ω). σ σ k=−n

Hence, it follows from Parseval’s theorem on p.330 that

338

7 Fourier Time-Frequency Methods

2π L n − f 22 =  Ln −  f 22  ∞ 2  (Sn  = f (ω) dω f )(ω)χ[−σ,σ] (ω) −  −∞ σ  2 (Sn  = f )(ω) −  f (ω) dω → 0, −σ

as n → ∞, since (Sn  f )(ω) converges to  f (ω) in L 2 [−σ, σ]. That is, the series in  (7.2.19) converges to f (x) in L 2 [−σ, σ]. Exercises Exercise 1 Show that the Fourier transform of the autocorrelation function F(x) of f (x) is the square of the magnitude of the Fourier transform of f (x) as in (7.2.5). Exercise 2 Prove that the autocorrelation function F(x) defined in (7.2.4) is a continuous function for any function f ∈ L 2 (R). Exercise 3 Derive (7.2.7) by applying (7.2.3) for complex-valued functions f, g ∈ L 2 (R). Hint: In addition to considering f ± g 22 as in the proof for the real-valued setting, also consider f ± ig 22 to obtain Im f, g . Exercise 4 For y ∈ R, let y+ = max(y, 0) and consider the function g(ω) = (1 − |ω|)+ . Show that (F# g)(x) = f (x), where  f (x) =

1−cos x , πx 2 1 2π ,

if x = 0, if x = 0.

Hence, apply Theorem 4 to conclude that  f (ω) = (1 − |ω|)+ . Exercise 5 Show that the function f (x) in Exercise 4 is band-limited with bandwidth equal to 2σ = 2 (or σ = 1); and apply the Sampling theorem to write f (x) in terms of an infinite series of sinc functions. Exercise 6 Recall the function  h η (ω) in (7.2.13) and its inverse Fourier transform (IFT) h η (x) in (7.2.16). (a) For a constant c > 0, what is the bandwidth of the function kc (x) = h η (cx)? (b) Evaluate kc (0) by applying L’Hospital’s rule. (c) For 0 < c ≤ 1, verify the Sampling theorem for f (t) = kc (t), by allowing η = 1. (d) For c = m1 , where m is a positive integer, verify the Sampling theorem directly. Exercise 7 Apply (7.2.17) to verify (7.2.20).


7.3 Isotropic Diffusion PDE

The study of (heat) diffusion on a bounded region $D$ in $\mathbb{R}^s$, for any dimension $s \ge 1$, has already been studied in Sect. 6.5 of Chap. 6. When $D$ is perfectly insulated on the boundary $\partial D$ of $D$, the mathematical model of the diffusion problem is the partial differential equation (PDE)
$$\frac{\partial}{\partial t}u(\mathbf{x},t) = c\,\nabla^2 u(\mathbf{x},t), \quad \mathbf{x} \in D,\ t \ge 0,$$
with zero Neumann boundary condition, namely:
$$\frac{\partial}{\partial n}u(\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D,\ t \ge 0,$$
where $\mathbf{n} = \mathbf{n}_{\mathbf{x}}$ denotes the unit outer normal vector at $\mathbf{x} \in \partial D$. Here and throughout, recall that $c > 0$ is a positive constant, called heat conductivity. In this section, we consider the same PDE, but on the entire space $\mathbb{R}^s$, $s \ge 1$, instead of a bounded region $D$. Hence, there is no boundary condition; and with initial heat content $u(\mathbf{x},0) = u_0(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^s$, at $t = 0$, the mathematical model is the initial-value PDE
$$\begin{cases} \dfrac{\partial}{\partial t}u(\mathbf{x},t) = c\,\nabla^2 u(\mathbf{x},t), & \mathbf{x} \in \mathbb{R}^s,\ t \ge 0,\\[1ex] u(\mathbf{x},0) = u_0(\mathbf{x}), & \mathbf{x} \in \mathbb{R}^s. \end{cases} \tag{7.3.1}$$

Our mathematical tool is convolution with the Gaussian function gσ (x) = gσ (x1 ) · · · gσ (xs ), where x = (x1 , . . . , xs ) ∈ Rs . Let us first focus on heat diffusion on R = R1 , with given initial heat content u 0 (x), −∞ < x < ∞. Recall from Sect. 7.1 that for any constant σ > 0, the Gaussian function gσ (x) is defined by x 2 1 gσ (x) = √ e−( 2σ ) 2σ π as in (7.1.13) on p.324, satisfies 



−∞

gσ (x)d x = 1, all σ > 0

(see (7.1.14) on p.324), and its Fourier transform is given by

340

7 Fourier Time-Frequency Methods

 gσ (ω) = e−σ

2 ω2

,

(7.3.2)

as shown in (7.1.20) of Theorem 3 in Sect. 7.1, on p. 326. Let c > 0 be the conductivity constant in the PDE (7.3.1) and introduce the “time" parameter t=

σ2 or σ 2 = ct c

(7.3.3)

to define the function G(x, t) of two variables: 1

G(x, t) = gσ (x) =

g√

t − 2 − x 2 t −1 (x) = √ e 4c , ct 2 πc

(7.3.4)

with x ∈ R to be called the spatial variable, and t ≥ 0 to be called the time variable. Then G(x, t) satisfies the PDE (7.3.1) in that ∂2 ∂ G(x, t) = c 2 G(x, t), x ∈ R, t > 0. ∂t ∂x

(7.3.5)

Indeed, by taking the partial derivatives of G(x, t) in (7.3.4), we have 1 3

2 x 2 −1 1 x ∂ 1 G(x, t) = √ e− 4c t − t − 2 + t − 2 ( )t −2 ; ∂t 2 4c 2 πc

− 21 −1 x 2 −1 ∂ t t G(x, t) = √ e− 4c t x , − ∂x 2c 2 πc so that 1 ∂2 t −1 2 t − 2 − x 2 t −1 t −1 4c + (− x) − G(x, t) = e √ ∂x 2 2c 2c 2 πc 1 3

x 2 −1 1 x2 1 1 − t − 2 + ( )t − 2 · t −2 = √ e− 4c t c 2 πc 2 4c 1 ∂ G(x, t), = c ∂t

which proves (7.3.5). Next, it follows from Theorem 5 on p.327 in Sect. 7.1 that G(x, 0) = δ(x), x ∈ R, where δ(x) denotes the “Dirac delta" distribution (also commonly called the “delta function") in that if f ∈ L 2 (R) is continuous at x, then f (x) is “reproduced" by convolution with the delta function:


$$(f*\delta)(x)=f(x),$$

meaning that

$$\lim_{t\to 0^+}\int_{-\infty}^{\infty}f(y)\,G(x-y,t)\,dy=f(x),\qquad x\in\mathbb{R}\tag{7.3.6}$$

(see Theorem 5 on p.327 in Sect. 7.1).

Remark 1 We have assumed that f ∈ L^∞(ℝ) in Theorem 5 on p.327 for (7.3.6) to hold, but this was only for simplicity of the proof. In fact, since for any fixed t > 0 the product f(y)G(x − y, t) is integrable for all f ∈ PC(ℝ) with "at most polynomial growth", meaning that f(x)x^{−n} ∈ L^∞(ℝ) for some integer n > 0, the statement of Theorem 5 on p.327 (or equivalently (7.3.6)) remains valid for all f ∈ PC(ℝ) with at most polynomial growth. □

In other words, we have the following result.

Theorem 1 Solution of diffusion PDE for ℝ¹ Let u₀ ∈ PC(ℝ) with at most polynomial growth. Then the solution of the initial value PDE

$$\begin{cases}\dfrac{\partial}{\partial t}u(x,t)=c\,\dfrac{\partial^2}{\partial x^2}u(x,t), & x\in\mathbb{R},\ t\ge 0,\\[1ex] u(x,0)=u_0(x), & x\in\mathbb{R},\end{cases}\tag{7.3.7}$$

is given by

$$u(x,t)=\int_{-\infty}^{\infty}u_0(y)\,G(x-y,t)\,dy.\tag{7.3.8}$$

The proof of (7.3.8) has already been discussed in (7.3.6) with f(x) = u₀(x) and Remark 1. The idea is that if the heat source (i.e. initial heat content) is the delta function δ(x), then the heat distribution for t > 0 is the Gaussian function G(x, t), as shown at the top of Fig. 7.1. To show that the heat distribution u(x, t), with initial heat content u₀(x), is the solution of the initial-value PDE (7.3.7), we simply apply (7.3.5) to obtain

$$
\frac{\partial}{\partial t}u(x,t)
=\frac{\partial}{\partial t}\int_{-\infty}^{\infty}u_0(y)\,G(x-y,t)\,dy
=\int_{-\infty}^{\infty}u_0(y)\,\frac{\partial}{\partial t}G(x-y,t)\,dy
=\int_{-\infty}^{\infty}u_0(y)\,c\,\frac{\partial^2}{\partial x^2}G(x-y,t)\,dy
=c\,\frac{\partial^2}{\partial x^2}\int_{-\infty}^{\infty}u_0(y)\,G(x-y,t)\,dy
=c\,\frac{\partial^2}{\partial x^2}u(x,t),
$$

by the definition of u(x, t) in (7.3.8). Hence, u(x, t) as defined by (7.3.8) is the output as shown in the bottom of Fig. 7.1, with input u₀(x). □

Fig. 7.1. Diffusion with delta heat source (top) and arbitrary heat source (bottom)

Example 1 Compute the solution u(x, t) of the initial value (heat diffusion) PDE in (7.3.7), with initial (or input) function u₀(x) = a_α cos αx + b_α sin αx, where α is any real number and a_α, b_α are arbitrary constants.

Solution By Theorem 1, the solution is obtained by computing the convolution of u₀(x) with the Gaussian function G(x, t) with respect to the spatial variable x, where t ≥ 0 is held fixed while the convolution is performed. Although there is no need to consider the following three separate cases, we do so in this first example to show the computational steps more transparently.

Case 1. For α = 0, the input function u₀(x) = a₀ is a constant. Hence, u₀(x − y) = a₀ and

$$u(x,t)=\bigl(u_0*G(\cdot,t)\bigr)(x)=\int_{-\infty}^{\infty}a_0\,g_\sigma(y)\,dy=a_0,$$

since the Gaussian g_σ(x) is normalized with integral over (−∞, ∞) equal to 1.

Case 2. For b_α = 0, the initial function is

$$u_0(x)=a_\alpha\cos\alpha x=\frac{a_\alpha}{2}\bigl(e^{i\alpha x}+e^{-i\alpha x}\bigr).$$

Hence, for fixed σ² = ct, the convolution becomes

$$
u(x,t)=\bigl(u_0*g_\sigma\bigr)(x)
=\frac{a_\alpha}{2}\Bigl(\int_{-\infty}^{\infty}e^{i\alpha(x-y)}g_\sigma(y)\,dy+\int_{-\infty}^{\infty}e^{-i\alpha(x-y)}g_\sigma(y)\,dy\Bigr)
=\frac{a_\alpha}{2}\bigl(e^{i\alpha x}\widehat{g_\sigma}(\alpha)+e^{-i\alpha x}\widehat{g_\sigma}(-\alpha)\bigr)
=\frac{a_\alpha}{2}\bigl(e^{i\alpha x}e^{-\sigma^2\alpha^2}+e^{-i\alpha x}e^{-\sigma^2(-\alpha)^2}\bigr)
=\frac{a_\alpha}{2}\bigl(e^{i\alpha x}+e^{-i\alpha x}\bigr)e^{-\sigma^2\alpha^2}
$$

by (7.3.2). Since σ² = ct, we have

$$u(x,t)=a_\alpha e^{-c\alpha^2 t}\cos\alpha x.$$

Case 3. For a_α = 0, the initial function is

$$u_0(x)=b_\alpha\sin\alpha x=\frac{b_\alpha}{2i}\bigl(e^{i\alpha x}-e^{-i\alpha x}\bigr).$$

Therefore, the same computation as above yields

$$u(x,t)=\frac{b_\alpha}{2i}\bigl(e^{i\alpha x}-e^{-i\alpha x}\bigr)e^{-\sigma^2\alpha^2}=b_\alpha e^{-c\alpha^2 t}\sin\alpha x,$$

since σ² = ct. Combining the above computational results, we obtain the solution of the initial value PDE (7.3.7):

$$u(x,t)=e^{-c\alpha^2 t}\bigl(a_\alpha\cos\alpha x+b_\alpha\sin\alpha x\bigr)=e^{-c\alpha^2 t}u_0(x)$$

for all x ∈ ℝ and t ≥ 0. □

In general, if the initial (input) function u₀(x) is in PC*₂π, the same computational steps apply to yield the solution u(x, t) of the initial value PDE (7.3.7), as follows.

Example 2 Let u₀ ∈ PC[0, M], M > 0, be such that the partial sums (S_n u₀)(x) of the Fourier series of u₀(x) are uniformly bounded. Show that the solution u(x, t) of the initial value PDE (7.3.7) with initial heat content u₀(x) is given by

$$u(x,t)=\frac{a_0}{2}+\sum_{k=1}^{\infty}e^{-c\left(\frac{k\pi}{M}\right)^2 t}\Bigl(a_k\cos\frac{k\pi}{M}x+b_k\sin\frac{k\pi}{M}x\Bigr),\tag{7.3.9}$$

where


$$a_k=\frac{1}{M}\int_{-M}^{M}u_0(x)\cos\frac{k\pi}{M}x\,dx,\qquad k=0,1,2,\ldots,$$

$$b_k=\frac{1}{M}\int_{-M}^{M}u_0(x)\sin\frac{k\pi}{M}x\,dx,\qquad k=1,2,\ldots.$$

Solution Consider the Fourier cosine and sine series expansion of u₀(x) in Theorem 2 on p.278 of Chap. 6, with d = M. Since the (S_n u₀)(x) are uniformly bounded, we may apply Lebesgue's dominated convergence theorem in Theorem 4 of the Appendix (Sect. 7.6) of this chapter to interchange summation and integration, namely:

$$
\begin{aligned}
u(x,t)&=\bigl(u_0*G(\cdot,t)\bigr)(x)\\
&=\frac{a_0}{2}+\sum_{k=1}^{\infty}\Bigl(\frac{a_k-ib_k}{2}\int_{-\infty}^{\infty}e^{i\frac{k\pi}{M}(x-y)}G(y,t)\,dy+\frac{a_k+ib_k}{2}\int_{-\infty}^{\infty}e^{-i\frac{k\pi}{M}(x-y)}G(y,t)\,dy\Bigr)\\
&=\frac{a_0}{2}+\sum_{k=1}^{\infty}\Bigl(\frac{a_k-ib_k}{2}\,e^{i\frac{k\pi}{M}x}\,\widehat{g_\sigma}\bigl(\tfrac{k\pi}{M}\bigr)+\frac{a_k+ib_k}{2}\,e^{-i\frac{k\pi}{M}x}\,\widehat{g_\sigma}\bigl(-\tfrac{k\pi}{M}\bigr)\Bigr)\\
&=\frac{a_0}{2}+\sum_{k=1}^{\infty}\Bigl(a_k\,\frac{e^{i\frac{k\pi}{M}x}+e^{-i\frac{k\pi}{M}x}}{2}+b_k\,\frac{e^{i\frac{k\pi}{M}x}-e^{-i\frac{k\pi}{M}x}}{2i}\Bigr)e^{-\sigma^2\left(\frac{k\pi}{M}\right)^2}\\
&=\frac{a_0}{2}+\sum_{k=1}^{\infty}e^{-\left(\frac{k\pi}{M}\right)^2 ct}\Bigl(a_k\cos\frac{k\pi}{M}x+b_k\sin\frac{k\pi}{M}x\Bigr),
\end{aligned}
$$

since σ² = ct, where again the formula $\widehat{g_\sigma}(\omega)=e^{-\sigma^2\omega^2}$ in (7.3.2) is applied. □
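The series solution (7.3.9) is straightforward to evaluate numerically. The sketch below is our own illustration (the square-wave initial condition, the values of M, c, t, the Riemann-sum quadrature, and the truncation level are arbitrary choices): it approximates the coefficients a_k, b_k, damps each mode by e^{−c(kπ/M)²t}, and sums the series.

```python
import numpy as np

M, c, t, n_terms = np.pi, 1.0, 0.05, 50
y = np.linspace(-M, M, 4000, endpoint=False)
dy = y[1] - y[0]
u0 = np.sign(np.sin(y))                                   # square-wave initial heat content

# Riemann-sum approximations of a_k and b_k
a = [np.sum(u0 * np.cos(k * np.pi * y / M)) * dy / M for k in range(n_terms + 1)]
b = [np.sum(u0 * np.sin(k * np.pi * y / M)) * dy / M for k in range(n_terms + 1)]

x = np.linspace(-M, M, 801)
u = 0.5 * a[0] * np.ones_like(x)
for k in range(1, n_terms + 1):
    damp = np.exp(-c * (k * np.pi / M) ** 2 * t)          # heat damping of the k-th mode
    u += damp * (a[k] * np.cos(k * np.pi * x / M) + b[k] * np.sin(k * np.pi * x / M))

# u approximates the smoothed temperature profile at time t; higher modes decay fastest.
print(np.round(u[::100], 4))
```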

Example 3 Find the solution u(x, t) of the initial value (heat diffusion) PDE (7.3.7) with initial heat content u₀(x) = xⁿ, for n = 1 and 2.

Solution Since the PDE (7.3.7) describes the heat diffusion process, let us consider u₀(x) as the initial temperature at x ∈ ℝ. Hence, the solution u(x, t) is the temperature at the time instant t > 0, at the same position x ∈ ℝ. For n = 1, the temperature at t > 0 and location x ∈ ℝ is given by

$$u(x,t)=\int_{-\infty}^{\infty}(x-y)\,g_\sigma(y)\,dy=x\int_{-\infty}^{\infty}g_\sigma(y)\,dy-\int_{-\infty}^{\infty}y\,g_\sigma(y)\,dy,$$

where σ² = ct. Since the first integral is equal to 1 and the second integral is 0 (the integrand y g_σ(y) being odd), we have u(x, t) = x, for all t ≥ 0. That is, the temperature does not change with time; there is no diffusion at all.

For n = 2, since (x − y)² = x² − 2xy + y², the same argument as above yields

$$u(x,t)=\int_{-\infty}^{\infty}(x-y)^2 g_\sigma(y)\,dy=x^2+\int_{-\infty}^{\infty}y^2 g_\sigma(y)\,dy,$$

where σ² = ct. Observe that because

$$\int_{-\infty}^{\infty}y^2 e^{-\alpha y^2}\,dy=-\frac{\partial}{\partial\alpha}\int_{-\infty}^{\infty}e^{-\alpha y^2}\,dy=-\frac{\partial}{\partial\alpha}\sqrt{\frac{\pi}{\alpha}}=\frac{\sqrt{\pi}}{2}\,\alpha^{-3/2},$$

we obtain

$$\int_{-\infty}^{\infty}y^2 g_\sigma(y)\,dy=\frac{1}{2\sigma\sqrt{\pi}}\int_{-\infty}^{\infty}y^2 e^{-\left(\frac{y}{2\sigma}\right)^2}\,dy=\frac{1}{2\sigma\sqrt{\pi}}\cdot\frac{\sqrt{\pi}}{2}\,(2\sigma)^3=2\sigma^2=2ct,$$

and this yields u(x, t) = x² + 2ct, for all t > 0. Observe that the temperature at x increases from u(x, 0) = x² to u(x, t) = x² + 2ct as t > 0 increases. This is truly "global warming" everywhere! □
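Theorem 1 and the two examples above are easy to confirm numerically. The following Python sketch (our own illustration; the grid, the padding of the integration interval, and the values of c, t, α are arbitrary choices) discretizes the convolution (7.3.8) with the kernel G(x, t) of (7.3.4) and compares the result with the closed-form answers e^{−cα²t} cos αx from Example 1 and x² + 2ct from Example 3.

```python
import numpy as np

def heat_kernel(x, t, c=1.0):
    """G(x, t) of (7.3.4): the Gaussian with sigma^2 = c*t."""
    return np.exp(-x**2 / (4.0 * c * t)) / (2.0 * np.sqrt(np.pi * c * t))

def solve_by_convolution(u0, x, t, c=1.0, pad=20.0, n=20001):
    """Discretize u(x,t) = int u0(y) G(x-y,t) dy by a Riemann sum on a wide y-grid."""
    y = np.linspace(x.min() - pad, x.max() + pad, n)
    dy = y[1] - y[0]
    G = heat_kernel(x[:, None] - y[None, :], t, c)        # kernel matrix G(x_i - y_j, t)
    return G @ (u0(y) * dy)

c, t, alpha = 1.0, 0.3, 2.0
x = np.linspace(-3.0, 3.0, 121)

u_cos = solve_by_convolution(lambda y: np.cos(alpha * y), x, t, c)
u_sq  = solve_by_convolution(lambda y: y**2, x, t, c)

print(np.max(np.abs(u_cos - np.exp(-c * alpha**2 * t) * np.cos(alpha * x))))  # ~0 (Example 1)
print(np.max(np.abs(u_sq  - (x**2 + 2.0 * c * t))))                           # ~0 (Example 3)
```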

Let us now extend our discussion from one spatial variable x ∈ ℝ to s spatial variables x = (x₁, …, x_s) ∈ ℝ^s. Let |x| denote the Euclidean norm of x ∈ ℝ^s; that is, |x|² = x₁² + ⋯ + x_s², and observe that the Gaussian function g_σ(x), x ∈ ℝ^s, defined by

$$g_\sigma(\mathbf{x})=\frac{1}{(4\pi\sigma^2)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4\sigma^2}},\tag{7.3.10}$$

can be written as the product of the 1-dimensional Gaussian functions; namely,

$$g_\sigma(\mathbf{x})=g_\sigma(x_1,\ldots,x_s)=g_\sigma(x_1)\cdots g_\sigma(x_s).$$

Hence, when the time variable t is defined by (7.3.3), that is σ² = ct, the extension of G(x, t) in (7.3.4) to ℝ^s, s ≥ 2, is given by

$$G(\mathbf{x},t)=G(x_1,\ldots,x_s,t)=g_\sigma(\mathbf{x})=\frac{t^{-\frac{s}{2}}}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}.\tag{7.3.11}$$

Indeed, by (7.3.10) and (7.3.4), we have

$$
G(\mathbf{x},t)=g_\sigma(x_1)\cdots g_\sigma(x_s)
=\frac{1}{\sqrt{4\pi\sigma^2}}\,e^{-x_1^2/4\sigma^2}\cdots\frac{1}{\sqrt{4\pi\sigma^2}}\,e^{-x_s^2/4\sigma^2}
=\frac{1}{(4\pi\sigma^2)^{s/2}}\,e^{-(x_1^2+\cdots+x_s^2)/4\sigma^2}
=\frac{1}{(4\pi\sigma^2)^{s/2}}\,e^{-|\mathbf{x}|^2/4\sigma^2}
=\frac{t^{-\frac{s}{2}}}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}},
$$

as desired. Now, let us consider the s-dimensional convolution defined by

$$(u_0*h)(\mathbf{x})=\int_{\mathbb{R}^s}u_0(\mathbf{y})\,h(\mathbf{x}-\mathbf{y})\,d\mathbf{y}.\tag{7.3.12}$$

Then for fixed t ≥ 0, the s-dimensional convolution with the spatial variables x = (x₁, …, x_s) of h(x) = G(x, t) can be written as consecutive 1-dimensional convolutions; namely,

$$
\bigl(u_0*G(\cdot,t)\bigr)(\mathbf{x})=\int_{\mathbb{R}^s}u_0(\mathbf{y})\,G(\mathbf{x}-\mathbf{y},t)\,d\mathbf{y}
=\underbrace{\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty}}_{s\ \text{integrals}}u_0(y_1,\ldots,y_s)\,G(x_1-y_1,t)\cdots G(x_s-y_s,t)\,dy_1\cdots dy_s
=\bigl(u_0(\underbrace{\cdot,\cdot,\ldots,\cdot}_{s\ \text{components}})*_1 g_\sigma *_2\cdots *_s g_\sigma\bigr)(x_1,\ldots,x_s),\tag{7.3.13}
$$

where "∗_k" denotes the 1-dimensional convolution with respect to the k-th component (that is, y_k in (7.3.12)). Hence, as in the 1-dimensional case, it is not difficult to verify that G(x, t) = G(x₁, …, x_s, t)


is the solution of the s-dimensional heat equation with δ(x) = δ(x₁)⋯δ(x_s) as the initial heat source; namely,

$$\begin{cases}\dfrac{\partial}{\partial t}G(\mathbf{x},t)=c\,\nabla^2 G(\mathbf{x},t), & \mathbf{x}\in\mathbb{R}^s,\ t\ge 0,\\[1ex] G(\mathbf{x},0)=\delta(\mathbf{x})=\delta(x_1)\cdots\delta(x_s), & \mathbf{x}\in\mathbb{R}^s,\end{cases}\tag{7.3.14}$$

where ∇² denotes the Laplace operator, defined by

$$\nabla^2 G(\mathbf{x},t)=\frac{\partial^2}{\partial x_1^2}G(\mathbf{x},t)+\cdots+\frac{\partial^2}{\partial x_s^2}G(\mathbf{x},t).$$

To verify that G(x, t) satisfies the heat diffusion equation, we simply follow the same computations as in the derivation of (7.3.5):

$$\frac{\partial}{\partial t}G(\mathbf{x},t)=\frac{1}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}\Bigl(-\frac{s}{2}\,t^{-\frac{s+2}{2}}+t^{-\frac{s}{2}}\Bigl(\frac{|\mathbf{x}|^2}{4c}\Bigr)t^{-2}\Bigr);$$

$$\frac{\partial}{\partial x_k}G(x_1,\ldots,x_s,t)=\frac{t^{-\frac{s}{2}}}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}\Bigl(-\frac{t^{-1}}{2c}\,x_k\Bigr),$$

so that

$$\frac{\partial^2}{\partial x_k^2}G(\mathbf{x},t)=\frac{t^{-\frac{s}{2}}}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}\Bigl(-\frac{t^{-1}}{2c}+\Bigl(\frac{t^{-1}}{2c}\,x_k\Bigr)^2\Bigr).$$

Hence, it follows that

$$
c\,\nabla^2 G(\mathbf{x},t)=c\sum_{k=1}^{s}\frac{\partial^2}{\partial x_k^2}G(x_1,\ldots,x_s,t)
=\frac{t^{-\frac{s}{2}}}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}\Bigl(-\frac{s\,t^{-1}}{2}+\frac{t^{-2}}{4c}\sum_{k=1}^{s}x_k^2\Bigr)
=\frac{1}{(4\pi c)^{s/2}}\,e^{-\frac{|\mathbf{x}|^2}{4c}t^{-1}}\Bigl(-\frac{s}{2}\,t^{-\frac{s}{2}-1}+\Bigl(\frac{|\mathbf{x}|^2}{4c}\Bigr)t^{-\frac{s}{2}-2}\Bigr)
=\frac{\partial}{\partial t}G(\mathbf{x},t).
$$

To apply the above result to an arbitrary heat source, we simply follow the same argument as in the 1-dimensional setting and apply (7.3.13) and (7.3.14) to obtain the solution of the initial value PDE:


$$\begin{cases}\dfrac{\partial}{\partial t}u(\mathbf{x},t)=c\,\nabla^2 u(\mathbf{x},t), & \mathbf{x}\in\mathbb{R}^s,\ t\ge 0,\\[1ex] u(\mathbf{x},0)=u_0(\mathbf{x}), & \mathbf{x}\in\mathbb{R}^s,\end{cases}\tag{7.3.15}$$

where the initial condition is any "reasonable measurable function" u₀(x) = u₀(x₁, …, x_s) defined on ℝ^s. We summarize the above discussion in the following theorem.

Theorem 2 Solution of diffusion PDE for ℝ^s Let u₀(x) be a measurable function on ℝ^s, s ≥ 1, with at most polynomial growth, such that the set of x ∈ ℝ^s on which u₀(x) is discontinuous has (Lebesgue) measure zero. Then the solution of the initial value PDE (7.3.15) is given by

$$u(\mathbf{x},t)=\bigl(u_0*G(\cdot,t)\bigr)(\mathbf{x}),\tag{7.3.16}$$

as defined in (7.3.13).

Example 4 Let c > 0 and let ∇² be the Laplace operator defined by

$$\nabla^2 f(x,y)=\frac{\partial^2}{\partial x^2}f(x,y)+\frac{\partial^2}{\partial y^2}f(x,y).$$

Re-formulate the solution u(x, y, t) of the 2-dimensional (heat diffusion) initial value PDE

$$\frac{\partial}{\partial t}u(x,y,t)=c\,\nabla^2 u(x,y,t)$$

in (7.3.15), with initial (input) function u₀(x, y), according to Theorem 2, explicitly in terms of the 1-dimensional Gaussian function.

Solution According to Theorem 2 and the commutativity of the convolution operation, we have

$$
u(x,y,t)=\int_{-\infty}^{\infty}g_\sigma(y_1)\int_{-\infty}^{\infty}u_0(x-x_1,\,y-y_1)\,g_\sigma(x_1)\,dx_1\,dy_1
=\frac{t^{-1}}{4\pi c}\int_{-\infty}^{\infty}e^{-\frac{t^{-1}}{4c}y_1^2}\int_{-\infty}^{\infty}u_0(x-x_1,\,y-y_1)\,e^{-\frac{t^{-1}}{4c}x_1^2}\,dx_1\,dy_1,\tag{7.3.17}
$$

where σ² = ct. □

Definition 1 A function u₀(x, y) of two variables is said to be separable if there exist functions f_j and h_k of one variable such that


$$u_0(x,y)=\sum_{j,k}a_{j,k}\,f_j(x)\,h_k(y)\tag{7.3.18}$$

for some constants a_{j,k}, where Σ_{j,k} denotes a finite double sum, such as

$$\sum_{j=0}^{m}\sum_{k=0}^{n},\qquad \sum_{j=0}^{m}\sum_{k=0}^{n-j},\qquad \sum_{\ell=0}^{n}\sum_{j+k=\ell},\qquad\text{etc.}$$

Example 5 Let u₀(x, y) be a separable function as defined by (7.3.18). Write out the solution u(x, y, t) of the 2-dimensional (heat diffusion) PDE in Example 4 with initial condition u₀(x, y) in the form most useful for computation.

Solution Let σ² = ct and apply the formulation (7.3.17) to obtain

$$
u(x,y,t)=\int_{-\infty}^{\infty}g_\sigma(y_1)\int_{-\infty}^{\infty}\sum_{j,k}a_{j,k}\,f_j(x-x_1)\,h_k(y-y_1)\,g_\sigma(x_1)\,dx_1\,dy_1
=\sum_{j,k}a_{j,k}\Bigl(\int_{-\infty}^{\infty}f_j(x-x_1)\,g_\sigma(x_1)\,dx_1\Bigr)\Bigl(\int_{-\infty}^{\infty}h_k(y-y_1)\,g_\sigma(y_1)\,dy_1\Bigr),\tag{7.3.19}
$$

with σ = √(ct). After computing (7.3.19), we may write out the solution u(x, y, t) by replacing σ with √(ct), as follows:

$$
u(x,y,t)=\frac{t^{-1}}{4\pi c}\sum_{j,k}a_{j,k}\Bigl(\int_{-\infty}^{\infty}f_j(x-x_1)\,e^{-\frac{t^{-1}}{4c}x_1^2}\,dx_1\Bigr)\Bigl(\int_{-\infty}^{\infty}h_k(y-y_1)\,e^{-\frac{t^{-1}}{4c}y_1^2}\,dy_1\Bigr).\qquad\square
$$

Example 6 Apply the solution in Example 5 to compute the solution u(x, y, t) of the 2-dimensional (heat diffusion) initial value PDE in Example 4 with initial condition u₀(x, y) given by (7.3.18), where f_j(x) = cos (jπ/M)x and h_k(y) = cos (kπ/N)y for M, N > 0.

Solution For f_j(x) = cos (jπ/M)x and h_k(y) = cos (kπ/N)y, application of the solution in Example 1 yields

$$\int_{-\infty}^{\infty}f_j(x-x_1)\,g_\sigma(x_1)\,dx_1=e^{-\sigma^2\left(\frac{j\pi}{M}\right)^2}\cos\frac{j\pi}{M}x=e^{-\left(\frac{j\pi}{M}\right)^2 ct}\cos\frac{j\pi}{M}x.$$


Hence, it follows from (7.3.19) in the previous example that

$$u(x,y,t)=\sum_{j,k}a_{j,k}\,e^{-\left(\left(\frac{j\pi}{M}\right)^2+\left(\frac{k\pi}{N}\right)^2\right)ct}\cos\frac{j\pi}{M}x\,\cos\frac{k\pi}{N}y.\qquad\square$$
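As a numerical check of Example 6, the sketch below (our own illustration; the grid and the values of M, N, c, t, j, k are arbitrary choices) evaluates the two 1-dimensional convolutions of (7.3.19) by Riemann sums and compares their product with the closed-form solution above for a single separable mode with a_{j,k} = 1.

```python
import numpy as np

M, N, c, t = 2.0, 3.0, 1.0, 0.1
j, k = 2, 1
sigma = np.sqrt(c * t)

def smooth(func, pts):
    """1-D convolution (func * g_sigma)(pts), approximated by a Riemann sum on a wide grid."""
    y = np.linspace(-10.0, 10.0, 20001)
    dy = y[1] - y[0]
    g = np.exp(-(y / (2 * sigma)) ** 2) / (2 * sigma * np.sqrt(np.pi))
    return np.array([np.sum(func(p - y) * g) * dy for p in pts])

x = np.linspace(-2.0, 2.0, 81)
yv = np.linspace(-2.0, 2.0, 81)

# separable structure of (7.3.19): an outer product of two 1-D convolutions
u_num = np.outer(smooth(lambda s: np.cos(j * np.pi * s / M), x),
                 smooth(lambda s: np.cos(k * np.pi * s / N), yv))
u_exact = (np.exp(-((j * np.pi / M) ** 2 + (k * np.pi / N) ** 2) * c * t)
           * np.outer(np.cos(j * np.pi * x / M), np.cos(k * np.pi * yv / N)))
print("max |difference|:", np.max(np.abs(u_num - u_exact)))   # ~ 0
```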

Exercises

Exercise 1 Solve the initial value PDE

$$\frac{\partial}{\partial t}u(x,t)=c\,\frac{\partial^2}{\partial x^2}u(x,t),\qquad x\in\mathbb{R},\ t\ge 0,$$

with initial value u(x, 0) = u₀(x), where c > 0 is a constant and u₀(x) = 10 − cos 2x + 5 sin 3x.

Exercise 2 Repeat Exercise 1 with the initial function u₀(x) = sin³ x.

Exercise 3 Repeat Exercise 1 with the initial function u₀(x) = xⁿ, where n ≥ 3 is any positive integer.

Exercise 4 Solve the 2-dimensional initial value PDE

$$\frac{\partial}{\partial t}u(x,y,t)=c\,\nabla^2 u(x,y,t),\qquad (x,y)\in\mathbb{R}^2,\ t\ge 0,$$

where c > 0 is a constant, with initial condition

$$u(x,y,0)=u_0(x,y)=\sum_{j,k}a_{j,k}(\cos jx)(\sin ky),$$

for some constants a_{j,k}.

Exercise 5 Repeat Exercise 4 with the initial condition

$$u_0(x,y)=\sum_{j,k}b_{j,k}(\sin jx)(\sin ky)$$

for some constants b_{j,k}.

Exercise 6 Repeat Exercise 4 with the initial condition

$$u_0(x,y)=a_{0,0}+a_{1,0}x+a_{0,1}y+a_{2,0}x^2+a_{1,1}xy+a_{0,2}y^2+a_{3,0}x^3+a_{2,1}x^2y+a_{1,2}xy^2+a_{0,3}y^3,$$

for some constants a_{j,k}, j + k = ℓ, where ℓ = 0, 1, 2, 3.


7.4 Time-Frequency Localization

In Sect. 5.3 of Chap. 5, we discussed the application of the DCT to image subblocks, as opposed to the entire digital image. This is essential since multiplication of high-dimensional matrices is not only computationally expensive, but also requires significantly more memory. For example, for JPEG compression, the DCT is applied to 8 × 8 pixel blocks. To extend this idea to the computation of the Fourier transform, if the function f(x) is truncated by some characteristic function χ_{(a,b)}(x), then computation of

$$\bigl(f\chi_{(a,b)}\bigr)^{\wedge}(\omega)=\int_a^b f(x)\,e^{-i\omega x}\,dx$$

is certainly much simpler than that of $\widehat f(\omega)$. In general, a more desirable window function u(x) could be used in place of the characteristic function χ_{(a,b)}(x), and this window should be allowed to move (continuously) along the x-axis, instead of partitioning the x-axis into disjoint intervals. This is the central idea of the so-called "short-time" Fourier transform. Since this transform localizes the function f(x) before the Fourier transform is applied, we will also call it the localized Fourier transform, as follows.

Definition 1 Short-time Fourier transform Let u ∈ (L¹ ∩ L²)(ℝ) and x ∈ ℝ. Then for any f ∈ L²(ℝ), the integral transform

$$\bigl(\mathcal{F}_u f\bigr)(x,\omega)=\int_{-\infty}^{\infty}f(t)\,u(t-x)\,e^{-i\omega t}\,dt\tag{7.4.1}$$

is called the localized Fourier transform (LFT), or short-time Fourier transform (STFT), of the function f(x) at the time-frequency (or space-frequency) point (x, ω) ∈ ℝ².

Remark 1 In contrast with the Fourier transform 𝓕, which takes a function f(x) from the time (or spatial) domain ℝ to $\widehat f(\omega)$ in the frequency domain ℝ, the LFT 𝓕_u, with "window function" u(x), takes f(x) from the time (or spatial) domain ℝ to the time-frequency domain ℝ². For this reason, we use t (instead of x) as the dummy variable of integration in (7.4.1), while reserving the variable x for the time-frequency coordinate (x, ω) ∈ ℝ². □

To localize the inverse Fourier transform (see Theorem 4 on p.332) with some window function v ∈ (L¹ ∩ L²)(ℝ), we adopt the notation 𝓕# in (7.2.9) on p.332 to introduce the localized inverse Fourier transform 𝓕#_v as follows.

Definition 2 Localized inverse Fourier transform Let v ∈ (L¹ ∩ L²)(ℝ) and ω ∈ ℝ. Then for any f(x) ∈ L²(ℝ) with Fourier transform $\widehat f(\omega)$, the integral transform


 #  1 Fv  f (x, ω) = 2π





−∞

 f (ξ)v(ξ − ω)ei xξ dξ

(7.4.2)

is called the localized inverse Fourier transform (LIFT) of  f (ω) at the time-frequency (or spatial-frequency) point (x, ω) ∈ R2 . Remark 2 As mentioned in Remark 1, since ω is reserved for the frequency variable of the time-frequency coordinate (x, ω) ∈ R2 , the dummy variable ξ is used for the integration in (7.4.2). We also remark that in view of Theorem 1 below and the recovery formula for f (x) in (7.4.12) of Theorem 4 on p.332 to be discussed in this section, the notion of LIFT is justified.  To quantify the localization properties of the LFT and LIFT, we introduce the notion of “window width” in the following.   Definition 3 Time-frequency window width Let u ∈ L 1 ∩L 2 (R) be a nonzero function on R such that xu(x) ∈ L 2 (R). Then 1 x = u 22 



∞ −∞

x|u(x)|2 d x

(7.4.3)

is called the center of the localization window function u(x), and 1  ∞

1/2  2 2 u = (x − x ) |u(x)| d x u 22 −∞

(7.4.4)

is called the radius of u(x). In addition, the window width of u(x) is defined by 2u . Observe that for u ∈ L 2 (R), if xu(x) ∈ L 2 (R), then xu(x)2 ∈ L 1 (R) (see Exercise 2). Thus the center x  is well-defined.   u (ω) In the following, we will see that if u ∈ L 1 ∩ L 2 (R) with Fourier transform    also in L 1 ∩ L 2 (R), then by choosing v(ω) = (Fu)(ω) ¯ = u (−ω) (see Exercise 3) for the LIFT F#v in (7.4.2), we have simultaneous time and frequency localization.   Theorem 1 Time-frequency localization Let u ∈ L 1 ∩ L 2 (R) with Fourier   transform Fu =  u ∈ L 1 ∩ L 2 (R). Then for any f ∈ L 1 (R),



Fu f (x, ω) = e−i xω F#u   f (x, ω),

(7.4.5)

for (x, ω) ∈ R2 , when u  is defined by u (−ξ), for ξ ∈ R. u  (ξ) =  Proof The proof of (7.4.5) follows by applying (7.2.7) of Theorem 2 on p.331. Indeed, by considering


  g(t) = u(t − x)eiωt = Mω Tx u¯ (t), it follows from (7.1.9) and (7.1.11) of Theorem 2 on p.331 that u (ω − ξ)e−i x(ξ−ω) .  g (ξ) =  u(ξ ¯ − ω)e−i x(ξ−ω) =  Hence, by (7.2.7) of Theorem 2 on p.331 and the definition of Fu in (7.4.1), we have, for fixed (x, ω),   1  f , g

Fu f (x, ω) = f, g =  ∞ 2π 1  = f (ξ) u (ω − ξ) ei x(ξ−ω) dξ 2π −∞  ∞ 1  = e−i xω f (ξ)u  (ξ − ω)ei xξ dξ 2π −∞

= e−i xω F#u   f (x, ω).



In view of the time-frequency localization identity (7.4.5), it is imperative to come u (ω) ∈ L 2 (R) up with window functions u(x) such that both xu(x) ∈ L 2 (R) and ω in order to achieve finite window widths 2u and 2u  , as defined in (7.4.3)–(7.4.4). Example 1 The window function u(x) = χ(− 1 , 1 ) (x) 2 2

with



∞ −∞

 u(x)d x =

1 2

1d x = 1

− 21

and center x  = 0 has finite u , but u  = ∞. Solution Clearly, with

 u 22

we have



x = and

−∞

 u =





−∞

=

1 2

− 21

12 d x = 1, 

xu(x)2 d x =

− 21

 x u(x) d x = 2

2

1 2

1 2

− 21

xd x = 0

x 2d x =

1 12

354

7 Fourier Time-Frequency Methods

is finite. On the other hand,   u (ω) = =

1 2

− 21

e−iω/2 − eiω/2 −iω

e−iωx d x =

sin(ω/2) , ω/2

and hence, u  (ω) =  u (−ω) =  u (ω) and 

∞ −∞

2  ω u (ω) dω = 4





ω sin2 ( )dω = ∞. 2 −∞

 The most commonly used time-window function is the Gaussian function defined by (7.1.13) on p.324. To compute the window width, we differentiate both sides of (7.1.16) on p.325 with respect to α, and then set α = 1/(2σ 2 ), to yield  − so that

∞ −∞





−∞

x

x 2 e−2( 2σ ) d x = − 2

x 2 gσ2 (x)d x =

1 √ 1 −3/2 π , 2 2σ 2 √ π 3/2 3 2 σ . 2

1 √ (2σ π)2

Since gα is an even function, the center x ∗ of gα is 0. On the other hand, since 



−∞

 ∞ x 2 1 e−2( 2σ ) d x √ 2 (2σ π) −∞ √ √ 1 = √ 2 2σ π, (2σ π)

gσ2 (x)d x =

we have 2  gσ = =





−∞

x 2 gσ2 (x)d x





−∞

gσ2 (x)d x

√ π 3/2 3  √ √ 2 σ ( 2σ π) = σ 2 , 2

so that the width of the window function gσ is 2gσ = 2σ, where σ is the standard deviation of the Gaussian function gσ (x).

(7.4.6)

7.4 Time-Frequency Localization

355

To compute the window width of the Fourier transform  gσ of gσ , we re-write the Fourier transform  gα (ω) in (7.1.20) on p.326 as  gσ (ω) = e−(ω/2η) with η = 2

1 . 2σ

Then, we may conclude, by applying the result gη = η in (7.4.6), that gσ = 1 1 ; so that the width of the window function  gσ is 2gσ = . We summarize η= 2σ σ the above results in the following. Theorem 2 Time-frequency window of Gaussianfunction The radii of the wingσ (ω) are given by dow functions of Gaussian functions gσ (x) and  gσ = σ, gσ =

1 ; 2σ

(7.4.7)

and the area of the time-frequency localization window: 

   − gσ , gσ × − gσ , gσ

in R2 is

   2gσ 2gσ = 2.

(7.4.8)

(7.4.9)

It turnsout that the area = 2 is the smallest among all time-frequency localization  windows − u , u × −  u ,  u , where u ∈ (L 1 ∩ L 2 )(R), as asserted by the following so-called “uncertainty principle”.   Theorem 3 Uncertainty principle Let u ∈ L 1 ∩L 2 (R) with Fourier transform  u (ω). Then 1 (7.4.10) u  u ≥ , 2 where u or  u may be infinite. Furthermore, equality in (7.4.10) holds if and only if (7.4.11) u(x) = cgσ (x − b) for σ > 0, b ∈ R, and c = 0. In other words, the Gaussian function is the only time-window function that provides optimal time-frequency localization. The proof of Theorem 3 is delayed to the end of this section. After discussing the area of the time-frequency localization windows, we now return to study the localized (or short-time) Fourier transform (Fu f )(x, ω). To recover f (x) from (Fu f )(x, ω), we have the following result.

356

7 Fourier Time-Frequency Methods

    u ∈ L 1 ∩ L 2 (R) such Theorem 4 Let u ∈ L 1 ∩ L 2 (R) with Fourier transform    that u(0) = 0. Then for any f ∈ L 1 ∩ L 2 (R) with  f ∈ L 1 (R), f (x) =



1 1 u(0) 2π

∞ −∞

(Fu f )(x, ω) ei xω dω.

(7.4.12)

The derivation of (7.4.12) follows from Theorem 1 by multiplying both sides of (7.4.5) with ei xω and then taking the integral; namely, 1 2π





−∞

(Fu f )(x, ω)e

i xω

1 2π 1 2π 1 2π 1 2π 1 2π





(F#u   f )(x, ω)dω  ∞

  i xξ 1   u − (ξ − ω) dω dξ = f (ξ)e 2π −∞ −∞  ∞

1  ∞  =  u (y)dy dξ f (ξ)ei xξ 2π −∞ −∞  ∞

1  ∞  =  u (y)ei0y dy dξ f (ξ)ei xξ 2π −∞ −∞  ∞  = f (ξ)ei xξ u(0)dξ −∞

1  ∞  = u(0) f (ξ)ei xξ dξ = u(0) f (x), 2π −∞

dω =

−∞  ∞

where Theorem 4 on p.332 is applied to both  u and  f which are assumed to be in  L 1 (R). Remark 3 If the Gaussian function gσ (x) defined in (7.1.13) on p.324 is used as the gσ are both in (L 1 ∩ L 2 )(R) window function u(x) in Theorem 4, then since gσ and  and since 1 gσ (0) = √ , 2 πσ √ it follows that the choice of σ = 1/(2 π) in (7.4.12) yields 1 f (x) = 2π





−∞

(G f )(x, ω)ei xω dω

for all f ∈ L 1 (R) with  f ∈ L 1 (R), where G f is defined as follows. Definition 4 Gabor transform



The integral transform 

(G f )(x, ω) =

(7.4.13)

∞ −∞

f (t)e−π(t−x) e−itω dt 2

(7.4.14)

7.4 Time-Frequency Localization

357

of functions f ∈ L 1 (R) is called the Gabor transform of f (x) at the (x, ω) position of the time-frequency domain R2 . Hence, every f ∈ L 1 (R) with Fourier transform  f ∈ L 1 (R) can be recovered from its Gabor transform (G f )(x, ω) by applying the formula (7.4.13).  We end this section by providing the proof of Theorem 3. Proof of Theorem 3 To prove this theorem, we may assume, without loss of generality, that the centers of u(x) and  u (ω) are x  = 0 and ω  = 0, respectively (see Exercise 1). Hence,

2

u  u

1 = u 22  u 22





−∞

x |u(x)| d x 2

2





−∞

ω 2 | u (ω)|2 dω .

(7.4.15)

In (7.4.15), we may apply Theorem 1 (iv) on p.320 and Parseval’s identity on p.330 to conclude that  ∞ ω 2 | u (ω)|2 dω = u 2 = 2π u  2 . (7.4.16) 2

−∞

2

In addition, in the denominator of (7.4.15), we have  u 22 = 2π u 22 , again by Parseval’s identity. Therefore, applying the Cauchy-Schwarz inequality, we obtain  ∞  ∞ 2

−4 2 = u |xu(x)| |u  (x)|2 d x u  u 2 −∞ −∞

 ∞ 2  (x)|d x ≥ u −4 |xu(x)u 2 −∞  ∞ 2   (x)}d x  . ≥ u −4 Re {xu(x)u  2 −∞

(7.4.17) (7.4.18)

But since x

d d |u(x)|2 = x u(x)u(x) dx dx   = x u(x)u  (x) + u  (x)u(x) = 2Re {xu(x)u  (x)},

the right-hand side of (7.4.18) can be written as 1 

 d  2 x |u(x)|2 d x 2 −∞ d x  ∞  ∞

2 1 −4 2 x|u(x)| = u 2 − |u(x)|2 d x −∞ 4 −∞ 1 1 −4 4 = u 2 u 2 = , 4 4

u −4 2



(7.4.19)

358

7 Fourier Time-Frequency Methods

1 and hence, u  u ≥ 2 . In both (7.4.16) and (7.4.19), we have assumed that u ∈ PC(R). In addition, since u ∈ L 2 (R) or |u|2 ∈ L 1 (R), the function |u(x)|2 must decay to 0 faster than x1 , when |x| → ∞. That (7.4.16) and (7.4.19) are valid for any u ∈ L 1 ∩ L 2 (R) follows from a standard “density” argument of

  closure L 2 PC(R) = L 2 (R). Finally, if the inequality in (7.4.18) becomes equality, we recall from the theorem on the Cauchy-Schwarz inequality (see Theorem 3 on p.27) that |xu(x)| = r |u  (x)|

(7.4.20)

± Re xu(x)u  (x) = |xu(x)u  (x)|.

(7.4.21)

xu(x) = r u  (x)eiθ(x)

(7.4.22)

for some constant r > 0 and

From (7.4.20), we have

for some real-valued function θ(x). Hence, by (7.4.21) together with (7.4.22), we may conclude that ±Re r |u  (x)|2 eiθ(x) = r |u  (x)|2 . Thus ±Re(eiθ(x) ) = 1, which implies that ±eiθ(x) is the constant function 1. Therefore, (7.4.22) becomes 1 u  (x) −1 u  (x) = x or = x, u(x) r u(x) r or equivalently, u(x) =  ce x

2 /2r

or u(x) =  ce−x

ce x But since r > 0 and u(x) ∈ L 1 (R), u(x) cannot be  function 2 2 u(x) =  ce−α x

2 /2r

2 /2r

.

and must be the Gaussian

with α2 = 2r1 > 0. In the above argument, we have assumed that the center x  of the time-window function u(x) is x  = 0. Therefore, in general, u(x) can be formulated as u(x) = cgσ (x − b) for σ =

1 2α

and some c = 0, x  = b ∈ R.



7.4 Time-Frequency Localization

359

Exercises Exercise 1 Let x  be the center of a localization window function u(x) as defined by (7.4.3). Show that the center of the window function  u (x) = u(x + x  ) is at the origin x = 0. Exercise 2 Show that if u ∈ L 2 (R) and xu(x) ∈ L 2 (R), then xu(x)2 ∈ L 1 (R). Exercise 3 Show that (Fu)(ω) ¯ = u (−ω). Exercise 4 Let m = 0, 1, 2, . . .. Compute





−∞

x m e−x d x. 2

Consider the cases of even and odd m, and for even m, refer to the calculation Hint: ∞ 2 2 of x e−αx d x for gσ . −∞

Exercise 5 Let h(x) = e−ax , a > 0. 2

(a) Compute h . (b) Compute  h. 1 (c) Show that h  h = 2. Exercise 6 Let gc (x, t) be defined by x2 1 e− 4ct , gc (x, t) = √ 2 ctπ

where c > 0 is fixed and t > 0 is considered as the time variable. ∂ (a) Compute gc (x, t). ∂t ∂ gc (x, t). (b) Compute ∂x 2 ∂ (c) Compute gc (x, t). ∂x 2 ∂ ∂2 (d) Derive the relationship between gc (x, t) and gc (x, t). ∂t ∂x 2 Exercise 7 Let x  be the center of a localization window function u(x), as introduced in Definition 3. Verify that 

∞ −∞

2  (x − x  )u(x) d x = 0.

Exercise 8 Let h 1 (x) = χ[0,1] (x), and for n = 2, 3, . . ., define h n (x) recursively by 



h n (x) = h n−1 ∗ h 1 (x) =





−∞

 h n−1 (t − x)h 1 (t)dt = 0

1

h n−1 (t − x)dt.

360

7 Fourier Time-Frequency Methods

Prove by mathematical induction that the functions Hn (x) = h n (x + n2 ) are even, for all n = 1, 2, . . .. Hint: If f (x) is even, then 

a −a

 f (t − x)dt =

a −a

f (t + x)dt,

which can be derived by the change of variables of integration from t to u = −t. Exercise 9 As a continuation of Exercise 8, determine the center of the localization window function h n (x) for each n = 1, 2, . . .. Exercise 10 Show that the time-frequency window for the localization window function h n (x) has a finite area for n ≥ 2 by showing that h n  h n < ∞. Exercise 11 (This is a very difficult problem and is included only for the very advanced students.) Let an = h n  h n . Prove that a2 > a3 > · · · > an →

1 , 2

as n → ∞. In other words, although h n (x) is not the Gaussian, it provides nearoptimal time-frequency localization as governed by the uncertainty principle (in Theorem 3) for sufficiently large n.

7.5 Time-Frequency Bases As already established in Theorem 1 on p.352 in the previous section, computation of the localized (or short-time) Fourier transform (Fu f )(x, ω) of f ∈ L 1 (R) provides the quantity of the localized transform (F#u   f )(x, ω) of the frequency content  f (ω)  u (−ξ). On the other hand, f (x) can be recovered with window function u (ξ) =  from (Fu f )(x, ω) by the formula (7.4.12) on p.356. In particular, when the window function is the Gaussian function, then the localized Fourier transform is the Gabor f ∈ L 1 (R) can transform (G f )(x, ω), and every f ∈ L 1 (R) with Fourier transform  be recovered from its Gabor transform by applying the formula (7.4.13) on p.356. In this section, we study the time-frequency analysis by considering sampling of the LFT (or STFT) (Fu¯ f )(x, ω) of f ∈ (L 1 ∩ L 2 )(R), where the complex conjugate u(x) of u ∈ (L 1 ∩ L 2 )(R) is used as the window function. Let Z denote the set of all integers. By sampling (Fu¯ f )(x, ω) at (x, ω) = (m, 2πk), with m, k ∈ Z, we have

7.5 Time-Frequency Bases

361

 (Fu¯ f )(m, 2πk) = =



−∞  ∞ −∞

f (x)u(x − m) e−i2πkx d x f (x)h m,k (x)d x = f, h m,k ,

with h m,k (x) defined by h m,k (x) = u(x − m) ei2πkx .

(7.5.1)

Remark 1 Since e−i2πkx = e−i2πk(x−m) , the functions h m,k (x) in (7.5.1) can be formulated as (7.5.2) h m,k (x) = Hk (x − m), with Hk (x) defined by

Hk (x) = u(x) ei2πkx .

Hence, while Hk (x) localizes the frequency k ∈ Z of f (x) only at the time sample point m = 0, h m,k (x) localizes the same frequency of f (x) at any time instant m ∈ Z.  Remark 2 On the other hand, if the frequency k ∈ Z in the definition (7.5.1) is not an integer, such as h m,kb = u(x − m) ei2πbkx , where b ∈ Z, then (7.5.2) does not apply. Later, we will also consider (Fu¯ f )(ma, 2πkb) = f, h ma,kb

with a, b > 0 and

h ma,kb (x) = u(x − ma) ei2πkbx .

(7.5.3) 

Example 1 Let u(x) be the characteristic function of [− 21 , 21 ); that is,  u(x) = χ[− 1 , 1 ) (x) = 2 2

1, for − 21 ≤ x < 21 , 0, otherwise.

Then the family {h m,k (x)}, m, k ∈ Z defined by (7.5.1), constitutes an orthonormal basis of L 2 (R). Solution For each m ∈ Z,  h m,k , h m, =

m+ 21

m− 21

ei2π(k−)x d x = δk− .

362

7 Fourier Time-Frequency Methods

For all k,  ∈ Z and m = n,

h m,k , h n, = 0,

since the supports of h m,k and h n, do not overlap. Hence, h m,k , h n, = δm−n δk− ; or {h m,k (x)}, m, k ∈ Z, is an orthonormal family. In addition, for any f ∈ L 2 (R), by setting f m (x) = f (x + m), where − 21 ≤ x ≤ 1 2 , namely: f m (x) = u(x) f (x + m) = χ[− 1 , 1 ) (x) f (x + m), − 2 2

1 1 ≤x≤ , 2 2

and extending it to R periodically such that f m (x + 1) = f m (x), then for each fixed m ∈ Z, the Fourier series (S f m )(x) =



ak ( f m ) ei2πkx

k=−∞

of f m (x) converges to f m (x) in L 2 [− 21 , 21 ], where  ak ( f m ) =

− 21

 = =

1 2

f m (t) e

m+ 21

m− 1  ∞2 −∞

−i2πkt

 dt =

1 2

− 12

f (t + m) e−i2πkt dt

f (y) e−i2πk(y−m) dy

f (t)u(t − m) e−i2πkt dt = f, h m,k .

Thus, for each m ∈ Z, we have   f (x)u(x − m) = f m (x − m)u(x − m) = (S f m )(x − m) u(x − m) =





ak ( f m ) ei2πk(x−m) u(x − m) =

k=−∞

f, h m,k h m,k (x).

k=−∞

Hence, summing both sides over all m ∈ Z yields f (x) = =

∞ m=−∞ ∞ m=−∞

f (x)χ[m− 1 ,m+ 1 ) (x) = 2

2

f (x)u(x − m) =



f (x)χ[− 1 , 1 ) (x − m)

m=−∞ ∞ ∞ m=−∞ k=−∞

2 2

f, h m,k h m,k (x),

7.5 Time-Frequency Bases

363

where the convergence is in the L 2 (R)-norm (see Exercise 1).



Remark 3 The limitation of the window function u(x) = χ[− 1 , 1 ) (x) in the above 2 2 example is that it provides very poor frequency localization, as discussed in Example 1 on p.353. Unfortunately, the formulation of h m,k (x) in (7.5.1) cannot be improved by too much, as dictated by the so-called “Balian-Low” restriction to be stated in Theorem 2 later in this section (see also Remark 5).  Definition 1 Frame A family of functions {h α (x)} in L 2 (R), α ∈ J , where J is some infinite index set, such as Z and Z2 , is called a frame of L 2 (R), if there exist constants A and B, with 0 < A ≤ B < ∞, such that | f, h α |2 ≤ B f 22 , (7.5.4) A f 22 ≤ α∈J

for all f ∈ L 2 (R). Here, A and B are called frame bounds. Furthermore, if A and B can be so chosen that A = B in (7.5.4), then {h α } is called a tight frame. Remark 4 If {h α }, α ∈ J , is a tight frame with frame bound A = B, then the family { h α (x)}, defined by 1  h α (x) = √ h α (x), α ∈ J, A satisfies Parseval’s identity f 22 =



| f,  h α |2 , f ∈ L 2 (R).

(7.5.5)

α∈J

 Recall that an orthonormal basis of L 2 (R) also satisfies Parseval’s identity (7.5.5). To understand the identity (7.5.5) for the tight frame {h α }, let us consider the function f (x) = h α0 (x) for any fixed index α0 ∈ J , so that  h α0 22 =



|  h α0 ,  h α |2

α∈J

=  h α0 42 +



|  h α0 ,  h α |2

α=α0

or

h α0 22 = |  h α0 ,  h α |2 .  h α0 22 1 −  α=α0

Since the right-hand side is non-negative, we see that

364

7 Fourier Time-Frequency Methods

 h α0 22 ≤ 1. h α0 , h α = 0 for In addition, if  h α0 2 = 1, then the right-hand side vanishes, or  all α = α0 . Thus if  h α 2 = 1 for each α ∈ J , then { h α }α∈J is an orthonormal family. Furthermore, observe that any frame {h α } of L 2 (R) is complete in L 2 (R), meaning that span{h α : α ∈ J } is “dense” in L 2 (R) (more precisely, the completion of span{h α : α ∈ J } is L 2 (R)). Indeed, if {h α } is not complete in L 2 (R), then there would exist some non-trivial f ∈ L 2 (R) which is orthogonal to each h α , violating the lower-bound frame condition, in that | f, h α |2 = 0. 0 < A f 22 ≤ α∈J

Let us summarize the above derivations in the following theorem. Theorem 1 Let {h α }, α ∈ J , be a tight frame of L 2 (R) with frame bound A = B = 1. Then (a) h α 2 ≤ 1 for all α ∈ J ; (b) the L 2 (R) closure of span{h α : α ∈ J } is the entire L 2 (R) space; (c) if h α 2 = 1 for all α ∈ J , then {h α } is an orthonormal basis of L 2 (R). We now return to the study of time-frequency analysis by stating the BalianLow restriction as in Theorem 2 below and a sampling criterion for achieving good time-frequency localization, in Theorem 3(c) later in this section. Theorem 2 Balian-Low restriction Let {h m,k (x)}, (m, k) ∈ Z2 , be defined by (7.5.1) with window function u(x) ∈ (L 1 ∩ L 2 )(R). Then a necessary condition for {h m,k (x)} to be a frame of L 2 (R) is that at least one of 

∞ −∞

 |xu(x)|2 d x and



−∞

|ω u (ω)|2 dω

is equal to ∞. Remark 5 Recall from Theorem 1 (iv) on p.320 and Theorem 1 on p.330 that 



−∞

|ω  f (ω)|2 dω =



∞ −∞



= 2π

|(F f  )(ω)|2 dω ∞

−∞

| f  (x)|2 d x.

Hence, if u(x) is the window function in {h m,k (x)} with finite window width (that is, u < ∞), then for {h m,k (x)} to be a frame of L 2 (R), it is necessary that

7.5 Time-Frequency Bases

365





−∞

|u  (x)|2 d x = ∞

according to the Balian-Low restriction in Theorem 2. Consequently, any compactly n supported smooth function, such as any B-spline Hn (x) = h n (x + ) for n ≥ 2 2 (defined in Exercise 8 of the previous section), cannot be used as the window function u(x) in (7.5.1) to achieve good time-frequency localization.  On the other hand, we have the following result. Theorem 3 Let {h ma,kb (x)}, (m, k) ∈ Z2 , be defined by (7.5.3) with window function u ∈ (L 1 ∩ L 2 )(R), where a, b > 0. (a) For ab > 1, there does not exist any window function u(x) for which {h ma,kb (x)}, (m, k) ∈ Z2 , is complete in L 2 (R). (b) For ab = 1 (such as a = 1 and b = 1 in (7.5.1)), there exists u(x) ∈ (L 1 ∩L 2 )(R) such that {h m,k (x)} is a frame (such as an orthonormal basis in Example 1), but the time-frequency window [−u , u ] × [− u ,  u] necessarily has infinite area, namely: u  u = ∞. (c) For 0 < ab < 1, there exists u ∈ (L 1 ∩ L 2 )(R) such that u  u < ∞ and the corresponding family {h ma,kb (x)}, (m, k) ∈ Z2 , is a tight frame of L 2 (R). Remark 6 Let a and b in (7.5.3) be restricted by 0 < ab < 1 to achieve good timefrequency localization, as guaranteed by Theorem 1(c). Then the frame {h ma,kb (x)} of L 2 (R) in (7.5.3), with window function u(x) (that satisfies u  u < ∞) cannot be formulated as the translation (by mb, m ∈ Z) of some localized function Hkb (x) = u(x) ei2πkbx as in (7.5.2) for h m,k (x), where a = b = 1. Consequently, the computational aspect of time-frequency analysis is much less effective.  The good news is that by replacing ei2πkx with the sine and/or cosine functions to formulate a frequency basis, such as ck (x) =



1 2 cos(k + )πx, k = 0, 1, . . . 2

(7.5.6)

(that is, the family W3 introduced in (6.2.17) of Sect. 6.2 on p.284 with d = 1, as the orthonormal basis of L 2 [0, 1] for the Fourier cosine series of type II), then localization by window functions u(x) with finite time-frequency windows (that is,

366

7 Fourier Time-Frequency Methods

c u  u < ∞ ) is feasible by formulation of the local basis functions h mn (x) as integer translates, such as

h cm,k (x) = Hkc (x − m) = u(x − m) ck (x − m), m, k ∈ Z, of the localized frequency basis Hkc (x) = u(x) ck (x), k ∈ Z (compare this with (7.5.2)). Recall from Theorem 5 on p.284 that ck (x), k = 0, 1, . . . defined by (7.5.6) constitute an orthonormal basis of L 2 [0, 1]. In this chapter, we will only focus on those computational efficient window functions u(x) that satisfy the admissible conditions to be defined as follows: Definition 2 Admissible window function A function u(x) is said to be an admissible window function, if it satisfies the following conditions: (i) there exists some positive number δ, with 0 < δ < 21 , such that u(x) = 1, for x ∈ [δ, 1 − δ], and u(x) = 0, for x ∈ [−δ, 1 + δ]; (ii) 0 ≤ u(x) ≤ 1; (iii) u(x) is symmetric with respect to x = 21 , namely: 1 1 u( − x) = u( + x), for all x; 2 2 (iv) both u(x) and u  (x) are piecewise continuous on [−δ, 1 + δ]; and (v) u 2 (x) + u 2 (−x) = 1, for x ∈ [−δ, δ]. (See Fig. 7.2 for the graphical display of a typical admissible window function u(x) to be discussed in Example 2.) Remark 7 It follows from (i), (ii) and (iv) that 

∞ −∞

 x 2 u 2 (x)d x < ∞ and



−∞

|u  (x)|2 d x < ∞,

so that by Theorem 1 (iv) on p.320 and Parseval’s theorem on p.330, 



  2  Fu  dω = 1 ω | u (ω)| dω = 2π −∞ −∞ ∞

2

2







−∞

|u  (x)|2 d x < ∞.

7.5 Time-Frequency Bases

367

1 0.8 0.6 0.4 0.2 0 −0.2

0

0.2

0.4

0.6

0.8

1

1.2

Fig. 7.2. Admissible window function u(x) defined in (7.5.8)

Hence, u(x) provides good time-frequency localization, meaning that u  u < ∞. Furthermore, conditions (i), (iii) and (v) assure that ∞

u 2 (x − m) = 1, for allx ∈ R

(7.5.7)

m=−∞



(see Exercise 4) Example 2 Let 0 < δ
1 + δ; −δ ≤ x < 0; 0 ≤ x < δ; δ ≤ x ≤ 1 − δ;

(7.5.8)

1 − δ < x ≤ 1; 1 < x ≤ 1 + δ,

with the graph of y = u(x) shown in Fig. 7.2. Verify that u(x) is an admissible window function. Solution The verification of (i), (ii), (iii) and (v) for u(x) is straightforward. Here we only verify that u  (x) exists for any x ∈ R, which is reduced to the points x0 = −δ, 0, δ, 1 − δ, 1, 1 + δ since it is clear that u  (x) exists elsewhere. Since u(x) is continuous (see Exercise 6), to verify that u  (x0 ) exists, it is enough to show that the left-hand and right-hand limits of u  (x) at x0 exist and are equal.

368

7 Fourier Time-Frequency Methods

For x0 = −δ, clearly lim u  (x) = lim 0 = 0.

x→−δ −

x→−δ −

On the other hand, lim u  (x) = lim

x→−δ +

x→−δ +

π 1 π π 1 π =√ cos(− ) = 0. √ cos( x) 2δ 2δ 2 2 2 2δ

Thus u  (−δ) exists. For x0 = 0, π 1 π π 1 π =√ cos 0 = √ ; lim u  (x) = lim √ cos( x) − − 2δ 2δ x→0 x→0 2 2 2δ 2 2δ

− 1    1 π π π π  2 2 1 − 1 − sin x 1 − sin x cos x lim u  (x) = lim + + 2 2δ 2δ 2δ x→0 x→0 4δ π 1 − 21 π = = √ . 4δ 2 2 2δ Therefore u  (0) exists. The verification of the existence of u  (x) at x0 = δ is left as an exercise (see Exercise 7). The existence of u  (x) at x0 = 1−δ, 1, 1+δ follows from the symmetry of u(x).  Theorem 4 Malvar wavelets Let u(x) be an admissible window function that satisfies the conditions (i)–(v) in Definition 2. Then u(x) has the property u  u < ∞,

(7.5.9)

and the family {ψm,k (x)} of functions defined by ψm,k (x) = u(x − m)ck (x − m) √   1 = 2u(x − m) cos (k + )π(x − m) , m ∈ Z, k ≥ 0, (7.5.10) 2 where ck (x) is defined in (7.5.6), constitutes an orthonormal basis of L 2 (R). Remark 8 The functions ψm,k (x) are often called “Malvar wavelets” in the literature. For the discrete time-frequency setting, with ck (x) in (7.5.6) replaced by the DCT, the discrete formulation of ψm,k (x) in (7.5.10) provides an enhanced version of block DCT since the window function u(x) “smooths” the boundary of the tiles and therefore reduces “blocky artifacts”, particularly for 8 × 8 tiles for the JPEG image compression standard as studied in Sect. 5.4 of Chap. 5. 

7.5 Time-Frequency Bases

369

1.5

1

0.5

0

−0.5

−1

−1.5

−0.2

0

0.2

0.4

0.6

0.8

1

1.2

0

0.2

0.4

0.6

0.8

1

1.2

1.5

1

0.5

0

−0.5

−1

−1.5

−0.2

Fig. 7.3. Malvar wavelets ψ0,4 (top) and ψ0,16 (bottom). The envelope of cos(k + 21 )πx is the dotted √ graph of y = ± 2u(x)

Let u(x) be the admissible window function given by (7.5.8) with δ = display the graphs of u(x), ψ0,4 and ψ0,16 in Fig. 7.3.

1 4.

We

Proof of Theorem 4 By Theorem 1 or Theorem 4 on p.46, to show that ψm,k , m ∈ Z, k ≥ 0 constitute an orthonormal basis for L 2 (R), it is sufficient to show ψm,k 2 = 1 and that Parseval’s identity f 22 =

∞ m∈Z k=0

holds.

| f, ψm,k |2 , for all f ∈ L 2 (R),

(7.5.11)

370

7 Fourier Time-Frequency Methods

Observe that by the symmetry of u(x) and the property of the cosine function, we have u(1 − x) = u(x), u(2 − x) = u(x − 1), ck (−x) = ck (x), ck (2 − x) = −ck (x). In addition, by property (v) of u(x) and symmetry of u(x), we also have u 2 (x) + u 2 (2 − x) = 1, for|x − 1| ≤ δ

(7.5.12)

(see Exercise 5). Thus, ψm,k 22 = ψm,k , ψm,k = ψ0, , ψ0,k

 1+δ ck2 (x)u 2 (x)d x = −δ

=

 

= 0





0

+

ck2 (x) 

+

1 1−δ



1

=

1

 +

1+δ

ck2 (x)u 2 (x)d x

−δ 0 1 δ ck2 (−y)u 2 (−y)dy δ

=

0

0





1

+ 0

 ck2 (x)u 2 (x)d x + 



u (−x) + u (x) d x + 2

2

1−δ δ

1 1−δ

ck2 (2 − z)u 2 (2 − z)dz

ck2 (x)u 2 (x)d x

  ck2 (x) u 2 (x) + u 2 (2 − x) d x

ck2 (x) · 1d x = 1,

where the second equality follows from properties (i) and (v) of u(x) and (7.5.12), and the fifth equality is a matter of change of variables of integration (with y = −x and z = 2 − x). Next let us verify Parseval’s identity (7.5.11). In this regard, for f ∈ L 2 (R), set f m (x) = f (x + m)u(x), where m ∈ Z. Then we have  ∞ f (x)u(x − m)ck (x − m)d x f, ψm,k =  = =

−∞



−∞

 0 −δ



f (x + m)u(x)ck (x)d x = 

1

+ 0



1+δ

+ 1



1+δ

−δ

f m (x)ck (x)d x

f m (x)ck (x)d x

7.5 Time-Frequency Bases



δ

= 0

 =

371



1

f m (−x)ck (−x)d x +

 f m (x)ck (x)d x +

0

1

1

f m (2 − x)ck (2 − x)d x

1−δ

f m (−x)χ[0,δ] (x) + f m (x) − f m (2 − x)χ[1−δ,1] (x) ck (x)d x,

0

where the fact ck (−x) = ck (x), ck (2 − x) = −ck (x) has been used. Since {ck (x) : k = 0, 1, . . .} is an orthonormal basis of L 2 [0, 1], by Parseval’s identity applied to this basis, we have ∞

| f, ψm,k |2

k=0



= 0

 =

1

2    f m (−x)χ[0,δ] (x) + f m (x) − f m (2 − x)χ[1−δ,1] (x) d x

1 

      f m (−x)2 χ[0,δ] (x) + 2Re f m (−x) f m (x) χ[0,δ] (x) +  f m (x)2

0

 2   −2Re f m (x) f m (2 − x) χ[1−δ,1] (x) +  f m (2 − x) χ[1−δ,1] (x) d x  1  1  δ        f m (−x)2 d x + Im +  f m (x)2 d x − IIm +  f m (2 − x)2 d x = 0

 =

0

   f m (y)2 d x + Im +

−δ 1+δ

 =

−δ ∞

 =

−∞

0 1



  f m (x)2 d x − IIm +

1−δ 1+δ 



0

  f m (y)2 dy

1

   f m (x)2 d x + Im − IIm

   f (x)u(x − m)2 d x + Im − IIm ,

where 

δ

Im =

  2Re f m (−x) f m (x) d x;

0

 IIm =

1

  2Re f m (x) f m (2 − x) d x.

1−δ

Observe that  Im+1 =

δ

  2Re f (m + 1 − x) f (m + 1 + x)u(−x)u(x) d x

0

 =

1

1−δ 1

 =

1−δ

  2Re f (m + y) f (m + 2 − y)u(y − 1)u(1 − y) dy   2Re f (m + y) f (m + 2 − y)u(2 − y)u(y) dy = IIm .

372

7 Fourier Time-Frequency Methods

Hence, it follows that N ∞

 N

| f, ψm,k |2 =

m=−M k=0

m=−M

∞ −∞

   f (x)u(x − m)2 d x + I−M − II N . (7.5.13)

Since f ∈ L 2 (R), I−M → 0, II N → 0 as M, N → ∞. In addition, by (7.5.7),  N

lim

M,N →∞

 =



−∞

m=−M

∞ −∞

   f (x)u(x − m)2 d x

 ∞    f (x)2 u 2 (x − m)d x = m=−∞

∞ −∞

   f (x)2 d x.

Therefore, by taking limit with M, N → ∞ on both sides of (7.5.13), we obtain (7.5.11).  Exercises Exercise 1 Provide the details in showing that

N

f (x)u(x − m) − f (x) 2 → 0

m=−N

in Example 1. Exercise 2 Let gm,k (x) = u(x − m2 )ei2πkx , where u(x) is the characteristic function of [− 21 , 21 ). Hence, g2m,k (x) = h m,k (x) in Example 1. Show that the family {gm,k (x)}, m, k ∈ Z, is a frame of L 2 (R). Hint: For the upper frame bound, consider the two sub-families g2,k (x) and 2 + 1 i2πkx )e =h ,k (x − 21 ),  ∈ Z, separately. g2+1,k (x) = u(x − 2 Exercise 3 If {h α (x)}, α ∈ J , is a tight frame of L 2 (R) with frame bound A = B, justify that the family { h α (x)} introduced in Remark 4 is a tight frame with frame bound = 1. Exercise 4 Let u(x) be an admissible window function in Definition 2. Show that u(x) satisfies the property (7.5.7) in that {u 2 (x − m)}, m ∈ Z, provides a partition of unity. Exercise 5 Show that an admissible window function u(x) satisfies (7.5.12). Exercise 6 Let u(x) be the function defined by (7.5.8). Show that u(x) is continuous on R. Exercise 7 Let u(x) be the function defined by (7.5.8). Verify that u  (δ) exists.

7.6 Appendix on Integration Theory

373

7.6 Appendix on Integration Theory The study of Fourier transform and time-frequency analysis in this chapter requires some basic results from the integration theory. For completeness, we include, in this Appendix, a brief discussion of only those theorems that are needed for our study in this book. Of course, we only present these results for piecewise continuous functions to avoid the need of Lebesgue integration and more advanced theory. The reader is referred to more advanced textbooks in Real Analysis and Integration Theory. Theorem 1 Fubini’s theorem Let J1 , J2 be two finite or infinite intervals and f (x, y) be a function defined on J1 × J2 , such that for each x ∈ J1 , g(y) = f (x, y) is in PC(J2 ) and for each y ∈ J2 , h(x) = f (x, y) is in PC(J1 ). Suppose that either f (x, y) ≥ 0 on J1 × J2 , or at least one of the two integrals   J1

is,

  | f (x, y)|dy d x, | f (x, y)|d x dy, J2

J2

J1

is finite. Then the order of integration over J1 and J2 can be interchanged; that     f (x, y)dy d x = f (x, y)d x dy. J1

J2

J2

J1

 Theorem 2 Minkowski’s integral inequality Let J1 , J2 be two finite or infinite intervals and f (x, y) be a function defined on J1 × J2 , such that for each x ∈ J1 , g(y) = f (x, y) is in PC(J2 ) and for each y ∈ J2 , h(x) = f (x, y) is in PC(J1 ). Then for any 1 ≤ p < ∞,

   J1

 p 1p f (x, y)dy  d x ≤ J2

  1    f (x, y) p d x p dy. J2

(7.6.1)

J1

The integrals in (7.6.1) are allowed to be infinite. The inequality (7.6.1) can be obtained from the Minkowski inequality for sequences. Indeed, for f ∈ C(J1 × J2 ), where J1 = [a, b] and J2 = [c, d] are bounded intervals, we may consider any partitions of J1 and J2 , namely: a = x0 < x1 < · · · < xm = b; c = y0 < y1 < · · · < yn = d. Set x j = x j − x j−1 , yk = yk − yk−1 , with j = 1, 2, . . . , m and k = 1, 2, . . . , n, and apply Minkowski’s inequality (1.2.7), on p.15 in Sect. 1.2, n − 1 times to obtain

374

7 Fourier Time-Frequency Methods m m 1

1

 n p  n 1 p p p   f (x j , yi )yi  x j = f (x j , yi )yi (x j ) p 



j=1 i=1 n m i=1

j=1 i=1 n m 1 1    1  f (x j , yi )yi (x j ) p  p p =  f (x j , yi ) p x j p yi .

j=1

i=1

j=1

Observe that the first term is a Riemann sum of the integral on the left of (7.6.1) and the last term is a Riemann sum of the integral on the right of (7.6.1). Since the limit of suitable Riemann sums is the Riemann integral, we have established (7.6.1) for continuous functions on bounded rectangles J1 × J2 . A standard density argument extends continuous functions to functions in L p and from bounded J1 × J2 to unbounded J1 × J2 .  For a sequence {an }, n = 1, 2, . . . of real numbers, its “limit inferior” is defined by lim inf an := lim n→∞



n→∞

 inf am .

m≥n

Since inf am = inf{am : m ≥ n}, n = 1, 2, . . . is a non-decreasing sequence, its m≥n

limit (which may be +∞) always exists. Clearly,   lim inf an = sup inf am = sup inf{am : m ≥ n} : n ≥ 1 . n→∞

n≥1 m≥n

Similarly, the “limit superior” of {an }, n = 1, 2, . . . is defined by lim sup an := lim



n→∞

n→∞

   sup am = inf sup{am : m ≥ n} : n ≥ 1 .

m≥n

Clearly lim inf an ≤ lim sup an . When lim an exists, then n→∞

n→∞

n→∞

lim inf an = lim sup an = lim an . n→∞

n→∞

n→∞

Theorem 3 Fatou’s lemma Let { f n (x)} be a sequence of piecewise continuous functions on a finite or infinite interval J with f n (x) ≥ 0, x ∈ J, n = 1, 2, . . .. Then 

 lim inf f n (x)d x ≤ lim inf

J n→∞

n→∞

f n (x)d x,

(7.6.2)

J

where the integrals are allowed to be infinite. Theorem 3 is called Fatou’s lemma in Real Analysis. Its proof is beyond the scope of this book. In Real Analysis, the piecewise continuity assumption on f n (x), n = 1, 2, · · · is replaced by the general assumption that f n (x) are measurable functions.

7.6 Appendix on Integration Theory

375

In this case the pointwise limit f (x) = lim f n (x), x ∈ J , is also measurable. In n→∞

this elementary textbook, we assume that f (x) is in PC(J ). Let { f n (x)} be a sequence of piecewise continuous functions on an interval J . If there is g(x) ∈ L 1 (J ) such that f n (x) ≤ g(x), n = 1, 2, . . . , for almost all x ∈ J , then applying Fatou’s lemma to the sequence g(x)− f n (x) ≥ 0, we have   f n (x)d x ≤

lim sup n→∞

J

lim sup f n (x)d x,

(7.6.3)

J n→∞

since inf {− f m (x)} = − sup { f m (x)}. m≥n



m≥n

Theorem 4 Lebesgue’s dominated convergence theorem Let J be a finite or infinite interval, and { f n (x)} be a sequence of functions in L 1 (J ). Suppose that lim f n (x) = f (x) for almost all x ∈ J , and that { f n (x)} is dominated by a function

n→∞

g(x) ∈ L 1 (J ), namely: | f n (x)| ≤ g(x), n = 1, 2, . . . , x ∈ J. Then the order of integration and taking the limit can be interchanged; that is, 

 f n (x)d x =

lim

n→∞

f (x)d x.

J

(7.6.4)

J

Theorem 4 is also valid for a family of functions f h (x) = q(x, h) with a continuous parameter h. More precisely, let a ∈ R and suppose that lim h→a q(x, h) = f (x) for almost all x ∈ J , and that there is g(x) ∈ L 1 (J ) and δ0 > 0 such that  |q(x, h)|d x < ∞, |q(x, h)| ≤ g(x), h ∈ (a − δ0 , a + δ0 ) J

for almost all x ∈ J , then 

 q(x, h)d x =

lim

h→a

J

f (x)d x. J

The proof of Theorem 4 is based on Fatou’s lemma. Indeed, considering the sequence | f n (x) − f (x)|, n = 1, 2, . . ., we have | f n (x) − f (x)| ≤ 2g(x), n = 1, 2, . . . . Thus by (7.6.3), we have

376

7 Fourier Time-Frequency Methods

  lim sup | f n (x) − f (x)|d x ≤ lim sup | f n (x) − f (x)|d x n→∞ J J n→∞  lim | f n (x) − f (x)|d x = 0, = J n→∞

where the last equality follows from the assumption lim f n (x) = f (x). Thus, n→∞

lim

n→∞ J

| f n (x) − f (x)|d x exists and is zero, which implies (7.6.4).



Theorem 5 Lebesgue’s integration theorem Let 1 ≤ p < ∞. If f ∈ L p (R), then  ∞    f (x + t) − f (x) p d x = 0. (7.6.5) lim t→0 −∞

The statement remains valid if R is replaced by a finite interval. Theorem 5 is concerned with the L p continuity and is called Lebesgue’s integration theorem here. Its proof is provided below. For an arbitrarily given  > 0, there is a b > 0 such that

Proof of Theorem 5 

 | f (x)| d x < , p

|x|>b

|x|>b

| f (x + t)| p d x < ,

where the second inequality holds for all sufficiently small t, say |t| < 1. Hence, we may restrict our attention to the interval [−b − 1, b + 1]. Since a function f ∈ L p [−b − 1, b + 1] can be approximated arbitrarily close by continuous functions, we may assume that f is continuous on [−b − 1, b + 1]. Since continuous functions on a closed and bounded interval are uniformly continuous, there exists some δ, 0 < δ < 1, such that | f (x) − f (y)| ≤

 1 p , for all x, y ∈ [−b − 1, b + 1], |x − y| < δ. 2b

Thus for |t| < δ, we have 

b

−b

 | f (x + t) − f (x)| p d x ≤

b −b

 d x = ; 2b

from which it follows that  ∞    f (x + t) − f (x) p d x −∞



=

|x|>b

 p | f (x + t)| + | f (x)| d x +

 |x|≤b

   f (x + t) − f (x) p d x

7.6 Appendix on Integration Theory

 ≤

  2 | f (x + t)| p + | f (x)| p d x + p

|x|>b p

377



b

−b

| f (x + t) − f (x)| p d x

≤ 2 (2) +  = (2 p+1 + 1), where the inequality (a + c) p ≤ 2 p (a p + c p ), for any a, c ≥ 0, has been used in the above derivation. This completes the proof of (7.6.5). 

Chapter 8

Wavelet Transform and Filter Banks

To adopt the Fourier basis functions eiωx or einx , cos nx, and sin nx, where ω ∈ R and n ∈ Z, for investigating the frequency contents of functions f (x), for f ∈ L 1 (R) or f ∈ PC ∗ [a, b], in some desirable neighborhood of any x = x0 , called “region of interest (ROI)” in engineering applications, a suitable window function is used to introduce the localized Fourier transform (LFT) and to construct time-frequency basis functions in Sects. 7.4 and 7.5, respectively, of the previous chapter. In this chapter, we consider the approach of replacing the Fourier basis functions by a very general function ψ ∈ L 2 (R) with integral over (−∞, ∞) equal to zero, such that ψ(x) → 0 for x → ±∞. Hence, there is no need to introduce any window function any more. Just as the LFT in Definition 1 on p.351, where the window function u(t) is allowed to slide along the entire “time-domain” R = (−∞, ∞), we will also translate ψ(t) along R by considering ψ(t − b), where b ∈ R. On the other hand, instead of considering the frequency modulation u(t − x)e−iωt of the sliding window u(t −x) in (7.4.1) of Sect. 7.4 as the integral kernel for the LFT operator Fu , a positive parameter a > 0, called “scale”, is used to introduce the family of functions ψb,a (t) =

1 t − b ψ , a a

where the normalization by a1 multiplication is used to preserve L 1 (R) norm, namely: ||ψb,a ||1 = ||ψ||1 for all a > 0 and b ∈ R. The functions ψb,a (t) of the two-parameter family {ψb,a } are called “wavelets” generated by a single wavelet function ψ ∈ L 2 (R). Moreover, analogous to the definition of LFT Fu on p.351, with integral kernel u(t − x)e−iωt , the complex conjugate of the wavelet family {ψb,a } is used as the integral kernel to introduce the “wavelet transform” Wψ , defined by

C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_8, © Atlantis Press and the authors 2013

379

380

8 Wavelet Transform and Filter Banks

1 (Wψ f )(b, a) =  f, ψb,a  = a





−∞

f (t)ψ

t − b dt a

for functions f ∈ L 2 (R). Hence, the wavelet transform Wψ may be called “integral wavelet transform”, but is commonly called “continuous wavelet transform” (CWT) in the literature. The significance of the scale parameter a > 0 in the wavelet transform (Wψ f )(b, a) of f ∈ L 2 (R) will be illustrated in Sect. 8.1. In particular, the scale parameter facilitates not only the investigation of the frequency content of f (x), but also the consideration of the “zoom-in” and “zoom-out” functionality. Under the additional  assumption that the Fourier transform ψ(ω) of ψ satisfies the condition 2  ∈ L 1 (R), |ω|−1 |ψ(ω)|

it will be shown in the same first section that every function f ∈ (L 2 ∩ L ∞ )(R) can be recovered from its wavelet transform (Wψ f )(b, a) at any x ∈ R where f is continuous. On the other hand, for computational and implementational simplicity, the wavelet transform (Wψ f )(b, a) is sampled at  (b, a) =

k 1 , 2j 2j

 , j, k ∈ Z.

The consideration of the multi-scales 2− j , j ∈ Z, gives rise to the multi-level structure, called “multiresolution approximation” (MRA), in that a nested sequence of vector subspaces  V j = closure L 2 span φ(2 j x − k) : k ∈ Z of L 2 = L 2 (R) can be formulated in terms of a single function φ(x) that satisfies the so-called two-scale relation:

pk φ(2x − k), x ∈ R, φ(x) = k

for some sequence { pk } ∈ 2 , with the integral of φ over R equal to 1. Here, the subspaces V j satisfy the nested and density properties: {0} ← · · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ V2 ⊂ · · · → L 2 (R). This will be discussed in Sect. 8.2, with Cardinal B-splines as canonical examples. By considering the complementary subspace W j of V j+1 relative to V j , for each j ∈ Z, a corresponding wavelet ψ(x), formulated by

8 Wavelet Transform and Filter Banks

ψ(x) =

381



qk φ(2x − k), x ∈ R,

k

can be constructed by finding an appropriate sequence {qk } ∈ 2 . Simple examples will be provided in Sect. 8.2 to illustrate the importance of this MRA architecture. Here, together with the wavelet ψ, MRA stands for “multiresolution analysis”. The notion of discrete wavelet transform (DWT) is introduced in Sect. 8.3, by sampling the time and scale parameters of the time-scale coordinate (b, a) for the (integral) wavelet transform at the scales a = 2 j and corresponding time samples b = k/2 j , for all integers j and k as discussed above. As a consequence, when a filter pair is used to decompose a bi-infinite signal sequence into low-frequency and high-frequency components, the corresponding bi-orthogonal filter pair can be applied to perfectly reconstruct the signal. This filtering process is called wavelet decomposition and reconstruction, and the same filter pair can be applied to decompose the low-frequency components to as many scale levels as desired, with the same corresponding bi-orthogonal filter pair to reconstruct the decomposed signal components to the previous scale levels. The Haar filter pairs are used to illustrate this multi-level wavelet decomposition/reconstruction algorithm, and the lifting implementation scheme is also demonstrated by using both the Haar filters and the 5/3-tap biorthogonal filters. The topic of filter banks is studied in Sect. 8.4 to generalize and fine-tune the wavelet decomposition and reconstruction algorithm considered in Sect. 8.3, including the downsampling and upsampling operations and perfect reconstruction. The symbol notation is applied to change the convolution filtering operations to algebraic polynomial multiplications.

8.1 Wavelet Transform Let us consider a function ψ ∈ L 2 (R) that satisfies ψ(t) → 0 for t → ±∞, for which the Cauchy principal value of the integral of ψ on R exists and vanishes; that is,   ∞

PV

−∞

ψ(t)dt = lim

A→∞

A

−A

ψ(t)dt = 0.

(8.1.1)

Then ψ is called a wavelet; and by introducing two real numbers b ∈ R and a > 0, we have a two-parameter family of functions ψb,a (t) =

1 t − b ψ , a a

(8.1.2)

called wavelets generated by ψ. The word “wavelets” means “small waves” or “ondelettes” in French. In view of (8.1.1), the graph of ψ(t) is oscillating (and hence, “wavy”); and this wave dies down to 0, since ψ(t) → 0 as |t| → ∞. Moreover,

382

8 Wavelet Transform and Filter Banks

observe that ψb,a (t) zooms in to a smaller region near t = b as the parameter a tends to 0. Therefore the graphs of ψb,a (t) are indeed small or large waves, depending on small values or large values of the parameter a > 0; and this family of wavelets covers the entire “time-domain” R = (−∞, ∞) as b runs over R. Recall from Sect. 7.4 of the previous chapter that to localize the Fourier transform, a window function u(t) is introduced to define the localized Fourier transform (LFT) Fu by  ∞

(Fu f )(x, ω) =  f, U (·, x, ω) =

−∞

f (t)U (t, x, ω)dt,

(8.1.3)

where the complex conjugate of U (t, x, ω) is defined by U (t, x, ω) = u(t − x)e−itω

(8.1.4)

(see Definition 1 in Chap. 7 on p.351). In other words, as a function of the variable t, the window function u(t − x) is used to localize the (input) function f (t) near t = x and the modulation by e−itω is applied to generate the oscillations in the definition of Fu f with frequency = ω/2π H z for ω = 0. In contrast to U (t, x, ω), the wavelets ψb,a (t) defined in (8.1.2) have both the localization and oscillation features for the analysis of (input) functions f ∈ L 2 (R), when used as the “integration kernel” for the wavelet transform of f (t), defined by (Wψ f )(b, a) =  f, ψb,a  =

1 a



∞ −∞

f (t) ψ

t − b dt a

(8.1.5)

for analyzing the oscillation behavior of f (t). The localization feature is easy to understand, since ψ(t) is a window function already. However, it is important to point out that the window size (as defined in terms of the width 2ψ in Definition 3, (7.4.4) of Chap. 7 on p.352) varies since ψb,a = aψ and  ψb,a =

1 . a ψ

(8.1.6)

Hence, the wavelet transform (Wψ f )(b, a) of f (t) zooms in, as the time-window width 2ψb,a = 2aψ narrows (for smaller values of a > 0, with wider frequencywindow) and zooms out as the window width widens (for larger values of a > 0, with narrower frequency-window for analyzing high-frequency contents). Next, we must understand the feature of oscillation, or frequency. Since the translation parameter b has nothing to do with oscillation, the frequency must be governed by the scale parameter a > 0 as well. For this purpose, let us consider the single frequency signal f ω (t) = eiωt with frequency = ω/2π H z (where ω is fixed). Although f ω is not in L 2 (R), the inner product in (8.1.5) is still well-defined [for any wavelet ψ ∈ L 2 (R)], since

8.1 Wavelet Transform

383

  f ω , ψb,a  =



−∞

ψb,a (t)e−itω dt

is the Fourier transform of the function ψb,a ∈ L 2 (R) [see Definition 1 for the definition of Fourier transform for L 2 (R) on p.329]. Hence, in view of the properties (ii) and (iii) in Theorem 2 on p.323, we have, from (8.1.5), that

Wψ f ω (b, a) =  f ω , ψb,a  = Fψb,a (ω)  1 ∞ ¯  t − b  iωt e dt ψ = a −∞ a  = eiωb ψ(aω).

(8.1.7)

Example 1 Apply (8.1.7) to explore the relationship between the notions of “frequency” ω and “scale” a, by considering an appropriate wavelet ψ whose Fourier transform is formulated as the difference of two ideal lowpass filters. Solution Let us first consider a single-frequency signal gω0 (t) = d0 cos ω0 t 1 ω0 H z. with amplitude = d0 = 0 and frequency = 2π Recall from (7.2.14) on p.334 the ideal lowpass filter

h η (t) =

sin ηt , η > 0, πt

with Fourier transform given by  h η (ω) = χ[−η,η] (ω), and consider the function ψ (t) defined by ψ (t) = h 1+ (t) − h 1− (t), with Fourier transform given by  (ω) = χ[−1−,−1+) (ω) + χ(1−,1+] (ω) ψ  (0) = 0, or equivalently, by applying (8.1.8). Since 0 <  < 1, we have ψ 

∞ −∞

 (0) = 0, ψ (t)dt = ψ

so that ψ (t) is a wavelet [see (8.1.1)]. Now, applying (8.1.7), we have

(8.1.8)

384

8 Wavelet Transform and Filter Banks



  1  1  Wψ gω0 (b, a) = d0 Wψ eiω0 t (b, a) + d0 Wψ e−iω0 t (b, a) 2 2  1  iω0 b −iω0 b  = d0 e +e ψ (aω0 ) 2   = d0 (cos ω0 b) χ(−1−,−1+) + χ(1−,1+) (aω0 ) = d0 (cos ω0 b) χ(−1−,1+) (aω0 ),

(8.1.9)

since aω0 > 0 for positive values of ω0 . An equivalent formulation of (8.1.9) is that and



Wψ gω0 (b, a) = 0, for |aω0 − 1| > 

Wψ gω0 (b, a) = f 0 (b), for |aω0 − 1| < 

(where we have ignored the consideration of |aω0 − 1| =  for convenience). Hence, for small values of  > 0, we see that the relation of the scale a and frequency ω0 is 1 ≈ ω0 . a

(8.1.10)

In view of (8.1.10), it is customary to say that the scale a is inversely proportional to the frequency, although this statement is somewhat misleading and only applies to the wavelet ψ (t).  The relationship as illustrated in (8.1.10) between the scale a > 0 and frequency can be extended from a single frequency to a range of frequencies, called “frequency band”, with bandwidth determined by  ψ . (Observe that  ψ → 0 as  → 0 in Example 1.) To illustrate this concept, let us again consider some wavelet ψ as the difference of two ideal lowpass filters in the following example. Example 2 Apply (8.1.7) to explore the relationship between the scale a > 0 and “frequency bands” [d j , d j+1 ) for any d > 1 and all integers, j = 0, 1, 2, . . . , by considering some wavelet ψ as the difference of two appropriate ideal lowpass filters. Solution Let us again consider the ideal lowpass filter h η (t) with Fourier transform  h η (ω) = χ[−η,η] (ω) as in (8.1.8) of Example 1, but consider the wavelet

where d > 1, so that

ψ I (t) = h d (t) − h 1 (t),

(8.1.11)

I (ω) = χ[−d,−1) (ω) + χ(1,d] (ω) ψ

(8.1.12)

is an “ideal” bandpass filter, with pass-band = [−d, −1) ∪ (1, d]. Then for a multifrequency signal

8.1 Wavelet Transform

385 n

g(t) =

ck cos kt

(8.1.13)

k=0

with dc term c0 and ac terms with amplitudes ck for the frequency components of k/2π H z, respectively, for k = 1, . . . , n, it follows from the same computation as in Example 1 that n

ck (cos kb)χ[1,d) (ak). (Wψ I g)(b, a) = k=1

In particular, for each j = 0, . . . , logd n, n k   1

ck (cos kb)χ[1,d) j (Wψ I g) b, j = d d k=1

= ck cos kb.

(8.1.14)

d j ≤k 0 and b ∈ R, where ψ is the Fourier transform of ψ. Exercise 3 Let f ω0 be a single-frequency signal defined by f ω0 (t) = b0 sin ω0 t, with magnitude b0 = 0. Follow Example 1 to compute the wavelet transform (Wψ f ω0 )(b, a) with the wavelet ψ (t) = h 1+ (t) − h 1− (t), where h η (t) = sin ηt/πt. Exercise 4 As a continuation of Exercise 3, compute the wavelet transform (Wψ f ω0 )(b, a) with the wavelet ψ(t) = h d (t) − h 1 (t). Then compute (Wψ f )(b, a)

n for f (t) = bk sin kt. k=1

Exercise 5 Fill in the computational details in the proof of Theorem 1. Exercise 6 Fill in the computational details in the proof of Theorem 2. Exercise 7 Fill in the details in the proof of Theorem 3.

8.2 Multiresolution Approximation and Analysis In this section we introduce the notion of multiresolution approximation (MRA) and show how MRA leads to compactly supported wavelets. This section serves as an introduction of the MRA method for the construction of compactly supported wavelets, which is studied in some depth in the next section. In most applications, particularly in signal and image processing, only bandlimited functions are of interest. Hence, although the general theory and method of MRA will be studied in this section and the next two chapters under the framework of L 2 = L 2 (R), we will first introduce the concept of MRA and that of the corresponding wavelet bandpass filters, by considering ideal lowpass and ideal bandpass filters, to apply to all band-limited functions.  Let φ(ω) = χ[−π,π] (ω) denote the ideal lowpass filter. Then for any band-limited function f (x), there exists a (sufficiently large) positive integer J such that  f (ω) vanishes outside the interval [−2 J π, 2 J π].  −J ω) becomes an allpass filter of all functions Hence, the ideal lowpass filter φ(2 including f (x), with bandwidth ≤ 2 J +1 π; that is,  −J ω) =   f (ω) f (ω)φ(2 for all ω, or equivalently ( f ∗ φ J )(x) = f (x)

8.2 Multiresolution Approximation and Analysis

391

for all x, where φ J (x) = 2 J φ(2 J x) and the function φ, defined by sin πx , πx

φ(x) =

(8.2.1)

 is the inverse Fourier transform of φ(ω) = χ[−π,π] (ω). On the other hand, it follows from the Sampling Theorem (see Theorem 6 of Sect. 7.2 in Chap. 7, on p.336) that J +1 π) can be recovered from such functions f (x) (with  bandwidth not exceeding 2 k its discrete samples f 2 J , k ∈ Z, via the formula f (x) = =





f

k=−∞ ∞

k 2J



sin π 2 J x − k π 2J x − k

  ckJ φ 2 J x − k ,

(8.2.2)

k=−∞

by applying (8.2.1), where

 ckJ = f

k 2J

 .

Since the function f (x) = φ 2 J −1 x has bandwidth = 2 J π, it follows from (8.2.2) that ∞    

φ 2 J −1 x = ckJ φ 2 J x − k , (8.2.3) k=−∞

where, in view of (8.2.1),  ckJ = φ

2 J −1 k 2J

 =φ

  k sin(πk/2) = , 2 kπ/2

which is independent of J . Hence, we may introduce the sequence { pk }, defined by sin(πk/2) = pk = πk/2



δj,

(−1) j 2 (2 j+1)π ,

for k = 2 j, for k = 2 j + 1,

(8.2.4)

for all j ∈ Z, and replace 2 J −1 x by x in (8.2.3) to obtain the identity φ(x) =



k=−∞

pk φ(2x − k),

(8.2.5)

392

8 Wavelet Transform and Filter Banks

to be called the two-scale relation or refinement equation, with the governing sequence { pk } in (8.2.4) called the corresponding two-scale sequence or refinement mask of φ(x). For each j ∈ Z, let   V j = closure L 2 span 2 j φ(2 j x − k) : k ∈ Z .

(8.2.6)

Then it follows from the two-scale relation (8.2.5) that the family {V j } of vector spaces is a nested sequence · · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ V2 ⊂ · · · .

(8.2.7)

The reason for considering the above ideal lowpass filter φ(x) in (8.2.1) in generating the nested sequence of vector spaces V j , is that the Fourier transform of the generating functions 2 j φ 2 j x is precisely the ideal lowpass filter    ω = χ[−2 j π,2 j π] (ω) φ 2j   with pass-band = −2 j π, 2 j π . Hence, for each j ∈ Z,      ω  ω −φ φ j j−1 2 2 is the ideal bandpass filter with pass-band [−2 j π, −2 j−1 π] ∪ [2 j−1 π, 2 j π].

(8.2.8)

Therefore, to separate any function f J (x) with bandwidth ≤ 2 J +1 π into components f J (x) = f 0 (x) + g0 (x) + · · · + g J −1 (x) on non-overlapping (ideal) frequency bands; that is,  f J (ω)χ[−π,π] (ω) f 0 (ω) =    g0 (ω) = f J (ω)χ[−2π,−π)∪(π,2π] (ω) f J (ω)χ[−22 π,−2π)∪(2π,22 π] (ω)  g1 (ω) =  .. .  g J −1 (ω) =  f J (ω)χ[−2 J π,−2 J −1 π)∪(2 J −1 π,2 J π] (ω), it is required to find an ideal bandpass filter ψ I (x) with Fourier transform

(8.2.9)

8.2 Multiresolution Approximation and Analysis

393

    ω − φ(ω), I (ω) = χ[−2π,−π)∪(π,2π] (ω) = φ ψ 2

(8.2.10)

which yields I ψ

ω 2j

 =φ

 ω     ω = χ[−2 j+1 π,−2 j π)∪(2 j π,2 j+1 π] (ω) − φ 2 j+1 2j

for j = 0, . . . , J − 1. However, for computational and other reasons, we introduce a phase shift of ψ I (x) to define the “wavelet”  1 , ψ(x) = −2φ(2x − 1) + φ x − 2

(8.2.11)

  ω  ω    − φ(ω) . ψ(ω) = −e−i 2 φ 2

(8.2.12)



so that

 I (ω)| by comparing (8.2.12) with (8.2.10), and hence the Observe that |ψ(ω)| = |ψ separation of f J (x) into components on ideal frequency bands in (8.2.9) remains valid with only a phase shift of g j (x) by −(π + ω/2 j ), j = 0, . . . , J − 1. In the definition of ψ(x) in (8.2.11), we remark that ψ ∈ V1 , with ψ(x) = =



k=−∞ ∞

pk φ(2x − (k + 1)) − 2φ(2x − 1) ( pk−1 − 2δk−1 )φ(2x − k),

(8.2.13)

k=−∞

by applying (8.2.5). Furthermore, since we have, from (8.2.4), that p2 j = δ2 j and p1−2 j =

−2 sin (2 j−1)π 2 sin (1−22 j)π 2 = = p2 j−1 , (1 − 2 j)π −(2 j − 1)π 

so that (−1)k p1−k =

p2 j−1 , −δ2 j ,

for k = 2 j, for k = 2 j + 1,

for all k, it follows from (8.2.13) that ψ as defined in (8.2.11) satisfies the two-scale relation ∞

ψ(x) = qk φ(2x − k), (8.2.14) k=−∞

with qk = (−1)k p1−k , k ∈ Z.

(8.2.15)

394

8 Wavelet Transform and Filter Banks

For any function ψ defined on R, we will use the notation j

ψ j,k (x) = 2 2 ψ(2 j x − k), j, k ∈ Z.

(8.2.16)

We will also call ψ ∈ L 2 (R) an orthogonal wavelet provided that the family {ψ j,k : j, k ∈ Z} is an orthonormal basis of L 2 (R). In this chapter we only consider realvalued wavelets. The subspaces V j of L 2 (R) defined in (8.2.6) form the so-called multiresolution approximation (MRA). The construction of compactly supported orthogonal wavelets, to be studied in some depth in Chap. 9, is based on MRA. Before introducing the precise definition of MRA, we recall that for a set W = { f 1 , f 2 , . . .} of functions in L 2 (R), spanW is the L 2 -closure of the linear span by W , namely, spanW consists of all functions given by ∞

ck f k (x)

k=1

which converges in L 2 (R), where ck are constants. Recall from Definition 3 in Sect. 1.4 of Chap. 1, on p.38, that the family W = { f 1 , f 2 , . . .} is complete in L 2 (R) if spanW = L 2 (R). Definition 1 Multiresolution approximation (MRA) An MRA is a nested sequence {V j }, j ∈ Z of closed subspaces V j in L 2 (R) that satisfy the following conditions: (1◦ ) V j ⊂ V j+1 , j ∈ Z; (2◦ )  j∈Z V j = {0}; (3◦ ) j∈Z V j is dense in L 2 (R); (4◦ ) f ∈ V j ⇔ f (2·) ∈ V j+1 ; and (5◦ ) there exists a real-valued function φ ∈ L 2 (R) such that {φ(· − k) : k ∈ Z} is a Riesz basis of V0 , that is, V0 = span {φ(· − k) : k ∈ Z} and there exist some constants c, C > 0, such that c

k∈Z



2 |ck |2 ≤  ck φ(x − k) ≤ C |ck |2 , for all {ck } ∈ 2 (Z). (8.2.17) k∈Z

k∈Z

Observe that by (4◦ ) and (5◦ ), V j = span{φ j,k : k ∈ Z}. Thus we say that φ generates the MRA {V j }. A function φ in (5◦ ) is called a scaling function and φ ∈ L 2 (R) satisfying (8.2.17) is said to be stable. An MRA is called an orthogonal MRA if φ in (5◦ ) is orthogonal, in the sense that the collection of its integer translations is an orthonormal system: φ, φ(· − k) = δk , k ∈ Z. Conditions (1◦ ), (4◦ ) and (5◦ ) in Definition 1 imply that

(8.2.18)

8.2 Multiresolution Approximation and Analysis

395

φ ∈ V0 ⊂ V1 = span{φ(2x − k) : k ∈ Z}. Thus φ can be written as (8.2.5) for some pk ∈ R. A function φ satisfying the refinement equation (8.2.5) is said to be refinable, and the sequence { pk } is called the refinement mask of φ. − k) : k ∈ Z}, and Conversely, suppose that φ is stable. Let V0 = span{φ(· set V j = { f : f 2− j · ∈ V0 } = span{φ 2 j · −k : k ∈ Z}. Then (4◦ ) holds for {V j }. If, in addition, φ satisfies (8.2.5), then (1◦ ) is satisfied. For a compactly supported φ ∈ L 2 (R), it can be shown that (2◦ ) and (3◦ ) follow from (1◦ ), (4◦ ) and (5◦ ). Consequently, {V j } is an MRA; and in addition, if φ is orthogonal, then {V j } is an orthogonal MRA. To summarize, to construct an MRA (or an orthogonal MRA), we need only to construct (compactly supported) stable (or orthogonal) and refinable φ. When φ generates an orthogonal MRA, then ψ defined by (8.2.14), where qk , k ∈ Z are given by (8.2.15) or more generally, by (8.2.28) below, is an orthogonal wavelet. A multiresolution approximation, together with the orthogonal wavelet ψ defined by (8.2.14), is called an orthogonal multiresolution analysis (abbreviated as MRA also). Example 1 Haar MRA Let ϕ0 (x) = χ[0,1) (x). Define for j ∈ Z,  Vj =

f (x) =









ck ϕ0 2 x − k : {ck } ∈ 2 (Z) . j

k

Then it is easy to see that  Vj =

 f ∈ L 2 (R) : f (x) 

k 2j

, k+1 j

 is

 a constant function on

2

k k+1 , 2j 2j

 .

Thus V j ⊆ V j+1 . The nesting property (4◦ ) of {V j } can also be verified by the refinement of ϕ0 : ϕ0 (x) = ϕ0 (2x) + ϕ0 (2x − 1) (see Exercise 1). The density property (3◦ ) of {V j } follows from the fact that a function f ∈ L 2 (R) can be approximated arbitrarily well by piecewise constant functions. Property (2◦ ) can be verified directly. In addition, ϕ0 is orthogonal. Thus {V j } is an orthogonal MRA  Symbol of a sequence For a sequence g = {gk } of real or complex numbers, its two-scale symbol, also called symbol for short, is defined by G(z) =

1

gk z k , 2 k

(8.2.19)

396

8 Wavelet Transform and Filter Banks

where z ∈ C\{0}, and convergence of the series is not considered. For a finite sequence g = {gk } (or an infinite sequence g = {gk } with only finitely many gk nonzero), G(z) is given by k2 1

G(z) = gk z k 2 k=k1

for some integers k1 and k2 . Such a G(z) is called a Laurent polynomial of z or z −1 . In addition, for z = e−iω , G(e−iω ) is a sum of a finite number of terms of gk e−ikω , and is therefore a trigonometric polynomial. When the finite sequence {gk } is used as a convolution digital filter, then 2G(z) is called the z-transform of {gk }. Since the z-transform of a digital filter is the transform function of the filtering process, we  sometimes call G(z) or G(e−iω ) a finite impulse response (FIR) filter. Remark 1 To distinguish the two-scale symbol notation for the refinement and wavelet sequences p = { pk },  p = { pk } and q = {qk },  q = { qk } from the z-transform notation of a sequence, we do not bold-face p,  p , q,  q throughout this book.  Taking the Fourier transform of both sides of (8.2.5) and using the fact that the 1 −iω/2 , we may deduce that (8.2.5) can be Fourier transform of φ(2x − k) is φ(ω)e 2 written as  ω  ω    , (8.2.20) φ(ω) = P e−i 2 φ 2 where P(z) is the two-scale symbol of { pk }: P(z) =

1

pk z k . 2 k

Hence, applying (8.2.20) repeatedly, we have    ω  2   φ(ω) = P e−iω/2 P e−iω/2 φ 22 = ···       ω 2 n  = P e−iω/2 P e−iω/2 · · · P e−iω/2 φ 2n n      j  ω . = P e−iω/2 φ 2n j=1

Suppose that the Laurent polynomial P(z) satisfies P(1) = 1, then converges pointwise on R. Thus φ, defined by

n j=1

  j P e−iω/2

8.2 Multiresolution Approximation and Analysis

 φ(ω) =

397

  j P e−iω/2 ,

∞ 

(8.2.21)

j=1

 is a solution of (8.2.20) with φ(0) = 1. In addition, suppφ ⊆ [N1 , N2 ], provided that supp p ⊆ [N1 , N2 ]; that is, pk = 0 for k < N1 or k > N2 . This φ is called the normalized solution of (8.2.5). A detailed discussion of these properties of φ will be provided in Chaps. 9 and 10. Furthermore, in Chap. 10, we will derive the condition on p = { pk } so that φ is in L 2 (R) and that φ is orthogonal. From now on, we will always assume that a refinement mask p = { pk } satisfies P(1) = 1, i.e.,

pk = 2 ;

k

and that the function φ associated with p = { pk } means the normalized solution of (8.2.5) given by (8.2.21). Example 2 Refinement of Cardinal B-splines Let ϕ0 (x) = χ[0,1) (x) as introduced in Example 1. Then φ = ϕ0 ∗ ϕ0 is the piecewise linear B-spline, also called the hat function, given by ⎧ ⎨ x, φ(x) = min{x, 2 − x}χ[0,2) (x) = 2 − x, ⎩ 0, From ϕ 0 (ω) =

for 0 ≤ x < 1, for 1 ≤ x < 2, elsewhere.

(8.2.22)

1 − e−iω iω

(see Exercise 1 in Sect. 7.1 on p.327 and (v) in Theorem 2 of Sect. 7.1 on p.323), we have 2 1 − e−iω iω 2  2  −iω/2 1 − e−iω/2 1+e = 2 iω/2 2    −iω/2 1+e  ω . = φ 2 2

 φ(ω) =ϕ 0 (ω)2 =



Thus, φ is refinable with the symbol of the refinement mask P(z) given by  P(z) =

1+z 2

2 ,

398

8 Wavelet Transform and Filter Banks

1

1

0.9

0.9

φ (x)

0.8 0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1 0

φ (x)

0.8

φ (2x−1) 1

1 2

2

φ (2x)

φ (2x−2)

0.1 0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0

2

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

Fig. 8.1. Hat function φ (on left) and its refinement (on right)

or the refinement mask is given by p0 =

1 1 , p1 = 1, p2 = , pk = 0, k = 0, 1, 2; 2 2

that is, φ(x) =

1 1 φ(2x) + φ(2x − 1) + φ(2x − 2) 2 2

(see Fig. 8.1 for the refinement of φ). More generally, let φ be the Cardinal B-spline of order m defined by φ(x) = (ϕ0 ∗ ϕ0 ∗ · · · ∗ ϕ0 )(x) $ %& '

(8.2.23)

m copies of ϕ0

(see Exercise 12 in Sect. 7.1 of Chap. 7 on p.328, where ϕ0 (x) = h 1 (x)). Then φ is refinable with the symbol of the refinement mask given by (see Exercise 3) P(z) =

 1 + z m 2

.

(8.2.24) 

To construct an orthogonal wavelet, we start with a lowpass FIR filter p = { pk } such that the corresponding scaling function φ is orthogonal; that is, the integer shifts φ(x −k), k ∈ Z of φ are orthogonal to each other. This implies that the corresponding symbol P(z) must satisfy |P(z)|2 + |P(−z)|2 = 1, z = 0.

(8.2.25)

Such a filter p (or P(z)) is called a quadrature mirror filter (QMF). By applying P(z), we construct the corresponding highpass filter q = {qk } under the construction

8.2 Multiresolution Approximation and Analysis

399

criteria |Q(z)|2 + |Q(−z)|2 = 1, z = 0,     1 1 + P(−z)Q − = 0, z = 0. P(z)Q z z

(8.2.26) (8.2.27)

Then ψ, defined by (8.2.14), is a compactly supported orthogonal wavelet. The detailed discussion is provided in the next chapter. Here, we only mention that the highpass filter can be given by the corresponding QMF as shown below. Choice of highpass filter q for a QMF lowpass filter p For a Laurent polyno( mial P(z) = 21 k pk z k with real-valued coefficients pk , define the Laurent poly( nomial Q(z) = 21 k qk z k by   1 , Q(z) = −z 2L−1 P − z or equivalently by qk = (−1)k p2L−1−k , k ∈ Z,

(8.2.28)

where L is any desirable integer. It can be verified that if P(z) is a QMF, then Q(z) satisfies (8.2.26) and (8.2.27) (see Exercise 10). If we choose L = 1, then (8.2.28) is reduced to (8.2.15).  Example 3 Haar wavelet As a continuation of Example 1, let ϕ0 (x) = χ[0,1) (x) with its refinement mask p0 = p1 = 1 and pk = 0 for k = 0, 1. Define qk by qk = (−1)k p1−k , namely: q0 = 1, q1 = −1; qk = 0, k = 0, 1. Then the corresponding wavelet ψ given by ⎧ ⎨ 1, ψ(x) = ϕ0 (2x) − ϕ0 (2x − 1) = −1, ⎩ 0,

for 0 ≤ x < 21 , for 21 ≤ x < 1, elsewhere

is an orthogonal wavelet to be called the Haar wavelet.

(8.2.29) 

Example 4 D4 orthogonal wavelet Let { pk } be the filter with the only nonzero terms pk given by √ √ √ √ 1− 3 3− 3 3+ 3 1+ 3 , p1 = , p2 = , p3 = . p0 = 4 4 4 4 Then the symbol P(z) of { pk } is given by

400

8 Wavelet Transform and Filter Banks

√ √ √ √ 1− 3 3− 3 3+ 3 2 1+ 3 3 P(z) = + z+ z + z , 8 8 8 8

(8.2.30)

and P(z) is a QMF (see Exercise 11). Define qk = (−1)k p3−k . Then the only nonzero qk of the sequence {qk } are given by √ √ √ √ 1+ 3 3+ 3 3− 3 1− 3 , q1 = − , q2 = , q3 = − . q0 = 4 4 4 4 Thus, the highpass filter Q(z) for the orthogonal wavelet is given by Q(z) =

√ √ √ √ 1+ 3 3+ 3 3− 3 2 1− 3 3 − z+ z − z . 8 8 8 8

(8.2.31)

Then Q(z) satisfies (8.2.26) and P(z), Q(z) satisfy (8.2.27) (see Exercise 11). Observe that P(z) has four nonzero terms. Thus this filter is called a 4-tap orthogonal filter, or 4-tap Daubechies filter (D4 filter for short). See Fig. 8.2 for the corresponding scaling function φ and orthogonal wavelet ψ. A detailed discussion on the construction of compactly supported orthogonal wavelets is provided in Sect. 9.3 of the next chapter.  A compactly supported orthogonal wavelet has the limitation that it cannot be symmetric unless it is the Haar wavelet. To attain the symmetry property, the orthogonality property is relaxed to biorthogonality. Two compactly supported functions  are called a pair of biorthogonal wavelets if two families {ψ j,k } j,k∈Z and ψ and ψ j,k } j,k∈Z are biorthogonal to each other; that is, {ψ j  ,k   = δ j− j  δk−k  , j, k, j  , k  ∈ Z, ψ j,k , ψ

(8.2.32)

 and each of the families is a Riesz basis of L 2 (R). In this case, one of the pair {ψ, ψ} is called a dual of the other. 1.4

1.5

1.2

1

1 0.5

0.8 0.6

0

0.4

−0.5

0.2

−1

0 −1.5

−0.2 −0.4 0

0.5

1

1.5

2

2.5

3

−2

0

0.5

1

1.5

Fig. 8.2. D4 scaling function φ (on left) and wavelet ψ (on right)

2

2.5

3

8.2 Multiresolution Approximation and Analysis

401

The construction of compactly supported biorthogonal wavelets starts with two p = { pk }, such that the corresponding scaling FIR lowpass filters p = { pk } and   are biorthogonal to each other, in the sense that functions φ and φ  − k) = δ j−k , j, k ∈ Z. φ(· − j), φ(·  of p and  This requires that the symbols P(z) and P(z) p satisfy  P(z)P

    1 1  + P(−z)P − = 1. z z

(8.2.33)

 that meet the construction criteria: Then we choose two FIR filters Q(z) and Q(z)     1 1  + Q(−z)Q − = 1, z z     1 1   + Q(−z)P − = 0, Q(z)P z z     1 1   + P(−z)Q − = 0. P(z)Q z z

 Q(z)Q

(8.2.34) (8.2.35) (8.2.36)

 by their Fourier transforms Consequently, we may define the wavelets ψ and ψ  ω  ω   ω  ω      e−i 2  ψ(ω) = Q e−i 2 φ ,  ψ(ω) =Q , φ 2 2 or equivalently, directly by ψ(x) =

k

 = qk φ(2x − k), ψ(x)



  qk φ(2x − k).

k

 ψ and ψ  satisfy Then φ, φ,  − k) = δk , k ∈ Z; ψ, ψ(·  − k) = 0, φ,  ψ(· − k) = 0, k ∈ Z; φ, ψ(·  respectively, as in and {V j } and { V j }, generated by φ and φ,  j · −k) : k ∈ Z}, V j = span{φ(2 j · −k) : k ∈ Z},  V j = span{φ(2 form two MRAs which are biorthogonal to each other, in the sense that j,k   = δk−k  , j, k, k  ∈ Z. φ j,k , φ

(8.2.37) (8.2.38)

402

8 Wavelet Transform and Filter Banks

 j } defined by Furthermore, with {W j } and {W  j · −k) : k ∈ Z},  j = span{ψ(2 W j = span{ψ(2 j · −k) : k ∈ Z}, W we have

j + V j+1 = W Vj. V j+1 = W j + V j , 

(8.2.39)

 j and  V j ⊥ W j (as follows from (8.2.38)), implies This, together with V j ⊥ W  j  , for any j = j  . Wj ⊥ W  constructed in this way satisfy (8.2.32), and they form a pair of comHence, ψ and ψ pactly supported biorthogonal wavelets. A detailed discussion on the construction of biorthogonal wavelets is provided in Sect. 9.4 of the next chapter. If p and  p satisfy (8.2.33), then we say that they are biorthogonal to each other; and if p, q and  p,  q satisfy (8.2.33)–(8.2.36), then we say that they are biorthogonal  or constitute a biorthogonal filter bank. The filters p and  p (or P(z) and P(z)) are  called lowpass filters; and q and  q (or Q(z) and Q(z)) are called highpass filters. Choices of highpass filters q,  q for biorthogonal filters p,  p For a given pair of biorthogonal FIR lowpass filters p and  p , we may choose the FIR highpass filters q and  q as follows:    −1 , Q(z) = −z 2s−1 P z

   = −z 2s−1 P − 1 , Q(z) z

(8.2.40)

where s is any desirable integer. It can be verified that if p and  p are biorthogonal;  satisfy (8.2.33), then Q(z) and Q(z)  satisfy (8.2.34)–(8.2.36) that is, P(z) and P(z) (see Exercise 12). That is, p, q,  p,  q constitute a biorthogonal filter bank.  pk } be the filters given Example 5 5/3-tap biorthogonal filter bank Let { pk }, { by 1 1 1 P(z) = + z + z 2 , 4 2 4 (8.2.41) 3 1 1 1 1 −1 2 3  =− z + + z+ z − z . P(z) 8 4 4 4 8 Also, let {qk }, { qk } be the filters defined by (8.2.40) with s = 1. Then 1 1 3 1 1 Q(z) = − z −2 − z −1 + − z − z 2 , 8 4 4 4 8  = − 1 z −1 + 1 − 1 z. Q(z) 4 2 4

(8.2.42)

8.2 Multiresolution Approximation and Analysis 1

403 1.5

0.9 0.8

1

0.7 0.6 0.5

0.5

0.4 0.3

0

0.2 0.1 0 0

0.2 0.4 0.6 0.8

1

1.2 1.4 1.6 1.8

2

4

−0.5 −1

−0.5

0

0.5

1

1.5

2

0

0.5

1

1.5

2

5 4

3

3 2

2

1

1 0

0

−1 −1 −2 −1

−2 −0.5

0

0.5

1

1.5

2

2.5

3

−3 −1

−0.5

Fig. 8.3. Top: biorthogonal 5/3 scaling function φ (on left) and wavelet ψ (on right). Bottom: scaling function  φ (on left) and wavelet  ψ (on right)

  Then P(z), Q(z), P(z), Q(z) satisfy (8.2.34)–(8.2.36) (see Exercise 13); that is, p, q,  p,  q constitute a biorthogonal filter bank. Observe that the scaling function associated with P(z) in (8.2.41) is the linear B-spline (i.e. the hat function). These two lowpass filters  p and p have 5 and 3 nonzero terms, respectively, and they are commonly called the 5/3-tap biorthogonal filters. The 5/3-tap biorthogonal filter pairs have been adopted by the JPEG2000 image compression standard for lossless digital image compression. The graphs of the corresponding scaling functions and biorthogonal wavelets are displayed in Fig. 8.3.  Exercises Exercise 1 Let {V j } be the sequence of closed spaces defined in Example 1. Show that V j ⊂ V j+1 for all j ∈ Z. Exercise 2 Let {V j } be the sequence of closed spaces defined in Example 1. Show directly that ∩ j∈Z V j = {0}. Exercise 3 Let φ be the Cardinal B-spline of order m defined by (8.2.23). Show that φ is refinable with refinement mask given by (8.2.24) (see Exercise 12 in Sect. 7.1 of Chap. 7 on p.328, where ϕ0 (x) = h 1 (x)). Exercise 4 Decide whether φ(x) = χ[0,1.5) (x) is refinable or not.

404

8 Wavelet Transform and Filter Banks

Exercise 5 Let φ(x) be the hat function defined by (8.2.22), and ϕ(x) = φ(x) + φ(x − 1). Show that ϕ(x) is refinable with ϕ(x) =

1 1 1 1 ϕ(2x) + ϕ(2x − 1) + ϕ(2x − 2) + ϕ(2x − 3). 2 2 2 2

Exercise 6 Let φ be the hat function defined by (8.2.22). Show directly that φ(x) =

1 1 φ(2x) + φ(2x − 1) + φ(2x − 2), x ∈ R. 2 2

Hint: By the symmetry of φ, it is sufficient to consider x ∈ (−∞, 1]; and then, consider the separate cases x ∈ (−∞, 0), [0, 0.5), [0.5, 1]. Exercise 7 Let φ = ϕ0 ∗ ϕ0 ∗ ϕ0 be the quadratic B-spline defined by ⎧1 2 ⎪ 2x , ⎪ ⎨ −(x − 1)2 + (x − 1) + 21 , 2 φ(x) = 1 ⎪ 1 − (x − 2) , ⎪ 2 ⎩ 0,

if 0 ≤ x < 1, if 1 ≤ x < 2, if 2 ≤ x < 3, elsewhere.

Show directly that φ(x) =

3 3 1 1 φ(2x) + φ(2x − 1) + φ(2x − 2) + φ(2x − 3), x ∈ R. 4 4 4 4

Hint: By the symmetry of φ, it is enough to consider x ∈ (−∞, 1.5]; and then, consider the separate cases x ∈ (−∞, 0), [0, 0.5), [0.5, 1), [1, 1.5]. Exercise 8 Decide whether φ(x) = χ[0,1) (x) + 21 χ[1,2) (x) is refinable or not. Exercise 9 Let ϕ(x) = φ(x) + 21 φ(x − 1), where φ(x) is the hat function defined by (8.2.22). Decide whether ϕ(x) is refinable or not. Exercise 10 Suppose that the FIR lowpass filter P(z) is a QMF. Show that Q(z) defined by (8.2.28) satisfies (8.2.26) and (8.2.27). Exercise 11 Let P(z) and Q(z) be the D4 orthogonal filters given by (8.2.30) and (8.2.31). Verify directly that (i) P(z) satisfies (8.2.25); (ii) Q(z) satisfies (8.2.26); and (iii) P(z), Q(z) satisfy (8.2.27).  satisfy (8.2.33). Show that Exercise 12 Suppose that the FIR filters P(z) and P(z)  Q(z) and Q(z) defined by (8.2.40) satisfy (8.2.34)–(8.2.36).   Exercise 13 Let P(z), P(z), Q(z), Q(z) be the filters defined by (8.2.41) and (8.2.42). Verify directly that they satisfy (8.2.33)–(8.2.36).  j  for j = j  , and apply this fact to validate Exercise 14 Show that W j ⊥ W (8.2.32).

8.3 Discrete Wavelet Transform

405

8.3 Discrete Wavelet Transform Suppose that p, q,  p,  q constitute a biorthogonal FIR filter bank, in that their sym be the scaling funcbols satisfy (8.2.33)–(8.2.36) in the previous section. Let φ, φ  be the corresponding tions associated with the lowpass filters p and  p , and let ψ, ψ V j } be the MRAs generbiorthogonal wavelets defined by q,  q . Also, let {V j } and {  respectively. Assume that a real-valued function f (x) is in V j for ated by φ and φ, some j, say j = 0. Then f (x) can be expanded as f (x) =



xk φ(x − k),

(8.3.1)

k

for some real numbers xk . On the other hand, by (8.2.39) with j = 0 in the previous section, on p.402, f (x) can also be expanded as f (x) =

k

ck φ



 x −k + −k , dk ψ 2 2

x

(8.3.2)

k

for some real numbers ck , dk .

 1  x − n and integrating, we may deduce Multiplying both sides of (8.3.2) by φ 2 2 with respect to φ, ψ (see (8.2.38) in the previous section from the biorthogonality of φ on p.401), that for any n ∈ Z, 

∞ −∞





1  x − n dx = f (x) φ ck δk−n + dk 0 = cn . 2 2 k

k

 is real-valued. This property of φ  In the above equality, we have used the fact that φ will also be used below. Therefore, from (8.3.1), we obtain    1 ∞  x − n dx f (x)φ 2 −∞ 2  ∞

  1  x − n dx = xk φ(x − k) φ 2 −∞ 2 k    ∞ 1

 x − n d x. = xk φ(x − k) φ 2 2 −∞

cn =

k

 we have that By the refinement of φ, 

  x −n =  − 2n − ). φ  p φ(x 2 

Thus,

406

8 Wavelet Transform and Filter Banks

 ∞ 1

 − 2n − )d x xk  p φ(x − k) φ(x 2 −∞ k  1

1

xk  p δk−(2n+) = xk  pk−2n . = 2 2

cn =



k

k

Similarly, it can be shown that dn =

1

xk  qk−2n . 2

(8.3.3)

k

Hence, the sequence {xk } can be decomposed into two sequences {ck } and {dk }. On the other hand, from (8.3.2), we have that (· − n) xn =  f, φ   · ·



 − n) +  − n) − k , φ(· − k , φ(· = ck φ dk ψ 2 2 k k

 ∞  − n)d x = ck p φ(x − 2k − )φ(x +





dk





q



k

=

−∞



k

ck pn−2k +

k

∞ −∞



 − n)d x φ(x − 2k − )φ(x dk qn−2k .

k

Thus, the sequence {xk } can be recovered from its decomposed components {ck } and {dk }. To summarize, for an input sequence x = {xk }, the wavelet decomposition algorithm with analysis filters  p,  q is defined by cn =

1

1

 pk−2n xk , dn =  qk−2n xk , n ∈ Z; 2 2 k

(8.3.4)

k

and the wavelet reconstruction algorithm with synthesis filters p, q is given by  xn =

k

pn−2k ck +



qn−2k dk , n ∈ Z.

(8.3.5)

k

The outputs c = {ck } and d = {dk } are called, respectively, lowpass output and highpass output, or approximant and details of x with the analysis filters  p,  q , respectively. Algorithms (8.3.4) and (8.3.5) are also called discrete wavelet transform (DWT) and inverse discrete wavelet transform (IDWT), respectively. The reason for introducing the terminology of DWT can be justified as follows.

8.3 Discrete Wavelet Transform

407

Set f 0 (x) = f (x), ck0 = xk in (8.3.1). Since V0 = V−1 + W−1 = V−2 + W−2 + W−1 = · · · = V−J + W−1 + · · · + W−J , where J is a positive integer, f ∈ V0 can be expanded as f = f 0 = f −1 + g −1 = f −2 + g −2 + g −1 = · · · = f −m + g −1 + · · · + g −m = · · · = f

−J

+g

−1

+ ··· + g

−J

(8.3.6)

,

where x  − k , 2j k 

−j  x dk ψ j − k . g − j (x) = 2 f − j (x) =



−j

ck φ

k

From (8.3.2) and (8.3.4), we see ck−1 = ck , dk−1 = dk . We can obtain, as we derived −j −j (8.3.4), that other ck and dk are given iteratively by −j

cn =

1

1

1− j −j 1− j  pk−2n ck , dn =  qk−2n ck , n ∈ Z, 2 2 k

(8.3.7)

k

for j = 1, 2, . . . , J . Now, in view of (8.2.38) and (8.2.32) in the previous section, we see that   + *  x  x − n = 0; φ m − k ,ψ m 2   2x + * x  − n = 2m δ j−m δk−n ψ j − k ,ψ m 2 2 for all k, n ∈ Z and j ≤ m. Therefore, it follows from (8.3.6) that  +  + * 1 *  x − n = 1 f −m (x) + g −1 (x) + · · · + g −m (x), ψ  x −n f (x), ψ m m m m 2 2 2 2   x + 1 * −m  x  = m dk ψ m − k , ψ − n = dn−m . 2 2 2m k

That is, dn−m

1 = m 2





−∞

  x − n2m  d x. f (x)ψ 2m

408

8 Wavelet Transform and Filter Banks

In other words, according to the definition of the wavelet transform W ψ (b, a) in (8.1.5) of Sect. 8.1 on p.382, we have m m dn−m = W ψ f (n2 , 2 ). Of course, if we replace f (x) by f (2 N x) for any positive integer N , then the above derivation yields   n 1  j , n ∈ Z, f , dn = W ψ 2j 2j for all integers j < N . Hence, the wavelet decomposition algorithm (8.3.4) can be applied iteratively j times to compute the wavelet transform W (b, a) for the timeψ   scale value (b, a) = 2nj , 21j for all integers n ∈ Z. For this reason, (8.3.4) is called the DWT (algorithm). Furthermore, since (8.3.5) can be applied iteratively j times to j j recover {xk } (and hence f (x)) from {cn } and {dn }, it is called the IDWT (algorithm). Remark 1 In contrast to the discrete Fourier transform (DFT), which is obtained by discretization of the Fourier coefficients studied in Chaps. 4 and 6, respectively, the discrete wavelet transform the wavelet transform evaluated at the   is precisely n 1 dyadic time-scale position 2 j , 2 j , n ∈ Z, without the need of discretization of    n 1 the integral W in (8.1.5) in Sect. 8.1. Therefore, the DWT algorithm f , j j ψ 2 2 (8.3.4) is not only much more efficient than discretization of the integral (8.1.5) in Sect. 8.1, but also yields precise values as opposed to approximate values, of the wavelet transform.   form a pair of biorthogonal wavelets, then the We have shown above that if ψ, ψ  IDWT recovers x from its approximant and details. Actually, the condition that ψ, ψ form a pair of biorthogonal wavelets could be relaxed to the condition that the filter bank p, q,  p,  q are biorthogonal. In the next theorem, we show that a biorthogonal filter bank can be applied to recover the original sequence. To this end, we introduce the z-transform X (z) of a bi-infinite sequence x = {x j } defined by X (z) =



xj z j,

j=−∞

where z ∈ C, z = 0. X (z) is only a “symbol”, since we are not concerned with its convergence (see the comment on the notion of FIR at the end of our discussion of the two-scale symbol in (8.2.19) on p.396).( However, if x j = 0 except for only finitely many terms, then the bi-infinite sum ∞ −∞ is a “Laurent polynomial” in −1 z or z . Here, we derive X (z) from the definition of the traditional z-transform ( ∞ −j j − j (for convenience). j=−∞ x j z , by using powers z instead of z Recall that if a finite sequence {gk } is considered as a lowpass or highpass filter for a scaling function or a wavelet, respectively, then G(z) in (8.2.19) on p.396 denotes

8.3 Discrete Wavelet Transform

409

its two-scale symbol (or frequency response), with the multiplicative constant factor of 21 : 1

gk z k . G(z) = 2 k

Also recall that when p = { pk },  p = { pk } and q = {qk },  q = { qk } are considered as the refinement and wavelet sequences, we do not bold-face p,  p , q,  q throughout this book (see Remark 1 on p.396). Theorem 1 Biorth. filter bank means perfect recovery of signal If  p,  q, p and q are biorthogonal, then an input x can be recovered from its approximant c and details d defined by (8.3.4), namely :  x defined by (8.3.5) is exactly x. (z) denote the z-transforms of c, d, x and Proof Let C(z), D(z), X (z) and X x. Then in the frequency domain, (8.3.4) and (8.3.5) can be written respectively as       1 1   X (z) + P − X (−z) ; P z z       1 1  1 2  X (z) + Q − X (−z) ; D(z ) = Q 2 z z

(8.3.8)

 X (z) = 2P(z)C(z 2 ) + 2Q(z)D(z 2 )

(8.3.9)

1 C(z ) = 2 2

and

(see Exercise 1). p,  q and p, q, respectively, Let M  p , q and M p,q be the modulation matrices of  defined by  M p , q (z) =

,  ,   P(z) P(−z) P(z) P(−z) , M (z) = . p,q   Q(z) Q(−z) Q(z) Q(−z)

  It is easy to see that  p,  q , p and q are biorthogonal, namely: P(z), Q(z), P(z), Q(z) satisfy the biorthogonality conditions (8.2.33)–(8.2.36) on p.401, if and only if  M p , q (z)  T which means that M p,q by

1 z,

1 z

M p,q

 T 1 = I2 , z ∈ C\{0}, z

(8.3.10)

is the inverse matrix of M  p , q (z). Thus, with z replaced

(8.3.10) is equivalent to 

M p,q (z)

T M p , q

  1 = I2 , , z ∈ C\{0}, z

(8.3.11)

410

8 Wavelet Transform and Filter Banks

which, in turn, is equivalent to         1 1 1 1     + Q(z) Q = 1, P(z) P − + Q(z) Q − = 0. (8.3.12) P(z) P z z z z Plugging C(z 2 ) and D(z 2 ) [as given in (8.3.8)] into (8.3.9), we have        − 1 X (−z)  1 X (z) + P P z z       1  1  − 1 X (−z) X (z) + Q + 2 Q(z) Q 2 z z            1  − 1 + Q(z) Q  −1  1 + Q(z) Q X (z) + P(z) P X (−z). = P(z) P z z z z

1  X (z) = 2 P(z) 2

Thus  X (z) = X (z) if (8.3.12) holds, which is equivalent to saying that  p,  q , p, q form a biorthogonal filter bank. Hence, the biorthogonality of  p,  q , p and q implies that x can be recovered from its lowpass and highpass outputs c, d by the wavelet reconstruction algorithm.  Remark 2 Biorthogonality in terms of modulation matrices From the above proof, we see that  p,  q , p, q form a biorthogonal filter bank if and only if their  modulation matrices M  p , q , M p,q satisfy (8.3.11). Multi-level wavelet decomposition and reconstruction The wavelet decomposition algorithm (8.3.4) could be applied to the approximant c of an input x to get further approximant and details of x. Continuing this procedure, we have multiscale (or multi-level) wavelet decomposition and reconstruction algorithms. More (0) precisely, let c(0) denote the input x, that is, ck = xk . For J > 1, the J -level wavelet decomposition algorithm is given by (8.3.7). For biorthogonal  p (ω),  q (ω), p(ω) and q(ω), x is recovered by the corresponding reconstruction algorithm 1− j

cn

=



−j

pn−2k ck +

k



−j

qn−2k dk , n ∈ Z,

(8.3.13)

k

for j = J, J − 1, . . . , 1.



Example 1 Let x = {xk } be the sequence with  xk =

1, 0,

k = 1, 2, . . . , 16, elsewhere.

Let { pk }, {qk } be the Haar filters with nonzero pk , qk given by p0 = p1 = 1, q0 = 1, q1 = −1.

8.3 Discrete Wavelet Transform

411

(0)

Then, with ck = xk , the J -level DWT is given by −j

cn =

  1  1− j 1  1− j 1− j −j 1− j c2n + c2n+1 , dn = c2n − c2n+1 , n ∈ Z, 2 2

for j = 1, 2, . . . , J . In particular, the approximant ck−1 and the detail dk−1 of x after 1-level wavelet transform decomposition are given by ⎧ 1 ⎪ ⎪ ⎨2,

cn−1

=

dn−1

⎧ 1 ⎪ ⎪ − , ⎪ ⎨ 2 = 1, ⎪ ⎪ ⎪ ⎩2 0,

1, ⎪ ⎪ ⎩ 0,

n = 0, 8, n = 1, 2, . . . 7, elsewhere;

and n = 0, n = 8, elsewhere.



Example 2 In this example we show the graphs of the details and approximant of an audio signal after 3-level wavelet decomposition with the D4 orthogonal filter. The original signal is shown on the top of Fig. 8.4 and the approximant at the bottom of the figure. The details are in the middle of this picture. Observe that after each level of DWT, the details and approximants are essentially halved.  DWT and orthogonal transformations For a finite signal, after a periodic extension, the orthogonal DWT can be expressed as a linear transformation of an orthogonal matrix. Let p = { pk } be a QMF supported on [0, N ] for some odd N , namely: pk = 0 for k < 0 or k > N . Let q = {qk } be a corresponding highpass filter with qk = (−1)k p N −k , such that q is also supported on [0, N ]. Let {x0 , x1 , . . . , x2L−1 } be a signal with even length, say 2L, for some L > N . Extend this finite signal 2L-periodically to an infinite sequence, denoted by x = {xk }, k ∈ Z. Let c and d be the approximant and detail of x, respectively, obtained by (8.3.4) with analysis filters p and q. Then both c and d are L-periodic (see Exercise 8). Furthermore, we have ⎡ ⎡ ⎤ ⎤ ⎡ ⎤ ⎤ x0 x0 d0 c0 ⎢ .. ⎥ 1 ⎢ .. ⎥ ⎢ .. ⎥ 1 ⎢ .. ⎥ ⎣ . ⎦ = P ⎣ . ⎦, ⎣ . ⎦ = Q⎣ . ⎦, 2 2 c L−1 x2L−1 d L−1 x2L−1 ⎡

where P and Q are L × (2L) matrices given by

(8.3.14)

412

8 Wavelet Transform and Filter Banks

1 0 −1 0 1

50

100

150

200

250

0 −1 0 0.5

20

40

60

80

100

120

140

10

20

30

40

50

60

70

5

10

15

20

25

30

35

5

10

15

20

25

30

35

0 −0.5 0 0.5 0 −0.5 0 5 0 −5 0

Fig. 8.4. From top to bottom: original signal, details after 1-, 2-, 3-level DWT, and the approximant

⎤ p0 p1 p2 p3 · · · p N 0 0 ··· 0 0 0 0 ⎢ 0 0 p0 p1 · · · p N −2 p N −1 p N · · · 0 0 0 0 ⎥ ⎥ ⎢ ⎢ .. .. .. .. .. .. .. .. ⎥ , P = ⎢ ... ... ... ... ... ⎥ . . . . . . . . ⎥ ⎢ ⎣ p4 p5 p6 p7 · · · 0 0 0 · · · p0 p1 p2 p3 ⎦ p2 p3 p4 p5 · · · 0 0 0 · · · 0 0 p0 p1 ⎡ ⎤ q0 q1 q2 q3 · · · q N 0 0 ··· 0 0 0 0 ⎢ 0 0 q0 q1 · · · q N −2 q N −1 q N · · · 0 0 0 0 ⎥ ⎢ ⎥ ⎢ .. .. .. .. .. .. .. .. ⎥ . Q = ⎢ ... ... ... ... ... . . . . . . . . ⎥ ⎢ ⎥ ⎣ q4 q5 q6 q7 · · · 0 0 0 · · · q0 q1 q2 q3 ⎦ q2 q3 q4 q5 · · · 0 0 0 · · · 0 0 q0 q1 ⎡

(8.3.15)

(8.3.16)

Let  x be the sequence obtained by IDWT (8.3.5). Then it can be shown, by using the L-periodicity of c and d, that  x is 2L-periodic (see Exercise 10), and that

8.3 Discrete Wavelet Transform

413



⎤ ⎡ ⎤ ⎡ ⎤  x0 c0 d0 ⎢ .. ⎥ T ⎢ . ⎥ T ⎢ . ⎥ ⎣ . ⎦ = P ⎣ .. ⎦ + Q ⎣ .. ⎦ .  x2L−1 c L−1 d L−1

(8.3.17)

Thus, from (8.3.14) and (8.3.17), we have ⎡ ⎡ ⎤ ⎤ ⎤ ⎤ ⎡ x0 x0  x0 x0 ⎢ .. ⎥ . ⎥ . ⎥ 1 T . ⎥ T 1P ⎢ T1 ⎢ T ⎢ ⎣ .. ⎦ + Q Q ⎣ .. ⎦ = P P + Q Q ⎣ .. ⎦ . ⎣ . ⎦=P 2 2 2  x2L−1 x2L−1 x2L−1 x2L−1 ⎡

Hence, if p and q are orthogonal, then  x = x, and we conclude that P T P + QT Q = 2I, which leads to the following theorem. Theorem 2 DWT and orthogonal transformations Let p and q be orthogonal filters supported on [0, N ]. Define 

, P W= , Q where P and Q are defined by (8.3.15) and (8.3.16). Then matrix, namely: W T W = 2I.

(8.3.18) √1 W 2

is an orthogonal (8.3.19)

In fact, (8.3.19) can be derived directly from the orthogonality of p, q given in (8.2.25)–(8.2.27) of the previous section on p.399 (see Exercises 11–14). Denote x2L = [x0 , . . . , x2L−1 ]T , c L = [c0 , . . . , c L−1 ]T , x2L = [ x0 , . . . ,  x2L−1 ]T . d L = [d0 , . . . , d L−1 ]T ,  Then (8.3.14) and (8.3.17) can be written as ⎡ ⎤ cL 1 ⎣ ⎦ = W x2L ,  x2L = W T ⎣ ⎦ . 2 dL dL ⎡

cL



Thus, DWT and IDWT can be expressed as orthogonal transformations.



414

8 Wavelet Transform and Filter Banks

DWT and lifting scheme DWT and IDWT can be implemented as liftingscheme algorithms. This topic will be discussed in some detail in Sect. 9.4 of the next chapter. Haar filters For example, for the unnormalized Haar filter bank  L(z) = 21 (z) = 1 (1 − z) and L(z) = 1 + z, H (z) = 1 − z, the corresponding (1 + z), H 2 (modified) forward and backward lifting algorithms for input x = (xk ) are given as follows. Forward lifting (wavelet decomposition) algorithm: Splitting: c = (x2k ), d = (x2k+1 ); 1 Step1. c(1) = (c + d); 2 Step2. d(1) = d − c(1) . c(1) and d(1) are the lowpass and highpass outputs. Backward lifting (wavelet reconstruction) algorithm: Step1.

d = d(1) + c(1) ;

Step2. c = 2c(1) − d; Combining: x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs c(1) and d(1) . See Fig. 8.5 for the algorithms. 5/3-tap biorthogonal filters As another example, DWT and IDWT with 5/3-biorthogonal filter bank considered in Example 5 on p.402 can be written as (modified) forward and backward lifting for input x: Forwardlifting(waveletdecomposition)algorithm: Splitting: c = (x2k ), d = (x2k+1 ); 1 (1) Step1. ck = ck − (dk + dk−1 ); 2  1  (1) (1) (1) Step2. dk = dk + ck + ck+1 ; 4 1 (2) (1) (2) Step3. ck = dk , dk = ck(1) . 2 c(2) and d(2) are the lowpass and highpass outputs. Backward lifting (wavelet reconstruction) algorithm: Step1.

(1)

(2)

(1)

ck = 2dk , dk

(2)

= ck ;

8.3 Discrete Wavelet Transform

415

Fig. 8.5. From top to bottom: (modified) forward and backward lifting algorithms of Haar filters

 1  (1) (1) ck + ck+1 ; 4 1 (1) Step3. ck = ck + (dk + dk−1 ); 2 Combining: x = (. . . , ck , dk , . . .).

Step2.

(1)

dk = dk −

x is recovered from its lowpass and highpass outputs c(2) and d(1) . See Fig. 8.6 for the algorithms.



Two-dimensional DWT The MRA architecture can be used to construct compactly supported wavelets of higher dimensions. Here we provide the tensor-product wavelets (also called separable wavelets) on R2 and the associated DWT and IDWT. Suppose φ is a compactly supported scaling function with an FIR lowpass filter p = { pk }, and ψ is an associated compactly supported wavelet with a highpass filter q = {qk }. Recall that the nested sequence of spaces V j = span{φ(2 j x − k) : k ∈ Z}, −∞ < j < ∞, constitutes an orthogonal MRA, and V j+1 = V j ⊕⊥ W j (namely, V j+1 = V j + W j and V j ⊥W j ), where W j = span{ψ(2 j x − k) : k ∈ Z}. Define V j = V j ⊗ V j = { f (x)g(y) : f (x) ∈ V j , g(y) ∈ V j }.

416

8 Wavelet Transform and Filter Banks

Fig. 8.6. From top to bottom: (modified) forward and backward lifting algorithms of 5/3-tap biorthogonal filters

Then {V j }, j ∈ Z, form a nested sequence of subspaces of L 2 (R2 ) with the density property, namely; {0} ← · · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ V2 ⊂ · · · → L 2 (R2 ). In addition, with the notations (1)

Wj we have

(2)

= Wj ⊗ Vj, Wj

(3)

= Vj ⊗ Wj, Wj (1)

(2)

= Wj ⊗ Wj, (3)

V j+1 = V j ⊕⊥ W j ⊕⊥ W j ⊕⊥ W j .

Let us now introduce the 2-dimensional separable scaling function and wavelets (x, y) = φ(x)φ(y),  (1) (x, y) = ψ(x)φ(y),  (2) (x, y) = φ(x)ψ(y),  (3) (x, y) = ψ(x)ψ(y), and for f (x, y) defined in R2 , denote f j,k1 ,k2 (x, y) = 2 j f (2 j x − k1 , 2 j y − k2 ), j, k1 , k2 ∈ Z.

8.3 Discrete Wavelet Transform

417

Then   (1)   j,k1 ,k2 : k1 , k2 ∈ Z ,  j,k1 ,k2 : k1 , k2 ∈ Z ,     (2) (3)  j,k1 ,k2 : k1 , k2 ∈ Z ,  j,k1 ,k2 : k1 , k2 ∈ Z , (2) (3) are orthonormal bases of V j , W (1) j , W j , W j , respectively. This, together with the fact that   (1) (2) (3) = L 2 (R2 ), ⊕⊥j∈Z ⊕⊥ W j ⊕⊥ W j ⊕⊥ W j

implies that



 (2) (3) ,  ,  : j, k , k ∈ Z  (1) 1 2 j,k1 ,k2 j,k1 ,k2 j,k1 ,k2

is an orthonormal basis of L 2 (R2 ).  (0) For a sequence c(0) = ck1 ,k2 : k1 , k2 ∈ Z , the J -level 2-dimensional (tensorproduct) DWT is given by ⎧ 1

−j 1− j ⎪ ⎪ c = pk1 −2n 1 pk2 −2n 2 ck1 ,k2 , ⎪ n ,n 1 2 ⎪ ⎪ 4 ⎪ k1 ,k2 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 1

⎪ − j,1 1− j ⎪ ⎪ dn 1 ,n 2 = qk1 −2n 1 pk2 −2n 2 ck1 ,k2 , ⎪ ⎪ 4 ⎪ ⎨ k1 ,k2 (8.3.20)

⎪ ⎪ 1

− j,2 1− j ⎪ ⎪ = pk1 −2n 1 qk2 −2n 2 ck1 ,k2 , d ⎪ n ,n 1 2 ⎪ 4 ⎪ ⎪ k1 ,k2 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 1

⎪ − j,3 1− j ⎪ qk1 −2n 1 qk2 −2n 2 ck1 ,k2 , n 1 , n 2 ∈ Z, dn 1 ,n 2 = ⎪ ⎪ ⎩ 4 k1 ,k2

for j = 1, 2, . . . , J ; and the 2-dimensional (tensor-product) IDWT is given by 1− j

cn 1 ,n 2 =

k1 ,k2

+

−j

pn 1 −2k1 pn 2 −2k2 ck1 ,k2 +



k1 ,k2

− j,2

k1 ,k2

pn 1 −2k1 qn 2 −2k2 dk1 ,k2 +

− j,1

qn 1 −2k1 pn 2 −2k2 dk1 ,k2



k1 ,k2

(8.3.21)

− j,3

qn 1 −2k1 qn 2 −2k2 dk1 ,k2 ,

for j = J, J − 1, . . . , 1. Each c− j is called a ( j-level) approximant of c(0) , and d− j,1 , d− j,2 , d− j,3 are called ( j-level) details of c(0) . q = { qk }, p = { pk }, q = {qk }, the 2For a biorthogonal filter bank  p = { pk },  pk and  qk , dimensional DWT is given by (8.3.20), with pk and qk being replaced by  respectively, and the IDWT is given by (8.3.21).

418

8 Wavelet Transform and Filter Banks

Exercises Exercise 1 Show that (8.3.4) and (8.3.5) are equivalent to (8.3.8) and (8.3.9), respectively. Exercise 2 Derive (8.3.3). Exercise 3 Let p = { pk }, q = {qk } be the Haar filters with p0 = p1 = 1, q0 = 1, q1 = −1 and pk = 0, qk = 0 for k < 0 or k > 1. Formulate the modulation matrix M p,q (z) of p and q, and verify that  T  1 M p,q (z) M p,q = I2 , z ∈ C\{0}. z Exercise 4 Repeat Exercise 3 with p, q being the D4 -filters given in Example 4 of the previous section on p.399. Exercise 5 Let  p,  q , p, q be the 5/3-tap biorthogonal filters given in Example 5 of the previous section on p.402. Verify the validity of (8.3.10) for this filter bank. Exercise 6 Let x = {xk } be the sequence with xk = 1 for 1 ≤ k ≤ 8 and xk = 0 for k < 1 or k > 8. Determine the approximant c = {ck } and the details d = {dk } of x after 1-level wavelet decomposition with the D4 -filters p and q given on p.400. Exercise 7 Repeat Exercise 6 with the D4 -filters p, q replaced by  p,  q of 5/3-tap biorthogonal filters given on p.402. Exercise 8 Let x = {xk } be a sequence with period 2L for some positive integer L, namely: xk+2L = xk , k ∈ Z. Show that y = {yk }, defined by yn =

1

h k−2n xk , 2 k

for some constants h k , is L-periodic. Exercise 9 Let  x be a sequence defined by (8.3.5) for some sequences c and d for some pk , qk . Show that if c and d are L-periodic, then  x is 2L-periodic. Exercise 10 Derive the formula (8.3.14). Exercise 11 Derive the formula (8.3.17). Exercise 12 Let p = { pk } and q = {qk } be two FIR filters, and let P(z) and Q(z) be their two-scale symbols. Show that (8.2.25), (8.2.26), (8.2.27) of the previous section on p.399 are, respectively, equivalent to

pk pk−2n = 2δn , n ∈ Z; (a) k

8.3 Discrete Wavelet Transform

(b) (c)

k

419

qk qk−2n = 2δn , n ∈ Z; pk qk−2n = 0, n ∈ Z.

k

Exercise 13 Let P and Q be the L × (2L) matrices defined by (8.3.15) and (8.3.16) for p and q supported on N with N < L. As a continuation of Exercise 12, show that (a), (b), (c) in Exercise 12 are respectively equivalent to (i) PP T = 2I L ; (ii) QQT = 2I L ; (iii) PQT = 0. Exercise 14 As a continuation of Exercises 12 and 13, show that p and q are orthogonal (i.e. they satisfy (8.2.25), (8.2.26) and (8.2.27) on p.399), if and only if the matrix W, defined by (8.3.18), satisfies (8.3.19). Hint: (8.3.19) is equivalent to WW T = 2I . Exercise 15 Derive the formula (8.3.20). Exercise 16 Derive the formula (8.3.21).

8.4 Perfect-Reconstruction Filter Banks Recall that the “integral convolution” of two functions f, g ∈ (L 2 ∩ L 1 )(R), defined by   ( f ∗ g)(x) =



−∞

f (t)g(x − t)dt =



−∞

f (x − t)g(t)dt,

possesses the Fourier transform property ( f ∗ g)(ω) =  f (ω) g (ω). This important property extends to the z-transform of bi-infinite sequences in 2 = 2 (Z). Recall that the z-transform of a sequence x = {x j } ∈ 2 is defined by ∞

X (z) = xj z j, j=−∞

and the “discrete convolution” of two sequences x, y ∈ 2 by (x ∗ y)k =



j=−∞

x j yk− j =



j=−∞

xk− j y j .

420

8 Wavelet Transform and Filter Banks

Then it can be shown that the z-transform of the discrete convolution w = x ∗ y of x and y is given by W (z) = X (z)Y (z). (8.4.1) Indeed, by a change of summation indices (from j − k to n below), we have W (z) =



wj z j =

j=−∞

=

3 ∞



j=−∞

=





j=−∞

4



xk y j−k z j =

k

xk

(x ∗ y) j z j

3

3

yn z



z = k

n

k=−∞

xk ⎝



k=−∞

4 n



j

xk z

⎞ y j−k z j−k ⎠ z k

43

k



4 yn z

n

n=−∞

k=−∞

= X (z)Y (z). Example 1 Let x = {x j } be the sequence defined by x0 = 1, x1 = 1 and x j = 0, j = 0, 1. Then X (z) = 1 + z and (x ∗ x)k =



x j xk− j =

j

= xk + xk−1

1

j=0

⎧ ⎨ 1, = 2, ⎩ 0,

x j xk− j =

1

xk− j

j=0

if k = 0, 2, if k = 1, otherwise,

so that the z-transform of x ∗ x is given by 1 + 2z + z 2 = (1 + z)2 = (X (z))2 . This verifies the property in (8.4.1), namely: “the z-transform” of the convolution of x with itself is the square of the z-transform of x.  In the following, we introduce the very general notion of FIR lowpass and highpass filters. A finite impulse response (FIR) lowpass filter  = { j } is a sequence of real numbers  j that satisfies (i) (  j = 0 except for finitely many j ∈ Z,  j = c0 for a constant c0 > 0, and (ii) (j j (iii) j (−1)  j = 0; and an FIR highpass filter h = {h j } is a sequence of real numbers h j that satisfies (i) ( h j = 0 except for finitely many j ∈ Z, h j = 0, and (ii) (j j (iii) j (−1) h j = c0 . Let L(z) and H (z) be the z-transforms of  and h, respectively. Then it is clear from the above definitions of  and h, that

8.4 Perfect-Reconstruction Filter Banks

π



421

π



Fig. 8.7. Ideal lowpass filter (on left) and highpass filter (on right)

L(1) = c0 , L(−1) = 0; and H (1) = 0, H (−1) = c0 . For z = e−iω , the two trigonometric polynomials L(e−iω ) and H (e−iω ), as functions of ω, are also called the frequency responses of  and h. We remark that ideal lowpass and highpass filters, with frequency responses shown in Fig. 8.7, are not FIR filters, since the Fourier series on the interval (0, 2π) of the characteristic function of (0, π), or of [π, 2π), has infinitely many non-zero coefficients (see Sect. 6.1 of Chap. 6). The implementation of lowpass and highpass filtering, by using some FIR lowpass and highpass filters s = {s j } and r = {r j }, respectively, can be represented as follows:

x : (x) j = x j

(  ∗s −→ x L k = j s j xk− j , Lowpass output (  ∗r −→ x H k = j r j xk− j , Highpass output

where the lowpass and highpass outputs x L and x H are the low-frequency and highfrequency components of the input x, and they are obtained by taking discrete convolutions with the “finite” sequences {s j } and {r j }, which are therefore finite sums. Example 2 Let s = {s j } and r = {r j } be the lowpass and highpass filters defined by s−1 = s0 = 21 , s j = 0, j = −1, 0, and r−1 = − 21 , r0 = 21 , r j = 0, j = −1, 0. Then the lowpass output x L and highpass output x H of the input sequence x = {x j }, obtained by using the filters s and r, are given by xkL = that is,

1 1 (xk + xk+1 ), xkH = (xk − xk+1 ); 2 2

422

8 Wavelet Transform and Filter Banks



⎡ ⎤ ⎤ .. .. . . ⎢ ⎢ ⎥ ⎥ ⎢ x−1 + x0 ⎥ ⎢ x−1 − x0 ⎥ ⎢ ⎢ ⎥ ⎥ 1⎢ 1⎢ x0 + x1 ⎥ x0 − x1 ⎥ L H ⎢ ⎢ ⎥ ⎥. , x = ⎢ x = ⎢ 2 ⎢ x1 + x2 ⎥ 2 ⎢ x1 − x2 ⎥ ⎥ ⎥ ⎢ x2 + x3 ⎥ ⎢ x2 − x3 ⎥ ⎣ ⎣ ⎦ ⎦ .. .. . .



Observe that for a “finite” input sequence x (i.e. a bi-infinite sequence with finite index-support), the lengths (i.e. the smallest index-support) of the output sequences x L and x H are equal, and that this length is the same as the length of the input sequence x. So, at least intuitively, “half” of the contents of x L and “half” of those of x H could be deleted without loss of data information. To accomplish this, we will next introduce the “downsampling operator” D to drop “half” of the lowpass and highpass output sequence components; namely, to replace x L and x H by x L D and x H D that have half of the contents of x L and x H , respectively. We will also introduce the “upsampling operator” U for recovering the input sequence x from the downsampled outputs x L D and x H D . Definition 1 Downsampling and upsampling operations The operator D : 2 (Z) → 2 (Z), defined by dropping the odd-indexed terms; that is, D(. . . , x−2 , x−1 , x0 , x1 , x2 , . . .) = (. . . , x−2 , x0 , x2 , . . .), is called the downsampling operator. The operator U : 2 (Z) → 2 (Z), defined by inserting a zero in-between every two consecutive terms; that is, U (. . . , y−1 , y0 , y1 , . . .) = (. . . , 0, y−1 , 0, y0 , 0, y1 , 0, . . .), is called the upsampling operator. More precisely, the downsampling and upsampling operators, D and U , are defined by D

x −→ x D ,



xD

 k

= x2k , k ∈ Z;

U

y −→ U y, (U y)2 j = y j , (U y)2 j+1 = 0, j ∈ Z, respectively. The operations of D and U are represented as follows: x → 2↓ → x D ; y → 2↑ → U y.

8.4 Perfect-Reconstruction Filter Banks

423

Example 3 Let x L and x H be the lowpass and highpass outputs of the input sequence x, obtained by applying the filters s and r given in Example 2. Then the downsampled sequences x L D and x H D , obtained by applying the downsampling operator D to x L and x H , respectively, are given by ⎡

xLD

⎡ ⎤ ⎤ .. .. . . ⎢ ⎢ ⎥ ⎥ ⎢ x−2 + x−1 ⎥ ⎢ x−2 − x−1 ⎥ ⎢ ⎥ ⎥ 1⎢ 1 HD ⎢ ⎥ x0 + x1 ⎥ = ⎢ ⎥ , x = 2 ⎢ x0 − x1 ⎥ . 2⎢ ⎢ x2 + x3 ⎥ ⎢ x2 − x3 ⎥ ⎣ ⎣ ⎦ ⎦ .. .. . .

Next, by applying the upsampling operator U to x L D and x H D , we obtain ⎡

U xLD

⎡ ⎤ ⎤ .. .. . . ⎢ ⎢ ⎥ ⎥ ⎢ ⎢ ⎥ ⎥ 0 0 ⎢ ⎢ ⎥ ⎥ ⎢ x−2 + x−1 ⎥ ⎢ x−2 − x−1 ⎥ ⎢ ⎢ ⎥ ⎥ ⎢ ⎢ ⎥ ⎥ 0 0 ⎢ ⎥ ⎥ 1⎢ 1 HD ⎢ ⎢ ⎥ = ⎢ x0 + x1 ⎥ ; U x = ⎢ x0 − x1 ⎥ ⎥. 2⎢ 2⎢ ⎥ ⎥ 0 0 ⎢ ⎢ ⎥ ⎥ ⎢ x2 + x3 ⎥ ⎢ x2 − x3 ⎥ ⎢ ⎢ ⎥ ⎥ ⎢ ⎢ ⎥ ⎥ 0 0 ⎣ ⎣ ⎦ ⎦ .. .. . .



Example 4 Let {, h} be a lowpass-highpass filter pair, with  = { j } and h = {h j } defined by  0 = 1 = 1,  j = 0, j = 0, 1; (8.4.2) h 0 = 1, h 1 = −1, h j = 0, j = 0, 1, then for any input sequence y, its lowpass and highpass outputs y ∗  and y ∗ h, obtained by applying the filter pair {, h}, are given by (y ∗ ) j = y j + y j−1 , (y ∗ h) j = y j − y j−1 , by following the same argument as given in Example 2. Thus, as a continuation of Example 3, we observe that

424

8 Wavelet Transform and Filter Banks

⎤ ⎤ ⎡ .. .. . . ⎥ ⎥ ⎢ ⎢ ⎢ x−2 + x−1 ⎥ ⎢ x−2 − x−1 ⎥ ⎥ ⎥ ⎢ ⎢ ⎢ x−2 + x−1 ⎥ ⎢ x−1 − x−2 ⎥ ⎥ ⎥ ⎢ ⎢ LD LD 1⎢ 1⎢ x0 + x1 ⎥ x0 − x1 ⎥ ⎥ ⎥, ⎢ ⎢ , Ux Ux ∗= ⎢ ∗h= ⎢ 2 ⎢ x0 + x1 ⎥ 2 ⎢ x1 − x0 ⎥ ⎥ ⎥ ⎢ x2 + x3 ⎥ ⎢ x2 − x3 ⎥ ⎥ ⎥ ⎢ ⎢ ⎢ x2 + x3 ⎥ ⎢ x3 − x2 ⎥ ⎦ ⎦ ⎣ ⎣ .. .. . . ⎡

so that, by summing the above two results, we have LD Ux ∗  + U xLD ∗ h = x.



We remark that filters s, r, , h in Examples 2, 3 and 4 satisfy sj =

1 1 − j , r j = h − j , j ∈ Z, 2 2

and that {, h} is precisely the Haar filter pair, as introduced earlier. In what follows, s and r are said to be obtained by “time-reversal” of the filters 1 1 2  and 2 h, respectively (and for convenience, they are also called “time-reverses” 1 of 2  and 21 h). In general, the time-reverse of a (bi-infinite) sequence g = {gk } is defined by g− = {gk− }, where gk− = g−k , k ∈ Z. Hence, if G(z) denotes the z-transform   of the filter g, then the z-transform of the time reverse g− of g is given by G 1z . Definition 2 Perfect-reconstruction filter banks A pair { ,  h} of lowpass and highpass filters is said to have a PR dual pair {, h}, if these two pairs provide a “perfect-reconstruction” (PR) filter bank, in the sense as described by the data-flow diagram in Fig. 8.8.

Fig. 8.8. Decomposition and reconstruction algorithms with PR filter bank

8.4 Perfect-Reconstruction Filter Banks

425

Observe that the PR dual of the Haar filter pair {, h} given in Example 4 is the  pair 21 , 21 h . Next we discuss the condition under which two pairs of digital filters constitute a PR filter bank. (z) denote the z-transforms of , h, Theorem 1 Let L(z), H (z),  L(z), H ,  h, respectively. Then the two filter pairs { ,  h} and {, h} constitute a PR filter bank, if and only if          1 1  1  −1 + H (z) H X (z) + L(z) L − + H (z) H X (−z) z z z z = 2X (z) (8.4.3)



L(z) L

for any bi-infinite sequence with z-transform X (z). Proof Let us first consider the z-transform of y = U Dw, obtained by downsampling, followed by upsampling, of any sequence w = {w j }. Then from the definitions of D and U , we have y = UDw = (. . . , 0, w−2 , 0, w0 , 0, w2 , . . .); that is, y2 j = w2 j , y2 j−1 = 0, j ∈ Z. Thus, the z-transform of y is given by Y (z) =



yk z k =



k

j

y2 j z 2 j =



w2 j z 2 j .

j

On the other hand, since

w2 j z 2 j =

j

1 + (−1)k 2

k

wk z k

4 3

1

1 = wk z k + (−1)k wk z k = W (z) + W (−z) , 2 2 k

k

it follows that Y (z) =

1 (W (z) + W (−z)). 2

In light of the above derivation, we may conclude that the z-transforms of − UD(x ∗   ) and UD(x ∗  h− ) are given by             1 1  1 1 1  1   X (z) + L − X (−z) , X (z) + H − X (−z) . L H 2 z z 2 z z Therefore, the two pairs {, h} and { ,  h} constitute a PR filter bank {, h}, { ,  h}, if and only if

426

8 Wavelet Transform and Filter Banks

            1 1  1 1  1  − 1 X (−z) H (z) X (z) +  L − X (−z) L(z) + X (z) + H L H 2 z z 2 z z = X (z),



which is (8.4.3).

We remark that from the symmetric formulation in (8.4.3), we may also conclude   that if {, h} is a PR dual of  ,  h , then  ,  h is a PR dual of {, h}. In order to apply (8.4.3) in Theorem 1 to verify if two filter pairs constitute a PR filter bank, it is necessary to show that (8.4.3) holds for all X (z). This is not an easy (if feasible) task, since both X (z) and X (−z) appear in (8.4.3). To eliminate one of them, say X (−z), in (8.4.3), we may additionally assume that the z-transforms of the two filter pairs {, h} and  ,  h satisfy the identity:     1 1   + H (z) H − = 0, L(z) L − z z so that (8.4.3) reduces to the identity L(z) L

    1  1 = 2. + H (z) H z z

The above two identities can also be formulated by replacing z with −z, yielding the following four identities:     1 1   + H (z) H = 2, L(z) L z z     1  1 = 0, + H (−z) H L(−z) L z z     1 1   + H (−z) H − = 2, L(−z) L − z z     1  − 1 = 0. L(z) L − + H (z) H z z

(8.4.4a) (8.4.4b) (8.4.4c) (8.4.4d)

To re-formulate these 4 identities in a more compact form, we introduce the notion of modulation matrices, M L ,H (z) and M  (z), corresponding to the two L, H  (z) of Laurent polynomials, defined by pairs {L(z), H (z)} and  L(z), H  M L ,H (z) =

,   L(z) L(−z) L(z) ; M (z) =  L, H (z) H (z) H (−z) H

,  L(−z) (−z) . H

Then (8.4.4) (that is, the totality of (8.4.4a)–(8.4.4d)) is equivalent to the matrix identity:

8.4 Perfect-Reconstruction Filter Banks



M L ,H (z)

T

M  L, H

427

  1 = 2I2 , z ∈ C\{0} z

(8.4.5)

(see Exercise 7). Since the totality of (8.4.4a)–(8.4.4d) implies the validity of (8.4.3) for all X (z), the following result is an immediate consequence of Theorem 1. Theorem 2 Perfect-reconstruction filter banks Suppose that the modulation matrices M L ,H (z) and M  (z) corresponding to the two z-transform pairs {L(z), L, H (z)} of the filter pairs {, h} and { H (z)} and { L(z), H ,  h}, respectively, satisfy the matrix identity (8.4.5). Then the filter bank { ,  h, , h} constitutes a PR filter bank. Remark 1 Biorthogonal filters are PR filter banks Let p = { pk }, q = {qk }, q = { qk } be FIR filters, and set  p = { pk },  1 1 pk ,  qk . k = p k , h k = q k ,  k =  hk =  2 2 Then the decomposition and reconstruction algorithms (8.3.4)–(8.3.5) on p.406 can be formulated as − h− ∗ x) ↓ 2, c = (  ∗ x) ↓ 2, d = (  x = (c ↑ 2) ∗  + (d ↑ 2) ∗ h,

(8.4.6) (8.4.7)

where ↓ 2 and ↑ 2 denote the downsampling and upsampling operators. Therefore, if p, q,  p,  q are biorthogonal, then the filter bank { ,  h, , h} constitutes a PR filter bank. For this reason, the biorthogonal filters p, q,  p,  q are also called PR filters.  and Q(z)  Now, let P(z), Q(z), P(z) be the two-scale symbols of p, q,  p and  q, (z) be the z-transforms of , h,  L(z) and H  and  h, respectively; and let L(z), H (z),  respectively. Then we have 1 1 L(z), Q(z) = H (z), 2 2  =  =H (z). P(z) L(z), Q(z) P(z) =

Thus, the condition (8.3.11) on p.409 for the biorthogonality of p, q,  p,  q is equivalent to (8.4.5). Hence, Theorem 2 can be derived directly from Remark 2 on p.410 in the previous section.  Next, let us apply the above result to determine the existence of an FIR dual filter pair { ,  h} for a given FIR filter pair {, h}. This problem is equivalent to studying the (z) with modulation matrix M existence of Laurent polynomials  L(z) and H  (z) L, H that satisfies the matrix identity (8.4.5). To this end, we call a Laurent polynomial w(z) a monomial, if w(z) = cz −m for some m ∈ Z and some constant c = 0. Theorem 3 Existence of FIR PR dual Let {, h} be an FIR filter pair with z-transform pair {L(z), H (z)} and modulation matrix M L ,H (z). Then there exists

428

8 Wavelet Transform and Filter Banks

an FIR dual filter pair { ,  h} corresponding to {, h}, provided that det M L ,H (z) is a monomial. Proof From the assumption on the modulation matrix M L ,H (z), we may write det M L ,H (z) = cz m for some integer m and constant c = 0. First observe that by interchanging its two columns, the matrix M L ,H (z) becomes M L ,H (−z), so that det M L ,H (−z) = −cz m . On the other hand, by replacing z with −z in the matrix, we also have det M L ,H (−z) = c(−z)m . Hence, m is an odd integer. Let us now introduce the two Laurent polynomials   1 2  , L(z) = z m H − c z

  (z) = − 2 z m L − 1 . H c z

(8.4.8)

From the fact that m is an odd integer, it is easy to verify that the modulation matrix M  (z) satisfies (8.4.5) (see Exercise 9). Thus, it follows from Theorem 2 that the L, H filter bank { ,  h, , h} constitutes a PR filter bank; that is, { ,  h} is a dual filter pair of {, h}.  Example 5 Since the z-transforms L(z), H (z) of the Haar filter pair {, h} in Example 4 are given by L(z) = 1 + z,

H (z) = 1 − z,

it is easy to compute the determinant of the modulation matrix, namely: det M L ,H (z) = det



L(z) L(−z) H (z) H (−z)

,

= (1 + z)(1 + z) − (1 − z)(1 − z) = 4z, which is indeed a monomial of odd degree. Therefore, it follows from (8.4.8), with m = 1, c = 4, in the proof of Theorem 3, that     2 1 1 1 1  L(z) = z H − = z· 1− = (z + 1), 4 z 2 −z 2     1 z 1 1 2 (z) = − z L − =− · 1+ = (−z + 1). H 4 z 2 −z 2 That is, the PR dual of {, h} is the FIR filter pair time-reverse of {s, r}, as discussed in Example 2.

1

1 2 , 2 h



, which is precisely the  T Let us now return to (8.4.5) and observe that the matrix 21 M L ,H (z) is the 1 left-inverse of M  ( z ). Since the right-inverse is the same as the left-inverse for L, H

8.4 Perfect-Reconstruction Filter Banks

429

non-singular square matrices, we have   T 1 1 M L ,H (z) = I2 , M  L, H z 2 or equivalently,  T 1 = 2I2 , z ∈ C\{0}. M  (z) M L ,H L, H z 

Hence, the filter pair { ,  h} is dual to the filter pair {, h}, if and only if ⎡     ⎤ : 9 1 1  L H L(z) L(−z) z z ⎥ ⎢ · ⎣     ⎦ = 2I2 , 1 1   H (z) H (−z) L −z H −z which is equivalent to a set of four identities:     1 1 + L(−z)L − = 2, z z     1 1  + L(−z)H − = 0, L(z)H z z     (−z)L − 1 = 0, (z)L 1 + H H z z     1 (−z)H − 1 = 2. (z)H +H H z z

 L(z)L

(8.4.9a) (8.4.9b) (8.4.9c) (8.4.9d)

Remark 2 Lowpass/highpass dual filters and filter orthogonality Among the above four identities, (8.4.9a) describes the “duality” of the lowpass filters:  and  , while (8.4.9d) describes the “duality” of the highpass filters: h and  h. On the other hand, (8.4.9b) and (8.4.9c) describe, respectively, the “orthogonality” between   and h and the “orthogonality” between  and  h. These four relations, with two on duality and two on orthogonality, are very useful for the design of PR filter banks (see Exercise 7).  Example 6 Let 1 (z) = 1 (1 − z) L(z) = 1 + z, H (z) = 1 − z,  L(z) = (1 + z), H 2 2 be the z-transforms of the Haar filter bank {, h  ,  h} considered in Example 5. In this example we verify that these Laurent polynomials satisfy the identities (8.4.9a)–(8.4.9d), so that {, } is a lowpass dual filter pair, {h,  h} is a highpass

430

8 Wavelet Transform and Filter Banks

dual filter pair, while   is “orthogonal” to h, and  is “orthogonal” to  h. Indeed, first observe that       1 1 1 1 1  = (1 + z) 1 + = z+2+ , L(z)L z 2 z 2 z   so that  L(−z)L − 1z =



1 2

−z + 2 −

 L(z)L

1 z

 . Hence, we have

    1 1  + L(−z)L − = 2, z z

or L(z) and  L(z) satisfy (8.4.9a). Next, observe that (z)H H

      1 1 1 1 1 = (1 − z) 1 − = −z + 2 − , z 2 z 2 z

  (−z)H − 1 = so that H z

1 2



z+2+

1 z

 . Hence, we also have

    1 (−z)H − 1 = 2, +H z z

(z)H H

(z) satisfy (8.4.9d). In addition, since or H (z) and H  L(z)H

      1 1 1 1 1 = (1 + z) 1 − = z− , z 2 z 2 z

  so that  L(−z)H − 1z =

1 2

  −z + 1z , we have     1 1 + L(−z)H − = 0, z z

 L(z)H

which implies that (8.4.9b) holds. Finally, since (z)L H

      1 1 1 1 1 = (1 − z) 1 + = −z + , z 2 z 2 z

  (−z)L − 1 = so that H z

1 2



z−

(z)L H

1 z

 , we also have

    1 1  + H (−z)L − = 0, z z

which implies that (8.4.9c) holds as well.



8.4 Perfect-Reconstruction Filter Banks

431

Exercises Exercise 1 Compute the discrete convolution x ∗ y for each of the following pairs of bi-infinite sequences x = {x j }, y = {y j }. ⎧ ⎧ ⎪ ⎪ −1, for j = −1, for j = 0, ⎨ 2, ⎨ 2, for j = 0, yj = (a) x j = −1, for j = 1, −1, for j = 1, ⎩ ⎪ ⎪ 0, otherwise; ⎩ 0, otherwise. ⎧  for j = 0, ⎨ 1, 1, for j = 0 or 1, (b) x j = 1/2, for j = −1 or 1, y j = 0, otherwise. ⎩ 0, otherwise; (c) {x j } as given in (b), and y j = x j . Exercise 2 In each of (a), (b), and (c) in Exercise 1, compute the z-transforms X (z), Y (z) and W (z), where w = x ∗ y. Exercise 3 Among all the sequences: x, y and w = x ∗ y in each of (a), (b), and (c) in Exercise 1, identify those that are lowpass and those that are highpass filters. Exercise 4 Let  = { j } and h = {h j } be lowpass and highpass filters, respectively, as given in (a) and (b) below. For any bi-infinite sequence x = {x j }, write out the lowpass and highpass outputs x L and x H explicitly. ⎧ ⎨ 1/2, for j = −1 or 1, for j = 0, (a)  j = 1, ⎩ 0, otherwise; ⎧ ⎨ −1/2, for j = −1 or 1, for j = 0, h j = 1, ⎩ 0, otherwise. ⎧ ⎨ −1/2, for j = −1, for j = 0, (b)  is the same as in (a), and h j = 1/2, ⎩ 0, otherwise. Exercise 5 As a continuation of Exercise 4, give an explicit formulation of x L D and x H D for each of (a) and (b). Exercise 6 As a continuation of Exercise 5, give an explicit formulation of U x L D and U x H D for each of (a) and (b). Exercise 7 Show that (8.4.4) [that is, the totality of (8.4.4a)–(8.4.4d)] is equivalent to the matrix identity (8.4.5). Also, show that for any non-singular square matrix A, if a square matrix B satisfies AB = I , then B A = I . Finally, apply this fact to show that the totality of the identities (8.4.4a)–(8.4.4d) is equivalent to the totality of the identities (8.4.9a)–(8.4.9d). Exercise 8 Write out the details in the derivation of (8.4.4a) and (8.4.4b).

432

8 Wavelet Transform and Filter Banks

Exercise 9 Let M L ,H (z) be the modulation matrix of L(z), H (z). Suppose det M L ,H (z) = cz m (z), defined by (8.4.8), for some odd integer m and constant c = 0. Show that  L(z), H satisfy (8.4.5). Exercise 10 Let  = { j },  = {  j }, h = {h j } and  h = { h j } with z-transforms (z), respectively. Formulate the duality condition (8.4.4a) L(z),  L(z), H (z), and H and (8.4.4b) for the two pairs {, h} and { ,  h} in the time-domain; that is, formulate (z). j, h j,  h j instead of L(z),  L(z), H (z), H the condition in terms of  j ,  Exercise 11 Use the same notations as in Exercise 10 to formulate the following conditions in the time-domain: (a) the duality condition (8.4.9a) of  and  ; (b) the duality condition (8.4.9d) of h and  h; (c) the orthogonality condition (8.4.9b) of   and h, and (d) the orthogonality condition (8.4.9c) of  and  h.

Chapter 9

Compactly Supported Wavelets

The objective of this chapter is three-fold: firstly to discuss the theory and methods for the construction of compactly supported wavelets, including both orthogonal and biorthogonal wavelets as well as their corresponding scaling functions; secondly to study the stability and smoothness properties of these basis functions; and thirdly to investigate the details in the derivation of fast computational algorithms, the lifting schemes, for the implementation of wavelet decompositions and reconstructions. To accomplish this goal, our preliminary task is to introduce the concept of the transition operator T p , along with its representation matrix T p , associated with a given finite sequence p, as well as to study the properties of the eigenvalues and eigenvectors of the matrix T p . This will be discussed in Sect. 9.1, which also includes the Fundamental Lemma of transition operators that will be applied in the next chapter to prove the existence of the refinable function φ ∈ L 2 (R) with p as its refinement mask, as well as to characterize the stability of φ, again in terms of the transition operator. To facilitate our discussions, the Gramian function G φ,φ , defined in terms of the  and φ, by Fourier transform of a pair of refinable functions, φ G φ,φ (ω) =



  + 2π)φ(ω  + 2π), φ(ω

∈Z

is introduced in Sect. 9.2, where it is shown that G φ,φ (ω) =

 k



−∞

 − k)d x e−ikω . φ(x)φ(x

Hence, for refinement masks with only finitely many non-zero terms, since the corresponding refinable functions are compactly supported, the Gramian functions are trigonometric polynomials. When only one refinable function φ, corresponding to

C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_9, © Atlantis Press and the authors 2013

433

434

9 Compactly Supported Wavelets

some refinement mask, is considered, it is also shown in this section that φ is stable, if and only if the Gramian function G φ = G φ,φ is uniformly bounded both from above and from below by some positive constants. As a consequence, the refinable function φ is a scaling function, as introduced in Definition 1, in Sect. 8.2 of Chap. 8. In particular, φ is an orthonormal (i.e. orthogonal and with L 2 (R) norm = 1) scaling function, if and only if G φ is the constant function 1 (almost everywhere, if φ is not compactly supported). In Sect. 9.3, the MRA (multiresolution approximation) architecture generated by a compactly supported orthonormal scaling function φ, as introduced in Definition 1 in Sect. 8.2 of Chap. 8, is extended to an orthogonal multiresolution analysis (also denoted by MRA) architecture for the formulation of compactly supported orthogonal wavelets in terms of φ. When the trigonometric polynomial p(ω) (i.e. the two-scale symbol P(e−iω )) of some finite sequence p satisfies the quadrature mirror filter (QMF) identity, it is shown that the trigonometric polynomial q(ω) = − p(π − ω)e−i(2L−1)ω is the two-scale symbol of a compactly supported wavelet corresponding to the scaling function φ (for an arbitrary integer L), provided that the transition operator T p satisfies Condition E, in that the integer 1 is a simple eigenvalue of the representation matrix T p of T p and any other eigenvalue λ of T p satisfies |λ| < 1. The notions of sum rules and vanishing moments are also introduced in Sect. 9.3, and it is proved that the sum-rule property of a given order implies the same order of vanishing moments for orthogonal wavelets. This section ends with the derivation of the method, along with formulas, for constructing compactly supported orthogonal wavelets, and with examples of the wavelet filters D2n for n = 1, . . . , 4. Since compactly supported orthogonal wavelets, with the exception of the Haar wavelet, are not symmetric or anti-symmetric, and hence their corresponding wavelet filters are not linear-phase filters, the next section, Sect. 9.4, is devoted to the investigation of biorthogonal wavelets by using two different compactly supported scaling  and φ that are symmetric. For this study, “orthogonal basis” is replaced functions φ by the more general “Riesz basis”. The weaker condition for “Bessel sequence” is introduced, by dropping the lower-bound requirement for Riesz bases. It is proved in Sect. 9.4, however, that any pair of biorthogonal systems must satisfy the lower-bound condition, if they are Bessel sequences. Hence, to derive biorthogonal wavelets, it is sufficient to verify the biorthogonality property and the upper-bound condition. The good news in this regard is that, under a very mild smoothness condition, any compactly supported wavelet ψ with vanishing moment of order (at least) one (i.e. the integral of ψ over its support is zero) already generates a Bessel system {ψ j,k : j, k ∈ Z}, where ψ j,k (x) = 2 j/2 ψ(2 j x − k).  φ} of refinable functions corTo derive the biorthogonality property of a pair {φ, responding to the pair { p , p} of compactly supported refinement masks, it is again

9 Compactly Supported Wavelets

435

necessary and sufficient to assure that the Gramian function G φ,φ is the constant function 1. This is accomplished by extending the QMF identity to the dual QMF: p (ω + π) p(ω + π) = 1,  p (ω) p(ω) +  where  p (ω) and p(ω) are the two-scale symbols of  p and p, respectively. Further and ψ can be formulated via the more, the corresponding biorthogonal wavelets ψ two-scale symbols  q (ω) and q(ω) by applying the three identities: q (ω + π)q(ω + π) = 1,  q (ω)q(ω) +   q (ω) p(ω) +  q (ω + π) p(ω + π) = 0, p (ω + π)q(ω + π) = 0.  p (ω)q(ω) +  This section ends with a description of the method for constructing compactly supported symmetric (or anti-symmetric) biorthogonal wavelets, and with examples of the 5/3-tap and 9/7-tap filters that have been adopted by the JPEG-2000 digital image compression standard, for lossless and lossy compressions, respectively. Perhaps the most significant impact of the wavelet transform (WT) to the application of local time-scale and time-frequency analyses is that the (wavelet) decomposition algorithm eliminates the need for computing the WT integral in (8.1.5) to obtain the DWT values at the dyadic points, and that the (wavelet) reconstruction algorithm can be applied to recover the original input data. Therefore, fast computational schemes for implementation of the (wavelet) decomposition and reconstruction algorithms are of great interest. In Sect. 8.3 of Chap. 8, we have demonstrated the design of such implementation with two simple examples: the Haar and 5/3-tap wavelet filters, by introducing the notion of lifting steps. In the last section, Sect. 9.5, of this chapter, we will derive the general lifting schemes and establish the corresponding relevant results. Recall from Sect. 8.3 of Chap. 8 that the modulation matrices play an essential role in formulating the wavelet decomposition/reconstruction algorithms. Indeed, the wavelet filter pairs, { p,  q } and { p, q}, constitute a PR filter bank, if and only if their corresponding modulation matrices M  p , q (ω) and M p,q (ω) satisfy the identity T M p , q (ω)M p,q (ω) = I2 , ω ∈ R. Here, the first column of M p,q (ω) is given by [ p(ω), q(ω)]T = [P(z), Q(z)]T , where z = e−iω , while the second column is obtained from the first column by replacing z with e−iπ z; that is, by a phase shift of π. In general, when the wavelet scale (also called dilation) 2 is generalized to an arbitrary integer d ≥ 2, then instead of one wavelet filter q, it is necessary to consider d − 1 wavelet (highpass) filters q1 , . . . , qd−1 . In this case, the first column of the corresponding modulation matrix M p,q1 ,...,qd−1 (ω) is given by [ p(ω), q1 (ω), . . . , qd−1 (ω)]T = [P(z), Q 1 (z), . . . , Q d−1 (z)]T , and the (k + 1)th column is obtained from the first column by a phase shift of 2πk/d or replacing z by e−i2πk/d z. Observe that all columns of M p,q1 ,...,qd−1 (ω) are correlated, in that they are phase shifts of one another. This makes it very difficult, if even feasible

436

9 Compactly Supported Wavelets

at all, to construct the filters directly from the matrix identity: T M p , q1 ,..., qd−1 (ω)M p,q1 ,...,qd−1 (ω) = Id .

Therefore, the method of polyphase decomposition must first be applied to untangle the correlations. In Sect. 9.5, we only consider dilation d = 2, but still call the bi-phase matrices by the commonly used term, polyphase matrices. The main result of this section is the factorization of the DWT decomposition/reconstruction algorithms into lifting steps. Examples, including the lifting schemes for implementation of the D4 , 5/3-tap and 9/7-tap DWT, will be discussed at the end of the section.

9.1 Transition Operators The notion of the transition operator and its representation matrix, associated with a given refinement mask, is introduced in this first section. In particular, the properties of the eigenvalues and eigenvectors of the transition operator will be investigated, since they are instrumental to the characterization of the orthogonality, stability, and smoothness of the scaling function corresponding to the refinement mask, to be studied in the later sections. In this chapter, we will only consider (refinement) sequences with compact support; that is, sequences with only finitely many non-zero terms. For simplicity of our discussion, we may and will assume, without loss of generality, that the support of the sequence p = { pk } under consideration is [0, N ] for some N ∈ N; that is, pk = 0 for all k < 0 or k > N . (As in the previous chapter, when a sequence p = { pk } is considered as a refinement sequence, we do not bold-face p.) For a sequence p with support on [0, N ], we will call [0, N ] the support of p = { pk }, if both p0 and p N are N for p, if needed for clarity. different from 0, and will use the notation p = { pk }k=0 Thus, the two-scale symbol P(z) of this sequence p is given by 1 pk z k , z ∈ C\{0}. 2 N

P(z) =

k=0

In this and the next chapters, we will also use the notation p(ω) to replace P(e−iω ) for convenience, namely: N 1 p(ω) = pk e−ikω , (9.1.1) 2 k=0

so that p(ω) is a trigonometric polynomial. Suppose that there exists some function φ that satisfies the identity:  φ(x) = pk φ(2x − k), (9.1.2) k

9.1 Transition Operators

437

then φ is a called a refinable function with refinement sequence (also called refinement mask) p = { pk }. An equivalent formulation of the identity (9.1.2) is the following frequency-domain representation:     ω  ω  . φ φ(ω) = p 2 2

(9.1.3)

Throughout this and the next chapters, a refinement mask p = { pk } is always assumed to satisfy the condition:  pk = 2, k

or equivalently, P(1) = p(0) = 1. In addition, as in the previous chapter, when p = { pk },  p = { pk } and q = {qk },  q = { qk } are considered as the refinement and wavelet sequences, we do not bold-face p,  p , q,  q (see Remark 1 on p.396). Remark 1 The definition of refinable functions in (9.1.2) is valid for all (finite or N has compact support, then infinite) sequences p = { pk }. Of course, if p = { pk }k=0 the summation in (9.1.2) runs from k = 0 to k = N . But in any case, φ is called a refinable function with refinement mask p = { pk }. Furthermore, the definition of refinable functions is more general than that of scaling functions, introduced in Definition 1 in Sect. 8.2 of Chap. 8, in that refinable functions do not necessarily satisfy the stability condition, which is required for scaling functions. Hence, while  of a scaling function must satisfy the condition φ(0)  =c the Fourier transform φ(ω) for a nonzero constant c, this is not necessarily the case for refinable functions in general. However, in the following, a refinable function φ associated with p = { pk } means the normalized solution of (9.1.2), given by (8.2.21) in Sect. 8.2, with the  = 1. condition φ(0)  We are now ready to introduce the notion of the transition operator. For a refinement mask p = { pk } with compact support, the transition operator T p associated with p is defined by (T p v)(ω) = | p

        ω ω 2 ω ω | v + |p + π |2 v +π , 2 2 2 2

(9.1.4)

for all trigonometric polynomials v(ω). As in Sect. 6.1 of Chap. 6, let V2N +1 be the finite-dimensional linear space defined by  N  vk e−ikω , vk ∈ C V2N +1 = v(ω) : v(ω) = k=−N

(see (6.1.2) in Chap. 6). We have the following result.

438

9 Compactly Supported Wavelets

Theorem 1 Representation matrix of transition operator

The transition ope-

N has the following properties: rator T p associated with p = { pk }k=0

(i) the space V2N +1 is invariant under the transformation T p ; that is, T p v ∈ V2N +1 for any v ∈ V2N +1 ; (ii) with respect to the basis {e−ikω }k=−N ,...,N for V2N +1 , the representation matrix of T p , restricted to V2N +1 , is given by

1 , Tp = a2 j−k 2 −N ≤ j,k≤N where aj =



ps p s− j =

s∈Z

N 

ps p s− j .

(9.1.5)

s=0

More explicitly,

(T p E)(ω) = T p (e−ikω )

= E(ω)T p ,

(9.1.6)

k=−N ,...,N

where



E(ω) = e−ikω



= ei N ω , . . . , eiω , 1, e−iω , . . . , e−i N ω .

k=−N ,...,N

Proof First observe that since pk = 0 for k < 0 or k > N , we have a j = 0 if j < −N or j > N . Thus, for any k with 0 ≤ k ≤ N , a2 j−k = 0 if j < −N or j > N . Next, we note that 1 1 1 ps e−isω pk eikω = ps pk e−i(s−k)ω 2 s 2 4 k s,k

    1 1 −i jω = ps p s− j e = ps p s− j e−i jω 4 s 4 s

| p(ω)|2 =

j

1 = a j e−i jω . 4

j

j

Thus, for v(ω) = e−ikω , where −N ≤ k ≤ N , we have 1 1 as e−isω/2 e−ikω/2 + as e−is(ω/2+π) e−ik(ω/2+π) 4 s 4 s 1 1 = as e−i(s+k)ω/2 + as (−1)s+k e−i(s+k)ω/2 4 s 4 s

T p (e−ikω ) =

9.1 Transition Operators

439

  1  = a−k 1 + (−1) e−iω/2 (with  = s + k) 4  1 = a2 j−k e−i jω (only even terms  = 2 j are kept) 2

(9.1.7)

j

=

N  1 a2 j−k e−i jω (since a2 j−k = 0 if | j| > N ) 2

j=−N



1 = E(ω) a2 j−k 2 Therefore,

−N . j=N

  T p ei N ω , . . . , e−ikω , . . . , e−i N ω

1 = E(ω) a2 j−k = E(ω)T p . 2 −N ≤ j≤N ,−N ≤k≤N 

This completes the proof of (9.1.6).

Remark 2 To compute each entry a j of T p , although one may simply use its definition in (9.1.5), it is often easier to apply the relation  a j e−i jω = 4| p(ω)|2 (9.1.8) j

and compute the Fourier coefficients (see the above proof of Theorem 1).



Example 1 Let

⎧ ⎨ x, φ(x) = min{x, 2 − x}χ[0,2) (x) = 2 − x, ⎩ 0,

for 0 ≤ x < 1, for 1 ≤ x < 2, elsewhere,

be the hat function. Recall from Example 2 in Sect. 8.2 of Chap. 8, on p.397, that φ is a refinable function with refinement mask: p0 =

1 1 , p1 = 1, p2 = , 2 2

and pk = 0 for k < 0 or k > 2. Formulate the representation matrix T p of the transition operator T p (restricted to V5 ) associated with p = { pk }2k=0 . Solution To compute a j from (9.1.5), we have a−2 = p0 p2 =

1 , a−1 = p0 p1 + p1 p2 = 1, 4

440

9 Compactly Supported Wavelets

a0 = p02 + p12 + p22 =

3 , 2

a1 = p1 p0 + p2 p1 = 1, a2 = p2 p0 = and ak = 0 for |k| > 2. Thus

Tp

1 , 4

⎤ a−2 a−3 a−4 a−5 a−6 ⎢ a0 a−1 a−2 a−3 a−4 ⎥

⎥ 1 1⎢ a2 a1 a0 a−1 a−2 ⎥ a2 j−k = = ⎢ ⎥ ⎢ 2 2⎣ a −2≤ j,k≤2 a3 a2 a1 a0 ⎦ 4 a6 a5 a4 a3 a2 ⎡ ⎤ 1 0 0 0 0 ⎢6 4 1 0 0⎥ ⎥ 1⎢ 1 4 6 4 1⎥ = ⎢ . ⎢ 8 ⎣0 0 1 4 6⎥ ⎦ 0 0 0 0 1 ⎡

 Remark 3 Proper subspaceV2N −1 is also invariant underT p transition operator associated with p = and j with | j| ≥ N , we have

N . { pk }k=0

Let T p be the

For integers k ∈ [1 − N , N − 1]

|2 j − k| ≥ N + 1. Thus a2 j−k = 0 for k ∈ [1 − N , N − 1] and | j| ≥ N . Therefore, by (9.1.7), we have, for an integer k ∈ [1 − N , N − 1], T p (e−ikω ) =

1 j

2

a2 j−k e−i jω =

N −1  j=1−N

1 a2 j−k e−i jω , 2

that is, T p (e−ikω ) ∈ V2N −1 . This means that V2N −1 is invariant under T p , and the representation matrix of T p , restricted to V2N −1 , is given by

1 a2 j−k . 2  1−N ≤ j,k≤N −1 N2 , one can show, as above, that For a general refinement mask p = { pk }k=N 1 both V2(N2 −N1 )+1 and V2(N2 −N1 )−1 are invariant under T p . In fact, for any M ≥ M+N1 N2 − N1 − 1, V2M+1 is invariant under T p (just by considering h = { pk }k=N with 1 pk = 0 for k > N2 as the mask and observe T p = Th ). In the next theorem, we will show that the nonzero eigenvalues of T p |V 2M+1 (that is, the restriction of T p to V2M+1 ) are the same as those of T p |V 2(N2 −N1 )+1 for any M > N2 − N1 .

9.1 Transition Operators

441

For a trigonometric polynomial v(ω) =



k vk e

−ikω ,

its support is defined by

supp(v) = [k1 , k2 ], k2 −ikω with k ≤ k and v provided that v(ω) = 1 2 k1  = 0, vk2  = 0. A k=k1 vk e trigonometric polynomial v(ω) is called an eigenfunction of T p associated with eigenvalue λ if (T p v)(ω) = λ v(ω). Theorem 2 Eigenfunctions of transition operator N2 operator associated with p = { pk }k=N . Then 1 with a nonzero eigenvalue lies in V2(N2 −N1 )+1 .

Let T p be the transition

an eigenfunction of T p associated

 Proof Suppose that the trigonometric polynomial v(ω) = k vk e−ikω is an eigenfunction of T p associated with a nonzero eigenvalue λ. For an integer k, we have (by referring to the derivation of (9.1.7)), 1 a2 j−k e−i jω . T p (e−ikω ) = 2 j

Thus,

T p v(ω) =

 1 j

k

2

a2 j−k vk e−i jω .

Observe that a2 j−k = 0 if |2 j − k| > N2 − N1 , or equivalently, a2 j−k is not zero only if N2 − N1 k N2 − N1 k − + ≤ j≤ + . 2 2 2 2 This means that supp(T p v) ⊆

1 1 [N1 − N2 , N2 − N1 ] + supp(v). 2 2

Similarly, we have 1 1 [N1 − N2 , N2 − N1 ] + supp(T p v) 2 2 1 1 1 ⊆ [N1 − N2 , N2 − N1 ] + 2 [N1 − N2 , N2 − N1 ] + 2 supp(v), 2 2 2

supp(T p2 v) ⊆

and, in general, supp(T pn v) 1 1 1 ⊆ [N1 − N2 , N2 − N1 ] + . . . + n [N1 − N2 , N2 − N1 ] + n supp(v) 2 2 2 1 ⊆ [N1 − N2 , N2 − N1 ] + n supp(v). 2

442

9 Compactly Supported Wavelets

On the other hand, T pn v = λn v with λ = 0 for n ≥ 0. Thus supp(T pn v) = supp(v). Therefore, supp(v) = supp(T pn v) ⊆ [N1 − N2 , N2 − N1 ] +

1 supp(v). 2n

Letting n → ∞ in the above equation, and using the fact that supp(v) is bounded, we conclude that supp(v) ⊆ [N1 − N2 , N2 − N1 ]. 

This shows v(ω) ∈ V2(N2 −N1 )+1 , as desired.

Remark 4 In the next sections, the property of the eigenvalues (and in some cases, eigenvectors as well) of the transition operator will be used to characterize the existence, stability, orthogonality, and smoothness of scaling functions φ. From Theorem N2 2, we see that for a refinement mask { pk }k=N , we need only consider the transition 1 operator T p restricted to V2(N2 −N1 )+1 , instead of V2M+1 for an M > N2 − N1 . In the following, we simplify notations by using T p for T p |V 2(N2 −N1 )+1 , which is the restriction of T p to V2(N2 −N1 )+1 . Hence the operator T p is defined on the finitedimensional space V2(N2 −N1 )+1 , and the representation matrix of T p is given by

1 Tp = a2 j−k 2

. N1 −N2 ≤ j,k≤N2 −N1

 Now let us return to the refinement mask supported on [0, N ]; namely, p = N . The transition operator T is understood to be the linear operator defined { pk }k=0 p on V2N +1 . N −ikω ∈ V For v(ω) = 2N +1 , let vec (v) denote the column vector j=−N vk e consisting of the coefficients vk of v(ω), namely: ⎤ v−N ⎢ .. ⎥ ⎢ . ⎥ ⎥ ⎢ ⎥ =⎢ ⎢ vk ⎥ . ⎢ . ⎥ ⎣ .. ⎦ vN ⎡

−N vec(v) = vk k=N

(9.1.9)

Then v(ω) = E(ω)vec(v). From (9.1.6), we have T p v(ω) = T p E(ω)vec(v) = E(ω)T p vec(v),

(9.1.10)

9.1 Transition Operators

and thus

443

  vec T p v = T p vec(v).

(9.1.11)

Remark 5 Eigenfunctions of T p and eigenvectors of T p If v(ω) ∈ V2N +1 is an eigenfunction of T p associated with an eigenvalue λ; that is, T p v(ω) = λv(ω), then it follows from (9.1.11) that   T p vec(v) = vec T p v = vec(λv) = λvec(v). This means that λ is an eigenvalue of T p , with vec (v) being an associated eigenvector. Conversely, suppose v is a (right) eigenvector of T p associated with an eigenvalue λ, i.e. T p v = λv. Let v(ω) = E(ω)v ∈ V2N +1 . Then vec(v) = v. Thus, by (9.1.10), we have T p v(ω) = E(ω)T p v = λE(ω)v = λv(ω). That is, λ is an eigenvalue of T p , with v(ω) = E(ω)v ∈ V2N +1 being an associated eigenfunction.  Example 2 As a continuation of Example 1, let us calculate the eigenvalues of T p in Example 1. Observing that each of the first and the last rows of T p has only one nonzero entry, we have   2  λ − 1 − 1  0  2 8 1  det(λI5 − T p ) = λ − − 21 λ − 43 − 21  8  0 − 18 λ − 21   2    1 1 1 = λ− λ− , (λ − 1) λ − 8 2 4 where the details of the calculation for the last equality are omitted. Thus, T p has eigenvalues 1, 21 , 41 , 18 , 18 . It is not difficult to verify that v = [0 1 4 1 0]T is a (right) 1-eigenvector of T p . Therefore, v(ω) = E(ω)v = [ei2ω , eiω , 1, e−iω , e−i2ω ]v = 4 + 2 cos ω 

is a 1-eigenfunction of T p .

We end this section by introducing the following fundamental lemma for transition operators. This result will be used in the next chapter. Theorem 3 Fundamental lemma of transition operators sition operator associated with p =

N . { pk }k=0

Let T p be the tran-

Then, for all f, g ∈ V2N +1 ,

444

9 Compactly Supported Wavelets



π −π

 g(ω)

 T pn



f (ω)dω =

2n π

−2n π

g(ω)



n  j=1

   ω 2 ω |p j | f dω. 2 2n

(9.1.12)

Proof Since the formula (9.1.12) can be derived by applying mathematical induction, we only give the proof for n = 1 in the following. The proof for n = k + 1 (using the induction hypothesis; that is, (9.1.12) holds for n = k) is similar, and is left as an exercise (see Exercise 9). For n = 1, we have 

π −π

 g(ω) T p f (ω)dω = 2  =2  =2

π 2

− π2 π 2

− π2

π 2

− π2

T p f (2ω) g(2ω)dω

  | p(ω)|2 f (ω) + | p(ω + π)|2 f (ω + π) g(2ω)dω  | p(ω)|2 f (ω)g(2ω)dω + 2

π 2 +π

− π2 +π

| p(u)|2 f (u)g(2u − 2π)du

(substitution u = ω + π was used for the 2nd integral)  π +π 2 =2 | p(ω)|2 f (ω)g(2ω)dω (by g(ω − 2π) = g(ω))  =2  =

− π2 π

−π 2π

−2π

| p(ω)|2 f (ω)g(2ω)dω (| p(ω)|2 f (ω)g(2ω) is 2π-periodic)

|p

    ω 2 ω | f g(ω)dω. 2 2  Exercises

Exercise 1 Let p = { p0 , p1 } with p0 = p1 = 1 be the Haar filter. Compute the eigenvalues of the representation matrix T p of the transition operator T p and the 1-eigenfunctions of T p , if there are any. Exercise 2 Repeat Exercise 1 for p = { pk }1k=−1 with p−1 = p1 =

1 ; p0 = 1. 2

Exercise 3 Repeat Exercise 1 for p = { pk }0k=−2 with p−2 = p0 =

1 ; p−1 = 1. 2

Exercise 4 Repeat Exercise 1 for p = { pk }3k=0 with p0 = p3 =

1 3 ; p1 = p2 = . 4 4

9.1 Transition Operators

445

Exercise 5 Suppose p = { pk }2k=0 with p0 = p2 = 21 ; p1 = 1. Let Q p be the representation matrix of T p restricted to V3 . Compute the eigenvalues of Q p . Compare them with the eigenvalues of T p obtained in Example 2. Exercise 6 Repeat Exercise 5 for p = { pk }1k=−1 in Exercise 6. Exercise 7 Repeat Exercise 5 for p = { pk }3k=0 in Exercise 4. N . Let T and Q be the representation matrices Exercise 8 Suppose p = { pk }k=0 p p of T p restricted to V2N +1 and V2N −1 , respectively. Find the relation among the eigenvalues of T p and Q p .

Exercise 9 Prove (9.1.12) for n = k + 1 under the assumption that (9.1.12) holds for n = k.

9.2 Gramian Function G φ (ω) In this section we introduce the Gramian functions G φ (ω) and G φ, φ (ω). These functions are important for the study of orthogonality, biorthogonality and stability of scaling functions. Some properties of these functions are presented in this section. We also provide the characterizations of orthogonality and stability of scaling functions φ in terms of G φ (ω). For φ ∈ L 2 (R), define G φ (ω) =



 + 2π)|2 , |φ(ω

(9.2.1)

∈Z

 ∈ L 2 (R), define and for φ, φ G φ,φ (ω) =



  + 2π)φ(ω  + 2π). φ(ω

(9.2.2)

∈Z

Theorem 1 Properties of G φ (ω) and G φ,φ (ω)

 be functions in Let φ and φ

L 2 (R). Then: (i) G φ and G φ,φ are 2π-periodic functions in L 1 [−π, π]; (ii) G φ and G φ,φ can be expanded as G φ (ω) =



−∞

k

G φ,φ (ω) =



 k

φ(x)φ(x − k)d x e−ikω ,



−∞

 − k)d x e−ikω . φ(x)φ(x

(9.2.3) (9.2.4)

446

9 Compactly Supported Wavelets

Proof To prove (i), we first show G φ ∈ L 1 [−π, π]. From Parseval’s formula (7.2.7) of Theorem 2 on p.331, we have 2 ∞  2π+π     φ(ω)   dω 2= 2π φ 2 = φ   = =

∞ 



=−∞ 2π−π

2     φ(ω  + 2π) dω =  

π

=−∞ −π  π −π

π

2 ∞     φ(ω  + 2π) dω  

−π =−∞

G φ (ω)dω,

where the interchange of the integral and summation signs follows from the fact that  + 2π)|2 ≥ 0. Thus G φ (ω) ∈ L 1 [−π, π]. |φ(ω Using the fact (see Exercise 1) that   (ω)| ≤ G (ω) G φ (ω), (9.2.5) |G  φ,φ φ we see G φ,φ (ω) is in L 1 [−π, π]. Clearly, G φ (ω) and G φ,φ (ω) are 2π-periodic. Next, we prove (ii). Since (9.2.3) is a special case of (9.2.4), we only need to show (9.2.4). As a 2π-periodic function in L 1 [−π, π], G φ,φ (ω) can be expanded as its Fourier series which converges in L 2 [−π, π], that is,   ck eikω = c−k e−ikω , G φ,φ (ω) = k

where c−k =

1 2π



π

−π

k

ikω G dω φ,φ (ω)e

∞  π 1    + 2π)φ(ω  + 2π)eikω dω = φ(ω 2π −π =−∞ ∞  2π+π 1   ikω  φ(ω)e  = φ(ω) dω 2π 2π−π =−∞  ∞ 1  −ikω dω  φ(ω)e  φ(ω) = 2π −∞  ∞  = − k)d x, φ(x)φ(x −∞

where the last equality follows from Parseval’s formula, and where we have interchanged the integral and summation signs in the second equality. The justification for the interchanging can be obtained by applying Lebesgue’s dominated convergence

9.2 Gramian Function G φ (ω)

447

theorem to the series of G φ,φ (ω), since the partial sums of this series are bounded   by G  φ (ω) G φ (ω), which is in L 1 [−π, π] (see Exercise 2). Remark 1 Trigonometric polynomials G φ (ω) and G φ,φ (ω)

If φ ∈ L 2 (R) is

compactly supported, then the series in (9.2.3) has finitely many terms, and hence G φ (ω) is a trigonometric polynomial. In particular, if supp(φ) ⊆ [N1 , N2 ], then G φ (ω) =  since

∞ −∞

N2 −N 1 −1





k=−(N2 −N1 −1) −∞

φ(x)φ(x − k)d x e−ikω ,

(9.2.6)

φ(x)φ(x − k)d x = 0 for k ≥ N2 − N1 or k ≤ −(N2 − N1 ).

In the following, for a compactly supported function φ ∈ L 2 (R), G φ (ω) will denote the trigonometric polynomial given on the right-hand side of (9.2.3). Hence G φ (ω) is continuous and bounded on R.  and φ are compactly supported functions in L 2 (R), then G (ω) Similarly, if φ φ,φ  ⊆ [N 1 , N 2 ], supp(φ) ⊆ is a trigonometric polynomial. More precisely, if supp(φ) [N1 , N2 ], then 2 −N1 −1 N 

G φ,φ (ω) =

1 −1) k=−(N2 − N



∞ −∞

 − k)d x e−ikω . φ(x)φ(x 

Example 1 Let φ(x) = χ[0,1) (x). Let us calculate G φ (ω). Since supp(φ) ⊆ [0, 1], to find G φ (ω), by (9.2.6), we only need to calculate the integral 

∞ −∞



1

φ(x)φ(x)d x =

12 d x = 1.

0

Thus G φ (ω) = 1 for any ω ∈ R.



Example 2 Let φ = min{x, 2− x}χ[0,2) (x) be the hat function considered in Examples 1 and 2 in the previous section, namely, ⎧ ⎪ ⎨x, φ(x) = 2 − x, ⎪ ⎩ 0,

for 0 ≤ x < 1, for 1 ≤ x < 2, elsewhere.

Calculate G φ (ω). Solution Since supp(φ) ⊆ [0, 2], by (9.2.6), to find G φ (ω), we only need to calculate the integrals

448

9 Compactly Supported Wavelets





−∞ ∞





−∞ ∞

 φ(x)φ(x)d x = 2 0



−∞

2

φ(x)φ(x − 1)d x =

(2 − x)(x − 1)d x =

1 1

x 2d x =



2

φ(x)φ(x + 1)d x =

1 , 6

2 , 3

(2 − x)(x − 1)d x =

1

1 . 6

Thus G φ (ω) = =

1  



k=−1 −∞

φ(x)φ(x − k)d x e−ikω

1 iω 2 1 −iω 1 e + + e = (2 + cos ω). 6 3 6 3

Recall from Example 2 on p.443 that a 1-eigenfunction v(ω) of T p is v(ω) = 2 + cos ω. Thus, up to a constant, G φ (ω) is equal to v(ω). The reason behind this is  that G φ (ω) is a 1-eigenfunction of T p , as shown in the next theorem. Theorem 2 G φ (ω) is a 1-eigenfunction of T p

Let φ be a compactly supported

N . If φ ∈ L 2 (R), then refinable function with the refinement mask p = { pk }k=0 G φ (ω) ∈ V2N +1 and G φ (ω) is a 1-eigenfunction of T p , that is,

Tp G φ = G φ. Proof Since suppφ ⊆ [0, N ] and φ ∈ L 2 (R), we know, by (9.2.6), G φ (ω) ∈ V2N −1 ⊆ V2N +1 . In addition,         ω ω ω ω 2 2 | Gφ + |p + π | Gφ +π T p G φ (ω) = | p 2 2 2 2     2        ω ω 2 ω ω   φ | + 2kπ |2 + | p + π |2 + π + 2kπ = |p | φ   2 2 2 2 k k      ω ω + 2kπ |2 | + 2kπ |2 = |p φ 2 2 k     ω ω + π + 2kπ |2 | + π + 2kπ |2 + |p φ 2 2 k   = | φ(ω + 4kπ)|2 + | φ(ω + 2π + 4kπ)|2 k

k

= G φ (ω),

as desired.



We end this section by showing that the orthogonality and stability of φ (where φ is not necessarily compactly supported) can be characterized in terms of G φ (ω).

9.2 Gramian Function G φ (ω)

449

Theorem 3 Orthogonality of φ in terms of G φ (ω) Let φ be a function in L 2 (R). Then φ is orthogonal if and only if G φ (ω) = 1 a.e. ω ∈ R. This theorem follows immediately from (9.2.3). Similarly, from (9.2.4), we know  and φ in L 2 (R) are biorthogonal to each other if and only if two functions φ G φ,φ (ω) = 1 a.e. ω ∈ R. Next we give a characterization on the stability of a function φ ∈ L 2 (R). Recall that φ ∈ L 2 (R) is said to be stable if it satisfies (8.2.17) of Sect. 8.2, that is, c

 k∈Z

! !2  ! ! ! |ck | ≤ ! ck φ(x − k)! ≤ C |ck |2 , ∀{ck } ∈ 2 (Z), ! 2

k∈Z

(9.2.7)

k∈Z

for some constants c, C > 0. For a sequence {ck } ∈ 2 (Z), denote c(ω) =  −ikω . Then c(ω) ∈ L [0, 2π], and 2 k ck e 

|ck |2 =

k∈Z

1 2π



π

−π

|c(ω)|2 dω.

By Parseval’s formula (7.2.7) of Theorem 2 on p.331, we have  2π

R

|



 ck φ(x − k)| d x = 2

k



= = =

R

2  |c(ω)φ(ω)| dω =



π

j∈Z −π  π −π

R

|



 j∈Z

2  ck e−ikω φ(ω)| dω

k 2π j+π

2  |c(ω)|2 |φ(ω)| dω

2π j−π

 + 2π j)|2 dω = |c(ω)|2 |φ(ω



π −π

|c(ω)|2



 + 2π j)|2 dω |φ(ω

j∈Z

|c(ω)|2 G φ (ω)dω.

Thus φ satisfies (9.2.7) if and only if 0 < c ≤ G φ (ω) ≤ C < ∞, a.e. ω ∈ [−π, π]. This conclusion is listed as statement (i) in the following theorem. Theorem 4 Stability of φ in terms of G φ (ω) Then:

Let φ be a function in L 2 (R).

(i) φ is stable if and only if 0 < c ≤ G φ (ω) ≤ C < ∞, a.e.ω ∈ [−π, π] for some constants c, C > 0. (ii) If, in addition, φ is compactly supported, then φ is stable if and only if G φ (ω) > 0 for ω ∈ [−π, π]. The statement (ii) in the above theorem follows from (i) and the fact that G φ (ω) is a trigonometric polynomial (see Remark 1). Hence it is bounded and attains its minimum on [−π, π], which is positive.

450

9 Compactly Supported Wavelets

Example 3 Let φ(x) = min{x, 2 − x}χ[0,2) (x) be the hat function considered in Example 2. Discuss the stability of φ by applying Theorem 4. Solution From Example 2, we have G φ (ω) =

1 (2 + cos ω). 3

Since G φ (ω) > 0, we conclude, by Theorem 4, that φ is stable.



Example 4 Let φ(x) = 13 χ[0,3) (x). Calculate G φ (ω), and then discuss the stability of φ. Solution Since supp(φ) ⊆ [0, 3], by (9.2.6), to find G φ (ω), we need to calculate integrals 







1 , 9 −∞ −∞  ∞  ∞ 2 φ(x)φ(x − 1)d x = φ(x)φ(x + 1)d x = , 9 −∞ −∞  ∞ 1 φ(x)2 d x = . 3 −∞ φ(x)φ(x − 2)d x =

φ(x)φ(x + 2)d x =

Thus G φ (ω) =

2  



k=−2 −∞

φ(x)φ(x − k)d x e−ikω

1 2 1 2 1 + iω + + e−iω + e−i2ω 9ei2ω 9e 3 9 9 2 1 4 = + cos ω + cos 2ω. 3 9 9 =



Notice that Gφ

2π 3

 =

1 4 2π 2 4π + cos + cos = 0. 3 9 3 9 3

Thus, by Theorem 4, φ is not stable.

 Exercises

 be functions in L 2 (R). Show that (9.2.5) holds. Exercise 1 Let φ, φ  are functions in L 2 (R). For N = 1, 2, . . ., define Exercise 2 Suppose φ, φ

9.2 Gramian Function G φ (ω)

S N (ω) =

451 N 

  + 2π)φ(ω  + 2π)eikω , φ(ω

=−N ikω , where k is an integer. the partial sums of the series G φ,φ (ω)e   (a) Show that |S N (ω)| ≤ G φ (ω) G φ (ω) for N = 1, 2, . . . and ω ∈ R. (b) Use (a) and Lebesgue’s dominated convergence theorem to show that



π

−π

ikω G dω = φ,φ (ω)e

∞  

π

=−∞ −π

  + 2π)φ(ω  + 2π)eikω dω. φ(ω

Exercise 3 Let φ(x) = χ[0,1.5) (x). Calculate G φ (ω), and then discuss the stability of φ. Exercise 4 Repeat Exercise 3 with φ(x) = χ[0,1) (x) + 21 χ[1,2) (x). Exercise 5 Let φ(x) be the hat function considered in Example 2. Define ϕ(x) = φ(x) + φ(x − 1). Calculate G ϕ (ω), and then discuss the stability of ϕ. Exercise 6 Repeat Exercise 5 with ϕ(x) = φ(x) + 21 φ(x − 1), where φ(x) is the hat function. Exercise 7 Let φ = ϕ0 ∗ ϕ0 ∗ ϕ0 be the quadratic B-spline given by ⎧ 1 2 ⎪ ⎪ 2x , ⎪ ⎪ 1 2 ⎪ ⎨ −(x − 1) + (x − 1) + 2 ,  2 φ(x) = 1 ⎪ 2 1 − (x − 2) , ⎪ ⎪ ⎪ ⎪ ⎩ 0,

for 0 ≤ x < 1, for 1 ≤ x < 2, for 2 ≤ x < 3, elsewhere.

Calculate G φ (ω), and then discuss the stability of φ.

9.3 Compactly Supported Orthogonal Wavelets In this section we consider the construction of compactly supported orthogonal wavelets. The method of construction is based on the architecture of the orthogonal multiresolution approximation (MRA), introduced in Sect. 8.2 of Chap. 8. We will first show that for this MRA approach, the construction of orthogonal wavelets is reduced to constructing orthogonal scaling functions, which is further reduced to the construction of quadrature mirror filter banks (QMF). Next, we discuss the sum-rule properties of lowpass filters and vanishing moments of wavelets. After that we provide the procedures for the construction of orthogonal wavelets and show how these procedures lead to Daubechies orthogonal wavelets.

452

9 Compactly Supported Wavelets

To construct a (real-valued) wavelet ψ, that is, a function ψ such that ψ j,k (x) = 2 j/2 ψ(2 j x − k), j, k ∈ Z form an orthogonal basis of L 2 (R), we start with a realvalued function φ which generates an orthogonal MRA {V j }. Let W j = V j+1 ⊥ V j , the orthogonal complement of V j in V j+1 , namely V j+1 = W j ⊕⊥ V j ,

(9.3.1)

), (2◦ ) and (3◦ ) in the definition of MRA on p.394 with W j ⊥V j . Then conditions (1◦ imply that W j ⊥Wk if j = k and j∈Z ⊕⊥ W j = L 2 (R) (see Exercise 1). Observe that, since V j = { f (x) : f (2− j x) ∈ V0 }, we have W j = {g(x) : g(2− j x) ∈ W0 } (see Exercise 2). Thus, if the integer translations of a real-valued function ψ form an orthonormal basis of W0 , i.e. W0 = span{ψ(· − k) : k ∈ Z} and ψ, ψ(· − k) = δk , k ∈ Z,

(9.3.2)

then, for each j, {ψ j,k : k ∈ Z} forms an orthonormal basis of W j , and furthermore, {ψ j,k : j, k ∈ Z} is an orthonormal basis of L 2 (R). Therefore, ψ is an orthogonal wavelet. To construct such a ψ, we notice that, since ψ ∈ W0 ⊆ V1 , we know that ψ is given in terms of φ(2x − k), k ∈ Z, that is, ψ(x) =



qk φ(2x − k),

(9.3.3)

k

where {qk } is a sequence of real numbers with finitely many qk nonzero. In addition, we observe that the condition (9.3.1), which can be reduced to the case j = 0 only, is equivalent to φ, ψ(· − k) = 0, k ∈ Z, (9.3.4) and φ(2x) =

 k

ck φ(x − k) +



dk ψ(x − k),

(9.3.5)

k

where ck , dk are some real numbers such that the series in (9.3.5) converges in L 2 (R). Thus, if a function ψ satisfies the above conditions, then it is an orthogonal wavelet, as stated in the following theorem. Theorem 1 Orth. wavelets derived from orth. scaling functions Let φ ∈ L 2 (R) be a compactly supported orthogonal refinable function. Let {qk } be a finite sequence such that ψ, given by (9.3.3), satisfies (9.3.2), (9.3.4) and (9.3.5). Then ψ is a compactly supported orthogonal wavelet. Example 1 Haar wavelet Define

Let φ(x) = χ[0,1) (x) be the Haar scaling function.

9.3 Compactly Supported Orthogonal Wavelets

⎧ ⎨ 1, ψ(x) = φ(2x) − φ(2x − 1) = −1, ⎩ 0,

453

for 0 ≤ x < 21 , for 21 ≤ x < 1, elsewhere.

(9.3.6)

Then we have, for k ∈ Z, ψ, ψ(· − k) = φ(2·), φ(2 · −2k) − φ(2·), φ(2 · −2k − 1) − φ(2 · −1), φ(2 · −2k) + φ(2 · −1), φ(2 · −2k − 1) 1 1 = δk − 0 − 0 + δk = δk ; 2 2 and φ, ψ(· − k) = φ(2·) + φ(2 · −1), φ(2 · −2k) − φ(2 · −2k − 1) = φ(2·), φ(2 · −2k) − φ(2·), φ(2 · −2k − 1) + φ(2 · −1), φ(2 · −2k) − φ(2 · −1), φ(2 · −2k − 1) 1 1 = δk − 0 + 0 − δk = 0. 2 2 In addition, we have φ(2x) =

1 1 φ(x) + ψ(x), 2 2

that is, (9.3.5) is satisfied. Thus, by Theorem 1, ψ is an orthogonal wavelet, which is called the Haar wavelet. On the other hand, one can show directly that {ψ j,k (x)}, j, k ∈ Z, is complete in  L 2 (R) and orthogonal (see Exercise 3). Next, we discuss how to apply Theorem 1 to construct orthogonal wavelets. The idea is to start with the refinement mask { pk } for an orthogonal scaling function φ, and then we derive the sequence {qk } for the orthogonal wavelet ψ directly from { pk }. Here, we focus on compactly supported wavelets, so that { pk } and {qk } are finite sequences. Let p(ω) and q(ω) be the associated two-scale symbols p(ω) =

1 1 pk e−ikω , q(ω) = qk e−ikω , 2 2 k

k

which are trigonometric polynomials. Theorem 2 Orthogonality implies QMF Let φ ∈ L 2 (R) be a refinable function with refinement mask { pk }, and suppose φ is orthogonal. Then p(ω) is a QMF, that is, p(ω) satisfies (9.3.7) | p(ω)|2 + | p(ω + π)|2 = 1, ω ∈ R. Furthermore, ψ, defined by (9.3.3), satisfies (9.3.2) and (9.3.4) if and only if

454

9 Compactly Supported Wavelets

|q(ω)|2 + |q(ω + π)|2 = 1, ω ∈ R,

(9.3.8)

p(ω)q(ω) + p(ω + π)q(ω + π) = 0, ω ∈ R.

(9.3.9)

Proof By the refinement of φ, we have G φ (2ω) =

2   2       =    p(ω + π) φ(2ω + 2π) φ(ω + π)    

∈Z

∈Z

2   2            =  p(ω + 2kπ)φ(ω + 2kπ) +  p(ω + (2k + 1)π)φ(ω + (2k + 1)π) k∈Z

k∈Z

= | p(ω)|2 G φ (ω) + | p(ω + π)|2 G φ (ω + π).

(9.3.10)

Thus, if φ is orthogonal, or equivalently, G φ (ω) = 1, then we can deduce from (9.3.10) that p(ω) is a QMF. Similarly, we have G ψ (2ω) = |q(ω)|2 G φ (ω) + |q(ω + π)|2 G φ (ω + π), G φ,ψ (2ω) = p(ω)q(ω)G φ (ω) + p(ω + π)q(ω + π)G φ (ω + π). Thus, under the assumption that G φ (ω) = 1, ψ satisfies (9.3.2) and (9.3.4), which are equivalent to G ψ (ω) = 1 and G φ,ψ (ω) = 0, 

if and only if p and q satisfy (9.3.8) and (9.3.9).

Following the remark in Sect. 8.2 of Chap. 8, on p.399, for a QMF p(ω), we may simply choose the highpass filter q(ω) for the orthogonal wavelet to be q(ω) = − p(ω + π)e−i(2L−1)ω = − p(π − ω)e−i(2L−1)ω ,

(9.3.11)

or equivalently, qk = (−1)k p2L−1−k , k ∈ Z, where L is an integer. Then q(ω) satisfies (9.3.8) and (9.3.9). Next, we provide a characterization for the orthogonality of φ in terms of its refinement mask p. More precisely, the characterization is given by certain conditions on the eigenvalues of the transition operator T p associated with p. Recall that, for p = { pk } supported in [N1 , N2 ], the transition operator T p associated with p is the linear operator defined on V2(N2 −N1 )+1 by         ω ω 2 ω ω 2 | v + |p +π | v + π , v(ω) ∈ V2(N2 −N1 )+1 , (T p )v(ω) = | p 2 2 2 2 where V2(N2 −N1 )+1 denotes the linear span of e−ikω , |k| ≤ N2 − N1 , that is,

9.3 Compactly Supported Orthogonal Wavelets N 2 −N1

V2(N2 −N1 )+1 = {

455

ck e−ikω : ck ∈ C}.

k=−(N2 −N1 )

Then T p v(ω) ∈ V2(N2 −N1 )+1 for any v(ω) ∈ V2(N2 −N1 )+1 . Definition 1 Condition E A matrix or a linear operator T on a finite-dimensional space is said to satisfy Condition E if 1 is a simple eigenvalue of T , and any other eigenvalue λ of T satisfies |λ| < 1. Theorem 3 Characterization of orthogonality of refinable functions Suppose a trigonometric polynomial p(ω) is a QMF with p(0) = 1. Then the associated refinable function φ is in L 2 (R) and orthogonal if and only if p(π) = 0 and the transition operator T p satisfies Condition E. The proof of Theorem 3 is provided in Sect. 10.2 of Chap. 10, on p. 504. Theorem 4 Orthogonal wavelets derived from QMFs Let the trigonometric polynomial p(ω) be a QMF with p(0) = 1, p(π) = 0, and φ be the refinable function associated with p. Let ψ be the function given by (9.3.3), with {qk } defined by (9.3.11). If the transition operator T p associated with p satisfies Condition E, then ψ is a compactly supported orthogonal wavelet. Proof By Theorem 3, φ is in L 2 (R) and orthogonal. Thus, by Theorem 2, φ and ψ satisfy (9.3.2) and (9.3.4), since q(ω) satisfies (9.3.8)–(9.3.9). Therefore, by Theorem 1, to prove Theorem 4, namely to show ψ to be a compactly supported orthogonal wavelet, we only need to verify (9.3.5), which is carried out below. Let M p,q (ω) be the modulation matrix of p(ω), q(ω) as defined in Sect. 8.3 of Chap. 8, that is,

p(ω) p(ω + π) . (9.3.12) M p,q (ω) = q(ω) q(ω + π) Then (9.3.7)–(9.3.9) is equivalent to M p,q (ω)M p,q (ω)T = I2 , ω ∈ R, or M p,q (ω)T M p,q (ω) = I2 , ω ∈ R. In particular, we have p(ω) p(ω) + q(ω)q(ω) = 1, p(ω) p(ω + π) + q(ω)q(ω + π) = 0,

456

9 Compactly Supported Wavelets

so that



   p(ω) + p(ω + π) p(ω) + q(ω) + q(ω + π) q(ω) = 1.

Hence                     ω ω  ω ω  ω ω ω ω ω  φ φ φ = p +p +π p + q +q +π q 2 2 2 2 2 2 2 2 2       ω ω 1 1 = φ(ω) + ψ(ω) 1 + (−1)k pk eik 2  1 + (−1)k qk eik 2  2 2 k k   = φ(ω) + ψ(ω), p2n einω  q2n einω  n

n

where, in the last equality, we use the assumption that pk and qk are real numbers. Thus 1 1 p−2n φ(x − n) + q−2n ψ(x − n). φ(2x) = 2 n 2 n 

Hence we have shown that (9.3.5) holds.

Definition 2 Sum-rule order A finite sequence p = { pk } is said to have sumrule order L if L is the largest integer, for which the two-scale symbol p(ω) satisfies p(0) = 1,

d p(π) = 0, for  = 0, 1, . . . , L − 1, dω 

(9.3.13)

or equivalently  k

pk = 2,

  (2k) p2k = (2k+1) p2k+1 , for  = 0, 1, . . . , L−1. (9.3.14) k

k

In this case, we also say that p(ω) has sum-rule order L. A (compactly supported) function ψ ∈ L 2 (R) is said to have vanishing moments of order L if  ∞ −∞

ψ(x)x  d x = 0, for  = 0, 1, . . . , L − 1.

Let q(ω) be the trigonometric polynomial defined by (9.3.11). Then   d d −i(2L−1)ω − p(π − ω)e q(ω) = dω  dω      − j    dj j d −i(2L−1)ω e . p(π − ω)(−1) =− j dω j dω − j j=0

9.3 Compactly Supported Orthogonal Wavelets

457

Thus, if p(ω) has sum-rule order L, then d q(0) = 0, for 0 ≤  ≤ L − 1. dω 

(9.3.15)

Condition (9.3.15) can be written as 

k  qk = 0, for 0 ≤  ≤ L − 1

(9.3.16)

k

(see Exercise 14). This property of {qk } is called the discrete polynomial annihilation. Theorem 5 Sum-rules of QMF implies vanishing moments Let φ be an orthogonal refinable function associated with a trigonometric polynomial p(ω). Let ψ be the function given by (9.3.3), with {qk } defined by (9.3.11). If p(ω) has sum-rule order L, then ψ has vanishing moments of order L. ψ  ∈ C ∞ , by the property Proof Since φ, ψ are compactly supported, we know φ, (v) of Fourier transform in Theorem 1 on p.320. Thus, from the fact that      ω  ω d  d q ψ(ω) = φ   dω dω 2 2     j − j        dj 1 d ω  ω , q = φ j dω j 2 2 dω − j 2 j=0

and using also (9.3.15), we conclude that d  ψ(0) = 0, for 0 ≤  ≤ L − 1. dω  This, together with (7.1.2) on p.320, yields 



−∞

x  ψ(x)d x = i k

d  ψ(0) = 0, for 0 ≤  ≤ L − 1, dω 

that is, ψ has vanishing moments of order L.



In fact, one can show that if p(ω), q(ω) satisfy (9.3.9), and p(ω) has sum-rule order L, then q(ω) satisfies (9.3.15) (see Exercise 13). Hence the corresponding orthogonal wavelet ψ has vanishing moments of order L if the scaling function φ is in L 2 (R). Construction of compactly supported orthogonal wavelets By Theorem 5, to construct an orthogonal wavelet with vanishing moments of order L, we start with a QMF p(ω) of the form

458

9 Compactly Supported Wavelets

 p(ω) =

1 + e−iω 2

L p0 (ω),

(9.3.17)

where p0 (ω) is a trigonometric polynomial. For such a p(ω), we have 2L  ω 2 | p(ω)| = cos | p0 (ω)|2 . 2 Since the coefficients pk of p(ω) are real numbers, we have | p0 (ω)|2 = p0 (ω) p0 (ω) = p0 (−ω) p0 (−ω) = | p0 (−ω)|2 , that is, | p0 (ω)|2 is an even trigonometric polynomial, and it can be written as a function of cos ω. Therefore, with cos ω = 1−2 sin2 ω2 , | p0 (ω)|2 can be expressed as   ω | p0 (ω)|2 = P sin2 , 2 where P is a polynomial. Hence, we conclude that p(ω) in (9.3.17) is a QMF, namely, it satisfies (9.3.7), if and only if P satisfies     ω ω ω ω cos2L P sin2 + sin2L P cos2 = 1, ω ∈ R, 2 2 2 2 or equivalently, with y = sin2 ω2 , P satisfies (1 − y) L P(y) + y L P(1 − y) = 1, y ∈ [0, 1].

(9.3.18)

Let PL−1 be the polynomial defined by PL−1 (y) =

 L−1   L −1+k yk . k

(9.3.19)

k=0

 Then P(y) = PL−1 (y) +

yL R

 1 2

− y

satisfies (9.3.18), where R is an odd

polynomial (including the zero polynomial). In the following, we list PL−1 (y) for 1 ≤ L ≤ 5: P0 (y) = 1, P1 (y) = 1 + 2y, P2 (y) = 1 + 3y + 6y 2 , P3 (y) = 1 + 4y + 10y 2 + 20y 3 , P4 (y) = 1 + 5y + 15y 2 + 35y 3 + 70y 4 .

(9.3.20)

9.3 Compactly Supported Orthogonal Wavelets

459

By factorizing PL−1 (sin2 ω2 ) into p0 (ω) p0 (ω), we obtain a QMF p(ω) with sumrule order L. This resulting QMF p(ω) has 2L nonzero coefficients pk , and it is called the 2L-tap orthogonal filter or 2L-tap Daubechies orthogonal filter (the D2L filter for short). In the following, we look at the cases with L ≤ 4.  As before, we use the notation z = e−iω . With this notation, we have cos2

    1 ω ω 1 1 1 , sin2 = 2−z− . = (1 + z) 1 + 2 4 z 2 4 z

Haar wavelet For L = 1, since P0 (y) = 1, we have   1 1 ω = (1 + z) 1 + . | p(ω)| = cos 2 4 z 2

Thus p(ω) =

2

1 1 (1 + z) = (1 + e−iω ), 2 2

and q(ω) = − p(ω + π)e−i(2×1−1)ω =

1 (1 − e−iω ). 2 

In this case, we reach the Haar wavelet. D4 filter For L = 2, with y = sin2

ω 2

= 41 (2 − z − 1z ), we have

11 1 1 = − (z 2 − 4z + 1). P1 (y) = 1 + 2y = 2 − z − 2 2z 2z Notice that the roots of z 2 − 4z + 1 = 0 are r1 = 2 −



3, r2 = 2 +

√ 3, with r2 =

1 r1 .

Thus

1 1 (z − r1 )(z − r2 ) = − (z − r1 )(r1 z − r1r2 ) 2z 2zr1   1 1 − r1 . (z − r1 ) = 2r1 z

P1 (y) = −

460

9 Compactly Supported Wavelets

Therefore, we may choose √   √ 1 1+ 3 z − (2 − 3) , p0 (ω) = √ (z − r1 ) = 2 2r1 and we reach the D4 orthogonal filter: √  2 2   √ 1+z 1+z 1+ 3 z − (2 − 3) p(ω) = p0 (ω) = 2 2 2 √ √ √ √ 3+ 3 2 1+ 3 3 1− 3 3− 3 + z+ z + z . (9.3.21) = 8 8 8 8 

The corresponding highpass filter q(ω) for the orthogonal wavelet is given by q(ω) = − p(ω + π)e−i(2×2−1)ω √ √ √ √ 3− 3 2 1− 3 3 1+ 3 3+ 3 − z+ z − z . = 8 8 8 8 The graphs of the corresponding scaling function φ and orthogonal wavelet ψ are displayed in Fig. 8.2 in Sect. 8.2 of Chap. 8, on p.400. Observe that P1 (y) can also be factorized as     1 1 1 1 − r2 = . (z − r2 ) (r2 − z) r2 − P1 (y) = 2r2 z 2r2 z Thus, one may choose 1 p0 (ω) = √ (r2 − z) = 2r2



  √ 3−1 2+ 3−z . 2

In this case, the corresponding 4-tap orthogonal filter, denoted by r (ω), is 2 √ √ 1+z 3−1 r (ω) = (2 + 3 − z) 2 2 √ √ √ √ 3− 3 2 1− 3 3 1+ 3 3+ 3 + z+ z + z . = 8 8 8 8 

This lowpass filter r (ω) is p(−ω)e−i4ω , where p(ω) is the 4-tap orthogonal filter given by (9.3.21). One can easily show that if p(ω) is a QMF, then p(−ω)e−inω is also a QMF, where n ∈ N (see Exercise 4).  D6 filter For L = 3, we first factor P2 (y) = 1 + 3y + 6y 2 as

9.3 Compactly Supported Orthogonal Wavelets

461

P2 (y) = 6(y − y1 )(y − y2 ), where y1 , y2 = − 41 ± i



15 12 .

With y = sin2

ω 2

= 41 (2 − z − 1z ), we have

   1 1 3 z+ −c z+ −c , P2 (y) = 8 z z where c = 3 − i



15 3 .

Let

 √  √ √    12 10 + 15 15 − 12 10 − 15 1 3 2 c− c −4 = − − i z0 = 2 2 6 6 be a root of z 2 − cz + 1 = 0. Then P2 (y) is further factorized as 3 1 2 (z − cz + 1)(z 2 − cz + 1) 8 z2     3 1 1 1 (z − z 0 ) z − = (z − z 0 ) z − 8 z2 z0 z0    1 3 1 1− = (z − z 0 )(z − z 0 ) 1 − 8 zz 0 zz 0    1 3 1 1 − z0 − z0 . = (z − z 0 )(z − z 0 ) 8 |z 0 |2 z z

P2 (y) =

Thus, we have the corresponding QMF p(ω), given by 

1+z 2

3 "

3 1 (z − z 0 )(z − z 0 ) 8 |z 0 | 3 √    6 1 1+z z 2 − 2 Re(z 0 ) z + |z 0 |2 = 2 4 |z 0 |  

√ √ √ √ √ 3  1 + 10 − 5 + 2 10 1 − 10 1 + 10 + 5 + 2 10 2 1+z + z+ z , = 2 4 2 4

p(ω) =

where the nonzero pk are given by    √ √ 1 1 + 10 − 5 + 2 10 , 16    √ √ 1 p1 = 5 + 10 − 3 5 + 2 10 , 16    √ √ 1 p2 = 5 − 10 − 5 + 2 10 , 8 p0 =

462

9 Compactly Supported Wavelets 1.5

1.4 1.2

1

1

0.5

0.8 0.6

0

0.4

−0.5

0.2 0 −0.2 −0.4 0

−1 −1.5 0.5

1

1.5

2

2.5

3

3.5

4

4.5

5

−2 0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

5

Fig. 9.1. D6 scaling function φ (on left) and wavelet ψ (on right)

   √ √ 1 5 − 10 + 5 + 2 10 , p3 = 8    √ √ 1 p4 = 5 + 10 + 3 5 + 2 10 , 16    √ √ 1 p5 = 1 + 10 + 5 + 2 10 . 16 The highpass filter {qk } can be chosen as qk = (−1)k p5−k , k ∈ Z. See Fig. 9.1 for the scaling function φ and orthogonal wavelet ψ.



D8 filter For L = 4, P3 (y) = 1 + 4y + 10y 2 + 20y 3 can be factorized as P3 (y)

    1 1 2 y− = 20(y − θ) y + θ + 2 20θ

(9.3.22)

= 20(y − θ)(y − λ)(y − λ), where

   √ √ 1 3 1 3 θ =− + 105 15 − 350 − 105 15 + 350 6 30 ≈ −0.3423840948583691, λ ≈ −0.0788079525708154 + 0.3739306454336101 i.

(9.3.23)

Thus, with y = 41 (2 − z − 1z ), we can factorize P3 (y) as follows, as we did for the cases L = 1, 2:

9.3 Compactly Supported Orthogonal Wavelets

463

    5 2 − (2 − 4θ)z + 1 2 − (2 − 4λ)z + 1 2 − (2 − 4λ)z + 1 z z z 16z 3 1 1 1 5 (z − z 0 )(z − )(z − z 1 )(z − )(z − z 1 )(z − ) =− z0 z1 z1 16z 3      1 1 1 5 − z − − z (z − z , (z − z ) )(z − z ) z = 0 0 1 1 1 1 z z z 16z 0 |z 1 |2

P3 (y) = −

where z 0 ≈ 0.32887591778603087, z 1 ≈ 0.28409629819182162 + 0.24322822591037987 i. Therefore, with p0 (ω) =

√ 5 1 1 (z − z 0 )(z − z 1 )(z − z 1 ), √ 4 z 0 |z 1 |

we have a QMF with sum-rule order 4, given by  p(ω) =

1+z 2

3 p0 (ω),

with 8 nonzero pk given by p0 = −0.0149869893303615, p1 = 0.0465036010709818, p2 = 0.0436163004741773, p3 = −0.2645071673690398, p4 = −0.0395750262356446, p5 = 0.8922001382467596, p6 = 1.0109457150918289,

p7 = 0.3258034280512984.

The corresponding highpass filter {qk } can be chosen as qk = (−1)k p7−k , k ∈ Z. See Fig. 9.2 for the scaling function φ and orthogonal wavelet ψ.



Observe that the orthogonal wavelets of the D4 , D6 , D8 QMFs have vanishing moments of orders 2, 3, 4, respectively. In this way, one can construct orthogonal wavelets with higher vanishing moments. Exercises Exercise 1 Suppose {V j } is an orthogonal MRA. Let W j be the orthogonal complement of V j in V j+1 . Show that W j ⊥Wk if j = k and j∈Z ⊕W j = L 2 (R).

464

9 Compactly Supported Wavelets

1.2

1

1 0.5

0.8 0.6

0

0.4 −0.5

0.2 0

−1

−0.2 −0.4

−1.5 0

1

2

3

4

5

6

7

0

1

2

3

4

5

6

7

Fig. 9.2. D8 scaling function φ (on left) and wavelet ψ (on right)

Exercise 2 Suppose {V j } is an orthogonal MRA. Let W j be the orthogonal complement of V j in V j+1 . Show that W j = {g(x) : g(2− j x) ∈ W0 }. Exercise 3 Let ψ be the Haar wavelet defined by (9.3.6) in Example 1. Show directly that {ψ j,k (x)}, j, k ∈ Z is orthogonal. Exercise 4 Show that if p(ω) is a QMF, then p(−ω)e−inω is also a QMF, where n ∈ Z. Exercise 5 Suppose FIR filters p(ω), q(ω) satisfy (9.3.7)–(9.3.9). Show that, for any n, m ∈ Z, p(ω)e−inω , q(ω)e−i(n+2m)ω also satisfy (9.3.7)–(9.3.9). Exercise 6 Suppose φ(x) is the refinable function with two-scale symbol p(ω). Let ϕ(x) be the refinable function with two-scale symbol p(ω)e−inω , where n ∈ Z. Find the relation between φ(x) and ϕ(x). Exercise 7 Let ψ(x) be the function defined by (9.3.3), with a finite sequence {qk }. Suppose f (x) is the function defined by (9.3.3), with two-scale symbol q(ω)e−i2mω , where m ∈ Z. Find the relation between ψ(x) and f (x). Exercise 8 Show that (9.3.13) is equivalent to (9.3.14). Exercise 9 Let p = { pk } be a refinement mask with nonzero pk given by p0 = p1 = p2 = p3 =

1 . 2

Find the sum-rule order of p. Exercise 10 Repeat Exercise 9 for p = { pk } with nonzero pk given by p0 =

1 , 4

p1 =

3 , 4

p2 =

3 , 4

p3 =

1 . 4

Exercise 11 Repeat Exercise 9 for p = { pk } with nonzero pk given by

9.3 Compactly Supported Orthogonal Wavelets

p−1 =

1 , 4

p0 =

3 , 4

465

p1 =

3 , 4

p2 =

1 . 4

Exercise 12 Suppose the FIR filter p(ω) has sum-rule order L. Show that, for any n ∈ Z, p(ω)e−inω also has sum-rule order L. Exercise 13 Suppose FIR filters p(ω), q(ω) satisfy (9.3.9). Show that if p(ω) has sum-rule order L, then q(ω) satisfies (9.3.15). Hint: Differentiate both sides of (9.3.9) with respect to ω and set ω = 0. Exercise 14 Show that (9.3.15) is equivalent to (9.3.16). Exercise 15 Let P(y) = PL−1 (y) be the polynomial defined by (9.3.20). For L = 2, 3, 4, show that P(y) satisfies (9.3.18). Exercise 16 Let p(ω) be the orthogonal filter given by (9.3.21). Show that p(ω) has sum-rule order 2.

9.4 Compactly Supported Biorthogonal Wavelets The compactly supported orthogonal wavelets discussed in the previous section lack symmetry. In many applications, symmetry is an important property. For example, in application to image compression, to deal with the boundary extension of the image, symmetric filters are desirable. On the other hand, expanding a signal in a Riesz basis is enough in some applications. Relaxing the orthogonality condition will result in symmetric wavelets. In this section, we discuss the construction of compactly supported biorthogonal wavelets with symmetry. In the beginning of this section, we introduce the definitions of Bessel sequences and Riesz bases, after which we show how MRAs lead to biorthogonal wavelets, which is reduced to the construction of biorthogonal filters. Next, we discuss the symmetry of biorthogonal wavelets. Finally, we provide the procedures of the construction of biorthogonal wavelets, and show how these procedures lead to some biorthogonal wavelets, including 5/3-tap and 9/7-tap biorthogonal wavelets. Let { f k } be a set of functions in L 2 (R). If there are positive constants A and B such that !2 !   ! ! 2 ! A |ck | ≤ ! ck f k ! |ck |2 , (9.4.1) ! ≤B k

k

k

for any {ck } in 2 , then { f k } is called a Riesz sequence. In this case, A, B are called Riesz bounds. If { f k } satisfies the right-hand inequality of (9.4.1), then it is called a Bessel sequence with a bound B. Definition 1 Riesz basis A set { f k } of functions in L 2 (R) is called a Riesz basis for L 2 (R) if it is a Riesz sequence and it is complete in L 2 (R).

466

9 Compactly Supported Wavelets

Clearly, if { f k } satisfies (9.4.1), then it is linearly independent (see Exercise 1). Thus, a Riesz basis for L 2 (R) is a basis for L 2 (R). f k } of functions in L 2 (R) are said to be biorthogonal Recall that two sets { f k } and {  to each other or to form biorthogonal systems if f k  = δ j−k .  f j,  f k }, to verify whether they are Riesz For two biorthogonal systems { f k } and {  sequences, we only need to check whether they are Bessel sequences, as stated in the next theorem. Theorem 1 Biorthogonal Bessel sequences are Riesz sequences Let { f k } and {gk } be two sets of functions in L 2 (R) which form biorthogonal systems of L 2 (R). Then { f k } and {gk } are Riesz sequences if and only if they are Bessel sequences. Proof Suppose ! !2  ! ! ! ! c f |ck |2 , k k! ≤ B ! k

k

!2 !  ! !  ! ! c g |ck |2 , k k! ≤ B ! k

k



for any k |ck |2 < ∞. By the biorthogonality of { f k } and {gk }, together with the Cauchy-Schwarz inequality, we have ! ! ! !    !! ! ! 2 ! ! ! |ck | =  cj f j, ck gk  ≤ ! cj f j! ! ck gk ! ! k

j





B

k

 k

Thus

j

k

!  21 !  ! ! ! |ck |2 ! c g k k !. ! k

!2 ! ! ! 1  ! |ck |2 ≤ ! c g k k! . ! B k

k

Therefore, {gk } is a Riesz sequence with bounds B −1 , B  . Similarly, one can show  that { f k } is a Riesz sequence with bounds B  −1 , B. For a function φ in L 2 (R), from the definition of stability of φ (see (8.2.17) on p.394), we know that φ is stable if {φ(x − k) : k ∈ Z} is a Riesz sequence. Also,  in L 2 (R) are said to be biorthogonal to each other recall that two functions φ and φ  if {φ(x − k) : k ∈ Z} and {φ(x − k) : k ∈ Z} form biorthogonal systems of L 2 (R), that is,  − k) = δ j−k , j, k ∈ Z. (9.4.2) φ(· − j), φ(· For φ ∈ L 2 (R), it is shown in the proof of (i) in Theorem 4, on p.449, that {φ(x − k) : k ∈ Z} is a Bessel sequence with bound B if and only if

9.4 Compactly Supported Biorthogonal Wavelets

467

G φ (ω) ≤ B, a.e. ω ∈ R, where G φ (ω) is the Gramian function of φ defined by (9.2.1) on p.445. In addition, if φ is compactly supported, then G φ (ω) is a trigonometric polynomial (see Remark 1 on p.447), and hence, it is upper bounded. Thus {φ(x − k) : k ∈ Z} is a Bessel sequence. This observation, together with Theorem 1, leads to the following corollary. Corollary 1 Biorthogonality implies stability If two compactly supported func are biorthogonal to each other, then φ and φ  are stable. tions φ and φ For a real-valued function ψ ∈ L 2 (R), denote, as before, ψ j,k (x) = 2 j/2 ψ(2 j x − k),

j, k ∈ Z.

We say that ψ is a wavelet provided that ψ j,k , j, k ∈ Z form a Riesz basis of L 2 (R). j,k , j, k ∈ Z} are If, in addition, two families of functions {ψ j,k , j, k ∈ Z} and {ψ biorthogonal to each other, namely, j  ,k   = δ j− j  δk−k  , j, k ∈ Z, ψ j,k , ψ

(9.4.3)

 are called a pair of biorthogonal wavelets. then two wavelets ψ and ψ Since a Riesz basis of L 2 (R) is a basis for L 2 (R), we know that {ψ j,k } and j,k }, j, k ∈ Z, are two bases for L 2 (R). Thus, any f ∈ L 2 (R) can be expanded by {ψ j,k }, j, k ∈ Z, and the biorthogonality property gives the expansion either {ψ j,k } or {ψ j,k or with ψ j,k (see Exercise 2): coefficients as the inner products of f with ψ f (x) =



j,k ψ j,k (x) =  f, ψ

j,k∈Z



j,k (x).  f, ψ j,k ψ

(9.4.4)

j,k∈Z

j,k and ψ j,k imply that they satisfy the In addition, the Riesz basis properties of ψ frame conditions (see Exercise 3)  1 1 | f, ψ j,k |2 ≤ f 2 , f 2 ≤   B A j,k∈Z

(9.4.5)

 1 j,k |2 ≤ 1 f 2 , f 2 ≤ | f, ψ B A j,k∈Z

j,k }, respectively.   where A, B and A, B are the Riesz basis bounds for {ψ j,k } and {ψ The frame conditions assure the stability of the wavelet expansions.  By Theorem 1, to construct biorthogonal wavelets, we need to construct ψ and ψ such that: j,k }, j, k ∈ Z, is complete for L 2 (R); (a) Each collection of {ψ j,k }, j, k ∈ Z, and {ψ  satisfy (9.4.3); (b) ψ and ψ

468

9 Compactly Supported Wavelets

j,k }, j, k ∈ Z, are Bessel sequences. (c) {ψ j,k }, j, k ∈ Z, and {ψ For a compactly supported ψ, it can be shown that under some mild conditions on ψ, {ψ j,k : j, k ∈ Z} is a Bessel sequence. Theorem 2 Bessel sequence Let ψ ∈ L 2 (R) be a compactly supported function. Suppose ψ satisfies  ∞ (i) ψ(x)d x = 0; and −∞

(ii) there is a γ > 0 such that 

∞ −∞

2  |ψ(ω)| (1 + ω 2 )γ dω < ∞.

Then {ψ j,k : j, k ∈ Z} is a Bessel sequence of L 2 (R). Theorem 2 states that, for a compactly supported function ψ in L 2 (R), if it has a vanishing moment of order 1 and certain smoothness, then {ψ j,k : j, k ∈ Z} is a Bessel sequence of L 2 (R). The biorthogonal wavelets to be constructed do satisfy these mild conditions. Next, let us look at the MRA-based construction of biorthogonal wavelets. In this V j }, which are biorthogonal to each other in case we construct two MRAs {V j } and {  are biorthogonal the sense that their compactly supported scaling functions φ and φ to each other, namely, they satisfy (9.4.2). With  j · −k) : k ∈ Z}, V j = span{φ(2 V j = span{φ(2 j · −k) : k ∈ Z}, 

(9.4.6)

 ∈ L 2 (R) such that if we can construct compactly supported functions ψ, ψ  − k) = δk , k ∈ Z; ψ, ψ(·  − k) = 0, φ,  ψ(· − k) = 0, k ∈ Z, φ, ψ(·

(9.4.7) (9.4.8)

0 V1 are the direct sums of their subspaces W0 and V0 , and W and such that V1 and   and V0 , respectively, that is, 0 ⊕  V1 = W V0 , V1 = W0 ⊕ V0 , 

(9.4.9)

where  − k) : k ∈ Z},  0 = span{ψ(· W0 = span{ψ(· − k) : k ∈ Z}, W  form a pair of biorthogonal wavelets. A vector space V is said to be a then ψ and ψ direct sum of its two subspaces U and W, which is denoted as V = U ⊕ W,

9.4 Compactly Supported Biorthogonal Wavelets

469

provided that each v ∈ V can be written as v = u + w, with u ∈ U, w ∈ W and U ∩ W = {0}. Theorem 3 Biorth. wavelets derived from biorth. scaling functions Suppose  ∈ L 2 (R) are biorthogonal to each two compactly supported refinable functions φ, φ  ∈ L 2 (R) satisfy (9.4.7)–(9.4.9), other. If two compactly supported functions ψ, ψ j,k }, j, k ∈ Z,  are a pair of biorthogonal wavelets, i.e. {ψ j,k } and {ψ then ψ and ψ are Riesz bases for L 2 (R), and satisfy (9.4.3). are stable, since they are biorthogProof By Corollary 1, we know that both φ and φ onal to each other. Thus {V j } and { V j }, defined by (9.4.6), are two MRAs. Denote  j · −k) : k ∈ Z}.  j = span{ψ(2 W j = span{ψ(2 j · −k) : k ∈ Z}, W Then (9.4.9) implies j ⊕ V j+1 = W Vj. V j+1 = W j ⊕ V j ,  This, together with conditions (2◦ ) and (3◦ ) of MRA, implies that    j. L 2 (R) = ⊕W j , L 2 (R) = ⊕W j∈Z

(9.4.10)

j∈Z

The biorthogonality property (9.4.8) implies

This and (9.4.7) lead to

 j ⊥ Vj. Vj, W Wj ⊥ 

(9.4.11)

 j  , j, j  ∈ Z. Wj ⊥ W

(9.4.12)

Indeed, if j = j  , (9.4.12) follows from (9.4.7). For j = j  , assuming j  < j, we have  j ⊂  W Vj.  j  ⊥ W j , as desired. This and  V j ⊥ W j imply W j,k : j, k ∈ Z} Then from (9.4.10), we see that each of {ψ j,k : j, k ∈ Z} and {ψ is complete in L 2 (R). Furthermore, (9.4.12) implies that {ψ j,k : j, k ∈ Z} and j,k : j, k ∈ Z} are biorthogonal systems of L 2 (R) (see Exercise 4). Thus, condi{ψ  to be biorthogonal wavelets are fulfilled. tions (a) and (b) listed on p.467 for ψ and ψ For condition (c), we will find that biorthogonal compactly supported wavelets automatically satisfy the vanishing moment and smoothness conditions (i) and (ii) in  are biorthogonal wavelets. Theorem 2. Therefore, ψ and ψ   ψ, ψ,  Except for the biorthogonality conditions (9.4.2), (9.4.7) and (9.4.8) on φ, φ,  and the refinement properties of φ, φ, (9.4.9) also needs to be satisfied. Condition (9.4.9) is equivalent to

470

9 Compactly Supported Wavelets

ψ(x) = #

 k

φ(2x) =  φ(2x) =

 = qk φ(2x − k), ψ(x)

 

k ck φ(x

− k) +

 − k) +

ck φ(x k

  qk φ(2x − k),

(9.4.13)

k

 



k

dk ψ(x − k),

k

 − k), dk ψ(x

(9.4.14)

qk } are finite sequences of real numbers, and ck , dk , ck , dk are some real where {qk }, { numbers such that the series in (9.4.14) converge in L 2 (R). The direct sum property  0 = {0}, which can be easily obtained V0 ∩ W in (9.4.9) also requires V0 ∩W0 = {0},    from the biorthogonality of φ, φ, ψ, ψ (see Exercise 5). To construct biorthogonal wavelets, we start with the masks { pk } and { pk } for the  Next, we derive the sequences {qk } and { qk } from { pk } and scaling functions φ and φ.  { pk } such that ψ and ψ are given by (9.4.13). Since we focus on compactly supported qk } are finite sequences. Let p(ω),  p (ω), q(ω) and wavelets, { pk }, { pk }, {qk } and {  q (ω) be the corresponding two-scale symbols. Then (9.4.13) can be written as         ω  ω  ω   ω .  ,  ψ(ω) = q φ (9.4.15) φ ψ(ω) =q 2 2 2 2  ∈ L 2 (R), let G (ω) be the 2π-periodic function defined by (9.2.2) in For φ, φ φ,φ  are Sect. 9.2, on p.445. Recall that, from (9.2.4) in Theorem 1 of Sect. 9.2, φ and φ biorthogonal if and only if G φ,φ (ω) = 1, a.e. ω ∈ R.  ∈ L 2 (R) Theorem 4 Biorthogonality implies perfect reconstruction Let φ, φ  be the functions be refinable functions with refinement masks { pk }, { pk }. Let ψ and ψ  defined by (9.4.13). Suppose φ and φ are biorthogonal to each other, i.e. they satisfy (9.4.2). Then p (ω + π) p(ω + π) = 1. (9.4.16)  p (ω) p(ω) +  Furthermore, (9.4.7) and (9.4.8) are satisfied if and only if q (ω + π)q(ω + π) = 1,  q (ω)q(ω) +   q (ω) p(ω) +  q (ω + π) p(ω + π) = 0,

(9.4.17) (9.4.18)

 p (ω)q(ω) +  p (ω + π)q(ω + π) = 0.

(9.4.19)

Proof To show (9.4.16), one can follow the proof of (9.3.10) on p.454 by using p (ω) p(ω)G p (ω + π) p(ω + π)G G φ,φ (2ω) =  φ,φ (ω) +  φ,φ (ω + π).  are biorthogonal to each other, or equivalently, G (ω) = 1, then Thus, if φ and φ φ,φ the above equation implies that p(ω) and  p (ω) satisfy (9.4.16).

9.4 Compactly Supported Biorthogonal Wavelets

471

Similarly, by G q (ω)q(ω)G q (ω + π)q(ω + π)G ψ,ψ (2ω) =  φ,φ (ω) +  φ,φ (ω + π), G q (ω) p(ω)G q (ω + π) p(ω + π)G ψ,φ (2ω) =  φ,φ (ω) +  φ,φ (ω + π), G p (ω)q(ω)G p (ω + π)q(ω + π)G φ,ψ (2ω) =  φ,φ (ω) +  φ,φ (ω + π), we see that, with the assumption G φ,φ (ω) = 1, the biorthogonality conditions (9.4.7) and (9.4.8), which are equivalent to G ψ,ψ (ω) = 1, G  ψ,φ (ω) = 0, G φ,ψ (ω) = 0, are satisfied if and only if (9.4.17)–(9.4.19) hold.



Recall that if lowpass filters (refinement masks) p,  p satisfy (9.4.16), we say that they are biorthogonal; and if  p,  q , p, q satisfy (9.4.16)–(9.4.19), then we say they form a biorthogonal filter bank or a perfect reconstruction (PR) filter bank. As discussed in Sect. 8.2, for a given pair of biorthogonal FIR lowpass filters p and  p, we may choose the FIR highpass filters q and  q to be p (ω + π)e−i(2s−1)ω ,  q (ω) = − p(ω + π)e−i(2s−1)ω , q(ω) = − or equivalently, p2s−1−k ,  qk = (−1)k  p2s−1−k , k ∈ Z, qk = (−1)k 

(9.4.20)

where s is an integer such that  p,  q , p, q form a biorthogonal (PR) filter bank. Up to now, we have seen that the key to constructing biorthogonal wavelets is to construct finite sequences { pk } and { pk } such that the associated refinable functions  are biorthogonal. Next, we give a characterization on the biorthogonality of φ and φ  the proof of which will be provided in Chap. 10, on p.528. φ and φ,  be compactly Theorem 5 Biorthogonality of refinable functions Let φ and φ supported refinable functions associated with some trigonometric polynomials p(ω)  are in L 2 (R) and biorthogand  p (ω) satisfying p(0) = 1,  p (0) = 1. Then φ and φ onal to each other if and only if (i) p(ω) and  p (ω) satisfy (9.4.16); (ii) p(π) = 0 and  p (π) = 0; and (iii) T p and T p satisfy Condition E.  be compactly supported refinable functions associated with Theorem 6 Let φ, φ  be the functions given by p(ω),  p (ω) satisfying (i)–(iii) in Theorem 5. Let ψ, ψ  are a pair of compactly qk } defined by (9.4.20). Then ψ and ψ (9.4.13) with {qk }, { supported biorthogonal wavelets.

472

9 Compactly Supported Wavelets

 are in L 2 (R) and biorthogonal to each Proof By Theorem 5, we know that φ and φ other (which implies that they are stable). Thus, by Theorem 4, (9.4.7) and (9.4.8) are satisfied, since q(ω) and  q (ω) satisfy (9.4.17)–(9.4.19). Therefore, by Theorem 3, ψ  are a pair of compactly supported biorthogonal wavelets if (9.4.14) is satisfied. and ψ Next, let us verify (9.4.14). Indeed, from (9.4.16)–(9.4.19), we have T

M p , q (ω) M p,q (ω) = I2 , ω ∈ R, p,  q and p, q, respectively, defined where M  p , q and M p,q are modulation matrices of  T

by (9.3.12) on p.455. Thus M p,q (ω) is the inverse matrix of M  p , q (ω), and we have T

M p,q (ω) M  p , q (ω) = I2 , ω ∈ R. In particular, we have p(ω) p (ω) + q(ω) q (ω) = 1, p(ω) p (ω + π) + q(ω) q (ω + π) = 0, or  p (ω) p(ω) +  q (ω)q(ω) = 1,  p (ω + π) p(ω) +  q (ω + π)q(ω) = 0. Therefore,



(9.4.21)

    p (ω) +  p (ω + π) p(ω) +  q (ω) +  q (ω + π) q(ω) = 1.

Hence,                     ω ω ω ω ω ω  ω ω  ω  φ φ φ p p q q =  + +π p +  + +π q 2 2 2 2 2 2 2 2 2     ω ω 1 1 1 + (−1)k pk eik 2  1 + (−1)k  φ(ω) + ψ(ω) qk eik 2  = 2 2 k k   =  p2n einω   q2n einω  φ(ω) + ψ(ω), n

n

where we use the assumption that pk ,  qk are real in the last equality. Thus,  1 1  p−2n φ(x − n) +  q−2n ψ(x − n). φ(2x) = 2 n 2 n Similarly, we have  1   − n) + 1  − n). p−2n φ(x q−2n ψ(x φ(2x) = 2 n 2 n This shows that (9.4.14) holds, as desired.



9.4 Compactly Supported Biorthogonal Wavelets

473

Next, we consider the symmetric or antisymmetric biorthogonal wavelets. Let N ∈ Z. We say a sequence {h k } is symmetric around N2 provided that h N −k = h k , k ∈ Z, and we say it is antisymmetric if h N −k = −h k , k ∈ Z. Clearly, for a (finite) sequence {h k }, it is symmetric or antisymmetric around N2 if and only if the two-scale symbol h(ω) satisfies h(−ω) = ei N ω h(ω) or h(−ω) = −ei N ω h(ω) (see Exercise 7). Theorem 7 Symmetry

Let φ be the normalized refinable function associated

with a finite sequence { pk }. If { pk } is symmetric around φ is also symmetric around N2 , i.e.

N 2

for some integer N , then

φ(x) = φ(N − x). If, in addition, {qk } is symmetric/antisymmetric around  ψ(x) = qk φ(2x − k),

N 2 ,

then ψ, defined by

k

is symmetric/antisymmetric around 41 (N + N  ). Proof We start by noting that  φ(−ω) =

∞ 

p(−

j=1

  ∞  ω ω i N ωj  2 p = ei N ω φ(ω). ) = e j 2 2j j=1

Thus φ(x) = φ(N −  as desired.   x), ω  ω and q(−ω) = ±ei N  ω q(ω), we have  φ From ψ(ω) =q 2

2

    (N +N  ) ω i N ω/2  ω ω  ω i N  ω/2   e = ±ei 2 ω ψ(ω). q φ ψ(−ω) = q(− )φ(− ) = ±e 2 2 2 2  Therefore ψ(x) = ±ψ 1 4 (N

N +N  2

 − x , that is, ψ is symmetric or antisymmetric around

+ N  ) (depending on the + or − sign).



One can show that if { pk } and { pk } are symmetric, then {qk } and { qk }, defined by (9.4.20), are symmetric/antisymmetric (see Exercise 8). Sum-rules imply vanishing moments Except for symmetry, we also require that the constructed biorthogonal wavelets have the vanishing moment property. Suppose p(ω), q(ω),  p (ω),  q (ω) form a biorthogonal FIR filter bank. Then one can show that if p(ω) has sum-rule order L, then  q (ω) satisfies

474

9 Compactly Supported Wavelets

d  q (0) = 0, for 0 ≤  ≤ L − 1 dω 

(9.4.22)

 has vanishing moments of order L (refer to Theorem 5 (see Exercise 10). Hence, ψ on p.457). Similarly, if  p (ω) has sum-rule order  L, then ψ has vanishing moments  of order L. Hence, to construct biorthogonal wavelets with vanishing moments, we only need to construct lowpass filters with sum-rule orders.  Next, we construct symmetric lowpass filters { pk } and { pk } with certain sum-rule orders. Construction of symmetric biorthogonal wavelets Suppose p(ω) and  p (ω) are given by ω ω p(ω) = cos p0 (ω),  p0 (ω), p (ω) = cos  2 2 p0 (ω) are some trigonometric polynomials, and ,   are positive where p0 (ω) and  integers such that  +   is even, that is, +  = 2L p0 (ω) is even, and hence it can for a positive integer L. Because of symmetry, p0 (ω) p0 (ω) can be be written as a function of cos ω. With cos ω = 1 − 2 sin2 ω2 , p0 (ω) expressed as ω p0 (ω) p0 (ω) = P(sin2 ), 2 where P is a polynomial. Thus the biorthogonality condition (9.4.16) on p(ω) and  p (ω) is given by cos2L

ω ω ω ω P(sin2 ) + sin2L P(cos2 ) = 1, ω ∈ R, 2 2 2 2

or equivalently, with y = sin2 ω2 , P(y) satisfies (9.3.18) on p.458. We will choose P(y) = PL−1 (y), where PL−1 is the polynomial of degree L − 1 defined by p (ω), to obtain (9.3.19) on p.458. Then, we factorize cos2L ω2 PL−1 (sin2 ω2 ) into p(ω) biorthogonal filters p and  p.  Next, we study the construction of the most commonly used symmetric biorthogonal filters p(ω) and  p (ω), and in particular, the so-called 5/3-tap and 9/7-tap filters that have been adopted by the Digital Image Compression Standard, JPEG 2000, for lossless and lossy compressions, respectively. The first step is to factorize p (ω). We will again use the notation cos2L ω2 PL−1 (sin2 ω2 ) into p(ω) z = e−iω , and rely on the fact that

9.4 Compactly Supported Biorthogonal Wavelets

cos2

475

1 1 1 1 ω ω = (1 + z)(1 + ), sin2 = (2 − z − ). 2 4 z 2 4 z

For L = 1, since P0 (y) = 1, we have p(ω) p (ω) = cos2

1 1 ω = (1 + z)(1 + ). 2 4 z

Thus, the only possible choice for p(ω) and  p (ω) is p(ω) =  p (ω) =

1 (1 + z). 2

In this case we obtain the Haar wavelet. In the next three examples, we consider the cases L = 2, 3, 4. Example 1 L = 2

For L = 2, with P1 (y) = 1 + 2y, we have   1 1 1 4 ω 2 ω 1 + 2 sin = (1 + z)2 (1 + )2 (4 − z − ). p (ω) = cos p(ω) 2 2 32 z z

We may choose 1 (1 + z), 2 1 1 1  p (ω) = (1 + )(1 + z)2 (4 − z − ) 16 z z 1 1 1 1 1 1 1 1 =− + + z + z2 − z3, + 2 16 z 16 z 2 2 16 16 p(ω) =

or we may choose the lowpass filters, called 5/3-tap biorthogonal lowpass filters : 1 (1 + z)2 , 4 1  p (ω) = (1 + z)2 (4 − z − 8 11 1 3 =− + + z+ 8z 4 4 p(ω) =

(9.4.23) 1 ) z 1 2 1 3 z − z . 4 8

The highpass filters q(ω) and  q (ω) corresponding to p(ω),  p (ω) in (9.4.23) can be obtained by applying (9.4.20) for an integer s. In the following, we provide q(ω) and  q (ω) with s = 1 to obtain the highpass filters, called

476

9 Compactly Supported Wavelets

5/3-tap biorthogonal highpass filters : 1 3 1 1 1 q(ω) = − (z 2 + 2 ) − (z + ) + , 8 z 4 z 4 1 1 2 1 1 1  q (ω) = − z(1 − ) = − (z + ). 4 z 2 4 z

(9.4.24)

See Fig. 8.3 in Sect. 8.3, on p.403, for the graphs of the corresponding scaling functions and biorthogonal wavelets.  Example 2 L = 3

For L = 3, we have

  ω ω ω 1 + 3 sin2 + 6 sin4 2 2 2   1 1 3 1 1 3 2 3 = (1 + z) (1 + ) 19 − 9(z + ) + (z + 2 ) . 256 z z 2 z

p (ω) = cos6 p(ω)

We may choose 1 (1 + z)2 , 4   1 3 1 1 1 (1 + z)4 19 − 9(z + ) + (z 2 + 2 )  p (ω) = 64 z z 2 z 19 1 3 4 1 1 3 1 45 3 5 (z + 3 ) − (z + 2 ) − (z + ) + (z 2 + 1) + z, = 128 z 64 z 8 z 64 64 p(ω) =

or 1 (1 + z)3 , (9.4.25) 8   1 3 1 1 (1 + z)3 19 − 9(z + ) + (z 2 + 2 )  p (ω) = 32 z 2 z 7 1 9 1 45 3 5 (z + 2 ) − (z 4 + ) − (z 3 + 1) + (z 2 + z), = 64 z 64 z 64 64 p(ω) =

or 1 (1 + z)4 , 16   3 1 1 1 z(1 + z)2 19 − 9(z + ) + (z 2 + 2 )  p (ω) = 16 z 2 z 3 5 5 3 5 1 (z + ) − (z 4 + 1) + (z 3 + z) + z 2 . = 32 z 8 32 4 p(ω) =

(9.4.26)

9.4 Compactly Supported Biorthogonal Wavelets

477

Observe that the scaling function φ associated with p in (9.4.25) is the quadratic  and B-spline, while that associated with (9.4.26) is the cubic B-spline. However, φ φ associated with  p and p in (9.4.26) are not biorthogonal to each other, as shown in Chap. 10, on p.530.  Example 3 L = 4

For L = 4, we have   ω ω ω ω 1 + 4 sin2 + 10 sin4 + 20 sin6 2 2 2 2 1 1 = 4 (1 + z)4 (1 + )4 Q 3 (z), 4 z

p(ω) p (ω) = cos8

where Q 3 (z) = 13 −

1 5 131 1 5 1 (z + ) + (z 2 + 2 ) − (z 3 + 3 ). 16 z 2 z 16 z

We may choose 1 (1 + z)2 , 4 1 1  p (ω) = (1 + )2 (1 + z)4 Q 3 (z), 64 z p(ω) =

or 1 (1 + z)3 , 8 1 1  p (ω) = (1 + )(1 + z)4 Q 3 (z). 32 z p(ω) =

Recall that with y = 41 (2 − z − 1z ), Q 3 (z) = P3 (y), and P3 (y) can be factorized as in (9.3.22) on p.462. Thus, we have Q 3 (z) = −

   1 4 5 1 1 z + − (2 − 4θ) z 2 + 2 − (6 + 4θ)(z + ) + 10 + 8θ − , 16 z z z 5θ

where θ is given by (9.3.23) on p.462; and we may choose  p (ω) and p(ω) to obtain the lowpass filters, called: 9/7-tap biorthogonal lowpass filters :   1 1 1 z + − (2 − 4θ) , p(ω) = 4 (1 + z)4 2 4θ z   1 5 1 1 4 4 2  p (ω) = 4 (1 + z) (− )4θ z + 2 − (6 + 4θ)(z + ) + 10 + 8θ − . 2 16 z z 5θ

478

9 Compactly Supported Wavelets

The numbers of nonzero coefficients  pk and pk are 9 and 7, respectively. These two biorthogonal filters are called the 9/7-tap biorthogonal filters. The nonzero pk , pk are given by p−1 = p5 = −0.0912717631142501, p0 = p4 = −0.0575435262285002, p1 = p3 = 0.5912717631142501, p2 = 1.1150870524570004;  p−2 =  p6 = 0.0534975148216202,  p−1 =  p5 = −0.0337282368857499, p4 = −0.1564465330579805,  p1 =  p3 = 0.5337282368857499,  p0 =   p2 = 1.2058980364727207. The corresponding highpass filters q(ω) and  q (ω) can be obtained from (9.4.20) qk with for some suitable integer s. In the following, we provide the nonzero qk and  s = 3 for the highpass filters, called: 9/7-tap biorthogonal highpass filters : q−1 = q7 = −0.0534975148216202, q0 = q6 = −0.0337282368857499, q1 = q5 = 0.1564465330579805, q2 = q4 = 0.5337282368857499, q3 = −1.2058980364727207;  q0 =  q6 = −0.0912717631142501,  q1 =  q5 = 0.0575435262285002,  q2 =  q4 = 0.5912717631142501,  q3 = −1.1150870524570004. See Fig. 9.3 for the graphs of the corresponding scaling functions and biorthogonal wavelets.  Exercises Exercise 1 Show that if { f 1 , f 2 , . . .} in L 2 (R) satisfies (9.4.1), then it is linearly independent.  are a pair of biorthogonal wavelets in L 2 (R). Show Exercise 2 Suppose ψ and ψ that, for any f ∈ L 2 (R), (9.4.4) is satisfied.  are a pair of biorthogonal wavelets in L 2 (R). Show Exercise 3 Suppose ψ and ψ  that {ψ j,k } and {ψ j,k }, j, k ∈ Z, satisfy the frame condition in (9.4.5). Exercise 4 Show that the condition in (9.4.12) implies that {ψ j,k : j, k ∈ Z} and j,k : j, k ∈ Z} form biorthogonal systems of L 2 (R). {ψ  ψ, ψ  ∈ L 2 (R) satisfy (9.4.7), (9.4.2) and (9.4.8). Let Exercise 5 Suppose φ, φ,  ψ and   V0 , V0 , W0 and W0 be the spaces spanned by the integer translates of φ, φ,    ψ, respectively. Show that V0 ∩ W0 = {0}, V0 ∩ W0 = {0}.

9.4 Compactly Supported Biorthogonal Wavelets

479

1.4

1

1.2

0.5

1 0

0.8 0.6

−0.5

0.4

−1

0.2 −1.5

0 −0.2 −1

0

1

2

3

4

−2 −1

5

1.4 1.2

1

2

3

4

5

6

0

1

2

3

4

5

6

1

1

0.5

0.8 0.6

0

0.4

−0.5

0.2

−1

0

−1.5

−0.2 −0.4 −2

0

1.5

−1

0

1

2

3

4

5

−2 −1

6

Fig. 9.3. Top biorthogonal 9/7 scaling function φ (on left) and wavelet ψ (on right); Bottom scaling function  φ (on left) and wavelet  ψ (on right)

Exercise 6 Suppose that the FIR filters p(ω), q(ω),  p (ω) and q (ω) are biorthogonal, that is, they satisfy (9.4.16)–(9.4.19). Show that, for any n, m ∈ Z, p(ω)e−inω , p (ω)e−inω ,  q (ω)e−i(n+2m)ω are also biorthogonal. q(ω)e−i(n+2m)ω ,  Exercise 7 Show that a finite sequence {h k } is symmetric/antisymmetric around if and only if h(−ω) = ±ei N ω h(ω).

N 2

Exercise 8 Show that if { pk } and { pk } are symmetric, then {qk } and { qk }, defined by (9.4.20), are symmetric/antisymmetric. Exercise 9 Let p(ω), q(ω),  p (ω) and  q (ω) be the 5/3-tap biorthogonal filters given by (9.4.23)–(9.4.24). Verify directly that they satisfy (9.4.16)–(9.4.19). Exercise 10 Suppose that the FIR filters p(ω),  q (ω) satisfy (9.4.18). Show that if p(ω) has sum-rule order L, then  q (ω) satisfies (9.4.22). Hint: Differentiate both sides of (9.4.18) with respect to ω and set ω = 0.

9.5 Lifting Schemes Recall that, for a filter or any sequence g = {gk }, the symbol G(z) is used to denote its z-transform, defined by  gk z k . G(z) = k

480

9 Compactly Supported Wavelets

Hence, if g is an FIR filter, then G(z) is a Laurent polynomial in z; that is, G(z) =

k2 

gk z k ,

k=k1

where k1 and k2 are integers, with k1 ≤ k2 , gk1 = 0, gk2 = 0 (k1 could be a negative integer). We will call the degree of the Laurent polynomial G by the length of G, length(G) = k2 − k1 , since it is length of the filter g minus 1. Also recall that if a sequence h = {h k } is a lowpass filter (refinement mask) for a refinable function or a highpass filter for a wavelet, the notation h(ω) is used to denote the two-scale symbol of h, defined by h(ω) =

1 h k e−ikω . 2 k

For an FIR filter bank  p = { pk },  q = { qk }, p = { pk }, q = {qk }, denote 1 1  pk ,  qk , k = pk , h k = qk . k =  hk =  2 2 (z), L(z) and H (z) be the z-transforms of  Let  L(z), H ,  h,  and h, respectively. −iω With z = e , we have (z),  p (ω) =  L(z),  q (ω) = H 1 1 p(ω) = L(z), q(ω) = H (z). 2 2 Recall that the discrete wavelet transform (DWT) (8.3.4) with  p,  q and the inverse discrete wavelet transform (IDWT) (8.3.5) with p, q in Sect. 8.3, on p.406, can be written as −

h− ∗ x) ↓ 2, c = (  ∗ x) ↓ 2, d = (  x = (c ↑ 2) ∗  + (d ↑ 2) ∗ h,

(9.5.1) (9.5.2)

where ↓ 2 and ↑ 2 denote the downsampling and upsampling operations, respectively, and for a filter g = {gk }, g− = {gk− } denotes its time-reverse, given by gk− = g−k , k ∈ Z (see Remark 1 of Sect. 8.4 on p.427). Furthermore, (9.5.1) and (9.5.2) can be written as

9.5 Lifting Schemes

481

 

C(z 2 )

=

1 1 2L z

X (z) + 21  L(− 1z ) X (−z),

  1  1 2 (− 1 ) X (−z), z ∈ C\{0}, D(z ) = 2 H z X (z) + 21 H z and

 X (z) = L(z) C(z 2 ) + H (z) D(z 2 ), z ∈ C\{0}

(9.5.3)

(9.5.4)

(see (8.3.8)–(8.3.9) on p. 409). In addition,  p,  q , p, q are biorthogonal if and only (z), L(z), H (z) form a perfect reconstruction (PR) filter bank, which, if  L(z), H according to Theorem 2 on p.427, is equivalent to   1 = 2I2 , z ∈ C\{0}, M  L, H z

T

 M L ,H (z)

(9.5.5)

where M L ,H (z) and M  (z) are the modulation matrices of L(z), H (z) and L, H  (z), respectively, defined by L(z), H

M L ,H (z) =

 L(z) L(−z) L(z)  L(−z) , M  (z) = L, H (z) H (−z) . H (z) H (−z) H

(z), one pair of lowpass and highpass For a PR filter bank L(z), H (z) and  L(z), H   filters {L , H } or { L, H } is called the PR dual of the other. For a filter g = {gk }, denote G e (z) =



g2k z k , G o (z) =

k



g2k+1 z k .

k

Then the z-transform G(z) of g can be written as G(z) = G e (z 2 ) + G o (z 2 )z. Definition 1 Polyphase matrix {L(z), H (z)} is defined by

The polyphase matrix of a pair of filters

P(z) =

L e (z) L o (z) . He (z) Ho (z)

Observe that the modulation matrix M L ,H (z) and polyphase matrix of L(z), H (z) have the relation

1 1 M L ,H (z) = P(z 2 ) . z −z

482

9 Compactly Supported Wavelets

This, together with the fact that

1 z 1 −z



1 1 1 1 z −z

= 2I2 ,

leads to the following theorem. Theorem 1 PR filter bank and polyphase matrices The filter pairs {L(z), (z)} satisfy (9.5.5); that is, they constitute a PR filter bank, H (z)} and { L(z), H if and only if  T  1 = I2 , (9.5.6) P(z) P z  are the polyphase matrices of {L(z), H (z)} and { (z)}, where P(z) and P(z) L(z), H respectively. Recall that a Laurent polynomial w(z) is called a monomial if w(z) = cz m for some m ∈ Z and constant c = 0. From Theorem 1, we have the following corollary (the proof is left as an exercise). Corollary 1 Existence of FIR PR dual and polyphase matrix An FIR filter pair {L(z), H (z)} has an FIR PR filter dual if and only if det(P(z)) is a monomial. Example 1 Lazy wavelet transform

 = I2 , we obtain very simIf P(z) = P(z)

ple PR filters, which are called Lazy filters, and we denote them by L (0) (z), H (0) (z) (0) (z), with and  L (0) (z), H (0) (z) = z. L (0) (z) = 1, H (0) (z) = H L (0) (z) =  The Lazy wavelet transform, the decomposition algorithm (or DWT) with (0) (z), is to subsample an input signal into the even and odd indexed  L (0) (z) and H samples. More precisely, let x = {xn } be the input signal. Then c and d obtained by (0) (z) = z (or equivalently, by the wavelet transform (9.5.1) with  L (0) (z) = 1 and H (0) (e−iω ) = e−iω , or q (ω) = H (8.3.4) on p.406 with  p (ω) =  L (0) (e−iω ) = 1 and  pk = 0, k = 0 and  q1 = 2,  q j = 0, j = 1) are given by  p0 = 2,  c = {x2n }, d = {x2n+1 }. On the other hand, the inverse Lazy wavelet transform, the wavelet reconstruction algorithm (or IDWT) (9.5.2) with L (0) (z) = 1 and H (0) (z) = z (or equivalently, (8.3.5) on p.406 with p(ω) = 21 L (0) (e−iω ) = 21 and q(ω) = 21 H (0) (e−iω ) = 21 e−iω , or p0 = 1, pk = 0, k = 0 and q1 = 1, q j = 0, j = 1), is to combine c = {cn }, d = {dn } into a signal, with cn and dn filling the 2n-index and 2n + 1-index positions, that is,

9.5 Lifting Schemes

483

(. . . , cn−1 , dn−1 , cn , dn , cn+1 , dn+1 , . . .).  Definition 2 Lifting

and

Let {L(z), H (z)} be a pair of FIR filters. Define L new (z) = L(z) + G(z 2 )H (z),

(9.5.7)

H new (z) = H (z) + S(z 2 )L(z),

(9.5.8)

where G(z), S(z) are Laurent polynomials. Then L new (z) and H new (z) are called lifting filters of L(z) and H (z), respectively. Theorem 2 Lifting scheme Let L(z), H (z) be FIR filters and L new (z) and H new (z) be their lifting filters defined by (9.5.7) and (9.5.8) for some Laurent polynomials G(z), S(z). Then: (z)} if and only if {L new (z), H (z)} (i) {L(z), H (z)} has an FIR PR dual { L(z), H new   has an FIR PR dual { L(z), H (z)}, with   1 new   L(z); (9.5.9) H (z) = H (z) − G 2  z (z)} if and only if {L(z), H new (z)} (ii) {L(z), H (z)} has an FIR PR dual { L(z), H new   has an FIR PR dual { L (z), H (z)}, with   1   L(z) − S 2 H L new (z) =  (z). (9.5.10) z Proof Here we present the proof of (i). The proof of (ii) is similar and it is left as an exercise (see Exercise 2). To show (i), observe that   L new (z) = L e (z 2 ) + L o (z 2 )z + G(z 2 ) He (z 2 ) + Ho (z 2 )z 



= L e (z ) + G(z )He (z ) + L o (z ) + G(z )Ho (z ) z. 2

2

2

2

2

2

Thus, the polyphase matrix P new (z) of {L new (z), H (z)} is given by

1 G(z) L e (z) + G(z)He (z) L o (z) + G(z)Ho (z) new P (z) = = P(z), Ho (z) He (z) 0 1

484

9 Compactly Supported Wavelets

$ % $ % where P(z) is the polyphase matrix of {L(z), H (z)}. Since det P(z) = det P new (z) , we conclude, by Corollary 1, that {L(z), H (z)} has an FIR PR dual if and only if {L new (z), H (z)} has an FIR PR dual.  new (z) denote the polyphase matrices of { (z)} and Let P(z) and P L(z), H new   { L(z), H (z)}, respectively. Then, following similar arguments as above, we have ⎤ 1  0  new (z) = ⎣ ⎦ P(z). P −G 1z 1 ⎡

 T    T new (z) P new 1   1 = I if and only if P = I2 . Therefore, Thus P(z) P 2 z z new (z)}, with H new (z) given by (9.5.9), is a PR dual of {L new (z), H (z)}, { L(z), H (z)} is a PR dual of {L(z), H (z)}. provided that { L(z), H  Next, we consider the relation between DWT and IDWT algorithms which have an FIR PR filter bank and those with a lifted FIR PR filter bank. In the following two new (z) refer to the lifting filters defined L new (z) and H theorems, L new (z), H new (z),  by (9.5.7), (9.5.8), (9.5.10) and (9.5.9) for some Laurent polynomials G(z), S(z). Let g = {gk } and s = {sk } denote the sequences consisting of the coefficients of G(z) and S(z), respectively. Theorem 3 Forward lifting algorithms Suppose that c and d are the lowpass (z)}. and highpass outputs corresponding to the input sequence x, with { L(z), H Then: (z)} are L new (z), H (i) the lowpass and highpass outputs cnew and dnew of x with { given by  sn dk−n , dknew = dk , k ∈ Z; (9.5.11) cknew = ck − n

new (z)} are L(z), H (ii) the lowpass and highpass outputs cnew and dnew of x with { given by  gn ck−n , k ∈ Z. (9.5.12) cknew = ck , dknew = dk − n

Proof Here we only give the proof of (i), since the proof of (ii) is similar and is left as an exercise (see Exercise 3). To prove (i), we start by noticing that the proof is trivial for the case where dknew = dk , since the corresponding highpass filters H new (z) and H (z) are the same. For the relation between lowpass outputs, we use the standard notation C new (z) to denote the z-transform of cnew . Then, with the formulation (9.5.3) for the wavelet decomposition algorithm, we have

9.5 Lifting Schemes

485

  1 new 1 1 new 1 X (z) +  L L (− ) X (−z) 2 z 2 z        1  1 1  1  1 (− 1 ) X (−z) = L − S(z 2 ) H X (z) + L(− ) − S(z 2 ) H 2 z z 2 z z       1 1 1 1 1 1  (− 1 )X (−z) = L X (z) + L(− )X (−z) − S(z 2 ) H X (z) + H 2 z 2 z 2 z z

C new (z 2 ) =

= C(z 2 ) − S(z 2 )D(z 2 ).

Thus C new (z) = C(z) − S(z)D(z), or equivalently, cnew = c − s ∗ d. 

This yields (9.5.11), as desired. Theorem 4 Backward lifting algorithms and dnew ,

For any two given sequences cnew

(i)  x, obtained by IDWT from cnew and dnew with {L(z), H new (z)}, is equal to that obtained by IDWT from c and d with {L(z), H (z)}, provided that dk = dknew , ck = cknew +



sn dk−n , k ∈ Z;

(9.5.13)

n

(ii)  x, obtained by IDWT from cnew and dnew with {L new (z), H (z)}, is equal to that obtained by IDWT from c and d with {L(z), H (z)}, provided that ck = cknew , dk = dknew +



gn ck−n , k ∈ Z.

(9.5.14)

n

Proof As in the proof of the previous theorem, here we show (i), and the proof of (ii) is left as an exercise. To show (i), we use the formulation (9.5.4) for IDWT (the wavelet reconstruction algorithm). Let y be the recovered signal (sequence) obtained by IDWT from c and d with {L(z), H (z)}. Then, from (9.5.13), we have Y (z) = L(z)C(z 2 ) + H (z)D(z 2 )   = L(z) C new (z 2 ) + S(z 2 )D(z 2 ) + H (z)D(z 2 )  = L(z)C

new



(z ) + L(z)S(z ) + H (z) D(z 2 ) 2

2

= L(z)C new (z 2 ) + H new (z)D new (z 2 ) = X (z), as desired, where (9.5.4) is used to obtain the first and the last equalities.



486

9 Compactly Supported Wavelets

new (z), P new (z) of FIR PR filter pairs To summarize, the polyphase matrices P new (z)} and {L(z), H (z)}, lifted from a PR filter bank { (z)}, H L(z), H {L(z), H (z)}, are given by

{ L new (z),



 ⎤

1  new (z) = ⎣ 1 −S z ⎦ P(z), P 0 1

P new (z) =

1 0 P(z). S(z) 1

The extra decomposition and reconstruction algorithms (9.5.11) and (9.5.13) are Forward lifting algorithm: cnew = c − s ∗ d, dnew = d;

(9.5.15)

Backward lifting algorithm: d = dnew , c = cnew + s ∗ d.

(9.5.16)

Observe that these two algorithms are very simple. By simply moving one term s ∗ d in the equation from one side to the other side, the one can be obtained from the other. Moreover, one can write them down by simply looking at the polyphase new (z). matrix P new (z)}, {L new (z), H (z)}, the Similarly, for the lifted FIR PR filter pairs { L(z), H polyphase matrices are given by ⎤ 1  0  new (z) = ⎣ ⎦ P(z), P −G 1z 1 ⎡

P new (z) =

1 G(z) P(z), 0 1

and the corresponding forward lifting algorithm (9.5.12) and backward lifting algorithm (9.5.14) are cnew = c, dnew = d − g ∗ c;

(9.5.17)

Backward lifting algorithm: c = cnew , d = dnew + g ∗ c.

(9.5.18)

Forward lifting algorithm:

Again, one can write the algorithms down by simply looking at the polyphase matrix new (z). P Multi-step lifting scheme

For a lifted FIR PR filter bank, we can apply more  = I2 , the correlifting procedures to get further lifted FIR PR filter banks. For P(z) new  sponding matrix P (z) will be a product of upper-triangular and lower-triangular matrices of the form ⎤ ⎡  ⎤ ⎡ 1  0 1 1 −S z ⎦, ⎣ ⎦, ⎣ (9.5.19) −G 1z 1 0 1

9.5 Lifting Schemes

487

and the corresponding decomposition algorithms will consist of several repetitions of (9.5.15) and (9.5.17), which can easily be written down from the factorization of new (z). We illustrate this in the following two examples.  P new (z) of the analysis filter bank, Example 2 Suppose that the polyphase matrix P lifted from the Lazy filter bank, is given by ⎤⎡  ⎤ 1  0 1 new (z) = ⎣ ⎦ ⎣ 1 −S z ⎦, P 1 −G z 1 0 1 ⎡

where G(z) and S(z) are Laurent polynomials. Then the forward lifting algorithm for input x = (xk ) is given as follows: Forward lifting algorithm: Splitting(Lazy wavelet transform) :

c = (x2k ), d = (x2k+1 );

Step 1. c = c − s ∗ d; Step 2. dnew = d − g ∗ cnew . new

cnew and dnew are the lowpass and highpass outputs. The backward lifting algorithm is the reverse of the above forward lifting algorithm: Backward lifting algorithm: Step 1. d = dnew + g ∗ cnew ; Step 2. c = cnew + s ∗ d; Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs cnew and dnew . The “Splitting” and “Combining” procedures in the above algorithms are the wavelet decomposition and reconstruction procedures, respectively, with the Lazy (z) = z and L(z) = 1, H (z) = z. filter bank given by  L(z) = 1, H  new (z) of the analysis filter bank, Example 3 Suppose that the polyphase matrix P lifted from the Lazy filter bank, is given by ⎡

⎤ 1  0 ⎦, ⎦⎣ −G 1z 1

 ⎤⎡

new (z) = ⎣ 1 −S P 0 1

1 z

where G(z) and S(z) are Laurent polynomials. Then the forward lifting algorithm for input x is given as follows:

488

9 Compactly Supported Wavelets

Forward lifting algorithm: Splitting (Lazy wavelet transform): c = (x2k ), d = (x2k+1 ); Step 1. dnew = d − g ∗ c; Step 2. cnew = c − s ∗ dnew . cnew and dnew are the lowpass and highpass outputs. Again, the backward lifting algorithm is the reverse of the above forward lifting algorithm: Backward lifting algorithm: Step 1. c = cnew + s ∗ dnew ; Step 2. d = dnew + g ∗ c; Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). xis recovered from its lowpass and highpass outputs cnew and dnew .   of a simple FIR PR So far, we have shown that, from the polyphase matrix P(z) filter bank such as the Lazy filter bank, we obtain a lifted FIR PR filter bank by (left)  with a product of matrices of the form in (9.5.19). Next, we show multiplying P(z) that the polyphase matrix of an FIR filter pair which has an FIR PR filter dual can be factorized as the product of matrices of the form in (9.5.19). Theorem 5 Factoring wavelet transform into lifting scheme

Suppose the

 of FIR filters  (z) is a monomial. determinant of the polyphase matrix P(z) L(z) and H  can be written as Then P(z)  P(z)

=

c1 z n 1 0 0 c2 z n 2



⎡ ⎣

⎤⎡

 ⎤

1 0   ⎦ ⎣ 1 −Sm 1 −G m z 1 0 1

1 z



⎦...⎣

⎤⎡

(9.5.20)

 ⎤

1 0   ⎦ ⎣ 1 −S1 1 −G 1 z 1 0 1

1 z

⎦,

or  P(z)

=

0 c1 z n 1 −c2 z n 2 0



(9.5.21)

 ⎤ ⎡  ⎤ ⎤⎡ ⎤⎡ 1 0 1 0     1 −Sm 1z ⎦ ⎣ 1 −S1 1z ⎦ ⎦ ⎣ ⎦ ⎣ ⎣ ... , −G 1 1z 1 −G m 1z 1 0 1 0 1 ⎡

where c1 , c2 are nonzero constants, n 1 , n 2 are integers, and G j (z), S j (z) are Laurent polynomials. Remark 1 We have the following remarks on the factorizations in (9.5.20) and (9.5.21).

9.5 Lifting Schemes

489

(a) The factorizations in (9.5.20) and (9.5.21) are not unique (see Example 4 below). G m (z) and/or S1 (z) in (9.5.20) and/or (9.5.21) could be zero, meaning that the second matrix (from the left) in (9.5.20) or (9.5.21) could be an upper-triangular matrix, and the last matrix could be a lower-triangular matrix. (b) The factorization in (9.5.21) can be transformed into that in (9.5.20), since we can write n







c1 z 1 0 11 1 0 11 0 c1 z n 1 = . −c2 z n 2 0 0 c2 z n 2 01 −1 1 01 (z)} by {z m 1  (z)} (c) Replacing the original FIR filter pair { L(z), H L(z), z m 1 −2m 2 H for some suitable integers m 1 and m 2 , we could choose n 1 and n 2 in (9.5.20) and (9.5.21) to be zero.  Proof of Theorem 5

Recall that  = P(z)



 L o (z) L e (z)  o (z) . e (z) H H

Since the greatest common divisor (gcd) of  L e (z) and  L o (z) is also a divisor of   det( P(z)), and det( P(z)) is a monomial, this gcd must be a monomial. L e ). Then we have Assume that length( L o ) ≥length(   1   L o (z) = −S1 L e (z) + R1 (z), z where S1 (z) and R1 (z) are Laurent polynomials with L e ). length(R1 ) < length( R1 (z) is called the remainder of the division. Thus

 L (z) R1 (z) 1 S1 ( 1z )  . = e P(z) He (z) ∗ 0 1 If length(R1 ) = 0 (namely R1 (z) is not a monomial), then  L e (z) can be written as 1  L e (z) = −G 1 ( )R1 (z) + R2 (z), z where G 1 (z) and R2 (z) are Laurent polynomials with length(R2 ) < length(R1 ).

490

9 Compactly Supported Wavelets

Hence





1 0 R2 (z) R1 (z) 1 S1 ( 1z )  . = P(z) ∗ ∗ G 1 ( 1z ) 1 0 1

If length(R1 ) = 0, then R2 = 0. Repeating this procedure (if R2 = 0), we obtain ⎧

Y (z) 0 ⎪ ⎪ , or ⎪ ⎪ ⎨ W1 (z) W2 (z)





1 0 1 S1 ( 1z ) 1 Sm ( 1z )  P(z) ... =

G 1 ( 1z ) 1 ⎪ 0 1 0 1 ⎪ ⎪ ⎪ ⎩

0 Y (z) , −W2 (z) W1 (z)

where W1 (z), W2 (z) are some Laurent polynomials, and Y (z) is the gcd of  L e (z) and  L o (z), which is a monomial. Suppose Y (z) = c1 z n 1 for some integer n 1 and constant c1 = 0. Taking the determinant of both sides of the above equation, we have  det( P(z)) = c1 z n 1 W2 (z).  Since det( P(z)) is a monomial, W2 (z) is also a monomial, say c2 z n 2 , where c2 = 0. Write

c1 z n 1 0 0 c1 z n 1 = W1 (z) W2 (z) W1 (z) c2 z n 2

n

1 0 c1 z 1 0 , = 1 −n 2 W1 (z) 1 0 c2 z n 2 c2 z and write



0 c1 z n 1 0 c1 z n 1 = −W2 (z) W1 (z) −c2 z n 2 W1 (z)



1 − c12 z −n 2 W1 (z) 0 c1 z n 1 = . −c2 z n 2 0 0 1

 can be factorized into (9.5.20) or (9.5.21). Thus P(z)



Next, we show how to factorize some commonly used orthogonal and biorthogonal wavelet transforms into lifting schemes. The key is to reduce the lengths of  L e (z) and  L o (z) by carrying out elementary column operations on the polyphase matrix  P(z). We will use this idea in the following theorem. Theorem 6 Matrix column operations ces. Then:

Let A(z) and B(z) be two 2 × 2 matri-

9.5 Lifting Schemes

491

  (i) if B(z) is obtained from A(z) by adding S 1z × column 1 of A to column 2 of A, then

⎡ A(z) = B(z)

 ⎤

⎣ 1 −S 0

(ii) if B(z) is obtained from A(z) by adding G A, then

1 z

⎦;

1   1 z × column 2 of A to column 1 of



⎤ 1  0 ⎦. A(z) = B(z) ⎣ −G 1z 1

The proof is left as an exercise (see Exercise 6).



(z) = 1 − z, L(z) = 1 (1 + Example 4 Haar filter Let  L(z) = 1 + z, H 2 1 z), H (z) = 2 (1 − z) be the unnormalized Haar filter bank. Then we can factorize  of { (z)} as follows: the polyphase matrix P(z) L(z), H

1 1  P(z) = → add(−1) × column 1 to column 2 1 −1





1 0 1 0 1 0 . −→ = 1 −2 0 −2 − 21 1 Thus, by (i) in Theorem 6, we have





1 0 1 0 11  P(z) = . 0 −2 01 − 21 1 The corresponding forward and backward lifting algorithms for input x are given by (see similar lifting algorithms in Sect. 8.3) Forward lifting algorithm: Splitting (Lazy wavelet transform): c = (x2k ), d = (x2k+1 ); Step 1. c(1) = c + d; 1 Step 2. d(1) = d − c(1) ; 2 Step 3. d(2) = −2d(1) . (1) c and d(2) are the lowpass and highpass outputs.

492

9 Compactly Supported Wavelets

Backward lifting algorithm: 1 Step 1. d(1) = − d(2) ; 2 1 (1) Step 2. d = d + c(1) ; 2 Step 3. c = c(1) − d; Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs c(1) and d(2) .  is not unique. Indeed, we have We should note that the factorization of P(z)

1 1  P(z) = → add column 2 to column 1 1 −1



1 2 1 2 0 1 2 . −→ = 0 −1 0 −1 01 Thus, by (ii) in Theorem 6, we have

1

1 0 1 2  = 2 0 P(z) . 0 −1 −1 1 01  Example 5 D4 -filter Let p(ω) and q(ω) be the D4 orthogonal filters given by (z), given by  (8.2.30) and (8.2.31) on p.399. Consider  L(z), H L(e−iω ) = p(ω), −iω i2ω  H (e ) = e q(ω). Observe that we have shifted the coefficients of q(ω) two (z). Thus units to the left to define H √ √ √ √ 3+ 3 2 1+ 3 3 1− 3 3− 3  + z+ z + z , L(z) = 8√ 8 8 √ √ 8 √ 1 + 3 −2 3 + 3 −1 3 − 3 1 − 3  H (z) = z − z + − z. 8 8 8 8  of { (z)} is factorized as follows: The polyphase matrix P(z) L(z), H √ √ √ √ 3 − √3 + (1 + 3)z√ 1 − √ 3 + (3 + 3)z √ (1 + 3)z −1 + 3 − 3 −(3 + 3)z −1 − 1 + 3 √ add (− 3) × column 2 to column 1

 8 P(z) =



−→

√ √ √ 3 − √3 + (1 + 3)z√ 4 −√ 4 3 (4 + 4 3)z −1 −(3 + 3)z −1 − 1 + 3

9.5 Lifting Schemes

493

√  3 2+ 3 + z × column 1 to column 2 add 4 4 √

0√ 4 −√ 4 3 −→ (4 + 4 3)z −1 4 + 4 3 √



1 0 0√ 4−4 3 . = z −1 1 0 4+4 3 √

Thus, by Theorem 6, we have &  = P(z)

√ 1− 3 2

0

0√

1+ 3 2

'

1 0 z −1 1

&

1− 0



3 4



− 2+4 3 z 1

'

√1 0 . 31

The corresponding forward and backward lifting algorithms for input x are given as follows: Forward lifting algorithm: Splitting (Lazy wavelet transform): c = (x2k ), d = (x2k+1 ); √ (1) Step 1. dk = dk + 3ck ; √ √ 3 (1) 2 + 3 (1) (1) Step 2. ck = ck − d − dk+1 ; 4 k 4 (1) ; Step 3. dk(2) = dk(1) + ck−1 √ √ 1 − 3 (1) (3) 1 + 3 (2) (2) Step 4. ck = ck , dk = dk . 2 2 c(2) and d(3) are the lowpass and highpass outputs. Backward lifting algorithm: √ (2) (2) √ (1) (3) Step 1. ck = −(1 + 3)ck , dk = ( 3 − 1)dk ; (1)

(2)

(1)

= dk − ck−1 ; √ √ 3 (1) 2 + 3 (1) (1) Step 3. ck = ck + d + dk+1 ; 4 k 4 √ (1) Step 4. dk = dk − 3ck ; Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs c(2) and d(3) .  Step 2. dk

494

9 Compactly Supported Wavelets

Example 6 5/3-tap biorthogonal filters Let  p = { pk },  q = { qk } be the FIR filter pair of the 5/3-biorthogonal filter bank given in Example 5 on p.402. With (z) =  L(z) =  p (ω), H q (ω), that is, z = e−iω , denote  11 1 3 1 1  L(z) = − + + z + z2 − z3, 8z 4 4 4 8 (z) = − 1 1 + 1 − 1 z. H 4z 2 4  of { (z)} is factorized as follows: The polyphase matrix P(z) L(z), H  ⎤ 1 3 1 1 + z − + z 2 2 4 z ⎥ ⎢    ⎥ 2 P(z) =⎢ ⎦ ⎣ 1 − 21 1z + 1 ⎡

1 2

 1 1 × column 1 to column 2 add + 2 2z

1 1



1 0 02 + z2 −→ 2 2 . = 1 10 1 0 4 (1 + z) 1 

Thus, by (i) in Theorem 6, we have





1 0 1 − 21 (1 + 1z )  = 01 P(z) . 1 20 0 1 4 (1 + z) 1 The corresponding forward and backward lifting algorithms for input x are: Forward lifting algorithm: Splitting (Lazy wavelet transform): c = (x2k ), d = (x2k+1 ); 1 (1) Step 1. ck = ck − (dk + dk−1 ); 2 1 (1) (1) (1) Step 2. dk = dk + (ck + ck+1 ); 4 1 (1) (2) (1) (2) Step 3. ck = dk , dk = ck . 2 c(2) and d(2) are the lowpass and highpass outputs. Backward lifting algorithm: (1)

(2)

(1)

(2)

Step 1. ck = 2dk , dk = ck ; 1 (1) (1) (1) Step 2. dk = dk − (ck + ck+1 ); 4

9.5 Lifting Schemes

495

1 (1) Step 3. ck = ck + (dk + dk−1 ); 2 Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs; c(2) and d(2) .  Example 7 9/7-tap biorthogonal filters Let  p = { pk },  q = { qk } be the FIR filter pair of the 9/7-biorthogonal filter bank constructed in Sect. 9.4 of Chap. 9, on (z) = −ei2ω L(z) = ei2ω  p (ω), H q (ω), that is, p.477. With z = e−iω , denote    4 1 5θ 1 1 4 2 ) z ) + 10 + 8θ − (1 + z) (− + − (6 + 4θ)(z + z 2 24 4 z2 z 5θ 1 1 1 1 2 (z 2 + 2 ) +  = 0 +  1 (z + ) +  3 (z 3 + 3 ) +  4 (z 4 + 4 ), z z z z   1 1 1 1 (z) = z 3 (1 − )4 − z − − (2 − 4θ) H 24 z 4θ z 1 1 h 4 (z 4 + 2 ), = h1 z +  h 2 (z 2 + 1) +  h 3 (z 3 + ) +  z z  L(z) =

where θ (≈ −0.3423840948583691) is an irrational number given by (9.3.23) on p.462, and 5 5 5 3 35 1  1 = − θ − θ2 , 0 = − θ − θ2 ,  8 32 4 4 32 16 5 5 5 1 5 5 2 2  + θ+ θ ,  θ+ θ ,  2 = 3 = 4 = − θ, 16 8 8 32 16 64 1 1 1 3 1 1 1  , h2 = − , h3 = + , h4 = − . h1 = − 8 16θ 64θ 4 16 32θ 64θ The numerical values of  j, h j can be obtained by 1 1  p j+2 ,  q j+2 , hj = −  j =  2 2 q j given on p.478. with  pj,  of { (z)}, we reduce the lengths To factorize the polyphase matrix P(z) L(z), H  of the (1, 1)-entry and the (1, 2)-entry of P(z). In the following, we use A[ j, k] to denote the ( j, k)-entry of the 2 × 2 matrix A(z) of Laurent polynomials, and we use coeff(A[ j, k], z n ) to denote its coefficient with power z n . Then, we have &  = P(z)

 0 +  2 (z + 1z ) +  4 (z 2 + z12 )  1 (1 + 1z ) +  3 (z + 1   h 4 (z 2 + z ) h1 +  h 3 (z + 1z ) h 2 (1 + z) + 

1 ) z2

' .

496

9 Compactly Supported Wavelets

( (  1], z 2 ) coeff( P[1,  2], z) = − a = −coeff( P[1, 4  3 ,

With

 to obtain we add a(1 + z)× column 2 to column 1 of P(z)  P(z) & −→ A(z) =

∗ + ∗(z + 1z )  1 (1 + 1z ) +  3 (z +  ∗(1 + z) h1 +  h 3 (z + 1z )

1 ) z2

'

coeff(A[1, 2], z) 1 , addb(1 + ) × column 1 to column 2 of A(z) coeff(A[1, 1], z) z

∗ + ∗(z + 1z ) ∗(1 + 1z ) −→ B(z) = ∗(1 + z) ∗ coeff(B[1, 1], z) , add c(1 + z) × column 2 to column 1 of B(z) with c = − coeff(B[1, 2], z 0 ) & '



g 0 g ∗(1 + 1z ) 1 −d(1 + 1z ) −→ C(z) = , = 1 1 0 2g 0 0 1 2g with b = −

where 1 5 2 5 − θ − θ ≈ 0.8128930661159611, 4 2 2 3 + 2θ d=− ≈ −0.4435068520439712. 11 + 34θ + 50θ2 g=

The constants a, b, c, d, g can be calculated by using computer software, such as Maple. The expressions and numerical values of a, b, c are given by 1 ≈ 1.5861343420599236, 2(1 + 2θ) 1 − θ − 10θ2 ≈ 0.0529801185729614, b= 4(1 + 4θ + 10θ2 ) 1 + 6θ − 2θ2 ≈ −0.8829110755309333 c= 2(1 + 2θ)(3 + 2θ)

a=

(we have used the fact that 1 + 4θ + 10θ2 + 20θ3 = 0 to simplify b, c, d). Thus,  can be factorized as P(z)  = P(z)



g 0 1 0 2g



1 −d(1 + 1z ) 0 1



1 0 −c(1 + z) 1



1 −b(1 + 1z ) 0 1



1 0 . −a(1 + z) 1

9.5 Lifting Schemes

497

The corresponding forward and backward lifting algorithms for input x are given as follows: Forward lifting algorithm: Splitting (Lazy wavelet transform): c = (x2k ), d = (x2k+1 ); Step 1. dk(1) = dk − a(ck + ck+1 ); (1) ); Step 2. ck(1) = ck − b(dk(1) + dk−1 (2)

Step 3. dk

(2)

(1)

(1)

(1)

(1)

(2)

(2)

= dk − c(ck + ck+1 );

Step 4. ck = ck − d(dk + dk−1 ); 1 (2) Step 5. ck(3) = g ck(2) , dk(3) = d . 2g k c(3) and d(3) are the lowpass and highpass outputs. Backward lifting algorithm: 1 (3) (2) (2) (3) Step 1. ck = ck , dk = 2g dk ; g (1)

(2)

(2)

(2)

(2)

(1)

(1)

Step 2. ck = ck + d(dk + dk−1 ); (1)

Step 3. dk

= dk + c(ck + ck+1 ); (1)

(1)

(1)

Step 4. ck = ck + b(dk + dk−1 ); Step 5. dk = dk(1) + a(ck + ck+1 ); Combining (Inverse Lazy wavelet transform): x = (. . . , ck , dk , . . .). x is recovered from its lowpass and highpass outputs c(3) and d(3) .  Exercises Exercise 1 Give a rigorous proof of Corollary 1. Hint: You may apply Theorem 1 in this section and Theorem 3 on p.427. L new (z) be the lifting filters of H (z) and  L(z), defined Exercise 2 Let H new (z) and  by (9.5.8) and (9.5.10), respectively. Show that (a) {L(z), H (z)} has an FIR PR dual if and only if {L(z), H new (z)} has an FIR PR dual; (z)} is a PR dual to {L(z), H new (z)} if { (z)} is a PR dual of L(z), H (b) { L new (z), H {L(z), H (z)}. Exercise 3 Prove (ii) in Theorem 3. Exercise 4 Prove (ii) in Theorem 4.

498

9 Compactly Supported Wavelets

(z)} has an FIR PR filter dual. Exercise 5 Suppose that the FIR filter pair { L(z), H (z)} also has an FIR PR filter dual, for some integers Show that {z m 1  L(z), z m 1 −2m 2 H m 1 and m 2 different from 0. Exercise 6 Prove Theorem 6. Exercise 7 Let 1 1 1 11 1 + 1 + z + z2 − z3; + 8 z2 8z 8 8 H (z) = −1 + z.

L(z) = −

(a) Find the polyphase matrix P(z) of {L(z), H (z)}. (b) Compute the determinant of P(z). (c) Decide whether {L(z), H (z)} has an FIR PR dual or not. Exercise 8 Repeat Exercise 7 for 1 1 11 1 1 + + 1 + z + z2 − z3, 2 8z 8z 8 8 1 H (z) = − + 1. z

L(z) = −

Exercise 9 Let 1 L(z) = z + (1 + z 2 ); 2 3 1 1 1 1 H (z) = − (z + ) − (z 2 + 2 ). 2 2 z 4 z (a) Find the polyphase matrix P(z) of {L(z), H (z)}. (b) Show that {L(z), H (z)} has an FIR PR dual. Exercise 10 Repeat Exercise 9 for 5 1 3 1 3 1 5 + (z + ) − (z 2 + 2 ) + (z 3 + 3 ), 2 16 z 4 z 16 z   1 1 3 1 + z3 . H (z) = z − (1 + z 2 ) + 4 2 8 z

L(z) =

Exercise 11 Factorize the polyphase matrix P(z) of {L(z), H (z)} as given in Exercise 9, and then provide the corresponding forward and backward lifting algorithms. Exercise 12 Repeat Exercise 11 by using both L(z) and H (z) as given in Exercise 10.

Chapter 10

Wavelet Analysis

The transition operator T p , along with its representation matrix in terms of a given finite sequence p, as introduced and studied in Sect. 9.1 of Chap. 9, is again applied to our study of scaling functions and wavelets in this chapter. In particular, while the characterization of the properties of stability and orthogonality of the refinable function φ was formulated in terms of the Gramian function G φ (ω) in Chap. 9, these properties will be characterized in terms of the properties of the eigenvalues and 1-eigenvectors of the transition operator T p in this chapter. This consideration is necessary, since there is no explicit expression of the refinable function φ in the formulation of G φ (ω) in general. N satisfies the (necessary) condition that  p = When the sequence p = { pk }k=0 k k  ω 2, we will first prove in Sect. 10.1 that the infinite product ∞ j=1 p( 2 j ) converges absolutely for each ω. Here and throughout Chap. 10, we will use the notation p, without boldface, for any finite refinement sequence (see Remark 1 in Sect. 8.2 on p.396), and p(ω) to denote the two-scale symbol of p. By the Paley-Wiener Theorem, it can be shown that the above infinite product converges to an entire function, which is the Fourier transform of some compactly supported distribution φ. One of the main objectives of this first section is to show that a necessary and sufficient condition for the distribution φ to be a function in L 2 (R) is that the transition operator T p has a 1-eigenfunction v(ω), which is a nonnegative trigonometric polynomial of degree given by the length N of the support of the refinement sequence p, such that v(0) > 0. Other topics of investigation in Sect. 10.1 include the support of φ, the sum-rule and the property of partition of unity. Characterizations of stability and orthogonality of refinable functions φ in terms of the transition operators, defined by the associated refinement sequences p, is the subject of investigation in Sect. 10.2. The Condition E of T p and the sum-rule order of p, defined in Sect. 9.3 of Chap. 9, are used for the characterization. For example, it will be shown that a refinable function φ is stable, if and only if (i) the corresponding

C. K. Chui and Q. Jiang, Applied Mathematics, Mathematics Textbooks for Science and Engineering 2, DOI: 10.2991/978-94-6239-009-6_10, © Atlantis Press and the authors 2013

499

500

10 Wavelet Analysis

refinement sequence p has sum-rule order ≥ 1, (ii) the transition operator T p satisfies Condition E, and (iii) there exists a 1-eigenfunction v0 (ω) of T p that satisfies either v0 (ω) > 0, or v0 (ω) < 0, for all ω ∈ R. Another result is that a compactly supported refinable function φ is orthogonal, if and only if (i) the corresponding refinement sequence p is a QMF, (ii) it has sum-rule order ≥ 1, and (iii) its transition operator T p satisfies Condition E. With only very few exceptions such as the Cardinal B-splines, compactly supported scaling functions (i.e. stable refinable functions) do not have explicit expressions. For instance, again with the exception of B-splines (such as the Haar scaling function), the compactly supported orthogonal and biorthogonal scaling functions and wavelets constructed in Chap. 9, with demonstrative examples in Chap. 8, are only formulated in terms of the refinement and two-scale equations. Of course, the values of a refinable function φ(x) at the dyadic points x = k/2 j , for k = 0, . . . , 2 j N and j = 0, 1, . . . , n, can be computed iteratively for any desirable positive integer n, by first finding the 1-eigenvector y0 = [φ(0), φ(1), . . . , φ(N )]T of the (N + 1)dimensional square matrix [ p2 j−k ], 0 ≤ j, k ≤ N , and then by applying the refinement equation to compute φ(k/2 j ) for j = 1, . . . , n, and finally by applying the two-scale relation of the wavelet to compute ψ(k/2n ) from φ(k/2n−1 ). The objective of Sect. 10.3 is to introduce and study the convergence of the cascade algorithm, for computing a sequence of functions φn (x), in terms of some linear combinations of any desirable compactly supported initial function φ0 (x), that approximates φ(x) for all x ∈ R and sufficiently large n. The main result of this section is that convergence of the cascade algorithm is guaranteed if and only if the refinement sequence p has sum-rule order ≥ 1 and its transition operator satisfies Condition E. We will also apply this convergence result to prove the characterization for the biorthogonality of scaling functions. As mentioned above, the compactly supported scaling functions φ and wavelets ψ, that are orthogonal or biorthogonal, do not have explicit expressions in general. On the other hand, to determine the order of smoothness of those functions that are written as (finite) linear combinations of φ(2m x −k) and ψ(2 j x −k), for j, k, m ∈ Z, it is necessary (and sufficient) to know the order of smoothness of the scaling function φ (since the wavelet ψ(x) is a linear combination of φ(2x − k) from the wavelet twoscale relation). Section 10.4 of this chapter is devoted to the study of the Sobolev smoothness of φ. In this book, the order of Sobolev smoothness will be given in terms of the eigenvalues of the transition operator T p associated with the refinement sequence p of φ. To introduce this concept, we need the notion of the Sobolev space. For any γ ≥ 0, a function f is said to be in the Sobolev space Wγ (R), if f satisfies γ the condition (1 + ω 2 ) 2  f (ω) ∈ L 2 (R). Then the order of Sobolev smoothness is defined by the critical Sobolev exponent ν f = sup{γ : f ∈ Wγ (R)}. To describe the results in this section, the concept of spectral radius for square matrices, studied in Chap. 3, is extended to the transition operator T p and its restric-

10 Wavelet Analysis

501

tion on the subspaces U2L of trigonometric polynomials u(ω) of degree N whose derivatives up to order 2L − 1 vanish at ω = 0. Here, N is the length of the support of the refinement sequence p and L denotes the sum-rule order of p. The natural extension of the notion of “spectral radius” from a square matrix A, studied in Chap. 3, to a 1 more general operator T , is to apply the result ρ(A) = limn→∞ |||An ||| n in Theorem 1 3 of Sect. 3.2 to define the spectral radius of T , namely: ρ(T ) = limn→∞ |||T n ||| n . In Sect. 10.4 of this chapter, we will consider the operator T = T p |U2L and denote ρ0 = ρ(T p |U2L ). In addition, let σ(T p ) denote the set of the eigenvalues of the representation matrix T p of the transition operator T p . The main results in this section are: firstly, the estimate of the critical Sobolev exponent νφ ≥ −

log ρ0 , 2 log 2

where this inequality can be replaced by equality, if the refinable function φ is stable (i.e., φ is a scaling function); and secondly,   1 1  ρ0 = max |λ| : λ ∈ σ(T p ) {1, , . . . , 2L−1 } . 2 2

10.1 Existence of Refinable Functions in L 2 (R) The first objective of this section is to establish convergence result, to be ∞an absolute ω p( ), associated with an arbitrary stated in Theorem 1, for the infinite product j=1 2j  finite sequence p = { pk } with k pk = 2. Here and throughout, p(ω) denotes the two-scale symbol of p, as defined in (9.1.1) in Chap. 9, namely: p(ω) =

1 pk e−ikω . 2 k

1

Since p(0) = 2 k pk = 1, the above infinite product converges to 1 at ω = 0,  so that the limit function, denoted by φ(ω), satisfies the normalization condition  φ(0) = 1. The second objective of this section is to apply the Fundamental Lemma of transition operators (i.e. Theorem 3) in Sect. 9.1 of Chap. 9 to show that the limit  function φ(ω) is the Fourier transform of some function φ ∈ L 2 (R), under certain assumptions on the transition operator T p associated with the given finite refinement sequence (mask) p. This result will be established in Theorem 3 in this section. The third objective of this section is to derive various properties of the function φ, including: the support of φ as dictated by the support of its refinement mask p, the property of partition of unity (PU), and the sum-rule property. Let us first recall (see N , the solution φ (if it exists) of the Sect. 8.2) that for a finite sequence p = { pk }k=0 refinement equation

502

10 Wavelet Analysis

φ(x) =

N 

pk φ(2x − k)

(10.1.1)

k=0

is called the normalized solution, if the principal value of its integral over R is equal  = 1). An equivalent formulation of (10.1.1) is to 1 (i.e. φ(0) ω ω  φ(ω) = p( )φ( ), 2 2 and  φ(ω) =



p(

j=1

(10.1.2)

ω ), 2j

(10.1.3)

as already mentioned in (8.2.21) on p.397 in Sect. 8.2 of Chap. 8. Of course, the above statements are meaningful, only if the infinite product in (10.1.3) converges to the Fourier transform of the normalized solution φ (again, if it exists) of the refinement equation (10.1.1). First let us show that the infinite product (10.1.3) always converges, as long as the finite sequence p sums to 2, as follows. Theorem 1 Pointwise convergence of

∞ j=1

p(

ω ) 2j

Let p = { pk } be any finite

  ω sequence with k pk = 2. Then ∞ j=1 p( 2 j ) converges absolutely for each ω ∈ R n in the sense that Bn = | j=1 p( 2ωj )| converges for each ω ∈ R.   ω Proof Since lnBn = nj=1 ln| p( 2ωj )|, it is sufficient to show that ∞ j=1 ln| p( 2 j )| converges pointwise on R. To prove the convergence of this infinite series, we observe that since p(ω) is a trigonometric polynomial, p  (ω) is also a trigonometric polynomial; and being periodic and continuous, it is bounded by some positive constant C, namely: | p  (ω)| ≤ C for all ω ∈ R. Therefore, by the Mean Value Theorem, we have | p(ω) − 1| = | p(ω) − p(0)| = | p  (ξ)(ω − 0)| ≤ C|ω|, for some ξ. Hence, we have | p( 2ωj ) − 1| ≤ 1−

C|ω| , 2j

so that

C|ω| ω C|ω| ≤ | p( j )| ≤ 1 + j , 2j 2 2

(10.1.4)

for all integers j ≥ 1. On the other hand, in view of the inequalities ln(1 + x) ≤ x, for x ≥ 0; and −2x ≤ ln(1 − x), for 0 ≤ x ≤

1 2

(10.1.5)

10.1 Existence of Refinable Functions in L 2 (R)

503

(see Exercise 1), we may conclude that for an arbitrary given ω ∈ R, there exists some positive integer J , such that 1 C|ω| ≤ , for j ≥ J. 2j 2 Thus, for j ≥ J , by (10.1.4) and (10.1.5), −2 ·

C|ω| C|ω| ω C|ω| C|ω| ≤ ln(1 − j ) ≤ ln| p( j )| ≤ ln(1 + j ) ≤ . 2j 2 2 2 2j

Therefore, ∞ J −1 ∞ 







C|ω|

ln| p( ω )| ≤

ln| p( ω )| + 2 · j < ∞. j j 2 2 2 j=1

This shows that

j=1

∞

ω j=1 ln| p( 2 j )|

j=J

converges, completing the proof of the theorem. 

For a finite sequence p = { pk }, it can be shown, by using the Paley-Wiener  theorem, that φ(ω), defined by (10.1.3), is an entire function (i.e., a function analytic in the entire complex plane C) and is the Fourier transform of some compactly supported distribution (also called generalized function) φ. Before showing that under a certain suitable condition imposed on the sequence p (formulated in terms of the transition operator T p associated with p), the distribution φ is an L 2 function in Theorem 3 below, let us first prove that the support of the distribution φ is dictated by the support of p, when p is considered as a bi-infinite sequence by padding it with zeros . Theorem 2 Supports of refinable functions

For any finite sequence p =

N , { pk }k=0

the solution φ of the refinement equation (10.1.1) is a distribution with supp(φ) ⊆ [0, N ]. Proof From the refinement equation (10.1.1), it is clear that supp(φ) ⊆





supp φ(2 · −k) =

k∈[0,N ]

k∈[0,N ]



1 supp φ) + k 2

1 1 1 ⊆ supp(φ) + [0, N ] = [0, N ] + supp(φ). 2 2 2 By applying this result to the second term on the right, we have 1 [0, N ] + 2 1 = [0, N ] + 2

supp(φ) ⊆

 11 1 [0, N ] + supp(φ) 2 2 2 1 1 [0, N ] + 2 supp(φ); 22 2

504

10 Wavelet Analysis

and more generally, by repeating the same argument, we obtain 1 1 1 1 [0, N ] + 2 [0, N ] + · · · + j [0, N ] + j supp(φ), 2 2 2 2

supp(φ) ⊆

for j ≥ 1. But lim j→∞ 21j supp(φ) = 0, since φ is compactly supported. Therefore,  1 we may conclude that supp(φ) ⊆ ∞  j=1 2 j [0, N ] = [0, N ], as desired. Remark 1 Following the proof of Theorem 2, one can show that if the refinement mask p = { pk } is supported on [N1 , N2 ], namely pk = 0 for k < N1 or k > N2 ,  then supp(φ) ⊆ [N1 , N2 ] (see Exercise 2). Next, we provide a characterization for the normalized solution φ, as a function in L 2 (R), of the refinement equation, by imposing some restriction on the refinement mask. As remarked in Remark 1 of Sect. 9.1, in the rest of this chapter, a refinable function φ associated with p = { pk } means the normalized solution of the refinement  = 1. equation with the condition φ(0) N Theorem 3 Characterization of refinable functions in L2 (R) Let p = { pk }k=0 and φ be the normalized solution of (10.1.1). Then the distribution φ is a function in L 2 (R), if and only if there exists some non-negative trigonometric polynomial v(ω) ∈ V2N +1 with v(0) > 0, such that v(ω) is a 1-eigenfunction of the transition operator T p .

Proof Suppose φ ∈ L 2 (R). Then it follows from Theorem 2 of Sect. 9.2 on p.448 that the Gramian function G φ (ω) of φ is in V2N +1 and (T p G φ )(ω) = G φ (ω). Clearly, G φ (ω) ≥ 0, and ∞  2   2 = 1. |φ(2πk)| ≥ |φ(0)| G φ (0) = k=−∞

Thus, v(ω) = G φ (ω) is a trigonometric polynomial that satisfies the required properties. Conversely, suppose that there exists v(ω) ∈ V2N +1 such that (T p v)(ω) = v(ω), v(ω) ≥ 0 and v(0) > 0. Then, by the Fundamental Lemma of transition operators in Sect. 9.1 of Chap. 9 (see (9.1.12) of Theorem 3 on p.444), we have 

2n π

−2n π

=

n

ω 2 ω )| v( n )dω 2j 2  π n (T p v)(ω)dω = v(ω)dω < ∞.

| p(

j=1 π



−π

−π

Thus, by applying Fatou’s lemma, we may conclude that

10.1 Existence of Refinable Functions in L 2 (R)

 v(0)

R

2  |φ(ω)| dω =

 R

505

2  |φ(ω)| v(0)dω

 =

R

n

limn→∞ |

j=0

 ≤ limn→∞ = limn→∞

p

|

n

R j=0  π −π

p

ω 2j ω 2j

|2 v |2 v

ω 2n ω 2n

χ2n [−π,π] dω χ2n [−π,π] dω

v(ω)dω < ∞.

Hence, since v(0) = 0, we have  R

2  |φ(ω)| dω < ∞,

or φ ∈ L 2 (R).



Example 1 Apply Theorem 3 to show that the refinable function φ of the refinement equation φ(x) = φ(2x) + φ(2x − 2), is a function in L 2 (R). Solution First, we must find a 1-eigenfunction v(ω) of T p that satisfies the conditions in Theorem 3. To do so, we compute the representation matrix T p = 1  2 a2 j−k −2≤ j,k≤2 of the transition operator T p , by considering the Fourier series  −i jω = 4| p(ω)|2 (refer to (9.1.8) on p.439), where j aje p(ω) = so that



1 (1 + e−i2ω ), 2

a j e−i jω = 4| p(ω)|2 = 2 + e−i2ω + ei2ω .

j

Therefore, the only nonzero coefficients a j are given by a0 = 2, a1 = a−1 = 1, and this gives the representation matrix

506

10 Wavelet Analysis



Tp =

1 2

a2 j−k

 −2≤ j,k≤2

1 ⎢2 1⎢ 1 = ⎢ 2⎢ ⎣0 0

0 0 0 0 0

0 1 2 1 0

0 0 0 0 0

⎤ 0 0⎥ ⎥ 1⎥ ⎥. 2⎦ 1

It is not difficult to verify that v0 = [0, 1, 2, 1, 0]T is a (right) 1-eigenvector of T p . Thus, v0 (ω) = E(ω)v0 = eiω + 2 + e−iω = 2 + 2 cos ω is a 1-eigenfunction of the transition operator T p . Since the necessary conditions (v0 (ω) ≥ 0, for ω ∈ [−π, π]; and v0 (0) = 4 > 0) are satisfied by the trigonometric polynomial v0 (ω) ∈ V5 , it follows from Theorem 3 that φ ∈ L 2 (R). Indeed, it is easy to verify that the solution of the given refinement equation is the function  φ(x) = 21 χ[0,2) (x). N is called a QMF if its two-scale symbol Recall that a sequence p = { pk }k=0 satisfies the identity | p(ω)|2 + | p(ω + π)|2 = 1.

Therefore, for a QMF p, the constant trigonometric polynomial 1 ∈ V2N +1 is an eigenfunction of the transition operator T p corresponding to the eigenvalue 1, namely: T p 1 = | p(ω/2)|2 · 1 + | p(ω/2 + π)|2 · 1 = 1. Since v(ω) = 1 is non-negative and different from zero at ω = 0, the necessary conditions in Theorem 3 are satisfied. This leads to the following result. N Corollary 1 QMF implies φ ∈ L 2 (R) Let N be any positive integer, p = { pk }k=0 any QMF, and φ the normalized solution of (10.1.1). Then the solution φ is a compactly supported function in L 2 (R), with supp(φ) ⊆ [0, N ].

Next, we will show that if the normalized solution of the refinement equation (10.1.2) is a function in L 2 (R), then this (scaling) function enjoys the property of partition of unity (PU). A compactly supported function f (x) is said to have the PU, if it satisfies the identity  f (x − ) = 1, x ∈ R. (10.1.6) ∈Z

Observe that since f (x) is compactly supported, the infinite series in (10.1.6) is only a finite sum for each fixed value of x. Thus, the series always converges to a 1-periodic function. In the next theorem we show that PU is equivalent to the moment condition (called the Strang-Fix condition), to be defined in (10.1.7) below.

10.1 Existence of Refinable Functions in L 2 (R)

507

Theorem 4 PU ⇔ moment condition A compactly supported function f ∈L 2 (R) has the PU if and only if it satisfies the moment condition of order at least equal to 1; that is,  f (0) = 1,  f (2πk) = 0, k ∈ Z\{0}. (10.1.7) Proof  Since f is compactly supported, f ∈ L 2 (R) implies f ∈ L 1 (R), and hence g(x) = ∈Z f (x − ) is in L 1 [0, 1]. Observe that, for k ∈ Z, 

1

g(x)e−i2πkx d x =

0

=



1 0 ∈Z 1



f (x − ) e−i2πkx d x f (x − ) e−i2πkx d x

∈Z 0

= =



1−

∈Z −  ∞ −∞

f (u) e−i2πku du

f (u) e−i2πku du =  f (2πk),

where the interchange of integration and summation in the second equality is valid, since ∈Z is a finite sum. Thus, f satisfies (10.1.6), if and only if it satisfies (10.1.7).   = δk Let φ be the normalized solution of Theorem 5 φ ∈ L 2 (R) implies φ(2πk) the refinement equation (10.1.2), such that φ ∈ L 2 (R). Then φ satisfies the moment condition of order at least equal to 1; that is,  = 1, φ(2πk)  φ(0) = 0, k ∈ Z\{0}.

(10.1.8)

Proof Since φ is compactly supported, φ ∈ L 2 (R) implies φ ∈ L 1 (R). Conse quently, it follows from the Riemann-Lebesgue lemma that φ(ω) → 0 as ω → ∞. Let k = 0. Then  n · 2πk) = φ(2 =

n j=1 n

p(

2n · 2kπ  2n · 2kπ ) φ( ) 2j 2n

  p(0) φ(2kπ) = φ(2kπ).

j=1

  n · 2πk) → 0 as n → ∞; that is, φ(2πk)  Hence, φ(2πk) = φ(2 = 0 for all integers k = 0, as desired.  As an immediate consequence of Theorems 4 and 5, we have the following result.

508

10 Wavelet Analysis

Corollary 2 L 2 (R) for refinable φ implies

 k

φ(x − k) = 1 A compactly sup-

ported refinable function in L 2 (R) has the property of partition of unity. As another consequence of Theorem 5, observe that the Gramian function of a scaling function φ ∈ L 2 (R) satisfies G φ (0) =



2   2 = 1. |φ(2kπ)| = |φ(0)|

k∈Z

Also, recall from Definition 2 of Sect. 9.3 on p.456 that a trigonometric polynomial p(ω) is said to have sum-rule order L ≥ 1, if L is the largest integer for which p(0) = 1,

d p(π) = 0, for  = 0, 1, . . . , L − 1. dω 

Theorem 6 G φ (π) > 0 implies p(ω) has at least first sum-ruleorder

Let φ ∈

N . If L 2 (R) be a compactly supported scaling function with refinement mask { pk }k=0 G φ (π) > 0, then p(ω) has at least first sum-rule order; that is, p(0) = 1 and

p(π) = 0.

(10.1.9)

Proof To prove (10.1.9), we observe, from the fact that T p G φ = G φ , that G φ (0) = | p(0)|2 G φ (0) + | p(π)|2 G φ (π). Thus, by the assumption that p(0) = 1 and the fact that G φ (0) = 1, we have  | p(π)|2 G φ (π) = 0, which implies p(π) = 0, as desired. It has been shown in Theorem 4 of Sect. 9.2, on p.455, that if φ is stable, then G φ (ω) > 0. This fact and Theorem 6 lead to the following corollary. Corollary 3 Stability of φ implies p(ω) has the sum-rule property

If a refin-

N able function φ is stable, then its associated refinement mask { pk }k=0 has at least the first order sum-rule property.

Example 2 Let φ(x) = 13 χ[0,3) (x) be the function considered in Example 4 of Sect. 9.2 in the previous chapter, on p.450, where it was shown that G φ (ω) = 13 + 4 2 9 cos ω + 9 cos 2ω and that φ(x) is not stable. However, it is easy to verify that φ(x) is a solution of the refinement equation φ(x) = φ(2x) + φ(2x − 3), and hence is a refinable function. Since

10.1 Existence of Refinable Functions in L 2 (R)

G φ (π) =

509

2 1 1 4 + cos π + cos 2π = > 0, 3 9 9 9

it follows from Theorem 6 that the refinement mask p of φ should have at least the first sum-rule order . Indeed, this is true, as p(ω) =

1 (1 + e−i3ω ), 2

which implies that p(0) = 1 and p(π) = 21 (1 − 1) = 0.



Example 3 Let φ(x) = 21 χ[0,2) (x) be the refinable function considered in Example 1 of this section. The associated two-scale symbol p(ω) is given by p(ω) =

1 (1 + e−i2ω ). 2

Although φ ∈ L 2 (R), the refinement mask p does not satisfy the sum-rule condition, since p(π) = 21 (1 + 1) = 1 = 0. Observe that this does not contradict Theorem 6, because the refinable function φ does not satisfy the condition G φ (π) > 0. Observe that 1 1 G φ (ω) = + cos ω, 2 2 and G φ (π) = 0 (see Exercise 3).

 Excercises

Exercise 1 Show that −2x ≤ ln(1 − x) for 0 ≤ x ≤ 21 . N2 Exercise 2 Let φ be the refinable distribution associated with p = { pk }k=N . Show 1 that supp(φ) ⊆ [N1 , N2 ].

Exercise 3 Let φ(x) = 21 χ[0,2) (x) be the function considered in Example 3. Show that G φ (ω) = 21 + 21 cos ω. Exercise 4 Let φ(x) = 13 χ[0,3) (x) be the refinable function considered in Example 2. Verify that it satisfies the refinement equation φ(x) = φ(2x) + φ(2x − 3). Then compute the representation matrix of the transition operator T p associated with the refinement mask. Next obtain the 1-eigenfunction v(ω) of T p . Finally, verify that v(ω) satisfies the necessary conditions in Theorem 3. Exercise 5 Let φ(x) = 13 χ[0,3) (x) be a refinable function. Show that its two-scale symbol is a QMF, and hence deduce from Corollary 1 that φ ∈ L 2 (R). Exercise 6 Let p(ω) = 21 (1 + e−i2ω ) be the symbol of φ(x) = 21 χ[0,2) (x), considered in Example 3. Find the eigenvalues of T p and 1-eigenfunctions of T p .

510

10 Wavelet Analysis

Exercise 7 Let p = { pk }3k=0 be a refinement mask with p0 = p3 =

1 3 , p1 = p2 = . 4 4

Find the 1-eigenfunctions of T p , and then apply Theorem 3 to decide whether or not the corresponding refinable distribution φ is a function in L 2 (R). (For reference, see Exercise 4 on p.444.) Exercise 8 Repeat Exercise 7 for p = { pk }3k=0 with p0 = p1 = p2 = p3 =

1 . 2

10.2 Stability and Orthogonality of Refinable Functions In Sect. 9.2 of Chap. 9, we have presented the characterizations of the properties of orthogonality and stability of the refinable function φ in terms of the Gramian function G φ (ω) of φ (see Theorem 3 on p.449 and Theorem 4 on p.449). These characterizations require the explicit expression of φ or its Fourier transform for the formulation and calculation of the Gramian function G φ (ω). However, in general, the existence of a refinable function φ is only in terms of some infinite product, as studied in the previous section, and there is no explicit expression of φ or its Fourier transform. Such examples include all compactly supported orthogonal scaling functions that are continuous, as studied in both Chaps. 8 and 9. In this situation, we must find some other way to characterize orthogonality and stability. In this section, we provide certain characterizations in terms of the refinement mask p and its transition operator T p . Firstly, let us recall the notion of Condition E on a matrix (or a linear operator on a finite-dimensional vector space), as introduced in Definition 1 of Sect. 9.3 of Chap. 9, on p.455. Condition E A matrix (or a linear operator T on a finite-dimensional space) is said to satisfy Condition E, to be denoted by T ∈ E, if 1 is a simple eigenvalue of T and any other eigenvalue λ of T satisfies |λ| < 1.  Theorem 1 Characterization for stability of refinable φ supported refinable function with refinement mask p = and only if the following two conditions are satisfied:

Let φ be a compactly

N . { pk }k=0

Then φ is stable if

(i) p(π) = 0; and (ii) T p ∈ E, and there exists a 1-eigenfunction v0 (ω) of T p that satisfies either v0 (ω) > 0, or v0 (ω) < 0, for all ω ∈ R. The proof of Theorem 1 will be provided at the end of this section.

10.2 Stability and Orthogonality of Refinable Functions

511

Example 1 Let φ be the refinable function with refinement mask p = { pk }2k=0 , where 3 1 p0 = , p1 = 1, p2 = . 4 4 Apply Theorem 1 to decide whether or not φ is stable. Solution The nonzero values of a j can be computed from the formula a j = N s=0 ps ps− j , namely: a0 =

13 3 , a1 = a−1 = 2, a2 = a−2 = . 8 16

Thus, we have the representation matrix ⎡

Tp =

1 2

 a2 j−k

−2≤ j,k≤2

3 ⎢26 1 ⎢ ⎢3 = 32 ⎢ ⎣0 0

0 16 16 0 0

0 3 26 3 0

0 0 16 16 0

⎤ 0 0⎥ ⎥ 3⎥ ⎥ 26⎦ 3

of the transition operator T p . To compute the 1-eigenvector of T p , we first observe that each of the first and last rows of the matrix T p has only one nonzero entry, so that the determinant of the matrix λI5 − T p of dimension 5 can be simplified to be



λ − 1 − 3 0

2 32

3

1

det(λI5 − T p ) = (λ − )2 − 21 λ − 13 16 − 2

32

3

0 − 32 λ − 21



λ − 1 λ − 1 λ − 1

3 2

1 − 21

= (λ − ) − 2 λ − 13 16 32

3 0 − 32 λ − 21

(by adding row-2 and row-3 to row-1)



λ − 1 0 0



3 1 0

= (λ − )2

− 21 λ − 13 16 + 2 32

3 0 − 32 λ − 21

(by subtracting col-1 from col-2 and col-3) = (λ −

3 2 5 1 ) (λ − 1)(λ − )(λ − ). 32 2 16

Thus, the eigenvalues of T p are given by

512

10 Wavelet Analysis

1,

1 5 3 3 , , , . 2 16 32 32

This ensures that T p ∈ E. In addition, it is easy to verify that the vector [0, 3, 16, 3, 0] is a 1-eigenvector of T p , so that v(ω) = 3eiω + 16 + 3e−iω = 16 + 6 cos ω is a 1-eigenfunction of T p . Clearly, v(ω) > 0 for any ω ∈ R. Furthermore, p has at least sum-rule order 1. Thus, by Theorem 1, we may conclude that φ is stable.  Next, we give a characterization on the orthogonality of compactly supported refinable functions. Theorem 2 Characterization of orthogonal refinable functions Let φ be a comN . Then φ is in pactly supported refinable function with refinement mask p = { pk }k=0 L 2 (R) and orthogonal, if and only if the following conditions on p and the transition operator T p are satisfied:

(i) p is a QMF; (ii) p(π) = 0; and (iii) T p ∈ E. Proof Suppose φ ∈ L 2 (R) is orthogonal. Then statements (ii) and (iii) follow from Theorem 1, since orthogonality implies stability of φ. Although it was already shown in Theorem 2 on p.453 that if φ is orthogonal, then the corresponding refinement mask p is a QMF, we re-write the proof by only considering the transition operator. Indeed, by the orthogonality of φ, it follows that G φ (ω) = 1 (see Theorem 3 on p.449). Thus, T p 1 = 1; that is, | p(ω/2)|2 + | p(ω/2 + π)|2 = 1, so that p is a QMF. Conversely, suppose that (i)–(iii) in Theorem 2 hold. Then, since p is a QMF, it follows from Corollary 1 on p.506 that φ ∈ L 2 (R). Hence, G φ is a 1-eigenfunction of T p . Also, since T p ∈ E and 1 is a 1-eigenfunction of T p , we may conclude that G φ (ω) = c0 · 1 = c0 , for some constant c0 . On the other hand, by Theorem 5  of the previous section on p.507, φ(2πk) = δk , which implies that G φ (0) = 1, so that c0 = 1. Hence, G φ (ω) = 1, which is equivalent to the statement that φ is orthogonal.  Remark 1 Let p be a finite QMF that has at least sum-rule order 1, and φ be the refinable function with refinement mask p. Then it follows from Theorem 2 that

10.2 Stability and Orthogonality of Refinable Functions

513

φ is orthogonal ⇐⇒ T p ∈ E.  Example 2 Let p = { pk }3k=0 with √ √ √ √ 1− 3 3− 3 3+ 3 1+ 3 p0 = , p1 = , p2 = , p3 = , 4 4 4 4 be the refinement mask for the scaling function φ, associated with the D4 wavelet, studied in Sect. 9.3 of the Chap. 9 (see (9.3.21) on p.460). Apply Theorem 2 to verify that φ is orthogonal. Solution Since p(ω) is a QMF and has sum-rule property of order ≥ 1, to prove that φ is orthogonal, it is sufficient to show that T p satisfies Condition E. By Remark 5 on p.443, the verification is in turn reduced to showing that T p = [ 21 a2 j−k ] ∈ E.  To compute a j from the Fourier series representation j a j e−i jω = 4| p(ω)|2 (see (9.1.8) on p.439), recall from the construction of the D4 filter on p.459, that ω ω 1 + 2 sin2 2 2

2

1 1 1 = 2 + eiω + e−iω 2 − eiω − e−iω 16 2 2

1 = 16 + 9eiω + 9e−iω − ei3ω − e−i3ω . 32

| p(ω)|2 = cos4

Thus, the nonzero ak are given by a0 = 2, a1 = a−1 =

9 1 , a3 = a−3 = − , 8 8

so that the representation matrix of the transition operator T p is given by ⎡

Tp =

1 2

 a2 j−k

−3≤ j,k≤3

−1 ⎢9 ⎢ ⎢9 1 ⎢ ⎢−1 = 16 ⎢ ⎢0 ⎢ ⎣0 0

0 0 16 0 0 0 0

0 −1 9 9 −1 0 0

0 0 0 16 0 0 0

0 0 −1 9 9 −1 0

0 0 0 0 16 0 0

⎤ 0 0⎥ ⎥ 0⎥ ⎥ −1⎥ ⎥. 9⎥ ⎥ 9⎦ −1

To compute the eigenvalues of T p , observe that each of the first and last rows, as well as the 4th column, has one nonzero entry. Thus, the desired determinant can be reduced to

514

10 Wavelet Analysis



1

λ 0 0

16

1

−1 λ − 9 1 0

16 16 det(λI7 − T p ) = (λ + )2 (λ − 1)

1 9

0 λ − −1 16



16 16 1

0 0 λ 16 1 1 1 1 = (λ + )2 (λ − 1)(λ − )(λ − )2 (λ − ), 16 2 4 8 where the calculation details for the last equality are omitted. Thus, the eigenvalues of T p are 1 1 1 1 1 1 1, , , , , − , − , 2 4 4 8 16 16 implying that T p ∈ E. Therefore, the scaling function φ is orthogonal. From T p , it is easy to verify that the vector [0, 0, 0, 1, 0, 0, 0] is an eigenvector of T p corresponding to the eigenvalue 1, so that the constant trigonometric polynomial v(ω) = 1 is a 1-eigenfunction of T p . Since G φ (ω) is also a 1-eigenfunction of T p , we may conclude that G φ (ω) = cv(ω) = c for some constant c. But from G φ (0) = 1, we have c = 1. Thus, G φ (ω) = 1 for all ω ∈ R. This is exactly the orthogonality condition on φ given in Theorem 3 on p.449.  We end this section by giving the proof of Theorem 1. Proof of Theorem 1 First, suppose that (i) and (ii) in Theorem 1 hold. Then since v0 (0) = 0, it follows from Theorem 3 that φ ∈ L 2 (R), so that G φ (ω) ∈ V2N +1 and T p G φ = G φ . Now, because 1 is a simple eigenvalue of T p , G φ (ω) = c0 v0 (ω) for some constant c0 = 0. Thus, since G φ (ω) ≥ 0 and v0 (ω) is never zero on R, we may conclude that G φ (ω) > 0 for all ω ∈ R. Therefore, by Theorem 4 on p.449, φ is stable. Conversely, assume that φ is stable. Then statement (i) follows from Corollary 3 in Sect. 10.1 (see p.508). In addition, the stability of φ implies that G φ ∈ V2N +1 and T p G φ = G φ and c ≤ G φ (ω) ≤ C, ω ∈ R, for some constants c, C > 0. Thus, v0 (ω) = G φ (ω) is a 1-eigenfunction of T p which is never zero. Therefore, to complete the proof of Theorem 1, it is sufficient to show that T p ∈ E. For any f, g ∈ V2N +1 , we have, by Theorem 3 in Sect. 9.1 on p.443, 

π

−π

 g(ω)(T pn f )(ω)dω =

−2n π

 = =

2n π

2n π

−2n π



g(ω)

n

| p(

ω 2 ω )| f ( n )dω 2j 2

| p(

ω 2 ω  ω ω )| f ( n ) |φ( n + 2kπ)|2 /G φ ( n )dω 2j 2 2 2

j=1

g(ω)

n j=1

2n π+2kπ2n

n n k∈Z −2 π+2kπ2

k∈Z

g(u)

n j=1

u u u u | p( j )|2 | φ( n )|2 f ( n )/G φ ( n )du 2 2 2 2

10.2 Stability and Orthogonality of Refinable Functions

 =

R

g(u)

| p(

j=1

 =

n

515

u 2 u 2 u u )| |φ( n )| f ( n )/G φ ( n )du 2j 2 2 2

u u g(u)| φ(u)|2 f ( n )/G φ ( n )du, 2 2 R

where the third equality is obtained by substituting u = ω + 2kπ2n and observing the 2π-periodicity of g, f, G φ ; while the last equality follows from the refinement  ω ). Thus,  equation for φ; that is, φ(ω) = p( ω2 )φ( 2 

π −π

 g(ω)(T pn f )(ω)dω =

R

2  g(ω)|φ(ω)| f(

ω ω )/G φ ( n )dω. 2n 2

(10.2.1)

Now let λ be an eigenvalue of T p with eigenfunction v(ω) ∈ V2N +1 . Then, by (10.2.1), we have  λ

n

π

−π

 v(ω)v(ω)dω =  =

π −π

R

v(ω)(T pn v)(ω)dω

2  v(ω)|φ(ω)| v(

ω ω )/G φ ( n )dω, 2n 2

where the integrand of the last integral satisfies

ω ω



2 2   v( n )/G φ ( n ) ≤ C1 |φ(ω)| , ω ∈ R,

v(ω)|φ(ω)| 2 2 for some positive constant C1 , since a trigonometric polynomial v(ω) is bounded and |1/G φ ( 2ωn )| ≤ 1c . Thus, by Lebesgue’s dominated convergence theorem, we have  1 ω ω 2  v(ω)|φ(ω)| v( n )/G φ ( n )dω 2 dω n→∞ 2 2 |v(ω)| R −π  1 2  = π v(ω)|φ(ω)| v(0)/G φ (0)dω < ∞. 2 dω |v(ω)| R −π

lim λ = lim  π n

n→∞

Therefore, limn→∞ λn exists, which implies that we have either λ = 1 or |λ| < 1. Hence, to complete the argument in showing T p ∈ E, we only need to show that 1 is a simple eigenvalue. First, we know that 1 is an eigenvalue of T p , since T p G φ (ω) = G φ (ω) and G φ (ω) = 0. Next, let v(ω) ∈ V2N +1 be an arbitrary 1-eigenfunction of T p , and let f (ω) = v(ω) − v(0)G φ (ω). Then (T p f )(ω) = f (ω), and f (0) = 0, since G φ (0) = 1. Hence, by (10.2.1) again,

516

10 Wavelet Analysis





π −π

| f (ω)|2 dω =

π

−π



f (ω)(T pn f )ω)dω

ω  ω 2  f (ω)|φ(ω)| f ( n ) /G φ ( n )dω 2 2 R 2  → f (ω)|φ(ω)| f (0)/G φ (0)dω = 0,

=

R

as n → ∞. Thus f = 0; that is, v(ω) = v(0)G φ (ω). Therefore, up to a constant, G φ (ω) is the unique 1-eigenfunction of T p . This means that the geometric multiplicity of the eigenvalue 1 is 1. Finally, let us show that 1 is nondegenerate. Assume to the contrary that it is not the case. Then there exists v(ω) ∈ V2N +1 such that T p v(ω) = G φ (ω)+v(ω). Again, let f (ω) = v(ω) − v(0)G φ (ω). Then f (0) = 0 and, by (10.2.1), we have 

π −π

 T pn f (ω)dω =

R



2  |φ(ω)| f(

R

ω ω )/G φ ( n )dω n 2 2

2  |φ(ω)| f (0)/G φ (0)dω = 0,

by allowing n → ∞. On the other hand, observe that T pn f (ω) = T pn v(ω) − v(0)G φ (ω) = nG φ (ω) + v(ω) − v(0)G φ (ω), so that 

π −π

 T pn

f (ω)dω =

π −π

nG φ (ω) + v(ω) − v(0)G φ (ω)dω → ∞, 

as n → ∞. This is a contradiction to the fact that

π

−π

T pn f (ω)dω → 0 as n → ∞.

Thus, 1 is indeed a simple eigenvalue of T p , and hence, T p ∈ E, as desired.



Exercises Exercise 1 Let p = { pk }2k=0 with p0 =

1 3 , p1 = 1, p2 = , 4 4

be a refinement mask. Apply Theorem 1 to decide whether or not the associated refinable function φ is stable. Exercise 2 Repeat Exercise 1 for p = { pk }2k=0 with

10.2 Stability and Orthogonality of Refinable Functions

p0 =

517

2 1 , p1 = 1, p2 = . 3 3

Exercise 3 Repeat Exercise 1 for p = { pk }3k=0 with p0 = p3 =

1 3 , p1 = p2 = . 4 4

(For reference, see Exercise 4 on p.444 and Exercise 7 on p.510.) Exercise 4 Repeat Exercise 1 for p = { pk }3k=0 with p0 = p1 = p2 = p3 =

1 . 2

(For reference, see Exercise 8 on p.510.) Exercise 5 Repeat Exercise 1 for p = { pk }3k=0 with p0 = 1, p1 =

1 1 , p2 = 0, p3 = . 2 2

Exercise 6 Let p = { pk }5k=0 be the refinement mask for the refinable function φ associated with the D6 wavelet. By following Example 2, compute the eigenvalues of the associated transition operator T p , and then conclude that φ is orthogonal.

10.3 Cascade Algorithms In general, compactly supported scaling functions (and their corresponding wavelets) are difficult to construct. In fact, with a few exceptions, such as Cardinal B-splines, they are only defined by the inverse Fourier transform of certain infinite products, as in (10.1.3) of Sect. 10.1, where the trigonometric polynomial p(ω) is the twoscale symbol of a given finite refinement mask p. Hence, such scaling functions do not have explicit expressions, and the plotting of their graphs can only depend on the sequences p. For convenience, we will assume, without loss of generality, in N is [0, N ], meaning that both p0 and this section that the support of p = { pk }k=0 p N are nonzero. It was proved in Sect. 10.1 that if the refinement equation has a solution φ which is a function, then the support of this function is within [0, N ] (see Theorem 2 in Sect. 10.1). Hence, to plot the graph y = φ(x), only the values of φ(x), for 0 ≤ x ≤ N , are needed. Probably the most obvious way to render this graph is to apply the refinement equation directly to compute the values of φ(x) at the dyadic points x = k/2 j for k = 0, . . . , N 2 j . To do so, the first step is to compute the values of φ(x) at the integers; namely, the components of the vector

518

10 Wavelet Analysis

y0 = [φ(0), φ(1), . . . , φ(N )]T . It is easy to show that this vector y0 is a 1-eigenvector of the (N + 1)-dimensional square matrix   , (10.3.1) P N +1 = p2 j−k 0≤ j,k≤N

with the ( j, k)th entry being p2 j−k , where it is understood that pk = 0 for k < 0 or k > N . If the values φ(0), φ(1/2 j−1 ), . . . , φ(2 j−1 N /2 j−1 ) have been computed, then the values φ(m/2 j ) can be found by computing the sum φ(m/2 j ) =

N 

pk φ(m/2 j−1 − k) =

k=0

N 

pk φ((m − 2 j−1 k)/2 j−1 ),

(10.3.2)

k=0

for each m = 1, . . . , 2 j N −1. To find the values of φ(x) at x = m/2n for any desired positive integer n, this computational process is repeated iteratively, for j = 1, . . . , n (see Exercises 1, 2, and 3). Of course, this process requires that the refinement mask p must be so chosen that 1 is an eigenvalue of the matrix P N +1 in (10.3.1). In this regard, we note that when p has at least sum-rule order 1 (this is a minimal condition for the existence of a scaling function), p satisfies 

p2k =

k



p2k+1 = 1,

k

which implies that the 1 × (N + 1) row vector v0 = [1, 1, . . . , 1] satisfies v0 P N +1 = v0 . Thus, in this case, 1 is indeed an eigenvalue of P N +1 . If 1 is a simple eigenvalue of P N +1 , then y0 = [φ(0), φ(1), . . . , φ(N )]T is determined as the unique (up to a constant) (right) 1-eigenvector of P N +1 . In this section, we will introduce and study another approach to render φ(x), but again by using the refinement mask p. Instead of evaluating φ on the dense set of dyadic points in the interval (0, N ), we approximate φ(x) for all x ∈ [0, N ]. The motivation to this approach is to observe that if a sequence of functions φn converges to φ, then φn−1 also converges to φ, as n tends to infinity. Hence, if we consider the functional equations N  φn (x) = pk φn−1 (2x − k), k=0

for each n = 1, 2, . . . , the limit function φ(x) satisfies the refinement equation, and hence is the refinable function we wish to find. This is also an iterative scheme, but with some initial function φ0 , as opposed to using the components of the 1-eigenvector y0 of the matrix P N +1 as the initial values in computing the values in

10.3 Cascade Algorithms

519

(10.3.2) iteratively. To realize this approximation approach rigorously, we introduce the refinement operator Q p , defined by Q p f (x) =



pk f (2x − k),

f ∈ L 2 (R),

(10.3.3)

k∈Z

and for n = 1, 2, . . ., set

Q n+1 = Q n Q, Q 1 = Q,

so that for any initial function φ0 ∈ L 2 (R), we may also use the notation: φn = Q np φ0 = Q n−1 p (Q p φ0 ) . N In the above discussion and throughout this section, the refinement mask p = { pk }k=0 N is always assumed to satisfy k=0 pk = 2, with nonzero p0 and p N . The iterative process of computing φ1 , φ2 , . . ., with a compactly supported L 2 (R) function φ0 as initial function, is called the cascade algorithm. We remark that we have abused the use of notations: while the notation φn = Q np φ0 is used in this section to denote the result obtained after n iterative steps of the cascade algorithm, it was used in other sections to denote φ(x − n), which is the shift of φ(x) by n. Returning to the notation for the cascade algorithm, we observe that n ω  ω n (ω) = p( j ) φ φ 0 ( n ). 2 2 j=1

0 (0) = 1, then φ n (ω) Thus, according to Theorem 1 of Sect. 10.1, on p.502, if φ  converges pointwise to φ(ω) for all ω ∈ R. Example 1 Let p = { pk }2k=0 be the refinement mask of the hat function, with p0 = p2 = 21 , p1 = 1. With φ0 (x) = χ[0,1) (x), we have 1 1 φ1 (x) = (Q p φ0 )(x) = φ0 (2x) + φ0 (2x − 1) + φ0 (2x − 2) 2 2 ⎧1 , for 0 ≤ x < 0.5, ⎪ ⎪ ⎨2 1, for 0.5 ≤ x < 1, = 1 ⎪ , for 0 ≤ x < 1.5, ⎪ ⎩2 0, elsewhere. Then, from φ1 , we obtain φ2 , as given by

520

10 Wavelet Analysis

1

1

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0

0 −0.2

−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

1 0.8 0.6 0.4 0.2 0 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

Fig. 10.1. φ1 = Q p φ0 , φ2 = Q 2p φ0 , φ3 = Q 3p φ0 obtained by cascade algorithm with refinement mask of hat function and φ0 = χ[0,1) (x)

1 1 φ2 (x) = Q p φ1 (x) = φ1 (2x) + φ1 (2x − 1) + φ1 (2x − 2) 2 2 ⎧ 1 ⎪ ⎪ 4 , for x ∈ [0, 0.25) ∪ [1.5, 1.75), ⎪ ⎪ ⎪ 1 ⎪ ⎨ 2 , for x ∈ [0.25, 0.5) ∪ [1.25, 1.5), = 3 , for x ∈ [0.5, 0.75) ∪ [1, 1.25), 4 ⎪ ⎪ ⎪ ⎪ 1, for x ∈ [0.75, 1), ⎪ ⎪ ⎩ 0, elsewhere. In this way, we can compute φn for n ≥ 3. The graphs of φ1 , φ2 , φ3 are displayed in Fig. 10.1. From the graphical results, it appears that the sequence φn converges in  L 2 (R) to the hat function φ. Example 2 Let p = { pk }2k=0 with p0 = p2 = 1, p1 = 0 be the refinement mask of φ(x) = 21 χ[0,2) (x), considered in Example 1 on p.505 and Example 3 on p.509. By applying the cascade algorithm associated with this mask with initial function φ0 (x) = χ[0,1) (x), we obtain φ1 (x) = Q p φ0 (x) = φ0 (2x) + 0φ0 (2x − 1) + φ0 (2x − 2) =

1, for x ∈ [0, 0.5) ∪ [1, 1.5), 0, elsewhere.

10.3 Cascade Algorithms

521

1

1

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0

0

−0.2

−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

1 0.8 0.6 0.4 0.2 0 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

Fig. 10.2. φ1 = Q p φ0 , φ2 = Q 2p φ0 , φ3 = Q 3p φ0 obtained by cascade algorithm with refinement mask { 21 , 0, 21 } and φ0 = χ[0,1) (x)

Then, from φ1 , we obtain φ2 , given by φ2 (x) = Q p φ1 (x) = φ1 (2x) + 0φ1 (2x − 1) + φ1 (2x − 2) =

1, for x ∈ [0, 0.25) ∪ [0.5, 0.75) ∪ [1, 1.25) ∪ [1.5, 1.75), 0, elsewhere;

and from φ2 , we obtain φ3 , φ4 , . . .. In this way, we can compute φn for n ≥ 3. The graphs of φ1 , φ2 , φ3 are displayed in Fig. 10.2. However, from the graphical results, it appears that φn does not converge to φ in L 2 (R).  From the above two examples, we see that it is imperative to derive the mathematical theory to assure the convergence of the cascade algorithm. In Theorem 3, to be proved at the end of this section, we will see that the two sufficient conditions for the convergence of the cascade algorithm are: firstly, the finite refinement mask p has the sum-rule property at least of the first order; and secondly, the transition operator T p associated with p satisfies Condition E. Furthermore, from the two examples that will be presented right after the statement of Theorem 3, we will see that neither of these conditions can be dropped to ensure convergence of the cascade algorithm. Before going into the discussion of the convergence of the cascade algorithm, we need some preliminary results. First let us demonstrate, in the following example, that for the cascade algorithm, with a compactly supported initial function φ0 , such that

522

10 Wavelet Analysis

supp(φ0 ) ⊆ [0, N ], then for each n = 1, 2, . . . , we also have supp(φn ) ⊆ [0, N ], but the support generally increases for increasing values of n. Note that the refinable function φ of the refinement equation defined by p, being the limit of the sequence φn , will then have support, supp(φ) ⊆ [0, N ], as assured by Theorem 2 of Sect. 10.1. N with non-zero p0 , p N , and let φ0 ∈ L 2 (R), with Example 3 Let p = { pk }k=0 supp(φ0 ) ⊆ [0, N − δ], where 0 < δ ≤ 1, be an initial function for the cascade algorithm, then

1 1 supp(φn ) ⊆ [0, 1 − ( )n N ] + ( )n supp(φ0 ). 2 2

(10.3.4)

This will be derived in the proof of Theorem 1 below. Hence, since supp(φ0 ) ⊆ [0, N − δ], we have δ supp(φn ) ⊆ [0, N − n ].  2 Observe that if we choose an arbitrary compactly supported function φ0 ∈ L 2 (R) as an initial function for the cascade algorithm, then the support of φn may not be a subset of the interval [0, N ], although the support of the limit function φ is (assuming that the cascade algorithm converges). For fast convergence, it is advisable to select an initial function φ0 with a small support. Next, recall that a compactly supported function f (x) is said to have the property of partition of unity (PU) if it satisfies 

f (x − ) = 1, x ∈ R.

∈Z

In the following Theorem 1, we will also prove that the PU is preserved by the cascade algorithm, provided that the refinement mask p has the sum-rule property of at least the first order, namely, p(π) = 0. Theorem 1 Support and PU of φn = Q np φ0

Let φn = Q np φ0 , where φ0 is any

compactly supported function in L 2 (R). Then: (i) There exists an n 0 ≥ 1 such that supp(φn ) ⊆ [− 21 , N + 21 ] for all n ≥ n 0 . (ii) In addition, if p has at least sum-rule order 1; that is, p(π) = 0, and if the initial function φ0 (x) has PU, then each of φn (x), n = 1, 2, . . . , has PU. Proof To prove (i), observe that it follows from the definition of Q p , given by (10.3.3), that

10.3 Cascade Algorithms

523

supp(φ1 ) = supp(Q p φ0 ) ⊆ =

k∈[0,N ]

=

1





supp φ0 (2 · −k)

k∈[0,N ]



1

supp φ0 ) + k ⊆ supp(φ0 ) + [0, N ] 2 2

1 1 [0, N ] + supp(φ0 ). 2 2

In general, we have supp(φn ) ⊆

1 1 1 1 [0, N ] + 2 [0, N ] + · · · + n [0, N ] + n supp(φ0 ). 2 2 2 2

Since φ0 is compactly supported, an integer n 0 can be chosen sufficiently large so that 21n supp(φ0 ) ⊆ [− 21 , 21 ] for all n ≥ n 0 . Thus, we have supp(φn ) ⊆

∞  1 1 1 1 1 [0, N ] + [− , ] = [− , N + ], 2j 2 2 2 2 j=1

for n ≥ n 0 , as desired. 0 (0) = 1. For an integer 1 (0) = p(0)φ To prove (ii), we start by observing that φ k = 0, 0 (πk) = 1 (2πk) = p(πk)φ φ

0 (2π), for k = 2, p(0)φ 0 (2π + π), for k = 2 + 1, p(π)φ

= 0, 0 (2π) = 0 for  = 0. Thus φ1 has PU. By repeating this since p(π) = 0 and φ  argument, we may deduce that φ2 , φ3 , . . ., all have PU. From Theorem 5 in Sect. 10.1 on p.507, we also know that if a refinable function φ is in L 2 (R), then φ has PU. Next, we show that if the sequence {Q np φ0 }n converges in L 2 (R), then φ0 must have PU. Theorem 2 Convergence of {Q np φ0 }∞ n=1 ⇒ PU for initial φ0 If, for a compactly 0 (0) = 1, the cascade sequence φn = Q n φ0 supported initial function φ0 with φ p converges in L 2 (R), then the initial function φ0 must have the PU. n also converges Proof Since the sequence φn converges in L 2 (R), the sequence φ in L 2 (R) by Parseval’s theorem for the Fourier transform (see Theorem 1 on p.330). on R. Thus, φ n converges in L 2 (R) to n converges pointwise to φ On the other hand, φ  Hence, the sequence φn converges in L 2 (R) to φ. In addition, by (i) of Theorem 1, φ. there exists an n 0 such that supp(φn ) ⊆ [− 21 , N + 21 ] for n ≥ n 0 . Thus, for n ≥ n 0 ,

524

10 Wavelet Analysis

n (ω) − φ(ω)|  |φ ≤



∞ −∞



φn (x) − φ(x) d x =



N + 21

− 21



φn (x) − φ(x) d x

1 2

≤ (N + 1) φn − φ2 → 0, as n → ∞,  uniformly on R. n converges to φ for any ω ∈ R. This means that φ For any integer k = 0, observe that n−1 (2n−1 2πk) = p(0) φ n−1 (2n−1 2πk) n (2n 2πk) = p(2n−1 2πk) φ φ n−1 n−1 (2 2πk) = · · · = φ 0 (2πk), =φ so that

n (2n 2πk) = lim φ(2  n 2πk) = 0, 0 (2πk) = lim φ φ n→∞

n→∞

where the last equality holds by applying the Riemann-Lebesgue lemma. This shows φ0 has the PU.  Based on Theorem 2, we introduce the following definition of the convergence of the cascade algorithm. Definition 1 Convergence of cascade algorithm The cascade algorithm associN is said to be convergent in L (R), if the ated with a refinement mask p = { pk }k=0 2 n sequence φn = {Q p φ0 } converges in L 2 (R), for any compactly supported initial function φ0 ∈ L 2 (R) that has the PU.

The main theorem of this section is the following. Theorem 3 Characterization of convergence of cascade algorithm cade algorithm associated with p = if

N { pk }k=0

The cas-

converges in L 2 (R) to φ if and only

(i) p has at least sum-rule order 1; and (ii) T p ∈ E. The proof of Theorem 3 will be given at the end of this section. Let us first point out that, by Theorem 3, we see that the cascade algorithm associated with the refinement mask p = { pk }2k=0 of the hat function, as considered in Example 1, converges in L 2 (R) to the hat function, since p has sum-rule order 2 and T p satisfies Condition E (see Example 2 on p.443). Example 4 Let p = { pk }2k=0 with p0 = p2 = 1, p1 = 0 be the refinement mask of φ(x) = 21 χ[0,2) (x) considered in Example 2. Discuss the convergence of the associated cascade algorithm. Solution The two-scale symbol is given by

10.3 Cascade Algorithms

525

p(ω) =

1 (1 + e−i2ω ). 2

Since p(ω) does not have sum-rule property of any order, we immediately know, from Theorem 3, that the associated cascade algorithm could not converge in L 2 (R). In the following, let us consider this problem in more depth. Firstly, let us consider whether T p ∈ E. From Example 1 of Sect. 10.1, on p.505, we have



λ − 1 0

2



1 1 |λI5 − T p | = (λ − )2

0 λ − 1 0

= (λ − 1)(λ − )2 λ2 . 2

2 1 0 − 2 λ

Thus, the eigenvalues of T p are given by 1 1 1, , , 0, 0; 2 2 from which we may conclude that T p satisfies Condition E. In the following, we show directly that the cascade algorithm does not converge. Observe that if φn is supported on [0, 2], then



supp φn (2·) ⊆ [0, 1], supp φn (2 · −2) ⊆ [1, 2]; and φn+1 , obtained by the cascade algorithm, given by φn+1 (x) = φn (2x) + φn (2x − 2), on [0, 2], supp φn ) ⊆ [0, 2] for is also supported on [0, 2]. Thus, for φ0 supported

n ≥ 1, and supp φn (2·) ∩ supp φn (2 · −2) ⊆ {1}. Hence we have 



−∞

 ∞





φn+1 (x) 2 d x +

φn+1 (x) 2 d x −∞ 1  ∞  ∞

2





φn (2x) d x +

φn (2x − 2) 2 d x = −∞ −∞  ∞  

2

2



1 ∞

1 ∞

φn (x) 2 d x φn (x) d x + φn (x) d x = = 2 −∞ 2 −∞ −∞  ∞

2

φ0 (x) d x. = ··· =



φn+1 (x) 2 d x =



1

−∞

In particular, by selecting φ0 (x) = χ[0,1) (x) as the initial function, we have lim φn 2 = lim φ0 2 = 1.

n→∞

n→∞

(10.3.5)

526

10 Wavelet Analysis

On the other hand, if the cascade algorithm associated with p converges in L 2 (R), then √ 2 as n → ∞, φn 2 → φ2 = 2 and this contradicts (10.3.5), showing that φn does not converge in L 2 (R) to φ.  The above example demonstrates that the sum-rule property is a necessary condition for the validity of Theorem 3. In the next example, we show that Condition E is also necessary. Example 5 Let φ(x) = 13 χ[0,3) (x) be the refinable function considered in Example 4 on p.450 and in Example 2 on p.508. Discuss the convergence of the associated cascade algorithm. Solution The two-scale symbol is given by p(ω) =

1 (1 + e−i3ω ). 2

The entries a j of T p can be obtained from 

a j e−i jω = 4| p(ω)|2 = 2 + e−i3ω + ei3ω ,

j

with the nonzero values given by a0 = 2, a3 = a−3 = 1, ⎡

and

Tp =

1 2

a2 j−k

 −3≤ j,k≤3

1 ⎢0 ⎢ ⎢0 1⎢ 1 = ⎢ 2⎢ ⎢0 ⎢ ⎣0 0

0 0 2 0 0 0 0

0 1 0 0 1 0 0

0 0 0 2 0 0 0

0 0 1 0 0 1 0

0 0 0 0 2 0 0

⎤ 0 0⎥ ⎥ 0⎥ ⎥ 1⎥ ⎥. 0⎥ ⎥ 0⎦ 1

Also, it can be shown that 1 1 |λI7 − T p | = (λ − 1)2 (λ + 1)(λ − )3 (λ + ). 2 2 Thus, the eigenvalues of T p are given by 1 1 1 1 1, 1, −1, , , , − . 2 2 2 2


Hence, $T_p$ does not satisfy Condition E, and Theorem 3 does not apply to conclude the convergence of the cascade algorithm. As in Example 4, let us demonstrate the divergence of the cascade algorithm directly. First, notice that if $\operatorname{supp}(\phi_n) \subseteq [0,3]$, then
$$\operatorname{supp}\phi_n(2\cdot) \subseteq \Bigl[0,\frac32\Bigr], \qquad \operatorname{supp}\phi_n(2\cdot{-}3) \subseteq \Bigl[\frac32,3\Bigr],$$
and hence $\phi_{n+1}$, obtained by the cascade algorithm, given by
$$\phi_{n+1}(x) = \phi_n(2x) + \phi_n(2x-3),$$
is also supported on $[0,3]$. Thus, for $\phi_0$ supported on $[0,3]$, we have
$$
\begin{aligned}
\int_{-\infty}^{\infty}\bigl|\phi_{n+1}(x)\bigr|^2\,dx
&= \int_{-\infty}^{\infty}\bigl|\phi_n(2x)\bigr|^2\,dx + \int_{-\infty}^{\infty}\bigl|\phi_n(2x-3)\bigr|^2\,dx\\
&\qquad\Bigl(\text{by the fact that }\operatorname{supp}\phi_n(2\cdot)\cap\operatorname{supp}\phi_n(2\cdot{-}3)\subseteq\Bigl\{\frac32\Bigr\}\Bigr)\\
&= \frac12\int_{-\infty}^{\infty}\bigl|\phi_n(x)\bigr|^2\,dx + \frac12\int_{-\infty}^{\infty}\bigl|\phi_n(x)\bigr|^2\,dx
 = \int_{-\infty}^{\infty}\bigl|\phi_n(x)\bigr|^2\,dx\\
&= \cdots = \int_{-\infty}^{\infty}\bigl|\phi_0(x)\bigr|^2\,dx.
\end{aligned}
$$

If the cascade algorithm is convergent, then $\|\phi_n\|_2 \to \|\phi\|_2$ as $n\to\infty$. However, for $\phi_0(x) = \chi_{[0,1)}(x)$ or $\phi_0(x) = \min\{x,\,2-x\}\chi_{[0,2)}(x)$ (the hat function, where the verification of its PU is left as an exercise),
$$\lim_{n\to\infty}\|\phi_n\|_2 = \lim_{n\to\infty}\|\phi_0\|_2 \ne \frac{\sqrt3}{3} = \|\phi\|_2,$$
a contradiction. This shows that $\phi_n$ does not converge in $L_2(\mathbb R)$ to $\phi$. In Fig. 10.3, we present $\phi_1$ and $\phi_4$, with $\phi_0$ being the hat function. $\blacksquare$

From the above two examples, we see that both conditions (i) and (ii) in Theorem 3 are necessary for the convergence of the cascade algorithm. By Theorem 1, if a refinable function $\phi$ is stable, its associated mask $p$ must have at least sum-rule order 1, and $T_p \in E$. Thus Theorem 3 leads to the following corollary.

Corollary 1 (Stability $\Rightarrow$ convergence of cascade algorithm) If a refinable function $\phi$ is stable (i.e., $\phi$ is a scaling function), then the cascade algorithm associated with its refinement mask converges.
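Both divergence arguments above rest on the identity $\|\phi_{n+1}\|_2 = \|\phi_n\|_2$, and this is easy to watch numerically with the `cascade` sketch given after Definition 1. The snippet below is only illustrative: it reuses that sketch and uses the discrete $\ell_2$ norm on the dyadic grid, scaled by $\sqrt h$, as a stand-in for $\|\cdot\|_2$.

```python
h = 2.0 ** (-10)
for mask in ([1.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]):      # Examples 4 and 5
    x, phi = cascade(mask, lambda t: 1.0 if 0 <= t < 1 else 0.0,
                     n_iter=6, levels=10)
    # stays near 1 = ||phi_0||_2, not near ||phi||_2, for both masks
    print(np.sqrt(h * np.sum(phi ** 2)))
```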

Fig. 10.3. $\phi_1 = Q_p\phi_0$ and $\phi_4 = Q_p^4\phi_0$, obtained by the cascade algorithm with the refinement mask of $\phi = \frac13\chi_{[0,3)}$ and with $\phi_0$ being the hat function
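The eigenvalue computations in Examples 4 and 5, and the Condition E checks used with Corollary 1 below, are easy to reproduce numerically. The following sketch is not from the book; it assumes only what is used in the displayed matrices above, namely that $T_p$ is represented by $\frac12(a_{2j-k})$ with $\sum_j a_j e^{-ij\omega} = 4|p(\omega)|^2$, so that the $a_j$ are the autocorrelation coefficients of the mask.

```python
import numpy as np

def transition_matrix(p):
    """Matrix (1/2)(a_{2j-k})_{-N<=j,k<=N} of the transition operator T_p,
    where sum_j a_j e^{-ij w} = 4|p(w)|^2 and p_0, ..., p_N is the mask."""
    p = np.asarray(p, dtype=float)
    N = len(p) - 1
    a = np.correlate(p, p, mode="full")          # a_j for j = -N, ..., N
    def a_at(j):
        return a[j + N] if -N <= j <= N else 0.0
    return np.array([[0.5 * a_at(2 * j - k) for k in range(-N, N + 1)]
                     for j in range(-N, N + 1)])

def satisfies_condition_E(T, tol=1e-9):
    """Condition E: 1 is a simple eigenvalue and all others have modulus < 1."""
    ev = np.linalg.eigvals(T)
    near_one = np.abs(ev - 1.0) < tol
    others = ev[~near_one]
    return near_one.sum() == 1 and bool(np.all(np.abs(others) < 1.0 - tol))

# Example 4: mask {1, 0, 1}; eigenvalues 1, 1/2, 1/2, 0, 0 (Condition E holds)
print(sorted(np.linalg.eigvals(transition_matrix([1.0, 0.0, 1.0])).real))
# Example 5: mask {1, 0, 0, 1}; 1 is a double eigenvalue and -1 also occurs
print(satisfies_condition_E(transition_matrix([1.0, 0.0, 0.0, 1.0])))  # False
```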

Example 6 Let $p$ be the D4 filter with sum-rule order 2. Since the refinable function $\phi$ is orthogonal, it is stable. Thus, by Corollary 1, the cascade algorithm associated with $p$ is convergent. Indeed, from Example 2 on p.509, $T_p$ satisfies Condition E, and thus we have the same conclusion on the convergence of the associated cascade algorithm. In particular, if we choose $\phi_0(x) = \min\{x,\,2-x\}\chi_{[0,2)}(x)$ (the hat function) as the initial function, then $\phi_n$, $n = 1,2,\ldots$, provide a sequence of piecewise linear polynomials that approximate $\phi$. The graphs of $\phi_n$ with $n = 1,\ldots,4$ are shown in Fig. 10.4. Efficient schemes, based on the cascade algorithm, can be used to draw (i.e. to render) the graphs of $\phi$ and the associated wavelet $\psi$, as well as graphs of functions written as finite linear combinations of their integer translates. $\blacksquare$

Next, we apply the characterization of the convergence of the cascade algorithm to prove Theorem 5 on p.471 on the biorthogonality of two scaling functions.

Proof of Theorem 5 of Sect. 9.4 on p.471 Suppose that $\phi$ and $\widetilde\phi$ are biorthogonal to each other. Then item (i) in Theorem 5 has been proved in Theorem 4 on p.470, and items (ii) and (iii) follow from Theorem 1 of Sect. 10.2 on p.510, since the biorthogonality of $\phi$ and $\widetilde\phi$ immediately implies the stability of $\phi$ and $\widetilde\phi$, respectively (see Corollary 1 on p.467).

Conversely, suppose (i)–(iii) in Theorem 5 hold. By Theorem 3, the cascade algorithms associated with $p$ and with $\widetilde p$ converge in $L_2(\mathbb R)$. In particular, with $\phi_0(x) = \widetilde\phi_0(x) = \chi_{[0,1)}(x)$ as initial functions for both $\phi_n$ and $\widetilde\phi_n$, that is, by considering
$$\phi_n = (Q_p)^n\phi_0, \qquad \widetilde\phi_n = (Q_{\widetilde p})^n\widetilde\phi_0, \qquad n = 1,2,\ldots,$$
$\phi_n$ converges to $\phi$ in $L_2(\mathbb R)$, and $\widetilde\phi_n$ converges to $\widetilde\phi$ in $L_2(\mathbb R)$. Thus, the refinable functions $\phi$ and $\widetilde\phi$ are in $L_2(\mathbb R)$.

To show that $\phi$ and $\widetilde\phi$ are biorthogonal, we next show that $G_{\widetilde\phi_n,\phi_n}(\omega) = 1$, $\omega\in[-\pi,\pi]$, for $n = 0,1,\ldots$, by mathematical induction. Indeed, since $G_{\widetilde\phi_0,\phi_0}(\omega) = G_{\phi_0}(\omega) = 1$ (in view of the orthogonality of $\phi_0$), we may use the induction hypothesis $G_{\widetilde\phi_n,\phi_n}(\omega) = 1$.

Fig. 10.4. Piecewise linear polynomials $\phi_n = Q_p^n\phi_0$, $n = 1,\ldots,4$, approximating the D4 scaling function $\phi$ by the cascade algorithm

Now, by applying this assumption and the refinement relations in the calculation below, we arrive at
$$
\begin{aligned}
G_{\widetilde\phi_{n+1},\phi_{n+1}}(\omega)
&= \sum_{\ell\in\mathbb Z}\widehat{\widetilde\phi}_{n+1}(\omega+2\pi\ell)\,\overline{\widehat\phi_{n+1}(\omega+2\pi\ell)}\\
&= \sum_{\ell\in\mathbb Z}\widetilde p\Bigl(\frac\omega2+\pi\ell\Bigr)\widehat{\widetilde\phi}_n\Bigl(\frac\omega2+\pi\ell\Bigr)\,
   \overline{p\Bigl(\frac\omega2+\pi\ell\Bigr)\widehat\phi_n\Bigl(\frac\omega2+\pi\ell\Bigr)}\\
&= \sum_{k\in\mathbb Z}\widetilde p\Bigl(\frac\omega2+2\pi k\Bigr)\widehat{\widetilde\phi}_n\Bigl(\frac\omega2+2\pi k\Bigr)\,
   \overline{p\Bigl(\frac\omega2+2\pi k\Bigr)\widehat\phi_n\Bigl(\frac\omega2+2\pi k\Bigr)}\\
&\qquad+\sum_{k\in\mathbb Z}\widetilde p\Bigl(\frac\omega2+2\pi k+\pi\Bigr)\widehat{\widetilde\phi}_n\Bigl(\frac\omega2+2\pi k+\pi\Bigr)\,
   \overline{p\Bigl(\frac\omega2+2\pi k+\pi\Bigr)\widehat\phi_n\Bigl(\frac\omega2+2\pi k+\pi\Bigr)}\\
&= \widetilde p\Bigl(\frac\omega2\Bigr)\overline{p\Bigl(\frac\omega2\Bigr)}\,G_{\widetilde\phi_n,\phi_n}\Bigl(\frac\omega2\Bigr)
 + \widetilde p\Bigl(\frac\omega2+\pi\Bigr)\overline{p\Bigl(\frac\omega2+\pi\Bigr)}\,G_{\widetilde\phi_n,\phi_n}\Bigl(\frac\omega2+\pi\Bigr)\\
&= \widetilde p\Bigl(\frac\omega2\Bigr)\overline{p\Bigl(\frac\omega2\Bigr)}
 + \widetilde p\Bigl(\frac\omega2+\pi\Bigr)\overline{p\Bigl(\frac\omega2+\pi\Bigr)} = 1.
\end{aligned}
$$

This shows that $G_{\widetilde\phi_n,\phi_n}(\omega) = 1$, $\omega\in[-\pi,\pi]$, for all $n\ge 0$; that is, $\widetilde\phi_n$ and $\phi_n$ are biorthogonal to each other, namely:
$$\int_{-\infty}^{\infty}\widetilde\phi_n(x)\,\phi_n(x-k)\,dx = \delta_k, \qquad k\in\mathbb Z.$$


Since $\phi_n$ and $\widetilde\phi_n$ converge in $L_2(\mathbb R)$ to $\phi$ and $\widetilde\phi$, respectively, the limit of the left-hand side of the above equation, as $n\to\infty$, is
$$\int_{-\infty}^{\infty}\widetilde\phi(x)\,\phi(x-k)\,dx \qquad(\text{see Exercise 8}).$$
Thus we have
$$\int_{-\infty}^{\infty}\widetilde\phi(x)\,\phi(x-k)\,dx = \delta_k, \qquad k\in\mathbb Z.$$
This shows that $\phi$ and $\widetilde\phi$ are biorthogonal to each other, as desired. $\blacksquare$

Example 7 Let $p$ and $\widetilde p$ be the 5/3-tap biorthogonal lowpass filters considered in Example 1 on p.475, with two-scale symbols:
$$
\begin{aligned}
p(\omega) &= \frac14\bigl(1 + e^{-i\omega}\bigr)^2,\\
\widetilde p(\omega) &= -\frac18 e^{i\omega} + \frac14 + \frac34 e^{-i\omega} + \frac14 e^{-i2\omega} - \frac18 e^{-i3\omega}.
\end{aligned}
$$
Then it is not difficult to calculate the nonzero $a_j$, $\widetilde a_j$, which are
$$a_0 = \frac32, \qquad a_1 = a_{-1} = 1, \qquad a_2 = a_{-2} = \frac14,$$
and
$$\widetilde a_0 = \frac{23}{8}, \qquad \widetilde a_1 = \widetilde a_{-1} = \frac54, \qquad \widetilde a_2 = \widetilde a_{-2} = -\frac12, \qquad
\widetilde a_3 = \widetilde a_{-3} = -\frac14, \qquad \widetilde a_4 = \widetilde a_{-4} = \frac1{16}.$$
Then we can compute the eigenvalues of $T_p = \frac12\bigl(a_{2j-k}\bigr)_{-2\le j,k\le 2}$, which are given by
$$1,\ \frac12,\ \frac14,\ \frac18,\ \frac18.$$
The eigenvalues of $T_{\widetilde p} = \frac12\bigl(\widetilde a_{2j-k}\bigr)_{-4\le j,k\le 4}$ can be calculated by using any computational software. For example, numerical results generated from Matlab are:
$$1,\ \frac12,\ \frac14,\ \frac18,\ 0.54279,\ -\frac14,\ -0.23029,\ \frac1{32},\ \frac1{32}.$$
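These values are straightforward to reproduce with the `transition_matrix` and `satisfies_condition_E` sketch given after Fig. 10.3. The only assumption below is that the masks $\{p_k\}$ and $\{\widetilde p_k\}$ are read off from the two-scale symbols displayed above, recalling the convention $p(\omega) = \frac12\sum_k p_k e^{-ik\omega}$ used in this section.

```python
p_mask  = [0.5, 1.0, 0.5]                   # from p(w)  = (1/4)(1 + e^{-iw})^2
pt_mask = [-0.25, 0.5, 1.5, 0.5, -0.25]     # coefficients of p~(w), times 2

print(np.round(np.sort(np.linalg.eigvals(transition_matrix(p_mask)).real), 5))
#   approximately [0.125, 0.125, 0.25, 0.5, 1.0], as listed above
print(satisfies_condition_E(transition_matrix(pt_mask)))   # True
```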

Thus, both $T_p$ and $T_{\widetilde p}$ satisfy Condition E. In addition, $p$ and $\widetilde p$ have at least sum-rule order 1. Therefore, by Theorem 5 on p.471, we may conclude that the associated refinable functions $\phi$ and $\widetilde\phi$ are biorthogonal to each other. $\blacksquare$

Example 8 Let $p$ and $\widetilde p$ be the biorthogonal lowpass filters given in (9.4.26) in Example 2 of Sect. 9.2 on p.447. The largest eigenvalue of
$$T_{\widetilde p} = \frac12\bigl(\widetilde a_{2j-k}\bigr)_{-4\le j,k\le 4}$$
is 3.1378. Thus $T_{\widetilde p}$ does not satisfy Condition E, and hence the associated refinable functions $\phi$ and $\widetilde\phi$ are not biorthogonal to each other. $\blacksquare$

We end this section by giving the proof of the main theorem of the section, Theorem 3.

Proof of Theorem 3 Suppose the cascade algorithm associated with $p$ converges in $L_2(\mathbb R)$. To prove item (i) in Theorem 3, since we only consider finite sequences that sum to 2, we already have $p(0) = 1$; that is, it is sufficient to show $p(\pi) = 0$. Choose the initial function $\phi_0(x) = \chi_{[0,1)}(x)$. Then $\phi_0$ has PU, and hence the sequence $\{\phi_n\}$, with $\phi_n = Q_p^n\phi_0$, converges in $L_2(\mathbb R)$ to $\phi$. On the other hand, as shown in the proof of Theorem 2, we know that $\{\widehat\phi_n\}$ converges to $\widehat\phi$ uniformly on $\mathbb R$. Thus, with
$$\widehat\phi_n(2^n\pi) = \widehat\phi_{n-1}(2^{n-1}\pi) = \cdots = \widehat\phi_1(2\pi) = p(\pi)\,\widehat\phi_0(\pi),$$
we have
$$p(\pi)\,\widehat\phi_0(\pi) = \lim_{n\to\infty}\widehat\phi_n(2^n\pi) = \lim_{n\to\infty}\widehat\phi(2^n\pi) = 0,$$

where the last equality follows by applying the Riemann–Lebesgue lemma. This, together with the fact that $\widehat\phi_0(\pi) = \frac{1-e^{-i\pi}}{i\pi} = \frac{2}{i\pi} \ne 0$, leads to $p(\pi) = 0$.

To prove item (ii) in Theorem 3, we note that $G_{\phi_n}(\omega) = T_p^n G_{\phi_0}(\omega)$ (see Exercise 5). Thus, by (9.2.6) on p.447, for a compactly supported $\phi_0\in L_2(\mathbb R)$ with PU, we have
$$
\begin{aligned}
T_p^n G_{\phi_0}(\omega) = G_{\phi_n}(\omega)
&= \sum_{k=-(N-1)}^{N-1}\Bigl(\int_{-\infty}^{\infty}\phi_n(x)\,\phi_n(x-k)\,dx\Bigr)e^{-ik\omega}\\
&\to \sum_{k=-(N-1)}^{N-1}\Bigl(\int_{-\infty}^{\infty}\phi(x)\,\phi(x-k)\,dx\Bigr)e^{-ik\omega} \quad\text{as } n\to\infty\\
&= G_\phi(\omega), \qquad\text{for } \omega\in[-\pi,\pi],
\end{aligned}
$$
where
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi_n(x)\,\phi_n(x-k)\,dx = \int_{-\infty}^{\infty}\phi(x)\,\phi(x-k)\,dx$$
follows from the fact that $\|\phi_n - \phi\|_2 \to 0$ as $n\to\infty$ (see Exercise 8). Thus,
$$\lim_{n\to\infty} T_p^n G_{\phi_0}(\omega) = G_\phi(\omega), \qquad \omega\in[-\pi,\pi]. \tag{10.3.6}$$


For each $j$, with $-N\le j\le N$, let us choose the initial function $\phi_0(x) = \chi_{[j,j+1)}(x)$. Then $\phi_0$ has PU. Thus, with $G_{\phi_0}(\omega) = e^{-ij\omega}$ (see Exercise 6), and from (10.3.6), we have
$$\lim_{n\to\infty} T_p^n\bigl(e^{-ij\omega}\bigr) = G_\phi(\omega).$$
Therefore, for any $v(\omega) = \sum_{k=-N}^{N} v_k e^{-ik\omega} \in V_{2N+1}$,
$$\lim_{n\to\infty} T_p^n v(\omega) = \Bigl(\sum_{k=-N}^{N} v_k\Bigr) G_\phi(\omega), \qquad \omega\in[-\pi,\pi]. \tag{10.3.7}$$
Since, for $v(\omega)\in V_{2N+1}$, $T_p^n v(\omega)$ converges for any $\omega\in[-\pi,\pi]$, any eigenvalue $\lambda$ of $T_p$ must satisfy either $|\lambda| < 1$ or $\lambda = 1$, and the eigenvalue 1 is nondegenerate. Next, we show that 1 is a simple eigenvalue. Let $v(\omega)\in V_{2N+1}$ be a 1-eigenfunction of $T_p$. Then, by (10.3.7), we have
$$v(\omega) = \lim_{n\to\infty} 1^n\,v(\omega) = \lim_{n\to\infty} T_p^n v(\omega) = \Bigl(\sum_{k=-N}^{N} v_k\Bigr) G_\phi(\omega).$$

Thus, up to a constant multiple, $G_\phi(\omega)$ is the only 1-eigenfunction of $T_p$. This shows $T_p\in E$.

Conversely, suppose that $p$ has at least sum-rule order 1 and that $T_p\in E$. For a compactly supported $\phi_0$ with PU, we know, by (i) in Theorem 1, that there is a positive integer $n_0$ such that $\operatorname{supp}(\phi_n)\subseteq[-\frac12, N+\frac12]$ for $n\ge n_0$. In addition, by (ii) in Theorem 1, it follows that $\phi_{n_0}$ has PU. By considering the convergence of $\phi_{n+n_0} = Q_p^n(\phi_{n_0})$ with $\phi_{n_0}$ as the initial function, we may assume that the support of $\phi_0$ is in $[-\frac12, N+\frac12]$. Then, for $n\ge 1$, $\operatorname{supp}(\phi_n)\subseteq[-\frac12, N+\frac12]$ (by referring to the proof of (i) in Theorem 1), and hence $G_{\phi_n}\in V_{2N+1}$. From the fact that $G_{\phi_n}(\omega) = T_p^n G_{\phi_0}(\omega)$ and the assumption that $T_p\in E$, we know that $G_{\phi_n}(\omega)$ converges pointwise on $[-\pi,\pi]$. Let $W(\omega)$ denote the limit. By (ii) in Theorem 1, $\phi_n$ has PU. Thus,
$$G_{\phi_n}(0) = \sum_{\ell\in\mathbb Z}\bigl|\widehat\phi_n(2\pi\ell)\bigr|^2 = 1.$$
Therefore, $W(0) = 1$. Clearly, $W(\omega)\ge 0$. Thus, by Theorem 3 on p.504, the refinable function $\phi$ is in $L_2(\mathbb R)$. Therefore $G_\phi(\omega)\in V_{2N+1}$, and it is a 1-eigenfunction of $T_p$. Since $G_{\phi_{n+1}}(\omega) = T_p G_{\phi_n}(\omega)$, we have $T_p W(\omega) = W(\omega)$; that is, $W(\omega)$ is also a 1-eigenfunction of $T_p$. Since 1 is a simple eigenvalue of $T_p$, there exists a constant $c$ such that


$$W(\omega) = c\,G_\phi(\omega).$$
In view of $W(0) = 1$ and $G_\phi(0) = 1$, we may deduce $c = 1$. Hence, we have $W(\omega) = G_\phi(\omega)$. This means that $G_{\phi_n}(\omega)$ converges pointwise to $G_\phi(\omega)$ on $[-\pi,\pi]$; that is,
$$\sum_{k=-N}^{N}\Bigl(\int_{-\infty}^{\infty}\phi_n(x)\,\phi_n(x-k)\,dx\Bigr)e^{-ik\omega}
\to \sum_{k=-N}^{N}\Bigl(\int_{-\infty}^{\infty}\phi(x)\,\phi(x-k)\,dx\Bigr)e^{-ik\omega}$$
as $n\to\infty$, for $\omega\in[-\pi,\pi]$. In particular, we have
$$\int_{-\infty}^{\infty}\bigl|\phi_n(x)\bigr|^2\,dx \to \int_{-\infty}^{\infty}\bigl|\phi(x)\bigr|^2\,dx \quad\text{as } n\to\infty.$$
Therefore, by Parseval's formula (7.2.7) in Theorem 2 on p.331, we have
$$\|\widehat\phi_n\|_2 \to \|\widehat\phi\|_2 \quad\text{as } n\to\infty.$$
This, together with the fact that $\widehat\phi_n(\omega)$ converges pointwise on $\mathbb R$ to $\widehat\phi(\omega)$, implies that $\widehat\phi_n(\omega)$ converges to $\widehat\phi(\omega)$ in $L_2(\mathbb R)$ (by applying a standard result in Real Analysis). Hence, by applying Parseval's formula again, we may conclude that $\phi_n(x)$ converges to $\phi(x)$ in $L_2(\mathbb R)$. $\blacksquare$

Exercises

Exercise 1 Let $p = \{p_k\}_{k=0}^2$ with
$$p_0 = \frac12, \quad p_1 = 1, \quad p_2 = \frac12.$$

Show that the vector $\mathbf y_0 = [0, 1, 0]^T$ is a 1-eigenvector of the matrix $P_3$ in (10.3.1). Then apply the algorithm described by (10.3.2) to compute the values of the refinement function $\phi(x)$ with refinement mask $p$, for $x = 0, 1/16, 2/16, \ldots, 31/16, 2$. Observe that $\phi(x)$ is the hat function, $\phi(x) = \min\{x,\,2-x\}\chi_{[0,2)}(x)$, called the linear (Cardinal) B-spline. (A numerical sketch of this evaluation procedure is given after Exercise 11.)

Exercise 2 Let $p = \{p_k\}_{k=0}^3$ with
$$p_0 = \frac14, \quad p_1 = p_2 = \frac34, \quad p_3 = \frac14.$$
Show that the vector $\mathbf y_0 = [0, \frac12, \frac12, 0]^T$ is a 1-eigenvector of the matrix $P_4$ in (10.3.1). Then apply the algorithm described by (10.3.2) to compute the values of the refinement function $\phi(x)$ with refinement mask $p$, for $x = 0, 1/16, 2/16, \ldots, 47/16, 3$. We remark that $\phi(x)$ is called the quadratic (Cardinal) B-spline.


Exercise 3 Let $p = \{p_k\}_{k=0}^4$ with
$$p_0 = \frac18, \quad p_1 = \frac48, \quad p_2 = \frac68, \quad p_3 = \frac48, \quad p_4 = \frac18.$$
Show that the vector $\mathbf y_0 = [0, \frac16, \frac46, \frac16, 0]^T$ is a 1-eigenvector of the matrix $P_5$ in (10.3.1). Then apply the algorithm described by (10.3.2) to compute the values of the refinement function $\phi(x)$ with refinement mask $p$, for $x = 0, 1/16, 2/16, \ldots, 63/16, 4$. We remark that $\phi(x)$ is called the cubic (Cardinal) B-spline.

Exercise 4 Let $p = \{p_k\}_{k=0}^2$ with
$$p_0 = \frac35, \quad p_1 = 1, \quad p_2 = \frac25,$$
be a refinement mask. Find $\phi_1 = Q_p\phi_0$ and $\phi_2 = Q_p^2\phi_0$, where $\phi_0(x) = \chi_{[0,1)}(x)$.

Exercise 5 Let $\phi_n = Q_p^n\phi_0$, $n = 1,2,\ldots$, where $Q_p$ is the refinement operator defined by (10.3.3). Prove that $G_{\phi_1}(\omega) = T_p G_{\phi_0}(\omega)$, and then conclude that $G_{\phi_n}(\omega) = T_p^n G_{\phi_0}(\omega)$.

Exercise 6 Let $\phi_0(x) = \chi_{[j,j+1)}(x)$. Show that $G_{\phi_0}(\omega) = e^{-ij\omega}$.

Exercise 7 Let $\phi_0(x) = \min\{x,\,2-x\}\chi_{[0,2)}(x)$ be the hat function (also called the linear Cardinal B-spline). Show that $\phi_0$ has the property of partition of unity (PU).

Exercise 8 Show that if $\|f_n - f\|_2 \to 0$ and $\|g_n - g\|_2 \to 0$ as $n\to\infty$, then
$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)\,g_n(x)\,dx = \int_{-\infty}^{\infty} f(x)\,g(x)\,dx.$$

Exercise 9 Let $T_p$ be the matrix in Example 5. Verify that its eigenvalues are
$$1,\ 1,\ -1,\ \frac12,\ \frac12,\ \frac12,\ -\frac12.$$

Exercise 10 Let $p = \{p_k\}_{k=0}^2$ with
$$p_0 = \frac34, \quad p_1 = 1, \quad p_2 = \frac14,$$
be the refinement mask considered in Example 1 on p.511. Decide whether the cascade algorithm associated with $p$ is convergent in $L_2(\mathbb R)$.

Exercise 11 Repeat Exercise 10 for $p = \{p_k\}_{k=0}^2$ with
$$p_0 = \frac13, \quad p_1 = 1, \quad p_2 = \frac23.$$
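For Exercises 1–3, the evaluation at dyadic points can be carried out in a few lines of code. The sketch below is not from the book, and it makes two assumptions stated explicitly in the comments: that $P_{N+1}$ in (10.3.1) is the matrix $(p_{2j-k})_{0\le j,k\le N}$ acting on the vector of integer values of $\phi$, and that the algorithm (10.3.2) is the usual dyadic refinement step $\phi(x) = \sum_k p_k\phi(2x-k)$ applied to halve the grid spacing at each pass. The helper name `dyadic_values` is ours, not the book's.

```python
import numpy as np

def dyadic_values(p, levels=4):
    """Values of the refinable function phi at x = j / 2**levels on [0, N].

    Assumed scheme: phi at the integers is the 1-eigenvector of the matrix
    (p_{2j-k})_{0<=j,k<=N}, normalized so that sum_j phi(j) = 1 (PU at the
    integers); finer values come from phi(x) = sum_k p_k * phi(2x - k).
    """
    p = np.asarray(p, dtype=float)
    N = len(p) - 1
    P = np.array([[p[2 * j - k] if 0 <= 2 * j - k <= N else 0.0
                   for k in range(N + 1)] for j in range(N + 1)])
    w, V = np.linalg.eig(P)
    y = np.real(V[:, np.argmin(np.abs(w - 1.0))])   # 1-eigenvector
    vals = y / y.sum()                              # values at x = 0, 1, ..., N
    h = 1.0
    for _ in range(levels):
        h /= 2.0
        x = np.arange(0.0, N + h / 2, h)
        new = np.zeros_like(x)
        for i, xi in enumerate(x):
            s = 0.0
            for k, pk in enumerate(p):
                t = 2.0 * xi - k                    # lies on the coarser grid
                j = int(round(t / (2.0 * h)))
                if 0 <= j < len(vals) and abs(j * 2.0 * h - t) < 1e-9:
                    s += pk * vals[j]
            new[i] = s
        vals = new
    return np.arange(0.0, N + h / 2, h), vals

# Exercise 1: mask {1/2, 1, 1/2}; the output reproduces the hat function
x, v = dyadic_values([0.5, 1.0, 0.5], levels=4)     # x = 0, 1/16, ..., 2
```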

10.4 Smoothness of Compactly Supported Wavelets

In the previous section, we have studied how the graphs of refinable functions, and of any (finite) linear combination of their integer translates, including the corresponding wavelets, can be drawn, but there was no indication of the quality of these graphs. The objective of this section is to characterize the smoothness of compactly supported refinable functions $\phi$ (and hence of their corresponding wavelets) in terms of the refinement masks (or sequences). In this book, we only study the quality of smoothness measured by Sobolev regularity, and will calibrate the Sobolev regularity (or smoothness) of a refinement function $\phi$ in terms of the eigenvalues of the transition operator $T_p$ associated with the corresponding refinement mask $p$ of $\phi$.

For any $\gamma\ge 0$, a function $f$ is said to be in the Sobolev space $W_\gamma(\mathbb R)$ if the Fourier transform of $f$ satisfies the condition
$$\bigl(1+\omega^2\bigr)^{\frac\gamma2}\,\widehat f(\omega) \in L_2(\mathbb R).$$
In the following, the supremum of all such values of $\gamma$, denoted by
$$\nu_f = \sup\{\gamma:\ f\in W_\gamma(\mathbb R)\},$$
is called the critical Sobolev exponent. The issue that must be addressed is the reason for studying the Sobolev regularity instead of the more natural Hölder regularity of the same "order" $\gamma > 0$, defined by the space of functions $f\in C^k(\mathbb R)$ with
$$\bigl|f^{(k)}(x) - f^{(k)}(y)\bigr| \le C|x-y|^\alpha, \qquad\text{for all } x,y\in\mathbb R,$$
where $0 < \alpha = \gamma - k < 1$, for some positive constant $C$. While the derivation of the Hölder regularity is not more difficult than that of the Sobolev regularity to be discussed in this section, we prefer a unified theory based on the transition operator $T_p$ associated with the refinement mask $p$, since the Sobolev regularity estimate provided in this section is characterized by the eigenvalues of $T_p$. The advantage of the Sobolev regularity over the Hölder regularity is its computational efficiency. This will be studied in this section with four demonstrative examples.

Observe that if $f^{(k)}(x)\in L_2(\mathbb R)$, then $\omega^k\widehat f(\omega)\in L_2(\mathbb R)$, by applying Plancherel's formula (called Parseval's theorem in Theorem 1 of Sect. 7.2), namely:
$$\bigl\|\omega^k\widehat f(\omega)\bigr\|_2^2 = 2\pi\bigl\|f^{(k)}\bigr\|_2^2.$$


Thus, if in addition $f(x)\in L_2(\mathbb R)$, then $f\in W_k(\mathbb R)$. On the other hand, it can also be proved that if $f\in W_\gamma(\mathbb R)$ for some $\gamma > k + \frac12$, where $k$ is a given nonnegative integer, then $f\in C^k(\mathbb R)$. Thus, the Sobolev regularity assures smoothness of functions in a similar way as the Hölder regularity.

Let $\phi\in L_2(\mathbb R)$ be the refinable function associated with a refinement mask $p = \{p_k\}_{k=0}^N$, which has sum-rule order $L\ge 1$. Then the two-scale symbol $p(\omega)$ is a trigonometric polynomial that can be written as
$$p(\omega) = \Bigl(\frac{1+e^{-i\omega}}{2}\Bigr)^{L} p_0(\omega) = \cos^L\frac\omega2\; e^{-i\frac\omega2 L}\, p_0(\omega),$$
for some trigonometric polynomial $p_0(\omega)$. Let $U_{2L}$ denote the subspace of $V_{2N+1}$ defined by
$$U_{2L} = \Bigl\{u(\omega)\in V_{2N+1}:\ \frac{d^\ell}{d\omega^\ell}u(0) = 0,\ \text{for all } 0\le\ell\le 2L-1\Bigr\}. \tag{10.4.1}$$

Theorem 1 If $p$ has sum-rule order $L$, then the subspace $U_{2L}$ is invariant under $T_p$.

Proof Any $u(\omega)\in U_{2L}$ can be written as
$$u(\omega) = \Bigl(\frac{1-e^{-i\omega}}{2i}\Bigr)^{2L} u_0(\omega) = \sin^{2L}\frac\omega2\; e^{-i\omega L}\, u_0(\omega),$$
where $u_0(\omega)$ is a trigonometric polynomial. Thus
$$
\begin{aligned}
T_p u(\omega) &= \Bigl|p\Bigl(\frac\omega2\Bigr)\Bigr|^2 u\Bigl(\frac\omega2\Bigr) + \Bigl|p\Bigl(\frac\omega2+\pi\Bigr)\Bigr|^2 u\Bigl(\frac\omega2+\pi\Bigr)\\
&= \Bigl|p\Bigl(\frac\omega2\Bigr)\Bigr|^2 \sin^{2L}\frac\omega4\; e^{-i\frac\omega2 L}\, u_0\Bigl(\frac\omega2\Bigr)
 + \sin^{2L}\frac\omega4\;\Bigl|p_0\Bigl(\frac\omega2+\pi\Bigr)\Bigr|^2 u\Bigl(\frac\omega2+\pi\Bigr);
\end{aligned}
$$
that is, $T_p u(\omega)$ can be written as
$$T_p u(\omega) = \sin^{2L}\frac\omega4\; w_0\Bigl(\frac\omega2\Bigr),$$
where $w_0(\omega)$ is a trigonometric polynomial. This leads to
$$\frac{d^\ell}{d\omega^\ell}\bigl(T_p u\bigr)(\omega)\Big|_{\omega=0} = 0,$$
for all $0\le\ell < 2L$. Hence $T_p u(\omega)\in U_{2L}$, as desired. $\blacksquare$

10.4 Smoothness of Compactly Supported Wavelets

537

by dividing the integral over [−π, π] by 2π, as defined in Sect. 4.1 and (4.1.2) in Sect. 6.1 of Chap. 6. Let A be a linear operator defined on a subspace U of V2N +1 , and |||A||| be the operator norm of A (see Definition 1 of Sect. 3.2 in Chap. 3 for matrix operators), namely: |||A||| = sup0 =u∈U

Au2 . u2

Let σ(A) denote the spectrum of A, i.e. the set of all eigenvalues of A, with spectral radius ρ(A) defined to be the largest magnitude among all eigenvalues of A (see Theorem 1 in Sect. 3.2 of Chap. 3 for rectangular matrices). Then, by spectral theory (see Theorem 3 in Sect. 3.2 of Chap. 3), we have 1

ρ(A) = lim |||An ||| n . n→∞

In addition, the above (limit) formula is independent of the choice of the operator norm. For n ≥ 1, let Dn denote the union of two intervals defined by  Dn = [−2n π, 2n π] [−2n−1 π, 2n−1 π] = [−2n π, −2n−1 π) ∪ (2n−1 π, 2n π]. N . Theorem 2 Let φ ∈ L 2 (R) be the refinable function associated with p = { pk }k=0 Suppose that p has sum-rule order L. Then, for an arbitrary given  > 0, there exists a positive constant c, such that for all n ≥ 1,

 Dn

Proof Let



2



n

 φ(ω) dω ≤ c ρ(T p U ) +  . 2L

(10.4.2)

ω 2L . u 0 (ω) = (1 − cos ω) L = sin 2

Then u 0 (ω) ∈ U2L . In addition,  π π π 2L 1 u 0 (ω) ≥ sin = L , for ω ∈ D1 = [−π, π] [− , ]. 4 2 2 2  Since φ(ω) is continuous, and hence bounded, on [−π, π], there exists a positive constant c1 , such that  |φ(ω)| ≤ c1 , for ω ∈ [−π, π].  Then, with φ(ω) =

n j=1

 ωn ), we have p( 2ωj )φ( 2

538

10 Wavelet Analysis





2

 φ(ω) dω = Dn



n

ω 2 ω

2

p( j )  φ( n ) dω 2 2 Dn

 ≤ c1

j=1

n

ω 2

p( j ) dω 2 Dn j=1



n

ω 2 ω

p( j ) u 0 ( n )dω 2 2 Dn

≤ c1 2 L  ≤ c1 2 L

j=1 n



2n π

−2n π

 = c1 2 L

π

−π



p(

j=1

ω

2 ω ) u 0 ( n )dω j 2 2

T pn u 0 )(ω)dω,

where the last equality follows from (9.1.12) in Sect. 9.1 of Chap. 9, on p.444. In addition, since



1 ρ(T p U ) = lim |||(T p U )n ||| n , n→∞

2L

2L

for the  > 0 given in the statement of the theorem, there exists a positive constant c > 0 such that, for n ≥ 1,





n |||(T p U )n ||| ≤ c ρ(T p U ) +  . 2L

2L

Therefore





n T pn u 0 2 ≤ |||(T p U )n ||| u 0 2 ≤ c u 0 2 ρ(T p U ) +  ; 2L

2L

and we may conclude that 



2

 φ(ω) dω ≤ c1 2 L Dn



π

−π



T pn u 0 )(ω)dω

√ √

n ≤ c1 2 L 2πT pn u 0 2 ≤ c1 2 L 2πc u 0 2 ρ(T p U ) +  . 2L

√ This proves that (10.4.2) holds, with the choice of c = c1 2 L 2πc u 0 2 .



Next, we give a Sobolev smoothness estimate for $\phi$ in terms of the spectral radius $\rho\bigl(T_p\big|_{U_{2L}}\bigr)$. Throughout this section, $\log$ denotes the common logarithm; that is, using base 10.

Theorem 3 (Sobolev smoothness estimate) Let $\phi\in L_2(\mathbb R)$ be the refinable function associated with $p = \{p_k\}_{k=0}^N$. Suppose that $p$ has sum-rule order $L$. Then
$$\nu_\phi \ge \gamma_0 = -\frac{\log\rho\bigl(T_p\big|_{U_{2L}}\bigr)}{2\log 2}. \tag{10.4.3}$$
In addition, if $\phi$ is stable, then $\nu_\phi = \gamma_0$.
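The lower bound (10.4.3) is straightforward to evaluate numerically. The sketch below is not from the book: it reuses `transition_matrix` from the sketch after Fig. 10.3, it restricts $T_p$ to $U_{2L}$ by choosing the basis $b_m(\omega) = (1-e^{-i\omega})^{2L}e^{-im\omega}$, $m = -N,\ldots,N-2L$, of $U_{2L}$ (any basis of $U_{2L}$ would serve equally well), and the D4 mask coefficients (in the normalization $\sum_k p_k = 2$) together with its sum-rule order $L = 2$ are assumed here rather than quoted from this excerpt.

```python
import numpy as np
from math import comb, sqrt

def sobolev_lower_bound(p, L):
    """gamma_0 = -log2(rho(T_p restricted to U_{2L})) / 2, as in (10.4.3).

    Columns of B hold the coefficients (w.r.t. e^{-ikw}, k = -N..N) of the
    basis functions b_m(w) = (1 - e^{-iw})^{2L} e^{-imw} of U_{2L}.
    """
    p = np.asarray(p, dtype=float)
    N = len(p) - 1
    T = transition_matrix(p)              # from the earlier sketch
    dim = 2 * N + 1
    cols = []
    for m in range(-N, N - 2 * L + 1):
        b = np.zeros(dim)
        for r in range(2 * L + 1):
            b[(m + r) + N] = (-1) ** r * comb(2 * L, r)
        cols.append(b)
    B = np.column_stack(cols)
    # U_{2L} is T_p-invariant (Theorem 1), so T B = B R has an exact solution R
    R = np.linalg.lstsq(B, T @ B, rcond=None)[0]
    rho = np.max(np.abs(np.linalg.eigvals(R)))
    return -np.log2(rho) / 2.0

# D4 mask (coefficients summing to 2), sum-rule order L = 2
s3 = sqrt(3.0)
d4 = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]
print(sobolev_lower_bound(d4, L=2))       # approximately 1.0
```

For the D4 filter this returns a value close to 1; since the D4 refinable function is stable, Theorem 3 identifies this value with its critical Sobolev exponent $\nu_\phi = \gamma_0$.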

Proof Here, we only give the proof of (10.4.3), but remark that the validity of the second statement ($\nu_\phi = \gamma_0$) depends on the fact that $L$ is the largest integer (that is, the order) for which the sum rules hold. For any $\gamma < \gamma_0$, there exists an $\epsilon > 0$ such that
$$\gamma < -\frac{\log\bigl(\rho\bigl(T_p\big|_{U_{2L}}\bigr)+\epsilon\bigr)}{2\log 2},$$