An Introduction to Linear Ordinary Differential Equations Using the Impulsive Response Method and Factorization [1st ed.] 3319496662, 978-3-319-49666-5, 978-3-319-49667-2

This book presents a method for solving linear ordinary differential equations based on the factorization of the differential operator.


English Pages 120 [125] Year 2016


Table of contents:
Front Matter (pages i-vii)
Introduction (pages 1-7)
The Case of Constant Coefficients (pages 9-54)
The Case of Variable Coefficients (pages 55-117)
Back Matter (pages 119-120)


PoliTO Springer Series

Roberto Camporesi

An Introduction to Linear Ordinary Differential Equations Using the Impulsive Response Method and Factorization

PoliTO Springer Series

Editor-in-Chief
Giovanni Ghione, Dept. of Electronics and Telecommunications, Politecnico di Torino, Italy

Editorial Board
Andrea Acquaviva, Dept. of Control and Computer Engineering, Politecnico di Torino, Italy
Pietro Asinari, Dept. of Energy, Politecnico di Torino, Italy
Claudio Canuto, Dept. of Mathematical Sciences, Politecnico di Torino, Italy
Erasmo Carrera, Dept. of Mechanical and Aerospace Engineering, Politecnico di Torino, Italy
Felice Iazzi, Dept. of Applied Science and Technology, Politecnico di Torino, Italy
Luca Ridolfi, Dept. of Environment, Land and Infrastructure Engineering, Politecnico di Torino, Italy

Springer, in cooperation with Politecnico di Torino, publishes the PoliTO Springer Series. This co-branded series of publications includes works by authors and volume editors mainly affiliated with Politecnico di Torino and covers academic and professional topics in the following areas: Mathematics and Statistics, Chemistry and Physical Sciences, Computer Science, All fields of Engineering. Interdisciplinary contributions combining the above areas are also welcome. The series will consist of lecture notes, research monographs, and briefs. Lecture notes are meant to provide quick information on research advances and may be based e.g. on summer schools or intensive courses on topics of current research, while SpringerBriefs are intended as concise summaries of cutting-edge research and its practical applications. The PoliTO Springer Series promotes international authorship and addresses a global readership of scholars, students, researchers, professionals and policymakers.

More information about this series at http://www.springer.com/series/13890

Roberto Camporesi

An Introduction to Linear Ordinary Differential Equations Using the Impulsive Response Method and Factorization


Roberto Camporesi Dipartimento di Scienze Matematiche “G. L. Lagrange” Politecnico di Torino Turin Italy

ISSN 2509-6796    ISSN 2509-7024 (electronic)
PoliTO Springer Series
ISBN 978-3-319-49666-5    ISBN 978-3-319-49667-2 (eBook)
DOI 10.1007/978-3-319-49667-2

Library of Congress Control Number: 2016959750

© Springer International Publishing AG 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

1 Introduction ......................................................... 1

2 The Case of Constant Coefficients ..................................... 9
  2.1 n = 1 ............................................................ 9
  2.2 n = 2 ............................................................ 11
  2.3 The General Case ................................................. 27
  2.4 Explicit Formulas for the Impulsive Response ..................... 36
  2.5 The Method of Undetermined Coefficients .......................... 49

3 The Case of Variable Coefficients .................................... 55
  3.1 n = 1 ............................................................ 56
  3.2 n = 2 ............................................................ 57
  3.3 The General Case ................................................. 76
      3.3.1 The Factorization Problem .................................. 76
      3.3.2 The Non-homogeneous Equation ............................... 80
      3.3.3 The Homogeneous Equation ................................... 81
  3.4 The Connection with Variation of Parameters ...................... 91
  3.5 The Approach by Linear Systems and the Peano-Baker Series ........ 103

References ............................................................. 119


Abstract

We present an approach to the impulsive response method for solving linear ordinary differential equations based on the factorization of the differential operator. In the case of constant coefficients the approach is elementary: we assume only a basic knowledge of calculus and linear algebra. In particular, we avoid the use of distribution theory, as well as the other more advanced approaches: the Laplace transform, linear systems, the general theory of linear equations with variable coefficients, and variation of parameters. The case of variable coefficients is dealt with using the result of Mammana about the factorization of a real linear ordinary differential operator into a product of first-order (complex) factors, as well as a recent generalization of this result to the case of complex-valued coefficients.


Chapter 1

Introduction

In view of their many applications in various fields of science and engineering, linear differential equations constitute an important chapter in the theory and application of ordinary differential equations.

In introductory courses on differential equations, the treatment of constant-coefficient second- or higher-order non-homogeneous equations is usually limited to illustrating the method of undetermined coefficients. Using this, one finds a particular solution when the forcing term is a polynomial, an exponential, a sine or a cosine, or a product of terms of this kind. It is well known that the impulsive response method gives an explicit formula for a particular solution in the more general case in which the forcing term is an arbitrary continuous function. This method is generally regarded as too difficult to implement in a first course on differential equations. Students become aware of it only later, as an application of the theory of the Laplace transform [10] or of distribution theory [16].

An alternative approach, which is sometimes used, consists in developing the theory of linear systems first, and then treating linear equations of order n as a particular case of that theory. The problem with this approach is that one needs to "digest" the theory of linear systems, with all the issues related to the diagonalization of matrices and the Jordan form ([7], Chap. 3).

Another approach is by the general theory of linear equations with variable coefficients, with the notion of the Wronskian and the method of variation of constants. This approach can also be implemented in the case of constant coefficients ([8], Chap. 2). However, in introductory courses the variation of constants method is often limited to first-order equations. Moreover, this method can be lengthy to implement in specific calculations, even for rather simple equations. Finally, within this approach, the occurrence of the particular solution as a convolution integral is rather indirect, and appears only at the end of the theory (see, for example, [8], exercise 4, p. 89).


In the first part of this paper, we give an elementary presentation of the impulsive response method in the case of constant coefficients, using only very basic results from linear algebra and calculus in one or several variables. We write the n-th order equation in the form

    y^{(n)} + a_1 y^{(n-1)} + a_2 y^{(n-2)} + \cdots + a_{n-1} y' + a_n y = f(x),   (1.1)

where we use the notation y(x) in place of x(t) or y(t), y^{(k)} denotes as usual the derivative of order k of y, a_1, a_2, ..., a_n are complex constants, and the forcing term f is a complex-valued continuous function in an interval I. When f ≠ 0 the equation is called non-homogeneous. When f = 0 we get the associated homogeneous equation

    y^{(n)} + a_1 y^{(n-1)} + a_2 y^{(n-2)} + \cdots + a_{n-1} y' + a_n y = 0.   (1.2)

The basic tool of our investigation is the so-called impulsive response. This is the function defined as follows. Let p(λ) = λ^n + a_1 λ^{n-1} + \cdots + a_{n-1} λ + a_n (λ ∈ C) be the characteristic polynomial, and let λ_1, ..., λ_n ∈ C be the roots of p(λ) (not necessarily distinct, each counted with its multiplicity). We define the impulsive response g = g_{λ_1 \cdots λ_n} recursively by the following formulas: for n = 1 we set g_{λ_1}(x) = e^{λ_1 x}; for n ≥ 2 we set

    g_{λ_1 \cdots λ_n}(x) = e^{λ_n x} \int_0^x e^{-λ_n t} g_{λ_1 \cdots λ_{n-1}}(t)\, dt   (x ∈ R).   (1.3)
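As a sketch (not from the book), the recursion (1.3) can be carried out numerically. The quadrature routine `simpson` and the test roots below are our own illustrative choices; the checks compare the recursion against the elementary closed forms for n = 2 (distinct roots and a double root).

```python
import cmath

def simpson(f, a, b, m=100):
    """Composite Simpson rule with 2*m subintervals."""
    h = (b - a) / (2 * m)
    s = f(a) + f(b)
    for k in range(1, 2 * m):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def impulsive_response(roots):
    """Build g recursively from formula (1.3), integrating numerically."""
    g = lambda x, lam=roots[0]: cmath.exp(lam * x)
    for lam in roots[1:]:
        prev = g
        g = (lambda x, lam=lam, prev=prev:
             cmath.exp(lam * x) * simpson(lambda t: cmath.exp(-lam * t) * prev(t), 0.0, x))
    return g

# n = 2, distinct roots: g(x) = (e^{λ1 x} − e^{λ2 x}) / (λ1 − λ2)
g = impulsive_response([1.0, 3.0])
assert abs(g(0.7) - (cmath.exp(2.1) - cmath.exp(0.7)) / 2) < 1e-8

# n = 2, double root λ: g(x) = x e^{λx}
g2 = impulsive_response([2.0, 2.0])
assert abs(g2(0.5) - 0.5 * cmath.exp(1.0)) < 1e-8
```

Each step of the loop applies one more factor of the recursion, so the same function also handles any n ≥ 2 (at the cost of nested quadratures).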

It turns out that g solves the homogeneous equation (1.2) with the initial conditions

    y(0) = y'(0) = \cdots = y^{(n-2)}(0) = 0,   y^{(n-1)}(0) = 1.   (1.4)

Moreover, the impulsive response allows one to solve the non-homogeneous equation with an arbitrary continuous forcing term and with arbitrary initial conditions. Indeed, if g denotes the impulsive response of order n and if 0 ∈ I, we shall see that the general solution of (1.1) in the interval I can be written as y = y_p + y_h, where the function y_p is given by the convolution integral

    y_p(x) = \int_0^x g(x - t) f(t)\, dt,   (1.5)

and solves (1.1) with trivial initial conditions at the point x = 0 (i.e., y_p^{(k)}(0) = 0 for k = 0, 1, ..., n − 1), whereas the function


    y_h(x) = \sum_{k=0}^{n-1} c_k g^{(k)}(x)   (1.6)

gives the general solution of the associated homogeneous equation (1.2) as the coefficients c_k vary in C. In other words, the n functions g, g', g'', ..., g^{(n-1)} are linearly independent solutions of this equation and form a basis of the vector space of its solutions. We will begin with the case of first-order equations, in which (1.5) and (1.6) are easily proved, proceed with the basic case of second-order equations, and finally treat the case of arbitrary n. For simplicity, we start our discussion for n = 1, 2 with the case of real coefficients, i.e., a_1, ..., a_n ∈ R in (1.1) and (1.2), and real-valued forcing terms. The same techniques and proofs apply to the case of complex coefficients and complex-valued forcing terms. The complex notation will be useful to treat the case n > 2 by induction on n. For n ≥ 2 we will not assume, a priori, existence, uniqueness, and extendability of the solutions of a linear initial value problem (homogeneous or not). We shall rather prove these facts directly, in the case of constant coefficients, by obtaining explicit formulas for the solutions. The proof of (1.5) that we give is by induction on n and is based on the factorization of the differential operator acting on y in (1.1) into first-order factors, along with the formula for solving first-order linear equations. It requires, moreover, the interchange of the order of integration in a double integral, that is, Fubini's theorem. The proof is constructive in that it directly produces the particular solution y_p as a convolution integral of the impulsive response g with the forcing term f. In particular, if we take f = 0, we get that the unique solution of the homogeneous initial value problem with all vanishing initial data is the zero function. This immediately implies the uniqueness of the solution of the initial value problem (homogeneous or not) with arbitrary initial data.
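The role of y_p in (1.5) can be checked numerically on a concrete second-order example. The code below is not from the book: the equation y'' + y = e^x, its impulsive response g(x) = sin x, and the finite-difference check are our own illustrative choices.

```python
import math

def simpson(fn, a, b, m=200):
    # composite Simpson rule with 2*m subintervals
    h = (b - a) / (2 * m)
    s = fn(a) + fn(b)
    for k in range(1, 2 * m):
        s += (4 if k % 2 else 2) * fn(a + k * h)
    return s * h / 3

# test equation y'' + y = e^x: characteristic roots ±i, impulsive response g(x) = sin x
g = math.sin
f = math.exp

def y_p(x):
    # particular solution (1.5): convolution of the impulsive response with f
    return simpson(lambda t: g(x - t) * f(t), 0.0, x)

# y_p should satisfy the equation (checked with a central difference at x = 1)...
x, h = 1.0, 1e-3
ypp = (y_p(x + h) - 2.0 * y_p(x) + y_p(x - h)) / h**2
assert abs(ypp + y_p(x) - f(x)) < 1e-4
# ...and the trivial initial conditions y_p(0) = 0, y_p'(0) = 0
assert abs(y_p(0.0)) < 1e-12
assert abs((y_p(h) - y_p(-h)) / (2 * h)) < 1e-5
```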
We remark that in the usual approach to linear equations with constant coefficients, the proof of uniqueness relies on an estimate for the rate of growth of any solution y of the homogeneous equation, together with its derivatives y', y'', ..., y^{(n-1)}, in terms of the coefficients a_1, a_2, ..., a_n in (1.2) (see, e.g., [8], Chap. 2, Theorems 13 and 14). As regards (1.6), one can give a similar proof by induction on n using factorization. We shall rather prove (1.6) by computing the linear relationship between the coefficients c_k and the initial data b_k = y^{(k)}(0) = y_h^{(k)}(0) for any given n. If we impose the initial conditions at any point x_0 ∈ I, we can just replace \int_0^x with \int_{x_0}^x in (1.5), and we can use, in place of (1.6), the translated formula

    y_h(x) = \sum_{k=0}^{n-1} c_k g^{(k)}(x - x_0).   (1.7)

The function y_p then satisfies y_p^{(k)}(x_0) = 0 for 0 ≤ k ≤ n − 1, and the relationship between the coefficients c_k and b_k = y^{(k)}(x_0) = y_h^{(k)}(x_0) remains the same as before.


We will then look for explicit formulas for the impulsive response g. For n = 1, g is an exponential; for n = 2 it is easily computed. For generic n one can use the recursive formula (1.3) to compute g at any prescribed order. A simple formula is obtained for n = 3, or, for generic n, in the case when the roots of p(λ) are all equal or all different. In the general case we prove by induction on k that if λ_1, ..., λ_k are the distinct roots of p(λ), of multiplicities m_1, ..., m_k, then there exist polynomials G_1, ..., G_k, of degrees m_1 − 1, ..., m_k − 1, such that

    g(x) = \sum_{j=1}^{k} G_j(x) e^{λ_j x}.   (1.8)

An explicit formula for the polynomials G_j is obtained. To this end, we first rewrite the impulsive response g from formula (1.3) as a convolution of the functions x^{m_1 - 1} e^{λ_1 x}, ..., x^{m_k - 1} e^{λ_k x} (using commutativity and associativity of convolution). Then we prove a formula for the convolution of x^p e^{bx} and x^q e^{cx} (with p, q ∈ N and b ≠ c) by elementary calculus. Using this formula in the previously mentioned formula for g and iterating gives the result (1.8). The polynomial G_1 is expressed as a (k − 1)-times iterated sum of monomials whose coefficients are ratios of binomial coefficients and powers of (λ_1 − λ_j), j ≠ 1. The polynomials G_j for j ≥ 2 are obtained by exchanging λ_1 ↔ λ_j and m_1 ↔ m_j in this formula. Another formula for the polynomials G_j is based on the partial fraction expansion of 1/p(λ). This formula can be proved by induction, but it is most easily proved using the Laplace transform or distribution theory. Using (1.8) in (1.6) gives the well-known formula for the general solution of (1.2) in terms of the complex exponentials e^{λ_j x}. The case of real coefficients follows easily from this result. Using (1.8) in (1.5) gives a simple proof of the method of undetermined coefficients in the case when f(x) = P(x) e^{λ_0 x}, where P is a complex polynomial and λ_0 ∈ C. Several examples with worked-out solutions are given throughout the paper. We also propose some exercises to complement the main text and to help the reader get acquainted with the impulsive response method. This part of the paper is an outgrowth of [3, 5]. In [3] the method of factorization was illustrated for second-order equations, but the general case was only outlined. In [5] we treated equations of arbitrary order n, but we avoided the extensive use of convolution in order to make the paper as readable as possible. Thus we could not prove formula (1.8) and the related formulas for the polynomials G_j. This is remedied in the present paper.
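When the roots of p(λ) are all distinct, the partial-fraction expansion of 1/p(λ) mentioned above yields the standard closed form g(x) = \sum_j e^{λ_j x} / p'(λ_j). The sketch below (our own code, with a hypothetical cubic as the test case) verifies numerically that this expression satisfies the initial conditions (1.4), since g^{(k)}(0) = \sum_j λ_j^k / p'(λ_j).

```python
import math

def dp(roots, j):
    """p'(λ_j) = product over i ≠ j of (λ_j − λ_i), for distinct roots."""
    prod = 1.0
    for i, li in enumerate(roots):
        if i != j:
            prod *= roots[j] - li
    return prod

def g_distinct(x, roots):
    """Impulsive response when all roots are distinct: g(x) = sum_j e^{λ_j x}/p'(λ_j)."""
    return sum(math.exp(r * x) / dp(roots, j) for j, r in enumerate(roots))

roots = [1.0, 2.0, 5.0]   # hypothetical cubic p(λ) = (λ−1)(λ−2)(λ−5), n = 3
# initial conditions (1.4): g^{(k)}(0) = sum_j λ_j^k / p'(λ_j) = 0 for k < n−1, and = 1 for k = n−1
for k in range(3):
    dk = sum(r**k / dp(roots, j) for j, r in enumerate(roots))
    assert abs(dk - (1.0 if k == 2 else 0.0)) < 1e-12
```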
We also present the method of undetermined coefficients using the impulsive response approach, which was not given in [3, 5]. In the second part of this paper, the impulsive response method is generalized to linear differential equations with variable coefficients, namely the equations of the form (1.1) with a_1, ..., a_n and f complex-valued continuous functions in a common interval I. There is a formula, analogous to (1.5), for the solution of the non-homogeneous equation with trivial initial data at some point x_0 ∈ I as a generalized convolution integral, namely




    y_p(x) = \int_{x_0}^x g(x, t) f(t)\, dt.   (1.9)

The kernel g(x, t) in this formula (x, t ∈ I) is known as the impulsive response kernel. It solves the homogeneous equation in the variable x with the following initial conditions at the point x = t (analogous to (1.4)):

    (∂/∂x)^j g(x, t)|_{x=t} = 0 for 0 ≤ j ≤ n − 2,    (∂/∂x)^{n-1} g(x, t)|_{x=t} = 1.   (1.10)

Formula (1.9) is easily verified by differentiating under the integral sign. However, this proof is not constructive, and the origin of formula (1.9) remains rather obscure for n ≥ 2. A constructive proof can be given using the theory of distributions and the concept of a kernel (i.e., a distribution on the product space R^n × R^n) introduced by Schwartz in [17]. Here we present a constructive yet elementary proof of (1.9), which is analogous to the case of constant coefficients. It is based on the result of Mammana [13, 14] about the factorization of a real linear ordinary differential operator into a product of first-order (generally complex) factors, and on a recent generalization of this result to the case of complex-valued coefficients [4]. According to these results, any linear differential operator of the form

    L = (d/dx)^n + a_1(x) (d/dx)^{n-1} + \cdots + a_{n-1}(x) (d/dx) + a_n(x)   (1.11)

with a_j ∈ C^0(I, C), can be factored into a product of first-order operators

    L = (d/dx − α_1(x)) (d/dx − α_2(x)) \cdots (d/dx − α_n(x))   (1.12)

globally defined on I, i.e., with α_j ∈ C^{j-1}(I, C) (1 ≤ j ≤ n). One can then prove (1.9) by induction on n, using only Fubini's theorem and the formula for solving first-order linear equations. The kernel g(x, t) = g_{α_1 \cdots α_n}(x, t) is given by the following recursive formula, which generalizes (1.3): for n = 1 we have

    g_α(x, t) = e^{\int_t^x α(s)\, ds},

and for n ≥ 2,

    g_{α_1 \cdots α_n}(x, t) = \int_t^x g_{α_n}(x, s)\, g_{α_1 \cdots α_{n-1}}(s, t)\, ds.   (1.13)
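The recursion (1.13) can also be sketched numerically (this code is not from the book; the quadrature routine and the test coefficients are our own choices). For constant coefficients the kernel should reduce to the impulsive response of a displacement, g(x, t) = g(x − t), which the last check verifies for a second-order case with roots 1 and 3.

```python
import math

def simpson(f, a, b, m=100):
    # composite Simpson rule with 2*m subintervals
    h = (b - a) / (2 * m)
    s = f(a) + f(b)
    for k in range(1, 2 * m):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def g1(alpha, x, t):
    """First-order kernel: g_α(x,t) = exp(∫_t^x α(s) ds)."""
    return math.exp(simpson(alpha, t, x))

def g2(a1, a2, x, t):
    """Second-order kernel via the recursion (1.13):
    g_{α1 α2}(x,t) = ∫_t^x g_{α2}(x,s) g_{α1}(s,t) ds."""
    return simpson(lambda s: g1(a2, x, s) * g1(a1, s, t), t, x)

# variable coefficient, n = 1: α(s) = s gives g(x,t) = e^{(x² − t²)/2}
assert abs(g1(lambda s: s, 1.2, 0.3) - math.exp((1.2**2 - 0.3**2) / 2)) < 1e-9

# constant coefficients α1 = 1, α2 = 3: kernel reduces to g(x − t) = (e^{3u} − e^{u})/2, u = x − t
u = 0.9 - 0.2
assert abs(g2(lambda s: 1.0, lambda s: 3.0, 0.9, 0.2) - (math.exp(3 * u) - math.exp(u)) / 2) < 1e-6
```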

The function x → gα1 ···αn (x, t) solves Lgα1 ···αn (·, t) = 0 with the initial conditions (1.10). In the case of constant coefficients, g(x, t) = g(x − t), where g is the impulsive response. We also have the analogue of (1.7) for the general solution yh of the homogeneous equation L y = 0 as a linear combination of partial derivatives of g(x, t) with respect to the variable t:


    y_h(x) = \sum_{k=0}^{n-1} (-1)^k \tilde{c}_k \, (∂^k g/∂t^k)(x, t)   (\tilde{c}_k ∈ C).   (1.14)

This formula holds for any t ∈ I under suitable regularity assumptions on the coefficients a_j, namely

    a_j ∈ C^{n-j-1}(I, C) (1 ≤ j ≤ n − 1),   a_n ∈ C^0(I, C).

In this case, the n functions

    x → g(x, t), −(∂g/∂t)(x, t), ..., (−1)^{n-1} (∂^{n-1} g/∂t^{n-1})(x, t)

are linearly independent solutions of the homogeneous equation for every t ∈ I, and form a basis of the vector space of its solutions (a fundamental system). We obtain a formula for the coefficients c̃_k in (1.14) in terms of the initial data at the point x = t, b_k = y_h^{(k)}(t). This formula is analogous to the constant-coefficient formula, except that the derivatives of the coefficients a_j enter into it as soon as n ≥ 3. It can be proved using distribution theory, but we present an elementary proof based on factorization and induction. In order to make contact with the variation of parameters method, we also obtain a formula for the kernel g in terms of any fixed fundamental system, and, using this result, we give another proof of (1.14). We remark that, despite all these similarities, the theory with variable coefficients is far less elementary and less explicit than the theory with constant coefficients. On the one hand, the proof of factorization given in [4, 13, 14] requires the existence and uniqueness theorem for linear equations ([8], Theorem 8, p. 260). This theorem is then no longer a consequence of factorization as it was before, although it can easily be proved once one knows that L can be factored in the form (1.12). On the other hand, it is not generally possible to compute the functions α_j in (1.12) in terms of quadratures involving the coefficients a_j in (1.11). The α_j satisfy Riccati-type equations that cannot, in general, be solved in closed form. If the coefficients a_j are rational functions of x with rational coefficients, then there are algorithms to search for rational solutions of Riccati equations; see, e.g., [18], Chap. 2. This goes along with the fact that, unlike the case n = 1, there is no general formula in terms of quadratures on the coefficients a_j for the solutions of L y = 0 for linear equations of order n ≥ 2. One can rewrite the n-th order equation L y = 0 as a system of n first-order equations.
The associated initial value problem can be transformed into a vector-valued integral equation that can be formally solved by Peano-Picard iteration. However, in general, there are no manageable formulas for the resulting series of iterated integrals in the coefficients a_j. In Sect. 3.5 we illustrate this approach for the case of second-order equations. Another problem is that linear differential operators do not, in general, admit unique factorizations. For example, one easily verifies that

    (d/dx)^2 − (2/x)(d/dx) + 2/x^2 = (d/dx − 1/x)^2 = (d/dx − 1/(x(1+ax))) (d/dx − (1+2ax)/(x(1+ax)))



for any constant a. (This example is due to E. Landau; see [18], p. 26.) Nevertheless, the factorization method, besides yielding simplified proofs of basic results, can give interesting information about the solutions of the homogeneous equation. For example, for second-order equations with real-valued coefficients, we shall see that the imaginary part of the function α_2 in (1.12) either vanishes identically on I or is always nonzero there. In the second case, the general solution of L y = 0 can be written in the form [13]

    y(x) = e^{η(x)} (c_1 cos ω(x) + c_2 sin ω(x))   (c_1, c_2 ∈ R),

where

    η(x) = \int_{x_0}^x Re α_2(t)\, dt,   ω(x) = \int_{x_0}^x Im α_2(t)\, dt,

the function ω being strictly monotone in I. This is similar to the case of constant coefficients with complex conjugate roots of p(λ).
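Landau's one-parameter family of factorizations can be sanity-checked numerically (our own code, with arbitrary test values for a and x). For first-order factors with variable coefficients, (d/dx − α_1)(d/dx − α_2) expands to y'' − (α_1 + α_2) y' + (α_1 α_2 − α_2') y, so it suffices to verify the two coefficient identities below.

```python
# Check that (d/dx − α1)(d/dx − α2) equals (d/dx)² − (2/x)(d/dx) + 2/x²,
# i.e. that α1 + α2 = 2/x and α1·α2 − α2' = 2/x².
a = 0.7          # arbitrary constant in Landau's family (test value, not from the book)
x = 1.3          # arbitrary evaluation point
h = 1e-6

def a1(x): return 1.0 / (x * (1 + a * x))
def a2(x): return (1 + 2 * a * x) / (x * (1 + a * x))

d_a2 = (a2(x + h) - a2(x - h)) / (2 * h)      # central-difference derivative of α2
assert abs(a1(x) + a2(x) - 2 / x) < 1e-12
assert abs(a1(x) * a2(x) - d_a2 - 2 / x**2) < 1e-6
```

Since the identities hold for every value of a, the operator really does admit infinitely many distinct factorizations.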

Chapter 2

The Case of Constant Coefficients

2.1 n = 1

Consider the linear first-order differential equation

    y' + ay = f(x),   (2.1)

where y' = dy/dx, a is a real constant, and the forcing term f is a real-valued continuous function in an interval I ⊂ R. It is well known that the general solution of (2.1) is given by

    y(x) = e^{-ax} \int e^{ax} f(x)\, dx,   (2.2)

where \int e^{ax} f(x)\, dx denotes the set of all primitives of the function e^{ax} f(x) in the interval I (i.e., its indefinite integral). Suppose that 0 ∈ I, and consider the integral function \int_0^x e^{at} f(t)\, dt. By the Fundamental Theorem of Calculus, this is the primitive of e^{ax} f(x) that vanishes at 0. The theorem of the additive constant for primitives implies that

    \int e^{ax} f(x)\, dx = \int_0^x e^{at} f(t)\, dt + k   (k ∈ R),

and we can rewrite (2.2) in the form

    y(x) = e^{-ax} \int_0^x e^{at} f(t)\, dt + k e^{-ax}
         = \int_0^x e^{-a(x-t)} f(t)\, dt + k e^{-ax}
         = \int_0^x g(x - t) f(t)\, dt + k g(x),   (2.3)
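Formula (2.3) is easy to exercise numerically. The sketch below is not from the book: the coefficient a = 2, the forcing term cos x, and the finite-difference check are our own illustrative choices.

```python
import math

def simpson(f, lo, hi, m=100):
    # composite Simpson rule with 2*m subintervals
    h = (hi - lo) / (2 * m)
    s = f(lo) + f(hi)
    for k in range(1, 2 * m):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3

a = 2.0
f = math.cos          # arbitrary continuous forcing term

def y(x, k=0.0):
    # formula (2.3): y(x) = ∫_0^x e^{-a(x-t)} f(t) dt + k e^{-ax}
    return simpson(lambda t: math.exp(-a * (x - t)) * f(t), 0.0, x) + k * math.exp(-a * x)

# verify y' + ay = f at x = 0.8 with a central difference
x, h = 0.8, 1e-5
yprime = (y(x + h) - y(x - h)) / (2 * h)
assert abs(yprime + a * y(x) - f(x)) < 1e-6
# the constant k is the initial value: y(0) = k
assert abs(y(0.0, k=5.0) - 5.0) < 1e-12
```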


where g(x) = e^{-ax}. The function g is called the impulsive response of the differential equation y' + ay = 0. It is the unique solution of the initial value problem

    y' + ay = 0,   y(0) = 1.

Formula (2.3) illustrates a well-known result in the theory of linear differential equations: the general solution of (2.1) is the sum of the general solution of the associated homogeneous equation y' + ay = 0 and of any particular solution of (2.1). In (2.3) the function

    y_p(x) = \int_0^x g(x - t) f(t)\, dt   (2.4)

is the particular solution of (2.1) that vanishes at x = 0. If x_0 is any point of I, it is easy to verify that

    y(x) = \int_{x_0}^x g(x - t) f(t)\, dt + y_0 g(x - x_0)   (x ∈ I)

is the unique solution of (2.1) in the interval I that satisfies y(x_0) = y_0 (y_0 ∈ R). As we shall see, formula (2.4) gives a particular solution of the non-homogeneous equation also in the case of higher-order linear constant-coefficient differential equations, by suitably defining the impulsive response g.

Remark 2.1 (The convolution) To better understand the structure of the particular solution (2.4), let us recall that if h_1 and h_2 are suitable functions (for example, if h_1 and h_2 are piecewise continuous on R with h_1 bounded and h_2 absolutely integrable over R, or if h_1 and h_2 are two signals, that is, piecewise continuous on R and vanishing for x < 0), then their convolution h_1 ∗ h_2 is defined as

    h_1 ∗ h_2(x) = \int_{-∞}^{+∞} h_1(x - t) h_2(t)\, dt.

The change of variables x − t = s shows that convolution is commutative:

    h_1 ∗ h_2(x) = h_2 ∗ h_1(x) = \int_{-∞}^{+∞} h_1(s) h_2(x - s)\, ds.

Moreover, one can show that convolution is associative: (h_1 ∗ h_2) ∗ h_3 = h_1 ∗ (h_2 ∗ h_3). If h_1 and h_2 are two signals, it is easy to verify that h_1 ∗ h_2 is also a signal and that for any x ∈ R one has

    h_1 ∗ h_2(x) = \int_0^x h_1(x - t) h_2(t)\, dt.

In a similar way, if h_1 and h_2 vanish for x > 0, the same holds for h_1 ∗ h_2 and we have

    h_1 ∗ h_2(x) = \int_x^0 h_1(x - t) h_2(t)\, dt.

If θ denotes the Heaviside step function, given by

    θ(x) = 1 if x ≥ 0,   θ(x) = 0 if x < 0,

and if we let

    θ̃(x) = −θ(−x),   i.e., θ̃(x) = 0 if x > 0, θ̃(x) = −1 if x ≤ 0,

then for any two functions g and f we have

    \int_0^x g(x - t) f(t)\, dt = θg ∗ θf(x) if x ≥ 0,   = −θ̃g ∗ θ̃f(x) if x ≤ 0.   (2.5)

In other words, the integral \int_0^x g(x - t) f(t)\, dt is the convolution of θg and θf for x ≥ 0, and it is the opposite of the convolution of θ̃g and θ̃f for x ≤ 0.
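The signal form of the convolution and its commutativity can be checked numerically. The code below is our own sketch (not from the book); the signals e^x and sin x and the evaluation point are arbitrary test choices.

```python
import math

def simpson(fn, lo, hi, m=200):
    # composite Simpson rule with 2*m subintervals
    h = (hi - lo) / (2 * m)
    s = fn(lo) + fn(hi)
    for k in range(1, 2 * m):
        s += (4 if k % 2 else 2) * fn(lo + k * h)
    return s * h / 3

def conv(h1, h2, x):
    """Convolution of two signals (functions taken as 0 for negative argument):
    for x ≥ 0 it reduces to the finite integral ∫_0^x h1(x − t) h2(t) dt."""
    return simpson(lambda t: h1(x - t) * h2(t), 0.0, x)

h1 = math.exp
h2 = math.sin
x = 1.4
# commutativity: h1 * h2 = h2 * h1
assert abs(conv(h1, h2, x) - conv(h2, h1, x)) < 1e-10
# closed form: ∫_0^x e^{x−t} sin t dt = (e^x − cos x − sin x)/2
assert abs(conv(h1, h2, x) - (math.exp(x) - math.cos(x) - math.sin(x)) / 2) < 1e-9
```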

2.2 n = 2

Consider the second-order non-homogeneous linear differential equation

    y'' + ay' + by = f(x),   (2.6)

where y'' = d²y/dx², a, b are real constants, and the forcing term f is a real-valued continuous function in an interval I ⊂ R, i.e., f ∈ C^0(I). For f = 0 we get the associated homogeneous equation

    y'' + ay' + by = 0.   (2.7)

We will write (2.6) and (2.7) in operator form as L y = f(x) and L y = 0, where L is the linear second-order differential operator with constant coefficients defined by L y = y'' + ay' + by, for any function y at least twice differentiable. Denoting by d/dx the differentiation operator, we have

    L = (d/dx)^2 + a (d/dx) + b.   (2.8)


L defines a map C²(R) → C⁰(R) that associates to each real-valued function y, twice differentiable on R with continuous second derivative, the continuous function L y. The fundamental property of L is its linearity, that is,

    L(c_1 y_1 + c_2 y_2) = c_1 L y_1 + c_2 L y_2,   (2.9)

∀c_1, c_2 ∈ R, ∀y_1, y_2 ∈ C²(R). This formula implies some important facts. First, if y_1 and y_2 are any two solutions of the homogeneous equation on R, then any linear combination c_1 y_1 + c_2 y_2 is also a solution of this equation on R. In other words, the set V = {y ∈ C²(R) : L y = 0} is a real vector space. We will prove that dim V = 2, and we will see how to get a basis of V. Note that if y ∈ V, then y ∈ C^∞(R), as follows from (2.7) by isolating y'' and differentiating any number of times. Secondly, if y_1 and y_2 are two solutions of (2.6) on a given interval I' ⊂ I, then their difference y_1 − y_2 solves (2.7). It follows that if we know a particular solution y_p of the non-homogeneous equation on an interval I', then any other solution of (2.6) on I' is given by y_p + y_h, where y_h is a solution of the associated homogeneous equation. We shall see that the solutions of (2.6) are defined on the whole of the interval I on which f is continuous (and are of course of class C² there). The fact that L has constant coefficients (i.e., a and b in (2.8) are constants) allows one to find explicit formulas for the solutions of (2.6) and (2.7). To this end, it is useful to consider complex-valued functions y : R → C. If y = y_1 + i y_2 (with y_1, y_2 : R → R) is such a function, the derivative y' may be defined by linearity as y' = y_1' + i y_2'. It follows that L(y_1 + i y_2) = L y_1 + i L y_2. In a similar way one defines the integral of y:

    \int y(x)\, dx = \int y_1(x)\, dx + i \int y_2(x)\, dx,    \int_c^d y(x)\, dx = \int_c^d y_1(x)\, dx + i \int_c^d y_2(x)\, dx.

The theorem of the additive constant for primitives and the Fundamental Theorem of Calculus extend to complex-valued functions. It is then easy to verify, using either Euler's formula or the series expansion of the exponential function and differentiation term by term, that

    (d/dx) e^{λx} = λ e^{λx},   ∀λ ∈ C.

It follows that L e^{λx} = (λ² + aλ + b) e^{λx}, and the complex exponential e^{λx} is a solution of (2.7) if and only if λ is a root of the characteristic polynomial p(λ) = λ² + aλ + b. Let λ_1, λ_2 ∈ C be the roots of p(λ), so that p(λ) = (λ − λ_1)(λ − λ_2).


The operator L factors in a similar way as a product (composition) of first-order differential operators:

    L = (d/dx − λ_1)(d/dx − λ_2).   (2.10)

Indeed we have

    (d/dx − λ_1)(d/dx − λ_2) y = (d/dx − λ_1)(y' − λ_2 y) = y'' − (λ_1 + λ_2) y' + λ_1 λ_2 y,

which coincides with L y since λ_1 + λ_2 = −a and λ_1 λ_2 = b. Note that in (2.10) the order in which the two factors are composed is unimportant. In other words, the two operators (d/dx − λ_1) and (d/dx − λ_2) commute:

    (d/dx − λ_1)(d/dx − λ_2) = (d/dx − λ_2)(d/dx − λ_1).
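Since the factorization of L mirrors that of the characteristic polynomial, both can be checked numerically. This sketch is our own (the coefficients a = 2, b = 5 and the test points are arbitrary choices); it verifies p(μ) = (μ − λ_1)(μ − λ_2) and the relation L e^{μx} = p(μ) e^{μx} via finite differences.

```python
import cmath

a, b = 2.0, 5.0                       # test operator L y = y'' + 2y' + 5y, roots −1 ± 2i
disc = cmath.sqrt(a * a - 4 * b)
l1, l2 = (-a + disc) / 2, (-a - disc) / 2

# factorization of the characteristic polynomial: p(μ) = (μ − λ1)(μ − λ2)
for mu in (0.3, -1.5, 2 + 1j):
    p = mu * mu + a * mu + b
    assert abs(p - (mu - l1) * (mu - l2)) < 1e-12

# consequently L e^{μx} = p(μ) e^{μx}: check at one point with finite differences
mu, x, h = 0.3, 0.9, 1e-4
e = lambda t: cmath.exp(mu * t)
Le = ((e(x + h) - 2 * e(x) + e(x - h)) / h**2
      + a * (e(x + h) - e(x - h)) / (2 * h)
      + b * e(x))
assert abs(Le - (mu * mu + a * mu + b) * e(x)) < 1e-4
```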

The idea is now to use (2.10) to reduce the problem to first-order differential equations. It is useful to consider linear differential equations with complex coefficients, whose solutions will be, in general, complex-valued. For example, the first-order homogeneous equation y' − λy = 0 with λ ∈ C has the general solution

    y(x) = k e^{λx}   (k ∈ C).

(Indeed, if y' = λy, then (d/dx)(y(x) e^{-λx}) = 0, whence y(x) e^{-λx} = k.) The first-order non-homogeneous equation

    y' − λy = (d/dx − λ) y = f(x)   (λ ∈ C),

with complex-valued forcing term f : I ⊂ R → C continuous in I ∋ 0, has the general solution

    y(x) = e^{λx} \int e^{-λx} f(x)\, dx
         = e^{λx} \int_0^x e^{-λt} f(t)\, dt + k e^{λx}
         = \int_0^x g_λ(x - t) f(t)\, dt + k g_λ(x)   (x ∈ I, k = y(0) ∈ C).   (2.11)

Here g_λ(x) = e^{λx} is the impulsive response of the differential operator (d/dx − λ). It is the (unique) solution of y' − λy = 0, y(0) = 1. Formula (2.11) can be proved as (2.3) in the real case. In particular, the solution of the first-order problem

    y' − λy = f(x),   y(0) = 0   (λ ∈ C)

is unique and is given by

    y(x) = \int_0^x e^{λ(x-t)} f(t)\, dt.   (2.12)

The following result gives a particular solution of (2.6) as a convolution integral.

Theorem 2.2  Let f ∈ C⁰(I), I an interval, and suppose that 0 ∈ I. Then the initial value problem

    y″ + ay′ + by = f(x),  y(0) = 0, y′(0) = 0    (2.13)

has a unique solution, defined on the whole of I, and given by the formula

    y(x) = ∫_0^x g(x − t) f(t) dt    (x ∈ I),    (2.14)

where g is the function defined by

    g(x) = ∫_0^x e^{λ2(x−t)} e^{λ1 t} dt    (x ∈ R).    (2.15)

In particular, if we take f = 0, we get that the only solution of the homogeneous problem

    y″ + ay′ + by = 0,  y(0) = 0, y′(0) = 0    (2.16)

is the zero function y = 0.

Proof  We rewrite the differential equation (2.6) in the form

    (d/dx − λ1)(d/dx − λ2) y = f(x).

Letting

    h = (d/dx − λ2) y = y′ − λ2 y,

we see that y solves the problem (2.13) if and only if h solves

    h′ − λ1 h = f(x),  h(0) = 0,

and y solves

    y′ − λ2 y = h(x),  y(0) = 0.

From (2.12) we have

    h(x) = ∫_0^x e^{λ1(x−t)} f(t) dt,    y(x) = ∫_0^x e^{λ2(x−t)} h(t) dt.


Substituting h from the first formula into the second, we obtain y(x) as a repeated integral (for any x ∈ I):

    y(x) = ∫_0^x e^{λ2(x−t)} ( ∫_0^t e^{λ1(t−s)} f(s) ds ) dt.    (2.17)

To fix ideas, let us suppose that x > 0. Then in the integral with respect to t we have 0 ≤ t ≤ x, whereas in the integral with respect to s we have 0 ≤ s ≤ t. We can then rewrite y(x) as a double integral:

    y(x) = e^{λ2 x} ∬_{Tx} e^{(λ1−λ2)t} e^{−λ1 s} f(s) ds dt,

where Tx is the triangle in the (s, t) plane defined by 0 ≤ s ≤ t ≤ x, with vertices at the points (0, 0), (0, x), (x, x). In (2.17) we first integrate with respect to s and then with respect to t. Since the triangle Tx is convex both horizontally and vertically, and since the integrand F(s, t) = e^{(λ1−λ2)t} e^{−λ1 s} f(s) is continuous in Tx, we can interchange the order of integration and integrate with respect to t first. Given s (between 0 and x), the variable t in Tx varies between s and x.

[Figure: the triangle Tx in the (s, t) plane; for fixed s, the variable t ranges over the vertical segment from t = s to t = x.]

We thus obtain

    y(x) = ∫_0^x ( ∫_s^x e^{λ2(x−t)} e^{λ1(t−s)} dt ) f(s) ds.    (2.18)

By substituting t with t + s in the integral with respect to t we finally get

    y(x) = ∫_0^x ( ∫_0^{x−s} e^{λ2(x−s−t)} e^{λ1 t} dt ) f(s) ds    (2.19)
         = ∫_0^x g(x − s) f(s) ds,


which is (2.14). For x < 0 we can reason in a similar way and we get the same result. ∎

Remark 2.3  The Fubini Theorem is not really needed to prove the equality (2.17) = (2.18). Indeed, the integrals in (2.17) and (2.18) are iterated integrals of continuous functions, and the exchange of the order of integration is a consequence of the relationship between the derivative and the integral. More precisely, consider the two functions

    G(x) = ∫_0^x ( ∫_0^t F(s, t) ds ) dt,    H(x) = ∫_0^x ( ∫_s^x F(s, t) dt ) ds.

By the Fundamental Theorem of Calculus, G′(x) = ∫_0^x F(s, x) ds. The differentiability of H(x) and the equality H′(x) = G′(x) follow by a simple adaptation of the usual argument from the Fundamental Theorem of Calculus (as in the proof of (2.26), see also (2.49)). The equality G(x) = H(x) follows since G′(x) = H′(x) and G(0) = H(0) = 0.

The integral in formula (2.15) can be computed exactly as in the real field. We obtain the following expression of the function g:

(1) if λ1 ≠ λ2 (⇔ Δ := a² − 4b ≠ 0), then

    g(x) = (1/(λ1 − λ2)) ( e^{λ1 x} − e^{λ2 x} );    (2.20)

(2) if λ1 = λ2 (⇔ Δ = 0), then

    g(x) = x e^{λ1 x}.    (2.21)
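The two closed forms can be compared with the defining integral (2.15) numerically; a small sketch with illustrative sample roots (real distinct, complex conjugate, repeated):

```python
import cmath

# Check of the closed forms (2.20)/(2.21) against the defining integral (2.15);
# the sample roots are illustrative.
def g_integral(l1, l2, x, n=2000):
    h = x / n
    vals = [cmath.exp(l2 * (x - k * h)) * cmath.exp(l1 * k * h)
            for k in range(n + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

def g_closed(l1, l2, x):
    if l1 != l2:                          # formula (2.20)
        return (cmath.exp(l1 * x) - cmath.exp(l2 * x)) / (l1 - l2)
    return x * cmath.exp(l1 * x)          # formula (2.21)

x = 0.8
for l1, l2 in [(2.0, -1.0), (1.0 + 2.0j, 1.0 - 2.0j), (3.0, 3.0)]:
    assert abs(g_integral(l1, l2, x) - g_closed(l1, l2, x)) < 1e-5
# complex-conjugate roots give a real g, in accordance with (2.22) below
assert abs(g_closed(1.0 + 2.0j, 1.0 - 2.0j, x).imag) < 1e-12
```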

Note that g is always a real function. Letting α = −a/2 and

    β = √(−Δ)/2 if Δ < 0,    β = √Δ/2 if Δ > 0,

so that

    λ1,2 = α ± iβ if Δ < 0,    λ1,2 = α ± β if Δ > 0,

we have

    g(x) = (1/2iβ) ( e^{(α+iβ)x} − e^{(α−iβ)x} ) = (1/β) e^{αx} sin(βx)    if Δ < 0,
    g(x) = (1/2β) ( e^{(α+β)x} − e^{(α−β)x} ) = (1/β) e^{αx} sh(βx)    if Δ > 0.    (2.22)

Also notice that g ∈ C∞(R). It is easy to check that g solves the following homogeneous initial value problem:

    y″ + ay′ + by = 0,  y(0) = 0, y′(0) = 1.    (2.23)

Conversely, if y solves (2.23), then y = g given by (2.15). This is proved using (2.10)–(2.11) and reasoning as in the proof of Theorem 2.2. The function g is called


the impulsive response of the differential operator L. The name comes from the initial conditions in (2.23), namely g(0) = 0, g′(0) = 1.

Remark 2.4 (The impulsive response as a convolution)  Denoting g by gλ1λ2 and recalling (2.5), we can rewrite (2.15) in the suggestive form

    gλ1λ2(x) = ∫_0^x gλ2(x − t) gλ1(t) dt = (θgλ2 ∗ θgλ1)(x) if x ≥ 0,  −(θ̃gλ2 ∗ θ̃gλ1)(x) if x ≤ 0,    (2.24)

showing that g is symmetric in λ1 ↔ λ2. This formula relates the (complex) first-order impulsive responses to the impulsive response of order 2, and will be generalized to linear equations of order n (cf. (2.64)). We also observe that the proof of Theorem 2.2 can be simplified, at least at a formal level, by rewriting (2.17) in convolution form (for example for x ≥ 0) and then using the associativity of convolution:

    y(x) = ( θgλ2 ∗ (θgλ1 ∗ θf) )(x) = ( (θgλ2 ∗ θgλ1) ∗ θf )(x).    (2.25)
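The symmetry of (2.24) in λ1 ↔ λ2 can also be confirmed numerically, even though the integrand itself is not symmetric; the roots and the evaluation point below are illustrative.

```python
import cmath

# Direct check that the convolution in (2.24) is symmetric in λ1 ↔ λ2.
def conv(l_out, l_in, x, n=4000):
    # ∫_0^x g_{λout}(x − t) g_{λin}(t) dt with gλ(t) = e^{λt}
    h = x / n
    vals = [cmath.exp(l_out * (x - k * h)) * cmath.exp(l_in * k * h)
            for k in range(n + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

l1, l2, x = 0.5 + 1.0j, -2.0, 1.3
assert abs(conv(l2, l1, x) - conv(l1, l2, x)) < 1e-6
```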

This is just the equality (2.17) = (2.19) for x ≥ 0. The interchange of the order of integration in the double integral considered above is then equivalent to the associative property of convolution for signals. Now (2.25) and (2.24) imply (2.14) for x ≥ 0.

It is interesting to verify directly that the function y given by (2.14) solves (2.6). First let us prove the following formula for the derivative y′:

    y′(x) = (d/dx) ∫_0^x g(x − t) f(t) dt = g(0) f(x) + ∫_0^x g′(x − t) f(t) dt    (x ∈ I).    (2.26)

Indeed, given h such that x + h ∈ I, we have

    (1/h) ( y(x + h) − y(x) ) = (1/h) ( ∫_0^{x+h} g(x + h − t) f(t) dt − ∫_0^x g(x − t) f(t) dt ).    (2.27)

As g ∈ C²(R), we can apply Taylor's formula with the Lagrange remainder,

    g(x0 + h) = g(x0) + g′(x0) h + (1/2) g″(ξ) h²,

at the point x0 = x − t, where ξ is some point between x0 and x0 + h. Substituting this in (2.27) and using

    ∫_0^{x+h} g(x − t) f(t) dt = ∫_0^x g(x − t) f(t) dt + ∫_x^{x+h} g(x − t) f(t) dt


gives

    (1/h) ( y(x + h) − y(x) ) = (1/h) ∫_x^{x+h} g(x − t) f(t) dt + ∫_0^{x+h} g′(x − t) f(t) dt + (h/2) ∫_0^{x+h} g″(ξ) f(t) dt,    (2.28)

for some ξ between x − t and x − t + h. When h tends to zero, the first term on the right-hand side of (2.28) tends to g(0) f(x), by the Fundamental Theorem of Calculus. The second term in (2.28) tends to ∫_0^x g′(x − t) f(t) dt, by the continuity of the integral function. Finally, the third term tends to zero, since the integral that occurs in it is a bounded function of h in a neighborhood of h = 0. (This is easy to prove.) We thus obtain formula (2.26). Recalling that g(0) = 0, we finally get

    y′(x) = ∫_0^x g′(x − t) f(t) dt    (x ∈ I).

In the same way we compute the second derivative:

    y″(x) = (d²/dx²) ∫_0^x g(x − t) f(t) dt = (d/dx) ∫_0^x g′(x − t) f(t) dt
          = g′(0) f(x) + ∫_0^x g″(x − t) f(t) dt = f(x) + ∫_0^x g″(x − t) f(t) dt,

where we used g′(0) = 1. It follows that

    y″(x) + a y′(x) + b y(x) = f(x) + ∫_0^x ( g″ + a g′ + b g )(x − t) f(t) dt = f(x),    ∀x ∈ I,

g being a solution of the homogeneous equation. Therefore the function y given by (2.14) solves (2.6) in the interval I. The initial conditions y(0) = 0 = y′(0) are immediately verified.

We now come to the solution of the initial value problem with arbitrary initial data at the point x = 0.

Theorem 2.5  Let f ∈ C⁰(I), I an interval, 0 ∈ I, and let y0, y0′ be two arbitrary real numbers. Then the initial value problem




    y″ + ay′ + by = f(x),  y(0) = y0, y′(0) = y0′    (2.29)

has a unique solution, defined on the whole of I, and given by

    y(x) = ∫_0^x g(x − t) f(t) dt + (y0′ + a y0) g(x) + y0 g′(x)    (x ∈ I).    (2.30)

In particular (taking f = 0), the solution of the homogeneous problem

    y″ + ay′ + by = 0,  y(0) = y0, y′(0) = y0′    (2.31)

is unique, of class C∞ on the whole of R, and is given by

    yh(x) = (y0′ + a y0) g(x) + y0 g′(x)    (x ∈ R).    (2.32)
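A quick numerical sanity check of (2.30), on the sample equation y″ + y′ − 2y = f(x) (the coefficients, initial data and forcing term are illustrative choices, not from the text): the formula reproduces the initial data and satisfies the equation to finite-difference accuracy.

```python
import math

# Check of formula (2.30) for y″ + y′ − 2y = f(x): a = 1, b = −2, λ1 = 1, λ2 = −2.
a, b = 1.0, -2.0
l1, l2 = 1.0, -2.0
y0, y0p = 0.3, -1.1
f = lambda x: math.cos(2 * x)

g = lambda x: (math.exp(l1 * x) - math.exp(l2 * x)) / (l1 - l2)       # (2.20)
gp = lambda x: (l1 * math.exp(l1 * x) - l2 * math.exp(l2 * x)) / (l1 - l2)

def y(x, n=2000):
    h = x / n
    vals = [g(x - k * h) * f(k * h) for k in range(n + 1)]
    integral = h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])
    return integral + (y0p + a * y0) * g(x) + y0 * gp(x)

eps = 1e-4
assert abs(y(0.0) - y0) < 1e-9                                # y(0) = y0
assert abs((y(eps) - y(-eps)) / (2 * eps) - y0p) < 1e-6       # y′(0) = y0′
x = 0.9
ypp = (y(x + eps) - 2 * y(x) + y(x - eps)) / eps**2
yp = (y(x + eps) - y(x - eps)) / (2 * eps)
assert abs(ypp + a * yp + b * y(x) - f(x)) < 1e-3             # L y = f
```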

Proof  The uniqueness of the solutions of the problem (2.29) follows from the fact that if y1 and y2 both solve (2.29), then their difference ỹ = y1 − y2 solves the problem (2.16), whence ỹ = 0 by Theorem 2.2. Now notice that the function g′ satisfies the homogeneous equation (like g). Indeed, since L has constant coefficients, we have

    L g′ = L (d/dx) g = ( (d/dx)² + a (d/dx) + b ) (d/dx) g = (d/dx) L g = 0.

By the linearity of L and by Theorem 2.2 it follows that the function y given by (2.30) satisfies (L y)(x) = f(x), ∀x ∈ I. It is immediate that y(0) = y0. Finally, since

    y′(x) = ∫_0^x g′(x − t) f(t) dt + (y0′ + a y0) g′(x) + y0 g″(x),

we have

    y′(0) = y0′ + a y0 + y0 g″(0) = y0′ + a y0 + y0 ( −a g′(0) − b g(0) ) = y0′.

It is also possible to give a constructive proof, analogous to that of Theorem 2.2. Indeed, by proceeding as in the proof of that theorem and using (2.11), we find that y solves the problem (2.29) if and only if y is given by

    y(x) = ∫_0^x g(x − s) f(s) ds + (y0′ − λ2 y0) g(x) + y0 e^{λ2 x}.


This formula agrees with (2.30) in view of the equality e^{λ2 x} = g′(x) − λ1 g(x), which follows from (2.23) and (2.10) by observing that the function h(x) = g′(x) − λ1 g(x) solves (d/dx − λ2) h = 0, h(0) = 1. ∎

By imposing the initial conditions at an arbitrary point of the interval I, we get the following result.

Theorem 2.6  Let f ∈ C⁰(I), I an interval, x0 ∈ I, y0, y0′ ∈ R. The solution of the initial value problem

    y″ + ay′ + by = f(x),  y(x0) = y0, y′(x0) = y0′    (2.33)

is unique, it is defined on the whole of I, and is given by

    y(x) = ∫_{x0}^x g(x − t) f(t) dt + (y0′ + a y0) g(x − x0) + y0 g′(x − x0)    (x ∈ I).    (2.34)

In particular (taking f = 0), the solution of the homogeneous problem

    y″ + ay′ + by = 0,  y(x0) = y0, y′(x0) = y0′,    (2.35)

with x0 ∈ R arbitrary, is unique, of class C∞ on the whole of R, and is given by

    yh(x) = (y0′ + a y0) g(x − x0) + y0 g′(x − x0)    (x ∈ R).    (2.36)

Proof  Let y be a solution of the problem (2.33), and let τx0 denote the translation map defined by τx0(x) = x + x0. Set ỹ(x) = (y ∘ τx0)(x) = y(x + x0). Since the differential operator L has constant coefficients, it is invariant under translations, that is, L(y ∘ τx0) = (L y) ∘ τx0. It follows that (L ỹ)(x) = (L y)(x + x0) = f(x + x0). Since moreover ỹ(0) = y(x0) = y0 and ỹ′(0) = y′(x0) = y0′, we see that y solves the problem (2.33) in the interval I if and only if ỹ solves the initial value problem

    ỹ″ + a ỹ′ + b ỹ = f̃(x),  ỹ(0) = y0, ỹ′(0) = y0′

in the translated interval I − x0 = {t − x0 : t ∈ I} and with the translated forcing term f̃(x) = f(x + x0). By Theorem 2.5, ỹ is unique and is given by


    ỹ(x) = ∫_0^x g(x − s) f̃(s) ds + (y0′ + a y0) g(x) + y0 g′(x).

Therefore y is also unique and is given by

    y(x) = ỹ(x − x0) = ∫_0^{x−x0} g(x − x0 − s) f(s + x0) ds + (y0′ + a y0) g(x − x0) + y0 g′(x − x0).

Formula (2.34) follows immediately from this by making the change of variable s + x0 = t in the integral with respect to s. ∎

Substitution of (2.21) and (2.22) in (2.36) yields the following formula for the solution of the homogeneous problem (2.35):

    yh(x) = y0 e^{α(x−x0)} cos β(x − x0) + (1/β)(y0′ − α y0) e^{α(x−x0)} sin β(x − x0)    if Δ < 0,
    yh(x) = y0 e^{α(x−x0)} ch β(x − x0) + (1/β)(y0′ − α y0) e^{α(x−x0)} sh β(x − x0)    if Δ > 0,
    yh(x) = e^{α(x−x0)} ( y0 + (y0′ − α y0)(x − x0) )    if Δ = 0,    (2.37)

where for Δ = 0 we let α = λ1 (so that, in all cases, α = −a/2). Note the similarity between the solutions with Δ ≠ 0: the trigonometric functions for Δ < 0 are replaced by the hyperbolic ones for Δ > 0. Also note that for β → 0, both solutions with Δ ≠ 0 go over to the solution with Δ = 0.

Remark 2.7 (Existence, uniqueness and extendability of the solutions)  It is well known that a linear initial value problem with variable coefficients (homogeneous or not) has a unique solution defined on the entire interval I in which the coefficients and the forcing term are continuous (on the whole of R in the homogeneous constant-coefficient case). (See, e.g., [8], Theorems 1 and 3, pp. 104–105, for the homogeneous variable-coefficient case.) Now Theorems 2.2, 2.5 and 2.6 give an elementary proof of this fact (together with an explicit formula for the solution) for a linear second-order problem with constant coefficients.

Corollary 2.8  Let f ∈ C⁰(I) and let x0 ∈ I be fixed. Every solution y of L y = f(x) in the interval I can be written as y = yp + yh, where

    yp(x) = ∫_{x0}^x g(x − t) f(t) dt    (2.38)

solves (2.6) with the initial conditions yp(x0) = yp′(x0) = 0, and yh is the solution of the homogeneous equation (2.7) such that yh(x0) = y(x0), yh′(x0) = y′(x0).

Proof  Let y be any solution of (2.6) in I, and let y0 = y(x0), y0′ = y′(x0). Then y solves the problem (2.33), and by uniqueness y is given by (2.34). Thus y = yp + yh, where yp, given by (2.38), solves (2.6) with trivial initial conditions at x0, and yh, given by (2.36)–(2.37), solves (2.7) with the same initial conditions as y at x0. ∎
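The three cases of formula (2.37) can be cross-checked against a direct numerical integration of the homogeneous equation; below is a sketch using a classical RK4 scheme, with illustrative coefficients chosen so that Δ < 0, Δ = 0 and Δ > 0 each occur.

```python
import math

# Compare the closed form (2.37) with RK4 integration of y″ + ay′ + by = 0.
def closed_form(a, b, x0, y0, y0p, x):
    alpha, delta = -a / 2.0, a * a - 4.0 * b
    s = x - x0
    if delta < 0:
        beta = math.sqrt(-delta) / 2.0
        return math.exp(alpha * s) * (y0 * math.cos(beta * s)
                                      + (y0p - alpha * y0) / beta * math.sin(beta * s))
    if delta > 0:
        beta = math.sqrt(delta) / 2.0
        return math.exp(alpha * s) * (y0 * math.cosh(beta * s)
                                      + (y0p - alpha * y0) / beta * math.sinh(beta * s))
    return math.exp(alpha * s) * (y0 + (y0p - alpha * y0) * s)

def rk4(a, b, x0, y0, y0p, x, n=4000):
    h, u, v = (x - x0) / n, y0, y0p            # v = y′
    F = lambda u, v: (v, -a * v - b * u)       # first-order system
    for _ in range(n):
        k1 = F(u, v)
        k2 = F(u + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = F(u + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = F(u + h * k3[0], v + h * k3[1])
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return u

for a, b in [(0.4, 4.0), (2.0, 1.0), (1.0, -6.0)]:   # Δ < 0, Δ = 0, Δ > 0
    err = abs(closed_form(a, b, 0.5, 1.2, -0.7, 2.0)
              - rk4(a, b, 0.5, 1.2, -0.7, 2.0))
    assert err < 1e-6, (a, b, err)
```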


Corollary 2.9  The set V of real solutions of the homogeneous equation (2.7) on R is a real vector space of dimension 2, and the two functions g, g′ form a basis of V.

Proof  Let y be a real solution of (2.7) on R. Let y0 = y(0), y0′ = y′(0), and define yh by formula (2.32). Then y and yh both satisfy (2.31), whence y = yh by Theorem 2.5. Note that y ∈ C∞(R). We conclude by (2.32) that every element of the vector space V can be written as a linear combination of g and g′ in a unique way. (The coefficients in this combination are uniquely determined by the initial data at the point x = 0.) In particular, this holds for the trivial solution y(x) = 0 ∀x: cg(x) + dg′(x) = 0 for c, d ∈ R and ∀x implies c = d = 0. Thus g and g′ are linearly independent and form a basis of V. ∎

Another basis of V can be obtained from the following result and its corollary.

Theorem 2.10  Every complex solution of the homogeneous equation L y = 0 can be written in the form y = c1 y1 + c2 y2 (c1, c2 ∈ C), where y1 and y2 are given by

    (y1(x), y2(x)) = (e^{λ1 x}, e^{λ2 x})  if λ1 ≠ λ2,    (y1(x), y2(x)) = (e^{λ1 x}, x e^{λ1 x})  if λ1 = λ2.

Conversely, any function of this form is a solution of L y = 0.

Proof 1  This proof does not use the impulsive response but only factorization. It can be used to give an independent proof of existence and uniqueness for the solutions of the homogeneous initial value problem (2.35). (The constants c1, c2 can be uniquely determined in terms of y0, y0′.) By (2.10), we rewrite the homogeneous differential equation (2.7) in the form

    L y = (d/dx − λ1)(d/dx − λ2) y = 0.

We now argue as in the proof of Theorem 2.2. Letting h = (d/dx − λ2) y = y′ − λ2 y, we see that y solves L y = 0 if and only if

    h′ − λ1 h = 0,  y′ − λ2 y = h(x)    ⇐⇒    h(x) = k e^{λ1 x} (k ∈ C),  y′ − λ2 y = k e^{λ1 x}.

By formula (2.2) (which holds for a ∈ C as well, see (2.11)), we get that y solves L y = 0 if and only if y is given by


    y(x) = e^{λ2 x} ∫ e^{−λ2 x} k e^{λ1 x} dx = k e^{λ2 x} ∫ e^{(λ1−λ2)x} dx
         = k e^{λ2 x} ( e^{(λ1−λ2)x}/(λ1 − λ2) + k′ )  if λ1 ≠ λ2,    k e^{λ2 x} (x + k′)  if λ1 = λ2,
         = c1 e^{λ1 x} + c2 e^{λ2 x}  if λ1 ≠ λ2,    c1 e^{λ1 x} + c2 x e^{λ1 x}  if λ1 = λ2,

where k′ ∈ C, and we have set c1 = k/(λ1 − λ2), c2 = k k′ for λ1 ≠ λ2, and c1 = k k′, c2 = k for λ1 = λ2. This proves the first part of the Theorem. The converse part also follows from this. Indeed, for λ1 ≠ λ2 we already know that y1 and y2 solve L y = 0, so by (complex) linearity any linear combination c1 y1 + c2 y2 with c1, c2 ∈ C solves L y = 0. For λ1 = λ2 we know that y1(x) = e^{λ1 x} is a solution of L y = 0. Taking k = 1, k′ = 0 in the previous formulae, we see that y2(x) = x e^{λ1 x} is also a solution of L y = 0, so the result follows by linearity. ∎

Proof 2  This proof uses the impulsive response and is well suited for generalization to higher-order equations. We observe that any complex solution of L y = 0 on R can be written as a complex linear combination of g and g′ in a unique way. Indeed, if y = Re y + i Im y solves L y = 0, then L(Re y) = 0 = L(Im y), since L has real coefficients. By Corollary 2.9, Re y and Im y are unique real linear combinations of g and g′, so y is a unique complex linear combination of g and g′. In particular, the set VC of complex solutions of the homogeneous equation on R is a complex vector space of dimension 2, with a basis given by {g, g′}. Now by (2.20) and (2.21) we see that any linear combination of g and g′ can be rewritten as a linear combination of y1 and y2. Indeed, if y = cg + dg′ with c, d ∈ C, then y = c1 y1 + c2 y2, where

    c1 = (c + dλ1)/(λ1 − λ2),  c2 = (c + dλ2)/(λ2 − λ1)    for λ1 ≠ λ2,
    c1 = d,  c2 = c + dλ1    for λ1 = λ2.

The converse part follows by verifying that y1 and y2 solve L y = 0. The case λ1 ≠ λ2 is clear. If λ1 = λ2, then L = (d/dx − λ1)² by (2.10), and not only L y1 = 0, but also

    L y2(x) = (d/dx − λ1) [ (d/dx − λ1)(x e^{λ1 x}) ] = (d/dx − λ1) e^{λ1 x} = 0.

Note that in this case y2 is just the impulsive response g. ∎

Corollary 2.11  If λ1 and λ2 are real, then the functions y1 and y2 in Theorem 2.10 form a basis of V. If λ1,2 = α ± iβ with α, β ∈ R, β ≠ 0, then a basis of V is given by the functions


    ỹ1(x) = Re y1(x) = e^{αx} cos βx,    ỹ2(x) = Im y1(x) = e^{αx} sin βx.
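That ỹ1 and ỹ2 solve the homogeneous equation (whose coefficients are then a = −2α, b = α² + β²) can be confirmed by a finite-difference check; the values of α and β below are illustrative.

```python
import math

# Finite-difference check that e^{αx} cos βx and e^{αx} sin βx solve
# y″ + ay′ + by = 0 with a = −2α, b = α² + β².
alpha, beta = -0.5, 3.0
a, b = -2 * alpha, alpha**2 + beta**2

y1 = lambda x: math.exp(alpha * x) * math.cos(beta * x)
y2 = lambda x: math.exp(alpha * x) * math.sin(beta * x)

eps = 1e-5
for y in (y1, y2):
    for x in (-0.8, 0.0, 1.3):
        ypp = (y(x + eps) - 2 * y(x) + y(x - eps)) / eps**2
        yp = (y(x + eps) - y(x - eps)) / (2 * eps)
        assert abs(ypp + a * yp + b * y(x)) < 1e-4
```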

Proof  By Theorem 2.10 the functions y1 and y2 form a basis of VC. Indeed, they belong to VC and every element of VC is a linear combination of them; since dim VC = 2, y1 and y2 must be linearly independent. If λ1 and λ2 are real, then y1 and y2 are real-valued functions and form a basis of V as well. Indeed, they are linearly independent in VC, thus also in V, and dim V = 2. On the other hand, if λ1,2 = α ± iβ with β ≠ 0, then using ỹ1 = (1/2)(y1 + y2), ỹ2 = (1/2i)(y1 − y2), we see that the functions ỹ1 and ỹ2 solve L y = 0 and are linearly independent in VC (and thus also in V). Indeed, if c ỹ1 + d ỹ2 = 0 for c, d ∈ C, then we get (1/2)(c − id) y1 + (1/2)(c + id) y2 = 0, whence c ± id = 0 by the linear independence of y1 and y2, and finally c = d = 0. Since ỹ1 and ỹ2 are linearly independent in V and dim V = 2, they form a basis of V. ∎

Remark 2.12 (Complex coefficients)  Theorems 2.2, 2.5, 2.6, 2.10 and Corollary 2.8 remain valid if L is a linear second-order differential operator with constant complex coefficients, i.e., a, b ∈ C in (2.8). In this case λ1 and λ2 in (2.10) are arbitrary complex numbers. Moreover, the forcing term f can be a complex-valued function, and the initial data y0, y0′ can be complex numbers. The impulsive response g of L is defined again by (2.15)–(2.24), is given explicitly by formulas (2.20)–(2.21), and solves (2.23). Finally, the set of solutions of (2.7) on R is a complex vector space of dimension 2, with a basis given by {g, g′} or {y1, y2}. In the next section the results obtained here for n = 2 will be generalized to linear equations of order n, by working directly with complex coefficients and using induction on n.

Remark 2.13 (The approach by distributions)  The meaning of formulas (2.14)–(2.5), or (2.30)–(2.5), is best appreciated in the context of distribution theory.
Given a linear constant-coefficient differential equation, written in the symbolic form L y = f(x), we can study the equation L T = S for T and S in a suitable convolution algebra, for example the algebra D′+ of distributions with support in [0, +∞), which generalizes the algebra of signals. One can show that the elementary solution in D′+, that is, the solution of L T = δ, or equivalently, the inverse of Lδ (δ the Dirac distribution), is given by θg, where θ is the Heaviside step function and g the impulsive response of the differential operator L. If y solves L y = f(x) with L given by (2.8) and with trivial initial conditions at x = 0, then L(θy) = θf, whence θy = θg ∗ θf. This is just (2.14) for x ≥ 0. If y solves L y = f(x) with the initial conditions y(0) = y0, y′(0) = y0′, then one computes L(θy) = θf + (y0′ + a y0) δ + y0 δ′, whence

    θy = θg ∗ θf + (y0′ + a y0) θg + y0 θg′.

This is just (2.30) for x ≥ 0. This approach requires, however, a basic knowledge of distribution theory, which goes far beyond our goals and will not be used here. (See [16], chapters II and III, for a nice presentation.)


We now look at some examples.

Example 1  Solve the initial value problem

    y″ − 2y′ + y = e^x/(x + 2),  y(0) = 0, y′(0) = 0.

Solution. The characteristic polynomial is p(λ) = λ² − 2λ + 1 = (λ − 1)², so that λ1 = λ2 = 1, and the impulsive response is g(x) = x e^x. The forcing term f(x) = e^x/(x + 2) is continuous for x > −2 and for x < −2. Since the initial conditions are posed at x = 0, we can work in the interval I = (−2, +∞). By Theorem 2.2 we get the following solution of the proposed problem in I:

    y(x) = ∫_0^x (x − t) e^{x−t} ( e^t/(t + 2) ) dt = e^x ∫_0^x (x − t)/(t + 2) dt
         = e^x ∫_0^x ( (x + 2)/(t + 2) − 1 ) dt = e^x [ (x + 2) log(t + 2) − t ]_0^x
         = e^x ( (x + 2) log(x + 2) − x − (x + 2) log 2 )
         = e^x (x + 2) log( (x + 2)/2 ) − x e^x.
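The closed form found in Example 1 can be validated against a direct numerical evaluation of the convolution (2.14); a quick sketch:

```python
import math

# Numerical check of Example 1: g(x) = x e^x, f(x) = e^x/(x + 2).
f = lambda x: math.exp(x) / (x + 2.0)
g = lambda x: x * math.exp(x)

def y_conv(x, n=4000):
    h = x / n
    vals = [g(x - k * h) * f(k * h) for k in range(n + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

y_exact = lambda x: (math.exp(x) * (x + 2.0) * math.log((x + 2.0) / 2.0)
                     - x * math.exp(x))

for x in (-1.5, 0.5, 2.0):
    assert abs(y_conv(x) - y_exact(x)) < 1e-5, x
```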

Example 2  Solve the initial value problem

    y″ + y = 1/cos x,  y(0) = 0, y′(0) = 0.

Solution. We have p(λ) = λ² + 1, whence λ1,2 = ±i, and the impulsive response is g(x) = sin x. The initial data are given at x = 0 and we can work in the interval I = (−π/2, π/2), where the forcing term f(x) = 1/cos x is continuous. By formula (2.14) we get

    y(x) = ∫_0^x sin(x − t) (1/cos t) dt = ∫_0^x (sin x cos t − cos x sin t)(1/cos t) dt
         = sin x ∫_0^x dt − cos x ∫_0^x (sin t/cos t) dt = x sin x + cos x log(cos x).
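As a check, the closed form of Example 2 matches a direct numerical evaluation of the convolution (2.14) on (−π/2, π/2):

```python
import math

# Numerical check of Example 2: g(x) = sin x, f(x) = 1/cos x.
f = lambda x: 1.0 / math.cos(x)
g = lambda x: math.sin(x)

def y_conv(x, n=4000):
    h = x / n
    vals = [g(x - k * h) * f(k * h) for k in range(n + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

y_exact = lambda x: x * math.sin(x) + math.cos(x) * math.log(math.cos(x))

for x in (-1.2, 0.4, 1.3):
    assert abs(y_conv(x) - y_exact(x)) < 1e-4, x
```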

Example 3  Solve the initial value problem

    y″ − y′ = 1/ch x,  y(0) = 0, y′(0) = 0.


Solution. We have p(λ) = λ² − λ = λ(λ − 1), thus λ1 = 1, λ2 = 0, and the impulsive response is g(x) = e^x − 1. The forcing term f(x) = 1/ch x is continuous on the whole of R. Formula (2.14) gives

    y(x) = ∫_0^x ( e^{x−t} − 1 )(1/ch t) dt = e^x ∫_0^x (e^{−t}/ch t) dt − ∫_0^x (1/ch t) dt.

The two integrals are easily computed:

    ∫ (1/ch t) dt = ∫ 2/(e^t + e^{−t}) dt = ∫ 2e^t/(e^{2t} + 1) dt = 2 arctan(e^t) + C,

    ∫ (e^{−t}/ch t) dt = ∫ 2/(e^{2t} + 1) dt = ∫ ( 2 − 2e^{2t}/(1 + e^{2t}) ) dt = 2t − log(1 + e^{2t}) + C′.

We finally get

    y(x) = e^x [ 2t − log(1 + e^{2t}) ]_0^x − 2 [ arctan(e^t) ]_0^x
         = e^x ( 2x − log(1 + e^{2x}) + log 2 ) − 2 arctan(e^x) + π/2.

Exercises

1. Verify that (d/dx) e^{λx} = λ e^{λx}, ∀λ = α + iβ ∈ C, using Euler's formula e^{(α+iβ)x} = e^{αx} e^{iβx} = e^{αx}(cos βx + i sin βx).

2. Verify that (d/dx) e^{λx} = λ e^{λx}, ∀λ ∈ C, using the series expansion of the exponential function

    e^{λx} = Σ_{n=0}^{∞} (λx)^n/n! = Σ_{n=0}^{∞} (λ^n/n!) x^n

and differentiation term by term.

3. Solve the following initial value problems:

(a) y″ + 2y′ + y = e^{−x}/(x² + 1),  y(0) = 0, y′(0) = 0.

(b) y″ − 2y′ + 2y = e^x/(1 + cos x),  y(0) = 0, y′(0) = 0.

(c) y″ − 3y′ + 2y = e^{2x}/(e^x + 1)²,  y(0) = 0, y′(0) = 0.




(d) y″ + 4y = 1/sin 2x,  y(π/4) = 0, y′(π/4) = 0.

(e) y″ − y = 1/sh x,  y(1) = 0, y′(1) = 0.

(f) y″ − 2y′ = e^x/ch² x,  y(0) = 0, y′(0) = 0.
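Solutions to these exercises (and to the worked examples) can be checked with a small numerical evaluator of formula (2.14); it is shown here on Example 3, whose impulsive response is g(x) = e^x − 1:

```python
import math

# Evaluate the particular solution ∫_0^x g(x−t) f(t) dt numerically
# (trapezoidal rule), to validate closed forms from the examples/exercises.
def particular_solution(g, f, x, n=4000):
    h = x / n
    vals = [g(x - k * h) * f(k * h) for k in range(n + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

g = lambda x: math.exp(x) - 1.0                    # Example 3
f = lambda x: 1.0 / math.cosh(x)
y_exact = lambda x: (math.exp(x) * (2 * x - math.log(1 + math.exp(2 * x))
                                    + math.log(2))
                     - 2 * math.atan(math.exp(x)) + math.pi / 2)

for x in (-1.0, 0.7, 2.0):
    assert abs(particular_solution(g, f, x) - y_exact(x)) < 1e-5, x
```

Replacing `g` and `f` (and, for data given at x0 ≠ 0, shifting as in Theorem 2.6) adapts the same check to items (a)–(f).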

2.3 The General Case

Consider the linear constant-coefficient non-homogeneous differential equation of order n

    y^{(n)} + a1 y^{(n−1)} + a2 y^{(n−2)} + · · · + a_{n−1} y′ + a_n y = f(x),    (2.39)

where y^{(k)} = d^k y/dx^k = (d/dx)^k y, a1, a2, . . . , an ∈ C, and the forcing term f : I → C is a complex-valued continuous function on an interval I ⊂ R, i.e., f ∈ C⁰(I, C). When f = 0 we obtain the associated homogeneous equation

    y^{(n)} + a1 y^{(n−1)} + a2 y^{(n−2)} + · · · + a_{n−1} y′ + a_n y = 0.    (2.40)

We will write (2.39) and (2.40) in operator form as L y = f(x) and L y = 0, where L is the linear differential operator of order n with constant coefficients defined by

    L y = y^{(n)} + a1 y^{(n−1)} + a2 y^{(n−2)} + · · · + a_{n−1} y′ + a_n y

for any function y at least n times differentiable. Denoting by d/dx the differentiation operator, we have

    L = (d/dx)^n + a1 (d/dx)^{n−1} + · · · + a_{n−1} (d/dx) + a_n.    (2.41)

L defines a map C^n(R, C) → C⁰(R, C) that to each complex-valued function y, at least n times differentiable on R with continuous nth derivative, associates the continuous function L y. The fundamental property of L is its linearity, i.e., (2.9) holds ∀c1, c2 ∈ C, ∀y1, y2 ∈ C^n(R, C). Again this implies that if y1 and y2 are any two solutions of (2.40) on R, then any linear combination of them is again a solution of (2.40) on R. It follows that the set V = {y ∈ C^n(R, C) : L y = 0} is a complex vector space. We will prove that dim V = n, and will see how to get a basis of V. Note that if y ∈ V, then y ∈ C∞(R, C), as follows from (2.40) by isolating y^{(n)} and differentiating any number of times.


The linearity of L also implies, as in the case n = 1, 2, that the general solution of (2.39) on some interval I′ ⊂ I is the sum of the general solution of (2.40) on I′ plus any particular solution of (2.39) on I′. We will prove that the solutions of the initial value problem for (2.39) at any point x0 ∈ I and with any initial data are defined on the entire interval I on which f is continuous (and are of course of class C^n there). The fact that L has constant coefficients allows one to find explicit formulas for the solutions of (2.39)–(2.40). We first observe that the complex exponential e^{λx}, λ ∈ C, is a solution of (2.40) if and only if λ is a root of the characteristic polynomial

    p(λ) = λ^n + a1 λ^{n−1} + · · · + a_{n−1} λ + a_n.    (2.42)

Let λ1, λ2, . . . , λn ∈ C be the roots of p(λ), not necessarily all distinct, each counted with its multiplicity. The polynomial p(λ) factors as

    p(λ) = (λ − λ1)(λ − λ2) · · · (λ − λn).

The differential operator L can be factored in a similar way as

    L = (d/dx − λ1)(d/dx − λ2) · · · (d/dx − λn),    (2.43)

where the order of composition of the factors is unimportant, as they all commute. The idea is now to use (2.43) to reduce the problem to first-order differential equations. The following result gives a particular solution of (2.39) as a convolution integral, and generalizes Theorem 2.2.

Theorem 2.14  Let f ∈ C⁰(I, C), I an interval, and suppose 0 ∈ I. Let λ1, λ2, . . . , λn be n complex numbers (not necessarily all distinct), and let L be the differential operator (2.43). Then the initial value problem

    L y = f(x),  y(0) = y′(0) = · · · = y^{(n−1)}(0) = 0    (2.44)

has a unique solution, defined on the whole of I, and given by the formula

    y(x) = ∫_0^x g(x − t) f(t) dt    (x ∈ I),    (2.45)

where g = gλ1···λn is the function defined recursively as follows: for n = 1 we set gλ(x) = e^{λx} (λ ∈ C), and for n ≥ 2 we set

    gλ1···λn(x) = ∫_0^x gλn(x − t) gλ1···λn−1(t) dt    (x ∈ R).    (2.46)
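The recursion (2.46) translates directly into a grid-based numerical procedure. The sketch below builds gλ1···λn by repeated trapezoidal convolution; the roots, the grid size, and the classical partial-fraction comparison value used for distinct roots are illustrative assumptions, not part of the text.

```python
import cmath

# Grid-based sketch of the recursion (2.46).
N, X = 800, 1.0
h = X / N
ts = [k * h for k in range(N + 1)]

def convolve_exp(gs, lam):
    # samples of x ↦ ∫_0^x e^{λ(x−t)} g(t) dt (trapezoidal rule on the grid)
    out = [0.0 + 0.0j]
    for i in range(1, N + 1):
        vals = [cmath.exp(lam * (ts[i] - ts[k])) * gs[k] for k in range(i + 1)]
        out.append(h * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
    return out

def impulsive_response(lams):
    gs = [cmath.exp(lams[0] * t) for t in ts]   # n = 1: gλ(x) = e^{λx}
    for lam in lams[1:]:
        gs = convolve_exp(gs, lam)
    return gs[-1]                               # gλ1···λn evaluated at x = X

roots = (1.0, -0.5, 2.0j)
v1 = impulsive_response(roots)
v2 = impulsive_response(roots[::-1])            # symmetry in λi ↔ λj
assert abs(v1 - v2) < 2e-4

# For distinct roots, g(x) = Σ_i e^{λi x} / Π_{j≠i}(λi − λj), a classical
# partial-fraction expression used here only as an outside check.
exact = 0.0
for i, lam in enumerate(roots):
    denom = 1.0
    for j, mu in enumerate(roots):
        if j != i:
            denom *= lam - mu
    exact += cmath.exp(lam * X) / denom
assert abs(v1 - exact) < 2e-4
```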


The function gλ1···λn is of class C∞ on the whole of R. In particular, if we take f = 0, we get that the unique solution of the homogeneous problem

    L y = 0,  y(0) = y′(0) = · · · = y^{(n−1)}(0) = 0    (2.47)

is the zero function y = 0.

Proof  We proceed by induction on n. First we prove that gλ1···λn ∈ C∞(R, C). Indeed, this holds for n = 1. Suppose it holds for n − 1. Then it holds also for n, since by (2.46)

    gλ1···λn(x) = e^{λn x} ∫_0^x e^{−λn t} gλ1···λn−1(t) dt,

so gλ1···λn is the product of two functions of class C∞ on R. (The integral function of a C∞ function is C∞.) We now prove (2.45). The theorem holds for n = 1, 2. Assuming it holds for n − 1, let us prove it for n. Consider then the problem (2.44) with L given by (2.43):

    (d/dx − λ1)(d/dx − λ2) · · · (d/dx − λn) y = f(x),  y(0) = y′(0) = · · · = y^{(n−1)}(0) = 0.

Letting h = (d/dx − λn) y, we see that y solves the problem (2.44) if and only if h solves

    (d/dx − λ1)(d/dx − λ2) · · · (d/dx − λn−1) h = f(x),  h(0) = h′(0) = · · · = h^{(n−2)}(0) = 0,

and y solves

    y′ − λn y = h(x),  y(0) = 0.

(The initial conditions for h = y′ − λn y follow by computing h′, h″, . . . , h^{(n−2)} and setting x = 0.) By the inductive hypothesis and by (2.12), we have

    h(x) = ∫_0^x gλ1···λn−1(x − t) f(t) dt,    y(x) = ∫_0^x e^{λn(x−t)} h(t) dt.

Substituting h from the first formula into the second, we obtain ∀x ∈ I :


    y(x) = ∫_0^x gλn(x − t) ( ∫_0^t gλ1···λn−1(t − s) f(s) ds ) dt
         = ∫_0^x ( ∫_s^x gλn(x − t) gλ1···λn−1(t − s) dt ) f(s) ds
         = ∫_0^x ( ∫_0^{x−s} gλn(x − s − t) gλ1···λn−1(t) dt ) f(s) ds
         = ∫_0^x gλ1···λn(x − s) f(s) ds.

We have interchanged the order of integration in the second line, substituted t with t + s in the third, and used (2.46) in the last. ∎

By induction one can also verify that the function gλ1···λn solves the following homogeneous initial value problem:

    L y = 0,  y(0) = y′(0) = · · · = y^{(n−2)}(0) = 0,  y^{(n−1)}(0) = 1.    (2.48)

Conversely, if y solves (2.48), then y is given by (2.46). In other words, the solution of the initial value problem (2.48) is unique and is computable for n ≥ 2 by the recursive formula (2.46). This is proved by induction, reasoning as in the proof of Theorem 2.14. For example, take n = 3 and suppose y solves the problem

    (d/dx − λ1)(d/dx − λ2)(d/dx − λ3) y = 0,  y(0) = y′(0) = 0, y″(0) = 1.

Letting h = (d/dx − λ3) y, the function h solves

    (d/dx − λ1)(d/dx − λ2) h = 0,  h(0) = 0, h′(0) = 1,

whence h = gλ1λ2. Therefore y solves

    y′ − λ3 y = gλ1λ2(x),  y(0) = 0,

and (2.12) gives

    y(x) = ∫_0^x gλ3(x − t) gλ1λ2(t) dt = gλ1λ2λ3(x).

The function g = gλ1 ···λn is called the impulsive response of the differential operator L. We will see later how to compute it explicitly using (2.46). Note that as a function of λ1 , . . . , λn , gλ1 ···λn is symmetric in the interchange λi ↔ λ j , ∀i, j = 1, . . . , n.


This follows from the fact that gλ1···λn solves the problem (2.48) with L symmetric in λi ↔ λj, ∀i, j.

Remark 2.15  It is interesting to verify directly that the function y given by (2.45) solves the non-homogeneous equation L y = f(x). Indeed, using repeatedly the formula

    (d/dx) ∫_0^x F(x, t) dt = F(x, x) + ∫_0^x (∂F/∂x)(x, t) dt,    (2.49)

we get from (2.45):

    y′(x) = g(0) f(x) + ∫_0^x g′(x − t) f(t) dt = ∫_0^x g′(x − t) f(t) dt,
    y″(x) = g′(0) f(x) + ∫_0^x g″(x − t) f(t) dt = ∫_0^x g″(x − t) f(t) dt,
    ⋮
    y^{(n−1)}(x) = g^{(n−2)}(0) f(x) + ∫_0^x g^{(n−1)}(x − t) f(t) dt = ∫_0^x g^{(n−1)}(x − t) f(t) dt,
    y^{(n)}(x) = g^{(n−1)}(0) f(x) + ∫_0^x g^{(n)}(x − t) f(t) dt = f(x) + ∫_0^x g^{(n)}(x − t) f(t) dt,

where we used (2.48). It follows that

    (L y)(x) = f(x) + ∫_0^x (L g)(x − t) f(t) dt = f(x),    ∀x ∈ I,

g being a solution of the homogeneous equation (L g = 0). The initial conditions y^{(k)}(0) = 0 for 0 ≤ k ≤ n − 1 are immediately verified from these formulas.

For the initial value problem with arbitrary (complex) initial data at x = 0, we have the following result, which generalizes Theorem 2.5.

Theorem 2.16  Let f ∈ C⁰(I, C), I an interval, 0 ∈ I, and let b0, b1, . . . , bn−1 be n complex numbers. Let L be the differential operator in (2.41) or (2.43), and let g be the impulsive response of L. Then the initial value problem

    L y = f(x),  y(0) = b0, y′(0) = b1, . . . , y^{(n−1)}(0) = bn−1    (2.50)

has a unique solution, defined on the whole of I, and given for any x ∈ I by

    y(x) = ∫_0^x g(x − t) f(t) dt + c0 g(x) + c1 g′(x) + · · · + cn−1 g^{(n−1)}(x),    (2.51)


where

    c0 = bn−1 + a1 bn−2 + · · · + an−2 b1 + an−1 b0,
    c1 = bn−2 + a1 bn−3 + · · · + an−3 b1 + an−2 b0,
    ⋮
    cn−3 = b2 + a1 b1 + a2 b0,
    cn−2 = b1 + a1 b0,
    cn−1 = b0.    (2.52)

In particular (taking f = 0), the solution of the homogeneous problem

    L y = 0,  y(0) = b0, y′(0) = b1, . . . , y^{(n−1)}(0) = bn−1

is unique, of class C∞ on the whole of R, and is given by

    yh(x) = Σ_{k=0}^{n−1} ck g^{(k)}(x)    (x ∈ R).    (2.53)

By collecting the b_j, we can rewrite this solution as

    yh(x) = b0 y0(x) + b1 y1(x) + · · · + bn−2 yn−2(x) + bn−1 yn−1(x),    (2.54)

where

    y0(x) = an−1 g(x) + an−2 g′(x) + an−3 g″(x) + · · · + a1 g^{(n−2)}(x) + g^{(n−1)}(x),
    y1(x) = an−2 g(x) + an−3 g′(x) + an−4 g″(x) + · · · + a1 g^{(n−3)}(x) + g^{(n−2)}(x),
    y2(x) = an−3 g(x) + an−4 g′(x) + an−5 g″(x) + · · · + a1 g^{(n−4)}(x) + g^{(n−3)}(x),
    ⋮
    yn−3(x) = a2 g(x) + a1 g′(x) + g″(x),
    yn−2(x) = a1 g(x) + g′(x),
    yn−1(x) = g(x).    (2.55)

In particular, the functions y_k solve L y_k = 0 with the initial conditions y_k^{(j)}(0) = δ_{jk} (0 ≤ j, k ≤ n − 1).

Proof  The uniqueness follows from the fact that if y1, y2 both solve the problem (2.50), then their difference ỹ = y1 − y2 solves the problem (2.47), whence ỹ = 0 by Theorem 2.14. To show that y(x) given by (2.51) satisfies L y(x) = f(x), just observe that any derivative g^{(k)} of g satisfies the homogeneous equation, since

    L g^{(k)} = L (d/dx)^k g = (d/dx)^k L g = 0.

2.3 The General Case


By Theorem 2.14 and the linearity of L it follows that

$$(Ly)(x) = f(x) + L\bigg(\sum_{k=0}^{n-1} c_k\, g^{(k)}\bigg)(x) = f(x) + \sum_{k=0}^{n-1} c_k\, \big(L g^{(k)}\big)(x) = f(x).$$

There remains to prove the relations (2.52) between the coefficients c_k and the initial data b_k = y^(k)(0) (0 ≤ k ≤ n − 1). By computing y', y'', …, y^(n−1) from (2.51) and imposing the initial conditions in (2.50), we obtain the linear system

$$\begin{cases} c_{n-1} = b_0\\ c_{n-2} + c_{n-1}\, g^{(n)}(0) = b_1\\ c_{n-3} + c_{n-2}\, g^{(n)}(0) + c_{n-1}\, g^{(n+1)}(0) = b_2\\ c_{n-4} + c_{n-3}\, g^{(n)}(0) + c_{n-2}\, g^{(n+1)}(0) + c_{n-1}\, g^{(n+2)}(0) = b_3\\ \quad\vdots\\ c_0 + c_1\, g^{(n)}(0) + c_2\, g^{(n+1)}(0) + \cdots + c_{n-1}\, g^{(2n-2)}(0) = b_{n-1}, \end{cases} \tag{2.56}$$

where we used g^(k)(0) = 0 for 0 ≤ k ≤ n − 2 and g^(n−1)(0) = 1. The derivatives g^(k)(0) with k ≥ n can be computed from the differential equation Lg = 0 by isolating g^(n)(x), differentiating, and letting x = 0. One gets

$$\begin{aligned} g^{(n)}(0) &= -a_1\, g^{(n-1)}(0) = -a_1,\\ g^{(n+1)}(0) &= -a_1\, g^{(n)}(0) - a_2\, g^{(n-1)}(0) = a_1^2 - a_2,\\ g^{(n+2)}(0) &= -a_1\, g^{(n+1)}(0) - a_2\, g^{(n)}(0) - a_3\, g^{(n-1)}(0) = -a_1^3 + 2a_1 a_2 - a_3, \end{aligned}$$

and so on. These relations, substituted into the system (2.56), allow one to solve it recursively, and we get exactly (2.52). Finally, (2.54)–(2.55) is just a rewriting of (2.53)–(2.52). ∎

Remark 2.17 We can also proceed more systematically as follows. We rewrite (2.52) in matrix form as

$$c = \begin{pmatrix} c_0\\ c_1\\ \vdots\\ c_{n-1} \end{pmatrix} = A\, b = A \begin{pmatrix} b_0\\ b_1\\ \vdots\\ b_{n-1} \end{pmatrix},$$

where

$$A = \begin{pmatrix} a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_2 & a_1 & 1\\ a_{n-2} & a_{n-3} & a_{n-4} & \cdots & a_1 & 1 & 0\\ a_{n-3} & a_{n-4} & a_{n-5} & \cdots & 1 & 0 & 0\\ \vdots & & & & & & \vdots\\ a_2 & a_1 & 1 & 0 & \cdots & 0 & 0\\ a_1 & 1 & 0 & \cdots & \cdots & 0 & 0\\ 1 & 0 & \cdots & \cdots & \cdots & 0 & 0 \end{pmatrix} \tag{2.57}$$


and the system (2.56) as b = Bc, where

$$B = \begin{pmatrix} 0 & 0 & \cdots & \cdots & 0 & 1\\ 0 & 0 & \cdots & \cdots & 1 & g^{(n)}(0)\\ 0 & 0 & \cdots & 1 & g^{(n)}(0) & g^{(n+1)}(0)\\ \vdots & & & & & \vdots\\ 0 & 1 & \cdots & \cdots & \cdots & g^{(2n-3)}(0)\\ 1 & g^{(n)}(0) & g^{(n+1)}(0) & \cdots & g^{(2n-3)}(0) & g^{(2n-2)}(0) \end{pmatrix} \tag{2.58}$$

Using the equations

$$Lg = 0,\quad Lg' = 0,\quad Lg'' = 0,\ \ldots,\ Lg^{(n-2)} = 0$$

at x = 0, it is easily checked that the matrices A and B are the inverse of each other. Finally, let us mention that Theorem 2.16 can also be proved by induction on n, with a constructive proof analogous to that of Theorem 2.14. (See Theorem 3.17 for the case of variable coefficients.)

Remark 2.18 (The approach by distributions) The most efficient derivation of (2.51)–(2.52) is by means of distribution theory. We refer to [16], chapters II and III, for a proper presentation. A very brief outline is as follows. Given the linear constant-coefficient differential equation Ly = f(x), one considers the equation LT = S for T and S in the convolution algebra D'_+ of distributions with support in [0, +∞). Then one proves that the elementary solution in D'_+, that is, the solution of LT = δ (δ the Dirac distribution), or equivalently the inverse of Lδ, is given by θg, with θ the Heaviside step function and g the impulsive response of the differential operator L. If y solves Ly = f(x) with trivial initial conditions at x = 0, then L(θy) = θf, whence θy = θg ∗ θf. This is just (2.45) for x ≥ 0. If y solves the problem (2.50), one computes

$$L(\theta y) = \theta f + \sum_{k=0}^{n-1} c_k\, \delta^{(k)},$$

with the c_k given by (2.52). Convolving this with θg gives

$$\theta y = \theta g * \theta f + \sum_{k=0}^{n-1} c_k\, \theta g^{(k)},$$

which is just (2.51) for x ≥ 0.

The proof of the following result is similar to that of Theorem 2.6.

Theorem 2.19 Let f ∈ C^0(I, ℂ), I an interval, x_0 ∈ I, b_0, b_1, …, b_{n−1} ∈ ℂ, and let L and g be as in Theorem 2.16. The solution of the initial value problem


$$\begin{cases} Ly = f(x)\\ y(x_0) = b_0,\ y'(x_0) = b_1,\ \ldots,\ y^{(n-1)}(x_0) = b_{n-1} \end{cases} \tag{2.59}$$

is unique, is defined on the whole of I, and is given for all x ∈ I by

$$y(x) = \int_{x_0}^x g(x-t)\, f(t)\, dt + \sum_{k=0}^{n-1} c_k\, g^{(k)}(x - x_0), \tag{2.60}$$

where the coefficients c_k are given by (2.52). In particular (taking f = 0), the solution of the homogeneous problem

$$\begin{cases} Ly = 0\\ y(x_0) = b_0,\ y'(x_0) = b_1,\ \ldots,\ y^{(n-1)}(x_0) = b_{n-1}, \end{cases}$$

with x_0 ∈ ℝ arbitrary, is unique, of class C^∞ on the whole of ℝ, and is given for all x ∈ ℝ by

$$y_h(x) = \sum_{k=0}^{n-1} c_k\, g^{(k)}(x - x_0) \tag{2.61}$$

$$\phantom{y_h(x)} = \sum_{k=0}^{n-1} b_k\, y_k(x - x_0), \tag{2.62}$$

where the functions y_k are given by (2.55).

Now let y be any solution of (2.39) in the interval I ∋ x_0, and let b_k = y^(k)(x_0) (0 ≤ k ≤ n − 1). Then y solves the problem (2.59), and by uniqueness y is given by (2.60). We thus obtain the following analogue of Corollary 2.8.

Corollary 2.20 Let f ∈ C^0(I, ℂ) and let x_0 ∈ I be fixed. Every solution of Ly = f(x) in the interval I can be written as y = y_p + y_h, where y_p(x) = ∫_{x_0}^x g(x − t) f(t) dt solves Ly = f(x) with the initial conditions y_p^(k)(x_0) = 0 for 0 ≤ k ≤ n − 1, and y_h solves Ly_h = 0 with the same initial conditions as y at x_0. The function y_h is given by (2.61)–(2.62), where b_k = y^(k)(x_0) and the c_k are given by (2.52) (0 ≤ k ≤ n − 1).

The following result generalizes Corollary 2.9.

Corollary 2.21 The set V of solutions of the homogeneous equation Ly = 0 on ℝ is a complex vector space of dimension n, with a basis given by {g, g', g'', …, g^(n−1)}. Another basis is {y_0, y_1, …, y_{n−1}}, where the y_k are defined in (2.55). If L has real coefficients (i.e., a_1, …, a_n in (2.41) are all real numbers), then the set V_ℝ of real solutions of Ly = 0 on ℝ is a real vector space of dimension n, with a basis given by {g, g', g'', …, g^(n−1)} or by {y_0, y_1, …, y_{n−1}}.


Proof From (2.53) and the uniqueness of the homogeneous initial value problem, it follows that every solution of Ly = 0 on ℝ can be written as a linear combination of g, g', g'', …, g^(n−1) in a unique way. (The coefficients c_k in this combination are uniquely determined by the initial data b_k at x = 0 through (2.52).) In particular, this holds for the trivial solution y(x) = 0 for all x; that is, c_0 g(x) + ⋯ + c_{n−1} g^(n−1)(x) = 0 for c_0, …, c_{n−1} ∈ ℂ and all x ∈ ℝ implies c_0 = ⋯ = c_{n−1} = 0. Thus the n functions g, g', g'', …, g^(n−1) are linearly independent on ℝ and form a basis of V.

A similar reasoning gives the linear independence of the functions y_0, y_1, …, y_{n−1}. (Every y ∈ V is a linear combination of them by (2.54), and the coefficients are precisely the initial data b_k at x = 0.) Note by (2.55) that the matrix A in (2.57) is precisely the matrix relating these two bases, i.e., y_j = Σ_k A_{jk} g^(k).

If L has real coefficients, then the solutions of Ly = 0 with real initial data at x = 0 (i.e., b_k = y^(k)(0) ∈ ℝ for 0 ≤ k ≤ n − 1) are real-valued. Indeed, by writing y = Re y + i Im y, one gets that Im y solves the problem (2.47), whence Im y = 0. In particular, the impulsive response g is real-valued. The set V_ℝ of real solutions of Ly = 0 on ℝ is then a real vector space of dimension n, with a basis given by {g, g', g'', …, g^(n−1)} or by {y_0, y_1, …, y_{n−1}}. ∎

Exercises

1. Verify that the matrices A and B in Remark 2.17 are the inverse of each other.
2. Prove Theorem 2.19 by imitating the proof of Theorem 2.6.
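Exercise 1 can also be spot-checked numerically. The sketch below (Python; the cubic operator is a hypothetical example) builds A from the coefficients a_i as in (2.57), computes the derivatives g^(k)(0) by the recursion obtained from Lg = 0, assembles B as in (2.58), and checks that AB is the identity:

```python
from fractions import Fraction

def g_derivs_at_0(a, kmax):
    """g^{(k)}(0), k = 0..kmax, for the impulsive response of
    y^{(n)} + a_1 y^{(n-1)} + ... + a_n y = 0: zero up to order n-2,
    g^{(n-1)}(0) = 1, then the recursion from isolating g^{(n)} in Lg = 0."""
    n = len(a)
    d = [Fraction(0)] * (n - 1) + [Fraction(1)]
    for k in range(n, kmax + 1):
        d.append(-sum(a[i - 1] * d[k - i] for i in range(1, n + 1)))
    return d

a = [Fraction(2), Fraction(-1), Fraction(5)]   # hypothetical cubic operator
n = len(a)
ext = [Fraction(1)] + a                        # ext[i] = a_i, with a_0 = 1
# A as in (2.57): entry (i, j) is a_{n-1-i-j}, zero for negative index
A = [[ext[n - 1 - i - j] if n - 1 - i - j >= 0 else Fraction(0)
      for j in range(n)] for i in range(n)]
# B as in (2.58): entry (i, j) is g^{(i+j)}(0)
d = g_derivs_at_0(a, 2 * n - 2)
B = [[d[i + j] for j in range(n)] for i in range(n)]
AB = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
```

With exact rational arithmetic, `AB` comes out as the identity matrix for this sample.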

2.4 Explicit Formulas for the Impulsive Response

To summarize our results so far, we have seen that the impulsive response g, i.e., the solution of the homogeneous problem (2.48), allows one to solve the nonhomogeneous equation (2.39) with an arbitrary forcing term and with arbitrary initial conditions. We now consider the problem of explicitly computing g for an equation of order n. By iterating (2.46) we obtain the following formula for the impulsive response of order n as an (n − 1)-times repeated integral of exponential functions:

$$g_{\lambda_1\cdots\lambda_n}(x) = \int_0^x e^{\lambda_n(x-t_{n-1})} \int_0^{t_{n-1}} e^{\lambda_{n-1}(t_{n-1}-t_{n-2})} \cdots \int_0^{t_3} e^{\lambda_3(t_3-t_2)} \int_0^{t_2} e^{\lambda_2(t_2-t_1)}\, e^{\lambda_1 t_1}\, dt_1\, dt_2 \cdots dt_{n-2}\, dt_{n-1}. \tag{2.63}$$

For x ≥ 0 this is just the convolution of the n functions θg_{λ_j} (1 ≤ j ≤ n), whereas for x ≤ 0 it is the opposite of the convolution of the functions θ̃g_{λ_j} (see (2.5)):

$$g_{\lambda_1\cdots\lambda_n}(x) = \begin{cases} \big(\theta g_{\lambda_n} * \theta g_{\lambda_{n-1}} * \cdots * \theta g_{\lambda_2} * \theta g_{\lambda_1}\big)(x) & \text{if } x \ge 0\\ -\big(\tilde\theta g_{\lambda_n} * \tilde\theta g_{\lambda_{n-1}} * \cdots * \tilde\theta g_{\lambda_2} * \tilde\theta g_{\lambda_1}\big)(x) & \text{if } x \le 0. \end{cases} \tag{2.64}$$
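For n = 2 and λ_1 ≠ λ_2 the convolution in (2.63) evaluates in closed form to g_{λ1λ2}(x) = (e^{λ1x} − e^{λ2x})/(λ1 − λ2), and this is easy to cross-check against a direct quadrature of the integral; a sketch (Python, hypothetical sample values):

```python
import cmath

l1, l2, x = 1.0, -0.5 + 2j, 1.3      # hypothetical sample roots and point

def g_closed(x):
    """Closed form of the impulsive response for n = 2, distinct roots."""
    return (cmath.exp(l1 * x) - cmath.exp(l2 * x)) / (l1 - l2)

def g_quad(x, steps=2000):
    """The integral in (2.63) for n = 2, by the composite trapezoidal rule."""
    h = x / steps
    total = 0j
    for i in range(steps + 1):
        t = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * cmath.exp(l2 * (x - t)) * cmath.exp(l1 * t)
    return total * h
```

The two evaluations agree to quadrature accuracy (roughly h² for the trapezoidal rule).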

We can also write the repeated integral in (2.63) as an (n − 1)-dimensional integral

$$g_{\lambda_1\cdots\lambda_n}(x) = e^{\lambda_n x} \int_{T_x} e^{t_{n-1}(\lambda_{n-1}-\lambda_n)} \cdots e^{t_2(\lambda_2-\lambda_3)}\, e^{t_1(\lambda_1-\lambda_2)}\, dt_1\, dt_2 \cdots dt_{n-1},$$

where T_x is the (n − 1)-dimensional simplex of ℝ^{n−1} defined (for, e.g., x > 0) by

$$T_x = \{(t_1, t_2, \ldots, t_{n-1}) \in \mathbb{R}^{n-1} :\ 0 \le t_1 \le t_2 \le \cdots \le t_{n-1} \le x\}.$$

For example, for n = 3, (2.63) becomes

$$g_{\lambda_1\lambda_2\lambda_3}(x) = \int_0^x e^{\lambda_3(x-t_2)} \bigg(\int_0^{t_2} e^{\lambda_2(t_2-t_1)}\, e^{\lambda_1 t_1}\, dt_1\bigg)\, dt_2,$$

and we easily obtain the following result.

Corollary 2.22 The impulsive response g of the differential operator

$$L = \Big(\frac{d}{dx}\Big)^3 + a_1 \Big(\frac{d}{dx}\Big)^2 + a_2\, \frac{d}{dx} + a_3$$

is given as follows. Let λ_1, λ_2, λ_3 be the roots of the characteristic polynomial p(λ) = λ^3 + a_1λ^2 + a_2λ + a_3.

(1) If λ_1, λ_2, λ_3 are all distinct, then

$$g(x) = \frac{e^{\lambda_1 x}}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)} + \frac{e^{\lambda_2 x}}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)} + \frac{e^{\lambda_3 x}}{(\lambda_3-\lambda_1)(\lambda_3-\lambda_2)}.$$

(2) If λ_1 = λ_2 ≠ λ_3, then

$$g(x) = \frac{1}{(\lambda_1-\lambda_3)^2}\Big(e^{\lambda_3 x} - e^{\lambda_1 x} + (\lambda_1-\lambda_3)\, x\, e^{\lambda_1 x}\Big).$$

(3) If λ_1 = λ_2 = λ_3, then

$$g(x) = \tfrac{1}{2}\, x^2\, e^{\lambda_1 x}.$$

Note that if a_1, a_2, a_3 are all real, then g is real-valued (as it should be). Indeed, in case (1), if λ_1 ∈ ℝ and λ_{2,3} = α ± iβ with β ≠ 0, one computes

$$g(x) = \frac{1}{(\lambda_1-\alpha)^2+\beta^2}\Big(e^{\lambda_1 x} + \tfrac{1}{\beta}(\alpha-\lambda_1)\, e^{\alpha x}\sin\beta x - e^{\alpha x}\cos\beta x\Big).$$

In all other cases λ_1, λ_2, λ_3 are real if a_1, a_2, a_3 ∈ ℝ.
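Case (1) can be verified numerically: the stated g must satisfy g(0) = g'(0) = 0 and g''(0) = 1, which amounts to the moment identities Σ_j c_j λ_j^k = δ_{k,2} for the coefficients c_j = 1/∏_{i≠j}(λ_j − λ_i). A sketch (Python; the roots are a hypothetical sample):

```python
# hypothetical cubic with three distinct roots (one real, a conjugate pair)
roots = [1.0, -0.5 + 2j, -0.5 - 2j]

def coeff(j):
    """c_j = 1 / prod_{i != j} (lambda_j - lambda_i), as in case (1)."""
    cj = 1.0 + 0j
    for i, li in enumerate(roots):
        if i != j:
            cj /= roots[j] - li
    return cj

# g(x) = sum_j c_j e^{lambda_j x}, so g^{(k)}(0) = sum_j c_j lambda_j^k
moments = [sum(coeff(j) * roots[j] ** k for j in range(3)) for k in range(3)]
```

The first two moments vanish and the third equals 1, confirming the initial conditions of the impulsive response.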


When the roots of p(λ) are all distinct or all equal, there is a fairly simple expression for g.

Proposition 2.23 For generic n, let λ_1, λ_2, …, λ_n be the roots of the characteristic polynomial p(λ) in (2.42).

(1) If λ_i ≠ λ_j for i ≠ j (all distinct roots), then the impulsive response is

$$g(x) = c_1\, e^{\lambda_1 x} + c_2\, e^{\lambda_2 x} + \cdots + c_n\, e^{\lambda_n x}, \quad\text{where}\quad c_j = \frac{1}{\prod_{i \ne j}(\lambda_j - \lambda_i)} = \frac{1}{p'(\lambda_j)} \qquad (1 \le j \le n).$$

(2) If λ_1 = λ_2 = ⋯ = λ_n, then

$$g(x) = \frac{1}{(n-1)!}\, x^{n-1}\, e^{\lambda_1 x}. \tag{2.65}$$

Proof This is readily proved by induction on n using (2.46). For (1) it is useful to recall that the function g = g_{λ_1···λ_n} is symmetric under the interchange λ_i ↔ λ_j, for all i, j = 1, …, n. ∎

In the general case we can simplify formula (2.64) as follows. Let λ_1, λ_2, …, λ_k be the distinct roots of p(λ), of multiplicities m_1, m_2, …, m_k, with m_1 + m_2 + ⋯ + m_k = n. Since convolution is commutative and associative, we can group together in (2.64) the convolutions relative to each root λ_j (1 ≤ j ≤ k). We can then rewrite the impulsive response of order n (for, say, x ≥ 0) as

$$g = \big(\theta g_{\lambda_k} * \cdots * \theta g_{\lambda_k}\big) * \big(\theta g_{\lambda_{k-1}} * \cdots * \theta g_{\lambda_{k-1}}\big) * \cdots * \big(\theta g_{\lambda_1} * \cdots * \theta g_{\lambda_1}\big).$$

The term θg_{λ_j} ∗ ⋯ ∗ θg_{λ_j} in this formula contains m_j factors, and the convolution is repeated m_j − 1 times. This term gives (for x ≥ 0) the impulsive response g_{λ_j···λ_j} of the differential operator (d/dx − λ_j)^{m_j}. By (2.64) and (2.65) we have

$$g_{\lambda_j\cdots\lambda_j}(x) = \begin{cases} \big(\theta g_{\lambda_j} * \cdots * \theta g_{\lambda_j}\big)(x) & \text{if } x \ge 0\\ -\big(\tilde\theta g_{\lambda_j} * \cdots * \tilde\theta g_{\lambda_j}\big)(x) & \text{if } x \le 0 \end{cases} \;=\; \frac{1}{(m_j-1)!}\, x^{m_j-1}\, e^{\lambda_j x}, \qquad \forall x \in \mathbb{R}.$$

Defining for short

$$g_{\lambda,m}(x) := \frac{1}{(m-1)!}\, x^{m-1}\, e^{\lambda x} \qquad (\lambda \in \mathbb{C},\ m \in \mathbb{N}_+), \tag{2.66}$$

we obtain


$$\begin{aligned} g(x) &= \begin{cases} \big(\theta g_{\lambda_k,m_k} * \theta g_{\lambda_{k-1},m_{k-1}} * \cdots * \theta g_{\lambda_1,m_1}\big)(x) & \text{if } x \ge 0\\ -\big(\tilde\theta g_{\lambda_k,m_k} * \tilde\theta g_{\lambda_{k-1},m_{k-1}} * \cdots * \tilde\theta g_{\lambda_1,m_1}\big)(x) & \text{if } x \le 0 \end{cases}\\[4pt] &= \int_0^x g_{\lambda_k,m_k}(x-t_{k-1}) \int_0^{t_{k-1}} g_{\lambda_{k-1},m_{k-1}}(t_{k-1}-t_{k-2}) \cdots \int_0^{t_3} g_{\lambda_3,m_3}(t_3-t_2) \int_0^{t_2} g_{\lambda_2,m_2}(t_2-t_1)\, g_{\lambda_1,m_1}(t_1)\, dt_1\, dt_2 \cdots dt_{k-2}\, dt_{k-1}. \end{aligned} \tag{2.67}$$
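As a consistency check, for k = 2 with a double root λ_1 and a simple root λ_2, the single convolution in (2.67) can be evaluated by quadrature and compared with the closed form of Corollary 2.22, case (2); a sketch (Python, hypothetical sample values):

```python
import math

l1, l2 = 0.8, -0.6     # hypothetical: double root l1, simple root l2

def g_closed(x):
    """Corollary 2.22, case (2), with lambda_3 replaced by l2."""
    return (math.exp(l2 * x) - math.exp(l1 * x)
            + (l1 - l2) * x * math.exp(l1 * x)) / (l1 - l2) ** 2

def g_conv(x, steps=4000):
    """(2.67) with k = 2: convolve g_{l2,1} with g_{l1,2} (trapezoidal rule)."""
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * math.exp(l2 * (x - t)) * t * math.exp(l1 * t)
    return total * h

x = 1.7
```

The quadrature reproduces the closed form to within the trapezoidal-rule error.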

Formula (2.67) is useful for computing g in the case of multiple roots. In fact, using it, we can prove the following result.

Theorem 2.24 There exist polynomials G_j of degree m_j − 1 (1 ≤ j ≤ k) such that

$$g(x) = G_1(x)\, e^{\lambda_1 x} + G_2(x)\, e^{\lambda_2 x} + \cdots + G_k(x)\, e^{\lambda_k x}. \tag{2.68}$$

More precisely, the impulsive response g for k distinct roots is given by (2.68), where G_1 is the polynomial of degree m_1 − 1 given for k = 1 by G_1(x) = x^{m_1−1}/(m_1 − 1)!, and for k ≥ 2 by

$$G_1(x) = \sum_{r_1=0}^{m_1-1} \sum_{r_2=0}^{r_1} \sum_{r_3=0}^{r_2} \cdots \sum_{r_{k-1}=0}^{r_{k-2}} \frac{(-1)^{m_1-1-r_{k-1}}\, x^{r_{k-1}}}{r_{k-1}!} \times \frac{\binom{m_1+m_2-2-r_1}{m_2-1}\binom{m_3-1+r_1-r_2}{m_3-1}\binom{m_4-1+r_2-r_3}{m_4-1} \cdots \binom{m_k-1+r_{k-2}-r_{k-1}}{m_k-1}}{(\lambda_1-\lambda_2)^{m_1+m_2-r_1-1}\,(\lambda_1-\lambda_3)^{m_3+r_1-r_2}\,(\lambda_1-\lambda_4)^{m_4+r_2-r_3} \cdots (\lambda_1-\lambda_k)^{m_k+r_{k-2}-r_{k-1}}}. \tag{2.69}$$

To get G_j for j ≥ 2, just exchange λ_1 ↔ λ_j and m_1 ↔ m_j in this formula.

Proof The proof of the first statement is by induction on the number k of distinct roots of p(λ). The result holds for k = 1, with G_1(x) = x^{m_1−1}/(m_1 − 1)!, by formula (2.65). For k = 2 we have from (2.67)

$$g(x) = \int_0^x g_{\lambda_2,m_2}(x-t)\, g_{\lambda_1,m_1}(t)\, dt = \frac{e^{\lambda_2 x}}{(m_2-1)!\,(m_1-1)!} \int_0^x (x-t)^{m_2-1}\, t^{m_1-1}\, e^{(\lambda_1-\lambda_2)t}\, dt. \tag{2.70}$$

We need the following lemma, whose proof is postponed to the end.

Lemma 2.25 Let p, q ∈ ℕ and b, c ∈ ℂ, b ≠ c. Then, for all x ∈ ℝ, we have

$$\frac{e^{bx}}{p!\,q!} \int_0^x (x-t)^p\, t^q\, e^{(c-b)t}\, dt = P(x,p,q,b-c)\, e^{bx} + P(x,q,p,c-b)\, e^{cx}, \tag{2.71}$$

where P(x, p, q, a), a ∈ ℂ ∖ {0}, is the polynomial of degree p in x given by

$$P(x,p,q,a) = \sum_{r=0}^{p} \frac{(-1)^{p-r}}{r!} \binom{p+q-r}{q} \frac{x^r}{a^{p+q+1-r}}. \tag{2.72}$$


Suppose this is proved. Using (2.71)–(2.72) in (2.70), we immediately get the following formula for the impulsive response g in the case of two distinct roots λ_1, λ_2 of multiplicities m_1, m_2:

$$g(x) = G_1(x)\, e^{\lambda_1 x} + G_2(x)\, e^{\lambda_2 x}, \tag{2.73}$$

$$G_1(x) = \sum_{r=0}^{m_1-1} \frac{(-1)^{m_1-1-r}}{r!} \binom{m_1+m_2-r-2}{m_2-1} \frac{x^r}{(\lambda_1-\lambda_2)^{m_1+m_2-r-1}}, \tag{2.74}$$

$$G_2(x) = \sum_{r=0}^{m_2-1} \frac{(-1)^{m_2-1-r}}{r!} \binom{m_1+m_2-r-2}{m_1-1} \frac{x^r}{(\lambda_2-\lambda_1)^{m_1+m_2-r-1}}. \tag{2.75}$$
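Formulas (2.74)–(2.75) are easy to test exactly: the resulting g must satisfy g(0) = ⋯ = g^(n−2)(0) = 0 and g^(n−1)(0) = 1. The sketch below (Python) checks this with rational arithmetic for the hypothetical sample p(λ) = (λ − 1)²(λ + 1)²:

```python
from fractions import Fraction
from math import comb, factorial

def G_coeffs(lam, mu, m_lam, m_mu):
    """Coefficient list of the polynomial attached to e^{lam x} in (2.74)."""
    return [Fraction((-1) ** (m_lam - 1 - r) * comb(m_lam + m_mu - r - 2, m_mu - 1),
                     factorial(r)) / Fraction(lam - mu) ** (m_lam + m_mu - r - 1)
            for r in range(m_lam)]

l1, l2, m1, m2 = 1, -1, 2, 2                 # p(l) = (l - 1)^2 (l + 1)^2
terms = [(G_coeffs(l1, l2, m1, m2), l1), (G_coeffs(l2, l1, m2, m1), l2)]

def deriv(poly, lam):
    """d/dx of P(x) e^{lam x} -> (P' + lam P) e^{lam x}."""
    dP = [Fraction(i + 1) * poly[i + 1] for i in range(len(poly) - 1)] + [Fraction(0)]
    return [dP[i] + lam * poly[i] for i in range(len(poly))], lam

ic = []
for k in range(4):
    ic.append(sum(p[0] for p, _ in terms))   # g^{(k)}(0) = sum of constant terms
    terms = [deriv(p, lam) for p, lam in terms]
```

Here `ic` collects g(0), g'(0), g''(0), g'''(0), which come out as 0, 0, 0, 1, as required of an impulsive response of order 4.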

Note that G_1, G_2 have degrees m_1 − 1, m_2 − 1, respectively, and that G_2 is obtained from G_1 by interchanging λ_1 ↔ λ_2 and m_1 ↔ m_2, so that g is symmetric under this interchange (clear from (2.67)). Note also that (2.74) agrees with (2.69) for k = 2.

The step from k − 1 to k in the inductive proof is now similar. Suppose the result holds for k − 1 roots, and let us prove it holds for k roots. The inductive hypothesis and (2.67) imply that for x ≥ 0

$$\big(\theta g_{\lambda_{k-1},m_{k-1}} * \cdots * \theta g_{\lambda_1,m_1}\big)(x) = \tilde G_1(x)\, e^{\lambda_1 x} + \cdots + \tilde G_{k-1}(x)\, e^{\lambda_{k-1} x},$$

for some polynomials G̃_1, …, G̃_{k−1}, of degrees m_1 − 1, …, m_{k−1} − 1, respectively. Adding the root λ_k, of multiplicity m_k, and using (2.67), gives the impulsive response

$$g(x) = \big(\theta g_{\lambda_k,m_k} * \theta g_{\lambda_{k-1},m_{k-1}} * \cdots * \theta g_{\lambda_1,m_1}\big)(x) = \frac{e^{\lambda_k x}}{(m_k-1)!} \int_0^x (x-t)^{m_k-1} \Big(\tilde G_1(t)\, e^{(\lambda_1-\lambda_k)t} + \cdots + \tilde G_{k-1}(t)\, e^{(\lambda_{k-1}-\lambda_k)t}\Big)\, dt. \tag{2.76}$$

The final formula for g(x) holds for any x ∈ ℝ. By expanding each polynomial G̃_j in (2.76) in powers of t, we obtain integrals of the form (2.71), with p = m_k − 1, 0 ≤ q ≤ m_j − 1, b = λ_k, and c = λ_j (1 ≤ j ≤ k − 1). Lemma 2.25 implies that g is of the form (2.68), with G_1, …, G_{k−1} of degrees m_1 − 1, …, m_{k−1} − 1, respectively, and G_k of degree ≤ m_k − 1. However, since g must be symmetric under λ_i ↔ λ_j and m_i ↔ m_j, for all i, j = 1, …, k (this follows from (2.67)), the polynomial G_k must actually be of degree m_k − 1.

In order to prove formula (2.69), we rewrite (2.71) in convolution form, by observing that the left-hand side of (2.71) is just

$$\int_0^x g_{b,p+1}(x-t)\, g_{c,q+1}(t)\, dt = \begin{cases} \big(\theta g_{b,p+1} * \theta g_{c,q+1}\big)(x) & \text{if } x \ge 0\\ -\big(\tilde\theta g_{b,p+1} * \tilde\theta g_{c,q+1}\big)(x) & \text{if } x \le 0 \end{cases}$$

(with g_{λ,m} given by (2.66)). Recalling (2.72), we obtain for x ≥ 0

$$\big(\theta g_{b,p+1} * \theta g_{c,q+1}\big)(x) = \sum_{r=0}^{p} (-1)^{p-r}\, \frac{\binom{p+q-r}{q}}{(b-c)^{p+q+1-r}}\; g_{b,r+1}(x) + \sum_{r=0}^{q} (-1)^{q-r}\, \frac{\binom{p+q-r}{p}}{(c-b)^{p+q+1-r}}\; g_{c,r+1}(x). \tag{2.77}$$

The same expression is obtained for −θ̃g_{b,p+1} ∗ θ̃g_{c,q+1}(x) with x ≤ 0. Note that (2.77) holds only for b ≠ c. For b = c we have the simpler formula (2.91). Using (2.77) in (2.67) and iterating, it is possible to compute the polynomials G_j for k distinct roots. For example, for k = 3, by computing g = θg_{λ_1,m_1} ∗ θg_{λ_2,m_2} ∗ θg_{λ_3,m_3} for x ≥ 0, we get g = G_1 e^{λ_1 x} + G_2 e^{λ_2 x} + G_3 e^{λ_3 x}, where

$$G_1(x) = \sum_{r=0}^{m_1-1} \sum_{s=0}^{r} \frac{(-1)^{m_1-1-s}}{s!}\, \frac{\binom{m_1+m_2-2-r}{m_2-1}\binom{m_3-1+r-s}{m_3-1}}{(\lambda_1-\lambda_2)^{m_1+m_2-r-1}\,(\lambda_1-\lambda_3)^{m_3+r-s}}\; x^s,$$

$$G_2(x) = \sum_{r=0}^{m_2-1} \sum_{s=0}^{r} \frac{(-1)^{m_2-1-s}}{s!}\, \frac{\binom{m_1+m_2-2-r}{m_1-1}\binom{m_3-1+r-s}{m_3-1}}{(\lambda_2-\lambda_1)^{m_1+m_2-r-1}\,(\lambda_2-\lambda_3)^{m_3+r-s}}\; x^s,$$

$$G_3(x) = \sum_{r=0}^{m_1-1} \sum_{s=0}^{m_3-1} \frac{(-1)^{m_1}}{s!}\, \frac{\binom{m_1+m_2-2-r}{m_2-1}\binom{m_3-1+r-s}{r}}{(\lambda_1-\lambda_2)^{m_1+m_2-r-1}\,(\lambda_1-\lambda_3)^{m_3+r-s}}\; x^s + \sum_{r=0}^{m_2-1} \sum_{s=0}^{m_3-1} \frac{(-1)^{m_2}}{s!}\, \frac{\binom{m_1+m_2-2-r}{m_1-1}\binom{m_3-1+r-s}{r}}{(\lambda_2-\lambda_1)^{m_1+m_2-r-1}\,(\lambda_2-\lambda_3)^{m_3+r-s}}\; x^s.$$

This formula for g is clearly symmetric under λ_1 ↔ λ_2, m_1 ↔ m_2, since then G_1 ↔ G_2 while G_3 stays invariant. Although not immediately apparent, the formula is also symmetric under λ_1 ↔ λ_3, m_1 ↔ m_3 (and thus under λ_2 ↔ λ_3, m_2 ↔ m_3), as it should be. The polynomial G_3 can then be rewritten in a form similar to G_1 and G_2, by just letting λ_1 ↔ λ_3 and m_1 ↔ m_3 in G_1. Note that the formula for G_1 agrees with (2.69) for k = 3.

In general, using (2.77) in (2.67) k − 1 times, we obtain precisely formula (2.69) for the polynomial G_1. Note that the coefficient of x^s in G_1(x) = Σ_{s=0}^{m_1−1} c_s x^s is given for k ≥ 3 by

$$c_s = \sum_{r_1=s}^{m_1-1} \sum_{r_2=s}^{r_1} \cdots \sum_{r_{k-2}=s}^{r_{k-3}} \frac{(-1)^{m_1-1-s}}{s!} \times \frac{\binom{m_1+m_2-2-r_1}{m_2-1}\binom{m_3-1+r_1-r_2}{m_3-1} \cdots \binom{m_{k-1}-1+r_{k-3}-r_{k-2}}{m_{k-1}-1}\binom{m_k-1+r_{k-2}-s}{m_k-1}}{(\lambda_1-\lambda_2)^{m_1+m_2-r_1-1}\,(\lambda_1-\lambda_3)^{m_3+r_1-r_2} \cdots (\lambda_1-\lambda_{k-1})^{m_{k-1}+r_{k-3}-r_{k-2}}\,(\lambda_1-\lambda_k)^{m_k+r_{k-2}-s}}.$$

The last statement of the theorem follows from the symmetry of g under λ_i ↔ λ_j, m_i ↔ m_j, for all i, j. ∎

Proof of Lemma 2.25 We shall prove that if p, q ∈ ℕ and a ∈ ℂ ∖ {0}, then


$$\frac{1}{p!\,q!} \int_0^x (x-t)^p\, t^q\, e^{at}\, dt = P(x,p,q,-a) + P(x,q,p,a)\, e^{ax}. \tag{2.78}$$

Lemma 2.25 follows immediately from this by letting a = c − b. Set

$$F(x,a,p,q) = \int_0^x (x-t)^p\, t^q\, e^{at}\, dt.$$

To compute this, observe that the powers of t in the integral can be traded for corresponding powers of the differential operator ∂/∂a by differentiating with respect to the parameter a. Indeed, by iterating the formula (∂/∂a) e^{at} = t e^{at}, one easily gets the formulas

$$\Big(\frac{\partial}{\partial a}\Big)^q e^{at} = t^q\, e^{at}, \qquad \Big(x - \frac{\partial}{\partial a}\Big)^p e^{at} = (x-t)^p\, e^{at}, \qquad \Big(x - \frac{\partial}{\partial a}\Big)^p \Big(\frac{\partial}{\partial a}\Big)^q e^{at} = (x-t)^p\, t^q\, e^{at}.$$

Using this in F and taking the derivatives with respect to a outside the integral, we obtain

$$F(x,a,p,q) = \Big(x - \frac{\partial}{\partial a}\Big)^p \Big(\frac{\partial}{\partial a}\Big)^q \int_0^x e^{at}\, dt = \Big(x - \frac{\partial}{\partial a}\Big)^p \Big(\frac{\partial}{\partial a}\Big)^q\; \frac{e^{ax}-1}{a}. \tag{2.79}$$

This formula holds for p = 0 or q = 0 as well, the zero power being the identity operator. To compute (∂/∂a)^q (e^{ax} − 1)/a, use the Leibniz rule

$$\Big(\frac{\partial}{\partial a}\Big)^q (fg) = \sum_{r=0}^{q} \binom{q}{r} \Big(\frac{\partial}{\partial a}\Big)^r f\; \Big(\frac{\partial}{\partial a}\Big)^{q-r} g,$$

and the formulas

$$\Big(\frac{\partial}{\partial a}\Big)^r (e^{ax}-1) = \begin{cases} e^{ax}-1 & \text{if } r = 0\\ x^r\, e^{ax} & \text{if } r \ge 1, \end{cases} \qquad \Big(\frac{\partial}{\partial a}\Big)^{q-r} a^{-1} = (-1)^{q-r}\,(q-r)!\; a^{-1-(q-r)},$$

to get

$$\Big(\frac{\partial}{\partial a}\Big)^q \frac{e^{ax}-1}{a} = (e^{ax}-1)\,\frac{(-1)^q\, q!}{a^{q+1}} + \sum_{r=1}^{q} \binom{q}{r}\, x^r\, e^{ax}\; \frac{(-1)^{q-r}(q-r)!}{a^{q+1-r}} = \frac{(-1)^{q+1}\, q!}{a^{q+1}} + e^{ax} \sum_{r=0}^{q} (-1)^{q-r}\, \frac{q!}{r!}\, \frac{x^r}{a^{q+1-r}}.$$


Using this in (2.79) gives

$$F(x,a,p,q) = \Big(x - \frac{\partial}{\partial a}\Big)^p \bigg( \frac{(-1)^{q+1}\, q!}{a^{q+1}} + e^{ax} \sum_{r=0}^{q} (-1)^{q-r}\, \frac{q!}{r!}\, \frac{x^r}{a^{q+1-r}} \bigg) = (-1)^{q+1}\, q! \Big(x - \frac{\partial}{\partial a}\Big)^p a^{-(q+1)} + \sum_{r=0}^{q} (-1)^{q-r}\, \frac{q!}{r!}\, x^r \Big(x - \frac{\partial}{\partial a}\Big)^p \Big(e^{ax}\, a^{-(q+1-r)}\Big). \tag{2.80}$$

To compute the term (x − ∂/∂a)^p a^{−(q+1)} in this formula, use the binomial expansion

$$\Big(x - \frac{\partial}{\partial a}\Big)^p = \sum_{r=0}^{p} \binom{p}{r}\, x^r \Big(-\frac{\partial}{\partial a}\Big)^{p-r},$$

and the general formula for the derivatives of inverse powers

$$\Big(-\frac{\partial}{\partial a}\Big)^p a^{-m} = \frac{(p+m-1)!}{(m-1)!}\; a^{-m-p}, \tag{2.81}$$

to get

$$\Big(x - \frac{\partial}{\partial a}\Big)^p a^{-q-1} = \sum_{r=0}^{p} \binom{p}{r}\, x^r \Big(-\frac{\partial}{\partial a}\Big)^{p-r} a^{-(q+1)} = \sum_{r=0}^{p} \binom{p}{r}\, \frac{(p+q-r)!}{q!}\; a^{r-p-q-1}\, x^r.$$

To compute the term (x − ∂/∂a)^p (e^{ax} a^{−(q+1−r)}) in (2.80), observe that

$$\Big(x - \frac{\partial}{\partial a}\Big)^m e^{ax} = 0, \qquad \forall m \ge 1.$$

It follows that for any function g:

$$\begin{aligned} \Big(x - \frac{\partial}{\partial a}\Big)(e^{ax} g) &= \Big[\Big(x - \frac{\partial}{\partial a}\Big) e^{ax}\Big] g - e^{ax}\, \frac{\partial g}{\partial a} = -e^{ax}\, \frac{\partial g}{\partial a},\\ \Big(x - \frac{\partial}{\partial a}\Big)^2 (e^{ax} g) &= \Big(x - \frac{\partial}{\partial a}\Big)\Big(-e^{ax}\, \frac{\partial g}{\partial a}\Big) = e^{ax} \Big(\frac{\partial}{\partial a}\Big)^2 g,\\ &\ \ \vdots\\ \Big(x - \frac{\partial}{\partial a}\Big)^p (e^{ax} g) &= e^{ax} \Big(-\frac{\partial}{\partial a}\Big)^p g. \end{aligned}$$

Using this with g(a) = a^{−(q+1−r)} and (2.81), we obtain

$$\Big(x - \frac{\partial}{\partial a}\Big)^p \Big(e^{ax}\, a^{-(q+1-r)}\Big) = e^{ax} \Big(-\frac{\partial}{\partial a}\Big)^p a^{-(q+1-r)} = e^{ax}\, \frac{(p+q-r)!}{(q-r)!}\; a^{r-p-q-1}. \tag{2.82}$$


Substituting this and (2.82) in (2.80), we finally get

$$F(x,a,p,q) = (-1)^{q+1} \sum_{r=0}^{p} \binom{p}{r}\,(p+q-r)!\; x^r\, a^{r-p-q-1} + \sum_{r=0}^{q} (-1)^{q-r}\, \frac{q!}{r!}\, \frac{(p+q-r)!}{(q-r)!}\; x^r\, e^{ax}\, a^{r-p-q-1}.$$

This yields precisely (2.78), with P given by (2.72), as is easily seen. ∎

Another way to compute the polynomials G_j is given by the following result.

Theorem 2.26 Let λ_1, λ_2, …, λ_k be the distinct roots of the characteristic polynomial (2.42), of multiplicities m_1, m_2, …, m_k, and let

$$\frac{1}{\prod_{j=1}^{k} (\lambda - \lambda_j)^{m_j}} = \sum_{j=1}^{k} \bigg( \frac{c_{j,1}}{\lambda - \lambda_j} + \frac{c_{j,2}}{(\lambda - \lambda_j)^2} + \cdots + \frac{c_{j,m_j}}{(\lambda - \lambda_j)^{m_j}} \bigg) \tag{2.83}$$

be the partial fraction expansion of 1/p(λ) over ℂ. Then the impulsive response of the differential operator (2.41) is given by (2.68), where

$$G_j(x) = c_{j,1} + c_{j,2}\, x + c_{j,3}\, \frac{x^2}{2!} + \cdots + c_{j,m_j}\, \frac{x^{m_j-1}}{(m_j-1)!} \qquad (1 \le j \le k).$$
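Theorem 2.26 is easy to test on a small example. The sketch below (Python) computes the partial fraction coefficients c_{j,i} for the hypothetical sample p(λ) = (λ − 1)²(λ + 1)² by inverting a power series with rational arithmetic (`poly_mul` and `partial_fractions` are helper names introduced here, not from the text):

```python
from fractions import Fraction

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def partial_fractions(roots):
    """roots: list of (lambda_j, m_j), lambda_j rational and distinct.
    Returns {lambda_j: [c_{j,1}, ..., c_{j,m_j}]} as in (2.83)."""
    res = {}
    for lj, mj in roots:
        # q(u) = p(lambda_j + u) / u^{m_j}, a polynomial in u = lambda - lambda_j
        q = [Fraction(1)]
        for li, mi in roots:
            if li != lj:
                for _ in range(mi):
                    q = poly_mul(q, [Fraction(lj - li), Fraction(1)])
        # invert q as a power series: 1/q = t_0 + t_1 u + t_2 u^2 + ...
        t = [1 / q[0]]
        for s in range(1, mj):
            t.append(-sum((q[r] if r < len(q) else Fraction(0)) * t[s - r]
                          for r in range(1, s + 1)) / q[0])
        # near lambda_j:  1/p = u^{-m_j} (t_0 + t_1 u + ...), so c_{j,i} = t_{m_j - i}
        res[lj] = list(reversed(t))
    return res

pf = partial_fractions([(1, 2), (-1, 2)])   # p(lambda) = (lambda-1)^2 (lambda+1)^2
```

For this sample the expansion yields c_{1,1} = −1/4, c_{1,2} = 1/4 at λ = 1 and c_{2,1} = c_{2,2} = 1/4 at λ = −1, so by the theorem G_1(x) = −1/4 + x/4 and G_2(x) = 1/4 + x/4.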

Proof This theorem can be proved by induction, though it is most easily proved using the Laplace transform or distribution theory. Indeed, one can show that the Laplace transform of the impulsive response g is precisely the reciprocal of the characteristic polynomial:

$$(\mathcal{L}g)(\lambda) = \frac{1}{p(\lambda)}.$$

Decomposing 1/p(λ) according to (2.83) and taking the inverse Laplace transform gives the result (see [10], Eq. (21), p. 81). See also [16], pp. 141–142, for another proof using distribution theory in the convolution algebra D'_+. ∎

We now use the impulsive response to compute the general solution of the homogeneous equation Ly = 0 in terms of the complex exponentials e^{λ_j x}. The following result generalizes Theorem 2.10.

Theorem 2.27 Every solution of Ly = 0 can be written in the form

$$y_h(x) = P_1(x)\, e^{\lambda_1 x} + P_2(x)\, e^{\lambda_2 x} + \cdots + P_k(x)\, e^{\lambda_k x}, \tag{2.84}$$

where P_j (1 ≤ j ≤ k) is a complex polynomial of degree ≤ m_j − 1. Conversely, any function of this form is a solution of Ly = 0.


Proof Let y_h be a solution of Ly = 0. By Corollary 2.21, y_h can be written as a linear combination of g, g', …, g^(n−1), i.e., it is given by (2.53) for some complex numbers c_0, c_1, …, c_{n−1}. Substituting (2.68) in (2.53) we get an expression of the form (2.84), where the degrees of the polynomials P_j can be less than or equal to m_j − 1 (depending on the coefficients c_0, c_1, …, c_{n−1}).

For the converse, observe that since λ_1, λ_2, …, λ_k are the distinct roots of p(λ), of multiplicities m_1, m_2, …, m_k, p(λ) can be factored according to the formula

$$p(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_k)^{m_k}.$$

Similarly, the differential operator L factors as

$$L = \Big(\frac{d}{dx} - \lambda_1\Big)^{m_1} \Big(\frac{d}{dx} - \lambda_2\Big)^{m_2} \cdots \Big(\frac{d}{dx} - \lambda_k\Big)^{m_k}, \tag{2.85}$$

where the order of composition of the various factors is irrelevant. We now observe that if λ ∈ ℂ and m ∈ ℕ_+, then

$$\Big(\frac{d}{dx} - \lambda\Big)^m y = 0 \iff y(x) = P(x)\, e^{\lambda x}, \tag{2.86}$$

where P is a complex polynomial of degree ≤ m − 1. Indeed, given a function y and letting h(x) = y(x) e^{−λx}, we compute

$$\begin{aligned} h'(x) &= y'(x)\, e^{-\lambda x} - \lambda\, y(x)\, e^{-\lambda x} = \Big(\Big(\frac{d}{dx} - \lambda\Big) y\Big)(x)\; e^{-\lambda x},\\ h''(x) &= \big(y''(x) - 2\lambda\, y'(x) + \lambda^2 y(x)\big)\, e^{-\lambda x} = \Big(\Big(\frac{d}{dx} - \lambda\Big)^2 y\Big)(x)\; e^{-\lambda x},\\ &\ \ \vdots\\ h^{(m)}(x) &= \Big(\Big(\frac{d}{dx} - \lambda\Big)^m y\Big)(x)\; e^{-\lambda x}. \end{aligned}$$

It follows that

$$\Big(\frac{d}{dx} - \lambda\Big)^m y = 0 \iff h^{(m)} = 0 \iff h \text{ is a polynomial of degree} \le m-1,$$

as claimed. Note that the implication from left to right also follows by substituting (2.65) in (2.53), as already seen in the first part of the proof for the case of k distinct roots. By writing now L in the form (2.85) and using the commutativity of the various factors and (2.86), we see that any function of the form (2.84) satisfies Ly_h = 0. ∎

Theorem 2.27 identifies another basis of the vector space V of solutions of Ly = 0 on ℝ, namely the well-known basis given explicitly by the n functions

$$e^{\lambda_j x},\quad x\, e^{\lambda_j x},\quad x^2\, e^{\lambda_j x},\ \ldots,\ x^{m_j-1}\, e^{\lambda_j x} \qquad (1 \le j \le k).$$

Indeed, these functions belong to V (by the converse part of the theorem), and every element of V is a linear combination of them (by (2.84)). Since V has dimension n


(by Corollary 2.21), these functions must be linearly independent. As a corollary one gets the following result, which gives a real basis of V_ℝ when L has real coefficients and generalizes Corollary 2.11.

Corollary 2.28 Let L have real coefficients, and let λ_1, λ_2, …, λ_p, α_1 ± iβ_1, α_2 ± iβ_2, …, α_q ± iβ_q be the distinct roots of the characteristic polynomial p(λ), where λ_j (1 ≤ j ≤ p) is a real root of multiplicity n_j, and α_j ± iβ_j (1 ≤ j ≤ q) is a pair of complex conjugate roots, both of multiplicity m_j, with n_1 + n_2 + ⋯ + n_p + 2m_1 + 2m_2 + ⋯ + 2m_q = n. Any real-valued solution of the homogeneous equation Ly = 0 can be written in the form

$$y(x) = \sum_{j=1}^{p} P_j(x)\, e^{\lambda_j x} + \sum_{j=1}^{q} e^{\alpha_j x} \big( Q_j(x) \cos\beta_j x + R_j(x) \sin\beta_j x \big),$$

where P_j (1 ≤ j ≤ p) is a real polynomial of degree ≤ n_j − 1, and Q_j, R_j (1 ≤ j ≤ q) are real polynomials of degree ≤ m_j − 1. Conversely, any function of this form is a real solution of Ly = 0.

Proof A basis of V is given by the n functions

$$e^{\lambda_j x},\ x\, e^{\lambda_j x},\ \ldots,\ x^{n_j-1}\, e^{\lambda_j x} \quad (1 \le j \le p), \qquad e^{(\alpha_j \pm i\beta_j)x},\ x\, e^{(\alpha_j \pm i\beta_j)x},\ \ldots,\ x^{m_j-1}\, e^{(\alpha_j \pm i\beta_j)x} \quad (1 \le j \le q).$$

Using

$$e^{\alpha_j x} \cos\beta_j x = \frac{e^{(\alpha_j+i\beta_j)x} + e^{(\alpha_j-i\beta_j)x}}{2}, \qquad e^{\alpha_j x} \sin\beta_j x = \frac{e^{(\alpha_j+i\beta_j)x} - e^{(\alpha_j-i\beta_j)x}}{2i},$$

it is easy to verify that the functions

$$x^k\, e^{\alpha_j x} \cos\beta_j x, \qquad x^k\, e^{\alpha_j x} \sin\beta_j x \qquad (0 \le k \le m_j-1,\ 1 \le j \le q)$$

solve Ly = 0 and are linearly independent in V. It follows easily that the n functions

$$e^{\lambda_j x},\ x\, e^{\lambda_j x},\ \ldots,\ x^{n_j-1}\, e^{\lambda_j x} \quad (1 \le j \le p),$$
$$e^{\alpha_j x} \cos\beta_j x,\ x\, e^{\alpha_j x} \cos\beta_j x,\ \ldots,\ x^{m_j-1}\, e^{\alpha_j x} \cos\beta_j x \quad (1 \le j \le q),$$
$$e^{\alpha_j x} \sin\beta_j x,\ x\, e^{\alpha_j x} \sin\beta_j x,\ \ldots,\ x^{m_j-1}\, e^{\alpha_j x} \sin\beta_j x \quad (1 \le j \le q)$$


are linearly independent in V and thus form a real-valued basis of V. But then these functions are linearly independent in V_ℝ, and since dim V_ℝ = n (by Corollary 2.21), they form a basis of V_ℝ. We conclude that any real-valued solution of Ly = 0 is a linear combination of these n functions with real coefficients, and conversely, any such linear combination is in V_ℝ. ∎

Example 4 Find the general real solution of the differential equation

$$y^{(4)} - 2y'' + y = \frac{1}{\mathrm{ch}^3 x}. \tag{2.87}$$

Solution. The characteristic polynomial p(λ) = λ^4 − 2λ^2 + 1 = (λ^2 − 1)^2 has the roots λ_1 = 1, λ_2 = −1, both of multiplicity m = 2. The general solution of the homogeneous equation is then

$$y_h(x) = (ax + b)\, e^x + (cx + d)\, e^{-x} \qquad (a, b, c, d \in \mathbb{R}).$$

To this we must add any particular solution of (2.87), for example the solution y_p of the initial value problem

$$\begin{cases} y^{(4)} - 2y'' + y = \dfrac{1}{\mathrm{ch}^3 x}\\ y(0) = y'(0) = y''(0) = y'''(0) = 0. \end{cases}$$

From (2.73), (2.74), and (2.75) we find the impulsive response

$$g(x) = \big(-\tfrac{1}{4} + \tfrac{1}{4}x\big)\, e^x + \big(\tfrac{1}{4} + \tfrac{1}{4}x\big)\, e^{-x} = \tfrac{1}{2}\, x\, \mathrm{ch}\, x - \tfrac{1}{2}\, \mathrm{sh}\, x.$$

Alternatively, use the partial fraction expansion of 1/p(λ),

$$\frac{1}{(\lambda^2-1)^2} = \frac{a}{\lambda-1} + \frac{b}{(\lambda-1)^2} + \frac{c}{\lambda+1} + \frac{d}{(\lambda+1)^2},$$

compute a = −1/4, b = c = d = 1/4, and use Theorem 2.26 to get the same result.

The forcing term 1/ch³x is continuous on ℝ. By (2.45) we get, for all x ∈ ℝ,

$$y_p(x) = \frac{1}{2} \int_0^x \frac{(x-t)\, \mathrm{ch}(x-t)}{\mathrm{ch}^3 t}\, dt - \frac{1}{2} \int_0^x \frac{\mathrm{sh}(x-t)}{\mathrm{ch}^3 t}\, dt.$$

Now use the formulas ch(x − t) = ch x ch t − sh x sh t and sh(x − t) = sh x ch t − ch x sh t. The following integrals are immediate:

$$\int \frac{1}{\mathrm{ch}^2 t}\, dt = \mathrm{th}\, t + c, \qquad \int \frac{\mathrm{sh}\, t}{\mathrm{ch}^3 t}\, dt = -\frac{1}{2\,\mathrm{ch}^2 t} + c.$$

Integrating by parts we compute

$$\int \frac{t}{\mathrm{ch}^2 t}\, dt = t\, \mathrm{th}\, t - \log(\mathrm{ch}\, t) + c, \qquad \int \frac{t\, \mathrm{sh}\, t}{\mathrm{ch}^3 t}\, dt = -\frac{t}{2\,\mathrm{ch}^2 t} + \frac{1}{2}\, \mathrm{th}\, t + c.$$

Thus

$$y_p(x) = \tfrac{1}{2}\, \mathrm{ch}\, x\, \log(\mathrm{ch}\, x) - \tfrac{1}{4}\, x\, \mathrm{sh}\, x.$$
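This can be double-checked numerically: the expression for y_p should satisfy (2.87). A sketch (Python, using central finite differences for the derivatives):

```python
import math

def y_p(x):
    """The particular solution found above: (1/2) ch x log(ch x) - (1/4) x sh x."""
    return 0.5 * math.cosh(x) * math.log(math.cosh(x)) - 0.25 * x * math.sinh(x)

def d2(f, x, h):
    """Central difference for the second derivative."""
    return (f(x - h) - 2 * f(x) + f(x + h)) / h ** 2

def d4(f, x, h):
    """Central 5-point stencil for the fourth derivative."""
    return (f(x - 2 * h) - 4 * f(x - h) + 6 * f(x)
            - 4 * f(x + h) + f(x + 2 * h)) / h ** 4

x, h = 0.7, 0.02
lhs = d4(y_p, x, h) - 2 * d2(y_p, x, h) + y_p(x)
rhs = 1 / math.cosh(x) ** 3
```

Both sides agree to within the O(h²) truncation error of the stencils.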

The final result can be written as

$$y(x) = y_h(x) + \tfrac{1}{2}\, \mathrm{ch}\, x\, \log(\mathrm{ch}\, x),$$

by noting that the term −¼ x sh x is a solution of the homogeneous equation.

Exercises

1. Prove Corollary 2.22.
2. Prove Proposition 2.23.
3. Solve the following initial value problems:

(a) y''' − 3y'' + 3y' − y = e^x/(x + 1), y(0) = y'(0) = y''(0) = 0.
(b) y''' − 2y'' − y' + 2y = e^x/(e^x + 1), y(0) = y'(0) = y''(0) = 0.
(c) y''' + y' = 1/sin x, y(π/2) = y'(π/2) = y''(π/2) = 0.
(d) y''' − y'' − y' + y = 1/ch³x, y(0) = y'(0) = y''(0) = 0.

4. Find the general real solution of each of the following differential equations:

(a) y''' + y = 0
(b) y^(4) + y = 0
(c) y^(4) + 2y''' + 2y'' + 2y' + y = 0
(d) y^(5) + y = 0 (recall that cos(π/5) = (√5 + 1)/4 and sin(π/5) = ¼√(10 − 2√5))
(e) y^(6) + y = 0
(f) y^(8) + 8y^(6) + 24y^(4) + 32y'' + 16y = 0.

5. Compute the following analytic function in terms of elementary functions:

$$y(x) = \sum_{n=0}^{\infty} \frac{x^{3n}}{(3n)!} = 1 + \frac{x^3}{3!} + \frac{x^6}{6!} + \frac{x^9}{9!} + \cdots.$$

(Hint: using term-by-term differentiation, show that the function y solves the differential equation y''' − y = 0 with the initial conditions y(0) = 1, y'(0) = y''(0) = 0.)

6. Prove that if k ∈ ℕ_+ then

$$\sum_{n=0}^{\infty} \frac{x^{kn}}{(kn)!} = \frac{1}{k} \sum_{j=0}^{k-1} e^{\alpha_j x},$$

where the α_j are the k-th roots of unity,

$$\alpha_j = \sqrt[k]{1} = e^{i\,\frac{2\pi j}{k}} \qquad (j = 0, 1, \ldots, k-1).$$

(Hint: prove that the function y(x) = Σ_{n=0}^∞ x^{kn}/(kn)! solves the differential equation y^(k) − y = 0 with the initial conditions y(0) = 1, y'(0) = y''(0) = ⋯ = y^(k−1)(0) = 0.)
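The identity in Exercise 6 is stated in the exercise itself, so it can at least be checked numerically before proving it; a sketch for k = 3 (Python):

```python
import cmath
import math

k, x = 3, 1.1
# left-hand side: the truncated power series (terms decay factorially fast)
series = sum(x ** (k * n) / math.factorial(k * n) for n in range(20))
# right-hand side: average of e^{alpha_j x} over the k-th roots of unity
roots_sum = sum(cmath.exp(cmath.exp(2j * math.pi * j / k) * x)
                for j in range(k)) / k
```

The imaginary parts of `roots_sum` cancel and the two values agree to machine precision.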

2.5 The Method of Undetermined Coefficients

The computation of the impulsive response g and of the convolution integral in (2.45) may be rather laborious, in general. The problem of finding a particular solution of (2.39) can be solved in a more efficient way when the forcing term f has a particular form, namely when it is a polynomial, an exponential, or a product of terms of this kind. One then looks for a particular solution which is similar to f. The unknown coefficients in this tentative solution are determined by substitution into the differential equation. The justification of this method, known as the method of undetermined coefficients, and the precise form of the solution to be looked for, are given by the following result and its corollary.

Theorem 2.29 The differential equation

$$Ly = \bigg( \Big(\frac{d}{dx}\Big)^n + a_1 \Big(\frac{d}{dx}\Big)^{n-1} + \cdots + a_{n-1}\frac{d}{dx} + a_n \bigg) y = P(x)\, e^{\lambda_0 x}, \tag{2.88}$$

where a_1, …, a_n, λ_0 ∈ ℂ and P is a complex polynomial of degree m, has a particular solution of the form

$$y(x) = x^r\, Q(x)\, e^{\lambda_0 x}, \tag{2.89}$$

where Q is a complex polynomial of degree m, and

$$r = \begin{cases} 0 & \text{if } p(\lambda_0) \ne 0\\ \text{multiplicity of } \lambda_0 & \text{if } p(\lambda_0) = 0. \end{cases}$$

Proof Let g be the impulsive response of L. By Theorem 2.14, (2.88) has the particular solution

$$y_p(x) = \int_0^x g(x-t)\, P(t)\, e^{\lambda_0 t}\, dt.$$


Let P(x) = Σ_{j=0}^m c_j x^j. Recalling (2.5), (2.66), and (2.67), we can rewrite y_p in convolution form (for, e.g., x ≥ 0) as

$$y_p = \sum_{j=0}^{m} j!\, c_j \big( \theta g_{\lambda_1,m_1} * \cdots * \theta g_{\lambda_k,m_k} * \theta g_{\lambda_0,j+1} \big). \tag{2.90}$$

Suppose first p(λ_0) ≠ 0. Then the expression in round brackets in (2.90) is the impulsive response of the differential operator (d/dx − λ_1)^{m_1} ⋯ (d/dx − λ_k)^{m_k} (d/dx − λ_0)^{j+1}, with the extra root λ_0 of multiplicity j + 1 (see (2.67)). By Theorem 2.24, there are polynomials G_{j,1}, …, G_{j,k}, G_{j,0}, of degrees m_1 − 1, …, m_k − 1, j, respectively, such that

$$\big( \theta g_{\lambda_1,m_1} * \cdots * \theta g_{\lambda_k,m_k} * \theta g_{\lambda_0,j+1} \big)(x) = G_{j,1}(x)\, e^{\lambda_1 x} + \cdots + G_{j,k}(x)\, e^{\lambda_k x} + G_{j,0}(x)\, e^{\lambda_0 x},$$

for x ≥ 0. The same expression is obtained for −θ̃g_{λ_1,m_1} ∗ ⋯ ∗ θ̃g_{λ_k,m_k} ∗ θ̃g_{λ_0,j+1} when x ≤ 0. The first k terms in the right-hand side of this formula are solutions of the homogeneous equation Ly = 0 (by Theorem 2.27). It follows from (2.90) that (2.88) has a particular solution of the form

$$y(x) = \sum_{j=0}^{m} j!\, c_j\, G_{j,0}(x)\, e^{\lambda_0 x} := Q(x)\, e^{\lambda_0 x},$$

where Q is a polynomial of degree m = max_{0≤j≤m} j.

Suppose now p(λ_0) = 0, for example let λ_0 = λ_1. We observe that the function θg_{λ_1,m_1} ∗ θg_{λ_1,j+1} that occurs in (2.90) is just the impulsive response θg_{λ_1,m_1+j+1} of the differential operator (d/dx − λ_1)^{m_1} (d/dx − λ_1)^{j+1} = (d/dx − λ_1)^{m_1+j+1} (for x ≥ 0). In other words, we have the general formula

$$\theta g_{\lambda,p} * \theta g_{\lambda,q} = \theta g_{\lambda,p+q} \qquad (\lambda \in \mathbb{C},\ p, q \in \mathbb{N}_+), \tag{2.91}$$

which can also be proved directly from (2.66) for x ≥ 0. For x ≤ 0 we get the analogous formula −θ̃g_{λ,p} ∗ θ̃g_{λ,q} = −θ̃g_{λ,p+q}. By (2.90) we get for x ≥ 0

$$y_p = \sum_{j=0}^{m} j!\, c_j \big( \theta g_{\lambda_1,m_1+j+1} * \theta g_{\lambda_2,m_2} * \cdots * \theta g_{\lambda_k,m_k} \big).$$

Again the expression in round brackets in this formula is the impulsive response of the differential operator (d/dx − λ_1)^{m_1+j+1} ⋯ (d/dx − λ_k)^{m_k}, which by Theorem 2.24 is of the form

$$G_{j,1}(x)\, e^{\lambda_1 x} + G_{j,2}(x)\, e^{\lambda_2 x} + \cdots + G_{j,k}(x)\, e^{\lambda_k x},$$

2.5 The Method of Undetermined Coefficients

51

for polynomials G j,1 , . . . , G j,k , with deg G j,1 = m 1 + j, deg G j,l = m l − 1 (2 ≤ l ≤ k). Since the term G j,2 eλ2 + · · · + G j,k eλk is a solution of L y = 0, we conclude that (2.88) has a particular solution given by y(x) =

m 

j! c j G j,1 (x) eλ1 x := Q 1 (x)eλ1 x ,

j=0

where Q 1 has degree m 1 + m = max0≤ j≤m (m 1 + j). By writing out Q 1 as   Q 1 (x) = d0 + d1 x + · · · + dm 1 −1 x m 1 −1 + dm 1 x m 1 + · · · + dm 1 +m x m 1 +m , we see that the polynomial in round brackets, multiplied by eλ1 x , is a solution of L y = 0. Finally, we obtain a particular solution of (2.88) of the form   x m 1 dm 1 + dm 1 +1 x + · · · + dm 1 +m x m eλ0 x := x r Q(x)eλ0 x , where deg Q = m, and r = m 1 is the multiplicity of λ0 in p(λ).
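The resonant case r > 0 of Theorem 2.29 lends itself to a quick numerical sanity check. The sketch below is an illustration with assumed data, not from the book: for L = d/dx − 1 the characteristic polynomial p(λ) = λ − 1 vanishes at λ₀ = 1 with multiplicity 1, so the theorem predicts the ansatz y(x) = x·Q(x)eˣ with Q constant; indeed y(x) = x eˣ solves y′ − y = eˣ.

```python
import math

# Quick check (illustration, not from the book) of the resonant case of
# Theorem 2.29: for L = d/dx - 1, p(lambda) = lambda - 1 vanishes at
# lambda_0 = 1 with multiplicity 1, so Ly = e^x has a particular solution
# of the form y(x) = x^1 * Q(x) e^x with Q constant; y(x) = x e^x works.
def y(x):
    return x * math.exp(x)

def dy(x):                     # exact derivative (x + 1) e^x
    return (x + 1.0) * math.exp(x)

for x in [-1.0, 0.0, 0.5, 2.0]:
    # residual of Ly - f = y' - y - e^x, identically zero
    assert abs(dy(x) - y(x) - math.exp(x)) < 1e-12
```

The residual vanishes identically, confirming that no polynomial ansatz of degree zero without the factor x (i.e., r = 0) could have worked here, since c·eˣ solves the homogeneous equation.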



Remark 2.30 The polynomial Q in (2.89) is uniquely determined by the requirement that y given by (2.89) satisfy (2.88). See, for instance, [8], pp. 97–98.

By applying the theorem with L and P real, and separating out the real and imaginary parts in the complex solution (2.89), we obtain the following real version of the method of undetermined coefficients.

Corollary 2.31 If L in (2.88) has real coefficients and P is a real polynomial of degree m, then the differential equation \(Ly = P(x)e^{\alpha x}\cos\beta x\), where α, β ∈ R, has a particular solution of the form

\[ y(x) = x^r e^{\alpha x}\left( Q_1(x)\cos\beta x + Q_2(x)\sin\beta x \right), \tag{2.92} \]

where \(Q_1, Q_2\) are real polynomials of degree ≤ m (but at least one of them has degree m), and

\[ r = \begin{cases} 0 & \text{if } p(\alpha + i\beta) \neq 0 \\ \text{multiplicity of } \alpha + i\beta & \text{if } p(\alpha + i\beta) = 0. \end{cases} \]

A similar result holds for the differential equation \(Ly = P(x)e^{\alpha x}\sin\beta x\) (β ≠ 0).

Proof Let β ≠ 0. By letting \(\lambda_0 = \alpha + i\beta\) in Theorem 2.29, we see that the equation \(Ly = P(x)e^{(\alpha+i\beta)x}\) has a particular solution \(y_c\) of the form

\[ y_c(x) = x^r Q(x)\,e^{(\alpha+i\beta)x} = x^r\left( Q_1(x) + iQ_2(x) \right)e^{\alpha x}\left( \cos\beta x + i\sin\beta x \right), \]

where \(Q_1, Q_2\) are real polynomials of degree ≤ m and at least one of them has degree m (since Q has degree m). Separating out the real and imaginary parts we get

\[ y_c(x) = x^r e^{\alpha x}\left\{ Q_1(x)\cos\beta x - Q_2(x)\sin\beta x + i\left[ Q_2(x)\cos\beta x + Q_1(x)\sin\beta x \right] \right\}. \]

By linearity and the fact that L has real coefficients, it follows that the function

\[ y_1(x) = \operatorname{Re} y_c(x) = x^r e^{\alpha x}\left[ Q_1(x)\cos\beta x - Q_2(x)\sin\beta x \right] \]

solves \(Ly = P(x)\,e^{\alpha x}\cos\beta x\), whereas the function

\[ y_2(x) = \operatorname{Im} y_c(x) = x^r e^{\alpha x}\left[ Q_2(x)\cos\beta x + Q_1(x)\sin\beta x \right] \]

solves \(Ly = P(x)\,e^{\alpha x}\sin\beta x\). We thus obtain (2.92) (with \(-Q_2\) in place of \(Q_2\)). ∎

Note that if \((Q_1, Q_2)\) are the polynomials in the particular solution (2.92) with forcing term \(P(x)e^{\alpha x}\cos\beta x\), then the polynomials \((-Q_2, Q_1)\) give the particular solution with forcing term \(P(x)e^{\alpha x}\sin\beta x\). Also note that for β ≠ 0 the polynomial Q is generally complex-valued (i.e., \(Q_2 \neq 0\)) even though P and L have real coefficients. If β = 0 the result (2.92) for a particular solution of \(Ly = P(x)e^{\alpha x}\) (namely \(y(x) = x^r e^{\alpha x}Q_1(x)\) with \(Q_1\) of degree m) becomes identical with (2.89), the polynomial Q being real-valued if \(\lambda_0 \in \mathbb{R}\).

We observe that, for calculational purposes, it is often simpler to use the complex formalism: in order to find a particular solution of \(Ly = P(x)e^{\alpha x}\cos\beta x\) (resp. \(Ly = P(x)e^{\alpha x}\sin\beta x\)) it is generally easier to solve \(Ly = P(x)e^{(\alpha+i\beta)x}\) by Theorem 2.29, and then take the real (resp. imaginary) part of the complex solution thus obtained.

Example 5 Find a particular solution of the differential equation

\[ y'' - 4y' + 5y = x^2 e^{2x}\cos x. \tag{2.93} \]

Solution. The characteristic polynomial \(p(\lambda) = \lambda^2 - 4\lambda + 5\) has roots λ = 2 ± i, both of multiplicity 1, and the general real solution of the homogeneous equation is

\[ y_{om}(x) = c_1 e^{2x}\cos x + c_2 e^{2x}\sin x \qquad (c_1, c_2 \in \mathbb{R}). \]

As \(P(x) = x^2\) is a polynomial of degree 2, Corollary 2.31 implies that (2.93) has a particular solution of the form

\[ y_1(x) = x\,e^{2x}\left[ \left( ax^2 + bx + c \right)\cos x + \left( dx^2 + ex + f \right)\sin x \right], \tag{2.94} \]


where a, b, c, d, e, f ∈ R. Substituting (2.94) in (2.93) we find, by a tedious calculation, a = c = e = 0, b = −f = 1/4, d = 1/6. Using the complex formalism simplifies the computation, as we now show. Instead of solving (2.93), we solve the complex equation

\[ y'' - 4y' + 5y = x^2 e^{(2+i)x}. \tag{2.95} \]

According to Theorem 2.29, this equation has a particular solution of the form

\[ y(x) = x\,e^{(2+i)x}\left( Ax^2 + Bx + C \right) = e^{(2+i)x}\left( Ax^3 + Bx^2 + Cx \right), \]

where A, B, C are complex constants to be determined. We have

\[ y'(x) = e^{(2+i)x}\left[ (2+i)\left( Ax^3 + Bx^2 + Cx \right) + 3Ax^2 + 2Bx + C \right], \]
\[ y''(x) = e^{(2+i)x}\left[ (2+i)^2\left( Ax^3 + Bx^2 + Cx \right) + 2(2+i)\left( 3Ax^2 + 2Bx + C \right) + 6Ax + 2B \right]. \]

Substituting in (2.95) we get the equality

\[ e^{(2+i)x}\left\{ \left[ (2+i)^2 - 4(2+i) + 5 \right]\left( Ax^3 + Bx^2 + Cx \right) + 2(2+i)\left( 3Ax^2 + 2Bx + C \right) - 4\left( 3Ax^2 + 2Bx + C \right) + 6Ax + 2B \right\} = x^2 e^{(2+i)x}. \]

Now the term in square brackets vanishes, being equal to p(2+i) = 0, and we get

\[ 6iAx^2 + 2\left( 3A + 2iB \right)x + 2B + 2iC = x^2, \]

whence

\[ \begin{cases} 6iA = 1 \\ 3A + 2iB = 0 \\ B + iC = 0, \end{cases} \implies \begin{cases} A = -i/6 \\ B = 1/4 \\ C = i/4. \end{cases} \]

It follows that (2.95) has the particular solution

\[ y(x) = x\,e^{(2+i)x}\left( -\tfrac{1}{6}ix^2 + \tfrac{1}{4}x + \tfrac{1}{4}i \right) = x\,e^{2x}\left( \cos x + i\sin x \right)\left[ \tfrac{1}{4}x + i\left( \tfrac{1}{4} - \tfrac{1}{6}x^2 \right) \right] \]
\[ = x\,e^{2x}\left\{ \tfrac{1}{4}x\cos x + \left( \tfrac{1}{6}x^2 - \tfrac{1}{4} \right)\sin x + i\left[ \left( \tfrac{1}{4} - \tfrac{1}{6}x^2 \right)\cos x + \tfrac{1}{4}x\sin x \right] \right\}. \]

Since \(x^2 e^{2x}\cos x = \operatorname{Re}\left( x^2 e^{(2+i)x} \right)\), the real solution sought of (2.93) is precisely

\[ y_1(x) = \operatorname{Re} y(x) = x\,e^{2x}\left[ \tfrac{1}{4}x\cos x + \left( \tfrac{1}{6}x^2 - \tfrac{1}{4} \right)\sin x \right], \]
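The coefficients A, B, C found above can be verified numerically. The sketch below (an illustration, not part of the book's text) plugs the cubic polynomial into the identity y″ − 4y′ + 5y = (P″ + (2λ−4)P′ + p(λ)P)e^{λx} for y = P e^{λx} and checks that the residual against the forcing term vanishes.

```python
import cmath

# Numerical verification (illustration) of Example 5's complex particular
# solution: y(x) = e^{(2+i)x}(A x^3 + B x^2 + C x) with A = -i/6, B = 1/4,
# C = i/4 should satisfy y'' - 4y' + 5y = x^2 e^{(2+i)x}.  For y = P e^{lx},
# y'' - 4y' + 5y = (P'' + (2l - 4)P' + (l^2 - 4l + 5)P) e^{lx}.
lam = 2 + 1j
A, B, C = -1j/6, 0.25, 0.25j

def residual(x):
    P  = A*x**3 + B*x**2 + C*x
    P1 = 3*A*x**2 + 2*B*x + C          # P'
    P2 = 6*A*x + 2*B                   # P''
    Ly = (P2 + (2*lam - 4)*P1 + (lam**2 - 4*lam + 5)*P) * cmath.exp(lam*x)
    return Ly - x**2 * cmath.exp(lam*x)

for x in [-1.0, 0.0, 0.7, 2.3]:
    assert abs(residual(x)) < 1e-8
```

Note that p(2+i) = 0, so the cubic term drops out exactly as in the hand computation.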


in agreement with the calculation using the real formalism. By Corollary 2.31, it also follows that the function

\[ y_2(x) = \operatorname{Im} y(x) = x\,e^{2x}\left[ \left( \tfrac{1}{4} - \tfrac{1}{6}x^2 \right)\cos x + \tfrac{1}{4}x\sin x \right] \]

solves

\[ y'' - 4y' + 5y = \operatorname{Im}\left( x^2 e^{(2+i)x} \right) = x^2 e^{2x}\sin x. \]

Exercises

1. Find a particular solution of the differential equation \(y^{(4)} - y = f(x)\) for each of the following forcing terms:

(a) \(f(x) = e^x\)  (b) \(f(x) = \cos x\)  (c) \(f(x) = \sin x\)
(d) \(f(x) = e^x\cos x\)  (e) \(f(x) = x\,e^x\sin x\)  (f) \(f(x) = x^2 e^x\cos x\)

2. Find a particular solution of the differential equation \(y''' - 3y'' + 4y' - 2y = x^2 e^x\sin x\).

Chapter 3

The Case of Variable Coefficients

Most of the theory developed so far generalizes to linear differential equations with variable coefficients, namely the equations of the form

\[ Ly = y^{(n)} + a_1(x)y^{(n-1)} + a_2(x)y^{(n-2)} + \cdots + a_{n-1}(x)y' + a_n(x)y = f(x), \]

where \(a_1, a_2, \ldots, a_n, f\) are complex-valued continuous functions in an interval I ⊂ R. In this section we extend the impulsive response method to this case. We use the result of Mammana [13, 14] about the factorization of a real linear ordinary differential operator into first-order (complex) factors, as well as a recent generalization in [4] to the case of complex-valued coefficients. The presentation will be less elementary than in the rest of the paper.

We shall need some basic results from the theory of linear equations, such as the existence-uniqueness theorem for the solutions of the initial value problem. According to this theorem, if \(b_0, \ldots, b_{n-1}\) are any n (complex) constants and \(x_0 \in I\), there exists one, and only one, solution of the initial value problem

\[ \begin{cases} Ly = f(x) \\ y(x_0) = b_0,\; y'(x_0) = b_1,\; \ldots,\; y^{(n-1)}(x_0) = b_{n-1}, \end{cases} \]

defined on the entire interval I. (See, e.g., [8], Theorem 8, p. 260.) This implies easily that the solutions of the n-th order homogeneous equation Ly = 0 on I form a vector space of dimension n. We also recall that n solutions of this equation on I are linearly independent if and only if their Wronskian never vanishes in I. We refer to [8], Chap. 3, for a nice introduction.

© Springer International Publishing AG 2016 R. Camporesi, An Introduction to Linear Ordinary Differential Equations Using the Impulsive Response Method and Factorization, PoliTO Springer Series, DOI 10.1007/978-3-319-49667-2_3


3.1 n = 1

The case n = 1 is elementary. Consider the differential equation \(y' - \alpha(x)y = f(x)\), where α, f are complex-valued continuous functions in an interval I. It is well known that the general solution is

\[ y(x) = e^{A(x)}\int e^{-A(x)}f(x)\,dx, \]

where A is any primitive of α in I, and \(\int e^{-A(x)}f(x)\,dx\) denotes the set of all primitives of \(e^{-A(x)}f(x)\) in the interval I. Fix \(x_0 \in I\), take \(A(x) = \int_{x_0}^x \alpha(t)\,dt\), and write \(\int e^{-A(x)}f(x)\,dx = \int_{x_0}^x e^{-A(t)}f(t)\,dt + k\) (k ∈ C). Then we can rewrite y as

\[ y(x) = e^{\int_{x_0}^x \alpha(r)\,dr}\int_{x_0}^x e^{-\int_{x_0}^t \alpha(r)\,dr}f(t)\,dt + k\,e^{\int_{x_0}^x \alpha(r)\,dr} = \int_{x_0}^x e^{\int_t^x \alpha(r)\,dr}f(t)\,dt + k\,e^{\int_{x_0}^x \alpha(r)\,dr} = \int_{x_0}^x g_\alpha(x,t)f(t)\,dt + k\,g_\alpha(x,x_0), \tag{3.1} \]

where

\[ g_\alpha(x,t) = e^{\int_t^x \alpha(r)\,dr}. \tag{3.2} \]

The function \(g_\alpha : I\times I \to \mathbb{C}\) is called the impulsive response kernel. It solves the homogeneous equation in the variable x with the initial condition y(t) = 1, i.e., it is the unique solution of the initial value problem

\[ \begin{cases} y' - \alpha(x)y = 0 \\ y(t) = 1. \end{cases} \tag{3.3} \]

The constant k in (3.1) is computed to be \(k = y(x_0)\), and the term \(k\,g_\alpha(x,x_0)\) gives the general solution of the homogeneous equation. On the other hand, the function

\[ y_p(x) = \int_{x_0}^x g_\alpha(x,t)f(t)\,dt \tag{3.4} \]

is the unique solution of the initial value problem

\[ \begin{cases} y' - \alpha(x)y = f(x) \\ y(x_0) = 0. \end{cases} \tag{3.5} \]

If α is constant, then \(g_\alpha(x,t) = e^{\alpha(x-t)}\), and (3.1) reduces to (2.11).
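The n = 1 formulas are easy to check numerically. The sketch below is an illustration with assumed data, not from the book: it takes α(x) = x, f ≡ 1 and x₀ = 0, so the kernel (3.2) is g_α(x,t) = e^{(x²−t²)/2}, approximates (3.4) by the trapezoidal rule, and verifies the defining property y′ − xy = 1, y(0) = 0 by a central difference.

```python
import math

# Numerical check (illustration with assumed data, not from the book) of the
# n = 1 variation-of-constants formula (3.4): for alpha(x) = x, f = 1 and
# x0 = 0 the kernel (3.2) is g_alpha(x,t) = e^{(x^2 - t^2)/2}, and
# y_p(x) = integral of g_alpha(x,t) dt from 0 to x solves y' - x y = 1.
def g_alpha(x, t):
    return math.exp((x*x - t*t) / 2.0)

def y_p(x, n=2000):                      # trapezoidal approximation of (3.4)
    if x == 0.0:
        return 0.0
    h = x / n
    s = 0.5 * (g_alpha(x, 0.0) + 1.0)    # g_alpha(x, x) = 1
    for k in range(1, n):
        s += g_alpha(x, k*h)
    return s * h

assert abs(y_p(0.0)) < 1e-15             # trivial initial condition (3.5)
for x in [0.3, 1.0, 1.7]:
    h = 1e-5
    dy = (y_p(x + h) - y_p(x - h)) / (2*h)
    assert abs(dy - x * y_p(x) - 1.0) < 1e-4
```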


3.2 n = 2

Consider the second-order linear differential equation \(y'' + a_1(x)y' + a_2(x)y = f(x)\), where \(a_1, a_2\) and f are complex-valued continuous functions in an interval I ⊂ R, i.e., \(a_1, a_2, f \in C^0(I,\mathbb{C})\). As in the case of constant coefficients, we write the equation in operator form as Ly = f(x), where L is the 2nd-order linear differential operator

\[ L = \left(\tfrac{d}{dx}\right)^2 + a_1(x)\tfrac{d}{dx} + a_2(x). \tag{3.6} \]

When f ≠ 0 the equation is called non-homogeneous, Ly = 0 is the associated homogeneous equation. Let us try to write L in the factored form

\[ L = \left( \tfrac{d}{dx} - \alpha_1(x) \right)\left( \tfrac{d}{dx} - \alpha_2(x) \right), \tag{3.7} \]

where \(\alpha_1 \in C^0(I,\mathbb{C})\) and \(\alpha_2 \in C^1(I,\mathbb{C})\). We have

\[ \left( \tfrac{d}{dx} - \alpha_1 \right)\left( \tfrac{d}{dx} - \alpha_2 \right) = \left(\tfrac{d}{dx}\right)^2 - (\alpha_1 + \alpha_2)\tfrac{d}{dx} + \alpha_1\alpha_2 - \alpha_2'. \]

Equating this to L in (3.6) gives

\[ \begin{cases} a_1 = -\alpha_1 - \alpha_2 \\ a_2 = \alpha_1\alpha_2 - \alpha_2'. \end{cases} \tag{3.8} \]

By substituting \(\alpha_1 = -a_1 - \alpha_2\) into the second equation, we see that the function \(\alpha_2\) satisfies the complex Riccati equation

\[ \alpha_2' + \alpha_2^2 + a_1\alpha_2 + a_2 = 0. \tag{3.9} \]

This is a first-order nonlinear equation, and there is no general formula for solving it. If \(a_1\) and \(a_2\) are rational functions of x with rational number coefficients, there are algorithms to search for rational solutions of (3.9), see [18], pp. 17–18. See also [18], Appendix D, for a list of known factorizations in this case.

We have proved that if L admits the factorization (3.7), then \(\alpha_2\) is a solution of (3.9). Conversely, if \(\alpha_2\) is a solution of (3.9) on I and \(\alpha_1 = -a_1 - \alpha_2\), then L given by (3.7) becomes identical to L in (3.6). Now observe that if L is factored as in (3.7), then a solution of the homogeneous equation Ly = 0 is certainly given by any solution of

\[ \left( \tfrac{d}{dx} - \alpha_2(x) \right)y = 0, \]


for instance by the function

\[ \alpha(x) = e^{\int_{x_0}^x \alpha_2(t)\,dt} = g_{\alpha_2}(x,x_0) \]

(cf. (3.2)), where \(x_0 \in I\) is fixed. We have α(x) ≠ 0 ∀x ∈ I, and \(\alpha_2 = \alpha'/\alpha\), i.e., \(\alpha_2\) is the logarithmic derivative of α.

Conversely, suppose \(\alpha : I \to \mathbb{C}\) is a solution of Ly = 0 that never vanishes on I. Define \(\alpha_2 = \alpha'/\alpha\), then

\[ \alpha_2' = \frac{\alpha\alpha'' - \alpha'^2}{\alpha^2}, \]

and

\[ \alpha_2' + \alpha_2^2 + a_1\alpha_2 + a_2 = \frac{\alpha\alpha'' - \alpha'^2}{\alpha^2} + \left(\frac{\alpha'}{\alpha}\right)^2 + a_1\frac{\alpha'}{\alpha} + a_2 = \frac{\alpha'' + a_1\alpha' + a_2\alpha}{\alpha} = 0, \]

i.e., \(\alpha_2\) solves the Riccati equation (3.9). Defining \(\alpha_1 = -a_1 - \alpha_2\), we get the factorization (3.7) of L. We have proved the following result (cf. [4], Proposition 2.1).

Proposition 3.1 Let L be a second-order linear ordinary differential operator

\[ L = \left(\tfrac{d}{dx}\right)^2 + a_1(x)\tfrac{d}{dx} + a_2(x), \]

where \(a_1, a_2 \in C^0(I,\mathbb{C})\), I an interval. Then the following conditions are equivalent:

(1) L admits the factorization

\[ L = \left( \tfrac{d}{dx} - \alpha_1(x) \right)\left( \tfrac{d}{dx} - \alpha_2(x) \right) \]

for some \(\alpha_1 \in C^0(I,\mathbb{C})\) and \(\alpha_2 \in C^1(I,\mathbb{C})\).

(2) There exists a solution \(\alpha_2 \in C^1(I,\mathbb{C})\) of the complex Riccati equation \(z' + z^2 + a_1 z + a_2 = 0\).

(3) There exists a solution \(\alpha : I \to \mathbb{C}\) of Ly = 0 such that α(x) ≠ 0 ∀x ∈ I.

The relationship between the functions α, \(\alpha_2\) and \(\alpha_1\) is then

\[ \alpha_2 = \frac{\alpha'}{\alpha}, \qquad \alpha = e^{\int \alpha_2\,dx}, \qquad \alpha_1 = -a_1 - \alpha_2. \]

Example 6 Consider the differential equation \(y'' + xy' + y = 0\). The Riccati equation (3.9) reads

\[ \alpha_2' + \alpha_2^2 + x\alpha_2 + 1 = 0, \]

and \(\alpha_2(x) = -x\) is a solution. Thus \(\alpha_1(x) = -x + x = 0\), and we get the factorization

\[ \left(\tfrac{d}{dx}\right)^2 + x\tfrac{d}{dx} + 1 = \tfrac{d}{dx}\left( \tfrac{d}{dx} + x \right) \]

on I = R. Consider now the equation \(y'' + xy' - y = 0\). The Riccati equation is now

\[ \alpha_2' + \alpha_2^2 + x\alpha_2 - 1 = 0, \]

and a solution is \(\alpha_2(x) = \tfrac{1}{x}\). Thus \(\alpha_1(x) = -x - \tfrac{1}{x}\), and we get the factorization

\[ \left(\tfrac{d}{dx}\right)^2 + x\tfrac{d}{dx} - 1 = \left( \tfrac{d}{dx} + x + \tfrac{1}{x} \right)\left( \tfrac{d}{dx} - \tfrac{1}{x} \right) \]

on, say, I = R⁺.

Example 7 Consider the differential equation \(y'' - (1+x^2)y = 0\). The Riccati equation reads

\[ \alpha_2' + \alpha_2^2 - 1 - x^2 = 0, \]

and \(\alpha_2(x) = x\) is a solution. Since \(a_1 = 0\), \(\alpha_1(x) = -\alpha_2(x) = -x\), and we get the factorization

\[ \left(\tfrac{d}{dx}\right)^2 - (1+x^2) = \left( \tfrac{d}{dx} + x \right)\left( \tfrac{d}{dx} - x \right). \]

Example 8 Consider the differential equation \(y'' - \left(1 + \tfrac{1}{x}\right)y' + \tfrac{1}{x}y = 0\) on I = R⁺. The Riccati equation is

\[ \alpha_2' + \alpha_2^2 - \left(1 + \tfrac{1}{x}\right)\alpha_2 + \tfrac{1}{x} = 0, \]

and \(\alpha_2(x) = 1\) is a solution. Thus \(\alpha_1(x) = \tfrac{1}{x}\), and we get the factorization

\[ \left(\tfrac{d}{dx}\right)^2 - \left(1 + \tfrac{1}{x}\right)\tfrac{d}{dx} + \tfrac{1}{x} = \left( \tfrac{d}{dx} - \tfrac{1}{x} \right)\left( \tfrac{d}{dx} - 1 \right). \]
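The Riccati solutions quoted in these examples can be confirmed mechanically. The sketch below (an illustration, not from the book) evaluates the residual of z′ + z² + a₁z + a₂ at a few sample points, supplying each z′ exactly.

```python
# Numerical check (illustration, not from the book) that the solutions
# quoted in Examples 6 and 8 satisfy the Riccati equation (3.9),
# z' + z^2 + a1(x) z + a2(x) = 0, with the derivative z' supplied exactly.
def riccati_residual(z, dz, a1, a2, x):
    return dz(x) + z(x)**2 + a1(x)*z(x) + a2(x)

# Example 6: y'' + x y' + y = 0, solution z(x) = -x (so z' = -1)
for x in [-2.0, 0.5, 3.0]:
    r = riccati_residual(lambda s: -s, lambda s: -1.0,
                         lambda s: s, lambda s: 1.0, x)
    assert abs(r) < 1e-12

# Example 8: y'' - (1 + 1/x) y' + y/x = 0 on x > 0, solution z(x) = 1
for x in [0.5, 1.0, 4.0]:
    r = riccati_residual(lambda s: 1.0, lambda s: 0.0,
                         lambda s: -(1.0 + 1.0/s), lambda s: 1.0/s, x)
    assert abs(r) < 1e-12
```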

One may wonder if the conditions of Proposition 3.1 are always satisfied for any L of the form (3.6). Of course, one does not expect the factorization to be unique, in general. For instance let \(L = \left(\tfrac{d}{dx}\right)^2 + 1\), then \(\alpha(x) = e^{ix}\) is a solution of Ly = 0 that never vanishes on any interval I ⊂ R. In this case \(\alpha_2(x) = \alpha'(x)/\alpha(x) = i\). [More generally, for any β ∈ C with Im β ≠ 0, the function \(\alpha : \mathbb{R}\to\mathbb{C}\), \(\alpha(x) = \cos x + \beta\sin x\), solves Lα = 0, α(0) = 1, α′(0) = β, and α(x) ≠ 0 ∀x ∈ R.] On intervals of length less than or equal to π we can get real factorizations. For example on I = (0, π) we can take α(x) = sin x, and \(\alpha_2(x) = \alpha'(x)/\alpha(x) = \operatorname{ctg} x\). Thus in the interval I = (0, π) we have

\[ \left(\tfrac{d}{dx}\right)^2 + 1 = \left( \tfrac{d}{dx} + i \right)\left( \tfrac{d}{dx} - i \right) = \left( \tfrac{d}{dx} + \operatorname{ctg} x \right)\left( \tfrac{d}{dx} - \operatorname{ctg} x \right). \]

The following result was proved in [4], Theorem 2.4.

Theorem 3.2 Let I ⊂ R be an interval, and let \(L = \left(\tfrac{d}{dx}\right)^2 + a_1(x)\tfrac{d}{dx} + a_2(x)\) be a second-order linear differential operator, where \(a_1, a_2 : I \to \mathbb{C}\) are continuous functions. Then there exists a solution \(\alpha : I \to \mathbb{C}\) of Ly = 0 that vanishes nowhere in I. As a consequence, L admits a factorization of the form (3.7).

We can now solve Ly = f(x) by the same method we used in the case of constant coefficients (Sect. 2.2). Consider first the initial value problem with trivial initial conditions at some point \(x_0 \in I\). The following result generalizes Theorem 2.2.

Theorem 3.3 Let \(f \in C^0(I,\mathbb{C})\), I an interval, let \(x_0 \in I\) be fixed, and suppose L in (3.6) is factored according to (3.7). The solution of the initial value problem

\[ \begin{cases} Ly = f(x) \\ y(x_0) = 0,\; y'(x_0) = 0 \end{cases} \tag{3.10} \]

is unique, defined on the whole of I, and is given by the formula

\[ y_p(x) = \int_{x_0}^x g(x,t)f(t)\,dt \qquad (x \in I), \tag{3.11} \]

where \(g : I\times I \to \mathbb{C}\) is the function given by

\[ g(x,t) = g_{\alpha_1\alpha_2}(x,t) = \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds \qquad (x,t \in I) \tag{3.12} \]

(\(g_\alpha(x,t)\) defined in (3.2)). The function \(x \mapsto g(x,t)\) is, for any t ∈ I, the unique solution of the homogeneous problem

\[ \begin{cases} Ly = 0 \\ y(t) = 0,\; y'(t) = 1. \end{cases} \tag{3.13} \]

Proof We write the differential equation Ly = f(x) in the form

\[ \left( \tfrac{d}{dx} - \alpha_1 \right)\left( \tfrac{d}{dx} - \alpha_2 \right)y = f(x). \]

Letting

\[ h = \left( \tfrac{d}{dx} - \alpha_2 \right)y = y' - \alpha_2 y, \]

we see that y solves the problem (3.10) if and only if

\[ h \text{ solves } \begin{cases} h' - \alpha_1 h = f(x) \\ h(x_0) = 0 \end{cases} \quad\text{and}\quad y \text{ solves } \begin{cases} y' - \alpha_2 y = h(x) \\ y(x_0) = 0. \end{cases} \]


Recalling (3.4)–(3.5), we get

\[ h(x) = \int_{x_0}^x g_{\alpha_1}(x,t)f(t)\,dt, \qquad y(x) = \int_{x_0}^x g_{\alpha_2}(x,t)h(t)\,dt. \]

Substituting h from the first formula into the second, we obtain y(x) as a repeated integral (for any x ∈ I):

\[ y(x) = \int_{x_0}^x g_{\alpha_2}(x,t)\left( \int_{x_0}^t g_{\alpha_1}(t,s)f(s)\,ds \right)dt. \tag{3.14} \]

To fix ideas, suppose that \(x > x_0\). Then in the integral with respect to t we have \(x_0 \le t \le x\), and in the integral with respect to s we have \(x_0 \le s \le t\). We can then rewrite y(x) as a double integral:

\[ y(x) = \iint_{T_x} g_{\alpha_2}(x,t)\,g_{\alpha_1}(t,s)\,f(s)\,ds\,dt, \]

where \(T_x\) is the triangle in the (s,t) plane defined by \(x_0 \le s \le t \le x\), with vertices at the points \((x_0,x_0)\), \((x_0,x)\), \((x,x)\). In (3.14) we first integrate with respect to s and then with respect to t. Since the integrand function is continuous in \(T_x\), we can interchange the order of integration and integrate with respect to t first. Given s (between \(x_0\) and x) the variable t in \(T_x\) varies between s and x. We thus obtain

\[ y(x) = \int_{x_0}^x \left( \int_s^x g_{\alpha_2}(x,t)\,g_{\alpha_1}(t,s)\,dt \right)f(s)\,ds = \int_{x_0}^x g(x,s)f(s)\,ds. \]

For \(x < x_0\) we can reason in a similar way and we get the same result.

The last statement of the theorem can be proved in a similar way. Consider the initial value problem (3.13),

\[ \begin{cases} \left( \tfrac{d}{dx} - \alpha_1 \right)\left( \tfrac{d}{dx} - \alpha_2 \right)y = 0 \\ y(t) = 0,\; y'(t) = 1. \end{cases} \]

Letting \(h = y' - \alpha_2 y\), we get

\[ h \text{ solves } \begin{cases} h' - \alpha_1 h = 0 \\ h(t) = 1 \end{cases} \quad\text{and}\quad y \text{ solves } \begin{cases} y' - \alpha_2 y = h(x) \\ y(t) = 0. \end{cases} \]


Using (3.1) and (3.4) we get \(h(x) = g_{\alpha_1}(x,t)\), and

\[ y(x) = \int_t^x g_{\alpha_2}(x,s)h(s)\,ds = \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds = g(x,t). \qquad\blacksquare \]

The function g(x,t) is called the impulsive response kernel of L. We observe that \(g \in C^1(I\times I,\mathbb{C})\). Indeed by the formula (cf. (2.49))

\[ \frac{\partial}{\partial x}\int_t^x F(x,t,s)\,ds = F(x,t,x) + \int_t^x \frac{\partial F}{\partial x}(x,t,s)\,ds, \]

and the analogous formula

\[ \frac{\partial}{\partial t}\int_t^x F(x,t,s)\,ds = -F(x,t,t) + \int_t^x \frac{\partial F}{\partial t}(x,t,s)\,ds, \]

we compute (using the notation \(\partial_x = \frac{\partial}{\partial x}\) from now on)

\[ \partial_x g(x,t) = \partial_x \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds = g_{\alpha_2}(x,x)\,g_{\alpha_1}(x,t) + \int_t^x \partial_x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds = g_{\alpha_1}(x,t) + \alpha_2(x)\,g(x,t), \tag{3.15} \]

\[ \partial_t g(x,t) = \partial_t \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds = -g_{\alpha_2}(x,t)\,g_{\alpha_1}(t,t) + \int_t^x g_{\alpha_2}(x,s)\,\partial_t g_{\alpha_1}(s,t)\,ds = -g_{\alpha_2}(x,t) - \alpha_1(t)\,g(x,t), \tag{3.16} \]

where we used \(g_\alpha(x,x) = 1\), \(\partial_x g_\alpha(x,t) = \alpha(x)g_\alpha(x,t)\) and \(\partial_t g_\alpha(x,t) = -\alpha(t)g_\alpha(x,t)\) (see (3.2)). Now \(a_1, a_2 \in C^0\), \(\alpha_1 \in C^0\) and \(\alpha_2 \in C^1\) (cf. (3.8)), thus \(\partial_x g\) and \(\partial_t g\) are both continuous on I × I, and \(g \in C^1(I\times I,\mathbb{C})\).

Note by (3.16) that the function \(x \mapsto \partial_t g(x,t)\) solves the homogeneous equation for any t ∈ I, being a linear combination (with t-dependent coefficients) of the two functions \(x \mapsto g_{\alpha_2}(x,t)\) and \(x \mapsto g(x,t)\). Alternatively, just permute the derivatives with respect to x in (3.6) with \(\partial_t\) in \(\partial_t g(x,t)\), to write

\[ L\,\partial_t g(x,t) = \left( \left(\tfrac{d}{dx}\right)^2 + a_1(x)\tfrac{d}{dx} + a_2(x) \right)\partial_t g(x,t) = \partial_t L g(x,t) = 0. \]
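The kernel formula (3.12) and the derivative identity (3.15) can be tested on a concrete factorization. The sketch below is an illustration, not from the book: for the operator of Example 6, y″ + xy′ + y, factored with α₁ = 0 and α₂(x) = −x, the kernel is g(x,t) = ∫ₜˣ e^{(u²−x²)/2} du and (3.15) reads ∂ₓg = 1 − x g.

```python
import math

# Numerical sanity check (illustration, not from the book) of the kernel
# (3.12) and the identity (3.15) for y'' + x y' + y with alpha1 = 0,
# alpha2(x) = -x:  g(x,t) = integral over [t,x] of e^{(u^2 - x^2)/2} du,
# and (3.15) becomes  dg/dx = g_{alpha1}(x,t) + alpha2(x) g = 1 - x g.
def g(x, t, n=2000):
    h = (x - t) / n
    s = 0.5 * (math.exp((t*t - x*x) / 2) + 1.0)   # endpoints u = t, u = x
    for k in range(1, n):
        u = t + k*h
        s += math.exp((u*u - x*x) / 2)
    return s * h

t0 = 0.5
assert abs(g(t0, t0)) < 1e-12            # g(t,t) = 0, cf. (3.13)
for x in [0.8, 1.5, 2.0]:
    h = 1e-5
    dg = (g(x + h, t0) - g(x - h, t0)) / (2*h)
    assert abs(dg - (1.0 - x * g(x, t0))) < 1e-4
```

The second initial condition of (3.13), ∂ₓg(t,t) = 1, is the x → t limit of the checked identity, since g(t,t) = 0.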


This is permissible as one can show from (3.16) that the mixed derivatives \(\partial_x\partial_t g\) and \(\partial_x^2\partial_t g\) exist continuous on I × I. We also observe that \(\partial_x^2 g\) and \(\partial_x\partial_t g = \partial_t\partial_x g\) are given by

\[ \partial_x^2 g(x,t) = (\alpha_1(x) + \alpha_2(x))\,g_{\alpha_1}(x,t) + (\alpha_2^2(x) + \alpha_2'(x))\,g(x,t), \]
\[ \partial_t\partial_x g(x,t) = -\alpha_1(t)\,g_{\alpha_1}(x,t) - \alpha_2(x)\,g_{\alpha_2}(x,t) - \alpha_1(t)\alpha_2(x)\,g(x,t) = \partial_x\partial_t g(x,t), \tag{3.17} \]

and are thus continuous on I × I, but \(\partial_t^2 g\) does not exist, in general. This is clear from (3.16) as the function \(\alpha_1\) is only continuous, in general. Under the additional assumption that \(a_1 \in C^1\), we have \(\alpha_1 \in C^1\) (from (3.8)), and \(\partial_t^2 g\) is continuous on I × I being given by

\[ \partial_t^2 g(x,t) = (\alpha_1(t) + \alpha_2(t))\,g_{\alpha_2}(x,t) + (\alpha_1^2(t) - \alpha_1'(t))\,g(x,t), \]

so that \(g \in C^2(I\times I,\mathbb{C})\). Moreover, in this case, the function \(t \mapsto g(x,t)\) satisfies the dual (or adjoint) equation

\[ (\partial_t + \alpha_2(t))(\partial_t + \alpha_1(t))\,g(x,t) = 0, \]

with the initial conditions \(g(x,t)|_{t=x} = 0\), \(\partial_t g(x,t)|_{t=x} = -1\). (Just apply \((\partial_t + \alpha_2(t))\) to both sides of the formula \(-g_{\alpha_2}(x,t) = (\partial_t + \alpha_1(t))g(x,t)\), obtained from (3.16).)

We now come to the solution of Ly = f(x) with arbitrary initial data at the point \(x = x_0\). It is enough to solve the homogeneous equation in view of the following result.

Proposition 3.4 Let \(f \in C^0(I,\mathbb{C})\), I an interval, and \(x_0 \in I\). Every solution of Ly = f(x) on I can be written as \(y = y_p + y_h\), where \(y_p\), given by (3.11), solves the non-homogeneous equation with trivial initial conditions at \(x_0\), and \(y_h\) solves the homogeneous equation with the same initial conditions as y at \(x_0\).

Proof Let y be any solution of Ly = f(x) on I, and let \(y_h = y - y_p\). Then \(Ly_h = Ly - Ly_p = 0\), and \(y_h(x_0) = y(x_0)\), \(y_h'(x_0) = y'(x_0)\). ∎

We have the following generalization of (2.36) and of Corollary 2.9 (the complex version).

Theorem 3.5 Let L be the differential operator (3.6)–(3.7), with \(a_1, a_2, \alpha_1 \in C^0(I,\mathbb{C})\), \(\alpha_2 \in C^1(I,\mathbb{C})\), I an interval, and let g(x,t) be the impulsive response kernel of L. Let \(x_0 \in I\), \(y_0, y_0' \in \mathbb{C}\). The solution of the initial value problem

\[ \begin{cases} Ly = 0 \\ y(x_0) = y_0,\; y'(x_0) = y_0' \end{cases} \tag{3.18} \]

is

\[ y_h(x) = \left( y_0' + a_1(x_0)y_0 \right)g(x,x_0) - y_0\,\partial_t g(x,x_0). \tag{3.19} \]

The set of solutions of the homogeneous equation on I, \(V = \{y \in C^2(I,\mathbb{C}) : Ly = 0\}\), is a complex vector space of dimension 2, with a basis given by

\[ \{\, x \mapsto g(x,t),\quad x \mapsto -\partial_t g(x,t) \,\}, \tag{3.20} \]

for any t ∈ I.

Proof As already observed, the function \(x \mapsto \partial_t g(x,t)\) solves the homogeneous equation Ly = 0, ∀t ∈ I. Formula (3.19) can then be verified by checking the initial conditions. However, we shall give a constructive proof of (3.19), analogous to that of Theorem 3.3. Letting \(h = \left(\tfrac{d}{dx} - \alpha_2\right)y\), we see that y solves (3.18) if and only if

\[ h \text{ solves } \begin{cases} h' - \alpha_1 h = 0 \\ h(x_0) = y_0' - \alpha_2(x_0)y_0 \end{cases} \quad\text{and}\quad y \text{ solves } \begin{cases} y' - \alpha_2 y = h(x) \\ y(x_0) = y_0. \end{cases} \]

Using (3.1), we have \(h(x) = h(x_0)\,g_{\alpha_1}(x,x_0)\), and

\[ y(x) = y(x_0)\,g_{\alpha_2}(x,x_0) + \int_{x_0}^x g_{\alpha_2}(x,t)h(t)\,dt. \]

Substituting h(t) from the first formula, we get

\[ y(x) = y_0\,g_{\alpha_2}(x,x_0) + h(x_0)\int_{x_0}^x g_{\alpha_2}(x,t)\,g_{\alpha_1}(t,x_0)\,dt = y_0\,g_{\alpha_2}(x,x_0) + \left( y_0' - \alpha_2(x_0)y_0 \right)g(x,x_0). \tag{3.21} \]

Here we eliminate \(g_{\alpha_2}\), using (3.16) to write \(g_{\alpha_2}(x,x_0) = -\partial_t g(x,x_0) - \alpha_1(x_0)g(x,x_0)\). Recalling (3.8), we finally get

\[ y(x) = \left( y_0' - (\alpha_1(x_0) + \alpha_2(x_0))y_0 \right)g(x,x_0) - y_0\,\partial_t g(x,x_0) = \left( y_0' + a_1(x_0)y_0 \right)g(x,x_0) - y_0\,\partial_t g(x,x_0). \]

Now let y be any solution of the homogeneous equation on I. Fix any point \(x_0 \in I\), and let \(y_0 = y(x_0)\), \(y_0' = y'(x_0)\). Then y solves the problem (3.18), so by uniqueness


\(y = y_h\) given by (3.19). Thus any y ∈ V can be written as a linear combination of the two functions \(x \mapsto g(x,x_0)\) and \(x \mapsto -\partial_t g(x,x_0)\) in a unique way. (The coefficients in this combination are uniquely determined by the initial data at \(x_0\).) In particular, these functions are linearly independent in V, and form a basis of V, for any \(x_0 \in I\). ∎

Remark 3.6 In the case of constant coefficients, we can take \(\alpha_1(x) = \lambda_1\) and \(\alpha_2(x) = \lambda_2\), where \(\lambda_{1,2}\) are the roots of the characteristic polynomial. Formula (3.12) becomes

\[ g(x,t) = \int_t^x e^{\lambda_2(x-s)}e^{\lambda_1(s-t)}\,ds = \int_0^{x-t} e^{\lambda_2(x-t-s)}e^{\lambda_1 s}\,ds = g(x-t), \]

where g(x) is the impulsive response (2.15). Noting that \(-\partial_t g(x,t) = -\partial_t g(x-t) = g'(x-t)\), we see that (3.11) and (3.19) become identical to (2.38) and (2.36), respectively.

For the later generalization to equations of order n ≥ 2, we give another proof of Theorem 3.5. Recall that two solutions \(y_1, y_2\) of Ly = 0 on I are linearly independent in V (i.e., the only constants \(c_1, c_2\) such that \(c_1 y_1(x) + c_2 y_2(x) = 0\), ∀x ∈ I, are \(c_1 = c_2 = 0\)) if and only if their Wronskian

\[ w(y_1,y_2) = y_1 y_2' - y_2 y_1' = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} \]

is nonzero at some point of I, in which case \(w(y_1,y_2)(x) \neq 0\), ∀x ∈ I. In fact \(w = w(y_1,y_2)\) satisfies the differential equation \(w' = -a_1(x)w\), whence

\[ w(x) = w(x_0)\,e^{-\int_{x_0}^x a_1(t)\,dt} \qquad (\forall x, x_0 \in I). \tag{3.22} \]
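Formula (3.22) can be observed concretely on Example 6's equation. The sketch below is an illustration, not from the book: with y₁(x) = e^{−x²/2} and y₂ = g(x,0), the first-order factor gives y₂′ = −xy₂ + 1, so the Wronskian collapses to y₁(x) = e^{−x²/2}, exactly the value predicted by (3.22) with a₁(t) = t and w(0) = 1.

```python
import math

# Check (illustration, not from the book) of Abel's formula (3.22) for
# y'' + x y' + y = 0 (a1(x) = x): take y1(x) = e^{-x^2/2} and
# y2(x) = g(x,0) = e^{-x^2/2} * integral of e^{s^2/2} ds over [0,x].
# The factorization gives y2' = -x y2 + 1, so the Wronskian is
# w(x) = y1 y2' - y2 y1' = y1(x) = e^{-x^2/2} = w(0) e^{-int_0^x t dt}.
def y1(x):  return math.exp(-x*x/2)
def dy1(x): return -x * math.exp(-x*x/2)

def y2(x, n=2000):                       # trapezoidal rule for the integral
    if x == 0.0:
        return 0.0
    h = x / n
    s = 0.5 * (1.0 + math.exp(x*x/2))
    for k in range(1, n):
        s += math.exp((k*h)**2 / 2)
    return math.exp(-x*x/2) * s * h

def dy2(x):                              # from the first-order factor
    return -x * y2(x) + 1.0

for x in [0.5, 1.0, 1.5]:
    w = y1(x)*dy2(x) - y2(x)*dy1(x)
    assert abs(w - math.exp(-x*x/2)) < 1e-9
```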

This formula generalizes to higher order equations (cf. [8], Chap. 3, Theorem 8). Two linearly independent solutions of Ly = 0 on I (i.e., a basis of V) are called a fundamental system.

Alternative proof of Theorem 3.5. Consider, for any t ∈ I, the Wronskian matrix W(x,t) of the two functions \(y_1(x) = g(x,t)\) and \(y_2(x) = -\partial_t g(x,t)\) (cf. (3.20)):

\[ W(x,t) = \begin{pmatrix} g(x,t) & -\partial_t g(x,t) \\ \partial_x g(x,t) & -\partial_x\partial_t g(x,t) \end{pmatrix}. \]

Here we substitute (3.15), (3.16) and (3.17), and take x = t, to get

\[ W(t,t) = \begin{pmatrix} 0 & 1 \\ 1 & \alpha_1(t) + \alpha_2(t) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -a_1(t) \end{pmatrix} \]

(cf. (3.8)). Thus the Wronskian w of \(y_1, y_2\) is nonzero at x = t, \(w(t) = \det W(t,t) = -1 \neq 0\), and it follows that \(y_1, y_2\) are linearly independent in V, ∀t ∈ I. Next, we write any y ∈ V as a linear combination of \(y_1, y_2\),

\[ y(x) = c_1 y_1(x) + c_2 y_2(x) = c_1\,g(x,t) - c_2\,\partial_t g(x,t). \]

Taking the derivative with respect to x gives \(y'(x) = c_1\,\partial_x g(x,t) - c_2\,\partial_x\partial_t g(x,t)\). We rewrite these two equations in matrix form as

\[ \begin{pmatrix} y(x) \\ y'(x) \end{pmatrix} = W(x,t)\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \]

Letting x = t gives

\[ \begin{pmatrix} y(t) \\ y'(t) \end{pmatrix} = W(t,t)\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -a_1(t) \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \]

By inverting the matrix W(t,t) we find the expression of \(c_1, c_2\) in terms of y(t), y′(t):

\[ \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = W(t,t)^{-1}\begin{pmatrix} y(t) \\ y'(t) \end{pmatrix} = \begin{pmatrix} a_1(t) & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} y(t) \\ y'(t) \end{pmatrix} = \begin{pmatrix} a_1(t)y(t) + y'(t) \\ y(t) \end{pmatrix}. \]

This proves (3.19) (taking \(t = x_0\)). ∎

Remark 3.7 From (3.21) we see that another basis of V is given by the functions \(x \mapsto g_{\alpha_2}(x,x_0)\) and \(x \mapsto g(x,x_0)\), \(\forall x_0 \in I\). Using \(\alpha_1 = -a_1 - \alpha_2\), we can rewrite the expression of g(x,t) in terms of \(g_{\alpha_2}(x,t)\) as follows:

\[ g(x,t) = \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_1}(s,t)\,ds = \int_t^x e^{\int_s^x \alpha_2}\,e^{\int_t^s \alpha_1}\,ds = \int_t^x e^{\int_s^x \alpha_2}\,e^{\int_t^s(-a_1-\alpha_2)}\,ds = e^{\int_t^x \alpha_2}\int_t^x e^{-2\int_t^s \alpha_2}\,e^{-\int_t^s a_1}\,ds = g_{\alpha_2}(x,t)\int_t^x \frac{e^{-\int_t^s a_1(r)\,dr}}{g_{\alpha_2}(s,t)^2}\,ds. \tag{3.23} \]

We note here a connection with the method of reduction of the order (see [8], Chap. 3, Sect. 5): given a solution \(x \mapsto \alpha(x)\) of the homogeneous equation that never vanishes in the interval I (e.g., \(\alpha(x) = g_{\alpha_2}(x,x_0)\)), another linearly independent solution β of Ly = 0 on I is given by the formula

\[ \beta(x) = \alpha(x)\int_{x_0}^x \frac{e^{-\int_{x_0}^s a_1(r)\,dr}}{\alpha(s)^2}\,ds \qquad (x_0 \in I). \]

Example 9 Consider the differential equation

\[ y'' + \frac{1}{4x^2}\,y = 0 \]

on I = R⁺. By looking for a solution of the form \(y(x) = x^\alpha\), we get α(α−1) + 1/4 = 0, whence α = 1/2. Thus \(y_1(x) = \sqrt{x}\) is a solution and it does not vanish for x > 0. Since \(a_1 = 0\), another independent solution on R⁺ is

\[ y_2(x) = \sqrt{x}\int_{x_0}^x \frac{1}{(\sqrt{s})^2}\,ds = \sqrt{x}\,\log\frac{x}{x_0}, \]

where \(x_0 > 0\) is fixed. We can also proceed by factorization. The Riccati equation (3.9) is

\[ \alpha_2' + \alpha_2^2 + \frac{1}{4x^2} = 0. \]

By looking for a solution of the form \(\alpha_2(x) = \frac{k}{x}\), we find \(k = \frac{1}{2}\). Thus \(\alpha_2(x) = \frac{1}{2x} = -\alpha_1(x)\), and we get the real factorization

\[ \left(\tfrac{d}{dx}\right)^2 + \frac{1}{4x^2} = \left( \tfrac{d}{dx} + \frac{1}{2x} \right)\left( \tfrac{d}{dx} - \frac{1}{2x} \right). \]

We easily compute

\[ g_{\alpha_2}(x,x_0) = \sqrt{\frac{x}{x_0}}, \qquad g(x,x_0) = \sqrt{x x_0}\,\log\frac{x}{x_0}. \]

Taking \(x_0 = 1\), we can write the general solution of \(y'' + \frac{1}{4x^2}y = 0\) on R⁺ in the form

\[ y(x) = \sqrt{x}\,(c_1 + c_2\log x) \qquad (c_1, c_2 \in \mathbb{R}). \]

Remark 3.8 We note that for \(a_1 = 0\) the impulsive response kernel is skew-symmetric,

\[ g(x,t) = -g(t,x), \qquad \forall x,t \in I. \]

Indeed using (3.12) or (3.23) with \(a_1 = 0\), i.e., \(\alpha_1 = -\alpha_2\), we get

\[ g(x,t) = \int_t^x g_{\alpha_2}(x,s)\,g_{-\alpha_2}(s,t)\,ds = \int_t^x g_{\alpha_2}(x,s)\,g_{\alpha_2}(t,s)\,ds, \]

whence g(t,x) = −g(x,t). For \(a_1 \neq 0\) we get from (3.12), using \(a_1 = -\alpha_1 - \alpha_2\) and the general formula \(g_\alpha(x,s)g_\alpha(s,t) = g_\alpha(x,t)\) (cf. (3.2)),

\[ g(t,x) = -e^{\int_t^x a_1(r)\,dr}\,g(x,t). \tag{3.24} \]

Recalling formula (3.22) for the Wronskian w of any fundamental system \(\{y_1, y_2\}\), we can rewrite (3.24) in the form:

\[ w(x)\,g(t,x) = -w(t)\,g(x,t). \tag{3.25} \]

This can also be proved from the following formula for g(x,t) as the unique solution \(y = c_1 y_1 + c_2 y_2\) of the problem (3.13):

\[ g(x,t) = \frac{1}{w(t)}\left( -y_1(x)y_2(t) + y_2(x)y_1(t) \right) = \frac{1}{w(t)}\begin{vmatrix} y_1(t) & y_2(t) \\ y_1(x) & y_2(x) \end{vmatrix}. \tag{3.26} \]
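Formulas (3.25) and (3.26) can be seen at work in the simplest constant-coefficient case. The sketch below is an illustration, not from the book: for y″ + y = 0 (so a₁ = 0), the fundamental system y₁ = cos, y₂ = sin has Wronskian w ≡ 1, and (3.26) reproduces the familiar impulsive response sin(x − t), while (3.25) reduces to plain skew-symmetry.

```python
import math

# Check (illustration, not from the book) of formula (3.26) for y'' + y = 0
# (a1 = 0): with y1 = cos, y2 = sin the Wronskian is w = 1, and (3.26)
# gives g(x,t) = -cos(x) sin(t) + sin(x) cos(t) = sin(x - t); (3.25) then
# reduces to the skew-symmetry g(t,x) = -g(x,t) of Remark 3.8.
def g_326(x, t):
    w_t = 1.0
    return (-math.cos(x)*math.sin(t) + math.sin(x)*math.cos(t)) / w_t

for x, t in [(0.3, 0.1), (2.0, -1.0), (1.0, 1.0)]:
    assert abs(g_326(x, t) - math.sin(x - t)) < 1e-12
    assert abs(g_326(t, x) + g_326(x, t)) < 1e-12   # skew-symmetry (3.25)
```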

This formula admits a generalization to any n (cf. (3.93)). However, (3.24)–(3.25) do not generalize to higher order equations. (By (3.93), we can rewrite w(t)g(x,t) as a determinant, which is skew under x ↔ t if and only if n = 2.)

Remark 3.9 We observe that the general case of \(a_1 \neq 0\) with \(a_1\) continuously differentiable can be reduced to the case \(a_1 = 0\), as follows. Suppose \(a_1 \in C^1(I,\mathbb{C})\), and let y be a solution of

\[ Ly = y'' + a_1(x)y' + a_2(x)y = 0. \tag{3.27} \]

Then the function

\[ \tilde y(x) = e^{\frac{1}{2}\int_{x_0}^x a_1(t)\,dt}\,y(x) \tag{3.28} \]

solves

\[ \tilde L\tilde y = \tilde y'' + \tilde a_2(x)\tilde y = 0, \tag{3.29} \]

where

\[ \tilde a_2 = a_2 - \tfrac{1}{4}a_1^2 - \tfrac{1}{2}a_1'. \tag{3.30} \]

This is easily checked by a direct computation. Conversely, if \(\tilde y\) solves (3.29), and if \(a_1 \in C^1(I,\mathbb{C})\), then the function y defined by (3.28) solves (3.27), with \(a_2\) given by (3.30). In particular, the impulsive response kernels g and \(\tilde g\) of L and \(\tilde L\) are related by

\[ \tilde g(x,x_0) = e^{\frac{1}{2}\int_{x_0}^x a_1(t)\,dt}\,g(x,x_0). \]

Moreover, if L factors as

\[ L = \left( \tfrac{d}{dx} - \alpha_1(x) \right)\left( \tfrac{d}{dx} - \alpha_2(x) \right), \]

then \(\tilde L\) factors as

\[ \tilde L = \left( \tfrac{d}{dx} + \tilde\alpha_2(x) \right)\left( \tfrac{d}{dx} - \tilde\alpha_2(x) \right), \]

where \(\tilde\alpha_2 = \alpha_2 + \tfrac{1}{2}a_1\),

If we let

x0

a˜ 2 (x) = a2 (x) − 41 a12 (x) − 21 a1 (x) =

1 2

− 41 x 2 ,

then the differential equation L˜ y = y  + a˜ 2 (x)y = 0 has kernel 1 x

g(x, ˜ x0 ) = e 2

x0

t dt

g(x, x0 ) = e−(x

2

+x02 )/4



x

es

2

/2

ds,

x0

and we have the factorization L˜ =



 d 2 dx

+

1 2

− 41 x 2 =



d dx



x 2



d dx

+

x 2



.
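The factorization obtained in Example 10 can be verified by applying both sides of the operator identity to a test function with known derivatives. The sketch below is an illustration, not from the book; the choice y = eˣ is an assumption made purely for the check.

```python
import math

# Check (illustration, not from the book) of the factorization in
# Example 10:  (d/dx - x/2)(d/dx + x/2) = (d/dx)^2 + 1/2 - x^2/4,
# applied to the test function y = e^x (all derivatives equal e^x).
def lhs(x):
    u  = (1 + x/2) * math.exp(x)          # inner factor: y' + (x/2) y
    du = (1.5 + x/2) * math.exp(x)        # u' computed by hand
    return du - (x/2) * u                 # outer factor: u' - (x/2) u

def rhs(x):
    return (1 + 0.5 - x*x/4) * math.exp(x)   # y'' + (1/2 - x^2/4) y

for x in [-1.0, 0.0, 2.0]:
    assert abs(lhs(x) - rhs(x)) < 1e-10
```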

We now discuss in more detail the case of real-valued coefficients. Suppose $a_1, a_2$ in (3.6) are real-valued and continuous in the interval $I$. It is then natural to ask whether we can achieve the factorization (3.7) with $\alpha_1$ and $\alpha_2$ real-valued. Again this is equivalent to the existence of a real-valued solution $\alpha$ of the homogeneous equation that never vanishes in $I$. The answer is negative, in general. Indeed we have the following result.

Theorem 3.10 Let $I$ be open or compact and $a_1, a_2$ real-valued. Then the conditions (1), (2) and (3) of Proposition 3.1, with $\alpha_1$, $\alpha_2$ and $\alpha$ real-valued, are equivalent to

(4) $L$ is disconjugate on $I$, i.e., every nontrivial real solution of $Ly = 0$ has at most one zero in $I$.

See, e.g., [11], Corollary 6.1, p. 351, or [9], Theorem 1, p. 5. This is also proved in [13] (for $I$ compact).

In general, a real $L$ is not disconjugate on $I$. For instance the operator $L = (\frac{d}{dx})^2 + 1$ is not disconjugate on $\mathbb{R}$ or on any interval $I$ of length greater than $\pi$. Another example is given by a differential equation $Ly = 0$ which is oscillatory on $I$, i.e., every real solution has infinitely many zeros in $I$. For instance consider the differential equation $y'' + \frac{2}{x}y' + \frac{1}{x^4}y = 0$ in the interval $I = (0,1)$. It is easily checked that $y_1(x) = \sin\frac{1}{x}$ and $y_2(x) = \cos\frac{1}{x}$ are linearly independent solutions in $I$, so any real solution is a linear combination of them, $y(x) = c_1 y_1(x) + c_2 y_2(x)$ ($c_1, c_2 \in \mathbb{R}$). If $c_2 = 0$ then $y$ has infinitely many zeros in $I$. If $c_2 \neq 0$, then $y(\frac{1}{\pi}) = -c_2$ and $y(\frac{1}{2\pi}) = c_2$, so $y$ must vanish in the interval $(\frac{1}{2\pi}, \frac{1}{\pi})$. Repeating this, we see that $y$ vanishes in the interval $(\frac{1}{(n+1)\pi}, \frac{1}{n\pi})$, for any $n \in \mathbb{N}^+$.

Thus a real $L$ cannot be factored by real operators globally defined on $I$, in general. However, it is always possible to factor $L$ by complex operators, in agreement with Theorem 3.2. Indeed, let $y_1, y_2$ be two linearly independent real-valued solutions of $Ly = 0$ on $I$ (a real-valued fundamental system). Then $y_1$ and $y_2$ cannot have common zeros in $I$, and the complex-valued function $\alpha = y_1 + i y_2$ never vanishes in $I$. Thus we get the factorization (3.7) with $\alpha_2 = \alpha'/\alpha$ and $\alpha_1 = -a_1 - \alpha_2$. (See also Remark 3.11.)

Now without any assumption about disconjugacy, let $L$ in (3.6) with $a_1, a_2$ real-valued be factored as in (3.7), with $\alpha_2$ a complex-valued solution of the Riccati equation (3.9). By separating out the real and imaginary parts in (3.9) with $\alpha_2 = \mathrm{Re}\,\alpha_2 + i\,\mathrm{Im}\,\alpha_2$, we get

$$\begin{cases} (\mathrm{Re}\,\alpha_2)' + (\mathrm{Re}\,\alpha_2)^2 - (\mathrm{Im}\,\alpha_2)^2 + a_1\,\mathrm{Re}\,\alpha_2 + a_2 = 0 \\ (\mathrm{Im}\,\alpha_2)' + 2(\mathrm{Re}\,\alpha_2)(\mathrm{Im}\,\alpha_2) + a_1\,\mathrm{Im}\,\alpha_2 = 0. \end{cases} \quad (3.31)$$

The second equation, in $v = \mathrm{Im}\,\alpha_2$, is separable, namely $v' = -(a_1 + 2\,\mathrm{Re}\,\alpha_2)v$, and can be integrated to yield

$$\mathrm{Im}\,\alpha_2(x) = \mathrm{Im}\,\alpha_2(x_0)\, e^{-\int_{x_0}^x (a_1(t) + 2\,\mathrm{Re}\,\alpha_2(t))\,dt}, \quad (3.32)$$

where $x_0$ is any point of $I$. It follows that the imaginary part of $\alpha_2$ either vanishes identically on $I$, or is always nonzero there.

Suppose $\mathrm{Im}\,\alpha_2(x) \neq 0$ $\forall x \in I$. A complex-valued solution of $Ly = 0$ is then given by

$$g_{\alpha_2}(x,x_0) = e^{\int_{x_0}^x \alpha_2(t)\,dt} = e^{\int_{x_0}^x \mathrm{Re}\,\alpha_2(t)\,dt}\, e^{i\int_{x_0}^x \mathrm{Im}\,\alpha_2(t)\,dt} = e^{\eta(x,x_0)}\big( \cos\omega(x,x_0) + i\sin\omega(x,x_0) \big), \quad (3.33)$$

where

$$\begin{cases} \eta(x,x_0) = \int_{x_0}^x \mathrm{Re}\,\alpha_2(t)\,dt \\ \omega(x,x_0) = \int_{x_0}^x \mathrm{Im}\,\alpha_2(t)\,dt. \end{cases} \quad (3.34)$$

Put

$$y_1(x) = e^{\eta(x,x_0)}\cos\omega(x,x_0), \qquad y_2(x) = e^{\eta(x,x_0)}\sin\omega(x,x_0).$$

Then $y_1$ and $y_2$ are real-valued solutions of $Ly = 0$, since by linearity $0 = L(y_1 + iy_2) = Ly_1 + iLy_2$ and $L$ has real coefficients. Their Wronskian at $x = x_0$ is

$$w(x_0) = \begin{vmatrix} 1 & 0 \\ \mathrm{Re}\,\alpha_2(x_0) & \mathrm{Im}\,\alpha_2(x_0) \end{vmatrix} = \mathrm{Im}\,\alpha_2(x_0) \neq 0.$$

Thus the functions $y_1, y_2$ form a basis for the real-valued solutions of $Ly = 0$ on $I$. We conclude that if $\mathrm{Im}\,\alpha_2(x) \neq 0$ $\forall x \in I$, the general real solution of $Ly = 0$ can be written in the form

$$y(x) = e^{\eta(x,x_0)}\big( c_1 \cos\omega(x,x_0) + c_2 \sin\omega(x,x_0) \big) \quad (c_1, c_2 \in \mathbb{R}). \quad (3.35)$$

Note that the function $x \mapsto \omega(x,x_0)$ is strictly monotone in $I$, as $\omega'(x,x_0) = \mathrm{Im}\,\alpha_2(x)$ is always $> 0$ or $< 0$ in $I$. The constants $c_1, c_2$ are given in terms of $y_0 = y(x_0)$ and $y_0' = y'(x_0)$ by

$$\begin{cases} c_1 = y_0 \\ c_2 = \frac{1}{\mathrm{Im}\,\alpha_2(x_0)}\big( y_0' - y_0\,\mathrm{Re}\,\alpha_2(x_0) \big), \end{cases} \quad (3.36)$$

and the impulsive response kernel is

$$g(x,x_0) = \frac{1}{\mathrm{Im}\,\alpha_2(x_0)}\, e^{\eta(x,x_0)}\sin\omega(x,x_0). \quad (3.37)$$

This formula can also be proved directly from (3.23), using (3.32) and (3.33). The result (3.35) is similar to the case of constant coefficients with complex conjugate roots $\alpha \pm i\beta$ of $p(\lambda)$. In that case $\alpha_2(x) = \alpha + i\beta$, $\eta(x,x_0) = \alpha(x-x_0)$, $\omega(x,x_0) = \beta(x-x_0)$, and (3.35)–(3.36) reduce to (2.37) (the case $\Delta < 0$).

Finally, if $\mathrm{Im}\,\alpha_2(x) = 0$ $\forall x \in I$, so that $\alpha_2$ is real-valued, we have the factorization (3.7) with $\alpha_1$ and $\alpha_2$ real-valued. This occurs precisely when $L$ is disconjugate on $I$ (by Theorem 3.10). In this case we can represent the general solution of $Ly = 0$ as in (3.19), or, more simply, as in (3.21), with the kernel $g(x,t)$ given by (3.23):

$$y(x) = e^{\int_{x_0}^x \alpha_2} \left( c_1 + c_2 \int_{x_0}^x e^{-\int_{x_0}^t a_1}\, e^{-2\int_{x_0}^t \alpha_2}\, dt \right) \quad (c_1, c_2 \in \mathbb{R}).$$

Example 11 Consider the differential equation

$$y'' + \frac{1}{x^4}\, y = 0$$


on $I = \mathbb{R}^+$. We have $a_1 = 0$, $a_2(x) = \frac{1}{x^4}$. The Riccati equation (3.9) is $\alpha_2' + \alpha_2^2 + \frac{1}{x^4} = 0$, and we look for a complex solution. It is easy to check that the system (3.31) is solved by

$$\mathrm{Re}\,\alpha_2(x) = \frac{1}{x}, \qquad \mathrm{Im}\,\alpha_2(x) = \frac{1}{x^2}.$$

Thus $\alpha_2(x) = \frac{1}{x} + \frac{i}{x^2}$, and we get the complex factorization

$$\left(\frac{d}{dx}\right)^2 + \frac{1}{x^4} = \left( \frac{d}{dx} + \frac{1}{x} + \frac{i}{x^2} \right)\left( \frac{d}{dx} - \frac{1}{x} - \frac{i}{x^2} \right).$$

Taking $x_0 = 1$, we have

$$\begin{cases} \eta(x,1) = \int_1^x \frac{1}{t}\,dt = \log x \\ \omega(x,1) = \int_1^x \frac{1}{t^2}\,dt = 1 - \frac{1}{x}, \end{cases}$$

and the general solution of $y'' + \frac{1}{x^4}y = 0$ on $\mathbb{R}^+$ is

$$y(x) = x\left( c_1 \cos\tfrac{x-1}{x} + c_2 \sin\tfrac{x-1}{x} \right) = x\left( c_1' \cos\tfrac{1}{x} + c_2' \sin\tfrac{1}{x} \right) \quad (c_1, c_2, c_1', c_2' \in \mathbb{R}).$$
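These closed forms are easy to sanity-check numerically. The following sketch (plain Python, not from the book; all function names are ours) verifies by finite differences that $y(x) = x\cos\frac{x-1}{x}$ solves $y'' + \frac{1}{x^4}y = 0$, and that the kernel $g(x,1) = x\sin\frac{x-1}{x}$ obtained from (3.37) (here $\mathrm{Im}\,\alpha_2(1) = 1$) satisfies the initial conditions $g(1,1) = 0$, $\partial_x g(1,1) = 1$.

```python
import math

def y(x):   # candidate solution from Example 11 (c1 = 1, c2 = 0)
    return x * math.cos((x - 1.0) / x)

def g(x):   # impulsive response kernel g(x, 1) from (3.37)
    return x * math.sin((x - 1.0) / x)

h = 1e-5
for x in (0.5, 1.0, 2.0, 3.0):
    # central finite difference for y'' at x
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2
    residual = ypp + y(x) / x**4
    assert abs(residual) < 1e-4, (x, residual)

# kernel initial conditions at x0 = 1: g(1,1) = 0, g_x(1,1) = 1
assert abs(g(1.0)) < 1e-12
gp = (g(1.0 + h) - g(1.0 - h)) / (2.0 * h)
assert abs(gp - 1.0) < 1e-6
print("Example 11 checks passed")
```

The residual tolerances reflect the $O(h^2)$ truncation and roundoff of the central differences.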

Example 12 Consider the differential equation $y'' + (e^x + 1)y' + e^x y = 0$. Here $a_1(x) = e^x + 1$, $a_2(x) = e^x$. The Riccati equation (3.9) reads $\alpha_2' + \alpha_2^2 + (e^x + 1)\alpha_2 + e^x = 0$, and we see that $\alpha_2(x) = -1$ is a solution. Thus $\alpha_1(x) = -e^x$, and we get the real factorization

$$\left(\frac{d}{dx}\right)^2 + (e^x + 1)\frac{d}{dx} + e^x = \left( \frac{d}{dx} + e^x \right)\left( \frac{d}{dx} + 1 \right).$$

With $x_0 = 0$, we compute $g_{\alpha_2}(x,0) = e^{-x}$, $g(x,0) = e^{-x}\big(1 - e^{1-e^x}\big)$, and the general solution on $\mathbb{R}$ is

$$y(x) = e^{-x}\left( c_1 + c_2\big(1 - e^{1-e^x}\big) \right) \quad (c_1, c_2 \in \mathbb{R}).$$
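The closed form of $g(x,0)$ can be checked against the recursive definition of the kernel, $g(x,t) = \int_t^x g_{\alpha_2}(x,s)\, g_{\alpha_1}(s,t)\, ds$ (see (3.49) below for the general case). A numerical sketch, not from the book; the trapezoidal rule and the step count are arbitrary choices:

```python
import math

def g_a1(s, t):   # first-order kernel for alpha_1 = -e^x: exp(-(e^s - e^t))
    return math.exp(-(math.exp(s) - math.exp(t)))

def g_a2(x, s):   # first-order kernel for alpha_2 = -1: exp(-(x - s))
    return math.exp(-(x - s))

def g_recursive(x, t, n=4000):
    # trapezoidal rule for g(x,t) = int_t^x g_a2(x,s) g_a1(s,t) ds
    hstep = (x - t) / n
    total = 0.5 * (g_a2(x, t) * g_a1(t, t) + g_a2(x, x) * g_a1(x, t))
    for k in range(1, n):
        s = t + k * hstep
        total += g_a2(x, s) * g_a1(s, t)
    return total * hstep

def g_closed(x):  # closed form from Example 12
    return math.exp(-x) * (1.0 - math.exp(1.0 - math.exp(x)))

for x in (0.5, 1.0, 2.0):
    assert abs(g_recursive(x, 0.0) - g_closed(x)) < 1e-5
print("kernel recursion matches closed form")
```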

Example 13 Consider the differential equation $y'' + (e^{4x} - 1)y = 0$. Here $a_1(x) = 0$, $a_2(x) = e^{4x} - 1$. The Riccati equation (3.9) is $\alpha_2' + \alpha_2^2 + e^{4x} - 1 = 0$, and we look for a complex solution. The system (3.31) is easily seen to be solved by

$$\mathrm{Re}\,\alpha_2(x) = -1, \qquad \mathrm{Im}\,\alpha_2(x) = e^{2x}.$$

Thus $\alpha_2(x) = -1 + ie^{2x}$, and we get the factorization

$$\left(\frac{d}{dx}\right)^2 + e^{4x} - 1 = \left( \frac{d}{dx} - 1 + ie^{2x} \right)\left( \frac{d}{dx} + 1 - ie^{2x} \right).$$

Taking $x_0 = 0$, we compute

$$\begin{cases} \eta(x,0) = -x \\ \omega(x,0) = \frac{e^{2x}-1}{2}, \end{cases}$$

and the general solution of $y'' + (e^{4x} - 1)y = 0$ on $\mathbb{R}$ is

$$y(x) = e^{-x}\left( c_1 \cos\frac{e^{2x}-1}{2} + c_2 \sin\frac{e^{2x}-1}{2} \right) \quad (c_1, c_2 \in \mathbb{R}).$$
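As before, a quick numerical sketch (ours, not the book's) confirms that $y(x) = e^{-x}\cos\frac{e^{2x}-1}{2}$ satisfies $y'' + (e^{4x}-1)y = 0$:

```python
import math

def y(x):  # one solution from Example 13 (c1 = 1, c2 = 0)
    return math.exp(-x) * math.cos((math.exp(2.0 * x) - 1.0) / 2.0)

h = 1e-5
for x in (-1.0, 0.0, 0.5, 1.0):
    # central finite difference for y''
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2
    residual = ypp + (math.exp(4.0 * x) - 1.0) * y(x)
    assert abs(residual) < 1e-3, (x, residual)
print("Example 13 residuals vanish")
```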

Example 14 Consider the differential equation $Ly = y'' + xy' - y = 0$. Then (cf. Example 6)

$$L = \left(\frac{d}{dx}\right)^2 + x\frac{d}{dx} - 1 = \left( \frac{d}{dx} + x + \frac{1}{x} \right)\left( \frac{d}{dx} - \frac{1}{x} \right),$$

and $\alpha_2(x) = \frac{1}{x}$ on $I = \mathbb{R}^+$ (say). We compute $g_{\alpha_2}(x,x_0) = \frac{x}{x_0}$,

$$g(x,x_0) = x\, x_0\, e^{x_0^2/2} \int_{x_0}^x \frac{e^{-s^2/2}}{s^2}\, ds.$$

Integrating by parts the last integral with $\frac{1}{s^2} = -\left(\frac{1}{s}\right)'$, we have

$$\int_{x_0}^x \frac{e^{-s^2/2}}{s^2}\, ds = -\frac{e^{-x^2/2}}{x} + \frac{e^{-x_0^2/2}}{x_0} - \int_{x_0}^x e^{-s^2/2}\, ds,$$

and the impulsive response kernel becomes

$$g(x,x_0) = x - x_0\, e^{-(x^2 - x_0^2)/2} - x\, x_0 \int_{x_0}^x e^{-(s^2 - x_0^2)/2}\, ds.$$

This is a smooth function of $x$ on the whole of $\mathbb{R}$. Taking $x_0 = 1$, we get the general solution of $Ly = 0$ on $\mathbb{R}$ in the form

$$y(x) = c_1 x + c_2 \left( e^{-x^2/2} + x \int_1^x e^{-s^2/2}\, ds \right) \quad (c_1, c_2 \in \mathbb{R}).$$
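The second solution $e^{-x^2/2} + x\int_1^x e^{-s^2/2}\,ds$ involves a non-elementary integral, but it is still easy to verify numerically that it solves $Ly = y'' + xy' - y = 0$. A sketch (ours; trapezoidal quadrature and step sizes are arbitrary choices):

```python
import math

def quad(x, n=2000):
    # trapezoidal approximation of int_1^x exp(-s^2/2) ds
    hstep = (x - 1.0) / n
    total = 0.5 * (math.exp(-0.5) + math.exp(-x * x / 2.0))
    for k in range(1, n):
        s = 1.0 + k * hstep
        total += math.exp(-s * s / 2.0)
    return total * hstep

def u(x):  # second solution from Example 14
    return math.exp(-x * x / 2.0) + x * quad(x)

h = 1e-3
for x in (0.5, 1.5, 2.0):
    up = (u(x + h) - u(x - h)) / (2.0 * h)
    upp = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2
    residual = upp + x * up - u(x)
    assert abs(residual) < 1e-4, (x, residual)
print("Example 14 second solution verified")
```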

Example 15 Consider the differential equation $Ly = y'' - \left(1 + \frac{1}{x}\right)y' + \frac{1}{x}y = 0$ on $I = \mathbb{R}^+$. Then (cf. Example 8)

$$L = \left(\frac{d}{dx}\right)^2 - \left(1 + \frac{1}{x}\right)\frac{d}{dx} + \frac{1}{x} = \left( \frac{d}{dx} - \frac{1}{x} \right)\left( \frac{d}{dx} - 1 \right),$$

and $\alpha_2(x) = 1$. We compute $g_{\alpha_2}(x,x_0) = e^{x-x_0}$,

$$g(x,x_0) = \frac{x_0+1}{x_0}\, e^{x-x_0} - \frac{x+1}{x_0}.$$

The general solution of $Ly = 0$ on $\mathbb{R}^+$ can be written as

$$y(x) = c_1 e^x + c_2 (x+1) \quad (c_1, c_2 \in \mathbb{R}).$$

Example 16 Consider the differential equation

$$y'' + \frac{1}{x}\, y' + \frac{1}{x^2}\, y = 0$$

on $I = \mathbb{R}^+$. By looking for a solution of the form $y(x) = x^\alpha$, we find $\alpha^2 + 1 = 0$, whence $\alpha = \pm i$. Thus two complex-valued solutions are $x^{\pm i} = e^{\pm i \log x} = \cos(\log x) \pm i \sin(\log x)$, and the general real solution on $\mathbb{R}^+$ is

$$y(x) = c_1 \cos(\log x) + c_2 \sin(\log x) \quad (c_1, c_2 \in \mathbb{R}).$$

We can also proceed by factorization. The system (3.31) is easily seen to be solved by $\mathrm{Re}\,\alpha_2(x) = 0$, $\mathrm{Im}\,\alpha_2(x) = \frac{1}{x}$. With $x_0 = 1$ we get $g_{\alpha_2}(x,1) = e^{i\log x}$, $\eta(x,1) = 0$, $\omega(x,1) = \log x$, and (3.35) gives the same result as above. We get the factorization

$$\left(\frac{d}{dx}\right)^2 + \frac{1}{x}\frac{d}{dx} + \frac{1}{x^2} = \left( \frac{d}{dx} + \frac{1}{x} + \frac{i}{x} \right)\left( \frac{d}{dx} - \frac{i}{x} \right).$$
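The complex factorization can be tested directly: applying the right factor $\frac{d}{dx} - \frac{i}{x}$ and then the left factor $\frac{d}{dx} + \frac{1}{x} + \frac{i}{x}$ to the real solution $\cos(\log x)$ must give zero. A numerical sketch (ours) with complex arithmetic, derivatives taken by central differences:

```python
import cmath

def y(x):
    return cmath.cos(cmath.log(x))

h = 1e-5

def first_factor(x):
    # (d/dx - i/x) y  at x
    return (y(x + h) - y(x - h)) / (2 * h) - 1j / x * y(x)

def second_factor(x):
    # (d/dx + 1/x + i/x) applied to first_factor, at x
    d = (first_factor(x + h) - first_factor(x - h)) / (2 * h)
    return d + (1.0 + 1j) / x * first_factor(x)

for x in (0.5, 1.0, 2.0):
    assert abs(second_factor(x)) < 1e-4, (x, second_factor(x))
print("complex factorization annihilates cos(log x)")
```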

Remark 3.11 Actually, the set $V_\mathbb{R}$ of real solutions of $Ly = 0$ can always be represented in the form (3.35), for any $L$ (with real-valued coefficients). Indeed, let $z_1, z_2$ be any basis of $V_\mathbb{R}$. Put $\alpha = z_1 + iz_2$; then, as already observed, $\alpha$ is a complex-valued solution of $Ly = 0$ that never vanishes in $I$, $\alpha(x) \neq 0$ $\forall x \in I$. If we define $\alpha_2 = \alpha'/\alpha$, we get the factorization (3.7) with $\alpha_2$ complex-valued, given by

$$\alpha_2 = \frac{z_1' + iz_2'}{z_1 + iz_2} = \frac{z_1 z_1' + z_2 z_2'}{z_1^2 + z_2^2} + i\,\frac{w(z_1,z_2)}{z_1^2 + z_2^2},$$

where $w(z_1,z_2) = z_1 z_2' - z_2 z_1'$ is the Wronskian of $z_1, z_2$. Since $\mathrm{Im}\,\alpha_2(x) \neq 0$ $\forall x \in I$, we can repeat the above construction. Thus we write the function $g_{\alpha_2}(x,x_0)$ in the form (3.33), with $\eta$ and $\omega$ given by (3.34), and we again obtain (3.35) for the real solutions of $Ly = 0$.

The relationship between the functions $(\eta, \omega)$ and the system $(z_1, z_2)$ is as follows. Since

$$\mathrm{Re}\,\alpha_2 = \frac{z_1 z_1' + z_2 z_2'}{z_1^2 + z_2^2} = \frac{1}{2}\frac{d}{dx}\log(z_1^2 + z_2^2), \quad \text{and} \quad \mathrm{Im}\,\alpha_2 = \frac{w(z_1,z_2)}{z_1^2 + z_2^2}, \quad (3.38)$$

we get

$$\begin{cases} e^{\eta(x,x_0)} = \dfrac{\sqrt{z_1(x)^2 + z_2(x)^2}}{\sqrt{z_1(x_0)^2 + z_2(x_0)^2}}, \\[2mm] \omega(x,x_0) = \displaystyle\int_{x_0}^x \frac{w(z_1,z_2)(t)}{z_1(t)^2 + z_2(t)^2}\, dt. \end{cases}$$

Note that by writing $\alpha = z_1 + iz_2 = \sqrt{z_1^2 + z_2^2}\, e^{i\theta}$, we get $\theta' = \frac{w(z_1,z_2)}{z_1^2 + z_2^2} = \mathrm{Im}\,\alpha_2$, so that

$$\omega(x,x_0) = \int_{x_0}^x \theta'(t)\, dt = \theta(x) - \theta(x_0).$$

Thus $(e^\eta, \omega)$ are essentially polar coordinates in the space $(z_1, z_2)$. Since

$$\int_{x_0}^x \alpha_2(t)\, dt = \int_{x_0}^x \frac{\alpha'(t)}{\alpha(t)}\, dt = \log\frac{\alpha(x)}{\alpha(x_0)},$$

we have

$$g_{\alpha_2}(x,x_0) = e^{\int_{x_0}^x \alpha_2(t)\,dt} = \frac{\alpha(x)}{\alpha(x_0)},$$

and we can rewrite the kernel (3.37) in terms of $z_1, z_2$ as (cf. (3.33), (3.37), (3.38)):

$$g(x,x_0) = \frac{1}{\mathrm{Im}\,\alpha_2(x_0)}\,\mathrm{Im}\, g_{\alpha_2}(x,x_0) = \frac{z_1^2(x_0) + z_2^2(x_0)}{w(z_1,z_2)(x_0)}\,\mathrm{Im}\,\frac{z_1(x) + iz_2(x)}{z_1(x_0) + iz_2(x_0)} = \frac{1}{w(z_1,z_2)(x_0)}\big( -z_1(x) z_2(x_0) + z_2(x) z_1(x_0) \big)$$

(cf. (3.26)).

As an example, consider the case of constant coefficients with real and distinct roots of $p(\lambda)$, $\lambda_{1,2} = \alpha \mp \beta$. Taking $z_1(x) = e^{(\alpha-\beta)x}$, $z_2(x) = e^{(\alpha+\beta)x}$, we compute

$$\begin{cases} e^{\eta(x,0)} = e^{\alpha x}\sqrt{\mathrm{ch}\, 2\beta x} \\ \omega(x,0) = \arctan(\mathrm{th}\, \beta x). \end{cases}$$

Then (3.35)–(3.36) with $x_0 = 0$ yield (2.37) (the case $\Delta > 0$), since $\alpha_2(0) = \alpha + i\beta$, $\tan\omega(x,0) = \mathrm{th}\,\beta x$, and

$$\cos\omega(x,0) = \frac{\mathrm{ch}\,\beta x}{\sqrt{\mathrm{ch}\, 2\beta x}}, \qquad \sin\omega(x,0) = \frac{\mathrm{sh}\,\beta x}{\sqrt{\mathrm{ch}\, 2\beta x}}.$$

For real and equal roots $\lambda_1 = \lambda_2$, taking $z_1(x) = e^{\lambda_1 x}$, $z_2(x) = xe^{\lambda_1 x}$, we get

$$\begin{cases} e^{\eta(x,0)} = e^{\lambda_1 x}\sqrt{1 + x^2} \\ \omega(x,0) = \arctan(x). \end{cases}$$

Then (3.35)–(3.36) with $x_0 = 0$ yield (2.37) (the case $\Delta = 0$), as $\alpha_2(0) = \lambda_1 + i$, and

$$\cos\omega(x,0) = \frac{1}{\sqrt{1+x^2}}, \qquad \sin\omega(x,0) = \frac{x}{\sqrt{1+x^2}}.$$
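The representation $g(x,x_0) = \frac{1}{w(z_1,z_2)(x_0)}\big(z_2(x)z_1(x_0) - z_1(x)z_2(x_0)\big)$ can be sanity-checked in the constant-coefficient case, where the impulsive response of $(\frac{d}{dx}-\lambda_1)(\frac{d}{dx}-\lambda_2)$ with distinct real roots is $\frac{e^{\lambda_2(x-x_0)} - e^{\lambda_1(x-x_0)}}{\lambda_2-\lambda_1}$. A sketch (ours) with the arbitrary choice $\lambda_1 = 1$, $\lambda_2 = 3$:

```python
import math

l1, l2 = 1.0, 3.0                    # distinct real roots (alpha = 2, beta = 1)
z1 = lambda x: math.exp(l1 * x)
z2 = lambda x: math.exp(l2 * x)
w  = lambda x: (l2 - l1) * math.exp((l1 + l2) * x)   # Wronskian z1 z2' - z2 z1'

def g_from_basis(x, x0):
    # kernel formula of Remark 3.11
    return (z2(x) * z1(x0) - z1(x) * z2(x0)) / w(x0)

def g_const_coeff(x, x0):
    # standard impulsive response for constant coefficients
    return (math.exp(l2 * (x - x0)) - math.exp(l1 * (x - x0))) / (l2 - l1)

for x0 in (0.0, 0.7):
    for x in (0.0, 0.5, 1.2):
        assert abs(g_from_basis(x, x0) - g_const_coeff(x, x0)) < 1e-12
print("basis formula reproduces the impulsive response")
```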


3.3 The General Case

Consider the linear non-homogeneous differential equation of order $n$

$$y^{(n)} + a_1(x)\, y^{(n-1)} + a_2(x)\, y^{(n-2)} + \cdots + a_{n-1}(x)\, y' + a_n(x)\, y = f(x), \quad (3.39)$$

where $a_1, \ldots, a_n, f$ are complex-valued continuous functions in a common interval $I$, i.e., $a_j, f \in C^0(I,\mathbb{C})$. As usual we write (3.39) in operator form as $Ly = f(x)$, where $L$ is the linear ordinary differential operator of order $n$ given by

$$L = \left(\frac{d}{dx}\right)^n + a_1(x)\left(\frac{d}{dx}\right)^{n-1} + \cdots + a_{n-1}(x)\frac{d}{dx} + a_n(x). \quad (3.40)$$

As in the case of constant coefficients, the linearity of $L$ (i.e., (2.9)) implies that the general solution of the non-homogeneous equation $Ly = f(x)$ is the sum of the general solution of the associated homogeneous equation $Ly = 0$, plus any particular solution of the non-homogeneous equation.

We recall that the set of solutions of the homogeneous equation, $V = \{ y \in C^n(I,\mathbb{C}) : Ly = 0 \}$, is a complex vector space of dimension $n$. This follows from the existence-uniqueness theorem for the homogeneous initial value problem (see, e.g., [8], Chap. 3, Theorems 1, 3, 5). A basis of $V$ is called a fundamental system (of solutions of $Ly = 0$). We also recall that $n$ solutions $y_1, y_2, \ldots, y_n$ of $Ly = 0$ are linearly independent in $V$ if and only if their Wronskian, i.e., the determinant defined by

$$w(y_1, \ldots, y_n) = \begin{vmatrix} y_1 & y_2 & \cdots & y_n \\ y_1' & y_2' & \cdots & y_n' \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)} \end{vmatrix},$$

is nonzero at some point of $I$, in which case $w(y_1, \ldots, y_n)(x) \neq 0$, $\forall x \in I$. In fact, as for $n = 2$, the function $w = w(y_1, \ldots, y_n)$ solves the first order linear differential equation $w' = -a_1(x)w$, whence $w(x) = w(x_0)\, e^{-\int_{x_0}^x a_1(t)\, dt}$, where $x_0$ is a fixed point of $I$. (See, e.g., [8], Chap. 3, Theorems 6 and 8.)

3.3.1 The Factorization Problem

We would like to factor $L$ in (3.40) as a product (composition) of first-order operators

$$L = \left( \frac{d}{dx} - \alpha_1(x) \right)\left( \frac{d}{dx} - \alpha_2(x) \right) \cdots \left( \frac{d}{dx} - \alpha_n(x) \right) \quad (3.41)$$

globally defined on $I$, i.e., such that $\alpha_j \in C^{j-1}(I,\mathbb{C})$ ($1 \le j \le n$). This is the minimal regularity condition under which the operator in (3.41) has continuous coefficients.

The following result generalizes Proposition 3.1. The analogue of condition (3) in that Proposition is now the existence of a fundamental system whose complete chain of Wronskians is never zero in $I$.

Theorem 3.12 Let $L$ be the linear ordinary differential operator in (3.40), where the coefficients $a_j \in C^0(I,\mathbb{C})$, $I$ an interval. Then the following conditions are equivalent.

(1) $L$ admits a factorization of the form (3.41), with $\alpha_j \in C^{j-1}(I,\mathbb{C})$ ($1 \le j \le n$).

(2) There exists a fundamental system $z_1, z_2, \ldots, z_n$ of solutions of the homogeneous equation $Ly = 0$ with the property that the sequence of Wronskian determinants

$$w_0 = 1, \quad w_1 = z_1, \quad w_2 = \begin{vmatrix} z_1 & z_2 \\ z_1' & z_2' \end{vmatrix}, \quad \ldots, \quad w_j = \begin{vmatrix} z_1 & z_2 & \cdots & z_j \\ z_1' & z_2' & \cdots & z_j' \\ \vdots & \vdots & & \vdots \\ z_1^{(j-1)} & z_2^{(j-1)} & \cdots & z_j^{(j-1)} \end{vmatrix} \quad (1 \le j \le n) \quad (3.42)$$

never vanishes in the interval $I$, i.e.,

$$w_j(x) \neq 0, \quad \forall x \in I, \ \forall j = 1, \ldots, n. \quad (3.43)$$

The functions $\alpha_j$ in (3.41) are then the logarithmic derivatives of ratios of Wronskians, namely

$$\alpha_j = \frac{d}{dx}\log\frac{w_{n-j+1}}{w_{n-j}} \quad (1 \le j \le n), \quad (3.44)$$

i.e.,

$$\begin{cases} \alpha_1 = \frac{d}{dx}\log\frac{w_n}{w_{n-1}} = \frac{w_n'}{w_n} - \frac{w_{n-1}'}{w_{n-1}} \\ \quad\vdots \\ \alpha_{n-1} = \frac{d}{dx}\log\frac{w_2}{w_1} = \frac{w_2'}{w_2} - \frac{w_1'}{w_1} \\ \alpha_n = \frac{d}{dx}\log\frac{w_1}{w_0} = \frac{w_1'}{w_1} = \frac{z_1'}{z_1}. \end{cases}$$

If the coefficients $a_j$ in (3.40) are real-valued $\forall j$, then $L$ admits a factorization of the form (3.41) with $\alpha_j$ real-valued $\forall j$ if and only if there exists a real-valued fundamental system $z_1, z_2, \ldots, z_n$ satisfying (3.43).

For the proof in the real case see [14], Lemma II, p. 198. See also [9], Theorem 2, p. 91. We observe that the proof remains unchanged in the case of complex-valued coefficients, namely the condition that the partial Wronskians are all positive throughout $I$ may be replaced by the condition that they vanish nowhere in $I$.

Note that for $n = 2$, condition (2) above reduces to the statement that there exists a solution $z_1$ of $Ly = 0$ that never vanishes in $I$, which is just condition (3) of Proposition 3.1. (Indeed a second solution $z_2$ such that $w_2$ does not vanish in $I$ can always be found.)

The analogue of condition (2) in Proposition 3.1 for generic $n$ can be derived by expanding out the product in (3.41) and equating to $L$ in (3.40). The functions $\alpha_j$ satisfy a suitable system of Riccati-type non-linear differential equations. (See (3.56) and (3.57) for the cases $n = 3, 4$.) The equation satisfied by $\alpha_n$ is of order $n-1$, and can be obtained more quickly as follows (cf. [18], p. 16). Observe that if $L$ is factored as in (3.41), then the function $y(x) = g_{\alpha_n}(x,x_0) = e^{\int_{x_0}^x \alpha_n(t)\,dt}$ solves $Ly = 0$ and never vanishes in $I$. But $y' = \alpha_n y$, so we can get the equation for $\alpha_n$ by making the change of variable $y' = zy$ (i.e., $z$ is the logarithmic derivative of $y$). Then $y'' = z'y + zy' = (z' + z^2)y$, and in general $y^{(k)} = \phi_k(z)\,y$, where the $\phi_k(z)$ are determined by

$$\phi_k(z) = \phi_{k-1}'(z) + z\,\phi_{k-1}(z), \qquad \phi_0(z) = 1.$$

For instance

$$\phi_1(z) = z, \quad \phi_2(z) = z' + z^2, \quad \phi_3(z) = z'' + 3zz' + z^3,$$
$$\phi_4(z) = z''' + 4z''z + 3z'^2 + 6z'z^2 + z^4,$$

and so on. Substituting the $y^{(k)}$ in (3.39) (with $f = 0$) gives the Riccati equation of order $n-1$ associated to $Ly = 0$ and satisfied by $z = \alpha_n$:

$$\phi_n(z) + a_1\phi_{n-1}(z) + a_2\phi_{n-2}(z) + \cdots + a_{n-1}\phi_1(z) + a_n = 0. \quad (3.45)$$

For example for $n = 2, 3, 4$, we get the following equations for $z = \alpha_2, \alpha_3, \alpha_4$, respectively:

$$z' + z^2 + a_1 z + a_2 = 0,$$
$$z'' + 3zz' + z^3 + a_1(z' + z^2) + a_2 z + a_3 = 0, \quad (3.46)$$
$$z''' + 4z''z + 3z'^2 + 6z'z^2 + z^4 + a_1(z'' + 3zz' + z^3) + a_2(z' + z^2) + a_3 z + a_4 = 0. \quad (3.47)$$

The other functions $\alpha_j$ ($1 \le j \le n-1$) satisfy Riccati-type equations of order $j-1$ (cf. (3.56), (3.57) for $n = 3, 4$). It is not possible to solve these equations, in general. If the $a_j$ are rational functions of $x$ with rational coefficients, there are algorithms to generate the rational solutions (if any) of (3.45); see [18], pp. 16–21.

The question is now whether or not the conditions of Theorem 3.12 always hold for any $L$ of the form (3.40). We observe that a generic fundamental system does not have the property (3.43). Indeed, we know that $z_1, \ldots, z_n$ are linearly independent solutions of $Ly = 0$ if and only if $w_n(t) \neq 0$ $\forall t \in I$. However, the lower dimensional Wronskians $w_j$, $j < n$, can vanish in $I$. (For example $w_1 = z_1$ can vanish at some point of $I$.)
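The recursion for the $\phi_k$ can be checked on a concrete example (our arbitrary choice, not the book's): for $y = e^{x^2}$ the logarithmic derivative is $z = 2x$, so $y''/y = \phi_2(z) = 2 + 4x^2$ and $y'''/y = \phi_3(z) = 12x + 8x^3$. A numerical sketch:

```python
import math

def y(x):
    return math.exp(x * x)

h = 1e-3
for x in (0.0, 0.4, 0.8):
    z, zp, zpp = 2.0 * x, 2.0, 0.0        # z = 2x and its derivatives
    phi2 = zp + z * z                     # phi_2(z) = z' + z^2
    phi3 = zpp + 3.0 * z * zp + z**3      # phi_3(z) = z'' + 3 z z' + z^3
    # finite-difference second and third derivatives of y
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2
    yppp = (y(x + 2*h) - 2.0*y(x + h) + 2.0*y(x - h) - y(x - 2*h)) / (2.0 * h**3)
    assert abs(ypp / y(x) - phi2) < 1e-3
    assert abs(yppp / y(x) - phi3) < 1e-2
print("phi_k recursion consistent with derivatives of exp(x^2)")
```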


Historically, a local factorization of the form (3.41) had been known for some time, dating back to works of Frobenius and Floquet (see [12], p. 121). Mammana proved that if the $a_j$ are real-valued, then a fundamental system with $w_j(x) \neq 0$ $\forall x \in I$, $\forall j$, always exists, with $z_1$ (generally) complex-valued, while $z_2, \ldots, z_n$ can be taken to be real-valued. (See [14], p. 203, and Teorema generale, p. 207.) Thus the new point established in [14] is that, for $a_j$ real-valued, one can always find a global decomposition of the form (3.41) (i.e., valid on the whole of the interval $I$) if one allows the $\alpha_j$ to be complex-valued. In [4], Theorem 3.3, the result of Mammana was generalized to the case of complex-valued coefficients $a_j$. We summarize the general result here.

Theorem 3.13 Let $L$ be a linear ordinary differential operator of order $n$

$$L = \left(\frac{d}{dx}\right)^n + a_1(x)\left(\frac{d}{dx}\right)^{n-1} + \cdots + a_{n-1}(x)\frac{d}{dx} + a_n(x)$$

with coefficients $a_j \in C^0(I,\mathbb{C})$, $I$ an interval. Then there exists a fundamental system $z_1, \ldots, z_n$ of solutions of $Ly = 0$ with the property (3.43). Consequently, $L$ admits the factorization

$$L = \left( \frac{d}{dx} - \alpha_1(x) \right)\left( \frac{d}{dx} - \alpha_2(x) \right) \cdots \left( \frac{d}{dx} - \alpha_n(x) \right),$$

where the functions $\alpha_j \in C^{j-1}(I,\mathbb{C})$ ($1 \le j \le n$) are given by (3.44), with $w_j$ given by (3.42). If the coefficients $a_j$ are all real-valued, then one can take $z_2, \ldots, z_n$ to be real-valued, but $z_1$ is generally complex-valued.

Finally, in the case of real-valued coefficients $a_j$, we have the following necessary and sufficient condition for $L$ to admit the factorization (3.41) with $\alpha_j$ real-valued.

Theorem 3.14 Let $I$ be open or compact and let the coefficients $a_j$ in (3.40) be real-valued $\forall j$. Then $L$ admits a factorization (3.41) with $\alpha_j \in C^{j-1}(I,\mathbb{R})$ $\forall j = 1, \ldots, n$, if and only if $L$ is disconjugate on $I$, i.e., every nontrivial real solution of $Ly = 0$ has at most $n-1$ zeros in $I$ (counted with multiplicity). (A point $x_0 \in I$ is a zero of multiplicity $m$ of $y$ if $y(x_0) = y'(x_0) = \cdots = y^{(m-1)}(x_0) = 0$, $y^{(m)}(x_0) \neq 0$.)

See [14], p. 214, Teorema. See also [11], Corollary 6.1, p. 351 (for $n = 2$), and p. 67, Ex. 8.3 (for any $n$); [9], Theorem 2, p. 91, and Theorem 3, p. 94. Historically, the connection between disconjugacy and the factorization of a real linear differential operator of order $n$ into a product of first-order real operators was first discussed by Pólya in [15]. Note that the so-called Pólya factorization (cf. [15], formula (18)) is equivalent to the Mammana factorization (see [9], formula (8), p. 92).

Remark 3.15 The condition about counting zeros with multiplicity can be omitted. Indeed, one can prove that every nontrivial real solution of $Ly = 0$ has at most $n-1$ zeros in $I$ counted with multiplicity, if and only if every nontrivial real solution of $Ly = 0$ has at most $n-1$ distinct zeros in $I$. (See [14], pp. 214–215, [11], Ex. 8.4, p. 68.)
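Formula (3.44) in Theorems 3.12–3.13 can be illustrated numerically (a sketch with an arbitrarily chosen constant-coefficient example, not from the book). Take $n = 2$ and the fundamental system $z_1 = e^{2x}$, $z_2 = e^x$ of $y'' - 3y' + 2y = 0$. Then $w_1 = z_1$, $w_2 = z_1 z_2' - z_2 z_1' = -e^{3x}$, and (3.44) gives $\alpha_2 = w_1'/w_1 = 2$ and $\alpha_1 = (\log(w_2/w_1))' = 1$, recovering $L = (\frac{d}{dx} - 1)(\frac{d}{dx} - 2)$:

```python
import math

z1 = lambda x: math.exp(2.0 * x)
z2 = lambda x: math.exp(x)

h = 1e-5
def d(f, x):   # central-difference derivative
    return (f(x + h) - f(x - h)) / (2.0 * h)

w1 = z1
w2 = lambda x: z1(x) * d(z2, x) - z2(x) * d(z1, x)   # Wronskian w_2

for x in (0.0, 0.5, 1.0):
    alpha2 = d(w1, x) / w1(x)                        # alpha_n = w1'/w1
    ratio = lambda t: w2(t) / w1(t)
    alpha1 = d(ratio, x) / ratio(x)                  # (log(w2/w1))'
    assert abs(alpha2 - 2.0) < 1e-6
    assert abs(alpha1 - 1.0) < 1e-4
print("Wronskian chain recovers the factorization (d/dx - 1)(d/dx - 2)")
```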


3.3.2 The Non-homogeneous Equation

Suppose now that $L$ in (3.40), with $a_j \in C^0(I,\mathbb{C})$, is factored as in (3.41), with $\alpha_j \in C^{j-1}(I,\mathbb{C})$. We can then easily solve the initial value problem with trivial initial conditions at some point $x_0 \in I$.

Theorem 3.16 Let $f \in C^0(I,\mathbb{C})$, $I$ an interval, $x_0 \in I$, and let $L$ be the differential operator (3.41), with $\alpha_j \in C^{j-1}(I,\mathbb{C})$ ($1 \le j \le n$). Then the initial value problem

$$\begin{cases} Ly = f(x) \\ y(x_0) = y'(x_0) = \cdots = y^{(n-1)}(x_0) = 0 \end{cases}$$

has a unique solution, defined on the whole of $I$, and given by the formula

$$y_p(x) = \int_{x_0}^x g(x,t)\, f(t)\, dt, \quad (3.48)$$

where $g = g_{\alpha_1 \cdots \alpha_n} : I \times I \to \mathbb{C}$ is the function defined recursively as follows: for $n = 1$ we set

$$g_\alpha(x,t) = e^{\int_t^x \alpha(s)\, ds},$$

and for $n \ge 2$ we set

$$g_{\alpha_1 \cdots \alpha_n}(x,t) = \int_t^x g_{\alpha_n}(x,s)\, g_{\alpha_1 \cdots \alpha_{n-1}}(s,t)\, ds. \quad (3.49)$$

The function $x \mapsto g_{\alpha_1 \cdots \alpha_n}(x,t)$ is, for any $t \in I$, the unique solution of the homogeneous problem $Ly = 0$ with the initial conditions

$$\partial_x^k g(x,t)\big|_{x=t} = 0 \ \text{ for } 0 \le k \le n-2, \qquad \partial_x^{n-1} g(x,t)\big|_{x=t} = 1. \quad (3.50)$$
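The recursion (3.49) can be implemented directly by numerical quadrature (a sketch, ours; the trapezoidal rule is an arbitrary choice). For instance, with $\alpha_1 = \alpha_2 = \alpha_3 = 0$ the operator is $(\frac{d}{dx})^3$, and the kernel must be $g(x,t) = \frac{(x-t)^2}{2}$, in agreement with (3.50):

```python
import math

def kernel(alphas, x, t, n=60):
    """g_{alpha_1 ... alpha_k}(x, t) via the recursion (3.49), trapezoidal rule."""
    hstep = (x - t) / n
    if len(alphas) == 1:
        # base case: g_alpha(x,t) = exp(int_t^x alpha(s) ds)
        a = alphas[0]
        total = 0.5 * (a(t) + a(x)) + sum(a(t + k * hstep) for k in range(1, n))
        return math.exp(total * hstep)
    # recursive case: integrate g_{alpha_n}(x,s) g_{alpha_1...alpha_{n-1}}(s,t)
    f = lambda s: kernel(alphas[-1:], x, s, n) * kernel(alphas[:-1], s, t, n)
    total = 0.5 * (f(t) + f(x)) + sum(f(t + k * hstep) for k in range(1, n))
    return total * hstep

zero = lambda s: 0.0
for x in (0.5, 1.0):
    assert abs(kernel([zero, zero, zero], x, 0.0) - x * x / 2.0) < 1e-9
print("recursive kernel reproduces (x - t)^2 / 2 for L = (d/dx)^3")
```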

Proof The proof of the first part is by induction on $n$. The result holds for $n = 1, 2$ (see (3.4)–(3.5) and Theorem 3.3). Assuming the theorem holds for $n-1$, one finds that the function $h = \left( \frac{d}{dx} - \alpha_n(x) \right) y$ solves

$$\begin{cases} \left( \frac{d}{dx} - \alpha_1(x) \right) \cdots \left( \frac{d}{dx} - \alpha_{n-1}(x) \right) h = f(x) \\ h(x_0) = h'(x_0) = \cdots = h^{(n-2)}(x_0) = 0, \end{cases}$$

and is given by

$$h(x) = \int_{x_0}^x g_{\alpha_1 \cdots \alpha_{n-1}}(x,t)\, f(t)\, dt.$$

Thus

$$y(x) = \int_{x_0}^x g_{\alpha_n}(x,t)\, h(t)\, dt = \int_{x_0}^x g_{\alpha_n}(x,t) \int_{x_0}^t g_{\alpha_1 \cdots \alpha_{n-1}}(t,s)\, f(s)\, ds\, dt = \int_{x_0}^x \left( \int_s^x g_{\alpha_n}(x,t)\, g_{\alpha_1 \cdots \alpha_{n-1}}(t,s)\, dt \right) f(s)\, ds = \int_{x_0}^x g_{\alpha_1 \cdots \alpha_n}(x,s)\, f(s)\, ds.$$

We have interchanged the order of integration in the third step.

The second part is proved again by induction in a similar way. The statement holds for $n = 1, 2$ (cf. (3.2)–(3.3) and Theorem 3.3). Consider the problem

$$\begin{cases} Ly = 0 \\ y(t) = y'(t) = \cdots = y^{(n-2)}(t) = 0, \quad y^{(n-1)}(t) = 1. \end{cases}$$

Letting $h = \left( \frac{d}{dx} - \alpha_n(x) \right) y$, we see that the function $h$ solves

$$\begin{cases} \left( \frac{d}{dx} - \alpha_1(x) \right) \cdots \left( \frac{d}{dx} - \alpha_{n-1}(x) \right) h = 0 \\ h(t) = h'(t) = \cdots = h^{(n-3)}(t) = 0, \quad h^{(n-2)}(t) = 1. \end{cases}$$

By the inductive hypothesis, $h(x) = g_{\alpha_1 \cdots \alpha_{n-1}}(x,t)$. Since $y$ solves $y' - \alpha_n(x)y = h(x)$ with the initial condition $y(t) = 0$, we get by (3.4)

$$y(x) = \int_t^x g_{\alpha_n}(x,s)\, h(s)\, ds = \int_t^x g_{\alpha_n}(x,s)\, g_{\alpha_1 \cdots \alpha_{n-1}}(s,t)\, ds = g(x,t). \qquad \square$$

The function $g(x,t)$ is called the impulsive response kernel of $L$. It can also be used to solve the initial value problem with arbitrary initial data. In view of Proposition 3.4 (which holds with the same notations), it is enough to consider the homogeneous equation. As we shall see, the kernel $g(x,t)$ allows one to construct a fundamental system of solutions of $Ly = 0$, under suitable regularity assumptions on the coefficients $a_j$.

3.3.3 The Homogeneous Equation

First we make some remarks about the regularity of $g = g_{\alpha_1 \cdots \alpha_n}$. It is easy to see that $g \in C^1(I \times I, \mathbb{C})$. Indeed let us compute the partial derivatives $\partial_x g$ and $\partial_t g$. For $\partial_x g$ we find immediately from (3.49)

$$\partial_x g(x,t) = g_{\alpha_1 \cdots \alpha_{n-1}}(x,t) + \alpha_n(x)\, g(x,t). \quad (3.51)$$

For $\partial_t g$ let us prove by induction that for all $n \ge 2$,

$$\partial_t g(x,t) = -g_{\alpha_2 \cdots \alpha_n}(x,t) - \alpha_1(t)\, g(x,t). \quad (3.52)$$

Indeed the result holds for $n = 2$ (cf. (3.16)). Suppose it holds for $n-1$, i.e., assume $\partial_t g_{\alpha_1 \cdots \alpha_{n-1}}(x,t) = -g_{\alpha_2 \cdots \alpha_{n-1}}(x,t) - \alpha_1(t)\, g_{\alpha_1 \cdots \alpha_{n-1}}(x,t)$. Then, using $g_{\alpha_1 \cdots \alpha_{n-1}}(t,t) = 0$, $\forall n \ge 3$, we get

$$\begin{aligned} \partial_t g_{\alpha_1 \cdots \alpha_n}(x,t) &= \partial_t \int_t^x g_{\alpha_n}(x,s)\, g_{\alpha_1 \cdots \alpha_{n-1}}(s,t)\, ds \\ &= \int_t^x g_{\alpha_n}(x,s)\, \partial_t g_{\alpha_1 \cdots \alpha_{n-1}}(s,t)\, ds \\ &= -\int_t^x g_{\alpha_n}(x,s)\, g_{\alpha_2 \cdots \alpha_{n-1}}(s,t)\, ds - \alpha_1(t) \int_t^x g_{\alpha_n}(x,s)\, g_{\alpha_1 \cdots \alpha_{n-1}}(s,t)\, ds \\ &= -g_{\alpha_2 \cdots \alpha_n}(x,t) - \alpha_1(t)\, g(x,t), \end{aligned}$$

which is (3.52). From (3.51) and (3.52) we see that $\partial_x g$ and $\partial_t g$ are both continuous on $I \times I$, so $g \in C^1(I \times I, \mathbb{C})$.

We also observe that the function $x \mapsto \partial_t g$ in (3.52) solves the homogeneous equation $Ly = 0$. This can be seen in two ways. We can either permute the derivatives with respect to $x$ in $L$ with the derivative with respect to $t$ in $\partial_t g$, to get $L\,\partial_t g = \partial_t Lg = 0$. This is permissible because the derivatives $\partial_x^j \partial_t g$ exist and are continuous on $I \times I$ for $1 \le j \le n$. Indeed, applying $\partial_x^j$ to (3.52) and using (3.51) with $n-1$ in place of $n$, $n-2$ in place of $n$, and so on, we see that $\partial_x^n \partial_t g(x,t)$ involves the functions $\alpha_n^{(n-1)}, \alpha_{n-1}^{(n-2)}, \ldots, \alpha_2', \alpha_1$, which are all continuous in $I$ (recall that $\alpha_j \in C^{j-1}$). Alternatively, just observe that $L$ in (3.41) annihilates both terms $g_{\alpha_2 \cdots \alpha_n}$ and $g$ in (3.52), as is immediately seen. However, it is clear from (3.52) that $\partial_t^2 g$ does not exist, in general, since $\alpha_1 \in C^0$.

Note that in the case of constant coefficients we have $g(x,t) = g(x-t)$, where $g(x)$ is the impulsive response (by Theorem 2.19), and

$$g^{(k)}(x-t) = (-1)^k\, \partial_t^k g(x,t). \quad (3.53)$$

Then the translated formula (2.61) says that the partial derivatives $(-1)^k \partial_t^k g(x,t)$, with $0 \le k \le n-1$, are linearly independent and generate the space of solutions of the homogeneous equation. In order to generalize this result and the results of Theorem 3.5 to $n \ge 3$, we need the higher order derivatives $\partial_t^k g$ with $k \le n-1$ to exist and be continuous in $I \times I$. Therefore, for $n \ge 3$ we have to impose more regularity on the coefficients $a_j$ and $\alpha_j$. It is not difficult to see that the minimal condition required is

$$a_n \in C^0, \quad a_j \in C^{n-j-1}, \ \forall j = 1, \ldots, n-1, \quad (3.54)$$

which is equivalent to

$$\alpha_n \in C^{n-1}, \quad \alpha_j \in C^{n-2}, \ \forall j = 1, \ldots, n-1. \quad (3.55)$$

Let us first demonstrate this equivalence. For example take $n = 3$. By expanding out the operator $\left( \frac{d}{dx} - \alpha_1 \right)\left( \frac{d}{dx} - \alpha_2 \right)\left( \frac{d}{dx} - \alpha_3 \right)$ and equating to $\left(\frac{d}{dx}\right)^3 + a_1\left(\frac{d}{dx}\right)^2 + a_2\frac{d}{dx} + a_3$, we get:

$$\begin{cases} a_1 = -(\alpha_1 + \alpha_2 + \alpha_3) \\ a_2 = -2\alpha_3' + \alpha_3(\alpha_1 + \alpha_2) - (\alpha_2' - \alpha_1\alpha_2) \\ a_3 = -\alpha_3'' + \alpha_3'(\alpha_1 + \alpha_2) + \alpha_3(\alpha_2' - \alpha_1\alpha_2). \end{cases} \quad (3.56)$$

Here if we eliminate $\alpha_1, \alpha_2$, we get precisely the Riccati equation (3.46) for $z = \alpha_3$. The usual condition $\alpha_j \in C^{j-1}$, i.e., $\alpha_3 \in C^2$, $\alpha_2 \in C^1$, $\alpha_1 \in C^0$, is equivalent to $a_1, a_2, a_3 \in C^0$ from (3.56). Now suppose (3.55) holds, i.e., $\alpha_3 \in C^2$ and $\alpha_1, \alpha_2 \in C^1$. Then we get $a_2, a_3 \in C^0$ from the second and third equations in (3.56), and $a_1 \in C^1$ from the first, so (3.54) holds. Conversely, if (3.54) holds, then (3.55) holds.

For $n = 4$ we get:

$$\begin{cases} a_1 = -(\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4) \\ a_2 = -3\alpha_4' + \alpha_4(\alpha_1 + \alpha_2 + \alpha_3) - (2\alpha_3' + \alpha_2' - \alpha_1\alpha_2 - \alpha_1\alpha_3 - \alpha_2\alpha_3) \\ a_3 = -3\alpha_4'' + 2\alpha_4'(\alpha_1 + \alpha_2 + \alpha_3) + \alpha_4(2\alpha_3' + \alpha_2' - \alpha_1\alpha_2 - \alpha_1\alpha_3 - \alpha_2\alpha_3) \\ \qquad\quad - \big( \alpha_3'' - \alpha_3'(\alpha_1 + \alpha_2) - \alpha_3(\alpha_2' - \alpha_1\alpha_2) \big) \\ a_4 = -\alpha_4''' + \alpha_4''(\alpha_1 + \alpha_2 + \alpha_3) + \alpha_4'(2\alpha_3' + \alpha_2' - \alpha_1\alpha_2 - \alpha_1\alpha_3 - \alpha_2\alpha_3) \\ \qquad\quad + \alpha_4\big( \alpha_3'' - \alpha_3'(\alpha_1 + \alpha_2) - \alpha_3(\alpha_2' - \alpha_1\alpha_2) \big). \end{cases} \quad (3.57)$$

Eliminating $\alpha_1, \alpha_2, \alpha_3$ gives the Riccati equation (3.47) for $z = \alpha_4$. Now if (3.55) holds, i.e., $\alpha_4 \in C^3$ and $\alpha_1, \alpha_2, \alpha_3 \in C^2$, then we get $a_4, a_3 \in C^0$, $a_2 \in C^1$, and $a_1 \in C^2$, so (3.54) holds. Conversely, if (3.54) holds, we get (3.55).

In general, the coefficient $a_j$ contains the term $\alpha_j^{(j-1)}$ ($1 \le j \le n$). Thus $\alpha_n \in C^{n-1}$ implies $a_n \in C^0$, and $\alpha_j \in C^{n-2}$ implies $a_j \in C^{n-j-1}$, $\forall j = 1, \ldots, n-1$, and conversely.

Now let us check that under the conditions (3.55) the kernel $g(x,t)$ has $n-1$ partial derivatives with respect to $t$. Indeed from (3.52), taking more derivatives with respect to $t$ and iterating, we see that $\partial_t^k g_{\alpha_1 \cdots \alpha_n}(x,t)$ ($0 \le k \le n-1$) is a linear combination of

$$g_{\alpha_{k+1} \cdots \alpha_n}(x,t), \ g_{\alpha_k \alpha_{k+1} \cdots \alpha_n}(x,t), \ \ldots, \ g_{\alpha_2 \cdots \alpha_n}(x,t), \ g_{\alpha_1 \cdots \alpha_n}(x,t), \quad (3.58)$$

with coefficients depending only on $t$ and involving the derivatives of the functions $\alpha_j$ (cf. (3.85)). Take $k = n-1$; then $\partial_t^{n-1} g$ is a linear combination of $g_{\alpha_n}, g_{\alpha_{n-1}\alpha_n}, \ldots, g$.

The least regular coefficient is that of $g$: it contains $\alpha_1^{(n-2)}$, so it is in $C^0$ if $\alpha_1 \in C^{n-2}$. The coefficient of $g_{\alpha_2 \cdots \alpha_n}$ is more regular: it contains $\alpha_1^{(n-3)}$ and $\alpha_2^{(n-3)}$, so it is in $C^1$ if $\alpha_1, \alpha_2 \in C^{n-2}$. Continuing this way, we see that if (3.55) holds, then $\partial_t^{n-1} g$ exists and is continuous on $I \times I$. Moreover, we have $g \in C^{n-1}(I \times I, \mathbb{C})$. Indeed, one can prove from (3.52) and (3.55) that the mixed derivatives $\partial_x^j \partial_t^k g$ exist and are continuous on $I \times I$, for all $j, k$ with $0 \le j \le n$, $0 \le k \le n-1$.

It is also clear that the partial derivatives $\partial_t^k g(x,t)$ ($0 \le k \le n-1$) solve the homogeneous equation $Ly = 0$ in the variable $x$. Just permute the derivatives with respect to $x$ in $L$ with those with respect to $t$ in $\partial_t^k g$. (This is permissible in view of the already observed continuity of the mixed derivatives $\partial_x^j \partial_t^k g$ with $0 \le j \le n$ and $0 \le k \le n-1$.) Alternatively, observe that the operator $L$ in (3.41) annihilates each term in (3.58), as is easily seen.

By analogy with the case of constant coefficients and with the case $n = 2$, we expect that under the conditions (3.54)–(3.55) the functions $x \mapsto \partial_t^k g(x,t)$, $0 \le k \le n-1$, will be linearly independent for any $t \in I$, and generate the space $V$ of solutions of $Ly = 0$. In [2], Theorem 5, it was indeed shown by induction on $n$ that any $y \in V$ can be written as a linear combination of these functions, for any $t \in I$. Since $\dim V = n$, these functions form a basis of $V$. In [2] the formula for the coefficients of this linear combination, as a function of the initial data at $x = t$, was also given (without proof). One way to obtain this formula is by the theory of distributions and the concept of a kernel, i.e., a distribution on the product space $\mathbb{R}^n \times \mathbb{R}^n$ (see [17], p. 138, [6], p. 53), under the assumption $a_j \in C^\infty(I,\mathbb{C})$. A brief outline of this approach is given in Remark 3.19. We shall give instead an elementary proof of this formula using factorization and induction, more in the spirit of this book.
Later on, we shall provide an alternative proof that uses variation of parameters and the algebra of Wronskians.

Theorem 3.17 Assume the coefficients of $L$ in (3.40) satisfy $a_j \in C^{n-j-1}(I,\mathbb{C})$ for $1 \le j \le n-1$, $a_n \in C^0(I,\mathbb{C})$, and let $g(x,t)$ be the impulsive response kernel of $L$.

(1) Let $x_0 \in I$, $b_0, b_1, \ldots, b_{n-1} \in \mathbb{C}$. The solution of the initial value problem

$$\begin{cases} Ly = 0 \\ y(x_0) = b_0, \ y'(x_0) = b_1, \ \ldots, \ y^{(n-1)}(x_0) = b_{n-1} \end{cases} \quad (3.59)$$

is

$$y(x) = b_0 y_0(x) + b_1 y_1(x) + \cdots + b_{n-2} y_{n-2}(x) + b_{n-1} y_{n-1}(x), \quad (3.60)$$

where

$$\begin{cases} y_0(x) = a_{n-1}(x_0) g(x,x_0) - \partial_t\big( a_{n-2}(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2\big( a_{n-3}(t) g(x,t) \big)\big|_{t=x_0} + \cdots \\ \qquad\qquad + (-\partial_t)^{n-2}\big( a_1(t) g(x,t) \big)\big|_{t=x_0} + (-\partial_t)^{n-1} g(x,x_0) \\ y_1(x) = a_{n-2}(x_0) g(x,x_0) - \partial_t\big( a_{n-3}(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2\big( a_{n-4}(t) g(x,t) \big)\big|_{t=x_0} + \cdots \\ \qquad\qquad + (-\partial_t)^{n-3}\big( a_1(t) g(x,t) \big)\big|_{t=x_0} + (-\partial_t)^{n-2} g(x,x_0) \\ y_2(x) = a_{n-3}(x_0) g(x,x_0) - \partial_t\big( a_{n-4}(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2\big( a_{n-5}(t) g(x,t) \big)\big|_{t=x_0} + \cdots \\ \qquad\qquad + (-\partial_t)^{n-4}\big( a_1(t) g(x,t) \big)\big|_{t=x_0} + (-\partial_t)^{n-3} g(x,x_0) \\ \quad\vdots \\ y_{n-k}(x) = a_{k-1}(x_0) g(x,x_0) - \partial_t\big( a_{k-2}(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2\big( a_{k-3}(t) g(x,t) \big)\big|_{t=x_0} + \cdots \\ \qquad\qquad + (-\partial_t)^{k-2}\big( a_1(t) g(x,t) \big)\big|_{t=x_0} + (-\partial_t)^{k-1} g(x,x_0) \\ \quad\vdots \\ y_{n-4}(x) = a_3(x_0) g(x,x_0) - \partial_t\big( a_2(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2\big( a_1(t) g(x,t) \big)\big|_{t=x_0} - \partial_t^3 g(x,x_0) \\ y_{n-3}(x) = a_2(x_0) g(x,x_0) - \partial_t\big( a_1(t) g(x,t) \big)\big|_{t=x_0} + \partial_t^2 g(x,x_0) \\ y_{n-2}(x) = a_1(x_0) g(x,x_0) - \partial_t g(x,x_0) \\ y_{n-1}(x) = g(x,x_0). \end{cases} \quad (3.61)$$

In particular, the functions $y_j$ solve $Ly_j = 0$ with the initial conditions $y_j^{(i)}(x_0) = \delta_{ij}$ ($0 \le i, j \le n-1$). The solution (3.60) can also be written as

$$y(x) = \sum_{k=0}^{n-1} (-1)^k\, \partial_t^k\big( c_k(t)\, g(x,t) \big)\big|_{t=x_0}, \quad (3.62)$$

where the functions $c_k(t)$ are defined by formula (2.52) with $a_k(t)$ in place of $a_k$, namely

$$c_k(t) = \sum_{r=0}^{n-k-1} a_r(t)\, b_{n-r-k-1} \quad (a_0 \equiv 1, \ 0 \le k \le n-1). \quad (3.63)$$

Using the Leibniz rule in (3.62) and collecting the derivatives $\partial_t^j g(x,x_0) = \partial_t^j g(x,t)\big|_{t=x_0}$, gives

$$y(x) = \tilde{c}_0\, g(x,x_0) - \tilde{c}_1\, \partial_t g(x,x_0) + \cdots + (-1)^{n-1}\, \tilde{c}_{n-1}\, \partial_t^{n-1} g(x,x_0), \quad (3.64)$$

where the coefficients $\tilde{c}_j$ are given by

$$\tilde{c}_j = \sum_{k=j}^{n-1} (-1)^{k-j} \binom{k}{j}\, c_k^{(k-j)}(x_0) \quad (0 \le j \le n-1). \quad (3.65)$$

In the case of constant coefficients, formulas (3.60)–(3.61) and (3.64)–(3.65) reduce to (2.62)–(2.55) and (2.61)–(2.52), respectively (in view of (3.53)).


For example for $n = 2, 3, 4$, we get

$$n = 2 \ \Rightarrow \ \begin{cases} \tilde{c}_0 = b_1 + a_1(x_0) b_0 \\ \tilde{c}_1 = b_0, \end{cases} \quad (3.66)$$

$$n = 3 \ \Rightarrow \ \begin{cases} \tilde{c}_0 = b_2 + a_1(x_0) b_1 + \big( a_2(x_0) - a_1'(x_0) \big) b_0 \\ \tilde{c}_1 = b_1 + a_1(x_0) b_0 \\ \tilde{c}_2 = b_0, \end{cases} \quad (3.67)$$

$$n = 4 \ \Rightarrow \ \begin{cases} \tilde{c}_0 = b_3 + a_1(x_0) b_2 + \big( a_2(x_0) - a_1'(x_0) \big) b_1 + \big( a_3(x_0) - a_2'(x_0) + a_1''(x_0) \big) b_0 \\ \tilde{c}_1 = b_2 + a_1(x_0) b_1 + \big( a_2(x_0) - 2a_1'(x_0) \big) b_0 \\ \tilde{c}_2 = b_1 + a_1(x_0) b_0 \\ \tilde{c}_3 = b_0. \end{cases} \quad (3.68)$$

Note that the derivatives of the coefficients $a_j$ start appearing in the $\tilde{c}_k$ as soon as $n \ge 3$.

(2) The set of solutions of the homogeneous equation on $I$, $V = \{ y \in C^n(I,\mathbb{C}) : Ly = 0 \}$, is a complex vector space of dimension $n$, with a basis given by

$$x \mapsto g(x,t), \ -\partial_t g(x,t), \ \ldots, \ (-1)^{n-1}\, \partial_t^{n-1} g(x,t), \quad (3.69)$$

for any $t \in I$.

Proof (1) We proceed by induction on $n$. For $n = 1, 2$ formulas (3.60)–(3.61) hold (cf. (3.1) and (3.19)). Suppose the result holds for a linear differential operator of order $n-1$, and let us prove it for order $n$. Consider then the problem (3.59), with $L$ in (3.40) factored as in (3.41):

$$\begin{cases} \left( \frac{d}{dx} - \alpha_1(x) \right)\left( \frac{d}{dx} - \alpha_2(x) \right) \cdots \left( \frac{d}{dx} - \alpha_n(x) \right) y = 0 \\ y(x_0) = b_0, \ y'(x_0) = b_1, \ \ldots, \ y^{(n-1)}(x_0) = b_{n-1}. \end{cases}$$

Instead of $h = \left( \frac{d}{dx} - \alpha_n \right) y$, it is more convenient to define

$$h = \left( \frac{d}{dx} - \alpha_2(x) \right) \cdots \left( \frac{d}{dx} - \alpha_n(x) \right) y \equiv \tilde{L} y.$$

Then $h$ solves $h' - \alpha_1(x)h = 0$, whence by (3.1) $h(x) = h(x_0)\, g_{\alpha_1}(x,x_0)$. To get $h(x_0)$ in terms of the initial data $b_j$, define the coefficients $\tilde{a}_j$ of $\tilde{L}$ by

$$\tilde{L} = \left( \frac{d}{dx} - \alpha_2(x) \right) \cdots \left( \frac{d}{dx} - \alpha_n(x) \right) = \left(\frac{d}{dx}\right)^{n-1} + \tilde{a}_1(x)\left(\frac{d}{dx}\right)^{n-2} + \tilde{a}_2(x)\left(\frac{d}{dx}\right)^{n-3} + \cdots + \tilde{a}_{n-2}(x)\frac{d}{dx} + \tilde{a}_{n-1}(x).$$


Then

$$h(x_0) = b_{n-1} + \tilde{a}_1(x_0) b_{n-2} + \tilde{a}_2(x_0) b_{n-3} + \cdots + \tilde{a}_{n-2}(x_0) b_1 + \tilde{a}_{n-1}(x_0) b_0. \quad (3.70)$$

Moreover, the equality $L = \left( \frac{d}{dx} - \alpha_1(x) \right)\tilde{L}$ gives the following relationship between the coefficients $a_j$ and $\tilde{a}_j$:

$$\begin{cases} a_1 = \tilde{a}_1 - \alpha_1 \\ a_2 = \tilde{a}_2 + \tilde{a}_1' - \alpha_1 \tilde{a}_1 \\ a_3 = \tilde{a}_3 + \tilde{a}_2' - \alpha_1 \tilde{a}_2 \\ \quad\vdots \\ a_{n-1} = \tilde{a}_{n-1} + \tilde{a}_{n-2}' - \alpha_1 \tilde{a}_{n-2} \\ a_n = \tilde{a}_{n-1}' - \alpha_1 \tilde{a}_{n-1}. \end{cases} \quad (3.71)$$

Now $y$ solves the non-homogeneous problem
\[
\begin{cases} \tilde L y = h(x) \\ y(x_0) = b_0, \quad y'(x_0) = b_1, \quad \ldots, \quad y^{(n-2)}(x_0) = b_{n-2}. \end{cases}
\]
By the inductive hypothesis and by Theorem 3.16, we have
\[
y(x) = h(x_0) \int_{x_0}^{x} g_{\alpha_2\cdots\alpha_n}(x,t)\, g_{\alpha_1}(t,x_0)\, dt + b_0\, \tilde y_0(x) + b_1\, \tilde y_1(x) + \cdots + b_{n-2}\, \tilde y_{n-2}(x),
\]
where the functions $\tilde y_j$ ($0 \le j \le n-2$) are given by (3.61) with $n-1$ in place of $n$ and with $\tilde g = g_{\alpha_2\cdots\alpha_n}$ in place of $g$. In particular, for the function $y(x) = g(x,x_0) = g_{\alpha_1\cdots\alpha_n}(x,x_0)$ we have $b_0 = b_1 = \cdots = b_{n-2} = 0$, $b_{n-1} = 1 = h(x_0)$, and we get the identity
\[
\int_{x_0}^{x} g_{\alpha_2\cdots\alpha_n}(x,t)\, g_{\alpha_1}(t,x_0)\, dt = g(x,x_0). \tag{3.72}
\]
This can also be proved using (3.52), rewritten as
\[
g_{\alpha_2\cdots\alpha_n}(x,t) = -\bigl(\partial_t + \alpha_1(t)\bigr)\, g(x,t). \tag{3.73}
\]
Substituting (3.70), (3.72) and the $\tilde y_j$ from (3.61) in $y(x)$, with $\tilde g$ given by (3.73), we find that $y(x)$ is given by
\[
\begin{aligned}
&b_0 \Bigl[\, \tilde a_{n-1}(t)\, g(x,t) - \tilde a_{n-2}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) + \partial_t\Bigl(\tilde a_{n-3}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) + \cdots \\
&\qquad + (-1)^{n-2}\partial_t^{n-3}\Bigl(\tilde a_1(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) + (-1)^{n-1}\partial_t^{n-2}\Bigl(\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) \Bigr]_{t=x_0} \\
&+ b_1 \Bigl[\, \tilde a_{n-2}(t)\, g(x,t) - \tilde a_{n-3}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) + \partial_t\Bigl(\tilde a_{n-4}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) + \cdots \\
&\qquad + (-1)^{n-3}\partial_t^{n-4}\Bigl(\tilde a_1(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) + (-1)^{n-2}\partial_t^{n-3}\Bigl(\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) \Bigr]_{t=x_0} \\
&+ \cdots \\
&+ b_{n-4} \Bigl[\, \tilde a_3(t)\, g(x,t) - \tilde a_2(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) + \partial_t\Bigl(\tilde a_1(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) - \partial_t^2\Bigl(\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) \Bigr]_{t=x_0} \\
&+ b_{n-3} \Bigl[\, \tilde a_2(t)\, g(x,t) - \tilde a_1(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) + \partial_t\Bigl(\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) \Bigr]_{t=x_0} \\
&+ b_{n-2} \Bigl[\, \tilde a_1(t)\, g(x,t) - \bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) \Bigr]_{t=x_0} \\
&+ b_{n-1}\, g(x,x_0).
\end{aligned} \tag{3.74}
\]

Now using (3.71), it is easy to verify that this expression coincides with (3.60)–(3.61). Indeed, the term in $b_{n-1}$ is the same. The function multiplying $b_{n-2}$ in (3.74),
\[
\bigl(\tilde a_1(x_0) - \alpha_1(x_0)\bigr)\, g(x,x_0) - \partial_t g(x,x_0) = a_1(x_0)\, g(x,x_0) - \partial_t g(x,x_0),
\]
is just $y_{n-2}(x)$ in (3.61). The function multiplying $b_{n-3}$ in (3.74) is
\[
\begin{aligned}
&\bigl(\tilde a_2 - \alpha_1 \tilde a_1 + \alpha_1'\bigr)(x_0)\, g(x,x_0) - \bigl(\tilde a_1 - \alpha_1\bigr)(x_0)\,\partial_t g(x,x_0) + \partial_t^2 g(x,x_0) \\
&= \bigl(a_2 - a_1'\bigr)(x_0)\, g(x,x_0) - a_1(x_0)\,\partial_t g(x,x_0) + \partial_t^2 g(x,x_0) \\
&= a_2(x_0)\, g(x,x_0) - \partial_t\bigl(a_1(t)\, g(x,t)\bigr)_{t=x_0} + \partial_t^2 g(x,x_0),
\end{aligned}
\]
which is just $y_{n-3}(x)$ in (3.61). Next, the function multiplying $b_{n-4}$ in (3.74) is
\[
\begin{aligned}
&\bigl(\tilde a_3 - \alpha_1\tilde a_2 + (\alpha_1\tilde a_1)' - \alpha_1''\bigr)(x_0)\, g(x,x_0) - \bigl(\tilde a_2 - \tilde a_1' - \alpha_1\tilde a_1 + 2\alpha_1'\bigr)(x_0)\,\partial_t g(x,x_0) \\
&\qquad + \bigl(\tilde a_1(x_0) - \alpha_1(x_0)\bigr)\,\partial_t^2 g(x,x_0) - \partial_t^3 g(x,x_0) \\
&= \bigl(a_3 - a_2' + a_1''\bigr)(x_0)\, g(x,x_0) - \bigl(a_2 - 2a_1'\bigr)(x_0)\,\partial_t g(x,x_0) + a_1(x_0)\,\partial_t^2 g(x,x_0) - \partial_t^3 g(x,x_0) \\
&= a_3(x_0)\, g(x,x_0) - \partial_t\bigl(a_2(t)\, g(x,t)\bigr)_{t=x_0} + \partial_t^2\bigl(a_1(t)\, g(x,t)\bigr)_{t=x_0} - \partial_t^3 g(x,x_0).
\end{aligned}
\]
This equals $y_{n-4}(x)$ in (3.61). Continuing this way, we can prove that for each $k = 1, \ldots, n$, the term multiplying $b_{n-k}$ in (3.74) coincides with $y_{n-k}(x)$ in (3.61). In fact the term multiplying $b_{n-k}$ in (3.74) is
\[
\begin{aligned}
&\Bigl[\, \tilde a_{k-1}(t)\, g(x,t) - \tilde a_{k-2}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t) + \partial_t\Bigl(\tilde a_{k-3}(t)\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) + \cdots \\
&\quad + (-1)^{k-2}\partial_t^{k-3}\Bigl(\tilde a_1(t)\bigl(\partial_t+\alpha_1(t)\bigr)g(x,t)\Bigr) + (-1)^{k-1}\partial_t^{k-2}\Bigl(\bigl(\partial_t + \alpha_1(t)\bigr)g(x,t)\Bigr) \Bigr]_{t=x_0}. 
\end{aligned} \tag{3.75}
\]
Expanding out the derivatives using the Leibniz rule and collecting the terms $\partial_t^j g$, we immediately see that the highest-derivative term $(-1)^{k-1}\partial_t^{k-1} g(x,x_0)$ is the same as in $y_{n-k}(x)$ in (3.61). The next term is
\[
(-1)^{k-2}\bigl(\tilde a_1 - \alpha_1\bigr)(x_0)\,\partial_t^{k-2} g(x,x_0) = (-1)^{k-2} a_1(x_0)\,\partial_t^{k-2} g(x,x_0),
\]
the same as the corresponding term in $y_{n-k}(x)$.


Now the coefficient multiplying $(-1)^{k-3}\partial_t^{k-3} g(x,x_0)$ in (3.75) is
\[
\begin{aligned}
&\bigl(\tilde a_2 - (k-3)\tilde a_1' - \alpha_1\tilde a_1 + (k-2)\alpha_1'\bigr)(x_0) \\
&= \bigl(a_2 - \tilde a_1' + \alpha_1\tilde a_1 - (k-3)\tilde a_1' - \alpha_1\tilde a_1 + (k-2)\alpha_1'\bigr)(x_0) \\
&= a_2(x_0) - (k-2)\bigl(\tilde a_1 - \alpha_1\bigr)'(x_0) = a_2(x_0) - (k-2)\, a_1'(x_0).
\end{aligned}
\]
This is the same as the coefficient of $(-1)^{k-3}\partial_t^{k-3} g(x,x_0)$ in $y_{n-k}(x)$ in (3.61), as easily seen. Continuing this way, we see that the coefficient of $(-\partial_t)^j g(x,x_0)$ in (3.75) agrees with the coefficient of $(-\partial_t)^j g(x,x_0)$ in $y_{n-k}(x)$ in (3.61), $\forall j$. The most complicated term is the coefficient of $g(x,x_0)$ in (3.75), which reads
\[
\tilde a_{k-1} - \alpha_1\tilde a_{k-2} + (\alpha_1\tilde a_{k-3})' + \cdots + (-1)^{k-2}(\alpha_1\tilde a_1)^{(k-3)} + (-1)^{k-1}\alpha_1^{(k-2)},
\]
all terms calculated at $x_0$. Substituting (3.71) repeatedly ($\tilde a_{k-1} = a_{k-1} - \tilde a_{k-2}' + \alpha_1\tilde a_{k-2}$, and so on), this telescopes to
\[
a_{k-1} - a_{k-2}' + a_{k-3}'' + \cdots + (-1)^{k-3} a_2^{(k-3)} + (-1)^{k-2} a_1^{(k-2)}.
\]
This is precisely the coefficient of $g(x,x_0)$ in $y_{n-k}(x)$ in (3.61), as easily seen. The proof of (3.60)–(3.61) is now complete.

Formula (3.62), with the $c_k(t)$ given by (3.63), is just a rewriting of (3.60)–(3.61). To prove (3.64)–(3.65) just expand out the derivatives in (3.62) using the Leibniz rule, and collect the terms $\partial_t^j g(x,x_0)$, $0 \le j \le n-1$, to get
\[
\begin{aligned}
y(x) &= \sum_{k=0}^{n-1} (-1)^k \Biggl( \sum_{j=0}^{k} \binom{k}{j}\, c_k^{(k-j)}(x_0)\,\partial_t^j g(x,x_0) \Biggr) \\
&= \sum_{j=0}^{n-1} \Biggl( \sum_{k=j}^{n-1} (-1)^k \binom{k}{j}\, c_k^{(k-j)}(x_0) \Biggr) \partial_t^j g(x,x_0) \\
&= \sum_{j=0}^{n-1} (-1)^j\, \tilde c_j\, \partial_t^j g(x,x_0).
\end{aligned}
\]
We have exchanged the sums over $k$ and $j$ according to $\sum_{k=0}^{n-1}\sum_{j=0}^{k} = \sum_{j=0}^{n-1}\sum_{k=j}^{n-1}$ (as easily proved).

(2) This is proved in the usual way. The functions in (3.69) (as well as the functions in (3.61)) form a basis of $V$ since every $y \in V$ can be written as a linear combination of them in a unique way. (The coefficients are uniquely determined by the initial data at the point $x = t$ through (3.64)–(3.65) (with $x_0 = t$). For the functions $y_0, \ldots, y_{n-1}$ the coefficients are just the initial data $b_0, \ldots, b_{n-1}$ at $x_0$.) $\square$

Remark 3.18 We observe that the conditions $a_j \in C^{n-j-1}$ ($1 \le j \le n-1$), $a_n \in C^0$ in Theorem 3.17 are the minimal ones under which formulas (3.61) and (3.65) make sense. If we require the stronger conditions $a_j \in C^{n-j}$ ($1 \le j \le n$), then $\alpha_j \in C^{n-1}$,


$\forall j = 1, \ldots, n$, and the kernel $g(x,t)$ has $n$ partial derivatives with respect to $t$ (rather than $n-1$). Moreover $g \in C^n(I \times I, \mathbb{C})$, and $g = g_{\alpha_1\cdots\alpha_n}$ satisfies the following adjoint equation in the variable $t$:
\[
\Bigl(\frac{d}{dt} + \alpha_n(t)\Bigr)\Bigl(\frac{d}{dt} + \alpha_{n-1}(t)\Bigr)\cdots\Bigl(\frac{d}{dt} + \alpha_1(t)\Bigr)\, g(x,t) = 0, \tag{3.76}
\]
with the initial conditions
\[
\partial_t^k g(x,t)\big|_{t=x} = 0 \ \text{ for } 0 \le k \le n-2, \qquad \partial_t^{n-1} g(x,t)\big|_{t=x} = (-1)^{n-1}. \tag{3.77}
\]
To prove (3.76) just apply $\bigl(\frac{d}{dt} + \alpha_n(t)\bigr)$ to both sides of the formula
\[
(-1)^{n-1}\, g_{\alpha_n}(x,t) = \Bigl(\frac{d}{dt} + \alpha_{n-1}(t)\Bigr)\cdots\Bigl(\frac{d}{dt} + \alpha_1(t)\Bigr)\, g(x,t), \qquad \forall n \ge 2,
\]
which is easily proved by induction on $n$, or by iterating (3.73). The initial conditions (3.77) (analogous to (3.50)) are easily proved by induction.

Remark 3.19 (The distributional approach) Suppose the coefficients $a_j$ are smooth, i.e., $a_j \in C^\infty(I,\mathbb{C})$, $\forall j$. Then, in the language of distribution theory, the function $G : I \times I \to \mathbb{C}$ given by $G(x,t) = \theta(x-t)\,g(x,t)$ ($\theta$ the Heaviside step function) is a bilateral (i.e., left and right) elementary kernel for $L$, i.e.,
\[
L'_t\, G(x,t) = \delta(x-t), \qquad L_x\, G(x,t) = \delta(x-t),
\]
where $L'$ is the adjoint of $L$ and $\delta$ the Dirac distribution. Moreover $G$ is regular since it is $C^\infty$ for $x \ne t$ (cf. [17], p. 139). To find the solution of $Ly = f(x)$ with arbitrary initial data at $x = x_0$, observe that the function $x \mapsto \theta(x-x_0)\,y(x)$ solves
\[
L\bigl(\theta(x-x_0)\,y(x)\bigr) = \theta(x-x_0)\, f(x) + \sum_{k=0}^{n-1} c_k(x)\,\delta^{(k)}(x-x_0),
\]
where the coefficients $c_k$ are given by (3.63). Taking the convolution of the two sides of this formula with $G(x,t)$ (Volterra convolution of a kernel and a distribution, see [6], p. 54), we get for $x \ge x_0$
\[
\begin{aligned}
y(x) &= \int_{-\infty}^{+\infty} G(x,t)\,\theta(t-x_0)\, f(t)\, dt + \sum_{k=0}^{n-1} \int_{-\infty}^{+\infty} G(x,t)\, c_k(t)\,\delta^{(k)}(t-x_0)\, dt \\
&= \int_{x_0}^{x} g(x,t)\, f(t)\, dt + \sum_{k=0}^{n-1} (-1)^k \int_{-\infty}^{+\infty} \delta^{(k)}(x_0-t)\,\bigl(\theta(x-t)\, g(x,t)\, c_k(t)\bigr)\, dt \\
&= \int_{x_0}^{x} g(x,t)\, f(t)\, dt + \sum_{k=0}^{n-1} (-1)^k\, \partial_t^k\bigl(c_k(t)\, g(x,t)\bigr)_{t=x_0},
\end{aligned} \tag{3.78}
\]


where we used $\partial_t^j\bigl(\theta(x-t)\, g(x,t)\bigr) = \theta(x-t)\,\partial_t^j g(x,t)$, for $0 \le j \le n-1$. Now the first term in (3.78) is the particular solution (3.48), and the second term is just (3.62).

Remark 3.20 If we only assume that the coefficients $a_j$ of $L$ are continuous in $I$, so that we only have $\alpha_j \in C^{j-1}(I,\mathbb{C})$, then the result of Theorem 3.17 is no longer true for $n \ge 3$. Indeed, the partial derivatives $\partial_t^k g(x,t)$, $k \ge 2$, do not exist, in general, as already discussed. In this case, if $L$ is factored as in (3.41), we can write the general solution of $Ly = 0$ as
\[
y(x) = c_1\, g_{\alpha_1\cdots\alpha_n}(x,x_0) + c_2\, g_{\alpha_2\cdots\alpha_n}(x,x_0) + \cdots + c_{n-1}\, g_{\alpha_{n-1}\alpha_n}(x,x_0) + c_n\, g_{\alpha_n}(x,x_0), \tag{3.79}
\]
where $c_k \in \mathbb{C}$ and $x_0 \in I$ is fixed. Here $g_{\alpha_k\alpha_{k+1}\cdots\alpha_n}(x,t)$ ($1 \le k \le n$) is of course the impulsive response kernel of the differential operator
\[
L_k = \Bigl(\frac{d}{dx} - \alpha_k(x)\Bigr)\Bigl(\frac{d}{dx} - \alpha_{k+1}(x)\Bigr)\cdots\Bigl(\frac{d}{dx} - \alpha_n(x)\Bigr).
\]
Formula (3.79) is easily proved by induction on $n$. To see that $x \mapsto g_{\alpha_k\alpha_{k+1}\cdots\alpha_n}(x,t)$ solves $Ly = 0$, $\forall k = 1, \ldots, n$, just write
\[
L = L_1 = \Bigl(\frac{d}{dx} - \alpha_1(x)\Bigr) L_2 = \Bigl(\frac{d}{dx} - \alpha_1(x)\Bigr)\Bigl(\frac{d}{dx} - \alpha_2(x)\Bigr) L_3 = \cdots = \Bigl(\frac{d}{dx} - \alpha_1(x)\Bigr)\cdots\Bigl(\frac{d}{dx} - \alpha_{k-1}(x)\Bigr) L_k.
\]
The coefficients $c_k$ in (3.79) can be computed in terms of the initial data $b_k = y^{(k)}(x_0)$. Just evaluate the derivatives of $y(x)$ at $x = x_0$ (using the values of $\partial_x^j g_{\alpha_k\alpha_{k+1}\cdots\alpha_n}(x_0,x_0)$) and solve the resulting system. For example the first few coefficients are
\[
\begin{aligned}
c_n &= b_0, \qquad c_{n-1} = b_1 - \alpha_n(x_0)\, b_0, \\
c_{n-2} &= b_2 - \bigl(\alpha_{n-1}(x_0) + \alpha_n(x_0)\bigr)\, b_1 + \bigl(\alpha_{n-1}(x_0)\,\alpha_n(x_0) - \alpha_n'(x_0)\bigr)\, b_0.
\end{aligned}
\]
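As a concrete illustration (our own example, not from the text): for constant coefficients the first-order kernels are explicit, $g_\alpha(x,t) = e^{\alpha(x-t)}$ and $g_{\alpha_1\alpha_2}(x,t) = \bigl(e^{\alpha_1(x-t)} - e^{\alpha_2(x-t)}\bigr)/(\alpha_1-\alpha_2)$, so formula (3.79) for $n=2$ with $c_2 = b_0$, $c_1 = b_1 - \alpha_2(x_0) b_0$ can be checked against the classical solution of $y'' - 3y' + 2y = 0$ (i.e., $\alpha_1 = 1$, $\alpha_2 = 2$):

```python
import math

def g2(x, t):
    # impulsive response of (d/dx - 2), i.e. g_{alpha_n} with alpha_n = 2
    return math.exp(2.0 * (x - t))

def g12(x, t):
    # impulsive response of (d/dx - 1)(d/dx - 2)
    s = x - t
    return (math.exp(1.0 * s) - math.exp(2.0 * s)) / (1.0 - 2.0)

def y(x, x0, b0, b1):
    # formula (3.79) for n = 2 with c_n = b0, c_{n-1} = b1 - alpha_n(x0) b0
    c2 = b0
    c1 = b1 - 2.0 * b0
    return c1 * g12(x, x0) + c2 * g2(x, x0)

def y_exact(x, x0, b0, b1):
    # classical solution of y'' - 3y' + 2y = 0 with data (b0, b1) at x0
    s = x - x0
    return (2.0 * b0 - b1) * math.exp(s) + (b1 - b0) * math.exp(2.0 * s)

for x in (0.0, 0.3, 1.0):
    assert abs(y(x, 0.0, 1.5, -0.7) - y_exact(x, 0.0, 1.5, -0.7)) < 1e-10
```

The two expressions agree identically, since $c_1 g_{\alpha_1\alpha_2} + c_2 g_{\alpha_2}$ is exactly the solution matching the data $(b_0, b_1)$.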

3.4 The Connection with Variation of Parameters

There is a simple formula for $g$ in terms of any fundamental system. This formula provides the connection between the impulsive response method and the so-called method of variation of parameters (or variation of constants) for finding a solution to the non-homogeneous equation.

Proposition 3.21 Let $y_1, y_2, \ldots, y_n$ be any fundamental system of solutions of the homogeneous equation $Ly = 0$ in the interval $I$ (i.e., any basis of $V$), and let $g(x,t)$ be the impulsive response kernel of $L$. Then, $\forall x, t \in I$, we have


\[
g(x,t) = y_1(x)\,(W(t)^{-1})_{1n} + y_2(x)\,(W(t)^{-1})_{2n} + \cdots + y_n(x)\,(W(t)^{-1})_{nn}, \tag{3.80}
\]
where $W(t)$ is the Wronskian matrix of $y_1, \ldots, y_n$, i.e.,
\[
W(t) = \bigl(W(t)_{jk}\bigr) = \begin{pmatrix} y_1(t) & y_2(t) & \cdots & y_n(t) \\ y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)}(t) & y_2^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \end{pmatrix},
\]
and $W(t)^{-1}$ is the inverse of $W(t)$, with entries $(W(t)^{-1})_{jk}$.

Proof Since the kernel $g(x,t)$ solves the homogeneous equation in the variable $x$, for any $t \in I$, we can expand it in terms of the $y_j$ as
\[
g(x,t) = d_1(t)\, y_1(x) + d_2(t)\, y_2(x) + \cdots + d_n(t)\, y_n(x).
\]
To find the coefficients $d_j(t)$ we take the derivatives $\partial_x^j$, $1 \le j \le n-1$, and let $x = t$. Recalling the initial conditions (3.50), we get
\[
\begin{cases}
d_1(t)\, y_1(t) + d_2(t)\, y_2(t) + \cdots + d_n(t)\, y_n(t) = 0 \\
d_1(t)\, y_1'(t) + d_2(t)\, y_2'(t) + \cdots + d_n(t)\, y_n'(t) = 0 \\
\quad\vdots \\
d_1(t)\, y_1^{(n-2)}(t) + d_2(t)\, y_2^{(n-2)}(t) + \cdots + d_n(t)\, y_n^{(n-2)}(t) = 0 \\
d_1(t)\, y_1^{(n-1)}(t) + d_2(t)\, y_2^{(n-1)}(t) + \cdots + d_n(t)\, y_n^{(n-1)}(t) = 1,
\end{cases}
\]
i.e., in vector notation,
\[
W(t) \begin{pmatrix} d_1(t) \\ d_2(t) \\ \vdots \\ d_n(t) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.
\]
Thus
\[
\begin{pmatrix} d_1(t) \\ d_2(t) \\ \vdots \\ d_n(t) \end{pmatrix} = W(t)^{-1} \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.
\]
This means that the vector $(d_1(t), \ldots, d_n(t))$ is the last column of the matrix $W(t)^{-1}$, i.e., $d_j(t) = (W(t)^{-1})_{jn}$ ($1 \le j \le n$). $\square$
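Formula (3.80) is easy to test numerically. A minimal sketch (our own example, not from the text): for $y'' + y = 0$ with the fundamental system $y_1 = \cos$, $y_2 = \sin$, the last column of $W(t)^{-1}$ is $(-\sin t, \cos t)$ and (3.80) should reproduce the known kernel $\sin(x-t)$:

```python
import math

# fundamental system for y'' + y = 0 (a concrete example of ours)
y1, dy1 = math.cos, lambda t: -math.sin(t)
y2, dy2 = math.sin, math.cos

def g(x, t):
    # formula (3.80): g(x,t) = y1(x) (W(t)^{-1})_{12} + y2(x) (W(t)^{-1})_{22}
    w = y1(t) * dy2(t) - y2(t) * dy1(t)   # Wronskian determinant (= 1 here)
    inv_12 = -y2(t) / w                   # last column of the 2x2 inverse
    inv_22 = y1(t) / w
    return y1(x) * inv_12 + y2(x) * inv_22

# for this operator the impulsive response kernel is sin(x - t)
for x, t in [(1.0, 0.2), (-0.5, 0.7), (2.0, 2.0)]:
    assert abs(g(x, t) - math.sin(x - t)) < 1e-12
```

The check works for any fundamental system of the same equation, since (3.80) is basis-independent.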


Let $w(t) = \det W(t)$ be the Wronskian of $y_1, \ldots, y_n$. By the well-known formula for the inverse of a matrix, we have that
\[
(W(t)^{-1})_{jn} = w_j(t)/w(t) \qquad (1 \le j \le n),
\]
where $w_j(t)$ is the determinant obtained from $w(t)$ by replacing the $j$-th column by $0, 0, \ldots, 0, 1$. Then (3.80) reads
\[
g(x,t) = \frac{w_1(t)}{w(t)}\, y_1(x) + \cdots + \frac{w_n(t)}{w(t)}\, y_n(x), \tag{3.81}
\]
and we can rewrite formula (3.48) in the form
\[
y_p(x) = y_1(x) \int_{x_0}^{x} \frac{w_1(t)}{w(t)}\, f(t)\, dt + \cdots + y_n(x) \int_{x_0}^{x} \frac{w_n(t)}{w(t)}\, f(t)\, dt. \tag{3.82}
\]
This is the variation of constants method (see [8], Sect. 6, Chap. 3, and formula (6.2) p. 123). One looks for a solution of $Ly = f(x)$ in the form of a linear combination $y_p = u_1 y_1 + \cdots + u_n y_n$, where $u_1, \ldots, u_n$ are not constants but functions to be determined. By imposing suitable conditions on the $u_j$ (see [8], p. 123), we can solve for the $u_j$ with the result
\[
u_j(x) = \int_{x_0}^{x} \frac{w_j(t)}{w(t)}\, f(t)\, dt \qquad (1 \le j \le n),
\]
in agreement with (3.82). (See also [7], Eq. (6.15) p. 87, and Problem 21 p. 101, where linear systems were used instead.) For example for $n = 2$ we have
\[
W(t) = \begin{pmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{pmatrix}, \qquad W(t)^{-1} = \frac{1}{w(t)} \begin{pmatrix} y_2'(t) & -y_2(t) \\ -y_1'(t) & y_1(t) \end{pmatrix},
\]
\[
w(t) = y_1(t)\, y_2'(t) - y_2(t)\, y_1'(t), \qquad w_1(t) = -y_2(t), \qquad w_2(t) = y_1(t),
\]
\[
g(x,t) = \frac{1}{w(t)} \bigl( -y_1(x)\, y_2(t) + y_2(x)\, y_1(t) \bigr)
\]
(cf. (3.26)), and (3.82) becomes
\[
y(x) = -y_1(x) \int_{x_0}^{x} \frac{y_2(t)}{w(t)}\, f(t)\, dt + y_2(x) \int_{x_0}^{x} \frac{y_1(t)}{w(t)}\, f(t)\, dt.
\]
We can now give another proof of Theorem 3.17.
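Before turning to that proof, the $n = 2$ formula above can be checked numerically. A sketch under our own assumptions (operator $y'' + y = f$, so $y_1 = \cos$, $y_2 = \sin$, $w = 1$, and a made-up right-hand side $f = 1$, for which the particular solution vanishing to first order at $0$ is $1 - \cos x$):

```python
import math

def y_p(x, f, n=4000):
    """Variation of constants (3.82) for y'' + y = f with y1 = cos, y2 = sin, w = 1:
       y_p(x) = -cos(x) * int_0^x sin(t) f(t) dt + sin(x) * int_0^x cos(t) f(t) dt."""
    h = x / n
    I1 = I2 = 0.0
    for i in range(n):  # trapezoid rule on both integrals
        t0, t1 = i * h, (i + 1) * h
        I1 += (math.sin(t0) * f(t0) + math.sin(t1) * f(t1)) * h / 2.0
        I2 += (math.cos(t0) * f(t0) + math.cos(t1) * f(t1)) * h / 2.0
    return -math.cos(x) * I1 + math.sin(x) * I2

for x in (0.5, 1.0, 2.0):
    assert abs(y_p(x, lambda t: 1.0) - (1.0 - math.cos(x))) < 1e-6
```

Indeed $(1 - \cos x)'' + (1 - \cos x) = 1$, and the quadrature reproduces it to within the trapezoid error.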


Alternative proof of Theorem 3.17. We first prove that the $n$ functions in (3.69) are linearly independent, by showing that their Wronskian is different from zero at $x = t$. Consider, for $t \in I$ fixed, the Wronskian matrix of the $n$ functions in (3.69),
\[
W_{jk}(x,t) = (-1)^k\, \partial_x^j \partial_t^k g(x,t) \qquad (0 \le j, k \le n-1),
\]
i.e.,
\[
W(x,t) = \begin{pmatrix}
g & -\partial_t g & \partial_t^2 g & \cdots & (-1)^{n-1}\partial_t^{n-1} g \\
\partial_x g & -\partial_x\partial_t g & \partial_x\partial_t^2 g & \cdots & (-1)^{n-1}\partial_x\partial_t^{n-1} g \\
\partial_x^2 g & -\partial_x^2\partial_t g & \partial_x^2\partial_t^2 g & \cdots & (-1)^{n-1}\partial_x^2\partial_t^{n-1} g \\
\vdots & \vdots & \vdots & & \vdots \\
\partial_x^{n-1} g & -\partial_x^{n-1}\partial_t g & \partial_x^{n-1}\partial_t^2 g & \cdots & (-1)^{n-1}\partial_x^{n-1}\partial_t^{n-1} g
\end{pmatrix}(x,t),
\]
where all entries are calculated at $(x,t) \in I \times I$. We shall prove that at $x = t$ this matrix has zero entries above the antidiagonal, and all entries on this diagonal equal to 1,
\[
W(t,t) = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & 1 & \\
\vdots & \vdots & \vdots & & & \\
0 & 0 & 1 & & & \\
0 & 1 & & & & \\
1 & & & & &
\end{pmatrix}, \tag{3.83}
\]
so that the Wronskian determinant at $x = t$ is $\tilde w(t) = \det W(t,t) = (-1)^{n(n-1)/2} \ne 0$.

Consider the columns of $W(x,t)$. The first column at $x = t$ is just $(0, \ldots, 0, 1)$ by the initial conditions (3.50) satisfied by $g$. For the second column, recall formula (3.52),
\[
-\partial_t g(x,t) = g_{\alpha_2\cdots\alpha_n}(x,t) + \alpha_1(t)\, g(x,t). \tag{3.84}
\]
Taking the derivatives $\partial_x^j$, $0 \le j \le n-2$, in this formula, and calculating at $x = t$, gives
\[
\begin{cases}
-\partial_t g(t,t) = g_{\alpha_2\cdots\alpha_n}(t,t) + \alpha_1(t)\, g(t,t) = 0 \\
-\partial_x\partial_t g(t,t) = \partial_x g_{\alpha_2\cdots\alpha_n}(t,t) + \alpha_1(t)\,\partial_x g(t,t) = 0 \\
\quad\vdots \\
-\partial_x^{n-3}\partial_t g(t,t) = \partial_x^{n-3} g_{\alpha_2\cdots\alpha_n}(t,t) + \alpha_1(t)\,\partial_x^{n-3} g(t,t) = 0 \\
-\partial_x^{n-2}\partial_t g(t,t) = \partial_x^{n-2} g_{\alpha_2\cdots\alpha_n}(t,t) + \alpha_1(t)\,\partial_x^{n-2} g(t,t) = 1,
\end{cases}
\]


where we have used (3.50) for $g = g_{\alpha_1\cdots\alpha_n}$ and the analogous formula for $g_{\alpha_2\cdots\alpha_n}$. This proves the second column in (3.83).

To compute the 3rd column in $W(t,t)$, we need a formula for $\partial_t^2 g$. From (3.84) we get
\[
\begin{aligned}
\partial_t^2 g(x,t) &= -\partial_t g_{\alpha_2\cdots\alpha_n}(x,t) - \alpha_1'(t)\, g(x,t) - \alpha_1(t)\,\partial_t g(x,t) \\
&= g_{\alpha_3\cdots\alpha_n}(x,t) + \bigl(\alpha_1(t) + \alpha_2(t)\bigr)\, g_{\alpha_2\cdots\alpha_n}(x,t) + \bigl(\alpha_1^2(t) - \alpha_1'(t)\bigr)\, g(x,t),
\end{aligned}
\]
where we used again (3.84) for both $g$ and $g_{\alpha_2\cdots\alpha_n}$. Taking $\partial_x^j$, $0 \le j \le n-3$, in this formula, and calculating at $x = t$, gives
\[
\begin{cases}
\partial_t^2 g(t,t) = 0 \\
\partial_x\partial_t^2 g(t,t) = 0 \\
\quad\vdots \\
\partial_x^{n-4}\partial_t^2 g(t,t) = 0 \\
\partial_x^{n-3}\partial_t^2 g(t,t) = 1,
\end{cases}
\]
where we used (3.50) for $g$, $g_{\alpha_2\cdots\alpha_n}$ and $g_{\alpha_3\cdots\alpha_n}$. This proves the 3rd column in (3.83). Continuing this way, we arrive at the formula, for each $k = 0, \ldots, n-1$,
\[
(-1)^k\, \partial_t^k g(x,t) = g_{\alpha_{k+1}\cdots\alpha_n}(x,t) + \beta_{k1}(t)\, g_{\alpha_k\cdots\alpha_n}(x,t) + \beta_{k2}(t)\, g_{\alpha_{k-1}\cdots\alpha_n}(x,t) + \cdots + \beta_{kk}(t)\, g(x,t), \tag{3.85}
\]
where $\beta_{k1}(t) = \alpha_1(t) + \cdots + \alpha_k(t)$, and the functions $\beta_{kr}(t)$ contain the derivatives of the $\alpha_j$. Applying $\partial_x^j$, $0 \le j \le n-k-1$, in this formula, and taking $x = t$, we get
\[
\begin{cases}
(-1)^k\, \partial_t^k g(t,t) = 0 \\
(-1)^k\, \partial_x\partial_t^k g(t,t) = 0 \\
\quad\vdots \\
(-1)^k\, \partial_x^{n-k-2}\partial_t^k g(t,t) = 0 \\
(-1)^k\, \partial_x^{n-k-1}\partial_t^k g(t,t) = 1,
\end{cases}
\]
proving the result (3.83) for the $(k+1)$-th column. Note that for $k = n-2$ (penultimate column), we get
\[
(-1)^{n-2}\partial_t^{n-2} g(x,t) = g_{\alpha_{n-1}\alpha_n}(x,t) + \beta_{n-2,1}(t)\, g_{\alpha_{n-2}\alpha_{n-1}\alpha_n}(x,t) + \cdots + \beta_{n-2,n-2}(t)\, g(x,t),
\]
whence
\[
(-1)^{n-2}\partial_t^{n-2} g(t,t) = 0, \qquad (-1)^{n-2}\partial_x\partial_t^{n-2} g(t,t) = 1,
\]


and for $k = n-1$,
\[
(-1)^{n-1}\partial_t^{n-1} g(x,t) = g_{\alpha_n}(x,t) + \beta_{n-1,1}(t)\, g_{\alpha_{n-1}\alpha_n}(x,t) + \cdots + \beta_{n-1,n-1}(t)\, g(x,t),
\]
implying $(-1)^{n-1}\partial_t^{n-1} g(t,t) = 1$. This proves (3.83) and the linear independence of the $n$ functions in (3.69).

In order to prove (3.64)–(3.65), we expand $y \in V$ in terms of the basis (3.69) as
\[
y(x) = \tilde c_0\, g(x,t) - \tilde c_1\, \partial_t g(x,t) + \cdots + (-1)^{n-1}\tilde c_{n-1}\, \partial_t^{n-1} g(x,t),
\]
for each $t \in I$ fixed. Taking the derivatives $\partial_x^j$, $1 \le j \le n-1$, gives
\[
\begin{cases}
y'(x) = \tilde c_0\, \partial_x g(x,t) - \tilde c_1\, \partial_x\partial_t g(x,t) + \cdots + (-1)^{n-1}\tilde c_{n-1}\, \partial_x\partial_t^{n-1} g(x,t) \\
y''(x) = \tilde c_0\, \partial_x^2 g(x,t) - \tilde c_1\, \partial_x^2\partial_t g(x,t) + \cdots + (-1)^{n-1}\tilde c_{n-1}\, \partial_x^2\partial_t^{n-1} g(x,t) \\
\quad\vdots \\
y^{(n-1)}(x) = \tilde c_0\, \partial_x^{n-1} g(x,t) - \tilde c_1\, \partial_x^{n-1}\partial_t g(x,t) + \cdots + (-1)^{n-1}\tilde c_{n-1}\, \partial_x^{n-1}\partial_t^{n-1} g(x,t),
\end{cases}
\]
$\forall x, t \in I$. Letting $x = t = x_0$ we get, in vector notation,
\[
\begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{pmatrix} = W(x_0, x_0) \begin{pmatrix} \tilde c_0 \\ \tilde c_1 \\ \vdots \\ \tilde c_{n-1} \end{pmatrix}.
\]
Inverting this relation gives
\[
\begin{pmatrix} \tilde c_0 \\ \tilde c_1 \\ \vdots \\ \tilde c_{n-1} \end{pmatrix} = W(x_0, x_0)^{-1} \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{pmatrix}. \tag{3.86}
\]
It is then a matter of computing the lower entries in $W(t,t)$ (below the antidiagonal), and the inverse matrix $W(t,t)^{-1}$. This involves lengthy computations. As an example, let us prove that the entries in the subdiagonal immediately below the antidiagonal in $W(t,t)$ are all equal to $-a_1(t)$:

\[
W(t,t) = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & -a_1(t) \\
0 & 0 & 0 & \cdots & 1 & -a_1(t) & \\
\vdots & \vdots & \vdots & & & & \\
0 & 0 & 1 & -a_1(t) & & & \\
0 & 1 & -a_1(t) & & & & \\
1 & -a_1(t) & & & & &
\end{pmatrix}. \tag{3.87}
\]

This means that
\[
(-1)^k\, \partial_x^{n-k}\partial_t^k g(t,t) = -a_1(t), \qquad \forall k = 1, \ldots, n-1. \tag{3.88}
\]
To prove this for $k = 1$, we proceed as before. We take $n-1$ derivatives with respect to $x$ in (3.84), to get
\[
-\partial_x^{n-1}\partial_t g(x,t) = \partial_x^{n-1} g_{\alpha_2\cdots\alpha_n}(x,t) + \alpha_1(t)\,\partial_x^{n-1} g(x,t).
\]
Taking $x = t$ gives
\[
-\partial_x^{n-1}\partial_t g(t,t) = \partial_x^{n-1} g_{\alpha_2\cdots\alpha_n}(t,t) + \alpha_1(t). \tag{3.89}
\]
To compute $\partial_x^{n-1} g_{\alpha_2\cdots\alpha_n}(t,t)$, note that $g_{\alpha_2\cdots\alpha_n}(x,t)$ solves the problem
\[
\begin{cases}
(\partial_x - \alpha_2(x)) \cdots (\partial_x - \alpha_n(x))\, g_{\alpha_2\cdots\alpha_n}(x,t) = 0 \\
0 = g_{\alpha_2\cdots\alpha_n}(t,t) = \cdots = \partial_x^{n-3} g_{\alpha_2\cdots\alpha_n}(t,t), \quad \partial_x^{n-2} g_{\alpha_2\cdots\alpha_n}(t,t) = 1.
\end{cases} \tag{3.90}
\]
From the differential equation in (3.90) we get
\[
\partial_x^{n-1} g_{\alpha_2\cdots\alpha_n}(x,t) = \bigl(\alpha_2(x) + \cdots + \alpha_n(x)\bigr)\,\partial_x^{n-2} g_{\alpha_2\cdots\alpha_n}(x,t) + \text{rest},
\]
where "rest" contains the derivatives $\partial_x^j g_{\alpha_2\cdots\alpha_n}(x,t)$ with $j \le n-3$. These terms do not contribute at $x = t$ in view of the initial conditions in (3.90). Thus we get
\[
\partial_x^{n-1} g_{\alpha_2\cdots\alpha_n}(t,t) = \alpha_2(t) + \cdots + \alpha_n(t). \tag{3.91}
\]
Substituting this in (3.89) gives
\[
-\partial_x^{n-1}\partial_t g(t,t) = \alpha_2(t) + \cdots + \alpha_n(t) + \alpha_1(t) = -a_1(t),
\]
which is just (3.88) for $k = 1$ and proves the second column in (3.87). (The formula $\alpha_1 + \alpha_2 + \cdots + \alpha_n = -a_1$ is immediately proved by expanding out $L$ in (3.41) and equating to (3.40).)


To prove (3.88) for $1 < k \le n-1$ just take $\partial_x^{n-k}$ in (3.85), to get
\[
\begin{aligned}
(-1)^k\, \partial_x^{n-k}\partial_t^k g(x,t) &= \partial_x^{n-k} g_{\alpha_{k+1}\cdots\alpha_n}(x,t) + \beta_{k1}(t)\,\partial_x^{n-k} g_{\alpha_k\cdots\alpha_n}(x,t) \\
&\quad + \beta_{k2}(t)\,\partial_x^{n-k} g_{\alpha_{k-1}\cdots\alpha_n}(x,t) + \cdots + \beta_{kk}(t)\,\partial_x^{n-k} g(x,t).
\end{aligned} \tag{3.92}
\]
Taking $x = t$, all terms in the lower line of (3.92) give zero. In the second term we have $\partial_x^{n-k} g_{\alpha_k\cdots\alpha_n}(t,t) = 1$. For the first term we easily prove, as a generalization of (3.91), that
\[
\partial_x^{n-k} g_{\alpha_{k+1}\cdots\alpha_n}(t,t) = \alpha_{k+1}(t) + \cdots + \alpha_n(t), \qquad \forall k = 1, \ldots, n-1.
\]
Using this in (3.92) with $x = t$, and recalling that $\beta_{k1}(t) = \alpha_1(t) + \cdots + \alpha_k(t)$, we get
\[
(-1)^k\, \partial_x^{n-k}\partial_t^k g(t,t) = \alpha_{k+1}(t) + \cdots + \alpha_n(t) + \alpha_1(t) + \cdots + \alpha_k(t) = -a_1(t).
\]
This is just (3.88) and proves (3.87). The remaining lower entries in $W(t,t)$ can be computed in a similar way, but the computations become more involved. We will now see a more efficient method of calculation based on the algebra of Wronskians.

We observe that (3.81) can be rewritten in the following form:
\[
g(x,t) = \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & y_2(t) & \cdots & y_n(t) \\
y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\
\vdots & \vdots & & \vdots \\
y_1^{(n-2)}(t) & y_2^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t) \\
y_1(x) & y_2(x) & \cdots & y_n(x)
\end{vmatrix}. \tag{3.93}
\]

Indeed, by expanding the determinant in (3.93) with respect to the last row, we get precisely (3.81). (See [8], Exercise 6, p. 125.) Now let us recall the rule for differentiating an $n \times n$ determinant:
\[
\frac{d}{dx} \begin{vmatrix}
a_{11}(x) & \cdots & a_{1n}(x) \\
a_{21}(x) & \cdots & a_{2n}(x) \\
\vdots & & \vdots \\
a_{n1}(x) & \cdots & a_{nn}(x)
\end{vmatrix}
= \begin{vmatrix}
a_{11}'(x) & \cdots & a_{1n}'(x) \\
a_{21}(x) & \cdots & a_{2n}(x) \\
\vdots & & \vdots \\
a_{n1}(x) & \cdots & a_{nn}(x)
\end{vmatrix}
+ \begin{vmatrix}
a_{11}(x) & \cdots & a_{1n}(x) \\
a_{21}'(x) & \cdots & a_{2n}'(x) \\
\vdots & & \vdots \\
a_{n1}(x) & \cdots & a_{nn}(x)
\end{vmatrix}
+ \cdots + \begin{vmatrix}
a_{11}(x) & \cdots & a_{1n}(x) \\
a_{21}(x) & \cdots & a_{2n}(x) \\
\vdots & & \vdots \\
a_{n1}'(x) & \cdots & a_{nn}'(x)
\end{vmatrix}
\]


(cf. [8], p. 114). This is the sum of $n$ determinants obtained by differentiating one row at a time.

We can use this in (3.93) to verify the initial conditions (3.50). Taking the partial derivatives $\partial_x^j g(x,t)$ merely differentiates the last row of the determinant in (3.93). If we set $x = t$ we get zero for $0 \le j \le n-2$, since we always get two equal rows in the determinant. For $\partial_x^{n-1} g(t,t)$ we get 1, as the determinant reduces precisely to $w(t)$. This proves the first column in (3.83) in a simple way.

Next, let us compute $\partial_t g(x,t)$ from (3.93). To differentiate $w$, recall that $w'(t) = -a_1(t)\, w(t)$ (cf. [8], p. 115), so that $\bigl(\frac{1}{w(t)}\bigr)' = -\frac{w'(t)}{w(t)^2} = \frac{a_1(t)}{w(t)}$. When we differentiate the determinant in (3.93) with respect to $t$, we only get a nonzero result when differentiating the penultimate row (in the other cases we always get two equal rows). Then
\[
\partial_t g(x,t) = a_1(t)\, g(x,t) + \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
y_1'(t) & \cdots & y_n'(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix}. \tag{3.94}
\]
Taking now the partial derivatives $\partial_x^j$, $j \le n-1$, in this formula, and setting $x = t$, we find by (3.50),
\[
\begin{cases}
\partial_x^j \partial_t g(t,t) = 0 \ \text{ for } 0 \le j \le n-3 \\
-\partial_x^{n-2}\partial_t g(t,t) = 1 \\
-\partial_x^{n-1}\partial_t g(t,t) = -a_1(t).
\end{cases} \tag{3.95}
\]
This proves the second column of $W(t,t)$ in (3.87), including the lower triangular entry $-a_1(t)$.

Now take $\partial_t$ in (3.94), to get
\[
\begin{aligned}
\partial_t^2 g(x,t) &= a_1'(t)\, g(x,t) + a_1(t)\,\partial_t g(x,t)
+ \frac{a_1(t)}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix} \\
&\quad + \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n)}(t) & \cdots & y_n^{(n)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix}
+ \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-4)}(t) & \cdots & y_n^{(n-4)}(t) \\
y_1^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix}.
\end{aligned}
\]


Here note, by (3.94), that the 3rd term on the right-hand side is just
\[
a_1(t)\bigl(\partial_t g(x,t) - a_1(t)\, g(x,t)\bigr).
\]
We can then rewrite $\partial_t^2 g$ as
\[
\begin{aligned}
\partial_t^2 g(x,t) &= \bigl(a_1'(t) - a_1^2(t)\bigr)\, g(x,t) + 2a_1(t)\,\partial_t g(x,t) \\
&\quad + \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-4)}(t) & \cdots & y_n^{(n-4)}(t) \\
y_1^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix}
+ \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n)}(t) & \cdots & y_n^{(n)}(t) \\
y_1(x) & \cdots & y_n(x)
\end{vmatrix}.
\end{aligned} \tag{3.96}
\]
Taking $\partial_x^j$, $j \le n-1$, in (3.96) and letting $x = t$, we first get from (3.50) and (3.95):
\[
\partial_x^j \partial_t^2 g(t,t) = 0 \ \text{ for } 0 \le j \le n-4,
\]
\[
\partial_x^{n-3}\partial_t^2 g(t,t) = \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-4)}(t) & \cdots & y_n^{(n-4)}(t) \\
y_1^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t)
\end{vmatrix} = 1.
\]
For $\partial_x^{n-2}\partial_t^2 g(t,t)$ we get from (3.96)
\[
\partial_x^{n-2}\partial_t^2 g(t,t) = -2a_1(t) + \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n)}(t) & \cdots & y_n^{(n)}(t) \\
y_1^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t)
\end{vmatrix}. \tag{3.97}
\]
To compute the last determinant, we use the differential equation $L y_j = 0$, namely
\[
y_j^{(n)} = -a_1\, y_j^{(n-1)} - a_2\, y_j^{(n-2)} - \cdots - a_{n-1}\, y_j' - a_n\, y_j \qquad (1 \le j \le n). \tag{3.98}
\]
When we substitute this in the penultimate row of the determinant in (3.97), we see that only the terms $-a_1\, y_j^{(n-1)}$ contribute. Indeed by well-known properties of determinants, we can add to the penultimate row any linear combination of the other


rows; thus we can add to it
\[
a_2(t)\cdot(\text{last row}) + a_3(t)\cdot(\text{3rd from last row}) + \cdots + a_n(t)\cdot(\text{first row}),
\]
so that the penultimate row becomes just $\bigl(-a_1 y_1^{(n-1)}, \ldots, -a_1 y_n^{(n-1)}\bigr)$. The determinant in (3.97) becomes
\[
\begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
-a_1(t)\, y_1^{(n-1)}(t) & \cdots & -a_1(t)\, y_n^{(n-1)}(t) \\
y_1^{(n-2)}(t) & \cdots & y_n^{(n-2)}(t)
\end{vmatrix} = a_1(t)\, w(t),
\]
and we get $\partial_x^{n-2}\partial_t^2 g(t,t) = -2a_1(t) + a_1(t) = -a_1(t)$.

Finally for $\partial_x^{n-1}\partial_t^2 g(t,t)$ we get from (3.96)
\[
\partial_x^{n-1}\partial_t^2 g(t,t) = a_1'(t) - a_1(t)^2 + 2a_1(t)^2 + \frac{1}{w(t)} \begin{vmatrix}
y_1(t) & \cdots & y_n(t) \\
\vdots & & \vdots \\
y_1^{(n-3)}(t) & \cdots & y_n^{(n-3)}(t) \\
y_1^{(n)}(t) & \cdots & y_n^{(n)}(t) \\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t)
\end{vmatrix}.
\]
To get the determinant here we proceed as above and replace $y_j^{(n)}$ by $-a_2\, y_j^{(n-2)}$ in the penultimate row (using (3.98)). The last determinant then equals $-a_2(t)\, w(t)$, and we get
\[
\partial_x^{n-1}\partial_t^2 g(t,t) = a_1'(t) + a_1(t)^2 - a_2(t).
\]
We have thus proved that the 3rd column of $W(t,t)$ in (3.87) (including the last entry) is given by
\[
\bigl(0, \ldots, 0, 1, -a_1,\ a_1^2 - a_2 + a_1'\bigr)(t).
\]
Taking now $\partial_t$ in (3.96) and proceeding as above, we find that the 4th column of $W(t,t)$ in (3.87) has the following entries:
\[
\bigl(0, \ldots, 0, 1, -a_1,\ a_1^2 - a_2 + 2a_1',\ \beta\bigr)(t),
\]
where
\[
\beta(t) = \partial_x^{n-1}(-\partial_t)^3 g(t,t) = \bigl(2a_1 a_2 - a_3 - a_1^3 + a_2' - 3a_1 a_1' - a_1''\bigr)(t).
\]


The other entries in the second subdiagonal below the antidiagonal are computed in a similar way, and one gets
\[
W(t,t) = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & -a_1 \\
0 & 0 & 0 & \cdots & 1 & -a_1 & a_1^2-a_2+(n-2)a_1' \\
\vdots & \vdots & \vdots & & & & \vdots \\
0 & 0 & 1 & & & & \\
0 & 1 & -a_1 & a_1^2-a_2+2a_1' & & & \\
1 & -a_1 & a_1^2-a_2+a_1' & & \cdots & & \beta
\end{pmatrix}(t).
\]
For example for $n = 2, 3, 4$ we have:
\[
W(t,t) = \begin{pmatrix} 0 & 1 \\ 1 & -a_1(t) \end{pmatrix},
\]
\[
W(t,t) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & -a_1 \\ 1 & -a_1 & a_1^2 - a_2 + a_1' \end{pmatrix}(t),
\]
\[
W(t,t) = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & -a_1 \\ 0 & 1 & -a_1 & a_1^2 - a_2 + 2a_1' \\ 1 & -a_1 & a_1^2 - a_2 + a_1' & 2a_1 a_2 - a_3 - a_1^3 + a_2' - 3a_1 a_1' - a_1'' \end{pmatrix}(t).
\]
By inverting these matrices, we find
\[
W(t,t)^{-1} = \begin{pmatrix} a_1(t) & 1 \\ 1 & 0 \end{pmatrix},
\]
\[
W(t,t)^{-1} = \begin{pmatrix} a_2 - a_1' & a_1 & 1 \\ a_1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}(t),
\]
\[
W(t,t)^{-1} = \begin{pmatrix} a_3 - a_2' + a_1'' & a_2 - a_1' & a_1 & 1 \\ a_2 - 2a_1' & a_1 & 1 & 0 \\ a_1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}(t).
\]
This proves (3.66), (3.67) and (3.68) (taking $t = x_0$ and recalling (3.86)). Continuing this way, we can compute all the columns of $W(t,t)$. The general formula for $W(t,t)$ becomes more and more involved and will not be given here.


Inverting $W(t,t)$ gives instead the following general formula for the upper triangular matrix $W(t,t)^{-1}$:
\[
\begin{pmatrix}
a_{n-1}-a_{n-2}'+\cdots+(-1)^{n-2}a_1^{(n-2)} & \cdots & a_3-a_2'+a_1'' & a_2-a_1' & a_1 & 1 \\
a_{n-2}-2a_{n-3}'+\cdots+(-1)^{n-3}(n-2)\,a_1^{(n-3)} & \cdots & a_2-2a_1' & a_1 & 1 & 0 \\
\vdots & & \vdots & \vdots & \vdots & \vdots \\
a_3-(n-3)\,a_2'+\binom{n-2}{2}a_1'' & \cdots & & & & \\
a_2-(n-2)\,a_1' & a_1 & 1 & 0 & \cdots & 0 \\
a_1 & 1 & 0 & & \cdots & 0 \\
1 & 0 & 0 & & \cdots & 0
\end{pmatrix},
\]
all entries calculated at $t$. When this is substituted in (3.86), we get (3.65). Note that the matrix $W(t,t)^{-1}$ is not symmetric for $n \ge 4$. Its transpose relates the two bases (3.69) and (3.61) of $V$ (for $t = x_0$), namely we have
\[
\begin{pmatrix} y_0(x) \\ y_1(x) \\ \vdots \\ y_{n-1}(x) \end{pmatrix} = \bigl(W(x_0,x_0)^{-1}\bigr)^{t} \begin{pmatrix} g(x,x_0) \\ -\partial_t g(x,x_0) \\ \vdots \\ (-1)^{n-1}\partial_t^{n-1} g(x,x_0) \end{pmatrix},
\]
as easily checked from (3.61). In the case of constant coefficients, $W(t,t)^{-1}$ is a constant symmetric matrix and coincides with the matrix $A$ in (2.57), whereas $W(t,t)$ coincides with $B$ in (2.58).
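The structure of $W(t,t)$ for $n = 2$ can be verified numerically by finite differences. A sketch under our own assumptions (the concrete operator $y'' + y' = 0$, whose impulsive response kernel $g(x,t) = 1 - e^{t-x}$ and coefficient $a_1 = 1$ are our example, not from the text):

```python
import math

# impulsive response of y'' + y' = 0 (so a1 = 1, a2 = 0): g(x,t) = 1 - e^{t-x}
def g(x, t):
    return 1.0 - math.exp(t - x)

h, t = 1e-4, 0.3
# central finite differences for the entries of W(t,t), n = 2
g_tt   = g(t, t)
dt_g   = (g(t, t + h) - g(t, t - h)) / (2 * h)
dx_g   = (g(t + h, t) - g(t - h, t)) / (2 * h)
dxdt_g = (g(t + h, t + h) - g(t + h, t - h)
          - g(t - h, t + h) + g(t - h, t - h)) / (4 * h * h)

# expected: W(t,t) = [[g, -dt_g], [dx_g, -dxdt_g]] = [[0, 1], [1, -a1(t)]]
assert abs(g_tt) < 1e-8
assert abs(-dt_g - 1.0) < 1e-6
assert abs(dx_g - 1.0) < 1e-6
assert abs(-dxdt_g - (-1.0)) < 1e-5
```

The bottom-right entry indeed comes out as $-a_1(t) = -1$, matching (3.87).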

3.5 The Approach by Linear Systems and the Peano-Baker Series

The theory of linear equations with variable coefficients, developed so far using the impulsive response method and factorization, presents several analogies with the case of constant coefficients. However, compared with the latter theory, it is far less explicit. For instance, we have not been able to obtain an explicit formula for the impulsive response kernel for equations of order $n \ge 2$. (Formula (3.49) is of no practical use, unless $L$ is given already in the factored form (3.41).) In this section we present formulas for $g(x,x_0)$ involving iterated integrals in the coefficients of the differential equation. The general strategy is to transform the differential equation into an equivalent integral equation, and then solve this by the method of successive approximations. Although there is no need to use linear systems, we first present the connection between linear equations and linear systems, and then specialize the general method of successive approximations to the case of a single linear equation. We illustrate this approach in the case of second-order equations, but the results can easily be generalized to higher-order equations.


We recall that a linear ordinary differential equation of order $n$ can be rewritten as a system of $n$ first-order linear equations (see [8], p. 258). The associated initial value problem is equivalent to a vector-valued integral equation of Volterra type. This can be solved by a formal Picard iteration. The resulting series of iterated integrals is also known as the Peano-Baker series solution (see [1], and references therein). It is worth mentioning here the work of Vito Volterra, who introduced the notion of product integral in order to study vector-valued linear differential equations. This is defined as the limit of a product rather than of a sum, as in an ordinary integral. It turns out that the product integral of a continuous matrix-valued function coincides with its Peano-Baker series. We refer to [19] for more details on the product integral and for references to the original works.

Consider the homogeneous initial value problem
\[
\begin{cases} y'' + a_1(x)\, y' + a_2(x)\, y = 0 \\ y(x_0) = y_0, \quad y'(x_0) = y_0', \end{cases} \tag{3.99}
\]
where $a_1, a_2 \in C^0(I,\mathbb{C})$, $I$ an interval. Letting $y_1 = y$, $y_2 = y'$, we can rewrite (3.99) as the linear system
\[
\begin{cases}
y_1' = y_2 \\
y_2' = -a_1(x)\, y_2 - a_2(x)\, y_1 \\
y_1(x_0) = y_0 \\
y_2(x_0) = y_0',
\end{cases}
\]
i.e., in vector notation,
\[
\begin{cases} Y' = A(x)\, Y \\ Y(x_0) = Y_0, \end{cases} \tag{3.100}
\]
where $Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, $Y' = \begin{pmatrix} y_1' \\ y_2' \end{pmatrix}$, $Y_0 = \begin{pmatrix} y_0 \\ y_0' \end{pmatrix}$, and $A(x)$ is the matrix
\[
A(x) = \begin{pmatrix} 0 & 1 \\ -a_2(x) & -a_1(x) \end{pmatrix}. \tag{3.101}
\]
By integrating (3.100) between $x_0$ and $x$, we obtain
\[
Y(x) = Y_0 + \int_{x_0}^{x} A(t)\, Y(t)\, dt. \tag{3.102}
\]
It is easy to see that $Y$ is a solution of the initial value problem (3.100) on the interval $I$ if and only if it is a solution of the integral equation (3.102) on $I$. Now (3.102) can be solved by the method of successive approximations, which we now describe. A first approximation to a solution is the constant vector $Y_0(x) = Y_0$, $\forall x \in I$. This satisfies $Y(x_0) = Y_0$ but not (3.102), in general. However, if we compute

\[
Y_1(x) = Y_0 + \int_{x_0}^{x} A(t)\, Y_0(t)\, dt = Y_0 + \int_{x_0}^{x} A(t)\, Y_0\, dt,
\]

we expect that $Y_1$ might be a better approximation to a solution than $Y_0$. Continuing the process, we define the successive approximations $Y_n$ to a solution of (3.102) recursively by
\[
Y_0(x) = Y_0, \qquad Y_{n+1}(x) = Y_0 + \int_{x_0}^{x} A(t)\, Y_n(t)\, dt \qquad (n = 0, 1, 2, \ldots).
\]
Taking the limit as $n \to \infty$, we expect that $Y_n(x) \to Y(x)$, where $Y$ will satisfy (3.102). By computing the $Y_n$, we have
\[
Y_1(x) = \Bigl( \mathbb{1} + \int_{x_0}^{x} A(t)\, dt \Bigr) Y_0 := \bigl( \mathbb{1} + P_1(x) \bigr)\, Y_0,
\]
\[
\begin{aligned}
Y_2(x) &= Y_0 + \int_{x_0}^{x} A(t)\, Y_1(t)\, dt \\
&= Y_0 + \int_{x_0}^{x} A(t)\, Y_0\, dt + \int_{x_0}^{x} A(t) \int_{x_0}^{t} A(s)\, Y_0\, ds\, dt \\
&= \Bigl( \mathbb{1} + \int_{x_0}^{x} A(t)\, dt + \int_{x_0}^{x} A(t) \int_{x_0}^{t} A(s)\, ds\, dt \Bigr)\, Y_0 \\
&:= \bigl( \mathbb{1} + P_1(x) + P_2(x) \bigr)\, Y_0,
\end{aligned}
\]
and in general $Y_n(x) = \bigl( \mathbb{1} + P_1(x) + \cdots + P_n(x) \bigr)\, Y_0$, where
\[
P_n(x) = \int_{x_0}^{x} A(t_1) \int_{x_0}^{t_1} A(t_2) \cdots \int_{x_0}^{t_{n-1}} A(t_n)\, dt_n \cdots dt_2\, dt_1. \tag{3.103}
\]
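The iterated integrals (3.103) can be carried out numerically. The following sketch (ours, not from the book) sums the first few $P_n$ on a uniform grid, using the trapezoid rule for each integration, for the system corresponding to $y'' + y = 0$, where the exact propagator is known to be $P(x,0) = \begin{pmatrix} \cos x & \sin x \\ -\sin x & \cos x \end{pmatrix}$:

```python
import math

def peano_baker(a_func, x0, x1, steps=1000, terms=20):
    """Partial sums of the Peano-Baker series for a 2x2 system Y' = A(x)Y,
    using the recursion P_{n+1}(x) = integral from x0 to x of A(t) P_n(t) dt."""
    h = (x1 - x0) / steps
    xs = [x0 + i * h for i in range(steps + 1)]
    A = [a_func(x) for x in xs]

    def matmul(M, N):
        return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    def matadd(M, N):
        return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]

    I = [[1.0, 0.0], [0.0, 1.0]]
    Pn = [I for _ in xs]          # P_0(x) = identity
    total = [I for _ in xs]       # running partial sum of the series
    for _ in range(terms):
        integrand = [matmul(A[i], Pn[i]) for i in range(len(xs))]
        nxt = [[[0.0, 0.0], [0.0, 0.0]]]
        acc = [[0.0, 0.0], [0.0, 0.0]]
        for i in range(1, len(xs)):   # cumulative trapezoid rule
            trap = [[(integrand[i - 1][r][c] + integrand[i][r][c]) * h / 2.0
                     for c in range(2)] for r in range(2)]
            acc = matadd(acc, trap)
            nxt.append([row[:] for row in acc])
        Pn = nxt
        total = [matadd(total[i], Pn[i]) for i in range(len(xs))]
    return total[-1]

# y'' + y = 0  ->  A = [[0,1],[-1,0]];  P(1,0) = [[cos 1, sin 1],[-sin 1, cos 1]]
P = peano_baker(lambda x: [[0.0, 1.0], [-1.0, 0.0]], 0.0, 1.0)
assert abs(P[0][1] - math.sin(1.0)) < 1e-4
assert abs(P[0][0] - math.cos(1.0)) < 1e-4
```

The same routine accepts any continuous matrix-valued `a_func`, which is the point of the method: no factorization of $L$ is needed.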

The matrix sequence Pn (x) satisfies the recursion (with P0 (x) = 1)  Pn+1 (x) =

x

A(t)Pn (t) dt

(∀x ∈ I, ∀n ∈ N).

(3.104)

x0

The matrix series P(x, x0 ) =

∞ 

 Pn (x) = 1 +

n=0



+

x x0

 A(t1 )

x

 A(t1 ) dt1 +

x0 t1 x0

A(t2 )



x x0

t2 x0

 A(t1 )



t1

A(t2 ) dt2

dt1

x0

A(t3 ) dt3 dt2 dt1 + · · ·

(3.105)

106

3 The Case of Variable Coefficients

is called the Peano-Baker series of A. If the matrix-valued function A has continuous entries on the interval I, then the Peano-Baker series converges absolutely ∀x ∈ I, and uniformly on compact subintervals J ⊂ I (with x_0 ∈ J) in the appropriate matrix norm. See, e.g., [1], Theorem 1. The solution to (3.102) is unique and is given by Y(x) = P(x, x_0) Y_0, ∀x ∈ I. The following properties of the matrix-valued kernel P(x, t) are easily proved:

  P(t, t) = 1,                 ∀t ∈ I,      ⎫
  P(x, s) P(s, t) = P(x, t),   ∀x, s, t ∈ I, ⎬ group properties
  (P(x, t))^{−1} = P(t, x),    ∀x, t ∈ I,    ⎭

  ∂_x P(x, t) = A(x) P(x, t),    ∀x, t ∈ I,
  ∂_t P(x, t) = −P(x, t) A(t),   ∀x, t ∈ I,
  P(x, t) = 1 + ∫_t^x A(s) P(s, t) ds,   ∀x, t ∈ I,
  P(x, t) = 1 + ∫_t^x P(x, s) A(s) ds,   ∀x, t ∈ I.

If A is a constant matrix, then the Peano-Baker series reduces to the exponential series

  P(x, x_0) = e^{(x − x_0) A} = Σ_{n=0}^∞ (1/n!) (x − x_0)^n A^n.

More generally, if the matrices A(t) and A(s) commute ∀t, s ∈ I, or if A(x) commutes with ∫_{x_0}^x A(t) dt for all x ∈ I, then the Peano-Baker series reduces to the exponential series

  P(x, x_0) = e^{∫_{x_0}^x A(t) dt} = Σ_{n=0}^∞ (1/n!) ( ∫_{x_0}^x A(t) dt )^n.
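The constant-coefficient reduction can be checked concretely. In the sketch below (the matrix A and the truncation order are assumptions made for illustration), the terms P_n are generated by the recursion (3.104), which for constant A integrates in closed form to P_n(x) = x^n A^n / n!, and the partial sum is compared with the closed-form exponential of a diagonalizable 2 × 2 matrix.

```python
# Peano-Baker partial sums 1 + P_1 + ... + P_N for the constant matrix
# A = [[0, 1], [-2, -3]] (companion matrix of y'' + 3y' + 2y = 0), compared
# with e^{xA} at x = 1, x_0 = 0.
import math

A = [[0.0, 1.0], [-2.0, -3.0]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(M, N):
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]

def scal(c, M):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
x = 1.0

# For constant A the recursion (3.104) integrates exactly:
# P_{n+1}(x) = \int_0^x A P_n(t) dt = x^{n+1} A^{n+1} / (n+1)!.
S = I2
P = I2
for n in range(1, 30):
    P = scal(x / n, matmul(A, P))      # P_n from P_{n-1}
    S = matadd(S, P)

# Closed form from the eigenvalues -1 and -2 of A:
E = [[2 * math.exp(-1) - math.exp(-2), math.exp(-1) - math.exp(-2)],
     [-2 * math.exp(-1) + 2 * math.exp(-2), -math.exp(-1) + 2 * math.exp(-2)]]
err = max(abs(S[i][j] - E[i][j]) for i in range(2) for j in range(2))
print(err < 1e-12)
```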

Indeed, in all these cases we have

  P_n(x) = (1/n!) ( ∫_{x_0}^x A(t) dt )^n.

This is easily proved by induction on n using (3.104) (see [1], Lemma 2). Alternatively, one can show that the iterated integral in (3.103), rewritten as an n-dimensional integral over the simplex

  T_x = {(t_1, . . . , t_n) ∈ R^n : x_0 ≤ t_n ≤ t_{n−1} ≤ · · · ≤ t_1 ≤ x},

equals 1/n! times the integral of A(t_1) · · · A(t_n) over the hypercube [x_0, x]^n (for x > x_0):

  ∫_{x_0}^x A(t_1) ∫_{x_0}^{t_1} A(t_2) · · · ∫_{x_0}^{t_{n−1}} A(t_n) dt_n · · · dt_2 dt_1
    = ∫_{T_x} A(t_1) · · · A(t_n) dt_n · · · dt_1
    = (1/n!) ∫_{x_0}^x ∫_{x_0}^x · · · ∫_{x_0}^x A(t_1) · · · A(t_n) dt_n · · · dt_1
    = (1/n!) ( ∫_{x_0}^x A(t) dt )^n   (if A(t)A(s) = A(s)A(t), ∀t, s ∈ I).   (3.106)

What we have seen so far applies to any n × n system of the form (3.100), with A any n × n matrix-valued continuous function on the interval I. We now specialize the Peano-Baker series to the case of A given by (3.101). We are interested in the solution y of (3.99), i.e., the first component of Y(x) = P(x, x_0) Y_0, given by

  y(x) = P_11(x, x_0) y_0 + P_12(x, x_0) y_0′,   where   P = [ P_11  P_12 ; P_21  P_22 ].   (3.107)

Comparing (3.107) with (3.19) gives the following relationship between the functions P_11(x, x_0), P_12(x, x_0) and the impulsive response kernel g(x, x_0) and its derivative ∂_t g(x, x_0) = ∂_t g(x, t)|_{t=x_0}:

  P_11(x, x_0) = a_1(x_0) g(x, x_0) − ∂_t g(x, x_0),
  P_12(x, x_0) = g(x, x_0).   (3.108)

(Note that the function x ↦ a_1(x_0) g(x, x_0) − ∂_t g(x, x_0) is the solution of the problem (3.99) with y_0 = 1, y_0′ = 0.) From (3.105) we have

  P_11(x, x_0) = 1 + Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_{n−1}} f_n(t_1, . . . , t_n) dt_n · · · dt_1,

  P_12(x, x_0) = Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_{n−1}} g_n(t_1, . . . , t_n) dt_n · · · dt_1,

where f_n, g_n, n ≥ 1, are defined by

  f_n(t_1, . . . , t_n) = [ A(t_1) · · · A(t_n) ]_{11},
  g_n(t_1, . . . , t_n) = [ A(t_1) · · · A(t_n) ]_{12}.

The functions f_n, g_n satisfy the following recursion relations:


  f_{n+1}(t_1, . . . , t_{n+1}) = −a_2(t_{n+1}) g_n(t_1, . . . , t_n),
  g_{n+1}(t_1, . . . , t_{n+1}) = f_n(t_1, . . . , t_n) − a_1(t_{n+1}) g_n(t_1, . . . , t_n),   (3.109)

for any n ≥ 1, with the initial conditions

  f_1(t_1) = 0,   g_1(t_1) = 1.

For instance, we compute

  f_2(t_1, t_2) = −a_2(t_2),
  g_2(t_1, t_2) = −a_1(t_2),
  f_3(t_1, t_2, t_3) = a_1(t_2) a_2(t_3),
  g_3(t_1, t_2, t_3) = −a_2(t_2) + a_1(t_2) a_1(t_3),
  f_4(t_1, t_2, t_3, t_4) = a_2(t_2) a_2(t_4) − a_1(t_2) a_1(t_3) a_2(t_4),
  g_4(t_1, t_2, t_3, t_4) = a_1(t_2) a_2(t_3) + a_2(t_2) a_1(t_4) − a_1(t_2) a_1(t_3) a_1(t_4),

and so on. We can let n = 0 in (3.109) by defining f_0 = 1, g_0 = 0. Eliminating f_n in (3.109) yields the following recursion for g_n:

  g_{n+1}(t_1, . . . , t_{n+1}) = −a_2(t_n) g_{n−1}(t_1, . . . , t_{n−1}) − a_1(t_{n+1}) g_n(t_1, . . . , t_n),   (3.110)

for any n ≥ 1. The impulsive response kernel is given by

  g(x, x_0) = P_12(x, x_0) = Σ_{n=0}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_n} g_{n+1}(t_1, . . . , t_{n+1}) dt_{n+1} · · · dt_1
    = x − x_0 + Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_n} [ −a_2(t_n) g_{n−1}(t_1, . . . , t_{n−1}) − a_1(t_{n+1}) g_n(t_1, . . . , t_n) ] dt_{n+1} · · · dt_1
    = x − x_0 + Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_{n−1}} [ −a_2(t_n)(t_n − x_0) g_{n−1}(t_1, . . . , t_{n−1}) − ( ∫_{x_0}^{t_n} a_1(t_{n+1}) dt_{n+1} ) g_n(t_1, . . . , t_n) ] dt_n · · · dt_1.   (3.111)
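The recursion (3.109) is easy to exercise programmatically. In the sketch below (the concrete choices a_1 = sin, a_2 = exp and the sample point are assumptions made just to have something to evaluate), the recursively built f_n, g_n are compared with the explicit formulas for f_2, g_3, f_4, g_4 listed in the text.

```python
# Build f_n, g_n from the recursion (3.109) starting at f_1 = 0, g_1 = 1,
# and spot-check them against the explicit products of a_1, a_2.
import math

a1 = math.sin          # arbitrary test coefficient
a2 = math.exp          # arbitrary test coefficient

def step(f, g):
    # (3.109): f_{n+1}(t..., t') = -a2(t') g_n(t...)
    #          g_{n+1}(t..., t') = f_n(t...) - a1(t') g_n(t...)
    def f_next(*t):
        return -a2(t[-1]) * g(*t[:-1])
    def g_next(*t):
        return f(*t[:-1]) - a1(t[-1]) * g(*t[:-1])
    return f_next, g_next

f, g = (lambda t1: 0.0), (lambda t1: 1.0)
fs, gs = {1: f}, {1: g}
for n in range(1, 4):
    f, g = step(f, g)
    fs[n + 1], gs[n + 1] = f, g

t = (0.3, 0.7, 1.1, 1.9)   # arbitrary sample point
ok = (
    abs(fs[2](*t[:2]) - (-a2(t[1]))) < 1e-9
    and abs(gs[3](*t[:3]) - (-a2(t[1]) + a1(t[1]) * a1(t[2]))) < 1e-9
    and abs(fs[4](*t) - (a2(t[1]) * a2(t[3]) - a1(t[1]) * a1(t[2]) * a2(t[3]))) < 1e-9
    and abs(gs[4](*t) - (a1(t[1]) * a2(t[2]) + a2(t[1]) * a1(t[3])
                         - a1(t[1]) * a1(t[2]) * a1(t[3]))) < 1e-9
)
print(ok)
```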

Up to n = 3, we get

  g(x, x_0) = x − x_0 − ∫_{x_0}^x ∫_{x_0}^t a_1(s) ds dt
      + ∫_{x_0}^x ∫_{x_0}^t [ −(s − x_0) a_2(s) + ( ∫_{x_0}^s a_1(r) dr ) a_1(s) ] ds dt
      + ∫_{x_0}^x ∫_{x_0}^t ∫_{x_0}^s [ (r − x_0) a_2(r) a_1(s) + ( ∫_{x_0}^r a_1(u) du ) ( a_2(s) − a_1(r) a_1(s) ) ] dr ds dt
      + · · · .


We can interchange the order of integration using

  ∫_{x_0}^x ∫_{x_0}^t F(s) ds dt = ∫_{x_0}^x (x − s) F(s) ds,   (3.112)

to get

  g(x, x_0) = x − x_0 − ∫_{x_0}^x (x − s) a_1(s) ds
      + ∫_{x_0}^x (x − s) [ −(s − x_0) a_2(s) + ( ∫_{x_0}^s a_1(r) dr ) a_1(s) ] ds
      + ∫_{x_0}^x (x − s) ∫_{x_0}^s [ (r − x_0) a_2(r) a_1(s) + ( ∫_{x_0}^r a_1(u) du ) ( a_2(s) − a_1(r) a_1(s) ) ] dr ds + · · ·

    = x − x_0 − ∫_{x_0}^x (x − s) a_1(s) ds − ∫_{x_0}^x (x − s)(s − x_0) a_2(s) ds
      + ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_1(r) dr ds + ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s (r − x_0) a_2(r) dr ds
      + ∫_{x_0}^x (x − s) a_2(s) ∫_{x_0}^s (s − u) a_1(u) du ds − ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_1(r) ∫_{x_0}^r a_1(u) du dr ds + · · · .   (3.113)
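The identity (3.112) (Fubini on the triangle x_0 ≤ s ≤ t ≤ x) can be spot-checked numerically. In this sketch the integrand F and the quadrature parameters are arbitrary choices made for the test.

```python
# Numerical check of \int_{x0}^x \int_{x0}^t F(s) ds dt
#                  = \int_{x0}^x (x - s) F(s) ds.
import math

F = lambda s: math.exp(-s) * math.sin(s)
x0, x = 0.0, 2.0

def trap(f, a, b, n):
    # composite trapezoidal rule on [a, b] with n subintervals
    if n == 0 or a == b:
        return 0.0
    d = (b - a) / n
    return d * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * d) for i in range(1, n)))

inner = lambda t: trap(F, x0, t, 400)            # \int_{x0}^t F(s) ds
lhs = trap(inner, x0, x, 400)                    # iterated double integral
rhs = trap(lambda s: (x - s) * F(s), x0, x, 4000)
print(abs(lhs - rhs) < 1e-4)
```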

As regards P_11(x, x_0) we get, using (3.109), (3.110) and (3.112),

  P_11(x, x_0) = 1 + Σ_{n=0}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_n} f_{n+1}(t_1, . . . , t_{n+1}) dt_{n+1} · · · dt_1
    = 1 − Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_n} a_2(t_{n+1}) g_n(t_1, . . . , t_n) dt_{n+1} · · · dt_1
    = 1 − ∫_{x_0}^x ∫_{x_0}^{t_1} a_2(t_2) dt_2 dt_1
      + Σ_{n=2}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_n} a_2(t_{n+1}) [ a_2(t_{n−1}) g_{n−2}(t_1, . . . , t_{n−2}) + a_1(t_n) g_{n−1}(t_1, . . . , t_{n−1}) ] dt_{n+1} · · · dt_1
    = 1 − ∫_{x_0}^x ∫_{x_0}^t a_2(s) ds dt + ∫_{x_0}^x ∫_{x_0}^t ∫_{x_0}^s a_1(s) a_2(r) dr ds dt
      + ∫_{x_0}^x ∫_{x_0}^t ∫_{x_0}^s ∫_{x_0}^r [ a_2(s) − a_1(s) a_1(r) ] a_2(u) du dr ds dt + · · ·
    = 1 − ∫_{x_0}^x (x − s) a_2(s) ds + ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_2(r) dr ds
      + ∫_{x_0}^x (x − s) a_2(s) ∫_{x_0}^s (s − u) a_2(u) du ds
      − ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_1(r) ∫_{x_0}^r a_2(u) du dr ds + · · · .   (3.114)
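As a numerical sanity check of the truncated expansions (a sketch; the choices a_1 = 0, a_2 = 1 and the quadrature grid are assumptions), the terms displayed in (3.113) and (3.114) then reduce to the leading Taylor terms of sin x and cos x.

```python
# With a_1 = 0, a_2 = 1 (so L y = y'' + y, g(x,0) = sin x, P_11(x,0) = cos x),
# the surviving iterated integrals in (3.113)-(3.114) are evaluated by
# simple Riemann/trapezoid sums and compared with x - x^3/6 and
# 1 - x^2/2 + x^4/24.
import math

x = 0.5
N = 2000
h = x / N
s = [i * h for i in range(N + 1)]

# g truncation: x - \int_0^x (x - s)(s - 0) ds = x - x^3/6
g_trunc = x - h * sum((x - si) * si for si in s)

# P_11 truncation: 1 - \int_0^x (x - s) ds-term + a_2 a_2 term
inner = [0.0]
for i in range(1, N + 1):                 # \int_0^s (s - u) du = s^2 / 2
    inner.append(s[i] ** 2 / 2)
p_trunc = (1 - h * sum((x - si) for si in s)
           + h * sum((x - s[i]) * inner[i] for i in range(N + 1)))

ok = (abs(g_trunc - (x - x**3 / 6)) < 1e-3
      and abs(p_trunc - (1 - x**2 / 2 + x**4 / 24)) < 1e-3
      and abs(g_trunc - math.sin(x)) < 1e-3
      and abs(p_trunc - math.cos(x)) < 1e-3)
print(ok)
```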


We now look at two particular cases.

(1) Suppose the coefficient a_2 = 0, i.e., L = (d/dx)² + a_1(x)(d/dx). From (3.109)–(3.110) we have f_n = 0 ∀n, g_0 = 0, g_1 = 1 and

  g_n(t_1, . . . , t_n) = (−1)^{n−1} a_1(t_2) a_1(t_3) · · · a_1(t_n),   ∀n ≥ 2.

Formula (3.113) yields

  g(x, x_0) = x − x_0 − ∫_{x_0}^x (x − s) a_1(s) ds + ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_1(r) dr ds
      − ∫_{x_0}^x (x − s) a_1(s) ∫_{x_0}^s a_1(r) ∫_{x_0}^r a_1(u) du dr ds + · · · .

We claim that

  g(x, x_0) = ∫_{x_0}^x e^{−∫_{x_0}^t a_1(s) ds} dt.   (3.115)

Indeed, using (2.49) (with x_0 in place of 0), we compute

  (d/dx) g(x, x_0) = 1 − ∫_{x_0}^x a_1(s) ds + ∫_{x_0}^x a_1(s) ∫_{x_0}^s a_1(r) dr ds
      − ∫_{x_0}^x a_1(s) ∫_{x_0}^s a_1(r) ∫_{x_0}^r a_1(u) du dr ds + · · · .

It is easy to see that this expression equals

  1 − ∫_{x_0}^x a_1(s) ds + (1/2!) ( ∫_{x_0}^x a_1(s) ds )² − (1/3!) ( ∫_{x_0}^x a_1(s) ds )³ + · · · = e^{−∫_{x_0}^x a_1(s) ds}

(cf. (3.106)). Since g(x_0, x_0) = 0, we get (3.115). Of course (3.115) can also be proved in an elementary way, by just factoring L = (d/dx + a_1(x))(d/dx) and taking α_2 = 0 in Proposition 3.1. Then g_{α_2}(x, x_0) = 1 and (3.115) follows from (3.23). Finally, (3.114) yields P_11(x, x_0) = 1, which is the correct formula for the solution to Ly = 0, y(x_0) = 1, y′(x_0) = 0.

(2) Let the coefficient a_1 = 0, i.e., L = (d/dx)² + a_2(x). From (3.109)–(3.110) we get g_{2n} = 0 = f_{2n+1} ∀n, g_1 = 1, and

  f_{2n}(t_1, . . . , t_{2n}) = g_{2n+1}(t_1, . . . , t_{2n+1}) = (−1)^n a_2(t_2) a_2(t_4) · · · a_2(t_{2n}),


∀n ≥ 1. The expansion (3.111) becomes

  g(x, x_0) = x − x_0 − Σ_{n=1}^∞ ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_{n−1}} a_2(t_n)(t_n − x_0) g_{n−1}(t_1, . . . , t_{n−1}) dt_n · · · dt_1
    = x − x_0 + Σ_{n=1}^∞ (−1)^n ∫_{x_0}^x ∫_{x_0}^{t_1} · · · ∫_{x_0}^{t_{2n−1}} (t_{2n} − x_0) a_2(t_{2n}) a_2(t_{2n−2}) · · · a_2(t_2) dt_{2n} · · · dt_1
    = x − x_0 − ∫_{x_0}^x ∫_{x_0}^t (r − x_0) a_2(r) dr dt
      + ∫_{x_0}^x ∫_{x_0}^t ∫_{x_0}^r ∫_{x_0}^s (u − x_0) a_2(u) a_2(r) du ds dr dt
      − ∫_{x_0}^x ∫_{x_0}^t ∫_{x_0}^r ∫_{x_0}^s ∫_{x_0}^u ∫_{x_0}^v (y − x_0) a_2(y) a_2(u) a_2(r) dy dv du ds dr dt + · · · .

Using repeatedly (3.112) to simplify the iterated integrals, we arrive at the formula

  g(x, x_0) = x − x_0 − ∫_{x_0}^x (x − r) a_2(r)(r − x_0) dr
      + ∫_{x_0}^x (x − r) a_2(r) ∫_{x_0}^r (r − u) a_2(u)(u − x_0) du dr
      − ∫_{x_0}^x (x − r) a_2(r) ∫_{x_0}^r (r − u) a_2(u) ∫_{x_0}^u (u − y) a_2(y)(y − x_0) dy du dr + · · ·

    = x − x_0 + Σ_{n=1}^∞ (−1)^n ∫_{x_0}^x (x − t_1) a_2(t_1) ∫_{x_0}^{t_1} (t_1 − t_2) a_2(t_2) · · · ∫_{x_0}^{t_{n−1}} (t_{n−1} − t_n) a_2(t_n)(t_n − x_0) dt_n · · · dt_2 dt_1.   (3.116)

This can also be written as

  g(x, x_0) = lim_{n→∞} s_n(x, x_0),

where the sequence of partial sums is

  s_0(x, x_0) = ∫_{x_0}^x dt_1 = x − x_0,

  s_1(x, x_0) = ∫_{x_0}^x [ 1 − (x − t_1) a_2(t_1)(t_1 − x_0) ] dt_1,

  s_2(x, x_0) = ∫_{x_0}^x [ 1 − (x − t_1) a_2(t_1) ∫_{x_0}^{t_1} [ 1 − (t_1 − t_2) a_2(t_2)(t_2 − x_0) ] dt_2 ] dt_1,

  s_3(x, x_0) = ∫_{x_0}^x [ 1 − (x − t_1) a_2(t_1) ∫_{x_0}^{t_1} [ 1 − (t_1 − t_2) a_2(t_2) ∫_{x_0}^{t_2} [ 1 − (t_2 − t_3) a_2(t_3)(t_3 − x_0) ] dt_3 ] dt_2 ] dt_1,

and in general

  s_n(x, x_0) = ∫_{x_0}^x [ 1 − (x − t_1) a_2(t_1) ∫_{x_0}^{t_1} [ 1 − (t_1 − t_2) a_2(t_2) · · · ∫_{x_0}^{t_{n−1}} [ 1 − (t_{n−1} − t_n) a_2(t_n)(t_n − x_0) ] dt_n · · · ] dt_2 ] dt_1.
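The nested structure can be exercised numerically. In the sketch below (the constant choice a_2 = 4 and the grid are assumptions), each application of the Volterra-type kernel v ↦ ∫_{x_0}^t (t − u) a_2(u) v(u) du produces the next nested integral in (3.116), and the alternating partial sums approach g(x, 0) = sin(2x)/2 for y″ + 4y = 0.

```python
# Partial sums of (3.116) for constant a_2 = 4 on [0, 1], built by repeated
# kernel application; the exact impulsive response is sin(2x)/2.
import math

a2 = lambda u: 4.0
N = 400
X = 1.0
h = X / N
ts = [i * h for i in range(N + 1)]

def apply_kernel(v):
    # w(t_i) = \int_0^{t_i} (t_i - u) a2(u) v(u) du  (trapezoidal rule)
    w = []
    for i, t in enumerate(ts):
        f = [(t - ts[j]) * a2(ts[j]) * v[j] for j in range(i + 1)]
        w.append(h * (sum(f) - 0.5 * (f[0] + f[-1])) if i > 0 else 0.0)
    return w

v = list(ts)                  # v_0(t) = t - x_0 with x_0 = 0
s = ts[-1]                    # s_0(x) = x
sign = 1.0
for n in range(1, 6):         # s_n adds the n-fold nested integral, sign (-1)^n
    v = apply_kernel(v)
    sign = -sign
    s += sign * v[-1]

print(abs(s - math.sin(2 * X) / 2) < 1e-4)
```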

It follows from (3.116) that

  ∂_x g(x, x_0) = 1 − ∫_{x_0}^x a_2(r)(r − x_0) dr + ∫_{x_0}^x a_2(r) ∫_{x_0}^r (r − u) a_2(u)(u − x_0) du dr
      − ∫_{x_0}^x a_2(r) ∫_{x_0}^r (r − u) a_2(u) ∫_{x_0}^u (u − y) a_2(y)(y − x_0) dy du dr + · · · .

We can easily check from this that Lg = 0, i.e., ∂_x² g(x, x_0) = −a_2(x) g(x, x_0). Finally, (3.114) with a_1 = 0 gives

  P_11(x, x_0) = 1 − ∫_{x_0}^x (x − s) a_2(s) ds + ∫_{x_0}^x (x − s) a_2(s) ∫_{x_0}^s (s − u) a_2(u) du ds
      − ∫_{x_0}^x (x − s) a_2(s) ∫_{x_0}^s (s − u) a_2(u) ∫_{x_0}^u (u − y) a_2(y) dy du ds + · · ·
    = 1 + Σ_{n=1}^∞ (−1)^n ∫_{x_0}^x (x − t_1) a_2(t_1) ∫_{x_0}^{t_1} (t_1 − t_2) a_2(t_2) · · · ∫_{x_0}^{t_{n−1}} (t_{n−1} − t_n) a_2(t_n) dt_n · · · dt_2 dt_1.   (3.117)

Comparing this with (3.116), it is easy to check that P_11(x, x_0) = −∂_t g(x, t)|_{t=x_0}, in agreement with formula (3.108) for a_1 = 0.

Remark 3.22 It is not clear if the series of iterated integrals (3.116)–(3.117), or in general (3.111)–(3.114), can be resummed in terms of the exponential series as in the case of a_2 = 0 (cf. (3.115)). What we have in mind is a formula like (3.23) (which reduces to (3.115) for a_2 = 0), with the kernel g_{α_2}(x, x_0) = e^{∫_{x_0}^x α_2(t) dt} given explicitly in terms of the coefficients a_1, a_2. Of course, the function x ↦ g_{α_2}(x, x_0) solves Ly = 0 with the initial conditions y(x_0) = 1, y′(x_0) = β_0, where β_0 = ∂_x g_{α_2}(x_0, x_0) = α_2(x_0). Thus g_{α_2}(x, x_0) is given by (3.107) with y_0 = 1, y_0′ = β_0, and we go back to the expansions for P_11(x, x_0) and P_12(x, x_0), but the parameter β_0 is unknown. (This parameter ensures that the function x ↦ g_{α_2}(x, x_0) never vanishes in the interval I. Its existence is guaranteed by Theorem 3.2 and Proposition 3.1.)


For a_1, a_2 real valued and Im β_0 ≠ 0, formula (3.23) reduces to (3.37). We might expect the Peano-Baker series to give a formula for the functions η(x, x_0) and ω(x, x_0) in (3.37) in terms of a_1 and a_2.

Remark 3.23 (Analytic coefficients) If the coefficients a_1 and a_2 are analytic functions in a neighborhood I = (x_0 − r, x_0 + r) of x_0, then the solutions of Ly = 0, y(x_0) = y_0, y′(x_0) = y_0′, are analytic in I as well. Their convergent power series expansion can be computed by a formal algebraic process. One expands a_1, a_2 and y in power series around x_0 and determines a recursion relation for the coefficients of y by substituting into the differential equation. Alternatively, one uses the differential equation to compute directly the derivatives y⁽ⁿ⁾(x_0), n ≥ 2, in terms of y_0, y_0′. We refer to [8], Chap. 3, Sects. 7 and 9, for examples and full justification of the power series method. The Peano-Baker series solutions (3.111) and (3.114) for the functions P_12(x, x_0) and P_11(x, x_0) must obviously coincide with their power series expansions. The equality will be between the overall functions and not between the single terms of each series, in general. Indeed, the terms in the Peano-Baker series are obtained by integrating the exact coefficients a_1, a_2, and are not necessarily powers.

Example 17 Consider the differential equation y″ + x² y = 0. With x_0 = 0 and a_2(x) = x², formulas (3.117) and (3.116) read:

  P_11(x, 0) = 1 − ∫_0^x (x − s) s² ds + ∫_0^x (x − s) s² ∫_0^s (s − u) u² du ds
      − ∫_0^x (x − s) s² ∫_0^s (s − u) u² ∫_0^u (u − y) y² dy du ds + · · · ,

  g(x, 0) = P_12(x, 0) = x − ∫_0^x (x − s) s³ ds + ∫_0^x (x − s) s² ∫_0^s (s − u) u³ du ds
      − ∫_0^x (x − s) s² ∫_0^s (s − u) u² ∫_0^u (u − y) y³ dy du ds + · · · .

We easily compute

  P_11(x, 0) = 1 − x⁴/(3·4) + x⁸/(3·4·7·8) − x¹²/(3·4·7·8·11·12) + · · ·
             = 1 + Σ_{n=1}^∞ (−1)^n x^{4n} / ( 3·4·7·8 · · · (4n−1)·4n ),

  g(x, 0) = x − x⁵/(4·5) + x⁹/(4·5·8·9) − x¹³/(4·5·8·9·12·13) + · · ·
          = x + Σ_{n=1}^∞ (−1)^n x^{4n+1} / ( 4·5·8·9 · · · 4n·(4n+1) ),

with both series converging for any x ∈ R, as is easily seen by the ratio test. The power series method gives (of course) the same result.
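The denominator patterns can be checked with exact rational arithmetic: substituting g = Σ c_n x^{4n+1} and P_11 = Σ d_n x^{4n} into y″ + x² y = 0 forces c_{n+1} = −c_n/((4n+4)(4n+5)) and d_{n+1} = −d_n/((4n+3)(4n+4)), whose solutions are precisely the coefficients printed above. A short sketch:

```python
# Exact check that the coefficient recursions forced by y'' + x^2 y = 0
# reproduce the printed denominators 4*5*8*9*... and 3*4*7*8*...
from fractions import Fraction

def prod(factors):
    p = 1
    for f in factors:
        p *= f
    return p

c = [Fraction(1)]            # g(x,0)   = sum c_n x^{4n+1}
d = [Fraction(1)]            # P_11(x,0) = sum d_n x^{4n}
for n in range(5):
    c.append(-c[n] / ((4 * n + 4) * (4 * n + 5)))
    d.append(-d[n] / ((4 * n + 3) * (4 * n + 4)))

# the closed forms printed in the text
c_text = [Fraction((-1) ** n, prod(4 * k + j for k in range(n) for j in (4, 5)))
          for n in range(6)]
d_text = [Fraction((-1) ** n, prod(4 * k + j for k in range(n) for j in (3, 4)))
          for n in range(6)]
print(c == c_text and d == d_text)
```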


Example 18 Consider the differential equation y″ + |x| y = 0. Proceeding as in the previous example with x_0 = 0 and a_2(x) = |x|, we find

  P_11(x, 0) = { 1 − x³/(2·3) + x⁶/(2·3·5·6) − x⁹/(2·3·5·6·8·9) + · · · = 1 + Σ_{n=1}^∞ (−1)^n x^{3n} / ( 2·3·5·6 · · · (3n−1)·3n )   (x ≥ 0)
               { 1 + x³/(2·3) + x⁶/(2·3·5·6) + x⁹/(2·3·5·6·8·9) + · · · = 1 + Σ_{n=1}^∞ x^{3n} / ( 2·3·5·6 · · · (3n−1)·3n )   (x < 0),

  g(x, 0) = { x − x⁴/(3·4) + x⁷/(3·4·6·7) − x¹⁰/(3·4·6·7·9·10) + · · · = x + Σ_{n=1}^∞ (−1)^n x^{3n+1} / ( 3·4·6·7 · · · 3n·(3n+1) )   (x ≥ 0)
            { x + x⁴/(3·4) + x⁷/(3·4·6·7) + x¹⁰/(3·4·6·7·9·10) + · · · = x + Σ_{n=1}^∞ x^{3n+1} / ( 3·4·6·7 · · · 3n·(3n+1) )   (x < 0),

again with all series converging for any x ∈ R. In this case a_2 ∈ C⁰ but a_2 ∉ C¹, so that the power series method around x_0 = 0 does not apply, in principle. However, we can just solve y″ + x y = 0 and y″ − x y = 0 by power series, and then join the solutions thus obtained for x ≥ 0 and x ≤ 0, respectively, to get the same result.

Example 19 Consider the differential equation y″ + e^x y = 0. First, let us discuss the power series approach. We use the differential equation to write

  y″(x) = −e^x y(x),
  y⁽³⁾(x) = −e^x y(x) − e^x y′(x),
  y⁽⁴⁾(x) = −e^x y(x) − 2 e^x y′(x) − e^x y″(x) = y(x)(e^{2x} − e^x) − 2 e^x y′(x),
  y⁽⁵⁾(x) = y(x)(4 e^{2x} − e^x) + y′(x)(e^{2x} − 3 e^x),

and so on. Letting x = x_0 = 0 gives the derivatives y⁽ⁿ⁾(0), n ≥ 2, in terms of y_0 = y(0) and y_0′ = y′(0):

  y″(0) = −y_0,   y⁽³⁾(0) = −y_0 − y_0′,   y⁽⁴⁾(0) = −2 y_0′,   y⁽⁵⁾(0) = 3 y_0 − 2 y_0′, . . . .

The solution of y″ + e^x y = 0, y(0) = y_0, y′(0) = y_0′, is given by the power series

  y(x) = Σ_{n=0}^∞ ( y⁽ⁿ⁾(0)/n! ) x^n = y_0 + y_0′ x + ( y″(0)/2! ) x² + · · ·
       = y_0 ( 1 − (1/2) x² − (1/6) x³ + (1/40) x⁵ + · · · ) + y_0′ ( x − (1/6) x³ − (1/12) x⁴ − (1/60) x⁵ + · · · ).

Since the power series of a_2(x) = e^x converges for any x ∈ R, the two power series multiplying y_0 and y_0′ will converge for any x ∈ R (cf. [8], Theorem 12, p. 129). They represent the functions P_11(x, 0) and P_12(x, 0) respectively, but the pattern of the coefficients is not so clear. The Peano-Baker series with x_0 = 0 is likewise not very illuminating. For instance, for P_11(x, 0) we get from (3.117)

  P_11(x, 0) = 1 − ∫_0^x (x − s) e^s ds + ∫_0^x (x − s) e^s ∫_0^s (s − u) e^u du ds
      − ∫_0^x (x − s) e^s ∫_0^s (s − u) e^u ∫_0^u (u − y) e^y dy du ds + · · ·
    = 1 − ( e^x − 1 − x ) + ( (1/4) e^{2x} + (1 − x) e^x − (1/2) x − 5/4 )
      − ( (1/36) e^{3x} + e^{2x}(−x/4 + 1/2) + e^x(−x/2 − 1/4) − x/12 − 5/18 ) + · · · .

To simplify matters, let us choose x_0 = −∞ as the boundary point in (3.117):

  P_11(x, −∞) = 1 − ∫_{−∞}^x (x − s) e^s ds + ∫_{−∞}^x (x − s) e^s ∫_{−∞}^s (s − u) e^u du ds
      − ∫_{−∞}^x (x − s) e^s ∫_{−∞}^s (s − u) e^u ∫_{−∞}^u (u − y) e^y dy du ds + · · ·   (3.118)

    = 1 − e^x + (1/4) e^{2x} − (1/36) e^{3x} + · · ·
    = 1 − e^x + (1/(2!)²) e^{2x} − (1/(3!)²) e^{3x} + · · · = 1 + Σ_{n=1}^∞ ( (−1)^n/(n!)² ) e^{nx}.   (3.119)

This is the solution of y″ + e^x y = 0 such that

  y(−∞) = lim_{x→−∞} y(x) = 1,   y′(−∞) = lim_{x→−∞} y′(x) = 0.
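The series (3.119) can also be verified termwise with exact arithmetic: writing y = Σ_{n≥0} c_n e^{nx}, we get y″ = Σ n² c_n e^{nx} and e^x y = Σ c_{n−1} e^{nx}, so y″ + e^x y = 0 forces n² c_n + c_{n−1} = 0, i.e., c_n = (−1)^n/(n!)² with c_0 = 1. A sketch:

```python
# Termwise verification that (3.119) solves y'' + e^x y = 0: the recursion
# n^2 c_n + c_{n-1} = 0 with c_0 = 1 yields c_n = (-1)^n / (n!)^2.
from fractions import Fraction
from math import factorial

c = [Fraction(1)]
for n in range(1, 10):
    c.append(-c[n - 1] / n ** 2)        # n^2 c_n + c_{n-1} = 0

ok = all(c[n] == Fraction((-1) ** n, factorial(n) ** 2) for n in range(10))
print(ok)
```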

Formula (3.118) can be fully justified as follows. We integrate y″(x) = −e^x y(x) between −∞ and x with y′(−∞) = 0, to get

  y′(x) = − ∫_{−∞}^x e^t y(t) dt.

Integrating one more time with y(−∞) = 1 gives

  y(x) = 1 − ∫_{−∞}^x ∫_{−∞}^t e^s y(s) ds dt.

We exchange the order of integration using (3.112) with x_0 = −∞, to get

  y(x) = 1 − ∫_{−∞}^x (x − s) e^s y(s) ds.

We now solve this integral equation by the method of successive approximations. The first approximation is y_0(x) = 1. The second is

  y_1(x) = 1 − ∫_{−∞}^x (x − s) e^s y_0(s) ds = 1 − e^x.

Defining in general

  y_n(x) = 1 − ∫_{−∞}^x (x − s) e^s y_{n−1}(s) ds,

we get the solution y(x) = lim_{n→∞} y_n(x), given precisely by (3.118)–(3.119).

In order to find a second solution of our equation, consider the Peano-Baker series (3.116) for P_12(x, x_0). Here we cannot just set x_0 = −∞, because of the terms (t_n − x_0) inside the iterated integrals. However, if we omit x_0 in all these terms and replace the lower limit of integration by x_0 = −∞, we obtain the function

  P_12(x, −∞) := x − ∫_{−∞}^x (x − s) e^s s ds + ∫_{−∞}^x (x − s) e^s ∫_{−∞}^s (s − u) e^u u du ds
      − ∫_{−∞}^x (x − s) e^s ∫_{−∞}^s (s − u) e^u ∫_{−∞}^u (u − y) e^y y dy du ds + · · · .   (3.120)

Again this can be justified by considering the integral equation

  y(x) = x − ∫_{−∞}^x ∫_{−∞}^t e^s y(s) ds dt = x − ∫_{−∞}^x (x − s) e^s y(s) ds.   (3.121)

If y solves (3.121), then y solves y″ + e^x y = 0 with the "initial conditions"

  y(−∞) = lim_{x→−∞} y(x) = −∞,   y′(−∞) = lim_{x→−∞} y′(x) = 1.

Solving (3.121) by successive approximations with y_0(x) = x gives precisely (3.120). We get the following formula for P_12(x, −∞):

  P_12(x, −∞) = x − e^x (x − 2) + e^{2x} ( x/4 − 3/4 ) − e^{3x} ( x/36 − 11/108 ) + · · ·
    = x ( 1 − e^x + (1/4) e^{2x} − (1/36) e^{3x} + · · · ) − 2 ( −e^x + (1/4)·(3/2) e^{2x} − (1/36)·(11/6) e^{3x} + · · · )
    = x ( 1 − e^x + (1/(2!)²) e^{2x} − (1/(3!)²) e^{3x} + · · · ) − 2 ( −e^x + (1/(2!)²)(1 + 1/2) e^{2x} − (1/(3!)²)(1 + 1/2 + 1/3) e^{3x} + · · · )
    = x Σ_{n=0}^∞ ( (−1)^n/(n!)² ) e^{nx} − 2 Σ_{n=1}^∞ ( (−1)^n/(n!)² )( 1 + 1/2 + · · · + 1/n ) e^{nx}
    = x P_11(x, −∞) − 2 Σ_{n=1}^∞ ( (−1)^n/(n!)² )( 1 + 1/2 + · · · + 1/n ) e^{nx}.   (3.122)

All series involved converge for any x ∈ R. It can be shown that the two functions P_11(x, −∞) and P_12(x, −∞) are linearly independent on R. They can be related to Bessel functions as follows.


In our equation y″ + e^x y = 0 let us make the change of variable z = 2 e^{x/2}, i.e., x = 2 log(z/2). Then d/dx = (dz/dx) d/dz = (z/2) d/dz, and

  (d/dx)² + e^x = (1/4) ( z² (d/dz)² + z (d/dz) + z² ),

and we get the following differential equation for the function u(z) = y(x) = y(2 log(z/2)) in the interval z > 0:

  z² u″ + z u′ + z² u = 0,   i.e.,   u″ + (1/z) u′ + u = 0,

primes denoting differentiation with respect to z. This is just the Bessel equation of order zero, see [8], Chap. 4, Sect. 7. The solutions of this equation that are regular at z = 0 are proportional to the Bessel function of the first kind of order 0, denoted J_0(z) and given by the power series

  J_0(z) = Σ_{n=0}^∞ ( (−1)^n/(n!)² ) (z/2)^{2n}.

Comparison with (3.119) gives P_11(x, −∞) = J_0(2 e^{x/2}). A second linearly independent solution of the Bessel equation of order zero is the Bessel function of the second kind of order 0, denoted K_0(z) in [8], p. 170, and given by

  K_0(z) = (log z) J_0(z) − Σ_{n=1}^∞ ( (−1)^n/(n!)² )( 1 + 1/2 + · · · + 1/n ) (z/2)^{2n}.

This diverges to −∞ as z → 0⁺. Comparing with (3.122) we get

  P_12(x, −∞) = 2 K_0(2 e^{x/2}) − 2 (log 2) J_0(2 e^{x/2}).

References

1. M. Baake, U. Schlägel, The Peano-Baker series. Proc. Steklov Inst. Math. 275, 155–159 (2011)
2. R. Camporesi, Linear ordinary differential equations. Revisiting the impulsive response method using factorization. Rend. Sem. Mat. Univ. Politec. Torino 68(4), 319–335 (2010)
3. R. Camporesi, Linear ordinary differential equations with constant coefficients. Revisiting the impulsive response method using factorization. Internat. J. Math. Ed. Sci. Tech. 42, 497–514 (2011)
4. R. Camporesi, A.J. Di Scala, A generalization of a theorem of Mammana. Colloq. Math. 122, 215–223 (2011)
5. R. Camporesi, A fresh look at linear ordinary differential equations with constant coefficients. Revisiting the impulsive response method using factorization. Internat. J. Math. Ed. Sci. Tech. 47, 82–99 (2016)
6. Y. Choquet-Bruhat, Distributions, Théorie et problèmes (Masson et Cie, Paris, 1973)
7. E.A. Coddington, N. Levinson, Theory of Ordinary Differential Equations (McGraw-Hill, New York, Toronto, London, 1955)
8. E.A. Coddington, An Introduction to Ordinary Differential Equations. Prentice-Hall Mathematics Series (Prentice-Hall, Englewood Cliffs, N.J., 1961)
9. W.A. Coppel, Disconjugacy. Lecture Notes in Mathematics, vol. 220 (Springer, Berlin, New York, 1971)
10. G. Doetsch, Introduction to the Theory and Application of the Laplace Transformation (Springer, New York, Heidelberg, 1974)
11. P. Hartman, Ordinary Differential Equations, 2nd edn. (Birkhäuser, Boston, 1982)
12. E.L. Ince, Ordinary Differential Equations (Dover Publications, New York, 1944)
13. G. Mammana, Sopra un nuovo metodo di studio delle equazioni differenziali lineari. Math. Z. 25, 734–748 (1926)
14. G. Mammana, Decomposizione delle espressioni differenziali lineari omogenee in prodotti di fattori simbolici e applicazione relativa allo studio delle equazioni differenziali lineari. Math. Z. 33, 186–231 (1931)
15. G. Pólya, On the mean-value theorem corresponding to a given linear homogeneous differential equation. Trans. Am. Math. Soc. 24, 312–324 (1922)
16. L. Schwartz, Méthodes mathématiques pour les sciences physiques (Hermann, Paris, 1961)
17. L. Schwartz, Théorie des distributions. Publications de l'Institut de Mathématique de l'Université de Strasbourg, No. IX–X, nouvelle édition (Hermann, Paris, 1966)
18. F. Schwarz, Algorithmic Lie Theory for Solving Ordinary Differential Equations. Pure and Applied Mathematics, vol. 291 (Chapman & Hall/CRC, Boca Raton, FL, 2008)
19. A. Slavík, Product Integration, Its History and Applications. Dejiny Matematiky/History of Mathematics, vol. 29 (Matfyzpress, Prague, 2007)

© Springer International Publishing AG 2016. R. Camporesi, An Introduction to Linear Ordinary Differential Equations Using the Impulsive Response Method and Factorization, PoliTO Springer Series, DOI 10.1007/978-3-319-49667-2