Handbook of Geomathematics 9783642277931


English Pages [2805] Year 2015

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_1-3 © Springer-Verlag Berlin Heidelberg 2014

Geomathematics: Its Role, Its Aim, and Its Potential

Willi Freeden
Geomathematics Group, University of Kaiserslautern, Rhineland-Palatinate, Germany

Abstract During the last decades, geosciences and geoengineering were influenced by two essential scenarios: First, the technological progress has completely changed the observational and measurement techniques. Modern high-speed computers and satellite-based techniques are more and more entering all geodisciplines. Second, there is a growing public concern about the future of our planet, its climate, and its environment and about an expected shortage of natural resources. Obviously, both aspects, viz., efficient strategies of protection against threats of a changing Earth and the exceptional situation of getting terrestrial, airborne, as well as spaceborne data of better and better quality, explain the strong need of new mathematical structures, tools, and methods, i.e., geomathematics. This paper deals with geomathematics, its role, its aim, and its potential. Moreover, the “circuit” geomathematics is exemplified by three problems involving the Earth’s structure, namely, gravity field determination from terrestrial deflections of the vertical, ocean flow modeling from satellite (altimeter measured) ocean topography, and reservoir detection from (acoustic) wave tomography.

1 Introduction

Geophysics is an important branch of physics; it differs from the other physical disciplines in its restriction to objects of geophysical character. Why should mathematics not claim what physics has regarded as its canonical right since the times of Emil Wiechert in the late nineteenth century? More than ever before, there are significant reasons for a well-defined position of geomathematics as a branch of mathematics and simultaneously of the geosciences (cf. Fig. 1). On the one hand, these reasons are intrinsically based on the self-conception of mathematics; on the other hand, they can be explained by the current situation of the geosciences. In the following, I would like to explain my thoughts on geomathematics in more detail. My objective is to convince not only geoscientists but also a broader audience: “Geomathematics is essential and important. Like geophysics within physics, it has an adequate forum within mathematics, and it should have a fully acknowledged position within the geosciences!”

2 Geomathematics as a Cultural Asset

In chapter “Navigation on Sea: Topics in the History of Geomathematics” of this “Handbook of Geomathematics”, T. Sonar starts his contribution with the sentence: “Geomathematics in our times is thought of being a very young science and a modern area in the realms of mathematics.





There exists a growing public concern about the future of our planet, its climate, its environment, and about an expected shortage of natural resources.

Modern high-speed computers are entering more and more all geodisciplines.

There is a strong need for strategies of protection against threats of a changing Earth.

There is an exceptional situation of getting data of better and better quality.

Fig. 1 Four significant reasons for the increasing importance of geomathematics

Nothing is farer from the truth. Geomathematics began as man realized that he walked across a sphere-like Earth and that this observation has to be taken into account in measurements and calculations.” Consequently, we can only do justice to geomathematics if we look, at least briefly, at its historic importance. According to the oldest surviving written evidence, geomathematics was developed in Sumerian Babylon and ancient Egypt (see Fig. 2) on the basis of practical tasks concerning measuring, counting, and calculation for agriculture and stock keeping. In the ancient world, mathematics dealing with problems of geoscientific relevance flourished for the first time, for example, when Eratosthenes (276–195 BC) of Alexandria calculated the radius of the Earth. We also have evidence that the Arabs carried out an arc measurement northwest of Baghdad in the year 827 AD. Further key results of geomathematical research lead us from the Orient across the occidental Middle Ages to modern times. Copernicus (1473–1543) successfully made the transition from the Ptolemaic geocentric system to the heliocentric system. Kepler (1571–1630) determined the laws of planetary motion. Further milestones from a historical point of view are, for example, the theory of geomagnetism developed by Gilbert (1544–1603), the development of triangulation methods for the determination of meridians by Brahe (1546–1601) and Snellius (1580–1626), the laws of falling bodies by Galilei (1564–1642), and the basic theory on the propagation of seismic waves by Huygens (1629–1695). The laws of gravitation formulated by the English scientist Newton (1643–1727) have taught us that gravitation decreases with increasing distance from the Earth. In the seventeenth and eighteenth centuries, France took over an essential role through the foundation of the Academy in Paris (1666).
Fig. 2 Papyrus scroll containing indications of algebra, geometry, and trigonometry due to Ahmose (nineteenth century BC) [Department of Ancient Egypt and Sudan, British Museum EA 10057, London, Creative Commons license CC-BY-SA 2.0] (taken from Sonar 2011)

Successful discoveries were the theory of the isostatic balance of mass distribution in the Earth’s crust by Bouguer (1698–1758), the calculation of the Earth’s shape and especially of the pole flattening by Maupertuis (1698–1759) and Clairaut (1713–1765), and the development of the calculus of spherical harmonics by Legendre (1752–1833) and Laplace (1749–1827). The nineteenth century was essentially characterized by Gauß (1777–1855). Especially important were the calculation of the lower Fourier coefficients of the Earth’s magnetic field, the hypothesis of electric currents in the ionosphere, and the definition of the level set of the geoid (however, the term “geoid” was defined by Listing (1808–1882), a disciple of Gauß). Riemann (1826–1866) made lasting contributions to differential geometry, some of them enabling the later development of general relativity. Helmert (1843–1917) laid the mathematical foundation of modern geodesy. In the twentieth century, the basic idea of the dynamo theory in geomagnetics was developed by Elsasser (1904–1991), Bullard (1907–1980), and others. This very incomplete list (which omits most of the essential facets of the last century) already shows that geomathematics is one of the great achievements of mankind from a historic point of view.

3 Geomathematics as Task and Objective

Modern geomathematics deals with the qualitative and quantitative properties of the current or possible structures of the system Earth. It provides the concepts for scientific research concerning the system Earth, and it is simultaneously the driving force behind this research. The system Earth (see Fig. 3) consists of a number of elements which represent individual systems themselves. The complexity of the entire system Earth is determined by interacting physical, biological, and chemical processes transforming and transporting energy, material, and information (cf. Emmermann and Raiser 1997). It is characterized by natural, social, and economic processes influencing one another. In most instances, a simple theory of cause and effect is therefore completely inappropriate if we want to understand the system. We have to think in dynamical structures and to account for multiple, unforeseen, and of course sometimes even undesired effects in the case of interventions. Inherent networks must be recognized and made use of, and self-regulation must be accounted for. All these aspects require a type of mathematics which must be more than a mere collection of theories and numerical methods.


Fig. 3 Geosystems Mathematics as key technology penetrating the complex system Earth (modified illustration following Emmermann and Raiser 1997)

Mathematics dedicated to geosciences, i.e., geomathematics, deals with nothing more than the organization of the complexity of the system Earth. Descriptive thinking is required in order to clarify abstract complex situations. We also need a correct simplification of complicated interactions, an appropriate system of mathematical concepts for their description, and exact thinking and formulations. Geomathematics has thus become the key science of the complex system Earth. Wherever there are data and observations to be processed, e.g., the diverse scalar, vectorial, and tensorial clusters of satellite data, we need mathematics. For example, statistics serves for noise reduction, constructive approximation serves for compression and evaluation, and the theory of special function systems yields georelevant graphical and numerical representations – there are mathematical algorithms everywhere. The specific task of geomathematics is to build a bridge between mathematical theory and geophysical as well as geotechnical applications. The special attraction of this branch of mathematics is therefore based on the vivid communication between applied mathematicians more interested in model development, theoretical foundation, and the approximate as well as computational solution of problems and geoengineers and physicists more familiar with measuring technology, methods of data analysis, implementation of routines, and software applications. There is a very wide range of modern geosciences on which geomathematics is focused (see Fig. 4), not least because of the continuously increasing observation diversity. Simultaneously, the mathematical “toolbox” is becoming larger. 
Fig. 4 Geomathematics, its range of fields, and its disciplines

A special feature is that geomathematics primarily deals with those regions of the Earth which are only insufficiently or not at all accessible for direct measurements (even by remote sensing methods, as discussed, e.g., in chapters “Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment”, “Inverse Resistivity Problems in Computational Geoscience”, “Analysis of Data from Multisatellite Geospace Missions”, and “Potential Methods and Geoinformation Systems” of this work). Inverse methods (see, for instance, chapters “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution”, “Multiresolution Analysis of Hydrology and Satellite Gravitational Data”, “Elastic and Viscoelastic Reaction of the Lithosphere to Loads”, “Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents”, “The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data”, “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”, “Noise Models for Ill-Posed Problems”, “Sparsity in Inverse Geophysical Problems”, “Multiparameter Regularization in Downward Continuation of Satellite Data”, “Evaluation of Parameter Choice Methods for Regularization of Ill-Posed Problems in Geomathematics”, “Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment”, “Inverse Resistivity Problems in Computational Geoscience”, “Identification of Current Sources in 3D Electrostatics”, “Transmission Tomography in Seismology”, “Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery”, “Strategies in Adjoint Tomography”, “Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude”, “Multidimensional Seismic Compression by Hybrid Transform with Multiscale Based Coding”, “Tomography: Problems and Multiscale Solutions”, “RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences”, “Material Behavior: Texture and Anisotropy”, and “Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-wave Velocities Near Surface Profiling” of this handbook) are absolutely essential for mathematical evaluation in these cases (see, e.g., Nashed (1981) for a deeper insight). In most cases, a physical quantity is measured in the vicinity of the Earth’s surface, and it is then continued downward or upward by mathematical methods until one reaches the depths or heights of interest.

4 Geomathematics as Interdisciplinary Science

Once more it should be mentioned that, today, computers and measurement technology have resulted in an explosive propagation of mathematics in almost every area of society. Mathematics as an interdisciplinary science can be found in almost every area of our lives. Consequently, mathematics is closely interacting with almost every other science, even medicine and parts of the arts (mathematization of sciences). The use of computers allows for the handling of complicated models for real data sets. Modeling, computation, and visualization yield reliable simulations of processes and products. Mathematics is the “raw material” for the models and the essence of each computer simulation. As the key technology, it translates the images of the real world to models of the virtual world, and vice versa (cf. Fig. 5 for an example in seismic tomography). The special importance of mathematics as an interdisciplinary science (see also Neuzert and Rosenberger 1991; Beutelspacher 2001; Pesch 2002) has been acknowledged increasingly within the last few years in technology, economy, and commerce. However, this process does not remain without effects on mathematics itself. New mathematical disciplines, such as scientific computing, financial and business mathematics, industrial mathematics, biomathematics, and also geomathematics, have complemented the traditional disciplines. Interdisciplinarity also implies the interdisciplinary character of mathematics at school. Relations and references to other disciplines (especially informatics, physics, chemistry, biology, and also economy and geography) become more important, more interesting, and more expandable. Problem areas of mathematics become explicit and observable, and they can be visualized. Of course, this undoubtedly also holds for the system Earth.

Fig. 5 Geomathematics as a key technology bridging the real and virtual world: An example from seismic tomography

5 Geomathematics as a Challenge

From a scientific and technological point of view, the twentieth century was a period with two entirely different faces concerning research and its consequences. The first two thirds of the century were characterized by a movement toward a seemingly inexhaustible future of science and technology; they were marked by the absolute belief in technical progress which would make everything achievable in the end. Up to the 1960s, mankind believed it had become the master of the Earth (note that, in geosciences as well as other sciences, to master is also a synonym for to understand). Geoscience was able to understand plate tectonics on the basis of Wegener’s theory of continental drift, geoscientific research began to deal with the Arctic and Antarctic, and man started to conquer the universe by satellites, so that for the first time in mankind’s history, the Earth became measurable on a global scale. Then, during the last third of the past century, there was a growing skepticism as to whether scientific and technical progress had really moved us forward and whether the effects of our achievements could be justified. As a consequence of the specter of a shortage in raw materials (mineral oil and natural gas reserves), predicted by the Club of Rome, geological/geophysical research with the objective of exploring new reservoirs was stimulated during the 1970s (see, e.g., Jakobs and Meyer 1992). Moreover, the last two decades of the century have sensitized us to the global problems resulting from our behavior with respect to climate and environment. Our senses have been sharpened as to the dangers caused by the forces of nature, from earthquakes and volcanic eruptions to the temperature development and the hole in the ozone layer, etc. Man has become aware of his environment. The image of the Earth as a potato drenched by rainfall (which is sometimes drawn by oceanographers) is not a false one (see Fig. 6). The humid layer on this potato, maybe only a fraction of a millimeter thick, is the ocean. The entire atmosphere hosting the weather and climate events is only a little bit thicker. Flat bumps protruding from the humid layer represent the continents. All human life takes place in a very narrow region of the outer peel (only a few kilometers in vertical extension). However, the basically excellent comparison of the Earth with a huge potato does not give explicit information about essential ingredients and processes of the system Earth, for example, gravitation, magnetic field, deformation, wind and heat distribution, ocean currents, internal structures, etc. In our twenty-first century, geoproblems currently seem to overwhelm the scientific programs and solution proposals.

Fig. 6 (Potato) “Earth” of radius 6,371 km
“How much more will our planet Earth be able to take?” has become an appropriate and very urgent question. Indeed, there have been a large number of far-reaching changes during the last few decades, e.g., species extinction, climate change, formation of deserts, ocean currents, structure of the atmosphere, transition of the dipole structure of the magnetic field to a quadrupole structure, etc. These changes have been accelerated dramatically. The main reasons for most of these phenomena are the unrestricted growth in the industrial societies (population and consumption, especially of resources, territory, and energy) and severe poverty in the developing and newly industrialized countries. The dangerous aspect is that extreme changes have taken place within a very short time; there has been no comparable development in the dynamics of the system Earth in the past. Changes brought about by man are much faster than changes due to natural fluctuations. Besides, the current financial crisis shows that our western model of affluence (which holds for approximately 1 billion people) cannot be transferred globally to 5–8 billion people. Massive effects on mankind are inevitable. The appalling résumé is that the geoscientific problems collected over the decades must now all be solved simultaneously. Interdisciplinary solutions including intensive mathematical research are urgently required as answers to an increasingly complex world. Geomathematics is absolutely essential for a sustainable development in the future. However, the scientific challenge does not only consist of increasing the leading role of mathematics within the current “scientific consortium Earth”. The significance of the subject “Earth” must also be acknowledged (again) within mathematics itself, so that mathematicians will become more enthusiastic about it. By now, it has become usual and good practice in application-oriented mathematical departments and research institutions to present applications in technology, economy, finances, and even medicine as being very career enhancing for young scientists. Geomathematics can be integrated smoothly into this phalanx with current subjects like exploration, geothermal research, navigation, and so on. Mathematics should be the leading science for the solution of these complex and economically very interesting problems, instead of fulfilling mere service functions. Of course, basic research is indispensable. We should not hide behind the other geosciences! Neither should we wait for the next horrible natural disaster! Now is the time to turn expressly toward georelevant applications. The Earth as a complex yet limited system (with its global problems concerning climate, environment, resources, and population) needs new political strategies. Consequently, these will step-by-step also demand changes in research and teaching at the universities due to our modified concept of “well-being” (e.g., concerning milieu, health, work, independence, financial situation, security, etc.). This will be a very difficult process. We dare to make the prognosis that, finally, it will also result in a modified appointment practice at the universities. Chairs in the field of geomathematics must increase in number and importance, in order to promote attractiveness, but also to accept a general responsibility for society. The time has come to realize that geomathematics is indispensable as a constituting discipline within a mathematical faculty (instead of “ivory tower-like” parity thinking following traditional structures). Additionally, the curricular standards and models for school lessons in mathematics (see, e.g., Sonar 2001; Bach et al. 2004) also have to change. We will not be able to afford any jealousies or objections on our way in that direction.

6 Geomathematics as Solution Potential

Current methods of applied measurement and evaluation processes vary strongly, depending on the examined measurement parameters (gravity, electric or magnetic field force, temperature and heat flow, etc.), the observed frequency domain, and the occurring basic “field characteristic” (potential field, diffusion field, or wave field, depending on the basic differential equations). In particular, the differential equation strongly influences the evaluation processes. The typical mathematical methods are therefore listed here according to the respective “field characteristic”, as is usually done in geomathematics:
• Potential methods (potential fields, elliptic differential equations) in geomagnetics, geoelectrics, gravimetry, etc.
• Diffusion methods (diffusion fields, parabolic differential equations) in flow and heat transport, magnetotellurics, geoelectromagnetics, etc.
• Wave methods (wave fields, hyperbolic differential equations) in seismics, georadar, etc.

The diversity of mathematical methods will increase in the future due to new technological developments in computer and measurement technology. More intensively than before, we must aim for the creation of models and simulations for combinations and networks of data and observable structures. The process (i.e., the “circuit”) for the solution of practical problems usually has the following components:
• Mathematical modeling: the practical problem is translated into the language of mathematics, requiring the cooperation between application-oriented scientists and mathematicians.
• Mathematical analysis: the resulting mathematical problem is examined as to its “well-posedness” (i.e., existence, uniqueness, and continuous dependence on the input data).
• Development of a mathematical solution method: appropriate analytical, algebraic, statistical/stochastic, and/or numerical methods and processes for a specific solution must be adapted to the problem; if necessary, new methods must be developed. The solution process is carried out efficiently and economically by the decomposition into individual operations, usually on computers.
• “Back-transfer” from the language of mathematics to applications: the results are illustrated adequately in order to ensure their evaluation. The mathematical model is validated on the basis of real data and modified, if necessary. We aim for good accordance of model and reality.

Often, the circuit must be applied several times in an iterative way in order to get a sufficient insight into the system Earth. Nonetheless, the advantage and benefit of the mathematical processes are a better, faster, cheaper, and more secure problem solution on the basis of the mentioned means of simulation, visualization, and reduction of large amounts of data. So, what is it exactly that enables mathematicians to build a bridge between the different disciplines? The mathematical world of numbers and shapes contains very efficient tokens by which we can describe the rulelike aspect of real problems.
This description includes a simplification by abstraction: essential properties of a problem are separated from unimportant ones and included into a solution scheme. Their “eye for similarities” often enables mathematicians to recognize a posteriori that an adequately reduced problem may also arise from very different situations, so that the resulting solutions may be applicable to multiple cases after an appropriate adaptation or concretization. Without this second step, abstraction remains essentially useless. The interaction between abstraction and concretization characterizes the history of mathematics and its current rapid development as a common language and independent science. A problem reduced by abstraction is considered as a new “concrete” problem to be solved within a general framework, which determines the validity of a possible solution. The more examples one knows, the more one recognizes the causality between the abstractness of mathematical concepts and their impact and cross-sectional importance. Of course, geomathematics is closely interconnected with geoinformatics, geoengineering, and geophysics. However, geomathematics basically differs from these disciplines (cf. Kümmerer 2002). Engineers and physicists need the mathematical language as a tool. In contrast to this, geomathematics also deals with the further development of the language itself. Geoinformatics concentrates on the design and architecture of processors and computers, databases and programming languages, etc., in a georeflecting environment (cf. chapters “Potential Methods and Geoinformation Systems” and “Geoinformatics”). In geomathematics, computers do not represent the objects to be studied, but instead represent technical auxiliaries for the solution of mathematical problems of georeality. 
Statistics (usually seen as a basic subdiscipline of mathematics) is generally devoted to the analysis and interpretation of uncertainties caused by limited sampling of a property under study. In consequence, the focus of geostatistics is the development and statistical validation of models to describe the distribution in time and space of Earth sciences phenomena. Geostatistical realizations (see, e.g., chapters “Selected Statistical Methods”, “Statistical Analysis of Climate Series”, “Geodetic Deformation Analysis with Respect to an Extended Uncertainty Budget”, and “It’s All About Statistics: Global Gravity Field Modeling from GOCE and Complementary Data”) aim at integrating physical constraints, combining heterogeneous data sources, and characterizing uncertainty. Applications cover a wide range of areas, for example, groundwater hydrology, air quality, and land use change using terrestrial as well as satellite data. Because of both the statistical distribution of sample data and the spatial correlation among the sample data, a large variety of Earth science problems are effectively addressed using statistical procedures. Stochastic systems and processes play an important role in mathematical models of phenomena in geosciences.
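As a compact reminder of the three “field characteristics” listed earlier in this section, the following display gives one textbook prototype equation per class. The symbols u (the field), κ (a diffusivity), and c (a propagation speed) are notation assumed here for illustration and are not taken from the handbook text; realistic geophysical models add sources, variable coefficients, and boundary or initial conditions.

```latex
\begin{align*}
\text{potential field (elliptic):} \quad & \Delta u = 0, \\
\text{diffusion field (parabolic):} \quad & \frac{\partial u}{\partial t} = \kappa\, \Delta u, \qquad \kappa > 0, \\
\text{wave field (hyperbolic):} \quad & \frac{\partial^2 u}{\partial t^2} = c^2\, \Delta u, \qquad c > 0.
\end{align*}
```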

7 Geomathematics as Solution Method

Up to now, ansatz functions for the description of geoscientifically relevant parameters have been frequently based on the almost spherical shape of the Earth. By modern satellite positioning methods, the maximum deviation of the actual Earth’s surface from the average Earth’s radius (6,371 km) can be shown to be less than 0.4 %. Although a mathematical formulation in a spherical context may be a restricted simplification, it is at least acceptable for a large number of problems (see chapter “Special Functions in Mathematical Geosciences: An Attempt at a Categorization” of this handbook for more details). In fact, ellipsoidal nomenclature (see chapter “Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics”) is much closer to geophysical and/or geodetic purposes, but the computational and numerical amount of work is a tremendous obstacle that has not been sufficiently overcome so far. Usually, in geosciences, we consider a separable Hilbert space such as the L²-space (of finite signature energy) with a (known) polynomial basis as reference space for ansatz functions. However, there is a striking difference between the L²-space and the Earth’s body and surface. Continuous “surface functions” can be described to arbitrary accuracy, for example, with respect to C- and L²-topology by restrictions of harmonic functions (such as spherical harmonics), whereas “volume functions” contain anharmonic ingredients (for more details see, e.g., Freeden and Gerhards 2013; Michel 2013). This fact has serious consequences for the reconstruction of signatures. Since the times of C.F. Gauß (1863, Werke, Bd. 5), a standard method for globally reflected surface approximation involving equidistributed data has been the Fourier series in an orthogonal basis in terms of spherical harmonics. It is characteristic for such an approach that these polynomial ansatz functions do not show any localization in space (cf. Fig. 7).
Fig. 7 Uncertainty principle and its consequences for space–frequency localization

In the momentum domain (throughout this work called frequency domain), each spherical function corresponds to exactly one single Fourier coefficient reflecting a certain frequency. We call this ideal frequency localization. Due to the ideal frequency localization and the simultaneous dispensation with localization in space, local data modifications influence all the Fourier coefficients (that have to be determined by global integration). Consequently, this also leads to global modifications of the data representations in case of local changes. Fourier expansions provide approximation by oscillation, i.e., the oscillations grow in number, while the amplitudes become smaller and smaller. Nevertheless, we may state that ideal frequency localization has proved to be extraordinarily advantageous due to the important physical interpretability (as multipole moments) of the model and due to the simple comparability of the Fourier coefficients with observables in geophysical and/or geodetic interrelations (see, e.g., the Meissl diagrams discussed by Meissl (1971), Rummel and van Gelderen (1995), Grafarend (2001), and Nutz (2002)). From a mathematical and physical point of view, however, certain kinds of ansatz functions would be desirable which show ideal frequency localization as well as localization in space. Such an ideal system of ansatz functions would allow for models of highest resolution in space; simultaneously, individual frequencies would remain interpretable. However, the principle of uncertainty, which connects frequency localization and space localization qualitatively and quantitatively, teaches us that both properties are mutually exclusive (except for the trivial case). Extreme ansatz functions in the sense of such an uncertainty principle are, on the one hand, spherical polynomials (see Fig. 8), i.e., spherical harmonics (no space localization, ideal frequency localization), and, on the other hand, the Dirac (kernel) function(als) (ideal space localization, no frequency localization). In consequence (see also chapter “Special Functions in Mathematical Geosciences: An Attempt at a Categorization” of this handbook, and for further details Freeden 1998, 2011; Freeden et al. 1998; Freeden and Maier 2002; Freeden and Schreiner 2009; Freeden and Gutting 2013), (spherical harmonic) Fourier methods are surely well suited to resolve low- and medium-frequency phenomena, while their application is critical when high-resolution models are to be obtained. This difficulty is also well known in theoretical physics, e.g., when describing monochromatic electromagnetic waves or considering the quantum-mechanical treatment of free particles. In this case, plane waves with fixed frequencies (ideal frequency localization, no space localization) are the solutions of the corresponding differential equations, but certainly do not reflect the physical reality.
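The statement that local data modifications influence all Fourier coefficients can be checked numerically in the simplest (zonal) setting, where a function depends only on t, the cosine of the polar distance, and is expanded in Legendre polynomials. The following is a minimal sketch under stated assumptions: the smooth signal, the narrow bump near t = 0.5, the maximal degree 20, and the thresholds are hypothetical choices made for illustration and do not come from the handbook.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def legendre_coefficients(values, nodes, weights, max_degree):
    """Coefficients c_n = (2n+1)/2 * integral of f(t) P_n(t) over [-1, 1]."""
    # Column n of this matrix holds P_n evaluated at the quadrature nodes.
    p = leg.legvander(nodes, max_degree)
    degrees = np.arange(max_degree + 1)
    return (2 * degrees + 1) / 2.0 * (p.T @ (weights * values))


# Gauss-Legendre quadrature rule on [-1, 1].
nodes, weights = leg.leggauss(200)

# Hypothetical smooth zonal signal and a hypothetical local modification.
signal = np.exp(nodes)
bump = 0.1 * np.exp(-((nodes - 0.5) / 0.05) ** 2)

max_degree = 20
before = legendre_coefficients(signal, nodes, weights, max_degree)
after = legendre_coefficients(signal + bump, nodes, weights, max_degree)

# Fraction of nodes where the data changed noticeably (space domain).
touched_nodes = np.mean(bump > 1e-6)
# Degrees whose coefficient changed noticeably (frequency domain).
changed_degrees = np.abs(after - before) > 1e-6

print(f"fraction of nodes modified: {touched_nodes:.2f}")
print(f"degrees with changed coefficient: {changed_degrees.sum()} of {max_degree + 1}")
```

In this sketch the data are altered only on a small part of the interval, yet the change is spread over the coefficients of many degrees, which is the behavior the text attributes to ansatz functions without space localization.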
As a remedy, plane waves of different frequencies are superposed to form so-called wave packets, which gain a certain amount of space localization while losing their ideal frequency (spectral) localization. A suitable superposition of polynomial ansatz functions (see Freeden and Schreiner 2009) leads to so-called kernel functions/kernels with reduced frequency localization but increased space localization (cf. Fig. 9). These kernels can be constructed so as to cover various spectral bands and, hence, can show all intermediate stages of frequency and space localization. The width of the corresponding frequency and space localization is usually controlled by a so-called scale parameter. If the kernel is given by a finite superposition of polynomial ansatz functions, it is said to be bandlimited,


Fig. 8 Spherical harmonics of low degrees leading to a zonal function, i.e., a sum of Legendre functions

while in the case of infinitely many ansatz functions, the kernel is called non-bandlimited. It turns out that, due to their higher frequency localization (short frequency band), the bandlimited kernels show less space localization than their non-bandlimited counterparts (infinite frequency band). This leads to the following characterization of ansatz functions: Fourier methods with polynomial trial functions are the canonical point of departure for approximations of low-frequency phenomena (global to regional modeling). Because of their excellent localization properties in the space domain, bandlimited and non-bandlimited kernels with increasing space localization properties can be used for increasingly fine modeling of short-wavelength phenomena (local modeling). Using kernels of different scales, the modeling approach can be adapted to the localization properties of the physical process under consideration. By use of sequences of scale-dependent kernels tending to the Dirac kernel, i.e., the so-called Dirac sequences, a multiscale approximation (i.e., “zooming-in” process) can be established appropriately. Later on in this treatise (see also chapters “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution”, “Mathematical Properties Relevant to Geomagnetic Field Modeling”, “Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction”, “The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data”, “Stokes Problem, Layer Potentials and Regularizations, Multiscale Applications”, “Potential-Field Estimation using Scalar and Vector Slepian Functions at Satellite Altitude”, and “Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-wave Velocities Near Surface Profiling”), we deal with simple (scalar and/or vectorial) wavelet techniques, i.e., with multiscale techniques based on special kernel functions: (spherical) scaling functions and wavelets.
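The interplay between frequency band and space localization described above can be made tangible numerically. The following is a minimal sketch, not the construction used in the cited references: it evaluates zonal kernels of the form K(t) = Σ (2n+1)/(4π) φ(n) P_n(t), where the weights φ(n) (here called `symbol`) are assumptions chosen for illustration, namely a bandlimited weight equal to one up to a maximal degree and the non-bandlimited Abel-Poisson weight h^n, truncated at a high degree for evaluation.

```python
import numpy as np
from numpy.polynomial import legendre


def zonal_kernel(t, symbol):
    """Evaluate K(t) = sum_n (2n+1)/(4*pi) * symbol[n] * P_n(t) for t = cos(angle)."""
    n = np.arange(len(symbol))
    coeffs = (2 * n + 1) / (4 * np.pi) * np.asarray(symbol, dtype=float)
    return legendre.legval(t, coeffs)


def shannon_symbol(max_degree):
    """Bandlimited symbol: all degrees 0..max_degree get weight 1."""
    return np.ones(max_degree + 1)


def abel_poisson_symbol(h, truncation=400):
    """Symbol h**n of the non-bandlimited Abel-Poisson kernel, truncated for evaluation."""
    return h ** np.arange(truncation + 1)


angles = np.linspace(0.0, np.pi, 721)
t = np.cos(angles)

for label, symbol in [
    ("bandlimited, degree <= 8", shannon_symbol(8)),
    ("bandlimited, degree <= 32", shannon_symbol(32)),
    ("Abel-Poisson, h = 0.9", abel_poisson_symbol(0.9)),
]:
    k = zonal_kernel(t, symbol)
    # Half-width: first angle at which the kernel drops below half of its peak value.
    half_width = np.degrees(angles[np.argmax(k < 0.5 * k[0])])
    print(f"{label}: peak {k[0]:.2f}, half-width {half_width:.1f} deg")
```

Widening the band (more degrees, or h closer to one) produces a higher and narrower peak, i.e., more space localization at the price of less frequency localization. The half-width used here is only one of several possible localization measures.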
Typically, the generating functions of scaling functions have the characteristics of low-pass filters, i.e., the polynomial basis functions of higher frequencies are attenuated or even


Fig. 9 Weighted summation of spherical harmonics leading to the generation of space-localized zonal kernels

completely left out. The generating functions of wavelets, however, have the typical properties of band-pass filters, i.e., the polynomial basis functions of low and high frequencies are attenuated or even completely left out when constructing the wavelet. Thus, wavelet techniques usually lead to a multiresolution of the Hilbert space under consideration, i.e., a certain two-parameter splitting with respect to scale and space. To be more concrete, the Hilbert space under consideration can be decomposed into a nested sequence of approximating subspaces – the scale spaces – corresponding to the scale parameter. In each scale space, a model of the data function can usually be calculated using the respective scaling functions, thus leading to an approximation of the data at a certain resolution. For increasing scales, the approximation improves and the information obtained on coarse levels is contained in all levels of approximation above. The difference between two successive approximations is called the detail information, and it is contained in the so-called detail spaces. The wavelets constitute the basis functions of the detail spaces, and summarizing the subject, every element of the Hilbert space can be represented as a structured linear combination of scaling functions and wavelets of different scales and at different positions (“multiscale approximation”) (cf. Freeden et al. (1998); Freeden and Michel (2004); Michel (2002); Michel (2013), and the references therein). Hence, we have found a possibility to break up complicated functions like geomagnetic field, electric currents, gravitational field, deformation field, oceanic currents, propagation speed of seismic waves, etc., into single pieces of different resolutions and to analyze these pieces separately and to decorrelate certain features. This helps to find adaptive methods (cf. Fig. 
10) that take into account the specific structure of the data, i.e., in areas where the data show only a few coarse spatial structures, the resolution of the model can be chosen to be rather low; in areas of complicated data structures, the resolution can be increased accordingly. In areas where the accuracy inherent in the measurements is reached, the solution process can be stopped by some kind of thresholding. That is, using scaling functions and

Fig. 10 Scaling functions (upper row) and wavelet functions (lower row) in mutual relation (“tree structure”) within a multiscale reconstruction


Fig. 11 Global multiscale reconstruction of the Earth’s Gravitational Model EGM96 (based on data taken from F. G. Lemoine et al. 1998)


wavelets at different scales, the corresponding approximation techniques can be constructed as to be suitable for the particular data situation. Consequently, although most data show correlation in space as well as in frequency, the kernel functions with their simultaneous space and frequency localization allow for the efficient detection and approximation of essential features in the data structure by only using fractions of the original information (decorrelation of signatures) (Fig. 11). Finally, it is worth mentioning that future spaceborne observation combined with terrestrial and airborne activities will provide huge data sets of the order of millions of data to be continued downward to the Earth’s surface (see chapters “Earth Observation Satellite Missions and Data Access”, “Satellite-to-Satellite Tracking (Low-Low/High-Low SST)”, “GOCE: Gravitational Gradiometry in a Satellite”, “Sources of the Geomagnetic Field and the Modern Data That EnableTheir Investigation”, “Mathematical Properties Relevant to Geomagnetic Field Modeling”, “Using B-Spline Expansions for Ionosphere Modeling”, “Radio Occultation via Satellites”, and “Analysis of Data from Multi-satellite Geospace Missions” concerning different fields of observation). Standard mathematical theory and numerical methods are not at all adequate for the solution of data systems with a structure such as this, because these methods are simply not adapted to the specific character of the spaceborne problems. They quickly reach their capacity limit even on very powerful computers. In our opinion, a reconstruction of significant geophysical quantities from future data material requires much more: for example, it requires a careful analysis, fast solution techniques, and a proper stabilization of the solution, usually including procedures of regularization (see Freeden 1999; Freeden and Mayer 2003 and the references therein). 
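The multiscale idea of scale spaces and detail spaces discussed above can be sketched in the frequency domain. The example below is only an illustration under stated assumptions: the low-pass symbol `scaling_symbol` (of Gauss-Weierstrass type) and the synthetic degree-wise signal are hypothetical choices, and the band-pass symbol of the wavelet is simply taken as the difference of two successive low-pass symbols.

```python
import numpy as np


def scaling_symbol(j, degrees):
    """Hypothetical low-pass symbol at scale j: exp(-n(n+1) * 4**(-j))."""
    return np.exp(-degrees * (degrees + 1) * 4.0 ** (-j))


def wavelet_symbol(j, degrees):
    """Band-pass symbol as the difference of two successive low-pass symbols."""
    return scaling_symbol(j + 1, degrees) - scaling_symbol(j, degrees)


degrees = np.arange(0, 65)
rng = np.random.default_rng(0)
# Synthetic degree-wise signal amplitudes (illustrative data, not a geophysical model).
signal = rng.normal(size=degrees.size) / (1.0 + degrees)

j0, j_max = 1, 6
approximation = scaling_symbol(j0, degrees) * signal   # coarse approximation at scale j0
for j in range(j0, j_max):
    detail = wavelet_symbol(j, degrees) * signal       # detail information at scale j
    approximation = approximation + detail             # next finer approximation
    residual = np.linalg.norm(signal - approximation)
    print(f"scale {j + 1}: residual norm {residual:.4f}")
```

Each step adds the detail information of one scale to the previous approximation, so the sum telescopes to the low-pass filtered signal at the finest scale, and the residual shrinks as the scale increases.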
In order to achieve these objectives, various strategies and structures must be introduced reflecting different aspects (cf. Fig. 12). As already pointed out, while global long-wavelength modeling can be adequately done by the use of polynomial expansions, it becomes more and more obvious that splines and/or wavelets are the most likely candidates for medium- and short-wavelength approximation. But the multiscale concept of wavelets requires a theory of its own which – in most cases – cannot be developed from the well-known theory in Euclidean spaces. In consequence, the stage is also set to present the essential ideas and results involving a multiscale framework to the geoscientific community (see chapters “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution”, “Multiresolution Analysis of Hydrology and Satellite Gravitational Data”, “Multiscale Model Reduction with Generalized Multiscale Finite Element

Fig. 12 Structural principles and methods


Methods in Geomathematics”, “Efficient Modeling of Flow and Transport in Porous Media Using Multiphysics and Multiscale Approaches”, “Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents”, “Stokes Problems, Layer Potentials and Regularizations, Multiscale Applications”, “Multiparameter Regularization in Downward Continuation of Satellite Data”, “Multidimensional Seismic Compression by Hybrid Transform with Multiscale Based Coding”, “Tomography: Problems and Multiscale Solutions”, “Splines and Wavelets on Geophysically Relevant Manifolds”, and “Multiscale Approximation” for different realizations).

8 Geomathematics: Three Exemplary “Circuits” In the sequel, the “circuit” geomathematics as solution potential will be demonstrated with respect to contents, origin, and intention on three completely different geoscientifically relevant examples, namely, the determination of the gravity field from terrestrial deflections of the vertical, the computation of (geostrophic) oceanic circulation from altimeter satellite data (i.e., RADAR data), and seismic processing from acoustic wave propagation. Their solution can be performed within the same mathematical context; it can be actually described within the same apparatus of concepts and formulas based on the use of fundamental solutions of associated partial differential equations and their appropriate regularizations. Nevertheless, the numerical methods must be specifically adapted to the concrete solution of any of the three problems; in this respect they are basically dissimilar. In the first example (i.e., the gravity field determination), we are confronted with a process of integrating the vectorial surface gradient equation; in the second example, (i.e., modeling of the oceanic circulation), we have to execute a process of differentiation following the vectorial surface curl gradient equation – in both cases to discretely given data which are assumed to be available on spheres, for simplicity. The first example uses vectorial data on the Earth’s surface, and the second example utilizes scalar satellite altimetry data. The essential goal of the third example concerned with seismic (post)processing is to transfer the existing signal, which resulted from acoustic wave propagation from seismograms, under the expectation that designated properties of the target bedrock such as a migration result or a velocity field model can be interpreted from the transformed information. 
In more detail, the acoustic waves are reflected at the places of impedance contrast (rapid changes of the medium density), propagated back, and then recorded on the surface and/or in available boreholes by receivers of seismic energy. The recorded seismograms are carefully processed to detect fractures along with their location, orientation, and aperture, which are needed for interpretation and definition of the target reservoir. In all our three examples, our work is based on a simple regularization procedure of fundamental solutions to associated partial differential equations and the realization of a multiscale approach leading to locally supported wavelets.

8.1 Circuit: Gravity Field from Deflections of the Vertical

The modeling of the gravity field and its equipotential surfaces, especially the geoid of the Earth (see chapters “Earth Observation Satellite Missions and Data Access”, “Satellite-to-Satellite Tracking (Low-Low/High-Low SST)”, “GOCE: Gravitational Gradiometry in a Satellite”, “Classical Physical Geodesy”, “Geodetic Boundary Value Problem”, “Time-Variable Gravity Field and Global Deformation of the Earth”, “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution”, “Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics”,

Fig. 13 Gravity-involved processes (From “German Priority Research Program: Mass transport and mass distribution in the system Earth, DFG–SPP 1257”)


“Multiresolution Analysis of Hydrology and Satellite Gravitational Data”, “Time Varying Mean Sea Level”, “Gravitational Viscoelastodynamics”, “Oblique Stochastic Boundary-Value Problem”, “It’s All About Statistics: Global Gravity Field Modeling from GOCE and Complementary Data”, “Analysis of Data from Multi-Satellite Geospace Missions”, and “Geodetic World Height System Unification”), is essential for many applications (see Fig. 13 for a graphical illustration), from which we only mention some significant examples (essentially following Rummel 2002): Earth system: there is a growing awareness of global environmental problems (e.g., the CO2 question, the rapid decrease of rain forests, global sea level changes, etc.). What is the role of the future terrestrial campaigns, airborne duties, and satellite activities in this context? They do not tell us the reasons for physical processes, but it is essential to bring the phenomena into one system (e.g., to make sea level records comparable in different parts of the world). In other words, the geoid, i.e., the equipotential surface at sea level as defined by Listing (1873), is viewed as an almost static reference for many rapidly changing processes and at the same time as a “frozen picture” of tectonic processes that evolved over geological time spans. Solid earth physics: the gravity anomaly field has its origin mainly in mass inhomogeneities of the continental and oceanic lithosphere. Together with height information and regional tomography, a much deeper understanding of tectonic processes should be obtainable in the future. Physical oceanography: the altimeter satellites in combination with a precise geoid will deliver global dynamic ocean topography. Global surface circulation can be computed resulting in a completely new dimension of ocean modeling. Circulation allows the determination of transport processes of, e.g., polluted material. 
Satellite orbits: for any positioning from space, the uncertainty in the orbit of the spacecraft is the limiting factor. The future spaceborne techniques will basically eliminate all gravitational uncertainties in satellite orbits. Geodesy and civil engineering: accurate heights are needed for civil constructions, mapping, etc. They are obtained by leveling, a very time-consuming and expensive procedure. Nowadays geometric heights can be obtained fast and efficiently from space positioning (e.g., GPS, GLONASS, and (future) GALILEO). The geometric heights are convertible to leveled heights by subtracting the precise geoid (see Fig. 22), which is implied by a high-resolution gravitational potential. To be more specific, in those areas where good gravity information is already available, the future data information will eliminate all medium- and long-wavelength distortions in unsurveyed areas. GPS, GLONASS, and GALILEO together with the planned explorer satellite missions for the post-2015 time frame will provide extremely high-quality height information at the global scale. Exploration geophysics and prospecting: airborne gravity measurements have usually been used together with aeromagnetic surveys, but the poor precision of airborne gravity measurements has hindered a wider use of this type of measurements. Strong improvements can be expected by combination with terrestrial and spaceborne observations in the future scenario. The basic interest (see, e.g., Bauer et al. (2014)) in gravitational methods in exploration is based on the small variations in the gravitational field anomalies in relation to, e.g., an ellipsoidal reference model, i.e., the so-called “normal” gravitational field.

8.1.1 Mathematical Modeling of the Gravity Field (Classical Approach)

Gravity as provided on the Earth’s surface by absolute and/or relative measurements (see Fig. 14) is the combined effect of the gravitational mass attraction and the centrifugal force due to the Earth’s rotation. The force of gravity provides a directional structure to the space above the Earth’s surface. It is tangential to the vertical plumb lines and perpendicular to all level surfaces. Any water surface


at rest is part of a level surface. To a first approximation, i.e., as if the Earth were a homogeneous, spherical body, gravity is constant all over the Earth’s surface, the well-known quantity $9.8\ \mathrm{m\,s^{-2}}$. In this approximation, the plumb lines are directed toward the Earth’s center of mass, and this implies that all level surfaces are nearly spherical, too. The gravity decreases from the poles to the equator by about $0.05\ \mathrm{m\,s^{-2}}$ (see Fig. 15). This is caused by the flattening of the Earth’s figure and the negative effect of the centrifugal force, which is maximal at the equator. High mountains and deep ocean trenches (cf. Fig. 16) cause the gravity to vary. Materials within the Earth’s interior are not uniformly distributed. The irregular gravity field shapes the geoid as virtual surface. The level surfaces are ideal reference surfaces, for example, for heights. The gravity acceleration (gravity) $w$ is the resultant of gravitation $v$ and centrifugal acceleration $c$ (see chapters “Classical Physical Geodesy”, “Geodetic Boundary Value Problem”, “Time Varying Mean Sea Level”, and “Geodetic World Height System Unification”):

$$w = v + c. \tag{1}$$

The centrifugal force $c$ arises as a result of the rotation of the Earth about its axis. Here, we assume a rotation of constant angular velocity around the rotational axis $x_3$, which is further assumed to be fixed with respect to the Earth. The centrifugal acceleration acting on a unit mass is directed outward, perpendicularly to the spin axis (see Fig. 17). If the $\varepsilon^3$-axis of an Earth-fixed coordinate system coincides with the axis of rotation, then we have $c = \nabla C$, where $C$ is the so-called centrifugal potential (with $\{\varepsilon^1, \varepsilon^2, \varepsilon^3\}$ the canonical orthonormal system in Euclidean space $\mathbb{R}^3$). The direction of the gravity $w$ is known as the direction of the plumb line, and the quantity $g = |w|$ is called the gravity intensity (often just gravity). The gravity potential of the Earth can be expressed in the form

$$W = V + C. \tag{2}$$

The (vectorial) gravity acceleration $w$ (see chapter “Gravitational Viscoelastodynamics” for more details) is given by

$$w = \nabla W = \nabla V + \nabla C. \tag{3}$$
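The decomposition of gravity into gravitation and centrifugal acceleration can be illustrated by a small computation. The following sketch rests on assumptions that are not taken from the text: the Earth is replaced by a point mass with the standard value of the geocentric gravitational constant, the angular velocity and equatorial radius are standard reference values, and the centrifugal potential is written in its usual form C(x) = ½ ω² (x₁² + x₂²) for rotation about the third axis.

```python
import numpy as np

GM = 3.986004418e14    # assumed geocentric gravitational constant in m^3 s^-2
OMEGA = 7.292115e-5    # assumed angular velocity of the Earth in rad/s
R_EQUATOR = 6378137.0  # assumed equatorial radius in m


def gravitation_point_mass(x):
    """Gravitation v of a point mass at the origin (illustrative stand-in for grad V)."""
    x = np.asarray(x, dtype=float)
    return -GM * x / np.linalg.norm(x) ** 3


def centrifugal_potential(x):
    """C(x) = 0.5 * OMEGA**2 * (x1**2 + x2**2) for rotation about the third axis."""
    return 0.5 * OMEGA**2 * (x[0] ** 2 + x[1] ** 2)


def centrifugal_acceleration(x):
    """c(x) = grad C(x) = OMEGA**2 * (x1, x2, 0), directed away from the spin axis."""
    return OMEGA**2 * np.array([x[0], x[1], 0.0])


def gravity(x):
    """Gravity acceleration w = v + c as in (1)."""
    return gravitation_point_mass(x) + centrifugal_acceleration(x)


equator = np.array([R_EQUATOR, 0.0, 0.0])
pole = np.array([0.0, 0.0, R_EQUATOR])  # same radius: only the centrifugal effect differs

print("centrifugal acceleration at the equator:", np.linalg.norm(centrifugal_acceleration(equator)))
print("gravity intensity g at the equator:", np.linalg.norm(gravity(equator)))
print("gravity intensity g on the spin axis:", np.linalg.norm(gravity(pole)))
```

Because both test points have the same distance from the origin, the difference in g is due to the centrifugal part alone; the additional contribution of the Earth’s flattening mentioned in the text is deliberately not modeled in this sketch.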

Fig. 14 The falling apple: Newton’s approach to absolute (left) and relative (right) gravity observation


Fig. 15 Illustration of the gravity intensity: constant, i.e., $9.8\ \mathrm{m\,s^{-2}}$ (left), decreasing from the poles to the equator by about $0.05\ \mathrm{m\,s^{-2}}$ (mid), and real simulation (right)

Fig. 16 Illustration of the constituents of the gravity intensity g (ESA medialab, ESA communication production SP–1314)

The surfaces of constant gravity potential $W(x) = \text{const}$, $x \in \mathbb{R}^3$, are designated as equipotential (level, or geopotential) surfaces of gravity. The gravity potential $W$ of the Earth is the sum of the gravitational potential $V$ and the centrifugal potential $C$, i.e., $W = V + C$. In an Earth-fixed coordinate system, the centrifugal potential $C$ is explicitly known. Hence, the determination of equipotential surfaces of the potential $W$ is strongly related to the knowledge of the gravitational potential $V$. The gravity vector $w$ given by $w = \nabla W$ is normal to the equipotential surface passing through the same point. Thus, equipotential surfaces intuitively express the notion of tangential surfaces, as they are normal to the plumb lines given by the direction of the gravity vector (cf. Fig. 18). The traditional concept in gravitational field modeling is based on the assumption that all over the Earth the position (i.e., latitude and longitude) and the scalar gravity $g$ are available. Moreover, it is common practice that the gravitational effects of the sun, moon, Earth’s atmosphere, etc., are accounted for by means of corrections. The gravitational part of the gravity potential can then be regarded as a harmonic function in the exterior of the Earth. A classical approach to gravity field modeling was conceived by Stokes (1849), Helmert (1981), Neumann (1887). They proposed to reduce the given gravity accelerations from the Earth’s surface to the geoid. As the geoid is a level surface, its potential value is constant. The difference between the reduced gravity on the geoid and the reference gravity on the so-called normal ellipsoid is called the gravity anomaly. The disturbing potential, i.e., the difference between the actual and the reference potential, can be obtained from a (third) boundary value problem of potential theory. Its solution is representable in integral form, i.e., by the Stokes integral.
The disadvantage of the Stokes approach is that the reduction to the geoid requires the introduction of assumptions concerning the unknown mass distribution between the Earth’s surface and the geoid (for more details concerning the classical theory, the reader is referred, e.g., to chapter “Gravitational Viscoelastodynamics” and the references therein).

Fig. 17 Gravitation $v$; centrifugal acceleration $c$; gravity acceleration $w$ (the figure also marks the direction of the plumb line and the center of mass)

Fig. 18 Level surface and plumb line

Fig. 19 Regularity at infinity

Next, we briefly recapitulate the classical approach to global gravity field determination by formulating the differential/integral relations between gravity disturbance, gravity anomaly, and deflections of the vertical on the one hand, and the disturbing potential and the geoidal undulations on the other hand. The representations of the disturbing potential in terms of gravity disturbances, gravity anomalies, and deflections of the vertical are written in terms of well-known integral representations over the geoid. For practical purposes the integrals are replaced by approximate formulas using certain integration weights and knots conventionally within a spherical framework (see chapter “Numerical Integration on the Sphere” for a survey paper on numerical integration on the sphere). Equipotential surfaces of the gravity potential $W$ allow in general no simple representation (see Figs. 20, 21, 22). This is the reason why a reference surface – in physical geodesy usually an ellipsoid of revolution – is chosen for the (approximate) construction of the geoid (see Fig. 22). As a matter of fact, the deviations of the gravity field of the Earth from the normal field of such an ellipsoid are small. The remaining parts of the gravity field are gathered in a so-called disturbing gravity field $\nabla T$ corresponding to the disturbing potential $T$. Knowing the gravity potential, all equipotential surfaces – including the geoid – are given by an equation of the form $W(x) = \text{const}$.


By introducing $U$ as the normal gravity potential corresponding to the ellipsoidal field and $T$ as the disturbing potential, we are led to a decomposition of the gravity potential in the form

$$W = U + T \tag{4}$$

such that

(C1) the center of the ellipsoid coincides with the center of gravity of the Earth,
(C2) the difference of the mass of the Earth and the mass of the ellipsoid is zero.

According to the classical Newton law of gravitation (1687), knowing the density distribution of a body, the gravitational potential can be computed everywhere in $\mathbb{R}^3$ (see chapter “Time Varying Mean Sea Level” for more information in time–space-dependent relation). More explicitly, the gravitational potential $V$ of the Earth’s exterior is given by $V(x) = \frac{G}{4\pi} \int_{\mathrm{Earth}} \rho(y)\,|x - y|^{-1}\, dV(y)$, $x \in \mathbb{R}^3 \setminus \mathrm{Earth}$, where $G$ is the gravitational constant ($G = 6.6742 \cdot 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$) and $\rho$ is the density function. The properties of the gravitational potential $V$ in the Earth’s exterior are easily described as follows: $V$ is harmonic in $\mathbb{R}^3 \setminus \mathrm{Earth}$, i.e., $\Delta V(x) = 0$, $x \in \mathbb{R}^3 \setminus \mathrm{Earth}$. Moreover, the gravitational potential $V$ is regular at infinity, i.e.,

Fig. 20 Geoidal surface over oceans (top) and over the whole Earth (bottom), modeled by smoothed Haar wavelets, Freeden et al. (2009), Geomathematics Group, Kaiserslautern

Fig. 21 Geoidal surface (provided by R. Haagmans, European Space Agency, Earth Surfaces and Interior Section, ESTEC, Noordwijk, ESA ID number SEMLXEOA90E)

$$|V(x)| = O\!\left(\frac{1}{|x|}\right), \quad |x| \to \infty, \tag{5}$$

$$|\nabla V(x)| = O\!\left(\frac{1}{|x|^2}\right), \quad |x| \to \infty. \tag{6}$$

Note that for suitably large values $|x|$ (see Fig. 19), we have $|y| \le \frac{1}{2}|x|$, hence, $|x - y| \ge \big||x| - |y|\big| \ge \frac{1}{2}|x|$. However, the actual problem is that in reality the density distribution $\rho$ is very irregular and only known for parts of the upper crust of the Earth. Actually, geoscientists would like to know it from measuring the gravitational field (gravimetry problem). Even if the Earth is supposed to be spherical, the determination of the gravitational potential by integrating Newton’s potential is not achievable (see chapters “Classical Physical Geodesy”, “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”, “Sparsity in Inverse Geophysical Problems”, and “RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences” for inversion methods of Newton’s law).

Remark 1. As already mentioned, the classical remedy avoiding any knowledge of the density inside the Earth is the formulation of boundary value problems of potential theory to determine the external gravitational potential from terrestrial data. For reasons of demonstration, however, here we do not follow the standard (Vening–Meinesz) approach in physical geodesy of obtaining the disturbing potential from deflections of the vertical as terrestrial data (as explained in chapter “Classical Physical Geodesy”). Our approach is based on the context as developed by Freeden and Schreiner (2009) and Freeden and Wolf (2008). Furthermore, it should be noted that the determination of the Earth’s gravitational potential under the assumptions of nonspherical geometry and terrestrial oblique derivatives in the form of a (deterministic or stochastic) boundary value problem of potential theory is discussed, e.g., in chapters “Geodetic Boundary Value Problem” and “Oblique Stochastic Boundary-Value Problem”.

Fig. 22 Illustration of the gravity anomaly vector $\alpha(x) = w(x) - u(y)$ and the gravity disturbance vector $\delta(x) = w(x) - u(x)$ (the figure shows the geoid $W = \text{const} = W_0$, the reference ellipsoid $U = \text{const} = U_0$, and the geoidal height $N(x)$ between $x$ and $y$)

Further basic details concerning oblique derivative problems can be found in Freeden and Michel (2004), Gutting (2007) and in chapters “Geodetic Boundary Value Problem”, “Oblique Stochastic Boundary-Value Problem”, and “Fast Spherical/Harmonic Spline Modeling”. A point $x$ of the geoid is projected onto the point $y$ of the ellipsoid by means of the ellipsoidal normal (see Fig. 22). The distance between $x$ and $y$ is called the geoidal height or geoidal undulation. The gravity anomaly vector is defined as the difference between the gravity vector $w(x)$ and the normal gravity vector $u(y)$, $u = \nabla U$, i.e., $\alpha(x) = w(x) - u(y)$ (see Fig. 22). It is also possible to distinguish the vectors $w$ and $u$ at the same point $x$ to get the gravity disturbance vector $\delta(x) = w(x) - u(x)$. Of course, several basic mathematical relations between the quantities just mentioned are known (see, e.g., chapters “Classical Physical Geodesy” and “Geodetic World Height System Unification”). In what follows, we only heuristically describe the fundamental relations. We start by observing that the gravity disturbance vector at the point $x$ can be written as

$$\delta(x) = w(x) - u(x) = \nabla (W(x) - U(x)) = \nabla T(x). \tag{7}$$

Expanding the potential $U$ at $x$ according to Taylor’s theorem and truncating the series at the linear term, we get

$$U(x) \doteq U(y) + \frac{\partial U}{\partial \nu'}(y)\, N(x) \tag{8}$$

($\doteq$ means approximation in the linearized sense). Here, $\nu'(y)$ is the ellipsoidal normal at $y$, i.e., $\nu'(y) = -u(y)/\gamma(y)$, $\gamma(y) = |u(y)|$, and the geoid undulation $N(x)$, as indicated in Fig. 22, is the aforementioned distance between $x$ and $y$, i.e., between the geoid and the reference ellipsoid. Using

$$\gamma(y) = |u(y)| = -\nu'(y) \cdot u(y) = -\nu'(y) \cdot \nabla U(y) = -\frac{\partial U}{\partial \nu'}(y) \tag{9}$$

we arrive at


$$N(x) = \frac{T(x) - (W(x) - U(y))}{|u(y)|} = \frac{T(x) - (W(x) - U(y))}{\gamma(y)}.$$

Letting $U(y) = W(x) = \text{const} = W_0$, we obtain the so-called Bruns’ formula

$$N(x) = \frac{T(x)}{\gamma(y)}. \tag{10}$$
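As a small worked example of formula (10), the following sketch converts a disturbing potential value into a geoidal height. The numerical values are purely illustrative assumptions (a disturbing potential of 294 m² s⁻² and a normal gravity of 9.8 m s⁻²) and are not taken from any data set.

```python
def geoid_undulation(disturbing_potential, normal_gravity):
    """Bruns' formula N = T / gamma; T in m^2 s^-2, gamma in m s^-2, result in m."""
    return disturbing_potential / normal_gravity


T_example = 294.0      # assumed disturbing potential in m^2 s^-2
gamma_example = 9.8    # assumed normal gravity in m s^-2

N_example = geoid_undulation(T_example, gamma_example)
print(f"geoidal height: {N_example:.2f} m")
```

With these assumed values, the geoidal height comes out at roughly 30 m, which shows how a physical quantity (a potential difference) translates into a geometric one (a height).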

It should be noted that Bruns’ formula (10) relates the physical quantity $T$ to the geometric quantity $N$ (cf. Bruns 1878). In what follows we are interested in introducing the deflections of the vertical of the gravity disturbing potential $T$. For this purpose, let us consider the vector field $\nu(x) = -w(x)/|w(x)|$. This gives us the identity (with $g(x) = |w(x)|$ and $\gamma(x) = |u(x)|$)

$$w(x) = \nabla W(x) = -|w(x)|\,\nu(x) = -g(x)\,\nu(x). \tag{11}$$

Furthermore, we have

$$u(x) = \nabla U(x) = -|u(x)|\,\nu'(x) = -\gamma(x)\,\nu'(x). \tag{12}$$

The deflection of the vertical $\Theta(x)$ at the point $x$ on the geoid is defined to be the angular (i.e., tangential) difference between the directions $\nu(x)$ and $\nu'(x)$, i.e., the plumb line and the ellipsoidal normal in the same point:

$$\Theta(x) = \nu(x) - \nu'(x) - \big((\nu(x) - \nu'(x)) \cdot \nu(x)\big)\,\nu(x). \tag{13}$$
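Definition (13) can be checked numerically. The following sketch assumes an illustrative configuration that is not taken from the text: the ellipsoidal normal points along the third coordinate axis, and the plumb line direction is tilted against it by 10 arcseconds.

```python
import numpy as np


def deflection_of_vertical(nu, nu_prime):
    """Theta = nu - nu' - ((nu - nu') . nu) nu, following (13)."""
    diff = nu - nu_prime
    return diff - np.dot(diff, nu) * nu


tilt = np.radians(10.0 / 3600.0)                       # assumed tilt of 10 arcseconds
nu_prime = np.array([0.0, 0.0, 1.0])                   # assumed ellipsoidal normal
nu = np.array([np.sin(tilt), 0.0, np.cos(tilt)])       # plumb line direction (unit vector)

theta = deflection_of_vertical(nu, nu_prime)
print("Theta:", theta)
print("Theta . nu:", np.dot(theta, nu))
print("|Theta| in arcseconds:", np.degrees(np.linalg.norm(theta)) * 3600.0)
```

The computed vector is tangential in the sense that its inner product with the plumb line direction vanishes up to rounding, and its length equals the sine of the tilt angle, which for such small angles is practically the angle itself.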

Clearly, because of (13), $\Theta(x)$ is orthogonal to $\nu(x)$, i.e., $\Theta(x) \cdot \nu(x) = 0$. Since the plumb lines are orthogonal to the level surfaces of the geoid and the ellipsoid, respectively, the deflections of the vertical give, briefly speaking, a measure of the gradient of the level surfaces. This aspect will be described in more detail below: from (11), we obtain, in connection with (13),

$$w(x) = \nabla W(x) = -|w(x)| \Big(\Theta(x) + \nu'(x) + \big((\nu(x) - \nu'(x)) \cdot \nu(x)\big)\,\nu(x)\Big). \tag{14}$$

Altogether, we get for the gravity disturbance vector

$$w(x) - u(x) = \nabla T(x) = -|w(x)| \Big(\Theta(x) + \big((\nu(x) - \nu'(x)) \cdot \nu(x)\big)\,\nu(x)\Big) - \big(|w(x)| - |u(x)|\big)\,\nu'(x). \tag{15}$$

The magnitude

$$D(x) = |w(x)| - |u(x)| = g(x) - \gamma(x) \tag{16}$$

is called the gravity disturbance, while


A.x/ D jw.x/j  ju.y/j D g.x/  .y/

(17)

is called the gravity anomaly. Since the vector .x/   0 .x/ is (almost) orthogonal to  0 .x/, it can be neglected in (15). Hence, it follows that w.x/  u.x/ D rT .x/ : D jw.x/j‚.x/  .jw.x/j  ju.x/j/  0 .x/:

(18)

The gradient rT .x/ can be split into a normal part (pointing in the direction of  0 .x/) and an angular (tangential) part (characterized by the surface gradient r  ). It follows that rT .x/ D

@T 1  .x/ 0 .x/ C r T .x/: 0 @ jxj

(19)

By comparison of (18) and (19), we therefore obtain D.x/ D g.x/  .x/ D jw.x/j  ju.x/j D 

@T .x/; @ 0

(20)

i.e., the gravity disturbance, besides being the difference in magnitude of the actual and the normal gravity vector, is also the normal component of the gravity disturbance vector. In addition, we are led to the angular, i.e., (tangential) differential, equation 1  r T .x/ D jw.x/j ‚.x/: jxj

(21)

Remark 2. The reference ellipsoid deviates from a sphere only by quantities of the order of the flattening. Therefore, in numerical calculations, we treat the reference ellipsoid as a sphere around the origin with (certain) mean radius $R$. This may cause a relative error of the same order (for more details the reader is referred to standard textbooks of physical geodesy). In this way, together with suitable prereduction processes of gravity, formulas are obtained that are rigorously valid for the sphere.

Remark 3. Since $|\Theta(x)|$ is a small quantity, it may be (without loss of precision) multiplied either by $|w(x)|$ or by $|u(x)|$, i.e., by $g(x)$ or by $\gamma(x)$.

In well-known spherical approximation, we have

$$\gamma(y) = |u(y)| = \frac{GM}{|y|^2}, \tag{22}$$

$$\frac{\partial \gamma}{\partial \nu'}(y) = \frac{y}{|y|} \cdot \nabla_y \gamma(y) = -2\, \frac{GM}{|y|^3} \tag{23}$$

and

$$\frac{1}{\gamma(y)}\, \frac{\partial \gamma}{\partial \nu'}(y) = -\frac{2}{|y|}, \tag{24}$$

where $G$ is the gravitational constant and $M$ is the mass. If now, as explained above, the relative error between normal ellipsoid and mean sphere of radius $R$ is permissible, we are allowed to go over to the spherical nomenclature $x = R\xi$, $R = |x|$, $\xi \in \mathbb{S}^2$, with $\mathbb{S}^2$ being the unit sphere in $\mathbb{R}^3$. Replacing $|u(R\xi)|$ by its spherical approximation $GM/R^2$, we find

$$\nabla^*_\xi T(R\xi) = -\frac{GM}{R}\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2. \tag{25}$$

By virtue of Bruns' formula (10), we finally find the relation between geoidal undulations and deflections of the vertical

$$\frac{GM}{R^2}\, \nabla^*_\xi N(R\xi) = -\frac{GM}{R}\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2, \tag{26}$$

i.e.,

$$\nabla^*_\xi N(R\xi) = -R\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2. \tag{27}$$
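Relation (27) can be illustrated numerically. The following Python sketch approximates the surface gradient on the unit sphere by central differences of the radially constant extension of a function and then evaluates $\Theta = -\nabla^* N / R$. The helper names, the step size, the mean radius of 6,371 km, and the test undulation $N(R\xi) = 30\,\mathrm{m} \cdot \xi_3$ are assumptions for illustration and are not taken from the text.

```python
import numpy as np

R = 6_371_000.0  # assumed mean Earth radius in metres


def surface_gradient(f, xi, h=1e-6):
    """Central-difference estimate of the surface gradient of f at xi.

    f is a function on the unit sphere; it is extended to a neighbourhood
    of the sphere as a radially constant function, whose Euclidean
    gradient on the sphere coincides with the surface gradient.
    """
    xi = np.asarray(xi, dtype=float)
    xi = xi / np.linalg.norm(xi)
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        plus, minus = xi + step, xi - step
        grad[i] = (f(plus / np.linalg.norm(plus))
                   - f(minus / np.linalg.norm(minus))) / (2.0 * h)
    # Project out any residual normal component caused by rounding.
    return grad - np.dot(grad, xi) * xi


def deflection_from_undulation(N, xi):
    """Deflection of the vertical from geoid undulations via (27)."""
    return -surface_gradient(N, xi) / R


# Hypothetical undulation field: 30 m times the third coordinate of xi.
def undulation(xi):
    return 30.0 * xi[2]


xi = np.array([np.cos(np.pi / 6), 0.0, np.sin(np.pi / 6)])  # 30 deg N, 0 deg E
theta = deflection_from_undulation(undulation, xi)
print("Theta =", theta)
print("|Theta| in arcseconds =", np.rad2deg(np.linalg.norm(theta)) * 3600.0)
```

For this test field the surface gradient is known in closed form, $\nabla^* N = 30\,\mathrm{m} \cdot (\varepsilon^3 - \xi_3\, \xi)$, which allows a direct check of the finite-difference result.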

In other words, the knowledge of the geoidal undulations allows the determination of the deflections of the vertical by taking the surface gradient on the unit sphere. From the identity (20), it follows that

$$-\frac{\partial T}{\partial \nu'}(x) = D(x) = |w(x)| - \gamma(x) \doteq |w(x)| - \gamma(y) - \frac{\partial \gamma}{\partial \nu'}(y)\, N(x) = A(x) - \frac{\partial \gamma}{\partial \nu'}(y)\, N(x), \tag{28}$$

where $A$ represents the scalar gravity anomaly as defined by (17). Observing Bruns' formula, we get

$$A(x) = -\frac{\partial T}{\partial \nu'}(x) + \frac{1}{\gamma(y)}\, \frac{\partial \gamma}{\partial \nu'}(y)\, T(x). \tag{29}$$
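To see how (20) and (29) act in spherical approximation, where the normal derivative becomes the radial derivative and (24) gives the factor $-2/R$, the following Python sketch evaluates both quantities for a single outer harmonic term $T(r) = c\,(R/r)^{n+1}$. The choice of this test potential, the amplitude, the degree, and all function names are assumptions for illustration. For such a term one expects $D = (n+1)\,T/R$ and $A = (n-1)\,T/R$ on the sphere of radius $R$, a standard result of physical geodesy.

```python
import math

R = 6_371_000.0  # assumed mean Earth radius in metres


def harmonic_potential(n, c):
    """Radial profile T(r) = c * (R/r)**(n+1) of a degree-n outer harmonic."""
    return lambda r: c * (R / r) ** (n + 1)


def radial_derivative(T, r, h=1.0):
    """Central-difference approximation of dT/dr (step h in metres)."""
    return (T(r + h) - T(r - h)) / (2.0 * h)


def gravity_disturbance(T, r=R):
    """(20) in spherical approximation: D = -dT/dr."""
    return -radial_derivative(T, r)


def gravity_anomaly(T, r=R):
    """(29) with (24) in spherical approximation: A = -dT/dr - (2/R) T."""
    return -radial_derivative(T, r) - 2.0 * T(r) / R


n, c = 10, 100.0  # hypothetical degree and amplitude (m^2/s^2)
T = harmonic_potential(n, c)
print("D =", gravity_disturbance(T), " expected", (n + 1) * c / R)
print("A =", gravity_anomaly(T), " expected", (n - 1) * c / R)
```

The two quantities differ only by the term $2T/R$, which is why gravity anomalies are insensitive to the degree-one part of the disturbing potential.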

Fig. 23 Absolute values of the deflections of the vertical and their directions computed from EGM96 (cf. Lemoine et al. 1996) from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) ($\min = 0$, $\max = 3.0 \cdot 10^{-4}$)

In the sense of physical geodesy, the meaning of the spherical approximation should be carefully kept in mind. It is used only for expressions relating to small quantities of the disturbing potential, the geoidal undulations, the gravity disturbances, the gravity anomalies, etc. Together with the Laplace equation in the exterior of the sphere around the origin with radius $R$ and the regularity at infinity, the equations

$$-D(x) = \frac{x}{|x|} \cdot \nabla T(x) \tag{30}$$

and

$$-A(x) = \frac{x}{|x|} \cdot \nabla T(x) + \frac{2}{R}\, T(x) \tag{31}$$

represent the so-called fundamental equations of physical geodesy in spherical approximation (see, e.g., Heiskanen and Moritz 1967; Groten 1979; Torge 1991). Actually, the identities (30) and (31), respectively, serve as boundary conditions of boundary value problems of potential theory, which are known as the Neumann and Stokes problems (see also chapters “Classical Physical Geodesy” and “Geodetic Boundary Value Problem” of this handbook for more details). The study of Figs. 23–26 leads us to the following remarks: the gravity disturbances, which enable a physically oriented comparison of the real Earth with the ellipsoidal Earth model, are consequences of the imbalance of forces inside the Earth according to Newton's law of gravitation. This suggests the presence of density anomalies. By analogy, the difference between the actual level surfaces of the gravity potential and the level surfaces of the model body forms a measure for the deviation of the Earth from a hydrostatic state of balance. In particular, the geoidal undulations (geoidal heights) represent the deviations from the equipotential surface on the mean level of the ellipsoid. The geoidal anomalies generally show no essential correlation to the distribution of the continents (see Fig. 20). It can be conjectured that the geoidal undulations mainly depend on the reciprocal distance of the density anomaly. They are influenced by lateral density variations of large vertical extension, from the core–mantle layer to the crustal layers. In fact, the direct geoidal signal, which would result from the attraction of the continental and the oceanic bottom, would be several hundreds of meters (for more details, see Rummel (2002) and the references therein). In consequence, the weights of the continental and oceanic masses in the Earth's interior are almost perfectly balanced. This is the phenomenon of isostatic compensation that was observed by P. Bouguer already in the year 1750.

Fig. 24 Disturbing potential computed from EGM96 from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) ($\min \approx -1038\ \mathrm{m^2/s^2}$, $\max \approx 833\ \mathrm{m^2/s^2}$)

Fig. 25 Gravity disturbances computed from EGM96 from degree 2 up to degree 360 (reconstructed by the use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) ($\min = -3.6 \cdot 10^{-3}\ \mathrm{m/s^2}$, $\max = 4.4 \cdot 10^{-3}\ \mathrm{m/s^2}$)

8.1.2 Mathematical Analysis

Fig. 26 Gravity anomalies computed from EGM96 from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) ($\min = -3.4 \cdot 10^{-3}\ \mathrm{m/s^2}$, $\max = 4.4 \cdot 10^{-3}\ \mathrm{m/s^2}$)

Equation (27) under consideration is of vectorial tangential type with the unit sphere $\mathbb{S}^2$ in Euclidean space $\mathbb{R}^3$ as the domain of definition (see also chapters “Gauss' and Weber's “Atlas of Geomagnetism” (1840) Was Not the First: The History of the Geomagnetic Atlases,” “Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation”, “Convection Structures of Binary Fluid Mixtures in Porous Media”, “Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents”, “Toroidal-Poloidal Decompositions of Electromagnetic Green's Functions in Geomagnetic Induction”, “Using B-Spline Expansions for Ionosphere Modeling”, and “Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude” for relations to geomagnetism). More precisely, we are confronted with a surface gradient equation $\nabla^* P = p$ with $p$ as given continuous vector field (i.e., $p(\xi) = -R\, \Theta(R\xi)$, $\xi \in \mathbb{S}^2$) and $P$ as continuously differentiable scalar field that is desired to be reconstructed. In this abstract form, the equation $\nabla^* P = p$ certainly does not determine the surface potential function $P$ uniquely: we are able to add an arbitrary constant to $P$ without changing the equation. For our problem, however, this argument is not valid, since we have to observe additional integration relations resulting from the conditions (C1) and (C2), namely,

$$\int_{\mathbb{S}^2} P(\xi)\, (\varepsilon^i \cdot \xi)^k \, dS(\xi) = 0; \quad k = 0, 1; \quad i = 1, 2, 3. \tag{32}$$

Consequently, we are able to guarantee uniqueness. The solution theory is based on the Green theorem (see Freeden et al. 1998; Freeden and Schreiner 2009; Freeden and Gerhards 2010; Gerhards 2011; Freeden and Gerhards 2013)

$$P(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} P(\eta)\, dS(\eta) - \int_{\mathbb{S}^2} \nabla^*_\eta G(\xi \cdot \eta) \cdot \nabla^*_\eta P(\eta)\, dS(\eta), \tag{33}$$

where

$$t \mapsto G(t) = \frac{1}{4\pi} \ln(1 - t) + \frac{1}{4\pi} (1 - \ln 2), \quad t \in [-1, 1), \tag{34}$$

is the Green function (i.e., the fundamental solution) with respect to the (Laplace)–Beltrami operator on the unit sphere $\mathbb{S}^2$. This leads us to the following result: suppose that $p$ is a given continuous tangential, curl-free vector field on the unit sphere $\mathbb{S}^2$. Then $P$ given by

$$P(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \frac{1}{1 - \xi \cdot \eta}\, \big( \xi - (\xi \cdot \eta)\, \eta \big) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{35}$$

is the uniquely determined solution of the surface gradient equation $\nabla^* P = p$ with

$$\int_{\mathbb{S}^2} P(\eta)\, dS(\eta) = 0. \tag{36}$$

Consequently, the existence and uniqueness of the solution of the equation is assured. Moreover, the solution admits a representation in the form of the singular integral (35).

8.1.3 Development of a Mathematical Solution Method

Precise terrestrial data of the deflections of the vertical are not available all over the Earth in dense distribution. They exist, e.g., on continental areas in much larger density than on oceanic ones. In order to exhaust the existent scattered data reservoir, we are not allowed to apply a Fourier technique in terms of spherical harmonics (note that, according to Weyl's concept, the integrability is equivalently interrelated to the equidistribution of the data points). For more details see Weyl (1916) (one-dimensional theory); Freeden et al. (1998); Freeden and Wolf (2008); Freeden and Schreiner (2009); Freeden and Gerhards (2010) (spherical theory). Instead, we have to use an appropriate zooming-in procedure which starts globally from (initial) rough data width (low scale) and proceeds to finer and finer local data width (higher scales). A simple solution for a multiscale approximation consists of an appropriate “regularization” of the singular kernel

$$\Phi: t \mapsto \Phi(t) = (1 - t)^{-1}, \quad -1 \le t < 1, \tag{37}$$

in (35) by the continuous kernel $\Phi^\rho: t \mapsto \Phi^\rho(t)$, $t \in [-1, 1]$, $\rho \in (0, 2)$, given by

$$\Phi^\rho(t) = \begin{cases} \dfrac{1}{1 - t}, & 1 - t > \rho, \\[2mm] \dfrac{1}{\rho}, & 1 - t \le \rho. \end{cases} \tag{38}$$

In fact, it is not difficult to verify (see Freeden and Schreiner (2009) and the references therein) that

$$P^\rho(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \Phi^\rho(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{39}$$

where the surface curl-free space-regularized Green vector scaling kernel is given by

$$(\xi, \eta) \mapsto \Phi^\rho(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2, \tag{40}$$

satisfies the following limit relation:

$$\lim_{\substack{\rho \to 0 \\ \rho > 0}} \; \sup_{\xi \in \mathbb{S}^2} \left| P(\xi) - P^\rho(\xi) \right| = 0. \tag{41}$$
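The regularized integral (39) and the limit relation (41) can be explored with a small numerical experiment. The Python sketch below is only an illustration under stated assumptions: the quadrature rule (Gauss–Legendre in the polar coordinate combined with an equidistant rule in longitude), the grid size, and the test potential $P(\xi) = \xi \cdot \varepsilon^3$ with $p = \nabla^* P = \varepsilon^3 - (\xi \cdot \varepsilon^3)\, \xi$ are choices made here, not part of the text. For this particular test potential, the Funk–Hecke formula gives the closed form $P^\rho(\xi) = (1 - \rho/2 + \rho^2/12)\, P(\xi)$, which is used as a reference value.

```python
import numpy as np


def sphere_quadrature(n_lat=200, n_lon=400):
    """Nodes and weights of a product quadrature rule on the unit sphere."""
    z, wz = np.polynomial.legendre.leggauss(n_lat)   # polar coordinate t = cos(theta)
    lam = 2.0 * np.pi * np.arange(n_lon) / n_lon     # equidistant longitudes
    s = np.sqrt(1.0 - z**2)
    eta = np.stack([np.outer(s, np.cos(lam)),
                    np.outer(s, np.sin(lam)),
                    np.outer(z, np.ones(n_lon))], axis=-1).reshape(-1, 3)
    w = np.repeat(wz * (2.0 * np.pi / n_lon), n_lon)  # weights sum to 4*pi
    return eta, w


def phi_reg(t, rho):
    """Regularized kernel (38): 1/(1-t) for 1-t > rho, 1/rho otherwise."""
    return 1.0 / np.maximum(1.0 - t, rho)


def potential_regularized(xi, p, rho, eta, w):
    """Quadrature approximation of P^rho(xi) as defined in (39)."""
    t = eta @ xi                                  # inner products xi . eta
    tangential = xi - t[:, None] * eta            # xi - (xi . eta) eta
    integrand = phi_reg(t, rho) * np.sum(tangential * p(eta), axis=1)
    return float(w @ integrand) / (4.0 * np.pi)


E3 = np.array([0.0, 0.0, 1.0])


def p_test(eta):
    """Surface gradient of the test potential P(eta) = eta . e3."""
    return E3 - (eta @ E3)[:, None] * eta


xi = np.array([1.0, 2.0, 2.0]) / 3.0   # evaluation point on the unit sphere
eta, w = sphere_quadrature()
for j in range(1, 6):
    rho = 2.0 ** (1 - j)
    numeric = potential_regularized(xi, p_test, rho, eta, w)
    closed = xi[2] * (1.0 - rho / 2.0 + rho**2 / 12.0)
    print(f"j={j}  rho={rho:.4f}  numeric={numeric:.5f}  "
          f"closed form={closed:.5f}  exact P={xi[2]:.5f}")
```

In this example the deviation from the exact potential behaves like $\rho/2$ for small $\rho$, in line with (41). Since the kernel is only continuous at $1 - t = \rho$, a simple product rule converges slowly there, so a production code would typically use a quadrature adapted to the spherical cap.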

Furthermore, in the scale-discrete formulation using a strictly monotonically decreasing sequence $(\rho_j)_{j \in \mathbb{N}_0}$ converging to zero with $\rho_j \in (0, 2]$ (for instance, $\rho_j = 2^{1-j}$ or $\rho_j = 1 - \cos(2^{-j}\pi)$, $j \in \mathbb{N}_0$), we are allowed to write the surface curl-free space-regularized Green vector scaling kernel as follows:

$$\varphi_{\rho_j}(\xi, \eta) = \Phi^{\rho_j}(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2. \tag{42}$$

Fig. 27 Absolute values and the directions of the surface curl-free space-regularized Green vector scaling kernel $\varphi_{\rho_j}(\xi, \eta)$, $\xi, \eta \in \mathbb{S}^2$, as defined by Eq. (40) with $\xi$ fixed and located at 0° N, 0° W and $\rho_j = 2^{1-j}$ for scales $j = 1, 2, 3$, respectively

The surface curl-free space-regularized Green vector wavelet kernel (cf. Fig. 27) then reads as

$$\psi_{\rho_j}(\xi, \eta) = \Psi^{\rho_j}(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2, \tag{43}$$

where

$$\Psi^{\rho_j} = \Phi^{\rho_{j+1}} - \Phi^{\rho_j} \tag{44}$$

is explicitly given by

$$\Psi^{\rho_j}(t) = \begin{cases} 0, & 1 - t > \rho_j, \\[1mm] \dfrac{1}{1 - t} - \dfrac{1}{\rho_j}, & \rho_j \ge 1 - t > \rho_{j+1}, \\[2mm] \dfrac{1}{\rho_{j+1}} - \dfrac{1}{\rho_j}, & \rho_{j+1} \ge 1 - t. \end{cases} \tag{45}$$

$W^{\rho_j}$ defined by

$$W^{\rho_j}(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \psi_{\rho_j}(\xi, \eta) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{46}$$

leads us to the recursion

$$P^{\rho_{j+1}} = P^{\rho_j} + W^{\rho_j}. \tag{47}$$

Hence, we immediately get by elementary manipulations for all $m \in \mathbb{N}$ (cf. Fig. 28)

$$P^{\rho_{j+m}} = P^{\rho_j} + \sum_{k=0}^{m-1} W^{\rho_{j+k}}. \tag{48}$$

Fig. 28 “Zooming-in” strategy choosing a hotspot (here, Galapagos (0° N, 91° W)) as target area. The colored areas illustrate the local support of the wavelets for increasing scale levels (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)
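The kernel relations (44), (45), and the telescoping structure behind (48) can be checked directly at the level of the one-dimensional kernels. The following Python sketch assumes the scale sequence $\rho_j = 2^{1-j}$ mentioned in the text as an example; the function names are hypothetical.

```python
import numpy as np


def rho(j):
    """Assumed scale sequence rho_j = 2**(1 - j)."""
    return 2.0 ** (1 - j)


def phi(t, r):
    """Regularized scaling kernel (38)."""
    return 1.0 / np.maximum(1.0 - t, r)


def psi_difference(t, j):
    """Wavelet kernel as a difference of scaling kernels, see (44)."""
    return phi(t, rho(j + 1)) - phi(t, rho(j))


def psi_explicit(t, j):
    """Wavelet kernel in the piecewise form (45)."""
    s = 1.0 - t
    rj, rj1 = rho(j), rho(j + 1)
    middle = 1.0 / np.maximum(s, rj1) - 1.0 / rj   # equals 1/(1-t) - 1/rho_j there
    inner = 1.0 / rj1 - 1.0 / rj
    return np.where(s > rj, 0.0, np.where(s > rj1, middle, inner))


t = np.linspace(-1.0, 1.0, 2001)

# (44) and (45) describe the same function.
for j in range(0, 6):
    assert np.allclose(psi_difference(t, j), psi_explicit(t, j))

# Telescoping sum: scaling kernel at scale j+m from scale j plus wavelets.
j, m = 1, 4
reconstructed = phi(t, rho(j)) + sum(psi_difference(t, j + k) for k in range(m))
assert np.allclose(reconstructed, phi(t, rho(j + m)))
print("kernel identities (44), (45) and the telescoping sum hold on the test grid")
```

Because the integrals (39) and (46) are linear in the kernel, the same telescoping argument carries over from the kernels to $P^{\rho_j}$ and $W^{\rho_j}$.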

The functions $P^{\rho_j}$ represent low-pass filters of $P$. Obviously, $P^{\rho_j}$ is improved by the band-pass filter $W^{\rho_j}$ in order to obtain $P^{\rho_{j+1}}$, while $P^{\rho_{j+1}}$ is improved by the band-pass filter $W^{\rho_{j+1}}$ in order to obtain $P^{\rho_{j+2}}$, etc. Summarizing our results, we are allowed to formulate the following conclusion: three features are incorporated in our way of thinking about multiscale approximation by the use of locally supported wavelets, namely, basis property, decorrelation, and fast computation. More concretely, our vector wavelets are “building blocks” for huge discrete data sets. By virtue of the basis property, the function $P$ can be better and better approximated from $p$ with increasing scale $j$. Our wavelets have the power to decorrelate. In other words, the representation of data in terms of wavelets is somehow “more compact” than the original representation. We search for an accurate approximation by only using a small fraction of the original information of a potential. Typically, our decorrelation is achieved by vector wavelets which have a compact support (localization in space) and which show a decay toward high frequencies (see Fig. 28). The main challenge in multiscale approximation is how to decompose the function under consideration into wavelet coefficients and how to efficiently reconstruct the function from the coefficients. There is a “tree algorithm” (cf. also Fig. 11) that makes these steps simple and fast (see Freeden et al. 1998):

$$\begin{array}{ccccc} & W^{\rho_j} & & W^{\rho_{j+1}} & \\ & \searrow & & \searrow & \\ P^{\rho_j} \;\longrightarrow & \oplus & \longrightarrow\; P^{\rho_{j+1}} \;\longrightarrow & \oplus & \longrightarrow\; P^{\rho_{j+2}} \;\cdots \end{array}$$
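The reconstruction step of the tree algorithm is a running sum: each low-pass level is the previous level plus one band-pass contribution. The Python sketch below shows this bookkeeping for fields sampled on a common grid. The function names and the synthetic test data are assumptions for illustration; the actual computation of the band-pass parts would require the integrals (46).

```python
from itertools import accumulate

import numpy as np


def decompose(levels):
    """Split low-pass levels P_j, P_{j+1}, ... into P_j and band-pass parts."""
    band_passes = [finer - coarser for coarser, finer in zip(levels, levels[1:])]
    return levels[0], band_passes


def reconstruct(low_pass, band_passes):
    """Tree algorithm: P_{j+k+1} = P_{j+k} + W_{j+k}, starting from P_j."""
    return list(accumulate(band_passes, lambda P, W: P + W, initial=low_pass))


# Synthetic stand-in for low-pass filtered versions of a sampled potential.
samples = np.sin(np.linspace(0.0, np.pi, 50))
levels = [(1.0 - 2.0 ** (-j)) * samples for j in range(1, 6)]

low_pass, band_passes = decompose(levels)
rebuilt = reconstruct(low_pass, band_passes)

for original, again in zip(levels, rebuilt):
    assert np.allclose(original, again)
print(f"reconstructed {len(rebuilt)} levels from 1 low-pass and "
      f"{len(band_passes)} band-pass parts")
```

Storing the band-pass parts instead of all low-pass levels is what makes local refinement cheap: only the wavelet contributions in the area of interest need to be added.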

The fast decorrelation power of wavelets is the key to applications such as data compression, fast data transmission, noise cancellation, signal recovering, etc. With increasing scales $j \to \infty$, the supports become smaller and smaller. This is the reason why the calculation of the integrals has to be extended over smaller and smaller caps (“zooming in”). Of course, downsizing spherical caps and increasing data widths are in strong correlation. Thus, the variable width of the caps with increasing scale parameter $j$ enables the integration of data sets of heterogeneous data width for local areas without violating Weyl's law of equidistribution.

8.1.4 “Back-Transfer” to Application

The multiscale techniques as presented here will be used to investigate the anomalous gravity field, particularly for areas in which mantle plumes and hotspots occur. In this respect it should be noted that mantle plume is a geoscientific term which denotes an upwelling of abnormally hot rocks within the Earth's mantle (cf. Fig. 13). Plumes (cf. Ritter and Christensen 2007) are envisioned to be vertical conduits in which the hot mantle material rises buoyantly from the lower mantle to the lithosphere at velocities as large as 1 m yr$^{-1}$, and these quasi-cylindrical regions have a typical diameter of about 100–200 km. In mantle convection theory, mantle plumes are related to hotspots, which describe centers of surface volcanism that are not directly caused by plate tectonic processes. A hotspot is a long-term source of volcanism which is fixed relative to the plate overriding it. A classical example is Hawaii. Due to the local nature of plumes and hotspots such as Hawaii, we have to use high-resolution gravity models. Because of the lack of terrestrial-only data, the GFZ-combined gravity model EIGEN-GL04C is used, which consists of satellite data, gravimetry, and altimetry surface data. The “zooming-in” property of our analysis is of great advantage. In particular, the locally compact wavelet turns out to be an essential tool of the (vectorial) multiscale decomposition of the deflections of the vertical and correspondingly the (scalar) multiscale approximation of the disturbing potential. In the multiscale analysis (cf. Figs. 29 and 30), several interesting observations can be made. By comparing the different positions and different scales, we can see that the maximum of the “energy” contained in the signal of the disturbing potential, measured in the so-called wavelet variances (see Freeden and Michel 2004 for more details), starts in the northwest of the Hawaiian islands for scale $j = 2$ and travels in southeastern direction with increasing scale. It ends up, for scale $j = 12$, in a position at the geologically youngest island under which the mantle plume is assumed to exist. Moreover, in the multiscale resolution with increasing scale, more and more local details of the disturbing potential appear. In particular, the structure of the Hawaiian island chain is clearly reflected in the scale and space decomposition. Obviously, the “energy peak” observed at the youngest island of Hawaii is far above the “energy intensity level” of the rest of the island chain. This seems to strongly corroborate the belief in a stationary mantle plume, which is located beneath the Hawaiian islands and is responsible for the creation of the Hawaii–Emperor seamount chain, while the oceanic lithosphere is continuously passing over it. An interesting area is the southeastern part of the chain, situated on the Hawaiian swell, a 1,200 km broad anomalously shallow region of the ocean floor, extending from the island of Hawaii to the Midway atoll. Here a distinct geoidal anomaly occurs that has its maximum around the youngest island and coincides with the maximum topography, and both decrease in northwestern direction.

Fig. 29 Approximation of the vector-valued vertical deflections $\Theta$ in [m s$^{-2}$] of the Hawaiian region with smoothed Haar wavelets (a rough low-pass filtering at scale 6 is improved by several band-pass filters of scale $j = 6, \ldots, 11$, where the last picture shows the multiscale approximation at scale $j = 12$) (from the PhD-thesis Fehlinger (2009), Geomathematics Group, University of Kaiserslautern)

Fig. 30 Multiscale reconstruction of the disturbing potential $T$ in [m$^2$ s$^{-2}$] from vertical deflections for the Hawaiian (plume) area using regularized vector Green functions (a rough low-pass filtering at scale $j = 6$ is improved by several band-pass filters of scale $j = 6, \ldots, 11$, where the last illustration shows the approximation of the disturbing potential $T$ at scale $j = 12$) (from the PhD-thesis Fehlinger (2009), Geomathematics Group, University of Kaiserslautern)

8.2 Circuit: Oceanic Circulation from Ocean Topography

Ocean flow (see chapters “Time Varying Mean Sea Level”, “Self-Attraction and Loading of Oceanic Masses”, “Unstructured Meshes in Large-Scale Ocean Modeling”, and “Asymptotic Models for Atmospheric Flows”) has a great influence on mass transport and heat exchange. By modeling oceanic currents, we therefore gain, for instance, a better understanding of weather and climate. In what follows we devote our attention to the geostrophic oceanic circulation on bounded regions on the sphere (and in a first approximation, the oceanic surfaces under consideration may be assumed to be parts of the boundary of a spherical Earth model), i.e., to oceanic flow under the simplifying assumptions of stationarity, spherically reflected horizontal velocity, and strict neglect of inner frictions. This leads us to inner-oceanic long-scale currents, which still give meaningful results, as, for example, for the phenomenon of El Niño.

8.2.1 Mathematical Modeling of Ocean Flow

The numerical simulation of ocean currents is based on the Navier–Stokes equation. Its formulation (see, e.g., Ansorge and Sonar 2009) is well known: let us consider a fluid occupying an arbitrary (open and bounded) subdomain $G_0 \subset \mathbb{R}^3$ at time $t = 0$. The vector function $v: [0, t_{\mathrm{end}}] \times G_0 \to G_t \subset \mathbb{R}^3$ describes the motion of the particle positions $\pi \in G_0$ with time, so that at times $t \ge 0$ the fluid occupies the domain $G_t = \{ v(t; \pi) \mid \pi \in G_0 \}$. Hence, $G_t$ is a closed system in the sense that no fluid particle flows across its boundaries. The path of a particle $\pi \in G_0$ is given by the graph of the function $t \mapsto v(t; \pi)$, and the velocity of the fluid at a fixed location $x = v(t; \pi) \in G_t$ by the derivative $u(t; x) = \frac{\partial}{\partial t} v(t; \pi)$. The derivation of the governing equations relies on the conservation of mass and momentum. The essential tool is the transport theorem, which shows how the time derivative of an integral over a domain changing with time may be computed. The mass of a fluid occupying a domain is determined by the integral over the density $\rho$ of the fluid. Since the same amount of fluid occupying the domain at time $0$ later occupies the domain at time $t > 0$, we must have that $\int_{G_0} \rho(0; x)\, dV(x)$ coincides with $\int_{G_t} \rho(t; x)\, dV(x)$ for all $t \in (0, t_{\mathrm{end}}]$. Therefore, the derivative of mass with respect to time must vanish, upon which the transport theorem yields for all $t$ and $G_t$

$$\int_{G_t} \left( \frac{\partial}{\partial t} \rho(t; x) + \operatorname{div}(\rho u)(t; x) \right) dV(x) = 0. \tag{49}$$

Since this is valid for arbitrary regions (in particular, for arbitrarily small ones), this implies that the integrand itself vanishes, which yields the continuity equation for compressible fluids

$$\frac{\partial}{\partial t} \rho + \operatorname{div}(\rho u) = 0. \tag{50}$$

The momentum of a solid body is the product of its mass with its velocity; for the fluid occupying $G_t$, it is given by

$$\int_{G_t} \rho(t; x)\, u(t; x)\, dV(x). \tag{51}$$

According to Newton's second law, the rate of change of (linear) momentum is equal to the sum of the forces acting on the fluid. We distinguish two types of forces, viz., body forces $k$ (e.g., gravity, Coriolis force), which can be expressed as $\int_{G_t} \rho(t; x)\, k(t; x)\, dV(x)$ with a given force density $k$ per unit volume, and surface forces (e.g., pressure, internal friction) representable as $\int_{\partial G_t} \sigma(t; x)\, \nu(x)\, dS(x)$, which includes the stress tensor $\sigma(t; x)$. Thus, Newton's law reads

$$\frac{d}{dt} \int_{G_t} \rho(t; x)\, u(t; x)\, dV(x) = \int_{G_t} \rho(t; x)\, k(t; x)\, dV(x) + \int_{\partial G_t} \sigma(t; x)\, \nu(x)\, dS(x). \tag{52}$$

If we now apply the product rule and the transport theorem componentwise to the term on the left and apply the divergence theorem to the second term on the right, we obtain the momentum equation

$$\frac{\partial}{\partial t} (\rho u) + (u \cdot \nabla)(\rho u) + (\rho u)\, \nabla \cdot u = \rho k + \nabla \cdot \sigma. \tag{53}$$

The nature of the oceanic flow equation depends heavily on the model used for the stress tensor. In the special case of incompressible fluids (here, ocean water), characterized by a density $\rho(t; x) = \rho_0 = \mathrm{const}$ dependent neither on space nor on time, we find $\nabla \cdot u = 0$, i.e., $u$ is divergence-free (for a discussion of (53) on the unit sphere, the reader is also referred to Fengler and Freeden (2005) and the references therein). When modeling an inviscid fluid, internal friction is neglected and the stress tensor is determined solely by the pressure, $\sigma(t; x) = -P(t; x)\, \mathbf{i}$ ($\mathbf{i}$ is the unit matrix). In the absence of inner friction (in consequence of, e.g., effects of wind and surface influences), we are able to ignore the derivative $\frac{\partial u}{\partial t}$ and, hence, the dependence on time. As relevant volume forces $k$, the gravity field $w$ and the Coriolis force $c = 2u \wedge \omega$ remain and have to be taken into account. Finally, for large-scale currents of the ocean, the nonlinear part does not play any role, i.e., the term $(u \cdot \nabla) u$ is negligible. Under all these very restrictive assumptions, the equation of motion (53) reduces to the following identity:

$$2\omega \wedge u \doteq -\frac{\nabla P}{\rho_0} - w. \tag{54}$$

Even more, we suppose a velocity field of a spherical layer model. For each layer, i.e., for each sphere around the origin $0$ with radius $r\ (\le R)$, the velocity field $u$ can be decomposed into a normal field $u_{\mathrm{nor}}$ and a tangential field $u_{\mathrm{tan}}$. The normal part is negligibly small in comparison to the tangential part (see the considerations in Pedlovsky (1979)). Therefore, we obtain with $\omega = \Omega\, (\xi \cdot \varepsilon^3)\, \xi$ (note that the expression

$$C(\xi) = 2\Omega\, (\varepsilon^3 \cdot \xi) \tag{55}$$

is called the Coriolis parameter) the following separations of Eq. (54) (observe that $u_{\mathrm{tan}}(r\xi) \cdot \xi = 0$, $\xi \in \mathbb{S}^2$):

$$C(\xi)\, \xi \wedge u_{\mathrm{tan}}(r\xi) \doteq -\frac{1}{\rho_0\, r}\, \nabla^*_\xi P(r\xi) \tag{56}$$

and

$$\big( C(\xi)\, \xi \wedge u_{\mathrm{tan}}(r\xi) \big) \cdot \xi = -\frac{1}{\rho_0}\, \frac{\partial}{\partial r} P(r\xi) + g_r. \tag{57}$$

Equation (56) essentially tells us that the tangential surface gradient is balanced by the Coriolis force. For simplicity, in our approach, we regard the gravity acceleration as a normal field: $w(r\xi) = -g_r\, \xi$, $\xi \in \mathbb{S}^2$ (with $g_r$ as mean gravity intensity). Moreover, the vertical Coriolis acceleration in comparison to the tangential motion is very small, that is, we are allowed to assume that $\big( C(\xi)\, \xi \wedge u_{\mathrm{tan}}(R\xi) \big) \cdot \xi = 0$ with $C$ given by (55). On the surface of the Earth (here, $r = R$), we then obtain from (57) a direct relation of the product of the mean density and the mean gravity acceleration to the normal pressure gradient (hydrostatic approximation):

$$\frac{\partial}{\partial r} P(R\xi) = \rho_0\, g_r. \tag{58}$$

This is the reason why we obtain the pressure by integration (see also chapters “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution” and “Toroidal-Poloidal Decompositions of Electromagnetic Green's Functions in Geomagnetic Induction” of this handbook) as follows:

$$P(R\xi) = \rho_0\, g_R\, \Xi(R\xi) + P_{\mathrm{Atm}}, \tag{59}$$

where $P_{\mathrm{Atm}}$ denotes the mean atmospheric pressure. The quantity $\Xi(R\xi)$ (cf. Fig. 31) is the difference between the heights of the ocean surface and the geoid at the point $\xi \in \mathbb{S}^2$. The scalar function $\xi \mapsto \Xi(R\xi)$, $\xi \in \mathbb{S}^2$, is called ocean topography. By use of altimeter satellites, we are able to measure the difference $H$ between the known satellite height $H_{\mathrm{Sat}}$ and the (unknown) height of the ocean surface $H_{\mathrm{Ocean}}$: $H = H_{\mathrm{Sat}} - H_{\mathrm{Ocean}}$. After calculation of $H_{\mathrm{Ocean}}$, we then get the ocean topography $\Xi = H_{\mathrm{Ocean}} - H_{\mathrm{Geoid}}$ from the known geoidal height $H_{\mathrm{Geoid}}$.

Fig. 31 Ocean topography (confer “German Priority Research Program: Mass transport and Mass distribution in the system Earth, DFG-SPP 1257”)

In connection with (56), this finally leads us to the equation

$$2\Omega\, (\varepsilon^3 \cdot \xi)\, \xi \wedge u_{\mathrm{tan}}(R\xi) = -\frac{g_R}{R}\, \nabla^*_\xi \Xi(R\xi). \tag{60}$$

Remembering the surface curl gradient $L^*_\xi = \xi \wedge \nabla^*_\xi$, we are able to conclude

$$\xi \wedge \big( \omega \wedge u_{\mathrm{tan}}(R\xi) \big) = -\frac{g_R}{2R}\, L^*_\xi \Xi(R\xi), \tag{61}$$

i.e.,

$$C(\xi)\, u_{\mathrm{tan}}(R\xi) = \frac{g_R}{R}\, L^*_\xi \Xi(R\xi). \tag{62}$$
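Equation (62) can be turned into a small numerical sketch that maps an ocean topography to a tangential velocity. In the following Python snippet the constants (mean radius, mean gravity, angular velocity of the Earth), the finite-difference surface gradient, and the test topography $\Xi(R\xi) = 1\,\mathrm{m} \cdot \xi_3$ are assumptions chosen for illustration; they are not data from the text.

```python
import numpy as np

OMEGA = 7.292115e-5   # assumed angular velocity of the Earth in rad/s
R = 6_371_000.0       # assumed mean Earth radius in metres
G_R = 9.81            # assumed mean gravity at the surface in m/s^2
E3 = np.array([0.0, 0.0, 1.0])


def surface_gradient(f, xi, h=1e-6):
    """Central-difference surface gradient of a function on the unit sphere."""
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        plus, minus = xi + step, xi - step
        grad[i] = (f(plus / np.linalg.norm(plus))
                   - f(minus / np.linalg.norm(minus))) / (2.0 * h)
    return grad - np.dot(grad, xi) * xi


def geostrophic_velocity(topography, xi):
    """Tangential velocity from ocean topography according to (62)."""
    xi = np.asarray(xi, dtype=float)
    xi = xi / np.linalg.norm(xi)
    coriolis = 2.0 * OMEGA * np.dot(E3, xi)          # Coriolis parameter (55)
    if abs(coriolis) < 1e-12:
        raise ValueError("the geostrophic balance degenerates at the equator")
    curl_gradient = np.cross(xi, surface_gradient(topography, xi))  # xi ^ grad*
    return G_R / (R * coriolis) * curl_gradient


# Hypothetical topography: 1 m times the third coordinate of xi.
def topography(xi):
    return 1.0 * xi[2]


xi = np.array([np.cos(np.pi / 6), 0.0, np.sin(np.pi / 6)])  # 30 deg N, 0 deg E
u = geostrophic_velocity(topography, xi)
print("u_tan   =", u, "m/s")
print("|u_tan| =", np.linalg.norm(u), "m/s")
```

For this test field the flow is purely zonal, with a speed of about 1.8 cm/s at 30° N. The division by the Coriolis parameter shows why the geostrophic balance cannot be used near the equator.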

Equation (62) is the equation of the geostrophic oceanic flow.

8.2.2 Mathematical Analysis

Again, we have an equation of vectorial tangential type given on the unit sphere $\mathbb{S}^2$. This time, however, we have to deal with an equation of the surface curl gradient $L^* S = s$ (with $s(\xi) = \frac{2\Omega R}{g_R}\, (\varepsilon^3 \cdot \xi)\, u_{\mathrm{tan}}(R\xi)$ and $S = \Xi$). The solution theory providing the surface stream function $S$ from the surface divergence-free vector field $s$ would be in accordance with our considerations above (with $\nabla^*$ replaced by $L^*$). The computation of the geostrophic oceanic flow simply is a problem of differentiation, namely, the computation of the derivative $L^*_\xi S(\xi) = \xi \wedge \nabla^*_\xi S(\xi)$, $\xi \in \mathbb{S}^2$.

8.2.3 Development of a Mathematical Solution Method

The point of departure for determining the geostrophic oceanic flow (as derived above from the basic hydrodynamic equation) is the ocean topography, which is obtainable via satellite altimetry (see chapter “Classical Physical Geodesy” for observational details). As a scalar field on the spherical Earth, the ocean topography consists of two ingredients. First, on an Earth at rest, the water masses would align along the geoid related to a (standard) reference ellipsoid. Second, satellite measurements provide altimetric data of the actual ocean surface height, which is also given in relation to the (standard) reference ellipsoid. The difference between these quantities is understood to be the actual ocean topography. In other words, the ocean topography is defined as the deviation of the ocean surface from the geoidal surface, which is here assumed to be due to the geostrophic component of the ocean currents. The data used for our demonstration are extracted from the French CLS01 model (in combination with the EGM96 model).

The calculation of the derivative $L^* S$ is not realizable, at least not directly. Also in this case, we are confronted with discrete data material that, in addition, is only available for oceanic areas. In the geodetic literature a spherical harmonic approach is usually used in the form of a Fourier series. The vectorial isotropic operator $L^*$ (cf. Freeden and Schreiner 2009 for further details) is then applied to the resulting Fourier series, unfortunately at the cost of losing its vectorial isotropy when decomposed in terms of scalar components. The results are scalar components of the geostrophic flow. The serious difficulty with global polynomial structures such as spherical harmonics (with $\Xi = 0$ on continents!) is the occurrence of the Gibbs phenomenon close to the coast lines (see, e.g., Nerem and Koblinski 1994; Albertella et al. 2008). In this respect it should be mentioned additionally that the equation of the geostrophic oceanic flow cannot be regarded as adequate for coastal areas; hence, our modeling fails for these areas. An alternative approach avoiding numerically generated oscillations in coastal areas is the application of kernels with local support, e.g., smoothed Haar kernels (see Freeden et al. (1998) and the references therein):

$$\Phi_\rho^{(k)}(t) = \begin{cases} 0, & \rho < 1 - t \le 2,\\[4pt] \dfrac{k+1}{2\pi}\,\dfrac{(t-(1-\rho))^{k}}{\rho^{k+1}}, & 0 \le 1 - t \le \rho. \end{cases} \qquad (63)$$

For $\rho\in(0,2]$, $k\in\mathbb{N}$, the function $\Phi_\rho^{(k)}$ as introduced by (63) is $(k-1)$ times continuously differentiable on the interval $[0,2]$. $(\Phi^{(k)}_{\rho_j})_{j\in\mathbb{N}_0}$ is a sequence tending to the Dirac function(al), i.e., a Dirac sequence (cf. Figs. 32 and 33). For a strictly monotonically decreasing sequence $(\rho_j)_{j\in\mathbb{N}_0}$ satisfying $\rho_j\to 0$ for $j\to\infty$, we obtain for the convolution integrals (low-pass filters)

$$H^{(k)}_{\rho_j}(\xi) = \int_{S^2}\Phi^{(k)}_{\rho_j}(\xi\cdot\eta)\,H(\eta)\,dS(\eta), \quad \xi\in S^2, \qquad (64)$$

the limit relation

$$\lim_{j\to\infty}\,\sup_{\xi\in S^2}\left|H(\xi) - H^{(k)}_{\rho_j}(\xi)\right| = 0. \qquad (65)$$

Fig. 32 Sectional illustration of the smoothed Haar wavelets $\Phi^{(k)}_{\rho_j}(\cos(\vartheta))$ with $\vartheta\in[-\pi,\pi]$, $\rho_j = 2^{-j}$ (panels: scale $j = 3$, $k = 0, 2, 3, 5$; scale $j = 0, 1, 2, 3$, $k = 5$)

Fig. 33 Illustration of the first members of the wavelet sequence for the smoothed Haar scaling function on the sphere ($\rho_j = 2^{-j}$, $j = 2, 3, 4$, $k = 5$)

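An easy calculation yields

$$L^*_\xi\,\Phi^{(k)}_{\rho_j}(\xi\cdot\eta) = \begin{cases} 0, & \rho_j < 1 - \xi\cdot\eta \le 2,\\[4pt] \dfrac{k(k+1)}{2\pi}\,\dfrac{((\xi\cdot\eta)-(1-\rho_j))^{k-1}}{\rho_j^{k+1}}\;\xi\wedge\eta, & 0 \le 1 - \xi\cdot\eta \le \rho_j. \end{cases} \qquad (66)$$

It is not hard to show (cf. Freeden and Schreiner 2009) that

$$\lim_{j\to\infty}\,\sup_{\xi\in S^2}\left|L^*_\xi H(\xi) - L^*_\xi H^{(k)}_{\rho_j}(\xi)\right| = 0. \qquad (67)$$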
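The smoothed Haar kernel and the scalar factor of its surface curl gradient can be evaluated directly. The following is a minimal sketch, assuming NumPy and the dyadic choice $\rho_j = 2^{-j}$ used in the figures; the function names are hypothetical and introduced only for illustration. It checks numerically that the kernel integrates to approximately 1 over the unit sphere, as expected of a Dirac sequence.

```python
import numpy as np

def smoothed_haar(t, rho, k):
    """Smoothed Haar kernel of (63), evaluated at t = xi . eta in [-1, 1]."""
    t = np.asarray(t, dtype=float)
    inside = (1.0 - t) <= rho                      # spherical cap 0 <= 1 - t <= rho
    shifted = np.where(inside, t - (1.0 - rho), 0.0)
    return np.where(inside, (k + 1) / (2.0 * np.pi) * shifted**k / rho**(k + 1), 0.0)

def smoothed_haar_slope(t, rho, k):
    """Scalar factor in (66): derivative of the kernel with respect to t (k >= 1).

    The vector-valued result of (66) is this factor times xi ^ eta.
    """
    if k < 1:
        raise ValueError("the derivative formula is used for k >= 1 only")
    t = np.asarray(t, dtype=float)
    inside = (1.0 - t) <= rho
    shifted = np.where(inside, t - (1.0 - rho), 0.0)
    return np.where(inside, k * (k + 1) / (2.0 * np.pi) * shifted**(k - 1) / rho**(k + 1), 0.0)

def sphere_integral(rho, k, n=200001):
    """Trapezoidal estimate of the integral of the zonal kernel over the unit sphere."""
    t = np.linspace(-1.0, 1.0, n)
    f = smoothed_haar(t, rho, k)
    return 2.0 * np.pi * np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))

if __name__ == "__main__":
    for j in range(4):
        # each value should be close to 1
        print(j, sphere_integral(2.0 ** (-j), k=5))
```

The support of the kernel shrinks with $\rho_j$ while its integral stays fixed, which is the property exploited by the low-pass filters (64).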

The multiscale approach by smoothed Haar wavelets can be formulated in a standard way. For example, the Haar wavelets can be understood as differences of two successive scaling functions. In doing so, an economical and efficient algorithm in a tree structure (Fast Wavelet Transform (FWT)) can be implemented appropriately.

8.2.4 “Back-Transfer” to Application

Ocean currents are subject to different influence factors, such as wind field, warming of the atmosphere, salinity of the water, etc., which are not accounted for in our modeling. Our approximation (see Figs. 34 and 35) must be understood in the sense of a geostrophic balance. An analysis shows that its validity may be considered as given on spatial scales of an approximate expansion of a little more than 30 km and on time scales longer than approximately 1 week. Indeed, the geostrophic velocity field is perpendicular to the tangential gradient of the ocean topography (i.e., perpendicular to the tangential pressure gradient). This is a remarkable property. The water flows along the curves of constant ocean topography, i.e., along isobars (see Fig. 34 for a multiscale modeling). Despite the essentially restricting assumptions necessary for the modeling, we obtain instructive circulation models for the internal ocean surface current for the northern or southern hemisphere, respectively (however, difficulties for the computation of the flow arise from the fact that the Coriolis parameter vanishes on the equator). An especially positive result is that the modeling of the ocean topography has made an essential contribution to the research in exceptional phenomena of internal ocean currents, such as El Niño. El Niño is an anomaly of the ocean–atmosphere system. It causes the occurrence of modified currents in the equatorial Pacific, i.e., the surface water usually flowing in western direction suddenly flows to the east. Geographically speaking, the cold Humboldt Current is weakened and finally ceases. Within only a few months, the water layer moves from Southeast Asia to South America. Water circulation has reversed. As a consequence, the Eastern Pacific is warmed up, whereas the water temperature decreases off the shores of Australia and Indonesia. This phenomenon has worldwide consequences on the weather, in the form of extreme droughts and thunderstorms. 
Our computations do not only help to visualize these modifications graphically, but they also offer the basis for future predictions of El Niño characteristics and effects (Fig. 36).

Fig. 34 Multiscale approximation of the ocean topography [cm] (a rough low-pass filtering at scale $j = 1$ is improved with several band-pass filters of scale $j = 1,\ldots,6$, where the last picture shows the multiscale approximation at scale $j = 7$) (numerical and graphical illustration in cooperation with D. Michel and V. Michel 2006)

Fig. 35 Ocean topography [cm] (top) and geostrophic oceanic flow [cm/s] (bottom) of the Gulf Stream computed by the use of smoothed Haar wavelets ($j = 8$, $k = 5$) (numerical and graphical illustration in cooperation with D. Michel and V. Michel 2006)

8.3 Circuit: Seismic Processing from Acoustic Wave Tomography

The essential goal of seismic processing (see chapters “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”, “Transmission Tomography in Seismology”, “Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery”, “Strategies in Adjoint Tomography”, “Multidimensional Seismic Compression by Hybrid Transform with Multiscale Based Coding”, and “Tomography: Problems and Multiscale Solutions”) is to transform the existing signal that resulted from the integration of the wave equation, under the expectation that designated properties of the target bedrock, like the velocity field, can be interpreted from the transformed signal. In this work, based on the regularization of Green’s functions (fundamental solutions), new wavelet techniques for a detailed band-pass filtering of acoustic seismic phenomena are formulated in order to get a local understanding and interpretability of scattered wave field potentials in deep geothermal research. More material can be found in Augustin (2014), Augustin et al. (2012), Bauer et al. (2014), Freeden and Nutz (2014), and Ostermann (2011).

Fig. 36 Ocean topography during the El Niño period (May 1997–April 1998), shown in monthly panels, from data of the satellite CLS01 model (numerical realization and graphical illustration in cooperation with V. Michel and S. Maßmann 2006)

8.3.1 Mathematical Modeling in Reservoir Detection

In order to determine the structure, depth, and thickness of the target reservoir (see, e.g., Dahlen and Tromp 1998; Nolet 2008; Tarantola 1984; Yilmaz 1987), the standard seismic methods are applied to two- and/or three-dimensional seismic sections. All methods in use can be distinguished between time- and depth-migration strategies and between applications to post- and pre-stack data sets (see chapter “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”). The time-migration strategy is used to resolve conflicts in dipping events with different velocities. The depth-migration strategy handles strong lateral velocity variations associated to complex overburden structures. The numerical techniques used to solve the migration problem can generally be separated into three broad categories: (i) integral discretization methods such as Kirchhoff migration based on the solution of the eikonal equation; (ii) methods based on finite-difference schemes, e.g., depth continuation methods and reverse-time migration; and (iii) transform methods based on frequency–wave number implementations, e.g., frequency–space and frequency–wave number migration. All these migration methods usually rely on a certain approximation of the scalar acoustic or vectorial elastic wave equation (for more details, the reader is referred, e.g., to Nolet 2008; Yilmaz 1987 and the references therein). According to the geophysical requirements, highly accurate approximations and efficient numerical techniques must be realized in order to handle steep dipping events and complex velocity models with strong lateral and vertical variations, as well as to construct the subsurface image in a locally defined region with high resolution on available computational resources. In consequence, migration algorithms require an accurate velocity model. 
The adaptation of the interval velocity by means of an inversion that compares the measured travel times with simulated travel times is called reflection tomography. There are many versions of reflection tomography, but they all use ray-tracing techniques, and they are usually formulated as a mathematical optimization problem. The most popular and efficient methods are ray-based travel time tomography, waveform and full-wave inversion tomography (FWI), and Gaussian beam tomography (GBT). The “true” velocity estimation is often obtained by an iterative process called migration velocity analysis (MVA), which uses the kinematic information gained by the migration and consists of the following steps: (initial step) perform a reflection tomography of the coarse velocity structure using a priori knowledge about the subsurface; (iterative step) migrate the seismic data sets and apply the imaging condition; and update the velocity function by tomography inversion. These approaches often have rather poor quality in the sense of the interpretability of the migration result. Since the interest of, e.g., geothermal projects is not only focused on structure heights and traps as in oil field practice but also on fault zones and karst structures under recent stress conditions, the interpretation for geothermal needs is significantly complicated (Fig. 37).

Fig. 37 Principle of seismic reflection (from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)

8.3.2 Mathematical Analysis of Acoustic Wave Propagation

In the context of seismic processing imaging (see also chapters “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”, “Identification of Current Sources in 3D Electrostatics”, “Transmission Tomography in Seismology”, “Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery”, “Strategies in Adjoint Tomography”, and “Multidimensional Seismic Compression by Hybrid Transform with Multiscale Based Coding”), it is usually assumed that shear stresses generated by the wave impulse and other kinds of damping can be neglected. As a consequence, wave propagation is treated as an acoustic phenomenon: pressure changes $P(x,t)$ imply volume changes $dV$ that generate a displacement $u(x,t)$, which yields further pressure changes in the neighborhood of the volume. The relation between pressure changes and volume changes is assumed to be governed by Hooke’s law (see, e.g., Skudrzyk 1972; Achenbach 1973)

$$P = -K\,\frac{dV}{V} \qquad (68)$$

as the basis of a linear elastic relation. Here, $K$ is the bulk modulus of the material. In order to connect pressure changes to the displacement (see also Freeden and Nutz 2014), we observe that a small volume $V$ may be written as $V = dx_1\,dx_2\,dx_3$. It is transformed to $V' = dx_1'\,dx_2'\,dx_3'$. As the displacement $\delta u$ is defined via

$$dx_i' = dx_i + \delta u_i, \quad i\in\{1,2,3\}, \qquad (69)$$

we formally get

$$\begin{aligned} -\frac{dV}{V} &= \frac{V - V'}{V}\\ &= \frac{dx_1\,dx_2\,dx_3 - (dx_1+\delta u_1)(dx_2+\delta u_2)(dx_3+\delta u_3)}{dx_1\,dx_2\,dx_3}\\ &= -\frac{dx_1\,dx_3\,\delta u_2 + dx_2\,dx_3\,\delta u_1 + dx_1\,dx_2\,\delta u_3}{dx_1\,dx_2\,dx_3} - \frac{dx_1\,\delta u_2\,\delta u_3 + dx_2\,\delta u_1\,\delta u_3 + dx_3\,\delta u_1\,\delta u_2}{dx_1\,dx_2\,dx_3} - \frac{\delta u_1\,\delta u_2\,\delta u_3}{dx_1\,dx_2\,dx_3}\\ &= -\left(\frac{\delta u_1}{dx_1} + \frac{\delta u_2}{dx_2} + \frac{\delta u_3}{dx_3}\right) + R\\ &= -\nabla\cdot u + R, \end{aligned} \qquad (70)$$

with $R$ summarizing all terms of higher order in $\delta u_i$, which are negligible as $\delta u_i$ is assumed to be small. We obtain

$$P = -K\,\nabla\cdot u + S \qquad (71)$$

with the source term $S$ as a constitutive equation. Moreover, we assume the balance of linear momentum, or equivalently Newton’s second law, which in our case reads

$$P(x+\delta x_i\,\varepsilon^i, t) - P(x,t) = -\rho(x)\,\delta x_i\,a_i, \quad i\in\{1,2,3\}, \qquad (72)$$

with the acceleration $a$ and the canonical orthonormal basis $\varepsilon^1, \varepsilon^2, \varepsilon^3$ of Euclidean space $\mathbb{R}^3$. Under the further assumption that the acceleration is given by the second-order time derivative of the displacement $u$, we get

$$\frac{P(x+\delta x_i\,\varepsilon^i, t) - P(x,t)}{\delta x_i} = -\rho(x)\,\frac{\partial^2}{\partial t^2}u_i(x,t) \qquad (73)$$

and finally, by considering the limit $\delta x_i\to 0$,

$$\frac{\partial}{\partial x_i}P(x,t) = -\rho(x)\,\frac{\partial^2}{\partial t^2}u_i(x,t). \qquad (74)$$

These component equations can be summarized in vectorial form as

$$\nabla_x P(x,t) = -\rho(x)\,\frac{\partial^2}{\partial t^2}u(x,t). \qquad (75)$$

Assuming that $P$ and $u$ are sufficiently often differentiable, (71) and (75) can be combined by applying the second-order time derivative to (71) to gain

$$\frac{\partial^2}{\partial t^2}P(x,t) = -K(x)\,\nabla_x\cdot\left(\frac{\partial^2}{\partial t^2}u(x,t)\right) + \frac{\partial^2}{\partial t^2}S(x,t). \qquad (76)$$

Using (75) in (76), we obtain

$$\frac{\partial^2}{\partial t^2}P(x,t) = K(x)\,\nabla_x\cdot\left(\frac{1}{\rho(x)}\,\nabla_x P(x,t)\right) + \frac{\partial^2}{\partial t^2}S(x,t). \qquad (77)$$

Applying the product rule yields the identity

$$\nabla_x\cdot\left(\frac{1}{\rho(x)}\,\nabla_x P(x,t)\right) = \frac{1}{\rho(x)}\,\underbrace{\nabla_x\cdot\nabla_x}_{=\Delta_x}P(x,t) - \frac{1}{\rho^2(x)}\,(\nabla_x\rho(x))\cdot(\nabla_x P(x,t)), \qquad (78)$$

provided that $\rho$ is smooth enough. If the gradient of $\rho$ is negligibly small, we arrive at the nondivergence form of the acoustic wave equation

$$\frac{\partial^2}{\partial t^2}P(x,t) = c^2(x)\,\Delta_x P(x,t) + \frac{\partial^2}{\partial t^2}S(x,t), \qquad (79)$$

where the quantity

$$c(x) = \sqrt{\frac{K(x)}{\rho(x)}} \qquad (80)$$

is called the propagation speed of a wave, which results in the purely divergence form of the acoustic wave equation

$$\left(\Delta_x - \frac{1}{c^2(x)}\,\frac{\partial^2}{\partial t^2}\right)P(x,t) = -\frac{1}{c^2(x)}\,\frac{\partial^2}{\partial t^2}S(x,t). \qquad (81)$$
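To give a feeling for the magnitude of the propagation speed (80), the following small sketch evaluates $c = \sqrt{K/\rho}$ in SI units. The material values (a bulk modulus of about 2.2 GPa and a density of 1000 kg/m³, roughly those of water) are illustrative assumptions and are not taken from the text; the function name is hypothetical.

```python
import math

def propagation_speed(bulk_modulus, density):
    """Propagation speed c = sqrt(K / rho) as in (80); SI inputs give m/s."""
    if bulk_modulus <= 0.0 or density <= 0.0:
        raise ValueError("bulk modulus and density must be positive")
    return math.sqrt(bulk_modulus / density)

if __name__ == "__main__":
    # Assumed illustrative values for water: K ~ 2.2e9 Pa, rho ~ 1000 kg/m^3
    c_water = propagation_speed(2.2e9, 1000.0)
    print(f"c = {c_water:.1f} m/s")  # roughly 1.48 km/s
```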

The identity (81) ends the standard approach to the acoustic wave equation, thereby assuming a compressible, inviscid (i.e., no attenuation) medium with no shear strength and no internal forces (i.e., in equilibrium).

Remark 4. There is a large literature dealing with existence and uniqueness of the forward formulation of the acoustic wave equation (see, e.g., Evans (2002) and the references therein).

8.3.3 Development of a Mathematical Solution Method

As is well known, a standard approach to the acoustic wave equation (81) is a Fourier transform with respect to time,

$$U(x) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}P(x,t)\,\exp(-i\omega t)\,dt, \qquad (82)$$

leading to the reduced wave equation

$$\left(\Delta_x + \frac{\omega^2}{c^2(x)}\right)U(x) = W(x), \qquad (83)$$

usually called the Helmholtz equation. Obviously, $W$ is given by

$$W(x) = -\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\frac{1}{c^2(x)}\,\frac{\partial^2}{\partial t^2}S(x,t)\,\exp(-i\omega t)\,dt \qquad (84)$$

for all $x$. We are interested in two solution procedures of the Helmholtz equation (83), namely: (1) Postprocessing: the decorrelation of $U$ and $c$ from a preprocessed (sufficiently suitable) solution $U$; (2) Inverse modeling: the determination of $c$ under the a priori knowledge of a low-pass filtered “trend solution.” More precisely, we do not make the attempt here to solve the inverse problem of determining $c$ as a whole. Instead, we base our investigations on already successful prework, for example, (i) integral discretization methods such as Kirchhoff migration based on the solution of the eikonal equation (see, e.g., Yilmaz 1987 and the references therein) and Gaussian beam procedures (see, e.g., Popov et al. 2006, 2008); (ii) methods based on finite-difference schemes, e.g., depth continuation methods (cf. Claerbout 2009) and reverse-time migration (see, e.g., Baysal et al. 1984; Yilmaz 1987; Popov et al. 2008; Nolet 2008); and (iii) transform methods based on frequency–wave number implementations, e.g., frequency–space and frequency–wave number migration (see, e.g., Yilmaz 1987; Claerbout 2009). In other words, in order to get information about the structure, depth, and thickness of a target reservoir, we start from today’s realistic assumption that standard seismic tomography results are available and meaningful, at least to some extent. We focus our attention on the interpretability of a “true” migration result obtained from elsewhere and/or the (local) improvement of a trend result (see Figs. 38 and 39).

Fig. 38 Scheme of seismic processing

Fig. 39 Results of seismic processing

An essential tool is a new class of locally supported wavelets (see Freeden and Blick 2013; Freeden and Nutz 2014) derived from regularizations of Green functions for the Helmholtz operator:

(1) Postprocessing: The Helmholtz equation (83) leads to the definition of the wave number $k(x)$ and the refraction index $N(x)$ as

$$N(x) = \frac{c_0}{c(x)}, \qquad k(x) = \frac{\omega}{c(x)} = \frac{\omega}{c_0}\,\frac{c_0}{c(x)} = k_0\,N(x), \qquad (85)$$

with $c_0$ being a suitable constant reference velocity (see, e.g., Engl et al. 1996; Snieder 2002; Biondi 2006, and the references therein). Accordingly, the Helmholtz equation (83) can be rewritten as

$$\left(\Delta_x + k_0^2\,N^2(x)\right)U(x) = 0. \qquad (86)$$

The region where $N(x)\neq 1$ represents the scattering object such that $N(x) - 1$ may be supposed to have compact support. Another standard assumption is that the difference between $c(x)$ and $c_0$ should be sufficiently small. As a consequence, $N^2(x)$ may be developed into a Taylor series up to order one with a center $x_0$ such that $c(x_0) = c_0$. This yields

$$N^2(x) \simeq 1 + \varepsilon\,\Lambda(x) \qquad (87)$$

with a small perturbation parameter $\varepsilon$. Consequently, we have

$$k^2(x) = k_0^2\,N^2(x) = k_0^2\,(1 + \varepsilon\,\Lambda(x)). \qquad (88)$$
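The relations between velocity, refraction index, and wave number can be illustrated numerically. The sketch below assumes NumPy, a hypothetical sampled velocity model `c`, and a hypothetical function name; it returns $N$, $k$, and the deviation $N^2 - 1$, which is the quantity that the first-order perturbation term in the expansion of $N^2$ is meant to represent.

```python
import numpy as np

def refraction_quantities(c, c0, omega):
    """Refraction index N = c0 / c, wave number k = k0 * N, and N^2 - 1.

    c     : array of local propagation speeds (assumed sample values)
    c0    : constant reference velocity
    omega : angular frequency
    """
    c = np.asarray(c, dtype=float)
    k0 = omega / c0              # reference wave number
    N = c0 / c                   # refraction index
    k = k0 * N                   # local wave number, equal to omega / c
    deviation = N**2 - 1.0       # vanishes wherever c equals c0
    return N, k, deviation

if __name__ == "__main__":
    # Assumed values: reference velocity 2000 m/s, 30 Hz signal
    c = np.array([2000.0, 2050.0, 2100.0])
    N, k, deviation = refraction_quantities(c, c0=2000.0, omega=2.0 * np.pi * 30.0)
    print(N, k, deviation)
```

Outside the scatterer, where $c(x) = c_0$, the deviation is zero, which reflects the compact support assumption made above.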

With the same argument as explained before, the unknown function $\Lambda(x)$ may be supposed to have compact support. The wave operator $\Delta_x + \frac{\omega^2}{c^2(x)}$ may be separated in the following way:

$$A_x = \Delta_x + \frac{\omega^2}{c^2(x)} = \Delta_x + k_0^2\,N^2(x) = \Delta_x + k_0^2\,(1+\varepsilon\,\Lambda(x)) = \Delta_x + k_0^2 + \varepsilon\,k_0^2\,\Lambda(x) = A^{(0)} + \varepsilon\,A^{(1)}, \qquad (89)$$

where we have used the abbreviations

$$A^{(0)} = \Delta_x + k_0^2 \qquad (90)$$

and

$$A^{(1)} = k_0^2\,\Lambda(x). \qquad (91)$$

Hence, the wave field $U(x)$ may be split into an incident wave field $U_I$, corresponding to the wave propagating in the absence of the scatterer, and the scattered wave field $U_S$ such that

$$U = U_I + U_S. \qquad (92)$$

This splitting leads us to

$$A^{(0)}U_I = (\Delta_x + k_0^2)\,U_I = 0, \qquad (93)$$

$$A^{(0)}U_S = (\Delta_x + k_0^2)\,U_S = -\varepsilon\,k_0^2\,\Lambda\,(U_I + U_S) = -\varepsilon\,k_0^2\,\Lambda\,U = -\varepsilon\,A^{(1)}U. \qquad (94)$$

It should be mentioned that Eq. (93) formalizes that $U_I$ corresponds to the wave propagating in the absence of the scatterer. As the fundamental solution to the Helmholtz operator $\Delta + k_0^2$ is known to be

$$G(\Delta + k_0^2; |x-y|) = \frac{1}{4\pi}\,\frac{e^{ik_0|x-y|}}{|x-y|}, \quad x\neq y, \qquad (95)$$

the functions $U_I$ and $U_S$, respectively, can be represented as volume potentials

$$U_I(x) = \int_{\mathbb{R}^3}G(\Delta + k_0^2; |x-y|)\,W(y)\,dV(y), \qquad (96)$$

$$U_S(x) = \int_B G(\Delta + k_0^2; |x-y|)\,\underbrace{\varepsilon\,k_0^2\,\Lambda(y)\,U(y)}_{=F(y)}\,dV(y) \qquad (97)$$

with the volume element $dV$ and $B = \operatorname{supp}(\Lambda)$ being the support of $\Lambda$, where $W$ is given by (84).

Remark 5. In seismic reflection modeling, point sources with certain spectra are usually chosen as unperturbed wave (for more details, see, e.g., Nolet 2008).

Only if $U_I$ and $U_S$ are available in the compact support $B = \operatorname{supp}(\Lambda)$, a direct computation of $\Lambda$ by applying the Helmholtz operator $\Delta + k_0^2$ to (97) is possible. However, it should be mentioned that $U_S$ can usually be measured only away from the support $B$ of $\Lambda$. Nevertheless, in exceptional cases, local information is available inside boreholes, which is of particular interest. In the sense of the perturbation theory and with the conventional setting $U^{(0)} = U_I$, $U$ can be formally written as a series

$$U = \sum_{k=0}^{\infty}\varepsilon^k\,U^{(k)}, \qquad (98)$$

which yields

$$\sum_{k=0}^{\infty}\varepsilon^k\left(A^{(0)} + \varepsilon\,A^{(1)}\right)U^{(k)} = W. \qquad (99)$$

By collecting terms which are of the same order in $\varepsilon$, we therefore get

$$A^{(0)}U^{(0)} = W, \qquad (100)$$

$$A^{(0)}U^{(k)} = -A^{(1)}U^{(k-1)}, \quad k\in\mathbb{N}. \qquad (101)$$

The scattered wave field is then given by

$$U_S = U - U_I = \sum_{k=1}^{\infty}\varepsilon^k\,U^{(k)}. \qquad (102)$$

This procedure is known as Born approximation (see, e.g., Snieder (2002) for more details). Considering only the first-order approximation, we obtain

$$A^{(0)}U_I = W, \qquad (103)$$

$$A^{(0)}U_S = -\varepsilon\,A^{(1)}U_I. \qquad (104)$$

The difference between (94) and (104) is crucial. On the right-hand side of (97), we find the sum of $U_I$ and $U_S$, which makes this a nonlinear equation. On the right-hand side of (104), only $U_I$ appears, which is determined by (103), making the relation between scattered wave and perturbation of the medium linear.

Remark 6. $U_I$ and the first-order, i.e., Born approximation of $U_S$ can be represented by the potentials

$$U_I(x) = \int_{\mathbb{R}^3}G(\Delta + k_0^2; |x-y|)\,W(y)\,dV(y), \qquad (105)$$

$$U_S^I(x) = \int_B G(\Delta + k_0^2; |x-y|)\,\underbrace{\varepsilon\,k_0^2\,\Lambda(y)\,U_I(y)}_{=F_I(y)}\,dV(y). \qquad (106)$$
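The perturbation recursion behind the Born approximation can be tried out in a finite-dimensional setting. The following is only a sketch under strong assumptions: the operators are replaced by small hypothetical matrices `A0` and `A1` (chosen so that the series converges), NumPy is assumed, and the function name is introduced for illustration. It builds the partial sums of the series term by term and compares them with a direct solve of the perturbed system.

```python
import numpy as np

def born_partial_sum(A0, A1, W, eps, order):
    """Partial sum of eps^k U^(k), k = 0..order, using the recursion
    A0 U^(0) = W and A0 U^(k) = -A1 U^(k-1)."""
    U_k = np.linalg.solve(A0, W)            # zeroth-order (unperturbed) field
    total = U_k.copy()
    for k in range(1, order + 1):
        U_k = np.linalg.solve(A0, -(A1 @ U_k))
        total = total + eps**k * U_k
    return total

if __name__ == "__main__":
    n = 6
    # Hypothetical stand-ins for the unperturbed operator and the perturbation
    A0 = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    A1 = np.diag(np.linspace(0.0, 1.0, n))
    W = np.ones(n)
    eps = 0.1
    exact = np.linalg.solve(A0 + eps * A1, W)
    for order in (0, 1, 2, 8):
        error = np.max(np.abs(born_partial_sum(A0, A1, W, eps, order) - exact))
        print(order, error)  # the error shrinks as the order grows
```

The first-order partial sum corresponds to the Born approximation; it is linear in the perturbation, whereas the exact solution is not.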

The basic properties of volume integrals of type (97) can be summarized as follows (see, e.g., Müller 1969): the volume potential $U_S$ is a metaharmonic function in $\mathbb{R}^3\setminus B$ under the assumption of boundedness of $F$ (i.e., $(\Delta + k_0^2)\,U_S = 0$ in $\mathbb{R}^3\setminus B$). For a continuous $F$ in $B$, the potential (97) is of class $C^{(1)}(\mathbb{R}^3)$, and under the assumption of Hölder continuity of $F$, we have

$$(\Delta + k_0^2)\,U_S^I(x) = -F_I(x) \qquad (107)$$

for all $x\in B$. This equation indicates the direct relation between $U_S^I$ and $F_I$ in $B$. It is actually the key point of Born modeling as discussed in the literature (see, e.g., Marks 2006). Our postprocessing procedure starts from the nonlinear integral relation (96). The essential idea is to use a sequence of regularizations $\{G_{\tau_j}(\Delta + k_0^2;\cdot)\}$, $j\in\mathbb{N}$, for the kernels (95) given by

$$G_{\tau_j}(\Delta + k_0^2; |x-y|) = \begin{cases} \dfrac{e^{ik_0|x-y|}}{4\pi\,|x-y|}, & |x-y| > \tau_j,\\[6pt] \dfrac{e^{ik_0|x-y|}}{8\pi\,\tau_j}\left(3 - \dfrac{|x-y|^2}{\tau_j^2}\right), & |x-y| \le \tau_j, \end{cases} \qquad (108)$$

where $\{\tau_j\}_{j\in\mathbb{N}}$ denotes a positive monotonically decreasing sequence converging to 0 (e.g., a dyadic sequence given by $\tau_j = 2^{-j}$, $j\in\mathbb{N}$). The regularization (108) is constructed in such a way that each kernel $G_{\tau_j}(\Delta + k_0^2;\cdot)$ is continuously differentiable and only dependent on the distance between two points $x$ and $y$. Furthermore, under the assumption that $F$ is bounded in $B$, we obtain for the “regularized version of the potential” (97)

$$(U_S)_{\tau_j}(x) = \int_B G_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) \qquad (109)$$

the limit relation $U_S(x) = (U_S)_{\tau_j}(x) + O(\tau_j^2)$, $\tau_j\to 0$, for all $x\in\mathbb{R}^3$. It should be noted that the aforementioned approach of replacing the fundamental solution $G(\Delta + k_0^2; |x-y|)$ by its regularized versions (109) initiates a multiscale method in canonical way (cf. Freeden and Blick 2013) based on

$$\lim_{j\to\infty}\,\sup_{x\in B}\left|U_S(x) - \int_B G_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y)\right| = 0. \qquad (110)$$
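The regularized kernel is straightforward to evaluate as a function of the distance $r = |x-y|$. The sketch below, assuming NumPy and hypothetical function names, implements the fundamental solution together with its regularization and can be used to confirm that the two branches match at $r = \tau$ and that the singularity at $r = 0$ is removed.

```python
import numpy as np

def green_helmholtz(r, k0):
    """Fundamental solution exp(i k0 r) / (4 pi r) for r > 0."""
    r = np.asarray(r, dtype=float)
    return np.exp(1j * k0 * r) / (4.0 * np.pi * r)

def green_regularized(r, k0, tau):
    """Regularized kernel: the fundamental solution for r > tau,
    and exp(i k0 r) / (8 pi tau) * (3 - r^2 / tau^2) for r <= tau."""
    r = np.asarray(r, dtype=float)
    outer = r > tau
    safe_r = np.where(outer, r, 1.0)     # dummy radius avoids division by zero
    far = np.exp(1j * k0 * safe_r) / (4.0 * np.pi * safe_r)
    near = np.exp(1j * k0 * r) / (8.0 * np.pi * tau) * (3.0 - r**2 / tau**2)
    return np.where(outer, far, near)

if __name__ == "__main__":
    k0, tau = 5.0, 0.25
    print(green_regularized(0.0, k0, tau))   # finite value 3 / (8 pi tau)
    print(green_regularized(np.array([tau, 2.0 * tau]), k0, tau))
```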

Each scaling function $G_{\tau_j}(\Delta + k_0^2;\cdot)$ provides low-pass filtering of the signature $U_S$. In order to obtain multiscale components, we simply calculate the difference of two consecutive scaling functions and get the wavelet functions with respect to the scale parameter $\tau_j$. Of course, other types of regularizations can be chosen. In our approach, however, we restrict ourselves to Haar-related kernel functions (see Haar 1910 for one-dimensional theory). As a matter of fact, by using the regularizations of the fundamental solution of the Helmholtz operator $\Delta + k_0^2$, we are immediately led to locally supported wavelets via the scale discrete scaling equation

$$\Psi_{\tau_j}(\Delta + k_0^2; |x-y|) = G_{\tau_{j+1}}(\Delta + k_0^2; |x-y|) - G_{\tau_j}(\Delta + k_0^2; |x-y|), \quad j\in\mathbb{N}_0. \qquad (111)$$

Explicitly, we have (see Fig. 40 for graphical illustration)

$$\Psi_{\tau_j}(\Delta + k_0^2; |x-y|) = \begin{cases} \dfrac{e^{ik_0|x-y|}}{8\pi\,\tau_{j+1}}\left(3 - \dfrac{|x-y|^2}{\tau_{j+1}^2}\right) - \dfrac{e^{ik_0|x-y|}}{8\pi\,\tau_j}\left(3 - \dfrac{|x-y|^2}{\tau_j^2}\right), & |x-y| \le \tau_{j+1},\\[6pt] \dfrac{e^{ik_0|x-y|}}{4\pi\,|x-y|} - \dfrac{e^{ik_0|x-y|}}{8\pi\,\tau_j}\left(3 - \dfrac{|x-y|^2}{\tau_j^2}\right), & \tau_{j+1} < |x-y| \le \tau_j,\\[6pt] 0, & \tau_j < |x-y|. \end{cases} \qquad (112)$$

The convolution integral (113) indicates the difference of two sequential low-pass filters, i.e., it represents a band-pass filtering at the position $x$ with respect to the scale parameter $\tau_j$:

$$W_{\tau_j}(x) = \int_B \Psi_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y). \qquad (113)$$

$W_{\tau_j}$ includes all detail information contained in $(U_S)_{\tau_{j+1}}$ but not in $(U_S)_{\tau_j}$. In accordance with our construction, we therefore obtain for every $L\in\mathbb{N}$

$$(U_S)_{\tau_{j+L}} = (U_S)_{\tau_j} + \sum_{n=j}^{j+L-1} W_{\tau_n}. \qquad (114)$$
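The wavelet construction and the resulting telescoping sum can be checked at the level of the kernels themselves. The following sketch assumes NumPy, the dyadic choice $\tau_j = 2^{-j}$, and hypothetical function names; it forms each wavelet as the difference of two consecutive regularized kernels and verifies that adding $L$ wavelets to the scale-$j$ kernel reproduces the scale-$(j+L)$ kernel.

```python
import numpy as np

def green_regularized(r, k0, tau):
    """Regularized Helmholtz kernel as a function of the distance r."""
    r = np.asarray(r, dtype=float)
    outer = r > tau
    safe_r = np.where(outer, r, 1.0)     # dummy radius avoids division by zero
    far = np.exp(1j * k0 * safe_r) / (4.0 * np.pi * safe_r)
    near = np.exp(1j * k0 * r) / (8.0 * np.pi * tau) * (3.0 - r**2 / tau**2)
    return np.where(outer, far, near)

def wavelet(r, k0, j):
    """Wavelet kernel: difference of the kernels at scales j + 1 and j,
    with the assumed dyadic sequence tau_j = 2^(-j)."""
    return green_regularized(r, k0, 2.0 ** (-(j + 1))) - green_regularized(r, k0, 2.0 ** (-j))

if __name__ == "__main__":
    r = np.linspace(0.0, 2.0, 401)
    k0, j, L = 5.0, 0, 4
    fine = green_regularized(r, k0, 2.0 ** (-(j + L)))
    coarse_plus_details = green_regularized(r, k0, 2.0 ** (-j)) + sum(
        wavelet(r, k0, n) for n in range(j, j + L)
    )
    print(np.max(np.abs(fine - coarse_plus_details)))  # close to machine precision
```

Because each wavelet vanishes for $|x-y| > \tau_j$, the band-pass convolutions only involve data in a ball of radius $\tau_j$ around the evaluation point.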

Fig. 40 Wavelet function $\Psi_{\tau_j}$ in sectional illustration for $k_0 = 5$ and $\tau_j = \frac{4}{2^j}$, $j = 0,\ldots,3$

The formula (114) shows the progress in the “zooming-in process,” which proceeds from scale $j$ to scale $j + L$. Hence, the identity (114) describes the amount of improvement in the accuracy from level $j$ to level $j + L$. Indeed, it uniformly follows for each position $x$ and for each scale value $j$ that

$$U_S(x) = (U_S)_{\tau_j}(x) + \sum_{n=j}^{\infty} W_{\tau_n}(x), \qquad (115)$$

i.e., the signal $U_S$ consists of a (coarse) low-pass filtering and an infinite number of successive band-pass convolutions. Of course, in practice, only a finite number of band-pass filters has to be calculated to satisfy a certain error tolerance (for more multiscale aspects in constructive approximation, see Freeden and Michel 2004; Freeden and Gerhards 2013). Until now, we have only reconstructed the quantity $U_S$ by virtue of a multiscale technique using building blocks. For practical purposes, the decorrelation of both $F$ and $\Lambda$ is of interest. An elementary calculation using (108) yields the Helmholtz derivative

$$K_{\tau_j}(\Delta + k_0^2; |x-y|) = (\Delta_x + k_0^2)\,G_{\tau_j}(\Delta + k_0^2; |x-y|) = \begin{cases} \dfrac{3}{4\pi\,\tau_j^3}\,\dfrac{e^{ik_0|x-y|}}{|x-y|}\left(ik_0\,\tau_j^2 - ik_0\,|x-y|^2 - |x-y|\right), & |x-y| \le \tau_j,\\[6pt] 0, & |x-y| > \tau_j. \end{cases}$$

$\{K_{\tau_j}\}$ is a “Haar-type sequence,” and, up to sign, it approximately reduces to the Haar sequence $\{H_{\tau_j}\}$

$$H_{\tau_j}(|x-y|) = \begin{cases} \dfrac{3}{4\pi\,\tau_j^3}, & |x-y| \le \tau_j,\\[6pt] 0, & |x-y| > \tau_j, \end{cases}$$

for sufficiently large $j$, i.e., for each $k_0$

$$K_{\tau_j}(\Delta + k_0^2; |x-y|) = -H_{\tau_j}(|x-y|) + O(\tau_j) \quad \text{for } j\to\infty.$$

If $F$ is bounded, then it is clear that

$$(\Delta_x + k_0^2)\int_B G_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) = \int_B (\Delta_x + k_0^2)\,G_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y), \qquad (116)$$

such that

$$-F(x) = \lim_{j\to\infty}\int_B K_{\tau_j}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) = -\lim_{j\to\infty}\int_B H_{\tau_j}(|x-y|)\,F(y)\,dV(y) \qquad (117)$$

for all $x\in B$. Moreover, we have under the assumption of Hölder continuity of $F$ (see, e.g., Müller 1969)

$$-F(x) = (\Delta_x + k_0^2)\,U_S(x) = (\Delta_x + k_0^2)\int_B G(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) \qquad (118)$$

for all $x\in B$. In other words, the “Helmholtz derivative” $\Delta + k_0^2$ of the regularization of the fundamental solution leads back to a “Haar-type” singular integral (117) for detecting

$$F(x) = \varepsilon\,k_0^2\,\Lambda(x)\,(U_I + U_S)(x) = \varepsilon\,k_0^2\,\Lambda(x)\,U(x) \qquad (119)$$

for all $x\in B$. Vice versa, our approach offers the possibility to introduce alternative regularizations of the fundamental solution such that their “Helmholtz derivatives” represent singular-type kernels constituting Dirac-type sequences in $B$. All in all, the multiscale technique by regularized fundamental solutions enables us simultaneously to decompose the signal information of the wave field as well as the refraction index $N$ based on the interrelation (116), however, under the “postprocessing assumption” that $U_S$ is discretely known (to some extent) inside $B$. Once more, our understanding of the multiscale technique here does not primarily aim at the reconstruction of the signals, but instead at working out characteristic detail information that emerges from the difference of two consecutive scale-space representations. This practice (see also Freeden and Blick 2013) comes across as the decorrelation of signatures, such that postprocessing of density structures is the key element.

(2) Inverse Modeling: In the following (cf. Freeden and Gerhards 2013), we make the attempt to apply the Haar philosophy to an approximate determination of $F$ from $U_S$ inside $B$. The regularization procedure of the volume potential as proposed for postprocessing is the essential tool. Let $\{\tau_j\}_{j\in\mathbb{N}_0}$ be a monotonically decreasing sequence of positive values $\tau_j$ such that $\lim_{j\to\infty}\tau_j = 0$ (e.g., $\tau_j = 2^{-j}$). Then, in accordance with (117), we are able to specify a sufficiently large integer $J$ such that, for all $x\in B$,

$$-F(x) \simeq -F_{\tau_J}(x) = \int_B K_{\tau_J}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) \qquad (120)$$

as well as

$$(U_S)(x) \simeq (U_S)_{\tau_J}(x) = \int_B G_{\tau_J}(\Delta + k_0^2; |x-y|)\,F(y)\,dV(y) \qquad (121)$$

(“$\simeq$” means that the error is negligible). If $F$ is bounded, then we already know that

$$(\Delta_x + k_0^2)\,(U_S)_{\tau_J}(x) = -F_{\tau_J}(x) \qquad (122)$$

for all $x\in B$. In order to realize a fully discrete approximation of $F$, we have to apply approximate integration formulas over $B$ leading to

$$(U_S)(x) \simeq \sum_{i=1}^{N_J} G_{\tau_J}(\Delta + k_0^2; |x - y_i^{N_J}|)\,w_i^{N_J}\,F(y_i^{N_J}), \qquad (123)$$

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_1-3 © Springer-Verlag Berlin Heidelberg 2014

where $w_i^{N_J}$, $y_i^{N_J}$, $i = 1, \ldots, N_J$, are the known weights and knots, respectively. Using an appropriate integration formula, we are therefore led to a linear system to be solved in order to obtain approximate information about the sought quantity. As already explained, for numerical realization, we may assume that $(U_S)_{\tau_{J-1}}$ is available from elsewhere, at least discretely in points $x_k^{M_J} \in B$, $k = 1, \ldots, M_J$, as needed in the solution process of the linear system. Obviously, we have to calculate all unknown coefficients

$$a_i^{N_J} = w_i^{N_J} F(y_i^{N_J}), \quad i = 1, \ldots, N_J, \tag{124}$$

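from discrete values $U_S(x_k^{M_J}) - (U_S)_{\tau_{J-1}}(x_k^{M_J})$, $k = 1, \ldots, M_J$. Then we have to solve a linear system

$$U_S(x_k^{M_J}) - (U_S)_{\tau_{J-1}}(x_k^{M_J}) \simeq \sum_{i=1}^{N_J} \Psi_{\tau_{J-1}}(\Delta + k_0^2; |x_k^{M_J} - y_i^{N_J}|)\, a_i^{N_J}, \quad k = 1, \ldots, M_J, \tag{125}$$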

in order to determine the required coefficients $a_i^{N_J}$, $i = 1, \ldots, N_J$.

Remark 7. The linear system (125) is the bottleneck in inverse modeling, although $\Psi_{\tau_{J-1}}$ as constructed here possesses a local support. Formally, (125) can be written in a general matrix notation of the form $Ad = u$. In fact, there is a large literature on solving such a system by minimizing a penalty functional of the generic form $P(d) = \frac{1}{2}\|Ad - u\| + R(d)$, where $R(d)$ is a measure of the size and/or complexity of $d$ (see, e.g., chapters “Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment”, “Transmission Tomography in Seismology”, “Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery”, and “Strategies in Adjoint Tomography” for related problems). Even more, sparsity comes into play (see, e.g., chapters “Sparsity in Inverse Geophysical Problems”, “Sparse Solutions of Underdetermined Linear Systems” and the references therein).

Once all coefficients $a_i^{N_J}$, $i = 1, \ldots, N_J$, are available (note that the integration weights $w_i^{N_J}$, $i = 1, \ldots, N_J$, are known), the function $F - F_{\tau_{J-1}}$ can be obtained in an obvious way. From the knowledge of $U_S$ and $(U_S)_{\tau_{J-1}}$, we are therefore able via (119) to model the sought quantity and, hence, the wanted refraction index $N$.

Fig. 41 Wave propagation in the Marmousi velocity model after 0.6 s

Fig. 42 Wave propagation in the Marmousi velocity model after 1.1 s

Fig. 43 Wave propagation in the Marmousi velocity model after 1.6 s

Fig. 44 Wave propagation in the Marmousi velocity model after 2.1 s

Fig. 45 Wave propagation in the Marmousi velocity model after 2.6 s (all pictures taken from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)

8.3.4 “Back-Transfer” to Application

A seismic prototype for test investigations in geoexploration is the 2D Marmousi model (see Martin et al. 2002). A velocity model is shown in Fig. 39 (taken from the PhD-thesis of Ilyasov (2011)). The source wave field involving the Marmousi model in direct time (i.e., snapshots of the wave propagation after 0.6, 1.1, 1.6, 2.1, and 2.6 s) is illustrated in Figs. 41–45. A result of a reverse-time migration (RT) in the context of a finite-difference scheme (FDS) applied to the Marmousi data set using the velocity model (Fig. 39) is illustrated in Fig. 46. For the decorrelation behavior of the Fourier-transformed 3D Helmholtz wavelets, we limit our research to wave numbers $k_0$ in the interval from $k_0 = 0.010$ to $k_0 = 0.120$. Our calculations show two illustrations, namely, for the wave numbers $k_0 = 0.047$ and $k_0 = 0.099$. Figure 49 shows the details for the scale parameters corresponding to a dyadic sequence. Relevant structural differences during the scale-dependent wavelet convolution become obvious for scales $j = 3, \ldots, 6$. Indeed, we are able to show that our decorrelation method highlights specific rock formations. By wavelet filtering of the migration result, we are able not only to specify the salt formation but also to dampen other rock formations as well as undesired noise phenomena caused by erroneous migration (see Fig. 47). Finally, in accordance with our construction (119), the Helmholtz derivative simultaneously leads back to a multiscale approximation of the velocity field using Haar-type trial functions. The wave number chosen for the illustrations in Fig. 48 is $k_0 = 0.099$.
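The scale-dependent filtering described above can be illustrated by a small one-dimensional sketch. It is not the Helmholtz-wavelet implementation behind the Marmousi figures: it assumes a discrete Haar-type (moving-average) low-pass filter whose window width is proportional to the dyadic scale value $2^{-j}$, and it forms the band-pass part at scale $j$ as the difference of two consecutive low-pass representations. The function names (`haar_low_pass`, `band_pass`, `decompose`), the parameter `base_width`, the reflecting boundary treatment, and the test signal are hypothetical choices made only for illustration.

```python
import numpy as np


def haar_low_pass(signal: np.ndarray, j: int, base_width: int = 64) -> np.ndarray:
    """Moving average over a window of about base_width * 2**(-j) samples.

    Assumption: a box (Haar-type) kernel stands in for the regularized
    kernels of the text; the signal is reflected at both ends.
    """
    width = max(1, int(round(base_width * 2.0 ** (-j))))
    if width % 2 == 0:
        width += 1  # an odd width keeps the window centered on each sample
    if width == 1:
        return signal.astype(float).copy()  # finest scale: signal unchanged
    half = width // 2
    padded = np.pad(signal.astype(float), half, mode="reflect")
    kernel = np.full(width, 1.0 / width)
    return np.convolve(padded, kernel, mode="valid")


def band_pass(signal: np.ndarray, j: int, base_width: int = 64) -> np.ndarray:
    """Detail at scale j: difference of two consecutive low-pass versions."""
    finer = haar_low_pass(signal, j + 1, base_width)
    coarser = haar_low_pass(signal, j, base_width)
    return finer - coarser


def decompose(signal: np.ndarray, j_min: int, j_max: int, base_width: int = 64):
    """Return the low-pass part at j_min and the details for j_min..j_max-1."""
    coarse = haar_low_pass(signal, j_min, base_width)
    details = [band_pass(signal, j, base_width) for j in range(j_min, j_max)]
    return coarse, details


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 256)
    # Hypothetical test signal: smooth trend plus one narrow localized feature.
    signal = np.sin(2 * np.pi * x) + 0.5 * np.exp(-((x - 0.6) / 0.01) ** 2)
    coarse, details = decompose(signal, j_min=2, j_max=6)
    # The sum telescopes to the low-pass version at scale j_max.
    rebuilt = coarse + sum(details)
    print(np.allclose(rebuilt, haar_low_pass(signal, 6)))
```

In this sketch the narrow feature contributes mainly to the finer band-pass parts, while the smooth trend stays in the coarse part, which is the decorrelation idea in miniature. A different kernel family would change the individual details but not the telescoping structure.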


Fig. 46 A migration result of the Marmousi model by FDS (taken from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)

Fig. 47 Interpretation of the Marmousi model due to Martin et al. (2002)

Fig. 48 Wavelet approximation of the velocity field in [m/s] by Helmholtz derivatives (following Freeden and Blick 2013). Panels: low-pass filtering (scale $j = 4$) and band-pass filtering (scales $j = 4, \ldots, 8$); both axes in [m]

Fig. 49 Wavelet decorrelation (band-pass filtering) of the Marmousi migration model in [m] for scales $j = 2, \ldots, 6$ (following Freeden and Blick 2013)

9 Final Remarks

The Earth is a dynamic planet in permanent change, due to large-scale internal convective material and energy rearrangement processes, as well as manifold external effects. We can, therefore, only understand the Earth as our living environment if we consider it as a complex system of all its interacting components. The processes running on the Earth are coupled with one another, forming ramified chains of cause and effect which are additionally influenced by man, who intervenes in the natural balances and circuits. However, knowledge of these chains of cause and effect has so far remained incomplete to a large extent. Within a reasonable time, substantial improvements can only be reached by the exploitation of new measurement and observation methods, e.g., by satellite missions, and by innovative mathematical concepts of modeling and simulation, in short, by geomathematics. As far as future data evaluation is concerned, traditional mathematical methods will not be able to master the new amounts of data either theoretically or numerically, especially considering the important aspect of a more intensively localized treatment with respect to space and time, embedded into a global concept. Instead, geoscientifically relevant parameters must be integrated into constituting modules; the integration must be characterized by three essential features: good approximation property, appropriate decorrelation ability, and fast algorithms. These features are the key to a variety of abilities and new research directions.

Acknowledgments

This introductory chapter is based on the German note “W. Freeden (2009): Geomathematik, was ist das überhaupt?, Jahresbericht der Deutschen Mathematiker Vereinigung (DMV), JB.111, Heft 3, 125–152.” I am obliged to the publisher Vieweg+Teubner for giving the permission for an English translation of essential parts of the original version. Particular thanks go to Dr. Helga Nutz for reading an earlier version and eliminating some inconsistencies. Furthermore, I would like to thank my Geomathematics Group, Kaiserslautern, for the assistance in numerical calculation as well as graphical illustration concerning the three exemplary circuits.

References

Achenbach JD (1973) Wave propagation in elastic solids. North Holland, New York
Albertella A, Savcenko R, Bosch W, Rummel R (2008) Dynamic ocean topography – the geodetic approach. IAPG/FESG Mitteilungen, 27, TU München
Ansorge R, Sonar T (2009) Mathematical models of fluid dynamics, 2nd updated edn. Wiley-VCH, Weinheim
Augustin M (2014) A method of fundamental solutions in poroelasticity to model the stress field in geothermal reservoirs. PhD-thesis, Geomathematics Group, University of Kaiserslautern
Augustin M, Freeden W, Gerhards C, Möhringer S, Ostermann I (2012) Mathematische Methoden in der Geothermie. Math Semesterber 59:1–28
Bach V, Fraunholz W, Freeden W, Hein F, Müller J, Müller V, Stoll H, von Weizsäcker H, Fischer H (2004) Curriculare Standards des Fachs Mathematik in Rheinland-Pfalz (Vorsitz: W. Freeden). Studie: Reform der Lehrerinnen- und Lehrerausbildung, MWWFK Rheinland-Pfalz
Bauer M, Freeden W, Jacobi H, Neu T (eds) (2014) Handbuch Tiefe Geothermie. Springer, Heidelberg
Baysal E, Kosloff DD, Sherwood JWC (1984) A two-way nonreflecting wave equation. Geophysics 49(2):132–141
Beutelspacher S (2001) In Mathe war ich immer schlecht. Vieweg, Wiesbaden
Biondi BL (2006) Three-dimensional seismic imaging. Society of Exploration Geophysicists, Tulsa
Bruns EH (1878) Die Figur der Erde. Publikation Königl Preussisch Geodätisches Institut. P Stankiewicz, Berlin
Claerbout J (2009) Basic earth imaging. Stanford University Press, Stanford
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton


Emmermann R, Raiser B (1997) Das System Erde – Forschungsgegenstand des GFZ. Vorwort des GFZ-Jahresberichts 1996/1997, GeoForschungsZentrum, Potsdam
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer Academic, Dordrecht/Boston
Evans LC (2002) Partial differential equations, 3rd printing. American Mathematical Society, Providence
Fehlinger T (2009) Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München
Fengler MJ, Freeden W (2005) A non-linear Galerkin scheme involving vector and tensor spherical harmonics for solving the incompressible Navier–Stokes equation on the sphere. SIAM J Sci Comput 27:967–994
Freeden W (1998) The uncertainty principle and its role in physical geodesy. In: Progress in geodetic science at GW 98. Shaker Verlag, Aachen, pp 225–236
Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Stuttgart/Leipzig
Freeden W (2009) Geomathematik, was ist das überhaupt? Jahresbericht der Deutschen Mathematiker Vereinigung (DMV), Vieweg+Teubner, JB. 111, Heft 3, pp 125–152
Freeden W (2011) Metaharmonic lattice point theory. CRC/Taylor & Francis, Boca Raton
Freeden W, Blick C (2013) Signal decorrelation by means of multiscale methods. World Min 65(5):1–15
Freeden W, Gerhards C (2010) Poloidal and toroidal fields in terms of locally supported vector wavelets. Math Geosci 42:817–838
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. CRC/Taylor & Francis, Boca Raton
Freeden W, Gutting M (2013) Special functions of mathematical (geo-)sciences. Birkhäuser, Basel
Freeden W, Maier T (2002) Multiscale denoising of spherical functions: basic theory and numerical aspects. Electron Trans Numer Anal 14:40–62
Freeden W, Mayer T (2003) Wavelets generated by layer potentials. Appl Comput Harm Anal (ACHA) 14:195–237
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston/Basel/Berlin
Freeden W, Nutz H (2014) Mathematische Methoden. In: Bauer M, Freeden W, Jacobi H, Neu T (eds) Handbuch Tiefe Geothermie. Springer, Heidelberg
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences – a scalar, vectorial, and tensorial setup. Springer, Berlin/Heidelberg
Freeden W, Wolf K (2008) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford/Clarendon, Oxford
Freeden W, Michel D, Michel V (2005) Local multiscale approximations of geostrophic ocean flow: theoretical background and aspects of scientific computing. Mar Geod 28:313–329
Freeden W, Fehlinger T, Klug M, Mathar D, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geod 83:1171–1191
Gauss CF (1863) Werke, Band 5. Dietrich, Göttingen
Gerhards C (2011) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modelling. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München


Grafarend EW (2001) The spherical horizontal and spherical vertical boundary value problem – vertical deflections and geoidal undulations – the completed Meissl diagram. J Geod 75:363–390
Groten E (1979) Geodesy and the Earth’s gravity field I+II. Dümmler, Bonn
Gutting M (2007) Fast multipole methods for oblique derivative problems. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen
Heiskanen WA, Moritz H (1967) Physical geodesy. Freeman and Company, San Francisco
Haar A (1910) Zur Theorie der orthogonalen Funktionssysteme. Math Ann 69:331–371
Helmert FR (1881) Die mathematischen und physikalischen Theorien der Höheren Geodäsie 1+2. B.G. Teubner, Leipzig
Ilyasov M (2011) A tree algorithm for Helmholtz potential wavelets on non-smooth surfaces: theoretical background and application to seismic data processing. PhD-thesis, Geomathematics Group, University of Kaiserslautern
Jakobs F, Meyer H (1992) Geophysik – Signale aus der Erde. Teubner, Leipzig
Kümmerer B (2002) Mathematik. Campus, Spektrum der Wissenschaftsverlagsgesellschaft, pp 1–15
Lemoine FG, Kenyon SC, Factor JK, Trimmer RG, Pavlis NK, Shinn DS, Cox CM, Klosko SM, Luthcke SB, Torrence MH, Wang YM, Williamson RG, Pavlis EC, Rapp RH, Olson TR (1998) The development of the joint NASA GSFC and NIMA geopotential model EGM96. NASA/TP1998-206861, NASA Goddard Space Flight Center, Greenbelt
Listing JB (1873) Über unsere jetzige Kenntnis der Gestalt und Größe der Erde. Dietrich, Göttingen
Marks DL (2013) A family of approximations spanning the Born and Rytov scattering series. Opt Exp 14:8837–8848
Martin GS, Marfurt KJ, Larsen S (2002) Marmousi-2: an updated model for the investigation of AVO in structurally complex areas. In: Proceedings, SEG annual meeting, Salt Lake City
Meissl P (1971) On the linearisation of the geodetic boundary value problem. Report No. 152, Department of Geodetic Science, The Ohio State University, Columbus, OH
Michel V (2002) A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the Earth’s interior. Habilitation-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen
Michel V (2013) Lectures on constructive approximation – Fourier, spline, and wavelet methods on the real line, the sphere, and the ball. Birkhäuser, Boston
Müller C (1969) Foundations of the mathematical theory of electromagnetic waves. Springer, Berlin/Heidelberg/New York
Nashed MZ (1981) Operator-theoretic and computational approaches to ill-posed problems with application to antenna theory. IEEE Trans Antennas Propag 29:220–231
Nerem RS, Koblinsky CJ (1994) The geoid and ocean circulation. In: Vanicek P, Christou NT (eds) Geoid and its geophysical interpretations. CRC, Boca Raton, pp 321–338
Neumann F (1887) Vorlesungen über die Theorie des Potentials und der Kugelfunktionen. Teubner, Leipzig, pp 135–154
Neunzert H, Rosenberger B (1991) Schlüssel zur Mathematik. Econ, Düsseldorf
Nolet G (2008) Seismic tomography: imaging the interior of the Earth and Sun. Cambridge University Press, Cambridge
Nutz H (2002) A unified setup of gravitational observables. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen


Ostermann I (2011) Modeling heat transport in deep geothermal systems by radial basis functions. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München
Pedlosky J (1979) Geophysical fluid dynamics. Springer, New York/Heidelberg/Berlin
Pesch HJ (2002) Schlüsseltechnologie Mathematik. Teubner, Stuttgart/Leipzig/Wiesbaden
Popov MM, Semtchenok NM, Popov, Verdel AR (2006) Gaussian beam migration of multi-valued zero-offset data. In: Proceedings, international conference, days on diffraction, St. Petersburg, pp 225–234
Popov MM, Semtchenok NM, Popov PM, Verdel AR (2008) Reverse time migration with Gaussian beams and velocity analysis applications. In: Extended abstracts, 70th EAGE conference & exhibitions, Rome, F048
Ritter JRR, Christensen UR (eds) (2007) Mantle plumes – a multidisciplinary approach. Springer, Heidelberg
Rummel R (2002) Dynamik aus der Schwere – Globales Gravitationsfeld. An den Fronten der Forschung (Kosmos, Erde, Leben), Hrsg. R. Emmermann u.a., Verhandlungen der Gesellschaft Deutscher Naturforscher und Ärzte, 122. Versammlung, Halle
Rummel R, van Gelderen M (1995) Meissl scheme – spectral characteristics of physical geodesy. Manuscr Geod 20:379–385
Skudrzyk E (1972) The foundations of acoustics. Springer, Heidelberg
Snieder R (2002) The Perturbation method in elastic wave scattering and inverse scattering in pure and applied science, general theory of elastic wave. Academic, San Diego, pp 528–542
Sonar T (2001) Angewandte Mathematik, Modellbildung und Informatik: Eine Einführung für Lehramtsstudenten, Lehrer und Schüler. Vieweg, Braunschweig, Wiesbaden
Sonar T (2011) 3000 Jahre Analysis. Springer, Heidelberg/Dordrecht/London/New York
Stokes GG (1849) On the variation of gravity at the surface of the earth. Trans Camb Philos Soc 8:672–712; Mathematical and physical papers by George Gabriel Stokes, vol II. Johnson Reprint Corporation, New York, pp 131–171
Tarantola A (1984) Inversion of seismic reflection data in the acoustic approximation. Geophysics 49:1259–1266
Torge W (1991) Geodesy. Walter de Gruyter, Berlin
Weyl H (1916) Über die Gleichverteilung von Zahlen mod Eins. Math Ann 77:313–352
Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München
Yilmaz O (1987) Seismic data analysis: processing, inversion and interpretation of seismic data. Society of Exploration Geophysicists, Tulsa


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_2-3 © Springer-Verlag Berlin Heidelberg 2015

Navigation on Sea: Topics in the History of Geomathematics

Thomas Sonar

Institut Computational Mathematics, Technische Universität Braunschweig, Braunschweig, Germany

Abstract

In this chapter, we review the development of the magnet as a means for navigational purposes. Around 1600, knowledge of the properties and behavior of magnetic needles began to grow in England, mainly through the publication of William Gilbert’s influential book De Magnete. Inspired by the rapid advancement of knowledge on one side and of the English fleet on the other, scientists associated with Gresham College began thinking of using magnetic instruments to measure the degree of latitude without being dependent on a clear sky, a quiet sea, or complicated navigational tables. The construction and actual use of these magnetic instruments, called dip rings, is a tragic episode in the history of seafaring, since the latitude does not depend on the magnetic field of the Earth. The construction of a table enabling seafarers to take the degree of latitude, however, is certainly a highlight in the history of geomathematics.

1 General Remarks on the History of Geomathematics

Geomathematics in our times is thought of as a very young science and a modern area in the realms of mathematics. Nothing could be further from the truth. Geomathematics began when man realized that he walked across a sphere-like Earth and that this had to be taken into account in measurements and computations. Hence, Eratosthenes can be seen as an early geomathematician when he tried to determine the circumference of the Earth by measurements of the sun’s position from the ground of a well and the length of shadows farther away at midday. Other important topics in the history of geomathematics are the struggles for an understanding of the true shape of the Earth, which led to the development of potential theory and much of multidimensional calculus (see Greenberg 1995), the mathematical developments around the research of the Earth’s magnetic field, and the history of navigation.
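A worked version of Eratosthenes' estimate may make the geometry concrete. The numbers are the values traditionally attributed to him; they are not given in this chapter and should be read as assumptions. Suppose the sun stands at the zenith at one site (the well) while, at a second site a distance $d$ due north, the midday shadow indicates a zenith angle $\theta$. If the sun's rays are taken as parallel, $\theta$ is also the angle between the two sites as seen from the Earth's center, so the arc $d$ is the fraction $\theta/360^\circ$ of the circumference $C$:

```latex
\[
  \frac{\theta}{360^\circ} = \frac{d}{C}
  \quad\Longrightarrow\quad
  C = \frac{360^\circ}{\theta}\, d .
\]
% Traditional values (assumed here): theta about 7.2 degrees = 360/50, d about 5000 stadia
\[
  C \approx 50 \times 5000\ \text{stadia} = 250\,000\ \text{stadia}.
\]
```

The result is only as good as the assumed distance and the length of the stadion, which is uncertain.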

2 Introduction

The history of navigation is one of the most exciting stories in the history of mankind and one of the most important topics in the history of geomathematics. The notion of navigation thereby spans the whole range from the ethnomathematics of Polynesian stick charts via the compass to modern mathematical developments in understanding the Earth’s magnetic field and satellite navigation via GPS. We shall concentrate here on the use of the magnetic needle for navigational purposes and in particular on developments that took place in early modern England. However, we begin our investigations with a short overview of the history of magnetism following Balmer (1956).


3 The History of the Magnet

The earliest sources on the use of magnets for the purpose of navigation stem from China. During the Han epoch (202 BC–220 AD), we find the description of carriages equipped with compass-like devices so that the early Chinese emperors were able to navigate on their journeys through their enormous empire. These carriages were called tschinan-tsche, meaning “carriages that show noon.” The compass-like devices consisted of little humanlike figures which floated on water in a bowl, the finger of the stretched arm always pointing straight to the south. We do not know nowadays why the ancient Chinese preferred the southward direction instead of a northbound one. In a book of historical memoirs written by Sse-ma-tsien (or Schumatsian), we find a report dating back to the first half of the second century about a present that emperor Tsching-wang gave in 1100 BC to the ambassadors of the cities of Tonking and Cochinchina. The ambassadors received five “magnetic carriages” in order to guide them safely back to their cities even through sand storms in the desert. Since the ancient Chinese knew about the attracting forces of a magnet, they called magnets “loving stones.” In a work on natural sciences written by Tschin-tsang-ki from the year 727, we read:

The loving stone attracts the iron like an affectionate mother attracts their children around her; this is the reason where the name comes from.

It was also known very early that a magnet could transfer its attracting properties to iron when it was swept over the piece of iron. In a dictionary of the year 121, the magnet is called a stone “with which a needle can be given a direction,” and hence, it is not surprising that a magnetic needle mounted on a piece of cork and floating in a bowl of water belonged to the standard equipment of larger Chinese ships as early as the fourth century. Such simple devices were called “bussola” by the Italians and are still known under that name. A magnetic needle does not point precisely to the geographic poles but to the magnetic ones. The locus of the magnetic poles moves in time so that a deviation always has to be taken into account. Around the year 1115, the problem of deviation was known in China.

The word “magnet” comes from the Greek word magnes describing a sebaceous rock which, according to the Greek philosopher and natural scientist Theophrastos (ca. 371–287 BC), was a forgeable and silver white rock. The philosopher Plato (428/27–348/47 BC) called magnetic rock the “stone of Heracles,” and the poet Lucretius (ca. 99–55 BC) used the word “magnes” in the sense of attracting stone. He attributed the name to a place named “Magnesia” where this rock could be found. Other classical Greek anecdotes credit a shepherd named “Magnes” with the name. It is said that he wore shoes with iron nails and, while accompanying his sheep, suddenly could no longer move because he stood on magnetic rock. Homer wrote about the force of the magnet as early as 800 BC. It seems typical of ancient Greek culture that an explanation of this force was sought fairly early on. Plato thought of this force as being simply “divine.” The philosopher Epicurus (341–270 BC) had the hypothesis that magnets radiate tiny particles, namely atoms. Eventually Lucretius exploited this hypothesis and explained the attracting force of a magnet by the property of the radiated atoms to clear the space between the magnet and the iron. Iron atoms could then penetrate into the free space, and since iron atoms try hard to stay together (says Lucretius), the iron piece would follow them. The pressure of the air also played some minor role in this theory. Lucretius knew that he had to answer the question why iron would follow but other materials would not. He simply declared that gold would be too heavy and that timber would be too porous, so that the atoms of the magnet would simply pass through.


Fig. 1 The magnetic perpetuum mobile of Peregrinus

Fig. 2 Elizabeth I (Armada portrait)

The first news on the magnet in Western Europe came from Paris around the year 1200. A magnetic needle was used to determine the orientation. We do not know how the magnet came to Western Europe and how it was received but it is almost certainly true that the crusades and the associated contact with the peoples in the Mediterranean played a crucial role. Before William Gilbert around 1600 came up with a “magnetick philosophy,” it was the crusader, astronomer, chemist, and physician Peter Peregrinus De Maricourt who developed a theory of the magnet in a famous “letter on the magnet” dating back to 1269. He describes experiments with magnetic stones which are valid even nowadays. Peregrinus grinds a magnetic stone in the form of a sphere, places it in a wooden plate, and puts this plate in a bowl with water. Then he observes that the sphere moves according to the poles. He develops ideas of magnetic clocks and describes the meaning of the magnet with respect to the compass. He also develops a magnetic perpetuum mobile according to Fig. 1. A magnet is mounted at the tip of a hand which is periodically moving (says Peregrinus) because of iron nails on the circumference.


Peregrinus’ work was so influential in Western Europe that even 300 years after his death, he was still accepted as the authority on the magnet.

4 Early Modern England

In the sixteenth century, Spain and Portugal developed into the leading sea powers. Streams of gold, spices, and gemstones poured from South America into the home countries. England had missed the connection. When Henry VIII died in 1547, only a handful of decaying ships were lying in the English sea harbors. His successor, his son Edward VI, could only rule for 6 years before he died young. Henry’s daughter Mary, a devout Catholic, tried hard to re-catholicize the country her father had steered into Protestantism and married Philip II, King of Spain. Mary was fairly brutal in her means of re-catholicization, and many of the Protestant intelligentsia left the country in fear of their lives. “Bloody Mary” died in 1558 at the age of 42, and the way opened to her half-sister Elizabeth. Within a single generation, Elizabeth I transformed rural England into the leading sea power on Earth. She was advised very well by Sir Walter Raleigh, who clearly saw the future of England on the seas. New ships were built for the navy, and in 1588 the small English fleet was able to defeat the famous Spanish Armada, by chance and with good luck; but this incident served to boost not only the self-esteem of a whole nation but also the realization of the need for a navy and for efficient navigational tools.

Fig. 3 De Magnete


Fig. 4 Norman’s discovery of the magnetic dip (panels: unmagnetized needle, magnetized needle)

Fig. 5 Dip rings: (a) the dip ring after Gilbert in De Magnete. (b) A dip ring used in the seventeenth century

English mariners realized on longer voyages that the magnetic needle inside a compass lost its magnetic power. If that was detected, the needle had to be magnetized afresh – it had to be “loaded” afresh. However, this is not the reason why the magnet is called loadstone in the English language but only a mistranslation. The correct word should be lodstone – “leading stone” – but that word was actually never used (Pumfrey 2002).

5 The Gresham Circle

In 1592, Henry Briggs (1561–1639), chief mathematician in his country, was elected examiner and lecturer in mathematics at St John’s College, Cambridge, which nowadays corresponds to a professorship. In the same year, he was elected Reader of the Physics Lecture founded by Dr. Linacre in London. One hundred years before the birth of Briggs, Thomas Linacre was horrified by the pseudomedical treatment of sick people by hairdressers and vicars who did not shrink back from surgical operations without a trace of medical instruction. He founded the Royal College of Physicians of London, and Briggs was now asked to deliver lectures with medical contents. The Royal College of Physicians was the first important domain for Briggs to make contact with men outside the spheres of the two great universities, and, indeed most important, he met William Gilbert (1544–1603), who was working on the wonders of the magnetical forces and who revolutionized modern science only a few years later.


Fig. 6 Measuring the dip on the terrella in De Magnete

While England was on its way to becoming the world’s leading sea force, the two old English universities Oxford and Cambridge were in an alarming state of sleepiness (Hill 1997, p. 16ff). Instead of working and teaching on the forefront of modern research in important topics like navigation, geometry, and astronomy, the curricula were directly rooted in the ancient Greek tradition. Mathematics included reading of the first four or five books of Euclid, medicine was read after Galen, and Ptolemy ruled in astronomy. When the founder of the English stock exchange (Royal Exchange) in London, Thomas Gresham, died, he left in his last will money and buildings in order to found a new form of university, the Gresham College, which still exists today. He ordered the employment of seven lecturers to give public lectures in theology, astronomy, geometry, music, law, medicine, and rhetoric, mostly in the English language. The salary of the Gresham professors was determined to be £50 a year, which was an enormous sum as compared to the salary of the Regius professors in Oxford and Cambridge (Hill 1997, p. 34). The only conditions on the candidates for the Gresham professorships were brilliance in their field and an unmarried style of life. Briggs must have been already well known as a mathematician of the first rank since he was chosen to be the first Gresham professor of Geometry in 1596. Modern mathematics was needed badly in the art of navigation, and public lectures on mathematics were in fact already given in 1588 on behalf of the East India Company, the Muscovy Company, and the Virginia Company. Even before 1588, there were attempts by Richard Hakluyt to establish public lectures, and none less than Francis Drake had promised £20 (Hill 1997, p. 34), but it needed the national shock of the attack of the Armada in 1588 to make such lectures come true.


During his time in Gresham College, Briggs became the center of what we can doubtlessly call the Briggsian circle. Hill writes (1997, p. 37): He [i.e. BRIGGS] was a man of the first importance in the intellectual history of his age, . . . . Under him Gresham at once became a centre of scientific studies. He introduced there the modern method of teaching long division, and popularized the use of decimals.

The Briggsian circle consisted of true Copernicans: men like William Gilbert, who wrote De Magnete; the able applied mathematician Edward Wright, who is famous for his book on the errors in navigation; William Barlow, a fine instrument maker and man of experiments; and the great popularizer of scientific knowledge, Thomas Blundeville. Gilbert and Blundeville were protégés of the Earl of Leicester, and we know about connections with the circle of Raleigh in which the brilliant mathematician Thomas Harriot worked. Blundeville held contacts with John Dee, who introduced modern continental mathematics and the Mercator maps in England (Hill 1997, p. 42). Hence, we can think of a scientific network in England in which important work could be done that was impossible to do in the great universities. It was during this time at Gresham College that Briggs and his circle were most productive in the calculation of tables of astronomical and navigational importance. At the center of their activities was Gilbert’s “magnetick philosophy.”

6 William Gilbert’s Dip Theory

The role of William Gilbert in shaping modern natural sciences cannot be overestimated, and a recent biography of Gilbert (Pumfrey 2002) emphasizes his importance in England and abroad. Gilbert, a physician and member of the Royal College of Physicians in London, became interested in navigational matters, and in the properties of the magnetic needle in particular, through his contacts with seamen and famous navigators of his time. As a result of years of experiments, thought, and discussions with his Gresham friends, the book De Magnete, magneticisque corporibus, et de magno magnete tellure; Physiologia nova, plurimis & argumentis, & experimentis demonstrata was published in 1600. (I refer to the English translation Gilbert (1958) by P. Fleury Mottelay, which is a reprint of the original of 1893. There is a better translation by Sylvanus P. Thompson from 1900, but while the latter is rare, the former is still in print.) It contained many magnetic experiments with what Gilbert called his terrella (the little Earth), which was a magnetic sphere. In the spirit of a true Copernican, Gilbert deduced the rotation of the Earth from the assumption of it being a magnetic sphere. Concerning navigation, Briggs, and the Gresham circle, the most interesting part of De Magnete is Book V: On the dip of the magnetic needle. Already in 1581, the instrument maker Robert Norman had discovered the magnetic dip in his attempts to straighten magnetic needles in a fitting on a table. He had observed that an unmagnetized needle could be fitted in a position parallel to the surface of a table, but when the same needle was magnetized and fitted again, it made an angle with the table. Norman published his results in the same year (R. Norman, The New Attractive, London, 1581).
Even before Norman, the dip had been reported by the German astronomer and instrument maker Georg Hartmann from Nuremberg in a letter to Duke Albrecht of Prussia dated 4 March 1544 (see Balmer 1956, pp. 290–292), but nobody read the letter. In modern terminology, the phenomenon of the dip is called inclination, in contrast to the declination or variation of the needle. A word of warning is appropriate here: in Gilbert’s time, many authors used the word declination for the inclination. Anyway, Norman was the first to build a dip ring in order to measure the inclination. This ring is nothing else but a vertical compass. Norman had already discovered that the dip varied with time! However, Gilbert believed that he had found the secrets of magnetic navigation. He explained the variation of the needle by land masses acting on the compass, which fitted nicely with the measurements of seamen but is wrong, as we now know.

Concerning the dip, let me give a summary of Gilbert’s work in modern terms. Gilbert must have measured the dip on his terrella many, many times before he was led to his:

First hypothesis: There is an invertible mapping between the lines of latitude and the lines of constant dip.

Hence, Gilbert believed he had found a way of determining the latitude on Earth from the degree of the dip. Let $\beta$ be the latitude and $\alpha$ the dip. He then formulated his:

Second hypothesis: At the equator, the needle is parallel to the horizon, i.e., $\alpha = 0^\circ$. At the north pole, the needle is perpendicular to the surface of the Earth, i.e., $\alpha = 90^\circ$.

He then draws a conclusion, but in our modern eyes, this is nothing but another:

Third hypothesis: If $\beta = 45^\circ$, then the needle points exactly to the second equatorial point.

What he meant by this is best described in Fig. 7. Gilbert himself writes:

. . . points to the equator F as the mean of the two poles. (Gilbert 1958, p. 293)

Fig. 7 Third hypothesis

Note that in Fig. 7, the equator is given by the line A–F and the poles are B (north) and C, so that our implicit (modern) assumption that the north pole is always shown at the top of a figure is not satisfied. From his three hypotheses, Gilbert concludes correctly:

First conclusion: The rotation of the needle has to be faster on its way from A to L than from L to B.

Or, in Gilbert’s words,

. . . the movement of this rotation is quick in the first degrees from the equator, from A to L, but slower in the subsequent degrees, from L to B, that is, with reference to the equatorial point F, toward C. (Gilbert 1958, p. 293)

Fig. 8 Gilbert’s geometrical construction of the mapping in De Magnete

And on the same page, we read: . . . it dips; yet, not in ratio to the number of degrees or the arc of the latitude does the magnetic needle dip so many degrees or over a like arc; but over a very different one, for this movement is in truth not a dipping movement, but really a revolution movement, and it describes an arc of revolution proportioned to the arc of latitude.

This is simply a lengthy description of the following:

Second conclusion: The mapping between latitude and dip cannot be linear.

Now that Gilbert had made up his mind concerning the behavior of the mapping at $\beta = 0^\circ$, $45^\circ$, and $90^\circ$, a construction of the general mapping was sought. It is exactly here that De Magnete shows strange weaknesses; in fact, we witness a qualitative jump from a geometric construction to a dip instrument. Gilbert’s geometrical description can be seen in Fig. 8. I do not intend to comment on this construction because this was done in detail elsewhere (Sonar 2001), but it is not possible to understand the construction from Gilbert’s writings in De Magnete. Even more surprising, while all figures in De Magnete are crude woodcuts of the quality shown before, suddenly there is a fine technical drawing of the resulting construction, as shown in Fig. 9. The difference between this drawing and all other figures in De Magnete, together with the weakness in the description of the construction of the mapping between latitudes and dip angles, suggests that at least this part was not written by Gilbert alone but by some of his friends in the Gresham circle. Pumfrey speaks of the dark secret of De Magnete (Pumfrey 2002, p. 173ff) and gives evidence that Edward Wright, whose On Certain Errors in Navigation had appeared a year before De Magnete, had a hand in some parts of Gilbert’s book. In Parsons and Morris (1939, pp. 61–67), we find the following remarks:

Wright, and his circle of friends, which included Dr. W. Gilbert, Thomas Blundeville, William Barlow, Henry Briggs, as well as Hakluyt and Davis, formed the centre of scientific thought at the turn of the century. Between these men there existed an excellent spirit of co-operation, each sharing his own discoveries with the others. In 1600 Wright assisted Gilbert in the compilation of De Magnete. He wrote a long preface to the work, in which he proclaimed his belief in the rotation of the earth, a theory which Gilbert was explaining, and also contributed chapter 12 of Book IV, which dealt with the method of finding the amount of the variation of the compass. Gilbert devoted his final chapters to practical problems of navigation, in which he knew many of his friends were interested.


Fig. 9 The fine drawing in De Magnete

There is no written evidence that Briggs was involved too, but it seems very unlikely that the chief mathematician of the Gresham circle should not have taken part in so important a development as the dip theory. We shall see later on that the involvement of Briggs is highly likely when we study his contributions to dip theory in books of other authors.

7 The Briggsian Tables

If we trust Ward (1740, pp. 120–129), the first published table of Henry Briggs is the table which represents Gilbert’s mapping between latitude and dip angles in Thomas Blundeville’s book

The Theoriques of the seuen Planets, shewing all their diuerse motions, and all other Accidents, called Passions, thereunto belonging. Whereunto is added by the said Master Blundeuile, a breefe Extract by him made, of Magnus his Theoriques, for the better vnderstanding of the Prutenicall Tables, to calculate thereby the diuerse motions of the seuen Planets. There is also hereto added, The making, description, and vse, of the two most ingenious and necessarie Instruments for Sea-men, to find out therebye the latitude of any place vpon the Sea or Land, in the darkest night that is, without the helpe of Sunne, Moone, or Starre. First inuented by M. Doctor Gilbert, a most excellent Philosopher, and one of the ordinarie Physicians to her Maiestie: and now here plainely set down in our mother tongue by Master Blundeuile. London Printed by Adam Islip. 1602.

Blundeville is an important figure in his own right; see Taylor (1954, p. 173) and Waters (1958, pp. 212–214). He was one of the first and most influential popularizers of scientific knowledge. He did not write for the expert but for the layman, i.e., the young gentlemen interested in such diverse subjects as science, the writing of history, mapmaking, logic, seamanship, or horse riding. We do not know much about his life (Campling 1921–1922), but his role in the Gresham circle is apparent through his writings. In The Theoriques, Gilbert’s dip theory is explained in detail, and a step-by-step description of the construction of the dip instrument is given. I have followed Blundeville’s instructions and constructed the dip instrument again elsewhere; see Sonar (2002). See also Sonar (2001). The final result is shown in Blundeville’s book as in Fig. 11.

Fig. 10 Title page of The Theoriques by Blundeville

In order to understand the geometrical details, it is necessary to give a condensed description of the actual construction in Fig. 8, which is given in detail in The Theoriques. We start with a circle ACDL representing planet Earth as in Fig. 12. Note that A is an equatorial point while C is a pole. The navigator (and hence the dip instrument) is assumed to be in point N, which corresponds to the latitude $\beta = 45^\circ$. In the first step of the construction, a horizon line is sought, i.e., the line from the navigator in N to the horizon. Now a circle is drawn around A with radius AM (the Earth’s radius). This marks the point F on a line through A parallel to CL. A circle around M through F now gives the arc FM. The point H is constructed by drawing a circle around C with radius AM. The point of intersection of this circle with the outer circle through F is H. If the dip instrument is at A, the navigator’s horizon point is F. If it is in C, the navigator’s horizon will be in H. Correspondingly, drawing a circle around N with radius AM gives the point S; hence, S is the point at the horizon seen from N. Hence, to every position N of the needle, there is a quadrant of dip, which is the arc from M to a corresponding point on the outer circle through F. If N is at $\beta = 45^\circ$ latitude as in our example, we know from Gilbert’s third hypothesis that the needle points to D. The angle between S and the intersection point of the quadrant of dip with the direction of the needle is the dip angle.

The remaining missing information is the point to which the needle points for a general latitude $\beta$. This is accomplished by quadrants of rotation, which implement Gilbert’s idea of the needle rotating on its way from A to C. The construction of these quadrants is shown in Fig. 13. We need a second outer circle, which is constructed by drawing a circle around A through L. The intersection point of this circle with the line AF is B, and the second outer circle is then the circle through B around M. Drawing a circle around C through L defines the point G on the second outer circle. These arcs, GL and BL, are the quadrants of rotation corresponding to the positions C and A of the needle, respectively. Assume again that the dip instrument is in N at $\beta = 45^\circ$. Then the corresponding quadrant of rotation is constructed by a circle around N through L and is the arc OL. This arc is now divided into 90 parts, starting from the second outer circle ($0^\circ$) and ending at L ($90^\circ$). Obviously, in our example, according to Gilbert, the $45^\circ$ mark is exactly at D. Now we are ready for the final step.
Putting together all our quadrants and lines, we arrive at Fig. 14. The needle at N points to the mark $45^\circ$ on the arc of rotation OL and hence intersects the quadrant of dip (arc SM) in the point R. The angle of the arc SR is the dip angle $\delta$. We can now proceed in this manner for all latitudes from $\beta = 0^\circ$ to $\beta = 90^\circ$ in steps of $5^\circ$. Each latitude gives a new quadrant of dip, a new quadrant of rotation, and a new intersection point R. The final construction is shown in Fig. 15. However, in Fig. 15, the construction is shown in the lower right quadrant instead of the upper left and already uses the notation of Blundeville instead of that of William Gilbert. The main goal of the construction, however, is a spiral line which appears after the removal of all the construction lines, as in Fig. 16, and can already be seen in the upper left picture in Blundeville’s drawing in Fig. 11. The spiral line consists of all intersection points R. Together with a quadrant which can rotate around the point C of the mater, the instrument is ready to use.

In order to illustrate its use, we give an example. Consider a seaman who has used a dip ring and measured a dip angle of $60^\circ$. Then he would rotate the quadrant until the spiral line intersects the quadrant at the point $60^\circ$ on the inner side of the quadrant. Then the line A–B on the quadrant intersects the scale on the mater at the degree of latitude, in our case $36^\circ$; see Fig. 18. However, accurate reading of the scales becomes nearly impossible for angles of dip larger than $60^\circ$, and the reading depends heavily on the accuracy of the construction of the spiral line. Therefore, Henry Briggs was asked to compute a table in order to replace the dip instrument by a simple table look-up. At the very end of Blundeville’s The Theoriques, we find the following appendix; see Fig. 19:

A short appendix annexed to the former Treatise by Edward Wright, at the motion of the right Worshipful M. Doctor Gilbert

Because of the making and using of the foresaid Instrument, for finding the latitude by the declination of the Magneticall Needle, will bee too troublesome for the most part of Sea-men, being notwithstanding a thing most worthie to be put in daily practise, especially by such as undertake long voyages: it was thought meet by my worshipfull friend M. Doctor Gilbert, that (according to M. Blundeuile’s earnest request) this Table following should be hereunto adioned; which M. Henry Briggs (professor of Geometrie in Gresham Colledge at London) calculated and made out of the doctrine and tables of Triangles, according to the Geometricall grounds and reason of this Instrument, appearing in the 7 and 8 Chapter of M. Doctor Gilberts fift[h] booke of the Loadstone. By helpe of which Table, the Magneticall declination being giuen, the height of the Pole may most easily be found, after this manner. With the Instrument of Declination before described, find out what the Magneticall declination is at the place where you are: Then look that Magneticall declination in the second Collum[n]e of this Table, and in the same line immediatly towards the left hand, you shall find the height of the Pole at the same place, unleße there be some variation of the declination, which must be found out by particular obseruation in euery place.

Fig. 11 The dip instrument in The Theoriques

The next page (which is the final page of The Theoriques) indeed shows the table. Fig. 19 shows the appendix. In order to make the numbers in the table more visible, I have retyped the table.

Fig. 12 The quadrant of dip

Fig. 13 The quadrant of rotation

Fig. 14 The final steps

Fig. 15 The construction of Gilbert’s mapping

Fig. 16 The mater of the dip instrument in The Theoriques

Fig. 17 The quadrant

Fig. 18 Determining the latitude for $60^\circ$ dip

Fig. 19 The appendix as found in Edward Wright’s On Errors in Navigation

We shall not discuss this table in detail, but it is again worthwhile to review the relations between Gilbert, Briggs, Blundeville, and Wright (Hill 1997, p. 36):

Briggs was at the center of Gilbert’s group. At Gilbert’s request he calculated a table of magnetic dip and variation. Their mutual friend Edward Wright recorded and tabulated much of the information which Gilbert used and helped in the production of De Magnete. Thomas Blundeville, another member of Briggs’s group, and, like Gilbert, a former protégé of the Earl of Leicester, popularized Gilbert’s discoveries in The Theoriques of the Seven Planets (1602), a book in which Briggs and Wright again collaborated.


First column: heights of the pole (degrees). Second column: magnetical declination (deg. min.).

| Height of the pole | Magnetical declination | Height of the pole | Magnetical declination | Height of the pole | Magnetical declination |
|---|---|---|---|---|---|
| 1 | 2 11 | 31 | 52 27 | 61 | 79 29 |
| 2 | 4 20 | 32 | 53 41 | 62 | 80 4 |
| 3 | 6 27 | 33 | 54 53 | 63 | 80 38 |
| 4 | 8 31 | 34 | 56 4 | 64 | 81 11 |
| 5 | 10 34 | 35 | 57 13 | 65 | 81 43 |
| 6 | 12 34 | 36 | 58 21 | 66 | 82 13 |
| 7 | 14 32 | 37 | 59 28 | 67 | 82 43 |
| 8 | 16 28 | 38 | 60 33 | 68 | 83 12 |
| 9 | 18 22 | 39 | 61 37 | 69 | 83 40 |
| 10 | 20 14 | 40 | 62 39 | 70 | 84 7 |
| 11 | 22 4 | 41 | 63 40 | 71 | 84 32 |
| 12 | 23 52 | 42 | 64 39 | 72 | 84 57 |
| 13 | 25 38 | 43 | 65 38 | 73 | 85 21 |
| 14 | 27 22 | 44 | 66 35 | 74 | 85 44 |
| 15 | 29 4 | 45 | 67 30 | 75 | 86 7 |
| 16 | 30 45 | 46 | 68 24 | 76 | 86 28 |
| 17 | 32 24 | 47 | 69 17 | 77 | 86 48 |
| 18 | 34 0 | 48 | 70 9 | 78 | 87 8 |
| 19 | 35 36 | 49 | 70 59 | 79 | 87 26 |
| 20 | 37 9 | 50 | 71 48 | 80 | 87 44 |
| 21 | 38 41 | 51 | 72 36 | 81 | 88 1 |
| 22 | 40 11 | 52 | 73 23 | 82 | 88 17 |
| 23 | 41 39 | 53 | 74 8 | 83 | 88 33 |
| 24 | 43 6 | 54 | 74 52 | 84 | 88 47 |
| 25 | 44 30 | 55 | 75 35 | 85 | 89 1 |
| 26 | 45 54 | 56 | 76 17 | 86 | 89 14 |
| 27 | 47 15 | 57 | 76 57 | 87 | 89 27 |
| 28 | 48 36 | 58 | 77 37 | 88 | 89 39 |
| 29 | 49 54 | 59 | 78 15 | 89 | 89 50 |
| 30 | 51 11 | 60 | 78 53 | 90 | 90 0 |
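The table look-up described in the appendix (find the measured dip in the second column, read the height of the pole to its left) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, only five rows of the table are copied, and the linear interpolation between rows is an assumption of this sketch, since the appendix describes reading whole rows only.

```python
from bisect import bisect_left

# Excerpt of the retyped table: (height of the pole in degrees, dip as (deg, min)).
TABLE_EXCERPT = [
    (43, (65, 38)),
    (44, (66, 35)),
    (45, (67, 30)),
    (46, (68, 24)),
    (47, (69, 17)),
]


def to_degrees(deg, minutes):
    """Convert degrees and minutes to decimal degrees."""
    return deg + minutes / 60.0


def latitude_from_dip(dip_deg, table=TABLE_EXCERPT):
    """Look up the latitude for a measured dip (decimal degrees).

    Exact table entries are returned directly; values between two rows
    are interpolated linearly (an assumption of this sketch).
    """
    lats = [lat for lat, _ in table]
    dips = [to_degrees(*dm) for _, dm in table]
    if not dips[0] <= dip_deg <= dips[-1]:
        raise ValueError("dip outside the tabulated range")
    i = bisect_left(dips, dip_deg)
    if dips[i] == dip_deg:
        return float(lats[i])
    frac = (dip_deg - dips[i - 1]) / (dips[i] - dips[i - 1])
    return lats[i - 1] + frac * (lats[i] - lats[i - 1])


print(latitude_from_dip(67.5))  # dip 67 deg 30 min is the table row for latitude 45
```

Extending `TABLE_EXCERPT` to all 90 rows would cover the full range of the table.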

It took Blundeville’s The Theoriques to describe accurately the construction of the dip instrument, which had appeared only nebulously in Gilbert’s De Magnete. However, even Blundeville does not say a word concerning the computation of the table. Another friend in the Gresham circle, the famous Edward Wright, included all of the necessary details in the second edition of his On Errors in Navigation (Wright 1610), the first edition of which appeared in 1599. Much has been said about the importance of Edward Wright (see, for instance, Parsons and Morris 1939), and he was certainly one of the first, if not the first, who was fully aware of the mathematical background of Mercator’s mapping; see Sonar (2001, p. 131ff). It is in Wright’s On Errors in Navigation that Wright and Briggs explain the details of the computation of the dip table, which was actually computed by Briggs, showing superb mastery of trigonometry. We shall now turn to this computation.


8 The Computation of the Dip Table

In the second edition of Wright’s book On Errors in Navigation, we find in chapter XIIII:

Fig. 20 Figure A: CHAP. XIIII to find the inclination or dipping of the magneticall needle under the Horizon

Let OBR be a meridian of the Earth, wherein let O be the pole, B the æquinoctal, and R the latitude (suppose 60 degrees), and let BD be perpendicular to AB in B and equal to the subtense OB; and drawing the line AD, describe therwith the arch DSV. Then draw the subtense OR, wherewith (taking R for the center) draw the lines RS equal to RO and AS equal to AD. Also because BR is assumed to be 60 deg., therefore let ST be 60/90 parts of the arch STO, and draw the line RT, for the angle ART shall be the cõplement of the magnetical needles inclinatiõ under the horizon, which may be found by the solution of the two triangles OAR and RAS after this manner:

Although the notation used here again differs from that in Blundeville’s book and in De Magnete, we can easily recognize the situation as described by Gilbert. Now the actual computation starts:

First the triangle OAR is given because of the arch OBR, measuring the same 150 degr. and consequently the angle at R 15 degr. being equall to the equall legged angle at O; both which together are 30 degr. because they are the complement of the angle OAR (150 degr.) to a semicircle of 180 degr.

The first step in the computation hence concerns the triangle OAR in Fig. 21. Since point R lies at $60^\circ$ (measured from B), the arc OBR corresponds to an angle of $90^\circ + 60^\circ = 180^\circ - 30^\circ = 150^\circ$. Hence, the angle at A in the triangle OAR is just $90^\circ + (90^\circ - 30^\circ) = 150^\circ$. Since OAR is isosceles, the angles at O and R are identical, and each is $15^\circ$. Let us go on with Wright:

Secondly, in the triangle ARS all the sides are given AR the Radius or semidiameter 10,000,000: RS equal to RO the subtense of 150 deg. 19,318,516: and AS equall to AD triple in power to AB, because it is equal in power to AB and BD, that is BO, which is double in power to AB.

We now look at the triangle ARS in Fig. 21, where S lies on the circle around A with radius AD and on the circle around R with radius OR. The segment AR is the radius of the Earth or the “whole sine.” Wright takes this value to be $10^7$. We have to clarify what is meant by subtense and where the number 19,318,516 comes from. Employing the law of sines in triangle OAR, we get

$$\frac{OR}{AR} = \frac{\sin 150^\circ}{\sin 15^\circ},$$

and therefore, it follows that

$$OR = RO = AR \cdot \frac{OR}{AR} = AR \cdot \frac{\sin 150^\circ}{\sin 15^\circ} = 19{,}318{,}516(.5257\ldots).$$

Fig. 21 Figure B: The first step

Since O lies on the circle around R with radius OR, as S does, we also have $RO = RS$. Furthermore, $AS = AD$ since D, like S, lies on the circle around A with radius AD. Per constructionem, we have $BD = OB$, and using the theorem of Pythagoras, we conclude $OB^2 = BD^2 = 2AB^2$ as well as

$$AD^2 = AB^2 + BD^2 = AB^2 + 2AB^2 = 3AB^2.$$

This reveals the meaning of the phrase triple in power to AB: “the square is three times as big as the square of AB.” Hence, it follows for AS:

$$AS = AD = \sqrt{3} \cdot AB = 17{,}320{,}508(.0757\ldots).$$

It is somewhat interesting that Wright does not compute the square root but gives an alternative mode of computation as follows:

Or else thus: The arch OB being 90 degrees, the subtense therof OB, that is, the tangent BD is 14,142,136, which sought in the table of Tangents, shall giue you the angle BAD 54 degr. 44 min. 8 sec. the secant whereof is the line AD that is AS 17,320,508.


In the triangle ABD, we know the lengths of the segments AB and $BD = OB = 14{,}142{,}135(.6237\ldots)$. Hence, for the angle at A, we get

$$\tan \angle A = \frac{BD}{AB} = \frac{\sqrt{2} \cdot AB}{AB} = \sqrt{2},$$

which results in $\angle A = 54.7356\ldots^\circ = 54^\circ 44' 8''$. Using this value, it follows from

$$\sin \angle A = \frac{BD}{AD} = \frac{OB}{AD}$$

that

$$AD = AS = \frac{OB}{\sin \angle A} = \frac{\sqrt{2} \cdot AB}{\sin 54^\circ 44' 8''} = 17{,}320{,}508(.0757\ldots).$$
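The numbers quoted so far can be reproduced with a short Python sketch. It assumes that the subtense of an arc is its chord, so that for radius $AR$ the subtense of an arc $\theta$ is $2 \cdot AR \cdot \sin(\theta/2)$, which agrees with the law-of-sines value above. The names `subtense` and `to_dms` are introduced here for illustration only.

```python
import math

RADIUS = 10_000_000  # Wright's "whole sine", AB = AR = 10^7


def subtense(arc_deg, radius=RADIUS):
    """Chord subtending an arc of arc_deg degrees on a circle of the given radius."""
    return 2 * radius * math.sin(math.radians(arc_deg) / 2)


def to_dms(angle_deg):
    """Split a decimal angle into degrees, minutes, seconds (rounded to seconds)."""
    total_seconds = round(angle_deg * 3600)
    degrees, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return degrees, minutes, seconds


OR = subtense(150)                     # about 19,318,516.53
OB = subtense(90)                      # about 14,142,135.62, equal to BD
AD_direct = math.sqrt(3) * RADIUS      # "triple in power to AB"

# Wright's alternative route: angle BAD from the tangent, AD as its secant.
angle_BAD = math.degrees(math.atan(OB / RADIUS))
AD_secant = RADIUS / math.cos(math.radians(angle_BAD))

print(OR, OB, AD_direct, AD_secant)
print(to_dms(angle_BAD))               # (54, 44, 8)
```

Both routes to AD agree to within floating-point rounding, which mirrors the remark that Wright’s table look-up replaces the extraction of a square root.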

Wright goes on:

Now then by 4 Axiom of the 2 booke of Pitisc.¹ as the base or greatest side SR 19,318,516 is to ye summe of the two other sides SA and AR 27,320,508; so is the difference of them SX 7,320,508 to the segment of the greatest side SY 10,352,762; which being taken out of SR 19,318,516, there remaineth YR 8,965,754, the halfe whereof RZ 4,482,877, is the Sine of the angle RAZ 26 degr. 38 min. 2 sec. the complement whereof 63 degr. 21 min. 58 sec. is the angle ARZ, which added to the angle ARO 15 degr. maketh the whole angle ORS, 78 degr. 21 min. 58 sec. wherof 60/90 make 52 degr. 14 min. 38 sec. which taken out of ARZ 63 degr. 21 min. 58 sec. there remaineth the angle TRA 11 deg. 7 min. 20 sec. the cõplement whereof is the inclination sought for 78 degrees, 52 minutes, 40 seconds.

Fig. 22 Figure C: Second step

1 The Silesian Bartholomäus Pitiscus (1561–1613) authored the first useful text book on trigonometry: Trigonometriae sive dimensione triangulorum libre quinque, Frankfurt 1595, which was published as an appendix to a book on astronomy by Abraham Scultetus. First independent editions were published in Frankfurt 1599, 1608, 1612 and in Augsburg 1600. The first English translation appeared in 1630.


The “Axiom 4” mentioned is nothing but the theorem of chords: If two chords in a circle intersect, then the product of the segments of the first chord equals the product of the segments of the other.
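The relation that Wright draws from this axiom, namely that the base SR is to the sum of the sides SA and AR as their difference is to the segment SY, can be checked numerically for the latitude of $60^\circ$ used in his example. The sketch below is only an illustration under stated assumptions: the coordinate system (A at the origin, R on the positive x-axis) and the function name `secant_check` are choices made here, not part of Wright’s text. It intersects the line through S and R with the circle around A of radius AB and compares the product of the two distances from S with (SA + AR)(SA − AR).

```python
import math


def secant_check(latitude_deg=60.0, radius=1.0e7):
    """Return (SY, SR, (AS + AB) * (AS - AB)) for Wright's configuration."""
    ab = radius
    as_len = math.sqrt(3.0) * radius
    # RS = RO is the subtense (chord) of the arc OBR = 90 degrees + latitude.
    sr_len = 2.0 * radius * math.sin(math.radians(90.0 + latitude_deg) / 2.0)

    # Assumed coordinates: A = (0, 0), R = (radius, 0).
    # S lies on the circle around A with radius AS and around R with radius SR.
    px = (as_len**2 - sr_len**2 + radius**2) / (2.0 * radius)
    py = math.sqrt(as_len**2 - px**2)

    # Unit vector pointing from S to R.
    ux = (radius - px) / sr_len
    uy = -py / sr_len

    # Points S + t*u on the circle around A with radius AB satisfy
    # t^2 + 2*b*t + c = 0, where t is the distance from S.
    b = px * ux + py * uy
    c = as_len**2 - ab**2
    root = math.sqrt(b * b - c)
    sy_len = -b - root        # nearer intersection Y
    sr_from_line = -b + root  # farther intersection, which is R itself
    return sy_len, sr_from_line, (as_len + ab) * (as_len - ab)


sy, sr, product = secant_check()
print(round(sy), round(sy * sr), round(product))
```

For radius $10^7$, the nearer intersection lies at a distance of about 10,352,762 from S, the value Wright gives for SY.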

Looking at Fig. 23, the theorem of chords is

$$MS \cdot SX = SR \cdot SY,$$

and since $MS = AS + AB$, it follows that

$$(AS + AB) \cdot SX = SR \cdot SY,$$

resulting in

$$\frac{SR}{AS + AB} = \frac{SX}{SY}.$$

Fig. 23 Figure D: The final step

Now the computations should be fully intelligible. Given are $AS = 17{,}320{,}508$, $AB = 10^7$, $SR = OR = 19{,}318{,}516$, and $SX = AS - AX = 7{,}320{,}508$. Hence,

$$SY = \frac{SX \cdot (AS + AB)}{SR} = 10{,}352{,}762.$$

The segment YR has length $YR = SR - SY = 8{,}965{,}754$. Per constructionem, the point Z is the midpoint of YR. Half of YR is $RZ = 4{,}482{,}877$. From $\sin \angle RAZ = RZ/AR = 4{,}482{,}877/10^7 = 0.4482877$, we get $\angle RAZ = 26.6339^\circ = 26^\circ 38' 2''$. In the right-angled triangle ARZ, we see from Fig. 24 that $\angle ARZ = 90^\circ - 26^\circ 38' 2'' = 63^\circ 21' 58''$. At a latitude of $60^\circ$, the angle ARO at R is $15^\circ$ since the obtuse angle in the isosceles triangle ORA is $90^\circ + 60^\circ = 150^\circ$. Therefore, $\angle ORS = \angle ARO + \angle ARZ = 78^\circ 21' 58''$. The part TRS of this angle is 60/90 of it; hence, $\angle TRS = 52^\circ 14' 38''$. We arrive at

$$\angle TRA = \angle ARZ - \angle TRS = 63^\circ 21' 58'' - 52^\circ 14' 38'' = 11^\circ 7' 20''.$$

The dip angle $\delta$ is the complement of the angle TRA,

$$\delta = 90^\circ - \angle TRA = 78^\circ 52' 40''.$$

Fig. 24 Figure E

Fig. 25 Figure F

Although the task of computing the dip for a given degree of latitude is now accomplished, we find a final remark on the saving of labor:

The Summe and difference of the sides SA and AR being alwaies the same, viz. 27,320,508 and 7,320,508, the product of them shall likewise be alwaies the same, viz. 199,999,997,378,064 to be diuided by ye side SR, that is RO the subtense of RBO. Therefore there may be some labour saued in making the table of magneticall inclination, if in stead of the said product you take continually but ye halfe thereof, that is 99,999,998,689,032, and so diuide it by halfe the subtense RO, that is, by the sine of halfe the arch OBR. Or rather thus: As halfe the base RS (that is, as the sine of halfe the arch OBR) is to halfe the summe of the other two sides SA & AR 13,660,254, so is half the difference of thẽ 3,660,254 to halfe of the segment SY, which taken out of half the base, there remaineth RZ ye sine of RAZ, whose cõplement to a quadrãt is ye angle sought for ARZ. According to this Diagramme and demonstration was calculated the table here following; the first columne whereof conteineth the height of the pole for euery whole degree; the second columne sheweth the inclination or dipping of the magnetical needle answerable thereto in degr. and minutes.

Fig. 26 Figure G
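The whole chain of steps can be collected into one short Python function. This is a sketch under stated assumptions, not Briggs’s own procedure: the text works the example only for a latitude of $60^\circ$, and the generalization used here (arc OBR equal to $90^\circ$ plus the latitude, and the fraction latitude/90 in place of 60/90) is inferred from that example. The names `dip_from_latitude` and `to_deg_min` are hypothetical.

```python
import math


def dip_from_latitude(latitude_deg, radius=1.0):
    """Dip angle in degrees for a latitude in degrees, following the worked example."""
    ar = radius                                  # AR = AB, the "whole sine"
    as_len = math.sqrt(3.0) * radius             # AS = AD, "triple in power to AB"
    arc_obr = 90.0 + latitude_deg                # arc from the pole O over B to R
    sr = 2.0 * radius * math.sin(math.radians(arc_obr) / 2.0)  # RS = RO
    angle_aro = (180.0 - arc_obr) / 2.0          # base angle of isosceles triangle OAR
    # Theorem of chords; the numerator is the constant product Wright mentions.
    sy = (as_len + ar) * (as_len - ar) / sr
    rz = (sr - sy) / 2.0                         # Z is the midpoint of YR
    angle_raz = math.degrees(math.asin(rz / ar))
    angle_arz = 90.0 - angle_raz
    angle_ors = angle_aro + angle_arz
    angle_trs = latitude_deg / 90.0 * angle_ors  # 60/90 of ORS in the example
    angle_tra = angle_arz - angle_trs
    return 90.0 - angle_tra


def to_deg_min(angle_deg):
    """Round a decimal angle to whole minutes and return (degrees, minutes)."""
    return divmod(round(angle_deg * 60), 60)


for latitude in (30, 45, 60):
    print(latitude, to_deg_min(dip_from_latitude(latitude)))
```

For latitudes of 30, 45, and 60 degrees, the rounded results agree with the corresponding rows of the retyped table (51 11, 67 30, and 78 53).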

Although we have taken these computations from Edward Wright’s book, there is no doubt that the author was Henry Briggs as is also clear from the foreword of Wright.

9 Conclusion

The story of the use of magnetic needles for the purposes of navigation is fascinating and gives deep insight into the nature of scientific inventions. Gilbert’s dip theory and the unhappy idea to link latitude to dip is a paradigm of what can go wrong in mathematical modeling. The computation of the dip table is, however, a brilliant piece of mathematics and shows clearly the mastery of Henry Briggs.

References

Balmer H (1956) Beiträge zur Geschichte der Erkenntnis des Erdmagnetismus. Verlag H.R. Sauerländer, Aarau
Campling A (1922) Thomas Blundeville of Newton Flotman, co. Norfolk (1522–1606). Norfolk Archaeol 21:336–360
Gilbert W (1958) De Magnete. Dover, New York
Greenberg JL (1995) The problem of the Earth’s shape from Newton to Clairaut. Cambridge University Press, Cambridge
Hill Ch (1997) Intellectual origins of the English revolution revisited. Clarendon Press, Oxford
Parsons EJS, Morris WF (1939) Edward Wright and his work. Imago Mundi 3:61–71
Pumfrey S (2002) Latitude and the magnetic Earth. Icon Books, Cambridge
Sonar Th (2001) Der fromme Tafelmacher. Logos Verlag, Berlin
Sonar Th (2002) William Gilberts Neigungsinstrument I: Geschichte und Theorie der magnetischen Neigung. Mitteilungen der Math. Gesellschaft in Hamburg, Band XXI/2, 45–68
Taylor EGR (1954) The mathematical practitioners of Tudor & Stuart England. Cambridge University Press, Cambridge
Ward J (1740) The lives of the professors of Gresham College. Johnson Reprint Corporation, London
Waters DJ (1958) The art of navigation. Yale University Press, New Haven
Wright E (1610) Certaine errors in navigation detected and corrected with many additions that were not in the former edition as appeareth in the next pages. Printed by Felix Knights, London


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_3-4 © Springer-Verlag Berlin Heidelberg 2014

Earth Observation Satellite Missions and Data Access
Henri Laur (a) and Volker Liebig (b)
(a) European Space Agency (ESA), Head of Earth Observation Mission Management Division, ESRIN, Frascati, Italy
(b) European Space Agency (ESA), Director of Earth Observation Programmes, ESRIN, Frascati, Italy

Abstract
This article provides an overview of Earth Observation satellites, describing the end-to-end elements of an Earth Observation mission and then focusing on the European Earth Observation programs. Some significant results obtained using data from European missions (ERS, Envisat) are provided. Finally, the free-of-charge access to Earth Observation data through the European Space Agency is described.

1 Introduction Early pictures of the Earth seen from space became icons of the Space Age and encouraged an increased awareness of the precious nature of our common home. Today, images of our planet from orbit are acquired continuously and have become powerful scientific tools to enable better understanding and improved management of the Earth. Satellites provide clear and global views of the various components of the Earth system – its land, ice, oceans, and atmosphere – and how these processes interact and influence each other. Space-derived information about the Earth provides a whole new dimension of knowledge and services which can benefit our lives on a day-to-day basis. Earth Observation satellites supply a consistent set of continuously updated global data which can offer support to policies related to environmental security by providing accurate information on various environmental issues, including global change. Meteorological satellites have radically improved the accuracy of weather forecasts and have become a crucial part of our daily life. Earth Observation data are gradually integrated within many economic activities, including exploitation of natural resources, land-use efficiency, or transport routing. The European Space Agency (ESA) has been dedicated to observing the Earth from space ever since the launch of its first meteorological mission Meteosat in 1977. Following the success of this first mission, the subsequent series of Meteosat satellites, the ERS, and Envisat missions have been providing a growing number of users with a wealth of valuable data about the Earth, its climate, and changing environment. It is critical, however, to continue learning about our planet. As our quest for scientific knowledge continues to grow, so does our demand for accurate satellite data to be used for numerous practical applications related to protecting and securing the environment. 
Responding to these needs, ESA’s Earth Observation programs comprise a science and research element, which includes the Earth Explorer missions, and an applications element, which is designed to




facilitate the delivery of Earth Observation data for use in operational services, including the well-established meteorological missions with the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). In addition, the GMES (Global Monitoring for Environment and Security) Sentinel missions, which form part of the GMES Space Component, will collect robust, long-term relevant datasets. Together with other satellites, their combined data archives are used to produce Essential Climate Variables for climate monitoring, modeling, and prediction. Mathematics is of crucial importance in Earth Observation: onboard data compression and signal processing are common features of many Earth Observation satellites, inverse problems for data applications (e.g., the determination of the gravitational field from ESA GOCE satellite measurements), data assimilation to combine Earth Observation data with numerical models, etc.

2 End-to-End Earth Observation Satellite Systems
In this chapter, the end-to-end structure of a typical satellite system is explained. This will enable users of Earth Observation (EO) data to better understand how the information they use is gathered.

Earth Observation satellites usually fly in a so-called low Earth orbit (LEO), which means an orbit altitude of some 250–1,000 km above the Earth’s surface, with an orbit inclination close to 90°, i.e., a polar orbit. Another orbit, used mainly for weather satellites like Meteosat, is the geostationary orbit (GEO). This is the orbit in which the angular velocity of the satellite is exactly the same as that of the Earth (360° in about 24 h). If the satellite is positioned exactly over the equator (inclination 0°), it always keeps the same position relative to the Earth’s surface. This is why the orbit is called geostationary. From this position, the satellite’s instruments can always observe the same area on Earth. The price to pay is an orbit altitude of approx. 36,000 km, which is very far compared to a LEO.

A special LEO is the sun-synchronous orbit, which is used for all observation systems that need the same surface illumination angle of the sun. This is important for imaging systems like optical cameras. For this special orbit, typically 600–800 km in altitude, the angle between the orbital plane and the line between sun and Earth is kept constant. As the Earth revolves once per year around the sun, the orbital plane also has to rotate by 360° in 365 days. This is achieved by letting the rotational axis of the satellite orbit precess, i.e., rotate, by 360/365 degrees per day to keep pace exactly with the Earth’s revolution around the sun. As the Earth is not a perfect sphere, the excess mass at the equator exerts a torque on the orbit, which makes it precess like a gyroscope. For the precession to have the required rate, the satellite has to fly with an inclination of approx. 98°, which is close to an orbit flying over the poles. As the Earth rotates under this orbit, the satellite “sees” almost the entire surface of the Earth over several orbits. Typical LEO satellites are ERS, Envisat, or SPOT (see Sect. 3).

All Earth Observation satellite systems consist of a space segment and a ground segment. The whole chain of technical infrastructure needed to acquire, downlink, process, and distribute the EO data is referred to as an end-to-end system.
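The orbit figures quoted above (an altitude of approx. 36,000 km for GEO and an inclination of approx. 98° for a sun-synchronous LEO) can be checked with a short calculation. The following Python sketch is an illustration only and is not taken from the chapter: it assumes a circular orbit, standard textbook values for the Earth’s gravitational parameter, equatorial radius, and oblateness coefficient J2, and the usual first-order formula for the drift of the orbital plane caused by J2. The function names are hypothetical.

```python
import math

# Assumed constants (standard textbook values, not given in the text above)
MU_EARTH = 398600.4418     # Earth's gravitational parameter, km^3/s^2
R_EQUATOR = 6378.137       # Earth's equatorial radius, km
J2 = 1.08263e-3            # oblateness coefficient (equatorial "excess mass")
SIDEREAL_DAY_S = 86164.1   # one rotation of the Earth, seconds
YEAR_DAYS = 365.2422       # one revolution around the sun, days


def geostationary_altitude_km():
    """Altitude of a circular orbit whose period equals one Earth rotation."""
    n = 2.0 * math.pi / SIDEREAL_DAY_S        # required angular velocity, rad/s
    a = (MU_EARTH / n ** 2) ** (1.0 / 3.0)    # orbit radius from Kepler's third law
    return a - R_EQUATOR


def sun_synchronous_inclination_deg(altitude_km):
    """Inclination at which the orbital plane precesses 360 deg per year.

    Uses the first-order J2 drift of a circular orbit:
        dOmega/dt = -1.5 * n * J2 * (R / a)^2 * cos(i)
    """
    a = R_EQUATOR + altitude_km
    n = math.sqrt(MU_EARTH / a ** 3)                        # mean motion, rad/s
    required_rate = 2.0 * math.pi / (YEAR_DAYS * 86400.0)   # eastward, rad/s
    cos_i = -required_rate / (1.5 * n * J2 * (R_EQUATOR / a) ** 2)
    return math.degrees(math.acos(cos_i))


if __name__ == "__main__":
    print(f"GEO altitude: {geostationary_altitude_km():.0f} km")
    for h in (600.0, 800.0):
        inc = sun_synchronous_inclination_deg(h)
        print(f"Sun-synchronous inclination at {h:.0f} km: {inc:.2f} deg")
```

Under these assumptions, the GEO altitude comes out close to 36,000 km and the sun-synchronous inclination for altitudes of 600–800 km comes out close to 98°, in line with the values given in the text. The negative cosine reflects that the orbit must be slightly retrograde for the orbital plane to drift eastward.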


Fig. 1 Envisat satellite with its platform (service module and solar array) and the payload composed of ten instruments

2.1 Space Segment
The space segment (S/S) mainly consists of the spacecraft, i.e., the EO satellite, which classically has a modular design and is subdivided into the satellite platform (or bus) and the satellite payload. Figure 1 shows the Envisat satellite as an example. The satellite bus usually offers all interfaces (mechanical, thermal, energy, data handling) that are necessary to run the payload instruments mounted on it. In addition, it usually contains all housekeeping, positioning, and attitude control functions as well as a propulsion system. It should be mentioned that small satellites often have a more integrated design. In some cases, the space segment can consist of more than one satellite, e.g., if two or more satellites fly in tandem, such as Swarm (see Sect. 3). The space segment may also include a data relay satellite (DRS), positioned in a geostationary orbit and used to relay data between the EO satellite and the Earth when the satellite is out of visibility of the ground stations; ESA’s Artemis satellite, for example, relayed data for the Envisat satellite.

2.2 Ground Segment
The ground segment (G/S) provides the means and resources to manage and control the EO satellite, to receive and process the data produced by the payload instruments, and to disseminate and archive the generated products. In general, the ground segment can be split into two major elements (see Fig. 2):
– The Flight Operations Segment (FOS), which is responsible for the command and control of the satellite
– The Payload Data Segment (PDS), which is responsible for the exploitation of the instrument data


Fig. 2 Main elements of an EO ground segment including the Flight Operations Segment (FOS) and the Payload Data Segment (PDS)

The two G/S elements use different communication paths: satellite control and telecommand uplink normally use S-band, whereas instrument data are downlinked to ground stations via X-band, either directly or after onboard recording. In addition, Ka-band or laser links with very high bandwidth are used for the inter-satellite links needed by data relay satellites like Artemis or the future EDRS (European Data Relay Satellite).

2.2.1 Flight Operations Segment
The Flight Operations Segment (FOS) at ESA is under the responsibility of the European Space Operations Centre (ESOC) located in Darmstadt, Germany. The mandate of ESOC is to conduct mission operations for ESA satellites and to establish, operate, and maintain the related ground segment infrastructure. Most ESA EO satellites are controlled and commanded from ESOC control rooms. ESOC’s involvement in a new mission normally begins with the analysis of possible operational orbits or trajectories and the calculation of the corresponding launch windows, selected to make sure that the conditions encountered in this early phase remain within the spacecraft capabilities. The selection of the operational orbit is a complex task with many trade-offs involving the scientific objectives of the mission, the launch vehicle, the spacecraft, and the ground stations. ESOC’s activity culminates during the Launch and Early Orbit Phase (LEOP), with the first steps after the satellite separates from the launcher’s uppermost stage, including the deployment of antennas and solar arrays as well as critical orbit and attitude control maneuvers. After the LEOP, ESOC starts the operations of the FOS, which generally include the command and control of the satellite; the uploading of satellite operations, based on the observation plans prepared by the Payload Data Segment; the monitoring of satellite configuration and performance; orbit prediction, restitution, and maintenance; and the contingency and recovery operations following satellite anomalies.
ESOC also provides a valuable service for avoiding collisions with space debris, a risk that is particularly high for LEO satellites orbiting at around 800 km altitude. More information about ESOC activities can be found at http://www.esa.int/esoc.


2.2.2 Payload Data Segment
The EO Payload Data Segment (PDS) at ESA is under the responsibility of ESRIN, located in Frascati, Italy, and known as the ESA Centre for Earth Observation because it also includes the ESA activities for developing EO data exploitation. The PDS provides all services related to the exploitation of the data produced by the instruments on board the ESA EO satellites and is therefore the gateway for EO data users. The activities performed within a PDS are:
– The payload data acquisition using a network of worldwide acquisition stations as well as data relay satellites such as Artemis
– The processing of products either in near real time from the above data acquisitions or on demand from the archives
– The monitoring of product quality and instrument performance, as well as the regular upgrade of data processing algorithms
– The archiving and long-term data preservation and the data reprocessing following the upgrade of processing algorithms
– The interfaces to the user communities from user order handling to product delivery, including planning of instrument observations requested by the users
– The availability of data products on Internet servers or through dedicated satellite communication (see Sect. 5 for data access)
– The development of new data products and new services in response to evolving user demand

The ESA PDS is based on a decentralized network of acquisition stations and archiving centers. The data acquisition stations located close to the poles can “see” most of the LEO satellite passes and are therefore used for acquiring the data recorded on board during the previous orbit. Acquisition stations located away from polar areas are used essentially to acquire data collected over their station mask (usually 4,000 km in diameter). The archiving centers are generally duplicated to avoid data loss in case of fire or accidents (Fig. 3).
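The station mask diameter of about 4,000 km quoted above can be made plausible with simple geometry. The sketch below is an illustration under stated assumptions, not a description of how ESA defines its station masks: it assumes a spherical Earth with a mean radius of 6,371 km, a satellite in a circular orbit, and a station that can track the satellite only above a minimum elevation angle. The function name and the example values (800 km altitude, 10° minimum elevation) are hypothetical.

```python
import math

R_EARTH_KM = 6371.0  # mean Earth radius (assumed; spherical-Earth approximation)


def station_mask_diameter_km(altitude_km, min_elevation_deg):
    """Ground diameter of the area over which a station can see the satellite.

    In the triangle Earth centre / station / satellite, the Earth-central
    angle between the station and the sub-satellite point, when the satellite
    is exactly at the minimum elevation eps, is
        lambda = acos(R * cos(eps) / (R + h)) - eps
    """
    eps = math.radians(min_elevation_deg)
    ratio = R_EARTH_KM * math.cos(eps) / (R_EARTH_KM + altitude_km)
    central_angle = math.acos(ratio) - eps        # radians
    return 2.0 * R_EARTH_KM * central_angle       # arc length on the ground


if __name__ == "__main__":
    for elevation in (0.0, 5.0, 10.0):
        d = station_mask_diameter_km(800.0, elevation)
        print(f"min elevation {elevation:4.1f} deg -> mask diameter {d:.0f} km")
```

With these assumed values, a minimum elevation of roughly 10° at an altitude of about 800 km gives a mask a little over 4,000 km in diameter, which is of the same order as the figure in the text. A lower minimum elevation or a higher orbit enlarges the mask.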

Fig. 3 Example of a network of acquisition stations for the data transmitted by the ESA Envisat satellite. In addition to the above ground stations, the Envisat data was also transmitted to ground through the ESA data relay satellite Artemis


In carrying out the PDS activities, ESA works closely with other space agencies as well as with coordination and standardization bodies. This includes a strong interagency collaboration to acquire relevant EO data following a natural disaster. More information about ESRIN activities can be found at http://www.esa.int/esrin.

3 Overview on ESA Earth Observation Programs 3.1 Background The European Space Agency (ESA) is Europe’s gateway to space. Its mission is to shape the development of Europe’s space capability and ensure that investment in space continues to deliver benefits to the citizens of Europe and the world. ESA is an international organization with 20 Member States. By coordinating the financial and intellectual resources of its members, it can undertake programs and activities far beyond the scope of any single European country. ESA’s programs are designed to find out more about Earth, its immediate space environment, our Solar System, and the Universe, as well as to develop satellite-based technologies and services, and to promote European industries. ESA works closely with space organizations inside and outside Europe. ESA’s Earth Observation program, known as the ESA Living Planet program, embodies the fundamental goals of Earth Observation in developing our knowledge of Earth, preserving our planet and its environment, and managing life on Earth in a more efficient way. It aims to satisfy the needs and constraints of two groups, namely, the scientific community and also the providers of operational satellite-based services. The ESA Living Planet program is therefore composed of two main components: a science and research element in the form of Earth Explorer missions and an operational element known as Earth Watch designed to facilitate the delivery of Earth Observation data for use in operational services. The Earth Watch element includes the future development of meteorological missions in partnership with EUMETSAT and also new missions under the European Union’s GMES initiative, where ESA is the partner responsible for developing the Space Component. The past ERS and Envisat missions contributed both to the science and to the applications elements of the ESA Living Planet program. 
General information on ESA Earth Observation programs can be found at http://www.esa.int/esaEO.

3.2 ERS-1 and ERS-2 Missions
The ERS-1 (European Remote Sensing) satellite, launched in 1991, was ESA’s first Earth Observation satellite in low Earth orbit; it carried a comprehensive payload including an imaging C-band Synthetic Aperture Radar (SAR), a radar altimeter, and other instruments to measure ocean surface temperature and winds at sea. ERS-2, which overlapped with ERS-1, was launched in 1995 with an additional sensor for atmospheric ozone research. At their time of launch, the two ERS satellites were the most sophisticated Earth Observation spacecraft ever developed and launched in Europe. These highly successful ESA satellites have collected a wealth of valuable data on the Earth’s land surfaces, oceans, and polar caps and have been called upon to monitor natural disasters such as severe flooding or earthquakes in remote parts of the world.


ERS-1 was unique in its systematic and repetitive global coverage of the Earth’s oceans, coastal zones, and polar ice caps, monitoring wave heights and wavelengths, wind speeds and directions, sea levels and currents, sea surface temperatures, and sea ice parameters. Until ERS-1 appeared, such information was sparse over the polar regions and the southern oceans. The ERS missions were both an experimental and a preoperational system, since they had to demonstrate that the concept and the technology had matured sufficiently for successors such as Envisat and that the system could routinely deliver to end users data products such as reliable sea ice distribution charts and wind maps within a few hours of the satellite observations. The experimental nature of the ERS missions was underlined shortly after the launch of ERS-2 in 1995, when ESA decided to link the two spacecraft in the first ever “tandem” mission, which lasted for 9 months. During this time, the increased frequency and level of data available to scientists offered a unique opportunity to observe changes over a very short space of time, as both satellites orbited Earth only 24 h apart, and to experiment with innovative SAR measurement techniques. The ERS-1 satellite ended its operations in 2000, far exceeding its planned lifetime of 3 years. The ERS-2 satellite operated until 2011, i.e., 16 years after its launch, maximizing the benefit of the past investment. More information on ERS missions can be found at http://earth.esa.int/ers/.

3.3 Envisat Mission Envisat, ESA’s second-generation remote-sensing satellite, launched in 2002, not only provided continuity of many ERS observations – notably the ice and ocean elements – but added important new capabilities for understanding and monitoring our environment, particularly in the areas of atmospheric chemistry and ocean biological processes. Envisat was the largest and most complex satellite ever built in Europe. Its package of ten instruments made major contributions to the global study and monitoring of the Earth and its environment, including global warming, climate change, ozone depletion, and ocean and ice monitoring. Secondary objectives were more effective monitoring and management of the Earth’s resources and a better understanding of the solid Earth. As a total package, Envisat capabilities exceeded those of any previous or planned Earth Observation satellite. The payload included three new atmospheric sounding instruments designed primarily for atmospheric chemistry, including measurement of ozone in the stratosphere. The advanced C-band Synthetic Aperture Radar collected high-resolution images with a variable viewing geometry, with wide swath and selectable dual polarization capabilities. A new imaging spectrometer was included for ocean color and vegetation monitoring, and there were improved versions of the ERS radar altimeter, microwave radiometer, and visible/near-infrared radiometers, together with a new very precise orbit measurement system. Combined with ERS-1 and ERS-2 missions, the Envisat mission is an essential element in providing long-term continuous data sets that are crucial for addressing environmental and climatological issues. In addition, the Envisat mission further promoted the gradual transfer of applications of Earth Observation data from experimental to preoperational and operational exploitation. 
Although its nominal lifetime was 5 years, ESA was able to operate the Envisat satellite for 10 years, i.e., until 2012, when a failure in the power subsystem abruptly ended control of the satellite. More information on the Envisat mission can be found at http://envisat.esa.int.


3.4 Proba-1
Launched in 2001, the small Project for On-Board Autonomy (Proba) satellite was intended as a 1-year ESA technology demonstrator. Once in orbit, however, its unique capabilities and performance made it evident that it could make major contributions to science, so its operational lifetime is currently extended until 2012 to serve as an Earth Observation mission. Its main payload is a spectrometer (CHRIS) designed to acquire hyperspectral images with up to 63 spectral bands. Also on board is the high-resolution camera (HRC), which acquires black-and-white images with 5 m resolution.

3.5 Earth Explorers
The Earth Explorer missions form the science and research element of ESA’s Living Planet program and focus on the atmosphere, biosphere, hydrosphere, cryosphere, and the Earth’s interior with the overall emphasis on learning more about the interactions between these components and the impact that human activity is having on natural Earth processes. The Earth Explorer missions are designed to address key scientific challenges identified by the science community while demonstrating breakthrough technology in observing techniques. By involving the science community right from the beginning in the definition of new missions and introducing a peer-reviewed selection process, it is ensured that a resulting mission is developed efficiently and provides the exact data required by the user. The process of mission selection has given the Earth science community an efficient tool for advancing the understanding of the Earth system. The science questions addressed also form the basis for development of new applications of Earth Observation. This approach also gives Europe an excellent opportunity for international cooperation, both within the wide scientific domain and also in the technological development of new missions (ESA 2006). The family of Earth Explorer missions is a result of this strategy. Currently, there are seven Earth Explorer missions and a further three undergoing feasibility study:

GOCE (Gravity Field and Steady-State Ocean Circulation Explorer)
GOCE was dedicated to measuring the Earth’s gravity field and modeling the geoid with unprecedented accuracy and spatial resolution to advance our knowledge of ocean circulation, which plays a crucial role in energy exchanges around the globe, sea-level change, and Earth interior processes. GOCE also made significant advances in the field of geodesy and surveying.
Launched in 2009 with a 2-year planned lifetime, the mission was extended by another 2.5 years, including a lowering of the satellite’s orbit in order to obtain an even better geoid spatial resolution. After 4.5 years of successful operations, the GOCE satellite ran out of propellant, re-entered, and burned up in the atmosphere in November 2013. [This book includes a chapter dedicated to geodesy and the GOCE mission, written by R. Rummel.]

SMOS (Soil Moisture and Ocean Salinity)
Launched in 2009, SMOS observes soil moisture over the Earth’s landmasses and salinity over the oceans. Soil moisture and sea-surface salinity are two variables in Earth’s water cycle that scientists


need on a global scale for a variety of applications, such as oceanographic, meteorological, and hydrological forecasting, as well as research into climate change.

CryoSat
Launched in 2010, CryoSat acquires accurate measurements of the thickness of floating sea ice, so that seasonal to interannual variations can be detected, and also surveys the surface of continental ice sheets to detect small elevation changes. CryoSat’s main objective is to provide regional trends in Arctic perennial sea ice thickness and mass and to determine the contribution that the Antarctic and Greenland ice sheets are making to mean global rise in sea level. After 2 years of operations, CryoSat’s results confirmed, for the first time, that the decline in sea ice coverage in the polar region has been accompanied by a substantial decline in ice volume.

Swarm
Launched in 2013, Swarm is a constellation of three satellites that will provide high-precision and high-resolution measurements of the strength and direction of the Earth’s magnetic field. The geomagnetic field models resulting from the Swarm mission will provide new insights into the Earth’s interior, will further our understanding of atmospheric processes related to climate and weather, and will also have practical applications in many different areas such as space weather and radiation hazards.

ADM-Aeolus (Atmospheric Dynamics Mission)
Due for launch in 2016, ADM-Aeolus will be the first space mission to measure wind profiles on a global scale. It will improve the accuracy of numerical weather forecasting and advance our understanding of atmospheric dynamics and processes relevant to climate variability and climate modeling. ADM-Aeolus is seen as a mission that will pave the way for future operational meteorological satellites dedicated to measuring the Earth’s wind fields.
EarthCARE (Earth Clouds Aerosols and Radiation Explorer)
Due for launch in 2018, EarthCARE is being implemented in cooperation with the Japan Aerospace Exploration Agency (JAXA). The mission addresses the need for a better understanding of the interactions between cloud, radiative, and aerosol processes that play a role in climate regulation.

Future Earth Explorers
In 2005, ESA released the latest opportunity for scientists from ESA Member States to submit proposals for ideas to be assessed for the next in the series of Earth Explorer missions. As a result, 24 proposals were evaluated and a shortlist of six missions underwent assessment study. Subsequently, three proposals were selected for the next stage of development (feasibility study). This process led to the selection of the BIOMASS mission as ESA’s seventh Earth Explorer mission. The BIOMASS mission aims to take measurements of forest biomass to assess terrestrial carbon stocks and fluxes.


The process for selecting the eighth Earth Explorer mission is ongoing. More information on Earth Explorer missions can be found at http://www.esa.int/esaLP/LPearthexp.html.

3.6 GMES and Sentinels The Global Monitoring for Environment and Security (GMES) program, now called Copernicus, has been established to fulfill the growing need among European policy-makers to access accurate and timely information services to better manage the environment, understand and mitigate the effects of climate change, and ensure civil security. Under the leadership of the European Commission, Copernicus relies largely on data from satellites observing the Earth. Hence, ESA is developing and managing the Copernicus Space Component. The European Commission, acting on behalf of the European Union, is responsible for the overall initiative, setting requirements, and managing the Copernicus services. To ensure the operational provision of the Earth Observation data, the Space Component includes a series of five space missions called “Sentinels,” which are being developed by ESA specifically for GMES. In addition, data from satellites that are already in orbit or are planned will also be used for the initiative. These “Contributing Missions” include both existing and new satellites, whether owned and operated at European level by the EU, ESA, and EUMETSAT or on a national basis. They also include data acquired from non-European partners. The GMES Space Component forms part of the European contribution to the worldwide Global Earth Observation System of Systems (GEOSS). The acquisition of reliable information and the provision of services form the backbone of Europe’s GMES initiative. Services are based on data from a host of existing and planned EO satellites from European and national missions, as well as a wealth of measurements taken in situ from instruments carried on aircraft, floating in the oceans, or positioned on the ground. 
So far, services provided through GMES fall into five main categories: services for land management, services for the marine environment, services relating to the atmosphere, services to aid emergency response, and services associated with security. The service component of GMES is under the responsibility of the European Commission. More information on GMES can be found at http://www.esa.int/esaLP/LPgmes.html.

Sentinel Missions
The success of Copernicus will be achieved largely through a well-engineered Space Component that provides EO data to be turned into services for continuous monitoring of the environment. The GMES Space Component comprises five types of new satellites called Sentinels that are being developed by ESA specifically to meet the needs of GMES. The Sentinel missions include radar and super-spectral imaging for land, ocean, and atmospheric monitoring. The so-called a and b models of the first three Sentinels (Sentinel-1a/1b, Sentinel-2a/2b, and Sentinel-3a/3b) are currently under industrial development, with the first satellite, Sentinel-1A, launched in 2014. Sentinel-1 will provide all-weather, day and night C-band radar imaging for land and ocean services and will allow the continuation of the SAR measurements initiated with the ERS and Envisat missions; Sentinel-2 will provide high-resolution optical imaging for land services; Sentinel-3 will carry an altimeter and optical and infrared radiometers for ocean and global land monitoring as a continuation of Envisat measurements; and Sentinel-4 and Sentinel-5 will provide data for atmospheric


composition monitoring (Sentinel-4 from geostationary orbit and Sentinel-5 from low Earth polar orbit). Sentinel-4 and Sentinel-5 will be instruments carried on the next generation of EUMETSAT meteorological satellites – Meteosat Third Generation (MTG) and post-EUMETSAT Polar System (EPS), respectively. However, a dedicated Sentinel-5 precursor mission is planned to be launched in 2015 to reduce the data gap between Envisat and Sentinel-5.

Copernicus Contributing Missions
Before data from the Sentinel satellites is available, missions contributing to Copernicus play a crucial role in ensuring that an adequate dataset is provided for the Copernicus services. The role of the Contributing Missions will, however, continue to be essential once the Sentinels are operational by complementing Sentinel data and ensuring that the whole range of observational requirements is satisfied. Contributing Missions are operated by national agencies or commercial entities within ESA’s Member States, EUMETSAT, or other third parties. GMES Contributing Missions data initially address services for land and ice and also focus on ocean and atmosphere.
Current services mainly concentrate on the following observation techniques:
– Synthetic Aperture Radar (SAR) sensors, for all-weather day/night observations of land, ocean, and ice surfaces
– Medium- and low-resolution optical sensors for information on land cover, for example, agriculture indicators, ocean monitoring, coastal dynamics, and ecosystems
– High-resolution and medium-resolution optical sensors – panchromatic and multispectral – for regional and national land monitoring activities
– Very high-resolution optical sensors for targeting specific sites, especially in urban areas, e.g., for security applications
– High accuracy radar altimeter systems for sea-level measurements and climate applications
– Radiometers to monitor land and ocean temperature
– Spectrometer measurements for air quality and atmospheric composition monitoring

A ground segment, facilitating access to both Sentinel and Contributing Missions data, complements the GMES space segment. More information on GMES data can be found at http://gmesdata.esa.int.

3.7 Meteorological Programs
Meteosat With the launch of the first Meteosat satellite into a geostationary orbit in November 1977, Europe gained the ability to gather weather data over its own territory with its own satellite. Meteosat began as a research program for a single satellite by the European Space Research Organisation, a predecessor of ESA. Once the satellite was in orbit, the immense value of the data it provided led to the move from a research mission to an operational one. ESA launched three more Meteosat satellites before the founding of EUMETSAT, ESA’s partner organization for the meteorological programs. Launched in 1997, Meteosat-7 was the last of the first generation of Meteosat satellites.

The first generation of seven Meteosat satellites brought major improvements to weather forecasting. But technological advances and increasingly sophisticated weather forecasting requirements created demand for more frequent, more accurate, and higher-resolution space observation. To meet this demand, EUMETSAT started the Meteosat Second Generation (MSG) program, in coordination with ESA. In 2002, EUMETSAT launched the first MSG satellite, renamed Meteosat-8 when it began routine operations to clearly maintain the link to earlier European weather satellites. It was the first of four MSG satellites, which are gradually replacing the original Meteosat series. The Meteosat Third Generation (MTG) will take over in 2018/2019 from Meteosat-11, the last of the four MSG satellites. MTG will enhance the accuracy of forecasts by providing additional measurement capability, higher resolution, and more timely provision of data. Like its predecessors, MTG is a joint project between EUMETSAT and ESA that followed the success of the first-generation Meteosat satellites.
MetOp Launched in 2006, in partnership between ESA and EUMETSAT, MetOp is Europe’s first polar-orbiting satellite dedicated to operational meteorology. It represents the European contribution to a new cooperative venture with the United States providing data to monitor climate and improve weather forecasting. MetOp is a series of three satellites to be launched sequentially over 14 years, forming the space segment of EUMETSAT’s Polar System (EPS). MetOp carries a set of “heritage” instruments provided by the United States and a new generation of European instruments that offer improved remote sensing capabilities to both meteorologists and climatologists. The new instruments improve the accuracy of temperature and humidity measurements, readings of wind speed and direction, and atmospheric ozone profiles.
Preparations have started for the next generation of this EUMETSAT Polar System, the so-called Post-EPS. More information can be found at http://www.eumetsat.int.

3.8 ESA Third Party Missions ESA uses its multi-mission ground systems to acquire, process, distribute, and archive data from other satellites – known as Third Party Missions (TPM). The data from these missions are distributed under specific agreements with the owners or operators of those missions, which can be either public or private entities outside or within Europe. ESA Third Party Missions address most of the existing observation techniques:
– Synthetic Aperture Radar (SAR) sensors, e.g., the L-band instrument (PALSAR) on board the Japanese Space Agency’s ALOS
– Low-resolution optical sensors, e.g., MODIS sensors on board the US Terra and Aqua satellites, the SeaWiFS sensor on board the US OrbView-2 satellite, etc.
– High-resolution optical sensors, e.g., Landsat imagery (15–30 m), SPOT-4 data (10 m), a 10 m radiometer instrument (AVNIR-2) on board ALOS, medium-resolution (20–40 m) sensors on board the Disaster Monitoring Constellation (DMC), etc.
– Very high-resolution optical sensors, e.g., US Ikonos-2 (1–4 m), Korean Kompsat-2 (1–4 m), Taiwanese Formosat-2 (2–8 m), Japanese PRISM (2.5 m) on board ALOS, etc.
– Atmospheric chemistry sensors, e.g., the Swedish Odin, Canadian SciSat-1, and Japanese GOSAT satellites
The complete list of Third Party Missions currently supported by ESA is available at http://earth.esa.int/thirdpartymissions/.

4 Some Major Results of ESA Earth Observation Missions Since 1991, the flow of data provided by the consecutive ERS-1, ERS-2, and Envisat missions has been converted into extensive results, giving new insights into our planet. These results encompass many fields of Earth science, including land, ocean, ice, and atmosphere studies, and have shown the importance of long-term data collection for identifying trends such as those associated with climate change. The ERS and Envisat missions are valuable tools not only for Earth scientists but increasingly also for public institutions providing operational services such as sea ice monitoring for ship routing or UV index forecasts. The ERS and Envisat data have also stimulated the emergence of new analysis techniques such as SAR Interferometry.

4.1 Land One of the most striking results of the ERS and Envisat missions for land studies is the development of the interferometry technique using Synthetic Aperture Radar (SAR) instruments. SAR instruments are microwave imaging systems. Besides their cloud-penetrating and day-and-night operational capabilities, they also have “interferometric” capabilities, i.e., the ability to measure very accurately the travel path of the transmitted and received radiation. SAR Interferometry (InSAR) exploits the variations of the radiation travel path over the same area observed at two or more acquisition times with slightly different viewing angles. Using InSAR, scientists are able to measure millimetric surface deformations of the terrain, such as those associated with earthquake movements, volcano deformation, landslides, or terrain subsidence (ESA 2007). Since the launch of ERS-1 in 1991, InSAR has advanced the fields of tectonics and volcanology by allowing scientists to monitor terrain deformation anywhere in the world at any time. Some major earthquakes, such as those in Landers, California, in 1992; Bam, Iran, in 2003; and L’Aquila, central Italy, in 2009, were “imaged” by the ERS and Envisat missions, allowing geologists to better understand the fault rupture mechanisms. Using ERS and Envisat data, scientists have been able to monitor the long-term behavior of some volcanoes such as Mt. Etna, providing crucial information for understanding how the volcano’s surface deformed during the rise, storage, and eruption of magma. Changes in surface deformation, such as sinking, bulging, and rising, are indicators of different stages of volcanic activity, which may result in eruptions. Thus, precise monitoring of a volcano’s surface deformation could lead to predictions of eruptions. The InSAR technique was also exploited by merging SAR data acquired by different satellites.
For 9 months in 1995/1996, the two ERS-1 and ERS-2 satellites undertook a “tandem” mission, in which they orbited Earth only 24 h apart. The acquired image pairs provide much greater interferometric coherence than is normally possible, allowing scientists to generate detailed digital elevation maps and observe changes over a very short space of time. The same “tandem” approach was followed by the ERS-2 and Envisat satellites, adding to the ever-growing set of SAR interferometric data. Because SAR instruments are able “to see” through clouds and at night, their ability to map river flooding was quickly recognized and has been gradually exploited by civil protection authorities. Similarly, other ERS and Envisat instruments such as the ATSR infrared radiometers have been able to provide relevant forest fire statistics. Using data from the MERIS spectrometer instrument on board Envisat, the most detailed global land cover map of the Earth was created. This global map, which is ten times sharper than any previous global satellite map, was derived by an automatic and regionally adjusted classification of the MERIS global data composites using land cover classes defined according to the UN Land Cover Classification System (LCCS). The map and its various composites support the international community in modeling climate change impacts, studying ecosystems, and plotting worldwide land-use trends.
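The InSAR principle described in this section, in which a change in the radiation travel path appears as a change in interferometric phase, can be illustrated with a short calculation. The following is a minimal sketch rather than an operational processing chain: it assumes an already unwrapped phase change, a C-band wavelength of roughly 5.66 cm (an approximate value for the ERS SAR), and it ignores atmospheric and orbital contributions. The function name `los_displacement` is hypothetical, and sign conventions differ between processors.

```python
import math

# Assumption: approximate C-band wavelength of the ERS SAR, in metres.
WAVELENGTH_M = 0.0566


def los_displacement(phase_change_rad, wavelength_m=WAVELENGTH_M):
    """Line-of-sight displacement (m) for an unwrapped phase change (rad).

    The radar signal travels to the ground and back, so a path change d
    shifts the phase by 4*pi*d / wavelength.
    """
    return wavelength_m * phase_change_rad / (4.0 * math.pi)


# One full interferometric fringe (2*pi) corresponds to half a wavelength.
one_fringe_m = los_displacement(2.0 * math.pi)
print(f"{one_fringe_m * 1000:.1f} mm per fringe")  # 28.3 mm per fringe
```

Because a full fringe corresponds to only about 28 mm under these assumptions, small fractions of a fringe translate into the millimetric deformations mentioned in the text.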

4.2 Ocean Through the availability of ERS and Envisat satellite data, scientists have gained an understanding of the ocean and its interaction with the entire Earth system that would not otherwise have been possible. Thanks to their 20-year extent, the time series of the ERS and Envisat missions’ data allow scientists to investigate the effects of climate change, in particular on the oceans. These long-term measurements make it possible to remove the yearly variability of most of the geophysical parameters, providing results of fundamental significance in the context of climate change.
Radar altimeters on board satellites play an important role in those long-term measurements. They work by sending thousands of separate radar pulses down to the Earth per second and then recording how long their echoes take to bounce back to the satellite platform. The sensor times each pulse’s round trip to better than a nanosecond, which allows the distance to the surface below to be calculated with an accuracy of up to 2 cm. The consecutive availability of altimeters on board ERS-1, ERS-2, and then Envisat made it possible to establish that the global mean sea level has risen by about 2–3 mm per year since the early 1990s, with important regional variations.
Sea-surface temperature (SST) is one of the most stable of several geophysical variables which, when determined globally, help diagnose the state of the Earth’s climate system. Tracking SST over a long period is a reliable way of measuring the rate at which global temperatures are increasing, and it improves the accuracy of our climate change models and weather forecasts. Measurements made by the ERS and Envisat ATSR radiometer instruments provide evidence of a distinctive upward trend in global sea-surface temperatures.
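The relation between echo timing and range for a radar altimeter can be made concrete with a small calculation. This is only a sketch under simplifying assumptions: propagation at the vacuum speed of light, no atmospheric or instrumental corrections, and an illustrative orbit height of 800 km that is not taken from the text. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def range_from_echo(two_way_time_s):
    """Distance (m) to the surface from the two-way travel time of a pulse."""
    return C * two_way_time_s / 2.0


def timing_for_range_error(range_error_m):
    """Two-way timing precision (s) needed for a given range precision (m)."""
    return 2.0 * range_error_m / C


# Illustrative altitude only; the text does not state the orbit height.
altitude_m = 800e3
round_trip_s = 2.0 * altitude_m / C
print(f"round trip: {round_trip_s * 1e3:.2f} ms")

# A timing error of 1 ns maps to about 15 cm in range, so 2 cm requires
# timing at the level of roughly a tenth of a nanosecond.
print(f"1 ns -> {range_from_echo(1e-9) * 100:.1f} cm")
print(f"2 cm -> {timing_for_range_error(0.02) * 1e9:.3f} ns")
```

This is consistent with the statement that the pulses are timed to well under a nanosecond. In practice such precision is approached by averaging many pulses and applying corrections that this sketch leaves out.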
The ATSR instruments produced data of unrivaled accuracy on account of their unique dual view of the Earth’s surface, whereby each part of the surface is viewed twice, through two different atmospheric paths. This not only enables scientists to correct for the effects of dust and haze, which degrade measurements of surface temperature from space, but also enables them to derive new measurements of the actual dust and haze, which are needed by climate scientists. One of the main assets of the Envisat mission is its multisensor capability, which allows geophysical phenomena to be observed from various “viewpoints.” A good example is the observation of cyclones: the data returned by Envisat include cloud structure and height at the top of the cyclone, wind and wave fields at the bottom, sea-surface temperature, and even sea height anomalies, indicative of upper ocean thermal conditions that influence its intensity.
The ERS and Envisat SAR data also stimulated the development of maritime applications. These include the monitoring of illegal fisheries and of oil slicks (Fig. 4), which can be natural or the result of human activities.
The level of shipping and offshore activities occurring in and around icebound regions is growing steadily and with it the demand for reliable sea ice information. Traditionally, ice-monitoring services were based on data from aircraft, ships, and land stations. But the area coverage available from such sources is always limited and often impeded further by bad weather. Satellite data have begun to fill this performance gap, enabling continuous wide-area ice surveillance. SAR instruments of the type flown on ERS and Envisat are able to see through clouds and darkness and are therefore particularly well suited to generating high-quality images of sea ice. Ice classification maps generated from radar imagery are now being supplied to users at sea. The maritime operational applications will be continued with the Sentinel-1 data.

Fig. 4 This Envisat ASAR image, acquired on 17 November 2002, shows a double-headed oil spill originating from the stricken Prestige tanker, lying 100 km off the Spanish coast

4.3 Cryosphere The cryosphere is both influenced by and has a major influence on climate. Because any increase in the melt rate of ice sheets and glaciers has the potential to greatly increase sea level, researchers are looking to the cryosphere to get a better idea of the likely scale of the impact of climate change. In addition, the melting of sea ice will increase the amount of solar radiation that is absorbed by ice-free polar oceans rather than reflected by ice-covered oceans, increasing the ocean temperature. The remoteness, darkness, and cloudiness of Earth’s polar regions make them difficult to study. Being active microwave instruments, the radars on board the ERS and Envisat missions were able to see through clouds and darkness. For about 30 years, satellites have been observing the Arctic and have witnessed reductions in the minimum Arctic sea ice extent – the lowest amount of ice recorded in the area annually – at the end of summer, from around 8 million km² in the early 1980s to the historic minimum of 3.5 million km² in 2012, changes widely viewed as a consequence of greenhouse warming. ERS and Envisat radar instruments, i.e., the imaging radar (SAR) and the radar altimeter instruments, witnessed this sharp decline, providing detailed measurements of sea ice area and sea ice thickness, respectively. The CryoSat mission adds accurate sea ice thickness measurements to complement the sea ice extent measurements. In addition to mapping sea ice, scientists have used repeat-pass SAR image data to map the flow velocities of glaciers. Using SAR data collected by ERS-1 and ERS-2 during their tandem mission in 1995 and by Envisat and Canada’s Radarsat-1 in 2005, scientists discovered that the Greenland glaciers are melting twice as fast as previously thought. Such a rapid pace of melting was not considered in previous simulations of climate change, which shows the important role of Earth Observation in advancing our knowledge of climate change and improving climate models.
Similar phenomena also take place in Antarctica, with some large glaciers such as the Pine Island Glacier thinning at a constantly accelerating rate as suggested by ERS and Envisat altimetry data. In Antarctica, the stability of the glaciers is also related to their floating terminal platform, the ice shelves. The breakup of large ice shelves (e.g., Larsen ice shelf, Wilkins ice shelf) has been observed by Envisat SAR and is likely a consequence of both sea and air temperature increase around the Antarctic peninsula and West Antarctica.

4.4 Atmosphere ERS-2 and Envisat were equipped with several atmospheric chemistry instruments, which can look vertically or sideways to map the atmosphere in three dimensions, producing high-resolution horizontal and vertical cross sections of trace chemicals stretching from ground level to 100 km in the air, across a variety of scales. Those instruments could detect holes in the thinning ozone layer, plumes of aerosols and pollutants hanging over major cities or burning forests, and exhaust trails left in the atmosphere by commercial airliners. The ERS-2 and Envisat satellites maintained a regular census of global stratospheric ozone levels from 1995 to 2012, mapping yearly Antarctic ozone holes as they appeared. The size and precise timing of the ozone hole depend on the year-to-year variability in temperature and atmospheric dynamics, as established by satellite measurements. Envisat results benefited from improved sensor capabilities. As an example, the high spatial resolution delivered by the Envisat atmospheric instruments yields precise maps of atmospheric trace gases that even resolve individual city sources, such as the high-resolution global map of nitrogen dioxide (NO2), an indicator of air pollution (Fig. 5). By linking these data with the measurements started with ERS-2, scientists could note a significant increase in NO2 emissions above Eastern China, a tangible sign of the fast economic growth of China.

Fig. 5 Nitrogen dioxide (NO2) concentration map over Europe derived from several years of SCIAMACHY instrument data on board the Envisat satellite (Courtesy Institute of Environmental Physics, Univ. of Heidelberg)

Based on several years of Envisat observations, scientists have produced global distribution maps of the most important greenhouse gases – carbon dioxide (CO2) and methane (CH4) – that contribute to global warming. The importance of cutting emissions of these “anthropogenic,” or man-made, gases has been highlighted by European Union leaders endorsing binding targets to cut greenhouse gases in the midterm future. Careful monitoring is essential to ensuring these targets are met, and space-based instruments are a new means of contributing to this. The SCIAMACHY instrument on board the Envisat satellite was the first space sensor capable of measuring the most important greenhouse gases with high sensitivity down to the Earth’s surface, because it observes the spectrum of sunlight shining through the atmosphere in “nadir”-looking operations. Envisat atmospheric chemistry data are useful for helping build scenarios of emissions of greenhouse gases such as methane – the second most important greenhouse gas after carbon dioxide. Increased methane concentrations induced mainly by human activities were observed. Model results are compared with satellite observations, and the model is adjusted repeatedly until it reproduces the observations as closely as possible. In this way, scientists continually improve both their models and their knowledge of nature.

4.5 ESA Programs for Data Exploitation Earth Observation is an inherently multipurpose tool. This means there is no typical Earth Observation user: it might be anyone who requires detailed characterization of any given segment of our planet, across a wide variety of scales from a single city block to a region or continent, right up to coverage of the entire globe. Earth Observation is already employed by many thousands of users worldwide. However, ESA works to further increase Earth Observation take-up by encouraging development of new science and new applications and services centered on user needs. New applications usually emerge from scientific research. ESA supports scientific research by providing easy access to high-quality data (see Sect. 5), by organizing dedicated workshops and symposia, by training users, by taking a proactive role in the formulation of new mission concepts, and by providing support to science. Converting basic research and development into an operational service requires the fostering of partnerships between research institutions, service companies, and user organizations. ESA’s Data User Element (DUE) program addresses institutional users tasked with collecting specific geographic or environmental data. The Data User Element aims to raise such institutions’ awareness of the applicability of Earth Observation to their day-to-day operations and to develop demonstration products tailored to increase their effectiveness. The intention is then to turn these products into sustainable services provided by public or private entities. More information on ESA’s Data User Element (DUE) program can be found at http://dup.esrin.esa.it. Complementing the Data User Element objectives is ESA’s Earth Observation Value Adding program. This provides a supportive framework within which to organize end-to-end service chains capable of leveraging scientific EO data into commercial tools supplied by self-supporting businesses. More information on the ESA EO Value Adding program can be found at http://www.eomd.esa.int.

5 User Access to ESA Data ESA endeavors to maximize the beneficial use of Earth Observation data. It does this by fostering the use of this valuable information by as many people as possible, in as many ways as possible. For users, and therefore for ESA, easy access to EO data is of paramount importance. However, the challenges for easy EO data access are many:
1. The ESA EO data policy shall be beneficial to various categories of users, ranging from global change scientists to operational services, and shall have the objective of stimulating a balanced development of Earth science, public services, and value-adding companies. ESA has always pursued a low-fee approach for its satellite data, trying to provide the maximum amount of data free of charge. This approach will continue and even be reinforced in the future, by further increasing the amount of data available on the Internet and by reducing the complexity of EO missions.
2. The volume of data transmitted to the ground by ESA EO satellites is particularly high: the Envisat satellite transmitted about 270 GB of data every day; the future Sentinel-1 satellite will transmit about 900 GB of data every day. Once acquired, the data shall be transformed (i.e., processed) into products in which the information is related either to an engineering calibrated parameter (so-called Level 1 products, e.g., a SAR image) or to a geophysical parameter (so-called Level 2 products, e.g., sea surface temperature). Despite the high volume of data, the processing into Level 1 and Level 2 products shall be as fast as possible to serve increasingly demanding operational services. Finally, the data products shall be available to users almost immediately, either through broad use of the Internet or through dedicated communication links.
3. The quality of EO data products shall be high so that users can effectively rely on the delivered information content. This means that the processing algorithms (i.e., the transformation of raw data into products) as well as the product calibration and validation shall be given strong attention. ESA has constantly given such attention to its EO missions, investing large efforts in algorithm development, particularly for innovative instruments such as the ones flying on board the Earth Explorer missions. Of equal importance for the credibility of EO data are the validation activities, which aim at comparing the geophysical information content of EO data products with similar measurements collected through, e.g., airborne campaigns or ocean buoy setups.
4. Finally, EO data handling tools shall be offered to users. Besides the general assistance given through a centralized ESA EO user service ([email protected]), the ESA user tools include:
– Online data information (http://earth.esa.int) providing EO mission news, data product descriptions, processing algorithm documentation, workshop proceedings, etc.
– Data collection visualization through online catalogues, including requests for product generation when the product does not yet exist (e.g., for future data acquisition) or direct download when the product is already available in online archives
– EO data software tools, aiming to facilitate the utilization of data products by provision of, e.g., viewing capabilities, innovative processing algorithms, format conversion, etc.
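To give a feeling for the daily data volumes quoted in item 2 above, they can be converted into an average sustained data rate. The sketch below is simple arithmetic on the figures from the text; it assumes decimal units (1 GB = 10^9 bytes) and a uniform flow over 24 h, whereas real downlinks occur in bursts during ground-station passes. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def mean_rate_mbit_s(gigabytes_per_day):
    """Average data rate in Mbit/s, assuming 1 GB = 10**9 bytes."""
    bits_per_day = gigabytes_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


# Daily volumes as quoted in the text.
for mission, volume_gb in (("Envisat", 270), ("Sentinel-1", 900)):
    print(f"{mission}: {mean_rate_mbit_s(volume_gb):.1f} Mbit/s")
# Envisat: 25.0 Mbit/s
# Sentinel-1: 83.3 Mbit/s
```

Any processing chain that is to keep up with acquisition must sustain at least these average rates end to end, which illustrates why fast Level 1 and Level 2 processing is listed as a challenge.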

5.1 How to Access the EO Data at ESA The Internet is the main way to access EO satellite services and products at ESA. Besides the general description of data access given below, further assistance can be provided by the ESA EO help desk ([email protected]).
1. For most of the ESA EO data, open and free-of-charge access is granted after a simple user registration. The data products concerned are those that are systematically acquired, generated, and available online. This includes all altimetry, sea surface temperature, atmospheric chemistry, and future Earth Explorer data but also large collections of optical (MERIS, Landsat) and SAR datasets. User registration is done at http://eopi.esa.int. The detailed list of open and free-of-charge data products is available on this Web site. For users who are only interested in basic imagery (i.e., false color jpg images), ESA provides access to large galleries of free-of-charge Earth images (http://earth.esa.int/satelliteimages).
2. Some ESA EO data and services cannot be provided free of charge or openly, either because of restrictions in the distribution rights granted to ESA (e.g., some ESA Third Party Missions) or because the data/service is restrained by technical capacities and is therefore provided on demand (i.e., not systematically). The restrained dataset essentially includes on-demand SAR data acquisition and production. In this case, users shall describe the intended use of the data within a project proposal. The project proposals are collected at http://eopi.esa.int. ESA analyzes the project proposal to review its scientific objectives, to assess its feasibility, and to establish project quotas for the requested products and services (e.g., instrument tasking). The products and services are provided free of charge. Products are provided on the Internet.

6 Conclusion During the last three decades, Earth Observation satellites have gradually taken on a fundamental role with respect to understanding and managing our planet. Contributing to this trend, the Earth Observation program of the European Space Agency addresses a growing number of scientific issues and operational services, thanks to the continuous development of new satellites and sensors and to a considerable effort in stimulating the use of EO data. Partnerships between satellite operators and strong relations with user organizations are essential at ESA for further improving information retrieval from EO satellite data. The coming decade will see an increasing number of orbiting EO satellites, not only in Europe. This is a natural consequence of the growing user demands and expectations, but also of the gradual decrease of the costs for satellite manufacturing. The main challenge in Earth Observation will therefore be to maximize the synergies between existing satellites, both with respect to combining their respective observations and optimizing their operation concepts.

References
ESA (2006) The changing Earth – new scientific challenges for ESA’s living planet programme, ESA SP-1304
ESA (2007) InSAR principles – guidelines for SAR interferometry processing and interpretation, ESA TM-19

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_4-3 © Springer-Verlag Berlin Heidelberg 2014

GOCE: Gravitational Gradiometry in a Satellite Reiner Rummel Institut für Astronomische und Physikalische Geodäsie, Technische Universität München, München, Germany

Abstract In spring 2009, the satellite Gravity and steady-state Ocean Circulation Explorer (GOCE), equipped with a gravitational gradiometer, was launched by the European Space Agency (ESA). Its purpose is the detailed determination of the spatial variations of the Earth’s gravitational field, with applications in oceanography, geophysics, geodesy, glaciology, and climatology. Gravitational gradients are derived from the differences between the measurements of an ensemble of three orthogonal pairs of accelerometers, located around the center of mass of the spacecraft. Gravitational gradiometry is complemented by gravity analysis from orbit perturbations. The orbits are derived from uninterrupted, three-dimensional GPS tracking of GOCE. The gravitational tensor consists of the nine second-derivatives of the Earth’s gravitational potential. Due to its symmetry, only six of them are independent. These six components can also be interpreted in terms of the local curvature of the field or in terms of components of the tidal field generated by the Earth inside the spacecraft. Four of the six components are measured with high precision ($10^{-11}\,\mathrm{s}^{-2}$ per square root of Hz); the others are less precise. Several strategies exist for the determination of the gravity field at the Earth’s surface from the measured tensor components at altitude. The mission ended in November 2013. Until August 2012, a total of 2.3 years of data were collected. They entered into ESA’s fourth release of GOCE gravity models. After August 2012, the orbit altitude was lowered in several steps by altogether 31 km in order to test the enhanced gravitational sensitivity at lower orbit heights. The fields of application range from solid Earth physics, via geodesy and oceanography, to atmospheric physics. For example, several studies are concerned with the state of isostatic mass compensation in regions such as South America, Africa, the Himalayas, and Antarctica.
GOCE will help to unify height systems worldwide and enable the direct conversion of GPS-based ellipsoidal heights to accurate and globally consistent heights above the geoid. For the first time, it became possible to derive mean dynamic ocean topography and geostrophic ocean velocities with high spatial resolution and accuracy directly from space, combining the altimetric mean sea surface and the GOCE geoid. Assimilation into numerical ocean circulation models will help to improve estimates of ocean mass and heat transport. Common-mode accelerations as measured by GOCE lead to improved atmospheric density and wind estimates at GOCE altitudes.

1 Introduction: GOCE and Earth Sciences On March 17, 2009, the European Space Agency (ESA) launched the satellite Gravity and steady-state Ocean Circulation Explorer (GOCE). It is the first satellite of ESA’s Living Planet Programme (see ESA, 1999a, 2006). It is also the first one that is equipped with a gravitational gradiometer. The purpose of the mission is to measure the spatial variations of the Earth’s gravitational field globally with maximum resolution and accuracy. Its scientific purpose is essentially twofold. First, the gravitational field reflects the density distribution of the Earth’s interior. There are no direct ways to probe the deep Earth interior, only indirect ones, in particular seismic tomography, gravimetry, and magnetometry. The gravimetry part is now being taken care of by GOCE. Also, in the field of space magnetometry, an ESA mission was launched in fall 2013; it is called “Swarm” and consists of three satellites. Seismic tomography is based on a worldwide integrated network of seismic stations. From a joint analysis of all seismic data, a tomographic image of the spatial variations of the propagation velocity of seismic waves in the Earth’s interior is derived. The three methods together establish the experimental basis for the study of solid Earth physics, or more specifically, of phenomena such as core-mantle topography, mantle convection, mantle plumes, ocean ridges, subduction zones, mountain building, and mass compensation. Inversion of gravity data alone is nonunique, but joint inversion together with seismic tomography, magnetic field measurements, surface data on plate velocities and topography, and models from mineralogy will lead to an increasingly comprehensive picture of the dynamics and structure of the Earth’s interior (see, e.g., Hager and Richards, 1989; Lithgow-Bertelloni and Richards, 1998; Bunge et al., 1998; Kaban et al., 2004). Second, the gravitational field, and therefore the mass distribution of the Earth, determines the geometry of level surfaces, plumb lines, and lines of force. This geometry constitutes the natural reference in our physical and technical world. In particular, in cases where small potential differences matter, such as in ocean dynamics and large civil constructions, precise knowledge of this reference is an important source of information. The most prominent example is ocean circulation.
Dynamic ocean topography, the small one up to 2 m deviation of the actual ocean surface from an equipotential surface, can be directly translated into ocean surface circulation. The equipotential surface at mean ocean level is referred to as geoid and it represents the hypothetical surface of the world oceans at complete rest. GOCE, in conjunction with satellite altimetry missions, like Jason will allow for the first time direct and detailed measurement of ocean circulation (see discussions in Wunsch and Gaposchkin, 1980; Ganachaud et al., 1997; LeGrand and Minster, 1999; Losch et al., 2002; Albertella and Rummel, 2009; Maximenko et al., 2009). GOCE is an important satellite mission for oceanography, solid Earth physics, geodesy, and climate research (compare ESA, 1999b; Rummel et al., 2002; Johannessen et al., 2003).

2 GOCE Gravitational Sensor System

In the following, the main characteristics of the GOCE mission, which is unique in several ways, will be summarized (see also Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was Not the First: The History of the Geomagnetic Atlases). The mission consists of two complementary gravity sensing systems. The large-scale spatial variations of the Earth’s gravitational field are derived from its orbit, while the medium to short scales are measured by a so-called gravitational gradiometer. Even though satellite gravitational gradiometry was proposed as early as the late 1950s by Carroll and Savet (1959) (see also Wells 1984), the GOCE gradiometer is the first instrument of its kind to be put into orbit. The principles of satellite gradiometry will be described in Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was Not the First: The History of the Geomagnetic Atlases. The purpose of gravitational gradiometry is the measurement of the gradients of gravitational acceleration or, equivalently, the second derivatives of the gravitational potential. In total, there exist nine second derivatives in the orthogonal coordinate system of the instrument.

Fig. 1 GOCE gravitational gradiometer (courtesy ESA)

The GOCE gradiometer is a three-axis instrument, and its measurements are based on the principle of differential acceleration measurement. It consists of three pairs of accelerometers, mounted orthogonally to each other, each accelerometer having three axes (see Fig. 1). The gradiometer baseline of each one-axis gradiometer is 50 cm. The precision of each accelerometer is about $2 \times 10^{-12}\,\mathrm{m/s^2}$ per square root of Hz along two sensitive axes; the third axis has much lower sensitivity. This results in a precision of the gravitational gradients of $10^{-11}\,\mathrm{s^{-2}}$ or 10 mE per square root of Hz (1 E $= 10^{-9}\,\mathrm{s^{-2}}$ = 1 Eötvös unit). From the measured gravitational acceleration differences, the three main diagonal terms and one off-diagonal term of the gravitational tensor can be determined with high precision. These are the three diagonal components $\Gamma_{xx}$, $\Gamma_{yy}$, $\Gamma_{zz}$ as well as the off-diagonal component $\Gamma_{xz}$, while the components $\Gamma_{xy}$ and $\Gamma_{yz}$ are less accurate. The coordinate axes of the instrument point in the flight direction ($x$), the cross-track direction ($y$), and radially toward the Earth ($z$). The extremely high gradiometric performance of the instrument is confined to the so-called measurement band (MB); outside the measurement band, the noise increases.
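The quoted accelerometer and gradient precisions can be related by a short calculation. The following is a minimal sketch, not the mission's error budget: it assumes (this is not stated in the text) that the two accelerometers of a pair have equal, uncorrelated noise, so that the noise of their difference grows by a factor of $\sqrt{2}$. The function names are illustrative only.

```python
import math

EOTVOS = 1e-9  # 1 E = 1e-9 s^-2, as defined in the text


def to_eotvos(gradient_si):
    """Convert a gravitational gradient from s^-2 to Eötvös units."""
    return gradient_si / EOTVOS


def gradient_noise(accel_noise, baseline):
    """Noise of a one-axis gradient formed from two accelerometers.

    Assumption (not from the text): both accelerometers have equal,
    uncorrelated noise, so the noise of their difference is sqrt(2)
    times the noise of a single accelerometer.
    """
    return math.sqrt(2.0) * accel_noise / baseline


accel_noise = 2e-12  # m/s^2 per sqrt(Hz), from the text
baseline = 0.5       # m, from the text

sigma = gradient_noise(accel_noise, baseline)
sigma_mE = to_eotvos(sigma) * 1e3
print(f"{sigma:.2e} s^-2 per sqrt(Hz) = {sigma_mE:.1f} mE per sqrt(Hz)")
```

Under these assumptions the result is roughly 6 mE per square root of Hz, the same order of magnitude as the 10 mE per square root of Hz quoted above.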


Strictly speaking, the derivation of the gradients from accelerometer differences assumes all six accelerometers (three pairs) to be perfect twins and all accelerometer test masses to be perfectly aligned. In the real world, small deviations from such an idealization exist. Thus, the calibration of the gradiometer is of high importance. Calibration is essentially the determination of a set of scale, misalignment, and angular corrections. They are the parameters of an affine transformation between an ideal and the actual set of six accelerometers. Calibration in orbit is done by random shaking of the satellite by means of a set of cold gas thrusters and comparison of the actual output with the theoretically correct one. Before calibration, the nonlinearities of each accelerometer are removed electronically; in other words, the proof mass of each accelerometer inside the electrodes of the capacitive electronic feedback system is brought into its linear range.

The gravitational signal is superimposed by the effects of angular velocity and angular acceleration of the satellite in space. Knowledge of the latter is required for the removal of the angular effects from the gradiometer data and for angular control. The separation of angular acceleration from the gravitational signal is possible from a particular combination of the nine measured acceleration differences. The angular rates (in the MB) as derived from the gradiometer data, in combination with those deduced from the star sensor readings, are used for attitude control of the spacecraft. The satellite has to be well controlled and guided smoothly around the Earth. It is Earth pointing, which implies that it performs one full revolution in inertial space per full orbit cycle. Angular control is attained via magnetic torquers, i.e., using the Earth’s magnetic field lines for orientation. This approach leaves one directional degree of freedom uncontrolled at any moment.
In order to prevent non-gravitational forces, in particular atmospheric drag, from “sneaking” into the measured differential accelerations as a secondary effect, the satellite is kept “drag-free” in the along-track direction by means of a pair of ion thrusters. The necessary control signal is derived from the available “common-mode” accelerations (sums instead of differences of the measured accelerations) along the three orthogonal axes of the accelerometer pairs of the gradiometer. Some residual angular contribution may also add to the common-mode acceleration, due to the imperfect symmetry of the gradiometer relative to the spacecraft’s center of mass. This effect has to be modeled.

The second gravity sensor device is a newly developed European GPS receiver. From its measurements, the orbit trajectory is computed to within a few centimeters, either purely geometrically (the so-called kinematic orbit) or by the method of reduced dynamic orbit determination (compare Švehla and Rothacher, 2004; Jäggi, 2007; Bock et al., 2011). As the spacecraft is kept in an almost drag-free mode (at least in the along-track direction), the orbit motion can be regarded as purely gravitational. It complements the gradiometric gravity field determination and covers the long-wavelength part of the gravity signal.

The orbit altitude is extremely low, only about 255 km at perigee. This is essential for a high gravitational sensitivity. No scientific satellite has been flown at such low altitudes so far. The altitude is maintained through the drag-free control and additional orbit maneuvers, which are carried out at regular intervals. As said above, this very low altitude results in high demands on drag-free and attitude control. Finally, any time-varying gravity signal of the spacecraft itself, the so-called self-gravitation, must be excluded. This results in extremely tight requirements on metrical stiffness and thermal control. In summary, GOCE is a technologically very complex and innovative mission.
The gravitational field sensor system consists of a gravitational gradiometer and a GPS receiver as core instruments. Orientation in inertial space is derived from star sensors. Common-mode and differential-mode accelerations from the gradiometer and orbit positions from GPS are used together with ion thrusters for drag-free control and together with magneto-torquers for angular control. The satellite and its instruments are shown in Fig. 2. The system elements are summarized in Table 1.

Fig. 2 GOCE satellite and main instruments (courtesy ESA). Labeled components: CESS, STR, SSTI, xenon tank, ion propulsion module, MT, gradiometer, CDM, LRR, GCD tank (CESS coarse Earth and Sun sensor, MT magneto torquer, STR star tracker, SSTI satellite-to-satellite tracking instrument, CDM command and data management unit, LRR laser retro reflector)

Table 1 Sensor elements and type of measurement delivered by them (approximate orientation of the instrument triad: x = along-track, y = out-of-orbit-plane, z = radially downward)
Sensor | Measurements
Three-axis gravity gradiometer | Gravity gradients $\Gamma_{xx}$, $\Gamma_{yy}$, $\Gamma_{zz}$, $\Gamma_{xz}$ in instrument system and in MBW (measurement bandwidth); angular accelerations (highly accurate around y-axis, less accurate around x, z axes); common-mode accelerations
Star sensors (STR) | High-rate and high-precision inertial orientation
GPS receiver (SSTI) | Orbit trajectory with centimeter precision
Drag control with two ion thrusters | Based on common-mode accelerations from gradiometer and GPS orbit
Angular control with magnetic torquers | Based on angular rates from star sensors and gradiometer
Orbit altitude maintenance | Based on GPS orbit
Internal calibration (and quadratic factors removal) of gradiometer | Calibration signal from random shaking by cold gas thrusters (and electronic proof mass shaking)
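The distinction between common-mode and differential-mode accelerations can be illustrated with a few lines of code. This is only a sketch: the function name, the factor 1/2, and the numbers are hypothetical and are not taken from the GOCE processing.

```python
def split_modes(a1, a2):
    """Split the readings of one accelerometer pair into two parts.

    a1, a2: accelerations (m/s^2) measured along the same axis by the two
    accelerometers of a pair. The factor 1/2 is a convention chosen here.
    """
    common = 0.5 * (a1 + a2)        # felt by both sensors: drag and other surface forces
    differential = 0.5 * (a1 - a2)  # opposite in sign at the two ends: gradient signal
    return common, differential


# Hypothetical numbers: a drag-like acceleration acting on both sensors and
# a tidal part of opposite sign at the two ends of the baseline.
drag = -3.0e-7
tidal = 6.0e-7
common, differential = split_modes(drag + tidal, drag - tidal)
print(common, differential)
```

The common-mode part recovers the drag-like term, which is the kind of signal used for drag-free control, while the differential-mode part recovers the tidal term that carries the gradient information.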

3 Gravitational Gradiometry

Gravitational gradiometry is the measurement of the second derivatives of the gravitational potential $V$. Its principles are described in textbooks such as Misner et al. (1970), Falk and Ruppel (1974), and Ohanian and Ruffini (1994) or in articles like Rummel (1986); compare also Colombo (1989) and Rummel (1997). Despite the high precision of the GOCE gradiometer instrument, the theory can still be formulated in terms of classical Newtonian mechanics. Let us denote the gravitational tensor, expressed in the instrument frame, as

\[
\Gamma = V_{ij} =
\begin{pmatrix} V_{xx} & V_{xy} & V_{xz} \\ V_{yx} & V_{yy} & V_{yz} \\ V_{zx} & V_{zy} & V_{zz} \end{pmatrix}
=
\begin{pmatrix}
\frac{\partial^2 V}{\partial x^2} & \frac{\partial^2 V}{\partial x \partial y} & \frac{\partial^2 V}{\partial x \partial z} \\
\frac{\partial^2 V}{\partial y \partial x} & \frac{\partial^2 V}{\partial y^2} & \frac{\partial^2 V}{\partial y \partial z} \\
\frac{\partial^2 V}{\partial z \partial x} & \frac{\partial^2 V}{\partial z \partial y} & \frac{\partial^2 V}{\partial z^2}
\end{pmatrix},
\tag{1}
\]

where the gravitational potential represents the integration over all Earth masses (cf. Classical Physical Geodesy and Stokes Problem, Layer Potentials and Regularizations, Multiscale Applications)

\[
V_P = G \iiint \frac{\rho_Q}{\ell_{PQ}} \, d\Sigma_Q ,
\tag{2}
\]

where $G$ is the gravitational constant, $\rho_Q$ the density, $\ell_{PQ}$ the distance between the mass element in $Q$ and the computation point $P$, and $d\Sigma_Q$ the infinitesimal volume element. We may assume the gravitational effect of the atmosphere to be negligible. Then the space outside of the Earth is empty, and it holds $\nabla \cdot \nabla V = 0$ (source free) in addition to $\nabla \times \nabla V = 0$ (vorticity free). In Eq. 1 this corresponds to $\sum_i V_{ii} = 0$ and $V_{ij} = V_{ji}$, respectively. It leaves only five independent components at each point and offers important cross-checks between the measured components. If the Earth were a homogeneous sphere, the off-diagonal terms would be zero, and in a local triad {north, east, radial} one would find

\[
\Gamma = V_{ij} =
\begin{pmatrix} V_{xx} & V_{xy} & V_{xz} \\ V_{yx} & V_{yy} & V_{yz} \\ V_{zx} & V_{zy} & V_{zz} \end{pmatrix}
= \frac{GM}{r^3}
\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix},
\tag{3}
\]

where $M$ is the mass of the spherical Earth. This simplification gives an idea of the orders of magnitude involved. At GOCE satellite altitude, $V_{zz} = 2{,}740$ E. This also implies that at a distance of 0.5 m from the spacecraft’s center of mass, the maximum gravitational acceleration is about $1.5 \times 10^{-6}\,\mathrm{m/s^2}$.

In an alternative interpretation, one can show that the $V_{ij}$ express the local geometric curvature structure of the gravitational field, i.e.,

\[
\Gamma = V_{ij} = -g
\begin{pmatrix} k_1 & t_1 & f_1 \\ t_2 & k_2 & f_2 \\ f_1 & f_2 & -H \end{pmatrix},
\tag{4}
\]

where $g$ is gravity, $k_1$ and $k_2$ express the local curvature of the level surfaces in the north and east directions, $t_1$ and $t_2$ are the torsion, $f_1$ and $f_2$ the north and east components of the curvature of the plumb line, and $H$ the mean curvature. For a derivation, refer to Marussi (1985). This interpretation of gravitational gradients in terms of gravitational geometry provides a natural bridge to Einstein’s general relativity, where gravitation is interpreted in terms of space-time curvature, since it holds

\[
R^{i}_{0j0} = \frac{1}{c^2} V_{ij}
\tag{5}
\]

for the nine components of the tidal force tensor, which are components of $R^{i}_{kjl}$, the Riemann curvature tensor, with its indices running over 0, 1, 2, 3 (Ohanian and Ruffini 1994, p. 41; Moritz and Hofmann-Wellenhof 1993, Chap. 5).

A third interpretation of gravitational gradiometry is in terms of tides. Sun and Moon produce a tidal field on Earth. It is zero at the Earth’s center of mass and maximum at its surface. Analogously, the Earth produces a tidal field in every Earth-orbiting satellite. At the center of mass of a satellite, the tidal acceleration is zero; i.e., the acceleration relative to the center of mass is zero; this leads to the terminology “zero-g.” The tidal acceleration increases with distance from the satellite’s center of mass like

\[
a_i = V_{ij} \, dx^j
\tag{6}
\]

with the measurable components of tidal acceleration $a_i$ and of relative position $dx^j$, both taken in the instrument reference frame. Unlike Sun and Moon relative to the Earth, GOCE is always Earth pointing with its z-axis. This implies that the gradiometer permanently measures “high tide” in the z-direction and “low tide” in the x- and y-directions. The gradient components are deduced by taking the difference between the accelerations at two points located along one gradiometer axis, symmetrically with respect to the satellite’s center of mass:

\[
V_{ij} = \frac{a_i(1) - a_i(2)}{2\, dx^j} .
\tag{7}
\]
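The orders of magnitude in Eq. 3 and the differencing principle of Eqs. 6 and 7 can be checked numerically. The sketch below is illustrative only: the values of $GM$ and of the Earth radius are assumed standard values rather than numbers from the text, the function names are hypothetical, and the two test points are placed at half of the 0.5 m baseline on either side of the center of mass.

```python
GM = 3.986004418e14  # m^3/s^2, Earth's gravitational parameter (assumed value)
R_EARTH = 6.378e6    # m, rough equatorial radius (assumed value)
ALTITUDE = 255e3     # m, from the text


def sphere_tensor(r):
    """Gravitational tensor of a homogeneous sphere (Eq. 3), in s^-2,
    in a local {north, east, radial} triad."""
    s = GM / r**3
    return [[-s, 0.0, 0.0],
            [0.0, -s, 0.0],
            [0.0, 0.0, 2.0 * s]]


def tidal_acceleration(tensor, dx):
    """Eq. 6: a_i = sum_j V_ij dx^j."""
    return [sum(tensor[i][j] * dx[j] for j in range(3)) for i in range(3)]


V = sphere_tensor(R_EARTH + ALTITUDE)
trace = V[0][0] + V[1][1] + V[2][2]
print(f"V_zz = {V[2][2] / 1e-9:.0f} E, trace = {trace:.1e} s^-2")

# Eq. 7: difference the accelerations at two points placed symmetrically
# about the centre of mass along the z-axis.
half = 0.25
a1 = tidal_acceleration(V, [0.0, 0.0, +half])
a2 = tidal_acceleration(V, [0.0, 0.0, -half])
V_zz_recovered = (a1[2] - a2[2]) / (2.0 * half)
print(f"recovered V_zz = {V_zz_recovered / 1e-9:.0f} E")
```

With these assumed constants, $V_{zz}$ comes out at roughly 2,730 E, close to the 2,740 E quoted above; the trace vanishes, and the differencing of Eq. 7 returns the same $V_{zz}$ that was used to generate the accelerations.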

Remark 1. In addition to the tidal acceleration of the Earth, GOCE is measuring the direct and indirect tidal signal of Sun and Moon. This signal is much smaller, well known, and taken into account.

If the gravitational attraction of the atmosphere is taken care of by an atmospheric model, the Earth’s outer field can be regarded as source free and the Laplace equation holds. It is common practice to solve the Laplace equation in terms of spherical harmonic functions. The use of alternative base functions is discussed, for example, in Schreiner (1994) or Freeden et al. (1998). For a spherical surface, the solution of a Dirichlet boundary value problem yields the gravitational potential of the Earth in terms of normalized spherical harmonic functions $Y_{nm}(\Omega_P)$ of degree $n$ and order $m$ as

\[
V(P) = \sum_n \left( \frac{R}{r_P} \right)^{n+1} \sum_m t_{nm} \, Y_{nm}(\Omega_P) = Y t
\tag{8}
\]

with $\{\Omega_P, r_P\} = \{\theta_P, \lambda_P, r_P\}$ the spherical coordinates of $P$ and $t_{nm}$ the spherical harmonic coefficients. The gravitational tensor at $P$ is then

\[
V_{ij}(P) = \sum_n \sum_m t_{nm} \, \partial_{ij} \left\{ \left( \frac{R}{r_P} \right)^{n+1} Y_{nm}(\Omega_P) \right\} = Y_{\{ij\}} \, t .
\tag{9}
\]

The spherical harmonic coefficients $t_{nm}$ are derived from the measured gradiometric components $V_{ij}$ by least-squares adjustment.
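To make the least-squares step concrete, the following toy example recovers a handful of coefficients from simulated radial gradients. It is a heavily simplified sketch and not the GOCE processing chain: only zonal terms ($m = 0$, unnormalized Legendre polynomials) are used, the data are noise-free, lengths are scaled so that $R = 1$, the normal equations are solved directly, and all names and coefficient values are hypothetical.

```python
import math


def legendre(n_max, x):
    """Legendre polynomials P_0..P_n_max at x (three-term recurrence)."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def design_row(n_max, ratio, theta):
    """Row of the design matrix for one radial gradient V_rr.

    ratio = R / r_P. Each zonal term (R/r)^(n+1) P_n(cos theta) is
    differentiated twice with respect to r, giving the factor
    (n+1)(n+2)/r^2; lengths are scaled so that R = 1 and r = 1/ratio.
    """
    p = legendre(n_max, math.cos(theta))
    r = 1.0 / ratio
    return [(n + 1) * (n + 2) / r**2 * ratio ** (n + 1) * p[n]
            for n in range(n_max + 1)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[piv] = m[piv], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def least_squares(rows, obs):
    """Solve the normal equations (A^T A) t = A^T y."""
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, obs)) for i in range(k)]
    return solve(ata, aty)


# Hypothetical zonal coefficients t_0..t_4 and noise-free synthetic data.
t_true = [1.0, 0.0, -4.8e-4, 1.0e-6, 5.0e-7]
ratio = 1.0 / 1.04  # R / r_P for an orbit a few percent above the sphere
thetas = [math.radians(5.0 + 170.0 * k / 39) for k in range(40)]
rows = [design_row(4, ratio, th) for th in thetas]
obs = [sum(a * t for a, t in zip(row, t_true)) for row in rows]
t_hat = least_squares(rows, obs)
print([f"{t:.3e}" for t in t_hat])
```

The rows of the design matrix play the role of $Y_{\{ij\}}$ in Eq. 9 for the single component $V_{rr}$. A realistic adjustment would involve all measured components, colored noise, and tens of thousands of unknowns, for which forming and solving the normal equations this naively would not be adequate.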


Remarks 2. In the case of GOCE gradiometry, the situation is as follows.

Above degree and order $n = m = 220$, noise starts to dominate signal. Thus, the series has to be truncated in some intelligent manner, minimizing aliasing and leakage effects. GOCE covers our globe in 61 days with a dense pattern of ground tracks. The original planning of its mission lifetime assumed the completion of at least three such 61-day cycles. The orbit inclination is 96.7°. This leaves the two polar areas (opening angle 6.7°) free of observations, the so-called polar gaps. Various strategies have been suggested for minimizing the effect of the polar gaps on the determination of the global field (compare, e.g., Baur et al., 2009).

Instead of analyzing the individual tensor components in a least-squares adjustment, one could consider particular combinations. A very elegant approach is the use of the invariants of the gravitational tensor. They are independent of the orientation of the gradiometer triad. We refer to Rummel (1986) and in particular to Baur and Grafarend (2006) and Baur (2007). While the first invariant, being the Laplace trace condition, cannot be used for gravity field analysis, the two others can. They are nonlinear and lead to an iterative adjustment.

Let us assume for a moment that the gradiometer components $V_{ij}$ were given in an Earth-fixed spherical {north, east, radial} triad. In that case, the tensor can be expanded in tensor spherical harmonics and decomposed into the irreducible radial, mixed normal-tangential, and pure tangential parts with the corresponding eigenvalues (Rummel and van Gelderen 1992; Schreiner 1994; Rummel 1997; Nutz 2002; Martinec 2003),

\[
\begin{aligned}
(n+1)(n+2) \quad & \text{for } \Gamma_{zz} \text{ and } \Gamma_{xx} + \Gamma_{yy}, \\
-(n+2)\sqrt{n(n+1)} \quad & \text{for } \{\Gamma_{xz}, \Gamma_{yz}\}, \\
\sqrt{\frac{(n+2)!}{(n-2)!}} \quad & \text{for } \{\Gamma_{xx} - \Gamma_{yy}, -2\Gamma_{xy}\}.
\end{aligned}
\]

All eigenvalues are of the order of $n^2$. In Schreiner (1994) and Freeden et al. (1998) as well as in Geodetic Boundary Value Problem of this handbook, it is also shown that $\{\Gamma_{xz}, \Gamma_{yz}\}$ and $\{\Gamma_{xx} - \Gamma_{yy}, -2\Gamma_{xy}\}$ are insensitive to degree zero, the latter combination also to degree one. In the case of GOCE, the above properties cannot be employed in a straightforward manner, because (1) not all components are of comparable precision and, more importantly, (2) the gradiometric components are measured in the instrument frame, whose orientation follows the orbit and the attitude control commands.

There exist various competing strategies for the actual determination of the field coefficients $t_{nm}$, depending on whether the gradients are regarded as in situ measurements on a geographical grid, along the orbit tracks, as a time series along the orbit, or as Fourier coefficients derived from the latter (compare, e.g., Migliaccio et al., 2004; Pail and Plank, 2004; Brockmann et al., 2009; Stubenvoll et al., 2009). These methods take into account the noise characteristics of the components and their orientation in space. The principles of the methods of gravity modeling are described by Roland Pail in this handbook (Pail 2014). There, the stochastic as well as the functional model are discussed. That chapter also contains a summary of the methods of determination of angular rates from a joint analysis of star tracking and gradiometry applying Wiener filtering. Because of the polar gap, gravity modeling combining gravitational gradiometry and GPS-based kinematic orbits leads to numerical instabilities and requires some form of regularization (Kusche and Klees 2002). A further step will be regional refinement by combination of GOCE with terrestrial data sets (e.g., Eicker et al. 2009; Stubenvoll et al. 2009 or Förste et al. 2011).
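The eigenvalues of the tensor spherical harmonic decomposition quoted earlier in this section can be tabulated to illustrate two of the statements made there: that they grow like $n^2$ and that the mixed and tangential combinations vanish for the lowest degrees. This is a small sketch with a hypothetical function name; the sign of the mixed term is a convention assumed here.

```python
import math


def tensor_eigenvalues(n):
    """Spectral factors of the three tensor parts for degree n.

    Returns (radial, mixed, tangential):
      radial      (n+1)(n+2)            for the zz part
      mixed       -(n+2) sqrt(n(n+1))   for {xz, yz}
      tangential  sqrt((n+2)!/(n-2)!)   for {xx - yy, -2xy}
    The factorial ratio is written as the product (n-1) n (n+1) (n+2),
    which also covers n = 0 and n = 1.
    """
    radial = (n + 1) * (n + 2)
    mixed = -(n + 2) * math.sqrt(n * (n + 1))
    tangential = math.sqrt((n - 1) * n * (n + 1) * (n + 2))
    return radial, mixed, tangential


for n in (0, 1, 2, 220):
    radial, mixed, tangential = tensor_eigenvalues(n)
    ratio = round(tangential / n**2, 4) if n else None
    print(n, radial, round(mixed, 3), round(tangential, 3), ratio)
```

For degree zero the mixed and tangential factors are both zero, and for degree one the tangential factor is zero, in line with the insensitivity noted in the text; near degree 220 all three factors differ from $n^2$ in magnitude by only about one percent.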

4 GOCE Status

GOCE was launched on March 17, 2009. After a commissioning and calibration phase, the first operational measurement cycle started on November 1, 2009. The mission was originally planned for 20 months only; because of its smooth performance, it was ultimately extended until November 2013. All sensor systems worked well, with the exception of a slightly higher than nominal noise level in the radial components $V_{zz}$ and $V_{xz}$, for reasons still not understood. Three major interruptions occurred: from February 12 to March 2 and from July 2 to September 25, 2010, due to problems with the processor units, and from January 1 to January 21, 2011, because of a software problem related to the GPS receiver. The mission ended on November 11, 2013 with the reentry of the spacecraft over the South Atlantic Ocean near the Falkland Islands. By August 2013, altogether 2.3 years of data were available and entered the release 4 models. The ESA release 4 GOCE gravity models are summarized in Table 2.

From August 1, 2012 on, the orbit altitude was lowered in four steps, with at least one full measurement cycle in between, by 9, 6, 5, and 11 km, i.e., altogether by 31 km. The lowest altitude, 224 km, was attained on May 31, 2013. This was done in order to test whether the increased gravitational sensitivity can be exploited despite the increase of atmospheric drag at lower altitudes. Preliminary analysis shows an increase in spatial resolution. Release 5 is expected to be published in summer 2014. It will include the complete GOCE data set, including the data from the lower orbit altitudes.

Table 2 ESA release 4 GOCE gravity models DIR4 and TIM4 and their characteristics
Characteristic | DIR4 | TIM4
Maximum D/O | 260 | 250
GOCE data volume | 01.11.09–01.08.12; 2.3 years (net) | 01.11.09–19.06.12; 2.2 years (net)
Gravity gradients | $V_{xx}$, $V_{yy}$, $V_{zz}$, $V_{xz}$; 288 Mio. obs. | $V_{xx}$, $V_{yy}$, $V_{zz}$, $V_{xz}$; 279 Mio. obs.
Gradient filter | Band-pass filter | ARMA filter per segment
GOCE SST (GPS) | – | Short arc approach (d/o 130)
GRACE SST (K-Band) | 2003–2012; GRGS RL02 (d/o 55), GFZ RL05 (d/o 56–180) | –
LAGEOS1/2 (SLR) | 1985–2010, 25 years | –
Regularization | Iterative spherical cap (d/o 260) based on GRACE/LAGEOS; Kaula zero constraint (d/o >200) | Kaula zero constraint (near zonals and for d/o >180)


Fig. 3 Vertical gravity gradient field of Antarctica as measured by GOCE (units: milliEötvös $= 10^{-12}\,\mathrm{s^{-2}}$) and some prominent tectonic features; the GOCE polar gap is marked in gray

5 Conclusions: GOCE Science Applications

Data exploitation of GOCE for science and application is far from being completed. We observe that the research fields are essentially those identified already in the GOCE mission proposal (ESA 1999b), with one exception. The expectation was that GOCE would be unable to sense temporal variations of gravity and geoid. This is true in general; however, some big mass movements, such as those associated with the big Japan earthquake, seem to be detectable from the gradiometer data (Bouman et al. 2013).

Solid Earth physics. The gravity field as sensed by GOCE reflects the density distribution of the Earth’s masses, primarily from topography, crust, and lithosphere and, in an attenuated form, from the upper and lower mantle down to the core. In ocean areas, gravity is already well known from satellite altimetry. The situation is different in continental areas. Comparisons of GOCE gravity models with EGM2008 (e.g., Yi and Rummel 2014) reveal good agreement in well-surveyed parts of the Earth such as North America, Europe, Australia, New Zealand, and Japan but rather poor agreement in large parts of South America, Africa, Himalaya, and South-East Asia. EGM2008 is a combination of GRACE satellite gravimetry with a global set of terrestrial gravity anomalies. Some of the regions with poor terrestrial gravity are of high geodynamic relevance. Currently, several studies are underway, looking into the state of isostatic equilibrium and into the elastic thickness of the lithosphere (e.g., Sampietro et al. 2012 or van der Meijde et al. 2013). A special case is Antarctica, where terrestrial gravity data are sparse. GRACE delivered gravity and geoid information in Antarctica, but confined to rather large spatial scales. GOCE now shows the gravity field and, with it, tectonic processes hidden under an ice layer with a thickness of several kilometers (e.g., Ferraccioli et al. 2011). Figure 3 shows the vertical gravity gradient field of Antarctica as measured by GOCE and some prominent tectonic features. The polar gap is marked in gray.


Fig. 4 Geostrophic velocities (in cm/s) derived from drifter measurements (left) and from geodetic mean dynamic ocean topography (right)

Height systems. Official heights are provided to the user either as gravity potential differences between terrain points or as metric heights derived from the potential differences, such as orthometric or normal heights. The official height systems refer to a zero value at some adopted reference marker at a tide gauge, representing mean sea level there. However, mean sea level varies from location to location due to the variations in coastal oceanic conditions. The variations are not big, usually less than 2 m, but they are responsible for unknown height offsets between the various official height systems worldwide. GOCE provides the best possible geoid surface (ideally refined locally by shorter scales from some regional geoid computation). It represents the theoretical sea surface at rest and can be introduced as an ideal worldwide height reference. It also permits determination of the height offsets between the various height systems, which were not detectable in the past. This process of height unification is of great value for mapping, large civil constructions, and sea level research. As a demonstration of the value of the GOCE geoid for height determination, Woodworth et al. (2012) revealed the bias of the North American height system, which is based on classical spirit leveling. In the near future, GPS positions can be translated to heights above the GOCE geoid, yielding globally consistent and physically meaningful heights worldwide.

Oceanography. In his classical textbook “Atmosphere-Ocean Dynamics”, A.E. Gill (1982) writes on page 46: “If the sea were at rest, its surface would coincide with the geopotential surface.” This geopotential surface is the geoid, and with GOCE its shape is determined with an accuracy of 2–3 cm. In reality the sea is not at rest; it is in motion, driven by winds, atmospheric pressure differences, and tides. As a consequence, the ocean surface deviates from a geopotential surface by up to 1 or 2 m at the major ocean currents; the deviation is denoted dynamic ocean topography. The actual sea surface is measured from space by satellite altimetry. More than 20 years of altimetry yield models of the mean sea surface (MSS) accurate to a few centimeters. Subtraction of the GOCE geoid from the MSS gives the geodetic mean dynamic ocean topography (MDT). It is maintained by the balance of the pressure differences due to the MDT and Coriolis acceleration. Its slope is proportional to the geostrophic velocities of ocean circulation. GOCE together with altimetry gives the MDT and the geostrophic velocity field without the use of any oceanographic in situ data. Geodetic MDT and geostrophic velocities represent a new type of input quantity for numerical ocean circulation models and help to improve ocean mass and heat transport estimates, e.g., in the area of the Weddell Sea in the Southern Ocean, one of the tipping points of our climate system. Figure 4 shows the magnitude of the geostrophic velocities in the North Atlantic based on the Danish MSS model DTU-10 (right) and geostrophic velocities derived from drifter data (left). We observe higher signal strength of the geodetic estimate. A large number of ocean studies based on GOCE have already been published. Examples are Bingham et al. (2011), Janjić et al. (2012), and Le Traon et al. (2011).
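The proportionality between the slope of the MDT and the geostrophic velocity can be sketched with the standard geostrophic balance, $u = -(g/f)\,\partial\eta/\partial y$ and $v = (g/f)\,\partial\eta/\partial x$, with Coriolis parameter $f = 2\Omega \sin\varphi$. These formulas are textbook oceanography rather than a quotation from this chapter, and the function name, the constant gravity value, and the example slope are assumptions made for illustration.

```python
import math

OMEGA = 7.292115e-5  # rad/s, Earth's rotation rate
G0 = 9.81            # m/s^2, nominal gravity (assumed constant)


def geostrophic_velocity(deta_dx, deta_dy, lat_deg):
    """Surface geostrophic velocity (u east, v north) in m/s.

    deta_dx, deta_dy: eastward and northward slopes of the mean dynamic
    topography (dimensionless, m per m). Not valid near the equator,
    where the Coriolis parameter f tends to zero.
    """
    f = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    u = -G0 / f * deta_dy
    v = G0 / f * deta_dx
    return u, v


# Hypothetical case: MDT rising by 1 m over 100 km towards the north at 40 N.
u, v = geostrophic_velocity(0.0, 1.0 / 100e3, 40.0)
print(f"u = {u:.2f} m/s, v = {v:.2f} m/s")
```

A slope of 1 m per 100 km at mid-latitudes thus corresponds to a current of about 1 m/s, which shows why centimeter-level accuracy of both the geoid and the mean sea surface is needed to resolve weaker currents.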


Atmosphere. GOCE was kept drag-free in the flight direction using ion thrusters. The feedback signal for drag-free control was derived from the measured common-mode accelerations of the six accelerometers of the gradiometer. These accelerations are a measure of the nongravitational forces acting on the GOCE spacecraft and open the possibility for studies of atmospheric density and winds (Doornbos et al. 2013). At GOCE altitude, no other data source is available.

References

Albertella A, Rummel R (2009) On the spectral consistency of the altimetric ocean and geoid surface: a one-dimensional example. J Geod 83(9):805–815
Baur O (2007) Die Invariantendarstellung in der Satellitengradiometrie. DGK, Reihe C, Beck, München
Baur O, Grafarend EW (2006) High performance GOCE gravity field recovery from gravity gradient tensor invariants and kinematic orbit information. In: Flury J, Rummel R, Reigber Ch, Rothacher M, Boedecker G, Schreiber U (eds) Observation of the earth system from space. Springer, Berlin, pp 239–254
Baur O, Cai J, Sneeuw N (2009) Spectral approaches to solving the polar gap problem. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
Bingham RJ, Knudsen P, Andersen O, Pail R (2011) An initial estimate of the North Atlantic steady-state geostrophic circulation from GOCE. Geophys Res Lett 38:L01606. doi:10.1029/2010GL045633
Bock H, Jäggi A, Meyer U, Visser P, van den Ijssel J, van Helleputte T, Heinze M, Hugentobler U (2011) GPS-derived orbits of the GOCE satellite. J Geod 85(11):807–818
Bouman J, Visser P, Fuchs M, Broerse T, Haberkorn C, Lieb V, Schmidt M, Schrama E, van der Wal W (2013) GOCE gravity gradients and the Earth’s time varying gravity field. ESA Living Planet, Edinburgh
Brockmann JM, Kargoll B, Krasbutter I, Schuh WD, Wermuth M (2009) GOCE data analysis: from calibrated measurements to the global earth gravity field. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
Bunge H-P, Richards MA, Lithgow-Bertelloni C, Baumgardner JR, Grand SP, Romanowicz BA (1998) Time scales and heterogeneous structure in geodynamic earth models. Science 280:91–95
Carroll JJ, Savet PH (1959) Gravity difference detection. Aerosp Eng 18:44–47
Colombo O (1989) Advanced techniques for high-resolution mapping of the gravitational field. In: Sansò F, Rummel R (eds) Theory of satellite geodesy and gravity field determination. Lecture notes in earth sciences, vol 25. Springer, Heidelberg, pp 335–369
Doornbos E, Bruinsma S, Fritsche B, Visser P, v/d Ijssel J, Teixeira Encarnação J, Kern M (2013) Air density and wind retrieval using GOCE data. ESA Living Planet, Edinburgh
Eicker A, Mayer-Gürr T, Ilk KH, Kurtenbach E (2009) Regionally refined gravity field models from in situ satellite data. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
ESA (1999a) Introducing the “Living Planet” Programme – the ESA strategy for earth observation. ESA SP-1234. ESA Publication Division, ESTEC, Noordwijk


ESA (1999b) Gravity field and steady-state ocean circulation mission. Reports for mission selection, SP-1233 (1). ESA Publication Division, ESTEC, Noordwijk. http://www.esa.int./livingplanet/goce
ESA (2006) The changing earth – new scientific challenges for ESA’s Living Planet Programme. ESA SP-1304. ESA Publication Division, ESTEC, Noordwijk
Falk G, Ruppel W (1974) Mechanik, Relativität, Gravitation. Springer, Berlin
Ferraccioli F, Finn CA, Jordan TA, Bell RE, Anderson LM, Damaske D (2011) East Antarctic rifting triggers uplift of the Gamburtsev Mountains. Nature 479:388–392
Förste C, Bruinsma S, Shako R, Marty J-C, Flechtner F, Abrikosov O, Dahle C, Lemoine J-M, Neumayer H, Biancale R, Barthelmes F, König R, Balmino G (2011) EIGEN-6: a new combined global gravity field model including GOCE data from the collaboration of GFZ-Potsdam and GRGS-Toulouse, EGU2011-3242
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere. Oxford Science Publications, Oxford
Ganachaud A, Wunsch C, Kim M-Ch, Tapley B (1997) Combination of TOPEX/POSEIDON data with a hydrographic inversion for determination of the oceanic general circulation and its relation to geoid accuracy. Geophys J Int 128:708–722
Gill AE (1982) Atmosphere-ocean dynamics. Academic, New York
Hager BH, Richards MA (1989) Long-wavelength variations in Earth’s geoid: physical models and dynamical implications. Philos Trans R Soc Lond A 328:309–327
Jäggi A (2007) Pseudo-stochastic orbit modelling of low earth satellites using the global positioning system. Geodätisch-geophysikalische Arbeiten in der Schweiz, 73
Janjić T, Schröter J, Savcenko R, Bosch W, Albertella A, Rummel R, Klatt O (2012) Impact of combining GRACE and GOCE gravity data on ocean circulation estimates. Ocean Sci 8:65–79. doi:10.5194/os-8-65-2012
Johannessen JA, Balmino G, LeProvost C, Rummel R, Sabadini R, Sünkel H, Tscherning CC, Visser P, Woodworth P, Hughes CH, LeGrand P, Sneeuw N, Perosanz F, Aguirre-Martinez M, Rebhan H, Drinkwater MR (2003) The European gravity field and steady-state ocean circulation explorer satellite mission: its impact on geophysics. Surv Geophys 24:339–386
Kaban MK, Schwintzer P, Reigber Ch (2004) A new isostatic model of the lithosphere and gravity field. J Geod 78:368–385
Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76:359–368
LeGrand P, Minster J-F (1999) Impact of the GOCE gravity mission on ocean circulation estimates. Geophys Res Lett 26(13):1881–1884
Le Traon PY, Schaeffer P, Guinehut S, Rio MH, Hernandez F, Larnicol G, Lemoine JM (2011) Mean ocean dynamic topography from GOCE and altimetry, ESA SP 686
Lithgow-Bertelloni C, Richards MA (1998) The dynamics of cenozoic and mesozoic plate motions. Rev Geophys 36(1):27–78
Losch M, Sloyan B, Schröter J, Sneeuw N (2002) Box inverse models, altimetry and the geoid; problems with the omission error. J Geophys Res 107(C7):15-1–15-13
Martinec Z (2003) Green’s function solution to spherical gradiometric boundary-value problems. J Geod 77:41–49
Marussi A (1985) Intrinsic geodesy. Springer, Berlin
Maximenko N, Niiler P, Rio M-H, Melnichenko O, Centurioni L, Chambers D, Zlotnicki V, Galperin B (2009) Mean dynamic topography of the ocean derived from satellite and drifting buoy data using three different techniques. J Atmos Ocean Technol 26:1910–1919

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_4-3 © Springer-Verlag Berlin Heidelberg 2014

Migliaccio F, Reguzzoni M, Sansò F (2004) Space-wise approach to satellite gravity field determination in the presence of colored noise. J Geod 78:304–313 Misner CW, Thorne KS, Wheeler JA (1970) Gravitation. Freeman, San Francisco Moritz H, Hofmann-Wellenhof B (1993) Geometry, relativity, geodesy. Wichmann, Karlsruhe Nutz H (2002) A unified setup of gravitational observables. Dissertation. Shaker Verlag, Aachen Ohanian HC, Ruffini R (1994) Gravitation and spacetime. Norton & Comp., New York Pail R (2014) It is all about statistics: global gravity field modelling from GOCE and complementary data. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg Pail R, Plank R (2004) GOCE gravity field processing strategy. Stud Geophys Geod 48:289–309 Rummel R (1986) Satellite gradiometry. In: Sünkel H (ed) Mathematical and numerical techniques in physical geodesy. Lecture notes in earth sciences, vol 7. Springer, Berlin, pp 317–363. ISBN (Print):978-3-540-16809-6, doi:10.1007/BFb0010135 Rummel R (1997) Spherical spectral properties of the earth’s gravitational potential and its first and second-derivatives. In: Sansò F, Rummel R (eds) Geodetic boundary value problems in view of the one centimeter geoid. Lecture notes in earth sciences, vol 65. Springer, Berlin, pp 359–404. ISBN:3-540-62636-0 Rummel R, van Gelderen M (1992) Spectral analysis of the full gravity tensor. Geophys J Int 111:159–169 Rummel R, Balmino G, Johannessen J, Visser P, Woodworth P (2002) Dedicated gravity field missions-principles and aims. J Geodyn 33/1–2:3–20 Sampietro D, Reguzzoni M, Braitenberg C (2012) The GOCE estimated Moho beneath the Tibetan Plateau and Himalaya. In: C Rizos, P Wills (eds) Earth on the edge: science for a substainable planet, International Association of Geodesy Symposia, vol 139. Springer, pp 391– 397. doi:10.1007/978-3-642-37222-3_52 Schreiner M (1994) Tensor spherical harmonics and their application in satellite gradiometry. 
Dissertation, Universität Kaiserslautern Stubenvoll R, Förste Ch, Abrikosov O, Kusche J (2009) GOCE and its use for a high-resolution global gravity combination model. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin Svehla D, Rothacher M (2004) Kinematic precise orbit determination for gravity field determination. In: Sansò F (ed) The proceedings of the international association of geodesy: a window on the future of geodesy. Springer, Berlin, pp 181–188 Van der Meijde M, Julia J, Assumpcao M(2013) Gravity derived Moho for South America. Tectonophysics 609:456–467 Wells WC (ed) (1984) Spaceborne gravity gradiometers. NASA conference publication, vol 2305, Greenbelt Woodworth PL, Hughes CW, Bingham RJ, Gruber T(2012) Towards worthwide height system unification using ocean information., J Geodetic Sci 2(4):302–318. doi:10.2478/v10156-0120004-8 Wunsch C, Gaposchkin EM (1980) On using satellite altimetry to determine the general circulation of the oceans with application to geoid improvement. Rev Geophys 18:725–745 Yi W, Rummel R (2014) A comparison of GOCE gravitational models with EGM2008. J Geodyn 73:14–22

Page 14 of 14

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_5-2 © Springer-Verlag Berlin Heidelberg 2014

Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation

Nils Olsen (a), Gauthier Hulot (b), and Terence J. Sabaka (c)

(a) DTU Space, Technical University of Denmark, Kgs. Lyngby, Denmark
(b) Equipe de Géomagnétisme, Institut de Physique du Globe de Paris, Sorbonne Paris Cité, Université Paris Diderot, UMR 7154 CNRS, Paris, France
(c) Planetary Geodynamics Laboratory, Code 698, NASA Goddard Space Flight Center, Greenbelt, MD, USA

Abstract

The geomagnetic field one can measure at the Earth’s surface or on board satellites is the sum of contributions from many different sources. These sources have different physical origins and can be found both below (in the form of electrical currents and magnetized material) and above (only in the form of electrical currents) the Earth’s surface. Each source happens to produce a contribution with rather specific spatio-temporal properties. This fortunate situation is what makes the identification and investigation of the contribution of each source possible, provided appropriate observational data sets are available and analyzed in an adequate way to produce the so-called geomagnetic field models. Here we provide a general overview of the various sources that contribute to the observed geomagnetic field, and of the modern data that enable their investigation via such procedures.

The Earth has a large and complicated magnetic field, a major part of which is produced by a self-sustaining dynamo operating in the fluid outer core. What is measured at or near the surface of the Earth, however, is the superposition of the core field and of additional fields caused by magnetized rocks in the Earth’s crust, by electric currents flowing in the ionosphere, magnetosphere and oceans, and by currents induced in the Earth by the time-varying external fields. Separating these various fields and accurately determining their spatial and temporal structure from magnetic field observations is a significant challenge, which requires advanced modeling techniques (see e.g., Hulot et al. 2007). These techniques rely on a number of mathematical properties which we review in the accompanying chapter by Sabaka et al. (2010), entitled “Mathematical Properties Relevant to Geomagnetic Field Modelling”.

But as many of those properties have been derived by relying on assumptions motivated by the nature of the various sources of the Earth’s magnetic field and of the available observations, it is important that a general overview of those sources and observations be given. This is precisely the purpose of the present chapter. It will first describe the various sources that contribute to the Earth’s magnetic field (Sect. 1) and next discuss the observations currently available to investigate them (Sect. 2). Special emphasis is placed on data collected by satellites, since these are extensively used for modeling the present magnetic field. We conclude with a few words on how the fields produced by those sources can be identified and investigated through geomagnetic field modeling.





1 Sources of the Earth’s Magnetic Field

Several sources contribute to the magnetic field that is measured at or above the surface; the most important ones are sketched in Fig. 1. The main part of the field is due to electrical currents in the Earth’s fluid outer core at depths larger than 2,900 km; this is the so-called core field. Its strength at the Earth’s surface varies from less than 30,000 nT near the equator to about 60,000 nT near the poles, which makes the core field responsible for more than 95 % of the observed field at ground. Magnetized material in the crust (the uppermost few kilometers of Earth) causes the crustal field; it is relatively weak and accounts on average for only a few percent of the observed field at ground. Core and crustal fields together make up the internal field (since their sources are internal to the Earth’s surface). External magnetic field contributions are caused by electric currents in the ionosphere (at altitudes of 90–1,000 km) and magnetosphere (at altitudes of several Earth radii). On average, their contribution is also relatively weak: a few percent of the total field at ground during geomagnetically quiet conditions. However, if not properly considered, they disturb the precise determination of the internal field. It is therefore of crucial importance to account for the external field (by data selection, data correction, and/or field co-estimation) in order to obtain reliable models of the internal fields.

Finally, electric currents induced in the Earth’s crust and mantle by the time-varying fields of external origin, and the movement of electrically conducting seawater, cause magnetic field contributions that are of internal origin like the core and crustal field; however, typically only the core and crustal fields are meant when speaking about “internal sources.”

A useful way of characterizing the spatial behavior of the geomagnetic field is to make use of the concept of spatial power spectra (e.g., Lowes 1996 and section 4 of Sabaka et al. 2010). Figure 2 shows the spectrum of the field of internal origin, often referred to as the Lowes-Mauersberger spectrum, which gives the mean square magnetic field at Earth’s surface due to contributions

Fig. 1 Sketch of the various sources contributing to the near-Earth magnetic field


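[Plot for Fig. 2: spatial power spectrum $R_n$ in (nT)² on a logarithmic axis from 10⁰ to 10⁸, versus spherical harmonic degree $n$ from 0 to 80; the top axis gives the corresponding horizontal wavelength $\lambda_n$ in km, from 40,000 down to 500; the two theoretical curves are labeled Core and Crust.]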

Fig. 2 Spatial power spectrum of the geomagnetic field at Earth’s surface. Black dots represent the spectrum of a recent field model (Maus et al. 2008; Olsen et al. 2009). Also shown are theoretical spectra (Voorhies et al. 2002) for the core (blue) and crustal (magenta) part of the field, as well as their superposition (red curve)

with horizontal wavelength $\lambda_n$ corresponding to spherical harmonic degree $n$. The spectrum of the observed magnetic field (based on a combination of the recent field models derived by Olsen et al. 2009 and Maus et al. 2008) is shown by black dots, while theoretical spectra for the core and crustal fields (Voorhies et al. 2002) are shown as blue and magenta curves, respectively. Each of these two theoretical spectra has two free parameters which have been fitted to the observed spectrum; their sum (red curve) provides a remarkably good fit to the observed spectrum. There is a sharp “knee” at about degree $n = 14$, which indicates that contributions from the core field are dominant at large scales ($n < 14$) while those of the crustal field dominate at smaller scales ($n > 14$).

We now proceed with a more detailed overview of the various field sources and their characteristics.
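The spectrum described above can be sketched numerically. The snippet below is a minimal illustration, assuming the commonly used Lowes-Mauersberger expression R_n(r) = (n + 1) (a/r)^(2n+4) Σ_m [(g_n^m)² + (h_n^m)²] for Schmidt semi-normalized Gauss coefficients, and the rule of thumb λ_n ≈ 2πa/n, which is consistent with the wavelength axis of Fig. 2 (other conventions exist). The function names, the reference radius a = 6371.2 km, and the sample coefficients are illustrative assumptions, not values taken from this chapter.

```python
import math

A_REF_KM = 6371.2  # assumed geomagnetic reference radius in km


def lowes_mauersberger(gauss, r_km=A_REF_KM, a_km=A_REF_KM):
    """Return {n: R_n} in nT^2 at radius r_km.

    gauss maps (n, m) -> (g, h) in nT, assumed Schmidt semi-normalized;
    h is taken as 0 for m = 0.
    """
    spectrum = {}
    for (n, m), (g, h) in gauss.items():
        # Each degree n is scaled by (n + 1) and continued from a_km to r_km.
        factor = (n + 1) * (a_km / r_km) ** (2 * n + 4)
        spectrum[n] = spectrum.get(n, 0.0) + factor * (g * g + h * h)
    return spectrum


def horizontal_wavelength_km(n, a_km=A_REF_KM):
    """Approximate horizontal wavelength for degree n (assumed 2*pi*a/n)."""
    return 2.0 * math.pi * a_km / n


# Illustrative, dipole-sized coefficients (hypothetical round numbers).
example = {(1, 0): (-30000.0, 0.0), (1, 1): (-2000.0, 5000.0)}
spec = lowes_mauersberger(example)
spec_400km = lowes_mauersberger(example, r_km=A_REF_KM + 400.0)
```

Evaluating the same coefficients at a larger radius, as in `spec_400km`, shows how each degree is attenuated with altitude by the factor (a/r)^(2n+4), which is why small-scale (high-degree) contributions fade fastest at satellite altitude.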

1.1 Internal Field Sources: Core and Crust

1.1.1 Core Field

Although the Earth’s magnetic field has been known for at least several thousands of years (see e.g., Merrill et al. 1998), the nature of its sources has eluded scientific understanding for a very long time. It was not until the nineteenth century that its main source was finally proven to be internal to the Earth. We now know that this main source is most likely a self-sustaining dynamo within the Earth’s core (see e.g., Roberts 2007; Wicht et al. 2010). This dynamo is the result of the fact that the liquid electrically conducting outer core (consisting of a Fe-Ni alloy) is cooling down and convecting vigorously enough to maintain electrical currents


and a magnetic field. The basic process is one whereby the convective motion of the conducting fluid within the magnetic field induces electromotive forces which maintain electrical currents, and therefore also the magnetic field, against Ohmic dissipation.

The Earth’s core dynamo has several specific features. First, the core contains a solid, also conducting, inner core. This inner core is thought to be the solidified part of the core (Jacobs 1953), the growth of which is the result of the cooling of the core. Because the Fe-Ni alloy that makes up the core must in fact contain additional light elements, the solidification of the inner core releases so-called compositional buoyancy at the inner core boundary. This buoyancy adds to the thermal buoyancy and is thought to be a major source of energy for the convection (see e.g., Nimmo 2007). Second, the core, together with the whole Earth, is rotating fast, at a rate of one rotation per day. As a result, Coriolis forces play a major role in organizing the convection and the way the dynamo works. In particular, spherical symmetry is dynamically broken, and a preferential axial (North-South) dipole field can be produced (see e.g., Gubbins and Zhang 1993). At any given instant, however, the field produced cannot be too simple (a requirement which has been formalized in terms of anti-dynamo theorems, starting with the best-known Cowling theorem (Cowling 1957)). In particular, no fully axisymmetric field can be produced by a dynamo. In effect, all numerical dynamo simulations run so far with conditions approaching those of the core dynamo produce quite complex fields in addition to a dominant axial dipole. This complexity does not only affect the so-called toroidal component of the magnetic field, which remains for the most part within the fluid core (such toroidal components are nonzero only where their sources lie (cf. section 3 of Sabaka et al. 2010), and the poorly conducting mantle forces them to remain essentially within the core). It also affects the poloidal component of the field, which can escape the core by taking the form of a potential field (such poloidal components can indeed escape their source region in the form of a Laplacian potential field, see again Sabaka et al. (2010)), reach the Earth’s surface, and make up the core field we can observe.

The core field thus has a rich spatial spectrum beyond a dominant axial dipole component. It also has a rich temporal spectrum (with typical time scales from decades to centuries (see e.g., Hulot and Le Mouël 1994)), directly testifying to the turmoil of the poloidal field produced by the dynamo at the core surface. However, what is observable at and above the Earth’s surface is only part of the core field that reaches it. Spatially, its small-scale contributions are masked by the crustal field, as shown in Fig. 2, and therefore only its largest scales (corresponding to spherical harmonic degrees smaller than 14) can be recovered. Temporally, the high-frequency part of the core field (corresponding to periods shorter than a few months) is screened by the finite conductivity of the mantle (see e.g., Alexandrescu et al. 1999). This puts severe limitations on the possibility of recovering the spatiotemporal structure of the core field, regardless of the quality of the magnetic field observations. More information on our present knowledge of the core field based on recent data can be found in, e.g., Hulot et al. (2007), Jackson and Finlay (2007), and Finlay et al. (2010).

1.1.2 Crustal Field

The material that makes up the mantle and crust contains substantial amounts of magnetic minerals. Those minerals can become magnetized in the presence of an applied magnetic field. To produce any significant magnetic signals, this magnetism must however be of ferromagnetic type, which also requires the material to be at a low enough temperature (below the so-called Curie temperature of the minerals, see e.g., Dunlop and Özdemir 2007). Those conditions are only met within the


Earth’s upper layers, above the so-called Curie isotherm. Its depth can vary between zero (such as at mid-oceanic ridges) and several tens of kilometers, with a typical value on the order of 20 km in continental regions. Magnetized rocks can thus only be found in those layers.

Magnetized rocks essentially carry two types of magnetization, induced magnetization and remanent magnetization. Induced magnetization is one that is proportional, both in strength and direction, to the ambient field within which the rock is embedded. The ability of such a rock to acquire this magnetization is a function of the nature and proportion of the magnetic minerals it contains. It is measured in terms of a proportionality factor known as the magnetic susceptibility. Were the core field (or more correctly the local field experienced by the rock) to disappear, this induced magnetization would also disappear. Then, only the second type of magnetization, remanent magnetization, would remain. This remanent magnetization could have been acquired by the rock in many different ways (see e.g., Dunlop and Özdemir 2007), for instance at the time of deposition for a sedimentary rock, or via chemical transformation if the rock has been chemically altered. The most ubiquitous process, however, which also usually leads to the strongest remanent magnetization, is thermal. It is the way igneous and metamorphic rocks acquire their remanent magnetization when they cool down below their Curie temperature. The rock becomes magnetized in proportion, both in strength and direction, to the ambient magnetic field that the rock experiences at the time it cools down (the proportionality factor being again a function of the magnetic minerals contained in the rock). Remanent magnetization from a properly sampled rock thus can provide information about the ancient core field (see e.g., Hulot et al. 2010).

There is no way one can identify the signature of the present core field without taking the crustal field into account (in fact, modern ways of modeling the core field from satellite data often also involve modeling the crustal field). It is therefore important to also mention some of the most important spatio-temporal characteristics of the field produced by magnetized sources. It should first be recalled that not all magnetized sources will produce observable fields at the Earth’s surface. In particular, if the upper layers of the Earth consisted of a spherical shell of uniform magnetic properties magnetized within the core field at a given instant, they would produce no observable field at the Earth’s surface. This is known as Runcorn’s theorem (Runcorn 1975), an important implication of which is that the magnetic field observed at the Earth’s surface is not sensitive to the induced magnetization due to the average susceptibility of a spherical shell best describing the upper magnetic layers of the Earth. It will only sense the departure of those layers from sphericity (see e.g., Lesur and Jackson 2000), either because of the Earth’s flattening, because of the variable depth of the Curie isotherm, or because of the contrasts in magnetization due to the variable nature and susceptibility of rocks within those layers (although even such contrasts can sometimes fail to produce observable fields (Maus and Haak 2003)).

Another issue of importance is that of the relative contributions of induced and remanent magnetization. Induced magnetization is most likely the main source of large-scale magnetization, while remanent magnetization plays a significant role only at regional (especially in oceans) and local scales (e.g., Purucker et al. 2002; Hemant and Maus 2005). Finally, it is important that we also briefly mention the poorly known issue of possible temporal changes in crustal magnetization on a human time scale. At a local scale, any dynamic process that can alter the magnetic properties of the rocks, or change the geological setting (such as an active volcano), would produce such changes. On a planetary scale, by contrast, significant changes can only occur in the induced magnetization because of the slowly time-varying core field, as has been recently demonstrated by Hulot et al. (2009) and Thébault et al. (2009). Recent reviews of the crustal field are given by Purucker and Whaler (2007) and Thébault et al. (2010).
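The proportionality that defines induced magnetization can be made concrete with a short calculation. The sketch below assumes the usual SI relation M = χH with H ≈ B/μ0, which is a reasonable approximation for weakly magnetic rocks; the function name, the susceptibility, and the field values are hypothetical and chosen only for illustration.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m / A


def induced_magnetization(b_ambient_nt, susceptibility_si):
    """Induced magnetization vector (A/m) for an ambient field given in nT.

    Assumes M = chi * H with H approximated by B / mu_0.
    """
    return tuple(susceptibility_si * (b * 1e-9) / MU_0 for b in b_ambient_nt)


b_local = (20000.0, 0.0, 45000.0)  # hypothetical (X, Y, Z) in nT
m_ind = induced_magnetization(b_local, susceptibility_si=0.01)

# With no ambient field the induced part vanishes; any remanent
# magnetization would have to be modeled as a separate term.
m_off = induced_magnetization((0.0, 0.0, 0.0), susceptibility_si=0.01)
```

The result is parallel to the ambient field and scales linearly with it, which is the property the text relies on when it notes that induced magnetization would disappear together with the local field.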


1.2 Ionospheric, Magnetospheric, and Earth-Induced Field Contributions

Ionospheric and magnetospheric currents (which produce the field of external origin), as well as Earth-induced currents (which produce externally induced fields), contribute in a nonnegligible way to the observed magnetic field, both at ground and at satellite altitude. It is therefore important to consider them in order to properly identify and separate their signal from that of the field of internal origin.

1.2.1 Ionospheric Contributions

Geomagnetic daily variations at nonpolar latitudes (known as Sq variations) are caused by diurnal wind systems in the upper atmosphere: heating at the dayside and cooling at the nightside generate tidal winds which drive ionospheric plasma against the core field, inducing electric fields and currents in the ionospheric E-region dynamo region between 90 and 150 km altitude (Richmond 1989; Campbell 1989; Olsen 1997b). The currents are concentrated at an altitude of about 110–115 km and hence can be represented by a sheet current at that altitude (cf. sections 3.4 and 3.5 of Sabaka et al. 2010). They remain relatively fixed with respect to the Earth-Sun line and produce regular daily variations which are directly seen in the magnetograms of magnetically quiet days. On magnetically disturbed days there is an additional variation which includes superimposed magnetic storm signatures of magnetospheric and high-latitude ionospheric origin. Typical peak-to-peak Sq amplitudes at middle latitudes are 20–50 nT; amplitudes during solar maximum are about twice as large as those during solar minimum. Sq variations are restricted to the dayside (i.e., sunlit) hemisphere, and thus depend mainly on local time. Selecting data from the nightside when deriving models of the internal field is therefore useful to minimize field contributions from the nonpolar ionospheric E-region.

Because the geomagnetic field is strictly horizontal at the dip equator, there is about a fivefold enhancement of the effective (Hall) conductivity in the ionospheric dynamo region, which results in a correspondingly enhanced eastward current, called the Equatorial Electrojet (EEJ), flowing along the dayside dip equator (Rastogi 1989). Its latitudinal width is about 6°–8°. In addition, auroral electrojets (AEJ) flow in the auroral belts (near ±(65°–70°) magnetic latitude) and vary widely in amplitude with different levels of magnetic activity, from a few tens of nT during quiet periods to several thousand nT during major magnetic storms. As a general rule, ionospheric fields at polar latitudes are present even at magnetically quiet times and on the nightside (i.e., dark) hemisphere, which makes it difficult to avoid their contribution by data selection.

Electric currents at altitudes above 120 km, i.e., in the so-called ionospheric F-region (up to 1,000 km altitude), cause magnetic fields that are detectable at satellite altitude as nonpotential (e.g., toroidal) magnetic fields (Olsen 1997a; Richmond 2002; Maus and Lühr 2006). Their contributions in nonpolar regions are also important during local nighttime, when the E-region conductivity vanishes and therefore contributions from Sq and the Equatorial Electrojet are absent.

1.2.2 Magnetospheric Contributions

The field originating in Earth’s magnetosphere is due primarily to the ring-current and to currents on the magnetopause and in the magnetotail (Kivelson and Russell 1995). Currents flowing on the outer boundary of the magnetospheric cavity, the magnetopause currents, cancel the Earth’s field outside and distend the field within the cavity. This produces an elongated tail in the antisolar


direction within which so-called neutral-sheet currents are established in the equatorial plane. Interaction of these currents with the radiation belts near the Earth produces a ring-current in the dipole equatorial plane which partially encircles the Earth, but achieves closure via field-aligned currents (FAC) (currents which flow along core field lines) into and out of the ionosphere. The resulting fields have magnitudes on the order of 20–30 nT near the Earth during magnetically quiet periods, but can increase to several hundreds of nT during disturbed times.

In polar regions (poleward of, say, ±65° dipole latitude), the auroral ionosphere and magnetosphere are coupled by field-aligned currents. The fields from these FAC have magnitudes that vary with the magnetic disturbance level. However, they are always present, on the order of 30–100 nT during quiet periods and up to several thousand nT during substorms. There are also currents which couple the ionospheric Sq current systems in the two hemispheres and which flow, at least in part, along magnetic field lines. The associated magnetic fields are generally 10 nT or less. Finally, there exists a meridional current system which is connected to the EEJ, with upward directed currents at the dip equator and field-aligned downward directed currents at low latitudes. These currents result in magnetic fields of a few tens of nT at 400 km altitude.

1.2.3 Induction in the Solid Earth and the Oceans

Time-varying external fields produce secondary, induced, currents in the oceans and the Earth’s interior; this contribution is what we refer to as externally induced fields, which are the topic of electromagnetic induction studies (see Parkinson and Hutton 1989; Constable 2007; Kuvshinov 2012). In addition, the motion of electrically conducting seawater through the core field, via a process referred to as motional induction, also produces secondary currents (e.g., Tyler et al. 2003; Kuvshinov and Olsen 2005; Maus 2007b). The oceans thus contribute twofold to the observed magnetic field: by secondary currents induced by primary current systems in the ionosphere and magnetosphere, and by motion-induced currents due to the movement of seawater, for instance by tides.

The amplitude of induced contributions generally decreases with the period. As an example, about one third of the observed daily Sq variation in the horizontal components is of induced origin (Schmucker 1985). But induction effects also depend on the scale of the source (i.e., the ionospheric and magnetospheric current systems); as a result, the induced contribution due to the daily variation of the Equatorial Electrojet is, for instance, much smaller than the above-mentioned one third, which is typical of the large-scale Sq currents (Olsen 2007b).

2 Modern Geomagnetic Field Data

2.1 Definition of Magnetic Elements and Coordinates

Measurements of the geomagnetic field taken at ground or in space form the basis for modeling the Earth’s magnetic field. Observations taken at the Earth’s surface are typically given in a local topocentric (geodetic) coordinate system (i.e., relative to a reference ellipsoid as approximation for the geoid). The magnetic elements $X, Y, Z$ are the components of the field vector $\mathbf{B}$ in an orthogonal right-handed coordinate system, the axes of which point towards geographic North, geographic East, and vertically down, as shown in Fig. 3. Derived magnetic elements are: the angle between


Fig. 3 The magnetic elements in the local topocentric coordinate system, seen from North-East

geographic North and the (horizontal) direction in which a compass needle is pointing, denoted as declination $D = \arctan(Y/X)$; the angle between the local horizontal plane and the field vector, denoted as inclination $I = \arctan(Z/H)$; horizontal intensity $H = \sqrt{X^2 + Y^2}$; and total intensity $F = \sqrt{X^2 + Y^2 + Z^2}$.

In contrast to magnetic observations taken at or near ground, satellite data are typically provided in the geocentric coordinate system as spherical components $B_r, B_\theta, B_\varphi$, where $r$, $\theta$, $\varphi$ are radius, colatitude, and longitude, respectively. Equations for transforming between geodetic components $X, Y, Z$ and geocentric components $B_r, B_\theta, B_\varphi$ can be found in, e.g., section 5.02.2.1.1 of Hulot et al. (2007).

The distribution in space of the observations at a given time determines the spatial resolution to which the field can be determined for that time. Internal sources are often fixed with respect to the Earth (magnetic fields due to induced currents in the Earth’s interior are an exception) and thus follow its rotation. Internal sources are therefore best described in an Earth-Centered-Earth-Fixed (ECEF) coordinate frame like that given by the geocentric coordinates $r$, $\theta$, $\varphi$. In contrast, many external fields are fixed with respect to the position of the Sun, and therefore the use of a coordinate frame that follows the (apparent) movement of the Sun is advantageous. Solar magnetic (SM) coordinates for describing near magnetospheric currents like the ring-current, and geocentric solar magnetospheric (GSM) coordinates for describing far magnetospheric current systems like the tail currents, have turned out to be useful when determining models of Earth’s magnetic field (Maus and Lühr 2005; Olsen et al. 2006, 2009, 2010b).
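The derived magnetic elements defined above translate directly into a few lines of code. The following is a minimal sketch; the function name and the sample values are hypothetical, and the use of `atan2` (rather than a plain arctangent of a ratio) is an implementation choice made here so that the declination falls in the correct quadrant and a vanishing $X$ or $H$ does not cause a division by zero.

```python
import math


def magnetic_elements(x, y, z):
    """Return (D, I, H, F) from X (north), Y (east), Z (down), all in nT.

    D and I are returned in degrees; H and F in nT.
    """
    h = math.hypot(x, y)                 # horizontal intensity
    f = math.sqrt(x * x + y * y + z * z) # total intensity
    d = math.degrees(math.atan2(y, x))   # declination, positive eastward
    i = math.degrees(math.atan2(z, h))   # inclination, positive downward
    return d, i, h, f


# Hypothetical field: purely northward horizontal part, equal downward part.
d, i, h, f = magnetic_elements(20000.0, 0.0, 20000.0)
```

For this input the declination is zero and the inclination is 45 degrees, since the horizontal and vertical parts are equal.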

2.2 Ground Data

Presently about 150 geomagnetic observatories monitor the time changes of the Earth’s magnetic field. Their global distribution, shown in the lower part of Fig. 4, is very uneven, with large


[Plot for Fig. 4 (top): number of observatories (0 to 200) versus year (1800 to 2000), with separate curves for annual means, hourly means, and 1-min values; marked campaigns: Göttingen Magnetic Union, 1st IPY, 2nd IPY, IGY/C, IQSY, IMS, and Preparation for Ørsted.]

Fig. 4 Distribution of ground observatory magnetic field data in time (top) and space (bottom)

uncovered areas, especially over the oceans. Red symbols indicate sites that provide data (regardless of the observation time instant and the duration of the time series), while the yellow dots show observatories that have provided hourly mean values for the recent years. These observatories provide data of different temporal resolution, which are distributed through the World Data Center system (e.g., http://www.ngdc.noaa.gov/wdc, http://wdc.kugi.kyoto-u.ac.jp, http://www.wdc.bgs.ac.uk). Traditionally, annual mean values have been used for deriving field models, but the availability of hourly mean values (or even 1-min values) in digital form for recent years allows for a better characterization of external field variations. The upper part of Fig. 4 shows the distribution in time of observatory data of various sampling rates. International campaigns, like the Göttingen Magnetic Union, the 1st and 2nd International Polar Year (IPY), the International Geophysical Year (IGY/C), the International Quiet Solar Year (IQSY), the International Magnetospheric Study (IMS), and the preparation of the Ørsted satellite mission, have stimulated observatory data processing and the establishment of new observatories.

Geomagnetic observatories aim at measuring the magnetic field in the geodetic reference frame with an absolute accuracy of 1 nT (Jankowski and Sucksdorff 1996). However, it is presently not possible to take advantage of that measurement accuracy due to the (unknown) contribution from nearby crustal sources. When using observatory data for field modeling it is therefore common practice either to use first time differences of the observations (thereby eliminating the static crustal field contribution) or to co-estimate, together with the field model, an “observatory bias” for each site and element, following a procedure introduced by Langel et al. (1982). A joint analysis of observatory and satellite data allows one to determine these observatory biases. Knowledge of the absolute baseline is therefore less important during periods for which satellite data are available. Recognizing this will simplify the observation practice, especially for ocean-bottom magnetometers, for which the exact determination of true north is very difficult and expensive.

In addition to geomagnetic observatories (which monitor the time changes of the geomagnetic field at a given location), magnetic “repeat stations” are sites where high-quality magnetic measurements are taken every few years for a couple of hours or even days (Newitt et al. 1996; Turner et al. 2007). The main purpose of repeat stations is to measure the time changes of the core field (secular variation); they offer better spatial resolution than observatory data but do not provide continuous time series.
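The effect of taking first time differences can be illustrated with a small sketch. It assumes a hypothetical series of annual means made up of a slowly varying core signal plus a constant, unknown crustal bias; the numbers and the function name `first_differences` are invented for illustration and are not taken from any observatory.

```python
def first_differences(values):
    """Return the successive differences values[k+1] - values[k]."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# Hypothetical annual means (nT) of one field element at one site:
# a slowly varying core signal plus a static crustal contribution.
core_signal = [30000.0, 30020.0, 30045.0, 30075.0]
crustal_bias = 150.0  # unknown in practice; fixed here to show that it cancels
observed = [value + crustal_bias for value in core_signal]

if __name__ == "__main__":
    # The constant bias drops out, so both series yield the same differences.
    print(first_differences(observed))
    print(first_differences(core_signal))
```

Differencing removes the bias at the price of losing the absolute level, which is why the alternative of co-estimating an observatory bias is attractive when satellite data are available to fix that level.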

2.3 Satellite Data

The possibility to measure the Earth’s magnetic field from space has revolutionized geomagnetic field modeling. Magnetic observations taken by low-Earth-orbiting (LEO) satellites at altitudes below 1,000 km form the basis of recent models of the geomagnetic field. There are several advantages of using satellite data for field modeling:

1. Satellites sample the magnetic field over the entire Earth (apart from the polar gap, a region around the geographic poles that is left unsampled if the satellite orbit is not perfectly polar).
2. Measuring the magnetic field from an altitude of 400 km or so corresponds roughly to averaging over an area of this dimension. Thus the effect of local heterogeneities, for instance caused by local crustal magnetization, is reduced.
3. The data are obtained over different regions with the same instrumentation, which helps to reduce spurious effects.

There are, however, some points to consider when using satellite data instead of ground-based data:

1. Since the satellite moves (with a velocity of about 8 km/s at 400 km altitude), it is not possible to decide whether an observed magnetic field variation is due to a temporal or a spatial change of the field. Thus, there is a risk of time-space aliasing.
2. It is necessary to measure the magnetic field with high accuracy, not only regarding resolution but also regarding orientation and absolute values.


Table 1 Satellite missions of relevance for geomagnetic field modeling

Satellite      Years       Inclination i   Altitude/km   Accuracy/nT   Remarks
Cosmos 49      1964        50°             261–488       22            Scalar only
OGO-2 (POGO)   1965–1967   87°             413–1,510     6             Scalar only
OGO-4 (POGO)   1967–1969   86°             412–908       6             Scalar only
OGO-6 (POGO)   1969–1971   82°             397–1,098     6             Scalar only
Magsat         1979–1980   97°             325–550       6             Vector and scalar
DE-1           1981–1991   90°             568–23,290    ?             Vector (spinning)
DE-2           1981–1983   90°             309–1,012     30(F)/100     Low accuracy vector
POGS           1990–1993   90°             639–769       ?             Low accuracy vector, timing problems
UARS           1991–1994   57°             560           ?             Vector (spinning)
Ørsted         1999–       97°             650–850       4             Scalar and vector
CHAMP          2000–2010   87°             250–450       3             Scalar and vector
SAC-C          2001–2004   97°             698–705       4             Scalar only
Swarm          2013–       88°/87°         530/<450      2             Scalar and vector

3. Due to the Earth’s rotation, the satellite revisits a specific region after about 1 day.¹ Hence the magnetic field in a selected region is modeled from time series with a sampling rate of 1 day. However, since the measurements are not low-pass filtered before “resampling,” aliasing may occur.
4. Satellites usually acquire data not at one fixed altitude, but over a range of altitudes. The decay of altitude through mission lifetime often leads to time series that are unevenly distributed in altitude.
5. Finally, the satellite moves through an electric plasma, and the existence of electric currents at satellite altitude means that, in principle, the observed field cannot be described as the gradient of a Laplacian potential.

An overview of previous and present satellites that have been used for geomagnetic field modeling is given in Table 1 (see also Table 3.3 of Langel and Hinze 1998). The quality of the magnetic field measurements differs considerably among the listed satellites. Before the launch of the Ørsted satellite in 1999, the POGO satellite series (Cain 2007), which flew in the second half of the 1960s, and Magsat (Purucker 2007), which flew for 6 months around 1980, were the only high-precision magnetic satellites. A timeline of high-precision missions is shown in Fig. 5.

After a gap of almost 20 years with no high-precision satellites in orbit, the launch of the Danish Ørsted satellite (Olsen 2007a) in February 1999 marked the beginning of a new epoch for exploring the Earth’s magnetic field from space. Ørsted was followed by the German CHAMP satellite (Maus 2007a) and the US/Argentinian/Danish SAC-C satellite, launched in July and November 2000, respectively. All three satellites carry essentially the same instrumentation and provide high-quality and high-resolution magnetic field observations from space. They sense the various internal and external field contributions differently, due to their different altitudes and drift rates through local time.

¹ Actually the satellite revisits that region already after about 12 h, but this will be for a different local time. Because external field contributions depend heavily on local time, it is safer to rely on data taken at similar local time conditions, which results in the above stated sampling recurrence of 24 h.

[Figure 5: timeline, 1965 to 2020, of high-precision missions: POGO (OGO-2, -4, -6), scalar only; Magsat, vector and scalar; Ørsted, CHAMP, and SAC-C, vector and scalar; Swarm, 3-satellite constellation, vector and scalar.]

Fig. 5 Distribution of high-precision satellite missions in time

Fig. 6 Left: The path of a satellite at inclination i in orbit around the Earth. Right: Ground track of 24 h of the Ørsted satellite on January 2, 2001 (yellow curve). The satellite starts at −57° N, 72° E at 00 UT, moves northward on the morning side of the Earth, and crosses the equator at 58° E (yellow arrow). After crossing the polar cap it moves southward on the evening side and crosses the equator at 226° E (yellow open arrow) 50 min after the first equator crossing. The next equator crossing (after an additional 50 min) is at 33° E (red arrow), 24° westward of the first crossing 100 min earlier, while moving again northward.
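The timing quoted in the caption of Fig. 6, and the velocity of about 8 km/s at 400 km altitude mentioned in Sect. 2.3, can be checked roughly with Kepler's third law for a circular orbit. The sketch below rests on several assumptions that are not in the text: a spherical Earth of radius 6,371.2 km, a mean Ørsted altitude of 750 km (the middle of the 650–850 km range in Table 1), and no orbital precession. The function names are hypothetical.

```python
import math

GM_EARTH = 3.986004418e14   # m^3/s^2, geocentric gravitational constant
EARTH_RADIUS = 6371.2e3     # m, mean radius (assumed spherical Earth)
SIDEREAL_DAY = 86164.1      # s, one rotation of the Earth relative to the stars

def circular_orbit(altitude_km):
    """Return (speed in km/s, period in minutes) of a circular orbit."""
    r = EARTH_RADIUS + altitude_km * 1e3
    speed = math.sqrt(GM_EARTH / r)       # circular orbital speed
    period = 2.0 * math.pi * r / speed    # time for one revolution
    return speed / 1e3, period / 60.0

def westward_shift_deg(period_min):
    """Longitude by which the Earth rotates under the orbit during one revolution."""
    return 360.0 * period_min * 60.0 / SIDEREAL_DAY

if __name__ == "__main__":
    speed_400, _ = circular_orbit(400.0)
    _, period_750 = circular_orbit(750.0)
    print(f"speed at 400 km: {speed_400:.2f} km/s")
    print(f"period at 750 km: {period_750:.1f} min")
    print(f"ground-track shift per orbit: {westward_shift_deg(period_750):.1f} deg")
```

Under these assumptions the period comes out close to the 100 min between successive northward equator crossings, and the shift per orbit is close to the westward displacement quoted in the caption.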

A closer look at the characteristics of satellite data sampling is helpful. A satellite moves around Earth in an elliptical orbit. However, the ellipticity of the orbit is small for many of the satellites used for field modeling, and for illustration purposes we will concentrate on circular orbits. As sketched on the left side of Fig. 6, orbit inclination $i$ is the angle between the orbit plane and the equatorial plane. A perfectly polar orbit implies $i = 90^\circ$, but for practical reasons most satellite orbits have inclinations that are different from $90^\circ$. This results in “polar gaps,” which are regions around the geographic poles that are left unsampled. The right part of Fig. 6 shows the ground track of 1 day (January 2, 2001) of Ørsted satellite data. It is obvious that the coverage in latitude and longitude provided even by only a few days of satellite data is much better than that of the present ground-based observatory network (cf. Fig. 4). The polar gaps, the regions of half-angle $|90^\circ - i|$ around the geographic poles, are obvious when looking at the orbits in a polar view in an Earth-fixed coordinate system, as done in the left part of


Fig. 7 Left: Ground track of 1 day (January 2, 2001) of the Ørsted (red) and CHAMP (blue) satellites in dependence on geographic coordinates. Right: Orbit in the solar-magnetospheric (SM) reference frame

Fig. 7 for the Ørsted (red) and CHAMP (blue) satellite tracks of January 2, 2001. The polar gaps are larger for Ørsted (inclination $i = 97^\circ$) than for CHAMP ($i = 87^\circ$), as confirmed by the figure.

As mentioned before, internal magnetic field sources are often fixed (or slowly changing, in the case of the core field) with respect to the Earth, while most external fields have relatively fixed geometries with respect to the Sun. A good description of the various field contributions requires a good sampling of the data in the respective coordinate systems. Good coverage in latitude and longitude is essential for modeling the internal field. There are, however, pitfalls due to peculiarities of the satellite orbits, which may result in less than optimal sampling. The top panel of Fig. 8 shows the longitude of the ascending node (the equator crossing of the satellite going from south to north) of the Ørsted (left) and CHAMP (right) satellite orbits. Depending on orbit altitude (shown in the middle part of the figure) there are periods with pronounced “revisiting patterns”: in June and July 2003, for instance, the CHAMP satellite samples the field near the equator only at longitudes $\varphi = 7.6^\circ, 19.2^\circ, 30.8^\circ, \ldots, 356.0^\circ$. This longitudinal sampling of $\Delta\varphi = 11.6^\circ$ hardly allows one to resolve features of the field of spatial scale corresponding to spherical harmonic degrees above $n = 15$.

Another issue that has to be considered when deriving field models from satellite data concerns satellite altitude. The middle panel of Fig. 8 shows the altitude evolution for the Ørsted and CHAMP satellites. Various altitude maneuvers are the reason for the sudden increases of altitude of CHAMP. At lower altitudes the magnetic signal of small-scale features of the internal magnetic field (corresponding to higher spherical harmonic degrees) is relatively more amplified compared to that of large-scale features (represented by low-degree spherical harmonics).
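The three sampling effects discussed above (polar gaps, longitudinal revisiting patterns, and altitude dependence) can be put into numbers with a short sketch. Two ingredients are assumptions that are not spelled out in the text: a Nyquist-type rule of thumb requiring at least two samples per wavelength along the equator, and the standard $(a/r)^{n+2}$ geometric attenuation of a degree-$n$ internal field with reference radius $a = 6371.2$ km. All function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.2  # reference radius a (assumed value)

def polar_gap_half_angle(inclination_deg):
    """Half-angle |90 deg - i| of the unsampled cap around each geographic pole."""
    return abs(90.0 - inclination_deg)

def max_resolvable_degree(longitude_spacing_deg):
    """Rule-of-thumb limit: two samples per wavelength along the equator."""
    return math.floor(180.0 / longitude_spacing_deg)

def relative_amplification(degree, high_km, low_km):
    """Growth of a degree-n internal signal when descending from high_km to low_km.

    Based on the (a/r)^(n+2) decay of field components with radius r.
    """
    r_high = EARTH_RADIUS_KM + high_km
    r_low = EARTH_RADIUS_KM + low_km
    return (r_high / r_low) ** (degree + 2)

if __name__ == "__main__":
    print("polar gap, Oersted (i = 97 deg):", polar_gap_half_angle(97.0), "deg")
    print("polar gap, CHAMP (i = 87 deg):", polar_gap_half_angle(87.0), "deg")
    print("max degree for 11.6 deg spacing:", max_resolvable_degree(11.6))
    for n in (15, 60):
        gain = relative_amplification(n, 450.0, 250.0)
        print(f"degree {n}: signal grows by a factor {gain:.2f} from 450 km to 250 km")
```

The 450 km and 250 km altitudes are the ends of the CHAMP range in Table 1; the factor grows quickly with degree, which is the sense in which small-scale features are "relatively more amplified" at low altitude.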
The crustal field signal measured by a satellite is thus normally stronger towards the end of the mission lifetime due to the lower altitude. However, if the crustal field is not properly accounted for, the decreasing altitude may hamper the determination of the core field time changes.

Good sampling in the Earth-fixed coordinate system, which is essential for determining the internal magnetic field, can, at least in principle, be obtained from a few days of satellite data. However, good sampling in Sun-fixed coordinates is required for a reliable determination of the external field contributions. The right part of Fig. 7 shows the distribution of the Ørsted and CHAMP satellite data of January 2, 2001, in the SM coordinate system, i.e., in dependence on the distance


Fig. 8 Some orbit characteristics for the Ørsted (left), resp. CHAMP (right) satellite in dependence on time. Top: Longitude of the ascending node, illustrating longitudinal “revisiting patterns.” Middle: mean altitude. Bottom: Local Time of ascending (red), resp. descending (blue) node

from the geomagnetic North pole (which is in the center of the plot) and magnetic local time (MLT). Despite the rather good sampling in the geocentric frame (left panel), the distribution in the Sun-fixed system (right panel) is rather coarse, especially when only data from one satellite are considered. Data obtained at different local times are essential for a proper description of external fields. The bottom panel of Fig. 8 shows how the (geographic) local time of the satellite orbits changes through mission lifetime. The Ørsted satellite scans all local times within 790 days (2.2 years), while the local time drift rate of CHAMP is much higher: CHAMP covers all local times within 130 days. Combining observations from different spacecraft flying at different local times helps to improve data coverage in the various coordinate systems. In particular, the Swarm satellite constellation mission (Friis-Christensen et al. 2006, 2009), consisting of three satellites launched in November 2013, has been specifically designed to reduce the time-space ambiguity that is typical of single-satellite missions.
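The coverage times quoted above can be turned into approximate local time drift rates. The sketch assumes that a drift of 12 h is enough to scan all 24 h of local time, because the ascending and descending halves of an orbit are separated by about 12 h in local time; this is an interpretation of the quoted numbers, not a statement from the text, and the function name is hypothetical.

```python
def implied_drift_min_per_day(days_for_full_coverage, hours_needed=12.0):
    """Local time drift rate (minutes per day) implied by a coverage time.

    hours_needed = 12 assumes ascending and descending passes together
    cover all local times once the orbit plane has drifted by 12 h.
    """
    return hours_needed * 60.0 / days_for_full_coverage

if __name__ == "__main__":
    for name, days in (("Oersted", 790.0), ("CHAMP", 130.0)):
        rate = implied_drift_min_per_day(days)
        print(f"{name}: about {rate:.2f} min of local time per day")
```

The ratio of the two rates, roughly a factor of six, is independent of the 12 h assumption and reflects only the quoted coverage times.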


3 Making the Best of the Data to Investigate the Various Field Contributions: Geomagnetic Field Modeling

Using the observations of the magnetic field described in the previous section to identify the various magnetic fields described in Sect. 1 is the main purpose of geomagnetic field modeling. This requires the use of mathematical representations of such fields in both space and time. The mathematical tools that make such a representation in space possible are described in the accompanying chapter by Sabaka et al. (2010). But because the fields vary in time, some temporal representations are also needed. Using such spatio-temporal representations formally makes it possible to represent all the fields that contribute to the observed data in the form of a linear superposition of elementary functions. The set of numerical coefficients that define this linear combination is then what one refers to as a geomagnetic field model. It can be recovered from the data via inverse theory, which in turn makes it possible to identify the various field contributions.

In practice, however, one has to face many pitfalls, not least because the data are limited in number and not ideally distributed. In particular, although satellite data are numerous, their usefulness is limited by the time needed for satellites to complete an orbit, during which some of the fields can change significantly. This can then translate into some ambiguity in terms of the spatial/temporal representations. But advantage can be taken of the known spatiotemporal properties of the various fields described in Sect. 1 and of the combined use of ground and satellite data. Fast changing fields, with periods up to typically a month, are for instance known to be mainly of external origin (both ionospheric and magnetospheric), but with some electrically induced internal fields.
Those are best identified with the help of observatory data, which can be temporally band filtered and then spatially analyzed with the help of the tools described in Sabaka et al. (2010). This then makes it possible to identify the contribution from sources above and below the Earth’s surface. The relative magnitude and temporal phase shifts between the (induced) internal and (inducing) external fields can then be computed, which provide very useful information with respect to the distribution of the electrical conductivity within the solid Earth (see, e.g., Constable 2007; Kuvshinov 2008). Satellite data can also be used for similar purposes, but this is a much more difficult endeavor since, as we already noted, one then has to deal with additional space/time separation issues related to the fact that satellites sense both changes due to their motion over stationary sources (such as the crustal field) and true temporal field changes. Much effort is currently devoted to dealing with those issues and making the best of such data for recovering the solid Earth electrical conductivity distribution, with encouraging preliminary successes (see, e.g., Kuvshinov and Olsen 2006), especially in view of the Swarm satellite constellation mission (Kuvshinov et al. 2006).

As a matter of fact, and for the time being, satellite data turn out to be most useful for the investigation of the field of internal origin (the core field and the crustal field). But even recovering those fields requires considerable care and advanced modeling strategies. In principle, and as explained in Sabaka et al. (2010), full vector measurements can be combined with ground-based vector data to infer the field of internal origin, the E-region ionospheric field, the local F-region ionospheric field, and the magnetospheric field.
But there are many practical limits to this possibility, again because satellites do not provide instantaneous sets of measurements on a sphere at all times, and also because the data distribution at the Earth’s surface is quite sparse. This sets a limit on the quality of the E-region ionospheric field one can possibly hope to recover by a joint use of ground-based and satellite data. But appropriate knowledge of the spatiotemporal behavior


of each type of source can again be used. This basically leads to the following two possible strategies to infer the contribution of each field from satellite data.

A first strategy consists in acknowledging that the field due to the nonpolar ionospheric E-region is weak at night, especially at so-called magnetic quiet times (as may be inferred from ground-based magnetic data), and selecting satellite data accordingly, so as to minimize contributions from the ionosphere. Those satellite data can then be used alone to infer both the field of internal origin and the field of external (then only magnetospheric) origin (though this usually still requires some care when dealing with polar latitude data, because those are always, also at night and during quiet conditions, affected by some ionospheric and local field-aligned currents). This is a strategy that can be used to focus on the field of internal origin, and in particular the crustal field (see, e.g., Maus et al. 2008).

A second strategy consists in making use of both observatory data and satellite data, and simultaneously parameterizing the spatial and temporal behavior of as many sources as possible. This strategy has been used in particular to improve the recovery of the core field and its slow secular changes (see, e.g., Thomson and Lesur 2007; Lesur et al. 2008; Olsen et al. 2009), but can more generally be used to try and recover all field sources simultaneously, using the so-called Comprehensive Modeling approach (Sabaka et al. 2002, 2004), to investigate the temporal evolution of all fields over long periods of time when satellite data are available. This strategy is one that looks particularly promising in view of the Swarm satellite constellation mission (Sabaka and Olsen 2006).
Finally, it is worth pointing out that, whatever strategy is used, residuals from the modeled fields may always be used to investigate additional nonmodeled sources, such as local ionospheric F-region sources (Lühr et al. 2002; Lühr and Maus 2006), to which satellite data are very sensitive. Considerably more detail about all those strategies and other possible future strategies can be found in, e.g., Hulot et al. (2007), Olsen et al. (2010a), Lesur et al. (2011), and Schott and Thebault (2011), to which the reader is referred and where many more references can be found.

Acknowledgments

This is IPGP contribution 2595 (updated).

References

Alexandrescu MM, Gibert D, Le Mouël JL, Hulot G, Saracco G (1999) An estimate of average lower mantle conductivity by wavelet analysis of geomagnetic jerks. J Geophys Res 104:17735–17745
Cain JC (2007) POGO (OGO-2, -4 and -6 spacecraft). In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 828–829
Campbell WH (1989) The regular geomagnetic field variations during quiet solar conditions. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 385–460
Constable S (2007) Geomagnetic induction studies. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 237–276
Cowling TG (1957) Magnetohydrodynamics. Wiley Interscience, New York
Dunlop D, Özdemir Ö (2007) Magnetizations in rocks and minerals. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 277–336
Finlay CF, Dumberry M, Chulliat A, Pais A (2010) Short timescale core dynamics: theory and observations. Space Sci Rev 155:177–218. doi:10.1007/s11214-010-9691-6


Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358
Friis-Christensen E, Lühr H, Hulot G, Haagmans R, Purucker M (2009) Geomagnetic research from space. EOS Trans AGU 90(25):213–215
Gubbins D, Zhang K (1993) Symmetry properties of the dynamo equations for paleomagnetism and geomagnetism. Phys Earth Planet Int 75:225–241
Hemant K, Maus S (2005) Geological modeling of the new CHAMP magnetic anomaly maps using a geographical information system technique. J Geophys Res 110:B12103. doi:10.1029/2005JB003837
Hulot G, Le Mouël JL (1994) A statistical approach to the Earth’s main magnetic field. Phys Earth Planet Int 82:167–183. doi:10.1016/0031-9201(94)90070-1
Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 33–75
Hulot G, Olsen N, Thébault E, Hemant K (2009) Crustal concealing of small-scale core-field secular variation. Geophys J Int 177:361–366. doi:10.1111/j.1365-246X.2009.04119.x
Hulot G, Finlay C, Constable C, Olsen N, Mandea M (2010) The magnetic field of planet Earth. Space Sci Rev 152:159–222. doi:10.1007/s11214-010-9644-0
Jackson A, Finlay CC (2007) Geomagnetic secular variation and its application to the core. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam
Jacobs JA (1953) The earth’s inner core. Nature 172:297–300
Jankowski J, Sucksdorff C (1996) IAGA guide for magnetic measurements and observatory practice. IAGA, Warszawa
Kivelson MG, Russell CT (1995) Introduction to space physics. Cambridge University Press, Cambridge
Kuvshinov A (2008) 3-D global induction in the oceans and solid earth: recent progress in modeling magnetic and electric fields from sources of magnetospheric, ionospheric and oceanic origin. Surv Geophys 29(2):139–186
Kuvshinov A (2012) Deep electromagnetic studies from land, sea, and space: progress status in the past 10 years. Surv Geophys 33:169–209. doi:10.1007/s10712-011-9118-2
Kuvshinov AV, Olsen N (2005) 3D modeling of the magnetic field due to ocean flow. In: Reigber C, Lühr H, Schwintzer P, Wickert J (eds) Earth observation with CHAMP, results from three years in orbit. Springer, Berlin
Kuvshinov AV, Olsen N (2006) A global model of mantle conductivity derived from 5 years of CHAMP, Ørsted, and SAC-C magnetic data. Geophys Res Lett 33:L18301. doi:10.1029/2006GL027083
Kuvshinov AV, Sabaka TJ, Olsen N (2006) 3-D electromagnetic induction studies using the Swarm constellation. Mapping conductivity anomalies in the Earth’s mantle. Earth Planets Space 58:417–427
Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge
Langel RA, Estes RH, Mead GD (1982) Some new methods in geomagnetic field modeling applied to the 1960–1980 epoch. J Geomagn Geoelectron 34:327–349
Lesur V, Jackson A (2000) Exact solution for internally induced magnetization in a shell. Geophys J Int 140:453–459
Lesur V, Wardinski I, Rother M, Mandea M (2008) GRIMM: the GFZ reference internal magnetic model based on vector satellite and observatory data. Geophys J Int 173:382–394


Lesur V, Olsen N, Thomson AW (2011) Geomagnetic core field models in the satellite era. In: Mandea M, Korte M (eds) Geomagnetic observations and models. IAGA special Sopron book series, chap 11, vol 5. Springer, Heidelberg, pp 277–294. doi:10.1007/978-90-481-9858-0_11
Lowes FJ (1966) Mean-square values on sphere of spherical harmonic vector fields. J Geophys Res 71:2179
Lühr H, Maus S (2006) Direct observation of the F region dynamo currents and the spatial structure of the EEJ by CHAMP. Geophys Res Lett 33:L24102. doi:10.1029/2006GL028374
Lühr H, Maus S, Rother M (2002) First in situ observation of night-time F region currents with the CHAMP satellite. Geophys Res Lett 29(10). doi:10.1029/2001GL013845
Maus S (2007a) CHAMP magnetic mission. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 59–60
Maus S (2007b) Electromagnetic ocean effects. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg
Maus S, Haak V (2003) Magnetic field annihilators: invisible magnetization at the magnetic equator. Geophys J Int 155:509–513
Maus S, Lühr H (2005) Signature of the quiet-time magnetospheric magnetic field and its electromagnetic induction in the rotating Earth. Geophys J Int 162:755–763
Maus S, Lühr H (2006) A gravity-driven electric current in the Earth’s ionosphere identified in CHAMP satellite magnetic measurements. Geophys Res Lett 33:L02812. doi:10.1029/2005GL024436
Maus S, Yin F, Lühr H, Manoj C, Rother M, Rauberg J, Michaelis I, Stolle C, Müller R (2008) Resolution of direction of oceanic magnetic lineations by the sixth-generation lithospheric magnetic field model from CHAMP satellite magnetic measurements. Geochem Geophys Geosyst 9(7):Q07021. doi:10.1029/2008GC001949
Merrill R, McFadden P, McElhinny M (1998) The magnetic field of the earth: paleomagnetism, the core, and the deep mantle. Academic, San Diego
Newitt LR, Barton CE, Bitterly J (1996) Guide for magnetic repeat station surveys. International Association of Geomagnetism and Aeronomy, Boulder
Nimmo F (2007) Energetics of the core. In: Schubert G (ed) Treatise on geophysics, vol 8. Elsevier, Amsterdam, pp 31–65
Olsen N (1997a) Ionospheric F region currents at middle and low latitudes estimated from Magsat data. J Geophys Res 102(A3):4563–4576
Olsen N (1997b) Geomagnetic tides and related phenomena. In: Wilhelm H, Zürn W, Wenzel H-G (eds) Tidal phenomena. Lecture notes in earth sciences, vol 66. Springer, Berlin/New York
Olsen N (2007a) Ørsted. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 743–745
Olsen N (2007b) Natural sources for electromagnetic induction studies. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg
Olsen N, Lühr H, Sabaka TJ, Mandea M, Rother M, Tøffner-Clausen L, Choi S (2006) CHAOS – a model of Earth’s magnetic field derived from CHAMP, Ørsted, and SAC-C magnetic satellite data. Geophys J Int 166:67–75. doi:10.1111/j.1365-246X.2006.02959.x
Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2009) CHAOS-2 – a geomagnetic field model derived from one decade of continuous satellite data. Geophys J Int 179(3):1477–1487. doi:10.1111/j.1365-246X.2009.04386.x
Olsen N, Hulot G, Sabaka TJ (2010a) Measuring the Earth’s magnetic field from space: concepts of past, present and future missions. Space Sci Rev 155:65–93. doi:10.1007/s11214-010-9676-5


Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2010b) The CHAOS-3 geomagnetic field model and candidates for the 11th generation of IGRF. Earth Planets Space 62:719–727
Parkinson WD, Hutton VRS (1989) The electrical conductivity of the earth. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 261–321
Purucker ME (2007) Magsat. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 673–674
Purucker M, Whaler K (2007) Crustal magnetism. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 195–235
Purucker M, Langlais B, Olsen N, Hulot G, Mandea M (2002) The southern edge of cratonic North America: evidence from new satellite magnetometer observations. Geophys Res Lett 29(15):8000. doi:10.1029/2001GL013645
Rastogi RG (1989) The equatorial electrojet: magnetic and ionospheric effects. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 461–525
Richmond AD (1989) Modeling the ionospheric wind dynamo: a review. In: Campbell WH (ed) Quiet daily geomagnetic fields. Birkhäuser Verlag, Basel
Richmond AD (2002) Modeling the geomagnetic perturbations produced by ionospheric currents, above and below the ionosphere. J Geodynamics 33:143–156
Roberts PH (2007) Theory of the geodynamo. In: Treatise on geophysics, vol 8. Elsevier, Amsterdam, pp 67–106
Runcorn SK (1975) On the interpretation of lunar magnetism. Phys Earth Planet Int 10:327–335
Sabaka TJ, Olsen N (2006) Enhancing comprehensive inversions using the Swarm constellation. Earth Planets Space 58:371–395. http://www.terrapub.co.jp/journals/EPS/pdf/2006/5804/58040371.pdf
Sabaka TJ, Olsen N, Purucker ME (2004) Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data. Geophys J Int 159:521–547. doi:10.1111/j.1365-246X.2004.02421.x
Sabaka TJ, Olsen N, Langel RA (2002) A comprehensive model of the quiet-time near-Earth magnetic field: phase 3. Geophys J Int 151:32–68
Sabaka TJ, Hulot G, Olsen N (2010) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, chap 17. Springer, Heidelberg, pp 504–538. doi:10.1007/978-3-642-01546-5_17
Schmucker U (1985) Magnetic and electric fields due to electromagnetic induction by external sources. In: Martienssen W (ed) Landolt-Börnstein, new series, 5/2b. Springer, Berlin/Heidelberg, pp 100–125
Schott J-J, Thebault E (2011) Modeling the Earth’s magnetic field from global to regional scales. In: Mandea M, Korte M (eds) Geomagnetic observations and models. IAGA special Sopron book series, chap 9, vol 5. Springer, Heidelberg. doi:10.1007/978-90-481-9858-0_2
Thebault E, Hemant K, Hulot G, Olsen N (2009) On the geographical distribution of induced time-varying crustal magnetic fields. Geophys Res Lett 36:L01307. doi:10.1029/2008GL036416
Thébault E, Purucker M, Whaler KA, Langlais B, Sabaka TJ (2010) The magnetic field of the Earth’s lithosphere. Space Sci Rev 155:95–127. doi:10.1007/s11214-010-9667-6
Thomson AWP, Lesur V (2007) An improved geomagnetic data selection algorithm for global geomagnetic field modeling. Geophys J Int 169(3):951–963
Turner GM, Rasson JL, Reeves CV (2007) Observation and measurement techniques. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam
Tyler RH, Maus S, Lühr H (2003) Satellite observations of magnetic fields due to ocean tidal flow. Science 299:239–241


Voorhies CV, Sabaka TJ, Purucker M (2002) On magnetic spectra of Earth and Mars. J Geophys Res 107(E6):5034
Wicht J, Harder H, Stellmach S (2010) Numerical dynamo simulations – from basic concepts to realistic models. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg

Page 20 of 20

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_6-2 © Springer-Verlag Berlin Heidelberg 2014

Classical Physical Geodesy Helmut Moritz Institute of Navigation, Graz University of Technology, Graz, Austria

Abstract Geodesy can be defined as the science of the figure of the Earth and its gravitational field, as well as their determination. Even though today the figure of the Earth, understood as the visible Earth’s surface, can be determined purely geometrically by satellites, using the Global Positioning System (GPS) for the continents and satellite altimetry for the oceans, it would be pretty useless without gravity. One could not even stand upright or walk without being “told” by gravity where the upright direction is. So as soon as one likes to work with the Earth’s surface, one does need the gravitational field. (Not to speak of the fact that, without this gravitational field, no satellites could orbit around the Earth.) To be different from the existing textbooks, a working knowledge of professional mathematics can be taken for granted. In some areas where professors of geodesy are hesitant to enter too deeply, afraid of losing their students, some fundamental problems can be studied. Of course, there is a brief introduction to terrestrial gravitation as treated in the first few chapters of every textbook of geodesy, such as gravitation and gravity (gravitation plus the centrifugal force of the Earth’s rotation), the geoid, and heights above the ellipsoid (now determined directly by GPS) and above the sea level (a surprisingly difficult problem!). But then, as accuracies rise from $10^{-6}$ in 1960 (about 0.

However much the modest sand grain tries to be unobtrusive, it will introduce a singularity into the originally regular geopotential model and thus a convergence sphere (Fig. 2)! This shows that the possibility of regular analytic continuation of the geopotential and convergence is a highly complicated and unstable problem. A single sand grain may change convergence into divergence; think of how many mass points, rocks, molecules, and electrons make up the Earth’s body and cause the most fanciful singularities.

A counterexample.
This would indicate that for all solid bodies, consisting of many mass points, the analytical continuation of the outer potential into its interior is automatically singular. A counterexample is the homogeneous sphere of radius R. Its external potential is well known to be that of a point mass M situated at its center:


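\[ V_{\mathrm{ext}} = \frac{GM}{r}, \qquad r > R, \tag{37} \]

whereas the potential inside the Earth is (int = interior)

\[ V_{\mathrm{int}} = 2\pi G\rho\left(R^{2} - \tfrac{1}{3}\,r^{2}\right), \tag{38} \]

which satisfies Poisson’s equation $\Delta V_{\mathrm{int}} = -4\pi G\rho$. The analytical continuation of $V_{\mathrm{ext}}$ is evidently the harmonic function

\[ V_{\mathrm{cont}} = \frac{GM}{r}, \qquad 0 < \dots \]

[...]

\[ \Delta T = 0 \quad \text{for} \quad u > b_0, \tag{2} \]

\[ T = f \quad \text{for} \quad u = b_0, \tag{3} \]

\[ T(u) = O\!\left(\frac{1}{u}\right) \quad \text{for} \quad u \to \infty, \tag{4} \]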

where $f$ is assumed to be a known square-integrable function, $f \in L_2$. The asymptotic condition in Eq. (4) means that the harmonic function $T$ approaches zero at infinity. The solution of the Laplace equation (2) can be written in terms of ellipsoidal harmonics as follows (Moon and Spencer 1961; Heiskanen and Moritz 1967; Thong and Grafarend 1989):

\[ T(u,\Omega) = \sum_{j=0}^{\infty}\sum_{m=-j}^{j} T_{jm}\,\frac{Q_{jm}\!\left(i\frac{u}{E}\right)}{Q_{jm}\!\left(i\frac{b_0}{E}\right)}\,Y_{jm}(\Omega), \tag{5} \]

where $Q_{jm}\!\left(i\frac{u}{E}\right)$ are the Legendre functions of the second kind, $Y_{jm}(\Omega)$ are complex spherical harmonics of degree $j$ and order $m$, and $T_{jm}$ are coefficients to be determined from the boundary condition of Eq. (3). Substituting Eq. (5) into Eq. (3), and expanding $f(\Omega)$ in a series of spherical harmonics,

\[ f(\Omega) = \sum_{j=0}^{\infty}\sum_{m=-j}^{j}\int_{\Omega_0} f(\Omega')\,Y^{*}_{jm}(\Omega')\,d\Omega'\;Y_{jm}(\Omega), \tag{6} \]

where $\Omega_0$ is the full solid angle and $d\Omega = \sin\vartheta\,d\vartheta\,d\lambda$, and comparing the coefficients at spherical harmonics $Y_{jm}(\Omega)$ in the result, one gets

\[ T_{jm} = \int_{\Omega_0} f(\Omega')\,Y^{*}_{jm}(\Omega')\,d\Omega' \tag{7} \]


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

for $j = 0, 1, \dots$ and $m = -j, \dots, j$, where $Y^{*}_{jm}$ is the complex conjugate of the spherical harmonic $Y_{jm}$. Furthermore, substituting the coefficients $T_{jm}$ into Eq. (5) and interchanging the order of the summation over $j$ and $m$ and the integration over $\Omega'$, which is permitted by the uniform convergence of the series expansion given by Eq. (5), the solution to the ellipsoidal Dirichlet boundary-value problem, Eqs. (2)–(4), formally reads

\[ T(u,\Omega) = \int_{\Omega_0} f(\Omega') \sum_{j=0}^{\infty}\sum_{m=-j}^{j} \frac{Q_{jm}\!\left(i\frac{u}{E}\right)}{Q_{jm}\!\left(i\frac{b_0}{E}\right)}\,Y^{*}_{jm}(\Omega')\,Y_{jm}(\Omega)\,d\Omega'. \tag{8} \]

From a practical point of view, the spectral form of Eq. (8) of the solution to the Dirichlet problem given by Eqs. (2)–(4) may often become inconvenient, since the construction of the functions $Q_{jm}(z)$ and their summation up to high degrees and orders ($j \approx 10^{4}$–$10^{5}$) is time consuming and numerically unstable (Sona 1995). Moreover, in the case that the level ellipsoid $u = b_0$ deviates from a sphere by only a tiny amount, which is the case for the Earth, the solution of the problem should be close to the solution to the same problem but formulated on a sphere. One should thus attempt to rewrite $T(u,\Omega)$ as a sum of the well-known Poisson integral (Kellogg 1929, Sect. IX.4), which solves the Dirichlet problem on a sphere, plus the corrections due to the ellipticity of the boundary. An evident advantage of such a decomposition is that existing theories as well as numerical codes for solving the Dirichlet problem on a sphere can simply be corrected for the ellipticity of the boundary.

2.2 Power-Series Representation of the Integral Kernel

Thong (1993) and Martinec and Grafarend (1997) showed that the Legendre function of the second kind, $Q_{jm}\!\left(i\frac{u}{E}\right)$, can be developed in an infinite power series of the first eccentricity $e$:

\[ Q_{jm}\!\left(i\frac{u}{E}\right) = (-1)^{m-(j+1)/2}\,\frac{(j+m)!}{(2j+1)!!}\,e^{\,j+1}\sum_{k=0}^{\infty} a_{jmk}\,e^{2k}, \tag{9} \]

where the coefficients $a_{jmk}$ can, for instance, be defined by the recurrence relation

\[ a_{jm0} = 1, \tag{10} \]

\[ a_{jmk} = \frac{(j+2k-1)^{2} - m^{2}}{2k\,(2j+2k+1)}\,a_{jm,k-1} \quad \text{for} \quad k \ge 1. \tag{11} \]
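The recurrence in Eqs. (10) and (11) is easy to evaluate exactly. The following is a minimal sketch, not part of the original text; the function name `a_coefficients` is hypothetical, and exact rational arithmetic is used only as a convenient way to avoid rounding.

```python
from fractions import Fraction


def a_coefficients(j, m, kmax):
    """Return [a_jm0, ..., a_jm,kmax] from the recurrence of Eqs. (10)-(11)."""
    coeffs = [Fraction(1)]  # Eq. (10): a_jm0 = 1
    for k in range(1, kmax + 1):
        # Eq. (11): ratio between consecutive coefficients
        ratio = Fraction((j + 2 * k - 1) ** 2 - m ** 2,
                         2 * k * (2 * j + 2 * k + 1))
        coeffs.append(ratio * coeffs[-1])
    return coeffs


if __name__ == "__main__":
    # Degree j = 30 with a few orders m, as an arbitrary illustration.
    for m in (0, 20, 30):
        print(m, [float(c) for c in a_coefficients(30, m, 3)])
```

For $k = 1$ the recurrence gives $a_{jm1} = [(j+1)^2 - m^2]/[2(2j+3)]$, the factor that reappears in the kernel of Eq. (41).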

Throughout this chapter, it is assumed that the eccentricity $e_0$ of the reference ellipsoid $u = b_0$,

\[ e_0 = \frac{E}{\sqrt{b_0^{2} + E^{2}}}, \tag{12} \]

is less than 1. Then, for points $(u,\Omega)$ outside or on the reference ellipsoid, i.e., when $u \ge b_0$, the series in Eq. (9) is convergent. By Eq. (9), the ratio of the Legendre functions of the second kind in Eq. (8) reads

\[ \frac{Q_{jm}\!\left(i\frac{u}{E}\right)}{Q_{jm}\!\left(i\frac{b_0}{E}\right)} = \left(\frac{e}{e_0}\right)^{j+1} \frac{\sum_{k=0}^{\infty} a_{jmk}\,e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}}. \tag{13} \]

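Dividing the polynomials in Eq. (13) term by term, one can write

\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\,e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}} = 1 + \sum_{k=1}^{\infty} b_{jmk}, \tag{14} \]

where the explicit forms of the first few constituents read

\[ b_{jm1} = a_{jm1}\left(e^{2} - e_0^{2}\right), \tag{15} \]

\[ b_{jm2} = a_{jm2}\left(e^{4} - e_0^{4}\right) - a_{jm1}^{2}\,e_0^{2}\left(e^{2} - e_0^{2}\right), \tag{16} \]

\[ b_{jm3} = a_{jm3}\left(e^{6} - e_0^{6}\right) - a_{jm2}\,a_{jm1}\,e_0^{2}\left(e^{4} + e^{2}e_0^{2} - 2e_0^{4}\right) + a_{jm1}^{3}\,e_0^{4}\left(e^{2} - e_0^{2}\right). \tag{17} \]

Generally,

\[ b_{jmk} = O\!\left(e^{2r}e_0^{2s}\right), \qquad r + s = k. \tag{18} \]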

To get an analytical expression for the $k$th term of the series in Eq. (14), some cumbersome algebraic manipulations have to be performed. To avoid them, one may confine oneself to the case where the computation point ranges in a limited layer above the reference ellipsoid (for instance, the topographical layer), namely, $b_0 < u < b_0 + 9{,}000$ m, which includes all the actual topographical masses of the Earth. For this restricted case, which is nevertheless often considered when geodetic boundary-value problems are solved, the first eccentricity $e$ of the computation point is expressed by means of the first eccentricity $e_0$ of the reference ellipsoid and a quantity $\varepsilon$, $\varepsilon > 0$, as

\[ e = e_0(1 - \varepsilon). \tag{19} \]

Assuming $b_0 < u < b_0 + 9{,}000$ m means that $\varepsilon < 1.4 \times 10^{-3}$, and one can put approximately

\[ e^{2k} \doteq e_0^{2k}(1 - 2k\varepsilon), \tag{20} \]

\[ b_{jm1} \doteq -2\varepsilon\,e_0^{2}\,a_{jm1}. \tag{21} \]
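The size of $\varepsilon$ in Eq. (19) can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the semi-axes are GRS80-like values that do not appear in the text, and the first eccentricity of a general computation point is taken as $e = E/\sqrt{u^2 + E^2}$, by analogy with Eq. (12). All names are hypothetical.

```python
import math

# Assumed GRS80-like semi-axes in metres (not given in the text).
SEMI_MAJOR_A = 6378137.0
SEMI_MINOR_B0 = 6356752.3141


def first_eccentricity(u, linear_ecc):
    """e = E / sqrt(u^2 + E^2) for the confocal ellipsoid of semi-minor axis u."""
    return linear_ecc / math.sqrt(u * u + linear_ecc * linear_ecc)


def epsilon(height):
    """epsilon of Eq. (19), e = e0 (1 - epsilon), for u = b0 + height."""
    linear_ecc = math.sqrt(SEMI_MAJOR_A ** 2 - SEMI_MINOR_B0 ** 2)
    e0 = first_eccentricity(SEMI_MINOR_B0, linear_ecc)
    e = first_eccentricity(SEMI_MINOR_B0 + height, linear_ecc)
    return 1.0 - e / e0


if __name__ == "__main__":
    eps = epsilon(9000.0)
    print("epsilon at the top of the layer:", eps)
    # Error of the linearisation in Eq. (20) for k = 1.
    print("linearisation error:", abs((1.0 - eps) ** 2 - (1.0 - 2.0 * eps)))
```

With these assumed values, $\varepsilon$ at the top of the 9,000 m layer comes out at roughly the magnitude quoted in the text, and the linearisation error in Eq. (20) is of order $\varepsilon^2$.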

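This allows one to write

\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\,e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}} = 1 - 2\varepsilon\,\frac{\sum_{k=1}^{\infty} k\,a_{jmk}\,e_0^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}}. \tag{22} \]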


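Expand the fraction on the right-hand side of the last equation (divided by $a_{jm1}$) into an infinite power series in $e_0^{2}$:

\[ \frac{1}{a_{jm1}}\,\frac{\sum_{k=1}^{\infty} k\,a_{jmk}\,e_0^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}} = \sum_{k=1}^{\infty} \beta_{jmk}\,e_0^{2k}. \tag{23} \]

To find the coefficients $\beta_{jmk}$, rewrite the last equation in the form

\[ \frac{1}{a_{jm1}}\sum_{k=1}^{\infty} k\,a_{jmk}\,e_0^{2k} = \sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}\;\sum_{l=1}^{\infty} \beta_{jml}\,e_0^{2l}. \tag{24} \]

Since both the series on the right-hand side are absolutely convergent, their product may be rearranged as

\[ \sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}\;\sum_{l=1}^{\infty} \beta_{jml}\,e_0^{2l} = \sum_{k=1}^{\infty}\sum_{l=1}^{k} \beta_{jml}\,a_{jm,k-l}\,e_0^{2k}. \tag{25} \]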

Substituting Eq. (25) into Eq. (24) and equating coefficients at $e_0^{2k}$ on both sides of Eq. (24), one obtains

\[ k\,\frac{a_{jmk}}{a_{jm1}} = \sum_{l=1}^{k} \beta_{jml}\,a_{jm,k-l}, \qquad k = 1, 2, \dots, \tag{26} \]

which yields the recurrence relation for $\beta_{jmk}$:

\[ \beta_{jmk} = k\,\frac{a_{jmk}}{a_{jm1}} - \sum_{l=1}^{k-1} \beta_{jml}\,a_{jm,k-l}, \qquad k = 2, 3, \dots, \tag{27} \]

with the starting value

\[ \beta_{jm1} = 1. \tag{28} \]

With the recurrence relation in Eq. (27), one may easily construct the higher coefficients $\beta_{jmk}$:

\[ \beta_{jm2} = 2\,\frac{a_{jm2}}{a_{jm1}} - a_{jm1}, \tag{29} \]

\[ \beta_{jm3} = 3\,\frac{a_{jm3}}{a_{jm1}} - 3a_{jm2} + a_{jm1}^{2}. \tag{30} \]

This process may be continued indefinitely. Important properties of the coefficients $\beta_{jmk}$ are

\[ 1 = \beta_{jm1} > \beta_{jm2} > \beta_{jm3} > \dots \ge 0, \tag{31} \]

\[ \beta_{jjk} > \beta_{j,j-1,k} > \dots > \beta_{j0k}. \tag{32} \]

Fig. 1 Coefficients $\beta_{jmk}$ for $j = 30$, $m = 0, 20, 30$, and $k = 1, \dots, 100$

Figure 1 demonstrates these properties for $j = 30$. Finally, substituting the series Eq. (23) into Eq. (22) and using Eqs. (21) and (28) for $b_{jm1}$ and $\beta_{jm1}$, respectively, yields

\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\,e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\,e_0^{2k}} = 1 + b_{jm1}\left(1 + \sum_{k=2}^{\infty} \beta_{jmk}\,e_0^{2k-2}\right). \tag{33} \]
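The recurrence of Eqs. (26)–(28) lends itself to a direct implementation. The following sketch, with hypothetical function names and exact rational arithmetic chosen purely for convenience, builds the $\beta_{jmk}$ from the $a_{jmk}$ of Eqs. (10) and (11) and can be used to compare against the closed forms of Eqs. (29) and (30) or to inspect the behaviour shown in Fig. 1.

```python
from fractions import Fraction


def a_coefficients(j, m, kmax):
    """a_jm0 .. a_jm,kmax from Eqs. (10)-(11)."""
    coeffs = [Fraction(1)]
    for k in range(1, kmax + 1):
        ratio = Fraction((j + 2 * k - 1) ** 2 - m ** 2,
                         2 * k * (2 * j + 2 * k + 1))
        coeffs.append(ratio * coeffs[-1])
    return coeffs


def beta_coefficients(j, m, kmax):
    """Return a list b with b[k] = beta_jmk for k = 1..kmax (b[0] is unused)."""
    a = a_coefficients(j, m, kmax)
    beta = [None, Fraction(1)]  # Eq. (28): beta_jm1 = 1
    for k in range(2, kmax + 1):
        # Eq. (27): subtract the convolution of the lower coefficients
        convolution = sum(beta[l] * a[k - l] for l in range(1, k))
        beta.append(k * a[k] / a[1] - convolution)
    return beta


if __name__ == "__main__":
    for m in (0, 20, 30):
        beta = beta_coefficients(30, m, 5)
        print(m, [float(b) for b in beta[1:]])
```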

2.3 The Approximation of $O(e_0^{2})$

The harmonic upward or downward continuation of the potential or the gravitation between the Geoid and the Earth’s surface is an example of the practical application of the boundary-value problem Eqs. (2)–(4) (Engels et al. 1993; Martinec 1996; Vaníček et al. 1996). To make the theory as simple as possible while still meeting the requirements on Geoid height accuracy, one should keep, throughout the following derivations, the terms of the order of the squared first eccentricity $e_0^{2}$ of the Earth’s level ellipsoid and neglect the terms of higher powers of $e_0^{2}$. This approximation is justifiable because the error it introduces is at most $1.5 \times 10^{-5}$, which then causes an error of at most 2 mm in the Geoidal heights. Keeping in mind the inequalities in Eq. (31), and assuming that $e_0^{2} \ll 1$, the magnitude of the last term in Eq. (33) is of the order of $e_0^{2}$,

\[ \sum_{k=2}^{\infty} \beta_{jmk}\,e_0^{2k-2} \approx e_0^{2}. \tag{34} \]

Hence, Eq. (13) becomes

\[ \frac{Q_{jm}\!\left(i\frac{u}{E}\right)}{Q_{jm}\!\left(i\frac{b_0}{E}\right)} = \left(\frac{e}{e_0}\right)^{j+1}\left[1 + a_{jm1}\left(e^{2} - e_0^{2}\right)\left(1 + O(e_0^{2})\right)\right]. \tag{35} \]

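Substituting Eq. (35) into Eq. (8), evaluating $a_{jm1}$ according to Eq. (11) for $k = 1$, and bearing in mind the classical Laplace addition theorem for spherical harmonics (for instance, Varshalovich et al. 1989, p. 164),

\[ P_j(\cos\psi) = \frac{4\pi}{2j+1}\sum_{m=-j}^{j} Y_{jm}(\Omega)\,Y^{*}_{jm}(\Omega'), \tag{36} \]

where $P_j(\cos\psi)$ is the Legendre polynomial of degree $j$ and $\psi$ is the angular distance between directions $\Omega$ and $\Omega'$, one gets

\[ T(u,\Omega) = \frac{1}{4\pi}\int_{\Omega_0} f(\Omega')\left[K^{\mathrm{sph}}(t,\cos\psi) - 2e_0^{2}\,k^{\mathrm{ell}}(t,\Omega,\Omega')\left(1 + O(e_0^{2})\right)\right]d\Omega', \tag{37} \]

where

\[ t = \frac{e}{e_0}. \tag{38} \]

$K^{\mathrm{sph}}(t,\cos\psi)$ is the spherical Poisson kernel (Kellogg 1929; Heiskanen and Moritz 1967; Pick et al. 1973),

\[ K^{\mathrm{sph}}(t,\cos\psi) = \sum_{j=0}^{\infty} (2j+1)\,t^{\,j+1}P_j(\cos\psi) = \frac{t\,(1 - t^{2})}{g^{3}}, \tag{39} \]

with

\[ g \equiv g(t,\cos\psi) = \sqrt{1 + t^{2} - 2t\cos\psi}, \tag{40} \]

and $k^{\mathrm{ell}}(t,\Omega,\Omega')$ stands for

\[ k^{\mathrm{ell}}(t,\Omega,\Omega') = 4\pi\,(1 - t^{2})\sum_{j=0}^{\infty}\sum_{m=-j}^{j} t^{\,j+1}\,\frac{(j+1)^{2} - m^{2}}{2(2j+3)}\,Y_{jm}(\Omega)\,Y^{*}_{jm}(\Omega'). \tag{41} \]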
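The equality of the series and the closed form in Eq. (39) can be checked numerically. The sketch below is an illustration only; the function names are hypothetical, and the Legendre polynomials are generated with the standard three-term recurrence rather than by any method taken from the text.

```python
import math


def poisson_kernel_series(t, x, jmax=200):
    """Partial sum of sum_j (2j+1) t^(j+1) P_j(x), the series form of Eq. (39)."""
    p_prev, p_curr = 1.0, x                 # P_0(x) and P_1(x)
    total = t * p_prev                      # degree 0 term
    if jmax >= 1:
        total += 3.0 * t ** 2 * p_curr      # degree 1 term
    for j in range(1, jmax):
        # Three-term recurrence: (j+1) P_{j+1} = (2j+1) x P_j - j P_{j-1}
        p_next = ((2 * j + 1) * x * p_curr - j * p_prev) / (j + 1)
        p_prev, p_curr = p_curr, p_next
        total += (2 * j + 3) * t ** (j + 2) * p_curr   # degree j+1 term
    return total


def poisson_kernel_closed(t, x):
    """Closed form t (1 - t^2) / g^3 with g from Eq. (40)."""
    g = math.sqrt(1.0 + t * t - 2.0 * t * x)
    return t * (1.0 - t * t) / g ** 3


if __name__ == "__main__":
    t, x = 0.5, 0.3   # arbitrary test values with t < 1 and x = cos(psi)
    print(poisson_kernel_series(t, x), poisson_kernel_closed(t, x))
```

For $t$ close to 1, as in the thin topographical layer considered here, the series converges slowly, which is one practical reason to prefer the closed form.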

Moreover, it holds that ($t < 1$)

\[ \left|k^{\mathrm{ell}}(t,\Omega,\Omega')\right| \le k^{\mathrm{ell}}(t,\Omega,\Omega) \le \left.k^{\mathrm{ell}}(t,\Omega,\Omega)\right|_{\vartheta=0} = \frac{1 - t^{2}}{2}\sum_{j=0}^{\infty} t^{\,j+1}(j+1)^{2}\,\frac{2j+1}{2j+3} \]




where the residuals $R_i(\cos\psi)$ are of the forms

\[ \begin{aligned} R_2(\cos\psi) &= \sum_{j=2}^{\infty} \frac{5j+7}{(2j+3)(j-1)^{2}}\,P_j(\cos\psi), \\ R_3(\cos\psi) &= \sum_{j=3}^{\infty} \frac{6j^{2} - 4j - 4}{(2j+1)\,j\,(j-2)^{2}}\,P_j(\cos\psi), \\ R_4(\cos\psi) &= \sum_{j=2}^{\infty} \frac{6}{(j-1)^{2}}\,P_j(\cos\psi). \end{aligned} \tag{153} \]

Figure 7 plots the residuals $R_i(\cos\psi)$, $i = 1, \dots, 4$, within the interval $0 \le \psi \le \pi$. One can see that the $R_i(\cos\psi)$ are “reasonably” smooth functions bounded for all angles $\psi$. This is the consequence of the fact that the magnitudes of the series terms in the infinite sums (153) decrease quickly with increasing summation index $j$. In order to achieve an absolute accuracy of the order of 0.01, which is sufficient in the framework of the $O(e_0^{2})$-approximation, the series in Eq. (153) may be truncated at degree $j \approx 25$. Moreover, the above formulae make it possible to study the behavior of the functions $K_i(\cos\psi)$ in the vicinity of the point $\psi = 0$. It can readily be seen that

\[ \begin{aligned} \lim_{\psi\to 0} K_i(\cos\psi) &\sim \frac{1}{\psi}, \qquad i = 1, 2, \\ \lim_{\psi\to 0} K_3(\cos\psi) &\sim -\ln\frac{\psi}{2}, \\ \lim_{\psi\to 0} K_4(\cos\psi) &\sim \frac{2}{\psi}. \end{aligned} \tag{154} \]

Consequently, when the integration point $\Omega'$ lies in the vicinity of the computation point $\Omega$, the ellipsoidal Stokes function $S^{\mathrm{ell}}(\Omega,\Omega')$, see Eq. (128), may be approximated as

\[ S^{\mathrm{ell}}(\Omega,\Omega') = -\frac{1}{\psi}\left(1 + \sin^{2}\vartheta\,\cos^{2}\alpha\right) \quad \text{for} \quad \psi \ll 1. \tag{155} \]

This also means that the ellipsoidal Stokes function $S^{\mathrm{ell}}(\Omega,\Omega')$ has the same degree of singularity at the point $\psi = 0$, namely, $1/\psi$, as the original Stokes function.

3.9 Conclusion

This work was motivated by the question of whether the solution of the Stokes boundary-value problem with the boundary condition prescribed on an ellipsoid of revolution can be expressed in a closed spatial form suitable for numerical computations. To answer this question, the solution of the Stokes boundary-value problem in terms of ellipsoidal harmonics was first found. The fact that this solution is represented by slowly convergent series of ellipsoidal harmonics prevents its use for regional Geoid computations. That is why, in the next step, the solution was confined to the $O(e_0^{2})$-approximation, meaning that the terms of the order of $e_0^{2}$ were retained, while the terms of the order of $O(e_0^{4})$ and of higher powers were not considered. Nevertheless, an accuracy of the order of $O(e_0^{2})$ is fairly good for today’s requirements on Geoidal height computations. Within this accuracy, it has been shown that the solution of the Stokes boundary-value problem can be expressed as an integral taken over the full solid angle and applied to gravity anomalies multiplied by a kernel consisting of the traditional spherical Stokes function and a correction due to the elliptical shape of the boundary; this additional term is called the ellipsoidal Stokes function. It has been possible to express the ellipsoidal Stokes function, originally represented in the form of an infinite sum of ellipsoidal harmonics, as a finite combination of elementary functions that describe analytically the singular behavior of the ellipsoidal Stokes function at the point $\psi = 0$. This expression is suitable for the numerical solution of the Stokes boundary-value problem on an ellipsoid of revolution. The most important result is that the ellipsoidal Stokes function can be approximated by the function $1/\psi$ in the vicinity of its singular point $\psi = 0$. Thus, the degree of singularity of the ellipsoidal Stokes function in the vicinity of the point $\psi = 0$ is the same as that of the original spherical Stokes function.

4 Vertical Deflections in Gravity Space

In this chapter, a definition of the vertical which relates to astronomical longitude and astronomical latitude as spherical coordinates in gravity space is presented. Vertical deflections and gravity disturbances relate to a reference gravity potential. In order to refer the horizontal and vertical components of the disturbing gravity field to a reference gravity field that is physically meaningful, the Somigliana-Pizzetti gravity potential as well as its gradient has been chosen. The local gravity vector $\boldsymbol{\Gamma}$ as well as a reference gravity vector $\boldsymbol{\gamma}$, both in a global frame of reference, are introduced in order to be able to construct the vertical deflection vector $\{\boldsymbol{\Gamma}/\|\boldsymbol{\Gamma}\|_2 - \boldsymbol{\gamma}/\|\boldsymbol{\gamma}\|_2\}$ in gravity space. In order to relate the vertical deflection vector to observables as elements of an observation space, one has to transform the incremental gravity vector $\{\boldsymbol{\Gamma}/\|\boldsymbol{\Gamma}\|_2 - \boldsymbol{\gamma}/\|\boldsymbol{\gamma}\|_2\}$ from a global frame of reference to a local frame of reference, also called “natural,” “horizontal,” or “local level.”

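4.1 Representation of the Actual Gravity Vector as Well as the Reference Gravity Vector Both in a Global and a Local Frame of Reference

Conventionally, the local gravity vector $\boldsymbol{\Gamma}$ is represented in a global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ either in Cartesian coordinates ($\Gamma_x = \Gamma_1$, $\Gamma_y = \Gamma_2$, $\Gamma_z = \Gamma_3$) or in spherical coordinates ($\Lambda_\Gamma, \Phi_\Gamma, \Gamma$) subject to $\Gamma = \|\boldsymbol{\Gamma}\|$, its Euclidean length ($\ell_2$-norm). $\Lambda_\Gamma$ is called astronomical longitude, $\Phi_\Gamma$ denotes astronomical latitude, and $\Gamma$ denotes the “modulus of the actual gravity vector.” Consult Eqs. (156) and (157) for such a representation. The global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ is attached to the origin $O$, the mass center of the Earth. Since the actual gravity vector is attached to the position $P$ (placement vector from $O$ to $P$), one may think of a parallel transport of $\{E_1, E_2, E_3\}$ from $O$ to $P$, namely, parallel transport in the Euclidean sense. The base vectors $\{E_1, E_2, E_3 \mid O\}$ may be materialized by the definitions of the Equatorial Frame of Reference or quasi-Earth-fixed reference frame (International Terrestrial Reference Frame [ITRF]) distributed by the International Earth Rotation Service (IERS). Alternatively, by means of the derivatives $\{D_\Lambda\boldsymbol{\Gamma}, D_\Phi\boldsymbol{\Gamma}, D_\Gamma\boldsymbol{\Gamma}\}$, one is able to construct a local orthonormal frame of reference, namely, $E_\Lambda = D_\Lambda\boldsymbol{\Gamma}/\|D_\Lambda\boldsymbol{\Gamma}\|$, $E_\Phi = D_\Phi\boldsymbol{\Gamma}/\|D_\Phi\boldsymbol{\Gamma}\|$, and $E_\Gamma = D_\Gamma\boldsymbol{\Gamma}/\|D_\Gamma\boldsymbol{\Gamma}\|$, in short $\{E_\Lambda, E_\Phi, E_\Gamma \mid P\}$, also called {astronomical east, astronomical north, astronomical vertical} at $P$. Equations (160)–(162) outline the various operations in detail.

A similar procedure applies to the reference gravity vector $\boldsymbol{\gamma}$. It is represented in a global orthonormal frame of reference $\{e_1, e_2, e_3 \mid O\}$, either in Cartesian coordinates ($\gamma_x = \gamma_1$, $\gamma_y = \gamma_2$, $\gamma_z = \gamma_3$) or in spherical coordinates ($\lambda_\gamma, \varphi_\gamma, \gamma$) subject to $\gamma = \|\boldsymbol{\gamma}\|_2$, its Euclidean length ($\ell_2$-norm). $\lambda_\gamma$ is called reference longitude, $\varphi_\gamma$ reference latitude, and $\gamma$ “modulus of the reference gravity vector” in gravity space. Consult Eqs. (158) and (159) for such a representation. The global orthonormal reference frame $\{e_1, e_2, e_3 \mid O\}$ is attached to the origin $O$, the mass center of the Earth. Again by Euclidean parallelism, $\{e_1, e_2, e_3\}$ is transported from $O$ to $P$ such that $\{e_1, e_2, e_3 \mid P\} = \{e_1, e_2, e_3 \mid O\}$. The reference frame of type $\{e_1, e_2, e_3 \mid O\}$ is chosen by means of the definition of the reference gravity field, here gauged to ITRF, namely, $\{E_1, E_2, E_3 \mid O\} = \{e_1, e_2, e_3 \mid O\}$.

Local and global representation of the actual gravity vector and the reference gravity vector:

Actual gravity vector:

\[ E_\Gamma = -\boldsymbol{\Gamma}/\|\boldsymbol{\Gamma}\| = -\boldsymbol{\Gamma}/\Gamma, \tag{156} \]

\[ -\boldsymbol{\Gamma} = E_\Gamma\,\Gamma = \Gamma\left(E_1\cos\Phi_\Gamma\cos\Lambda_\Gamma + E_2\cos\Phi_\Gamma\sin\Lambda_\Gamma + E_3\sin\Phi_\Gamma\right), \tag{157} \]

$\Lambda_\Gamma$, astronomical longitude; $\Phi_\Gamma$, astronomical latitude; and $\Gamma$, modulus of the actual gravity vector.

Reference gravity vector:

\[ e_\gamma = -\boldsymbol{\gamma}/\|\boldsymbol{\gamma}\| = -\boldsymbol{\gamma}/\gamma, \tag{158} \]

\[ -\boldsymbol{\gamma} = e_\gamma\,\gamma = \gamma\left(e_1\cos\varphi_\gamma\cos\lambda_\gamma + e_2\cos\varphi_\gamma\sin\lambda_\gamma + e_3\sin\varphi_\gamma\right), \tag{159} \]

$\lambda_\gamma$, reference longitude; $\varphi_\gamma$, reference latitude; and $\gamma$, modulus of the reference gravity vector.

In contrast, by means of the derivatives $\{D_{\lambda_\gamma}\boldsymbol{\gamma}, D_{\varphi_\gamma}\boldsymbol{\gamma}, D_\gamma\boldsymbol{\gamma}\}$, one can compute a local orthonormal frame of reference, namely, $e_{\lambda_\gamma} = D_{\lambda_\gamma}\boldsymbol{\gamma}/\|D_{\lambda_\gamma}\boldsymbol{\gamma}\|$, $e_{\varphi_\gamma} = D_{\varphi_\gamma}\boldsymbol{\gamma}/\|D_{\varphi_\gamma}\boldsymbol{\gamma}\|$, $e_\gamma = D_\gamma\boldsymbol{\gamma}/\|D_\gamma\boldsymbol{\gamma}\|$, also called {reference east, reference north, reference vertical} at $P$. Equations (163)–(165) outline the various relations in detail. Both local orthonormal frames of reference $\{E_\Lambda, E_\Phi, E_\Gamma \mid P\}$ and $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ are related by means of an orthonormal matrix $R_E\!\left(\Lambda_\Gamma + \frac{\pi}{2}, \frac{\pi}{2} - \Phi_\Gamma, 0\right) = R_3(0)\,R_2\!\left(\frac{\pi}{2} - \Phi_\Gamma\right)R_3\!\left(\Lambda_\Gamma + \frac{\pi}{2}\right)$, Eq. (162), to the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ as well as, Eq. (164), to the global orthonormal frame of reference $\{e_1, e_2, e_3 \mid O\}$. The index $E$ of $R_E$ indicates a rotation matrix which is parameterized by Euler angles. $R_3$ denotes a rotation around the 3-axis (i.e., in the (1,2)-plane) and $R_2$ a rotation around the 2-axis (i.e., in the (3,1)-plane).

Construction of the local reference frame $\{E_\Lambda, E_\Phi, E_\Gamma \mid P\}$ from the spherical coordinates of the actual gravity vector:

\[ \begin{aligned} E_\Lambda &= \frac{D_\Lambda\boldsymbol{\Gamma}}{\|D_\Lambda\boldsymbol{\Gamma}\|} = \left(e_1\frac{\partial\Gamma_1}{\partial\Lambda_\Gamma} + e_2\frac{\partial\Gamma_2}{\partial\Lambda_\Gamma} + e_3\frac{\partial\Gamma_3}{\partial\Lambda_\Gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\Gamma_i}{\partial\Lambda_\Gamma}\right)^{2}\right]^{-1/2}, \\ E_\Phi &= \frac{D_\Phi\boldsymbol{\Gamma}}{\|D_\Phi\boldsymbol{\Gamma}\|} = \left(e_1\frac{\partial\Gamma_1}{\partial\Phi_\Gamma} + e_2\frac{\partial\Gamma_2}{\partial\Phi_\Gamma} + e_3\frac{\partial\Gamma_3}{\partial\Phi_\Gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\Gamma_i}{\partial\Phi_\Gamma}\right)^{2}\right]^{-1/2}, \\ E_\Gamma &= \frac{D_\Gamma\boldsymbol{\Gamma}}{\|D_\Gamma\boldsymbol{\Gamma}\|} = \left(e_1\frac{\partial\Gamma_1}{\partial\Gamma} + e_2\frac{\partial\Gamma_2}{\partial\Gamma} + e_3\frac{\partial\Gamma_3}{\partial\Gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\Gamma_i}{\partial\Gamma}\right)^{2}\right]^{-1/2}, \end{aligned} \tag{160} \]

\[ \begin{bmatrix} E_\Lambda \\ E_\Phi \\ E_\Gamma \end{bmatrix} = R_E\!\left(\Lambda_\Gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \Phi_\Gamma,\ 0\right)\begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix}, \tag{161} \]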

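Fig. 8 Decomposition of the actual gravity vector $\boldsymbol{\Gamma} = \boldsymbol{\gamma} + \delta\boldsymbol{\gamma}$

\[ \begin{bmatrix} E_\Lambda \\ E_\Phi \\ E_\Gamma \end{bmatrix} = \begin{bmatrix} -\sin\Lambda_\Gamma & \cos\Lambda_\Gamma & 0 \\ -\cos\Lambda_\Gamma\sin\Phi_\Gamma & -\sin\Lambda_\Gamma\sin\Phi_\Gamma & \cos\Phi_\Gamma \\ \cos\Lambda_\Gamma\cos\Phi_\Gamma & \sin\Lambda_\Gamma\cos\Phi_\Gamma & \sin\Phi_\Gamma \end{bmatrix}\begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix}. \tag{162} \]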

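Construction of the local reference frame $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ from the spherical coordinates of the reference gravity vector:

\[ \begin{aligned} e_{\lambda_\gamma} &= \frac{D_{\lambda_\gamma}\boldsymbol{\gamma}}{\|D_{\lambda_\gamma}\boldsymbol{\gamma}\|} = \left(e_1\frac{\partial\gamma_1}{\partial\lambda_\gamma} + e_2\frac{\partial\gamma_2}{\partial\lambda_\gamma} + e_3\frac{\partial\gamma_3}{\partial\lambda_\gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\gamma_i}{\partial\lambda_\gamma}\right)^{2}\right]^{-1/2}, \\ e_{\varphi_\gamma} &= \frac{D_{\varphi_\gamma}\boldsymbol{\gamma}}{\|D_{\varphi_\gamma}\boldsymbol{\gamma}\|} = \left(e_1\frac{\partial\gamma_1}{\partial\varphi_\gamma} + e_2\frac{\partial\gamma_2}{\partial\varphi_\gamma} + e_3\frac{\partial\gamma_3}{\partial\varphi_\gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\gamma_i}{\partial\varphi_\gamma}\right)^{2}\right]^{-1/2}, \\ e_\gamma &= \frac{D_\gamma\boldsymbol{\gamma}}{\|D_\gamma\boldsymbol{\gamma}\|} = \left(e_1\frac{\partial\gamma_1}{\partial\gamma} + e_2\frac{\partial\gamma_2}{\partial\gamma} + e_3\frac{\partial\gamma_3}{\partial\gamma}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial\gamma_i}{\partial\gamma}\right)^{2}\right]^{-1/2}, \end{aligned} \tag{163} \]

\[ \begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix} = R_E\!\left(\lambda_\gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \varphi_\gamma,\ 0\right)\begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix}, \tag{164} \]

\[ \begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix} = \begin{bmatrix} -\sin\lambda_\gamma & \cos\lambda_\gamma & 0 \\ -\cos\lambda_\gamma\sin\varphi_\gamma & -\sin\lambda_\gamma\sin\varphi_\gamma & \cos\varphi_\gamma \\ \cos\lambda_\gamma\cos\varphi_\gamma & \sin\lambda_\gamma\cos\varphi_\gamma & \sin\varphi_\gamma \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix}. \tag{165} \]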

4.2 The Incremental Gravity Vector

The incremental gravity vector $\delta\boldsymbol{\gamma}$ is defined as the vector-valued difference of the actual gravity vector $\boldsymbol{\Gamma}$ and the reference gravity vector $\boldsymbol{\gamma}$, computed at the same point in space: $\delta\boldsymbol{\gamma} = \boldsymbol{\Gamma} - \boldsymbol{\gamma}$. Figure 8 is an illustration of the local incremental gravity vector, also called “disturbing gravity vector” $\delta\boldsymbol{\gamma}$. Indeed, the choice of the reference gravity vector $\boldsymbol{\gamma}$, which leads to the unit reference gravity vector $-\boldsymbol{\gamma}/\|\boldsymbol{\gamma}\|$, has to be specified. Next, the aim is to compute the rotation matrix $R$ for the transformation between the local orthonormal frames of reference $\{E_\Lambda, E_\Phi, E_\Gamma\} \mapsto \{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\}$, according to Fig. 9. At first, the moving frames $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ and $\{E_\Lambda, E_\Phi, E_\Gamma \mid P\}$ are rotated by means of Eulerian rotation matrices $R_E$ of type Eqs. (162) and (165) to the fixed frame $\{e_1, e_2, e_3 \mid O\} = \{E_1, E_2, E_3 \mid O\}$.

Fig. 9 Basis transformation in gravity space (commutative diagram: $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ and $\{E_\Lambda, E_\Phi, E_\Gamma \mid P\}$ are linked by $R$ and are related to $\{e_1, e_2, e_3 \mid O\} = \{E_1, E_2, E_3 \mid O\}$ by $R_E(\lambda_\gamma + \frac{\pi}{2}, \frac{\pi}{2} - \varphi_\gamma, 0)$ and $R_E(\Lambda_\Gamma + \frac{\pi}{2}, \frac{\pi}{2} - \Phi_\Gamma, 0)$, respectively)

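Second, in closing the commutative diagram of Fig. 9, one is led to the unknown matrix $R$ of Eq. (167). All rotation matrices belong to the special orthogonal group $SO(3) = \{R \in \mathbb{R}^{3\times 3} \mid R^{T}R = I_3 \text{ and } |R| = 1\}$; the inverse of such a matrix is just its transpose. The compound rotation matrix $R$ depends on the two sets of parameters ($\lambda_\gamma, \varphi_\gamma$) and ($\Lambda_\Gamma, \Phi_\Gamma$), respectively. Since the differences $\delta\lambda_\gamma = \Lambda_\Gamma - \lambda_\gamma$, $\delta\varphi_\gamma = \Phi_\Gamma - \varphi_\gamma$ between astronomical longitude/latitude and reference longitude/latitude in gravity space are small to first order, $\delta\lambda_\gamma$, $\delta\varphi_\gamma$ are called Euler increments. As soon as one inserts $\Lambda_\Gamma = \lambda_\gamma + \delta\lambda_\gamma$, $\Phi_\Gamma = \varphi_\gamma + \delta\varphi_\gamma$ into the Euler rotation matrix $R_E\!\left(\Lambda_\Gamma + \frac{\pi}{2}, \frac{\pi}{2} - \Phi_\Gamma, 0\right)$ and linearizes it to the first order, for instance, $\cos(\lambda_\gamma + \delta\lambda_\gamma) \doteq \cos\lambda_\gamma - \sin\lambda_\gamma\,\delta\lambda_\gamma$, $\sin(\varphi_\gamma + \delta\varphi_\gamma) \doteq \sin\varphi_\gamma + \cos\varphi_\gamma\,\delta\varphi_\gamma$, one is able to compute the compound rotation matrix $R = I_3 + \delta A$ of type Eqs. (169) and (170), decomposed into the unit matrix $I_3$ and the incremental antisymmetric matrix $\delta A$, if one neglects terms of second order.

Basic transformation between both local bases:

\[ \begin{bmatrix} E_\Lambda \\ E_\Phi \\ E_\Gamma \end{bmatrix} = R\begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix}, \tag{166} \]

\[ R = R_E\!\left(\Lambda_\Gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \Phi_\Gamma,\ 0\right)R_E^{T}\!\left(\lambda_\gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \varphi_\gamma,\ 0\right), \tag{167} \]

\[ R = \begin{bmatrix} 1 & \sin\varphi_\gamma\,\delta\lambda_\gamma & -\cos\varphi_\gamma\,\delta\lambda_\gamma \\ -\sin\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\lambda_\gamma\delta\varphi_\gamma) & 1 & -\delta\varphi_\gamma \\ \cos\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\lambda_\gamma\delta\varphi_\gamma) & \delta\varphi_\gamma & 1 \end{bmatrix}, \tag{168} \]

\[ R = I_3 + \delta A, \tag{169} \]

\[ \delta A = \begin{bmatrix} 0 & \sin\varphi_\gamma\,\delta\lambda_\gamma & -\cos\varphi_\gamma\,\delta\lambda_\gamma \\ -\sin\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\lambda_\gamma\delta\varphi_\gamma) & 0 & -\delta\varphi_\gamma \\ \cos\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\lambda_\gamma\delta\varphi_\gamma) & \delta\varphi_\gamma & 0 \end{bmatrix}. \tag{170} \]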
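The linearization $R = I_3 + \delta A$ can be checked numerically against the exact product of Eq. (167). The sketch below is an illustration under stated assumptions: the local frames are built directly from the explicit matrices of Eqs. (162) and (165), the function names are hypothetical, and the sign pattern of the first-order matrix is the one obtained by differentiating those explicit matrices.

```python
import math


def local_frame(lon, lat):
    """Rows are the east, north, and vertical unit vectors, as in Eqs. (162), (165)."""
    sl, cl = math.sin(lon), math.cos(lon)
    sp, cp = math.sin(lat), math.cos(lat)
    return [
        [-sl, cl, 0.0],
        [-cl * sp, -sl * sp, cp],
        [cl * cp, sl * cp, sp],
    ]


def times_transpose(a, b):
    """Return the 3x3 product a b^T for matrices stored as nested lists."""
    return [[sum(a[i][k] * b[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compound_rotation(lam, phi, dlam, dphi):
    """Exact R of Eq. (167): astronomical frame times transposed reference frame."""
    return times_transpose(local_frame(lam + dlam, phi + dphi),
                           local_frame(lam, phi))


def linearized_rotation(phi, dlam, dphi):
    """First-order approximation R = I + dA, cf. Eqs. (168)-(170)."""
    s, c = math.sin(phi), math.cos(phi)
    return [
        [1.0, s * dlam, -c * dlam],
        [-s * dlam, 1.0, -dphi],
        [c * dlam, dphi, 1.0],
    ]


if __name__ == "__main__":
    exact = compound_rotation(0.3, 0.8, 2e-5, -1e-5)
    approx = linearized_rotation(0.8, 2e-5, -1e-5)
    print(max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3)))
```

The printed residual is of second order in the Euler increments, which is the size of the terms neglected in Eq. (169).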

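Finally, by means of Eqs. (171)–(178), the incremental gravity vector $-\delta\boldsymbol{\gamma} = -(\boldsymbol{\Gamma} - \boldsymbol{\gamma})$ is represented in the local reference frame $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ at $P$. While the actual gravity vector $\boldsymbol{\Gamma}$ is originally given in the local base $E_\Gamma$, by means of Eq. (172), one succeeds in representing it in the local reference base $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$ by Eq. (171). As soon as one combines Eqs. (171) and (172), one finally represents the negative incremental gravity vector $-\delta\boldsymbol{\gamma}$ in the local reference frame, Eq. (173). The modulus of gravity $\Gamma = \|\boldsymbol{\Gamma}\|_2 \doteq \|\boldsymbol{\gamma}\|_2 + \|\delta\boldsymbol{\gamma}\|_2 = \gamma + \delta\gamma$ is finally approximated by the modulus of reference gravity $\gamma = \|\boldsymbol{\gamma}\|_2$ and the modulus of incremental gravity $\delta\gamma = \|\delta\boldsymbol{\gamma}\|_2$. In this way, from Eq. (174) one is led to Eq. (175) and Eq. (176), just defining the horizontal components of the incremental gravity vector as vertical deflections “east” $\eta = \cos\varphi_\gamma\,\delta\lambda_\gamma$ and “north” $\xi = \delta\varphi_\gamma$ up to order $O(2)$. Figure 10 illustrates those vertical deflections in the tangent space $T_P S^{2}$ of the sphere $S^{2}$ of radius $\gamma = \|\boldsymbol{\gamma}\|_2$ spanned by the unit vectors east, north $\{e_{\lambda_\gamma}, e_{\varphi_\gamma} \mid P\}$ at $P$.

Fig. 10 Negative incremental gravity vector $-\delta\boldsymbol{\gamma}$ in the tangent space at point $P$ and projection $\pi(-\delta\boldsymbol{\gamma})$ of the negative incremental gravity vector onto the tangent plane at point $P$ (axes: spherical east, with component $\gamma\cos\varphi_\gamma\,\delta\lambda_\gamma$, and spherical north, with component $\gamma\,\delta\varphi_\gamma$)

Incremental gravity vector within the local reference frame $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$:

Actual gravity vector:

\[ -\boldsymbol{\Gamma} = E_\Gamma\,\Gamma = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\Gamma\begin{bmatrix} \cos\varphi_\gamma\,\delta\lambda_\gamma \\ \delta\varphi_\gamma \\ 1 \end{bmatrix}. \tag{171} \]

Reference gravity vector:

\[ -\boldsymbol{\gamma} = e_\gamma\,\gamma = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\gamma\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \tag{172} \]

Incremental gravity vector:

\[ -\delta\boldsymbol{\gamma} = -(\boldsymbol{\Gamma} - \boldsymbol{\gamma}) = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\begin{bmatrix} \Gamma\cos\varphi_\gamma\,\delta\lambda_\gamma \\ \Gamma\,\delta\varphi_\gamma \\ \Gamma - \gamma \end{bmatrix}, \tag{173} \]

\[ -\delta\boldsymbol{\gamma} = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\begin{bmatrix} (\gamma + \delta\gamma)\cos\varphi_\gamma\,\delta\lambda_\gamma \\ (\gamma + \delta\gamma)\,\delta\varphi_\gamma \\ \delta\gamma \end{bmatrix}, \tag{174} \]

\[ -\delta\boldsymbol{\gamma} = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\begin{bmatrix} \gamma\cos\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\gamma\,\delta\lambda_\gamma) \\ \gamma\,\delta\varphi_\gamma + O(\delta\gamma\,\delta\varphi_\gamma) \\ \delta\gamma \end{bmatrix}, \tag{175} \]

\[ -\delta\boldsymbol{\gamma} = \left[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\right]\begin{bmatrix} \gamma\,\eta \\ \gamma\,\xi \\ \delta\gamma \end{bmatrix}. \tag{176} \]

Vertical deflections:

\[ \eta = \cos\varphi_\gamma\,\delta\lambda_\gamma + O(\delta\gamma\,\delta\lambda_\gamma), \tag{177} \]

\[ \xi = \delta\varphi_\gamma + O(\delta\gamma\,\delta\varphi_\gamma). \tag{178} \]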

Gauss surface normal coordinates ($L, B, H$):

(i) Forward transformation

\[ \mathbf{X} = E_1 X + E_2 Y + E_3 Z, \tag{179} \]

\[ \begin{aligned} X &= \left[\frac{A_1}{\sqrt{1 - E^{2}\sin^{2}B}} + H(L,B)\right]\cos B\cos L, \\ Y &= \left[\frac{A_1}{\sqrt{1 - E^{2}\sin^{2}B}} + H(L,B)\right]\cos B\sin L, \\ Z &= \left[\frac{A_1(1 - E^{2})}{\sqrt{1 - E^{2}\sin^{2}B}} + H(L,B)\right]\sin B. \end{aligned} \tag{180} \]

Relative eccentricity of the ellipsoid of revolution $\mathbb{E}^{2}_{A_1,A_1,A_2}$:

\[ E^{2} = \frac{A_1^{2} - A_2^{2}}{A_1^{2}}, \qquad E^{2} \in \mathbb{R}^{+}. \tag{181} \]

(ii) Backward transformation: see the review paper by Grafarend (2001).

(iii) Jacobi matrix

\[ J = \begin{bmatrix} D_L X & D_B X & D_H X \\ D_L Y & D_B Y & D_H Y \\ D_L Z & D_B Z & D_H Z \end{bmatrix}, \tag{182} \]

\[ J = \begin{bmatrix} \cos B\,[H_L\cos L - (N+H)\sin L] & \cos L\,[H_B\cos B - (M+H)\sin B] & \cos B\cos L \\ \cos B\,[H_L\sin L + (N+H)\cos L] & \sin L\,[H_B\cos B - (M+H)\sin B] & \cos B\sin L \\ \sin B\,H_L & H_B\sin B + (M+H)\cos B & \sin B \end{bmatrix}. \tag{183} \]

(iv) Metric

\[ dS^{2} = [\,dL\ \ dB\ \ dH\,]\;J^{T}J\begin{bmatrix} dL \\ dB \\ dH \end{bmatrix}, \tag{184} \]

\[ G = J^{T}J, \tag{185} \]

\[ G = \begin{bmatrix} (N+H)^{2}\cos^{2}B + H_L^{2} & H_L H_B & H_L \\ H_L H_B & (M+H)^{2} + H_B^{2} & H_B \\ H_L & H_B & 1 \end{bmatrix}, \tag{186} \]

\[ M(B) = \frac{A_1(1 - E^{2})}{\sqrt{1 - E^{2}\sin^{2}B}^{\;3}}, \tag{187} \]

\[ N(B) = \frac{A_1}{\sqrt{1 - E^{2}\sin^{2}B}}, \tag{188} \]

\[ H_L = 0,\ H_B = 0 \ \Rightarrow\ dS^{2} = (N+H)^{2}\cos^{2}B\,dL^{2} + (M+H)^{2}\,dB^{2} + dH^{2}. \tag{189} \]
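The forward transformation of Eq. (180) and the diagonal metric of Eq. (189) can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the height $H$ is treated as a constant (so that $H_L = H_B = 0$), and the semi-axes used in the example are GRS80-like values assumed for the demonstration, not taken from the text.

```python
import math


def radii_of_curvature(lat_b, a1, ecc2):
    """Meridian radius M(B), Eq. (187), and prime-vertical radius N(B), Eq. (188)."""
    w = math.sqrt(1.0 - ecc2 * math.sin(lat_b) ** 2)
    return a1 * (1.0 - ecc2) / w ** 3, a1 / w


def gauss_to_cartesian(lon_l, lat_b, height, a1, ecc2):
    """Forward transformation (L, B, H) -> (X, Y, Z) of Eq. (180), constant H."""
    _, n = radii_of_curvature(lat_b, a1, ecc2)
    x = (n + height) * math.cos(lat_b) * math.cos(lon_l)
    y = (n + height) * math.cos(lat_b) * math.sin(lon_l)
    z = (n * (1.0 - ecc2) + height) * math.sin(lat_b)
    return x, y, z


def metric_diagonal(lat_b, height, a1, ecc2):
    """Diagonal metric elements (g_LL, g_BB, g_HH) of Eq. (189)."""
    m, n = radii_of_curvature(lat_b, a1, ecc2)
    return ((n + height) ** 2 * math.cos(lat_b) ** 2, (m + height) ** 2, 1.0)


if __name__ == "__main__":
    a1, a2 = 6378137.0, 6356752.3141          # assumed semi-axes in metres
    ecc2 = (a1 ** 2 - a2 ** 2) / a1 ** 2      # Eq. (181)
    print(gauss_to_cartesian(math.radians(10.0), math.radians(45.0), 500.0, a1, ecc2))
    print(metric_diagonal(math.radians(45.0), 500.0, a1, ecc2))
```

The square roots of the diagonal elements, $(N+H)\cos B$, $M+H$, and 1, are the scale factors that appear in the first-order expressions of Eqs. (214)–(216).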

5 Vertical Deflections and Gravity Disturbance in Geometry Space

Next, two different definitions of the vertical are presented, which relate to (i) Gauss surface normal coordinates (also called geodetic coordinates) of type ellipsoidal longitude and ellipsoidal latitude and (ii) Jacobi ellipsoidal coordinates of type spheroidal longitude and spheroidal latitude in geometry space. Up to terms of second order, those vertical deflections agree with each other.

5.1 Ellipsoidal Coordinates of Type Gauss Surface Normal

With reference to the extensive review on ellipsoidal coordinates by Thong and Grafarend (1989), Gauss surface normal coordinates ($L, B, H$), called ellipsoidal longitude $L$, ellipsoidal latitude $B$, and ellipsoidal height $H$, are introduced; they relate to Cartesian coordinates $\{X, Y, Z\}$ in the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ by Eqs. (179)–(181). The derivatives $\{D_L\mathbf{x}, D_B\mathbf{x}, D_H\mathbf{x}\}$ enable one to derive the local orthonormal frame of reference $E_L = D_L\mathbf{x}/\|D_L\mathbf{x}\|$, $E_B = D_B\mathbf{x}/\|D_B\mathbf{x}\|$, $E_H = D_H\mathbf{x}/\|D_H\mathbf{x}\|$, in short $\{E_L, E_B, E_H \mid P\}$, also called {ellipsoidal east, ellipsoidal north, ellipsoidal vertical} at $P$. The local orthonormal frame of reference $\{E_L, E_B, E_H \mid P\}$ at $P$ is related to the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ at $O$ by the orthonormal matrix $T_1$ as outlined in Eqs. (190) and (191).

Construction of the local reference frame $\{e_L, e_B, e_H \mid Q\}$ from the geodetic coordinates of point $Q$ on the ellipsoid of revolution:

\[ \begin{aligned} E_L &= \frac{D_L\mathbf{x}}{\|D_L\mathbf{x}\|} = \left(E_1\frac{\partial x_1}{\partial L} + E_2\frac{\partial x_2}{\partial L} + E_3\frac{\partial x_3}{\partial L}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial x_i}{\partial L}\right)^{2}\right]^{-1/2}, \\ E_B &= \frac{D_B\mathbf{x}}{\|D_B\mathbf{x}\|} = \left(E_1\frac{\partial x_1}{\partial B} + E_2\frac{\partial x_2}{\partial B} + E_3\frac{\partial x_3}{\partial B}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial x_i}{\partial B}\right)^{2}\right]^{-1/2}, \\ E_H &= \frac{D_H\mathbf{x}}{\|D_H\mathbf{x}\|} = \left(E_1\frac{\partial x_1}{\partial H} + E_2\frac{\partial x_2}{\partial H} + E_3\frac{\partial x_3}{\partial H}\right)\left[\sum_{i=1}^{3}\left(\frac{\partial x_i}{\partial H}\right)^{2}\right]^{-1/2} = E_L \times E_B = \star(E_L \wedge E_B), \end{aligned} \tag{190} \]

\[ \begin{bmatrix} E_L \\ E_B \\ E_H \end{bmatrix} = T_1\begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix} = \begin{bmatrix} -\sin L & \cos L & 0 \\ -\cos L\sin B & -\sin L\sin B & \cos B \\ \cos L\cos B & \sin L\cos B & \sin B \end{bmatrix}\begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix}. \tag{191} \]

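Basis transformation from geometry space to gravity space:

\[ \begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix} = T_2\,T_1^{T}\begin{bmatrix} E_L \\ E_B \\ E_H \end{bmatrix} = T\begin{bmatrix} E_L \\ E_B \\ E_H \end{bmatrix} \tag{192} \]

subject to

\[ T_1 = T_1(L, B), \tag{193} \]

\[ T_2 = T_2(\lambda_\gamma, \varphi_\gamma) = R_E\!\left(\lambda_\gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \varphi_\gamma,\ 0\right), \tag{194} \]

\[ T = T_2\,T_1^{T} = T(L, B; \lambda_\gamma, \varphi_\gamma), \qquad T_1, T_2 \in SO(3), \tag{195} \]

\[ T = \begin{bmatrix} \cos(L - \lambda_\gamma) & -\sin(L - \lambda_\gamma)\sin B & \sin(L - \lambda_\gamma)\cos B \\ \sin(L - \lambda_\gamma)\sin\varphi_\gamma & \cos(L - \lambda_\gamma)\sin B\sin\varphi_\gamma + \cos B\cos\varphi_\gamma & -\cos(L - \lambda_\gamma)\cos B\sin\varphi_\gamma + \sin B\cos\varphi_\gamma \\ -\sin(L - \lambda_\gamma)\cos\varphi_\gamma & -\cos(L - \lambda_\gamma)\sin B\cos\varphi_\gamma + \cos B\sin\varphi_\gamma & \cos(L - \lambda_\gamma)\cos B\cos\varphi_\gamma + \sin B\sin\varphi_\gamma \end{bmatrix}. \tag{196} \]

Additive decomposition:

\[ L = \lambda_\gamma + \delta L \iff \lambda_\gamma = L - \delta L, \tag{197} \]

\[ B = \varphi_\gamma + \delta B \iff \varphi_\gamma = B - \delta B, \tag{198} \]

\[ \sin(L - \delta L) \doteq \sin L - \cos L\,\delta L, \tag{199} \]

\[ \cos(B - \delta B) \doteq \cos B + \sin B\,\delta B. \tag{200} \]

Linearized basis transformation from geometry space to gravity space:

\[ \begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix} = \begin{bmatrix} 1 & -\sin B\,\delta L & \cos B\,\delta L \\ \sin B\,\delta L & 1 & \delta B \\ -\cos B\,\delta L & -\delta B & 1 \end{bmatrix}\begin{bmatrix} E_L \\ E_B \\ E_H \end{bmatrix} \tag{201} \]

subject to

\[ T = I_3 + \delta A, \tag{202} \]

\[ \delta A = \begin{bmatrix} 0 & -\sin B\,\delta L & \cos B\,\delta L \\ \sin B\,\delta L & 0 & \delta B \\ -\cos B\,\delta L & -\delta B & 0 \end{bmatrix}. \tag{203} \]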

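The orthonormal transformation $T_2(\lambda_\gamma, \varphi_\gamma)$, Eq. (194), has already been established. Such a relation will be used to transform the ellipsoidal orthonormal frame of reference $\{E_L, E_B, E_H \mid Q\}$, $Q \in \mathbb{E}^{2}_{A_1,A_1,A_2}$, to the reference “east, north, vertical” frame $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$. Indeed, $[E_L, E_B, E_H]^{T} = T_1[e_1, e_2, e_3]^{T} = T_1 T_2^{T}[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]^{T}$ or $[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]^{T} = T_2 T_1^{T}[E_L, E_B, E_H]^{T}$ is the compound transformation $T$ of Eqs. (192)–(204), linearized with respect to the antisymmetric matrix $\delta A$, Eq. (203). Based upon the linearized version of the transformation $[E_L, E_B, E_H] \to [e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]$, Eqs. (202)–(204), one can take advantage of the representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$, Eq. (176), in terms of $[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]$: Eqs. (204)–(207) illustrate the representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$ in the basis $\{E_L, E_B, E_H\}$. The first basic result has to be interpreted as follows. The second representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$ contains horizontal components as well as a vertical component which are all functions of $\{\eta, \xi, \delta\gamma\}$. For instance, the east component along $E_L$ is a function of $\eta$ (first order) and of ($\xi, \delta\gamma$) (second order). Likewise, the vertical component along $E_H$ is a function of $\delta\gamma$ (first order) and of ($\eta, \xi$) (second order). If one concentrates on first-order terms only, Eqs. (206) and (207) prove the identity of the first and second definition of vertical deflections.

Vertical deflections with respect to a basis in geometry space, Gauss surface normal coordinates $\{L, B, H\}$:

\[ -\delta\boldsymbol{\gamma} = \left[e_{\lambda_\gamma}\ e_{\varphi_\gamma}\ e_\gamma\right]\begin{bmatrix} \gamma\,\eta \\ \gamma\,\xi \\ \delta\gamma \end{bmatrix} = \left[E_L\ E_B\ E_H\right]T^{T}\begin{bmatrix} \gamma\,\eta \\ \gamma\,\xi \\ \delta\gamma \end{bmatrix}, \tag{204} \]

\[ -\delta\boldsymbol{\gamma} \doteq \left[E_L\ E_B\ E_H\right]\begin{bmatrix} \gamma\,\eta + \sin B\,\delta L\,\gamma\,\xi - \cos B\,\delta L\,\delta\gamma \\ -\sin B\,\delta L\,\gamma\,\eta + \gamma\,\xi - \delta B\,\delta\gamma \\ \cos B\,\delta L\,\gamma\,\eta + \delta B\,\gamma\,\xi + \delta\gamma \end{bmatrix}. \tag{205} \]

Vertical deflections:

\[ \eta \doteq \cos\varphi_\gamma\,\delta\lambda_\gamma \doteq \cos B\,\delta L, \tag{206} \]

\[ \xi \doteq \delta\varphi_\gamma \doteq \delta B. \tag{207} \]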

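The potential theory of the horizontal and vertical components of the gravity field is reviewed in Eqs. (208)–(216). First, the reference gravity vector $\boldsymbol{\gamma}$ as the gradient of the gravity potential and the incremental gravity vector $\delta\boldsymbol{\gamma}$ as the gradient of the incremental gravity potential, also called disturbing potential, are presented, both represented in the ellipsoidal frame of reference $\{e_L, e_B, e_H\}$ under the assumption $H_L = 0$, $H_B = 0$. The elements $\{g_{LL}, g_{BB}, g_{HH}\}$ of the metric matrix $G$ have to be implemented. Second, the highlight is the first-order potential representation of $\{\eta, \xi, \delta\gamma\}$, Eqs. (214)–(216), as functionals of type $\{D_L \delta w, D_B \delta w, D_H \delta w\}$.

Potential theory of horizontal and vertical components of the gravity field:

\[
\boldsymbol{\gamma} = \operatorname{grad} w
= e_L \frac{1}{\sqrt{g_{LL}}} D_L w + e_B \frac{1}{\sqrt{g_{BB}}} D_B w + e_H \frac{1}{\sqrt{g_{HH}}} D_H w,
\tag{208}
\]

\[
\delta\boldsymbol{\gamma} = \operatorname{grad} \delta w
= e_L \frac{1}{\sqrt{g_{LL}}} D_L \delta w + e_B \frac{1}{\sqrt{g_{BB}}} D_B \delta w + e_H \frac{1}{\sqrt{g_{HH}}} D_H \delta w
\quad (H_L = 0,\ H_B = 0).
\tag{209}
\]

Functionals of the disturbing potential $\delta w$:

\[
\begin{bmatrix}
(1/\sqrt{g_{LL}})\, D_L \delta w \\
(1/\sqrt{g_{BB}})\, D_B \delta w \\
(1/\sqrt{g_{HH}})\, D_H \delta w
\end{bmatrix}
= T \begin{bmatrix} \gamma\eta \\ \gamma\xi \\ \delta\gamma \end{bmatrix},
\tag{210}
\]

\[
\eta = \frac{1}{\gamma}\left( \frac{1}{\sqrt{g_{LL}}} D_L \delta w + \sin B\,\delta L \frac{1}{\sqrt{g_{BB}}} D_B \delta w - \cos B\,\delta L \frac{1}{\sqrt{g_{HH}}} D_H \delta w \right),
\tag{211}
\]

\[
\xi = \frac{1}{\gamma}\left( -\sin B\,\delta L \frac{1}{\sqrt{g_{LL}}} D_L \delta w + \frac{1}{\sqrt{g_{BB}}} D_B \delta w - \delta B \frac{1}{\sqrt{g_{HH}}} D_H \delta w \right),
\tag{212}
\]

\[
\delta\gamma = -\left( \cos B\,\delta L \frac{1}{\sqrt{g_{LL}}} D_L \delta w + \delta B \frac{1}{\sqrt{g_{BB}}} D_B \delta w + \frac{1}{\sqrt{g_{HH}}} D_H \delta w \right).
\tag{213}
\]

If

\[
\frac{\sin B\,\delta L}{\sqrt{g_{BB}}} D_B \delta w \ll 1,\quad
\frac{\cos B\,\delta L}{\sqrt{g_{HH}}} D_H \delta w \ll 1,\quad
\frac{\delta B}{\sqrt{g_{BB}}} D_B \delta w \ll 1,
\]

\[
\frac{\sin B\,\delta L}{\sqrt{g_{LL}}} D_L \delta w \ll 1,\quad
\frac{\cos B\,\delta L}{\sqrt{g_{LL}}} D_L \delta w \ll 1,\quad
\frac{\delta B}{\sqrt{g_{HH}}} D_H \delta w \ll 1,
\]

then

\[
\eta \doteq \frac{1}{\gamma} \frac{1}{(N + H)\cos B} D_L \delta w, \tag{214}
\]

\[
\xi \doteq \frac{1}{\gamma} \frac{1}{M + H} D_B \delta w, \tag{215}
\]

\[
\delta\gamma \doteq -D_H \delta w. \tag{216}
\]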

5.2 Jacobi Ellipsoidal Coordinates

With reference to the extensive review on ellipsoidal coordinates by Thong and Grafarend (1989), Jacobi ellipsoidal coordinates $(\lambda, \varphi, u)$, also called “mixed elliptic trigonometric-elliptic coordinates,” are introduced, namely, spheroidal longitude $\lambda$, spheroidal latitude $\varphi$, and semiminor axis $u$. They relate to Cartesian coordinates $(X, Y, Z)$ in the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ by Eqs. (217) and (218). These coordinates have been introduced to physical geodesy because, in contrast to Gauss surface normal ellipsoidal coordinates, the Laplace differential equation, which governs the Newtonian gravitational field, separates in them (see, e.g., Grafarend 1988). The Jacobi ellipsoidal coordinates are generated by the intersection of

1. a confocal, oblate ellipsoid,
2. a confocal half hyperboloid,
3. a half plane

in a unique way.

Jacobi ellipsoidal coordinates $\{\lambda, \varphi, u\}$.

(i) Forward transformation

\[
\mathbf{X} = E_1 X + E_2 Y + E_3 Z, \tag{217}
\]

\[
X = \sqrt{u^2 + \varepsilon^2}\,\cos\varphi\cos\lambda,\quad
Y = \sqrt{u^2 + \varepsilon^2}\,\cos\varphi\sin\lambda,\quad
Z = u\sin\varphi.
\tag{218}
\]

Absolute eccentricity of the ellipsoid of revolution $\mathbb{E}^2_{A_1,A_1,A_2}$:

\[
\varepsilon^2 = A_1^2 - A_2^2,\quad \varepsilon \in \mathbb{R}^+. \tag{219}
\]

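(ii) Backward transformation

\[
\lambda =
\begin{cases}
\arctan(Y/X) & \text{for } (X > 0) \text{ and } (Y \ge 0), \\
\arctan(Y/X) + \pi & \text{for } (X < 0), \\
\arctan(Y/X) + 2\pi & \text{for } (X > 0) \text{ and } (Y < 0), \\
\pi/2 & \text{for } (X = 0) \text{ and } (Y > 0), \\
3\pi/2 & \text{for } (X = 0) \text{ and } (Y < 0), \\
\text{undefined} & \text{for } (X = 0) \text{ and } (Y = 0),
\end{cases}
\tag{220}
\]

\[
\varphi = (\operatorname{sgn} Z) \arcsin \left[ \frac{1}{2\varepsilon^2} \left( \varepsilon^2 - (X^2 + Y^2 + Z^2) + \sqrt{(X^2 + Y^2 + Z^2 - \varepsilon^2)^2 + 4\varepsilon^2 Z^2} \right) \right]^{1/2},
\tag{221}
\]

\[
u = \left[ \frac{1}{2} \left( X^2 + Y^2 + Z^2 - \varepsilon^2 + \sqrt{(X^2 + Y^2 + Z^2 - \varepsilon^2)^2 + 4\varepsilon^2 Z^2} \right) \right]^{1/2}.
\tag{222}
\]

(iii) Jacobi matrix

\[
J = \begin{bmatrix}
D_\lambda X & D_\varphi X & D_u X \\
D_\lambda Y & D_\varphi Y & D_u Y \\
D_\lambda Z & D_\varphi Z & D_u Z
\end{bmatrix},
\tag{223}
\]

\[
J = \begin{bmatrix}
-\sqrt{u^2 + \varepsilon^2}\cos\varphi\sin\lambda & -\sqrt{u^2 + \varepsilon^2}\sin\varphi\cos\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2}}\cos\varphi\cos\lambda \\
\sqrt{u^2 + \varepsilon^2}\cos\varphi\cos\lambda & -\sqrt{u^2 + \varepsilon^2}\sin\varphi\sin\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2}}\cos\varphi\sin\lambda \\
0 & u\cos\varphi & \sin\varphi
\end{bmatrix}.
\tag{224}
\]

(iv) Metric

\[
dS^2 = [d\lambda\ \ d\varphi\ \ du]\, J^{\mathsf T} J \begin{bmatrix} d\lambda \\ d\varphi \\ du \end{bmatrix},
\tag{225}
\]

\[
G = J^{\mathsf T} J, \tag{226}
\]

\[
G = \begin{bmatrix}
(u^2 + \varepsilon^2)\cos^2\varphi & 0 & 0 \\
0 & u^2 + \varepsilon^2\sin^2\varphi & 0 \\
0 & 0 & (u^2 + \varepsilon^2\sin^2\varphi)/(u^2 + \varepsilon^2)
\end{bmatrix},
\tag{227}
\]

\[
dS^2 = (u^2 + \varepsilon^2)\cos^2\varphi\, d\lambda^2 + (u^2 + \varepsilon^2\sin^2\varphi)\, d\varphi^2 + \frac{u^2 + \varepsilon^2\sin^2\varphi}{u^2 + \varepsilon^2}\, du^2.
\tag{228}
\]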
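The forward and backward transformations and the diagonal metric above can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names, the finite-difference step sizes, and the use of `atan2` (which reproduces the branch selection of Eq. (220) after reduction to $[0, 2\pi)$) are assumptions made for illustration only.

```python
import math

def jacobi_forward(lam, phi, u, eps):
    """Eq. (218): Jacobi ellipsoidal (lam, phi, u) -> Cartesian (X, Y, Z)."""
    c = math.sqrt(u * u + eps * eps)
    return (c * math.cos(phi) * math.cos(lam),
            c * math.cos(phi) * math.sin(lam),
            u * math.sin(phi))

def jacobi_backward(X, Y, Z, eps):
    """Eqs. (220)-(222): Cartesian (X, Y, Z) -> Jacobi ellipsoidal coordinates."""
    if X == 0.0 and Y == 0.0:
        raise ValueError("longitude is undefined on the rotation axis")
    lam = math.atan2(Y, X) % (2.0 * math.pi)        # branch choice of Eq. (220)
    S = X * X + Y * Y + Z * Z - eps * eps
    R = math.sqrt(S * S + 4.0 * eps * eps * Z * Z)
    u = math.sqrt(0.5 * (S + R))                     # Eq. (222)
    sin2 = min(1.0, max(0.0, (R - S) / (2.0 * eps * eps)))
    phi = math.copysign(math.asin(math.sqrt(sin2)), Z)   # Eq. (221)
    return lam, phi, u

def metric_numeric(lam, phi, u, eps, h=(1e-6, 1e-6, 1.0)):
    """G = J^T J, Eq. (226), with J from central differences (assumed steps h)."""
    q = (lam, phi, u)
    cols = []
    for k in range(3):
        qp, qm = list(q), list(q)
        qp[k] += h[k]
        qm[k] -= h[k]
        fp = jacobi_forward(*qp, eps)
        fm = jacobi_forward(*qm, eps)
        cols.append([(a - b) / (2.0 * h[k]) for a, b in zip(fp, fm)])
    return [[sum(ci * cj for ci, cj in zip(cols[i], cols[j])) for j in range(3)]
            for i in range(3)]

def metric_closed_form(phi, u, eps):
    """Diagonal elements (g_lamlam, g_phiphi, g_uu) of Eq. (227)."""
    w2 = u * u + (eps * math.sin(phi)) ** 2
    return ((u * u + eps * eps) * math.cos(phi) ** 2, w2, w2 / (u * u + eps * eps))
```

A round trip `jacobi_backward(*jacobi_forward(lam, phi, u, eps), eps)` should return the input triple, and the numerical metric should be diagonal up to finite-difference noise.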

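The derivatives $\{D_\lambda \mathbf{x}, D_\varphi \mathbf{x}, D_u \mathbf{x}\}$ are presented in order to construct, in Eqs. (229) and (230), a local orthonormal frame of reference, namely, $E_\lambda = D_\lambda \mathbf{x} / \|D_\lambda \mathbf{x}\|$, $E_\varphi = D_\varphi \mathbf{x} / \|D_\varphi \mathbf{x}\|$, $E_u = D_u \mathbf{x} / \|D_u \mathbf{x}\|$, in short $\{E_\lambda, E_\varphi, E_u \mid P\}$, also called {ellipsoidal east, ellipsoidal north, ellipsoidal vertical} at $P$. The local orthonormal frame of reference $\{E_\lambda, E_\varphi, E_u \mid P\}$ at $P$ is related to the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ at $O$ by the orthonormal matrix $T_1$. Thanks to the transformation Eq. (165), $\{e_1, e_2, e_3\} \to \{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\}$, one has already succeeded in establishing the orthonormal matrix $T_2(\lambda_\gamma, \varphi_\gamma)$, Eq. (233). Such a relation will again be used to transform the ellipsoidal orthonormal frame of reference $\{E_\lambda, E_\varphi, E_u \mid P\}$ to the reference “east, north, vertical” frame $\{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma \mid P\}$. Indeed, $[E_\lambda, E_\varphi, E_u]^{\mathsf T} = T_1 [e_1, e_2, e_3]^{\mathsf T} = T_1 T_2^{\mathsf T} [e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]^{\mathsf T}$ or $[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]^{\mathsf T} = T_2 T_1^{\mathsf T} [E_\lambda, E_\varphi, E_u]^{\mathsf T}$ is the compound transformation $T$ of Eqs. (231)–(235). As soon as one develops $T$ close to the identity, namely, by means of the decomposition (Eqs. (236)–(239)) as well as of a special case (Eq. (240)) of the binomial series, $(\varepsilon/u) < 1$, one gains the elements of the matrix $T$ (case study $t_{23}$). The matrix $T$ is finally decomposed into the unit matrix $I_3$ and the incremental antisymmetric matrix $\delta A$, Eqs. (248) and (249).

Construction of the local reference frame $\{E_\lambda, E_\varphi, E_u \mid P\}$ from the Jacobi ellipsoidal coordinates of point $P$ on the topography:

\[
\begin{aligned}
E_\lambda &= \frac{D_\lambda \mathbf{X}}{\|D_\lambda \mathbf{X}\|}
= \left( E_1 \frac{\partial x_1}{\partial \lambda} + E_2 \frac{\partial x_2}{\partial \lambda} + E_3 \frac{\partial x_3}{\partial \lambda} \right)
\left[ \left(\frac{\partial x_1}{\partial \lambda}\right)^2 + \left(\frac{\partial x_2}{\partial \lambda}\right)^2 + \left(\frac{\partial x_3}{\partial \lambda}\right)^2 \right]^{-1/2}, \\
E_\varphi &= \frac{D_\varphi \mathbf{X}}{\|D_\varphi \mathbf{X}\|}
= \left( E_1 \frac{\partial x_1}{\partial \varphi} + E_2 \frac{\partial x_2}{\partial \varphi} + E_3 \frac{\partial x_3}{\partial \varphi} \right)
\left[ \left(\frac{\partial x_1}{\partial \varphi}\right)^2 + \left(\frac{\partial x_2}{\partial \varphi}\right)^2 + \left(\frac{\partial x_3}{\partial \varphi}\right)^2 \right]^{-1/2}, \\
E_u &= \frac{D_u \mathbf{X}}{\|D_u \mathbf{X}\|}
= \left( E_1 \frac{\partial x_1}{\partial u} + E_2 \frac{\partial x_2}{\partial u} + E_3 \frac{\partial x_3}{\partial u} \right)
\left[ \left(\frac{\partial x_1}{\partial u}\right)^2 + \left(\frac{\partial x_2}{\partial u}\right)^2 + \left(\frac{\partial x_3}{\partial u}\right)^2 \right]^{-1/2},
\end{aligned}
\tag{229}
\]

\[
\begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix} = T_1 \begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix},
\tag{230}
\]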

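\[
T_1 = \begin{bmatrix}
-\sin\lambda & \cos\lambda & 0 \\
-\dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\cos\lambda\sin\varphi &
-\dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\sin\lambda\sin\varphi &
\dfrac{u}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\cos\varphi \\
\dfrac{u}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\cos\lambda\cos\varphi &
\dfrac{u}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\sin\lambda\cos\varphi &
\dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\sin\varphi
\end{bmatrix}.
\]

Basis transformation from geometry space to gravity space:

\[
\begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix}
= T_2 T_1^{\mathsf T} \begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix}
= T \begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix}
\tag{231}
\]

subject to

\[
T_1 = T_1(\lambda, \varphi, u), \tag{232}
\]

\[
T_2 = T_2(\lambda_\gamma, \varphi_\gamma) = R_E\!\left( \lambda_\gamma + \frac{\pi}{2},\ \frac{\pi}{2} - \varphi_\gamma,\ 0 \right),
\tag{233}
\]

\[
T := T_2 T_1^{\mathsf T} = T(\lambda, \varphi, u;\ \lambda_\gamma, \varphi_\gamma),\quad T_1, T_2 \in SO(3).
\tag{234}
\]

The elements $t_{ij}$ of $T$ read

\[
\begin{aligned}
t_{11} &= \cos(\lambda - \lambda_\gamma), \\
t_{12} &= -\frac{\sqrt{u^2 + \varepsilon^2}\,\sin(\lambda - \lambda_\gamma)\sin\varphi}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}, \\
t_{13} &= \frac{u\sin(\lambda - \lambda_\gamma)\cos\varphi}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}, \\
t_{21} &= \sin(\lambda - \lambda_\gamma)\sin\varphi_\gamma, \\
t_{22} &= \frac{\sqrt{u^2 + \varepsilon^2}\,\cos(\lambda - \lambda_\gamma)\sin\varphi\sin\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} + \frac{u\cos\varphi\cos\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}, \\
t_{23} &= -\frac{u\cos(\lambda - \lambda_\gamma)\cos\varphi\sin\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} + \frac{\sqrt{u^2 + \varepsilon^2}\,\sin\varphi\cos\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}, \\
t_{31} &= -\sin(\lambda - \lambda_\gamma)\cos\varphi_\gamma, \\
t_{32} &= -\frac{\sqrt{u^2 + \varepsilon^2}\,\cos(\lambda - \lambda_\gamma)\sin\varphi\cos\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} + \frac{u\cos\varphi\sin\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}, \\
t_{33} &= \frac{u\cos(\lambda - \lambda_\gamma)\cos\varphi\cos\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} + \frac{\sqrt{u^2 + \varepsilon^2}\,\sin\varphi\sin\varphi_\gamma}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}.
\end{aligned}
\tag{235}
\]

Additive decomposition:

\[
\lambda = \lambda_\gamma + \delta\lambda \iff \lambda_\gamma = \lambda - \delta\lambda, \tag{236}
\]

\[
\varphi = \varphi_\gamma + \delta\varphi \iff \varphi_\gamma = \varphi - \delta\varphi, \tag{237}
\]

\[
\sin(\lambda - \delta\lambda) \doteq \sin\lambda - \cos\lambda\,\delta\lambda, \tag{238}
\]

\[
\cos(\varphi - \delta\varphi) \doteq \cos\varphi + \sin\varphi\,\delta\varphi. \tag{239}
\]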

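Special case of binomial series:

\[
(1 + x)^{-1/2} = 1 - \frac{1}{2}x + \frac{3}{8}x^2 - \frac{15}{48}x^3 + O(4). \tag{240}
\]

Condition: $u > \varepsilon = \sqrt{a^2 - b^2}$.

\[
\frac{u}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}
= \left( 1 + \frac{\varepsilon^2\sin^2\varphi}{u^2} \right)^{-1/2}
= 1 - \frac{1}{2}\left( \frac{\varepsilon\sin\varphi}{u} \right)^2 + \frac{3}{8}\left( \frac{\varepsilon\sin\varphi}{u} \right)^4 + O(6),
\tag{241}
\]

\[
\sqrt{\frac{u^2 + \varepsilon^2}{u^2 + \varepsilon^2\sin^2\varphi}}
= \left( 1 - \frac{\varepsilon^2\cos^2\varphi}{u^2 + \varepsilon^2} \right)^{-1/2}
= 1 + \frac{1}{2}\frac{\varepsilon^2\cos^2\varphi}{u^2 + \varepsilon^2} + \frac{3}{8}\left( \frac{\varepsilon^2\cos^2\varphi}{u^2 + \varepsilon^2} \right)^2 + O(6).
\tag{242}
\]

Case study:

\[
t_{23} \doteq -\left[ 1 - \frac{\varepsilon^2\sin^2\varphi}{2u^2} + O(4) \right] (\sin\varphi\cos\varphi - \cos^2\varphi\,\delta\varphi)
+ \left[ 1 + \frac{\varepsilon^2\cos^2\varphi}{2(u^2 + \varepsilon^2)} + O(4) \right] (\sin\varphi\cos\varphi + \sin^2\varphi\,\delta\varphi),
\tag{243}
\]

\[
t_{23} \doteq \delta\varphi + \frac{\varepsilon^2\sin\varphi\cos\varphi}{2u^2(u^2 + \varepsilon^2)}
\left[ (u^2 + \varepsilon^2)\sin^2\varphi + u^2\cos^2\varphi - (u^2 + \varepsilon^2)\sin\varphi\cos\varphi\,\delta\varphi + u^2\sin\varphi\cos\varphi\,\delta\varphi \right],
\tag{244}
\]

\[
t_{23} \doteq \delta\varphi + \frac{\varepsilon^2\sin\varphi\cos\varphi}{2u^2}
\left[ \sin^2\varphi + \frac{u^2}{u^2 + \varepsilon^2}\cos^2\varphi - \sin\varphi\cos\varphi\,\delta\varphi + \frac{u^2}{u^2 + \varepsilon^2}\sin\varphi\cos\varphi\,\delta\varphi \right],
\tag{245}
\]

\[
t_{23} \doteq \delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi. \tag{246}
\]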
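The case study can be cross-checked numerically by comparing the exact element $t_{23}$ of Eq. (235) with the first-order result of Eq. (246). The sketch below is illustrative only; the function names and the sample values for $u$ and $\varepsilon$ (roughly Earth-like) are assumptions, not values taken from the text.

```python
import math

def t23_exact(lam, phi, u, eps, dlam, dphi):
    """Exact element t23 of T = T2 T1^T, Eq. (235)."""
    lam_g = lam - dlam          # Eq. (236)
    phi_g = phi - dphi          # Eq. (237)
    c = math.sqrt(u * u + eps * eps)
    w = math.sqrt(u * u + (eps * math.sin(phi)) ** 2)
    return (-u * math.cos(lam - lam_g) * math.cos(phi) * math.sin(phi_g)
            + c * math.sin(phi) * math.cos(phi_g)) / w

def t23_linear(phi, u, eps, dphi):
    """First-order approximation, Eq. (246)."""
    return dphi + (eps * eps) / (4.0 * u * u) * math.sin(2.0 * phi)

# Assumed sample values: u and eps of Earth-like size, deflections of about 2 arcsec.
u, eps = 6356752.0, 521854.0
lam, phi = 0.3, 0.8
dlam, dphi = 1.0e-5, 1.0e-5
exact = t23_exact(lam, phi, u, eps, dlam, dphi)
approx = t23_linear(phi, u, eps, dphi)
```

The remaining difference between `exact` and `approx` is of fourth order in $\varepsilon/u$, which is the order neglected in Eqs. (241)–(243).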

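Linearized basis transformation from geometry space to gravity space:

\[
\begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix}
=
\begin{bmatrix}
1 & -\sin\varphi\,\delta\lambda & \cos\varphi\,\delta\lambda \\
\sin\varphi\,\delta\lambda & 1 & \delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi \\
-\cos\varphi\,\delta\lambda & -\delta\varphi - (\varepsilon^2/4u^2)\sin 2\varphi & 1
\end{bmatrix}
\begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix}
\tag{247}
\]

subject to

\[
T = I_3 + \delta A, \tag{248}
\]

\[
\delta A =
\begin{bmatrix}
0 & -\sin\varphi\,\delta\lambda & \cos\varphi\,\delta\lambda \\
\sin\varphi\,\delta\lambda & 0 & \delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi \\
-\cos\varphi\,\delta\lambda & -\delta\varphi - (\varepsilon^2/4u^2)\sin 2\varphi & 0
\end{bmatrix}.
\tag{249}
\]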

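Based upon the linearized version of the transformation $\{E_\lambda, E_\varphi, E_u\} \to \{e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma\}$, Eqs. (231)–(249), one can build up the representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$, Eq. (176), in terms of $[e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]$: Eqs. (250)–(253) illustrate the third representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$, now in the basis $[E_\lambda, E_\varphi, E_u]$. The second basic result has to be interpreted as follows. The third representation of the incremental gravity vector $-\delta\boldsymbol{\gamma}$ contains horizontal components as well as a vertical component which are all functions of $(\eta, \xi, \delta\gamma)$. For instance, the east component along $E_\lambda$ is a function of $\eta$ (first order) and of $(\xi, \delta\gamma)$ (second order). Likewise, the vertical component along $E_u$ is a function of $\delta\gamma$ (first order) and of $(\eta, \xi)$ (second order). If one concentrates on first-order terms only, Eqs. (252) and (253) relate the first and third definition of vertical deflections.

Vertical deflections with respect to a basis in geometry space: Jacobi ellipsoidal coordinates $(\lambda, \varphi, u)$.

\[
-\delta\boldsymbol{\gamma} = [e_{\lambda_\gamma}, e_{\varphi_\gamma}, e_\gamma]
\begin{bmatrix} \gamma\eta \\ \gamma\xi \\ \delta\gamma \end{bmatrix},
\tag{250}
\]

\[
-\delta\boldsymbol{\gamma} \doteq [E_\lambda, E_\varphi, E_u]
\begin{bmatrix}
\gamma\eta + \sin\varphi\,\delta\lambda\,\gamma\xi - \cos\varphi\,\delta\lambda\,\delta\gamma \\
-\sin\varphi\,\delta\lambda\,\gamma\eta + \gamma\xi - \left( \delta\varphi + \dfrac{\varepsilon^2}{4u^2}\sin 2\varphi \right)\delta\gamma \\
\cos\varphi\,\delta\lambda\,\gamma\eta + \left( \delta\varphi + \dfrac{\varepsilon^2}{4u^2}\sin 2\varphi \right)\gamma\xi + \delta\gamma
\end{bmatrix}.
\tag{251}
\]

Vertical deflections:

\[
\eta \doteq \cos\varphi_\gamma\,\delta\lambda_\gamma = \cos\varphi\,\delta\lambda, \tag{252}
\]

\[
\xi \doteq \delta\varphi_\gamma = \delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi. \tag{253}
\]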

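The potential theory of the horizontal and vertical components of the gravity field is reviewed in Eqs. (254)–(262). First, the reference gravity vector $\boldsymbol{\gamma}$ as the gradient of the gravity potential and the incremental gravity vector $\delta\boldsymbol{\gamma}$ as the gradient of the incremental gravity potential, also called disturbing potential, are presented, now represented in the Jacobi ellipsoidal frame of reference $\{E_\lambda, E_\varphi, E_u \mid P\}$. The elements $(g_{\lambda\lambda}, g_{\varphi\varphi}, g_{uu})$ of the metric matrix $G$ have to be implemented. Second, the updated highlight is the first-order potential representation of $(\eta, \xi, \delta\gamma)$, Eqs. (260)–(262), as functionals of type $(D_\lambda \delta w, D_\varphi \delta w, D_u \delta w)$.

Potential theory of horizontal and vertical components of the gravity field:

\[
\boldsymbol{\gamma} = \operatorname{grad} w
= E_\lambda \frac{1}{\sqrt{g_{\lambda\lambda}}} D_\lambda w + E_\varphi \frac{1}{\sqrt{g_{\varphi\varphi}}} D_\varphi w + E_u \frac{1}{\sqrt{g_{uu}}} D_u w,
\tag{254}
\]

\[
\delta\boldsymbol{\gamma} = \operatorname{grad} \delta w
= E_\lambda \frac{1}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w + E_\varphi \frac{1}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w + E_u \frac{1}{\sqrt{g_{uu}}} D_u \delta w.
\tag{255}
\]

Functionals of the disturbing potential $\delta w$:

\[
\begin{bmatrix}
(1/\sqrt{g_{\lambda\lambda}})\, D_\lambda \delta w \\
(1/\sqrt{g_{\varphi\varphi}})\, D_\varphi \delta w \\
(1/\sqrt{g_{uu}})\, D_u \delta w
\end{bmatrix}
= T \begin{bmatrix} \gamma\eta \\ \gamma\xi \\ \delta\gamma \end{bmatrix},
\tag{256}
\]

\[
\eta = \frac{1}{\gamma}\left( \frac{1}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w + \sin\varphi\,\delta\lambda \frac{1}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w - \cos\varphi\,\delta\lambda \frac{1}{\sqrt{g_{uu}}} D_u \delta w \right),
\tag{257}
\]

\[
\xi = \frac{1}{\gamma}\left( -\sin\varphi\,\delta\lambda \frac{1}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w + \frac{1}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w - \left( \delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi \right) \frac{1}{\sqrt{g_{uu}}} D_u \delta w \right),
\tag{258}
\]

\[
\delta\gamma = -\left( \cos\varphi\,\delta\lambda \frac{1}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w + \left( \delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi \right) \frac{1}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w + \frac{1}{\sqrt{g_{uu}}} D_u \delta w \right).
\tag{259}
\]

If

\[
\frac{\sin\varphi\,\delta\lambda}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w \ll 1,\quad
\frac{\cos\varphi\,\delta\lambda}{\sqrt{g_{uu}}} D_u \delta w \ll 1,\quad
\frac{\delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi}{\sqrt{g_{\varphi\varphi}}} D_\varphi \delta w \ll 1,
\]

\[
\frac{\sin\varphi\,\delta\lambda}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w \ll 1,\quad
\frac{\cos\varphi\,\delta\lambda}{\sqrt{g_{\lambda\lambda}}} D_\lambda \delta w \ll 1,\quad
\frac{\delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi}{\sqrt{g_{uu}}} D_u \delta w \ll 1,
\]

then

\[
\eta \doteq \frac{1}{\gamma} \frac{1}{\sqrt{u^2 + \varepsilon^2}\cos\varphi} D_\lambda \delta w, \tag{260}
\]

\[
\xi \doteq \frac{1}{\gamma} \frac{1}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} D_\varphi \delta w, \tag{261}
\]

\[
\delta\gamma \doteq -\sqrt{\frac{u^2 + \varepsilon^2}{u^2 + \varepsilon^2\sin^2\varphi}}\, D_u \delta w. \tag{262}
\]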

6 Potential Theory of Horizontal and Vertical Components of the Gravity Field: Gravity Disturbance and Vertical Deflections

Now the gravitational disturbing potential is represented in terms of Jacobi ellipsoidal harmonics. As soon as one takes reference to a normal potential of Somigliana-Pizzetti type, the ellipsoidal harmonics of degree/order (0,0), (1,0), (1,−1), (1,1), and (2,0) are eliminated from the gravitational disturbing potential.

In order to present the potential theory of the horizontal and vertical components of the gravity field, namely, ellipsoidal vertical deflections and ellipsoidal gravity disturbance, one has to decide on the proper choice of the ellipsoidal potential field of reference $w(\lambda, \varphi, u)$ and on the related ellipsoidal incremental potential field $\delta w(\lambda, \varphi, u)$, also called “disturbing potential.”

6.1 Ellipsoidal Reference Potential of Type Somigliana-Pizzetti

Three proposals have been made for an ellipsoidal potential field of reference. The first choice is the zero-degree term $[\operatorname{arccot}(u/\varepsilon)](GM/\varepsilon)$ of the external ellipsoidal harmonic expansion. Indeed, it would correspond to the zero-order term $GM/r$ of the external spherical harmonic expansion. As proven in Grafarend and Ardalan (1999), the equipotential surface $w = [\operatorname{arccot}(u/\varepsilon)](GM/\varepsilon) = \text{constant}$, where $GM$ is the geocentric gravitational constant, is an ellipsoid of revolution $u = \text{constant}$. Unfortunately, such an equipotential reference surface does not include the rotation of the Earth, namely, its centrifugal potential $\Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]/3 - \Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]P_2(\sin\varphi)/3$. Accordingly, a second choice has been proposed, namely, $[\operatorname{arccot}(u/\varepsilon)]\,GM/\varepsilon + \Omega^2(u^2 + \varepsilon^2)\cos^2\varphi/2$. In this approach the zero-degree term of the gravitational potential and the centrifugal potential are superimposed. Unfortunately, the level surface $[\operatorname{arccot}(u/\varepsilon)]\,GM/\varepsilon + (1 - P_2(\sin\varphi))\,\Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]/3 = \text{constant}$ is not an ellipsoid of revolution. For this reason the third proposal has been adopted: superimpose the gravitational potential, which is externally expanded in ellipsoidal harmonics, and the centrifugal potential, represented also in ellipsoidal base functions (the centrifugal potential is not a harmonic function), and postulate an equipotential reference surface to be a level ellipsoid. Such a level ellipsoid should be an ellipsoid of revolution. Such an ellipsoidal reference field has been developed by Pizzetti (1894) and Somigliana (1929) and is properly called Somigliana-Pizzetti reference potential. The Euclidean length of its gradient is referred to as the International Gravity Formula, which recently has been developed to the sub-nano Gal level by Ardalan and Grafarend (2001).

Here, the recommendation of the International Association of Geodesy (Moritz 1984) is followed, namely, to use the Somigliana-Pizzetti potential of a level ellipsoid as the reference potential, summarized in Eqs. (263)–(267).

Reference gravity potential field of type Somigliana-Pizzetti. Reference level ellipsoid with semimajor axis $a$, semiminor axis $b$, and absolute eccentricity $\varepsilon = \sqrt{a^2 - b^2}$:

\[
\mathbb{E}^2_{a,b} := \left\{ \mathbf{X} \in \mathbb{R}^3 \mid (X^2 + Y^2)/a^2 + Z^2/b^2 = 1 \right\}.
\tag{263}
\]

The first version of the reference potential field:

\[
w(\varphi, u) = \frac{GM}{\varepsilon}\operatorname{arccot}\!\left(\frac{u}{\varepsilon}\right)
+ \frac{1}{6}\Omega^2 a^2\,
\frac{\left( 3\left(\frac{u}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{u}{\varepsilon}\right) - 3\frac{u}{\varepsilon}}
{\left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon}}
\,(3\sin^2\varphi - 1)
+ \frac{1}{2}\Omega^2(u^2 + \varepsilon^2)\cos^2\varphi.
\tag{264}
\]

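Input: 4 parameters: $GM$, $\Omega$, $a$, $\varepsilon = \sqrt{a^2 - b^2}$ or $b$.

Legendre polynomials of the first and second kind:

\[
\operatorname{arccot}(u/\varepsilon) = Q^*_{00}(u/\varepsilon),\qquad
\frac{\left( 3\left(\frac{u}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{u}{\varepsilon}\right) - 3\frac{u}{\varepsilon}}
{\left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon}}
= \frac{Q^*_{20}(u/\varepsilon)}{Q^*_{20}(b/\varepsilon)},
\tag{265}
\]

\[
3\sin^2\varphi - 1 = \frac{2}{\sqrt{5}} P^*_{20}(\sin\varphi). \tag{266}
\]

The second version of the reference potential field:

\[
w(\varphi, u) = \frac{GM}{\varepsilon} Q^*_{00}(u/\varepsilon)
+ \frac{\sqrt{5}}{15}\Omega^2 a^2 \frac{Q^*_{20}(u/\varepsilon)}{Q^*_{20}(b/\varepsilon)} P^*_{20}(\sin\varphi)
+ \frac{1}{2}\Omega^2(u^2 + \varepsilon^2)\cos^2\varphi.
\tag{267}
\]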
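The defining property of the Somigliana-Pizzetti field, namely, that the reference ellipsoid $u = b$ is an equipotential surface, can be verified by evaluating Eq. (264) at several latitudes. The sketch below is illustrative: the numerical values of $GM$, $\Omega$, $a$, and $b$ are GRS80-like values assumed here for the example and are not quoted from the text, and the function names are hypothetical.

```python
import math

# Assumed GRS80-like input parameters (illustrative only, not from the text)
GM = 3.986005e14        # geocentric gravitational constant, m^3 s^-2
OMEGA = 7.292115e-5     # rotation rate, rad s^-1
A = 6378137.0           # semimajor axis, m
B = 6356752.3141        # semiminor axis, m
EPS = math.sqrt(A * A - B * B)   # absolute eccentricity

def arccot(x):
    """arccot for positive argument."""
    return math.atan2(1.0, x)

def q(x):
    """Bracket appearing in Eqs. (264), (265): (3x^2 + 1) arccot(x) - 3x."""
    return (3.0 * x * x + 1.0) * arccot(x) - 3.0 * x

def w_ref(phi, u):
    """Reference potential, first version, Eq. (264)."""
    gravitational = GM / EPS * arccot(u / EPS)
    zonal = (OMEGA ** 2 * A ** 2 / 6.0 * q(u / EPS) / q(B / EPS)
             * (3.0 * math.sin(phi) ** 2 - 1.0))
    centrifugal = 0.5 * OMEGA ** 2 * (u * u + EPS * EPS) * math.cos(phi) ** 2
    return gravitational + zonal + centrifugal

# On u = b the potential should not depend on latitude.
values_on_ellipsoid = [w_ref(phi, B) for phi in (0.0, 0.4, 0.9, 1.3, math.pi / 2)]
# Expected constant: (GM/eps) arccot(b/eps) + Omega^2 a^2 / 3
w0_expected = GM / EPS * arccot(B / EPS) + OMEGA ** 2 * A ** 2 / 3.0
```

On $u = b$ the ratio of the two brackets equals one, and the latitude-dependent terms combine to $\Omega^2 a^2/3$, so every entry of `values_on_ellipsoid` should agree with `w0_expected` up to rounding.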

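Constraints: reference gravity potential field of type Somigliana-Pizzetti. Conditions for the ellipsoidal terms of degree/order (0,0) and (2,0). The first version:

\[
(0,0):\quad u_{00} + \frac{1}{3}\Omega^2 a^2 = W_0, \tag{268}
\]

\[
(2,0):\quad u_{20} - \frac{\sqrt{5}}{15}\Omega^2 a^2 = 0. \tag{269}
\]

The first condition in multipole expansion:

\[
u_{00} = \frac{GM}{\varepsilon}\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right)
= \frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\!\left(\frac{b}{\sqrt{a^2 - b^2}}\right),
\tag{270}
\]

\[
u_{00} = \frac{GM}{\varepsilon} Q^*_{00}\!\left(\frac{b}{\varepsilon}\right). \tag{271}
\]

Corollary:

\[
\frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\!\left(\frac{b}{\sqrt{a^2 - b^2}}\right) + \frac{1}{3}\Omega^2 a^2 - W_0 = 0.
\tag{272}
\]

The second condition in multipole expansion:

\[
u_{20} = \frac{G}{\varepsilon}\,\frac{1}{2}\left[ \left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon} \right] J_{20},
\tag{273}
\]

\[
u_{20} = \frac{G}{\varepsilon} Q^*_{20}\!\left(\frac{b}{\varepsilon}\right) J_{20}. \tag{274}
\]

Ellipsoidal multipole of degree/order (2,0):

\[
J_{20} = \int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi' \cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\, (u'^2 + \varepsilon^2\sin^2\varphi')\,
\frac{1}{2}\left( 3\left(\frac{u'}{\varepsilon}\right)^2 + 1 \right) \frac{\sqrt{5}}{2}(3\sin^2\varphi' - 1)\, \rho(\lambda', \varphi', u').
\tag{275}
\]

Functional of the mass density field $\rho(\lambda', \varphi', u')$:

\[
J_{20} = \int_0^{2\pi} d\lambda' \int_{-\pi/2}^{\pi/2} d\varphi' \cos\varphi' \int_0^{u'(\lambda',\varphi')} du'\, (u'^2 + \varepsilon^2\sin^2\varphi')\,
P^*_{20}\!\left(\frac{u'}{\varepsilon}\right) P^*_{20}(\sin\varphi')\, \rho(\lambda', \varphi', u').
\tag{276}
\]

Cartesian multipole of degree two versus ellipsoidal multipole of degree/order (2,0):

\[
J_{11} = A,\quad J_{22} = B,\quad J_{33} = C, \tag{277}
\]

\[
J_{pq} = \int dw_3\, (\|\mathbf{X}\|^2\delta_{pq} - X_p X_q)\, \rho(X, Y, Z) \quad \forall\, p, q \in \{1, 2, 3\},
\tag{278}
\]

\[
J_{11} = \int dw_3\, (Y^2 + Z^2)\, \rho(X, Y, Z),\quad
J_{22} = \int dw_3\, (X^2 + Z^2)\, \rho(X, Y, Z),\quad
J_{33} = \int dw_3\, (X^2 + Y^2)\, \rho(X, Y, Z),
\tag{279}
\]

\[
J_{20} = \frac{3}{2}\sqrt{5}\,\frac{1}{\varepsilon^2}\left( \frac{A + B}{2} - C + \frac{1}{3}M\varepsilon^2 \right).
\tag{280}
\]

The second condition in Cartesian multipole expansion:

\[
u_{20} = \frac{3}{8}\sqrt{5}\,\frac{1}{\varepsilon^3}\left[ \left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon} \right]
\left[ G(A + B - 2C) + \frac{2}{3}GM\varepsilon^2 \right],
\tag{281}
\]

\[
u_{20} = \frac{3}{2}\sqrt{5}\,\frac{1}{\varepsilon^3} Q^*_{20}\!\left(\frac{b}{\varepsilon}\right)
\left[ G\left( \frac{A + B}{2} - C \right) + \frac{1}{3}GM\varepsilon^2 \right].
\tag{282}
\]

Corollary:

\[
\left[ \frac{1}{4}\frac{GM}{\sqrt{a^2 - b^2}} + \frac{3}{8}\frac{G}{(a^2 - b^2)^{3/2}}(A + B - 2C) \right]
\left[ \left( \frac{3b^2}{a^2 - b^2} + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\sqrt{a^2 - b^2}}\right) - \frac{3b}{\sqrt{a^2 - b^2}} \right]
- \frac{1}{15}\Omega^2 a^2 = 0.
\tag{283}
\]

Corollary (second ellipsoidal condition):

\[
u_{20} = \frac{GM}{\varepsilon}\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) C_{20}
\tag{284}
\]

transforms the dimensionless ellipsoidal coefficient $C_{20}$ of degree/order (2,0) into the dimensioned ellipsoidal coefficient $u_{20}$ of degree/order (2,0). The second condition in terms of the ellipsoidal harmonic coefficient $C_{20}$ then reads

\[
\frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\!\left(\frac{b}{\sqrt{a^2 - b^2}}\right) C_{20} - \frac{1}{3\sqrt{5}}\Omega^2 a^2 = 0.
\tag{285}
\]

Lemma (World Geodetic Datum, Grafarend and Ardalan 1999). If the parameters $\{W_0, GM, C_{20}, \Omega\}$ are given, the Newton iteration of the two nonlinear condition equations is contractive and leads to $a = a(W_0, GM, C_{20}, \Omega)$, $b = b(W_0, GM, C_{20}, \Omega)$, and $\varepsilon = \varepsilon(W_0, GM, C_{20}, \Omega)$.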
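The iteration mentioned in the Lemma can be sketched as a two-dimensional Newton scheme applied to the condition equations (272) and (285). The following is a minimal illustration under stated assumptions, not the algorithm of Grafarend and Ardalan (1999): the Jacobian is approximated by central differences, the starting values, step size, and damping rule are arbitrary choices, and the numerical values of $GM$ and $\Omega$ are GRS80-like values assumed for the example.

```python
import math

# Assumed GRS80-like constants (illustrative only)
GM = 3.986005e14
OMEGA = 7.292115e-5

def u00(a, b):
    """Eq. (270): (GM/eps) arccot(b/eps), with arccot(b/eps) = arctan(eps/b)."""
    eps = math.sqrt(a * a - b * b)
    return GM / eps * math.atan2(eps, b)

def conditions(a, b, W0, C20):
    """Left-hand sides of the condition equations (272) and (285)."""
    f1 = u00(a, b) + OMEGA ** 2 * a * a / 3.0 - W0
    f2 = u00(a, b) * C20 - OMEGA ** 2 * a * a / (3.0 * math.sqrt(5.0))
    return f1, f2

def newton_ab(W0, C20, a=6.4e6, b=6.3e6, h=1.0, tol=1e-6, max_iter=50):
    """Newton iteration for (a, b); Jacobian by central differences (assumption)."""
    for _ in range(max_iter):
        f1, f2 = conditions(a, b, W0, C20)
        fap = conditions(a + h, b, W0, C20)
        fam = conditions(a - h, b, W0, C20)
        fbp = conditions(a, b + h, W0, C20)
        fbm = conditions(a, b - h, W0, C20)
        j11 = (fap[0] - fam[0]) / (2.0 * h)
        j21 = (fap[1] - fam[1]) / (2.0 * h)
        j12 = (fbp[0] - fbm[0]) / (2.0 * h)
        j22 = (fbp[1] - fbm[1]) / (2.0 * h)
        det = j11 * j22 - j12 * j21
        da = (-j22 * f1 + j12 * f2) / det
        db = (j21 * f1 - j11 * f2) / det
        # Simple damping so that a > b + 2h (eps stays real) is preserved
        while a + da <= b + db + 2.0 * h:
            da, db = 0.5 * da, 0.5 * db
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Synthetic check: derive (W0, C20) from known axes, then recover the axes.
a_true, b_true = 6378137.0, 6356752.3141
W0 = u00(a_true, b_true) + OMEGA ** 2 * a_true ** 2 / 3.0
C20 = OMEGA ** 2 * a_true ** 2 / (3.0 * math.sqrt(5.0)) / u00(a_true, b_true)
a_est, b_est = newton_ab(W0, C20)
eps_est = math.sqrt(a_est ** 2 - b_est ** 2)
```

Because both conditions contain the same function $u_{00}(a, b)$, eliminating $u_{00}$ gives $a$ in closed form; the Newton form is kept here only to mirror the statement of the Lemma.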

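6.2 Ellipsoidal Reference Gravity Intensity of Type Somigliana-Pizzetti

Vertical deflections $\{\eta(\lambda, \varphi, u), \xi(\lambda, \varphi, u)\}$, defined as the longitudinal and lateral derivatives of the disturbing potential, are normalized by means of the reference gravity intensity $\gamma(\lambda, \varphi, u) =: \|\operatorname{grad} w(\lambda, \varphi, u)\|$ (Eqs. (254)–(262), (285), and (286)). Here the aim is to compute the modulus of reference gravity with respect to the ellipsoidal reference potential of type Somigliana-Pizzetti. The detailed computation of $\|\operatorname{grad} w(\lambda, \varphi, u)\|$ is presented by means of Eqs. (286)–(291), two lemmas, and two corollaries. Since the reference potential of type Somigliana-Pizzetti $w(\varphi, u)$ depends only on spheroidal latitude $\varphi$ and spheroidal height $u$, the modulus of the reference gravity vector, Eq. (286), is a nonlinear operator based upon the lateral derivative $D_\varphi w$ and the vertical derivative $D_u w$. As soon as one departs from the standard representation of the gradient operator in orthogonal coordinates, namely, $\operatorname{grad} w = e_\lambda (g_{\lambda\lambda})^{-1/2} D_\lambda w + e_\varphi (g_{\varphi\varphi})^{-1/2} D_\varphi w + e_u (g_{uu})^{-1/2} D_u w$, one arrives at the standard form of $\|\operatorname{grad} w\|$ of type Eq. (286). Here, for the near-field computation, one shall assume $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2 < 1$. Accordingly, by means of Eq. (287), $\sqrt{1 + x}$ is expanded in a binomial series, which leads to the first-order approximation of $\gamma = \|\operatorname{grad} w\|$ by Eq. (288). Obviously, up to $O(2)$, it is sufficient to compute the vertical derivative $|D_u w|$. An explicit version of $D_u w$ is given by Eqs. (289) and (290), and in the form of ellipsoidal base functions by Eq. (291).

Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula:

\[
\begin{aligned}
\gamma &= \|\operatorname{grad} w(\varphi, u)\| = \sqrt{\langle \operatorname{grad} w(\varphi, u) \mid \operatorname{grad} w(\varphi, u) \rangle} \\
&= \sqrt{(u^2 + \varepsilon^2\sin^2\varphi)^{-1}(D_\varphi w)^2 + (u^2 + \varepsilon^2)(u^2 + \varepsilon^2\sin^2\varphi)^{-1}(D_u w)^2} \\
&= \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\ \sqrt{(D_u w)^2 + (u^2 + \varepsilon^2)^{-1}(D_\varphi w)^2}, \\
\gamma &= \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\ |D_u w|\ \sqrt{1 + (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2}, \\
x &:= (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2,
\end{aligned}
\tag{286}
\]

\[
\sqrt{1 + x} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + O(3) \quad \forall\, |x| < 1.
\tag{287}
\]

If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2 < 1$, then

\[
\gamma = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\ |D_u w| + O(2).
\tag{288}
\]

The first version of $D_u w$ reads

\[
D_u w = -\frac{GM}{u^2 + \varepsilon^2}
+ \frac{1}{6}\Omega^2 a^2 (3\sin^2\varphi - 1)\,
\frac{\dfrac{6u}{\varepsilon^2}\operatorname{arccot}\!\left(\dfrac{u}{\varepsilon}\right) - \left( 3\left(\dfrac{u}{\varepsilon}\right)^2 + 1 \right)\dfrac{\varepsilon}{u^2 + \varepsilon^2} - \dfrac{3}{\varepsilon}}
{\left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon}}
+ \Omega^2 u\cos^2\varphi,
\tag{289}
\]

\[
\begin{aligned}
D_u w = {} & -\frac{GM}{u^2 + \varepsilon^2}
- \frac{1}{6}\Omega^2 a^2 \frac{1}{u^2 + \varepsilon^2}\,
\frac{6\left( \left(\frac{u}{\varepsilon}\right)^2 + 1 \right) u \operatorname{arccot}\!\left(\frac{u}{\varepsilon}\right) - 2\varepsilon\left( 3\left(\frac{u}{\varepsilon}\right)^2 + 2 \right)}
{\left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon}} \\
& + \frac{1}{2}\Omega^2 a^2 \frac{1}{u^2 + \varepsilon^2}\,
\frac{6\left( \left(\frac{u}{\varepsilon}\right)^2 + 1 \right) u \operatorname{arccot}\!\left(\frac{u}{\varepsilon}\right) - 2\varepsilon\left( 3\left(\frac{u}{\varepsilon}\right)^2 + 2 \right)}
{\left( 3\left(\frac{b}{\varepsilon}\right)^2 + 1 \right)\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right) - 3\frac{b}{\varepsilon}}\,\sin^2\varphi
+ \Omega^2 u\cos^2\varphi.
\end{aligned}
\tag{290}
\]

The second version of $D_u w$ reads

\[
D_u w = \frac{GM}{\varepsilon}\,[Q^*_{00}(u/\varepsilon)]'
+ \frac{\sqrt{5}}{15}\Omega^2 a^2 P^*_{20}(\sin\varphi)\,\frac{[Q^*_{20}(u/\varepsilon)]'}{Q^*_{20}(b/\varepsilon)}
+ \Omega^2 u\cos^2\varphi.
\tag{291}
\]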
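Equations (288) and (289) can be evaluated directly. The sketch below computes the reference gravity intensity on the level ellipsoid and compares the closed-form derivative $D_u w$ with a numerical derivative of Eq. (264). It is an illustration under assumptions: the parameter values are GRS80-like values chosen for the example (not quoted from the text), the function names are hypothetical, and the comparison values for equatorial and polar normal gravity are the published GRS80 figures as recalled here.

```python
import math

# Assumed GRS80-like input parameters (illustrative only)
GM = 3.986005e14
OMEGA = 7.292115e-5
A = 6378137.0
B = 6356752.3141
EPS = math.sqrt(A * A - B * B)

def arccot(x):
    return math.atan2(1.0, x)

def q(x):
    """(3x^2 + 1) arccot(x) - 3x, the bracket of Eq. (264)."""
    return (3.0 * x * x + 1.0) * arccot(x) - 3.0 * x

def w_ref(phi, u):
    """Reference potential, Eq. (264)."""
    return (GM / EPS * arccot(u / EPS)
            + OMEGA ** 2 * A ** 2 / 6.0 * q(u / EPS) / q(B / EPS)
            * (3.0 * math.sin(phi) ** 2 - 1.0)
            + 0.5 * OMEGA ** 2 * (u * u + EPS * EPS) * math.cos(phi) ** 2)

def du_w(phi, u):
    """Vertical derivative D_u w, first version, Eq. (289)."""
    x = u / EPS
    dq = (6.0 * u / EPS ** 2 * arccot(x)
          - (3.0 * x * x + 1.0) * EPS / (u * u + EPS * EPS)
          - 3.0 / EPS)
    return (-GM / (u * u + EPS * EPS)
            + OMEGA ** 2 * A ** 2 / 6.0 * (3.0 * math.sin(phi) ** 2 - 1.0)
            * dq / q(B / EPS)
            + OMEGA ** 2 * u * math.cos(phi) ** 2)

def gamma_ref(phi, u):
    """Reference gravity intensity to first order, Eq. (288)."""
    scale = math.sqrt((u * u + EPS * EPS) / (u * u + (EPS * math.sin(phi)) ** 2))
    return scale * abs(du_w(phi, u))

gamma_equator = gamma_ref(0.0, B)
gamma_pole = gamma_ref(math.pi / 2.0, B)
```

On the level ellipsoid $D_\varphi w$ vanishes, so Eq. (288) is exact there and `gamma_equator` and `gamma_pole` should reproduce the equatorial and polar values of normal gravity for the assumed parameters.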

A more useful closed-form representation of the reference gravity intensity of type Somigliana-Pizzetti will be given by two lemmas and two corollaries. First, if one collects the coefficients of $\{1, \cos^2\varphi, \sin^2\varphi\}$ by $\{\gamma_0, \gamma_c, \gamma_s\}$, one is led to the representation of $\gamma$ by Eqs. (292)–(295) expressed in Lemma 1. Second, if one takes advantage of the $(a, b)$ representation of $\sqrt{u^2 + \varepsilon^2}/\sqrt{u^2 + \varepsilon^2\sin^2\varphi}$ and decomposes $\gamma_0$ according to $\gamma_0(\sin^2\varphi + \cos^2\varphi)$, one is led to the alternative elegant representation of $\gamma$ by Eqs. (296)–(303), presented in Lemma 2.

Lemma 1 (Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula).

If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2 \ll 1$ holds, where $w = w(\varphi, u)$ is the reference potential field of type Somigliana-Pizzetti, then its gravity field intensity $\gamma(\varphi, u)$ can be represented up to the order $O(2)$ by

\[
\gamma = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\ |\gamma_0 + \gamma_c\cos^2\varphi + \gamma_s\sin^2\varphi|
\tag{292}
\]

subject to

\[
\gamma_0 = -\frac{GM}{a^2 + (u + b)(u - b)}
- \frac{1}{3}\Omega^2 a^2 \frac{1}{a^2 + (u + b)(u - b)}\,
\frac{3((u/\varepsilon)^2 + 1)\, u \operatorname{arccot}(u/\varepsilon) - \varepsilon(3(u/\varepsilon)^2 + 2)}
{(3(b/\varepsilon)^2 + 1)\operatorname{arccot}(b/\varepsilon) - 3(b/\varepsilon)},
\tag{293}
\]

\[
\gamma_c = \Omega^2 u, \tag{294}
\]

\[
\gamma_s = \Omega^2 a^2 \frac{1}{a^2 + (u + b)(u - b)}\,
\frac{3((u/\varepsilon)^2 + 1)\, u \operatorname{arccot}(u/\varepsilon) - \varepsilon(3(u/\varepsilon)^2 + 2)}
{(3(b/\varepsilon)^2 + 1)\operatorname{arccot}(b/\varepsilon) - 3(b/\varepsilon)}.
\tag{295}
\]

Lemma 2 (Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula).

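If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w / D_u w)^2$ …

… can be represented by $\cos\Psi_X = \langle \mathbf{X}_2 - \mathbf{X}_1 \mid \mathbf{X}_3 - \mathbf{X}_1 \rangle / (\|\mathbf{X}_2 - \mathbf{X}_1\|_2\, \|\mathbf{X}_3 - \mathbf{X}_1\|_2)$. Here $\|\mathbf{X}_2 - \mathbf{X}_1\|_2$ and $\|\mathbf{X}_3 - \mathbf{X}_1\|_2$ denote the Euclidean lengths of the relative placement vectors $\mathbf{X}_2 - \mathbf{X}_1$ and $\mathbf{X}_3 - \mathbf{X}_1$, respectively. The transformation $T: \mathbb{W}^3_l \leftrightarrow \mathbb{W}^3_r$ leaves angles (“space angles”) and distance ratios equivariant, namely, $\cos\Psi_X = \cos\psi_x$ and $\|\mathbf{X}_2 - \mathbf{X}_1\|_2 / \|\mathbf{X}_3 - \mathbf{X}_1\|_2 = \|\mathbf{x}_2 - \mathbf{x}_1\|_2 / \|\mathbf{x}_3 - \mathbf{x}_1\|_2$, a property also called invariance under the similarity transformation:

\[
\{T \in C_7(3)\} = \left\{ T \in \mathbb{R}^{7}(3) \ \middle|\
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = T \begin{bmatrix} x \\ y \\ z \end{bmatrix},\
\cos\Psi_X = \cos\psi_x,\
\frac{\|\mathbf{X}_3 - \mathbf{X}_1\|_2}{\|\mathbf{X}_2 - \mathbf{X}_1\|_2} = \frac{\|\mathbf{x}_3 - \mathbf{x}_1\|_2}{\|\mathbf{x}_2 - \mathbf{x}_1\|_2}
\right\},
\tag{387}
\]

\[
T: \begin{bmatrix} x \\ y \\ z \end{bmatrix} \mapsto \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= (1 + s)\, R(\alpha, \beta, \gamma) \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix},
\tag{388}
\]

\[
R(\alpha, \beta, \gamma) = R_1(\alpha) R_2(\beta) R_3(\gamma)
= \begin{bmatrix}
\cos\beta\cos\gamma & \cos\beta\sin\gamma & -\sin\beta \\
\sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\cos\beta \\
\cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\cos\beta
\end{bmatrix}.
\tag{389}
\]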

For geodetic applications, it is sufficient to consider the conformal group $C_7(3)$ close to the identity. First, the rotation matrices are expanded around $R_1(0)$, $R_2(0)$, $R_3(0)$, an operation which produces the abstracted Pauli matrices $R_1'(0)$, $R_2'(0)$, $R_3'(0)$. Second, advantage is taken of the scale expansion $1 + s$ around 1. Third, one realizes that the translation parameters $t_x, t_y, t_z$ appear in linear form. In this way, the linearization of the nonlinear similarity transformation “close to the identity” leads one via Eqs. (390)–(395) to the forward transformation Eqs. (396)–(398), to the backward transformation Eqs. (399)–(401), as well as to the Cartesian coordinate increments $X - x = \delta x$, $Y - y = \delta y$, $Z - z = \delta z$, Eq. (402), namely, functions of the transformation parameter column $[t_x, t_y, t_z, \alpha, \beta, \gamma, s]'$:

\[
R_1(\alpha) = R_1(0) + R_1'(0)\alpha + O(\alpha^2), \tag{390}
\]

\[
R_1'(0) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}, \tag{391}
\]

\[
R_2(\beta) = R_2(0) + R_2'(0)\beta + O(\beta^2), \tag{392}
\]

\[
R_2'(0) = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \tag{393}
\]

\[
R_3(\gamma) = R_3(0) + R_3'(0)\gamma + O(\gamma^2), \tag{394}
\]

\[
R_3'(0) = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \tag{395}
\]

The forward transformation close to the identity:

\[
X = x + t_x - z\beta + y\gamma + xs + O_1(\alpha^2), \tag{396}
\]

\[
Y = y + t_y + z\alpha - x\gamma + ys + O_2(\beta^2), \tag{397}
\]

\[
Z = z + t_z - y\alpha + x\beta + zs + O_3(\gamma^2). \tag{398}
\]

The backward transformation close to the identity:

\[
x = X - t_x - Y\gamma + Z\beta - Xs + O_x(\alpha^2), \tag{399}
\]

\[
y = Y - t_y - Z\alpha + X\gamma - Ys + O_y(\beta^2), \tag{400}
\]

\[
z = Z - t_z - X\beta + Y\alpha - Zs + O_z(\gamma^2). \tag{401}
\]

Backward to forward:

\[
\begin{bmatrix} X - x \\ Y - y \\ Z - z \end{bmatrix}
=
\left[\begin{array}{ccc|ccc|c}
1 & 0 & 0 & 0 & -z & y & x \\
0 & 1 & 0 & z & 0 & -x & y \\
0 & 0 & 1 & -y & x & 0 & z
\end{array}\right]
\begin{bmatrix} t_x \\ t_y \\ t_z \\ \alpha \\ \beta \\ \gamma \\ s \end{bmatrix}.
\tag{402}
\]
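The quality of the linearization can be gauged by applying the exact similarity transformation of Eqs. (388) and (389) and the linearized forward transformation of Eqs. (396)–(398) to the same point. The sketch below is illustrative; the function names and the sample parameter values (rotations of about 0.2 to 0.4 arcsec, scale of 2 ppm) are assumptions.

```python
import math

def rotation(alpha, beta, gamma):
    """R(alpha, beta, gamma) = R1(alpha) R2(beta) R3(gamma), Eq. (389)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[cb * cg, cb * sg, -sb],
            [sa * sb * cg - ca * sg, sa * sb * sg + ca * cg, sa * cb],
            [ca * sb * cg + sa * sg, ca * sb * sg - sa * cg, ca * cb]]

def similarity_exact(p, t, angles, s):
    """Seven-parameter similarity transformation, Eq. (388)."""
    R = rotation(*angles)
    return tuple((1.0 + s) * sum(R[i][j] * p[j] for j in range(3)) + t[i]
                 for i in range(3))

def similarity_linear(p, t, angles, s):
    """Forward transformation close to the identity, Eqs. (396)-(398)."""
    x, y, z = p
    tx, ty, tz = t
    al, be, ga = angles
    return (x + tx - z * be + y * ga + x * s,
            y + ty + z * al - x * ga + y * s,
            z + tz - y * al + x * be + z * s)

# Assumed sample point (metres) and transformation parameters
p = (4.0e6, 1.0e6, 4.8e6)
t = (1.5, -0.7, 2.0)
angles = (1.0e-6, -2.0e-6, 1.5e-6)
s = 2.0e-6
P_exact = similarity_exact(p, t, angles, s)
P_linear = similarity_linear(p, t, angles, s)
max_difference = max(abs(a - b) for a, b in zip(P_exact, P_linear))
```

For parameters of this size the coordinate increments amount to several metres, while `max_difference` stays at the level of the neglected second-order terms.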

As soon as the Cartesian datum transformation “close to the identity” is established, one proceeds to derive the curvilinear datum transformation, namely, for spherical and spheroidal coordinates. First, Eqs. (403) and (404) generate the transformation of spherical longitude, latitude, and radius $(\Lambda, \Phi, R)$, $(\lambda, \varphi, r)$ to Cartesian coordinates $(X, Y, Z)$, $(x, y, z)$, respectively. Subject to Eqs. (405) and (406), by means of a Taylor expansion up to first-order terms, one succeeds in computing the transformation Eq. (407) of incremental coordinates $(\delta\lambda, \delta\varphi, \delta r)$ to $(\delta x, \delta y, \delta z)$ as well as the inverse transformation Eq. (408) of incremental coordinates $(\delta x, \delta y, \delta z)$ to $(\delta\lambda, \delta\varphi, \delta r)$. Such an inverse transformation is generated by inverting the Jacobi matrix $J(\lambda, \varphi, r)$ “at the point $(\lambda, \varphi, r)$.” As soon as one substitutes Eq. (402) into Eq. (408), one gains the final spherical datum transformation Eqs. (409)–(411). The variation of spherical coordinates $(\delta\lambda, \delta\varphi, \delta r)$ is affected by translational parameters $(t_x, t_y, t_z)$, by rotational parameters $(\alpha, \beta, \gamma)$, and by incremental scale $s$:

\[
X = R\cos\Phi\cos\Lambda,\quad Y = R\cos\Phi\sin\Lambda,\quad Z = R\sin\Phi \tag{403}
\]

versus

\[
x = r\cos\varphi\cos\lambda,\quad y = r\cos\varphi\sin\lambda,\quad z = r\sin\varphi. \tag{404}
\]

\[
X = x + \delta x,\quad Y = y + \delta y,\quad Z = z + \delta z, \tag{405}
\]

\[
\Lambda = \lambda + \delta\lambda,\quad \Phi = \varphi + \delta\varphi,\quad R = r + \delta r, \tag{406}
\]

\[
\begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix}
= \begin{bmatrix}
D_\Lambda X & D_\Phi X & D_R X \\
D_\Lambda Y & D_\Phi Y & D_R Y \\
D_\Lambda Z & D_\Phi Z & D_R Z
\end{bmatrix}(\lambda, \varphi, r)
\begin{bmatrix} \delta\lambda \\ \delta\varphi \\ \delta r \end{bmatrix},
\tag{407}
\]

\[
\begin{bmatrix} \delta\lambda \\ \delta\varphi \\ \delta r \end{bmatrix}
= \begin{bmatrix}
D_\Lambda X & D_\Phi X & D_R X \\
D_\Lambda Y & D_\Phi Y & D_R Y \\
D_\Lambda Z & D_\Phi Z & D_R Z
\end{bmatrix}^{-1}(\lambda, \varphi, r)
\begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix}.
\tag{408}
\]

(402) → (408):

\[
\delta\lambda = \frac{-t_x\sin\lambda + t_y\cos\lambda + r\sin\varphi\,(\alpha\cos\lambda + \beta\sin\lambda)}{r\cos\varphi} - \gamma,
\tag{409}
\]

\[
\delta\varphi = \frac{-\sin\varphi\,(t_x\cos\lambda + t_y\sin\lambda) + t_z\cos\varphi - r(\alpha\sin\lambda - \beta\cos\lambda)}{r},
\tag{410}
\]

\[
\delta r = \cos\varphi\,(t_x\cos\lambda + t_y\sin\lambda) + t_z\sin\varphi + rs.
\tag{411}
\]
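The spherical datum transformation can be checked against a direct computation: transform the Cartesian coordinates with the linearized forward transformation, convert back to spherical coordinates, and compare the coordinate differences with the closed-form increments. The sketch below is illustrative and makes one assumption explicit: the increment of spherical latitude is implemented without a scale contribution, which is what substituting Eqs. (396)–(398) into the inverse Jacobi matrix yields for a sphere. Function names and sample values are hypothetical.

```python
import math

def spherical_increments(lam, phi, r, t, angles, s):
    """First-order increments (dlam, dphi, dr), cf. Eqs. (409)-(411)."""
    tx, ty, tz = t
    al, be, ga = angles
    dlam = ((-tx * math.sin(lam) + ty * math.cos(lam)
             + r * math.sin(phi) * (al * math.cos(lam) + be * math.sin(lam)))
            / (r * math.cos(phi)) - ga)
    dphi = ((-math.sin(phi) * (tx * math.cos(lam) + ty * math.sin(lam))
             + tz * math.cos(phi)) / r
            - (al * math.sin(lam) - be * math.cos(lam)))
    dr = (math.cos(phi) * (tx * math.cos(lam) + ty * math.sin(lam))
          + tz * math.sin(phi) + r * s)
    return dlam, dphi, dr

def direct_increments(lam, phi, r, t, angles, s):
    """Same increments from Eqs. (396)-(398) followed by exact conversion."""
    tx, ty, tz = t
    al, be, ga = angles
    x = r * math.cos(phi) * math.cos(lam)
    y = r * math.cos(phi) * math.sin(lam)
    z = r * math.sin(phi)
    X = x + tx - z * be + y * ga + x * s
    Y = y + ty + z * al - x * ga + y * s
    Z = z + tz - y * al + x * be + z * s
    R = math.sqrt(X * X + Y * Y + Z * Z)
    return math.atan2(Y, X) - lam, math.asin(Z / R) - phi, R - r

# Assumed sample values
lam, phi, r = 0.3, 0.8, 6.371e6
t = (1.0, -2.0, 0.5)
angles = (2.0e-7, -1.0e-7, 3.0e-7)
s = 1.0e-7
closed = spherical_increments(lam, phi, r, t, angles, s)
direct = direct_increments(lam, phi, r, t, angles, s)
```

The two triples should agree up to second-order terms in the transformation parameters.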

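Second, Eqs. (412) and (413) generate the transformation of spheroidal longitude, latitude, and semiminor axis $(\Lambda, \Phi, U)$, $(\lambda, \varphi, u)$ to Cartesian coordinates $(X, Y, Z)$, $(x, y, z)$, respectively. In Eq. (414), $\varepsilon^2$ denotes the squared absolute eccentricity, the difference of the squared semimajor axis, $A^2$, $a^2$, and the squared semiminor axis, $B^2$, $b^2$, of the International Reference Ellipsoid. Indeed, such an ellipsoid of reference is fixed by $A = a$, $B = b$. For a detailed introduction into spheroidal coordinates, namely, special ellipsoidal coordinates, refer to Thong and Grafarend (1989). Subject to Eqs. (415) and (416), by means of a Taylor expansion up to first-order terms, one succeeds in computing the transformation (Eq. (417)) of incremental coordinates $(\delta\lambda, \delta\varphi, \delta u)$ to $(\delta x, \delta y, \delta z)$ as well as the inverse transformation (Eq. (418)) of incremental coordinates $(\delta x, \delta y, \delta z)$ to $(\delta\lambda, \delta\varphi, \delta u)$. Such an inverse transformation is generated by inverting the Jacobi matrix $J(\lambda, \varphi, u)$ “at the point $(\lambda, \varphi, u)$.” As soon as one substitutes Eq. (402) into Eq. (418), one arrives at the final spheroidal datum transformation Eqs. (419)–(421). The variation of spheroidal coordinates $(\delta\lambda, \delta\varphi, \delta u)$ is caused by the parameters of type translation $(t_x, t_y, t_z)$, rotation $(\alpha, \beta, \gamma)$, and incremental scale $s$.

Spheroidal datum transformation close to the identity:

\[
X = \sqrt{\varepsilon^2 + U^2}\cos\Phi\cos\Lambda,\quad
Y = \sqrt{\varepsilon^2 + U^2}\cos\Phi\sin\Lambda,\quad
Z = U\sin\Phi
\tag{412}
\]

versus

\[
x = \sqrt{\varepsilon^2 + u^2}\cos\varphi\cos\lambda,\quad
y = \sqrt{\varepsilon^2 + u^2}\cos\varphi\sin\lambda,\quad
z = u\sin\varphi
\tag{413}
\]

subject to

\[
\varepsilon^2 = A^2 - B^2 = a^2 - b^2, \tag{414}
\]

\[
X = x + \delta x,\quad Y = y + \delta y,\quad Z = z + \delta z, \tag{415}
\]

\[
\Lambda = \lambda + \delta\lambda,\quad \Phi = \varphi + \delta\varphi,\quad U = u + \delta u, \tag{416}
\]

\[
\begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix}
= \begin{bmatrix}
D_\Lambda X & D_\Phi X & D_U X \\
D_\Lambda Y & D_\Phi Y & D_U Y \\
D_\Lambda Z & D_\Phi Z & D_U Z
\end{bmatrix}(\lambda, \varphi, u)
\begin{bmatrix} \delta\lambda \\ \delta\varphi \\ \delta u \end{bmatrix},
\tag{417}
\]

\[
\begin{bmatrix} \delta\lambda \\ \delta\varphi \\ \delta u \end{bmatrix}
= \begin{bmatrix}
D_\Lambda X & D_\Phi X & D_U X \\
D_\Lambda Y & D_\Phi Y & D_U Y \\
D_\Lambda Z & D_\Phi Z & D_U Z
\end{bmatrix}^{-1}(\lambda, \varphi, u)
\begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix}.
\tag{418}
\]

(402) → (418):

\[
\delta\lambda = \frac{-t_x\sin\lambda + t_y\cos\lambda + u\sin\varphi\,(\alpha\cos\lambda + \beta\sin\lambda)}{\varepsilon\sqrt{1 + \left(\frac{u}{\varepsilon}\right)^2}\cos\varphi} - \gamma,
\tag{419}
\]

\[
\delta\varphi = \frac{1}{\varepsilon}\,\frac{1}{\left(\frac{u}{\varepsilon}\right)^2 + \sin^2\varphi}
\left( -\sqrt{1 + \left(\frac{u}{\varepsilon}\right)^2}\,\sin\varphi\,(t_x\cos\lambda + t_y\sin\lambda) + \frac{u}{\varepsilon}\,t_z\cos\varphi
- u\sqrt{1 + \left(\frac{u}{\varepsilon}\right)^2}\,(\alpha\sin\lambda - \beta\cos\lambda) - \varepsilon s\cos\varphi\sin\varphi \right),
\tag{420}
\]

\[
\delta u = \frac{1}{\left(\frac{u}{\varepsilon}\right)^2 + \sin^2\varphi}
\left( \frac{u}{\varepsilon}\sqrt{1 + \left(\frac{u}{\varepsilon}\right)^2}\,\cos\varphi\,(t_x\cos\lambda + t_y\sin\lambda) + t_z\left( 1 + \left(\frac{u}{\varepsilon}\right)^2 \right)\sin\varphi
- \varepsilon\sqrt{1 + \left(\frac{u}{\varepsilon}\right)^2}\,\cos\varphi\sin\varphi\,(\alpha\sin\lambda - \beta\cos\lambda) + s\,\varepsilon\,\frac{u}{\varepsilon}\left( 1 + \left(\frac{u}{\varepsilon}\right)^2 \right) \right).
\tag{421}
\]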
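The spheroidal increments can be cross-checked in the same way as the spherical ones: apply the linearized Cartesian transformation, convert back to Jacobi ellipsoidal coordinates, and compare. The sketch below is an illustration under assumptions. The closed forms are written with $c = \sqrt{u^2 + \varepsilon^2}$, which is algebraically equivalent to the $(u/\varepsilon)$ form obtained by substituting Eq. (402) into Eq. (418); the signs follow from that substitution; latitude is recovered from $Z = U\sin\Phi$; function names and sample values are hypothetical.

```python
import math

def forward(lam, phi, u, eps):
    """Eq. (413): spheroidal (lam, phi, u) -> Cartesian (x, y, z)."""
    c = math.sqrt(u * u + eps * eps)
    return (c * math.cos(phi) * math.cos(lam),
            c * math.cos(phi) * math.sin(lam),
            u * math.sin(phi))

def backward(X, Y, Z, eps):
    """Inverse of Eq. (412); latitude from Z = U sin(Phi)."""
    S = X * X + Y * Y + Z * Z - eps * eps
    U = math.sqrt(0.5 * (S + math.sqrt(S * S + 4.0 * eps * eps * Z * Z)))
    return math.atan2(Y, X), math.asin(Z / U), U

def spheroidal_increments(lam, phi, u, eps, t, angles, s):
    """First-order increments (dlam, dphi, du), cf. Eqs. (419)-(421)."""
    tx, ty, tz = t
    al, be, ga = angles
    c = math.sqrt(u * u + eps * eps)
    w2 = u * u + (eps * math.sin(phi)) ** 2
    th = tx * math.cos(lam) + ty * math.sin(lam)
    rot = al * math.sin(lam) - be * math.cos(lam)
    dlam = ((-tx * math.sin(lam) + ty * math.cos(lam)
             + u * math.sin(phi) * (al * math.cos(lam) + be * math.sin(lam)))
            / (c * math.cos(phi)) - ga)
    dphi = (-c * math.sin(phi) * th + u * math.cos(phi) * tz
            - u * c * rot - s * eps * eps * math.sin(phi) * math.cos(phi)) / w2
    du = (u * c * math.cos(phi) * th + c * c * tz * math.sin(phi)
          - eps * eps * c * math.cos(phi) * math.sin(phi) * rot
          + s * u * c * c) / w2
    return dlam, dphi, du

# Assumed sample values
eps = 521854.0
lam, phi, u = 0.3, 0.8, 6.36e6
t = (1.0, -2.0, 0.5)
angles = (2.0e-7, -1.0e-7, 3.0e-7)
s = 1.0e-7
x, y, z = forward(lam, phi, u, eps)
X = x + t[0] - z * angles[1] + y * angles[2] + x * s     # Eq. (396)
Y = y + t[1] + z * angles[0] - x * angles[2] + y * s     # Eq. (397)
Z = z + t[2] - y * angles[0] + x * angles[1] + z * s     # Eq. (398)
L2, P2, U2 = backward(X, Y, Z, eps)
direct = (L2 - lam, P2 - phi, U2 - u)
closed = spheroidal_increments(lam, phi, u, eps, t, angles, s)
```

In contrast to the spherical case, the scale parameter $s$ contributes to the latitude increment, because scaling the Cartesian coordinates changes the position relative to the fixed focal distance $\varepsilon$.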

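10 Datum Transformations in Terms of Spherical Harmonic Coefficients

For the datum transformation of spherical harmonic coefficients, the potential Eq. (422) in a three-dimensional Weitzenböck space $\mathbb{W}^3_l$ is given by known spherical harmonic coefficients $\bar{c}_{nm}$ and $\bar{s}_{nm}$. There is also given another potential Eq. (423) in a three-dimensional Weitzenböck space $\mathbb{W}^3_r$ with unknown spherical harmonic coefficients $\bar{C}_{nm}$ and $\bar{S}_{nm}$. With the Taylor expansion Eq. (424) and a comparison of coefficients, the unknown coefficients in $\mathbb{W}^3_r$ can be determined.

Potentials in three-dimensional Weitzenböck spaces $\mathbb{W}^3_l$ and $\mathbb{W}^3_r$ and Taylor expansion:

\[
U(\lambda, \varphi, r) = \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda) \quad \text{in } \mathbb{W}^3_l,
\tag{422}
\]

\[
U(\Lambda, \Phi, R) = \frac{GM}{R_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{R_0}{R}\right)^{n+1}\bar{P}_{nm}(\sin\Phi)\,(\bar{C}_{nm}\cos m\Lambda + \bar{S}_{nm}\sin m\Lambda) \quad \text{in } \mathbb{W}^3_r,
\tag{423}
\]

\[
U(\Lambda, \Phi, R) = U(\lambda, \varphi, r) + D_\lambda U(\lambda, \varphi, r)\,\delta\lambda + D_\varphi U(\lambda, \varphi, r)\,\delta\varphi + D_r U(\lambda, \varphi, r)\,\delta r + O(r^2).
\tag{424}
\]

The Taylor expansion Eq. (424) of the known potential in $\mathbb{W}^3_l$ leads to Eq. (425), where the partial derivative with respect to $\varphi$ will be replaced by the recursive formula Eq. (426). The result is given in Eq. (427).

Equation after Taylor expansion:

\[
\begin{aligned}
U(\Lambda, \Phi, R) = {} & \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda) \\
& + \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1} m\,\bar{P}_{nm}(\sin\varphi)\,(-\bar{c}_{nm}\sin m\lambda + \bar{s}_{nm}\cos m\lambda)\,\delta\lambda \\
& + \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\frac{d\bar{P}_{nm}(\sin\varphi)}{d\varphi}\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda)\,\delta\varphi \\
& - \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{n + 1}{r}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda)\,\delta r.
\end{aligned}
\tag{425}
\]

Recursive formula for the partial derivative with respect to $\varphi$:

\[
\frac{d\bar{P}_{nm}(\sin\varphi)}{d\varphi} = -m\tan\varphi\,\bar{P}_{nm}(\sin\varphi) + \sqrt{\frac{(2 - \delta_{0m})(n - m)(n + m + 1)}{2}}\,\bar{P}_{n,m+1}(\sin\varphi).
\tag{426}
\]

Equation with the partial derivative with respect to $\varphi$ inserted:

\[
\begin{aligned}
U(\Lambda, \Phi, R) = {} & \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda) \\
& + \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1} m\,\bar{P}_{nm}(\sin\varphi)\,(-\bar{c}_{nm}\sin m\lambda + \bar{s}_{nm}\cos m\lambda)\,\delta\lambda \\
& - \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1} m\tan\varphi\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda)\,\delta\varphi \\
& + \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\sqrt{\frac{(2 - \delta_{0m})(n - m)(n + m + 1)}{2}}\,\bar{P}_{n,m+1}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda)\,\delta\varphi \\
& - \frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{n + 1}{r}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda + \bar{s}_{nm}\sin m\lambda)\,\delta r.
\end{aligned}
\tag{427}
\]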

In Eq. (427) there are products of trigonometric functions with the arguments $\lambda$ and $m\lambda$, which can be replaced by the arguments $(m+1)\lambda$ and $(m-1)\lambda$ by applying addition theorems, cf. Klapp (2002a, b). For better illustration, the result in Eq. (428) is summarized by transformation parameters.

The equations are summarized by transformation parameters after application of the addition theorems:

$$U(\Lambda,\Phi,R)=\frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\left(U^{P}_{nm}+U^{t_x,t_y}_{nm}+U^{t_z}_{nm}+U^{\alpha,\beta}_{nm}+U^{\gamma}_{nm}+U^{s}_{nm}\right), \tag{428}$$

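where

$$U^{P}_{nm}=\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda),$$

$$
\begin{aligned}
U^{t_x,t_y}_{nm}={}&\frac{1}{2r}\left[\left(\frac{2m}{\cos\varphi}-(n+m+1)\cos\varphi\right)\bar{P}_{nm}(\sin\varphi)-\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\,\sin\varphi\,\bar{P}_{n,m+1}(\sin\varphi)\right]\\
&\times\left[(t_x\bar{c}_{nm}+t_y\bar{s}_{nm})\cos(m-1)\lambda+(t_x\bar{s}_{nm}-t_y\bar{c}_{nm})\sin(m-1)\lambda\right]\\
&-\frac{1}{2r}\left[(n+m+1)\cos\varphi\,\bar{P}_{nm}(\sin\varphi)+\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\,\sin\varphi\,\bar{P}_{n,m+1}(\sin\varphi)\right]\\
&\times\left[(t_x\bar{c}_{nm}-t_y\bar{s}_{nm})\cos(m+1)\lambda+(t_x\bar{s}_{nm}+t_y\bar{c}_{nm})\sin(m+1)\lambda\right],
\end{aligned}
$$

$$U^{t_z}_{nm}=\frac{1}{r}\left[-(n+m+1)\sin\varphi\,\bar{P}_{nm}(\sin\varphi)+\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\,\cos\varphi\,\bar{P}_{n,m+1}(\sin\varphi)\right](\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,t_z,$$

$$
\begin{aligned}
U^{\alpha,\beta}_{nm}={}&-\left(m\tan\varphi\,\bar{P}_{nm}(\sin\varphi)-\frac{1}{2}\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\,\bar{P}_{n,m+1}(\sin\varphi)\right)\\
&\times\left[(\beta\bar{c}_{nm}-\alpha\bar{s}_{nm})\cos(m-1)\lambda+(\beta\bar{s}_{nm}+\alpha\bar{c}_{nm})\sin(m-1)\lambda\right]\\
&+\frac{1}{2}\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\,\bar{P}_{n,m+1}(\sin\varphi)\\
&\times\left[(\beta\bar{c}_{nm}+\alpha\bar{s}_{nm})\cos(m+1)\lambda+(\beta\bar{s}_{nm}-\alpha\bar{c}_{nm})\sin(m+1)\lambda\right],
\end{aligned}
$$

$$U^{\gamma}_{nm}=m\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\sin m\lambda-\bar{s}_{nm}\cos m\lambda)\,\gamma,$$

$$U^{s}_{nm}=-(n+1)\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,s.$$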
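The addition theorems used to pass from Eq. (427) to Eq. (428) turn products such as $\cos\lambda\cos m\lambda$ into sums over the arguments $(m-1)\lambda$ and $(m+1)\lambda$. The sketch below checks this for the translation part, assuming that the radial variation contains the factor $t_x\cos\lambda+t_y\sin\lambda$ (the usual form, not restated in this section); function names and test values are illustrative.

```python
import math
import random

def product_form(tx, ty, c, s, m, lam):
    """(tx cos(lam) + ty sin(lam)) * (c cos(m lam) + s sin(m lam))."""
    return (tx * math.cos(lam) + ty * math.sin(lam)) * (c * math.cos(m * lam) + s * math.sin(m * lam))

def sum_form(tx, ty, c, s, m, lam):
    """Same quantity after the addition theorems, in the bracket pattern of Eq. (428)."""
    lower = (tx * c + ty * s) * math.cos((m - 1) * lam) + (tx * s - ty * c) * math.sin((m - 1) * lam)
    upper = (tx * c - ty * s) * math.cos((m + 1) * lam) + (tx * s + ty * c) * math.sin((m + 1) * lam)
    return 0.5 * (lower + upper)

random.seed(0)
max_diff = 0.0
for _ in range(1000):
    tx, ty, c, s = (random.uniform(-1.0, 1.0) for _ in range(4))
    m = random.randint(0, 10)
    lam = random.uniform(-math.pi, math.pi)
    max_diff = max(max_diff, abs(product_form(tx, ty, c, s, m, lam) - sum_form(tx, ty, c, s, m, lam)))
print(max_diff)
```

The two brackets are exactly the combinations of $t_x$, $t_y$, $\bar{c}_{nm}$, and $\bar{s}_{nm}$ that multiply $\cos(m\mp1)\lambda$ and $\sin(m\mp1)\lambda$ in $U^{t_x,t_y}_{nm}$.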

In the next step, the products of trigonometric functions of argument $\varphi$ with Legendre functions of the first kind have to be replaced by the recursive formulas given in the Appendix (observe the remarks at the end of the chapter). With the result in Eq. (429), it is possible to shift the indices from $n$ to $n-1$ for $n+1$, from $m$ to $m-1$ for $m+1$, and from $m$ to $m+1$ for $m-1$. It is important to notice that in the case of $m=0$ the index shift $m$ to $m+1$ leads to the factor $1+\delta_{1m}$ in the terms of the index shift $m$ to $m-1$. Equation (430) is the equation used to compare the coefficients, which leads to the equations for the datum transformation of spherical harmonic coefficients, Eq. (431).

The equations after application of the recursive formulas for Legendre functions of the first kind, the index shifts, and the equations for the datum transformation of spherical harmonics are as follows:

$$U(\Lambda,\Phi,R)=\frac{GM}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\left(U^{P}_{nm}+U^{t_x,t_y}_{nm}+U^{t_z}_{nm}+U^{\alpha,\beta}_{nm}+U^{\gamma}_{nm}+U^{s}_{nm}\right), \tag{429}$$

where

$$U^{P}_{nm}=\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda),$$

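$$
\begin{aligned}
U^{t_x,t_y}_{nm}={}&\frac{2n+1}{2r}\sqrt{\frac{(2-\delta_{0m})(n-m+1)(n-m+2)}{(2-\delta_{1m})(2n+1)(2n+3)}}\;\bar{P}_{n+1,m-1}(\sin\varphi)\left[(t_x\bar{c}_{nm}+t_y\bar{s}_{nm})\cos(m-1)\lambda+(t_x\bar{s}_{nm}-t_y\bar{c}_{nm})\sin(m-1)\lambda\right]\\
&-\frac{2n+1}{2r}\sqrt{\frac{(2-\delta_{0m})(n+m+1)(n+m+2)}{2(2n+1)(2n+3)}}\;\bar{P}_{n+1,m+1}(\sin\varphi)\left[(t_x\bar{c}_{nm}-t_y\bar{s}_{nm})\cos(m+1)\lambda+(t_x\bar{s}_{nm}+t_y\bar{c}_{nm})\sin(m+1)\lambda\right],
\end{aligned}
$$

$$U^{t_z}_{nm}=-\frac{2n+1}{r}\sqrt{\frac{(n-m+1)(n+m+1)}{(2n+1)(2n+3)}}\;\bar{P}_{n+1,m}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,t_z,$$

$$
\begin{aligned}
U^{\alpha,\beta}_{nm}={}&-\frac{1}{2}\sqrt{\frac{(2-\delta_{0m})(n-m+1)(n+m)}{2-\delta_{1m}}}\;\bar{P}_{n,m-1}(\sin\varphi)\left[(\beta\bar{c}_{nm}-\alpha\bar{s}_{nm})\cos(m-1)\lambda+(\beta\bar{s}_{nm}+\alpha\bar{c}_{nm})\sin(m-1)\lambda\right]\\
&+\frac{1}{2}\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\;\bar{P}_{n,m+1}(\sin\varphi)\left[(\beta\bar{c}_{nm}+\alpha\bar{s}_{nm})\cos(m+1)\lambda+(\beta\bar{s}_{nm}-\alpha\bar{c}_{nm})\sin(m+1)\lambda\right],
\end{aligned}
$$

$$U^{\gamma}_{nm}=m\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\sin m\lambda-\bar{s}_{nm}\cos m\lambda)\,\gamma,$$

$$U^{s}_{nm}=-(n+1)\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,s.$$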

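The equation after the index shifts $n\to n-1$, $m\to m-1$, $m\to m+1$ with $\bar{c}_{-1,m}=\bar{s}_{-1,m}=0$ and before the comparison of the coefficients $\bar{C}_{nm}$, $\bar{S}_{nm}$ and $\bar{c}_{nm}$, $\bar{s}_{nm}$ with $R_0=r_0$:

$$
\begin{aligned}
&\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{R_0}{R}\right)^{n+1}\bar{P}_{nm}(\sin\Phi)\,(\bar{C}_{nm}\cos m\Lambda+\bar{S}_{nm}\sin m\Lambda)\\
&\quad=\frac{R_0}{r_0}\sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(\frac{r_0}{r}\right)^{n+1}\bar{P}_{nm}(\sin\varphi)\left(U^{P}_{nm}+U^{t_x,t_y}_{nm}+U^{t_z}_{nm}+U^{\alpha,\beta}_{nm}+U^{\gamma}_{nm}+U^{s}_{nm}\right),
\end{aligned}
\tag{430}
$$

where

$$U^{P}_{nm}=\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda,$$

$$
\begin{aligned}
U^{t_x,t_y}_{nm}={}&\frac{2n-1}{2r_0}\sqrt{\frac{2(n-m-1)(n-m)}{(2-\delta_{0m})(2n-1)(2n+1)}}\left[(t_x\bar{c}_{n-1,m+1}+t_y\bar{s}_{n-1,m+1})\cos m\lambda+(t_x\bar{s}_{n-1,m+1}-t_y\bar{c}_{n-1,m+1})\sin m\lambda\right]\\
&-\frac{(1+\delta_{1m})(2n-1)}{2r_0}\sqrt{\frac{(2-\delta_{1m})(n+m-1)(n+m)}{2(2n-1)(2n+1)}}\left[(t_x\bar{c}_{n-1,m-1}-t_y\bar{s}_{n-1,m-1})\cos m\lambda+(t_x\bar{s}_{n-1,m-1}+t_y\bar{c}_{n-1,m-1})\sin m\lambda\right],
\end{aligned}
$$

$$U^{t_z}_{nm}=-\frac{2n-1}{r_0}\sqrt{\frac{(n-m)(n+m)}{(2n-1)(2n+1)}}\,(\bar{c}_{n-1,m}\cos m\lambda+\bar{s}_{n-1,m}\sin m\lambda)\,t_z,$$

$$
\begin{aligned}
U^{\alpha,\beta}_{nm}={}&-\frac{1}{2}\sqrt{\frac{2(n-m)(n+m+1)}{2-\delta_{0m}}}\left[(\beta\bar{c}_{n,m+1}-\alpha\bar{s}_{n,m+1})\cos m\lambda+(\beta\bar{s}_{n,m+1}+\alpha\bar{c}_{n,m+1})\sin m\lambda\right]\\
&+\frac{1+\delta_{1m}}{2}\sqrt{\frac{(2-\delta_{1m})(n-m+1)(n+m)}{2}}\left[(\beta\bar{c}_{n,m-1}+\alpha\bar{s}_{n,m-1})\cos m\lambda+(\beta\bar{s}_{n,m-1}-\alpha\bar{c}_{n,m-1})\sin m\lambda\right],
\end{aligned}
$$

$$U^{\gamma}_{nm}=m\,(\bar{c}_{nm}\sin m\lambda-\bar{s}_{nm}\cos m\lambda)\,\gamma,$$

$$U^{s}_{nm}=-(n+1)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,s.$$

Equations for the datum transformation of spherical harmonics: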

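$$
\begin{aligned}
\begin{Bmatrix}\bar{C}_{nm}\\ \bar{S}_{nm}\end{Bmatrix}={}&\frac{1+\delta_{1m}}{2}\sqrt{\frac{(2-\delta_{1m})(n-m+1)(n+m)}{2}}\left(\beta\begin{Bmatrix}\bar{c}_{n,m-1}\\ \bar{s}_{n,m-1}\end{Bmatrix}\pm\alpha\begin{Bmatrix}\bar{s}_{n,m-1}\\ \bar{c}_{n,m-1}\end{Bmatrix}\right)\\
&+\bigl(1-(n+1)s\bigr)\begin{Bmatrix}\bar{c}_{nm}\\ \bar{s}_{nm}\end{Bmatrix}\mp m\,\gamma\begin{Bmatrix}\bar{s}_{nm}\\ \bar{c}_{nm}\end{Bmatrix}\\
&-\frac{1}{2}\sqrt{\frac{2(n-m)(n+m+1)}{2-\delta_{0m}}}\left(\beta\begin{Bmatrix}\bar{c}_{n,m+1}\\ \bar{s}_{n,m+1}\end{Bmatrix}\mp\alpha\begin{Bmatrix}\bar{s}_{n,m+1}\\ \bar{c}_{n,m+1}\end{Bmatrix}\right)\\
&-\frac{(2n-1)(1+\delta_{1m})}{2r_0}\sqrt{\frac{(2-\delta_{1m})(n+m-1)(n+m)}{2(2n-1)(2n+1)}}\left(t_x\begin{Bmatrix}\bar{c}_{n-1,m-1}\\ \bar{s}_{n-1,m-1}\end{Bmatrix}\mp t_y\begin{Bmatrix}\bar{s}_{n-1,m-1}\\ \bar{c}_{n-1,m-1}\end{Bmatrix}\right)\\
&-\frac{2n-1}{r_0}\sqrt{\frac{(n-m)(n+m)}{(2n-1)(2n+1)}}\;t_z\begin{Bmatrix}\bar{c}_{n-1,m}\\ \bar{s}_{n-1,m}\end{Bmatrix}\\
&+\frac{2n-1}{2r_0}\sqrt{\frac{2(n-m-1)(n-m)}{(2-\delta_{0m})(2n-1)(2n+1)}}\left(t_x\begin{Bmatrix}\bar{c}_{n-1,m+1}\\ \bar{s}_{n-1,m+1}\end{Bmatrix}\pm t_y\begin{Bmatrix}\bar{s}_{n-1,m+1}\\ \bar{c}_{n-1,m+1}\end{Bmatrix}\right).
\end{aligned}
\tag{431}
$$

In the double signs, the upper sign belongs to $\bar{C}_{nm}$ and the lower sign to $\bar{S}_{nm}$.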
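The scalar factors of Eq. (431) reduce to simple closed forms for low degree and order, which is how the explicit example equations of Sect. 12 arise. The following sketch evaluates them under the reading of Eq. (431) in which the $t_z$ term carries the factor $(2n-1)/r_0$ and the rotation and translation terms carry the square-root factors shown; the function names are hypothetical and the sketch is intended for $n\ge1$.

```python
import math

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def f_rot_lower(n, m):
    """Factor of the (beta, alpha) term with coefficients of order m - 1."""
    return (1 + delta(1, m)) / 2 * math.sqrt((2 - delta(1, m)) * (n - m + 1) * (n + m) / 2)

def f_rot_upper(n, m):
    """Factor of the (beta, alpha) term with coefficients of order m + 1 (enters with a minus sign)."""
    return 0.5 * math.sqrt(2 * (n - m) * (n + m + 1) / (2 - delta(0, m)))

def f_tz(n, m, r0):
    """Factor of the t_z term with coefficients of degree n - 1 (enters with a minus sign)."""
    return (2 * n - 1) / r0 * math.sqrt((n - m) * (n + m) / ((2 * n - 1) * (2 * n + 1)))

def f_txy_lower(n, m, r0):
    """Factor of the (t_x, t_y) term with coefficients of degree n - 1, order m - 1 (minus sign)."""
    root = math.sqrt((2 - delta(1, m)) * (n + m - 1) * (n + m) / (2 * (2 * n - 1) * (2 * n + 1)))
    return (2 * n - 1) * (1 + delta(1, m)) / (2 * r0) * root

def f_txy_upper(n, m, r0):
    """Factor of the (t_x, t_y) term with coefficients of degree n - 1, order m + 1 (plus sign)."""
    root = math.sqrt(2 * (n - m - 1) * (n - m) / ((2 - delta(0, m)) * (2 * n - 1) * (2 * n + 1)))
    return (2 * n - 1) / (2 * r0) * root

r0 = 6_370_283.158  # illustrative radius in metres
print(f_tz(1, 0, r0), f_txy_lower(1, 1, r0))      # both equal sqrt(3) / (3 r0)
print(f_rot_lower(2, 1), f_rot_upper(2, 1))       # sqrt(3) and 1
print(f_tz(2, 1, r0), f_txy_lower(2, 1, r0))      # both equal 3 sqrt(5) / (5 r0)
```

For $n=1$, $m=0$ only the $t_z$, scale, and upper rotation terms survive, and for $n=2$, $m=1$ the upper translation term vanishes because $n-m-1=0$.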

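11 Datum Transformations of Ellipsoidal Harmonic Coefficients

The datum transformation of spheroidal harmonic coefficients is quite similar to the datum transformation of spherical harmonic coefficients. Here, one starts with the potential Eq. (432) with known spheroidal harmonic coefficients $\bar{c}_{nm}$ and $\bar{s}_{nm}$ in a three-dimensional Weizenböck space $\mathbb{W}^3_l$. Another potential, Eq. (433), is given by unknown spheroidal harmonic coefficients $\bar{C}_{nm}$ and $\bar{S}_{nm}$ in a three-dimensional Weizenböck space $\mathbb{W}^3_r$. With the Taylor expansion Eq. (434) and a comparison of coefficients, the unknown coefficients in $\mathbb{W}^3_r$ can be determined.

Potentials in the three-dimensional Weizenböck spaces $\mathbb{W}^3_l$ and $\mathbb{W}^3_r$ and Taylor expansion:

$$U(\lambda,\varphi,u)=\frac{GM}{\varepsilon}\,\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right)\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\quad\text{in }\mathbb{W}^3_l, \tag{432}$$

$$U(\Lambda,\Phi,U)=\frac{GM}{E}\,\operatorname{arccot}\!\left(\frac{B}{E}\right)\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{U}{E}\right)}{Q^{*}_{nm}\!\left(\frac{B}{E}\right)}\,\bar{P}_{nm}(\sin\Phi)\,(\bar{C}_{nm}\cos m\Lambda+\bar{S}_{nm}\sin m\Lambda)\quad\text{in }\mathbb{W}^3_r, \tag{433}$$

$$U(\Lambda,\Phi,U)=U(\lambda,\varphi,u)+D_{\lambda}U(\lambda,\varphi,u)\,\delta\lambda+D_{\varphi}U(\lambda,\varphi,u)\,\delta\varphi+D_{u/\varepsilon}U(\lambda,\varphi,u)\,\frac{\delta u}{\varepsilon}+O(u^2). \tag{434}$$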

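The Taylor expansion of the known potential in $\mathbb{W}^3_l$ leads to Eq. (435). After the Taylor expansion, the partial derivatives are replaced by the recursive formulas Eqs. (436) and (437), and the variations of the spheroidal coordinates ($\delta\lambda$, $\delta\varphi$, $\delta u$) are replaced by Eqs. (419)–(421). The result is given in Eq. (438). It should also be mentioned that in all formulas in this chapter the constant factor $\frac{GM}{\varepsilon}\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right)$ is replaced by $k$.

Equations after Taylor expansion and after inserting the recursive formulas for the partial derivatives with respect to $u/\varepsilon$ and $\varphi$:

$$
\begin{aligned}
U(\Lambda,\Phi,U)={}&k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}m\,\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\bar{P}_{nm}(\sin\varphi)\,(-\bar{c}_{nm}\sin m\lambda+\bar{s}_{nm}\cos m\lambda)\,\delta\lambda\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\frac{d\bar{P}_{nm}(\sin\varphi)}{d\varphi}\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\delta\varphi\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{1}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\frac{dQ^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{d\!\left(\frac{u}{\varepsilon}\right)}\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\frac{\delta u}{\varepsilon}.
\end{aligned}
\tag{435}
$$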

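Recursive formula for the partial derivative with respect to $\varphi$:

$$\frac{d\bar{P}_{nm}(\sin\varphi)}{d\varphi}=-m\tan\varphi\,\bar{P}_{nm}(\sin\varphi)+\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\;\bar{P}_{n,m+1}(\sin\varphi). \tag{436}$$

Recursive formula for the partial derivative with respect to $u/\varepsilon$:

$$\frac{dQ^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{d\!\left(\frac{u}{\varepsilon}\right)}=\frac{m\,\frac{u}{\varepsilon}}{1+\left(\frac{u}{\varepsilon}\right)^{2}}\,Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)-\frac{n+m+1}{\sqrt{1+\left(\frac{u}{\varepsilon}\right)^{2}}}\,Q^{*}_{n,m+1}\!\left(\frac{u}{\varepsilon}\right). \tag{437}$$

Equation with the partial derivatives with respect to $\varphi$ and $u/\varepsilon$ inserted: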

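$$
\begin{aligned}
U(\Lambda,\Phi,U)={}&k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,m\,\bar{P}_{nm}(\sin\varphi)\,(-\bar{c}_{nm}\sin m\lambda+\bar{s}_{nm}\cos m\lambda)\,\delta\lambda\\
&-k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,m\tan\varphi\,\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\delta\varphi\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\sqrt{\frac{(2-\delta_{0m})(n-m)(n+m+1)}{2}}\;\bar{P}_{n,m+1}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\delta\varphi\\
&+k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{1}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\frac{\frac{u}{\varepsilon}}{1+\left(\frac{u}{\varepsilon}\right)^{2}}\,m\,Q^{*}_{nm}\!\left(\frac{u}{\varepsilon}\right)\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\frac{\delta u}{\varepsilon}\\
&-k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{1}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\,\frac{n+m+1}{\sqrt{1+\left(\frac{u}{\varepsilon}\right)^{2}}}\,Q^{*}_{n,m+1}\!\left(\frac{u}{\varepsilon}\right)\bar{P}_{nm}(\sin\varphi)\,(\bar{c}_{nm}\cos m\lambda+\bar{s}_{nm}\sin m\lambda)\,\frac{\delta u}{\varepsilon}.
\end{aligned}
\tag{438}
$$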

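In Eq. (438) there are products of trigonometric functions with the arguments $\lambda$ and $m\lambda$, which can be replaced by the arguments $(m+1)\lambda$ and $(m-1)\lambda$ by applying the addition theorems given in Klapp (2002a, b). In this step Eq. (439) is inserted to separate $u/\varepsilon$ and $\varphi$. For better illustration, the result in Eq. (440) is summarized by transformation parameters.

Approximation to separate $u/\varepsilon$ and $\varphi$, and equation after application of the addition theorems, summarized by transformation parameters. Approximation for $1/\bigl((u/\varepsilon)^2+\sin^2\varphi\bigr)$:

$$\frac{1}{\left(\frac{u}{\varepsilon}\right)^{2}+\sin^{2}\varphi}\approx\frac{1}{1+\left(\frac{u}{\varepsilon}\right)^{2}}, \tag{439}$$

$$U(\Lambda,\Phi,U)=k\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{1}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}\left(U^{P(\mathrm{App})}_{nm}+U^{t_x,t_y(\mathrm{App})}_{nm}+U^{t_z(\mathrm{App})}_{nm}+U^{\alpha,\beta(\mathrm{App})}_{nm}+U^{\gamma(\mathrm{App})}_{nm}+U^{s(\mathrm{App})}_{nm}\right), \tag{440}$$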

where P.App/ Unm tx ;ty .App/ Unm

D

 Qnm

1 D q "

u " 1

1C

P nm .sin '/ .cNnm cos m C sNnm sin m/ ;

 u 2

 Qnm

u m P nm .sin '/ " cos '

"


     tx cNnm C ty sNnm cos.m  1/ C tx sNnm  ty cNnm sin.m  1/ 2q

u

.2ı0m /.nm/.nCmC1/ 2

3

1  sin 'P n;mC1 .sin '/ 7 q Qnm 7   2 2" " 7 1 C u" 7 7 u u nCmC1 5 "  cos 'P Q .sin '/ C  u 2 n;mC1 nm 2" " 1C "

.tx cNnm C ty sNnm / cos.m  1/ C .tx sNnm  ty cNnm / sin.m  1/  ; C.tx cNnm  ty sNnm / cos.m C 1/ C .tx sNnm C ty cNnm / sin.m C 1/

6 6 6 6 6 4

2q tz .App/ Unm

.2ı0m /.nm/.nCmC1/ 2

u "

u

3

 6 cos 'P n;mC1 .sin '/ 7  u 2 Qnm 6 7 " " 6 7 1C " D6 7 u 1 6 nCmC1 7  4 5 sin 'P q Q .sin '/ nm n;mC1  u 2 " " 1C "

 .cNnm cos m C sNnm sin m/ tz ;

˛;ˇ.App/

Unm

”.App/

Unm

u u 1  D  q "   Qnm m tan 'P nm .sin '/ " " u 2 1C "  Œ.˛ sNnm C ˇ cNnm / cos.m  1/ C .˛ cNnm C ˇ sNnm / sin.m  1/ 3 2q .2ı0m /.nm/.nCmC1/ u   u 2 6  P n;mC1 .sin '/ 7 q "   Qnm 7 6 2 " 7 6 u 2 1 C C6 7 " 7 6 nCmC1 u 1 4  cos ' sin 'P nm .sin '/ 5  u 2 Qn;mC1 2 " 1C " 3 2 .˛ sNnm C ˇ cNnm / cos.m  1/ C .˛ cNnm C ˇ sNnm / sin.m  1/ 7 6 4 5; C.˛ sNnm C ˇ cNnm / cos.m C 1/ C .˛ cNnm C ˇ sNnm / sin.m C 1/ u  D mQnm P nm .sin '/.cNnm sin m  sNnm cos m/” "

2

3 u u nCmC1 "  P nm .sin '/  q  u 2 Qn;mC1 " P nm .sin '/ 7 " " 7 1C " 7 q 7 .2ı0m /.nm/.nCmC1/ 7 u 1 2 5  Q .sin '/  cos ' sin 'P  u 2 nm n;mC1 " " 1C "

 6 mQnm

s.App/ Unm

6 6 D6 6 4

u

.cNnm cos m C sNnm sin m/s:
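The size of the error introduced by the approximation (439) can be estimated directly. The sketch below reads Eq. (439) as $1/((u/\varepsilon)^2+\sin^2\varphi)\approx1/(1+(u/\varepsilon)^2)$ and evaluates it on the surface of the underlying reference ellipsoid ($u=b$), with $a$ and $\varepsilon$ taken from Table 4 and $b^2=a^2-\varepsilon^2$; evaluating at $u=b$ and the chosen latitudes are illustrative assumptions.

```python
import math

a = 6_377_397.155               # semi-major axis of the underlying reference ellipsoid (Table 4), m
eps = 521_013.137               # linear eccentricity of the underlying reference ellipsoid (Table 4), m
b = math.sqrt(a**2 - eps**2)    # semi-minor axis from b^2 = a^2 - eps^2

def exact(x, phi):
    """Left-hand side of Eq. (439) with x = u / eps."""
    return 1.0 / (x**2 + math.sin(phi) ** 2)

def approx(x):
    """Right-hand side of Eq. (439): sin^2(phi) replaced by 1."""
    return 1.0 / (1.0 + x**2)

def rel_error(x, phi):
    """Relative error of the approximation at latitude phi."""
    return abs(approx(x) - exact(x, phi)) / exact(x, phi)

x = b / eps
errors = {lat: rel_error(x, math.radians(lat)) for lat in (0, 30, 60, 90)}
for lat, err in errors.items():
    print(f"phi = {lat:2d} deg: relative error {err:.3e}")
```

The approximation is exact at the poles and worst at the equator, where the relative error equals $1/(1+(u/\varepsilon)^2)$, i.e., somewhat less than one percent for these parameters.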


In the next step, the products of Legendre functions of the first kind with trigonometric functions of argument $\varphi$, and the products of Legendre functions of the second kind with terms containing $u/\varepsilon$, have to be replaced by the recursive formulas given in Klapp (2002a, b). Before the index shifts can be performed, one has to verify and, if necessary, adapt the degrees and orders of the Legendre functions, because a product of Legendre functions must not contain different degrees before the coefficients can be compared. The orders in a product of Legendre functions have to be equal and also have to correspond to the factor of the argument $\lambda$ in the trigonometric functions. If the degrees and orders are unequal or do not correspond, they have to be changed following the rules given in Eqs. (441) and (442). It is important to notice that the normalization factors and $Q^{*}_{nm}(b/\varepsilon)$ also have to be adapted. With Eq. (443), the index shifts from $n$ to $n-2$ for $n+2$, from $n$ to $n-1$ for $n+1$, from $m$ to $m-1$ for $m+1$, and from $m$ to $m+1$ for $m-1$ can be carried out. It is important to notice that $m=0$ causes the factor $1+\delta_{1m}$ in the terms of the index shift from $m$ to $m-1$. Equation (444) is the equation after the index shifts, ready for the comparison of coefficients, which leads to the equations for the datum transformation of spheroidal harmonic coefficients, Eq. (445).

The equations after application of the recursive formulas for Legendre functions of the first and second kind and the adaption of degree and order, and the equations for the datum transformation of spheroidal harmonic coefficients, are as follows.

Rules for the adaption of degree $n$ and order $m$ by extending the term $Q^{*}_{n+i,m-l}(u/\varepsilon)\,\bar{P}_{n-k,m+j}(\sin\varphi)$ with $i,j,k,l\in\{0,\dots,1\}$.

Adaption of degree $n$ ($n-k\to n+i$), after which the Legendre functions have the degree $n+i$: $Q^{*}_{n+i,m-l}(u/\varepsilon)\,\bar{P}_{n-k,m+j}(\sin\varphi)$

!



 CQnCi;ml

u "

P nkC1;mCj .sin '/C     C P nCi;mCj .sin '/



u  QnCi;ml " P nCi;mCj .sin '/:

(441)

Adaption of order m.m  l ! m C j /: Legendre functions have the order m C k:  QnCi;ml

u

" P nCi;mCj .sin '/

!  

C

u  QnCi;mlC1 "     C QnCi;mCj

C u

!

 QnCi;mCj

u (442)

" P nCi;mCj .sin '/:

"

P nCi;mCj .sin '/  Equation after adaption of degree n and order m of Qnm

U.ƒ; ˆ; U / D k

n 1 X X

1

nD0 mD0

 Qnm

b "

b : n ! n C i C k, m ! m C j C l: "

P.Adap/

t ;t .Adap/

t .Adap/

x y z Unm C Unm C Unm ˛;ˇ.Adap/ .Adap/ s.Adap/ CUnm C Unm C Unm

! ;

(443)

where


P.Adap/

Unm

t ;t .Adap/

x y Unm

 D Qnm

D

p

u "

P nm .sin '/.cNnm cos m C sNnm sin m/;

2ı0m 8a

q 3 .2m C 1/ .nm1/.nm/ 7 6 q 2.2n1/.2nC1/ 7 6 7  6 C.2m C 1/ .nCmC1/.nCmC2/ u 2.2nC1/.2nC3/ 7Q 6 q 6 7 nC1;m1 " P nC1;m1 .sin '/ .nCm1/.nCm/ 7 6 C.2m  1/ .2ı1m /.2n1/.2nC1/ 5 4 q .nmC1/.nmC2/ C.4n C 2m C 3/ .2ı 1m /.2nC1/.2nC3/ 2

   p 0m   tx cNnm C ty sNnm cos.m  1/ C tx sNnm  ty cNnm sin.m  1/  2ı 4a q 3 2 .n C m C 1/ .nm1/.nm/ 2.2n1/.2nC1/ 7 6 q 7 6 .nCm1/.nCm/ 6 C.n C m C 1/ .2ı1m /.2n1/.2nC1/ 7  u 7Q 6 q 6 P nC1;mC1 .sin '/ nC1;mC1 7 " .nmC1/.nmC2/ 7 6 .n C m C 1/ .2ı 1m /.2nC1/.2nC3/ 5 4 q .nCmC1/.nCmC2/ C.n  m/ 2.2nC1/.2nC3/      tx cNnm  ty sNnm cos.m C 1/ C tx sNnm C ty cNnm sin.m C 1/ ; t .Adap/

z Unm

D  2nC1 2a

q

.nmC1/.nCmC1/  QnC1;m .2nC1/.2nC3/

u "

P nC1;m .sin '/

.cNnm cos m C sNnm sin m/tz ; ˛;ˇ.Adap/ Unm

r u 1 .2  ı0m /.n  m C 1/.n C m/  D Qn;m1 P n;m1 .sin '/ 2 2  ı1m " Œ .˛ sNnm C ˇ cNnm / cos.m  1/ C .˛ cNnm C ˇ sNnm / sin.m  1/ r u 1 .2  ı0m /.n  m C 1/.n C m/  Qn;mC1 P n;mC1 .sin '/ C 2 2 " Œ.˛ sNnm C ˇ cNnm / cos.m Cr1/ C .˛ cNnm C ˇ sNnm / sin.m C 1/ 2  ı0m nCmC1   a 2 2  ı1m 8.2n  1/.2n C 3/ " q 2 3 .2n C 3/ .nm/.nCm2/.nCm1/.nCm/ .2n3/.2nC1/ 6 7 p 6 7 6 7  6 C.2m  1/ .n  m C 1/.n C m/ 7 4 5 q .nmC1/.nmC2/.nmC3/.nCmC1/ .2n  1/ .2nC1/.2nC5/ u 3 2  QnC2;m1 " P nC2;m1 .sin '/ 6 Œ.˛ sNnm C ˇ cNnm / cos.m  1/ C .˛ cNnm C ˇ sNnm / sin.m  1/ 7 7; u 6 5 4 CQ P .sin '/ nC2;mC1 nC2;mC1 " Œ.˛ sNnm C ˇ cNnm / cos.m C 1/ C .˛ cNnm C ˇ sNnm / sin.m C 1/


 U”.Adap/ D mQnm nm s.Adap/

Unm

u "

 D .n C 1/Qnm

P nm .sin '/.cNnm sin m  sNnm cos m/”;

u

P .sin '/.cNnm cos m C sNnm sin m/s 2 nm 3 .2n q C 3/.n C m C 1/ 6 7 .nm1/.nm/.nCm1/.nCm/ 6 7 .2n3/.2nC1/ 6 7 1 6 C .2m C 1/.n  m/.n C m C 1/ 7 a 2 6 7 4.2n1/.2nC3/. " / 6 7 C.2n  1/.n  m/ 4 q 5 .nmC1/.nmC2/.nCmC1/.nCmC2/  .2nC1/.2nC5/ u  QnC2;m " P nC2;m .sin '/ .cNnm cos m C sNnm sin m/s "

after indices shift n ! n  2, n ! n  1, m ! m  1, m ! m C 1 where cNn;m D sNn;m D 0 and before comparison of coefficients C nm ; S nm and cNnm ; sNnm with EB D b" Earccot "arccot D

1 P

1 P n B  P E

n P

nD0 mD0

nD0 mD0 1  Qnm . b" /

b "

 Qnm . UE /

P nm .sin ˆ/.C nm cos mƒ C S nm sin mƒ/ ! tx ;ty .Index/ P.Index/ tz .Index/ C Unm C Unm Unm ; ˛;ˇ.Index/ ”.Index/ CUnm C Unm C Us.Index/ nm  Qnm . EB /

(444)

where UP.index/ nm

D

1

  b Qnm  Qnm "

u "

P nm .sin '/ .cNnm cos m C sNnm sin m/;

q 3 .nm3/.nm2/ .2m C 3/ 6 7 q .2n3/.2n1/ 6 7 2.nCm1/.nCm/ .2m C 3/ 6 7 1 1 6 tx ;ty .Index/ .2ı0m /.2n3/.2n1/ 7 q b D  Unm 7 .nCmC1/.nCmC2/ Qn1;mC1 " 8a 6 6 C.2m C 3/ 7 .2n1/.2nC1/ 4 5 q 2.nm1/.nm/ C.4n C 2m C 1/ .2ı0m /.2n1/.2nC1/ 2

Œ.tx cN  n  1; m C 1 C ty sNn1;mC1 / cos m C .tx sNn1;mC1  ty cNn1;mC1 / sin m p 2  ı1m 1 b   4a Qn1;m1 "


3 2  q .nm1/.nm/ ı1m n C m  1  2 .2m  3/ 2.2n3/.2n1/ 7 6 q 7 6   .nmC1/.nmC2/ 7 6  n C m  1  ı1m .2m  3/ 7 6 2 q.2ı2m /.2n1/.2nC1/ 7 6  6 C n C m  1  ı1m .2m  3/ .nCm3/.nCm2/ 7 7 6 2 q.2ı2m /.2n3/.2n1/ 5 4   .nCm1/.nCm/ C n  m  ı1m .4n  2m C 1/ 2 2.2n1/.2nC1/  Œ.tx cNn1;m1  ty sNn1;m1 / cos m C .tx sNn1;m1 C ty cNn1;m1 / sin m; tz .Index/ Unm

2n  1 b D  2a Qn1;m " 1

s

.n  m/.n C m/   u  P nm .sin '/ Q .2n  1/.2n C 1/ nm "

.cNn1;m cos m C sNn1;m sin m/tz ; ˛;ˇ.Index/ Unm

1 b D  Qn;mC1 " 2 1

r

2.n  m/.n C m C 1/   u  Qnm P nm .sin '/ 2  ı0m "

Œ.˛ sNn;mC1 C ˇ cNn;mC1 / cos m C .˛ cNn;mC1 C ˇ sNn;mC1 / sin m r 1 C ı1m .2  ı1m /.n  m C 1/.n C m/   u  1   Qnm " P nm .sin '/ C  2 2 Qn;m1 b"  Œ.˛ sNn;m1 C ˇ cNn;m1 / cos m C .˛ cNn;m1 C ˇ sNn;m1 / sin m C

1

b

nCm

 a 2  Qn2;mC1 " 8.2n  5/.2n  1/ " q 2 3 .2n  1/ 2.nm3/.nCm3/.nCm2/.nCm1/ .2ı0m /.2n7/.2n3/ 6 7 q 6 7  u 2.nm2/.nCm1/ 6 7Q  6 .2m C 1/ 2ı0m 7 nm " P nm .sin '/ q 4 5 C.2n  5/ 2.nm2/.nm1/.nm/.nCm/ .2ı0m /.2n3/.2nC1/


 Œ.˛ sNn2;mC1 C ˇ cNn2;mC1 / cos m C .˛ cNn2;mC1 C ˇ sNn2;mC1 / sin m p 2  ı1m 1 b C    Qn2;m1 " 8.2n  5/.2n  1/ a 2 " 3 2 q .nm1/.nCm5/.nCm4/.nCm3/ .2n  1/.n C m  2/ .2ı2m /.2n7/.2n3/ 7 6 q 7 6 6 Cı .2n  1/.n  m/ .nm3/.nm2/.nm1/.nCm3/ 7 1m 7 6 .2ı0m /.2n7/.2n3/ 7 6 q 7 6 .nm/.nCm3/ 7 6 .2m  3/.n C m  2/ 2ı 2m 7 6 q 6 7 7 6 Cı1m .2m  1/.n  m/ .nm1/.nCm2/ 7 6 2ı0m 7 6 q 6 .nm/.nmC1/.nmC2/.nCm2/ 7 7 6 C.2n  5/.n C m  2/ .2ı2m /.2n3/.2nC1/ 7 6 q 5 4 .nm/.nCm2/.nCm1/.nCm/ Cı1m .2n  5/.n  m/ .2ı0m /.2n3/.2nC1/  Qnm

u "

P nm .sin '/

 Œ.˛ sNn2;m1 C ˇ cNn2;m1 / cos m C .˛ cNn2;m1 C ˇ sNn2;m1 / sin m; ”.Index/ Unm D

Us.Index/ D nm C

u m   b Qnm P nm .sin '/ .cNnm sin m  sNnm cos m/ ”;  " Qnm "

u nC1   b Qnm P nm .sin '/ .cNnm cos m C sNnm sin m/ s  " Qnm " 1

b 

1

 a 2  Qn2;m " 4.2n  5/.2n  1/ " q 3 2 .nm3/.nm2/.nCm3/.nCm2/ .2n  1/.n C m  1/ .2n7/.2n3/ 7  u 6 7Q 6  4 .2m C 1/.n  m  2/.n C m  1/ 5 nm " P nm .sin '/ q .nm1/.nm/.nCm1/.nCm/ C.2n  5/.n  m  2/ .2n3/.2nC1/  .cNn2;m cos m C sNn2;m sin m/ s: Equations for the datum transformation of spheroidal harmonics:




E arccot. b /   B  D "arccot B" Qnm E .E /  

 3 2 cNn;m1 sNn;m1 ˛;ˇ 7 6 un;m1 ˇ sNn;m1 ˙ ˛ cNn;m1 7 6  

7 6 1.nC1/s cNnm s N m nm 7 6C

b b   7 6 Qnm . " / sNnm Qnm . " / c N nm 7 6

 7 6 ˛;ˇ  cNn;mC1  s N n;mC1 7 6 u ˇ

˛ 7 6 n;mC1 s N c N n;mC1 n;mC1 6  

 7 7 6 t ;t cNn1;m1 sNn1;m1 7 6 u x y t

t y 7 6 n1;m1 x sN c N 7 6 n1;m1 n1;m1

 7 6 cNn1;m 7 6 tz 7; 6 un1;m tz 7 sNn1;m 6 



  7 6 7 6 tx ;ty cN sN 7 6 Cun1;mC1 tx n1;mC1 ˙ ty n1;mC1 7 6 s N c N n1;mC1 n1;mC1  

 7 6 7 6 ˛;ˇ 7 6 Cun2;m1 ˇ cNn2;m1 ˙ ˛ sNn2;m1 7 6 s N c N n2;m1 n2;m1 7 6 

7 6 c N n2;m s 7 6 u 7 6 n2;ms sNn2;m 7 6 



  5 4 ˛;ˇ cNn2;mC1 sNn2;mC1

˛ un2;mC1 ˇ sNn2;mC1 cNn2;mC1

C nm S nm

(445)

where ˛;ˇ

un;m1 D ˛;ˇ un;mC1 t ;t

x y D un1;m1

t

z un1;m

t ;t

1

 Qn1;m1 . b" /

p

D

1Cı1m  Qn;m1 . b" / 2 1

1  Qn;mC1

1 .b/ 2 "

q

q

.2ı1m /.nmC1/.nCm/ ; 2

2.nm/.nCmC1/ ; 2ı0m

2ı1m 4a

2 3  .2m  3/ n C m  1  ı1m 2 6 q q q 7 6 7 .nm1/.nm/ .nmC1/.nmC2/ .nCm3/.nCm2/ 6  .2ı2m /.2n1/.2nC1/ C .2ı2m /.2n3/.2n1/ 7  6 2.2n3/.2n1/ 7; q 4 5 .nCm1/.nCm/ .4n  2m C 1/ Cn  m  ı1m 2.2n1/.2nC1/ q2 .nm/.nCm/ D Q 1 b 2n1 ; 2a .2n1/.2nC1/ n1;m . " /

x y un1;mC1 D

1 1  Qn1;mC1 . b" / 4a

2

.2m C 3/  12q

3

q q  6 .nm3/.nm2/ 2.nCm1/.nCm/ .nCmC1/.nCmC2/ 7 C   6 7; 4 .2n3/.2n1/ .2ı0m /.2n3/.2n1/ .2n1/.2nC1/ 5 q 2.nm1/.nm/ 1 C 2 .4n C 2m C 1/ .2ı0m /.2n1/.2nC1/


˛;ˇ un2;m1

D

p

1

b

2  ı1m

 2 8.2n  5/.2n  1/ a" q 2 3 .2n  1/.n C m  2/ .nm1/.nCm5/.nCm4/.nCm3/ .2ı2m /.2n7/.2n3/ 6 q 7 6 .nm3/.nm2/.nm1/.nCm3/ 7 6 Cı1m .2n  1/.n  m/ 7 .2ı0m /.2n7/.2n3/ 6 7 q 6 7 6 .2m  3/.n C m  2/ .nm/.nCm3/ 7 2ı2m 6 7; q 6 7 .nm1/.nCm2/ 6 Cı1m .2m  1/.n  m/ 7 2ı 0m 6 7 q 6 7 6 C.2n  5/.n C m  2/ .nm/.nmC1/.nmC2/.nCm2/ 7 .2ı /.2n3/.2nC1/ 2m 4 5 q .nm/.nCm2/.nCm1/.nCm/ Cı1m .2n  5/.n  m/ .2ı0m /.2n3/.2nC1/  Qn2;m1

usn2;m D

1  Qn2;m

2

"

b  "

1 4.2n  5/.2n q  1/

.2n  1/.n C m  1/

 a 2

" .nm3/.nm2/.nCm3/.nCm2/ .2n7/.2n3/

6 6 C.2m C 1/.n  m  2/.n C m  1/ 4 q .2n  5/.n  m  2/ .nm1/.nm/.nCm1/.nCm/ .2n3/.2nC1/ ˛;ˇ un2;mC1

D

nCm

3 7 7; 5

r

2 b  a 2 2  ı0m 8.2n  5/.2n  1/ " " q 3 2 .2n  1/ .nm3/.nCm3/.nCm2/.nCm1/ .2n7/.2n3/ 7 6 p 6  4 C.2m C 1/q .n  m  2/.n C m  1/ 7 5: .nm2/.nm1/.nm/.nCm/ .2n  5/ .2n3/.2nC1/ 1

 Qn2;mC1

12 Examples

In this chapter, the examples for the spherical and ellipsoidal datum transformation are not given to show a practical use but to verify the formulas for the ellipsoidal datum transformation, because the results of both transformations, spherical and ellipsoidal, should be similar. For this purpose, one compares not only the equations but also the transformed coefficients of the spherical and the ellipsoidal datum transformation; the difference between the spherically and the ellipsoidally transformed coefficients should decrease with increasing degree and order.

First, one needs the parameters of two reference systems, given in Table 4, a set of transformation parameters, given in Table 5, which serve only as examples for a datum transformation, and a set of coefficients $\bar{c}_{nm}$ and $\bar{s}_{nm}$, given in Tables 6 and 7. With the transformation formulas for the spherical and ellipsoidal datum transformation given in Sects. 10 and 11, one has the equations for computing the transformed coefficients. For the zonal, sectorial, and tesseral coefficients, the equations are given in Eqs. (446)–(448). The transformed coefficients up to degree and order $n=m=4$ are given in Tables 8 and 10 for the spherical datum transformation and in Tables 9 and 11 for the ellipsoidal datum transformation.

Table 4 Parameters of reference systems

Radius
  Of the reference sphere                  R = 6,371,000.790 m
  Of the underlying reference sphere       r = 6,370,283.158 m
Semiaxis and linear eccentricity
  Of the reference ellipsoid               A = 6,378,137.000 m    E = 521,854.011 m
  Of the underlying reference ellipsoid    a = 6,377,397.155 m    ε = 521,013.137 m

Table 5 Transformation parameters

Translation        t_x = 100 m    t_y = 100 m    t_z = 100 m
Rotation           α = 50″        β = 50″        γ = 50″
Scale variation    s = 10^-6

Table 6 Coefficients c̄_nm

        m = 0            1                2                3                4
n = 0   +1.000000E+00
    1   +0.000000E+00    +0.000000E+00
    2   +5.137725E-04    -1.869876E-10    +2.439144E-06
    3   +9.572542E-07    +2.029989E-06    +9.046278E-07    +7.210727E-07
    4   -2.504930E-07    -5.363223E-07    +3.574280E-07    +9.907718E-07    -1.885608E-07

Table 7 Coefficients s̄_nm

        m = 0            1                2                3                4
n = 0   +0.000000E+00
    1   +0.000000E+00    +0.000000E+00
    2   +0.000000E+00    -1.869876E-10    +2.439144E-06
    3   +0.000000E+00    +2.029989E-06    +9.046278E-07    +7.210727E-07
    4   +0.000000E+00    -5.363223E-07    +3.574280E-07    +9.907718E-07    -1.885608E-07

Table 8 Transformed spherical coefficients C̄_nm^sph

        m = 0            1                2                3                4
n = 0   +9.999990E-01
    1   -9.063181E-06    -9.063181E-06
    2   +5.137709E-04    +2.145951E-07    +2.439815E-06
    3   +9.357437E-07    +2.013218E-06    +9.059311E-07    +7.199865E-07
    4   -2.504193E-07    -5.363879E-07    +3.559571E-07    +9.914496E-07    -1.885526E-07

Table 9 Transformed ellipsoidal coefficients C̄_nm^ell

        m = 0            1                2                3                4
n = 0   +1.000115E+00
    1   -4.534155E-06    -6.796671E-06
    2   +5.138355E-04    +1.590650E-07    +2.440108E-06
    3   +9.4608580E-07   +2.027601E-06    +9.060808E-07    +7.201633E-07
    4   -2.504627E-07    -5.361879E-07    +3.560602E-07    +9.916399E-07    -1.885977E-07

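Zonal equations:

$$\bar{C}^{sph}_{00}=(1-s)\,\bar{c}_{00},\qquad \bar{C}^{ell}_{00}=k_{00}\,(1-s)\,\bar{c}_{00}. \tag{446}$$

Sectorial equations:

$$
\begin{aligned}
\begin{Bmatrix}\bar{C}^{sph}_{10}\\ \bar{S}^{sph}_{10}\end{Bmatrix}&=-\frac{\sqrt{3}}{3r}\,t_z\begin{Bmatrix}\bar{c}_{00}\\ \bar{s}_{00}\end{Bmatrix}+(1-2s)\begin{Bmatrix}\bar{c}_{10}\\ \bar{s}_{10}\end{Bmatrix}-\left(\beta\begin{Bmatrix}\bar{c}_{11}\\ \bar{s}_{11}\end{Bmatrix}\mp\alpha\begin{Bmatrix}\bar{s}_{11}\\ \bar{c}_{11}\end{Bmatrix}\right),\\
\begin{Bmatrix}\bar{C}^{ell}_{10}\\ \bar{S}^{ell}_{10}\end{Bmatrix}&=k_{10}\left[-q^{*}_{00}\,\frac{\sqrt{3}}{6a}\,t_z\begin{Bmatrix}\bar{c}_{00}\\ \bar{s}_{00}\end{Bmatrix}+q^{*}_{10}\,(1-2s)\begin{Bmatrix}\bar{c}_{10}\\ \bar{s}_{10}\end{Bmatrix}-q^{*}_{11}\left(\beta\begin{Bmatrix}\bar{c}_{11}\\ \bar{s}_{11}\end{Bmatrix}\mp\alpha\begin{Bmatrix}\bar{s}_{11}\\ \bar{c}_{11}\end{Bmatrix}\right)\right].
\end{aligned}
\tag{447}
$$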
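The spherical equations of lowest degree can be evaluated directly with the parameters of Tables 4 to 7 and compared with Tables 8 and 10. The sketch below uses Eq. (446), the spherical part of Eq. (447), and Eq. (431) for $n=m=1$; it assumes that the rotations of Table 5 are 50 arcseconds (they multiply vanishing input coefficients here, so the result does not depend on that reading) and that $r$ is the radius of the underlying reference sphere.

```python
import math

r = 6_370_283.158                       # radius of the underlying reference sphere, Table 4 (m)
tx = ty = tz = 100.0                    # translations, Table 5 (m)
arcsec = math.pi / (180.0 * 3600.0)
alpha = beta = gamma = 50.0 * arcsec    # rotations, read here as 50 arcseconds (assumption)
s = 1.0e-6                              # scale variation, Table 5
c00, c10, c11 = 1.0, 0.0, 0.0           # input coefficients, Table 6
s00, s10, s11 = 0.0, 0.0, 0.0           # input coefficients, Table 7

k = math.sqrt(3.0) / (3.0 * r)          # common translation factor of degree 1

# Eq. (446), spherical case.
C00 = (1.0 - s) * c00
# Eq. (447), spherical case.
C10 = -k * tz * c00 + (1.0 - 2.0 * s) * c10 - (beta * c11 - alpha * s11)
# Eq. (431) for n = m = 1 (upper sign for C, lower sign for S).
C11 = (beta * c10 + alpha * s10) + (1.0 - 2.0 * s) * c11 - gamma * s11 - k * (tx * c00 - ty * s00)
S11 = (beta * s10 - alpha * c10) + (1.0 - 2.0 * s) * s11 + gamma * c11 - k * (tx * s00 + ty * c00)

for name, value in (("C00", C00), ("C10", C10), ("C11", C11), ("S11", S11)):
    print(f"{name} = {value:+.6E}")
```

With these inputs the degree-1 coefficients are produced entirely by the translation acting on $\bar{c}_{00}$, which is why $\bar{C}^{sph}_{10}$, $\bar{C}^{sph}_{11}$, and $\bar{S}^{sph}_{11}$ coincide in Tables 8 and 10.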

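Tesseral equations:

$$
\begin{aligned}
\begin{Bmatrix}\bar{C}^{sph}_{21}\\ \bar{S}^{sph}_{21}\end{Bmatrix}={}&-\frac{3\sqrt{5}}{5r}\left(t_x\begin{Bmatrix}\bar{c}_{10}\\ \bar{s}_{10}\end{Bmatrix}\mp t_y\begin{Bmatrix}\bar{s}_{10}\\ \bar{c}_{10}\end{Bmatrix}+t_z\begin{Bmatrix}\bar{c}_{11}\\ \bar{s}_{11}\end{Bmatrix}\right)+\sqrt{3}\left(\beta\begin{Bmatrix}\bar{c}_{20}\\ \bar{s}_{20}\end{Bmatrix}\pm\alpha\begin{Bmatrix}\bar{s}_{20}\\ \bar{c}_{20}\end{Bmatrix}\right)\\
&+(1-3s)\begin{Bmatrix}\bar{c}_{21}\\ \bar{s}_{21}\end{Bmatrix}\mp\gamma\begin{Bmatrix}\bar{s}_{21}\\ \bar{c}_{21}\end{Bmatrix}-\left(\beta\begin{Bmatrix}\bar{c}_{22}\\ \bar{s}_{22}\end{Bmatrix}\mp\alpha\begin{Bmatrix}\bar{s}_{22}\\ \bar{c}_{22}\end{Bmatrix}\right),\\
\begin{Bmatrix}\bar{C}^{ell}_{21}\\ \bar{S}^{ell}_{21}\end{Bmatrix}={}&k_{21}\Biggl[q^{*}_{00}\,\frac{\sqrt{15}}{30\left(\frac{a}{\varepsilon}\right)^{2}}\left(\beta\begin{Bmatrix}\bar{c}_{00}\\ \bar{s}_{00}\end{Bmatrix}\pm\alpha\begin{Bmatrix}\bar{s}_{00}\\ \bar{c}_{00}\end{Bmatrix}\right)-q^{*}_{10}\,\frac{3\sqrt{5}}{16a}\left(t_x\begin{Bmatrix}\bar{c}_{10}\\ \bar{s}_{10}\end{Bmatrix}\mp t_y\begin{Bmatrix}\bar{s}_{10}\\ \bar{c}_{10}\end{Bmatrix}\right)\\
&\quad-q^{*}_{11}\,\frac{3\sqrt{5}}{10a}\,t_z\begin{Bmatrix}\bar{c}_{11}\\ \bar{s}_{11}\end{Bmatrix}+q^{*}_{20}\,\sqrt{3}\left(\beta\begin{Bmatrix}\bar{c}_{20}\\ \bar{s}_{20}\end{Bmatrix}\pm\alpha\begin{Bmatrix}\bar{s}_{20}\\ \bar{c}_{20}\end{Bmatrix}\right)\\
&\quad+q^{*}_{21}\left((1-3s)\begin{Bmatrix}\bar{c}_{21}\\ \bar{s}_{21}\end{Bmatrix}\mp\gamma\begin{Bmatrix}\bar{s}_{21}\\ \bar{c}_{21}\end{Bmatrix}\right)-q^{*}_{22}\left(\beta\begin{Bmatrix}\bar{c}_{22}\\ \bar{s}_{22}\end{Bmatrix}\mp\alpha\begin{Bmatrix}\bar{s}_{22}\\ \bar{c}_{22}\end{Bmatrix}\right)\Biggr],
\end{aligned}
\tag{448}
$$

with

$$k_{nm}=\frac{E\,\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right)}{\varepsilon\,\operatorname{arccot}\!\left(\frac{B}{E}\right)}\,Q^{*}_{nm}\!\left(\frac{B}{E}\right)\quad\text{and}\quad q^{*}_{nm}=\frac{1}{Q^{*}_{nm}\!\left(\frac{b}{\varepsilon}\right)}. \tag{449}$$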

Table 10 Transformed spherical coefficients S̄_nm^sph

        m = 0            1                2                3                4
n = 0   +0.000000E+00
    1   +0.000000E+00    -9.063181E-06
    2   -4.233427E-13    -2.147696E-07    -1.398980E-06
    3   -1.352858E-09    +2.483001E-07    -6.198601E-07    +1.414423E-06
    4   +7.908926E-10    -4.739281E-07    +6.587076E-07    -2.001632E-07    +3.082603E-07

Table 11 Transformed ellipsoidal coefficients S̄_nm^ell

        m = 0            1                2                3                4
n = 0   +0.000000E+00
    1   +0.000000E+00    -6.796671E-06
    2   -4.235981E-13    -1.592413E-07    -1.399147E-06
    3   -1.353573E-09    +2.463250E-07    -6.199595E-07    +1.414581E-06
    4   +7.416188E-10    -4.741757E-07    +6.587241E-07    -2.001664E-07    +3.082546E-07

Fig. 13 Plate I: West-east component of the deflection of the vertical on the International Reference Ellipsoid, Mollweide projection

For the transformed coefficients $\bar{C}^{sph}_{nm}$ and $\bar{C}^{ell}_{nm}$, the most significant differences appear for the zonal coefficient $c_{00}$, the sectorial coefficient $c_{10}$, and the tesseral coefficients $c_{11}$ and $c_{21}$. These differences are caused by the factors containing the linear eccentricity and the Legendre function of the second kind in the equations of the ellipsoidal datum transformation. For the tesseral coefficient $c_{21}$, a further cause is the additional terms in the ellipsoidal equation, which result from the different approximations.

For the transformed coefficients $\bar{S}^{sph}_{nm}$ and $\bar{S}^{ell}_{nm}$, the most significant differences appear only for the tesseral coefficients with the indices 11 and 21. The reasons for these differences are the same as for the transformed coefficients $\bar{C}^{sph}_{nm}$ and $\bar{C}^{ell}_{nm}$. The zonal coefficients $\bar{S}^{sph}_{00}$ and $\bar{S}^{ell}_{00}$ are both equal to zero for the spherical and the ellipsoidal datum transformation because $\bar{s}_{00}$ is also equal to zero. This is why Eq. (446) gives no explicit equation for the transformation of the zonal coefficients $\bar{S}^{sph}_{00}$ and $\bar{S}^{ell}_{00}$.
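The statement that the spherical and ellipsoidal results approach each other with increasing degree can be checked on the tabulated values. The sketch below takes the $m=0$ columns of Tables 8 and 9 and forms the absolute differences; restricting the comparison to this column is an illustrative choice.

```python
# m = 0 columns of Table 8 (spherical) and Table 9 (ellipsoidal), indexed by degree n.
c_sph = {0: 9.999990e-01, 1: -9.063181e-06, 2: 5.137709e-04, 3: 9.357437e-07, 4: -2.504193e-07}
c_ell = {0: 1.000115e+00, 1: -4.534155e-06, 2: 5.138355e-04, 3: 9.4608580e-07, 4: -2.504627e-07}

# Absolute difference between the two transformations for each degree.
diff = {n: abs(c_sph[n] - c_ell[n]) for n in c_sph}

for n in sorted(diff):
    print(f"n = {n}: |C_sph - C_ell| = {diff[n]:.3e}")

# True if the difference shrinks from each degree to the next.
decreasing = all(diff[n] > diff[n + 1] for n in range(4))
print(decreasing)
```

For this column the difference drops by several orders of magnitude between degree 0 and degree 4, in line with the expectation stated at the beginning of the section.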

Fig. 14 Plate II: West-east component of the deflection of the vertical on the International Reference Ellipsoid, polar regions, equal-area azimuthal projection of $\mathbb{E}^2_{A_1,A_1,A_2}$. Top: North Pole. Bottom: South Pole

Fig. 15 Plate III: South-north component of the deflection of the vertical on the International Reference Ellipsoid, Mollweide projection

13 Conclusions

As the examples show, the datum transformation of spheroidal harmonics is quite similar to the spherical case. The differences, which are mainly caused by the major and minor semiaxes in the spheroidal case, lead to a different approximation than in the spherical case. This approximation should not be the only way to obtain a transformation formula, however, because in some cases the differences between transformed spherical and spheroidal harmonics are quite large, and in these cases it may be necessary to modify the equations of the spheroidal datum transformation. It is important to repeat that the solution presented for the datum transformation of spheroidal harmonics is only an approximation, which cannot be as exact as the datum transformation of spherical harmonics, and that there should be a way to obtain a better result.

Due to length restrictions, three appendices have been transferred to the related publications, which provide equivalent material. Material equivalent to Appendix A, relating to four alternative representations of ellipsoidal harmonics, can be found in Thong and Grafarend (1989). The power-series expansion of the associated Legendre polynomials of the second kind was the content of Appendix B; for this material the reader is referred to Grafarend et al. (2006, pp. 1–57). Finally, addition theorems, Jacobi matrices, normalized Legendre functions, and recursive formulae of Legendre functions of the first and second kind were the subject of Appendix C. These topics are combined in the unpublished M.S. thesis of the second author (see Klapp 2002a, b), which gives detailed representations of ellipsoidal datum transformations.

Fig. 16 Plate IV: South-north component of the deflection of the vertical on the International Reference Ellipsoid, polar regions, equal-area azimuthal projection of $\mathbb{E}^2_{A_1,A_1,A_2}$. Top: North Pole. Bottom: South Pole

Fig. 17 Plate V: Gravity disturbance on the International Reference Ellipsoid, Mollweide projection

References Abramowitz M, Stegun IA (1970) Handbook of mathematical functions. Dover, New York Akhtar N (2009) A multiscale harmonic spline interpolation method for the inverse spheroidal gravimetric problem. Universität Siegen, Siegen Akhtar N, Michel V (2012) Reproducing kernel based splines for the regularization of the inverse ellipsoidal gravimetric problem. Appl Anal 91:2105–2132 Ardalan AA (1996) Spheroidal coordinates and spheroidal eigenspace of the Earth gravity field. Universität Stuttgart, Stuttgart Ardalan AA (1999) High resolution regional Geoid computation in the World Geodetic Datum 2000. Universität Stuttgart, Stuttgart Ardalan AA, Grafarend EW (2000) Reference ellipsoidal gravity potential field and gravity intensity field of degree/order 360/360 (manual of using ellipsoidal harmonic coefficients “Ellipfree.dat” and “Ellipmean.dat”). http://www.uni-stuttgart.de/gi/research/paper/coefficients/ coefficients.zip Ardalan AA, Grafarend EW (2001) Ellipsoidal Geoidal undulations (ellipsoidal Bruns formula): case studies. J Geodesy 75:544–552 Ardalan A, Karimi R, Grafarend E (2010) A new reference equipotential surface and reference ellipsoid for the planet Mars. Earth Moon Planet 106:1–13 Arfken G (1968) Mathematical methods for physicists, 2nd edn. Academic, New York/London Balmino G et al (1991) Simulation of gravity gradients: a comparison study. Bull Géod 65:218–229 Bassett A (1888) A treatise on hydrodynamics. Deighton, Bell and Company, Cambridge. Reprint edition in 1961 (Dover, New York) Bölling K, Grafarend EW (2005) Ellipsoidal spectral properties of the Earth’s gravitational potential and its first and second derivatives. J Geodesy 79:300–330 Cajori F (1946) Newton’s principia. University of California Press, Berkeley, CA Cartan EH (1922) Sur les petites oscillations d’une masse fluide. Bull Sci Math 46(317–352):356–369

Page 97 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Cartan EH (1928) Sur la stabilité ordinaire des ellipsoides de Jacobi. In: Proceedings of the international mathematical congress, Toronto 1924, 2, Toronto, University of Toronto Press, Toronto, pp 2–17 Cayley A (1875a) A memoir on prepotentials. Philos Trans R Soc Lond 165:675–774 Cayley A (1875b) On the potential of the ellipse and the circle. Proc Lond Math Soc 6:38–55 Chandrasekhar S (1969) Ellipsoidal figures of equilibrium. Yale University Press, New Haven Chandrasekhar S, Roberts PH (1963) The ellipticity of a slowly rotating configuration. J Astrophys 138:801–808 Cruz JY (1986) Ellipsoidal corrections to potential coefficients obtained from gravity anomaly data on the ellipsoid. Report 371, Department of Geodetic Science and Surveying, The Ohio State University, Columbus Darboux G (1910) Lecons sur les systemes orthogonaux et les cordonées curvilignes. GauthierVillars, Paris Darwin GH (1906) On the figure and stability of a liquid satellite. Philos Trans R Soc Lond 206:161–248; Scientific Papers 3, Cambridge University Press, Cambridge, 1910, 436 Dedekind R (1860) Zusatz zu der vorstehenden Abhandlung. J Reine Angew Math 58:217–228 Doob JL (1984) Classical potential theory and its probabilistic counterpart. Springer, New York Dyson FD (1991) The potentials of ellipsoids of variable densities. Q J Pure Appl Math XXV:259–288 Eisenhart LP (1934) Separable systems of Stäckel. Ann Math 35:284–305 Ekman M (1996) The permanent problem of the permanent tide; what to do with it in geodetic reference systems. Mar Terres 125:9508–9513 Engels J (2006) Zur Modellierung von Auflastdeformationen und induzierter Polwanderung. Technical Reports, Department of Geodesy and Geoinformatics University Stuttgart, Report 2006.1, Stuttgart Engels J, Grafarend E, Keller W, Martinec Z, Sanso F, Vanicek P (1993) The Geoid as an inverse problem to be regularized. 
In: Anger G, Gorenflo R, Jochmann H, Moritz H, Webers W (eds) Inverse problems: principles and applications in geophysics, technology and medicine Mathematical research, vol 74. Akademie-Verlag, Berlin, pp 122–167 Ferrers NM (1877) On the potentials of ellipsoids, ellipsoidal shells, elliptic harmonic and elliptic rings of variable densities. Q J Pure Appl Math 14:1–22 Finn G (2001) Globale und regionale Darstellung von Lotabweichungen bezüglich des Internationalen Referenzellipsoids. Universität Stuttgart, Stuttgart Fischer D, Michel V (2012) Sparse regularization of inverse gravimetry – case study: spatial and temporal mass variations in South America. Inverse Probl 28:065012 Flügge S (1979) Mathematische Methoden der Physik. Springer, Berlin Freeden W, Michel V (2004) Multiscale potential theory. Birkhäuser, Boston–Basel Freeden W, Gervens T, Schreiner M (1998) Constructive approximation of the sphere. Clarendon, Oxford Friedrich D (1998) Krummlinige Datumstransformation. Universität Stuttgart, Stuttgart Gauss CF (1867) Werke 5, Theoria attractionis corporum sphraedicorum ellipticorum homogeneorum. Königliche Gesellschaft der Wissenschaften, Göttingen Gleason DM (1988) Comparing corrections to the transformation between the geopotential’s spherical and ellipsoidal spectrum. Manuscr Geod 13:114–129 Gleason DM (1989) Some notes on the evaluation of ellipsoidal and spheroidal harmonic expansions. Manuscr Geod 14:114–116

Page 98 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Gradshteyn IS, Ryzhik IM (1980) Tables of integrals, series and products. Corrected and enlarged edition (trans. by A. Jeffrey). Academic, New York Grafarend EW (1988) The geometry of the Earth’s surface and the corresponding function space of the terrestrial gravitational field. Festschrift R. Sigl, Deutsche Geodätische Kommission, Bayerische Akademie der Wissenschaften, Report B 287, München, pp 76–94 Grafarend, EW. (2001): The spherical horizontal and spherical vertical boundary value problem vertical deflections and geoidal undulations. The completed Meissl diagram, Journal of Geodesy, 75, 363–390 Grafarend EW (2011) Space gradiometry: tensor-valued ellipsoidal harmonics, the datum problem and application of the Lusternik Schnirelmann Category to construct a minimum atlas. Int J Geomath 1:145–166 Grafarend EW (2012) Von A. Einstein über H.Weyl und E.Cartan zur Quanten-Gravitation. Sitzungsberichte der Leibniz Sociataet der Wissenschaften 113: 13–21 Grafarend EW, Ardalan AA (1999) World geodetic datum 2000. J Geodesy 73:611–623 Grafarend EW, Awange JL (2000) Determination of vertical deflections by GPS/LPS measurements. Z Vermess 125:279–288 Grafarend EW, Awange J (2012) Applications of linear and nonlinear models: fixed effects, random effects and total least Squares. Springer, Berlin/Heidelberg/New York/Dordrecht, p. 1016 Grafarend EW, Engels J (1998) Erdmessung und physikalische Geodäsie, Ergänzungen zum Thema Legendrefunktionen. Skript zur Vorlesung WS 1998/99. Universität Stuttgart, Stuttgart Grafarend EW, Heidenreich A (1995) The generalized Mollweide projection of the biaxial ellipsoid. Bull Géod 69:164–172 Grafarend EW, Thong NC (1989) A spheroidal harmonic model of the terrestrial gravitational field. Manuscr Geod 14:285–304 Grafarend EW, Krumm F, Okeke F (1995) Curvilinear geodetic datum transformations. Z Vermess 7:334–350 Grafarend EW, Engels J, Varga P (1997) The spacetime gravitational field of a deformable body. 
J Geodesy 72:11–30 Grafarend EW, Finn G, Ardalan AA (2006) Ellipsoidal vertical deflections and ellipsoidal gravity disturbance: case studies. Studia Geophys Geod 50:1–57 Green G (1828) An essay on the determination of the exterior and interior attractions of ellipsoids of variable densities. In: Ferrers NM (ed) Mathematical papers of George Green. Chelsea, New York Groten E (1979) Geodesy and the Earth’s gravity field. Vol I: Principles and conventional methods. Vol II: Geodynamics and advanced methods. Dümmler Verlag, Bonn Groten E (2000) Parameters of common relevance of astronomy, geodesy and geodynamics. The geodesist’s handbook. J Geodesy 74:134–140 Hackbusch W (1995) Integral equations. Theory and numerical treatment. Birkhäuser Verlag, Basel Hake G, Grünreich D (1994) Kartographie. Walter de Gruyter, Berlin Heck B (1991) On the linearized boundary value problem of physical geodesy. Report 407, Department of Geodetic Science and Surveying, The Ohio State University, Columbus Heiskanen WH, Moritz H (1967) Physical geodesy. W.H. Freeman, San Francisco, CA Heiskanen WA, Moritz H (1981) Physical geodesy (Corrected reprint of original edition from W.H. Freeman, San Francisco, CA, 1967), order from: Institute of Physical Geodesy, TU Graz, Austria

Page 99 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Helmert FR (1884) Die mathematischen und physikalischen Theorien der Höheren Geodäsie, Vol 2. B.G. Teubner, Leipzig (Reprinted in 1962 by Minerva GmbH, Frankfurt (Main)) Hicks WM (1882) Recent progress in hydrodynamics. Rep Br Assoc 57–61 Hobson EW (1896) On some general formulae for the potentials of ellipsoids, shells and discs. Proc Lond Math Soc 27:519-416 Hobson EW (1965) The theory of spherical and ellipsoidal harmonics. Second Reprint of the edition 1931 (Cambridge University Press), Chelsea, New York Holota P (1995) Classical methods for non-spherical boundary problems in physical geodesy. In: Sansó F (ed) Symposium 114: Geodetic Theory today. The 3rd Hotine-Marussi Symposium on Mathematical Geodesy. Springer, Berlin/Heidelberg Holota P (2005) Successive approximation in the solution of a weakly formulated geodetic boundary value problem. In: Sanso F (ed) A window for the future of geodesy, proceedings of the internatinal association of geodesy, Sapporo. Springer, Berlin/Heidelberg/New York, pp 452–458 Honerkamp J, Römer H (1986) Grundlagen der klassischen theoretischen Physik. Springer, Berlin/Heidelberg/New York Hotine M (1967) Downward continuation of the gravitational potential. General Assembly of the International Assembly of Geodesy, Luceone Jacobi CGJ (1834) Über die Figur des Gleichgewichts. Poggendorf Annalen der Physik und Chemie 33:229–238. Reprinted in Gesammelte Werke 2 (Berlin, G. Reimer, 1882), pp 17–72 Jeans JH (1917) The motion of tidally-distorted masses, with special reference to theories of cosmogony. Mem R Astron Soc Lond 62:1–48 Jeans JH (1919) Problems of cosmogony and stellar dynamics, chaps 7 and 8. Cambridge University Press, Cambridge Jekeli C (1988) The exact transformation between ellipsoidal and spherical harmonic expansions. Manuscr Geod 13:106–113 Jekeli C (1999) An analysis of vertical deflections derived from high-degree spherical harmonic models. J Geodesy 73:10–22 Kahle AB (1967) Harmonic analysis for a spheroidal Earth. 
RAND Corporation Document P-3684, presented at IUGG Assembly, St. Gallen Kahle AB, Kern JW, Vestine EH (1964) Spherical harmonic analyses for the spheroidal Earth. J Geomagn Geoelectr 16:229–237 Kassir MK, Sih GC (1966) Three-dimensional stress distribution around elliptical crack under arbitrary loadings. ASME J Appl Mech 33:601–611 Kassir MK, Sih GC (1975) Three-dimensional crack problems. Mechanics of fracture, vol 2. Noordhoff International Publishing, Leyden Kellogg OD (1929) Foundations of potential theory. Springer, Berlin/Heidelberg/New York Klapp M (2002a) Synthese der Datumtransformation von Kugel- und Sphäroidalfunktionen zur Darstellung des terrestrischen Schwerefeldes – Beispielrechnungen zu den Transformationsgleichungen. Universität Stuttgart, Stuttgart Klapp M (2002b) Analyse der Datumtransformation von Kugel- und Sphäroidalfunktionen zur Darstellung des terrestrischen Schwerefeldes – Herleitung der Transformationsgleichungen. Universität Stuttgart, Stuttgart Kleusberg A (1980) The similarity transformation of the gravitational potential close to the identity. Manuscr Geod 5:241–256 Kneschke A (1965) Differentialgleichungen und Randwertprobleme. Teubner Verlag, Leipzig

Page 100 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Kohlhaas A (2010) Multiscale methods on regular surfaces and their application to Physical Geodesy. Geomath. Group, Technical University Kaiserslautern, Ph.D. Thesis, Verlag Dr.Hut, Muenchen Lamé G (1859) Lecons sur les cordonnées curvilignes et leurs diverses applications. MalletBachelier, Paris Lamp SH (1932) Hydrodynamics. Cambridge University Press, Cambridge, pp 722–723 Lebovitz NR (1998) The mathematical developments of the classical ellipsoids. Int J Eng Sci 36:1407–1420 Lejeune Dirichlet G (1860) Untersuchungen über ein Problem der Hydrodynamik. J Reine Angew Math 58:181–216 Lejeune Dirichlet G (1897) Gedächtnisrede auf Carl Gustav Jacob Jacobi gehalten in der Akademie der Wissenschaften am 1. Juli 1852. Gesammelte Werke, 2 (Berlin, G. Reimer), 243 Lense J (1950) Kugelfunktionen. Akademische Verlagsgesellschaft Geest-Portig, Leipzig Lowes FJ, Winch DE (2012) Orthogonality of harmonic potentials and fields in spheroidal and ellipsoidal coordinates: application in geomagnetism and geodesy. Geophys J Int 191:491–507 Lyttleton RA (1953) The stability of rotating liquid masses, chap 9. Cambridge University Press, Cambridge Maclaurin C (1742) A treatise on fluxions. Edinburgh MacMillan WD (1958) The theory of the potential. Dover, New York Mangulis V (1965) Handbook of series for scientists and engineers. Academic, New York Martinec Z (1996) Stability investigations of a discrete downward continuation problem for Geoid determination in the Canadian Rocky Mountains. J Geodesy 70:805–828 Martinec Z, Grafarend EW (1997) Solution to the Stokes boundary value problem on an ellipsoid of revolution. Stud Geophys Geod 41:103–129 Martinec Z, Vaniˇcek P (1996) Formulation of the boundary-value problem for Geoid determination with a higher-order reference field. Geophys J Int 126:219–228 Mikolaiski HW (1989) Polbewegung eines deformierbaren Erdkörpers. 
PhD thesis, Deutsche Geodätische Kommission, Bayerische Akademie der Wissenschaften, Reihe C, Heft 354, München Milne EA (1952) Sir James Jeans, a biography, chap 9. Cambridge University Press, Cambridge Molodensky MS (1958) Grundbegriffe der geodätischen Gravimetrie. VEB Verlag Technik, Berlin Moon P, Spencer DE (1953) Recent investigations of the Separation of Laplace’s equation. Ann Math Soc Proc 4:302–307 Moon P, Spencer DE (1961) Field theory for engineers. D. van Nostrand, Princeton, NJ Moritz H (1968a) Density distributions for the equipotential ellipsoid. Department of Geodetic Science and Surveying, The Ohio State University, Columbus Moritz H (1968b) Mass distributions for the equipotential ellipsoid. Boll Geofis Teorica Appl 10:59–65 Moritz H (1973) Computation of ellipsoidal mass distributions. Department of Geodetic Science and Surveying, The Ohio State University, Columbus Moritz H (1980) Geodetic reference system 1980. Bull Géod 54:395–405 Moritz H (1984) Geodetic reference system 1980. The geodesist’s handbook. Bull Géod 58:388–398 Moritz H (1990) The figure of the Earth. Wichmann Verlag, Karlsruhe Moritz H, Mueller I (1987) Earth rotation. Theory and observation. Ungar, New York Morse PM, Feshbach H (1953) Methods of theoretical physics, part II. McGraw Hill, New York

Page 101 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Natanson JP (1967) Theory of functions of a real variable. Frederick, New York Niven WD (1891) On ellipsoidal harmonics. Philos Trans R Soc Lond A 182:231–278 Otero J (1995) A uniqueness theorem for a Robin boundary value problem of physical geodesy. Q J Appl Math Panasyuk VV (1971) Limiting equilibrium of brittle solids with fractures. Management Information Sevices, Detroit, MI Pflaumann E, Unger H (1974) Funktionalanalysis I. Zürich Pick M, Picha J, Vyskoèil V (1973) Theory of the Earth’s gravity field. Elsevier, Amsterdam Pizzetti P (1894) Geodesia – Sulla espressione della gravita alla superficie del Geoide, supposto ellissoidico. Atti Reale Accademia dei Lincei 3:166–172 Poincare H (1885) Sur l’équilibre d’une masse fluide animée d’un mouvement de rotation. Acta Math 7:259–380 Polya G (1965) Mathematical discovery. On understanding, learning and teaching problem solving. Wiley, New York Press WH et al (1989) Numerical recipes. The art of scientific computing. Cambridge University Press, Cambridge Rapp RH, Cruzy JY (1986) Spherical harmonic expansions of the Earth’s gravitational potential to degree 360 using 30’ mean anomalies. Report no. 376, Department of Geodetic Science and Surveying, The Ohio State University, Columbus Riemann B (1860) Untersuchungen über die Bewegung eines flüssigen gleichartigen Ellipsoides. Abhandlung der Königlichen Gesellschaft der Wissenschaften 9:3–36; Gesammelte Mathematische Werke (Leipzig, B.G. Teubner, 1892), p 182 Roche MEd (1847) Mémoire sur la figure d’une masse fluide (soumise á l’attraction d’un point éloigné. Acad des Sci de Montpellier 1(243–263):333–348 Routh EJ (1902) A treatise on analytical statics, vol 2. Cambridge University Press, Cambridge Rummel R et al (1988) Comparisons of global topographic isostatic models to the Earth’s observed gravity field. 
Report 388, Department of Geodetic Science and Surveying, The Ohio State University, Columbus Sansò F, Sona G (2001) ELGRAM, an ellipsoidal gravity model manipulator. Boll Geod Sci Affini 60:215–226 Sauer R, Szabo J (1967) Mathematische Hilfsmittel des Ingenieurs. Springer, Berlin Schäfke FW (1967) Spezielle Funktionen. In: Sauer R, Szabó I (eds) Mathematische Hilfsmittel des Ingenieurs, Teil 1. Springer, Heidelberg/Berlin/New York, pp 85–232 Shah RC, Kobayashi AS (1971) Stress-intensity factor for an elliptical crack under arbitrary normal loading. Eng Fract Mech 3:71–96 Shahgholian H (1991) On the Newtonian potential of a heterogeneous ellipsoid. SIAM J Math Anal 22:1246–1255 Skrzipek MR (1998) Polynomial evaluation and associated polynomials. Numer Math 79:601–613 Smith JR (1986) From plane to spheroid. Landmark Enterprises, Pancho Cordova, California Sneddon IN (1966) Mixed boundary value problems in potential theory. Wiley, New York Somigliana C (1929) Teoria generale del campo gravitazionale dell’ ellisoide di rotazione. Mem Soc Astron Ital IV:541–599 Sona G (1996) Numerical problems in the computation of ellipsoidal harmonics. J Geodesy 70:117–126 Stäckel P (1897) Über die Integration der Hamiltonschen Differentialgleichung mittels Separation der Variablen. Math Ann 49:145–147

Page 102 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_7-4 © Springer-Verlag Berlin Heidelberg 2015

Stokes GG (1849) On the variation of gravity on the surface of the Earth. Trans Camb Philos Soc 8:672–695 Šebera J, Bouman, J Bosch W (2012) On the computing ellipsoidal harmonics using Jekeli’s renormalization. J Geod 86:713–728 Thomson W, Tait PG (1883) Treatise on natural philosophy. Cambridge University Press, Cambridge, pt 2, pp 324–335 Thong NC (1993) Untersuchungen zur Lösung der fixen gravimetrischen Randwertprobleme mittels sphäroidaler und Greenscher Funktionen. Deutsche Geodätische Kommission, Bayerische Akademie der Wissenschaften, Reihe C 399, München 1993 (in German) Thong NC, Grafarend EW (1989) A spheroidal harmonic model of the terrestrial gravitational field. Manuscr Geod 14:285–304 Todhunter I (1966) History of the mathematical theories of attraction and the figure of the Earth from the time of Newton to that of Laplace. Dover, New York van Asche W (1991) Orthogonal polynomials, associated polynomials and functions of the second kind. J Comput Appl Math 37:237–249 Vaniˇcek P et al (1996) Downward continuation of Helmert’s gravity. J Geodesy 71:21–34 Varshalovich DA, Moskalev AN, Khersonskii VK (1989) Quantum theory of angular momentum. World Scientific, Singapore Vijaykumar K, Atluri SN (1981) An embedded elliptical crack, in an infinite solid, subject to arbitrary crack face tractions. ASME J Appl Mech 48:88–96 Webster AG (1925) The dynamics of particles and of rigid, elastic and fluid bodies. Teubner, Leipzig Whittaker ET, Watson GN (1935) A course of modern analysis, vol 2. Cambridge University Press, Cambridge Yeremeev VF, Yurkina MI (1974) Fundamental equations of Molodenskii’s theory for the gravitational references field. Stud Geophys Geod 18:8–18 Yu JH, Cao HS (1996) Elliptical harmonic series and the original Stokes problem with the boundary of the reference ellipsoid. J Geodesy 70:431–439

Page 103 of 103

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_8-2 © Springer-Verlag Berlin Heidelberg 2014

Time-Variable Gravity Field and Global Deformation of the Earth
Jürgen Kusche
Astronomical, Physical and Mathematical Geodesy Group, Bonn University, Bonn, Germany

Abstract The analysis of the Earth’s time-variable gravity field and its changing patterns of deformation plays an important role in Earth system research. These two observables provide, for the first time, a direct measurement of the amount of mass that is redistributed at or near the surface of the Earth by oceanic and atmospheric circulation and through the hydrological cycle. In this chapter, we will first reconsider the relations between gravity and mass change. We will in particular discuss the role of the hypothetical surface mass change that is commonly used to facilitate the inversion of gravity change to density. Then, after a brief discussion of the elastic properties of the Earth, the relation between surface mass change and the three-dimensional deformation field is considered. Both types of observables are then discussed in the framework of inversion. None of our findings are entirely new; we merely aim at a systematic compilation and discuss some frequently made assumptions. Finally, some directions for future research are pointed out.

1 Introduction
Mass transports inside the Earth and at or above its surface generate changes in the external gravity field and in the geometrical shape of the Earth. Depending on their magnitude and on their spatial and temporal scales, they become visible in the observations of modern space-geodetic and terrestrial techniques. Examples of mass transport processes that are sufficiently large to be observed include changes in continental water storage in greater river basins and catchment areas, large-scale snow coverage, present-day ice mass changes occurring at Greenland and Antarctica and at the continental glacier systems, atmospheric pressure changes, sea level change and the redistribution of ocean circulation systems, land-ocean exchange of water, and glacial-isostatic adjustment (cf. Fig. 1). As many of these processes are directly linked to climate, an improved quantification and understanding of their present-day trends and interannual variations from geodetic data will help to separate them from the long-term evolution, typically inferred from proxy data. The order of magnitude for atmospheric, hydrological, and oceanic loads, in terms of the associated change of the geoid, an equipotential surface of the Earth’s gravity field, is about 1 cm or $1 \times 10^{-9}$ in relative terms. For example, in the Amazon region of South America, this geoid change is largely caused by an annual basin-wide oscillation in surface and ground water that amounts to several dm.
The variety of geodetic techniques that are sufficiently sensitive includes intersatellite tracking as currently conducted with the Gravity Recovery and Climate Experiment (GRACE) mission, satellite laser ranging, superconducting relative and free-fall absolute gravimetry, the monitoring of deformations with the Global Positioning System (GPS), with the Very Long Baseline Interferometry (VLBI) network of radio telescopes, and with Interferometric Synthetic Aperture Radar (InSAR), and the monitoring of the ocean surface with satellite altimetry and tide gauges.


Fig. 1 Spatial and temporal scales of mass transport processes and the resolution limits of satellite gravity missions CHAMP, GRACE, and GOCE (From Ilk et al. 2005). Axes: spatial resolution in km (10 to 10,000) versus time resolution (instantaneous to static).

Here, we will limit ourselves to a discussion of the fundamental relations that concern present-day mass redistributions and their observability in time-variable gravity (TVG) and time-variable deformation of the Earth. This is not intended to form the basis for real-data inversion schemes. Rather, we would like to point out essential assumptions and fundamental limitations in some commonly applied algorithms. To this end, it will be sufficient to consider the response of the solid Earth to the mass loading as purely elastic. In fact, this is a first assumption that no longer holds in the discussion of sea level, when geological timescales are involved at which the Earth’s response is driven by viscous or viscoelastic behavior. Furthermore, we will implicitly consider only that part of mass redistribution or mass transport that is actually associated with a local change of density. Mass transport that does not change the local density (stationary currents in the hydrological cycle, i.e., the “motion term” in Earth rotation analysis) is in the null-space of the Newton and deformation operators; it cannot be inferred from observations of time-variable gravity or from displacement data.


2 Mass and Mass Redistribution
In what follows, we will adopt an Eulerian representation of the redistribution of mass, where one considers mass density as a 4D field rather than following the path of individual particles (Lagrangian representation). The mass density inside the Earth, including the oceans and the atmosphere, is then described by

$$\rho = \rho(\mathbf{x}', t). \tag{1}$$

Mass density is the source of the external gravitational potential, described by the Newton integral

$$V(\mathbf{x}, t) = G \int_v \frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\, dv, \tag{2}$$

where $v$ is the volume of the Earth including its fluid and gaseous envelope. The inverse distance (Poisson) kernel can be developed into

$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = \frac{1}{r'} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left(\frac{r'}{r}\right)^{n+1} \frac{1}{(2n+1)}\, Y_{nm}(\mathbf{e})\, Y_{nm}(\mathbf{e}'), \tag{3}$$

where $r = |\mathbf{x}|$, $r' = |\mathbf{x}'|$, and

$$\mathbf{e} = \frac{\mathbf{x}}{r}, \qquad \mathbf{e}' = \frac{\mathbf{x}'}{r'}.$$

$Y_{nm}$ are the $4\pi$-normalized surface spherical harmonics. Using spherical coordinates $\lambda$ (spherical longitude) and $\theta$ (spherical colatitude),

$$Y_{nm} = P_{n|m|}(\cos\theta) \begin{cases} \cos m\lambda, & m \ge 0, \\ \sin |m|\lambda, & m < 0. \end{cases}$$

Here, the $P_{nm} = \Pi_{nm} P_n^m$ are the $4\pi$-normalized associated Legendre functions. $\Pi_{nm} = \left( (2 - \delta_{0m})(2n+1) \frac{(n-m)!}{(n+m)!} \right)^{1/2}$ denotes a normalization factor that depends only on harmonic degree $n$ and order $m$. The associated Legendre functions relate to the Legendre polynomials by $P_n^m(u) = \left(1 - u^2\right)^{m/2} \frac{d^m P_n(u)}{du^m}$, and the Legendre polynomials may be expressed by the Rodrigues formula, $P_n(u) = \frac{1}{2^n n!} \frac{d^n}{du^n} \left(u^2 - 1\right)^n$. On plugging Eq. (3) into Eq. (2), the exterior gravitational potential of the Earth takes on the representation

$$V(\mathbf{x}, t) = G \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1}{(2n+1)\, r^{n+1}} \left[ \int_v \rho(\mathbf{x}', t)\, r'^{\,n}\, Y_{nm}(\mathbf{e}')\, dv \right] Y_{nm}(\mathbf{e}), \tag{4}$$

which converges outside of a sphere that tightly encloses the Earth. On the other hand, by introducing a reference scale $a$, the exterior potential $V$ can be written as a general solution of the Laplace equation,


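$$V(\mathbf{x}, t) = \frac{GM}{a} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left(\frac{a}{r}\right)^{n+1} v_{nm}(t)\, Y_{nm}(\mathbf{e}) \tag{5}$$

with $4\pi$-normalized spherical harmonic coefficients

$$v_{nm} = \begin{cases} c_{nm}, & m \ge 0, \\ s_{n|m|}, & m < 0. \end{cases}$$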
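The expansion of the inverse distance in Eq. (3) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original chapter: the function names (`legendre_normalized`, `ynm`, `inverse_distance_series`), the test points, and the truncation degree 40 are illustrative assumptions. The associated Legendre functions are evaluated with the standard three-term recursion and then scaled by the normalization factor $\Pi_{nm}$ given above.

```python
import math


def legendre_normalized(n, m, u):
    """4*pi-normalized associated Legendre function P_nm(u) for 0 <= m <= n."""
    s = math.sqrt(max(0.0, 1.0 - u * u))
    # Sectorial start value: P_m^m(u) = (2m-1)!! * (1-u^2)^(m/2)
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= (2 * k - 1) * s
    p = p_prev
    if n > m:
        # P_{m+1}^m(u) = (2m+1) * u * P_m^m(u)
        p = (2 * m + 1) * u * p_prev
        for l in range(m + 2, n + 1):
            # (l-m) P_l^m = (2l-1) u P_{l-1}^m - (l+m-1) P_{l-2}^m
            p_prev, p = p, ((2 * l - 1) * u * p - (l + m - 1) * p_prev) / (l - m)
    delta_0m = 1 if m == 0 else 0
    norm = math.sqrt((2 - delta_0m) * (2 * n + 1)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * p


def ynm(n, m, theta, lam):
    """Surface spherical harmonic Y_nm(theta, lambda), 4*pi-normalized."""
    p = legendre_normalized(n, abs(m), math.cos(theta))
    return p * (math.cos(m * lam) if m >= 0 else math.sin(-m * lam))


def to_cartesian(r, theta, lam):
    """Spherical (r, colatitude, longitude) to Cartesian coordinates."""
    return (r * math.sin(theta) * math.cos(lam),
            r * math.sin(theta) * math.sin(lam),
            r * math.cos(theta))


def inverse_distance_series(x_sph, xp_sph, nmax):
    """Right-hand side of Eq. (3), truncated at degree nmax (requires r > r')."""
    r, th, la = x_sph
    rp, thp, lap = xp_sph
    total = 0.0
    for n in range(nmax + 1):
        s = sum(ynm(n, m, th, la) * ynm(n, m, thp, lap) for m in range(-n, n + 1))
        total += (rp / r) ** (n + 1) / (2 * n + 1) * s
    return total / rp


# Hypothetical test points: field point x outside, source point x' inside.
x = (2.0, 0.7, 1.3)
xp = (1.0, 2.1, -0.4)
exact = 1.0 / math.dist(to_cartesian(*x), to_cartesian(*xp))
series = inverse_distance_series(x, xp, 40)
print(exact, series)
```

With $r'/r = 0.5$, the truncation error at degree 40 is far below the printed precision, so the two printed values should agree closely. The series only converges for $r > r'$, which mirrors the convergence statement made for Eq. (4).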

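A direct comparison of Eqs. (4) and (5) provides the source representation of the spherical harmonic coefficients (see also chapter “Stokes Problem, Layer Potentials and Regularizations, Multiscale Applications”):

$$v_{nm}(t) = \frac{1}{M} \frac{1}{(2n+1)} \frac{1}{a^n} \int_v r'^{\,n}\, Y_{nm}(\mathbf{e}')\, \rho(\mathbf{x}', t)\, dv. \tag{6}$$

There are $2n+1$ coefficients for each degree $n$, and each coefficient follows by projecting the density on a single 3D harmonic function (solid spherical harmonic) $r^n Y_{nm}$. An equivalent approach is to expand the inverse distance into a Taylor series:

$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = \sum_{n=0}^{\infty} \left( \sum_{i_1=1}^{3} \sum_{i_2=1}^{3} \cdots \sum_{i_n=1}^{3} \gamma^{(n)}_{i_1 i_2 \ldots i_n}\, x'_{i_1} x'_{i_2} \cdots x'_{i_n} \right) \tag{7}$$

with components $x'_1$, $x'_2$, $x'_3$ of $\mathbf{x}'$, and

$$\gamma^{(n)}_{i_1 i_2 \ldots i_n} = \frac{1}{n!} \left[ \frac{\partial^n \frac{1}{|\mathbf{x} - \mathbf{x}'|}}{\partial x'_{i_1} \partial x'_{i_2} \cdots \partial x'_{i_n}} \right]_{\mathbf{x}' = \mathbf{0}}. \tag{8}$$

This leads to a representation of the exterior potential through mass moments $M^{(n)}_{i_1 i_2 \ldots i_n}$ of degree $n$:

$$V(\mathbf{x}, t) = G \sum_{n=0}^{\infty} \left( \sum_{i_1=1}^{3} \sum_{i_2=1}^{3} \cdots \sum_{i_n=1}^{3} \gamma^{(n)}_{i_1 i_2 \ldots i_n}\, M^{(n)}_{i_1 i_2 \ldots i_n} \right) \tag{9}$$

with

$$M^{(n)}_{i_1 i_2 \ldots i_n} = \int_v x'_{i_1} x'_{i_2} \cdots x'_{i_n}\, \rho(\mathbf{x}', t)\, dv'. \tag{10}$$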
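For a density made up of discrete point masses, the integral in Eq. (10) reduces to a finite sum. The sketch below illustrates this under that assumption; the helper `mass_moment` and the sample masses and positions are hypothetical and chosen only for the example. It shows that the degree-0 moment is the total mass, that the degree-1 moments divided by the total mass give the center of mass, and that the degree-2 moments are symmetric in their indices.

```python
from itertools import product


def mass_moment(points, masses, n):
    """Degree-n mass moments of Eq. (10) for point masses.

    Returns a dict mapping each index tuple (i1, ..., in), with indices
    0, 1, 2 for the three Cartesian axes, to sum_k m_k * x_k[i1] * ... * x_k[in].
    """
    moments = {}
    for idx in product(range(3), repeat=n):
        total = 0.0
        for x, m_k in zip(points, masses):
            term = m_k
            for i in idx:
                term *= x[i]
            total += term
        moments[idx] = total
    return moments


# Hypothetical point masses (arbitrary units), for illustration only.
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, -1.0), (1.0, 1.0, 1.0)]
masses = [2.0, 1.0, 3.0, 4.0]

M0 = mass_moment(points, masses, 0)   # {(): total mass}
M1 = mass_moment(points, masses, 1)   # first moments
M2 = mass_moment(points, masses, 2)   # second moments (9 entries, symmetric)

total_mass = M0[()]
center_of_mass = tuple(M1[(i,)] / total_mass for i in range(3))
distinct_degree2 = len({tuple(sorted(idx)) for idx in M2})

print(total_mass, center_of_mass, distinct_degree2)
```

Storing the full index set is wasteful for larger $n$, since permuted indices give identical values; the sketch keeps it only to make the symmetry visible.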

Note that the integrals in Eqs. (6) and (10) can be interpreted as inner products $(\cdot, \rho)$ in a Hilbert space of integrable density functions defined on $v$. However, whereas the homogeneous Cartesian polynomials in Eq. (10) do provide a complete basis for the approximation of the density, the solid spherical harmonics in Eq. (6) fail to do so (in fact, they only allow one to represent the “harmonic density” part $\rho^+$ for which $\Delta \rho^+ = 0$). Due to the symmetry of Eq. (10), there are $\frac{(n+1)(n+2)}{2}$ independent mass moments of degree $n$. Of those only $2n+1$ become visible in the gravitational potential, which means that the remaining $\frac{(n+1)(n+2)}{2} - (2n+1) = \frac{(n-1)n}{2}$ span the null-space of the Newton integral (Chao 2005).

Some of the low-degree coefficients and mass moments deserve a special discussion; e.g., obviously

$$v_{00} = \frac{1}{M} \int_v \rho(\mathbf{x}', t)\, dv = \frac{M^{(0)}}{M}$$

corresponds to the total mass of the Earth, scaled by a reference value $M$. Since this is largely constant in time, inversion of geodetic data usually assumes $\frac{d}{dt} v_{00} = 0$ (or $\delta v_{00} = 0$; see below). Moreover, since $r' Y_{10}(\mathbf{e}') = \sqrt{3}\, x'_3$ in Eq. (6), the degree-1 coefficients directly correspond to the coordinates of the center of mass $\mathbf{x}^0(t)$:

$$v_{10}(t) = \frac{\sqrt{3}}{3Ma} \int_v x'_3\, \rho(\mathbf{x}', t)\, dv = \frac{\sqrt{3}}{3a} \frac{M^{(1)}_3}{M} = \frac{1}{\sqrt{3}\, a}\, x^0_3(t)$$

and

$$v_{11}(t) = \frac{1}{\sqrt{3}\, a}\, x^0_1(t), \qquad v_{1,-1}(t) = \frac{1}{\sqrt{3}\, a}\, x^0_2(t)$$

(in the reference system to which the spherical coordinates $\lambda$, $\theta$ refer). Space-geodetic evidence suggests that if the considered reference frame is fixed to the crust of the Earth, the temporal variation of the center of mass is no more than a few mm, with the $x^0_3$-component being largest. This is the case for realizations of the International Terrestrial Reference System (ITRS). Yet rather large mass redistributions are required to shift the geocenter by a few mm, and the study of these effects is the subject of current research. Similarly, the degree-2 spherical harmonic coefficients $v_{2m}$ can be related to the tensor of inertia of the Earth. However, of the $\frac{(2+1)(2+2)}{2} = 6$ mass moments $M^{(2)}_{i_1 i_2}$, only $2 \cdot 2 + 1 = 5$ become visible in the five spherical harmonic coefficients of the gravity field, with the null-space of the Newton integral being of dimension $6 - 5 = 1$ for degree 2. The annual variation of the “flattening” coefficient $v_{20}$ is of the order $1 \times 10^{-10}$, the linear rate $\frac{d}{dt} v_{20}$ (predominantly due to the continuing rebound of the Earth in response to deglaciation after the last ice age) at the $1 \times 10^{-11}\ \mathrm{y}^{-1}$ level. A special case is the variation of the degree 2, order 1 coefficients ($v_{21}$, $v_{2,-1}$), as those correspond to the position of the figure axis of the Earth (in the reference system to which $\lambda$, $\theta$ refer). Since the mean figure axis is close to the mean rotation axis and since the latter can be determined with high precision from the measurement of Earth’s rotation, additional observational constraints for their time variation (of the order of $1 \times 10^{-11}\ \mathrm{y}^{-1}$) exist. Nowadays, the $v_{nm}$ can be observed from precise satellite tracking with global navigation systems, intersatellite ranging, and satellite accelerometry with temporal resolution of 1 month or below and spatial resolution of up to $n_{\max} = 120$. However, as mentioned earlier, one cannot uniquely invert gravity change into density change.
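The low-degree relations and the size of the null-space can be made concrete for a set of point masses. The sketch below is an illustration under stated assumptions rather than the chapter's own procedure: it evaluates Eq. (6) for degrees 0 and 1 using $r' Y_{10} = \sqrt{3}\,x'_3$, $r' Y_{11} = \sqrt{3}\,x'_1$, and $r' Y_{1,-1} = \sqrt{3}\,x'_2$ for the $4\pi$-normalized harmonics, so that the degree-1 coefficients equal the center-of-mass coordinates divided by $\sqrt{3}\,a$. The function names, the reference scale `a`, and the sample masses are hypothetical.

```python
import math


def degree01_coefficients(points, masses, a, m_ref):
    """v_00, v_10, v_11, v_1,-1 from Eq. (6), evaluated for point masses."""
    scale = math.sqrt(3.0) / (3.0 * m_ref * a)
    v00 = sum(masses) / m_ref
    v10 = scale * sum(m * x[2] for x, m in zip(points, masses))
    v11 = scale * sum(m * x[0] for x, m in zip(points, masses))
    v1m1 = scale * sum(m * x[1] for x, m in zip(points, masses))
    return v00, v10, v11, v1m1


def nullspace_dimension(n):
    """Number of degree-n mass moments invisible in the exterior potential."""
    return (n + 1) * (n + 2) // 2 - (2 * n + 1)


# Hypothetical configuration (arbitrary units), for illustration only.
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, -1.0), (1.0, 1.0, 1.0)]
masses = [2.0, 1.0, 3.0, 4.0]
a = 2.0
m_ref = sum(masses)

v00, v10, v11, v1m1 = degree01_coefficients(points, masses, a, m_ref)

# Recover the center of mass from the degree-1 coefficients.
center_from_v = (math.sqrt(3.0) * a * v11,
                 math.sqrt(3.0) * a * v1m1,
                 math.sqrt(3.0) * a * v10)

print(v00, center_from_v)
print([nullspace_dimension(n) for n in range(6)])
```

The null-space is empty for degrees 0 and 1 and grows quadratically afterwards, which is why the non-uniqueness first appears at degree 2.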
The question has been raised which physically plausible assumptions nevertheless allow one to locate the sources of these gravity changes. A common way to regularize this “gravitational-change inverse problem” (GCIP) is to restrict the solution space to density changes within an infinitely thin spherical shell of radius $a$ (Wahr et al. 1998). This corresponds to the determination of surface mass from an external potential. The spherical harmonic coefficients $v_{nm}$ of the potential caused by a surface density $\sigma$ on a sphere are

$$v_{nm}(t) = \frac{1}{M} \frac{a^2}{(2n+1)} \int_s \sigma(\mathbf{e}', t)\, Y_{nm}(\mathbf{e}')\, ds, \tag{11}$$

where $ds$ denotes the surface element of the unit sphere $s$.

Since only changes with respect to a reference status, which can be realized through the measurements (e.g., a long-term average), are observable, one defines $\delta\mu(e', t) = \mu(e', t) - \bar{\mu}(e')$, and the coefficients of potential change follow from

$$\delta v_{nm}(t) = v_{nm}(t) - \bar{v}_{nm} = \frac{a^2}{M}\,\frac{1}{2n+1} \int_s \delta\mu(e', t)\, Y_{nm}(e')\, ds, \tag{12}$$

where $\bar{v}_{nm} = \int_{t_1}^{t_2} v_{nm}(t')\, dt'$. Or, involving the spherical harmonic expansion of $\delta\mu$,

$$\delta v_{nm} = \frac{4\pi a^2}{M}\,\frac{1}{2n+1}\,\delta\mu_{nm} = \frac{3}{a\rho_e}\,\frac{1}{2n+1}\,\delta\mu_{nm}, \tag{13}$$

where we made use of the average density of the Earth, $\rho_e = M / v_a$, with $v_a = \frac{4}{3}\pi a^3$ being the volume of a sphere of radius $a$. Obviously, this relation is coefficient-by-coefficient invertible. Corresponding to the $2n+1$ (unnormalized) coefficients $\delta v_{nm}$ at degree $n$, there are just $2n+1$ surface mass moments, with components $e'_1$, $e'_2$, $e'_3$ of $e'$,

$$\delta A^{(n)}_{i_1 i_2 \ldots i_n} = a^n \int_s e'_{i_1} e'_{i_2} \cdots e'_{i_n}\, \delta\mu(e', t)\, ds. \tag{14}$$
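The coefficient-by-coefficient scaling of Eq. (13) can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference implementation: the numerical values for the radius `A_EARTH` and the mass `M_EARTH` are rounded illustrative constants, the function names are hypothetical, and the Earth is treated as rigid.

```python
import math

A_EARTH = 6.371e6   # mean Earth radius in m (rounded, illustrative assumption)
M_EARTH = 5.972e24  # mass of the Earth in kg (rounded, illustrative assumption)


def potential_from_surface_density(n, dmu_nm, a=A_EARTH, M=M_EARTH):
    """Eq. (13): potential coefficient change from a surface density
    coefficient dmu_nm (kg/m^2) at degree n, rigid Earth."""
    return 4.0 * math.pi * a**2 / M / (2 * n + 1) * dmu_nm


def surface_density_from_potential(n, dv_nm, a=A_EARTH, M=M_EARTH):
    """Inverse of Eq. (13): the scaling is invertible degree by degree."""
    return M * (2 * n + 1) / (4.0 * math.pi * a**2) * dv_nm


def mean_density(a=A_EARTH, M=M_EARTH):
    """Average density rho_e = M / v_a with v_a the volume of the sphere."""
    return M / (4.0 / 3.0 * math.pi * a**3)


if __name__ == "__main__":
    # 10 kg/m^2 corresponds to roughly 1 cm of water.
    dv = potential_from_surface_density(2, 10.0)
    print(dv, surface_density_from_potential(2, dv))
```

The two prefactors in Eq. (13) agree because $4\pi a^2 / M = 3 / (a\rho_e)$ when $\rho_e$ is defined through the volume of the sphere; `mean_density` can be used to check this numerically.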

The integrals in Eqs. (12) and (14) can be viewed as inner products $(\cdot, \delta\mu)$ in a Hilbert space of integrable (density) functions defined on the sphere. We will again look at low-degree terms. Obviously,

$$\delta v_{00} = \frac{4\pi a^2}{M}\,\delta\mu_{00} = \frac{3}{a\rho_e}\,\delta\mu_{00} = \frac{a^2}{M}\,\delta A^{(0)}$$

corresponds to the change in total surface mass. This should be zero as long as $\delta\mu$ considers all subsystems, $\delta\mu = \sum_s \delta\mu^{(s)}$, whereas a shift of total mass from, e.g., the ocean to the atmosphere may very well happen. For the center-of-mass shift referred to the Earth (strictly speaking, to the center of the sphere on which $\mu$ is located) with mass $M$,

$$\delta v_{10}(t) = \frac{4\pi a^2}{3M}\,\delta\mu_{10} = \frac{1}{a\rho_e}\,\delta\mu_{10} = \frac{1}{\sqrt{3}\,a}\, x_3^0 = \frac{a}{\sqrt{3}\,M}\,\delta A^{(1)}_3$$

and similarly for $\delta v_{11}$ and $\delta v_{1,-1}$.


Restriction of the density to a spherical shell serves to eliminate the null-space of the problem, thus allowing a unique solution for density determination from gravity. In fact, the surface layer could be located on the surface of an ellipsoid of revolution or any other sufficiently smooth surface as well. This could be implemented in Eqs. (11) and (12) by including an upward continuation integral term; however, it would no longer allow a simple scaling of the coefficients as with Eq. (13). On the other hand, in the light of comparing inferred surface densities to modeled densities (e.g., from ocean or hydrology models), one could determine point-wise densities on a nonspherical surface from those on a spherical shell by applying an $\left(\frac{a}{r}\right)^n$ term (Chao 2005).

The 3D density $\rho$ and the 2D density $\mu$ are of course related to each other. This can be best seen by writing the source representation of the spherical harmonic coefficients, Eq. (6), in the following way:

$$v_{nm}(t) = \frac{a^2}{M}\,\frac{1}{2n+1} \int_s \left\{ \int_0^{r_{\max}} \left(\frac{r'}{a}\right)^{n+2} \rho(x', t)\, dr' \right\} Y_{nm}(e')\, ds \tag{15}$$

and comparing the term in $\{\ldots\}$ brackets to Eq. (11). Surface mass change $\delta\mu$ can thus be considered as a column-integrating “mapping” of $\delta\rho$, say, $L^2(s) \times L^1([0, r_{\max}]) \to L^2(s)$ if we assume square integrability. A coarse approximation of the integral in Eq. (15) is used if surface density change $\delta\mu$ is transformed to “equivalent water height” change $\delta h_w = \frac{\delta\mu}{\rho_w}$ with a constant reference value $\rho_w$ for the density of water, or if (real) water height change (e.g., from ocean model output) is transformed to surface density change $\delta\mu = \rho_w\, \delta h_w$. In this case, the assumption is implicitly made that $\frac{a}{r} \approx 1$, introducing an error of the order of $(n+2)\, f\, \delta\mu_{nm}$, where $f$ is the flattening of the Earth (about 1/300).

For computing the contribution of the Earth’s atmosphere to observed gravity change, it is nowadays accepted that 3D integration should be preferred to the simpler approach where surface pressure anomalies are converted to surface density. For example, in the GRACE analysis (Flechtner 2007), $\rho(x', t)\, dr'$ in Eq. (15) is replaced by $-\frac{dp}{g(r)}$, assuming hydrostatic equilibrium. Then with $g(r) \approx g \left(\frac{a}{r}\right)^2$ and $r = a + N + \frac{\Phi}{1 - \frac{\Phi}{a}}$,

$$v_{nm}(t) = \frac{a^2}{M}\,\frac{1}{2n+1}\,\frac{1}{g} \int_s \left\{ \int_0^{p_s} \left( \frac{a}{a - \Phi} + \frac{N}{a} \right)^{n+4} dp \right\} Y_{nm}(e')\, ds. \tag{16}$$

In Eq. (16), $N$ is the height of the mean geoid above the ellipsoid, $p_s$ is surface pressure, and $\Phi$ is the geopotential height that is derived from vertical level data on temperature, pressure, and humidity. A completely different way to solve the gravitational inverse problem (GIP) (and the GCIP) has been suggested by Michel (2005), who derives the harmonic part of the density from gravity observations.
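The equivalent-water-height conversion described above, together with the quoted order of magnitude of its error, can be sketched as follows. The constants are assumptions for illustration: `RHO_WATER` is a nominal reference density, and `FLATTENING` uses the rounded value of about 1/300 quoted in the text. The function names are hypothetical.

```python
RHO_WATER = 1000.0        # reference density of water in kg/m^3 (assumed value)
FLATTENING = 1.0 / 300.0  # flattening of the Earth, rounded as in the text


def equivalent_water_height(dmu, rho_w=RHO_WATER):
    """Surface density change (kg/m^2) -> equivalent water height change (m)."""
    return dmu / rho_w


def surface_density_from_water_height(dh_w, rho_w=RHO_WATER):
    """Water height change (m) -> surface density change (kg/m^2)."""
    return rho_w * dh_w


def thin_layer_relative_error(n, f=FLATTENING):
    """Order of magnitude of the relative error (n + 2) * f made at degree n
    when a/r is set to 1 in the column integral of Eq. (15)."""
    return (n + 2) * f


if __name__ == "__main__":
    print(equivalent_water_height(25.0))
    for n in (2, 20, 60, 120):
        print(n, thin_layer_relative_error(n))
```

The estimate grows linearly with degree, which suggests that the spherical thin-layer approximation is least critical for the long-wavelength part of the signal.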


3 Earth Model

A caveat must be stated at this point, since so far we have considered the Earth as rigid. In reality, any mass redistribution at the surface or even inside the Earth is accompanied by a deformation of the solid Earth in its surroundings, which may be considered as an elastic, instantaneous response for short timescales and a viscoelastic response for longer timescales. This deformation causes an additional, “indirect” change of the gravity potential, which is not negligible and generally depends on the harmonic degree $n$ of the load $\delta\mu_{nm}$. The linear differential equations that describe the deformation and gravity change of an elastic or viscoelastic, symmetric Earth, forced by surface loading of redistributing mass, are usually derived by considering a small perturbation of a radially symmetric, hydrostatically prestressed background state. In the linearized equation of momentum,

$$\nabla \cdot \sigma + \nabla(\rho_0 g_0\, u \cdot e_r) - \rho_0 \nabla \delta V - \delta\rho\, g_0 e_r - \rho\, \frac{d^2}{dt^2} u = 0, \tag{17}$$

the first term describes the contribution of the stresses; the second term the advection of the hydrostatic prestress (related to the Lagrangian description of a displaced particle); the third term represents self-gravitation, expressing the change in gravity due to deformation; and the fourth term describes the change in density if one accounts for compressibility. Here, $\sigma$ is the incremental stress tensor, $u = u(x, t)$ the displacement of a particle at position $x$, $\delta V$ the perturbation of the gravitational potential, and $\delta\rho$ the perturbation of density. The fifth term in Eq. (17) is necessary when one is interested in the (free or forced) eigenoscillations of the Earth or in the body tides of the Earth caused by planetary potentials; however, the dominant timescales for external loads and the “integration times” for observing systems such as GRACE are rather long, with the consequence that one is usually content with the quasi-static solutions. Assuming the Earth to be in hydrostatic equilibrium prior to the deformation is certainly a gross simplification; it is, however, a necessary assumption to find “simple” solutions with the perturbation methods that are usually applied (e.g., Farrell 1972). Generally, the density perturbation can be expressed by the continuity equation:

$$\delta\rho = -\nabla \cdot (\rho u). \tag{18}$$

For the perturbation of the potential, the Poisson equation

$$\Delta\, \delta V = 4\pi G\, \delta\rho \tag{19}$$

must hold. These equations are to be completed by a rheological law (e.g., $\sigma = f(\varepsilon)$ for elastic behavior, $\sigma = f(\varepsilon, \dot{\varepsilon})$ for viscoelastic Maxwell behavior, $\varepsilon$ being the strain tensor) and boundary conditions for internal interfaces of a stratified Earth and for the free surface. For an elastic model consisting of $Z$ layers, the rheology is usually prescribed by polynomial functions $\lambda = \lambda^{(z)}(r)$, $\mu = \mu^{(z)}(r)$, $\rho = \rho^{(z)}(r)$ of the Lamé parameters $\lambda$, $\mu$ and the density within each layer, $r^{(z)}_{\min} \le r \le r^{(z)}_{\max}$. Solutions for potential change and deformation at the surface, obtained for the boundary condition of surface loading, are often expressed through Green’s functions:


Fig. 2 Load Love numbers after Farrell (1972), $h'_n$ (+), $n l'_n$, $n k'_n$ versus degree $n$

$$\delta V(x, t) = \int_{\tau'=0}^{\infty} \int_s K_V(x, x', \tau')\, \delta\mu(x', t - \tau')\, ds\, d\tau' \qquad (x' = a e')$$

and

$$u(x, t) = \int_{\tau'=0}^{\infty} \int_s K_u(x, x', \tau')\, \delta\mu(x', t - \tau')\, ds\, d\tau',$$

where the kernels for general (anisotropic, rotating, ellipsoidal, viscoelastic) Earth models can be represented through location- and frequency-dependent coefficients, e.g., $K(x, x', \tau') = \sum_{nm} k'_{nm}(\lambda, \theta, z)\, k'_{nm}(\lambda', \theta', z)$, where $z$ is the Laplace transform parameter with unit $\mathrm{s}^{-1}$. Couplings between the degree-$n$, order-$m$ terms of the load and potential-change and deformation harmonics of other degrees and orders have to be taken into account (e.g., Wang 1991). It should be noted already here that Green’s functions essentially represent the impulse response of the Earth (i.e., Eqs. (17)–(19) together with a rheological law) and, as such, may be determined from measurements under a known load “forcing.”

For symmetric, nonrotating, elastic, and isotropic (SNREI) models of the Earth, the $k'_{nm}(\lambda, \theta, z) = k'_n$ are simply a function of the harmonic degree, and the Green’s function kernels depend on the spatial distance of $x$ and $x'$ only. Numerical solutions for the $k'_n$ start with the observation that in Eqs. (17)–(19) the reference quantities $\rho_0$, $g_0$ and the Lamé parameters that enter Hooke’s law all depend on the radius $r$ only. Figure 2 shows the load Love numbers $-h'_n$, $n l'_n$, $-n k'_n$ that Farrell (1972) computed, using a Gutenberg-Bullen model for the Earth’s rheology, for surface potential change and displacement. (It should be noted that Love numbers are radius dependent, but in geodesy usually only the surface limits are applied.) The solution for the gravitational response to a surface mass distributed on top of an elastic Earth can be best described in spherical harmonics:

$$\delta v_{nm}(t) = \frac{a^2}{M}\,\frac{1 + k'_n}{2n+1} \int_s \delta\mu(x', t)\, Y_{nm}(e')\, ds \tag{20}$$

or

$$\delta v_{nm} = \frac{4\pi a^2}{M}\,\frac{1 + k'_n}{2n+1}\,\delta\mu_{nm} = \frac{3}{a\rho_e}\,\frac{1 + k'_n}{2n+1}\,\delta\mu_{nm}, \tag{21}$$

where the “1” term is the potential change caused by $\delta\mu$ and the load Love number “$k'_n$” term describes the incremental potential due to solid Earth deformation. Again, degree 0 and 1 terms deserve special attention: Due to mass conservation, $k'_0$ must be zero. Also, $l'_0 = 0$ (see below) is obvious since a uniform load on a spherical Earth cannot cause horizontal displacements for symmetry reasons. In contrast, $h'_0$ corresponds to the average compressibility of the elastic Earth and will not vanish. In the local spherical East-North-Up frame, the solution for $u$ is

$$u(x, t) = \begin{pmatrix} u(x, t) \\ h(x, t) \end{pmatrix} = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \begin{pmatrix} \delta\psi_{nm}(t)\, \nabla^* \\ \delta h_{nm}(t) \end{pmatrix} Y_{nm}(e) \tag{22}$$

with $\nabla^*$ being the tangential part of the gradient operator $\nabla$. In spherical coordinates,

$$\nabla^* = \frac{1}{r}\, e_\theta\, \frac{\partial}{\partial\theta} + \frac{1}{r \sin\theta}\, e_\lambda\, \frac{\partial}{\partial\lambda}, \qquad \nabla = \nabla^* + e_r\, \frac{\partial}{\partial r}.$$

The radial displacement function is

$$\delta h_{nm}(t) = \frac{a^2}{M}\,\frac{h'_n}{2n+1} \int_s \delta\mu(x', t)\, Y_{nm}(e')\, ds \tag{23}$$

and at location $x$ of the Earth’s surface, the radial displacement is

$$\delta h(x, t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta h_{nm}\, Y_{nm}(e).$$

The lateral displacement function is

$$\delta\psi_{nm}(t) = \frac{a^2}{M}\,\frac{l'_n}{2n+1} \int_s \delta\mu(x', t)\, Y_{nm}(e')\, ds. \tag{24}$$

At location $x$, the East and North displacement components are

$$\delta e(x, t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta\psi_{nm}\, \frac{\partial}{\partial e} Y_{nm}(e)$$

and

$$\delta n(x, t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta\psi_{nm}\, \frac{\partial}{\partial n} Y_{nm}(e).$$


A reference system fixed to the center of mass of the solid Earth (CE system) is the natural system to compute the dynamics of solid Earth deformation and to model load Love numbers (Blewitt 2003). This system is obviously “blind” to mass transports that shift the center of mass of the solid Earth (excluding ocean, atmosphere, etc.); hence, by definition, $k'_1 = 0$ for the degree-1 potential Love number. Note that this is not true for the other degree-1 Love numbers; e.g., Farrell (1972) found $h'_1 = -0.290$ and $l'_1 = 0.113$. However, the CE system is difficult to realize in practice by space-geodetic “markers.” There are two principal ways to compute deformations for other reference systems: (a) first compute in CE, and subsequently apply a translation; (b) transform the degree-1 Love numbers, and compute in the new system. Blewitt (2003) showed that a translation of the reference system origin along the direction of the load moment can be absorbed in the three load Love numbers by subtracting the “isomorphic parameter” $\alpha$ from them (we follow the convention of Blewitt 2003), $\tilde{k}'_1 = k'_1 - \alpha$, $\tilde{h}'_1 = h'_1 - \alpha$, and $\tilde{l}'_1 = l'_1 - \alpha$. For example, when transforming from the CE system to the center of mass of the Earth system (CM system), which is fixed to the center of mass of the Earth including the atmospheric and oceanographic loads, one has $\alpha = 1$.

However, in reality the network shift (and rotation and scaling) might not be entirely known, as supposed for the mentioned approaches. In this situation, Kusche and Schrama (2005) and Wu et al. (2006) showed that from 3D displacements in a realistic network and for a given set of Love numbers with $h'_1 \neq l'_1$, it is possible to separate residual unknown network translation and degree-1 deformation in an inversion approach. For realistic loads, the theory predicts vertical deformation of up to about 1 cm and horizontal deformation of a few mm.

The solution in Eq. (22) refers to the local spherical East-North-Up frame, since it refers to an SNREI model. This must be kept in mind since geodetic displacement vectors are commonly referred to a local ellipsoidal East-North-Up frame. Otherwise a small but systematic error will be introduced from projecting height displacements erroneously onto North displacements.

In reality, the Earth is neither spherically symmetric nor purely elastic or isotropic, and it rotates in space. The magnitude of these effects is generally thought to be small. For example, Métivier et al. (2005) showed that ellipticity of the Earth, under zonal (atmospheric) loading, leads to negligible amplitude and phase perturbations in low-degree geopotential harmonics. Another issue is that, even within the class of SNREI models, differences in rheology would cause differences in the response to loading. Plag et al. (1996) found for the models preliminary reference Earth model (PREM) and parametric Earth model, continental (PEMC), which differ in lithospheric properties, differences in the vertical displacement of up to 20 %, in the horizontal displacement of up to 40 %, and in gravity change of up to 20 % in the vicinity (closer than $2^\circ$ distance) of the load, but identical responses for distances greater than this value.
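Returning to the degree-1 transformation described in this section, the shift of the load Love numbers by the isomorphic parameter can be sketched as follows. This is a minimal illustration: the CE values are those attributed to Farrell (1972) in the text, $\alpha = 1$ is the CE-to-CM case mentioned there, and the dictionary layout and function name are hypothetical.

```python
# Degree-1 load Love numbers in the CE frame as quoted in the text.
FARRELL_CE = {"k": 0.0, "h": -0.290, "l": 0.113}


def shift_degree1_love_numbers(love_numbers, alpha):
    """Subtract the isomorphic parameter alpha from k'_1, h'_1, and l'_1
    (convention of Blewitt 2003 as described in the text)."""
    return {name: value - alpha for name, value in love_numbers.items()}


if __name__ == "__main__":
    # alpha = 1 corresponds to the CE -> CM transformation.
    print(shift_degree1_love_numbers(FARRELL_CE, 1.0))
```

Because the same $\alpha$ is subtracted from all three numbers, the difference $h'_1 - l'_1$ is frame independent, which is consistent with the condition $h'_1 \neq l'_1$ required for separating network translation from degree-1 deformation.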

4 Analysis of TVG and Deformation Pattern

In many cases, the aim of the analysis of time-variable gravity and deformation patterns is, at least in an intermediate sense, the determination of mass changes within a specific volume,

$$\delta M = \int_v \left( \rho(x', t) - \bar{\rho}(x') \right) dv, \tag{25}$$

e.g., an ocean basin or a hydrological catchment area. Under the assumption that the thin-layer hypothesis provides an adequate description, $\delta M$ can be found by integrating $\delta\mu$ over the (spherical) surface $s^*$ of $v$ (Wahr et al. 1998). The integration is usually performed in the spectral domain, with the $4\pi$-normalized spherical harmonic coefficients $s^*_{nm}$ of the characteristic function $S^*$ for the area $s^*$,

$$\delta M = \int_{s^*} \delta\mu(e', t)\, ds = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} s^*_{nm}\, \delta\mu_{nm}(t), \tag{26}$$

or, in brief, $\delta M = (S^*, \delta\mu)$. Following the launch of the GRACE mission, the analysis of time-variable surface mass loads from GRACE-derived spherical harmonic coefficients via the inverse relation (Wahr et al. 1998)

$$\delta\mu_{nm} = \frac{a\rho_e}{3}\,\frac{2n+1}{1 + k'_n}\,\delta v_{nm} \tag{27}$$

has been applied in a fast-growing number of studies (for an overview, the publication database of the GRACE project at http://www-app2.gfz-potsdam.de/pb1/op/grace/ might be considered). Since GRACE-derived monthly or weekly $\delta v_{nm} = v_{nm} - \bar{v}_{nm}$ are corrupted with low-frequency noise, and since they exhibit longitudinal artifacts when projected in the space domain, isotropic or anisotropic filter kernels are usually applied to these estimates. See Kusche (2007) for a review on filter methods, where in general a set of spectrally weighted coefficients $\delta\tilde{v}_{nm} = \sum_{p,q} w^{pq}_{nm}\, \delta v_{pq}$ is derived.

In principle, spherical harmonic models of surface load can be inferred coefficient-by-coefficient as well once a spherical harmonic model of vertical deformation has been derived,

$$\delta\mu_{nm} = \frac{a\rho_e}{3}\,\frac{2n+1}{h'_n}\,\delta h_{nm}, \tag{28}$$

or of lateral deformation $\delta\psi_{nm}$,

$$\delta\mu_{nm} = \frac{a\rho_e}{3}\,\frac{2n+1}{l'_n}\,\delta\psi_{nm}. \tag{29}$$
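The inverse relation of Eq. (27), combined with an isotropic degree-wise filter, can be sketched as below. Several points are assumptions made only for illustration: the text does not prescribe a particular filter, so the Gaussian averaging kernel with the recursion given by Wahr et al. (1998) is used as one common isotropic choice; the radius, mass, and the load Love number passed in the example are placeholder values; and all function names are hypothetical. The prefactor is written as $M / (4\pi a^2)$, which equals $a\rho_e / 3$.

```python
import math

A_EARTH = 6.371e6   # mean Earth radius in m (rounded, illustrative assumption)
M_EARTH = 5.972e24  # mass of the Earth in kg (rounded, illustrative assumption)


def gaussian_degree_weights(n_max, radius_m, a=A_EARTH):
    """Degree weights of a Gaussian averaging kernel (recursion as given by
    Wahr et al. 1998), normalized here so that the degree-0 weight is 1.
    The forward recursion loses accuracy at high degrees."""
    b = math.log(2.0) / (1.0 - math.cos(radius_m / a))
    w = [1.0, (1.0 + math.exp(-2.0 * b)) / (1.0 - math.exp(-2.0 * b)) - 1.0 / b]
    for n in range(1, n_max):
        w.append(-(2 * n + 1) / b * w[n] + w[n - 1])
    return w[: n_max + 1]


def potential_coefficient(n, dmu_nm, k_load, a=A_EARTH, M=M_EARTH):
    """Eq. (21): potential change including the elastic response (1 + k'_n)."""
    return 4.0 * math.pi * a**2 / M * (1.0 + k_load) / (2 * n + 1) * dmu_nm


def surface_load_coefficient(n, dv_nm, k_load, a=A_EARTH, M=M_EARTH):
    """Eq. (27): surface density coefficient from a potential coefficient."""
    return M / (4.0 * math.pi * a**2) * (2 * n + 1) / (1.0 + k_load) * dv_nm


def filtered_surface_load(n, dv_nm, k_load, weights):
    """Apply an isotropic filter (one weight per degree) before inversion."""
    return surface_load_coefficient(n, weights[n] * dv_nm, k_load)


if __name__ == "__main__":
    w = gaussian_degree_weights(20, 500.0e3)  # 500 km averaging radius
    # k_load = -0.3 is a placeholder, not a value taken from the chapter.
    print(filtered_surface_load(2, 1.0e-10, -0.3, w))
```

An isotropic filter is the special case $w^{pq}_{nm} = w_n\, \delta_{np}\, \delta_{mq}$ of the general weighting described above; anisotropic filters would require the full weight array.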

For the combination of vertical and lateral deformations in an inversion, two approaches exist: (a) either the $\delta\mu_{nm}$ are estimated directly from discrete $u(x_i)$ or (b) in a two-step procedure, first the $\delta h_{nm}$ and $\delta\psi_{nm}$ are estimated separately from discrete height and lateral displacements, and the $\delta\mu_{nm}$ are subsequently inferred from those. In practice, 3D displacement vectors are available at discrete points of the Earth’s surface. For the network of the International GNSS Service (IGS), a few hundred stations provide continuous data such that a maximum degree and order well beyond 20 might be resolved. However, several studies have shown that this rather theoretical resolution cannot be reached. The reason is that spatial aliasing is present in the signal measured at a single site, which measures the sum of all harmonics up to infinity, unless the signal above the theoretical resolution can be removed using other data such as time-variable gravity.


Inversion approaches must be seen in the light of positioning accuracies of modern worldwide networks, which are currently of a similar order as the deformation signals (a few mm horizontally and a factor of 2–3 worse in the vertical direction). Whereas random errors may be significantly reduced in the spatial averaging process that is required to form estimates for low-degree harmonics, any systematic errors at the spatial scale of the signals of interest may be potentially dangerous. A particular problem is given by the presence of secular trends in the displacement data that are dominated by nonloading phenomena like glacial isostatic adjustment (GIA), plate motion, and local monument subsidence.

If the load $\delta\mu(t)$ is known from independent measurements (e.g., water level measurements from gauges or surface pressure from barometric measurements and meteorological modeling), the load Love numbers (LLNs) $k'_n$, $h'_n$, $l'_n$ could, in principle, be determined experimentally from measurements of gravity change and displacement. Plag et al. (1996) coined the term “loading inversion” (and as an alternative “loading tomography,” noting that tomography is usually based on probing along rays and not on body-integrated response) and suggested a method to invert for certain spherical harmonic expansion coefficients of density $\delta\rho_{nm}$ and Lamé parameter perturbations $\delta\lambda_{nm}$, $\delta\mu_{nm}$, together with polynomial parameters that describe their variation in the radial direction. This inverse approach to recover elastic properties from measurements of gravity and deformation is followed in planetary exploration, however, for the body Love numbers $k_2$ and $h_2$. There, the role of the known load is assumed by the known tidal potentials in the solar system. On the other hand, one has the possibility to eliminate the $\delta\mu_{nm}$ from the equations and determine the ratios

$$\frac{h'_n}{1 + k'_n} = \frac{\delta h_{nm}}{\delta v_{nm}} \qquad \text{for each } m = -n, \ldots, n$$

and

$$\frac{l'_n}{1 + k'_n} = \frac{\delta\psi_{nm}}{\delta v_{nm}} \qquad \text{for each } m = -n, \ldots, n$$

from a combination of gravity and displacement data. If all spherical harmonic coefficients up to degree $n_{\max}$ are well determined from data, these equations provide $2n+1$ relations per degree $n$. Mendes Cerveira et al. (2006) proposed that this freedom might in principle be used to uniquely solve for the ratios $\frac{h'_{nm}}{1 + k'_{nm}}$ and $\frac{l'_{nm}}{1 + k'_{nm}}$ of global, “azimuth-dependent” LLNs.

Other observations of gravity change and deformation of the Earth’s solid and fluid surface may be considered as well. With the global network of superconducting gravimeters, it is believed that annual and short-time variations in mass loading can be observed and compared with GRACE results, after an appropriate removal of local effects (Neumeyer et al. 2006). Absolute (free-fall) gravimeters provide stable time series from which trends in gravity can be obtained and from which time series of superconducting gravimeters can be calibrated. Gravity change $\delta g = |\nabla V| - |\nabla \bar{V}|$ as sensed by a terrestrial gravimeter, which is situated on the deforming Earth surface, reads (in spherical approximation, where the magnitude of the gradient is replaced by the radial derivative)

$$\delta g(x, t) = \frac{GM}{a^2} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left( n + 1 - 2h'_n \right) \delta v_{nm}(t)\, Y_{nm}(e) \tag{30}$$


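[Figure 3: diagram linking the density $\rho(x', t)$ with its moments $M^{(n)}_{i_1 i_2 \ldots i_n}$ and the surface density $\mu(x', t)$ with its moments $A^{(n)}_{i_1 i_2 \ldots i_n}$ to the observables through the spectral operators $\frac{1 + k'_n}{2n+1}$ (space gravity, with the factor $\left(\frac{a}{r}\right)^{n+1}$), $\frac{h'_n}{2n+1}$ (vertical displacement), $\frac{l'_n}{2n+1}$ (lateral displacement), $\frac{n + 1 - 2h'_n}{2n+1}$ (terrestrial gravity), $\frac{1 + k'_n}{2n+1}$ (absolute sea level), and $\frac{1 + k'_n - h'_n}{2n+1}$ (relative sea level).]

Fig. 3 Spectral operators for space and terrestrial gravity, displacement, and sea-level observables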

and when related to surface mass change,

$$\delta g(x, t) = \frac{GM}{a^2}\,\frac{3}{a\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{n + 1 - 2h'_n}{2n+1}\, \delta\mu_{nm}(t)\, Y_{nm}(e). \tag{31}$$

In principle, observations of the sea level might be considered in a multi-data inversion scheme as well. This requires that the steric sea-level change, which is related to changes in temperature and salinity of the ocean, can be removed from the measured total or volumetric sea level. Sea-level change can be measured using satellite altimeters, as with the current Jason-1 and Jason-2 missions, in an absolute sense, since altimetric orbits refer to an ITRS-type global reference system. Tide gauge observations, on the other hand, provide sea level in a relative sense since they refer to land benchmarks. Ocean bottom pressure recorders measure the load change directly. If one assumes that the ocean response is largely “passive” (Blewitt and Clarke 2003), i.e., the ocean surface follows an equipotential surface, the absolute sea-level change is related to surface load as

$$\delta s(x, t) = \frac{GM}{g\,a}\,\frac{3}{a\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1 + k'_n}{2n+1}\, \delta\mu_{nm}(t)\, Y_{nm}(e) + \delta s_0(t) \tag{32}$$

and relative sea level as

$$\delta\tilde{s}(x, t) = \frac{GM}{g\,a}\,\frac{3}{a\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1 + k'_n - h'_n}{2n+1}\, \delta\mu_{nm}(t)\, Y_{nm}(e) + \delta\tilde{s}_0(t), \tag{33}$$

where $\delta s_0(t)$ and $\delta\tilde{s}_0(t)$ are spatially uniform terms that account for mass conservation (cf. Blewitt 2003). All spectral operators discussed in this chapter are summarized in Fig. 3.
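The degree-dependent operators summarized in Fig. 3 can be collected in a small helper, which may be convenient when several observables are combined in one inversion. This is a sketch only: it returns the dimensionless degree factors that multiply $\delta\mu_{nm}$ in Eqs. (21), (23), (24), and (31)–(33), leaves out the dimensional prefactors, and uses hypothetical names. The Love numbers in the example are placeholders rather than values from the chapter.

```python
def spectral_operators(n, k_load, h_load, l_load):
    """Degree-n factors relating surface load coefficients to observables,
    without the dimensional prefactors of the individual equations."""
    d = 2 * n + 1
    return {
        "space_gravity": (1.0 + k_load) / d,
        "vertical_displacement": h_load / d,
        "lateral_displacement": l_load / d,
        "terrestrial_gravity": (n + 1 - 2.0 * h_load) / d,
        "absolute_sea_level": (1.0 + k_load) / d,
        "relative_sea_level": (1.0 + k_load - h_load) / d,
    }


if __name__ == "__main__":
    # Placeholder load Love numbers for degree 2 (illustrative only).
    for name, value in spectral_operators(2, -0.3, -1.0, 0.03).items():
        print(name, value)
```

Ratios of these factors, such as `vertical_displacement / space_gravity`, reproduce the quotient $h'_n / (1 + k'_n)$ that can be estimated from combined gravity and displacement data.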


5 Future Directions

Within the limits of accuracy and spatial resolution of current observing systems, inversions for (surface) mass appear to have almost reached their potential. Limitations are, in particular, the achievable accuracy of spherical harmonic coefficients from GRACE, systematic errors nicknamed “striping,” the presence of systematic errors in displacement vectors from space-geodetic techniques, and the spatial density and inhomogeneous distribution of global networks (space-geodetic techniques, absolute and superconducting gravimetry). However, moderate improvements in data quality and consistency can be expected from reprocessing efforts such as the anticipated GRACE RL05 products or the IGS reprocessing project. In the long run, GRACE Follow-On missions and geometric positioning in the era of GALILEO will hopefully provide bright prospects.

At the time of writing, some groups focus on the combination of gravity and geometrical observations with a priori models of mass transport (so-called joint inversions; cf. Wu et al. 2006; Jansen et al. 2009; Rietbroek et al. 2009). The benefit of this strategy is that certain weaknesses of individual techniques can be covered by other techniques. For example, it has been shown that the geocenter motion or degree-1 surface load, which is not observed with GRACE, can be recovered to some extent by GPS and/or ocean modeling. Research is ongoing in this direction.

Another issue is that if the space agencies fail to replace the GRACE mission in time with a follow-on satellite mission, a gap in the observation of the time-variable gravity field might occur. To some extent, the very low degrees of mass loading processes might be recovered during the gap from geometrical techniques and loading inversion, provided that their mentioned limitations can be overcome and a proper cross-calibration with satellite gravity data is carried out in the overlapping periods (Bettadpur et al. 2008; Plag and Gross 2008). The same situation occurs if one tries to go back in time before the launch of GRACE, e.g., using (reprocessed) GPS solutions. In any case, the problem remains extremely challenging.

6 Conclusions

We have reviewed concepts that are currently used in the interpretation of time-variable gravity and deformation fields in terms of mass transports and Earth system research. Potential for future research is seen in particular for combinations of different observables.

References

Bettadpur S, Ries J, Save H (2008) Time-variable gravity, low Earth orbiters, and bridging gaps. In: GRACE Science Team Meeting 2008, San Francisco, 12–13 Dec 2008
Blewitt G (2003) Self-consistency in reference frames, geocenter definition, and surface loading of the solid Earth. J Geophys Res 108(B2):2103. doi:10.1029/2002JB002082
Blewitt G, Clarke P (2003) Inversion of Earth’s changing shape to weigh sea level in static equilibrium with surface mass redistribution. J Geophys Res 108(B6):2311. doi:10.1029/2002JB002290
Chao BF (2005) On inversion for mass distribution from global (time-variable) gravity field. J Geodyn 39:223–230
Farrell W (1972) Deformation of the Earth by surface loads. Rev Geophys Space Phys 10(3):761–797
Flechtner F (2007) AOD1B Product Description Document for Product Releases 01 to 04 (Rev. 3.1, 13 Apr 2007), GR-GFZ-AOD-0001, GFZ Potsdam
Ilk KH, Flury J, Rummel R, Schwintzer P, Bosch W, Haas C, Schröter J, Stammer D, Zahel W, Miller H, Dietrich R, Huybrechts P, Schmeling H, Wolf D, Götze HJ, Riegger J, Bardossy A, Güntner A, Gruber T (2005) Mass transport and mass distribution in the Earth system. Contribution of the new generation of satellite gravity and altimetry missions to geosciences, GFZ Potsdam
Jansen MWF, Gunter BC, Kusche J (2009) The impact of GRACE, GPS and OBP data on estimates of global mass redistribution. Geophys J Int. doi:10.1111/j.1365-246X.2008.04031.x
Kusche J (2007) Approximate decorrelation and non-isotropic smoothing of time-variable GRACE-type gravity fields. J Geodesy 81(11):733–749
Kusche J, Schrama EJO (2005) Surface mass redistribution inversion from global GPS deformation and Gravity Recovery and Climate Experiment (GRACE) gravity data. J Geophys Res 110:B09409. doi:10.1029/2004JB003556
Mendes Cerveira P, Weber R, Schuh H (2006) Theoretical aspects connecting azimuth-dependent Load Love Numbers, spatiotemporal surface geometry changes, geoid height variations and Earth rotation data. In: WIGFR2006, Smolenice Castle, 8–9 May 2006
Métivier L, Greff-Lefftz M, Diament M (2005) A new approach to computing accurate gravity time variations for a realistic earth model with lateral heterogeneities. Geophys J Int 162:570–574
Michel V (2005) Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the Earth’s gravitational field at satellite height. Inverse Probl 21:997–1025
Neumeyer J, Barthelmes F, Dierks O, Flechtner F, Harnisch M, Harnisch G, Hinderer J, Imanishi Y, Kroner C, Meurers B, Petrovic S, Reigber C, Schmidt R, Schwintzer P, Sun H-P, Virtanen H (2006) Combination of temporal gravity variations resulting from superconducting gravimeter (SG) recordings, GRACE satellite observations and global hydrology models. J Geodesy 79:573–585
Plag H-P, Gross R (2008) Exploring the link between Earth’s gravity field, rotation and geometry in order to extend the GRACE-determined terrestrial water storage to non-GRACE times. In: GRACE science team meeting 2008, San Francisco, 12–13 Dec 2008
Plag H-P, Jüttner H-U, Rautenberg V (1996) On the possibility of global and regional inversion of exogenic deformations for mechanical properties of the Earth’s interior. J Geodyn 21(3):287–308
Rietbroek R, Brunnabend S-E, Dahle C, Kusche J, Flechtner F, Schröter J, Timmermann R (2009) Changes in total ocean mass derived from GRACE, GPS, and ocean modeling with weekly resolution. J Geophys Res 114:C11004. doi:10.1029/2009JC005449
Wahr J, Molenaar M, Bryan F (1998) Time variability of the Earth’s gravity field: hydrological and oceanic effects and their possible detection using GRACE. J Geophys Res 103(B12):30205–30229
Wang R (1991) Tidal deformations on a rotating, spherically asymmetric, visco-elastic and laterally heterogeneous Earth. Peter Lang, Frankfurt/Main
Wu X, Heflin MB, Ivins ER, Fukumori I (2006) Seasonal and interannual global surface mass variations from multisatellite geodetic data. J Geophys Res 111:B09401. doi:10.1029/2005JB004100


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_9-3 © Springer-Verlag Berlin Heidelberg 2014

Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution

Willi Freeden (a) and Michael Schreiner (b)

(a) Geomathematics Group, University of Kaiserslautern, Rhineland-Palatinate, Germany
(b) Institute for Computational Engineering, University of Buchs, Buchs, Switzerland

Abstract

Satellite gravity gradiometry (SGG) is an ultrasensitive detection technique for the space gravitational gradient (i.e., the Hesse tensor of the Earth’s gravitational potential). In this note, SGG – understood as a spacewise inverse problem of satellite technology – is discussed under three mathematical aspects: First, SGG is considered from a potential theoretic point of view as a continuous problem of “harmonic downward continuation.” The space-borne gravity gradients are assumed to be known continuously over the “satellite (orbit) surface”; the purpose is to specify sufficient conditions under which uniqueness and existence can be guaranteed. In a spherical context, mathematical results are outlined by the decomposition of the Hesse matrix in terms of tensor spherical harmonics. Second, the potential theoretic information leads us to a reformulation of the SGG problem as an ill-posed pseudodifferential equation. Its solution is dealt with by classical regularization methods, based on filtering techniques. Third, a very promising method is worked out for developing an immediate interrelation between the Earth’s gravitational potential at the Earth’s surface and the known gravitational tensor.

1 Introduction Due to the nonspherical shape, the irregularities of its interior mass density, and the movement of the lithospheric plates, the external gravitational field of the Earth shows significant variations. The recognition of the structure of the Earth’s gravitational potential is of tremendous importance for many questions in geosciences, for example, the analysis of present day tectonic motions, the study of the Earth’s interior, models of deformation analysis, the determination of the sea surface topography, and circulations of the oceans, which, of course, have a great influence on the global climate and its change. Therefore, a detailed knowledge of the global gravitational field including the local high-resolution microstructure is essential for various scientific disciplines. Satellite gravity gradiometry (SGG) is a modern domain of studying the characteristics, the structure, and the variation process of the Earth’s gravitational field. The principle of satellite gradiometry can be explained roughly by the following model (cf. Fig. 1): several test masses in a low orbiting satellite feel, due to their distinct positions and the local changes of the gravitational field, different forces, thus yielding different accelerations. The measurements of the relative accelerations between two test masses provide information about the second-order partial derivatives of the gravitational potential. To be more concrete, differences between the displacements of opposite test masses are measured. This yields information on the differences of the forces. Since the gradiometer itself is small, these differences can be identified with differentials so that a so-called full gradiometer gives information on the whole tensor consisting out of all 


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_9-3 © Springer-Verlag Berlin Heidelberg 2014

Fig. 1 The principle of a gradiometer

second-order partial derivatives of the gravitational potential, i.e., the Hesse matrix. In an ideal case, the full Hesse matrix can be observed by an array of test masses. On 17 March 2009, the European Space Agency (ESA) began to realize the concept of SGG with the launch of the most sophisticated mission ever to investigate the Earth’s gravitational field, viz. GOCE (Gravity Field and Steady-State Ocean Circulation Explorer). ESA’s 1-ton spacecraft carried a set of six state-of-the-art, high-sensitivity accelerometers to measure the components of the gravity field along all three axes (see the contribution of R. Rummel in this issue for more details on the measuring devices of this satellite). GOCE covered the entire Earth with measurements (apart from gaps at the polar regions). For around 20 months, GOCE gathered gravitational data. After running out of propellant, the GOCE satellite began dropping out of its orbit in October 2013 and made an uncontrolled reentry on 11 November 2013. To make this mission successful, ESA and its partners had to overcome an impressive technical challenge: designing a satellite that orbits the Earth closely enough (at an altitude of only 250 km) to collect high-accuracy gravitational data while being able to filter out disturbances caused, e.g., by the remaining traces of the atmosphere. It is not surprising that, during the last decade, the ambitious GOCE mission motivated many scientific activities, so that a large body of written material is available in different fields concerned with special user group activities, mission synergy, calibration as well as validation procedures, geoscientific progress (in fields like gravity field recovery, ocean circulation, hydrology, glaciology, deformation, climate modeling, etc.), data management, and so on.
A survey of the recent status is given by the “ESA Living Planet Programme,” which also contains a list of GOCE publications (see also the contribution by the ESA-Frascati Group in this issue; for information from a geodetic point of view the reader is referred, e.g., to Beutler et al. 2003; ESA 1999, 2007; Rummel et al. 1993). Mathematically, the literature dealing with solution procedures for problems related to SGG can be divided essentially into two classes: the timewise approach and the spacewise approach. The former considers the measured data as a time series, while the latter supposes that the data are given in advance on a (closed) surface. This chapter belongs to the spacewise approach. Its goal is a potential-theoretic approach to SGG with strong interest in the characterization of SGG-data types and the tensorially oriented solution of the occurring (pseudodifferential) SGG-equations by regularization. Particular emphasis is laid on the transition from scalar data types (such as the second-order radial derivative) to full tensor data of the Hesse matrix.


2 SGG in Potential Theoretic Perspective Gravity as observed on the Earth’s surface is the combined effect of the gravitational mass attraction and the centrifugal force due to the Earth’s rotation. The force of gravity provides a directional structure to the space above the Earth’s surface. It is tangential to the vertical plumb lines and perpendicular to all level surfaces. Any water surface at rest is part of a level surface. If the Earth were a homogeneous, spherical body, gravity would be constant all over the Earth’s surface, the well-known quantity $9.8\ \mathrm{m\,s^{-2}}$. The plumb lines would be directed toward the Earth’s center of mass, and this implies that all level surfaces are nearly spherical, too. However, gravity decreases from the poles to the equator by about $0.05\ \mathrm{m\,s^{-2}}$. This is caused by the flattening of the Earth’s figure and the negative effect of the centrifugal force, which is maximal at the equator. Second, high mountains and deep ocean trenches cause the gravity to vary. Third, materials within the Earth’s interior are not uniformly distributed. The irregular gravity field shapes a virtual surface, the geoid. The level surfaces are ideal reference surfaces, for example, for heights. In more detail, the gravity acceleration (gravity) $w$ is the resultant of gravitation $v$ and centrifugal acceleration $c$, i.e., $w = v + c$. The centrifugal force $c$ arises as a result of the rotation of the Earth about its axis. We assume here a rotation of constant angular velocity $\omega_0$ about the rotational axis $x_3$, which is further assumed to be fixed with respect to the Earth. The centrifugal acceleration acting on a unit mass is directed outward perpendicularly to the spin axis. If the $\varepsilon^3$-axis of an Earth-fixed coordinate system coincides with the axis of rotation, then we have $c(x) = -\omega_0^2\, \varepsilon^3 \wedge (\varepsilon^3 \wedge x)$. Using the so-called centrifugal potential $C(x) = (1/2)\,\omega_0^2\,(x_1^2 + x_2^2)$ we can write $c = \nabla C$.
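The relation between the centrifugal acceleration and the centrifugal potential can be checked numerically. The following is a minimal sketch, not part of the original text: the values of the angular velocity and of the mean Earth radius are commonly used figures assumed here for illustration, and all function names are hypothetical.

```python
import numpy as np

OMEGA0 = 7.292115e-5  # assumed: Earth's angular velocity in rad/s
R_EARTH = 6.371e6     # assumed: mean Earth radius in m
E3 = np.array([0.0, 0.0, 1.0])  # unit vector along the rotation axis


def centrifugal_acceleration(x, omega0=OMEGA0):
    """c(x) = -omega0^2 * e3 ^ (e3 ^ x), with ^ the vector product."""
    return -omega0**2 * np.cross(E3, np.cross(E3, x))


def centrifugal_potential(x, omega0=OMEGA0):
    """C(x) = (1/2) * omega0^2 * (x1^2 + x2^2)."""
    return 0.5 * omega0**2 * (x[0] ** 2 + x[1] ** 2)


def numerical_gradient(f, x, h=1.0):
    """Central-difference gradient (exact for quadratics up to rounding)."""
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad


# Evaluate both expressions at a point on the equator.
x_equator = np.array([R_EARTH, 0.0, 0.0])
c_equator = centrifugal_acceleration(x_equator)
grad_C = numerical_gradient(centrifugal_potential, x_equator)
print("c at the equator [m/s^2]:", c_equator)
print("grad C at the equator   :", grad_C)
```

Under these assumed values the centrifugal acceleration at the equator is a few hundredths of $\mathrm{m\,s^{-2}}$, which is consistent in order of magnitude with the pole-to-equator decrease of gravity quoted above (the flattening accounts for the remainder).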
The direction of the gravity $w$ is known as the direction of the plumb line; the quantity $|w|$ is called the gravity intensity (often just gravity). The gravity potential of the Earth can be expressed in the form $W = V + C$. The gravity acceleration $w$ is given by $w = \nabla W = \nabla V + \nabla C$. The surfaces of constant gravity potential $W(x) = \mathrm{const}$, $x \in \mathbb{R}^3$, are designated as equipotential (level, or geopotential) surfaces of gravity. The gravity potential $W$ of the Earth is the sum of the gravitational potential $V$ and the centrifugal potential $C$, i.e., $W = V + C$. In an Earth-fixed coordinate system, the centrifugal potential $C$ is explicitly known. Hence, the determination of equipotential surfaces of the potential $W$ is strongly related to the knowledge of the potential $V$. The gravity vector $w$ given by $w(x) = \nabla_x W(x)$, where the point $x \in \mathbb{R}^3$ is located outside and on a sphere around the origin with Earth’s radius $R$, is normal to the equipotential surface passing through the same point. Thus, equipotential surfaces intuitively express the notion of tangential surfaces, as they are normal to the plumb lines given by the direction of the gravity vector (for more details see, for example, Heiskanen and Moritz (1967), Freeden and Schreiner (2009) and the contributions by H. Moritz, J. Kirsch and F. Sousa in this issue). According to the classical Newton’s Law of Gravitation (1687), knowing the density distribution $\rho$ of a body, the gravitational potential can be computed everywhere in $\mathbb{R}^3$. More explicitly, the gravitational potential $V$ of the Earth’s exterior is given by

\[ V(x) = G \int_{\mathrm{Earth}} \frac{\rho(y)}{|x - y|}\, dV(y), \quad x \in \mathbb{R}^3 \setminus \mathrm{Earth}, \tag{1} \]

where $G$ is the gravitational constant ($G = 6.6742 \cdot 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$) and $dV$ is the (Lebesgue) volume measure. The properties of the gravitational potential (1) in the Earth’s exterior are appropriately described by the Laplace equation:

\[ \Delta V(x) = 0, \quad x \in \mathbb{R}^3 \setminus \mathrm{Earth}. \tag{2} \]

The gravitational potential $V$ as defined by (1) is regular at infinity, i.e.,

\[ |V(x)| = O\left(\frac{1}{|x|}\right), \quad |x| \to \infty. \tag{3} \]

For practical purposes, the problem is that in reality the density distribution $\rho$ is very irregular and known only for parts of the upper crust of the Earth. In fact, geoscientists would like to determine it from measurements of the gravitational field. Even if the Earth is supposed to be spherical, the determination of the gravitational potential by integrating Newton’s potential is not achievable. This is the reason why, in simplifying spherical nomenclature, we first expand the so-called reciprocal distance in terms of harmonics (related to the Earth’s mean radius $R$) as a series

\[ \frac{1}{|x - y|} = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{4\pi R}{2n+1}\, H^R_{-n-1,k}(x)\, H^R_{n,k}(y), \tag{4} \]

where $H^R_{n,k}$ is an inner harmonic of degree $n$ and order $k$ given by

\[ H^R_{n,k}(x) = \frac{1}{R} \left(\frac{|x|}{R}\right)^{n} Y_{n,k}(\xi), \quad x = |x|\xi,\ \xi \in \Omega, \tag{5} \]

and $H^R_{-n-1,k}$ is an outer harmonic of degree $n$ and order $k$ given by

\[ H^R_{-n-1,k}(x) = \frac{1}{R} \left(\frac{R}{|x|}\right)^{n+1} Y_{n,k}(\xi), \quad x = |x|\xi,\ \xi \in \Omega \tag{6} \]

($\Omega$ is the unit sphere in $\mathbb{R}^3$). Note that the family $\{Y_{n,k}\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$ is an $L^2(\Omega)$-orthonormal system of scalar spherical harmonics (for more details concerning spherical harmonics see, e.g., Müller (1966), Freeden et al. (1998), Freeden and Schreiner (2009), Freeden and Gerhards (2013), and Freeden and Gutting (2013)). Insertion of the series expansion (4) into Newton’s formula for the external gravitational potential yields

\[ V(x) = G \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{4\pi R}{2n+1} \int_{\Omega_R^{\mathrm{int}}} \rho(y)\, H^R_{n,k}(y)\, dV(y)\; H^R_{-n-1,k}(x). \tag{7} \]
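The reciprocal-distance expansion (4) can be checked numerically. The sketch below is an illustration rather than part of the original text. It relies on the addition theorem for spherical harmonics, $\sum_{k=1}^{2n+1} Y_{n,k}(\xi) Y_{n,k}(\eta) = \frac{2n+1}{4\pi} P_n(\xi \cdot \eta)$, under which (4) reduces to the Legendre series $\sum_n \frac{|y|^n}{|x|^{n+1}} P_n(\xi \cdot \eta)$ for $|y| < |x|$. The function name and the sample points are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import legval


def reciprocal_distance_series(x, y, n_max):
    """Truncated Legendre series for 1/|x - y|, valid for |y| < |x|."""
    rx, ry = np.linalg.norm(x), np.linalg.norm(y)
    cos_gamma = np.dot(x, y) / (rx * ry)  # cosine of the angle between x and y
    n = np.arange(n_max + 1)
    coeffs = ry**n / rx ** (n + 1)        # |y|^n / |x|^(n+1)
    return legval(cos_gamma, coeffs)      # sum_n coeffs[n] * P_n(cos_gamma)


# Hypothetical sample points: y plays the role of an interior mass point,
# x the role of an exterior evaluation point.
x = np.array([0.0, 0.0, 2.0])
y = np.array([0.5, 0.3, 0.4])
series_value = reciprocal_distance_series(x, y, n_max=60)
exact_value = 1.0 / np.linalg.norm(x - y)
print(series_value, exact_value)
```

The series converges geometrically with ratio $|y|/|x|$, so the truncation degree needed for a given accuracy grows as the evaluation point approaches the masses.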

The expansion coefficients of the series (7) are not computable, since their determination requires the knowledge of the density function $\rho$ in the Earth’s interior (see the introductory chapter and the contribution of V. Michel in this issue). In fact, it turns out that there are infinitely many mass distributions that have the given gravitational potential of the Earth as exterior potential. Nevertheless, collecting the results from potential theory on the Earth’s gravitational field $v$ for the outer space (in spherical approximation), we are confronted with the following (mathematical) characterization: $v$ is an infinitely often differentiable vector field in the exterior of the Earth such that (v1) $\operatorname{div} v = \nabla \cdot v = 0$, $\operatorname{curl} v = L \cdot v = 0$ in the Earth’s exterior, (v2) $v$ is regular at infinity: $|v(x)| = O(1/|x|^2)$, $|x| \to \infty$. Seen from a mathematical point of view, the properties (v1) and (v2) imply that the Earth’s gravitational field $v$ in the exterior of the Earth is a gradient field $v = \nabla V$, where the gravitational potential $V$ fulfills the properties: $V$ is an infinitely often differentiable scalar field in the exterior of the Earth such that (V1) $V$ is harmonic in the Earth’s exterior, and vice versa. Moreover, the gradient field of the Earth’s gravitational field (i.e., the Jacobi matrix field) $\mathbf{v} = \nabla v$ obeys the following properties: $\mathbf{v}$ is an infinitely often differentiable tensor field in the exterior of the Earth such that ($\mathbf{v}$1) $\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = 0$, $\operatorname{curl} \mathbf{v} = 0$ in the Earth’s exterior, ($\mathbf{v}$2) $\mathbf{v}$ is regular at infinity: $|\mathbf{v}(x)| = O(1/|x|^3)$, $|x| \to \infty$, and vice versa. Combining our identities we finally see that $\mathbf{v}$ can be represented as the Hesse tensor of the scalar field $V$, i.e., $\mathbf{v} = \nabla \otimes \nabla V = \nabla^{(2)} V$. The technological SGG-principle of determining the tensor field $\mathbf{v}$ at satellite altitude is illustrated graphically in Fig. 2. The position of a low orbiting satellite is tracked using GPS. Inside the satellite there is a gradiometer. A simplified model of a gradiometer is sketched in Fig. 1. A photo of the GOCE satellite is contained in the contribution of R. Rummel in this issue. An array of test masses is connected with springs. Once more, the measured quantities are the differences between the displacements of opposite test masses. According to Hooke’s law, the mechanical configuration provides information on the differences of the forces. They, however, are due to local differences of $\nabla V$. Since the gradiometer itself is small, these differences can be identified with differentials, so that a so-called full gradiometer gives information on the whole tensor consisting of all second-order partial derivatives of $V$, i.e., the Hesse matrix $\mathbf{v}$ of $V$.

Fig. 2 The principle of satellite gravity gradiometry (From ESA (1999)). Figure labels: GPS satellites, gradiometry, mass anomaly, Earth
From our preparatory remarks, it becomes obvious that the potential theoretic situation for the SGG-problem can be formulated briefly as follows: Suppose the satellite data $\mathbf{v} = \nabla \otimes \nabla V$ are known continuously over the “orbital surface”; the satellite gravity gradiometry problem then amounts to the problem of determining $V$ from $\mathbf{v} = \nabla \otimes \nabla V$ at the “orbital surface.” Mathematically, SGG is a nonstandard problem of potential theory. The reasons are obvious:

• SGG is ill-posed since the data are not given on the boundary of the domain of interest, i.e., on the Earth’s surface, but on a surface in the exterior domain of the Earth, i.e., at a certain height.
• Tensorial SGG-data (or scalar manifestations of them) do not form the standard equipment of potential theory (such as, e.g., Dirichlet or Neumann data). Thus, it is, at first sight, not clear whether these data ensure the uniqueness of the SGG-problem or not.
• SGG-data have their natural limit because of the strong damping of the high-frequency parts of the (spherical harmonic expansion of the) gravitational potential with increasing satellite heights.

For a heuristic explanation of this problem, let us start from the assumption that the gravitational potential outside the spherical Earth’s surface $\Omega_R$ with the mean radius $R$ is given by the ordinary expansion in terms of outer harmonics (confer the identity (7))

\[ V(x) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \int_{\Omega_R} V(y)\, H^R_{-n-1,k}(y)\, d\omega(y)\; H^R_{-n-1,k}(x) \tag{8} \]

($d\omega$ is the usual surface measure). Then it is not hard to see that those parts of the gravitational potential belonging to the outer harmonics $H^R_{-n-1,k}$ of degree $n$ at height $H$ above the Earth’s surface $\Omega_R$ are damped by a factor $[R/(R+H)]^{n+1}$. A way out of this difficulty is seen in SGG, where, e.g., second-order radial derivatives of the gravitational potential are available at a height of typically about 250 km. The second derivatives cause (roughly speaking) an amplification of the potential coefficients by a factor of order $n^2$. This compensates the damping effect due to the satellite’s height if $n$ is not too large. Nevertheless, in spite of the amplification, the SGG-problem still remains (exponentially) ill-posed. Altogether, the gravitational potential decreases exponentially with increasing height, and therefore the process of transforming the data down to the Earth’s surface (usually called “downward continuation”) is unstable. The non-canonical (SGG-)situation of uniqueness within the potential theoretic framework can be demonstrated already by a simple example in spherical context: Suppose that one scalar component of the Hesse tensor is prescribed for all points $x$ on the sphere $\Omega_{R+H} = \{x \in \mathbb{R}^3 \mid |x| = R+H\}$. Is the gravitational potential $V$ unique on the sphere $\Omega_R = \{x \in \mathbb{R}^3 \mid |x| = R\}$? In general, the answer is negative. To see this, we construct a counterexample: If $b \in \mathbb{R}^3$ with $|b| = 1$ is given, the second-order directional derivative of $V$ at the point $x$ is $b^T (\nabla \otimes \nabla V(x))\, b$. Given a potential $V$, we construct a vector field $b$ on $\Omega_{R+H}$ such that the second-order directional derivative $b^T (\nabla \otimes \nabla V)\, b$ is zero: Assume that $V$ is a solution of (2) and (3). For each $x \in \Omega_{R+H}$, we know that the Hesse tensor $\nabla \otimes \nabla V(x)$ is symmetric. Thus, there exists an orthogonal matrix $A(x)$ so that $A(x)^T (\nabla \otimes \nabla V(x)) A(x) = \operatorname{diag}(\lambda_1(x), \lambda_2(x), \lambda_3(x))$, where $\lambda_1(x), \lambda_2(x), \lambda_3(x)$ are the eigenvalues of $\nabla \otimes \nabla V(x)$.
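From the harmonicity of $V$ it is clear that $0 = \Delta V(x) = \lambda_1(x) + \lambda_2(x) + \lambda_3(x)$. Let $\mu_0 = 3^{-1/2}(1, 1, 1)^T$. We define the vector field $\mu: \Omega_{R+H} \to \mathbb{R}^3$ by $\mu(x) = A(x)\mu_0$, $x \in \Omega_{R+H}$. Then we obtain

\[ \mu^T(x)\,(\nabla \otimes \nabla V(x))\,\mu(x) = \mu_0^T A(x)^T (\nabla \otimes \nabla V(x)) A(x)\, \mu_0 \tag{9} \]

\[ = \frac{1}{3}\, (1\ \ 1\ \ 1) \begin{pmatrix} \lambda_1(x) & 0 & 0 \\ 0 & \lambda_2(x) & 0 \\ 0 & 0 & \lambda_3(x) \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{3}\,(\lambda_1(x) + \lambda_2(x) + \lambda_3(x)) = 0. \tag{10} \]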
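The construction in (9) and (10) can be reproduced numerically at a single point. In the following sketch, which is an illustration and not part of the original text, a random symmetric trace-free matrix stands in for the Hesse tensor of a harmonic function (an assumption made for the example); the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A symmetric, trace-free 3x3 matrix as a stand-in for the Hesse tensor of a
# harmonic function at one point (trace zero corresponds to Delta V = 0).
B = rng.standard_normal((3, 3))
hesse = 0.5 * (B + B.T)
hesse -= np.trace(hesse) / 3.0 * np.eye(3)

# Orthogonal diagonalization: A^T hesse A = diag(lambda_1, lambda_2, lambda_3).
eigenvalues, A = np.linalg.eigh(hesse)

# Direction mu = A mu_0 with mu_0 = 3^(-1/2) (1, 1, 1)^T.
mu_0 = np.ones(3) / np.sqrt(3.0)
mu = A @ mu_0

# Second-order directional derivative in the direction mu.
second_derivative = mu @ hesse @ mu
print("eigenvalue sum      :", eigenvalues.sum())
print("mu^T (Hesse) mu     :", second_derivative)
```

Both printed values vanish up to rounding, although the matrix itself is not zero: the directional second derivative along $\mu$ carries no information about the tensor.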


Hence, we have constructed a vector field $\mu$ such that the second-order directional derivative of $V$ in the direction of $\mu(x)$ is zero for every point $x \in \Omega_{R+H}$. It can easily be seen that, for a given $V$, there exist many vector fields sharing this property with the vector field $\mu$. Observing these arguments we are led to the conclusion that the function $V$ is undetectable from the directional derivatives corresponding to $\mu$ (see also Schreiner 1994a, b). It is, however, good news that we are not lost here: As a matter of fact, there do exist conditions under which only one quantity of the Hesse tensor yields a unique solution (at least up to low order harmonics). In order to formulate these results, a certain decomposition of the Hesse tensor is necessary, which strongly depends on the separation of the Laplace operator in terms of polar coordinates. In order to follow this path, we first reformulate the SGG-problem in spherical context. For that purpose we start with some basic facts specifically formulated on the unit sphere $\Omega = \{x \in \mathbb{R}^3 \mid |x| = 1\}$. As is well known, any $x \in \mathbb{R}^3$, $x \neq 0$, can be decomposed uniquely in the form $x = r\xi$, where the directional part is an element of the unit sphere: $\xi \in \Omega$. Let $\{Y_{n,m}\}$, $Y_{n,m}: \Omega \to \mathbb{R}$, $n = 0, 1, \ldots$, $m = 1, \ldots, 2n+1$, be an orthonormal set of spherical harmonics. As is well known (see, e.g., Freeden and Schreiner 2009), the system is complete in $L^2(\Omega)$; hence, each function $F \in L^2(\Omega)$ can be represented by the spherical harmonic expansion

\[ F(\xi) = \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} F^{\wedge}(n, m)\, Y_{n,m}(\xi), \quad \xi \in \Omega, \tag{11} \]

with “Fourier coefficients” given by

\[ F^{\wedge}(n, m) = (F, Y_{n,m})_{L^2(\Omega)} = \int_{\Omega} F(\xi)\, Y_{n,m}(\xi)\, d\omega(\xi). \tag{12} \]



Furthermore, the (outer) harmonics $H_{-n-1,m}: \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}$ related to the unit sphere $\Omega$ are denoted by $H_{-n-1,m}(x) = H^1_{-n-1,m}(x)$, where $H^1_{-n-1,m}(x) = (1/|x|^{n+1})\, Y_{n,m}(x/|x|)$. Clearly, they are harmonic functions and their restrictions coincide on $\Omega$ with the corresponding spherical harmonics. Any function $F \in L^2(\Omega)$ can, thus, be identified with a harmonic potential via the expansion (11); in particular, this holds true for the Earth’s external gravitational potential. This motivates the following mathematical model situation of the SGG-problem to be considered next: (i) Isomorphism: Consider the sphere $\Omega_R \subset \mathbb{R}^3$ around the origin with radius $R > 0$. $\Omega_R^{\mathrm{int}}$ is the inner space of $\Omega_R$, and $\Omega_R^{\mathrm{ext}}$ is the outer space. By virtue of the isomorphism $\Omega \ni \xi \mapsto R\xi \in \Omega_R$ we assume functions $F: \Omega_R \to \mathbb{R}$ to be defined on $\Omega$. It is clear that the function spaces defined on $\Omega$ admit their natural generalizations as spaces of functions defined on $\Omega_R$. Obviously, an $L^2(\Omega)$-orthonormal system of spherical harmonics forms an orthogonal system on $\Omega_R$ (with respect to $(\cdot,\cdot)_{L^2(\Omega_R)}$). Moreover, with the relationship $\xi \leftrightarrow R\xi$, the differential operators on $\Omega_R$ can be related to operators on the unit sphere $\Omega$. In more detail, the surface gradient $\nabla^{*;R}$, the surface curl gradient $L^{*;R}$, and the Beltrami operator $\Delta^{*;R}$ on $\Omega_R$, respectively, admit the representation $\nabla^{*;R} = (1/R)\nabla^{*;1} = (1/R)\nabla^{*}$, $L^{*;R} = (1/R)L^{*;1} = (1/R)L^{*}$, $\Delta^{*;R} = (1/R^2)\Delta^{*;1} = (1/R^2)\Delta^{*}$, where $\nabla^{*}$, $L^{*}$, $\Delta^{*}$ are the surface gradient, surface curl gradient, and the Beltrami operator of the unit sphere $\Omega$, respectively. For $Y_n$ being a spherical harmonic of degree $n$ we have $\Delta^{*;R} Y_n = -(1/R^2)\, n(n+1)\, Y_n = (1/R^2)\, \Delta^{*} Y_n$.


Fig. 3 The role of the “Runge sphere” within the spherically reflected SGG-problem. Figure labels: $\Omega_{R+H}$, $\Omega_R$, $H$, $R$, Earth

(ii) Runge Property: Instead of looking for a harmonic function outside and on the (real) Earth, we search for a harmonic function outside the unit sphere $\Omega$ (assuming the units are chosen in such a way that the sphere $\Omega$ with radius 1 is inside of the Earth and at the same time not too “far away” from the Earth’s boundary). The justification of this simplification (see Fig. 3) is based on the Runge approach (see, e.g., Freeden 1980a; Freeden and Michel 2004): To any harmonic function $V$ outside of the (real) Earth and any given $\varepsilon > 0$, there exists a harmonic function $U$ outside of a sphere inside the (real) Earth such that the absolute error $|V(x) - U(x)| < \varepsilon$ holds true for all points $x$ outside and on the (real) Earth’s surface.

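3 Decomposition of Tensor Fields by Means of Tensor Spherical Harmonics Let us recapitulate that any point $\xi \in \Omega$ may be represented by polar coordinates in a standard way

\[ \xi = t\,\varepsilon^3 + \sqrt{1 - t^2}\,(\cos\varphi\ \varepsilon^1 + \sin\varphi\ \varepsilon^2), \quad -1 \le t \le 1,\ 0 \le \varphi < 2\pi,\ t = \cos\vartheta, \tag{13} \]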

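($\vartheta \in [0, \pi]$: (co-)latitude, $\varphi$: longitude, $t$: polar distance). Consequently, any element $\xi \in \Omega$ may be represented using its coordinates $(\varphi, t)$ in accordance with (13). For the representation of vector and tensor fields on the unit sphere $\Omega$, we are led to use a local triad of orthonormal unit vectors in the directions $r$, $\varphi$, and $t$ as shown by Fig. 4 (for more details the reader is referred to Freeden and Schreiner (2009) and the references therein). As is well known, the second-order tensor fields on the unit sphere, i.e., $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$, can be separated into their tangential and normal parts as follows:

\[ p_{*,\mathrm{nor}}\, \mathbf{f} = (\mathbf{f}\xi) \otimes \xi, \tag{14} \]

\[ p_{\mathrm{nor},*}\, \mathbf{f} = \xi \otimes (\xi^T \mathbf{f}), \tag{15} \]

\[ p_{*,\mathrm{tan}}\, \mathbf{f} = \mathbf{f} - p_{*,\mathrm{nor}}\, \mathbf{f} = \mathbf{f} - (\mathbf{f}\xi) \otimes \xi, \tag{16} \]

\[ p_{\mathrm{tan},*}\, \mathbf{f} = \mathbf{f} - p_{\mathrm{nor},*}\, \mathbf{f} = \mathbf{f} - \xi \otimes (\xi^T \mathbf{f}), \tag{17} \]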


Fig. 4 Local triads $\varepsilon^r$, $\varepsilon^\varphi$, $\varepsilon^t$ with respect to two different points $\xi$ and $\eta$ on the unit sphere

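\[ p_{\mathrm{nor},\mathrm{tan}}\, \mathbf{f} = p_{\mathrm{nor},*}(p_{*,\mathrm{tan}}\, \mathbf{f}) = p_{*,\mathrm{tan}}(p_{\mathrm{nor},*}\, \mathbf{f}) = \xi \otimes (\xi^T \mathbf{f}) - (\xi^T \mathbf{f}\, \xi)\, \xi \otimes \xi. \tag{18} \]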
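The projections (14) to (18) translate directly into matrix operations. The following sketch is an illustration and not part of the original text: it represents a tensor at a fixed point $\xi$ as a $3 \times 3$ array, and the function names are hypothetical.

```python
import numpy as np


def p_star_nor(f, xi):
    """Right normal part, (14): (f xi) (x) xi."""
    return np.outer(f @ xi, xi)


def p_nor_star(f, xi):
    """Left normal part, (15): xi (x) (xi^T f)."""
    return np.outer(xi, xi @ f)


def p_star_tan(f, xi):
    """Right tangential part, (16)."""
    return f - p_star_nor(f, xi)


def p_tan_star(f, xi):
    """Left tangential part, (17)."""
    return f - p_nor_star(f, xi)


def p_nor_tan(f, xi):
    """Left normal/right tangential part, (18)."""
    return p_nor_star(p_star_tan(f, xi), xi)


rng = np.random.default_rng(42)
f = rng.standard_normal((3, 3))   # a generic second-order tensor
xi = rng.standard_normal(3)
xi /= np.linalg.norm(xi)          # a point on the unit sphere

# The two orders of projection in (18) agree with the closed form.
lhs = p_nor_tan(f, xi)
swapped = p_star_tan(p_nor_star(f, xi), xi)
closed_form = np.outer(xi, xi @ f) - (xi @ f @ xi) * np.outer(xi, xi)

# The four combined parts (nor/nor, nor/tan, tan/nor, tan/tan) add up to f.
parts_sum = (
    p_nor_star(p_star_nor(f, xi), xi)
    + p_nor_tan(f, xi)
    + p_tan_star(p_star_nor(f, xi), xi)
    + p_tan_star(p_star_tan(f, xi), xi)
)
print(np.allclose(lhs, swapped), np.allclose(parts_sum, f))
```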

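The operators $p_{\mathrm{nor},\mathrm{nor}}$, $p_{\mathrm{tan},\mathrm{nor}}$, and $p_{\mathrm{tan},\mathrm{tan}}$ are defined analogously. A tensor field $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$ is called normal if $\mathbf{f} = p_{\mathrm{nor},\mathrm{nor}}\, \mathbf{f}$ and tangential if $\mathbf{f} = p_{\mathrm{tan},\mathrm{tan}}\, \mathbf{f}$. It is called left normal if $\mathbf{f} = p_{\mathrm{nor},*}\, \mathbf{f}$, left normal/right tangential if $\mathbf{f} = p_{\mathrm{nor},\mathrm{tan}}\, \mathbf{f}$, and so on. The constant tensor fields $\mathbf{i}_{\mathrm{tan}}$ and $\mathbf{j}_{\mathrm{tan}}$ can be defined using the local triads by

\[ \mathbf{i}_{\mathrm{tan}} = \varepsilon^\varphi \otimes \varepsilon^\varphi + \varepsilon^t \otimes \varepsilon^t, \quad \mathbf{j}_{\mathrm{tan}} = \xi \wedge \mathbf{i}_{\mathrm{tan}} = \varepsilon^t \otimes \varepsilon^\varphi - \varepsilon^\varphi \otimes \varepsilon^t. \tag{19} \]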

Spherical tensor fields can be discussed in an elegant manner by the use of certain differential processes. Let $u$ be a continuously differentiable vector field on $\Omega$, i.e., $u \in c^{(1)}(\Omega)$, given in its coordinate form by

\[ u(\xi) = \sum_{i=1}^{3} U_i(\xi)\, \varepsilon^i, \quad \xi \in \Omega,\ U_i \in C^{(1)}(\Omega). \tag{20} \]

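Then we define the operators $\nabla^{*} \otimes$ and $L^{*} \otimes$ by

\[ \nabla^{*}_{\xi} \otimes u(\xi) = \sum_{i=1}^{3} \left(\nabla^{*}_{\xi} U_i(\xi)\right) \otimes \varepsilon^i, \quad \xi \in \Omega, \tag{21} \]

\[ L^{*}_{\xi} \otimes u(\xi) = \sum_{i=1}^{3} \left(L^{*}_{\xi} U_i(\xi)\right) \otimes \varepsilon^i, \quad \xi \in \Omega. \tag{22} \]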

Clearly, $\nabla^{*} \otimes u$ and $L^{*} \otimes u$ are left tangential. It is an important fact, however, that even if $u$ is tangential, the tensor fields $\nabla^{*} \otimes u$ and $L^{*} \otimes u$ are generally not tangential. It is obvious that the product rule is valid. To be specific, let $F \in C^{(1)}(\Omega)$ and $u \in c^{(1)}(\Omega)$ (once more, note that the notation $u \in c^{(1)}(\Omega)$ means that the vector field $u$ is continuously differentiable on $\Omega$), then

\[ \nabla^{*}_{\xi} \otimes (F(\xi)\, u(\xi)) = \nabla^{*}_{\xi} F(\xi) \otimes u(\xi) + F(\xi)\, \nabla^{*}_{\xi} \otimes u(\xi), \quad \xi \in \Omega. \tag{23} \]

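In view of the above equations and definitions, we accordingly introduce operators $o^{(i,k)}: C^{(2)}(\Omega) \to \mathbf{c}^{(0)}(\Omega)$ (note that $\mathbf{c}^{(0)}(\Omega)$ is the class of continuous second-order tensor fields on the unit sphere $\Omega$) by

\[ o^{(1,1)}_{\xi} F(\xi) = \xi \otimes \xi\, F(\xi), \tag{24} \]

\[ o^{(1,2)}_{\xi} F(\xi) = \xi \otimes \nabla^{*}_{\xi} F(\xi), \tag{25} \]

\[ o^{(1,3)}_{\xi} F(\xi) = \xi \otimes L^{*}_{\xi} F(\xi), \tag{26} \]

\[ o^{(2,1)}_{\xi} F(\xi) = \nabla^{*}_{\xi} F(\xi) \otimes \xi, \tag{27} \]

\[ o^{(3,1)}_{\xi} F(\xi) = L^{*}_{\xi} F(\xi) \otimes \xi, \tag{28} \]

\[ o^{(2,2)}_{\xi} F(\xi) = \mathbf{i}_{\mathrm{tan}}(\xi)\, F(\xi), \tag{29} \]

\[ o^{(2,3)}_{\xi} F(\xi) = \left(\nabla^{*}_{\xi} \otimes \nabla^{*}_{\xi} - L^{*}_{\xi} \otimes L^{*}_{\xi}\right) F(\xi) + 2\, \nabla^{*}_{\xi} F(\xi) \otimes \xi, \tag{30} \]

\[ o^{(3,2)}_{\xi} F(\xi) = \left(\nabla^{*}_{\xi} \otimes L^{*}_{\xi} + L^{*}_{\xi} \otimes \nabla^{*}_{\xi}\right) F(\xi) + 2\, L^{*}_{\xi} F(\xi) \otimes \xi, \tag{31} \]

\[ o^{(3,3)}_{\xi} F(\xi) = \mathbf{j}_{\mathrm{tan}}(\xi)\, F(\xi), \tag{32} \]

$\xi \in \Omega$. After our preparations involving spherical second-order tensor fields it is not difficult to prove the following lemma.

Lemma 1. Let $F: \Omega \to \mathbb{R}$ be sufficiently smooth. Then the following statements are valid:

1. $o^{(1,1)} F$ is a normal tensor field.
2. $o^{(1,2)} F$ and $o^{(1,3)} F$ are left normal/right tangential.
3. $o^{(2,1)} F$ and $o^{(3,1)} F$ are left tangential/right normal.
4. $o^{(2,2)} F$, $o^{(2,3)} F$, $o^{(3,2)} F$ and $o^{(3,3)} F$ are tangential.
5. $o^{(1,1)} F$, $o^{(2,2)} F$, $o^{(2,3)} F$ and $o^{(3,2)} F$ are symmetric.
6. $o^{(3,3)} F$ is skew-symmetric.
7. $(o^{(1,2)} F)^T = o^{(2,1)} F$ and $(o^{(1,3)} F)^T = o^{(3,1)} F$.
8. For $\xi \in \Omega$,

\[ \operatorname{trace}\, o^{(i,k)}_{\xi} F(\xi) = \begin{cases} F(\xi) & \text{for } (i,k) = (1,1), \\ 2F(\xi) & \text{for } (i,k) = (2,2), \\ 0 & \text{for } (i,k) \neq (1,1), (2,2). \end{cases} \]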


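The tangent representation theorem (cf. Backus 1966, 1967) asserts that if $p_{\mathrm{tan},\mathrm{tan}}\, \mathbf{f}$ is the tangential part of a tensor field $\mathbf{f} \in \mathbf{c}^{(2)}(\Omega)$, as defined above, then there exist unique scalar fields $F_{2,2}$, $F_{3,3}$, $F_{2,3}$, $F_{3,2}$ such that

\[ \int_{\Omega} F_{2,2}(\xi)\, d\omega(\xi) = \int_{\Omega} F_{3,3}(\xi)\, d\omega(\xi) = 0, \tag{33} \]

\[ \int_{\Omega} F_{3,2}(\xi)\,(\varepsilon^i \cdot \xi)\, d\omega(\xi) = \int_{\Omega} F_{2,3}(\xi)\,(\varepsilon^i \cdot \xi)\, d\omega(\xi) = 0, \quad i = 1, 2, 3, \tag{34} \]

and

\[ p_{\mathrm{tan},\mathrm{tan}}\, \mathbf{f} = o^{(2,2)} F_{2,2} + o^{(2,3)} F_{2,3} + o^{(3,2)} F_{3,2} + o^{(3,3)} F_{3,3}. \tag{35} \]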

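Furthermore, the following orthogonality relations may be formulated: Let $F, G: \Omega \to \mathbb{R}$ be sufficiently smooth. Then for all $\xi \in \Omega$, we have $o^{(i,k)}_{\xi} F(\xi) \cdot o^{(i',k')}_{\xi} G(\xi) = 0$ whenever $(i, k) \neq (i', k')$. The adjoint operators $O^{(i,k)}$ satisfying

\[ \int_{\Omega} o^{(i,k)}_{\xi} F(\xi) \cdot \mathbf{f}(\xi)\, d\omega(\xi) = \int_{\Omega} F(\xi)\, O^{(i,k)}_{\xi} \mathbf{f}(\xi)\, d\omega(\xi), \tag{36} \]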

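for all sufficiently smooth functions $F: \Omega \to \mathbb{R}$ and tensor fields $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$ can be deduced by elementary calculations. In more detail, for $\mathbf{f} \in \mathbf{c}^{(2)}(\Omega)$, we find (cf. Freeden and Schreiner 2009)

\[ O^{(1,1)}_{\xi} \mathbf{f}(\xi) = \xi^T \mathbf{f}(\xi)\, \xi, \tag{37} \]

\[ O^{(1,2)}_{\xi} \mathbf{f}(\xi) = -\nabla^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\xi^T \mathbf{f}(\xi)\right), \tag{38} \]

\[ O^{(1,3)}_{\xi} \mathbf{f}(\xi) = -L^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\xi^T \mathbf{f}(\xi)\right), \tag{39} \]

\[ O^{(2,1)}_{\xi} \mathbf{f}(\xi) = -\nabla^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\mathbf{f}(\xi)\, \xi\right), \tag{40} \]

\[ O^{(3,1)}_{\xi} \mathbf{f}(\xi) = -L^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\mathbf{f}(\xi)\, \xi\right), \tag{41} \]

\[ O^{(2,2)}_{\xi} \mathbf{f}(\xi) = \mathbf{i}_{\mathrm{tan}}(\xi) \cdot \mathbf{f}(\xi), \tag{42} \]

\[ O^{(2,3)}_{\xi} \mathbf{f}(\xi) = \nabla^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\nabla^{*}_{\xi} \cdot p_{\mathrm{tan},*}\, \mathbf{f}(\xi)\right) - L^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(L^{*}_{\xi} \cdot p_{\mathrm{tan},*}\, \mathbf{f}(\xi)\right) - 2\, \nabla^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\mathbf{f}(\xi)\, \xi\right), \tag{43} \]

\[ O^{(3,2)}_{\xi} \mathbf{f}(\xi) = L^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\nabla^{*}_{\xi} \cdot p_{\mathrm{tan},*}\, \mathbf{f}(\xi)\right) + \nabla^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(L^{*}_{\xi} \cdot p_{\mathrm{tan},*}\, \mathbf{f}(\xi)\right) - 2\, L^{*}_{\xi} \cdot p_{\mathrm{tan}}\left(\mathbf{f}(\xi)\, \xi\right), \tag{44} \]

\[ O^{(3,3)}_{\xi} \mathbf{f}(\xi) = \mathbf{j}_{\mathrm{tan}}(\xi) \cdot \mathbf{f}(\xi), \tag{45} \]

$\xi \in \Omega$. Provided that $F: \Omega \to \mathbb{R}$ is sufficiently smooth we see that

\[ O^{(i',k')}_{\xi}\, o^{(i,k)}_{\xi} F(\xi) = 0 \quad \text{if } (i, k) \neq (i', k'), \tag{46} \]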

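whereas […]

[…] 0, $\xi \in \Omega$, we obtain the following decomposition of the Hesse matrix on the sphere $\Omega_{R+H}$, i.e., for $x \in \mathbb{R}^3$ with $|x| = R + H$:

\[ \begin{aligned} \nabla \otimes \nabla H_{-n-1,m}((R+H)\xi) ={} & (n+1)(n+2)\, \frac{1}{(R+H)^{n+3}}\, o^{(1,1)} Y_{n,m}(\xi) \\ & - (n+2)\, \frac{1}{(R+H)^{n+3}} \left( o^{(1,2)} Y_{n,m}(\xi) + o^{(2,1)} Y_{n,m}(\xi) \right) \\ & - \frac{(n+1)(n+2)}{2}\, \frac{1}{(R+H)^{n+3}}\, o^{(2,2)} Y_{n,m}(\xi) \\ & + \frac{1}{2}\, \frac{1}{(R+H)^{n+3}}\, o^{(2,3)} Y_{n,m}(\xi). \end{aligned} \tag{66} \]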

Keeping in mind that any solution of the SGG-problem can be expressed as a series of outer harmonics and using the completeness of the spherical harmonics in the space of square-integrable functions on the unit sphere, it follows that the SGG-problem is uniquely solvable (up to some low order spherical harmonics) by the $O^{(1,1)}$, $O^{(1,2)}$, $O^{(2,1)}$, $O^{(2,2)}$, and $O^{(2,3)}$ components. To be more specific, we are able to formulate the following theorem:

Theorem 2. Let $V$ satisfy the following condition $V \in \mathrm{Pot}(C^{(0)}(\Omega))$, i.e.,

\[ V \in C^{(0)}(\overline{\Omega^{\mathrm{ext}}}) \cap C^{(2)}(\Omega^{\mathrm{ext}}), \tag{67} \]

\[ \Delta V(x) = 0, \quad x \in \Omega^{\mathrm{ext}}, \tag{68} \]

\[ |V(x)| = O\left(\frac{1}{|x|}\right), \quad |x| \to \infty, \text{ uniformly for all directions.} \tag{69} \]

Then the following statements are valid:

1. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ if $(i, k) \in \{(1,3), (3,1), (3,2), (3,3)\}$.
2. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ for $(i, k) \in \{(1,1), (2,2)\}$ if and only if $V = 0$.
3. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ for $(i, k) \in \{(1,2), (2,1)\}$ if and only if $V|_{\Omega}$ is constant.
4. $O^{(2,3)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ if and only if $V|_{\Omega}$ is a linear combination of spherical harmonics of degree 0 and 1.

This theorem gives detailed information on which tensor components of the Hesse tensor ensure the uniqueness of the SGG-problem (see also the considerations due to Schreiner (1994a), Freeden et al. (2002), and Freeden and Nutz (2011)). In any case, for a potential $V$ of class $\mathrm{Pot}(C^{(0)}(\Omega))$ with vanishing spherical harmonic moments of degree 0 and 1, such as the Earth’s disturbing potential (see, e.g., Heiskanen and Moritz (1967) for its definition), uniqueness is assured in all cases (listed in Theorem 2). Since we now know, at least in the spherical setting, which conditions guarantee the uniqueness of an SGG-solution, we can turn to the question of how to find a solution and what we mean by a solution, since we have to take into account the ill-posedness. To this end, we analyze the problem step by step. We start with the reformulation of the SGG-problem as a pseudodifferential equation on the sphere, give a short overview of regularization, and show how these ingredients can be combined to regularize the SGG-data. In doing so, it helps to discuss how classical boundary value problems for the gravitational field of the Earth as well as modern satellite problems may be transferred into pseudodifferential equations, always assuming the spherically oriented geometry. Indeed, it is helpful to treat the classical Dirichlet and Neumann boundary value problems as well as significant satellite problems such as satellite-to-satellite tracking (SST) and SGG.

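4.1 SGG as Pseudodifferential Equation Let $\Sigma \subset \mathbb{R}^3$ be a regular surface, i.e., we assume the following properties: (i) $\Sigma$ divides the Euclidean space $\mathbb{R}^3$ into the bounded region $\Sigma_{\mathrm{int}}$ (inner space) and the unbounded region $\Sigma_{\mathrm{ext}}$ (outer space) so that $\Sigma_{\mathrm{ext}} = \mathbb{R}^3 \setminus \overline{\Sigma_{\mathrm{int}}}$, $\Sigma = \overline{\Sigma_{\mathrm{int}}} \cap \overline{\Sigma_{\mathrm{ext}}}$ with $\emptyset = \Sigma_{\mathrm{int}} \cap \Sigma_{\mathrm{ext}}$, (ii) $\Sigma_{\mathrm{int}}$ contains the origin, (iii) $\Sigma$ is a closed and compact surface free of double points, (iv) $\Sigma$ is locally of class $C^{(2)}$ (see Freeden and Schreiner (2004), Freeden and Gerhards (2013) for more details concerning regular surfaces). From our preparatory considerations (in particular, from the Introduction), it can be deduced that a gravitational potential of interest may be understood to be a member of the class $\mathrm{Pot}(C^{(0)}(\Sigma))$, i.e.,

\[ V \in C^{(0)}(\overline{\Sigma_{\mathrm{ext}}}) \cap C^{(2)}(\Sigma_{\mathrm{ext}}), \tag{70} \]

\[ \Delta V(x) = 0, \quad x \in \Sigma_{\mathrm{ext}}, \tag{71} \]

\[ |V(x)| = O\left(\frac{1}{|x|}\right), \quad |x| \to \infty, \text{ uniformly for all directions.} \tag{72} \]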

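Assume that $\Omega_R = \{x \in \mathbb{R}^3 \mid |x| = R\}$ is a (Runge) sphere with radius $R$ around the origin, i.e., a sphere that lies entirely inside $\Sigma$, i.e., $\Omega_R \subset \Sigma_{\mathrm{int}}$. On the class $L^2(\Omega_R)$ we impose the inner product $(\cdot,\cdot)_{L^2(\Omega_R)}$. Then we know that the functions $\frac{1}{R} Y_{n,m}\left(\frac{\cdot}{R}\right)$ form an orthonormal set of functions on $\Omega_R$, i.e., given $F \in L^2(\Omega_R)$, its Fourier expansion reads

\[ F(x) = \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left( F, Y_{n,m}\left(\frac{\cdot}{R}\right) \right)_{L^2(\Omega_R)} Y_{n,m}\left(\frac{x}{R}\right), \quad x \in \Omega_R. \tag{73} \]

Instead of considering potentials that are harmonic outside $\Sigma$ and continuous on $\Sigma$, we now consider potentials that are harmonic outside $\Omega_R$ and that are of class $L^2(\Omega_R)$. In accordance with our notation we define

\[ \mathrm{Pot}(L^2(\Omega_R)) = \left\{ x \mapsto \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left( F, Y_{n,m}\left(\frac{\cdot}{R}\right) \right)_{L^2(\Omega_R)} \frac{R^{n+1}}{|x|^{n+1}}\, Y_{n,m}\left(\frac{x}{|x|}\right) \;\middle|\; F \in L^2(\Omega_R) \right\}. \tag{74} \]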


Clearly, $\mathrm{Pot}(L^2(\Omega_R))$ is a “subset” of $\mathrm{Pot}(C^{(0)}(\Sigma))$ in the sense that if $V \in \mathrm{Pot}(L^2(\Omega_R))$, then $V|_{\overline{\Sigma_{\mathrm{ext}}}} \in \mathrm{Pot}(C^{(0)}(\Sigma))$. The “difference” of these two spaces is not “too large”: Indeed, we know from the Runge approximation theorem (cf. Freeden 1980a) that for every $\varepsilon > 0$ and every $V \in \mathrm{Pot}(C^{(0)}(\Sigma))$ there exists a $\hat{V} \in \mathrm{Pot}(L^2(\Omega_R))$ such that $\sup_{x \in \overline{\Sigma_{\mathrm{ext}}}} |V(x) - \hat{V}(x)| < \varepsilon$. Thus, in all geosciences, it is common (but not strictly consistent with the Runge argumentation) to identify $\Omega_R$ with the surface of the Earth and to assume that the restriction $V|_{\Omega_R}$ is of class $L^2(\Omega_R)$. Clearly, we have a canonical isomorphism between $L^2(\Omega_R)$ and $\mathrm{Pot}(L^2(\Omega_R))$, which is defined via the trace operator, i.e., the restriction to $\Omega_R$ and its harmonic continuation, respectively.

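4.2 Upward/Downward Continuation

Let $\Omega_{R+H}$ be the sphere with radius $R+H$. The system $\big\{\frac{1}{R+H} Y_{n,m}\big(\frac{\cdot}{R+H}\big)\big\}$ is then orthonormal in $L^2(\Omega_{R+H})$. (We assume $H$ to be the height of a satellite above the Earth's surface.) Let $F \in \mathrm{Pot}(L^2(\Omega_R))$ be represented in the form

\[ x \mapsto \sum_{n=0}^{\infty}\sum_{m=1}^{2n+1} \frac{1}{R^2}\Big(F, Y_{n,m}\Big(\frac{\cdot}{R}\Big)\Big)_{L^2(\Omega_R)} \frac{R^{n+1}}{|x|^{n+1}}\, Y_{n,m}\Big(\frac{x}{|x|}\Big). \tag{75} \]

Then the restriction of $F$ on $\Omega_{R+H}$ reads

\[ F|_{\Omega_{R+H}} : x \mapsto \sum_{n=0}^{\infty}\sum_{m=1}^{2n+1} \frac{1}{R^2}\Big(F, Y_{n,m}\Big(\frac{\cdot}{R}\Big)\Big)_{L^2(\Omega_R)} \frac{R^{n+1}}{(R+H)^{n+1}}\, Y_{n,m}\Big(\frac{x}{R+H}\Big). \tag{76} \]

Hence, any element $\frac{1}{R} Y_{n,m}\big(\frac{\cdot}{R}\big)$ of the orthonormal system in $L^2(\Omega_R)$ is mapped to a function $\frac{R^n}{(R+H)^n}\,\frac{1}{R+H}\, Y_{n,m}\big(\frac{\cdot}{R+H}\big)$. The operation defined in such a way is called upward continuation. It is representable by the pseudodifferential operator $\Lambda_{\mathrm{up}}^{R,H} : L^2(\Omega_R) \to L^2(\Omega_{R+H})$ with associated symbol

\[ \big(\Lambda_{\mathrm{up}}^{R,H}\big)^{\wedge}(n) = \frac{R^n}{(R+H)^n}. \tag{77} \]

For more details on pseudodifferential operators the reader should consult Svensson (1983), Schneider (1997), Freeden et al. (1998), Freeden (1999), and Freeden and Schreiner (2009). In other words, we have

\[ \Lambda_{\mathrm{up}}^{R,H}\Big(\frac{1}{R}\, Y_{n,m}\Big(\frac{\cdot}{R}\Big)\Big) = \big(\Lambda_{\mathrm{up}}^{R,H}\big)^{\wedge}(n)\, \frac{1}{R+H}\, Y_{n,m}\Big(\frac{\cdot}{R+H}\Big). \tag{78} \]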
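The degree-wise damping expressed by Eqs. 77 and 78 can be illustrated with a minimal sketch. The function names, the dictionary layout of the coefficients, and the numerical values chosen for $R$ and $H$ are illustrative assumptions and do not come from the text.

```python
def upward_symbol(n, R, H):
    """Symbol R^n / (R+H)^n of the upward continuation operator (Eq. 77)."""
    return (R / (R + H)) ** n


def upward_continue(coeffs, R, H):
    """Multiply each coefficient by the symbol of its degree (Eq. 78).

    coeffs maps (n, m) to the coefficient with respect to the orthonormal
    system on the sphere of radius R; the result holds the coefficients with
    respect to the orthonormal system on the sphere of radius R + H.
    The dictionary layout is an assumption made for this sketch.
    """
    return {(n, m): upward_symbol(n, R, H) * c for (n, m), c in coeffs.items()}


if __name__ == "__main__":
    R, H = 6371.0, 250.0  # km; illustrative values only
    for n in (0, 50, 100, 200):
        print(n, upward_symbol(n, R, H))
```

The symbol decays exponentially with the degree $n$, which is why the inverse operation (downward continuation) amplifies high-degree components.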

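The image of $\Lambda_{\mathrm{up}}^{R,H}$ is given by Picard's criterion (cf. Theorem 4):

\[ \Lambda_{\mathrm{up}}^{R,H}\big(L^2(\Omega_R)\big) = \Big\{ F \in L^2(\Omega_{R+H}) \;\Big|\; \sum_{n=0}^{\infty}\sum_{m=1}^{2n+1} \Big(\frac{(R+H)^n}{R^n}\Big)^2 \Big(F, \frac{1}{R+H}\, Y_{n,m}\Big(\frac{\cdot}{R+H}\Big)\Big)^2_{L^2(\Omega_{R+H})} < \infty \Big\}. \]

[...] 0 and define the family $\{u_n\}_{n \in \mathbb{N}}$ via $u_n = \Lambda v_n / \|\Lambda v_n\|_K$. The sequence $\{u_n\}_{n \in \mathbb{N}}$ forms a complete orthonormal system of eigenvectors of $\Lambda\Lambda^*$, and the following formulas are valid:

\[ \Lambda v_n = \sigma_n u_n, \tag{114} \]

\[ \Lambda^* u_n = \sigma_n v_n, \tag{115} \]

\[ \Lambda x = \sum_{n=1}^{\infty} \sigma_n (x, v_n)_H\, u_n, \quad x \in H, \tag{116} \]

\[ \Lambda^* y = \sum_{n=1}^{\infty} \sigma_n (y, u_n)_K\, v_n, \quad y \in K. \tag{117} \]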


The convergence of the infinite series is understood with respect to the Hilbert space norms under consideration. The identities (116) and (117) are called the singular value expansions of the corresponding operators. If there are infinitely many singular values, they accumulate (only) at 0, i.e., $\lim_{n \to \infty} \sigma_n = 0$.

Theorem 4. Let $(\sigma_n; v_n, u_n)$ be a singular system for the compact linear operator $\Lambda$, $y \in K$. Then we have

\[ y \in D(\Lambda^+) \quad \text{if and only if} \quad \sum_{n=1}^{\infty} \frac{|(y, u_n)_K|^2}{\sigma_n^2} < \infty, \tag{118} \]

and for $y \in D(\Lambda^+)$ it holds

\[ \Lambda^+ y = \sum_{n=1}^{\infty} \frac{(y, u_n)_K}{\sigma_n}\, v_n. \tag{119} \]

The condition (118) is the Picard criterion. It says that a best-approximate solution of $\Lambda x = y$ exists only if the Fourier coefficients of $y$ decay fast enough relative to the singular values. The representation (119) of the best-approximate solution motivates a method for the construction of regularization operators, namely by damping the factors $1/\sigma_n$ in such a way that the series converges for all $y \in K$. We are looking for filters

\[ q : (0, \infty) \times (0, \|\Lambda\|_{L(H,K)}) \to \mathbb{R} \tag{120} \]

such that

\[ R_\alpha y = \sum_{n=1}^{\infty} \frac{q(\alpha, \sigma_n)}{\sigma_n} (y, u_n)_K\, v_n, \quad y \in K, \]

is a regularization strategy. The following statement is known, e.g., from Kirsch (1996).

Theorem 5. Let $\Lambda : H \to K$ be compact with singular system $(\sigma_n; v_n, u_n)$. Assume that $q$ from (120) has the following properties:

1. $|q(\alpha, \sigma)| \le 1$ for all $\alpha > 0$ and $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.
2. For every $\alpha > 0$ there exists a $c(\alpha)$ so that $|q(\alpha, \sigma)| \le c(\alpha)\,\sigma$ for all $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.
3. $\lim_{\alpha \to 0} q(\alpha, \sigma) = 1$ for every $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.

Then the operator $R_\alpha : K \to H$, $\alpha > 0$, defined by

\[ R_\alpha y = \sum_{n=1}^{\infty} \frac{q(\alpha, \sigma_n)}{\sigma_n} (y, u_n)_K\, v_n, \quad y \in K, \tag{121} \]

is a regularization strategy with $\|R_\alpha\|_{L(K,H)} \le c(\alpha)$.


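The function $q$ is called a regularizing filter for $\Lambda$. Two important examples should be mentioned:

\[ q(\alpha, \sigma) = \frac{\sigma^2}{\alpha + \sigma^2} \tag{122} \]

defines the Tikhonov regularization, whereas

\[ q(\alpha, \sigma) = \begin{cases} 1, & \sigma^2 \ge \alpha, \\ 0, & \sigma^2 < \alpha, \end{cases} \tag{123} \]

leads to the regularization by truncated singular value decomposition.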
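The two filters (122) and (123) and their use in Eq. 121 can be sketched in a few lines. The function names, the list-based representation of the singular values and Fourier coefficients, and the sample numbers are assumptions made for illustration only.

```python
def tikhonov_filter(alpha, sigma):
    """Tikhonov filter q(alpha, sigma) = sigma^2 / (alpha + sigma^2), Eq. 122."""
    return sigma ** 2 / (alpha + sigma ** 2)


def tsvd_filter(alpha, sigma):
    """Truncated-SVD filter: 1 if sigma^2 >= alpha, else 0, Eq. 123."""
    return 1.0 if sigma ** 2 >= alpha else 0.0


def regularized_coefficients(q, alpha, sigmas, y_coeffs):
    """Coefficients of R_alpha y with respect to v_n, following Eq. 121.

    sigmas[n] are singular values and y_coeffs[n] = (y, u_n)_K; both are
    plain lists here, which is an assumption of this sketch.
    """
    return [q(alpha, s) / s * y for s, y in zip(sigmas, y_coeffs)]


if __name__ == "__main__":
    sigmas = [1.0, 0.1, 0.001]
    noisy_y = [1.0, 0.1, 0.011]  # last coefficient dominated by noise
    print(regularized_coefficients(tikhonov_filter, 1e-4, sigmas, noisy_y))
    print(regularized_coefficients(tsvd_filter, 1e-4, sigmas, noisy_y))
```

Both filters leave components with $\sigma_n^2 \gg \alpha$ essentially untouched and suppress those with $\sigma_n^2 \ll \alpha$, where division by $\sigma_n$ would otherwise amplify data noise.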

4.10 Regularization of the Exponentially Ill-Posed SGG-Problem

We are now in the position to have a closer look at the role of the regularization techniques particularly working for the SGG-problem. In (95), the SGG-problem is formulated as a pseudodifferential equation: Given $G \in L^2(\Omega)$, find $F \in L^2(\Omega)$ so that $\Lambda_{\mathrm{SGG}}^{R,H} F = G$ with

\[ \big(\Lambda_{\mathrm{SGG}}^{R,H}\big)^{\wedge}(n) = \frac{R^n}{(R+H)^n}\, \frac{(n+1)(n+2)}{(R+H)^2}. \tag{124} \]

Switching now to a finite dimensional space (which is then the realization of the regularization by a singular value truncation), we are interested in a solution of the representation

\[ F_N = \sum_{n=1}^{N}\sum_{m=1}^{2n+1} F^{\wedge}(n,m)\, Y_{n,m}. \tag{125} \]

Using a decomposition of $G$ of the form

\[ G = \sum_{n=1}^{N}\sum_{m=1}^{2n+1} G^{\wedge}(n,m)\, Y_{n,m}, \tag{126} \]

we end up with the spectral equations

\[ \big(\Lambda_{\mathrm{SGG}}^{R,H}\big)^{\wedge}(n)\, F^{\wedge}(n,m) = G^{\wedge}(n,m), \quad n = 1, \ldots, N;\ m = 1, \ldots, 2n+1. \tag{127} \]

In other words, in connection with (125) and (126), we find the relations

\[ F^{\wedge}(n,m) = \frac{G^{\wedge}(n,m)}{\big(\Lambda_{\mathrm{SGG}}^{R,H}\big)^{\wedge}(n)}, \quad n = 1, \ldots, N;\ m = 1, \ldots, 2n+1. \tag{128} \]

For the realization of this solution we have to find the coefficients $G^{\wedge}(n,m)$. Of course, we are confronted with the usual problems of integration, aliasing, and so on.
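A minimal sketch of the truncated spectral inversion (128) follows. It assumes that the coefficients $G^{\wedge}(n,m)$ are already available in a dictionary keyed by $(n, m)$; the function names and the values of $R$ and $H$ are hypothetical and serve only to show how the symbol (124) enters.

```python
def sgg_symbol(n, R, H):
    """Symbol (R/(R+H))^n * (n+1)(n+2) / (R+H)^2 of the SGG operator (Eq. 124)."""
    return (R / (R + H)) ** n * (n + 1) * (n + 2) / (R + H) ** 2


def invert_truncated(G_coeffs, R, H, N):
    """Spectral division of Eq. 128, restricted to degrees 1 <= n <= N.

    Degrees above N are discarded, which realizes the singular value
    truncation. G_coeffs maps (n, m) to G^(n, m) (assumed layout).
    """
    return {
        (n, m): g / sgg_symbol(n, R, H)
        for (n, m), g in G_coeffs.items()
        if 1 <= n <= N
    }


if __name__ == "__main__":
    R, H = 6371.0, 250.0  # km; illustrative values only
    # The amplification factor 1/symbol grows exponentially for high degrees.
    for n in (100, 200, 300):
        print(n, 1.0 / sgg_symbol(n, R, H))
```

The polynomial factor $(n+1)(n+2)$ cannot compensate for the exponential decay of $R^n/(R+H)^n$ at high degrees, so the choice of $N$ controls how strongly data errors are amplified.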


The identity (128) also opens the perspective for SGG-applications by bandlimited regularization wavelets in Earth’s gravitational field determination. For more details, we refer to Freeden et al. (1997), Schneider (1996, 1997), Freeden and Schneider (1998), Glockner (2002), Hesse (2003), and Freeden and Nutz (2011). The book written by Freeden (1999) contains nonbandlimited versions of (harmonic) regularization wavelets. Multiscale regularization by use of spherical up functions is the content of the papers by Freeden and Schreiner (2004) and Schreiner (2004).

5 Future Directions

The regularization schemes described above are based on the decomposition of the Hesse tensor at satellite's height into scalar ingredients due to geometrical properties (normal, tangential, mixed) as well as to analytical properties originated by differentiation processes involving physically defined quantities (such as divergence, curl, etc.). SGG-regularization, however, is more suitable and effective if it is based on algorithms involving the full Hesse tensor such as from the GOCE mission (for more insight into the tensorial decomposition of GOCE-data, the reader is referred to the contribution of R. Rummel in this issue). In addition, see Rummel and van Gelderen (1992) and Rummel (1997).

Our context initiates another approach to tensor spherical harmonics. Based on cartesian operators (see Freeden and Schreiner 2009), the construction principle starts from operators $\tilde{o}_n^{(i,k)}$, $i, k \in \{1, 2, 3\}$, given by

\[ \tilde{o}_n^{(1,1)} F(x) = \big((2n+3)x - |x|^2 \nabla_x\big) \otimes \big((2n+1)x - |x|^2 \nabla_x\big) F(x), \tag{129} \]

\[ \tilde{o}_n^{(1,2)} F(x) = \big((2n-1)x - |x|^2 \nabla_x\big) \otimes \nabla_x F(x), \tag{130} \]

\[ \tilde{o}_n^{(1,3)} F(x) = \big((2n+1)x - |x|^2 \nabla_x\big) \otimes (x \wedge \nabla_x) F(x), \tag{131} \]

\[ \tilde{o}_n^{(2,1)} F(x) = \nabla_x \otimes \big((2n+1)x - |x|^2 \nabla_x\big) F(x), \tag{132} \]

\[ \tilde{o}_n^{(2,2)} F(x) = \nabla_x \otimes \nabla_x F(x), \tag{133} \]

\[ \tilde{o}_n^{(2,3)} F(x) = \nabla_x \otimes (x \wedge \nabla_x) F(x), \tag{134} \]

\[ \tilde{o}_n^{(3,1)} F(x) = (x \wedge \nabla_x) \otimes \big((2n+1)x - |x|^2 \nabla_x\big) F(x), \tag{135} \]

\[ \tilde{o}_n^{(3,2)} F(x) = (x \wedge \nabla_x) \otimes \nabla_x F(x), \tag{136} \]

\[ \tilde{o}_n^{(3,3)} F(x) = (x \wedge \nabla_x) \otimes (x \wedge \nabla_x) F(x) \tag{137} \]

for $x \in \mathbb{R}^3$ and sufficiently smooth functions $F : \mathbb{R}^3 \to \mathbb{R}$. Elementary calculations in cartesian coordinates lead us in a straightforward way to the following result.


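Lemma 2. Let $H_n$, $n \in \mathbb{N}_0$, be a homogeneous harmonic polynomial of degree $n$. Then $\tilde{o}_n^{(i,k)} H_n$ is a homogeneous harmonic tensor polynomial of degree $\deg^{(i,k)}(n)$, where

\[ \deg^{(i,k)}(n) = \begin{cases} n-2, & \ldots \\ \ \vdots & \end{cases} \]

[...] 0, we define

\[ [f_{ij\ldots}]^{\pm} = \lim_{\epsilon \to 0^+} f_{ij\ldots}(X \pm \epsilon n), \quad X_i \in \partial X,\ t \in T, \tag{28} \]

\[ [f_{ij\ldots}]^{+}_{-} = [f_{ij\ldots}]^{+} - [f_{ij\ldots}]^{-}, \quad X_i \in \partial X,\ t \in T. \tag{29} \]

The interface condition for $f_{ij\ldots}$ can then be written as

\[ [f_{ij\ldots}]^{+}_{-} = f^{\pm}_{ij\ldots}, \quad X_i \in \partial X,\ t \in T, \tag{30} \]

where $f^{\pm}_{ij\ldots}$ is the increase of $f_{ij\ldots}$ in the direction of $n_i$. For convenience, we extend field values onto $\partial R$ using

\[ f_{ij\ldots} = \tfrac{1}{2}\big\{[f_{ij\ldots}]^{-} + [f_{ij\ldots}]^{+}\big\}, \quad X_i \in \partial X,\ t \in T. \tag{31} \]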

3 Field Equations and Interface Conditions

In this section, we deduce the incremental field equations and interface conditions of GVED describing infinitesimal, gravitational-viscoelastic perturbations of compositionally and entropically stratified, compressible, rotating fluids initially in hydrostatic equilibrium. In deducing the equations, we suppose that the perturbations are isocompositional and isentropic. Our assumption is justified if the characteristic times of diffusive processes are long compared with those of viscoelastic relaxation. The modifications of the theory required to include phase changes have been studied elsewhere (e.g., Johnston et al. 1997).

We use the Lagrangian representation developed in Sect. 2, i.e., the field values refer to the current position, $r_i$, of a particle whose initial position, $X_i$, is taken as the spatial argument; the temporal argument is the current time, $t$. The field equations and interface conditions are defined for $X_i \in X_- \cup X_+$ and $X_i \in \partial X$, respectively, and for $t \in T$. We begin by collecting the field equations and interface conditions for the total fields (Sect. 3.1) and the initial fields (Sect. 3.2), from which those for the incremental fields are obtained (Sect. 3.3). After this, the continuity and state equations involving the density and thermodynamic pressure are given (Sect. 3.4). These fields are useful when studying the large-t asymptotic behavior of the incremental field equations and interface conditions (section "Large-t Asymptotes: Field Theory of GVD") and when considering locally incompressible perturbations (Sect. 5.1).

3.1 Equations for the Total Fields

We follow the standard monographs on continuum mechanics (e.g., Malvern 1969; Eringen 1989). In particular, we are concerned with the relationship between the Cauchy stress, $t_{ij}$, and the (nonsymmetric) Piola-Kirchhoff stress, $\tau_{ij}$:

\[ t_{ij}\, d^2 r_j = \tau_{ij}\, d^2 r_j^{(0)}, \tag{32} \]

where $d^2 r_i$ is an arbitrary differential area currently at $r_i$ and $d^2 r_i^{(0)}$ the associated initial differential area at $r_i^{(0)}$. Using the transformation formula

\[ r_{j,i}\, d^2 r_j = j\, d^2 r_i^{(0)}, \tag{33} \]

with the Jacobian determinant, $j$, given by

\[ j = \det[r_{i,j}], \tag{34} \]

it then follows that

\[ r_{i,k}\, \tau_{jk} = j\, t_{ij}. \tag{35} \]

We suppose in this review that the continuum is without couple stresses, volume couples, and spin angular momentum so that $t_{ij}$ is symmetric. Consider now a gravitating, rotating continuum undergoing perturbations of some initial state. Assuming that its angular velocity, $\Omega_i$, is prescribed, the momentum equation relative to a corotating reference frame is

\[ \tau_{ij,j} + \rho^{(0)}\big(g_i + 2\epsilon_{ijk}\Omega_k\, d_t r_j\big) = \rho^{(0)} d_t^2 r_i, \tag{36} \]

where $g_i$ is the gravity force per unit mass, $2\epsilon_{ijk}\Omega_k\, d_t r_j$ the Coriolis force per unit mass, and $\rho^{(0)} \ge 0$ the initial volume-mass density. The symbols $d_t$ and $d_t^2$ denote the first- and second-order material-derivative operators with respect to $t$. The field $g_i$, henceforth simply referred to as gravity, is given by

\[ g_i = (\phi + \chi + \psi)_{,j}\, X_{j,i}, \tag{37} \]

with $\phi$ the gravitational potential, $\chi$ the centrifugal potential, and $\psi$ the tidal potential. The gravitational-potential equation can be written as

\[ j\big(\phi_{,ij} X_{i,k} X_{j,k} + \phi_{,i} X_{i,jj}\big) = -4\pi G \rho^{(0)}, \tag{38} \]

where $G$ is Newton's gravitational constant. Using the relations between the Lagrangian and Eulerian gradients (Sect. 2.1) and the continuity equation (Sect. 3.4), Eq. 38 can be interpreted as the Lagrangian representation of the (Eulerian) Poisson equation (e.g., Ramsey 1981). The rotational-potential equation is

\[ 2\chi = \Omega_i \Omega_i\, r_j r_j - \Omega_i \Omega_j\, r_i r_j, \tag{39} \]

which implies that the origin of the coordinate system is on the spin axis. The constitutive equation is assumed to be of the form

\[ t_{ij} = t_{ij}^{(0)} + M_{ij}\big[r_{m,k}(t - t')\, r_{m,\ell}(t - t') - \delta_{k\ell}\big], \tag{40} \]

where $M_{ij}$ is the anisotropic relaxation functional (assumed to be linear) transforming the strain history given by the term in brackets into the current incremental stress and $t'$ is the excitation time. Clearly, $t' \in [0, t]$ must hold as a consequence of the causality principle. With $M_{ij}$, $t_{ij}^{(0)}$, $\rho^{(0)}$, $\psi$, and $\Omega_i$ prescribed parameters and in view of $X_{(i,j)}\, r_{(j,i)} = 1$ (no summation), Eqs. 34–40 constitute the system of total field equations for $g_i$, $j$, $r_i$, $t_{ij}$, $\tau_{ij}$, $\phi$, and $\chi$.

Next, the interface conditions to be satisfied by the fields will be collected. We assume here that $\partial R$ coincides with a material sheet whose interface-mass density is $\sigma$. Considering Eqs. 28–30 and the direction of $n_i$ agreed on, the following interface conditions result from Eqs. 34–40:

\[ [r_i]^{+}_{-} = 0, \tag{41} \]

\[ [\phi]^{+}_{-} = 0, \tag{42} \]

\[ [n_i\, \phi_{,j} X_{j,i}]^{+}_{-} = -4\pi G \sigma, \tag{43} \]

\[ [n_j t_{ij}]^{+}_{-} = -g_i \sigma. \tag{44} \]

Note that the conditions apply on the supposition that $\partial R$ is a "welded" interface not admitting slip or cavitation. The value of $g_i$ for $X_i \in \partial X$ is defined according to Eq. 31, and $\sigma$ is supposed to be a prescribed function of $X_i \in \partial X$ and $t \in T$.

3.2 Equations for the Initial Fields

We now assume that (i) the continuum is a fluid and (ii) the initial state applying at $t = 0$ is a static equilibrium state. Since a fluid at rest cannot maintain deviatoric stresses, the initial state must be a hydrostatic equilibrium state. With the mechanical pressure defined by $p = -t_{ii}/3$, we then have

\[ t_{ij}^{(0)} = -\delta_{ij}\, p^{(0)} \tag{45} \]

and, with $r_{i,j}^{(0)} = \delta_{ij}$, Eqs. 34 and 35 reduce to

\[ j^{(0)} = 1, \tag{46} \]

\[ \tau_{ij}^{(0)} = -\delta_{ij}\, p^{(0)}. \tag{47} \]

Using Eqs. 45–47, $X_{i,j}^{(0)} = \delta_{ij}$ and $(d_t r_i)^{(0)} = (d_t^2 r_i)^{(0)} = 0$, Eqs. 36–40 become

\[ -p_{,i}^{(0)} + \rho^{(0)} g_i^{(0)} = 0, \tag{48} \]

\[ g_i^{(0)} = \big(\phi^{(0)} + \chi^{(0)} + \psi^{(0)}\big)_{,i}, \tag{49} \]

\[ \phi_{,ii}^{(0)} = -4\pi G \rho^{(0)}, \tag{50} \]

\[ 2\chi^{(0)} = \Omega_i \Omega_i\, r_j^{(0)} r_j^{(0)} - \Omega_i \Omega_j\, r_i^{(0)} r_j^{(0)}, \tag{51} \]

\[ p^{(0)} = \xi\big(\rho^{(0)}, \lambda^{(0)}, \varphi^{(0)}\big). \tag{52} \]

The last expression is the form of the state equation assumed in this review, where $\lambda^{(0)}$ is a field representing the initial composition and $\varphi^{(0)}$ the initial entropy density. With the state function, $\xi$, known and $\lambda^{(0)}$, $\varphi^{(0)}$, $\psi^{(0)}$, and $\Omega_i$ prescribed, Eqs. 48–52 constitute the (nonlinear) system of initial field equations of gravitational hydrostatics (GHS) for $g_i^{(0)}$, $p^{(0)}$, $\rho^{(0)}$, $\phi^{(0)}$, and $\chi^{(0)}$.

We point out the relationship $\epsilon_{ijk}\, \rho_{,j}^{(0)} g_k^{(0)} = 0$ following from Eqs. 48 and 49, whence these equations require that the level surfaces of $p^{(0)}$, $\rho^{(0)}$, and $\phi^{(0)} + \chi^{(0)} + \psi^{(0)}$ coincide. However, since Eqs. 48 and 49 represent three scalar equations, respectively, the system consisting of Eqs. 48–52 is overdetermined, and solutions for the level surfaces are severely restricted.

Supposing $\sigma^{(0)} = 0$ and using Eq. 45 and $X_{i,j}^{(0)} = \delta_{ij}$, the following initial interface conditions are obtained from Eqs. 41–44:

\[ \big[r_i^{(0)}\big]^{+}_{-} = 0, \tag{53} \]

\[ \big[\phi^{(0)}\big]^{+}_{-} = 0, \tag{54} \]

\[ \big[n_i^{(0)} \phi_{,i}^{(0)}\big]^{+}_{-} = 0, \tag{55} \]

\[ \big[p^{(0)}\big]^{+}_{-} = 0. \tag{56} \]

Since solutions to Eqs. 48–52 admit a jump discontinuity of $\rho^{(0)}$ for $X_i \in \partial X$, we also have

\[ \big[\rho^{(0)}\big]^{+}_{-} = \rho^{\pm}. \tag{57} \]


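3.3 Equations for the Incremental Fields

Material Form

Using Eqs. 20, 21, and 25, we decompose the total fields in Eqs. 34–40 into initial and incremental parts. Considering also Eqs. 45–47 and 52, $r_{i,j}^{(0)} = X_{i,j}^{(0)} = \delta_{ij}$, and $(d_t r_i)^{(0)} = (d_t^2 r_i)^{(0)} = 0$, we get

\[ 1 + j^{(\delta)} = \det[\delta_{ij} + u_{i,j}], \tag{58} \]

\[ (\delta_{ik} + u_{i,k})\big(-\delta_{jk}\, p^{(0)} + \tau_{jk}^{(\delta)}\big) = \big(1 + j^{(\delta)}\big)\big(-\delta_{ij}\, p^{(0)} + t_{ij}^{(\delta)}\big), \tag{59} \]

\[ -p_{,i}^{(0)} + \tau_{ij,j}^{(\delta)} + \rho^{(0)}\big(g_i^{(0)} + g_i^{(\delta)} + 2\epsilon_{ijk}\Omega_k\, d_t u_j\big) = \rho^{(0)} d_t^2 u_i, \tag{60} \]

\[ g_i^{(0)} + g_i^{(\delta)} = \big(\phi^{(0)} + \phi^{(\delta)} + \chi^{(0)} + \chi^{(\delta)} + \psi^{(0)} + \psi^{(\delta)}\big)_{,j} (\delta_{ji} - U_{j,i}), \tag{61} \]

\[ \big(1 + j^{(\delta)}\big)\big[\big(\phi^{(0)} + \phi^{(\delta)}\big)_{,ij} (\delta_{ik} - U_{i,k})(\delta_{jk} - U_{j,k}) - \big(\phi^{(0)} + \phi^{(\delta)}\big)_{,i}\, U_{i,jj}\big] = -4\pi G \rho^{(0)}, \tag{62} \]

\[ 2\big(\chi^{(0)} + \chi^{(\delta)}\big) = \Omega_i \Omega_i \big(r_j^{(0)} + u_j\big)\big(r_j^{(0)} + u_j\big) - \Omega_i \Omega_j \big(r_i^{(0)} + u_i\big)\big(r_j^{(0)} + u_j\big), \tag{63} \]

\[ -\delta_{ij}\, p^{(0)} + t_{ij}^{(\delta)} = -\delta_{ij}\, \xi\big(\rho^{(0)}, \lambda^{(0)}, \varphi^{(0)}\big) + M_{ij}\big\{[\delta_{mk} + u_{m,k}(t - t')][\delta_{m\ell} + u_{m,\ell}(t - t')] - \delta_{k\ell}\big\}. \tag{64} \]

We note that no restriction on the magnitude of the perturbations has been imposed so far, i.e., Eqs. 58–64 are valid for finite perturbations. Since we are only concerned with infinitesimal perturbations, this allows us to linearize the field equations. Accordingly, we have

\[ U_{i,j} = u_{i,j} \tag{65} \]

and Eq. 58 reduces to

\[ j^{(\delta)} = u_{i,i}, \tag{66} \]

by which Eq. 59 can be rewritten as

\[ \tau_{ij}^{(\delta)} = t_{ij}^{(\delta)} + p^{(0)}(u_{j,i} - \delta_{ij}\, u_{k,k}). \tag{67} \]

Considering Eqs. 48–52 and 65–67, Eqs. 60–64 then become

\[ t_{ij,j}^{(\delta)} + p_{,j}^{(0)} u_{j,i} - p_{,i}^{(0)} u_{j,j} + \rho^{(0)}\big(g_i^{(\delta)} + 2\epsilon_{ijk}\Omega_k\, d_t u_j\big) = \rho^{(0)} d_t^2 u_i, \tag{68} \]

\[ g_i^{(\delta)} = \big(\phi^{(\delta)} + \chi^{(\delta)} + \psi^{(\delta)}\big)_{,i} - \big(\phi^{(0)} + \chi^{(0)} + \psi^{(0)}\big)_{,j}\, u_{j,i}, \tag{69} \]

\[ \phi_{,ii}^{(\delta)} - 2\phi_{,ij}^{(0)} u_{i,j} - \phi_{,i}^{(0)} u_{i,jj} = 4\pi G \rho^{(0)} u_{i,i}, \tag{70} \]

\[ \chi^{(\delta)} = \chi_{,i}^{(0)} u_i, \tag{71} \]

\[ t_{ij}^{(\delta)} = M_{ij}\big[u_{k,\ell}(t - t') + u_{\ell,k}(t - t')\big]. \tag{72} \]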

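With $M_{ij}$, $\psi^{(\delta)}$, and $\Omega_i$ prescribed parameters and the initial fields given as the special solution to the initial field equations and interface conditions, Eqs. 68–72 constitute the material form of the incremental field equations of GVED for $g_i^{(\delta)}$, $t_{ij}^{(\delta)}$, $u_i$, $\phi^{(\delta)}$, and $\chi^{(\delta)}$.

Next, the linearized form of the associated incremental interface conditions is derived. For this purpose, we decompose the total fields in Eqs. 41–44 into initial and incremental parts. Using Eqs. 20, 21, 25, 45, and $r_{i,j}^{(0)} = X_{i,j}^{(0)} = \delta_{ij}$, we get

\[ \big[r_i^{(0)} + u_i\big]^{+}_{-} = 0, \tag{73} \]

\[ \big[\phi^{(0)} + \phi^{(\delta)}\big]^{+}_{-} = 0, \tag{74} \]

\[ \big[\big(n_i^{(0)} + n_i^{(\delta)}\big)\big(\phi^{(0)} + \phi^{(\delta)}\big)_{,j} (\delta_{ji} - U_{j,i})\big]^{+}_{-} = -4\pi G \sigma, \tag{75} \]

\[ \big[\big(-\delta_{ij}\, p^{(0)} + t_{ij}^{(\delta)}\big)\big(n_j^{(0)} + n_j^{(\delta)}\big)\big]^{+}_{-} = -\big(g_i^{(0)} + g_i^{(\delta)}\big)\sigma. \tag{76} \]

In view of Eqs. 53–56 and 65 and on the assumption of infinitesimal perturbations, the material forms of the incremental interface conditions are as follows:

\[ [u_i]^{+}_{-} = 0, \tag{77} \]

\[ \big[\phi^{(\delta)}\big]^{+}_{-} = 0, \tag{78} \]

\[ \big[n_i^{(0)}\big(\phi_{,i}^{(\delta)} - \phi_{,j}^{(0)} u_{j,i}\big)\big]^{+}_{-} = -4\pi G \sigma, \tag{79} \]

\[ \big[n_j^{(0)} t_{ij}^{(\delta)}\big]^{+}_{-} = -g_i^{(0)} \sigma. \tag{80} \]

Since $n_i^{(0)}$ is normal to $\partial R^{(0)}$, which is a surface of constant $\phi^{(0)} + \chi^{(0)} + \psi^{(0)}$, we put on this surface

\[ g_i^{(0)} = -\gamma^{(0)} n_i^{(0)}. \tag{81} \]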
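The linearization step that turns Eq. 58 into Eq. 66 can be checked numerically: for a small displacement gradient, $\det[\delta_{ij} + u_{i,j}] - 1$ differs from the trace $u_{i,i}$ only by second-order terms. The sketch below uses a hypothetical sample gradient and helper names chosen for illustration.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def jacobian_increment(grad_u):
    """j^(delta) = det[delta_ij + u_{i,j}] - 1, the finite form of Eq. 58."""
    m = [[(1.0 if i == j else 0.0) + grad_u[i][j] for j in range(3)]
         for i in range(3)]
    return det3(m) - 1.0


def trace3(a):
    """Linearized increment u_{i,i} of Eq. 66."""
    return a[0][0] + a[1][1] + a[2][2]


def linearization_error(eps):
    """|det[I + eps*B] - 1 - eps*tr(B)| for a fixed sample matrix B (assumed)."""
    base = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [2.0, 1.0, 0.5]]
    grad_u = [[eps * x for x in row] for row in base]
    return abs(jacobian_increment(grad_u) - trace3(grad_u))


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, linearization_error(eps))
```

Reducing the gradient by a factor of ten reduces the discrepancy by roughly a factor of one hundred, as expected for a remainder of second order.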

Material-Local Form

The material-local form of the incremental field equations and interface conditions results if we use Eq. 27 to express $g_i^{(\delta)}$, $\phi^{(\delta)}$, and $\psi^{(\delta)}$ in terms of the respective local increments:

\[ g_i^{(\delta)} = g_i^{(\Delta)} + g_{i,j}^{(0)} u_j, \tag{82} \]

\[ \phi^{(\delta)} = \phi^{(\Delta)} + \phi_{,i}^{(0)} u_i, \tag{83} \]

\[ \psi^{(\delta)} = \psi^{(\Delta)} + \psi_{,i}^{(0)} u_i. \tag{84} \]

In view of Eqs. 48–50, 71, and 82–84, Eqs. 68–72 take the form

\[ t_{ij,j}^{(\delta)} + \big(p_{,j}^{(0)} u_j\big)_{,i} - g_i^{(0)} \big(\rho^{(0)} u_j\big)_{,j} + \rho^{(0)}\big(g_i^{(\Delta)} + 2\epsilon_{ijk}\Omega_k\, d_t u_j\big) = \rho^{(0)} d_t^2 u_i, \tag{85} \]

\[ g_i^{(\Delta)} = \big(\phi^{(\Delta)} + \psi^{(\Delta)}\big)_{,i}, \tag{86} \]

\[ \phi_{,ii}^{(\Delta)} = 4\pi G \big(\rho^{(0)} u_i\big)_{,i}, \tag{87} \]

\[ t_{ij}^{(\delta)} = M_{ij}\big[u_{k,\ell}(t - t') + u_{\ell,k}(t - t')\big]. \tag{88} \]

With $M_{ij}$, $\psi^{(\Delta)}$, and $\Omega_i$ prescribed parameters and the initial fields given as the special solution to the initial field equations and interface conditions, Eqs. 85–88 constitute the material-local form of the incremental field equations of GVED for $g_i^{(\Delta)}$, $t_{ij}^{(\delta)}$, $u_i$, and $\phi^{(\Delta)}$. Equations 85–87 agree with the incremental momentum equation and the incremental gravitational-potential equation given by Love (1911) and Dahlen (1974). Love used the Eulerian representation, i.e., his incremental equations are functions of the current particle position, $r_i$. Since the difference between the Lagrangian and Eulerian representations is of second order in the incremental quantities, it may be ignored in linearized field theory. In contrast to Love, Dahlen used the Lagrangian representation in terms of the initial particle position, $X_i$, which has also been adopted here.

The associated incremental interface conditions follow upon substituting Eq. 83 into Eqs. 78 and 79, yielding

\[ \big[\phi^{(\Delta)} + \phi_{,i}^{(0)} u_i\big]^{+}_{-} = 0, \tag{89} \]

\[ \big[n_i^{(0)}\big(\phi_{,i}^{(\Delta)} + \phi_{,ij}^{(0)} u_j\big)\big]^{+}_{-} = -4\pi G \sigma. \tag{90} \]

Observing the constraints imposed by Eq. 55 on the continuity of the components of $n_j^{(0)} \phi_{,ij}^{(0)}$, Eq. 90 can be shown to be equivalent to

\[ \big[n_i^{(0)}\big(\phi_{,i}^{(\Delta)} + \phi_{,jj}^{(0)} u_i\big)\big]^{+}_{-} = -4\pi G \sigma. \tag{91} \]

Upon consideration of Eqs. 50, 77, 80, and 81, the incremental interface conditions are found to be

\[ [u_i]^{+}_{-} = 0, \tag{92} \]

\[ \big[\phi^{(\Delta)}\big]^{+}_{-} = 0, \tag{93} \]

\[ \big[n_i^{(0)}\big(\phi_{,i}^{(\Delta)} - 4\pi G \rho^{(0)} u_i\big)\big]^{+}_{-} = -4\pi G \sigma, \tag{94} \]

\[ \big[n_j^{(0)} t_{ij}^{(\delta)}\big]^{+}_{-} = \gamma^{(0)} \sigma\, n_i^{(0)}. \tag{95} \]

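The material-local form of the incremental field equations and interface conditions of GVED is reconsidered below when deriving the small-t asymptotes to these equations (section "Small-t Asymptotes: Field Theory of GED").

3.3.1 Local Form

We consider Eqs. 27 and 45, giving

\[ t_{ij}^{(\delta)} = t_{ij}^{(\Delta)} - \delta_{ij}\, p_{,k}^{(0)} u_k. \tag{96} \]

On account of Eqs. 48 and 96, the material-local form of the incremental field equations and interface conditions, Eqs. 85–88 and 92–95, transforms into

\[ t_{ij,j}^{(\Delta)} - g_i^{(0)} \big(\rho^{(0)} u_j\big)_{,j} + \rho^{(0)}\big(g_i^{(\Delta)} + 2\epsilon_{ijk}\Omega_k\, d_t u_j\big) = \rho^{(0)} d_t^2 u_i, \tag{97} \]

\[ g_i^{(\Delta)} = \big(\phi^{(\Delta)} + \psi^{(\Delta)}\big)_{,i}, \tag{98} \]

\[ \phi_{,ii}^{(\Delta)} = 4\pi G \big(\rho^{(0)} u_i\big)_{,i}, \tag{99} \]

\[ t_{ij}^{(\Delta)} = \delta_{ij}\, p_{,k}^{(0)} u_k + M_{ij}\big[u_{k,\ell}(t - t') + u_{\ell,k}(t - t')\big], \tag{100} \]

\[ [u_i]^{+}_{-} = 0, \tag{101} \]

\[ \big[\phi^{(\Delta)}\big]^{+}_{-} = 0, \tag{102} \]

\[ \big[n_i^{(0)}\big(\phi_{,i}^{(\Delta)} - 4\pi G \rho^{(0)} u_i\big)\big]^{+}_{-} = -4\pi G \sigma, \tag{103} \]

\[ \big[n_j^{(0)}\big(t_{ij}^{(\Delta)} - \delta_{ij}\, \rho^{(0)} g_k^{(0)} u_k\big)\big]^{+}_{-} = \gamma^{(0)} \sigma\, n_i^{(0)}. \tag{104} \]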

With $M_{ij}$, $\psi^{(\Delta)}$, and $\Omega_i$ prescribed parameters and the initial fields given as the special solution to the initial field equations and interface conditions, Eqs. 97–100 constitute the local form of the incremental field equations of GVED for $g_i^{(\Delta)}$, $t_{ij}^{(\Delta)}$, $u_i$, and $\phi^{(\Delta)}$, whose solution must satisfy the associated incremental interface conditions, Eqs. 101–104.

The term $\big[n_i^{(0)} \rho^{(0)} g_j^{(0)} u_j\big]^{+}_{-}$ in Eq. 104 is sometimes referred to as the buoyancy term. Note that its appearance is solely a consequence of formulating the incremental field equations and interface conditions in terms of the local incremental stress, $t_{ij}^{(\Delta)}$. In the material-local form of the equations, where the material incremental stress, $t_{ij}^{(\delta)}$, is used, no buoyancy term can therefore arise in the corresponding interface condition, Eq. 95. Conversely, the material-local momentum equation, Eq. 85, contains the advective term $\big(p_{,j}^{(0)} u_j\big)_{,i}$, which is absent from the local momentum equation, Eq. 97.

The local form of the incremental field equations and interface conditions of GVED is used when deducing the large-t asymptotes to the incremental equations (section "Large-t Asymptotes: Field Theory of GVD") and when considering the approximations of local incompressibility (Sect. 5.1) and material incompressibility (Sect. 5.2).

3.3.2 Constitutive Equation

To obtain an expression for $M_{ij}$, we use the continuous differentiability of the strain history. On this assumption, $M_{ij}$ may be written as a convolution integral:

\[ M_{ij} = \int_0^t m_{ijk\ell}(t - t')\, d_{t'}\big[u_{k,\ell}(t') + u_{\ell,k}(t')\big]\, dt', \tag{105} \]

with $m_{ijk\ell}(t - t')$ the anisotropic relaxation function (e.g., Christensen 1982). Supposing isotropic viscoelasticity from now on and exploiting the usual symmetry properties of $m_{ijk\ell}$, this simplifies to

\[ M_{ij} = \delta_{ij} \int_0^t \big[m_1(t - t') - \tfrac{2}{3} m_2(t - t')\big]\, d_{t'}[u_{k,k}(t')]\, dt' + \int_0^t m_2(t - t')\, d_{t'}\big[u_{i,j}(t') + u_{j,i}(t')\big]\, dt'. \tag{106} \]

We refer to this relation as the incremental constitutive equation of viscoelasticity. The independent functions $m_1(t - t')$ and $m_2(t - t')$ are defined for $t - t' \in [0, \infty)$ and are called bulk- and shear-relaxation functions, respectively. For convenience, we also use $m_\nu(t - t')$ with $\nu \in \{1, 2\}$. We assume that $m_\nu(t - t')$ is continuously differentiable for $X_i \in X_- \cup X_+$ but may have a jump discontinuity for $X_i \in \partial X$. Furthermore, we take $m_\nu(t - t')$ as continuously differentiable with respect to $t - t'$. From thermodynamic principles, it follows that (e.g., Christensen 1982; Golden and Graham 1988)

\[ m_\nu(t - t') \ge 0, \tag{107} \]

\[ d_{t - t'}\, m_\nu(t - t') \le 0, \tag{108} \]

\[ d^2_{t - t'}\, m_\nu(t - t') \ge 0. \tag{109} \]

To obtain an additional constraint on $m_2(t - t')$, we recall that the assumption has been made for the continuum to be a fluid. A necessary condition of fluid constitutive behavior is that deviatoric stresses can relax completely (e.g., Christensen 1982; Golden and Graham 1988). In view of Eq. 106, this is formally expressible as the fluidity condition:

\[ \lim_{t - t' \to \infty} m_2(t - t') = 0. \tag{110} \]


3.4 Continuity and State Equations

So far, the incremental density and incremental thermodynamic pressure have not appeared explicitly in the equations. This is in accordance with the adoption of the Lagrangian representation, where the displacement, $u_i$, is preferentially used. However, the incremental density and incremental thermodynamic pressure are required when interpreting the large-t asymptotes to the incremental field equations and interface conditions of GVED (section "Large-t Asymptotes: Field Theory of GVD") and when studying the approximation of local incompressibility (Sect. 5.1).

The current value of the density, $\rho$, can be related to its initial value, $\rho^{(0)}$, by means of

\[ j\rho = \rho^{(0)}, \tag{111} \]

which is the continuity equation (e.g., Malvern 1969; Dahlen 1974). For a fluid not necessarily in hydrostatic equilibrium, the thermodynamic pressure, $\varpi$, is introduced with the aid of a state equation whose functional relation is identical to that governing the mechanical pressure, $p = -t_{ii}/3$, in the case of hydrostatic equilibrium (e.g., Malvern 1969; Dahlen 1974). In view of Eq. 52, we therefore have

\[ \varpi = \xi(\rho, \lambda, \varphi), \tag{112} \]

with $\varpi$, in general, different from $p$. However, at $t = 0$, Eq. 112 reduces to

\[ \varpi^{(0)} = \xi\big(\rho^{(0)}, \lambda^{(0)}, \varphi^{(0)}\big), \tag{113} \]

which, by comparison with Eq. 52, yields

\[ \varpi^{(0)} = p^{(0)}. \tag{114} \]

A direct consequence of Eqs. 113 and 114 is

\[ p_{,i}^{(0)} = \Big(\frac{\partial \xi}{\partial \rho}\Big)^{(0)} \rho_{,i}^{(0)} + \Big(\frac{\partial \xi}{\partial \lambda}\Big)^{(0)} \lambda_{,i}^{(0)} + \Big(\frac{\partial \xi}{\partial \varphi}\Big)^{(0)} \varphi_{,i}^{(0)}, \tag{115} \]

where the partial derivatives $(\partial \xi / \partial \rho)^{(0)} = [\partial \xi / \partial \rho]_{\rho = \rho^{(0)}}$, etc. are functions of $X_i \in X_- \cup X_+$.

Next, we use Eq. 25 to decompose the total fields in Eqs. 111 and 112 into initial and material incremental parts. Using also Eqs. 46 and 66, Eq. 111 becomes

\[ (1 + u_{i,i})\big(\rho^{(0)} + \rho^{(\delta)}\big) = \rho^{(0)}. \tag{116} \]

Since, by assumption, we have isocompositional and isentropic perturbations, $\lambda^{(\delta)} = \varphi^{(\delta)} = 0$ applies, and the decomposition of Eq. 112 takes the form

\[ \varpi^{(0)} + \varpi^{(\delta)} = \xi\big(\rho^{(0)}, \lambda^{(0)}, \varphi^{(0)}\big) + \Big(\frac{\partial \xi}{\partial \rho}\Big)^{(0)} \rho^{(\delta)}. \tag{117} \]

Considering Eq. 113 and retaining only terms that are linear in the incremental quantities, Eqs. 116 and 117 reduce to

\[ \rho^{(\delta)} = -\rho^{(0)} u_{i,i}, \tag{118} \]

\[ \varpi^{(\delta)} = \Big(\frac{\partial \xi}{\partial \rho}\Big)^{(0)} \rho^{(\delta)}, \tag{119} \]

which constitute the material forms of the incremental continuity and state equations, respectively. Due to Eqs. 27 and 114, we, however, have

\[ \rho^{(\delta)} = \rho^{(\Delta)} + \rho_{,i}^{(0)} u_i, \tag{120} \]

\[ \varpi^{(\delta)} = \varpi^{(\Delta)} + p_{,i}^{(0)} u_i, \tag{121} \]

whence the material forms can be replaced by

\[ \rho^{(\Delta)} = -\big(\rho^{(0)} u_i\big)_{,i}, \tag{122} \]

\[ \varpi^{(\Delta)} = \Big(\frac{\partial \xi}{\partial \rho}\Big)^{(0)} \big(\rho^{(\Delta)} + \rho_{,i}^{(0)} u_i\big) - p_{,i}^{(0)} u_i. \tag{123} \]

These relations represent the local forms of the incremental continuity and state equations, respectively. Equation 123 takes a more familiar form upon substituting for $p_{,i}^{(0)}$ from Eq. 115, giving

\[ \varpi^{(\Delta)} = \Big(\frac{\partial \xi}{\partial \rho}\Big)^{(0)} \rho^{(\Delta)} - \Big(\frac{\partial \xi}{\partial \lambda}\Big)^{(0)} \lambda_{,i}^{(0)} u_i - \Big(\frac{\partial \xi}{\partial \varphi}\Big)^{(0)} \varphi_{,i}^{(0)} u_i. \tag{124} \]

Explicit expressions for the partial derivatives are stated below (Sects. "Large-t Asymptotes: Field Theory of GVD" and 5.1).

4 Asymptotic Incremental Field Theories

We proceed by supposing perturbations whose limits exist for both $t \to 0$ and $t \to \infty$. Obviously, these limits correspond to the initial and final hydrostatic equilibrium states of the fluid. This means that the small- and large-t asymptotes to the incremental field equations and interface conditions of GVED also exist. We determine them by finding suitable asymptotic approximations to the Laplace transform of the incremental constitutive equation of viscoelasticity. Upon substitution of Eq. 106 into Eqs. 88 and 100, respectively, and use of Eqs. 207, 209, 211 (Appendix 1: Laplace Transform), and $u_i^{(0)} = 0$, the Laplace-transformed material and local forms of the incremental constitutive equation can be written as

\[ \tilde{t}_{ij}^{(\delta)} = \delta_{ij}\big(\tilde{m}_1 - \tfrac{2}{3}\tilde{m}_2\big) s\, \tilde{u}_{k,k} + \tilde{m}_2\, s\, (\tilde{u}_{i,j} + \tilde{u}_{j,i}), \tag{125} \]

\[ \tilde{t}_{ij}^{(\Delta)} = \delta_{ij}\, p_{,k}^{(0)} \tilde{u}_k + \delta_{ij}\big(\tilde{m}_1 - \tfrac{2}{3}\tilde{m}_2\big) s\, \tilde{u}_{k,k} + \tilde{m}_2\, s\, (\tilde{u}_{i,j} + \tilde{u}_{j,i}), \tag{126} \]

where $\tilde{f}_{ij\ldots}(X, s)$ denotes the Laplace transform of $f_{ij\ldots}(X, t)$ and $s \in S$ the inverse Laplace time (Appendix 1: Laplace Transform). As in Eqs. 125 and 126, we continue to suppress the argument, $s$, of Laplace-transformed quantities.

Before expanding the equations, it is necessary to specify $\tilde{m}_\nu$. This is achieved by expressing $m_\nu(t - t')$ in terms of the associated relaxation spectrum (Sect. 4.1). Laplace transformation then supplies a formula for $s\tilde{m}_\nu$, from which asymptotic approximations for large and small $s$ can be derived (Sect. 4.2). Substituting these approximations into Eqs. 125 and 126 and applying the generalized initial- and final-value theorems finally give the small- and large-t asymptotes to the incremental constitutive equation of viscoelasticity (Sect. 4.3).

4.1 Relaxation Functions

For $\nu = 1, 2$, suppose that $m_\nu(t - t')$ can be expressed as

\[ m_\nu(t - t') = m_{\nu\infty} + \int_0^\infty \bar{m}_\nu(\alpha')\, e^{-\alpha'(t - t')}\, d\alpha', \tag{127} \]

where $\bar{m}_\nu(\alpha')$ is the relaxation spectrum, $\alpha'$ is the inverse spectral time, and $m_\nu(t - t')$ satisfies the restrictions expressed by Eqs. 107–110 (e.g., Christensen 1982; Golden and Graham 1988). Equation 127 implies

\[ m_{\nu\infty} = \lim_{t - t' \to \infty} m_\nu(t - t'), \tag{128} \]

whence, by Eq. 110, it follows that

\[ m_{2\infty} = 0. \tag{129} \]

Defining

\[ m_{\nu 0} = m_\nu(0), \tag{130} \]

we also get

\[ \int_0^\infty \bar{m}_\nu(\alpha')\, d\alpha' = m_{\nu 0} - m_{\nu\infty}. \tag{131} \]

A consequence of Eqs. 108, 128, 130, and 131 is that $\int_0^\infty \bar{m}_\nu(\alpha')\, d\alpha' \ge 0$. Here, we impose the more stringent condition that $\bar{m}_\nu(\alpha') \ge 0$ for $\alpha' \in [0, \infty)$. We furthermore require $\bar{m}_\nu(\alpha')$ to vanish sufficiently rapidly as $\alpha' \to 0$ and $\alpha' \to \infty$ so that, for $0 < \alpha_1 < \infty$, the integrals $\int_0^{\alpha_1} \bar{m}_\nu(\alpha') / \alpha'\, d\alpha'$ and $\int_{\alpha_1}^\infty \alpha'\, \bar{m}_\nu(\alpha')\, d\alpha'$ converge. These assumptions are of sufficient generality to include conventional mechanical and molecular models of viscoelasticity.

Next, we apply Eqs. 212 and 213 (Appendix 1: Laplace Transform) to obtain the s-multiplied Laplace transform of Eq. 127 with respect to $t - t'$:

\[ s\tilde{m}_\nu = m_{\nu\infty} + \int_0^\infty \frac{s\, \bar{m}_\nu(\alpha')}{s + \alpha'}\, d\alpha'. \tag{132} \]

Being interested in asymptotic approximations to $s\tilde{m}_\nu$ for large and small $s$, we decompose the integral in Eq. 132 in the following way:

\[ \int_0^\infty \frac{s\, \bar{m}_\nu(\alpha')}{s + \alpha'}\, d\alpha' = \int_0^{s-0} \frac{\bar{m}_\nu(\alpha')}{1 + \alpha'/s}\, d\alpha' + \int_{s+0}^\infty \frac{s}{\alpha'}\, \frac{\bar{m}_\nu(\alpha')}{1 + s/\alpha'}\, d\alpha'. \tag{133} \]

Note that, on the right-hand side, $\alpha'/s < 1$ in the first integrand, whereas $s/\alpha' < 1$ in the second.

4.2 Asymptotic Relaxation Functions 4.2.1 Large-s Asymptotes For sufficiently large s, theR second integral on the right-hand side of Eq. 133 may be neglected. 1 Due to the convergence of sC0 ˛ 0 m .˛ 0 / d˛ 0 , we get the asymptotic approximation Z

1 0

sm .˛ 0 / 0 d˛ ' s C ˛0

Z

1 0

  ˛0 d˛ 0 : m .˛ / 1  s 0

(134)

Since m .˛ 0 /  0 for ˛ 0  0, we can apply the mean-value theorem of integral calculus and obtain the following estimate: Z

1

0

0

0

Z

1

˛ m .˛ / d˛ D ˛0

0

m .˛ 0 / d˛ 0 ;

(135)

0

where ˛v0 > 0. In view of Eqs. 131, 134, and 135, Eq. 132 takes the form sm Q  ' m0  .m0  m1 /

˛0 ; s

(136)

which is correct to the first order in ˛v0 =s. Using the abbreviations e D m10 ;

(137)

e0 D .m10  m11 /˛10 ;

(138)

e D m20

(139)

0e D .m20  m21 /˛20 ;

(140)

we finally obtain

Page 19 of 35

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_10-2 © Springer-Verlag Berlin Heidelberg 2014

sm Q 1 ' e 

e0 ; s

(141)

sm Q 2 ' e 

0e ; s

(142)

which are asymptotically correct for large s. From the properties of m .t  t 0 / specified above, it follows that the values of e ; e0 ; e , and 0e are nonnegative and continuously differentiable for Xi 2 X [ XC , with jump discontinuities admitted for Xi 2 @X . 4.2.2 Small-s Asymptotes For sufficiently small s, the first on the right-hand side of Eq. 133 may be neglected. As a R s0integral 0 result of the convergence of 0 m .˛ /=˛ 0 d˛ 0 , we arrive at the asymptotic approximation Z

1

0

sm .˛ 0 / 0 d˛ ' s C ˛0

Z

1

m .˛ 0 /

0

s 0 d˛ : ˛0

(143)

Applying the mean-value theorem of integral calculus, we now have

$$\int_0^\infty \frac{\bar m_\nu(\alpha')}{\alpha'}\,d\alpha' = \frac{1}{\alpha_{\nu\infty}}\int_0^\infty \bar m_\nu(\alpha')\,d\alpha', \tag{144}$$

where $\alpha_{\nu\infty} \ge 0$. From Eqs. 131, 143, and 144, Eq. 132 becomes

$$s\tilde m_\nu \simeq m_{\nu\infty} + (m_{\nu 0} - m_{\nu\infty})\,\frac{s}{\alpha_{\nu\infty}}, \tag{145}$$

which is correct to the first order in $s/\alpha_{\nu\infty}$. We simplify this by means of

$$\kappa_h = m_{1\infty}, \tag{146}$$

$$\kappa_h' = \frac{m_{10} - m_{1\infty}}{\alpha_{1\infty}}, \tag{147}$$

$$\mu_h = m_{2\infty}, \tag{148}$$

$$\mu_h' = \frac{m_{20} - m_{2\infty}}{\alpha_{2\infty}}. \tag{149}$$

Since $\mu_h = 0$ by Eqs. 129 and 148, we obtain

$$s\tilde m_1 \simeq \kappa_h + \kappa_h'\,s, \tag{150}$$

$$s\tilde m_2 \simeq \mu_h'\,s, \tag{151}$$

which are asymptotically correct for small $s$. From the properties of $m_\nu(t-t')$ described above, it follows that the values of $\kappa_h$, $\kappa_h'$, and $\mu_h'$ are nonnegative and continuously differentiable for $X_i \in \mathcal{X}_- \cup \mathcal{X}_+$, with jump discontinuities admitted for $X_i \in \partial\mathcal{X}$.
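The two asymptotes can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes a discrete (line) relaxation spectrum with hypothetical weights `WEIGHTS` at inverse spectral times `ALPHAS`. It also assumes that Eq. 132 has the form $s\tilde m_\nu = m_{\nu\infty} + \int_0^\infty s\,\bar m_\nu(\alpha')/(s+\alpha')\,d\alpha'$ and that the spectrum integrates to $m_{\nu 0} - m_{\nu\infty}$, as the step from Eqs. 134 and 135 to Eq. 136 suggests. For such a spectrum the mean values in Eqs. 135 and 144 reduce to weighted averages.

```python
def exact_s_m_tilde(s, m_inf, weights, alphas):
    """s * m~(s) for a line spectrum: m_inf + sum_k w_k * s / (s + alpha_k)."""
    return m_inf + sum(w * s / (s + a) for w, a in zip(weights, alphas))


def large_s_asymptote(s, m_inf, weights, alphas):
    """Eq. 136: m_0 - (m_0 - m_inf) * alpha_0 / s."""
    total = sum(weights)                     # assumed equal to m_0 - m_inf
    m_0 = m_inf + total
    alpha_0 = sum(w * a for w, a in zip(weights, alphas)) / total  # Eq. 135
    return m_0 - total * alpha_0 / s


def small_s_asymptote(s, m_inf, weights, alphas):
    """Eq. 145: m_inf + (m_0 - m_inf) * s / alpha_inf."""
    total = sum(weights)
    inv_alpha_inf = sum(w / a for w, a in zip(weights, alphas)) / total  # Eq. 144
    return m_inf + total * s * inv_alpha_inf


# Hypothetical, dimensionless illustration values.
M_INF = 0.5
WEIGHTS = [1.0, 2.0]
ALPHAS = [0.5, 3.0]

for s in (1.0e3, 1.0e-3):
    exact = exact_s_m_tilde(s, M_INF, WEIGHTS, ALPHAS)
    print(s, exact,
          large_s_asymptote(s, M_INF, WEIGHTS, ALPHAS),
          small_s_asymptote(s, M_INF, WEIGHTS, ALPHAS))
```

For large $s$ the exact value should agree with the large-s asymptote up to a second-order term in $\alpha_{\nu 0}/s$. For small $s$ the same holds for the small-s asymptote up to second order in $s/\alpha_{\nu\infty}$.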

4.3 Asymptotic Incremental Field Equations and Interface Conditions

4.3.1 Small-t Asymptotes: Field Theory of GED

By the generalized initial-value theorem for Laplace transforms (Appendix 1: Laplace Transform), the small-t asymptote to Eq. 88 corresponds to the large-s asymptote to Eq. 125. However, for $s$ sufficiently large, $s\tilde m_1$ and $s\tilde m_2$ may be approximated by Eqs. 141 and 142, respectively, whose substitution into Eq. 125 provides

$$\tilde t_{ij}^{(\delta)} = \delta_{ij}\left(\kappa_e - \tfrac{2}{3}\mu_e\right)\tilde u_{k,k} + \mu_e\,(\tilde u_{i,j} + \tilde u_{j,i}) - \delta_{ij}\left(\kappa_e' - \tfrac{2}{3}\mu_e'\right)\frac{\tilde u_{k,k}}{s} - \mu_e'\,\frac{\tilde u_{i,j} + \tilde u_{j,i}}{s}. \tag{152}$$

In view of Eqs. 208, 210, and 214 (Appendix 1: Laplace Transform), inverse Laplace transformation from the $s$ domain to the $t$ domain gives

$$t_{ij}^{(\delta)} = \delta_{ij}\left(\kappa_e - \tfrac{2}{3}\mu_e\right)u_{k,k} + \mu_e\,(u_{i,j} + u_{j,i}) - \delta_{ij}\left(\kappa_e' - \tfrac{2}{3}\mu_e'\right)\int_0^t u_{k,k}(t')\,dt' - \mu_e'\int_0^t \left[u_{i,j}(t') + u_{j,i}(t')\right]dt', \tag{153}$$

with $\kappa_e$ called elastic bulk modulus, $\mu_e$ elastic shear modulus, $\kappa_e'$ anelastic bulk modulus, and $\mu_e'$ anelastic shear modulus. Equation 153 is to be complemented by the remaining incremental field equations, Eqs. 86–87, and the associated incremental interface conditions, Eqs. 92–95. Together, they constitute the material-local form of the small-t asymptotes to the incremental field equations of GVED in terms of $g_i^{(\Delta)}$, $t_{ij}^{(\delta)}$, $u_i$, and $\phi^{(\Delta)}$. We refer to the equations also as generalized incremental field equations and interface conditions of GED. If the integrals in Eq. 153 are neglected, it simplifies to the incremental constitutive equation of elasticity. In this case, the small-t asymptotes to the incremental field equations and interface conditions of viscoelastodynamics agree with the ordinary incremental field equations and interface conditions of GED (e.g., Love 1911; Dahlen 1974; Grafarend 1982).

4.3.2 Large-t Asymptotes: Field Theory of GVD

By the generalized final-value theorem for Laplace transforms (Appendix 1: Laplace Transform), the large-t asymptote to Eq. 100 corresponds to the small-s asymptote to Eq. 126. However, with $s$ sufficiently small, $s\tilde m_1$ and $s\tilde m_2$ may be replaced by Eqs. 150 and 151, respectively, whose substitution into Eq. 126 leads to

$$\tilde t_{ij}^{(\Delta)} = \delta_{ij}\left(p_{,k}^{(0)}\tilde u_k + \kappa_h\tilde u_{k,k}\right) + \delta_{ij}\left(\kappa_h' - \tfrac{2}{3}\mu_h'\right)s\,\tilde u_{k,k} + \mu_h'\,s\,(\tilde u_{i,j} + \tilde u_{j,i}). \tag{154}$$

Considering Eqs. 208, 209, 214 (Appendix 1: Laplace Transform), and $u_i^{(0)} = 0$, inverse Laplace transformation from the $s$ domain to the $t$ domain results in

$$t_{ij}^{(\Delta)} = \delta_{ij}\left(p_{,k}^{(0)}u_k + \kappa_h u_{k,k}\right) + \delta_{ij}\left(\kappa_h' - \tfrac{2}{3}\mu_h'\right)d_t u_{k,k} + \mu_h'\,d_t(u_{i,j} + u_{j,i}), \tag{155}$$

with $\kappa_h$ referred to as hydrostatic bulk modulus, $\kappa_h'$ as viscous bulk modulus (bulk viscosity), and $\mu_h'$ as viscous shear modulus (shear viscosity). We reduce the large-t asymptote to an expression for $\varpi^{(\Delta)}$ by recalling that, for a fluid not necessarily in hydrostatic equilibrium, $\varpi^{(\Delta)}$ is related to the other state variables by the same function that relates $p^{(\Delta)} = -t_{ii}^{(\Delta)}/3$ to these variables in the case of hydrostatic equilibrium (e.g., Malvern 1969). Putting $d_t = 0$ in Eq. 155 thus yields

$$\varpi^{(\Delta)} = -p_{,i}^{(0)}u_i - \kappa_h u_{i,i}. \tag{156}$$

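To replace this by a more familiar expression, we compare Eqs. 122, 123, and 156, giving

$$\left(\frac{\partial\xi}{\partial\rho}\right)^{(0)} = \frac{\kappa_h}{\rho^{(0)}}, \tag{157}$$

and put

$$\left(\frac{\partial\xi}{\partial\lambda}\right)^{(0)} = \frac{l}{\lambda^{(0)}}, \tag{158}$$

$$\left(\frac{\partial\xi}{\partial\varphi}\right)^{(0)} = \frac{v}{\varphi^{(0)}}, \tag{159}$$

where $l$ and $v$ are the compositional and entropic moduli, respectively. Upon substitution of Eqs. 157–159, Eq. 124 takes the form

$$\varpi^{(\Delta)} = \frac{\kappa_h}{\rho^{(0)}}\,\rho^{(\Delta)} - \frac{l}{\lambda^{(0)}}\,\lambda_{,i}^{(0)}u_i - \frac{v}{\varphi^{(0)}}\,\varphi_{,i}^{(0)}u_i, \tag{160}$$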

which is the incremental state equation of a fluid whose total state equation is given by Eq. 112. Considering Eqs. 122, 155, 156, and 160 and the assumption of isocompositional and isentropic perturbations, $\lambda^{(\delta)} = \varphi^{(\delta)} = 0$, the local form of the incremental field equations of GVED, Eqs. 97–100, reduces to

$$t_{ij,j}^{(\Delta)} + g_i^{(0)}\rho^{(\Delta)} + \rho^{(0)}\left(g_i^{(\Delta)} + 2\epsilon_{ijk}\Omega_k\,d_t u_j\right) = \rho^{(0)}\,d_t^2 u_i, \tag{161}$$

$$g_i^{(\Delta)} = \left(\phi^{(\Delta)} + \psi^{(\Delta)}\right)_{,i}, \tag{162}$$

$$\phi_{,ii}^{(\Delta)} = 4\pi G\left(\rho^{(0)}u_i\right)_{,i}, \tag{163}$$

$$t_{ij}^{(\Delta)} = -\delta_{ij}\varpi^{(\Delta)} + \delta_{ij}\left(\kappa_h' - \tfrac{2}{3}\mu_h'\right)d_t u_{k,k} + \mu_h'\,d_t(u_{i,j} + u_{j,i}), \tag{164}$$

$$\varpi^{(\Delta)} = \frac{\kappa_h}{\rho^{(0)}}\,\rho^{(\Delta)} + \frac{l}{\lambda^{(0)}}\,\lambda^{(\Delta)} + \frac{v}{\varphi^{(0)}}\,\varphi^{(\Delta)}, \tag{165}$$

$$\rho^{(\Delta)} = -\left(\rho^{(0)}u_i\right)_{,i}, \tag{166}$$

$$\lambda^{(\Delta)} = -\lambda_{,i}^{(0)}u_i, \tag{167}$$

$$\varphi^{(\Delta)} = -\varphi_{,i}^{(0)}u_i. \tag{168}$$

These equations are completed by the associated incremental interface conditions, Eqs. 101–104. Together, they constitute the local form of the large-t asymptotes to the incremental equations of GVED in terms of $g_i^{(\Delta)}$, $t_{ij}^{(\Delta)}$, $u_i$, $\lambda^{(\Delta)}$, $\varpi^{(\Delta)}$, $\rho^{(\Delta)}$, $\phi^{(\Delta)}$, and $\varphi^{(\Delta)}$; in particular, they agree with the incremental field equations and interface conditions of GVD (e.g., Backus 1967; Jarvis and McKenzie 1980).
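The viscous constitutive relation, Eq. 164, can be illustrated with a short sketch. It is an illustration only: the function names, the nested-list tensor representation, and all numerical values are hypothetical. The argument `grad_rate[i][j]` stands for $d_t u_{i,j}$. Taking the trace of Eq. 164 shows that the mechanical pressure $p^{(\Delta)} = -t_{ii}^{(\Delta)}/3$ differs from the thermodynamic pressure $\varpi^{(\Delta)}$ by the bulk-viscosity term $\kappa_h'\,d_t u_{k,k}$, while the shear viscosity drops out of the trace.

```python
def viscous_stress(varpi, grad_rate, kappa_visc, mu_visc):
    """Local incremental stress of Eq. 164.

    varpi      : incremental thermodynamic pressure
    grad_rate  : 3x3 nested list holding d_t u_{i,j}
    kappa_visc : viscous bulk modulus (bulk viscosity)
    mu_visc    : viscous shear modulus (shear viscosity)
    """
    divergence_rate = sum(grad_rate[k][k] for k in range(3))
    stress = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            stress[i][j] = (
                -delta * varpi
                + delta * (kappa_visc - 2.0 / 3.0 * mu_visc) * divergence_rate
                + mu_visc * (grad_rate[i][j] + grad_rate[j][i])
            )
    return stress


def mechanical_pressure(stress):
    """p = -t_ii / 3."""
    return -sum(stress[i][i] for i in range(3)) / 3.0


# Hypothetical values in consistent units.
VARPI = 10.0
GRAD_RATE = [[0.1, 0.2, 0.0],
             [0.0, -0.3, 0.1],
             [0.05, 0.0, 0.4]]
KAPPA_VISC = 2.0
MU_VISC = 5.0

STRESS = viscous_stress(VARPI, GRAD_RATE, KAPPA_VISC, MU_VISC)
DIVERGENCE_RATE = sum(GRAD_RATE[k][k] for k in range(3))
print(mechanical_pressure(STRESS), VARPI - KAPPA_VISC * DIVERGENCE_RATE)
```

Both printed numbers should coincide up to rounding, which reflects the statement that $\varpi^{(\Delta)}$ and $p^{(\Delta)}$ agree when $d_t = 0$.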

5 Approximate Incremental Field Theories

This section is concerned with simplified field theories. We suppose that the fluid is isocompositional and isentropic in each of the domains $\mathcal{X}_-$ and $\mathcal{X}_+$:

$$\lambda_{,i}^{(0)} = \varphi_{,i}^{(0)} = 0, \tag{169}$$

that rotational and tidal effects are negligible:

$$\Omega_i = 0, \tag{170}$$

$$\psi = 0, \tag{171}$$

that the approximation of quasi-static perturbations applies:

$$d_t^2 u_i = 0, \tag{172}$$

and that the bulk relaxation is negligible:

$$m_1(t - t') = \kappa_h. \tag{173}$$

We note that in the absence of tidal forces, the perturbations are now solely due to the incremental interface-mass density, $\sigma$. The field theories of GHS and GVED satisfying Eqs. 169–173 are referred to as approximate incremental field theories. Upon introducing further restrictions, the analysis is divided into the case of local incompressibility (Sect. 5.1), which accounts for an initial density gradient due to self-compression, and the case of material incompressibility (Sect. 5.2), where the initial state is also taken as incompressible.


5.1 Local Incompressibility

5.1.1 Equations for the Initial Fields

Upon consideration of Eqs. 169–171 and elimination of $g_i^{(0)}$, the initial field equations of GHS, Eqs. 48–52, simplify to

$$-p_{,i}^{(0)} + \rho^{(0)}\phi_{,i}^{(0)} = 0, \tag{174}$$

$$\phi_{,ii}^{(0)} = -4\pi G\rho^{(0)}, \tag{175}$$

$$p^{(0)} = \xi_b(\rho^{(0)}), \tag{176}$$

with $\xi_b$ the barotropic state function. The initial interface conditions, Eqs. 53–57, continue to apply. Solutions to the approximate equations for the initial fields can be shown to exist for the level surfaces of $p^{(0)}$, $\rho^{(0)}$, and $\phi^{(0)}$ being concentric spheres, coaxial cylinders, or parallel planes (e.g., Batchelor 1967). To eliminate $p^{(0)}$, consider the gradient of Eq. 176:

$$p_{,i}^{(0)} = \left(\frac{d\xi_b}{d\rho}\right)^{(0)}\rho_{,i}^{(0)}, \tag{177}$$

where $(d\xi_b/d\rho)^{(0)}$ must be constant on the level surfaces. Comparing Eq. 174 with Eq. 177 then yields

$$\left(\frac{d\xi_b}{d\rho}\right)^{(0)}\rho_{,i}^{(0)} = \rho^{(0)}\phi_{,i}^{(0)}, \tag{178}$$

which is the Williamson-Adams equation (e.g., Williamson and Adams 1923; Bullen 1975). With $(d\xi_b/d\rho)^{(0)}$ prescribed, Eqs. 175 and 178 are to be solved for $\rho^{(0)}$ and $\phi^{(0)}$.

5.1.2 Equations for the Incremental Fields: Local Form

Using Eq. 173 and

$$m_2(t - t') = \mu(t - t'), \tag{179}$$

substitution of Eq. 106 into Eq. 100 leads to

$$t_{ij}^{(\Delta)} = \delta_{ij}\left(p_{,k}^{(0)}u_k + \kappa_h u_{k,k}\right) - \tfrac{2}{3}\delta_{ij}\int_0^t \mu(t - t')\,d_{t'}[u_{k,k}(t')]\,dt' + \int_0^t \mu(t - t')\,d_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,dt'. \tag{180}$$

Since, by setting $d_t = 0$ in Eq. 180, we find

$$\varpi^{(\Delta)} = -p_{,i}^{(0)}u_i - \kappa_h u_{i,i} \tag{181}$$

and, by comparing Eqs. 122, 123, and 181 and observing $d\xi_b/d\rho = \partial\xi/\partial\rho$, we obtain

$$\left(\frac{d\xi_b}{d\rho}\right)^{(0)} = \frac{\kappa_h}{\rho^{(0)}}, \tag{182}$$

it follows from Eqs. 177, 181, and 182 that

$$\varpi^{(\Delta)} = -\frac{\kappa_h}{\rho^{(0)}}\left(\rho^{(0)}u_i\right)_{,i}. \tag{183}$$

Note that, with $p^{(\Delta)} = -t_{ii}^{(\Delta)}/3$ per definitionem, Eqs. 180 and 181 yield

$$\varpi^{(\Delta)} = p^{(\Delta)}, \tag{184}$$

which will henceforth be implied. Suppose now that Eq. 183 can be replaced by the simultaneous conditions

$$\kappa_h \to \infty, \tag{185}$$

$$\left(\rho^{(0)}u_i\right)_{,i} \to 0, \tag{186}$$

$$p^{(\Delta)} = \text{finite}. \tag{187}$$

The significance of Eq. 186 becomes evident if we note that, by Eqs. 120 and 122, the condition $(\rho^{(0)}u_i)_{,i} = 0$ is equivalent to the condition $\rho^{(\delta)} = \rho_{,i}^{(0)}u_i$ or $\rho^{(\Delta)} = 0$. Equation 186 thus states that the compressibility of a displaced particle is constrained to the extent that the material incremental density “follows” the prescribed initial density gradient so that the local incremental density vanishes. For this reason, we refer to Eq. 186 as local incremental incompressibility condition. Taking into account Eqs. 170–172, 178, 180, 182, and 186 and eliminating $g_i^{(\Delta)}$, the local form of the incremental field equations and interface conditions of GVED, Eqs. 97–104, reduces to

$$t_{ij,j}^{(\Delta)} + \rho^{(0)}\phi_{,i}^{(\Delta)} = 0, \tag{188}$$

$$\phi_{,ii}^{(\Delta)} = 0, \tag{189}$$

$$t_{ij}^{(\Delta)} = -\delta_{ij}\,p^{(\Delta)} + \delta_{ij}\,\frac{2\rho^{(0)}\phi_{,k}^{(0)}}{3\kappa_h}\int_0^t \mu(t - t')\,d_{t'}[u_k(t')]\,dt' + \int_0^t \mu(t - t')\,d_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,dt', \tag{190}$$

$$u_{i,i} = -\frac{\rho^{(0)}\phi_{,i}^{(0)}}{\kappa_h}\,u_i, \tag{191}$$

$$[u_i]_-^+ = 0, \tag{192}$$

$$[\phi^{(\Delta)}]_-^+ = 0, \tag{193}$$


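$$\left[n_i^{(0)}\left(\phi_{,i}^{(\Delta)} - 4\pi G\rho^{(0)}u_i\right)\right]_-^+ = -4\pi G\sigma, \tag{194}$$

$$\left[n_j^{(0)}\left(t_{ij}^{(\Delta)} - \delta_{ij}\,\rho^{(0)}\phi_{,k}^{(0)}u_k\right)\right]_-^+ = \gamma\,n_i^{(0)}\sigma. \tag{195}$$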
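The step from the local incompressibility condition, Eq. 186, to Eq. 191 and to the bulk term of Eq. 190 is not spelled out above. It can be sketched as follows, assuming only Eqs. 178, 180, 182, and 186 and that the initial fields do not depend on time:

```latex
\begin{aligned}
(\rho^{(0)} u_i)_{,i} = 0
  \;&\Longrightarrow\;
  u_{i,i} = -\frac{\rho^{(0)}_{,i}}{\rho^{(0)}}\,u_i
  && \text{(product rule applied to Eq. 186)} \\
\frac{\kappa_h}{\rho^{(0)}}\,\rho^{(0)}_{,i} = \rho^{(0)}\phi^{(0)}_{,i}
  \;&\Longrightarrow\;
  \frac{\rho^{(0)}_{,i}}{\rho^{(0)}} = \frac{\rho^{(0)}\phi^{(0)}_{,i}}{\kappa_h}
  && \text{(Eq. 182 inserted into Eq. 178)} \\
  \;&\Longrightarrow\;
  u_{i,i} = -\frac{\rho^{(0)}\phi^{(0)}_{,i}}{\kappa_h}\,u_i
  && \text{(Eq. 191)} \\
-\tfrac{2}{3}\,\delta_{ij}\int_0^t \mu(t-t')\,d_{t'}[u_{k,k}(t')]\,dt'
  \;&=\;
  \delta_{ij}\,\frac{2\rho^{(0)}\phi^{(0)}_{,k}}{3\kappa_h}
  \int_0^t \mu(t-t')\,d_{t'}[u_k(t')]\,dt'
  && \text{(bulk term of Eq. 190)}
\end{aligned}
```

The first term of Eq. 180 is replaced by $-\delta_{ij}\,p^{(\Delta)}$ by means of Eqs. 181 and 184.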





The approximate incremental equations are to be solved for $p^{(\Delta)}$, $t_{ij}^{(\Delta)}$, $u_i$, and $\phi^{(\Delta)}$, where $\rho^{(0)}$ and $\phi^{(0)}$ must satisfy the appropriate initial field equations and interface conditions. We observe that the hydrostatic bulk modulus, $\kappa_h$, remains finite in Eqs. 190 and 191. This is because it enters into these equations as a consequence of substituting Eq. 182 into Eq. 178 for the initial fields, for which the approximation given by Eq. 185 does not apply. If $\phi^{(0)}$ is prescribed and $\phi^{(\Delta)}$ is neglected, the mechanical and gravitational effects decouple. In this case, solutions for the initial state are readily found. The decoupled incremental equations were integrated for a Newton-viscous spherical Earth model (Li and Yuen 1987; Wu and Yuen 1991) and for a Maxwell-viscoelastic planar Earth model (Wolf and Kaufmann 2000). The solution to the coupled incremental equations for a Maxwell-viscoelastic spherical Earth model has recently been derived by Martinec et al. (2001).
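How the initial fields might be obtained in the coupled case can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the procedure of the cited studies: it assumes spherical symmetry and a constant $\kappa_h$, and the function name, the step count, and all numerical values are hypothetical. With $\gamma(r) = G M(r)/r^2$ the magnitude of $g_i^{(0)}$, Eqs. 178 and 182 give $d\rho^{(0)}/dr = -(\rho^{(0)})^2\gamma/\kappa_h$, and Eq. 175 is equivalent to $dM/dr = 4\pi r^2\rho^{(0)}$.

```python
import math

G = 6.674e-11  # Newton's gravitational constant in m^3 kg^-1 s^-2


def integrate_initial_state(rho_center, kappa_h, radius, n_steps=2000):
    """Integrate the radial Williamson-Adams system outward from the center.

    Returns the list of densities at the grid points and the total mass.
    """

    def rhs(r, rho, mass):
        gravity = G * mass / r**2 if r > 0.0 else 0.0  # magnitude of g_i^(0)
        return -rho**2 * gravity / kappa_h, 4.0 * math.pi * r**2 * rho

    dr = radius / n_steps
    r, rho, mass = 0.0, rho_center, 0.0
    densities = [rho]
    for _ in range(n_steps):
        # Explicit midpoint (second-order Runge-Kutta) step.
        drho_1, dmass_1 = rhs(r, rho, mass)
        drho_2, dmass_2 = rhs(r + 0.5 * dr,
                              rho + 0.5 * dr * drho_1,
                              mass + 0.5 * dr * dmass_1)
        rho += dr * drho_2
        mass += dr * dmass_2
        r += dr
        densities.append(rho)
    return densities, mass


# Hypothetical Earth-like illustration values (SI units).
RADIUS = 6.371e6
densities, total_mass = integrate_initial_state(13000.0, 5.0e11, RADIUS)
print(densities[0], densities[-1], total_mass)
```

Letting `kappa_h` grow very large in this sketch reproduces a uniform density, which corresponds to the materially incompressible initial state with $\rho^{(0)}$ = constant.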

5.2 Material Incompressibility

We proceed using the supposition that the material is incompressible. As a result, the initial state is incompressible, whence $\rho^{(0)}$ = constant replaces Eq. 176 and $\kappa_h \to \infty$ applies also in Eqs. 190 and 191, the latter reducing to the conventional material incremental incompressibility condition.

5.2.1 Equations for the Initial Fields

With these additional restrictions, the approximate initial field equations of GHS for local incompressibility, Eqs. 174–176, further simplify to those applying to material incompressibility:

$$-p_{,i}^{(0)} + \rho^{(0)}\phi_{,i}^{(0)} = 0, \tag{196}$$

$$\phi_{,ii}^{(0)} = -4\pi G\rho^{(0)}, \tag{197}$$

$$\rho^{(0)} = \text{constant}. \tag{198}$$

However, the initial interface conditions, Eqs. 53–57, continue to apply.

5.2.2 Equations for the Incremental Fields: Local Form

Owing to the additional assumptions, the approximate incremental field equations and interface conditions of GVED for local incompressibility, Eqs. 188–195, further reduce to the conventional form valid for material incompressibility:

$$t_{ij,j}^{(\Delta)} + \rho^{(0)}\phi_{,i}^{(\Delta)} = 0, \tag{199}$$

$$\phi_{,ii}^{(\Delta)} = 0, \tag{200}$$

$$t_{ij}^{(\Delta)} = -\delta_{ij}\,p^{(\Delta)} + \int_0^t \mu(t - t')\,d_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,dt', \tag{201}$$

$$u_{i,i} = 0, \tag{202}$$

$$[u_i]_-^+ = 0, \tag{203}$$

$$[\phi^{(\Delta)}]_-^+ = 0, \tag{204}$$

$$\left[n_i^{(0)}\left(\phi_{,i}^{(\Delta)} - 4\pi G\rho^{(0)}u_i\right)\right]_-^+ = -4\pi G\sigma, \tag{205}$$

$$\left[n_j^{(0)}\left(t_{ij}^{(\Delta)} - \delta_{ij}\,\rho^{(0)}\phi_{,k}^{(0)}u_k\right)\right]_-^+ = \gamma\,n_i^{(0)}\sigma. \tag{206}$$

Deductions of analytical solutions to these equations for one- or two-layer Earth models are given elsewhere (e.g., Wolf 1984, 1994; Amelung and Wolf 1994; Rümpker and Wolf 1996; Wu and Ni 1996). In addition, a number of analytical or semi-analytical solutions for multilayer Earth models have been obtained (e.g., Sabadini et al. 1982; Wu and Peltier 1982; Wolf 1985d; Wu 1990; Spada et al. 1992; Vermeersen and Sabadini 1997; Martinec and Wolf 1998; Wieczerkowski 1999). An instructive solution to a simplified form of the above equations has been derived by Wolf (1991b). All these solutions refer to lateral homogeneity. Recently, the derivation of solutions for laterally heterogeneous Earth models has also received attention. Whereas Kaufmann and Wolf (1999) and Tromp and Mitrovica (1999a, b, 2000) limited the theory to small perturbations of the parameters in the lateral direction, D’Agostino et al. (1997), Martinec (1998, 2000), and Martinec and Wolf (1999) developed solution techniques valid for arbitrarily large perturbations.

6 Summary

The results of this review can be summarized as follows:

1. We have defined the Lagrangian and Eulerian representations of arbitrary fields and provided expressions for the relationship between the fields and their gradients in these kinematic representations. In correspondence with the Lagrangian and Eulerian representations, we have also defined the material and local increments of the fields. Using the relation between the kinematic representations, this has allowed us to establish the material and local forms of the fundamental perturbation equation.

2. Postulating only the differential form of the fundamental principles of continuum mechanics and potential theory in the Lagrangian representation, we have then presented a concise derivation of the material, material-local, and local forms of the incremental field equations and interface conditions of GVED. These equations describe infinitesimal, gravitational-viscoelastic perturbations of compositionally and entropically stratified, compressible, rotating fluids initially in hydrostatic equilibrium.

3. Following this, we have obtained, as the short-time asymptotes to the incremental field equations and interface conditions of GVED, a system of equations referred to as generalized incremental field equations and interface conditions of GED. The long-time asymptotes agree with the incremental field equations and interface conditions of GVD. In particular, we have shown that the incremental thermodynamic pressure entering into the long-time asymptote to the incremental constitutive equation of viscoelasticity satisfies the incremental state equation appropriate to viscous fluids.

4. Finally, we have adopted several simplifying assumptions and developed approximate field theories applying to gravitational-viscoelastic perturbations of isocompositional, isentropic, and compressible or incompressible fluid domains.

Appendix 1: Laplace Transform

Forward Transform

The Laplace transform, $\mathcal{L}[f(t)]$, of a function, $f(t)$, is defined by

$$\mathcal{L}[f(t)] = \int_0^\infty f(t)\,e^{-st}\,dt, \qquad s \in \mathcal{S}, \tag{207}$$

where $s$ is the inverse Laplace time and $\mathcal{S}$ is the complex $s$ domain (e.g., LePage 1980). We assume here that $f(t)$ is continuous for all $t \in \mathcal{T}$ and of exponential order as $t \to \infty$, which are sufficient conditions for the convergence of the Laplace integral in Eq. 207 for $\mathrm{Re}\,s$ larger than some value, $s_R$. Defining $\mathcal{L}[f(t)] = \tilde f(s)$ and assuming the same properties for $g(t)$, elementary consequences are then

$$\mathcal{L}[a\,f(t) + b\,g(t)] = a\tilde f(s) + b\tilde g(s), \qquad a, b = \text{constant}, \tag{208}$$

$$\mathcal{L}[d_t f(t)] = s\tilde f(s) - f(0), \tag{209}$$

$$\mathcal{L}\left[\int_0^t f(t')\,dt'\right] = \frac{\tilde f(s)}{s}, \tag{210}$$

$$\mathcal{L}\left[\int_0^t f(t - t')\,g(t')\,dt'\right] = \tilde f(s)\,\tilde g(s), \tag{211}$$

$$\mathcal{L}[1] = \frac{1}{s}, \tag{212}$$

$$\mathcal{L}[e^{-s_0 t}] = \frac{1}{s + s_0}, \qquad s_0 = \text{constant}. \tag{213}$$


Inverse Transform

If $\mathcal{L}[f(t)]$ is the forward Laplace transform of $f(t)$, then $f(t)$ is called inverse Laplace transform of $\mathcal{L}[f(t)]$. This is written as $\mathcal{L}^{-1}\{\mathcal{L}[f(t)]\} = f(t)$. Since $\mathcal{L}[f(t)] = \tilde f(s)$, it follows that

$$\mathcal{L}^{-1}[\tilde f(s)] = f(t), \qquad t \in \mathcal{T}, \tag{214}$$

which admits the immediate inversion of the forward transforms listed above.
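The elementary transform pairs listed above can be verified by direct quadrature of the Laplace integral. The sketch below is an illustration only: the truncation time `t_max`, the number of Simpson intervals, and the test function $f(t) = e^{-s_0 t}$ with $s_0 = 2$ are arbitrary choices. It checks Eq. 213, the derivative rule of Eq. 209, and the integration rule of Eq. 210 at a single real value of $s$.

```python
import math


def laplace_transform(f, s, t_max=60.0, n=60000):
    """Composite Simpson approximation of the Laplace integral, truncated at t_max."""
    if n % 2:
        n += 1                       # Simpson's rule needs an even interval count
    h = t_max / n
    total = f(0.0) + f(t_max) * math.exp(-s * t_max)
    for k in range(1, n):
        t = k * h
        total += (4.0 if k % 2 else 2.0) * f(t) * math.exp(-s * t)
    return total * h / 3.0


S0 = 2.0   # decay constant of the test function
S = 1.5    # real Laplace variable at which the pairs are checked


def f(t):
    return math.exp(-S0 * t)


def f_derivative(t):
    return -S0 * math.exp(-S0 * t)


def f_integral(t):
    return (1.0 - math.exp(-S0 * t)) / S0   # running integral of f from 0 to t


f_tilde = laplace_transform(f, S)
print(f_tilde, 1.0 / (S + S0))                                   # Eq. 213
print(laplace_transform(f_derivative, S), S * f_tilde - f(0.0))  # Eq. 209
print(laplace_transform(f_integral, S), f_tilde / S)             # Eq. 210
```

Each printed pair should agree to within the quadrature error, which is small here because the integrands decay rapidly.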

Generalized Initial- and Final-Value Theorems

Some useful consequences of Eqs. 207 and 214 are the generalized initial- and final-value theorems. Assuming that the appropriate limits exist, the first theorem states that an asymptotic approximation, $p(t)$ to $f(t)$ for small $t$, corresponds to an asymptotic approximation, $\tilde p(s)$ to $\tilde f(s)$ for large $s$. Similarly, according to the second theorem, an asymptotic approximation, $q(t)$ to $f(t)$ for large $t$, corresponds to an asymptotic approximation, $\tilde q(s)$ to $\tilde f(s)$ for small $s$.
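Both theorems can be illustrated with a closed-form transform pair. The sketch below is an assumption-laden example rather than part of the original text: it takes $f(t) = (\mu/\alpha)(1 - e^{-\alpha t})$ with $\tilde f(s) = \mu/[s(s+\alpha)]$, where `MU` and `ALPHA` are hypothetical constants. For a Maxwell-type relaxation function $\mu e^{-\alpha t}$, this $f$ can be read as the response to a unit strain rate, so its small-t behavior is elastic-like and its large-t limit is viscous-like. The pairs $p(t) = \mu t - \mu\alpha t^2/2$ with $\tilde p(s) = \mu/s^2 - \mu\alpha/s^3$, and $q(t) = \mu/\alpha$ with $\tilde q(s) = \mu/(\alpha s)$, follow from Eqs. 208, 210, and 212.

```python
import math

MU = 3.0      # hypothetical modulus
ALPHA = 0.5   # hypothetical inverse relaxation time


def f(t):
    """Exact time function; expm1 avoids cancellation at small t."""
    return -MU / ALPHA * math.expm1(-ALPHA * t)


def f_tilde(s):
    return MU / (s * (s + ALPHA))


def p(t):          # small-t approximation of f
    return MU * t - MU * ALPHA * t * t / 2.0


def p_tilde(s):    # corresponding large-s approximation of f_tilde
    return MU / s**2 - MU * ALPHA / s**3


def q(t):          # large-t approximation of f (independent of t)
    return MU / ALPHA


def q_tilde(s):    # corresponding small-s approximation of f_tilde
    return MU / (ALPHA * s)


def relative_error(approx, exact):
    return abs(approx - exact) / abs(exact)


print("small t:", relative_error(p(1.0e-3), f(1.0e-3)))
print("large s:", relative_error(p_tilde(1.0e3), f_tilde(1.0e3)))
print("large t:", relative_error(q(1.0e2), f(1.0e2)))
print("small s:", relative_error(q_tilde(1.0e-3), f_tilde(1.0e-3)))
```

All four relative errors should be small, and each shrinks further as the respective limit is approached.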


Appendix 2: List of Important Symbols

Latin Symbols

| Symbol | Name | First reference |
| --- | --- | --- |
| $d^2 r_i$ | Differential area at $r_i$ | Sect. 3.1 |
| $d_t^n$ | nth-order material-derivative operator with respect to $t$ | Sect. 3.1 |
| $\tilde f$ | Laplace transform of $f$ | Sect. 4 |
| $F_{ij\ldots}$ | Eulerian representation of Cartesian tensor field | Sect. 2.1 |
| $f_{ij\ldots}$ | Lagrangian representation of Cartesian tensor field | Sect. 2.1 |
| $F_{ij\ldots,k}$ | Gradient of $F_{ij\ldots}$ with respect to $r_k$ | Sect. 2.1 |
| $f_{ij\ldots,k}$ | Gradient of $f_{ij\ldots}$ with respect to $X_k$ | Sect. 2.1 |
| $f_{ij\ldots}^{(\Delta)}$ | Local increment of $f_{ij\ldots}$ | Sect. 2.2 |
| $f_{ij\ldots}^{(\delta)}$ | Material increment of $f_{ij\ldots}$ | Sect. 2.2 |
| $f_{ij\ldots}^{(0)}$ | Initial value of $f_{ij\ldots}$ | Sect. 2.2 |
| $[f_{ij\ldots}]_-^+$ | Increase of $f_{ij\ldots}$ across $\partial\mathcal{R}$ in direction of $n_i$ | Sect. 2.3 |
| $G$ | Newton’s gravitational constant | Sect. 3.1 |
| $g_i$ | Gravity force per unit mass | Sect. 3.1 |
| $i, j, \ldots$ | Index subscripts of Cartesian tensor | Sect. 2 |
| $j$ | Jacobian determinant | Sect. 3.1 |
| $l$ | Compositional modulus | Section “Large-t Asymptotes: Field Theory of GVD” |
| $m_\nu$ | Relaxation function | Section “Constitutive Equation” |
| $\bar m_\nu$ | Relaxation spectrum | Sect. 4.1 |
| $m_1$ | Bulk-relaxation function | Section “Constitutive Equation” |
| $m_2$ | Shear-relaxation function | Section “Constitutive Equation” |
| $m_{\nu 0}$ | Small-t limit of relaxation function | Sect. 4.1 |
| $m_{\nu\infty}$ | Large-t limit of relaxation function | Sect. 4.1 |
| $m_{ijk\ell}$ | Anisotropic relaxation function | Section “Constitutive Equation” |
| $n_i$ | Outward unit normal with respect to $\partial\mathcal{R}$ | Sect. 2.3 |
| $p$ | Mechanical pressure | Sect. 3.2 |
| $r_i$ | Position of place, current position of particle | Sect. 2.1 |
| $s$ | Inverse Laplace time | Sect. 4 |
| $t$ | Current time | Sect. 2.1 |
| $t'$ | Excitation time | Sect. 3.1 |
| $t_{ij}$ | Cauchy stress | Sect. 3.1 |
| $u_i$ | Displacement | Sect. 2.1 |
| $v$ | Entropic modulus | Section “Large-t Asymptotes: Field Theory of GVD” |
| $X_i$ | Initial position of material point | Sect. 2.1 |


Greek Symbols

| Symbol | Name | First reference |
| --- | --- | --- |
| $\alpha'$ | Inverse spectral time | Sect. 4.1 |
| $\gamma$ | Magnitude of $g_i^{(0)}$ on $\partial\mathcal{R}^{(0)}$ | Section “Material Form” |
| $\delta_{ij}$ | Kronecker symbol | Sect. 2 |
| $\partial$ | Partial-derivative operator | Sect. 2.1 |
| $\epsilon_{ijk}$ | Levi-Civita symbol | Sect. 2 |
| $\kappa_e$ | Elastic bulk modulus | Section “Large-s Asymptotes” |
| $\kappa_e'$ | Anelastic bulk modulus | Section “Large-s Asymptotes” |
| $\kappa_h$ | Hydrostatic bulk modulus | Section “Small-s Asymptotes” |
| $\kappa_h'$ | Viscous bulk modulus | Section “Small-s Asymptotes” |
| $\lambda$ | Composition | Sect. 3.2 |
| $\mu$ | Shear-relaxation function | Section “Equations for the Incremental Fields: Local Form” |
| $\mu_e$ | Elastic shear modulus | Section “Large-s Asymptotes” |
| $\mu_e'$ | Anelastic shear modulus | Section “Large-s Asymptotes” |
| $\mu_h'$ | Viscous shear modulus | Section “Small-s Asymptotes” |
| $\xi$ | State function | Sect. 3.2 |
| $\xi_b$ | Barotropic state function | Sect. 5.1.1 |
| $\varpi$ | Thermodynamic pressure | Sect. 3.4 |
| $\rho$ | Volume-mass density | Sect. 3.1 |
| $\sigma$ | (Incremental) interface-mass density | Sect. 3.1 |
| $\tau_{ij}$ | Piola-Kirchhoff stress | Sect. 3.1 |
| $\phi$ | Gravitational potential | Sect. 3.1 |
| $\varphi$ | Entropy density | Sect. 3.2 |
| $\chi$ | Centrifugal potential | Sect. 3.1 |
| $\psi$ | Tidal potential | Sect. 3.1 |
| $\Omega_i$ | Angular velocity | Sect. 3.1 |


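Calligraphic Symbols

| Symbol | Name | First reference |
| --- | --- | --- |
| $\mathcal{E}$ | Euclidian space domain | Sect. 2.1 |
| $\mathcal{L}$ | Laplace transformation functional | Sect. 6 |
| $\mathcal{L}^{-1}$ | Inverse Laplace transformation functional | Sect. 6 |
| $\mathcal{M}_{ij}$ | Anisotropic relaxation functional | Sect. 3.1 |
| $\mathcal{R}_-$ | Internal $r_i$ domain | Sect. 2.1 |
| $\mathcal{R}_+$ | External $r_i$ domain | Sect. 2.1 |
| $\mathcal{S}$ | $s$ domain | Sect. 4 |
| $\mathcal{T}$ | $t$ domain | Sect. 2.1 |
| $\mathcal{X}_-$ | Internal $X_i$ domain | Sect. 2.1 |
| $\mathcal{X}_+$ | External $X_i$ domain | Sect. 2.1 |
| $\partial\mathcal{R}$ | Interface between $\mathcal{R}_-$ and $\mathcal{R}_+$ | Sect. 2.1 |
| $\partial\mathcal{X}$ | Interface between $\mathcal{X}_-$ and $\mathcal{X}_+$ | Sect. 2.1 |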

References

Amelung F, Wolf D (1994) Viscoelastic perturbations of the earth: significance of the incremental gravitational force in models of glacial isostasy. Geophys J Int 117:864–879
Backus GE (1967) Converting vector and tensor equations to scalar equations in spherical coordinates. Geophys J R Astron Soc 13:71–101
Batchelor GK (1967) An introduction to fluid dynamics. Cambridge University Press, Cambridge
Biot MA (1959) The influence of gravity on the folding of a layered viscoelastic medium under compression. J Franklin Inst 267:211–228
Biot MA (1965) Mechanics of incremental deformations. Wiley, New York
Bullen KE (1975) The Earth’s density. Chapman and Hall, London
Cathles LM (1975) The viscosity of the Earth’s mantle. Princeton University Press, Princeton
Chandrasekhar S (1961) Hydrodynamic and hydromagnetic stability. Clarendon Press, Oxford
Christensen RM (1982) Theory of viscoelasticity, 2nd edn. Academic, New York
Corrieu V, Thoraval C, Ricard Y (1995) Mantle dynamics and geoid Green functions. Geophys J Int 120:516–532
D’Agostino G, Spada G, Sabadini R (1997) Postglacial rebound and lateral viscosity variations: a semi-analytical approach based on a spherical model with Maxwell rheology. Geophys J Int 129:F9–F13
Dahlen FA (1972) Elastic dislocation theory for a self-gravitating elastic configuration with an initial static stress field. Geophys J R Astron Soc 28:357–383
Dahlen FA (1973) Elastic dislocation theory for a self-gravitating elastic configuration with an initial static stress field II: energy release. Geophys J R Astron Soc 31:469–484
Dahlen FA (1974) On the static deformation of an earth model with a fluid core. Geophys J R Astron Soc 36:461–485
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton
Darwin GH (1879) On the bodily tides of viscous and semi-elastic spheroids, and on the ocean tides upon a yielding nucleus. Philos Trans R Soc Lond Part 1 170:1–35
Dehant V, Wahr JM (1991) The response of a compressible, non-homogeneous earth to internal loading: theory. J Geomagn Geoelectr 43:157–178
Eringen AC (1989) Mechanics of continua, 2nd edn. R. E. Krieger, Malabar
Golden JM, Graham GAC (1988) Boundary value problems in linear viscoelasticity. Springer, Berlin
Grafarend EW (1982) Six lectures on geodesy and global geodynamics. Mitt Geodät Inst Tech Univ Graz 41:531–685
Hanyk L, Yuen DA, Matyska C (1996) Initial-value and modal approaches for transient viscoelastic responses with complex viscosity profiles. Geophys J Int 127:348–362
Hanyk L, Matyska C, Yuen DA (1999) Secular gravitational instability of a compressible viscoelastic sphere. Geophys Res Lett 26:557–560
Haskell NA (1935) The motion of a viscous fluid under a surface load. Physics 6:265–269
Haskell NA (1936) The motion of a viscous fluid under a surface load, 2. Physics 7:56–61
Jarvis GT, McKenzie DP (1980) Convection in a compressible fluid with infinite Prandtl number. J Fluid Mech 96:515–583
Johnston P, Lambeck K, Wolf D (1997) Material versus isobaric internal boundaries in the earth and their influence on postglacial rebound. Geophys J Int 129:252–268
Kaufmann G, Wolf D (1999) Effects of lateral viscosity variations on postglacial rebound: an analytical approach. Geophys J Int 137:489–500
Krauss W (1973) Methods and results of theoretical oceanography, vol. 1: dynamics of the homogeneous and the quasihomogeneous ocean. Bornträger, Berlin
LePage WR (1980) Complex variables and the Laplace transform for engineers. Dover, New York
Li G, Yuen DA (1987) Viscous relaxation of a compressible spherical shell. Geophys Res Lett 14:1227–1230
Love AEH (1911) Some problems of geodynamics. Cambridge University Press, Cambridge
Malvern LE (1969) Introduction to the mechanics of a continuous medium. Prentice-Hall, Englewood Cliffs
Martinec Z (1999) Spectral, initial value approach for viscoelastic relaxation of a spherical earth with three-dimensional viscosity-I. Theory. Geophys J Int 137:469–488
Martinec Z (2000) Spectral-finite element approach to three-dimensional viscoelastic relaxation in a spherical earth. Geophys J Int 142:117–141
Martinec Z, Wolf D (1998) Explicit form of the propagator matrix for a multi-layered, incompressible viscoelastic sphere. Scientific technical report GFZ Potsdam, STR98/08, p 13
Martinec Z, Wolf D (1999) Gravitational-viscoelastic relaxation of eccentrically nested spheres. Geophys J Int 138:45–66
Martinec Z, Thoma M, Wolf D (2001) Material versus local incompressibility and its influence on glacial-isostatic adjustment. Geophys J Int 144:136–156
Mitrovica JX, Davis JL, Shapiro II (1994) A spectral formalism for computing three-dimensional deformation due to surface loads 1. Theory. J Geophys Res 99:7057–7073
O’Connell RJ (1971) Pleistocene glaciation and the viscosity of the lower mantle. Geophys J R Astron Soc 23:299–327
Panasyuk SV, Hager BH, Forte AM (1996) Understanding the effects of mantle compressibility on geoid kernels. Geophys J Int 124:121–133
Parsons BE (1972) Changes in the Earth’s shape. Ph.D. thesis, Cambridge University, Cambridge
Peltier WR (1974) The impulse response of a Maxwell earth. Rev Geophys Space Phys 12:649–669
Peltier WR (ed) (1989) Mantle convection, plate tectonics and geodynamics. Gordon and Breach, New York
Ramsey AS (1981) Newtonian attraction. Cambridge University Press, Cambridge
Rayleigh L (1906) On the dilatational stability of the earth. Proc R Soc Lond Ser A 77:486–499
Rümpker G, Wolf D (1996) Viscoelastic relaxation of a Burgers half space: implications for the interpretation of the Fennoscandian uplift. Geophys J Int 124:541–555
Sabadini R, Yuen DA, Boschi E (1982) Polar wandering and the forced responses of a rotating, multilayered, viscoelastic planet. J Geophys Res 87:2885–2903
Spada G, Sabadini R, Yuen DA, Ricard Y (1992) Effects on post-glacial rebound from the hard rheology in the transition zone. Geophys J Int 109:683–700
Tromp J, Mitrovica JX (1999a) Surface loading of a viscoelastic earth-I. General theory. Geophys J Int 137:847–855
Tromp J, Mitrovica JX (1999b) Surface loading of a viscoelastic earth-II. Spherical models. Geophys J Int 137:856–872
Tromp J, Mitrovica JX (2000) Surface loading of a viscoelastic planet-III. Aspherical models. Geophys J Int 140:425–441
Vermeersen LLA, Mitrovica JX (2000) Gravitational stability of spherical self-gravitating relaxation models. Geophys J Int 142:351–360
Vermeersen LLA, Sabadini R (1997) A new class of stratified viscoelastic models by analytical techniques. Geophys J Int 129:531–570
Vermeersen LLA, Vlaar NJ (1991) The gravito-elastodynamics of a pre-stressed elastic earth. Geophys J Int 104:555–563
Vermeersen LLA, Sabadini R, Spada G (1996) Compressible rotational deformation. Geophys J Int 126:735–761
Wieczerkowski K (1999) Gravito-Viskoelastodynamik für verallgemeinerte Rheologien mit Anwendungen auf den Jupitermond Io und die Erde. Publ Deutsch Geod Komm Ser C 515:130
Williamson ED, Adams LH (1923) Density distribution in the earth. J Wash Acad Sci 13:413–428
Wolf D (1984) The relaxation of spherical and flat Maxwell earth models and effects due to the presence of the lithosphere. J Geophys 56:24–33
Wolf D (1985a) Thick-plate flexure re-examined. Geophys J R Astron Soc 80:265–273
Wolf D (1985b) On Boussinesq’s problem for Maxwell continua subject to an external gravity field. Geophys J R Astron Soc 80:275–279
Wolf D (1985c) The normal modes of a uniform, compressible Maxwell half-space. J Geophys 56:100–105
Wolf D (1985d) The normal modes of a layered, incompressible Maxwell half-space. J Geophys 57:106–117
Wolf D (1991a) Viscoelastodynamics of a stratified, compressible planet: incremental field equations and short- and long-time asymptotes. Geophys J Int 104:401–417
Wolf D (1991b) Boussinesq’s problem of viscoelasticity. Terra Nova 3:401–407
Wolf D (1994) Lamé’s problem of gravitational viscoelasticity: the isochemical, incompressible planet. Geophys J Int 116:321–348
Wolf D (1997) Gravitational viscoelastodynamics for a hydrostatic planet. Publ Deutsch Geod Komm Ser C 452:96
Wolf D, Kaufmann G (2000) Effects due to compressional and compositional density stratification on load-induced Maxwell-viscoelastic perturbations. Geophys J Int 140:51–62
Wu P (1990) Deformation of internal boundaries in a viscoelastic earth and topographic coupling between the mantle and the core. Geophys J Int 101:213–231
Wu P (1992) Viscoelastic versus viscous deformation and the advection of prestress. Geophys J Int 108:136–142
Wu P, Ni Z (1996) Some analytical solutions for the viscoelastic gravitational relaxation of a two-layer non-self-gravitating incompressible spherical earth. Geophys J Int 126:413–436
Wu P, Peltier WR (1982) Viscous gravitational relaxation. Geophys J R Astron Soc 70:435–485
Wu J, Yuen DA (1991) Post-glacial relaxation of a viscously stratified compressible mantle. Geophys J Int 104:331–349

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_11-5 © Springer-Verlag Berlin Heidelberg 2014

Multiresolution Analysis of Hydrology and Satellite Gravitational Data Helga Nutz and Kerstin Wolf Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany

Abstract We present a multiresolution analysis of temporal and spatial variations of the Earth’s gravitational potential by the use of tensor product wavelets which are built up by Legendre and spherical wavelets for the time and space domain, respectively. The multiresolution is performed for satellite and hydrological data, and based on these results we compute correlation coefficients between both data sets, which help us to develop a filter for the extraction of an improved hydrology model from the satellite data.

1 Introduction The twin satellite gravity mission GRACE (Gravity Recovery And Climate Experiment) (see Tapley and Reigber 2001; Tapley et al. 2004a, b) provides a huge amount of data, which enables for the first time to quantify spatial and temporal variations of the Earth’s gravity field caused by mass transport and mass distribution with sufficient accuracy (see Swenson et al. 2003; Swenson and Wahr 2006). Most of the measured gravitational variations belong to hydrological mass distribution, and the determination of the continental water changes from the GRACE data is possible with a resolution of 1 cm water column in monthly resolution. This gives us the opportunity to analyze the hydrological information at different scales in time and space with respect to topics as, e.g., global water balance and water transfer, large-scale spatial and temporal variations of terrestrial water storage, water balances in difficult to access regions, long-term trends of continental water storage, and identification of hydrological problem zones with respect to water management and the availability of water resources. Hydrological data, as, e.g., WGHM (WaterGAP Global Hydrology Model) (see Döll et al. 2003) used for our computations, are given in the form of a time series of monthly equivalent water column heights or surface density variations. These data can be directly transformed to the corresponding gravitational potential by numerical integration over the underlying grid. The classical approach for modeling the gravitational field of the Earth is to use a truncated Fourier series based on spherical harmonics where the accuracy of the approximation is given by the maximum degree. A fundamental disadvantage of the spherical harmonic expansion is the localization of the basis functions (spherical harmonics) in the frequency domain, which leads to a smearing of the spatial detail information over the whole globe. The need of a possibility




Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_11-5 © Springer-Verlag Berlin Heidelberg 2014

to locally analyze the gravitational potential led to the development of spherical wavelets in the Geomathematics Group of the University of Kaiserslautern (Freeden 1999; Freeden et al. 1998; Freeden and Schneider 1998a, b; Freeden and Schreiner 2009). The spherical wavelets are kernel functions, which are constructed using clusters of a finite number of spherical harmonics, and by this means guarantee a good localization in the space domain. The uncertainty principle reveals that localization in both frequency and space domain are mutually exclusive. Based on the spherical multiresolution analysis we derive a multiresolution analysis for both time and space domain. This is performed by transferring the spherical theory to the time domain by the use of Legendre wavelets instead of spherical wavelets and then by applying the theory of tensor product wavelets known from the classical multidimensional wavelet analysis. In the classical wavelet analysis the two dimensions correspond to two directions in space, whereas in our case the dimensions are the time and space domain (sphere). This method allows us to reveal both temporal and spatial detail information of time series of gravitational data (hydrological or satellite data). Finally, we compare the resulting temporal and spatial detail information and the scale-depending approximations of both data sets by computing local and global correlation coefficients. These comparisons reveal the temporal and spatial regions of bad correlation of the GRACE and WGHM data. With the objective of improving the existing hydrological models, we finally derive a filter by weighting the detail information of different scales subject to the local correlation coefficients. The layout of the chapter is as follows: In Sect. 2 we give a short presentation of the multiresolution for Hilbert spaces in order to explain the wavelet concept because this theory is fundamental for the further course of this chapter. 
The combined time-space multiresolution analysis for reconstructing a signal in the temporal and spatial domain and the theory of correlation coefficients is then introduced in Sect. 3. Section 4 is concerned with the numerical computations based on the theory which is presented in the foregoing section. All computations are performed with data from the satellite mission GRACE and with hydrological data from WGHM. A first idea for an “optimal” extraction of a hydrological model from satellite data is presented in Sect. 5 and finally some conclusions are drawn in the last section.

2 Scientific Relevance of Multiresolution The concept of multiresolution has been developed by Mallat (1989a, b) and Meyer (1992) for fast and stable wavelet analysis and synthesis of functions in $L^2(\mathbb{R})$ and has been transferred to the spherical case by Freeden (see Freeden et al. (1998) and the references therein).

2.1 Preliminaries We start with a short recapitulation of some notation and symbols which will be important within this chapter. Additional information can be found, e.g., in Müller (1966) and Freeden et al. (1998) and the references therein. The sets of positive integers, non-negative integers, integers, and real numbers are represented by $\mathbb{N}$, $\mathbb{N}_0$, $\mathbb{Z}$, and $\mathbb{R}$, respectively. The Hilbert space of all real, square-integrable functions $F$ on $\Omega$, where $\Omega$ denotes the unit sphere, is called $L^2(\Omega)$ with the scalar product given by $(F, G)_{L^2(\Omega)} = \int_{\Omega} F(\xi) G(\xi)\, d\omega(\xi)$, $F, G \in L^2(\Omega)$. The space of all scalar spherical harmonics $Y_n : \Omega \to \mathbb{R}$ of degree $n$ is of dimension $2n + 1$, and the set $\{Y_{n,k} : \Omega \to \mathbb{R};\ n \in \mathbb{N}_0;\ k = 1, \ldots, 2n + 1\}$ of spherical harmonics of degree $n$ and order $k$ forms an orthonormal basis of $L^2(\Omega)$. Thus $F \in L^2(\Omega)$ can be uniquely represented by a Fourier series $F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^{\wedge}(n, k) Y_{n,k}$ (in the $L^2(\Omega)$-sense) with the Fourier coefficients $F^{\wedge}(n, k) = (F, Y_{n,k})_{L^2(\Omega)}$. Closely related to the spherical harmonics are the Legendre polynomials $P_n : [-1, 1] \to \mathbb{R}$ of degree $n$, $n \in \mathbb{N}_0$. Considering the space $L^2([-1, 1])$ with scalar product $(F, G)_{L^2([-1,1])} = \int_{-1}^{1} F(t) G(t)\, dt$, $F, G \in L^2([-1, 1])$, the $L^2([-1, 1])$-orthonormal Legendre polynomials $P_n^* : [-1, 1] \to \mathbb{R}$ defined by $P_n^* = \left(\frac{2n+1}{2}\right)^{1/2} P_n$, $n \in \mathbb{N}_0$, form an orthonormal basis in $L^2([-1, 1])$. Thus, every $F \in L^2([-1, 1])$ can be represented by a Legendre expansion $F = \sum_{n=0}^{\infty} F^{\wedge}(n) P_n^*$, with the Legendre coefficients $F^{\wedge}(n) = (F, P_n^*)_{L^2([-1,1])}$. We conclude this section mentioning the addition theorem, which states the relation between the Legendre polynomial of degree $n$ and the spherical harmonics of degree $n$:

\[
\sum_{k=1}^{2n+1} Y_{n,k}(\xi) Y_{n,k}(\eta) = \frac{2n+1}{4\pi} P_n(\xi \cdot \eta), \qquad \xi, \eta \in \Omega.
\]
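The addition theorem can be checked numerically for low degrees. The following is a minimal sketch, assuming the usual real orthonormal spherical harmonics of degrees 1 and 2 written in Cartesian coordinates; the function names `legendre_p`, `real_harmonics`, and `addition_theorem_sides` are illustrative and do not come from the chapter.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_p(n, t):
    """Evaluate the (unnormalized) Legendre polynomial P_n at t."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return legendre.legval(t, coeffs)


def real_harmonics(n, xi):
    """Real orthonormal spherical harmonics of degree n (1 or 2) at the unit vector xi."""
    x, y, z = xi
    if n == 1:
        c = np.sqrt(3.0 / (4.0 * np.pi))
        return np.array([c * x, c * y, c * z])
    if n == 2:
        c = 0.5 * np.sqrt(15.0 / np.pi)
        return np.array([
            c * x * y,
            c * y * z,
            0.25 * np.sqrt(5.0 / np.pi) * (3.0 * z**2 - 1.0),
            c * x * z,
            0.5 * c * (x**2 - y**2),
        ])
    raise ValueError("only degrees 1 and 2 are tabulated in this sketch")


def addition_theorem_sides(n, xi, eta):
    """Return the left- and right-hand side of the addition theorem for degree n."""
    left = np.dot(real_harmonics(n, xi), real_harmonics(n, eta))
    right = (2 * n + 1) / (4.0 * np.pi) * legendre_p(n, np.dot(xi, eta))
    return left, right


# Two random points on the unit sphere (assumed test data).
rng = np.random.default_rng(0)
xi, eta = rng.standard_normal((2, 3))
xi, eta = xi / np.linalg.norm(xi), eta / np.linalg.norm(eta)

for n in (1, 2):
    left, right = addition_theorem_sides(n, xi, eta)
    print(n, left, right)
```

Both sides should agree up to rounding for any pair of unit vectors, which is what makes it possible to evaluate zonal kernels with Legendre polynomials alone.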

2.2 Multiresolution in Hilbert Spaces Within this subsection, we briefly present the multiresolution analysis in Hilbert spaces developed in the Geomathematics Group of the University of Kaiserslautern (see, e.g., Freeden and Schneider (1998b) and the references therein). This theory is fundamental for the understanding of the time-space multiresolution in Sect. 3.1. With $H$ we denote a real separable Hilbert space over a certain domain $\Sigma \subset \mathbb{R}^m$ with scalar product $(\cdot, \cdot)_H$. Let $\{U_n\}_{n \in \mathbb{N}_0}$ be an orthonormal system which is complete in $(H, (\cdot, \cdot)_H)$ and $\Gamma : \Sigma \times \Sigma \to \mathbb{R}$ an $H$-product kernel given by $\Gamma(x, y) = \sum_{n=0}^{\infty} \Gamma^{\wedge}(n) U_n(x) U_n(y)$, $x, y \in \Sigma$, with symbol $\{\Gamma^{\wedge}(n)\}_{n \in \mathbb{N}_0}$. $\Gamma$ is called $H$-admissible if the following two conditions are satisfied:

(i) $\sum_{n=0}^{\infty} (\Gamma^{\wedge}(n))^2 < \infty$,
(ii) $\sum_{n=0}^{\infty} (\Gamma^{\wedge}(n) U_n(x))^2 < \infty$, $\forall x \in \Sigma$.

These admissibility conditions ensure that the functions $\Gamma(x, \cdot) : \Sigma \to \mathbb{R}$ and $\Gamma(\cdot, x) : \Sigma \to \mathbb{R}$, $x \in \Sigma$ fixed, are elements of $H$. Furthermore, they guarantee that the convolution of an admissible kernel function $\Gamma$ and a function $F \in H$ is again in $H$, where the convolution is defined by

\[
(\Gamma * F)(x) = \int_{\Sigma} F(y) \Gamma(x, y)\, dy = \sum_{n=0}^{\infty} \Gamma^{\wedge}(n) F^{\wedge}(n) U_n(x).
\]

Fundamental for the multiresolution analysis are the so-called $H$-scaling functions which are defined in such a way that we can interpret them as low-pass filters for functions in $H$. We start with the definition of the mother $H$-scaling function. Let $\{(\Phi_0)^{\wedge}(n)\}_{n \in \mathbb{N}_0}$ be the symbol of an $H$-admissible kernel function which additionally satisfies the following two conditions:

(i) $(\Phi_0)^{\wedge}(0) = 1$,
(ii) if $n > k$ then $(\Phi_0)^{\wedge}(n) \le (\Phi_0)^{\wedge}(k)$.

Then $\{(\Phi_0)^{\wedge}(n)\}_{n \in \mathbb{N}_0}$ is called the generating symbol of the mother $H$-scaling function given by

\[
\Phi_0(x, y) = \sum_{n=0}^{\infty} (\Phi_0)^{\wedge}(n) U_n(x) U_n(y), \qquad x, y \in \Sigma.
\]

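For the definition of the $H$-scaling function we have to extend this definition by defining the dilated versions of $\Phi_0$ in the following way: let $\{(\Phi_J)^{\wedge}(n)\}_{n \in \mathbb{N}_0}$, $J \in \mathbb{Z}$, be an $H$-admissible symbol satisfying in addition the following properties:

(i) $\lim_{J \to \infty} (\Phi_J)^{\wedge}(n) = 1$, $n \in \mathbb{N}$,
(ii) $(\Phi_J)^{\wedge}(n) \ge (\Phi_{J-1})^{\wedge}(n)$, $J \in \mathbb{Z}$, $n \in \mathbb{N}$,
(iii) $\lim_{J \to -\infty} (\Phi_J)^{\wedge}(n) = 0$, $n \in \mathbb{N}$,
(iv) $(\Phi_J)^{\wedge}(0) = 1$, $J \in \mathbb{Z}$.

Then $\{(\Phi_J)^{\wedge}(n)\}_{n \in \mathbb{N}_0}$, $J \in \mathbb{Z}$, is called the generating symbol of an $H$-scaling function and $J$ is called the scale. The corresponding family $\{\Phi_J\}_{J \in \mathbb{Z}}$ of kernel functions given by

\[
\Phi_J(x, y) = \sum_{n=0}^{\infty} (\Phi_J)^{\wedge}(n) U_n(x) U_n(y), \qquad x, y \in \Sigma,
\]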

is called $H$-scaling function. The symbols of the associated $H$-wavelets are defined with the help of the refinement equation

\[
\left((\Psi_J)^{\wedge}(n)\right)^2 = \left((\Phi_{J+1})^{\wedge}(n)\right)^2 - \left((\Phi_J)^{\wedge}(n)\right)^2, \qquad n \in \mathbb{N}_0. \tag{1}
\]

Then, the family $\{\Psi_J\}_{J \in \mathbb{Z}}$ of $H$-product kernels defined by

\[
\Psi_J(x, y) = \sum_{n=0}^{\infty} (\Psi_J)^{\wedge}(n) U_n(x) U_n(y), \qquad x, y \in \Sigma,
\]

is called $H$-wavelet associated to the $H$-scaling function $\{\Phi_J\}$, $J \in \mathbb{Z}$. The corresponding mother wavelet is denoted by $\Psi_0$. Our numerical calculations are all performed with the so-called cubic polynomial wavelet. The corresponding cubic polynomial scaling function is composed by the symbol

\[
(\Phi_J)^{\wedge}(n) =
\begin{cases}
(1 - 2^{-J} n)^2 (1 + 2^{1-J} n), & 0 \le n < 2^J, \\
0, & n \ge 2^J.
\end{cases}
\]
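The cubic polynomial symbol and the wavelet symbol obtained from the refinement equation (1) are easy to tabulate. The following is a minimal sketch; the function names `cp_scaling_symbol` and `cp_wavelet_symbol` are illustrative, and the degree range 0 to 65 and the scales 3 to 6 are taken from the symbol plots in this section.

```python
import numpy as np


def cp_scaling_symbol(J, n):
    """Cubic polynomial scaling symbol (Phi_J)^(n) for scale J and degrees n."""
    x = 2.0 ** (-J) * np.asarray(n, dtype=float)
    # (1 - 2^-J n)^2 (1 + 2^(1-J) n) on 0 <= n < 2^J, zero otherwise
    return np.where(x < 1.0, (1.0 - x) ** 2 * (1.0 + 2.0 * x), 0.0)


def cp_wavelet_symbol(J, n):
    """Wavelet symbol (Psi_J)^(n) from the refinement equation (1)."""
    diff = cp_scaling_symbol(J + 1, n) ** 2 - cp_scaling_symbol(J, n) ** 2
    # clip guards against tiny negative values caused by rounding
    return np.sqrt(np.clip(diff, 0.0, None))


n = np.arange(0, 66)
scaling = {J: cp_scaling_symbol(J, n) for J in (3, 4, 5, 6)}
wavelet = {J: cp_wavelet_symbol(J, n) for J in (3, 4, 5, 6)}

for J in (3, 4, 5, 6):
    print(J, scaling[J][:4], wavelet[J][:4])
```

The scaling symbol at scale $J$ vanishes for $n \ge 2^J$, so the wavelet symbol at scale $J$ is supported on $0 < n < 2^{J+1}$; this is the band-pass behavior described in the text.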

Figure 1 shows the scaling function and the wavelet for different scales. The corresponding symbols are shown in Fig. 2, where the wavelet symbols are calculated with the help of the refinement Eq. (1). With the help of the $H$-scaling functions and $H$-wavelets we introduce the scale spaces $V_J = \{\Phi_J * \Phi_J * F \mid F \in H\}$ and the corresponding detail spaces $W_J = \{\Psi_J * \Psi_J * F \mid F \in H\}$. The operator $T_J(F) = \Phi_J * \Phi_J * F$ can be interpreted as a low-pass filter and the corresponding scale space represents the approximation (reconstruction) of $F$ at scale $J$. The operator $R_J(F) = \Psi_J * \Psi_J * F$ can be interpreted as a band-pass filter and the corresponding detail spaces $W_J$ represent the wavelet approximation (detail information) of $F$ at scale $J$. For these scale and detail spaces we have the decomposition $V_{J+1} = V_J + W_J$. With increasing scale $J$, the scale spaces provide a better and better approximation of the function $F$, that is, we have the limit relation (in $H$-sense) $\lim_{J \to \infty} \Phi_J * \Phi_J * F = F$. Thus, we end up in a multiresolution analysis given by the nested sequence of scale spaces

\[
\cdots \subset V_J \subset V_{J+1} \subset \cdots \subset H,
\]

and

\[
H = \overline{\bigcup_{J=-\infty}^{\infty} V_J}^{\|\cdot\|_H}.
\]

Fig. 1 Cubic polynomial scaling function and wavelet for $\vartheta \in [-\pi, \pi]$; scale $j = 3, 4, 5, 6$. Panels: (a) scaling functions $\vartheta \mapsto \Phi_j(\vartheta)$, (b) wavelet functions $\vartheta \mapsto \Psi_j(\vartheta)$

Fig. 2 Symbols of the cubic polynomial scaling function and wavelet for $n = 0, \ldots, 65$; scale $j = 3, 4, 5, 6$. Panels: (a) symbols of the scaling function $n \mapsto (\Phi_j)^{\wedge}(n)$, (b) symbols of the wavelet $n \mapsto (\Psi_j)^{\wedge}(n)$

In particular, we can decompose the space $V_J$ for each scale $J \in \mathbb{Z}$ in one “basic” scale space and several detail spaces: $V_J = V_{J_0} + \sum_{j=J_0}^{J-1} W_j$.
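On the level of the coefficients, this decomposition is a telescoping sum of squared symbols. The following is a minimal sketch, assuming the cubic polynomial symbol and a synthetic coefficient sequence (random numbers, used only for illustration); the names `phi` and `psi_squared` are illustrative.

```python
import numpy as np


def phi(J, n):
    """Cubic polynomial scaling symbol (Phi_J)^(n)."""
    x = 2.0 ** (-J) * np.asarray(n, dtype=float)
    return np.where(x < 1.0, (1.0 - x) ** 2 * (1.0 + 2.0 * x), 0.0)


def psi_squared(J, n):
    """Squared wavelet symbol from the refinement equation."""
    return phi(J + 1, n) ** 2 - phi(J, n) ** 2


n = np.arange(0, 71)
J0, J = 3, 6

# Synthetic coefficients F^(n); any square-summable sequence would do.
rng = np.random.default_rng(1)
coeffs = rng.standard_normal(n.size)

# Low-pass part at scale J: coefficients of Phi_J * Phi_J * F.
approx_J = phi(J, n) ** 2 * coeffs

# Coarse part at scale J0 plus the band-pass parts of scales J0, ..., J - 1.
coarse = phi(J0, n) ** 2 * coeffs
details = [psi_squared(j, n) * coeffs for j in range(J0, J)]
approx_from_parts = coarse + sum(details)

print(np.max(np.abs(approx_J - approx_from_parts)))
```

The two results coincide up to rounding, which mirrors the statement that $V_J$ is built from $V_{J_0}$ and the detail spaces $W_{J_0}, \ldots, W_{J-1}$.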

2.3 Wavelets for the Time and Space Domain As a matter of fact most of the functions in geophysics and geodesy are of bounded energy and thus we conclude this section with the Hilbert spaces $L^2([-1, 1])$ used for the time domain (Legendre wavelets) and $L^2(\Omega)$ used for the space domain (spherical wavelets).

2.3.1 Legendre Wavelets Let $H = L^2([-1, 1])$ be the space of square-integrable functions $F : [-1, 1] \to \mathbb{R}$, i.e., we let $\Sigma = [-1, 1]$. This choice leads to the so-called Legendre wavelets (cf. Beth and Viell 1998). We already defined the scalar product $(F, G)_{L^2([-1,1])}$ and the orthonormal system of Legendre polynomials $P_n^*$. The $L^2([-1, 1])$-admissible product kernels then are given by

\[
\Gamma(s, t) = \sum_{n=0}^{\infty} \Gamma^{\wedge}(n) P_n^*(s) P_n^*(t), \qquad s, t \in [-1, 1],
\]

and the convolution of $\Gamma$ against $F$ is given by

\[
(\Gamma * F)(t) = \sum_{n=0}^{\infty} \Gamma^{\wedge}(n) F^{\wedge}(n) P_n^*(t), \qquad t \in [-1, 1].
\]
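The Legendre convolution can be sketched in a few lines. The example below is an illustration under stated assumptions: the signal is available as a function on $[-1, 1]$, its Legendre coefficients are computed by Gauss-Legendre quadrature, and the cubic polynomial symbol is used as the low-pass filter; the names `legendre_coefficients` and `legendre_lowpass` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def cp_symbol(J, n):
    """Cubic polynomial scaling symbol (Phi_J)^(n)."""
    x = 2.0 ** (-J) * np.asarray(n, dtype=float)
    return np.where(x < 1.0, (1.0 - x) ** 2 * (1.0 + 2.0 * x), 0.0)


def legendre_coefficients(f, n_max, quad_order=128):
    """Coefficients F^(n) = (F, P_n^*) by Gauss-Legendre quadrature."""
    nodes, weights = legendre.leggauss(quad_order)
    norm = np.sqrt((2.0 * np.arange(n_max + 1) + 1.0) / 2.0)
    vander = legendre.legvander(nodes, n_max)  # column n holds P_n(nodes)
    return norm * (vander.T @ (weights * f(nodes)))


def legendre_lowpass(f, J, t):
    """Evaluate Phi_J * Phi_J * F at times t in [-1, 1]."""
    n_max = 2 ** J  # the symbol vanishes for n >= 2^J
    n = np.arange(n_max + 1)
    norm = np.sqrt((2.0 * n + 1.0) / 2.0)
    filtered = cp_symbol(J, n) ** 2 * legendre_coefficients(f, n_max)
    # sum_n a_n P_n^*(t) = sum_n a_n * norm_n * P_n(t)
    return legendre.legval(t, filtered * norm)


t = np.linspace(-1.0, 1.0, 21)
constant = legendre_lowpass(lambda s: 3.0 * np.ones_like(s), 3, t)
linear = legendre_lowpass(lambda s: s, 6, t)
print(constant[:3], linear[:3])
```

Because the symbol equals 1 at $n = 0$, a constant signal passes unchanged at every scale, while a linear trend is reproduced up to the factor $((\Phi_J)^{\wedge}(1))^2$, which approaches 1 with increasing scale.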

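2.3.2 Spherical Wavelets In case of the scalar spherical wavelet theory, we let $\Sigma = \Omega$ and consider the Hilbert space $H = L^2(\Omega)$. As an $L^2(\Omega)$-orthonormal system we choose the system $\{Y_{n,k}\}_{n \in \mathbb{N}_0;\ k = 1, \ldots, 2n+1}$ of spherical harmonics of degree $n$ and order $k$. The $L^2(\Omega)$-product kernels have the following representation

\[
\Gamma(\xi, \eta) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Gamma^{\wedge}(n) Y_{n,k}(\xi) Y_{n,k}(\eta), \qquad \xi, \eta \in \Omega,
\]

and the convolution of $\Gamma$ against $F$ is given by

\[
(\Gamma * F)(\xi) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Gamma^{\wedge}(n) F^{\wedge}(n, k) Y_{n,k}(\xi), \qquad \xi \in \Omega.
\]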
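Since the symbol depends only on the degree $n$, the addition theorem from Sect. 2.1 reduces the spherical product kernel to a one-dimensional Legendre series in $\xi \cdot \eta$, namely $\Gamma(\xi, \eta) = \sum_{n} \Gamma^{\wedge}(n) \frac{2n+1}{4\pi} P_n(\xi \cdot \eta)$. The following is a minimal sketch of this evaluation for the cubic polynomial scaling function; the name `scaling_kernel` and the chosen angular grid are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def cp_symbol(J, n):
    """Cubic polynomial scaling symbol (Phi_J)^(n)."""
    x = 2.0 ** (-J) * np.asarray(n, dtype=float)
    return np.where(x < 1.0, (1.0 - x) ** 2 * (1.0 + 2.0 * x), 0.0)


def scaling_kernel(J, cos_theta):
    """Evaluate Phi_J as a function of cos(theta) = xi . eta via the addition theorem."""
    n = np.arange(2 ** J)  # the symbol vanishes for n >= 2^J
    coeffs = cp_symbol(J, n) * (2.0 * n + 1.0) / (4.0 * np.pi)
    return legendre.legval(cos_theta, coeffs)


theta = np.linspace(-np.pi, np.pi, 201)
k3 = scaling_kernel(3, np.cos(theta))
k4 = scaling_kernel(4, np.cos(theta))

# Integral of the kernel over the sphere: 2*pi * int_{-1}^{1} K(t) dt.
nodes, weights = legendre.leggauss(32)
integral = 2.0 * np.pi * np.sum(weights * scaling_kernel(4, nodes))
print(k3.max(), k4.max(), integral)
```

The peak at $\vartheta = 0$ grows with the scale, which reflects the improving space localization, while the integral over the sphere stays equal to $(\Phi_J)^{\wedge}(0) = 1$.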


3 Key Issues for the Comparison of GRACE and WGHM Data In view of an improvement of existing hydrological models, as e.g., WGHM, by comparing them with measurements based on GRACE data we first perform a multiscale analysis and then compute correlation coefficients. The first part of this section (Sect. 3.1) is therefore dedicated to the tensorial time-space multiresolution which is a method for the detection of temporal and spatial variations on different scales, i.e., sizes of the details. In the second part (Sect. 3.2) we compute the local and global correlation coefficients between GRACE and WGHM data and thus we are able to quantify the resemblance of both data sets at different scales. Some more results concerning the comparison of GRACE and WGHM data can be found in Freeden et al. (2010).

3.1 Tensorial Time-Space Multiresolution For the combination of the temporal multiresolution based on Legendre wavelets with the spatial multiresolution based on spherical wavelets we apply the theory of tensor product wavelets (see, e.g., Louis et al. 1998). This technique allows the transmission of the one-dimensional multiscale analysis to higher dimensions. Figure 3 shows the tensorial time-space multiresolution which provides a unique scale for both space and time domain and three detail parts for each scale, namely two hybrid and one pure detail part. A detailed introduction to this theory can be found in Freeden (1999), Maier (2003), Nutz and Wolf (2008), and the references therein. Starting point of our considerations is the Hilbert space $L^2([-1, 1] \times \Omega)$ where without loss of generality we assume the time interval to be normalized to the interval $[-1, 1]$. The scalar product of $F, G \in L^2([-1, 1] \times \Omega)$ is given by

\[
(F, G)_{L^2([-1,1] \times \Omega)} = \int_{-1}^{1} \int_{\Omega} F(t; \xi) G(t; \xi)\, d\omega(\xi)\, dt.
\]

We presume that the time dependency is completely described by the spherical harmonic coefficients and we have the representation $F(t; \xi) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^{\wedge}(n, k)(t) Y_{n,k}(\xi)$, with $F^{\wedge}(n, k)(t) = \sum_{n'=0}^{\infty} F^{\wedge}(n'; n, k) P_{n'}^*(t)$, where $F^{\wedge}(n'; n, k) = (F, P_{n'}^* Y_{n,k})_{L^2([-1,1] \times \Omega)}$. For notational reasons in the following text $n'$ will always denote the summation index in the time domain (Legendre polynomials), whereas $n$ will be used in the space domain (spherical harmonics). We finally arrive at

\[
F = \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^{\wedge}(n'; n, k) P_{n'}^* Y_{n,k}
\]

in $L^2([-1, 1] \times \Omega)$-sense. A multiresolution of the space $L^2([-1, 1] \times \Omega)$ is given by a nested sequence of scale spaces of the form

\[
\cdots \subset \tilde{V}_J \subset \tilde{V}_{J+1} \subset \cdots \subset L^2([-1, 1] \times \Omega)
\]

and

\[
L^2([-1, 1] \times \Omega) = \overline{\bigcup_{J=-\infty}^{\infty} \tilde{V}_J}^{\|\cdot\|_{L^2([-1,1] \times \Omega)}}.
\]

Fig. 3 Multiresolution of $L^2([-1, 1] \times \Omega)$ with tensor product wavelets. Diagram (multiresolution in time and space, smoothing with decreasing scale): $L^2([-1, 1] \times \Omega) \to \cdots \to \tilde{V}_{J+1} \to \tilde{V}_J \to \tilde{V}_{J-1} \to \cdots \to \tilde{V}_0$, where each step from scale $j+1$ to scale $j$ splits off the three detail spaces $\tilde{W}^1_j$, $\tilde{W}^2_j$, $\tilde{W}^3_j$ for $j = J, J-1, \ldots, 0$. The higher the scale, the finer are the details which are detected

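In this section we follow the presentation of the tensorial time-space multiresolution analysis in Nutz and Wolf (2008) for the definition of the scaling function and the wavelets. Our starting point is the definition of the generating symbol of a time-space scaling function. Let $\{(\Phi'_J)^{\wedge}(n')\}_{n' \in \mathbb{N}_0}$ and $\{(\Phi_J)^{\wedge}(n)\}_{n \in \mathbb{N}_0}$, $J \in \mathbb{Z}$, be the generating symbols of a temporal scaling function and a spatial scaling function, respectively. Then the generating symbol of the time-space (tensor product) scaling function is given by the sequence $\{(\tilde{\Phi}_J)^{\wedge}(n'; n)\}_{n', n \in \mathbb{N}_0}$, with the symbol of the scaling function $(\tilde{\Phi}_J)^{\wedge}(n'; n) = (\Phi'_J)^{\wedge}(n') (\Phi_J)^{\wedge}(n)$. The family of kernel functions $\{\tilde{\Phi}_J\}_{J \in \mathbb{Z}}$ defined by

\[
\tilde{\Phi}_J(s, t; \xi, \eta) = \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} (\tilde{\Phi}_J)^{\wedge}(n'; n) P_{n'}^*(s) P_{n'}^*(t) Y_{n,k}(\xi) Y_{n,k}(\eta),
\]

$s, t \in [-1, 1]$, $\xi, \eta \in \Omega$, denotes the time-space (tensor product) scaling functions. Since we have two refinement equations

\[
\left((\Psi'_J)^{\wedge}(n')\right)^2 = \left((\Phi'_{J+1})^{\wedge}(n')\right)^2 - \left((\Phi'_J)^{\wedge}(n')\right)^2,
\]
\[
\left((\Psi_J)^{\wedge}(n)\right)^2 = \left((\Phi_{J+1})^{\wedge}(n)\right)^2 - \left((\Phi_J)^{\wedge}(n)\right)^2,
\]

which have to be fulfilled simultaneously we get

\[
\begin{aligned}
\left((\Phi'_{J+1})^{\wedge}(n')\right)^2 \left((\Phi_{J+1})^{\wedge}(n)\right)^2
&= \left((\Phi'_J)^{\wedge}(n')\right)^2 \left((\Phi_J)^{\wedge}(n)\right)^2 \\
&\quad + \left((\Psi'_J)^{\wedge}(n')\right)^2 \left((\Phi_J)^{\wedge}(n)\right)^2 \\
&\quad + \left((\Phi'_J)^{\wedge}(n')\right)^2 \left((\Psi_J)^{\wedge}(n)\right)^2 \\
&\quad + \left((\Psi'_J)^{\wedge}(n')\right)^2 \left((\Psi_J)^{\wedge}(n)\right)^2.
\end{aligned}
\]

This leads to the definition of two hybrid wavelets $\tilde{\Psi}^1_J$ and $\tilde{\Psi}^2_J$ and one pure wavelet $\tilde{\Psi}^3_J$:

\[
\tilde{\Psi}^i_J(s, t; \xi, \eta) = \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} (\tilde{\Psi}^i_J)^{\wedge}(n'; n) P_{n'}^*(s) P_{n'}^*(t) Y_{n,k}(\xi) Y_{n,k}(\eta),
\]

$i = 1, 2, 3$, with the hybrid and pure wavelet symbols

\[
\begin{aligned}
(\tilde{\Psi}^1_J)^{\wedge}(n'; n) &= (\Phi'_J)^{\wedge}(n') (\Psi_J)^{\wedge}(n), \\
(\tilde{\Psi}^2_J)^{\wedge}(n'; n) &= (\Psi'_J)^{\wedge}(n') (\Phi_J)^{\wedge}(n), \\
(\tilde{\Psi}^3_J)^{\wedge}(n'; n) &= (\Psi'_J)^{\wedge}(n') (\Psi_J)^{\wedge}(n).
\end{aligned}
\]

We now introduce the time-space convolution of a function $F \in L^2([-1, 1] \times \Omega)$ and a kernel function of the form

\[
\Gamma(s, t; \xi, \eta) = \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Gamma^{\wedge}(n'; n) P_{n'}^*(s) P_{n'}^*(t) Y_{n,k}(\xi) Y_{n,k}(\eta).
\]

The time-space convolution of $\Gamma$ against $F$ is defined by

\[
\begin{aligned}
(\Gamma \star F)(t; \eta) &= \int_{-1}^{1} \int_{\Omega} \Gamma(s, t; \xi, \eta) F(s; \xi)\, d\omega(\xi)\, ds \\
&= \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Gamma^{\wedge}(n'; n) F^{\wedge}(n'; n, k) P_{n'}^*(t) Y_{n,k}(\eta).
\end{aligned}
\]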

The convolution of two kernel functions is defined in analogous manner. Now let $\{\tilde{\Phi}_J\}$ be the time-space scaling functions and $\{\tilde{\Psi}^i_J\}$, $i = 1, 2, 3$, be the associated hybrid and pure time-space wavelets at scale $J$. Then the pure time-space scale spaces are defined by

\[
\tilde{V}_J = \left\{ \tilde{\Phi}_J \star \tilde{\Phi}_J \star F \mid F \in L^2([-1, 1] \times \Omega) \right\},
\]

and the first hybrid, the second hybrid, and the pure time-space detail spaces are given by

\[
\tilde{W}^i_J = \left\{ \tilde{\Psi}^i_J \star \tilde{\Psi}^i_J \star F \mid F \in L^2([-1, 1] \times \Omega) \right\},
\]

$i = 1, 2, 3$. We conclude this section with the following two important properties which guarantee the time-space multiresolution based on tensor product wavelets: let $\{\tilde{\Phi}_J\}$, $J \in \mathbb{Z}$, be a time-space scaling function and $\{\tilde{\Psi}^i_J\}$, $i = 1, 2, 3$, $J \in \mathbb{Z}$, be the associated hybrid and pure time-space wavelets. Suppose that $F \in L^2([-1, 1] \times \Omega)$. Then

\[
F = \lim_{J \to \infty} \left( \tilde{\Phi}_J \star \tilde{\Phi}_J \star F \right)
= \lim_{J \to \infty} \left( \tilde{\Phi}_{J_0} \star \tilde{\Phi}_{J_0} \star F + \sum_{j=J_0}^{J} \sum_{i=1}^{3} \tilde{\Psi}^i_j \star \tilde{\Psi}^i_j \star F \right),
\]

$J_0 \in \mathbb{Z}$, holds true in the sense of the $L^2([-1, 1] \times \Omega)$-metric. Accordingly, for the time-space scale spaces and detail spaces we have

\[
\tilde{V}_J = \tilde{V}_{J_0} + \sum_{j=J_0}^{J-1} \sum_{i=1}^{3} \tilde{W}^i_j
\]

with $J, J_0 \in \mathbb{Z}$, and $J_0 \le J$. In Fig. 4 a graphical illustration of the time-space multiscale analysis calculated with GRACE data is shown.

Fig. 4 Graphical illustration of the time-space multiresolution analysis computed from a time series of GRACE data and exemplarily shown for April 2005. Panels: (a) reconstruction at scale 4, (b) first hybrid detail part at scale 4, (c) second hybrid detail part at scale 4, (d) pure detail part at scale 4, (e) reconstruction at scale 5
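The tensor product construction of this subsection acts on the time-space Fourier coefficients by simple multiplication with the symbols. The following is a minimal sketch, assuming the cubic polynomial symbol in both time and space and a synthetic coefficient array (random numbers, used only for illustration); the names `tensor_symbols` and `apply_twice` and the array layout are hypothetical.

```python
import numpy as np


def phi(J, n):
    """Cubic polynomial scaling symbol, used here for time and space alike."""
    x = 2.0 ** (-J) * np.asarray(n, dtype=float)
    return np.where(x < 1.0, (1.0 - x) ** 2 * (1.0 + 2.0 * x), 0.0)


def psi(J, n):
    """Wavelet symbol from the refinement equation."""
    return np.sqrt(np.clip(phi(J + 1, n) ** 2 - phi(J, n) ** 2, 0.0, None))


def tensor_symbols(J, n_time, n_space):
    """Scaling, hybrid, and pure symbols as arrays indexed by (n', n)."""
    pt, wt = phi(J, n_time)[:, None], psi(J, n_time)[:, None]
    ps, ws = phi(J, n_space)[None, :], psi(J, n_space)[None, :]
    return {"scaling": pt * ps, "hybrid1": pt * ws, "hybrid2": wt * ps, "pure": wt * ws}


def apply_twice(symbol, coeffs):
    """Coefficients of Gamma * Gamma * F: multiply F^(n'; n, k) by the squared symbol."""
    return symbol[:, :, None] ** 2 * coeffs


n_time, n_space = np.arange(0, 33), np.arange(0, 71)
J = 4

# Synthetic coefficients F^(n'; n, k), stored with a padded order axis of length 141.
rng = np.random.default_rng(2)
coeffs = rng.standard_normal((n_time.size, n_space.size, 2 * 70 + 1))

now = tensor_symbols(J, n_time, n_space)
finer = tensor_symbols(J + 1, n_time, n_space)

reconstruction_next = apply_twice(finer["scaling"], coeffs)
sum_of_parts = sum(apply_twice(now[key], coeffs) for key in ("scaling", "hybrid1", "hybrid2", "pure"))
print(np.max(np.abs(reconstruction_next - sum_of_parts)))
```

The reconstruction at scale $J + 1$ equals the reconstruction at scale $J$ plus the two hybrid parts and the pure detail part, which is the coefficient-level form of the decomposition illustrated in Fig. 4.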

3.2 Correlation Analysis Between GRACE and WGHM By the use of the time-space multiresolution analysis, we are in the position to locally measure spatial and temporal changes in the data. With respect to the application of the theory to real data sets as, e.g., hydrological or GRACE data, we need an instrument to compare these results, i.e., we must perform a correlation analysis. The correlation coefficient is a gauge for the variation of two data sets and, thus, helps us to interpret the changes of the data at different scales. Based on the different corresponding detail parts and reconstructions we compute the local correlation coefficients on the continents which reflect the good and bad accordance of the two time series. In addition we compute global correlation coefficients by averaging the local correlation coefficients over the continents. For the definition of the local and global correlation coefficients of time series given on the sphere we use the following notation: we assume that we have $T \in \mathbb{N}$ points in time and $M \in \mathbb{N}$ grid points. The points in time are denoted by $t_i \in [-1, 1]$, $i = 1, \ldots, T$, whereas all grid points are given by $\xi_m \in \Omega$, for $m = 1, \ldots, M$. Since we must take into account the latitude dependence of the grid points, we let $p_m = \cos(\varphi_m)$ be the weight, where $\varphi_m$ denotes the latitude of the corresponding grid point. We want to compare the values of two different time series which we denote by $F$ and $G$, $F, G \in L^2([-1, 1] \times \Omega)$. Then, the local correlation coefficient $c$ at some location $\xi \in \Omega$ is given by

\[
c(\xi) = \frac{\sum_{i=1}^{T} \left( F(t_i; \xi) - \bar{F}(\xi) \right) \left( G(t_i; \xi) - \bar{G}(\xi) \right)}{\sqrt{\sum_{i=1}^{T} \left( F(t_i; \xi) - \bar{F}(\xi) \right)^2 \sum_{i=1}^{T} \left( G(t_i; \xi) - \bar{G}(\xi) \right)^2}},
\]

where the temporal mean values are defined by $\bar{F}(\xi) = \frac{1}{T} \sum_{i=1}^{T} F(t_i; \xi)$ and $\bar{G}(\xi) = \frac{1}{T} \sum_{i=1}^{T} G(t_i; \xi)$. For the global correlation coefficient $gc$ we average the local correlation coefficients over the corresponding grid points, such that we obtain

\[
gc = \frac{\sum_{i=1}^{T} \sum_{m=1}^{M} p_m \left( F(t_i; \xi_m) - \bar{F}(\xi_m) \right) \left( G(t_i; \xi_m) - \bar{G}(\xi_m) \right)}{\sqrt{\sum_{i=1}^{T} \sum_{m=1}^{M} p_m \left( F(t_i; \xi_m) - \bar{F}(\xi_m) \right)^2 \sum_{i=1}^{T} \sum_{m=1}^{M} p_m \left( G(t_i; \xi_m) - \bar{G}(\xi_m) \right)^2}}.
\]
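Both coefficients translate directly into array operations. The following is a minimal sketch, assuming the two time series are stored as arrays of shape (T, M) with one row per point in time and one column per grid point; the function names and the synthetic test data are illustrative and not part of the chapter.

```python
import numpy as np


def local_correlation(F, G):
    """Local correlation coefficient c(xi_m) for every grid point; F, G have shape (T, M)."""
    dF = F - F.mean(axis=0)  # subtract the temporal mean at each grid point
    dG = G - G.mean(axis=0)
    numerator = (dF * dG).sum(axis=0)
    denominator = np.sqrt((dF ** 2).sum(axis=0) * (dG ** 2).sum(axis=0))
    # a grid point with a constant time series would give 0/0 here
    return numerator / denominator


def global_correlation(F, G, latitude_deg):
    """Global correlation coefficient gc with latitude weights p_m = cos(phi_m)."""
    p = np.cos(np.deg2rad(latitude_deg))
    dF = F - F.mean(axis=0)
    dG = G - G.mean(axis=0)
    numerator = (p * dF * dG).sum()
    denominator = np.sqrt((p * dF ** 2).sum() * (p * dG ** 2).sum())
    return numerator / denominator


# Synthetic example: T = 62 months, M = 5 grid points (assumed values).
rng = np.random.default_rng(3)
T, M = 62, 5
latitude = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
F = rng.standard_normal((T, M))
G = F + 0.5 * rng.standard_normal((T, M))

c_local = local_correlation(F, G)
gc = global_correlation(F, G, latitude)
print(c_local, gc)
```

In practice the arrays would hold a reconstruction or a detail part of the GRACE and WGHM potentials evaluated on the continental grid points, one pair of arrays per scale.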

4 Fundamental Results This section is dedicated to numerical results for the tensorial time-space multiresolution and the correlation analysis between the GRACE and WGHM data. The computations have been carried out on the basis of 62 monthly data sets of spherical harmonic coefficients up to degree and order 70 from GRACE and WGHM for the period of August 2002 till September 2007. These data sets have been made available to us from GeoForschungszentrum Potsdam, Department 1, Geodesy and Remote Sensing within the German Ministry of Education and Research (BMBF) project “Observation System Earth from Space.” In case of the spatial analysis we exemplarily present the results of the first hybrid parts of April 2005. The left column of Fig. 5 shows the results based on the GRACE data and the right one shows the corresponding results in case of WGHM data. Note that in case of WGHM, measurements have only been achieved on the continents whereas in case of GRACE data we also have measurements on the oceans. At scale 3 (see Fig. 5a, b) large-area regions are visible. With increasing scale, we have better and better space localization. In case of the temporal analysis the time dependent courses of the second hybrid parts for selected locations, more precisely for Dacca and Kaiserslautern on the Northern hemisphere and Manaus and Lilongwe on the Southern hemisphere, are shown (see Fig. 6). Note that Kaiserslautern has moderate seasonal variations in the water balance, whereas the other three cities are selected exemplarily for well-known regions of great changes (Ganges and Amazonas basin, region around Lake Malawi). In Fig. 6 on the left column the time-dependent courses for the GRACE data and on the right column of Fig. 6 the time dependent courses for the corresponding results based on WGHM data are shown. The seasonal variations can be recognized best at scales 4 and 5. 
Even in case of Kaiserslautern, located in a region with moderate variations, the course of the second hybrid parts clarifies the seasonal course.

Fig. 5 First hybrid parts of the potentials of April 2005 calculated with cubic polynomial wavelet in time and space with different scales. Panels: (a) scale 3 (GRACE), (b) scale 3 (WGHM), (c) scale 4 (GRACE), (d) scale 4 (WGHM), (e) scale 5 (GRACE), (f) scale 5 (WGHM); color scale from $-3 \times 10^{-3}$ to $3 \times 10^{-3}$ m²/s²

In Fig. 7, the local correlation coefficients and, additionally, the global correlation coefficients on the continents between GRACE and WGHM are shown which are computed from the original data sets (see Fig. 7a) and some low-pass and band-pass filtered parts for scales 3 and 4 (see Fig. 7b–f). Red regions correspond to a good correlation of the two underlying time series, whereas the blue regions show the locations with greater variations. In space domain, scale 3 and scale 4 correspond to a region of influence of about 8,000 and 4,000 km, respectively, whereas in time domain we have a time of influence of about 9 and 4 months, respectively. We now exemplarily consider the results for North and South America in detail because these regions show very different correlation coefficients for the coarse reconstruction at scale 3. In case of North America the reconstruction at scale 4 (Fig. 7c) shows a much better correlation than the reconstruction at scale 3 (Fig. 7b). This is traced back to the fact that the details of the size of scale 3 (8,000 km, 9 months) are better correlated than the reconstruction at scale 3 which leads to an improvement of the correlation coefficients in case of the reconstruction at scale 4 (4,000 km, 4 months). In South America, we have an excellent correlation coefficient for the coarse reconstruction at scale 3 which is slightly degraded turning to scale 4. The reason is that the detail parts at scale 3 are somewhat worse correlated than the reconstruction at scale 3.

Fig. 6 Time-dependent courses of the second hybrid parts of the potentials at different locations calculated with cubic polynomial wavelet in time and space with different scales. Panels: (a) Dacca (GRACE), (b) Dacca (WGHM), (c) Kaiserslautern (GRACE), (d) Kaiserslautern (WGHM), (e) Lilongwe (GRACE), (f) Lilongwe (WGHM), (g) Manaus (GRACE), (h) Manaus (WGHM); curves for scales 3, 4, 5, and 6, time axis from July 2002 to July 2007, values between $-0.03$ and $0.03$ m²/s²

Fig. 7 Local correlation coefficients and in brackets the corresponding global correlation coefficients (on the continents) between GRACE and WGHM data computed from the original potential and some low-pass and band-pass parts. Panels: (a) original (gc = 0.75), (b) reconstruction at scale 3 (gc = 0.23), (c) reconstruction at scale 4 (gc = 0.53), (d) first hybrid detail part at scale 3 (gc = 0.54), (e) second hybrid detail part at scale 3 (gc = 0.76), (f) pure detail part at scale 3 (gc = 0.83); color scale from $-1$ to $1$

5 Future Directions In the previous sections, we have presented some mathematical tools for the spatial and temporal analysis of hydrological and satellite data. We have also demonstrated how to compare the results of the multiresolution analysis by the use of correlation coefficients. In order to clarify the local differences between the hydrological model WGHM and the satellite measurements of GRACE, future research must now concentrate on the possibilities of how to take advantage of this knowledge for the improvement of existing hydrological models. In this section, we therefore try to give a first idea of how to interpret the results achieved from the multiscale analysis with the aid of the correlation coefficients in view of a correction of hydrology models. To this end, we propose a filter based on the correlation coefficients and we assume that we have an improvement if the (global and local) correlation coefficients of the filtered GRACE and WGHM data are better than those of the original data. Furthermore, we demand that a very large part of the original signal is reconstructed in the filtered data. Note that the improvement of the correlation coefficients and the increase of the percentage of the filtered signal from the original signal cannot be optimized simultaneously. To find an optimal filter we start with computing the local correlation coefficients of GRACE and WGHM $c_J^{(i)}$, $J \in \mathbb{N}$, $i = 1, 2, 3$, on the continents for the corresponding detail parts $F * \Psi_J^{(i)} * \Psi_J^{(i)}$ and the local correlation coefficients of GRACE and WGHM $c_J$, $J \in \mathbb{N}$, of the reconstructions $F * \Phi_J * \Phi_J$. In addition, we compute the corresponding global correlation coefficients on the continents $gc_J^{(i)}$, $gc_J$. Using these correlation coefficients we derive a weight function $w : [-1, 1] \to [0, 1]$ defined by

\[
w(k) =
\begin{cases}
0, & k \le G_1, \\
\dfrac{1}{G_2 - G_1}\, k - \dfrac{G_1}{G_2 - G_1}, & G_1 < k < G_2, \\
1, & k \ge G_2,
\end{cases}
\]

for controlling the influence of the corresponding parts on the resulting reconstructed signal: if the local correlation coefficient $c_J^{(i)}$ and $c_J$, respectively, is smaller than $G_1$ the corresponding part is not added in the reconstruction formula (2), whereas in case of a correlation coefficient greater than $G_2$ we add the entire corresponding part. In case of $G_1 < k < G_2$ we weight the corresponding part in the reconstruction formula (2) (the higher the correlation coefficient the higher the weights). We finally arrive at the following formula for a reconstruction of the signal:

\[
F^{\mathrm{rec}}_{J_{\max}} = (\Phi_{J_0} * \Phi_{J_0} * F)(\xi)\, w(c_{J_0}(\xi)) + \sum_{j=J_0}^{J_{\max}-1} \sum_{i=1}^{3} \left( \Psi_j^{(i)} * \Psi_j^{(i)} * F \right)(\xi)\, w\!\left( c_j^{(i)}(\xi) \right). \tag{2}
\]
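The weight function and the weighted sum in formula (2) can be sketched as follows. This is an illustration under stated assumptions: the coarse reconstruction and the detail parts are given as arrays of values on the grid points at a fixed time, together with arrays of the matching local correlation coefficients; the names `weight` and `filtered_reconstruction` and the small test arrays are hypothetical.

```python
import numpy as np


def weight(k, g1, g2):
    """Weight function w(k): 0 below G1, linear ramp on (G1, G2), 1 above G2."""
    k = np.asarray(k, dtype=float)
    # (k - G1) / (G2 - G1) equals k / (G2 - G1) - G1 / (G2 - G1)
    return np.clip((k - g1) / (g2 - g1), 0.0, 1.0)


def filtered_reconstruction(coarse, coarse_corr, details, detail_corrs, g1, g2):
    """Weighted sum of the coarse part and all detail parts, following formula (2)."""
    result = coarse * weight(coarse_corr, g1, g2)
    for part, corr in zip(details, detail_corrs):
        result = result + part * weight(corr, g1, g2)
    return result


# Two grid points, one detail part (assumed toy values).
coarse = np.array([1.0, 2.0])
coarse_corr = np.array([0.8, 0.8])
details = [np.array([0.5, 0.5])]
detail_corrs = [np.array([0.9, -0.6])]

rec = filtered_reconstruction(coarse, coarse_corr, details, detail_corrs, g1=-0.3, g2=0.0)
print(weight([-1.0, -0.3, -0.15, 0.0, 0.9], -0.3, 0.0), rec)
```

At the second grid point the detail part is badly correlated (coefficient $-0.6 < G_1$), so it is dropped entirely, whereas at the first grid point it is added in full.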

In order to get the percentage of the reconstructed signal $F^{\mathrm{rec}} = F^{\mathrm{rec}}_{J_{\max}}$ from the original signal $F = F^{\mathrm{orig}}$ we need the energy of a signal $F \in L^2([-1, +1] \times \Omega)$, which is given by $\|F\|^2_{L^2([-1,+1] \times \Omega)} = \sum_{n'=0}^{\infty} \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \left( F^{\wedge}(n'; n, m) \right)^2$, where $F^{\wedge}(n'; n, m)$ are the time-space Fourier coefficients. The percentage $p(F^{\mathrm{rec}}, F^{\mathrm{orig}})$ is then given by

\[
p\left( F^{\mathrm{rec}}, F^{\mathrm{orig}} \right) = \frac{\|F^{\mathrm{rec}}\|_{L^2([-1,+1] \times \Omega)}}{\|F^{\mathrm{orig}}\|_{L^2([-1,+1] \times \Omega)}}.
\]

Table 1 shows the percentage of the reconstruction from GRACE data to the original GRACE data and the correlation coefficients for the corresponding reconstructions between GRACE and WGHM data for different values of $G_1$ and $G_2$. Note that all values of the percentage greater than 88 % and all correlation coefficients greater than 0.83 are in bold numbers. As expected we realize that with decreasing percentage the correlation coefficient goes up. In dependence of the parameters $G_1$ and $G_2$ we have to optimize both percentage and correlation coefficient. In Table 1 we have the best percentage for $G_1 = -0.3$ and $G_2 = 0.0$, if we claim a correlation coefficient greater than 0.83. In Fig. 8 we show the correlation coefficients of the reconstructions of GRACE and WGHM data for $G_1 = -0.3$ and $G_2 = 0.0$. Especially the regions with very bad correlation of the original data and the optimal reconstruction show differences in the local correlation coefficients. To make this more evident, we additionally show in Fig. 9 the differences of the correlation coefficients of the optimal reconstruction ($G_1 = -0.3$ and $G_2 = 0.0$) and the correlation coefficients of the original GRACE and WGHM data. Blue regions in Fig. 9 correspond to regions of good correlation of the hydrological model with the satellite data because in our reconstruction process (see Formula (2)) we did not have to make many corrections. Red regions correspond to those regions with bad correlation coefficient between GRACE and WGHM data. In this case we had to give up much of the detail information in Formula (2) due to the bad correlation coefficient. We want to emphasize that the approach presented in this section is a first idea of how to make use of the information achieved by the multiresolution analysis in view of improving the existing hydrological models. In Werth et al. (2009) a comparative overview of filter techniques based on NSC is given for three global hydrological models (WGHM, GLDAS, and LaD). Research in cooperation of geoscientists


Table 1 Percentage (third column) of the reconstruction with details up to scale 9 from GRACE data to the original GRACE data and correlation coefficients (fourth column) for the corresponding reconstructions between GRACE and WGHM data for different values of G1 and G2

G1      G2      Percentage (in %)   Corr. Coeff.
–       –       100 (original)      0.75
-0.9    -0.8    95.4                0.79
-0.9    -0.4    94.4                0.80
-0.9     0.0    92.7                0.81
-0.9     0.4    89.2                0.82
-0.9     0.8    82.6                0.83
-0.7    -0.4    94.0                0.81
-0.7     0.0    92.1                0.82
-0.7     0.4    88.3                0.83
-0.7     0.8    81.2                0.84
-0.5    -0.4    93.4                0.81
-0.5     0.0    91.4                0.82
-0.5     0.4    87.3                0.83
-0.5     0.8    79.7                0.84
-0.3     0.0    90.5                0.83
-0.3     0.4    86.0                0.84
-0.3     0.8    77.8                0.85
-0.1     0.0    89.3                0.83
-0.1     0.4    84.4                0.85
-0.1     0.8    75.6                0.86
 0.1     0.4    82.4                0.85
 0.1     0.8    72.9                0.87
 0.3     0.4    79.8                0.86
 0.3     0.8    69.6                0.88

Fig. 8 Correlation coefficients of the reconstructions of GRACE and WGHM data for $G_1 = -0.3$ and $G_2 = 0.0$ (global map, 180° W to 180° E and 75° S to 75° N; color scale from -1 to 1)


Fig. 9 Difference of the correlation coefficients of the optimal reconstruction ($G_1 = -0.3$ and $G_2 = 0.0$) and the correlation coefficients of the original GRACE and WGHM data (global map, 180° W to 180° E and 75° S to 75° N; color scale from 0 to 0.7)

and mathematicians is necessary for further progress in the field of extracting filter tools based on multiresolution of hydrology data.
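The percentage $p(F^{\mathrm{rec}}, F^{\mathrm{orig}})$ used in this section is a ratio of $L^2$ norms, each obtained from squared time-space Fourier coefficients. A minimal sketch, assuming the coefficients are available as finite flat lists (the truncation of the infinite series and the function names `l2_norm` and `percentage` are assumptions of this example):

```python
import math


def l2_norm(coeffs):
    """L2 norm computed from a finite list of time-space Fourier coefficients.

    The energy is the sum of squared coefficients; truncating the infinite
    series to the coefficients at hand is an assumption of this sketch.
    """
    return math.sqrt(sum(c * c for c in coeffs))


def percentage(rec_coeffs, orig_coeffs):
    """Ratio of the norms of reconstructed and original signal, in percent."""
    orig = l2_norm(orig_coeffs)
    if orig == 0.0:
        raise ValueError("original signal has zero energy")
    return 100.0 * l2_norm(rec_coeffs) / orig
```

For example, if the reconstruction keeps only the first of two coefficients `[3.0, 4.0]`, the call `percentage([3.0, 0.0], [3.0, 4.0])` gives 60 %.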

6 Conclusion The huge amount of data provided by the satellite mission GRACE makes it possible to quantify both spatial and temporal variations of the Earth’s gravity field. For this reason a time-space multiresolution analysis is presented in this chapter. The basic idea of this method is to transfer the one-dimensional multiscale analysis to higher dimensions; more precisely, a tensor product wavelet analysis using Legendre wavelets for the time domain and spherical wavelets for the space domain is realized. With the corresponding tensor product wavelets we are able to locally analyze (in one step) a time series of the gravitational potential. In particular, the spatial detail information is not smeared over the whole Earth, which is a disadvantage of the classical approach based on spherical harmonics. Based on the results of the tensor product wavelet analysis, we are interested in extracting a global hydrological model from the satellite data and, thus, in improving already existing hydrological models. Therefore, the time series of the GRACE data and those of the existing hydrological model WGHM are compared using a correlation analysis. To this end, local and global correlation coefficients between the original data sets, the detail information, and the reconstructions of GRACE and WGHM are computed. With the aid of these correlation coefficients we look for an “optimal” reconstruction of the GRACE data, i.e., the aim is to find an optimal filter. This is done by constructing a weight function which controls the influence of the corresponding detail parts on the resulting reconstructed signal. For future research, it is necessary to interpret these results not only from the mathematical point of view but also with geoscientific knowledge in order to extract a reasonable optimal global hydrology model from the GRACE satellite data. 
Acknowledgments The authors gratefully acknowledge the support by the German Ministry of Education and Research (BMBF) and German Research Foundation (DFG) within the R&D Programme Geotechnologies Special Programme “Observation System Earth from Space”,


03F0424D, (publication number GEOTECH-317). We are also much obliged to GFZ Potsdam for providing us with all GRACE and WGHM data.

References
Beth S, Viell M (1998) Uni- und multivariate Legendre-Wavelets und ihre Anwendung zur Bestimmung des Brechungsindexgradienten. In: Freeden W (ed) Progress in geodetic science at GW 98, Shaker, pp 25–33
Döll P, Kaspar F, Lehner B (2003) A global hydrological model for deriving water availability indicators: model tuning and validation. J Hydrol 270(1–2):105–134
Freeden W (1999) Multiscale modelling of spaceborne geodata. Teubner, Stuttgart/Leipzig
Freeden W, Schneider F (1998a) An integrated wavelet concept of physical geodesy. J Geod 72:259–281
Freeden W, Schneider F (1998b) Regularization wavelets and multiresolution. Inverse Probl 14:225–243
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences. A scalar, vectorial, and tensorial setup. Springer, Heidelberg
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford Science Publication, Clarendon Press, Oxford
Freeden W, Nutz H, Wolf K (2010) Time-space multiscale analysis and its application to GRACE and hydrology data. In: Flechtner FM, Gruber Th, Güntner A, Mandea M, Rothacher M, Schöne T, Wickert J (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin/London, pp 387–397
Louis AK, Maaß P, Rieder A (1998) Wavelets: Theorie und Anwendungen. Teubner, Stuttgart
Maier T (2003) Multiscale geomagnetic field modelling from satellite data: theoretical aspects and numerical applications. PhD thesis, Geomathematics Group, University of Kaiserslautern
Mallat S (1989a) Multiresolution approximations and wavelet orthonormal bases of $L^2(\mathbb{R})$. Trans Am Math Soc 315:69–87
Mallat S (1989b) A theory for multiresolution signal decomposition. IEEE Trans Pattern Anal Mach Intell 11:674–693
Meyer Y (1992) Wavelets and operators. Cambridge University Press, Cambridge/New York
Müller C (1966) Spherical harmonics, vol 17. Springer, Berlin
Nutz H, Wolf K (2008) Time-space multiscale analysis by use of tensor product wavelets and its application to hydrology and GRACE data. Studia Geophysica et Geodaetica 52:321–339
Swenson S, Wahr J (2006) Post-processing removal of correlated errors in GRACE data. Geophys Res Lett 33:L08402. doi:10.1029/2005GL025285
Swenson S, Wahr J, Milly PCD (2003) Estimated accuracies of regional water storage variations inferred from the gravity recovery and climate experiment (GRACE). Water Resour Res 39(8):1223. doi:10.1029/2002WR001808
Tapley BD, Reigber C (2001) The GRACE mission: status and future plans. EOS Trans AGU 82(47):Fall Meet Suppl G41, C-02
Tapley BD, Bettadpur S, Ries JC, Thompson PF, Watkins MM (2004a) GRACE measurements of mass variability in the earth system. Science 305:503–505
Tapley BD, Bettadpur S, Watkins MM, Reigber C (2004b) The gravity recovery and climate experiment: mission overview and early results. Geophys Res Lett 31:L09607. doi:10.1029/2004GL019920
Werth S, Güntner A, Schmidt R, Kusche J (2009) Evaluation of GRACE filter tools from a hydrological perspective. Geophys J Int 179(3):1499–1515. doi:10.1111/j.1365-246X.2009.04355.x


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_12-4 © Springer-Verlag Berlin Heidelberg 2014

Time-Varying Mean Sea Level
Luciana Fenoglio-Marc (a) and Erwin Groten (b)
(a) Institute of Geodesy, Technical University Darmstadt, Darmstadt, Germany
(b) Institute of Physical Geodesy, Technical University Darmstadt, Darmstadt, Germany

Abstract After a general theoretical consideration of basic mathematical aspects, we give numerical and physical details for the last two decades, for which global and reliable data are available. The concept of a “mean sea level” is in itself rather artificial, because sea level varies at various temporal and spatial scales. This is because the sea is in constant motion, affected by the high- and low-pressure zones above it, the tides, local gravitational differences, and so forth. What is possible is to calculate the mean sea level at a given point and use it as a reference datum. Traditionally, coastal tide gauges measure sea-level variability. Since 1993 satellite altimetry has provided near-global maps of sea-level change with high spatial and temporal resolution. Altimetry is therefore the primary global source of information on sea-level change. Coastal and global sea-level variability are considered here. The coastal variability is estimated from tide gauges, from altimeter data colocated with the tide gauges, and from altimeter data along the world coasts. The global variability is derived from altimeter data. We show that over the interval 1993–2012 the sea level has an average positive trend of 3.2 ± 0.4 mm/year, which is higher than the average trend of 1.8 ± 0.5 mm/year derived from tide gauge data over the twentieth century. The trends are not spatially uniform but regionally dependent. The variability at interannual time scales appears to be related to the variability of climatic indices, like the North Atlantic Oscillation (NAO) and the El Niño-Southern Oscillation (ENSO). At certain coastal locations, non-climatic components of relative sea-level change, mainly subsidence, are locally appreciable. The relative sea-level rise there is up to three to four times larger than the global mean sea-level rise.

1 Introduction The study of transient phenomena differs essentially from usual geodetic investigations where in the past stationary solutions prevailed. Whenever time-dependent boundary value problems (BVPs) are considered, harmonic function solutions have to be carefully selected, because nonstationary potentials are often nonharmonic and, therefore, cannot be handled by geodetic techniques deduced from classical potential theory. Consequently, the investigation of global vertical reference frames in connection with rising sea level, global warming, and coastal deformations for Earth models affected by plate tectonics does not lead to straightforward classical solutions but rather to stepwise and similar approaches where redundant data sets involving a large number of various 


types of observations can yield reliable results. Applications of that kind are not totally uncommon in geodesy, as geodetic systems have always been evaluated for models of the Earth perturbed by tides and other temporal variations where, however, appropriate reduction schemes had to be applied. The study of time-varying mean sea level is a rather complex topic in that connection which benefits from modern satellite techniques. They make precise solutions possible. In this respect geodetic approaches are typically approximation approaches. Sea-level rise is an important aspect of climate change. Records of at least 50 years are needed to separate secular, decadal, and interannual variations (Douglas 2001). Over this long time interval, only a few tide gauge stations along the world coastlines are available for the analysis (Church et al. 2008); the corresponding data indicate a mean sea-level rise in the range of 1–2 mm/year over the twentieth century (Cazenave and Nerem 2004; Church et al. 2004; Church and White 2006; Holgate 2007; Domingues et al. 2008). The spatial distribution of the tide gauge stations as well as the existence of interannual and low-frequency signals affects the recovery of secular trends in short records. Thus, it is important to develop techniques for the estimation of sea-level trends “cleaned” from decadal variability. The analysis based on tide gauge data alone reflects only the coastal sea level, while the global sea-level change is derived from the satellite altimeter missions, which have provided an uninterrupted global record of sea-level changes since 1993. Another fundamental difference between tide gauge and altimeter-derived sea level is that the first is relative to land, while the second is relative to the Earth’s center. Estimates for the rate of the global sea level from altimetry, without accounting for the effect of glacial isostatic adjustment (GIA), are around 2.8 ± 0.4 mm/year over 1993–2003 (Lombard et al. 2005; Bindoff et al. 2007), 3.1 ± 0.4 mm/year in 1993–2006 (Beckley et al. 2007), and 3.1 ± 0.1 mm/year in 1993–2007 (Prandi et al. 2009). Holgate and Woodworth (2004) suggest that the coastal sea level is rising faster than the global mean and give a rate of 4 mm/year for the coastal sea level in 1993–2002. According to White et al. (2005), however, the difference between the rates derived from global altimetry and from coastal tide gauges is due to the different sampling. Finally, Prandi et al. (2009) show that the differences observed are mainly an artifact due to the interannual variability. In addition to the uncertainty of the rate due to the fitting procedure, measurement errors and omission errors are involved. The calibration of altimeter data using colocated altimetry and tide gauge stations gives an error of 0.4 mm/year for the altimetric sea-level change (Mitchum 2000; Leuliette et al. 2004). Jevrejeva et al. (2006) assign an error of 1 mm/year to the global sea-level change derived from tide gauges, due to the nonuniform data distribution. The tide gauge stations used in the various studies differ, from more than 1,000 stations in Jevrejeva et al. (2006) to 91 stations in Prandi et al. (2009). The filling of the incomplete time series reduces the effective number of stations. Fenoglio-Marc and Tel (2010) investigate whether the coastal sea level observed by tide gauges conveniently represents the global sea level obtained using both altimetry and tide gauge stations. The study differs from Prandi et al. (2009) in the selected tide gauge stations and in considering the sea-level averages both globally and regionally. Distinction is made between (a) global sea-level change derived from altimetry (GSL), (b) coastal sea-level change derived from altimetry (CGSL), and (c) coastal sea-level change derived from tide gauges (CGTG). 
First, the global and regional basin averages of sea level are considered and long-term trends are identified, and then the relationship between climatic indices and the main components of the interannual and interdecadal variability is investigated. Similar results are obtained over the extended interval 1993–2012 (Nerem et al. 2010).


Regional analysis of the impact of sea-level rise in the coastal zone shows that at certain locations non-climatic components of relative sea-level change, mainly subsidence, are locally appreciable, and the relative sea-level rise there is found to be up to three to four times larger than the global mean sea-level rise (Cazenave and Llovel 2010; Fenoglio-Marc et al. 2012).
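The rates quoted in this section are slopes fitted to sea-level time series. As a minimal sketch of such a fit, the following uses ordinary least squares for a straight line and reports the formal standard error of the slope. This is an illustration only: the studies cited above may use different estimators (for instance with seasonal terms or correlated-noise models), and the function name `linear_trend` is hypothetical.

```python
import math


def linear_trend(times, values):
    """Ordinary least-squares fit of values ~ a + b * times.

    Returns (b, se_b): the slope and its formal standard error. The error
    estimate assumes uncorrelated residuals, which is usually optimistic
    for sea-level series with interannual variability.
    """
    n = len(times)
    if n < 3 or n != len(values):
        raise ValueError("need at least 3 paired samples")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    # Residual sum of squares with n - 2 degrees of freedom
    rss = sum((v - intercept - slope * t) ** 2 for t, v in zip(times, values))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, se_slope
```

With times in years and sea level in millimeters, the returned slope is in mm/year. Because interannual signals such as ENSO produce correlated residuals, the formal error from this sketch should be read as a lower bound on the actual uncertainty.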

2 Theoretical Considerations One of the key problems related to mean sea level is the global unification of vertical datums. Regionally, e.g., in Europe, the various national leveling systems have in the past been interconnected by different geodetic techniques. One of the pilot products was the UELN (Unified European Levelling Network) or REUN (Réseau Européen Unifié de Nivellement). In crossing smaller oceanic areas, hydrostatic leveling had been applied, e.g., between Denmark and Northern Germany. With the advent of the Global Positioning System (GPS) technique, satellite methods played a significant role, e.g., in the construction of the railway tunnel between France and Great Britain. For precise navigation purposes, the connection of vertical datums between continents gained increasing interest. In this context, satellite altimetry together with recent precise Earth Gravity Models (EGM), e.g., EGM 2008, played a major role. With data from the gravimetric satellites GRACE and GOCE at hand, the previous obstacles in creating a global vertical datum are substantially diminished. The traditional argument that there is no unique solution of the third “mixed” (in the sense of Dirichlet (first) plus Neumann (second) BVP) boundary value problem of potential theory in terms of altimetric data at sea and gravity data on land has long been obsolete. With GPS available all over the Earth, the BVP is considered to be a “fixed” problem. Meanwhile, satellite altimetry and gravity space techniques allow the measurement of both time and space variability of sea level and of a global absolute geoid. The transition from “observed instantaneous” to “mean sea-level” data is a rather complex and numerically complicated process which depends on the validity of the modeling processes involved and on the availability and accuracy of related data sets. 
The exact determination of the N(zero)-term in geoid evaluation, i.e., its scale constant, and the estimation of time-varying sea tide contributions have meanwhile been solved by appropriate modeling procedures. Also, temporal changes in tide gauge positions along the coasts, where sea-level variations are of utmost practical importance, are accurately monitored by GPS. In summary, the space-based positioning, altimetric, and gravimetric techniques (see chapter “Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was Not the First: The History of the Geomagnetic Atlases”) render traditional arguments obsolete. Their main innovation is the use of a single accurate and global reference frame ensuring long-term precise monitoring and integration in a Global Geodetic Observing System (GGOS), which is crucial for many practical applications. In this way geodesy can substantially contribute to the discussion and analysis of rising mean sea level and related ecological questions. Without highly precise monitoring of global mean sea level and its combination with reliable information on time-varying global gravity, there is no solution. Not only secular changes of the mean sea surface in terms of steric and circulation variations, which vary significantly between oceanic regions, can be investigated but also periodic and aperiodic effects. In analyses of secular effects, these different types of time-varying contributions can be separated only over relatively long time spans. A typical example is the El Niño and La Niña effects in the Pacific between Peru and Australia, which affect climate perturbations almost all over the globe. To what extent frequency and amplitude variations of such effects are really interdependent (global warming affects frequency and amplitudes and vice versa) has not yet been answered.


M. Bursa (Bursa et al. 2001) is one of the pioneers in investigating the global vertical datum. His first attempts led to accuracy estimates of about a decimeter, as far as the intercontinental connection of vertical datums is concerned. His various publications since 1997 reveal clearly that the determination of an absolute global geoid is basically a question of uniform global, i.e., unified, scaling including time scales. This is obvious, because today the length scale is defined via the time scale given by an interval of travel time of a photon in vacuo. Moreover, atomic time is directly related to a specific value of the geopotential and thus to the dimension of the Earth. Consequently, the definition and implementation of a global vertical datum is a topic of broad practical and theoretical implications. Formally, the atomic time scale is related to “mean sea-level geopotential,” but this is just one of the different possible definitions. At present, this quantity is identified with the (constant) geoidal potential $W_0$ (Bursa et al. 2007). It is a consequence of the classical definition of the geoid as the global approximation of mean sea level in the least-squares sense. In a transient system, it needs a revision. This revision gains importance both with the increasing precision, i.e., relative accuracy, of time standards, e.g., “cold fountains,” which is of the order of $10^{-18}$, and with the significant secular changes of global gravity and mean sea level. In Bursa et al. (2007) space-based gravity measurements have also been used. These are superior to both shipborne and airborne gravity observations in detecting the time-varying gravity, thanks to their global coverage and repeat period. Still crucial is the coastal zone where altimeter data are less reliable. This is also the boundary between land and sea in the aforementioned “mixed” boundary value problem. 
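To see why the relative accuracy of time standards matters for a vertical datum, one can use the standard first-order gravitational redshift relation, which links the relative frequency difference of two clocks to their geopotential difference. This is a textbook relation, not taken from this chapter; the values of $g$ and $c$ below are rounded, and $\Delta W \approx g\,\Delta h$ assumes a small height difference near the Earth’s surface.

```latex
\frac{\Delta f}{f} \approx \frac{\Delta W}{c^2} \approx \frac{g\,\Delta h}{c^2},
\qquad
\Delta h \approx \frac{c^2}{g}\,\frac{\Delta f}{f}
\approx \frac{(2.998\times 10^{8}\ \mathrm{m/s})^2}{9.81\ \mathrm{m/s^2}} \times 10^{-18}
\approx 0.9\ \mathrm{cm}.
```

Under these assumptions, a relative clock accuracy of the order of $10^{-18}$ corresponds to roughly one centimeter in height, or about $0.09\ \mathrm{m^2/s^2}$ in geopotential.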
Of primary importance is the establishment of a global geocentric vertical reference system. Intercontinental air and sea traffic will strongly benefit from new improved determinations of such a global vertical datum. Open questions such as the melting processes of glaciers, particularly in Arctic and Antarctic regions, will benefit as well. There is no precise solution of such questions without dependable information about the time-varying Earth’s gravity field. The key to these solutions is the transition from steady-state to transitional solutions. Therefore, besides oceanography, glaciology, and precise measuring techniques, geodesy plays a dominant role for exact solutions of related problems by providing the precise global reference systems. The existing results on sea-level rise (secular and periodic) as well as the investigations of underlying processes leading to reliable prediction methods are promising but still controversial. This is not surprising, because GOCE was basically designed as a tool for investigation of steady-state phenomena, whereas GRACE’s aim is primarily the study of temporal changes at larger spatial scales. Due to the short duration of the space mission, the secular gravity changes are difficult to detect by stand-alone methods. Consequently, (numerically) ill-posed and (physically) improperly posed procedures are involved in the hybrid data processes, such as analytical downward continuation of gradiometric satellite results to sea level. Whereas the conversion of accelerations, i.e., gravity, to geopotential differences is stable in general, the opposite procedure of converting altimetric results into sea surface gravity anomalies is not well posed. Only skilful combinations of the available data sets can, e.g., lead to useful results on secular variations. 
The final (stationary) product of a unified global vertical datum would be a value of the geopotential $W_0(x)$ associated with a set of geocentric coordinates $(x)$ related to the equatorial terrestrial system at epoch $(t)$. With present data at hand, the accuracy of the numerical values is assumed to be of the order of $10^{-9}$. The transition to a time-dependent transient geoid $W_0(x, t)$ including secular temporal changes is a challenging goal. A more exact definition of the geoid would be necessary in that case, as is seen from $W_0(x, t)$, because $W_0(x(t))$ would also make sense by keeping $W_0$ constant. The decision in that case depends on the coupling between volume and potential of the Earth. It is easily seen that,


for instance, steric mean sea-level changes can lead to significant changes of the Earth’s volume without affecting the geopotential seriously. Therefore, only modeling processes based on gravity and geometric information, for instance, satellite altimetry, yield satisfying results. The analysis and separation of the different components of sea-level variation are now investigated by several authors. They were first initiated in areas where GPS-controlled tide gauge data of high data density were available, due to the particular interest in coastal areas. Also here the application of space techniques appears superior to terrestrial approaches.

3 Scientific Relevance Sea level is a very sensitive indicator of climate change and variability. It responds to changes in several components of the climate system. Sea level rises due to global warming, as sea waters expand and fresh water comes from melted mountain glaciers. Variations in the mass balance of the ice sheets in response to global warming also have a direct effect on sea level. The modification of the land hydrological cycle due to climate variability and anthropogenic forcing leads to changes in runoff in river basins, hence ultimately to sea-level change. Even the solid Earth affects sea level through small changes of the shape of ocean basins. Coupled atmosphere-ocean perturbations, like the El Niño-Southern Oscillation (ENSO) and the North Atlantic Oscillation (NAO), also affect sea level in a rather complex manner and contribute to sea-level variability. While sea level remained almost stable during the last two millennia, subsequent to the last deglaciation that started 18,000 years ago, tide gauge measurements available since the late nineteenth century have indicated significant sea-level rise during the twentieth century, at a mean rate of about 1.7 mm/year (e.g., Church et al. 2004; Church and White 2006; Holgate 2007). Since early 1993, sea-level variations have been accurately measured by satellite altimetry from the Topex/Poseidon, Jason-1, and Jason-2 missions (Cazenave and Nerem 2004; Leuliette et al. 2004; Beckley et al. 2007). This 15-year-long data set shows that, in terms of global mean, sea level is presently rising at a rate of 3.5 ± 0.4 mm/year, with the GIA correction applied (Peltier 2004, 2009). The altimetry-based rate of sea-level rise is therefore significantly higher than the mean rate recorded by tide gauges over the previous decades, suggesting that sea-level rise is currently accelerating in response to global warming. Owing to its global coverage, altimetry also reveals considerable regional variability in the rates of sea-level change. 
In some regions, such as the Western Pacific or the North Atlantic around Greenland, sea-level rates are several times faster than the global mean, while in other regions, e.g., the Eastern Pacific, sea level has been falling during the past 20 years. It is expected that regional sea-level rise will strongly affect particular regions, with direct impacts including submergence of coastal regions, rising water tables, and salt intrusion into groundwaters. It can possibly also exacerbate other factors such as flooding associated with storms and hurricanes, as well as ground subsidence of anthropogenic nature. There is an urgent need to understand the causes of the observed sea-level rise and its possible development. Over the 1993–2003 time span, sea-level rise was about equally caused by two main contributions: 50 % from warming of the oceans (through thermal expansion) and 50 % from glacier melting and ice mass loss from Greenland and Antarctica (Bindoff et al. 2007). Since 2003 thermal expansion has contributed negligibly to sea level (e.g., Willis et al. 2008), although meanwhile satellite altimetry-based sea level has continued to rise. Accelerated ice mass loss from the Greenland and West Antarctica ice sheets (evidenced from various remote-sensing techniques such as radar and laser altimetry, INSAR, and GRACE space gravimetry) and increase in glacier


melting (e.g., Meier et al. 2007; Rignot et al. 2008) appear to account alone for the sea-level rise of the last 5 years (Cazenave et al. 2009). Changes in each component of the climate system, in ocean, land, and ice sheets, have an appreciable effect on sea level. At global scale the sum of the observed contributions to sea-level rise has been compared to the observed sea-level rise over multi-decadal periods. An improved closure of the global sea-level budget has been obtained both in trends and in mean trend variability (Cazenave and Llovel 2010; Moore et al. 2011; Church and White 2011). Key uncertainties include the role of the Greenland and West Antarctic ice sheets and the amplitude of regional changes in sea level, caused by both climatic and non-climatic components (Henry et al. 2012; Becker et al. 2012). Regional analysis of the impact of sea-level rise in the coastal zone shows that at certain locations non-climatic components of relative sea-level change, mainly subsidence, are locally appreciable, and the relative sea-level rise there is found to be up to three to four times larger than the global mean sea-level rise (Cazenave and Llovel 2010).

4 Data and Methodology of Numerical Treatment We use altimeter data from January 1993 to December 2008 of the Topex/Poseidon, Jason-1, and Envisat altimeter missions. Standard corrections are applied to the altimetry data with the exception of the inverse barometric correction, for consistency with the monthly tide gauge data, which are not corrected either. This correction, given by the MOG2D model, is applied for the comparison between global and coastal sea-level change derived from altimeter data only. Monthly 1° grids are constructed by using a simple Gaussian weighted average method (half-weight equal to 1 and search radius equal to 1.5°). Tide gauge data are available from the Permanent Service for Mean Sea-Level (PSMSL, http://www.pol.ac.uk/psmsl/) database. We correct the sea level at the tide gauge station for the GIA using the SELEN software (Spada and Stocchi 2007) forced with the ICE5-G glaciation history and a viscoelastic Maxwell Earth derived from VM2 (Paulson et al. 2007). Sea-level change derived from the altimeter is also corrected for the GIA effect, and this correction increases the satellite estimates of global sea surface rates by 0.3 mm/year (Church et al. 2004). Selection criteria for the tide gauges are based on the length of the time series and on the data gaps. The stations available in each of the three intervals 1900–2001, 1950–2001, and 1993–2001 number 1,158, 1,103, and 738, respectively. A station is used if it is available over 90 % of the interval with gaps shorter than 2 years (availability criteria); the stations fulfilling these criteria number 15, 117, and 365, respectively, and are mostly located in the northern hemisphere along the European and North American coasts. A further criterion for the selection of tide gauges is based on proximity and agreement with satellite altimetry. 
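The availability criterion stated above (data present over 90 % of the interval, gaps shorter than 2 years) can be sketched as follows. The sketch assumes a monthly series with `None` marking missing months; the function name `is_available`, its parameters, and the reading of “shorter than 2 years” as fewer than 24 consecutive missing months are assumptions of this example.

```python
def is_available(monthly_values, min_fraction=0.9, max_gap_months=24):
    """Check the availability criterion for one tide gauge record.

    monthly_values: list with one entry per month of the interval, where
    None marks a missing month. The station passes if the fraction of
    present months is at least min_fraction and the longest run of
    missing months is shorter than max_gap_months.
    """
    n = len(monthly_values)
    if n == 0:
        return False
    present = sum(v is not None for v in monthly_values)
    longest_gap = 0
    current_gap = 0
    for v in monthly_values:
        if v is None:
            current_gap += 1
            longest_gap = max(longest_gap, current_gap)
        else:
            current_gap = 0
    return present / n >= min_fraction and longest_gap < max_gap_months
```

A record with 6 consecutive missing months out of 120 passes, while a record that is 92.5 % complete but contains a 30-month gap is rejected.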
The chosen stations will represent the large-scale open sea-level variability, and stations describing the local sea-level variability are eliminated (Fenoglio-Marc et al. 2004). A station is eliminated if the minimum distance from a point of the altimeter grid is greater than 2°. For each tide gauge station, we consider the nearest node of the altimeter grid within a 2° radius, and four parameters, (1) the correlation, (2) the trend of the difference, (3) the standard error of the trend of the difference, and (4) the standard deviation of the difference, are used as indicators of the agreement between altimetry and tide gauge data. The criteria are correlation >0.5, the trend and standard error of the trend

(21)

with

$$\langle l \rangle = \frac{\sum_l l\, E_k(l)}{\sum_l E_k(l)}, \qquad (22)$$

where $E_k(l)$ is the kinetic energy carried by all modes with spherical harmonic degree $l$. Table 1 lists the input and diagnostic parameters introduced above for Earth. The numbers highlight three characteristics for the force balance in planetary dynamo regions: at $E = 10^{-15}$, viscous effects are likely negligible on the larger length scale of interest in the dynamo process, and inertial forces should play a minor role at $Ro = 2 \times 10^{-6}$. The associated dynamical consequences can be discussed in terms of a bifurcation scenario that starts with the onset of convective motions at a critical Rayleigh number $Ra_c$. Dynamo action sets in at higher Rayleigh numbers where the larger flow amplitude pushes the magnetic Reynolds number beyond its critical value. The small Ekman number of planetary dynamos suggests that viscous effects are negligible. Close to $Ra_c$, the Coriolis force is then predominantly balanced by pressure gradients in the so-called geostrophic regime. The Taylor-Proudman theorem states that the respective flow assumes a two-dimensional configuration, minimizing variations in the direction $\hat{z}$ along the rotation axis: $\partial \mathbf{U}/\partial z \to 0$ for $E \to 0$. Buoyancy, however, must be reinstated to allow for convective motions that necessarily involve a radial and thus non-geostrophic component. To facilitate this, viscous effects balance those Coriolis force contributions which cannot be balanced by pressure gradients (Zhang and Schubert 2000). At lower Ekman numbers, the azimuthal flow length scale decreases so that viscous effects can still provide this necessary balance: the azimuthal wave number $m_c$ at onset of convection scales like $E^{-1/3}$. The impeding effect of the two dimensionality causes the “classical” critical Rayleigh

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Fig. 3 Flow in nonmagnetic convection simulations at different parameter combinations (panels: $E = 10^{-3}$, $Ra = 2\,Ra_c$; $E = 10^{-3}$, $Ra = 8.1\,Ra_c$; $E = 3 \times 10^{-5}$, $Ra = 1.8\,Ra_c$). The top row shows isosurfaces of the z-component of the vorticity $\nabla \times U$, the bottom row shows isosurfaces of the z-component of the flow. The Prandtl number is unity in all cases

number to rise with decreasing $E$: $Ra_c \sim E^{-4/3}$. Inertial effects can contribute to balancing the Coriolis force at larger Rayleigh numbers and thus reduce the two dimensionality of the flow. Figure 3 shows the principal convective motions and the effect of decreasing the Ekman number and increasing the Rayleigh number. The convective columns are illustrated with isosurfaces of the z-component of vorticity in the upper panels. Red isosurfaces depict cyclones that rotate in the same direction as the overall prograde rotation $\Omega$, and blue isosurfaces show anticyclones that rotate in the opposite sense. These columns are restricted to a region outside the so-called tangent cylinder that touches the inner-core boundary (see Fig. 2). Inside the tangent cylinder, buoyancy is primarily directed along the axis of rotation so that the Taylor-Proudman theorem is even more restrictive here. The convective motions therefore start at higher Rayleigh numbers here, a few times $Ra_c$, and are plume-like rather than column-like. The meridional circulations in the north/south plane involve flow components along the direction of the rotation axis which violate the Taylor-Proudman theorem but are unavoidable in convection. The lower panels of Fig. 3 illustrate the meridional circulation with isosurfaces of the flow z-component which is directed equatorward in the (red) cyclones and poleward in the (blue) anticyclones. Inside the tangent cylinder, the meridional circulation is mostly outward close to the pole and inward close to the tangent cylinder. Decreasing the Ekman number thus has two principal effects: first, an increase of the rotational constraint promoting a more ordered geostrophic structure and second, a decrease of the length scales. The dynamics of the smaller flow structures also requires smaller numerical time steps, and both effects significantly increase the numerical costs. For example, the time step required for the simulation at $E = 3 \times 10^{-5}$ shown in Fig. 3 is three orders of magnitude smaller than the time step required at $E = 10^{-3}$. Reaching smaller Ekman numbers is therefore numerically challenging and the available computing power limits the attainable values. When the Rayleigh number is increased beyond a certain critical value (see the discussion in section “Subcritical Dynamos and the Nature of the Dynamo Bifurcation”), dynamo action sets in and Lorentz forces can contribute to balancing the Coriolis force and further relax the two dimensionality. On theoretical grounds, the Lorentz force is thought to enter the leading order


force balance in order to saturate magnetic field growth, and this seems to be confirmed by the geomagnetic value of $\Lambda$ (see Table 1). The magnetostrophic balance assumed to rule planetary dynamo dynamics therefore involves Coriolis forces, pressure gradients, buoyancy, and Lorentz forces and is thought to be characterized by an Elsasser number of order one. We will further discuss the influence of Lorentz forces on the flow dynamics in Sect. 3.6. An Ekman number of about $E = 10^{-6}$ seems to be the current limit for numerical dynamo simulations (Kageyama et al. 2008; Miyagoshi et al. 2010, 2011). Increasing the Rayleigh number in order to retain dynamo action and to yield a more realistic structure and time dependence further decreases both length scales and time scales, as we will discuss in Sect. 3. The parameters for a few representative dynamo simulations (simple, advanced, and high end) are listed along with the geophysical values in Table 1. Simple dynamos are characterized by moderate Ekman numbers, $E = 10^{-3}$ or even larger, which yield a large-scale solution that can be computed with modest numerical efforts. These dynamos therefore lend themselves to studying long-time behavior such as field reversals, to exploring the dependencies on the parameters other than $E$, and to unraveling the 3D dynamo mechanism. The simple drifting behavior found at low Rayleigh numbers made the case BM II listed in Table 1 an ideal candidate for a numerical benchmark (Christensen et al. 2001). Model E6 represents the high end of the spectrum at $E = 3 \times 10^{-6}$. Its solution is very small scale and complex and exhibits a chaotic time dependence. The dynamos at $E = 3 \times 10^{-5}$ listed in Table 1 are typical advanced models that can be simulated on today’s midrange parallel computing systems on a more or less regular basis. Figure 4 shows the rms force balance for four different dynamo models, three of which are listed in Table 1.
The individual contributions have been normalized with the rms Coriolis force. In all simulations, Coriolis force and pressure gradient clearly dominate, which explains the high degree of geostrophy characterizing typical dynamo solutions. Lorentz force contribution L and buoyancy B are somewhat weaker, and it remains debatable whether they should be regarded as part of the first-order force balance (Christensen et al. 1999; Soderlund et al. 2012). Acceleration A and inertia I, or more precisely momentum advection, form a lower-order balance at small Rayleigh numbers. When $Ra$ increases, their relative contribution grows and reaches a level of around 0.1 at the transition to the multipolar regime as correctly predicted by the local Rossby number. The local Rossby number thus seems to offer a fair estimate for the true rms force balance. Soderlund et al. (2012) remark that this transition roughly coincides with the point where the inertial contribution exceeds the viscous contribution V as is confirmed by Fig. 4 (compare the dipolar model E5R18 with the multipolar model E5R43). Once in the multipolar regime, inertia has nearly reached the level of the Lorentz force. It then becomes dubious to consider the latter as part of a first-order (magnetostrophic) force balance but not the former. The viscous contribution V is surprisingly high in all the models. A comparison of the three cases at $E = 3 \times 10^{-5}$ shows that V increases with Rayleigh number because the length scale of the solutions decreases. Since viscous effects are strongly dominated by the smallest length scales, the Ekman number cannot directly be used to estimate the true rms contribution. A better estimate would include a typical viscous length scale $\ell_V$ via $V \sim E/\ell_V^2$. Using $V \approx 0.1$, a typical value for the larger Rayleigh number solutions at $E = 3 \times 10^{-5}$, suggests $\ell_V = (E/V)^{1/2} \approx 0.02$, which is an order of magnitude smaller than the mean flow scale $\ell_u$.
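The estimate of the viscous length scale can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name `viscous_length_scale` is hypothetical, the inputs are the values quoted above ($E = 3 \times 10^{-5}$, $V \approx 0.1$), and lengths are understood in units of the reference length scale.

```python
import math


def viscous_length_scale(ekman, v_rms):
    """Invert V ~ E / l_V**2 for the viscous length scale l_V.

    ekman : Ekman number E
    v_rms : rms viscous force relative to the rms Coriolis force
    """
    if ekman <= 0 or v_rms <= 0:
        raise ValueError("E and V must be positive")
    return math.sqrt(ekman / v_rms)


# Values quoted in the text for the larger Rayleigh number cases.
l_v = viscous_length_scale(3e-5, 0.1)
print(f"l_V = {l_v:.3f}")  # about 0.017, i.e. roughly 0.02
```

Because $\ell_V$ depends only on the square root of $E/V$, even a factor-of-two uncertainty in V changes the estimate by about 40 % and leaves the order of magnitude unaffected.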
This suggests that the rms value V overestimates the effect of viscosity on the larger typical length scales relevant for magnetic field generation. Figure 4 also illustrates that the Elsasser number tends to overestimate the relative importance of the Lorentz force. For all the cases examined here, $\Lambda$ exceeds one, while L clearly remains below


Fig. 4 Force balance in the Navier-Stokes equation for four different dynamo models ($E = 10^{-3}$, $Ra/Ra_c = 2$, BM II; $E = 3 \times 10^{-5}$, $Ra/Ra_c = 9$; $E = 3 \times 10^{-5}$, $Ra/Ra_c = 18$, E5R18; $E = 3 \times 10^{-5}$, $Ra/Ra_c = 43$, E5R43). Shown on a logarithmic axis are rms force contributions due to total acceleration A, viscosity V, inertia (or momentum advection) I, Coriolis force C, Lorentz force L, pressure gradient P, and buoyancy B. All contributions have been normalized with the Coriolis contribution. See Table 1 for more information on the different dynamo models

unity. In analogy to the local Rossby number, Soderlund et al. (2012) therefore suggest a modified dynamic Elsasser number that provides a better estimate. The classical Elsasser number assumes that the electrical current density can be approximated by $\sigma u b$ based on Ohm’s law. The dynamic Elsasser number $\Lambda_d$ allows a more direct estimate based on the ratio of the Lorentz and Coriolis terms in the Navier-Stokes equation:

$$\Lambda_d = \frac{b^2}{2 \mu \bar{\rho} \Omega u \ell_b} = \frac{\Lambda}{Rm}\,\frac{\ell}{\ell_b}. \qquad (23)$$

The typical magnetic length scale $\ell_b$ can be calculated similarly to $\ell_u$ when replacing the kinetic with the magnetic energy (see Soderlund et al. (2012) for a precise definition). A comparison of the parameters for the most advanced model E6 in Table 1 with Earth values shows that: (1) The Ekman number is still about nine orders of magnitude too large. (2) The Rossby number is three orders of magnitude too large. (3) The magnetic Prandtl number is six orders of magnitude too large and (4) the Alfvén Mach number is about one order of magnitude too large. On the other hand: (5) The magnetic Reynolds number reaches realistic values and (6) the Elsasser number is also about right. In terms of the dynamics, this means that: (1) Viscous effects are much too large, which is necessary to suppress unresolvable small-scale flow features. (2) Inertial effects are overrepresented. (3) Magnetic diffusion is much too low compared to viscous diffusion. (4) The numerical dynamos are too inefficient and produce weaker magnetic fields for a given flow amplitude than Earth. On the positive side: (5) The ratio of magnetic field production to diffusion is realistic and (6) the relative (rms) impact of the Lorentz forces on the flow seems correctly modeled. The role of the local Rossby number will further be discussed in the following section.
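The spectral definitions behind these diagnostics are easy to sketch in code. The example below is a minimal illustration under stated assumptions, not the procedure of any particular dynamo code: the mean degree follows Eq. (22), the conversion to a length scale as $\pi/\langle l \rangle$ is an assumed form (the defining equation is not reproduced in this section), the dynamic Elsasser number uses the second form of Eq. (23), and the spectrum and parameter values are made up.

```python
import math


def mean_degree(spectrum):
    """Energy-weighted mean spherical harmonic degree, Eq. (22).

    spectrum : dict mapping degree l -> energy carried by that degree
    """
    total = sum(spectrum.values())
    if total <= 0:
        raise ValueError("spectrum must contain positive energy")
    return sum(l * e for l, e in spectrum.items()) / total


def length_scale(spectrum):
    """Typical length scale pi / <l> (assumed form, in units of the reference length)."""
    return math.pi / mean_degree(spectrum)


def dynamic_elsasser(elsasser, rm, l_b, l_ref=1.0):
    """Second form of Eq. (23): Lambda_d = (Lambda / Rm) * (l / l_b)."""
    return elsasser / rm * l_ref / l_b


# Made-up magnetic energy spectrum, for illustration only.
magnetic_spectrum = {1: 4.0, 2: 1.0, 3: 1.0}
l_b = length_scale(magnetic_spectrum)  # <l> = 1.5 for this spectrum
lambda_d = dynamic_elsasser(elsasser=10.0, rm=500.0, l_b=l_b)
print(f"<l> = {mean_degree(magnetic_spectrum):.2f}, "
      f"l_b = {l_b:.2f}, Lambda_d = {lambda_d:.4f}")
```

With a magnetic Reynolds number of several hundred, $\Lambda_d$ stays well below one even when $\Lambda$ exceeds one, which is the qualitative behavior described for Fig. 4.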


Fig. 5 Regime diagrams in the ($Ra/Ra_c$, $Pm$) plane, showing (a) the magnetic Reynolds number $Rm$ and (b) the Elsasser number $\Lambda$, that illustrate the transition from stable dipole-dominated dynamos (regime D) to constantly reversing multipolar dynamos (regime M). Models showing Earth-like rare reversals can be found at the transition in regime E. Gray symbols mark nonmagnetic convective solutions (regime C). Different symbols code the time dependence: squares = drifting, upward pointed triangles = oscillatory, circles = chaotic, diamonds = Earth-like rarely reversing, downward pointed triangles = constantly reversing. The Ekman number is $E = 10^{-3}$ and the Prandtl number is $Pr = 1$ in all cases

3.2 Dynamo Regimes

One possible strategy for choosing the parameters for a numerical dynamo model is to fix the Prandtl number to $Pr = 1$, appropriate for thermal convection, and to choose $E$ as small as the available numerical computing power permits. The Rayleigh number and the magnetic Prandtl number $Pm$ are then varied until the critical magnetic Reynolds number is exceeded and dynamo action starts. Note, however, that some authors have chosen a different approach in selecting their parameters (Glatzmaier 2002). Figure 5 shows the dependence of $Rm$ on $Ra$ and $Pm$ at $E = 10^{-3}$ and $Pr = 1$. Rigid flow and fixed temperature boundary conditions have been assumed. Larger $Ra$ values yield larger flow amplitudes $u$, while larger $Pm$ values are synonymous with lower magnetic diffusivities $\lambda$ (or larger electrical conductivity $\sigma$). The increase of either input parameter thus leads to larger magnetic Reynolds numbers $Rm = u\ell/\lambda$ and finally promotes the onset of dynamo action. The minimal critical magnetic Reynolds number is about 50 here, a value typical for spherical dynamo models (Christensen and Aubert 2006). Figure 5 demonstrates that the increase of $Rm$ and $\Lambda$ with $Pm$ depends on the Rayleigh number, which is an expression of changes in the dynamo mechanism and efficiency. Generally, the increase of $Rm$ with $Pm$ is slower than linear due to the growing field strength and thus intensified back reaction of the Lorentz force on the flow. Figure 5 highlights the four main dynamical regimes that have been identified in several extensive parameter space studies (Kutzner and Christensen 2002; Christensen and Aubert 2006; Amit and Olson 2008; Takahashi et al. 2008a; Wicht et al. 2009). C denotes the purely convective regime, and D is the regime where dipole-dominated magnetic fields are obtained. When $u$ becomes large at values of $Ra/Ra_c$ between 8 and 9, the dynamo changes its character and seems to become less efficient.
The dipole component loses its special role which leads to a multipolar field configuration (hence regime M for multipolar dynamo regime) and an overall weaker field as indicated by the smaller Elsasser number (see Fig. 5b). In addition, the critical magnetic Reynolds number increases.


Fig. 6 Reversal sequence in model E3R9 that typifies the behavior of dynamos in regime E (see Fig. 5). The top panel shows the dipole tilt P (in degrees), the bottom panel shows the magnetic dipole moment DM (in $10^{22}$ A m$^2$), rescaled by assuming Earth-like rotation rate, core density, and electrical core conductivity; both are plotted against time (in Myr). Excursions where the magnetic pole ventures further away than 45° in latitude from the geographic pole, but then returns, are marked in light gray. See Wicht et al. (2009) for further explanation

The dipole polarity, which remains stable in regime D, frequently changes in regime M. Earth-like reversals, where the polarity stays constant over long periods and reversals are relatively rare and short events, happen at the transition between the regimes D and M and define regime E. It is somewhat difficult to come up with a clear and unambiguous definition of “Earth-like” reversals. We follow Wicht et al. (2009) here, demanding that the magnetic pole should not spend more than 10 % of the time in transitional positions farther away than 45° from either pole. Moreover, the dipole field should amount to at least 20 % of the total field strength at the outer boundary on time average (see Sect. 3). Figure 6 shows a time sequence for the Earth-like reversing model E3R9 (model T4 in Wicht et al. 2009) that exhibits ten reversals and several excursions where the magnetic pole also ventures farther away than 45° from the closest geographic pole but then returns. The left boundary of regime E is difficult to pin down since it is virtually impossible to prove numerically that a dynamo never reverses (Wicht et al. 2009). The regime changes from D to E and further to M are attributed to the increased importance of the nonlinear inertial effects at larger Rayleigh numbers. They seem to have reached a critical strength compared to the ordering Coriolis force at the transition to regime M. Reversals typically set in at a critical local Rossby number around $Ro_{\ell c} = 0.1$ (Christensen and Aubert 2006); the precise value depends on model details like the heating mode and the inner-core size (Aubert et al. 2009). Once in regime E, the ever present fluctuation in the inertial forces may suffice to drive the dynamo into regime M for a short while and thus facilitate reversals (Kutzner and Christensen 2002). Figure 16, discussed in more detail below, shows the location of the different regimes with respect to the local Rossby number $Ro_\ell$. As already discussed above, Soderlund et al. (2012) report that the transition to reversing dynamos happens when the rms inertial force exceeds the rms viscous force and thus gains more influence in their simulations. This is confirmed by the example shown in Fig. 4, but the significance remains unclear. The scenario outlined for $E = 10^{-3}$ above is repeated at lower Ekman numbers with a few changes. Most notably, the boundary towards the inertia-influenced dynamo regime M shifts towards more supercritical Rayleigh numbers. The critical relative strength of inertial effects is reached at larger flow amplitudes $u$ since the rotational constraint is stronger at lower $E$ values. The critical magnetic Reynolds number is therefore reached at lower magnetic diffusivities $\lambda$


which means that the minimum magnetic Prandtl number $Pm_{min}$ where dynamo action is still retained decreases (Christensen and Aubert 2006). While $Pm_{min}$ is about 4 at $E = 10^{-3}$ (compare Fig. 5), it decreases to roughly 0.1 at $E = 10^{-5}$ (Christensen and Aubert 2006). Christensen and Aubert (2006) suggest that $Pm_{min} = 450\,E^{0.75}$. An extrapolation for Earth’s Ekman number yields $Pm_{min} \approx 10^{-8}$, which is safely below the geophysical value of $Pm = 3 \times 10^{-7}$. This predicts that a realistic magnetic Prandtl number can indeed be reached at realistic Ekman numbers. In a few instances, it has been reported that both dipole-dominated and multipolar solutions can be found at identical parameters (Christensen and Aubert 2006; Simitev and Busse 2009; Gastine et al. 2012). This phenomenon becomes more typical when stress-free rather than rigid outer boundary conditions are used, which allow stronger zonal winds to develop. Two distinct solution branches then coexist for local Rossby numbers below the critical value for the transition to regime M: a branch with dipole-dominated stronger magnetic fields and weak zonal flows and a branch with multipolar weaker magnetic fields but strong zonal flows. Strong zonal flows and dipole-dominated fields seem mutually exclusive. Another example where a dipolar and a multipolar branch coexist is tied to subcritical dynamo action further discussed in section “Subcritical Dynamos and the Nature of the Dynamo Bifurcation.”
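Two quantitative rules from this section lend themselves to a short sketch: the fit $Pm_{min} = 450\,E^{0.75}$ and the two-part criterion for “Earth-like” behavior adopted from Wicht et al. (2009). The function names and the representation of the time series as equally spaced samples are assumptions made for this illustration only.

```python
def pm_min(ekman):
    """Fit quoted from Christensen and Aubert (2006): Pm_min = 450 * E**0.75."""
    return 450.0 * ekman ** 0.75


def meets_earthlike_criteria(pole_lat_deg, dipole_fraction,
                             max_transitional=0.10, min_dipole=0.20):
    """Check the two conditions used in this section for Earth-like behavior.

    pole_lat_deg    : magnetic pole latitudes in degrees, equally spaced in time
    dipole_fraction : relative dipole strength at the outer boundary, same sampling
    """
    if not pole_lat_deg or not dipole_fraction:
        raise ValueError("time series must not be empty")
    # Transitional: farther than 45 degrees away from either geographic pole.
    transitional = sum(1 for lat in pole_lat_deg if abs(lat) < 45.0)
    time_share = transitional / len(pole_lat_deg)
    mean_dipole = sum(dipole_fraction) / len(dipole_fraction)
    return time_share <= max_transitional and mean_dipole >= min_dipole


for e in (1e-3, 1e-5, 1e-15):
    print(f"E = {e:.0e}: Pm_min = {pm_min(e):.1e}")
```

The fit returns values of the same order as those quoted above for $E = 10^{-3}$ and $E = 10^{-5}$ and a value well below $3 \times 10^{-7}$ for $E = 10^{-15}$. Note that the criterion function only bounds the time spent in transitional positions and the mean dipole contribution; it does not test whether any reversal occurs, so a stable dipolar run in regime D passes it as well.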

3.3 Scaling Laws

There is no unique way to rescale the dimensionless solutions to the planetary situation. Take, for example, time. Several different time scales enter the problem: the rotation time $\tau_\Omega = \Omega^{-1}$, the fluid turnover time $\tau_u = \ell/u$, the magnetic diffusion time $\tau_\lambda = \ell^2/\lambda$, the viscous diffusion time $\tau_\nu = \ell^2/\nu$, and the thermal diffusion time $\tau_\kappa = \ell^2/\kappa$. The first three are more directly relevant for the dynamo process. The ratio of $\tau_\lambda$ and $\tau_u$ is the magnetic Reynolds number and both are often used to rescale the simulations. Olson et al. (2012) demonstrate that using the turnover time yields a temporal Fourier power spectrum that is more similar to the geomagnetic one. Simulations with a realistic magnetic Reynolds number (around 500) have the advantage that using either time scale provides equivalent results. The ratio of $\tau_\Omega$ and $\tau_\lambda$ is the magnetic Ekman number

$$E_\lambda = \frac{\tau_\Omega}{\tau_\lambda} = \frac{\lambda}{\Omega \ell^2} = E\,Pm^{-1}. \qquad (24)$$

Since $E$ is too large in the simulations, this is also true for $E_\lambda$. When using either $\tau_u$ or $\tau_\lambda$ for rescaling a dynamo simulation, this automatically means that the rotation rate is orders of magnitude too slow. Some authors argue that choosing larger $Pm$ values and thus small magnetic Ekman numbers may cure this problem (Christensen et al. 2010), and we come back to this issue in section “Dipole Properties and Magnetic Field Symmetries.” Dipole-dominated dynamos, even Earth-like reversing dynamos, are found for a wide range of Elsasser numbers (see Fig. 5). We have already remarked on the fact that the classical Elsasser number may not provide a good estimate of the true importance of the Lorentz force in dynamo simulations. The assumption that $\Lambda$ has to be of order one, however, is the basis for a classical conjecture that the magnetic field strength should be proportional to the square root of the planetary rotation rate since $b^2 \propto \mu\bar{\rho}\lambda\Omega$ at fixed $\Lambda$. The dimensionless field strengths provided by dynamo simulations are therefore often rescaled by assuming Earth-like values for $\Omega$, core density $\bar{\rho}$, and electrical conductivity $\sigma = 1/(\lambda\mu)$ (see, e.g., Fig. 6).
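The identity in Eq. (24) follows directly from the time scale definitions. As a worked check, assuming the usual definitions $E = \nu/(\Omega\ell^2)$ and $Pm = \nu/\lambda$ (not restated in this section) and the Earth values quoted in this chapter ($E = 10^{-15}$, $Pm = 3 \times 10^{-7}$):

```latex
E_\lambda = \frac{\tau_\Omega}{\tau_\lambda}
          = \frac{\Omega^{-1}}{\ell^2/\lambda}
          = \frac{\lambda}{\Omega \ell^2}
          = \frac{\nu}{\Omega \ell^2}\,\frac{\lambda}{\nu}
          = E\,Pm^{-1},
\qquad
E_\lambda^{\mathrm{Earth}} \approx \frac{10^{-15}}{3 \times 10^{-7}}
          \approx 3 \times 10^{-9}.
```

A simulation at, say, $E = 3 \times 10^{-5}$ and $Pm$ of order one therefore has a magnetic Ekman number several orders of magnitude larger than this estimate, which is the sense in which the rescaled rotation is too slow.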


Dynamo simulations can be used to check and revise scaling laws, at least within the covered parameter regime. A large suite of dynamo models supports the law

$$b^2 \sim f_{Ohm}\,\bar{\rho}^{1/3}\,(F q_o)^{2/3}, \qquad (25)$$

where $f_{Ohm}$ is the fraction of the available power that is converted to magnetic energy and $F q_o$ is the total thermodynamically available power (Christensen 2010). The form factor $F$ subsumes the radial dependence of the convective vigor and is typically of order one; $q_o$ is the heat flux through the outer boundary. This scaling law not only successfully predicts the field strength for several planets in our solar system (Olson and Christensen 2006; Christensen and Aubert 2006; Yadav et al. 2013) but also for some fast-rotating stars whose dynamo zones may obey dynamics similar to those found in the planetary counterparts (Christensen et al. 2009). This suggests that all operate in a similar regime and thus implies that the dynamical differences play no major role, i.e., that viscous and inertial effects are already small enough in the simulations to capture at least the primary features of the geodynamo and that the ratio of viscous to ohmic dissipation is not essential. We refer to Christensen (2010) for a detailed comparison of the different scaling laws that have been proposed over time, not only for the magnetic field strength but also for the flow vigor and other dynamo properties. The power-dependent scaling law for the local Rossby number developed by Christensen and Aubert (2006) predicts the value of $Ro_\ell \approx 0.09$ for Earth that is listed in Table 1. This nicely agrees with the fact that the numerical simulations show Earth-like reversals in this parameter range. Inertial effects may thus play a much larger role in the geodynamo than previously anticipated based on the small Rossby number (Wicht and Christensen 2010). Scaled to Earth values, the associated length scale $\ell_u$ (Eq. 21) corresponds to roughly 100 m. It is hard to conceive that such a small length scale should matter in the dynamo process, let alone influence its reversal behavior, but nonlinear interactions may play tricks here.
The meaning of the local Rossby number and the related scaling laws needs to be explored further to understand their relevance. The fact that the power-dependent scaling law Eq. (25) is independent of the Ekman number seems good news for dynamo simulations. However, the numerical results still allow for a weak dependence on the magnetic Prandtl number (Christensen and Aubert 2006; Christensen 2010; Stelzer and Jackson 2013). This may amount to a significant change when extrapolating the simulations at $Pm \approx 0.1$ to the planetary value of $Pm = 3 \times 10^{-7}$. The much smaller viscous diffusion at $Pm \ll 1$ allows for significantly smaller scales in the flow than in the magnetic field. The smaller flow scales should thus have no direct impact on the dynamo process in planetary dynamo regions but may still play a role for the magnetic field generation in dynamo simulations. However, the nonlinearities couple flow and magnetic field of different scales and may lead to complicated interactions. Simulations at lower magnetic Prandtl numbers are required to clarify these issues but, as outlined above, also require lower Ekman numbers which is numerically costly. The disparity between flow and magnetic field length scales further increases the numerical difficulties.
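Since Eq. (25) is a proportionality, it is most safely used in ratio form, comparing two dynamos rather than predicting an absolute field strength. The sketch below does exactly that; the function name and its arguments are hypothetical, and the unknown prefactor of the scaling law cancels in the ratio.

```python
def field_strength_ratio(power_ratio, f_ohm_ratio=1.0, density_ratio=1.0):
    """Ratio b_2 / b_1 implied by b**2 ~ f_Ohm * rho**(1/3) * (F q_o)**(2/3).

    power_ratio   : (F q_o)_2 / (F q_o)_1, ratio of available power
    f_ohm_ratio   : f_Ohm,2 / f_Ohm,1, ratio of ohmic dissipation fractions
    density_ratio : rho_2 / rho_1, ratio of mean densities
    """
    if min(power_ratio, f_ohm_ratio, density_ratio) <= 0:
        raise ValueError("all ratios must be positive")
    b_squared_ratio = (f_ohm_ratio
                       * density_ratio ** (1.0 / 3.0)
                       * power_ratio ** (2.0 / 3.0))
    return b_squared_ratio ** 0.5


# Eight times the available power raises the field strength by 8**(1/3).
print(f"{field_strength_ratio(8.0):.3f}")
```

The weak, cube-root dependence of the field strength on the available power is one reason why a single law can span objects whose heat fluxes differ by many orders of magnitude.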

3.4 Double Diffusive Approach

Simulations that simultaneously model thermal and compositional convection and allow for significant differences in the diffusivities of both components are still rare. While for thermal


convection a Prandtl number of one is realistic, the compositional Prandtl number may be three orders of magnitude larger when molecular diffusivities are considered. Several studies that model pure thermal convection but vary the Prandtl number indicate that inertial effects decrease when increasing the Prandtl number in convectively driven flows (Tilgner 1996; Breuer et al. 2002; Schmalzl et al. 2002) as well as in dynamos (Simitev and Busse 2005; Sreenivasan and Jones 2006; Wicht and Christensen 2010). Larger Prandtl numbers promote more confined thinner convective flows and lower overall flow amplitudes (Breuer et al. 2002; Schmalzl et al. 2002), and the latter is also responsible for the smaller inertial effects (Sreenivasan and Jones 2006). Higher magnetic Prandtl numbers are thus required to sustain dynamo action and keep $Rm$ above the necessary critical value (Simitev and Busse 2005). Simitev and Busse (2005) moreover report that the dipole contribution becomes stronger at larger Prandtl numbers, which is in line with our understanding that inertia is responsible for the transition from the dipolar to the multipolar regime in Fig. 5. Breuer et al. (2010) present a double diffusive study at an Ekman number of $E = 10^{-3}$ where thermal and compositional evolution is modeled separately with two equations of the form (6). The thermal and compositional Prandtl numbers are 0.3 and 3.0, respectively. They report that a sizable contribution of compositional buoyancy, similar to what can be expected for Earth, promotes smaller-scale flows and more convective plumes inside the tangent cylinder. Other important flow properties like the zonal flow or helicity are also significantly affected. In a double diffusive study geared to model Mercury’s dynamo, Manglik et al. (2010) observe that the thinner compositional plumes tend to destabilize a thermally stably stratified layer attached to the outer boundary.
This result may also apply to the Earth’s core where recent studies suggest the presence of a similar stratified layer (Pozzo et al. 2012; Gubbins and Davies 2013).

3.5 Dynamo Mechanism

The simple structure of large Ekman number and small Rayleigh number simulations allows one to analyze the underlying dynamo mechanism (Olson et al. 1999; Wicht and Aubert 2005; Aubert et al. 2008b). Figure 7 illustrates the mechanism at work in the benchmark II dynamo (Christensen et al. 2001) (BM II in Table 1). The solution obeys a fourfold azimuthal symmetry, and the dynamo process can be illustrated by concentrating on the action of one cyclone/anticyclone pair. The mechanism is of the $\alpha^2$ type, a terminology that goes back to the mean field dynamo theory and refers to the fact that poloidal and toroidal magnetic fields are produced by the local action of small-scale flow; small scale refers to the individual convective columns here. The $\alpha\omega$-mechanism, where the toroidal field is created by shear in the zonal flow, offers an alternative that may operate in models where stress-free boundary conditions allow for stronger zonal flows to develop (Kuang and Bloxham 1997; Busse and Simitev 2005a). Anticyclones stretch north-/south-oriented field lines radially outward in the equatorial region. This produces a strong inverse radial field on either side of the equatorial plane which is still clearly visible at the outer boundary. Inverse refers to the radial direction opposing the dominant axial dipole. The field is then wrapped around the anticyclone and stretched poleward, resulting in a field parallel to the original axial dipole and thus enforcing it. Advective transport from the anticyclones towards the cyclones and stretching by the meridional circulation down the axis of the cyclones closes the magnetic production cycle which maintains the field against ohmic decay. The converging flow into the anticyclones advectively concentrates the normal polarity field into strong flux lobes located at higher latitudes close to but outside the tangent cylinder. The dipole field


Fig. 7 Illustration of the dynamo mechanism in the benchmark II dynamo. Panel (a) shows isosurfaces of positive (red) and negative (blue) z-vorticity which illustrate the cyclonic and anticyclonic convective columns. Panel (b) shows isosurfaces of positive (red) and negative (blue) z-velocity, and panel (c) shows contours of the radial field at the outer boundary. Red (blue) indicates radially outward (inward) field. The magnetic field lines are colored accordingly, and their thickness is scaled with the local magnetic field strength

is to a good degree established by these lobes. Meridional circulation inside the tangent cylinder is responsible for the characteristically weaker magnetic field closer to the poles since it transports field lines away from the rotation axis towards the tangent cylinder. The distinct magnetic features associated with the action of the prograde and retrograde rotating convective columns have been named magnetic cyclones and anticyclones by Aubert et al. (2008b). An identification of the individual elements in the dynamo process becomes increasingly difficult at smaller Ekman numbers and larger Rayleigh numbers where the solutions are less symmetric, smaller in scale, and more strongly time dependent. Many typical magnetic structures nevertheless prevail over a wide range of parameters, which suggests that the underlying processes may still be similar (Aubert et al. 2008b).

3.6 Is There a Distinct Low Ekman Number Regime?

As outlined above, even the most advanced spherical dynamo models are restricted to Ekman numbers $E \gtrsim O(10^{-6})$, which is still nine orders of magnitude away from Earth’s value. Though the current dynamo models are rather successful in reproducing many features of observed planetary magnetic fields, and even though the changes with decreasing Ekman number seem modest in typical dynamo models, there is clearly the danger that more severe changes are encountered when future models succeed in reaching more realistic Ekman numbers. In this section, we review some questions which arise in the context of the low Ekman number regime and discuss results from recent simulations.


Fig. 8 Critical Rayleigh number vs. horizontal wave number $K_H$ in a plane layer with (a) fixed temperature boundary conditions and (b) fixed heat flux conditions (From Hori et al. 2012)

3.6.1 The Influence of the Magnetic Fields on Rapidly Rotating Convection

It has long been known that magnetic fields can have a strong impact on rotating convective flows at small Ekman number $E$. In particular, studies of so-called magnetoconvection, in which the field is externally imposed onto the convective flow, have suggested that the magnetic forces act to change the flow structure fundamentally in comparison to the small-scale motions typically encountered in the nonmagnetic case. The general effects are most easily discussed in terms of a simple Cartesian plane layer model studied already by Chandrasekhar (1961). This model extends the classical Rayleigh-Bénard configuration, a horizontal fluid layer heated from below, by adding the effects of vertical rotation and of a uniform, imposed vertical magnetic field. For simplicity, we assume fixed temperatures and vanishing shear stresses at the boundaries. The stability of the purely conductive solution against small perturbations can then be analyzed in a straightforward manner by a standard linear analysis. Figure 8a, taken from Hori et al. (2012), shows the critical Rayleigh number $Ra_c$ for the onset of convection as a function of the horizontal wave number $K_H$. The behavior is illustrated for various Ekman numbers $E$ and for imposed fields of varying strengths, measured here by the Elsasser number $\Lambda_0$ defined in Eq. (18). The minimum of each curve signifies the Rayleigh number and horizontal wave number where the purely conductive state becomes unstable to convective motions. In the nonmagnetic case, $\Lambda_0 = 0$, the Rayleigh number required to destabilize the system strongly increases with decreasing $E$, proportional to $E^{-4/3}$ for small $E$, while the wave number of the first unstable mode increases

Note that $Ra$ is defined similarly to Eq. (14), with $\ell$ replaced by the layer depth. Furthermore, possible oscillatory modes are omitted from Fig. 8a for simplicity. See Hori and Wicht (2013) for details.

like $E^{-1/3}$. These scalings agree with those found for the critical Rayleigh number and the azimuthal wave number $m_c$ in spherical shells (Sect. 3.1). The behavior changes dramatically in the presence of a strong imposed field. The minima shift towards lower Rayleigh and wave numbers, revealing that a much weaker buoyancy forcing, $Ra = O(E^{-1})$, now suffices to destabilize the system. The first unstable mode is characterized by $O(1)$ flow scales, in sharp contrast to the $O(E^{1/3})$ scales encountered in the nonmagnetic case. As already explained in Sect. 3.1, in the absence of a magnetic field, viscous forces play an important role in balancing the components of the Coriolis force that cannot be balanced by pressure alone, which requires small $E^{1/3}$ length scales to be present in the flow. For sufficiently strong magnetic fields, the Lorentz force enters the force balance and can help to balance the Coriolis force. Ultimately, the need for small flow scales vanishes, thus reducing the viscous dissipation and in turn the critical Rayleigh number. Note that the drop in both the critical Rayleigh number from $O(E^{-4/3})$ to $O(E^{-1})$ and in the first unstable wave number from $O(E^{-1/3})$ to $O(1)$ increases with decreasing $E$, revealing that the effect of the imposed field becomes progressively more pronounced as lower Ekman numbers are approached. As pointed out by Hori et al. (2012), it is interesting to compare the above results with a situation where constant heat flux conditions are imposed at the boundaries (Fig. 8b). In broad terms, the behavior is similar, but the curves now flatten out for small wave numbers, revealing that large flow scales are excited more easily. This suggests that the effect of magnetic fields on the large scales in rotating convection may be more pronounced if flux conditions are applied.
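The nonmagnetic scalings quoted above can be checked numerically. The following minimal sketch assumes the classical stationary marginal-stability relation of Chandrasekhar (1961) for a rotating plane layer with stress-free, fixed-temperature boundaries, $Ra(a) = [(\pi^2 + a^2)^3 + \pi^2 Ta]/a^2$, where $a$ is the horizontal wave number in units of the inverse layer depth. It further assumes that the Taylor number can be written as $Ta = E^{-2}$; this depends on the convention chosen for $E$ and only changes $O(1)$ prefactors, not the exponents. The function names are illustrative, and neither oscillatory modes nor magnetic effects are included.

```python
import math

PI2 = math.pi ** 2


def marginal_rayleigh(a, taylor):
    """Stationary marginal Rayleigh number for horizontal wave number a."""
    return ((PI2 + a * a) ** 3 + PI2 * taylor) / (a * a)


def critical_point(taylor):
    """Return (a_c, Ra_c), the minimum of marginal_rayleigh over a.

    With x = a**2, the condition dRa/dx = 0 is equivalent to
    (pi^2 + x)^2 (2x - pi^2) = pi^2 Ta, which is solved by bisection.
    """
    def g(x):
        return (PI2 + x) ** 2 * (2.0 * x - PI2) - PI2 * taylor

    lo = 0.5 * PI2                                 # g(lo) <= 0
    hi = lo + (PI2 * taylor) ** (1.0 / 3.0) + 1.0  # g(hi) > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    a_c = math.sqrt(0.5 * (lo + hi))
    return a_c, marginal_rayleigh(a_c, taylor)


def scaling_exponents(e_small=1e-8, e_large=1e-6):
    """Log-log slopes of Ra_c and a_c versus E, assuming Ta = E**-2."""
    a1, ra1 = critical_point(e_small ** -2)
    a2, ra2 = critical_point(e_large ** -2)
    d_log_e = math.log(e_small / e_large)
    return math.log(ra1 / ra2) / d_log_e, math.log(a1 / a2) / d_log_e


if __name__ == "__main__":
    ra_slope, a_slope = scaling_exponents()
    print(f"d ln Ra_c / d ln E = {ra_slope:.3f}")  # close to -4/3
    print(f"d ln a_c  / d ln E = {a_slope:.3f}")   # close to -1/3
```

In the nonrotating limit $Ta = 0$ the same routine returns the classical values $a_c = \pi/\sqrt{2}$ and $Ra_c = 27\pi^4/4$, which is a convenient sanity check of the sketch.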
We will come back to the influence of flux boundary conditions in section “The Impact of Lorentz Forces and Buoyancy Boundary Conditions on the Flow Scales in Numerical Simulations.”

The simple plane layer model considered above is certainly oversimplified. Linear studies using spherical shell geometries and more general imposed large-scale magnetic field topologies in fact reveal the existence of more complicated convective modes (Fearn 1979; Sakuraba and Kono 2000; Zhang and Gubbins 2000a, b; Sakuraba 2002), but generally confirm the basic result that strong imposed fields tend to promote large-scale flows and decrease the critical Rayleigh number. The magnetic field typically begins to alter the flow structure for $\Lambda = O(E^{1/3})$, showing that at low Ekman number, even weak magnetic fields can be dynamically important. Much depends on the geometry of the imposed fields and the model details. We do not attempt to review the wealth of results obtained over the last decades, and instead refer to the reviews by Proctor (1994) and Jones (2007). We do point out, however, that a linear analysis by Sakuraba (2002) suggests that the choice of temperature boundary conditions strongly affects the magnetic modes in spherical shells as well, a finding recently confirmed in fully nonlinear magnetoconvection simulations by Hori and Wicht (2013).

Care should be taken when applying magnetoconvection results, which are obtained for topologically simple, externally imposed fields, to fully nonlinear dynamos, in which more complicated magnetic fields arise that are dynamically coupled to the flow. If we assume that the magnetoconvection results are significant for dynamos, this would suggest a growing scale disparity between weakly and strongly magnetic flows as $E$ is reduced and would ultimately lead to enormous scale differences between nonmagnetic and magnetic dynamics at Earth-like parameters. Typical estimates based on linear theory suggest that nonmagnetic columnar convection (as illustrated in Fig. 3b) would have azimuthal length scales ranging from about 30 m to one kilometer in the Earth’s core, depending on whether molecular or turbulent values of viscosity are used (Zhang and Gubbins 2000b). The planetary scale convection cells suggested by magnetoconvection studies for strong, Earth-like fields would be three to five orders of magnitude larger, again

illustrating the possible key role of the magnetic field in the low $E$ regime. Interestingly, this effect would counteract the scale disparity due to the small magnetic Prandtl number that we discussed in Sect. 3.3. As noted above, the implications of magnetoconvection studies for rapidly rotating dynamos are not trivial. Apart from the general hope to make the models more realistic, this is one of the reasons for the ongoing efforts to push the numerical simulations towards more realistic Ekman numbers.

3.6.2 The Impact of Lorentz Forces and Buoyancy Boundary Conditions on the Flow Scales in Numerical Simulations

The majority of dynamo simulations published so far have assumed fixed codensity values at the boundaries. Although some simulations have reached Ekman numbers down to $O(10^{-6})$, the effect of the magnetic field on the flow is typically modest. In a study at $E = 2 \times 10^{-6}$, Takahashi et al. (2008a) report an increase of $\ell_u$ by about 20 % as compared to the nonmagnetic case. Soderlund et al. (2012) also report small effects for Ekman numbers down to $E = 10^{-5}$.

Sakuraba and Roberts (2009) were the first to notice that the situation can change dramatically if the buoyancy flux is prescribed at the boundaries and internal buoyancy sources are present. They compare two simulations at $E \approx 2 \times 10^{-6}$, shown in Fig. 9, one with a uniform temperature boundary condition applied at the core-mantle boundary and one with a fixed, uniform heat flux condition. Both simulations assume a homogeneous heat source throughout the core and also include a localized buoyancy source at the inner-core boundary to mimic the effects of inner-core growth. The choice of thermal boundary conditions at the core-mantle boundary obviously has a huge impact on the system dynamics. While fixed temperatures at the outer boundary lead to a flow field that is dominated by small-scale features, the case with a uniform heat flux boundary condition creates large-scale motions much more readily. In regions where the magnetic field is strong, the velocity field has a large-scale structure, with a dominant azimuthal wave number $m \approx 6$. Small-scale turbulence is still observed in regions where the magnetic field is weak.

Building on the study of Sakuraba and Roberts (2009), Hori and her coauthors investigated the role of buoyancy boundary conditions in a series of papers (Hori et al. 2010, 2012; Hori and Wicht 2013). Following the codensity approach introduced in Sect. 2.1, they considered both fixed codensity values and fixed codensity flux at the boundaries and also varied the amount of internal codensity sources. The results confirm the linear magnetoconvection results shown in Fig. 8 that fixed-flux boundary conditions promote larger convective scales than fixed boundary values of codensity. This is true in the nonmagnetic case, but becomes much more pronounced if a dynamo generates a strong, dipolar field. The details seem to depend strongly on how the flow is driven. If internal buoyancy sources dominate, the condition on the core-mantle boundary largely controls the dynamics, just as in the case presented by Sakuraba and Roberts (2009). In cases mainly driven by buoyancy sources at the inner-core boundary, the conditions for codensity applied there have a stronger influence. Fixed codensity values at the boundaries lead to much less pronounced effects of the magnetic field on the flow. A surprising result of the study by Hori et al. (2012) is that significant effects are already observed at moderate Ekman numbers of $O(10^{-4})$.

Fig. 9 Snapshots of the velocity and of the magnetic field at $E \approx 2 \times 10^{-6}$ in the Sakuraba and Roberts (2009) model. Panels (a, c, e, g) show results obtained for a simulation with fixed heat flux boundary conditions on the outer shell, while a simulation with prescribed uniform surface temperature gives the results shown in (b, d, f, h). The radial magnetic field on the CMB is shown in (a) and (b). The remaining panels are velocity and magnetic field plots in the $z = 0.1\,r_o$ plane viewed from the north. The radial component of velocity (c, d), the azimuthal component of velocity (e, f), and the azimuthal magnetic field (g, h) are shown, respectively. Color scales: $b_r$ from −0.3 to 0.3; $u_s$ and $u_\phi$ from −300 to 300; $b_\phi$ from −1.5 to 1.5. Note that the velocity and magnetic field scales differ from the scaling used in the present paper; see Sakuraba and Roberts (2009) for details

3.6.3 Subcritical Dynamos and the Nature of the Dynamo Bifurcation

The simplest bifurcation structure that might be expected based on the ideas presented above is sketched in Fig. 10. The illustration is taken from Hori and Wicht (2013), but similar figures have been published before. Starting from a weak magnetic perturbation, the nonmagnetic flow becomes a kinematic dynamo at $Ra = Ra_d$ in a supercritical bifurcation when the critical magnetic Reynolds number for dynamo action, $Rm_d$, is reached. If the Rayleigh number exceeds $Ra_d$ only moderately, the dynamo may saturate on the so-called weak field branch where the magnetic field is too faint to substantially affect the flow. If $Ra$ is increased further, the magnetic field reaches an amplitude of $\Lambda = O(E^{1/3})$, where Lorentz forces become important in the force balance. It seems reasonable to expect that saturation does not occur before the Elsasser number $\Lambda$ becomes $O(1)$, where magnetoconvection suggests that the critical Rayleigh number reaches a minimum, viscous forces become negligible, Lorentz forces control the flow scales, and a magnetostrophic

Fig. 10 A sketch of the classical dynamo bifurcation scenario as motivated by magnetoconvection results (Slightly modified from Hori and Wicht 2013). Axis labels: $U$ or $Rm$ and $B^2$ or $\Lambda$, each versus $Ra$. Marked values: $Rm_d$, $Rm_d^{mag}$, $Ra_c$, $Ra_c^{mag}$, $Ra_d$, $Ra_d^{mag}$. Labeled features: weak field branch, strong field branch ($\Lambda = O(1)$), subcritical dynamo

force balance is established. Once the system reaches this so-called strong field branch, it seems possible to reduce the Rayleigh number to values below $Ra_d$ without shutting down the dynamo. Such solutions are commonly called subcritical dynamos. Note that this term is sometimes also used in a more restrictive sense where it refers to hypothetical dynamos which remain stable even for Rayleigh numbers below the critical Rayleigh number $Ra_c$ for the onset of nonmagnetic convection. Indeed, the magnetoconvection results and Cartesian dynamo studies (St. Pierre 1993; Stellmach and Hansen 2004) suggest that such dynamos may exist, although they have not yet been observed in spherical shell simulations.

The bifurcation behavior found in numerical simulations is certainly more complicated than the simple picture sketched above. That a strong magnetic field can help to maintain a dynamo for $Ra < Ra_d$ can already be inferred from the benchmark dynamo (case BMII in Table 1), which is only found if the calculation is started from a suitable and sufficiently strong initial field. Since the benchmark dynamo operates at $E = 10^{-3}$, the flow scales are not expected to differ much from the nonmagnetic solution, which is readily confirmed by comparing Figs. 3 and 7. Effects in addition to those discussed above are therefore likely to be at work. Besides subcritical dynamos, numerical simulations have also demonstrated bistability, showing that more than one solution can be stable for the same set of control parameters. A recent study by Morin and Dormy (2009) has further shown the existence of isola-type dynamo bifurcations in certain regions of parameter space, in which stable magnetic and nonmagnetic states coexist for a finite range of Rayleigh numbers, with the dynamo branch suddenly breaking down when the Rayleigh number exceeds a certain value.

Several mechanisms have been proposed to explain the origin of subcritical behavior in numerical dynamos. Two possible mechanisms have recently been identified by Sreenivasan and Jones (2011). They note that magnetic fields with dipolar symmetry tend to enhance the flow along the axis of the convection columns, resulting in an overall enhancement and a more coherent spatial distribution of the flow helicity (see Sect. 1). Since helicity, though not essential for dynamo action (Gilbert et al. 1988), generally helps to maintain large-scale magnetic fields, this might explain the occurrence of subcritical behavior. Interestingly, magnetic fields with quadrupolar symmetry have a much weaker effect on helicity, which may hint at an explanation for why so

Fig. 11 Elsasser number $\Lambda$ (a) and dominant modes $m$ in the kinetic energy spectra (b) versus $Ra/Ra_c$ for models driven by internal heating and a prescribed heat flux at the core-mantle boundary at $E = 10^{-4}$, $Pr = 1$, and $Pm = 3$. Green crosses represent nonmagnetic convection with failed kinematic dynamos, blue stars are dynamos grown from small seed fields, and purple squares represent dynamos starting from a strong dipolar field. In (a), the dotted lines represent the jump from a solution with weak field to a solution with strong field and the decay of the strong field, respectively. The dominant $m$ modes in (b) are defined as modes which contain more than 75 % of the energy of the peak mode. The first and the second peak modes are connected by thick and thin lines, respectively. Axes: (a) $\Lambda$ (0.1 to 10) versus $Ra/Ra_c$ (1 to 10); (b) dominant $m_u$ (0 to 10) versus $Ra/Ra_c$ (1 to 10) (From Hori and Wicht 2013)

many dynamos produce dipolar fields for not too large Rayleigh numbers. A second, less robust mechanism described by Sreenivasan and Jones (2011) is the competition between dipolar fields and zonal flows, which is most relevant for stress-free boundary conditions, as discussed in Sect. 3.2. The shear associated with strong zonal flows destroys convection and thus kinetic helicity in a large fraction of the spherical shell so that only relatively weak magnetic fields are produced. A strong magnetic field, on the other hand, can eliminate these zonal flows via Lorentz forces, and the resulting stronger helicity in turn maintains the strong field.

The results of Sreenivasan and Jones (2011) and Morin and Dormy (2009), obtained for fixed temperature boundary conditions and without strong internal heating, reveal only a small subcritical window, extending down to no less than about 80 % of $Ra_d$. A recent study by Hori and Wicht (2013) shows that the way convection is driven again has a key influence in this context. In models driven by internal sources and a prescribed, constant heat flux condition at the CMB, the subcritical range is shown to extend to about 25 % of $Ra_d$ in some simulations, even at the moderate Ekman number $E = 10^{-4}$. Figure 11 shows an example from this study. Two dynamo branches are clearly evident in Fig. 11a: a branch of solutions with weak, multipolar magnetic fields grown from small initial seed fields and a branch of dynamos with much stronger, dipolar fields, which are obtained either at high Rayleigh numbers or by using a strong initial field. Both branches coexist in a finite range of Rayleigh numbers until the lower branch becomes unstable at $Ra \approx 13.8$ and the dynamos develop into strong dipolar solutions. A subcritical branch extending down to $Ra \approx 1.8\,Ra_c$ is also clearly visible.
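The dependence of the final state on the initial field can be illustrated with a deliberately simple toy model. The sketch below is not the dynamo system studied by Hori and Wicht (2013); it assumes a generic subcritical amplitude equation, $dA/dt = \mu A + A^3 - A^5$, in which $A$ is a hypothetical field amplitude and $\mu$ a hypothetical control parameter standing in for the distance from the dynamo onset. For $-1/4 < \mu < 0$ this equation has two stable states, $A = 0$ and a finite-amplitude "strong" state, so that a small seed decays while a strong seed survives. The toy model has no counterpart of the weak field branch; it only mimics the hysteresis between a failed and a strong-field solution.

```python
import math


def rhs(amplitude, mu):
    """Right-hand side of the toy equation dA/dt = mu*A + A^3 - A^5."""
    return mu * amplitude + amplitude ** 3 - amplitude ** 5


def saturate(seed, mu, dt=0.01, steps=50000):
    """Integrate the toy equation with forward Euler; return the final amplitude."""
    amplitude = seed
    for _ in range(steps):
        amplitude += dt * rhs(amplitude, mu)
    return amplitude


def strong_branch(mu):
    """Stable finite-amplitude fixed point of the toy equation (needs mu >= -1/4)."""
    return math.sqrt(0.5 * (1.0 + math.sqrt(1.0 + 4.0 * mu)))


if __name__ == "__main__":
    mu = -0.1  # inside the bistable window -1/4 < mu < 0
    print("small seed  ->", saturate(1e-3, mu))      # decays towards zero
    print("strong seed ->", saturate(1.5, mu))       # settles on the strong branch
    print("fixed point   ", strong_branch(mu))
    print("below window ->", saturate(1.5, -0.3))    # decays despite the strong seed
```

In this toy picture, the lower end of the bistable window ($\mu = -1/4$) plays the role of the Rayleigh number below which even a strong initial field cannot be maintained.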
Figure 11b demonstrates that the strong dipolar fields cause much larger dominating flow scales than observed in either nonmagnetic or weak field solutions, in agreement with the expectations from magnetoconvection. The results found by Hori and Wicht (2013) largely resemble the scenario depicted in Fig. 10. The Elsasser number for the weak field cases is somewhat larger than expected, i.e., larger than the $\Lambda = O(E^{1/3})$ value where the magnetic field should become dynamically important. This may be due to the fact that multipolar fields have a weaker effect on the convection than well-organized dipolar fields. Note that the magnetic Reynolds number for the strong field cases is lower than for nonmagnetic convection in all cases. The authors speculate that the existence of subcritical dynamos in these models is caused by the larger flow scales, which reduce ohmic dissipation, and

Fig. 12 Axial vorticity in the equatorial plane obtained in dynamo simulations with varying Rayleigh number $Ra$ and rotation rate $\Omega$ by Miyagoshi et al. (2010). Panels (a) to (i) are arranged along two directions labeled “Higher $Ra$” and “Higher $\Omega$.” The solid arrow in panel (i) indicates the radial range of sheet-plume convection, and the dotted arrow the range of westward zonal flow (From Miyagoshi et al. 2010)

thus allow the dynamo to operate at lower magnetic Reynolds numbers. Further studies at lower Ekman number are needed to investigate whether the combination of internal heating and flux conditions at the outer boundary might finally allow for subcritical dynamos in the stronger sense, i.e., for $Ra < Ra_c$.

Although the Earth’s dynamo today is likely in a highly supercritical state, the Martian dynamo may have evolved along a subcritical branch before it went extinct about 4 Gyr ago (Kuang et al. 2008). No inner core was present at that time, so that the ancient Mars dynamo was probably driven mainly by secular cooling through the mantle, i.e., precisely the scenario studied by Hori and Wicht (2013). It thus seems likely that subcriticality played a role in the abrupt collapse of the Martian magnetic field recorded in crustal magnetizations (Lillis et al. 2008).

3.6.4 Transitions in Low Ekman Number Rapidly Rotating Convection

A phenomenon that has recently been observed in low Ekman number dynamo simulations is dual convection (Kageyama et al. 2008; Miyagoshi et al. 2010, 2011), illustrated in Fig. 12. Shown is the axial vorticity in the equatorial plane for varying Rayleigh numbers and angular velocities $\Omega$, corresponding to Ekman numbers between $O(10^{-2})$ and $O(10^{-6})$ in our definition. A regime transition to a state characterized by sheetlike plumes close to the tangent cylinder and a strong westward zonal flow is observed at low Ekman and high Rayleigh number (Fig. 12h, i). The simulations were carried out using no-slip conditions for the velocity field. A detailed analysis of one particular run from this dual convection regime reveals a local Rossby number $Ro_\ell \approx 4 \times 10^{-3}$, with a dipolar magnetic field, as expected (see section “The Influence

of the Magnetic Fields on Rapidly Rotating Convection”). Miyagoshi et al. (2010) point out that a similar dual convection regime has been observed in experiments using water as a working fluid (Gillet et al. 2007), suggesting that the magnetic field is not essential. The simulation results shown in Fig. 12 have been obtained by solving the compressible form of the magnetohydrodynamic equations without employing the Boussinesq approximation described in Sect. 2.1. Furthermore, the authors assumed a gravity field which drops off like $r^{-2}$, in contrast to the linearly increasing gravity field that is usually considered. It is thus difficult to directly compare the results to other studies. Still, the simulations strongly suggest that important new dynamical effects come into play at low Ekman numbers.

A further hint that rapidly rotating, nonmagnetic convection might still yield surprises comes from a promising approach employing simplified equations (Julien et al. 1998, 2012; Julien and Knobloch 1998; Sprague et al. 2006). The authors point out that the extreme range of spatial and temporal scales, which makes direct numerical simulations of rapidly rotating convection so difficult, can be exploited to simplify the governing equations considerably. Instead of solving the full Boussinesq equations, they propose to use a set of reduced equations which can be shown to be asymptotically valid in the limit of small Rossby number. Numerical simulations using these reduced equations in Cartesian geometry (Sprague et al. 2006; Julien et al. 2012) reveal a surprisingly rich behavior. For planetary cores, a regime the authors identify as geostrophic turbulence may be of particular interest. In this regime, vertical coherence is lost, and for strongly driven flows, Julien et al. (2012) present evidence for an inverse cascade from small horizontal scales to large-scale, depth-independent horizontal flow. The authors speculate that as soon as energy is transferred to spatial scales where the latitudinal variation of the Coriolis force is felt, the large scales might organize into a zonal flow at the Rhines scale (see, e.g., Vallis 2006). Future studies are clearly needed to clarify these issues.

4 Comparison with the Geomagnetic Field

A detailed comparison of numerical solutions with the geomagnetic field structure and dynamics can serve to judge whether a particular simulation provides a realistic geodynamo model. Global geomagnetic models represent the field in terms of spherical harmonics, which allows a downward continuation to the core-mantle boundary. Sufficient resolution in time and space and an even global coverage are key issues here. We refer to Hulot et al. (2010) for a more detailed review of the different geomagnetic field models.

A comparison of simulation snapshots with geomagnetic models of certain epochs always bears the problem that both magnetic fields are highly variable in time. The conclusions may therefore depend on the selected epochs. Time-averaged fields offer additional insight, though certain smaller-scale features vanish in the averaging process. The relatively high resolution and data quality make the satellite-based modern field models from the past decades ideal candidates for a comparison. However, one should keep in mind that at least some of their features may not represent the “typical” geomagnetic field and that they do not embrace the full geomagnetic time variability. The attainable resolution for the dynamo field is limited by the fact that the crustal magnetic field cannot be separated from the core contribution. Even the modern satellite models that strive to represent the core-mantle boundary field are therefore only reliable to spherical harmonic degrees

Fig. 13 Comparison of radial magnetic fields at the outer boundary of the dynamo region (core-mantle boundary). The top panel shows the GUFM1 field for the year 1990; the other panels show snapshots from models E5R36 and E6, restricted to $l \le 14$ in the middle row (“filtered”) and at full numerical resolution in the bottom row (“full res.”). The color scheme has been inverted for the numerical model to ease the comparison. Generally, the dynamo problem is invariant with respect to a sign change in the magnetic field

$l \le 14$ where the crustal contribution remains negligible. Satellite-based models reach back to 1980 and therefore encompass only a very small fraction of the inherent geomagnetic time scales. Historic field models cover the past 400 years and rely on geomagnetic observatory data and ship measurements among other sources. Naturally, the resolution and precision are inferior to the satellite models and degrade when going back in time. gufm1 (Jackson et al. 2000), the model that we will use for comparison in the following, provides spherical harmonic degrees up to $l = 14$ in 2.5-year intervals.

Figure 13 compares the gufm1 core-mantle boundary (CMB) field model for the epoch 1990 with selected snapshots from models E5R36 and E6. The comparison with model E5R36 reveals striking similarities. An imposed fourfold azimuthal symmetry has been used in model E6 to save computing time. This complicates the comparison, but the field seems to show too little structure at low latitudes and lacks the inverse field patches that seem to be typical for the historic geomagnetic field in this region. A more detailed discussion of the individual features follows below. Respective comparisons for numerical dynamos at different Ekman numbers can be found elsewhere (Christensen and Wicht 2007; Wicht et al. 2009; Wicht and Christensen 2010).

The normalized magnetic energy spectrum at the core-mantle boundary,

$$\bar{E}_m(l, r = r_0) = \frac{\sum_{m=0}^{m=l} E_m(l, m, r = r_0)}{\sum_{m=0}^{m=1} E_m(l = 1, m, r = r_0)}, \tag{26}$$

provides information about the importance of the different spherical harmonic degrees in the core field. Here, $E_m(l, m, r)$ is the magnetic energy carried by the mode of spherical harmonic degree $l$

Fig. 14 Time-averaged normalized magnetic energy spectrum at the outer boundary for GUFM1 (black), the $E = 3 \times 10^{-4}$ model E4R106 (green), and the $E = 3 \times 10^{-5}$ models E5R36 (blue) and E5R48 (red), which belong to regimes D and M, respectively. Colored bars in the width of the standard deviation indicate the time variability. Axes: normalized magnetic energy (logarithmic, $10^{-3}$ to $10^{1}$) versus degree $l$ (0 to 14)

and order $m$ at radius $r$. Figure 14 compares the time-averaged gufm1 spectra with time-averaged spectra from different simulations. When downward continuing the magnetic measurements to construct models of the core-mantle boundary field, damping of the harmonics beyond, say, degree $l = 12$ is required, and this clearly affects the higher harmonics in gufm1. The spectra for the nondipolar contribution in the numerical simulations remain basically white for low to intermediate degrees, with some degrees, notably $l = 5$, sticking out. The relative importance of the dipole contribution grows with decreasing Ekman number as long as the solutions belong to regimes D or E, as we will further discuss below. The red curve in Fig. 14 is an example of a multipolar solution where the dipole has completely lost its prominence.

Archeomagnetic models include the magnetic information preserved in recent lava flows and sediments. The widely used CALS7K.2 model by Korte et al. (2005) and Korte and Constable (2005) is a degree $l = 10$ model that is thought to be reliable up to degree $l = 4$ with a time resolution of about a century and reaches back seven millennia. Newer models from the same family cover the last ten millennia (Korte et al. 2011). Figure 15 shows the respective time-averaged CALS7K.2 CMB field in comparison with time-averaged numerical solutions that we discuss further below.

Paleomagnetic data reaching further back in time have neither the spatial nor the temporal resolution and precision to construct faithful models for specific epochs. Statistical interpretations are thus the norm and provide time-averaged fields (TAF) and mean paleo-secular variation, for example, in the form of VGP scatter (see below). In the following we assess the similarity of the geomagnetic field and simulation results in the context of some key issues that were discussed more extensively in recent years.
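As a minimal sketch of how a normalized spectrum in the spirit of Eq. (26) can be obtained from a geomagnetic field model, the following example downward-continues a set of Gauss coefficients with the Lowes-Mauersberger spectrum, $R_l(r) = (l+1)(a/r)^{2l+4}\sum_m [(g_l^m)^2 + (h_l^m)^2]$, and normalizes by the dipole term. Using $R_l$ as a stand-in for the energy per degree is an assumption, the coefficients are made-up toy values rather than gufm1 data, and the radii are rounded reference values. All function and variable names are illustrative.

```python
EARTH_RADIUS_KM = 6371.2  # geomagnetic reference radius a
CMB_RADIUS_KM = 3480.0    # approximate core-mantle boundary radius


def degree_spectrum(gauss, radius_km):
    """Lowes-Mauersberger spectrum per degree l at the given radius.

    gauss maps (l, m) to a pair (g, h) of Gauss coefficients in nT.
    """
    ratio = EARTH_RADIUS_KM / radius_km
    spectrum = {}
    for (l, m), (g, h) in gauss.items():
        term = (l + 1) * ratio ** (2 * l + 4) * (g * g + h * h)
        spectrum[l] = spectrum.get(l, 0.0) + term
    return spectrum


def normalized_spectrum(spectrum):
    """Divide every degree by the dipole (l = 1) contribution, as in Eq. (26)."""
    dipole = spectrum[1]
    return {l: spectrum[l] / dipole for l in sorted(spectrum)}


# Hypothetical toy coefficients (nT), for illustration only.
TOY_GAUSS = {
    (1, 0): (-30000.0, 0.0),
    (1, 1): (-2000.0, 5000.0),
    (2, 0): (-2000.0, 0.0),
    (2, 1): (3000.0, -2000.0),
    (3, 0): (1300.0, 0.0),
}

if __name__ == "__main__":
    surface = normalized_spectrum(degree_spectrum(TOY_GAUSS, EARTH_RADIUS_KM))
    cmb = normalized_spectrum(degree_spectrum(TOY_GAUSS, CMB_RADIUS_KM))
    for l in surface:
        print(l, surface[l], cmb[l])
```

Relative to the dipole, degree $l$ is amplified by $(a/r)^{2(l-1)}$ when going from the surface to radius $r$. This growth with $l$ is the reason why noise in the higher harmonics becomes a problem at the core-mantle boundary and why damping is applied there.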

Fig. 15 Time-averaged fields for the CALS7K.2 archeomagnetic model by Korte and Constable (2005) and for the numerical models E5R18 (left) and E5R36 (right). Time averages over intervals corresponding to 500, 5,000, and 50,000 years are shown in the top, middle, and bottom panels, respectively

4.0.5 Dipole Properties and Magnetic Field Symmetries

In the linearized system describing the onset of convection, solutions with different equatorial symmetry and azimuthal symmetry decouple. In terms of spherical harmonics, azimuthal symmetries are related to different orders $m$, while equatorially symmetric and antisymmetric solutions are described by harmonics where the sum of degree $l$ and order $m$ is even or odd, respectively. Equatorially symmetric flows are preferred in rotation-dominated systems and are thus excited at lower Rayleigh numbers than antisymmetric contributions. The convective columns described above reach right through the shell and are evidently symmetric with respect to the equator. Equatorially symmetric flows support dynamos with either purely equatorially antisymmetric or purely symmetric magnetic fields, which in the dynamo context are called the dipolar and quadrupolar families, respectively (Hulot and Bouligand 2005). These names refer to the primary terms in the families, the axial dipole $(l = 1, m = 0)$ or the axial quadrupole $(l = 2, m = 0)$. Both are coupled by equatorially antisymmetric flows, which can transfer energy from one family to the other. The dipolar family is clearly preferred in dynamo solutions at lower Rayleigh numbers, which correspond to regime D. However, a few exceptions have been reported for large Ekman numbers (Aubert and Wicht 2004) or stress-free outer boundaries. Dynamos with perfect azimuthal symmetry and equatorial symmetry are found close to the onset of convection, where the underlying flow would still retain perfect symmetries in the nonmagnetic case. The respective models have been marked by squares in Fig. 5. These solutions

obey a very simple time dependence: a drift of the whole pattern in azimuth. When the Rayleigh number and the magnetic Prandtl number are increased, the azimuthal symmetry is broken first and then the equatorial symmetry. This goes along with a change in time dependence from drifting to chaotic, sometimes via an oscillatory behavior (Wicht 2002; Wicht and Christensen 2010) (see Figs. 5 and 16). We quantify the field geometry and symmetry with four time-averaged measures here. The dipole contribution is characterized by time averages of its relative importance at $r_o$, i.e., by the dipolarity

$$d = \frac{\left( \sum_{m=0}^{m=1} E_m(l = 1, m, r_0) \right)^{1/2}}{\left( \sum_{l=1}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_0) \right)^{1/2}}, \tag{27}$$

and by the mean dipole tilt $\Theta$, the minimum angle between the magnetic dipole axis and the rotation axis. The symmetry properties of the non-dipolar harmonics are quantified by the relative strength of the equatorially symmetric contributions, i.e., by the equatorial symmetry measure

$$e = \frac{\left( \sum_{l=2}^{l=11} \sum_{m=0,\; l+m\,\mathrm{even}}^{m=l} E_m(l, m, r_0) \right)^{1/2}}{\left( \sum_{l=2}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_0) \right)^{1/2}}, \tag{28}$$

and the relative strength of the non-axially symmetric contributions, i.e., by the axial symmetry measure

$$a = \frac{\left( \sum_{l=2}^{l=11} \sum_{m=1}^{m=l} E_m(l, m > 0, r_0) \right)^{1/2}}{\left( \sum_{l=2}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_0) \right)^{1/2}}. \tag{29}$$

These four measures are restricted to degrees $l \le 11$ to facilitate a comparison with the gufm1 model, which is heavily damped for higher degrees. The dipole contributions would dominate both measures $e$ and $a$ and are therefore excluded. Dynamos where all non-dipolar modes statistically contain the same energy yield $e_s = 0.73$ and $a_s = 0.93$.

Figure 16 shows the dependence of the four time-averaged measures on the local Rossby number $Ro_\ell$ for several dynamo models. The models span Ekman numbers from $E = 10^{-3}$ to $E = 3 \times 10^{-5}$ and magnetic Prandtl numbers between $Pm = 1$ and $Pm = 20$ (see Table 1). Bottom-driven cases with constant codensity boundary conditions are considered as well as cases where the compositional flux is forced to zero at the outer boundary. The former models can be considered as purely thermally driven, while the latter mimic compositional driving. For each model, $Ro_\ell$ is varied by changing the Rayleigh number. The gufm1 model is represented by the gray blocks in Fig. 16. Their vertical extent corresponds to the minimum and maximum values within the represented 400-year period, and their horizontal extent indicates uncertainties in $Ro_\ell$ (Christensen and Aubert 2006). Note that Earth’s local Rossby number is based on the scaling derived by Christensen and Aubert (2006) and not on direct measurements (see also the discussion in Sect. 3.2). Table 1 lists the four measures for selected cases.
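The definitions in Eqs. (27)–(29) and the equipartition values quoted above can be reproduced with a few lines of code. The sketch below assumes that the energies are stored in a dictionary keyed by $(l, m)$ with $m \ge 0$, and that "the same energy in all non-dipolar modes" means one unit of energy per $(l, m)$ pair. Both the data layout and the function names are illustrative choices and are not taken from the original study.

```python
import math

L_MAX = 11  # truncation used for the comparison with gufm1


def _total(energy, l_min, keep=lambda l, m: True):
    """Sum energy[(l, m)] over l_min <= l <= L_MAX and 0 <= m <= l where keep holds."""
    return sum(
        energy[(l, m)]
        for l in range(l_min, L_MAX + 1)
        for m in range(l + 1)
        if keep(l, m)
    )


def dipolarity(energy):
    """Eq. (27): dipole amplitude relative to all degrees l <= L_MAX."""
    dipole = energy[(1, 0)] + energy[(1, 1)]
    return math.sqrt(dipole / _total(energy, 1))


def equatorial_symmetry(energy):
    """Eq. (28): share of non-dipolar modes with l + m even."""
    symmetric = _total(energy, 2, lambda l, m: (l + m) % 2 == 0)
    return math.sqrt(symmetric / _total(energy, 2))


def axial_symmetry(energy):
    """Eq. (29): share of non-dipolar, non-axisymmetric modes (m > 0)."""
    non_axisymmetric = _total(energy, 2, lambda l, m: m > 0)
    return math.sqrt(non_axisymmetric / _total(energy, 2))


if __name__ == "__main__":
    # Equipartition: one unit of energy in every (l, m) mode.
    flat = {(l, m): 1.0 for l in range(1, L_MAX + 1) for m in range(l + 1)}
    print(f"e_s = {equatorial_symmetry(flat):.2f}")  # prints e_s = 0.73
    print(f"a_s = {axial_symmetry(flat):.2f}")       # prints a_s = 0.93
```

With this counting, 40 of the 75 non-dipolar modes up to $l = 11$ have $l + m$ even and 65 have $m > 0$, so that $e_s = \sqrt{40/75} \approx 0.73$ and $a_s = \sqrt{65/75} \approx 0.93$, in agreement with the values given in the text.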


[Figure 16, panels (a) to (d): dipolarity $d$, dipole tilt, equatorial symmetry measure $e$, and axial symmetry measure $a$, each plotted against the local Rossby number $Ro_\ell$ from 0 to 0.25; the labels D, E, and M appear in each panel.]

Fig. 16 Dipole properties and symmetry properties of non-dipolar contributions for several dynamo models and the gufm1 (gray boxes). Different symbols code the time dependence as explained in the caption of Fig. 5. Panel (a) shows the time-averaged dipolarity $d$ and panel (b) shows the time-averaged dipole tilt. Panels (c) and (d) display the time-averaged equatorial symmetry measure $e$ and the time-averaged axial symmetry measure $a$, respectively. Blue and red, models from Fig. 5 with $Pm = 10$ and $Pm = 20$, respectively; yellow, identical parameters to the blue model but with chemical boundary conditions; green, $E = 3 \times 10^{-4}$, $Pm = 3$, chemical boundary conditions; black, $E = 3 \times 10^{-5}$, $Pm = 1$, fixed temperature conditions. The Prandtl number is unity in all cases. Colored bars in the width of the standard deviation indicate the time variability in panels (c) and (d). The standard deviation amounts to only a few percent in $d$ (panel a) and is of the same order as the tilt itself (panel b)

Figure 16 demonstrates that the dipolarity generally increases with decreasing Ekman number while the mean tilt becomes smaller. This can be attributed to the growing influence of rotation at smaller $E$ values; the associated increase in “geostrophic” flow correlation promotes the production of axial dipole field. This is counteracted by the increased influence of inertial forces at larger Rayleigh numbers. The strong dipolarity of gufm1 and its sizable mean tilt of around $10^\circ$ can only be reached at lower Ekman numbers combined with larger Rayleigh numbers corresponding to $Ro_\ell$ values around 0.1. The comparison of the three different models at $E = 10^{-3}$ suggests that neither the magnetic Prandtl number ($Pm = 10$ and $Pm = 20$) nor the thermal outer boundary condition plays an important role (see also Wicht et al. 2009). The equatorial and axial symmetry measures shown in panels (c) and (d) of Fig. 16 provide a less conclusive picture. There is a trend that equatorially symmetric and non-axial contributions become more important with growing local Rossby number. Generally, dynamo simulations prefer equatorially antisymmetric and axisymmetric modes in regimes D and E since $a$ and $e$ lie below the statistical values $a_s$ and $e_s$ marked by thick horizontal lines in Fig. 16. Gufm1 also shows a tendency towards equatorially antisymmetric modes, which is even more pronounced in paleomagnetic data (Christensen et al. 2010). Figure 16 demonstrates that dynamo simulations are capable of reaching geodynamo-like equatorial and axial symmetries provided the Rayleigh number is large enough. The simulations at $E = 3 \times 10^{-5}$, however, show some exceptions. The lowest $Ro_\ell$ case is close to the onset of dynamo action and still shows perfect equatorial antisymmetry, $e = 0$. For such fields, the statistical $a$ value decreases to $a_s = 0.63$, which explains why the solution is also particularly axisymmetric. Why three cases in regime D at intermediate $Ro_\ell$ also show a distinct preference towards axisymmetric magnetic field configurations remains unknown. They seem incompatible with the gufm1 data in this respect. Larger Rayleigh numbers should help to bring the models in line with geomagnetic values. Note that the degree of dipole dominance and the mean tilt are somewhat better constrained by additional archeomagnetic and paleomagnetic information than the equatorial and axial symmetry (Hulot et al. 2010).

In a similar study, Christensen et al. (2010) confirm that even models at relatively large Ekman numbers can be surprisingly Earth-like, provided the Rayleigh number is large enough to yield sufficient complexity. They report that the magnetic Ekman number has to be smaller than $10^{-4}$ and that the magnetic Reynolds number should reach at least about 200, with larger values being required at lower magnetic Ekman number. This implies that increasing the magnetic Prandtl number offers an alternative and numerically more affordable path towards more Earth-like models than decreasing the Ekman number. However, this path will never lead from non-reversing to reversing dynamos since the magnetic Prandtl number seems to have no influence on the time variability (see Fig. 5).
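The trade-off between Ekman number and magnetic Prandtl number can be illustrated with a small calculation. The sketch below assumes the common definition of the magnetic Ekman number as $E/Pm$; this definition is an assumption here and is not stated in the passage above. The parameter pairs are those listed in the caption of Fig. 16, and the function name is illustrative.

```python
def magnetic_ekman(ekman, pm):
    """Magnetic Ekman number, assuming the common definition E / Pm."""
    return ekman / pm

# (label, E, Pm) as listed in the caption of Fig. 16
models = [
    ("blue", 1e-3, 10),
    ("red", 1e-3, 20),
    ("green", 3e-4, 3),
    ("black", 3e-5, 1),
]

for label, ekman, pm in models:
    value = magnetic_ekman(ekman, pm)
    print(f"{label:5s}  E = {ekman:.0e}  Pm = {pm:2d}  E/Pm = {value:.1e}")
```

With this definition, doubling $Pm$ at fixed $E$ halves the magnetic Ekman number, which is the sense in which a larger magnetic Prandtl number can stand in for a smaller Ekman number.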
The above analysis demonstrates that dipole tilt and equatorial and axial symmetry do not strongly constrain the dynamo models, which all more or less comply with geomagnetic values. When sufficient temporal complexity is required, however, Ekman numbers smaller than, say, $E = 10^{-4}$ are necessary to also keep the dipolarity on an Earth-like level.

4.0.6 Persistent Features and Mantle Influence

Strong normal polarity flux concentrations at higher latitudes are a common feature in many dynamo simulations. Very similar flux lobes can be found in the historic geomagnetic field, two in the northern and two in the southern hemisphere, and seem to have changed only little over the last four centuries (Gubbins et al. 2007). The flux lobes are also present in archeomagnetic models spanning the last ten millennia but show some variability on this time scale (Korte and Holme 2010; Amit et al. 2010b, 2011; Korte et al. 2011). Even some paleomagnetic TAF models covering 5 Myr report persistent high-latitude flux lobes at similar locations (Gubbins and Kelly 1993; Johnson and Constable 1995; Kelly and Gubbins 1997; Carlut and Courtillot 1998; Johnson et al. 2003). Their position and symmetry with respect to the equator suggest that they are caused by the inflow into convective cyclones as discussed in Sect. 3.5. Figure 13 demonstrates that the seemingly large flux lobes in the filtered field version are the expression of much smaller field concentrations caused by convective features of similar scale. Figure 17 shows a close-up of the convection around the patches in the dynamo model E5R43. The small-scale correlation of cyclonic features with strong magnetic field patches is evident. As the Ekman number decreases, the azimuthal scale of the convective columns shrinks, while the scale perpendicular to the rotation axis is much less affected. The columns become increasingly sheetlike, which translates into thinner magnetic patches that are stretched in the latitudinal direction.


Fig. 17 Panel (a) illustrates the complex sheetlike cyclones (red) and anticyclones (blue) in model E5R43 with isosurfaces of the z-vorticity. Panels (b) and (c) zoom in on the part of the northern hemisphere and illustrate how the normal polarity magnetic field is concentrated by the flow that converges into cyclones. The normal polarity field is radially outward in the northern hemisphere indicated by red field lines and the red radial field contours in panel (b)

In the filtered version, the finer scale is lost and the action of several convective structures appears as one larger magnetic flux lobe. Since the azimuthal symmetry is not broken in the dynamo models presented above, longitudinal features average out over time. Figure 15 demonstrates that the historical time span is too short to yield this effect. Even time spans comparable to the period covered by archeomagnetic models may still retain azimuthal structures. The flux lobes still appear at similar locations as in the snapshots (see Fig. 13) or the historic averages. The solution finally becomes nearly axisymmetric when averaging over periods corresponding to 50,000 years. This suggests that the persistence over historic or archeomagnetic time scales is nothing special. The persistent azimuthal structures in some 5 Myr TAF models, however, can only be retained when the azimuthal symmetry is broken. The preferred theory is an influence of the lower thermal mantle structure on the dynamo process. It has already been mentioned above that the mantle determines the heat flow out of the dynamo region. Lateral temperature differences in the lower mantle translate into lateral variations in the CMB heat flux since the core-mantle boundary is isothermal in comparison. The flux is higher where the mantle is colder than average and vice versa. The respective pattern is typically deduced from seismic tomography models that are interpreted in terms of temperature differences, yielding the so-called tomographic heat flux models (Glatzmaier et al. 1999). Several authors have explored the potential influence of lateral CMB heat flux variations on the dynamo process (Glatzmaier et al. 1999; Olson and Christensen 2002; Christensen and Olson 2003; Amit and Olson 2006; Gubbins et al. 2007; Willis et al. 2007; Aubert et al. 2007, 2008a; Takahashi et al. 2008b; Sreenivasan 2009; Amit and Choblet 2009).
Many confirm that these variations indeed yield persistent azimuthal features at locations similar to those in the geomagnetic field. The time scale over which the lobes become quasi-stationary depends on the model parameters. Willis et al. (2007) find that rather low Rayleigh numbers and strong heat flux variations, of the same order as the spherically symmetric total heat flow, are required to lock the patches on a historic time scale. Amit et al. (2010a) report rather time-dependent flux lobe locations similar to those observed in the archeomagnetic data for models run at higher Rayleigh numbers. According to Olson and Christensen (2002), rather long averaging times on the order of 100 kyr may then be needed to clearly reveal the mantle-imposed heat flux structure. The inhomogeneous CMB heat flux may also help to explain some other interesting geophysical features. The non-axisymmetric structures in paleomagnetic TAF models remain arguable, but the axisymmetric properties are much better constrained. Typically, the axial quadrupole amounts to a few percent of the axial dipole, while the axial octupole is of similar order but of opposite sign (Johnson 2007). In dynamo simulations, the axial octupole contribution is typically too large. The axial quadrupole contribution, on the other hand, remains too low unless the equatorial symmetry is broken by an inhomogeneous CMB heat flux condition. The required degree of north/south heat flux asymmetry, however, is somewhat larger than what the typical “tomographic” models suggest (Olson and Christensen 2002). The direct translation of seismic velocity differences into temperature differences and subsequently heat flux patterns may oversimplify matters here since compositional variations may also have to be considered (Amit and Choblet 2009, 2012). In the historic magnetic field, the secular variation is clearly lower in the Pacific than in the Atlantic hemisphere (Jackson et al. 2000). Christensen and Olson (2003) demonstrate that using a tomographic CMB heat flux can promote such an asymmetry in the numerical simulations.
It may also help to better model other features of the secular variation pattern at Earth’s CMB (Amit and Olson 2006; Aubert et al. 2007; Amit et al. 2008). Aubert et al. (2008a) invoke the inhomogeneous cooling of the inner core by a quasi-persistent cyclone reaching deep into the core to explain seismic inhomogeneities in the outer 100 km of the inner-core radius. Note, however, that other authors promote an inhomogeneous inner-core growth caused by a specific mode of inner-core convection to explain this feature (Alboussière et al. 2010; Monnereau et al. 2010). The preference for magnetic poles to follow two distinct latitude bands during reversals (Gubbins and Love 1998) can also be explained with a tomographic CMB heat flux model (Coe et al. 2000; Kutzner and Christensen 2004). Finally, the CMB heat flux pattern has been shown to influence the reversal likelihood in dynamo simulations and may thus explain the observed changes in the geomagnetic reversal rate (Glatzmaier et al. 1999; Kutzner and Christensen 2004).

4.0.7 Inverse Magnetic Field Production and the Cause for Reversals

The production of strong inverse magnetic field on both sides of the equatorial plane is an inherent feature of the fundamental dynamo mechanism outlined in Sect. 3.5. The associated pairwise radial field patches in the outer boundary field are therefore found in many dynamo simulations. Wicht et al. (2009) and Takahashi et al. (2008a), however, report that the patches become less pronounced at lower Ekman numbers (see also Sakuraba and Roberts 2009). Model E6 in Fig. 13 and model E5R18 in Fig. 15 demonstrate that the patches retreat to a small band around the equator which becomes indiscernible in the filtered field. Normal polarity patches now rule at mid- to lower latitudes, but the region around the equator generally shows rather weak field of normal polarity, which is compatible with the strong dipole component in these models.
More pronounced inverse patches reappear when the Rayleigh number is increased, as is demonstrated by E5R36 in Fig. 13. The low-latitude field remains only weakly inverse when averaged over long time spans (see Fig. 15). The time-averaged archeomagnetic model CALS7K.2 (see Fig. 15) points towards a weakly normal field, but the lack of resolution may be an issue here. The larger Rayleigh number in model E5R36 also promotes a stronger breaking of the equatorial symmetry, destroying the pairwise nature of the patches. This contributes to the convincing similarity between the E5R36 solution and the historic geomagnetic field where strong and equatorially asymmetric normal polarity patches seem typical at lower latitudes (Jackson 2003; Jackson and Finlay 2007) (see Fig. 13). The weak magnetic field in the polar regions can give way to inverse field at larger Rayleigh numbers, a feature also found in the historic geomagnetic field. This inverse field is produced by plumelike convection that rises inside the tangent cylinder, and the typical associated magnetic structure has been named magnetic upwellings (MU) by Aubert et al. (2008b). MUs may also rise at lower to mid-latitudes when the equatorial symmetry is broken to a degree where one leg of the equatorial inverse field production clearly dominates. The symmetry breaking is essential since the inverse fields created on both sides of the equator would cancel otherwise. The MUs may trigger magnetic field reversals and excursions when producing enough inverse field to efficiently cancel the prevailing dipole field (Wicht and Olson 2004; Aubert et al. 2008b). This basically resets the polarity, and small field fluctuations then decide whether axial dipole field of one or the other direction is amplified after the MUs have ceased (Aubert et al. 2008b; Wicht et al. 2009). MUs are a common feature in larger Rayleigh number simulations and vary stochastically in strength, number, and duration. Wicht et al. (2009) suggest that particularly strong or long-lasting MUs are required to trigger reversals.
Alternatively, several MUs may constructively team up. Both scenarios remain unlikely at not too large Rayleigh numbers, which explains why reversals are rare in regime E (see Figs. 5 and 16).

4.0.8 Time Variability

The internal geomagnetic field exhibits a very rich time variability, from short-term variations on the yearly time scale, the geomagnetic jerks, to variations in the reversal frequency on the order of several tens of million years (Hulot et al. 2010). Dynamo simulations are capable of replicating many aspects of the time variability (Olson et al. 2012), but the relative time scales of the different phenomena may differ from the geomagnetic situation depending on the model parameters. Christensen and Tilgner (2004) analyzed a suite of dynamo simulations to further elucidate how the typical secular variation time scale depends on the degree $l$ of the magnetic component. They find the inverse relationship

$$ \tau_l = 5.2\, \tau_u\, l^{-1}, \tag{30} $$

which is compatible with the idea that the magnetic field is advected by the large-scale flow and agrees well with geomagnetic findings (Hongre et al. 1998; Olsen et al. 2006; Lhuillier et al. 2011). For Earth, $\tau_u$ amounts to approximately two centuries. Torsional oscillations (TOs) are a specific form of short-term variations on the decadal time scale (Braginsky 1970). They concern the motion of so-called geostrophic cylinders that are aligned with the planetary rotation axis. TOs are essentially one-dimensional Alfvén waves that travel along the magnetic field lines connecting the cylinders. The field lines act like torsional springs that react to any relative acceleration of the cylinders with respect to each other. TOs have the correct time scale for explaining the decadal variations in Earth’s length of day, which are typically attributed to an exchange of angular momentum between Earth’s core and mantle (Jault et al. 1988; Jackson 1997; Jault 2003; Amit and Olson 2006). TOs have reportedly been identified in the geomagnetic secular variation signal (Zatman and Bloxham 1997; Gillet et al. 2010), are considered a possible origin for the geomagnetic jerks (Bloxham et al. 2002), and may be responsible for the length-of-day changes via coupling to Earth’s mantle (Gillet et al. 2010). The smallness of inertial forces and viscous forces in the magnetostrophic force balance is a prerequisite for torsional oscillations to become an important part of the short-term dynamics. Coriolis and pressure forces do not contribute to the integrated azimuthal force on geostrophic cylinders for geometrical reasons. This leaves the azimuthal Lorentz forces as the only constituent in the first-order balance. Taylor (1963) therefore conjectured that the dynamo would assume a configuration where the azimuthal Lorentz forces would cancel along the cylinders until the integrated force can be balanced by viscous or inertial effects. In recognition of Taylor’s pioneering work, dynamos where the normalized integrated Lorentz force

$$ T(s) = \frac{\int_{C(s)} \hat{\boldsymbol{\phi}} \cdot \left( (\nabla \times \mathbf{B}) \times \mathbf{B} \right) \, \mathrm{d}F}{\int_{C(s)} \left| \hat{\boldsymbol{\phi}} \cdot \left( (\nabla \times \mathbf{B}) \times \mathbf{B} \right) \right| \, \mathrm{d}F}, \tag{31} $$

is small are said to obey a Taylor state. Here, $C(s)$ is the geostrophic cylinder of radius $s$ and $\mathrm{d}F$ is a respective surface element. Torsional oscillations are faster disturbances that ride on the background Taylor state, which is established on the turnover time scale. Wicht and Christensen (2010) find torsional oscillations in dynamo simulations for Ekman numbers of $E = 3 \times 10^{-5}$ or smaller and for relatively low Rayleigh numbers where inertial forces remain secondary. Figure 18 shows the traveling waves and $T(s)$ in their model E6 at $E = 3 \times 10^{-6}$, where TOs can be identified most clearly (see Table 1 for further model parameters and properties). The Alfvén Mach number (see Table 1) provides the ratio of the flow time scale to the Alfvén time scale characteristic for TOs. Alfvén waves travel more than an order of magnitude faster than the typical convective flow speed in Earth’s core. In the low Ekman number case where TOs have actually been identified, they were only roughly twice as fast as the flow, and larger ratios can only be expected when the Ekman number is further decreased below $E = 3 \times 10^{-6}$ (Wicht and Christensen 2010). A recent analysis by Christensen et al. (2012) indicates that dynamo simulations may nevertheless be capable of recovering the decadal variations found in the newest satellite-based geomagnetic field models. This could mean that torsional oscillations are actually not an important part of the decadal variations. The slowest magnetic time scales are associated with field reversals and variations in the reversal rate (Olson et al. 2012). Geomagnetic reversals typically last some thousand years, while simulated reversals seem to take somewhat longer (Wicht 2005; Wicht et al. 2009). Figure 6 shows an example of a reversal sequence in a numerical simulation. The duration of geomagnetic and simulated reversals obeys a very similar latitudinal dependence (Clement 2004; Wicht 2005; Wicht et al.
2009) with shorter durations at the equator and a gradual increase towards the poles. During the last several Myr, the average reversal rate was about 4 per Myr, but there were also long periods in Earth’s history when the geodynamo stopped reversing. The most recent of these so-called superchrons happened in the Cretaceous and lasted for about 37 Myr. Whether the variations in reversal rate are caused by changes in Earth’s mantle (Glatzmaier et al. 1999; Constable 2000; Biggin et al. 2012) or are an expression of the internal stochastic nature of the dynamo process (Jonkers 2003; Ryan and Sarson 2007) is still debated. Generally, dynamo simulations (Glatzmaier et al. 1999; Wicht et al. 2009) and even rather simple parameterized models seem capable of showing Earth-like variations in reversal frequency. A more thorough analysis of reversal properties and their stochastic nature in geodynamo simulations is still missing. Such an analysis seems impossible at lower Ekman numbers due to the excessive computing times required (Wicht et al. 2009).

[Figure 18, panels (a) and (b): cylinder radius $s$ (0.6 to 1.4) plotted against time (0 to 5, in units of $10^{-3}$); scale from $10^{-2}$ to $10^{0}$.]

Fig. 18 Panel (a) shows the speed of geostrophic cylinders outside the tangent cylinder for a selected time span in model E6 (Wicht and Christensen 2010). The time is given in units of the magnetic diffusion time here. Torsional oscillations travel from the inner-core boundary (bottom) towards the outer core boundary (top) with the predicted Alfvén velocity (white lines). The agreement provides an important clue that the propagating features are indeed torsional oscillations. Panel (b) shows the normalized integrated Lorentz force $T(s)$, which can reach values down to $10^{-2}$ where the Taylor state is assumed to a good degree. The Taylor state is broken during the torsional oscillations

Paleomagnetists frequently interpret the local magnetic field provided by a paleomagnetic sample as being caused by a pure dipole contribution, which they call a virtual dipole. Consequently, the associated magnetic pole is referred to as a virtual geomagnetic pole (VGP). The scatter of the associated virtual geomagnetic pole around its mean position is used to quantify the paleosecular variation in paleomagnetic field models. The scatter shows a typical latitudinal dependence with low values around $12^\circ$ at the equator, rising to about $20^\circ$ at the poles. Some dynamo simulations show a very similar variation (Kono and Roberts 2002; Wicht 2005; Christensen and Wicht 2007; Wicht et al. 2009). Christensen and Wicht (2007) demonstrate that the amplitude of the scatter depends on the Rayleigh number and seems somewhat high at Rayleigh numbers where field reversals happen. As for reversals, an analysis of the virtual dipole scatter is missing for lower Ekman number cases because of the long time spans required.
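The virtual geomagnetic pole construction can be made concrete with a short sketch. It uses the standard geocentric dipole relations from paleomagnetism textbooks (magnetic colatitude $p$ from $\tan I = 2 \cot p$, followed by spherical trigonometry); these formulas are not taken from this chapter, and the function and parameter names are illustrative.

```python
import math

def virtual_geomagnetic_pole(site_lat, site_lon, dec, inc):
    """Return (latitude, longitude) of the VGP in degrees.

    site_lat, site_lon: site coordinates in degrees.
    dec, inc: declination and inclination of the local field in degrees.
    Assumes the local field is that of a geocentric dipole.
    """
    lam_s = math.radians(site_lat)
    d = math.radians(dec)
    i = math.radians(inc)
    # Magnetic colatitude of the site: tan(I) = 2 cot(p), with 0 < p < 180 deg
    p = math.atan2(2.0, math.tan(i))
    sin_lat = (math.sin(lam_s) * math.cos(p)
               + math.cos(lam_s) * math.sin(p) * math.cos(d))
    lam_p = math.asin(max(-1.0, min(1.0, sin_lat)))
    cos_lat = math.cos(lam_p)
    if cos_lat < 1e-12:
        # Pole sits on the rotation axis; its longitude is arbitrary.
        return math.degrees(lam_p), site_lon % 360.0
    s = math.sin(p) * math.sin(d) / cos_lat
    beta = math.degrees(math.asin(max(-1.0, min(1.0, s))))
    if math.cos(p) >= math.sin(lam_s) * math.sin(lam_p):
        lon_p = site_lon + beta
    else:
        lon_p = site_lon + 180.0 - beta
    return math.degrees(lam_p), lon_p % 360.0

# A purely axial dipole seen from 30 deg N: tan(I) = 2 tan(latitude), D = 0
inc_axial = math.degrees(math.atan(2.0 * math.tan(math.radians(30.0))))
lat, lon = virtual_geomagnetic_pole(30.0, 10.0, 0.0, inc_axial)
print(f"VGP latitude: {lat:.2f} deg")  # the pole falls on the rotation axis
```

Applying the function to many directions from one site and measuring the angular spread of the resulting poles about their mean would give the VGP scatter discussed above.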

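The two quantitative relations of Sect. 4.0.8 can also be evaluated numerically. The sketch below assumes that Eq. (30) reads $\tau_l = 5.2\,\tau_u/l$ with $\tau_u \approx 200$ years for Earth, as stated in the text, and approximates the surface integrals of Eq. (31) by sums over sample values. The sample values and all function names are hypothetical.

```python
def secular_variation_time(l, tau_u):
    """Eq. (30): secular variation time scale for degree l (units of tau_u)."""
    return 5.2 * tau_u / l

def taylorization(lorentz_phi, d_area):
    """Discrete form of Eq. (31) on one geostrophic cylinder.

    lorentz_phi: azimuthal Lorentz force samples on the cylinder surface.
    d_area: matching surface elements dF.
    """
    numerator = sum(f * dF for f, dF in zip(lorentz_phi, d_area))
    denominator = sum(abs(f) * dF for f, dF in zip(lorentz_phi, d_area))
    return numerator / denominator if denominator > 0.0 else 0.0

# Time scales for Earth, assuming tau_u = 200 years
for degree in (1, 2, 5, 10):
    print(f"l = {degree:2d}: {secular_variation_time(degree, 200.0):7.1f} years")

# Hypothetical force samples that nearly cancel along the cylinder
forces = [1.0, -1.0, 0.5, -0.48]
areas = [1.0, 1.0, 1.0, 1.0]
print(f"T = {taylorization(forces, areas):.4f}")
```

A value of $T$ far below one means that positive and negative Lorentz force contributions nearly cancel on the cylinder, which is the sense in which a Taylor state is said to be obeyed.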

5 Conclusion

We have outlined the ingredients of modern numerical dynamo models that so successfully reproduce many features of the geomagnetic field. These models seem to correctly capture important aspects of the fundamental dynamics and very robustly produce dipole-dominated fields with Earth-like Elsasser numbers and magnetic Reynolds numbers over a wide range of parameters. Even Earth-like reversals are found when the Rayleigh number is chosen high enough to guarantee a sufficiently large impact of inertia. Simple scaling laws allow simulations with Ekman numbers as large as $E = 10^{-3}$ to be connected directly to the geodynamo at $E = 10^{-15}$, to the dynamos of other planets in our solar system, and even to the dynamos in fast-rotating stars (Christensen and Aubert 2006; Olson and Christensen 2006; Takahashi et al. 2008a; Christensen et al. 2009). All this strongly suggests that the models get the basic dynamo process right. Lower Ekman number simulations certainly do a better job in capturing the small-scale turbulent flow in the dynamo regions. Is the associated excessive increase in numerical costs really warranted for modeling dynamo action? A closer comparison with the geomagnetic field suggests that some features indeed become more Earth-like. Reversing dynamos are inevitably too little “dipolar” at Ekman numbers $E \ge 3 \times 10^{-4}$. At $E = 3 \times 10^{-5}$, however, the relative importance of the dipole contribution remains compatible with the historic magnetic field even at Rayleigh numbers where reversals are expected. Torsional oscillations, which may play an important role in the dynamics on decadal time scales, only start to become significant at Ekman numbers $E \le 3 \times 10^{-5}$ (Wicht et al. 2009). Their time scale still remains too slow at $E = 3 \times 10^{-6}$, and a further decrease in Ekman number will likely improve matters here.
Low Ekman number simulations therefore seem in order to faithfully model the dynamics on decadal time scales, where satellite data continue to provide the most reliable geomagnetic field models. On the other hand, even dynamo simulations around $E = 10^{-5}$ already show a clear decadal variation signal (Christensen et al. 2012). Recently, Aubert (2013) used a data assimilation approach to demonstrate that even simpler dynamos at larger Ekman numbers are generally capable of reproducing the geomagnetic secular variation signal from the satellite era. Several published low Ekman number simulations have rather simple magnetic fields with much smaller dipole tilts than typical for Earth, little field complexity in the equatorial region, and an overly symmetric field. The time variability is likely also not very Earth-like. The reason is that the authors concentrated on decreasing the Ekman number while keeping the Rayleigh number moderate. Simulations with larger Rayleigh numbers that yield magnetic Reynolds numbers of at least a few hundred generally produce more Earth-like fields, even at larger Ekman numbers. The drastic regime change which marks the transition from the weak field branch to the strong field branch in Cartesian magnetoconvection problems has never been found in spherical dynamo simulations. Recent simulations have nevertheless shown that the flow length scale clearly increases in the presence of a strong dipolar magnetic field, in particular when internal heating plays an important role in driving the dynamo (Sakuraba and Roberts 2009; Hori et al. 2012). The effect may further increase at lower Ekman numbers. The reason for the non-dipolar field and particular convection pattern found by Kageyama et al. (2008) at $E = 10^{-6}$ remains to be clarified but may have to do with the different model setup. Dynamos at Ekman numbers $E \ge 10^{-5}$ will continue to be the focus of numerical dynamo simulations that aim at understanding the fundamental dynamics and mechanisms.
For exploring the long-term behavior, relatively large Ekman numbers will remain a necessity. These should be supplemented by increased efforts to explore lower Ekman number models which show interesting distinct features that remain little understood. The use of boundary conditions that implement the laterally varying heat fluxes at Earth’s core-mantle boundary is another example of how dynamo modelers try to improve their numerical codes. The heat flux can influence the reversal behavior (Glatzmaier et al. 1999), helps to model some aspects of the long-term geomagnetic field and its secular variation, and even has the potential to explain a seismic anisotropy of Earth’s inner core (Aubert et al. 2008a). Recently revised estimates of the thermal and magnetic diffusivities indicate that both may be around three times higher than previously anticipated (Pozzo et al. 2012). The higher thermal diffusivity means that more heat can be conducted down the adiabat and is not available for driving the dynamo. The heat flux through Earth’s core-mantle boundary may actually be subadiabatic today. Compositional convection and radioactive heating based on potassium may come to the rescue here and will form a focus of future geodynamo research. Modern dynamo simulations have already proven useful for exploring the dynamics of planetary interiors. Model refinements and increasing numerical power will further increase their applicability in the future. Several space missions promise to map the interior magnetic fields of Earth (ESA’s Swarm mission), Mercury (NASA’s MESSENGER and ESA’s BepiColombo missions), and Jupiter (NASA’s Juno mission) with previously unknown precision. High-end numerical dynamo simulations will be indispensable for translating these measurements into interior properties and dynamical processes.

Acknowledgments

Johannes Wicht thanks Uli Christensen for useful discussions and hints.

References

Alboussière T, Deguen R, Melzani M (2010) Melting-induced stratification above the Earth’s inner core due to convective translation. Nature 466:744–747
Amit H, Aubert J, Hulot G (2010a) Stationary, oscillating or drifting geomagnetic flux patches? J Geophys Res 115:B07108
Amit H, Aubert J, Hulot G, Olson P (2008) A simple model for mantle-driven flow at the top of Earth’s core. Earth Planets Space 60:845–854
Amit H, Choblet G (2009) Mantle-driven geodynamo features – effects of post-perovskite phase transition. Earth Planets Space 61:1255–1268
Amit H, Choblet G (2012) Mantle-driven geodynamo features – effects of compositional and narrow D” anomalies. Phys Earth Planet Inter 190:34–43
Amit H, Korte M, Aubert J, Constable C, Hulot G (2011) The time-dependence of intense archeomagnetic flux patches. J Geophys Res 116(B15):B12106
Amit H, Leonhardt R, Wicht J (2010b) Polarity reversals from paleomagnetic observations and numerical dynamo simulations. Space Sci Rev 155:293–335
Amit H, Olson P (2006) Time-average and time-dependent parts of core flow. Phys Earth Planet Inter 155:120–139
Amit H, Olson P (2008) Geomagnetic dipole tilt changes induced by core flow. Phys Earth Planet Inter 166:226–238
Aubert J (2013) Flow throughout the Earth’s core inverted from geomagnetic observations and numerical dynamo models. Geophys J Int 192:1537–556
Aubert J, Amit H, Hulot G (2007) Detecting thermal boundary control in surface flows from numerical dynamos. Phys Earth Planet Inter 160:143–156
Aubert J, Amit H, Hulot G, Olson P (2008a) Thermochemical flows couple the Earth’s inner core growth to mantle heterogeneity. Nature 454:758–761
Aubert J, Aurnou J, Wicht J (2008b) The magnetic structure of convection-driven numerical dynamos. Geophys J Int 172:945–956
Aubert J, Labrosse S, Poitou C (2009) Modelling the paleo-evolution of the geodynamo. Geophys J Int 179:1414–1429
Aubert J, Wicht J (2004) Axial versus equatorial dynamo models with implications for planetary magnetic fields. Earth Planet Sci Lett 221:409–419
Biggin AJ, Steinberger B, Aubert J et al (2012) Possible links between long-term geomagnetic variations and whole-mantle convection processes. Nat Geosci 5:674
Bloxham J, Zatman S, Dumberry M (2002) The origin of geomagnetic jerks. Nature 420:65–68
Braginsky S (1970) Torsional magnetohydrodynamic vibrations in the Earth’s core and variation in day length. Geomag Aeron 10:1–8
Braginsky S, Roberts P (1995) Equations governing convection in Earth’s core and the geodynamo. Geophys Astrophys Fluid Dyn 79:1–97
Breuer M, Manglik A, Wicht J et al (2010) Thermochemically driven convection in a rotating spherical shell. Geophys J Int 183:150–162
Breuer M, Wesseling S, Schmalzl J, Hansen U (2002) Effect of inertia in Rayleigh-Bénard convection. Phys Rev E 69:026320/1–10
Bullard EC, Gellman H (1954) Homogeneous dynamos and terrestrial magnetism. Proc R Soc Lond A 247:213–278
Busse FH, Simitev R (2005a) Convection in rotating spherical fluid shells and its dynamo states. In: Soward AM, Jones CA, Hughes DW, Weiss NO (eds) Fluid dynamics and dynamos in astrophysics and geophysics. CRC Press, Boca Raton, pp 359–392
Busse FH, Simitev R (2005b) Dynamos driven by convection in rotating spherical shells. Astron Nachr 326:231–240
Carlut J, Courtillot V (1998) How complex is the time-averaged geomagnetic field over the past 5 Myr? Geophys J Int 134:527–544
Chan K, Li L, Liao X (2006) Modelling the core convection using finite element and finite difference methods. Phys Earth Planet Inter 157:124–138
Chandrasekhar S (1961) Hydrodynamic and hydromagnetic stability. Clarendon Press, Oxford
Christensen UR (2002) Zonal flow driven by strongly supercritical convection in rotating spherical shells. J Fluid Mech 470:115–133
Christensen UR (2006) A deep dynamo generating Mercury’s magnetic field. Nature 444:1056–1058
Christensen UR (2010) Accepted for publication at Space Sci Rev
Christensen U, Aubert J (2006) Scaling properties of convection-driven dynamos in rotating spherical shells and applications to planetary magnetic fields. Geophys J Int 116:97–114
Christensen UR, Aubert J, Busse FH et al (2001) A numerical dynamo benchmark. Phys Earth Planet Inter 128:25–34
Christensen UR, Aubert J, Hulot G (2010) Conditions for Earth-like geodynamo models. Earth Planet Sci Lett 296:487–496
Christensen UR, Holzwarth V, Reiners A (2009) Energy flux determines magnetic field strength of planets and stars. Nature 457:167–169

Page 43 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Christensen U, Olson P (2003) Secular variation in numerical geodynamo models with lateral variations of boundary heat flow. Phys Earth Planet Inter 138:39–54 Christensen U, Olson P, Glatzmaier G (1999) Numericalmodeling of the geodynamo: a systematic parameter study. Geophys J Int 138:393–409 Christensen U, Tilgner A (2004) Power requirement of the geodynamo from Ohmic losses in numerical and laboratory dynamos. Nature 429:169–171 Christensen U, Wicht J (2007) Numerical dynamo simulations. In: Olson P (eds) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 245–282 Christensen UR, Wardinski I, Lesur V (2012) Time scales of geomagnetic secular acceleration in satellite field models and geodynamo models. Geophys J Int 190:243–254 Clement B (2004) Dependency of the duration of geomagnetic polarity reversals on site latitude. Nature 428:637–640 Clune T, Eliott J, Miesch M, Toomre J, Glatzmaier G (1999) Computational aspects of a code to study rotating turbulent convection in spherical shells. Parallel Comput 25:361–380 Coe R, Hongre L, Glatzmaier A (2000) An examination of simulated geomagnetic reversals from a paleomagnetic perspective. Philos Trans R Soc Lond A 358:1141–1170 Constable C (2000) On the rate of occurence of geomagnetic reversals. Phys Earth Planet Inter 118:181–193 Cowling T (1957) The dynamo maintainance of steady magnetic fields. Q J Mech Appl Math 10:129–136 Dormy E, Cardin P, Jault D (1998) Mhd flow in a slightly differentially rotating spherical shell, with conducting inner core, in a dipolar magnetic field. Earth Planet Sci Lett 158:15–24 Fearn D (1979) Thermal and magnetic instabilities in a rapidly rotating fluid sphere. Geophys Astrophys Fluid Dyn 14:103–126 Fournier A, Bunge H-P, Hollerbach R, Vilotte J-P (2005) A Fourier-spectral element algorithm for thermal convection in rotating axisymmetric containers. 
J Comput Phys 204:462–489 Gastine T, Duarte L, Wicht J (2012) Dipolar versus multipolar dynamos: the influence of the background density stratification. Astron Atrophys 546:A19 Gastine T, Wicht J (2012) Effects of compressibility on driving zonal flow in gas giants. Icarus 219:428–442 Gilbert AD, Frisch U, Pouquet A (1988) Helicity is unnecessary for alpha effect dynamos, but it helps. Geophys Astrophys Fluid Dyn 42(1–2):151–161 Gillet N, Brito D, Jault D, Nataf H (2007) Experimental and numerical studies of convection in a rapidly rotating spherical shell. J Fluid Mech 580:83 Gillet N, Jault D, Canet E, Fournier A (2010) Fast torsional waves and strong magnetic fields ˘ Zs ´ core. Nature 465:74–77 within the EarthâA Glatzmaier G (1984) Numerical simulation of stellar convective dynamos. 1. The model and methods. J Comput Phys 55:461–484 ˘ S¸ how realistic are they? Annu Rev Earth Planet Glatzmaier G (2002) Geodynamo simulations âA Sci 30:237–257 Glatzmaier G, Coe R (2007) Magnetic polarity reversals in the core. In: Olson P (eds) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 283–297 Glatzmaier G, Coe R, Hongre L, Roberts P (1999) The role of the Earth’s mantle in controlling the frequency of geomagnetic reversals. Nature 401:885-890 Glatzmaier G, Roberts P (1995) A three-dimensional convective dynamo solution with rotating and finitely conducting inner core and mantle. Phys Earth Planet Inter 91:63–75

Page 44 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Glatzmaier G, Roberts P (1996) An anelastic evolutionary geodynamo simulation driven by compositional and thermal convection. Physica D 97:81–94 Gubbins D (2001) The Rayleigh number for convection in the Earth’s core. Phys Earth Planet Inter 128:3–12 Gubbins D, Davies CJ (2013) The stratified layer at the core-mantle boundary caused by barodiffusion of oxygen, sulphur and silicon. Phys Earth Planet Inter 215:21–28 Gubbins D, Kelly P (1993) Persistent patterns in the geomagnetic field over the past 2.5 ma. Nature 365:829–832 Gubbins D, Love J (1998) Preferred vgp paths during geomagnetic polarity reversals: symmetry considerations. Geophys Res Lett 25:1079–1082 Gubbins D, Willis AP, Sreenivasan B (2007) Correlation of Earth’s magnetic field with lower mantle thermal and seismic structure. Phys Earth Planet Inter 162:256–260 Harder H, Hansen U (2005) A finite-volume solution method for thermal convection and dynamo problems in spherical shells. Geophys J Int 161:522–532 Heimpel M, Aurnou J, Wicht J (2005) Simulation of equatorial and high-latitude jets on Jupiter in a deep convection model. Nature 438:193–196 Hejda P, Reshetnyak M (2003) Control volume method for the dynamo problem in the sphere with the free rotating inner core. Stud Geophys Geod 47:147–159 Hejda P, Reshetnyak M (2004) Control volume method for the thermal convection problem in a rotating spherical shell: test on the benchmark solution. Stud Geophys Geod 48:741–746 Hongre L, Hulot G, Khokholov A (1998) An analysis of the geomangetic field over the past 2000 years. Phys Earth Planet Inter 106:311–335 ˘ Z´ core: Implications for cessation Hori K, Wicht J (2013) Subcritical dynamos in the early MarsâA of the past Martian dynamo. Phys Earth Planet Inter 219:21–33 Hori K, Wicht J, Christensen UR (2010) The effect of thermal boundary conditions on dynamos driven by internal heating. 
Phys Earth Planet Inter 182:85–97 Hori K, Wicht J, Christensen UR (2012) The influence of thermo-compositional boundary conditions on convection and dynamos in a rotating spherical shell. Phys Earth Planet Inter 196:32–48 Hulot G, Bouligand C (2005) Statistical paleomagnetic field modelling and symmetry considerations. Geophys J Int 161. doi:10.1111/j.1365 Hulot G, Finlay C, Constable C, Olsen N, Mandea M (2010) The magnetic field of planet Earth. Space Sci Rev. doi: 10.1007/s11,214–010–9644–0 Isakov A, Descombes S, Dormy E (2004) An integro-differential formulation of magnet induction in bounded domains: boundary element-finite volume method. J Comput Phys 197:540–554 Ivers D, James R (1984) Axisymmetric antidynamo theorems in non-uniform compressible fluids. Philos Trans R Soc Lond A 312:179–218 Jackson A (1997) Time dependence of geostrophic core-surface motions. Phys Earth Planet Inter 103:293–311 Jackson A (2003) Intense equatorial flux spots on the surface of the Earth’s core. Nature 424:760–763 Jackson A, Finlay C (2007) Geomagnetic secular variation and applications to the core. In: Kono M (ed) Geomagnetism. Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 147–193 Jackson A, Jonkers A, Walker M (2000) Four centuries of geomagnetic secular variation from historical records. Philos Trans R Soc Lond A358:957–990

Page 45 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Jault D (2003) Electromagnetic and topographic coupling, and LOD variations. In: Jones CA, Soward AM, Zhang K (eds) Earth’s core and lower mantle. Taylor & Francis, London/New York, pp 56–76 Jault D, Gire C, LeMouël J-L (1988) Westward drift, core motion and exchanges of angular momentum between core and mantle. Nature 333:353–356 Johnson C, Constable C (1995) Time averaged geomagnetic field as recorded by lava flows over the past 5 Myr. Geophys J Int 122:489–519 Johnson C, Constable C, Tauxe L (2003) Mapping long-term changed in Earth’s magnetic field. Science 300:2044–2045 Johnson CL, McFadden P (2007) Time-averaged field and paleosecular variation. In: Kono M (ed) Geomagnetism. Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 217–254 Jones C (2007) Thermal and compositional convection in the outer core. In: Olson P (eds) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 131–186 Jones CA, Boronski P, Brun AS et al (2011) Anelastic convection-driven dynamo benchmarks. Icarus 216:120–135 Jonkers A (2003) Long-range dependence in the cenozoic reversal record. Phys Earth Planet Inter 135:253–266 Julien K, Knobloch E (1998) Strongly nonlinear convection cells in a rapidly rotating fluid layer: the tilted f-plane. J Fluid Mech 360:141–178 Julien K, Knobloch E, Werne J (1998) A new class of equations for rotationally constrained flows. Theor Comput Fluid Dyn 11(3–4):251–261 Julien K, Rubio A, Grooms I, Knobloch E (2012) Statistical and physical balances in low Rossby number Rayleigh–Bénard convection. Geophys Astrophys Fluid Dyn 106(4–5):392–428 Kageyama A, Miyagoshi T, Sato T (2008) Formation of current coils in geodynamo simulations. Nature 454:1106–1109 Kageyama A, Sato T (1995) Computer simulation of a magnetohydrodynamic dynamo. II. Phys Plasmas 2:1421–1431 Kageyama A, Sato T (1997) Generation mechanism of a dipole field by a magnetohydrodynamic dynamo. 
Phys Rev E 55:4617–4626 Kageyama A, Watanabe K, Sato T (1993) Simulation study of a magnetohydrodynamic dynamo: convection in a rotating shell. Phys Fluids B 24(8):2793–2806 Kageyama A, Yoshida M (2005) Geodynamo and mantle convection simulations on the Earth simulator using the yin-yang grid. J Phys Conf Ser 16:325–338 Kaiser R, Schmitt P, Busse F (1994) On the invisible dynamo. Geophys Astrophys Fluid Dyn 77:93–109 Kelly P, Gubbins D (1997) The geomagnetic field over the past 5 million years. Geophys J Int 128:315–330 Kono M, Roberts P (2002) Recent geodynamo simulations and observations of the geomagnetic field. Rev Geophys 40:1013. doi:10.1029/2000RG000102 Korte M, Constable C (2005) Continuous geomagnetic field models for the past 7 millennia: 2. cals7k. Geochem Geophys Geosys 6:Q02H16 Korte M, Constable C, Donadini F, Holme R (2011) Reconstructing the Holocene geomagnetic field. Earth Planet Sci Lett 312:497–505 Korte M, Genevey A, Constable C, Frank U, Schnepp E (2005) Continuous geomagnetic field models for the past 7 millennia: 1. A new global data compilation. Geochem Geophys Geosyst 6:Q02H15

Page 46 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Korte M, Holme R (2010) On the persistence of geomagnetic flux lobes in global Holocene field models. Phys Earth Planet Inter 182:179–186 Kuang W, Bloxham J (1997) An Earth-like numerical dynamo model. Nature 389:371–374 Kuang W, Bloxham J (1999) Numerical modeling of magnetohydrodynamic convection in a rapidly rotating spherical shell: weak and strong field dynamo action. J Comput Phys 153:51–81 Kuang W, Jiang W, Wang T (2008) Sudden termination of martian dynamo? Implications from subcritical dynamo simulations. Geophys Res Lett 35(14):14,202 Kutzner C, Christensen U (2000) Effects of driving mechanisms in geodynamo models. Geophys Res Lett 27:29–32 Kutzner C, Christensen U (2002) From stable dipolar to reversing numerical dynamos. Phys Earth Planet Inter 131:29–45 Kutzner C, Christensen U (2004) Simulated geomagnetic reversals and preferred virtual geomagnetic pole paths. Geophys J Int 157:1105–1118 Lhuillier F, Fournier A, Hulot G, Aubert J (2011) The geomagnetic secular variation timescale in observations and numerical dynamo models. Geophys Res Lett 38:L09306 Lillis R, Frey H, Manga M (2008) Rapid decrease in martian crustal magnetization in the noachian era: implications for the dynamo and climate of early mars. Geophys Res Lett 35(14):14,203 Manglik A, Wicht J, Christensen UR (2010) A dynamo model with double diffusive convection ˘ Zs ´ core. Earth Planet Sci Lett 289:619–628 for MercuryâA Matsui H, Buffett B (2005) Sub-grid scale model for convection-driven dynamos in a rotating plane layer. Phys Earth Planet Inter 153:74–82 Miyagoshi T, Kageyama A, Sato T (2010) Zonal flow formation in the Earth’s core. Nature 463(7282):793–796 Miyagoshi T, Kageyama A, Sato T (2011) Formation of sheet plumes, current coils, and helical magnetic fields in a spherical magnetohydrodynamic dynamo. Phys Plasmas 18:072901 Monnereau M, Calvet M, Margerin L, Souriau A (2010) Lopsided growth of Earth’s inner core. 
Science 328:1014 Morin V, Dormy E (2009) The dynamo bifurcation in rotating spherical shells. Int J Mod Phys B 23(28n29):5467–5482 Olsen N, Haagmans R, Sabaka TJ et al (2006) The Swarm End-to-End mission simulator study: a demonstration of separating the various contributions to Earth’s magnetic field using synthetic data. Earth Planets Space 58:359–370 Olson P, Christensen U (2002) The time-averaged magnetic field in numerical dynamos with nonuniform boundary heat flow. Geophys J Int 151:809–823 Olson P, Christensen U (2006) Dipole moment scaling for convection-driven planetary dynamos. Earth Planet Sci Lett 250:561–571 Olson P, Christensen UR, Driscoll PE (2012) From superchrons to secular variation: a broadband dynamo frequency spectrum for the geomagnetic dipole. Earth Planet Sci Lett 319–320:75–82 Olson P, Christensen U, Glatzmaier G (1999) Numerical modeling of the geodynamo: mechanism of field generation and equilibration. J Geophys Res 104:10383–10404 Pozzo M, Davies C, Gubbins D, Alfè D (2012) Thermal and electrical conductivity of iron at Earth’s core conditions. Nature 485:355–358 Proctor M (1994) Convection and magnetoconvection in a rapidly rotating sphere. In: Proctor MRE, Gilbert AD (eds) Lectures on solar and planetary dynamos, vol 1. Cambridge University Press, Cambridge/New York, p 97 Roberts P (1972) Kinematic dynamo models. Philos Trans R Soc Lond A 271:663–697

Page 47 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Roberts P (2007) Theory of the geodynamo. In: Olson P (eds) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 245–282 Ryan DA, Sarson GR (2007) Are geomagnetic field reversals controlled by turbulence within the Earth’s core? Geophys Res Lett 34:2307 Sakuraba A (2002) Linear magnetoconvection in rotating fluid spheres permeated by a uniform axial magnetic field. Geophys Astrophys Fluid Dyn 96:291–318 Sakuraba A, Kono M (2000) Effect of a uniform magnetic field on nonlinear magnetocenvection in a rotating fluid spherical shell. Geophys Astrophys Fluid Dyn 92:255–287 Sakuraba A, Roberts P (2009) Generation of a strong magnetic field using uniform heat flux at the surface of the core. Nat Geosci 2:802–805 Schmalzl J, Breuer M, Hansen U (2002) The influence of the Prandtl number on the style of vigorous thermal convection. Geophys Astrophys Fluid Dyn 96:381–403 Simitev R, Busse F (2005) Prandtl-number dependence of convection-driven dynamos in rotating spherical fluid shells. J Fluid Mech 532:365–388 Simitev RD, Busse FH (2009) Bistability and hysteresis of dipolar dynamos generated by turbulent convection in rotating spherical shells. Europhys Lett 85:19001 Soderlund KM, King E, Aurnou JM (2012) The influence of magnetic fields in planetary dynamo models. Earth Planet Sci Lett 333–334:9–20 Sprague M, Julien K, Knobloch E, Werne J (2006) Numerical simulation of an asymptotically reduced system for rotationally constrained convection. J Fluid Mech 551:141–174 Sreenivasan B (2009) On dynamo action produced by boundary thermal coupling. Phys Earth Planet Inter 177:130–138 Sreenivasan B, Jones CA (2006) The role of inertia in the evolution of spherical dynamos. Geophys J Int 164:467–476 Sreenivasan B, Jones CA (2011) Helicity generation and subcritical behaviour in rapidly rotating dynamos. J Fluid Mech 688:5–30 St Pierre M (1993) The strong field branch of the Childress-Soward dynamo. 
In: Proctor MRE et al (eds) Solar and planetary dynamos, Cambridge University Press, Cambridge, pp 329–337 Stanley S, Bloxham J (2004) Convective-region geometry as the cause of Uranus’ and Neptune’s unusual magnetic fields. Nature 428:151–153 Stanley S, Bloxham J, Hutchison W, Zuber M (2005) Thin shell dynamo models consistent with ˘ Zs ´ weak observed magnetic field. Earth Planet Sci Lett 234:341–353 mercuryâA Stanley S, Glatzmaier G (2010) Dynamo models for planets other than Earth. Space Sci Rev 152:617–649 Stellmach S, Hansen U (2004) Cartesian convection-driven dynamos at low ekman number. Phys Rev E 70:056312 Stelzer Z, Jackson A (2013, in press) Extracting scaling laws from numerical dynamo models. Geophys J Int Stieglitz R, Müller U (2001) Experimental demonstration of the homogeneous two-scale dynamo. Phys Fluids 1:561–564 Takahashi F, Matsushima M (2006) Dipolar and non-dipolar dynamos in a thin shell geometry with implications for the magnetic field of Mercury. Geophys Res Lett 33:L10202 Takahashi F, Matsushima M, Honkura Y (2008a) Scale variability in convection-driven MHD dynamos at low Ekman number. Phys Earth Planet Inter 167:168–178 Takahashi F, Tsunakawa H, Matsushima M, Mochizuki N, Honkura Y (2008b) Effects of thermally heterogeneous structure in the lowermost mantle on the geomagnetic field strength. Earth Planet Sci Lett 272:738–746 Page 48 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_16-2 © Springer-Verlag Berlin Heidelberg 2014

Taylor J (1963) The magneto-hydrodynamics of a rotating fluid and the Earth’s dynamo problem. Proc R Soc Lond A 274:274–283 Tilgner A (1996) High-Rayleigh-number convection in spherical shells. Phys Rev E 53:4847–4851 Vallis GK (2006) Atmospheric and oceanic fluid dynamics: fundamentals and large-scale circulation. Cambridge University Press, Cambridge Wicht J (2002) Inner-core conductivity in numerical dynamo simulations. Phys Earth Planet Inter 132:281–302 Wicht J (2005) Palaeomagnetic interpretation of dynamo simulations. Geophys J Int 162:371–380 Wicht J, Aubert J (2005) Dynamos in action. GWDG-Bericht 68:49–66 Wicht J, Christensen UR (2010) Torsional oscillations in dynamo simulations. Geophys J Int 181:1367–1380 ˘ Zs ´ internal magnetic field. Wicht J, Mandea M, Takahashi F et al (2007) The origin of MercuryâA Space Sci Rev 132:261–290 Wicht J, Olson P (2004) A detailed study of the polarity reversalmechanism in a numerical dynamo model. Geochem Geophys Geosyst 5. doi:10.1029/2003GC000602 Wicht J, Stellmach S, Harder H (2009) Numerical models of the geodynamo: from fundamental Cartesian models to 3d simulations of field reversals. In: Glassmeier K, Soffel H, Negendank J (eds) Geomagnetic field variations – space-time structure, processes, and effects on system Earth. Springer monograph. Springer, Berlin/Heidelberg/NewYork, pp 107–158 Wicht J, Tilgner A (2010) Theory and modeling of planetary dynamos. Space Sci Rev 152:501–542 Willis AP, Sreenivasan B, Gubbins D (2007) Thermal core mantle interaction: exploring regimes for locked dynamo action. Phys Earth Planet Inter 165:83–92 Yadav RK, Gastine T, Christensen UR (2013) Scaling laws in spherical shell dynamos with freeslip boundaries. Icarus 225:185–193 Zatman S, Bloxham J (1997) Torsional oscillations and the magnetic field within the Earth’s core. Nature 388:760–761 Zhang K-K, Busse F (1988) Finite amplitude convection and magnetic field generation in in a rotating spherical shell. 
Geophys Astrophys Fluid Dyn 44:33–53 Zhang K, Gubbins D (2000a) Is the geodynamo process intrinsically unstable? Geophys J Int 140:F1–F4 Zhang K, Gubbins D (2000b) Scale disparities and magnetohydrodynamics in the Earth’s core. Philos Trans R Soc Lond A 358:899–920 Zhang K, Schubert G (2000) Magnetohydrodynamics in rapidly rotating spherical systems. Annu Rev Fluid Mech 32:409–443

Page 49 of 49

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_17-2 © Springer-Verlag Berlin Heidelberg 2014

Mathematical Properties Relevant to Geomagnetic Field Modeling

Terence J. Sabaka (a), Gauthier Hulot (b), and Nils Olsen (c)

(a) Planetary Geodynamics Laboratory, Code 698, NASA Goddard Space Flight Center, Greenbelt, MD, USA
(b) Equipe de Géomagnétisme, Institut de Physique du Globe de Paris, Sorbonne Paris Cité, Université Paris Diderot, Paris, France
(c) DTU Space, Technical University of Denmark, Kgs. Lyngby, Denmark

Abstract Geomagnetic field modeling consists in converting large numbers of magnetic observations into a linear combination of elementary mathematical functions that best describes those observations. The set of numerical coefficients defining this linear combination is then what one refers to as a geomagnetic field model. Such models can be used to produce maps. More importantly, they form the basis for the geophysical interpretation of the geomagnetic field, by providing the possibility of separating fields produced by various sources and extrapolating those fields to places where they cannot be directly measured. In this chapter, the mathematical foundation of global (as opposed to regional) geomagnetic field modeling is reviewed, with a focus on the spatial modeling of the field in spherical coordinates. Time can be dealt with as an independent variable and is not explicitly considered. The relevant elementary mathematical functions are introduced, their properties are reviewed, and their use in describing the magnetic field in a source-free (such as the Earth’s neutral atmosphere) or source-dense (such as the ionosphere) environment is explained. Completeness and uniqueness properties of those spatial mathematical representations are also discussed, especially in view of providing a formal justification for the fact that geomagnetic field models can indeed be constructed from ground-based and satellite-borne observations, provided those reasonably approximate the ideal situation where relevant components of the field can be assumed perfectly known on spherical surfaces or shells at the time for which the model is to be recovered.

1 Introduction

The magnetic field measured at or near the Earth’s surface is the superposition of contributions from a variety of sources, as discussed in the accompanying chapter by Olsen et al. in this handbook (Part 2, Chap. 5). The sophisticated separation of the various fields produced by these sources on the basis of magnetic field observations is a major scientific challenge which requires the introduction of adequate mathematical representations of those fields (see, e.g., Hulot et al. 2007).




This chapter provides a synopsis of general properties relevant to such mathematical representations on a planetary scale, which is the scale mainly dealt with here (regional representations of the field will only briefly be discussed, and the reader is referred to, e.g., Purucker and Whaler (2007) and to classical texts such as Harrison (1987), Blakely (1995), and Langel and Hinze (1998)). Such representations are of significant utility since they encode the physics of the magnetic field and allow for a means of deducing these fields from measurements via inverse theory.

It should be noted that these representations are spatial in nature, i.e., they provide instantaneous descriptions of the fields. The physical cause of the time variability of such fields is a complicated subject and, in its true sense, relies on the electromagnetic dynamics of the environment. Induced magnetism, for instance, is a well-understood subject which is covered in several good texts (e.g., Merrill et al. 1998), as is dynamo theory (cf. Chap. 16). However, while the physics is known, the incorporation of the dynamics into inverse problems is in its embryonic stages, especially for the Earth’s core dynamo, and entails the subject of data assimilation, which is beyond the scope of this discussion. In what follows, time is thus considered as an independent implicit variable.

This chapter is divided into five parts. First, the general equations governing the behavior of any magnetic field are briefly recalled. Then the concepts of potential and nonpotential magnetic fields are naturally introduced, the mathematical representation of which is discussed in the following two sections. Next, a short section introduces the useful concept of spatial power spectra. The last section finally deals with uniqueness issues raised by the limited availability of magnetic observations, a topic of paramount importance when defining observational and modeling strategies to recover complete mathematical representations of the field. The chapter concludes with a few words on the practical use of such mathematical representations, guiding the reader to further reading.

2 Helmholtz’s Theorem and Maxwell’s Equations

A convenient starting point for describing the spatial structure of any vector field is the Helmholtz theorem which states that if the divergence and curl of a vector field are known in a particular volume, as well as its normal component over the boundary of that volume, then the vector field is uniquely determined (e.g., Backus et al. 1996). When measurements of the Earth’s magnetic field are taken, they usually reflect some aspect of the magnetic induction vector $\mathbf{B}$. Hence, statements about its divergence and curl will define many of its spatial properties. Maxwell’s equations then provide all the information necessary, sans boundary conditions, to describe the spatial behavior of both $\mathbf{B}$ and the electric field intensity $\mathbf{E}$. They apply to electromagnetic phenomena in media which are at rest with respect to the coordinate system used:

$$\nabla \cdot \mathbf{E} = \frac{\rho_t}{\varepsilon_0}, \tag{1}$$

$$\nabla \cdot \mathbf{B} = 0, \tag{2}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{3}$$

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J}. \tag{4}$$


In SI units, $\mathbf{E}$ is expressed in V/m and $\mathbf{B}$ is expressed in tesla, denoted T (T = V·s/m$^2$). The remaining quantities are the total electric charge density $\rho_t$ in C/m$^3$, the permittivity of free space $\varepsilon_0 = 8.85 \times 10^{-12}$ F/m, the permeability of free space $\mu_0 = 4\pi \times 10^{-7}$ H/m, and the total volume current density $\mathbf{J}$ in A/m$^2$. The total volume current density is actually the sum of volume densities from free charge currents $\mathbf{J}_f$, equivalent currents $\mathbf{J}_e$, and displacement currents $\partial \mathbf{D}/\partial t$, such that

$$\mathbf{J} = \mathbf{J}_f + \mathbf{J}_e + \frac{\partial \mathbf{D}}{\partial t}. \tag{5}$$

While $\mathbf{J}_f$ reflects steady free currents in nonmagnetic materials, $\mathbf{J}_e$ represents an equivalent current effect due to magnetized material whose local properties are described by the net dipole moment per unit volume $\mathbf{M}$ such that

$$\mathbf{J}_e = \nabla \times \mathbf{M}. \tag{6}$$

The displacement current density can be omitted if the time scales of interest are much longer than those required for light to traverse a typical length scale of interest (e.g., Backus et al. 1996). This criterion can be justified by the fact that, for instance, in a linear isotropic medium where $\mathbf{D} = \varepsilon \mathbf{E}$ ($\varepsilon$ is the permittivity of the medium) and $\mathbf{J}_f$ and $\mathbf{J}_e$ are absent, taking the curl of Eq. 4, substituting Eq. 3, and making use of Eq. 2 give the homogeneous magnetic wave equation:

$$\frac{\partial^2 \mathbf{B}}{\partial t^2} = c_l^2 \nabla^2 \mathbf{B}, \tag{7}$$

where the wave propagates at the speed of light $c_l$. When time scales of interest are longer than those required for light to traverse length scales of interest, the left-hand side of Eq. 7 can be neglected, which amounts to neglecting displacement currents in Eq. 4. Geomagnetism deals with frequencies up to only a few Hz and with length scales smaller than the radius of the Earth, which light traverses in about 20 ms. Neglecting the displacement current is thus appropriate for most purposes in geomagnetism. Ampere’s law is then given in a form which reflects the state of knowledge before Maxwell, thus yielding the pre-Maxwell equations where

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J}, \tag{8}$$

and

$$\mathbf{J} = \mathbf{J}_f + \mathbf{J}_e. \tag{9}$$
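The light-travel-time argument used above to drop the displacement current can be checked with a few lines of arithmetic. The sketch below is an illustration, not part of the original derivation: it computes the wave speed from the free-space constants quoted in the text, the time light needs to cross one Earth radius, and the order-of-magnitude ratio $(L/(c_l T))^2$ between the left- and right-hand sides of Eq. 7 for a length scale $L$ and time scale $T$. The Earth radius of 6371.2 km, the 1 s time scale, and all function names are assumptions chosen for illustration.

```python
import math

EPSILON_0 = 8.85e-12          # permittivity of free space, F/m (value quoted in the text)
MU_0 = 4.0 * math.pi * 1e-7   # permeability of free space, H/m (value quoted in the text)
EARTH_RADIUS_M = 6.3712e6     # assumed length scale: mean Earth radius in metres


def speed_of_light(eps=EPSILON_0, mu=MU_0):
    """Wave speed c_l = 1 / sqrt(mu * eps) for the given medium constants."""
    return 1.0 / math.sqrt(mu * eps)


def crossing_time(length_m, speed):
    """Time needed for light to traverse the given length scale."""
    return length_m / speed


def displacement_ratio(length_m, time_scale_s, speed):
    """Order-of-magnitude size of the left-hand side of Eq. 7 relative to
    the right-hand side, estimated as (L / (c_l * T))**2."""
    return (length_m / (speed * time_scale_s)) ** 2


c_l = speed_of_light()
t_cross = crossing_time(EARTH_RADIUS_M, c_l)
ratio_1s = displacement_ratio(EARTH_RADIUS_M, 1.0, c_l)  # assumed 1 s time scale

print(f"c_l     = {c_l:.4e} m/s")
print(f"t_cross = {1e3 * t_cross:.1f} ms")
print(f"ratio   = {ratio_1s:.1e} for a 1 s time scale")
```

With these assumed inputs the crossing time comes out close to the "about 20 ms" quoted above, and the ratio is far below unity, which is the sense in which the left-hand side of Eq. 7 is negligible.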

What is clear from Maxwell’s equations is the absence of magnetic monopoles; magnetic field lines are closed loops. Indeed, because Eq. 2 is valid everywhere and under all circumstances, the magnetic field is solenoidal: its net flux through any closed surface must be zero. By contrast, Ampere’s law (Eq. 8) shows that the magnetic field is irrotational only in the absence of free currents and magnetized material. It is this presence or absence of J that naturally divides magnetic field representations into two classes: potential and nonpotential magnetic fields.
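The step from Maxwell's equations to the wave equation (Eq. 7), described in words earlier in this section, can be written out as follows. This is only a sketch under the assumptions already stated in the text ($\mathbf{J}_f = \mathbf{J}_e = \mathbf{0}$ and $\mathbf{D} = \varepsilon \mathbf{E}$), plus the additional assumption, made here for illustration, that $\varepsilon$ is uniform in space and constant in time; the identification $c_l^2 = 1/(\mu_0 \varepsilon)$ relies on that assumption.

```latex
\begin{align*}
\nabla \times \mathbf{B} &= \mu_0 \varepsilon \frac{\partial \mathbf{E}}{\partial t}
  && \text{(Eqs. 4 and 5 with } \mathbf{J}_f = \mathbf{J}_e = \mathbf{0},\ \mathbf{D} = \varepsilon \mathbf{E}\text{)} \\
\nabla \times (\nabla \times \mathbf{B}) &= \mu_0 \varepsilon \frac{\partial}{\partial t} (\nabla \times \mathbf{E})
  = -\mu_0 \varepsilon \frac{\partial^2 \mathbf{B}}{\partial t^2}
  && \text{(curl of the line above, then Eq. 3)} \\
\nabla \times (\nabla \times \mathbf{B}) &= \nabla (\nabla \cdot \mathbf{B}) - \nabla^2 \mathbf{B}
  = -\nabla^2 \mathbf{B}
  && \text{(vector identity, then Eq. 2)} \\
\frac{\partial^2 \mathbf{B}}{\partial t^2} &= \frac{1}{\mu_0 \varepsilon} \nabla^2 \mathbf{B}
  = c_l^2 \nabla^2 \mathbf{B}
  && \text{(Eq. 7, with } c_l^2 = 1/(\mu_0 \varepsilon)\text{)}
\end{align*}
```

For a field varying on a time scale $T$ over a length scale $L$, the two sides of Eq. 7 scale as $B/T^2$ and $c_l^2 B/L^2$, so their ratio is of order $(L/(c_l T))^2$. This is the quantity that the light-travel-time criterion compares with unity.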


3 Potential Fields

3.1 Magnetic Fields in a Source-Free Shell

Let space be divided into three regions delineated by two shells of lower radius $a$ and higher radius $c$ so that regions I, II, and III are defined as $r \le a$, $a < r < c$, and $c \le r$. Imagine that current systems are confined to only regions I and III and that one wants to describe the field in the source-free region II where $\mathbf{J} = \mathbf{0}$, such as, for instance, the neutral atmosphere. From Ampere’s law, $\nabla \times \mathbf{B} = \mathbf{0}$ in region II. It is well known that the magnetic field is then conservative and can be expressed as the gradient of a scalar potential $V$ (Lorrain and Corson 1970; Backus et al. 1996; Jackson 1998):

$$\mathbf{B} = -\nabla V. \tag{10}$$

These types of fields are known as potential magnetic fields. In addition, from Eq. 2, $\mathbf{B}$ must still be solenoidal, which implies

$$\nabla^2 V = 0, \tag{11}$$

i.e., that the potential $V$ be harmonic. This is Laplace’s equation, which can be solved for $V$ in several different coordinate systems via separation of variables. The spherical system is most natural for the near-Earth environment, and its coordinates are $(r, \theta, \varphi)$, where $r$ is radial distance from the origin, $\theta$ is the polar angle measured from the north polar axis (colatitude), and $\varphi$ is the azimuthal angle measured in the equatorial plane from a prime meridian (longitude). The Laplacian operator may be written in spherical coordinates as

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \tag{12}$$
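A quick numerical experiment can make the role of Eq. 12 more tangible. The sketch below, which is only an illustration and not the chapter's method, applies a central finite-difference version of Eq. 12 to three trial potentials that share the same degree-2, order-1 angular factor $\sin\theta\cos\theta\cos\varphi$ but differ in their radial dependence. The powers $r^{-3}$ and $r^{2}$ are the ones separation of variables predicts for degree 2, while $r^{3}$ is a deliberately wrong choice. The function names, the step size, and the test point are assumptions made for this example.

```python
import math


def laplacian_spherical(V, r, theta, phi, h=1e-3):
    """Central finite-difference approximation of Eq. 12 applied to V(r, theta, phi)."""
    v0 = V(r, theta, phi)
    # Radial part: d2V/dr2 + (2/r) dV/dr
    d2r = (V(r + h, theta, phi) - 2.0 * v0 + V(r - h, theta, phi)) / h**2
    d1r = (V(r + h, theta, phi) - V(r - h, theta, phi)) / (2.0 * h)
    radial = d2r + 2.0 * d1r / r
    # Colatitude part: (1 / (r^2 sin)) d/dtheta (sin dV/dtheta), using half-step fluxes
    flux_plus = math.sin(theta + 0.5 * h) * (V(r, theta + h, phi) - v0) / h
    flux_minus = math.sin(theta - 0.5 * h) * (v0 - V(r, theta - h, phi)) / h
    polar = (flux_plus - flux_minus) / (h * r**2 * math.sin(theta))
    # Longitude part: (1 / (r^2 sin^2)) d2V/dphi2
    d2phi = (V(r, theta, phi + h) - 2.0 * v0 + V(r, theta, phi - h)) / h**2
    azimuthal = d2phi / (r**2 * math.sin(theta) ** 2)
    return radial + polar + azimuthal


def angular(theta, phi):
    """Degree-2, order-1 angular factor (normalization constant omitted)."""
    return math.sin(theta) * math.cos(theta) * math.cos(phi)


def v_internal(r, theta, phi):
    return r ** -3 * angular(theta, phi)   # negative power of r, as in V^i


def v_external(r, theta, phi):
    return r ** 2 * angular(theta, phi)    # positive power of r, as in V^e


def v_wrong(r, theta, phi):
    return r ** 3 * angular(theta, phi)    # radial power that does not match degree 2


point = (1.3, 1.0, 0.7)  # arbitrary test point away from the poles
lap_internal = laplacian_spherical(v_internal, *point)
lap_external = laplacian_spherical(v_external, *point)
lap_wrong = laplacian_spherical(v_wrong, *point)

print(f"internal-type potential: {lap_internal:.2e}")
print(f"external-type potential: {lap_external:.2e}")
print(f"mismatched radial power: {lap_wrong:.4f}")
```

The first two values vanish up to discretization error, whereas the mismatched radial power leaves a residual of order one, which illustrates why only the powers $r^{-(n+1)}$ and $r^{n}$ accompany an angular factor of degree $n$.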

Two linearly independent sets of solutions exist, depending on whether the source currents reside in region I ($V^i$) or region III ($V^e$), i.e., are interior or exterior to the measurement shell, respectively. These solutions correspond to negative and positive powers of $r$, respectively, and are given by (e.g., Langel 1987; Backus et al. 1996)

$$V^i(r,\theta,\varphi) = a \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+1} \sum_{m=0}^{n} \left(g_n^m \cos m\varphi + h_n^m \sin m\varphi\right) P_n^m(\cos\theta), \tag{13}$$

$$V^e(r,\theta,\varphi) = a \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n} \sum_{m=0}^{n} \left(q_n^m \cos m\varphi + s_n^m \sin m\varphi\right) P_n^m(\cos\theta), \tag{14}$$

which leads to

$$\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi), \tag{15}$$

with


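$$\mathbf{B}_i(r,\theta,\varphi) = \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+2} \sum_{m=0}^{n} \left(g_n^m \boldsymbol{\Pi}_{ni}^{mc}(\theta,\varphi) + h_n^m \boldsymbol{\Pi}_{ni}^{ms}(\theta,\varphi)\right), \tag{16}$$

$$\mathbf{B}_e(r,\theta,\varphi) = \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n-1} \sum_{m=0}^{n} \left(q_n^m \boldsymbol{\Pi}_{ne}^{mc}(\theta,\varphi) + s_n^m \boldsymbol{\Pi}_{ne}^{ms}(\theta,\varphi)\right), \tag{17}$$

where

$$\boldsymbol{\Pi}_{ni}^{mc}(\theta,\varphi) = (n+1) P_n^m(\cos\theta) \cos m\varphi \,\hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \cos m\varphi \,\hat{\boldsymbol{\theta}} + \frac{m}{\sin\theta} P_n^m(\cos\theta) \sin m\varphi \,\hat{\boldsymbol{\varphi}}, \tag{18}$$

$$\boldsymbol{\Pi}_{ni}^{ms}(\theta,\varphi) = (n+1) P_n^m(\cos\theta) \sin m\varphi \,\hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \sin m\varphi \,\hat{\boldsymbol{\theta}} - \frac{m}{\sin\theta} P_n^m(\cos\theta) \cos m\varphi \,\hat{\boldsymbol{\varphi}}, \tag{19}$$

$$\boldsymbol{\Pi}_{ne}^{mc}(\theta,\varphi) = -n P_n^m(\cos\theta) \cos m\varphi \,\hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \cos m\varphi \,\hat{\boldsymbol{\theta}} + \frac{m}{\sin\theta} P_n^m(\cos\theta) \sin m\varphi \,\hat{\boldsymbol{\varphi}}, \tag{20}$$

$$\boldsymbol{\Pi}_{ne}^{ms}(\theta,\varphi) = -n P_n^m(\cos\theta) \sin m\varphi \,\hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \sin m\varphi \,\hat{\boldsymbol{\theta}} - \frac{m}{\sin\theta} P_n^m(\cos\theta) \cos m\varphi \,\hat{\boldsymbol{\varphi}}, \tag{21}$$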

and $(\hat{\mathbf r}, \hat{\boldsymbol\theta}, \hat{\boldsymbol\varphi})$ are the unit vectors associated with the spherical coordinates $(r, \theta, \varphi)$. In these equations, $a$ is the reference radius, which in this case corresponds to the lower shell, $P_n^m(\cos\theta)$ is the Schmidt quasi-normalized associated Legendre function of degree $n$ and order $m$ (both being integers), and $g_n^m$ and $h_n^m$ are internal, and $q_n^m$ and $s_n^m$ external, constants known as the Gauss coefficients. Scaling these equations by $a$ yields coefficients whose units are those of $\mathbf B$, i.e., magnetic induction, which in the near-Earth environment is usually expressed in nanoteslas (nT). Note that although they formally appear in Eqs. 13–17, the $h_n^0$ and $s_n^0$ Gauss coefficients are in fact not needed (because $\sin m\varphi = 0$ when $m = 0$) and are therefore not defined. One should also notice the omission of the $n = 0$ terms in both solutions $V^i$ and $V^e$ (and therefore $\mathbf B_i$ and $\mathbf B_e$). For $V^i$, this term would lead to an internal field $\mathbf B_i$ that does not satisfy magnetic monopole exclusion at the origin (required by Eq. 2), while for $V^e$, this term is just a constant (since $P_0^0(\cos\theta) = 1$, see below), which produces no field and may thus be set to zero. Finally, the reader should also be warned that the choice of the Schmidt quasi-normalization for the $P_n^m(\cos\theta)$ in Eqs. 13 and 14 is very specific to geomagnetism. It dates back to the early work of Schmidt (1935) and has since been adopted as the conventional norm to be used in geomagnetism, following a resolution of the International Association of Terrestrial Magnetism and Electricity (IATME) of the International Union of Geodesy and Geophysics (IUGG) (Goldie and Joyce 1940).
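As a small illustration of Eqs. 16, 18, and 19, the following Python sketch evaluates the degree-1 (dipole) part of the internal field. The function name and the coefficient value used in the usage lines are illustrative assumptions, not model values; only the closed forms $P_1^0 = \cos\theta$ and $P_1^1 = \sin\theta$ are taken from the text.

```python
import math

def internal_dipole_field(g10, g11, h11, a, r, theta, phi):
    """Degree-1 internal field from Eqs. 16, 18, 19 (n = 1).

    theta is colatitude and phi is longitude, both in radians.
    Returns (B_r, B_theta, B_phi) in the units of the Gauss coefficients.
    The function name and argument order are hypothetical.
    """
    scale = (a / r) ** 3  # (a/r)^(n+2) with n = 1
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    # Radial part: (n+1) * P_1^m * (cos or sin of m*phi), with P_1^0 = ct, P_1^1 = st
    b_r = 2.0 * (g10 * ct + (g11 * cp + h11 * sp) * st)
    # Theta part: -dP_1^m/dtheta, i.e. +sin(theta) for m = 0 and -cos(theta) for m = 1
    b_theta = g10 * st - (g11 * cp + h11 * sp) * ct
    # Phi part: (m / sin(theta)) * P_1^1 = 1 for m = 1; the m = 0 term vanishes
    b_phi = g11 * sp - h11 * cp
    return scale * b_r, scale * b_theta, scale * b_phi

# Illustrative axial dipole with a round, made-up coefficient (in nT)
print(internal_dipole_field(-30000.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0))
```

With a negative $g_1^0$, the radial component at the north pole ($\theta = 0$) is negative, i.e., the field points toward the origin there.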

3.2 Surface Spherical Harmonics

The angular portions of the terms in Eqs. 13 and 14 are often denoted $Y_n^{m,c}(\theta,\varphi) \equiv P_n^m(\cos\theta)\cos m\varphi$ and $Y_n^{m,s}(\theta,\varphi) \equiv P_n^m(\cos\theta)\sin m\varphi$ and are known as Schmidt quasi-normalized real surface spherical harmonics. They are related to analogous complex functions $Y_{n,m}(\theta,\varphi)$ known as complex surface spherical harmonics, which have been extensively studied in the mathematical and physical literature. For any pair of integers $(n, m)$ with $n \ge 0$ and $-n \le m \le n$, the $Y_{n,m}(\theta,\varphi)$ are defined as complex functions of the form (e.g., Edmonds 1996; Backus et al. 1996)

$$Y_{n,m}(\theta,\varphi) = (-1)^m\sqrt{(2n+1)\frac{(n-m)!}{(n+m)!}}\;P_{n,m}(\cos\theta)\,e^{im\varphi}, \tag{22}$$

where the $P_{n,m}(\cos\theta)$ are again associated Legendre functions of degree $n$ and order $m$, but now satisfying the much more common Ferrers normalization (Ferrers 1877), and defined by (with $\mu = \cos\theta$)

$$P_{n,m}(\mu) = \frac{1}{2^n n!}\,(1-\mu^2)^{m/2}\,\frac{d^{n+m}}{d\mu^{n+m}}(\mu^2-1)^n. \tag{23}$$

Note that this definition holds for $-n \le m \le n$ and leads to the important property that

$$P_{n,-m}(\mu) = (-1)^m\frac{(n-m)!}{(n+m)!}\,P_{n,m}(\mu), \tag{24}$$

which then also implies that

$$Y_{n,-m}(\theta,\varphi) = (-1)^m\,\overline{Y}_{n,m}(\theta,\varphi), \tag{25}$$

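where the overbar denotes complex conjugation. The $Y_{n,m}(\theta,\varphi)$ are eigenfunctions of the angular portion $\nabla_S^2$ of the Laplacian operator,

$$\nabla_S^2 \equiv \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}, \tag{26}$$

such that

$$\nabla_S^2\,Y_{n,m}(\theta,\varphi) = -n(n+1)\,Y_{n,m}(\theta,\varphi). \tag{27}$$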

They represent a complete, orthogonal set of complex functions on the unit sphere. The cumbersome prefactor in Eq. 22 is chosen so that the inner products of these complex surface spherical harmonics over the sphere have the form

$$\langle Y_{n,m}, Y_{l,k}\rangle \equiv \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\overline{Y}_{n,m}(\theta,\varphi)\,Y_{l,k}(\theta,\varphi)\sin\theta\,d\theta\,d\varphi = \delta_{nl}\,\delta_{mk}, \tag{28}$$

where $\delta_{ij}$ is the Kronecker delta. Those complex surface spherical harmonics are then said to be fully normalized. The reader should however be aware that not all authors introduce the $1/4\pi$ factor in the definition (Eq. 28) of the inner product, in which case fully normalized complex surface spherical harmonics are not exactly defined as in Eq. 22 (a $1/\sqrt{4\pi}$ factor then needs to be introduced). Here, the convention used in most previous books dealing with geomagnetism (such as, e.g., Merrill and McElhinny 1983; Langel 1987; Merrill et al. 1998, and especially Backus et al. 1996, where many useful mathematical properties satisfied by those functions can be found) is simply chosen. Similarly, the $(-1)^m$ factor in the definition of Eq. 22 is not always introduced. The Schmidt quasi-normalized $P_n^m(\cos\theta)$ introduced in Eqs. 13 and 14 are then related to the Ferrers normalized $P_{n,m}(\cos\theta)$ through

$$P_n^m(\cos\theta) = \begin{cases} P_{n,m}(\cos\theta) & \text{for } m = 0,\\[4pt] \sqrt{\dfrac{2(n-m)!}{(n+m)!}}\;P_{n,m}(\cos\theta) & \text{for } m > 0, \end{cases} \tag{29}$$
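The relation between the two normalizations can be checked numerically. The sketch below, in plain Python, evaluates the Ferrers functions of Eq. 23 by differentiating the binomial expansion of $(\mu^2-1)^n$ term by term and then applies the factor of Eq. 29. The function names `ferrers` and `schmidt` are hypothetical, and the approach is chosen for transparency rather than numerical efficiency.

```python
import math

def ferrers(n, m, mu):
    """Ferrers-normalized P_{n,m}(mu) from Eq. 23, valid for -n <= m <= n and |mu| < 1."""
    d = n + m  # order of the derivative applied to (mu^2 - 1)^n
    total = 0.0
    for k in range(n + 1):
        p = 2 * k  # power of mu in the k-th binomial term
        if p >= d:
            coeff = math.comb(n, k) * (-1) ** (n - k)
            total += coeff * math.factorial(p) / math.factorial(p - d) * mu ** (p - d)
    return (1.0 - mu * mu) ** (m / 2.0) * total / (2 ** n * math.factorial(n))

def schmidt(n, m, mu):
    """Schmidt quasi-normalized P_n^m(mu) from Eq. 29, for 0 <= m <= n."""
    if m == 0:
        return ferrers(n, 0, mu)
    factor = math.sqrt(2.0 * math.factorial(n - m) / math.factorial(n + m))
    return factor * ferrers(n, m, mu)

theta = 0.7
print(schmidt(2, 2, math.cos(theta)), math.sqrt(3.0) / 2.0 * math.sin(theta) ** 2)
```

The two printed numbers should agree, since $P_2^2(\cos\theta) = \frac{\sqrt{3}}{2}\sin^2\theta$ in the Schmidt quasi-normalization.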

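and the Schmidt quasi-normalized real surface spherical harmonics $Y_n^{m,c}(\theta,\varphi)$ and $Y_n^{m,s}(\theta,\varphi)$ are related to the fully normalized complex surface spherical harmonics $Y_{n,m}(\theta,\varphi)$ through

$$Y_n^{m,c}(\theta,\varphi) = (-1)^m\sqrt{\frac{2}{(2n+1)(1+\delta_{m0})}}\;\Re\left[Y_{n,m}(\theta,\varphi)\right], \tag{30}$$

$$Y_n^{m,s}(\theta,\varphi) = (-1)^m\sqrt{\frac{2}{2n+1}}\;\Im\left[Y_{n,m}(\theta,\varphi)\right], \tag{31}$$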

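for $n \ge 0$ and $0 \le m \le n$. Note that whereas the $P_{n,m}(\cos\theta)$ are defined for $n \ge 0$ and $-n \le m \le n$, which are all needed for the $Y_{n,m}(\theta,\varphi)$ to form a complete orthogonal set of complex functions on the unit sphere, the $P_n^m(\cos\theta)$ are only used for $n \ge 0$ and $0 \le m \le n$, which is then enough for the $Y_n^{m,c}(\theta,\varphi)$ and $Y_n^{m,s}(\theta,\varphi)$ to form a complete, orthogonal set of real functions on the unit sphere. They satisfy

$$\left\langle Y_n^{m,c}, Y_l^{k,c}\right\rangle = \left\langle Y_n^{m,s}, Y_l^{k,s}\right\rangle = \frac{1}{2n+1}\,\delta_{nl}\,\delta_{mk}, \tag{32}$$

and

$$\left\langle Y_n^{m,c}, Y_l^{k,s}\right\rangle = 0. \tag{33}$$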
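Equations 32 and 33 can be spot-checked by brute-force quadrature. The following Python sketch uses the closed-form Schmidt functions for $n \le 2$ listed later in this section and a simple midpoint rule; the helper names, the grid size, and the tolerance one would accept are all assumptions made for illustration.

```python
import math

# Closed-form Schmidt quasi-normalized P_n^m(cos theta) for n <= 2
SCHMIDT = {
    (1, 0): lambda t: math.cos(t),
    (1, 1): lambda t: math.sin(t),
    (2, 0): lambda t: 0.5 * (3.0 * math.cos(t) ** 2 - 1.0),
    (2, 1): lambda t: math.sqrt(3.0) * math.cos(t) * math.sin(t),
    (2, 2): lambda t: 0.5 * math.sqrt(3.0) * math.sin(t) ** 2,
}

def real_harmonic(n, m, kind, theta, phi):
    """Y_n^{m,c} (kind='c') or Y_n^{m,s} (kind='s') at colatitude theta, longitude phi."""
    angular = math.cos(m * phi) if kind == "c" else math.sin(m * phi)
    return SCHMIDT[(n, m)](theta) * angular

def inner_product(f, g, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of (1/4pi) * integral of f*g over the unit sphere."""
    dt, dp = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        t = (i + 0.5) * dt
        row = 0.0
        for j in range(n_phi):
            p = (j + 0.5) * dp
            row += f(t, p) * g(t, p)
        total += row * math.sin(t)
    return total * dt * dp / (4.0 * math.pi)

y21c = lambda t, p: real_harmonic(2, 1, "c", t, p)
y21s = lambda t, p: real_harmonic(2, 1, "s", t, p)
print(inner_product(y21c, y21c), 1.0 / 5.0)  # Eq. 32 predicts 1/(2n+1)
print(inner_product(y21c, y21s))             # Eq. 33 predicts zero
```

The quadrature is only approximate, so the printed values should match the predictions to within the discretization error of the grid.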

Finally, because of Eqs. 30 and 31, they too are eigenfunctions of $\nabla_S^2$ and satisfy

$$\nabla_S^2\,Y_n^{m,(c,s)}(\theta,\varphi) = -n(n+1)\,Y_n^{m,(c,s)}(\theta,\varphi), \tag{34}$$

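where $Y_n^{m,(c,s)}(\theta,\varphi)$ stands for either $Y_n^{m,c}(\theta,\varphi)$ or $Y_n^{m,s}(\theta,\varphi)$. The following recursion relationships allow the Schmidt quasi-normalized series to be generated for a given value of $\theta$ (Langel 1987):

$$P_n^n(\cos\theta) = \sqrt{\frac{2n-1}{2n}}\,\sin\theta\,P_{n-1}^{n-1}(\cos\theta), \qquad n > 1, \tag{35}$$

$$P_n^m(\cos\theta) = \frac{2n-1}{\sqrt{n^2-m^2}}\,\cos\theta\,P_{n-1}^m(\cos\theta) - \sqrt{\frac{(n-1)^2-m^2}{n^2-m^2}}\,P_{n-2}^m(\cos\theta), \qquad n > m \ge 0. \tag{36}$$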
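The recursions of Eqs. 35 and 36 translate directly into code. The Python sketch below is one possible arrangement, seeded with $P_0^0 = 1$, $P_1^0 = \cos\theta$, and $P_1^1 = \sin\theta$; the function name and the dictionary layout are assumptions, and the missing term $P_{n-2}^{m}$ for $m = n-1$ is taken as zero, which is harmless because its coefficient in Eq. 36 vanishes in that case.

```python
import math

def schmidt_legendre(nmax, theta):
    """Return {(n, m): P_n^m(cos theta)} for 0 <= m <= n <= nmax using Eqs. 35-36."""
    ct, st = math.cos(theta), math.sin(theta)
    p = {(0, 0): 1.0}
    if nmax >= 1:
        p[(1, 0)] = ct
        p[(1, 1)] = st
    for n in range(2, nmax + 1):
        # Sectorial term, Eq. 35
        p[(n, n)] = math.sqrt((2 * n - 1) / (2 * n)) * st * p[(n - 1, n - 1)]
        # Remaining orders, Eq. 36
        for m in range(n):
            denom = math.sqrt(n * n - m * m)
            prev2 = p.get((n - 2, m), 0.0)  # absent only when m = n - 1
            p[(n, m)] = ((2 * n - 1) * ct * p[(n - 1, m)]
                         - math.sqrt((n - 1) ** 2 - m * m) * prev2) / denom
    return p

values = schmidt_legendre(6, math.radians(60.0))
print(values[(6, 0)], values[(6, 3)], values[(6, 6)])
```

For production use at high degree, one would normally also guard against underflow of the sectorial terms near the poles; that refinement is omitted here.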


Fig. 1 Schmidt quasi-normalized associated Legendre functions $P_6^m(\cos\theta)$, $m = 0, \ldots, 6$, as a function of $\theta$ (horizontal axis: $\theta$ from 0 to 180; vertical axis: $-1$ to $1$)

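The first few terms of the series are

$$\begin{aligned}
P_0^0(\cos\theta) &= 1, & P_1^0(\cos\theta) &= \cos\theta,\\
P_1^1(\cos\theta) &= \sin\theta, & P_2^0(\cos\theta) &= \tfrac{1}{2}\left(3\cos^2\theta - 1\right),\\
P_2^1(\cos\theta) &= \sqrt{3}\,\cos\theta\sin\theta, & P_2^2(\cos\theta) &= \tfrac{\sqrt{3}}{2}\sin^2\theta,\\
P_3^0(\cos\theta) &= \tfrac{1}{2}\left(5\cos^3\theta - 3\cos\theta\right), & P_3^1(\cos\theta) &= \tfrac{\sqrt{3}}{2\sqrt{2}}\sin\theta\left(5\cos^2\theta - 1\right),\\
P_3^2(\cos\theta) &= \tfrac{\sqrt{15}}{2}\sin^2\theta\cos\theta, & P_3^3(\cos\theta) &= \tfrac{\sqrt{5}}{2\sqrt{2}}\sin^3\theta.
\end{aligned}$$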

Plots of the $P_6^m(\cos\theta)$ functions are shown in Fig. 1 as a function of $\theta$. Real surface spherical harmonic functions $Y_n^{m,c}(\theta,\varphi)$ and $Y_n^{m,s}(\theta,\varphi)$ have $n - m$ zeros in the interval $[0, \pi]$ along meridians and $2m$ zeros along lines of latitude. When $m = 0$, only the $Y_n^{0,c}(\theta,\varphi)$ exist, which are then denoted $Y_n^0(\theta,\varphi)$. They exhibit annuli of constant sign in longitude and are referred to as zonal harmonics. When $n = m$, there are lines of constant sign in latitude and these are referred to as sectorial harmonics. The general cases are termed tesseral harmonics. Figure 2 illustrates examples of each from the $n = 6$ family. Finally, note that since $Y_{0,0} = P_{0,0} = P_0^0 = Y_0^0 = 1$, using either Eq. 28 or Eqs. 32–33 for $n = m = 0$ leads to the important additional property that

$$\frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}Y_{n,m}(\theta,\varphi)\sin\theta\,d\theta\,d\varphi = \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}Y_n^m(\theta,\varphi)\sin\theta\,d\theta\,d\varphi = \delta_{n0}\,\delta_{m0}. \tag{37}$$

For many more useful properties satisfied by associated Legendre functions and real or complex surface spherical harmonics, the reader is referred to, e.g., Langel (1987), Backus et al. (1996), and Dahlen and Tromp (1998). But beware of conventions and normalizations!


Fig. 2 Color representations (red positive, blue negative) of real surface harmonics $Y_6^0$, $Y_6^3$, and $Y_6^6$ (Note that $Y_n^{m,c}$ and $Y_n^{m,s}$ are identical to within a $\pi/2m$ longitudinal phase shift)

3.3 Magnetic Fields from a Spherical Sheet Current

Let space again be divided into three regions, but in a different way. Introduce a single shell of radius $b$ (where $a < b < c$) so that regions I, II, and III are now defined as $r < b$, $r = b$, and $b < r$. This time imagine that the current system is confined to region II only (the spherical shell surface) and that one wants to describe the field produced by those sources in the source-free regions I and III, where $\mathbf J = \mathbf 0$. Here, sources previously assumed to lie either below $r = a$ or above $r = c$ are thus ignored, and only the description of a field produced by a spherical sheet current is considered. This is very applicable to the Earth environment, where currents associated with the ionospheric dynamo reside in the E-region and peak near 115 km altitude (cf. Chap. 5). In region III those sources are seen as internal, producing a field with a potential of the form of Eq. 13:

$$V^i(r,\theta,\varphi) = a\sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n+1}\sum_{m=0}^{n}\left(a_n^m\cos m\varphi + b_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{38}$$

where $a_n^m$ and $b_n^m$ are now the constants. In region I, by contrast, those sources are seen as external, producing a field with a magnetic potential of the form of Eq. 14. However, because $\mathbf B$ is solenoidal, its radial component must be continuous across the sheet so that

$$\left.\frac{\partial V^e}{\partial r}\right|_{r=b} = \left.\frac{\partial V^i}{\partial r}\right|_{r=b}. \tag{39}$$

Unlike the independent internal and external magnetic fields found in a shell sandwiched between source-bearing regions, the fields in regions I and III here are not independent. They are now coupled through Eq. 39. This also means that the expansion coefficients for $V^e$ are now related to the $a_n^m$ and $b_n^m$ for $V^i$ (Granzow 1983; Sabaka et al. 2002):

$$V^e(r,\theta,\varphi) = -a\sum_{n=1}^{\infty}\frac{n+1}{n}\left(\frac{a}{b}\right)^{2n+1}\left(\frac{r}{a}\right)^{n}\sum_{m=0}^{n}\left(a_n^m\cos m\varphi + b_n^m\sin m\varphi\right)P_n^m(\cos\theta). \tag{40}$$
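The coupling expressed by Eq. 40 can be verified term by term: for each degree $n$, the radial derivatives of the $V^i$ and $V^e$ terms must coincide at $r = b$ (Eq. 39). The Python sketch below does this for a single coefficient. The function names are hypothetical, and the numerical values of $a$ and $b$ (a reference radius of 6371.2 km and a sheet 115 km above it) are illustrative assumptions.

```python
def external_coefficient(n, coeff_internal, a, b):
    """Coefficient multiplying a*(r/a)^n in V^e (Eq. 40) for a given a_n^m or b_n^m."""
    return -(n + 1) / n * (a / b) ** (2 * n + 1) * coeff_internal

def dVi_dr(n, coeff, a, r):
    """Radial derivative of the term a*(a/r)^(n+1)*coeff of V^i (Eq. 38)."""
    return -(n + 1) * (a / r) ** (n + 2) * coeff

def dVe_dr(n, coeff_ext, a, r):
    """Radial derivative of the term a*(r/a)^n*coeff_ext of V^e (Eq. 14)."""
    return n * (r / a) ** (n - 1) * coeff_ext

a = 6371.2          # assumed reference radius in km
b = a + 115.0       # assumed sheet radius in km
for n in range(1, 4):
    q = external_coefficient(n, 10.0, a, b)
    print(n, dVi_dr(n, 10.0, a, b), dVe_dr(n, q, a, b))
```

Each printed pair of derivatives should be equal, which is precisely the continuity of the radial field component across the sheet.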

Although the radial component of the field is the same just above and below the sheet current, the horizontal components are not. However, they are in the same vertical plane perpendicular to the sheet, and if Ampere's circuital law (the integral form of Ampere's law) is applied to the area containing the sheet, then the surface current density is seen to be (e.g., Granzow 1983)

$$\mathbf J_s = \frac{1}{\mu_0}\,\hat{\mathbf r}\times\left(\mathbf B_e - \mathbf B_i\right), \tag{41}$$

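which in component form is

$$\begin{pmatrix} J_{s\theta}\\[4pt] J_{s\varphi}\end{pmatrix} = \begin{pmatrix} \dfrac{1}{\mu_0}\displaystyle\sum_{n=1}^{\infty}\dfrac{2n+1}{n}\left(\dfrac{a}{b}\right)^{n+2}\sum_{m=1}^{n}\dfrac{m}{\sin\theta}\left(-a_n^m\sin m\varphi + b_n^m\cos m\varphi\right)P_n^m(\cos\theta)\\[10pt] -\dfrac{1}{\mu_0}\displaystyle\sum_{n=1}^{\infty}\dfrac{2n+1}{n}\left(\dfrac{a}{b}\right)^{n+2}\sum_{m=0}^{n}\left(a_n^m\cos m\varphi + b_n^m\sin m\varphi\right)\dfrac{dP_n^m(\cos\theta)}{d\theta}\end{pmatrix}. \tag{42}$$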

Finally, note that $\mathbf J_s$ can also be written in the often used form

$$\mathbf J_s = -\hat{\mathbf r}\times\nabla\Psi, \tag{43}$$

where $\Psi$ is known as the sheet current function. The SI unit for surface current density is A/m, while that of the sheet current function $\Psi$ is A.

4 Non-potential Fields

In the previous section, mathematical representations were developed for magnetic fields in regions where $\mathbf J = \mathbf 0$. Satellite, surface, and near-surface surveys used in near-Earth magnetic field modeling do not sample the source regions of the core, crust, magnetosphere, or typically the ionospheric E-region. The representations developed so far can therefore be used to describe the fields produced by those sources in regions where observations are made. However, there are additional currents which couple the magnetosphere with the ionosphere in the F-region, where satellite measurements are commonly made (cf. Chap. 5). These additional currents need to be considered. In this section, the condition $\nabla\times\mathbf B = \mathbf 0$ is therefore relaxed. This leads to field forms which are described by two scalar potentials rather than one. While it is true that if $\nabla\cdot\mathbf B = 0$, then there exists a vector potential, $\mathbf A$, such that $\nabla\times\mathbf A = \mathbf B$, the magnetic fields of this section, which cannot be written in the form of the gradient of a scalar potential (as in Eq. 10), will be referred to as nonpotential fields.

4.1 Helmholtz Representations and Vector Spherical Harmonics

To begin with, consider a very general vector field $\mathbf F$. Recall that the Helmholtz theorem then states that if the divergence and curl of $\mathbf F$ are known in a particular volume, as well as its normal component over the boundary of that volume, then $\mathbf F$ is uniquely determined. This is also true if $\mathbf F$ decays to zero as $r \to \infty$ (for a rigorous proof and statement, see, e.g., Blakely 1995; Backus et al. 1996). In addition, $\mathbf F$ can then always be written in the form

$$\mathbf F = \nabla S + \nabla\times\mathbf A, \tag{44}$$

where $S$ and $\mathbf A$ are, however, not uniquely determined. In fact, this degree of freedom further makes it possible to choose $\mathbf A$ of the form $\mathbf A = T\mathbf r + \nabla\times P\mathbf r$, so that $\mathbf F$ can also be written as (e.g., Stern 1976)

$$\mathbf F = \nabla S + \nabla\times T\mathbf r + \nabla\times\nabla\times P\mathbf r. \tag{45}$$

This representation has the advantage that if $\mathbf F$ satisfies the vector Helmholtz equation

$$\nabla^2\mathbf F + k^2\mathbf F = \mathbf 0, \tag{46}$$

then the three scalar potentials $S$, $T$, and $P$ also satisfy the associated scalar Helmholtz equation:

$$\nabla^2 Q + k^2 Q = 0, \tag{47}$$

where $Q$ is either $S$, $T$, or $P$. The solution to this scalar Helmholtz equation can be achieved through separation of variables in spherical coordinates. Expanding the angular dependence of $Q$ in terms of real surface spherical harmonics leads to

$$Q(r,\theta,\varphi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(Q_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + Q_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{48}$$

and Eq. 47 then implies

$$\left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} + k^2 - \frac{n(n+1)}{r^2}\right)Q_n^{m,(c,s)}(r) = 0, \tag{49}$$

which depends only on $n$. This may be further transformed into Bessel's equation such that the solution of Eq. 49 can be written in the form

$$Q_n^{m,(c,s)}(r) = C_n^{m,(c,s)}\,j_n(kr) + D_n^{m,(c,s)}\,n_n(kr), \tag{50}$$

where $j_n(kr)$ and $n_n(kr)$ are spherical Bessel functions of the first and second kind, respectively (Abramowitz and Stegun 1964), and $C_n^{m,(c,s)}$ and $D_n^{m,(c,s)}$ are constants. If $k \to 0$, then it can be shown that, up to constant factors, $j_n(kr)$ and $n_n(kr)$ approach the $r^{n}$ external and $r^{-(n+1)}$ internal potential forms, respectively (Granzow 1983). This is consistent with the fact that, as seen in the previous section, if the field $\mathbf F$ is potential, Eq. 45 reduces to Eq. 10 and if the corresponding potential is harmonic, Eq. 47 reduces to Eq. 11, in which case it can be expanded in the form of Eqs. 13 and 14.

More generally, if the vector field $\mathbf F$ is to be defined within the spherical shell $a < r < c$, the general solutions (Eq. 50) of Eq. 49 for nonzero $k$ can be used for the purpose of writing expansions of $S$, $T$, $P$, and therefore $\mathbf F$, in terms of elementary fields. One can then indeed take advantage of the fact that any well-behaved function defined on $(a, c)$ can always be expanded in terms of a sum (over $i$, or an integral if $c \to \infty$) of spherical Bessel functions of the first kind $j_n(k_i r)$, where $k_i = x_i/c$ and the $x_i$ are the positive roots of $j_n(x) = 0$ (Watson 1966; Granzow 1983). It thus follows that in the spherical shell of interest, any scalar function $Q(r,\theta,\varphi)$ (and, in particular, $S$, $T$, and $P$) can be written in the form of the following expansion:

$$Q(r,\theta,\varphi) = \sum_{i=1}^{\infty}\sum_{n=0}^{\infty} j_n(k_i r)\sum_{m=0}^{n}\left(Q_{n,i}^{m,c}\,Y_n^{m,c}(\theta,\varphi) + Q_{n,i}^{m,s}\,Y_n^{m,s}(\theta,\varphi)\right). \tag{51}$$
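The small-argument behaviour of the radial solutions in Eq. 50 can be illustrated numerically. The standard limits are $j_n(x) \approx x^n/(2n+1)!!$ and $n_n(x) \approx -(2n-1)!!/x^{n+1}$ as $x \to 0$, i.e., the $r^n$ and $r^{-(n+1)}$ radial forms of Sect. 3. The Python sketch below uses only the standard library: $j_n$ is summed from its ascending power series and $n_n$ is built by upward recurrence. The function names `sph_j` and `sph_n` are hypothetical and the truncation of the series at 30 terms is an assumption suited to small and moderate arguments only.

```python
import math

def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def sph_j(n, x, terms=30):
    """Spherical Bessel function of the first kind from its ascending power series."""
    total = 0.0
    term = 1.0 / double_factorial(2 * n + 1)
    for k in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
    return x ** n * total

def sph_n(n, x):
    """Spherical Bessel function of the second kind by upward recurrence (x > 0)."""
    prev = -math.cos(x) / x
    if n == 0:
        return prev
    curr = -math.cos(x) / x ** 2 - math.sin(x) / x
    for k in range(1, n):
        prev, curr = curr, (2 * k + 1) / x * curr - prev
    return curr

x, n = 1e-3, 2
print(sph_j(n, x) / (x ** n / double_factorial(2 * n + 1)))          # close to 1
print(sph_n(n, x) / (-double_factorial(2 * n - 1) / x ** (n + 1)))   # close to 1
```

Both ratios approach unity as $x$ decreases, with relative deviations of order $x^2$.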

mD0

Using such expansions for $S$, $T$, and $P$ in Eq. 45 then provides an expansion of any vector field $\mathbf F$ within the spherical shell $a < r < c$ in terms of elementary vector fields, where advantage can be taken of the fact that each $j_n(k_i r)Y_n^{m,c}(\theta,\varphi)$ and $j_n(k_i r)Y_n^{m,s}(\theta,\varphi)$ will satisfy Eq. 47, with $k = k_i$. More details about how this can be applied to describe any nonpotential magnetic field can be found in Granzow (1983).

Rather than using Eq. 45, one may also use the alternative Helmholtz representation (e.g., Backus et al. 1996; Dahlen and Tromp 1998):

$$\mathbf F = U\hat{\mathbf r} + \nabla_S V - \hat{\mathbf r}\times\nabla_S W, \tag{52}$$

where the angular portion

$$\nabla_S = r\nabla - \mathbf r\,\frac{\partial}{\partial r} \tag{53}$$

of the $\nabla$ operator has been introduced (note that $\nabla_S\cdot\nabla_S = \nabla_S^2$ as defined by Eq. 26). Equation 52 amounts to decomposing $\mathbf F$ in terms of a purely radial vector field $U\hat{\mathbf r}$ and a purely tangent (to the sphere) vector field $\nabla_S V - \hat{\mathbf r}\times\nabla_S W$. Still considering $\mathbf F$ to be a general (well-behaved) vector field defined within the spherical shell $a < r < c$, this representation can be shown to be unique, provided one requires that for any value of $r$ within the shell, the average values of $V$ and $W$ over the sphere of radius $r$ (denoted $\langle\cdot\rangle_{S(r)}$) are such that

$$\langle V\rangle_{S(r)} = \langle W\rangle_{S(r)} = 0. \tag{54}$$

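Of course, each scalar function $U$, $V$, $W$ can then also be expanded in terms of real surface spherical harmonics of the form

$$U(r,\theta,\varphi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(U_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + U_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{55}$$

and similarly for $V$ and $W$ (for which the $n = 0$ term must however be set to zero because of Eq. 54). This then has the advantage that in implementing Eq. 55 and the equivalent expansions for $V$ and $W$ in Eq. 52, the radial dependence of each $U_n^{m,c}(r)$, $U_n^{m,s}(r)$, etc., is not affected by the $\nabla_S$ operator. This leads to

$$\begin{aligned}
\mathbf F(r,\theta,\varphi) ={}& \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left(U_n^{m,c}(r)\,\mathbf P_n^{m,c}(\theta,\varphi) + U_n^{m,s}(r)\,\mathbf P_n^{m,s}(\theta,\varphi)\right)\\
&+ \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(V_n^{m,c}(r)\,\mathbf B_n^{m,c}(\theta,\varphi) + V_n^{m,s}(r)\,\mathbf B_n^{m,s}(\theta,\varphi)\right)\\
&+ \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(W_n^{m,c}(r)\,\mathbf C_n^{m,c}(\theta,\varphi) + W_n^{m,s}(r)\,\mathbf C_n^{m,s}(\theta,\varphi)\right),
\end{aligned} \tag{56}$$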


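where the various $U_n^{m,(c,s)}(r)$, $V_n^{m,(c,s)}(r)$, and $W_n^{m,(c,s)}(r)$ functions can independently be expanded with the help of any relevant representation and the following elementary vector functions have been introduced:

$$\mathbf P_n^{m,(c,s)}(\theta,\varphi) = Y_n^{m,(c,s)}(\theta,\varphi)\,\hat{\mathbf r}, \tag{57}$$

$$\mathbf B_n^{m,(c,s)}(\theta,\varphi) = \nabla_S Y_n^{m,(c,s)}(\theta,\varphi), \tag{58}$$

$$\mathbf C_n^{m,(c,s)}(\theta,\varphi) = -\hat{\mathbf r}\times\nabla_S Y_n^{m,(c,s)}(\theta,\varphi) = \nabla\times Y_n^{m,(c,s)}(\theta,\varphi)\,\mathbf r. \tag{59}$$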

These are known as real vector spherical harmonics (see, e.g., Dahlen and Tromp 1998). Just like the surface spherical harmonics from which they derive, those vectors can also be introduced in complex form and with various norms (see, e.g., Morse and Feshbach 1953; Stern 1976; Granzow 1983; Jackson 1998; Dahlen and Tromp 1998). Here, for simplicity, real quantities and Schmidt quasi-normalizations are considered. Introducing the following inner product for two real vector fields $\mathbf K$ and $\mathbf L$ defined on the unit sphere:

$$\langle\mathbf K, \mathbf L\rangle \equiv \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\mathbf K(\theta,\varphi)\cdot\mathbf L(\theta,\varphi)\sin\theta\,d\theta\,d\varphi, \tag{60}$$

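it can be shown that (see, e.g., Dahlen and Tromp 1998)

$$\left\langle\mathbf K_n^{m,(c,s)}, \mathbf L_l^{k,(c,s)}\right\rangle = 0 \tag{61}$$

as soon as $\mathbf K_n^{m,(c,s)}$ and $\mathbf L_l^{k,(c,s)}$ are not strictly identical real vector spherical harmonics as defined by Eqs. 57–59. For each $\mathbf P_n^{m,(c,s)}$, $\mathbf B_n^{m,(c,s)}$, and $\mathbf C_n^{m,(c,s)}$, it can further be shown that (see again, e.g., Dahlen and Tromp 1998, but beware the choice of inner product (Eq. 60) and normalization (Eq. 32))

$$\left\langle\mathbf P_n^{m,c}, \mathbf P_n^{m,c}\right\rangle = \left\langle\mathbf P_n^{m,s}, \mathbf P_n^{m,s}\right\rangle = \frac{1}{2n+1}, \tag{62}$$

and

$$\left\langle\mathbf B_n^{m,c}, \mathbf B_n^{m,c}\right\rangle = \left\langle\mathbf B_n^{m,s}, \mathbf B_n^{m,s}\right\rangle = \left\langle\mathbf C_n^{m,c}, \mathbf C_n^{m,c}\right\rangle = \left\langle\mathbf C_n^{m,s}, \mathbf C_n^{m,s}\right\rangle = \frac{n(n+1)}{2n+1}. \tag{63}$$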

Note the discrepancy by a factor of $n(n+1)$ between Eqs. 62 and 63, which can be avoided by introducing an additional factor $(n(n+1))^{-1/2}$ in the right-hand side of the definitions Eqs. 58 and 59 (as is most often done in the literature). Those real vector spherical harmonics are thus mutually orthogonal within each family and between families with respect to this inner product. They provide a convenient general basis for expanding any vector field $\mathbf F$ in the spherical shell $a < r < c$, as is made explicit by Eq. 56. Of course, many other bases can be built by making appropriate linear combinations of the $\mathbf P_n^{m,(c,s)}$, $\mathbf B_n^{m,(c,s)}$, and $\mathbf C_n^{m,(c,s)}$. Such linear combinations naturally arise when, for instance, implementing expansions of the type Eq. 51 for $S$, $T$, and $P$ in Eq. 45. Also, it should be noted that other linear combinations of this kind have already been encountered when potential vector fields were considered (as described in Sect. 3). This indeed led to Eqs. 16 and 17, where the elementary vector functions $\boldsymbol\Pi_{ni}^{m,(c,s)}(\theta,\varphi)$ and $\boldsymbol\Pi_{ne}^{m,(c,s)}(\theta,\varphi)$ naturally arose. These can now be rewritten in the form

$$\boldsymbol\Pi_{ni}^{m,(c,s)} = (n+1)\,\mathbf P_n^{m,(c,s)} - \mathbf B_n^{m,(c,s)}, \tag{64}$$

$$\boldsymbol\Pi_{ne}^{m,(c,s)} = -n\,\mathbf P_n^{m,(c,s)} - \mathbf B_n^{m,(c,s)}. \tag{65}$$

Together with the $\mathbf C_n^{m,(c,s)}$, they again form a general basis of mutually orthogonal vector fields (i.e., satisfying Eq. 61, as one can easily check). They also satisfy

$$\left\langle\boldsymbol\Pi_{ni}^{m,c}, \boldsymbol\Pi_{ni}^{m,c}\right\rangle = \left\langle\boldsymbol\Pi_{ni}^{m,s}, \boldsymbol\Pi_{ni}^{m,s}\right\rangle = n+1, \tag{66}$$

$$\left\langle\boldsymbol\Pi_{ne}^{m,c}, \boldsymbol\Pi_{ne}^{m,c}\right\rangle = \left\langle\boldsymbol\Pi_{ne}^{m,s}, \boldsymbol\Pi_{ne}^{m,s}\right\rangle = n. \tag{67}$$
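Equations 66 and 67, as well as the mutual orthogonality of the two $\boldsymbol\Pi$ families, follow from Eqs. 62–65 by simple arithmetic, since $\langle\mathbf P, \mathbf B\rangle = 0$. The short Python sketch below carries out this bookkeeping with exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def norm_P(n):
    """<P, P> from Eq. 62."""
    return Fraction(1, 2 * n + 1)

def norm_B(n):
    """<B, B> from Eq. 63 (the C family has the same value)."""
    return Fraction(n * (n + 1), 2 * n + 1)

def norm_Pi_internal(n):
    """<Pi_ni, Pi_ni> with Pi_ni = (n+1) P - B and <P, B> = 0."""
    return (n + 1) ** 2 * norm_P(n) + norm_B(n)

def norm_Pi_external(n):
    """<Pi_ne, Pi_ne> with Pi_ne = -n P - B and <P, B> = 0."""
    return n ** 2 * norm_P(n) + norm_B(n)

def cross_Pi(n):
    """<Pi_ni, Pi_ne> = -(n+1) n <P, P> + <B, B>."""
    return -(n + 1) * n * norm_P(n) + norm_B(n)

for n in range(1, 5):
    print(n, norm_Pi_internal(n), norm_Pi_external(n), cross_Pi(n))
```

For every degree, the first two values reduce to $n+1$ and $n$, and the cross term vanishes, which is the orthogonality between the internal and external families of the same degree and order.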

An interesting discussion of how this alternative basis can be used for the purpose of describing the Earth’s magnetic field within a shell where sources with simplified geometry exist can be found in Winch et al. (2005).

4.2 Mie Representation

Both Helmholtz representations, Eqs. 45 and 52, apply to any well-behaved vector field $\mathbf F$. Another, more restrictive representation can be derived if one requires the field $\mathbf F$ to be solenoidal, as the magnetic field is required to be because of Eq. 2. Indeed, if a solenoidal field is written in the form of Eq. 45, then $S$ must be harmonic, i.e., it must satisfy Laplace's equation (Eq. 11). However, consider the following identity for the $\nabla\times\nabla\times$ operator in spherical coordinates:

$$\nabla\times\nabla\times P\mathbf r = \nabla\frac{\partial}{\partial r}(rP) - \mathbf r\,\nabla^2 P. \tag{68}$$

This suggests that if an additional scalar potential of the form

$$P_s = \frac{1}{r}\int S\,dr \tag{69}$$

is added to the original scalar potential $P$, then $\nabla S$ may be absorbed into Eq. 68. This is true since $P_s$ is then also harmonic and

$$S = \frac{\partial}{\partial r}(rP_s). \tag{70}$$

What this means is that any solenoidal field such as $\mathbf B$ may be written as the curl of a vector potential, which is to say, the last two terms of Eq. 45. The condition $\nabla\cdot\mathbf B = 0$ has eliminated the need for one of the three original scalar potentials and leaves $\mathbf B$ in the form

$$\mathbf B = \nabla\times T\mathbf r + \nabla\times\nabla\times P\mathbf r. \tag{71}$$

This representation can be shown to be unique provided one requires

$$\langle T\rangle_{S(r)} = \langle P\rangle_{S(r)} = 0, \tag{72}$$

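for all values of $r$ within the spherical shell of interest $a < r < c$. It is known as the Mie representation for a solenoidal vector (Mie 1908; Backus 1986; Backus et al. 1996; Dahlen and Tromp 1998). The first term in Eq. 71 is known as the toroidal part of $\mathbf B$, denoted as $\mathbf B_{\mathrm{tor}}$, and $T$ is known as the toroidal scalar potential:

$$\mathbf B_{\mathrm{tor}} = \nabla\times T\mathbf r \tag{73}$$

$$\phantom{\mathbf B_{\mathrm{tor}}} = \frac{1}{\sin\theta}\frac{\partial T}{\partial\varphi}\,\hat{\boldsymbol\theta} - \frac{\partial T}{\partial\theta}\,\hat{\boldsymbol\varphi}. \tag{74}$$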

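This part has no radial component and its surface divergence is zero,

$$\nabla_S\cdot\mathbf B_{\mathrm{tor}} = 0, \tag{75}$$

where the surface divergence operates on a given vector field $\mathbf F$ as (recall the definition Eq. 53 of $\nabla_S$)

$$\nabla_S\cdot\mathbf F = 2F_r + \frac{1}{\sin\theta}\left[\frac{\partial}{\partial\theta}\left(F_\theta\sin\theta\right) + \frac{\partial F_\varphi}{\partial\varphi}\right]. \tag{76}$$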

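The second term in Eq. 71 is known as the poloidal part of $\mathbf B$, denoted as $\mathbf B_{\mathrm{pol}}$, and $P$ is known as the poloidal scalar potential:

$$\mathbf B_{\mathrm{pol}} = \nabla\times\nabla\times P\mathbf r \tag{77}$$

$$\phantom{\mathbf B_{\mathrm{pol}}} = -\frac{1}{r}\nabla_S^2 P\,\hat{\mathbf r} + \frac{1}{r}\frac{\partial}{\partial\theta}\frac{\partial}{\partial r}(rP)\,\hat{\boldsymbol\theta} + \frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}\frac{\partial}{\partial r}(rP)\,\hat{\boldsymbol\varphi}. \tag{78}$$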

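This part has a vanishing surface curl on the sphere, i.e., it satisfies

$$\hat{\mathbf r}\cdot\left(\nabla_S\times\mathbf B_{\mathrm{pol}}\right) = \boldsymbol\Lambda_S\cdot\mathbf B_{\mathrm{pol}} = 0, \tag{79}$$

where the operator $\boldsymbol\Lambda_S \equiv \hat{\mathbf r}\times\nabla_S$, known as the surface curl operator, has been introduced (Backus et al. 1996; Dahlen and Tromp 1998). This operator satisfies for any vector field $\mathbf F$:

$$\boldsymbol\Lambda_S\cdot\mathbf F = \frac{1}{\sin\theta}\left[\frac{\partial}{\partial\theta}\left(F_\varphi\sin\theta\right) - \frac{\partial F_\theta}{\partial\varphi}\right]. \tag{80}$$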

The terms toroidal and poloidal were coined by Elsasser (1946) and so the Mie representation is sometimes also referred to as the toroidal-poloidal decomposition. Just as for the Helmholtz representations, one can again expand the toroidal and poloidal scalar fields in terms of real spherical harmonics:

$$T(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(T_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + T_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{81}$$

$$P(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(P_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + P_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{82}$$

where the sum starts at $n = 1$, and not at $n = 0$, because of Eq. 72. This then leads to (recalling Eq. 59)

$$\mathbf B_{\mathrm{tor}}(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(T_n^{m,c}(r)\,\mathbf C_n^{m,c}(\theta,\varphi) + T_n^{m,s}(r)\,\mathbf C_n^{m,s}(\theta,\varphi)\right), \tag{83}$$

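and (recalling Eqs. 57 and 58 and making use of Eq. 27)

$$\begin{aligned}
\mathbf B_{\mathrm{pol}}(r,\theta,\varphi) ={}& \sum_{n=1}^{\infty}\sum_{m=0}^{n}\frac{n(n+1)}{r}\left(P_n^{m,c}(r)\,\mathbf P_n^{m,c}(\theta,\varphi) + P_n^{m,s}(r)\,\mathbf P_n^{m,s}(\theta,\varphi)\right)\\
&+ \sum_{n=1}^{\infty}\sum_{m=0}^{n}\frac{1}{r}\left(\frac{d}{dr}\left(rP_n^{m,c}(r)\right)\mathbf B_n^{m,c}(\theta,\varphi) + \frac{d}{dr}\left(rP_n^{m,s}(r)\right)\mathbf B_n^{m,s}(\theta,\varphi)\right),
\end{aligned} \tag{84}$$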

which shows that $\mathbf B_{\mathrm{tor}}$ and $\mathbf B_{\mathrm{pol}}$ are expressible in terms of different families of vector spherical harmonics. As a result, it also follows that for any value of $r$ within the shell of interest $a < r < c$, $\mathbf B_{\mathrm{tor}}$ and $\mathbf B_{\mathrm{pol}}$ are orthogonal over the sphere of radius $r$, i.e., with respect to the inner product defined by Eq. 60:

$$\langle\mathbf B_{\mathrm{tor}}(r), \mathbf B_{\mathrm{pol}}(r)\rangle = 0. \tag{85}$$

4.3 Relationship of B and J Mie Representations

Recall from the pre-Maxwell form of Ampere's law that $\mathbf J$ is also the curl of a vector. This means that it also is a solenoidal field

$$\nabla\cdot\mathbf J = 0 \tag{86}$$

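and so also possesses a toroidal-poloidal decomposition. Let $T_B$ and $P_B$ be the toroidal and poloidal scalars representing the magnetic field $\mathbf B$ and let $T_J$ and $P_J$ be the same for the volume current density $\mathbf J$. It follows from Ampere's law (Eq. 8) and Eqs. 68 and 71 that

$$\mu_0\mathbf J = \nabla\times\nabla\times T_B\mathbf r + \nabla\times\nabla\times\nabla\times P_B\mathbf r \tag{87}$$

$$\phantom{\mu_0\mathbf J} = \nabla\times\left(-\nabla^2 P_B\right)\mathbf r + \nabla\times\nabla\times T_B\mathbf r. \tag{88}$$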

Taking advantage of the uniqueness of the Mie representation when Eq. 72 is satisfied, one can then identify the following relationships between the various scalar functions:

$$T_J = -\frac{1}{\mu_0}\nabla^2 P_B, \tag{89}$$

$$P_J = \frac{1}{\mu_0}T_B. \tag{90}$$

This then shows that poloidal magnetic fields are associated with toroidal current densities and toroidal magnetic fields are associated with poloidal current densities. Conversely, one may solve Eqs. 89 and 90 for $T_B$ and $P_B$ in terms of $T_J$ and $P_J$. The first equation yields Poisson's equation for $P_B$ at position $\mathbf r$,

$$\nabla^2 P_B(\mathbf r) = -\mu_0 T_J(\mathbf r), \tag{91}$$

whose classical solution (see, e.g., Jackson 1998) is given by

$$P_B(\mathbf r) = \frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf r')}{|\mathbf r - \mathbf r'|}\,d\tau', \tag{92}$$

where $\tau'$ is a volume enclosing all currents, $d\tau'$ is the differential volume element, and $\mathbf r'$ is the position within the volume. The second equation yields simply

$$T_B(\mathbf r) = \mu_0 P_J(\mathbf r). \tag{93}$$

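Substituting these into Eqs. 73 and 77 shows the dependence of the toroidal and poloidal parts of $\mathbf B$ with respect to those of $\mathbf J$:

$$\mathbf B_{\mathrm{tor}}(\mathbf r) = \nabla\times\left[\mu_0 P_J(\mathbf r)\,\mathbf r\right], \tag{94}$$

$$\mathbf B_{\mathrm{pol}}(\mathbf r) = \nabla\times\nabla\times\left[\frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf r')}{|\mathbf r - \mathbf r'|}\,d\tau'\right]\mathbf r. \tag{95}$$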

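This is an interesting result considering that $\nabla$ is operating at the position $\mathbf r$ where $\mathbf B$ is being calculated. If this position happens to be in a source-free region where $\mathbf J = \mathbf 0$, then $P_J(\mathbf r) = 0$ and the toroidal magnetic field $\mathbf B_{\mathrm{tor}}(\mathbf r) = \mathbf 0$. This implies that toroidal magnetic fields only exist within a conductor or magnetized material where the associated poloidal $\mathbf J$ is present. At the same location $\mathbf r$, the poloidal magnetic field $\mathbf B_{\mathrm{pol}}$, however, does not identically vanish. This is because $\mathbf B_{\mathrm{pol}}$ is sensitive to the toroidal current scalar $T_J(\mathbf r')$ evaluated within the distant current-carrying volume and not only to its local value. Of course, at such a point, Eq. 95 collapses to the usual potential form

$$\mathbf B_{\mathrm{pol}}(\mathbf r) = \nabla\frac{\partial}{\partial r}\left(rP_B(\mathbf r)\right) = \nabla\left[\frac{\partial}{\partial r}\left(r\,\frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf r')}{|\mathbf r - \mathbf r'|}\,d\tau'\right)\right]. \tag{96}$$

This is because of Eq. 68 and of the fact that

$$\mathbf r\,\nabla^2 P_B(\mathbf r) = \begin{cases} -\mu_0 T_J(\mathbf r)\,\mathbf r & \text{inside } \tau',\\ \mathbf 0 & \text{outside } \tau'. \end{cases} \tag{97}$$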


This result then also provides the possibility of relating the present formalism to the one previously described in Sect. 3 when considering potential fields that arise in a source-free shell $a < r < c$. Then, indeed, the internal (below $r = a$) and external (above $r = c$) current sources lead to internal $P_B^i$ and external $P_B^e$ magnetic poloidal scalar potentials, which can be related to the harmonic scalar potentials $V^i$ and $V^e$ defined by Eqs. 13 and 14 through

$$V^i = -\frac{\partial}{\partial r}\left(rP_B^i\right), \tag{98}$$

$$V^e = -\frac{\partial}{\partial r}\left(rP_B^e\right), \tag{99}$$

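and leading to expansions of the form:

$$P_B^i(r,\theta,\varphi) = a\sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n+1}\sum_{m=0}^{n}\left(G_n^m\cos m\varphi + H_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{100}$$

$$P_B^e(r,\theta,\varphi) = a\sum_{n=1}^{\infty}\left(\frac{r}{a}\right)^{n}\sum_{m=0}^{n}\left(Q_n^m\cos m\varphi + S_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{101}$$

where

$$G_n^m = \frac{1}{n}\,g_n^m, \qquad H_n^m = \frac{1}{n}\,h_n^m, \qquad Q_n^m = -\frac{1}{n+1}\,q_n^m, \qquad S_n^m = -\frac{1}{n+1}\,s_n^m. \tag{102}$$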

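This then amounts to stating that within a source-free spherical shell $a < r < c$, the $r$ dependence of the $P_n^{m,c}(r)$ and $P_n^{m,s}(r)$ in Eq. 82 is entirely specified and of the form

$$P_n^{m,c}(r) = a\left[\frac{g_n^m}{n}\left(\frac{a}{r}\right)^{n+1} - \frac{q_n^m}{n+1}\left(\frac{r}{a}\right)^{n}\right], \tag{103}$$

$$P_n^{m,s}(r) = a\left[\frac{h_n^m}{n}\left(\frac{a}{r}\right)^{n+1} - \frac{s_n^m}{n+1}\left(\frac{r}{a}\right)^{n}\right], \tag{104}$$

while of course $T_n^{m,c}(r) = T_n^{m,s}(r) = 0$.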
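The consistency of Eqs. 98–99 with Eqs. 102–104 can be checked numerically for a single harmonic term: the radial function of Eq. 103, once multiplied by $r$, differentiated with respect to $r$, and negated, should reproduce the radial part of $V^i + V^e$ from Eqs. 13 and 14. The Python sketch below does this with a central finite difference; the function names, the coefficient values, and the step size are illustrative assumptions.

```python
def poloidal_radial(n, g, q, a, r):
    """Radial function P_n^{m,c}(r) of Eq. 103 for Gauss coefficients g (internal), q (external)."""
    return a * (g / n * (a / r) ** (n + 1) - q / (n + 1) * (r / a) ** n)

def potential_radial(n, g, q, a, r):
    """Radial part of the (n, m) cosine term of V^i + V^e from Eqs. 13-14."""
    return a * (g * (a / r) ** (n + 1) + q * (r / a) ** n)

def minus_d_dr_of_rP(n, g, q, a, r, h=1e-5):
    """Central-difference estimate of -d/dr (r * P_n^{m,c}(r)), cf. Eqs. 98-99."""
    upper = (r + h) * poloidal_radial(n, g, q, a, r + h)
    lower = (r - h) * poloidal_radial(n, g, q, a, r - h)
    return -(upper - lower) / (2.0 * h)

a, r = 1.0, 1.3
for n in (1, 2, 5):
    print(n, minus_d_dr_of_rP(n, 2.0, -0.7, a, r), potential_radial(n, 2.0, -0.7, a, r))
```

The two columns should agree to within the finite-difference error, which confirms the factors $1/n$ and $-1/(n+1)$ of Eq. 102.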

4.4 Magnetic Fields in a Current-Carrying Shell

The toroidal-poloidal decomposition provides a convenient way of describing the magnetic field in a current-carrying shell $a < r < c$. The magnetic field then has four parts

$$\mathbf B = \mathbf B_{\mathrm{pol}}^{i} + \mathbf B_{\mathrm{pol}}^{e} + \mathbf B_{\mathrm{pol}}^{sh} + \mathbf B_{\mathrm{tor}}^{sh}, \tag{105}$$

where $\mathbf B_{\mathrm{pol}}^{i}$ is the potential field due to toroidal currents $\mathbf J_{\mathrm{tor}}^{i}$ in the region $r < a$ (in fact, due to all currents in the region $r < a$, since poloidal currents in this region do not produce any field in the shell $a < r < c$), $\mathbf B_{\mathrm{pol}}^{e}$ is the potential field due to toroidal currents $\mathbf J_{\mathrm{tor}}^{e}$ in the region $c < r$ (in fact, due to all currents in the region $c < r$, since poloidal currents in this region do not produce any field in the shell $a < r < c$), and $\mathbf B_{\mathrm{pol}}^{sh}$ and $\mathbf B_{\mathrm{tor}}^{sh}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf J_{\mathrm{tor}}^{sh}$ and poloidal $\mathbf J_{\mathrm{pol}}^{sh}$ currents in the shell. From what was just seen, it is now known that the behavior of $\mathbf B_{\mathrm{pol}}^{i}$ and $\mathbf B_{\mathrm{pol}}^{e}$ is entirely dictated by the fact that their respective poloidal scalars $P_B^i$ and $P_B^e$ must take the forms of Eqs. 100 and 101. Those potential fields are thus entirely defined by the knowledge of the $G_n^m$, $H_n^m$, $Q_n^m$, and $S_n^m$ coefficients (or, equivalently, of the Gauss $g_n^m$, $h_n^m$, $q_n^m$, and $s_n^m$ coefficients). By contrast, all that is known about $\mathbf B_{\mathrm{pol}}^{sh}$ and $\mathbf B_{\mathrm{tor}}^{sh}$ is that they are associated with toroidal and poloidal scalars which can be written in the form of Eqs. 81 and 82, so that they themselves can be written in the form of Eqs. 83 and 84. The radial dependence of the corresponding $P_n^{m,(c,s)}(r)$ and $T_n^{m,(c,s)}(r)$ functions is unknown, but dictated by the distribution of the current sources $\mathbf J_{\mathrm{tor}}^{sh}$ and $\mathbf J_{\mathrm{pol}}^{sh}$ within the shell, because of Eqs. 92 and 93.

It is interesting at this stage to introduce an inner shell $a' < r < c'$ within the spherical shell $a < r < c$ considered so far (which is referred to here as the outer shell). Space can then naturally be divided into five regions (Fig. 3): region I for $r < a$, region II for $a < r < a'$, region III for $a' < r < c'$ (the inner shell), region IV for $c' < r < c$, and region V for $r > c$. As was just seen, everywhere within the outer shell $a < r < c$, the field $\mathbf B$ can be written in the form of Eq. 105. But in exactly the same way, everywhere within the inner shell $a' < r < c'$, the field can also be written in the form

$$\mathbf B = \mathbf B_{\mathrm{pol}}^{i'} + \mathbf B_{\mathrm{pol}}^{e'} + \mathbf B_{\mathrm{pol}}^{sh'} + \mathbf B_{\mathrm{tor}}^{sh'}, \tag{106}$$

where $\mathbf B_{\mathrm{pol}}^{i'}$ is the potential field due to toroidal currents $\mathbf J_{\mathrm{tor}}^{i'}$ in the region $r < a'$, $\mathbf B_{\mathrm{pol}}^{e'}$ is the potential field due to toroidal currents $\mathbf J_{\mathrm{tor}}^{e'}$ in the region $c' < r$, and $\mathbf B_{\mathrm{pol}}^{sh'}$ and $\mathbf B_{\mathrm{tor}}^{sh'}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf J_{\mathrm{tor}}^{sh'}$ and poloidal $\mathbf J_{\mathrm{pol}}^{sh'}$ currents in the inner shell. It is important to note that $\mathbf B_{\mathrm{pol}}^{i'} \neq \mathbf B_{\mathrm{pol}}^{i}$ and $\mathbf B_{\mathrm{pol}}^{e'} \neq \mathbf B_{\mathrm{pol}}^{e}$. Toroidal currents flowing in region I (resp. V), which already contributed to $\mathbf B_{\mathrm{pol}}^{i}$ (resp. $\mathbf B_{\mathrm{pol}}^{e}$), still contribute to $\mathbf B_{\mathrm{pol}}^{i'}$ (resp. $\mathbf B_{\mathrm{pol}}^{e'}$). But toroidal currents flowing in region II (resp. IV), which contributed to $\mathbf B_{\mathrm{pol}}^{sh}$ and not to $\mathbf B_{\mathrm{pol}}^{i}$ (resp. $\mathbf B_{\mathrm{pol}}^{e}$), now contribute to $\mathbf B_{\mathrm{pol}}^{i'}$ (resp. $\mathbf B_{\mathrm{pol}}^{e'}$). Only the toroidal currents flowing in region III, which already contributed to $\mathbf B_{\mathrm{pol}}^{sh}$, still contribute to $\mathbf B_{\mathrm{pol}}^{sh'}$. By contrast, $\mathbf B_{\mathrm{tor}}^{sh'} = \mathbf B_{\mathrm{tor}}^{sh}$, both because the toroidal-poloidal decomposition is unique and because the toroidal field is only sensitive to the local behavior of the poloidal currents (recall Eq. 94). Obviously, the smaller the thickness $h = c' - a'$ of the inner shell, the fewer sources contributing to $\mathbf B_{\mathrm{pol}}^{sh}$ still contribute to $\mathbf B_{\mathrm{pol}}^{sh'}$. In fact, it can be shown that if this inner shell eventually shrinks to a sphere of radius $r = R$ (Fig. 3), then $\mathbf B_{\mathrm{pol}}^{sh'}$ goes to zero as $h/R \to 0$, while $\mathbf B_{\mathrm{tor}}^{sh'}$ remains equal to $\mathbf B_{\mathrm{tor}}^{sh}$ (Backus 1986). Further introducing $\mathbf B_{iR}$ (resp. $\mathbf B_{eR}$) as the limit of $\mathbf B_{\mathrm{pol}}^{i'}$ (resp. $\mathbf B_{\mathrm{pol}}^{e'}$) when $h/R \to 0$ then makes it possible to introduce the following unique decomposition of a magnetic field $\mathbf B$ on a sphere of radius $r = R$ surrounded by sources:

Fig. 3 Schematic showing a meridional cross section of a general current-carrying shell $a < r < c$ within which an inner shell $a' < r < c'$ is defined (radii marked, from the outside in: $r = c$, $r = c'$, $r = R$, $r = a'$, $r = a$). This then defines five regions, as identified by their numbers (I, II, III, IV, and V), within which both toroidal and poloidal sources can be found. These will contribute differently to the field, depending on where the field is observed. Of particular interest is the way these contribute when the field is observed in the inner shell, and this shell progressively shrinks to the sphere $r = R$ (see text for details)

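\[ \mathbf{B}(R,\theta,\varphi) = \mathbf{B}_{iR}(R,\theta,\varphi) + \mathbf{B}_{eR}(R,\theta,\varphi) + \mathbf{B}^{sh}_{\mathrm{tor}}(R,\theta,\varphi), \tag{107} \]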

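where (recall Eq. 83)

\[ \mathbf{B}^{sh}_{\mathrm{tor}}(R,\theta,\varphi) = \sum_{n=1}^{\infty} \sum_{m=0}^{n} \left( T_n^{m,c}(R)\, \mathbf{C}_n^{m,c}(\theta,\varphi) + T_n^{m,s}(R)\, \mathbf{C}_n^{m,s}(\theta,\varphi) \right), \tag{108} \]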

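is the toroidal field produced on the sphere $r = R$ by the local poloidal currents and $\mathbf{B}_{iR}(R,\theta,\varphi)$ and $\mathbf{B}_{eR}(R,\theta,\varphi)$ are the potential fields produced on the sphere $r = R$ by all sources, respectively, below and above $r = R$. These fields are in fact the values taken for $r = R$ of the potential fields $\mathbf{B}_{iR}(r,\theta,\varphi)$ and $\mathbf{B}_{eR}(r,\theta,\varphi)$ which may be defined more generally for, respectively, $r \geq R$ and $r \leq R$, with the help of (recall Eqs. 98–102 and 16–17)

\[ \mathbf{B}_{iR}(r,\theta,\varphi) = \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+2} \sum_{m=0}^{n} \left( g_n^m(R)\, \boldsymbol{\Pi}_{ni}^{mc}(\theta,\varphi) + h_n^m(R)\, \boldsymbol{\Pi}_{ni}^{ms}(\theta,\varphi) \right), \tag{109} \]

\[ \mathbf{B}_{eR}(r,\theta,\varphi) = \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n-1} \sum_{m=0}^{n} \left( q_n^m(R)\, \boldsymbol{\Pi}_{ne}^{mc}(\theta,\varphi) + s_n^m(R)\, \boldsymbol{\Pi}_{ne}^{ms}(\theta,\varphi) \right), \tag{110} \]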

where the $R$ dependence of the Gauss coefficients is there to recall that these Gauss coefficients describe the fields produced by all sources, respectively, below ($g_n^m(R)$, $h_n^m(R)$) and above ($q_n^m(R)$, $s_n^m(R)$) $r = R$.

4.5 Thin-Shell Approximation

Finally, if $h/R$ is not zero, but small enough, one may still write within the (then thin) shell $a' < r < c'$ around $r = R$:

\[ \mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi) + \mathbf{B}^{sh}_{\mathrm{tor}}(r,\theta,\varphi), \tag{111} \]

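where $\mathbf{B}_i(r,\theta,\varphi)$, $\mathbf{B}_e(r,\theta,\varphi)$, and $\mathbf{B}^{sh}_{\mathrm{tor}}(R,\theta,\varphi)$ are given by Eqs. 108–110. Equation 111 is then correct to within $\mathbf{B}^{sh}_{\mathrm{pol}}$ and $\mathbf{B}^{sh}_{\mathrm{tor}}$ corrections of order $h/R$ (Backus 1986). This approximation is known as the thin-shell approximation.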


5 Spatial Power Spectra

In discussing the various components of the Earth’s magnetic field, and in particular the way each component contributes on average to the observed magnetic field, it will prove useful to deal with the concept of spatial power spectra. This concept was introduced by Mauersberger (1956) and popularized by Lowes (1966, 1974), both in the case of potential fields. However, it is quite straightforward to introduce these also in the case of nonpotential fields. Indeed, consider the sphere $S(R)$ of radius $r = R$ and assume the most general case when this sphere is surrounded by sources. Then, as seen previously, the field can be written in the form of Eq. 107, and its average squared magnitude over $S(R)$ can be written in the form

\[ \left\langle \mathbf{B}^2(R,\theta,\varphi) \right\rangle_{S(R)} = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} \mathbf{B}(R,\theta,\varphi) \cdot \mathbf{B}(R,\theta,\varphi)\, \sin\theta \, d\theta \, d\varphi = W^i(R) + W^e(R) + W^T(R), \tag{112} \]

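where

\[ W^i(R) = \sum_{n=1}^{\infty} W_n^i(R), \qquad W^e(R) = \sum_{n=1}^{\infty} W_n^e(R), \qquad W^T(R) = \sum_{n=1}^{\infty} W_n^T(R), \tag{113} \]

with

\[ W_n^i(R) = (n+1) \left(\frac{a}{R}\right)^{2n+4} \sum_{m=0}^{n} \left[ \left(g_n^m(R)\right)^2 + \left(h_n^m(R)\right)^2 \right], \tag{114} \]

\[ W_n^e(R) = n \left(\frac{R}{a}\right)^{2n-2} \sum_{m=0}^{n} \left[ \left(q_n^m(R)\right)^2 + \left(s_n^m(R)\right)^2 \right], \tag{115} \]

\[ W_n^T(R) = \frac{n(n+1)}{2n+1} \sum_{m=0}^{n} \left[ \left(T_n^{m,c}(R)\right)^2 + \left(T_n^{m,s}(R)\right)^2 \right], \tag{116} \]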

all of which follows from Eqs. 108 to 110 and the orthogonality properties of the $\mathbf{C}_n^{m,(c,s)}$ and $\boldsymbol{\Pi}_{n,(i,e)}^{m,(c,s)}$ spherical harmonic vectors (recall Eqs. 60–62). Equations 112–113 then show that each type of field – the potential field produced by all sources above $r = R$, the potential field produced by all sources below $r = R$, and the nonpotential (toroidal) field produced by the local (poloidal) sources on $r = R$ – and within each type of field, each degree $n$ (in fact, each elementary field of degree $n$ and order $m$, as is further shown by Eqs. 114–116) contributes independently to the average squared magnitude $\langle \mathbf{B}^2(R,\theta,\varphi) \rangle_{S(R)}$ on the sphere $r = R$. Hence, plotting $W_n^i(R)$ (resp. $W_n^e(R)$, $W_n^T(R)$) as a function of $n$ provides a very convenient means of identifying which sources, and within each type of source, which degrees $n$, contribute most on average to the magnetic field $\mathbf{B}(R,\theta,\varphi)$ on the sphere $r = R$. Such plots are known as spatial power spectra.

In the more restrictive (and better known) case when the field is potential within a source-free shell $a < r < c$, the field takes the form (Eq. 15) $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$, where $\mathbf{B}_i(r,\theta,\varphi)$ and $\mathbf{B}_e(r,\theta,\varphi)$ are respectively defined by Eqs. 16 and 17. In exactly the same way, it can again be shown that for any value of $r$ within the shell, each degree $n$ of the field $\mathbf{B}_i(r,\theta,\varphi)$
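of internal origin (with sources below $r = a$) and of the field $\mathbf{B}_e(r,\theta,\varphi)$ of external origin (with sources above $r = c$) contributes to the average squared magnitude $\langle \mathbf{B}^2(r,\theta,\varphi) \rangle_{S(r)}$ on the sphere $S(r)$ by, respectively,

\[ W_n^i(r) = (n+1) \left(\frac{a}{r}\right)^{2n+4} \sum_{m=0}^{n} \left[ \left(g_n^m\right)^2 + \left(h_n^m\right)^2 \right], \tag{117} \]

\[ W_n^e(r) = n \left(\frac{r}{a}\right)^{2n-2} \sum_{m=0}^{n} \left[ \left(q_n^m\right)^2 + \left(s_n^m\right)^2 \right], \tag{118} \]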

which again define spatial power spectra. Note that in that case, the only $r$-dependence is the one due to the geometric factors $(a/r)^{2n+4}$ and $(r/a)^{2n-2}$. The spectrum for the field of internal origin $W_n^i(r)$ is what is often referred to as the Lowes-Mauersberger spectrum.
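
As an illustration of Eq. 117, the following minimal Python sketch evaluates the Lowes-Mauersberger spectrum from a table of internal Gauss coefficients. The function name `lowes_mauersberger`, the dictionary layout, the reference radius of 6371.2 km, and the coefficient values are all assumptions made for this example; the values are a toy dipole-dominated set, not a real field model.

```python
def lowes_mauersberger(g, h, r, a=6371.2):
    """Return {n: W_n^i(r)} following Eq. 117.

    g, h : dicts mapping (n, m) to internal Gauss coefficients (e.g., in nT)
    r    : radius at which the spectrum is evaluated (same unit as a)
    a    : reference radius of the coefficients (assumed value, in km)
    """
    spectrum = {}
    for n, m in set(g) | set(h):
        # Geometric factor (n + 1) (a / r)^(2n + 4) of Eq. 117
        geometric = (n + 1) * (a / r) ** (2 * n + 4)
        power = g.get((n, m), 0.0) ** 2 + h.get((n, m), 0.0) ** 2
        spectrum[n] = spectrum.get(n, 0.0) + geometric * power
    return spectrum


# Hypothetical toy coefficients (illustrative only)
g = {(1, 0): -30000.0, (1, 1): -2000.0, (2, 0): -2500.0}
h = {(1, 1): 5000.0}

surface = lowes_mauersberger(g, h, r=6371.2)      # evaluated at r = a
aloft = lowes_mauersberger(g, h, r=2 * 6371.2)    # evaluated at r = 2a
```

Comparing `surface` and `aloft` shows the effect of the geometric factor alone: doubling $r$ divides the degree-1 term by $2^6$ and the degree-2 term by $2^8$, so higher degrees fade faster with altitude.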

6 Mathematical Uniqueness Issue

Sections 2–4 introduced several mathematical ways of representing magnetic fields in spherical shells, when sources of those fields lie either below, within, or above those shells. The goal is now to take advantage of those representations to explain how the best possible description of the Earth’s magnetic field can be recovered from available observations. Obviously, the more the observations, the better the description. However, even an infinite number of observations might not be enough to guarantee that one does eventually achieve a proper description of the field. Observations do not only need to be numerous; they also need to provide adequate information. This is an important issue since, in practice, observations cannot be made anywhere. Historical observations have all been made at the Earth’s surface. Aeromagnetic surveys provided additional observations after 1950, and satellite missions only started in the 1960s. Assuming that all those observations could have been made in an infinitely dense way (here, the issue of the limited density of observations is not discussed) and instantly (temporal issues are not discussed either), this means that the best information historical observations and aeromagnetic surveys can provide is the knowledge of $\mathbf{B}$ or of some derived quantities (components, direction, or intensity) over the entire surface of the Earth. Satellites bring additional information, which at best is a complete knowledge of $\mathbf{B}$ or of some derived quantities within a thin shell where sources can be found. To what extent is this enough to recover a complete mathematical description of the Earth’s magnetic field?

6.1 Uniqueness of Magnetic Fields in a Source-Free Shell

First consider the situation when the sources of the magnetic field are a priori known to be either internal to an inner surface $\Sigma_i$ or external to an outer surface $\Sigma_e$, so that the shell in between $\Sigma_i$ and $\Sigma_e$ is source-free. Within this shell, Eqs. 2 and 8 show that both $\nabla \cdot \mathbf{B}\,(= 0)$ and $\nabla \times \mathbf{B}\,(= 0)$ are known. Applying the Helmholtz theorem then shows that the field is completely characterized within the shell, provided the normal component of the field is known everywhere on both $\Sigma_i$ and $\Sigma_e$.


This very general result, also known in potential theory as the uniqueness theorem for Neumann boundary conditions (e.g., Kellogg 1954; Blakely 1995), can be made more explicit in the case considered in Sect. 3 when the shell is spherical, and $\Sigma_i$ and $\Sigma_e$ are defined by $r = a$ and $r = c$. In that case, it is known that within the shell, $\mathbf{B}$ is described by $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$ (Eq. 15) where $\mathbf{B}_i(r,\theta,\varphi)$ and $\mathbf{B}_e(r,\theta,\varphi)$ are given by Eqs. 16 and 17, the radial components of which can easily be inferred from Eqs. 18–21:

\[ B_{ir}(r,\theta,\varphi) = \sum_{n=1}^{\infty} (n+1) \left(\frac{a}{r}\right)^{n+2} \sum_{m=0}^{n} \left( g_n^m \cos m\varphi + h_n^m \sin m\varphi \right) P_n^m(\cos\theta), \tag{119} \]

\[ B_{er}(r,\theta,\varphi) = \sum_{n=1}^{\infty} (-n) \left(\frac{r}{a}\right)^{n-1} \sum_{m=0}^{n} \left( q_n^m \cos m\varphi + s_n^m \sin m\varphi \right) P_n^m(\cos\theta). \tag{120} \]

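This can then be used together with Eqs. 32 and 33 to show that

\[ \frac{2n+1}{4\pi} \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} B_r(r,\theta,\varphi)\, P_n^m(\cos\theta) \cos m\varphi \, \sin\theta \, d\theta \, d\varphi = (n+1) \left(\frac{a}{r}\right)^{n+2} g_n^m - n \left(\frac{r}{a}\right)^{n-1} q_n^m, \tag{121} \]

\[ \frac{2n+1}{4\pi} \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} B_r(r,\theta,\varphi)\, P_n^m(\cos\theta) \sin m\varphi \, \sin\theta \, d\theta \, d\varphi = (n+1) \left(\frac{a}{r}\right)^{n+2} h_n^m - n \left(\frac{r}{a}\right)^{n-1} s_n^m. \tag{122} \]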

Then, assuming that the normal component of $\mathbf{B}$ is known on both $\Sigma_i$ ($r = a$) and $\Sigma_e$ ($r = c$) amounts to assuming that $B_r(a,\theta,\varphi)$ and $B_r(c,\theta,\varphi)$ are both completely known; making use of Eqs. 121 and 122 once for $r = a$ and again for $r = c$ leads to a set of linear equations from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can be inferred; and recalling Eqs. 15–17 shows that the field $\mathbf{B}$ is then indeed completely defined within the spherical shell $a < r < c$. More generally, it is important to note that the field $\mathbf{B}$ can be characterized just as well as soon as $B_r(r,\theta,\varphi)$ is completely known for two different values $r_1$ and $r_2$ of $r$, provided $a \leq r_1 < r_2 \leq c$ (i.e., provided the two spherical surfaces defined by $r = r_1$ and $r = r_2$ lie within the source-free spherical shell).

Similar conclusions can be reached if the potential $V(r,\theta,\varphi)$ in place of the radial component $B_r(r,\theta,\varphi)$ is assumed to be known on both $\Sigma_i$ and $\Sigma_e$. This very general result, known in potential theory as the uniqueness theorem for Dirichlet boundary conditions (e.g., Kellogg 1954; Blakely 1995), can also be made more explicit in the case when the shell is spherical, and $\Sigma_i$ and $\Sigma_e$ are defined by $r = a$ and $r = c$. In that case, it is known that within the shell, $V(r,\theta,\varphi) = V_i(r,\theta,\varphi) + V_e(r,\theta,\varphi)$, where $V_i(r,\theta,\varphi)$ and $V_e(r,\theta,\varphi)$ are given by Eqs. 13 and 14, which can be used together with Eqs. 32 and 33 to show that

\[ \frac{2n+1}{4\pi a} \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} V(r,\theta,\varphi)\, P_n^m(\cos\theta) \cos m\varphi \, \sin\theta \, d\theta \, d\varphi = \left(\frac{a}{r}\right)^{n+1} g_n^m + \left(\frac{r}{a}\right)^{n} q_n^m, \tag{123} \]

\[ \frac{2n+1}{4\pi a} \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} V(r,\theta,\varphi)\, P_n^m(\cos\theta) \sin m\varphi \, \sin\theta \, d\theta \, d\varphi = \left(\frac{a}{r}\right)^{n+1} h_n^m + \left(\frac{r}{a}\right)^{n} s_n^m. \tag{124} \]

Then, assuming that $V(r,\theta,\varphi)$ is known on both $\Sigma_i$ ($r = a$) and $\Sigma_e$ ($r = c$) makes it possible to use Eqs. 123 and 124, once for $r = a$, and again for $r = c$, to form another set of linear equations from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can again be inferred. The same reasoning as in the previous case then follows, leading to the same conclusions (and generalization to the case when $V(r,\theta,\varphi)$ is known for $r = r_1$ and $r_2$, provided $a \leq r_1 < r_2 \leq c$).

In practice however, only components of the magnetic field $\mathbf{B}$ and not its potential $V$ are directly accessible to observations. The previous result can nevertheless be used to show yet another, directly useful, uniqueness property now applying to the $B_\theta(r,\theta,\varphi)$ component of the field, when this component is assumed to be known on the two spherical surfaces defined by $r = r_1$ and $r_2$, where $a \leq r_1 < r_2 \leq c$. This component is such that

\[ B_\theta(r,\theta,\varphi) = -\frac{1}{r} \frac{\partial V(r,\theta,\varphi)}{\partial \theta}. \tag{125} \]

Integrating along a meridian (with fixed values of $r$ and $\varphi$) starting from $\theta = 0$ thus leads to

\[ V(r,\theta,\varphi) = C(r) - r \int_0^{\theta} B_\theta(r,\theta',\varphi)\, d\theta', \tag{126} \]

which shows that if $B_\theta(r,\theta,\varphi)$ is known, so is $V(r,\theta,\varphi)$, to within a function $C(r)$. But it is known from Eqs. 13, 14, and 37 that the average value of $V(r,\theta,\varphi)$ over the sphere $S(r)$ of radius $r$ is such that $\langle V(r,\theta,\varphi) \rangle_{S(r)} = 0$ (recall that this is true, only because magnetic fields do not have monopole sources). It thus follows that

\[ C(r) = r \left\langle \int_0^{\theta} B_\theta(r,\theta',\varphi)\, d\theta' \right\rangle_{S(r)}, \tag{127} \]

and that as soon as $B_\theta(r,\theta,\varphi)$ is known for a given value of $r$, so is $V(r,\theta,\varphi)$. $V(r,\theta,\varphi)$ can then again be used to compute the set of linear equations (Eqs. 123 and 124) for two different values $r_1$ and $r_2$ of $r$ such that $a \leq r_1 < r_2 \leq c$, from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can finally again be inferred. It is important to note that by contrast, no similar conclusion applies when $B_\varphi(r,\theta,\varphi)$, rather than $B_\theta(r,\theta,\varphi)$, is considered. This is because, as is clear from, e.g., Eqs. 16–21, $B_\varphi(r,\theta,\varphi)$ is totally insensitive to zonal fields (described by spherical harmonic terms of order $m = 0$).

Of course, and as must now be obvious to the reader, other useful uniqueness properties can also be derived by combining the knowledge of $B_r(r,\theta,\varphi)$ for a given value $r_1$ of $r$ and of $B_\theta(r,\theta,\varphi)$ for another value $r_2$ of $r$. Of particular relevance to the historical situation (when observations are only available at the Earth’s surface) is the case when both $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ are simultaneously known for a given value $R$ of $r$, where $a \leq R \leq c$. In that case, Eqs. 121 and 122 on the one hand, and Eqs. 123 and 124 on the other hand, hold for $r = R$. This again leads to a set of linear equations from which all the Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can be inferred, and the field once again fully determined. Finally, it is important to note that independently of the method used to define the field in a unique way within the source-free shell $a < r < c$, all Gauss coefficients are then recovered, and it therefore becomes possible to identify $\mathbf{B}_i$ (Eq. 16) and $\mathbf{B}_e$ (Eq. 17).
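
To make the linear system built from Eq. 121 concrete, here is a minimal Python sketch for a single degree $n$ and order $m$. It assumes the left-hand side of Eq. 121 (the projection of $B_r$ onto one harmonic) has already been evaluated at two radii $r_1 < r_2$, and solves the resulting 2 × 2 system for $g_n^m$ and $q_n^m$; the same code applies to $h_n^m$ and $s_n^m$ through Eq. 122. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def radial_projection(n, g_nm, q_nm, r, a=1.0):
    """Right-hand side of Eq. 121 for given internal (g) and external (q) coefficients."""
    return (n + 1) * (a / r) ** (n + 2) * g_nm - n * (r / a) ** (n - 1) * q_nm


def separate_internal_external(n, A1, A2, r1, r2, a=1.0):
    """Solve Eq. 121, written at r1 and r2, for (g_n^m, q_n^m) by Cramer's rule."""
    i1, e1 = (n + 1) * (a / r1) ** (n + 2), -n * (r1 / a) ** (n - 1)
    i2, e2 = (n + 1) * (a / r2) ** (n + 2), -n * (r2 / a) ** (n - 1)
    det = i1 * e2 - i2 * e1
    if det == 0.0:
        raise ValueError("the two radii must differ for the system to be solvable")
    g_nm = (A1 * e2 - A2 * e1) / det
    q_nm = (i1 * A2 - i2 * A1) / det
    return g_nm, q_nm


# Hypothetical check: build the two projections from known coefficients, then recover them
n, a, r1, r2 = 2, 1.0, 1.0, 1.5
g_true, q_true = 3.0, -1.5
A1 = radial_projection(n, g_true, q_true, r1, a)
A2 = radial_projection(n, g_true, q_true, r2, a)
g_rec, q_rec = separate_internal_external(n, A1, A2, r1, r2, a)
```

The determinant vanishes only when $r_1 = r_2$, which is the algebraic counterpart of the statement that $B_r$ on a single sphere cannot separate internal from external sources.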


6.2 Uniqueness Issues Raised by Directional-Only Observations

Now focus a little more on the possibility of completely defining a magnetic field when information is only available on a single surface within the source-free shell. It has been shown that if the field has both internal and external sources, knowing both $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ over a sphere defined by $r = R$ within the source-free shell is enough to achieve uniqueness and identify the fields of internal and external origin. Implicit is also the conclusion that just knowing one component of the field (be it $B_r$, $B_\theta$, or $B_\varphi$) would not be enough. At least two components are needed, and in fact not just any two components, since, as previously seen, $B_\varphi$ does not provide as much information as $B_r$ or $B_\theta$. Of course, knowing the entire field $\mathbf{B}$ is even better. The field is then overdetermined (at least its nonzonal component), which is very useful since, in practice, the field is known just at a finite number of sites and not everywhere at the Earth’s surface (cf. Chap. 5). This in fact is the way Gauss first proved that the Earth’s magnetic field is mainly of internal origin (Gauss 1839; for a detailed account of how Gauss proceeded in practice, see, e.g., Langel 1987). But before Gauss first introduced a way to measure the magnitude of a magnetic field (which he did in 1832; see, e.g., Malin 1987), only inclination and declination observations were made. Such observations have a serious drawback. They cannot tell the difference between the real magnetic field and the same magnetic field multiplied by some arbitrary positive constant $\lambda$. But if such directional-only observations are available everywhere at the Earth’s surface, could it be that they nevertheless provide enough information for the Earth’s magnetic field to be completely characterized, to within the global positive factor $\lambda$?
Until recently, most authors felt that this was indeed the case, at least when a priori assuming the field to be of internal origin (e.g., Kono 1976). However, directional-only observations are not linearly related to the Gauss coefficients, and the answer turns out to be more subtle, as first recognized by Proctor and Gubbins (1990). Relying on complex variables, mathematical tools very different from those used in this chapter, Proctor and Gubbins (1990) investigated axisymmetric fields (i.e., zonal fields, with only $m = 0$ Gauss coefficients) of internal origin and succeeded in exhibiting a family of different fields all sharing exactly the same direction everywhere on a spherical surface $r = R$ enclosing all sources. Clearly, even a perfect knowledge of the direction of the field on the sphere $r = R$ would not be enough to fully characterize a field belonging to such a family, even to within a global positive factor. The fields exhibited by Proctor and Gubbins (1990) are however special in several respects: they are axisymmetric, antisymmetric with respect to the equator, and of octupole type (displaying four loci of magnetic poles on the sphere $r = R$, one magnetic pole at each geographic pole, and two midlatitude axisymmetric lines of magnetic poles, a magnetic pole being defined as a point where the field is perpendicular to the surface). The nonuniqueness property they reveal may therefore very well not transpose to Earth-like situations involving magnetic fields strongly dominated by their dipole component. This led Hulot et al. (1997) to reconsider the problem in a more general context. These authors again assumed the field to entirely be of internal origin, but only required it to be regular enough (i.e., physical).
Using a potential theory type of approach (which is again not given in any detail here), they were able to show that if the direction of the field is known everywhere on a smooth enough surface $\Sigma_i$ enclosing all sources, and if those directions do not reveal any more than $N$ loci of magnetic poles on $\Sigma_i$, then all fields satisfying these boundary conditions belong to an open cone (in such an open cone, any nonzero positive linear combination of solutions is a solution) of dimension $N - 1$. Applying this result to the fields exhibited by Proctor and Gubbins (1990),
which display $N = 4$ loci of poles on $\Sigma_i$ (which is then the sphere $r = R$), shows that those fields would belong to open cones of dimension 3 (in fact, 2, when the equatorial antisymmetry is taken into account; see Hulot et al. 1997), leaving enough space for fields from this family to share the same directions on $\Sigma_i$ and yet not be proportional to each other. But applying this result to the historical magnetic field, which only displays $N = 2$ poles on $\Sigma_i$ (which is then the Earth’s surface), leads to a different conclusion. The Earth’s field belongs to an open cone of dimension 1, which shows that it can indeed be recovered from directional-only data on $\Sigma_i$ (the Earth’s surface) to within the already discussed global positive factor $\lambda$. Note however that this result only holds when all contributions from external sources are ignored. Similar results can be derived in the case, less relevant to the Earth, when the field is assumed to have all its sources outside the surface on which its direction is assumed to be known (Hulot et al. 1997). But to the authors’ knowledge, no results applying to the most general situation when both internal and external sources are simultaneously considered have yet been derived.

Only a relatively weak statement can easily be made in the much more trivial case when the direction of the field is assumed to be known in a subshell of the source-free shell separating external and internal sources (and not just on a surface; Bloxham 1985; Proctor and Gubbins 1990; Lowes et al. 1995). In that case indeed, if two fields $\mathbf{B}(\mathbf{r})$ and $\mathbf{B}'(\mathbf{r})$ share the same direction within the subshell, then a scalar function $\lambda(\mathbf{r})$ exists such that $\mathbf{B}'(\mathbf{r}) = \lambda(\mathbf{r})\mathbf{B}(\mathbf{r})$ within it. But within that subshell, $\mathbf{B}(\mathbf{r})$ and $\mathbf{B}'(\mathbf{r})$ must satisfy $\nabla \cdot \mathbf{B} = 0$, $\nabla \times \mathbf{B} = 0$, $\nabla \cdot \mathbf{B}' = 0$, and $\nabla \times \mathbf{B}' = 0$. This implies $\nabla\lambda \cdot \mathbf{B} = 0$ and $\nabla\lambda \times \mathbf{B} = 0$, i.e., that $\lambda(\mathbf{r})$ is a constant within the subshell. Since both fields $\mathbf{B}(\mathbf{r})$ and $\mathbf{B}'(\mathbf{r})$ can be written in a unique way in the form of Eqs. 15 to 21 within that source-free subshell, this means that all Gauss coefficients describing $\mathbf{B}'(\mathbf{r})$ are then proportional (by a factor $\lambda$) to those describing $\mathbf{B}(\mathbf{r})$. Hence $\mathbf{B}'(\mathbf{r}) = \lambda\mathbf{B}(\mathbf{r})$ not only within the subshell within which observations are available but also beyond this shell, provided one remains within the source-free shell.
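
The step from a shared direction to a constant factor can be spelled out with the standard product rules for divergence and curl. The following short derivation is a sketch of that step, under the assumption that $\mathbf{B}$ does not vanish in the region considered:

```latex
\begin{aligned}
\nabla \cdot \mathbf{B}' &= \nabla \cdot (\lambda \mathbf{B})
  = \lambda \, \nabla \cdot \mathbf{B} + \nabla\lambda \cdot \mathbf{B}
  = \nabla\lambda \cdot \mathbf{B} = 0, \\
\nabla \times \mathbf{B}' &= \nabla \times (\lambda \mathbf{B})
  = \lambda \, \nabla \times \mathbf{B} + \nabla\lambda \times \mathbf{B}
  = \nabla\lambda \times \mathbf{B} = 0 .
\end{aligned}
```

The first line says that $\nabla\lambda$ is perpendicular to $\mathbf{B}$ and the second that it is parallel to $\mathbf{B}$, which is only possible if $\nabla\lambda = 0$ wherever $\mathbf{B} \neq 0$.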

6.3 Uniqueness Issues Raised by Intensity-Only Observations

Measuring the full magnetic field $\mathbf{B}$ requires a lot of care. Nowadays, by far the most demanding step turns out to be the orientation of the measured field with respect to the geocentric reference frame. This is especially true in the context of satellite measurements (cf. Chap. 5). By contrast, measuring the intensity $F = |\mathbf{B}|$ of the field is comparatively easier, and when satellites first started making magnetic measurements from space, they only measured the intensity of the field. But to what extent can a magnetic field be completely determined (to within a global sign, of course) when only its intensity is measured?

First consider the now familiar case when a source-free shell can be defined, and information (i.e., intensity $F$) is only available on a single surface within that shell. Though no explicit results have yet been published (at least to the authors’ knowledge) in the general case when the field has both external and internal sources, it can be easily anticipated that this will be a very unfavorable situation: it is already known that the knowledge of only one of the $B_r$, $B_\theta$, or $B_\varphi$ components everywhere on this surface is not enough and that at least $B_r$ and $B_\theta$ need to be simultaneously known. In fact, even in the case when the field is further assumed to only have internal sources (i.e., enclosed within the surface where intensity is assumed known), no general conclusion can be drawn. Some important specific results are however available. In particular, Backus (1968) showed that if the field is of internal origin and further assumed to be a finite sum of vector spherical harmonics (i.e., if all Gauss coefficients $g_n^m$ and $h_n^m$ are a priori known to be zero for $n > N$,
where $N$ is a finite integer, and the sum in Eq. 16 is therefore finite), then the field can indeed be completely determined (to within a global sign) by the knowledge of $F$ everywhere on a spherical surface $r = R$ enclosing all sources. But this result will generally not hold if the field is not a finite sum of vector spherical harmonics. This was first shown by Backus (1970), who exhibited what are now known as the Backus series (see also the comment by Walker 1992). These are fields $\mathbf{B}_M(r,\theta,\varphi)$ of internal origin and order $M$ (i.e., defined by Gauss coefficients with $g_n^m = h_n^m = 0$ if $m \neq M$), where $M$ can be any positive integer. The Gauss (internal) coefficients of $\mathbf{B}_M(r,\theta,\varphi)$ are defined by a recursion relation, which need not be written out here. Suffice it to say that this relation is chosen so as to ensure that each field $\mathbf{B}_M(r,\theta,\varphi)$ (1) has Gauss coefficients that converge fast enough with increasing degree $n$ for the convergence of the infinite sum (Eq. 16) defining $\mathbf{B}_M(r,\theta,\varphi)$ to be ensured for all values of $r \geq R$, and (2) satisfies $\mathbf{B}_M(r,\theta,\varphi) \cdot \mathbf{B}_D(r,\theta,\varphi) = 0$ everywhere on the surface $r = R$, where $\mathbf{B}_D(r,\theta,\varphi)$ is the axial dipole of internal origin defined by the single Gauss coefficient $g_1^0 = 1$ in Eq. 16. These Backus series can then be used to define an infinite number of pairs of magnetic fields of internal origin $\mathbf{B}_{M+} = \alpha\mathbf{B}_D + \beta\mathbf{B}_M$ and $\mathbf{B}_{M-} = \alpha\mathbf{B}_D - \beta\mathbf{B}_M$, where $(\alpha, \beta)$ can be any pair of real (nonzero) values. These pairs will automatically satisfy (as one can easily check) $|\mathbf{B}_{M-}(r,\theta,\varphi)| = |\mathbf{B}_{M+}(r,\theta,\varphi)|$ on the sphere $r = R$, while obviously $\mathbf{B}_{M-} \neq \mathbf{B}_{M+}$. A perfect knowledge of the intensity of the field on the sphere $r = R$ would therefore not be enough to characterize a field belonging to any such pairs, even to within a global sign. An interesting generalization of this result to the case when $\mathbf{B}_D$ does not need to be an axial dipole field has more recently been published by Alberto et al. (2004).
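
The equality of intensities for the pairs $\mathbf{B}_{M\pm}$ built from the Backus series can be checked in one line. The expansion below is a sketch of that check, using only the orthogonality condition $\mathbf{B}_M \cdot \mathbf{B}_D = 0$ on $r = R$ stated in the text:

```latex
|\mathbf{B}_{M\pm}|^2
  = \alpha^2 |\mathbf{B}_D|^2 + \beta^2 |\mathbf{B}_M|^2
    \pm 2\alpha\beta \, \mathbf{B}_D \cdot \mathbf{B}_M
  = \alpha^2 |\mathbf{B}_D|^2 + \beta^2 |\mathbf{B}_M|^2
  \quad \text{on } r = R ,
\qquad
\mathbf{B}_{M+} - \mathbf{B}_{M-} = 2\beta \, \mathbf{B}_M \neq 0 .
```

The cross term is the only one that distinguishes the two fields in the intensity, and it vanishes on the sphere; away from $r = R$ it does not, so the two fields differ in intensity elsewhere.
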
If $\mathbf{B}_N$ is any arbitrary field of internal origin defined by a finite number of Gauss coefficients with maximum degree $N$, a field $\mathbf{B}'_N$ can again always be found such that $\mathbf{B}'_N(r,\theta,\varphi) \cdot \mathbf{B}_N(r,\theta,\varphi) = 0$ everywhere on the surface $r = R$. Many additional pairs of magnetic fields $\mathbf{B}_{N+} = \alpha\mathbf{B}_N + \beta\mathbf{B}'_N$ and $\mathbf{B}_{N-} = \alpha\mathbf{B}_N - \beta\mathbf{B}'_N$, again sharing the same intensity on the sphere $r = R$, can thus be found. Those results are interesting. However, they do not provide a general answer to the question of the uniqueness of arbitrary fields of internal origin only constrained by intensity data on a surface enclosing all sources (when the field is not a finite sum of vector spherical harmonics). But does this really matter in practice? It may indeed be argued that the best model of the Earth’s magnetic field of internal origin that will ever be recovered will anyway be in the form of a finite number of Gauss coefficients (those compatible with the resolution matching the spatial distribution of the limited number of observations) and that because of the Backus (1968) uniqueness result previously discussed, this model would necessarily be determined in a unique way (to within a sign) by the knowledge of the intensity of the field at the Earth’s surface (assuming that some strategy of measurement has been used so as to minimize any contribution of the field of external origin). In practice, this indeed is the case. However, this practical uniqueness turns out to be very relative and misleading, as the study of Stern et al. (1980) illustrates. In this study, the authors use a data set of full vector magnetic field observations collected by the 1980 Magsat satellite (and carefully selected to minimize any local or external sources, which they then consider as a source of noise) once to produce a field model A best explaining the observed intensity and once again to produce a model B best explaining the observed vector field. They found that the two models differ very significantly.
Both models predict similar intensity at the Earth’s surface (to within measurement error, typically a few nT), but they strongly disagree when predicting the full vector field, with model A leading to errors (up to 2,000 nT!) far more than tolerated by measurement errors on the measured vector field, contrary


to model B. This disagreement can be traced back to the fact that the difference $\mathbf{B}_B - \mathbf{B}_A$ between the predictions $\mathbf{B}_A$ and $\mathbf{B}_B$ of the two models tends to satisfy $(\mathbf{B}_B - \mathbf{B}_A) \cdot (\mathbf{B}_B + \mathbf{B}_A) = 0$. One way of interpreting this result is to note that any practical optimizing procedure used to look for a model A best fitting the intensity (and just the intensity) will be much less sensitive to an error $\delta B$ perpendicular to the observed field $\mathbf{B}$ (which then produces a second-order error $(\delta B)^2/B$ in the intensity) than to a comparable error along the observed field (which then produces a first-order error $\delta B$ in the intensity). By contrast, the optimizing procedure used to look for a model B best fitting the full vector $\mathbf{B}$ will make sure this error is kept small, whatever its direction. As a result, the difference $\mathbf{B}_B - \mathbf{B}_A$ will be largest in the direction perpendicular to the observed field, the value of which, to lowest order, is close to $(\mathbf{B}_B + \mathbf{B}_A)/2$. The two models are thus bound to lead to predictions such that $(\mathbf{B}_B - \mathbf{B}_A) \cdot (\mathbf{B}_B + \mathbf{B}_A) = 0$. Since model A makes erroneous predictions in the direction perpendicular to the observed field, this effect is often referred to as the perpendicular error effect (Lowes 1975; Langel 1987). Some authors however also refer to this effect as the Backus effect (e.g., Stern et al. 1980; Langel 1987), quite correctly, since this error also seems to be closely related to the type of nonuniqueness exhibited by the fields constructed with the help of the Backus series. To see this, first recall that model B is constrained by full vector field observations. Were those perfect and available everywhere at the Earth’s surface, model B would be perfectly and uniquely determined. Improving the data set would thus lead model B to eventually match the true Earth model. Model A differs from model B in a macroscopic way, and it is very likely that improving the intensity data set would not lead model A to converge toward the true Earth model.
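
The different sensitivity of the intensity to errors along and across the field, which underlies the perpendicular error effect, can be checked numerically. The following minimal Python sketch uses a hypothetical 50,000 nT field and a 100 nT perturbation (both values, and the function name `intensity_error`, are assumptions chosen only for illustration). To leading order the perpendicular case gives about $(\delta B)^2/(2B)$, i.e., the second-order behavior described above.

```python
import math


def intensity_error(B, dB):
    """Change in |B| caused by adding the perturbation dB to the vector B."""
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    perturbed = [b + d for b, d in zip(B, dB)]
    return norm(perturbed) - norm(B)


# Hypothetical observed field (nT), chosen along the z axis for simplicity
B = (0.0, 0.0, 50000.0)
delta = 100.0  # hypothetical size of the model error (nT)

along = intensity_error(B, (0.0, 0.0, delta))    # error parallel to B
across = intensity_error(B, (delta, 0.0, 0.0))   # error perpendicular to B
```

Here `along` equals the full 100 nT, whereas `across` is only about 0.1 nT, so a fit to intensity alone barely penalizes a 100 nT error perpendicular to the field.
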
This suggests that at least one model other than the true Earth model could be found in the limit when perfect intensity data is available everywhere at the Earth’s surface and that when only a finite imperfect intensity data set is available, a model recovered by optimizing the fit just to this intensity data set, such as model A, is a truncated and approximate version of this alternative model. Indeed, the magnitude and geographical distribution of the difference $\mathbf{B}_B - \mathbf{B}_A$ is very comparable to that of the Backus series (e.g., Stern and Bredekamp 1975). This undesired nonuniqueness property is problematic. To alleviate the resulting Backus/perpendicular-error effect, several practical solutions have therefore been proposed, such as adding a minimum of vector field data to the intensity-only data set (as first investigated by Barraclough and Nevitt 1976), or taking into account even poor determinations of the field direction, which indeed brings considerable improvement (Holme and Bloxham 1995). From a more formal point of view, the next conceptual improvement was however brought by Khokhlov et al. (1997). Again relying on a potential theory type of approach, these authors showed that a simple and unambiguous way of characterizing a regular enough field of internal origin only defined by the knowledge of its intensity $F$ on a smooth enough surface $\Sigma_i$ enclosing all its sources is to locate its (possibly several) magnetic dip equators on $\Sigma_i$ (defined as curves across which the component of $\mathbf{B}$ normal to $\Sigma_i$ changes sign). In other words, if a field of internal origin is defined by both its intensity everywhere on $\Sigma_i$ and the location of its dip equator(s) on $\Sigma_i$, it then is completely and uniquely determined (to within a global sign). Note that, indeed, any two fields of pairs built with the help of the Backus series (or their generalization by Alberto et al. 2004) do not share the same equators.
The practical usefulness of this theoretical result was subsequently investigated by Khokhlov et al. (1999) and demonstrated in real situations by Ultré-Guérard et al. (1998) and, more recently, by Holme et al. (2005). However, the most obvious practical conclusion that should be drawn from these various results is that an intensity-only strategy for measuring the Earth’s magnetic field on a planetary scale is a dangerous one. Indeed all advanced near-Earth magnetic


field missions are now designed to make sure the full field B is measured (cf. Hulot et al. 2007). Besides, and as shown later, this turns out to be mandatory also because near-Earth satellites do not orbit in a source-free shell. Finally, and for completeness, the situation when the intensity of the field is no longer known on a surface, but within a volume within the source-free shell, should be considered (in which case, again a field with both internal and external sources may be considered). Then, as shown by Backus (1974), provided the source-free shell can be considered as connected (any two points in the shell can be joined by a smooth curve within the shell, which is the case of the spherical shell a < r < c), knowing the intensity of the field in an open region contained entirely within the source-free shell is enough to determine the field entirely up to its global sign.

6.4 Uniqueness of Magnetic Fields in a Shell Enclosing a Spherical Sheet Current

So far, the possibility of recovering a complete description of a magnetic field has only been addressed when some information is available on some surface or in some volume within a source-free shell, the sources of the field lying above and/or below the shell. This is typically the situation when one attempts to describe the Earth’s magnetic field within the neutral atmosphere, using ground-based, shipborne, and aeromagnetic observations. It was also noted that once the field is fully determined within such a source-free shell, then both the field of internal origin $\mathbf{B}_i$ and the field of external origin $\mathbf{B}_e$ can be identified. Now, what if one considers a shell within which currents can be found? This is a situation that must be considered when attempting to also analyze satellite data, since satellites fly above the E-region where the ionospheric dynamo resides and within the F-region where additional currents can be found (cf. Chap. 5). Here, currents in the F-region are first ignored, and a spherical shell $a < r < c$ is assumed which only contains a spherical sheet current (which would typically describe the currents produced by the ionospheric dynamo in the E-region) at radius $r = b$. Then the sources of the field are assumed to lie below $r = a$ (sources referred to as $\mathbf{J}(r < a)$ sources), on $r = b$ ($\mathbf{J}_s(r = b)$ sources), and above $r = c$ ($\mathbf{J}(r > c)$ sources). What kind of information does one then need to make sure the field produced by those sources can be completely described everywhere within that shell? To address this question, first note that since $\mathbf{J}(r < a)$ sources lie below $r = a$, the potential field they produce can be described by a set of Gauss coefficients $g_n^m, h_n^m$ which may be used in Eq. 16 to predict this field for any value $r > a$.
In the same way, since $\mathbf{J}(r > c)$ sources lie above $r = c$, the potential field they produce can be described by a set of Gauss coefficients $q_n^m, s_n^m$ which may be used in Eq. 17 to predict this field for any value $r < c$. Finally, since the $\mathbf{J}_s(r = b)$ sources correspond to a spherical sheet current, $a_n^m, b_n^m$ Gauss coefficients can also be introduced and used to predict the potential field produced by those sources thanks to Eq. 38 for $r > b$ and Eq. 40 for $r < b$. Now, consider the lower subshell $a < r < b$. This shell is a source-free shell of the type considered so far. All the previous results relevant to the case when both internal and external sources are to be found therefore apply. Take the most relevant case when information about the field is available only on a surface (say, the Earth’s surface) within the subshell $a < r < b$. Then one must be able to know at least two components of the field everywhere on that surface,

typically $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$, to fully characterize the field within that subshell. Once one knows these, one can recover all Gauss coefficients describing the field within that subshell. In particular, one will recover the $g_n^m, h_n^m$ Gauss coefficients of the field produced by the $\mathbf{J}(r < a)$ sources, which is seen as the field of internal origin for that subshell. One can also recover the Gauss coefficients of the field of external origin for that subshell. But this field is the one produced by both $\mathbf{J}_s(r = b)$ sources and $\mathbf{J}(r > c)$ sources. One will therefore recover the sum of their Gauss coefficients, i.e., (recall Eq. 40), $q_n^m - ((n+1)/n)(a/b)^{2n+1} a_n^m$ and $s_n^m - ((n+1)/n)(a/b)^{2n+1} b_n^m$. Next, consider the upper subshell $b < r < c$. Obviously, the same reasoning as above can be made, and good use can again be made of, for instance, the knowledge of two components of the field on a surface within that subshell (say the surface covered by an orbiting satellite, assuming local sources can be neglected). But it must be acknowledged that the $\mathbf{J}_s(r = b)$ sources are now seen as sources of internal origin. This then leads to the conclusion that the following sums of Gauss coefficients will also be recovered, $g_n^m + a_n^m$ and $h_n^m + b_n^m$ (recall Eq. 40), together with the Gauss coefficients $q_n^m$ and $s_n^m$. The reader will then notice that for each degree $n$ and order $m$, four quantities ($g_n^m$, $q_n^m - ((n+1)/n)(a/b)^{2n+1} a_n^m$, $g_n^m + a_n^m$, and $q_n^m$) will have been recovered to constrain the three Gauss coefficients $g_n^m$, $q_n^m$, and $a_n^m$, while four additional quantities ($h_n^m$, $s_n^m - ((n+1)/n)(a/b)^{2n+1} b_n^m$, $h_n^m + b_n^m$, and $s_n^m$) will have been recovered to constrain the three Gauss coefficients $h_n^m$, $s_n^m$, and $b_n^m$. In each case, one has one constraint too many. One constraint could therefore have been dropped.
Indeed, and as the reader can easily check, in such a situation when only a spherical sheet current is to be found in an otherwise source-free spherical shell $a < r < c$, the knowledge of two components (typically $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$) on a surface within one of the subshells, and of only one component (typically $B_r(r,\theta,\varphi)$ or $B_\theta(r,\theta,\varphi)$) on a surface in the other subshell, is enough to fully characterize the field everywhere within the shell $a < r < c$. Beware however that all the limitations already identified in Sects. 6.1 and 6.3 when attempting to make use of either the $B_\theta$ component or the intensity also hold in the present case. It is important to stress that the above results hold only because it was assumed that all sources to be found within the shell $a < r < c$ lie on an infinitely thin spherical sheet. What if the sheet is itself a subshell of some thickness (defined by, say, $b - e < r < b + e$)? In that case, one may still use Eq. 38 to describe the field the sources within that shell would produce for $r > b + e$. However, the corresponding Gauss coefficients $a_n^m$, $b_n^m$ may then no longer be used in Eq. 40 to predict the field the same sources would produce for $r < b - e$, because the continuity equation (Eq. 39) no longer holds. As a result, if the spherical sheet current at $r = b$ does have some nonnegligible thickness, one must back away from the previous conclusions. In that case, one really does need to have two independent sources of information (such as two components of the field on a surface) within each subshell, to fully characterize the field within each subshell. Also, the reader should note that the field is then still not fully defined within the current-carrying shell $b - e < r < b + e$. This finally brings one to the issue of uniquely defining a magnetic field within a current-carrying shell.
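The counting argument for the thin-sheet case (four recovered quantities constraining three Gauss coefficients per degree and order) can be checked with a small linear-algebra sketch. It treats one $(n, m)$ pair and the triplet $(g_n^m, q_n^m, a_n^m)$ only; the radii, the coefficient values, and the function names are hypothetical and serve purely as an illustration of the bookkeeping, not as the chapter's procedure.

```python
import numpy as np

def sheet_factor(n, a, b):
    """Factor ((n + 1) / n) * (a / b)**(2n + 1) appearing in the lower-subshell sums."""
    return (n + 1) / n * (a / b) ** (2 * n + 1)

def solve_gauss_triplet(n, a, b, lower_int=None, lower_ext=None,
                        upper_int=None, upper_ext=None):
    """Solve for (g, q, a_sheet) from whichever of the four quantities are supplied.

    lower_int = g,              lower_ext = q - k * a_sheet   (lower subshell)
    upper_int = g + a_sheet,    upper_ext = q                 (upper subshell)
    """
    k = sheet_factor(n, a, b)
    candidates = [
        ([1.0, 0.0, 0.0], lower_int),
        ([0.0, 1.0, -k], lower_ext),
        ([1.0, 0.0, 1.0], upper_int),
        ([0.0, 1.0, 0.0], upper_ext),
    ]
    rows = [row for row, value in candidates if value is not None]
    values = [value for _, value in candidates if value is not None]
    if len(rows) < 3 or np.linalg.matrix_rank(np.array(rows)) < 3:
        raise ValueError("not enough independent constraints")
    solution, *_ = np.linalg.lstsq(np.array(rows), np.array(values), rcond=None)
    return solution

# Hypothetical numbers: degree 2, reference radius a, sheet at b = a + 110 km.
n, a, b = 2, 6371.2, 6481.2
g_true, q_true, a_sheet_true = -2000.0, 20.0, 5.0
k = sheet_factor(n, a, b)

all_four = solve_gauss_triplet(
    n, a, b,
    lower_int=g_true, lower_ext=q_true - k * a_sheet_true,
    upper_int=g_true + a_sheet_true, upper_ext=q_true)

# Dropping one quantity (here the upper-subshell external coefficient) still works.
three_only = solve_gauss_triplet(
    n, a, b,
    lower_int=g_true, lower_ext=q_true - k * a_sheet_true,
    upper_int=g_true + a_sheet_true)
```

Any three of the four rows form a full-rank system in this sketch, which is the algebraic content of the remark that one constraint is redundant. The same reasoning applies to $(h_n^m, s_n^m, b_n^m)$.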

6.5 Uniqueness of Magnetic Fields in a Current-Carrying Shell

If the most general case of a current-carrying shell $a < r < c$ is now considered within which any type of currents can be found, the field may no longer be written in the form $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$ (Eq. 15). But it may be written in the form (Eq. 105, recall Sect. 4)

$$\mathbf{B} = \mathbf{B}^{i}_{pol} + \mathbf{B}^{e}_{pol} + \mathbf{B}^{sh}_{pol} + \mathbf{B}^{sh}_{tor}, \tag{128}$$

where $\mathbf{B}^{i}_{pol}$ describes the field produced by all internal sources (referred to as the $\mathbf{J}(r < a)$ sources just above, only the toroidal type of which contributes to $\mathbf{B}^{i}_{pol}$, as seen in Sect. 4), $\mathbf{B}^{e}_{pol}$ describes the field produced by all external sources (the $\mathbf{J}(r > c)$ sources, only the toroidal type of which contributes to $\mathbf{B}^{e}_{pol}$), and $\mathbf{B}^{sh}_{pol}$ and $\mathbf{B}^{sh}_{tor}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf{J}^{sh}_{tor}$ and poloidal $\mathbf{J}^{sh}_{pol}$ currents in the shell. Because of those currents, the field $\mathbf{B}$ may generally no longer be defined in a unique way everywhere in such a shell. However, if appropriate additional assumptions with respect to the nature of those local currents are introduced, some uniqueness results can again be derived. To see this, it is useful to start from the decomposition Eq. 107 of the magnetic field $\mathbf{B}(R,\theta,\varphi)$ on the sphere of radius $r = R$ (recall Sect. 4):

$$\mathbf{B}(R,\theta,\varphi) = \mathbf{B}_{iR}(R,\theta,\varphi) + \mathbf{B}_{eR}(R,\theta,\varphi) + \mathbf{B}^{sh}_{tor}(R,\theta,\varphi), \tag{129}$$

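where $\mathbf{B}^{sh}_{tor}(R,\theta,\varphi)$, $\mathbf{B}_{iR}(R,\theta,\varphi)$, and $\mathbf{B}_{eR}(R,\theta,\varphi)$ are given by Eqs. 108–110, respectively. Because of the orthogonality of the vector spherical harmonics $\mathbf{C}_n^{m,(c,s)}$, $\boldsymbol{\Pi}_{ni}^{m,(c,s)}$, and $\boldsymbol{\Pi}_{ne}^{m,(c,s)}$, one may then write (recall Eqs. 63, 66, and 67)

$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ni}^{m,c}(\theta,\varphi)\rangle = \frac{1}{4\pi}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi} \mathbf{B}(R,\theta,\varphi)\cdot\boldsymbol{\Pi}_{ni}^{m,c}(\theta,\varphi)\,\sin\theta\,d\theta\,d\varphi = (n+1)\left(\frac{a}{R}\right)^{n+2} g_n^m(R), \tag{130}$$

$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ni}^{m,s}(\theta,\varphi)\rangle = (n+1)\left(\frac{a}{R}\right)^{n+2} h_n^m(R), \tag{131}$$

$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ne}^{m,c}(\theta,\varphi)\rangle = n\left(\frac{R}{a}\right)^{n-1} q_n^m(R), \tag{132}$$

$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ne}^{m,s}(\theta,\varphi)\rangle = n\left(\frac{R}{a}\right)^{n-1} s_n^m(R), \tag{133}$$

$$\langle \mathbf{B}(R,\theta,\varphi), \mathbf{C}_n^{m,c}(\theta,\varphi)\rangle = \frac{n(n+1)}{2n+1}\, T_n^{m,c}(R), \tag{134}$$

$$\langle \mathbf{B}(R,\theta,\varphi), \mathbf{C}_n^{m,s}(\theta,\varphi)\rangle = \frac{n(n+1)}{2n+1}\, T_n^{m,s}(R), \tag{135}$$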

which shows that the complete knowledge of $\mathbf{B}(R,\theta,\varphi)$ on the spherical surface $r = R$ already makes it possible to identify the Gauss coefficients $g_n^m(R), h_n^m(R)$ of the potential field $\mathbf{B}_{iR}(r,\theta,\varphi)$ produced above $r = R$ by all sources below $r = R$, the Gauss coefficients $q_n^m(R), s_n^m(R)$ of the potential field $\mathbf{B}_{eR}(r,\theta,\varphi)$ produced below $r = R$ by all sources above $r = R$, and the coefficients $T_n^{m,c}(R)$ and $T_n^{m,s}(R)$ defining the toroidal field $\mathbf{B}^{sh}_{tor}(R,\theta,\varphi)$ produced by the local (poloidal) sources at $r = R$. This uniqueness theorem, due to Backus (1986), is however not powerful enough in general to reconstruct the field $\mathbf{B}(r,\theta,\varphi)$ in a unique way everywhere within the shell $a < r < c$. But one may introduce additional assumptions that reasonably apply to the

[Fig. 4 graphic not reproduced. Labels in the sketch: J(r > c), J_pol, Satellite, Observatories, J(r < a), J_s(r = b), No sources; radii r = c, r = R, r = b, r = a.]

Fig. 4 Uniqueness of a magnetic field recovered from partial information within a current-carrying shell. In this special case relevant to geomagnetism, it is assumed that any source can lie below $r = a$ (internal $\mathbf{J}(r < a)$ sources) and above $r = c$ (external $\mathbf{J}(r > c)$ sources), no sources can lie within the lower subshell ($a < r < b$, the neutral atmosphere), a spherical sheet current can lie at $r = b$ (the E-region $\mathbf{J}_s(r = b)$ sources), and only poloidal sources can lie within the upper subshell ($b < r < c$, the F-region ionosphere). The knowledge of $\mathbf{B}$ on a sphere $r = R$ in the upper subshell (as provided by, e.g., a satellite) and of enough components of $\mathbf{B}$ on the sphere $r = a$ (as provided by, e.g., observatories at the Earth’s surface) is then enough to recover the field produced by most sources in many places (see text for details)

near-Earth environment (see Fig. 4 for a schematic sketch and Chap. 5 in the handbook for more details). Assume that the shell $a < r < c$ can be divided into two subshells $a < r < b$ and $b < r < c$ separated by a spherical sheet current at $r = b$. Assume that the lower subshell describes the neutral atmosphere and is therefore source-free, while the spherical sheet current describes the ionospheric E-region (which is again considered as infinitely thin). Finally assume that the upper subshell describes the ionospheric current-carrying F-region within which near-Earth satellites orbit. Since those currents are known to be mainly the so-called field-aligned currents at polar latitudes (i.e., aligned with the dominant poloidal field, see, e.g., Olsen 1997), it is not unreasonable to further assume that those currents have no toroidal components (they are mainly in the radial direction). In other words, assume that no $\mathbf{J}^{sh}_{tor}$ sources but only $\mathbf{J}^{sh}_{pol}$ sources lie in the $b < r < c$ upper subshell. These assumptions, together with the previous uniqueness theorem, can then be combined in a very powerful way. Indeed, since $\mathbf{B}(R,\theta,\varphi)$ is already assumed to be known on a sphere $r = R$ in the upper subshell, the Gauss coefficients $g_n^m(R), h_n^m(R), q_n^m(R), s_n^m(R)$ can be inferred from Eqs. 130 to 133. The $q_n^m(R), s_n^m(R)$ Gauss coefficients then describe the field produced below $r = R$ by all sources above $r = R$. But since between $r = R$ and $r = c$ only $\mathbf{J}^{sh}_{pol}$ sources are to be found, which only produce local toroidal fields, the only sources contributing to the field described by the $q_n^m(R), s_n^m(R)$ coefficients are the $\mathbf{J}(r > c)$ sources. It thus follows that the $q_n^m(R), s_n^m(R)$ are the $q_n^m, s_n^m$ Gauss coefficients describing the potential field produced below $r = c$ by the $\mathbf{J}(r > c)$ sources. A similar use of the Gauss coefficients $g_n^m(R), h_n^m(R)$ recovered from Eqs. 130 and 131 can then also be made. Those describe the field produced above $r = R$ by all sources below $r = R$.
But no sources between $r = b$ and $r = R$ contribute. Only $\mathbf{J}(r < a)$ and $\mathbf{J}_s(r = b)$ sources do. It then also follows that the $g_n^m(R), h_n^m(R)$ are the sums $(g_n^m + a_n^m, h_n^m + b_n^m)$ of the Gauss coefficients describing the field produced above $r = a$ by the $\mathbf{J}(r < a)$ sources and above $r = b$ by the $\mathbf{J}_s(r = b)$ sources. This then brings one back to a situation similar to the one previously

encountered when considering a shell $a < r < c$ just enclosing a spherical sheet current at $r = b$. If, in addition to knowing the field $\mathbf{B}(R,\theta,\varphi)$ on the spherical surface $r = R$, enough information is also available at the Earth’s surface (say at least $B_r$), all Gauss coefficients $g_n^m, h_n^m, q_n^m, s_n^m$, and $a_n^m, b_n^m$ can again be recovered. In that case, the field produced by the $\mathbf{J}(r < a)$ sources can be predicted everywhere for $r > a$, the one produced by the $\mathbf{J}(r > c)$ sources can be predicted everywhere for $r < c$, and the field produced by $\mathbf{J}_s(r = b)$ sources can be predicted everywhere (except at $r = b$). In particular, the total field $\mathbf{B}(r,\theta,\varphi)$ is then completely defined within the lower subshell $a < r < b$. Note, however, and this is important, that the field $\mathbf{B}^{sh}_{tor}(r,\theta,\varphi)$ (which is zero in the source-free lower subshell) is then still only known for $r = R$ and cannot be predicted elsewhere in the upper subshell (defined by $b < r < c$). Of course, this is not the only set of assumptions one may introduce. One may first relax the infinitely thin assumption made for the spherical sheet current, which could then extend from $r = b - e$ to $r = b + e$, as was assumed earlier. Provided enough information is available within the lower subshell $a < r < b$ (say the two components $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ at the Earth’s surface), then again the field produced by the $\mathbf{J}(r < a)$ sources can be predicted everywhere for $r > a$, the one produced by the $\mathbf{J}(r > b)$ sources can be predicted everywhere for $r < b$, and the field produced by the spherical (thick) sheet sources can be predicted everywhere except within $b - e < r < b + e$. Of course, the field $\mathbf{B}^{sh}_{tor}(r,\theta,\varphi)$ would still only be known for $r = R$ and could not be predicted elsewhere in the upper subshell. In fact, and as must now be obvious to the reader, this last point is precisely one of the several issues that make satellite data difficult to take advantage of.
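The recovery step used throughout this argument, namely turning the projections of $\mathbf{B}(R,\theta,\varphi)$ onto the vector spherical harmonics into coefficients, amounts to inverting the scalar factors of Eqs. 130–135. The sketch below does only that inversion; it assumes the projection values have already been computed by some quadrature, and the function names and sample numbers are hypothetical.

```python
import math

def internal_from_projection(projection, n, a, R):
    """Invert Eqs. 130-131: projection = (n + 1) * (a / R)**(n + 2) * g_n^m(R)."""
    return projection / ((n + 1) * (a / R) ** (n + 2))

def external_from_projection(projection, n, a, R):
    """Invert Eqs. 132-133: projection = n * (R / a)**(n - 1) * q_n^m(R)."""
    return projection / (n * (R / a) ** (n - 1))

def toroidal_from_projection(projection, n):
    """Invert Eqs. 134-135: projection = n * (n + 1) / (2n + 1) * T_n^m(R)."""
    return projection * (2 * n + 1) / (n * (n + 1))

# Round-trip check with illustrative values (radii in km, coefficients in nT).
n, a, R = 3, 6371.2, 6821.0
g, q, T = 1200.0, 15.0, 2.5

proj_internal = (n + 1) * (a / R) ** (n + 2) * g
proj_external = n * (R / a) ** (n - 1) * q
proj_toroidal = n * (n + 1) / (2 * n + 1) * T

g_back = internal_from_projection(proj_internal, n, a, R)
q_back = external_from_projection(proj_external, n, a, R)
T_back = toroidal_from_projection(proj_toroidal, n)
```

The factors are nonzero for every degree $n \geq 1$, so each projection determines its coefficient at radius $R$; what the coefficients mean in terms of sources is then settled by the assumptions on the current distribution discussed in the text.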
In particular, satellites are never on a rigorously circular orbit and do not exactly sample $\mathbf{B}(R,\theta,\varphi)$ (apart from the difficulties related to the proper sampling in space and time of the temporal variations of these currents). Rather, they sample drifting elliptic shells within a spherical shell of average radius $R$ but with some thickness $h$. When this shell is thin enough, one may however rely on the thin-shell approximation introduced in Sect. 4, in which case one may use Eq. 111. As was then noted, this approximation is correct to within $\mathbf{B}^{sh}_{pol}$ and $\mathbf{B}^{sh}_{tor}$ corrections of order $h/R$. If the satellite indeed orbits within a region where only poloidal currents are to be found, $\mathbf{B}^{sh}_{pol} = 0$ and this correction only affects $\mathbf{B}^{sh}_{tor}$. If $h/R$ is indeed small, this correction is small enough, and all the reasoning above may be repeated. The practical applicability of such an approach for satellite magnetic measurements has been investigated by Backus (1986) and Olsen (1997). Olsen (1997) points out that for the Magsat 1980 mission, these numbers are $h = 100$ km $\ll R = 6{,}821$ km, which seems to justify the thin-shell approximation. Then, indeed, magnetic signatures from both field-aligned currents in the polar latitudes and meridional coupling currents associated with the equatorial electrojet (EEJ) (Maeda et al. 1982) are detected in the Magsat data, each an example of $\mathbf{B}^{sh}_{tor}$ produced by local $\mathbf{J}^{sh}_{pol}$ sources. Of course, additional or alternative assumptions can also be used. Olsen (1997), for instance, considers only radial poloidal currents in the sampling shell of Magsat, an assumption which is basically valid except for midlatitude interhemispheric currents. From Eq. 68, one can obtain purely radial currents if the radial dependency of $P_J$ is proportional to $1/r$, thus eliminating the first term.
This idea can also be extended to purely meridional currents ($J_\varphi = 0$), either in the standard geocentric coordinates (Olsen 1997) or in a quasi-dipole (QD) coordinate system (Richmond 1995), as has been done by Sabaka et al. (2004) (the QD system is a warped coordinate system useful in describing phenomena which are organized according to the Earth’s main field; see Richmond (1995) and Sect. 3.1 of the chapter by Olsen et al. in the handbook). Two classes of admissible scalar functions are then found to contribute to meridional currents: (1) those which are purely radial and (2) those which are QD zonal, i.e., $m = 0$. Clearly only the second class
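The order-of-magnitude check behind the thin-shell approximation for Magsat can be written out explicitly. The numbers $h = 100$ km and $R = 6{,}821$ km are those quoted from Olsen (1997); the tolerance used to call the shell "thin" is an arbitrary illustrative threshold, not a value from the chapter.

```python
def thin_shell_ratio(h_km, R_km):
    """Relative size h / R of the thin-shell corrections."""
    return h_km / R_km

def is_thin_shell(h_km, R_km, tolerance=0.05):
    """True if h / R is below a (hypothetical) tolerance."""
    return thin_shell_ratio(h_km, R_km) < tolerance

magsat_ratio = thin_shell_ratio(100.0, 6821.0)
print(f"h/R = {magsat_ratio:.4f}")  # prints "h/R = 0.0147"
```

A correction of roughly 1.5% is consistent with the statement that the approximation seems justified for Magsat, although whether that is small enough depends on the signal being sought.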

contributes to the horizontal components of $\mathbf{J}$, and so a nonvanishing first term in Eq. 68 for $P_J$ is required. Finally, it should also be mentioned that advanced investigations of the CHAMP satellite (Maus 2007) data have recently provided several examples of situations revealing the presence of some $\mathbf{B}^{sh}_{pol}$ fields (and $\mathbf{J}^{sh}_{tor}$ sources) at satellite altitude, contradicting the assumptions described above (e.g., Lühr et al. 2002; Maus and Lühr 2006; Stolle et al. 2006). Although those fields are usually small (on the order of a few nT at most), they can be of comparable magnitude to the weakest signals produced by the smallest scales of the field of internal origin (the $\mathbf{J}(r < a)$ sources), which sets a limit to the satellite’s ability to recover this field, despite the high quality of the measurements. There is no doubt that this limit is one of the greatest challenges that ESA’s soon-to-be-launched (2011) Swarm mission will have to face (Friis-Christensen et al. 2006, 2009).

7 Concluding Comments: From Theory to Practice

The present review was intended to provide the reader with the mathematical background relevant to geomagnetic field modeling. Often, mathematical rigor required that a number of simplifying assumptions be introduced with respect to the location of the various magnetic field sources and to the type and distribution of magnetic observations. In particular, these observations were systematically assumed to continuously sample idealized regions (be it an idealized spherical “Earth” surface or an idealized spherical “ionospheric” layer or shell). Also, all of the observations were implicitly assumed to be error free and synchronous in time, thereby avoiding the issue of the mathematical representation of the time variation of the various fields. Yet, those fields vary in time, sometimes quite fast in the case of the field produced by external sources, and in practice, observations are limited in number, affected by measurement errors, and not always synchronous (satellites take some time to complete their orbits). These departures from the ideal situations considered in this chapter are a significant source of concern for the practical computation of geomagnetic field models based on the various mathematical properties derived here. But they can fortunately be handled. Provided relevant temporal parameterizations are introduced, appropriate data selection is used, and adequate so-called inverse methods are employed, geomagnetic field models defined in terms of time-varying Gauss coefficients of the type described here can indeed be computed. The details of how this is achieved, however, depend very much on the type of observations analyzed and on the field contribution one is more specifically interested in. Examples of geomagnetic field modeling based on historical ground-based observations, with special emphasis on the main field produced within the Earth’s core, can be found in, e.g., Jackson et al. (2000) and Jackson and Finlay (2007). A recent example of a geomagnetic field model based on satellite data and focusing on the field produced by the magnetization within the Earth’s crust is provided by Maus et al. (2008). Additional examples based on the joint use of contemporary ground-based and satellite-borne observations for the modeling of both the field of internal and external origin can otherwise be found in, e.g., Sabaka et al. (2004), Thomson and Lesur (2007), Lesur et al. (2008), Olsen et al. (2009) and in the review paper by Hulot et al. (2007), where many more references are provided. For approximation of the geomagnetic field, the conventional system of vector spherical harmonics is used. An approach based on locally supported vector wavelets is studied in the next chapter (Chap. 18).

Acknowledgment

This is IPGP contribution 2596.

References Abramowitz M, Stegun IA (1964) Handbook of mathematical functions. Dover, New York Alberto P, Oliveira O, Pais MA (2004) On the non-uniqueness of main geomagnetic field determined by surface intensity measurements: the Backus problem. Geophys J Int 159: 558–554. 10.1111/j.1365-246X.2004.02413.x Backus GE (1968) Applications of a non-linear boundary value problem for Laplace’s equation to gravity and geomagnetic intensity surveys. Q J Mech Appl Math 21:195–221 Backus GE (1970) Non-uniqueness of the external geomagnetic field determined by surface intensity measurements. J Geophys Res 75(31):6339–6341 Backus GE (1974) Determination of the external geomagnetic field from intensity measurements. Geophys Res Lett 1(1):21 Backus G (1986) Poloidal and toroidal fields in geomagnetic field modeling. Rev Geophys 24:75–109 Backus G, Parker R, Constable C (1996) Foundations of geomagnetism. Cambridge University Press, New York Barraclough DR, Nevitt C (1976) The effect of observational errors on geomagnetic field models based solely on total-intensity measurements. Phys Earth Planet Int 13:123–131 Blakely RJ (1995) Potential theory in gravity and magnetic applications. Cambridge University Press, Cambridge Bloxham J (1985) Geomagnetic secular variation. PhD thesis, Cambridge University Dahlen F, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton Edmonds A (1996) Angular momentum in quantum mechanics. Princeton University Press, Princeton Elsasser W (1946) Induction effects in terrestrial magnetism. Part I. Theory. Phys Rev 69(3–4):106–116 Ferrers NM (1877) An elementary treatise on spherical harmonics and subjects connected with them. Macmillan, London Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358 Friis-Christensen E, Lühr H, Hulot G, Haagmans R, Purucker M (2009) Geomagnetic research from space. 
Eos 90:25 Gauss CF (1839) Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des Magnetischen Vereins im Jahre 1838. Göttinger Magnetischer Verein, Leipzig Goldie AHR, Joyce JW (1940) In: Proceedings of the 1939 Washington Assembly of the Association of Terrestrial Magnetism and Electricity of the International Union of Geodesy and Geophysics vol 11(6). Neill & Co, Edinburgh Granzow DK (1983) Spherical harmonic representation of the magnetic field in the presence of a current density. Geophys J R Astron Soc 74:489–505 Harrison CGA (1987) The crustal field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 513–610 Holme R, Bloxham J (1995) Alleviation of the backus effect in geomagnetic field modelling. Geophys Res Lett 22:1641–1644 Holme R, James MA, Lühr H (2005) Magnetic field modelling from scalar-only data: resolving the Backus effect with the equatorial electrojet. Earth Planets Space 57:1203–1209

Hulot G, Khokhlov A, Le Mouël JL (1997) Uniqueness of mainly dipolar magnetic fields recovered from directional data. Geophys J Int 129:347–354 Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Jackson J (1998) Classical electrodynamics. Wiley, New York Jackson A, Finlay CC (2007) Geomagnetic secular variation and its application to the core. In: Kono M, (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Jackson A, Jonkers ART, Walker MR (2000) Four centuries of geomagnetic secular variation from historical records. Philos Trans R Soc Lond A 358:957–990 Kellogg OD (1954) Foundations of potential theory. Dover, New York Khokhlov A, Hulot G, Le Mouël JL (1997) On the Backus effect – I. Geophys J Int 130:701–703 Khokhlov A, Hulot G, Le Mouël JL (1999) On the Backus effect – II. Geophys J Int 137:816–820 Kono M (1976) Uniqueness problems in the spherical analysis of the geomagnetic field direction data. J Geomagn Geoelectr 28:11–29 Langel RA (1987) The main field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 249–512 Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge Lesur V, Wardinski I, Rother M, Mandea M (2008) GRIMM: the GFZ reference internal magnetic model based on vector satellite and observatory data. Geophys J Int 173:382–294 Lorrain P, Corson D (1970) Electromagnetic fields and waves. WH Freeman, San Francisco Lowes FJ (1966) Mean-square values on sphere of spherical harmonic vector fields. J Geophys Res 71:2179 Lowes FJ (1974) Spatial power spectrum of the main geomagnetic field, and extrapolation to the core. Geophys J R Astron Soc 36:717–730 Lowes FJ (1975) Vector errors in spherical harmonic analysis of scalar data. 
Geophys J R Astron Soc 42:637–651 Lowes FJ, De Santis A, Duka B (1995) A discussion of the uniqueness of a Laplacian potential when given only partial information on a sphere. Geophys J Int 121:579–584 Lühr H, Maus S, Rother M (2002) First in-situ observation of night-time F region currents with the CHAMP satellite. Geophys Res Lett 29(10):127.1–127.4. 10.1029/2001 GL 013845 Maeda H, Iyemori T, Araki T, Kamei T (1982) New evidence of a meridional current system in the equatorial ionosphere. Geophys Res Lett 9:337–340 Malin S (1987) Historical introduction to geomagnetism. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 1–49 Mauersberger P (1956) Das Mittel der Energiedichte des geomagnetischen Hauptfeldes an der Erdoberfläche und seine säkulare Änderung. Gerl Beitr Geophys 65:207–215 Maus S (2007) CHAMP magnetic mission. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg Maus S, Lühr H (2006) A gravity-driven electric current in the Earth’s ionosphere identified in champ satellite magnetic measurements. Geophys Res Lett 33:L02812. doi:10.1029/2005GL024436 Maus S, Yin F, Lühr H, Manoj C, Rother M, Rauberg J, Michaelis I, Stolle C, Müller R (2008) Resolution of direction of oceanic magnetic lineations by the sixth-generation lithospheric magnetic field model from CHAMP satellite magnetic measurements. Geochem Geophys Geosyst 9(7):Q07021

Merrill R, McElhinny M (1983) The Earth’s magnetic field. Academic, London Merrill R, McFadden P, McElhinny M (1998) The magnetic field of the Earth: paleomagnetism, the core, and the deep mantle. Academic, London Mie G (1908) Considerations on the optic of turbid media, especially colloidal metal sols. Ann Phys (Leipzig) 25:377–442 Morse P, Feshbach H (1953) Methods of theoretical physics. International series in pure and applied physics. McGraw-Hill, New York Olsen N (1997) Ionospheric F region currents at middle and low latitudes estimated from Magsat data. J Geophys Res 102(A3):4563–4576 Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2009) CHAOS-2-a geomagnetic field model derived from one decade of continuous satellite data. Geophys J Int 199(3):1477–1487. doi:10.1111/j.1365-246X.2009.04386.x Proctor MRE, Gubbins D (1990) Analysis of geomagnetic directional data. Geophys J Int 100:69–77 Purucker M, Whaler K (2007) Crustal magnetism. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 195–235 Richmond AD (1995) Ionospheric electrodynamics using magnetic Apex coordinates. J Geomagn Geoelectr 47:191–212 Sabaka TJ, Olsen N, Langel RA (2002) A comprehensive model of the quiet-time near-Earth magnetic field: phase 3. Geophys J Int 151:32–68 Sabaka TJ, Olsen N, Purucker ME (2004) Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data. Geophys J Int 159:521–547. doi:10.1111/j.1365246X.2004.02421.x Schmidt A (1935) Tafeln der Normierten Kugelfunktionen. Engelhard-Reyher Verlag, Gotha Stern DP (1976) Representation of magnetic fields in space. Rev Geophys 14:199–214 Stern DP, Bredekamp JH (1975) Error enhancement in geomagnetic models derived from scalar data. J Geophys Res 80:1776–1782 Stern DP, Langel RA, Mead GD (1980) Backus effect observed by Magsat. Geophys Res Lett 7:941–944 Stolle C, Lühr H, Rother M, Balasis G (2006) Magnetic signatures of equatorial spread F , as observed by the CHAMP satellite. 
J Geophys Res 111:A02304. doi:10.1029/2005JA011184 Thomson AWP, Lesur V (2007) An improved geomagnetic data selection algorithm for global geomagnetic field modelling. Geophys J Int 169(3):951–963 Ultré-Guérard P, Hamoudi M, Hulot G (1998) Reducing the Backus effect given some knowledge of the dip-equator. Geophys Res Lett 22(16):3201–3204 Walker AD (1992) Comment on “Non-uniqueness of the external geomagnetic field determined by surface intensity measurements” by Georges E. Backus. J Geophys Res 97(B10):13991 Watson GN (1966) A treatise on the theory of Bessel function. Cambridge University Press, London Winch D, Ivers D, Turner J, Stening R (2005) Geomagnetism and Schmidt quasi-normalization. Geophys J Int 160(2):487–504

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_18-4 © Springer-Verlag Berlin Heidelberg 2014

Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents Christian Gerhards Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany

Abstract This chapter gives a brief overview on the application of multiscale techniques to the modeling of geomagnetic problems. Two approaches are presented: one focusing on the construction of scaling and wavelet kernels in frequency domain and the other one focusing on a spatially oriented construction resulting in locally supported wavelets. Both approaches are applied exemplarily to the modeling of the crustal field, the reconstruction of radial current systems, and the definition of a multiscale power spectrum.

1 Introduction

The last decade has considerably improved the understanding of the Earth’s magnetic field due to high-precision vector satellite data supplied, e.g., by the Ørsted and CHAMP missions that were launched in 1999 and 2000, respectively. ESA’s new satellite constellation Swarm (cf. Friis-Christensen et al. 2006) is anticipated to conduct even more accurate measurements. The provided data generally contain contributions from various sources of the Earth’s magnetic field (cf. Fig. 1). The major contributions are due to dynamo processes in the Earth’s interior (core/main field), electric currents in the iono- and magnetosphere (external field), and static magnetization in the Earth’s lithosphere in combination with induction processes (crustal/lithospheric field). A better understanding and quantification of the different contributions requires sophisticated mathematical tools and a great effort in modeling. Current models comprising different parts of the geomagnetic field are, e.g., IGRF11 (cf. IAGA 2010), MF7 (2010), CHAOS-4 (cf. Olsen et al. 2014), and GRIMM-3 (2011). Further overviews of the geophysical background and satellite data situation can be found, e.g., in Langel (1987), Langel and Hinze (1998), Hulot et al. (2007, 2010), and Thébault et al. (2010). In this article, we are mainly concerned with parts of the geomagnetic field whose length and time scales are such that the displacement currents in the Maxwell equations can be neglected (cf. Backus et al. 1996). Therefore, it is reasonable to concentrate the modeling effort on the pre-Maxwell equations

$$\nabla \wedge b = \mu_0\, j, \tag{1}$$

$$\nabla \cdot b = 0, \tag{2}$$


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_18-4 © Springer-Verlag Berlin Heidelberg 2014

Fig. 1 Schematic description of the Earth’s magnetic field contributions

where $b$ denotes the magnetic field and $j$ the electric current density (for all theoretical considerations the vacuum permeability $\mu_0$ is chosen to be equal to one; for the application to satellite data, we switch to the SI unit system). In regions with vanishing current density $j = 0$, such as the neutral atmosphere between the Earth’s surface $\Omega_{R_E} = \{x \in \mathbb{R}^3 \mid |x| = R_E\}$ and the lower bound of the ionosphere $\Omega_{R_I} = \{x \in \mathbb{R}^3 \mid |x| = R_I\}$, the classical Gauss representation holds true: the magnetic field $b$ can be expressed by a harmonic potential $U$ via

$$b = \nabla U, \tag{3}$$

where $U$ has the well-known representation

$$U = \sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} \left( a_{n,k}\, H^{\mathrm{int}}_{n,k}(R;\cdot) + c_{n,k}\, H^{\mathrm{ext}}_{n,k}(R;\cdot) \right), \tag{4}$$

for adequate coefficients $a_{n,k}$, $c_{n,k}$, and a fixed radius $R \in (R_E, R_I)$. By $H^{\mathrm{int}}_{n,k}(R;\cdot)$ and $H^{\mathrm{ext}}_{n,k}(R;\cdot)$, we denote the inner and outer harmonics, respectively, which are the harmonic extensions of the spherical harmonics $\frac{1}{R} Y_{n,k}$ into $\Omega_R^{\mathrm{int}} = \{x \in \mathbb{R}^3 \mid |x| < R\}$ and $\Omega_R^{\mathrm{ext}} = \{x \in \mathbb{R}^3 \mid |x| > R\}$, respectively. The part $b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ of $b = \nabla U$ that consists of all outer harmonic contributions represents the magnetic field that is generated by source currents $j$ in the interior of $\Omega_{R_E}$, while the part $b^{\mathrm{ext}} = \nabla U^{\mathrm{ext}}$ of $b = \nabla U$ that consists of all inner harmonic contributions represents the magnetic field that is generated by sources in the exterior of $\Omega_{R_I}$.

Satellite measurements, however, are generally conducted in regions of ionospheric currents, i.e., $j \neq 0$. Thus, the Gauss representation breaks down for modeling from such data sources. As a substitute, Gerlich (1972) and Backus (1986) suggest the Mie representation for geomagnetic applications. Any solenoidal vector field, such as the magnetic field $b$, can be decomposed into

$$b = p_b + q_b = \nabla \wedge L P_b + L Q_b, \tag{5}$$

with uniquely determined vector fields $p_b$, $q_b$ (the operator $L$ is the curl gradient acting at a point $x \in \mathbb{R}^3$ via $x \wedge \nabla_x$, where $\wedge$ denotes the vector product). The corresponding scalars $P_b$, $Q_b$ are uniquely determined if they have vanishing integral mean values, i.e., if $\frac{1}{4\pi} \int_\Omega P_b(r\xi)\, d\omega(\xi) = \frac{1}{4\pi} \int_\Omega Q_b(r\xi)\, d\omega(\xi) = 0$. The term $p_b$ is known as the poloidal part of the magnetic field, while $q_b$ is the toroidal part. The associated scalar functions $P_b$ and $Q_b$ are often just called Mie scalars. Due to (1), the corresponding current density allows a Mie representation as well:

$$j = p_j + q_j = \nabla \wedge L P_j + L Q_j. \tag{6}$$

In combination with the pre-Maxwell equations, this yields a fundamental connection for the Mie scalars of the magnetic field and its source currents:

$$Q_b = P_j, \tag{7}$$

$$\Delta P_b = -Q_j. \tag{8}$$
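The relations (7) and (8) can be made plausible by a short calculation. The following is only a sketch under stated assumptions: $\mu_0 = 1$ as in the text, sufficient smoothness of $P_b$, $Q_b$, and two standard identities that are not spelled out in this chapter, namely $\nabla \cdot (L P) = 0$ and $\Delta (L P) = L (\Delta P)$. Questions of uniqueness (the vanishing mean values) are ignored here.

```latex
\begin{align*}
j = \nabla \wedge b
  &= \nabla \wedge \left( \nabla \wedge L P_b \right) + \nabla \wedge L Q_b
     && \text{by (1) and (5)} \\
  &= \nabla \left( \nabla \cdot L P_b \right) - \Delta \left( L P_b \right)
     + \nabla \wedge L Q_b
     && \text{curl-curl identity} \\
  &= \nabla \wedge L Q_b + L \left( -\Delta P_b \right)
     && \text{assumed identities above.}
\end{align*}
% Comparing with (6), j = \nabla \wedge L P_j + L Q_j, suggests
% P_j = Q_b and Q_j = -\Delta P_b, in agreement with (7) and (8).
```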

In other words, toroidal magnetic fields are caused by poloidal currents and poloidal fields by toroidal currents. Furthermore, the Mie scalars represent the analogue of the scalar potential $U$ of the Gauss representation and can be expanded in terms of spherical harmonics on a fixed sphere $\Omega_R$. An extension into the exterior space $\Omega_R^{\mathrm{ext}}$ or the interior space $\Omega_R^{\mathrm{int}}$ by outer and inner harmonics, respectively, is not possible without further information since $P_b$, $P_j$, and $Q_b$, $Q_j$ are generally not harmonic.

Within the framework described above, Fourier expansions with respect to spherical harmonics are the most popular and widespread tool to describe the geomagnetic field and related quantities. They form the basis of all models mentioned in the beginning of this introduction. An overview of spherical harmonic methods in geomagnetism can be found, e.g., in Backus et al. (1996) and the contribution Sabaka et al. (2014) of this handbook. These methods do well for global approximations and uniformly distributed data and have the advantage of physically relevant interpretations of the degree $n$ in terms of multipoles and frequencies. However, the global nature of spherical harmonics makes them less suitable for the reconstruction of strongly localized features, such as crustal field anomalies, or for modeling from local or unevenly distributed data sets.

Over the years, different methods have been developed to address these topics. Haines (1985) has introduced spherical cap harmonics, a basis system for spherical caps that is obtained by solving the Laplace equation with adequate boundary values. A revision of the approach can be found in Thébault et al. (2006). Spline functions, as used for the gravitational potential in Freeden (1981), have been formulated for geomagnetic purposes in Shure et al. (1982). So-called Slepian functions on the sphere, which are obtained by optimization of the spatial localization under certain constraints of band-limitation in frequency domain (or vice versa), are treated in Simons et al. (2006) and Simons and Plattner (2014). All of these approaches are capable of modeling from local data, but the area of localization has to be fixed in advance. Multiscale approaches, on the other hand, are able to start on a global scale and then refine the localization step by step. Spherical versions have already been introduced in Dahlke et al. (1995), Schröder and Sweldens (1995), Freeden and Windheuser (1996), Holschneider (1996), and Freeden et al. (1998). The application to geomagnetic problems, however, is rather recent (see, e.g., Bayer et al. 2001; Holschneider et al. 2003; Maier and Mayer 2003; Chambodut et al. 2005; Mayer and Maier 2006; Freeden and Gerhards 2010; Gerhards 2012).

The goal of this chapter is to give an overview of the spherical multiscale methods developed at the Geomathematics Group of the University of Kaiserslautern and their application to problems in geomagnetism based on the geophysical foundations described in the beginning of the introduction. More precisely, in Sect. 2 we introduce necessary function systems such as spherical harmonics, Green’s function for the Beltrami operator, and the single layer kernel. Two different ways of constructing multiscale representations, one focused on the frequency domain and the other one focused on locally compact support in spatial domain, are provided in Sect. 3. The application to different problems in geomagnetic modeling, namely, crustal field modeling, reconstruction of radial current systems, and the definition of a multiscale power spectrum, is then treated in Sect. 4.

2 Relevant Function Systems

Fourier expansions on the sphere in terms of spherical harmonics are the classical approach to potential field modeling in geosciences, so that an introduction of this function system is inevitable. We are, however, in the first place interested in spherical harmonics as stepping stones leading from the frequency-oriented Fourier expansions to spatially oriented multiscale methods. Further important function systems on this path are Green’s function for the Beltrami operator and the single layer kernel. In many geomagnetic applications they make the explicit use of spherical harmonics unnecessary.

2.1 Spherical Harmonics

The (scalar) spherical harmonics $Y_{n,k}$ of degree $n$ and order $k$ denote the infinitely often differentiable eigenfunctions of the Beltrami operator $\Delta^*$ to the eigenvalues $-n(n+1)$, i.e.,

$$\Delta^* Y_{n,k} = -n(n+1)\, Y_{n,k}, \qquad n \in \mathbb{N}_0,\ k = 1, \ldots, 2n+1. \tag{9}$$

They can be orthonormalized in the sense

$$(Y_{n,k}, Y_{m,l})_{L^2(\Omega)} = \int_\Omega Y_{n,k}(\eta)\, Y_{m,l}(\eta)\, d\omega(\eta) = \begin{cases} 0, & \text{if } n \neq m \text{ or } k \neq l, \\ 1, & \text{if } n = m \text{ and } k = l. \end{cases} \tag{10}$$

Note that $\Omega = \Omega_1$ denotes the unit sphere. Whenever we mention spherical harmonics $Y_{n,k}$ in this chapter, we mean them to be $L^2(\Omega)$-orthonormalized in the above sense. It should be noted that, different from our convention, Schmidt semi-normalized spherical harmonics are more commonly used in geomagnetic literature. Any square integrable function $F \in L^2(\Omega)$ can be represented by the Fourier expansion

$$F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^\wedge(n,k)\, Y_{n,k}, \tag{11}$$

with Fourier coefficients

$$F^\wedge(n,k) = \int_\Omega F(\eta)\, Y_{n,k}(\eta)\, d\omega(\eta), \tag{12}$$

where convergence of the series is meant with respect to the $L^2(\Omega)$-topology. Of some further importance for the analysis of harmonic potentials are the so-called inner and outer harmonics

$$H^{\mathrm{int}}_{n,k}(R; x) = \frac{1}{R} \left( \frac{r}{R} \right)^{n} Y_{n,k}(\xi), \qquad x \in \Omega_R^{\mathrm{int}}, \tag{13}$$

$$H^{\mathrm{ext}}_{n,k}(R; x) = \frac{1}{R} \left( \frac{R}{r} \right)^{n+1} Y_{n,k}(\xi), \qquad x \in \Omega_R^{\mathrm{ext}}, \tag{14}$$

where $x = r\xi$ with $r = |x|$, $\xi = \frac{x}{|x|}$, as always in this chapter. They are the uniquely determined harmonic functions in $\Omega_R^{\mathrm{int}}$ and $\Omega_R^{\mathrm{ext}}$, respectively, with spherical harmonics $\frac{1}{R} Y_{n,k}$ as boundary values on $\Omega_R$.

We know that the scalar potential $U$ of the Gauss representation $b = \nabla U$ can be expressed in terms of (scalar) spherical harmonics and inner and outer harmonics (cf. expression (4)). The magnetic field $b$ itself, however, is a vectorial quantity, so that it would be convenient to have vectorial counterparts of the (scalar) spherical harmonics at hand. To this end, we define the three spherical operators $o^{(1)}_\xi F(\xi) = \xi F(\xi)$, $o^{(2)}_\xi F(\xi) = \nabla^*_\xi F(\xi)$, and $o^{(3)}_\xi F(\xi) = L^*_\xi F(\xi)$, acting at $\xi \in \Omega$ on a sufficiently often differentiable scalar function $F$ on the sphere $\Omega$. The operator $\nabla^*$ denotes the surface gradient, related to the gradient via $\nabla_x = \xi \frac{\partial}{\partial r} + \frac{1}{r} \nabla^*_\xi$, while $L^*$ denotes the surface curl gradient, acting at a point $\xi \in \Omega$ via $L^*_\xi = \xi \wedge \nabla^*_\xi$. The well-known spherical Helmholtz decomposition states that any vector field $f \in c^{(1)}(\Omega)$ can be represented by

$$f = o^{(1)} F_1 + o^{(2)} F_2 + o^{(3)} F_3, \tag{15}$$

with uniquely determined scalar functions $F_1 \in C^{(1)}(\Omega)$ and $F_2, F_3 \in C^{(2)}(\Omega)$ satisfying

$$\frac{1}{4\pi} \int_\Omega F_2(\eta)\, d\omega(\eta) = \frac{1}{4\pi} \int_\Omega F_3(\eta)\, d\omega(\eta) = 0. \tag{16}$$

Therefore, it is straightforward to define a system of vector spherical harmonics of type $i$ in the following way.

Definition 1. The functions $y^{(i)}_{n,k}$ given by

$$y^{(i)}_{n,k} = \left( \mu^{(i)}_n \right)^{-\frac{1}{2}} o^{(i)} Y_{n,k}, \qquad i = 1, 2, 3;\ n \in \mathbb{N}_{0_i};\ k = 1, \ldots, 2n+1, \tag{17}$$

are called vector spherical harmonics of type $i$, degree $n$, and order $k$. For brevity, we use the notations $0_1 = 0$, $0_2 = 0_3 = 1$, and normalization constants $\mu^{(1)}_n = 1$, $\mu^{(2)}_n = \mu^{(3)}_n = n(n+1)$.

Similar to the scalar case, any vectorial function $f \in l^2(\Omega)$ can be expressed as a Fourier series

$$f = \sum_{i=1}^{3} \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} (f^{(i)})^\wedge(n,k)\, y^{(i)}_{n,k}, \tag{18}$$

with Fourier coefficients

$$(f^{(i)})^\wedge(n,k) = \int_\Omega f(\eta) \cdot y^{(i)}_{n,k}(\eta)\, d\omega(\eta). \tag{19}$$


This way we have defined a vectorial orthonormal basis for $l^2(\Omega)$ that decomposes a vector field $f$ with respect to the Helmholtz representation, i.e., it decomposes $f$ into a radial part ($i = 1$) and two tangential parts ($i = 2, 3$). There are further ways to define vector basis systems. A system especially suited for the separation of the geomagnetic field into interior and exterior sources requires the operators $\tilde{o}^{(1)} = o^{(1)} \left( D + \frac{1}{2} \right) - o^{(2)}$, $\tilde{o}^{(2)} = o^{(1)} \left( D - \frac{1}{2} \right) + o^{(2)}$, and $\tilde{o}^{(3)} = o^{(3)}$, with

$$D = \left( -\Delta^* + \frac{1}{4} \right)^{\frac{1}{2}}. \tag{20}$$

The operator $D$ is characterized by the spherical harmonics $Y_{n,k}$ being the infinitely often differentiable eigenfunctions to the eigenvalues $n + \frac{1}{2}$, i.e.,

$$D Y_{n,k} = \left( n + \frac{1}{2} \right) Y_{n,k}, \qquad n \in \mathbb{N}_0,\ k = 1, \ldots, 2n+1. \tag{21}$$

Its inverse is called the (spherical) single layer operator (some more details are given in Sect. 2.3). We can now define an alternative orthonormal system of vector spherical harmonics.

Definition 2. The functions $\tilde{y}^{(i)}_{n,k}$ given by

$$\tilde{y}^{(i)}_{n,k} = \left( \tilde{\mu}^{(i)}_n \right)^{-\frac{1}{2}} \tilde{o}^{(i)} Y_{n,k}, \qquad i = 1, 2, 3;\ n \in \mathbb{N}_{0_i};\ k = 1, \ldots, 2n+1, \tag{22}$$

are called vector spherical harmonics of type $i$, degree $n$, and order $k$. The normalization constants are $\tilde{\mu}^{(1)}_n = (n+1)(2n+1)$, $\tilde{\mu}^{(2)}_n = n(2n+1)$, and $\tilde{\mu}^{(3)}_n = n(n+1)$.

Again, any vectorial function $f \in l^2(\Omega)$ can be expressed as a Fourier series

$$f = \sum_{i=1}^{3} \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} (\tilde{f}^{(i)})^\wedge(n,k)\, \tilde{y}^{(i)}_{n,k}, \tag{23}$$

with Fourier coefficients

$$(\tilde{f}^{(i)})^\wedge(n,k) = \int_\Omega f(\eta) \cdot \tilde{y}^{(i)}_{n,k}(\eta)\, d\omega(\eta). \tag{24}$$

The advantage of this system is its connection to outer and inner harmonics:

$$\nabla_x H^{\mathrm{int}}_{n,k}(R; x) = \frac{1}{R^2} \left( \frac{r}{R} \right)^{n-1} \left( \tilde{\mu}^{(2)}_n \right)^{\frac{1}{2}} \tilde{y}^{(2)}_{n,k}(\xi), \qquad x \in \Omega_R^{\mathrm{int}}, \tag{25}$$

$$\nabla_x H^{\mathrm{ext}}_{n,k}(R; x) = -\frac{1}{R^2} \left( \frac{R}{r} \right)^{n+2} \left( \tilde{\mu}^{(1)}_n \right)^{\frac{1}{2}} \tilde{y}^{(1)}_{n,k}(\xi), \qquad x \in \Omega_R^{\mathrm{ext}}. \tag{26}$$

A more detailed overview of scalar, vector, and (not mentioned here) tensor spherical harmonics can be found, e.g., in the monographs Müller (1966), Freeden et al. (1998), and Freeden and Schreiner (2009). The second set of vector spherical harmonics and some of its applications are additionally treated, e.g., in Edmonds (1957), Backus et al. (1996), and Mayer and Maier (2006).
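As a small numerical illustration of the inner and outer harmonics (13) and (14), the following sketch evaluates them for one particular spherical harmonic and checks harmonicity with a finite-difference Laplacian. It is only a sketch under assumptions: the zonal harmonic $\sqrt{(2n+1)/(4\pi)}\, P_n(\xi \cdot \varepsilon^3)$ is used as one $L^2(\Omega)$-orthonormal member of degree $n$, and all function names are hypothetical.

```python
import math

def legendre(n, t):
    """Legendre polynomial P_n(t) via the three-term recurrence."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for m in range(1, n):
        p_prev, p = p, ((2 * m + 1) * t * p - m * p_prev) / (m + 1)
    return p

def zonal_harmonic(n, xi):
    """L2-orthonormal zonal harmonic of degree n about the x3-axis (illustrative choice)."""
    return math.sqrt((2 * n + 1) / (4.0 * math.pi)) * legendre(n, xi[2])

def _split(x):
    """Return r = |x| and the unit vector xi = x / |x| (x must be nonzero)."""
    r = math.sqrt(sum(c * c for c in x))
    return r, [c / r for c in x]

def inner_harmonic(n, R, x):
    """H^int_{n,k}(R; x) = (1/R) (r/R)^n Y(xi), cf. (13)."""
    r, xi = _split(x)
    return (1.0 / R) * (r / R) ** n * zonal_harmonic(n, xi)

def outer_harmonic(n, R, x):
    """H^ext_{n,k}(R; x) = (1/R) (R/r)^(n+1) Y(xi), cf. (14)."""
    r, xi = _split(x)
    return (1.0 / R) * (R / r) ** (n + 1) * zonal_harmonic(n, xi)

def laplacian_fd(f, x, h=1e-3):
    """Second-order central finite-difference approximation of the Laplacian of f at x."""
    total = 0.0
    for axis in range(3):
        xp, xm = list(x), list(x)
        xp[axis] += h
        xm[axis] -= h
        total += (f(xp) - 2.0 * f(x) + f(xm)) / (h * h)
    return total
```

Both functions take the value $\frac{1}{R} Y_{n,k}(\xi)$ at $|x| = R$, and the finite-difference Laplacian is close to zero away from the origin, which is what the harmonic extension property requires.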

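2.2 Green’s Function for the Beltrami Operator

As Green’s function with respect to the Beltrami operator, we define the rotationally invariant function $G(\Delta^*; \cdot)$ satisfying

$$\Delta^*_\xi\, G(\Delta^*; \xi \cdot \eta) = -\frac{1}{4\pi}, \qquad \xi, \eta \in \Omega,\ \xi \neq \eta, \tag{27}$$

with a singularity of type

$$G(\Delta^*; \xi \cdot \eta) = O(\ln(1 - \xi \cdot \eta)), \qquad \xi, \eta \in \Omega,\ \xi \to \eta. \tag{28}$$

Uniqueness is guaranteed by the claim that $G(\Delta^*; \cdot)$ has vanishing integral mean value on the unit sphere. Observing that the spherical harmonics $Y_{n,k}$ are eigenfunctions to the Beltrami operator with eigenvalues $-n(n+1)$, one can derive the Fourier representation

$$G(\Delta^*; \xi \cdot \eta) = -\sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} \frac{1}{n(n+1)}\, Y_{n,k}(\xi)\, Y_{n,k}(\eta) = -\sum_{n=1}^{\infty} \frac{2n+1}{4\pi\, n(n+1)}\, P_n(\xi \cdot \eta), \qquad \xi, \eta \in \Omega,\ \xi \neq \eta, \tag{29}$$

where $P_n$ denotes the Legendre polynomial of degree $n$. Of more relevance to us is that a closed representation

$$G(\Delta^*; \xi \cdot \eta) = \frac{1}{4\pi} \ln(1 - \xi \cdot \eta) + \frac{1}{4\pi} (1 - \ln(2)), \qquad \xi, \eta \in \Omega,\ \xi \neq \eta, \tag{30}$$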

is available. Furthermore, making use of the defining properties of Green’s function, we obtain a simple integral relationship of a twice continuously differentiable function $F$ to its Beltrami derivative $\Delta^* F$.

Theorem 1. Let $F$ be of class $C^{(2)}(\Omega)$. Then

$$F(\xi) = \frac{1}{4\pi} \int_\Omega F(\eta)\, d\omega(\eta) + \int_\Omega G(\Delta^*; \xi \cdot \eta)\, \Delta^*_\eta F(\eta)\, d\omega(\eta), \qquad \xi \in \Omega. \tag{31}$$

As a consequence of the above representation, it is straightforward to see that

$$F(\xi) = \int_\Omega G(\Delta^*; \xi \cdot \eta)\, H(\eta)\, d\omega(\eta), \qquad \xi \in \Omega, \tag{32}$$

is the uniquely determined solution with vanishing integral mean value of the Beltrami equation

$$\Delta^*_\xi F(\xi) = H(\xi), \qquad \xi \in \Omega, \tag{33}$$

where $H$ is a continuous function having vanishing integral mean value itself. Since the Beltrami equation as well as differential equations involving the surface gradient $\nabla^*$ and the surface curl gradient $L^*$ appear frequently in geomagnetic problems, Eq. (31) is a helpful companion in our later modeling. Using the expression (29) together with (32), we obtain the classical Fourier expansion of the solution $F$, while the closed expression (30) is a first step towards a more spatially oriented representation of $F$.

The crucial idea for a multiscale representation with spatially locally supported wavelets is a regularization of $G(\Delta^*; \cdot)$ around its singularity. A typical choice for the regularization $G^\rho(\Delta^*; \cdot)$ with regularization parameter $\rho \in (0, 2)$ is to substitute the Green function on the spherical cap $\Gamma_\rho(\xi) = \{\eta \in \Omega \mid 1 - \xi \cdot \eta < \rho\}$ with center $\xi$ and radius $\rho$ by its Taylor polynomial centered at $1 - \rho$. Taking the Taylor polynomial of degree 2 leads to a twice continuously differentiable function:

Definition 3. By $G^\rho(\Delta^*; \cdot)$, $\rho \in (0, 2)$, we denote the regularized Green function

$$G^\rho(\Delta^*; \xi \cdot \eta) = \begin{cases} \dfrac{1}{4\pi} \ln(1 - \xi \cdot \eta) + \dfrac{1}{4\pi} (1 - \ln(2)), & 1 - \xi \cdot \eta \geq \rho, \\[2ex] -\dfrac{1}{8\pi \rho^2} (1 - \xi \cdot \eta)^2 + \dfrac{1}{2\pi \rho} (1 - \xi \cdot \eta) + \dfrac{1}{4\pi} (\ln(\rho) - \ln(2)) - \dfrac{1}{8\pi}, & 1 - \xi \cdot \eta < \rho. \end{cases} \tag{34}$$

Higher-degree Taylor polynomials yield smoother regularizations and become necessary when higher-order differential operators are involved. But for our later applications, it is generally sufficient to use (34). Only the example in Sect. 4.2 requires $G^\rho(\Delta^*; \cdot)$ to be three times continuously differentiable (for more details on higher-order regularizations, the reader is referred to Gerhards (2011a) and Freeden and Gerhards (2012)). At this point it should be emphasized that $G(\Delta^*; \cdot)$ and its regularization only depend on the scalar product $\xi \cdot \eta$, so that they can be regarded as functions acting on the interval $[-1, 1)$ and $[-1, 1]$, respectively. This simplifies many calculations. A plot of the regularized Green function (34) can be found in Fig. 2.

Fig. 2 Regularized Green’s function $\vartheta \mapsto G^\rho(\Delta^*; \cos(\vartheta))$ (left) and regularized single layer kernel $\vartheta \mapsto S^\rho(\cos(\vartheta))$ (right), plotted with respect to the angular distance
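The closed representation (30), the Legendre series (29), and the regularization (34) can be compared numerically. The sketch below is illustrative only: the function names and the truncation degree are assumptions, and the series in (29) converges slowly, so the comparison is meaningful only up to a modest tolerance.

```python
import math

def legendre_values(n_max, t):
    """Return [P_0(t), ..., P_{n_max}(t)] via the three-term recurrence."""
    values = [1.0, t]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]

def green_closed(t):
    """Closed form (30) of G(Delta*; t) for t = xi . eta in [-1, 1)."""
    return (math.log(1.0 - t) + 1.0 - math.log(2.0)) / (4.0 * math.pi)

def green_series(t, n_max=2000):
    """Truncated Legendre series (29); n_max is an arbitrary illustrative cutoff."""
    p = legendre_values(n_max, t)
    return -sum(
        (2 * n + 1) / (4.0 * math.pi * n * (n + 1)) * p[n]
        for n in range(1, n_max + 1)
    )

def green_regularized(t, rho):
    """Regularized Green function (34) with parameter rho in (0, 2)."""
    s = 1.0 - t
    if s >= rho:
        return green_closed(t)
    return (
        -s * s / (8.0 * math.pi * rho ** 2)
        + s / (2.0 * math.pi * rho)
        + (math.log(rho) - math.log(2.0)) / (4.0 * math.pi)
        - 1.0 / (8.0 * math.pi)
    )
```

Outside the cap ($1 - \xi \cdot \eta \geq \rho$) the regularized function coincides with (30), while inside the cap it stays finite, including at $\xi \cdot \eta = 1$ where the original function has its logarithmic singularity.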


The next question is how a convolution with the regularized Green function behaves in comparison to a convolution with the original Green function, asymptotically as $\rho \to 0+$. Elementary calculations yield the following relation for functions $F \in C^{(0)}(\Omega)$:

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega G^\rho(\Delta^*; \xi \cdot \eta)\, F(\eta)\, d\omega(\eta) - \int_\Omega G(\Delta^*; \xi \cdot \eta)\, F(\eta)\, d\omega(\eta) \right| = 0. \tag{35}$$

Vectorial kernels are obtained by application of the surface gradient and the surface curl gradient. Since the tangential derivatives of $G(\Delta^*; \cdot)$ are still integrable on the sphere, a similar argumentation yields

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega \Lambda^*_\eta G^\rho(\Delta^*; \xi \cdot \eta) \cdot f(\eta)\, d\omega(\eta) - \int_\Omega \Lambda^*_\eta G(\Delta^*; \xi \cdot \eta) \cdot f(\eta)\, d\omega(\eta) \right| = 0, \tag{36}$$

for $\Lambda^*$ being one of the operators $\nabla^*$ or $L^*$ and $f \in c^{(0)}(\Omega)$. Using the surface theorem of Gauss, twofold application of the surface differential operators implies tensorial kernels satisfying

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega \left( \Lambda^{*,(1)}_\xi \otimes \Lambda^{*,(2)}_\eta G^\rho(\Delta^*; \xi \cdot \eta) \right) f(\eta)\, d\omega(\eta) - \Lambda^{*,(1)}_\xi \int_\Omega \left( \Lambda^{*,(2)}_\eta G(\Delta^*; \xi \cdot \eta) \right) \cdot f(\eta)\, d\omega(\eta) \right| = 0, \tag{37}$$

for $f \in c^{(1)}(\Omega)$ and $\Lambda^{*,(1)}$, $\Lambda^{*,(2)}$ one of the operators $\nabla^*$, $L^*$ (by $\otimes$ we mean the tensor product of two vectorial quantities). Various further relations, e.g., those involving the Beltrami operator $\Delta^*$, can be shown under sufficient smoothness assumptions on the data $F$ and $f$. The limit relations provided here should be sufficient to motivate the convergence of the multiscale representations treated later on. Such regularizations of Green’s function for the Beltrami operator have first been introduced in Freeden and Schreiner (2006) and Fehlinger et al. (2007, 2008). In geomagnetic applications they have been used in Freeden and Gerhards (2010) and Gerhards (2011a, 2012).

2.3 Single Layer Kernel

In this subsection we briefly come back to the operator $D$ from (20), more precisely its inverse $D^{-1}$. Since the spherical harmonics form the eigenfunctions of $D$ to the eigenvalues $n + \frac{1}{2}$, we get

$$D^{-1} F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{1}{n + \frac{1}{2}}\, F^\wedge(n,k)\, Y_{n,k}, \tag{38}$$

for $F \in C^{(0)}(\Omega)$. More interesting for our spatial considerations is again that $D^{-1}$ can be rewritten as a convolution operator:

$$D^{-1} F(\xi) = \frac{1}{2\pi} \int_\Omega S(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta), \qquad \xi \in \Omega, \tag{39}$$

where $S$ is the (spherical) single layer kernel given by

$$S(\xi \cdot \eta) = \frac{1}{\sqrt{2 (1 - \xi \cdot \eta)}}, \qquad \xi, \eta \in \Omega,\ \xi \neq \eta. \tag{40}$$

The function $S$ coincides, up to a multiplicative constant, with the fundamental solution of the Laplace operator in $\mathbb{R}^3$ restricted to the unit sphere, which is also the reason for the name (spherical) single layer kernel. $S$ and $D^{-1}$ are not as naturally connected to the governing geomagnetic equations as Green’s function for the Beltrami operator and the spherical differential operators $\nabla^*$, $L^*$, and $\Delta^*$, but they are required for a decomposition of the magnetic field with respect to interior and exterior sources involving the operators $\tilde{o}^{(i)}$, $i = 1, 2, 3$. A regularization $S^\rho$, $\rho \in (0, 2)$, of $S$ around its singularity can be achieved in analogy to the Green function for the Beltrami operator. For our purposes the following continuously differentiable example is sufficient:

Definition 4. By $S^\rho$, $\rho \in (0, 2)$, we denote the regularized single layer kernel

$$S^\rho(\xi \cdot \eta) = \begin{cases} \dfrac{1}{\sqrt{2}} (1 - \xi \cdot \eta)^{-\frac{1}{2}}, & 1 - \xi \cdot \eta \geq \rho, \\[2ex] -\dfrac{1}{2\sqrt{2}}\, \rho^{-\frac{3}{2}} (1 - \xi \cdot \eta) + \dfrac{3}{2\sqrt{2}}\, \rho^{-\frac{1}{2}}, & 1 - \xi \cdot \eta < \rho. \end{cases} \tag{41}$$

An illustration is given in Fig. 2. The behavior of convolutions with the regularized single layer kernel is also similar to those involving $G^\rho(\Delta^*; \cdot)$. Thus, we get for $F \in C^{(0)}(\Omega)$:

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega S^\rho(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta) - \int_\Omega S(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta) \right| = 0. \tag{42}$$

If we want to apply $\nabla^*$ and $L^*$ to the above expression, we need to be aware that $\eta \mapsto \nabla^*_\xi S(\xi \cdot \eta)$ and $\eta \mapsto L^*_\xi S(\xi \cdot \eta)$ are not integrable on the sphere. Assuming that $F \in C^{(1)}(\Omega)$ we still get

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \Lambda^*_\xi \int_\Omega S^\rho(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta) - \Lambda^*_\xi \int_\Omega S(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta) \right| = 0, \tag{43}$$

where $\Lambda^*$ is one of the operators $\nabla^*$, $L^*$. Corresponding relations for further applications of differential operators and combinations of the single layer kernel and Green’s function become necessary for some applications later on, but we skip them at this point due to the similar argumentation. More details can be found in Gerhards (2011a, 2012). In the different context of gravity disturbances, regularizations of the single layer kernel have been used as well in Freeden and Wolf (2008) and Wolf (2009).
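The regularized single layer kernel of Definition 4 replaces $S$ inside the cap by its first-degree Taylor polynomial, so value and first derivative should match at $1 - \xi \cdot \eta = \rho$. The following sketch (function names are hypothetical) evaluates (40) and (41) as functions of $t = \xi \cdot \eta$ so that this matching can be checked numerically.

```python
import math

def single_layer(t):
    """Single layer kernel (40) as a function of t = xi . eta, t < 1."""
    return 1.0 / math.sqrt(2.0 * (1.0 - t))

def single_layer_regularized(t, rho):
    """Regularized single layer kernel (41) with parameter rho in (0, 2)."""
    s = 1.0 - t
    if s >= rho:
        return single_layer(t)
    # First-degree Taylor polynomial of s -> s^(-1/2) / sqrt(2) around s = rho.
    return (
        -s / (2.0 * math.sqrt(2.0) * rho ** 1.5)
        + 3.0 / (2.0 * math.sqrt(2.0) * math.sqrt(rho))
    )

def one_sided_slopes(rho, h=1e-6):
    """Finite-difference slopes in t just inside and just outside the cap boundary."""
    t0 = 1.0 - rho
    center = single_layer_regularized(t0, rho)
    slope_inside = (single_layer_regularized(t0 + h, rho) - center) / h
    slope_outside = (center - single_layer_regularized(t0 - h, rho)) / h
    return slope_inside, slope_outside
```

For example, `one_sided_slopes(0.5)` returns two nearly identical numbers, reflecting the continuous differentiability stated in the text, whereas the second derivative jumps at the cap boundary.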

3 Two Approaches to Multiscale Representations

As mentioned in the introduction, we want to substitute global Fourier expansions by multiscale representations involving scaling kernels $\Phi_J(\cdot, \cdot)$ and wavelet kernels $\Psi_J(\cdot, \cdot)$ with certain spatial localization properties. In the following we motivate and describe these representations and the choice of the kernels. We present a construction scheme in frequency domain as well as spatial domain. Both approaches are based on the same principles in the sense that the wavelet kernels arise as differences of scaling kernels. However, the frequency approach is somewhat more flexible in the choice of the scaling parameters and leads to a multiresolution analysis, while the spatial approach has its advantages in the locally supported wavelets and the closed representations in terms of elementary functions. Explicit examples are provided in the applications in Sect. 4.

3.1 Wavelets as Frequency Packages

In the scalar case, a so-called scaling kernel $\Phi_J(\cdot, \cdot)$ at scale $J$ is formally expressed by

$$\Phi_J(\xi, \eta) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Phi_J^\wedge(n)\, Y_{n,k}(\xi)\, Y_{n,k}(\eta) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, \Phi_J^\wedge(n)\, P_n(\xi \cdot \eta), \qquad \xi, \eta \in \Omega, \tag{44}$$

where $\Phi_J^\wedge(n)$ are coefficients (also called symbols) reflecting the spatial and frequency behavior of the kernel. Typically the choice is such that the kernels show a better spatial localization as the scale $J$ increases. The fact that $\Phi_J^\wedge(n)$ is assumed to be independent of the order $k$ makes the kernel zonal, i.e., it only depends on the scalar product $\xi \cdot \eta$ and can be regarded as a function on the interval $[-1, 1]$. This simplifies many calculations but is not crucial for the approach. The notion “wavelets as frequency packages” in the title of this subsection simply stands for the superposition of spherical harmonics of different degrees, where the influence of every frequency, i.e., the influence of every degree $n$, is given by $\Phi_J^\wedge(n)$.

One can distinguish four main types of kernels: band-limited, non-band-limited, space-limited, and non-space-limited ones. Band-limited kernels are characterized by $\Phi_J^\wedge(n) = 0$, $n \geq N$, for some sufficiently large $N$. In other words, they are strongly localized in frequency domain. Perfect frequency localization would be given by the Legendre kernel defined via $\Phi_J^\wedge(n) = 1$, $n = m$, and $\Phi_J^\wedge(n) = 0$, $n \neq m$, for some fixed integer $m$. The non-band-limited counterparts generally show a much stronger spatial localization. No space-limited kernel (i.e., $\Phi_J(\cdot, \cdot)$ has locally compact support) can simultaneously be band-limited. Perfect spatial localization is formally achieved by the Dirac kernel, setting $\Phi_J^\wedge(n) = 1$ for all $n \in \mathbb{N}_0$ (however, this is only to be understood in a distributional sense since (44) becomes singular for this particular choice of symbols). The variation of the symbols $\Phi_J^\wedge(n)$ for different scales $J$ represents the trade-off between spatial and frequency localization (see illustration below).

[Illustration: range of kernels from ideal frequency localization (Legendre kernel), via band-limited (non-space-limited) and non-band-limited (space-limited) kernels, to ideal space localization (Dirac kernel)]

For a more precise quantitative categorization in terms of the uncertainty principle, the reader is referred to Freeden (1998) and the contribution Freeden and Schreiner (2014a) of this handbook. A band-limited example is given by the cubic polynomial (CP) kernel

$$\Phi_J^\wedge(n) = \begin{cases} \left( 1 - n 2^{-J} \right)^2 \left( 1 + n 2^{-(J-1)} \right), & n \leq 2^J - 1, \\ 0, & \text{else}, \end{cases} \tag{45}$$

a non-band-limited one by the Abel-Poisson kernel

$$\Phi_J^\wedge(n) = e^{-n 2^{-J}}, \qquad n \in \mathbb{N}_0. \tag{46}$$

Fig. 3 Illustration of $\vartheta \mapsto \Phi_J(\cos(\vartheta))$ with respect to angular distance (left) and the corresponding symbols $\Phi_J^\wedge(n)$ for degrees $n = 0, \ldots, 20$ (right) for two different kernels: CP kernel (top) and Abel-Poisson kernel (bottom)

The Abel-Poisson kernel has the advantageous property to be a non-band-limited kernel with a known closed representation. In general, such representations are not known, and the numerical evaluation of the kernel requires the truncation of the series representation at some degree, so that de facto a band-limited kernel is evaluated. Figure 3 illustrates the evolution of space and frequency localization of the two kernels above at different scales $J$. The increase in spatial localization with increasing scale $J$ reflects a “zooming-in” capability for regions of higher data density. This motivates us to start at a low scale $J_0$ to reconstruct coarse global features of the quantity under investigation via use of $\Phi_{J_0}(\cdot, \cdot)$ and then locally improve the approximation by adding contributions for higher scales $J$ involving related wavelet kernels $\Psi_J(\cdot, \cdot)$.

We now give a more precise description of the method outlined above. First, the symbols $\Phi_J^\wedge(n)$ need to satisfy the following conditions:

(i) $\lim_{J \to \infty} \left( \Phi_J^\wedge(n) \right)^2 = \lambda_n$, $n \in \mathbb{N}_0$,
(ii) $\sum_{n=0}^{\infty} \frac{2n+1}{4\pi} \left( \Phi_J^\wedge(n) \right)^2 < \infty$, $J \in \mathbb{Z}$,
(iii) $\left( \Phi_{J+1}^\wedge(n) \right)^2 \geq \left( \Phi_J^\wedge(n) \right)^2$, $J \in \mathbb{Z}$, $n \in \mathbb{N}_0$.
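The symbols (45) and (46), the series (44), and the monotonicity condition (iii) can be explored with a few lines of code. This is a minimal sketch under assumptions: the function names and truncation degree are illustrative, and the closed form used for the Abel-Poisson kernel is the classical Poisson kernel formula $\frac{1}{4\pi} \frac{1 - h^2}{(1 + h^2 - 2ht)^{3/2}}$ with $h = e^{-2^{-J}}$, which is not written out in the text.

```python
import math

def cp_symbol(J, n):
    """Cubic polynomial (CP) symbol (45); band-limited with cutoff 2^J - 1."""
    if n > 2 ** J - 1:
        return 0.0
    x = n * 2.0 ** (-J)
    return (1.0 - x) ** 2 * (1.0 + 2.0 * x)  # 1 + n 2^{-(J-1)} = 1 + 2x

def abel_poisson_symbol(J, n):
    """Abel-Poisson symbol (46)."""
    return math.exp(-n * 2.0 ** (-J))

def legendre_values(n_max, t):
    """Return [P_0(t), ..., P_{n_max}(t)] via the three-term recurrence."""
    values = [1.0, t]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]

def zonal_kernel(symbol, J, t, n_max):
    """Truncated Legendre series (44) of a zonal scaling kernel at t = xi . eta."""
    p = legendre_values(n_max, t)
    return sum(
        (2 * n + 1) / (4.0 * math.pi) * symbol(J, n) * p[n]
        for n in range(n_max + 1)
    )

def abel_poisson_closed(J, t):
    """Closed form of the Abel-Poisson kernel (assumed classical Poisson kernel formula)."""
    h = math.exp(-2.0 ** (-J))
    return (1.0 - h * h) / (4.0 * math.pi * (1.0 + h * h - 2.0 * h * t) ** 1.5)
```

For the CP kernel the series is a finite sum, so `zonal_kernel(cp_symbol, J, t, 2**J - 1)` is exact up to rounding, whereas for the Abel-Poisson kernel the truncation degree has to grow with the scale $J$ because the symbols decay more slowly at higher scales.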

The CP as well as the Abel-Poisson symbols satisfy all three conditions for $\lambda_n = 1$, $n \in \mathbb{N}_0$, leading to so-called approximate identities. This, however, is only a special case. In many applications, $\{\lambda_n\}_{n \in \mathbb{N}_0}$ denotes the set of symbols of a pseudodifferential operator $\Lambda$ relating the modeled quantity $F$ to the given data $G$ via $\Lambda G = F$, i.e.,

$$F = \Lambda G = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \lambda_n\, G^\wedge(n,k)\, Y_{n,k}. \tag{47}$$

We do not go into detail on summability conditions for the symbols $\lambda_n$ and the corresponding Sobolev spaces. For that the reader is referred to Freeden et al. (1998) and references therein. We are simply aiming at outlining the ideas of the multiscale concept. Under these circumstances, assumptions (i)–(iii) yield

$$F = \lim_{J \to \infty} \Phi_J * \Phi_J * G = \lim_{J \to \infty} \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \left( \Phi_J^\wedge(n) \right)^2 G^\wedge(n,k)\, Y_{n,k}. \tag{48}$$

For brevity, we write $\Phi_J * G$ for the convolution of a (scalar) scaling kernel $\Phi_J(\cdot, \cdot)$ with a (scalar) function $G \in L^2(\Omega)$, i.e.,

$$\Phi_J * G = \int_\Omega \Phi_J(\cdot, \eta)\, G(\eta)\, d\omega(\eta). \tag{49}$$

Condition (iii) implies that the scale spaces $V_J = \{ \Phi_J * \Phi_J * G \mid G \in L^2(\Omega) \}$, $J \in \mathbb{Z}$, are nested in the sense

$$V_J \subset V_{J+1} \subset \cdots \subset L^2(\Omega), \qquad J \in \mathbb{Z}. \tag{50}$$

This nesting leads to a so-called multiresolution analysis. It implies the nice property that the approximation of $F$ improves with every increase of the scale $J$. In order to make use of the previously mentioned “zooming-in” capability, we now define wavelet kernels of scale $J$ by

$$\Psi_J(\xi, \eta) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \Psi_J^\wedge(n)\, Y_{n,k}(\xi)\, Y_{n,k}(\eta) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, \Psi_J^\wedge(n)\, P_n(\xi \cdot \eta), \qquad \xi, \eta \in \Omega, \tag{51}$$

where the symbols $\Psi_J^\wedge(n)$ read

$$\Psi_J^\wedge(n) = \left( \left( \Phi_{J+1}^\wedge(n) \right)^2 - \left( \Phi_J^\wedge(n) \right)^2 \right)^{\frac{1}{2}}, \qquad J \in \mathbb{Z}. \tag{52}$$

From this setting it is easily seen that

$$\Phi_{J+1} * \Phi_{J+1} * G = \Phi_J * \Phi_J * G + \Psi_J * \Psi_J * G, \qquad J \in \mathbb{Z}, \tag{53}$$

i.e., the wavelet contribution of scale $J$ represents the difference between the approximations at scales $J$ and $J+1$. In view of (48), we are now able to state the following multiscale representation.

Theorem 2. Let $\{\Phi_J\}_{J \in \mathbb{Z}}$ be a set of scaling kernels satisfying (i)–(iii), $\{\Psi_J\}_{J \in \mathbb{Z}}$ the corresponding set of wavelet kernels, and $J_0 \in \mathbb{Z}$ be fixed. Furthermore, assume that $F, G \in L^2(\Omega)$ are related by $\Lambda G = F$ for some pseudodifferential operator $\Lambda$ (cf. expression (47)). Then

$$F = \Phi_{J_0} * \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * \Psi_J * G, \tag{54}$$

where equality is meant in the sense of the $L^2(\Omega)$-topology.

Since a low scale $J_0$ typically implies a concentration of the symbols $\Phi_J^\wedge(n)$ around low spherical harmonic degrees $n$, the scaling contribution $\Phi_{J_0} * \Phi_{J_0} * G$ yields low frequency, i.e., coarse global features. More and more spatially localized features are added by the wavelet contributions $\Psi_J * \Psi_J * G$. The increasing localization of the wavelet kernels $\Psi_J(\cdot, \cdot)$ also allows the use of local data sets to improve the approximation. This reflects the property previously denoted as “zooming-in.” It should be noted that the wavelet kernels mentioned in this subsection are typically not locally supported but only show an increasingly good localization, so that some error is made if only local information on $G$ is used. Yet, if the scale $J$ is sufficiently large, this error is negligible.

Essentially the same concept as described above also holds true for the approximation of vector fields. We just briefly describe the necessary adaptations: the vector scaling kernel of type $i = 1, 2, 3$ and scale $J$ is given by

$$\varphi_J^{(i)}(\xi, \eta) = \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} (\varphi_J^{(i)})^\wedge(n)\, Y_{n,k}(\xi)\, y^{(i)}_{n,k}(\eta), \qquad \xi, \eta \in \Omega, \tag{55}$$

where the symbols $(\varphi_J^{(i)})^\wedge(n)$, $i = 1, 2, 3$ fixed, satisfy conditions (i)–(iii). The use of the vector spherical harmonics $\tilde{y}^{(i)}_{n,k}$ instead of $y^{(i)}_{n,k}$ is valid as well. The associated vector wavelet kernel of type $i$ and scale $J$ reads

$$\psi_J^{(i)}(\xi, \eta) = \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} (\psi_J^{(i)})^\wedge(n)\, Y_{n,k}(\xi)\, y^{(i)}_{n,k}(\eta), \qquad \xi, \eta \in \Omega, \tag{56}$$

with the symbols

$$(\psi_J^{(i)})^\wedge(n) = \left( \left( (\varphi_{J+1}^{(i)})^\wedge(n) \right)^2 - \left( (\varphi_J^{(i)})^\wedge(n) \right)^2 \right)^{\frac{1}{2}}, \qquad J \in \mathbb{Z}. \tag{57}$$

Theorem 3. Let $\{\varphi_J^{(i)}\}_{i=1,2,3;\, J \in \mathbb{Z}}$ be a set of vector scaling kernels satisfying the conditions (i)–(iii), $\{\psi_J^{(i)}\}_{i=1,2,3;\, J \in \mathbb{Z}}$ the corresponding set of vector wavelet kernels, and $J_0 \in \mathbb{Z}$ be fixed. Furthermore, assume that $f, g \in l^2(\Omega)$ are related by $\Lambda g = f$ for some vectorial pseudodifferential operator $\Lambda$. Then

$$f = \sum_{i=1}^{3} \varphi_{J_0}^{(i)} \star \varphi_{J_0}^{(i)} * g + \sum_{i=1}^{3} \sum_{J=J_0}^{\infty} \psi_J^{(i)} \star \psi_J^{(i)} * g, \tag{58}$$

where equality is meant in the sense of the $l^2(\Omega)$-topology.

The scalar-valued convolution of a vector scaling kernel $\varphi_J^{(i)}(\cdot, \cdot)$ with a vector field $g$ appearing in (58) is defined by

$$\varphi_J^{(i)} * g = \int_\Omega \varphi_J^{(i)}(\cdot, \eta) \cdot g(\eta)\, d\omega(\eta), \tag{59}$$

while the vector-valued convolution of a vector scaling kernel $\varphi_J^{(i)}(\cdot, \cdot)$ with a scalar function $F$ reads

$$\varphi_J^{(i)} \star F = \int_\Omega \varphi_J^{(i)}(\eta, \cdot)\, F(\eta)\, d\omega(\eta). \tag{60}$$

Remark 1. The multiscale representations from Theorems 2 and 3 represent a bilinear approach. However, the concept can be formulated in a linear approach as well. The assumptions (i)–(iii) on the scaling kernel just need to be substituted by

(i’) $\lim_{J \to \infty} \Phi_J^\wedge(n) = \lambda_n$, $n \in \mathbb{N}_0$,
(ii’) $\sum_{n=0}^{\infty} \frac{2n+1}{4\pi} \left( \Phi_J^\wedge(n) \right)^2 < \infty$, $J \in \mathbb{Z}$,
(iii’) $\Phi_{J+1}^\wedge(n) \geq \Phi_J^\wedge(n) \geq 0$, $J \in \mathbb{Z}$, $n \in \mathbb{N}_0$.

Then the (scalar) multiscale representation (54) turns into

$$F = \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * G, \tag{61}$$

where the symbols of the wavelet kernels are defined by

$$(\Psi_J)^\wedge(n) = (\Phi_{J+1})^\wedge(n) - (\Phi_J)^\wedge(n), \qquad J \in \mathbb{Z}. \tag{62}$$

In the vectorial case the changes are slightly bigger since we now need tensorial scaling and wavelet kernels. More precisely,

$$\boldsymbol{\Phi}_J^{(i)}(\xi, \eta) = \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} (\boldsymbol{\Phi}_J^{(i)})^\wedge(n)\, y^{(i)}_{n,k}(\xi) \otimes y^{(i)}_{n,k}(\eta), \qquad \xi, \eta \in \Omega. \tag{63}$$

The tensorial wavelet kernels $\boldsymbol{\Psi}_J^{(i)}(\cdot, \cdot)$ are defined analogously. The vector-valued convolution of a tensor scaling kernel of type $i = 1, 2, 3$ with a vector field $g$ is given via

$$\boldsymbol{\Phi}_J^{(i)} \star g = \int_\Omega \boldsymbol{\Phi}_J^{(i)}(\cdot, \eta)\, g(\eta)\, d\omega(\eta). \tag{64}$$

As a consequence, (58) turns into

$$f = \sum_{i=1}^{3} \boldsymbol{\Phi}_{J_0}^{(i)} \star g + \sum_{i=1}^{3} \sum_{J=J_0}^{\infty} \boldsymbol{\Psi}_J^{(i)} \star g. \tag{65}$$

The linear approach is mentioned since it forms the direct connection between the wavelets constructed as frequency packages in this subsection and the locally supported wavelets in the next subsection.

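3.2 Locally Supported Wavelets

We start from the assumption that we already know a set of (scalar) scaling kernels $\Phi_J : \Omega\times\Omega \to \mathbb{R}$ or a set of tensorial scaling kernels $\boldsymbol{\Phi}_J : \Omega\times\Omega \to \mathbb{R}^{3\times 3}$ satisfying

(i) $\Phi_J(\xi,\eta) = \Phi_{J+1}(\xi,\eta)$ and $\boldsymbol{\Phi}_J(\xi,\eta) = \boldsymbol{\Phi}_{J+1}(\xi,\eta)$ for $1-\xi\cdot\eta \ge \rho_J \ge \rho_{J+1}$,
(ii) $\lim_{J\to\infty}\sup_{\xi\in\Omega} |\Phi_J(\xi,\cdot) * G - F(\xi)| = 0$ and $\lim_{J\to\infty}\sup_{\xi\in\Omega} |\boldsymbol{\Phi}_J(\xi,\cdot) \star g - f(\xi)| = 0$,

where $F$, $f$ denote the modeled scalar and vectorial quantities, respectively, and $G$, $g$ the corresponding input data. For the parameter $\rho_J \in (0,2)$ we assume that it tends to zero as $J\to\infty$. We suppose that all appearing functions $F$, $f$, $G$, $g$ are at least continuous on the sphere (or of higher-order differentiability, depending on the problem). The main difficulty is to find problem-specific scaling kernels that actually satisfy (i) and (ii). The regularized Green function $G^{\rho_J}(\Delta^*;\cdot)$ and the regularized single layer kernel $S^{\rho_J}$ clearly satisfy condition (i) and form an important tool for their construction (more details can be found in the examples of Sect. 4). Property (i) implies that $\Phi_{J+1}(\xi,\eta) - \Phi_J(\xi,\eta)$ and $\boldsymbol{\Phi}_{J+1}(\xi,\eta) - \boldsymbol{\Phi}_J(\xi,\eta)$ vanish for $1-\xi\cdot\eta \ge \rho_J$, so that the wavelet kernels

\[ \Psi_J(\xi,\eta) = \Phi_{J+1}(\xi,\eta) - \Phi_J(\xi,\eta), \quad \eta\in\Omega, \tag{66} \]

\[ \boldsymbol{\Psi}_J(\xi,\eta) = \boldsymbol{\Phi}_{J+1}(\xi,\eta) - \boldsymbol{\Phi}_J(\xi,\eta), \quad \eta\in\Omega, \tag{67} \]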


are locally supported on the spherical cap $\Gamma_{\rho_J}(\xi) = \{\eta\in\Omega \mid 1-\xi\cdot\eta < \rho_J\}$ with radius $\rho_J$ and center $\xi$. The multiscale representation of the next theorem follows directly from (i) and (ii).

Theorem 4. Let $\{\Phi_J\}_{J\in\mathbb{Z}}$ and $\{\boldsymbol{\Phi}_J\}_{J\in\mathbb{Z}}$ be scaling kernels satisfying properties (i), (ii), $\{\Psi_J\}_{J\in\mathbb{Z}}$ and $\{\boldsymbol{\Psi}_J\}_{J\in\mathbb{Z}}$ the corresponding wavelet kernels, and let $J_0\in\mathbb{Z}$ be fixed. Then we have

\[ F = \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * G, \tag{68} \]

\[ f = \boldsymbol{\Phi}_{J_0} \star g + \sum_{J=J_0}^{\infty} \boldsymbol{\Psi}_J \star g, \tag{69} \]

where equality is meant in the uniform sense of the $C^{(0)}(\Omega)$- and $c^{(0)}(\Omega)$-topology, respectively.

Equations (66) and (67) show that the symbols $\Psi_J^\wedge(n)$ of the wavelet kernel are obtained by just taking the difference of the symbols $\Phi_{J+1}^\wedge(n)$ and $\Phi_J^\wedge(n)$ of the scaling kernels. Thus, we end up exactly with the linear approach mentioned in Remark 1, and we have the same desirable "zooming-in" capability of the multiscale representation. Since the wavelet kernels in Theorem 4 have locally compact support, the disregard of data outside the spherical cap $\Gamma_{\rho_J}(\xi)$ actually does not lead to any deterioration at all. However, different from the construction of the scaling kernels in frequency domain, the construction of kernels with closed representations satisfying (i) and (ii) is sometimes more tedious and problem specific (especially condition (i) would be difficult to realize just by choice of the symbols $\Phi_J^\wedge(n)$). Regularizations of Green's function for the Beltrami operator and the single layer kernel are a great help but limit the approach to the operators $\Delta^*$, $\nabla^*$, $L^*$, and $D^{-1}$. For other types of equations, one has to hope for similar closed representations of the fundamental solutions and their regularizations. In our examples, the function systems presented in Sects. 2.2 and 2.3 are sufficient and supply us with kernels that allow an easy numerical evaluation.

Concerning further relations to the frequency-oriented approach, it has to be mentioned that scaling kernels satisfying the conditions (i) and (ii) from Sect. 3.2 typically do not satisfy condition (iii') from Sect. 3.1. A calculation of the Fourier coefficients for the regularized Green function can be found, e.g., in Freeden and Schreiner (2009), and shows that they frequently turn negative. In other words, the scale spaces are not nested in the sense of (50), and the approximation of $F$ does not have to improve for every single increase of scale $J$, although in our numerical examples it typically does.
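The local support of the wavelet kernels can be illustrated numerically. The sketch below uses the single layer kernel $S(t) = (2-2t)^{-1/2}$ and, purely as an assumption for illustration, regularizes it by a first-order Taylor continuation inside the cap $1 - t < \rho$; this is a hypothetical stand-in and not necessarily the regularization defined earlier in the chapter. With $\rho_J = 2^{-J}$, the difference of two consecutive regularizations vanishes wherever $1 - t \ge \rho_J$.

```python
import numpy as np

def rho(J):
    """Scale-dependent cap parameter rho_J = 2^(-J)."""
    return 2.0 ** (-J)

def single_layer_reg(t, r):
    """S(t) = (2 - 2t)^(-1/2) outside the cap, linear C^1 continuation for 1 - t < r.

    Hypothetical regularization used only for this illustration.
    """
    s = 1.0 - np.asarray(t, dtype=float)
    s_safe = np.maximum(s, r)                      # avoid the singularity at s = 0
    outside = (2.0 * s_safe) ** -0.5
    inside = (2.0 * r) ** -0.5 - (2.0 * r) ** -1.5 * (s - r)
    return np.where(s >= r, outside, inside)

def wavelet(t, J):
    """Wavelet kernel as the difference of consecutive scaling kernels, cf. (66)."""
    return single_layer_reg(t, rho(J + 1)) - single_layer_reg(t, rho(J))

t = np.linspace(-1.0, 1.0, 2001)                   # t = xi . eta
J = 3
psi = wavelet(t, J)
outside_cap = (1.0 - t) >= rho(J)                  # complement of the cap of radius rho_J
```

Outside the cap both regularized kernels coincide with the unregularized one, so `psi` is identically zero there; only data inside the cap contribute to the wavelet convolution.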
Remark 2. There are cases where a scalar quantity F is modeled from vectorial input data g (see, e.g., field-aligned currents in Sect. 4.2) or vice versa. The corresponding multiscale concepts are completely analogous to those presented up to now and have only been left out for reasons of clarity and comprehensibility.

4 Application to Geomagnetic Problems

Some geophysical background has already been presented in the introduction. In this section we indicate how multiscale techniques can be applied to crustal field modeling and the separation of the geomagnetic field with respect to its sources, to the reconstruction of field-aligned current systems, and to a power spectrum that reflects spatially localized properties. In all cases, the approach via frequency packages as well as the approach via locally supported wavelets is described.

4.1 Crustal Field Modeling and Separation of Sources

A first attempt to separate internal and external contributions of the Earth's magnetic field has been undertaken in Gauss (1839), using that $b = \nabla U$ in source-free regions. Under these circumstances, the part of the representation (4) that involves inner harmonics is due to sources in the exterior of the Earth, while the part involving outer harmonics represents the magnetic field that originates inside the Earth. If the measurements are conducted in source regions (i.e., the current density $j$ in (1) does not vanish), the Gauss representation breaks down and we have to use the Mie representation. This is the case with satellite data collected on a spherical orbit $\Omega_R$ in the ionosphere. Yet, the poloidal part $p_b$ of the magnetic field $b$ can be split up into an internal and an external contribution satisfying

\[ \nabla_x \wedge p_b^{\mathrm{int}}(x) = \begin{cases} q_j(x), & x\in\Omega_R^{\mathrm{int}}, \\ 0, & x\in\Omega_R^{\mathrm{ext}}, \end{cases} \tag{70} \]

\[ \nabla_x \cdot p_b^{\mathrm{int}}(x) = 0, \quad x\in\mathbb{R}^3, \tag{71} \]

and

\[ \nabla_x \wedge p_b^{\mathrm{ext}}(x) = \begin{cases} 0, & x\in\Omega_R^{\mathrm{int}}, \\ q_j(x), & x\in\Omega_R^{\mathrm{ext}}, \end{cases} \tag{72} \]

\[ \nabla_x \cdot p_b^{\mathrm{ext}}(x) = 0, \quad x\in\mathbb{R}^3, \tag{73} \]

respectively, where $q_j$ denotes the toroidal part of the current density $j$. The remaining toroidal part $q_b$ of the magnetic field can be interpreted as the contribution due to poloidal currents crossing the sphere $\Omega_R$, with $R = R_S$ being the satellite altitude. In Backus et al. (1996) it is shown that the above split-up and its interpretation in terms of the location of the sources is in accordance with the law of Biot-Savart. Summarizing, the desired separation of the magnetic field is given by

\[ b = p_b^{\mathrm{int}} + p_b^{\mathrm{ext}} + q_b. \tag{74} \]

Equations (70) and (71) imply that $p_b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ in $\Omega_R^{\mathrm{ext}}$, where the harmonic potential $U^{\mathrm{int}}$ can be expressed by outer harmonics, as we already indicated for the Gauss representation (4). Analogously, $p_b^{\mathrm{ext}} = \nabla U^{\mathrm{ext}}$ in $\Omega_R^{\mathrm{int}}$, and $U^{\mathrm{ext}}$ is entirely represented by inner harmonics. This approach has also been discussed in Backus et al. (1996) and Olsen et al. (2010). The latter additionally provides an overview and further references on the extraction of external sources by physically motivated models and parametrizations (which, however, require a priori knowledge of the space-time structure of the magnetic field). We continue by applying the previously introduced multiscale representations to a separation based on (74). The use of wavelets as frequency packages for this problem is described in more detail in Mayer (2003, 2006) and Mayer and Maier (2006), while locally supported wavelets are presented in Gerhards (2011a, 2012).


If interested in crustal field modeling, only the internal contribution $p_b^{\mathrm{int}}$ is relevant at satellite altitude. So the goal is to extract this quantity from an adequate set of magnetic field data. By adequate we mean that the data have been collected during magnetically quiet times and that a model of the core field and magnetospheric field has already been subtracted. Then the split-up (74) can correct the preprocessed data for possibly remaining undesired external contributions, as can be seen in the examples from Mayer and Maier (2006) and Gerhards (2012), where a CHAMP data set from 2001 has been used. The careful data selection becomes necessary because of the poor space-time coverage of satellite data, so that the time variability of the external sources is not entirely accounted for. More information on the data processing can be found, e.g., in Langel and Hinze (1998) and Maus et al. (2003).

4.1.1 Wavelets as Frequency Packages

We have already concluded that the potential $U^{\mathrm{int}}$ contains only outer harmonic contributions. In consequence, the relation (26) states that $p_b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ can be expanded in terms of the vector spherical harmonics $\tilde{y}_{n,k}^{(1)}$. Analogously, (25) indicates the relation of $p_b^{\mathrm{ext}}$ to $\tilde{y}_{n,k}^{(2)}$. More precisely,

\[ p_b^{\mathrm{int}}(x) = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \frac{1}{R}\,(\tilde{b}_R^{(1)})^\wedge(n,k)\, \tilde{y}_{n,k}^{(1)}(\xi), \quad x\in\Omega_R, \tag{75} \]

\[ p_b^{\mathrm{ext}}(x) = \sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{1}{R}\,(\tilde{b}_R^{(2)})^\wedge(n,k)\, \tilde{y}_{n,k}^{(2)}(\xi), \quad x\in\Omega_R, \tag{76} \]

\[ q_b(x) = \sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{1}{R}\,(\tilde{b}_R^{(3)})^\wedge(n,k)\, \tilde{y}_{n,k}^{(3)}(\xi), \quad x\in\Omega_R, \tag{77} \]

where $R = |x|$ and $\xi = \frac{x}{|x|}$. By $(\tilde{b}_R^{(i)})^\wedge(n,k)$ we mean the Fourier coefficient

\[ (\tilde{b}_R^{(i)})^\wedge(n,k) = \frac{1}{R}\int_{\Omega_R} b(y)\cdot \tilde{y}_{n,k}^{(i)}\!\left(\frac{y}{|y|}\right) d\omega(y). \tag{78} \]

In other words, the vector spherical harmonics $\{\tilde{y}_{n,k}^{(i)}\}_{i=1,2,3}$ form a basis system that decomposes the magnetic field with respect to its sources. A multiscale reconstruction can then be achieved as described in Sect. 3.1 by setting the scaling kernel to

\[ \varphi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} (\varphi_J^{(i)})^\wedge(n,k)\, Y_{n,k}(\xi)\, \tilde{y}_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega. \tag{79} \]

The symbols $(\varphi_J^{(i)})^\wedge(n,k)$ could be those corresponding to the CP kernel (observe that for the current example $\lambda_n = 1$ in condition (i) of Sect. 3.1). Defining the wavelet kernels $\psi_J^{(i)}$ accordingly and choosing a sufficiently large scale $J_{\max}$, we end up with the approximation


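\[ p_b^{\mathrm{int}} \approx \varphi_{J_0}^{(1)} \star \varphi_{J_0}^{(1)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(1)} \star \psi_J^{(1)} * b, \tag{80} \]

\[ p_b^{\mathrm{ext}} \approx \varphi_{J_0}^{(2)} \star \varphi_{J_0}^{(2)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(2)} \star \psi_J^{(2)} * b \tag{81} \]

for the poloidal parts on $\Omega_R$, and

\[ q_b \approx \varphi_{J_0}^{(3)} \star \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(3)} \star \psi_J^{(3)} * b \tag{82} \]

for the toroidal part on $\Omega_R$. The scale $J_{\max}$ typically depends on the data density and the quality of the input data in order to guarantee a sufficiently good numerical evaluation of the occurring integrals.

4.1.2 Locally Supported Wavelets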

The connection of the operators $\tilde{o}^{(i)}$ to the vector spherical harmonics $\tilde{y}_{n,k}^{(i)}$ implies that a representation

\[ b = \tilde{o}^{(1)}\tilde{B}_1 + \tilde{o}^{(2)}\tilde{B}_2 + \tilde{o}^{(3)}\tilde{B}_3 \tag{83} \]

of the magnetic field $b$ decomposes it with respect to the sources. More precisely, $\tilde{o}^{(1)}\tilde{B}_1$ denotes the part due to sources inside the sphere $\Omega_R$, $\tilde{o}^{(2)}\tilde{B}_2$ the part due to exterior sources, and $\tilde{o}^{(3)}\tilde{B}_3$ the part due to sources on the sphere or crossing the sphere. One can derive that the scalars $\tilde{B}_1$, $\tilde{B}_2$, and $\tilde{B}_3$ are uniquely determined by the condition of vanishing integral mean values

\[ \frac{1}{4\pi}\int_{\Omega_R} \left(\tilde{B}_1(y) - \tilde{B}_2(y)\right) d\omega(y) = \frac{1}{4\pi}\int_{\Omega_R} \tilde{B}_3(y)\, d\omega(y) = 0 \tag{84} \]

and that they are expressible by

\[ \tilde{B}_1 = \frac{1}{2} D^{-1} B_1 + \frac{1}{4} D^{-1} B_2 - \frac{1}{2} B_2, \tag{85} \]

\[ \tilde{B}_2 = \frac{1}{2} D^{-1} B_1 + \frac{1}{4} D^{-1} B_2 + \frac{1}{2} B_2, \tag{86} \]

\[ \tilde{B}_3 = B_3, \tag{87} \]

where $B_1$, $B_2$, and $B_3$ correspond to the spherical Helmholtz decomposition $b = o^{(1)}B_1 + o^{(2)}B_2 + o^{(3)}B_3$ (more information on such representations can be found, e.g., in Gerhards (2011b, 2012)). Theorem 1 and its consequences for solutions of the Beltrami equation imply a representation of the Helmholtz scalars $B_1$, $B_2$, and $B_3$ in terms of Green's function with respect to the Beltrami operator, so that, after some lengthy but elementary calculations, (83) and (85)–(87) lead to the scaling kernels


[Figure 4: six global maps with logarithmic color scales; left column $|\boldsymbol{\Phi}_J^{(1)}(\varepsilon^2,\cdot)|$, right column $|\boldsymbol{\Psi}_J^{(1)}(\varepsilon^2,\cdot)|$, rows $J = 1, 4, 8$]

Fig. 4 Absolute values of the scaling kernels $\boldsymbol{\Phi}_J^{(1)}(\varepsilon^2,\cdot)$ (left), centered at $\varepsilon^2 = (0,1,0)^T$, and the corresponding wavelet kernels $\boldsymbol{\Psi}_J^{(1)}(\varepsilon^2,\cdot)$ (right), for scales $J = 1, 4, 8$



 1  J  1 J 1  G . I   / C S .  /   ˝ r S J .  / D ˝  (88) 2 8 4 1 1  J 1 r S .; / ˝   r ˝ r G J . I   /; C r ˝ r D1 S J .  /  4 4 2   1  J  1 J 1 .2/ ˆ J .; / D ˝  (89)  G . I   /  S .  / C  ˝ r S J .  / 2 8 4 1 1  J 1  r ˝ r D1 S J .  / C r S .; / ˝   r ˝ r G J . I   /; 4 4 2 .1/ ˆ J .; /

ˆ J .; / D  L ˝ L G J . I   /; .3/

;  2 :

(90)
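The relations (85)–(87) between the two sets of scalars can be checked degree by degree. The sketch below assumes that $\tilde{o}^{(1)} = o^{(1)}(D + \tfrac12) - o^{(2)}$, $\tilde{o}^{(2)} = o^{(1)}(D - \tfrac12) + o^{(2)}$, $\tilde{o}^{(3)} = o^{(3)}$, and that $D$ acts on spherical harmonics of degree $n$ as multiplication by $n + \tfrac12$; these definitions are not restated in this section and are taken here as assumptions. Under them, comparing coefficients in (83) and the Helmholtz decomposition gives $B_1 = (D+\tfrac12)\tilde{B}_1 + (D-\tfrac12)\tilde{B}_2$ and $B_2 = \tilde{B}_2 - \tilde{B}_1$, which (85)–(87) should invert.

```python
import numpy as np

def helmholtz_from_tilde(Bt1, Bt2, Bt3, d):
    """Helmholtz scalars from the source-separating scalars (assumed operator definitions)."""
    B1 = (d + 0.5) * Bt1 + (d - 0.5) * Bt2
    B2 = Bt2 - Bt1
    return B1, B2, Bt3

def tilde_from_helmholtz(B1, B2, B3, d):
    """Relations (85)-(87), with D^{-1} realized as division by the multiplier d."""
    Bt1 = 0.5 * B1 / d + 0.25 * B2 / d - 0.5 * B2
    Bt2 = 0.5 * B1 / d + 0.25 * B2 / d + 0.5 * B2
    return Bt1, Bt2, B3

rng = np.random.default_rng(0)
n = np.arange(1, 11)
d = n + 0.5                                  # assumed symbol of D at degree n
Bt = rng.standard_normal((3, n.size))        # arbitrary test coefficients, one per degree
B = helmholtz_from_tilde(*Bt, d)
Bt_back = tilde_from_helmholtz(*B, d)
```

The round trip returns the original coefficients for any nonzero multiplier `d`, so the check does not depend on the precise symbol of $D$, only on the assumed form of the operators $\tilde{o}^{(i)}$.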

The wavelets $\boldsymbol{\Psi}_J^{(i)}(\cdot,\cdot)$ are defined by differences of the scaling kernels as indicated in (67). An illustration is given in Fig. 4 (the scaling parameter is chosen to be $\rho_J = 2^{-J}$ for all examples in this chapter). It can clearly be seen that the wavelet kernels have locally compact support on a spherical cap of scale-dependent radius. Although not locally supported, the scaling kernels reveal a good spatial localization as well (note the logarithmic scaling of the plots). Eventually, the limit relations from Sects. 2.2 and 2.3 and some slight modifications tell us that a sufficiently large $J_{\max}$ can be chosen so that

\[ p_b^{\mathrm{int}} \approx \boldsymbol{\Phi}_{J_0}^{(1)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(1)} \star b, \tag{91} \]

\[ p_b^{\mathrm{ext}} \approx \boldsymbol{\Phi}_{J_0}^{(2)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(2)} \star b, \tag{92} \]

\[ q_b \approx \boldsymbol{\Phi}_{J_0}^{(3)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(3)} \star b. \tag{93} \]

In Fig. 5, the approximation of the radial part of the Earth's crustal field over Africa by the multiscale representation (91) of $p_b^{\mathrm{int}}$ is indicated. The data set used contains CHAMP satellite measurements collected between January 2009 and May 2010 and has been kindly supplied and preprocessed (core field and magnetospheric contributions have been subtracted using the CHAOS-4 model, cf. Olsen et al. 2014) by Nils Olsen, DTU Space. A first trend approximation at scale $J = 2$ is achieved using coarse global data, while for increasing scales and increasingly localized support of the wavelets, a higher data density has been used. At scale $J = 10$, the numerical integration has been conducted on a $360\times360$ equiangular data grid. This is also the maximal scale $J_{\max}$ resolvable for the given data situation since for higher scales the number of data points in the support of the wavelet kernels would be too small to guarantee a reliable numerical evaluation. Furthermore, the series of pictures in Fig. 5 indicates that crustal field anomalies can be outlined more precisely by the different wavelet scales than by just using the final approximation at scale $J = 10$.
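The link between data density and the maximal resolvable scale can be estimated with a short calculation. The sketch below assumes the cap parameter $\rho_J = 2^{-J}$ used in this chapter, a unit-sphere cap $\{\eta : 1 - \xi\cdot\eta < \rho\}$ of area $2\pi\rho$, and an equiangular grid with 360 latitude and 360 longitude nodes whose cell area is approximated by $\Delta\mathrm{lat}\,\Delta\mathrm{lon}\cos(\mathrm{lat})$; the grid layout and the function names are illustrative assumptions.

```python
import math

def cap_area(rho):
    """Area of the unit-sphere cap {eta : 1 - xi.eta < rho}, which equals 2*pi*rho."""
    return 2.0 * math.pi * rho

def expected_points_in_cap(J, n_lat=360, n_lon=360, lat_deg=0.0):
    """Rough number of grid nodes inside the support of a wavelet at scale J."""
    rho = 2.0 ** (-J)
    d_lat = math.pi / n_lat
    d_lon = 2.0 * math.pi / n_lon
    cell_area = d_lat * d_lon * math.cos(math.radians(lat_deg))
    return cap_area(rho) / cell_area

# Estimated node counts near the equator for scales 2..12
counts = {J: expected_points_in_cap(J) for J in range(2, 13)}
```

Under these assumptions roughly 40 nodes fall into the cap at $J = 10$ near the equator, and the count halves with every further scale, which is in line with the remark that higher scales cannot be evaluated reliably on this grid.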

4.2 Reconstruction of Radial Current Systems

Field-aligned current systems in the polar regions were first treated in Birkeland (1908), while a system of field-aligned currents at low latitudes due to the equatorial electrojet was proposed in Untied (1967). As the name suggests, they are directed along the main magnetic field lines. In polar regions, this implies that field-aligned currents are nearly radial with respect to the spherical Earth. Radial current systems have the advantage that they can be calculated from the knowledge of the toroidal magnetic field on a single sphere $\Omega_R$. A spherical harmonic approach applying the Mie representation is given in Olsen (1997), where also more references on the geophysical background can be found. Here we give representations in terms of the two multiscale approaches presented in Sect. 3. First, we deal with the approach via frequency packages, which can be found in more detail in Bayer et al. (2001) and Maier (2005), and then we turn to the construction of locally supported wavelets, as treated in Freeden and Gerhards (2010). The general point of departure is the observation that the combination of (7) and (8) with the spherical Helmholtz and the Mie representation yields

\[ j(x) = \xi\,\frac{\Delta^*_\xi Q_b(r\xi)}{r} - \nabla^*_\xi\,\frac{\frac{\partial}{\partial r}\left(r Q_b(r\xi)\right)}{r} - L^*_\xi\,\Delta_x P_b(r\xi), \quad x\in\mathbb{R}^3, \tag{94} \]

where $r = |x|$ and $\xi = \frac{x}{|x|}$. In particular, the radial current density $J_{\mathrm{rad}}$ (i.e., $J_{\mathrm{rad}}(x) = \xi\cdot j(x)$) is connected to the toroidal magnetic field scalar $Q_b$ by


[Figure 5: six maps over Africa with color scales in nT; panels $\boldsymbol{\Phi}_2^{(1)} * b$, $\boldsymbol{\Psi}_2^{(1)} * b$, $\boldsymbol{\Psi}_3^{(1)} * b$, $\boldsymbol{\Psi}_5^{(1)} * b$, $\boldsymbol{\Psi}_8^{(1)} * b$, $\boldsymbol{\Phi}_{10}^{(1)} * b$]

Fig. 5 Radial component of the internal contributions over Africa at scale 2 (top left) and 10 (bottom right) with wavelet contributions at intermediate scales from 2 to 8. The white areas correspond to the calculation regions for all of Africa, while the circles indicate the calculation region only for the marked location

\[ J_{\mathrm{rad}}(x) = \frac{\Delta^*_\xi Q_b(r\xi)}{r}, \tag{95} \]

while $Q_b$ can be obtained from the measured magnetic field by $L^* Q_b = q_b$, or equivalently,

\[ \Delta^*_\xi Q_b(r\xi) = L^*_\xi \cdot b(r\xi). \tag{96} \]

Since (95) and (96) solely involve spherical differential operators, it is sufficient for the reconstruction of $J_{\mathrm{rad}}$ on a sphere $\Omega_R$ to know the magnetic field $b$ only on $\Omega_R$. For the tangential parts of the current density $j$, relation (94) shows that additionally the radial derivative of $b$ needs to be known for the reconstruction of $j$. This quantity is not supplied by today's satellite missions. The new Swarm mission with its constellation of two satellites flying side by side and another satellite at a higher altitude allows a somewhat rough estimation of the East-West gradient of the magnetic field (hopefully improving crustal magnetic field models (cf. Maus et al. 2006) and allowing a direct evaluation of the coarse features of field-aligned currents (cf. Ritter and Lühr 2006)) but still does not supply the radial derivative. Gradiometry methods as they are frequently used for gravitational problems (see, e.g., Rummel et al. 1993; Rummel 2010; Freeden and Schreiner 2014b) would be a step forward but are difficult to realize due to the more complex structure of the magnetic field. A mathematical formulation for magnetic field modeling in terms of spherical harmonics can be found, e.g., in Kotsiaros and Olsen (2012).

4.2.1 Wavelets as Frequency Packages

The toroidal part of the magnetic field is given by $q_b = L^* Q_b$ so that it can be expanded in terms of vector spherical harmonics $y_{n,k}^{(3)}$. In other words, defining

\[ \varphi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} (\varphi_J^{(i)})^\wedge(n)\, Y_{n,k}(\xi)\, y_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega, \tag{97} \]

with $(\varphi_J^{(i)})^\wedge(n)$, e.g., the symbols of the CP kernel, we get

\[ q_b = \varphi_{J_0}^{(3)} \star \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{\infty} \psi_J^{(3)} \star \psi_J^{(3)} * b \tag{98} \]

on $\Omega_R$. For a representation of the toroidal scalar $Q_b$, we use the kernel

\[ \Phi_J^{(3)}(\xi,\eta) = \sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{1}{\sqrt{n(n+1)}}\, (\varphi_J^{(3)})^\wedge(n)\, Y_{n,k}(\xi)\, Y_{n,k}(\eta), \quad \xi,\eta\in\Omega, \tag{99} \]

observing that it satisfies $L^*_\eta \Phi_J^{(3)}(\xi,\eta) = \varphi_J^{(3)}(\xi,\eta)$. This implies

\[ Q_b = \Phi_{J_0}^{(3)} * \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{\infty} \Psi_J^{(3)} * \psi_J^{(3)} * b \tag{100} \]

on $\Omega_R$, where $R = R_S$ is supposed to be the satellite altitude. Equation (95) states that the kernel

\[ \Phi_J^{(3),\Delta^*}(\xi,\eta) = \frac{1}{R}\,\Delta^*_\xi \Phi_J^{(3)}(\xi,\eta) = -\sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{\sqrt{n(n+1)}}{R}\, (\varphi_J^{(3)})^\wedge(n)\, Y_{n,k}(\xi)\, Y_{n,k}(\eta) \tag{101} \]

leads to the desired multiscale representation of the radial currents. For a sufficiently large scale $J_{\max}$, we finally end up with

\[ J_{\mathrm{rad}} \approx \Phi_{J_0}^{(3),\Delta^*} * \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{J_{\max}} \Psi_J^{(3),\Delta^*} * \psi_J^{(3)} * b. \tag{102} \]
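In the frequency domain, the step from the toroidal scalar to the radial current in (95) is a degree-wise multiplication, because the Beltrami operator satisfies $\Delta^* Y_{n,k} = -n(n+1) Y_{n,k}$. The following minimal sketch applies this to a set of spherical harmonic coefficients of $Q_b$ on a sphere of radius $R$; the coefficient layout (one array of length $2n+1$ per degree) and the function name are illustrative assumptions, and the radius value is arbitrary.

```python
import numpy as np

def radial_current_coeffs(q_coeffs, R):
    """Map coefficients of Q_b (entry n holds degree n) to those of J_rad = (Beltrami Q_b) / R."""
    out = []
    for n, c in enumerate(q_coeffs):
        # Beltrami eigenvalue -n(n+1), divided by the radius R
        out.append(-n * (n + 1) / R * np.asarray(c, dtype=float))
    return out

R = 6800.0                                           # arbitrary radius for illustration
q = [np.ones(2 * n + 1) for n in range(4)]           # synthetic coefficients, degrees 0..3
j_rad = radial_current_coeffs(q, R)
```

The degree-zero part is annihilated, and higher degrees are amplified by $n(n+1)/R$, which is why noise at small spatial scales becomes more prominent in the reconstructed currents than in the magnetic field itself.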

Furthermore, we see that the symbol $\lambda_n$ of the underlying pseudodifferential operator is given by $-\sqrt{n(n+1)}/R$.

4.2.2 Locally Supported Wavelets

The spatially oriented method starts from the same situation as above, i.e., we need a representation of the toroidal scalar $Q_b$. Equation (96) and the properties of Green's function for the Beltrami operator imply

\[ Q_b(x) = \int_\Omega L^*_\xi G(\Delta^*;\xi\cdot\eta)\cdot b(R\eta)\, d\omega(\eta), \quad x\in\Omega_R, \tag{103} \]

where $R = |x|$ and $\xi = \frac{x}{|x|}$. From (95) we are led to

\[ J_{\mathrm{rad}}(x) = \frac{1}{R}\,\Delta^*_\xi \int_\Omega L^*_\xi G(\Delta^*;\xi\cdot\eta)\cdot b(R\eta)\, d\omega(\eta), \quad x\in\Omega_R. \tag{104} \]

Substituting Green's function by its regularized version, and defining the scaling kernel

\[ \varphi_J(\xi,\eta) = \frac{1}{R}\,\Delta^*_\xi L^*_\xi G^{\rho_J}(\Delta^*;\xi\cdot\eta), \quad \xi,\eta\in\Omega, \tag{105} \]

limit relations like those in Sect. 3.2 yield

\[ J_{\mathrm{rad}} \approx \varphi_{J_0} * b + \sum_{J=J_0}^{J_{\max}} \psi_J * b, \tag{106} \]

for a sufficiently large $J_{\max}$. The choice of (105) indicates that the regularized Green function has to be three times continuously differentiable in this example (while the one described in Definition 3 is only twice continuously differentiable). An illustration of the scaling kernels is given in Fig. 6. It is interesting to see that they are locally supported, as opposed to the general construction where only the wavelet kernels have locally compact support. On the other hand, this fact is not so surprising, considering that the current density $j$ is obtained from the magnetic field $b$ by differentiation, a local operation. As a consequence, the multiscale procedure of starting at a low scale $J_0$ and then successively adding further wavelet contributions to deal with unevenly distributed data sets is not necessary. Yet, the fine resolution of the wavelet contributions reveals features that cannot be seen in the final reconstruction for some large scale $J$. This is illustrated in Fig. 7: the wavelet contributions for $J = 4, 5$ show disturbances along the satellite tracks that are less prominent in the final reconstruction at scale $J = 6$. Besides the data density, this is an indicator at what scale to truncate the approximation. The disturbances along the satellite tracks are not geophysical effects but simply measurement/preprocessing artefacts. The data used for Fig. 7 have been collected by MAGSAT during 1 month centered around March 21, 1980. They have been preprocessed by Nils Olsen, DTU Space, using the geomagnetic reference field GSFC(12/83) (cf. Langel and Estes 1985) in order to obtain the part of the magnetic field that is due to ionospheric current systems. An application of the approach via locally supported wavelets to recent CHAMP data together with a more sophisticated data selection (similar to Papitashvili et al. 2002) can be found in Gerhards (2011a).

[Figure 6: three global maps of the scaling kernels $\varphi_2(\varepsilon^2,\cdot)$, $\varphi_4(\varepsilon^2,\cdot)$, $\varphi_6(\varepsilon^2,\cdot)$]

Fig. 6 Scaling kernels $\varphi_J(\varepsilon^2,\cdot)$, for scales $J = 2, 4, 6$ (colors indicate the absolute value, arrows the direction)

[Figure 7: six global maps with color scales in nA/m$^2$; panels $\varphi_2 * b$, $\psi_2 * b$, $\psi_3 * b$, $\psi_4 * b$, $\psi_5 * b$, $\varphi_6 * b$]

Fig. 7 Radial current density at scale 2 (top left) and 6 (bottom right) with wavelet contributions at intermediate scales from 2 to 5

4.3 Multiscale Power Spectrum

Let us assume we are only interested in the internal part of the geomagnetic field (i.e., core field and crustal field). Then we know from the Gauss representation that

\[ b = \nabla U = \sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} U_R^\wedge(n,k)\, \nabla H_{n,k}^{\mathrm{ext}}(R;\cdot) \tag{107} \]

in $\Omega_R^{\mathrm{ext}}$, with $R = R_E$ being the mean Earth radius. The contribution of the spherical harmonic degree $n$ to the mean square value $\frac{1}{4\pi r^2}\|b\|^2_{l^2(\Omega_r)}$ is represented by the degree variance

\[ \mathrm{Var}_n(r) = \frac{1}{4\pi r^2}\left\| \sum_{k=1}^{2n+1} U_R^\wedge(n,k)\, \nabla H_{n,k}^{\mathrm{ext}}(R;\cdot) \right\|^2_{l^2(\Omega_r)} = (n+1)\,\frac{2n+1}{4\pi R^4}\left(\frac{R}{r}\right)^{2n+4} \sum_{k=1}^{2n+1} \left(U_R^\wedge(n,k)\right)^2, \tag{108} \]

which was studied as early as Mauersberger (1956) and Lowes (1974). The corresponding spectrum is often referred to as Mauersberger-Lowes spectrum or power spectrum, although it has been argued, e.g., in Maus (2008) that the term power spectrum is more suitable for a slightly modified quantity. The unusual prefactor $\frac{2n+1}{4\pi R^4}$ in (108) is only due to the fact that we have used $L^2(\Omega)$-normalized spherical harmonics and not Schmidt semi-normalized ones. A significant feature of the geomagnetic field revealed by the degree variances is a sharp "knee" around degree $n = 15$, which is generally interpreted as the transition from the core field-dominated part to the crustal field-dominated part (see, e.g., the contributions Sabaka et al. (2014) and Olsen et al. (2014) of this handbook for some more details and illustrations). However, the global character of spherical harmonics implies that the degree variances essentially reflect global properties, i.e., they give no information on where a signal originates. In Beggan et al. (2013) Slepian functions have been used to model the continental part of the crustal magnetic field separately from the oceanic part. The Slepian functions were designed such that they are spatially concentrated either on the continents or on the oceans. Approximating the magnetic field with these functions, and afterwards transforming the Slepian coefficients into spherical harmonic coefficients, allows the derivation of degree variances for each of the two regions. This approach reduces aliasing effects that might occur when trying to obtain such regionally reflected degree variances by methods based purely on spherical harmonics. Yet, each degree variance still reflects a spherical harmonic degree, i.e., a global frequency.

Our philosophy in defining a local/regional power spectrum is somewhat different. We are aiming at a multiscale power spectrum where each multiscale variance reflects the influence of features of scale-dependent spatial extent. More precisely, we use the previously introduced multiscale representation to define such a spatially oriented power spectrum. High scales $J$ reflect the contributions to $|b(x)|^2$ that originate in the vicinity of the location $x$, while lower scales represent the influence of global sources or sources of a larger spatial extent. Integrating these contributions over all $x\in\Omega_r$ or over all $x$ in a subregion of the Earth's surface yields the desired multiscale variances. This is described in more detail in the following paragraphs.

4.3.1 Wavelets as Frequency Packages

Let $\varphi_J^{(i)}(\cdot,\cdot)$ and $\psi_J^{(i)}(\cdot,\cdot)$ be the scaling and wavelet kernels as defined in Sect. 3.1, with respect to symbols $\lambda_n = 1$. Then we know that

\[ b = \nabla U = \sum_{i=1}^{2} \varphi_{J_0}^{(i)} \star \varphi_{J_0}^{(i)} * b + \sum_{i=1}^{2}\sum_{J=J_0}^{\infty} \psi_J^{(i)} \star \psi_J^{(i)} * b. \tag{109} \]

Kernels of type $i = 3$ are not required in the representation (109) because the gradient $\nabla_x = \xi\frac{\partial}{\partial r} + \frac{1}{r}\nabla^*_\xi$ only generates kernels of type 1 and 2. A canonical extension of (108) to the multiscale setting is the multiscale variance

\[ \mathrm{Var}_J(r) = \frac{1}{4\pi r^2}\left\| \sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b \right\|^2_{l^2(\Omega_r)} = \frac{1}{4\pi r^2}\sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \left(\psi_J^\wedge(n)\right)^4 \left( \left((b_r^{(1)})^\wedge(n,k)\right)^2 + \left((b_r^{(2)})^\wedge(n,k)\right)^2 \right). \tag{110} \]

This concept has already been introduced in Freeden and Maier (2003) in the different context of signal-to-noise thresholding. At scale $J = J_0$ the wavelet kernels in (110) need to be substituted by scaling kernels (we take $J_0 = 0$ in the examples of this subsection). For the specific choice of symbols

\[ (\varphi_J^{(i)})^\wedge(n) = \begin{cases} 1, & n \le J, \\ 0, & \text{else}, \end{cases} \tag{111} \]

the multiscale variances (110) and the degree variances (108) coincide. In other words, $\mathrm{Var}_J(r)$ is a generalization of $\mathrm{Var}_n(r)$. However, the original purpose was to obtain a multiscale power spectrum that reflects local/regional properties of the magnetic field. Thus, one would not choose (111) but a scaling kernel that generates spatially localizing wavelets, such as the CP kernel. In our examples we use a slightly adapted version of the CP kernel (cf. Figs. 8 and 9), namely,

Fig. 8 The symbols $(\varphi_J^{(i)})^\wedge(n)$ of the adapted CP scaling kernel (left) and the symbols $(\psi_J^{(i)})^\wedge(n)$ of the corresponding wavelet kernel (right) for scales $J = 3, 8, 15, 25, 40$

[Figure 9: three global maps of $|\psi_3^{(1)}(\varepsilon^2,\cdot)|$, $|\psi_8^{(1)}(\varepsilon^2,\cdot)|$, and $|\psi_{15}^{(1)}(\varepsilon^2,\cdot)|$]

Fig. 9 Absolute values of the adapted CP wavelet kernel $\psi_J^{(1)}(\varepsilon^2,\cdot)$ at scales $J = 3, 8, 15$

 .i/ ^ J .n/ D

.1/ 2 J ." ; /

at scales J D 3; 8; 15

( 2   J C5  J C5  J C5 4 4 1 C n2 ; n  2 4 ; 1  n2 0;

(112)

else:

The adaptation has simply been undertaken to reduce the changes between two consecutive scales, so that we get a finer spectrum. The term $\left|\sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b(x)\right|^2$ represents signals originating in a scale-dependent region around $x$ (determined by the localization of the kernel $\psi_J^{(i)}(\xi,\cdot)$, with $\xi = \frac{x}{|x|}$). Integrating over all $x\in\Omega_r$, we obtain the multiscale variance $\mathrm{Var}_J(r)$. Therefore, $\mathrm{Var}_J(r)$ does not actually reflect the contribution of a specific region to $\frac{1}{4\pi r^2}\|b\|^2_{l^2(\Omega_r)}$, but it rather reflects the spatial extent of the contributing features. Low scales $J$ are influenced by signals of global origin, while higher scales account for signals of local origin. Thus, a multiscale power spectrum of the form (110) yields a spatially oriented alternative (in terms of its interpretation) to power spectra based on degree variances. If one is interested in the multiscale power spectra of subregions $\Gamma_r\subset\Omega_r$, such as continents and oceans, one should look at the modified multiscale variances

\[ \mathrm{Var}_{J,\Gamma_r}(r) = \frac{1}{|\Gamma_r|}\left\| \sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b \right\|^2_{l^2(\Gamma_r)}, \tag{113} \]

where $|\Gamma_r|$ denotes the surface area of $\Gamma_r$. In this case, $\mathrm{Var}_{J,\Gamma_r}(r)$ only accounts for signals originating in $\Gamma_r$, while the interpretation of the scales $J$ in terms of the spatial extent of the signals within that region remains. One should be aware that, as opposed to the Slepian functions used in Beggan et al. (2013), the scaling and wavelet kernels are (vectorial) radial basis functions that are not specifically adapted to the region $\Gamma_r$. Thus, at the lower scales, with less localizing kernels $\psi_J^{(i)}(\cdot,\cdot)$, aliasing might occur in a similar manner as it does for degree variances based on spherical harmonics. However, at higher scales this effect is increasingly reduced due to the improving localization of the wavelet kernels. In this sense, $\mathrm{Var}_{J,\Gamma_r}(r)$ is less prone to aliasing issues than degree variances (if solely based on spherical harmonics).

A distinct advantage of the multiscale approach over degree variances (whether based purely on spherical harmonics or on Slepian functions) is the information obtained from the evolution of the expression $\left|\sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b(x)\right|^2$ over different scales $J$ for a single location $x$. The hope is that the trade-off between spatial and frequency localization contained in the construction of the wavelet kernels $\psi_J^{(i)}(\cdot,\cdot)$ allows, at least to some extent, a better extraction of information on depth as well as surface localization of geomagnetic/gravity anomalies. Related results can be found in Fehlinger et al. (2008) and Freeden et al. (2009).

In the remainder of this section, we return to $\mathrm{Var}_J(r)$ and $\mathrm{Var}_{J,\Gamma_r}(r)$ from (110) and (113), respectively, and illustrate the application of the multiscale variances to the CHAOS-4 and MF7 models for different continental/oceanic regions. Spherical harmonic-based models are not the actual aim of multiscale variances (rather the application to actual data or high-resolution models), but they serve well for a first illustration.
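The relation between degree variances and multiscale variances can be sketched in a few lines. By orthogonality of the vector spherical harmonics, the global multiscale variance is a weighted sum of degree variances, $\mathrm{Var}_J = \sum_n (\psi_J^\wedge(n))^4\,\mathrm{Var}_n$, with the bilinear wavelet symbols $(\psi_J^\wedge)^2 = (\varphi_{J+1}^\wedge)^2 - (\varphi_J^\wedge)^2$. The code below uses synthetic degree variances and a step-function scaling symbol equal to 1 for $n \le J$; the data and function names are illustrative assumptions.

```python
import numpy as np

def shannon_symbol(J, n):
    """Scaling symbol equal to 1 for n <= J and 0 otherwise."""
    return (np.asarray(n) <= J).astype(float)

def multiscale_variance(scaling_symbol, J, degree_var):
    """Var_J as a weighted sum of degree variances; degree_var[n-1] holds degree n."""
    n = np.arange(1, degree_var.size + 1)
    psi_sq = scaling_symbol(J + 1, n) ** 2 - scaling_symbol(J, n) ** 2
    return float(np.sum(psi_sq ** 2 * degree_var))

degree_var = 1.0 / np.arange(1, 31) ** 2     # synthetic degree variances, degrees 1..30
var_5 = multiscale_variance(shannon_symbol, 5, degree_var)
```

With the step-function symbol each multiscale variance picks out exactly one degree variance (under the convention $n \le J$ used here, scale $J$ selects degree $J+1$). Passing a smoother scaling symbol instead spreads each scale over a band of degrees, which is the source of the smoothing seen when comparing multiscale and degree-based spectra.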
The degree variances (108) of CHAOS-4 and MF7 are shown in Fig. 10 for comparison. CHAOS-4 is a model of the entire geomagnetic field up to spherical harmonic degree $N_{\max} = 100$. We only regard the crustal field contribution at degrees $n = 16, \ldots, 100$. MF7 is purely a crustal magnetic field model for spherical harmonic degrees $n = 16, \ldots, 133$. In Fig. 11, we have indicated the global multiscale variances $\mathrm{Var}_J(R)$ and the regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ for the MF7 model, where $\Gamma_R$ denotes either the continental or the oceanic shelf (we have used the same shelf boundaries as Beggan et al. 2013). The first thing we notice is that the multiscale power spectrum is significantly smoother than the power spectrum based on degree variances in Fig. 10. This is not surprising since an increasing set of spherical harmonics contributes to the wavelet kernel $\Psi_J^{(i)}(\cdot,\cdot)$ at each scale $J$, causing this smoothing effect. The second thing to notice is the split between continental and oceanic regions, showing a significantly greater power over the continents than over the oceans (note that the regional multiscale variances are scaled by the surface area $|\Gamma_R|$ of the regions under consideration, so that the different sizes of the areas covered by continents and oceans, respectively, do not contribute to the power). An interesting quantity to study is the deviation of the regional multiscale variances from the global average, given by the ratios illustrated in the right plot of Fig. 11. At low scales, up to about $J = 11, 12, 13$, there is no significant difference between the continental and the oceanic


[Figure 10: plot "MF7 (Nmax=133), CHAOS-4 (Nmax=100) - Degree Variances"; horizontal axis: spherical harmonic degree n (0 to 120); curves: CHAOS-4, MF7]

Fig. 10 Degree variances $\mathrm{Var}_n(R)$ of the CHAOS-4 and MF7 model

[Figure 11: two panels over the scale J (0 to 40). Left panel: "MF7 (Nmax=133) - Multiscale Variances"; curves: Continent, Ocean, Global. Right panel: "MF7 (Nmax=133) - Ratio Local/Global"]

Fig. 11 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ of the continental and oceanic shelf for the entire MF7 model. Right: the quotients $\mathrm{Var}_{J,\Gamma_R}(R)/\mathrm{Var}_J(R)$ of the regional and the global multiscale variances

multiscale power spectrum. This is because lower scales reflect features of rather global influence, affecting the continents and oceans in a similar manner. As the scale increases, the influence comes from a more and more local origin, causing the oceanic and continental power spectra to drift apart. However, it can be seen that there are no major changes in the quotients for scales higher than $J \approx 30$. This is simply due to the fact that the MF7 model is truncated at spherical harmonic degree $N_{\max} = 133$, while the wavelet kernels at these scales contain contributions up to much higher degrees.

In Fig. 12, we compare the multiscale power spectra of the MF7 and the CHAOS-4 model. The top left plot shows that the MF7 power is generally stronger and shifted towards higher scales. The latter is due to the higher spherical harmonic degrees of the MF7 model that generate contributions at the higher scales. The ratio of the two models in the top right plot indicates that the MF7 and CHAOS-4 models coincide quite well up to scale $J \approx 23$. Beyond that, the higher spherical harmonic degrees of the MF7 model, and therefore the more localized structures that can be modeled, gain influence, and the power of the MF7 model dominates that of the CHAOS-4 model. If we restrict the MF7 model to spherical harmonic degree $N_{\max} = 100$ as well, the agreement between the two multiscale power spectra is considerably better (bottom row of Fig. 12). Yet, at higher scales, the power of the MF7 model restricted to $N_{\max} = 100$ still seems to be moderately stronger than the power of CHAOS-4. Only around scale $J \approx 23$ does the CHAOS-4 model slightly dominate. The latter probably reflects the stronger degree variances of CHAOS-4 in the degree range $n = 70, \ldots, 90$ (cf. Fig. 10).

Last, we take a look at the multiscale power spectra of the separate continents. This is done for the CHAOS-4 model, for the MF7 model restricted to $N_{\max} = 100$, and for the entire MF7 model.
The results are shown in Fig. 13. Again, the more interesting property to study is the deviation of the regional multiscale variances from the global average (right plots in Fig. 13) as it offers a more

[Figure 12: four panels over the scale J (0 to 40). Top left: "MF7 (Nmax=133), CHAOS-4 (Nmax=100) - Multiscale Variances". Top right: "MF7 (Nmax=133), CHAOS-4 (Nmax=100) - Ratio MF7/CHAOS-4". Bottom left: "MF7 (Nmax=100), CHAOS-4 (Nmax=100) - Multiscale Variances". Bottom right: "MF7 (Nmax=100), CHAOS-4 (Nmax=100) - Ratio MF7/CHAOS-4". Curves: Continent, Ocean, and Global, each for CHAOS-4 and MF7]

Fig. 12 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ for the entire MF7 model (top) and for the MF7 model truncated at spherical harmonic degree $N_{\max} = 100$ (bottom). As a reference, the CHAOS-4 multiscale variances are plotted in faint colors. Right: the quotients $\mathrm{Var}_J^{\mathrm{MF7}}(R)/\mathrm{Var}_J^{\mathrm{CHAOS\text{-}4}}(R)$ and $\mathrm{Var}_{J,\Gamma_R}^{\mathrm{MF7}}(R)/\mathrm{Var}_{J,\Gamma_R}^{\mathrm{CHAOS\text{-}4}}(R)$ of the multiscale variances for the MF7 and the CHAOS-4 model

suitable illustration of the variability of the single continents. At low scales, Africa and Australia show a weaker power than the global average, and then the quotient increases steadily with a peak around scale $J \approx 19$. All other continents show stronger power than the global average over the entire multiscale power spectrum. At scales higher than $J \approx 25$, all continents in the CHAOS-4 model and the MF7 model restricted to $N_{\max} = 100$ reveal flat ratios of approximately the same order. An exception is Antarctica, which has a clearly dominating spectrum. Something interesting happens when going over from the models truncated at $N_{\max} = 100$ to the entire MF7 model with $N_{\max} = 133$ (top right plot in Fig. 13). The behavior at the higher scales diversifies. In particular, the power over Antarctica suddenly becomes weaker than that of all other continents. This probably reflects the problems resulting from the polar gap of satellite data, which mainly affects spatially localized features.

4.3.2 Locally Supported Wavelets

Starting from the representation in Theorem 1, the surface theorem of Gauss, and the substitution of Green’s function for the Beltrami operator by its regularization, we find that the scaling kernel

$$\Phi_J(\xi,\eta) = \frac{1}{4\pi} + \Delta^*_\xi\, G^{\rho_J}(\Delta^*; \xi\cdot\eta), \qquad \xi,\eta \in \Omega, \tag{114}$$

leads to the multiscale representation

$$b = \Phi_{J_0} * b + \sum_{J=J_0}^{\infty} \Psi_J * b. \tag{115}$$

Since we are not actually solving any differential equation involving $\Delta^*$, the scaling kernel (114) is not the canonical choice for a regularization of the Dirac kernel. We have simply picked this one because it complies with our previous considerations. The smoothed Haar scaling function as used, e.g., in Freeden et al. (2005) is another choice that leads to locally supported wavelet kernels

[Figure 13: six panels over the scale J (0 to 40). Left column: "MF7 (Nmax=133) - Multiscale Variances", "MF7 (Nmax=100) - Multiscale Variances", "CHAOS-4 (Nmax=100) - Multiscale Variances". Right column: "MF7 (Nmax=133) - Ratio Local/Global", "MF7 (Nmax=100) - Ratio Local/Global", "CHAOS-4 (Nmax=100) - Ratio Local/Global". Curves: Eurasia, Americas, Africa, Australia, Antarctica, Global]

Fig. 13 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ of the separate continents for the entire MF7 model (top), for the MF7 model truncated at spherical harmonic degree $N_{\max} = 100$ (center), and for the CHAOS-4 model (bottom). Right: the quotients $\mathrm{Var}_{J,\Gamma_R}(R)/\mathrm{Var}_J(R)$ of the regional and the global multiscale variances

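that could be used here. With (115) at hand, we are led to the following definitions of multiscale variances:

$$\mathrm{Var}_J(r) = \frac{1}{4\pi r^2}\,\|\Psi_J * b\|^2_{l^2(\Omega_r)}, \tag{116}$$

$$\mathrm{Var}_{J,\Gamma_r}(r) = \frac{1}{|\Gamma_r|}\,\|\Psi_J * b\|^2_{l^2(\Gamma_r)}. \tag{117}$$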

Their interpretation is analogous to that for scaling kernels constructed as frequency packages.
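Both (116) and (117) are area-normalized squared norms, so on a grid they reduce to weighted means of the pointwise squared norm of $\Psi_J * b$. The following is a minimal sketch only, assuming that the wavelet-filtered field has already been evaluated on a regular colatitude/longitude grid and using a plain midpoint rule in place of the quadrature rules discussed in this handbook. The function names, the grid, and the synthetic input field are illustrative assumptions, not part of the chapter.

```python
import numpy as np

def grid_weights(n_theta, n_phi, r=1.0):
    """Midpoint-rule area weights r^2 sin(theta) dtheta dphi (hypothetical helper)."""
    dtheta = np.pi / n_theta
    dphi = 2.0 * np.pi / n_phi
    theta = (np.arange(n_theta) + 0.5) * dtheta   # colatitudes of cell centers
    phi = (np.arange(n_phi) + 0.5) * dphi         # longitudes of cell centers
    w = r**2 * np.sin(theta)[:, None] * dtheta * dphi * np.ones((1, n_phi))
    return theta, phi, w

def global_variance(sq_norm, w, r=1.0):
    """Approximates (1 / (4 pi r^2)) * integral over the sphere of |Psi_J * b|^2."""
    return np.sum(w * sq_norm) / (4.0 * np.pi * r**2)

def regional_variance(sq_norm, w, mask):
    """Approximates (1 / |Gamma_r|) * integral over the masked region of |Psi_J * b|^2."""
    area = np.sum(w[mask])                        # discrete surface area |Gamma_r|
    return np.sum(w[mask] * sq_norm[mask]) / area

# Synthetic stand-in for |Psi_J * b|^2: value 4 on the northern hemisphere, 1 elsewhere.
theta, phi, w = grid_weights(90, 180)
north = np.broadcast_to(theta[:, None] < np.pi / 2, w.shape)
sq_norm = np.where(north, 4.0, 1.0)

var_global = global_variance(sq_norm, w)
var_north = regional_variance(sq_norm, w, north)
ratio = var_north / var_global                    # analogue of the "Ratio Local/Global" curves
```

Because the regional variance is divided by the area of the region itself, the size of the region does not enter the ratio; only the mean signal strength inside it does.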

5 Conclusion and Outlook

Today, a variety of mathematical methods are available to approximate and analyze geophysical quantities. Which of these methods is the most adequate strongly depends on the available data, the properties of the modeled quantity, and the intentions of the user. Spherical harmonic expansions are very popular as they yield good global results and a meaningful physical interpretability. In this chapter, we have presented alternative multiscale representations that are suitable for unevenly distributed data and strongly spatially varying quantities such as the Earth’s crustal magnetic field. The multiscale power spectrum and its application to the CHAOS-4 model and the MF7 model show that it can indicate local/regional differences at the higher scales. As a consequence, a future goal would be a more extensive application of multiscale methods to real data or high-resolution models in order to gain insights into the localized/regional contributions. In this sense, spatially oriented multiscale methods can supplement the already well-established frequency-oriented methods.

Different from many other approaches, the evaluation of the multiscale representations presented in this chapter is based on numerical integration rather than the solution of systems of linear equations. On the one hand, this avoids problems with ill-conditioned matrices, and on the other hand, it requires adequate quadrature rules. An overview of numerical integration methods can be found in the contribution Hesse et al. (2014) of this handbook.

The upcoming Swarm mission will be a further step forward in modeling the Earth’s magnetic field. More accurate measurements and new possibilities arising from the constellation of the three satellites are anticipated to reduce the spectral gap occurring between satellite data and ground measurements of the crustal magnetic field at or near the Earth’s surface. This is another point where the potential of multiscale methods could be used, namely, the combination of globally available satellite data with only locally available ground data.

References

Augustin M, Bauer M, Blick C, Eberle S, Freeden W, Gerhards C, Ilyasov M, Kahnt R, Klug M, Möhringer S, Neu T, Nutz H, Ostermann I, Punzi A (2014) Modeling deep geothermal reservoirs: recent advances and future perspectives. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Backus GE (1986) Poloidal and toroidal fields in geomagnetic field modeling. Rev Geophys 24:75–109
Backus GE, Parker R, Constable C (1996) Foundations of geomagnetism. Cambridge University Press, Cambridge
Bayer M, Freeden W, Maier T (2001) A vector wavelet approach to iono- and magnetospheric geomagnetic satellite data. J Atm Sol-Ter Phys 63:581–597
Beggan CD, Saarimäki J, Whaler K, Simons FJ (2013) Spectral and spatial decomposition of lithospheric magnetic field models using spherical Slepian functions. Geophys J Int 193:136–148
Birkeland K (1908) The Norwegian aurora polaris expedition 1902–1903, vol 1. H. Aschehoug, Oslo
Chambodut A, Panet I, Mandea M, Diament M, Holschneider M (2005) Wavelet frames: an alternative to spherical harmonic representation of potential fields. Geophys J Int 163:875–899
Dahlke S, Dahmen W, Schmitt W, Weinreich I (1995) Multiresolution analysis and wavelets on S² and S³. Numer Funct Anal Opt 16:19–41
Edmonds AR (1957) Angular momentum in quantum mechanics. Princeton University Press, Princeton
Fehlinger T, Freeden W, Gramsch S, Mayer C, Michel D, Schreiner M (2007) Local modelling of sea surface topography from (geostrophic) ocean flow. ZAMM 87:775–791
Fehlinger T, Freeden W, Mayer C, Schreiner M (2008) On the local multiscale determination of the Earth’s disturbing potential from discrete deflections of the vertical. Comput Geosci 12:473–490
Freeden W (1981) On approximation by harmonic splines. Manuscr Geod 6:193–244
Freeden W (1998) The uncertainty principle and its role in physical geodesy. In: Freeden W (ed) Progress in geodetic science. Shaker, Aachen
Freeden W, Gerhards C (2010) Poloidal and toroidal field modeling in terms of locally supported vector wavelets. Math Geosci 42:817–838
Freeden W, Gerhards C (2012) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton
Freeden W, Maier T (2003) Spectral and multiscale signal-to-noise thresholding of spherical vector fields. Comput Geosci 7:215–250
Freeden W, Schreiner M (2006) Local multiscale modeling of geoidal undulations from deflections of the vertical. J Geod 78:641–651
Freeden W, Schreiner M (2009) Spherical functions of mathematical (geo-) sciences. Springer, Heidelberg
Freeden W, Schreiner M (2014a) Special functions in mathematical geosciences – an attempt of categorization. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Freeden W, Schreiner M (2014b) Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Freeden W, Windheuser U (1996) Spherical wavelet transform and its discretization. Adv Comput Math 5:51–94
Freeden W, Wolf K (2008) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geosciences). Oxford University Press, New York
Freeden W, Michel D, Michel V (2005) Local multiscale approximations of geostrophic oceanic flow: theoretical background and aspects of scientific computing. Mar Geod 28:313–329
Freeden W, Fehlinger T, Klug M, Mathar M, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geod 83:1171–1191
Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358
Gauss CF (1839) Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des Magnetischen Vereins im Jahre 1838. Göttinger Magnetischer Verein, Leipzig
Gerhards C (2011a) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. PhD thesis, University of Kaiserslautern
Gerhards C (2011b) Spherical decompositions in a global and local framework: theory and an application to geomagnetic modeling. Int J Geomath 1:1–52
Gerhards C (2012) Locally supported wavelets for the separation of spherical vector fields with respect to their sources. Int J Wavel Multires Inf Process 10. doi:10.1142/S0219691312500348
Gerlich G (1972) Magnetfeldbeschreibung mit Verallgemeinerten Poloidalen und Toroidalen Skalaren. Z Naturforsch 8:1167–1172
GRIMM-3 (2011) GFZ reference internal magnetic model 3. http://www.gfz-potsdam.de/en/research/organizational-units/departments/department-2/earths-magnetic-field/topics/field-models/grimm-x/grimm-3. Accessed date 26 Aug 2014
Haines GV (1985) Spherical cap harmonic analysis. J Geophys Res 90:2583–2591
Hesse K, Sloan IH, Womersley R (2014) Numerical integration on the sphere. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Holschneider M (1996) Continuous wavelet transforms on the sphere. J Math Phys 37:4156–4165
Holschneider M, Chambodut A, Mandea M (2003) From global to regional analysis of the magnetic field on the sphere using wavelet frames. Phys Earth Planet Int 135:107–124
Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam
Hulot G, Finlay CC, Constable CG, Olsen N, Mandea M (2010) The magnetic field of planet earth. Space Sci Rev 152:159–222
IAGA (International Association of Geomagnetism and Aeronomy), Working Group V-MOD (2010) International geomagnetic reference field: the eleventh generation. Geophys J Int 183:1216–1230
Kotsiaros S, Olsen N (2012) The geomagnetic field gradient tensor. Int J Geomath 3:297–314
Langel RA (1987) The main field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London
Langel RA, Estes RH (1985) The near-earth magnetic field at 1980 determined from MAGSAT data. J Geophys Res 90:2495–2510
Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge
Lowes FJ (1974) Spatial power spectrum of the main geomagnetic field, and extrapolation to the core. Geophys J R Astron Soc 36:717–730
Maier T (2005) Wavelet-Mie-representation for solenoidal vector fields with applications to ionospheric geomagnetic data. SIAM J Appl Math 65:1888–1912
Maier T, Mayer C (2003) Multiscale downward continuation of the crustal field from CHAMP FGM data. In: Reigber C, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Heidelberg
Mauersberger P (1956) Das Mittel der Energiedichte des Geomagnetischen Hauptfeldes an der Erdoberfläche und seine sekuläre Änderung. Gerlands Beitr Geophys 65:135–142
Maus S (2008) The geomagnetic power spectrum. Geophys J Int 174:135–142
Maus S, Hemant K, Rother M, Lühr H (2003) Mapping the lithospheric magnetic field from CHAMP scalar and vector magnetic data. In: Reigber C, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Heidelberg
Maus S, Lühr H, Purucker M (2006) Simulation of the high-degree lithospheric field recovery for the Swarm constellation of satellites. Earth Planets Space 58:397–407
Mayer C (2003) Wavelet modeling of ionospheric currents and induced magnetic fields from satellite data. PhD thesis, University of Kaiserslautern
Mayer C (2006) Wavelet decomposition of spherical vector fields with respect to sources. J Fourier Anal Appl 12:345–369
Mayer C, Maier T (2006) Separating inner and outer Earth’s magnetic field from CHAMP satellite measurements by means of vector scaling functions and wavelets. Geophys J Int 167:1188–1203
MF7 (2010) Magnetic field model MF7. http://www.geomag.us/models/MF7.html. Accessed date 28 Aug 2014
Müller C (1966) Spherical harmonics. Lecture notes in mathematics, vol 17. Springer, Berlin
Olsen N (1997) Ionospheric F-region currents at middle and low latitudes estimated from MAGSAT data. J Geophys Res 102:4563–4576
Olsen N, Glassmeier K-H, Jia X (2010) Separation of the magnetic field into external and internal parts. Space Sci Rev 152:135–157
Olsen N, Hulot G, Sabaka TJ (2014) The geomagnetic field – from observations to separation of the various field contributions. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Olsen N, Lühr H, Finlay CC, Sabaka TJ, Michaelis I, Rauber J, Tøffner-Clausen L (2014) The CHAOS-4 geomagnetic field model. Geophys J Int 197:815–827
Papitashvili VO, Christiansen F, Neubert T (2002) A new model of field-aligned currents derived from high-precision satellite magnetic field data. Geophys Res Lett 29. doi:10.1029/2001GL014207
Ritter P, Lühr H (2006) Curl-B technique applied to Swarm constellation for determining field-aligned currents. Earth Planets Space 58:463–476
Rummel R (2010) GOCE: gravitational gradiometry in a satellite. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg
Rummel R, van Gelderen M, Koop R, Schrama E, Sanso F, Brovelli M, Miggliaccio F, Sacerdote F (1993) Spherical harmonic analysis of satellite gradiometry. Publications on geodesy, vol 39. Nederlandse Commissie voor Geodesie, Delft
Sabaka T, Hulot G, Olsen N (2014) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Schröder P, Swelden W (1995) Spherical wavelets on the sphere. In: Approximation theory VIII. World Scientific, Singapore
Shure L, Parker RL, Backus GE (1982) Harmonic splines for geomagnetic modeling. Phys Earth Planet Int 28:215–229
Simons FJ, Plattner A (2014) Scalar and vector slepian functions, spherical signal estimation and spectral analysis. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Simons FJ, Dahlen FA, Wieczorek MA (2006) Spatiospectral localization on a sphere. SIAM Rev 48:504–536
Thébault E, Schott JJ, Mandea M (2006) Revised spherical cap harmonics analysis (R-SCHA): validation and properties. J Geophys Res 111. doi:10.1029/2005JB003836
Thébault E, Purucker E, Whaler KA, Langlais B, Sabaka TJ (2010) The magnetic field of the Earth’s lithosphere. Space Sci Rev 155:95–127
Untied J (1967) A model of the equatorial electrojet involving meridional currents. J Geophys Res 72:5799–5810
Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally support wavelets. PhD thesis, University of Kaiserslautern

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_19-4 © Springer-Verlag Berlin Heidelberg 2014

The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data

Zdeněk Martinec
Department of Geophysics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Dublin Institute for Advanced Studies, Dublin 2, Ireland

Abstract Detailed mathematical derivations of the forward and adjoint sensitivity methods are presented for computing the electromagnetic induction response of a 2-D heterogeneous conducting sphere to a transient external-electric current excitation. The forward method is appropriate for determining the induced spatiotemporal electromagnetic signature at satellite altitudes associated with the upper and mid-mantle conductivity heterogeneities, while the adjoint method provides an efficient tool for computing the sensitivity of satellite magnetic data to the conductivity structure of the Earth’s interior. The forward and adjoint initial boundary-value problems, both solved in the time domain, are identical, except for the specification of the prescribed boundary conditions. The respective boundary-value data at the satellite’s altitude are the X magnetic component measured by the CHAMP vector magnetometer along the satellite track for the forward method and the difference between the measured and predicted Z magnetic component for the adjoint method. Both methods are alternatively formulated for the case when the time-dependent, spherical harmonic Gauss coefficients of the magnetic field generated by external equatorial ring currents in the magnetosphere and the magnetic field generated by the induced eddy currents in the Earth, respectively, are specified. Before applying these methods, the CHAMP vector magnetic data are modeled by a two-step, track-by-track spherical harmonic analysis. As a result, the X and Z components of CHAMP magnetic data are represented in terms of series of Legendre polynomial derivatives. Four examples of the two-step analysis of the signals recorded by the CHAMP vector magnetometer are presented. The track-by-track analysis is applied to the CHAMP data recorded in the year 2001, yielding a 1-year time series of spherical harmonic coefficients. 
The output of the forward modeling of electromagnetic induction, that is, the predicted Z component at satellite altitude, can then be compared with the satellite observations. The squares of the differences between the measured and predicted Z component summed up over all CHAMP tracks determine the misfit. The sensitivity of the CHAMP data, that is, the partial derivatives of the misfit with respect to mantle conductivity parameters, is then obtained by the scalar product of the forward and adjoint solutions, multiplied by the gradient of the conductivity, and integrated over all CHAMP tracks. Such exactly determined sensitivities are checked against the numerical differentiation of the misfit, and a good agreement is obtained. The attractiveness of the adjoint method lies in the fact that the adjoint sensitivities are calculated for the price of only an additional forward calculation, regardless of the number of conductivity parameters. However, since the adjoint solution proceeds backwards in time, the forward solution must be stored at each time step, leading to memory requirements that are linear with respect to the number of steps undertaken. Having determined the sensitivities, the conjugate gradient inversion is run to infer 1-D and 2-D conductivity structures of the Earth based on the CHAMP residual time series (after the subtraction of the static field and




secular variations as described by the CHAOS model) for the year 2001. It is shown that this time series is capable of resolving both 1-D and 2-D structures in the upper mantle and the upper part of the lower mantle, while it is not sufficiently long to reliably resolve the conductivity structure in the lower part of the lower mantle.

Keywords 2-D electrical conductivity • Adjoint sensitivity method • CHAMP magnetic data • Electromagnetic induction • Finite elements • Inverse problem • Spherical harmonics • Time-domain approach

1 Introduction

Global-scale electromagnetic (EM) induction processes in the Earth, induced by external magnetic storms, are traditionally used to determine the electrical conductivity of the Earth’s interior. During large magnetic storms, charged particles in the near-Earth plasma sheet are energized and injected deeper into the magnetosphere, producing the storm-time ring current at roughly 3–4 Earth radii (Kivelson and Russell 1995; Daglis et al. 1999). At mid-latitudes on the Earth’s surface, the magnetic potential due to this magnetospheric current has a nearly axisymmetric structure. It changes with colatitude predominantly as the cosine of colatitude, that is, as the Legendre function $P_{10}(\cos\vartheta)$ (e.g., Eckhardt et al. 1963). The time evolution of a magnetic storm also follows a characteristic pattern. Its initial phase is characterized by a rapid intensification of the ring current over time scales of several hours. The following main phase of a storm, which can last as long as 2–3 days in the case of severe storms, is characterized by the occurrence of multiple intense substorms, with the associated auroral and geomagnetic effects. The final recovery phase of a storm is characterized by an exponential relaxation of the ring current to its usual intensity with a characteristic period of several days (Hultqvist 1973). The instantaneous strength of the magnetospheric ring currents is conventionally monitored by the Dst index, an average of the horizontal magnetic field at a collection of geomagnetic observatories located at equatorial latitudes. The EM induction signal, defined herein as the residual magnetic induction vector B that remains after core, lithospheric and ionospheric contributions are removed, is of the order of 20–200 nT (Langel et al. 1996). The total magnetospheric signal consists of a primary term owing to the ring current itself, plus an induced term from eddy currents inside the Earth responding to fluctuations in ring-current intensity.
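As a rough illustration of how a primary and an induced term combine, consider a purely zonal degree-1 potential outside a sphere of radius $a$. The notation used here ($\varepsilon$ for the external coefficient, $\iota$ for the induced internal coefficient, and the ratio $Q = \iota/\varepsilon$) is introduced only for this sketch and is not taken from the chapter; treating $Q$ as a real constant is an idealization that ignores the time lag of the induced currents.

```latex
V(r,\vartheta) = a\left[\varepsilon\,\frac{r}{a} + \iota\left(\frac{a}{r}\right)^{2}\right]\cos\vartheta ,
\qquad \mathbf{B} = -\nabla V ,
\\[4pt]
B_r = -\frac{\partial V}{\partial r}
    = -\left[\varepsilon - 2\,\iota\left(\frac{a}{r}\right)^{3}\right]\cos\vartheta ,
\qquad
B_\vartheta = -\frac{1}{r}\frac{\partial V}{\partial \vartheta}
    = \left[\varepsilon + \iota\left(\frac{a}{r}\right)^{3}\right]\sin\vartheta ,
\\[4pt]
r = a,\ \iota = Q\,\varepsilon:\qquad
\left|\mathbf{B}\right|_{\vartheta=\pi/2} = \varepsilon\,(1+Q),
\qquad
\left|\mathbf{B}\right|_{\vartheta=0} = \varepsilon\,\lvert 1-2Q\rvert .
```

Under these assumptions the induced term strengthens the horizontal field at the dipole equator by the fraction $Q$ and weakens the vertical field at the poles by the fraction $2Q$. For example, $Q = 0.25$ gives a 25 % increase at the equator and a 50 % reduction at the poles; a perfectly conducting sphere of radius $a$ corresponds to the limiting value $Q = 1/2$, for which the radial field at the surface vanishes.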
In geomagnetic dipole coordinates, the magnetic field signal induced by a zonal external current source in a radially symmetric Earth follows a zonal distribution, adding roughly 20–30 % to the primary signal at the dipole equator, and canceling up to 80 % of the primary signal near the geomagnetic poles. This is easily demonstrated by an analytic calculation of the response of a conducting sphere in a uniform magnetic field (Everett and Martinec 2003). Even when the source has a simple spatial structure, induced currents in reality are more complicated since they are influenced by the Earth’s heterogeneous conductivity structure. The resulting 3-D induction effects at satellite altitudes caused by the spatially complicated induced current flow can be large and dependent on upper-mantle electrical conductivity. Electrical conductivity is an important deep-Earth physical property, spatial variations of which provide fundamental information concerning geodynamic processes such as the subduction of slabs, the ascent of mantle plumes, and the convection of anomalously hot mantle material. The electrical conductivity of the upper to mid-mantle is conventionally studied using frequency-domain geomagnetic induction techniques. The traditional approach involves the estimation of

surface impedances from land-based observatory recordings of geomagnetic time variations of external origin in the period range of several hours to several days (e.g., Eckhardt et al. 1963; Banks 1969; Banks and Ainsworth 1992; Schultz and Larsen 1987; Schultz and Larsen 1990; Praus et al. 2011). The underlying electrical conductivity is extracted from the measured impedances by means of forward modeling and inversion. The task is difficult, however, since the global distribution of magnetic observatories is sparse and irregular, the quality of the magnetic time series is variable, and assumptions about the spatial and temporal variability of external magnetic sources must often be introduced to be able to carry out electromagnetic induction modeling (e.g., Langel and Estes 1985a; Kuvshinov 2010). The recent high-precision magnetic missions, Ørsted (launched February 1999) and CHAMP (July 2000), may help to overcome these problems. Unlike land-based observatories, satellites acquire data with no regard for oceans and continents, hemispheres, or political boundaries. On the other hand, the combined spatial and temporal character of satellite signals makes their analysis more difficult than that of their terrestrial counterparts, which manifest only temporal variations. Nonetheless, significant progress has already been made in separating the signals due to EM induction in the Earth from satellite magnetic data (Didwall 1984; Oraevsky et al. 1993; Olsen 1999; Tarits and Grammatica 2000; Korte et al. 2003; Constable and Constable 2004; Kuvshinov and Olsen 2006; Olsen et al. 2005; Olsen and Stolle 2012). In order to model 3-D EM induction effects in the geomagnetic field at satellite altitudes quantitatively, a transient EM induction in a 3-D heterogeneous sphere needs to be simulated.
Several techniques are available to model the geomagnetic response of a 3-D heterogeneous sphere in the Fourier-frequency domain, each based on a different numerical method: the spherical thin-sheet method (Fainberg et al. 1990; Kuvshinov et al. 1999a), the finite-element method (Everett and Schultz 1996; Weiss and Everett 1998), the integral-equation method (Kuvshinov et al. 1999b), the finite-difference method (Uyeshima and Schultz 2000), and the spectral finite-element method (Martinec 1999). It is, however, inconvenient to study the geomagnetic induction response to a transient excitation, such as a magnetic storm, in the Fourier-frequency domain. Moreover, the complicated spatial and temporal variability of satellite data favors a time-domain approach. Several time-domain methods for computing the geomagnetic induction response to a transient external source have recently been developed (Hamano 2002; Everett and Martinec 2003; Martinec et al. 2003; Velímský and Martinec 2005; Velímský 2010).

On a planetary scale, the electrical conductivity of the Earth’s interior is determined from permanent geomagnetic observatory recordings by applying inverse theory (e.g., Banks 1969; Schultz and Larsen 1987, 1990, and others). Although electrical conductivity depends on the temperature, pressure, and chemical composition of the Earth’s interior with many degrees of freedom, one is always forced to parameterize the conductivity by only a few parameters so that the inverse modeling can be carried out with a certain degree of uniqueness. A major difficulty in the choice of model parameterization is to introduce those parameters that are most important for interpreting the data. Strictly, this information cannot be known a priori, but can be inferred from the analysis of the sensitivities, that is, partial derivatives of the data with respect to model parameters.
Once the sensitivities to all parameters are available, they can subsequently be used for ranking the relative importance of conductivity parameters to a forward-modeled response, for refining an initial conductivity model to improve the fit to the observed data, and for assessing the uncertainty of the inverse-modeled conductivity distribution due to the propagation of errors contaminating the data. A straightforward approach to compute the sensitivities is the so-called brute-force method, whereby the partial derivatives with respect to model parameters are approximated by the centered


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_19-4 © Springer-Verlag Berlin Heidelberg 2014

difference of two forward model runs. Although the brute-force method is not particularly elegant, it is useful for computing the sensitivities with respect to a small number of model parameters or for testing the accuracy of faster algorithms (which is the case in this chapter). However, it becomes impractical for a conductivity model with a large number of parameters. There are two advanced formal techniques for calculating the sensitivities: the forward sensitivity method and the adjoint sensitivity method. In the forward sensitivity method, the forward model is differentiated with respect to model parameters and the resulting forward sensitivity equations are solved for the partial derivatives of the field variables. If there are M model parameters, then the solutions of M forward sensitivity equations are required. Although this is an excellent method when M is small, it becomes computationally expensive for larger values of M. The mathematical foundations of the forward and adjoint sensitivities for linear and nonlinear dynamical systems are presented by Marchuk (1995), Cacuci (2003), and Sandu et al. (2003), while their application to EM induction modeling is described by Rodi (1976), Jupp and Vozoff (1977), Oldenburg (1990), and McGillivray et al. (1994). The adjoint sensitivity method, applied also hereafter, is a powerful complement to the forward sensitivity method. In this method, the adjoint sensitivity equations are solved by making use of nearly identical forward modeling code, but running it backwards in time. For a physical system that is linearly dependent on its model parameters, the adjoint sensitivity equations are independent of the original (forward) equations (Cacuci 2003), and hence the adjoint sensitivity equations are solved only once in order to obtain the adjoint solution. The sensitivities to all model parameters are then obtained by a subsequent integration of the product of the forward and adjoint solutions.
Thus, there is no need to solve the forward sensitivity equations repeatedly, as in the forward sensitivity method. The adjoint sensitivity method is therefore the most efficient for sensitivity analysis of models with large numbers of parameters. The adjoint sensitivity method for a general physical system was, for example, described by Morse and Feshbach (1953), Lanczos (1961), Marchuk (1995), Cacuci (2003), and Tarantola (2005), and its application to EM induction problems was demonstrated by Weidelt (1975), Madden and Mackie (1989), McGillivray and Oldenburg (1990), Oldenburg (1990), Farquharson and Oldenburg (1996), Newman and Alumbaugh (1997, 2000), Dorn et al. (1999), Rodi and Mackie (2001), Avdeev and Avdeeva (2006), and Kelbert et al. (2008). In this chapter, the forward and adjoint methods for data recorded by the vector flux gate magnetometer on board the CHAMP satellite are presented. The satellite was launched on July 15, 2000, into a near-polar orbit with an inclination of approximately 87.3° and an initial altitude of 454 km. The focus here is on magnetic storm signals generated by equatorial ring currents in the magnetosphere. The configuration of the CHAMP orbit and the distribution of the magnetospheric ring currents allow one to model EM induction processes in an axisymmetric geometry. The main aim here is to rigorously formulate the forward and adjoint sensitivity methods and use them for interpreting the CHAMP magnetic data for the year 2001.
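The brute-force method described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: `forward_model` is a hypothetical stand-in with three parameters (it is not an EM induction solver), and the helper name `brute_force_sensitivities` and the step size are likewise assumptions. What it shows is the cost structure: each parameter requires two forward runs, so M parameters require 2M runs.

```python
import numpy as np

def forward_model(params):
    """Hypothetical stand-in for a forward solver: maps parameters to data."""
    p = np.asarray(params, dtype=float)
    return np.array([np.sum(p**2), np.prod(p), np.sin(p[0]) + p[-1]])

def brute_force_sensitivities(model, params, rel_step=1e-6):
    """Approximate d(data)/d(params) by centered differences (2*M model runs)."""
    p0 = np.asarray(params, dtype=float)
    n_data = model(p0).size
    jac = np.empty((n_data, p0.size))
    for k in range(p0.size):
        h = rel_step * max(1.0, abs(p0[k]))   # step for parameter k
        dp = np.zeros_like(p0)
        dp[k] = h
        # Two forward runs per parameter: one perturbed up, one perturbed down
        jac[:, k] = (model(p0 + dp) - model(p0 - dp)) / (2.0 * h)
    return jac

params = np.array([0.5, 1.5, 2.0])
jac = brute_force_sensitivities(forward_model, params)
```

For this toy model the result can be compared with the analytic derivatives, which is the same role the brute-force method plays in this chapter as an accuracy check for faster algorithms.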

2 Basic Assumptions on EM Induction Modeling for CHAMP Magnetic Data

In the following, the magnetic signals induced by equatorial ring currents in the magnetosphere and measured by the CHAMP vector flux gate magnetometer are considered. To obtain these signals, the CHAMP magnetic data are processed in a specific manner, and several assumptions for the EM induction modeling are made. The data processing steps and assumptions (see Martinec and McCreadie 2004; Velímský et al. 2006 for more details) are as follows.

1. The CHAOS model of the Earth’s magnetic field (Olsen et al. 2006a) is first subtracted from the CHAMP vector magnetic data. The residual magnetic time series are only considered along night-side satellite tracks at low and mid-latitudes, allowing the reduction of the magnetic effects of ionospheric currents and field-aligned currents in the polar regions.

2. The satellite is assumed to orbit the Earth sufficiently fast (one CHAMP orbit takes approximately 90 min) compared to the time variations of the ring current, so that the spatial and temporal changes of the magnetic field observed by the single satellite can be separated in a simple way. Each night-side satellite track is considered to sample a snapshot of the magnetic field at the time when the satellite crosses the magnetic equator.

3. Since the CHAMP satellite orbit is nearly polar, magnetic signals sampled along a track are dominantly influenced by latitudinal changes in the electrical conductivity of the Earth’s mantle. The effect of longitudinal variations in electrical conductivity on track data is not considered and it is assumed that the electrical conductivity $\sigma$ of the Earth is axisymmetric, that is,

$$\sigma = \sigma(r, \vartheta) \quad \text{in } B, \tag{1}$$

where $B$ is a conducting sphere approximating a heterogeneous Earth, $r$ is the radial distance from the center of $B$, and $\vartheta$ is the colatitude.

4. Since ring-current magnetospheric excitation has a nearly axisymmetric geometry, it is assumed that, for a given satellite track, the inducing and induced magnetic fields possess an axisymmetry property, that is,

$$G_{jm}^{(e)}(t) = G_{jm}^{(i)}(t) = 0 \quad \text{for } m \neq 0, \tag{2}$$

where $G_{jm}^{(e)}(t)$ and $G_{jm}^{(i)}(t)$ are the time-dependent, spherical harmonic Gauss coefficients of the magnetic field generated by external equatorial ring currents in the magnetosphere and the magnetic field generated by the induced eddy currents in the Earth, respectively.

3 Forward Method of Global EM Induction

In this section, the forward method of EM induction is formulated for a 2-D case when the electrical conductivity and external sources of electromagnetic variations are axisymmetrically distributed and when the external current excitation has a transient feature similar to that of a magnetic storm. Most of the considerations in this section follow the papers by Martinec (1997, 1999).

3.1 Formulation of EM Induction for a 3-D Inhomogeneous Earth

An initial boundary-value problem (IBVP) for EM induction is first formulated for a 3-D inhomogeneous Earth. The conducting sphere $B$ is excited by a specified source originating in the ionosphere and magnetosphere. From a broad spectrum of temporal variations in the geomagnetic field, long-period geomagnetic variations with periods ranging from several hours to tens of days are considered. The EM induction within $B$ for this range of periods is governed by Maxwell’s equations taken in the quasi-static approximation:

$$\operatorname{curl} \boldsymbol{B} = \mu\sigma \boldsymbol{E}, \tag{3}$$

$$\operatorname{curl} \boldsymbol{E} = -\frac{\partial \boldsymbol{B}}{\partial t}, \tag{4}$$

$$\operatorname{div} \boldsymbol{B} = 0, \tag{5}$$

$$\operatorname{div} \boldsymbol{E} = \frac{\varrho}{\varepsilon}, \tag{6}$$

where $\boldsymbol{E}$ is the electric intensity, $\boldsymbol{B}$ is the magnetic induction, $\varrho$ is the volume charge density, $\varepsilon$ and $\mu$ are the permittivity and permeability, respectively, and $\sigma$ is the electrical conductivity. The vacuum values for $\mu$ and $\varepsilon$ are used for the whole space. Since a 3-D inhomogeneous Earth is being considered, the volume charge density $\varrho$ does not vanish exponentially as in a homogeneous medium (e.g., Stratton 1941), but electric charges may accumulate in regions where the electrical conductivity has a nonzero gradient. The assumption of the quasi-static approximation of the electromagnetic field then yields the volume charge density $\varrho$ in the form (Pěč and Martinec 1986; Weaver 1994)

$$\varrho = -\frac{\varepsilon}{\sigma}\,(\operatorname{grad}\sigma \cdot \boldsymbol{E}). \tag{7}$$

From this, Eq. 6 may be written in the form

$$\operatorname{div}(\sigma \boldsymbol{E}) = 0, \tag{8}$$

showing that the electric-current density $\boldsymbol{i} = \sigma \boldsymbol{E}$ is divergence-free. Substituting Eq. 3 into Eq. 4, the magnetic diffusion equation for $\boldsymbol{B}$ is obtained:

$$\operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\right) + \mu\frac{\partial \boldsymbol{B}}{\partial t} = 0 \quad \text{in } B. \tag{9}$$
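The substitution that leads to the magnetic diffusion equation can be written out step by step. The following lines use only the two Maxwell equations for the curl of the magnetic induction and of the electric intensity given above, and assume that the conductivity is positive so that the first of them can be solved for the electric intensity:

```latex
\begin{align*}
\boldsymbol{E} &= \frac{1}{\mu\sigma}\operatorname{curl}\boldsymbol{B}
  && \text{(Eq. 3 solved for } \boldsymbol{E}\text{)},\\
\operatorname{curl}\Bigl(\frac{1}{\mu\sigma}\operatorname{curl}\boldsymbol{B}\Bigr)
  &= -\frac{\partial \boldsymbol{B}}{\partial t}
  && \text{(inserted into Eq. 4)},\\
\operatorname{curl}\Bigl(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\Bigr)
  + \mu\,\frac{\partial \boldsymbol{B}}{\partial t} &= 0
  && \text{(multiplied by the constant } \mu\text{)}.
\end{align*}
```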

Note that taking the divergence of Eq. 9 gives $\partial(\operatorname{div}\boldsymbol{B})/\partial t = 0$, so a field $\boldsymbol{B}$ that is divergence-free initially remains divergence-free. Therefore, as long as Eq. 9 holds, the divergence-free condition (5) need not be imposed explicitly on $\boldsymbol{B}$. Furthermore, it is assumed that the region external to $B$ is a perfect insulator that models the near-space environment to the Earth’s surface. As seen from the Biot-Savart law, expressed in Eq. 3, the magnetic induction $\boldsymbol{B}_0$ in this insulator (the subscript 0 indicating a vacuum) is taken to be irrotational:

$$\operatorname{curl} \boldsymbol{B}_0 = 0. \tag{10}$$

Hence, $\boldsymbol{B}_0$ can be expressed as the negative gradient of a magnetic scalar potential $U$:

$$\boldsymbol{B}_0 = -\operatorname{grad} U. \tag{11}$$


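The constraint $\operatorname{div}\boldsymbol{B}_0 = 0$ then implies that the potential $U$ satisfies the scalar Laplace equation:

$$\nabla^2 U = 0. \tag{12}$$

Applying the toroidal-spheroidal decomposition of a vector to the magnetic induction $\boldsymbol{B}_0$, that is, writing $\boldsymbol{B}_0 = \boldsymbol{B}_{0,T} + \boldsymbol{B}_{0,S}$, and using the fact that the gradient of a scalar generates a spheroidal vector, Eq. 11 shows that the toroidal component of the magnetic induction in a vacuum vanishes:

$$\boldsymbol{B}_{0,T} = 0. \tag{13}$$

To complete the specifications, boundary conditions are prescribed on the external surface $\partial B$ of sphere $B$. Decomposing the magnetic induction $\boldsymbol{B}$ into its toroidal and spheroidal parts, $\boldsymbol{B} = \boldsymbol{B}_T + \boldsymbol{B}_S$, it is required, in view of Eq. 13, that the toroidal part of $\boldsymbol{B}$ vanishes on $\partial B$:

$$\boldsymbol{B}_T = 0 \quad \text{on } \partial B, \tag{14}$$

and that the tangential components of $\boldsymbol{B}$ are continuous across $\partial B$:

$$\boldsymbol{n} \times (\boldsymbol{B} - \boldsymbol{B}_0) = 0 \quad \text{on } \partial B, \tag{15}$$

where $\boldsymbol{n}$ is the outward unit normal to $\partial B$ and “$\times$” stands for the cross product of vectors. In view of Eq. 13, the last condition requires that the tangential components of the spheroidal part of $\boldsymbol{B}$ are fixed on $\partial B$:

$$\boldsymbol{n} \times \boldsymbol{B}_S = \boldsymbol{b}_t \quad \text{on } \partial B, \tag{16}$$

where $\boldsymbol{b}_t$ is the tangential component of the magnetic field $\boldsymbol{B}_0$ in the atmosphere taken at the surface $\partial B$ at time $t \geq 0$:

$$\boldsymbol{b}_t := \boldsymbol{n} \times \boldsymbol{B}_0\big|_{\partial B}. \tag{17}$$

Without additional specifications, $\boldsymbol{b}_t$ is assumed to be derivable from ground observations of the geomagnetic field variations. Equation 9 is also subject to the initial condition

$$\boldsymbol{B}\big|_{t=0} = \boldsymbol{B}^0 \quad \text{in } B, \tag{18}$$

where $\boldsymbol{B}^0$ is the initial value of the magnetic induction. The other two boundary conditions require the continuity of the tangential components of the electric intensity (e.g., Stratton 1941):

$$\boldsymbol{n} \times (\boldsymbol{E} - \boldsymbol{E}_0) = 0 \quad \text{on } \partial B, \tag{19}$$

and the continuity of the normal component of the magnetic induction:

$$\boldsymbol{n} \cdot (\boldsymbol{B} - \boldsymbol{B}_0) = 0 \quad \text{on } \partial B. \tag{20}$$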


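A solution of Eq. 9 with the two boundary conditions (14) and (17) is now proved to be unique. Suppose that there exist two solutions, $\boldsymbol{B}_1$ and $\boldsymbol{B}_2$, that satisfy the problem. The difference vector $\boldsymbol{B} := \boldsymbol{B}_1 - \boldsymbol{B}_2$ is zero at the initial time $t = 0$ and satisfies a homogeneous boundary-value problem for $t > 0$:

$$\operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\right) + \mu\frac{\partial \boldsymbol{B}}{\partial t} = 0 \quad \text{in } B, \tag{21}$$

$$\boldsymbol{B}_T = 0 \quad \text{on } \partial B, \tag{22}$$

$$\boldsymbol{n} \times \boldsymbol{B}_S = 0 \quad \text{on } \partial B. \tag{23}$$

Multiplying Eq. 21 by $\boldsymbol{B}$ and making use of Green’s theorem

$$\int_B \operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{a}\right)\cdot\boldsymbol{b}\,dV = \int_B \frac{1}{\sigma}\,(\operatorname{curl}\boldsymbol{a}\cdot\operatorname{curl}\boldsymbol{b})\,dV - \int_{\partial B}\frac{1}{\sigma}\operatorname{curl}\boldsymbol{a}\cdot(\boldsymbol{n}\times\boldsymbol{b})\,dS, \tag{24}$$

where $\boldsymbol{a}$ and $\boldsymbol{b}$ are differentiable vector fields and “$\cdot$” stands for the scalar product of vectors, results in

$$\frac{\mu}{2}\frac{\partial}{\partial t}\int_B (\boldsymbol{B}\cdot\boldsymbol{B})\,dV = -\int_B \frac{1}{\sigma}\,(\operatorname{curl}\boldsymbol{B}\cdot\operatorname{curl}\boldsymbol{B})\,dV + \int_{\partial B}\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\cdot(\boldsymbol{n}\times\boldsymbol{B})\,dS. \tag{25}$$

Decomposing $\boldsymbol{B}$ into its toroidal and spheroidal parts, $\boldsymbol{B} = \boldsymbol{B}_T + \boldsymbol{B}_S$, the surface integral on the right-hand side then reads as

$$\int_{\partial B}\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\cdot(\boldsymbol{n}\times\boldsymbol{B})\,dS = \int_{\partial B}\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}\cdot\left[(\boldsymbol{n}\times\boldsymbol{B}_T) + (\boldsymbol{n}\times\boldsymbol{B}_S)\right]dS. \tag{26}$$

Both of the integrals on the right-hand side are equal to zero because of conditions (22) and (23). Moreover, integrating Eq. 25 over time from $0$ to $T$ and using the initial condition $\boldsymbol{B} = 0$ at $t = 0$, one obtains, with the left-hand side evaluated at time $T$,

$$\frac{\mu}{2}\int_B (\boldsymbol{B}\cdot\boldsymbol{B})\,dV = -\int_0^T\left[\int_B \frac{1}{\sigma}\,(\operatorname{curl}\boldsymbol{B}\cdot\operatorname{curl}\boldsymbol{B})\,dV\right]dt. \tag{27}$$

Since $\sigma > 0$ in $B$, the term on the right-hand side is always equal to or less than zero. On the other hand, the energy integral on the left-hand side is positive or equal to zero. Equation 27 can therefore only be satisfied if the difference field $\boldsymbol{B}$ is equal to zero at any time $t > 0$. Hence, Eq. 9, together with the boundary conditions (14) and (17) and the initial condition (18) satisfying divergence-free condition (5), ensures the uniqueness of the solution.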
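The energy argument used in the uniqueness proof can be illustrated numerically on a much simpler problem. The sketch below is not the solver of this chapter: it assumes a 1-D analog of the diffusion equation, mu du/dt = d/dx((1/sigma) du/dx), with homogeneous Dirichlet conditions, a made-up positive conductivity profile, and an implicit Euler time step. Under these assumptions, the discrete energy (mu/2) times the integral of u squared should never increase, mirroring the sign argument above.

```python
import numpy as np

mu = 4e-7 * np.pi                 # vacuum permeability (H/m)
n = 200                           # number of interior grid points
x = np.linspace(0.0, 1.0, n + 2)  # grid including both boundary points
dx = x[1] - x[0]

# Assumed positive conductivity profile (S/m), evaluated at cell interfaces
x_half = 0.5 * (x[:-1] + x[1:])
sigma_half = 1.0 + 0.5 * np.sin(2.0 * np.pi * x_half)
w = 1.0 / sigma_half              # resistivity at the n + 1 interfaces

# Symmetric, negative-definite discretization of d/dx((1/sigma) d/dx)
main = -(w[:-1] + w[1:]) / dx**2
off = w[1:-1] / dx**2
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

dt = 1e-9                                          # time step (s)
step = np.linalg.inv(np.eye(n) - (dt / mu) * L)    # implicit Euler propagator

# Arbitrary initial field that vanishes on both boundaries
u = np.sin(np.pi * x[1:-1]) + 0.3 * np.sin(5.0 * np.pi * x[1:-1])
energies = []
for _ in range(50):
    energies.append(0.5 * mu * np.sum(u**2) * dx)  # discrete energy integral
    u = step @ u
energies = np.array(energies)
```

A field that starts at zero would therefore stay at zero, which is the discrete counterpart of the statement that the difference of two solutions vanishes.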

3.2 Special Case: EM Induction in an Axisymmetric Case

Adopting the above assumptions for the case of EM induction for CHAMP magnetic data (see Sect. 2), the discussion is confined to the problem with axisymmetric electrical conductivity $\sigma = \sigma(r, \vartheta)$, and it is assumed that the variations of the magnetic field recorded by the CHAMP magnetometer are induced by a purely zonal external source. Under these two restrictions, both the inducing and induced parts of magnetic induction $\boldsymbol{B}$ are axisymmetric vector fields that may be represented in terms of zonal spherical vector harmonics $\boldsymbol{Y}_j^{\ell}(\vartheta)$ (Varshalovich et al. 1989, Sect. 7.3). Using these harmonics, the toroidal-spheroidal decomposition of magnetic induction $\boldsymbol{B}$ can be expressed in the form

$$\boldsymbol{B}(r, \vartheta) = \boldsymbol{B}_T(r, \vartheta) + \boldsymbol{B}_S(r, \vartheta) = \sum_{j=1}^{\infty}\sum_{\ell=j-1}^{j+1} B_j^{\ell}(r)\,\boldsymbol{Y}_j^{\ell}(\vartheta), \tag{28}$$

where $\boldsymbol{Y}_j^{j}(\vartheta)$ and $\boldsymbol{Y}_j^{j\pm1}(\vartheta)$ are zonal toroidal and spheroidal vector spherical harmonics, respectively, and $B_j^{j}(r)$ and $B_j^{j\pm1}(r)$ are spherical harmonic expansion coefficients of the toroidal ($\boldsymbol{B}_T$) and spheroidal ($\boldsymbol{B}_S$) parts of $\boldsymbol{B}$, respectively. Moreover, the $r$ and $\vartheta$ components of $\boldsymbol{Y}_j^{j}(\vartheta)$ and the $\varphi$ component of $\boldsymbol{Y}_j^{j\pm1}(\vartheta)$ are identically equal to zero, which means that the toroidal-spheroidal decomposition of $\boldsymbol{B}$ is also decoupled with respect to the spherical components of $\boldsymbol{B}_T$ and $\boldsymbol{B}_S$. The toroidal-spheroidal decomposition (28) can be introduced such that the toroidal part $\boldsymbol{B}_T$ is divergence-free and the $r$ component of the curl of the spheroidal part $\boldsymbol{B}_S$ vanishes:

$$\operatorname{div}\boldsymbol{B}_T = 0, \qquad \boldsymbol{e}_r \cdot \operatorname{curl}\boldsymbol{B}_S = 0. \tag{29}$$

It should be emphasized that the axisymmetric geometry of the problem allows one to abbreviate the notation and drop the angular-order index $m = 0$ for scalar and vector spherical harmonics. The IBVP formulated in the previous section is now examined for the axisymmetric case. First, the product $\sigma\boldsymbol{B}$ is decomposed into the toroidal and spheroidal parts. Considering the two differential identities

$$\operatorname{div}(\sigma\boldsymbol{B}) = \sigma\operatorname{div}\boldsymbol{B} + \operatorname{grad}\sigma\cdot\boldsymbol{B}, \qquad \operatorname{curl}(\sigma\boldsymbol{B}) = \sigma\operatorname{curl}\boldsymbol{B} + \operatorname{grad}\sigma\times\boldsymbol{B} \tag{30}$$

for $\boldsymbol{B}_T$ and $\boldsymbol{B}_S$, respectively, and realizing that (i) for an axisymmetric electrical conductivity $\sigma$, the $\varphi$ component of the vector $\operatorname{grad}\sigma$ is identically equal to zero, and (ii) for purely zonal behavior of $\boldsymbol{B}_T$ and $\boldsymbol{B}_S$, the $r$ and $\vartheta$ components of $\boldsymbol{B}_T$ and the $\varphi$ component of $\boldsymbol{B}_S$ are identically equal to zero, it is found that $\operatorname{grad}\sigma\cdot\boldsymbol{B}_T = 0$ and $\boldsymbol{e}_r\cdot(\operatorname{grad}\sigma\times\boldsymbol{B}_S) = 0$. Then

$$\operatorname{div}(\sigma\boldsymbol{B}_T) = 0, \qquad \boldsymbol{e}_r\cdot\operatorname{curl}(\sigma\boldsymbol{B}_S) = 0. \tag{31}$$
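The two vanishing products can be checked directly in spherical components. The lines below are a worked restatement of points (i) and (ii), with the component names $B_r$, $B_\vartheta$, $B_\varphi$ introduced here only for this check:

```latex
\begin{align*}
\operatorname{grad}\sigma &= \Bigl(\frac{\partial\sigma}{\partial r},\;
   \frac{1}{r}\frac{\partial\sigma}{\partial\vartheta},\; 0\Bigr),
 \qquad \boldsymbol{B}_T = (0,\,0,\,B_\varphi),
 \qquad \boldsymbol{B}_S = (B_r,\,B_\vartheta,\,0),\\
\operatorname{grad}\sigma\cdot\boldsymbol{B}_T
 &= \frac{\partial\sigma}{\partial r}\cdot 0
  + \frac{1}{r}\frac{\partial\sigma}{\partial\vartheta}\cdot 0
  + 0\cdot B_\varphi = 0,\\
\operatorname{grad}\sigma\times\boldsymbol{B}_S
 &= \Bigl(\frac{\partial\sigma}{\partial r}\,B_\vartheta
  - \frac{1}{r}\frac{\partial\sigma}{\partial\vartheta}\,B_r\Bigr)\boldsymbol{e}_\varphi
 \quad\Longrightarrow\quad
 \boldsymbol{e}_r\cdot(\operatorname{grad}\sigma\times\boldsymbol{B}_S) = 0 .
\end{align*}
```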

In other words, the product of $\sigma$ with $\boldsymbol{B}_T$ and $\boldsymbol{B}_S$ results again in toroidal and spheroidal vectors, respectively. Note that such a “decoupled” spheroidal-toroidal decomposition of $\sigma\boldsymbol{B}$ is broken once either of the two basic assumptions of the axisymmetric geometry of the problem is violated. Furthermore, since the rotation of a spheroidal vector is a toroidal vector and vice versa, the IBVP for an axisymmetric case can be split into two decoupled IBVPs: (1) the problem formulated for the spheroidal magnetic induction $\boldsymbol{B}_S$:

$$\operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}_S\right) + \mu\frac{\partial\boldsymbol{B}_S}{\partial t} = 0 \quad \text{in } B, \tag{32}$$

$$\operatorname{div}\boldsymbol{B}_S = 0 \quad \text{in } B, \tag{33}$$

$$\boldsymbol{n}\times\boldsymbol{B}_S = \boldsymbol{b}_t \quad \text{on } \partial B, \tag{34}$$

with an inhomogeneous initial condition $\boldsymbol{B}_S = \boldsymbol{B}_S^0$ at $t = 0$, and (2) the problem for the toroidal magnetic induction $\boldsymbol{B}_T$:

$$\operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\boldsymbol{B}_T\right) + \mu\frac{\partial\boldsymbol{B}_T}{\partial t} = 0 \quad \text{in } B, \tag{35}$$

with homogeneous initial and boundary conditions. Attention is hereafter turned to the first IBVP, since the latter one has only a trivial solution $\boldsymbol{B}_T = 0$ in $B$. Note again that such a “decoupled” spheroidal-toroidal decomposition of the boundary-value problem for EM induction cannot be achieved if the conductivity $\sigma$ depends also on the longitude $\varphi$ and/or when the external excitation source has not only zonal, but also tesseral and sectoral spherical components. For convenience, the toroidal vector potential $\boldsymbol{A}$ ($\boldsymbol{A}$ is not labeled by subscript $T$ since its spheroidal counterpart is not used in this text) that generates the spheroidal magnetic induction $\boldsymbol{B}_S$ is introduced:

$$\boldsymbol{B}_S = \operatorname{curl}\boldsymbol{A}, \qquad \operatorname{div}\boldsymbol{A} = 0. \tag{36}$$

By this prescription, the divergence-free constraint (33) is automatically satisfied and the IBVP for the spheroidal magnetic induction $\boldsymbol{B}_S$ is transformed to the IBVP for the toroidal vector potential $\boldsymbol{A} = \boldsymbol{A}(r, \vartheta)$. In the classical mathematical formulation, the toroidal vector potential $\boldsymbol{A} \in C^2(B) \times C^1(\langle 0, \infty))$ is searched for such that $\boldsymbol{B}_S = \operatorname{curl}\boldsymbol{A}$ and

$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\boldsymbol{A} + \sigma\frac{\partial\boldsymbol{A}}{\partial t} = 0 \quad \text{in } B, \tag{37}$$

$$\operatorname{div}\boldsymbol{A} = 0 \quad \text{in } B, \tag{38}$$

$$\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A} = \boldsymbol{b}_t \quad \text{on } \partial B, \tag{39}$$

$$\boldsymbol{A}\big|_{t=0} = \boldsymbol{A}^0 \quad \text{in } B, \tag{40}$$

where the conductivity $\sigma \geq 0$ is a continuous function in $B$, $\sigma \in C(\bar{B})$, $\mu > 0$ is the constant permeability of a vacuum, $\boldsymbol{b}_t(\vartheta, t) \in C^2(\partial B) \times C^1(\langle 0, \infty))$, and $\boldsymbol{A}^0$ is the generating potential for the initial magnetic induction $\boldsymbol{B}_S^0$ such that $\boldsymbol{B}_S^0 = \operatorname{curl}\boldsymbol{A}^0$. At the internal interfaces, where the electrical conductivity changes discontinuously, the continuity of the tangential components of magnetic induction and electric intensity is required. The various functional spaces used in this approach are listed in Table 1.

Table 1 List of functional spaces used
$C(\bar{D})$: Space of the continuous functions defined in the domain $\bar{D}$ ($\bar{D}$ is the closure of $D$)
$C^1(\langle 0, \infty))$: Space of the functions for which the classical derivatives up to first order are continuous in the interval $\langle 0, \infty)$
$C^2(D)$: Space of the functions for which the classical derivatives up to second order are continuous in $D$
$L_2(D)$: Space of square-integrable functions in $D$

The magnetic diffusion equation (37) for $\boldsymbol{A}$ follows from Maxwell’s equation (3). To satisfy Faraday’s law (4), the electric intensity has to be a toroidal vector, $\boldsymbol{E} = \boldsymbol{E}_T$, of the form

$$\boldsymbol{E}_T = -\frac{\partial\boldsymbol{A}}{\partial t}. \tag{41}$$

Note that the electric intensity $\boldsymbol{E}_T$ is not generated by the gradient of a scalar electromagnetic potential, since the gradient of a scalar results in a spheroidal vector that would contradict the requirement that $\boldsymbol{E}_T$ is a toroidal vector. Under the prescription (41), the continuity condition (19) of the tangential components of the electric intensity can be ensured by the continuity of the toroidal vector potential $\boldsymbol{A}$,

$$\boldsymbol{A} = \boldsymbol{A}_0 \quad \text{on } \partial B, \tag{42}$$

since $\boldsymbol{A}$ has only nonzero tangential components.
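The step from Maxwell’s equations to the diffusion equation for the vector potential can be written out explicitly. The lines below use only the curl equation for the magnetic induction, Faraday’s law, and the two prescriptions for the spheroidal magnetic induction and the toroidal electric intensity given above:

```latex
\begin{align*}
\operatorname{curl}\boldsymbol{E}_T
 &= -\frac{\partial}{\partial t}\operatorname{curl}\boldsymbol{A}
  = -\frac{\partial\boldsymbol{B}_S}{\partial t}
 && \text{(Faraday's law (4) holds identically)},\\
\operatorname{curl}\operatorname{curl}\boldsymbol{A}
 &= \operatorname{curl}\boldsymbol{B}_S
  = \mu\sigma\boldsymbol{E}_T
  = -\mu\sigma\frac{\partial\boldsymbol{A}}{\partial t}
 && \text{(Eq. 3 with Eq. 41)},\\
\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\boldsymbol{A}
  + \sigma\frac{\partial\boldsymbol{A}}{\partial t} &= 0
 && \text{(Eq. 37)}.
\end{align*}
```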

3.3 Gauss Representation of Magnetic Induction in the Atmosphere

As mentioned above, the Earth’s atmosphere in the vicinity of the Earth is assumed to be nonconducting, with the magnetic induction $\boldsymbol{B}_0$ generated by the magnetic scalar potential $U$, which is a harmonic function satisfying Laplace’s equation (12). Under the assumption of axisymmetric geometry (see Sect. 2), its solution is given in terms of zonal solid scalar spherical harmonics $r^{j}Y_j(\vartheta)$ and $r^{-j-1}Y_j(\vartheta)$:

$$U(r, \vartheta, t) = a\sum_{j=1}^{\infty}\left[\left(\frac{r}{a}\right)^{j} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+1} G_j^{(i)}(t)\right] Y_j(\vartheta) \quad \text{for } r \geq a, \tag{43}$$

where $a$ is the radius of the conducting sphere $B$, which is equal to the mean radius of the Earth, and $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are the time-dependent, zonal spherical harmonic Gauss coefficients of the external and internal magnetic fields, respectively. Using the following formula for the gradient of a scalar function $f(r)Y_j(\vartheta)$ in spherical coordinates (Varshalovich et al. 1989, p. 217),

$$\operatorname{grad}\left[f(r)Y_j(\vartheta)\right] = \sqrt{\frac{j}{2j+1}}\left(\frac{d}{dr} + \frac{j+1}{r}\right) f(r)\,\boldsymbol{Y}_j^{j-1}(\vartheta) - \sqrt{\frac{j+1}{2j+1}}\left(\frac{d}{dr} - \frac{j}{r}\right) f(r)\,\boldsymbol{Y}_j^{j+1}(\vartheta), \tag{44}$$

the magnetic induction in a vacuum $(r \geq a)$ may be expressed in terms of solid vector spherical harmonics as

$$\boldsymbol{B}_0(r, \vartheta, t) = -\sum_{j=1}^{\infty}\left[\sqrt{j(2j+1)}\left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t)\,\boldsymbol{Y}_j^{j-1}(\vartheta) + \sqrt{(j+1)(2j+1)}\left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t)\,\boldsymbol{Y}_j^{j+1}(\vartheta)\right]. \tag{45}$$


This formula again demonstrates the fact that the toroidal component of the magnetic field in a vacuum vanishes, $\boldsymbol{B}_{0,T} = 0$. For the following considerations, it is convenient to express the magnetic induction $\boldsymbol{B}_0$ in terms of the toroidal vector potential $\boldsymbol{A}_0$ such that $\boldsymbol{B}_0 = \operatorname{curl}\boldsymbol{A}_0$. Using the rotation formulae (222) and (223) for vector spherical harmonics, the spherical harmonic representation of the toroidal vector potential in a vacuum reads as

$$\boldsymbol{A}_0(r, \vartheta, t) = a\sum_{j=1}^{\infty}\left[\sqrt{\frac{j}{j+1}}\left(\frac{r}{a}\right)^{j} G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}}\left(\frac{a}{r}\right)^{j+1} G_j^{(i)}(t)\right]\boldsymbol{Y}_j^{j}(\vartheta). \tag{46}$$

The representation (46) of $\boldsymbol{A}_0$ by solid spherical harmonics $r^{j}\boldsymbol{Y}_j^{j}(\vartheta)$ and $r^{-j-1}\boldsymbol{Y}_j^{j}(\vartheta)$ is consistent with the fact that $\boldsymbol{A}_0$ satisfies the vector Laplace equation, $\nabla^2\boldsymbol{A}_0 = 0$, as seen from Eqs. 37 and 38 for $\sigma = 0$. As introduced above, $\partial B$ is a sphere (of radius $a$) with the external normal $\boldsymbol{n}$ coinciding with the spherical base vector $\boldsymbol{e}_r$, that is, $\boldsymbol{n} = \boldsymbol{e}_r$. Taking into account expression (213) for the polar components of the vector spherical harmonics, the horizontal northward $X$ component of the magnetic induction vector $\boldsymbol{B}_0$ at radius $r \geq a$ is

$$X(r, \vartheta, t) := -\boldsymbol{e}_\vartheta\cdot\boldsymbol{B}_0 = \sum_{j=1}^{\infty} X_j(r, t)\,\frac{\partial Y_j(\vartheta)}{\partial\vartheta}, \tag{47}$$

where $\boldsymbol{e}_\vartheta$ is the spherical base vector in the colatitudinal direction. The spherical harmonic coefficients $X_j$ are expressed in the form

$$X_j(r, t) = \left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t). \tag{48}$$

Similarly, the spherical harmonic representation of the vertical downward $Z$ component of the magnetic induction vector $\boldsymbol{B}_0$ at radius $r \geq a$ is

$$Z(r, \vartheta, t) := -\boldsymbol{e}_r\cdot\boldsymbol{B}_0 = \sum_{j=1}^{\infty} Z_j(r, t)\,Y_j(\vartheta), \tag{49}$$

where the spherical harmonic coefficients $Z_j$ are

$$Z_j(r, t) = j\left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) - (j+1)\left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t). \tag{50}$$

Equations 48 and 50 show that the coefficients $X_j$ and $Z_j$ are composed of two different linear combinations of the spherical harmonics $G_j^{(e)}$ of the external electromagnetic sources and the spherical harmonics $G_j^{(i)}$ of the induced electromagnetic field inside the Earth. Consequently, there is no need to specify these coefficients separately when $X_j$ and $Z_j$ are used as the boundary-value data for the forward and adjoint modeling of EM induction, respectively.
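The two linear combinations in Eqs. 48 and 50 can be made concrete with a short sketch. The function names, the coefficient values, and the relative radius below are illustrative assumptions, not values from this chapter. The sketch evaluates both coefficients for one degree and then inverts the 2-by-2 system, which is possible because its determinant, minus (2j + 1) times the product of the two radial factors, never vanishes.

```python
import numpy as np

def xz_coefficients(j, g_ext, g_int, r_over_a=1.0):
    """X_j and Z_j from Eqs. 48 and 50 for degree j at relative radius r/a."""
    up = r_over_a ** (j - 1)        # (r/a)^(j-1), external part
    down = r_over_a ** (-(j + 2))   # (a/r)^(j+2), internal part
    x_j = up * g_ext + down * g_int
    z_j = j * up * g_ext - (j + 1) * down * g_int
    return x_j, z_j

def gauss_from_xz(j, x_j, z_j, r_over_a=1.0):
    """Recover the external and internal Gauss coefficients from X_j and Z_j."""
    up = r_over_a ** (j - 1)
    down = r_over_a ** (-(j + 2))
    matrix = np.array([[up, down], [j * up, -(j + 1) * down]])
    g_ext, g_int = np.linalg.solve(matrix, np.array([x_j, z_j]))
    return g_ext, g_int

# Illustrative degree-1 values in nT at an assumed relative radius r/a = 1.07
x1, z1 = xz_coefficients(1, -20.0, -6.0, r_over_a=1.07)
g_ext_rec, g_int_rec = gauss_from_xz(1, x1, z1, r_over_a=1.07)
```

At the surface (r/a = 1) and degree 1, the same inputs give X_1 = -26 and Z_1 = -8, which follows directly from the two formulas.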


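Making use of Eq. 45 and formula (219) for the cross product of $\boldsymbol{e}_r$ with the spheroidal vector spherical harmonics $\boldsymbol{Y}_j^{j\pm1}(\vartheta)$, the tangential component of magnetic induction $\boldsymbol{B}_0$ in a vacuum $(r \geq a)$ has the form

$$\boldsymbol{e}_r\times\boldsymbol{B}_0(r, \vartheta, t) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\left[\left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t)\right]\boldsymbol{Y}_j^{j}(\vartheta), \tag{51}$$

which, in view of Eq. 48, can be rewritten in terms of the spherical harmonic coefficients $X_j(r, t)$:

$$\boldsymbol{e}_r\times\boldsymbol{B}_0(r, \vartheta, t) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,X_j(r, t)\,\boldsymbol{Y}_j^{j}(\vartheta). \tag{52}$$

In other words, the axisymmetric geometry allows the determination of $\boldsymbol{e}_r\times\boldsymbol{B}_0$ from the horizontal northward $X$ component of the magnetic induction vector $\boldsymbol{B}_0$. In particular, the ground magnetic observation vector $\boldsymbol{b}_t$, defined by Eq. 17, can be expressed as

$$\boldsymbol{b}_t(\vartheta, t) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\left[G_j^{(e)}(t) + G_j^{(i)}(t)\right]\boldsymbol{Y}_j^{j}(\vartheta) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,X_j(a, t)\,\boldsymbol{Y}_j^{j}(\vartheta). \tag{53}$$

Likewise, the satellite magnetic observation vector $\boldsymbol{B}_t$, defined by

$$\boldsymbol{B}_t := \boldsymbol{n}\times\boldsymbol{B}_0\big|_{\partial A}, \tag{54}$$

where $\partial A$ is the mean-orbit sphere of radius $r = b$, can be expressed in terms of the external and internal Gauss coefficients and spherical harmonic coefficients $X_j(b, t)$, respectively, as

$$\boldsymbol{B}_t(\vartheta, t) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\left[\left(\frac{b}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{b}\right)^{j+2} G_j^{(i)}(t)\right]\boldsymbol{Y}_j^{j}(\vartheta) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,X_j(b, t)\,\boldsymbol{Y}_j^{j}(\vartheta). \tag{55}$$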
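The relations between the potential of Eq. 43 and the series for the X and Z components (Eqs. 47 to 50) can be checked numerically. The sketch below makes several assumptions that are not taken from this chapter: the zonal harmonics are normalized as sqrt((2j+1)/(4 pi)) P_j(cos theta), the reference radius and orbit altitude are round illustrative numbers, and the Gauss coefficients are made up. With the magnetic induction given by the negative gradient of the potential, X equals (1/r) dU/dtheta and Z equals dU/dr, which is what the finite differences at the end test.

```python
import numpy as np
from numpy.polynomial import legendre

A_REF = 6371.2e3  # assumed reference radius a in metres

def y_zonal(j, theta):
    """Assumed normalization: Y_j = sqrt((2j+1)/(4 pi)) P_j(cos theta)."""
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0
    return np.sqrt((2 * j + 1) / (4 * np.pi)) * legendre.legval(np.cos(theta), coeffs)

def dy_zonal(j, theta):
    """Analytic derivative dY_j/dtheta = -sin(theta) * norm * P_j'(cos theta)."""
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0
    dp = legendre.legval(np.cos(theta), legendre.legder(coeffs))
    return -np.sin(theta) * np.sqrt((2 * j + 1) / (4 * np.pi)) * dp

def potential(r, theta, g_ext, g_int):
    """Truncated Eq. 43; g_ext[k] and g_int[k] belong to degree j = k + 1."""
    total = 0.0
    for j, (ge, gi) in enumerate(zip(g_ext, g_int), start=1):
        radial = (r / A_REF) ** j * ge + (A_REF / r) ** (j + 1) * gi
        total += radial * y_zonal(j, theta)
    return A_REF * total

def x_and_z(r, theta, g_ext, g_int):
    """Truncated series of Eqs. 47 to 50."""
    x_val, z_val = 0.0, 0.0
    for j, (ge, gi) in enumerate(zip(g_ext, g_int), start=1):
        up, down = (r / A_REF) ** (j - 1), (A_REF / r) ** (j + 2)
        x_val += (up * ge + down * gi) * dy_zonal(j, theta)
        z_val += (j * up * ge - (j + 1) * down * gi) * y_zonal(j, theta)
    return x_val, z_val

g_ext, g_int = [-20.0, 1.5, -0.4], [-6.0, 0.5, -0.1]   # illustrative values (nT)
r, theta = A_REF + 430e3, np.deg2rad(60.0)
x_val, z_val = x_and_z(r, theta, g_ext, g_int)

# Centered finite differences of U: X = (1/r) dU/dtheta, Z = dU/dr
h, dr = 1e-5, 10.0
x_fd = (potential(r, theta + h, g_ext, g_int)
        - potential(r, theta - h, g_ext, g_int)) / (2 * h * r)
z_fd = (potential(r + dr, theta, g_ext, g_int)
        - potential(r - dr, theta, g_ext, g_int)) / (2 * dr)
```

The check does not depend on the chosen normalization, since the same zonal harmonics appear in the potential and in both component series.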

4 Forward Method of EM Induction for the X Component of CHAMP Magnetic Data

The forward method of EM induction can be formulated for at least two kinds of boundary-value data. Either the X component of the CHAMP magnetic data (considered in this section) or the Gauss coefficients of the external magnetic field (the next section) along the track of the CHAMP satellite is specified. Most of the considerations in this section follow the papers by Martinec (1997), Martinec et al. (2003), and Martinec and McCreadie (2004).


4.1 Classical Formulation

The IBVP (37)–(40) assumes that magnetic data $\boldsymbol{b}_t$ are prescribed on the Earth’s surface. For satellite measurements, this requires the continuation of magnetic data from satellite-orbit altitudes down to the Earth’s surface. Since the downward continuation of satellite magnetic data poses a fundamental problem, a modification of the IBVP (37)–(40) such that the X component of CHAMP magnetic data is used directly as boundary values at satellite altitudes is given in this section. The solution domain is extended by the atmosphere $A$ surrounding the conducting sphere $B$. Since only the magnetic signals from night-time, mid-latitude tracks are considered, it is assumed that there are no electric currents in $A$. This assumption is not completely correct, but it is still a good approximation (Langel and Estes 1985b). Moreover, $A$ is treated as a nonconducting spherical layer with the inner boundary coinciding with the surface $\partial B$ of the conducting sphere $B$ with radius $r = a$ and the outer boundary coinciding with the mean-orbit sphere $\partial A$ of radius $r = b$. The classical mathematical formulation of the IBVP of global EM induction for satellite magnetic data is as follows. Find the toroidal vector potential $\boldsymbol{A}$ in the conducting sphere $B$ and the toroidal vector potential $\boldsymbol{A}_0$ in the nonconducting atmosphere $A$ such that the magnetic induction vectors in $B$ and $A$ are expressed in the forms $\boldsymbol{B} = \operatorname{curl}\boldsymbol{A}$ and $\boldsymbol{B}_0 = \operatorname{curl}\boldsymbol{A}_0$, respectively, and, for $t > 0$, it holds that

$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\boldsymbol{A} + \sigma\frac{\partial\boldsymbol{A}}{\partial t} = 0 \quad \text{in } B, \tag{56}$$

$$\operatorname{div}\boldsymbol{A} = 0 \quad \text{in } B, \tag{57}$$

$$\operatorname{curl}\operatorname{curl}\boldsymbol{A}_0 = 0 \quad \text{in } A, \tag{58}$$

$$\operatorname{div}\boldsymbol{A}_0 = 0 \quad \text{in } A, \tag{59}$$

$$\boldsymbol{A} = \boldsymbol{A}_0 \quad \text{on } \partial B, \tag{60}$$

$$\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A} = \boldsymbol{n}\times\operatorname{curl}\boldsymbol{A}_0 \quad \text{on } \partial B, \tag{61}$$

$$\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A}_0 = \boldsymbol{B}_t \quad \text{on } \partial A, \tag{62}$$

$$\boldsymbol{A}\big|_{t=0} = \boldsymbol{A}^0 \quad \text{in } B \cup A, \tag{63}$$

where the mathematical assumptions imposed on the functions $\boldsymbol{A}$, $\boldsymbol{A}_0$, $\sigma$, $\mu$, $\boldsymbol{B}_t$, and $\boldsymbol{A}^0$ are the same as for the IBVP (37)–(40), see Table 1. The continuity condition (60) on $\boldsymbol{A}$ is imposed on a solution since the intention is to apply a different parameterization of $\boldsymbol{A}$ in the sphere $B$ and the spherical layer $A$. The term $\boldsymbol{B}_t$ represents the tangential components of the magnetic induction $\boldsymbol{B}_0$ at the satellite altitudes and $\boldsymbol{n}$ is the unit normal to $\partial A$. The axisymmetric geometry allows $\boldsymbol{B}_t$ to be determined (see Sect. 7) from the horizontal northward $X$ component of the magnetic induction vector $\boldsymbol{B}$ measured by the CHAMP vector magnetometer.
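One commonly cited difficulty with downward continuation can be read off the radial factors of the internal part of the field, which scales as (a/r)^(j+2) in the X coefficients. The sketch below is only an illustration under stated assumptions: the radii are round numbers chosen here (a mean Earth radius of 6371.2 km and an orbit 430 km above it), and the function name is hypothetical. It computes the factor by which the internal term of degree j, and any noise mapped into it, is multiplied when continued from the orbit radius b down to the surface radius a.

```python
import numpy as np

A_KM = 6371.2          # assumed mean Earth radius a (km)
B_KM = A_KM + 430.0    # assumed mean orbit radius b (km); illustrative altitude

def internal_amplification(j, a=A_KM, b=B_KM):
    """Growth factor of the internal (a/r)^(j+2) term from r = b down to r = a."""
    return (b / a) ** (j + 2)

degrees = np.arange(1, 41)
factors = internal_amplification(degrees)
```

The factor exceeds one for every degree and grows geometrically with j, so short-wavelength errors are amplified most. Using the X component directly at satellite altitude, as in the formulation above, avoids applying this factor to the data.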


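4.2 Weak Formulation

4.2.1 Ground Magnetic Data

The IBVP (37)–(40) for the ground magnetic data $\boldsymbol{b}_t$ is now reformulated in a weak sense. The solution space is introduced as

$$V := \{\boldsymbol{A} \mid \boldsymbol{A} \in L_2(B),\ \operatorname{curl}\boldsymbol{A} \in L_2(B),\ \operatorname{div}\boldsymbol{A} = 0 \text{ in } B\}, \tag{64}$$

where the functional space $L_2(B)$ is introduced in Table 1. The weak formulation of the IBVP (37)–(40) consists of finding $\boldsymbol{A} \in V \times C^1((0, \infty))$ such that at a fixed time it satisfies the following variational equation:

$$a(\boldsymbol{A}, \delta\boldsymbol{A}) + b(\boldsymbol{A}, \delta\boldsymbol{A}) = f(\delta\boldsymbol{A}) \quad \forall\,\delta\boldsymbol{A} \in V, \tag{65}$$

where the bilinear forms $a(\cdot,\cdot)$, $b(\cdot,\cdot)$ and the linear functional $f(\cdot)$ are defined as follows:

$$a(\boldsymbol{A}, \delta\boldsymbol{A}) := \frac{1}{\mu}\int_B (\operatorname{curl}\boldsymbol{A}\cdot\operatorname{curl}\delta\boldsymbol{A})\,dV, \tag{66}$$

$$b(\boldsymbol{A}, \delta\boldsymbol{A}) := \int_B \sigma(r, \vartheta)\left(\frac{\partial\boldsymbol{A}}{\partial t}\cdot\delta\boldsymbol{A}\right)dV, \tag{67}$$

$$f(\delta\boldsymbol{A}) := -\frac{1}{\mu}\int_{\partial B} (\boldsymbol{b}_t\cdot\delta\boldsymbol{A})\,dS. \tag{68}$$

It can be seen that the assumptions imposed on the potential $\boldsymbol{A}$ are weaker in the weak formulation than in the classical formulation. Moreover, the assumptions concerning the electrical conductivity $\sigma$ and the boundary data $\boldsymbol{b}_t$ can also be made weaker in the weak formulation. It is sufficient to assume that the electrical conductivity is a square-integrable function in $B$, $\sigma \in L_2(B)$, and the boundary data at a fixed time is a square-integrable function on $\partial B$, $\boldsymbol{b}_t \in L_2(\partial B) \times C^1((0, \infty))$. To show that the weak solution generalizes the classical solution to the problem (37)–(40), it is for the moment assumed that the weak solution $\boldsymbol{A}$ is sufficiently smooth and belongs to $C^2(B)$. Then, the following Green’s theorem is valid:

$$\int_B (\operatorname{curl}\boldsymbol{A}\cdot\operatorname{curl}\delta\boldsymbol{A})\,dV = \int_B (\operatorname{curl}\operatorname{curl}\boldsymbol{A}\cdot\delta\boldsymbol{A})\,dV - \int_{\partial B} (\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A})\cdot\delta\boldsymbol{A}\,dS. \tag{69}$$

In view of this, the variational equation (65) can be rewritten as follows:

$$\frac{1}{\mu}\int_B (\operatorname{curl}\operatorname{curl}\boldsymbol{A}\cdot\delta\boldsymbol{A})\,dV - \frac{1}{\mu}\int_{\partial B} (\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A})\cdot\delta\boldsymbol{A}\,dS + \int_B \sigma\left(\frac{\partial\boldsymbol{A}}{\partial t}\cdot\delta\boldsymbol{A}\right)dV = -\frac{1}{\mu}\int_{\partial B} (\boldsymbol{b}_t\cdot\delta\boldsymbol{A})\,dS. \tag{70}$$

Taking first Eq. 70 only for the test functions $\delta\boldsymbol{A} \in C_0^\infty(B)$, where $C_0^\infty(B)$ is the space of infinitely differentiable functions with compact support in $B$, and making use of the implication

$$\boldsymbol{f} \in L_2(B),\quad \int_B (\boldsymbol{f}\cdot\delta\boldsymbol{A})\,dV = 0 \quad \forall\,\delta\boldsymbol{A} \in C_0^\infty(B) \ \Rightarrow\ \boldsymbol{f} = 0 \text{ in } B, \tag{71}$$

Eq. 37 is proved. To obtain the boundary condition (39), the following implication is used:

$$\boldsymbol{f} \in L_2(\partial B),\quad \int_{\partial B} (\boldsymbol{f}\cdot\delta\boldsymbol{A})\,dS = 0 \quad \forall\,\delta\boldsymbol{A} \in C^\infty(B) \ \Rightarrow\ \boldsymbol{f} = 0 \text{ on } \partial B, \tag{72}$$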

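where $C^\infty(B)$ is the space of infinitely differentiable functions in $B$. It can be seen that if a weak solution of the problem exists and is sufficiently smooth, for instance, if $\boldsymbol{A} \in C^2(B)$, then this solution satisfies the differential equation (37) and the boundary condition (39), all taken at time $t$. Thus, the weak solution generalizes the classical solution in $C^2(B)$, since the weak solution may exist even though the classical solution does not exist. However, if the classical solution exists, it is also the weak solution (Křížek and Neittaanmäki 1990).

4.2.2 Satellite Magnetic Data

Turning the attention now to the weak formulation of IBVP (56)–(63) for satellite magnetic data $\boldsymbol{B}_t$, the intention is to apply different parameterizations of the potentials $\boldsymbol{A}$ and $\boldsymbol{A}_0$. In addition to the solution space $V$ for the conducting sphere $B$, the solution space $V_0$ for the nonconducting atmosphere $A$ is introduced:

$$V_0 := \{\boldsymbol{A}_0 \mid \boldsymbol{A}_0 \in C^2(A),\ \operatorname{div}\boldsymbol{A}_0 = 0 \text{ in } A\}. \tag{73}$$

Note that the continuity condition (60) is not imposed on either of the solution spaces $V$ and $V_0$. Instead, the Lagrange multiplier vector $\boldsymbol{\lambda}$ and a solution space for it are introduced:

$$V_\lambda := \{\boldsymbol{\lambda} \mid \boldsymbol{\lambda} \in L_2(\partial B)\}. \tag{74}$$

The weak formulation of the IBVP (56)–(63) consists of finding $(\boldsymbol{A}, \boldsymbol{A}_0, \boldsymbol{\lambda}) \in (V, V_0, V_\lambda) \times C^1((0, \infty))$ such that at a fixed time they satisfy the following variational equation:

$$a(\boldsymbol{A}, \delta\boldsymbol{A}) + b(\boldsymbol{A}, \delta\boldsymbol{A}) + a_0(\boldsymbol{A}_0, \delta\boldsymbol{A}_0) + c(\delta\boldsymbol{A} - \delta\boldsymbol{A}_0, \boldsymbol{\lambda}) + c(\boldsymbol{A} - \boldsymbol{A}_0, \delta\boldsymbol{\lambda}) = F(\delta\boldsymbol{A}_0) \quad \forall\,\delta\boldsymbol{A} \in V,\ \forall\,\delta\boldsymbol{A}_0 \in V_0,\ \forall\,\delta\boldsymbol{\lambda} \in V_\lambda, \tag{75}$$

where the bilinear forms $a(\cdot,\cdot)$ and $b(\cdot,\cdot)$ are defined by Eqs. 66 and 67, and the additional bilinear forms $a_0(\cdot,\cdot)$ and $c(\cdot,\cdot)$ and the new linear functional $F(\cdot)$ are defined as follows:

$$a_0(\boldsymbol{A}_0, \delta\boldsymbol{A}_0) := \frac{1}{\mu}\int_A (\operatorname{curl}\boldsymbol{A}_0\cdot\operatorname{curl}\delta\boldsymbol{A}_0)\,dV, \tag{76}$$

$$c(\boldsymbol{A} - \boldsymbol{A}_0, \boldsymbol{\lambda}) := \int_{\partial B} (\boldsymbol{A} - \boldsymbol{A}_0)\cdot\boldsymbol{\lambda}\,dS, \tag{77}$$

$$F(\delta\boldsymbol{A}_0) := -\frac{1}{\mu}\int_{\partial A} (\boldsymbol{B}_t\cdot\delta\boldsymbol{A}_0)\,dS. \tag{78}$$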


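To show that the weak solution generalizes the classical solution to the problem (56)–(63), it is again assumed that the weak solution $A$ is sufficiently smooth, $A \in C^2(\mathcal{B})$. By making use of Green's theorem (69), the variational equation (75) can be written as

$$\begin{aligned}
&\frac{1}{\mu}\int_{\mathcal{B}} (\operatorname{curl}\operatorname{curl} A \cdot \delta A)\, dV - \frac{1}{\mu}\int_{\partial\mathcal{B}} (n \times \operatorname{curl} A)\cdot \delta A\, dS + \int_{\mathcal{B}} \sigma \frac{\partial A}{\partial t}\cdot \delta A\, dV \\
&\quad + \frac{1}{\mu}\int_{\mathcal{A}} (\operatorname{curl}\operatorname{curl} A_0 \cdot \delta A_0)\, dV - \frac{1}{\mu}\int_{\partial\mathcal{A}} (n \times \operatorname{curl} A_0)\cdot \delta A_0\, dS \\
&\quad + \frac{1}{\mu}\int_{\partial\mathcal{B}} (n \times \operatorname{curl} A_0)\cdot \delta A_0\, dS + \int_{\partial\mathcal{B}} (\delta A - \delta A_0)\cdot \lambda\, dS \\
&\quad + \int_{\partial\mathcal{B}} (A - A_0)\cdot \delta\lambda\, dS = -\frac{1}{\mu}\int_{\partial\mathcal{A}} (B_t \cdot \delta A_0)\, dS.
\end{aligned} \tag{79}$$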

Now, taking Eq. 79 for the test functions $\delta A \in C_0^\infty(\mathcal{B})$ and making use of the implication (71), Eq. 56 is proved. Likewise, taking Eq. 79 for the test functions $\delta A_0 \in C_0^\infty(\mathcal{A})$ and using the implication (71) for the domain $\mathcal{A}$, Eq. 58 is proved. To obtain the continuity conditions (60) and (61), the implication (72) is used for the test functions $\delta A \in C^\infty(\mathcal{B})$. The boundary condition (62) can be obtained in an analogous way if the implication (72) is considered for $\partial\mathcal{A}$. It can therefore be concluded that if a weak solution of the problem exists and is sufficiently smooth, for instance, if $A$ belongs to the space of functions whose second-order derivatives are continuous in $\mathcal{B}$, then this solution satisfies the differential equations (56) and (58), the interface conditions (60) and (61), and the boundary condition (62), all taken at time $t$.

4.3 Frequency-Domain and Time-Domain Solutions

Two approaches to solving the IBVPs of EM induction with respect to the time variable $t$ are now presented. The variational equation (65) is first solved in the Fourier-frequency domain, assuming that all field variables have a harmonic time dependence of the form $e^{i\omega t}$. Denoting the Fourier image of $A$ by $\hat{A}$, the weak formulation of EM induction for ground magnetic data in the frequency domain is described by the variational equation

$$a(\hat{A},\delta\hat{A}) + i\omega\, b_1(\hat{A},\delta\hat{A}) = f(\delta\hat{A}) \quad \forall \delta\hat{A} \in V, \tag{80}$$

where the new bilinear form $b_1(\hat{A},\delta\hat{A})$ is defined by

$$b_1(\hat{A},\delta\hat{A}) := \int_{\mathcal{B}} \sigma(r,\vartheta)\,(\hat{A}\cdot\delta\hat{A})\, dV. \tag{81}$$

Having solved Eq. 80 for $\hat{A}$, the solution is transformed back to the time domain by applying the inverse Fourier transform.

Alternatively, the IBVP for ground magnetic data can be solved directly in the time domain, which is the approach applied in the following. There are several choices for representing the time derivative of the toroidal vector potential $A$ in the bilinear form $b(\cdot,\cdot)$. For simplicity, the explicit Euler differencing scheme will be chosen and $\partial A/\partial t$ will be approximated by the differences of $A$ at two subsequent time levels (Press et al. 1992):

$$\frac{\partial A}{\partial t} \approx \frac{A(r,\vartheta,t_{i+1}) - A(r,\vartheta,t_i)}{t_{i+1} - t_i} =: \frac{{}^{i+1}A - {}^{i}A}{\Delta t_i}, \tag{82}$$

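where ${}^{i}A$ denotes the values of $A$ at discrete time levels $0 = t_0 < t_1 < \cdots < t_{i+1} < \cdots$. The variational equation (65), which is now solved at each time level $t_i$, $i = 0, 1, \ldots$, has the form

$$a({}^{i+1}A,\delta A) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}A,\delta A) = \frac{1}{\Delta t_i}\, b_1({}^{i}A,\delta A) + f({}^{i+1}b_t,\delta A) \quad \forall \delta A \in V, \tag{83}$$

where the bilinear form $b_1(\cdot,\cdot)$ is defined by Eq. 81. The same two approaches can be applied to the IBVP of EM induction for satellite magnetic data. Here, only the time-domain approach is presented. The variational equation (75) is discretized with respect to time and solved at each time level $t_i$:

$$\begin{aligned}
&a({}^{i+1}A,\delta A) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}A,\delta A) + a_0({}^{i+1}A_0,\delta A_0) + c(\delta A - \delta A_0, {}^{i+1}\lambda) + c({}^{i+1}A - {}^{i+1}A_0, \delta\lambda) \\
&\quad = \frac{1}{\Delta t_i}\, b_1({}^{i}A,\delta A) + F({}^{i+1}B_t,\delta A_0) \quad \forall \delta A \in V,\ \forall \delta A_0 \in V_0,\ \forall \delta\lambda \in V_\lambda.
\end{aligned} \tag{84}$$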
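The time-stepping structure of Eq. 83 can be illustrated with a small matrix sketch. It assumes that the bilinear forms $a(\cdot,\cdot)$ and $b_1(\cdot,\cdot)$ have already been discretized into matrices, here called `K` and `M`, and the functional $f$ into a vector; these names and the 2×2 values are hypothetical stand-ins, not quantities from the chapter.

```python
import numpy as np

def step(K, M, x_old, f_new, dt):
    """Advance one time level following the structure of Eq. 83:
    (K + M/dt) x_new = (M/dt) x_old + f_new."""
    lhs = K + M / dt
    rhs = M @ x_old / dt + f_new
    return np.linalg.solve(lhs, rhs)

# Hypothetical stand-ins for the discretized forms a(.,.) and b1(.,.)
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
M = np.eye(2)
forcing = np.array([1.0, 1.0])   # constant right-hand side, for illustration

x = np.array([1.0, 0.0])         # arbitrary initial state
for _ in range(200):
    x = step(K, M, x, forcing, dt=0.1)

# For constant forcing the iteration should approach the steady state K x = f
steady = np.linalg.solve(K, forcing)
```

In this sketch the unknown at the new time level appears on the left-hand side, so each step requires one linear solve with the matrix `K + M/dt`.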

4.4 Vector Spherical Harmonic Parameterization over Colatitude

For the axisymmetric geometry of external sources and the conductivity model, it has been shown that the induced electromagnetic field is axisymmetric and the associated toroidal vector potential is an axisymmetric vector. It may be represented in terms of zonal toroidal vector spherical harmonics $Y_j^j(\vartheta)$. Their explicit forms are as follows (more details are given in the Appendix):

$$Y_j^j(\vartheta) := P_{j1}(\cos\vartheta)\, e_\varphi, \tag{85}$$

where $P_{j1}(\cos\vartheta)$ is the associated Legendre function of degree $j$ and order $m = 1$ and $e_\varphi$ is the spherical base vector in the longitudinal direction. An important property of the functions $Y_j^j(\vartheta)$ is that they are divergence-free:

$$\operatorname{div}\left[ f(r)\, Y_j^j(\vartheta) \right] = 0, \tag{86}$$

where $f(r)$ is a differentiable function.

The required toroidal vector potential $A$ and test functions $\delta A$ inside the conducting sphere $\mathcal{B}$ can be represented as a series of the functions $Y_j^j(\vartheta)$:

$$\begin{Bmatrix} A(r,\vartheta,t) \\ \delta A(r,\vartheta) \end{Bmatrix} = \sum_{j=1}^{\infty} \begin{Bmatrix} A_j^j(r,t) \\ \delta A_j^j(r) \end{Bmatrix} Y_j^j(\vartheta), \tag{87}$$

where $A_j^j(r,t)$ and $\delta A_j^j(r)$ are spherical harmonic expansion coefficients. The divergence-free property of the functions $Y_j^j(\vartheta)$ implies that both the toroidal vector potential $A$ and the test functions $\delta A$ are divergence-free. Therefore, the parameterization (87) of the potentials $A$ and $\delta A$ automatically satisfies the requirement that the functions from the solution space $V$ be divergence-free. The parameterization (87) is also employed for the Lagrange multipliers $\lambda(\vartheta,t)$ and the associated test functions $\delta\lambda(\vartheta)$ with the expansion coefficients $\lambda_j^j(t)$ and $\delta\lambda_j^j$, respectively.

Introducing the spherical harmonic representation of the zonal toroidal vector $A$, the curl of $A$ is a zonal spheroidal vector:

$$\operatorname{curl} A = \sum_{j=1}^{\infty} \sum_{\ell=j-1}^{j+1,2} R_j^\ell(A; r)\, Y_j^\ell(\vartheta), \tag{88}$$

where $R_j^\ell(A; r)$ are given by Eq. 223 in the Appendix. The substitution of Eqs. 87 and 88 into Eqs. 66 and 81 leads to the spherical harmonic representation of the bilinear forms $a(\cdot,\cdot)$ and $b_1(\cdot,\cdot)$:

$$\begin{aligned}
a(A,\delta A) &= \frac{1}{\mu}\sum_{j=1}^{\infty} \sum_{\ell=j-1}^{j+1,2} \int_0^a R_j^\ell(A; r)\, R_j^\ell(\delta A; r)\, r^2\, dr, \\
b_1(A,\delta A) &= \int_0^a E(A,\delta A; r)\, r^2\, dr,
\end{aligned} \tag{89}$$

where the orthogonality property (217) of vector spherical harmonics has been employed and $E$ denotes the angular part of the ohmic energy:

$$E(A,\delta A; r) = 2\pi \int_0^\pi \sigma(r,\vartheta) \sum_{j_1=1}^{\infty} A_{j_1}^{j_1}(r,t)\, P_{j_1,1}(\cos\vartheta) \sum_{j_2=1}^{\infty} \delta A_{j_2}^{j_2}(r)\, P_{j_2,1}(\cos\vartheta)\, \sin\vartheta\, d\vartheta. \tag{90}$$

Likewise, substituting Eqs. 53 and 87 into Eq. 68 results in the spherical harmonic representation of the linear functional $f(\cdot)$:

$$f(\delta A) = \frac{a^2}{\mu} \sum_{j=1}^{\infty} \sqrt{j(j+1)} \left[ G_j^{(e)}(t) + G_j^{(i)}(t) \right] \delta A_j^j(a). \tag{91}$$

4.5 Finite-Element Approximation over the Radial Coordinate

Inside the conducting sphere $\mathcal{B}$, the range of integration $\langle 0, a\rangle$ is divided over the radial coordinate into $P$ subintervals by the nodes $0 = r_1 < r_2 < \cdots < r_P < r_{P+1} = a$. The piecewise-linear basis functions defined at the nodes by the relation $\psi_k(r_i) = \delta_{ki}$ can be used as the basis functions of the Sobolev functional space $W_2^1(0,a)$. Note that only two basis functions are nonzero in the interval $r_k \le r \le r_{k+1}$, namely,

$$\psi_k(r) = \frac{r_{k+1} - r}{h_k}, \qquad \psi_{k+1}(r) = \frac{r - r_k}{h_k}, \tag{92}$$

where $h_k = r_{k+1} - r_k$. Since both the unknown solution $A_j^j(r,t)$ and the test functions $\delta A_j^j(r)$ are elements of this functional space, they can be parameterized by piecewise-linear finite elements $\psi_k(r)$ such that

$$\begin{Bmatrix} A_j^j(r,t) \\ \delta A_j^j(r) \end{Bmatrix} = \sum_{k=1}^{P+1} \begin{Bmatrix} A_j^{j,k}(t) \\ \delta A_j^{j,k} \end{Bmatrix} \psi_k(r). \tag{93}$$

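The finite-element representation of the coefficients of curl $A$ then reads as

$$\begin{aligned}
R_j^{j-1}(A; r) &= -\sqrt{\frac{j+1}{2j+1}} \left[ \left( -\frac{1}{h_k} + \frac{j+1}{r}\,\psi_k(r) \right) A_j^{j,k} + \left( \frac{1}{h_k} + \frac{j+1}{r}\,\psi_{k+1}(r) \right) A_j^{j,k+1} \right], \\
R_j^{j+1}(A; r) &= -\sqrt{\frac{j}{2j+1}} \left[ \left( -\frac{1}{h_k} - \frac{j}{r}\,\psi_k(r) \right) A_j^{j,k} + \left( \frac{1}{h_k} - \frac{j}{r}\,\psi_{k+1}(r) \right) A_j^{j,k+1} \right],
\end{aligned} \tag{94}$$

where $r_k \le r \le r_{k+1}$. Since the electrical conductivity $\sigma(r,\vartheta) \in L_2(\mathcal{B})$, the radial dependence of $\sigma$ can be approximated by piecewise constant functions:

$$\sigma(r,\vartheta) = \sigma_k(\vartheta), \qquad r_k \le r \le r_{k+1}, \tag{95}$$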

where $\sigma_k(\vartheta)$ does not depend on the radial coordinate $r$ and may be further approximated by piecewise constant functions in colatitude $\vartheta$. (However, this approximation will not be denoted explicitly.) The integrals over $r$ in Eq. 89 can be divided into $P$ subintervals:

$$a(A,\delta A) = \frac{1}{\mu} \sum_{j=1}^{\infty} \sum_{\ell=j-1}^{j+1,2} \sum_{k=1}^{P} \int_{r_k}^{r_{k+1}} R_j^\ell(A; r)\, R_j^\ell(\delta A; r)\, r^2\, dr, \tag{96}$$

$$b_1(A,\delta A) = \sum_{k=1}^{P} \int_{r_k}^{r_{k+1}} E(A,\delta A; r)\, r^2\, dr, \tag{97}$$

and the integration over $r$ is reduced to the computation of integrals of the type

$$\int_{r_k}^{r_{k+1}} \psi_i(r)\, \psi_j(r)\, r^2\, dr, \tag{98}$$

where the indices $i$ and $j$ are equal to $k$ and/or $k+1$. These integrals can be evaluated numerically, for example, by means of the two-point Gauss-Legendre numerical quadrature with the weights equal to 1 and the nodes $x_{1,2} = \pm 1/\sqrt{3}$ (Press et al. 1992, Sect. 4.5). For instance, the quadrature formula for the integral in Eq. 97 can be written in the form

$$b_1(A,\delta A) = \sum_{k=1}^{P} \sum_{\alpha=1}^{2} E\big(A(r_\alpha), \delta A(r_\alpha); r_\alpha\big)\, \frac{r_\alpha^2 h_k}{2}, \tag{99}$$

where $r_\alpha := \tfrac{1}{2}(h_k x_\alpha + r_k + r_{k+1})$, $\alpha = 1, 2$. The integration over colatitude $\vartheta$ in the term $E$, see Eq. 90, can also be carried out numerically by the Gauss-Legendre quadrature formula. Computational details of this approach can be found in Orszag (1970) or Martinec (1989).
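The element integrals of the type (98) and the two-point rule used in Eq. 99 can be sketched as follows. The helper names `gauss2` and `hat_pair`, the symbol `psi` for the linear basis functions, and the element bounds are illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def gauss2(f, rk, rk1):
    """Two-point Gauss-Legendre rule on [rk, rk1]:
    weights 1, reference nodes +-1/sqrt(3), scaled by h/2."""
    h = rk1 - rk
    x = np.array([-1.0, 1.0]) / np.sqrt(3.0)
    r = 0.5 * (h * x + rk + rk1)      # mapped nodes r_alpha
    return 0.5 * h * np.sum(f(r))

def hat_pair(rk, rk1):
    """The two piecewise-linear basis functions that are nonzero on [rk, rk1]."""
    h = rk1 - rk
    psi_k = lambda r: (rk1 - r) / h
    psi_k1 = lambda r: (r - rk) / h
    return psi_k, psi_k1

# Hypothetical element bounds, chosen only for illustration
rk, rk1 = 1.0, 1.5
psi_k, psi_k1 = hat_pair(rk, rk1)

# Integral of type (98) with i = k, j = k + 1
approx = gauss2(lambda r: psi_k(r) * psi_k1(r) * r**2, rk, rk1)

# The two-point rule integrates polynomials up to degree 3 exactly
cubic = gauss2(lambda r: r**3, rk, rk1)
```

Note that the integrand in (98) is a polynomial of degree 4 in $r$, so the two-point rule is an approximation there rather than an exact formula; the error shrinks as the element width $h_k$ decreases.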


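4.6 Solid Vector Spherical Harmonic Parameterization of $A_0$

Our attention is now turned to the parameterization of the toroidal vector potential $A_0$ and test functions $\delta A_0$ in an insulating atmosphere $\mathcal{A}$. Equation 46 shows that $A_0$, and also $\delta A_0$, can be represented as a series of the zonal toroidal vector spherical harmonics $Y_j^j(\vartheta)$ with the spherical expansion coefficients $A_{0,j}^j$ and $\delta A_{0,j}^j$, respectively, of the form

$$\begin{Bmatrix} A_{0,j}^j(r,t) \\ \delta A_{0,j}^j(r) \end{Bmatrix} = a \left[ \sqrt{\frac{j}{j+1}} \left(\frac{r}{a}\right)^{j} \begin{Bmatrix} G_j^{(e)}(t) \\ \delta G_j^{(e)} \end{Bmatrix} - \sqrt{\frac{j+1}{j}} \left(\frac{a}{r}\right)^{j+1} \begin{Bmatrix} G_j^{(i)}(t) \\ \delta G_j^{(i)} \end{Bmatrix} \right] \quad \text{for } a \le r \le b. \tag{100}$$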

The zonal scalar-magnetic Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are considered known in the case of the IBVP for ground magnetic data as they constitute the ground magnetic data $b_t$ on $\partial\mathcal{B}$; see (53). However, for satellite magnetic data, the coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ in the insulating atmosphere $\mathcal{A}$ are, in addition to the coefficients $A_j^{j,k}(t)$, unknowns and are sought by solving the IBVP (56)–(63). The associated test-function coefficients are denoted by $\delta G_j^{(e)}$ and $\delta G_j^{(i)}$.

Applying the operator curl to the parameterization (100) and substituting the result into Eq. 76 results in the parameterization of the bilinear form $a_0(\cdot,\cdot)$:

$$a_0(A_0,\delta A_0) = \frac{a^3}{\mu} \sum_{j=1}^{\infty} \left\{ j \left[ \left(\frac{b}{a}\right)^{2j+1} - 1 \right] G_j^{(e)}(t)\, \delta G_j^{(e)} - (j+1) \left[ \left(\frac{a}{b}\right)^{2j+1} - 1 \right] G_j^{(i)}(t)\, \delta G_j^{(i)} \right\}, \tag{101}$$

where $a$ and $b$ are the radii of the spheres $\partial\mathcal{B}$ and $\partial\mathcal{A}$, respectively.

The continuity condition (60), that is, $A = A_0$ on $\partial\mathcal{B}$, is now expressed in terms of spherical harmonics. Substituting the spherical harmonic representations (87) of $A$ and (100) of $A_0$ into Eq. 60 results in the constraint between the external coefficients $G_j^{(e)}(t)$ and the internal coefficients $G_j^{(i)}(t)$ of the toroidal vector potential $A_0$ in the atmosphere $\mathcal{A}$, and the coefficients $A_j^j(a,t)$ of the toroidal vector potential $A$ in the conducting sphere $\mathcal{B}$:

$$A_j^j(a,t) = a \left[ \sqrt{\frac{j}{j+1}}\, G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}}\, G_j^{(i)}(t) \right]. \tag{102}$$

This continuity condition can be used to express the bilinear form c., /, defined by Eq. 77, in terms of spherical harmonics as follows: # " s 1 X 1 j j j .e/ .i/ j Aj .a; t / C Gj .t /  Gj .t / j .t /; c.A  A0 ; / D a j C1 j C1 j D1

(103)

where $\lambda_j^j(t)$ are the zonal toroidal vector spherical harmonic expansion coefficients of the Lagrange multiplier $\lambda$.

Finally, making use of Eqs. 55 and 100, the linear functional $F(\cdot)$ defined by Eq. 78 can be expressed in the form

$$F(\delta A_0) = \frac{b^2}{\mu} \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, X_j(b,t)\, \delta A_{0,j}^j(b), \tag{104}$$

where the spherical harmonic coefficients $\delta A_{0,j}^j(b)$ of the test functions $\delta A_0(b,\vartheta)$ are given by Eq. 100 for $r = b$:

$$\delta A_{0,j}^j(b) = a \left[ \sqrt{\frac{j}{j+1}} \left(\frac{b}{a}\right)^{j} \delta G_j^{(e)} - \sqrt{\frac{j+1}{j}} \left(\frac{a}{b}\right)^{j+1} \delta G_j^{(i)} \right]. \tag{105}$$

In view of this, the functional $F(\cdot)$ reads as

$$F(\delta A_0) = \frac{a b^2}{\mu} \sum_{j=1}^{\infty} X_j(b,t) \left[ j \left(\frac{b}{a}\right)^{j} \delta G_j^{(e)} - (j+1) \left(\frac{a}{b}\right)^{j+1} \delta G_j^{(i)} \right]. \tag{106}$$

5 Forward Method of EM Induction for the External Gauss Coefficients

The case where the time series of the CHAMP-derived coefficients $X_j^{(obs)}(t)$ and $Z_j^{(obs)}(t)$ (see Sect. 7) are converted to a time series of spherical harmonic coefficients of external and internal fields at the CHAMP satellite altitude is now considered. To obtain these coefficients, denoted by $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$, the Gaussian expansion of the external magnetic potential is undertaken at the satellite orbit of radius $r = b$, which results in Eqs. 48 and 50 where the radius $r$ equals $b$. The straightforward derivation then yields

$$\begin{aligned}
G_j^{(e,obs)}(t) &= \frac{1}{2j+1} \left[ (j+1)\, X_j^{(obs)}(t) + Z_j^{(obs)}(t) \right], \\
G_j^{(i,obs)}(t) &= \frac{1}{2j+1} \left[ j\, X_j^{(obs)}(t) - Z_j^{(obs)}(t) \right].
\end{aligned} \tag{107}$$

The satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ are related to the original, ground-based Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ by

$$\begin{aligned}
G_j^{(e,obs)}(t) &= (b/a)^{j-1}\, G_j^{(e)}(t), \\
G_j^{(i,obs)}(t) &= (a/b)^{j+2}\, G_j^{(i)}(t).
\end{aligned} \tag{108}$$

When the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are computed from the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ by inverting Eq. 108, the aim is to solve the downward continuation of satellite magnetic data from the satellite's orbit to the Earth's surface. It is, in principle, a numerically unstable problem, in particular for higher-degree spherical harmonic coefficients, since noise contaminating the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ is amplified by a factor of $(b/a)^j$ when computing the ground-based Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$. Hence, the IBVP of EM induction is assumed to be solved only for low-degree spherical harmonics, typically up to spherical harmonic degree $j_{max} = 5$. In this case, Martinec and McCreadie (2004) showed that the downward continuation of the satellite-determined coefficients from the CHAMP satellite orbit to the ground is numerically stable. This fact will be adopted, and the low-degree coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are assumed to be calculated from the satellite observables by inverting (108). However, it should be noted that future satellite missions, such as SWARM (Olsen et al. 2006b), may provide reliable information about higher-degree spherical harmonic coefficients. Then, their downward continuation from a satellite's altitude to the ground will become numerically unstable and the forward and adjoint IBVP of EM induction will need to be formulated directly for $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$.
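The conversion (107) and the inversion of (108) can be sketched in a few lines. The radii below (a mean Earth radius of 6371.2 km and an orbit 400 km above it) are assumed values chosen for illustration only, and the function names are hypothetical.

```python
A_KM = 6371.2          # assumed mean Earth radius a (km)
B_KM = A_KM + 400.0    # assumed orbit radius b (km); illustrative altitude

def gauss_from_xz(j, x_obs, z_obs):
    """Eq. 107: external and internal coefficients at satellite altitude."""
    g_e_obs = ((j + 1) * x_obs + z_obs) / (2 * j + 1)
    g_i_obs = (j * x_obs - z_obs) / (2 * j + 1)
    return g_e_obs, g_i_obs

def downward_continue(j, g_e_obs, g_i_obs, a=A_KM, b=B_KM):
    """Invert Eq. 108 to obtain the ground-based coefficients."""
    g_e = g_e_obs * (a / b) ** (j - 1)
    g_i = g_i_obs * (b / a) ** (j + 2)
    return g_e, g_i

def internal_gain(j, a=A_KM, b=B_KM):
    """Factor by which errors in the internal observable grow when Eq. 108 is inverted."""
    return (b / a) ** (j + 2)
```

Because the gain grows geometrically with degree $j$, errors in the internal observables are magnified more strongly at high degrees, which is the instability described above.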

5.1 Classical Formulation

Given the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ as observations, the forward IBVP of EM induction can be reformulated. From several possible combinations of these coefficients, it is natural to consider that the external Gauss coefficients $G_j^{(e)}(t)$ are used as the boundary-value data for the forward EM induction method, while the internal Gauss coefficients $G_j^{(i)}(t)$ are used for the adjoint EM induction method. The modification of the forward method is now derived.

The first modification concerns the solution domain. While in the previous case for the $X$ and $Z$ components of the CHAMP magnetic data, the solution domain consists of a conducting sphere $\mathcal{B}$ surrounded by an insulating atmosphere $\mathcal{A}$, in the present case where the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are used as observations, it is sufficient to consider only the conducting sphere $\mathcal{B}$ as the solution domain. Note, however, that the solution domain will again consist of the union of $\mathcal{B}$ and $\mathcal{A}$ when the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ are taken as observations.

Another modification concerns the boundary condition (16). Making use of Eq. 45 and formulae (218) and (219) for the scalar and vector products of $e_r$ with the spheroidal vector spherical harmonics $Y_j^{j\pm 1}(\vartheta)$, the continuity of the normal and tangential components of the magnetic induction vector $B$ on the boundary $\partial\mathcal{B}$, see Eqs. 15 and 20, is of the form

$$[n \cdot \operatorname{curl} A(a)]_j = -j\, G_j^{(e)} + (j+1)\, G_j^{(i)}, \tag{109}$$

$$[n \times \operatorname{curl} A(a)]_j^j = -\sqrt{j(j+1)} \left( G_j^{(e)} + G_j^{(i)} \right). \tag{110}$$

Combining these equations such that the external and internal Gauss coefficients are separated, and making the scalar product of $e_r$ with Eq. 88 for $r = a$, that is,

$$[n \cdot \operatorname{curl} A(a)]_j = -\frac{\sqrt{j(j+1)}}{a}\, A_j^j(a),$$

results in

$$-\frac{j}{a}\, A_j^j(a) + [n \times \operatorname{curl} A(a)]_j^j = -\sqrt{\frac{j}{j+1}}\, (2j+1)\, G_j^{(e)}, \tag{111}$$

$$\frac{j+1}{a}\, A_j^j(a) + [n \times \operatorname{curl} A(a)]_j^j = -\sqrt{\frac{j+1}{j}}\, (2j+1)\, G_j^{(i)}. \tag{112}$$


The last two equations represent the boundary conditions, which will be used in the forward and adjoint IBVP of EM induction, respectively.

The forward IBVP of EM induction for the Gauss coefficient $G_j^{(e)}(t)$ can now be formulated as follows. Given the conductivity $\sigma(r,\vartheta)$ of the sphere $\mathcal{B}$, the toroidal vector potential $A$ is searched for such that, for $t > 0$, it holds that

$$\frac{1}{\mu} \operatorname{curl}\operatorname{curl} A + \sigma \frac{\partial A}{\partial t} = 0 \quad \text{in } \mathcal{B} \tag{113}$$

with the boundary condition

$$[n \times \operatorname{curl} A(a)]_j^j - \frac{j}{a}\, A_j^j(a) = -\sqrt{\frac{j}{j+1}}\, (2j+1)\, G_j^{(e)} \quad \text{on } \partial\mathcal{B} \tag{114}$$

and the inhomogeneous initial condition

$$A|_{t=0} = A^0 \quad \text{in } \mathcal{B}. \tag{115}$$

5.2 Weak Formulation

The IBVP (113)–(115) can again be reformulated in a weak sense. By this it is meant that $A \in V \times C^1((0,\infty))$ is searched for such that at a fixed time it satisfies the following variational equation:

$$a_1(A,\delta A) + b(A,\delta A) = f_1(\delta A) \quad \forall \delta A \in V, \tag{116}$$

where the solution space $V$ is defined by Eq. 64. The new bilinear form $a_1(\cdot,\cdot)$ and the new linear functional $f_1(\cdot)$ are expressed in terms of the original bilinear form $a(\cdot,\cdot)$ and the coefficients $A_j^j(a,t)$ as follows:

$$a_1(A,\delta A) = a(A,\delta A) + \frac{a}{\mu} \sum_{j=1}^{\infty} j\, A_j^j(a,t)\, \delta A_j^j(a), \tag{117}$$

$$f_1(\delta A) = \frac{a^2}{\mu} \sum_{j=1}^{\infty} \sqrt{\frac{j}{j+1}}\, (2j+1)\, G_j^{(e)}(t)\, \delta A_j^j(a). \tag{118}$$

It should be emphasized that there is a difference in principle between the original variational equation (65) and the modification (116) in prescribing the boundary data on the surface $\partial\mathcal{B}$. Equation 65 requires the prescription of the tangential components of the total magnetic induction in a vacuum on $\partial\mathcal{B}$. Inspecting the functional $f(\cdot)$ in Eq. 91 shows that this requirement leads to the necessity to define the linear combinations $G_j^{(e)}(t) + G_j^{(i)}(t)$ for $j = 1, 2, \ldots$, as input boundary data for solving Eq. 65. In contrast to this scheme, the functional $f_1(\cdot)$ on the right-hand side of Eq. 116 only contains the spherical harmonic coefficients $G_j^{(e)}(t)$. Hence, to solve Eq. 116, only the spherical harmonic coefficients $G_j^{(e)}(t)$ of the external electromagnetic source need to be prescribed, while the spherical harmonic coefficients $G_j^{(i)}(t)$ of the induced magnetic field within the Earth are determined after solving Eq. 116 by means of Eq. 112:

$$G_j^{(i)} = \frac{j}{j+1}\, G_j^{(e)} - \frac{1}{a} \sqrt{\frac{j}{j+1}}\, A_j^j(a). \tag{119}$$

The former scheme is advantageous in the case where there is no possibility of separating the external and internal parts of magnetic induction observations by spherical harmonic analysis. The latter scheme can be applied if such an analysis can be carried out or in the case when the external magnetic source is defined by a known physical process.
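As a small consistency check, Eq. 119 can be read as the continuity constraint (102) solved for the internal coefficient. The sketch below encodes both relations with hypothetical function names and arbitrary test values; it is not taken from the original code.

```python
import math

def a_surface(j, g_e, g_i, a):
    """Eq. 102: surface value A_j^j(a) from external and internal coefficients."""
    return a * (math.sqrt(j / (j + 1)) * g_e - math.sqrt((j + 1) / j) * g_i)

def g_internal(j, g_e, a_jj, a):
    """Eq. 119: internal coefficient from the external one and A_j^j(a)."""
    return j / (j + 1) * g_e - math.sqrt(j / (j + 1)) * a_jj / a

# Arbitrary illustrative values (not data from the chapter)
j, g_e, g_i, a = 2, 30.0, -7.5, 6371.2
recovered = g_internal(j, g_e, a_surface(j, g_e, g_i, a), a)
```

In the forward method only the external coefficient and the computed surface value of the potential are available, so `g_internal` is the step that would deliver the induced part after the solve.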

6 Time-Domain, Spectral Finite-Element Solution

Finally, the spectral finite-element solution to the IBVP of EM induction for CHAMP magnetic data is introduced. For the sake of simplicity, the case where the spherical harmonic coefficients $G_j^{(e)}(t)$ of the external electromagnetic source are considered as input observations is treated first. Introducing the finite-dimensional functional space as

$$V_h := \left\{ \delta A = \sum_{j=1}^{j_{max}} \sum_{k=1}^{P+1} \delta A_j^{j,k}\, \psi_k(r)\, Y_j^j(\vartheta) \right\}, \tag{120}$$

where $j_{max}$ and $P$ are finite cutoff degrees, the Galerkin method for approximating the solution of the variational equation (116) at a fixed time $t_{i+1}$ consists in finding ${}^{i+1}A_h \in V_h$ such that

$$a_1({}^{i+1}A_h,\delta A_h) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}A_h,\delta A_h) = \frac{1}{\Delta t_i}\, b_1({}^{i}A_h,\delta A_h) + f_1({}^{i+1}G_j^{(e)},\delta A_h) \quad \forall \delta A_h \in V_h. \tag{121}$$

The discrete solution ${}^{i+1}A_h$ of this system of equations is called the time-domain, spectral finite-element solution. For a given angular degree $j$ (and a fixed time $t_{i+1}$), there are $P+1$ unknown coefficients ${}^{i+1}A_j^{j,k}$ in the system (121) that describe the solution in the conducting sphere $\mathcal{B}$. Once this system is solved, the coefficient ${}^{i+1}G_j^{(i)}$ of the induced magnetic field is computed by means of the continuity condition (119).

The time-domain, spectral finite-element solution can similarly be introduced to the IBVP of EM induction for the CHAMP magnetic data in the case where the spherical harmonic expansion coefficients $X_j(t)$ of the $X$ component of the magnetic induction vector $B_0$ measured at satellite altitudes are considered as input observations. Besides the functional space $V_h$, the finite-dimensional functional subspaces of the spaces $V_0$ and $V_\lambda$ are constructed by the following prescriptions:

$$V_{0,h} := \left\{ \delta A_0 = a \sum_{j=1}^{j_{max}} \left[ \sqrt{\frac{j}{j+1}} \left(\frac{r}{a}\right)^{j} \delta G_j^{(e)} - \sqrt{\frac{j+1}{j}} \left(\frac{a}{r}\right)^{j+1} \delta G_j^{(i)} \right] Y_j^j(\vartheta) \right\}, \tag{122}$$

$$V_{\lambda,h} := \left\{ \delta\lambda = \sum_{j=1}^{j_{max}} \delta\lambda_j^j\, Y_j^j(\vartheta) \right\}. \tag{123}$$

The Galerkin method for approximating the solution of the variational equation (75) at a fixed time $t_{i+1}$ consists in finding ${}^{i+1}A_h \in V_h$, ${}^{i+1}A_{0,h} \in V_{0,h}$, and ${}^{i+1}\lambda_h \in V_{\lambda,h}$, satisfying the variational equation

$$\begin{aligned}
&a({}^{i+1}A_h,\delta A_h) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}A_h,\delta A_h) + a_0({}^{i+1}A_{0,h},\delta A_{0,h}) + c(\delta A_h - \delta A_{0,h}, {}^{i+1}\lambda_h) \\
&\quad + c({}^{i+1}A_h - {}^{i+1}A_{0,h}, \delta\lambda_h) = \frac{1}{\Delta t_i}\, b_1({}^{i}A_h,\delta A_h) + F({}^{i+1}B_t,\delta A_{0,h}) \\
&\quad \forall \delta A_h \in V_h,\ \forall \delta A_{0,h} \in V_{0,h},\ \forall \delta\lambda_h \in V_{\lambda,h}.
\end{aligned} \tag{124}$$

For a given angular degree $j$ (and a fixed time $t_{i+1}$), the unknowns in Eq. 124 consist of $P+1$ coefficients ${}^{i+1}A_j^{j,k}$ describing the solution in the conducting sphere $\mathcal{B}$, the coefficients ${}^{i+1}G_j^{(e)}$ and ${}^{i+1}G_j^{(i)}$ describing the solution in a nonconducting spherical layer $\mathcal{A}$, and ${}^{i+1}\lambda_j^j$ ensuring the continuity of the potentials ${}^{i+1}A$ and ${}^{i+1}A_0$ on the Earth's surface $\partial\mathcal{B}$. In total, there are $P+4$ unknowns in the system for a given $j$.

Martinec et al. (2003) tested the time-domain, spectral finite-element method for the spherical harmonic coefficients $G_j^{(e)}(t)$, described by the variational equation (121), by comparing the results with the analytical and semi-analytical solutions to EM induction in two concentrically and eccentrically nested spheres of different, but constant, electrical conductivities. They showed that the numerical code implementing the time-domain, spectral finite-element method for $G_j^{(e)}(t)$ performs correctly, and that the method is particularly appropriate when the external current excitation is transient. Later on, Martinec and McCreadie (2004) made use of these results and tested the time-domain, spectral finite-element method for satellite magnetic data, described by the variational equation (124), by comparing it with the time-domain, spectral finite-element method for ground magnetic data. They showed that agreement between the numerical results of the two methods for synthetic data is excellent.

7 CHAMP Data Analysis

7.1 Selection and Processing of Vector Data

The data analyzed in this chapter were recorded by the three-component vector magnetometer on board CHAMP. To demonstrate the performance of the forward method, from all records spanning more than 8 years, the 1-year-long time series from January 1, 2001 (track No. 2610), to January 10, 2002 (track No. 8402), has been selected. Judging from the Dst index (Fig. 4), there were about ten events when the geomagnetic field was significantly disturbed by magnetic storms or substorms. In order to minimize the effect of strong day-side ionospheric currents, only night-side data recorded by the satellite between 18:00 and 6:00 local-solar time are used.

In the first step of the data processing, the CHAOS model of the Earth's magnetic field (Olsen et al. 2006a) is used to separate the signals corresponding to EM induction by storm-time magnetospheric currents. Based on the CHAOS model, the main and crustal fields up to degree 50 and the secular variation up to degree 18 are removed from the CHAMP data. In the next step, the horizontal magnetic components $(X, Y)$ are rotated from geographic coordinates to dipole coordinates, assuming that the north geomagnetic pole is at 78.8°N, 70.7°W. Since an axisymmetric geometry of external currents and mantle electrical conductivity is assumed, the dipolar longitudinal component $Y$ is not considered hereafter and $X$ and $Z$ are used to describe the northward and downward magnetic components in dipolar coordinates, respectively. Figure 1 shows an example of the original and rotated data from CHAMP track No. 6755.

[Figure 1: track No. 6755. Left panels: $X_g$, $Y_g$, and $Z$ ($10^3$ nT) versus geographic colatitude. Right panels: $X$ and $Z$ (nT) versus geomagnetic colatitude.]

Fig. 1 CHAMP satellite magnetic data along track No. 6755 (red line on global map shows the satellite track), which samples the initial phase of a magnetic storm on September 26, 2001, above the East Pacific Ocean. Left panels: the original CHAMP data plotted along geographical colatitude. $X_g$, $Y_g$, and $Z$ components point, respectively, to the geographic north, the geographic east, and downwards. Right panels: Black lines denote $X$ and $Z$ CHAMP components after the removal of the CHAOS model and the rotation of the residual field to dipole coordinates. The red lines show the results of the two-step, track-by-track spherical harmonic analysis, including the extrapolation into the polar regions using data from the mid-colatitude interval (40°, 140°), as marked by dotted lines

7.2 Two-Step, Track-by-Track Spherical Harmonic Analysis

The input data of the two-step, track-by-track spherical harmonic analysis are the samples of the $X$ component of the residual magnetic signal along an individual satellite track, that is, the data set $(\vartheta_i, X_i)$, $i = 1, \ldots, N$, where $\vartheta_i$ is the geomagnetic colatitude of the $i$th measurement site and $N$ is the number of data points. Only the magnetic data from low and mid-latitudes within the interval $(\vartheta_1, \vartheta_2)$ are considered, in accordance with the assumption that global EM induction is driven by the equatorial ring currents in the magnetosphere. Hence, observations from the polar regions, which are contaminated by signals from field-aligned currents and polar electrojets, are excluded from the analyzed time series. The satellite-track data $(\vartheta_i, X_i)$ are referenced to the time when CHAMP passes the magnetic equator.

In view of parameterization (47), $N$ observational equations for data $X_i$ are considered in the form

$$\sum_{j=1}^{j_{max}} X_j(t)\, \frac{\partial Y_j(\vartheta_i)}{\partial\vartheta} + e_i = X_i, \quad i = 1, \ldots, N, \tag{125}$$

where $X_j(t)$ are the expansion coefficients to be determined by a least-squares method and $j_{max}$ is the cutoff degree. The measurement errors $e_i$ are assumed to have zero means, have uniform variances $\sigma^2$, and be uncorrelated:

$$E\, e_i = 0, \qquad \operatorname{var} e_i = \sigma^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \ \text{ for } i \ne j, \tag{126}$$

where $E$, var, and cov are the statistical expectancy, the variance, and the covariance operator, respectively. The spherical harmonic analysis of satellite-track magnetic measurements of the $X$ component of the magnetic induction vector is performed in two steps.

7.2.1 Change of the Interval of Orthogonality

In the first step, the data $X_i$ are mapped from the mid-latitude interval $\vartheta \in (\vartheta_1, \vartheta_2)$ onto the half-circle interval $\vartheta' \in (0, \pi)$ by the linear transformation

$$\vartheta'(\vartheta) = \pi\, \frac{\vartheta - \vartheta_1}{\vartheta_2 - \vartheta_1}, \tag{127}$$

and then adjusted by a series of Legendre polynomials:

$$X(\vartheta') = \sum_{j=0}^{N'} X'_j\, Y_j(\vartheta'). \tag{128}$$

Likewise, the samples of the $Z$ component of the residual magnetic signal along an individual satellite track, that is, the data set $(\vartheta_i, Z_i)$, are first mapped from the mid-latitude interval $\vartheta \in (\vartheta_1, \vartheta_2)$ onto the half-circle interval $\vartheta' \in (0, \pi)$ and then expanded into a series of Legendre polynomials:

$$Z(\vartheta') = \sum_{j=0}^{N'} Z'_j\, Y_j(\vartheta'). \tag{129}$$
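The mapping (127) followed by a least-squares fit of a series of the form (128) can be sketched as below. The sketch assumes unnormalized Legendre polynomials $P_j(\cos\vartheta')$ for the functions $Y_j$, which may differ from the normalization used in the chapter, and the synthetic data and function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def map_colatitude(theta, theta1, theta2):
    """Eq. 127: map (theta1, theta2) linearly onto (0, pi)."""
    return np.pi * (theta - theta1) / (theta2 - theta1)

def fit_legendre(theta, data, theta1, theta2, n_max):
    """Least-squares fit of sum_j c_j P_j(cos theta') for j = 0..n_max.
    Assumption: unnormalized Legendre polynomials stand in for Y_j."""
    theta_p = map_colatitude(theta, theta1, theta2)
    design = legendre.legvander(np.cos(theta_p), n_max)  # columns P_0..P_nmax
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    return coeffs

# Synthetic, noise-free track samples on the interval (40, 140) degrees
theta = np.linspace(40.0, 140.0, 200)
true_coeffs = np.array([0.5, -2.0, 1.0])
samples = legendre.legval(np.cos(map_colatitude(theta, 40.0, 140.0)), true_coeffs)

fitted = fit_legendre(theta, samples, 40.0, 140.0, n_max=2)
```

With noisy data the system is inconsistent and `np.linalg.lstsq` returns the minimizer of the squared residual, which corresponds to the least-squares estimate referred to in the text.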

The expansion coefficients $X'_j$ and $Z'_j$ are determined by fitting the models (128) and (129) to the mid-latitude magnetic data $X_i$ and $Z_i$, respectively. Since the accuracy of the CHAMP magnetic measurements is high, both long-wavelength and short-wavelength features of the mid-latitude data are adjusted. That is why the cutoff degree $N'$ is chosen to be large. In the following numerical examples, $N' = 25$, while the number of datum points is $N = 1{,}550$. Because of data errors, the observational equations based on the models (128) and (129) are inconsistent and an exact solution to these systems does not exist. The solution to each system of equations is estimated by a least-squares method. Since this method is well documented in the literature (e.g., Bevington 1969), no details are given.

7.2.2 Extrapolation of Magnetic Data from Mid-latitudes to Polar Regions

When the analysis of the mid-latitude data $X_i$ is complete, the signal that best fits the mid-latitude data is extrapolated to the polar regions. To do so, it is required that the original parameterization (125) of the $X$ component matches that found in the previous step:

$$\sum_{j=1}^{j_{max}} X_j(t) \left. \frac{\partial Y_j(\vartheta)}{\partial\vartheta} \right|_{\vartheta(\vartheta')} = \sum_{k=0}^{N'} X'_k\, Y_k(\vartheta'), \tag{130}$$

where $\vartheta = \vartheta(\vartheta')$ denotes the inverse mapping to (127) and the coefficients $X'_k$ are known from the previous step. To determine $X_j(t)$, the orthonormality property of $Y_k(\vartheta')$ is used and the extrapolation condition (130) is rewritten as a system of linear algebraic equations:

$$2\pi \sum_{j=1}^{j_{max}} X_j(t) \int_{\vartheta'=0}^{\pi} \left. \frac{\partial Y_j(\vartheta)}{\partial\vartheta} \right|_{\vartheta(\vartheta')} Y_k(\vartheta')\, \sin\vartheta'\, d\vartheta' = X'_k \tag{131}$$

for $k = 0, 1, \ldots, N'$. In a similar way, the extrapolation condition for the $Z$ component can be expressed as

$$2\pi \sum_{j=1}^{j_{max}} Z_j(t) \int_{\vartheta'=0}^{\pi} Y_j(\vartheta(\vartheta'))\, Y_k(\vartheta')\, \sin\vartheta'\, d\vartheta' = Z'_k. \tag{132}$$

In contrast to the previous step, only long-wavelength features of mid-latitude data are extrapolated to the polar regions; thus, $j_{max} \ll N'$. In the following numerical examples, only the range $2 \le j_{max} \le 6$ is considered, depending on the character of the mid-latitude data. This choice implies that both systems of linear equations are overdetermined and are solved by a least-squares method. The least-squares estimates of the coefficients $X_j(t)$ and $Z_j(t)$ will be denoted by $X_j^{(obs)}(t)$ and $Z_j^{(obs)}(t)$, respectively. Respective substitutions of $X_j^{(obs)}(t)$ and $Z_j^{(obs)}(t)$ into Eqs. 47 and 49 yield smooth approximations of the $X$ and $Z$ components inside the colatitude interval $(\vartheta_1, \vartheta_2)$ as well as undisturbed extrapolations into the polar regions $(0°, \vartheta_1) \cup (\vartheta_2, 180°)$.

7.2.3 Selection Criteria for Extrapolation

The crucial points of the extrapolation are the choice of the truncation degree $j_{max}$ of the parameterization (125) and the determination of the colatitude interval $(\vartheta_1, \vartheta_2)$ where the data are not disturbed by the polar currents. Martinec and McCreadie (2004) and Velímský et al. (2006) imposed three criteria to determine these two parameters.

First, the power of the magnetic field from the external ring currents is concentrated in the low-degree harmonic coefficients, particularly in the $j = 1$ term, and the leakage of electromagnetic energy into higher-degree terms caused by the Earth's conductivity and electric-current geometry monotonically decreases. This criterion is applied in such a way that the analysis begins with degree $j_{max} = 1$, increases it by one, and plots the degree-power spectrum of the coefficients $X_j^{(obs)}(t)$. While the degree-power spectrum is a monotonically decreasing function of angular degree $j$, increasing the cutoff degree $j_{max}$ is continued. Once the degree-power spectrum of $X_j^{(obs)}(t)$ no longer decreases monotonically, the actual cutoff degree is taken from the previous step for which the degree-power spectrum still monotonically decreases. The degree-power spectrum of the coefficients $X_j^{(obs)}(t)$ for the final choice of cutoff degree $j_{max}$ is shown in the third-row panels of Fig. 5.

This criterion can be interpreted as follows. The largest proportion of the magnetospheric ring-current excitation energy is concentrated in the low-degree harmonic coefficients, particularly in the $j = 1$ term. The leakage of the electromagnetic energy from degree $j = 1$ to higher degrees is caused by lateral heterogeneities in the electrical conductivity of the Earth's mantle. The more pronounced the lateral heterogeneities, the larger the transport of energy from degree $j = 1$ to higher degrees. Accepting the criterion of a monotonically decreasing degree-power spectrum therefore means that the Earth's mantle is regarded as only weakly laterally heterogeneous.

Second, the first derivative of the $X$ component with respect to colatitude does not change sign in the polar regions. This criterion excludes unrealistic oscillatory behavior of the $X$ component in these regions caused by a high-degree extrapolation.
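The first criterion amounts to a simple stopping rule, sketched below. The sketch assumes that the degree power of a coefficient is its squared magnitude and that a user-supplied callable `fit(j_max)` returns the least-squares coefficients for degrees 1 to `j_max`; both are hypothetical choices for illustration, not definitions taken from the chapter.

```python
import numpy as np

def is_strictly_decreasing(values):
    """True if every entry is smaller than the one before it."""
    return bool(np.all(np.diff(values) < 0))

def select_cutoff(fit, j_cap=6):
    """Raise the cutoff degree while the degree-power spectrum keeps
    decreasing monotonically; return the last degree for which it did.
    `fit(j_max)` is a hypothetical callable returning X_1..X_jmax."""
    chosen = 1
    for j_max in range(2, j_cap + 1):
        power = np.abs(np.asarray(fit(j_max))) ** 2   # assumed degree power
        if not is_strictly_decreasing(power):
            break
        chosen = j_max
    return chosen

# Synthetic stand-in for a track fit: coefficients independent of j_max
synthetic = np.array([10.0, 3.0, 1.0, 2.0, 0.5, 0.1])
cutoff = select_cutoff(lambda j_max: synthetic[:j_max])
```

In practice the fitted coefficients change with the cutoff degree, which is why the rule refits at every step instead of truncating a single spectrum.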
Third, if the least-squares estimate of the X component of CHAMP data over the colatitude interval (#1  5ı ; #2 C 5ı ) differs by more than 10 nT compared to the estimate over the interval .#1 ; #2 /, the field due to the polar currents is assumed to encroach upon the field produced by near-equatorial ring currents, and the narrower colatitude interval .#1 ; #2/ is considered to contain only the signature generated by the nearequatorial currents. Applying these criteria to the CHAMP-track data iteratively, starting from degree j D 1 and the colatitude interval (10ı , 170ı ) and proceeding to higher degrees and shorter colatitude intervals, it is found that the maximum cutoff degree varies from track to track, but does not exceed jmax D 6 and the colatitude interval is usually (40ı , 140ı ). The extrapolation of the Z component from the field at low and mid-latitudes is more problematic than that for the X component. This is because (i) the second selection criterion cannot be applied since the Z component does not approach zero at the magnetic poles as seen from parameterization (49) and (ii) the Z component of CHAMP magnetic data contains a larger portion of high-frequency noise than the X component, which, in principle, violates the assumption of the third selection criterion. Figure 5 shows that the leakage of electromagnetic energy from j D 1 to higher-degree terms is not monotonically decreasing for the Z component. That is why the least.obs/ squares estimates Zj .t / are extrapolated to polar regions from the colatitude interval .#1 ; #2 / and up to the spherical degree jmax determined for the X component. 7.2.4 Examples of Spherical Harmonic Analysis of the CHAMP Magnetic Data Presented here are four examples of the spherical harmonic analysis of the CHAMP magnetic data recorded in the period from September 25 to October 7, 2001. 
This period is chosen because it includes a magnetic storm followed by a magnetic substorm, as seen from the behavior of the Dst index (see Fig. 2). For demonstration purposes, four CHAMP-track data sets are chosen: the data recorded along track No. 6732 as an example of data analysis before a magnetic storm occurs; track


[Figure 2: plot of the Dst index (nT, from 50 to –200) against track number (6,740 to 6,900, bottom axis) and time after storm onset (0 to 12 days, top axis)]

Fig. 2 The Dst index for the magnetic storm that occurred between September 25 and October 7, 2001. The arrows mark the satellite tracks chosen to demonstrate the two-step, track-by-track spherical harmonic analysis of satellite magnetic data

No. 6755 represents the main phase of a magnetic storm; track No. 6780 represents the recovery phase of a storm; and track No. 6830 represents the appearance of a substorm. In Fig. 3, the X component of the original CHAMP magnetic data reduced by the main magnetic field and the lithospheric magnetic field is shown. The top panels show the residual magnetic signals for the night-time mid-latitudes and the filtered signals after the first step of the spherical harmonic analysis has been performed. The mid-latitude data $X_i$ are adjusted by the model (128) rather well by choosing $N' = 25$. For the sake of completeness, the second-row panels of Fig. 3 show the degree-power spectrum of the coefficients $X'_j$. The degree-power spectrum of the coefficients $X_j^{(\mathrm{obs})}(t)$ for the cutoff degree $j_{\max}$ chosen according to the first selection criterion is shown in the third-row panels of Fig. 3. The bottom panels of Fig. 3 show the residual signals over the whole night-time track derived from the CHAMP observations and the signals extrapolated from low-latitude and mid-latitude data. First, the well-known fact can be seen that the original magnetic data are disturbed at the polar regions by sources other than equatorial ring currents in the magnetosphere. Second, since there is no objective criterion for evaluating the quality of the extrapolation of the X component to the polar regions, it is regarded subjectively. For the track data shown here, but also for the other data for the magnetic storm considered, the extrapolation of the X component from mid-latitudes to the polar regions works reasonably well, provided that the cutoff degree $j_{\max}$ and the colatitude interval $(\vartheta_1, \vartheta_2)$ are chosen according to the criteria introduced above.

The procedure applied to the 2001-CHAMP-track data results in time series of spherical harmonic coefficients $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ for $j = 1, \ldots, 4$. As an example, the resulting coefficients for degree $j = 1$ are plotted in Fig. 4 as functions of time after January 1, 2001. As expected, there is a high correlation between the first-degree harmonics $X_1^{(\mathrm{obs})}(t)$ and $Z_1^{(\mathrm{obs})}(t)$ and the Dst index for the days that experienced a magnetic storm.
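The least-squares fit and the first selection criterion of Sect. 7.2.3 can be illustrated with a short sketch. It is not the chapter's implementation: Legendre polynomials $P_j(\cos\vartheta)$ stand in for the basis of parameterization (125), the power per degree is assumed to be the squared coefficient, and the names `fit_zonal_coefficients` and `select_cutoff_degree` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_zonal_coefficients(colat_deg, values, j_max):
    """Least-squares fit of values(colatitude) by degrees j = 1..j_max.

    Assumption: Legendre polynomials P_j(cos(colatitude)) stand in for the
    basis of parameterization (125), which is not reproduced here.
    """
    x = np.cos(np.radians(colat_deg))
    # legvander returns columns P_0..P_jmax; drop the degree-0 column.
    design = legendre.legvander(x, j_max)[:, 1:]
    coeffs, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coeffs


def is_monotonically_decreasing(power):
    """True if every entry is strictly smaller than the one before it."""
    return bool(np.all(np.diff(power) < 0.0))


def select_cutoff_degree(fit, j_limit=6):
    """Raise j_max from 1 until the degree-power spectrum stops decreasing.

    `fit(j_max)` must return the j_max fitted coefficients. The power per
    degree is taken as the squared coefficient (an assumed definition).
    """
    chosen = 1
    for j_max in range(1, j_limit + 1):
        power = np.asarray(fit(j_max)) ** 2
        if not is_monotonically_decreasing(power):
            break  # keep the cutoff from the previous step
        chosen = j_max
    return chosen


# Synthetic mid-latitude track containing degrees 1..3 only, on (40, 140) deg.
colat = np.linspace(40.0, 140.0, 200)
true_coeffs = np.array([10.0, 3.0, 1.0])
track = legendre.legval(np.cos(np.radians(colat)),
                        np.concatenate(([0.0], true_coeffs)))
recovered = fit_zonal_coefficients(colat, track, 3)

# Selection logic checked with a mock fit whose degree-4 power rises again.
mock = np.array([10.0, 3.0, 1.0, 2.0, 0.5, 0.1])
cutoff = select_cutoff_degree(lambda j_max: mock[:j_max])
```

In practice `fit` would wrap `fit_zonal_coefficients` for one track, and the second and third criteria would further restrict the admissible degree and colatitude interval.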

7.3 Power-Spectrum Analysis

Although the method applied in this chapter is based on the time-domain approach, it is valuable to inspect the spectra of the $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ time series. Figure 5 shows the maximum-entropy power-spectrum estimates (Press et al. 1992, Sect. 7) of the first four spherical harmonics of the horizontal and vertical components. It can be seen that the magnitudes of the power spectra of the


Fig. 3 Examples of the two-step, track-by-track spherical harmonic analysis of magnetic signals along four satellite tracks. The top panels show the X component of the residual magnetic signals at the night-time mid-latitudes derived from the CHAMP magnetic observations (thin lines) and the predicted signals after the first step of the spherical harmonic analysis has been completed (thick lines). The number of samples in the original signals is $N = 1{,}550$. The second- and third-row panels show the degree-power spectrum of the coefficients $X'_j$ and $X_j(t)$, respectively. The cutoff degree of the coefficients $X'_j$ is fixed to $N' = 25$, while the cutoff degree $j_{\max}$ of the coefficients $X_j(t)$ is found by the criteria discussed in the text. The bottom panels show the X component of the residual magnetic signals over the whole night-time tracks (thin lines) and the signals extrapolated from mid-latitude data according to the second step of the spherical harmonic analysis (thick lines). The longitude when the CHAMP satellite crosses the equator of the geocentric coordinate system is 55.19°, 127.19°, 97.15°, and 174.23° for tracks No. 6732, 6755, 6780, and 6830, respectively


[Figure 4: four stacked panels of X1, Z1, and Dst (nT, from –300 to 300) against time (MD2000) in days, covering days 360 to 760]

Fig. 4 Time series of the spherical harmonic coefficients $X_1^{(\mathrm{obs})}(t)$ (red) and $Z_1^{(\mathrm{obs})}(t)$ (blue) of horizontal and vertical components obtained by the two-step, track-by-track spherical harmonic analysis of CHAMP data for the year 2001. A mean and linear trend have been removed following Olsen et al. (2005). The coefficients from the missing tracks are filled by cubic spline interpolation applied to the detrended time series. Note that the sign of the $X_1$ component is opposite to that of the Dst index (black line). Time on the horizontal axis is measured from midnight of January 1, 2000

X component monotonically decrease with increasing harmonic degree, which is a consequence of the first selection criterion applied in the two-step, track-by-track analysis. For instance, the power spectrum of the second-degree terms is about two orders of magnitude smaller than that of the first-degree terms. As already mentioned, and also seen in Fig. 5, this is not the case for the Z component, where the magnitude of the maximum-entropy power-spectrum of the Z component is larger than that of the X component for j > 1, which demonstrates that the Z component of the CHAMP magnetic data contains a larger portion of high-frequency noise than the X component.


[Figure 5: two panels (X-component, top; Z-component, bottom) of the maximum-entropy power PME (nT²) on a logarithmic scale against period (days, 2 to 14), with curves for j = 1 to 4 and marked periods of 1, 2.6, 3.7, 4.8, 5.6, 6.8, and 9.0 days]

Fig. 5 The maximum-entropy power-spectrum estimates of the spherical harmonic coefficients of the $X_j^{(\mathrm{obs})}(t)$ (top panel) and $Z_j^{(\mathrm{obs})}(t)$ (bottom panel) components. Degrees $j = 1, 2, 3$, and 4 are shown by black, red, blue, and green lines, respectively. The spectra have peaks at higher harmonics of the 27-day solar rotation period, that is, at periods of 9, 6.8, 5.6, 4.8 days, etc.

Despite analyzing only night-side tracks, there is a significant peak at the period of 1 day in the power spectra of the higher-degree harmonics ($j \ge 2$); surprisingly, it is missing from the spectra of the first-degree harmonic. To eliminate the induction effect of residual dawn/dusk ionospheric electric currents, the night-side local-solar time interval is shrunk from (18:00, 6:00) to (22:00, 4:00). However, a 1-day period signal remains present in the CHAMP residual signal (not shown here). To locate a region of potential inducing electric currents, the time series of the $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ coefficients are converted to time series of spherical harmonic coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$ of the external and internal fields, counted with respect to the CHAMP satellite altitude, by applying Eq. 107. The maximum-entropy power-spectrum estimates of the external and internal coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$ are shown in Fig. 6. It can be seen that these spectra for degrees $j = 2$–$4$ also have a peak at a period of 1 day. This means that at least part of $G_j^{(e,\mathrm{obs})}(t)$ originates in the magnetosphere or even the magnetopause and magnetic tail, while $G_j^{(i,\mathrm{obs})}(t)$ may originate from the residual night-side ionospheric currents and/or the electric currents induced in the Earth by either effect.


[Figure 6: two panels (external component, top; internal component, bottom) of the maximum-entropy power PME (nT²) on a logarithmic scale against period (days, 2 to 14), with curves for j = 1 to 4 and marked periods of 1, 2.6, 3.7, 4.8, 5.6, 6.8, and 9.0 days]

Fig. 6 As Fig. 5, but for the external and internal Gauss coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$, counted with respect to the CHAMP satellite altitude

Figure 6 also shows that, while the periods of peak values in the external and internal magnetic fields for degree $j = 1$ correspond to each other, for the higher-degree spherical harmonic coefficients, such a correspondence is only valid for some periods, for instance, 6.8, 5.6, or 4.8 days. However, the peak for the period of 8.5 days in the internal component for $j = 2$ is hardly detectable in the external field. This could be explained by a three-dimensionality effect in the electrical conductivity of the Earth’s mantle that causes the leakage of electromagnetic energy from degree $j = 1$ to the second and higher-degree terms. This leakage may partly shift the characteristic periods in the resulting signal due to interference between signals with various spatial wavelengths and periods.
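For readers who want to reproduce spectra of this kind, the following sketch computes a maximum-entropy spectrum with Burg's recursion, one common way to obtain such estimates. The autoregressive order, the daily sampling, and the synthetic 9-day test signal are assumptions made for illustration; the chapter does not state which settings were used.

```python
import numpy as np


def burg_ar(x, order):
    """Burg's recursion: AR coefficients a (a[0] = 1) and residual power."""
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors
    b = x[:-1].copy()   # backward prediction errors (already lagged by one)
    a = np.array([1.0])
    err = np.dot(x, x) / x.size
    for _ in range(order):
        # Reflection coefficient minimizing forward plus backward error power.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        a = np.concatenate((a, [0.0]))
        a = a + k * a[::-1]          # Levinson-type coefficient update
        err *= 1.0 - k * k
        f, b = f[1:] + k * b[1:], b[:-1] + k * f[:-1]
    return a, err


def mem_spectrum(x, order, freqs, dt=1.0):
    """Maximum-entropy power spectrum at freqs (cycles per unit time)."""
    a, err = burg_ar(x - np.mean(x), order)
    lags = np.arange(a.size)
    transfer = np.exp(-2j * np.pi * dt * np.outer(freqs, lags)) @ a
    return err * dt / np.abs(transfer) ** 2


# Synthetic daily series with a 9-day oscillation plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(400.0)
series = np.sin(2.0 * np.pi * t / 9.0) + 0.1 * rng.standard_normal(t.size)
freqs = np.linspace(0.01, 0.5, 4000)
power = mem_spectrum(series, order=10, freqs=freqs)
peak_period = 1.0 / freqs[np.argmax(power)]
```

The sharpness of the peaks depends strongly on the chosen order, so peak heights from such estimates are best compared only between spectra computed with the same settings.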

8 Adjoint Sensitivity Method of EM Induction for the Z Component of CHAMP Magnetic Data

In this section, the adjoint sensitivity method of EM induction for computing the sensitivities of the Z component of CHAMP magnetic data with respect to the mantle’s conductivity structure is formulated. Most of the considerations in this section follow the paper by Martinec and Velímský (2009).


8.1 Forward Method

The forward method of EM induction for the X component of CHAMP magnetic data was formulated in Sect. 4. In this case, the solution domain $G$ for EM induction modeling is the union of the conducting sphere $B$ and the insulating spherical layer $A$, that is, $G = B \cup A$, with the boundary $\partial G$ coinciding with the mean-orbit sphere, that is, $\partial G = \partial A$. The forward IBVP (56)–(63) for the toroidal vector potential $\boldsymbol{A}$ in $G$ can then be written in the abbreviated form as

$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\boldsymbol{A} + \sigma \frac{\partial \boldsymbol{A}}{\partial t} = 0 \quad \text{in } G \tag{133}$$

with the boundary condition

$$\boldsymbol{n} \times \mathrm{curl}\,\boldsymbol{A} = \boldsymbol{B}_t \quad \text{on } \partial G, \tag{134}$$

and the inhomogeneous initial condition

$$\boldsymbol{A}|_{t=0} = \boldsymbol{A}^0 \quad \text{in } G. \tag{135}$$

Note that the conductivity $\sigma = 0$ in the insulating atmosphere $A$ implies that the second term in Eq. 133 vanishes in $A$ and Eq. 133 reduces to Eq. 58.

8.2 Misfit Function and Its Gradient in the Parameter Space

Let the conductivity $\sigma(r, \vartheta)$ of the conducting sphere $B$ now be represented in terms of an $M$-dimensional system of $r$- and $\vartheta$-dependent base functions, and denote the expansion coefficients of this representation by $\sigma_1, \sigma_2, \ldots, \sigma_M$. Defining the conductivity parameter vector $\vec{\sigma} := (\sigma_1, \sigma_2, \ldots, \sigma_M)$, the dependence of the conductivity $\sigma(r, \vartheta)$ on the parameters $\vec{\sigma}$ can be made explicit as

$$\sigma = \sigma(r, \vartheta; \vec{\sigma}). \tag{136}$$

In Sect. 4, it is shown that the solution of the IBVP for CHAMP magnetic data enables the modeling of the time evolution of the normal component $B_n := \boldsymbol{n} \cdot \boldsymbol{B}$ of the magnetic induction vector on the mean-orbit sphere $\partial G$ along the satellite tracks. These predicted data $B_n(\vec{\sigma})$ can be compared with the observations $B_n^{(\mathrm{obs})} = -Z^{(\mathrm{obs})}$ of the normal component of the magnetic induction vector by the CHAMP onboard magnetometer. The differences between observed and predicted values can then be used as a misfit for the inverse EM induction modeling. The adjoint method of EM induction presented hereafter calculates the sensitivity of the forward-modeled data $B_n(\vec{\sigma})$ on the conductivity parameters $\vec{\sigma}$ by making use of the differences $B_n^{(\mathrm{obs})} - B_n(\vec{\sigma})$ as boundary-value data.

Let the observations $B_n^{(\mathrm{obs})}$ be made for times $t \in (0, T)$ such that, according to assumption (2) in Sect. 2, $B_n^{(\mathrm{obs})}(\vartheta, t_i)$ at a particular time $t_i \in (0, T)$ corresponds to the CHAMP observations along the $i$th satellite track. The least-squares misfit is then defined as

$$\chi^2(\vec{\sigma}) := \frac{b}{2\mu} \int_0^T \int_{\partial G} w_b^2 \left[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \right]^2 dS\, dt, \tag{137}$$


where the weighting factor $w_b = w_b(\vartheta, t)$ is chosen to be dimensionless such that the misfit has the SI unit m³ s T²/[μ], [μ] = kg m s⁻² A⁻². If the observations $B_n^{(\mathrm{obs})}$ contain random errors which are statistically independent, the statistical variance of the observations may be substituted for the reciprocal value of $w_b^2$ (e.g., Bevington, 1969, Sects. 6–4). The $\vartheta$ dependence of $w_b$ allows the elimination of the track data from the polar regions which are contaminated by signals from field-aligned currents and polar electrojets, while the time dependence of $w_b$ allows the elimination of the track data for time instances when other undesirable magnetic effects at low and mid-latitudes contaminate the signal excited by equatorial ring currents in the magnetosphere.

The sensitivity analysis or inverse modeling requires the computation of the partial derivative of the misfit with respect to the model parameters, that is, the derivatives $\partial \chi^2 / \partial \sigma_m$, $m = 1, \ldots, M$, often termed the sensitivities of the misfit with respect to the model parameters $\sigma_m$ (e.g., Sandu et al., 2003). To abbreviate the notation, the partial derivatives with respect to the conductivity parameters are ordered in the gradient operator in the $M$-dimensional parameter space:

$$\nabla_{\vec{\sigma}} := \sum_{m=1}^{M} \hat{\boldsymbol{\sigma}}_m \frac{\partial}{\partial \sigma_m}, \tag{138}$$

where the hat in $\hat{\boldsymbol{\sigma}}_m$ indicates a unit vector. Realizing that the observations $B_n^{(\mathrm{obs})}$ are independent of the conductivity parameters $\vec{\sigma}$, that is, $\nabla_{\vec{\sigma}} B_n^{(\mathrm{obs})} = 0$, the gradient of $\chi^2(\vec{\sigma})$ is

$$\nabla_{\vec{\sigma}} \chi^2 = -\frac{b}{\mu} \int_0^T \int_{\partial G} \Delta B_n(\vec{\sigma})\, \nabla_{\vec{\sigma}} B_n\, dS\, dt, \tag{139}$$

where $\Delta B_n(\vec{\sigma})$ are the weighted residuals of the normal component of the magnetic induction vector:

$$\Delta B_n(\vec{\sigma}) := w_b^2 \left[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \right]. \tag{140}$$

The straightforward approach to find $\nabla_{\vec{\sigma}} \chi^2$ is to approximate $\partial \chi^2 / \partial \sigma_m$ by a numerical differentiation of forward model runs. Due to the size of the parameter space, this procedure is often extremely computationally expensive.
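A discretized version of the misfit (137) can be sketched as follows. The sketch assumes zonal data that depend only on colatitude and time, so that the surface element reduces to $dS = 2\pi b^2 \sin\vartheta\, d\vartheta$, and it uses a simple trapezoidal rule. The function names, the orbit radius, and the 1 nT residual are illustrative assumptions rather than values from the chapter.

```python
import numpy as np

MU_0 = 4.0e-7 * np.pi  # magnetic permeability of vacuum (SI units)


def trapezoid(y, x):
    """Trapezoidal rule along the last axis of y."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)


def misfit(b_obs, b_mod, weights, colat, times, radius, mu=MU_0):
    """Discretized Eq. 137 for zonal data B_n(colatitude, time).

    b_obs and b_mod have shape (n_times, n_colat) in tesla, colat is in
    radians, radius in metres. Assumes dS = 2*pi*radius^2*sin(colat)*dcolat.
    """
    integrand = (weights * (b_obs - b_mod)) ** 2
    ring = 2.0 * np.pi * radius**2 * np.sin(colat)
    over_sphere = trapezoid(integrand * ring, colat)   # one value per time
    return radius / (2.0 * mu) * trapezoid(over_sphere, times)


# Illustrative setup: constant 1 nT residual over one day (assumed values).
colat = np.linspace(0.0, np.pi, 721)
times = np.linspace(0.0, 86400.0, 25)
radius = 6.771e6                       # assumed mean-orbit radius in metres
b_obs = np.full((times.size, colat.size), 1.0e-9)
b_mod = np.zeros_like(b_obs)

# Weights that keep all data, and weights that drop the polar regions.
w_all = np.ones(colat.size)
in_band = (colat > np.radians(40.0)) & (colat < np.radians(140.0))
w_mid = np.where(in_band, 1.0, 0.0)

chi2_all = misfit(b_obs, b_mod, w_all, colat, times, radius)
chi2_mid = misfit(b_obs, b_mod, w_mid, colat, times, radius)
```

Setting the weights to zero outside the interval (40°, 140°) removes the polar data from the misfit, which is the role the text assigns to the $\vartheta$ dependence of $w_b$.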

8.3 The Forward Sensitivity Equations

The forward sensitivity analysis computes the sensitivities of the forward solution with respect to the conductivity parameters, that is, the partial derivatives $\partial \boldsymbol{A} / \partial \sigma_m$, $m = 1, \ldots, M$. Using them, the forward sensitivities $\nabla_{\vec{\sigma}} B_n$ are computed and substituted into Eq. 139 for $\nabla_{\vec{\sigma}} \chi^2$. To form the forward sensitivity equations, also called the linear tangent equations of the model (e.g., McGillivray et al. 1994; Cacuci 2003; Sandu et al. 2003, 2005), the conductivity model (136) is considered in the forward model Eqs. 133–135. Differentiating them with respect to the conductivity parameters $\vec{\sigma}$ yields

$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\nabla_{\vec{\sigma}} \boldsymbol{A} + \sigma \frac{\partial \nabla_{\vec{\sigma}} \boldsymbol{A}}{\partial t} + \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} = 0 \quad \text{in } G \tag{141}$$


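with homogeneous boundary condition

$$\boldsymbol{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \boldsymbol{A} = 0 \quad \text{on } \partial G \tag{142}$$

and homogeneous initial condition

$$\nabla_{\vec{\sigma}} \boldsymbol{A}|_{t=0} = 0 \quad \text{in } G, \tag{143}$$

where $\nabla_{\vec{\sigma}} \boldsymbol{B}_t = \nabla_{\vec{\sigma}} \boldsymbol{A}^0 = 0$ have been substituted because the boundary data $\boldsymbol{B}_t$ and the initial condition $\boldsymbol{A}^0$ are independent of the conductivity parameters $\vec{\sigma}$.

In the forward sensitivity analysis, for each parameter $\sigma_m$ and associated forward solution $\boldsymbol{A}$, a new source term $\nabla_{\vec{\sigma}} \sigma\, \partial \boldsymbol{A} / \partial t$ is created and the forward sensitivity equations (141)–(143) are solved to compute the partial derivative $\partial \boldsymbol{A} / \partial \sigma_m$. The forward sensitivity analysis is known to be very effective when the sensitivities of a larger number of output variables are computed with respect to a small number of model parameters (Sandu et al. 2003; Petzold et al. 2006).

In Sect. 9, the adjoint sensitivity method of EM induction for the case when the Gauss coefficients are taken as observations will be dealt with. In this case, the boundary condition (142) has a more general form:

$$\boldsymbol{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \boldsymbol{A} - \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) = 0 \quad \text{on } \partial G, \tag{144}$$

where $\boldsymbol{L}$ is a linear vector operator acting on a vector function defined on the boundary $\partial G$. For the case studied now, however, $\boldsymbol{L} = 0$.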
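The structure of the forward sensitivity equations can be seen in a small discrete analogue. In the sketch below, which is a toy model and not the chapter's finite-element scheme, the system `diag(s) du/dt + K u = f(t)` stands in for Eq. 133, with the vector `s` playing the role of the conductivity parameters, the matrix `K` that of the curl curl operator, and `f` that of the boundary forcing. All names and numerical values are assumptions.

```python
import numpy as np


def forward(s, K, f, dt):
    """Backward Euler for diag(s) du/dt + K u = f(t), starting from u = 0."""
    u = np.zeros((f.shape[0] + 1, s.size))
    lhs = np.diag(s) + dt * K
    for n in range(f.shape[0]):
        u[n + 1] = np.linalg.solve(lhs, s * u[n] + dt * f[n])
    return u


def forward_sensitivity(s, K, u, dt, m):
    """Time-stepped derivative w = du/ds_m.

    Differentiating the time step with respect to s_m adds the source
    -(u^{n+1} - u^n)_m at node m, the discrete counterpart of the new
    source term in the forward sensitivity equation.
    """
    w = np.zeros_like(u)
    lhs = np.diag(s) + dt * K
    for n in range(u.shape[0] - 1):
        rhs = s * w[n]
        rhs[m] -= u[n + 1, m] - u[n, m]
        w[n + 1] = np.linalg.solve(lhs, rhs)
    return w


# Toy setup: five nodes, forcing applied at the last node (assumed values).
n_nodes, n_steps, dt = 5, 20, 0.1
K = 2.0 * np.eye(n_nodes) - np.eye(n_nodes, k=1) - np.eye(n_nodes, k=-1)
f = np.zeros((n_steps, n_nodes))
f[:, -1] = np.sin(dt * np.arange(1, n_steps + 1))
s = np.array([1.0, 2.0, 1.5, 0.8, 1.2])

u = forward(s, K, f, dt)
m = 2
w = forward_sensitivity(s, K, u, dt, m)

# Reference value from central differences of two extra forward runs.
step = np.zeros(n_nodes)
step[m] = 1.0e-6
w_fd = (forward(s + step, K, f, dt) - forward(s - step, K, f, dt)) / 2.0e-6
```

One such sensitivity run is needed per parameter, which is why this approach pays off only when the number of parameters is small.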

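8.4 The Adjoint Sensitivity Equations

The adjoint method provides an efficient alternative to the forward sensitivity analysis for evaluating $\nabla_{\vec{\sigma}} \chi^2$ without explicit knowledge of $\nabla_{\vec{\sigma}} \boldsymbol{A}$, that is, without solving the forward sensitivity equations. Hence, the adjoint method is more efficient for problems involving a large number of model parameters. Because the forward sensitivity equations are linear in $\nabla_{\vec{\sigma}} \boldsymbol{A}$, an adjoint equation exists (Cacuci 2003).

The adjoint sensitivity analysis proceeds by forming the inner product of Eqs. 141 and 144 with an as yet unspecified adjoint function $\hat{\boldsymbol{A}}(r, \vartheta, t)$; the products are then integrated over $G$ and $\partial G$, respectively, and subtracted from each other:

$$\frac{1}{\mu} \int_G \mathrm{curl}\,\mathrm{curl}\,\nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \hat{\boldsymbol{A}}\, dV - \frac{1}{\mu} \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS + \frac{1}{\mu} \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS + \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV + \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV = 0, \tag{145}$$

where the dot stands for the scalar product of vectors.

In the next step, the integrals in Eq. 145 are transformed such that $\nabla_{\vec{\sigma}} \boldsymbol{A}$ and $\hat{\boldsymbol{A}}$ interchange. To achieve this, Green’s theorem is considered for two sufficiently smooth functions $\boldsymbol{f}$ and $\boldsymbol{g}$ in the form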


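$$\int_G \mathrm{curl}\,\boldsymbol{f} \cdot \mathrm{curl}\,\boldsymbol{g}\, dV = \int_G \mathrm{curl}\,\mathrm{curl}\,\boldsymbol{f} \cdot \boldsymbol{g}\, dV - \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\boldsymbol{f}) \cdot \boldsymbol{g}\, dS. \tag{146}$$

Interchanging the functions $\boldsymbol{f}$ and $\boldsymbol{g}$ and subtracting the new equation from the original one results in the integral identity

$$\int_G \mathrm{curl}\,\mathrm{curl}\,\boldsymbol{f} \cdot \boldsymbol{g}\, dV - \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\boldsymbol{f}) \cdot \boldsymbol{g}\, dS = \int_G \mathrm{curl}\,\mathrm{curl}\,\boldsymbol{g} \cdot \boldsymbol{f}\, dV - \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\boldsymbol{g}) \cdot \boldsymbol{f}\, dS. \tag{147}$$

By this, the positions of $\nabla_{\vec{\sigma}} \boldsymbol{A}$ and $\hat{\boldsymbol{A}}$ can be exchanged in the first two integrals in Eq. 145:

$$\frac{1}{\mu} \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dV - \frac{1}{\mu} \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}) \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dS + \frac{1}{\mu} \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS + \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV + \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV = 0. \tag{148}$$

To perform the same transformation in the fourth integral, Eq. 148 is integrated over the time interval $t \in (0, T)$, that is,

$$\frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}) \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dS\, dt + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS\, dt + \int_0^T \!\! \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV dt + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV dt = 0. \tag{149}$$

Then, in the fourth integral, the order of integration over the spatial variables and time is exchanged and the time integration is performed by parts:

$$\int_0^T \frac{\partial \nabla_{\vec{\sigma}} \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dt = \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \hat{\boldsymbol{A}}\big|_{t=T} - \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \hat{\boldsymbol{A}}\big|_{t=0} - \int_0^T \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \frac{\partial \hat{\boldsymbol{A}}}{\partial t}\, dt. \tag{150}$$

The second term on the right-hand side is equal to zero because of the homogeneous initial condition (143). Finally, Eq. 149 takes the form

$$\frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}) \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dS\, dt + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS\, dt + \int_G \sigma\, \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \hat{\boldsymbol{A}}\big|_{t=T}\, dV - \int_0^T \!\! \int_G \sigma\, \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \frac{\partial \hat{\boldsymbol{A}}}{\partial t}\, dV dt + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV dt = 0. \tag{151}$$

Remembering that $\nabla_{\vec{\sigma}} B_n$ is the derivative that is to be eliminated from $\nabla_{\vec{\sigma}} \chi^2$, the homogeneous equation (151) is added to Eq. 139 (note that the physical units of Eq. 151 are the same as those of $\nabla_{\vec{\sigma}} \chi^2$, namely, m³ s T²/[μ], provided that the physical units of $\hat{\boldsymbol{A}}$ are the same as those of $\boldsymbol{A}$, namely, T m):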


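$$\nabla_{\vec{\sigma}} \chi^2 = \frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}) \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dS\, dt + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS\, dt + \int_G \sigma\, \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \hat{\boldsymbol{A}}\big|_{t=T}\, dV - \int_0^T \!\! \int_G \sigma\, \nabla_{\vec{\sigma}} \boldsymbol{A} \cdot \frac{\partial \hat{\boldsymbol{A}}}{\partial t}\, dV dt + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV dt - \frac{b}{\mu} \int_0^T \!\! \int_{\partial G} \Delta B_n\, \nabla_{\vec{\sigma}} B_n\, dS\, dt. \tag{152}$$

The adjoint function $\hat{\boldsymbol{A}}$ has been considered arbitrary so far. The aim is now to impose constraints on it such that the originally arbitrary $\hat{\boldsymbol{A}}$ transforms to the well-defined adjoint toroidal vector potential. The volume integrals over $G$ proportional to $\nabla_{\vec{\sigma}} \boldsymbol{A}$ are first eliminated by requiring that

$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} - \sigma \frac{\partial \hat{\boldsymbol{A}}}{\partial t} = 0 \quad \text{in } G, \tag{153}$$

with the terminal condition on $\hat{\boldsymbol{A}}$:

$$\hat{\boldsymbol{A}}|_{t=T} = 0 \quad \text{in } G. \tag{154}$$

The boundary condition for $\hat{\boldsymbol{A}}$ on $\partial G$ is derived from the requirement that the surface integrals over $\partial G$ in Eq. 152 cancel each other, that is,

$$\int_{\partial G} (\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}) \cdot \nabla_{\vec{\sigma}} \boldsymbol{A}\, dS - \int_{\partial G} \boldsymbol{L}(\nabla_{\vec{\sigma}} \boldsymbol{A}) \cdot \hat{\boldsymbol{A}}\, dS + b \int_{\partial G} \Delta B_n\, \nabla_{\vec{\sigma}} B_n\, dS = 0 \tag{155}$$

at any time $t \in (0, T)$. This condition will be elaborated on in the next section. Under these constraints, the gradient of $\chi^2(\vec{\sigma})$ takes the form

$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}}{\partial t} \cdot \hat{\boldsymbol{A}}\, dV dt. \tag{156}$$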

8.5 Boundary Condition for the Adjoint Potential

To relate $\nabla_{\vec{\sigma}} \boldsymbol{A}$ and $\nabla_{\vec{\sigma}} B_n$ in the constraint described by Eq. 155 and, subsequently, to eliminate $\nabla_{\vec{\sigma}} \boldsymbol{A}$ from it, $\boldsymbol{A}$ needs to be parameterized. In the colatitudinal direction, $\boldsymbol{A}$ will be represented as a series of the zonal toroidal vector spherical harmonics $\boldsymbol{Y}_j^j(\vartheta)$ in the form given by Eq. 87, which is also employed for the adjoint potential $\hat{\boldsymbol{A}}$:

$$\begin{Bmatrix} \boldsymbol{A}(r, \vartheta, t) \\ \hat{\boldsymbol{A}}(r, \vartheta, t) \end{Bmatrix} = \sum_{j=1}^{\infty} \begin{Bmatrix} A_j^j(r, t) \\ \hat{A}_j^j(r, t) \end{Bmatrix} \boldsymbol{Y}_j^j(\vartheta). \tag{157}$$

In the radial direction, inside a conducting sphere $B$ of radius $a$, the spherical harmonic expansion coefficients $A_j^j(r, t)$ are parameterized by $P + 1$ piecewise-linear finite elements $\psi_k(r)$ on the interval $0 \le r \le a$ as shown by Eq. 93:


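$$A_j^j(r, t) = \sum_{k=1}^{P+1} A_j^{j,k}(t)\, \psi_k(r). \tag{158}$$

In an insulating atmosphere $A$, the spherical harmonic expansion coefficients $A_j^j(r, t)$ are parameterized in the form given by Eq. 100:

$$A_j^j(r, t) = a \left[ \sqrt{\frac{j}{j+1}} \left( \frac{r}{a} \right)^{j} G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}} \left( \frac{a}{r} \right)^{j+1} G_j^{(i)}(t) \right]. \tag{159}$$

The same parameterizations as shown by Eqs. 158 and 159 are taken for the coefficients $\hat{A}_j^j(r, t)$.

The first aim is to express the gradient $\nabla_{\vec{\sigma}} \chi^2$ in terms of spherical harmonics. Since the upper boundary $\partial G$ of the solution domain $G$ is the mean-orbit sphere of radius $b$, the external normal $\boldsymbol{n}$ to $\partial G$ coincides with the unit vector $\boldsymbol{e}_r$, that is, $\boldsymbol{n} = \boldsymbol{e}_r$. Applying the gradient operator $\nabla_{\vec{\sigma}}$ on the equation $B_n = \boldsymbol{e}_r \cdot \mathrm{curl}\,\boldsymbol{A}$ and using Eq. 224 yields

$$\nabla_{\vec{\sigma}} B_n(r, \vartheta, t) = -\frac{1}{r} \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \nabla_{\vec{\sigma}} A_j^j(r, t)\, Y_j(\vartheta). \tag{160}$$

Moreover, applying a two-step, track-by-track spherical harmonic analysis on the residual satellite-track data $\Delta B_n$ defined by Eq. 140, these observables can, at a particular time $t \in (0, T)$, be represented as a series of the zonal scalar spherical harmonics:

$$\Delta B_n(\vartheta, t; \vec{\sigma}) = \sum_{j=1}^{\infty} \Delta B_{n,j}(t; \vec{\sigma})\, Y_j(\vartheta) \tag{161}$$

with spherical harmonic coefficients of the form

$$\Delta B_{n,j}(t; \vec{\sigma}) = \frac{1}{b^2} \int_{\partial G} w_b^2 \left[ B_n^{(\mathrm{obs})}(\vartheta, t) - B_n(b, \vartheta, t; \vec{\sigma}) \right] Y_j(\vartheta)\, dS. \tag{162}$$

Substituting Eqs. 160 and 161 into Eq. 139 and employing the orthonormality property (212) of the zonal scalar spherical harmonics $Y_j(\vartheta)$, the gradient of the misfit $\chi^2$ becomes

$$\nabla_{\vec{\sigma}} \chi^2 = \frac{b^2}{\mu} \int_0^T \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma})\, \nabla_{\vec{\sigma}} A_j^j(b, t)\, dt. \tag{163}$$

The constraint (155) with $\boldsymbol{L} = 0$, that is, for the case of the boundary condition (142), is now expressed in terms of spherical harmonics. By the parameterization (157) and the assumption $\boldsymbol{n} = \boldsymbol{e}_r$, the differential relation (226) applied to $\hat{\boldsymbol{A}}$ yields

$$\boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}} = \sum_{j=1}^{\infty} \left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(r, t) \right]_j^j \boldsymbol{Y}_j^j(\vartheta), \tag{164}$$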


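where

$$\left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(r, t) \right]_j^j = -\left( \frac{d}{dr} + \frac{1}{r} \right) \hat{A}_j^j(r). \tag{165}$$

The first constituent in the first integral of the constraint (155) is expressed by Eq. 164, while the second constituent can be obtained by applying the gradient operator $\nabla_{\vec{\sigma}}$ to Eq. 157. The two constituents in the second integral of the constraint (155) are expressed by Eqs. 160 and 161, respectively. Performing all indicated substitutions, one obtains

$$\int_{\varphi=0}^{2\pi} \int_{\vartheta=0}^{\pi} \sum_{j_1=1}^{\infty} \left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(b, t) \right]_{j_1}^{j_1} \boldsymbol{Y}_{j_1}^{j_1}(\vartheta) \cdot \sum_{j_2=1}^{\infty} \nabla_{\vec{\sigma}} A_{j_2}^{j_2}(b, t)\, \boldsymbol{Y}_{j_2}^{j_2}(\vartheta)\, b^2 \sin\vartheta\, d\vartheta\, d\varphi = b \int_{\varphi=0}^{2\pi} \int_{\vartheta=0}^{\pi} \sum_{j_1=1}^{\infty} \Delta B_{n,j_1}(t; \vec{\sigma})\, Y_{j_1}(\vartheta)\, \frac{1}{b} \sum_{j_2=1}^{\infty} \sqrt{j_2(j_2+1)}\, \nabla_{\vec{\sigma}} A_{j_2}^{j_2}(b, t)\, Y_{j_2}(\vartheta)\, b^2 \sin\vartheta\, d\vartheta\, d\varphi. \tag{166}$$

Interchanging the order of integration over the full solid angle and summations over $j$’s, and making use of the orthonormality properties (212) and (217) of the zonal scalar and vector spherical harmonics, respectively, Eq. 166 reduces to

$$\sum_{j=1}^{\infty} \left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(b, t) \right]_j^j \nabla_{\vec{\sigma}} A_j^j(b, t) = \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma})\, \nabla_{\vec{\sigma}} A_j^j(b, t), \tag{167}$$

which is to be valid at any time $t \in (0, T)$. To satisfy this constraint independently of $\nabla_{\vec{\sigma}} A_j^j(b, t)$, one last condition is imposed upon the adjoint potential $\hat{\boldsymbol{A}}$, namely,

$$\left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(b, t) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma}) \quad \text{on } \partial G \tag{168}$$

at any time $t \in (0, T)$.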

8.6 Adjoint Method

The formulation of the adjoint method of EM induction for the Z component of CHAMP satellite magnetic data can be summarized as follows. Given the electrical conductivity model $\sigma(r, \vartheta)$ in the sphere $B$, the forward solution $\boldsymbol{A}(r, \vartheta, t)$ in $B$ and the atmosphere $A$ for $t \in (0, T)$, and the observations $B_n^{(\mathrm{obs})}(t)$ on the mean-orbit sphere $\partial G$ of radius $r = b$, with uncertainties quantified by the weighting factor $w_b$, find the adjoint potential $\hat{\boldsymbol{A}}(r, \vartheta, t)$ in $G = B \cup A$ by solving the adjoint problem:

$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\hat{\boldsymbol{A}} - \sigma \frac{\partial \hat{\boldsymbol{A}}}{\partial t} = 0 \quad \text{in } G \tag{169}$$

with the boundary condition


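$$\left[ \boldsymbol{n} \times \mathrm{curl}\,\hat{\boldsymbol{A}}(b, t) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(t) \quad \text{on } \partial G \tag{170}$$

and the terminal condition

$$\hat{\boldsymbol{A}}|_{t=T} = 0 \quad \text{in } G. \tag{171}$$

The gradient of the misfit $\chi^2(\vec{\sigma})$ is then expressed as

$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}(t)}{\partial t} \cdot \hat{\boldsymbol{A}}(t)\, dV dt. \tag{172}$$

The set of Eqs. 169–171 is referred to as the adjoint problem of the forward problem specified by Eqs. 133–135. Combining the forward solution $\boldsymbol{A}$ and the adjoint solution $\hat{\boldsymbol{A}}$ according to Eq. 172 thus gives the exact derivative of the misfit $\chi^2$.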

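8.7 Reverse Time

The numerical solution of Eq. 169, solved backwards in time from $t = T$ to $t = 0$, is inherently unstable. Unlike the case of the forward model equation and the forward sensitivity equation, the adjoint equation effectively includes negative diffusion, which enhances numerical perturbations instead of smoothing them, leading to an unstable solution. To avoid such numerical instability, the sign of the diffusive term in Eq. 169 is changed by reversing the time variable. Let the reverse time $\tau = T - t$, $\tau \in (0, T)$, and the reverse-time adjoint potential $\check{\boldsymbol{A}}(\tau)$ be introduced such that

$$\hat{\boldsymbol{A}}(t) = \hat{\boldsymbol{A}}(T - \tau) =: \check{\boldsymbol{A}}(\tau). \tag{173}$$

Hence

$$\frac{\partial \hat{\boldsymbol{A}}}{\partial t} = -\frac{\partial \check{\boldsymbol{A}}}{\partial \tau}, \tag{174}$$

and Eq. 169 transforms to the diffusion equation for the reverse-time adjoint potential $\check{\boldsymbol{A}}(\tau)$:

$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\check{\boldsymbol{A}} + \sigma \frac{\partial \check{\boldsymbol{A}}}{\partial \tau} = 0 \quad \text{in } G \tag{175}$$

with the boundary condition

$$\left[ \boldsymbol{n} \times \mathrm{curl}\,\check{\boldsymbol{A}}(b, \tau) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(T - \tau) \quad \text{on } \partial G. \tag{176}$$

The terminal condition (171) for $\hat{\boldsymbol{A}}$ becomes the initial condition for the potential $\check{\boldsymbol{A}}$:

$$\check{\boldsymbol{A}}|_{\tau=0} = 0 \quad \text{in } G. \tag{177}$$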
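The effect of the sign change can be made explicit for a single spatial mode. The following sketch assumes a spatially uniform conductivity $\sigma > 0$, a mode that is an eigenfunction of the curl curl operator with eigenvalue $\lambda > 0$, and no boundary forcing; the scalar amplitudes $a$, $\hat{a}$, and $\check{a}$ are introduced here only for illustration.

```latex
\begin{aligned}
&\text{forward mode (Eq. 133):}
  & \frac{\lambda}{\mu}\, a + \sigma \frac{da}{dt} &= 0
  & &\Rightarrow\quad a(t) = a(0)\, e^{-\lambda t/(\mu\sigma)}, \\
&\text{adjoint mode in } t \text{ (Eq. 169):}
  & \frac{\lambda}{\mu}\, \hat{a} - \sigma \frac{d\hat{a}}{dt} &= 0
  & &\Rightarrow\quad \hat{a}(t) = \hat{a}(0)\, e^{+\lambda t/(\mu\sigma)}, \\
&\text{adjoint mode in } \tau = T - t \text{ (Eq. 175):}
  & \frac{\lambda}{\mu}\, \check{a} + \sigma \frac{d\check{a}}{d\tau} &= 0
  & &\Rightarrow\quad \check{a}(\tau) = \check{a}(0)\, e^{-\lambda \tau/(\mu\sigma)}.
\end{aligned}
```

Written in the variable $t$, the adjoint mode grows exponentially with $t$, which is the negative-diffusion behavior described above; written in the reverse time $\tau$, it obeys the same decaying law as the forward mode.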


With these changes, the adjoint equations become similar to those of the forward method, and thus nearly identical numerical methods can be applied. In addition, the gradient (172) transforms to

$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma\, \frac{\partial \boldsymbol{A}(t)}{\partial t} \cdot \check{\boldsymbol{A}}(T - t)\, dV dt. \tag{178}$$

The importance of Eq. 178 is that, once the forward problem (133)–(135) is solved and the misfit $\chi^2$ is evaluated from Eq. 137, the gradient $\nabla_{\vec{\sigma}} \chi^2$ may be evaluated for little more than the cost of a single solution of the adjoint system (175)–(177) and a single scalar product in Eq. 178, regardless of the dimension of the conductivity parameter vector $\vec{\sigma}$. By comparison, other methods of evaluating $\nabla_{\vec{\sigma}} \chi^2$ typically require one solution of the forward problem (133)–(135) per component of $\vec{\sigma}$.

The specific steps involved in the adjoint computations are now explained. First, the forward solutions $\boldsymbol{A}(t_i)$ are calculated at discrete times $0 = t_0 < t_1 < \cdots < t_n = T$ by solving the forward problem (133)–(135), and each solution $\boldsymbol{A}(t_i)$ must be stored. Then, the reverse-time adjoint solutions $\check{\boldsymbol{A}}(t_i)$, $i = 0, \ldots, n$, are calculated, proceeding again forwards in time according to Eqs. 175–177. As each adjoint solution is computed, the misfit and its derivative are updated according to Eqs. 137 and 178, respectively. When $\check{\boldsymbol{A}}(T)$ has finally been calculated, both $\chi^2$ and $\nabla_{\vec{\sigma}} \chi^2$ are known. The forward solutions $\boldsymbol{A}(t_i)$ are stored because Eqs. 176 and 178 depend on them for the adjoint calculation. As a result, the numerical algorithm has memory requirements that are linear with respect to the number of time steps. This is the main drawback of the adjoint method.
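The computational steps above can be sketched on a toy discrete diffusion problem. This is an illustration under stated assumptions, not the chapter's spectral finite-element code: the system `diag(s) du/dt + K u = f(t)` stands in for Eq. 133, the data are the values of `u` at a single "boundary" node, and the adjoint recursion is the exact adjoint of the backward-Euler scheme (a discrete analogue of Eqs. 175–178). All function names and numerical values are hypothetical.

```python
import numpy as np


def forward(s, K, f, dt):
    """Backward Euler for diag(s) du/dt + K u = f(t); stores every time level."""
    u = np.zeros((f.shape[0] + 1, s.size))
    lhs = np.diag(s) + dt * K
    for n in range(f.shape[0]):
        u[n + 1] = np.linalg.solve(lhs, s * u[n] + dt * f[n])
    return u


def misfit(u, obs_node, data, dt):
    """Least-squares misfit of the modeled series at one node, plus residuals."""
    residual = u[1:, obs_node] - data
    return 0.5 * dt * np.sum(residual**2), residual


def adjoint_gradient(s, K, u, residual, obs_node, dt):
    """Gradient of the misfit w.r.t. s from one sweep of the adjoint system.

    The sweep runs from the last time level to the first, i.e. forwards in
    reverse time, and is driven by the stored residuals at the data node.
    """
    lhs_t = (np.diag(s) + dt * K).T
    lam_next = np.zeros_like(s)
    grad = np.zeros_like(s)
    for n in range(residual.size, 0, -1):
        rhs = s * lam_next
        rhs[obs_node] -= dt * residual[n - 1]
        lam = np.linalg.solve(lhs_t, rhs)
        grad += lam * (u[n] - u[n - 1])   # discrete (du/dt) times adjoint
        lam_next = lam
    return grad


def finite_difference_gradient(s, K, f, data, obs_node, dt, h=1.0e-6):
    """Reference gradient: two forward runs per parameter."""
    grad = np.zeros_like(s)
    for m in range(s.size):
        step = np.zeros_like(s)
        step[m] = h
        j_plus, _ = misfit(forward(s + step, K, f, dt), obs_node, data, dt)
        j_minus, _ = misfit(forward(s - step, K, f, dt), obs_node, data, dt)
        grad[m] = (j_plus - j_minus) / (2.0 * h)
    return grad


# Toy setup (assumed values): five nodes, forcing and data at the last node.
n_nodes, n_steps, dt = 5, 20, 0.1
K = 2.0 * np.eye(n_nodes) - np.eye(n_nodes, k=1) - np.eye(n_nodes, k=-1)
f = np.zeros((n_steps, n_nodes))
f[:, -1] = np.sin(dt * np.arange(1, n_steps + 1))
s_true = np.array([1.0, 2.0, 1.5, 0.8, 1.2])
s_trial = np.full(n_nodes, 1.0)
obs_node = n_nodes - 1

data = forward(s_true, K, f, dt)[1:, obs_node]   # synthetic observations
u = forward(s_trial, K, f, dt)
chi2, residual = misfit(u, obs_node, data, dt)
grad_adj = adjoint_gradient(s_trial, K, u, residual, obs_node, dt)
grad_fd = finite_difference_gradient(s_trial, K, f, data, obs_node, dt)
```

The adjoint route needs one forward run and one adjoint sweep for all five parameters, whereas the finite-difference reference needs ten forward runs; the price is that every forward time level `u[n]` has to be kept in memory.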

8.8 Weak Formulation

The adjoint IBVP (175)–(177) can again be reformulated in a weak sense. Creating an auxiliary boundary-value vector

$$\boldsymbol{B}_t^{(\mathrm{adj})}(\vartheta, \tau) := \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta B_{n,j}(T - \tau)\, \boldsymbol{Y}_j^j(\vartheta) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta Z_j(T - \tau)\, \boldsymbol{Y}_j^j(\vartheta), \tag{179}$$

where the negative vertical downward Z component of the magnetic induction vector $\boldsymbol{B}_0$ has been substituted for the normal upward $B_n$ component of $\boldsymbol{B}_0$, the boundary condition (176) can be written as

$$\boldsymbol{n} \times \mathrm{curl}\,\check{\boldsymbol{A}} = \boldsymbol{B}_t^{(\mathrm{adj})} \quad \text{on } \partial G. \tag{180}$$

It can be seen that the adjoint problem (175), (177), and (180) for the reverse-time adjoint potential $\check{\boldsymbol{A}}(\tau)$ has the same form as the forward problem (133)–(135) for the forward potential $\boldsymbol{A}$. Hence, the weak formulation of the adjoint problem is given by the variational equation (75), where the forward boundary data vector $\boldsymbol{B}_t$ is to be replaced by the adjoint boundary data vector $\boldsymbol{B}_t^{(\mathrm{adj})}$. In addition, the form similarity between the expression (57) for $\boldsymbol{B}_t$ and the expression (179) for $\boldsymbol{B}_t^{(\mathrm{adj})}$ enables one to express the spherical harmonic representation (106) of the linear functional $F(\cdot)$ in a unified form:

$$F(\delta \boldsymbol{A}_0) = \frac{a b^2}{\mu} \sum_{j=1}^{\infty} D_j(t) \left[ j \left( \frac{b}{a} \right)^{j} \delta G_j^{(e)} - (j+1) \left( \frac{a}{b} \right)^{j+1} \delta G_j^{(i)} \right], \tag{181}$$

where

$$D_j(t) = \begin{cases} X_j(t) & \text{for the forward method}, \\ \Delta Z_j(T - t) & \text{for the adjoint method}, \end{cases} \tag{182}$$

and $\Delta Z_j(t)$ is the residual between the Z component of the CHAMP observations and the forward-modeled data, that is, $\Delta Z_j(t) = Z_j^{(\mathrm{obs})}(t) - Z_j(t; \vec{\sigma})$.

9 Adjoint Sensitivity Method of EM Induction for the Internal Gauss Coefficients of CHAMP Magnetic Data

In this section, the adjoint sensitivity method of EM induction for computing the sensitivities of the internal Gauss coefficients of CHAMP magnetic data with respect to mantle conductivity structure is formulated.

9.1 Forward Method

The forward method of EM induction for the external Gauss coefficients of CHAMP magnetic data was formulated in Sect. 5. As discussed, the solution domain for EM induction modeling is the conducting sphere $B$ with the boundary $\partial B$ coinciding with the mean Earth surface of radius $r = a$. Since both the external and internal Gauss coefficients are associated with the spherical harmonic expansion of the magnetic scalar potential $U$ in the near-space atmosphere above the Earth's surface, the boundary condition for $G_j^{(e)}(t)$ can only be formulated in terms of the spherical harmonic expansion coefficients of the sought-after toroidal vector potential $\boldsymbol{A}$. First, the forward IBVP of EM induction for the Gauss coefficient $G_j^{(e)}(t)$ is briefly reviewed: The toroidal vector potential $\boldsymbol{A}$ inside the conductive sphere $B$ with a given conductivity $\sigma(r,\vartheta)$ is sought such that, for $t > 0$, it holds that

$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\boldsymbol{A} + \sigma\frac{\partial\boldsymbol{A}}{\partial t} = 0 \quad \text{in } B \tag{183}$$

with the boundary condition

$$\left[\boldsymbol{n}\times\operatorname{curl}\boldsymbol{A}(a)\right]_j^j - \frac{j}{a}A_j^j(a) = -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)} \quad \text{on } \partial B \tag{184}$$

and the inhomogeneous initial condition

$$\boldsymbol{A}\big|_{t=0} = \boldsymbol{A}^0 \quad \text{in } B. \tag{185}$$


9.2 Misfit Function and Its Gradient in the Parameter Space

In comparison with the adjoint sensitivity method for the $Z$ component of CHAMP magnetic data, yet another modification concerns the definition of the misfit function. Let the observations $G_j^{(i,obs)}(t)$, $j = 1, 2, \dots, j_{max}$, be made over the time interval $(0, T)$. The least-squares misfit is then defined as

$$\chi^2(\vec\sigma) := \frac{a^3}{2\mu}\int_0^T \sum_{j=1}^{j_{max}}\left[G_j^{(i,obs)}(t) - G_j^{(i)}(t;\vec\sigma)\right]^2 dt, \tag{186}$$

where the forward-modeled data $G_j^{(i)}(t;\vec\sigma)$ are computed according to Eq. 119 after solving the forward IBVP of EM induction (183)–(185). To bring the misfit function (186) to a form that is analogous to Eq. 137, two auxiliary quantities are introduced:

$$\begin{Bmatrix} G^{(i,obs)}(\vartheta,t) \\ G^{(i)}(\vartheta,t;\vec\sigma) \end{Bmatrix} = \sum_{j=1}^{j_{max}} \begin{Bmatrix} G_j^{(i,obs)}(t) \\ G_j^{(i)}(t;\vec\sigma) \end{Bmatrix} Y_j(\vartheta), \tag{187}$$

by means of which, and considering the orthonormality property of spherical harmonics $Y_j(\vartheta)$, the misfit (186) can be written as

$$\chi^2(\vec\sigma) = \frac{a}{2\mu}\int_0^T\!\!\int_{\partial B}\left[G^{(i,obs)} - G^{(i)}(\vec\sigma)\right]^2 dS\,dt. \tag{188}$$

In contrast to Eq. 137, the weighting factor $w_b^2$ does not appear in the integral (188) since possible inconsistencies in the CHAMP magnetic data are already considered in data processing for $G_j^{(i,obs)}(t)$ (see Sect. 7). Realizing that the observations $G^{(i,obs)}$ are independent of the conductivity parameters $\vec\sigma$, that is, $\nabla_{\vec\sigma}G^{(i,obs)} = 0$, the gradient of $\chi^2(\vec\sigma)$ is

$$\nabla_{\vec\sigma}\chi^2 = -\frac{a}{\mu}\int_0^T\!\!\int_{\partial B}\Delta G^{(i)}(\vec\sigma)\,\nabla_{\vec\sigma}G^{(i)}\,dS\,dt, \tag{189}$$

where $\Delta G^{(i)}(\vec\sigma)$ are the residuals of the internal Gauss coefficients:

$$\Delta G^{(i)}(\vec\sigma) := G^{(i,obs)} - G^{(i)}(\vec\sigma). \tag{190}$$

9.3 Adjoint Method

Differentiating Eqs. 183 and 185 for the forward solution with respect to the conductivity parameters $\vec\sigma$ results in sensitivity equations of the same form as Eqs. 141 and 143, but now valid inside the sphere $B$. The appropriate boundary condition for the sensitivities $\nabla_{\vec\sigma}\boldsymbol{A}$ is obtained by differentiating Eq. 184 with respect to the parameters $\vec\sigma$:

$$\left[\boldsymbol{n}\times\operatorname{curl}\nabla_{\vec\sigma}\boldsymbol{A}(a)\right]_j^j - \frac{j}{a}\nabla_{\vec\sigma}A_j^j(a) = 0 \quad \text{on } \partial B. \tag{191}$$

Multiplying the last equation by the zonal toroidal vector spherical harmonics and summing up the result over $j$, the sensitivity equation (191) can be written in the form of Eq. 144, where the linear vector boundary operator $\boldsymbol{L}$ has the form

$$\boldsymbol{L}(\nabla_{\vec\sigma}\boldsymbol{A}) = \frac{1}{a}\sum_{j=1}^{\infty} j\,\nabla_{\vec\sigma}A_j^j(a)\,\boldsymbol{Y}_j^j(\vartheta). \tag{192}$$

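In view of the form similarity between the expressions (139) and (189), the boundary condition for $\hat{\boldsymbol{A}}$ on $\partial B$ can be deduced from the condition (155):

$$\int_{\partial B}(\boldsymbol{n}\times\operatorname{curl}\hat{\boldsymbol{A}})\cdot\nabla_{\vec\sigma}\boldsymbol{A}\,dS - \int_{\partial B}\boldsymbol{L}(\nabla_{\vec\sigma}\boldsymbol{A})\cdot\hat{\boldsymbol{A}}\,dS + a\int_{\partial B}\Delta G^{(i)}\,\nabla_{\vec\sigma}G^{(i)}\,dS = 0, \tag{193}$$

which must be valid at any time $t \in (0, T)$.

To express the partial derivatives of the forward-modeled data $G_j^{(i)}$ with respect to the conductivity parameters $\vec\sigma$, that is, the gradient $\nabla_{\vec\sigma}G_j^{(i)}$, in terms of the sensitivities $\nabla_{\vec\sigma}\boldsymbol{A}$, Eq. 119 is differentiated with respect to the conductivity parameters $\vec\sigma$:

$$\nabla_{\vec\sigma}G_j^{(i)} = -\frac{1}{a}\sqrt{\frac{j}{j+1}}\,\nabla_{\vec\sigma}A_j^j(a), \tag{194}$$

where $\nabla_{\vec\sigma}G_j^{(e)} = 0$ has been considered because the forward-model boundary data $G_j^{(e)}$ are independent of the conductivity parameters $\vec\sigma$.

The constraint (193) can finally be expressed in terms of spherical harmonics. The first constituent in the first integral of the constraint (193) is expressed by Eq. 164, while the second constituent can be obtained by applying the gradient operator $\nabla_{\vec\sigma}$ to Eq. 157. The two constituents in the second and third integrals of the constraint (193) are expressed by Eqs. 192 and 157 and by Eqs. 190 and 194, respectively. Performing all indicated substitutions results in

$$\begin{aligned}
&\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\sum_{j_1=1}^{\infty}\left[\boldsymbol{n}\times\operatorname{curl}\hat{\boldsymbol{A}}(a,t)\right]_{j_1}^{j_1}\boldsymbol{Y}_{j_1}^{j_1}(\vartheta)\cdot\sum_{j_2=1}^{\infty}\nabla_{\vec\sigma}A_{j_2}^{j_2}(a,t)\,\boldsymbol{Y}_{j_2}^{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi \\
&\quad - \frac{1}{a}\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\sum_{j_1=1}^{\infty} j_1\,\nabla_{\vec\sigma}A_{j_1}^{j_1}(a,t)\,\boldsymbol{Y}_{j_1}^{j_1}(\vartheta)\cdot\sum_{j_2=1}^{\infty}\hat{A}_{j_2}^{j_2}(a,t)\,\boldsymbol{Y}_{j_2}^{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi \\
&= a\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\sum_{j_1=1}^{\infty}\Delta G_{j_1}^{(i)}(t;\vec\sigma)\,Y_{j_1}(\vartheta)\sum_{j_2=1}^{\infty}\frac{1}{a}\sqrt{\frac{j_2}{j_2+1}}\,\nabla_{\vec\sigma}A_{j_2}^{j_2}(a,t)\,Y_{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi.
\end{aligned} \tag{195}$$

Interchanging the order of integration over the full solid angle and summations over the $j$'s, and making use of the orthonormality properties (212) and (217) of the zonal scalar and vector spherical harmonics, respectively, Eq. 195 reduces to

$$\sum_{j=1}^{\infty}\left\{\left[\boldsymbol{n}\times\operatorname{curl}\hat{\boldsymbol{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t)\right\}\nabla_{\vec\sigma}A_j^j(a,t) = \sum_{j=1}^{\infty}\sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t;\vec\sigma)\,\nabla_{\vec\sigma}A_j^j(a,t). \tag{196}$$

To satisfy this constraint independent of $\nabla_{\vec\sigma}A_j^j(a,t)$, one last condition is imposed upon the adjoint potential $\hat{\boldsymbol{A}}$, namely,

$$\left[\boldsymbol{n}\times\operatorname{curl}\hat{\boldsymbol{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t;\vec\sigma) \quad \text{on } \partial B \tag{197}$$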

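for $j = 1, 2, \dots, j_{max}$, and at any time $t \in (0, T)$.

The formulation of the adjoint method of EM induction for the internal Gauss coefficients is now summarized. Given the electrical conductivity $\sigma(r,\vartheta)$ in the conducting sphere $B$, the forward solution $\boldsymbol{A}(r,\vartheta,t)$ in $B$, and the observations $G_j^{(i,obs)}(t)$, $j = 1, 2, \dots, j_{max}$, on the mean sphere $\partial B$ of radius $r = a$ for the time interval $(0, T)$, find the adjoint potential $\hat{\boldsymbol{A}}(r,\vartheta,t)$ in $B$, such that, for $t > 0$, it satisfies the magnetic diffusion equation

$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\hat{\boldsymbol{A}} - \sigma\frac{\partial\hat{\boldsymbol{A}}}{\partial t} = 0 \quad \text{in } B \tag{198}$$

with the boundary condition

$$\left[\boldsymbol{n}\times\operatorname{curl}\hat{\boldsymbol{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t) \quad \text{on } \partial B \tag{199}$$

for $j = 1, 2, \dots, j_{max}$, and the terminal condition

$$\hat{\boldsymbol{A}}\big|_{t=T} = 0 \quad \text{in } B. \tag{200}$$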

9.4 Weak Formulation

To find a stable solution of the diffusion equation (198), the reverse time $\tau = T - t$ and the reverse-time adjoint potential $\check{\boldsymbol{A}}(\tau)$ are introduced in the same manner as in Sect. 8.7. By this transformation, the negative sign at the diffusive term in Eq. 198 is inverted to a positive sign. The adjoint IBVP for $\check{\boldsymbol{A}}$ can be reformulated in a weak sense and described by the variational equation

$$a_1(\check{\boldsymbol{A}},\delta\boldsymbol{A}) + b(\check{\boldsymbol{A}},\delta\boldsymbol{A}) = f_2(\delta\boldsymbol{A}) \quad \forall\,\delta\boldsymbol{A}\in V, \tag{201}$$

where the solution space $V$ and the bilinear forms $a_1(\cdot,\cdot)$ and $b(\cdot,\cdot)$ are given by Eqs. 64, 117, and 67, respectively, and the new linear functional $f_2(\cdot)$ is defined by

$$f_2(\delta\boldsymbol{A}) = -\frac{a^2}{\mu}\sum_{j=1}^{\infty}\sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(T-\tau)\,\delta A_j^j(a). \tag{202}$$

Here, $\Delta G_j^{(i)}(t)$ are the residuals between the coefficients $G_j^{(i,sur)}(t)$, determined from the CHAMP observations of the $X$ and $Z$ components of the magnetic induction vector and continued downwards from the satellite's altitude to the Earth's surface according to Eq. 107,

$$G_j^{(i,sur)}(t) = (b/a)^{j+2}\,G_j^{(i,obs)}(t), \tag{203}$$

and the forward-modeled coefficients $G_j^{(i)}(t;\vec\sigma)$:

$$\Delta G_j^{(i)}(t) = G_j^{(i,sur)}(t) - G_j^{(i)}(t;\vec\sigma). \tag{204}$$

Having determined the forward solution $\boldsymbol{A}$ and the reverse-time adjoint solution $\check{\boldsymbol{A}}$, the gradient of the misfit $\chi^2(\vec\sigma)$ with respect to the conductivity parameters $\vec\sigma$ can be computed by

$$\nabla_{\vec\sigma}\chi^2 = \int_0^T\!\!\int_G \nabla_{\vec\sigma}\sigma\;\frac{\partial\boldsymbol{A}(t)}{\partial t}\cdot\check{\boldsymbol{A}}(T-t)\,dV\,dt. \tag{205}$$

Table 2 Boundary conditions for the forward and adjoint methods

| Method | Satellite magnetic components ($r = b$) | Ground-based Gauss coefficients ($r = a$) |
| Forward | $[\boldsymbol{n}\times\operatorname{curl}\boldsymbol{F}(t)]_j^j = -\sqrt{j(j+1)}\,X_j(t)$ | $[\boldsymbol{n}\times\operatorname{curl}\boldsymbol{F}(t)]_j^j - \frac{j}{a}F_j^j(t) = -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)}(t)$ |
| Adjoint | $[\boldsymbol{n}\times\operatorname{curl}\boldsymbol{F}(\tau)]_j^j = -\sqrt{j(j+1)}\,\Delta Z_j(T-\tau)$ | $[\boldsymbol{n}\times\operatorname{curl}\boldsymbol{F}(\tau)]_j^j - \frac{j}{a}F_j^j(\tau) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(T-\tau)$ |
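The downward continuation of Eq. 203 and the residual of Eq. 204 are simple enough to sketch directly. The Python fragment below is only an illustration: the function names, the satellite radius (the mean Earth radius plus an assumed 400 km altitude), and the coefficient values are all made up for the example and are not taken from the chapter.

```python
def continue_to_surface(g_obs, a, b):
    """Eq. 203: scale degree-j internal coefficients from r = b down to r = a.
    g_obs[0] is taken to be degree j = 1."""
    return [(b / a) ** (j + 2) * g for j, g in enumerate(g_obs, start=1)]

def residuals(g_sur, g_model):
    """Eq. 204: surface-continued observations minus forward-modeled values."""
    return [s - m for s, m in zip(g_sur, g_model)]

a = 6371.0                       # km, mean Earth radius
b = a + 400.0                    # km, assumed satellite radius (illustrative)
g_obs = [10.0, -2.0, 0.5, 0.1]   # made-up coefficients for j = 1..4
g_sur = continue_to_surface(g_obs, a, b)
delta_g = residuals(g_sur, g_obs)
factors = [s / o for s, o in zip(g_sur, g_obs)]
```

Because $b > a$, the factor $(b/a)^{j+2}$ grows with degree, so higher-degree coefficients, and any noise they carry, are amplified more strongly by the continuation.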

9.5 Summary

The forward and adjoint IBVPs of EM induction for the CHAMP satellite data can be formulated in a unified way. Let $\boldsymbol{F}$ denote either the toroidal vector potential $\boldsymbol{A}$ or the reverse-time adjoint toroidal vector potential $\check{\boldsymbol{A}}$ for the forward and the adjoint problems, respectively. $\boldsymbol{F}$ is sought inside the conductive sphere $S$ with a given conductivity $\sigma(r,\vartheta)$ such that, for $t > 0$, it satisfies the magnetic diffusion equation

$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\boldsymbol{F} + \sigma\frac{\partial\boldsymbol{F}}{\partial t} = 0 \quad \text{in } S \tag{206}$$

with the inhomogeneous initial condition

$$\boldsymbol{F}\big|_{t=0} = \boldsymbol{F}^0 \quad \text{in } S \tag{207}$$

and an appropriate boundary condition chosen from the set of boundary conditions summarized for convenience in Table 2.

10 Sensitivity Analysis for CHAMP Magnetic Data

The forward and adjoint solutions are now computed for the 2001 CHAMP data (see Sect. 7) with spherical harmonic cutoff degree $j_{max} = 4$ and time step $\Delta t = 1$ h. The sensitivity analysis of the data will be performed with respect to two different conductivity models: a three-layer, 1-D conductivity model and a two-layer, 2-D conductivity model. For each case, the approximation error of the adjoint sensitivity method is first investigated, and then the conjugate gradient method is run to search for an optimal conductivity model by adjusting the $Z$ component of CHAMP data in a least-squares sense.

10.1 Brute-Force Sensitivities

Sensitivities generated with the adjoint sensitivity method (ASM), hereafter called the adjoint sensitivities, will be compared to those generated by direct numerical differentiation of the misfit, the so-called brute-force method (BFM) (e.g., Bevington, 1969), in which the partial derivative of the misfit with respect to $\sigma_m$ at the point $\vec\sigma^0$ is approximated by the second-order accurate centered difference of two forward model runs:

$$\left[\frac{\partial\chi^2}{\partial\sigma_m}\right]_{\vec\sigma^0} \approx \frac{\chi^2(\sigma_1^0,\dots,\sigma_m^0+\varepsilon,\dots,\sigma_M^0) - \chi^2(\sigma_1^0,\dots,\sigma_m^0-\varepsilon,\dots,\sigma_M^0)}{2\varepsilon}, \tag{208}$$

where $\varepsilon$ refers to a perturbation applied to the nominal value of $\sigma_m^0$.
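The centered difference of Eq. 208 can be written in a few lines. The following Python sketch is generic and assumes only that the misfit is available as a function of the parameter list; the toy quadratic misfit and all names are hypothetical stand-ins for a full forward model run.

```python
def brute_force_gradient(misfit, sigma0, eps=0.01):
    """Centered-difference approximation of Eq. 208 for every parameter.
    Costs two misfit evaluations (forward runs) per parameter."""
    gradient = []
    for m in range(len(sigma0)):
        plus = list(sigma0)
        minus = list(sigma0)
        plus[m] += eps
        minus[m] -= eps
        gradient.append((misfit(plus) - misfit(minus)) / (2.0 * eps))
    return gradient

def toy_misfit(s):
    # Stand-in for a forward model run followed by a misfit evaluation.
    return (s[0] - 1.0) ** 2 + 3.0 * (s[1] + 2.0) ** 2 + s[0] * s[1]

grad = brute_force_gradient(toy_misfit, [0.5, -1.0])
```

For $M$ parameters this requires $2M$ forward runs, which is why the brute-force method serves here only as a check on the adjoint sensitivities rather than as a practical way to compute gradients.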

10.2 Model Parameterization

To parameterize the electrical conductivity, the radial interval $[0, a]$ is divided into $L$ subintervals by the nodes $0 = R_1 < R_2 < \dots < R_L < R_{L+1} = a$ such that the radial dependence of the electrical conductivity $\sigma(r,\vartheta)$ is approximated by piecewise constant functions:

$$\sigma(r,\vartheta) = \sigma_\ell(\vartheta), \qquad R_\ell \le r \le R_{\ell+1}, \tag{209}$$

where $\sigma_\ell(\vartheta)$ for a given layer $\ell = 1, \dots, L$ does not depend on the radial coordinate $r$. Moreover, let $\sigma_\ell(\vartheta)$ be parameterized by the zonal scalar spherical harmonics $Y_j(\vartheta)$. As a result, the logarithm of the electrical conductivity is considered in the form

$$\log\sigma(r,\vartheta;\vec\sigma) = \sqrt{4\pi}\sum_{\ell=1}^{L}\sum_{j=0}^{J}\sigma_{\ell j}\,\xi_\ell(r)\,Y_j(\vartheta), \tag{210}$$

where $\xi_\ell(r)$ is equal to 1 in the interval $R_\ell \le r \le R_{\ell+1}$ and 0 elsewhere. The number of conductivity parameters $\sigma_{\ell j}$, that is, the size of the conductivity parameter vector $\vec\sigma$, is $M = L(J+1)$.
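A short sketch may help to make the parameterization of Eq. 210 concrete. It assumes a base-10 logarithm, which is inferred from the nominal values quoted later in the chapter (a parameter value of 1 corresponding to 10 S/m), and it uses layer radii built from a mean Earth radius of 6,371 km; both are assumptions of this example, as are the function names.

```python
import math

def legendre(j, x):
    """Legendre polynomial P_j(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if j == 0:
        return p_prev
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def log10_conductivity(r, theta, nodes, params):
    """Eq. 210 with params[l][j] as the coefficient of layer l and degree j.
    nodes[l] <= r <= nodes[l + 1] selects layer l (the indicator function)."""
    for l in range(len(nodes) - 1):
        if nodes[l] <= r <= nodes[l + 1]:
            x = math.cos(theta)
            # sqrt(4*pi) * Y_j(theta) = sqrt(2j + 1) * P_j(cos theta)
            return sum(c * math.sqrt(2 * j + 1) * legendre(j, x)
                       for j, c in enumerate(params[l]))
    raise ValueError("radius outside the parameterized interval")

# Layer boundaries (km) for depths of 2,890, 1,500, 670, and 220 km.
nodes = [6371.0 - d for d in (2890.0, 1500.0, 670.0, 220.0)]
params_1d = [[1.0], [0.0], [-1.0]]       # one degree-0 coefficient per layer
sigma_llm = 10.0 ** log10_conductivity(4000.0, 0.3, nodes, params_1d)
sigma_ulm = 10.0 ** log10_conductivity(5000.0, 0.3, nodes, params_1d)
sigma_um = 10.0 ** log10_conductivity(6000.0, 0.3, nodes, params_1d)
```

With $J = 0$ the factor $\sqrt{4\pi}\,Y_0 = 1$, so each parameter is simply the logarithm of the layer conductivity; degree-1 terms add a dependence proportional to $\cos\vartheta$.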


10.3 Three-Layer, 1-D Conductivity Model

Consider a 1-D conducting sphere $B$ consisting of the lithosphere, the upper mantle (UM), the upper (ULM) and lower (LLM) parts of the lower mantle, and the core. The interfaces between the conductivity layers are kept fixed at depths of 220, 670, 1,500, and 2,890 km, respectively. The conductivities of the lithosphere and the core are 0.001 and 10,000 S/m, respectively, and fixed at these values for all computation runs; hence the number of conductivity parameters $\sigma_{\ell 0}$ is $L = 3$. The nominal values of the conductivity parameters are $\sigma_{10}^0 = 1$ (hence $\sigma_{LLM} = 10$ S/m), $\sigma_{20}^0 = 0$ ($\sigma_{ULM} = 1$ S/m), and $\sigma_{30}^0 = -1$ ($\sigma_{UM} = 0.1$ S/m).

10.3.1 Sensitivity Comparison

The results of the sensitivity tests computed for the three-layer, 1-D conductivity model are summarized in Fig. 7, where the top panels show the misfit $\chi^2$ as a function of one conductivity parameter $\sigma_{\ell 0}$, with the other two equal to the nominal values. The bottom panels compare the derivatives of the misfit obtained by the ASM with those obtained by the BFM. From these results, two conclusions can be drawn. First, the differences between the derivatives of the misfit obtained by the ASM and BFM (the dashed lines in the bottom panels) are about one order (for $\sigma_{30}$) and at least two orders (for $\sigma_{10}$ and $\sigma_{20}$) of magnitude smaller than the derivatives themselves, which justifies the validity of the ASM. The differences between the adjoint and brute-force sensitivities are caused by the approximation error in the numerical time differentiation (82). This error can be reduced by low-pass filtering of the CHAMP time series (Martinec and Velímský 2009). Second, both the top and bottom panels show that the misfit $\chi^2$ is most sensitive to conductivity changes in the upper mantle and that the sensitivity decreases with increasing depth of the conductivity layer, being least sensitive to conductivity changes in the lower part of the lower mantle.

10.3.2 Conjugate Gradient Inversion

The sensitivity results in Fig. 7 are encouraging with regard to the solution of the inverse problem for a 1-D mantle conductivity structure. The conjugate gradient (CG) minimization with bracketing and line searching is employed using Brent's method with derivatives (Press et al. 1992, Sect. 10.3) obtained by the ASM. The inverse problem is solved for the three parameters $\sigma_{\ell 0}$, with starting values equal to $(1.5, 0, -1)$. Figure 8 shows the results of the inversion, where the left panel displays the conductivity structure in the three-layer mantle and the right panel the misfit $\chi^2$ as a function of the CG iterations. The blue line shows the starting model of the CG minimization, the dotted line the model after the first iteration, and the red line the model after ten iterations. As expected from the sensitivity tests, the minimization first modifies the conductivities of the UM and ULM, to which the misfit $\chi^2$ is the most sensitive. When the UM and ULM conductivities are improved, the CG minimization also changes the LLM conductivity. The optimal values of the conductivity parameters after ten iterations are $(\sigma_{10}, \sigma_{20}, \sigma_{30}) = (1.990, 0.186, -0.501)$. This corresponds to the conductivities $\sigma_{ULM} = 1.53$ S/m and $\sigma_{UM} = 0.32$ S/m for the ULM and UM, which are considered to be well resolved, while the conductivity $\sigma_{LLM} = 97.8$ S/m should be treated with some reservation because of its poor resolution. A CHAMP time series longer than 1 year would be necessary to increase the sensitivity of CHAMP data to the LLM conductivity.
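The structure of a gradient-driven CG minimization can be sketched on a toy problem. The Python code below is not the routine used in the chapter: Brent's method with derivatives is replaced by a single secant step on the directional derivative (which is exact only for quadratic misfits), the Polak-Ribiere update is an assumed choice, and the quadratic toy misfit, its minimum, and all names are hypothetical. Only the starting point mirrors the starting values quoted in the text.

```python
def directional_derivative(gradient, x, d, t):
    """Derivative of the misfit along direction d at step length t."""
    g = gradient([xi + t * di for xi, di in zip(x, d)])
    return sum(gi * di for gi, di in zip(g, d))

def line_search(gradient, x, d):
    """One secant step on the directional derivative; a stand-in for
    Brent's method with derivatives, exact for quadratic misfits."""
    s0 = directional_derivative(gradient, x, d, 0.0)
    s1 = directional_derivative(gradient, x, d, 1.0)
    return 0.0 if s1 == s0 else -s0 / (s1 - s0)

def conjugate_gradient(gradient, x0, iterations):
    """Nonlinear CG with the Polak-Ribiere update, clipped at zero."""
    x = list(x0)
    g = gradient(x)
    d = [-gi for gi in g]
    for _ in range(iterations):
        gg = sum(gi * gi for gi in g)
        if gg == 0.0:
            break
        t = line_search(gradient, x, d)
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = gradient(x)
        beta = max(0.0, sum(gn * (gn - gi) for gn, gi in zip(g_new, g)) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x

targets = [2.0, 0.2, -0.5]     # minimum of the toy misfit, not a chapter result
weights = [1.0, 5.0, 20.0]     # unequal curvatures mimic unequal sensitivities

def toy_gradient(x):
    return [2.0 * w * (xi - t) for xi, t, w in zip(x, targets, weights)]

x_opt = conjugate_gradient(toy_gradient, [1.5, 0.0, -1.0], iterations=3)
```

In the chapter's setting each call to the gradient would correspond to one forward and one adjoint solution, so the number of gradient evaluations, rather than the number of parameters, controls the cost of the inversion.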

[Figure 7: three columns of panels for the layers CMB–1,500 km, 1,500–670 km, and 670–220 km, plotted against σ10, σ20, and σ30, respectively; top row χ², bottom row log|∇σ χ²|.]

Fig. 7 The misfit $\chi^2$ (top panels) and the magnitude of its sensitivities $\nabla_{\vec\sigma}\chi^2$ (bottom panels) as functions of the conductivity parameters $\sigma_{\ell 0}$ for the three-layer, 1-D conductivity model consisting of the lower and upper parts of the lower mantle ($\ell = 1, 2$) and the upper mantle ($\ell = 3$). Two panels in a column show a cross section through the respective hypersurface $\chi^2$ and $|\nabla_{\vec\sigma}\chi^2|$ in the 3-D parameter space along one parameter, while the other two model parameters are kept fixed and equal to nominal values $\vec\sigma^0 = (2, 0, -1)$. The adjoint sensitivities computed for $\Delta t = 1$ h (the solid lines in the bottom panels) are compared with the brute-force sensitivities ($\varepsilon = 0.01$) and their differences are shown (the dashed lines)

[Figure 8: left panel, conductivity (S/m) versus depth (km); right panel, χ² versus CG iteration.]

Fig. 8 Three-layer, 1-D conductivity model (left panel) best fitting the 2001 CHAMP data (red line), the starting model for the CG minimization (blue line), and the model after the first iteration (dotted line). The right panel shows the misfit $\chi^2$ as a function of CG iterations

10.4 Two-Layer, 2-D Conductivity Model

10.4.1 Sensitivity Comparison

The adjoint sensitivities are now computed for the 2-D conductivity model, again consisting of the lithosphere, the upper mantle, and the upper and lower parts of the lower mantle, with the interfaces at depths of 220, 670, 1,500, and 2,890 km, respectively. The conductivities of the UM and ULM are now considered to be $\vartheta$ dependent, such that the cutoff degree $J$ in the conductivity parameterization (210) is equal to $J = 1$. The conductivity of the lithosphere is again fixed to 0.001 S/m. Because of the rather poor resolution of the LLM conductivity, this conductivity is chosen to be equal to the optimal value obtained by the CG minimization, that is, 97.8 S/m, and is kept fixed throughout the sensitivity tests and subsequent inversion.

Complementary to the sensitivity tests for the zonal coefficients $\sigma_{\ell 0}$ shown in Fig. 7, the sensitivity tests for the non-zonal coefficients $\sigma_{\ell 1}$ of the ULM ($\ell = 1$) and UM ($\ell = 2$) are now carried out in a manner similar to that applied in Sect. 10.3.1, with the same nominal values for the zonal coefficients $\sigma_{\ell 0}$ and $\sigma_{11}^0 = \sigma_{21}^0 = 0$. The forward and adjoint solutions are again computed for the 2001 CHAMP data (see Sect. 7) with spherical harmonic cutoff degree $j_{max} = 4$ and time step $\Delta t = 1$ h. The Earth model is again divided into 40 finite-element layers with layer thicknesses increasing with depth.

Figure 9 summarizes the results of the sensitivity tests. The top panels show the misfit $\chi^2$ as a function of the parameters $\sigma_{11}$ and $\sigma_{21}$, where only one conductivity parameter is varied and the other is set to zero. The bottom panels compare the derivatives of the misfit obtained by the ASM and the BFM. It can be seen that the adjoint sensitivities show very good agreement with the brute-force results, with differences not exceeding 0.01 % of the magnitude of the sensitivities themselves. Moreover, the sensitivities to the latitudinal dependency of conductivity are significant, again more pronounced in the upper mantle than in the lower mantle. This indicates that the CHAMP data are capable of revealing lateral variations of conductivity in the upper and lower mantle.

[Figure 9: two columns of panels for the layers 1,500–670 km and 670–220 km, plotted against σ11 and σ21, respectively; top row χ², bottom row log|∇σ χ²|.]

Fig. 9 As for Fig. 7, but with respect to the conductivity parameters $\sigma_{\ell 1}$ of the latitudinally dependent conductivities of the upper part of the lower mantle ($\ell = 1$) and the upper mantle ($\ell = 2$). The nominal values of the conductivity parameters are $(\sigma_{\ell 0}^0, \sigma_{\ell 1}^0) = (0, 0, -1, 0)$. The results apply to conductivities of the lithosphere, the lower part of the lower mantle, and the core equal to 0.001, 97.8, and $10^4$ S/m, respectively

10.4.2 Conjugate Gradient Inversion

The sensitivity results in Fig. 9 encourage an attempt to solve the inverse problem for lateral variations of conductivity in the mantle. For this purpose, the CG minimization with derivatives obtained by the ASM is again employed. The inverse problem is solved for four parameters, $\sigma_{\ell 0}$ and $\sigma_{\ell 1}$, $\ell = 1, 2$. The starting values of $\sigma_{\ell 0}$ are the nominal values of the three-layer, 1-D conductivity model (see Sect. 10.3), while the values of $\sigma_{\ell 1}$ are set equal to zero at the start of minimization. The results of the inversion are summarized in Fig. 10, where the left and center panels show the conductivity structure in the ULM and UM, while the right panel shows the misfit $\chi^2$ as a function of CG iterations. The blue lines show the starting model of minimization, the dotted lines the model after the first iteration, and the red lines the final model after eight iterations. These models are compared with the optimal three-layer, 1-D conductivity model (black lines) found in Sect. 10.3.2. Again, as indicated by the sensitivity tests, the minimization, at the first stage, adjusts the conductivity in the upper mantle, to which the misfit $\chi^2$ is the most sensitive, and then varies the ULM conductivity, to which the misfit is less sensitive. The optimal values of the conductivity parameters after eight iterations are $(\sigma_{10}, \sigma_{11}, \sigma_{20}, \sigma_{21}) = (0.192, 0.008, -0.476, 0.106)$. It is concluded that the mantle conductivity variations in the latitudinal direction reach about 20 % of the mean value in the upper mantle and about 4 % in the upper part of the lower mantle. Comparing the optimal values of the zonal coefficients $\sigma_{10}$ and $\sigma_{20}$ with those found in Sect. 10.3.2 for a 1-D conductivity model, it is concluded that the averaged optimal 2-D conductivity structure closely approaches the optimal 1-D structure. This is also indicated in Fig. 10, where the final 2-D conductivity profile (red lines) intersects the optimal 1-D conductivity profile (black lines) at the magnetic equator.

[Figure 10: left and middle panels, conductivity (S/m) versus colatitude (°) for the layers 1,500–670 km and 670–220 km; right panel, χ² versus CG iteration.]

Fig. 10 Two-layer, latitudinally dependent conductivity model of the upper part of the lower mantle and the upper mantle (left and middle panels). The model best fitting the 2001 CHAMP data (red lines), the starting model for the CG minimization (blue lines), and the model after the first iteration (dotted line) are compared to the best 1-D conductivity model from Fig. 8 (black lines). The right panel shows the misfit $\chi^2$ as a function of the number of CG iterations; the dashed line shows the misfit $\chi^2$ for the best 1-D conductivity model

11 Conclusions

This chapter has been motivated by efforts to give a detailed presentation of the advanced mathematical methods available for interpreting the time series of CHAMP magnetic data such that the complete time series, not only their parts, can be considered in forward and inverse modeling and still be computationally feasible. It turned out that these criteria are satisfied by highly efficient methods of forward and adjoint sensitivity analysis that are numerically based on the time-domain, spectral finite-element method. This has been demonstrated for the year 2001 CHAMP time series with a time step of 1 h. Applying the forward and adjoint sensitivity methods to longer time series is straightforward, leading to memory and computational time requirements that are linear with respect to the number of time steps undertaken. The analysis of the complete, more than 8-year-long, CHAMP time series is ongoing with the particular objective of determining the lower-mantle conductivity.

The achievement of the present approach is its ability to use satellite data directly, without continuing them from the satellite altitude to the ground level or without decomposing them into the exciting and induced parts by spherical harmonic analysis. This fact is demonstrated for a 2-D configuration, for which the electrical conductivity and the external sources of the electromagnetic variations are axisymmetrically distributed and for which the external current excitation is transient, as for a magnetic storm. The 2-D case corresponds to the situation where vector magnetic data along each track of a satellite, such as the CHAMP satellite, are used. The present approach can be extended to the transient electromagnetic induction in a 3-D heterogeneous sphere if the signals from multiple satellites, simultaneously supplemented by ground-based magnetic observations, are available in the future.

The presented sensitivity analysis has shown that the 2001 CHAMP data are clearly sensitive to latitudinal variations in mantle conductivity. This result suggests the need to modify the forward and adjoint methods for an axisymmetric distribution of mantle conductivity to the case where the CHAMP data will only be considered over particular areas above the Earth's surface, for instance, the Pacific Ocean, allowing the study of how latitudinal variations in conductivity differ from region to region. This procedure would enable one to find conductivity variations not only in the latitudinal direction but also in the longitudinal direction. This idea warrants further investigation, because it belongs to the category of problems related to data assimilation, to which methods of constrained minimization can be applied. Similar methods can also be applied to the assimilation of the recordings at permanent geomagnetic observatories into the conductivity models derived from satellite observations.

Acknowledgments

The author thanks Kevin Fleming for his comments on the manuscript. The author acknowledges support from the Grant Agency of the Czech Republic through Grant No. 205/09/0546.

Appendix: Zonal Scalar and Vector Spherical Harmonics

In this section, we define the zonal scalar and vector spherical harmonics, introduce their orthonormality properties, and give some other relations. All considerations follow the book by Varshalovich et al. (1989), which is referenced in the following.

The zonal scalar spherical harmonics $Y_j(\vartheta)$ can be defined in terms of the Legendre polynomials $P_j(\cos\vartheta)$ of degree $j$ (ibid., p. 134, Eq. 6):

$$Y_j(\vartheta) := \sqrt{\frac{2j+1}{4\pi}}\,P_j(\cos\vartheta), \tag{211}$$

where $j = 0, 1, \dots$. The orthogonality property of the Legendre polynomials over the interval $0 \le \vartheta \le \pi$ (ibid., p. 149, Eq. 10) results in the orthonormality property of the zonal scalar spherical harmonics $Y_j(\vartheta)$ over the full solid angle ($0 \le \vartheta \le \pi$, $0 \le \varphi < 2\pi$):

$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi} Y_{j_1}(\vartheta)\,Y_{j_2}(\vartheta)\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}, \tag{212}$$

where $\delta_{ij}$ stands for the Kronecker delta symbol. Note that the integration over longitude $\varphi$ can be performed analytically, resulting in the multiplication by a factor of $2\pi$. However, the form of the double integration will be kept since it is consistent with surface integrals considered in the main text.

The zonal vector spherical harmonics $\boldsymbol{Y}_j^\ell(\vartheta)$, $j = 0, 1, \dots$; $\ell = j \pm 1, j$, can be defined (see also Chaps. 18 and 19) via their polar components (ibid., p. 211, Eq. 10; pp. 213–214, Eqs. 25–27):

$$\begin{aligned}
\sqrt{j(2j+1)}\;\boldsymbol{Y}_j^{j-1}(\vartheta) &= j\,Y_j(\vartheta)\,\boldsymbol{e}_r + \frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\boldsymbol{e}_\vartheta, \\
\sqrt{(j+1)(2j+1)}\;\boldsymbol{Y}_j^{j+1}(\vartheta) &= -(j+1)\,Y_j(\vartheta)\,\boldsymbol{e}_r + \frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\boldsymbol{e}_\vartheta, \\
\sqrt{j(j+1)}\;\boldsymbol{Y}_j^{j}(\vartheta) &= -i\,\frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\boldsymbol{e}_\varphi,
\end{aligned} \tag{213}$$

where $i = \sqrt{-1}$, and $\boldsymbol{e}_r$, $\boldsymbol{e}_\vartheta$, and $\boldsymbol{e}_\varphi$ are spherical base vectors. The vector functions $\boldsymbol{Y}_j^{j\pm1}(\vartheta)$ are called the zonal spheroidal vector spherical harmonics and $\boldsymbol{Y}_j^j(\vartheta)$ are the zonal toroidal vector spherical harmonics. A further useful form of the zonal toroidal vector spherical harmonics can be obtained considering $\partial Y_j(\vartheta)/\partial\vartheta = \sqrt{j(j+1)}\,P_{j1}(\cos\vartheta)$ (ibid., p. 146, Eq. 5), where $P_{j1}(\cos\vartheta)$ are the fully normalized associated Legendre functions of order $m = 1$:

$$\boldsymbol{Y}_j^j(\vartheta) = -i\,P_{j1}(\cos\vartheta)\,\boldsymbol{e}_\varphi. \tag{214}$$

The orthonormality property of the spherical base vectors and the zonal scalar spherical harmonics combine to give the orthonormality property of the zonal vector spherical harmonics (ibid., p. 227, Eq. 117):

$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\boldsymbol{Y}_{j_1}^{\ell_1}(\vartheta)\cdot\left[\boldsymbol{Y}_{j_2}^{\ell_2}(\vartheta)\right]^{*}\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}\,\delta_{\ell_1\ell_2}, \tag{215}$$

where the dot stands for the scalar product of vectors and the asterisk denotes complex conjugation. Since both the zonal scalar spherical harmonics and the spherical base vectors are real functions, Eq. 213 shows that the spheroidal vector harmonics $\boldsymbol{Y}_j^{j\pm1}(\vartheta)$ are real functions, whereas the toroidal vector harmonics $\boldsymbol{Y}_j^j(\vartheta)$ are pure imaginary functions. To avoid complex arithmetic, $\boldsymbol{Y}_j^j(\vartheta)$ is redefined in such a way that it becomes a real function (of colatitude $\vartheta$):

$$\boldsymbol{Y}_j^j(\vartheta) := P_{j1}(\cos\vartheta)\,\boldsymbol{e}_\varphi. \tag{216}$$

At this stage, a remark about this step is required. To avoid additional notation, the same notation is used for the real and complex versions of $\boldsymbol{Y}_j^j(\vartheta)$, since the real version of $\boldsymbol{Y}_j^j(\vartheta)$ is exclusively used throughout this chapter. This is in contrast to Martinec (1997), Martinec et al. (2003), and Martinec and McCreadie (2004), where the complex functions $\boldsymbol{Y}_j^j(\vartheta)$, defined by Eq. 214, have been used. However, the redefinition (216) only makes sense for studying a phenomenon with an axisymmetric geometry. For a more complex phenomenon, the original definition (214) is to be used. The orthonormality property (215) for the real zonal vector spherical harmonics now reads as

$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\boldsymbol{Y}_{j_1}^{\ell_1}(\vartheta)\cdot\boldsymbol{Y}_{j_2}^{\ell_2}(\vartheta)\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}\,\delta_{\ell_1\ell_2}. \tag{217}$$
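The orthonormality of the zonal scalar harmonics and of the real zonal toroidal harmonics can be checked numerically. The sketch below is an illustration only: it builds the toroidal amplitude from the colatitude derivative of the scalar harmonic divided by sqrt(j(j+1)), which is the relation quoted in this appendix, integrates in x = cos(colatitude) with a composite Simpson rule, and uses function names that are hypothetical.

```python
import math

def legendre_and_derivative(j, x):
    """Return P_j(x) and dP_j/dx via standard recurrences."""
    p_prev, p, dp = 0.0, 1.0, 0.0          # start from P_0 = 1
    for n in range(1, j + 1):
        p_next = ((2 * n - 1) * x * p - (n - 1) * p_prev) / n
        dp = n * p + x * dp                # P_n' = n P_{n-1} + x P_{n-1}'
        p_prev, p = p, p_next
    return p, dp

def scalar_y(j, x):
    """Zonal scalar harmonic Y_j as a function of x = cos(colatitude)."""
    return math.sqrt((2 * j + 1) / (4.0 * math.pi)) * legendre_and_derivative(j, x)[0]

def toroidal_amplitude(j, x):
    """Amplitude of the real zonal toroidal harmonic for j >= 1, computed as
    (dY_j/dtheta) / sqrt(j(j+1)) with dY_j/dtheta = -sin(theta) dY_j/dx."""
    dp = legendre_and_derivative(j, x)[1]
    c = math.sqrt((2 * j + 1) / (4.0 * math.pi))
    return -math.sqrt(1.0 - x * x) * c * dp / math.sqrt(j * (j + 1))

def sphere_integral(f, n=2000):
    """2*pi times the integral of f(x) over [-1, 1] (composite Simpson, n even)."""
    h = 2.0 / n
    s = f(-1.0) + f(1.0)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(-1.0 + i * h)
    return 2.0 * math.pi * s * h / 3.0

norm_scalar = sphere_integral(lambda x: scalar_y(2, x) ** 2)
cross_scalar = sphere_integral(lambda x: scalar_y(1, x) * scalar_y(3, x))
norm_toroidal = sphere_integral(lambda x: toroidal_amplitude(2, x) ** 2)
```

The factor of $2\pi$ in `sphere_integral` is the analytic longitude integration mentioned in the text; the overall sign of the toroidal amplitude does not affect the orthonormality check.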

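The formulae for the scalar and vector products of the radial unit vector $\boldsymbol{e}_r$ and the zonal vector spherical harmonics $\boldsymbol{Y}_j^\ell(\vartheta)$ follow from Eq. 213:

$$\begin{aligned}
\boldsymbol{e}_r\cdot\boldsymbol{Y}_j^{j-1}(\vartheta) &= \sqrt{\frac{j}{2j+1}}\,Y_j(\vartheta), \\
\boldsymbol{e}_r\cdot\boldsymbol{Y}_j^{j+1}(\vartheta) &= -\sqrt{\frac{j+1}{2j+1}}\,Y_j(\vartheta), \\
\boldsymbol{e}_r\cdot\boldsymbol{Y}_j^{j}(\vartheta) &= 0,
\end{aligned} \tag{218}$$

and

$$\begin{aligned}
\boldsymbol{e}_r\times\boldsymbol{Y}_j^{j-1}(\vartheta) &= \sqrt{\frac{j+1}{2j+1}}\,\boldsymbol{Y}_j^{j}(\vartheta), \\
\boldsymbol{e}_r\times\boldsymbol{Y}_j^{j+1}(\vartheta) &= \sqrt{\frac{j}{2j+1}}\,\boldsymbol{Y}_j^{j}(\vartheta), \\
\boldsymbol{e}_r\times\boldsymbol{Y}_j^{j}(\vartheta) &= -\sqrt{\frac{j+1}{2j+1}}\,\boldsymbol{Y}_j^{j-1}(\vartheta) - \sqrt{\frac{j}{2j+1}}\,\boldsymbol{Y}_j^{j+1}(\vartheta).
\end{aligned} \tag{219}$$

Any vector $\boldsymbol{A}(\vartheta)$ which depends on colatitude $\vartheta$ and which is square-integrable over the interval $0 \le \vartheta \le \pi$ may be expanded in a series of the zonal vector spherical harmonics, that is,

$$\boldsymbol{A}(\vartheta) = \sum_{j=0}^{\infty}\sum_{\ell=|j-1|}^{j+1} A_j^\ell\,\boldsymbol{Y}_j^\ell(\vartheta) \tag{220}$$

with the expansion coefficients given by

$$A_j^\ell = \int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\boldsymbol{A}(\vartheta)\cdot\boldsymbol{Y}_j^\ell(\vartheta)\sin\vartheta\,d\vartheta\,d\varphi. \tag{221}$$

The curl of the vector $\boldsymbol{A}(r,\vartheta)$ is then

$$\operatorname{curl}\boldsymbol{A} = \sum_{j=1}^{\infty}\sum_{\ell=j-1}^{j+1} R_j^\ell(r)\,\boldsymbol{Y}_j^\ell(\vartheta), \tag{222}$$

where (ibid., p. 217, Eq. 54)

$$\begin{aligned}
R_j^{j-1}(r) &= -\sqrt{\frac{j+1}{2j+1}}\left(\frac{d}{dr} + \frac{j+1}{r}\right)A_j^j(r), \\
R_j^{j+1}(r) &= -\sqrt{\frac{j}{2j+1}}\left(\frac{d}{dr} - \frac{j}{r}\right)A_j^j(r), \\
R_j^{j}(r) &= \sqrt{\frac{j+1}{2j+1}}\left(\frac{d}{dr} - \frac{j-1}{r}\right)A_j^{j-1}(r) + \sqrt{\frac{j}{2j+1}}\left(\frac{d}{dr} + \frac{j+2}{r}\right)A_j^{j+1}(r).
\end{aligned} \tag{223}$$

The radial and tangential components of $\operatorname{curl}\boldsymbol{A}$ may be evaluated as

$$\boldsymbol{e}_r\cdot\operatorname{curl}\boldsymbol{A} = -\frac{1}{r}\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,A_j^j(r)\,Y_j(\vartheta), \tag{224}$$

and

$$\boldsymbol{e}_r\times\operatorname{curl}\boldsymbol{A} = -\sum_{j=1}^{\infty}\left(\frac{d}{dr} + \frac{1}{r}\right)A_j^j(r)\,\boldsymbol{Y}_j^j(\vartheta) + \sum_{j=1}^{\infty}R_j^j(r)\left[\boldsymbol{e}_r\times\boldsymbol{Y}_j^j(\vartheta)\right]. \tag{225}$$

In particular, for a toroidal vector $\boldsymbol{A}(\vartheta)$, the coefficients $A_j^{j\pm1}(r) = 0$ and Eq. 225 reduces to

$$\boldsymbol{e}_r\times\operatorname{curl}\boldsymbol{A} = -\sum_{j=1}^{\infty}\left(\frac{d}{dr} + \frac{1}{r}\right)A_j^j(r)\,\boldsymbol{Y}_j^j(\vartheta). \tag{226}$$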

References Avdeev DB, Avdeeva AD (2006) A rigorous three-dimensional magnetotelluric inversion. PIER 62:41–48 Banks R (1969) Geomagnetic variations and the electrical conductivity of the upper mantle. Geophys J R Astron Soc 17:457–487 Banks RJ, Ainsworth JN (1992) Global induction and the spatial structure of mid-latitude geomagnetic variations. Geophys J Int 110:251–266 Bevington PR (1969) Data reduction and error analysis for the physical sciences. McGraw-Hill, New York Cacuci DG (2003) Sensitivity and uncertainty analysis. Volume I. Theory. Chapman & Hall/CRC, Boca Raton Constable S, Constable C (2004) Observing geomagnetic induction in magnetic satellite measurements and associated implications for mantle conductivity. Geochem Geophys Geosyst 5:Q01006. doi:10.1029/2003GC000634 Daglis IA, Thorne RM, Baumjohann W, Orsini S (1999) The terrestrial ring current: origin, formation and decay. Rev Geophys 37:407–438


Didwall EM (1984) The electrical conductivity of the upper mantle as estimated from satellite magnetic field data. J Geophys Res 89:537–542
Dorn O, Bertete-Aquirre H, Berryman JG, Papanicolaou GC (1999) A nonlinear inversion method for 3-D electromagnetic imaging using adjoint fields. Inverse Probl 15:1523–1558
Eckhardt D, Lamer K, Madden T (1963) Long periodic magnetic fluctuations and mantle conductivity estimates. J Geophys Res 68:6279–6286
Everett ME, Martinec Z (2003) Spatiotemporal response of a conducting sphere under simulated geomagnetic storm conditions. Phys Earth Planet Inter 138:163–181
Everett ME, Schultz A (1996) Geomagnetic induction in a heterogeneous sphere: azimuthally symmetric test computations and the response of an undulating 660-km discontinuity. J Geophys Res 101:2765–2783
Fainberg EB, Kuvshinov AV, Singer BSh (1990) Electromagnetic induction in a spherical Earth with non-uniform oceans and continents in electric contact with the underlying medium – I. Theory, method and example. Geophys J Int 102:273–281
Farquharson CG, Oldenburg DW (1996) Approximate sensitivities for the electromagnetic inverse problem. Geophys J Int 126:235–252
Hamano Y (2002) A new time-domain approach for the electromagnetic induction problem in a three-dimensional heterogeneous earth. Geophys J Int 150:753–769
Hultqvist B (1973) Perturbations of the geomagnetic field. In: Egeland A, Holter O, Omholt A (eds) Cosmical geophysics. Universitetsforlaget, Oslo, pp 193–201
Jupp DLB, Vozoff K (1977) Two-dimensional magnetotelluric inversion. Geophys J R Astron Soc 50:333–352
Kelbert A, Egbert GD, Schultz A (2008) Non-linear conjugate gradient inversion for global EM induction: resolution studies. Geophys J Int 173:365–381
Kivelson MG, Russell CT (1995) Introduction to space physics. Cambridge University Press, Cambridge
Korte M, Constable S, Constable C (2003) Separation of external magnetic signal for induction studies. In: Reigber Ch, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Berlin, pp 315–320
Křížek M, Neittaanmäki P (1990) Finite element approximation of variational problems and applications. Longman Scientific and Technical/Wiley, New York
Kuvshinov AV (2010) Deep electromagnetic studies from land, sea, and space: progress status in the past 10 years. Surv Geophys 33:169–209
Kuvshinov A, Olsen N (2006) A global model of mantle conductivity derived from 5 years of CHAMP, Ørsted, and SAC-C magnetic data. Geophys Res Lett 33:L18301. doi:10.1029/2006GL027083
Kuvshinov AV, Avdeev DB, Pankratov OV (1999a) Global induction by Sq and Dst sources in the presence of oceans: bimodal solutions for non-uniform spherical surface shells above radially symmetric earth models in comparison to observations. Geophys J Int 137:630–650
Kuvshinov AV, Avdeev DB, Pankratov OV, Golyshev SA (1999b) Modelling electromagnetic fields in 3-D spherical earth using fast integral equation approach. In: Expanded abstract of the 2nd international symposium on 3-D electromagnetics, pp 84–88. The University of Utah
Lanczos C (1961) Linear differential operators. Van Nostrand, Princeton
Langel RA, Estes RH (1985a) Large-scale, near-field magnetic fields from external sources and the corresponding induced internal field. J Geophys Res 90:2487–2494


Langel RA, Estes RH (1985b) The near-Earth magnetic field at 1980 determined from Magsat data. J Geophys Res 90:2495–2510
Langel RA, Sabaka TJ, Baldwin RT, Conrad JA (1996) The near-Earth magnetic field from magnetospheric and quiet-day ionospheric sources and how it is modeled. Phys Earth Planet Inter 98:235–268
Madden TM, Mackie RL (1989) Three-dimensional magnetotelluric modelling and inversion. Proc Inst Electron Electric Eng 77:318–333
Marchuk GI (1995) Adjoint equations and analysis of complex systems. Kluwer, Dordrecht
Martinec Z (1989) Program to calculate the spectral harmonic expansion coefficients of the two scalar fields product. Comput Phys Commun 54:177–182
Martinec Z (1997) Spectral-finite element approach to two-dimensional electromagnetic induction in a spherical earth. Geophys J Int 130:583–594
Martinec Z (1999) Spectral-finite element approach to three-dimensional electromagnetic induction in a spherical earth. Geophys J Int 136:229–250
Martinec Z, McCreadie H (2004) Electromagnetic induction modelling based on satellite magnetic vector data. Geophys J Int 157:1045–1060
Martinec Z, Velímský J (2009) The adjoint sensitivity method of global electromagnetic induction for CHAMP magnetic data. Geophys J Int 179:1372–1396. doi:10.1111/j.1365-246X.2009.04356.x
Martinec Z, Everett ME, Velímský J (2003) Time-domain, spectral-finite element approach to transient two-dimensional geomagnetic induction in a spherical heterogeneous earth. Geophys J Int 155:33–43
McGillivray PR, Oldenburg DW (1990) Methods for calculating Fréchet derivatives and sensitivities for the non-linear inverse problems: a comparative study. Geophys Prospect 38:499–524
McGillivray PR, Oldenburg DW, Ellis RG, Habashy TM (1994) Calculation of sensitivities for the frequency-domain electromagnetic problem. Geophys J Int 116:1–4
Morse PW, Feshbach H (1953) Methods of theoretical physics. McGraw-Hill, New York
Newman GA, Alumbaugh DL (1997) Three-dimensional massively parallel electromagnetic inversion – I. Theory. Geophys J Int 128:345–354
Newman GA, Alumbaugh DL (2000) Three-dimensional magnetotelluric inversion using nonlinear conjugate on induction effects of geomagnetic daily variations from equatorial gradients. Geophys J Int 140:410–424
Oldenburg DW (1990) Inversion of electromagnetic data: an overview of new techniques. Surv Geophys 11:231–270
Olsen N (1999) Induction studies with satellite data. Surv Geophys 20:309–340
Olsen N, Stolle C (2012) Satellite Geomagnetism. Annu Rev Earth Planet Sci 40:441–465
Olsen N, Sabaka TJ, Lowes F (2005) New parameterization of external and induced fields in geomagnetic field modeling, and a candidate model for IGRF 2005. Earth Planets Space 57:1141–1149
Olsen N, Lühr H, Sabaka TJ, Mandea M, Rother M, Toffiner-Clausen L, Choi S (2006a) CHAOS – a model of the Earth’s magnetic field derived from CHAMP, Øersted & SAC-C magnetic satellite data. Geophys J Int 166:67–75
Olsen N, Haagmans R, Sabaka T, Kuvshinov A, Maus S, Purucker M, Rother M, Lesur V, Mandea M (2006b) The swarm end-to-end mission simulator study: separation of the various contributions to earths magnetic field using synthetic data. Earth Planets Space 58:359–370


Oraevsky VN, Rotanova NM, Semenov VYu, Bondar TN, Abramova DYu (1993) Magnetovariational sounding of the Earth using observatory and MAGSAT satellite data. Phys Earth Planet Inter 78:119–130
Orszag SA (1970) Transform method for the calculation of vector-coupled sums: application to the spectral form of the vorticity equation. J Atmos Sci 27:890
Pěč K, Martinec Z (1986) Spectral theory of electromagnetic induction in a radially and laterally inhomogeneous Earth. Studia Geoph et Geod 30:345–355
Petzold L, Li ST, Cao Y, Serban R (2006) Sensitivity analysis of differential-algebraic equations and partial differential equations. Comput Chem Eng 30:1553–1559
Praus OJ, Pěčová J, Červ V, Kováčiková S, Pek J, Velímský J (2011) Electrical conductivity at mid-mantle depths estimated from the data of Sq and long period geomagnetic variations. Studia Geoph Geod 55:241–264
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in Fortran. The art of scientific computing. Cambridge University Press, Cambridge
Rodi WL (1976) A technique for improving the accuracy of finite element solutions of MT data. Geophys J R Astron Soc 44:483–506
Rodi WL, Mackie RL (2001) Nonlinear conjugate gradients algorithm for 2-D magnetotelluric inversion. Geophysics 66:174–187
Sandu A, Daescu DN, Carmichael GR (2003) Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: I-theory and software tools. Atmos Environ 37:5083–5096
Sandu A, Daescu DN, Carmichael GR, Chai T (2005) Adjoint sensitivity analysis of regional air quality models. J Comput Phys 204:222–252
Schultz A, Larsen JC (1987) On the electrical conductivity of the mid-mantle, I, Calculation of equivalent scalar magnetotelluric response functions. Geophys J R Astron Soc 88:733–761
Schultz A, Larsen JC (1990) On the electrical conductivity of the mid-mantle, II. Delineation of heterogeneity by application of extremal inverse solutions. Geophys J Int 101:565–580
Stratton JA (1941) Electromagnetic theory. Wiley, New Jersey (reissued in 2007)
Tarantola A (2005) Inverse problem theory and methods for model parameter estimation. SIAM, Philadelphia
Tarits P, Grammatica N (2000) Electromagnetic induction effects by the solar quiet magnetic field at satellite altitude. Geophys Res Lett 27:4009–4012
Uyeshima M, Schultz A (2000) Geoelectromagnetic induction in a heterogeneous sphere: a new three-dimensional forward solver using a conservative staggered-grid finite difference method. Geophys J Int 140:636–650
Varshalovich DA, Moskalev AN, Khersonskii VK (1989) Quantum theory of angular momentum. World Scientific, Singapore
Velímský J (2010) Electrical conductivity in the lower mantle: constraints from CHAMP satellite data by time-domain EM induction modelling. Phys Earth Planet Inter 180:111–117
Velímský J, Martinec Z (2005) Time-domain, spherical harmonic-finite element approach to transient three-dimensional geomagnetic induction in a spherical heterogeneous Earth. Geophys J Int 161:81–101
Velímský J, Martinec Z, Everett ME (2006) Electrical conductivity in the Earth’s mantle inferred from CHAMP satellite measurements – I. Data processing and 1-D inversion. Geophys J Int 166:529–542
Weaver JT (1994) Mathematical methods for geo-electromagnetic induction. Research Studies Press. Wiley, New York


Weidelt P (1975) Inversion of two-dimensional conductivity structure. Phys Earth Planet Inter 10:282–291
Weiss CJ, Everett ME (1998) Geomagnetic induction in a heterogeneous sphere: fully three-dimensional test computations and the response of a realistic distribution of oceans and continents. Geophys J Int 135:650–662


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_20-3 © Springer-Verlag Berlin Heidelberg 2014

Asymptotic Models for Atmospheric Flows

Rupert Klein
FB Mathematik und Informatik, Institut für Mathematik, Freie Universität Berlin, Berlin, Germany

Abstract

Atmospheric flows feature length and time scales from $10^{-5}$ to $10^{5}$ m and from microseconds to weeks and more. For scales above several kilometers and minutes, there is a natural scale separation induced by the atmosphere’s thermal stratification together with the influences of gravity and Earth’s rotation and the fact that atmospheric flow Mach numbers are typically small. A central aim of theoretical meteorology is to understand the associated scale-specific flow phenomena, such as internal gravity waves, baroclinic instabilities, Rossby waves, cloud formation and moist convection, (anti-)cyclonic weather patterns, hurricanes, and a variety of interacting waves in the tropics. Such understanding is greatly supported by analyses of reduced sets of model equations which capture just those fluid mechanical processes that are essential for the phenomenon in question while discarding higher-order effects. Such reduced models are typically proposed on the basis of combinations of physical arguments and mathematical derivations, and are not easily understood by the meteorologically untrained. This chapter demonstrates how many well-known reduced sets of model equations for specific, scale-dependent atmospheric flow phenomena may be derived in a unified and transparent fashion from the full compressible atmospheric flow equations using standard techniques of formal asymptotics. It also discusses an example of the limitations of this approach. Sections 3–5 of this chapter are a recompilation of the author’s more comprehensive article “Scale-dependent models for atmospheric flows”, Annual Review of Fluid Mechanics, 42 (2010).

1 Introduction

Atmospheric flows feature length and time scales from $10^{-5}$ to $10^{5}$ m and from microseconds to weeks and more, and many different physical processes persistently interact across these scales. A central aim of theoretical meteorology is to understand the associated scale-specific flow phenomena, such as internal gravity waves, baroclinic instabilities, Rossby waves, cloud formation and moist convection, (anti-)cyclonic weather patterns, hurricanes, and a variety of interacting waves in the tropics. To cope with the associated complexity, theoretical meteorologists have developed a large number of reduced mathematical models which capture the essence of these scale-dependent processes while consciously discarding effects that would merely amount to small, nonessential corrections. See, e.g., the texts by Gill (1982), Pedlosky (1987), Zeytounian (1990), Majda (2002), and White (2002) for detailed derivations and for discussions of the usefulness and validity of such reduced model equations. Section 2 of this chapter presents two classical examples of such reduced model equations and discusses possible lines of thought for their derivations with emphasis on the relation between meteorological scale analysis and formal asymptotics.



Section 3 introduces several universal physical characteristics of Earth and its atmosphere that are associated with the atmosphere’s thermal stratification, with the influences of gravity and Earth’s rotation, and with the fact that atmospheric flow Mach numbers are typically small. These characteristics give rise to three nondimensional parameters, two of which are small and motivate a unified approach to meteorological modeling via formal asymptotics. In pursuing this route, a particular distinguished limit among these parameters is introduced, which leaves us with only one small reference parameter, and this is used in the sequel as the basis for asymptotic expansions. In Sect. 4, we find that many classical reduced scale-dependent models of theoretical meteorology can in fact be rederived through formal asymptotics based on this remaining small parameter in a unified fashion directly from the full three-dimensional compressible flow equations. The distinguished limit adopted in Sect. 3 is thus justified in hindsight as one that supports a substantial part of the established meteorological theory. The reduced models rederived in Sect. 4 are all well known, so that this section should be considered a reorganization of a family of classical models under a common systematic mathematical framework. The models discussed are all single-scale models in the sense that they incorporate a single horizontal and a single vertical spatial scale, and a single timescale only. In contrast, Sect. 5 briefly sketches more recent developments for multiscale problems. Section 6 draws some conclusions.

2 Examples of Reduced Models, Scale Analysis, and Asymptotics

Arguments from formal asymptotic analysis are the principal tool used in this chapter to generate an overview of the multitude of reduced mathematical models for atmospheric flows. Yet, formal asymptotic expansions do have their limitations. In this section, we consider two examples of established meteorological models: the quasi-geostrophic (QG) model, which does have a straightforward derivation through formal (single-scale) asymptotics, and the class of anelastic and pseudo-incompressible sound-proof models, which does not. See Salmon (1983), White (2002), Névir (2004), and Oliver (2006) for discussions of alternative approaches to the construction of reduced models.

2.1 QG-Theory

A prominent example of a reduced model of theoretical meteorology is the QG model (see section “Synoptic Scales and the Quasi-Geostrophic Approximation” for its derivation):

\[ (\partial_t + \mathbf{u}\cdot\nabla_{\xi})\, q = 0, \tag{1} \]

where

\[ q = \nabla_{\xi}^{2}\tilde\pi + \beta\eta + \frac{f}{\bar\rho}\,\frac{\partial}{\partial z}\left(\frac{\bar\rho}{d\bar\Theta/dz}\,\frac{\partial\tilde\pi}{\partial z}\right), \qquad \mathbf{u} = \mathbf{k}\times\nabla_{\xi}\tilde\pi . \tag{2} \]

Here $\tilde\pi$, $\mathbf{u}$ are a pressure variable and the horizontal flow velocity, $\bar\rho(z)$, $\bar\Theta(z)$ are the vertical background stratifications of density and potential temperature (the “potential temperature” is closely related to entropy: it is the temperature which a parcel of air would attain if compressed or expanded adiabatically to sea-level pressure), $\boldsymbol{\xi} = (\xi, \eta)$ are horizontal spatial coordinates, $\nabla_{\xi}$ denotes the horizontal components of the gradient, $\mathbf{k}$ is the local vertical unit vector at $(\xi, \eta = 0)$, $f = 2\mathbf{k}\cdot\boldsymbol{\Omega}$ is the Coriolis parameter, with $\boldsymbol{\Omega}$ the Earth rotation vector, and $\beta$ is its derivative w.r.t. $\eta$.

The central dynamical variable in the QG theory is the potential vorticity, $q$, which according to (2)$_1$ is a superposition of the vertical component of the relative vorticity, $\nabla_{\xi}^{2}\tilde\pi \equiv \mathbf{k}\cdot\nabla_{\xi}\times\mathbf{u}$, a contribution from meridional (north-south) variations of the Coriolis parameter, and a third term that represents the effects of vortex stretching due to the compression and extension of vertical columns of air.

The QG theory was originally derived to explain the formation and evolution of the large-scale vortical atmospheric flow patterns that constitute the familiar high and low pressure systems seen regularly on mid-latitude weather maps. It also served as the basis for Richardson’s first attempt at numerical weather forecasting (see Lynch 2006). Its classical derivation (see, e.g., Pedlosky 1987) can be reproduced one-to-one within the formal asymptotic framework to be summarized in subsequent sections.
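As an illustration of the dynamics contained in (1)–(2), one may linearize about a state of rest. The following worked sketch is not part of the original text: it assumes horizontal coordinates $(\xi, \eta)$ with $\eta$ pointing north, a pressure variable $\tilde\pi$ as in (2), constant $\bar\rho$, and a constant stratification $S \equiv d\bar\Theta/dz > 0$; the wavenumbers $k, l, m$, the amplitude $\hat\pi$, and the frequency $\omega$ are hypothetical symbols introduced only here.

```latex
\begin{align*}
&\text{Linearization of (1) about } \tilde\pi = 0 \text{ (only } \mathbf{u}\cdot\nabla_{\xi}(\beta\eta) = \beta\,\partial_\xi\tilde\pi \text{ survives):}\\
&\qquad \partial_t\!\left(\nabla_{\xi}^{2}\tilde\pi + \frac{f}{S}\,\partial_z^{2}\tilde\pi\right) + \beta\,\partial_\xi\tilde\pi = 0,\\[4pt]
&\text{Plane-wave ansatz } \tilde\pi = \hat\pi\,\sin(mz)\,\cos(k\xi + l\eta - \omega t):\\
&\qquad -\left(k^{2} + l^{2} + \frac{f m^{2}}{S}\right)\omega - \beta k = 0
\quad\Longrightarrow\quad
\omega = -\frac{\beta k}{k^{2} + l^{2} + f m^{2}/S}.
\end{align*}
```

Under these assumptions the phase velocity in the $\xi$-direction, $\omega/k$, is negative for $\beta > 0$, which is the familiar westward propagation of Rossby waves.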

2.2 Sound-Proof Models

This is in contrast, e.g., to the situation with the classical derivations of reduced sound-proof models for atmospheric flows at much smaller scales, such as the extended anelastic models summarized by Bannon (1996), or the pseudo-incompressible model by Durran (1989). In the simplest case of an inviscid, adiabatic flow in a nonrotating frame of reference, the latter model is given by

\[ \begin{aligned}
\rho_t + \nabla\cdot(\rho\mathbf{v}) &= 0,\\
(\rho\mathbf{v})_t + \nabla\cdot(\rho\,\mathbf{v}\circ\mathbf{v}) + \bar P\,\nabla\tilde\pi &= -\rho g\,\mathbf{k},\\
\nabla\cdot(\bar P\,\mathbf{v}) &= 0.
\end{aligned} \tag{3} \]

Here $\rho$, $\mathbf{v}$, $\tilde\pi$ are the density, flow velocity, and pressure field for this model, $g$ is the acceleration of gravity, and

\[ \bar P(z) \equiv \bar\rho(z)\,\bar\Theta(z) \tag{4} \]

is the density-weighted background stratification of potential temperature. As shown in Davies et al. (2003) and Klein (2009), this model results formally from the full compressible flow equations for an ideal gas with constant specific heat capacities in a straightforward fashion. The point of departure is the set of compressible Euler equations with gravity in the form

\[ \begin{aligned}
\rho_t + \nabla\cdot(\rho\mathbf{v}) &= 0,\\
(\rho\mathbf{v})_t + \nabla\cdot(\rho\,\mathbf{v}\circ\mathbf{v}) + P\,\nabla\pi &= -\rho g\,\mathbf{k},\\
p_t + \mathbf{v}\cdot\nabla p + \gamma p\,\nabla\cdot\mathbf{v} &= 0,
\end{aligned} \tag{5} \]


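Table 1 Dimensional and nondimensional inverse time scales for the three physical processes described by the inviscid, adiabatic compressible flow equations with gravity in (5)

Process | Inverse timescale | Dimensionless inverse timescale
Advection | $u_{ref}/h_{sc}$ | $1$
Internal gravity waves | $N = \sqrt{\dfrac{g}{\bar\Theta}\dfrac{d\bar\Theta}{dz}}$ | $\dfrac{1}{\sqrt{\gamma}\,\mathrm{Ma}}\sqrt{\dfrac{h_{sc}}{\bar\Theta}\dfrac{d\bar\Theta}{dz}}$
Sound propagation | $\dfrac{1}{h_{sc}}\sqrt{\dfrac{\gamma p_{ref}}{\rho_{ref}}}$ | $\dfrac{1}{\mathrm{Ma}}$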

where $p$ is the thermodynamic pressure, and

\[ P = p^{1/\gamma} \quad\text{and}\quad \pi = \frac{p^{\Gamma}}{\Gamma} \quad\text{with}\quad \Gamma = \frac{\gamma-1}{\gamma}, \tag{6} \]

and $\gamma$ the isentropic exponent. Now we rewrite (5)$_3$ as

\[ P_t + \nabla\cdot(P\mathbf{v}) = 0, \tag{7} \]

and assume (i) that all pressure variations are systematically small so that $P_t \approx 0$, and (ii) that $P \approx \bar P(z)$ corresponds to the hydrostatic pressure distribution of an atmosphere at rest. With these two assumptions, the pseudo-incompressible model in (3) emerges from (5). Close inspection of Durran’s original derivation of (3) reveals that his only essential assumption consists, in fact, of small deviations of the thermodynamic pressure from a hydrostatically balanced background distribution. Interestingly, there is no derivation of this reduced model via classical single or multiple scales asymptotic analysis unless one adopts the additional assumption of weak stratification of the potential temperature, so that

\[ \bar\Theta(z) \equiv \frac{\bar P(z)}{\bar\rho(z)} = 1 + \mathrm{Ma}^{2}\,\bar\theta(z), \tag{8} \]

where Ma is the flow Mach number. The reason is that the compressible flow equations from (5) describe three processes with asymptotically different characteristic time scales as indicated in Table 1 (Klein 2009). In the table, $u_{ref} \approx 10$ m s$^{-1}$ is a typical atmospheric flow velocity, $p_{ref} \approx 10^{5}$ Pa is a reference sea-level pressure, $\rho_{ref} \approx 1$ kg m$^{-3}$ is an associated reference density, and $h_{sc} = p_{ref}/(\rho_{ref}\, g) \approx 10$ km is the pressure scale height, i.e., a characteristic vertical distance over which the atmospheric pressure changes appreciably as a response to the diminishing mass of air that it has to balance against the force of gravity at higher altitudes. Notice that the quantity $\sqrt{g h_{sc}} = \sqrt{p_{ref}/\rho_{ref}}$, the speed of external gravity waves of the atmosphere, equals the speed of sound at reference conditions, except for a factor of $\sqrt{\gamma}$. Recall that $\bar\Theta(z)$ is the background stratification of the potential temperature defined in (4). The table reveals that, generally, compressible flows under the influence of gravity feature three distinct characteristic time scales: those of advection, of internal wave propagation, and of sound propagation.
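The step from the pressure equation (5)$_3$ to the conservation form (7) can be spelled out in a few lines. The following worked sketch uses only the definition $P = p^{1/\gamma}$ from (6) and the chain rule; it is added here for convenience and is not quoted from the cited references.

```latex
\begin{align*}
P_t &= \frac{1}{\gamma}\,p^{1/\gamma-1}\,p_t ,
\qquad
\nabla P = \frac{1}{\gamma}\,p^{1/\gamma-1}\,\nabla p ,\\[4pt]
0 &= \frac{1}{\gamma}\,p^{1/\gamma-1}\left(p_t + \mathbf{v}\cdot\nabla p + \gamma p\,\nabla\cdot\mathbf{v}\right)
   = P_t + \mathbf{v}\cdot\nabla P + P\,\nabla\cdot\mathbf{v}
   = P_t + \nabla\cdot(P\mathbf{v}) .
\end{align*}
```

Dropping $P_t$ and replacing $P$ by $\bar P(z)$ in the last expression then gives the divergence constraint $\nabla\cdot(\bar P\mathbf{v}) = 0$, i.e., the third equation of the pseudo-incompressible model.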


The pseudo-incompressible model from (3) does not support acoustic waves due to the divergence constraint imposed on the velocity field in (3)$_3$. Yet it does feature the processes of advection and internal wave propagation. According to the preceding discussion of characteristic time scales, the pseudo-incompressible model will thus be a two-timescale model unless the characteristic frequency of internal waves is slowed down to the time scale of advection by assuming $\frac{h_{sc}}{\bar\Theta}\frac{d\bar\Theta}{dz} = O(\mathrm{Ma}^{2})$ as $\mathrm{Ma} \to 0$. In fact, Ogura and Phillips (1962) adopted this assumption in the derivation of their original anelastic model (see also section “Ogura and Phillips’ Anelastic Model for Weak Stratification”). Yet their model was criticized for this assumption, because stratifications of order $\mathrm{Ma}^{2} \sim 10^{-3}$ imply potential temperature variations across the troposphere, i.e., the lower 10–15 km of the atmosphere, of less than 1 K, which is far too small in comparison with the observed 30–50 K. Generalized sound-proof models proposed later (see Durran 1989; Bannon 1996, and references therein) are thus genuine two-scale models.

These observations allow us to exemplify explicitly one remark by White (2002), who takes the somewhat extreme stance that “Such expansion methods [single-scale asymptotic expansions; explanation added by the author] may be suspected of lending a cosmetic veneer to what is rather crude and restricted scale analysis.” In fact, formal asymptotic methods when faced with a two-timescale problem will allow us to filter out the fast scale and produce a reduced balanced model that captures the remaining slower process(es) only. Faced with a three-scale problem, such as the compressible atmospheric flows discussed in this section, asymptotic multiple scales analysis will provide us with separate equations for the processes taking place on each scale, and with additional coupling constraints that guarantee the validity of the coupled model over the longest of the involved time scales. Standard techniques of asymptotic analysis will, however, not allow us to eliminate from a three-scale problem only the fastest scale while retaining a reduced two-scale PDE model that captures the remaining two slower time scales only. In this sense, the meteorological scale analyses presented by Durran (1989) and Bannon (1996) are not equivalent to formal asymptotic approaches. Their analyses are, however, also not entirely satisfactory in the author’s view, because they do not at all address the asymptotic two-scale properties of the remaining pseudo-incompressible and anelastic model equations.

Mathematically speaking, an interesting challenge remains: There are many textbooks on methods of formal asymptotic analysis (see, e.g., Kevorkian and Cole 1996), and there is also a multitude of examples where mathematicians have rigorously proven the validity of approximate equation systems derived with these techniques (see, e.g., Majda 2002; Schochet 2005, and references therein). In contrast, a mathematically sound classification of the methods of meteorological scale analysis as referred to, e.g., by White (2002), and an explicit characterization of what distinguishes them from formal asymptotics are still outstanding. Similarly, rigorous proofs of validity of the resulting reduced models such as the extended anelastic and pseudo-incompressible models are still missing.
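The three time scales of Table 1 and the consequence of the weak-stratification assumption can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded reference values quoted in the text, takes $g = 10$ m s$^{-2}$ so that $h_{sc}$ comes out as 10 km, defines Ma as $u_{ref}$ divided by the reference sound speed, and assumes a potential temperature of about 300 K with a 40 K increase over 10 km as a "realistic" stratification. All variable and function names are hypothetical.

```python
import math

# Rounded reference values quoted in the text below Table 1.
u_ref = 10.0      # m/s, typical flow velocity
p_ref = 1.0e5     # Pa, sea-level pressure
rho_ref = 1.0     # kg/m^3, reference density
g = 10.0          # m/s^2 -- ASSUMPTION: rounded so that h_sc = 10 km
gamma = 1.4       # isentropic exponent

h_sc = p_ref / (rho_ref * g)                 # pressure scale height, m
c_ref = math.sqrt(gamma * p_ref / rho_ref)   # sound speed, m/s
mach = u_ref / c_ref                         # ASSUMPTION: Ma = u_ref / c_ref

def inverse_timescales(strat):
    """Dimensionless inverse time scales (advection, internal waves, sound).

    strat is the dimensionless stratification (h_sc / Theta) dTheta/dz.
    """
    advection = 1.0
    internal_waves = math.sqrt(strat) / (math.sqrt(gamma) * mach)
    sound = 1.0 / mach
    return advection, internal_waves, sound

# ASSUMPTION: illustrative realistic stratification, 40 K over 10 km at 300 K.
theta_ref = 300.0
realistic = h_sc * (40.0 / 1.0e4) / theta_ref

# Weak stratification of order Ma^2, as in the Ogura-Phillips regime.
weak = mach**2
delta_theta_weak = theta_ref * weak   # K per scale height implied by strat = Ma^2

if __name__ == "__main__":
    print("Ma =", mach, " Ma^2 =", mach**2)
    print("realistic:", inverse_timescales(realistic))
    print("weak:     ", inverse_timescales(weak))
    print("implied potential temperature change (K):", delta_theta_weak)
```

With these inputs the realistic stratification leaves the three inverse time scales well separated, whereas the weak stratification brings the internal-wave rate down to order one at the price of a potential temperature variation of well under 1 K per scale height.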

3 Dimensional Considerations

3.1 Characteristic Scales and Dimensionless Parameters

Here, we borrow from Keller and Ting (1951) and Klein (2008) to argue for an inherent scale separation in large-scale atmospheric flows that will motivate the asymptotic arguments to be presented subsequently. Table 2 lists eight universal characteristics of atmospheric motions involving the physical dimensions of length, time, mass, and temperature.


Table 2 Universal characteristics of atmospheric motions

Earth’s radius | $a = 6\cdot10^{6}$ m
Earth’s rotation rate | $\Omega \sim 10^{-4}$ s$^{-1}$
Acceleration of gravity | $g = 9.81$ m s$^{-2}$
Sea level pressure | $p_{ref} = 10^{5}$ kg m$^{-1}$ s$^{-2}$
H$_2$O freezing temperature | $T_{ref} \sim 273$ K
Equator–pole pot. temp. diff. | $\Delta\Theta \sim 40$ K
Tropospheric vertical pot. temp. diff. | $\Delta\Theta \sim 40$ K
Dry gas constant | $R = 287$ m$^{2}$ s$^{-2}$ K$^{-1}$
Dry isentropic exponent | $\gamma = 1.4$

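Table 3 Auxiliary quantities of interest derived from those in Table 2

Sea-level air density | $\rho_{ref} = p_{ref}/(R\,T_{ref}) \sim 1.25$ kg m$^{-3}$
Pressure scale height | $h_{sc} = p_{ref}/(g\,\rho_{ref}) \sim 8$ km
Sound speed | $c_{ref} = \sqrt{\gamma p_{ref}/\rho_{ref}} \sim 330$ m s$^{-1}$
Internal wave speed | $c_{int} = \sqrt{g h_{sc}\,\Delta\Theta/T_{ref}} \sim 110$ m s$^{-1}$
Thermal wind velocity | $u_{ref} = \dfrac{2}{\pi}\,\dfrac{g h_{sc}}{\Omega a}\,\dfrac{\Delta\Theta}{T_{ref}} \sim 12$ m s$^{-1}$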
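The entries of Table 3 follow from those of Table 2 by direct evaluation. The short sketch below reproduces that arithmetic using the Table 2 values as inputs; the function name and dictionary keys are hypothetical, and the results should be read as order-of-magnitude estimates in the same spirit as the rounded values in the table.

```python
import math

def derived_quantities(a=6.0e6, omega=1.0e-4, g=9.81, p_ref=1.0e5,
                       t_ref=273.0, d_theta=40.0, gas_const=287.0, gamma=1.4):
    """Evaluate the Table 3 formulas from the Table 2 inputs (SI units)."""
    rho_ref = p_ref / (gas_const * t_ref)               # sea-level air density
    h_sc = p_ref / (g * rho_ref)                        # pressure scale height
    c_ref = math.sqrt(gamma * p_ref / rho_ref)          # sound speed
    c_int = math.sqrt(g * h_sc * d_theta / t_ref)       # internal wave speed
    u_ref = (2.0 / math.pi) * g * h_sc / (omega * a) * d_theta / t_ref  # thermal wind
    return {"rho_ref": rho_ref, "h_sc": h_sc, "c_ref": c_ref,
            "c_int": c_int, "u_ref": u_ref}

if __name__ == "__main__":
    for name, value in derived_quantities().items():
        print(f"{name:8s} = {value:10.3f}")
```

Note that $g h_{sc} = p_{ref}/\rho_{ref} = R\,T_{ref}$, so the internal wave speed and the thermal wind velocity depend on $g$ only through this combination.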

Earth’s radius, its rotation rate, and the acceleration of gravity are obviously universal. The sea-level pressure is set by the mass (weight) of the atmosphere, which is essentially constant in time. The water freezing temperature is a good reference value for the large-scale, long-time averaged conditions on Earth. See also Rahmstorf et al. (2004) for the related paleo-climatic record. The equator-to-pole potential temperature difference is maintained by the inhomogeneous irradiation from the sun, and its magnitude appears to be stable over very long periods of time. An average vertical potential temperature difference across the troposphere, i.e., across the lower 10–15 km of the atmosphere, is of the same order of magnitude as the equator-to-pole temperature difference (see, e.g., Frierson 2008, and references therein). Finally, the dry air gas constant is a good approximation to local values, because the mass fractions of water vapor and greenhouse gases are very small. These seven dimensional characteristics allow for three independent dimensionless combinations in addition to the isentropic exponent $\gamma$. A possible choice is

\[ \Pi_1 = \frac{h_{sc}}{a} \sim 1.6\cdot10^{-3}, \qquad \Pi_2 = \frac{\Delta\Theta}{T_{ref}} \sim 1.5\cdot10^{-1}, \qquad \Pi_3 = \frac{c_{ref}}{\Omega a} \sim 4.7\cdot10^{-1}, \tag{9} \]

with $h_{sc}$ the density scale height and $c_{ref}$ of the order of a characteristic speed of sound or of barotropic (external) gravity waves as given in the first three items of Table 3. The last two items in Table 3 are further characteristic signal speeds derived from the quantities of Table 2: $c_{int}$ corresponds to the horizontal phase speed of linearized internal gravity waves in the long-wavelength limit (see Gill 1982, Chap. 6). In estimating a typical horizontal flow velocity, we have used the so-called thermal wind relation, which provides an estimate for the vertical variation of the horizontal velocity in an atmosphere that is simultaneously in hydrostatic and geostrophic balance. Here “geostrophy” refers to a balance of the horizontal pressure gradient and Coriolis force. See the discussion of Eq. 23 for more details. Importantly, internal gravity waves are dispersive, their phase speed depending strongly on the wave number vector, so that $c_{int}$ is merely an upper estimate and, in practice, internal wave signals may move at speeds as low as $u_{ref}$.

Table 4 Hierarchy of physically distinguished scales in the atmosphere

Planetary scale | $L_p = \dfrac{\pi}{2}\,a \sim 10{,}000$ km
Obukhov radius | $L_{Ob} = \dfrac{c_{ref}}{\Omega} \sim 3{,}300$ km
Synoptic scale | $L_{Ro} = \dfrac{c_{int}}{\Omega} \sim 1{,}100$ km
Meso-$\beta$ scale | $L_{meso} = \dfrac{u_{ref}}{\Omega} \sim 150$ km
Meso-$\gamma$ scale | $h_{sc} = \dfrac{\gamma p_{ref}}{g\rho_{ref}} \sim 11$ km

Consider now the distances which sound waves, internal waves, and particles advected by the thermal wind would travel during the characteristic time of Earth’s rotation, $1/\Omega \sim 10^{4}$ s. Together with the scale height, $h_{sc}$, and the equator-to-pole distance, $L_p$, we find the hierarchy of characteristic lengths displayed in Table 4. The technical terms in the left column are often used for these scales in the meteorological literature. The synoptic reference scale, $L_{Ro}$, is also called the “Rossby radius,” and the Obukhov scale is frequently termed “external Rossby radius.” These length scales are naturally induced by fluid dynamical processes in the atmosphere and they give rise to multiple scales regimes when the characteristic signal speeds differ significantly. If we fix, in turn, a length scale to be considered, then the different characteristic signal speeds give rise to multiple times instead of multiple lengths. In general situations, one will be faced with combined multiple length-multiple time regimes.

3.2 Distinguished Limits

Even for the very simple problem of the linear oscillator with small mass and small damping, an asymptotic expansion that allows for independent variation of the two parameters is bound to fail, because limits taken in the space of the mass and damping parameters turn out to be path dependent (Klein 2010). If the mass vanishes faster than the damping, nonoscillatory, purely damped solutions prevail; if the damping vanishes faster, we obtain rapidly oscillating limit solutions. If that is so even for a simple linear problem, there is little hope for independent multiple parameter expansions in more complex fluid dynamical problems. Therefore, being faced with multiple small parameters as in (9), we proceed by introducing distinguished limits, or coupled limit processes: the parameters are functionally related and asymptotic analyses proceed in terms of a single expansion parameter only. The characteristic signal speeds from Table 3 are compatible with the scalings

\[ \frac{c_{int}}{c_{ref}} \sim \frac{1}{3} \sim \sqrt{\varepsilon}, \qquad \frac{u_{ref}}{c_{int}} \sim \frac{1}{9} \sim \varepsilon, \qquad\text{and}\qquad \frac{u_{ref}}{c_{ref}} \sim \varepsilon^{3/2}, \tag{10} \]

and this corresponds to letting

\[ \Pi_1 = c_1\,\varepsilon^{3}, \qquad \Pi_2 = c_2\,\varepsilon, \qquad \Pi_3 = c_3\,\sqrt{\varepsilon} \tag{11} \]

with $c_i = O(1)$ as $\varepsilon \to 0$ for the parameters in (9). The length scales in Table 4 then obey

\[ L_{meso} = \frac{h_{sc}}{\varepsilon}, \qquad L_{Ro} = \frac{h_{sc}}{\varepsilon^{2}}, \qquad L_{Ob} = \frac{h_{sc}}{\varepsilon^{5/2}}, \qquad L_{p} = \frac{h_{sc}}{\varepsilon^{3}}, \tag{12} \]

and we find that all these familiar meteorologically relevant scales can be accessed naturally through asymptotic scalings in the small parameter $\varepsilon$.
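Two points of this subsection lend themselves to a quick numerical illustration: the path dependence of the small-mass, small-damping limit of the linear oscillator, and the compatibility of the scalings (12) with the lengths of Table 4. The sketch below is a hedged illustration, not the author's computation. It assumes the oscillator in the form mass·x'' + damping·x' + stiffness·x = 0 with unit stiffness, takes $\varepsilon = 1/9$ from (10), and uses $h_{sc} \approx 11$ km from Table 4; all names are hypothetical. In this particular form the dividing line between the two regimes is damping² = 4·mass·stiffness, so "vanishes faster" has to be read relative to that curve.

```python
import cmath

EPS = 1.0 / 9.0   # value suggested by u_ref / c_int ~ 1/9 in (10)

def oscillator_roots(mass, damping, stiffness=1.0):
    """Roots s of mass*s^2 + damping*s + stiffness = 0 (ansatz x = exp(s t))."""
    disc = cmath.sqrt(damping**2 - 4.0 * mass * stiffness)
    return (-damping + disc) / (2.0 * mass), (-damping - disc) / (2.0 * mass)

def is_oscillatory(mass, damping, stiffness=1.0):
    """True if the characteristic roots have a nonzero imaginary part."""
    return damping**2 < 4.0 * mass * stiffness

def length_scales(h_sc, eps=EPS):
    """Length scales of (12) in the units of h_sc."""
    return {
        "L_meso": h_sc / eps,
        "L_Ro": h_sc / eps**2,
        "L_Ob": h_sc / eps**2.5,
        "L_p": h_sc / eps**3,
    }

# Two hypothetical paths towards (mass, damping) -> (0, 0) with delta = 1e-3.
delta = 1.0e-3
path_a = (delta**3, delta)      # mass vanishes much faster: purely damped
path_b = (delta, delta**2)      # damping vanishes faster: rapid oscillation

# Table 4 values in metres, for comparison with (12).
table4 = {"L_meso": 150e3, "L_Ro": 1100e3, "L_Ob": 3300e3, "L_p": 10000e3}
scaled = length_scales(11e3)
ratios = {name: table4[name] / scaled[name] for name in scaled}

if __name__ == "__main__":
    print("path A oscillatory:", is_oscillatory(*path_a), oscillator_roots(*path_a))
    print("path B oscillatory:", is_oscillatory(*path_b), oscillator_roots(*path_b))
    print("Table 4 / (12):", ratios)
```

The ratios between the Table 4 lengths and the powers of $\varepsilon$ in (12) stay of order one, which is what the $O(1)$ constants $c_i$ in (11) are meant to absorb.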

3.3 Remarks of Caution

We will restrict our discussion here to length scales larger than or equal to the density scale height. Of course, on the much smaller length and time scales comparable to those of typical engineering applications, turbulence will induce a continuous range of scales. Analyses that exclusively rely on the assumption of scale separations will be of limited value in studying such flows. The interested reader may want to consult Oberlack (2006) for theoretical foundations and references.

There are serious voices in the literature (see, e.g., White 2002, Sect. 9.3) who would not necessarily agree that formal asymptotics as promoted here are the best way forward in developing approximate model equations for atmospheric flow applications. Salmon (1998) and Norbury and Roulstone (2002) are excellent starting points for the interested reader.

Other serious voices in the literature, e.g., Lovejoy et al. (2008), would dispute our basic proposition that there is a natural scale separation for atmospheric flows of sufficiently large scale. Their arguments are based on the fact that it is very difficult to detect such scale separations in the observational record. In our defense we refer to our previous discussion, and to Lundgren (1982), or Hunt and Vassilicos (1991). These authors demonstrate that some localized, i.e., ideally scale-separated, and entirely realizable flow structures exhibit full spectra without gaps and with exponents akin to those of turbulence spectra. In other words, there may very well be clean scale separations even if they are not visible in spectral and similar global decompositions.

4 Classical Single-Scale Models

In this section, we summarize the hierarchy of models that is obtained from the full compressible flow equations when we assume solutions to depend on a single horizontal, a single vertical, and a single time scale only. See Sect. 5 for a brief discussion of multiscale regimes. Section 4.1 sets the stage for the subsequent discussions by introducing the nondimensional governing equations and a general scale transformation that will be used extensively later on.

There is common agreement among meteorologists that, on length scales small compared to the planetary scale, acoustic modes contribute negligibly to the atmospheric dynamics. As a consequence, there has been long-standing interest in analogs to the classical incompressible flow equations that would be valid for meteorological applications. Such models would capture the effects of advection, vertical stratification, and the Coriolis force, while discarding acoustic modes.

According to our previous discussion, given a horizontal characteristic length, acoustic waves, internal waves, and advection are all associated with their individual characteristic time scales. Therefore, after having decided to eliminate the sound waves, we still have the option of adjusting a reduced model to the internal wave dynamics (Sect. 4.2) or to capture the effects of advection (Sect. 4.3).


Near-equatorial motions are addressed in Sect. 4.4, and Sect. 4.5 summarizes the hydrostatic primitive equations, which are at the core of most current global weather forecast and climate models. To simplify notation, the tangent plane approximation is used throughout this section. That is, we neglect the influence of Earth’s sphericity, except for including the horizontal variation of the Coriolis parameter. See standard textbooks, such as Gill (1982) and Pedlosky (1987), for discussions of the regime of validity of this approximation.

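4.1 Scalings of the Governing Equations

4.1.1 Governing Equations

The compressible flow equations for an ideal gas with constant specific heat capacities and including gravity, rotation, and generalized source terms will serve as the basis for the subsequent discussions. We nondimensionalize using $\ell_{\mathrm{ref}} = h_{\mathrm{sc}}$ and $t_{\mathrm{ref}} = h_{\mathrm{sc}}/u_{\mathrm{ref}}$ for space and time, and $p_{\mathrm{ref}}$, $T_{\mathrm{ref}}$, $R$, $u_{\mathrm{ref}}$ from Tables 2 and 3 for the thermodynamic variables and velocity. In addition, we employ the distinguished limits from (10) and (11). The governing equations for the dimensionless horizontal and vertical flow velocities, $\mathbf{v}_\parallel$, $w$, pressure, $p$, density, $\rho$, and potential temperature, $\Theta = p^{1/\gamma}/\rho$, as functions of the horizontal and vertical coordinates, $\mathbf{x}$, $z$, and time, $t$, then read

$$
\begin{aligned}
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_\parallel + w\frac{\partial}{\partial z}\right)\mathbf{v}_\parallel + \varepsilon\,(2\boldsymbol{\Omega}\times\mathbf{v})_\parallel + \frac{1}{\varepsilon^{3}\rho}\nabla_\parallel p &= Q_{\mathbf{v}_\parallel},\\
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_\parallel + w\frac{\partial}{\partial z}\right) w + \varepsilon\,(2\boldsymbol{\Omega}\times\mathbf{v})_\perp + \frac{1}{\varepsilon^{3}\rho}\frac{\partial p}{\partial z} &= Q_{w} - \frac{1}{\varepsilon^{3}},\\
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_\parallel + w\frac{\partial}{\partial z}\right)\rho + \rho\,\nabla\cdot\mathbf{v} &= 0,\\
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_\parallel + w\frac{\partial}{\partial z}\right)\Theta &= Q_{\Theta}.
\end{aligned}
\tag{13}
$$

Here $\mathbf{b}_\parallel$, $b_\perp$ denote the horizontal and vertical components of a vector $\mathbf{b}$, respectively. The terms $Q_{[\,\cdot\,]}$ in (13) stand for general source terms, molecular transport, and further unresolved-scale closure terms. In the sequel, we assume that these terms will not affect the leading-order asymptotic balances to be discussed. If, in a particular practical situation, this assumption turns out to be invalid, these terms can be handled within the same framework as the rest of the terms. See, e.g., Klein and Majda (2006) for examples involving moist processes in the atmosphere.

4.1.2 Some Revealing Transformations

The Exner Pressure

For the analysis of atmospheric flows, the Exner pressure $\pi = p^{\Gamma}/\Gamma$ with $\Gamma = (\gamma - 1)/\gamma$ turns out to be a very convenient variable. The pressure is always in hydrostatic balance to leading order in the atmosphere. For a given horizontally averaged mean stratification of the potential temperature, $\bar\Theta(z)$, the corresponding hydrostatic pressure will satisfy $d\bar p/dz = -\bar\rho = -\bar p^{1/\gamma}/\bar\Theta$ (see the $O(1/\varepsilon^{3})$-terms in (13)$_2$) with exact solution

$$
\bar p(z) = \bigl(\Gamma\,\bar\pi(z)\bigr)^{1/\Gamma}
\qquad\text{where}\qquad
\bar\pi(z) = \frac{1}{\Gamma} - \int_{0}^{z}\frac{1}{\bar\Theta(z')}\,dz'.
\tag{14}
$$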
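As a quick numerical check of (14), the sketch below integrates the background Exner pressure for a neutrally stratified atmosphere, $\bar\Theta \equiv 1$, and recovers the background pressure from it. The helper name `exner_background`, the value $\gamma = 1.4$ (dry air), and the grid are assumptions made for illustration only.

```python
GAMMA = 1.4                          # ratio of specific heats, assumed (dry air)
CAP_GAMMA = (GAMMA - 1.0) / GAMMA    # Gamma = (gamma - 1) / gamma


def exner_background(z, theta_bar):
    """Return (pi_bar, p_bar) on the grid z following Eq. (14).

    The integral of 1/Theta_bar is approximated with the trapezoidal rule.
    """
    pi_bar = [1.0 / CAP_GAMMA]       # pi_bar(0) = 1/Gamma, i.e. p_bar(0) = 1
    for i in range(1, len(z)):
        dz = z[i] - z[i - 1]
        mean_inv_theta = 0.5 * (1.0 / theta_bar[i] + 1.0 / theta_bar[i - 1])
        pi_bar.append(pi_bar[-1] - mean_inv_theta * dz)
    p_bar = [(CAP_GAMMA * pi) ** (1.0 / CAP_GAMMA) for pi in pi_bar]
    return pi_bar, p_bar


# Neutral stratification on 0 <= z <= 1.5 (heights in units of h_sc).
z = [0.01 * i for i in range(151)]
theta_neutral = [1.0] * len(z)
pi_bar, p_bar = exner_background(z, theta_neutral)
print(pi_bar[-1] / pi_bar[0], p_bar[-1] / p_bar[0])
```

In this neutral case the Exner pressure decreases linearly with height, while the pressure itself falls off much more steeply over the same height range.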


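Thus, the Exner pressure captures the essential vertical variation of the hydrostatic background pressure due to the atmosphere’s stratification without exhibiting the very strong drop of the pressure $\bar p$ due to the compressibility of air. For example, for a neutrally stratified atmosphere with $\bar\Theta \equiv \mathrm{const}$, $\bar\pi$ is linear in $z$, whereas $\bar p$ is a strongly nonlinear function. Most atmospheric flow phenomena may be understood as perturbations away from such a hydrostatically balanced background state.

A General Scale Transformation

Suppose we are interested in solutions to (13) with a characteristic time $T = t_{\mathrm{ref}}/\varepsilon^{\alpha_t}$ and a horizontal characteristic length $L = h_{\mathrm{sc}}/\varepsilon^{\alpha_x}$ with $\alpha_t$, $\alpha_x > 0$. Appropriate rescaled coordinates are

$$
\tau = \varepsilon^{\alpha_t}\, t \qquad\text{and}\qquad \boldsymbol{\xi} = \varepsilon^{\alpha_x}\,\mathbf{x}.
\tag{15}
$$

Furthermore, we take into account that the relative (vertical) variation of the potential temperature in the atmosphere is of order $O(\Delta\Theta/T_{\mathrm{ref}}) = O(\varepsilon)$ and that the vertical and horizontal velocities in most of the regimes to be considered scale as $|w|/|\mathbf{v}_\parallel| \sim h_{\mathrm{sc}}/L = O(\varepsilon^{\alpha_x})$. This suggests new dependent variables,

$$
\begin{aligned}
\Theta(\tau,\boldsymbol{\xi},z) &= 1 + \varepsilon\,\bar\theta(z) + \varepsilon^{\alpha_\pi}\,\tilde\theta(\tau,\boldsymbol{\xi},z),\\
\pi(\tau,\boldsymbol{\xi},z) &= \bar\pi(z) + \varepsilon^{\alpha_\pi}\,\tilde\pi(\tau,\boldsymbol{\xi},z),\\
w(\tau,\boldsymbol{\xi},z) &= \varepsilon^{\alpha_x}\,\tilde w(\tau,\boldsymbol{\xi},z),
\end{aligned}
\tag{16}
$$

with $\bar\pi(z)$ from (14) for $\bar\Theta(z) = 1 + \varepsilon\,\bar\theta(z)$. Then the governing equations become

$$
\begin{aligned}
\left(\frac{\varepsilon^{\alpha_t}}{\varepsilon^{\alpha_x}}\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\mathbf{v}_\parallel + \frac{(2\boldsymbol{\Omega}\times\mathbf{v})_\parallel}{\varepsilon^{\alpha_x-1}} + \frac{\varepsilon^{\alpha_\pi}}{\varepsilon^{3}}\,\Theta\,\nabla_{\boldsymbol{\xi}}\tilde\pi &= Q^{\varepsilon}_{\mathbf{v}_\parallel},\\
\left(\frac{\varepsilon^{\alpha_t}}{\varepsilon^{\alpha_x}}\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\tilde w + \frac{(2\boldsymbol{\Omega}\times\mathbf{v})_\perp}{\varepsilon^{2\alpha_x-1}} + \frac{\varepsilon^{\alpha_\pi}}{\varepsilon^{3+2\alpha_x}}\left(\Theta\,\frac{\partial\tilde\pi}{\partial z} - \frac{\tilde\theta}{\bar\Theta}\right) &= Q^{\varepsilon}_{w},\\
\left(\frac{\varepsilon^{\alpha_t}}{\varepsilon^{\alpha_x}}\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\tilde\pi + \frac{\gamma\Gamma\pi}{\varepsilon^{\alpha_\pi}}\left(\nabla_{\boldsymbol{\xi}}\cdot\mathbf{v}_\parallel + \frac{\partial\tilde w}{\partial z} + \frac{\tilde w}{\gamma\Gamma\pi}\frac{d\bar\pi}{dz}\right) &= \frac{\gamma\Gamma\pi}{\Theta}\,Q^{\varepsilon}_{\Theta},\\
\left(\frac{\varepsilon^{\alpha_t}}{\varepsilon^{\alpha_x}}\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\tilde\theta + \frac{\varepsilon}{\varepsilon^{\alpha_\pi}}\,\tilde w\,\frac{d\bar\theta}{dz} &= Q^{\varepsilon}_{\Theta}.
\end{aligned}
\tag{17}
$$

Below we will assume that the source terms, $Q^{\varepsilon}_{[\,\cdot\,]}$, are of sufficiently high order in $\varepsilon$ to not affect the leading-order asymptotics. Importantly, in writing down the equations in (17) we have merely adopted a transformation of variables, but no approximations. Together with the definitions in (15) and (16), they are equivalent to the original version of the compressible flow equations in (13). However, below we will employ judicious choices for the scaling exponents, $\alpha_{[\,\cdot\,]}$, and assume solutions that adhere to the implied scalings in that $\mathbf{v}_\parallel, \tilde w, \tilde\pi, \tilde\theta = O(1)$ and that the partial derivative operators $\partial_\tau$, $\nabla_{\boldsymbol{\xi}}$, $\partial_z$ yield $O(1)$ results when applied to these variables as $\varepsilon \to 0$. This will allow us to efficiently carve out the essence of various limit regimes for atmospheric flows without having to go through the details of the asymptotic expansions, an exercise that can be completed, of course (see, e.g., Majda and Klein 2003; Klein 2003), but whose detailed exposition would be too tedious and space-consuming for the present purposes.

[Figure 1: regime diagram. Horizontal axis: characteristic length in units of $h_{\mathrm{sc}}$, with tick marks ranging from $\varepsilon$ to $1/\varepsilon^{3}$ (including $1/\varepsilon^{5/2}$, the Obukhov scale) and the scale names bulk micro, convective, meso, synoptic, planetary. Vertical axis: characteristic time in units of $h_{\mathrm{sc}}/u_{\mathrm{ref}}$, with tick marks up to $1/\varepsilon^{3}$. Regimes marked: acoustic waves, internal waves, advection, inertial waves, Boussinesq, anelastic/pseudo-incompressible, WTG, WTG + Coriolis, HPE, HPE + Coriolis, QG, PG.]

Fig. 1 Scaling regimes and model equations for atmospheric flows. WTG weak temperature gradient approximations, QG quasi-geostrophic, PG planetary geostrophic, HPE hydrostatic primitive equations. The WTG and HPE models cover a wide range of spatial scales assuming the associated advective and acoustic time scales, respectively. The anelastic and pseudo-incompressible models for realistic flow regimes cover multiple spatiotemporal scales (Sect. 2). See similar graphs for near-equatorial flows in Majda and Klein (2003) and Majda (2007b)

Figure 1 summarizes the midlatitude flow regimes to be discussed in this way below. Considering (17), the classical Strouhal, Mach, Froude, and Rossby numbers are now related to $\varepsilon$ and the spatiotemporal scaling exponents via

$$
\mathrm{St}^{-1} = \frac{L}{UT} = \frac{\varepsilon^{\alpha_t}}{\varepsilon^{\alpha_x}}, \qquad
\mathrm{Fr}^{2} = \gamma\,\mathrm{Ma}^{2} = \frac{U^{2}}{C^{2}} = \varepsilon^{3}, \qquad
\mathrm{Ro} = \varepsilon^{\alpha_x - 1},
\tag{18}
$$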

where T , L, U , C are the characteristic time and length scale, horizontal flow velocity, and sound speed in the rescaled variables from Eqs. 16 and 17, respectively.
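The bookkeeping behind (18) is simple enough to tabulate. The sketch below returns, for a given pair of scaling exponents, the powers of $\varepsilon$ with which the inverse Strouhal number, the squared Froude number, and the Rossby number scale. The function name `characteristic_exponents`, the dictionary keys, and the regime labels are hypothetical and used only for this illustration.

```python
from fractions import Fraction


def characteristic_exponents(alpha_t, alpha_x):
    """Return exponents n such that each number scales like eps**n, per Eq. (18)."""
    alpha_t = Fraction(alpha_t)
    alpha_x = Fraction(alpha_x)
    return {
        "inv_strouhal": alpha_t - alpha_x,  # St^-1 = eps**(alpha_t - alpha_x)
        "froude_sq": Fraction(3),           # Fr^2 = gamma * Ma^2 = eps**3
        "rossby": alpha_x - 1,              # Ro = eps**(alpha_x - 1)
    }


# Exponent pairs for three regimes discussed in Sects. 4.2 and 4.3.
regimes = {
    "internal waves, synoptic scale": (1, 2),   # alpha_t = alpha_x - 1
    "weak temperature gradient, meso": (1, 1),  # alpha_t = alpha_x
    "quasi-geostrophic": (2, 2),                # alpha_t = alpha_x
}
for label, (a_t, a_x) in regimes.items():
    print(label, characteristic_exponents(a_t, a_x))
```

A vanishing `inv_strouhal` exponent marks the advective time scale, while a negative one marks time scales that are fast compared with advection; a positive `rossby` exponent signals a small Rossby number.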

4.2 Midlatitude Internal Gravity Wave Models

Here we consider (17) with $\alpha_t = \alpha_x - 1$, so that temporal variations are fast compared with trends due to advection, and we restrict ourselves to length scales up to the synoptic scale, $L_{\mathrm{Ro}} = h_{\mathrm{sc}}/\varepsilon^{2}$, so that $\alpha_x \in \{0, 1, 2\}$. The resulting leading-order models will include the time derivative at $O(1/\varepsilon)$ but not the advection terms. The Coriolis terms will not dominate the time derivative unless $\alpha_x > 2$. As a consequence, for any horizontal scale $L \le h_{\mathrm{sc}}/\varepsilon^{2}$, the level of pressure perturbations can be assessed from the horizontal momentum equation, (17)$_1$, by balancing the pressure gradient and time derivative terms, and we find $\alpha_\pi = 2$. With this information at hand, we observe that the velocity divergence term will dominate the pressure equation (17)$_3$ so that a divergence constraint arises at leading order. Collecting these observations, we obtain a generic leading-order model for horizontal scales $h_{\mathrm{sc}} \le L \le h_{\mathrm{sc}}/\varepsilon^{2} = L_{\mathrm{Ro}}$,

$$
\begin{aligned}
\frac{\partial\mathbf{v}_\parallel}{\partial\tau} + a\,f\,\mathbf{k}\times\mathbf{v}_\parallel + \nabla_{\boldsymbol{\xi}}\tilde\pi &= Q^{\varepsilon}_{\mathbf{v}_\parallel},\\
b\,\frac{\partial\tilde w}{\partial\tau} + \frac{\partial\tilde\pi}{\partial z} &= \tilde\theta,\\
\nabla\cdot(\bar P\,\mathbf{v}) &= 0,\\
\frac{\partial\tilde\theta}{\partial\tau} + \tilde w\,\frac{d\bar\theta}{dz} &= Q^{\varepsilon}_{\Theta}.
\end{aligned}
\tag{19}
$$

Here we have introduced the vertical component of the Earth rotation vector, $f = \mathbf{k}\cdot 2\boldsymbol{\Omega}$, with $\mathbf{k}$ the vertical unit vector, and we have abbreviated

$$
\nabla\cdot(\bar P\,\mathbf{v}) = \bar P\left(\nabla_{\boldsymbol{\xi}}\cdot\mathbf{v}_\parallel + \frac{\partial\tilde w}{\partial z} + \frac{\tilde w}{\gamma\Gamma\bar\pi}\frac{d\bar\pi}{dz}\right),
$$

where $\bar P(z) = \bigl(\Gamma\,\bar\pi(z)\bigr)^{1/(\gamma\Gamma)} = \bar p^{1/\gamma}$. The parameters $a$, $b$ in (19) depend on the characteristic horizontal scale and are defined as

$$
a = \begin{cases} 1 & (\alpha_x = 2)\\ 0 & (\alpha_x \in \{0, 1\}) \end{cases},
\qquad
b = \begin{cases} 1 & (\alpha_x = 0)\\ 0 & (\alpha_x \in \{1, 2\}) \end{cases}.
\tag{20}
$$

For any pair $(a, b)$, the model equations in (19) are “sound-proof” due to the velocity divergence constraint (19)$_3$. They support internal gravity waves through the interplay of vertical advection of the background stratification in (19)$_4$ and the buoyancy force on the right-hand side of the vertical momentum equation, (19)$_2$. When $b = 0$, i.e., for horizontal scales $L \gg h_{\mathrm{sc}}$, we have $\partial\tilde\pi/\partial z = \tilde\theta$ and these waves are called hydrostatic; for $b = 1$ they are nonhydrostatic, with the inertia of the vertical motion participating in the dynamics.

When $a = 0$, i.e., for length scales small compared with the synoptic scale, $L_{\mathrm{Ro}} \sim h_{\mathrm{sc}}/\varepsilon^{2}$, the internal waves propagate freely. Hydrostatic states at rest with horizontally homogeneous $\tilde\theta$ are then the only equilibrium states of the system. For length scales comparable to the synoptic scale, however, $a = 1$, and the Coriolis effect affects the dynamics. As a consequence, there are new nontrivial steady states involving the geostrophic balance of the Coriolis and horizontal pressure gradient terms, i.e., $f\,\mathbf{k}\times\mathbf{v}_\parallel + \nabla_{\boldsymbol{\xi}}\tilde\pi = 0$. Rather than relaxing to a hydrostatic state at rest, such a system will adjust to the geostrophic state compatible with its initial data over long times, and it will generally release only a fraction of the full potential energy that it held initially (see Gill 1982, Chap. 7).
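The roles of the switches $a$ and $b$ can be illustrated with the plane-wave dispersion relation of the wave model (19). Under additional simplifying assumptions that are not part of the text (no source terms, constant $N^2 = d\bar\theta/dz$, constant $\bar P$ so that the constraint reduces to $\nabla\cdot\mathbf{v} = 0$, constant $f$, and waves $\propto \exp[i(k\xi + mz - \omega\tau)]$ varying in one horizontal direction only), eliminating all variables yields $\omega^2 = (N^2 k^2 + a f^2 m^2)/(b k^2 + m^2)$. The function name below is hypothetical.

```python
import math


def internal_wave_frequency(k, m, n_sq, f=0.0, a=0, b=1):
    """Frequency of a plane internal wave under the stated assumptions.

    k, m  : horizontal and vertical wavenumbers
    n_sq  : constant background stratification d(theta_bar)/dz (assumed)
    f     : constant Coriolis parameter (assumed)
    a, b  : regime switches as in Eq. (20), each 0 or 1
    """
    denominator = b * k * k + m * m
    if denominator == 0:
        raise ValueError("need m != 0 in the hydrostatic case b = 0")
    omega_sq = (n_sq * k * k + a * f * f * m * m) / denominator
    return math.sqrt(omega_sq)


# Hydrostatic, nonrotating waves (a = 0, b = 0): omega = N k / |m|.
print(internal_wave_frequency(k=1.0, m=2.0, n_sq=4.0, a=0, b=0))
# Nonhydrostatic waves (b = 1) never exceed the buoyancy frequency N.
print(internal_wave_frequency(k=3.0, m=4.0, n_sq=25.0, a=0, b=1))
```

With $a = 1$ the frequency is bounded below by $|f|$ in the hydrostatic case, which reflects the rotational influence on synoptic-scale waves described above.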


4.3 Balanced Models for Advection Time Scales

Here we let $\alpha_t = \alpha_x$ in (17), i.e., we consider processes that will always include the effects of advection in their temporal evolution. Now we have to distinguish between the mesoscales, $L = h_{\mathrm{sc}}$ and $L = L_{\mathrm{meso}} = h_{\mathrm{sc}}/\varepsilon$, on the one hand, and the synoptic scale, $L = L_{\mathrm{Ro}} = h_{\mathrm{sc}}/\varepsilon^{2}$, on the other. For the latter, the Coriolis term becomes asymptotically dominant.

4.3.1 Weak Temperature Gradient Models for the Mesoscales

On the mesoscales, i.e., for $\alpha_x \in \{0, 1\}$, the horizontal momentum balance in (17)$_1$ yields $\alpha_\pi = 3$ and we find the “weak temperature gradient” or “quasi-nondivergent” approximation at leading order (Held and Hoskins 1985; Zeytounian 1990; Sobel et al. 2001),

$$
\begin{aligned}
\left(\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\mathbf{v}_\parallel + a\,f\,\mathbf{k}\times\mathbf{v}_\parallel + \nabla_{\boldsymbol{\xi}}\tilde\pi &= Q^{\varepsilon}_{\mathbf{v}_\parallel},\\
\nabla\cdot(\bar P\,\mathbf{v}) &= 0,\\
\tilde w\,\frac{d\bar\theta}{dz} &= Q^{\varepsilon}_{\Theta},
\end{aligned}
\tag{21}
$$

where $a = 0$ for $L = h_{\mathrm{sc}}$ and $a = 1$ for $L = h_{\mathrm{sc}}/\varepsilon = L_{\mathrm{meso}}$. The third equation in (21) describes how diabatic heating in a sufficiently strongly stratified medium induces vertical motions so that air parcels adjust quasi-statically to their individual vertical levels of neutral buoyancy. Replacing the potential temperature transport equation from (17)$_4$ with (21)$_3$ suppresses internal gravity waves, just as the divergence constraints suppress the acoustic modes.

With (21)$_3$, the vertical velocity is fixed, given the diabatic source term. The second equation then becomes a divergence constraint for the horizontal velocity, and the perturbation pressure $\tilde\pi$ is responsible for guaranteeing compliance with this constraint. The vertical momentum equation, (17)$_2$, in this regime becomes the determining equation for potential temperature perturbations, $\tilde\theta$. These are passive unless they influence the source terms in (21). Weak temperature gradient models are frequently employed for studies in tropical meteorology; see Klein (2006) and references therein.

4.3.2 Synoptic Scales and the Quasi-geostrophic Approximation

On the larger synoptic length and advective time scales, $T = h_{\mathrm{sc}}/(\varepsilon^{2} u_{\mathrm{ref}}) \approx 12$ h and $L = h_{\mathrm{sc}}/\varepsilon^{2} \approx 1{,}100$ km, for which $\alpha_t = \alpha_x = 2$, we find that the Coriolis and pressure gradient terms form the geostrophic balance at leading order, and this changes the dynamics profoundly in comparison with the weak temperature gradient dynamics from the last section. Here we encounter what is likely the most prominent classical model of theoretical meteorology, the quasi-geostrophic model (see Gill 1982; Pedlosky 1987; Zeytounian 1990, and references therein). See also Muraki et al. (1999) for a higher-order accurate QG theory.

Matching the Coriolis and pressure gradient terms, which are $O(\varepsilon^{-1})$ and $O(\varepsilon^{\alpha_\pi - 3})$, provides $\alpha_\pi = 2$. The horizontal and vertical momentum equations reduce to the geostrophic and hydrostatic balances, respectively, viz.,

$$
f_0\,\mathbf{k}\times\mathbf{v}_\parallel + \nabla_{\boldsymbol{\xi}}\tilde\pi = 0, \qquad
\frac{\partial\tilde\pi}{\partial z} = \tilde\theta,
\tag{22}
$$


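where $f_0 = \mathbf{k}\cdot 2\boldsymbol{\Omega}\,|_{\eta = 0}$. By taking the curl of (22)$_1$ and inserting (22)$_2$ we obtain

$$
f_0\begin{pmatrix} \partial v/\partial z\\ -\,\partial u/\partial z\\ \nabla_{\boldsymbol{\xi}}\cdot\mathbf{v}_\parallel \end{pmatrix}
= \begin{pmatrix} \partial^{2}\tilde\pi/\partial z\,\partial\xi\\ \partial^{2}\tilde\pi/\partial z\,\partial\eta\\ 0 \end{pmatrix}
= \begin{pmatrix} \partial\tilde\theta/\partial\xi\\ \partial\tilde\theta/\partial\eta\\ 0 \end{pmatrix},
\tag{23}
$$

with $\xi$, $\eta$ the east and northward components of $\boldsymbol{\xi}$, respectively. The first two components in (23) represent the “thermal wind relation,” which was introduced earlier in the introduction and was used in Table 3 to define the universal reference flow velocity, $u_{\mathrm{ref}}$. The third component in (23) states that the horizontal flow is divergence free at leading order, so that $\tilde\pi$ may be considered a stream function for the horizontal flow according to (22)$_1$.

Equation 22 does not allow us to determine the temporal evolution of the flow. An evolution equation is obtained by going to the next order in $\varepsilon$ and eliminating the arising higher-order perturbation functions using the vertical momentum, pressure, and potential temperature equations (see Pedlosky 1987, Sect. 6.5). The result is, as one may have expected, a vorticity transport equation and one finds

$$
\left(\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}}\right) q = Q^{\varepsilon}_{q},
\tag{24}
$$

where

$$
q = \zeta + \beta\eta + \frac{f_0}{\bar\rho}\frac{\partial}{\partial z}\left(\frac{\bar\rho\,\tilde\theta}{d\bar\theta/dz}\right)
\tag{25}
$$

is the leading-order potential vorticity with $\zeta$, $\beta$, $\eta$ defined through

$$
\zeta = \mathbf{k}\cdot\nabla_{\boldsymbol{\xi}}\times\mathbf{v}_\parallel, \qquad
\eta = \boldsymbol{\xi}\cdot\mathbf{n}_{\mathrm{north}}, \qquad
\mathbf{k}\cdot 2\boldsymbol{\Omega} = f_0 + \beta\eta.
\tag{26}
$$

Thus $\zeta$ is the vertical component of vorticity, $\eta$ the northward component of $\boldsymbol{\xi}$, and $\beta$ the derivative of the Coriolis parameter with respect to $\eta$. The last term in (25) captures the effect on vorticity of the vertical stretching of a column of air induced by nonzero first-order horizontal divergences and the constraint of mass continuity. The potential vorticity source term, $Q^{\varepsilon}_{q}$, in (24) is a combination of the heat and momentum source terms and their gradients. Using the geostrophic and hydrostatic balances in (22), which give $\zeta = \nabla_{\boldsymbol{\xi}}^{2}\tilde\pi/f_0$ and $\tilde\theta = \partial\tilde\pi/\partial z$, the definition of potential vorticity in (25) becomes

$$
q = \frac{1}{f_0}\nabla_{\boldsymbol{\xi}}^{2}\tilde\pi + \beta\eta + \frac{f_0}{\bar\rho}\frac{\partial}{\partial z}\left(\frac{\bar\rho}{d\bar\theta/dz}\frac{\partial\tilde\pi}{\partial z}\right).
\tag{27}
$$

This, together with (22)$_1$, reveals that the transport equation from (24) is a closed equation for the perturbation pressure field, $\tilde\pi$, given appropriate definitions of the source terms and boundary conditions.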
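A standard consequence of the potential vorticity equation (24) is the Rossby wave. As a hedged sketch: linearizing (24) about a state of rest without sources, assuming constant $\bar\rho$ and constant $N^2 = d\bar\theta/dz$ (both simplifications made only for this illustration), and inserting $\tilde\pi \propto \exp[i(k\xi + l\eta + mz - \omega\tau)]$ with the geostrophic relations $v = f_0^{-1}\partial\tilde\pi/\partial\xi$ and $\zeta = f_0^{-1}\nabla_{\boldsymbol{\xi}}^{2}\tilde\pi$ from (22) gives $\omega = -\beta k/(k^2 + l^2 + f_0^2 m^2/N^2)$. The function name is hypothetical.

```python
def qg_rossby_frequency(k, l, m, beta, f0, n_sq):
    """Rossby wave frequency from the linearized QG equation.

    Assumes constant rho_bar and constant n_sq = d(theta_bar)/dz, no sources,
    and a resting basic state (simplifications for illustration only).
    """
    total_wavenumber_sq = k * k + l * l + (f0 * f0) * (m * m) / n_sq
    return -beta * k / total_wavenumber_sq


# Barotropic wave (m = 0): omega = -beta / k for l = 0.
print(qg_rossby_frequency(k=1.0, l=0.0, m=0.0, beta=2.0, f0=1.0, n_sq=1.0))
# Vertical structure (m != 0) lowers the magnitude of the frequency.
print(qg_rossby_frequency(k=1.0, l=0.0, m=1.0, beta=2.0, f0=1.0, n_sq=1.0))
```

For $\beta > 0$ and $k > 0$ the frequency is negative, so the phase of these waves propagates westward.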


The interpretation of the potential vorticity transport equation in (24) is as follows: Total vorticity is advected along (horizontal) pathlines of the leading-order flow. In the process, vertical vorticity, $\zeta$, adjusts to compensate for local changes of the planetary rotation, $\beta\eta$, when there are northward or southward excursions, and feels the effects of vortex stretching as captured by the third term on the right in (27). Important manifestations of these processes are planetary-scale meandering motions, the Rossby waves, and baroclinic instabilities, which induce the sequence of cyclones and anticyclones that constitute much of the weather statistics in the middle latitudes; see Pedlosky (1987, Chap. 7) for a comprehensive discussion. Importantly, in the QG theory both acoustic and internal gravity waves are asymptotically filtered out, as the characteristic time scale considered is that of advection only.

4.3.3 Ogura and Phillips’ Anelastic Model for Weak Stratification

Thus far we faced the alternative of adopting either the internal wave or the advection time scale in constructing reduced single-scale models. A compact description of the combined effects of both did not emerge. Ogura and Phillips (1962) realized that this separation may be overcome for weak stratification and diabatic heating. To summarize their development, we let $(\bar\theta, Q^{\varepsilon}_{\Theta}) = \varepsilon^{2}\,(\bar\theta^{*}, Q^{*}_{\Theta}) = O(\varepsilon^{2})$ in (17)$_4$ but otherwise follow the developments for the advective time scale in Sect. 4.3.1. This leads to the classical anelastic model,

$$
\begin{aligned}
\left(\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\mathbf{v}_\parallel + \nabla_{\boldsymbol{\xi}}\tilde\pi &= Q^{\varepsilon}_{\mathbf{v}_\parallel},\\
\left(\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\tilde w + \frac{\partial\tilde\pi}{\partial z} &= \tilde\theta + Q^{\varepsilon}_{w},\\
\nabla\cdot(\bar P\,\mathbf{v}) &= 0,\\
\left(\frac{\partial}{\partial\tau} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + \tilde w\frac{\partial}{\partial z}\right)\tilde\theta + \tilde w\,\frac{d\bar\theta^{*}}{dz} &= Q^{*}_{\Theta}.
\end{aligned}
\tag{28}
$$

The velocity divergence constraint in (28)$_3$ is responsible for suppressing sound waves. For the present weak variations of potential temperature it is equivalent to the more common formulation, $\nabla\cdot(\bar\rho\,\mathbf{v}) = 0$; see the definition of $\Theta$ preceding (13).

The model involves a Boussinesq-type approximation for buoyancy in (28)$_2$, and the buoyancy evolution equation (28)$_4$ accounts explicitly for the background stratification. As a consequence, the system supports both internal gravity waves and the nonlinear effects of advection. The total variation of potential temperature allowed for in the Ogura and Phillips anelastic model is of order $O(\varepsilon^{3}) = O(\mathrm{Ma}^{2}) \sim 10^{-3}$ (see (16), (18), and let $\bar\theta = O(\varepsilon^{2})$). This amounts to temperature variations of merely a few Kelvin in dimensional terms, in contrast with the observed variations of 30 to 50 K across the troposphere. To address this issue, various generalizations of the classical anelastic model allowing for stronger stratifications have been proposed (see, e.g., Durran 1989; Bannon 1996, and references therein). See also the previous discussion (Sect. 2).


4.4 Scalings for Near-Equatorial Motions

In the tropics, the same processes as encountered so far contribute to the atmospheric fluid dynamics, i.e., we have advection, inertial effects due to the Coriolis force, and internal gravity and sound waves. However, the fact that the Coriolis force for the horizontal motion changes sign (and is zero) at the equator causes the tropical dynamics to exhibit a range of very different phenomena as compared to middle and high latitude motions. There is, in particular, a spectrum of near-equatorial trapped waves which combine aspects of internal gravity and Rossby waves and which move predominantly in the zonal direction along the equator.

For the asymptotic scaling in $\varepsilon$ of the Coriolis term in the horizontal momentum balance near the equator we have $(2\boldsymbol{\Omega}\times\mathbf{v})_\parallel = f\,\mathbf{k}\times\mathbf{v}_\parallel + w\,2\boldsymbol{\Omega}_\parallel\times\mathbf{k}$, where $f = 2\boldsymbol{\Omega}\cdot\mathbf{k} = 2\Omega\sin(\phi) = 2\Omega\,\phi + O(\phi^{3})$ for small latitudes, $\phi \ll 1$. For scales smaller than the planetary scale, we will place our coordinate system onto the equator and let the space coordinate $y$ point to the north. Then, as $y$ is nondimensionalized by $h_{\mathrm{sc}} = \varepsilon^{3} a$, we have $\phi = \varepsilon^{3} y$ and in (17)$_1$ find

$$
\varepsilon\,(2\boldsymbol{\Omega}\times\mathbf{v})_\parallel = \varepsilon^{4}\,y\,2\Omega\,\mathbf{k}\times\mathbf{v}_\parallel + O(\varepsilon w).
\tag{29}
$$

For spatiotemporal scales that are too small for the Coriolis effect to play a leading-order role, we will find essentially the same small-scale equations discussed in Sect. 4.2 for internal waves and in Sect. 4.3 for the balanced weak temperature gradient flows, except that the Coriolis parameter will be a linear function of the meridional coordinate.

A prominent difference between tropical and midlatitude dynamics arises at the equatorial synoptic scales, i.e., at horizontal spatial scales for which the Coriolis term again dominates the advection terms in the momentum balance even under the equatorial scaling from (29). We will find the governing equations for a variety of dispersive combined internal-inertial waves that are confined to the near-equatorial region and, on longer time scales, a geostrophically balanced model relevant to the tropics.

In analogy with our discussion of the scales in Table 4 we assess the equatorial synoptic scale, $L_{\mathrm{es}}$, by equating the time an internal gravity wave would need to pass this characteristic distance with the inverse of the Coriolis parameter. The latter is, however, now proportional to the scale we are interested in, and we obtain in dimensional terms, $c_{\mathrm{int}}/L_{\mathrm{es}} = \Omega L_{\mathrm{es}}/a$, i.e., $L_{\mathrm{es}} \sim \sqrt{c_{\mathrm{int}}\,a/\Omega}$. Using (9–11), we recover the scaling developed by Majda and Klein (2003),

$$
\frac{L_{\mathrm{es}}}{h_{\mathrm{sc}}} \sim \sqrt{\frac{c_{\mathrm{ref}}}{\Omega a}\,\frac{c_{\mathrm{int}}}{c_{\mathrm{ref}}}}\;\frac{a}{h_{\mathrm{sc}}} \sim \varepsilon^{-5/2}.
\tag{30}
$$
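For orientation, the dimensional estimate $L_{\mathrm{es}} \sim \sqrt{c_{\mathrm{int}}\,a/\Omega}$ can be evaluated directly. In the sketch below, Earth's radius and rotation rate are standard values, whereas the internal wave speed $c_{\mathrm{int}} = 50$ m/s is an assumed, purely illustrative number, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6.371e6        # mean Earth radius a, in m
EARTH_ROTATION_S = 7.292e-5     # Earth's rotation rate Omega, in 1/s


def equatorial_synoptic_scale(c_int, a=EARTH_RADIUS_M, omega=EARTH_ROTATION_S):
    """Return L_es = sqrt(c_int * a / Omega), i.e. the scale at which
    c_int / L_es equals Omega * L_es / a."""
    return math.sqrt(c_int * a / omega)


# Assumed internal gravity wave speed of 50 m/s (illustrative only).
L_es = equatorial_synoptic_scale(c_int=50.0)
print(f"L_es is roughly {L_es / 1e3:.0f} km")
```

The estimate is an order-of-magnitude statement: prefactors of order one, such as the factor 2 in $f \approx 2\Omega\phi$, are dropped in the scaling argument.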

Comparing this with (12) we observe that the equatorial synoptic scale matches the asymptotic scaling of the midlatitude Obukhov or external Rossby radius, and this corresponds to an intriguing connection between baroclinic near-equatorial waves and barotropic midlatitude modes discussed by Wang and Xie (1996) and Majda and Biello (2003).

4.4.1 Equatorial Sub-Synoptic Flow Models

For scales smaller than the equatorial synoptic scale with $L \sim h_{\mathrm{sc}}/\varepsilon^{2}$ or smaller, the situation is very similar to that encountered for sub-synoptic scales in the midlatitudes. On the internal wave time scale, the leading-order model is again that of internal wave motions without influence of the Coriolis force as in Sect. 4.2. On advective time scales, weak temperature gradient models are recovered in analogy with those found in Sect. 4.3.1, yet with a nonconstant Coriolis parameter $f = 2\Omega\eta$. For related derivations see Browning et al. (2000), Sobel et al. (2001), and Majda and Klein (2003). When the zonal characteristic length is allowed to be large compared to the meridional one, a modified set of equations is obtained that still retains the advection of zonal momentum, but that is geostrophically balanced in the meridional direction (Majda and Klein 2003). We omit listing all these equations here as they are easily obtained from those in Sect. 4.3.1, taking into account the equatorial representation of the Coriolis term in (29).

4.4.2 Equatorial Synoptic and Planetary Models

In the tropics, there is a strong anisotropy between the zonal and meridional directions due to the variation of the Coriolis parameter. Here we distinguish the zonal and meridional velocity components, $u$ and $v$, respectively, so that $\mathbf{v}_\parallel = u\,\mathbf{i} + v\,\mathbf{j}$. In the meridional direction, we allow for waves extending over the equatorial synoptic distance, $L_{\mathrm{es}}$, whereas in the zonal direction, we have $L_x = L_{\mathrm{es}}$ for synoptic scale waves, but $L_x \sim L_{\mathrm{es}}/\varepsilon$ for equatorial long waves (Majda and Klein 2003). Just as we scaled the vertical velocity with the vertical-to-horizontal aspect ratio before, we rescale the meridional velocity here with the meridional-to-zonal aspect ratio, i.e., we let $\eta = \varepsilon^{5/2} y$ throughout this section, but $v = \varepsilon^{\alpha_v}\tilde v$, with $\alpha_v = 0$ for synoptic scale waves and $\alpha_v = 1$ for equatorial long waves. Also, we focus on the internal wave or advection time scale, so that $\alpha_t = \alpha_x - 1$ or $\alpha_t = \alpha_x$, respectively, with $\alpha_x \in \{5/2, 7/2\}$. In the usual way, the pressure scaling is found by balancing the dominant terms in the (zonal) momentum equation and we have $\alpha_\pi = 2$. These scalings yield the generalized linear equatorial (wave) equations

$$
\begin{aligned}
b\,\frac{\partial u}{\partial\tau} - 2\Omega\eta\,\tilde v + \frac{\partial\tilde\pi}{\partial\xi} &= Q^{\varepsilon}_{u},\\
b\,a\,\frac{\partial\tilde v}{\partial\tau} + 2\Omega\eta\,u + \frac{\partial\tilde\pi}{\partial\eta} &= Q^{\varepsilon}_{\tilde v},\\
\frac{\partial\tilde\pi}{\partial z} &= \tilde\theta,\\
\nabla\cdot(\bar P\,\mathbf{v}) &= 0,\\
b\,\frac{\partial\tilde\theta}{\partial\tau} + \tilde w\,\frac{d\bar\theta}{dz} &= Q^{\varepsilon}_{\Theta}.
\end{aligned}
\tag{31}
$$

For synoptic scale waves with isotropic scaling in the horizontal directions $a = 1$, whereas for equatorial long waves there is geostrophic balance in the meridional direction and $a = 0$. To describe processes advancing on the slower advection time scale, we set $b = 0$ and obtain the Matsuno-Webster-Gill model for geostrophically balanced near-equatorial motions (Matsuno 1966).

The equations in (31) support, for $b \ne 0$, a rich family of essentially different traveling waves (Gill 1982; Pedlosky 1987; Majda 2002; Wheeler and Kiladis 1999). There are fast equatorially trapped internal gravity waves that correspond to the midlatitude internal waves from Sect. 4.2, slowly westward moving equatorially trapped Rossby waves, the Yanai waves, which are special solutions behaving like internal gravity waves at short wavelengths but like Rossby waves in the long-wavelength limit, and finally Kelvin waves, which travel exclusively eastward and are associated with purely zonal motions, i.e., with $\tilde v \equiv 0$.

In the quasi-steady balanced case, $b = 0$, the equations in (31) reduce to a linear model for the geostrophically balanced flow for which exact solutions are available, given the source terms $Q^{\varepsilon}_{[\,\cdot\,]}$ (see, e.g., Matsuno 1966). These exact solutions are determined, however, only up to a purely zonal, i.e., $x$-independent, mean flow. Such a mean flow must either be prescribed or obtained from an analysis of the next larger zonal scale (Majda and Klein 2003). The mathematical character of near-equatorial geostrophically balanced flows is very different from that of their midlatitude analogs from Sect. 4.3.2.

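4.5 The Hydrostatic Primitive Equations

In the construction of computational global weather forecasting or climate models, it is important that the basic fluid flow model used is uniformly valid on the entire globe and for the most extreme flow conditions found. As we have seen, the large-scale model equations discussed in the previous sections are adapted to either the middle latitudes or the near-equatorial region. At the same time, one observes relatively large mean flow velocities of the order of $|\mathbf{v}| \sim 50$ m/s, corresponding to Mach numbers $\mathrm{Ma} \sim 0.15$, in strong jet streams near the tropopause (see, e.g., Schneider 2006), and it turns out that global scale barotropic wave perturbations may be equally considered as external gravity or as planetary sound waves.

These observations lead us to consider fully compressible flows with Mach numbers of order unity on horizontal scales large compared with the scale height, $h_{\mathrm{sc}}$. Going back to our general rescaled compressible flow equations, we nondimensionalize the flow velocity by the sound speed, $c_{\mathrm{ref}}$, instead of by $u_{\mathrm{ref}}$, recalling that $u_{\mathrm{ref}}/c_{\mathrm{ref}} = \mathrm{Ma} \sim \varepsilon^{3/2}$ according to (10). Then we let

$$
\hat{\mathbf{v}}_\parallel = \varepsilon^{3/2}\,\mathbf{v}_\parallel, \qquad
\alpha_x \in \left\{1, \ldots, \tfrac{5}{2}\right\}, \qquad
\alpha_t = \alpha_x - \tfrac{3}{2}
\tag{32}
$$

and obtain, after reverting to the variables $p$, $\rho$, $\hat{\mathbf{v}}_\parallel$, $\hat w$, and dropping the “hats” for convenience of notation,

$$
\begin{aligned}
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + w\frac{\partial}{\partial z}\right)\mathbf{v}_\parallel + a\,f\,\mathbf{k}\times\mathbf{v}_\parallel + \frac{1}{\rho}\nabla_{\boldsymbol{\xi}}\,p &= Q_{\mathbf{v}_\parallel},\\
\frac{1}{\rho}\frac{\partial p}{\partial z} &= -1,\\
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + w\frac{\partial}{\partial z}\right)\rho + \rho\,\nabla\cdot\mathbf{v} &= 0,\\
\left(\frac{\partial}{\partial t} + \mathbf{v}_\parallel\cdot\nabla_{\boldsymbol{\xi}} + w\frac{\partial}{\partial z}\right)\Theta &= Q_{\Theta},
\end{aligned}
\tag{33}
$$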

where $f = 2\boldsymbol{\Omega}\cdot\mathbf{k}$ is the vertical component of the Earth rotation vector and $a = 0$ for $\alpha_x < 5/2$, whereas $a = 1$ for $\alpha_x = 5/2$.

The only approximations relative to the full compressible flow equations from (13) result from the hydrostatic approximation for the pressure in the second equation, from the neglect of higher-order contributions to the Coriolis terms both in the vertical and horizontal momentum balances, i.e., from the “traditional approximation” (see, e.g., White 2002), and possibly from our assumption regarding the nondominance of the source terms $Q_{[\,\cdot\,]}$.

The main consequence of the hydrostatic approximation is that the vertical velocity becomes a diagnostic variable that can be determined from the horizontal velocity, pressure, and potential temperature fields at any given time by integrating a second-order ordinary differential equation in $z$. This can be demonstrated, e.g., by differentiating (33)$_2$ with respect to time and collecting all expressions involving $w$. A more elegant way to reveal the diagnostic nature of vertical velocity is to move to pressure coordinates, i.e., to introduce the time-dependent coordinate transformation, $(t, \boldsymbol{\xi}, z) \to (t, \boldsymbol{\xi}, p)$, where the transformation rule is given by the hydrostatic relation in (33)$_2$. After this transformation, one obtains a divergence constraint

$$
\nabla_{p}\cdot\hat{\mathbf{v}}_\parallel + \frac{\partial\omega}{\partial p} = 0
\tag{34}
$$

where $\omega = Dp/Dt$ is the analog of vertical velocity in the pressure coordinates (see, e.g., Salmon 1998; White 2002). The traditional approximation for the Coriolis terms results automatically from the present asymptotic considerations. At the same time, it is extremely important for energetic reasons: unless the traditional approximation is introduced together with the hydrostatic one, the resulting equations will not feature an energy conservation principle, and this would render them useless for long-term integrations (White 2002). In terms of physical processes represented, the hydrostatic primitive equations cover advection, hydrostatic internal gravity waves, baroclinic and barotropic inertial (Rossby) waves, and barotropic large-scale external gravity waves. The latter are the remnant of compressibility in this system and may thus equivalently be considered as its sound waves. Due to the fact that the Coriolis term does not become dominant in (33), there is no degeneration of these equations near the equator, and the equatorial region is represented with uniform accuracy. The hydrostatic primitive equations form the fluid dynamical basis of most global weather forecasting and climate models today (see, e.g., Lorenz 1967; Zeytounian 1990; Salmon 1998; Houghton 2002; Shaw and Shepherd 2009). In regional weather forecasting, they are being replaced today with nonhydrostatic models (Clark et al. 2009; Koppert et al. 2009).
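The diagnostic character of the vertical motion in pressure coordinates can be illustrated with a few lines of code: given a profile of the horizontal divergence on pressure levels, the constraint (34) is integrated in $p$ to obtain $\omega$. The sketch below assumes the boundary condition $\omega = 0$ at the top level and uses the trapezoidal rule; the function name, the boundary condition, and the sample profile are assumptions for illustration, not part of the text.

```python
def omega_from_divergence(p_levels, divergence, omega_top=0.0):
    """Integrate d(omega)/dp = -div_p(v) from the first (top) level downward.

    p_levels   : increasing pressure values, starting at the top level
    divergence : horizontal divergence on the same levels
    omega_top  : assumed boundary value of omega at the top level
    """
    omega = [omega_top]
    for i in range(1, len(p_levels)):
        dp = p_levels[i] - p_levels[i - 1]
        mean_div = 0.5 * (divergence[i] + divergence[i - 1])
        omega.append(omega[-1] - mean_div * dp)
    return omega


# Uniform divergence on four levels (nondimensional, illustrative values).
p_levels = [0.1, 0.4, 0.7, 1.0]
divergence = [2.0, 2.0, 2.0, 2.0]
omega = omega_from_divergence(p_levels, divergence)
print(omega)
```

No time integration is involved: at every instant $\omega$ follows from the horizontal velocity field alone, which is the sense in which it is diagnostic.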

5 Developments for Multiple Scales Regimes

Reduced single-scale models for atmospheric motions as discussed in the previous section constitute the scientific basis of theoretical meteorology. Most of them were derived long ago and have stood the test of time. In contrast, much of today’s research aims at modeling interactions across multiple length and/or time scales. Here we briefly sketch just two recent examples, both of which address the ubiquitous question of how computational simulation techniques can be improved systematically by employing multiscale modeling approaches. The reader may consult Klein (2010) for further examples and discussions.


5.1 Shaw and Shepherd’s Parameterization Framework One important recent application of multiple scales theories is the development of sound bases for the parameterization of the net effects of unresolved scales in computational atmospheric flow models. An excellent example is the recent work by Shaw and Shepherd (2009). Burkhardt and Becker (2006) discuss the potentially detrimental effect of subgrid-scale parameterizations of effective diffusion, friction, and dissipation in global climate models that are not energetically and thermodynamically consistent. Shaw and Shepherd (2009) employ multiple scales asymptotics to reveal how mesoscale internal wave processes (see Sect. 4.2) would cumulatively affect planetary-scale motions as described by the hydrostatic primitive equations (Sect. 4.5). Using a different set of ideas associated with wave activity conservation laws developed in Shepherd (1990), they move on to derive simplified expressions for the discovered interaction terms that would allow them not to solve the mesoscale wave model in detail, as the formal multiple scales asymptotics procedure would require, but would provide effective closures that are energetically and thermodynamically consistent. Shaw and Shepherd restrict their framework, in this first layout, to the characteristic internal gravity wave time for the mesoscale, and to the planetary acoustic time scale. Within their framework, there is thus great potential for incorporation of wide-ranging results on wave-mean field interactions as developed in recent years (see, e.g., Bühler 2010, and references therein). It is as yet not clear, however, how the closure schemes addressing wave activities will carry over to situations where nonlinear advection and (moist) convection play an important role on the small scales.

5.2 Superparameterization It is, of course, by no means guaranteed that the fast or small-scale equations in a multiple scales asymptotic analysis are analytically tractable. For example, in a two-scale model that involves the length scales $\ell \sim h_{sc}$ and $L \sim h_{sc}/\varepsilon$ and covers the advective time scale $h_{sc}/u_{ref}$ for the scale $\ell$, the small-scale model would be either the weak temperature gradient flow equations from section “Weak Temperature Gradient Models for the Mesoscales” or some version of the anelastic model from section “Ogura and Phillips’ Anelastic Model for Weak Stratification” (Klein and Majda 2006). In both cases one will generally have to resort to numerical computation in order to determine the small-scale dynamics. In such a situation, the technique of multiple scales analysis can nevertheless yield valuable guidelines regarding the reduced or averaged information that is to be extracted from the small-scale model and communicated to the large-scale equations. E and Engquist have systematized these procedures and proposed a framework for the construction of computational “heterogeneous multiscale models” (see Engquist et al. 2007). In the context of meteorological modeling, there is a related very active recent development aiming at overcoming one of the major deficiencies of today’s climate models in an efficient way. Cloud distributions in the atmosphere feature a wide spectrum of length and time scales (Tessier et al. 1993). Therefore, it is quite generally not possible to even remotely resolve cloud processes in general circulation models for the climate, which feature horizontal grid sizes of 100 km and more. 
Grabowski (2001, 2004) proposed the superparameterization approach, which involves operating a two-dimensional cloud-resolving model, based in his case on an extended version of the anelastic model from section “Ogura and Phillips’ Anelastic Model for Weak Stratification,” within each of the climate model’s grid boxes, and extracting the net effects of the small-scale processes on the large-scale flow through appropriate averaging procedures. In today’s terms, the approach may be considered a heterogeneous multiscale model for cloud processes in climate applications. Majda (2007a) formulated a general framework for the construction of such hierarchies of embedded models based on the general multiple scales approach from Majda and Klein (2003) and Klein (2004). In particular, he accounts explicitly for the fact that between the convective length scale of the order of $h_{sc}$ and the planetary (climate) scale there are two more groups of characteristic length and time scales to be accounted for as explained here in Sect. 4.1. In this context it is worth noting that a direct translation of multiple scales techniques into a working computational geo-fluid dynamics code was shown to be feasible already by Nadiga et al. (1997).
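The data flow of such an embedded-model approach can be illustrated with a deliberately crude toy, sketched here in Python with NumPy. It is not Grabowski's scheme: the fine-grid "cloud" model, the threshold `q_sat`, the relaxation time `tau`, the diffusivity `kappa`, and the prescribed `forcing` are all hypothetical choices made only so that the example runs. What it does show is the structure described above: one periodic fine-grid model per coarse grid box, advanced with many substeps, from which only an averaged tendency is handed back to the coarse model.

```python
import numpy as np

def embedded_model_step(q, forcing, dt, n_sub, q_sat=1.0, tau=0.1, kappa=1e-3):
    """Advance the periodic fine-grid field q of ONE coarse cell over dt.

    Returns the updated q and the space/time-averaged sink, which is the
    only information handed back to the coarse model (toy assumption).
    """
    dts = dt / n_sub
    mean_sink = 0.0
    for _ in range(n_sub):
        lap = np.roll(q, 1) - 2.0 * q + np.roll(q, -1)   # fine grid spacing = 1
        sink = np.maximum(q - q_sat, 0.0) / tau          # toy threshold process
        q = q + dts * (forcing + kappa * lap - sink)     # explicit Euler substep
        mean_sink += sink.mean() / n_sub
    return q, mean_sink

def coarse_step(theta, heating, u, dt, dx):
    """Upwind advection (u > 0, periodic) plus the averaged small-scale heating."""
    return theta + dt * (-u * (theta - np.roll(theta, 1)) / dx + heating)

def run(n_coarse=8, n_fine=32, n_steps=50, dt=0.1, u=1.0, dx=1.0):
    theta = np.zeros(n_coarse)
    x = 2.0 * np.pi * np.arange(n_fine) / n_fine
    q = np.tile(0.5 + 0.1 * np.sin(x), (n_coarse, 1))    # one fine field per coarse cell
    # Prescribed forcing acts only in the first half of the coarse domain.
    forcing = np.where(np.arange(n_coarse) < n_coarse // 2, 1.0, 0.0)
    total_heating = 0.0
    for _ in range(n_steps):
        heating = np.empty(n_coarse)
        for k in range(n_coarse):
            q[k], heating[k] = embedded_model_step(q[k], forcing[k], dt, n_sub=20)
        theta = coarse_step(theta, heating, u, dt, dx)
        total_heating += dt * heating.sum()
    return theta, heating, total_heating

theta, heating, total_heating = run()
```

Because the coarse advection is periodic and written in upwind form, the domain sum of `theta` changes only through the averaged heating, which gives a simple consistency check on the coupling.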

6 Conclusions This chapter has summarized a unified view on the construction of reduced models in theoretical meteorology that is based on formal asymptotic arguments. A wide range of established models of theoretical meteorology can be reproduced in this way, and this approach also paves the way for systematic studies of multiscale interactions through more sophisticated techniques of asymptotic analysis. Yet, through one explicit example, the class of sound-proof models, we have also outlined the limitations of the approach in that there are well-established reduced models which do not lend themselves to formal asymptotic derivation in a straightforward fashion. This review has also consciously addressed only the fluid mechanically induced scale separations and their consequences for the construction of reduced flow models. Ubiquitous source terms in the equations for real-life applications associated with diabatic effects, turbulence closures, and the like may impose additional characteristics both in terms of spatiotemporal scales and in terms of their response to variations of the flow variables. As is well known, e.g., from combustion theory, this can strongly affect the resulting simplified models (see, e.g., Peters 2000; Klein and Majda 2006). There is a rich literature on the mathematically rigorous justification of reduced asymptotic models. Schochet (2005) reviews work on the classical problem of low Mach number flows and, in particular, includes recent developments in systematically addressing multiple length and timescale problems. There are also many examples of rigorous justifications of asymptotic models for geophysical flows, and here the reader may consult (Levermore et al. 1996; Embid and Majda 1998; Babin et al. 2002; Majda 2002; Bresch and Desjardins 2003; Cao and Titi 2003, 2007; Masmoudi 2007; Zeitlin 2007; Bresch and Gérard-Varet 2007; Feireisl et al. 2008; Dutrifoy et al. 2009, and references therein). 
Acknowledgments The author thanks Ulrich Achatz, Dargan Frierson, Juan Pedro Mellado, Norbert Peters, Heiko Schmidt, and Bjorn Stevens for very helpful discussions and suggestions concerning the content and structure of this manuscript, and Ulrike Eickers for her careful proofreading.


References
Babin A, Mahalov A, Nicolaenko B (2002) Fast singular limits of stably stratified 3D Euler and Navier-Stokes equations and ageostrophic wave fronts. In: Norbury J, Roulstone I (eds) Large-scale atmosphere-ocean dynamics 1: analytical methods and numerical models. Cambridge University Press, Cambridge
Bannon PR (1996) On the anelastic approximation for a compressible atmosphere. J Atmos Sci 53:3618–3628
Bresch D, Desjardins B (2003) Existence of global weak solutions for a 2D viscous shallow water equations and convergence to the quasi-geostrophic model. Commun Math Phys 238:211–223
Bresch D, Gérard-Varet D (2007) On some homogenization problems from shallow water theory. Appl Math Lett 20:505–510
Browning G, Kreiss HO, Schubert WH (2000) The role of gravity waves in slowly varying in time tropospheric motions near the equator. J Atmos Sci 57:4008–4019
Bühler O (2010) Wave-mean interactions in fluids and superfluids. Ann Rev Fluid Mech 42:205–228
Burkhardt U, Becker E (2006) A consistent diffusion-dissipation parameterization in the ECHAM climate model. Mon Weather Rev 134:1194–1204
Cao C, Titi E (2003) Global well-posedness and finite dimensional global attractor for a 3-D planetary geostrophic viscous model. Commun Pure Appl Math 56:198–233
Cao C, Titi E (2007) Global well-posedness of the three-dimensional viscous primitive equations of large scale ocean and atmosphere dynamics. Ann Math 166:245–267
Clark P et al (2009) The weather research & forecasting model. http://www.wrf-model.org/
Davies T, Staniforth A, Wood N, Thuburn J (2003) Validity of anelastic and other equation sets as inferred from normal-mode analysis. Q J R Meteorol Soc 129:2761–2775
Durran DR (1989) Improving the anelastic approximation. J Atmos Sci 46:1453–1461
Dutrifoy A, Schochet S, Majda AJ (2009) A simple justification of the singular limit for equatorial shallow-water dynamics. Commun Pure Appl Math LXI:322–333
Embid P, Majda AJ (1998) Averaging over fast gravity waves for geophysical flows with unbalanced initial data. Theor Comput Fluid Dyn 11:155–169
Engquist B, Weinan E, Vanden-Eijnden E (2007) Heterogeneous multiscale methods: a review. Commun Comput Phys 2:367–450
Feireisl E, Málek J, Novotný A, Straškraba I (2008) Anelastic approximation as a singular limit of the compressible Navier-Stokes system. Commun Part Differ Equ 33:157–176
Frierson DMW (2008) Midlatitude static stability in simple and comprehensive general circulation models. J Atmos Sci 65:1049–1062
Gill AE (1982) Atmosphere-ocean dynamics. International geophysics series, vol 30. Academic, San Diego
Grabowski WW (2001) Coupling cloud processes with the large-scale dynamics using the cloud-resolving convection parameterization (CRCP). J Atmos Sci 58:978–997
Grabowski WW (2004) An improved framework for superparameterization. J Atmos Sci 61:1940–1952
Held IM, Hoskins BJ (1985) Large-scale eddies and the general circulation of the troposphere. Adv Geophys 28:3–31
Houghton J (ed) (2002) The physics of atmospheres. Cambridge University Press, Cambridge


Hunt JCR, Vassilicos JC (1991) Kolmogoroff’s contributions to the physical and geometrical understanding of small-scale turbulence and recent developments. Proc R Soc Lond A 434:183–210
Keller J, Ting L (1951) Approximate equations for large scale atmospheric motions. Internal Report, Inst. for Mathematics & Mechanics (renamed to Courant Institute of Mathematical Sciences in 1962), NYU, (http://www.arxiv.org/abs/physics/0606114v2)
Kevorkian J, Cole J (1996) Multiple scale and singular perturbation methods. Springer, New York
Klein R (2004) An applied mathematical view of theoretical meteorology. In: Applied mathematics entering the 21st century: invited talks from the ICIAM 2003 congress. SIAM proceedings in applied mathematics, vol 116
Klein R (2006) Theoretical developments in tropical meteorology. Special issue, Theoretical and computational fluid dynamics, vol 20. Springer, Berlin
Klein R (2008) An unified approach to meteorological modelling based on multiple-scales asymptotics. Adv Geosci 15:23–33
Klein R (2009) Asymptotics, structure, and integration of sound-proof atmospheric flow equations. Theor Comput Fluid Dyn 23:161–195
Klein R (2010) Scale-dependent asymptotic models for atmospheric flows. Ann Rev Fluid Mech 42:249–274
Klein R, Majda AJ (2006) Systematic multiscale models for deep convection on mesoscales. Theor Comput Fluid Dyn 20:525–551
Koppert HJ et al (2009) Consortium for small-scale modelling. http://www.cosmo-model.org/
Levermore CD, Oliver M, Titi ES (1996) Global well-posedness for models of shallow water in a basin with a varying bottom. Indiana Univ Math J 45:479–510
Lorenz EN (1967) The nature and theory of the general circulation of the atmosphere. World Meteorological Organization, Geneva
Lovejoy S, Tuck AF, Hovde SJ, Schertzer D (2008) Do stable atmospheric layers exist? Geophys Res Lett 35:L01802
Lundgren TS (1982) Strained spiral vortex model for turbulent fine structure. Phys Fluids 25:2193–2203
Lynch P (2006) The emergence of numerical weather prediction: Richardson’s dream. Cambridge University Press, Cambridge
Majda AJ (2002) Introduction to P.D.E.’s and waves for the atmosphere and ocean. Courant lecture notes, vol 9. American Mathematical Society & Courant Institute of Mathematical Sciences
Majda AJ (2007a) Multiscale models with moisture and systematic strategies for superparameterization. J Atmos Sci 64:2726–2734
Majda AJ (2007b) New multiscale models and self-similarity in tropical convection. J Atmos Sci 64:1393–1404
Majda AJ, Biello JA (2003) The nonlinear interaction of barotropic and equatorial baroclinic Rossby waves. J Atmos Sci 60:1809–1821
Majda AJ, Klein R (2003) Systematic multi-scale models for the tropics. J Atmos Sci 60:393–408
Masmoudi N (2007) Rigorous derivation of the anelastic approximation. J Math Pures et Appliquées 3:230–240
Matsuno T (1966) Quasi-geostrophic motions in the equatorial area. J Met Soc Jpn 44:25–43
Muraki DJ, Snyder C, Rotunno R (1999) The next-order corrections to quasi-geostrophic theory. J Atmos Sci 56:1547–1560


Nadiga BT, Hecht MW, Margolin LG, Smolarkiewicz PK (1997) On simulating flows with multiple time scales using a method of averages. Theor Comput Fluid Dyn 9:281–292
Névir P (2004) Ertel’s vorticity theorems, the particle relabelling symmetry and the energy-vorticity theory of fluid mechanics. Meteorologische Zeitschrift 13:485–498
Norbury J, Roulstone I (eds) (2002) Large scale atmosphere-ocean dynamics I: analytical methods and numerical models. Cambridge University Press, Cambridge
Oberlack M (2006) Symmetries, invariance and self-similarity in turbulence. Springer, Berlin
Ogura Y, Phillips NA (1962) Scale analysis of deep moist convection and some related numerical calculations. J Atmos Sci 19:173–179
Oliver M (2006) Variational asymptotics for rotating shallow water near geostrophy: a transformational approach. J Fluid Mech 551:197–234
Pedlosky J (1987) Geophysical fluid dynamics, 2nd edn. Springer, Berlin
Peters N (2000) Turbulent combustion. Cambridge University Press, Cambridge
Rahmstorf S et al (2004) Cosmic rays, carbon dioxide and climate. EOS 85:38–41
Salmon R (1983) Practical use of Hamilton’s principle. J Fluid Mech 132:431–444
Salmon R (1998) Lectures on geophysical fluid dynamics. Oxford University Press, Oxford
Schneider T (2006) The general circulation of the atmosphere. Ann Rev Earth Planet Sci 34:655–688
Schochet S (2005) The mathematical theory of low Mach number flows. M2AN 39:441–458
Shaw TA, Shepherd TG (2009) A theoretical framework for energy and momentum consistency in subgrid-scale parameterization for climate models. J Atmos Sci 66:3095–3114
Shepherd T (1990) Symmetries, conservation laws, and Hamiltonian structure in geophysical fluid dynamics. Adv Geophys 32:287–338
Sobel A, Nilsson J, Polvani L (2001) The weak temperature gradient approximation and balanced tropical moisture waves. J Atmos Sci 58:3650–3665
Tessier Y, Lovejoy S, Schertzer D (1993) Universal multi-fractals: theory and observations for rain and clouds. J Appl Meteorol 32:223–250
Wang B, Xie X (1996) Low-frequency equatorial waves in vertically sheared zonal flow. Part I: stable waves. J Atmos Sci 53:449–467
Wheeler M, Kiladis GN (1999) Convectively coupled equatorial waves analysis of clouds and temperature in the wavenumber-frequency domain. J Atmos Sci 56:374–399
White AA (2002) A view of the equations of meteorological dynamics and various approximations. In: Norbury J, Roulstone I (eds) Large-scale atmosphere-ocean dynamics 1: analytical methods and numerical models. Cambridge University Press, Cambridge
Zeitlin V (ed) (2007) Nonlinear dynamics of rotating shallow water: methods and advances. Elsevier, Amsterdam
Zeytounian RK (1990) Asymptotic modeling of atmospheric flows. Springer, Heidelberg


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_21-2 © Springer-Verlag Berlin Heidelberg 2014

Modern Techniques for Numerical Weather Prediction: A Picture Drawn from Kyrill Nils Dorband (a), Martin Fengler (a), Andreas Gumann (b), and Stefan Laps (c); (a) Meteomatics GmbH, St. Gallen, Switzerland; (b) Zurich, Switzerland; (c) Bochum, Germany

Abstract This chapter gives a short overview on modern numerical weather prediction (NWP): The chapter sketches the mathematical formulation of the underlying physical problem and its numerical treatment and gives an outlook on statistical weather forecasting (MOS). Special emphasis is given to the Kyrill event in order to demonstrate the application of the different methods.

1 Introduction On 18 and 19 January 2007, one of the most severe storms of recent decades crossed Europe: Kyrill, as it was named by German meteorologists. With its hurricane-force winds, Kyrill left a trail of destruction in its wake as it traveled across Northern and Central Europe. Public life broke down completely: schools, universities, and many companies had been closed beforehand; parts of the energy supply and public transport came to a virtual standstill; hundreds of flights were canceled; and several dozen people died from injuries sustained in accidents. The total economic damage was estimated at EUR 2.4 billion. Starting from this dramatic event, this chapter sketches the strengths, weaknesses, and challenges of modern weather forecasting. First, we introduce the basic setup for tackling the underlying physical problem. This leads us to the governing dynamical equations and the approach used by the European Centre for Medium-Range Weather Forecasts (ECMWF, www.ecmwf.int) for solving them. To complete the picture, we also show how ensemble prediction techniques help to identify potential risks of a forthcoming event. Finally, we show how Model Output Statistics (MOS) techniques, a statistical post-processing method, help to estimate the impact of such weather phenomena at specific locations. Clearly, mathematics plays a fundamental role in each of these steps. To clarify the relation between mathematics and meteorology, we illustrate several of the aspects discussed below using the example of the winter storm Kyrill.

1.1 What Is a Weather Forecast? Weather forecasting is nothing else than telling someone how the weather is going to develop. However, this definition does not explain the mechanics that are necessary for doing this. At the beginning of the twenty-first century, it is a lot more than reading the future from stars or throwing




chicken bones, as it might have been hundreds of years ago, although some readers might suggest that it still is. Today the demands for precise weather forecasts are manifold. Historically, the main drivers of modern meteorology were marine and aeronautical applications. Beyond these, weather is an increasingly important economic factor for agriculture, the insurance business, and energy producers. Some studies estimate that 80 % of the value-added chain depends directly or at least indirectly on weather. Last but not least, people have an intrinsic interest in the weather, for instance when choosing clothes for the day. The market for weather forecasts is still growing strongly; at the moment, the turnover in the weather business is estimated at $10 billion. When one talks about meteorology and weather forecasts, one usually means forecasts for the next couple of days. One distinguishes between real-time forecasts for the next few hours, commonly called nowcasting, and the classical weather forecast for several days (usually up to 10). Beyond 10 days it is more appropriate to speak of trend analyses for the next several weeks, a terminology indicating that even scientists are aware of the considerable difficulty of forecasting over such a time interval. Forecasts that cover months or years are not common topics in meteorology but rather in climatology. However, meteorology is pushing the virtual border to climatology step by step from days to weeks, and intensive work is currently under way on monthly forecasts by introducing coupled atmosphere-ocean systems into meteorology, which were once the hobby-horse of climatologists. Finally, there is a rather small community that specializes in analyzing the dynamical forecasts described above. 
By applying statistical techniques, ranging from linear regression to complex nonlinear models relating historical model data to station measurements, one is able to refine the forecasts for specific stations. This is where much of the practical benefit of a forecast is realized. Now, let us have a closer look at the different ingredients of a weather forecast. At first glance, we are confronted with an initial value problem. Thus, once endowed with the initial state of the atmosphere and the complete set of all physical processes describing the world outside, we are able to compute in a deterministic way all future states of the atmosphere, such as temperature, precipitation, wind, etc. Unfortunately, in practice the initial state is known only incompletely, which introduces a significant uncertainty right from the beginning. Due to the nonlinear nature of the dynamical problem, this uncertainty can lead to very large errors in the prediction. In the most extreme cases, it can even drive the numerical model into a completely wrong atmospheric state, so that important events like Kyrill are missed or not properly predicted. Statistical methods can be applied to deal with such uncertainties. We sketch so-called ensemble techniques immediately after having discussed the deterministic case.
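How a tiny uncertainty in the initial state can grow into a large forecast error is easy to reproduce in a low-dimensional toy. The following Python sketch uses the three-variable Lorenz (1963) system, which is an illustrative stand-in chosen here and not part of any NWP model; the starting point, the perturbation size of 1e-6, the time step, and the helper names are assumptions made for the demonstration. Two runs that differ only by the small perturbation are integrated side by side and their distance is recorded.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system (toy model, classical parameters)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def separation_history(x0, perturbation, n_steps=3000, dt=0.01):
    """Distance between a reference run and a run with a perturbed initial state."""
    a = np.array(x0, dtype=float)
    b = a + perturbation
    dist = np.empty(n_steps + 1)
    dist[0] = np.linalg.norm(b - a)
    for n in range(1, n_steps + 1):
        a = rk4_step(lorenz63, a, dt)
        b = rk4_step(lorenz63, b, dt)
        dist[n] = np.linalg.norm(b - a)
    return dist

# Perturb only the first component of the initial state by 1e-6.
dist = separation_history([1.0, 1.0, 1.0], np.array([1e-6, 0.0, 0.0]))
```

Plotting `dist` on a logarithmic axis shows the initial error growing by many orders of magnitude before it saturates at the size of the attractor. An ensemble forecast repeats this experiment with many perturbed initial states and uses the spread of the runs as a measure of forecast uncertainty.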

2 Data Assimilation Methods: The Journey from 1d-Var to 4d-Var Numerical weather models are central tools for modern meteorology. With rapidly increasing computer power during the last decade, decreasing cost of hardware and improvements in weather and climate codes and numerical methods, it has become possible to model the global and mesoscale dynamics of the atmosphere with accurate physics and well-resolved dynamics. However, the predictive power of all models is still limited due to some very fundamental problems. Arguably the most severe one is our ignorance of the initial condition for a simulation. While, in


many areas of the world, we do have lots of data from ground-based weather stations, for the higher layers of the atmosphere we must rely either on very sparse direct measurements, like radio soundings, or on remote sensing observations that are obtained for example from radar stations or satellites. The amount and quality of such remote sensing data is increasing rapidly, but they are often in a form that is not particularly useful for Numerical Weather Prediction (NWP). It is highly nontrivial to properly inject information into the NWP models, when the observed quantities (as for example radar reflectivity) are only indirectly related to the model parameters (typically temperature, pressure, humidity, and wind velocities). At the same time, assimilation of such data is a key element for creating realistic initial states of the atmosphere. Some of the basic techniques for assimilating data are described in this section.

2.1 Observational Nudging A simple but effective approach to data assimilation is to modify the background analysis by terms proportional to the difference between the model state and the observational data. An example is the widely used Cressman analysis scheme. If $x_b$ is the background analysis and $y_i$, $i = 1, \dots, n$, are the observations, the model state $x$ at a grid point $j$ provided by a simple Cressman analysis would be

$$x = x_b + \frac{\sum_{i=1}^{n} w(i,j)\,(y_i - x_{b,i})}{\sum_{i=1}^{n} w(i,j)}, \qquad (1)$$

where $x_{b,i}$ is the background value at the location of observation $i$, and the weights $w(i,j)$ are a function of the distance $d_{i,j}$ between the points $i$ and $j$ and take on the value 1 for $i = j$ (Daley 1991). There are different possible definitions for these weights. In methods that are commonly referred to as observational nudging, the condition $w_{i=j} = 1$ is dropped, so that a weighted average between the background state and the observations is performed. Observational nudging can be used as a four-dimensional analysis method, that is, observational data from different points in time are considered. Instead of modifying the background state directly at an initial time, source terms are added to the evolution equations, so that the model is forced dynamically toward the observed fields. Effectively, this is equivalent to changing the governing evolution equations. Therefore, the source term has to be chosen small enough to prevent the model from drifting into unrealistic physical configurations. Observational nudging is still used in many operational NWP systems. It is a straightforward method with a lot of flexibility. A common criticism is that the modifications are made without respecting the consistency of the atmospheric state, which might lead to unrealistic configurations.
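As a minimal sketch of Eq. (1), the following Python code performs one Cressman pass on a one-dimensional grid. The text leaves the weight function open; the form used here, $w = (R^2 - d^2)/(R^2 + d^2)$ inside an influence radius $R$ and zero outside, is the classical Cressman choice and is an assumption of this example, as are the function names, the linear interpolation of the background to the observation points, and the synthetic numbers.

```python
import numpy as np

def cressman_weights(dist, radius):
    """Classical Cressman weight: 1 at zero distance, 0 at and beyond `radius`."""
    w = (radius**2 - dist**2) / (radius**2 + dist**2)
    return np.where(dist < radius, w, 0.0)

def cressman_analysis(grid_x, background, obs_x, obs_val, radius):
    """One pass of Eq. (1) on a 1D grid (illustrative helper, not an operational code)."""
    # Background interpolated to the observation points: x_{b,i}
    bg_at_obs = np.interp(obs_x, grid_x, background)
    innovation = obs_val - bg_at_obs                   # y_i - x_{b,i}
    dist = np.abs(obs_x[:, None] - grid_x[None, :])    # d_{i,j}, shape (n_obs, n_grid)
    w = cressman_weights(dist, radius)
    w_sum = w.sum(axis=0)
    increment = np.zeros_like(background)
    near_obs = w_sum > 0.0      # grid points without observations in range keep x_b
    increment[near_obs] = (w * innovation[:, None]).sum(axis=0)[near_obs] / w_sum[near_obs]
    return background + increment

# Synthetic example: zero background, two observations, influence radius 3.
grid_x = np.arange(11.0)
background = np.zeros(11)
obs_x = np.array([4.0, 6.0])
obs_val = np.array([1.0, 3.0])
analysis = cressman_analysis(grid_x, background, obs_x, obs_val, radius=3.0)
```

Midway between the two observations both weights are equal, so the analysis there is the plain average of the two innovations. Outside the influence radius of every observation the background is left unchanged, which illustrates why the result can be inconsistent with the surrounding atmospheric state.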

2.2 Variational Analysis In variational data assimilation, a cost function is defined from the error covariances and an atmospheric state that minimizes that cost function is constructed. If model errors can be neglected and assuming that background and observation errors are normal, unbiased distributions, the cost function is

$$J = \frac{1}{2}\,(x(0) - x_b(0))^T B^{-1} (x(0) - x_b(0)) + \frac{1}{2}\,(y - H(x))^T R^{-1} (y - H(x)), \qquad (2)$$


where $x_b$ is the background field, y the vector of observations, and H(x) the observation operator that translates the model fields to the observed quantities. B and R are the background error covariance and observation error covariance, respectively. The operator H(x) provides a mechanism for assimilating any observational quantities that can be derived from the model parameters, without the necessity of solving the inverse problem. An example is radar reflectivity: it is much more difficult to match the atmospheric conditions to a radar image than to compute a reflectivity out of model parameters. The latter is what H(x) does and is all that is needed for defining the cost function. Having defined a suitable cost function, variational data analysis is reduced to a high-dimensional linear or nonlinear (depending on the properties of H(x)) minimization problem (see Menke (1984), Lorenc (1986), Tarantola (1987) for more detailed discussions and Courtier et al. (1998) or Baker et al. (2004) as examples for implementations of such algorithms). In order to find solutions to the minimization problem, a number of simplifying assumptions can be made. One simplification has already been introduced when we neglected the model errors in Eq. 2. Another common simplification is to evaluate all observation operators at a fixed time only, neglecting the time dependence of the observations. This leads to the so-called 3D-VAR scheme (see Parrish and Derber (1992), for more details). The resulting cost function is then minimized using variational methods. A method closely related to 3D-VAR is Optimal Interpolation, which uses the same approximation, but solves the minimization problem not via variational methods but by direct inversion (see Bouttier and Courtier (2001), and references given therein). In 4D-VAR methods, the time parameter of the observations is taken into account. 
Therefore, the assimilation of data is not only improving the initial state of the model, but also the dynamics during some period of time. The minimization of the cost function is then considerably more difficult. Commonly applied methods for finding a solution are iterative methods of solving in a linearized regime, and using a sequence of linearized solutions for approximating the solution to the full nonlinear problem (Bouttier and Courtier 2001). While the variational methods outlined above are aiming toward optimizing the state vector of the full three-dimensional atmosphere, they are general enough to be applied to simpler problems. An example that is frequently encountered is 1D-VAR, where assimilation is done only for a vertical column at a fixed coordinate and time. This method plays an important role in the analysis of satellite data. There are other assimilation methods that can be combined with the previously discussed methods, or even replace them. One of the more widely used ones is the Kalman filter, which can be used to assimilate 4D data (space and time), and is capable of taking into account the time dependence of the model errors.
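The relation between 3D-VAR and Optimal Interpolation can be checked numerically in the simplest setting, a linear observation operator $H(x) = Hx$ at a single time. The Python sketch below is an illustration under that assumption only: the three-component state, the matrices `B`, `R`, `H`, and the plain gradient descent used as the "variational" solver are all hypothetical choices, not the algorithms of any operational system. For linear H the gradient of the cost function (2) is $B^{-1}(x - x_b) - H^T R^{-1}(y - Hx)$, and the direct solution is $x_a = x_b + BH^T(HBH^T + R)^{-1}(y - Hx_b)$.

```python
import numpy as np

def cost(x, xb, y, H, B_inv, R_inv):
    """Cost function (2) for a linear observation operator H at a single time."""
    dxb = x - xb
    dy = y - H @ x
    return 0.5 * (dxb @ B_inv @ dxb) + 0.5 * (dy @ R_inv @ dy)

def cost_gradient(x, xb, y, H, B_inv, R_inv):
    """Gradient of the cost function with respect to x."""
    return B_inv @ (x - xb) - H.T @ R_inv @ (y - H @ x)

def optimal_interpolation(xb, y, H, B, R):
    """Direct inversion: x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b)."""
    innovation = y - H @ xb
    return xb + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)

def variational_minimum(xb, y, H, B_inv, R_inv, n_iter=500):
    """Iterative minimization by gradient descent with a safe fixed step."""
    hessian = B_inv + H.T @ R_inv @ H
    step = 1.0 / np.linalg.eigvalsh(hessian).max()
    x = xb.copy()
    for _ in range(n_iter):
        x = x - step * cost_gradient(x, xb, y, H, B_inv, R_inv)
    return x

# Synthetic setup: 3 state variables, 2 observations (all numbers are made up).
xb = np.array([1.0, 2.0, 3.0])
B = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]])
R = 0.25 * np.eye(2)
H = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]])   # second obs averages two variables
y = np.array([1.5, 2.0])

B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)
x_oi = optimal_interpolation(xb, y, H, B, R)
x_var = variational_minimum(xb, y, H, B_inv, R_inv)
```

Both routes arrive at the same analysis, which is what one expects for a quadratic cost function. The practical difference lies in cost: the direct route inverts a matrix of the size of the observation vector, whereas the iterative route only needs products with $H$, $H^T$, and the inverse covariances, which is what makes it usable for nonlinear H and for the time-dependent 4D-VAR problem.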

3 Basic Equations This section is dedicated to the set of basic, physical, and rather complex equations that are commonly used for numerical weather prediction. Before entering details of the partial differential equations it is necessary to talk about coordinate systems. While the horizontal one $(\lambda, \theta)$ is quite common in mathematics, we keep a special eye on the vertical one.


3.1 Vertical Coordinate System 3.1.1 $\sigma$-Coordinates Any quantity that exhibits a one-to-one relation to height z may be used as vertical coordinate. If the hydrostatic approximation is made, pressure is such a quantity, since $\rho > 0$ lets $\partial p/\partial z = -\rho g$ be negative everywhere. In these so-called pressure coordinates, the independent variables are $\lambda, \theta, p, t$ instead of $\lambda, \theta, z, t$ and the height z becomes a dependent variable (Norbury and Roulstone 2002). The physical height is rarely used as the vertical coordinate in atmospheric simulation models. Some models use pressure as vertical coordinate, because it simplifies the equations, at least if the atmosphere is in hydrostatic balance, which is generally true for synoptic and mesoscale motion. In such models, the 500 hPa isobaric surface (which undulates in space and time), for instance, is a fixed reference level. For complex terrain it is better to use sigma-coordinates instead of pressure, because a sigma (or terrain-following) coordinate system allows for a high resolution just above ground level, whatever altitude the ground level may be. The $\sigma$-coordinates are defined by

$$\sigma = \frac{p}{p_{sfc}},$$

where $p_{sfc}$ is the ground-level pressure, and p the variable pressure. The $\sigma$-coordinates range from 1 at the ground to 0 at the top of the atmosphere. The sigma coordinate forms the basis for an essential modification that is introduced by the eta coordinates. 3.1.2 $\eta$-Coordinates The fundamental base in the eta system is not at the ground surface, but at mean sea level (Simmons and Burridge 1981). The eta coordinate system has surfaces that remain relatively horizontal at all times. At the same time, it retains the mathematical advantages of the pressure-based system that does not intersect the ground. It does this by allowing the bottom atmospheric layer to be represented within each grid box as a flat “step.” The eta coordinate system defines the vertical position of a point in the atmosphere as the ratio of the pressure difference between that point and the top of the domain to the pressure difference between a fundamental base below the point and the top of the domain. The eta coordinate varies from 1 at the base to 0 at the top of the domain. Because it is pressure based and normalized, it is easy to mathematically cast governing equations of the atmosphere into a relatively simple form. There are several advantages of eta coordinates compared with the sigma ones, which should be mentioned: 1. Eta models do not need to perform the vertical interpolations that are necessary to calculate the pressure gradient force (PGF) in sigma models (Mesinger and Janjić 1985). This reduces the error in PGF calculations and improves the forecast of wind and temperature and moisture changes in areas of steeply sloping terrain. 2. Although the numerical formulation near the surface is more complex, the low-level convergence in areas of steep terrain is far more representative of real atmospheric conditions than in the simpler formulations in sigma models (Black 1994). Especially, precipitation forecasts


improve in these areas significantly, which more than compensates for the slightly increased computer run time.
3. Compared with sigma models, eta models can often improve forecasts of cold air outbreaks, damming events, and lee-side cyclogenesis. For example, in cold-air damming events, the inversion in the real atmosphere above the cold air mass on the east side of a mountain is preserved almost exactly in an eta model.

Unfortunately, eta coordinates also introduce some drawbacks and come with certain limitations, for example:
1. The step nature of the eta coordinate makes it difficult to retain detailed vertical structure in the boundary layer over the entire model domain, particularly over elevated terrain.
2. Gradually sloping terrain is not reflected within eta models. Since all terrain is represented in discrete steps, gradual slopes that extend over large distances can be concentrated within as few as one step. This unrealistic compression of the slope into a small area can be compensated, in part, by increasing the vertical and/or horizontal resolution.
3. Owing to their step nature, eta models have difficulty predicting extreme downslope wind events.

For models using eta coordinates the reader is referred to the ETA Model (Black 1994) and, naturally, to the ECMWF model as introduced below.
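The terrain-following idea of Sect. 3.1.1 can be illustrated with a minimal sketch. It assumes the normalization in which σ equals 1 at the ground and 0 where the pressure vanishes, as stated above; the function names and the sample surface pressures are illustrative assumptions and are not taken from any particular model.

```python
import numpy as np

def sigma_from_pressure(p, p_sfc):
    """Sigma coordinate, normalized to 1 at the ground and 0 where p = 0."""
    return np.asarray(p, dtype=float) / p_sfc

def pressure_from_sigma(sigma, p_sfc):
    """Inverse mapping: pressure on a given sigma surface."""
    return np.asarray(sigma, dtype=float) * p_sfc

# Hypothetical set of sigma levels, dense near the ground.
sigma_levels = np.array([1.0, 0.95, 0.85, 0.5, 0.0])

# Two columns with different ground-level pressure (hPa): lowland and mountain.
p_lowland = pressure_from_sigma(sigma_levels, 1000.0)
p_mountain = pressure_from_sigma(sigma_levels, 700.0)
```

In both columns the lowest sigma surface coincides with the ground, so the dense near-surface levels follow the terrain regardless of its altitude.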

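3.2 The Eulerian Formulation of the Continuous Equations

In the following, we gather the set of equations used at ECMWF for describing the atmospheric flow. In detail, we follow the extensive documentation provided by IFS (2006a). To be more specific, we introduce a spherical coordinate system given by $(\lambda, \theta, \eta)$, where $\lambda$ denotes the longitude, $\theta$ the latitude, and $\eta$ the so-called hybrid vertical coordinate as introduced above. The vertical coordinate $\eta$ can then be considered as a monotonic function of the pressure $p$ and the surface pressure $p_{sfc}$, that is, $\eta(p, p_{sfc})$ such that

$$\eta(0, p_{sfc}) = 0 \quad \text{and} \quad \eta(p_{sfc}, p_{sfc}) = 1. \tag{3}$$

Then the equations of momentum can be written as

$$\begin{aligned}
\frac{\partial U}{\partial t} &+ \frac{1}{a\cos^2\theta}\left\{U\frac{\partial U}{\partial\lambda} + V\cos\theta\,\frac{\partial U}{\partial\theta}\right\} + \dot\eta\,\frac{\partial U}{\partial\eta} - fV + \frac{1}{a}\left\{\frac{\partial\varphi}{\partial\lambda} + R_{dry}T_v\frac{\partial}{\partial\lambda}\ln p\right\} = P_U + K_U, \\
\frac{\partial V}{\partial t} &+ \frac{1}{a\cos^2\theta}\left\{U\frac{\partial V}{\partial\lambda} + V\cos\theta\,\frac{\partial V}{\partial\theta} + \sin\theta\,(U^2 + V^2)\right\} + \dot\eta\,\frac{\partial V}{\partial\eta} + fU + \frac{\cos\theta}{a}\left\{\frac{\partial\varphi}{\partial\theta} + R_{dry}T_v\frac{\partial}{\partial\theta}\ln p\right\} = P_V + K_V,
\end{aligned} \tag{4}$$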


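where $a$ is the Earth’s radius, $\dot\eta$ is the vertical velocity $\dot\eta = d\eta/dt$, $\varphi$ is the geopotential, $R_{dry}$ is the gas constant of dry air, and $T_v$ is the virtual temperature defined by

$$T_v = T\left[1 + \left(\frac{R_{vap}}{R_{dry}} - 1\right) q\right],$$

where $T$ is the temperature, $q$ is the specific humidity, and $R_{vap}$ is the gas constant of water vapor. The terms $P_U$ and $P_V$ represent contributions of additional physical background processes that are discussed later on. $K_U$ and $K_V$ denote horizontal diffusion. Equation (4) is coupled with the thermodynamic equation given by

$$\frac{\partial T}{\partial t} + \frac{1}{a\cos^2\theta}\left\{U\frac{\partial T}{\partial\lambda} + V\cos\theta\,\frac{\partial T}{\partial\theta}\right\} + \dot\eta\,\frac{\partial T}{\partial\eta} - \frac{\kappa T_v\,\omega}{(1 + (\delta - 1)q)\,p} = P_T + K_T, \tag{5}$$

with $\kappa = R_{dry}/c_{p_{dry}}$ (where $c_{p_{dry}}$ is the specific heat of dry air at constant pressure); $\omega$ is the pressure coordinate vertical velocity $\omega = dp/dt$, and $\delta = c_{p_{vap}}/c_{p_{dry}}$ with $c_{p_{vap}}$ the specific heat of water vapor at constant pressure. Again, $P_T$ abbreviates additional physical background processes, whereas $K_T$ denotes horizontal diffusion terms. The moisture equation reads as

$$\frac{\partial q}{\partial t} + \frac{1}{a\cos^2\theta}\left\{U\frac{\partial q}{\partial\lambda} + V\cos\theta\,\frac{\partial q}{\partial\theta}\right\} + \dot\eta\,\frac{\partial q}{\partial\eta} = P_q + K_q, \tag{6}$$

where $P_q$ and $K_q$ are again background process and diffusion terms. The set of Eqs. (4)–(6) is closed by the continuity equation

$$\frac{\partial}{\partial t}\left(\frac{\partial p}{\partial\eta}\right) + \nabla\cdot\left(\mathbf{v}_H\frac{\partial p}{\partial\eta}\right) + \frac{\partial}{\partial\eta}\left(\dot\eta\,\frac{\partial p}{\partial\eta}\right) = 0, \tag{7}$$

where $\mathbf{v}_H$ is the vector $(u, v)$ of the horizontal wind. Now, under the assumption of a hydrostatic flow, the geopotential $\varphi$ in (4) can be written as

$$\frac{\partial\varphi}{\partial\eta} = -\frac{R_{dry}T_v}{p}\frac{\partial p}{\partial\eta}.$$

Then the vertical velocity $\omega$ in (5) is given by

$$\omega = -\int_0^{\eta}\nabla\cdot\left(\mathbf{v}_H\frac{\partial p}{\partial\eta}\right)d\eta + \mathbf{v}_H\cdot\nabla p.$$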

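By integrating Eq. (7) with the boundary conditions $\dot\eta = 0$ taken at the levels from Eq. (3), we end up with expressions for the change in surface pressure and for the vertical velocity $\dot\eta$:

$$\frac{\partial p_{sfc}}{\partial t} = -\int_0^1\nabla\cdot\left(\mathbf{v}_H\frac{\partial p}{\partial\eta}\right)d\eta,$$

$$\dot\eta\,\frac{\partial p}{\partial\eta} = -\frac{\partial p}{\partial t} - \int_0^{\eta}\nabla\cdot\left(\mathbf{v}_H\frac{\partial p}{\partial\eta}\right)d\eta.$$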
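Two of the relations in this section lend themselves to a short numerical illustration: the definition of the virtual temperature and the hydrostatic relation for the geopotential. The sketch below is not part of the IFS formulation. The gas constants and the gravitational acceleration are commonly quoted approximate values chosen here as assumptions, and the thickness function assumes an isothermal layer so that the hydrostatic relation can be integrated in closed form.

```python
import numpy as np

R_DRY = 287.05   # J/(kg K), approximate gas constant of dry air (assumed value)
R_VAP = 461.5    # J/(kg K), approximate gas constant of water vapor (assumed value)
G = 9.80665      # m/s^2, standard gravity (assumed value)

def virtual_temperature(temperature_k, specific_humidity):
    """T_v = T * [1 + (R_vap / R_dry - 1) * q]."""
    return temperature_k * (1.0 + (R_VAP / R_DRY - 1.0) * specific_humidity)

def geopotential_thickness(p_bottom, p_top, t_virtual_k):
    """Integrate d(phi) = -R_dry * T_v * d(ln p) for a layer of constant T_v.

    Returns the geopotential difference phi(top) - phi(bottom) in m^2/s^2.
    """
    return R_DRY * t_virtual_k * np.log(p_bottom / p_top)

# Moist air at 300 K with 10 g/kg specific humidity.
t_v = virtual_temperature(300.0, 0.010)

# Thickness of the 1000-500 hPa layer for T_v = 273.15 K, converted to metres.
dz_m = geopotential_thickness(1000.0, 500.0, 273.15) / G
```

Moist air is lighter than dry air at the same temperature and pressure, which is why the virtual temperature exceeds the actual temperature whenever $q > 0$.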

3.3 Physical Background Processes

Physical background processes comprise, for example, radiation, turbulent diffusion, and interactions with the surface; subgrid-scale orographic drag, convection, clouds and large-scale precipitation, surface parametrization, methane oxidation, ozone chemistry parametrization, climatological data, etc. All of these processes have in common that they are parametrized and triggered after the computation of the prognostic equations. To make this idea more evident, we take a closer look at the generation of clouds and precipitation.

3.3.1 Clouds and Precipitation

The described equations allow for modeling subsequent physical processes that influence the prognostic equations via the introduced forcing terms $P_x$. For convenience, we concentrate on two important processes in order to demonstrate the general principle: cloud modeling and large-scale (stratiform) precipitation. We follow the representation given in IFS (2006b).

3.3.2 Clouds

For simplicity we focus only on stratiform (non-convective) clouds. Having implicitly introduced a vertical and horizontal grid in space (the Gauss-Legendre transform implies a regular grid), one can define the cloud and ice water content of a specific grid volume (cell) as

$$l = \frac{1}{V}\int_V \frac{\rho_w}{\rho}\,dV,$$

where $\rho_w$ is the density of cloud water, $\rho$ is the density of moist air, and $V$ is the volume of the grid box. The fraction of the grid box that is covered by clouds is given by $a$. Then the time change of cloud water and ice can be obtained by

$$\frac{\partial l}{\partial t} = A(l) + S_{conv} + S_{strat} - E_{cld} - G_{prec}$$

together with

$$\frac{\partial a}{\partial t} = A(a) + \delta a_{conv} + \delta a_{strat} - \delta a_{evap},$$


where $A(l)$ and $A(a)$ denote the transport of cloud water/ice and cloud area through the boundaries of the grid volume. $S_{conv}$ and $\delta a_{conv}$ are the formation of cloud water/ice and cloud area by convective processes, and $S_{strat}$ and $\delta a_{strat}$ those by stratiform condensation processes. $E_{cld}$ is the evaporation rate of cloud water/ice. $G_{prec}$ is the rate of precipitation falling out of the cloud. Finally, we denote by $\delta a_{evap}$ the rate of decrease of the cloud area by evaporation. For the formation of clouds one distinguishes two cases, namely the processes in case of already existing clouds and the formation of new clouds. Details can be found in IFS (2006b), but, put simply, new clouds are assumed to form when the relative humidity is larger than a certain threshold that depends on the pressure level, that is, tropospheric clouds are generated if the relative humidity exceeds 100 %. Finally, it is important to mention that the formation of new clouds comes along with evaporative processes, which introduce reversibility into the system.

3.3.3 Precipitation

Similarly, we sketch the procedure for estimating the amount of precipitation in some grid box. The grid-box mean precipitation is

$$\bar{P} = \frac{1}{A}\int P\,H(l)\,dA,$$

where the step function $H(l)$ marks the portion of the cell containing clouds at condensate specific humidity $l$, and $A$ denotes the area of the grid box. The precipitation fraction can then be expressed as

$$a = \frac{1}{A}\int H(l)\,H(P)\,dA.$$

The autoconversion from liquid cloud water to rain and also from ice to snow is parametrized following Sundqvist (1978) and can be written as

$$G = a\,c_0\left[1 - e^{-\left(l_{cld}/l_{crit}\right)^2}\right].$$

The reader should be aware that there is also an additional, completely different process contributing to the large-scale precipitation budget, namely in case of clear-sky conditions. Again we refer to IFS (2006b), which also describes ice sedimentation, evaporation of precipitation, and melting of snow.
Rain and snow are removed from the atmospheric column immediately but can evaporate, melt, and interact with the cloud water in all layers through which they pass.
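The behavior of the Sundqvist-type autoconversion term can be sketched numerically. The function below evaluates the expression in the form given in this section, $G = a\,c_0\,[1 - \exp(-(l_{cld}/l_{crit})^2)]$. The numerical values of $c_0$ and $l_{crit}$ are illustrative placeholders chosen only for this sketch; they are not the values used at ECMWF.

```python
import numpy as np

def autoconversion_rate(l_cld, cloud_fraction, c0, l_crit):
    """Autoconversion G = a * c0 * (1 - exp(-(l_cld / l_crit)**2))."""
    l_cld = np.asarray(l_cld, dtype=float)
    return cloud_fraction * c0 * (1.0 - np.exp(-(l_cld / l_crit) ** 2))

# Hypothetical parameter values, for illustration only.
C0 = 1.0e-4       # 1/s
L_CRIT = 3.0e-4   # kg/kg

# In-cloud condensate from zero to well above the critical value.
l_values = np.linspace(0.0, 1.5e-3, 6)
rates = autoconversion_rate(l_values, cloud_fraction=0.5, c0=C0, l_crit=L_CRIT)
```

The rate vanishes for a cloud without condensate, stays small while $l_{cld}$ is below $l_{crit}$, and saturates at $a\,c_0$ once $l_{cld}$ clearly exceeds $l_{crit}$, so $l_{crit}$ acts as a soft threshold for the onset of precipitation.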

3.4 The Discretization

There are many ways of tackling the above-mentioned coupled set of partial differential equations. However, for meteorological reasons, one uses a scheme introduced by Simmons and Burridge (1981) based on frictionless adiabatic flow. It is designed such that it conserves angular momentum, which helps to avoid timing problems in traveling fronts. Therefore, one introduces a fixed number


of vertical layers at fixed pressure levels, the so-called vertical (finite element) discretization built of cubic B-splines. The prognostic horizontal variables $T$, $u$, $v$, $\varphi$, $q$, and $p$ are represented in terms of scalar spherical harmonics (see Freeden et al. (1998) for an extensive introduction). At the moment ECMWF uses a representation of 92 layers in the vertical and, horizontally, spherical harmonics of degree 1–1,279. All (nonlinear) differential operators acting on the spherical harmonics are applied after the transformation from Fourier into space domain on the grid. Then, physical, parametrized background processes are applied in space and one projects back to the Fourier domain, where finally the diffusion terms are applied.

For the discretization one leaves the Eulerian representation and uses a semi-Lagrangian formulation. The reasoning is as follows. Eulerian schemes often require small time steps to avoid numerical instability (CFL condition): that is, the prognostic variable must not be advected more than one grid length per time step. The maximum time step is therefore defined by the strongest winds. To overcome this problem one uses a Lagrangian numerical scheme where the prognostic variable is assumed to be conserved for an individual particle in the advection process along its trajectory. The drawback is that with a pure Lagrangian framework it would be impossible to maintain uniform resolution over the forecast region. A set of marked particles would ultimately result in dense congestion at some geographical locations and complete absence at others. To overcome this difficulty a semi-Lagrangian scheme has been developed. In this numerical scheme, at every time step the grid points of the numerical mesh represent the arrival points of backward trajectories at the future time. The point reached during this back-tracking defines where an air parcel was at the beginning of the time step.
During the transport, the particle is subject to various physical and dynamical forcings. Essentially, all prognostic variables are then found through interpolation (using values at the previous time step for the interpolation grid) to this departure point. In contrast to the Eulerian framework, the semi-Lagrangian scheme allows the use of large time steps without compromising stability, provided that trajectories do not cross each other and particles do not overtake one another. The choice of the time step in the semi-Lagrangian scheme is therefore limited essentially by numerical accuracy. However, despite its stability properties, severe truncation errors may cause misleading results. Interestingly, one should note when talking about accuracy that the convergence order of the underlying Galerkin method could be massively improved by switching to a nonlinear formulation as proposed in Fengler (2005).

Finally, we would like to point out that the horizontal discretization used for the Gauss-Legendre transformation (e.g., see Fengler 2005) is, due to performance issues, slightly modified. Originally, the Gauss-Legendre grid clusters strongly toward the poles, which is due to the zeros of the Legendre polynomials that accumulate at the boundary. This naturally introduces a work overload in polar regions, where little is known about the atmospheric conditions, as well as numerical noise due to the pole convergence/singularity of the underlying vector spherical harmonics. To overcome these problems one integrates over reduced lattices that drop points on each row such that the number of points per row remains of the form $2^n 3^m 5^k$, which allows for fast Fourier transforms. Experimentally, one was able to show that this modification introduces only minor artifacts that are negligible in comparison to the effort to be spent for avoiding them.
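The back-tracking and interpolation steps described above can be sketched for the simplest possible setting: one-dimensional advection on a periodic, equidistant grid with linear interpolation. This is a toy illustration under these stated assumptions, not the ECMWF scheme, which works on the sphere and uses higher-order interpolation; all names are hypothetical.

```python
import numpy as np

def semi_lagrangian_step(field, velocity, dx, dt):
    """Advance a periodic 1D field by one semi-Lagrangian time step.

    Each grid point is treated as the arrival point of a backward trajectory;
    the new value is the old field interpolated at the departure point.
    """
    n = field.size
    x_arrival = np.arange(n) * dx
    # Back-track along the (here straight) trajectory to the departure point.
    x_departure = (x_arrival - velocity * dt) % (n * dx)
    # Linear interpolation of the old field at the departure point.
    position = x_departure / dx
    i_left = np.floor(position).astype(int) % n
    i_right = (i_left + 1) % n
    weight = position - np.floor(position)
    return (1.0 - weight) * field[i_left] + weight * field[i_right]

# Gaussian pulse advected with a Courant number u * dt / dx = 5.
x = np.arange(100.0)
pulse = np.exp(-((x - 50.0) / 5.0) ** 2)
advected = semi_lagrangian_step(pulse, velocity=5.0, dx=1.0, dt=1.0)
```

In this example the parcel travels five grid lengths per time step, far beyond what the CFL condition would permit for a standard explicit Eulerian upwind scheme, yet the pulse is simply shifted by five cells. For non-integer Courant numbers the linear interpolation keeps the result bounded by the extremes of the old field but smooths the pulse, which is the accuracy limitation mentioned in the text.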


4 Ensemble Forecasts

Having sketched the basic dynamics behind numerical weather prediction, we may now draw a somewhat more sobering picture. Indeed, numerical weather prediction generally suffers from two types of uncertainty: First, the initial state of the atmosphere is known only to an approximate extent and, second, the numerical weather prediction models themselves exhibit an intrinsic uncertainty. In modern numerical weather prediction systems, assessing this double uncertainty by means of ensemble forecasting has, on the one hand, become a major challenge and, on the other, provides a set of tools for probability-based decision-making. Ultimately, ensemble forecasting allows one to quantitatively estimate the potential environmental and entrepreneurial risks of a forthcoming severe weather event. Over the past 20 years, ensemble forecasting has been implemented and further developed at the main weather prediction centers. For an overview of historic and recent developments, see Lewis (2005) and Leutbecher and Palmer (2008).

Historically and practically, there are different ways to tackle the uncertainty issue inherent in numerical weather forecasts. The most basic idea is the one that probably any professional forecaster employs in daily work: comparing the forecasts of different numerical models, which is sometimes referred to as the “poor man’s ensemble.” The more sophisticated version is an ensemble simulation, for which a certain numerical model is evaluated many times using different sets of initial conditions as well as different parameter sets for the parametrizations of the atmospheric physics. Due to the increased computational requirements compared to a single deterministic run, ensemble simulations are usually carried out using a lower horizontal resolution and a smaller number of vertical levels than the main deterministic run of the respective model.
In addition to the perturbed ensemble members, one usually launches a control run, with the resolution of the ensemble members and the best initialization available, namely the one in use for the main deterministic run. The different ensemble members ideally represent the different ways in which the current state of the atmosphere might evolve. The variance or spread of the different members of the ensemble as well as the deviation from the control run provide useful information on the reliability of the forecast and on its future development.

The two sources of uncertainty present in numerical weather prediction cannot really be distinguished in the final output of a numerical forecast. A numerical weather prediction model is a highly nonlinear dynamical system living in a phase space of about $10^6$–$10^8$ dimensions. Nevertheless, the underlying evolution equations are well defined and deterministic and, accordingly, the system exhibits deterministic chaotic behavior. This implies that small variations in the initial state of the system may rapidly grow and lead to diverging final states. At the same time, the errors from the initial state blend with the errors caused by the model itself, which stem from the choice of the parametrization coefficients, from truncation errors, and from discretization errors. Thus, the errors in the final state are flow-dependent and change from one run of the model to another. Technically speaking, the purpose of ensemble forecasting is to appropriately sample the phase space of the numerical model in order to estimate the probability density function of the final outcome.

There are several methods which are commonly used to create the perturbations of the initial state. The perturbations have to be set up in such a way that they are propagated during the model run and thus lead to significant deviations of the final state of the ensemble members.
The perturbations which grow strongly during the dynamical evolution identify the directions of initial uncertainty which lead to the largest forecast errors.
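The rapid growth of small initial perturbations in a deterministic system can be made tangible with a toy experiment. The sketch below does not use a weather model; as a stand-in it integrates the three-variable Lorenz (1963) system with its classical parameters, which is an assumption made purely for illustration. An ensemble is generated by adding tiny random perturbations to one initial state, and the ensemble spread is recorded over time.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system; state has shape (..., 3)."""
    x, y, z = state[..., 0], state[..., 1], state[..., 2]
    return np.stack([sigma * (y - x), x * (rho - z) - y, x * y - beta * z], axis=-1)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz63(state)
    k2 = lorenz63(state + 0.5 * dt * k1)
    k3 = lorenz63(state + 0.5 * dt * k2)
    k4 = lorenz63(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def ensemble_spread(n_members=20, n_steps=2500, dt=0.01, amplitude=1e-6, seed=0):
    """Integrate a perturbed ensemble and return its spread at every step."""
    rng = np.random.default_rng(seed)
    best_guess = np.array([1.0, 1.0, 1.0])
    # Perturbed initial conditions around the best guess.
    members = best_guess + amplitude * rng.standard_normal((n_members, 3))
    spread = np.empty(n_steps + 1)
    spread[0] = members.std(axis=0).mean()
    for step in range(n_steps):
        members = rk4_step(members, dt)
        spread[step + 1] = members.std(axis=0).mean()
    return spread

spread = ensemble_spread()
```

Although every member obeys exactly the same deterministic equations, the spread grows by several orders of magnitude before it saturates at the size of the attractor. In this toy setting the random perturbations play the role of the initial-state uncertainty; the methods discussed next aim at choosing perturbations that grow in a dynamically relevant way rather than at random.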


The first group of methods to create the perturbations of the initial state is based on ensemble-specific data assimilation techniques. The ensemble Kalman filter used by the Canadian Meteorological Service, which adds pseudo-random numbers to the assimilated observations, belongs to this category (Houtekamer et al. 2005). The second group of methods is based on the so-called bred vector technique. This technique is based on the idea of repeatedly propagating and rescaling a random initial perturbation in order to breed the perturbations which are the most important ones in the dynamical evolution. A bred vector technique is employed by the US National Centers for Environmental Prediction (NCEP) (Toth and Kalnay 1997). The third group of methods is based on the identification of the leading singular vectors of the operator which is responsible for the propagation of the perturbations. The leading singular vectors have to be identified for each initial state, and different ensemble members can then be initialized with different linear combinations of the leading singular vectors. The singular vector technique is employed by ECMWF (Molteni et al. 1996).

The physical effects living on spatial scales which are not resolved by numerical weather prediction models are usually represented by parametrizations as sketched above. The most common approach to introducing model uncertainty is to perturb the parameters of the model’s parametrizations. Other, less commonly used approaches are multi-model ensembles and stochastic-dynamic parametrizations. For a review of the current methods used to represent model error, see Palmer et al. (2005).

It is a highly nontrivial task to adjust an ensemble simulation such that it is neither over- nor underdispersive. The spread of the members of an ideal ensemble should be such that its probability density function perfectly matches the probability distribution of the possible atmospheric configurations.
This can only be reached by properly adjusting both the perturbations of the initial state and the perturbations of the model. Interestingly, numerical weather prediction models tend to be underdispersive and rapidly converge toward the climate normals if the perturbations of the models are insufficient. Calibration techniques based on statistics of past ensemble forecasts can be used in a post-processing step in order to improve the forecast skill and to adjust the statistical distribution of an ensemble simulation.

Interpreting the outcome of an ensemble simulation is much more involved than interpreting a single deterministic run. The first step is usually to compare the main deterministic run of the respective model with the control run of the ensemble. Both these runs are initialized using the best available guess for the initial state of the atmosphere. Large deviations indicate that the model resolution has a strong impact on the outcome for the given atmospheric configuration. In the second step, one usually investigates the spread of the members of the ensemble, their median, and the deviation from the control run. A small spread of the ensemble members indicates a comparatively predictable state of the atmosphere, whereas a large spread indicates an unstable and less predictable state. Finally, the ensemble members allow for probabilistic weather predictions. If, for example, only a fraction of the ensemble members predict a specific event for a certain region, the actual probability of the occurrence of the event can be derived from this fraction. For a more detailed overview and further references on measuring the forecast skill of ensemble simulations, see Candille and Talagrand (2005).

There are many different ways to depict the outcome of an ensemble simulation. The most prominent one is probably the ensemble plume for a certain location in the simulation domain.
For an ensemble plume, the forecasts of all members for a certain parameter are plotted against the lead time. Additional information can be incorporated by including the control run, the


[Figure: two panels for station 100850; vertical axes: mean absolute error [°C] and mean absolute error [km]; horizontal axes: lead time [h]; curves: EZ MOS and EZ Model]

Fig. 1 Comparison of MOS and DMO error in temperature and wind speed

deterministic run, the ensemble median, and optionally the climate characteristics. Ensemble plumes allow for a quick overview of the spread of the ensemble members and the easy comparison to the ensemble median, the control run, the deterministic run, and the climate normal (see, e.g., Fig. 17).
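The quantities shown in an ensemble plume, and the probability derived from the fraction of members predicting an event, can be computed in a few lines. The sketch below uses a synthetic 50-member wind ensemble generated from random numbers; the data, the threshold of 24 kn, and the function names are assumptions for illustration and do not reproduce any real forecast.

```python
import numpy as np

def plume_statistics(members):
    """Median and 10 %/90 % quantiles over the member axis (axis 0)."""
    median = np.median(members, axis=0)
    q10, q90 = np.quantile(members, [0.1, 0.9], axis=0)
    return median, q10, q90

def exceedance_probability(members, threshold):
    """Fraction of ensemble members above a threshold at each lead time."""
    return np.mean(members > threshold, axis=0)

# Synthetic ensemble: 50 members, 6-hourly lead times up to 10 days.
rng = np.random.default_rng(42)
lead_time_h = 6.0 * np.arange(41)
signal = 12.0 + 10.0 * np.exp(-((lead_time_h - 180.0) / 24.0) ** 2)
noise = rng.standard_normal((50, lead_time_h.size)) * (1.0 + 0.02 * lead_time_h)
wind_kn = np.clip(signal + noise, 0.0, None)

median, q10, q90 = plume_statistics(wind_kn)
prob_over_24kn = exceedance_probability(wind_kn, threshold=24.0)
```

Plotting `wind_kn`, `median`, `q10`, and `q90` against `lead_time_h` would give a plume of the kind described in the text, while `prob_over_24kn` is the simplest probabilistic forecast derived from the ensemble. Such raw fractions are only as reliable as the ensemble dispersion, which is why calibration is applied in practice.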

5 Statistical Weather Forecast (MOS)

Once endowed with the algorithms and techniques sketched above, one can do what is known as (dynamical) numerical weather prediction. Dropping for a moment any thoughts on physics and mathematics, that is, convergence, stability, formulation of the equations, technical difficulties, and accuracy, a weather model provides us with nothing but some numerical output that is usually given either as gridded data on different layers and some (native) mesh or by some spherical harmonic coefficients. Either one could be mapped onto some regular grid for drawing charts and maps. Moreover, any kind of regularity in the data immediately gives convenient access to time series from model data if one archives model runs from the past. Clearly, this opens the way to answer questions concerning accuracy and model performance at some specific location, but it also allows one to refine the model in a so-called post-processing step. This leads us to MOS (Model Output Statistics), which relates the historical model information to measurements that have been taken at a certain coordinate by linear or nonlinear regression. Hence, a dense station network helps to improve the outcome of a numerical weather prediction tremendously. For example, regional and especially local effects that are either not modeled physically or happen at some scale that is not resolved by the underlying model could be made visible in this statistically improved weather forecast. Such local effects could be, for example, windward and lee effects, a cold air basin, exposure to special wind systems in valleys, Foehn, inversion, and so on. Figure 1 shows the comparison between the accuracy of the ECMWF direct model output (DMO) and the ECMWF-MOS at a station that is located on Hiddensee in the Baltic Sea.
The MOS system significantly improves the model performance by detecting station-specific characteristics like sea breezes, sea-land wind circulation, sea warming in autumn, and so forth (Fig. 1). Finally, the accuracy of statistical post-processing systems can be improved by combining the MOS systems of different models, for example, an ECMWF-MOS together with a GFS-MOS derived from the NOAA/NCEP GFS model, UKMO-/UKNAE-MOS derived from the


[Figure: schematic with the boxes Real time data, UKMO MOS, ECMWF MOS, GFS MOS, Radar, and MOS Mix]

Fig. 2 Combination of different MOS systems

[Figure: mean absolute error [°C] against lead time [h], from 0 to 120 h; curves: GFH Mos, EZ Mos, UKMO Mos, and Mos Mix]

Fig. 3 Increase of accuracy when combining the different forecasts

Met Office’s global and mesoscale (NAE) models, and additional different MOS systems. The combination done by an expert system (see Fig. 2) adjusts the weighting according to the current forecast skill, reduces the error variances tremendously, and thus further improves the performance, as shown in Fig. 3.
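The two ideas of this section, regression of observations on model output and a skill-weighted combination of several MOS systems, can be sketched with synthetic data. Everything in the example is an assumption for illustration: the "observations" and the two direct model outputs are random numbers with an artificial bias and noise, the regression is a plain linear least-squares fit, and the combination uses weights inversely proportional to the error variance in the training period, which is merely one simple stand-in for the expert system mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
obs = 10.0 + 8.0 * rng.standard_normal(n)                   # synthetic station temperature
dmo_a = 1.2 * obs - 3.0 + 1.0 * rng.standard_normal(n)      # biased model A
dmo_b = 0.9 * obs + 2.0 + 1.5 * rng.standard_normal(n)      # biased, noisier model B
train, test = slice(0, 1500), slice(1500, n)

def fit_mos(dmo, observations):
    """Least-squares fit of observations = c0 + c1 * dmo."""
    design = np.column_stack([np.ones_like(dmo), dmo])
    coef, *_ = np.linalg.lstsq(design, observations, rcond=None)
    return coef

def apply_mos(coef, dmo):
    return coef[0] + coef[1] * dmo

def mae(forecast, observations):
    return np.mean(np.abs(forecast - observations))

coef_a = fit_mos(dmo_a[train], obs[train])
coef_b = fit_mos(dmo_b[train], obs[train])
mos_a = apply_mos(coef_a, dmo_a[test])
mos_b = apply_mos(coef_b, dmo_b[test])

# Weights from the error variance of each MOS system in the training period.
var_a = np.var(obs[train] - apply_mos(coef_a, dmo_a[train]))
var_b = np.var(obs[train] - apply_mos(coef_b, dmo_b[train]))
weight_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_b)
mos_mix = weight_a * mos_a + (1.0 - weight_a) * mos_b

errors = {
    "DMO A": mae(dmo_a[test], obs[test]),
    "MOS A": mae(mos_a, obs[test]),
    "DMO B": mae(dmo_b[test], obs[test]),
    "MOS B": mae(mos_b, obs[test]),
    "MOS mix": mae(mos_mix, obs[test]),
}
```

In this synthetic setting each MOS forecast removes the systematic bias of its direct model output, and the mix gives most of the weight to the more skillful system. When the errors of the individual systems are only weakly correlated, such a combination typically reduces the error further.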

6 Applying the Techniques to Kyrill

Keeping in mind the methods introduced so far, we will have a closer look at how they apply to the Kyrill event. We use ECMWF and Met Office NAE model data for an analysis of the weather conditions around January 17 and 18, shown in Fig. 4. A detailed synoptic analysis of this severe storm has been done in Fink et al. (2009). We chose this event for an analysis based on model output for two reasons. First, the mesoscale flow patterns of this winter storm are well captured, even 8–10 days before landfall. This underlines the good formulation of the flow and its parametrization done by ECMWF. Second, we observe certain effects that are not resolved by a spherical harmonic representation and which show the limitations of this formulation, namely sharp and fast traveling cold fronts coming with


Fig. 4 Geopotential height (dm) and temperature at 500 hPa on 17 January 2007 18z (left) and 18 January 2007 0z (right)

convectively embedded rainfall. Therefore, at first we have a closer look at the ECMWF analysis to describe the geopotential height field and afterward we use the Met Office NAE model that uses a finite difference representation to resolve very local effects. This model is nested into a global one but operates with lead times of only up to the next 36 h. The reader should note that all times are given in Greenwich Mean Time, also known as UTC, Zulu-time, or z-time.

6.1 Analysis of the Air Pressure and Temperature Fields

The images in Fig. 4 show a strong westerly flow that dominated the first weeks of January. In the evening of January 17, the depression Jürgen, as it was dubbed by German meteorologists, influenced Central Europe with some windy and rainy weather. The first signals of Kyrill can be seen on the western edge of the domain over the Atlantic Ocean. This map also shows that Kyrill is accompanied by very cold air on its backside at great heights, which indicates the high potential of a quickly intensifying low-pressure cell. In the night of the 18th we observe from Fig. 4 a strong separation of cold air masses in northern and milder ones in southern parts of Central Europe. This line of separation developed over the British Isles into the frontal zone of Kyrill. Meanwhile, Kyrill showed a pressure minimum of 962 hPa at mean sea level and traveled quickly eastward due to the strong flow at 500 hPa (about 278–315 km/h). In the early morning of the 18th, Kyrill developed strong gradients toward a high lying over Spain and northern parts of Africa. This distinctive gradient led, in the warm sector of Kyrill, to the first damages in Ireland, South England, and northern parts of France with gusts of more than 65 kn. The pressure gradients further intensified and Kyrill was classified as a strong winter storm (see Fig. 5). While in southern and south-westerly parts of Germany the pressure gradients started rising, the eastern parts were influenced by relatively calm parts of the high-pressure ridge between Kyrill and Jürgen. The embedded warm front of Kyrill brought strong rainfall in western parts of Germany, especially in the low mountain range. The warm front drove mild air into South England, Northern France, and West Germany.
By noon of the 18th, Kyrill had developed into a large storm depression yielding gusts at wind speeds of more than 120 km/h in wide areas of South England, North France, the Netherlands, Belgium, Luxembourg, Germany, and Switzerland (see Fig. 5). Even parts of Austria were affected. From the late afternoon until the early evening of the 18th the heavy and impressively organized cold front of Kyrill traveled from the North West of Germany to the South East, bringing heavy rain, hail, strong gusts, and thunderstorms (see Fig. 6). Due to the strong gradient in pressure and the cold


Fig. 5 Geopotential height (dm) and temperature at 500 hPa on 18 January 2007 6z (left) and 12z (right)

Fig. 6 Geopotential height (dm) and temperature at 500 hPa on 18 January 2007 18z (left) and 19 January 2007 0z (right)

front, gusts at velocities between 120 and 160 km/h have been measured even in the plains and lowlands. Finally, it should be mentioned that after the cold front had passed, a convergence and backward oriented occlusion connected to the depression brought heavy rainfall in the North West of Germany. In the back of Kyrill, which was passing through quickly, the weather calmed down and heavy winds only occurred in the South and the low mountain range.

6.2 Analysis of Kyrill’s Surface Winds

The following figures have been computed from the UK Met Office NAE model that uses a finite difference formulation at a horizontal resolution of ca. 12 km. They show the model surface winds at a height of 10 m over western France, Germany, Switzerland, Austria, and eastern parts of Poland. The reader should note that the colors indicate the strength of the wind speed at the indicated time and not the absolute value of gusts in a certain time interval. The overlay of white isolines shows the pressure field corrected to mean sea level. Starting at midnight of the 18th in Figs. 7 and 8, we observe a strong intensification of the surface winds over the North Sea and the described landfall. The strong winds inshore are due to the Channel acting like a nozzle. Noteworthy, when comparing Figs. 9–11, is the change in wind direction when the cold front entered northern parts of Germany.


Fig. 7 Wind speed 10 m above ground on 18 January 2007 0z (left) and 3z (right)

Fig. 8 Wind speed 10 m above ground on 18 January 2007 6z (left) and 9z (right)

Fig. 9 Wind speed 10 m above ground on 18 January 2007, 12z (left) and 15z (right)

Fig. 10 Wind speed 10 m above ground on 18 January 2007, 18z (left) and 21z (right)


Fig. 11 Wind speed 10 m above ground on 19 January 2007, 0z (left) and 3z (right)

Fig. 12 850 hPa-Wind on 18 January 2007 0z (left) and 3z (right)

Fig. 13 850 hPa-Wind on 18 January 2007 6z (left) and 9z (right)

6.3 Analysis of Kyrill’s 850 hPa Winds

To understand the severity of Kyrill’s gusts, we take a closer look at the model on the 850 hPa pressure level. In these layers, due to the convective nature of the cold front, heavy rainfall causes transport of the strong horizontal momentum into the vertical. Caused by this kind of mixing, fast traveling air at heights between 1,200 and 1,500 m is pushed down to earth. These so-called downbursts are commonly responsible for the heavy damages of such a storm. From the images shown in Fig. 12 we observe a westerly flow at strong but not too heavy wind speeds. The reader should note that the wind speeds are given in knots. Moreover, it should be pointed out that the “calm” regions in the Alps are numerical artifacts that have their origin in the fact that, in these mountainous regions, the 850 hPa layer lies below the ground. In the following images shown in Figs. 13–16 we now observe the cold front with these strong winds passing over Germany.
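The statement that the 850 hPa level lies below the ground in the Alps can be checked with a rough estimate of the height of that pressure surface. The sketch below uses the barometric formula of the International Standard Atmosphere for the troposphere; the standard-atmosphere constants are assumptions of this example and do not describe the actual conditions during Kyrill, when the deep low pressure shifted all pressure surfaces.

```python
def standard_atmosphere_height_m(pressure_hpa, p0_hpa=1013.25, t0_k=288.15,
                                 lapse_rate_k_per_m=0.0065, r_dry=287.05,
                                 gravity=9.80665):
    """Height of a pressure level in the standard atmosphere (troposphere only).

    z = (T0 / L) * [1 - (p / p0) ** (R_dry * L / g)]
    """
    exponent = r_dry * lapse_rate_k_per_m / gravity
    return t0_k / lapse_rate_k_per_m * (1.0 - (pressure_hpa / p0_hpa) ** exponent)

# Approximate height of the 850 hPa surface under standard conditions.
z_850 = standard_atmosphere_height_m(850.0)

# Conversion of wind speeds given in knots, as used in the figures, to km/h.
KMH_PER_KNOT = 1.852
gust_65kn_in_kmh = 65.0 * KMH_PER_KNOT
```

Under standard conditions the 850 hPa surface lies at roughly 1.5 km, well below the Alpine crests, which is consistent with the artificially calm regions in the model output there.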

Fig. 14 850 hPa-Wind on 18 January 2007, 12z (left) and 15z (right)

Fig. 15 850 hPa-Wind on 18 January 2007, 18z (left) and 21z (right)

Fig. 16 850 hPa-Wind on 19 January 2007, 0z (left) and 3z (right)

6.4 Ensemble Forecasts

In the 50-member ECMWF ensemble forecasting system, first signals of a severe winter storm showed up as early as 10 days in advance. In Fig. 17, we show a so-called ensemble plume plot for the 10 m wind for the city of Frankfurt, Germany. Plotted over the lead time, the plume plot contains the results for the 50 ensemble members, the ensemble median, as well as the 10 % and the 90 % quantiles. In the ensemble run initialized on Monday, January 8, 12z, at least a fraction of the ensemble members exceeded wind speeds of 24 kn for January 18. At the same time, a fraction of the ensemble members exceeded 14 mm for the 6-h precipitation for the same forecast time (see Fig. 17).
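The ingredients of such a plume plot are simple order statistics over the ensemble members at each lead time. The following minimal sketch uses purely synthetic wind data (not the ECMWF forecasts for Kyrill); the array names, the shape of the synthetic storm peak, and the 6-hourly time step are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_steps = 50, 41                 # 50 members, 6-hourly steps over 10 days
lead_hours = np.arange(n_steps) * 6

# Synthetic 10 m wind (kn): calm background plus a storm peak whose
# amplitude differs from member to member, plus small noise.
amplitude = rng.uniform(5.0, 25.0, size=(n_members, 1))
peak = np.exp(-0.5 * ((lead_hours - 234.0) / 12.0) ** 2)
wind = 8.0 + amplitude * peak + rng.normal(0.0, 1.0, size=(n_members, n_steps))

# Curves shown in a plume plot: median and 10 % / 90 % quantiles per lead time.
median = np.median(wind, axis=0)
q10, q90 = np.quantile(wind, [0.10, 0.90], axis=0)

# Fraction of members exceeding a warning threshold at each lead time.
threshold_kn = 24.0
exceed_fraction = np.mean(wind > threshold_kn, axis=0)

k = int(np.argmax(median))
print(f"median peaks at +{lead_hours[k]} h; "
      f"{100 * exceed_fraction[k]:.0f} % of members exceed {threshold_kn} kn")
```

A narrow band between the 10 % and 90 % quantiles indicates a consistent ensemble, while a wide band signals that the members disagree on the strength of the event.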

Fig. 17 Ensemble plumes for Frankfurt from 8 January 2007, 12z. 10 m Wind (left) and precipitation (right)

Fig. 18 Ensemble plumes for Frankfurt from 10 January 2007, 12z. 10 m Wind (left) and precipitation (right)

Fig. 19 Ensemble plumes for Frankfurt from 12 January 2007, 12z. Wind (left) and precipitation (right)

It should be noted that despite the underestimated strength of the event, the timing was already highly precise at that time. However, on January 8, the different members of the ensemble did not yet exhibit a very consistent outcome for the future Kyrill event. In the following, we track the ensemble forecasts for the Kyrill winter storm for the city of Frankfurt, Germany, as the passage of the main front approaches. In Fig. 18, we show the corresponding ensemble plume plots for the wind and the 6-h precipitation for the ensemble run initialized on January 10, 12z. The signals for the 10 m wind increased considerably compared to the previous run. A fraction of the ensemble members now exceeds 30 kn and the ensemble median clearly exceeds 15 kn. At the same time, a distinct peak in the expected precipitation develops, with an increased number of the ensemble members exceeding 14 mm for the 6-h precipitation.

Fig. 20 Ensemble plumes for Frankfurt from 14 January 2007, 12z. Wind (left) and precipitation (right)

The timing of the main event is almost unchanged compared to the previous run. The strongest winds are still expected to occur around January 19, 0z. In the plots for the ensemble run initialized on January 12, 12z, shown in Fig. 19, the signals for the severe winter storm Kyrill become more pronounced. At that time, about 7 days in advance, almost all ensemble members exceed 10 m winds of 15 kn and the 90 % quantile exceeds 30 kn. At the same time, the distinct peak in the 6-h precipitation becomes more pronounced. The timing of the event is hardly altered, with the strongest winds still expected around January 19, 0z. About 5 days before the main event, the 10 % quantile for the 10 m winds almost reaches 18 kn and a very consistent picture develops in the ECMWF ensemble forecasts (see Fig. 20). Based on this very consistent picture, 10 m winds of up to 30 kn can be expected for the time between January 18, 12z and January 19, 0z. The 6-h precipitation can be expected to reach about 5 mm based on the ensemble median.

6.5 MOS Forecasts

In order to stress the importance of local forecasts that take local effects into account, we show three MOS forecast charts for the city of Frankfurt, Germany (Figs. 21–23). These MOS forecasts are based on the deterministic run of the ECMWF model. Already from the MOS run based on the model output from January 12, 12z, a detailed and very precise picture of the Kyrill winter storm can be drawn. All the main features of the event are contained, as well as the precise timing, which had already been found in the ensemble forecasts. The strongest gusts, with more than 50 km/h, are expected for the late evening of January 19. At the same time, the reduced pressure is expected to drop to about 999 hPa at this specific location. In the MOS run based on the January 14, 12z model output, the passage of the cold front with the subsequent peak of the gusts and the wind becomes more pronounced than in the previous run. The peak in the precipitation was to be expected after the passage of the cold front. In the MOS run based on the January 17, 12z ECMWF model output, the speed of the maximum gusts increased even more and the expected precipitation rose. From these MOS charts, the importance of local forecasts can easily be understood. Comparing the MOS charts with the ensemble forecasts shown in the previous section, it is obvious that local effects play a crucial role.

Fig. 21 MOS chart for Frankfurt from 12 January 2007, 12z

Fig. 22 MOS chart for Frankfurt from 14 January 2007, 12z

Fig. 23 MOS chart for Frankfurt from 17 January 2007, 12z

Fig. 24 Observation chart for Frankfurt for the Kyrill event

In Fig. 24, we show an observation chart in order to verify the MOS forecast from January 17, 12z. The precipitation has been overestimated by the forecasts, but the sea level pressure, the wind, the gusts, and the temperature profile have been captured to a very high precision.
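The idea behind such a statistical post-processing step can be sketched in a few lines. MOS is commonly realized as a regression of station observations on model output; the following minimal example assumes the simplest case of a single predictor and a linear relation, and it uses purely synthetic data (not the Frankfurt observations). All variable names and coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic training data: raw model wind (kn) and station observations that
# differ from the model by a systematic local bias plus random scatter.
raw = rng.uniform(5.0, 30.0, size=200)
obs = 2.0 + 0.8 * raw + rng.normal(0.0, 1.5, size=200)

# Least-squares fit of obs ~ a + b * raw (design matrix with an intercept column).
design = np.column_stack([np.ones_like(raw), raw])
coef, *_ = np.linalg.lstsq(design, obs, rcond=None)
mos = design @ coef


def rmse(forecast, observed):
    """Root-mean-square error between forecast and observation."""
    return float(np.sqrt(np.mean((forecast - observed) ** 2)))


rmse_raw = rmse(raw, obs)
rmse_mos = rmse(mos, obs)
print(f"RMSE raw model: {rmse_raw:.2f} kn, RMSE after regression: {rmse_mos:.2f} kn")
```

On the training data, the regression cannot do worse than the raw model output, because the raw forecast corresponds to the special choice a = 0, b = 1 within the same family. Operational systems use many predictors and verify on independent data.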

Fig. 25 Radar images from 18.1.2007 0z (left) and 3z (right)

Fig. 26 Radar images from 18.1.2007 6z (left) and 9z (right)

6.6 Weather Radar

For the sake of completeness, we also take a closer look at the radar images for the same time period. The radar images in Figs. 25–29 show shaded areas where strong rainfall or hail is observed. The stronger the precipitation event, the brighter the color. These images provide deep insight into the strong, narrow cold front traveling from northern Germany to the south. In Figs. 27 and 28 we see an extreme sharpness and an impressively strong organization in the fast-traveling cold front. This highly damaging front had a cross diameter of 46 km, which the models discussed above unfortunately did not show due to their resolution. Resolving these patterns properly is a challenging task for current research.

Fig. 27 Radar images from 18.1.2007 12z (left) and 15z (right)

Fig. 28 Radar images from 18.1.2007 18z (left) and 21z (right)

Fig. 29 Radar images from 19.1.2007 0z (left) and 19.1. 3z (right)

7 Conclusion

Since the early days of computer simulations, scientists have been interested in using them for modeling the atmosphere and predicting the weather. Nowadays, these efforts have evolved into essential tools for meteorologists. Rapid development continues and will continue in the future, owing to access to sufficiently fast computers and the availability of highly sophisticated mathematical methods and algorithms for solving the underlying nonlinear problems. Some of the most important methods used in current operational forecasting codes have been reviewed in this chapter. After the fundamental set of governing partial differential equations was defined, a formulation and discretization suitable for solving these equations numerically on three-dimensional grids was introduced. Using the example of cloud formation and precipitation, we showed how microphysical processes are coupled to such a model. Some commonly used methods for assimilating information from observational data to improve the accuracy of predictions have been outlined. It is crucial to understand that even though the models, as well-defined initial value problems, are deterministic, the complexity and nonlinearity of the underlying mathematics as well as our ignorance of the exact initial conditions make it difficult to predict the quality of a single forecast. To tackle this problem, statistical methods, so-called ensemble forecasts, have been developed. Finally, statistical post-processing techniques known as Model Output Statistics (MOS) can be used to further improve the forecast quality at specific locations. During the last two decades, through methods such as the ones described here, numerical weather models have opened a window for observing and investigating the atmosphere at an unprecedented level of detail, and they contribute significantly to our ability to understand and predict the weather and its dynamics.
As an example of the type and quality of information we can extract from atmospheric simulations in combination with statistical analysis and observational data (weather station reports and radar maps), we analyzed the winter storm Kyrill and the processes that led to this event. For such an analysis, model data provide us with detailed temperature, pressure, and wind maps that would not be available at a comparable frequency through observational data alone. The dynamics at different pressure levels that eventually led to this devastating storm were described and analyzed in detail based on data obtained from the ECMWF and the UK Met Office NAE models. We then looked at the event through ensemble plumes and MOS diagrams, which give first hints of the storm more than a week in advance and rather precise quantitative predictions of wind speeds and precipitation at specific locations a couple of days before the event. The comparison with radar images reveals limitations of the model predictions due to fine, localized structures that are not accurately resolved on the model grids. This demonstrates the need for better local models with very high spatial resolution and more reliable coupling to all available observational data. Despite the overall accurate picture we already obtain, such high-resolution models can be very valuable when preparing for a severe weather situation, for example for emergency teams that have to decide where to start evacuations or where to move manpower and machinery.

Acknowledgments

The authors would like to acknowledge Meteomedia, especially Markus Pfister and Mark Vornhusen, for many fruitful discussions and help with the ensemble and MOS charts. Moreover, the authors’ gratitude goes to ECMWF for providing extensive documentation and scientific material on their current systems.


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_22-4 © Springer-Verlag Berlin Heidelberg 2014

Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives

Matthias Augustin (c), Mathias Bauer (a), Christian Blick (c), Sarah Eberle (c), Willi Freeden (c), Christian Gerhards (c), Maxim Ilyasov (c), René Kahnt (d), Matthias Klug (c), Sandra Möhringer (c), Thomas Neu (e), Helga Nutz (c), Isabel Michel née Ostermann (b), and Alessandro Punzi (c)

(a) CBM GmbH, Bexbach, Germany
(b) Fraunhofer ITWM, Kaiserslautern, Germany
(c) Geomathematics Group, University of Kaiserslautern, Rhineland-Palatinate, Germany
(d) G.E.O.S. Ingenieurgesellschaft mbH, Freiberg, Germany
(e) Tiefe Geothermie Saar GmbH, Saarbrücken, Germany

Abstract

Modeling geothermal reservoirs is a key issue for successful geothermal energy development. After over 40 years of study, many models have been proposed and applied to hundreds of sites worldwide. Nevertheless, with increasing computational capabilities, new efficient methods become available. The aim of this paper is to present recent progress on potential methods and seismic (post-)processing, as well as fluid and thermal flow simulations for porous and fractured subsurface systems. Commonly used procedures in industrial energy exploration and production, such as forward modeling, seismic migration, and inversion methods, together with continuum and discrete flow models for reservoir monitoring and management, are explained, and some numerical examples are presented. The paper ends with a description of future fields of study and points out opportunities, perspectives, and challenges.

1 Introduction

Temperature increases with depth in the Earth at an average rate of 25 °C/km. If the average surface temperature is assumed to be around 15 °C, the temperature at a depth of about 3 km is around 90 °C. In accordance with standard geophysical estimates, the total heat content of the Earth (reckoned at an average surface temperature of 15 °C) is approximately $12.6 \times 10^{24}$ MJ, of which the heat content of the crust amounts to $5.4 \times 10^{21}$ MJ. Nowadays, only a fraction of the heat content can be utilized, depending on geological conditions. Favorable are areas that transfer heat from deep zones to the surface. Spatial variations of thermal energy within the deep crust and mantle of the Earth give rise to concentrations of thermal energy near the surface of the Earth, such that certain locations can be used as an energy resource. Heat is transferred from the deeper parts of the Earth by conduction through rocks, by the movement of hot deep material towards the surface, particularly when associated with recent volcanism, and by circulation of water, e.g., in active fault zones. Figure 1 illustrates some scenarios of interest for geothermal exploitation.

Fig. 1 Schematic representation of geothermally relevant zones (following Saemundsson 2009)

Much of the geothermal exploration occurring worldwide is focused on the geologic concept of plate tectonics since most of the current thermal activities are located near plate boundaries. The brittle and moving plates of the lithosphere (crust and upper mantle) are driven by convection of plastic rocks beneath the lithosphere (see Geothermal Energy Association 2011, for more details). Convection causes the crustal plates to break and move away in opposite directions from zones of upwelling hot material. Magma moving upward into a zone of separation is accompanied by substantial amounts of thermal energy. A particular source of elevated heat flow and volcanism are plumes and hotspots. Several important geothermal systems are associated with recent volcanism caused by hotspots, e.g., the geothermal fields of Iceland, Yellowstone, and the Azores. Areas of the world with high geothermal potential are shown in Fig. 2. Innovative techniques of exploration, geothermal drilling, electric power generation, heat pumps, etc. may open new frontiers in the near future and probably allow utilization of larger fractions of the untapped resources. In fact, the heat in place is not the problem. The question is how much can be taken out and at what price. It is our belief that a geoscientific consortium including geomathematics as a key technology can contribute significantly to finding a more appropriate answer for all types of geothermal energy systems, i.e., deep as well as near-surface geothermal systems (see Fig. 3). Indeed, the complexity of the entire energy production chain makes modeling of geothermal reservoirs a difficult challenge involving different scientific disciplines. For the geothermal energy production system to be efficient, it is of essential importance not only to have a deep understanding of the geologic, thermal, and mechanical configuration of the potential site but also to be able to predict possible consequences of this invasive procedure, to satisfy mathematical requirements, and to execute large numerical computations.
Fig. 2 Major tectonic plates of the Earth (top), areas of the Earth with potential for generating geothermal energy (bottom) (due to Hammons 2011)

For this reason, geologists, engineers, physicists, and mathematicians must share their expertise and work together in order to provide an efficient solution for the ambitious task “geothermal energy”. The geothermal energy gained can be employed directly for the heat market, or it can be used indirectly for electricity generation. Countries with active volcanism such as Iceland have a longstanding tradition of industrial use of geothermal energy gained from reservoirs with high enthalpy (>180 °C), which are easily accessible due to their shallow depths (cf. Georgsson and Friedleifsson 2009). Nevertheless, the absence of these sources does not exclude substantial geothermal potential in other regions or countries (for more details on the geothermal potential in Germany, see, e.g., Jung 2007; Schulz 2009) in the shape of deep reservoirs with low enthalpy ( […] 1 km), through mostly vertical fractures, to extract the heat from the rocks. (3) Sedimentary geothermal systems are probably the most common type worldwide. The systems in this category are characterized by medium temperature, high flow rate geothermal reservoirs in large-basin, sedimentary deposits.

Fig. 3 Diversity of production facilities for deep geothermal energy

Table 1 Classification of geothermal systems on the basis of temperature, enthalpy, and physical state (based on the work of Bödvarsson 1964, Axelsson and Gunnlaugsson 2000, and Sanyal 2005)

Temperature | Enthalpy | Physical state
Low-temperature (LT) systems with reservoir temperature at 1 km depth below 150 °C, often characterized by hot or boiling springs; medium-temperature (MT) systems with reservoir temperature at 1 km depth between 150 and 200 °C | Low-enthalpy geothermal systems with reservoir fluid enthalpies less than 800 kJ/kg, corresponding to temperatures less than about 180 °C | Liquid-dominated geothermal reservoirs with water temperature at, or below, the boiling point at the prevailing pressure, such that the water phase controls the pressure in the reservoir. Some steam may be present
High-temperature (HT) systems with reservoir temperature at 1 km depth above 200 °C, characterized by fumaroles, steam vents, mud pools, and highly altered ground | High-enthalpy geothermal systems with reservoir fluid enthalpies greater than 800 kJ/kg | Two-phase geothermal reservoirs where steam and water co-exist and the temperature and pressure follow the boiling point curve; vapor-dominated geothermal systems where temperature is at, or above, the boiling point at the prevailing pressure and the steam phase controls the pressure in the reservoir. Some liquid water may be present
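The temperature and enthalpy thresholds of Table 1 translate directly into a small lookup. The following minimal sketch is only an illustration: the function names are hypothetical, and the treatment of values lying exactly on a threshold (150 °C, 200 °C, 800 kJ/kg) is an assumption, since the table does not specify it.

```python
def classify_by_temperature(temp_c_at_1km):
    """Temperature class of Table 1 from the reservoir temperature (deg C) at 1 km depth."""
    if temp_c_at_1km < 150.0:
        return "LT"          # low-temperature system
    if temp_c_at_1km <= 200.0:
        return "MT"          # medium-temperature system (boundaries included by assumption)
    return "HT"              # high-temperature system


def classify_by_enthalpy(enthalpy_kj_per_kg):
    """Enthalpy class of Table 1 from the reservoir fluid enthalpy (kJ/kg)."""
    # Assumption: exactly 800 kJ/kg is counted as high-enthalpy.
    return "low-enthalpy" if enthalpy_kj_per_kg < 800.0 else "high-enthalpy"


for temp_c in (90.0, 170.0, 240.0):
    print(temp_c, classify_by_temperature(temp_c))
```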

Geothermal energy can also be classified by depth: one distinguishes between near-surface geothermal energy and deep geothermal energy. The former is mainly used for heating in private homes. The required facilities are, e.g., geothermal collectors or thermo-active pipes reaching down to a depth of at most 400 m. The use of deep geothermal energy, on the other hand, requires boreholes with a depth between 2 and 5 km.
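As a rough orientation, the average gradient of 25 °C/km and the average surface temperature of 15 °C quoted in the Introduction can be turned into a linear temperature estimate for these depth ranges. The following minimal sketch assumes a constant gradient, which is a strong simplification because local gradients vary considerably; the function name is illustrative.

```python
def temperature_at_depth(depth_km, surface_temp_c=15.0, gradient_c_per_km=25.0):
    """Linear estimate of the rock temperature (deg C) at a given depth (km)."""
    return surface_temp_c + gradient_c_per_km * depth_km


# Near-surface limit (0.4 km), the 3 km example from the Introduction,
# and the typical range of deep geothermal boreholes (2 to 5 km).
for depth_km in (0.4, 2.0, 3.0, 5.0):
    print(f"{depth_km:3.1f} km: {temperature_at_depth(depth_km):5.1f} deg C")
```

With these average values, the deep boreholes of 2 to 5 km reach roughly 65 to 140 °C, whereas near-surface facilities stay below about 25 °C.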

Basically, there are three different types of systems for deep geothermal power production (see Fig. 3): thermowells (deep heat exchangers), hydrothermal reservoirs, and petrothermal reservoirs. In a thermowell, the heat transfer medium circulates in a closed cycle within a U-pipe or a coaxial heat exchanger. Therefore, only one borehole has to be drilled. Although this has the advantage of avoiding contact with the groundwater, the relatively low productivity motivates a focus on open systems, namely, hydrothermal and petrothermal reservoirs. The concept behind hydrothermal systems is to let thermal water found in deep reservoirs circulate between two or three deep wells through a previously existing aquifer. Typically, these reservoirs consist of a porous medium layer heated from below by a hot stratum of impermeable material. By contrast, in petrothermal systems (also called Hot Dry Rock (HDR) systems or Enhanced Geothermal Systems (EGS)), the water flows through fractured hot rock, the porosity of which can be enhanced by hydraulic stimulation. In the latter case, water is artificially pumped into the reservoir. In this work, we essentially focus on hydro- and petrothermal systems.

The key issues that geomathematicians have to face are the detection of potential reservoirs along with the surrounding subsurface structure, including temperature, capacity, and hydraulic characteristics of aquifers, and the development of a comprehensive model describing the dynamics of the production process, in particular concerning flow, temperature, and composition of the fluid. Recent events such as earthquakes (see also Phillips et al. 2002) show that another fundamental mathematical problem is the understanding of inner stress conformation and dynamics. Geothermal exploration methods include a broad range of disciplines: geology, geophysics, geochemistry, geoengineering, and, of course, geomathematics.
Exploration involves not only identifying hot geothermal bodies but also cost-efficient regions to drill. A key technology is the detection of geophysical features, such as fault patterns and their accompanying fractures, karst occurrence, or water transfer rates in the area of interest in the deeper underground. This is done, for example, by migration of seismic data. Gravimetric studies use changes in density to characterize subsurface properties. In particular, subsurface fault lines are identifiable with gravitational methods. Magnetotelluric data allow the detection of resistivity anomalies associated with geothermal structures, including faults, and enable the estimation of geothermal reservoir temperatures at various depths. Geomagnetism offers the possibility of detecting the depth at which the Curie temperature of certain materials is reached, hence providing valuable information on future plant productivity. Based on the knowledge gained by seismic modeling (location, orientation, and aperture of cracks), a mathematical description of the stress field prior to production can be provided. Due to the danger of fluid flow in a highly stressed system, it is also of crucial importance to understand the evolution of the stress during the production process. Another core issue is modeling the actual underground flow of water, which must take into account several aspects such as thermal flow, chemical flow, evolution of the pressure gradient, and possible consequences for the highly stressed rock conformation. This means solving coupled equations for elasticity as well as for two-, three-, or multiphase fluid, heat, and mass flow in porous or fractured media (see, e.g., Pruess 1990, for flow problems). All in all, from the viewpoint of geomathematics, we are required to model and simulate both the necessary (thermal, mechanical, hydraulic, etc.) parameters and the processes occurring during the exploration, construction, and operational phases by using error-affected or incomplete measurement data.

In the Geomathematics Group at the University of Kaiserslautern, a column model was developed to handle deep geothermal reservoirs. It consists of the following four areas (columns): potential methods (gravitation/geomagnetics), seismic exploration, transport processes, and stress field (see Fig. 4), with various related publications, e.g., Freeden et al. (2003), Freeden and Schreiner (2006), Freeden and Wolf (2009), Freeden (2011), Freeden and Nutz (2011, 2014), Ilyasov (2011), Luchko and Punzi (2011), Ostermann (2011a, b), Gerhards (2011, 2012, 2014), Augustin (2012), Freeden and Blick (2013), Freeden and Gerhards (2013), Freeden and Gutting (2013), and Bauer et al. (2014a, b). In this work, we essentially follow the Kaiserslautern model. Accordingly, we first give a short description of geopotential methods. Then, we go on to seismic data retrieval, post-processing, and decorrelation of signatures. Different types of geothermal reservoirs are discussed subsequently, followed by a description of the involved transport processes. Next, the simulation of the stress field is investigated. Finally, we present an outlook on the future of the geothermal situation, i.e., opportunities, challenges, as well as perspectives.

Fig. 4 Column model characterizing deep geothermal systems implemented by the Geomathematics Group at the University of Kaiserslautern

2 Potential Methods

In order to minimize the geothermal exploration risk, one is well advised to first consult potential methods using gravimeter and/or magnetometer data (in accordance with the first column of the Kaiserslautern model in Fig. 4).

2.1 Classical Gravimetry Problem

The inversion of Newton’s Law of Gravitation, i.e., the determination of the density function from information on the external gravitational potential, is known as the gravimetry problem. To be more precise, let $B \subset \mathbb{R}^3$ be (a region of) the Earth. We are interested in the density function $\rho : B \to \mathbb{R}$, which we want to reconstruct from (information on) the gravitational potential $P$ in $\mathbb{R}^3 \setminus B$. The gravitational potential is given by the integral

$$P(x) = T[\rho](x) = \int_B \Phi(\nabla^2; \|x - y\|)\, \rho(y)\, dV(y), \tag{1}$$

with the volume element $dV(y)$ and the Laplace operator $\nabla^2$. Please note that we assume here that the gravitational constant can be set equal to 1. The kernel function, given by

$$\Phi(\nabla^2; \|x - y\|) = \frac{1}{4\pi \|x - y\|}, \quad x, y \in \mathbb{R}^3, \quad x \neq y, \tag{2}$$

is the fundamental solution to the Laplace operator in $\mathbb{R}^3$ (see, e.g., Freeden and Gerhards 2013). Here, $\|\cdot\|$ denotes the Euclidean norm of a vector in $\mathbb{R}^3$. It is well known that, in the classification due to Hadamard, the gravimetry problem violates all criteria of well-posedness, viz., uniqueness, existence, and stability (for more details, the reader is referred to, e.g., the Habilitation thesis Michel (2002) as well as Freeden and Michel (2004) and Michel and Fokas (2008)). We merely summarize the essential results:

(1) The potential $P$ is harmonic in $\mathbb{R}^3 \setminus B$. In accordance with the so-called Picard condition known in the theory of inverse problems (e.g., Engl et al. 1996), a solution exists only if $P$ belongs to (an appropriate subset in) the space of harmonic functions. However, it should be pointed out that this observation does not cause a numerical problem since in practice the information on $P$ is only finite dimensional. In particular, an approximation by an appropriate harmonic function is a canonical ingredient of any practical method.

(2) The most serious problem is the non-uniqueness of the solution: the associated Fredholm integral operator $T$ is of the first kind and has a kernel (null space) that is known to coincide with the $L^2(B)$-orthogonal complement of the closed linear subspace of all harmonic functions in $B$ (see, e.g., Michel 2002; Freeden and Gerhards 2013). The orthogonal complement, i.e., the class of anharmonic functions on $B$, is known to be infinite dimensional. In fact, the problem of non-uniqueness has been discussed extensively in the literature. This problem can be overcome by imposing some reasonable additional conditions on the density. A suitable condition, suggested by the mathematical structure of the Newton potential operator $T$, is to simply require that the density is harmonic.
The approximate calculation of the harmonic density has already been implemented and covered in several papers, whereas the problem of determining the anharmonic part still seems to remain a great challenge. Due to the lack of an appropriate physical interpretation of the harmonic part of the density, various alternative variants have been discussed in the literature (see, e.g., Freeden and Michel 2004; Michel and Fokas 2008, and the references therein). In general, gravitational data yield significant information only about the uppermost part of the Earth’s interior, which is not laterally homogeneous. This is why the inversion runs into serious difficulties. (3) Restricting the operator to harmonic densities leads to an injective mapping that has an unbounded inverse, implying an unstable solution. Apart from the inversion of the Newton potential (see, e.g., Ernstson and Alt 2013, for an overview on the conventional methods in geothermal modeling), the decorrelation of features occurring in the density function $\rho: B \to \mathbb{R}$ plays an important role in geothermal practice. Thus, we introduce a wavelet approach that supplies detail information that is useful for interpretation of the different features. As we shall see, the resulting post-processing method as proposed here can either be implemented for the density function itself or, more troublesome, based on gravitational data. That is because the Newton potential (1) is directly related to the density via a differential equation in the interior of the Earth, viz., the Poisson equation, at least if Hölder continuity is assumed for $\rho: B \to \mathbb{R}$ (see, e.g., Freeden and Gerhards 2013, for more details). We obtain

\[
\rho(x) = -\nabla_x^2 P(x) = -\nabla_x^2 \int_B \Phi(\nabla^2; \|x-y\|)\,\rho(y)\,dV(y), \qquad x \in B. \tag{3}
\]

Indeed, the Poisson differential equation (3) enables us to develop specific localizing scaling and wavelet functions for post-processing procedures from the density function itself and/or from gravitational data. Concerning the first variant, i.e., efficient post-processing of available density information, we use a regularization technique of the Newton potential (1) by approximating the fundamental solution $\Phi(\nabla^2; \|\cdot\|)$ by a (one-dimensional) linear Taylor expansion $\Phi_\tau(\nabla^2; \|\cdot\|)$ given by

\[
\Phi_\tau(\nabla^2; \|x-y\|) =
\begin{cases}
\dfrac{1}{4\pi}\,\dfrac{1}{\|x-y\|}, & \|x-y\| \ge \tau, \\[2ex]
\dfrac{1}{8\pi\tau}\left(3 - \dfrac{1}{\tau^2}\,\|x-y\|^2\right), & \|x-y\| < \tau.
\end{cases} \tag{4}
\]

It is readily seen that the regularization $P_\tau$ of the potential $P$ in (1) given by

\[
P_\tau(x) = T_\tau[\rho](x) = \int_B \Phi_\tau(\nabla^2; \|x-y\|)\,\rho(y)\,dV(y) \tag{5}
\]

satisfies the asymptotic relation

\[
\sup_{x \in B} \left| P(x) - P_\tau(x) \right| = O(\tau^2), \qquad \tau \to 0, \tag{6}
\]

i.e., $P_\tau$ approximates $P$ with order $\tau^2$. $\{\Phi_\tau\}_{\tau>0}$ is called the family of (scale continuous) scaling functions, while $\{W_\tau\}_{\tau>0}$ given via the (scale continuous) differential expression

\[
W_\tau = -\tau\,\frac{d}{d\tau}\,\Phi_\tau \tag{7}
\]

is called the family of (scale continuous) wavelet functions. We omitted the reference to the Laplace operator here and will do so further on. The transition from scale continuity to scale discretization (see Freeden and Schreiner 2009; Freeden and Blick 2013; Freeden and Gerhards 2013) is provided by a positive, monotonically decreasing sequence $\{\tau_j\}_{j \in \mathbb{N}}$ of scale parameters such that $\lim_{j \to \infty} \tau_j = 0$. In fact, it is not difficult to see that the family of (scale discrete) wavelet functions $\{\Psi_{\tau_j}\}_{j \in \mathbb{N}}$ defined by

\[
\Psi_{\tau_j} = \int_{\tau_{j+1}}^{\tau_j} W_\tau\,\frac{1}{\tau}\,d\tau \tag{8}
\]

satisfies the (scale discrete) difference equation

\[
\Psi_{\tau_j} = \int_{\tau_{j+1}}^{\tau_j} \left(-\tau\,\frac{d}{d\tau}\,\Phi_\tau\right) \frac{1}{\tau}\,d\tau = \Phi_{\tau_{j+1}} - \Phi_{\tau_j}, \qquad j \in \mathbb{N}. \tag{9}
\]

Obviously, we have

\[
P_{\tau_J}(x) = \int_B \Phi_{\tau_j}(\|x-y\|)\,\rho(y)\,dV(y) + \sum_{l=j}^{J-1} \int_B \Psi_{\tau_l}(\|x-y\|)\,\rho(y)\,dV(y), \tag{10}
\]

$j, J \in \mathbb{N}$, $J > j$, with

\[
\lim_{J \to \infty}\, \sup_{x \in B} \left| P(x) - P_{\tau_J}(x) \right| = 0. \tag{11}
\]

Of course, this convergence result is also valid for (5) if we regard $\tau \to 0$. It should be noted that all wavelet functions defined by (9) have a compact support, which means that the evaluation of the integrals can be restricted economically to the support of the functions. Moreover, we are able to introduce low-pass filters $L_{\tau_j}$ and band-pass filters $B_{\tau_j}$ for the density $\rho$ in the form

\[
\rho_{\tau_j}(x) = L_{\tau_j}[\rho](x) = -\int_B \nabla_x^2\, \Phi_{\tau_j}(\|x-y\|)\,\rho(y)\,dV(y) \tag{12}
\]

and

\[
B_{\tau_j}[\rho](x) = -\int_B \nabla_x^2\, \Psi_{\tau_j}(\|x-y\|)\,\rho(y)\,dV(y). \tag{13}
\]

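In connection with the differential equation (see, e.g., Freeden and Gerhards 2013)

\[
-\nabla_x^2\, \Phi_{\tau_j}(\|x-y\|) = H_{\tau_j}(\|x-y\|), \tag{14}
\]

we indeed arrive at the Haar scaling function $H_{\tau_j}(\|\cdot\|)$ given by

\[
H_{\tau_j}(\|x-y\|) =
\begin{cases}
0, & \|x-y\| \ge \tau_j, \\[1ex]
\dfrac{3}{4\pi\tau_j^3}, & \|x-y\| < \tau_j,
\end{cases} \tag{15}
\]

and the Haar wavelet function

\[
K_{\tau_j}(\|x-y\|) = H_{\tau_{j+1}}(\|x-y\|) - H_{\tau_j}(\|x-y\|), \tag{16}
\]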

which are well known in constructive approximation. Higher-order Taylor approximations of the fundamental solution of the Laplace operator and resulting smoothed Haar kernels are considered in the PhD thesis Möhringer (2014).
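The relation between the regularized fundamental solution (4) and the Haar scaling function (15) can be checked numerically. The following is only a minimal sketch, not the authors' implementation: the function names, the test radii, and the finite-difference step are our own illustrative choices. It verifies that $\Phi_\tau$ is continuous at $\|x-y\| = \tau$, that $-\nabla^2 \Phi_\tau$ equals $3/(4\pi\tau^3)$ inside the ball of radius $\tau$ and vanishes outside, and that the Haar kernel integrates to one.

```python
import math

def phi_reg(r, tau):
    """Regularized fundamental solution Phi_tau of (4), as a function of r = |x - y|."""
    if r >= tau:
        return 1.0 / (4.0 * math.pi * r)
    return (3.0 - (r / tau) ** 2) / (8.0 * math.pi * tau)

def haar_scaling(r, tau):
    """Haar scaling function H_tau of (15)."""
    return 3.0 / (4.0 * math.pi * tau ** 3) if r < tau else 0.0

def neg_radial_laplacian(f, r, h=1e-4):
    """-Laplace of a radial function in R^3, i.e. -(f'' + 2 f'/r), by central differences."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return -(d2 + 2.0 * d1 / r)

tau = 0.5                                   # illustrative scale parameter
reg = lambda r: phi_reg(r, tau)

# Continuity of Phi_tau across r = tau.
jump_at_tau = abs(phi_reg(tau, tau) - phi_reg(tau * (1.0 - 1e-12), tau))

# Relation (14): -Laplace(Phi_tau) = H_tau, inside and outside the support.
inside = neg_radial_laplacian(reg, 0.2)
outside = neg_radial_laplacian(reg, 0.9)

# The Haar kernel has unit mass: H_tau(0) times the volume of the ball of radius tau.
ball_mass = haar_scaling(0.0, tau) * 4.0 / 3.0 * math.pi * tau ** 3
```

The unit mass of $H_\tau$ is what makes the low-pass filter (12) a local average of the density over a ball of radius $\tau_j$.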

If $\rho$ is assumed to be Hölder continuous on $B$, the low-pass filter converges in the sense that

\[
\lim_{J \to \infty} L_{\tau_J}[\rho](x) = \rho(x), \qquad x \in B, \tag{17}
\]

where $L_{\tau_J}[\rho](x)$ is expressed using a sum of band-pass filters

\[
L_{\tau_J}[\rho](x) = \int_B H_{\tau_j}(\|x-y\|)\,\rho(y)\,dV(y) + \sum_{l=j}^{J-1} \int_B K_{\tau_l}(\|x-y\|)\,\rho(y)\,dV(y), \tag{18}
\]

$j, J \in \mathbb{N}$, $J > j$. This is equivalent to the (more general) limit relation known from the theory of singular integrals in Euclidean space $\mathbb{R}^3$

\[
\lim_{J \to \infty} \int_B H_{\tau_J}(\|x-y\|)\,\rho(y)\,dV(y) = \rho(x), \qquad x \in \mathbb{R}^3. \tag{19}
\]
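The telescoping structure of (18) can be illustrated on a small grid. The sketch below is a discrete analogue under our own assumptions, not the procedure used for the BP model: the density is a hypothetical two-layer model on an $8 \times 8 \times 8$ grid, the Haar low-pass filter is replaced by the mean over all grid nodes closer than $\tau$ (renormalized near the boundary), and the scale sequence is chosen so that the finest scale reproduces the grid values exactly.

```python
import math

def ball_offsets(tau, h):
    """Index offsets (i, j, k) whose Euclidean length h*|(i, j, k)| is smaller than tau."""
    n = int(tau / h)
    return [(i, j, k)
            for i in range(-n, n + 1)
            for j in range(-n, n + 1)
            for k in range(-n, n + 1)
            if h * math.sqrt(i * i + j * j + k * k) < tau]

def low_pass(rho, tau, h):
    """Discrete analogue of L_tau[rho]: mean of rho over the grid nodes within distance tau."""
    offs = ball_offsets(tau, h)
    out = {}
    for (a, b, c) in rho:
        vals = [rho[(a + i, b + j, c + k)]
                for (i, j, k) in offs if (a + i, b + j, c + k) in rho]
        out[(a, b, c)] = sum(vals) / len(vals)
    return out

n, h = 8, 1.0
# Hypothetical two-layer density (values loosely in g/cm^3), interface at c = n/2.
rho = {(a, b, c): (2.6 if c >= n // 2 else 1.0)
       for a in range(n) for b in range(n) for c in range(n)}

taus = [4.0, 2.0, 1.0]                      # decreasing scales; tau = h returns rho itself
lows = [low_pass(rho, t, h) for t in taus]

# Band-pass parts: differences of consecutive low-pass filters, cf. (16).
bands = [{p: lows[l + 1][p] - lows[l][p] for p in rho} for l in range(len(taus) - 1)]

# Reconstruction as in (18): coarsest low-pass part plus all band-pass parts.
recon = {p: lows[0][p] + sum(band[p] for band in bands) for p in rho}
max_err = max(abs(recon[p] - rho[p]) for p in rho)
```

In this toy model the finest band-pass part is nonzero only near the layer interface, which mirrors the observation that band-pass filtering highlights density boundaries.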

Some properties of the multiscale reconstruction of the density via (18) are demonstrated here with regard to the BP density model (see Fig. 5). This model was constructed during a research workshop co-sponsored by the EAGE and the SEG from the 2004 BP density benchmark with the aim to properly validate and test not only a velocity model but also a corresponding density model. As a three-dimensional density model was not available, we extended the two-dimensional BP model by arranging several copies of the data set consecutively along the third orthogonal axis. Figure 6 shows the decomposition of the BP density model based on property (17) of the low-pass filters by displaying the scales $j = 1$ to $j = 7$, where the scale parameter $\tau_j$ is adapted to the expansion of the model. The detail information provided by band-pass filtering of the scales $j = 3, 4$ sharply shows all essential density boundaries. Concerning the second variant, i.e., efficient multiscale processing of density information from gravitational data input (cf. Freeden and Gerhards 2013), we come back to (5) and the limit relation (11) from which it is clear that the Newton potential can be approximated by

Fig. 5 Density contours of the BP model in $[\mathrm{g}/\mathrm{cm}^3]$ (due to Billette and Brandsberg-Dahl 2005)

Fig. 6 Multiscale decomposition of the (3D-extended) BP density model from scale $j = 1$ to $j = 7$ (low-pass filtered density, left; band-pass filtered density, right) in $[\mathrm{g}/\mathrm{cm}^3]$

\[
P(x) \simeq P_{\tau_J}(x) = \int_B \Phi_{\tau_J}(\|x-y\|)\,\rho(y)\,dV(y), \qquad x \in B, \; J \in \mathbb{N}, \tag{20}
\]

for sufficiently large $J$. An approximate integration formula over $B$ then leads to

\[
P(x) \simeq \sum_{i=1}^{N_J} \Phi_{\tau_J}(\|x - y_i^{N_J}\|)\, w_i^{N_J}\, \rho(y_i^{N_J}), \tag{21}
\]

where $w_i^{N_J}$, $y_i^{N_J}$, $i = 1, \ldots, N_J$, are prescribed weights and nodes, respectively. Within the inversion process of density modeling, the unknown coefficients

\[
a_i^{N_J} = w_i^{N_J}\, \rho(y_i^{N_J}), \qquad i = 1, \ldots, N_J, \tag{22}
\]

must be determined by solving a linear system of the type

\[
P(x_k^{M_J}) = \sum_{i=1}^{N_J} \Phi_{\tau_J}(\|x_k^{M_J} - y_i^{N_J}\|)\, a_i^{N_J}, \qquad k = 1, \ldots, M_J, \tag{23}
\]

from known gravitational values $P(x_k^{M_J})$ at the nodes $x_k^{M_J} \in B$, $k = 1, \ldots, M_J$. Since the integration weights $w_i^{N_J}$ are known, the density values $\rho(y_i^{N_J})$ are immediately obtainable via (22) such that the density $\rho$ can be determined by (cf. (19))

\[
\rho(x) \simeq \rho_{\tau_J}(x) = \sum_{i=1}^{N_J} H_{\tau_J}(\|x - y_i^{N_J}\|)\, w_i^{N_J}\, \rho(y_i^{N_J}), \qquad x \in B. \tag{24}
\]
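The collocation steps (21) to (23) can be sketched for a very small configuration. Everything specific below is an assumption made for illustration: eight quadrature nodes at the cell midpoints of a $2 \times 2 \times 2$ grid with midpoint weights, observation points identical to the nodes ($M_J = N_J$), synthetic data generated from the same discretization, and a plain Gaussian elimination. Realistic configurations are ill-conditioned and call for regularization, which this sketch does not address.

```python
import math

def phi_reg(r, tau):
    """Regularized fundamental solution Phi_tau of (4)."""
    if r >= tau:
        return 1.0 / (4.0 * math.pi * r)
    return (3.0 - (r / tau) ** 2) / (8.0 * math.pi * tau)

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

tau = 0.5                                    # illustrative scale tau_J
nodes = [(i + 0.5, j + 0.5, k + 0.5)
         for i in range(2) for j in range(2) for k in range(2)]
weights = [1.0] * len(nodes)                 # midpoint rule: cell volume h^3 with h = 1
rho_true = [2.0 + 0.1 * i for i in range(len(nodes))]   # hypothetical density values

# Matrix of (23); here the observation points x_k coincide with the nodes y_i.
A = [[phi_reg(dist(x, y), tau) for y in nodes] for x in nodes]

# Synthetic potential values generated with the quadrature formula (21).
data = [sum(A[k][i] * weights[i] * rho_true[i] for i in range(len(nodes)))
        for k in range(len(nodes))]

coeffs = solve(A, data)                              # coefficients a_i of (22)
rho_rec = [a / w for a, w in zip(coeffs, weights)]   # density values via (22)
max_err = max(abs(r - t) for r, t in zip(rho_rec, rho_true))
```

Because the data are produced by the same discretization that is inverted, the exact recovery here only confirms the algebra of (22) and (23); it says nothing about the stability of the inversion with measured data.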

2.2 Spline Method for Data Supplementation of Gravitational Measurement Information

In geothermal practice, we are often confronted with the problem of completing gravitational information by means of the knowledge of discrete spaceborne, airborne, and/or terrestrial observations as well as internal borehole data. Our purpose is to explain a spline method involving the framework of Newton’s volume potential. The point of departure is the well-known Poisson formula (see, e.g., Freeden and Gerhards 2013)

\[
P(x) = -\nabla_x^2 \int_B \Phi(\|x-y\|)\, P(y)\,dV(y)
= -\nabla_x^2 \int_B \underbrace{\int_B \Phi(\|x-y\|)\,\Phi(\|y-z\|)\,dV(y)}_{=\,K_H(x,z)}\; \rho(z)\,dV(z), \tag{25}
\]

where $K_H(\cdot,\cdot)$ is the uniquely determined reproducing kernel of a Hilbert space $H$ (of Sobolev type). The inner product $(\cdot,\cdot)_H$ in $H$ is given by

\[
\begin{aligned}
(P_1, P_2)_H
&= \left( \int_B \Phi(\|\cdot - y\|)\,\rho_1(y)\,dV(y),\; \int_B \Phi(\|\cdot - y\|)\,\rho_2(y)\,dV(y) \right)_H \\
&= \int_B \left[ \left( \nabla_x^2 \int_B \Phi(\|x-y\|)\,\rho_1(y)\,dV(y) \right) \left( \nabla_x^2 \int_B \Phi(\|x-y\|)\,\rho_2(y)\,dV(y) \right) \right] dV(x) \\
&= \int_B \rho_1(x)\,\rho_2(x)\,dV(x) \\
&= (\rho_1, \rho_2)_{L^2(B)}.
\end{aligned} \tag{26}
\]

In other words, the inner product $(\cdot,\cdot)_H$ may be interpreted by a transition from the $L^2$-product $(\cdot,\cdot)_{L^2(B)}$ for density functions $\rho_1, \rho_2 \in L^2(B)$ to the level of Newton volume potentials $P_1, P_2 \in H$. For each $x \in \mathbb{R}^3$ and every Newton volume potential $P \in H$, the reproducing property

\[
(K_H(x, \cdot), P)_H = P(x) \tag{27}
\]

holds true. This property implies that there exists a (spline) function $S \in H$ satisfying the minimum norm interpolation relation

\[
\|S\|_H = \min_{P \in I} \|P\|_H, \tag{28}
\]

where $I$ is the class of Newton potentials in $H$ consistent with the interpolation conditions

\[
L_i S = L_i P = \alpha_i, \qquad i = 1, \ldots, N. \tag{29}
\]

$L_i$, $i = 1, 2, \ldots, N$, denote linear bounded functionals on $H$ characterizing measurable gravitational observables (e.g., potential values, gravity anomalies), whereas $\alpha_i$, $i = 1, 2, \ldots, N$, are the known measured quantities. The interpolation spline $S \in H$ then takes the form

\[
S(x) = \sum_{i=1}^{N} a_i\, L_i K_H(x, \cdot), \qquad x \in \mathbb{R}^3. \tag{30}
\]

The coefficients $a_i$, $i = 1, 2, \ldots, N$, are obtainable via the linear system

\[
\sum_{j=1}^{N} a_j\, L_i L_j K_H(\cdot, \cdot) = \alpha_i, \qquad i = 1, \ldots, N. \tag{31}
\]

In case of erroneous data, we can proceed from spline interpolation to spline smoothing (cf. Freeden 1981, 1999). As indicated by (4), the function $\Phi(\|x-y\|)$ may be regularized. Consequently, the reproducing kernel

\[
K_H(x, z) = \int_B \Phi(\|x-y\|)\,\Phi(\|y-z\|)\,dV(y) \tag{32}
\]

may be replaced by the regularized kernel

\[
K_H^{\tau_J}(x, z) = \int_B \Phi_{\tau_J}(\|x-y\|)\,\Phi_{\tau_J}(\|y-z\|)\,dV(y) \tag{33}
\]

for a sufficiently large integer $J$ such that $\tau_J$ is a sufficiently small element of a positive, monotonically decreasing sequence $\{\tau_j\}_{j \in \mathbb{N}}$ as introduced above. The important difference is that the right-hand side of (32) is an improper integral in $\mathbb{R}^3$ for all $x, z \in B$, whereas the right-hand side of (33) is a regular integral (thus, allowing the application of unbounded functionals $L_i$, e.g., oblique derivatives). Therefore, if the values $L_i P = \alpha_i$, $i = 1, 2, \ldots, N$, are known, we are able to derive an expression in terms of the regularized kernels

\[
S(x) \simeq S^{\tau_J}(x) = \sum_{i=1}^{N} a_i\, L_i K_H^{\tau_J}(x, \cdot)
= \sum_{i=1}^{N} a_i\, L_i \int_B \Phi_{\tau_J}(\|x-y\|)\,\Phi_{\tau_J}(\|y - \cdot\|)\,dV(y), \qquad x \in B, \tag{34}
\]

as well as, by applying the operator $-\nabla_x^2$ to both sides of (34), the corresponding density function (cf. Eq. (14))

\[
\rho(x) \simeq \rho^{\tau_J}(x) = \sum_{i=1}^{N} a_i \int_B H_{\tau_J}(\|x-y\|)\, L_i \Phi_{\tau_J}(\|y - \cdot\|)\,dV(y), \qquad x \in B, \tag{35}
\]

for sufficiently large J . The representations (34) and (35) hold independently of the location of the measurement points (i.e., whether we consider inner, terrestrial, or outer data). Furthermore, (34) and (35) allow a decorrelation of the signal in a similar manner as achieved by the multiscale representation in Sect. 2.1. Some numerical results for the spline method are shown in Fig. 8 for a section of the BP density model as shown in Fig. 7. For more details, the reader is referred to the PhD thesis Möhringer (2014).
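The interpolation system (31) with the regularized kernel (33) can be sketched as follows. This is a minimal illustration under assumptions of our own: the functionals $L_i$ are taken to be point evaluations $L_i P = P(x_i)$, the volume integral in (33) is replaced by a midpoint rule on a hypothetical $3 \times 3 \times 3$ cell grid, and the measurement points and values are invented. It is not the implementation behind Fig. 8.

```python
import math

def phi_reg(r, tau):
    """Regularized fundamental solution Phi_tau of (4)."""
    if r >= tau:
        return 1.0 / (4.0 * math.pi * r)
    return (3.0 - (r / tau) ** 2) / (8.0 * math.pi * tau)

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

tau = 0.5
quad_nodes = [(i + 0.5, j + 0.5, k + 0.5)
              for i in range(3) for j in range(3) for k in range(3)]
quad_weight = 1.0                            # midpoint rule on unit cells of B = [0, 3]^3

def kernel(x, z):
    """Midpoint-rule approximation of the regularized reproducing kernel (33)."""
    return quad_weight * sum(phi_reg(dist(x, y), tau) * phi_reg(dist(y, z), tau)
                             for y in quad_nodes)

# Hypothetical measurement points (two outside, two inside B) and potential values.
points = [(-1.0, 1.5, 1.5), (4.0, 1.5, 1.5), (1.0, 1.0, 1.0), (2.2, 1.7, 0.8)]
alphas = [0.30, 0.25, 0.90, 0.70]

# Linear system (31) for point-evaluation functionals.
gram = [[kernel(xi, xj) for xj in points] for xi in points]
coeffs = solve(gram, alphas)

def spline(x):
    """Interpolating spline (34) for point-evaluation functionals."""
    return sum(a * kernel(x, xi) for a, xi in zip(coeffs, points))

residual = max(abs(spline(xi) - al) for xi, al in zip(points, alphas))
```

The same Gram-matrix structure applies to other bounded functionals; only the entries $L_i L_j K_H^{\tau_J}$ change.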

Fig. 7 Section of the BP density model chosen for the implementation of the spline interpolation in $[\mathrm{g}/\mathrm{cm}^3]$

Fig. 8 Spline interpolation of the density in $[\mathrm{g}/\mathrm{cm}^3]$ in a local region for different scales $j$: (a) $j = 6$, (b) $j = 7$, (c) $j = 8$, (d) $j = 9$

2.3 Gravitational Signatures of Hotspots/Mantle Plumes

Because their temperature differs from that of their vicinity, hotspots/plumes play a particular role in geothermal applications, e.g., for locating high temperature water at moderate depths. Here we essentially follow the theory developed in the monograph Freeden and Gerhards (2013) on potential theory to model gravitational anomalies caused by hotspots/plumes with respect to their horizontal/vertical spatial extensions inside the Earth. Nowadays, the concept of mantle plumes is widely accepted in the geoscientific community. Mantle plumes are understood to be approximately cylindrically concentrated upflows of hot mantle material with a common diameter of about 100–200 km. They are an upwelling of abnormally hot rock within the Earth’s mantle. As the heads of mantle plumes can partly melt when they reach shallow depths, they are thought to be the cause of volcanic centers known as hotspots. Hotspots were first explained by Wilson (1963) as long-term sources of volcanism that are fixed, with a tectonic plate overriding them. Following Morgan (1971), characteristic surface signatures of hotspots are due to the rise and melting of hot plumes from deep areas in the mantle. Special cases occur as chains of volcanic edifices whose age progresses with increasing distance to the plume, like the islands of Hawaii. They are the result of pressure-release melting close to the bottom of the lithosphere, which produces magma rising to the surface, and of plate motion relative to the plume. The term “hotspot” is used rather loosely. It is often applied to any long-lived volcanic center that is not part of the global network of mid-ocean ridges and island arcs, with Hawaii serving as the classical example. Anomalous regions of thick crust on ocean ridges, like Iceland, are also considered to be hotspots.
The multiscale reconstruction/decomposition proposed by Freeden (1999), Freeden and Schreiner (2006), Freeden and Wolf (2009), Freeden et al. (2009), and Freeden and Gerhards (2013) used for modeling gravity anomalies and/or vertical deflections caused by hotspots/plumes essentially consists of two ingredients: First, terrestrial gravity anomalies and/or deflections of the vertical are taken as gravitational input data. Second, significant tools for signal recovery are locally supported Haar wavelets and their smoothed versions (see, e.g., Freeden et al. 1998). Clearly, the size of the local support depends on the scale of the wavelet, i.e., with increasing scale its diameter decreases. This is the reason why the wavelet concept allows a “zooming-in” process to local (high-frequency) phenomena. It turns out that the application of (smoothed) Haar wavelets provides a powerful approximation technique for the investigation of, e.g., local fine-structured hotspots/plume features. The included illustrations (taken from the PhD theses Fehlinger (2009) and Wolf (2009); analogous computations in geomagnetics can be found in Gerhards (2011, 2012)) show that the presented multiscale procedure allows a scale- and space-dependent characterization of geophysically reflected phenomena. The wavelet coefficients can be interpreted as spatial measures of certain frequency bands contained in the signal signatures. Thereby, the wavelet theory offers a physically relevant approach for detecting and decorrelating hotspots/plume features. A critical point for numerical computation involving Haar wavelets is that today only terrestrial gravitational data sets of limited spatial extent are available. This causes numerical instabilities (oscillations) at the boundaries of the test areas under consideration.
In consequence, it is a twofold challenge for future work to combine globally given satellite and locally available terrestrial data to get a higher accuracy within the modeling process as well as to avoid artificial phenomena, such as Gibbs phenomena, by embedding the local test area in a larger satellite-generated regional framework.

Fig. 9 Interpretation of seismic tomography results by Ritter and Christensen (2007) (modified version by Fehlinger 2009)

The hotspots/plume test areas selected here for multiscale demonstration are the Hawaiian islands and Iceland. Hawaii: Ritter and Christensen (2007) believe that a stationary mantle plume located beneath the Hawaiian Islands created the Hawaii-Emperor seamount chain while the oceanic lithosphere continuously passed over it (Fig. 9). The Hawaii-Emperor chain consists of about 100 volcanic islands, atolls, and seamounts that spread nearly 6000 km from the active volcanic island of Hawaii to the 75–80 Ma old Emperor seamounts near the Aleutian trench. Moving further southeast along the island chain, the geological age decreases. The interesting area is the relatively young southeastern part of the chain, situated on the Hawaiian swell, a 1200 km broad anomalously shallow region of the ocean floor, extending from the island of Hawaii to the Midway atoll. Here, a distinct gravity disturbance and a geoid anomaly occur which have their maxima around the youngest island. Both coincide with the maximum topography and both decrease in northwestern direction. The progressive decrease in terms of the geological age is believed to result from the continuous motion of the underlying plate (cf. Wilson 1963; Morgan 1971). Using seismic tomography, several features of the Hawaiian mantle plume are gained (cf. Ritter and Christensen 2007, and the references specified therein). They reveal a low-velocity zone (LVZ) beneath the lithosphere, starting at a depth of about 130–140 km, below the central part of the island of Hawaii. So far, plumes have just been identified as low seismic velocity anomalies in the upper mantle and the transition zone by the use of seismic wave tomography, which is a fairly new achievement. Because plumes are relatively thin in lateral direction according to their diameter, they are hard to detect in global tomography models.
Hence, despite novel advances, there is still no general agreement on the fundamental questions concerning mantle plumes, like their depth of origin, their morphology, and their longevity, and even their existence is still discussed controversially. This is due to the fact that many geophysical as well as geochemical observations can be explained by different plume models and even by models that do not include plumes at all (e.g., Foulger et al. 2005). With our space-localized multiscale method of deriving gravitational signatures (see Freeden et al. 2009), more precisely the disturbing potential, from the deflections of the vertical, we add a new component in specifying essential features of plumes. From the band-pass filtered detail approximation (Fig. 10), we are able to conclude that the Hawaii plume has an oblique layer structure. As can be seen in the lower scale, which reflects the greater depths, the strongest signal is located in the ocean in a westward direction of Hawaii. With increasing scale, i.e., closer to the surface, it moves more and more to the Big Island of Hawaii, i.e., in eastward direction.

Fig. 10 Band-pass filtered details of the disturbing potential in $[\mathrm{m}^2/\mathrm{s}^2]$ from gravity anomalies in the region of Hawaii using discrete (smoothed) Haar wavelets for $j = 9, 11, 13$ (left) and $j = 10, 12, 14$ (right)

Iceland: The plume beneath Iceland (cf. Freeden et al. 2009) is a typical example of a ridge-centered mantle plume. An interaction between the North Atlantic ridge and the mantle plume is believed to be the reason for the existence of Iceland, resulting in melt production and crust generation since the continental break-up in the late Paleocene and early Eocene. Nevertheless, there is still no agreement on the location of the plume before rifting started in the East. Controversial discussions, whether it was located under central or eastern Greenland about 62–64 Ma ago, are still in progress (cf. Schubert et al. 2001, and the references therein). Iceland itself represents the top of a nearly circular rise of topography, with the maximum of about 2.8 km above the surrounding seafloor in the south of the glacier “Vatnajökull.” Beneath this glacier, several active volcanoes are located, which are supposed to result from the mantle plume. The surrounding oceanic crust consists of three different types involving a crustal thickness that is more than three times that of average oceanic crust. Seismic tomography provides evidence of the existence of a mantle plume beneath Iceland, resulting in low-velocity zones in the upper mantle and a transition zone, but hints of anomalies in the deeper mantle also seem to exist. The low-velocity anomalies have been detected in depths ranging from at least 400 km up to about 150 km. Above 150 km, ambiguous seismic velocity structures were obtained involving regions of low velocities covered by regions of high seismic velocities. For a deeper insight into the theory of the Iceland plume, the interested reader is referred to Ritter and Christensen (2007) and the references therein. From our multiscale reconstruction, it can be derived that the deeper parts of the mantle plume are located in the northern part of Iceland (compare the lower scales in Fig. 11), while shallower parts are located further south (compare the higher scales in Fig. 11). It is remarkable that from scale 13 on, the plume seems to divide into two sectors.

Fig. 11 Band-pass filtered details of the disturbing potential in $[\mathrm{m}^2/\mathrm{s}^2]$ from gravity anomalies in the region of Iceland using Haar wavelets for $j = 10, 12, 14$ (left) and $j = 11, 13, 15$ (right)

Fig. 12 Band-pass filtered details of the disturbing potential in $[\mathrm{m}^2/\mathrm{s}^2]$ from gravity anomalies in the region of Iceland using Haar wavelets for $j = 14$ (left) and $j = 15$ (right) including the Mid-Atlantic Ridge (gray)

Fig. 13 Geothermal power plants in Iceland (following the International Energy Agency 2010)

As the North American plate moves westward and the Eurasian plate eastward, new crust is generated on both sides of the Mid-Atlantic Ridge. In case of Iceland, which lies on the Mid-Atlantic Ridge, the neovolcanic zones are readily seen in Fig. 12. In Iceland, electrical production from geothermal power plants has been developed rapidly (see Georgsson and Friedleifsson 2009; Saemundsson 2009). Reflecting the geological situation (see Fig. 13), Iceland is a unique country with regard to utilization of geothermal energy, with more than 50 % of its primary energy consumption coming from geothermal power plants. All in all, by the space-based multiscale techniques initiated by Freeden and Schreiner (2006) in gravitation and by Freeden and Gerhards (2010) in geomagnetics, we are able to come to interpretable results involving hotspots/mantle plumes. In particular, for Iceland, we are led to the conclusion (cf. Fig. 12) that Iceland is split into three areas with characteristic ages of the basaltic rock. Tertiary flood basalt fills up most of the northwestern area. This formation of lava must be of considerable thickness, probably more than 3 km. Quaternary flood basalt must be revealed in the southwest and southeast. These rocks are cut by the neovolcanic areas of active rifting. Indeed, this area covers almost one-third of Iceland. However, it should be mentioned that our multiscale method offers better results for specifying hotspots/mantle plumes with respect to their horizontal than vertical size. A more detailed study of their depth remains a challenge for future work. One possible remedy is the artificial positioning of buried mass points in order to study the behavior of the multiscale representation for known depths of the anomaly. First attempts in this direction have been made in the PhD thesis Fehlinger (2009) and in Gerhards (2014). Indeed, this technique allows us to estimate the locations of upwelling gravity anomalies indicating a higher temperature in these regions.

2.4 Gravito-Magneto Combined Inversion

As the density $\rho$ of a section of the underground is coupled to its gravitational effect, its magnetization $m$ results in a magnetic field that can be related to a magnetic potential $M$ (see Fig. 14). However, while the density $\rho$ is scalar, the magnetization $m$ is a vector-valued quantity. Similarly to the inversion problem in the gravitational case via the Newton integral (1), major difficulties are faced in the determination of the magnetization $m$ from the magnetic potential $M$ in form of the integral expression

\[
M(x) = \frac{1}{4\pi} \int_B m(y) \cdot \nabla_y \frac{1}{\|x-y\|}\,dV(y). \tag{36}
\]

Most of the known inversion techniques (see, e.g., Menke 1984; Blakely 1996; Freeden and Gerhards 2013, and the references therein) for this problem make use of the replacement of the integral by a suitable (finite) sum and subsequent computation of a suitable solution of the linear equation system. Usually, gravity and magnetic inversion are handled separately (see, e.g., Turcotte and Schubert 2001; Ernstson and Alt 2013) in order to obtain density and magnetization independently from one another.

Fig. 14 Principle of gravimetry (left) and its geomagnetic counterpart (right) (modified illustration based on Jacobs and Meyer 1992, with permission by B. G. Teubner)

At the same time, Poisson’s relation (see Blakely 1996) shows that for each subset $K$ of a body $B \subset \mathbb{R}^3$ with uniform magnetization (i.e., $m(y) = m$, $y \in K$) as well as uniform density (i.e., $\rho(y) = \rho$, $y \in K$), we have

\[
\begin{aligned}
M(x) &= \frac{1}{4\pi} \int_K m \cdot \nabla_y \frac{1}{\|x-y\|}\,dV(y)
= -\frac{1}{4\pi}\, m \cdot \nabla_x \int_K \frac{1}{\|x-y\|}\,dV(y) \\
&= -\frac{1}{4\pi}\, \frac{m}{\rho} \cdot \int_K \rho(y)\, \nabla_x \frac{1}{\|x-y\|}\,dV(y)
= -\frac{m}{\rho} \cdot \nabla_x P(x)
\end{aligned} \tag{37}
\]

for all $x \in \mathbb{R}^3 \setminus K$ and the gravitational potential $P$ from (1). In other words, if a body has a uniform magnetization and density, then the magnetic potential is proportional to the gravitational field component in the direction of the magnetization. These facts allow for a gravito-magneto combined inversion, where, e.g., the Euler summation formula for the Laplace operator $\nabla^2$ and the boundary condition of periodicity (see Freeden 2011) turns out to be an advantageous tool (see Augustin et al. 2012, for more details). Multiscale methods similar to those described in the previous sections can be used to quantify deviations from uniformity and, thus, indicate gravitational/magnetic anomalies.
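Relation (37) can be checked numerically for a small uniform body. The sketch below rests on assumptions made purely for illustration: the body $K$ is represented by four small cells treated as point sources, the density, magnetization vector, cell volume, and observation point are invented values, and the gradient of $P$ is approximated by central differences.

```python
import math

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical uniform body K: cell midpoints, common cell volume, density, magnetization.
cells = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.1)]
vol = 1e-3
rho = 2.7
m = (0.3, -0.2, 0.9)

def grav_potential(x):
    """Newton potential of K with the kernel 1/(4 pi |x - y|), gravitational constant 1."""
    return sum(rho * vol / (4.0 * math.pi * norm(sub(x, y))) for y in cells)

def mag_potential(x):
    """Magnetic potential (36) restricted to K; grad_y(1/|x - y|) = (x - y)/|x - y|^3."""
    return sum(vol * dot(m, sub(x, y)) / (4.0 * math.pi * norm(sub(x, y)) ** 3)
               for y in cells)

def gradient(f, x, h=1e-5):
    """Central-difference gradient of a scalar function f at the point x."""
    g = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(tuple(xp)) - f(tuple(xm))) / (2.0 * h))
    return tuple(g)

x_obs = (1.0, 0.5, 2.0)                      # observation point outside K
lhs = mag_potential(x_obs)                   # M(x)
rhs = -dot(m, gradient(grav_potential, x_obs)) / rho   # -(m / rho) . grad_x P(x)
```

Where measured $M$ and $\nabla P$ violate this proportionality, the body cannot be uniform in both density and magnetization, which is the deviation the multiscale methods are meant to quantify.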

3 Seismic Processing

Following the Kaiserslautern model (cf. Fig. 4), we next deal with the explanation of seismic post-processing procedures based on multiscale techniques obtained by regularization of singular integrals occurring in seismic reflection tomography.

3.1 Seismic Recording and Data Retrieval For seismic recording, an energy source (vibroseis, air gun, etc.) is placed on the surface. While the energy source generates a wave impulse, a set of receivers (geophones, hydrophones) placed along one or many parallel lines record this impulse after it is transmitted through the Earth’s interior, reflected at places of impedance contrasts (rapid changes of density/velocity) and transmitted back to the surface. Then, the configuration is moved into the direction of seismic acquisition and the procedure is repeated (see Fig. 15), so that each underground point is represented from all incidence angles needed for further data analysis. Other strategies to retrieve seismic data can be found, e.g., in Yilmaz (1987), Biondi (2006), and Claerbout (2009). In the context of seismic imaging, it is usually assumed that shear stresses generated by the wave impulse and other kinds of damping can be neglected. As a consequence, wave propagation is treated as an acoustic phenomenon. A derivation of the acoustic wave equation in seismic imaging can be found, e.g., in Freeden (2014). It is given by @2 K.x/ 2 @2 p.x; t / D p.x; t / C S.x; t /: r @t 2 .x/ x @t 2

(38)

Here, $p(x,t)$ is the pressure, $K(x)$ is the bulk modulus, $\rho(x)$ is the density, and $S(x,t)$ contains pressure sources and sinks. Equation (38) implies that the propagation speed of a wave, i.e., the Euclidean norm of the velocity vector, is given by

$$c(x) = \sqrt{\frac{K(x)}{\rho(x)}} \tag{39}$$

and the wave equation can be written as

$$\left( \nabla_x^2 - \frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2} \right) p(x,t) = -\frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2} S(x,t). \tag{40}$$

Fig. 15 Seismic acquisition (modified illustration taken from Jacobs and Meyer 1992, with permission by B. G. Teubner)

A solution scheme for (40) can be found by applying the Fourier transformation with respect to time. With the assumption that $\frac{\partial^2}{\partial t^2} S(x,t) = 0$ for all $x$ and the definition

$$U(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} p(x,t)\, e^{i\omega t}\, dt \tag{41}$$

we get

$$\left( \nabla_x^2 + \frac{\omega^2}{c^2(x)} \right) U(x) = 0. \tag{42}$$

This leads to the definition of the wave number $k(x)$ and the refraction index $N(x)$ by

$$N(x) = \frac{c_0}{c(x)}, \tag{43}$$

$$k(x) = \frac{\omega}{c(x)} = \frac{\omega}{c_0}\, \frac{c_0}{c(x)} = k_0 N(x), \tag{44}$$

with $c_0$ being a suitable constant reference velocity (see Engl et al. 1996; Snieder 2002; Biondi 2006, and the references therein). Accordingly, the wave equation (42) can be written as

$$\left( \nabla_x^2 + k_0^2 N^2(x) \right) U(x) = 0. \tag{45}$$
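The quantities in (39), (43), and (44) are simple to evaluate once a medium and a frequency are fixed. The following is a minimal sketch; the function names and all numerical values (a water-like medium, a reference velocity of 2000 m/s, a frequency of 20 Hz) are assumptions chosen for illustration only, and the relation $\omega = 2\pi f$ is used to pass from frequency to angular frequency.

```python
import math


def wave_speed(bulk_modulus, density):
    """Propagation speed c = sqrt(K / rho), Eq. (39)."""
    return math.sqrt(bulk_modulus / density)


def refraction_index(c, c0):
    """Refraction index N = c0 / c, Eq. (43)."""
    return c0 / c


def wave_number(frequency_hz, c):
    """Wave number k = omega / c with omega = 2 * pi * f, Eq. (44)."""
    return 2.0 * math.pi * frequency_hz / c


# Hypothetical values (not taken from the text): a water-like medium.
K = 2.25e9       # bulk modulus in Pa
rho = 1000.0     # density in kg/m^3
c = wave_speed(K, rho)

c0 = 2000.0      # assumed constant reference velocity in m/s
f = 20.0         # assumed frequency in Hz
k0 = wave_number(f, c0)
N = refraction_index(c, c0)
k = wave_number(f, c)

print(c, N, k0, k)
```

By construction, `k` and `k0 * N` agree up to rounding, which is exactly the identity $k(x) = k_0 N(x)$ in (44).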


The region where $N(x) \neq 1$ represents the scattering object, such that $N(x) - 1$ may be supposed to have compact support. Another standard assumption is that the difference between $c(x)$ and $c_0$ is small. As a consequence, $N^2(x)$ may be expanded into a Taylor series up to order one about a center $x_0$ with $c(x_0) = c_0$. This yields

$$N^2(x) = 1 + \varepsilon\, \nu(x) \tag{46}$$

with the small perturbation parameter $\varepsilon$ and consequently

$$k^2(x) = k_0^2 N^2(x) = k_0^2 \left( 1 + \varepsilon\, \nu(x) \right). \tag{47}$$

With the same argument as before, the unknown function $\nu(x)$ may be supposed to have compact support. The wave operator may be split into

$$A_x = \nabla_x^2 + \frac{\omega^2}{c^2(x)} = \nabla_x^2 + k_0^2 N^2(x) = \nabla_x^2 + k_0^2 \left( 1 + \varepsilon\, \nu(x) \right) = \nabla_x^2 + k_0^2 + \varepsilon k_0^2\, \nu(x) = A_x^{(0)} + \varepsilon A_x^{(1)} \tag{48}$$

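with $A_x^{(0)} = \nabla_x^2 + k_0^2$ and $A_x^{(1)} = k_0^2\, \nu(x)$. Moreover, the wave field $U$ may be split into an incident wave field $U_I$, corresponding to the wave propagating in the absence of the scatterer, and the scattered wave field $U_S$ such that

$$U = U_I + U_S. \tag{49}$$

This leads to

$$A^{(0)} U_I = \left( \nabla_x^2 + k_0^2 \right) U_I = 0, \tag{50}$$

$$A^{(0)} U_S = \left( \nabla_x^2 + k_0^2 \right) U_S = -\varepsilon k_0^2\, \nu \left( U_I + U_S \right) = -\varepsilon k_0^2\, \nu\, U = -\varepsilon A^{(1)} U. \tag{51}$$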
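The step from (45), (48), and (49) to (51) is a short computation. The following lines spell it out as a sketch, assuming only that the incident field solves the unperturbed equation (50):

```latex
\begin{align*}
0 &= A_x U = \bigl(A_x^{(0)} + \varepsilon A_x^{(1)}\bigr)\,(U_I + U_S) \\
  &= \underbrace{A_x^{(0)} U_I}_{=\,0} + A_x^{(0)} U_S + \varepsilon A_x^{(1)} U
  \quad\Longrightarrow\quad
  A_x^{(0)} U_S = -\varepsilon A_x^{(1)} U .
\end{align*}
```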

An image of the subsurface structure corresponding to some given parameter like the wave propagation velocity or the underground density is produced by corresponding methods of seismic migration. By migration of the seismic data, the seismogram (amplitudes) recorded in time is shifted to its “true” depth position (see Figs. 15 and 16), so that the shape, depth, and reflection coefficients of different structures can be reconstructed (for more details, see, e.g., Yilmaz 1987, Claerbout 2009, and the references therein). All migration methods use an approximate velocity model obtained by means of a “migration velocity analysis” (e.g., tomography, full wave inversion, etc.) in the computation process (for more details, the reader is referred, e.g., to Biondi 2006, and the references therein). In addition, migration methods can be applied recursively in order to refine the given velocity model. Here, the migration is repeated with a velocity differing from the initial model by a small perturbation in a local area. In the end, the velocity model with the “best” reflector image is chosen as the final model.


Fig. 16 Coherence between seismic experiments and migration (from Ilyasov 2011)

The migration methods known today are all based on some approximation of the wave equation or, more generally, of the elastodynamic equation and can roughly be divided into the following groups:

• Ray-based methods, which usually model the high-frequency asymptotic solution (see Bleistein et al. 2000) in terms of Gaussian beams (e.g., Popov 1982; Semtchenok et al. 2009), or Kirchhoff migration based on the solution of the eikonal equation (e.g., Vidale 1988; Podvin and Lecomte 1991; Buske 1994)
• Depth continuation methods, which are usually based on the one-way wave equation and compute wave fields from one depth level to the next (e.g., Xie and Wu 2006; Deng and McMechan 2007; Claerbout 2009)
• Reverse time migration, which is based on the full wave equation and follows the recorded seismogram backward in time until the starting time is reached (e.g., Baysal et al. 1983; Bording and Liner 1994; Haney et al. 2005; Symes 2007)

The numerical realizations of all aforementioned methods can be classified according to Yilmaz (1987) in three broad categories:

• Algorithms based on integral solutions to the acoustic and/or elastic wave equation (e.g., Bleistein 1987; Symes 2003; Xie and Wu 2006; Nolet 2008; Semtchenok et al. 2009)
• Algorithms based on finite-difference solutions (e.g., Baysal et al. 1984; Renaut and Fröhlich 1996; Du and Bancroft 2004; Jia and Hu 2006)
• Algorithms based on frequency/wave number implementations (e.g., Bonomi and Pieroni 1998; Takenaka et al. 1999; Bollhöfer et al. 2008)

In order to achieve better accuracy and resolution in the resulting image, modern migration methods can combine any number of strategies. Examples are algorithms which compute an initial approximation with the finite-difference approximation of the full wave equation (Wu et al. 2006) and additionally apply the depth continuation method based on the space-frequency implementation.

3.2 Seismic Post-processing in a Multiscale Framework by Means of Surface-Layer Potentials

Our aim now is the decomposition of a signal $F$ (for example, a seismogram or a migration result of a seismogram) into multiscale components using space-localized Helmholtz wavelets (as proposed by Freeden et al. 2003) associated with a given wave number $k_0$. As essential tools, limit and jump relations are used pointwise in the way known from mathematical physics for Helmholtz surface-layer potentials. Our approach to post-processing is then based on a multiscale technique developed by Freeden and Mayer (2003). The starting point for the following considerations is given by Eqs. (50) and (51). Let $S$ be the surface of $B$. As the fundamental solution to the Helmholtz operator $\nabla^2 + k_0^2$ is known to be (Müller 1998)

$$\Phi(\nabla^2 + k_0^2; \|x - y\|) = \frac{1}{4\pi}\, \frac{e^{i k_0 \|x - y\|}}{\|x - y\|}, \quad x \neq y, \tag{52}$$

$U_I$ and $U_S$ can be represented by the potentials

$$U_I(x) = \int_S F(y) \left( \nabla_y \Phi(\nabla^2 + k_0^2; \|x - y\|) \right) \cdot n(y)\, dS(y), \tag{53}$$

$$U_S(x) = \varepsilon k_0^2 \int_K \nu(y)\, \Phi(\nabla^2 + k_0^2; \|x - y\|)\, U(y)\, dV(y), \tag{54}$$

with the surface element $dS$, the unit normal vector $n(y)$, a suitable, sufficiently differentiable function $F$, and $K = \operatorname{supp}(\nu)$ being the support of $\nu$. Here, Eq. (53) gives $U_I$ as a double-layer potential, whereas (54) gives $U_S$ as a volume potential. In the following, we will only consider (53). The double-layer potential can be used to represent solutions to the Helmholtz equation on some bounded domain $B$ or the exterior domain $\mathbb{R}^3 \setminus B$, respectively (cf. Freeden et al. 2003, and the references therein). Starting from this, we assume that $S$ is piecewise locally twice continuously differentiable and $F$ is continuous. As the double-layer potential is singular, the first step consists in regularizing it. For this purpose, we introduce a regularization of the fundamental solution $\Phi(\nabla^2 + k_0^2; \|x - y\|)$ as (see Freeden et al. 2003, and the references therein)

$$\Phi(\nabla^2 + k_0^2; \|(x + \tau n(x)) - (y + \sigma n(y))\|) = \frac{1}{4\pi}\, \frac{e^{i k_0 \|(x + \tau n(x)) - (y + \sigma n(y))\|}}{\|(x + \tau n(x)) - (y + \sigma n(y))\|} \tag{55}$$

with $\tau \neq \sigma$ and $|\tau|$, $|\sigma|$ small. A regularization of the kernel $\left( \nabla_y \Phi(\nabla^2 + k_0^2; \|x - y\|) \right) \cdot n(y)$ is then given by

$$\left. \left( \nabla_y \Phi(\nabla^2 + k_0^2; \|(x + \tau n(x)) - (y + \sigma n(y))\|) \right) \cdot n(y) \right|_{\sigma = 0} = \frac{1}{4\pi}\, \frac{n(y) \cdot (x + \tau n(x) - y)\, e^{i k_0 \|x + \tau n(x) - y\|}}{\|x + \tau n(x) - y\|^2} \left( -i k_0 + \frac{1}{\|x + \tau n(x) - y\|} \right). \tag{56}$$

The operator of the double-layer potential on $S$ for values on the shifted surface $S(\tau)$ is given as

$$P_{|2}(\tau, 0; k_0)[F](x) = \int_S F(y) \left. \left( \nabla_y \Phi(\nabla^2 + k_0^2; \|(x + \tau n(x)) - (y + \sigma n(y))\|) \right) \cdot n(y) \right|_{\sigma = 0} dS(y). \tag{57}$$

In a more general approach to regularized surface-layer potentials for the Helmholtz equation, Freeden et al. (2003) prove limit and jump relations. For our purposes here, we need the following limit relation (given by Freeden et al. 2003, for a slightly different $\Phi(\nabla^2 + k_0^2; \|(x + \tau n(x)) - (y + \sigma n(y))\|)$)

$$\lim_{\substack{\tau \to 0 \\ \tau > 0}} \left( P_{|2}(\tau, 0; k_0)[F](x) - P_{|2}(0, 0; k_0)[F](x) \right) = \frac{1}{2} F(x). \tag{58}$$

This leads to the definition of the continuous Helmholtz scaling function

$$\Phi_\tau(x,y) = \frac{1}{2\pi} \left[ \frac{n(y) \cdot (x + \tau n(x) - y)\, e^{i k_0 \|x + \tau n(x) - y\|}}{\|x + \tau n(x) - y\|^2} \left( -i k_0 + \frac{1}{\|x + \tau n(x) - y\|} \right) - \frac{n(y) \cdot (x - y)\, e^{i k_0 \|x - y\|}}{\|x - y\|^2} \left( -i k_0 + \frac{1}{\|x - y\|} \right) \right] \tag{59}$$

such that

$$\lim_{\substack{\tau \to 0 \\ \tau > 0}} \int_S F(y)\, \Phi_\tau(x,y)\, dS(y) = F(x), \tag{60}$$

as shown in Freeden et al. (2003). We omitted the reference to the Helmholtz operator here and will do so further on. Once again, in numerical applications, instead of the continuously varying parameter $\tau$, a positive monotonically decreasing sequence $(\tau_j)_{j \in \mathbb{N}}$ with $\lim_{j \to \infty} \tau_j = 0$ is chosen (of which, of course, only finitely many elements are used). Thus, we obtain the associated sequence $(\Phi_{\tau_j})_{j \in \mathbb{N}}$ of scaling functions $\Phi_{\tau_j}$ of scale $j$. Moreover, we introduce a sequence $(\Psi_{\tau_j})_{j \in \mathbb{N}}$ of (difference) wavelet functions $\Psi_{\tau_j}$ of scale $j$ given by (cf. (9))

$$\Psi_{\tau_j} = \Phi_{\tau_{j+1}} - \Phi_{\tau_j}. \tag{61}$$
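The limit relation (60) can be checked numerically in a simple special case. The sketch below assumes a flat surface $S$ (the plane $x_3 = 0$ with constant normal $e^{(3)}$, an assumption made purely for illustration) and $F \equiv 1$. On a plane, $n(y) \cdot (x - y) = 0$, so the unshifted term in (59) vanishes, and a direct calculation shows that the integral of the scaling function over the whole plane equals $e^{i k_0 \tau}$, which tends to 1 as $\tau \to 0$. The function names, the substitution $s = \tau \sinh u$ for the radial variable, and the truncation radius are choices of this example and are not taken from Freeden et al. (2003).

```python
import cmath
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def double_layer_kernel(z, y, n_y, k0):
    """Regularized double-layer kernel, Eq. (56), evaluated at z = x + tau * n(x)."""
    d = [zi - yi for zi, yi in zip(z, y)]
    r = math.sqrt(dot(d, d))
    return (dot(n_y, d) * cmath.exp(1j * k0 * r) / (4.0 * math.pi * r ** 2)
            * (1.0 / r - 1j * k0))


def scaling_function(x, n_x, y, n_y, tau, k0):
    """Helmholtz scaling function, Eq. (59): twice (shifted minus unshifted kernel)."""
    z = [xi + tau * ni for xi, ni in zip(x, n_x)]
    shifted = double_layer_kernel(z, y, n_y, k0)
    d0 = [xi - yi for xi, yi in zip(x, y)]
    # If n(y) is orthogonal to x - y (always true on a flat patch), the unshifted
    # term is zero; skipping it also avoids a 0/0 at y = x.
    unshifted = 0.0 if dot(n_y, d0) == 0.0 else double_layer_kernel(x, y, n_y, k0)
    return 2.0 * (shifted - unshifted)


def lowpass_of_one_on_plane(tau, k0, radius, n=20000):
    """Midpoint rule for the integral of Phi_tau over a disc in the plane x3 = 0 (F = 1)."""
    x = (0.0, 0.0, 0.0)
    e3 = (0.0, 0.0, 1.0)
    u_max = math.asinh(radius / tau)   # s = tau * sinh(u) resolves the peak near s ~ tau
    du = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        s = tau * math.sinh(u)
        ds = tau * math.cosh(u) * du
        phi = scaling_function(x, e3, (s, 0.0, 0.0), e3, tau, k0)
        total += phi * 2.0 * math.pi * s * ds   # ring of radius s and width ds
    return total


value = lowpass_of_one_on_plane(tau=0.05, k0=1.0, radius=50.0)
print(value, cmath.exp(1j * 1.0 * 0.05))
```

Truncating the plane at a finite radius introduces an error of the order of `tau / radius`, so the two printed numbers agree only up to that tolerance.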

An approximation of $U_I$ at scale $J_0 \in \mathbb{N}$ is given by the low-pass filter

$$L_{\tau_{J_0}}[F](x) = \int_S \Phi_{\tau_{J_0}}(x,y)\, F(y)\, dS(y), \tag{62}$$

and two low-pass filters to the scales $J$ and $J_0$ with $J > J_0$ are related by

$$L_{\tau_J}[F](x) = L_{\tau_{J_0}}[F](x) + \sum_{j = J_0}^{J-1} \int_S \Psi_{\tau_j}(x,y)\, F(y)\, dS(y). \tag{63}$$

This leads to the introduction of the band-pass filter to scale $j$ as

$$B_{\tau_j}[F](x) = \int_S \Psi_{\tau_j}(x,y)\, F(y)\, dS(y). \tag{64}$$

It is evident from Eq. (63) that the band-pass filtered function of scale $j$ contains all detail information that is included in the low-pass filtered function of scale $j + 1$ but not in the low-pass filtered function of scale $j$. This allows a “zooming-in process” on details of different scales which are decorrelated. In the context of seismic imaging, we suppose that $S$ consists of finitely many patches $P$ such that a velocity model is available on every patch $P$. The values of the signal $F$ shall be discretely available on every $P$ in sufficient data density. Instead of integrating over $S$ as a whole, every patch $P$ is treated on its own. Of course, a numerical solution scheme uses some kind of summation formula to approximate the surface integral, such that

$$L_{\tau_j}[F](x) = \int_P \Phi_{\tau_j}(x,y)\, F(y)\, dS(y) \simeq \sum_{i=1}^{N_j} a_i^{N_j}\, \Phi_{\tau_j}\!\left( x, y_i^{N_j} \right) \tag{65}$$

with the coefficients

$$a^{N_j} \in \mathbb{R}^{N_j}, \quad a^{N_j} = \left( a_1^{N_j}, \ldots, a_{N_j}^{N_j} \right)^T, \quad j = J_0, \ldots, J, \tag{66}$$

given by

$$a_i^{N_j} = w_i^{N_j} F\!\left( y_i^{N_j} \right), \quad i = 1, \ldots, N_j. \tag{67}$$

The band-pass filter is then given by

$$B_{\tau_j}[F](x) = \int_P \Psi_{\tau_j}(x,y)\, F(y)\, dS(y) \simeq \sum_{i=1}^{N_j} a_i^{N_j}\, \Psi_{\tau_j}\!\left( x, y_i^{N_j} \right). \tag{68}$$
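The discrete filters (65) to (68) and the telescoping relation (63) can be sketched in a few lines. The example assumes a flat rectangular patch $P$ in the plane $x_3 = 0$, for which (59) reduces to $\Phi_\tau(x,y) = \tau\, e^{i k_0 r} (1/r - i k_0) / (2\pi r^2)$ with $r = \sqrt{\|x - y\|^2 + \tau^2}$. It further assumes midpoint-rule nodes with equal weights, the same nodes on every scale, and the dyadic sequence $\tau_j = 2^{-j}$. None of these choices is prescribed by the text, and the signal `F` is a made-up test function.

```python
import cmath
import math


def phi(tau, k0, x, y):
    """Scaling function (59) on a flat patch (x, y in the plane, constant normal)."""
    r = math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 + tau ** 2)
    return tau * cmath.exp(1j * k0 * r) * (1.0 / r - 1j * k0) / (2.0 * math.pi * r ** 2)


def psi(j, k0, x, y, tau):
    """Difference wavelet (61): Psi_j = Phi_{j+1} - Phi_j."""
    return phi(tau(j + 1), k0, x, y) - phi(tau(j), k0, x, y)


def midpoint_nodes(width, height, nx, ny):
    """Nodes y_i and weights w_i of the midpoint rule on [0, width] x [0, height]."""
    hx, hy = width / nx, height / ny
    nodes = [((i + 0.5) * hx, (m + 0.5) * hy) for i in range(nx) for m in range(ny)]
    return nodes, [hx * hy] * len(nodes)


def low_pass(j, k0, x, nodes, coeffs, tau):
    """Discrete low-pass filter, Eq. (65)."""
    return sum(a * phi(tau(j), k0, x, y) for a, y in zip(coeffs, nodes))


def band_pass(j, k0, x, nodes, coeffs, tau):
    """Discrete band-pass filter, Eq. (68)."""
    return sum(a * psi(j, k0, x, y, tau) for a, y in zip(coeffs, nodes))


tau = lambda j: 2.0 ** (-j)                        # assumed scale sequence
F = lambda y: math.sin(3.0 * y[0]) + 0.5 * y[1]    # made-up test signal
nodes, weights = midpoint_nodes(2.0, 1.0, 40, 20)
coeffs = [w * F(y) for w, y in zip(weights, nodes)]   # coefficients, Eq. (67)

k0, x, J0, J = 2.0, (1.0, 0.5), 2, 6
coarse = low_pass(J0, k0, x, nodes, coeffs, tau)
details = sum(band_pass(j, k0, x, nodes, coeffs, tau) for j in range(J0, J))
fine = low_pass(J, k0, x, nodes, coeffs, tau)
print(abs(fine - (coarse + details)))   # telescoping sum (63), zero up to rounding
```

Because the same nodes and coefficients are reused on all scales in this sketch, the identity holds exactly up to floating-point rounding; with scale-dependent nodes $y_i^{N_j}$ it would hold only up to the quadrature error.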


In Freeden et al. (2003), a relation between the coefficient vectors $a^{N_j}$ for different scales $j = J_0, \ldots, J$ is shown. This allows the use of a tree algorithm or pyramid scheme in the following way:

1. For a sufficiently large integer $J$, calculate the coefficient vector to the scale $J$ via

$$a_i^{N_J} = w_i^{N_J} F\!\left( y_i^{N_J} \right), \quad i = 1, \ldots, N_J, \tag{69}$$

with known weights $w_i^{N_J}$ and known values $F(y_i^{N_J})$.

2. For $j = J - 1, \ldots, J_0$, calculate the coefficients $a_i^{N_j}$ recursively from the coefficients of the previous, finer scale.

The procedure outlined above was used in the PhD thesis of Ilyasov (2011) as a method for seismic post-processing for the “Marmousi” model due to Martin et al. (2002) (see Fig. 17), where $P$ is a rectangle. It should be noted that Ilyasov (2011) used surface decorrelation of seismic data involving jump and limit relations in Helmholtz potential theory, while Freeden and Blick (2013) applied volume decorrelation techniques based on Newton-like integrals. Figure 18 shows an example of a migration result for the “Marmousi” model. Figure 19 illustrates a corresponding smoothed velocity field for this rectangle, which gives location-dependent information about the wave number $k$, with $f = \omega/(2\pi)$ denoting the frequency. Since the applied Helmholtz wavelets show a strong spatial localization for high scales, they reflect the position-dependent wave number $k$ in close approximation.

Figure 20 visualizes the decomposition of the migration result $F$ (from Fig. 18) regarding the velocity model for the “Marmousi” model by means of Helmholtz scaling and wavelet functions using the tree algorithm. The numerical calculations are carried out first with a certain wave number associated to a velocity of 2 km/s and a frequency of $f = 20$ Hz. High-frequency information (for example, “high-amplitude reflectors”) can be seen in the band-pass filtered details $B_{\tau_9}[F]$, $B_{\tau_8}[F]$, and $B_{\tau_7}[F]$. Of special interest is the fact that the shadow in the upper left corner of the low-pass filtered data (left side of Fig. 20) is not present in the band-pass filtered data (right side of Fig. 20). Hence, structures that are hard to recognize in the former become obvious in the latter. In view of the lithological situation (provided by Fig. 17), it is not difficult to recognize faults and salt domes in their spatial location.

Figure 21 provides a wavelet decomposition for a frequency of $f = 50$ Hz. The details here show a finer structure. Comparison with Fig. 20 shows that the change of frequency yields different structures in the band-pass filtered data. The salt domes that were prominent for $f = 20$ Hz have little influence for $f = 50$ Hz. Instead, the highest amplitudes appear in the shale structures in the lower left corner above the salt dome. Additional numerical calculations and illustrations can be found in the PhD thesis of Ilyasov (2011). The particular objectives of the interpretation of seismic signals are, besides the fault pattern, the detection of special facies and karst systems with the intention to specify certain water levels in backfills and disturbances. Seismic post-processing via frequency-dependent Helmholtz wavelets provides useful additional information to manage this task, as particular structures in the deeper underground become obvious by the decorrelation capacity of this method.

Fig. 17 Lithology for the “Marmousi” model (following Martin et al. 2002)

Fig. 18 Migration result $F$ of the “Marmousi” model on the surface patch $P$ (axes in m)

Fig. 19 Smoothed velocity model of the “Marmousi” model (axes in m)

Fig. 20 Multiscale decomposition of the migration result $F$ in Fig. 18 realized by the described tree algorithm for the frequency $f = 20$ Hz (left panels: low-pass filtered data $L_{\tau_{10}}[F]$, $L_{\tau_9}[F]$, $L_{\tau_8}[F]$; right panels: band-pass filtered data $B_{\tau_9}[F]$, $B_{\tau_8}[F]$, $B_{\tau_7}[F]$; axes in m)

4 Fluid and Heat Flow Models

The problem of fluid flow in a geothermal reservoir is very complex for several reasons. First of all, the data regarding geometry and composition of the domain are usually lacking or incomplete. Moreover, the direct feedback from the physical situation needed to compare theoretical results with practical measurements is difficult to obtain. Furthermore, the large number of parameters and coupled processes that have to be taken into account makes the development of a comprehensive model a very hard task. According to the Kaiserslautern model (Fig. 4), the three main phenomena that have to be modeled are fluid flow, thermal flow, and chemical flow. Chemicals can dissolve and precipitate during fluid circulation, eventually reducing the permeability of the medium and influencing the flow (see Cheng and Yeh 1998; Durst and Vuataz 2000; Jing et al. 2002; Kühn and Stöfen 2005; Kühn 2009, and the references therein), but the contribution of this process to thermal flow is limited; therefore, most of the coupled models just neglect it. Depending on the physical conditions of the reservoir, the problems arising during modeling fluid and heat flow differ.

4.1 Basic Physical Background

This section attempts a general and uniform theoretical description of the underlying equations for modeling physical processes within a geothermal reservoir. The equations are derived from microscopic considerations which assume a multiphase decomposition of a porous medium and the contained fluids. The theoretical basis for this formulation was developed in a number of publications in the 1970s and 1980s which are summarized, e.g., in Diersch (1985). The overview which we give here is based on a simplified version of this approach due to Diersch (2000).

Fig. 21 Multiscale decomposition of the migration result $F$ in Fig. 18 realized by the described tree algorithm for the frequency $f = 50$ Hz (panels: $L_{\tau_{10}}[F]$, $B_{\tau_9}[F]$, $L_{\tau_9}[F]$, $B_{\tau_8}[F]$, $L_{\tau_8}[F]$, $B_{\tau_7}[F]$; axes in m)

We consider a three-dimensional, bounded, regular region $B$ and a (scalar) quantity $\psi$ which is transported through this medium. It is convenient to define some kind of density $\omega$ such that $\psi = \rho\omega$ with the mass density $\rho$. The sum of the temporal change of $\psi$ inside the domain $B$ and the inflow and outflow of $\psi$ caused by the flow (vector) $j$ through the boundary $\partial B$ is equal to the amount of this quantity generated by sources and sinks $Q$ within $B$. This means

$$\frac{d}{dt} \int_B \rho(x,t)\, \omega(x,t)\, dV(x) + \int_{\partial B} j(x,t) \cdot n(x)\, dS(x) = \int_B Q(x,t)\, dV(x), \tag{70}$$

where we remind the reader that

$\frac{d}{dt}$: (total) derivative w.r.t. time $t$;
$n$: outer (unit) normal w.r.t. $\partial B$;
$dV$: volume element in $\mathbb{R}^3$;
$dS$: surface element in $\mathbb{R}^3$.

If we use the (vectorial) velocity $v(x,t) = \frac{\partial x}{\partial t}$, Eq. (70) can be rewritten in the following way:

$$\int_B \left( \frac{\partial (\rho\omega)}{\partial t}(x,t) + \nabla_x \cdot (\rho\omega v)(x,t) \right) dV(x) + \int_B \nabla_x \cdot j(x,t)\, dV(x) = \int_B Q(x,t)\, dV(x). \tag{71}$$


From the integral representation (71), we can conclude the differential equation

$$\frac{\partial (\rho\omega)}{\partial t}(x,t) + \nabla_x \cdot (\rho\omega v)(x,t) + \nabla_x \cdot j(x,t) = Q(x,t). \tag{72}$$

We will omit dependencies on $x$ and $t$ further on for the sake of readability. In its above form, the differential equation is only valid with regard to microscopic quantities, which usually cannot be measured. To derive equations for macroscopic quantities, spatial averaging over a representative elementary volume and further carefully applied simplifications are necessary. For a system consisting of fluid and solid phases, such as a liquid in a porous medium, this procedure was carried out, e.g., by Diersch (1985) and yields in a simplified form (cf. Diersch 2000)

$$\frac{\partial (\varepsilon\rho\omega)}{\partial t} + \nabla \cdot (\varepsilon\rho\omega v) + \nabla \cdot (\varepsilon j) = (\varepsilon Q) + j^{\mathrm{inter}}, \tag{73}$$

with the volume fraction $\varepsilon$ of the fluid. We will assume here that the porous medium is completely saturated with the fluid, such that $\varepsilon$ is equal to the porosity $\phi$. In case of a free one-component fluid, we have $\varepsilon = 1$. Moreover, Eq. (73) contains the additional interaction term

$$j^{\mathrm{inter}} = \frac{1}{\delta S} \int_{S^{\mathrm{inter}}} \left( j + \rho\omega\, (v - w) \right) \cdot n^{\mathrm{inter}}\, dS \tag{74}$$

at phase interfaces. Here, $\delta S$ is the surface area of the interface, $v$ the velocity with which $\psi$ is transported, and $w$ the interface velocity. $j^{\mathrm{inter}}$ is a scalar and vanishes for a free one-component fluid. From Eq. (73), we can derive balance equations for fluid mass, species mass of chemical materials, fluid momentum, and energy. It is also possible to derive equations for the mechanical stress field of the porous medium by considering balance of mass and balance of momentum for the whole system of solid and fluid components, but such a derivation is beyond the scope of this article. The interested reader is referred to, e.g., Diersch (1985). All of these equations are strongly related to each other and decouple only if further assumptions on the negligibility of coupling terms are valid. Considering $\psi = \rho_f$, the density of the fluid, Eq. (73) leads to conservation of mass, which can be written as

$$\frac{\partial (\phi\rho_f)}{\partial t} + \nabla \cdot (\phi\rho_f v_f) = \phi\rho_f Q_{p,f}, \tag{75}$$

with $v_f$ being the fluid velocity. All mass flows over interfaces, including interfaces between phases, and all other sink and source terms are summarized in the right-hand side term $\phi\rho_f Q_{p,f}$. To derive the species mass balance equation, we have to replace the mass density $\rho_f$ by the concentration of the species $C$. In this case, the mass flux is governed by Fick’s law

$$j_C = -\mathbf{D}\, (\nabla C), \tag{76}$$

with the tensor of hydrodynamic dispersion $\mathbf{D}$. This tensor can be written as

$$\mathbf{D} = D_d\, \mathbf{I} + \mathbf{D}_m, \tag{77}$$

with a diffusion part $D_d\, \mathbf{I}$ and a part $\mathbf{D}_m$, which is needed to include mechanical dispersion. Considering porous media, the latter can be expressed by the Scheidegger-Bear dispersion law as

$$(\mathbf{D}_m)_{ij} = \beta_T \|v_f\|\, \delta_{ij} + (\beta_L - \beta_T)\, \frac{v_{f,i}\, v_{f,j}}{\|v_f\|} \tag{78}$$

with $\beta_T$, $\beta_L$ being the transversal and longitudinal dispersivity, respectively. Additionally, the sink-source term on the right-hand side has to include a term involving $\vartheta C$ to incorporate chemical reactions with the decay rate $\vartheta$, and as we are dealing with porous media, sorption effects have to be considered. By combining all these effects, we obtain

$$\frac{\partial (\phi R_d C)}{\partial t} + \nabla \cdot (\phi C v_f) - \nabla \cdot \left( \phi\, \mathbf{D}\, (\nabla C) \right) + \phi R \vartheta C = Q_c \tag{79}$$

as the balance equation for the concentration of chemical components with the so-called retardation relations

$$R = 1 + \frac{1 - \phi}{\phi}\, \chi(C), \tag{80}$$

$$R_d = 1 + \frac{1 - \phi}{\phi}\, \frac{d \left( \chi(C)\, C \right)}{dC}. \tag{81}$$
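The dispersion tensor (77) with the Scheidegger-Bear law (78) and the retardation factor (80) can be evaluated directly. The sketch below is an illustration only: the function names and all numerical values are assumptions, and the sorption function enters merely through a given value `chi` of $\chi(C)$. For a velocity along one coordinate axis, the tensor is diagonal with entry $D_d + \beta_L \|v_f\|$ in flow direction and $D_d + \beta_T \|v_f\|$ in the transversal directions.

```python
import math


def mechanical_dispersion(v, beta_L, beta_T):
    """Scheidegger-Bear tensor, Eq. (78), as a nested list; v is the fluid velocity."""
    speed = math.sqrt(sum(c * c for c in v))
    if speed == 0.0:
        return [[0.0] * len(v) for _ in v]   # no flow, no mechanical dispersion
    return [[beta_T * speed * (1.0 if i == j else 0.0)
             + (beta_L - beta_T) * v[i] * v[j] / speed
             for j in range(len(v))] for i in range(len(v))]


def hydrodynamic_dispersion(v, D_d, beta_L, beta_T):
    """Tensor (77): isotropic diffusion D_d * I plus mechanical dispersion."""
    D_m = mechanical_dispersion(v, beta_L, beta_T)
    return [[D_m[i][j] + (D_d if i == j else 0.0) for j in range(len(v))]
            for i in range(len(v))]


def retardation(phi, chi):
    """Retardation factor (80) for a given value chi = chi(C) of the sorption function."""
    return 1.0 + (1.0 - phi) / phi * chi


# Hypothetical values: flow along the x1-axis, dispersivities in m, D_d in m^2/s.
v = (2.0e-6, 0.0, 0.0)
D = hydrodynamic_dispersion(v, D_d=1.0e-9, beta_L=5.0, beta_T=0.5)
print(D[0][0], D[1][1], retardation(phi=0.2, chi=0.5))
```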

Here, $\chi(C)$ is a sorption function that can be expressed by empirical material relations. For the balance of momentum of the fluid, we have to bear in mind that linear momentum is a vector-valued quantity, whereas the considerations above use vocabulary for a scalar-valued quantity $\psi$. As a consequence, the balance of momentum incorporates the stress tensor $\boldsymbol{\sigma}$ instead of the flow vector. The stress tensor usually splits into an isotropic part containing the pressure and a deviatoric part $\boldsymbol{\sigma}'$ such that

$$\boldsymbol{\sigma} = p\, \mathbf{I} - \boldsymbol{\sigma}'. \tag{82}$$

In order to give the balance of momentum, it is convenient to use Einstein’s summation convention and consider the components of the momentum separately. This leads to

$$\frac{\partial (\phi\rho_f v_{f,k})}{\partial t} + \frac{\partial}{\partial x_i} \left( \phi\rho_f v_{f,k} v_{f,i} \right) + \frac{\partial}{\partial x_k} (\phi p) - \frac{\partial}{\partial x_i} \left( \phi\, \sigma'_{ki} \right) = \phi\rho_f g_k + j^{\mathrm{inter}}_{p,k} \tag{83}$$

with the fluid force density $\rho_f g$ and a still unspecified interaction term $j_p^{\mathrm{inter}}$, which is now a vector-valued quantity. For a Newtonian fluid, $\boldsymbol{\sigma}'$ is given by

$$\sigma'_{ik} = \mu_f \left( \frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i} - \frac{2}{3}\, \frac{\partial v_j}{\partial x_j}\, \delta_{ik} \right) \tag{84}$$

with the fluid viscosity $\mu_f$. With the further assumptions that the fluid is incompressible, which means that the mass density changes neither in space nor in time, and that the divergence of the velocity field vanishes, we obtain the Navier-Stokes equation

$$\rho_f \frac{\partial v_f}{\partial t} + \rho_f (v_f \cdot \nabla) v_f - \mu_f \nabla^2 v_f = -\nabla p + \rho_f g + \frac{1}{\phi}\, j_p^{\mathrm{inter}}. \tag{85}$$

Usually, the fluid velocity in a porous medium is rather small, which results in small Reynolds numbers. As a consequence, the time derivative and the convective transport term can be neglected, i.e.,

$$\frac{\partial v_f}{\partial t} \approx 0, \qquad (v_f \cdot \nabla) v_f \approx 0. \tag{86}$$

Thus, the balance of momentum is reduced to

$$\nabla p - \rho_f g = \mu_f \nabla^2 v_f + \frac{1}{\phi}\, j_p^{\mathrm{inter}}. \tag{87}$$

Moreover, inner friction as a result of viscosity is much smaller than the friction between solid and fluid such that $\mu_f \nabla^2 v_f \approx 0$. The interaction term between solid and fluid can be modeled by

$$\frac{1}{\phi}\, j_p^{\mathrm{inter}} = -\mu_f\, \mathbf{K}^{-1} v_f \tag{88}$$

with the hydraulic permeability tensor $\mathbf{K}$. This yields Darcy’s law

$$v_f = -\frac{1}{\mu_f}\, \mathbf{K} \left( \nabla p - \rho_f g \right), \tag{89}$$

which was first introduced empirically by Darcy in the nineteenth century (Darcy 1856). More sophisticated justifications can be provided, e.g., by the homogenization method described by Ene and Poliševski (1987) or the volume averaging method described by Sahimi (1995). For further details on transport in porous media, see, e.g., Bear (1972) and de Boer (2000) and the chapters on flow in porous media in this handbook. From the mathematical point of view, the problem has been addressed by Ene and Poliševski (1987), who proved the existence and uniqueness of a solution for the incompressible fluid case, i.e., $\nabla \cdot v_f = 0$, in both bounded and unbounded domains.

Until now, we only considered the fluid phase, i.e., we neglected mass exchange between fluid and solid and momentum transfer from the fluid to the solid. This corresponds to the assumption of a rigid non-deformable solid. But even in this case, energy transfer between the solid and the fluid happens. Thus, we have to consider both phases to derive the energy balance. We will use the term “porous medium” for the whole system of fluid and solid now. The indices “p”, “f”, and “s” characterize the associated quantities of the “porous medium”, “fluid phase”, and “solid phase”, respectively. As a consequence, we have to consider Eq. (72) before averaging it into Eq. (73), but may consider Eq. (72) for both phases separately. With heat capacity $c$ and temperature $T$, the internal (thermal) energy is given by

$$dE = c\, dT. \tag{90}$$


The heat flow is governed by Fourier’s law

$$j_T = -\mathbf{k}\, (\nabla T), \tag{91}$$

with the heat conductivity tensor $\mathbf{k}$ given by

$$\mathbf{k}_s = k_s\, \mathbf{I}, \tag{92}$$

$$\mathbf{k}_f = k_f\, \mathbf{I} + \rho_f c_f\, \mathbf{D}_m \tag{93}$$

for the solid and fluid phase, respectively. To combine both phases, we make the following assumptions:

(1) The temperature of the solid phase equals (locally) the temperature of the fluid phase, $T\ (= T_s = T_f)$ ((local) thermodynamic equilibrium).
(2) There are only heat sources and sinks in the fluid phase and none in the solid phase. For this, a stiff and rigid rock matrix without heat sources and sinks is assumed. This also excludes heat generation through mechanical deformation of the solid. Thus, we have $Q_{T,s} = 0$. The fluid heat source and sink term $Q_{T,f}$ represents the injection and extraction of heat into and from the hydrothermal reservoir, respectively. Furthermore, there is neither a change in the rock morphology nor seismic activity during the production process, and thus, $v_s = 0$.
(3) Weighted summation of the solid and fluid phase with porosity $\phi$:

$(\rho c)_p = \phi\, c_f \rho_f + (1 - \phi)\, c_s \rho_s$ (volumetric heat capacity);
$\mathbf{k}_p = \phi\, \mathbf{k}_f + (1 - \phi)\, \mathbf{k}_s$ (heat conductivity);
$Q_T = Q_{T,p} = \phi\, Q_{T,f}$ (heat sources and sinks);
$j_{T,p} = \phi\, j_{T,f} + (1 - \phi)\, j_{T,s}$ (thermal flow).

Combining those assumptions, we obtain

$$\frac{\partial \left( (\rho c)_p T \right)}{\partial t} + \nabla \cdot \left( \rho_f c_f v_f T \right) - \nabla \cdot \left( \mathbf{k}_p\, (\nabla T) \right) = Q_T. \tag{94}$$
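The porosity-weighted parameters of assumption (3) are easy to compute. The following minimal sketch assumes isotropic (scalar) conductivities, i.e., it neglects the mechanical dispersion part in (93); the function name and the material values (water in a sandstone-like matrix) are hypothetical and serve only as an illustration.

```python
def porous_medium_parameters(phi, c_f, rho_f, c_s, rho_s, k_f, k_s):
    """Porosity-weighted volumetric heat capacity and (scalar) heat conductivity."""
    rho_c_p = phi * c_f * rho_f + (1.0 - phi) * c_s * rho_s
    k_p = phi * k_f + (1.0 - phi) * k_s
    return rho_c_p, k_p


# Hypothetical values in SI units: water in a sandstone-like matrix.
rho_c_p, k_p = porous_medium_parameters(
    phi=0.2, c_f=4200.0, rho_f=1000.0, c_s=800.0, rho_s=2600.0, k_f=0.6, k_s=2.5)
print(rho_c_p, k_p)   # J/(m^3 K) and W/(m K)
```

For $\phi = 0$ the function returns the pure solid values and for $\phi = 1$ the pure fluid values, as expected from a weighted mean.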

In order to solve the fluid and thermal flow problem for a porous medium representing a geothermal reservoir, these coupled equations have to be solved. Commonly, numerical solution methods are based on finite-element, finite-volume, or finite-difference methods (e.g., Zyvoloski 1983; Zhao et al. 1999; O’Sullivan et al. 2001; Chen et al. 2006). A solution technique for advective-diffusive heat transport under the assumption of a known fluid velocity will also be presented in Sect. 4.3.

4.2 Fluid Flow in Hydrothermal Reservoirs

In a hydrothermal system, wells are drilled into a so-called aquifer, i.e., a water-bearing layer which is usually modeled as a porous medium. The basic equations to model fluid flow and heat transport in such a reservoir are given in Sect. 4.1.

Usually, the vertical dimension of an aquifer is much smaller than its horizontal extent. It is therefore convenient to apply some kind of vertical averaging, which also reduces the dimension of the problem. Following Diersch (2000), we consider Cartesian coordinates $x = (x_1, x_2, x_3)^T$ such that $x_3$ is the vertical component. The upper and lower bounds of the aquifer are given by $b^{\mathrm{up}}(x_1, x_2, t)$ and $b^{\mathrm{down}}(x_1, x_2)$, respectively, i.e., we assume that the lower bound does not change with respect to time. The thickness of the aquifer is given by

$$B(x_1, x_2, t) = b^{\mathrm{up}}(x_1, x_2, t) - b^{\mathrm{down}}(x_1, x_2). \tag{95}$$

Another useful quantity is the hydraulic head $h$. We assume that the only force which affects the fluid is gravity, such that in the fluid force density $\rho_f g$, $g$ is the gravitational acceleration, which is assumed to be given by $g = -\|g\|\, e^{(3)}$ with $e^{(3)}$ being the unit vector in $x_3$-direction. Choosing a reference fluid density $\rho_{f,0}$, the pressure head $h_p$ is defined via

$$h_p = \frac{p}{\rho_{f,0} \|g\|} \tag{96}$$

and the hydraulic head via

$$h = h_p + x_3. \tag{97}$$

If we assume that the upper bound of the aquifer is given by the pressure head, we obtain

$$B = h_p - b^{\mathrm{down}}. \tag{98}$$

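Unlike in Sect. 4.1, we now take into account that the porous medium may not be completely saturated, such that $\varepsilon = s\phi$ with saturation $s$. Thus, integrating Eq. (75) over the depth $B$ yields

$$\frac{\partial (B s \phi \rho_f)}{\partial t} + \nabla \cdot (B s \phi \rho_f v_f) = B s \phi \rho_f Q_{p,f}. \tag{99}$$

To incorporate the hydraulic head, we take a closer look at the time derivative term and get

$$\frac{\partial (B s \phi \rho_f)}{\partial t} = s \phi \rho_f \frac{\partial B}{\partial t} + B \phi \rho_f \frac{\partial s}{\partial t} + B s \frac{\partial (\phi \rho_f)}{\partial t}. \tag{100}$$

The first summand is directly related to the pressure head via Eq. (98), which implies

$$\frac{\partial B}{\partial t} = \frac{\partial h_p}{\partial t} = \frac{\partial h}{\partial t}. \tag{101}$$

For the second summand, we get

$$\frac{\partial s}{\partial t} = \frac{\partial s}{\partial h_p}\, \frac{\partial h_p}{\partial t} = \frac{\partial s}{\partial h_p}\, \underbrace{\frac{\partial h_p}{\partial h}}_{=1}\, \frac{\partial h}{\partial t} = C(h_p)\, \frac{\partial h}{\partial t} \tag{102}$$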

with the moisture capacity $C(h_p)$. For the third summand, we remark that density changes can be related to pressure changes via a suitable compressibility. As pressure and hydraulic head are equivalent, it is also possible to define a compressibility

$$S_0 = \phi\, \gamma_f + (1 - \phi)\, \gamma_s \tag{103}$$

with the fluid and solid compressibilities $\gamma_f$ and $\gamma_s$ such that

$$B s \frac{\partial (\phi \rho_f)}{\partial t} = B\, s(h_p)\, \rho_f S_0\, \frac{\partial h}{\partial t}. \tag{104}$$

Neglecting fluid density effects in the divergence term as part of the Boussinesq approximation, dividing by $\rho_f$, and using the definition

$$S = s\phi + B \left( s S_0 + \phi\, C(h_p) \right), \tag{105}$$

the balance of mass may be written as

$$S \frac{\partial h}{\partial t} + \nabla \cdot (B s \phi\, v_f) = B s \phi\, Q_{p,f}. \tag{106}$$

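Darcy’s law (89) can also be rewritten with respect to the hydraulic head instead of the pressure:

$$\begin{aligned} v_f &= -\frac{1}{\mu_f}\, \mathbf{K} \left( \nabla p - \rho_f g \right) = -\frac{1}{\mu_f}\, \mathbf{K} \left( \nabla \left( \rho_{f,0} \|g\| (h - x_3) \right) + \rho_f \|g\|\, e^{(3)} \right) \\ &= -\frac{1}{\mu_f}\, \mathbf{K} \left( \rho_{f,0} \|g\|\, \nabla h - \rho_{f,0} \|g\|\, e^{(3)} + \rho_f \|g\|\, e^{(3)} \right) \\ &= -\frac{1}{\mu_f}\, \mathbf{K}\, \rho_{f,0} \|g\| \left( \nabla h + \frac{\rho_f - \rho_{f,0}}{\rho_{f,0}}\, e^{(3)} \right). \end{aligned} \tag{107}$$

Combining Eqs. (106) and (107) gives

$$S \frac{\partial h}{\partial t} - \nabla \cdot \left( \frac{B s \phi}{\mu_f}\, \mathbf{K}\, \rho_{f,0} \|g\| \left( \nabla h + \frac{\rho_f - \rho_{f,0}}{\rho_{f,0}}\, e^{(3)} \right) \right) = B s \phi\, Q_{p,f} \tag{108}$$

as a kind of diffusion equation for the hydraulic head, which is highly nonlinear due to the dependencies of the coefficients on various quantities related to the hydraulic head.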
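The hydraulic head (96), (97) and the head form of Darcy’s law (last line of (107)) can be illustrated with a short sketch. It assumes a 3x3 permeability tensor given as a nested list and $\|g\| = 9.81\ \mathrm{m/s^2}$; the function names and all numerical values are hypothetical. In a hydrostatic state with $\rho_f = \rho_{f,0}$, the head is constant in depth and the Darcy velocity vanishes, which the example reproduces.

```python
def hydraulic_head(p, x3, rho_f0, g_norm=9.81):
    """Hydraulic head h = h_p + x3 with pressure head h_p = p / (rho_f0 * ||g||)."""
    return p / (rho_f0 * g_norm) + x3


def darcy_velocity_from_head(grad_h, rho_f, rho_f0, mu_f, K, g_norm=9.81):
    """Darcy's law in head form (last line of Eq. (107)) for a 3x3 permeability tensor K."""
    buoyancy = (rho_f - rho_f0) / rho_f0
    driving = [grad_h[0], grad_h[1], grad_h[2] + buoyancy]   # grad h + buoyancy * e3
    factor = -rho_f0 * g_norm / mu_f
    return [factor * sum(K[i][j] * driving[j] for j in range(3)) for i in range(3)]


# Hypothetical values: isotropic permeability 1e-12 m^2, water-like fluid.
K = [[1e-12 if i == j else 0.0 for j in range(3)] for i in range(3)]

# Hydrostatic pressure p = rho_f0 * ||g|| * (10 - x3) gives the constant head h = 10 m.
h_low = hydraulic_head(p=1000.0 * 9.81 * 10.0, x3=0.0, rho_f0=1000.0)
h_high = hydraulic_head(p=1000.0 * 9.81 * 4.0, x3=6.0, rho_f0=1000.0)

v_rest = darcy_velocity_from_head([0.0, 0.0, 0.0], 1000.0, 1000.0, 1e-3, K)
v_flow = darcy_velocity_from_head([0.01, 0.0, 0.0], 1000.0, 1000.0, 1e-3, K)
print(h_low, h_high, v_rest, v_flow)
```

The flow is directed against the head gradient, so a positive gradient in $x_1$-direction yields a negative first velocity component.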

4.3 Heat Transport in a Porous Medium

The aim of this section is to simulate the heat transport in a two-phase porous medium, which consists of a solid and a fluid phase, based on Eq. (94). We make the following additional assumptions:

(1) The fluid is incompressible and the divergence of the velocity field vanishes.
(2) The heat produced from mechanical dispersion is negligible, such that $\mathbf{k}_f = k_f\, \mathbf{I}$.
(3) All heat capacities and the porosity are constant with respect to time and space.

With assumptions (1) to (3), we obtain the transient advection-diffusion equation (TADE) for the two-phase porous medium ($0 < t_{\mathrm{end}} < \infty$):

$$(\rho c)_p \frac{\partial T}{\partial t} = \nabla \cdot (\mathbf{k}_p \nabla T) - c_f \rho_f\, v_f \cdot \nabla T + Q_T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{109}$$

$$T(\cdot, 0) = T_0 \quad \text{in } B, \tag{110}$$

$$\frac{\partial T}{\partial n} = F \quad \text{on } \partial B \times [0, t_{\mathrm{end}}]. \tag{111}$$

Statements concerning existence, uniqueness, and continuity of the weak solution of problem (109)–(111) are proved in Ostermann (2011a). Because the weak formulation of problem (109)–(111) is infinite dimensional, the numerical realization requires a finite-dimensional problem whose solution is a good approximation of the original solution. To this end, a linear Galerkin scheme using scalar kernels and a grid of $J$ pairwise distinct points in the domain $B$ is used. The corresponding approximate Galerkin solution is called $T_J$. The convergence towards the actual solution $T$ is shown, e.g., in Ostermann (2011a, b). Suitable examples for the Galerkin grid and Galerkin kernel are tetragonal grids and the biharmonic kernel (which is a radial basis function; see, e.g., Buhmann 2003; Wendland 2005). They satisfy the required conditions for the convergence of the Galerkin scheme (see Ostermann 2011a, b, for details).

Figures 22 and 23 show the results of a numerical test for the cube $(-1, 1)^3$ (cf. Ostermann 2011a, b). The test is a diffusion-dominated problem (with Péclet number $Pe = c_f \rho_f \|\mathbf{v}_f\| x_o / k_p = 0.01$ with characteristic length scale $x_o$), a constant initial condition $T_0 = 393.15$ K, and a (single-point) injection at the origin with an injection temperature of $343.15$ K that is constant in time. In the first two columns of Fig. 22, the temporal evolution of the temperature difference $T_J(\cdot, t) - T_J(\cdot, 0)$ for chosen times between 3 and 36 months is presented. The global expansion shows a significant cooling of the injection region, as expected because the injection temperature is lower than the initial temperature. The color scale is always scaled to the interval $[-50, 0]$. In the third column of this figure, a detailed view after 6, 12, and 36 months can be seen. The quantity $\mathrm{detail}_J(t) = -\log_{10}(|T_J(\cdot, t) - T_J(\cdot, 0)|)$ is illustrated for the specific temperature intervals. For clarification of the arising structures, the color scale is condensed at the upper end. Nevertheless, the influence of the advection term, which is comparatively low in this case, can be clearly detected. The cooling front, which propagates "spherically" from the injection point, is slowed down in positive $x_2$-direction, whereas it is accelerated in negative $x_2$-direction. This behavior can also be observed in Fig. 23 (see also Ostermann 2011a, b), which shows the temperature $T_J(\cdot, t)$ after 36 months in both the $x_1 x_2$- and the $x_2 x_3$-plane for an adapted color map.

Further investigations concerning the presented method, especially in the case of advection-dominated problems, are needed because the applied Galerkin method does not contain any stabilization terms. Such terms should prevent strong oscillations for large Péclet numbers (see, e.g., John et al. 2006; John and Schmeyer 2008, for similar considerations). The numerics of a similar convection-diffusion-reaction problem has been discussed in the PhD thesis by Eberle and in Eberle et al. (2014) in this issue.
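To make the role of the Péclet number concrete, the following minimal sketch evaluates it for given material data. It assumes the definition $Pe = \rho_f c_f \|\mathbf{v}_f\| x_o / k_p$; the function names and all numerical values are hypothetical. The second helper encodes the classical one-dimensional rule of thumb that unstabilized central or linear schemes tend to oscillate once the cell Péclet number exceeds 2. That rule is a textbook criterion and is not taken from the test case described above.

```python
def peclet_number(rho_f, c_f, speed, length, k_p):
    """Ratio of advective to conductive heat transport (dimensionless).

    rho_f : fluid density in kg/m^3, c_f : specific heat in J/(kg K)
    speed : magnitude of the fluid velocity in m/s
    length: characteristic length in m, k_p : heat conductivity in W/(m K)
    """
    return rho_f * c_f * speed * length / k_p


def central_scheme_may_oscillate(rho_f, c_f, speed, cell_size, k_p):
    """Classical 1D rule of thumb: cell Peclet number above 2."""
    return peclet_number(rho_f, c_f, speed, cell_size, k_p) > 2.0


# Hypothetical values: water-like fluid, very slow flow, 5 m length scale.
pe = peclet_number(rho_f=1000.0, c_f=4000.0, speed=1.0e-9, length=5.0, k_p=2.0)
print(f"Pe = {pe:.3g}")
```

For a diffusion-dominated setting such as the test above, the cell Péclet number stays far below the threshold, which is consistent with the absence of stabilization terms in that experiment.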

[Figure 22: nine panels on the $x_1 x_2$-plane with $x_1, x_2 \in [-1, 1]$; color scales from $-50$ to $0$ (temperature difference) and from about $-1.65$ to $-1$ (detail view).]

Fig. 22 Temperature difference $T_J(\cdot, t) - T_J(\cdot, 0)$ in [K] for the Galerkin expansion (extended by a constant) on the $x_1 x_2$-plane after 3, 6, 9, 12, 24, and 36 months (first two columns from left to right and top to bottom) and detail view of $\mathrm{detail}_J(t)$ in [K] after 6, 12, and 36 months (right column from top to bottom) for a tetragonal grid with grid size $1/9$ and $Pe = 0.01$

Fig. 23 Temperature $T_J(\cdot, t)$ in [K] for the Galerkin expansion (extended by a constant) on the $x_1 x_2$- and $x_2 x_3$-plane after 36 months for a tetragonal grid with grid size $1/9$ and $Pe = 0.01$

4.4 Flow Models for Petrothermal Reservoirs

In petrothermal systems, the possible efficiency of the geothermal reservoirs is often too low for industrial purposes. Hence, the usual method to improve the productivity is to use fluid pressure to increase the aperture of the fractures as well as their conductivity and, thus, the permeability of the reservoir, creating what is called an Enhanced Geothermal System (EGS). Applying hydraulic stimulation (e.g., Ghassemi 2003; Zubkov et al. 2007), the fractures propagate along the maximum principal stress direction. The fracture path is governed by the actual stress field in the reservoir acting at the fracture tips (e.g., Moeck et al. 2009). Often the main difficulty of the mechanical models that predict the fracture growth during the stimulation process is to couple the applied pressure of the fluid with the fracture aperture and the rock stress, since the parameters that govern this coupling (such as the rock stiffness and the shear stiffness) are very difficult to determine (see Hillis 2003). For all models which predict fracture growth, knowledge of the stress field prior to stimulation is vital.

There are two basic categories of models for petrothermal systems consisting of a fractured medium: continuum methods and discrete methods. The continuum model approach can be subdivided into three methods, namely, the Effective Continuum Method (ECM), the Dual-Continuum (DC) or the generalized Multiple Continuum Model (MINC, Multiple INteracting Continua), and the Stochastic-Continuum (SC) model. The continuum approach is a simple method since it ignores the complex geometry of fractured systems and employs effective parameters to describe their behavior. Nevertheless, it is not useful to apply vertical averaging as for hydrothermal systems. In contrast to this, the discrete approach is based on the explicit determination of the fractures. Thus, the need for a detailed knowledge of the geometry of the fractures arises. There are two different tasks which discrete models can fulfill, namely, modeling the flow when the continuum methods cannot be applied and specifying the effective parameters needed in the continuum approach if they are applicable. Similar to the continuum models, the discrete models can be divided into three different methods: Single-Fracture (SF) models, Discrete Fracture Network (DFN) models, and Fracture Matrix (FM) models. A schematic overview of the presented methods used to model geothermal reservoirs is given in Fig. 24. This article can only give an overview of the different methods used in reservoir and flow modeling. For further information on (numerical) methods concerned with flow and transport through fractured media, the reader is referred to the recent assessments by Adler and Thovert (1999), Sanyal et al. (2000), Dietrich et al. (2005), and Neuman (2005).

[Figure 24: diagram with the labels reservoir; hydrothermal (porous medium) and petrothermal (fractured medium); continuum (ECM, DC/MINC, SC); discrete (SF, DFN, FM); continuum parameters.]

Fig. 24 Scheme of geothermal reservoir models characterizing today's situation

4.4.1 Continuum Models

Effective Continuum Method

We first present two basic models for determining effective coefficients in fractured rocks introduced by Pruess et al. (1986) and Wu (2000) based on a deterministic approach. Pruess and his co-workers developed a semi-empirical model to represent an unsaturated fracture/matrix system as a single equivalent continuum, therein using the simple arithmetic sum of the conductivity of the rock matrix and the fracture system as approximation of the unsaturated conductivity of the fractured rock:

$$\kappa(p) = \kappa_{\mathrm{matrix}}(p) + \kappa_{\mathrm{fracture}}(p). \tag{112}$$

A similar approximation of the isotropic permeability of a fracture/matrix system was used by Peters and Klavetter (1988). Approximation (112) is based on the assumption of local thermodynamic equilibrium between the fracture system and the rock matrix blocks, an assumption that breaks down in case of fast transients. An example including solute transport in models for fractured media is given by Wu (2000), who presents a generalized ECM formulation to model multiphase, non-isothermal flow and solute transport. The governing equations for compositional transport and energy conservation contain the key effective correlations such as capillary pressures, relative permeability, dispersion tensor, and thermal conductivity. Introducing a fracture/matrix combined capillary pressure curve as a function of an effective liquid saturation, the corresponding fracture and matrix saturations can be determined by inversion of the capillary pressure functions for fracture and matrix, respectively. Consequently, the effective relative permeabilities of the corresponding fluid phase can be determined as the weighted sum (via the absolute permeabilities of the fracture system and the matrix) of the relative effective permeabilities of the fracture system and the matrix evaluated at the respective saturations. Similarly, the thermal conductivity can be specified. For the determination of the dispersion tensor, the Darcy velocities in the fracture system and the matrix are necessary, in addition to the corresponding saturations. Note that approximation (112) is also incorporated in this model for the effective continuum permeability. An attempt to categorize the different mathematical concepts behind the problem of finding good estimates for the matrix-dependent effective parameters can be found in Sahimi (1995). Alternatives are needed, since the applicability of this method is restricted: there is generally no guarantee that a Representative Elementary Volume (REV) exists for a given site (cf. Long et al. 1982), and the assumption of local thermodynamic equilibrium between fracture and matrix may be violated by rapid flow and transport processes (see Pruess et al. 1986).

Dual-Continuum and MINC Methods

The dual-continuum method or the more general Multiple INteracting Continua (MINC) method (see Barenblatt et al. 1960; Warren and Root 1963; Kazemi 1969; Pruess and Narasimhan 1985; Wu and Qin 2009, and the references therein) tries to overcome the disadvantages of the ECM. The dual-porosity method as one possible form of dual-continuum methods was first introduced by Barenblatt et al. (1960), applying the idea of two separate but overlapping continua, one modeling the fracture system and the other one modeling the porous matrix blocks. The method takes into account not only the fluid within the fractures, as the general Darcy model does, but also the fluid mass transfer between the fractures and the matrix blocks. Typically, the flow between the fracture system and the matrix is modeled as pseudo-steady, assuming pressure equilibration between the fracture system and the matrix blocks within the duration of each time step. Two conservation laws are required: one for the fracture and one for the rock matrix blocks (see Barenblatt et al. 1960) given by

$$0 = \frac{\partial (\phi_1 \rho_f)}{\partial t} + \nabla \cdot (\phi_1 \rho_f \mathbf{v}_1) - Q_{\mathrm{inter}}, \tag{113}$$

$$0 = \frac{\partial (\phi_2 \rho_f)}{\partial t} + \nabla \cdot (\phi_2 \rho_f \mathbf{v}_2) + Q_{\mathrm{inter}}. \tag{114}$$

Here, $Q_{\mathrm{inter}}$ is the mass of the liquid which flows from the rock matrix into the fractures per unit time and unit rock volume. The subscripts 1 and 2 represent the fracture phase and the rock matrix phase, respectively. Note that Eqs. (113) and (114) are in the same form as (75) for a single porous medium. Combining Eq. (113) with Darcy's law and Eq. (114) with the assumption that the porosity in the rock matrix only depends on the corresponding pressure in the rock matrix, we obtain

$$\frac{\partial p_1}{\partial t} - \eta \frac{\partial \nabla^2 p_1}{\partial t} = \chi \nabla^2 p_1, \tag{115}$$

under the assumptions that the medium is homogeneous, the fluid is slightly compressible, and higher-order terms may be neglected. Here, $p_1$ is the pressure in the fracture phase, and $\eta$ and $\chi$ are constants depending on the system. The first numerical implementation of a finite-difference, multiphase, dual-porosity scheme including gravity and imbibition was presented by Kazemi et al. (1976). It can be shown (see Arbogast 1989) that models of this kind are well-posed, given appropriate boundary and initial conditions. For the derivation of the double-porosity model due to homogenization, see Arbogast et al. (1990). The assumptions of the dual-porosity method fail in the presence of large matrix blocks, low permeability, or silicification of fracture surfaces.

An extension of the dual-porosity model has been proposed by Pruess and Narasimhan (1985), the so-called Multiple INteracting Continua (MINC) method. The idea is to divide the domain into regions in which thermodynamic equilibrium can be assumed and to treat every block of the rock matrix not as a single porous medium, as in the dual-porosity method, but as a set of different porous media with different constitutive properties, all following Darcy's law. This way, the transient interaction between matrix and fractures can be represented more realistically. A coherent representation for the thermal flow has also been incorporated into the model using the integral finite-difference method. Examples of how this method can be adapted to different situations can be found in Kimura et al. (1992) and Wu and Pruess (2005). Further information on multiphysics and multiscale approaches can be found in this handbook in the corresponding chapter by Helmig et al. (2014), and the references therein. For other models concerning fracture-matrix interaction the reader is referred to Berkowitz (2002).

Stochastic-Continuum (SC) Method

The stochastic-continuum approach represents the fractured medium as a single continuum, but, as opposed to the ECM, it uses geostatistical parameters. The concept was first introduced for fractured media by Neuman and Depner (1988) and Tsang et al. (1996). The site's specific hydraulic parameters, such as the hydraulic conductivity, are modeled as random variables via stochastic methods, e.g., Monte Carlo simulation. However, measuring the hydraulic conductivity in this manner is problematic since it is a scale-dependent parameter which usually has a higher variance at smaller scales, possesses a varying support volume, and is derived from well tests whose scales have to match the required support scale of the model (see, e.g., Oden and Niemi 2006). The pioneering work by Neuman and Depner (1988) verifies the dependence of the effective principal hydraulic conductivity solely on mean, variance, and integral scales of local log hydraulic conductivities for ellipsoidal covariance functions. The proposed method requires that the medium is locally isotropic and that the variance is small. The stochastic-continuum model by Tsang et al. (1996) is based on a nonparametric approach using the sequential indicator simulation method.
Hydraulic conductivity data from point injection tests serve the purpose of deriving the needed input for this simulation, namely, the thresholds dividing the possible range of values of hydraulic conductivity in the stochastic continuum into classes and the corresponding fractions within each of these classes. In order to reflect the fractures and the rock matrix, they introduced a long-range correlation for the high hydraulic conductivity as part of the distribution in the preferred planes of fractures. Thus, both the fractures and the rock matrix contribute to the hydraulic conductivity even though they are not treated as two different continua as in the dual-continuum model described above. Due to the inherent restrictions of the model and the reduction of the uncertainty, they propose to employ spatially integrated quantities to model flow and transport processes in a strongly heterogeneous reservoir reflecting the continuum quality (spatial invariability) of their model.

Based on the concept of a stochastic REV using multiple realizations of stochastic Discrete Fracture Network (DFN) models (see below) simulated via the Monte Carlo method, Min et al. (2004) determine the equivalent permeability tensor with the help of the two-dimensional UDEC code by the Itasca Consulting Group Inc (2000). The central relation that is used in this code is a generalized Darcy's law for anisotropic and homogeneous porous media (see Bear 1972)

$$v_i = F \sum_{j=1}^{2} \frac{\kappa_{ij}}{\mu_f} \frac{\partial p}{\partial x_j}, \tag{116}$$

where $F$ is the cross-section area of the DFN model. One of the most important steps in the analysis is to prove the existence of the resulting permeability as a tensor, which was done by comparing the derived results with the ellipse equation of the directional permeability. The second important step is to determine whether a REV can be established for a specific site or not. Min and his coworkers presented two criteria, "coefficient of variation" and "prediction error", to show the existence of a REV and to specify its size for Sellafield, UK.

4.4.2 Discrete Models

Single-Fracture (SF) Models

In single-fracture models, only one fracture is considered. The analysis of the accurate behavior of a single fracture is crucial in the understanding of situations in which most of the flow occurs in a few dominant paths. There are different realizations for this concept, both deterministic and stochastic. An early statistical approach dealing with preferential flow paths based on the variation of fracture aperture can be found in Tsang et al. (1996). The classical idea for modeling a rock fracture is to consider it as a pair of smooth parallel planes (see Lomize 1951; Tang et al. 1981). These kinds of models are interesting from a mathematical point of view since they often offer an analytical or semi-analytical solution for the flow (see, e.g., Wu et al. 2005), and they are widely used in reservoir modeling as they can be useful for a quantitative analysis. However, they are far from being realistic. In reality, the surface of the fracture may be rough to the point that the flow can fail to satisfy the cubic law (see Bear et al. 1993)

$$v = \frac{g \, a_p^3}{12 \, \nu_f} \frac{\Delta h}{l}, \tag{117}$$

where $v$ is the flow rate, $g$ is the gravity acceleration, $a_p$ is the aperture, $\Delta h$ is the head loss, $l$ is the fracture length, and $\nu_f$ is the kinematic viscosity of the fluid. Several authors have tried to characterize this deviation from the cubic law with fractal (e.g., Brown 1987; Fomin et al. 2003) and statistical (e.g., Tsang and Tsang 1989) modeling of the fracture roughness. A derivation in a simplified setting, possible generalizations, and further discussion on fluid flow and its interaction with the stresses of the surrounding medium can also be found in the chapter by Renner and Steeb (2014) in this book. In general, it is acknowledged that the fractures themselves are two-dimensional networks of variable aperture. Most of the time, the task is not to identify every single roughness element of the surface, even though a number of technical methods are used to compute a precise profilometric analysis, but rather to identify the scale of the roughness which has a dominant influence on the fluid flow (see Berkowitz 2002); this is accommodated by introducing an effective aperture.

Recently, another method realizing the irregularity of the fracture surface was introduced to model (Navier-Stokes) flow through a single fracture, namely, the Lattice Boltzmann method (see Kim et al. 2003; Eker and Akin 2006, and the references therein). In contrast to the traditional "top-down" methods employing partial differential equations, the idea of this "bottom-up" method is to use simple rules to represent fluid flow based on the Boltzmann equation. In addition to problems caused by roughness, the situation is complicated by the influence of deformation processes due to flow and pressure gradients which should be considered at this scale (see, e.g., Auradou 2009). For heat extraction from Hot Dry Rock (HDR) systems, which belong to the petrothermal systems, Heuer et al. (1991) and Ghassemi et al. (2003) developed mathematical models for a single fracture.
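The cubic dependence on the aperture and the notion of an effective aperture can be sketched as follows. The example assumes the cubic law in the form $v = g\,a_p^3\,\Delta h / (12\,\nu_f\,l)$ with a kinematic viscosity $\nu_f$, which is an assumption of this sketch, as are the function names and the numerical values.

```python
def cubic_law_flow(aperture, head_loss, length, nu_f=1.0e-6, g=9.81):
    """Flow rate per unit fracture width between smooth parallel plates.

    aperture, head_loss, length in m; nu_f is the kinematic viscosity in m^2/s
    (assumed form of the cubic law, see the lead-in).
    """
    return g * aperture**3 * head_loss / (12.0 * nu_f * length)


def effective_hydraulic_aperture(flow, head_loss, length, nu_f=1.0e-6, g=9.81):
    """Aperture of the smooth parallel-plate fracture reproducing a given flow."""
    return (12.0 * nu_f * length * flow / (g * head_loss)) ** (1.0 / 3.0)


# Hypothetical fracture: 0.1 mm versus 0.2 mm aperture, 1 m head loss over 10 m.
q_narrow = cubic_law_flow(1.0e-4, 1.0, 10.0)
q_wide = cubic_law_flow(2.0e-4, 1.0, 10.0)
ratio = q_wide / q_narrow  # doubling the aperture multiplies the flow by 2**3

# Inverting the law recovers the aperture from a measured flow rate.
a_eff = effective_hydraulic_aperture(q_narrow, 1.0, 10.0)
```

For a rough fracture, the same inversion applied to a measured flow rate yields an aperture that generally differs from the mean mechanical aperture, which is the sense in which an effective aperture accommodates roughness.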
The essential feature of the approach by Heuer et al. (1991) is that the presented (one-dimensional) model can be solved by analytical methods. Furthermore, a generalization to an infinite number of parallel fractures is also given. A three-dimensional model of heat transport in a planar fracture in an infinite reservoir is given by Ghassemi et al. (2003), who derive an integral equation formulation with a Green function, effectively sidestepping the need to discretize the geothermal reservoir.

Discrete Fracture Network (DFN)

Among the methods for modeling a geothermal reservoir, the DFN approach is one of the most accurate but also one of the most difficult to implement. This model restricts fluid flow to the fractures and regards the surrounding rock as impermeable (e.g., Dershowitz et al. 2004). Thus, as described for the single-fracture models, most of the time fluid flow through fractures is compared to flow between parallel plates with smooth walls (Lee et al. 1999) or to flow through pipes (channeling-flow concept; see Tsang and Tsang 1987). In contrast to the dual-porosity model, fluid flow in the DFN model is governed by the cubic law (117). Based on this momentum equation and the continuity relation, unknown heads at intersections of the fracture network can be determined in the DFN model. A comparison of the dual-porosity method and the DFN method can be found in Lee et al. (1999). They identified the fracture volume fraction and the aperture to be the most significant parameters in the dual-porosity and the DFN model, respectively. Furthermore, they derived a one-dimensional analytical solution of the dual-porosity model for a confined fractured aquifer problem with the help of Fourier and Laplace transforms based on an earlier solution by Önder (1998). Different approaches to characterize a reservoir fracture network and the flow in such a network were introduced, e.g., based on stochastic models, fractal models, fuzzy logic and neural networks (incorporating field data), a combination of these models, or on percolation theory (see Watanabe and Takahashi 1995; Mo et al. 1998; Jing et al. 2000; Ouenes 2000; Maryška et al. 2004; Tran and Rahman 2006, and the references therein).
Note that percolation theory can also be used to determine the connectivity of a DFN and its effective permeability (see Berkowitz 1995; Mo et al. 1998; Masihi et al. 2007). Analytical models for the determination of the permeability of an anisotropic DFN are presented by Chen et al. (1999). The idea behind the stochastic models is that the entire fracture network cannot be located via seismic and geological means. Thus, fractures that are too small to be detected, but through which an important amount of flow can occur, are generated stochastically. Information on the morphology of the system has to be gathered first (via Monte Carlo sampling or geological analysis) in order to assign the correct probability distributions. Then, a realistic fracture network can be generated, providing a fairly good approximation of the real underground situation. The statistical distribution of fracture orientation is often described as a Fisher distribution (see Fisher et al. 1993), whereas fracture aperture and size can be sampled from log-normal or Gaussian distributions. For more details, see Assteerawatt (2008).

Fracture-Matrix (FM) Models

The last approach in the discrete category consists of fracture-matrix models, also called explicit discrete fracture and matrix methods (see Snow 1965; Sudicky and McLaren 1992; Stothoff and Or 2000; Reichenberger et al. 2006, and the references therein). These models are an extension of the above-described DFN models in that they consider the rock matrix as a porous medium. Thus, the influence of the interaction between fractures and surrounding rock matrix on the physical processes during fluid and thermal flow through the reservoir can be captured and analyzed on a more realistic level. For this reason, these models can also be used to determine the parameters needed in the continuum methods (see, e.g., Lang 1995; Lang and Helmig 1995).
Although the fracture-matrix models make it possible to represent fluid potential gradients and fluxes between the fracture system and the rock matrix on a physical level, this advantage is often reduced by the fact that the models are restricted to a vertical or horizontal orientation of the fractures (e.g., Travis 1984) or that the influence of the real geometry of the fractures, such as their tortuosity, is neglected. Since detailed knowledge of the geometric properties of the matrix is rarely available at a given site and the application of this method is computationally intensive, the less demanding dual-continuum method or the more general MINC method is widely used. Nevertheless, the progress of technology has recently allowed more extensive studies which combine an accurate description of the fracture system with a permeable matrix (e.g., by R. Helmig and his group, University of Stuttgart, Germany). It is now possible to model a very complex set of fractures and nonetheless take into account the fact that every fracture can exchange fluid with the surrounding rock matrix. A successful study of this kind has been carried out by Reichenberger et al. (2006). The main idea is that the capillary pressure and the flow must be continuous across the fracture boundary; therefore, a proper interface condition should be given. This method is still at an early stage and, at the moment, can rarely be used due to its strong dependency on a precise and complete knowledge of the real on-site fracture-matrix configuration.
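As an illustration of the stochastic fracture generation described above for DFN models, the following sketch draws fracture normals from a Fisher distribution and apertures from a log-normal distribution. It is a minimal example under stated assumptions: the mean pole is taken as the $x_3$-axis, the Fisher distribution is sampled by inverting its cumulative distribution function, and the function names and all parameter values are hypothetical. It is not the procedure of any of the cited works.

```python
import math
import random


def sample_fisher_normal(concentration, rng):
    """Unit normal drawn from a Fisher distribution around the x3-axis.

    The cosine w of the polar angle has density proportional to
    exp(concentration * w) on [-1, 1]; it is sampled by CDF inversion.
    """
    u = rng.random()
    w = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * concentration)) / concentration
    w = max(-1.0, min(1.0, w))              # guard against round-off
    azimuth = 2.0 * math.pi * rng.random()  # uniform azimuth around the pole
    s = math.sqrt(1.0 - w * w)
    return (s * math.cos(azimuth), s * math.sin(azimuth), w)


def generate_fractures(count, concentration, log_mean_aperture, log_std_aperture, seed=0):
    """List of (normal, aperture) pairs for a synthetic fracture set."""
    rng = random.Random(seed)
    fractures = []
    for _ in range(count):
        normal = sample_fisher_normal(concentration, rng)
        aperture = rng.lognormvariate(log_mean_aperture, log_std_aperture)
        fractures.append((normal, aperture))
    return fractures


# Hypothetical fracture set: tightly clustered poles, median aperture 0.1 mm.
fracture_set = generate_fractures(200, 20.0, math.log(1.0e-4), 0.5, seed=1)
mean_cosine = sum(normal[2] for normal, _ in fracture_set) / len(fracture_set)
```

A larger concentration parameter clusters the normals more tightly around the mean pole. In practice the distribution parameters would have to be fitted to site data before such a realization is used in a flow simulation.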

4.5 Heat Transport in a Fissured Medium

In contrast to an isotropic, porous medium, which has been discussed in Sect. 4.3, fissured geothermal systems are characterized by strong heterogeneities. They are subject to internal and external forces with different scales in time and space. Furthermore, the heat transfer medium flows through a partially poorly connected fractured network along obstacles with different structures and proportions. This may cause an anomalous behavior of the heat transport (as well as other transport processes), because the diffusing quantity (heat in our case) may remain at a certain position for a longer timespan, which slows down its movement. By virtue of its wide-ranging applicability, the concept of anomalous transport processes attracts more and more interest in natural sciences (see Luchko and Punzi 2011; Luchko 2014). To model the anomalous behavior of heat diffusion, the Continuous Time Random Walk (CTRW) can be applied. The idea is to interpret the heat as an ensemble of particles which move erratically through the porous medium. On the basis of an estimation of the probability density function of the jumps of the moving particles, a continuous heat transport model is obtained. The anomalous behavior is characterized by the temporal development of the expectation value $E((\delta x)^2)$ of the mean square deviation, i.e., the variance, of a moving particle in the form

$$E((\delta x)^2) \sim t^{\alpha}. \tag{118}$$
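The scaling law (118) suggests a simple way to estimate the exponent $\alpha$ from simulated or measured variances: on a log-log scale the relation is linear with slope $\alpha$. The following sketch fits this slope by ordinary least squares. The function name and the synthetic data are hypothetical.

```python
import math


def estimate_anomalous_exponent(times, variances):
    """Least-squares slope of log(variance) versus log(time).

    For data obeying variance ~ t**alpha the slope is an estimate of alpha.
    All times and variances must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free data with a known exponent (hypothetical prefactor 3).
sample_times = [1.0, 2.0, 4.0, 8.0, 16.0]
sample_variances = [3.0 * t**0.6 for t in sample_times]
alpha_estimate = estimate_anomalous_exponent(sample_times, sample_variances)
```

With noisy data the fit should be restricted to the time range in which the power law is expected to hold, since early-time and finite-size effects distort the slope.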

Fig. 25 Evolution of the variance in time of a one-dimensional CTRW simulation starting with a Dirac distribution for various $\alpha$ (Luchko and Punzi 2011)

Depending on the parameter $\alpha$, four cases can be distinguished: subdiffusion ($0 < \alpha < 1$) and standard diffusion ($\alpha = 1$) as shown in Fig. 25, superdiffusion ($1 < \alpha < 2$), and ballistic diffusion ($\alpha = 2$). In the preceding section on heat transport in a porous medium, we assumed standard diffusion. In the following, we consider the case of subdiffusion, i.e., $0 < \alpha < 1$, according to the considerations in Luchko and Punzi (2011). Using a probability density function with first moment $\tau$ and Laplace transform of the form

$$\psi(s) = 1 - (\tau s)^{\alpha}, \quad \text{for } s \to 0, \tag{119}$$

the fractional Fokker-Planck equation

$$(\rho c)_p \frac{\partial T}{\partial t} = \frac{\partial^{1-\alpha}}{\partial t^{1-\alpha}} \left( \nabla \cdot (k_p \nabla T) - c_f \rho_f \mathbf{v}_f \cdot \nabla T \right), \quad T(\cdot, 0) = T_0, \tag{120}$$

results from the CTRW ansatz, where $\partial^{1-\alpha} / \partial t^{1-\alpha}$ is the fractional time derivative of order $1 - \alpha$ and $k$ is a scalar such that the heat conductivity is given by $k_p = k\,I$. The fractional time derivative turns out to be problematic both from a theoretical and a numerical point of view. Various definitions of the fractional calculus are known. For the simulation of real processes, as in the case of modeling anomalous heat transport, the Caputo derivative is usually employed. It is defined by (see Gorenflo and Mainardi 1997)

$$\frac{\partial^{\beta}}{\partial t^{\beta}} f(t) = \frac{1}{\Gamma(1 - \beta)} \int_0^t \frac{1}{(t - \tau)^{\beta}} \frac{\partial}{\partial \tau} f(\tau) \, d\tau, \quad \text{for } 0 < \beta < 1, \tag{121}$$

with the Gamma function $\Gamma(\cdot)$. The derivative described in (121) corresponds to an integro-differential operator of convolution type. The process it describes is not Markovian, since the evaluation of the integral involves all preceding points in time. This results in problems for the numerical implementation, especially with respect to memory capacity. In the case of boundary value problems for fractional partial differential equations, there are still some difficulties and unsolved problems, especially as far as existence and uniqueness theorems are concerned. The methods known for classical elliptic and parabolic partial differential equations can often not be (easily) extended. An example for an extension of the established concepts is presented in Luchko (2009, 2010, 2014) and Luchko and Punzi (2011). In what follows, we neglect the convection term $\mathbf{v}_f \cdot \nabla T$ but account for a reaction term $\lambda T$ with the reaction rate $\lambda$. The (Dirichlet) initial boundary value problem with $0 < \alpha < 1$ for the three-dimensional case is given by

$$(\rho c)_p \frac{\partial^{\alpha} T}{\partial t^{\alpha}} = \nabla \cdot (k_p \nabla T) - \lambda T + Q_T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{122}$$

$$T(\cdot, 0) = T_0 \quad \text{in } B, \tag{123}$$

$$T = F \quad \text{on } \partial B \times [0, t_{\mathrm{end}}], \tag{124}$$

with the notation used in Sect. 4.3. Furthermore, for the time- and space-dependent source term $Q_T$ as well as the space-dependent parameters $k$ and $\lambda$, we have

$$Q_T(x, t) \le 0 \quad \text{for } x \in B, \; t \in (0, t_{\mathrm{end}}), \tag{125}$$

$$k \in C^{(1)}(B), \quad k(x) > 0 \quad \text{for } x \in B, \tag{126}$$

$$\lambda \in C(B), \quad \lambda(x) \ge 0 \quad \text{for } x \in B. \tag{127}$$

By use of a maximum principle, it is shown in Luchko (2009) that the initial boundary value problem (122)–(124) has at most one solution. If this solution exists, it depends continuously on the given data. In order to prove the existence of the solution, a generalized solution of the initial boundary value problem in the sense of Vladimirov is introduced in Luchko (2010). Further details can be found in Luchko (2009, 2010, 2014) and Luchko and Punzi (2011).

As already mentioned above, the numerical approximation of the fractional time derivative is problematic because information on the integrand at all preceding points in time is needed. The “short memory principle” and the “logarithmic memory principle” make it possible to reduce the memory requirements (see Ford and Simpson 2001). Among finite-difference schemes, the standard approximation is based on the Grünwald-Letnikov definition of the fractional derivative. This method converges linearly in time (e.g., Blank 1996). In Luchko and Punzi (2011), some techniques with better convergence rates are given.

In order to clarify the influence of fractional diffusion in comparison to standard diffusion, the following initial boundary value problem is considered:

\[ \frac{\partial^\alpha T}{\partial t^\alpha} = \nabla^2 T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{128} \]

\[ T(\cdot, 0) = T_0 \quad \text{in } B, \tag{129} \]

\[ \frac{\partial T}{\partial n} = 0 \quad \text{on } \partial B \times [0, t_{\mathrm{end}}], \tag{130} \]

where $B = (0, 1)^2$ is the unit square in $\mathbb{R}^2$ and $t_{\mathrm{end}} = 1$ s. Furthermore, $T_0$ is the characteristic function of the set

\[ \{ (x_1, x_2) \in \mathbb{R}^2 \mid (x_1 - 0.75)^2 + (x_2 - 0.25)^2 < 0.002 \}. \tag{131} \]

The numerical solution calculated by use of the approximation method described in McLean and Mustapha (2009) for $\alpha = 0.8$ (subdiffusion) and $\alpha = 1$ (standard diffusion) is illustrated in Fig. 26. The method combines a linear time iteration with a discontinuous Galerkin scheme on the basis of a finite-element triangulation. The illustration shows that the heat diffuses more slowly in case of subdiffusion (upper row) than in case of standard diffusion (bottom row). Deviations from a symmetrical propagation behavior are due to boundary effects.
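The memory issue of the Grünwald-Letnikov approximation mentioned above can be made concrete with a small sketch. It is not the discontinuous Galerkin method of McLean and Mustapha (2009); it only illustrates the finite-difference sum and a truncated ("short memory") variant. The function names `gl_weights` and `gl_derivative`, the step size, and the test function $f(t) = t$ are assumptions chosen for illustration.

```python
import math

def gl_weights(alpha, n):
    """Weights w_k = (-1)^k * binom(alpha, k), k = 0..n, via the usual recurrence."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_derivative(samples, alpha, h, memory=None):
    """Grünwald-Letnikov estimate of the order-alpha derivative at the last sample.

    samples[j] = f(j * h). We assume f(0) = 0, so that the Riemann-Liouville
    and Caputo derivatives coincide. If `memory` is given, only that many of
    the most recent steps enter the sum (short memory principle).
    """
    n = len(samples) - 1
    terms = n if memory is None else min(n, memory)
    w = gl_weights(alpha, terms)
    return sum(w[k] * samples[n - k] for k in range(terms + 1)) / h ** alpha

h = 1.0e-3
samples = [j * h for j in range(1001)]        # f(t) = t sampled on [0, 1]
approx = gl_derivative(samples, 0.8, h)       # uses the full history
exact = 1.0 / math.gamma(2.0 - 0.8)           # order-0.8 derivative of t at t = 1
```

The full sum touches every stored sample, which is the memory cost described in the text; passing `memory` trades accuracy for a bounded history. For $\alpha = 1$ the weights reduce to the backward difference.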

Fig. 26 Numerical solution of problem (128)–(130) using a two-dimensional DG-FEM for $\alpha = 0.8$ (upper row) and $\alpha = 1$ (bottom row) at $t = 0.33$ s, $t = 0.66$ s, and $t = 1.00$ s from left to right (see also Luchko and Punzi 2011)

5 Poroelastic Stress Field Modeling

It is evident that pumping a pressurized fluid into a rock formation changes the stresses within the rock as the fluid flows through pores and fractures. Leak-off causes the surrounding rock to expand (e.g., Biot 1935, 1941, 1955; Rice and Cleary 1976). This relation between pore pressure and stress (see Engelder and Fischer 1994; Addis 1997; Hillis 2000, 2001, 2003; Zhou and Ghassemi 2009, and the references therein) is known as poroelasticity. It may lead to changes in permeability and porosity, especially when fractures are stimulated within a petrothermal reservoir. Such fractures grow along the axis of maximum principal stress, whose direction is determined by the stresses at the fracture front (cf. Moeck et al. 2009). Furthermore, poroelastic effects also cause changes in the velocity regime of seismic waves, microseismicity, reactivation of slips and faults, disturbance of borehole stability, and changes in the flow paths of the fluid through the reservoir (see, e.g., Altmann et al. 2008). Moreover, the occurrence of mild seismic shocks in the surroundings of geothermal facilities, e.g., in Basel, Switzerland (December 2006 and January 2007, see Baisch et al. 2009) or Landau, Germany (August 2009, see Expertengruppe “Seismisches Risiko bei hydrothermaler Geothermie” 2010), led to a discussion about the safety of geothermal power plants. Consequently, modeling of stress field effects forms the fourth column of the Kaiserslautern model (Fig. 4).

Along with the poroelastic effects on the rock matrix, the other major sources of possible rock displacement, even after terminating the injection of fluid, are thermoelastic processes due to the temperature difference between the injected fluid and the medium (see Hicks et al. 1996; Ghassemi 2003; Ghassemi and Tarasovs 2004; Ghassemi and Zhang 2004; Brouwer et al. 2005; Yin 2008). As a matter of fact, the hot rock is cooled down by the injection of cold fluid and, as a consequence, the rock shrinks, which counteracts the poroelastic dilation (e.g., Nakao and Ishido 1998). Normally, on a short time scale, the mechanical effects are dominant and the thermoelastic effects can be neglected, but in a long-term simulation, they have to be taken into account. For comprehensive reviews of this problem, the reader is referred to Evans et al. (1999), Jing and Hudson (2002), Ghassemi (2003), Rutqvist and Stephansson (2003), and the references therein. We do not treat thermoelastic effects here, as a consistent treatment requires the consideration of flow transport, heat transport, and deformation processes as a coupled system of equations (thermo-hydro-mechanical modeling), which goes beyond the scope of this chapter. For simplicity, within this section, we focus on poroelastic effects based on the considerations in Augustin (2012).

In order to simulate the stress field in a homogeneous, isotropic medium $B \subset \mathbb{R}^3$, the so-called quasistatic equations of poroelasticity may be used. They were first described by Biot (1935, 1941). We briefly state the basic relations which are used to derive these equations. Further details can be found in Auriault (1973), Landau et al. (1986), Showalter (2000), Phillips (2005), Jaeger et al. (2007), Phillips and Wheeler (2007), and Lai et al. (2010). Considering a domain $B \subset \mathbb{R}^3$, with $x$ being a point in $B$ and $x'$ its image after a small deformation, the displacement vector $u$ is defined as

\[ u(x, t) = x'(t) - x(t) \tag{132} \]

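for some $t \in [0, t_{\mathrm{end}}]$. It describes deformations of the solid porous medium. With the standard assumptions of linear elasticity, the strain tensor $\varepsilon$ is given by

\[ \varepsilon_{ij}(u) = \frac{1}{2} \left( \frac{\partial u_i}{\partial x_j}(x, t) + \frac{\partial u_j}{\partial x_i}(x, t) \right) \tag{133} \]

and in case of an isotropic, homogeneous medium, this yields the stress tensor $\sigma$ given by

\[ \sigma_{ij}(x, t) = C_{ijkl}(x, t)\, \varepsilon_{kl}(u) = \lambda\, \varepsilon_{kk}(u)\, \delta_{ij} + 2 \mu\, \varepsilon_{ij}(u). \tag{134} \]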
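As a small illustration of (133) and (134), the following sketch evaluates strain and stress from a given displacement gradient at a single point. The function names and the numerical values of the Lamé parameters are assumptions made for this example only.

```python
def strain(grad_u):
    """Strain tensor (133): symmetric part of grad_u[i][j] = d u_i / d x_j."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]

def stress(eps, lam, mu):
    """Isotropic stress (134): lam * trace(eps) * delta_ij + 2 * mu * eps_ij."""
    trace = eps[0][0] + eps[1][1] + eps[2][2]
    return [[(lam * trace if i == j else 0.0) + 2.0 * mu * eps[i][j]
             for j in range(3)] for i in range(3)]

# Hypothetical Lamé parameters in Pa, chosen only for illustration.
lam, mu = 4.0e9, 6.0e9

# Uniaxial stretching in x_1 combined with a simple shear in the x_1-x_2 plane.
grad_u = [[1.0e-4, 2.0e-4, 0.0],
          [0.0,    0.0,    0.0],
          [0.0,    0.0,    0.0]]
eps = strain(grad_u)
sigma = stress(eps, lam, mu)
```

Only the symmetric part of the displacement gradient enters, so the shear entry `2.0e-4` is split evenly between $\varepsilon_{12}$ and $\varepsilon_{21}$.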

Here, $C$ is the Cauchy elasticity tensor of rank four, and $\lambda$ and $\mu$ are the Lamé parameters of the system. Moreover, we used Einstein’s summation convention. Since $u$ describes the solid medium, we need another quantity to describe the behavior of the fluid. As mentioned before in Sect. 4.1, the fluid flow in a porous medium can be modeled by Darcy’s law (89). As we assume an isotropic, homogeneous medium and fluid, the tensor-valued coefficient in (89) can be replaced by a scalar; together with the fluid properties, it defines the scalar coefficient $k$ used below. Next, we have to describe the interactions between fluid and solid. According to Biot (1935, 1941), there are two quantities which we have to consider. The first one is the poroelastic stress tensor $\sigma^{pe}$ given by

\[ \sigma^{pe}(x, t) = \sigma(x, t) - \alpha I p(x, t). \tag{135} \]

Here, $I$ is the unit tensor of rank two, $\alpha$ is the so-called Biot-Willis constant, and $p$ is the pore pressure of the fluid. The second quantity is the so-called volumetric fluid content

\[ \zeta(x, t) = c_0\, p(x, t) + \alpha \nabla \cdot u(x, t). \tag{136} \]

It describes those parts of volume changes of the fluid which are due to changes in fluid mass (see Jaeger et al. 2007, for a detailed derivation). Here, $c_0$ is the specific storage coefficient. Equations (133)–(136) form a complete list of constitutive relations to describe the interacting fluid-solid system. The behavior of the system is governed by the usual conservation laws of physics. In our case, balance of linear momentum and balance of mass read (Augustin 2012)

\[ \rho_s(x, t) \frac{\partial^2 u}{\partial t^2}(x, t) - \nabla \cdot \sigma^{pe}(x, t) = f(x, t) \quad \text{(balance of linear momentum)}, \tag{137} \]

\[ \frac{\partial \zeta}{\partial t}(x, t) + \nabla \cdot v_f(x, t) = h(x, t) \quad \text{(balance of mass)}, \tag{138} \]

with the mass density $\rho_s$ of the solid. We omit a detailed description of how these differential equations can be derived from the integral equations

\[ \frac{d}{dt} \int_{B_t} \rho_s(x, t)\, v_s(x, t)\, dV(x) = \int_{B_t} f(x, t)\, dV(x) + \int_{\partial B_t} \sigma^{pe}(x, t)\, n(x, t)\, dS(x), \tag{139} \]

\[ \frac{d}{dt} \int_{B_t} \zeta(x, t)\, dV(x) = \int_{B_t} h(x, t)\, dV(x). \tag{140} \]

The interested reader can find those, e.g., in Jaeger et al. (2007), Lai et al. (2010), and the references therein. The quantities occurring here, as far as not yet introduced, are

$B_t \subset B$: arbitrary material volume;
$v_s(x, t)$: deformation velocity of the solid;
$f(x, t)$: body force density;
$h(x, t)$: fluid source density.

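By combining Eqs. (89) and (133)–(138), we get (Augustin et al. 2012)

\[ \rho_s(x, t) \frac{\partial^2 u}{\partial t^2}(x, t) - (\lambda + \mu) \nabla \left( \nabla \cdot u(x, t) \right) - \mu \nabla^2 u(x, t) + \alpha \nabla p(x, t) = f(x, t), \tag{141} \]

\[ \frac{\partial}{\partial t} \left( c_0\, p(x, t) + \alpha \nabla \cdot u(x, t) \right) - \nabla \cdot \left( k \left( \nabla p(x, t) - \rho_f\, g(x, t) \right) \right) = h(x, t) \tag{142} \]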

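as the governing equations of poroelasticity in a homogeneous, isotropic medium. This can be seen as the poroelastic generalization of the acoustic wave equation (38). Since we are not interested in wave phenomena here, but in consolidation processes, it appears plausible to neglect the second-order time derivative term in (141). This can be justified by a nondimensionalization. Let $x_0$, $t_0$ be a characteristic length scale and a characteristic time scale, respectively. Neglecting fluid forces and setting

\[ \tilde{x} = \frac{x}{x_0}, \quad \tilde{t} = \frac{t}{t_0}, \quad \nabla_x = \frac{1}{x_0} \nabla_{\tilde{x}} = \frac{1}{x_0} \tilde{\nabla}, \quad \tilde{u} = \frac{u}{x_0}, \quad \tilde{p} = \frac{p}{\mu}, \quad \tilde{f} = \frac{x_0}{\mu} f, \quad \tilde{h} = t_0 h \]

yields

\[ \left[ \frac{\rho_s x_0^2}{\mu t_0^2} \right] \frac{\partial^2 \tilde{u}}{\partial \tilde{t}^2} - \left[ \frac{\lambda + \mu}{\mu} \right] \tilde{\nabla} \left( \tilde{\nabla} \cdot \tilde{u} \right) - \tilde{\nabla}^2 \tilde{u} + \alpha \tilde{\nabla} \tilde{p} = \tilde{f}, \tag{143} \]

\[ \frac{\partial}{\partial \tilde{t}} \left( c_0 \mu\, \tilde{p} + \alpha \tilde{\nabla} \cdot \tilde{u} \right) - \left[ \frac{t_0 k \mu}{x_0^2} \right] \tilde{\nabla} \cdot \tilde{\nabla} \tilde{p} = \tilde{h}. \tag{144} \]

The nondimensional coefficient in front of the diffusion term in (144) suggests

\[ t_0 = \frac{x_0^2}{k \mu}, \]

leading to

\[ \left[ \frac{\rho_s k^2 \mu}{x_0^2} \right] \frac{\partial^2 \tilde{u}}{\partial \tilde{t}^2} - \left[ \frac{\lambda + \mu}{\mu} \right] \tilde{\nabla} \left( \tilde{\nabla} \cdot \tilde{u} \right) - \tilde{\nabla}^2 \tilde{u} + \alpha \tilde{\nabla} \tilde{p} = \tilde{f}, \tag{145} \]

\[ \frac{\partial}{\partial \tilde{t}} \left( c_0 \mu\, \tilde{p} + \alpha \tilde{\nabla} \cdot \tilde{u} \right) - \tilde{\nabla}^2 \tilde{p} = \tilde{h}. \tag{146} \]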

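For our purposes, the characteristic length scale $x_0$ can be assumed to be of the order of several hundred meters or several kilometers. For example, with $x_0 = 100$ m and for Berea sandstone (Schanz 2001), we have

\[ \frac{\rho_s k^2 \mu}{x_0^2} \approx 5.3 \times 10^{-11}, \]

which is rather small compared to

\[ \frac{\lambda + \mu}{\mu} = \frac{5}{3}, \quad \alpha = 0.867, \quad \text{and} \quad c_0 \mu \approx 0.461. \]

Therefore, the term $\rho_s(x, t)\, \partial_t^2 u(x, t)$ in (141) is negligible. Hence, the quasistatic equations of poroelasticity can be written as

\[ -(\lambda + \mu) \nabla (\nabla \cdot u) - \mu \nabla^2 u + \alpha \nabla p = f \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{147} \]

\[ \frac{\partial}{\partial t} \left( c_0\, p + \alpha \nabla \cdot u \right) - \nabla \cdot \left( k \left( \nabla p - \rho_f\, g \right) \right) = h \quad \text{in } B \times (0, t_{\mathrm{end}}). \tag{148} \]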
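The order-of-magnitude argument behind dropping the inertia term can be retraced numerically. The sketch below evaluates the dimensionless coefficients of (143)–(146) for one parameter set. The material values are assumptions: they are figures commonly quoted for Berea sandstone and were chosen because they reproduce the numbers stated in the text, not because the text lists them.

```python
# Assumed material data (illustrative, SI units).
rho_s = 2458.0     # mass density in kg/m^3
k = 1.9e-10        # coefficient k of (142) in m^4/(N s)
mu = 6.0e9         # Lamé parameter mu (shear modulus) in Pa
lam = 4.0e9        # Lamé parameter lambda in Pa
c0 = 7.68e-11      # specific storage coefficient in 1/Pa
x0 = 100.0         # characteristic length in m

# Characteristic time suggested by the diffusion term in (144).
t0 = x0 ** 2 / (k * mu)

# Coefficient of the second-order time derivative in (143) with this t0.
inertia = rho_s * x0 ** 2 / (mu * t0 ** 2)

# Remaining dimensionless coefficients of (145) and (146).
elastic_ratio = (lam + mu) / mu
storage = c0 * mu

print(f"t0 = {t0:.1f} s, inertia coefficient = {inertia:.2e}")
print(f"(lambda + mu)/mu = {elastic_ratio:.4f}, c0*mu = {storage:.3f}")
```

With these assumed inputs the inertia coefficient is about ten orders of magnitude smaller than the other coefficients, which is the justification given for the quasistatic form.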

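To guarantee uniqueness of a solution, these equations have to be equipped with initial and boundary conditions. Boundary conditions might be

\[ u = u_D \quad \text{on } \Gamma_d \times [0, t_{\mathrm{end}}], \qquad (\sigma - \alpha I p)\, n = t_N \quad \text{on } \Gamma_t \times [0, t_{\mathrm{end}}], \tag{149} \]

\[ p = p_D \quad \text{on } \Gamma_p \times [0, t_{\mathrm{end}}], \qquad k \left( \nabla p - \rho_f\, g \right) \cdot n = q \quad \text{on } \Gamma_f \times [0, t_{\mathrm{end}}]. \tag{150} \]

Here, the different parts of the boundary have to satisfy the conditions $\Gamma_d \cap \Gamma_t = \emptyset = \Gamma_p \cap \Gamma_f$ and $\Gamma_d \cup \Gamma_t = \partial B = \Gamma_p \cup \Gamma_f$. The indices “d”, “t”, “p”, and “f” stand for “displacement”, “tension”, “pressure”, and “flow”. It turns out that a suitable initial condition is given by prescribing the fluid content $\zeta = c_0\, p + \alpha \nabla \cdot u$ at $t = 0$, such that (Augustin 2012)

\[ c_0\, p(x, 0) + \alpha \nabla \cdot u(x, 0) = \zeta(x, 0) = \zeta^{(0)}(x) \quad \text{for all } x \in B. \tag{151} \]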

Probably, the most realistic way to do this would be to prescribe an initial pressure $p(x, 0) = p^{(0)}(x)$, an initial volume force density $f(x, 0) = f^{(0)}(x)$, and boundary conditions on $u$ at $t = 0$.

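Fig. 27 Relations of the equations of quasistatic poroelasticity to well-known (systems of) differential equations. The diagram links the quasistatic system $-\frac{\lambda + \mu}{\mu} \nabla (\nabla \cdot u) - \nabla^2 u + \alpha \nabla p = 0$, $\partial_t (c_0 \mu\, p + \alpha \nabla \cdot u) - \nabla^2 p = 0$ to the Stokes equations $-\mu \nabla^2 u + \nabla p = 0$, $\nabla \cdot u = 0$, to the heat equation $\partial_t p - \nabla^2 p = 0$, and to the Cauchy-Navier equation $-\frac{\lambda + \mu}{\mu} \nabla (\nabla \cdot u) - \nabla^2 u = 0$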

With this, $u(x, 0) = u^{(0)}(x)$ can be determined from Eq. (147), as this equation is independent of time and, hence, also holds for $t = 0$. As a consequence, $\zeta(x, 0) = \zeta^{(0)}(x)$ can be calculated. Existence and uniqueness results for a solution of the initial boundary value problem (147)–(151) can be found in Augustin (2012).

In order to put Eqs. (147)–(148) in context, we mention some relations, which are also visualized in Fig. 27. Obviously, we have a relation to the Cauchy-Navier equation

\[ -(\lambda + \mu) \nabla (\nabla \cdot u) - \mu \nabla^2 u = f \tag{152} \]

of elastostatics. Moreover, a relation to the heat or diffusion equation

\[ \frac{\partial}{\partial t} p - \nabla^2 p = h \tag{153} \]

is clearly recognizable, as this is a prototype of a time-dependent parabolic differential equation. Furthermore, there is a relation to the Stokes equations

\[ -\mu \nabla^2 u + \nabla p = 0, \quad \nabla \cdot u = 0, \tag{154} \]

which is probably the simplest system of equations for a vector-valued quantity $u$ and a scalar-valued quantity $p$. The relations given above can also be established by regarding fundamental solutions to the quasistatic equations of poroelasticity. For the sake of convenience, we consider the dimensionless equations (cf. Eqs. (145) and (146))

\[ -\frac{\lambda + \mu}{\mu} \nabla (\nabla \cdot u) - \nabla^2 u + \alpha \nabla p = 0, \tag{155} \]

\[ \frac{\partial}{\partial t} \left( c_0 \mu\, p + \alpha (\nabla \cdot u) \right) - \nabla^2 p = 0. \tag{156} \]


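For the sake of readability, dimensionless quantities are denoted without the tilde from now on. With the abbreviations

\[ C_1 = \frac{\alpha}{c_0 (\lambda + 2\mu) + \alpha^2}, \quad C_2 = \frac{\lambda + 2\mu}{c_0 (\lambda + 2\mu) + \alpha^2}, \quad C_3 = \frac{c_0 (\lambda + 3\mu) + \alpha^2}{2 \left( c_0 (\lambda + 2\mu) + \alpha^2 \right)}, \quad C_4 = \frac{c_0 (\lambda + \mu) + \alpha^2}{c_0 (\lambda + 3\mu) + \alpha^2}, \tag{157} \]

we obtain the fundamental solutions (Augustin 2012)

\[ p^{Si}(x, t) = C_2\, G^{Heat}(x, t), \tag{158} \]

\[ u^{Si}(x, t) = C_1 \left( \nabla_x \int_0^t C_2\, G^{Heat}(x, \tau) + G^{Harm}(x)\, \delta(\tau)\, d\tau \right), \tag{159} \]

\[ p^{Fi}(x, t) = C_1 \left( \nabla_x C_2\, G^{Heat}(x, t) + \nabla_x G^{Harm}(x)\, \delta(t) \right), \tag{160} \]

\[ u^{Fi}(x, t) = C_1^2\, \nabla_x \left( \nabla_x \int_0^t C_2\, G^{Heat}(x, \tau) + G^{Harm}(x)\, \delta(\tau)\, d\tau \right) + u^{CN}(x)\, \delta(t), \tag{161} \]

or, more explicitly,

\[ p^{Si}(x, t) = C_2 \frac{1}{\sqrt{\pi}^{3} \sqrt{4 C_2 t}^{3}} \exp\left( -\frac{\|x\|^2}{4 C_2 t} \right), \tag{162} \]

\[ u^{Si}(x, t) = C_1 \frac{x}{4 \pi \|x\|^3} \left( \operatorname{erf}\left( \frac{\|x\|}{\sqrt{4 C_2 t}} \right) - \frac{2}{\sqrt{\pi}} \frac{\|x\|}{\sqrt{4 C_2 t}} \exp\left( -\frac{\|x\|^2}{4 C_2 t} \right) \right), \tag{163} \]

\[ p^{Fi}(x, t) = C_1 \frac{x}{4 \pi \|x\|^3}\, \delta(t) - C_1 C_2 \frac{2 x}{\sqrt{\pi}^{3} \sqrt{4 C_2 t}^{5}} \exp\left( -\frac{\|x\|^2}{4 C_2 t} \right), \tag{164} \]

\[ u^{Fi}_{ki}(x, t) = C_3 \frac{1}{4 \pi \|x\|} \left( \delta_{ki} + C_4 \frac{x_i x_k}{\|x\|^2} \right) \delta(t) + C_1^2 \frac{1}{4 \pi \|x\|^3} \left[ \left( \delta_{ik} - \frac{3 x_i x_k}{\|x\|^2} \right) \left( \operatorname{erf}\left( \frac{\|x\|}{\sqrt{4 C_2 t}} \right) - \frac{2}{\sqrt{\pi}} \frac{\|x\|}{\sqrt{4 C_2 t}} \exp\left( -\frac{\|x\|^2}{4 C_2 t} \right) \right) + \frac{4}{\sqrt{\pi}} \left( \frac{\|x\|}{\sqrt{4 C_2 t}} \right)^{3} \frac{x_i x_k}{\|x\|^2} \exp\left( -\frac{\|x\|^2}{4 C_2 t} \right) \right], \tag{165} \]

with Dirac’s delta distribution $\delta(\cdot)$, with $G^{Heat}(\cdot, \cdot)$, $G^{Harm}(\cdot)$, and $u^{CN}(\cdot)$ being the fundamental solutions to the heat, Laplace, and Cauchy-Navier equation, respectively, and with $\operatorname{erf}(\cdot)$ being the Gaussian error function defined as

\[ \operatorname{erf}(\xi) = \frac{2}{\sqrt{\pi}} \int_0^{\xi} \exp(-\eta^2)\, d\eta. \tag{166} \]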
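A quick numerical plausibility check of the explicit pressure solution (162) is sketched below: for $t > 0$ it should satisfy the heat equation with diffusivity $C_2$ in three dimensions, written here in radial form. The function names, the finite-difference step, and the sample values of $r$, $t$, and $C_2$ are assumptions made for this illustration.

```python
import math

def p_si(r, t, c2):
    """Radial profile of (162): C2 times the 3D heat kernel with diffusivity C2."""
    return c2 * math.exp(-r * r / (4.0 * c2 * t)) / (math.pi * 4.0 * c2 * t) ** 1.5

def heat_residual(r, t, c2, d=1.0e-3):
    """Central-difference residual of dp/dt - C2 * (p_rr + (2/r) * p_r)."""
    dp_dt = (p_si(r, t + d, c2) - p_si(r, t - d, c2)) / (2.0 * d)
    p_r = (p_si(r + d, t, c2) - p_si(r - d, t, c2)) / (2.0 * d)
    p_rr = (p_si(r + d, t, c2) - 2.0 * p_si(r, t, c2) + p_si(r - d, t, c2)) / (d * d)
    return dp_dt - c2 * (p_rr + 2.0 * p_r / r)

# Hypothetical sample point and constant, away from the singularity at t = 0.
residual = heat_residual(r=0.7, t=0.5, c2=0.8)
```

The residual vanishes up to the discretization error of the difference quotients; it does not test the coupled displacement formulas (163)–(165).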

Note that the fundamental solution for the pressure in the Stokes equations (154) is $\nabla_x G^{Harm}(x)$, which is part of $p^{Fi}$. Although the existence of a solution is guaranteed under certain conditions, it is in most cases not possible to compute such a solution analytically. Instead, a numerical solution scheme is needed to find a good approximate solution. Here, we introduce a solution scheme based on the method of fundamental solutions.

The idea of the method of fundamental solutions is to choose a system of fundamental solutions to a given differential equation such that their singularities are not contained in the domain under consideration. Any finite linear combination of these fundamental solutions satisfies the corresponding differential equations with vanishing right-hand side and, thus, is a suitable ansatz for the solution of a corresponding initial boundary value problem. The coefficients of this ansatz can be determined by demanding that prescribed initial and boundary values be approximated, e.g., in a least-squares or collocation sense. The basics of this idea can be dated back to Runge (1885), Trefftz (1926), and Walsh (1929). In the context of electrostatics, the method is known as the method of image charges (Jackson 1998). Results in the context of potential theory, which also give a connection to a single-layer approach, can be found in Freeden (1983), Freeden and Michel (2004), and the references therein. The main advantages of the method are that it is mesh-free, integration-free, and easy to implement (Smyrlis 2009a). On the other hand, there is no general advice on how to choose the location of singularities or collocation points (Katsurada and Okamoto 1996; Barnett and Betcke 2008; Smyrlis and Karageorghis 2009). Choosing suitable sets of points is crucial for the quality of the approximation. Often, the resulting systems of linear equations are ill-conditioned and have to be stabilized (Smyrlis 2009a). It is possible to show that certain systems of fundamental solutions are dense subsets of the solution space to a given partial differential equation.
We refer to Freeden (1980, 1983), Freeden and Kersten (1981), and Freeden and Gerhards (2013) for results in potential theory, Müller and Kersten (1980) for results on the Helmholtz equation, Browder (1962) and Smyrlis (2009a) for results on elliptic operators, Freeden and Reuter (1990), Freeden and Michel (2004), and Smyrlis (2009b) for results on the Cauchy-Navier equation, Mayer (2007) and Mayer and Freeden (2014) for results on the Stokes equations, and Kupradze (1964), Johansson and Lesnic (2008), and Johansson et al. (2011) for results on the heat equation. Quantitative convergence results are harder to obtain. They are known for the Laplace equation (Katsurada 1989; Li 2008a) and a (modified) Helmholtz equation (Barnett and Betcke 2008; Li 2008b). For the case of quasistatic poroelasticity, an ansatz based on the method of fundamental solutions can be written in the form

\[ u_i(x, t) = \sum_{n=1}^{N} \left( \sum_{m=1}^{M} \left( \sum_{k=1}^{3} a_{nm}^{(k)}\, u_{ki}^{fi}(x - y_n, t - \tau_m) + b_{nm}\, u_i^{Si}(x - y_n, t - \tau_m) \right) + \sum_{k=1}^{3} a_n^{CN,(k)}(t)\, u_{ki}^{CN}(x - y_n) \right) + \sum_{l=1}^{L} b_l^{(0)}\, u_i^{Si}(x - y_l, t - \tau_0), \tag{167} \]

\[ p(x, t) = \sum_{n=1}^{N} \left( \sum_{m=1}^{M} \left( \sum_{k=1}^{3} a_{nm}^{(k)}\, p_k^{fi}(x - y_n, t - \tau_m) + b_{nm}\, p^{Si}(x - y_n, t - \tau_m) \right) + \sum_{k=1}^{3} a_n^{CN,(k)}(t)\, p_k^{St}(x - y_n) \right) + \sum_{l=1}^{L} b_l^{(0)}\, p^{Si}(x - y_l, t - \tau_0), \tag{168} \]

Fig. 28 Time development of $p$ (left), evaluated at the origin $x = 0$, and approximation error $E_{rel}(x = 0, t)$ (right)

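with $N, M, L \in \mathbb{N}$, and

\[ u^{fi}(x - y, t - \tau) = u^{Fi}(x - y, t - \tau) - u^{CN}(x - y)\, \delta(t - \tau), \tag{169} \]

\[ p^{fi}(x - y, t - \tau) = p^{Fi}(x - y, t - \tau) - p^{St}(x - y)\, \delta(t - \tau), \tag{170} \]

\[ p^{St}(x - y) = \nabla_x G^{Harm}(x - y). \tag{171} \]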

It can be shown that this ansatz can be reduced by neglecting the fi-parts. A detailed discussion of the method of fundamental solutions, including density results in appropriate solution spaces, can be found in the PhD thesis Augustin (2014). The considerations here are restricted to the behavior of the pressure $p$ in an exemplary problem on the cube $(-1, 1)^3$. For this purpose, we assume that the fluid content $\zeta(x, t)$ associated with the functions $u^{Si}(x_1 - 2, x_2 - 2, x_3 - 2, t + 1)$, $p^{Si}(x_1 - 2, x_2 - 2, x_3 - 2, t + 1)$ restricted to $(-1, 1)^3 \times (0, 1)$ is known at time $t = 0$ in the whole cube. As boundary conditions, let the associated normal tensions and normal fluid fluxes be prescribed on the surfaces of the cube at $x_3 = \pm 1$ and the displacement and pressure themselves on all other surfaces. We want to approximate the values of $u^{Si}(x_1 - 2, x_2 - 2, x_3 - 2, t + 1)$, $p^{Si}(x_1 - 2, x_2 - 2, x_3 - 2, t + 1)$ in $(-1, 1)^3 \times (0, 1)$.

Figure 28 depicts the development of the pressure $p$ in time at the origin, i.e., the center of the cube. The behavior of $p$ is non-monotonic as a result of poroelastic effects. The approximation error is shown on the right-hand side of Fig. 28, using a logarithmic scale. As can be seen, the approximation is very good, with an error between $2 \times 10^{-7}$ and $10^{-9}$. Figures 29 and 30 show on the left-hand side the pressure in the intersections of $(-1, 1)^3$ with the planes $x_3 = 0$ and $x_1 = 0$ at times $t = 0.2$ and $t = 1$, respectively (as stated in the figure captions). On the right-hand side, the associated relative errors are shown in a logarithmic pseudocolor plot using logarithmic contour distances. As can be seen, the largest errors in the intersection with the plane $x_3 = 0$ are found inside the cube, and the largest errors in the intersection with the plane $x_1 = 0$ are found at the boundaries at $x_3 = \pm 1$. This leads to the conclusion that the approximation of normal tensions and normal fluid fluxes is harder than the approximation of prescribed values of displacement or pressure at the boundary.

For further examples including a parameter study on the method of fundamental solutions, the reader is referred to the PhD thesis Augustin (2014).
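The collocation idea behind the method of fundamental solutions can be tried out on a much simpler problem than quasistatic poroelasticity. The following sketch treats the two-dimensional Laplace equation on the unit disk with Dirichlet data; it is a toy analogue under stated assumptions, not the scheme of Augustin (2014). The placement of the singularities on a circle of radius 2, the number of points, the boundary data, and all function names are choices made for this example.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def g_harm(x, y):
    """Fundamental solution of the 2D Laplace equation with singularity at y."""
    return -math.log(math.hypot(x[0] - y[0], x[1] - y[1])) / (2.0 * math.pi)

def boundary_data(x):
    """Harmonic test function, so it is also the exact solution inside the disk."""
    return x[0] ** 2 - x[1] ** 2

n = 16
angles = [2.0 * math.pi * j / n for j in range(n)]
colloc = [(math.cos(a), math.sin(a)) for a in angles]                # on the boundary
sources = [(2.0 * math.cos(a), 2.0 * math.sin(a)) for a in angles]   # outside the domain

# Collocation: the ansatz must reproduce the boundary data at the collocation points.
matrix = [[g_harm(x, y) for y in sources] for x in colloc]
coeffs = solve(matrix, [boundary_data(x) for x in colloc])

def u_mfs(x):
    """Evaluate the ansatz: a linear combination of shifted fundamental solutions."""
    return sum(c * g_harm(x, y) for c, y in zip(coeffs, sources))

error = abs(u_mfs((0.3, 0.2)) - boundary_data((0.3, 0.2)))
```

No mesh and no integration are needed, which reflects the advantages named in the text; moving the sources far away from the boundary makes the collocation matrix increasingly ill-conditioned, which reflects the drawback.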

Fig. 29 Left: pseudocolor plot of $p$ in the intersection with the plane $x_3 = 0$ (upper row) and the intersection with the plane $x_1 = 0$ (lower row) at time $t = 0.2$. Right: logarithmic pseudocolor plots using logarithmic contour distances for the associated relative errors in approximating $p$

Fig. 30 Left: pseudocolor plot of $p$ in the intersection with the plane $x_3 = 0$ (upper row) and the intersection with the plane $x_1 = 0$ (lower row) at time $t = 1$. Right: logarithmic pseudocolor plots using logarithmic contour distances for the associated relative errors in approximating $p$


In order to use the method of fundamental solutions for problems with nonvanishing right-hand side, it has to be combined with another method. A possible extension is the dual reciprocity method (Golberg and Chen 1998). The starting point is the fact that a solution of a linear differential equation can be represented as the sum of a solution to the homogeneous equation and a particular solution to the equation with the nonvanishing right-hand side. Let us assume that we have a differential equation for the scalar-valued function $p$ with a right-hand side $h$. The idea of the dual reciprocity method is to find systems of radial basis functions $\{p^{(i)}\}_{i=0}^{N}$ and $\{h^{(i)}\}_{i=0}^{N}$ such that for every $i$, the corresponding $p^{(i)}$ is a solution to the differential equation under consideration with right-hand side $h^{(i)}$. If such systems exist and form a basis of the respective spaces to which $h$ and $p$ belong, a solution for an arbitrary right-hand side $h$ can be found by expanding $h$ with respect to $\{h^{(i)}\}_{i=0}^{N}$. The solution $p_{part}$ is then given by plugging the coefficients of this expansion into an ansatz given by a similar expansion based on $\{p^{(i)}\}_{i=0}^{N}$. In order to satisfy initial and boundary conditions, one modifies the given initial and boundary values by the values of $p_{part}$ for $t = 0$ and at the boundary, respectively, and applies the method of fundamental solutions to the new problem with $h = 0$ and the modified initial and boundary conditions to obtain $p_{hom}$. The desired solution is then given by $p = p_{part} + p_{hom}$.

In case of quasistatic poroelasticity, the procedure explained above has to be generalized. It is convenient to regard (147) and (148) as a system of four equations. The right-hand side of this system is a vector with four components, $f_1$, $f_2$, $f_3$, and $h$. Thus, we need a system $\{f^{(i)}\}_{i=0}^{N}$ of radial basis functions which takes values in the space of real-valued $4 \times 4$ tensors. This becomes clear if we consider that in order to express an arbitrary vector in $\mathbb{R}^4$, a basis of four vectors is needed. Consequently, the corresponding radial basis functions $\{u^{(i)}\}_{i=0}^{N} \subset \mathbb{R}^{4 \times 4}$ of solutions are also tensor valued. The procedure for finding a solution stays the same as in the scalar-valued case mentioned above.
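The three steps of the scalar procedure (expand $h$, assemble $p_{part}$, correct with $p_{hom}$) can be retraced in a one-dimensional toy problem, $p'' = h$ on $(0, 1)$ with $p(0) = p(1) = 0$. This is only a sketch under simplifying assumptions: the basis pair $h^{(i)}(x) = |x - c_i|$, $p^{(i)}(x) = |x - c_i|^3 / 6$, the centers $c_i$, and the right-hand side are chosen for illustration, and the homogeneous part is taken as an affine function directly, since in one dimension every solution of $p'' = 0$ is affine.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def h(x):
    """Right-hand side; the exact solution is then -sin(pi x) / pi^2."""
    return math.sin(math.pi * x)

n = 40
centers = [i / n for i in range(n + 1)]

# Step 1: expand h in the basis h_i(x) = |x - c_i| by collocation at the centers.
matrix = [[abs(cj - ci) for ci in centers] for cj in centers]
coeffs = solve(matrix, [h(cj) for cj in centers])

# Step 2: reuse the coefficients with p_i(x) = |x - c_i|^3 / 6, because p_i'' = h_i.
def p_part(x):
    return sum(a * abs(x - c) ** 3 / 6.0 for a, c in zip(coeffs, centers))

# Step 3: homogeneous part A + B x that cancels the boundary values of p_part.
A = -p_part(0.0)
B = -p_part(1.0) - A

def p(x):
    return p_part(x) + A + B * x

error_mid = abs(p(0.5) + 1.0 / math.pi ** 2)
```

In the setting of the text, step 3 would be carried out with the method of fundamental solutions instead of an explicit affine function, and the basis functions would be tensor valued.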

6 Opportunities, Challenges, and Perspectives

Geothermal energy is a key renewable resource with many valuable opportunities (see, e.g., Mongillo 2011) such as

• Extensive global distribution: There is a high potential for geothermal energy production in many areas. Nevertheless, a joint worldwide research activity in the geothermal sector is not feasible, as the geological and geothermal conditions vary substantially between countries. Geothermal energy requires decentralized research facilities, however, complemented by a close networking in training and education activities worldwide.
• Independence of season: Geothermal energy production is not dependent on any weather phenomenon (such as solar and wind influences) and is inexhaustibly available anywhere and anytime in the Earth’s interior.
• Environmentally friendly character: Deep geothermal energy projects realize an excellent carbon balance. After the installation of a geothermal power plant, the net power production is totally CO2-free. Further valuable arguments are the creation of a local value, the prevention of a foreign policy influence, and the independence of fossil energy forms, as well as the requirement of little space.
• Contribution to the development of diversified power: Geothermal resources provide a reliable local energy source that can, at least to some extent, be used to replace the energy production based on fossil fuels.


In consequence, the International Energy Agency report (see, e.g., International Energy Agency 2010) points out the perspective that geothermal resources have the potential to make a considerable contribution towards meeting the world’s energy needs well into the future while contributing to reduced emissions and to the mitigation of climate change. The global geothermal potential is enormous, and geomathematics as an interdisciplinary science can play a decisive role in the scientific consortium concerned with deep geothermal research. The GLITNIR geothermal research report (see Welding 2007) lists some challenges which are still valid today:

• With the rapid development of the industry, there is an immediate need for additional expertise (from resource development to business management), support services (drilling capacity, data/information generation and availability, technology, power generation equipment, etc.), focused capital, transparency of the assurance market, and public acceptance.
• There is still some fragmentation in the sector, which hinders appropriate funding and development of the resource.
• Risk minimization including geomathematical as well as geophysical research is absolutely indispensable. Further academic education is necessary to increase the industry talent pool. Overall project risks are higher with inexperienced scientists.
• Pressure will remain to ramp up projects at a rapid pace, requiring commitments from multiple players and institutions including the public sector.
• Geothermal energy is the only real renewable baseload electricity option, yet does not get enough political support. Other incentives for geothermal exploration are needed which could help to cover at least parts of the drilling risk for geothermal reservoirs.
• Future expansion of geothermal developments depends on exploring new fields and overcoming technical difficulties in known fields that are not yet exploited. A particular issue that should currently be addressed by the geothermal community is the development of reliable EGS procedures that can ensure sustainable flow rates.
• The potential risk of induced seismicity impedes the sustainable success of geothermal technology and creates a problem of social acceptance among the population (see Freeden 2013; Bauer et al. 2014a, and the references therein for more detailed explanations). There is a discrepancy between the willingness of the population to use “green energies” more intensively and the acceptance of geothermal facilities in the neighborhood.

Altogether, we are led to the following key elements in geothermal development (see, e.g., MIT (Massachusetts Institute of Technology) 2006; Gehringer and Loksha 2012):

• Geothermal energy is well positioned within the renewable energies.
• Advanced exploration methods are crucial to reduce risks of failure in geothermal drilling and relevant costs during the exploratory phase.
• A large and targeted geomathematical as well as geophysical research effort is needed to improve the results obtained by geothermal exploration methods.
• Substantial geothermal growth could be provided by EGS technology; at the same time, however, sociopolitical acceptance by the population is increasingly a success factor for the realization of geothermal projects. The indicators “risk perception” and “societal and individual benefits” are decisive features (see, e.g., Aitken 2010; Freeden 2013; Bauer et al. 2014a, b).


Fig. 31 Key elements of successful geothermal energy development (due to Gehringer and Loksha 2012)

All in all, geothermal energy is becoming more and more important given the increasing demand for renewable energy production. The key elements of a successful geothermal energy development are illustrated in Fig. 31. All developments and success factors are subject to a locally dependent understanding of reliable research study and external as well as internal stakeholder management. Geomathematics together with the scientific consortium concerned with the geothermal business will help to overcome the critical issues, i.e., minimizing risks, maximizing economic efficiency, and, finally, strong decorrelation of geothermal energy retrieval and local seismicity, as well as sociopolitical acceptance. It is remarkable that, in the interplay of risk and bankability (see Figs. 32 and 33), geomathematical components within pre-survey and exploration turn out to be both of tremendous significance and low cost. Unfortunately, the total success of exploration phases in the past is typically seen as critical, as illustrated in Fig. 32 (see Gehringer and Loksha 2012). Undoubtedly, the special feature is that potential and exploration techniques have to deal with regions of the Earth which are not at all accessible for direct measurement and observation. Characterizing the interrelation between project risk and investment costs invariably remains beyond a certain success level such that there is a canonical limitation in all geoscientific work. This is also the reason why geothermal projects inherently carry an exploration risk (i.e., resource risk), which is often considered the greatest challenge from an investor’s point of view. Meanwhile, however, computers and observational technology have resulted in an explosive propagation of mathematics in about every area of science. Indeed, mathematics acts as an interdisciplinary accelerator even in exploration. The “mathematization of sciences” allows for the handling of complicated models and structures even for large data sets. Nowadays, modeling, computation, and visualization yield reliable simulation of states and processes. In summary, the authors estimate the project risk in a more realistic way to be of the type illustrated in Fig. 33, at least when modern geomathematical tools are appropriately taken into account.

Fig. 32 Geothermal project risk, cumulative investment costs, and bankability (under past geomathematical capabilities, cf. Gehringer and Loksha 2012)

Fig. 33 Geothermal project risk, cumulative investment costs, and bankability (under today’s geomathematical capabilities)

Acknowledgments The work of the Geomathematics Group Kaiserslautern and G.E.O.S Ingenieurgesellschaft mbH, Freiberg, is supported by the “Verbundprojekt GEOFÜND: Charakterisierung und Weiterentwicklung integrativer Untersuchungsmethoden zur Quantifizierung des Fündigkeitsrisikos” (PI: W. Freeden), Federal Ministry for Economic Affairs and Energy (BMWi), Germany. M. Augustin has been supported by a fellowship of the German National Academic Foundation (Studienstiftung des deutschen Volkes). C. Gerhards has been supported by a fellowship within the Postdoc-program of the German Academic Exchange Service (DAAD). S. Eberle is thankful for the support by the Rhineland-Palatinate Center of Excellence for Climate Change Impacts. M. Ilyasov, S. Möhringer, H. Nutz, I. Ostermann, and A. Punzi are grateful for the support by the Rhineland-Palatinate excellence research center “Center for Mathematical and Computational Modeling (CM)2” and the University of Kaiserslautern within the scope of the project “EGMS” (PI: W. Freeden).

References
Addis MA (1997) The stress-depletion response of reservoirs. In: SPE annual technical conference and exhibition, San Antonio, 5–8 Oct 1997
Adler PM, Thovert JF (1999) Theory and applications in porous media. Fractures and fracture networks, vol 15. Kluwer Academic, Dordrecht
Aitken M (2010) Why we still don’t understand the social aspects of wind power: a critique of key assumptions within the literature. Energy Policy 38:1834–1841
Altmann J, Dorner A, Schoenball M, Müller BIR, Müller T (2008) Modellierung von porendruckinduzierten Änderungen des Spannungsfeldes in Reservoiren. In: Kongressband, Geothermiekongress 2008, Karlsruhe
Arbogast T (1989) Analysis of the simulation of single phase flow through a naturally fractured reservoir. SIAM J Numer Anal 26:12–29
Arbogast T, Douglas J, Hornung U (1990) Derivation of the double porosity model of single phase flow via homogenization theory. SIAM J Math Anal 21:823–836
Assteerawatt A (2008) Flow and transport modelling of fractured aquifers based on a geostatistical approach. PhD thesis, Institute of Hydraulic Engineering, University of Stuttgart
Augustin M (2012) On the role of poroelasticity for modeling of stress fields in geothermal reservoirs. Int J Geomath 3:67–93
Augustin M (2014) A method of fundamental solutions in poroelasticity to model the stress field in geothermal reservoirs. PhD thesis, Geomathematics Group, University of Kaiserslautern
Augustin M, Freeden W, Gerhards C, Möhringer S, Ostermann I (2012) Mathematische Methoden in der Geothermie. Math Semesterber 59:1–28
Auradou H (2009) Influence of wall roughness on the geometrical, mechanical and transport properties of single fractures. J Phys D Appl Phys 42:214015
Auriault J-L (1973) Contribution à l’étude de la consolidation des sols. PhD thesis, L’Université scientifique et médicale de Grenoble
Axelsson G, Gunnlaugsson E (2000) Long-term monitoring of high- and low-enthalpy fields under exploitation. In: World geothermal congress 2000, pre-congress course, Kokonoe
Baisch S, Carbon D, Dannwolf U, Delacou B, Delvaux M, Dunand F, Jung R, Koller M, Martin C, Sartori M, Secanell R, Vorös R (2009) Deep heat mining Basel – seismic risk analysis. SERIANEX. Technical report, study prepared for the Departement für Wirtschaft, Soziales und Umwelt des Kantons Basel-Stadt, Amt für Umwelt und Energie
Barenblatt GI, Zheltov IP, Kochina IN (1960) Basic concepts in the theory of seepage of homogeneous liquids in fissured rocks. PMM Sov Appl Math Mech 24:852–864
Barnett AH, Betcke T (2008) Stability and convergence of the method of fundamental solutions for Helmholtz problems on analytic domains. J Comput Phys 227:7003–7026

Bauer M, Freeden W, Jacobi H, Neu T (2014a) Energiewirtschaft 2014. Springer Spektrum, Wiesbaden
Bauer M, Freeden W, Jacobi H, Neu T (2014b) Handbuch Tiefe Geothermie. Springer Spektrum, Berlin/Heidelberg
Baysal E, Kosloff DD, Sherwood JWC (1983) Reverse time migration. Geophysics 48:1514–1524
Baysal E, Kosloff DD, Sherwood JWC (1984) A two-way nonreflecting wave equation. Geophysics 49:132–141
Bear J (1972) Dynamics of fluids in porous media. Elsevier, New York
Bear J, Tsang CF, de Marsily G (1993) Flow and contaminant transport in fractured rock. Academic, San Diego
Berkowitz B (1995) Analysis of fracture network connectivity using percolation theory. Math Geol 27:467–483
Berkowitz B (2002) Characterizing flow and transport in fractured geological media: a review. Adv Water Resour 25:852–864
Billette F, Brandsberg-Dahl S (2005) The 2004 BP velocity benchmark. In: 67th annual international meeting EAGE, Madrid. Expanded abstracts. EAGE
Biondi BL (2006) Three-dimensional seismic imaging. Society of Exploration Geophysicists, Tulsa
Biot MA (1935) Le problème de la consolidation des matières argileuses sous une charge. Ann Soc Sci Brux B55:110–113
Biot MA (1941) General theory of three-dimensional consolidation. J Appl Phys 12:151–164
Biot MA (1955) Theory of elasticity and consolidation for a porous anisotropic solid. J Appl Phys 26:182–185
Blakely RJ (1996) Potential theory in gravity & magnetic applications. Cambridge University Press, Cambridge
Blank L (1996) Numerical treatment of differential equations of fractional order. Technical report, numerical analysis report, University of Manchester
Bleistein N (1987) On the imaging of reflectors in the Earth. Geophysics 49:931–942
Bleistein N, Cohen JK, Stockwell JW (2000) Mathematics of multidimensional seismic imaging, migration, and inversion. Springer, New York
Bödvarsson G (1964) Physical characteristics of natural heat sources in Iceland. In: Proceedings of the United Nations conference on new sources of energy, vol 2. United Nations
Bollhöfer M, Grote MJ, Schenk O (2008) Algebraic multilevel preconditioner for the Helmholtz equation in heterogeneous media. SIAM J Sci Comput 31:3781–3805
Bonomi E, Pieroni E (1998) Energy-tuned absorbing boundary conditions. In: 4th SIAM international conference on mathematical and numerical aspects of wave propagation, Colorado School of Mines
Bording RP, Liner CL (1994) Theory of 2.5-D reverse time migration. In: Proceedings, 64th annual international meeting: society of exploration geophysicists, Los Angeles
Brouwer GK, Lokhorst A, Orlic B (2005) Geothermal heat and abandoned gas reservoirs in the Netherlands. In: Proceedings world geothermal congress 2005, Antalya
Browder FE (1962) Approximation by solutions of partial differential equations. Am J Math 84:134–160
Brown SR (1987) Fluid flow through rock joints: the effect of surface roughness. J Geophys Res 92:1337–1347

Buhmann MD (2003) Radial basis functions: theory and implementations. Cambridge monographs on applied and computational mathematics, vol 12. Cambridge University Press, Cambridge
Buske S (1994) Kirchhoff-Migration von Einzelschußdaten. Master thesis, Institut für Meteorologie und Geophysik der Johann Wolfgang Goethe Universität Frankfurt am Main
Chen M, Bai M, Roegiers JC (1999) Permeability tensors of anisotropic fracture networks. Math Geol 31:355–373
Chen Z, Huan G, Ma Y (2006) Computational methods for multiphase flows in porous media. Computational science & engineering, vol 2. SIAM, Philadelphia
Cheng H-P, Yeh G-T (1998) Development and demonstrative application of a 3-D numerical model of subsurface flow, heat transfer, and reactive chemical transport: 3DHYDROGEOCHEM. J Contam Hydrol 34:47–83
Claerbout J (2009) Basic Earth imaging. Stanford University, Stanford
Darcy HPG (1856) Les Fontaines Publiques de la Ville de Dijon. Victor Dalmont, Paris
de Boer R (2000) Theory of porous media – highlights in historical development and current state. Springer, Berlin
Deng F, McMechan GA (2007) 3-D true amplitude prestack depth migration. In: Proceedings, SEG annual meeting, San Antonio
Dershowitz WS, La Pointe PR, Doe TW (2004) Advances in discrete fracture network modeling. In: Proceedings, US EPA/NGWA fractured rock conference, Portland, pp 882–894
Diersch H-J (1985) Modellierung und numerische Simulation geohydrodynamischer Transportprozesse. PhD thesis, Akademie der Wissenschaften der DDR
Diersch H-J (2000) Numerische Modellierung ober- und unterirdischer Strömungs- und Transportprozesse. In: Martin H, Pohl M (eds) Technische Hydromechanik 4 – Hydraulische und numerische Modelle. Verlag Bauwesen, Berlin
Dietrich P, Helmig R, Sauter M, Hötzl H, Köngeter J, Teutsch G (2005) Flow and transport in fractured porous media. Springer, Berlin
Du X, Bancroft JC (2004) 2-D wave equation modeling and migration by a new finite difference scheme based on the Galerkin method. Technical report, CREWES
Durst P, Vuataz FD (2000) Fluid-rock interactions in hot dry rock reservoirs: a review of the HDR sites and detailed investigations of the Soultz-sous-Forets system. In: Proceedings of the world geothermal congress 2000, Kyushu-Tohoku
Eberle S (2014) Forest fire determination: theory and numerical aspects. PhD thesis, Geomathematics Group, University of Kaiserslautern
Eberle S, Freeden W, Matthes U (2014) Forest fire spreading. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
Eker E, Akin S (2006) Lattice Boltzmann simulation of fluid flow in synthetic fractures. Transp Porous Media 65:363–384
Ene HI, Poliševski D (1987) Thermal flow in porous media. D. Reidel, Dordrecht
Engelder T, Fischer MP (1994) Influence of poroelastic behaviour on the magnitude of minimum horizontal stress, Sh, in overpressured parts of sedimentary basins. Geology 22:949–952
Engl W, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer Academic, Dordrecht
Ernstson K, Alt W (2013) Gravity and geomagnetic methods in geothermal exploration: understanding and misunderstanding. World Min 65:115–122

Evans KF, Cornet FH, Hashida T, Hayashi K, Ito T, Matsuki K, Wallroth T (1999) Stress and rock mechanics issues of relevance to HDR/HWR engineered geothermal systems: review of developments during the past 15 years. Geothermics 28:455–474
Expertengruppe “Seismisches Risiko bei hydrothermaler Geothermie” (2010) Das seismische Ereignis bei Landau vom 15. August 2009, Abschlussbericht. Technical report, on behalf of the Ministerium für Umwelt, Landwirtschaft, Ernährung, Weinbau und Forsten des Landes Rheinland-Pfalz
Fehlinger T (2009) Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. PhD thesis, Geomathematics Group, University of Kaiserslautern
Fisher N, Lewis T, Embleton B (1993) Statistical analysis of spherical data. Cambridge University Press, Cambridge
Fomin S, Hashida T, Shimizu A, Matsuki K, Sakaguchi K (2003) Fractal concept in numerical simulation of hydraulic fracturing of the hot dry rock geothermal reservoir. Hydrol Process 17:2975–2989
Ford NJ, Simpson A (2001) The numerical solution of fractional differential equations: speed versus accuracy. Numer Algorithms 26:333–346
Foulger G, Natland J, Presnall D, Anderson D (2005) Plates, plumes, and paradigms. Geological Society of America, Boulder
Freeden C (2013) The role and the potential of communication by analysing the social acceptance of the German deep geothermal energy market. Master thesis, University of Plymouth
Freeden W (1980) On the approximation of external gravitational potential with closed systems of (trial) functions. Bull Geod 54:1–20
Freeden W (1981) On approximation by harmonic splines. Manuscr Geod 6:193–244
Freeden W (1983) Least squares approximation by linear combination of (multi-)poles. Report 344, Department of Geodetic Science and Surveying, The Ohio State University, Columbus
Freeden W (1999) Multiscale modelling of spaceborne geodata. Teubner, Stuttgart
Freeden W (2011) Metaharmonic lattice point theory. CRC/Taylor & Francis, Boca Raton
Freeden W (2014) Geomathematics: its role, its aim, and its potential. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
Freeden W, Blick C (2013) Signal decorrelation by means of multiscale methods. World Min 65:304–317
Freeden W, Gerhards C (2010) Poloidal and toroidal field modeling in terms of locally supported vector wavelets. Math Geosci 42:817–838
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton
Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel
Freeden W, Kersten H (1981) A constructive approximation theorem for the oblique derivative problem in potential theory. Math Methods Appl Sci 3:104–114
Freeden W, Mayer C (2003) Wavelets generated by layer potentials. Appl Comput Harm Anal 14:195–237
Freeden W, Michel V (2004) Multiscale potential theory with applications to geoscience. Birkhäuser, Boston
Freeden W, Nutz H (2011) Satellite gravity gradiometry as tensorial inverse problem. Int J Geomath 2:123–146

Freeden W, Nutz H (2014) Mathematische Methoden. In: Bauer M, Freeden W, Jacobi H, Neu T (eds) Handbuch Tiefe Geothermie. Springer, Heidelberg, pp 125–222
Freeden W, Reuter R (1990) A constructive method for solving the displacement boundary-value problem of elastostatics by use of global basis systems. Math Methods Appl Sci 12:105–128
Freeden W, Schreiner M (2006) Local multiscale modelling of geoid undulations from deflections of the vertical. J Geodesy 79:641–651
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Berlin
Freeden W, Wolf K (2009) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford Science Publications/Clarendon, Oxford
Freeden W, Mayer C, Schreiner M (2003) Tree algorithms in wavelet approximations by Helmholtz potential operators. Numer Funct Anal Optim 24:747–782
Freeden W, Fehlinger T, Klug M, Mathar D, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geodesy 83:1171–1191
Gehringer M, Loksha V (2012) Handbook on planning and financing geothermal power generation. ESMAP (Energy Sector Management Assistance Program), main findings and recommendations, The International Bank for Reconstruction and Development, Washington
Georgsson LS, Friedleifsson IB (2009) Geothermal energy in the world from energy perspective. In: Short course IV on exploration for geothermal resources, Lake Naivasha, pp 1–22
Geothermal Energy Association (2011) Annual US geothermal power production and development report. Technical report
Gerhards C (2011) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. PhD thesis, Geomathematics Group, University of Kaiserslautern
Gerhards C (2012) Locally supported wavelets for the separation of spherical vector fields with respect to their sources. Int J Wavel Multires Inf Proc 10:1250034
Gerhards C (2014) A multiscale power spectrum for the analysis of the lithospheric magnetic field. Int J Geomath 5:63–79
Ghassemi A (2003) A thermoelastic hydraulic fracture design tool for geothermal reservoir development. Technical report, Department of Geology & Geological Engineering, University of North Dakota
Ghassemi A, Tarasovs S (2004) Three-dimensional modeling of injection induced thermal stresses with an example from Coso. In: Proceedings, 29th workshop on geothermal reservoir engineering, Stanford University, Stanford
Ghassemi A, Zhang Q (2004) Poro-thermoelastic mechanisms in wellbore stability and reservoir stimulation. In: Proceedings, 29th workshop on geothermal reservoir engineering, Stanford University, Stanford
Ghassemi A, Tarasovs S, Cheng AHD (2003) An integral equation solution for three-dimensional heat extraction from planar fracture in hot dry rock. Int J Numer Anal Methods Geomech 27:989–1004
Golberg MA, Chen CS (1998) The method of fundamental solutions for potential, Helmholtz and diffusion problems. In: Golberg MA (ed) Boundary integral methods – numerical and mathematical aspects. Computational mechanics publications. WIT, Southampton, pp 103–176

Gorenflo R, Mainardi F (1997) Fractional calculus: integral and differential equations of fractional order. In: Carpinteri A, Mainardi F (eds) Fractals and fractional calculus in continuum mechanics. Springer, Wien, pp 223–276
Hammons TJ (2011) Geothermal power generation: global perspectives, technology, direct uses, plants, drilling and sustainability worldwide. In: Electricity infrastructures in the global marketplace. InTech, pp 195–234
Haney MM, Bartel LC, Aldridge DF, Symons NP (2005) Insight into the output of reverse-time migration: what do the amplitudes mean? In: Proceedings, SEG annual meeting, Houston
Helmig R, Niessner J, Flemisch B, Wolff M, Fritz J (2014) Efficient modeling of flow and transport in porous media using multi-physics and multi-scale approaches. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
Heuer N, Küpper T, Windelberg D (1991) Mathematical model of a hot dry rock system. Geophys J Int 105:659–664
Hicks TW, Pine RJ, Willis-Richards J, Xu S, Jupe AJ, Rodrigues NEV (1996) A hydro-thermomechanical numerical model for HDR geothermal reservoir evaluation. Int J Rock Mech Min Sci 33:499–511
Hillis RR (2000) Pore pressure/stress coupling and its implications for seismicity. Explor Geophys 31:448–454
Hillis RR (2001) Coupled changes in pore pressure and stress in oil fields and sedimentary basins. Pet Geosci 7:419–425
Hillis RR (2003) Pore pressure/stress coupling and its implications for rock failure. In: Vanrensbergen P, Hillis RR, Maltman AJ, Morley CK (eds) Subsurface sediment mobilization. Geological Society of London, London, pp 359–368
Ilyasov M (2011) A tree algorithm for Helmholtz potential wavelets on non-smooth surfaces: theoretical background and application to seismic data postprocessing. PhD thesis, Geomathematics Group, University of Kaiserslautern
International Energy Agency (2010) Annual report. Technical report
Itasca Consulting Group Inc (2000) UDEC user’s guide. Minnesota
Jackson JD (1998) Classical electrodynamics. Wiley, New York
Jacobs F, Meyer H (1992) Geophysik – Signale aus der Erde. Teubner, Stuttgart
Jaeger JC, Cook NGW, Zimmerman RW (2007) Fundamentals of rock mechanics. Blackwell, Malden
Jia X, Hu T (2006) Element-free precise integration method and its application in seismic modelling and imaging. Geophys J Int 166:349–372
Jing L, Hudson JA (2002) Numerical methods in rock mechanics. Int J Rock Mech Min Sci 39:409–427
Jing Z, Willis-Richards J, Watanabe K, Hashida T (2000) A three-dimensional stochastic rock mechanics model of engineered geothermal systems in fractured crystalline rock. J Geophys Res 105:23663–23679
Jing Z, Watanabe K, Willis-Richards J, Hashida T (2002) A 3-D water/rock chemical interaction model for prediction of HDR/HWR geothermal reservoir performance. Geothermics 31:1–28
Johansson BT, Lesnic D (2008) A method of fundamental solutions for transient heat conduction. Eng Anal Bound Elem 32:697–703
Johansson BT, Lesnic D, Reeve T (2011) A method of fundamental solutions for two-dimensional heat conduction. Int J Comput Math 88:1697–1713

John V, Schmeyer E (2008) Finite element methods for time-dependent convection-diffusion-reaction equations with small diffusion. Comput Methods Appl Mech Eng 198:475–494
John V, Kaya S, Layton W (2006) A two-level variational multiscale method for convection-dominated convection-diffusion equations. Comput Methods Appl Mech Eng 195:4594–4603
Jung R (2007) Stand und Aussichten der Tiefengeothermie in Deutschland. Erdöl, Erdgas, Kohle 123:1–7
Katsurada M (1989) A mathematical study of the charge simulation method II. J Fac Sci Univ Tokyo Sect IA Math 36:135–162
Katsurada M, Okamoto H (1996) The collocation points of the fundamental solution method for the potential problem. Comput Math Appl 31:123–137
Kazemi H (1969) Pressure transient analysis of naturally fractured reservoirs with uniform fracture distribution. Soc Petrol Eng J 246:451–461
Kazemi H, Merril LS, Porterfield KL, Zeman PR (1976) Numerical simulation of water-oil flow in naturally fractured reservoirs. In: Proceedings, SPE-AIME 4th symposium on numerical simulation of reservoir performance, Los Angeles
Kim I, Lindquist WB, Durham WB (2003) Fracture flow simulation using a finite-difference lattice Boltzmann method. Phys Rev E 67:046708
Kimura S, Masuda Y, Hayashi K (1992) Efficient numerical method based on double porosity model to analyze heat and fluid flows in fractured rock formations. JSME Int J Ser 2 35:395–399
Kühn M (2009) Modelling feed-back of chemical reactions on flow fields in hydrothermal systems. Surv Geophys 30:233–251
Kühn M, Stöfen H (2005) A reactive flow model of the geothermal reservoir Waiwera, New Zealand. Hydrogeol J 13:606–626
Kupradze VD (1964) A method for the approximate solution of limiting problems in mathematical physics. USSR Comput Math Math Phys 4:199–205
Lai M, Krempl E, Ruben D (2010) Introduction to continuum mechanics. Butterworth-Heinemann, Burlington
Landau LD, Pitaevskii LP, Lifshitz EM, Kosevich AM (1986) Theory of elasticity. Theoretical physics, vol 7, 3rd edn. Butterworth-Heinemann, Oxford
Lang U (1995) Simulation regionaler Strömungs- und Transportvorgänge in Karstaquifern mit Hilfe des Doppelkontinuum-Ansatzes: Methodenentwicklung und Parameteridentifikation. PhD thesis, University of Stuttgart
Lang U, Helmig R (1995) Numerical modeling in fractured media – identification of measured field data. In: Herbert M, Kovar K (eds) Groundwater quality: remediation and protection. IAHS and University Karlova, Prague, pp 203–212
Lee J, Choi SU, Cho W (1999) A comparative study of dual-porosity model and discrete fracture network model. KSCE J Civ Eng 3:171–180
Li X (2008a) Convergence of the method of fundamental solutions for Poisson’s equation on the unit sphere. Adv Comput Math 28:269–282
Li X (2008b) Rate of convergence of the method of fundamental solutions and hyperinterpolation for modified Helmholtz equations on the unit ball. Adv Comput Math 29:393–413
Lomize GM (1951) Seepage in fissured rocks. State Press, Moscow
Long J, Remer J, Wilson C, Witherspoon P (1982) Porous media equivalents for networks of discontinuous fractures. Water Resour Res 18:645–658
Luchko Y (2009) Maximum principle for the generalized time-fractional diffusion equation. J Math Anal Appl 351:218–223

Luchko Y (2010) Some uniqueness and existence results for the initial-boundary-value problems for the generalized time-fractional diffusion equation. Comput Math Appl 59:1766–1772
Luchko Y (2014) Fractional diffusion and wave propagation. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
Luchko Y, Punzi A (2011) Modeling anomalous heat transport in geothermal reservoirs via fractional diffusion equations. Int J Geomath 1:257–276
Martin GS, Marfurt KJ, Larsen S (2002) Marmousi-2: an updated model for the investigation of AVO in structurally complex areas. In: Proceedings, SEG annual meeting, Salt Lake City
Maryška J, Severýn O, Vohralík M (2004) Numerical simulation of fracture flow in mixed-hybrid FEM stochastic discrete fracture network model. Comput Geosci 8:217–234
Masahi M, King P, Nurafza P (2007) Fast estimation of connectivity in fractured reservoirs using percolation theory. SPE J 12:167–178
Mayer C (2007) A wavelet approach to the Stokes problem. Habilitation thesis, Geomathematics Group, University of Kaiserslautern
Mayer C, Freeden W (2014) Stokes problem, layer potentials and regularizations, multiscale applications. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
McLean W, Mustapha K (2009) Convergence analysis of a discontinuous Galerkin method for a sub-diffusion equation. Numer Algorithms 52:69–88
Menke W (1984) Geophysical data analysis: discrete inverse theory. Academic, Orlando
Michel V (2002) A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the Earth’s interior. Habilitation thesis, Geomathematics Group, University of Kaiserslautern
Michel V, Fokas AS (2008) A unified approach to various techniques for the non-uniqueness of the inverse gravimetric problem and wavelet based methods. Inverse Probl 24:045019
Min KB, Jing L, Stephansson O (2004) Determining the equivalent permeability tensor for fractured rock masses using a stochastic REV approach: method and application to the field data from Sellafield, UK. Hydrogeol J 12:497–510
MIT (Massachusetts Institute of Technology) (2006) The future of geothermal energy. http://mitei.mit.edu/publications/reports-studies/future-geothermal-energy
Mo H, Bai M, Lin D, Roegiers JC (1998) Study of flow and transport in fracture network using percolation theory. Appl Math Model 22:277–291
Moeck I, Kwiatek G, Zimmermann G (2009) The in-situ stress field as a key issue for geothermal field development – a case study from the NE German Basin. In: Proceedings, 71st EAGE conference & exhibition, Amsterdam
Möhringer S (2014) Decorrelation of gravimetric data. PhD thesis, Geomathematics Group, University of Kaiserslautern
Mongillo M (2011) International efforts to promote global sustainable geothermal development. In: GIA annual report executive summary, Singapore, pp 1–19
Morgan WJ (1971) Convective plumes in the lower mantle. Nature 230:42–43
Müller C (1998) Analysis of spherical symmetries in euclidean spaces. Applied mathematical sciences, vol 129. Springer, Berlin
Müller C, Kersten H (1980) Zwei Klassen vollständiger Funktionensysteme zur Behandlung der Randwertaufgaben der Schwingungsgleichung $\Delta U + k^2 U = 0$. Math Methods Appl Sci 2:48–67
Nakao S, Ishido T (1998) Pressure-transient behavior during cold water injection into geothermal wells. Geothermics 27:401–413

Neuman S (2005) Trends, prospects and challenges in quantifying flow and transport through fractured rocks. Hydrogeol J 13:124–147
Neuman S, Depner J (1988) Use of variable-scale pressure test data to estimate the log hydraulic conductivity covariance and dispersivity of fractured granites near Oracle, Arizona. J Hydrol 102:475–501
Nolet G (2008) Seismic tomography: imaging the interior of the Earth and Sun. Cambridge University Press, Cambridge
Oden M, Niemi A (2006) From well-test data to input to stochastic continuum models: effect of the variable support scale of the hydraulic data. Hydrogeol J 14:1409–1422
Ödner H (1998) One-dimensional transient flow in a finite fractured aquifer system. Hydrol Sci J 43:243–265
Ostermann I (2011a) Modeling heat transport in deep geothermal systems by radial basis functions. PhD thesis, Geomathematics Group, University of Kaiserslautern
Ostermann I (2011b) Three-dimensional modeling of heat transport in deep hydrothermal reservoirs. Int J Geomath 2:37–68
O’Sullivan MJ, Pruess K, Lippmann MJ (2001) State of the art of geothermal reservoir simulation. Geothermics 30:395–429
Ouenes A (2000) Practical application of fuzzy logic and neural networks to fractured reservoir characterization. Comput Geosci 26:953–962
Peters RR, Klavetter EA (1988) A continuum model for water movement in an unsaturated fractured rock mass. Water Resour Res 24:416–430
Phillips PJ (2005) Finite element method in linear poroelasticity: theoretical and computational results. PhD thesis, University of Texas, Austin
Phillips PJ, Wheeler MF (2007) A coupling of mixed and continuous Galerkin finite element methods for poroelasticity I: the continuous in time case. Comput Geosci 11:131–144
Phillips WS, Rutledge JT, House LS, Fehler MC (2002) Induced microearthquake patterns in hydrocarbon and geothermal reservoirs: six case studies. Pure Appl Geophys 159:345–369
Podvin P, Lecomte I (1991) Finite difference computation of traveltimes in very contrasted velocity models: a massively parallel approach and its associated tools. Geophys J Int 105:271–284
Popov M (1982) A new method of computation of wave fields using Gaussian beams. Wave Motion 4:85–97
Pruess K (1990) Modelling of geothermal reservoirs: fundamental processes, computer simulation and field applications. Geothermics 19:3–15
Pruess K, Narasimhan TN (1985) A practical method for modeling fluid and heat flow in fractured porous media. Soc Pet Eng J 25:14–26
Pruess K, Wang JSY, Tsang YW (1986) Effective continuum approximation for modeling fluid and heat flow in fractured porous tuff. Technical report, Sandia National Laboratories Report SAND86-7000, Albuquerque
Reichenberger V, Jakobs H, Bastian P, Helmig R (2006) A mixed-dimensional finite volume method for two-phase flow in fractured porous media. Adv Water Resour 29:1020–1036
Renaut R, Fröhlich J (1996) A pseudospectral Chebychev method for 2D wave equation with domain stretching and absorbing boundary conditions. J Comput Phys 124:324–336
Renner J, Steeb H (2014) Modeling of fluid transport in geothermal research. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York
Rice JR, Cleary MP (1976) Some basic stress diffusion solutions for fluid-saturated elastic porous media with compressible constituents. Rev Geophys Space Phys 14:227–241

Ritter JRR, Christensen UR (2007) Mantle plumes: a multidisciplinary approach. Springer, Berlin
Runge C (1885) Zur Theorie der eindeutigen analytischen Funktionen. Acta Math 6:229–234
Rutqvist J, Stephansson O (2003) The role of hydromechanical coupling in fractured rock engineering. Hydrogeol J 11:7–40
Saemundsson K (2009) Geothermal systems in global perspective. In: Short course IV on exploration for geothermal resources, Lake Naivasha
Sahimi M (1995) Flow and transport in porous media and fractured rock: from classical methods to modern approaches. VCH, Weinheim
Sanyal SK (2005) Classification of geothermal systems – a possible scheme. In: Proceedings, 30th workshop on geothermal reservoir engineering, Stanford University, Stanford, SGP-TR-176, pp 85–92
Sanyal SK, Butler SJ, Swenson D, Hardeman B (2000) Review of the state-of-the-art of numerical simulation of enhanced geothermal systems. In: Proceedings, world geothermal congress, Kyushu-Tohoku
Schanz M (2001) Application of 3D time domain boundary element formulation to wave propagation in poroelastic solids. Eng Anal Bound Elem 25:363–376
Schubert G, Turcotte DL, Olson P (2001) Mantle convection in the Earth and planets. Cambridge University Press, Cambridge
Schulz R (2009) Aufbau eines geothermischen Informationssystems für Deutschland. Technical report, Leibniz-Institut für Angewandte Geophysik, Hannover
Semtchenok NM, Popov MM, Verdel AR (2009) Gaussian beam tomography. In: Extended abstracts, 71st EAGE conference & exhibition, Amsterdam
Showalter RE (2000) Diffusion in poro-elastic media. J Math Anal Appl 251:310–340
Smyrlis Y-S (2009a) Applicability and applications of the method of fundamental solutions. Math Comput 78:1399–1434
Smyrlis Y-S (2009b) Mathematical foundation of the MFS for certain elliptic systems in linear elasticity. Numer Math 112:319–340
Smyrlis Y-S, Karageorghis A (2009) Efficient implementation of the MFS: the three scenarios. J Comput Appl Math 227:83–92
Snieder R (2002) The perturbation method in elastic wave scattering and inverse scattering in pure and applied science. In: General theory of elastic wave. Academic, San Diego, pp 528–542
Snow DT (1965) A parallel plate model of fractured permeable media. PhD thesis, University of California, Berkeley
Stothoff S, Or D (2000) A discrete-fracture boundary integral approach to simulating coupled energy and moisture transport in a fractured porous medium. In: Faybishenko B, Witherspoon PA, Benson SM (eds) Dynamics of fluids in fractured rocks, concepts and recent advances. AGU geophysical monograph, vol 122. American Geophysical Union, Washington, DC, pp 269–279
Sudicky EA, McLaren RG (1992) The Laplace transform Galerkin technique for large-scale simulation of mass transport in discretely fractured porous formations. Water Resour Res 28:499–514
Symes WW (2003) Kinematics of reverse time S-G migration. Technical report, Rice University
Symes WW (2007) Reverse time migration with optimal checkpointing. Geophysics 72:SM213–SM221
Takenaka H, Wang Y, Furumura T (1999) An efficient approach of the pseudospectral method for modelling of geometrical symmetric seismic wavefields. Earth Planets Space 51:73–79


Tang DH, Frind EO, Sudicky EA (1981) Contaminant transport in fractured porous media: analytical solution for a single fracture. Water Resour Res 17:555–564 Tran NH, Rahman SS (2006) Modelling discrete fracture networks using neuro-fractal-stochastic simulation. J Eng Appl Sci 1:154–160 Travis BJ (1984) TRACR3D: a model of flow and transport in porous/fractured media. Technical report, Los Alamos National Laboratory LA-9667-MS, Los Alamos Trefftz E (1926) Ein Gegenstück zum Ritzschen Verfahren. In: Proceedings of the 2nd international congress for applied mechanics, Zürich Tsang Y, Tsang C (1987) Chanel flow model through fractured media. Water Resour Res 23:467– 479 Tsang Y, Tsang C (1989) Flow chaneling in a single fracture as a two-dimensional strongly heterogeneous permeable medium. Water Resour Res 25:2076–2080 Tsang Y, Tsang C, Hale F, Dverstorp B (1996) Tracer transport in a stochastic continuum model of fractured media. Water Resour Res 32:3077–3092 Turcotte DL, Schubert G (2001) Geodynamics. Cambridge University Press, Cambridge Vidale J (1988) Finite-difference calculation of travel times. Bull Seismol Soc Am 78:2062–2076 Walsh J (1929) The approximation of harmonic functions by harmonic polynomials and by harmonic rational functions. Bull Am Math Soc 35:499–544 Warren JE, Root PJ (1963) The behaviour of naturally fractured reservoirs. Soc Pet Eng J 228:245– 255 Watanabe K, Takahashi T (1995) Fractal geometry characterization of geothermal reservoir fracture networks. J Geophys Res 100:521–528 Welding L (2007) GLITNIR geothermal research. In: United States geothermal energy market report, pp 1–37 Wendland H (2005) Scattered data approximation. Cambridge monographs on applied and computational mathematics, vol 17. Cambridge University Press, Cambridge Wilson JT (1963) A possible origin of the Hawaiian island. Can J Phys 41:863–868 Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. 
PhD thesis, Geomathematics Group, University of Kaiserslautern Wu YS (2000) On the effective continuum method for modeling multiphase flow, multicomponent transport and heat transfer in fractured rock. In: Faybishenko B, Witherspoon PA, Benson SM (eds) Dynamics of fluids in fractured rocks, concepts and recent advances. American Geophysical Union, Washington, DC, pp 299–312 Wu YS, Pruess K (2005) A physically based numerical approach for modeling fracture-matrix interaction in fractured reservoirs. In: Proceedings, world geothermal congress 2005, Antalya Wu YS, Qin G (2009) A generalized numerical approach for modeling multiphase flow and transport in fractured porous media. Commun Comput Phys 6:85–108 Wu X, Pope GA, Shook GM, Srinivasan S (2005) A semi-analytical model to calculate energy production in single fracture geothermal reservoirs. Geotherm Resour Counc Trans 29:665–669 Wu RS, Xie XB, Wu XY (2006) One-way and one-return approximations (de Wolf approximation) for fast elastic wave modeling in complex media. Adv Geophys 48:265–322 Xie XB, Wu RS (2006) A depth migration method based on the full-wave reverse time calculation and local one-way propagation. In: Proceedings, SEG annual meeting, New Orleans Yilmaz O (1987) Seismic data analysis: processing, inversion, and interpretation of seismic data. Society of Exploration Geophysicists, Tulsa


Yin S (2008) Geomechanics-reservoir modeling by displacement discontinuity-finite element method. PhD thesis, University of Waterloo, Ontario Zhao C, Hobbs BE, Baxter K, Mühlhaus HB, Ord A (1999) A numerical study of pore-fluid, thermal and mass flow in fluid-saturated porous rock basins. Eng Comput 16:202–214 Zhou XX, Ghassemi A (2009) Three-dimensional poroelastic simulation of hydraulic and natural fractures using the displacement discontinuity method. In: Proceedings of the 34th workshop on geothermal reservoir engineering, Stanford Zubkov VV, Koshelev VF, Lin’kov AM (2007) Numerical modeling of hydraulic fracture initiation and development. J Min Sci 43:40–56 Zyvoloski G (1983) Finite element methods for geothermal reservoir simulation. Int J Numer Anal Methods Geomech 7:75–86


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_23-3 © Springer-Verlag Berlin Heidelberg 2015

Phosphorus Cycles in Lakes and Rivers: Modeling, Analysis, and Simulation
Andreas Meister (a) and Joachim Benz (b)
(a) Department of Mathematics, Work-Group of Analysis and Applied Mathematics, University of Kassel, Kassel, Germany
(b) Faculty of Organic Agricultural Sciences, Work-Group Data-Processing and Computer Facilities, University of Kassel, Witzenhausen, Germany

Abstract From spring to summer, a large number of lakes are covered with thick layers of algae, which poses a serious problem for the fish stock and other important organisms and, ultimately, for the complete biological diversity of species. Consequently, the investigation of the cause-and-effect chain represents an important task concerning the protection of the natural environment. Often such situations are driven by an oversupply of nutrients. As phosphorus is the limiting nutrient for most algae growth processes, advanced knowledge of the phosphorus cycle is essential. In this context the chapter gives a survey of our recent progress in modeling and numerical simulation of plankton spring bloom situations caused by eutrophication via phosphorus accumulation. Due to the underlying processes we employ the shallow water equations as the fluid dynamic part, coupled with additional equations describing the biogeochemical processes of interest within both the water layer and the sediment. Depending on the model under consideration, one is faced with significant requirements such as positivity and conservativity in the context of stiff source terms. The numerical method used to simulate the fluid dynamic part and the evolution of the phosphorus and the different biomass concentrations is based on a second-order finite volume scheme extended by a specific formulation of the modified Patankar approach, so that the scheme is unconditionally positivity preserving as well as conservative in the presence of stiff transition terms. Besides a mathematical analysis, several test cases are shown which confirm both the theoretical results and the applicability of the complete numerical scheme. In particular, the flow field and phosphorus dynamics for the West Lake in Hangzhou, China, are computed using the previously stated mass and positivity preserving finite volume scheme.

1 Introduction The main objective of this chapter is to present a stable and accurate numerical method for a wide range of applications in the field of complex ecosystem models. We focus on the process of eutrophication, which represents a serious problem because it gives rise to excessive algae blooms in eutrophic freshwater ecosystems. Thus, it is crucial in ecological science to understand and predict the interrelations of the underlying dynamics. Even taking into account the systematic difficulties of modeling and simulation (Poethke 1994), elaborate modeling provides a very useful tool for deeper understanding and long-range prediction (Sagehashi et al. 2000).



In the 1970s, computer-based modeling and simulation in ecological science, with a special focus on lake eutrophication, started to achieve good results. Deterministic models were proposed in Park et al. (1974), Jørgensen (1975), and Straškraba and Gnauk (1985). Since then, owing to both deeper knowledge of the underlying principles and growing computational power, the examined models have become more sophisticated. These evolved models pose severe demands on the numerical schemes used. Models including matter cycles, where the material, for example, atoms, can only change its configuration and thus is neither created nor destroyed, require the algorithm to maintain the mass and the number of atoms in the system, respectively. Another computationally nontrivial but obvious demand is to ensure positivity of all examined material constituents. Consequently, in recent years, one can observe increasing interest in the design of positivity preserving schemes. A survey of positive advection transport methods can be found in Smolarkiewicz (2006). With respect to shallow water flows one is often faced with the transition between dry and wet areas, such that the construction of positive schemes is widespread for this kind of application, see Berzins (2001), Audusse and Bristeau (2005), Chertock and Kurganov (2008), Ricchiuto and Bollermann (2009) and the references therein. Concerning stiff reaction terms, a modification of common Runge-Kutta schemes which ensures positivity was originally suggested by Patankar in the context of turbulent mixing (Patankar 1980). Unfortunately, the schemes obtained were not able to retain mass conservation, even though mass conservativity is a feature of the original Runge-Kutta schemes. The scheme was improved in Burchard et al. (2003), retaining positivity while also regaining the conservation of mass.
Based on this idea, a more complex form of conservation was achieved in a series of publications (Bruggeman et al. 2007; Broekhuizen et al. 2008). First applications of the modified Patankar approach in combination with classical discretization schemes for the solution of advection-diffusion-reaction equations are presented in Burchard et al. (2005, 2006) with a specific focus on marine ecosystem dynamics in the North Sea and the Baltic Sea. However, while the schemes of Bruggeman et al. (2007) and Broekhuizen et al. (2008) are more applicable to flexible definitions of conservation, they lost crucial numerical stability. A detailed overview of those schemes can be found in Zardo (2005). Due to its superior numerical stability, the modified Patankar ansatz is used here. The ensured conservation of mass is also appropriate for the model under consideration. One major and essential aspect for growth is nutrient supply. To predict the growth of algae reliably, one has to consider the limiting factor (Schwoerbel and Brendelberger 2005). The basis for the implemented ecological model is presented by Hongping and Jianyi (2002). As the model under consideration describes algae growth in the West Lake, China, with a ratio of more than 1:14 for phosphorus to nitrogen, the limiting nutrient is phosphorus (Lampert and Sommer 1999). Therefore, it is sufficient for modeling purposes to restrict the considered nutrients to phosphorus. The model describes four different groups of algae species and zooplankton, with the particularity of mapping specific phosphorus contents for each of these organisms. Furthermore, phosphorus is represented in solute and organic form in the water body as well as in the sediment. Besides the biological and chemical processes, the flow field of the lake itself is of major importance. The current leads to an unbalanced distribution of matter inside a lake and thus creates significant differences in phosphorus concentrations. The two-dimensional shallow water equations form an adequate formulation of shallow water flow.
By means of sophisticated high resolution schemes (Toro 2001; Vater 2004), one is able to simulate complicated phenomena like, for example, a dam break problem or others, see Stoker (1957) and Vázquez-Cendón (2007). The two-dimensional shallow water equations thus serve as the basic formulation for the flow phenomena of interest. The phosphorus cycle model together with the two-dimensional shallow water equations forms the complete set of convection-diffusion-reaction equations, which is a system of hyperbolic-parabolic partial differential equations.

2 Mathematical Modeling Modeling ecological processes in lakes, urban rivers, or channels poses two main demands. First, a reliable mathematical model has to be formulated which includes the fundamental properties and effects of eutrophication, including local biochemical phenomena as well as transport processes due to convection and diffusion. Thus, besides the dynamics of biomass and phosphorus concentrations, one has to take account of the flow situation in the water body. Consequently, the equations governing the process of interest are given by a combination of the Saint-Venant system for the flow part and an advection-diffusion-reaction system for the biochemical part. Neglecting the influence of rain and evaporation, the underlying fluid dynamic part represented by the shallow water equations can be written, in the case of a constant bottom topography, in the form of the conservation law

$$\partial_t \mathbf{u}_s + \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^c_{m,s}(\mathbf{u}_s) = \mathbf{0}.$$

Thereby, $\mathbf{u}_s = (H, \Phi v_1, \Phi v_2)^T$ and $\mathbf{f}^c_{m,s}(\mathbf{u}_s) = \left(H v_m,\ \Phi v_m v_1 + \tfrac{1}{2}\delta_{1m}\Phi^2,\ \Phi v_m v_2 + \tfrac{1}{2}\delta_{2m}\Phi^2\right)^T$ are referred to as the vector of conserved quantities and the flux function, in which $H$ and $\mathbf{v} = (v_1, v_2)^T$ denote the water height and the velocity, respectively. Furthermore, $\Phi = gH$ represents the geopotential with the gravitational acceleration $g$, and the Kronecker delta is denoted by $\delta_{im}$. The ecological part consists of processes which describe the behavior of biomass and organic phosphorus of four different groups of algae species (BA, PA) and zooplankton (BZ, PZ). To describe the complete phosphorus cycle, solute phosphorus in the water body (PS), organic phosphorus in the detritus (PD), and organic and inorganic phosphorus in the sediment ($PE_O$, $PE_I$) are considered, see Fig. 1. This model is based on the West Lake Model, which is published in Hongping and Jianyi (2002). The dynamics of biomass demands positivity, and the cycle of phosphorus demands positivity as well as conservativity. The interactions between these ecological system elements are nonlinear; for example, growth is formulated as Michaelis-Menten kinetics. In addition, the particulate components (biomass, organic phosphorus, and detritus) in the water body are influenced by advective transport. The solute phosphorus in the water body is affected by advective and diffusive transport. For the elements inside the sediment no horizontal transport is considered. The vertical exchange between phosphorus in the sediment and in the water body is driven by diffusion. Consequently, the governing equations for the phosphorus cycle and the biomass dynamics (Fig. 1) represent a system of advection-diffusion-reaction equations of the form


Fig. 1 Phosphorus and biomass dynamic (flow diagram: biomass dynamics of BA and BZ, positive but not mass conserving; phosphorus dynamics of PA, PZ, PD, and PS in the water body and of PEI and PEO in the sediment, positive and mass conserving; the compartments are linked by growth, grazing, assimilation, respiration, mortality, sinking, uptake, sedimentation, mineralization, and exchange)

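$$\partial_t \mathbf{u}_p + \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^c_{m,p}(\mathbf{u}_s, \mathbf{u}_p) = \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^\nu_{m,p}(\mathbf{u}_p) + \mathbf{q}_p(\mathbf{u}_s, \mathbf{u}_p), \tag{1}$$

where the components read $\mathbf{u}_p = (BA, BZ, PA, PZ, PD, PS, PE_I, PE_O)^T$ and the convective and viscous fluxes are

$$\mathbf{f}^c_{m,p}(\mathbf{u}_s, \mathbf{u}_p) = v_m\,(BA, BZ, PA, PZ, PD, PS, 0, 0)^T$$

and

$$\mathbf{f}^\nu_{m,p}(\mathbf{u}_p) = (0, \dots, 0,\ PS\,\partial_{x_m} PS,\ PE_I\,\partial_{x_m} PE_I,\ 0)^T,$$

respectively. Note that BA and PA each denote a vector of four constituents. With respect to the spatial domain $D \subset \mathbb{R}^2$, which is assumed to be polygonally bounded, i.e., $\partial D = \bigcup_{k=1}^{n} \partial D_k$, the complete system consisting of the fluid and biochemical part can be written as

$$\partial_t \mathbf{U} + \sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^c_m(\mathbf{U}) = \sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^\nu_m(\mathbf{U}) + \mathbf{Q}(\mathbf{U}) \quad \text{in } D \times \mathbb{R}^+, \tag{2}$$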


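where $\mathbf{U} = (\mathbf{u}_s, \mathbf{u}_p)^T$, $\mathbf{F}^c_m = (\mathbf{f}^c_{m,s}, \mathbf{f}^c_{m,p})^T$, $\mathbf{F}^\nu_m = (\mathbf{0}, \mathbf{f}^\nu_{m,p})^T$, and $\mathbf{Q} = (\mathbf{0}, \mathbf{q}_p)^T$. A detailed description of the biochemical part will be presented in Sect. 3.3.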
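As an illustration of the fluid dynamic part, the following minimal sketch evaluates the convective flux $\mathbf{f}^c_{m,s}(\mathbf{u}_s)$ for the conserved vector $\mathbf{u}_s = (H, \Phi v_1, \Phi v_2)^T$ as defined in this section. The function name `shallow_water_flux`, the numerical value of $g$, and the restriction to $H > 0$ are assumptions made for this example only.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def shallow_water_flux(u_s, m, g=G):
    """Convective flux f^c_{m,s}(u_s) in direction m (1 or 2).

    u_s = (H, Phi*v1, Phi*v2) with the geopotential Phi = g*H.
    Assumes H > 0, since the velocity is recovered by division.
    """
    H, q1, q2 = u_s
    phi = g * H                       # geopotential
    v = np.array([q1, q2]) / phi      # velocity components v1, v2
    vm = v[m - 1]                     # velocity in flux direction m
    d1 = 1.0 if m == 1 else 0.0       # Kronecker delta_{1m}
    d2 = 1.0 if m == 2 else 0.0       # Kronecker delta_{2m}
    return np.array([
        H * vm,
        phi * vm * v[0] + 0.5 * d1 * phi**2,
        phi * vm * v[1] + 0.5 * d2 * phi**2,
    ])

# Water at rest: only the pressure-like term 0.5*Phi^2 remains.
rest = np.array([2.0, 0.0, 0.0])
f1 = shallow_water_flux(rest, 1)
f2 = shallow_water_flux(rest, 2)
```

For water at rest the mass flux vanishes and the term $\tfrac{1}{2}\Phi^2$ appears only in the momentum component aligned with the flux direction, as prescribed by the Kronecker delta.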

3 Numerical Method The discretization of the mathematical model is based on a conventional finite volume scheme. To satisfy the specific requirements of positivity with respect to the involved biogeochemical system, it is necessary to employ an appropriate method for the corresponding local transition terms. Thus, for the sake of simplicity, we first present the finite volume approach and investigate its applicability in the context of the fluid dynamic part of the governing equations. Thereafter, we discuss positivity preserving, conservative schemes for stiff and non-stiff ordinary differential equations. As for the finite volume approach, we discuss the properties of this numerical scheme not only theoretically but also by means of different test cases. Finally, we combine both parts to simulate the flow field and the phosphorus cycle for the West Lake as a comprehensive practical application.

3.1 Finite Volume Method Finite volume schemes are formulated on arbitrary control volumes, and well-known successful time stepping schemes and spatial discretization techniques can easily be employed in the general framework. This approximation technique combines robustness and accuracy with the ability to treat complicated geometries. We start with the description of a state-of-the-art finite volume scheme. The development of the method is presented in various articles (Meister 1998, 2003; Meister and Sonar 1998; Meister and Oevermann 1998; Meister and Vömel 2001), which also demonstrate the validity of the algorithm in the area of inviscid and viscous flow fields. Smooth solutions of the shallow water equations exist in general only for a short time, and well-known phenomena like shock waves and contact discontinuities develop naturally. Hence we introduce the concept of weak solutions, which represents the basis from which every finite volume scheme can be derived. A bounded set $\sigma \subset \mathbb{R}^2$ is said to be a control volume if Gauss' integral theorem is applicable to functions defined on $\sigma$. The mapping $\mathbf{U}$ is called a weak solution of the system (2) if

$$\frac{d}{dt}\int_{\sigma} \mathbf{U}\, d\mathbf{x} = -\int_{\partial\sigma} \sum_{m=1}^{2} \mathbf{F}^c_m(\mathbf{U})\, n_m\, ds + \int_{\partial\sigma} \sum_{m=1}^{2} \mathbf{F}^\nu_m(\mathbf{U})\, n_m\, ds + \int_{\sigma} \mathbf{Q}(\mathbf{U})\, d\mathbf{x} \tag{3}$$

is valid on every control volume $\sigma \subset D$ with outer unit normal vector $\mathbf{n} = (n_1, n_2)^T$. Note that this formula can be derived from the system (2) by integrating over a control volume and applying Gauss' integral theorem. The solution class considered can be described by the function space $BV\left(\mathbb{R}^+_0;\ L^1 \cap L^\infty(\mathbb{R}^2; \mathbb{R}^{17})\right)$, i.e., the mapping $t \mapsto \mathbf{U}(\cdot, t)$ is of bounded variation and the image is an integrable function of the space variables which is bounded almost everywhere. The numerical approximation of Eq. 3 requires an appropriate discretization of the space part $D$ as well as the time part $\mathbb{R}^+$.
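The step from the differential form (2) to the integral form (3) can be written out as follows. This is a sketch under the assumption that $\mathbf{U}$ is smooth enough on a time-independent control volume $\sigma$ for Gauss' integral theorem to be applied componentwise; the symbols are those of (2) and (3).

```latex
\begin{align*}
\frac{d}{dt}\int_{\sigma} \mathbf{U}\, d\mathbf{x}
  &= \int_{\sigma} \partial_t \mathbf{U}\, d\mathbf{x}
   = \int_{\sigma} \Big( -\sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^c_m(\mathbf{U})
     + \sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^\nu_m(\mathbf{U})
     + \mathbf{Q}(\mathbf{U}) \Big)\, d\mathbf{x} \\
  &= -\int_{\partial\sigma} \sum_{m=1}^{2} \mathbf{F}^c_m(\mathbf{U})\, n_m\, ds
     + \int_{\partial\sigma} \sum_{m=1}^{2} \mathbf{F}^\nu_m(\mathbf{U})\, n_m\, ds
     + \int_{\sigma} \mathbf{Q}(\mathbf{U})\, d\mathbf{x},
\end{align*}
% using Gauss' theorem in the form
% \int_{\sigma} \partial_{x_m} G \, d\mathbf{x} = \int_{\partial\sigma} G\, n_m \, ds
% for each component G of the fluxes.
```

The integral form no longer contains derivatives of $\mathbf{U}$, which is why it remains meaningful for discontinuous solutions such as shocks.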


We start from an arbitrary conforming triangulation $\mathcal{T}^h$ of the domain $D$, which is called the primary mesh and consists of finitely many (say $\#\mathcal{T}^h$) triangles $T_i$, $i = 1, \dots, \#\mathcal{T}^h$. Thereby, the parameter $h$ corresponds to a typical one-dimensional geometrical measure of the triangulation, for example, the maximum diameter of the smallest enclosing circles of the triangles $T_i$, $i = 1, \dots, \#\mathcal{T}^h$. For a comprehensive definition of a primary grid, we refer to Sonar (1997a). Furthermore, $N^h$ denotes the index set of all nodes of the triangulation $\mathcal{T}^h$ and is subdivided by $N^h = N^{h,D} \cup N^{h,\partial D}$, where $N^{h,D}$ is associated with the inner points and $N^{h,\partial D}$ includes the indices of the boundary points. We set $N := \#N^h$ and denote the three edges of the triangle $T$ by $e_{T,k}$, $k = 1, 2, 3$. Furthermore, we define

$$E(i) = \{ e_{T,k} \mid k \in \{1,2,3\},\ T \in \mathcal{T}^h,\ \text{node } \mathbf{x}_i \in e_{T,k} \},$$
$$V(i) = \{ T \mid \text{node } \mathbf{x}_i \text{ is a vertex of } T \in \mathcal{T}^h \},$$

and

$$C(T) = \{ i \mid i \in \{1, \dots, N\},\ \text{node } \mathbf{x}_i \text{ is a vertex of } T \}.$$

For the calculation of the triangulation we employ an algorithm developed by Friedrich (1993), which provides mostly isotropic grids. As we see in the following, the occurrence of second-order derivatives within the partial differential equation requires the evaluation of first-order derivatives on the boundary of each control volume used in the finite volume scheme. Due to this fact it is advantageous to employ a box-type method where the computation of a first-order derivative at the boundary of each box is straightforward. We define a discrete control volume $\sigma_i$ as the open subset of $\mathbb{R}^2$ including the node $\mathbf{x}_i = (x_{i1}, x_{i2})^T$ and bounded by the straight lines which are defined by the connection of the midpoint of the edge $e_{T,k} \in E(i)$ with the point

$$\mathbf{x}_s = (x_{s1}, x_{s2})^T = \sum_{i \in C(T)} \alpha_i \mathbf{x}_i$$

of the corresponding triangle $T$, see Fig. 2. In the case that $\mathbf{x}_i$ is at the boundary of the computational domain, the line defined by the connection of the midpoint of the boundary edge and the node itself is also a part of $\partial\sigma_i$. For the calculation of the weights we employ

Fig. 2 General form of a control volume (left) and its boundary (right)


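$$\alpha_i = \frac{1}{2 \sum_{m \in C(T)} |l_m|} \sum_{\substack{m \in C(T) \\ m \neq i}} |l_m| \quad \text{with} \quad |l_i| = \|\mathbf{x}_j - \mathbf{x}_k\|_2 \quad \text{for} \quad i, j, k \in C(T). \tag{4}$$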
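The weights in (4) and the resulting point $\mathbf{x}_s$ can be computed per triangle as in the following minimal sketch. It assumes that $|l_i|$ is the length of the edge opposite to vertex $\mathbf{x}_i$, i.e., that $i$, $j$, $k$ in (4) are the three distinct vertex indices of the triangle; the function name `box_center` is hypothetical.

```python
import numpy as np

def box_center(vertices):
    """Return (alpha, x_s) for one triangle.

    vertices: array-like of shape (3, 2) holding x_i, x_j, x_k.
    alpha[i] follows Eq. (4); |l_i| is taken as the length of the
    edge opposite to vertex i (assumption of this sketch).
    """
    x = np.asarray(vertices, dtype=float)
    lengths = np.array([
        np.linalg.norm(x[(i + 1) % 3] - x[(i + 2) % 3]) for i in range(3)
    ])
    total = lengths.sum()
    # Sum over m != i equals total - lengths[i]; divide by 2 * total.
    alpha = (total - lengths) / (2.0 * total)
    return alpha, alpha @ x

# Equilateral triangle: all weights equal 1/3, so x_s is the barycenter.
equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, np.sqrt(3.0) / 2.0)]
alpha_eq, xs_eq = box_center(equilateral)

# Distorted triangle: the weights differ but still sum to one.
alpha_d, xs_d = box_center([(0.0, 0.0), (10.0, 0.0), (0.0, 1.0)])
```

Since each edge length enters the numerators of exactly two weights, the weights always sum to one, so $\mathbf{x}_s$ is a convex combination of the vertices and lies inside the triangle.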

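The definition (4) exhibits the advantage that the deformation of the control volume with respect to distorted triangles is much smaller compared to that obtained by using the barycenter of the triangle $T$ as $\mathbf{x}_s$. The union $B^h$ of all boxes $\sigma_i$, $i \in N^h$, is called the secondary mesh. Let $N(i)$ denote the index set of all nodes neighboring node $\mathbf{x}_i$, i.e., those nodes $\mathbf{x}_j$, $j \neq i$, for which $\int_{\partial\sigma_i \cap \partial\sigma_j} 1\, ds \neq 0$ is valid. In general, for $j \in N(i)$, the boundary between the control volumes $\sigma_i$ and $\sigma_j$ consists of two line segments, which are denoted by $l^k_{ij}$, $k = 1, 2$. Furthermore, $\mathbf{n}^k_{ij}$, $k = 1, 2$, represent the accompanying unit normal vectors. Note that in the case of a boundary box $\sigma_i$ there exist two adjacent cells $\sigma_j$, $j \in N^{h,\partial D}$, such that $\partial\sigma_i \cap \partial\sigma_j$ consists of one line segment $l^k_{ij}$ only. In order to achieve a unique representation we interpret the lacking line segment as having length zero. Introducing the cell average

$$\mathbf{U}_i(t) := \frac{1}{|\sigma_i|} \int_{\sigma_i} \mathbf{U}(\mathbf{x}, t)\, d\mathbf{x},$$

the integral form with regard to an inner box $\sigma_i$ can be written as

$$\frac{d}{dt} \mathbf{U}_i(t) = (L^c_i \mathbf{U})(t) + (L^\nu_i \mathbf{U})(t) + (Q_i \mathbf{U})(t)$$

with

$$(L^p_i \mathbf{U})(t) := \mp \frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} \int_{l^k_{ij}} \sum_{m=1}^{2} \mathbf{F}^p_m(\mathbf{U}) \left(\mathbf{n}^k_{ij}\right)_m ds, \quad p \in \{c, \nu\},$$

where, in accordance with (3), the negative sign applies to the convective part ($p = c$) and the positive sign to the viscous part ($p = \nu$), and

$$(Q_i \mathbf{U})(t) := \frac{1}{|\sigma_i|} \int_{\sigma_i} \mathbf{Q}(\mathbf{U})\, d\mathbf{x}.$$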

Note that in the case of a boundary box $\sigma_i$ the boundary conditions have to be considered additionally. Obviously, the evaluation of the line integrals runs into trouble when $\mathbf{U}$ is discontinuous across the line segment $l^k_{ij}$. For the solution of these Riemann problems we introduce the concept of a numerical flux function, which is often called a Riemann solver. Over the past more than 40 years, a huge variety of numerical flux functions has been developed for various balance laws, showing different advantages as well as failings. Such schemes can be found in numerous well-written textbooks (Hirsch 1988a, b; LeVeque 1990; Toro 1999; Ansorge and Sonar 2009). Utilizing the HLL-scheme (Toro 1999), one obtains

$$(L^c_i \mathbf{U})(t) \approx -\frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} \left|l^k_{ij}\right| \mathbf{H}^{HLL}\left(\hat{\mathbf{U}}_i(t), \hat{\mathbf{U}}_j(t); \mathbf{n}^k_{ij}\right).$$
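To make the role of the numerical flux function more concrete, the following sketch implements the standard HLL flux for the one-dimensional shallow water equations, which corresponds to the situation along one normal direction. It is an illustration under several assumptions, not the implementation used in the chapter: it uses the conventional variables $(h, hu)$ instead of $(H, \Phi v_1, \Phi v_2)$, simple minimum and maximum characteristic speeds as wave speed estimates, and strictly positive water heights. The names `flux` and `hll_flux` are hypothetical.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def flux(U, g=G):
    """Physical flux of the 1D shallow water equations, U = (h, h*u)."""
    h, hu = U
    u = hu / h
    return np.array([hu, hu * u + 0.5 * g * h * h])

def hll_flux(UL, UR, g=G):
    """HLL numerical flux between a left state UL and a right state UR."""
    UL = np.asarray(UL, dtype=float)
    UR = np.asarray(UR, dtype=float)
    hL, hR = UL[0], UR[0]
    uL, uR = UL[1] / hL, UR[1] / hR
    # Simple wave speed estimates from the characteristic speeds u -/+ sqrt(g h).
    sL = min(uL - np.sqrt(g * hL), uR - np.sqrt(g * hR))
    sR = max(uL + np.sqrt(g * hL), uR + np.sqrt(g * hR))
    if sL >= 0.0:          # all waves travel to the right
        return flux(UL, g)
    if sR <= 0.0:          # all waves travel to the left
        return flux(UR, g)
    FL, FR = flux(UL, g), flux(UR, g)
    return (sR * FL - sL * FR + sL * sR * (UR - UL)) / (sR - sL)

# Consistency: identical states reproduce the physical flux.
U = np.array([1.5, 0.3])
consistent = hll_flux(U, U)

# Dam break with water at rest: mass flows from the high to the low side.
dam = hll_flux([2.0, 0.0], [1.0, 0.0])
```

The consistency property $\mathbf{H}(\mathbf{U}, \mathbf{U}) = \mathbf{F}(\mathbf{U})$ checked in the first usage example is a basic requirement for any numerical flux function.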


Herein, $\hat{\mathbf{U}}_i$ and $\hat{\mathbf{U}}_j$ represent approximations of the conserved variables at the left- and right-hand side of the midpoint $\mathbf{x}^k_{ij}$ corresponding to the line segment $l^k_{ij}$. Using the cell averages, i.e., $\hat{\mathbf{U}}_i = \mathbf{U}_i$ and $\hat{\mathbf{U}}_j = \mathbf{U}_j$, leads to a spatially first-order accurate scheme. The central approach for the construction of higher-order finite volume methods is based on the recovery of improved approximations $\hat{\mathbf{U}}_i$, $\hat{\mathbf{U}}_j$ from the given cell averages. A detailed introduction and mathematical analysis of different reconstruction techniques is presented in Sonar (1997a). Within this study, we concentrate on the recovery based on linear polynomials calculated by means of a TVD reconstruction procedure using the Barth-Jespersen limiter (Barth and Jesperson 1989). Furthermore, the numerical evaluation of the viscous fluxes is performed in the sense of central differences. For each quantity appearing inside the flux $\mathbf{F}^\nu_m$, we calculate the unique linear distribution with respect to the triangle $T$ from the cell averages of the three corresponding boxes located at the nodes of the triangle. Thus, we can write

$$(L^\nu_i \mathbf{U})(t) \approx \frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} \left|l^k_{ij}\right| \mathbf{H}^{central}\left(\mathbf{U}_{T,1}(t), \mathbf{U}_{T,2}(t), \mathbf{U}_{T,3}(t); \mathbf{n}^k_{ij}\right),$$

where $l^k_{ij} \subset T$. Finally, the source term is simply expressed as

$$(Q_i \mathbf{U})(t) \approx \mathbf{Q}(\mathbf{U}_i(t)).$$

Subsequent to the approximation of the spatial parts, we are faced with a huge system of ordinary differential equations of the form

$$\frac{d}{dt} \mathbf{U}(t) = (L\mathbf{U})(t) + (Q\mathbf{U})(t),$$

where $\mathbf{U}$ represents the vector containing all cell averages and $L$, $Q$ denote the numerical approximation of the global terms (viscous and inviscid fluxes) and the local source terms, respectively. Concerning the decomposition of the right-hand side, we employ a standard Strang splitting

$$\begin{aligned}
\mathbf{U}^{(1)} &= \mathbf{U}^n + \Delta t\, \Psi(\Delta t, \mathbf{U}^n, Q), \\
\mathbf{U}^{(2)} &= \mathbf{U}^{(1)} + 2\Delta t\, \Theta(\Delta t, \mathbf{U}^{(1)}, L), \\
\mathbf{U}^{n+2} &= \mathbf{U}^{(2)} + \Delta t\, \Psi(\Delta t, \mathbf{U}^{(2)}, Q),
\end{aligned} \tag{5}$$

which is of second order if both discretization schemes $\Theta$ and $\Psi$ are second-order accurate. Note that the steps associated with the operator $L$ include the convection and diffusion processes, whereas the other steps possess a local character due to the reaction terms. Thus, the intermediate steps are carried out separately for each control volume. The discretization scheme $\Theta$ is simply realized by a second-order Runge-Kutta method. In the context of stiff reaction terms $Q$, the performance of the complete numerical method depends decisively on the properties of the incorporated scheme $\Psi$, which has to be stable and, if necessary, conservative and positivity preserving independent of the time step size used.
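The structure of the Strang splitting (5) can be tried out on a small model problem. The sketch below is an illustration only: the two non-commuting matrices `A` and `B` merely stand in for the transport operator $L$ and the reaction operator $Q$, both sub-steps use Heun's second-order Runge-Kutta method, and all names are hypothetical. One call of `strang_step` advances the solution by $2\Delta t$, as in (5).

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])    # stand-in for the transport operator L
B = np.array([[-1.0, 0.0], [0.0, -0.5]])   # stand-in for the reaction operator Q

def heun(M, u, dt):
    """One step of Heun's second-order Runge-Kutta method for u' = M u."""
    k1 = M @ u
    k2 = M @ (u + dt * k1)
    return u + 0.5 * dt * (k1 + k2)

def strang_step(u, dt):
    """Advance by 2*dt: reaction over dt, transport over 2*dt, reaction over dt."""
    u1 = heun(B, u, dt)
    u2 = heun(A, u1, 2.0 * dt)
    return heun(B, u2, dt)

def solve(u0, t_end, n_steps):
    """Apply n_steps Strang steps of total length 2*dt each."""
    dt = t_end / (2 * n_steps)
    u = u0.copy()
    for _ in range(n_steps):
        u = strang_step(u, dt)
    return u

def exact(u0, t_end):
    """Reference solution of u' = (A + B) u via eigendecomposition."""
    w, V = np.linalg.eig(A + B)
    return (V @ np.diag(np.exp(w * t_end)) @ np.linalg.inv(V) @ u0).real

u0 = np.array([1.0, 0.5])
errors = [np.linalg.norm(solve(u0, 1.0, n) - exact(u0, 1.0)) for n in (20, 40, 80)]
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(2)]
```

The observed convergence rates in `rates` should be close to two, in line with the statement that the splitting is second-order accurate when both sub-schemes are.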


Fig. 3 Instationary flow around a pillar (color scale: water level from 1.0 to 2.0 m)

3.1.1 Numerical Results for Shallow Water Flow The example deals with a dam-break situation in the presence of a pillar. The evolution of the water level depicted in Fig. 3 shows a good resolution of the dominant instationary flow features, which are shock-boundary and shock-shock interactions. The test case emphasizes the applicability and stability of the finite volume scheme explained above for the simulation of shallow water flow.

3.2 Positivity Preserving and Conservative Schemes Concerning the remaining numerical step associated with the system of ordinary differential equations, we present a survey of the modified Patankar approach originally suggested and analyzed by Burchard et al. (2003) and its extension published in Benz et al. (2009). In order to describe the properties of the numerical scheme for stiff positive systems, we make use of an arbitrary production-destruction equation. For $i \neq j$ we denote by $d_{i,j}(\mathbf{c}) \geq 0$ the rate at which the $i$th component transforms into the $j$th, while $p_{i,j}(\mathbf{c}) \geq 0$ represents the rate at which the $j$th constituent transforms into the $i$th. Clearly, $p_{i,j}(\mathbf{c})$ and $d_{i,j}(\mathbf{c})$ must satisfy $p_{i,j}(\mathbf{c}) = d_{j,i}(\mathbf{c})$. In addition to these transition terms, we consider for the $i$th constituent local production $p_{i,i}(\mathbf{c}) \geq 0$ and, similarly, local destruction $d_{i,i}(\mathbf{c}) \geq 0$. Thus, the system we start to investigate here can be written as

$$\frac{d c_i}{dt} = \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}) - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}) + p_{i,i}(\mathbf{c}) - d_{i,i}(\mathbf{c}), \quad i = 1, \dots, I, \tag{6}$$

where $\mathbf{c} = \mathbf{c}(t) = (c_1(t), \dots, c_I(t))^T$ denotes the vector of the $I$ constituents. This system has to be solved under the initial conditions

$$\mathbf{c}^0 = \mathbf{c}(t = 0) > \mathbf{0}. \tag{7}$$

Definition 1. The system (6) is called fully conservative, if the production and destruction terms satisfy

$$p_{i,j}(\mathbf{c}) = d_{j,i}(\mathbf{c}) \quad \text{for } j \neq i,\ j, i \in \{1, \dots, I\}$$


and

$$p_{i,i}(\mathbf{c}) = d_{i,i}(\mathbf{c}) = 0 \quad \text{for } i \in \{1, \dots, I\}.$$

Furthermore, we always consider ecosystems where the constituents are by nature positive. From a mathematical point of view one can easily prove by means of a simple contradiction argument that, for non-negative initial conditions $c_i(0) \geq 0$, $i = 1, \dots, I$, the condition $d_{i,j}(\mathbf{c}) \to 0$ for $c_i \to 0$ for all $j \in \{1, \dots, I\}$ guarantees that the quantities $c_i(t)$, $i = 1, \dots, I$, remain non-negative for all $t \in \mathbb{R}^+$. Consequently, the properties mentioned above have to be maintained by the discretization scheme, which means that no gain or loss of mass should occur for numerical reasons and that the concentrations of all constituents have to remain positive independent of the time step size used. Based on the following standard formulation of a discretization scheme,

$$\mathbf{c}^{n+1} = \mathbf{c}^n + \Delta t\, \Phi(\mathbf{c}^n, \mathbf{c}^{n+1}; \Delta t),$$

we introduce some notations and definitions which are used in the subsequent parts of the chapter.

Definition 2. For a given discretization scheme $\Phi$ we call

$$\mathbf{e} = \mathbf{c}^{n+1} - \mathbf{c}(t^{n+1}) \quad \text{with} \quad \mathbf{c}^{n+1} = \mathbf{c}(t^n) + \Delta t\, \Phi\left(\mathbf{c}(t^n), \mathbf{c}(t^{n+1}); \Delta t\right)$$

the local discretization error vector, where $\mathbf{c}(t)$ represents the exact solution of the initial value problem (6), (7) and $\Delta t = t^{n+1} - t^n$. Moreover, we write

$$M = \mathcal{O}(\Delta t^p) \quad \text{as } \Delta t \to 0, \quad p \in \mathbb{N}_0,$$

in the sense that $m_{i,j} = \mathcal{O}(\Delta t^p)$ as $\Delta t \to 0$, $p \in \mathbb{N}_0$, for all elements $m_{i,j}$, $i = 1, \dots, r$, $j = 1, \dots, k$, of the matrix $M \in \mathbb{R}^{r \times k}$.

It is worth mentioning that the production and destruction terms $p_{i,j}$, $d_{i,j}$, $i, j = 1, \dots, I$, are considered to be sufficiently smooth, and we require the solution $\mathbf{c}$ of the initial value problem (6) and (7) to be sufficiently differentiable. In the following, we always consider the case of a vanishing time step $\Delta t$, and thus the supplement $\Delta t \to 0$ will be omitted for simplicity; we use the expression accuracy in the sense of consistency.

Definition 3. A discretization scheme $\Phi$ is called

• consistent of order $p$ with respect to the ordinary differential equation (6), if $\mathbf{e} = \mathcal{O}(\Delta t^{p+1})$,


• unconditionally positive, if cnC1 > 0 for any given cn WD c.t n / > 0 and any arbitrary large time step t  0 independent of the specific definition of the production and destruction terms within the ordinary differential equation (6) • conservative, if I X

.cinC1  cin / D 0

(8)

iD1

for any fully conservative ordinary differential equation (6) and cn WD c.t n /. At first, we consider the well-known forward Euler scheme 0

1

I I X BX C n n n n C cinC1 D cin C t B p .c /  d .c / C p .c /  d .c / i;j i;j i;i i;i @ A j D1 j ¤i

(9)

j D1 j ¤i

for i D 1; : : :; I . Quite obviously, we obtain the following result concerning the positivity and conservativity of the numerical method. Theorem 1. preserving.

The forward Euler scheme (9) is conservative but not unconditional positivity

Proof. By means of the abbreviations $P_i = \sum_{j=1,\, j \ne i}^{I} p_{i,j}$ and $D_i = \sum_{j=1,\, j \ne i}^{I} d_{i,j}$ and utilizing the properties of the production and destruction rates we deduce
\[ \sum_{i=1}^{I} \bigl(c_i^{n+1} - c_i^n\bigr) = \Delta t \underbrace{\sum_{i=1}^{I} \bigl(P_i(c^n) - D_i(c^n)\bigr)}_{=0} + \Delta t \sum_{i=1}^{I} \bigl(\underbrace{p_{i,i}(c^n)}_{=0} - \underbrace{d_{i,i}(c^n)}_{=0}\bigr) = 0, \]
which proves that the method is conservative. Due to the fact that the property to be unconditionally positive is independent of the production and destruction terms, we consider a fully conservative system with nonvanishing right-hand side. Thus, there exists at least one index $i \in \{1,\dots,I\}$ such that $P_i(c^n) - D_i(c^n) < 0$ for a given $c^n > 0$. Thus, using
\[ \Delta t > \frac{c_i^n}{D_i(c^n) - P_i(c^n)} \]
yields
\[ c_i^{n+1} = c_i^n + \Delta t \bigl(P_i(c^n) - D_i(c^n)\bigr) < 0, \]
which proves the statement.

To overcome this disadvantage, Patankar (1980) suggested to weight the destruction terms $d_{i,j}(c)$ and $d_{i,i}(c)$ by the factor $c_i^{n+1}/c_i^n$.
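The effect of this weighting can be tried out on a small example. The following is a minimal sketch, assuming the two-species system $c_1' = c_2 - 2c_1$, $c_2' = 2c_1 - c_2$ with $c^0 = (1,1)^T$; the function names `euler_step` and `patankar_step` are illustrative and not taken from the chapter. Exact rational arithmetic is used so that the sums can be compared without rounding.

```python
from fractions import Fraction


def euler_step(c, dt):
    """Forward Euler step for c1' = c2 - 2*c1, c2' = 2*c1 - c2."""
    c1, c2 = c
    return (c1 + dt * (c2 - 2 * c1), c2 + dt * (2 * c1 - c2))


def patankar_step(c, dt):
    """Euler step with the destruction terms weighted by c_i^{n+1} / c_i^n.

    Each component then solves (1 + dt * D_i / c_i) * c_i_new = c_i + dt * P_i.
    """
    c1, c2 = c
    P1, D1 = c2, 2 * c1  # production / destruction of species 1
    P2, D2 = 2 * c1, c2  # production / destruction of species 2
    return ((c1 + dt * P1) / (1 + dt * D1 / c1),
            (c2 + dt * P2) / (1 + dt * D2 / c2))


c0 = (Fraction(1), Fraction(1))
dt = Fraction(3, 2)  # exceeds c_1 / (D_1 - P_1) = 1, so Euler must turn negative

e = euler_step(c0, dt)
p = patankar_step(c0, dt)
print("Euler:   ", e, "sum =", sum(e))
print("Patankar:", p, "sum =", sum(p))
```

In this sketch the Euler update keeps the sum of both components but produces a negative first component, while the weighted update stays positive at the price of a changed sum.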


Theorem 2. The Patankar scheme
\[ c_i^{n+1} = c_i^n + \Delta t \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1}}{c_i^n} + p_{i,i}(c^n) - d_{i,i}(c^n) \frac{c_i^{n+1}}{c_i^n} \Biggr) \]
for $i = 1,\dots,I$ is unconditionally positivity preserving but not conservative.

Proof. We simply rewrite the Patankar scheme in the form
\[ \underbrace{\Biggl( 1 + \frac{\Delta t}{c_i^n} \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) + d_{i,i}(c^n) \Biggr) \Biggr)}_{\ge 1} c_i^{n+1} = \underbrace{c_i^n + \Delta t \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) + p_{i,i}(c^n) \Biggr)}_{\ge c_i^n > 0} \]
for $i = 1,\dots,I$. Thus, we can immediately conclude the positivity of the method. However, one can easily see that even in the case of the simple system
\[ c_1'(t) = c_2(t) - 2c_1(t), \qquad c_2'(t) = 2c_1(t) - c_2(t) \]
and initial conditions $c^0 = c(t = 0) = (1,1)^T$ one gets
\[ c_1^1 = \frac{1 + \Delta t}{1 + 2\Delta t}, \qquad c_2^1 = \frac{1 + 2\Delta t}{1 + \Delta t} \]
such that
\[ \sum_{i=1}^{2} \bigl(c_i^1 - c_i^0\bigr) = \frac{\Delta t^2}{1 + 3\Delta t + 2\Delta t^2} \ne 0 \quad \text{for all } \Delta t > 0. \]

Theorem 2 shows that the so-called Patankar-trick represents a cure with respect to the positivity constraint but this method suffers from the fact that the conservativity is violated since production and destruction terms are handled in a different manner. Inspired by the Patankar-trick, Burchard et al. (2003) introduced a modified Patankar approach where source as well as sink terms are treated in the same way. However, this procedure can only directly be applied to conservative systems. Thus, an extension of this modified Patankar scheme to take account of additional non conservative reaction terms as appearing within the biomass dynamics has been presented in Benz et al. (2009). With respect to the Euler scheme as the underlying basic solver, this extended modified Patankar approach can be written in the form


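\[ c_i^{n+1} = c_i^n + \Delta t \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1}}{c_i^n} + \bigl(p_{i,i}(c^n) - d_{i,i}(c^n)\bigr)\,\omega_i^n \Biggr) \tag{10} \]
for $i = 1,\dots,I$, where
\[ \omega_i^n = \begin{cases} \dfrac{c_i^{n+1}}{c_i^n}, & \text{if } p_{i,i}(c^n) - d_{i,i}(c^n) < 0, \\ 1, & \text{otherwise.} \end{cases} \]

Theorem 3. The extended modified Patankar scheme (10) is conservative in the sense of Definition 3.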

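Proof. It is easily seen by straightforward calculations that
\[ \sum_{i=1}^{I} \bigl(c_i^{n+1} - c_i^n\bigr) = \Delta t \sum_{i=1}^{I} \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1}}{c_i^n} \Biggr) + \Delta t \sum_{i=1}^{I} \underbrace{\bigl(p_{i,i}(c^n) - d_{i,i}(c^n)\bigr)}_{=0}\,\omega_i^n \]
\[ \phantom{\sum_{i=1}^{I} \bigl(c_i^{n+1} - c_i^n\bigr)} = \Delta t \sum_{\substack{i,j=1 \\ j \ne i}}^{I} \underbrace{\bigl(p_{i,j}(c^n) - d_{j,i}(c^n)\bigr)}_{=0} \frac{c_j^{n+1}}{c_j^n} = 0, \]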

which proves the statement.

Theorem 4. The extended modified Patankar scheme (10) applied to the system of differential equations (6) is unconditionally positivity preserving.

Proof. The Patankar-type approach (10) can be written in the form
\[ A c^{n+1} = b^n, \tag{11} \]
where $A = (a_{i,j}) \in \mathbb{R}^{I \times I}$ with
\[ a_{i,i} = 1 + \frac{\Delta t}{c_i^n} \Biggl( \max\{0,\, d_{i,i}(c^n) - p_{i,i}(c^n)\} + \sum_{\substack{k=1 \\ k \ne i}}^{I} d_{i,k}(c^n) \Biggr) > 0, \quad i = 1,\dots,I, \]
\[ a_{i,j} = -\Delta t\, \frac{p_{i,j}(c^n)}{c_j^n} \le 0, \quad i,j = 1,\dots,I, \quad i \ne j, \]
and
\[ b_i^n = c_i^n + \Delta t \max\{0,\, p_{i,i}(c^n) - d_{i,i}(c^n)\} \ge c_i^n > 0, \quad i = 1,\dots,I. \]
Hence, for $i = 1,\dots,I$ we have
\[ |a_{i,i}| > \Delta t \sum_{\substack{k=1 \\ k \ne i}}^{I} \frac{d_{i,k}(c^n)}{c_i^n} = \Delta t \sum_{\substack{k=1 \\ k \ne i}}^{I} \frac{p_{k,i}(c^n)}{c_i^n} = \sum_{\substack{k=1 \\ k \ne i}}^{I} (-a_{k,i}) = \sum_{\substack{k=1 \\ k \ne i}}^{I} |a_{k,i}|, \]
which directly shows that the point Jacobi matrix $B = I - D^{-1} A^T$ defined by means of the diagonal matrix $D^{-1} = \mathrm{diag}\{a_{1,1}^{-1},\dots,a_{I,I}^{-1}\}$ satisfies $\rho(B) \le \|B\|_\infty < 1$. Thus, the matrix $B$ is convergent. Regarding the fact that the matrix $A$ contains only nonpositive off-diagonal elements and positive diagonal entries, we can conclude with a standard statement from numerical linear algebra that $A^T$ and therefore $A$ are M-matrices. Thus, $A^{-1}$ exists and is nonnegative, i.e., $A^{-1} \ge 0$. This fact implies that $c^{n+1} = A^{-1} b^n \ge A^{-1} c^n > 0$ since at least one entry per row of the matrix $A^{-1}$ is positive.

The Patankar scheme as well as the modified and extended modified version can be interpreted as a perturbed Euler scheme due to the incorporation of the weights. Thus, it is not at all obvious that these schemes are still first-order accurate. Fortunately, the order of accuracy of the underlying Euler scheme transmits to each variant discussed above. Similar to the error analysis presented in Burchard et al. (2003) for the modified Patankar scheme we will now prove this important property for the extended formulation.

Theorem 5. The extended modified Patankar scheme (10) is first-order accurate in the sense of the local discretization error.

Proof. Since the time step inside the extended modified Patankar scheme (10) is performed by the solution of a linear system of equations, it is advantageous to investigate the entries of the inverse matrix $A^{-1} = (\tilde a_{i,j}) \in \mathbb{R}^{I \times I}$. Introducing $e = (1,\dots,1)^T \in \mathbb{R}^I$ one can easily verify $e^T A \ge e^T$. Within the proof of Theorem 4 we have seen that $A$ is regular and $A^{-1} \ge 0$. Thus, we obtain $e^T \ge e^T A^{-1}$, which yields
\[ 0 \le \tilde a_{i,j} \le 1, \quad i,j = 1,\dots,I, \tag{12} \]
independent of the time step size $\Delta t > 0$ used. Starting from the formulation of the time step in the form of the linear system (11) one can conclude
\[ \frac{c_i^{n+1}}{c_i^n} = \sum_{j=1}^{I} \underbrace{\tilde a_{i,j}}_{\in [0,1]} \frac{b_j^n}{c_i^n} = O(1), \quad i = 1,\dots,I, \]
from the estimation (12). Introducing this property into the extended modified Patankar scheme (10) and setting $c^n := c(t^n)$ for simplification leads to


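\[ c_i^{n+1} - c_i^n = \Delta t \underbrace{\Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1}}{c_i^n} + \bigl(p_{i,i}(c^n) - d_{i,i}(c^n)\bigr)\,\omega_i^n \Biggr)}_{= O(1)} = O(\Delta t). \]
Thus we obtain
\[ \frac{c_i^{n+1}}{c_i^n} - 1 = \frac{c_i^{n+1} - c_i^n}{c_i^n} = O(\Delta t) \tag{13} \]
for $i = 1,\dots,I$. A similar result is valid for the weight $\omega_i^n$ due to
\[ \omega_i^n - 1 = \begin{cases} \dfrac{c_i^{n+1} - c_i^n}{c_i^n} = O(\Delta t), & \text{if } p_{i,i}(c^n) - d_{i,i}(c^n) < 0, \\ 0 = O(\Delta t), & \text{otherwise.} \end{cases} \tag{14} \]
A combination of (13) and (14) with a simple Taylor series expansion yields
\[ c_i(t^{n+1}) = c_i(t^n) + \Delta t\, \frac{dc_i}{dt}(t^n) + O(\Delta t^2) \]
\[ = c_i(t^n) + \Delta t \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c(t^n)) - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c(t^n)) + p_{i,i}(c(t^n)) - d_{i,i}(c(t^n)) \Biggr) + O(\Delta t^2) \]
\[ = c_i^n + \Delta t \Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1}}{c_i^n} + \bigl(p_{i,i}(c^n) - d_{i,i}(c^n)\bigr)\,\omega_i^n \Biggr) \]
\[ \quad - \Delta t \underbrace{\Biggl( \sum_{\substack{j=1 \\ j \ne i}}^{I} p_{i,j}(c^n) \frac{c_j^{n+1} - c_j^n}{c_j^n} - \sum_{\substack{j=1 \\ j \ne i}}^{I} d_{i,j}(c^n) \frac{c_i^{n+1} - c_i^n}{c_i^n} \Biggr)}_{= O(\Delta t)} - \Delta t \underbrace{\bigl(p_{i,i}(c^n) - d_{i,i}(c^n)\bigr)\bigl(\omega_i^n - 1\bigr)}_{= O(\Delta t)} + O(\Delta t^2) \]
\[ = c_i^{n+1} + O(\Delta t^2), \]
which completes the proof.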
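The linear-system form (11) translates directly into a few lines of code. The following is a minimal sketch, assuming NumPy and the two-species system $c_1' = c_2 - 2c_1$, $c_2' = 2c_1 - c_2$ from the proof of Theorem 2; the names `emp_step`, `rates`, and `integrate` are hypothetical helpers, not part of the chapter. It checks positivity and conservation for a very large time step and estimates the observed order by halving the step size.

```python
import numpy as np


def emp_step(c, dt, p, d):
    """One step of scheme (10), written as the linear system A c^{n+1} = b^n.

    p[i, j] and d[i, j] hold the production and destruction rates evaluated
    at c^n; the diagonals carry the non-conservative terms p_ii and d_ii.
    """
    n = len(c)
    A = np.zeros((n, n))
    b = np.array(c, dtype=float)
    for i in range(n):
        off_diag_destruction = d[i].sum() - d[i, i]
        A[i, i] = 1.0 + dt / c[i] * (max(0.0, d[i, i] - p[i, i]) + off_diag_destruction)
        b[i] += dt * max(0.0, p[i, i] - d[i, i])
        for j in range(n):
            if j != i:
                A[i, j] = -dt * p[i, j] / c[j]
    return np.linalg.solve(A, b)


def rates(c):
    """Rates of c1' = c2 - 2*c1, c2' = 2*c1 - c2 with p[i, j] = d[j, i]."""
    p = np.array([[0.0, c[1]], [2.0 * c[0], 0.0]])
    return p, p.T.copy()


def integrate(c0, t_end, n_steps):
    c = np.array(c0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        c = emp_step(c, dt, *rates(c))
    return c


# A single, very large step stays positive and keeps the total mass.
c_big = emp_step(np.array([1.0, 1.0]), 100.0, *rates([1.0, 1.0]))

# Halving the step size should roughly halve the error (first order).
exact = np.array([2.0 / 3.0 + np.exp(-3.0) / 3.0, 4.0 / 3.0 - np.exp(-3.0) / 3.0])
err_coarse = np.abs(integrate([1.0, 1.0], 1.0, 100) - exact).max()
err_fine = np.abs(integrate([1.0, 1.0], 1.0, 200) - exact).max()
ratio = err_coarse / err_fine
print(c_big, c_big.sum(), ratio)
```

For a linear system such as this one, the weights turn the scheme into the backward Euler method, which is why the sketch can be checked against a closed-form step.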


In order to increase the accuracy one can easily integrate the idea described in the context of the extended modified Patankar scheme within a standard second-order Runge-Kutta method. Similar to the proof presented in Burchard et al. (2003), it can be shown that such an extension is second-order accurate, unconditionally positivity preserving and conservative in the sense of Definition 3.

3.2.1 Numerical Results for Positive Ordinary Differential Equations

The first test case is a simple linear system of ordinary differential equations. The results are taken from the original paper by Burchard et al. (2003). The system can be written as
\[ d_t c_1 = c_2 - 5c_1 \quad \text{and} \quad d_t c_2 = 5c_1 - c_2 \tag{15} \]
with initial values set to be $c_1(0) = 0.9$ and $c_2(0) = 0.1$. The analytic solution is $c_1(t) = (1 + 4.4 \exp(-6t))/6$ and $c_2(t) = 1 - c_1(t)$. Using the step size $\Delta t = 0.25$ one obtains negative concentrations for the standard forward Euler scheme, whereas the modified Patankar approach still preserves the positivity of the solution. Furthermore, both schemes are conservative, which can be seen from the horizontal line representing the sum of both concentrations (Fig. 4). As a second test case we consider the simplified conservative biochemical system
\[ \frac{d}{dt} \begin{pmatrix} PA \\ PEI \\ PEO \\ PS \end{pmatrix} = \begin{pmatrix} uptba - setpa - resp \\ minpe - exchp \\ setpa - minpe \\ exchp - uptba + resp \end{pmatrix}, \]
where uptba, setpa, resp, exchp, and minpe denote the phosphorus uptake, phosphorus of settling phytoplankton, phosphorus due to respiration, exchange between water and sediment and the mineralization of organic phosphorus, respectively. The results obtained by the standard forward Euler scheme, the Patankar scheme and the modified Patankar scheme for a constant time step size $\Delta t = 1/3$ d are compared with a high-resolution numerical result using the time step size $\Delta t = 1/100$ d. Figures 5 and 6 confirm the analytic statements concerning the three schemes used. The modified Patankar scheme is positivity preserving and conservative, while the Patankar method suffers from non-conservativity and the standard Euler approach yields meaningless negative values for both the solute phosphorus concentration in the water body and the phosphorus within the biomass. Note that the conservativity can be observed by Delta(PS + PA + PEI + PEO) := (PS + PA + PEI + PEO)^n - (PS + PA + PEI + PEO)^0.
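The first test case is small enough to be recomputed in a few lines. The following is a minimal sketch, assuming NumPy; the helper names `euler_step`, `modified_patankar_step`, and `run` are illustrative, and the number of steps is chosen only to cover roughly the time range of Fig. 4. It applies the forward Euler scheme and the modified Patankar scheme to system (15) with $\Delta t = 0.25$.

```python
import numpy as np


def exact(t):
    """Analytic solution of system (15) for c(0) = (0.9, 0.1)."""
    c1 = (1.0 + 4.4 * np.exp(-6.0 * t)) / 6.0
    return np.array([c1, 1.0 - c1])


def euler_step(c, dt):
    rhs = np.array([c[1] - 5.0 * c[0], 5.0 * c[0] - c[1]])
    return c + dt * rhs


def modified_patankar_step(c, dt):
    """Scheme (10) without diagonal terms, applied to system (15)."""
    p12, d12 = c[1], 5.0 * c[0]  # production / destruction of c1
    p21, d21 = d12, p12          # what c1 loses, c2 gains, and vice versa
    A = np.array([[1.0 + dt * d12 / c[0], -dt * p12 / c[1]],
                  [-dt * p21 / c[0], 1.0 + dt * d21 / c[1]]])
    return np.linalg.solve(A, c)


def run(step, c0, dt, n_steps):
    traj = [np.array(c0, dtype=float)]
    for _ in range(n_steps):
        traj.append(step(traj[-1], dt))
    return np.array(traj)


dt, n_steps = 0.25, 7
euler = run(euler_step, (0.9, 0.1), dt, n_steps)
mp = run(modified_patankar_step, (0.9, 0.1), dt, n_steps)

for k in range(n_steps + 1):
    print(f"t={k * dt:4.2f}  euler={euler[k]}  mp={mp[k]}  exact={exact(k * dt)}")
```

Both trajectories keep the sum $c_1 + c_2$ constant; only the Euler trajectory leaves the positive range in its first step.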

3.3 Practical Applications

The ecological model presented here is an enhancement of the phosphorus cycle model proposed in Hongping and Jianyi (2002). All considered processes are depicted in Fig. 2. For the sake of simplicity only one biomass algae (BA) and respectively only one phosphorus content (PA) are included in Fig. 2. The remaining three biomasses and corresponding phosphorus can easily be included since their evolution is quite alike. The examined constituents of the biological system are summarized with their abbreviations and their units in Table 1.


[Figure 4: two panels, "Forward Euler scheme" (left) and "Modified Patankar scheme" (right); concentration versus t, with simulated c1, c2, c1 + c2 and analytical c1, c2]

Fig. 4 Numerical approximation ($\Delta t = 0.25$) and analytical solution of the simple linear system with the forward Euler scheme (left) and the (extended) modified Patankar scheme (right)

[Figure 5: two panels; curves PSref, PAref, PS, PA and Delta(PS + PA + PEO + PEI) over a time axis from 0 to 350]

Fig. 5 Results of the forward Euler scheme (left) and the Patankar scheme (right) concerning the second test case

[Figure 6: left panel with curves PSref, PAref, PS, PA and Delta(PS + PA + PEO + PEI) versus days; right panel with temperature [°C] of the water body (T) and the sediment (TE) and light intensity I [Lux] versus days]

Fig. 6 Results of the modified Patankar scheme (left) concerning the second test case. Distribution of sediment temperature (TE), water temperature (T) and light intensity (I) over the course of the year (right) with respect to the West Lake

The impact of the current on the different kinds of biomass and phosphorus constituents within the governing equations (2) is modeled as follows. Regarding the biochemical system, the flux $f^{c}_{m,p}$ describes the passive advective transport with the existing current for all constituents of the phosphorus cycle which are in the water body. Furthermore, the additional term $f^{\nu}_{m,p}$ includes diffusive effects for the solute components PS and PEI with appropriate diffusion coefficients $\mathrm{Diff}_{PS}$ and $\mathrm{Diff}_{PEI}$. The expansion of the right-hand side $q_p(U)$ is given by


Table 1 Constituents of the phosphorus cycle

Description                                                   Abbreviation   Unit
Solute phosphorus/PO4                                         PS             g/m^3
Phosphorus in detritus                                        PD             g/m^3
Inorganic and solute phosphorus in sediment                   PEI            g/m^3
Organic phosphorus in sediment                                PEO            g/m^3
Biomass of zooplankton                                        BZ             g/m^3
  and its content of phosphorus                               PZ             g/m^3
Biomasses of four different groups of algae species           BA_A-D         g/m^3
  with their respective content of phosphorus                 PA_A-D         g/m^3

\[ q_p(U) = \begin{pmatrix}
exchp - \sum_i uptba_i + minpd + zresp + \sum_i resp_i \\
-minpd - setpd + zmorp \\
-exchp + minpe \\
-minpe + setpd + \sum_i (setpa_i + gsinkp_i) \\
uptba_A - resp_A - setpa_A - gsinkp_A - assimp_A \\
uptba_B - resp_B - setpa_B - gsinkp_B - assimp_B \\
uptba_C - resp_C - setpa_C - gsinkp_C - assimp_C \\
uptba_D - resp_D - setpa_D - gsinkp_D - assimp_D \\
-zresp - zmorp + \sum_i assimp_i \\
\sum_i assim_i - zres - zmor \\
growth_A - res_A - sink_A - graz_A \\
growth_B - res_B - sink_B - graz_B \\
growth_C - res_C - sink_C - graz_C \\
growth_D - res_D - sink_D - graz_D
\end{pmatrix} \]
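A quick plausibility check of the source term is that its phosphorus rows (PS, PD, PEI, PEO, the four PA components, and PZ) cancel in the sum, so that the reaction part neither creates nor destroys phosphorus. The following is a minimal sketch of that check, assuming that every loss enters its row with a minus sign, as in the second test case of Sect. 3.2.1; the random rates are hypothetical placeholders and carry no ecological meaning.

```python
import random

random.seed(1)
GROUPS = ("A", "B", "C", "D")


def rate():
    """Hypothetical process rate, used only for the balance check."""
    return random.uniform(0.0, 1.0)


# Scalar processes.
exchp, minpd, minpe, setpd, zresp, zmorp = (rate() for _ in range(6))
# Processes defined per algae group.
uptba, resp, setpa, gsinkp, assimp = ({g: rate() for g in GROUPS} for _ in range(5))

rows = {
    "PS": exchp - sum(uptba.values()) + minpd + zresp + sum(resp.values()),
    "PD": -minpd - setpd + zmorp,
    "PEI": -exchp + minpe,
    "PEO": -minpe + setpd + sum(setpa[g] + gsinkp[g] for g in GROUPS),
    "PZ": -zresp - zmorp + sum(assimp.values()),
}
for g in GROUPS:
    rows["PA_" + g] = uptba[g] - resp[g] - setpa[g] - gsinkp[g] - assimp[g]

total = sum(rows.values())
print(f"net phosphorus source: {total:.2e}")
```

Every process appears once as a gain and once as a loss, so the total vanishes up to rounding for any choice of rates.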

A detailed description of the terms is presented in Appendix 4. For the proper computability of the above given equations the temperature of the water T and of the sediment TE must be supplied. Based on the air temperature stated in Weather Hangzhou (2008a, b, c), these temperatures are solutions of a one-dimensional heat equation for the proper distribution shown in Fig. 6. The figure also shows the assumed light intensity over the course of 1 year. For the rain q also annual developments are used from the same source of information. Most of the constants are adopted from Hongping and Jianyi (2002). All constants used are assembled in Table 2, “Appendix: Ecological Model.” Even though the model described above is based on Hongping and Jianyi (2002), it has many important new features. An extremely important fact is that the underlying flow field of the water and the coupled biological and chemical equations are solved simultaneously. This gives a much more detailed image of the modeled occurrences in the lake than any prediction made under the assumption of evenly distributed matter can hope for. Another major difference between the model proposed in Hongping and Jianyi (2002) and our model is the splitting of the phosphorus in sediment PE into organic phosphorus in the sediment PEO and inorganic phosphorus in the sediment PEI. A temperature-driven mineralization process from sedimented matter to solute phosphorus is introduced. As only the second form of phosphorus can contribute to the phosphorus supply in the water body, this offers a better reproduction of the phosphorus accumulation in sediment during the winter period. The fixation of phosphorus in the

Table 2 Description and numerical values for all constants (four values per row refer to the four algae groups i)

Constant     Value                          Unit     Description
Fmin         0.05                           g/m^3    Min feeding concentration of phytoplankton
Fs           0.25                           g/m^3    Menten feeding constant phytoplankton
GRmax        0.09                           1/d      Max graze rate of zooplankton
R_i          0.18, 1, 1, 1                  -        Preference factor for grazing i
DPS_i        0.027, 0.016, 0.016, 0.018     g/m^3    Menten constant for i phosphorus uptake
TOP_i        30, 23, 20, 21                 °C       Optimal temperature for growth i
K1           1.5                            1/d      Extinction coefficient of water
K2           1                              1/d      Extinction coefficient of phytoplankton
LOP_i        310, 340, 350, 310             Lux      Optimal light radiation for growth i
growthmax_i  3.3, 2.35, 2.43, 2.37          1/d      Max growth rate i
RO_i         0.007, 0.003, 0.003, 0.003     1/d      Respiration rate i
Kset_i       0.07, 0.1, 0.07, 0.07          m/d      Velocity of sedimentation i
PINmin_i     0.005, 0.005, 0.005, 0.005     g/m^3    Min content of phosphorus in i
PINmax_i     0.025, 0.016, 0.016, 0.017     g/m^3    Max content of phosphorus in i
PUPmax_i     0.015, 0.020, 0.015, 0.015     1/d      Max phosphorus uptake rate of i
mor          0.0004                         1/d      Death rate of zooplankton
ZRO          0.03                           1/d      Respiration rate of zooplankton
ZRM          0.04                           -        Respiration multiplier of zooplankton
Kex          0.03                           1/d      Phosphorus exchange coefficient
Km1          0.01                           1/d      Mineralization rate of PD
Sm1          0.8                            -        Temp. coeff. of mineralization rate PD
Km2          0.178                          1/d      Mineralization rate of PE
Sm2          1.08                           -        Temp. coeff. of mineralization rate PE
Vs           0.125                          m/d      Sedimentation rate
rd           0.38                           -        Percentage of organic phosphorus in detritus
utili_i      0.8, 0.8, 0.8, 0.8             -        Usage of feed material
TCOEF        0.38                           1/°C     Q10 coefficient


[Figure 7: phosphorus in g/m^3 (PE_I, PE_O, PS; left axis) and biomass algae in g/m^3 (BA; right axis) over the months January to December]

Fig. 7 Course of 1 year for phosphorus in water and sediment with one biomass algae

sediment is also an important process from the ecological point of view. Nevertheless, due to the missing pH and O2 concentration, the fixation is not considered in this model. Biomass of algae taken up by grazing is not completely assimilated. The model by Hongping and Jianyi assumes a complete assimilation. Our model expects a gain of 80 % of the grazed biomass for the zooplankton biomass and a loss of 20 %. A critical mathematical and also biological certainty is that all processes (with one exception: exchp) are positive. This is the mathematical formulation for the fact that processes are not reversible. For example, zooplankton grazing cannot result in a mass gain for the grazed algae, and respiration cannot end up with the algae gathering phosphorus from the water body. This should be modeled accordingly. Nevertheless, the original model allows sign changes, for example in $graz_i$, $uptba_i$ or assim. It also allows greater uptakes than the maximum uptake rate. These flaws are corrected by using equations oriented at the equations proposed in Hongping and Jianyi (2002) with the proper constraints. An example for this is the determination of $PC_i$. Since no measured data are available for a comparison with the numerical simulation, the results are valid only under the assumptions included in the model. To demonstrate the applicability of the scheme for the discussed model we show various numerical results. The first test case presents a course of 1 year after the computation of several years for the whole phosphorus cycle model. In Fig. 7, the changes of the phosphorus of one algae group and its biomass together with the solute phosphorus in the water body and the phosphorus in the sediment are presented. The scheme gives reasonable results for the change of the phosphorus content over the seasonal changes, as can be seen in Fig. 7. The phosphorus concentration PEO increases in spring. With the rising of the temperature the bacteria in the sediment start to convert the organic-bound phosphorus, which was accumulated over the winter, into inorganic solute phosphorus. Through the diffusive exchange between the solute phosphorus in water and sediment the solute phosphorus in the water increases and is directly consumed by the starting algae growth BA.


[Figure 8: contour plot of VABS over the lake domain, X and Y from 0 to 4000, color scale from 20000 to 320000]

Fig. 8 Absolute value of the velocity $v_{abs} = \|v\|$ in the West Lake

[Figure 9: contour plot of PS over the lake domain, X and Y from 0 to 4000, color scale from 0.0002 to 0.003]

Fig. 9 Solute phosphorus floating into the West Lake

At the end of the year, the phosphorus concentrations PEO and PS increase due to the death of the algae BA. PEO is augmented by the dead biological mass, PS through lesser consumption and exchange between PEI and PS. As a second test case, shown in Figs. 8 and 9, solute phosphorus inflow has been examined. It was assumed that all boundaries of the West Lake except the small channels at the south end (lower left part of the figure) and in the north-eastern corner (upper right part of the figure) are fixed walls, i.e., no particles can pass these boundaries. The southern boundary is considered the inflow, the north-eastern boundary the outflow. For a clear visualization of the proportions of the current, high inflow velocities have been assumed. One can observe the curvature of the current in the lake after the straight inflow following


the structure of the lake. Then the current splits into a western and an eastern part. The west flowing current circles back into the lake behind the three islands and through the small openings into the nearly cut off western part of the lake. The eastern directed part of the flow pours out of the West Lake. The West Lake was assumed to contain only traces of phosphorus PS at the beginning of the calculation. This allows the undisturbed observation of the distribution of in-flowed phosphorus. One can see clearly how the phosphorus is distributed. High concentrations of phosphorus are found directly along the strong currents, the inflow area and also the triangular dead area along the south eastern boundary of the lake. Also one can note that through the absence of a strong current into the nearly cut off western part of the lake the phosphorus is entering very slowly.

4 Conclusion

A complex phosphorus cycle containing four different groups of algae species and their respective phosphorus content was presented. Furthermore, a higher order finite volume scheme was introduced. Through various numerical results the scheme has proven to be able to solve simultaneously a real-life fluid dynamics problem and a sophisticated system of positive and conservative biological and chemical equations describing the phosphorus cycle. The task is accomplished via a splitting ansatz combining higher order flux solving methods with problem-adapted ordinary differential equation techniques, the extended modified Patankar ansatz. Constructed in this way, the scheme preserves the important characteristics of conservativity and positivity necessary to obtain meaningful numerical results. Thus state-of-the-art modeling and numerical techniques have been successfully combined in the context of eutrophication modeling of a lake.

Acknowledgments

First of all, we would like to express our thanks to W. Freeden for initiating this Handbook of Geomathematics and giving us the possibility to participate. Several mathematical parts of this chapter are developed in a joint work with H. Burchard and E. Deleersnijder. A. Meister is grateful for this productive cooperation. Furthermore, A. Meister would like to thank Th. Sonar for many years of fruitful collaborations in the field of computational fluid dynamics.

Appendix: Ecological Model

In order to give a deeper insight into the structure of the source term $q_p(u_s, u_p)$, we now concentrate on the specific expression for the biomass of the $i$th algae group:

1. Temperature impact on growth of algae group $i$: $\mathit{TT}_i = \dfrac{T}{\mathit{TOP}_i}\, e^{(\mathit{TOP}_i - T)/\mathit{TOP}_i}$.
2. Phosphorus availability factor for algae group $i$: $\mathit{PP}_i = \dfrac{\mathit{PS}}{\mathit{DPS}_i + \mathit{PS}}$.
3. Light extinction: $K = K_1 + K_2 \sum_j \mathit{BA}_j$.
4. Depth averaged available light intensity: $L = I\, \dfrac{1 - e^{-KH}}{KH}$.
5. Light intensity impact on growth of algae group $i$: $\mathit{LL}_i = \dfrac{L}{\mathit{LOP}_i}\, e^{(\mathit{LOP}_i - L)/\mathit{LOP}_i}$.
6. Growth of algae $i$: $\mathit{growth}_i = \mathit{growthmax}_i\, \mathit{TT}_i\, \mathit{PP}_i\, \mathit{LL}_i\, \mathit{BA}_i$.
7. Respiration loss of biomass of algae group $i$: $\mathit{res}_i = \mathit{RO}_i\, e^{\mathit{TCOEF} \cdot T}\, \mathit{BA}_i$.
8. Sinking, i.e., dying, of biomass of algae group $i$: $\mathit{sink}_i = \mathit{BA}_i\, \dfrac{\mathit{Kset}_i}{H}$.
9. Total food preference weighted biomass algae: $F = \sum_j R_j\, \mathit{BA}_j$.
10. Available-food factor for zooplankton:
\[ \mathit{FF} = \begin{cases} 0, & F_{min} > F, \\ \dfrac{F - F_{min}}{F_s + F - F_{min}}, & F_{min} \le F. \end{cases} \]
11. Grazing of biomass algae $i$ through zooplankton: $\mathit{graz}_i = \mathit{GR}_{max}\, \mathit{FF}\, \dfrac{R_i\, \mathit{BA}_i}{F}\, \mathit{BZ}$.

The biomass zooplankton dynamics are given by

1. Biomass gain for zooplankton through grazing algae $i$: $\mathit{assim}_i = \mathit{utili}_i\, \mathit{graz}_i$.
2. Respiration of zooplankton: $\mathit{zres} = \mathit{ZRO}\, e^{\mathit{TCOEF} \cdot T}\, \mathit{BZ} + \mathit{ZRM} \sum_j \mathit{graz}_j$.
3. Death rate of zooplankton: $\mathit{zmor} = \mathit{mor} \cdot \mathit{BZ}$.
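The growth and food factors above are simple closed-form expressions and can be evaluated directly. The following is a minimal sketch for the first algae group, assuming the constants as listed in Table 2; the water depth, temperature, light intensity, and concentrations in the usage line are hypothetical inputs chosen only for illustration, and the function names are not taken from the chapter.

```python
import math

# Constants for the first algae group, assumed as listed in Table 2.
PAR = {"TOP": 30.0, "LOP": 310.0, "DPS": 0.027, "growthmax": 3.3,
       "K1": 1.5, "K2": 1.0, "Fmin": 0.05, "Fs": 0.25}


def optimum_curve(x, x_opt):
    """(x / x_opt) * exp((x_opt - x) / x_opt); equals 1 at x = x_opt."""
    return x / x_opt * math.exp((x_opt - x) / x_opt)


def depth_averaged_light(I, K, H):
    """L = I * (1 - exp(-K * H)) / (K * H)."""
    return I * (1.0 - math.exp(-K * H)) / (K * H)


def growth(T, I, H, PS, BA_all, BA_i, par=PAR):
    """growth_i = growthmax_i * TT_i * PP_i * LL_i * BA_i."""
    K = par["K1"] + par["K2"] * sum(BA_all)      # light extinction
    L = depth_averaged_light(I, K, H)
    TT = optimum_curve(T, par["TOP"])            # temperature impact
    LL = optimum_curve(L, par["LOP"])            # light impact
    PP = PS / (par["DPS"] + PS)                  # phosphorus availability
    return par["growthmax"] * TT * PP * LL * BA_i


def food_factor(F, par=PAR):
    """Available-food factor FF for zooplankton."""
    if par["Fmin"] > F:
        return 0.0
    return (F - par["Fmin"]) / (par["Fs"] + F - par["Fmin"])


# Hypothetical state: 20 degrees C, 500 Lux, 2 m depth, four algae groups.
g = growth(T=20.0, I=500.0, H=2.0, PS=0.002,
           BA_all=[0.1, 0.05, 0.05, 0.05], BA_i=0.1)
print(g, food_factor(0.30))
```

The temperature and light factors peak at 1 for the optimal value and fall off on both sides, so the product can only reduce the maximal growth rate.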


The phosphorus dynamics are given as:

1. Diffusive exchange between the solute phosphorus in sediment and water: $\mathit{exchp} = K_{ex}\, (\mathit{PEI} - \mathit{PS})$.
2. Mineralization of PD: $\mathit{minpd} = K_{m1}\, S_{m1}^{\,T - 20}\, \mathit{PD}$.
3. Mineralization from PEO to PEI: $\mathit{minpe} = K_{m2}\, S_{m2}^{\,TE - 20}\, \mathit{PEO}$.
4. Sedimentation of PD: $\mathit{setpd} = \dfrac{\mathit{PD}}{H}\, (1 - r_d)\, V_s$.

The exchange dynamics between phosphorus in living cells and the phosphorus without associated biomass are:

1. Phosphorus fraction from biomass of algae group $i$: $\mathit{PCON}_i = \dfrac{\mathit{PA}_i}{\mathit{BA}_i}$.
2. Influence of the phosphorus already contained in the biomass of algae $i$ for its growth:
\[ \mathit{PC}_i = \begin{cases} 1, & \mathit{PCON}_i < \mathit{PINmin}_i, \\ 0, & \mathit{PCON}_i > \mathit{PINmax}_i, \\ \dfrac{\mathit{PINmax}_i - \mathit{PCON}_i}{\mathit{PINmax}_i - \mathit{PINmin}_i}, & \text{otherwise.} \end{cases} \]
3. Phosphorus uptake from algae group $i$: $\mathit{uptba}_i = \mathit{PUPmax}_i\, \mathit{PC}_i\, \mathit{BA}_i\, \mathit{PP}_i$.
4. Respiration loss of phosphorus of algae $i$: $\mathit{resp}_i = \mathit{res}_i\, \mathit{PCON}_i$.
5. Sinking, i.e., dying, of phosphorus of algae group $i$: $\mathit{setpa}_i = \mathit{sink}_i\, \mathit{PCON}_i$.
6. Phosphorus loss of algae group $i$ through zooplankton grazing: $\mathit{grazp}_i = \mathit{graz}_i\, \mathit{PCON}_i$.
7. Biomass loss of algae group $i$ through zooplankton grazing not assimilated by zooplankton: $\mathit{gsink}_i = (1 - \mathit{utili}_i)\, \mathit{graz}_i$.
8. PD gain through zooplankton grazing algae group $i$: $\mathit{gsinkp}_i = \mathit{gsink}_i\, \mathit{PCON}_i$.
9. Phosphorus gain for zooplankton through grazing algae group $i$: $\mathit{assimp}_i = \mathit{assim}_i\, \mathit{PCON}_i$.
10. Phosphorus fraction from biomass of zooplankton: $\mathit{PCON}_z = \dfrac{\mathit{PZ}}{\mathit{BZ}}$.
11. Respiration amount from zooplankton: $\mathit{zresp} = \mathit{zres}\, \mathit{PCON}_z$.
12. PZ loss through mortality: $\mathit{zmorp} = \mathit{zmor}\, \mathit{PCON}_z$.
13. PZ loss through respiration: $\mathit{zresp} = \mathit{zres}\, \mathit{PCON}_z$.
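The bounded uptake and the split of grazed phosphorus can be illustrated with a short example. The following is a minimal sketch; the function names and all numerical inputs are hypothetical and serve only to show that the uptake factor stays between 0 and 1 and that the assimilated and non-assimilated parts add up to the grazed phosphorus.

```python
def phosphorus_limitation(pcon, pin_min, pin_max):
    """Piecewise linear factor PC_i: 1 below pin_min, 0 above pin_max."""
    if pcon < pin_min:
        return 1.0
    if pcon > pin_max:
        return 0.0
    return (pin_max - pcon) / (pin_max - pin_min)


def uptake(PA, BA, PS, pup_max, dps, pin_min, pin_max):
    """uptba_i = PUPmax_i * PC_i * BA_i * PP_i."""
    pcon = PA / BA                      # phosphorus fraction of the biomass
    pp = PS / (dps + PS)                # phosphorus availability factor
    return pup_max * phosphorus_limitation(pcon, pin_min, pin_max) * BA * pp


def grazing_split(graz, pcon, utili):
    """Phosphorus flows caused by grazing of one algae group."""
    assim = utili * graz                # assimilated biomass
    gsink = (1.0 - utili) * graz        # grazed but not assimilated biomass
    return {"grazp": graz * pcon, "assimp": assim * pcon, "gsinkp": gsink * pcon}


# Illustrative values only.
print(phosphorus_limitation(0.015, 0.005, 0.025))
print(uptake(PA=0.001, BA=0.1, PS=0.002, pup_max=0.015, dps=0.027,
             pin_min=0.005, pin_max=0.025))
flows = grazing_split(graz=0.02, pcon=0.01, utili=0.8)
print(flows)
```

Because the factor vanishes once the cell content reaches its maximum, the uptake can never exceed the maximum uptake rate times the available biomass.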

References

Ansorge R, Sonar Th (2009) Mathematical models of fluid dynamics. Wiley-VCH, New York
Audusse E, Bristeau M-O (2005) A well-balanced positivity preserving second-order scheme for shallow water flows on unstructured meshes. J Comput Phys 206:311–333
Barth TJ, Jespersen DC (1989) The design and application of upwind schemes on unstructured meshes. AIAA paper 89-0366
Benz J, Meister A, Zardo PA (2009) A conservative, positivity preserving scheme for advection-diffusion-reaction equations in biochemical applications. In: Tadmor E, Liu J-G, Tzavaras AE (eds) Hyperbolic problems. Proceedings of symposia in applied mathematics. American Mathematical Society, Providence
Berzins M (2001) Modified mass matrices and positivity preservation for hyperbolic and parabolic PDEs. Commun Numer Methods Eng 9:659–666
Broekhuizen N, Rickard GJ, Bruggeman J, Meister A (2008) An improved and generalized second order, unconditionally positive, mass conserving integration scheme for biochemical systems. Appl Numer Math 58:319–340
Bruggeman J, Burchard H, Kooi BW, Sommeijer B (2007) A second-order, unconditionally positive, mass-conserving integration scheme for biochemical systems. Appl Numer Math 57:36–58
Burchard H, Deleersnijder E, Meister A (2003) A high-order conservative Patankar-type discretisation for stiff systems of production-destruction equations. Appl Numer Math 47:1–30
Burchard H, Deleersnijder E, Meister A (2005) Application of modified Patankar schemes to stiff biogeochemical models of the water column. Ocean Dyn 55(3–4):326–337
Burchard H, Bolding K, Kühn W, Meister A, Neumann T, Umlauf L (2006) Description of a flexible and extendable physical-biogeochemical model system for the water column. J Mar Syst 61:180–211
Chertock A, Kurganov A (2008) A second-order positivity preserving central upwind scheme for Chemotaxis and Haptotaxis models. Numer Math 111:169–205
Friedrich O (1993) A new method for generating inner points of triangulations in two dimensions. Comput Methods Appl Mech Eng 104:77–86
Hirsch C (1988a) Numerical computation of internal and external flows, vol 1. Wiley, New York
Hirsch C (1988b) Numerical computation of internal and external flows, vol 2. Wiley, New York
Hongping P, Jianyi M (2002) Study on the algal dynamic model for West Lake, Hangzhou. Ecol Model 148:67–77
Jørgensen SE (1975) A eutrophication model for a lake. Ecol Model 2:147–165
Lampert W, Sommer U (1999) Limnoökologie. Georg Thieme, Stuttgart
LeVeque RJ (1990) Numerical methods for conservation laws. Birkhäuser, Boston
Meister A (1998) Comparison of different Krylov subspace methods embedded in an implicit finite volume scheme for the computation of viscous and inviscid flow fields on unstructured grids. J Comput Phys 140:311–345
Meister A (2003) Viscous flow fields at all speeds: analysis and numerical simulation. J Appl Math Phys 54:1010–1049
Meister A, Oevermann M (1998) An implicit finite volume approach of the k-ε turbulence model on unstructured grids. ZAMM 78(11):743–757
Meister A, Sonar Th (1998) Finite-volume schemes for compressible fluid flow. Surv Math Ind 8:1–36
Meister A, Vömel C (2001) Efficient preconditioning of linear systems arising from the discretization of hyperbolic conservation laws. Adv Comput Math 14(1):49–73
Park RA et al (1974) A generalized model for simulating lake ecosystems. Simulation 21:33–50
Patankar SV (1980) Numerical heat transfer and fluid flow. McGraw-Hill, New York
Poethke H-J (1994) Analysieren, Verstehen und Prognostizieren. PhD thesis, Johannes Gutenberg-Universität Mainz, Mainz
Ricchiuto M, Bollermann A (2009) Stabilized residual distribution for shallow water simulations. J Comput Phys 228(4):1071–1115
Sagehashi M, Sakoda A, Suzuki M (2000) A predictive model of long-term stability after biomanipulation of shallow lakes. Water Res 34(16):4014–4028
Schwoerbel J, Brendelberger H (2005) Einführung in die Limnologie. Elsevier/Spektrum Akademischer, Munich
Smolarkiewicz PK (2006) Multidimensional positive definite advection transport algorithm: an overview. Int J Numer Methods Fluids 50:1123–1144
Sonar Th (1997a) On the construction of essentially non-oscillatory finite volume approximations to hyperbolic conservation laws on general triangulations: polynomial recovery, accuracy, and stencil selection. Comput Methods Appl Mech Eng 140:157–181
Sonar Th (1997b) Mehrdimensionale ENO-Verfahren. Teubner, Stuttgart
Stoker JJ (1957) Water waves. Interscience Publisher, New York
Straškraba M, Gnauk A (1985) Freshwater ecosystems. Elsevier, Amsterdam
Toro EF (1999) Riemann solvers and numerical methods for fluid dynamics. Springer, Berlin
Toro EF (2001) Shock-capturing methods for free-surface shallow flows. Wiley, New York
Vater S (2004) A new projection method for the zero Froude number shallow water equations. Master’s thesis, Freie Universität Berlin, Berlin
Vázquez-Cendón M-E (2007) Depth averaged modelling of turbulent shallow water flow with wet-dry fronts. Arch Comput Methods Eng 14(3):303–341
Weather Hangzhou (2008a) Internet, May 23. http://www.chinatoday.com.cn/english/chinatours/hangzhou.htm
Weather Hangzhou (2008b) Internet, May 23. http://www.ilec.or.jp/database/asi/asi-53.html
Weather Hangzhou (2008c) Internet, May 23. http://www.chinatoday.com.cn/english/chinatours/hangzhou.htm
Zardo PA (2005) Konservative und positive Verfahren für autonome gewöhnliche Differentialgleichungssysteme. Master’s thesis, University of Kassel

Page 27 of 27

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_24-3 © Springer-Verlag Berlin Heidelberg 2015

Noise Models for Ill-Posed Problems

Paul N. Eggermont (a), Vincent LaRiccia (a) and M. Zuhair Nashed (b)

(a) Food and Resource Economics, University of Delaware, Newark, DE, USA
(b) Department of Mathematics, University of Central Florida, Orlando, FL, USA

Abstract

The standard view of noise in ill-posed problems is that it is either deterministic and small (strongly bounded noise) or random and large (not necessarily small). Following Eggermont, LaRiccia and Nashed (2009), a new noise model is investigated, wherein the noise is weakly bounded. Roughly speaking, this means that local averages of the noise are small. A precise definition is given in a Hilbert space setting, and Tikhonov regularization of ill-posed problems with weakly bounded noise is analysed. The analysis unifies the treatment of “classical” ill-posed problems with strongly bounded noise with that of ill-posed problems with weakly bounded noise. Regularization parameter selection is discussed, and an example on numerical differentiation is presented.

1 Introduction

The key feature of ill-posed problems is the lack of robustness of solutions with respect to noise in the data. Whereas for well-posed problems it may be acceptable to ignore the effects of the noise in the data, to do so for ill-posed problems would be disastrous. Consequently, one must devise approximate solution schemes to implement the trade-off between fidelity to the data and noise amplification. How this is done depends on how one models the noise.

There are two predominant views in the literature. In the classical treatment of ill-posed problems, dating back to Tikhonov (1943, 1963) and Phillips (1962), one assumes that the noise is small, say with a signal-to-noise ratio of 95 % or more. In other words, the noise is deterministic, not random. One then devises approximate solution procedures and studies their behavior when the noise tends to 0, that is, when the signal-to-noise ratio tends to 1. For summaries of these theories, see Tikhonov and Arsenin (1977), Groetsch (1984), Morozov (1984), Tikhonov et al. (1995), Engl et al. (1996), and Kirsch (1996). Of course, there are many settings where the noise is not small in the usual sense. The premier examples of these are those involving high frequency noise, which abound in the geosciences (Tronicke 2007; Duan et al. 2008 and references therein).

On the other hand, it was realized early on in the development of ill-posed problems (Sudakov and Khalfin 1964; Franklin 1970) that noise is often random, and that probabilistic methods must be used for its analysis. Typically, one shows by way of exponential inequalities that the probabilities that the error in the approximate solution exceeds certain small levels are themselves small. See, e.g., Mair and Ruymgaart (1996), Cavalier et al. (2002), Cavalier and Golubev (2006), and Bissantz et al. (2007), to name a few. Another approach is based on the Bayesian point of view but is not considered further (e.g., Kaipio and Somersalo 2005).
It is then not surprising that the classical and probabilistic approaches are at odds, not only in assumptions and conclusions, but also in technique. As a way to narrow this gap, a weak model is proposed for the noise, in which local averages of the noise are small in the classical sense. This covers small low frequency noise as well as large high frequency noise, and almost covers white noise and discrete random noise. However, the precise notion of local averages must be more or less precisely matched with the particular ill-posed problem at hand. The approach is worked out for linear ill-posed problems and is illustrated on numerical differentiation.

2 Noise in Well-Posed and Ill-Posed Problems

Noise is pervasive in scientific data analysis and processing. In scientific computing, an important notion is that of the propagation of errors, whether internal (round-off) errors due to finite precision arithmetic or external, such as errors (noise) in the initial data, assuming infinite precision arithmetic. The scientific computation problem under consideration determines the minimal level of noise amplification; optimal algorithms are the ones that achieve the minimum amplification of errors. However, all of this depends very much on the precise properties of the noise. In fact, these properties determine whether the (inverse) problem at hand is well-posed or ill-posed. The notions of well-posed and ill-posed problems are discussed and elaborated in the context of integral equations of the first and of the second kind.

2.1 Well-Posed and Ill-Posed Problems

In this chapter, and several other chapters in this volume, the interest is in solving “problems” with noisy data, abstractly formulated as follows. Let $X$ and $Y$ be Hilbert spaces with inner products and norms denoted by $\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$, and $\|\cdot\|_X$, $\|\cdot\|_Y$. Let $F : D(F) \subset X \to Y$ be a mapping of the domain $D(F)$ in $X$ to $Y$, and suppose “data” $y \in Y$ is given such that

$$y = F(x_o) + \eta, \tag{1}$$

for some unknown $x_o \in D(F)$, where $\eta \in Y$ is noise. The objective is to recover (estimate) $x_o$ given the data $y$. The natural, possibly naive, recovery scheme is to find $x \in D(F)$ such that

$$F(x) = y. \tag{2}$$

The question is then whether this is a “nice” problem. The obvious desirable requirements for a problem to be nice are that the problem should have one and only one solution. If there is no solution, then one is in trouble, but one is likewise in trouble if there are many (more than one) solutions, because how is one supposed to choose one? There is one more requirement for a problem to be “nice” and it has to do with the presence of the noise $\eta$. The requirement is that if the noise is “small,” then the solution of the problem should be close to the “true” $x_o$. That is, the solution should be robust with respect to small perturbations in the data. The first author to recognize the significance of all this was Hadamard (1902) in the context of initial/boundary-value problems for partial differential equations. Adapted to (2), his definition of a “nice” problem may be stated as follows.


Definition 1. The problem (2) is well-posed if the solution $x$ exists and is unique and depends continuously on $y$, that is, for all $x_o \in X$,

$$\forall\, \varepsilon > 0 \ \ \exists\, \delta > 0 : \quad \|y - F(x_o)\|_Y < \delta \ \Longrightarrow\ \|x - x_o\|_X < \varepsilon. \tag{3}$$

If any of these three conditions is not satisfied, then the problem (2) is ill-posed. What is one to do if any of the three conditions for well-posedness are violated? It turns out that violations of the third condition are the most problematic.

1. If the problem (2) does not have a solution, it is accepted practice to change the notion of a solution. The standard one is to consider least-squares solutions in the sense of

$$\text{minimize} \quad \|F(x) - y\|_Y^2 \quad \text{subject to} \quad x \in D(F). \tag{4}$$

Borrowing from partial differential equations, the notion of a weak solution suggests finding $x \in D(F)$ such that

$$\langle z, F(x) - y\rangle_Y = 0 \tag{5}$$

for all $z$ in some subset of $Y$.

2. If the problem (2) has more than one solution, it may be acceptable (or necessary) to impose extra conditions on the solution. A standard approach is to find the minimum-norm least-squares solution by solving

$$\text{minimize} \quad \|x\|_X \quad \text{subject to} \quad x \text{ minimizes } \|F(x) - y\|_Y. \tag{6}$$

3. If the solution exists and is unique, then the question of continuous dependence makes sense. If this requirement is violated, there are two, no three, options. Change the way differences in the data are measured, change the way differences in the solutions are measured, or both. The cheating way would be to declare the data vectors $y$ and $z$ to be “close” if the solutions of $F(x) = y$ and $F(x) = z$ are close. However, the way differences in the data are measured is determined by the properties of the (idealized) measuring device as well as the properties of the noise (perturbations in the data external or internal to the measuring device). Changing this so that the problem becomes well-posed is probably not acceptable. Apparently the only way to restore the continuous dependence of the solution on the data is by restricting the set of “admissible” solutions (Tikhonov 1943, 1963). Here too there are two approaches. In the first scenario, one has some a priori information on the solution, such as it belonging to some (compact) set of smooth elements of $X$, say $\|x\|_Z \le C$ for some constant $C$, where $\|\cdot\|_Z$ is some stronger norm on a dense subset of $X$. In the random setting, this is the Bayesian approach (Kaipio and Somersalo 2005). Alternatively, one may just have to be satisfied with constructing a smooth approximation to the true solution. It should be mentioned that an alternative to smooth approximations is finite-dimensional approximation (Grenander 1981; Natterer 1977).


2.2 Fredholm Integral Equations with Strong Noise

The notions of well- and ill-posedness are explored in the concrete setting of some specific Fredholm integral equations of the first and second kind. To avoid confusing the issues, a completely understood one-dimensional example set in the Hilbert space $L^2(0,1)$ of square integrable functions on the interval $[0,1]$ is considered. The inner product and norm of $L^2(0,1)$ are denoted by $\langle\cdot,\cdot\rangle_{L^2(0,1)}$ and $\|\cdot\|_{L^2(0,1)}$. Let $k$ be defined on $[0,1] \times [0,1]$ by

$$k(s,t) = s \wedge t - st, \qquad s, t \in [0,1], \tag{7}$$

where $s \wedge t = \min(s,t)$, and define the operator $K : L^2(0,1) \to L^2(0,1)$ by

$$[Kx](s) = \int_0^1 k(s,t)\, x(t)\, dt, \qquad s \in [0,1]. \tag{8}$$

In fact, $K$ is the Green’s function operator for the two-point boundary-value problem

$$-u'' = x \ \text{ in } (0,1), \qquad u(0) = u(1) = 0. \tag{9}$$

So, the solution of (9) is given by $u = Kx$. The operator $K$ is compact and the Fredholm equation of the first kind

$$Kx = y \tag{10}$$

is an ill-posed problem. Consider the data $y \in L^2(0,1)$ following the model

$$y = Kx_o + \eta, \tag{11}$$

for some unknown $x_o \in L^2(0,1)$ and noise $\eta$ with

$$\|\eta\|_{L^2(0,1)} = \delta, \tag{12}$$

with $\delta$ “small”. Taking for example $x_o = 0$ and

$$\eta(t) = (\pi k)^{-2} \sin(\pi k t), \qquad 0 \le t \le 1, \tag{13}$$

where $k \in \mathbb{N}$ is “large,” one has $\|\eta\|_{L^2(0,1)} = (\pi k)^{-2}/\sqrt{2}$, and the solution of (10) and (11) is $x = (\pi k)^2 \eta$, because $\sin(\pi k t)$ is an eigenfunction of $K$ with eigenvalue $(\pi k)^{-2}$. Hence $\|x - x_o\|_{L^2(0,1)} / \|\eta\|_{L^2(0,1)} = (\pi k)^2$ is unbounded as $k \to \infty$. This shows that any inequality of the form

$$\|x - x_o\|_{L^2(0,1)} \le c\, \|\eta\|_{L^2(0,1)},$$

with $c$ not depending on $\eta$, must fail. Consequently, $x$ does not depend continuously on the data, and the problem is ill-posed. For more on the ill-posed aspects of Fredholm integral equations of the first kind as well as equations of the second kind, see, e.g., Kress (1999).

Now consider integral equations of the second kind. Consider the data $y$ according to the model


$$y = x_o + Kx_o + \eta, \tag{14}$$

with the operator $K$ as before, with some unknown $x_o \in L^2(0,1)$, and with $\eta$ the noise. If it is assumed that $\eta$ is small as in (12), then the problem

$$\text{find } x \text{ such that} \quad x + Kx = y \tag{15}$$

is well-posed: The solution $x$ exists and is unique, and it satisfies

$$\|x - x_o\|_{L^2(0,1)} \le \|\eta\|_{L^2(0,1)}. \tag{16}$$

This may be seen as follows. First, it is readily seen that $K$ is positive-definite. Since $u = Kx$ solves (9), then by integration by parts

$$\langle x, Kx\rangle_{L^2(0,1)} = \langle -u'', u\rangle_{L^2(0,1)} = \|u'\|_{L^2(0,1)}^2 > 0, \tag{17}$$

unless $u$ is a constant function, which must then be identically zero, since it must vanish at $t = 0$ and $t = 1$. Thus, $K$ is positive-definite. Then on the one hand

$$\langle x - x_o,\ x - x_o + K(x - x_o)\rangle_{L^2(0,1)} \ge \|x - x_o\|_{L^2(0,1)}^2, \tag{18}$$

since $K$ is positive-definite, and on the other hand, since $x - x_o + K(x - x_o) = \eta$ by (14) and (15),

$$\langle x - x_o,\ x - x_o + K(x - x_o)\rangle_{L^2(0,1)} = \langle x - x_o, \eta\rangle_{L^2(0,1)} \le \|x - x_o\|_{L^2(0,1)}\, \|\eta\|_{L^2(0,1)}, \tag{19}$$

so that

$$\|x - x_o\|_{L^2(0,1)}^2 \le \|x - x_o\|_{L^2(0,1)}\, \|\eta\|_{L^2(0,1)}, \tag{20}$$

and the conclusion follows. Consequently, the problem (15) is well-posed. Thus, all is well: Fredholm integral equations of the first kind are ill-posed, and Fredholm integral equations of the second kind are well-posed.
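The eigen-relation behind the example (13) can be checked numerically. The following is a minimal sketch, assuming a simple midpoint quadrature for the integral in (8); the function names `green_kernel`, `apply_K`, and `amplification` are illustrative and not part of the chapter.

```python
import math


def green_kernel(s, t):
    """Kernel k(s, t) = min(s, t) - s*t of Eq. (7)."""
    return min(s, t) - s * t


def apply_K(x, s, n=2000):
    """Approximate [Kx](s) = int_0^1 k(s, t) x(t) dt by the midpoint rule."""
    h = 1.0 / n
    return h * sum(green_kernel(s, (j + 0.5) * h) * x((j + 0.5) * h)
                   for j in range(n))


def amplification(k):
    """Ratio ||x - x_o|| / ||eta|| for the noise (13) with x_o = 0."""
    return (math.pi * k) ** 2


if __name__ == "__main__":
    k = 3
    mode = lambda t: math.sin(math.pi * k * t)
    s = 0.3
    # K sin(pi k t) should be close to (pi k)^-2 sin(pi k s).
    print(apply_K(mode, s), math.sin(math.pi * k * s) / (math.pi * k) ** 2)
    # The error amplification of the first-kind problem grows like k^2.
    print([amplification(j) for j in (1, 10, 100)])
```

The growth of `amplification(k)` with `k` is the failure of continuous dependence for the first-kind equation; for the second-kind equation the corresponding factor $1/(1 + (\pi k)^{-2})$ never exceeds 1, in line with (16).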

2.3 Super-Strong and Weak Noise Models

The noise model in the previous section is the classical or strong noise model for noise in $L^2(0,1)$, viz.

$$\|\eta\|_{L^2(0,1)} = \delta, \tag{21}$$

with $\delta$ “small”. Here the discussion centers on how changing this assumption affects the well/ill-posedness of the problems under consideration. Consider again the Fredholm integral equation of the first kind, $Kx = y$, with $K$ as the Green’s function operator for the boundary-value problem (9). As said, $K : L^2(0,1) \to L^2(0,1)$ is compact, but much more is true. Let $H_0^2(0,1)$ be the Sobolev space of square integrable functions


on the interval $(0,1)$ which vanish at the endpoints, and with square integrable (weak) first and second derivatives. The norm on $H_0^2(0,1)$ is

$$|u|_{H^2(0,1)} = \|u''\|_{L^2(0,1)}. \tag{22}$$

Then, since $|Kx|_{H^2(0,1)} = \|x\|_{L^2(0,1)}$, the mapping $K : L^2(0,1) \to H_0^2(0,1)$ is a homeomorphism, and the inverse mapping $K^{-1} : H_0^2(0,1) \to L^2(0,1)$ exists and is a bounded linear transformation. Now consider the data $y$ according to the model

$$y = Kx_o + \eta \tag{23}$$

for some unknown $x_o \in L^2(0,1)$ and noise $\eta$, but now with

$$|\eta|_{H^2(0,1)} = \delta, \tag{24}$$

with $\delta$ “small.” This is referred to as a super-strong noise model. Then the solution of the problem

$$\text{find } x \text{ such that} \quad Kx = y \tag{25}$$

satisfies

$$\|x - x_o\|_{L^2(0,1)} = \|K^{-1}\eta\|_{L^2(0,1)} = |\eta|_{H^2(0,1)}, \tag{26}$$

thus showing that (25) is well-posed.

Now consider integral equations of the second kind,

$$x + Kx = y, \tag{27}$$

with the data $y$ following the model

$$y = x_o + Kx_o + \eta, \tag{28}$$

with the operator $K$ as before, with some unknown $x_o \in L^2(0,1)$, and with $\eta$ the noise. Note that so far, the noise is assumed to be small. As such, it does not represent noise as one usually understands it: high frequency or random (without necessarily implying a firm probabilistic foundation). In this interpretation, the size of the noise is then better measured by the size of averages of the noise over small intervals (but not too small) or by the size of smoothed versions of the noise. So, the size of the noise must then be measured by something like

$$\delta = \|S\eta\|_{L^2(0,1)} \tag{29}$$

for a suitable compact smoothing operator $S : L^2(0,1) \to L^2(0,1)$. Such models are referred to as weak noise models. (Note that if $\eta$ converges weakly to 0, then $S\eta$ converges strongly to 0 since $S$ is compact.) Now comes the crux of the matter regarding the problem (27). Since the solution is given by $x = (I + K)^{-1} y$, then $x - x_o = (I + K)^{-1}\eta$, but any inequality of the form


$$\|(I + K)^{-1}\eta\|_{L^2(0,1)} \le c\, \|S\eta\|_{L^2(0,1)}, \tag{30}$$

with the constant $c$ not depending on $\eta$, must necessarily fail (otherwise $\|\eta\|_{L^2(0,1)} \le c\, \|I + K\|\, \|S\eta\|_{L^2(0,1)}$ for all $\eta$, which is impossible for a compact operator $S$ on an infinite-dimensional space). Consequently, there is no constant $c$ such that

$$\|x - x_o\|_{L^2(0,1)} \le c\, \|S\eta\|_{L^2(0,1)}. \tag{31}$$

It follows that $x$ does not depend continuously on the data. In other words, in the weak noise setting the problem (27) is ill-posed. In summary, everything is turned upside down: The Fredholm integral equation of the first kind turned out to be well-posed, and the Fredholm integral equation of the second kind is ill-posed. (Of course, this actually shows that the designations of Fredholm integral equations as being of the first or second kind are not the most informative with regard to their well- or ill-posedness.) In the remainder of this chapter, a precise version of the weak noise model is studied.
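The failure of (30) can be made concrete on single sine modes. The sketch below rests on two assumptions that are not taken from the chapter: the noise is the unit-norm mode $\eta_k(t) = \sqrt{2}\sin(\pi k t)$, and the smoothing operator is chosen as $S = K$ purely for illustration. On such a mode $K$ acts as multiplication by $(\pi k)^{-2}$.

```python
import math


def sine_mode_norms(k):
    """Norms attached to the unit-norm noise eta_k(t) = sqrt(2) sin(pi k t).

    K has eigenvalue lam = (pi k)^-2 on this mode. As an illustrative
    assumption, the smoothing operator S in (29) is taken to be K itself.
    """
    lam = 1.0 / (math.pi * k) ** 2
    strong = 1.0                 # ||eta_k||             (strong noise level)
    weak = lam                   # ||S eta_k||, S = K    (weak noise level)
    error = 1.0 / (1.0 + lam)    # ||(I + K)^{-1} eta_k|| = ||x - x_o||
    return strong, weak, error


if __name__ == "__main__":
    for k in (1, 10, 100):
        strong, weak, error = sine_mode_norms(k)
        # error / weak plays the role of the constant c in (30).
        print(k, weak, error, error / weak)
```

As `k` grows, the weak noise level tends to 0 while the error in the solution of (27) tends to 1, so no constant `c` can work in (30) for this choice of `S`.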

3 Weakly Bounded Noise

In this section, the weak noise model, the main focus of this work, is precisely described, pointing out the strong and weak points along the way. Let $K : X \to Y$ be a linear compact operator between the Hilbert spaces $X$ and $Y$. The inner products and norms of $X$ and $Y$ are denoted by $\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$ and $\|\cdot\|_X$, $\|\cdot\|_Y$. Consider the data $y \in Y$ according to

$$y = Kx_o + \eta, \tag{32}$$

where $\eta \in Y$ is the unknown noise and $x_o \in X$ is an unknown element one wishes to recover from the data $y$. The following model is imposed on the noise. Let $T : Y \to Y$ be linear, compact, Hermitian, and positive-definite (i.e., $\langle y, Ty\rangle_Y > 0$ for all $y \in Y$, $y \ne 0$), and let

$$\delta^2 \overset{\text{def}}{=} \langle \eta, T\eta\rangle_Y. \tag{33}$$

It is assumed that $\delta$ is “small” and the investigation concerns what happens when $\delta \to 0$. In the above, the operator $T$ is not arbitrary: It must be connected with $K$ in the sense that for some $m \ge 1$ (not necessarily integer) the range of $K$ is continuously embedded into the range of $T^m$. That is,

$$T^{-m} K : X \to Y \quad \text{is continuous.} \tag{34}$$

If $\eta$ satisfies (33) and (34), it is referred to as weakly bounded noise. Some comments are in order. In a deterministic setting, a reasonable model for the noise is that it is “high-frequency,” and one would like to investigate what happens when the frequency tends to $\infty$, but without the noise tending to 0 strongly, that is, without assuming that $\|\eta\|_Y \to 0$. Thus, $\eta \to 0$ weakly begins capturing the essence of “noise.” Then, for any linear compact operator $S : Y \to Y$, one would have $\|S\eta\|_Y \to 0$. So, in this sense, there is nothing unusual about (33). Moreover, one would like (33) to capture the whole truth, that is, that the statements


$$\langle \eta, T^p \eta\rangle_Y = o(\delta^2) \quad \text{and} \quad \langle \eta, T^q \eta\rangle_Y = O(\delta^2) \tag{35}$$

fail for $p > 1$ and $q < 1$ as $\delta \to 0$. This may be a tall order, although examples of operators $T$ and noises $\eta$ satisfying (33)–(35) are easily constructed (Eggermont et al. 2009b). At the same time $T$ is supposed to capture the smoothing effect of $K$ in the sense of (34). Ideally, one would like $T^{-m} K$ to be continuous with a continuous inverse. The natural choice $T = (KK^*)^{1/2m}$ would achieve this, but would have to be reconciled with (33) and possibly (35). The condition (34) is not unreasonable. There are many cases where the operator $K$ is smoothing of order $m$ and then $T^{-1}$ could be a first-order differentiation operator.

This section is concluded by showing how the weak noise model leads to simple bounds on expressions like $\langle \eta, y\rangle_Y$ for $y \in T^m(Y)$, the range of $T^m$. For $\beta > 0$, introduce the inner product on $T^m(Y)$,

$$\langle y, z\rangle_{m,\beta} = \langle y, z\rangle_Y + \beta^{2m} \langle T^{-m} y, T^{-m} z\rangle_Y, \tag{36}$$

and denote the associated norm by $\|\cdot\|_{m,\beta}$. The following lemma is of interest in itself, but later on plays a crucial role in the analysis of Tikhonov regularization with weakly bounded noise.

Lemma 1 (Weakly Bounded Noise). Let $m \ge 1$. Under the assumptions (33) and (34) on the weakly bounded noise, for all $y \in T^m(Y)$ and all $\beta > 0$,

$$|\langle \eta, y\rangle_Y| \le \beta^{-1/2}\, \delta\, \|y\|_{m,\beta}. \tag{37}$$

Note that the factor $\beta^{-1/2}$ stays the same, regardless of $m$.

Proof of Lemma 1. Let $\beta > 0$. Consider the smoothing operator $S_\beta = (T^2 + \beta^2 I)^{-1} T^2$ and let $T_\beta$ denote its Hermitian square root,

$$T_\beta \overset{\text{def}}{=} (T^2 + \beta^2 I)^{-1/2}\, T. \tag{38}$$

Note that the inverse of $S_\beta$ is well-defined on $T^2(Y)$, and that there one has $S_\beta^{-1} = I + \beta^2 T^{-2}$, and likewise that $T_\beta^{-1}$ is well defined on $T(Y)$, with $T_\beta^{-1} = (I + \beta^2 T^{-2})^{1/2}$. Now, for all $y \in T(Y)$ one has that $y = T_\beta^{-1} T_\beta\, y$ and so

$$|\langle \eta, y\rangle_Y|^2 = |\langle T_\beta \eta, T_\beta^{-1} y\rangle_Y|^2 \le \|T_\beta \eta\|_Y^2\, \|T_\beta^{-1} y\|_Y^2 \tag{39}$$

by the Cauchy-Schwarz inequality. The last factor on the right equals

$$\langle y, y\rangle_Y + \beta^2 \langle T^{-1} y, T^{-1} y\rangle_Y = \|y\|_{1,\beta}^2,$$

and it is an easy exercise using the spectral decomposition of $T$ to show that for $1 \le m < n$ (not necessarily integers) and all $\beta > 0$,


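$$\|y\|_{m,\beta}^2 \le 2\, \|y\|_{n,\beta}^2. \tag{40}$$

See Eggermont et al. (2009b). The first factor on the right in (39) equals $\langle \eta, (I + \beta^2 T^{-2})^{-1} \eta\rangle_Y$ and must be bounded further in terms of $\bigl\langle \eta, T\eta \bigr\rangle_Y$. One has

$$\langle \eta, (I + \beta^2 T^{-2})^{-1} \eta\rangle_Y = \beta^{-1} \langle \eta, (\beta^{-1} T + \beta T^{-1})^{-1}\, T \eta\rangle_Y \le r\, \beta^{-1} \langle \eta, T\eta\rangle_Y,$$

where $r$ is the spectral radius of $(\beta^{-1} T + \beta T^{-1})^{-1}$. Since

$$r \le \sup_{t > 0}\, (\beta^{-1} t + \beta t^{-1})^{-1} = \sup_{\tau > 0}\, (\tau + \tau^{-1})^{-1} = \tfrac{1}{2},$$

then

$$\langle \eta, (I + \beta^2 T^{-2})^{-1} \eta\rangle_Y \le (2\beta)^{-1} \langle \eta, T\eta\rangle_Y. \tag{41}$$

To summarize, (39)–(41) show that

$$|\langle \eta, y\rangle_Y|^2 \le \beta^{-1} \langle \eta, T\eta\rangle_Y\, \|y\|_{m,\beta}^2. \tag{42}$$

This estimate together with the assumption (33) implies the lemma.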
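The inequality (37) can be sanity-checked in a finite-dimensional diagonal model. The following sketch assumes, purely for illustration, that $T$ is diagonal with eigenvalues `t[k] > 0` and that `eta` and `y` hold the coefficients of the noise and of the test element in the eigenbasis of $T$; none of these names come from the chapter.

```python
import math
import random


def lemma1_sides(t, eta, y, m, beta):
    """Return (lhs, rhs) of |<eta, y>| <= beta^{-1/2} * delta * ||y||_{m,beta}
    for a diagonal operator T with eigenvalues t[k] > 0."""
    lhs = abs(sum(e * v for e, v in zip(eta, y)))
    # delta^2 = <eta, T eta>, Eq. (33)
    delta = math.sqrt(sum(tk * e * e for tk, e in zip(t, eta)))
    # ||y||_{m,beta}^2 = ||y||^2 + beta^{2m} ||T^{-m} y||^2, Eq. (36)
    norm_y = math.sqrt(sum((1.0 + (beta / tk) ** (2 * m)) * v * v
                           for tk, v in zip(t, y)))
    return lhs, delta * norm_y / math.sqrt(beta)


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, hypothetical test data
    n = 50
    t = [1.0 / (k + 1) for k in range(n)]
    eta = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [rng.uniform(-1.0, 1.0) * tk ** 2 for tk in t]
    for beta in (0.01, 0.1, 1.0):
        print(beta, lemma1_sides(t, eta, y, m=1.5, beta=beta))
```

In each printed pair the first entry should not exceed the second, whatever the values of `m >= 1` and `beta > 0`.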

4 Tikhonov Regularization

4.1 Strongly Bounded Noise

Tikhonov regularization is now discussed as the scheme to recover $x_o$ from the data $y$ in the strong noise model

$$y = Kx_o + \eta \quad \text{with} \quad \|\eta\|_Y \le \delta. \tag{43}$$

The interest is in what happens when $\delta \to 0$. In the Tikhonov regularization scheme, the unknown $x_o$ is estimated by the solution $x = x^{\alpha,\delta}$ of

$$\text{minimize} \quad L(x \mid \alpha, y) \quad \text{over} \quad x \in X, \tag{44}$$

for some regularization parameter $\alpha > 0$, yet to be specified. Here,

$$L(x \mid \alpha, y) \overset{\text{def}}{=} \|Kx - y\|_Y^2 + \alpha\, \|x\|_X^2. \tag{45}$$

This dates back to Phillips (1962) and Tikhonov (1963). Note that since $L(x \mid \alpha, y)$ is strongly convex in $x$, its minimizer exists and is unique.


It is well-known (Groetsch 1984) that to get convergence rates on the error $\|x^{\alpha,\delta} - x_o\|_X$, one must assume a source condition. For simplicity, it is assumed here that there exists a $z_o \in X$ such that

$$x_o = (K^* K)^{\nu/2} z_o \quad \text{for some } 0 < \nu \le 2. \tag{46}$$

Precise necessary and sufficient conditions are given in Neubauer (1997). In the study of convergence rates under the source condition (46), it is assumed here that $\nu$ is known and that $\alpha$ is chosen accordingly. As said, one wants to obtain bounds on the error $\|x^{\alpha,\delta} - x_o\|_X$. As usual, this is broken up into two parts,

$$\|x^{\alpha,\delta} - x_o\|_X \le \|x^{\alpha,\delta} - x^{\alpha,o}\|_X + \|x^{\alpha,o} - x_o\|_X, \tag{47}$$

where $x^{\alpha,o}$ is the “noiseless” estimator, i.e., the minimizer of $L(x \mid \alpha, Kx_o)$. Thus, $x^{\alpha,\delta} - x^{\alpha,o}$ is the noise part of the error and $x^{\alpha,o} - x_o$ is the error introduced by the regularization. One has the following classical results. See, e.g., Groetsch (1984) and Engl et al. (1996).

Theorem 1. There exists a constant $c$ such that for all $\alpha$, $0 < \alpha \le 1$,

$$\|x^{\alpha,\delta} - x^{\alpha,o}\|_X \le c\, \alpha^{-1/2}\, \|\eta\|_Y. \tag{48}$$

Theorem 2. Under the source condition (46), there exists a constant $c$ such that for all $\alpha$, $0 < \alpha \le 1$,

$$\|x^{\alpha,o} - x_o\|_X \le c\, \alpha^{\nu/2}. \tag{49}$$

The two theorems above then give the following convergence rates.

Theorem 3. Assuming the source condition (46) and the condition (43) on the noise, for $\alpha \to 0$,

$$\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\alpha^{-1/2}\, \delta + \alpha^{\nu/2}\bigr). \tag{50}$$

Moreover, if $\alpha \asymp \delta^{2/(\nu+1)}$, then $\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\delta^{\nu/(\nu+1)}\bigr)$.

In the remainder of this subsection, Theorems 1 and 2 are proved. The first observation is that the normal equations for the problem (44) are $K^*(Kx - y) + \alpha x = 0$, so that the solution of (44) is

$$x^{\alpha,\delta} = (K^* K + \alpha I)^{-1} K^* y. \tag{51}$$

Proof of Theorem 1. From (51), it follows that $x^{\alpha,\delta} - x^{\alpha,o} = (K^* K + \alpha I)^{-1} K^* \eta$, so that

$$\|x^{\alpha,\delta} - x^{\alpha,o}\|_X^2 \le r\, \|\eta\|_Y^2,$$


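where $r$ is the spectral radius of $(K^* K + \alpha I)^{-1} K^* K (K^* K + \alpha I)^{-1}$. One then gets that

$$r \le \sup_{t > 0} \frac{t}{(t + \alpha)^2} = \alpha^{-1} \sup_{\tau > 0} \frac{\tau}{(\tau + 1)^2},$$

where the substitution $t = \alpha\tau$ is applied. Since the supremum is finite, the theorem follows.

Proof of Theorem 2. It follows from (51) that

$$x^{\alpha,o} - x_o = (K^* K + \alpha I)^{-1} K^* K x_o - x_o = -\alpha\, (K^* K + \alpha I)^{-1} x_o.$$

Now, with the source condition (46), one then obtains

$$\|x^{\alpha,o} - x_o\|_X = \alpha\, \|(K^* K + \alpha I)^{-1} (K^* K)^{\nu/2} z_o\|_X \le \alpha\, \varrho\, \|z_o\|_X,$$

where $\varrho$ is the spectral radius of the operator $(K^* K + \alpha I)^{-1} (K^* K)^{\nu/2}$. Since $K^* K$ is Hermitian, one has

$$\varrho \le \sup_{t > 0} \frac{t^{\nu/2}}{t + \alpha} \le \alpha^{(\nu - 2)/2} \sup_{\tau > 0} \frac{\tau^{\nu/2}}{\tau + 1}.$$

Since for $0 < \nu \le 2$ the supremum is finite, this gives that $\varrho = O\bigl(\alpha^{(\nu-2)/2}\bigr)$, and the theorem follows.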
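The formula (51) and the bound of Theorem 1 become transparent when $K$ is diagonal. The sketch below assumes that $K$ is the Green's function operator of Sect. 2.2 written in its sine eigenbasis, so that its singular values are $(\pi k)^{-2}$; the coefficient vectors `x_true` and `noise` are hypothetical test data, and the helper names are illustrative. In this setting the noise part satisfies (48) with $c = 1/2$, because $\sigma/(\sigma^2 + \alpha) \le 1/(2\sqrt{\alpha})$.

```python
import math


def tikhonov_coeffs(sigma, y, alpha):
    """Coefficients of x = (K*K + alpha I)^{-1} K* y, Eq. (51), when K is
    diagonal with singular values sigma[k] in an orthonormal basis."""
    return [s * v / (s * s + alpha) for s, v in zip(sigma, y)]


def l2norm(v):
    """Euclidean norm of a coefficient vector."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    n = 200
    sigma = [1.0 / (math.pi * (k + 1)) ** 2 for k in range(n)]
    x_true = [1.0 / (k + 1) ** 2 for k in range(n)]      # hypothetical x_o
    noise = [1e-4 * (-1) ** k for k in range(n)]         # hypothetical eta
    y_clean = [s * x for s, x in zip(sigma, x_true)]
    y = [a + b for a, b in zip(y_clean, noise)]
    for alpha in (1e-2, 1e-4, 1e-6):
        x_noisy = tikhonov_coeffs(sigma, y, alpha)
        x_clean = tikhonov_coeffs(sigma, y_clean, alpha)
        noise_part = l2norm([a - b for a, b in zip(x_noisy, x_clean)])
        bias_part = l2norm([a - b for a, b in zip(x_clean, x_true)])
        bound = 0.5 * l2norm(noise) / math.sqrt(alpha)
        print(alpha, noise_part, bound, bias_part)
```

Decreasing `alpha` shrinks the regularization error `bias_part` while the bound on `noise_part` grows, which is the trade-off expressed by (50).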

4.2 Weakly Bounded Noise

Tikhonov regularization may now be considered as the scheme to recover $x_o$ from the data $y$ in the weak noise model

$$y = Kx_o + \eta. \tag{52}$$

Thus, it is assumed that there is a smoothing operator $T$ such that the noise $\eta$ and $T$ satisfy (33) and (34). In particular,

$$\langle \eta, T\eta\rangle_Y = \delta^2, \tag{53}$$

and the discussion focuses on what happens when $\delta \to 0$. Formally, Tikhonov regularization does not depend on the noise being strongly or weakly bounded. Thus, $x_o$ is estimated by the solution $x = x^{\alpha,\delta}$ of

$$\text{minimize} \quad L(x \mid \alpha, y) \quad \text{over} \quad x \in X, \tag{54}$$

with $L(x \mid \alpha, y)$ as in (45), for some positive regularization parameter $\alpha$ yet to be specified. As said, one wants to obtain bounds on the error $\|x^{\alpha,\delta} - x_o\|_X$, and it is broken up as


$$\|x^{\alpha,\delta} - x_o\|_X \le \|x^{\alpha,\delta} - x^{\alpha,o}\|_X + \|x^{\alpha,o} - x_o\|_X, \tag{55}$$

where $x^{\alpha,o}$ is the “noiseless” estimator, i.e., the minimizer of $L(x \mid \alpha, Kx_o)$. Thus, $x^{\alpha,\delta} - x^{\alpha,o}$ is the noise part of the error and $x^{\alpha,o} - x_o$ is the error introduced by the regularization. It is useful to introduce a new norm on $X$ by way of

$$\|x\|_{\alpha,X}^2 = \|Kx\|_Y^2 + \alpha\, \|x\|_X^2. \tag{56}$$

Assuming again the source condition (46), the noiseless part $x^{\alpha,o} - x_o$ is covered by Theorem 2, but the treatment of the noise part is markedly different.

Theorem 4. Under the conditions (33) and (34) on the noise $\eta$, there exists a constant $C$ depending on $T$ only such that for $\alpha \to 0$,

$$\|x^{\alpha,\delta} - x^{\alpha,o}\|_{\alpha,X} \le C\, \alpha^{-1/4m}\, \delta. \tag{57}$$

Theorems 2 and 4 above then give the following convergence rates. In Eggermont et al. (2009b) it is shown that they are optimal, following Natterer (1984), assuming in addition that $T^{-m} K$ has a continuous inverse.

Theorem 5. Assuming the source condition (46) and the conditions (33) and (34) on the noise, for $\alpha \to 0$,

$$\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\alpha^{-1/2 - 1/4m}\, \delta + \alpha^{\nu/2}\bigr). \tag{58}$$

Moreover, if $\alpha \asymp \delta^{4m/(2m\nu + 2m + 1)}$, then $\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\delta^{2m\nu/(2m\nu + 2m + 1)}\bigr)$.

In the remainder of this subsection, Theorem 4 is proved. The direct computational approach of the previous section does not seem very convenient here, so one proceeds by way of the following equality due to Ribière (1967).

Lemma 2. Let $x^{\alpha,\delta}$ be the minimizer of $L(x \mid \alpha, y)$. Then, for all $x \in X$,

$$\|x - x^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx - y, K(x - x^{\alpha,\delta})\rangle_Y + \alpha\, \langle x, x - x^{\alpha,\delta}\rangle_X. \tag{59}$$

Proof. Look at $L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y)$ in two ways. First, calculate the quadratic Taylor expansion of $L(x \mid \alpha, y)$ around the point $x^{\alpha,\delta}$. So

$$L(x \mid \alpha, y) = L(x^{\alpha,\delta} \mid \alpha, y) + \langle L'(x^{\alpha,\delta} \mid \alpha, y), x - x^{\alpha,\delta}\rangle_X + \|x - x^{\alpha,\delta}\|_{\alpha,X}^2,$$

where $L'(x \mid \alpha, y)$ is the Gateaux derivative of $L(x \mid \alpha, y)$,

$$L'(x \mid \alpha, y) = 2K^*(Kx - y) + 2\alpha x. \tag{60}$$


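Since $x^{\alpha,\delta}$ is the minimizer of $L(x \mid \alpha, y)$, the Gateaux derivative $L'(x^{\alpha,\delta} \mid \alpha, y)$ vanishes. This shows that for all $x \in X$,

$$\|x - x^{\alpha,\delta}\|_{\alpha,X}^2 = L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y). \tag{61}$$

Second, expand $L(x^{\alpha,\delta} \mid \alpha, y)$ around $x$, so

$$L(x^{\alpha,\delta} \mid \alpha, y) = L(x \mid \alpha, y) + \langle L'(x \mid \alpha, y), x^{\alpha,\delta} - x\rangle_X + \|x^{\alpha,\delta} - x\|_{\alpha,X}^2.$$

With (60) then

$$L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y) = -2\langle Kx - y, K(x^{\alpha,\delta} - x)\rangle_Y - 2\alpha\, \langle x, x^{\alpha,\delta} - x\rangle_X - \|x^{\alpha,\delta} - x\|_{\alpha,X}^2.$$

Substituting this into (61) gives the required result.

Corollary 1. Under the conditions of Lemma 2,

$$\|x^{\alpha,o} - x^{\alpha,\delta}\|_{\alpha,X}^2 = \langle \eta, K(x^{\alpha,\delta} - x^{\alpha,o})\rangle_Y.$$

Proof. Let $\varepsilon^{\alpha,\delta} \equiv x^{\alpha,o} - x^{\alpha,\delta}$. In Lemma 2, take $x = x^{\alpha,o}$. Then

$$\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx^{\alpha,o} - y, K\varepsilon^{\alpha,\delta}\rangle_Y + \alpha\, \langle x^{\alpha,o}, \varepsilon^{\alpha,\delta}\rangle_X. \tag{62}$$

Next, since $x^{\alpha,o}$ is the minimizer of $L(x \mid \alpha, Kx_o)$, Lemma 2 gives for all $x \in X$,

$$\|x - x^{\alpha,o}\|_{\alpha,X}^2 = \langle Kx - Kx_o, K(x - x^{\alpha,o})\rangle_Y + \alpha\, \langle x, x - x^{\alpha,o}\rangle_X. \tag{63}$$

Now take $x = x^{\alpha,\delta}$, so that $x - x^{\alpha,o} = -\varepsilon^{\alpha,\delta}$. Then

$$\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = -\langle Kx^{\alpha,\delta} - Kx_o, K\varepsilon^{\alpha,\delta}\rangle_Y - \alpha\, \langle x^{\alpha,\delta}, \varepsilon^{\alpha,\delta}\rangle_X,$$

and add this to (62). The result is that

$$2\, \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx^{\alpha,o} - y - Kx^{\alpha,\delta} + Kx_o, K\varepsilon^{\alpha,\delta}\rangle_Y + \alpha\, \langle x^{\alpha,o} - x^{\alpha,\delta}, \varepsilon^{\alpha,\delta}\rangle_X,$$

or

$$2\, \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \|K\varepsilon^{\alpha,\delta}\|_Y^2 - \langle y - Kx_o, K\varepsilon^{\alpha,\delta}\rangle_Y + \alpha\, \|\varepsilon^{\alpha,\delta}\|_X^2,$$

and the corollary follows, since $y - Kx_o = \eta$ and the first and last terms on the right add up to $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2$.

Proof of Theorem 4. By Corollary 1, one needs to properly bound $\langle \eta, K(x^{\alpha,o} - x^{\alpha,\delta})\rangle_Y$. By Lemma 1 one has for all $\beta > 0$ and all $x \in X$,

$$|\langle \eta, Kx\rangle_Y| \le \beta^{-1/2}\, \delta\, \|Kx\|_{m,\beta}. \tag{64}$$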


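Now,

$$\|Kx\|_{m,\beta}^2 = \|Kx\|_Y^2 + \beta^{2m}\, \|T^{-m} Kx\|_Y^2 \le \|Kx\|_Y^2 + c\, \beta^{2m}\, \|x\|_X^2,$$

the last inequality by assumption (34) for a suitable constant $c$. Consequently,

$$\|Kx\|_{m,\beta} \le c\, \|x\|_{\alpha,X} \quad \text{for} \quad \beta = \alpha^{1/2m}. \tag{65}$$

Now apply this to (64) and that to $\langle \eta, K(x^{\alpha,o} - x^{\alpha,\delta})\rangle_Y$. With $\beta = \alpha^{1/2m}$, Corollary 1 then gives $\|x^{\alpha,o} - x^{\alpha,\delta}\|_{\alpha,X}^2 \le c\, \alpha^{-1/4m}\, \delta\, \|x^{\alpha,o} - x^{\alpha,\delta}\|_{\alpha,X}$, and the theorem follows upon dividing by $\|x^{\alpha,o} - x^{\alpha,\delta}\|_{\alpha,X}$.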
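The parameter choices quoted in Theorems 3 and 5 come from balancing the two terms in the bounds (50) and (58). The short derivation below is a sketch of that balancing argument, written out here for convenience; it uses only the exponents appearing in (50) and (58).

```latex
\begin{align*}
\text{Strong noise, (50):}\quad
  \alpha^{-1/2}\,\delta = \alpha^{\nu/2}
  &\iff \alpha^{(\nu+1)/2} = \delta
   \iff \alpha = \delta^{2/(\nu+1)},
  &\alpha^{\nu/2} &= \delta^{\nu/(\nu+1)}, \\[4pt]
\text{Weak noise, (58):}\quad
  \alpha^{-1/2-1/4m}\,\delta = \alpha^{\nu/2}
  &\iff \alpha^{(2m\nu+2m+1)/4m} = \delta
   \iff \alpha = \delta^{4m/(2m\nu+2m+1)},
  &\alpha^{\nu/2} &= \delta^{2m\nu/(2m\nu+2m+1)}.
\end{align*}
```

Since $2m\nu/(2m\nu + 2m + 1) \to \nu/(\nu+1)$ as $m \to \infty$, the weak-noise exponent approaches the strong-noise exponent for large $m$; for finite $m$ it is strictly smaller.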

5 Regularization Parameter Selection

The rates of convergence for Tikhonov regularization for weakly bounded noise established in Sect. 4 are nice, but practically speaking they only show what is possible in an asymptotic sense under perfect information on the noise and the source condition. In this section, some data-driven methods are explored for selecting the regularization parameter: Morozov’s discrepancy principle (apparently applicable only for strongly bounded noise) and Lepskiĭ’s method in the interpretation of Mathé (2006).

5.1 Discrepancy Principles

First, the data $y \in Y$ is considered in the classical model $y = Kx_o + \eta$ for some unknown $x_o$ and noise $\eta$ with $\|\eta\|_Y = \delta$ for “small” $\delta$. The estimator of $x_o$ is given by Tikhonov regularization of the equation $Kx = y$, that is,

$$x^{\alpha,\delta} = (K^* K + \alpha I)^{-1} K^* y, \tag{66}$$

except of course that this requires a specific choice for $\alpha$. The oldest a posteriori method for choosing the regularization parameter is Morozov’s discrepancy principle (Morozov 1966, 1984) based on the behavior of the residual

$$r(\alpha) = \|Kx^{\alpha,\delta} - y\|_Y, \qquad \alpha > 0. \tag{67}$$

Since for the “exact” solution, the residual would equal $\|\eta\|_Y$, in Morozov’s discrepancy principle one chooses $\alpha = \alpha_M$ such that

$$r(\alpha_M) = \|\eta\|_Y. \tag{68}$$

Writing $Kx^{\alpha,\delta} - y = -\alpha\, (KK^* + \alpha I)^{-1} y$, one easily shows that $r(\alpha)$ is a strictly increasing function of $\alpha > 0$. Moreover, it can be shown that

$$\lim_{\alpha \to \infty} r(\alpha) = \|y\|_Y \quad \text{and} \quad \lim_{\alpha \to 0} r(\alpha) = 0, \tag{69}$$


so that (68) has a unique solution. (In the second part of (69) it is assumed that the range of $K$ is dense in $Y$. If this is not the case, one must consider $r(\alpha) = \|Kx^{\alpha,\delta} - Qy\|_Y$, where $Q$ is the orthogonal projection operator onto the closure of the range of $K$ in $Y$.)

Before discussing the (im)possibility of adapting the discrepancy principle to weakly bounded noise, it is illustrative to explore why Morozov’s discrepancy principle works in the classical case. Indeed, as discovered by Groetsch (1983), in Morozov’s method, one is minimizing an upper bound for the error $\|x^{\alpha,\delta} - x_o\|_X$. To see this, write

$$\|x^{\alpha,\delta} - x_o\|_X^2 = \|x^{\alpha,\delta}\|_X^2 + \|x_o\|_X^2 - 2\langle x^{\alpha,\delta}, x_o\rangle_X,$$

and observe that $x^{\alpha,\delta} = (K^* K + \alpha I)^{-1} K^* y = K^* (KK^* + \alpha I)^{-1} y$, so that

$$\langle x^{\alpha,\delta}, x_o\rangle_X = \langle (KK^* + \alpha I)^{-1} y, Kx_o\rangle_Y = \langle (KK^* + \alpha I)^{-1} y, y\rangle_Y - \langle (KK^* + \alpha I)^{-1} y, \eta\rangle_Y.$$

Now, the last term may be bounded by $\|(KK^* + \alpha I)^{-1} y\|_Y\, \|\eta\|_Y$. It follows that

$$\|x^{\alpha,\delta} - x_o\|_X^2 \le U(\alpha) \le \|x^{\alpha,\delta} - x_o\|_X^2 + 4\, \|(KK^* + \alpha I)^{-1} y\|_Y\, \|\eta\|_Y, \tag{70}$$

where

$$U(\alpha) = \|x^{\alpha,\delta}\|_X^2 + \|x_o\|_X^2 - 2\langle (KK^* + \alpha I)^{-1} y, y\rangle_Y + 2\delta\, \|(KK^* + \alpha I)^{-1} y\|_Y. \tag{71}$$

Now, writing $\|x^{\alpha,\delta}\|_X^2 = \langle y, (KK^* + \alpha I)^{-2} KK^* y\rangle_Y$, it is a somewhat laborious but straightforward exercise to show that

$$\frac{dU}{d\alpha} = 2\alpha\, \langle y, (KK^* + \alpha I)^{-3} y\rangle_Y - \frac{2\delta\, \langle y, (KK^* + \alpha I)^{-3} y\rangle_Y}{\|(KK^* + \alpha I)^{-1} y\|_Y},$$

so that setting the derivative equal to 0 gives

$$\alpha\, \|(KK^* + \alpha I)^{-1} y\|_Y = \|\eta\|_Y. \tag{72}$$

Since

$$\alpha\, (KK^* + \alpha I)^{-1} y = y - Kx^{\alpha,\delta}, \tag{73}$$

this is Morozov’s discrepancy principle. One then derives convergence rates for $\|x^{\alpha,\delta} - x_o\|_X$ with $\alpha = \alpha_M$ by theoretically minimizing over $\alpha$ the upper bound of (70) for $U(\alpha)$, which by (73) may be written as

$$U(\alpha) \le \|x^{\alpha,\delta} - x_o\|_X^2 + 4\alpha^{-1}\, \|Kx^{\alpha,\delta} - y\|_Y\, \|\eta\|_Y. \tag{74}$$

Page 15 of 24

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_24-3 © Springer-Verlag Berlin Heidelberg 2015

Using the estimates from Sect. 4 will do the trick. See Groetsch (1984) or also Gfrerer (1987).

Can Morozov’s discrepancy principle be applied to weakly bounded noise? The real question is whether one can derive an upper bound for the error $\|x^{\alpha,\delta} - x_o\|_X$ based on weakly bounded noise, but this seems to be an open question. However, one can do some plausible reasoning. In the optimal case $\|Kx^{\alpha,\delta} - y\|_Y^2 \approx \|\eta\|_Y^2$, so this cannot account for weakly bounded noise. Then it seems to make sense to smooth it out as

\[ \|y - Kx^{\alpha,\delta}\|_{S_\alpha}^2 \overset{\text{def}}{=} \langle y - Kx^{\alpha,\delta}, S_\alpha(y - Kx^{\alpha,\delta})\rangle_Y, \]

where $S_\alpha = K(K^*K + \alpha I)^{-1}K^*$ is a smoothing operator that makes sense in this context. A weak discrepancy principle would then read

\[ \|y - Kx^{\alpha,\delta}\|_{S_\alpha}^2 = C\alpha^{-1/(2m)}\delta^2 \tag{75} \]

for a “suitable” constant $C$. However, since $\|y - Kx^{\alpha,\delta}\|_{S_\alpha}$ is certainly not strictly increasing (it tends to 0 for $\alpha \to 0$ and for $\alpha \to \infty$), this raises more questions than it answers.
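Because the residual of the classical discrepancy principle is increasing in the regularization parameter, the equation (68) can be solved by bisection. The following Python sketch illustrates this in finite dimensions. The matrix `K` (a discretized integration operator), the vector `x_true`, the noise vector `eta`, and the helper names `tikhonov`, `residual`, and `morozov_alpha` are illustrative assumptions and do not come from the chapter.

```python
import numpy as np

def tikhonov(K, y, alpha):
    """Tikhonov solution x = (K^T K + alpha I)^{-1} K^T y."""
    n = K.shape[1]
    return np.linalg.solve(K.T @ K + alpha * np.eye(n), K.T @ y)

def residual(K, y, alpha):
    """Residual r(alpha) = ||K x - y|| of the Tikhonov solution."""
    return np.linalg.norm(K @ tikhonov(K, y, alpha) - y)

def morozov_alpha(K, y, noise_norm, lo=1e-14, hi=1e6, iters=200):
    """Solve r(alpha) = noise_norm by bisection on a logarithmic scale.

    Assumes r(lo) < noise_norm < r(hi) and that r is increasing.
    """
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # geometric midpoint
        if residual(K, y, mid) < noise_norm:
            lo = mid                    # residual too small: increase alpha
        else:
            hi = mid                    # residual too large: decrease alpha
    return np.sqrt(lo * hi)

# Hypothetical test problem: integration on (0, 1), discretized on n cells.
n = 50
K = np.tril(np.ones((n, n))) / n
t = (np.arange(n) + 0.5) / n
x_true = np.sin(np.pi * t)
rng = np.random.default_rng(0)
eta = 1e-3 * rng.standard_normal(n)     # strongly bounded ("classical") noise
y = K @ x_true + eta

alpha_M = morozov_alpha(K, y, np.linalg.norm(eta))
print("alpha_M =", alpha_M, " residual =", residual(K, y, alpha_M))
```

The sketch uses the exact noise norm, which is only available in a simulation; in practice an estimate of the noise level would have to be supplied.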

5.2 Lepskiĭ’s Principle

In this section, the adaptation to weakly bounded noise of the method of Lepskiĭ (1990) for choosing the regularization parameter $\alpha$ in Tikhonov regularization is discussed. This method originated in nonparametric regression with random noise (Lepskiĭ 1990), but it has found a ready interpretation in the classical setting of ill-posed problems; see Bissantz et al. (2007), Mathé and Pereverzev (2006a, b), and Pereverzev and Schock (2005) and the concise interpretation of Mathé (2006). The Lepskiĭ principle in the interpretation of Mathé (2006) may be adapted to the weakly bounded noise setting as follows. Let

\[ \Psi(\alpha) = 2\delta\alpha^{-(2m+1)/(4m)}, \qquad \alpha > 0. \tag{76} \]

This is an overestimate of the contribution of the noise to the solution. One observes that $\Psi(\alpha)$ is a decreasing function of $\alpha$. In Sect. 4.2, it was shown that

\[ \|x^{\alpha,\delta} - x^{\alpha,o}\|_X \le \tfrac{1}{2}\Psi(\alpha). \tag{77} \]

It then follows that

\[ \|x^{\alpha,\delta} - x_o\|_X \le \tfrac{1}{2}\Psi(\alpha) + \|x^{\alpha,o} - x_o\|_X, \tag{78} \]

and this is $\le \Psi(\alpha)$ for all suitably small $\alpha$ since $\|x^{\alpha,o} - x_o\|_X \to 0$ and $\Psi(\alpha) \to \infty$ for $\alpha \to 0$. Define

˛star D sup f˛ > 0 W 800

as well as the approximations to

\[ \mathbb{E}\big[n^{-1/2}\|x(\alpha_{\mathrm{star}}, \delta) - x_o\|_{\mathbb{R}^n}\big] \quad\text{and}\quad \mathbb{E}\big[n^{-1/2}\|x(\alpha_D, \delta) - x_o\|_{\mathbb{R}^n}\big] \]

in connection with Lepskiĭ’s principle (see below for $\alpha_D$). The $\Psi$-function used is

\[ \Psi(\alpha) = 2\alpha^{-9/16}n^{-1/2}\delta, \tag{90} \]

which corresponds to $m = 4$. So, here the operator $T$ of (33) and (34) is taken to be such that $T^{-1}$ is differentiation of order 1/2. This almost works theoretically, since for random noise one has

\[ \langle \eta, T\eta\rangle_{L^2(0,1)} < \infty \tag{91} \]

if $T$ is a smoothing operator of order > 1/2. See Eggermont and LaRiccia (2009a), Chap. 13. Also, in (90) the weak noise level is not $\delta$ but $\delta/\sqrt{n}$. The results for a “dangerous” choice, $\alpha = \alpha_D$, are also shown. This choice comes about as follows. In Fig. 1 a typical graph is shown of

\[ \log_{10}\big(n^{-1/2}\|x(\beta, \delta) - x(\alpha_L, \delta)\|_{\mathbb{R}^n}\big), \qquad 0 < \beta < \alpha_L, \]

which lies below the graph of $2\Psi(\beta)$, but touches it at one point, the abscissa of which is denoted by $\alpha_D$. Increasing $\alpha_L$ would make part of the graph lie above that of $2\Psi(\beta)$. Observe that the logarithm tends to $-\infty$ for $\beta \to \alpha_L$, as it should. In Fig. 1, the graphs of the true error and of $\Psi(\beta)$ and $2\Psi(\beta)$ are also shown. It seems clear from the graph that the choice $\alpha = \alpha_D$ is reasonable, but since there is no theoretical backing, it is called the “dangerous” one.

Table 1 The mean errors $(1/\sqrt{n})\|x(\alpha, \delta) - x_o\|_{\mathbb{R}^n}$ for various $\alpha$

n      Optimal      Star         Dangerous    Lepskiĭ
100    5.9047e-2    6.9908e-2    9.6766e-2    1.7657e-1
200    5.0559e-2    5.9081e-2    7.7766e-2    1.4667e-1
400    4.1698e-2    4.9118e-2    6.5244e-2    1.1948e-1

Fig. 1 Graphs of the true error (solid), the $\Psi(\beta)$ function (dotted), $2\Psi(\beta)$ (dash-dotted), and $(1/\sqrt{n})\|x(\beta, \delta) - x(\alpha_L, \delta)\|_{\mathbb{R}^n}$ (the Lepskiĭ curve) versus $\log_{10}(\beta)$ in a typical case

It is clear from inspecting Table 1 that the Lepskiĭ principle works, but that the dangerous choice works better (for this example).

7 Conclusions and Future Directions

It has been demonstrated that weak noise requires an essentially different treatment compared to strongly bounded or “classical” noise. However, it is clear that only the surface of weak noise models in ill-posed problems has been scratched, most notably with regard to nonlinear problems and moment discretization. This will be discussed in later works. In the classical approach of moment discretization, this leads naturally to reproducing kernel Hilbert spaces (Nashed and Wahba 1974a, b; Nashed 1976, 1981). However, reproducing kernel Hilbert space ideas are hidden in this chapter as well, most notably in the Weak Noise Lemma 1, which should be compared with the Random Sum Lemma 13.2.20 of Eggermont and LaRiccia (2009a).

There are two more pressing areas that need study, to wit, more flexible recovery schemes and procedures for the selection of the regularization parameter that automatically adapt to generally unknown source conditions. The two are sometimes not easily disentangled. The hypothesis is offered that the “usual” recovery schemes for strongly bounded noise will also work for weakly bounded noise. Thus, total-variation regularization (Rudin et al. 1992; Dobson and Scherzer 1996; Vogel and Oman 1996), inverse scale space methods (Groetsch and Scherzer 1984; Burger et al. 2007), and iterative methods (Eicke et al. 1990; Frankenberger and Hanke 2000), to name a few, should prove promising avenues of investigation for weak noise models.

Regarding smoothing parameter selection, things are much less clear. Some selection criteria obviously do not apply, such as Morozov’s discrepancy principle and variants for classical noise (Mathé and Pereverzev 2006a), but others do. Lepskiĭ’s method (Mathé 2006; Mathé and Pereverzev 2006b) may be adapted (but the actual implementation is storage intensive). The L-method works for weak noise models in the sense that the graph of $\|Kx^{\alpha,\delta} - y\|_Y$ versus $\|x^{\alpha,\delta}\|_X$ exhibits the usual L-shape (but identifying the corner is problematic). It would be interesting to see under what conditions methods designed explicitly for random noise apply to weakly bounded noise as well; see Desbat and Girard (1995).

8 Synopsis

1. Let $X$ and $Y$ be Hilbert spaces with inner products $\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$ and norms $\|\cdot\|_X$, $\|\cdot\|_Y$. Let $K: X \to Y$ be linear and compact. Consider the data $y \in Y$, $y = Kx_o + \eta$, with unknown $x_o \in X$ and noise $\eta \in Y$.
2. Goal: Recover $x_o$! The problem is ill-posed because $K^{-1}$ is not bounded.
3. Classical view: $\|\eta\|_Y$ is “small.” Analyze what happens when $\|\eta\|_Y \to 0$.
4. Cases not covered:
   1. High-frequency noise (not “small”)
   2. White noise ($\eta \notin Y$!)
   3. Random discrete noise (not “small”; more data becomes available)
   4. Signal-to-noise ratio < 10 %!
5. New noise model: $y = Kx_o + \eta$ with $\eta \to 0$ weakly in $Y$. Need rates on the weak convergence.
6. Precise conditions for weakly bounded noise (Sect. 3): $T: Y \to Y$ linear, compact, Hermitian, positive definite; $\delta^2 = \langle \eta, T\eta\rangle_Y \to 0$; $K(X) \subset T^m(Y)$ for some $m > 1$ ($m$ need not be an integer). Note: this does not imply $\|\eta\|_Y \to 0$!
7. $T$ compact and $\eta \to 0$ weakly implies $\langle \eta, T\eta\rangle_Y \to 0$, but $T$ must be related to $K$, e.g., $T = (K^*K)^{1/(2m)}$, but then . . . .
8. Weak Noise Lemma 1: $\langle \eta, Kx\rangle_Y \le (2\beta)^{-1/2}\delta\{\|Kx\|_Y^2 + \beta^{2m}\|T^{-m}Kx\|_Y^2\}^{1/2}$. So, with $\beta = \alpha^{1/(2m)}$ and $T^{-m}K$ bounded, $\langle \eta, Kx\rangle_Y \le c\,\alpha^{-1/(4m)}\delta\|x\|_{\alpha,X}$, with $\|x\|_{\alpha,X}^2 = \|Kx\|_Y^2 + \alpha\|x\|_X^2$.
9. Ideally suited for Tikhonov regularization: $x^{\alpha,\delta} = \arg\min \|Kx - y\|_Y^2 + \alpha\|x\|_X^2$!
10. Ribière’s (in)equality with $\varepsilon^{\alpha,\delta} = x^{\alpha,\delta} - x_o$ (Lemma 2): $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \langle \eta, K\varepsilon^{\alpha,\delta}\rangle_Y - \alpha\langle x_o, \varepsilon^{\alpha,\delta}\rangle_X$. Two terms!
11. $\langle \eta, K\varepsilon^{\alpha,\delta}\rangle_Y$: see item 8.
12. $|\alpha\langle x_o, \varepsilon^{\alpha,\delta}\rangle_X| \le c\,\alpha^{(\nu+1)/2}\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}$ if one has a source condition $x_o = (K^*K)^{\nu/2}z_o$ with $0 < \nu \le 2$, for some $z_o \in X$ (from “classic” Tikhonov regularization).
13. Leads to $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 \le c\{\alpha^{-1/(4m)}\delta + \alpha^{(\nu+1)/2}\}\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}$, and rates follow!
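The difference between strongly and weakly bounded noise in items 3 to 7 can be made concrete with a small numerical sketch. It assumes, purely for illustration, that $T$ is diagonal in some orthonormal basis of $Y$ with eigenvalues $1/k$ (a hypothetical choice, not the operator used in the chapter) and that the noise is the $n$-th basis vector, a stand-in for high-frequency noise. The norm of the noise stays equal to 1, whereas the weak noise level $\delta = \langle \eta, T\eta\rangle_Y^{1/2}$ tends to 0.

```python
import numpy as np

N = 4096                        # truncation dimension (illustrative)
k = np.arange(1, N + 1)
T_eigs = 1.0 / k                # eigenvalues of a hypothetical compact, positive definite T

def weak_level(eta):
    """delta = sqrt(<eta, T eta>) for T diagonal in the chosen basis."""
    return np.sqrt(np.sum(T_eigs * eta**2))

results = []
for n in (4, 64, 1024):
    eta = np.zeros(N)
    eta[n - 1] = 1.0            # unit "high-frequency" noise: n-th basis vector
    results.append((n, np.linalg.norm(eta), weak_level(eta)))

for n, strong, weak in results:
    print(f"n = {n:5d}   ||eta|| = {strong:.3f}   delta = {weak:.5f}")
```

For this choice $\delta = n^{-1/2}$, so the printed weak levels are 0.5, 0.125, and 0.03125 while $\|\eta\|_Y = 1$ throughout.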

References

Bissantz N, Hohage T, Munk A, Ruymgaart F (2007) Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM J Numer Anal 45:2610–2626
Burger M, Resmerita E, He L (2007) Error estimation for Bregman iterations and inverse scale space methods in image restoration. Computing 81:109–135
Cavalier L, Golubev GK (2006) Risk hull method and regularization by projections of ill-posed inverse problems. Ann Stat 34:1653–1677
Cavalier L, Golubev GK, Picard D, Tsybakov AB (2002) Oracle inequalities for inverse problems. Ann Stat 30:843–874
Desbat L, Girard D (1995) The “minimum reconstruction error” choice of regularization parameters: some more efficient methods and their application to deconvolution problems. SIAM J Sci Comput 16:1387–1403
Dobson DC, Scherzer O (1996) Analysis of regularized total variation penalty methods for denoising. Inverse Probl 12:601–617
Duan H, Ng BP, See CMS, Fang J (2008) Broadband interference suppression performance of minimum redundancy arrays. IEEE Trans Signal Process 56:2406–2416
Eicke B (1992) Iteration methods for convexly constrained ill-posed problems in Hilbert space. Numer Funct Anal Optim 13:413–429
Eggermont PPB, LaRiccia VN (2009a) Maximum penalized likelihood estimation. Volume II: Regression. Springer, New York
Eggermont PPB, LaRiccia VN, Nashed MZ (2009) On weakly bounded noise in ill-posed problems. Inverse Probl 25:115018 (14pp)
Engl HW, Kunisch K, Neubauer A (1989) Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems. Inverse Probl 5:523–540
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer, Dordrecht
Frankenberger H, Hanke M (2000) Kernel polynomials for the solution of indefinite and ill-posed problems. Numer Algorithms 25:197–212
Franklin JN (1970) Well-posed stochastic extensions of ill-posed linear problems. J Math Anal Appl 31:682–716
Gfrerer H (1987) An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math Comput 49:523–542
Grenander U (1981) Abstract inference. Wiley, New York
Groetsch CW (1983) Comments on Morozov’s discrepancy principle. In: Hämmerlin G, Hoffmann K-H (eds) Improperly posed problems and their numerical treatment. Birkhäuser, Basel
Groetsch CW (1984) The theory of Tikhonov regularization for Fredholm equations of the first kind. Pitman, Boston
Groetsch CW, Scherzer O (2002) Iterative stabilization and edge detection. Contemp Math 313:129–141. American Mathematical Society, Providence
Hadamard J (1902) Sur les problèmes aux dérivées partielles et leur signification physique. Princet Univ Bull 13:49–52
Hanke M, Scherzer O (2001) Inverse problems light: numerical differentiation. Am Math Mon 108:512–521
Kaipio J, Somersalo E (2005) Statistical and computational inverse problems. Springer, Berlin
Kirsch A (1996) An introduction to the mathematical theory of inverse problems. Springer, Berlin
Kress R (1999) Linear integral equations. Springer, Berlin
Lepskiĭ OV (1990) On a problem of adaptive estimation in Gaussian white noise. Theory Probab Appl 35:454–466
Mair BA, Ruymgaart FH (1996) Statistical inverse estimation in Hilbert scales. SIAM J Appl Math 56:1424–1444
Mathé P (2006) The Lepskiĭ principle revisited. Inverse Probl 22:L11–L15
Mathé P, Pereverzev SV (2006a) The discretized discrepancy principle under general source conditions. J Complex 22:371–381
Mathé P, Pereverzev SV (2006b) Regularization of some linear ill-posed problems with discretized random noisy data. Math Comput 75:1913–1929
Morozov VA (1966) On the solution of functional equations by the method of regularization. Soviet Math Dokl 7:414–417
Morozov VA (1984) Methods for solving incorrectly posed problems. Springer, New York
Nashed MZ (1976) On moment-discretization and least-squares solutions of linear integral equations of the first kind. J Math Anal Appl 53:359–366
Nashed MZ (1981) Operator-theoretic and computational approaches to ill-posed problems with applications to antenna theory. IEEE Trans Antennas Propag 29:220–231
Nashed MZ, Wahba G (1974a) Regularization and approximation of linear operator equations in reproducing kernel spaces. Bull Am Math Soc 80:1213–1218
Nashed MZ, Wahba G (1974b) Convergence rates of approximate least squares solutions of linear integral and operator equations of the first kind. Math Comput 28:69–80
Natterer F (1977) Regularisierung schlecht gestellter Probleme durch Projektionsverfahren. Numer Math 28:329–341
Natterer F (1984) Error bounds for Tikhonov regularization in Hilbert scales. Appl Anal 18:29–37
Neubauer A (1997) On converse and saturation results for Tikhonov regularization of linear ill-posed problems. SIAM J Numer Anal 34:517–527
Pereverzev SV, Schock E (2005) On the adaptive selection of the parameter in regularization of ill-posed problems. SIAM J Numer Anal 43:2060–2076
Phillips DL (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97
Ribière G (1967) Régularisation d’opérateurs. Rev Informat Recherche Opérationnelle 1:57–79
Rudin LI, Osher SJ, Fatemi E (1992) Nonlinear total variation based noise removal algorithms. Physica D 60:259–268
Sudakov VN, Khalfin LA (1964) A statistical approach to the correctness of the problems of mathematical physics. Dokl Akad Nauk SSSR 157:1058–1060
Tikhonov AN (1943) On the stability of inverse problems. Dokl Akad Nauk SSSR 39:195–198 (in Russian)
Tikhonov AN (1963) On the solution of incorrectly formulated problems and the regularization method. Dokl Akad Nauk SSSR 151:501–504
Tikhonov AN, Arsenin VYa (1977) Solutions of ill-posed problems. Wiley, New York
Tikhonov AN, Goncharsky AV, Stepanov VV, Yagola AG (1995) Numerical methods for the solution of ill-posed problems. Kluwer, Dordrecht
Tronicke J (2007) The influence of high frequency uncorrelated noise on first-break arrival times and crosshole traveltime tomography. J Environ Eng Geophys 12:172–184
Vogel CR, Oman ME (1996) Iterative methods for total variation denoising. SIAM J Sci Comput 17:227–238
Wahba G (1973) Convergence rates of certain approximate solutions to Fredholm integral equations of the first kind. J Approx Theory 7:167–185

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_25-2 © Springer-Verlag Berlin Heidelberg 2014

Sparsity in Inverse Geophysical Problems

Markus Grasmair (a), Markus Haltmeier (b) and Otmar Scherzer (c)

(a) Department of Mathematics, Norwegian University of Science and Technology, Trondheim, Norway
(b) Institute of Mathematics, University of Innsbruck, Innsbruck, Austria
(c) Computational Science Center, University of Vienna, Vienna, Austria

Abstract

Many geophysical imaging problems are ill-posed in the sense that the solution does not depend continuously on the measured data. Therefore, their solutions cannot be computed directly but instead require the application of regularization. Standard regularization methods find approximate solutions with small $L^2$ norm. In contrast, sparsity regularization yields approximate solutions that have only a small number of nonvanishing coefficients with respect to a prescribed set of basis elements. Recent results demonstrate that these sparse solutions often represent real objects much better than solutions with small $L^2$ norm. In this survey, recent mathematical results for sparsity regularization are reviewed. As an application of the theoretical results, synthetic focusing in Ground Penetrating Radar is considered, which is a paradigmatic inverse geophysical problem.

1 Introduction

In a plethora of industrial problems, one aims at estimating the properties of a physical object from observed data. Often the relation between the physical object and the data can be modeled sufficiently well by a linear equation

\[ Au = v, \tag{1} \]

where $u$ is a representation of the object in some Hilbert space $U$, and $v$ a representation of the measurement data, again in a Hilbert space $V$. Because the operator $A: U \to V$ in general is continuous, the relationship (1) allows one to easily compute data $v$ from the properties of the object $u$, provided they are known. This is the so-called forward problem. In many practical applications, however, one is interested in the inverse problem of estimating the quantity $u$ from measured data $v$. A typical feature of inverse problems is that the solution of (1) is very sensitive to perturbations in $v$. Because in practical applications only an approximation $v^\delta$ of the true data $v$ is given, the direct solution of Eq. (1) by applying the inverse operator is therefore not advisable (see Engl et al. 1996; Scherzer et al. 2009). By incorporating a priori information about the exact solution, regularization methods allow one to calculate a reliable approximation of $u$ from the observed data $v^\delta$. In this chapter, the main interest lies in sparsity regularization, where the a priori information is that the true solution $u$ is sparse in the sense that only few coefficients $\langle u, \phi_\lambda\rangle$ with respect to some prescribed basis $(\phi_\lambda)_{\lambda\in\Lambda}$ are nonvanishing. In the discrete setting of compressed sensing, it has recently been shown that sparse solutions can be found by minimizing the $\ell^1$-norm of the coefficients $\langle u, \phi_\lambda\rangle$; see Donoho and Elad (2003) and Candès et al. (2006). Minimization of the $\ell^1$-norm for finding sparse solutions has, however, been proposed and studied much earlier for certain geophysical inverse problems (see Claerbout and Muir 1973; Levy and Fullagar 1981; Oldenburg et al. 1983; Santosa and Symes 1986).

Fig. 1 Collecting GPR data from a flying helicopter. At each position on the flight path $\Gamma$, the antenna emits a short radar pulse. The radar waves get reflected, and the scattered signals are collected in radargrams

1.1 Case Example: Ground Penetrating Radar

The case example of a geophysical inverse problem studied in this chapter is Ground Penetrating Radar (GPR), which aims at finding buried objects by measuring reflected radar signals (Daniels 2004). The reflected signals are detected in zero offset mode (emitting and detecting antenna are at the same position) and used to estimate the reflecting objects. The authors’ interest in GPR has been raised by the possibility of locating avalanche victims by means of a GPR system mounted on a flying helicopter (Haltmeier et al. 2005; Frühauf et al. 2009). The basic principle of collecting GPR data from a helicopter is shown in Fig. 1. In Sect. 5.1, it is shown that the imaging problem in GPR reduces to solving Eq. (1), with $A$ being the circular Radon transform. The inversion of the circular Radon transform also arises in several other up-to-date imaging modalities, such as SONAR, seismic imaging, ultrasound tomography, and photo-/thermo-acoustic tomography (see, e.g., Norton and Linzer 1981; Andersson 1988; Bleistein et al. 2001; Finch and Rakesh 2007; Patch and Scherzer 2007; Kuchment and Kunyansky 2008; Scherzer et al. 2009; Symes 2009 and the references therein).

2 Variational Regularization Methods

Let $U$ and $V$ be Hilbert spaces and let $A: U \to V$ be a bounded linear operator with unbounded inverse. Then, the problem of solving the operator equation

\[ Au = v \]

is ill-posed. In order to (approximately) solve this equation in a stable way, it is therefore necessary to introduce some a priori knowledge about the solution $u$, which can be expressed via smallness of some regularization functional $\mathcal{R}: U \to [0, +\infty]$. In classical regularization theory, one assumes that the possible solutions have a small energy in some Hilbert space norm (typically, an $L^2$- or $H^1$-norm is used) and defines $\mathcal{R}$ as the square of this norm. In contrast, in this chapter the situation of sparsity constraints is considered, where one assumes that the possible solutions have a sparse expansion with respect to a given basis. In the following, $u^\dagger$ denotes any $\mathcal{R}$-minimizing solution of the equation $Au = v$, provided that it exists, that is,

\[ u^\dagger \in \arg\min\{\mathcal{R}(u) : Au = v\}. \]

In applications, it is to be expected that the measurements of $v$ one can obtain are disturbed by noise. That is, one is not able to measure the true data $v$ but only has some noisy measurements $v^\delta$ available. In this case, solving the constrained minimization problem $\mathcal{R}(u) \to \min$ subject to $Au = v^\delta$ is not suitable, because the ill-posedness of the equation will lead to unreliable results. Even more, in the worst case it can happen that $v^\delta$ is not contained in the range of $A$, and thus the equation $Au = v^\delta$ has no solution at all. Thus, it is necessary to restrict oneself to solving the given equation only approximately. Three methods are considered for the approximate solution, all of which require knowledge about, or at least some estimate of, the noise level $\delta := \|v - v^\delta\|$.

Residual method: Fix $\tau \ge 1$ and solve the constrained minimization problem

\[ \mathcal{R}(u) \to \min \quad\text{subject to}\quad \|Au - v^\delta\| \le \tau\delta. \tag{2} \]

Tikhonov regularization with discrepancy principle: Fix $\tau \ge 1$ and minimize the Tikhonov functional

\[ \mathcal{T}_{\alpha,v^\delta}(u) := \|Au - v^\delta\|^2 + \alpha\mathcal{R}(u), \tag{3} \]

where $\alpha > 0$ is chosen in such a way that Morozov’s discrepancy principle is satisfied, that is,

\[ \|Au_\alpha^\delta - v^\delta\| = \tau\delta \quad\text{with}\quad u_\alpha^\delta \in \arg\min_u \mathcal{T}_{\alpha,v^\delta}(u). \]

Tikhonov regularization with a priori parameter choice: Fix $C > 0$ and minimize the Tikhonov functional (3) with a parameter choice

\[ \alpha = C\delta. \tag{4} \]

The residual method aims for the minimization of the penalty term $\mathcal{R}$ over all elements $u$ that generate approximations of the given noisy data $v^\delta$; the size of the permitted defect is dictated by the assumed noise level $\delta$. In particular, the true solution $u^\dagger$ is guaranteed to be among the feasible elements in the minimization problem (2). The additional parameter $\tau \ge 1$ allows for some uncertainty concerning the precise noise level; if $\tau$ is strictly greater than 1, an underestimation of the noise would still yield a reasonable result.
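The effect of regularization with the a priori choice (4) can be illustrated on a toy problem. The sketch below assumes the quadratic penalty given by the squared norm of $u$, for which the minimizer of (3) has a closed form, and a hypothetical diagonal operator with rapidly decaying singular values; `sigma`, `u_true`, `signs`, and `errors` are illustrative names that do not appear in the chapter. It compares the error of the naive inversion with the error of the Tikhonov solution.

```python
import numpy as np

n = 100
i = np.arange(1, n + 1)
sigma = 1.0 / i**2              # singular values of a hypothetical diagonal operator A
u_true = 1.0 / i                # coefficients of the "true" object (assumption)
v = sigma * u_true              # exact data v = A u

signs = (-1.0) ** i             # deterministic noise direction
C = 1.0                         # constant in the a priori choice alpha = C * delta

def errors(delta):
    """Return (naive error, Tikhonov error) for noise of norm delta."""
    noise = delta * signs / np.sqrt(n)          # ||noise|| = delta exactly
    v_delta = v + noise
    u_naive = v_delta / sigma                   # direct inversion of A
    alpha = C * delta
    # Minimizer of ||A u - v_delta||^2 + alpha ||u||^2 for diagonal A.
    u_tik = sigma * v_delta / (sigma**2 + alpha)
    return (np.linalg.norm(u_naive - u_true),
            np.linalg.norm(u_tik - u_true))

for delta in (1e-1, 1e-2, 1e-3):
    naive, tik = errors(delta)
    print(f"delta = {delta:.0e}   naive error = {naive:9.3f}   Tikhonov error = {tik:.3f}")
```

In this example the naive inversion amplifies the noise by the inverse of the smallest singular value, whereas the Tikhonov error stays bounded and decreases with the noise level.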


If the regularization functional $\mathcal{R}$ is convex, the residual method can be shown to be equivalent to Tikhonov regularization with a parameter choice according to Morozov’s discrepancy principle, provided the size of the signal is larger than the noise level, that is, the signal-to-noise ratio is larger than $\tau$. In this case, the regularization parameter $\alpha$ in (3) plays the role of a Lagrange parameter for the solution of the constrained minimization problem (2). This equivalence result is summarized in the following theorem (see Ivanov et al. 2002, Theorems 3.5.2, 3.5.5):

Theorem 1. Assume that the operator $A: U \to V$ is linear and has a dense range and that the regularization term $\mathcal{R}$ is convex. In addition, assume that $\mathcal{R}(u) = 0$ if and only if $u = 0$. Then the residual method and Tikhonov regularization with an a posteriori parameter choice by means of the discrepancy principle are equivalent in the following sense: Let $v^\delta \in V$ and $\delta > 0$ satisfy $\|v^\delta\| > \tau\delta$. Then $u^\delta$ solves the constrained problem (2) if and only if $\|Au^\delta - v^\delta\| = \tau\delta$ and there exists some $\alpha > 0$ such that $u^\delta$ minimizes the Tikhonov functional (3).

In order to show that the methods introduced above are indeed regularizing, three properties must be satisfied, namely, existence, stability, and convergence. In addition, convergence rates can be used to quantify the quality of the method:

• Existence: For each regularization parameter $\alpha > 0$ and every $v^\delta \in V$ the regularization functional $\mathcal{T}_{\alpha,v^\delta}$ attains its minimum. Similarly, the minimization problem (2) has a solution.
• Stability is required to ensure that, for fixed noise level $\delta$, the regularized solutions depend continuously on the data $v^\delta$.
• Convergence ensures that the regularized solutions converge to $u^\dagger$ as the noise level decreases to zero.
• Convergence rates provide an estimate of the difference between the minimizers of the regularization functional and $u^\dagger$.

Typically, convergence rates are formulated in terms of the Bregman distance (see Burger and Osher 2004; Resmerita 2005; Hofmann et al. 2007; Scherzer et al. 2009), which, for a convex and differentiable regularization term $\mathcal{R}$ with subdifferential $\partial\mathcal{R}$ and $\xi \in \partial\mathcal{R}(u^\dagger)$, is defined as

\[ D_\xi(u, u^\dagger) = \mathcal{R}(u) - \mathcal{R}(u^\dagger) - \langle \xi, u - u^\dagger\rangle. \]

That is, $D_\xi(u, u^\dagger)$ measures the distance between the tangent and the convex function $\mathcal{R}$. In general, convergence with respect to the Bregman distance does not imply convergence with respect to the norm, strongly reducing the significance of the derived rates. In the setting of sparse regularization to be introduced below, however, it is possible to derive convergence rates with respect to the norm on $U$.
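A two-dimensional example makes the remark about the Bregman distance concrete. In the sketch below (the helper `bregman` and the chosen vectors are illustrative assumptions), the Bregman distance of the squared Euclidean norm coincides with the squared distance, whereas for the 1-norm it vanishes for two different elements, so that a small Bregman distance says nothing about the norm distance.

```python
import numpy as np

def bregman(R, xi, u, u_dag):
    """Bregman distance D_xi(u, u_dag) = R(u) - R(u_dag) - <xi, u - u_dag>."""
    return R(u) - R(u_dag) - np.dot(xi, u - u_dag)

def l1(u):
    return np.sum(np.abs(u))

def l2_squared(u):
    return np.sum(u**2)

u_dag = np.array([1.0, 0.0])
u = np.array([3.0, 0.0])

xi_l2 = 2.0 * u_dag                  # gradient of the squared norm at u_dag
xi_l1 = np.array([1.0, 0.0])         # one subgradient of the 1-norm at u_dag

d_l2 = bregman(l2_squared, xi_l2, u, u_dag)   # equals ||u - u_dag||^2 = 4
d_l1 = bregman(l1, xi_l1, u, u_dag)           # equals 0 although u != u_dag
print(d_l2, d_l1)
```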

3 Sparse Regularization

In the following, the focus lies on sparsity promoting regularization methods. To that end, it is assumed that $(\phi_\lambda)_{\lambda\in\Lambda}$ is an orthonormal basis of the Hilbert space $U$, for instance a wavelet or Fourier basis. For $u \in U$, the support of $u$ with respect to the basis $(\phi_\lambda)_{\lambda\in\Lambda}$ is denoted by

\[ \operatorname{supp}(u) := \{\lambda \in \Lambda : \langle \phi_\lambda, u\rangle \ne 0\}. \]

If $|\operatorname{supp}(u)| \le s$ for some $s \in \mathbb{N}$, then the element $u$ is called $s$-sparse. It is called sparse if it is $s$-sparse for some $s \in \mathbb{N}$, that is, $|\operatorname{supp}(u)| < \infty$. Given weights $w_\lambda$, $\lambda \in \Lambda$, bounded below by some constant $w_{\min} > 0$, one defines for $0 < q \le 2$ the $\ell^q$-regularization functional $\mathcal{R}_q: U \to \mathbb{R} \cup \{\infty\}$,

\[ \mathcal{R}_q(u) := \sum_{\lambda\in\Lambda} w_\lambda |\langle \phi_\lambda, u\rangle|^q. \]

If $q = 2$, then the regularization functional is simply the weighted squared Hilbert space norm on $U$. If $q$ is smaller than 2, small coefficients $\langle \phi_\lambda, u\rangle$ are penalized comparatively more strongly, while the penalization of large coefficients becomes less pronounced. As a consequence, the reconstructions obtained by applying any of the regularization methods introduced above will exhibit a small number of significant coefficients, while most of the coefficients will be close to zero. These sparsity enhancing properties of $\ell^q$-regularization become more pronounced as the parameter $q$ decreases. If one chooses $q$ at most 1, then the reconstructions are necessarily sparse in the above, strict sense, that is, the number of nonzero coefficients is finite (see Grasmair 2010):

Proposition 1. Let $q \le 1$, $\alpha > 0$, $v^\delta \in V$. Then every minimizer of the Tikhonov functional $\mathcal{T}_{\alpha,v^\delta}$ with regularization term $\mathcal{R}_q$ is sparse.

There are compelling reasons for using an exponent $q \ge 1$ in applications, as this choice entails the convexity of the ensuing regularization functionals. In contrast, a choice $q < 1$ leads to nonconvex minimization problems and, as a consequence, to numerical difficulties in their minimization. In the convex case $q \ge 1$, there are several possible strategies for computing the minimizers of the regularization functional $\mathcal{T}_{\alpha,v^\delta}$. Below, in Sect. 4, two different iterative methods are considered: an Iterative Thresholding Algorithm for regularization with a priori parameter choice and $1 \le q \le 2$ (Daubechies et al. 2004), and a log-barrier method for Tikhonov regularization with an a posteriori parameter choice by the discrepancy principle in the case $q = 1$ (Candès and Romberg 2005). Iterative thresholding algorithms have also been studied for nonconvex situations, but there the convergence to global minima has not yet been proven (Bredies and Lorenz 2014).
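For the case of exponent 1, the way in which the weighted penalty produces exactly vanishing coefficients can be seen in a few lines of code. The following Python sketch works directly with coefficient vectors (the basis is taken to be the standard basis of a finite-dimensional space, which is an assumption made for illustration) and implements a generic iterative soft thresholding scheme with a fixed step size. It is meant as a plain proximal gradient sketch and not as the specific algorithm of Sect. 4; the names `soft_threshold`, `ista`, and `objective` are hypothetical.

```python
import numpy as np

def soft_threshold(x, thresh):
    """Componentwise soft thresholding: shrink towards zero by thresh."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def objective(A, v, alpha, u, w=None):
    """Tikhonov functional ||A u - v||^2 + alpha * sum_i w_i |u_i|."""
    w = np.ones(A.shape[1]) if w is None else w
    return np.sum((A @ u - v) ** 2) + alpha * np.sum(w * np.abs(u))

def ista(A, v, alpha, w=None, iters=500):
    """Iterative soft thresholding for the functional in `objective`."""
    n = A.shape[1]
    w = np.ones(n) if w is None else w
    # Step size 1/L, where L = 2 ||A||^2 is the Lipschitz constant of the
    # gradient of the data term.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    u = np.zeros(n)
    for _ in range(iters):
        grad = 2.0 * A.T @ (A @ u - v)
        u = soft_threshold(u - step * grad, step * alpha * w)
    return u

# Sanity check with A = identity: the minimizer is the soft thresholding
# of the data at level alpha / 2.
v_id = np.array([3.0, 0.2, -1.0])
u_id = ista(np.eye(3), v_id, alpha=1.0)
print(u_id)                                   # [ 2.5  0.  -0.5]

# Underdetermined random example (illustrative data only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50)) / np.sqrt(20)
u_true = np.zeros(50)
u_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
v = A @ u_true
u_rec = ista(A, v, alpha=0.05, iters=2000)
print("support of reconstruction:", np.flatnonzero(u_rec))
```

The thresholding step sets every coefficient below the threshold to zero exactly, which is the mechanism behind the sparsity of the minimizers for exponent 1.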

3.1 Convex Regularization

Now the theoretical properties of $\ell^q$-type regularization methods with $q \ge 1$ are studied, in particular the questions of existence, stability, convergence, and convergence rates. In order to be able to take advantage of the equivalence result of Theorem 1, it is assumed in the following that the operator $A: U \to V$ has dense range. The question of existence is easily answered (Grasmair et al. 2008, 2011b):

Proposition 2 (Existence). For every $\alpha > 0$ and $v^\delta \in V$ the functional $\mathcal{T}_{\alpha,v^\delta}$ has a minimizer in $U$. Similarly, the problem of minimizing $\mathcal{R}_q(u)$ subject to the constraint $\|Au - v^\delta\| \le \tau\delta$ admits a solution in $U$.


Though the previous lemma states the existence of minimizers for all q  1, there is a difference between the cases q D 1 and q > 1. In the latter case, the regularization functional T˛;v ı is strictly convex, which implies that the minimizer must be unique. For q D 1, the regularization functional is still convex, but the strict convexity holds only, if the operator A is injective. Thus, it can happen that one does not obtain a single approximate solution, but a whole (convex and closed) set of minimizers. Because of this possible nonuniqueness, the stability and convergence results have to be formulated in terms of subsequential convergence. Also, one has to differentiate between a priori and a posteriori parameter selection methods. In the latter case, the stability and convergence results can be formulated solely in terms of the noise level ı. In the case of an a priori parameter choice, it is in addition necessary to take into account the actual choice of a in dependence of ı. For the following results see Lorenz (2008) and Grasmair et al. (2011b). Proposition 3 (Stability). Let ı > 0 be fixed and let vk ! v ı . Consider one of the following settings: Residual method: Let uk 2 U be solutions of the residual method with data vk and noise level ı. Discrepancy principle: Let uk 2 U be solutions of Tikhonov regularization with data vk and an a posteriori parameter choice according to the discrepancy principle for noise level ı. A priori parameter choice: Let ˛ > 0 be fixed, and let uk 2 U be solutions of Tikhonov regularization with data vk and regularization parameter ˛. Then the sequence .uk /k2N has a subsequence converging to a regularized solution uı obtained with data v ı and the same regularization method. If uı is unique, then the whole sequence .uk /k2N converges to uı . Proposition 4 (Convergence). Let ık ! 0 and let vk 2 V satisfy kuk  vk  ık : Assume that there exists u 2 U with Au D v and Rq .u/ < C1. 
Consider one of the following settings: Residual method: Let uk 2 U be solutions of the residual method with data vk and noise level ık . Discrepancy principle: Let uk 2 U be solutions of Tikhonov regularization with data vk and an a posteriori parameter choice according to the discrepancy principle with noise level ık . A priori parameter choice: Let ˛k > 0 satisfy ˛k ! 0 and ık2 =˛k ! 0, and let uk 2 U be solutions of Tikhonov regularization with data vk and regularization parameter ˛k . Then the sequence .uk /k2N has a subsequence converging to an Rq -minimizing solution u of the equation Au D v. If u is unique, then the whole sequence .uk /k2N converges to u . Note that the previous result in particular implies that an Rq -minimizing solution u of Au D v indeed exists. Also, the uniqueness of u is trivial in the case q > 1, as then the functional Rq is strictly convex. Thus, one obtains in this situation indeed convergence of the whole sequence .uk /k2N . Though it is known now that approximative solutions converge to true solutions of the considered equation as the noise level decreases to zero, no estimate for the speed of the convergence is obtained. Indeed, in general situations the convergence can be arbitrarily slow. If, however, the Page 6 of 25

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_25-2 © Springer-Verlag Berlin Heidelberg 2014

$R_q$-minimizing solution $u^\dagger$ satisfies a so-called source condition, then one can obtain sufficiently good convergence rates in the strictly convex case $q > 1$. If, in addition, the solution $u^\dagger$ is sparse and the operator $A$ is invertible on the support of $u^\dagger$, then the convergence rates improve further. Before stating the convergence rates results, the authors recall the definition of the source condition and its relation to the well-known Karush-Kuhn-Tucker condition used in convex optimization.

Definition 1. The $R_q$-minimizing solution $u^\dagger$ of the equation $Au = v$ satisfies the source condition if there exists $\xi \in V$ such that $A^*\xi \in \partial R_q(u^\dagger)$. Here $\partial R_q(u^\dagger)$ denotes the subdifferential of the function $R_q$ at $u^\dagger$ and $A^* : V \to U$ is the adjoint of $A$. In other words, if $q > 1$ one has

$$\langle \xi, A\phi_\lambda \rangle = q\, \operatorname{sign}\bigl(\langle u^\dagger, \phi_\lambda \rangle\bigr)\, \bigl|\langle u^\dagger, \phi_\lambda \rangle\bigr|^{q-1}, \qquad \lambda \in \Lambda,$$

and if $q = 1$ one has

$$\langle \xi, A\phi_\lambda \rangle = \operatorname{sign}\bigl(\langle u^\dagger, \phi_\lambda \rangle\bigr) \ \text{ if } \lambda \in \operatorname{supp}(u^\dagger), \qquad \langle \xi, A\phi_\lambda \rangle \in [-1, +1] \ \text{ if } \lambda \notin \operatorname{supp}(u^\dagger).$$

The conditions $A^*\xi \in \partial R_q(u^\dagger)$ for some $\xi \in V$ and $Au^\dagger = v$ are nothing more than the Karush-Kuhn-Tucker conditions for the constrained minimization problem

$$R_q(u) \to \min \quad \text{subject to} \quad Au = v.$$

In particular, it follows that $\tilde u \in U$ is an $R_q$-minimizing solution of the equation $Au = v$ whenever $\tilde u$ satisfies the equation $A\tilde u = v$ and one has $\operatorname{ran} A^* \cap \partial R_q(\tilde u) \neq \emptyset$ (Ekeland and Temam 1974, Proposition 4.1).

The following convergence rates result can be found in Lorenz (2008) and Grasmair et al. (2011b). It is based on results concerning convergence rates with respect to the Bregman distance (see Burger and Osher 2004) and the fact that, for $\ell^q$-regularization, the norm can be bounded from above, locally, by the Bregman distance.

Proposition 5. Let $1 < q < 2$ and assume that $u^\dagger$ satisfies the source condition. Denote, for $v^\delta \in V$ satisfying $\|v^\delta - v\| \le \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of either the residual method, or Tikhonov regularization with Morozov's discrepancy principle, or Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. Then

$$\|u^\delta - u^\dagger\| = O\bigl(\sqrt{\delta}\bigr).$$

In the case of an a priori parameter choice, one additionally has that

$$\|Au^\delta - v\| = O(\delta).$$

The convergence rates provide (asymptotic) estimates of the accuracy of the approximate solution in dependence on the noise level $\delta$. Therefore, the optimization of the order of convergence is an important question in the field of inverse problems.


In the case of Tikhonov regularization with a priori parameter choice, the rates can indeed be improved if the stronger source condition $A^*A\eta \in \partial R_q(u^\dagger)$ for some $\eta \in U$ holds. Then, one obtains with a parameter choice $\alpha = C\delta^{2/3}$ a rate of order $O(\delta^{2/3})$ (see Groetsch 1984; Resmerita 2005). For quadratic Tikhonov regularization, it has been shown that this rate is the best possible one. That is, except in the trivial case $u^\dagger = 0$, there exists no parameter selection method, neither a priori nor a posteriori, that can yield a better rate than $O(\delta^{2/3})$ (see Neubauer 1997). This saturation result poses a restriction on the quality of reconstructions obtainable with quadratic regularization.

In the nonquadratic case $q < 2$, the situation looks different. If the solution $u^\dagger$ is sparse, then the convergence rates results can be improved beyond the quadratic bound of $O(\delta^{2/3})$. Moreover, they can also be extended to the case $q = 1$. For the improvement of the convergence rates, an additional injectivity condition is needed, which requires the operator $A$ to be injective on the (finite dimensional) subspace of $U$ spanned by the basis elements $\phi_\lambda$, $\lambda \in \operatorname{supp}(u^\dagger)$. This last condition is trivially satisfied if the operator $A$ itself is injective. There exist, however, also interesting situations where the linear equation $Au = v$ is vastly underdetermined, but the restriction of $A$ to all sufficiently low-dimensional subspaces spanned by the basis elements $\phi_\lambda$ is injective. These cases have recently been well studied in the context of compressed sensing (Donoho and Elad 2003; Candès et al. 2006). The first improved convergence rates have been derived in Grasmair et al. (2008, 2011b).

Proposition 6. Let $1 \le q \le 2$ and assume that $u^\dagger$ satisfies the source condition. In addition, assume that $u^\dagger$ is sparse and that the restriction of the operator $A$ to $\operatorname{span}\{\phi_\lambda : \lambda \in \operatorname{supp}(u^\dagger)\}$ is injective.
Then, with the notation of Proposition 5, one has

$$\|u^\delta - u^\dagger\| = O\bigl(\delta^{1/q}\bigr).$$

The most interesting situation is the case $q = 1$. Here, one obtains a linear convergence of the regularized solutions to $u^\dagger$. That is, the approximate inversion of $A$ is not only continuous but in fact Lipschitz continuous; the error in the reconstruction is of the same order as the data error. In addition, the source condition $A^*\xi \in \partial R_q(u^\dagger)$ in some sense becomes weakest for $q = 1$, because then the subdifferential is set-valued and therefore larger than in the strictly convex case. Moreover, the source condition for $q > 1$ requires that the support of $A^*\xi$ equals the support of $u^\dagger$, which strongly limits the applicability of the convergence rates result.

While Proposition 6 concerning convergence rates in the presence of a sparsity assumption and restricted injectivity holds for all $1 \le q \le 2$, the rates result without these assumptions, Proposition 5, requires that the parameter $q$ is strictly greater than 1. The following converse result shows that, at least for Tikhonov regularization with an a priori parameter choice, a similar relaxation of the assumptions by dropping the requirement of restricted injectivity is not possible for $q = 1$; the assumptions of sparsity and injectivity of $A$ on $\operatorname{supp}(u^\dagger)$ are not only sufficient but also necessary for obtaining any sensible convergence rates (see Grasmair et al. 2011a).

Proposition 7. Let $q = 1$ and assume that $u^\dagger$ is the unique $R_1$-minimizing solution of the equation $Au = v$. Denote, for $v^\delta \in V$ satisfying $\|v^\delta - v\| \le \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. If the obtained data error satisfies

$$\|Au^\delta - v\| = O(\delta),$$

then $u^\dagger$ is sparse and the source condition holds. In particular, also

$$\|u^\delta - u^\dagger\| = O(\delta).$$

3.2 Nonconvex Regularization

In the following, the properties of $\ell^q$ regularization with a sub-linear regularization term, that is, $0 < q < 1$, are studied. In this situation, the regularization functional is nonconvex, leading to both theoretical and numerical challenges. Still, nonconvex regularization terms have been considered for applications, because they yield solutions with even more pronounced sparsity patterns than $\ell^1$ regularization.

From the theoretical point of view, the lack of convexity prohibits the application of Theorem 1, which states that the residual method is equivalent to Tikhonov regularization with Morozov's discrepancy principle. Indeed, it seems that an extension of said result to nonconvex regularization functionals has not been treated in the literature so far. Moreover, though corresponding results have recently been formulated for the residual method, the question of whether the discrepancy principle yields stable reconstructions has not yet been answered. For these reasons, the discussion of nonconvex regularization methods is limited to the two cases of the residual method and Tikhonov regularization with an a priori parameter choice. Both methods allow the derivation of basically the same, or at least similar, results as for convex regularization, the main difference being the possible nonuniqueness of the $R_q$-minimizing solutions of the equation $Au = v$ (see Grasmair 2009; Grasmair et al. 2011b; Zarzer 2009).

Proposition 8. Consider either the residual method or Tikhonov regularization with an a priori parameter choice. Then Propositions 2–4 concerning existence, stability, and convergence remain true for $0 < q < 1$.

The convergence rates result in the presence of sparsity, Proposition 6, can also be generalized to nonconvex regularization. The interesting point is that the source condition needed in the convex case apparently is no longer required.
Instead, the other conditions of Proposition 6, uniqueness and sparsity of $u^\dagger$ and restricted injectivity of $A$, are already sufficient for obtaining linear convergence (see Grasmair 2010; Grasmair et al. 2011b).

Proposition 9. Let $0 < q < 1$ and assume that $u^\dagger$ is the unique $R_q$-minimizing solution of the equation $Au = v$. Assume moreover that $u^\dagger$ is sparse and that the restriction of the operator $A$ to $\operatorname{span}\{\phi_\lambda : \lambda \in \operatorname{supp}(u^\dagger)\}$ is injective. Denote, for $v^\delta \in V$ satisfying $\|v^\delta - v\| \le \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of either the residual method or Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. Then

$$\|u^\delta - u^\dagger\| = O(\delta).$$

In the case of Tikhonov regularization with an a priori parameter choice, one additionally has that

$$\|Au^\delta - v\| = O(\delta).$$

4 Numerical Minimization

4.1 Iterative Thresholding Algorithms

In Daubechies et al. (2004), an iterative algorithm has been analyzed that can be used for minimizing the Tikhonov functional $T_{\alpha,v^\delta}$ for fixed $\alpha > 0$, that is, for an a priori parameter choice. To that end, the authors define for $b > 0$ and $1 \le q \le 2$ the function $F_{b,q} : \mathbb{R} \to \mathbb{R}$,

$$F_{b,q}(t) := t + \frac{bq}{2}\, \operatorname{sign}(t)\, |t|^{q-1}.$$

If $q > 1$, the function $F_{b,q}$ is a one-to-one mapping from $\mathbb{R}$ to $\mathbb{R}$. Thus, it has an inverse $S_{b,q} := (F_{b,q})^{-1} : \mathbb{R} \to \mathbb{R}$. In the case $q = 1$ one defines

$$S_{b,1}(t) := \begin{cases} t - b/2 & \text{if } t \ge b/2, \\ 0 & \text{if } |t| < b/2, \\ t + b/2 & \text{if } t \le -b/2. \end{cases} \tag{5}$$

Using the functions $S_{b,q}$, for $b = (b_\lambda)_{\lambda\in\Lambda} \in \mathbb{R}^\Lambda_{>0}$ and $1 \le q \le 2$, the shrinkage operator $\mathbf{S}_{b,q} : U \to U$ is defined as

$$\mathbf{S}_{b,q}(u) := \sum_{\lambda\in\Lambda} S_{b_\lambda,q}\bigl(\langle u, \phi_\lambda \rangle\bigr)\, \phi_\lambda. \tag{6}$$

Proposition 10. Let $v^\delta \in V$, $\alpha > 0$, and $1 \le q \le 2$, and denote $w := (w_\lambda)_{\lambda\in\Lambda}$. Let $\mu > 0$ be such that $\mu \|A^*A\| < 1$. Choose any $u_0 \in U$ and define inductively

$$u_{n+1} := \mathbf{S}_{\mu\alpha w,q}\bigl(u_n + \mu A^*(v^\delta - Au_n)\bigr). \tag{7}$$

Then the iterates $u_n$, defined by the thresholding iteration (7), converge to a minimizer of the functional $T_{\alpha,v^\delta}$ as $n \to \infty$.

The method defined by the iteration (7) can be seen as a forward-backward splitting algorithm for the minimization of $T_{\alpha,v^\delta}$, the inner update $u \mapsto u + \mu A^*(v^\delta - Au)$ being a gradient descent step for the functional $\|Au - v^\delta\|^2$ and the shrinkage operator an implicit (backward) gradient step for $\alpha R_q$. More details on the application of forward-backward splitting methods to similar problems can, for instance, be found in Combettes and Wajs (2005).
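The thresholding iteration (7) can be sketched in a few lines for the case $q = 1$. The sketch below is illustrative only and rests on assumptions not made in the text: $U$ and $V$ are identified with finite-dimensional Euclidean spaces, the basis $(\phi_\lambda)$ is the canonical basis (so the coefficients of $u$ are its entries), and the names `soft_threshold`, `iterative_thresholding`, and `tikhonov` are hypothetical helpers.

```python
import numpy as np

def soft_threshold(t, b):
    """Componentwise S_{b,1} from (5): shrink every entry towards zero by b/2."""
    return np.sign(t) * np.maximum(np.abs(t) - b / 2.0, 0.0)

def iterative_thresholding(A, v, alpha, w, n_iter=500):
    """Iteration (7) for q = 1, with the canonical basis as (assumed) basis."""
    # Step size chosen so that mu * ||A^T A|| < 1 (spectral norm).
    mu = 0.99 / np.linalg.norm(A, 2) ** 2
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        u = soft_threshold(u + mu * A.T @ (v - A @ u), mu * alpha * w)
    return u

def tikhonov(A, v, alpha, w, u):
    """Value of ||Au - v||^2 + alpha * sum_i w_i |u_i|."""
    return np.sum((A @ u - v) ** 2) + alpha * np.sum(w * np.abs(u))

# Small synthetic test problem (all values are made up for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
u_true = np.zeros(60)
u_true[[5, 17, 42]] = [1.5, -2.0, 1.0]
v = A @ u_true + 0.01 * rng.standard_normal(30)
w = np.ones(60)
u_rec = iterative_thresholding(A, v, alpha=0.1, w=w)
```

Starting from $u_0 = 0$, the functional value does not increase along the iterates, so `tikhonov(A, v, 0.1, w, u_rec)` is at most the value at zero; how well `u_rec` matches `u_true` depends on $\alpha$, the noise, and the matrix.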

4.2 Second Order Cone Programs

In the case of an a posteriori parameter choice (or the equivalent residual method), the iterative thresholding algorithm (7) cannot be applied directly, as the regularization parameter $\alpha > 0$ is not known in advance. One can show, however, that the required parameter $\alpha$ depends continuously on $\delta$ (see Bonesky 2009). Thus, it is possible to find the correct parameter iteratively, starting with some initial guess $\alpha > 0$ and computing some $\hat u \in \arg\min_u T_{\alpha,v^\delta}(u)$. Depending on the size of the residual $\|A\hat u - v^\delta\|$, one subsequently either increases or decreases $\alpha$ and computes the minimizer of $T_{\alpha,v^\delta}$ using the new regularization parameter. This procedure of updating $\alpha$ and minimizing $T_{\alpha,v^\delta}$ is stopped as soon as the residual satisfies $\|A\hat u - v^\delta\| \approx \delta$.

In the important case $q = 1$, a different solution algorithm has been established, which takes advantage of the fact that the constrained minimization problem $R_1(u) \to \min$ subject to $\|Au - v^\delta\|_2 \le \delta$ can be rewritten as a second-order cone program (SOCP) (Candès and Romberg 2005). To that end the authors introduce an additional variable $a = (a_\lambda)_{\lambda\in\Lambda} \in \ell^2(\Lambda)$ and minimize $\sum_{\lambda\in\Lambda} w_\lambda a_\lambda$ subject to the constraints $a_\lambda \ge |\langle u, \phi_\lambda \rangle|$ for all $\lambda \in \Lambda$ and $\|Au - v^\delta\|^2 \le \delta^2$. Since the former bound consists of the two linear constraints $a_\lambda \ge \pm\langle u, \phi_\lambda \rangle$, the authors arrive at the SOCP

$$S(u, a) := \sum_{\lambda\in\Lambda} w_\lambda a_\lambda \to \min \quad \text{subject to} \quad \begin{cases} a_\lambda + \langle u, \phi_\lambda \rangle \ge 0, \\ a_\lambda - \langle u, \phi_\lambda \rangle \ge 0, \\ \delta^2 - \|Au - v^\delta\|^2 \ge 0. \end{cases} \tag{8}$$

If the pair $(u, a)$ solves (8), then $u$ is a solution of the residual method. The solutions of the program (8) can be computed using a log-barrier method, defining for $\tau > 0$ the functional

$$S_\tau(u, a) := \tau \sum_{\lambda\in\Lambda} w_\lambda a_\lambda - \sum_{\lambda\in\Lambda} \log\bigl(a_\lambda + \langle u, \phi_\lambda \rangle\bigr) - \sum_{\lambda\in\Lambda} \log\bigl(a_\lambda - \langle u, \phi_\lambda \rangle\bigr) - \log\bigl(\delta^2 - \|Au - v^\delta\|^2\bigr).$$

As $\tau \to \infty$, the minimizers of $S_\tau(u, a)$ converge to a solution of (8). Moreover, one can show that the solution $(u^\delta, a^\delta)$ of (8) and the minimizer $(u^\delta_\tau, a^\delta_\tau)$ of $S_\tau$ satisfy the relation

$$S(u^\delta_\tau, a^\delta_\tau) < S(u^\delta, a^\delta) + (|\Lambda| + 1)/\tau, \tag{9}$$

that is, the value of the minimizer of the relaxed problem $S_\tau$ lies within $(|\Lambda| + 1)/\tau$ of the optimal value of the original minimization problem (Renegar 2001).

In order to solve (8), one alternatingly minimizes $S_\tau$ and increases the parameter $\tau$. That is, one chooses some parameter $\sigma > 1$ defining the increase of $\tau$ and starts with $k = 1$ and some initial parameter $\tau^{(1)} > 0$. Then one iteratively computes $(u_k, a_k) \in \arg\min S_{\tau^{(k)}}$, sets $\tau^{(k+1)} := \sigma \tau^{(k)}$, and increases $k$ until the value $(|\Lambda| + 1)/\tau^{(k)}$ is smaller than some predefined tolerance; according to (9), this implies that also the value $S(u_k, a_k)$ is within the same tolerance of the actual minimum. For the minimization of $S_{\tau^{(k)}}$, which has to take place in each iteration step, one can use a Newton method combined with a line search that ensures that one does not leave the domain of $S_{\tau^{(k)}}$ and that the value of $S_{\tau^{(k)}}$ actually decreases. More details on the minimization algorithm can be found in Candès and Romberg (2005).
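The iterative search for the regularization parameter described at the beginning of this section can be illustrated on a toy problem. The following sketch makes strong simplifying assumptions that are not part of the text: $A$ is the identity on $\mathbb{R}^n$, the basis is the canonical one, and all weights equal 1, so that the minimizer of $T_{\alpha,v^\delta}$ is given in closed form by soft thresholding and the residual is a continuous, nondecreasing function of $\alpha$. The update rule (bisection) and the helper names `residual` and `discrepancy_alpha` are illustrative choices.

```python
import numpy as np

def soft_threshold(t, b):
    """Componentwise S_{b,1}: shrink every entry towards zero by b/2."""
    return np.sign(t) * np.maximum(np.abs(t) - b / 2.0, 0.0)

def residual(alpha, v):
    """Residual ||u_alpha - v|| of the Tikhonov minimizer when A is the identity."""
    return np.linalg.norm(soft_threshold(v, alpha) - v)

def discrepancy_alpha(v, delta, tol=1e-10):
    """Bisection for alpha such that the residual is (approximately) delta.

    Assumes 0 < delta < ||v||, so that such an alpha exists.
    """
    alpha_lo = 0.0                       # residual is 0 here
    alpha_hi = 2.0 * np.max(np.abs(v))   # minimizer is 0 here, residual is ||v||
    while alpha_hi - alpha_lo > tol:
        alpha = 0.5 * (alpha_lo + alpha_hi)
        if residual(alpha, v) > delta:
            alpha_hi = alpha             # residual too large: decrease alpha
        else:
            alpha_lo = alpha             # residual small enough: increase alpha
    return alpha_lo

v_data = np.array([3.0, -1.0, 0.5, 0.0])
alpha_star = discrepancy_alpha(v_data, delta=1.0)
```

For a general operator $A$, each evaluation of the residual would require an inner minimization of $T_{\alpha,v^\delta}$, for instance by the thresholding iteration (7).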


Fig. 2 Ricker wavelet (second derivative of a small Gaussian) with a central frequency of $b = 500$ MHz in the time domain (left: $w_b(t)$, time axis in ns) and in the frequency domain (right: $\hat w_b(2\pi\nu)$, frequency axis in MHz)
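A Ricker wavelet such as the one in Fig. 2 can be generated numerically as follows. The sketch uses the common closed form $(1 - 2\pi^2 f^2 t^2)\, e^{-\pi^2 f^2 t^2}$, which equals the second derivative of a Gaussian up to sign and normalisation; this particular normalisation, the sampling grid, and the function name `ricker` are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def ricker(t, f):
    """Ricker wavelet with central (peak) frequency f, normalised so ricker(0, f) = 1."""
    x = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * x) * np.exp(-x)

f_central = 500e6                        # central frequency in Hz
t = np.linspace(-4e-9, 4e-9, 4001)       # time axis in seconds (-4 ns to 4 ns)
w_b = ricker(t, f_central)

# Magnitude spectrum on the non-negative frequency axis.
spectrum = np.abs(np.fft.rfft(w_b))
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
f_peak = freqs[np.argmax(spectrum)]      # close to f_central, up to the grid spacing
```

The frequency grid spacing here is about 125 MHz, so `f_peak` can only be expected to agree with the central frequency up to that resolution; a longer time window gives a finer frequency grid.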

5 Application: Synthetic Focusing in Ground Penetrating Radar

In this section, sparsity regularization is applied to data obtained with Ground Penetrating Radar (GPR) mounted on a flying helicopter (see Fig. 1). As stated in Sect. 1, the imaging problem will be written as the inversion of the circular Radon transform.

5.1 Mathematical Model

For simplicity of presentation, polarization effects of the electromagnetic field are ignored and a small isotropic antenna is assumed. In this case, each component of the electromagnetic field $E(x^{\mathrm{ant}}; x, t)$ induced by an antenna that is located at $x^{\mathrm{ant}} \in \mathbb{R}^3$ is described by the scalar wave equation

$$\left(\frac{1}{c(x)^2}\, \partial_t^2 - \Delta\right) E(x^{\mathrm{ant}}; x, t) = \delta_{3D}(x - x^{\mathrm{ant}})\, w_b(t), \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{10}$$

Here $\delta_{3D}$ denotes the three-dimensional delta distribution, $w_b$ represents the temporal shape of the emitted radar signal (impulse response function of the antenna) with bandwidth $b$, and $c(x)$ denotes the wave speed. GPR systems are designed to generate ultrawideband radar signals, where the bandwidth $b$ is approximately equal to the central frequency, and the pulse duration is given by $1/b$. Usually, $w_b$ is well approximated by the second derivative of a small Gaussian (Ricker wavelet), see Daniels (2004). Figure 2 shows a typical radar signal emitted by a radar antenna at 500 MHz and its Fourier transform.

5.1.1 Born Approximation

Scattering of the radar signals occurs at discontinuities of the function $c$. In the sequel, it is assumed that

$$\frac{1}{c(x)^2} = \frac{1}{c_0^2}\bigl(1 + u^{3D}(x)\bigr),$$


where $c_0$ is assumed to be constant (the speed of light) and $u^{3D}$ is a possibly nonsmooth function. Moreover, the following decomposition is made:

$$E(x^{\mathrm{ant}}; x, t) = E_0(x^{\mathrm{ant}}; x, t) + E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t), \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R},$$

where $E_0$ denotes the incident field (the solution of the wave equation (10) with $c$ replaced by $c_0$), and $E_{\mathrm{scat}}$ is the scattered field. From (10) it follows that the scattered field satisfies

$$\left(\frac{1}{c_0^2}\, \partial_t^2 - \Delta\right) E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t) = -\frac{u^{3D}(x)}{c_0^2}\, \frac{\partial^2 E(x^{\mathrm{ant}}; x, t)}{\partial t^2}.$$

The Born approximation consists in replacing the total field $E$ in the above equation by the incident field $E_0$. This results in the approximation $E_{\mathrm{scat}} \simeq E_{\mathrm{Born}}$, where $E_{\mathrm{Born}}$ solves the equation

$$\left(\frac{1}{c_0^2}\, \partial_t^2 - \Delta\right) E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{u^{3D}(x)}{c_0^2}\, \frac{\partial^2 E_0(x^{\mathrm{ant}}; x, t)}{\partial t^2}, \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{11}$$

Together with the initial condition $E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t) = 0$ for $t < t_0$, Eq. 11 can be solved explicitly via Kirchhoff's formula, see Courant and Hilbert (1962, p. 692),

$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{1}{4\pi c_0^2} \int_{\mathbb{R}^3} u^{3D}(y)\, \frac{\partial_t^2 E_0\bigl(x^{\mathrm{ant}}; y, t - |x - y|/c_0\bigr)}{|x - y|}\, dy.$$

The identity

$$E_0(x^{\mathrm{ant}}; y, t) = \frac{(w_b *_t \delta_{1D})\bigl(t - |y - x^{\mathrm{ant}}|/c_0\bigr)}{4\pi\, |y - x^{\mathrm{ant}}|} = \frac{w_b\bigl(t - |y - x^{\mathrm{ant}}|/c_0\bigr)}{4\pi\, |y - x^{\mathrm{ant}}|},$$

with $\delta_{1D}$ denoting the one-dimensional delta distribution, leads to

$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{w_b''(t)}{16\pi^2 c_0^2} *_t \int_{\mathbb{R}^3} u^{3D}(y)\, \frac{\delta_{1D}\!\left(t - \frac{|y - x^{\mathrm{ant}}|}{c_0} - \frac{|x - y|}{c_0}\right)}{|x - y|\, |y - x^{\mathrm{ant}}|}\, dy. \tag{12}$$

In GPR, the data are measured in zero offset mode, which means that the scattered field is only recorded at location $x = x^{\mathrm{ant}}$. In this situation, Eq. 12 simplifies to

$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, t) = -\frac{w_b''(t)}{32\pi^2 c_0} *_t \int_{\mathbb{R}^3} u^{3D}(y)\, \frac{\delta_{1D}\bigl(c_0 t/2 - |y - x^{\mathrm{ant}}|\bigr)}{|y - x^{\mathrm{ant}}|^2}\, dy,$$

where the formula $\int \varphi(x)\, \delta_{1D}(ax)\, dx = \varphi(0)/|a|$ has been used. By partitioning the above integral over $y \in \mathbb{R}^3$ into integrals over spheres centered at $x^{\mathrm{ant}}$, and using the definition of the one-dimensional delta distribution, one obtains that

$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, t) = -\frac{w_b''(t)}{32\pi^2 c_0^3} *_t \frac{(R_{3D} u^{3D})(x^{\mathrm{ant}}, c_0 t/2)}{(t/2)^2} \tag{13}$$

with

$$(R_{3D} u^{3D})(x^{\mathrm{ant}}, r) := \int_{|x^{\mathrm{ant}} - y| = r} u^{3D}(y)\, dS(y) \tag{14}$$

denoting the (three-dimensional) spherical Radon transform. This is the basic equation of GPR, which relates the unknown function $u^{3D}$ to the scattered data measured in zero offset mode.

5.1.2 The Radiating Reflectors Model

In the presented application (see Fig. 1), the distances between the antenna position $x^{\mathrm{ant}}$ and the positions $y$ of the reflectors are relatively large. In this case, multiplication by $t$ and convolution with $w_b''$ in (13) can be (approximately) interchanged, that is,

$$-8\pi c_0^2\, t\, E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, 2t) \simeq \Phi(x^{\mathrm{ant}}, t) := w_b''(t) *_t \frac{(R_{3D} u^{3D})(x^{\mathrm{ant}}, c_0 t)}{4\pi c_0 t}. \tag{15}$$

One notes that $\Phi$ is the solution at position $x^{\mathrm{ant}}$ of the wave equation

$$\left(\frac{1}{c_0^2}\, \partial_t^2 - \Delta\right) \Phi(x, t) = w_b''(t)\, u^{3D}(x), \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{16}$$

Equation (16) is named the radiating (or exploding) reflectors model, as the inhomogeneity $u^{3D}$ now appears as active source in the wave equation.

5.1.3 Formulation of the Inverse Problem

Equation (15) relates the unknown function $u^{3D}(x)$ to the data $\Phi(x^{\mathrm{ant}}, t)$. Due to the convolution with the function $w_b''$, which does not contain high frequency components (see Fig. 2), the exact reconstruction of $u^{3D}$ is hardly possible. It is therefore common to apply migration, which is designed to invert the spherical Radon transform. When applying migration to the data defined in (15), one reconstructs a band-limited approximation of $u^{3D}$. Indeed, from Haltmeier et al. (2009, Proposition 2.2), it follows that (see also Haltmeier and Zangerl 2010)

$$\Phi(x^{\mathrm{ant}}, t) = \frac{(R_{3D} u_b^{3D})(x^{\mathrm{ant}}, c_0 t)}{t}, \tag{17}$$

where

$$u_b^{3D}(x) := -\frac{1}{8\pi c_0} \int_{\mathbb{R}^3} \frac{w_b'''(|y|)}{|y|}\, u^{3D}(x - y)\, dy, \qquad x \in \mathbb{R}^3. \tag{18}$$


Therefore, the data $t\,\Phi(x^{\mathrm{ant}}, t)$ can be viewed as the spherical Radon transform of the band-limited reflectivity function $u_b^{3D}(x)$, and the application of migration to the data $t\,\Phi(x^{\mathrm{ant}}, t)$ will reconstruct the function $u_b^{3D}(x)$.

A characteristic of the presented application (see Fig. 1) is that the radar antenna is moved along a one-dimensional path, that is, only the two-dimensional data set

$$v(x^{\mathrm{ant}}, t) := t\, \Phi\bigl((x^{\mathrm{ant}}, 0, 0), t\bigr) \quad \text{with } (x^{\mathrm{ant}}, t) \in \mathbb{R} \times (0, \infty),$$

is available, from which one can recover at most a function with two degrees of freedom. Therefore, it is assumed that the support of the function $u_b^{3D}$ is approximately located in the plane $\{(x_1, x_2, x_3) : x_3 = 0\}$, that is,

$$u_b^{3D}(x_1, x_2, x_3) = u_b^{2D}(x_1, x_2)\, \delta_{1D}(x_3) \quad \text{with } x = (x_1, x_2, x_3) \in \mathbb{R}^2 \times \mathbb{R}.$$

Together with (17) this leads to the equation

$$v(x^{\mathrm{ant}}, t) = (R_{2D} u_b^{2D})(x^{\mathrm{ant}}, c_0 t), \qquad (x^{\mathrm{ant}}, t) \in \mathbb{R} \times (0, \infty), \tag{19}$$

where

$$(R_{2D} u)(x^{\mathrm{ant}}, r) := \int_{|(x^{\mathrm{ant}}, 0) - y| = r} u(y)\, ds(y), \qquad (x^{\mathrm{ant}}, r) \in \mathbb{R} \times (0, \infty), \tag{20}$$

denotes the circular Radon transform (the spherical Radon transform in two dimensions). Equation (19) is the final equation that will be used to reconstruct the band-limited reflectivity function $u_b^{2D}(x_1, x_2)$ from data $v(x^{\mathrm{ant}}, r)$.

5.2 Migration Versus Nonlinear Focusing

If the values $(R_{2D} u_b^{2D})(x^{\mathrm{ant}}, r)$ in (19) were known for all $x^{\mathrm{ant}} \in \mathbb{R}$ and all $r > 0$, then $u_b^{2D}$ could be reconstructed by means of explicit reconstruction formulas. At least two types of theoretically exact formulas for recovering $u_b^{2D}$ have been derived: temporal back-projection and Fourier domain formulas (Stolt 1978; Norton and Linzer 1981; Fawcett 1985; Andersson 1988). These formulas and their variations are known as migration, backprojection, or synthetic focusing techniques.

5.2.1 The Limited Data Problem

In practice, it is not appropriate to assume that $(R_{2D} u_b^{2D})(x^{\mathrm{ant}}, t)$ is known for all $x^{\mathrm{ant}} \in \mathbb{R}$, and the antenna positions and acquisition times have to be restricted to domains $(-X, X)$ and $(0, R/c_0)$, respectively. We model the available partial data by

$$v_{\mathrm{cut}}(x^{\mathrm{ant}}, r) := w_{\mathrm{cut}}(x^{\mathrm{ant}}, r)\, (R_{2D} u_b^{2D})(x^{\mathrm{ant}}, r), \quad \text{with } (x^{\mathrm{ant}}, r) \in (-X, X) \times (0, R), \tag{21}$$

where $w_{\mathrm{cut}}$ is a smooth cutoff function that vanishes outside the domain $(-X, X) \times (0, R)$. Without a priori knowledge, the reflectivity function $u_b^{2D}$ cannot be exactly reconstructed from the incomplete data (21) in a stable way (see Louis and Quinto 2000). It is therefore common to apply migration techniques just to the partial data and to consider the resulting image as approximate reconstruction.


Applying Kirchhoff migration to the partial data (21) leads to

$$u_{\mathrm{KM}}(x_1, x_2) := (R_{2D}^* v_{\mathrm{cut}})(x_1, x_2) := \int_{-X}^{X} v_{\mathrm{cut}}\Bigl(x^{\mathrm{ant}}, \sqrt{(x^{\mathrm{ant}} - x_1)^2 + x_2^2}\Bigr)\, dx^{\mathrm{ant}}.$$

With Kirchhoff migration, the horizontal resolution at location $(0, d)$ is given by $c_0 d/(2Xb)$ (see Borcea et al. 2005, Appendix A.1 for a derivation). Incorporating a priori knowledge via nonlinear inversion, however, may be able to increase the resolution. Below it is demonstrated that this is indeed the case for sparsity regularization using a Haar wavelet basis. A heuristic reason is that sparse objects (reconstructed with sparse regularization) tend to be less blurred than images reconstructed by linear methods.

5.2.2 Application of Sparsity Regularization

For the sake of simplicity, only Tikhonov regularization with $R_1$ penalty term and uniform weights is considered, leading to the regularization functional

$$T_{\alpha,v^\delta}(u) := \|R_{2D} u - v^\delta\|^2 + \alpha \sum_{\lambda\in\Lambda} |\langle \phi_\lambda, u \rangle|, \tag{22}$$

where $(\phi_\lambda)_{\lambda\in\Lambda}$ is a Haar wavelet basis and $\alpha$ is the regularization parameter. Here $u$ and $v^\delta$ are elements of the Hilbert spaces

$$U := \bigl\{u \in L^2(\mathbb{R}^2) : \operatorname{supp}(u) \subset (-X, X) \times (0, R)\bigr\}, \qquad V := L^2\bigl((-X, X) \times (0, R)\bigr).$$

The circular Radon transform $R_{2D}$, considered as operator between $U$ and $V$, is easily shown to be bounded and linear (see, e.g., Scherzer et al. 2009, Lemma 3.79). For the minimization of (22), we apply the iterative thresholding algorithm (7), which in this context reads as

$$u_{n+1} := \mathbf{S}_{\mu\alpha,1}\bigl(u_n + \mu R_{2D}^*(v^\delta - R_{2D} u_n)\bigr). \tag{23}$$

Here $\mathbf{S}_{\mu\alpha,1}$ is the shrinkage operator defined by (6) and (5), and $\mu$ is a positive parameter such that $\mu \|R_{2D}^* R_{2D}\| < 1$.

5.3 Numerical Examples

In the numerical examples, $X = 2$ m and $R = 12$ m. The scatterer $u$ is the characteristic function of a small disk located at position $(0, d)$ with $d = 7$ m, see Fig. 3. It is assumed that the emitted radar signal is a Ricker wavelet $w_b$ with a central frequency of 250 MHz (compare with Fig. 2). The data $v(x^{\mathrm{ant}}, r)$ are generated by numerically convolving $R_{2D} u$ with the second derivative of the Ricker wavelet.

Fig. 3 Geometry in the numerical experiment. Data $v(x^{\mathrm{ant}}, r)$, caused by a small scatterer positioned at location (0, 7 m), are simulated for $(x^{\mathrm{ant}}, r) \in (-X, X) \times (0, R)$ with $X = 2$ m and $R = 12$ m

The reconstructions obtained with Kirchhoff migration and with sparsity regularization are depicted in Fig. 4. Both methods show good resolution in the vertical direction (often called axial or range resolution). The horizontal resolution (lateral or cross-range resolution) of the scatterer, however, is significantly improved by sparsity regularization. This shows that sparsity regularization is indeed able to surpass the resolution limit $c_0 d/(2Xb)$ of linear reconstruction techniques.

In order to demonstrate the stability with respect to data perturbations, we also perform reconstructions after adding Gaussian noise and clutter. Clutter occurs from multiple reflections on fixed structures and reflections resulting from the inhomogeneous background (Daniels 2004). A characteristic property of clutter is that it has similar spectral characteristics as the emitted radar signal. The reconstruction results from data with clutter and noise added are depicted in Fig. 5. Again, sparsity regularization shows better horizontal resolution than Kirchhoff migration. Moreover, the image reconstructed with sparsity regularization is less noisy.
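To get a feeling for the size of the resolution limit $c_0 d/(2Xb)$ in this experiment, the formula can be evaluated with the values of the numerical example. The short sketch below assumes that the bandwidth $b$ equals the central frequency of 250 MHz (the text only states that the two are approximately equal) and uses the vacuum speed of light for $c_0$; the function name `kirchhoff_lateral_resolution` is a hypothetical helper.

```python
C0 = 299_792_458.0   # speed of light in vacuum, in m/s

def kirchhoff_lateral_resolution(d, X, b, c0=C0):
    """Horizontal resolution c0 * d / (2 * X * b) at depth d for aperture (-X, X)."""
    return c0 * d / (2.0 * X * b)

# Values of the numerical example: d = 7 m, X = 2 m, b assumed to be 250 MHz.
resolution_m = kirchhoff_lateral_resolution(d=7.0, X=2.0, b=250e6)
```

Under these assumptions the limit is roughly 2.1 m, which is of the same order as the aperture itself.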

5.4 Application to Real Data

Radar measurements were performed with a 400 MHz antenna (RIS One GPR instrument). The investigated area was a complex avalanche deposit near Salzburg, Austria. The recorded data are shown in Fig. 6. In the numerical reconstruction, an aperture of $X = 3.3$ m and a time window of $R/c_0 = 50$ ns are chosen. The extracted data are depicted in the left image in Fig. 7. One clearly sees a diffraction hyperbola stemming from a scatterer in the subsurface. Moreover, the data agree very well with the simulated data depicted in the left image in Fig. 5.

Fig. 4 Exact data experiment. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Original, Kirchhoff, Sparsity)

The reconstruction results with Kirchhoff migration and with sparsity regularization are depicted in Fig. 7. The regularization parameter $\alpha$ is chosen as 0.02, and the scaling parameter $\mu$ is chosen in such a way that $\mu \|R_{2D}^* R_{2D}\|$ is only slightly smaller than 1.

Fig. 5 Noisy data experiment. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Original, Kirchhoff, Sparsity)

Fig. 6 Measured radar data (horizontal axis: flight path; vertical axis: time). For the numerical reconstruction only the partial data $\Phi((x^{\mathrm{ant}}, 0, 0), t)$, with $(x^{\mathrm{ant}}, t) \in (-X, X) \times (0, R/c_0)$ where $X = 3.3$ m and $R/c_0 = 50$ ns, have been used

Fig. 7 Reconstruction from real data. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Kirchhoff, Sparsity)

Appendix: Numerical Methods

In the main article, we have presented two methods for the minimization of sparsity functionals of the form

$$T_{\alpha,v^\delta}(u) = \|Au - v^\delta\|^2 + \alpha R(u) \tag{24}$$

with

$$R(u) = \sum_{\lambda\in\Lambda} w_\lambda\, |\langle \phi_\lambda, u \rangle|,$$

namely, iterative thresholding and second order cone programs. In this appendix, we will discuss some additional methods.

Numerical methods for sparsity minimization can be divided into two categories: first, methods that attempt to minimize the (non-smooth) functional (24) directly and, second, methods that approximate the functional (24) with a differentiable one and then try to find a minimum of the approximation. In contrast to the non-smooth original problem, here the corresponding optimality condition will be a single valued equation with a unique solution. Both methods introduced in the main article treated the direct problem. We now discuss first two approximation approaches and then two direct methods.

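Explicit Approximation Methods

Usually, the approximating functionals can be written in the form

$$T_{\alpha,\varepsilon,v^\delta}(u) = \|Au - v^\delta\|^2 + \alpha R_\varepsilon(u) \tag{25}$$

with

$$R_\varepsilon(u) = \sum_{\lambda\in\Lambda} w_\lambda\, \psi\bigl(\langle \phi_\lambda, u \rangle, \varepsilon\bigr).$$

Here we assume that the function $\psi : \mathbb{R} \times \mathbb{R}_{>0} \to \mathbb{R}$ has the following properties:

• The function $\psi$ satisfies $\psi(s, \varepsilon) \to |s|$ as $\varepsilon \to 0$ for every $s \in \mathbb{R}$.
• For every $\varepsilon > 0$ the function $\psi(\cdot, \varepsilon)$ is convex and differentiable.

Typical examples of functions $\psi$ are

1. $\psi(s, \varepsilon) = \sqrt{s^2 + \varepsilon^2}$,
2. $\psi(s, \varepsilon) = s^2/(2\varepsilon)$ for $|s| \le \varepsilon$ and $\psi(s, \varepsilon) = |s| - \varepsilon/2$ for $|s| \ge \varepsilon$,
3. $\psi(s, \varepsilon) = |s|^{1+\varepsilon}$,

to name but a few. The advantage of these approaches is that the subdifferential of $T_{\alpha,\varepsilon,v^\delta}$ is at most single valued, and therefore the minimization can be performed with gradient based algorithms.

Consider now $\varepsilon > 0$ fixed. Minimization of the functional $T_{\alpha,\varepsilon,v^\delta}$ can be performed with gradient-type methods, such as Landweber iteration, or (quasi-)Newton methods. In the case of $\psi(s, \varepsilon) = \sqrt{s^2 + \varepsilon^2}$ the gradient based method looks as follows:

$$u^{(n+1)} = u^{(n)} - \gamma_n \Bigl[2A^*\bigl(Au^{(n)} - v^\delta\bigr) + \alpha \sum_{\lambda\in\Lambda} w_\lambda\, \frac{\langle \phi_\lambda, u^{(n)} \rangle}{\sqrt{\langle \phi_\lambda, u^{(n)} \rangle^2 + \varepsilon^2}}\, \phi_\lambda\Bigr]. \tag{26}$$

Here $\gamma_n$ is some positive step-size that can, for instance, be defined by a line-search.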
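The gradient iteration (26) for the smoothing $\sqrt{s^2 + \varepsilon^2}$ can be sketched as follows. The sketch assumes a finite-dimensional setting with the canonical basis as $(\phi_\lambda)$, and it picks the step size by a simple backtracking (Armijo) line search, which is only one of many possible choices; the function names are hypothetical.

```python
import numpy as np

def smoothed_functional(A, v, alpha, w, eps, u):
    """||Au - v||^2 + alpha * sum_i w_i * sqrt(u_i^2 + eps^2)."""
    return np.sum((A @ u - v) ** 2) + alpha * np.sum(w * np.sqrt(u ** 2 + eps ** 2))

def smoothed_gradient(A, v, alpha, w, eps, u):
    """Gradient of the smoothed functional, i.e. the bracket in (26)."""
    return 2.0 * A.T @ (A @ u - v) + alpha * w * u / np.sqrt(u ** 2 + eps ** 2)

def gradient_descent(A, v, alpha, w, eps, n_iter=200):
    """Iteration (26) with a backtracking line search for the step size."""
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = smoothed_gradient(A, v, alpha, w, eps, u)
        f_old = smoothed_functional(A, v, alpha, w, eps, u)
        step = 1.0
        # Halve the step until the Armijo sufficient-decrease condition holds.
        while step > 1e-16 and (
            smoothed_functional(A, v, alpha, w, eps, u - step * g)
            > f_old - 1e-4 * step * (g @ g)
        ):
            step *= 0.5
        if step <= 1e-16:
            break  # no admissible step found; stop at the current iterate
        u = u - step * g
    return u
```

Smaller values of $\varepsilon$ approximate the $\ell^1$ penalty more closely but make the gradient less smooth, so more iterations (or a Newton-type method) may be needed.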

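Iteratively Reweighted Least Squares Method (IRLS)

This approach is based on the identity

$$R(u) = \sum_{\lambda\in\Lambda} w_\lambda\, \frac{\langle \phi_\lambda, u \rangle^2}{|\langle \phi_\lambda, u \rangle|}.$$

Thus, $\ell^1$-regularization can be considered as quadratic regularization with weights $w_\lambda/|\langle \phi_\lambda, u \rangle|$ depending on the minimizer $u$. We now define

$$S(u, \tilde u, \varepsilon) := \sum_{\lambda\in\Lambda} w_\lambda\, \theta_\varepsilon\bigl(|\langle \phi_\lambda, \tilde u \rangle|\bigr)\, \langle \phi_\lambda, u \rangle^2,$$

where the functions $\theta_\varepsilon$ satisfy $\theta_\varepsilon(t) \to 1/t$ for $\varepsilon \to 0$. A typical choice is

$$\theta_\varepsilon(t) = \frac{1}{\sqrt{t^2 + \varepsilon^2}}.$$

Starting with some initial guess $u^{(0)}$ of the minimizer of $T_{\alpha,v^\delta}$, one then chooses some sequence of positive numbers $\varepsilon_n \to 0$ and defines inductively

$$u^{(n+1)} := \arg\min_u \Bigl[\|Au - v^\delta\|^2 + \alpha S\bigl(u, u^{(n)}, \varepsilon_n\bigr)\Bigr].$$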


Note that in each iteration one has to minimize a quadratic functional, which amounts to solving a linear equation. This and related methods have been considered by several authors. A theoretical analysis can, for instance, be found in Daubechies et al. (2010).
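A reweighted least squares iteration of this type can be sketched as follows, again in a finite-dimensional setting with the canonical basis. One deliberate deviation from the display above should be noted: the sketch scales the quadratic term by $1/2$, as is common in IRLS schemes, so that the gradient of the quadratic surrogate at $u = \tilde u$ matches that of $|\cdot|$ and the limit can be compared with the soft-thresholding solution. This scaling, the geometric decay of $\varepsilon_n$, and the function name `irls` are assumptions of this sketch, not statements from the text.

```python
import numpy as np

def irls(A, v, alpha, w, n_iter=60, eps0=1.0, decay=0.5):
    """Reweighted least squares for ||Au - v||^2 + alpha * sum_i w_i |u_i|.

    Each step solves the linear system belonging to the quadratic functional
    ||Au - v||^2 + (alpha / 2) * sum_i weights_i * u_i^2   (note the factor 1/2).
    """
    u = np.zeros(A.shape[1])
    eps = eps0
    AtA = A.T @ A
    Atv = A.T @ v
    for _ in range(n_iter):
        weights = w / np.sqrt(u ** 2 + eps ** 2)   # w_i * theta_eps(|u_i|)
        # Normal equations of the quadratic functional above.
        u = np.linalg.solve(AtA + 0.5 * alpha * np.diag(weights), Atv)
        eps *= decay                                # eps_n -> 0
    return u

# Sanity check with A = identity, where the minimizer is known in closed form:
# every entry of v is shrunk towards zero by alpha / 2.
v_check = np.array([3.0, -1.0, 0.2])
u_check = irls(np.eye(3), v_check, alpha=1.0, w=np.ones(3))
```

For the identity operator the result can be compared with soft thresholding of `v_check` at level $\alpha/2$.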

Accelerated Thresholding Methods

We recall first the iterative thresholding algorithm defined by

\[
u_{n+1} := S_{\alpha w,1}\bigl(u_n + A^*(v^\delta - Au_n)\bigr),
\]

where the thresholding operator $S$ is as in Sect. 4.1. In Beck and Teboulle (2009), an improvement of the iterative thresholding algorithm has been proposed, where the iterates are defined as linear combinations of subsequent iterates. More precisely, given some starting point $u_0\in U$, the iteration is defined by

\[
\begin{aligned}
y_n &= S_{\alpha w,1}\bigl(u_n + A^*(v^\delta - Au_n)\bigr),\\
t_{n+1} &= \frac{1+\sqrt{1+4t_n^2}}{2},\\
u_{n+1} &= y_n + \frac{t_n-1}{t_{n+1}}\,(y_n - y_{n-1}),
\end{aligned}
\]

with the initialization $t_0 = 1$. We note that the first step of this algorithm is an ordinary thresholding step, since the factor $(t_0-1)/t_1$ vanishes, and thus an initialization of $y_{-1}$ is not necessary. It has been shown in Beck and Teboulle (2009) that this algorithm converges provided that the shrinkage parameter is chosen in such a way that $\|A^*A\| < 1$. This is precisely the same constraint as in the classical iterative thresholding algorithm. In addition, the estimate

\[
T_{\alpha,v^\delta}(u_k) - T_{\alpha,v^\delta}(u_\alpha^\delta) \le \frac{2\,\|u_0-u_\alpha^\delta\|^2}{(k+1)^2}
\]

holds, which means that the accuracy of the iterate $u_k$, measured in terms of the energy $T_{\alpha,v^\delta}$, is of order $O(1/k^2)$. In contrast, the classical iterative thresholding algorithm only yields an accuracy of order $O(1/k)$.
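
The accelerated iteration can be sketched as follows. Since the definition of the thresholding operator from Sect. 4.1 is not repeated here, the sketch assumes plain componentwise soft thresholding at a level `thresh`, for which the iteration minimizes $\tfrac12\|Au-v\|^2+\sum_i \mathrm{thresh}_i\,|u_i|$; it further assumes a matrix $A$ with $\|A^TA\|<1$. All function and variable names are hypothetical.

```python
import numpy as np


def soft_threshold(x, thresh):
    """Componentwise soft thresholding at level thresh."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)


def accelerated_thresholding(A, v, thresh, n_iter=200):
    """Thresholding iteration with extrapolation; returns the last thresholded point."""
    u = np.zeros(A.shape[1])
    y_prev = u.copy()            # never used in the first step, because t_0 - 1 = 0
    t = 1.0                      # t_0 = 1
    for _ in range(n_iter):
        y = soft_threshold(u + A.T @ (v - A @ u), thresh)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        u = y + (t - 1.0) / t_next * (y - y_prev)
        y_prev, t = y, t_next
    return y


# Small check with A = 0.9 * I, where the minimizer is known in closed form.
A = 0.9 * np.eye(3)
v = np.array([2.0, 0.05, -1.0])
u_hat = accelerated_thresholding(A, v, thresh=0.1)
u_exact = soft_threshold(0.9 * v, 0.1) / 0.81
```

Setting the extrapolation factor to zero in every step recovers the classical iterative thresholding algorithm, so the same routine can be used to compare the two convergence rates on a given problem.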

Augmented Lagrangian Methods

In order to define augmented Lagrangian methods for the minimization of $T_{\alpha,v^\delta}$, we consider the equivalent formulation as the constrained optimization problem

\[
\|r\|^2 + \alpha R(u) \to \min \quad\text{subject to}\quad Au + r = v^\delta.
\]

The augmented Lagrangian of this optimization problem is the mapping $L_\beta\colon U\times V\times V\to(-\infty,+\infty]$,

\[
L_\beta(u,r,\xi) = \|r\|^2 + \alpha R(u) - \langle\xi, Au + r - v^\delta\rangle + \beta\,\|Au + r - v^\delta\|^2,
\]

depending on the parameter $\beta>0$. In order to find a saddle-point of $L_\beta$, we perform the iteration (see Yang and Zhang 2011)

\[
\begin{aligned}
u_{n+1} &= S_{\tau\alpha w/(2\beta),1}\Bigl(u_n - \tau A^*\bigl(Au_n + r_n - v^\delta - \xi_n/(2\beta)\bigr)\Bigr),\\
r_{n+1} &= \frac{1}{1+\beta}\Bigl(\xi_n/2 - \beta\bigl(Au_{n+1} - v^\delta\bigr)\Bigr),\\
\xi_{n+1} &= \xi_n - \gamma\beta\bigl(Au_{n+1} + r_{n+1} - v^\delta\bigr).
\end{aligned}
\]

We note that the update step for $r$ is simply an explicit computation of the minimum of $L_\beta(u_{n+1},\cdot,\xi_n)$. Similarly, the update step for $u$ can be interpreted as an approximate minimization of $L_\beta(\cdot,r_n,\xi_n)$, namely one thresholded gradient step with respect to $u$. Therefore, the iteration can be regarded as an inexact alternating directions method. It has been shown in Yang and Zhang (2011) that this algorithm converges if the parameters $\tau,\gamma>0$ satisfy, in the normalization used there, the relation $\tau\|A^*A\| + \gamma < 2$. In addition to this algorithm, augmented Lagrangian methods based on a dual problem have also been proposed in Yang and Zhang (2011). We refer to that paper for further information.

Acknowledgments

This work has been supported by the Austrian Science Fund (FWF) within the national research networks Industrial Geometry, project 9203-N12, Variational Imaging on Manifolds, project 11704, and Photoacoustic Imaging in Biology and Medicine, project S10505N20. The authors thank Sylvia Leimgruber (alpS – Center for Natural Hazard Management in Innsbruck) and Harald Grossauer (University Innsbruck) for providing real life data sets.

References

Andersson LE (1988) On the determination of a function from spherical averages. SIAM J Math Anal 19(1):214–232
Beck A, Teboulle M (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans Image Process 18(11):2419–2434
Bleistein N, Cohen JK, Stockwell JW Jr (2001) Mathematics of multidimensional seismic imaging, migration, and inversion. Interdisciplinary applied mathematics: Geophysics and planetary sciences, vol 13. Springer, New York
Bonesky T (2009) Morozov’s discrepancy principle and Tikhonov-type functionals. Inverse Probl 25(1):015015
Borcea L, Papanicolaou G, Tsogka C (2005) Interferometric array imaging in clutter. Inverse Probl 21(4):1419–1460
Bredies K, Lorenz DA (2014) Minimization of non-smooth, non-convex functionals by iterative thresholding. J Optim Theory Appl doi:10.1007/s10957-014-0614-7
Burger M, Osher S (2004) Convergence rates of convex variational regularization. Inverse Probl 20(5):1411–1421
Candès EJ, Romberg J (2005) $\ell_1$-MAGIC: recovery of sparse signals via convex programming. Technical report, 2005. Available at http://www.acm.caltech.edu/l1magic
Candès EJ, Romberg J, Tao T (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans Inf Theory 52(2):489–509
Claerbout J, Muir F (1973) Robust modeling of erratic data. Geophysics 38:826–844
Combettes PL, Wajs VR (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model Simul 4(4):1168–1200
Courant R, Hilbert D (1962) Methods of mathematical physics, vol 2. Wiley-Interscience, New York
Daniels D (2004) Ground penetrating radar. The Institution of Electrical Engineers, London
Daubechies I, Defrise M, De Mol C (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun Pure Appl Math 57(11):1413–1457
Daubechies I, DeVore R, Fornasier M, Güntürk CS (2010) Iteratively reweighted least squares minimization for sparse recovery. Commun Pure Appl Math 63(1):1–38
Donoho DL, Elad M (2003) Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell^1$ minimization. Proc Natl Acad Sci USA 100(5):2197–2202
Ekeland I, Temam R (1974) Analyse convexe et problèmes variationnels. Collection Études Mathématiques. Dunod, Paris
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Mathematics and its applications. Kluwer Academic, Dordrecht
Fawcett JA (1985) Inversion of n-dimensional spherical averages. SIAM J Appl Math 45(2):336–341
Finch D, Rakesh (2007) The spherical mean value operator with centers on a sphere. Inverse Probl 23(6):S37–S49
Frühauf F, Heilig A, Schneebeli M, Fellin W, Scherzer O (2009) Experiments and algorithms to detect snow avalanche victims using airborne ground-penetrating radar. IEEE Trans Geosci Remote Sens 47(7):2240–2251
Grasmair M (2009) Well-posedness and convergence rates for sparse regularization with sublinear $\ell^q$ penalty term. Inverse Probl Imaging 3(3):383–387
Grasmair M (2010) Non-convex sparse regularisation. J Math Anal Appl 365:19–28
Grasmair M, Haltmeier M, Scherzer O (2008) Sparse regularization with $\ell^q$ penalty term. Inverse Probl 24(5):055020
Grasmair M, Haltmeier M, Scherzer O (2011a) Necessary and sufficient conditions for linear convergence of $\ell^1$-regularization. Commun Pure Appl Math 64(2):161–182
Grasmair M, Haltmeier M, Scherzer O (2011b) The residual method for regularizing ill-posed problems. Appl Math Comput 218(6):2693–2710
Groetsch CW (1984) The theory of Tikhonov regularization for Fredholm equations of the first kind. Pitman, Boston
Haltmeier M, Zangerl G (2010) Spatial resolution in photoacoustic tomography: effects of detector size and detector bandwidth. Inverse Probl 26(12):125002
Haltmeier M, Kowar R, Scherzer O (2005) Computer aided location of avalanche victims with ground penetrating radar mounted on a helicopter. In: Lenzen F, Scherzer O, Vincze M (eds) Digital imaging and pattern recognition. Proceedings of the 30th workshop of the Austrian Association for Pattern Recognition, Obergurgl, pp 1736–1744
Haltmeier M, Scherzer O, Zangerl G (2009) Influence of detector bandwidth and detector size to the resolution of photoacoustic tomography. In: Breitenecker F, Troch I (eds) Argesim report no. 35: Proceedings Mathmod’09, Vienna, pp 1736–1744
Hofmann B, Kaltenbacher B, Pöschl C, Scherzer O (2007) A convergence rates result in Banach spaces with non-smooth operators. Inverse Probl 23(3):987–1010
Ivanov VK, Vasin VV, Tanana VP (2002) Theory of linear ill-posed problems and its applications. Inverse and ill-posed problems series, 2nd edn. (Translated and revised from the 1978 Russian original). VSP, Utrecht
Kuchment P, Kunyansky LA (2008) Mathematics of thermoacoustic and photoacoustic tomography. Eur J Appl Math 19:191–224
Levy S, Fullagar T (1981) Reconstruction of a sparse spike train from a portion of its spectrum and application to high-resolution deconvolution. Geophysics 46:1235–1243
Lorenz D (2008) Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J Inverse Ill-Posed Probl 16(5):463–478
Louis AK, Quinto ET (2000) Local tomographic methods in sonar. In: Colton D, Engl HW, Louis AK, McLaughlin JR, Rundell W (eds) Surveys on solution methods for inverse problems. Springer, Vienna, pp 147–154
Neubauer A (1997) On converse and saturation results for Tikhonov regularization of linear ill-posed problems. SIAM J Numer Anal 34:517–527
Norton SJ, Linzer M (1981) Ultrasonic reflectivity imaging in three dimensions: exact inverse scattering solutions for plane, cylindrical and spherical apertures. IEEE Trans Biomed Eng 28(2):202–220
Oldenburg D, Scheuer T, Levy S (1983) Recovery of the acoustic impedance from reflection seismograms. Geophysics 48:1318–1337
Patch SK, Scherzer O (2007) Special section on photo- and thermoacoustic imaging. Inverse Probl 23:S1–S122
Renegar J (2001) A mathematical view of interior-point methods in convex optimization. MPS/SIAM series on optimization. SIAM, Philadelphia
Resmerita E (2005) Regularization of ill-posed problems in Banach spaces: convergence rates. Inverse Probl 21(4):1303–1314
Santosa F, Symes WW (1986) Linear inversion of band-limited reflection seismograms. SIAM J Sci Comput 7(4):1307–1330
Scherzer O, Grasmair M, Grossauer H, Haltmeier M, Lenzen F (2009) Variational methods in imaging. Applied mathematical sciences, vol 167. Springer, New York
Stolt RH (1978) Migration by Fourier transform. Geophysics 43:23–48
Symes WW (2009) The seismic reflection inverse problem. Inverse Probl 25(12):123008
Yang J, Zhang Y (2011) Alternating direction algorithms for $\ell_1$-problems in compressive sensing. SIAM J Sci Comput 33(1):250–278
Zarzer CA (2009) On Tikhonov regularization with non-convex sparsity constraints. Inverse Probl 25:025006


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_26-2 © Springer-Verlag Berlin Heidelberg 2014

Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment

Yanfei Wang

Key Laboratory of Petroleum Resources Research, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, People’s Republic of China

Abstract

Quantitative remote sensing is an appropriate way to estimate structural parameters and spectral component signatures of Earth surface cover type. The real physical system that couples the atmosphere, water, and the land surface is very complicated and is a continuous process, so describing it may require a comprehensive set of parameters. Any practical physical model can therefore only be approximated by a mathematical model which includes a limited number of the most important parameters, namely those that capture the major variation of the real system. The pivotal problem for quantitative remote sensing is the inversion. Inverse problems are typically ill-posed. The ill-posed nature is characterized by the following: (C1) the solution may not exist, (C2) the dimension of the solution space may be infinite, and (C3) the solution is not continuous with respect to variations of the observed signals. These issues exist for nearly all inverse problems in geoscience and quantitative remote sensing. For example, when the observation system is band-limited or sampling is poor, i.e., there are too few observations or the directions are poorly located, the inversion process is underdetermined, which leads to a large condition number of the normalized system and to significant noise propagation. Hence (C2) and (C3) are the main difficulties for quantitative remote sensing inversion. This chapter addresses the theory and methods from the viewpoint that quantitative remote sensing inverse problems can be represented by kernel-based operator equations and solved by coupling regularization and optimization methods.

1 Introduction

Both modeling and model-based inversion are important for quantitative remote sensing. Here, modeling mainly refers to data modeling, which is a method used to define and analyze data requirements; model-based inversion mainly refers to using physical or empirically physical models to infer unknown parameters of interest. Hundreds of models related to atmosphere, vegetation, and radiation have been established during the past decades. Model-based inversion in the geophysical (atmospheric) sciences is well understood. However, model-based inverse problems for the Earth surface have received much attention from scientists only in recent years. Compared to modeling, model-based inversion is still in the stage of exploration (Wang et al. 2009c). This is because intrinsic difficulties exist in the application of a priori information, inverse strategy, and inverse algorithm. The appearance of hyperspectral and multiangular remote sensors has enhanced the means of exploration and provides more spectral and spatial information than before. However, how to utilize this information to solve the problems faced in quantitative remote sensing, so that remote sensing truly becomes quantitative, is still an arduous and urgent task for remote sensing scientists. Remote sensing inversion for different scientific problems in different branches has been receiving more and more attention in recent years. In a series of international research programs, such as the International Geosphere-Biosphere Programme (IGBP), the World Climate Research Programme (WCRP), and NASA’s Earth Observing System (EOS), remote sensing inversion has become a focal point of study.

Fig. 1 Remote observing the Earth (a); geometry and parameters for laser scanner (b)

Model-based remote sensing inversions are usually optimization problems with different constraints. Therefore, methods developed in the field of operations research need to be incorporated into remote sensing inversion. In quantitative remote sensing, the real physical system that couples the atmosphere and the land surface is very complicated (see Fig. 1a) and is a continuous process, so describing it may require a comprehensive set of parameters. Any practical physical model can therefore only be approximated by a model which includes a limited number of the most important parameters, namely those that capture the major variation of the real system. Generally speaking, a discrete forward model to describe such a system is of the form

\[
y = h(x, S), \tag{1}
\]

where $y$ is a single measurement; $x$ is a vector of controllable measurement conditions such as wave band, viewing direction, time, Sun position, polarization, and so forth; $S$ is a vector of state parameters of the system approximation; and $h$ is a function that relates $x$ with $S$, which is generally nonlinear and continuous. With the ability of satellite sensors to acquire multiple bands, multiple viewing directions, and so on, while keeping $S$ essentially the same, we obtain the following nonhomogeneous equations

\[
y = h(x, S) + n, \tag{2}
\]

where $y$ is a vector in $\mathbb{R}^M$, which is an $M$-dimensional measurement space with $M$ values corresponding to $M$ different measurement conditions, and $n\in\mathbb{R}^M$ is the vector of random noise with the same length $M$. Assume that there are $m$ undetermined parameters that need to be recovered. Clearly, if $M = m$, (2) is a determined system, so it is not difficult to develop some suitable algorithms to solve it. If more observations can be collected than there are parameters in the model (Verstraete et al. 1996), i.e., $M > m$, the system (2) is overdetermined. In this situation, a solution in the traditional sense does not exist. We must define its solution in some other sense, for example, as the least squares error (LSE) solution. However, as Li et al. (1998) pointed out, “for physical models with about ten parameters (single band), it is questionable whether remote sensing inversion can be an over determined one in the foreseeable future.” Therefore, the inversion problems in geosciences seem to be always underdetermined in some sense. Nevertheless, in some cases the underdetermined system can be converted to an overdetermined one by utilizing multiangular remote sensing data or by accumulating some a priori knowledge (Li et al. 2001).

The methods developed in the literature for quantitative remote sensing inversion are mainly statistical methods with several variations derived from Bayesian inference. In this chapter, using kernel expressions, we analyze the solution theory and methods for quantitative remote sensing inverse problems from an algebraic point of view. The kernels mentioned in this chapter mainly refer to integral kernel operators (characterized by integral kernel functions) or discrete linear operators (characterized by finite rank matrices). They are closely related to the kernels of linear functional analysis, Hilbert space theory, and spectral theory. In particular, we present regularizing retrieval of parameters with a posteriori choice of regularization parameters, several cases of choosing scale/weighting matrices for the unknowns, numerically truncated singular value decomposition (NTSVD), nonsmooth inversion in $l^p$ space, and advanced optimization techniques. These methods, as far as we know, are novel to the literature in Earth science.

The outline of this chapter is as follows: In Sect. 2, we list three typical kernel-based remote sensing inverse problems. The first is the linear kernel-based bidirectional reflectance distribution function (BRDF) model inverse problem, which is of great importance for land surface parameter retrieval; the second is the backscattering problem for Lidar sensing; and the third is the retrieval of aerosol particle size distributions from optical transmission or scattering measurements, which is a long-standing problem and still an important topic today. In Sect. 3, the regularization theory and solution techniques for ill-posed quantitative remote sensing inverse problems are described. Section 3.1 introduces the concepts of well-posed and ill-posed problems; Sect. 3.2 discusses constrained optimization; Sect. 3.3 fully extends the Tikhonov regularization; Sect. 3.4 discusses direct regularization methods for equality-constrained problems; then in Sect. 3.5, the regularization scheme formulated in the framework of Bayesian statistical inference is introduced. In Sect. 4, the optimization theory and solution methods are discussed for finding an optimized solution of a minimization model. Section 4.1 deals with sparse and nonsmooth inversion in $l^1$ space; Sect. 4.2 introduces Newton-type and gradient-type methods. In Sect. 5.1, the detailed regularizing solution methods for the ill-posed retrieval of land surface parameters are discussed. In Sect. 5.2, the results for retrieval of backscatter cross-sections by Tikhonov regularization are displayed. In Sect. 5.3, the regularization and optimization methods for recovering aerosol particle size distribution functions are presented. Finally, in Sect. 6, some concluding remarks are given.


2 Typical Inverse Problems in Earth Science

Many inverse problems in geophysics are kernel-based, e.g., problems in seismic exploration and gravimetry. I do not introduce these solid Earth problems in the present chapter; instead, I mainly focus on Earth surface problems. Kernel methods can increase the accuracy of remote sensing data processing, including specific land-cover identification, biophysical parameter estimation, and feature extraction (Camps-Valls 2008; Wang et al. 2009c). I introduce three typical kernel-based inverse problems in geoscience: one is an atmospheric problem and the other two are Earth surface problems.

2.1 Land Surface Parameter Retrieval Problem

As is well known, the anisotropy of the land surface is best described by the BRDF. With the progress of multiangular remote sensing, it seems that BRDF models can be inverted to estimate structural parameters and spectral component signatures of Earth surface cover type (see Strahler et al. 1994; Roujean et al. 1992). The state of the art in BRDF modeling is the use of linear kernel-driven models, mathematically described as a linear combination of the isotropic kernel, the volume scattering kernel, and the geometric optics kernel. The information extraction on the terrestrial biosphere and other problems for retrieval of land surface albedos from satellite remote sensing have been considered by many authors in recent years; see, for instance, the survey papers on the kernel-based BRDF models by Pokrovsky and Roujean (2002, 2003), Pokrovsky et al. (2003), and references therein. The computational stability is characterized by the algebraic operator spectrum of the kernel matrix and the observation errors. Therefore, the retrieval of the model coefficients is of great importance for the computation of the land surface albedos. The linear kernel-based BRDF model can be described as follows (Roujean et al. 1992):

\[
f_{iso} + k_{vol}(t_i,t_v,\phi)\,f_{vol} + k_{geo}(t_i,t_v,\phi)\,f_{geo} = r(t_i,t_v,\phi), \tag{3}
\]

where $r$ is the bidirectional reflectance; $k_{vol}$ and $k_{geo}$ are the so-called kernels, that is, known functions of illumination and viewing geometry which describe volume and geometric scattering, respectively; $t_i$ and $t_v$ are the zenith angle of the solar direction and the zenith angle of the view direction, respectively; $\phi$ is the relative azimuth of sun and view direction; and $f_{iso}$, $f_{vol}$, and $f_{geo}$ are three unknown parameters to be adjusted to fit observations. Theoretically, $f_{iso}$, $f_{vol}$, and $f_{geo}$ are closely related to the biomass such as leaf area index (LAI), Lambertian reflectance, sunlit crown reflectance, and viewing and solar angles. The vital task then is to retrieve appropriate values of the three parameters.

Generally speaking, the BRDF model includes kernels of many types. However, it was demonstrated that the combination of the RossThick ($k_{vol}$) and LiSparse ($k_{geo}$) kernels had the best overall ability to fit BRDF measurements and to extrapolate BRDF and albedo (see, e.g., Wanner et al. 1995; Privette et al. 1997; Li et al. 1999). A suitable expression for the RossThick kernel $k_{vol}$ was derived by Roujean et al. (1992). It is reported that the LiTransit kernel $k_{Transit}$, used in place of the kernel $k_{geo}$, is more robust and stable than the non-reciprocal LiSparse kernel and the reciprocal LiSparse kernel $k_{sparse}$ (LiSparseR), where the LiTransit kernel and the LiSparse kernel are related by

\[
k_{Transit} =
\begin{cases}
k_{sparse}, & B \le 2,\\[2pt]
\dfrac{2}{B}\,k_{sparse}, & B > 2,
\end{cases}
\]

and $B$ is given by $B := B(t_i,t_v,\phi) = -O(t_i,t_v,\phi) + \sec t_i' + \sec t_v'$ in Li et al. (2000). A more detailed explanation of $O$ and $t'$ in the definition of $k_{Transit}$ can be found in Wanner et al. (1995). To use the combined linear kernel model, a key issue is to solve the inverse model numerically in a stable way. However, this is difficult in practical applications due to the ill-posed nature of the inverse problem. So far, statistical methods and algebraic methods have been developed for solving this inverse problem, e.g., Pokrovsky and Roujean (2002, 2003) and Wang et al. (2007a, 2008). We will describe these methods and introduce recent advances in the following sections.

2.2 Backscatter Cross-Section Inversion with Lidar

Airborne laser scanning (ALS) is an active remote sensing technique which is also often referred to as Lidar or laser radar. Due to the increasing availability of sensors, ALS has been receiving increasing attention in recent years (e.g., see Wagner et al. 2006). In ALS, a laser emits short infrared pulses toward the Earth surface and a photodiode records the backscattered echo. With each scan, measurements are taken of the round-trip time of the laser pulse, the received echo power, and the beam angle in the locator coordinate system. The round-trip time of the laser pulse allows calculating the range (distance) between the laser scanner and the object that generated the backscattered echo. Thereby, information about the geometric structure of the Earth surface is obtained. The received power provides information about the scattering properties of the targets, which can be exploited for object classification and for modeling of the scattering properties. The latest generation of ALS systems not only records a discrete number of echoes but also digitizes the whole waveform of the reference pulse and the backscattered echoes. In this way, further echo parameters besides the range can be determined. The retrieval of the backscatter cross-section is of great interest in full-waveform ALS. Since it is calculated by deconvolution, its determination is an ill-posed problem in a general sense.

ALS utilizes a measurement principle closely related to that of radar remote sensing (see Fig. 1b). The fundamental relation to explain the signal strength in both techniques is the radar equation (Wagner et al. 2006):

\[
P_r(t) = \frac{D_r^2}{4\pi R^4\beta_t^2}\,P_t\Bigl(t - \frac{2R}{v_g}\Bigr)\,\sigma, \tag{4}
\]

where $t$ is the time, $R$ is the range, $D_r$ is the aperture diameter of the receiver optics, $\beta_t$ is the transmitter beam width, $P_t$ is the transmitted power of the laser, and $\sigma$ denotes the scattering cross-section. The time delay is equal to $t' = 2R/v_g$, where $v_g$ is the group velocity of the laser pulse in the atmosphere. Taking the occurrence of multiple scatterers into account and regarding the impulse response $\Gamma(t)$ of the system receiver, we get (Wagner et al. 2006)

\[
P_r(t) = \sum_{i=1}^{N}\frac{D_r^2}{4\pi R^4\beta_t^2}\,P_t(t)\star\sigma_i'(t)\star\Gamma(t), \tag{5}
\]

where $\star$ denotes the convolution operator. Since convolution is commutative, we can set $P_t(t)\star\sigma_i'(t)\star\Gamma(t) = P_t(t)\star\Gamma(t)\star\sigma_i'(t) = S(t)\star\sigma_i'(t)$, i.e., it is possible to combine both the transmitter and the receiver characteristics into a single term $S(t)$. This term is referred to as the system waveform. Thus, we are able to write our problem in the form (Wang et al. 2009a)

\[
h(t) = \sum_{i=1}^{N}(f\star g)(t), \tag{6}
\]

where $h$ is the incoming signal recorded by the receiver, $f$ denotes a mapping which specifies the kernel function or point spread function, and $g$ is the unknown cross-section. The problem is how to deconvolve the convolution equation (6) to obtain an approximation to the actual cross-section.
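
To see why this deconvolution is delicate, the convolution can be written as a matrix acting on the sampled cross-section. The sketch below is a rough illustration only: the Gaussian pulse stands in for the system waveform, and the sampling, pulse width, and scatterer positions are made-up values rather than quantities from the text.

```python
import numpy as np


def gaussian_waveform(sigma, half_len):
    """Sampled Gaussian pulse with unit sum (hypothetical system waveform)."""
    t = np.arange(-half_len, half_len + 1)
    s = np.exp(-0.5 * (t / sigma) ** 2)
    return s / s.sum()


def convolution_matrix(kernel, n):
    """Matrix F such that F @ g equals np.convolve(g, kernel, mode='same')."""
    F = np.zeros((n, n))
    for j in range(n):
        unit = np.zeros(n)
        unit[j] = 1.0
        F[:, j] = np.convolve(unit, kernel, mode="same")
    return F


kernel = gaussian_waveform(sigma=3.0, half_len=30)
F = convolution_matrix(kernel, n=100)
g = np.zeros(100)
g[[30, 55, 60]] = [1.0, 0.6, 0.8]      # three hypothetical scatterers
h = F @ g                               # noise-free received waveform
cond_F = np.linalg.cond(F)              # very large for a smooth pulse
```

The condition number of `F` is enormous for a smooth pulse, so inverting `F` directly would amplify even tiny errors in `h`; this is the discrete counterpart of the ill-posedness mentioned above.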

2.3 Aerosol Inverse Problems

It is well known that the characteristics of the aerosol particle size, which can be represented mathematically as a size distribution function, say $n(r)$, play an important role in climate modeling due to their uncertainty (Houghton et al. 1966). So, the determination of the particle size distribution function becomes a basic task in aerosol research (see, e.g., Davies 1974; McCartney 1976; Twomey 1977; Bohren and Huffman 1983; Bockmann 2001; Bockmann and Kirsche 2006). Since Ångström (1929) first suggested the relationship between the size of atmospheric aerosol particles and the wavelength dependence of the extinction coefficient, size distributions have been retrieved from extinction measurements. For a sun-photometer, the attenuation of the aerosols can be written as the integral equation of the first kind

\[
\tau_{aero}(\lambda) = \int_0^\infty \pi r^2\,Q_{ext}(r,\lambda,\eta)\,n(r)\,dr + \varrho(\lambda), \tag{7}
\]

where $r$ is the particle radius, $n(r)$ is the columnar aerosol size distribution (i.e., the number of particles per unit area per unit radius interval in a vertical column through the atmosphere), $\eta$ is the complex refractive index of the aerosol particles, $\lambda$ is the wavelength, $\varrho(\lambda)$ is the error/noise, and $Q_{ext}(r,\lambda,\eta)$ is the extinction efficiency factor from Mie theory. Since the aerosol optical thickness (AOT) can be obtained from measurements of the solar flux density with sun-photometers, one can retrieve the size distribution by the inversion of AOT measurements through the above equation. This type of method is called extinction spectrometry, which is not only the earliest method applying remote sensing to determine atmospheric aerosol size characteristics but also the most mature method thus far. A common feature of all particle size distribution measurement systems is that the relation between noiseless observations and the size distribution function can be expressed as a Fredholm integral equation of the first kind, e.g., see Nguyen and Cox (1989), Voutilainen and Kaipio (2000), Wang et al. (2006a), Wang (2007, 2008), and Wang and Yang (2008). For the aerosol attenuation problem (7), let us rewrite (7) in the form of the abstract operator equation

\[
K\colon X\to Y,\qquad (Kn)(\lambda) + \varrho(\lambda) = \int_0^\infty k(r,\lambda,\eta)\,n(r)\,dr + \varrho(\lambda) = o(\lambda) + \varrho(\lambda) = d(\lambda), \tag{8}
\]

where $k(r,\lambda,\eta) = \pi r^2 Q_{ext}(r,\lambda,\eta)$; $X$ denotes the function space of aerosol size distributions; and $Y$ denotes the observation space. Both $X$ and $Y$ are considered to be separable Hilbert spaces. Note that $\tau_{aero}$ in Eq. (7) is the measured term and is therefore inevitably contaminated by noise or errors. Hence, $d(\lambda)$ is actually a perturbed right-hand side. Keeping the operator notation in mind, Eq. (8) can be written as

\[
Kn + \varrho = o + \varrho = d. \tag{9}
\]
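
In computations the operator equation is replaced by a matrix equation. The following sketch discretizes a first-kind integral operator with the trapezoidal rule. It rests on several assumptions that are not part of the text: the infinite radius range is truncated to a finite grid, the function `toy_kernel` is a smooth placeholder and not the Mie extinction kernel, and the wavelengths and the size distribution are invented for illustration.

```python
import numpy as np


def discretize_first_kind(kernel, radii, wavelengths):
    """Trapezoidal-rule matrix K with (K @ n)[i] ~ integral of k(r, lam_i) n(r) dr."""
    dr = np.diff(radii)
    weights = np.zeros_like(radii)
    weights[:-1] += 0.5 * dr
    weights[1:] += 0.5 * dr
    # R and L have shape (len(wavelengths), len(radii)).
    R, L = np.meshgrid(radii, wavelengths)
    return kernel(R, L) * weights


def toy_kernel(r, lam):
    """Placeholder for pi * r^2 * Q_ext; NOT the Mie extinction efficiency."""
    return np.pi * r**2 * np.exp(-r / lam)


radii = np.linspace(0.05, 2.0, 40)                  # hypothetical radius grid
wavelengths = np.array([0.44, 0.67, 0.87, 1.02])    # hypothetical channels
K = discretize_first_kind(toy_kernel, radii, wavelengths)
n_true = np.exp(-0.5 * ((np.log(radii) - np.log(0.3)) / 0.5) ** 2)
d = K @ n_true                                       # noise-free synthetic data
```

With four channels and forty unknowns the discrete system is strongly underdetermined, which mirrors the situation described for real sun-photometer data.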

3 Regularization

3.1 What Causes Ill-Posedness

From this section until the end of the chapter, unless otherwise specified, we will denote the operator equation as

\[
K(x) = y, \tag{10}
\]

which is an appropriate expression for an observing system, with $K$ the response function (linear or nonlinear), $x$ the unknown input, and $y$ the observed data. Particularly, if $K$ is a linear mapping, we will denote the response system as

\[
Kx = y, \tag{11}
\]

which is clearly a special case of (10). We will use $K$ to denote sometimes an operator in infinite-dimensional spaces and sometimes a matrix; the meaning should be clear from the context. The problem (10) is said to be properly posed or well-posed if it has the following three properties:

(C1) There exists a solution of the problem, i.e., existence;
(C2) There is at most one solution of the problem, i.e., uniqueness;
(C3) The solution depends continuously on the variations of the right-hand side (data), i.e., stability.

The condition (C1 ) can be easily fulfilled if we enlarge the solution space of the problem (10). The condition (C2 ) is seldom satisfied for many indirectly measurement problems. This means more than one solution may be found for the problem (10) and the information about the model is missing. In this case, a priori knowledge about the solution must be incorporated and built into the model. The requirement of stability is the most important one. If the problem (10) lacks the property of stability, then the computed solution has nothing to do with the true solution since the practically computed solution is contaminated by unavailable errors. Therefore, there is no way to overcome this difficulty unless additional information about the solution is available. Again, a priori knowledge about the solution should be involved. If problem (10) is well-posed, then K has a well-defined, continuous inverse operator K 1 . In particular, K 1 .K.x// D x for any x 2 X and Range.K/ D Y. In this case, both the algebraic nature of the spaces and the topologies of the spaces are ready to be employed. The particle size distribution model (9) is a linear model in infinite spaces. The operator K is compact. The ill-posedness is self-evident because that at least one of the three items for wellposed problems is violated. Note that (3) is a linear model in finite spaces, therefore it is easy to rewrite it into a finite rank operator equation Page 7 of 28


$$Kx = y, \tag{12}$$

by setting $x = [f_{iso}, f_{vol}, f_{geo}]^T$ and $y = [y_j]$ with the entries $y_j = r_j(t_i, t_v, \phi)$, where $y$ is the measurement data. The inverse problem is how to recover the model parameters $x$ given the limited measurement data $y$. For Lidar backscatter cross-section inversion, one needs to solve a deconvolution problem. The ill-posedness is due to the ill-conditioning of the spectrum of the operator and to noisy data. Numerically, the discrete ill-posedness of the above examples arises because their operators may be inaccurate (they can only be calculated approximately), because their models are usually underdetermined if there are too few observations or a poor directional range, or because the observations are highly linearly dependent and noisy. For example, a single angular observation may lead to an underdetermined system that has infinitely many solutions (the null space of the kernel contains nonzero vectors) or no solution at all (the rank of the coefficient matrix is not equal to the rank of the augmented matrix). In practice, random uncertainty in the sampled reflectances translates into uncertainty in the BRDF and albedo. We note that noise inflation depends on the sampling geometry alone. For example, for MODIS and MISR sampling, the noise inflation varies with latitude and time of year; but for kernel-based models, it does not depend on wavelength or the type of BRDF viewed. Therefore, the random noise in the observation (BRDF) and the small singular values of $K$ control the error propagation.
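The underdetermined case described above can be illustrated with a minimal NumPy sketch. The kernel values and the reflectance are hypothetical numbers chosen for illustration; they are not taken from the chapter. The sketch shows that a single observation gives a rank-one system whose null space is two-dimensional, so infinitely many parameter vectors reproduce the data exactly.

```python
import numpy as np

# Hypothetical kernel values for one viewing geometry (illustrative only).
k_geo, k_vol = -1.2, 0.35
K = np.array([[1.0, k_geo, k_vol]])   # one observation, three unknowns
y = np.array([0.25])                  # hypothetical observed reflectance

# The SVD reveals the rank and an orthonormal basis of the null space of K.
U, s, Vt = np.linalg.svd(K, full_matrices=True)
rank = int(np.sum(s > 1e-12 * s.max()))
null_basis = Vt[rank:].T              # columns span {z : K z = 0}

# The minimal-norm solution, plus any null-space vector, also solves K x = y.
x_min = np.linalg.pinv(K) @ y
x_other = x_min + null_basis @ np.array([1.0, -2.0])
```

Both `x_min` and `x_other` fit the single observation exactly, which is why additional a priori information is needed to single out one of them.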

3.2 Imposing a Priori Constraints on the Solution

For effective inversion of the ill-posed kernel-driven model, we have to impose an a priori constraint on the parameters of interest. This leads to solving a constrained LSE problem

$$\min J(x), \quad \text{s.t.} \quad Kx = y, \quad \Delta_1 \le c(x) \le \Delta_2, \tag{13}$$

where $J(x)$ denotes an objective functional, which is a function of $x$; $c(x)$ is the constraint on the solution $x$; and $\Delta_1$ and $\Delta_2$ are two constants which specify the bounds of $c(x)$. Usually, $J(x)$ is chosen as a norm of $x$ in a suitable scale. If the parameter $x$ comes from a smooth function, then $J(x)$ can be chosen as a smooth function; otherwise, $J(x)$ can be nonsmooth. The constraint $c(x)$ can be smooth (e.g., a Sobolev stabilizer) or nonsmooth (e.g., a total variation or $l_q$ norm ($q \ne 2$) based stabilizer). A commonly used constraint is smoothness. It assumes that physical properties in a neighborhood of space or in an interval of time present some coherence and generally do not change abruptly. Practically, we can always find regularities of a physical phenomenon with respect to certain properties over a short period of time (Wang et al. 2007a, 2008). The smoothness prior has been one of the most popular a priori assumptions in applications. The general framework is the so-called regularization, which will be explained in the next subsection.

3.3 Tikhonov/Phillips-Twomey's Regularization

Most inverse problems in real environments are ill-posed. Regularization methods are widely used to solve such ill-posed problems. The complete theory of regularization was developed by Tikhonov and his colleagues (Tikhonov and Arsenin 1977). For the discrete model (12), we suppose $y$ is the true right-hand side and denote by $y_n$ the measurements with noise, which represent the bidirectional reflectance. The Tikhonov regularization method solves the regularized minimization problem

$$J^\alpha(x) := \|Kx - y_n\|_2^2 + \alpha \|D^{1/2} x\|_2^2 \to \min \tag{14}$$

instead of solving

$$J(x) = \|Kx - y_n\|_2^2 \to \min. \tag{15}$$

In (14), $\alpha$ is the regularization parameter and $D$ is a positive (semi-)definite operator. By a variational process, the minimizer of (14) satisfies

$$K^T K x + \alpha D x = K^T y_n. \tag{16}$$
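As a minimal sketch of how (14) and (16) relate in the discrete setting, the following NumPy fragment solves the same problem in two ways. The matrix `K`, the data `y_n`, the value of `alpha`, and the particular positive definite `D` are all illustrative assumptions. With a Cholesky factor $D = R^T R$, the penalty $\|D^{1/2}x\|_2^2 = x^T D x$ equals $\|Rx\|_2^2$, so (14) becomes an ordinary stacked least squares problem whose normal equations are exactly (16).

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, alpha = 8, 5, 0.1
K = rng.standard_normal((M, N))     # hypothetical forward matrix
y_n = rng.standard_normal(M)        # hypothetical noisy data

# An assumed symmetric positive definite scale matrix: I + L^T L,
# where L is the first-difference operator.
L = np.diff(np.eye(N), axis=0)
D = np.eye(N) + L.T @ L

# Route 1: solve the normal equations (16) directly.
x_normal = np.linalg.solve(K.T @ K + alpha * D, K.T @ y_n)

# Route 2: minimize (14) as a stacked least squares problem.
R = np.linalg.cholesky(D).T         # D = R^T R
A = np.vstack([K, np.sqrt(alpha) * R])
b = np.concatenate([y_n, np.zeros(N)])
x_stacked, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The stacked form avoids forming $K^T K$ explicitly, which can be preferable when $K$ is badly conditioned; both routes return the same minimizer up to rounding.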

The operator $D$ is a scale matrix which imposes a smoothness constraint on the solution $x$. The scale operator $D$ and the regularization parameter $\alpha$ can be considered as some kind of a priori information, which will be discussed next. Phillips-Twomey's regularization is based on solving the problem (Phillips 1962; Twomey 1975)

$$\min_x Q(x), \quad \text{s.t.} \quad \|Kx - y_n\| = \Delta, \tag{17}$$

where $Q(x) = (Dx, x)$, $D$ is a preassigned scale matrix, and $\Delta > 0$. It is clear that Phillips-Twomey's regularization is similar to Tikhonov's regularization and can be written in a consistent form.

3.3.1 Choices of the Scale Operator D

To regularize the ill-posed problem discussed in Sect. 3.1, the choice of the scale operator $D$ has a great impact on the performance of the regularization. Note that the matrix $D$ plays a role both in imposing a smoothness constraint on the parameters and in improving the conditioning of the spectrum of the operator $K^T K$. Therefore, it should be positive definite or at least positive semidefinite. One may readily see that the identity is a possible choice. However, this choice does not fully exploit the assumption about the continuity of the parameters. In Wang et al. (2007a), we assume that the operator equation (12) is the discretized version of a continuous physical model

$$K(x(\tau)) = y(\tau) \tag{18}$$

with $K$ the linear/nonlinear operator, $x(\tau)$ the complete parameters describing the land surfaces, and $y$ the observation. Most of the kernel model methods reported in the literature can be put in the above formulation. Hence, instead of establishing regularization for the operator equation (12) in the Euclidean space, it is more convenient to perform the regularization for the operator equation (18) on an abstract space. So from a priori considerations we suppose that the parameter $x$ is a smooth function, in the sense that $x$ is continuous on $[a, b]$, is differentiable almost everywhere, and its derivative is square-integrable on $[a, b]$. By Sobolev's imbedding theorem (see, e.g., Tikhonov and Arsenin 1977; Xiao et al. 2003), the continuously differentiable function $x$ in the space $W^{1,2}$ imbeds automatically into the space $L_2$ of integrable continuous functions. The inner product of two functions $x(\tau)$ and $y(\tau)$ in the space $W^{1,2}$ is defined by

$$(x(\tau), y(\tau))_{W^{1,2}} := \int_\Omega \left( x(\tau) y(\tau) + \sum_{i=1}^{n} \frac{\partial x}{\partial \tau_i} \frac{\partial y}{\partial \tau_i} \right) d\tau_1 \, d\tau_2 \cdots d\tau_n, \tag{19}$$

where $\Omega$ is the assigned interval of the definition. Now we construct a regularizing algorithm that yields an approximate solution $x^\alpha \in W^{1,2}[a, b]$ which converges, as the error level approaches zero, to the actual parameters in the norm of the space $W^{1,2}[a, b]$. Precisely, we construct the functional

$$J^\alpha(x) = F[Kx, y] + \alpha L(x), \tag{20}$$

where $F[Kx, y] = \frac{1}{2}\|Kx - y\|_{L_2}^2$ and $L(x) = \frac{1}{2}\|x\|_{W^{1,2}}^2$. Assume that the variation of $x(\tau)$ is flat and smooth near the boundary of the integration interval $[a, b]$. In this case, the derivatives of $x$ are zero at the boundary of $[a, b]$. Let $h_r$ be the step size of the grid on $[a, b]$, which could be equidistant or adaptive. Then, after discretization of $L(x)$, $D$ is a tridiagonal matrix of the form

$$D := D_1 = \begin{bmatrix}
1 + \frac{1}{h_r^2} & -\frac{1}{h_r^2} & 0 & \cdots & 0 \\
-\frac{1}{h_r^2} & 1 + \frac{2}{h_r^2} & -\frac{1}{h_r^2} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & -\frac{1}{h_r^2} & 1 + \frac{2}{h_r^2} & -\frac{1}{h_r^2} \\
0 & \cdots & 0 & -\frac{1}{h_r^2} & 1 + \frac{1}{h_r^2}
\end{bmatrix}.$$

For the linear model (3), after the kernel normalization, we may consider $[a, b] = [-1, 1]$. Thus, $D$ is in the above form with $h_r = 2/(N - 1)$. There are many techniques for choosing the scale matrix $D$ appropriately. In Phillips-Twomey's formulation of regularization (see, e.g., Wang et al. 2006a), the matrix $D$ is created by the norm of the second differences, $\sum_{i=2}^{N-1} (x_{i-1} - 2x_i + x_{i+1})^2$, which leads to the following form of the matrix $D$:

$$D := D_2 = \begin{bmatrix}
1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
-2 & 5 & -4 & 1 & \cdots & 0 & 0 & 0 & 0 \\
1 & -4 & 6 & -4 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & -4 & 6 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 6 & -4 & 1 & 0 \\
0 & 0 & 0 & 0 & \cdots & -4 & 6 & -4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 1 & -4 & 5 & -2 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & -2 & 1
\end{bmatrix}.$$
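The two scale matrices can be assembled from difference operators, as in the following minimal sketch. The function name `scale_matrices` and the choice $N = 7$ are illustrative assumptions. The tridiagonal matrix is $I + L_1^T L_1 / h_r^2$ with $L_1$ the first-difference operator, and the Phillips-Twomey matrix is $L_2^T L_2$ with $L_2$ the second-difference operator, which reproduces the pentadiagonal pattern shown above.

```python
import numpy as np

def scale_matrices(N, h):
    """Return (D1, D2) for N grid points and step size h (illustrative helper)."""
    L1 = np.diff(np.eye(N), n=1, axis=0)   # (N-1) x N first differences
    L2 = np.diff(np.eye(N), n=2, axis=0)   # (N-2) x N second differences
    D1 = np.eye(N) + (L1.T @ L1) / h**2    # discrete W^{1,2} norm
    D2 = L2.T @ L2                         # sum of squared second differences
    return D1, D2

N = 7
D1, D2 = scale_matrices(N, h=2.0 / (N - 1))
```

Built this way, `D1` has all eigenvalues at least 1, whereas `D2` annihilates constant and linear vectors and therefore has rank $N - 2$.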


However, the matrix $D_2$ is badly conditioned, and thus the solution that minimizes the functional $J^\alpha[x]$ with $D_2$ as the smoothness constraint is observed to have some oscillations (Wang et al. 2006). Another option is the negative Laplacian (see, e.g., Wang and Yuan 2003; Wang 2007): $Lx := -\sum_{i=1}^{n} \frac{\partial^2 x}{\partial \tau_i^2}$, for which the scale matrix $D$ for the discrete form of the negative Laplacian $Lx$ is

$$D := D_3 = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & \cdots & 0 & -1 & 1
\end{bmatrix},$$

where we assume that the discretization step length is 1. The scale matrix $D_3$ is positive semidefinite but not positive definite, and hence the minimization problem may not work efficiently for severely ill-posed inverse problems. Another option for the scale matrix $D$ is the identity, i.e., $D := D_4 = \mathrm{diag}(e)$, where $e$ is the vector of all ones; however, this scale matrix is too conservative and may lead to over-regularization.

3.3.2 Regularization Parameter Selection Methods

As noted above, the choice of the regularization parameter $\alpha$ is important for tackling the ill-posedness. An a priori choice of the parameter allows any $0 < \alpha < 1$. However, an a priori choice does not reflect the degree of approximation, which may lead to either overestimation or underestimation of the regularizer. We will use the widely used discrepancy principle (see, e.g., Tikhonov and Arsenin 1977; Tikhonov et al. 1995; Xiao et al. 2003) to find an optimal regularization parameter. In fact, the optimal parameter $\alpha^*$ is a root of the nonlinear function

$$\Psi(\alpha) = \|Kx_\alpha - y_n\|^2 - \delta^2, \tag{21}$$

where $\delta$ is the error level specifying how closely the observation approximates the true noiseless data, and $x_\alpha$ denotes the solution of the problem in Eq. (16) corresponding to the value $\alpha$ of the regularization parameter. Noting that $\Psi(\alpha)$ is differentiable, fast algorithms for computing the optimal parameter $\alpha^*$ can be implemented. In this chapter we will use the cubically convergent algorithm developed in Wang and Xiao (2001):

$$\alpha_{k+1} = \alpha_k - \frac{2\Psi(\alpha_k)}{\Psi'(\alpha_k) + \left( \Psi'(\alpha_k)^2 - 2\Psi(\alpha_k)\Psi''(\alpha_k) \right)^{1/2}}. \tag{22}$$

In the above cubically convergent algorithm, the functions $\Psi'(\alpha)$ and $\Psi''(\alpha)$ have the following explicit expressions:

$$\Psi'(\alpha) = -\alpha \beta'(\alpha), \qquad \Psi''(\alpha) = -\beta'(\alpha) - 2\alpha \left[ \left\| \frac{dx_\alpha}{d\alpha} \right\|^2 + \left( x_\alpha, \frac{d^2 x_\alpha}{d\alpha^2} \right) \right],$$

where $\beta(\alpha) = \|x_\alpha\|^2$, $\beta'(\alpha) = 2\left( \frac{dx_\alpha}{d\alpha}, x_\alpha \right)$, and $x_\alpha$, $dx_\alpha/d\alpha$, and $d^2x_\alpha/d\alpha^2$ can be obtained by solving the following equations:

$$(K^T K + \alpha D)\, x_\alpha = K^T y_n, \tag{23}$$

$$(K^T K + \alpha D)\, \frac{dx_\alpha}{d\alpha} = -D x_\alpha, \tag{24}$$

$$(K^T K + \alpha D)\, \frac{d^2 x_\alpha}{d\alpha^2} = -2D \frac{dx_\alpha}{d\alpha}. \tag{25}$$

To solve the linear matrix-vector equations (23)–(25), we use the Cholesky (square root) decomposition method. A remarkable feature of the solution of (23)–(25) is that the Cholesky decomposition of the coefficient matrix $K^T K + \alpha D$ needs to be computed only once; the three vectors $x_\alpha$, $dx_\alpha/d\alpha$, and $d^2x_\alpha/d\alpha^2$ can then be obtained cheaply. In the case of perturbation of operators, the above method can be applied similarly. Note that in such a case, the discrepancy equation becomes

$$\tilde\Psi(\alpha) = \|\tilde K x_\alpha - y_n\|^2 - (\delta + \tilde\delta)^2 - \left( \mu_\eta^\kappa(y_n, \tilde K) \right)^2 = 0, \tag{26}$$

where $\tilde\delta$ is the error level of $\tilde K$ approximating the true operator, $\eta = (\delta, \tilde\delta)$, $\mu_\eta^\kappa(y_n, \tilde K)$ is the incompatibility measure of the equation $Kx = y$, and $\kappa > 0$. Equation 26 is called a generalized discrepancy equation and is a one-dimensional nonlinear equation, which can be solved by Newton's method or by the cubically convergent method. For more information about the generalized discrepancy, we refer to Tikhonov et al. (1995) and Wang (2007).

3.4 Direct Regularization

Instead of Tikhonov regularization, our goal in this section is to solve an equality constrained $l_2$ problem

$$\|x\|_2 \to \min, \quad \text{s.t.} \quad \tilde K x + n = y_n, \tag{27}$$

where $\tilde K \in \mathbb{R}^{M \times N}$ is a perturbation of $K$ (i.e., if we regard $K$ as an accurate operator, then $\tilde K$ is an approximation to $K$ which may contain error or noise), $x \in \mathbb{R}^N$, and $n, y_n \in \mathbb{R}^M$. As mentioned already, the ill-posedness is largely due to the small singular values of the linear operator. Let us denote the singular value decomposition of $\tilde K$ as $\tilde K = U_{M \times N} \Sigma_{N \times N} V_{N \times N}^T = \sum_{i=1}^{N} \sigma_i u_i v_i^T$, where both $U = [u_i]$ and $V = [v_i]$ are orthonormal matrices, i.e., the products of $U$ with its transpose and $V$ with its transpose are both identity matrices; $\Sigma$ is a diagonal matrix whose nonzero entries consist of the singular values of $\tilde K$. The traditional LSE solution $x_{lse}$ of the constrained optimization system (27) can be expressed by the singular values and singular vectors in the form

$$x_{lse} = \sum_{i=1}^{N} \frac{1}{\sigma_i} \left( u_i^T y_n \right) v_i. \tag{28}$$

If the rank of $\tilde K$ is $p \le \min\{M, N\}$, then the above solution form inevitably encounters numerical difficulties, since the denominator contains numerically infinitesimal values. Therefore, to solve the problem by the SVD, we must impose a priori information. As we have noted, Tikhonov regularization solves a variational problem by incorporating a priori information into the solution. In this section, we consider another way of incorporating a priori information into the solution. The idea is quite simple: instead of filtering the small singular values by replacing them with small positive numbers, we just truncate the summation, i.e., the terms containing small singular values are replaced by zeros. In this way, we obtain a regularized solution of the least squares problem (27) of minimal norm

$$x_{lse}^{trunc} = \sum_{i=1}^{p} \frac{1}{\sigma_i} \left( u_i^T y_n \right) v_i \tag{29}$$

and $\min_x \|\tilde K x - y_n\|_2^2 = \sum_{i \ge p+1} |u_i^T y_n|^2$. We wish to examine the truncated singular value decomposition further. Note that in practice, $\tilde K$ may not be exactly rank deficient, but instead numerically rank deficient, i.e., it has one or more small but nonzero singular values such that $p_\delta < \mathrm{rank}(\tilde K)$. Here, $p_\delta$ refers to the numerical $\delta$-rank of a matrix; see, e.g., Wang et al. (2006b) for details. It is clear from Eq. 29 that the small singular values inevitably give rise to difficulties. The regularization technique for the SVD means that some of the small singular values are truncated in the computation, and it is hence called the NTSVD. Now assume that $K$ is corrupted by the error matrix $B_\delta$. Then, we replace $K$ by a matrix $K_{\tilde p}$ that is close to $K$ and mathematically rank deficient. Our choice of $K_{\tilde p}$ is obtained by replacing the small nonzero singular values $\sigma_{\tilde p + 1}, \sigma_{\tilde p + 2}, \ldots$ with exact zeros, i.e.,

$$K_{\tilde p} = \sum_{i=1}^{\tilde p} \sigma_i u_i v_i^T, \tag{30}$$

where $\tilde p$ is usually chosen as $p_\delta$. We call (30) the NTSVD of $K$. Now, we use (30) as the linear kernel to compute the least squares solutions. Actually, we solve the problem $\min_x \|K_{\tilde p}\, x - y_n\|_2$ and obtain the approximate solution $x_{lse}^{appr}$ of minimal norm

$$x_{lse}^{appr} = K_{\tilde p}^{\dagger}\, y_n = \sum_{i=1}^{\tilde p} \frac{1}{\sigma_i} \left( u_i^T y_n \right) v_i, \tag{31}$$



where $K_{\tilde p}^{\dagger}$ denotes the Moore-Penrose generalized inverse of $K_{\tilde p}$. Let us explain in more detail the NTSVD for the underdetermined linear system. In this case, the number of independent variables is larger than the number of observations, i.e., $M < N$. Assume that the $\delta$-rank of $\tilde K$ is $\tilde p \le \min\{M, N\}$. It is easy to augment $\tilde K$ to an $N \times N$ square matrix $\tilde K_{aug}$ by padding zeros underneath its $M$ nonzero rows. Similarly, we can augment the right-hand side vector $y_n$ with zeros. The singular value decomposition of $\tilde K$ can be rewritten as $\tilde K_{aug} = U \Sigma V^T$, where $U = [u_1\ u_2\ \ldots\ u_N]_{N \times N}$, $V = [v_1\ v_2\ \ldots\ v_N]_{N \times N}$, and $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{\tilde p}, 0, \ldots, 0)$. From this decomposition, we find that there are $N - \tilde p$ theoretical zero singular values on the diagonal of $\Sigma$. If they are not truncated, these $N - \tilde p$ zero singular values will inevitably induce high numerical instability.
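The truncated expansion (31) can be sketched in a few lines of NumPy. The function name `ntsvd_solve`, the threshold `delta_tol`, and the numerical values are illustrative assumptions; the chapter's rule for choosing the numerical rank is not reproduced here. Singular values at or below the threshold are dropped, and the remaining terms give the minimal-norm least squares solution.

```python
import numpy as np

def ntsvd_solve(K, y_n, delta_tol):
    """Minimal-norm solution keeping only singular values > delta_tol."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    keep = s > delta_tol                        # retained terms, p of them
    coeffs = (U[:, keep].T @ y_n) / s[keep]     # (u_i^T y_n) / sigma_i
    return Vt[keep].T @ coeffs, int(keep.sum())

# Underdetermined case: a single observation with hypothetical kernel values.
K1 = np.array([[1.0, -1.2, 0.35]])
y1 = np.array([0.25])
x1, p1 = ntsvd_solve(K1, y1, delta_tol=1e-10)

# Numerically rank-deficient case: the tiny singular value is truncated.
K2 = np.diag([1.0, 1e-10])
y2 = np.array([1.0, 1e-6])
x2, p2 = ntsvd_solve(K2, y2, delta_tol=1e-8)
```

In the second example the untruncated expansion (28) would return a second component of about $10^4$, caused purely by the perturbation of size $10^{-6}$ in the data, while the truncated solution sets that component to zero.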

3.5 Statistical Regularization

Bayesian statistics provides a conceptually simple process for updating uncertainty in the light of evidence. Initial beliefs about some unknown quantity are represented by a prior distribution. Information in the data is expressed by the likelihood function $L(x|y)$. The prior distribution $p(x)$ and the likelihood function are then combined to obtain the posterior distribution for the quantity of interest. The posterior distribution expresses our revised uncertainty in light of the data, in other words, an organized appraisal that takes previous experience into account. The role of Bayesian statistics is very similar to the role of regularization. Now, we establish the relationship between Bayesian estimation and regularization. A continuous random vector $x$ is said to have a Gaussian distribution if its joint probability distribution function has the form

$$p_x(x; \mu, C) = \frac{1}{\sqrt{(2\pi)^N \det(C)}} \exp\left( -\frac{1}{2} (x - \mu)^T C^{-1} (x - \mu) \right), \tag{32}$$

where $x, \mu \in \mathbb{R}^N$, $C$ is an $N$-by-$N$ symmetric positive definite matrix, and $\det(\cdot)$ denotes the matrix determinant. The mean is given by $E(x) = \mu$ and the covariance matrix is $\mathrm{cov}(x) = C$. Suppose $y = Kx + n$ has a Gaussian distribution with mean $Kx$ and covariance $C_n$, where $C_n$ is the covariance of the observation noise and model inaccuracy. Then by (32) we obtain

$$p(y|x) = \frac{1}{\sqrt{(2\pi)^M \det(C_n)}} \exp\left( -\frac{1}{2} (y - Kx)^T C_n^{-1} (y - Kx) \right). \tag{33}$$

From (32), the prior probability distribution is given by $p(x) = \frac{\exp\left( -\frac{1}{2} x^T C_x^{-1} x \right)}{\sqrt{(2\pi)^N \det(C_x)}}$. By Bayesian statistical inference and the above two equations, we obtain an a posteriori log likelihood function

$$L(x|y) = \log p(x|y) = -\frac{1}{2} (y - Kx)^T C_n^{-1} (y - Kx) - \frac{1}{2} x^T C_x^{-1} x + \zeta, \tag{34}$$

where $\zeta$ is constant with respect to $x$. The maximum a posteriori estimate is obtained by maximizing (34) with respect to $x$:

$$x = \left( K^T C_n^{-1} K + C_x^{-1} \right)^{-1} K^T C_n^{-1} y. \tag{35}$$

The easiest way of choosing $C_n$ and $C_x$ is by letting $C_n = \sigma_n^2 I_M$ and $C_x = \sigma_x^2 I_N$, and then (35) becomes

$$x = \left( K^T K + \xi I_N \right)^{-1} K^T y, \tag{36}$$

where $\xi = \sigma_n^2 / \sigma_x^2$, which is the noise-to-signal ratio. It is clear that the solution obtained by maximum a posteriori estimation has the same form as the solution of the Tikhonov regularization.
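The equivalence between the maximum a posteriori estimate and the Tikhonov solution can be checked numerically. The sketch below uses a random matrix, random data, and arbitrary standard deviations, all of which are illustrative assumptions. It evaluates (35) with isotropic covariances and compares the result with the Tikhonov-type formula (36), whose regularization parameter is the ratio $\sigma_n^2/\sigma_x^2$ and whose identity matrix has the size of $K^T K$.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 10, 4
K = rng.standard_normal((M, N))     # hypothetical forward matrix
y = rng.standard_normal(M)          # hypothetical data
sigma_n, sigma_x = 0.2, 1.5         # assumed noise and prior standard deviations

# MAP estimate, Eq. (35), with C_n = sigma_n^2 I and C_x = sigma_x^2 I.
Cn_inv = np.eye(M) / sigma_n**2
Cx_inv = np.eye(N) / sigma_x**2
x_map = np.linalg.solve(K.T @ Cn_inv @ K + Cx_inv, K.T @ Cn_inv @ y)

# Tikhonov form, Eq. (36), with the noise-to-signal ratio as parameter.
ratio = sigma_n**2 / sigma_x**2
x_tik = np.linalg.solve(K.T @ K + ratio * np.eye(N), K.T @ y)
```

A larger noise variance or a tighter prior increases `ratio` and therefore the amount of regularization.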

4 Optimization

4.1 Sparse/Nonsmooth Inversion in l1 Space

It deserves attention that ill-posedness is an intrinsic feature of inverse problems. Unless some additional information or knowledge, such as monotonicity, smoothness, boundedness, or an error bound of the raw data, is imposed, the difficulty can hardly be overcome. Generally speaking, the kernel-driven BRDF model is semiempirical, and the retrieved parameters $x$ are mostly considered as a kind of weight function, though $x$ is a function of leaf area index (LAI), Lambertian reflectance, sunlit crown reflectance, and viewing and solar angles. Therefore, $x$ is not necessarily positive. However, since it is a weight function, an appropriate arrangement of the components of $x$ can yield the same results. That is to say, $x$ can be "made" nonnegative. The remaining problem is to develop proper methods to solve the "artificial" problem. Our new interpretation of the solution $x^*$ is related to the $l_1$ norm problem

$$\min_x \|x\|_1, \quad \text{s.t.} \quad Kx = y, \quad x \ge 0, \tag{37}$$

which automatically imposes a priori information by considering the solution in $l_1$ space. Because of the limitations of the observation system, one may readily see that the recovered land surface parameters are discrete and sparse. Therefore, if an inversion algorithm is not robust, outliers far from the true solution may occur. In this situation, the a priori constrained $l_1$ minimization may work better than the conventional regularization techniques. The model (37) can be reduced to a linear programming problem (see Ye 1997; Yuan 2001; Wang et al. 2005); hence linear programming methods can be used for solving the inverse problem. The $l_1$ norm solution method seeks a feasible solution within the feasible set $S = \{x : Kx = y,\ x \ge 0\}$. So it is actually searching for an interior point within the feasible set $S$, and hence is called the interior point method. The dual standard form of (37) is

$$\max\ y^T g, \quad \text{s.t.} \quad s = e - K^T g \ge 0, \tag{38}$$

where $e$ is a vector with all components equal to 1. Therefore, the optimality conditions for $(x, g, s)$ to be a primal-dual solution triplet are that

$$Kx = y, \quad K^T g + s = e, \quad \tilde S \tilde F e = 0, \quad x \ge 0, \quad s \ge 0, \tag{39}$$

where $\tilde S = \mathrm{diag}(s_1, s_2, \ldots, s_N)$, $\tilde F = \mathrm{diag}(x_1, x_2, \ldots, x_N)$, and $s_i$, $x_i$ are components of the vectors $s$ and $x$, respectively. The notation $\mathrm{diag}(\cdot)$ denotes the diagonal matrix whose only nonzero components lie on the main diagonal. The interior point method generates iterates $\{x_k, g_k, s_k\}$ such that $x_k > 0$ and $s_k > 0$. As the iteration index $k$ approaches infinity, the equality-constraint violations $\|y - Kx_k\|$ and $\|K^T g_k + s_k - e\|$ and the duality gap $x_k^T s_k$ are driven to zero, yielding a limiting point that solves the primal and dual linear problems. For the implementation procedures and examples of using the algorithm, please refer to Wang et al. (2007b, 2009d).
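The reduction of (37) to a linear program, and the reason the quantity $x^T s$ measures the duality gap, can be written out in two short steps. This is a worked step added for clarity, with $e$ the vector of ones as above:

```latex
% On the feasible set x >= 0 the l1 norm is linear:
\|x\|_1 = \sum_{i=1}^{N} |x_i| = e^T x
\quad\Longrightarrow\quad
\min_x \; e^T x \quad \text{s.t.} \quad Kx = y, \; x \ge 0 .

% For primal-feasible x and dual-feasible (g, s), with e = K^T g + s:
e^T x - y^T g = (K^T g + s)^T x - (Kx)^T g = s^T x \;\ge\; 0 .
```

The primal objective therefore never falls below the dual objective, and the two coincide exactly when $x_i s_i = 0$ for every $i$, which is the complementarity condition $\tilde S \tilde F e = 0$ in (39).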


A more general regularization model was recently proposed in Wang et al. (2009b), where the authors considered a regularization model in the general form

$$\min J[f] := \frac{1}{2} \|Kf - h_n\|_{l_p}^p + \frac{\nu}{2} \|L(f - f_0)\|_{l_q}^q, \tag{40}$$

where $p, q > 0$ are specified by the user, $\nu > 0$ is the regularization parameter, $L$ is the scale operator, and $f_0$ is an a priori solution of the original model. This formulation includes most of the methods developed so far. In particular, for $p = 2$ and $q = 1$ or $q = 0$, the model represents nonsmooth and sparse regularization, which is related to an important and currently active topic, compressive sensing for decoding in signal processing. A regularizing active set method was proposed both for quadratic programming and for non-convex problems; we refer to Wang et al. (2009b) for details.

4.2 Optimization Methods for l2 Minimization Model

4.2.1 Newton-Type Methods

The conventional Tikhonov regularization method is equivalent to the constrained $l_2$ minimization problem

$$\min_x \|x\|_2, \quad \text{s.t.} \quad Kx = y. \tag{41}$$

This reduces to solving an unconstrained optimization problem

$$x = \operatorname{argmin}_x J^\alpha(x), \quad J^\alpha(x) = \frac{1}{2} \|Kx - y\|_2^2 + \frac{\alpha}{2} \|x\|_2^2. \tag{42}$$

The gradient and Hessian of $J^\alpha(x)$ are given by $\mathrm{grad}_x[J^\alpha(x)] = (K^T K + \alpha I)x - K^T y$ and $\mathrm{Hess}_x[J^\alpha(x)] = K^T K + \alpha I$, respectively. Hence at the $k$-th iterative step, the gradient and Hessian of $J^\alpha(x_k)$ can be expressed as $\mathrm{grad}_k[J^\alpha]$ and $\mathrm{Hess}_k[J^\alpha]$, which are evaluated by $\mathrm{grad}_{x_k}[J^\alpha(x_k)]$ and $\mathrm{Hess}_{x_k}[J^\alpha(x_k)]$, respectively. Newton-type methods are based on the Gauss-Newton method and its variations. We only supply the algorithm for the Gauss-Newton method in this subsection. The Gauss-Newton method is an extension of the Newton method in one-dimensional space to higher dimensional space. The iteration formula reads

$$x_{k+1} = x_k - \tau_k \left( \mathrm{Hess}_k[J^\alpha] \right)^{-1} \mathrm{grad}_k[J^\alpha], \tag{43}$$

where $\tau_k$, a damping parameter which can be determined by a line search technique, is used to control the direction $(\mathrm{Hess}_k[J^\alpha])^{-1} \mathrm{grad}_k[J^\alpha]$. One may also apply a more popular technique, called the trust region technique, to keep the direction $(\mathrm{Hess}_k[J^\alpha])^{-1} \mathrm{grad}_k[J^\alpha]$ within a reliable generalized ball in every iteration (see Wang and Yuan 2005; Wang 2007). We recall that forming the inverse of $\mathrm{Hess}_k[J^\alpha]$ explicitly should be avoided to save computation. Instead, linear algebraic decomposition methods can be applied to compute $(\mathrm{Hess}_k[J^\alpha])^{-1} \mathrm{grad}_k[J^\alpha]$. There are different variations of the Gauss-Newton method, which are based on approximations of the explicit Hessian matrix $\mathrm{Hess}_x[J^\alpha]$, e.g., DFP, BFGS, L-BFGS, and trust region methods.
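A minimal sketch of the damped iteration (43) for the quadratic functional (42) is given below. The function name `damped_newton`, the fixed damping value, and the iteration count are illustrative assumptions; no line search is included. The Hessian is factored once by Cholesky decomposition and the factor is reused for every step, so no explicit inverse is formed.

```python
import numpy as np

def damped_newton(K, y, alpha, x0, tau=1.0, n_iter=1):
    """Iterate x <- x - tau * H^{-1} grad for J(x) = 0.5||Kx-y||^2 + 0.5*alpha*||x||^2."""
    H = K.T @ K + alpha * np.eye(K.shape[1])    # Hessian, constant for this J
    C = np.linalg.cholesky(H)                   # factor once, H = C C^T
    x = x0.copy()
    for _ in range(n_iter):
        grad = H @ x - K.T @ y                  # gradient of J at x
        step = np.linalg.solve(C.T, np.linalg.solve(C, grad))
        x = x - tau * step                      # Eq. (43)
    return x
```

Because the functional is quadratic, a single full step (`tau=1.0`) from any starting point lands on the minimizer; with a damping value such as `tau=0.5` the error is halved in each iteration instead.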


For these extensions to well-posed and ill-posed problems, please refer to Nocedal (1980), Dennis and Schnable (1983), Yuan (1994, 1993), Kelley (1999), Wang and Yuan (2005), and Wang (2007) for details. We briefly mention a globally convergent method, the trust region method. The method solves an unconstrained non-quadratic minimization problem over $x \in \mathbb{R}^n$. For the problem (42), the trust region method requires solving a trust region subproblem

$$\min_s \Upsilon(s) := (\mathrm{grad}_x[J^\alpha], s) + \frac{1}{2} (\mathrm{Hess}_x[J^\alpha] s, s), \quad \text{s.t.} \quad \|s\| \le \Delta, \tag{44}$$

where $\Delta > 0$ is the trust region radius. In each step, a trial step $s$ is computed and it is decided whether it is acceptable or not. The decision rule is based on the ratio $\rho$ between the actual reduction in the objective functional and the predicted reduction in the approximate model. The trust region iterate remains unchanged if $\rho \le 0$, where $\rho = \mathrm{Ared}(x)/\mathrm{Pred}(x)$, and $\mathrm{Ared}(x)$ and $\mathrm{Pred}(x)$ are defined by $J^\alpha(x) - J^\alpha(x + s)$ and $\Upsilon(0) - \Upsilon(s)$, respectively. For the model in (42), since it is in a quadratic form, the ratio $\rho$ is always equal to 1. This means that the trial step $s$, whether it is good or not, will always be accepted. We note that the approximation accuracy is characterized by the discrepancy between the observation and the true data; therefore variations of the norm of the discrepancy may reflect the degree of approximation. Based on these considerations, we propose to accept or reject the trial step $s_k$ at the $k$-th step by the ratio

$$\rho_k = \frac{J^\alpha(x_k + s_k)}{J^\alpha(x_k)} = \frac{J^\alpha(x_{k+1})}{J^\alpha(x_k)},$$

where $J^\alpha(x_{k+1})$ and $J^\alpha(x_k)$ are the reductions in the norm of the discrepancy at the $(k+1)$-th and $k$-th steps, respectively. For the convergence and regularizing properties, we refer to Wang (2007) and Wang and Ma (2009).

4.2.2 Gradient-Type Methods

The gradient method does not need the Hessian information. For the linear operator equation $Kx = y$, where $K$, $x$, and $y$ have the same meaning as before, we first recall the well-known fixed-point iteration method found in standard mathematical textbooks. The fixed-point iteration formula for solving the above linear operator equation is

$$x_{k+1} = x_k + \tau (y - K x_k), \tag{45}$$

where $\tau \in (0, 2/\|K\|)$ and $K$ is linear, bounded, and nonnegative. One may readily see that this method is very similar to the method of successive approximations, which can be introduced in a very simple way as follows. Consider the operator $T(x) = x + \tau (y - Kx)$, where $\tau$ is the so-called relaxation parameter. Finding a solution of the linear operator equation is equivalent to finding a fixed point of the operator $T$, i.e., solving $x = T(x)$ for $x$. Assuming that $T$ is a contraction mapping, the method of successive approximations yields the iterative scheme $x_{k+1} = T(x_k)$, i.e., the iterative formula (45). The method converges if and only if $Kx = y$ has a solution. Now we introduce a very simple gradient method, the steepest descent method, whose iteration formula reads

$$x_{k+1} = x_k + \tau_k \cdot K^T (y - K x_k), \tag{46}$$

where $\tau_k$ is obtained by line search, i.e., $\tau_k = \operatorname{argmin}_{\tau > 0} J(x_k - \tau\, \mathrm{grad}_k[J])$. If we restrict the stepsize $\tau_k$ to be fixed in $(0, 2/\|K^T K\|)$, then the steepest descent method reduces to the famous Landweber-Fridman iteration method. Further extensions, including the nonmonotone gradient method and the truncated conjugate gradient method with trust region techniques, together with different applications in applied science, can be found in, e.g., Brakhage (1987), Barzilai and Borwein (1988), Fletcher (2001), Wang and Yuan (2002), Wang (2007), and Wang and Ma (2007). Finally, we want to mention that for underdetermined ill-posed problems, regularization constraints or a priori constraints should be incorporated into the minimization model; then we may apply the aforementioned gradient methods. Application examples on aerosol particle size distribution function retrieval problems with the nonmonotone gradient method are included in Wang (2008).
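The two gradient iterations can be sketched as follows. The function names, the zero starting vector, and the fixed iteration counts are illustrative assumptions, and the objective is taken to be $J(x) = \frac{1}{2}\|Kx - y\|^2$ without a regularization term. For this quadratic objective the exact line search has the closed form $\tau_k = \|r_k\|^2 / \|K r_k\|^2$ with $r_k = K^T(y - Kx_k)$.

```python
import numpy as np

def landweber(K, y, n_iter, tau=None):
    """Fixed-step iteration x <- x + tau * K^T (y - K x)."""
    if tau is None:
        tau = 1.0 / np.linalg.norm(K.T @ K, 2)   # lies inside (0, 2/||K^T K||)
    x = np.zeros(K.shape[1])
    for _ in range(n_iter):
        x = x + tau * (K.T @ (y - K @ x))
    return x

def steepest_descent(K, y, n_iter):
    """Eq. (46) with exact line search for J(x) = 0.5 * ||K x - y||^2."""
    x = np.zeros(K.shape[1])
    for _ in range(n_iter):
        r = K.T @ (y - K @ x)        # negative gradient of J at x
        Kr = K @ r
        denom = Kr @ Kr
        if denom == 0.0:             # gradient vanished: already at a minimizer
            break
        x = x + (r @ r) / denom * r  # tau_k = ||r||^2 / ||K r||^2
    return x
```

For noisy ill-posed problems the iteration count itself acts as a regularization parameter, so in practice the loop would be stopped early rather than run to convergence; that stopping rule is not shown here.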

5 Practical Applications

5.1 Kernel-Based BRDF Model Inversion

5.1.1 Inversion by NTSVD

Consider the linear combination of three kernels $k_{geo}$, $k_{vol}$, and the isotropic kernel, $\hat f_{iso} + \hat f_{geo} k_{geo}(t_i, t_v, \phi) + \hat f_{vol} k_{vol}(t_i, t_v, \phi) = \hat r$, for each observation. Considering the smoothing technique in $l_2$ space, we solve the following constrained optimization problem

$$\min \left\| [\hat f_{iso}, \hat f_{geo}, \hat f_{vol}]^T \right\|_2, \quad \text{s.t.} \quad \hat f_{iso} + \hat f_{geo} k_{geo} + \hat f_{vol} k_{vol} = \hat r. \tag{47}$$

Let us consider an extreme example for the kernel-based BRDF model: if only a single observation is available at one time, then it is clear that the above equation has infinitely many solutions. If we denote $K = [1 \;\; k_{geo}(t_i, t_v, \phi) \;\; k_{vol}(t_i, t_v, \phi)]_{1 \times 3}$, then the singular value decomposition of the zero-augmented matrix $K_{aug}$ leads to $K_{aug} = U_{3 \times 3} \Sigma_{3 \times 3} V_{3 \times 3}^T$ with $U = [u_1\ u_2\ u_3]$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, and $V = [v_1\ v_2\ v_3]$, where each $u_i$, $v_i$, $i = 1, 2, 3$, is a 3-by-1 column. Our a priori information is based on searching for a minimal norm solution within the infinite set of solutions, i.e., the solution $f^* = [\hat f_{iso}^*, \hat f_{geo}^*, \hat f_{vol}^*]^T$ satisfies $\hat f_{iso}^* + \hat f_{geo}^* k_{geo}(t_i, t_v, \phi) + \hat f_{vol}^* k_{vol}(t_i, t_v, \phi) = \hat r$ and at the same time $\|f^*\| \to$ minimum.

5.1.2 Tikhonov Regularized Solution

Denote by $M$ the number of measurements in the kernel-based models. Then the operator equation (12) can be rewritten in the following matrix-vector form

$$Kx = y, \tag{48}$$


where

$$K = \begin{bmatrix} 1 & k_{geo}(1) & k_{vol}(1) \\ 1 & k_{geo}(2) & k_{vol}(2) \\ \vdots & \vdots & \vdots \\ 1 & k_{geo}(M) & k_{vol}(M) \end{bmatrix}, \quad x = \begin{bmatrix} f_{iso} \\ f_{geo} \\ f_{vol} \end{bmatrix}, \quad y = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_M \end{bmatrix}.$$

Here, $k_{geo}(k)$ and $k_{vol}(k)$ represent the values of the kernel functions $k_{geo}(t_i, t_v, \phi)$ and $k_{vol}(t_i, t_v, \phi)$ corresponding to the $k$-th measurement for $k = 1, 2, \ldots, M$, and $r_k$ represents the $k$-th observation. By Tikhonov regularization, we solve for a regularized solution $x_\alpha$ by minimizing the functional

$$J(x, \alpha) = \frac{1}{2} \|Kx - y\|^2 + \frac{1}{2} \alpha \|Dx\|^2. \tag{49}$$

Choices of the parameter $\alpha$ and the scale operator $D$ are discussed in Sect. 3.3.

5.1.3 Land Surface Parameter Retrieval Results

We use the combination of the RossThick kernel and the LiTransit kernel in the numerical tests. In practice, the coefficient matrix $K$ cannot be determined accurately, and a perturbed version $\tilde K$ is obtained instead. Also, instead of the true measurement $y$, the observed measurement $y_n = y + n$ is the sum of the true measurement $y$ and the noise $n$, which for simplicity is assumed to be additive Gaussian random noise. Therefore it suffices to solve the following operator equation with perturbation

$$\tilde K x = y_n,$$

where $\tilde K := K + \delta B$ for some perturbation matrix $B$, and $\delta \in (0, 1)$ denotes the noise level (upper bound) of $n$. In our numerical simulation, we assume that $B$ is a Gaussian random matrix and also that $\|y_n - y\| \le \delta < \|y_n\|$. This assumption about the noise can be interpreted as requiring the signal-to-noise ratio (SNR) to be greater than 1. We make such an assumption because we believe that observations (BRDF) are not trustworthy otherwise. It is clear that (48) is an underdetermined system if $M \le 2$ and an overdetermined system if $M > 3$. Note that for satellite remote sensing, because of the restrictions in view and illumination geometries, $\tilde K^T \tilde K$ need not have a bounded inverse (see Verstraete et al. 1996; Li et al. 2001; Wang et al. 2007a, 2008). We believe that the proposed regularization method can be employed to find an approximate solution $x_\alpha$ that satisfies $\|\tilde K x_\alpha - y_n\| \to \min$.

We use the atmospherically corrected Moderate Resolution Imaging Spectroradiometer (MODIS) 1B product acquired on a single day as an example of a single-observation BRDF at a certain viewing direction. Each pixel has a different view zenith angle and relative azimuth angle. The data MOD021KM.A2001135-150, with horizontal tile number (26) and vertical tile number (4), cover Shunyi county of Beijing, China. The three parameters are retrieved by using this 1B product.
Figure 2a plots the reflectance for band 1 on a single day (DOY = 137). In the MODIS AMBRALS algorithm, when insufficient reflectances or a poorly representative sampling of high quality reflectances are available for a full inversion, a database of archetypal BRDF parameters is used to supplement the data and a magnitude inversion is performed (see Verstraete et al. 1996; Strahler et al. 1999). We note that the standard MODIS AMBRALS algorithm cannot work for


Fig. 2 (a) Reflectance for band 1 of MOD021KM.A2001137; (b) white-sky albedo retrieved by Tikhonov regularization method; (c) white-sky albedo retrieved by NTSVD method; and (d) WSA retrieved by l1 sparse regularization method

such an extreme case, even with the MODIS magnitude inversion, since it is hard to obtain seasonal data associated with a dynamic land cover at a particular site. Our method still works in such an extreme case because a smoothness constraint is already built into the model. We plot the white-sky albedos (WSAs) retrieved by NTSVD, Tikhonov regularization and sparse inversion for band 1 of one observation (DOY = 137) in Fig. 2b–d, respectively. From Fig. 2b–d, we see that the albedo retrieved from insufficient observations reproduces the general profile. Most of the details are preserved, though the results are not perfect. The results are similar to the one from the NTSVD method developed in Wang et al. (2007a). Hence, we conclude that the developed methods are useful for the retrieval of land surface parameters and for computing land surface albedos, and that they can serve as supplementary algorithms for the robust estimation of the land surface BRDF/albedos.

We want to emphasize that our method can generate smoothed data to help retrieve parameters when sufficient observations are unavailable. As we have pointed out in Wang et al. (2007a, 2008), we do not suggest discarding useful history information (e.g., data that is not too old) or multiangular data. Instead, such information should be fully employed if it is available. Our algorithm outperforms previous algorithms because it is adaptive, accurate, and very stable; it solves kernel-based BRDF models of any order and may serve as a supplement for the BRDF/albedo retrieval product. The remote sensor MODIS generates a product from 16 days of observation data; this is not a strict restriction for MODIS, since it aims at global exploration. For other sensors, the revisit period for the same area will be 20 days or longer. Therefore, for vegetation in the growing season, the reflectance and albedos will change significantly. Hence robust algorithms to estimate BRDF and albedos in such cases are highly desirable. Our algorithm is a proper choice, since it can generate retrieval results that closely approximate the true values for different vegetation types of land surface from a single observation.


Moreover, for some sensors with high spatial resolution, quasi-multiangular data are impossible to obtain. This is why no high-resolution albedo products exist. Our algorithm can produce such results, which are urgently needed in real applications.

5.2 Inversion of Airborne Lidar Remote Sensing

The analytical representations of the transmitted laser pulse and the true cross-sections are given by third-order spline functions. For example, we generate the synthetic laser pulse function $f_{lp}(x)$ within the interval [2, 3] by the formula $f_{lp}(x) = 31.25x^3 + 206.25x^2 - 356.25x + 218.75$. The analytical representation of the cross-section function $g_{cs}(x)$ within the interval [3/8, 1/2] is given by the formula $g_{cs}(x) = 8x^3 - 10x^2 + 3x + \frac{1}{6}$. The recorded waveform function $h_{wf}(x)$ (i.e., the data) is calculated by a convolution of the splines representing the transmitted laser pulse $f_{lp}(x)$ and the cross-section $g_{cs}(x)$, so that

\[ h_{wf}(x) = f_{lp}(x) \star g_{cs}(x). \]

Note that $h_{wf}$ represents the observation, which means that different kinds of noise may be recorded besides the true signal. Here we only consider a simple case, i.e., we assume that the noise is mainly additive Gaussian noise in [0, 1], i.e., $h_{wf} = h_{wf}^{true} + \delta \cdot \mathrm{rand}(\mathrm{size}(h_{wf}^{true}))$, where $\delta > 0$ is the noise level and $\mathrm{rand}(\mathrm{size}(h_{wf}^{true}))$ is Gaussian random noise with the same size as $h_{wf}^{true}$. In our simulation, the Gaussian random noise is generated with mean 0 and standard deviation 2. We apply the Tikhonov regularization algorithm (see Sect. 3.3) to recover the cross-section and make a comparison. The synthetic laser pulse sampled with 1 ns resolution is shown in Fig. 3a. Comparisons of the undistorted cross-sections with the recovered cross-sections are illustrated in Fig. 3b. It is apparent that our algorithm finds stable recoveries of the simulated synthetic cross-sections. We do not show the comparison results for small noise levels, since the algorithm yields perfect reconstructions.

We also tested the applicability of the regularization method to LMS-Q560 data (RIEGL LMS-Q560 (www.riegl.co.at)). The emitted laser scanner sensor pulse is shown in Fig. 3c. The recorded waveform of the first echo of this pulse is shown in Fig. 3d (dotted line). The backscatter cross-section retrieved using the regularization method is shown in Fig. 4a. The solid line in Fig. 3d shows the reconstructed signal derived by the convolution of the emitted laser pulse and this cross-section. One may see from Fig. 4a that there are several small oscillations in the region [3850, 3860] ns. Since the amplitudes of these oscillations are small, we consider them to be noise or computational errors induced by noise during the numerical inversion. To show the necessity of regularization, we plot the result of least squares fitting without regularization in Fig. 4b. The comparison immediately reveals the importance of regularization. Further numerical results and comparisons can be found in Wang et al. (2009a).
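The deconvolution experiment described above can be mimicked with a small Python sketch. It is a minimal illustration under stated assumptions, not the setup of the chapter: the three-sample pulse, the Gaussian-shaped cross-section, the noise level, and the helper names `convolution_matrix` and `tikhonov` are all hypothetical, and the scale operator is taken to be the identity. The sketch builds the discrete convolution matrix, adds Gaussian noise to the waveform, and compares plain least squares ($\alpha = 0$) with a Tikhonov-regularized recovery.

```python
import math
import random

random.seed(1)  # reproducible noise


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def convolution_matrix(pulse, n):
    """Matrix K such that K g is the full discrete convolution of pulse and g."""
    m = len(pulse) + n - 1
    return [[pulse[i - j] if 0 <= i - j < len(pulse) else 0.0 for j in range(n)]
            for i in range(m)]


def tikhonov(K, y, alpha):
    """Solve (K^T K + alpha I) g = K^T y; alpha = 0 gives plain least squares."""
    n = len(K[0])
    lhs = [[sum(row[i] * row[j] for row in K) + (alpha if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * yi for row, yi in zip(K, y)) for i in range(n)]
    return solve(lhs, rhs)


def norm(v):
    return math.sqrt(sum(t * t for t in v))


pulse = [0.25, 0.5, 0.25]                                             # hypothetical emitted pulse
g_true = [math.exp(-0.5 * ((j - 10) / 2.0) ** 2) for j in range(20)]  # hypothetical cross-section
K = convolution_matrix(pulse, len(g_true))
h_true = [sum(k * g for k, g in zip(row, g_true)) for row in K]       # noise-free waveform
delta = 0.05
h_wf = [h + delta * random.gauss(0.0, 1.0) for h in h_true]           # noisy waveform

g_ls = tikhonov(K, h_wf, 0.0)     # least squares without regularization
g_reg = tikhonov(K, h_wf, 1e-2)   # regularized recovery
```

Comparing `norm(g_ls)` with `norm(g_reg)`, or the two recoveries with `g_true`, shows how the penalty damps the noise-driven components that dominate the unregularized fit.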

5.3 Particle Size Distribution Function Retrieval

We consider retrieving the aerosol particle size distribution function $n(r)$ from the attenuation equation (7). This is an infinite-dimensional problem with only a finite set of observations, so it is impossible to obtain by computer a continuous expression of the size


Fig. 3 (a) Synthetic emitted laser pulse; (b) comparison of the true and recovered cross-sections in the case of noise of level 1; (c) second emitted laser pulse; and (d) recorded echo waveform of the laser pulse shown in (c) (dotted curve) and its reconstruction using the cross-section shown in Fig. 4a (solid curve)

Fig. 4 (a) The retrieved backscatter cross-section using regularization; (b) the retrieved backscatter cross-section using least squares fitting without regularization (both panels: amplitude versus time stamp in ns, 3810–3880 ns)

distribution $n(r)$. Numerically, we solve the discrete problem of operator equation (9). Using collocation (Wang et al. 2006), the infinite-dimensional problem can be written in a finite-dimensional form by sampling grid points $\{r_j\}_{j=1}^{N}$ in the interval of interest $[a, b]$. Denoting by $K = (K_{ij})_{N \times N}$, $n$, $\varrho$, and $d$ the corresponding matrix and vectors, we have

\[ Kn + \varrho = d. \tag{50} \]
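A discrete system of the form (50) can be assembled, for instance, by a quadrature-type collocation on a uniform grid. The sketch below is an assumption-laden illustration: the function name `collocation_matrix`, the trapezoidal weights, and the toy kernel $k(r, \lambda) = \lambda r$ are hypothetical and merely stand in for the extinction kernel of the text.

```python
def collocation_matrix(kernel, lambdas, a, b, n):
    """Return (K, r) with K[i][j] = w_j * kernel(r_j, lambda_i).

    r is a uniform grid on [a, b] and w are trapezoidal quadrature weights,
    so that sum_j K[i][j] * n(r_j) approximates the integral of
    kernel(r, lambda_i) * n(r) over [a, b].
    """
    h = (b - a) / (n - 1)
    r = [a + j * h for j in range(n)]
    w = [h] * n
    w[0] = w[-1] = h / 2.0
    K = [[w[j] * kernel(r[j], lam) for j in range(n)] for lam in lambdas]
    return K, r


def toy_kernel(r, lam):
    """Hypothetical kernel, linear in r, used only to check the quadrature."""
    return lam * r


K, r = collocation_matrix(toy_kernel, lambdas=[1.0, 2.0], a=0.1, b=4.0, n=200)
n_vec = [1.0] * len(r)                        # constant size distribution
d = [sum(kij * nj for kij, nj in zip(row, n_vec)) for row in K]
```

For this linear toy kernel the trapezoidal rule is exact, so `d[i]` equals $\lambda_i (b^2 - a^2)/2$ up to rounding, which gives a quick sanity check of the assembled matrix.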

This discrete form can be used for computer simulations. Phillips-Twomey’s regularization is based on solving the problem


\[ \min_{n} Q(n), \quad \text{s.t.} \quad \|Kn - d\| = \Delta, \tag{51} \]

where $Q(n) = (Dn, n)$ and $D$ is a preassigned scale matrix. In Phillips-Twomey's formulation of regularization, the choice of the scale matrix is vital. They chose the form of the matrix $D$ by the norm of the second differences, $\sum_{i=2}^{N-1}(n_{i-1} - 2n_i + n_{i+1})^2$, which corresponds to the form of matrix $D = D_2$. However, the matrix $D$ is badly conditioned. For example, with $N = 200$, the largest singular value is 15.998012 and the smallest singular value is $6.495571 \times 10^{-17}$. The condition number of $D$, defined as the ratio of the largest singular value to the smallest singular value, therefore equals $2.462911 \times 10^{17}$, which is extremely large. Hence, the scale matrix $D$ cannot filter the small singular values of the discrete kernel matrix $K$, even with a large Lagrangian multiplier. This numerical difficulty encourages us to study a more robust scale matrix, which is formulated as follows. We consider Tikhonov regularization in the Sobolev space $W^{1,2}$, as mentioned in Sect. 3.3.1. By a variational process, we solve the regularized linear system of equations

\[ K^T K n + \alpha H n - K^T d = 0, \tag{52} \]

where $H$ is a triangular matrix in the form of $D_1$. For the choice of the regularization parameter, we consider the a posteriori approach mentioned in Sect. 3.3.2. Suppose we are interested in the particle size in the interval [0.1, 4] μm, so that the step size is $h_r = \frac{3.9}{N-1}$. Choosing $N = 200$ discrete nodes, the largest singular value of $H$ is $1.041482176501067 \times 10^{4}$ and the smallest singular value of $H$ is 0.99999999999953, both in double machine precision. The condition number of $H$ is thus $1.041482176501554 \times 10^{4}$, so $H$ is better than the scale matrix $D$ of Phillips-Twomey's regularization at filtering small singular values of the discrete kernel $K$.

To perform the numerical computations, we apply the technique developed in King et al. (1978), i.e., we assume that the actual aerosol particle size distribution function is the product of two functions $h(r)$ and $f(r)$: $n(r) = h(r)f(r)$, where $h(r)$ is a rapidly varying function of $r$, while $f(r)$ is more slowly varying. In this way we have

\[ \tau_{\mathrm{aero}}(\lambda) = \int_a^b [k(r, \lambda, \eta)h(r)] f(r)\,dr, \tag{53} \]

where $k(r, \lambda, \eta) = \pi r^2 Q_{\mathrm{ext}}(r, \lambda, \eta)$, and we regard $k(r, \lambda, \eta)h(r)$ as the new kernel function, which corresponds to a new operator $\Xi$:

\[ (\Xi f)(r) = \tau_{\mathrm{aero}}(\lambda). \tag{54} \]
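The conditioning comparison between the second-difference scale matrix $D$ and the Sobolev-type matrix $H$ reported above can be explored with the following Python sketch. Two assumptions are made and should be read as such: $D$ is taken to be $L_2^T L_2$ with $L_2$ the $(N-2) \times N$ second-difference matrix, and $H$ is taken to be $I + L_1^T L_1 / h_r^2$ with $L_1$ the $(N-1) \times N$ first-difference matrix. The precise definitions belong to Sect. 3.3.1 and may differ; under this assumed form of $H$ the eigenvalues are known in closed form and give a condition number in agreement with the value quoted above.

```python
import math

N = 200
h = 3.9 / (N - 1)   # step size for the interval [0.1, 4]


def second_difference_gram(n):
    """D = L2^T L2, with L2 the (n-2) x n second-difference matrix (assumed form)."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n - 2):
        idx, val = (i, i + 1, i + 2), (1.0, -2.0, 1.0)
        for p, vp in zip(idx, val):
            for q, vq in zip(idx, val):
                D[p][q] += vp * vq
    return D


def sobolev_matrix(n, step):
    """H = I + L1^T L1 / step^2, with L1 the first-difference matrix (assumed form)."""
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c = 1.0 / step ** 2
    for i in range(n - 1):
        H[i][i] += c
        H[i + 1][i + 1] += c
        H[i][i + 1] -= c
        H[i + 1][i] -= c
    return H


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


D = second_difference_gram(N)
# A linear vector is annihilated by second differences, so D is singular.
null_residual = max(abs(t) for t in matvec(D, [float(j) for j in range(N)]))
# Gershgorin bound on the largest eigenvalue of D.
gershgorin = max(sum(abs(t) for t in row) for row in D)

H = sobolev_matrix(N, h)
# Closed-form eigenvalues of H: 1 + (2 - 2 cos(k pi / N)) / h^2, k = 0, ..., N-1.
eigs = [1.0 + (2.0 - 2.0 * math.cos(k * math.pi / N)) / h ** 2 for k in range(N)]
cond_H = max(eigs) / min(eigs)
# Numerical check of one eigenpair (k = N-1) with its cosine eigenvector.
k = N - 1
v = [math.cos(k * math.pi * (j + 0.5) / N) for j in range(N)]
eig_residual = max(abs(a - eigs[k] * b) for a, b in zip(matvec(H, v), v))
```

The exact null space of the assumed $D$ explains why its computed smallest singular value is at rounding level, whereas the smallest eigenvalue of the assumed $H$ is exactly 1, so its condition number stays moderate.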

After obtaining the function $f(r)$, the size distribution function $n(r)$ can be obtained by multiplying $f(r)$ by $h(r)$. The extinction efficiency factor (kernel function) $Q_{\mathrm{ext}}(r, \lambda, \eta)$ is calculated from Mie theory: by Maxwell's electromagnetic ($E$, $H$) theory, the scattering by a spherical particle satisfies

\[ \operatorname{curl} H = i\kappa\eta^2 E, \qquad \operatorname{curl} E = -i\kappa H, \tag{55} \]


where $\kappa = 2\pi/\lambda$. The Mie solution process is one of finding a set of complex numbers $a_n$ and $b_n$ which give vectors $E$ and $H$ that satisfy the boundary conditions at the surface of the sphere (Bohren and Huffman 1983). Assuming the sphere is homogeneous, the Mie scattering coefficients $a_n$ and $b_n$ are given by

\[ a_n(z, \eta) = \frac{\eta\,\psi_n(\eta z)\,\psi_n'(z) - \psi_n(z)\,\psi_n'(\eta z)}{\eta\,\psi_n(\eta z)\,\xi_n'(z) - \xi_n(z)\,\psi_n'(\eta z)}, \tag{56} \]

\[ b_n(z, \eta) = \frac{\psi_n(\eta z)\,\psi_n'(z) - \eta\,\psi_n(z)\,\psi_n'(\eta z)}{\psi_n(\eta z)\,\xi_n'(z) - \eta\,\xi_n(z)\,\psi_n'(\eta z)}, \tag{57} \]

where $\psi_n(z) = \sqrt{\frac{\pi z}{2}}\, J_{n+\frac{1}{2}}(z)$ and $\xi_n(z) = \sqrt{\frac{\pi z}{2}}\, J_{n+\frac{1}{2}}(z) - i\sqrt{\frac{\pi z}{2}}\, N_{n+\frac{1}{2}}(z)$; $J_{n+\frac{1}{2}}(z)$ and $N_{n+\frac{1}{2}}(z)$ are the Bessel functions of order $n + \frac{1}{2}$ of the first kind and of the second kind (Neumann function), respectively. These complex-valued coefficients, functions of the refractive index $\eta$, $z = 2\pi r/\lambda$, and $\eta z$, provide the full solution to the scattering problem. Thus the extinction efficiency factor (kernel function) can be written as

\[ Q_{\mathrm{ext}}(r, \lambda, \eta) = \frac{2}{z^2}\sum_{n=1}^{\infty}(2n+1)\,\mathrm{Re}(a_n + b_n). \tag{58} \]
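The series (58) can be evaluated numerically in several ways. The sketch below does not evaluate (56) and (57) through half-integer Bessel functions; it swaps in the logarithmic-derivative recurrence scheme popularized by Bohren and Huffman (1983), which yields the same coefficients for a homogeneous sphere. The function name `mie_qext`, the truncation rule for the series, and the sign convention (absorption entered as a positive imaginary part of the refractive index, so that an index written as $1.45 - 0.03i$ in the text would be passed as `1.45 + 0.03j`) are assumptions of this illustration.

```python
import math


def mie_qext(x, m):
    """Extinction efficiency of a homogeneous sphere.

    x: size parameter 2*pi*r/lambda (real, positive).
    m: refractive index relative to the medium; absorption corresponds to a
       positive imaginary part in the convention assumed here.
    """
    mx = m * x
    nstop = int(x + 4.0 * x ** (1.0 / 3.0) + 2.0)   # common truncation rule
    nmx = int(max(nstop, abs(mx))) + 15
    # Logarithmic derivative d[n] = psi_n'(mx) / psi_n(mx) by downward recurrence.
    d = [0j] * (nmx + 1)
    for n in range(nmx, 0, -1):
        d[n - 1] = n / mx - 1.0 / (d[n] + n / mx)
    psi0, psi1 = math.cos(x), math.sin(x)     # psi_{-1}(x), psi_0(x)
    chi0, chi1 = -math.sin(x), math.cos(x)    # chi_{-1}(x), chi_0(x)
    xi1 = complex(psi1, -chi1)
    total = 0.0
    for n in range(1, nstop + 1):
        # Upward recurrences for the Riccati-Bessel functions of real argument.
        psi = (2 * n - 1) / x * psi1 - psi0
        chi = (2 * n - 1) / x * chi1 - chi0
        xi = complex(psi, -chi)
        fa = d[n] / m + n / x
        fb = d[n] * m + n / x
        an = (fa * psi - psi1) / (fa * xi - xi1)
        bn = (fb * psi - psi1) / (fb * xi - xi1)
        total += (2 * n + 1) * (an + bn).real
        psi0, psi1 = psi1, psi
        chi0, chi1 = chi1, chi
        xi1 = xi
    return 2.0 / x ** 2 * total


# Example call for a hypothetical absorbing particle with size parameter 1.
q_example = mie_qext(1.0, complex(1.45, 0.03))
```

Two limiting cases give a quick plausibility check: with no refractive contrast ($m = 1$) the efficiency vanishes, and for a very small non-absorbing sphere the result approaches the Rayleigh value $\frac{8}{3}x^4\left(\frac{m^2-1}{m^2+2}\right)^2$.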

The size distribution function $n_{true}(r) = 10.5\, r^{-3.5}\exp(-10^{-12} r^{-2})$ is used to generate synthetic data. The particle radius interval of interest is [0.1, 2] μm. This aerosol particle size distribution function can be written as $n_{true}(r) = h(r)f(r)$, where $h(r)$ is a rapidly varying function of $r$, while $f(r)$ is more slowly varying. Most measurements of the continental aerosol particle size distribution reveal that these functions follow a Junge distribution (Junge 1955), $h(r) = r^{-(\nu^* + 1)}$, where $\nu^*$ is a shaping constant with typical values in the range 2.0–4.0; it is therefore reasonable to use $h(r)$ of Junge type as the weighting factor to $f(r)$. In this work, we choose $\nu^* = 3$ and $f(r) = 10.5\, r^{1/2}\exp(-10^{-12} r^{-2})$. The form of this size distribution function is similar to the one given by Twomey (1975), where a rapidly changing function $h(r) = C r^{-3}$ can be identified, but it is more similar to a Junge distribution for $r \ge 0.1$ μm. One can also generate other particle number size distributions and compare the reconstruction with the input.

First, the complex refractive index $\eta$ is assumed to be $1.45 - 0.00i$ and $1.50 - 0.00i$, respectively. Then we invert the same data, supposing that $\eta$ has an imaginary part: the complex refractive index $\eta$ is assumed to be $1.45 - 0.03i$ and $1.50 - 0.02i$, respectively. The precision of the approximation is characterized by the root mean-square error (rmse)

\[ \mathrm{rmse} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\frac{(\tau_{comp}(\lambda_i) - \tau_{meas}(\lambda_i))^2}{(\tau_{comp}(\lambda_i))^2}}, \tag{59} \]

which describes the average relative deviation of the retrieved signals from the true signals. Here $\tau_{comp}$ refers to the retrieved signals and $\tau_{meas}$ refers to the measured signals. Numerical illustrations are plotted in Fig. 5b with noise level $\delta = 0.05$ for the different refractive indices. The behavior of the regularization parameter is plotted in Fig. 5a. The rmses for each case are shown in Table 1.
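The error measure (59) is straightforward to compute. The following short Python sketch shows one way to do so; the function name `rmse` and the sample values are hypothetical and only serve to illustrate the formula.

```python
import math


def rmse(tau_comp, tau_meas):
    """Relative root mean-square error between retrieved and measured signals.

    Each squared difference is normalized by the squared retrieved value.
    """
    m = len(tau_comp)
    total = sum(((c - s) / c) ** 2 for c, s in zip(tau_comp, tau_meas))
    return math.sqrt(total / m)


# Hypothetical retrieved and measured optical thicknesses at two wavelengths.
example = rmse([2.0, 4.0], [1.0, 2.0])   # each relative deviation is 0.5
```

Because every term is divided by the retrieved value, the measure is dimensionless and can be compared across wavelengths with very different signal magnitudes.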


Fig. 5 (a) Iterative computational values of regularization parameters when the error level $\delta = 0.05$; (b) input and retrieved results with our inversion method in the case of error level $\delta = 0.05$ and different complex refractive indices

Table 1 The rmses for different noise levels

Noise level        $\eta = 1.45 - 0.00i$       $\eta = 1.45 - 0.03i$       $\eta = 1.50 - 0.02i$
$\delta = 0.005$   $1.6443 \times 10^{-5}$     $1.2587 \times 10^{-5}$     $2.2773 \times 10^{-5}$
$\delta = 0.01$    $1.6493 \times 10^{-5}$     $1.2720 \times 10^{-5}$     $2.2847 \times 10^{-5}$
$\delta = 0.05$    $1.6996 \times 10^{-5}$     $1.3938 \times 10^{-5}$     $2.3504 \times 10^{-5}$

6 Conclusion

In this chapter, we study regularization and optimization methods for solving inverse problems in geoscience and quantitative remote sensing. Three typical kernel-based problems are introduced: computation of the aerosol particle number size distribution function, estimation of land surface biomass parameters, and retrieval of the backscatter cross-section. These problems are formulated in function spaces by introducing operator equations of the first kind. The mathematical models and solution methods in $l_1$ and $l_2$ spaces are considered. The regularization strategies and optimization solution techniques are fully described. The equivalence between Tikhonov regularization and Bayesian statistical inference for solving geoscience inverse problems is established. The general regularization model in $l_p$–$l_q$ spaces (for $p, q \ge 0$), which can be convex or non-convex, is introduced. Numerical simulations for these problems are performed and illustrated.

Acknowledgments

This research is supported by the National "973" Key Basic Research Developments Program of China under grant number 2007CB714400, the National Natural Science Foundation of China (NSFC) under grant numbers 10871191 and 40974075, and the Knowledge Innovation Programs of the Chinese Academy of Sciences under grant KZCX2-YW-QN107.

References

Ångström A (1929) On the atmospheric transmission of sun radiation and on dust in the air. Geogr Ann 11:156–166
Barzilai J, Borwein J (1988) Two-point step size gradient methods. IMA J Numer Anal 8:141–148


Bockmann C (2001) Hybrid regularization method for the ill-posed inversion of multiwavelength lidar data in the retrieval of aerosol size distributions. Appl Opt 40:1329–1342
Bockmann C, Kirsche A (2006) Iterative regularization method for lidar remote sensing. Comput Phys Commun 174:607–615
Bohren CF, Huffman DR (1983) Absorption and scattering of light by small particles. Wiley, New York
Brakhage H (1987) On ill-posed problems and the method of conjugate gradients. In: Engl HW, Groetsch CW (eds) Inverse and ill-posed problems. Academic, Boston, pp 165–175
Camps-Valls G (2008) New machine-learning paradigm provides advantages for remote sensing. SPIE Newsroom. doi:10.1117/2.1200806.1100
Davies CN (1974) Size distribution of atmospheric aerosol. J Aerosol Sci 5:293–300
Dennis JE, Schnabel RB (1983) Numerical methods for unconstrained optimization and nonlinear equations. Prentice Hall, Englewood Cliffs
Fletcher R (2001) On the Barzilai-Borwein method. Numerical Analysis report NA/207
Houghton JT, Meira Filho LG, Callander BA, Harris N, Kattenberg A, Maskell K (1996) Climate change 1995. Published for the Intergovernmental Panel on Climate Change, Cambridge University Press
Junge CE (1955) The size distribution and aging of natural aerosols as determined from electrical and optical data on the atmosphere. J Meteorol 12:13–25
Kelley CT (1999) Iterative methods for optimization. SIAM, Philadelphia
King MD, Byrne DM, Herman BM, Reagan JA (1978) Aerosol size distributions obtained by inversion of spectral optical depth measurements. J Atmos Sci 35:2153–2167
Li X, Wang J, Hu B, Strahler AH (1998) On utilization of a priori knowledge in inversion of remote sensing models. Sci China D 41:580–585
Li X, Wang J, Strahler AH (1999) Apparent reciprocal failure in BRDF of structured surfaces. Prog Nat Sci 9:747–750
Li X, Gao F, Liu Q, Wang JD, Strahler AH (2000) Validation of a new GO kernel and inversion of land surface albedo by kernel-driven model (1). J Remote Sens 4:1–7
Li X, Gao F, Wang J, Strahler AH (2001) A priori knowledge accumulation and its application to linear BRDF model inversion. J Geophys Res 106:11925–11935
McCartney EJ (1976) Optics of the atmosphere. Wiley, New York
Nguyen T, Cox K (1989) A method for the determination of aerosol particle distributions from light extinction data. In: Abstracts of the American association for aerosol research annual meeting, American Association of Aerosol Research, Cincinnati, pp 330–330
Nocedal J (1980) Updating quasi-Newton matrices with limited storage. Math Comput 35:773–782
Phillips DL (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97
Pokrovsky O, Roujean JL (2002) Land surface albedo retrieval via kernel-based BRDF modeling: I. Statistical inversion method and model comparison. Remote Sens Environ 84:100–119
Pokrovsky OM, Roujean JL (2003) Land surface albedo retrieval via kernel-based BRDF modeling: II. An optimal design scheme for the angular sampling. Remote Sens Environ 84:120–142
Pokrovsky IO, Pokrovsky OM, Roujean JL (2003) Development of an operational procedure to estimate surface albedo from the SEVIRI/MSG observing system by using POLDER BRDF measurements: II. Comparison of several inversion techniques and uncertainty in albedo estimates. Remote Sens Environ 87:215–242


Privette JL, Eck TF, Deering DW (1997) Estimating spectral albedo and nadir reflectance through inversion of simple bidirectional reflectance distribution models with AVHRR/MODIS-like data. J Geophys Res 102:29529–29542
Roujean JL, Leroy M, Deschamps PY (1992) A bidirectional reflectance model of the Earth's surface for the correction of remote sensing data. J Geophys Res 97:20455–20468
Strahler AH, Li XW, Liang S, Muller J-P, Barnsley MJ, Lewis P (1994) MODIS BRDF/albedo product: algorithm technical basis document. NASA EOS-MODIS Doc. 2.1
Strahler AH, Lucht W, Schaaf CB, Tsang T, Gao F, Li X, Muller JP, Lewis P, Barnsley MJ (1999) MODIS BRDF/albedo product: algorithm theoretical basis document. NASA EOS-MODIS Doc. 5.0
Tikhonov AN, Arsenin VY (1977) Solutions of ill-posed problems. Wiley, New York
Tikhonov AN, Goncharsky AV, Stepanov VV, Yagola AG (1995) Numerical methods for the solution of ill-posed problems. Kluwer, Dordrecht
Twomey S (1975) Comparison of constrained linear inversion and an iterative nonlinear algorithm applied to the indirect estimation of particle size distributions. J Comput Phys 18:188–200
Twomey S (1977) Atmospheric aerosols. Elsevier, Amsterdam
Verstraete MM, Pinty B, Myneni RB (1996) Potential and limitations of information extraction on the terrestrial biosphere from satellite remote sensing. Remote Sens Environ 58:201–214
Voutilainen A, Kaipio JP (2000) Statistical inversion of aerosol size distribution data. J Aerosol Sci 31:767–768
Wagner W, Ullrich A, Ducic V, Melzer T, Studnicka N (2006) Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner. ISPRS J Photogram Remote Sens 60:100–112
Wang YF (2007) Computational methods for inverse problems and their applications. Higher Education Press, Beijing
Wang YF (2008) An efficient gradient method for maximum entropy regularizing retrieval of atmospheric aerosol particle size distribution function. J Aerosol Sci 39:305–322
Wang YF, Ma SQ (2007) Projected Barzilai-Borwein methods for large scale nonnegative image restorations. Inverse Probl Sci Eng 15:559–583
Wang YF, Ma SQ (2009) A fast subspace method for image deblurring. Appl Math Comput 215:2359–2377
Wang YF, Xiao TY (2001) Fast realization algorithms for determining regularization parameters in linear inverse problems. Inverse Probl 17:281–291
Wang YF, Yang CC (2008) A regularizing active set method for retrieval of atmospheric aerosol particle size distribution function. J Opt Soc Am A 25:348–356
Wang YF, Yuan YX (2002) On the regularity of a trust region-CG algorithm for nonlinear ill-posed inverse problems. In: Sunada T, Sy PW, Yang L (eds) Proceedings of the third Asian mathematical conference, Diliman, Philippines, 23–27 Oct 2000. World Scientific, Singapore, pp 562–580
Wang YF, Yuan YX (2003) A trust region algorithm for solving distributed parameter identification problem. J Comput Math 21:759–772
Wang YF, Yuan YX (2005) Convergence and regularity of trust region methods for nonlinear ill-posed inverse problems. Inverse Probl 21:821–838
Wang YF, Li XW, Ma SQ, Yang H, Nashed Z, Guan YN (2005) BRDF model inversion of multiangular remote sensing: ill-posedness and interior point solution method. In: Proceedings of the 9th international symposium on physical measurements and signature in remote sensing (ISPMSRS), Beijing, 17–19 Oct 2005, vol XXXVI, pp 328–330


Wang YF, Fan SF, Feng X, Yan GJ, Guan YN (2006a) Regularized inversion method for retrieval of aerosol particle size distribution function in $W^{1,2}$ space. Appl Opt 45:7456–7467
Wang YF, Wen Z, Nashed Z, Sun Q (2006b) Direct fast method for time-limited signal reconstruction. Appl Opt 45:3111–3126
Wang YF, Li XW, Nashed Z, Zhao F, Yang H, Guan YN, Zhang H (2007a) Regularized kernel-based BRDF model inversion method for ill-posed land surface parameter retrieval. Remote Sens Environ 111:36–50
Wang YF, Fan SF, Feng X (2007b) Retrieval of the aerosol particle size distribution function by incorporating a priori information. J Aerosol Sci 38:885–901
Wang YF, Yang CC, Li XW (2008) A regularizing kernel-based BRDF model inversion method for ill-posed land surface parameter retrieval using smoothness constraint. J Geophys Res 113:D13101
Wang YF, Zhang JZ, Roncat A, Künzer C, Wagner W (2009a) Regularizing method for the determination of the backscatter cross-section in Lidar data. J Opt Soc Am A 26:1071–1079
Wang YF, Cao JJ, Yuan YX, Yang CC, Xiu NH (2009b) Regularizing active set method for nonnegatively constrained ill-posed multichannel image restoration problem. Appl Opt 48:1389–1401
Wang YF, Yang CC, Li XW (2009c) Kernel-based quantitative remote sensing inversion. In: Camps-Valls G, Bruzzone L (eds) Kernel methods for remote sensing data analysis. Wiley, New York
Wang YF, Ma SQ, Yang H, Wang JD, Li XW (2009d) On the effective inversion by imposing a priori information for retrieval of land surface parameters. Sci China D 39:360–369
Wanner W, Li X, Strahler AH (1995) On the derivation of kernels for kernel-driven models of bidirectional reflectance. J Geophys Res 100:21077–21090
Xiao TY, Yu SG, Wang YF (2003) Numerical methods for the solution of inverse problems. Science Press, Beijing
Ye YY (1997) Interior point algorithms: theory and analysis. Wiley, Chichester
Yuan YX (1993) Numerical methods for nonlinear programming. Shanghai Science and Technology Publication, Shanghai
Yuan YX (1994) Nonlinear programming: trust region algorithms. In: Xiao ST, Wu F (eds) Proceedings of Chinese SIAM annual meeting, Tsinghua University Press, Beijing, pp 83–97
Yuan YX (2001) A scaled central path for linear programming. J Comput Math 19:35–40


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_27-4 © Springer-Verlag Berlin Heidelberg 2014

Multiparameter Regularization in Downward Continuation of Satellite Data

Shuai Lu and Sergei V. Pereverzev
Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Linz, Austria

Abstract

This chapter discusses the downward continuation of spaceborne gravity data. We analyze the ill-posed nature of this problem and describe some approaches to its treatment. The chapter focuses on the multiparameter regularization approach and shows how it can naturally appear in the geodetic context, for example in the form of regularized total least squares or dual regularized total least squares. Numerical illustrations with synthetic data demonstrate that multiparameter regularization can indeed produce approximations of good accuracy.

1 Introduction

A principal aim of satellite gravity field determination is the derivation of the disturbing gravity potential or geoidal undulations at the Earth's surface. However, determining a satellite-only gravitational model from satellite tracking measurements is ill-posed due to the nature of downward continuation. To better understand the ill-posed nature of this problem, assume that data $G(t)$, derived from satellite gravimetry, are given at a spherical surface of the satellite orbit $\Omega_r = \{t \in \mathbb{R}^3 : |t| = r\}$. Then the problem basically is to determine the disturbing gravity potential $U(x)$, harmonic outside the geocentric reference sphere $\Omega_R$ of radius $R < r$, which is a spherical approximation of the geoid. On the other hand, the upward continuation of $U(t)$ from the reference sphere $\Omega_R$ into the external space $\Omega_R^{ext} = \{t \in \mathbb{R}^3 : |t| > R\}$ can be found by solving the spherical Dirichlet problem, which reads

\[ \Delta U(t) = 0 \ \text{ for } |t| > R, \qquad U(t) = F(t) \ \text{ for } |t| = R, \tag{1} \]

where $F(t)$ is the disturbing gravity potential at the reference sphere, in which we are interested, and we assume $U(t)$ to be regular at infinity, i.e., $U(t) = O\left(\frac{1}{|t|}\right)$ and $|\nabla_t U(t)| = O\left(\frac{1}{|t|^2}\right)$ for $|t| \to \infty$. The solution to the problem (1) is represented by the Abel-Poisson integral (Kellogg 1967)

\[ U(t) = \frac{1}{4\pi R}\int_{\Omega_R}\frac{|t|^2 - R^2}{|t - \tau|^3}\, F(\tau)\, d\Omega_R(\tau). \tag{2} \]
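A quick numerical check of the Abel-Poisson integral (2) is possible for the special case $F \equiv 1$, for which the harmonic continuation that is regular at infinity is $U(t) = R/|t|$. The Python sketch below is an illustration only: the function name `upward_continue_constant`, the midpoint rule, and the radii used in the example are assumptions. It exploits the rotational symmetry of the integrand to reduce the surface integral to a one-dimensional integral over $u = \cos\theta$, where $\theta$ is the angle between $t$ and $\tau$.

```python
import math


def upward_continue_constant(R, rho, nodes=4000):
    """Evaluate the Abel-Poisson integral (2) for F = 1 at a point with |t| = rho > R.

    With u = cos(theta), |t - tau|^2 = rho^2 + R^2 - 2*rho*R*u and the surface
    element integrates to 2*pi*R^2 du, so a 1D midpoint rule suffices.
    """
    du = 2.0 / nodes
    total = 0.0
    for i in range(nodes):
        u = -1.0 + (i + 0.5) * du
        dist3 = (rho ** 2 + R ** 2 - 2.0 * rho * R * u) ** 1.5
        total += (rho ** 2 - R ** 2) / dist3 * du
    return total * 2.0 * math.pi * R ** 2 / (4.0 * math.pi * R)


# Hypothetical radii: unit reference sphere, evaluation at 1.5 radii.
value = upward_continue_constant(1.0, 1.5)
expected = 1.0 / 1.5   # R / |t| for the constant boundary function
```

The same reduction shows why the kernel becomes sharply peaked as $|t| \to R$: the denominator approaches zero at $u = 1$, which is the smoothing that downward continuation has to undo.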


Then the disturbing gravity potential $F(t)$ at the reference sphere can be found by inverting the Abel-Poisson integral (2) for known values of $U(t) = G(t)$ at the satellite orbit $\Omega_r$, resulting in a Fredholm integral equation of the first kind

\[ A_{DWC}F(t) := \frac{1}{4\pi R}\int_{\Omega_R}\frac{r^2 - R^2}{(r^2 + R^2 - 2\,t \cdot \tau)^{3/2}}\, F(\tau)\, d\Omega_R(\tau) = G(t), \quad t \in \Omega_r. \tag{3} \]

Now one can easily recognize that for $r > R$ the kernel of the downward continuation integral operator $A_{DWC}$ is a bounded continuous function, and therefore $A_{DWC}$ is a compact operator between the Hilbert spaces $L^2(\Omega_R)$ and $L^2(\Omega_r)$. Hence, its inverse cannot be a bounded operator from $L^2(\Omega_r)$ to $L^2(\Omega_R)$ (see, e.g., Engl et al. 1996). Remembering Hadamard's definition of a well-posed problem (existence, uniqueness, and continuity of the inverse), we consequently see that the problem of downward continuation (3) is ill-posed, as it violates the third condition. Moreover, a straightforward calculation (see, e.g., Freeden et al. 1997) shows that the operator $A = A_{DWC}$ admits the singular value decomposition (SVD)

\[ A = \sum_{j=1}^{\infty} a_j u_j \langle v_j, \cdot \rangle_{L^2(\Omega_R)}, \tag{4} \]

where

\[ a_j = \left(\frac{R}{r}\right)^k, \quad u_j = u_j(t) = \frac{1}{r}Y_{k,i}\left(\frac{t}{r}\right), \quad v_j = v_j(\tau) = \frac{1}{R}Y_{k,i}\left(\frac{\tau}{R}\right), \quad j = i + k^2, \quad i = 1, 2, \ldots, 2k+1, \quad k = 0, 1, \ldots, \tag{5} \]

and $\{Y_{k,i}\}$ is a system of spherical harmonics $L^2$-orthonormalized with respect to the unit sphere $\Omega_1 \subset \mathbb{R}^3$. Therefore, the problem of downward continuation (3) can be formally classified as exponentially or severely ill-posed (see, e.g., Louis 1989), since the singular values $a_j$ of the operator $A = A_{DWC}$ converge to zero exponentially fast for $j \to \infty$.

The ill-posed nature of downward continuation can be seen as the background to the severe ill-posedness of other satellite geodetic problems such as satellite gravity gradiometry, where the satellite data provide information about second-order partial derivatives of the gravity potential $U(t)$ at satellite altitude (for more details on satellite gravity gradiometry the reader is referred to Rummel et al. (1993) and Freeden (1999) as well as Chap. 9 of this work). Using second-order radial derivatives on the orbital sphere $\Omega_r$, the spherical framework of satellite gravity gradiometry (SGG) can also be mathematically formulated in the form of a first kind Fredholm integral equation with the operator $A = A_{SGG} : L^2(\Omega_R) \to L^2(\Omega_r)$ defined by

\[ A_{SGG}F(t) := \frac{1}{4\pi R}\int_{\Omega_R}\frac{\partial^2}{\partial r^2}\left(\frac{r^2 - R^2}{(r^2 + R^2 - 2\,t \cdot \tau)^{3/2}}\right) F(\tau)\, d\Omega_R(\tau), \quad t \in \Omega_r. \tag{6} \]

Although the kernel of this integral operator is less smooth than the downward continuation kernel, its SVD has the form (4) with exponentially decreasing singular values

\[ a_j = \left(\frac{R}{r}\right)^k \frac{(k+1)(k+2)}{r^2}, \quad j = i + k^2, \quad i = 1, 2, \ldots, 2k+1, \quad k = 0, 1, \ldots \tag{7} \]


(see, e.g., Freeden 1999; Pereverzev and Schock 1999). Therefore, the SGG problem can be formally classified as severely ill-posed as well. Thus, due to the ill-posed nature of downward continuation, in satellite geodesy we have to deal with exponentially ill-posed integral equations. In any discretized version of such an equation, this ill-posedness is reflected in the ill-conditioning of the corresponding matrix equation

\[ Ax = y, \tag{8} \]

and the crux of the difficulty is that usually in practice we are confronted with satellite data y" blurred by observation error 2 D y  y2 . Note that polar gaps could be an extra error source (see, e.g., Boumann 2000). Ill-conditioned matrix equations are not new in geodesy, especially when satellite observations are used for gravity field estimation (see, e.g., Kusche and Klees 2002). A number of methods have been proposed to reconstruct the solution x  of problem (8) from noisy observation y" . One of the most widely used methods was and still is Tikhonov regularization, which estimates x  as the minimizer x"˛ of the functional ˆ.˛I x/ D jjAx  y jj2 C ˛jjBxjj2 ;

(9)

where $\|\cdot\|$ is the norm induced by an appropriate scalar product $\langle\cdot,\cdot\rangle$, $B$ is a symmetric positive (or semi-positive) definite matrix, and $\alpha > 0$ is the regularization parameter to be chosen properly. In satellite geodesy, $B$ is almost always chosen by using the formulation of Bayesian statistics, where problem (8) with noisy data is written in the form of a standard Gauss-Markov model

\[ y_\varepsilon = A x^\dagger + \varepsilon, \tag{10} \]

and the observation error $\varepsilon$ is assumed to be a random vector with zero expectation $E\varepsilon = 0$ and covariance matrix $\operatorname{cov} \varepsilon = \delta^2 P$; here $\delta$ is a small positive number used for measuring the noise intensity. In the next section, we discuss how Bayesian statistics converts a priori information about the noise covariance structure, given in the form of a matrix $P$, into the choice of a matrix $B$ in (9). At the same time, several authors (see, e.g., Klees et al. 2003) note that the Bayesian approach cannot be used at all if the covariance matrix $\operatorname{cov} \varepsilon$ is not known exactly. On the other hand, for new satellite missions one cannot expect to have a good description of the noise covariance. Indeed, as indicated by Kusche and Klees (2002), first, the measurement equipment on board a satellite will never be validated in orbit, and second, the measurements will also be contaminated with an aliasing signal caused by the unmodeled high frequencies of the gravitational field. It is also worth noting that numerical experiments reported in Bauer and Pereverzev (2006) show that the use of a rough approximation of the covariance operator in Tikhonov regularization leads to very poor performance. Therefore, one needs algorithms that are capable of dealing with different noise models. One such algorithm has recently been discussed in Bauer et al. (2007), but it has been designed only for estimating bounded linear functionals of the solution $x^\dagger$. The goal of the present chapter is to discuss another way of regularization, which is based on the minimization of a Tikhonov-type functional with several regularization parameters.


\[ \Phi(\alpha_1, \alpha_2, \ldots, \alpha_l; x) = \|Ax - y_\varepsilon\|^2 + \sum_{i=1}^{l} \alpha_i \|B_i x\|^2. \tag{11} \]

The application of such a multiple parameter regularization in satellite geodesy has been advocated in Xu (1992) and Xu and Rummel (1994). In these papers, the matrices $B_i$ have been chosen to be equal to

\[ B_i = \begin{pmatrix} 0 & & & & \\ & \ddots & & & \\ & & I_i & & \\ & & & \ddots & \\ & & & & 0 \end{pmatrix}, \tag{12} \]

where $I_i$ is an identity matrix whose dimension is equal to the number of harmonic coefficients of the corresponding degree. On the other hand, in Xu et al. (2006) it has been observed that the multiple parameter regularization (11) and (12) only marginally improved on its single parameter version (9) with $B$ equal to the identity matrix $I$. At this point, it is worth noting that in regularization theory (see, e.g., Engl et al. 1996) one chooses a matrix $B$ in (9) on the basis of a priori knowledge about the smoothness of the solution. For example, if it is known a priori that $x^\dagger$ admits a representation $x^\dagger = B^{-p} v$ for $p > 0$ and $\|v\| \le \rho$, then the theory suggests using such a $B$ in (9). It is interesting to note that in view of the paper Svensson (1983) (see also Freeden and Pereverzev 2001), it is reasonable to believe that in an appropriate scale of Sobolev smoothness classes the Earth's gravitational potential has a smoothness index greater than 3/2. In Sect. 2, we show how this information can be transformed into an appropriate choice of the regularizing matrix $B$. This choice, together with the above-mentioned conclusion of Xu et al. (2006), suggests the following form of the Tikhonov multiple regularization functional:

\[ \Phi(\alpha, \beta; x) = \|Ax - y_\varepsilon\|^2 + \alpha \|Bx\|^2 + \beta \|x\|^2, \tag{13} \]

where $B$ is used in the description of the solution smoothness. In Sect. 3, we discuss a choice of the regularization parameters $\alpha$, $\beta$ that allows an optimal order of the reconstruction accuracy under standard smoothness assumptions. Numerical aspects of the implementation of this parameter choice are discussed in Sect. 4. In Sect. 5, we present some numerical experiments illustrating the theoretical results.
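For fixed positive parameters and a symmetric matrix $B$, the minimizer of the two-parameter functional (13) solves the linear system $(A^T A + \alpha B^2 + \beta I)x = A^T y_\varepsilon$. The following is a minimal NumPy sketch of this step only. The matrices `A` and `B`, the noise level, and the parameter values are hypothetical toy choices made for illustration, not data from this chapter.

```python
import numpy as np

def tikhonov_two_param(A, B, y, alpha, beta):
    """Minimize ||A x - y||^2 + alpha ||B x||^2 + beta ||x||^2 for symmetric B."""
    n = A.shape[1]
    lhs = A.T @ A + alpha * (B @ B) + beta * np.eye(n)
    return np.linalg.solve(lhs, A.T @ y)

def functional(A, B, y, alpha, beta, x):
    """Value of the two-parameter Tikhonov functional at x."""
    return (np.linalg.norm(A @ x - y) ** 2
            + alpha * np.linalg.norm(B @ x) ** 2
            + beta * np.linalg.norm(x) ** 2)

rng = np.random.default_rng(0)
n = 8
A = np.diag(1.5 ** (-np.arange(n)))      # hypothetical operator with decaying singular values
B = np.diag(np.arange(n) + 0.5)          # hypothetical symmetric smoothness operator
x_true = rng.standard_normal(n) / (np.arange(n) + 1.0) ** 2
y_noisy = A @ x_true + 1e-3 * rng.standard_normal(n)

alpha, beta = 1e-4, 1e-6                 # illustrative values, not a parameter choice rule
x_reg = tikhonov_two_param(A, B, y_noisy, alpha, beta)
```

The sketch forms the normal-equation matrix explicitly, which is acceptable for a toy problem of this size; it says nothing about how $\alpha$ and $\beta$ should be selected.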

2 A Functional Analysis Point of View on Satellite Geodetic Problems

Due to the huge number of observations and unknowns, it is reasonable to analyze (8) and (10) as operator equations in Hilbert spaces with the design operator $A$ acting compactly from the solution space $X$ into the observation space $Y$. In this context, the covariance $P$ can be seen as a bounded self-adjoint nonnegative operator from $Y$ to $Y$ such that for any $f, g \in Y$ there holds $E[\langle f, \varepsilon\rangle \langle g, \varepsilon\rangle] = \delta^2 \langle Pf, g\rangle$, where $\langle\cdot,\cdot\rangle$ means the scalar product of the corresponding Hilbert space. If

\[ A = \sum_{i=1}^{n_A} a_i u_i \langle v_i, \cdot\rangle, \qquad n_A = \operatorname{rank}(A) = \dim \operatorname{Range}(A), \tag{14} \]

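is the SVD of the design operator, then it is natural to assume that for random noise $\varepsilon$ the Fourier coefficients $\langle u_i, \varepsilon\rangle$ are uncorrelated random variables. This assumption allows us to treat the covariance $P$ as a diagonal operator with respect to the system $\{u_i\}$, since the observation error $\varepsilon$ is assumed to be zero-mean such that $E\langle u_i, \varepsilon\rangle = 0$, $i = 1, 2, \ldots$, and for $i \neq j$

\[ \langle P u_i, u_j\rangle = \delta^{-2} E[\langle u_i, \varepsilon\rangle \langle u_j, \varepsilon\rangle] = \delta^{-2} E\langle u_i, \varepsilon\rangle\, E\langle u_j, \varepsilon\rangle = 0. \]

Thus,

\[ P = \sum_{i=1}^{n_P} p_i u_i \langle u_i, \cdot\rangle, \qquad n_P = \operatorname{rank}(P), \tag{15} \]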

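where $p_i = \delta^{-2} E[\langle u_i, \varepsilon\rangle^2]$. In agreement with the Bayesian approach, not only the covariance $P$ is introduced as a priori information, but also the expectation $x_0 = E x^\dagger$, which gives one more observation equation

\[ x^\dagger = x_0 + \xi, \qquad E\xi = 0, \qquad \operatorname{cov} \xi = \sigma^2 Q, \]

where the covariance $Q$ is also assumed to be a bounded self-adjoint nonnegative operator, i.e., $Q = Q^T \ge 0$, and $\sigma > 0$. Keeping in mind that $x^\dagger \in \operatorname{Range}(A^T)$, it is natural to assume that $\xi = \sum_i \xi_i v_i$ with uncorrelated random Fourier coefficients $\xi_i = \langle v_i, \xi\rangle$. Therefore, as in the case of $\operatorname{cov} \varepsilon$,

\[ Q = \sum_{i=1}^{n_Q} q_i v_i \langle v_i, \cdot\rangle, \qquad q_i = \sigma^{-2} E[\langle v_i, \xi\rangle^2]. \tag{16} \]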

Then within the framework of the Bayesian approach, the estimate $\hat{x}$ of the unknown element $x^\dagger$ follows from the normal equation (see, e.g., Kusche and Klees 2002)

\[ (\delta^{-2} A^T P^{-1} A + \sigma^{-2} Q^{-1})\hat{x} = \delta^{-2} A^T P^{-1} y_\varepsilon + \sigma^{-2} Q^{-1} x_0. \tag{17} \]

At this point it is worth mentioning that, as noted in Xu et al. (2006), one of the most widely used methods for obtaining a stabilized solution of the geopotential from space geodetic observations was and still is to employ Kaula's rule of thumb on the spherical harmonic coefficients, which can be written as follows:

\[ \hat{x} = (A^T P^{-1} A + K)^{-1} A^T P^{-1} y_\varepsilon, \tag{18} \]


where $K$ is diagonal with its elements being inversely proportional to Kaula's rule of degree variances. The solution of type (18) has often been interpreted in terms of Bayesian inference, since it solves a particular case of (17), where $x_0 = 0$ and

\[ Q = \frac{\delta^2}{\sigma^2} K^{-1}. \]

At the same time, Xu (1992) argued that because Kaula's rule of thumb reflects only possible magnitudes of harmonic coefficients, and because the expected values of these coefficients are indeed not equal to zero ($x_0 \neq 0$), it is questionable to interpret (18) from the Bayesian point of view. Instead, Xu (1992) and Xu et al. (2006) interpreted (18) as a regularization of (10). Note that in view of (14)-(16), by introducing

\[ \alpha = \frac{\delta^2}{\sigma^2}, \qquad P = \sum_{i=1}^{n_P} p_i v_i \langle v_i, \cdot\rangle, \qquad B = (Q^{-1} P)^{1/2}, \tag{19} \]

we can reduce (17) with $x_0 = 0$ to

\[ (A^T A + \alpha B^2)\hat{x} = A^T y_\varepsilon, \tag{20} \]

which is nothing but the Euler equation for the minimization of the Tikhonov functional (9). Thus, the solution $\hat{x} = x_\varepsilon^\alpha$ of (20), or (17) (with $x_0 = 0$, as in Kaula's rule), is the minimizer of (9) with $B$ given by (19). This allows an interpretation of the regularization parameter $\alpha$ as the ratio of the observation noise level $\delta^2$ to the unknown variance $\sigma^2$. Moreover, in view of the relations $B^2 = Q^{-1} P$ and $K = \alpha Q^{-1} = \alpha B^2 P^{-1}$, the choice of the prior covariance $Q$ or of Kaula's operator $K$ means the choice of a regularizing operator $B$ in (9). On the other hand, regularization theory provides a theoretical justification for the Tikhonov regularization (9) under the assumption that the smoothness of the unknown solution $x^\dagger$ as well as the smoothing properties of the design operator $A$ can be measured in terms of a regularizing operator $B$. For example, from Engl et al. (1996) it is known that if for any $x \in X$

\[ m \|B^{-a} x\| \le \|Ax\| \le M \|B^{-a} x\|, \tag{21} \]

and

\[ \|B^{p} x^\dagger\| \le \rho, \tag{22} \]

with some positive constants $a$, $p$, $m$, $M$, and $\rho$, then the Tikhonov regularization approximation $x_\varepsilon^\alpha$ given as the minimizer of (9) with $\alpha = O\big(\|\varepsilon\|^{\frac{2(a+1)}{a+p}}\big)$ provides the error bound

\[ \|x^\dagger - x_\varepsilon^\alpha\| = O\big(\|\varepsilon\|^{\frac{p}{a+p}}\big) \tag{23} \]

for $p \le a + 2$.


Observe that in terms of the covariance $P$ (15) the expected value of $\|\varepsilon\|^2$ can be written as follows:

\[ E\|\varepsilon\|^2 = \delta^2 \sum_{i=1}^{n_P} p_i. \]

Then it is natural to assume that the norm of the actual realization of the random variable $\varepsilon$ can be estimated as

\[ \|\varepsilon\| = \|A x^\dagger - y_\varepsilon\| \le c\delta, \tag{24} \]

where $c$ is some fixed constant. It is interesting to note that within the problem setting (21), (22), and (24) the order of accuracy

\[ \|x^\dagger - x_\varepsilon^\alpha\| = O\big(\delta^{\frac{p}{a+p}}\big), \tag{25} \]

given by (23) and (24), cannot be improved by any other regularization method (see, e.g., Nair et al. 2005). Our goal now is to clarify the assumptions (21) and (22) in the space geodetic context. Of particular interest for our consideration are the spherical Sobolev spaces (Freeden 1999). Starting with an unbounded self-adjoint strictly positive definite operator in $L^2(\Omega_R)$,

\[ D f(t) := \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right) Y^R_{k,i}(t)\, \langle Y^R_{k,i}, f\rangle_{L^2(\Omega_R)}, \qquad t \in \Omega_R, \]

where $Y^R_{k,i}(t) = \frac{1}{R} Y_{k,i}\left(\frac{t}{R}\right)$ are $L^2(\Omega_R)$-orthonormal spherical harmonics, we introduce the space

\[ h_s(\Omega_R) = \left\{ f : \|D^s f\|^2 = \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right)^{2s} \langle Y^R_{k,i}, f\rangle^2_{L^2(\Omega_R)} < \infty \right\} \]

with the scalar product $\langle f, g\rangle_s := \langle D^s f, D^s g\rangle_{L^2(\Omega_R)}$ and the associated norm $\|f\|_s = \langle f, f\rangle_s^{1/2}$. The spherical Sobolev space $H_s(\Omega_R)$ is the completion of $h_s(\Omega_R)$ under this norm. In particular, $H_0(\Omega_R) = L^2(\Omega_R)$. The family $\{H_s(\Omega_R)\}$, $s \in (-\infty, \infty)$, of spherical Sobolev spaces is a particular example of a so-called Hilbert scale. Any such scale allows the following interpolation inequality relating the norms of the same element in different spaces (see, e.g., Engl et al., 1996):

\[ \|f\|_\nu \le \|f\|_\mu^{\frac{s-\nu}{s-\mu}}\, \|f\|_s^{\frac{\nu-\mu}{s-\mu}}, \tag{26} \]

where $-\infty < \mu < \nu < s < \infty$. The spherical Sobolev spaces are of particular interest in physical geodesy, since a mathematical study (Svensson 1983) shows that a square-integrable density inside the Earth's sphere $\Omega_R$ implies a potential $x^\dagger = F(t)$ of class $H_s(\Omega_R)$ with $s = \frac{3}{2}$. In addition, the known (unitless) leading coefficients $\langle x^\dagger, Y^R_{k,i}\rangle_{L^2(\Omega_R)}$ of the Earth's anomalous potential allow the estimates (see, e.g., Freeden and Pereverzev, 2001)

\[ \begin{aligned} \|x^\dagger\|_{3/2} &:= \left( \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right)^{3} \langle x^\dagger, Y^R_{k,i}\rangle^2 \right)^{1/2} \approx 1.934 \times 10^{-3}, \\ \|x^\dagger - P_{31} x^\dagger\|_{3/2} &:= \left( \sum_{k=32}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right)^{3} \langle x^\dagger, Y^R_{k,i}\rangle^2 \right)^{1/2} \approx 4.3 \times 10^{-8}, \end{aligned} \tag{27} \]

where $P_n$ is the orthoprojector onto the linear span of $\{Y^R_{k,i};\ i = 1, 2, \ldots, 2k+1;\ k = 0, 1, \ldots, n\}$. Thus, in the space geodetic context the assumption (22) is satisfied with any $B = D^q$, $p = \frac{3}{2q}$, $q > 0$, and $\rho \approx 1.934 \times 10^{-3}$. Now consider the assumption (21) on the smoothing properties of the design operator $A$. We analyze it in the case of the SGG problem, when the design operator $A = A_{SGG}$ is given by (6) and (7). A similar analysis can be made for $A = A_{DWC}$. Note that in regularization theory, the number $a$ in (21) is usually interpreted as the degree of ill-posedness of a problem: the larger $a$, the smoother is the image $Ax$, and the more ill-posed is the problem (10). As one can see from (7), the singular values $a_j$ of the design operator $A_{SGG}$ decay exponentially fast as $j \to \infty$. On the other hand, for any operator $B = D^q$ the singular values $b_j$ of $B^{-a}$ have the form $b_j = (k + 1/2)^{-aq}$, $j = k^2 + i$, $i = 1, 2, \ldots, 2k+1$, $k = 0, 1, \ldots$, and decay at a power rate only. For this reason the SGG problem can be formally classified as exponentially or severely ill-posed, since in general no inequality of the form (21) is possible for $B = D^q$, $A = A_{SGG}$, and a finite $a > 0$. Nevertheless, we argue that for some satellite missions, such as the Gravity field and steady-state Ocean Circulation Explorer (GOCE) (see, e.g., Rebhan et al. 2000), the satellite gravity gradiometry problem can be treated as mildly ill-posed. Recall that the aim of the GOCE mission (see Chaps. 3, 4, and 9 of this book) is to provide a high-accuracy model of the Earth's gravitational field based on potential coefficients $\langle x^\dagger, Y^R_{k,i}\rangle_{L^2(\Omega_R)}$ up to degree $k = 300$. It is interesting to observe that up to this degree the singular values $a_j = a_{k,i} = (R/r)^k (k+1)(k+2)/r^2$, $j = k^2 + i$, behave like $(k + 1/2)^{-s}$ with $s = 5.5$. A straightforward calculation shows that, assuming a mean Earth radius $R = 6{,}378 \times 10^3$ m and a GOCE satellite altitude of about $r - R = 250 \times 10^3$ m, we obtain, in particular,

\[ 0.2 \left(k + \tfrac{1}{2}\right)^{-\frac{11}{2}} \le a_{k,i} \le 3 \left(k + \tfrac{1}{2}\right)^{-\frac{11}{2}}, \qquad k = 100, 101, \ldots, 300. \tag{28} \]

This observation gives a hint that within the GOCE-data processing it is reasonable to approximate $A_{SGG}$ by the design operator

\[ \tilde{A}_{SGG} := \sum_{k=0}^{300} \left(\frac{R}{r}\right)^{k} \frac{(k+1)(k+2)}{r^2} \sum_{i=1}^{2k+1} Y^r_{k,i} \langle Y^R_{k,i}, \cdot\rangle_{L^2(\Omega_R)} + \sum_{k=301}^{\infty} \left(k + \tfrac{1}{2}\right)^{-\frac{11}{2}} \sum_{i=1}^{2k+1} Y^r_{k,i} \langle Y^R_{k,i}, \cdot\rangle_{L^2(\Omega_R)}, \]

where $Y^r_{k,i}(t) = \frac{1}{r} Y_{k,i}\left(\frac{t}{r}\right)$. Using (28) and the definition of the operator $D$ generating the scale of spherical Sobolev spaces $\{H_s(\Omega_R)\}$ one can easily find constants $m$, $M$ such that for any $x \in L^2(\Omega_R)$


\[ m \big\|D^{-\frac{11}{2}} x\big\|_{L^2(\Omega_R)} \le \|\tilde{A}_{SGG}\, x\|_{L^2(\Omega_r)} \le M \big\|D^{-\frac{11}{2}} x\big\|_{L^2(\Omega_R)}. \tag{29} \]

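It means that the approximate design operator $\tilde{A}_{SGG}$ satisfies the assumption (21) for any $B = D^q$ and $a = \frac{11}{2q}$, $q > 0$. Consider now the solution $\tilde{x}^\dagger$ of the approximate SGG problem

\[ \tilde{A}_{SGG}\, x = y_0, \tag{30} \]

where the data $y_0 = A_{SGG}\, x^\dagger$ correspond to the exact potential $x^\dagger$. Then (29) together with the interpolation inequality (26), where $\mu = -\frac{11}{2}$, $\nu = 0$, $s = \frac{3}{2}$, yields

\[ \begin{aligned} \|x^\dagger - \tilde{x}^\dagger\|_{L^2(\Omega_R)} &\le \|x^\dagger - \tilde{x}^\dagger\|_{-\frac{11}{2}}^{\frac{3}{14}}\, \|x^\dagger - \tilde{x}^\dagger\|_{\frac{3}{2}}^{\frac{11}{14}} \\ &\le m^{-\frac{3}{14}}\, \big\|\tilde{A}_{SGG}(x^\dagger - \tilde{x}^\dagger)\big\|_{L^2(\Omega_r)}^{\frac{3}{14}}\, \|x^\dagger - \tilde{x}^\dagger\|_{\frac{3}{2}}^{\frac{11}{14}}. \end{aligned} \]

From the definition of $A_{SGG}$ and $\tilde{A}_{SGG}$, it follows that

\[ \langle Y^R_{k,i}, \tilde{x}^\dagger\rangle_{L^2(\Omega_R)} = \langle Y^R_{k,i}, x^\dagger\rangle_{L^2(\Omega_R)} = a_{k,i}^{-1} \langle Y^r_{k,i}, y_0\rangle_{L^2(\Omega_r)} \]

up to the degree $k = 300$, while for $k = 301, 302, \ldots$, $a_{k,i} < \left(k + \frac{1}{2}\right)^{-\frac{11}{2}}$ and

\[ \big|\langle Y^R_{k,i}, \tilde{x}^\dagger\rangle_{L^2(\Omega_R)}\big| = \left(k + \tfrac{1}{2}\right)^{\frac{11}{2}} \big|\langle Y^r_{k,i}, y_0\rangle_{L^2(\Omega_r)}\big| < a_{k,i}^{-1} \big|\langle Y^r_{k,i}, y_0\rangle_{L^2(\Omega_r)}\big| = \big|\langle Y^R_{k,i}, x^\dagger\rangle_{L^2(\Omega_R)}\big|. \]

It means that

\[ \|x^\dagger - \tilde{x}^\dagger\|_{\frac{3}{2}} \le \|x^\dagger - P_{300} x^\dagger\|_{\frac{3}{2}} \le \|x^\dagger - P_{31} x^\dagger\|_{\frac{3}{2}}. \]

On the other hand, it is easy to see that

\[ \begin{aligned} \big\|\tilde{A}_{SGG}(x^\dagger - \tilde{x}^\dagger)\big\|_{L^2(\Omega_r)} &= \big\|(A_{SGG} - \tilde{A}_{SGG}) x^\dagger\big\|_{L^2(\Omega_r)} \\ &= \big\|(A_{SGG} - \tilde{A}_{SGG})(I - P_{300}) x^\dagger\big\|_{L^2(\Omega_r)} \\ &\le \left(300 + \tfrac{1}{2}\right)^{-\frac{11}{2}} \big\|(I - P_{300}) x^\dagger\big\|_{L^2(\Omega_R)} \\ &\le \left(300 + \tfrac{1}{2}\right)^{-7} \big\|(I - P_{300}) x^\dagger\big\|_{\frac{3}{2}} \\ &\le 4.5 \times 10^{-18}\, \|x^\dagger - P_{31} x^\dagger\|_{\frac{3}{2}}. \end{aligned} \]

Summing up and using (27) we obtain

\[ \|x^\dagger - \tilde{x}^\dagger\|_{L^2(\Omega_R)} \le 2 \cdot 10^{-4}\, \|x^\dagger - P_{31} x^\dagger\|_{\frac{3}{2}} \approx 10^{-11}. \]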


Thus, approximating the solution $x^\dagger$ of a severely ill-posed problem $A_{SGG}\, x = y_0$ by the solution $\tilde{x}^\dagger$ of a moderately ill-posed problem $\tilde{A}_{SGG}\, x = y_0$, one can recover the gravitational potential $x^\dagger$ with an accuracy of order $10^{-11}$. At the same time, as indicated in Freeden and Pereverzev (2001, Table 3), the expected accuracy for a mission with the same altitude as GOCE is about $4 \times 10^{-11}$. It means that within the desired accuracy of GOCE-data processing, the satellite gravity gradiometry problem can be treated as a moderately ill-posed equation $\tilde{A}_{SGG}\, x = y_\varepsilon$ with a degree of ill-posedness $a = \frac{11}{2}$, say. Moreover, from (25) it follows that in terms of an observation error norm $\|\varepsilon\| = \delta$ the accuracy provided for such an equation by the Tikhonov regularization (9) with $B = D^q$, $q > 0$, has at least the order $O\big(\delta^{\frac{3}{14}}\big)$.
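The two-sided bound (28) and the constant $(300 + 1/2)^{-7}$ used in the estimate above can be checked numerically. The following short NumPy sketch does so, assuming SI units (metres) for $R$ and $r$ as stated in the text; the variable names are illustrative only.

```python
import numpy as np

R = 6378e3                 # mean Earth radius in metres, as in the text
r = R + 250e3              # orbital radius for an altitude of 250 km

k = np.arange(100, 301)    # degrees covered by the bound (28)
a_k = (R / r) ** k * (k + 1) * (k + 2) / r ** 2   # singular values from (7)
power_law = (k + 0.5) ** -5.5                     # (k + 1/2)^(-11/2)
ratio = a_k / power_law                           # should stay between 0.2 and 3

tail_constant = 300.5 ** -7.0                     # factor in the tail estimate

print("min ratio:", ratio.min(), "max ratio:", ratio.max())
print("(300 + 1/2)^(-7):", tail_constant)
```

Since the logarithm of the ratio is a concave function of $k$, its minimum over the range is attained at one of the end points $k = 100$ or $k = 300$, so checking the sampled degrees is sufficient here.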

3 An Appearance of a Multiparameter Regularization in the Geodetic Context: Theoretical Aspects

When determining the gravity field of the Earth from satellite data, we have to keep in mind that (8) and (10) also contain a modeling error, sometimes called aliasing. Such an error may be caused, e.g., by a deviation of a satellite from a circular orbit, or by the fact that when the geopotential is determined locally, the effects of outer zones are neglected. Due to the modeling error, the design operator/matrix $A$ is specified inexactly, and we represent it as

\[ A = A_0 + hE, \tag{31} \]

where $A_0$ is the exact design operator/matrix and $hE$ is the modeling error. Remember that for well-posed problems of the form (8) and (10) the total least squares method (TLS) takes care of additional perturbations in the design operator $A_0$ and is a well-accepted generalization of classical least squares (see, e.g., Van Huffel and Vanderwalle 1991). In the TLS-method some estimate $(\hat{x}, \hat{y}, \hat{A})$ for $(x^\dagger, y_0, A_0)$ from given data $(y_\varepsilon, A)$ is determined by solving the constrained minimization problem

\[ \|\hat{A} - A\|^2 + \|\hat{y} - y_\varepsilon\|^2 \to \min \quad \text{subject to} \quad \hat{A}\hat{x} = \hat{y}. \tag{32} \]

In the problem, $(\hat{x}, \hat{y}, \hat{A})$ are the unknowns, and in the finite dimensional case the norms in (32) are the Frobenius matrix norm and the Euclidean vector norm, respectively. For ill-posed problems (8) and (10) it may happen that there does not exist any solution $\hat{x} \in X$ of the problem (32). Furthermore, if there exists a TLS-solution $\hat{x}$, it may be far away from the desired solution $x^\dagger$. Therefore, it is quite natural to restrict the set of admissible solutions by searching for approximations $\hat{x}$ that belong to some prescribed set $S$, which is the philosophy of regularized total least squares (RTLS) (see Golub et al., 1999). The simplest case occurs when the set $S$ is a ball $S = \{x \in X : \|Bx\| \le \varrho\}$, where $B$ is some densely defined self-adjoint strictly positive operator, and $\varrho$ is a prescribed radius. This leads to the RTLS-method, in which some estimate $(\hat{x}, \hat{y}, \hat{A})$ for $(x^\dagger, y_0, A_0)$ is determined by solving the constrained minimization problem

\[ \|\hat{A} - A\|^2 + \|\hat{y} - y_\varepsilon\|^2 \to \min \quad \text{subject to} \quad \hat{A}\hat{x} = \hat{y}, \quad \|B\hat{x}\| \le \varrho. \tag{33} \]


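Note that several authors (see, e.g., Lu et al. 2009) reported an undesirable sensitivity of RTLS-solutions to a misspecification of the bound $\varrho$, which ideally should be chosen as $\varrho = \|Bx^\dagger\|$. But in satellite geodesy one can easily overcome this drawback by using known bounds of the form (27). For example, it is known (see, e.g., Freeden and Pereverzev, 2001) that

\[ \|x^\dagger - P_1 x^\dagger\|_{\frac{3}{2}} := \left( \sum_{k=2}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right)^{3} \langle x^\dagger, Y^R_{k,i}\rangle^2 \right)^{1/2} \approx 3.033 \times 10^{-7}. \tag{34} \]

Then within the framework of the RTLS-method it is reasonable to take

\[ B = B_{\bar{\varepsilon}} := \bar{\varepsilon} \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} Y^R_{k,i} \langle Y^R_{k,i}, \cdot\rangle + \sum_{k=2}^{\infty} \sum_{i=1}^{2k+1} \left(k + \tfrac{1}{2}\right)^{\frac{3}{2}} Y^R_{k,i} \langle Y^R_{k,i}, \cdot\rangle, \qquad \varrho = 3.033 \times 10^{-7}, \tag{35} \]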

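where $\bar{\varepsilon}$ is some small positive number introduced with the aim of keeping $B = B_{\bar{\varepsilon}}$ strictly positive. Note that the reason to use $\varrho = \|x^\dagger - P_1 x^\dagger\|_{3/2}$ instead of $\varrho = \|x^\dagger\|_{3/2}$ is that in satellite geodesy regularization is often not applied to the first few degrees of the model (see, e.g., Xu et al. 2006). Let us now summarize some characterizations of RTLS-solutions that serve as a starting point for developing algorithms to construct such solutions. From Golub et al. (1999) and Beck et al. (2006) we have

Theorem 1. If in the problem (33) the constraint $\|B\hat{x}\| \le \varrho$ is active, then the RTLS-solution $\hat{x} = x_\varepsilon^{\alpha,\beta}$ satisfies the equations

\[ (A^T A + \alpha B^2 + \beta I)\, x_\varepsilon^{\alpha,\beta} = A^T y_\varepsilon \tag{36} \]

and

\[ \|B x_\varepsilon^{\alpha,\beta}\| = \varrho, \tag{37} \]

where the parameters $\alpha$, $\beta$ satisfy

\[ \alpha = \frac{1}{\varrho^2} \left( \beta + \|y_\varepsilon\|^2 - \langle y_\varepsilon, A x_\varepsilon^{\alpha,\beta}\rangle \right), \qquad \beta = -\frac{\|A x_\varepsilon^{\alpha,\beta} - y_\varepsilon\|^2}{1 + \|x_\varepsilon^{\alpha,\beta}\|^2}. \tag{38} \]

Moreover, the RTLS-solution $\hat{x} = x_\varepsilon^{\alpha,\beta}$ is also the solution of the constrained minimization problem

\[ \frac{\|Ax - y_\varepsilon\|^2}{1 + \|x\|^2} \to \min \quad \text{subject to} \quad \|Bx\| \le \varrho. \tag{39} \]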

From Theorem 1 it follows that in the case of an active inequality constraint in (33), the RTLS-solution $\hat{x} = x_\varepsilon^{\alpha,\beta}$ minimizes the Tikhonov multiple regularization functional $\Phi(\alpha, \beta; x)$, since (36) is just the Euler equation for the minimization of (13). In this way multiparameter regularization naturally appears in the RTLS-method, but from (38) it can be seen that in the RTLS-method one of the regularization parameters is negative, which is unusual for Tikhonov regularization schemes, where regularization parameters are assumed to be positive. Nevertheless, the system (38) can be considered as an a posteriori rule for choosing regularization parameters. This rule does not require knowledge of the modeling error $h$ in (31) or of the observation error $\delta$ in (24), but in general the system (38) is hardly tractable. An alternative to the RTLS-method is the method of dual regularized total least squares (DRTLS), which has been recently proposed in Lu et al. (2009). In this method, the levels of the modeling error $h$ and the observation error $\delta$ are assumed to be known. Then in the DRTLS-method, some estimate $(\hat{x}, \hat{y}, \hat{A})$ for $(x^\dagger, y_0, A_0)$ is determined by solving the constrained minimization problem

\[ \|B\hat{x}\| \to \min \quad \text{subject to} \quad \hat{A}\hat{x} = \hat{y}, \quad \|\hat{y} - y_\varepsilon\| \le \delta, \quad \|\hat{A} - A\| \le h, \tag{40} \]

where $B$ is the same as in (33). From Lu et al. (2009) we have the following theorem, which provides us with a characterization of DRTLS-solutions.

Theorem 2. If in the problem (40) the two inequality constraints are active, then the DRTLS-solution $\hat{x} = x_\varepsilon^{\alpha,\beta}$ satisfies (36), where the regularization parameters $\alpha$, $\beta$ solve the following system of nonlinear equations:

\[ \|A x_\varepsilon^{\alpha,\beta} - y_\varepsilon\| = \delta + h \|x_\varepsilon^{\alpha,\beta}\|, \qquad \beta = -\frac{h \left( \delta + h \|x_\varepsilon^{\alpha,\beta}\| \right)}{\|x_\varepsilon^{\alpha,\beta}\|}. \tag{41} \]

It is worth noting that in the special case of no modeling error, $h = 0$, we have $\beta = 0$, and the DRTLS-solution $\hat{x} = x_\varepsilon^{\alpha,0}$ reduces to the Tikhonov regularized approximation $x_\varepsilon^\alpha$ with $\alpha$ chosen by the discrepancy principle such that $\|A x_\varepsilon^\alpha - y_\varepsilon\| = \delta$. In the case $h \neq 0$, we again meet with a multiparameter regularization (36), where one of the regularization parameters is negative. The system (41) gives us a rule for choosing the parameters. In contrast to the RTLS-method, this rule does not require a reliable bound for the solution norm $\|Bx^\dagger\|$ and uses only the bounds for the error levels $\delta$, $h$. Thus, in some sense the methods (36)-(38) and (36), (41) complement each other. Moreover, in terms of the error levels $\delta$, $h$ both these methods guarantee an accuracy of the same order, provided that the standard smoothness conditions (21) and (22) are satisfied. This can be seen from the following theorem (Lu et al. 2009).

Theorem 3. Assume the conditions (21) and (22) hold with $1 \le p \le a + 2$. Let $\hat{x} = x_\varepsilon^{\alpha,\beta}$ be the DRTLS-solution given by (36) and (41). If in (40) the inequality constraints are active, then

\[ \|x^\dagger - \hat{x}\| \le 2 \|B^p x^\dagger\|^{\frac{a}{a+p}} \left( \frac{\delta + h \|x^\dagger\|}{m} \right)^{\frac{p}{p+a}} = O\big( (\delta + h)^{\frac{p}{p+a}} \big). \]

In addition, let $\hat{x} = x_\varepsilon^{\alpha,\beta}$ be the RTLS-solution given by (36)-(38). If the exact solution $x^\dagger$ satisfies the side condition $\|Bx^\dagger\| = \varrho$ then


jjx   xjj O  .2jjB p x  jj/

a aCp

p ! pCa p   p maxf1; g. 2 C 1/ D O .ı C h/ pCa : m

From the discussion in Lu et al. (2009), it follows that under the conditions of Theorem 3 the order of accuracy $O\big( (\delta + h)^{\frac{p}{p+a}} \big)$ cannot be improved by any other regularization method.
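For a well-posed finite-dimensional problem, the unregularized TLS estimate (32) has a classical closed form based on the SVD of the augmented matrix $[A \mid y]$. The sketch below illustrates only this baseline, not the RTLS or DRTLS methods. The function name `tls_solve`, the test matrix, and the noise levels are assumptions made for the example.

```python
import numpy as np

def tls_solve(A, y):
    """Classical TLS estimate from the SVD of the augmented matrix [A | y]."""
    n = A.shape[1]
    _, _, vt = np.linalg.svd(np.column_stack([A, y]))
    v = vt[-1]                       # right singular vector of the smallest singular value
    if abs(v[n]) < 1e-14:
        raise ValueError("the generic TLS solution does not exist")
    return -v[:n] / v[n]

rng = np.random.default_rng(1)
A0 = rng.standard_normal((20, 4))    # hypothetical well-conditioned exact matrix
x_exact = np.array([1.0, -2.0, 0.5, 3.0])
y0 = A0 @ x_exact

# With exact data the augmented matrix is rank deficient and TLS recovers x_exact.
x_from_exact = tls_solve(A0, y0)

# Perturb both the matrix and the right-hand side, as in (31) and (10).
A_noisy = A0 + 1e-3 * rng.standard_normal(A0.shape)
y_noisy = y0 + 1e-3 * rng.standard_normal(y0.shape)
x_tls = tls_solve(A_noisy, y_noisy)
```

For ill-conditioned design matrices the last component of the singular vector can become very small, which is one way to see why the constraint $\|B\hat{x}\| \le \varrho$ in (33) is introduced.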

4 Computational Aspects of Some Multiparameter Regularization Schemes

In this section, we discuss computational aspects of the multiparameter regularization considered above. We start with the scheme (11) and (12). As observed in Xu et al. (2006), in the determination of the geopotential from precise satellite orbits this scheme performs similarly to the iterative version of the single parameter regularization (20) with

\[ B = I_0 = \begin{pmatrix} 0_2 & & & \\ & \ddots & & \\ & & 0_k & \\ & & & I \end{pmatrix}, \tag{42} \]

where $0_2, \ldots, 0_k$ are all zero matrices, corresponding to the harmonic coefficients of degrees 2 to $k$, which are supposed to be sufficiently precise and require no regularization. In Xu et al. (2006) the iterative version of (20) and (42) has been implemented by repeatedly replacing in (10) the true model parameter vector $x^\dagger$ with its estimate from the previous iteration. Such a scheme is sometimes called iterative Tikhonov regularization (see, e.g., Engl et al. 1996). The simulations of Xu et al. (2006) have shown that the iterative version of (20) and (42) is sufficient to derive a geopotential model of almost the best quality. Therefore, it can serve as a benchmark for performance evaluation of other schemes. In our numerical illustrations presented in the next section, we use the iterative version of (20) and (42) with $k = 2$, since in the multiparameter regularizations discussed above the choice of the operator $B = B_{\bar{\varepsilon}}$ (35) has been made under the assumption that the harmonic coefficients $\langle x^\dagger, Y^R_{k,i}\rangle$ of degree $k = 2$ are sufficiently precise. Note that in view of (27) an extension of all constructions for degrees up to $k = 32$ is also possible. Our illustrations are performed with synthetic data simulated from a known exact solution $x^\dagger$. Therefore, in (20) and (42) we are able to choose the best $\alpha$ among a given set of regularization parameters. In this way, we create an ideal benchmark of the performance. We consider three iterations of (20) and (42). An increase of the number of iterations does not essentially change the results. Let us briefly discuss computations with the RTLS-method. It is usually assumed that the radius

$\varrho$ in (33) is less than $\|Bx_{TLS}\|$, where $x_{TLS}$ denotes the solution of the TLS problem (32); otherwise no regularization would be necessary. Then at the minimizer of (33) the inequality constraint holds with equality, and from Theorem 1 it follows that the RTLS-solution belongs to the two-parameter family of elements $x_\varepsilon^{\alpha,\beta}$, which solve (36) and satisfy (37). In view of (39) a straightforward numerical approach to the RTLS-solution is to construct a representative sample of $x_\varepsilon^{\alpha,\beta}$ by solving (36) with

\[ \alpha = \alpha_i \in \{\alpha_1, \alpha_2, \ldots, \alpha_{\max}\}, \qquad \beta = \beta_j \in \{\beta_1, \beta_2, \ldots, \beta_{\max}\} \]

and then to choose the sample element $x_\varepsilon^{\alpha_i,\beta_j}$ that meets (37) and gives the minimal value of the functional $\|A x_\varepsilon^{\alpha,\beta} - y_\varepsilon\|^2 / \big(1 + \|x_\varepsilon^{\alpha,\beta}\|^2\big)$ among the samples. Of course, in general such a straightforward approach is rather expensive. In the literature, the condition (38) has been used to find the RTLS-solution from (36) via solutions of linear or quadratic eigenvalue problems. In both cases one has to solve a sequence of eigenvalue problems, and to make this approach tractable one should reuse as much information as possible from previous problems of the sequence. More details can be found, e.g., in Lampe and Voss (2009). For our numerical illustrations, we implement the RTLS-method in the straightforward way described above, since we are mainly interested in a comparison of results produced by different multiparameter regularization schemes and the RTLS-method is only one of them. At the end of the section, we briefly describe a strategy for finding two regularization parameters in the DRTLS-method (36) and (41). This strategy is based on an extension of the idea of model function approximation originally proposed in Kunisch and Zou (1998) for a realization of the discrepancy principle in the standard single parameter Tikhonov regularization (9) with $B = I$. For the DRTLS-method, we need to derive a model function of two variables. To this end we consider the function $F(\alpha, \beta) = \Phi(\alpha, \beta; x_\varepsilon^{\alpha,\beta})$, where $\Phi$ is defined by (13), and $x_\varepsilon^{\alpha,\beta}$ is the solution of (36). Using the properties of the norm induced by a scalar product, one can easily check that

\[ F(\alpha, \beta) = \|y_\varepsilon\|^2 - \|A x_\varepsilon^{\alpha,\beta}\|^2 - \alpha \|B x_\varepsilon^{\alpha,\beta}\|^2 - \beta \|x_\varepsilon^{\alpha,\beta}\|^2. \tag{43} \]

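Moreover, similarly to Kunisch and Zou (1998) it can also be shown that

\[ \frac{\partial F}{\partial \alpha} = \|B x_\varepsilon^{\alpha,\beta}\|^2, \qquad \frac{\partial F}{\partial \beta} = \|x_\varepsilon^{\alpha,\beta}\|^2. \tag{44} \]

Now the idea is to approximate in (43) the term $\|A x_\varepsilon^{\alpha,\beta}\|^2$ locally at a given point $(\alpha, \beta)$ by a value $T \|x_\varepsilon^{\alpha,\beta}\|^2$, where $T$ is a positive constant to be determined. This approximation together with (43) and (44) gives us an approximate formula

\[ F \approx \|y_\varepsilon\|^2 - \alpha \frac{\partial F}{\partial \alpha} - (\beta + T) \frac{\partial F}{\partial \beta}. \]

By a model function approximation we mean a parameterized family of functions $m(\alpha, \beta)$ for which this formula is exact, i.e., each model function should solve the partial differential equation

\[ m + \alpha \frac{\partial m}{\partial \alpha} + (\beta + T) \frac{\partial m}{\partial \beta} = \|y_\varepsilon\|^2. \]

It is easy to check that a simple parametric family of solutions of this equation is given by

\[ m(\alpha, \beta) = \|y_\varepsilon\|^2 + \frac{C_1}{\alpha} + \frac{C_2}{T + \beta}, \tag{45} \]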


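where $C_1$, $C_2$, $T$ are parameters to be determined. To use $m(\alpha, \beta)$ for approximating $F(\alpha, \beta)$ locally in a neighborhood of a point $\alpha = a$, $\beta = b$, one can determine these parameters by the interpolation conditions

\[ \begin{cases} m(a, b) = F(a, b), \\ \dfrac{\partial m}{\partial \alpha}(a, b) = \dfrac{\partial F}{\partial \alpha}(a, b) = \|B x_\varepsilon^{a,b}\|^2, \\ \dfrac{\partial m}{\partial \beta}(a, b) = \dfrac{\partial F}{\partial \beta}(a, b) = \|x_\varepsilon^{a,b}\|^2. \end{cases} \tag{46} \]

Then the parameters can be derived explicitly as

\[ \begin{cases} C_1 = C_1(a, b) = -\|B x_\varepsilon^{a,b}\|^2 a^2, \\ C_2 = C_2(a, b) = -\left( \|y_\varepsilon\|^2 - F(a, b) - a \|B x_\varepsilon^{a,b}\|^2 \right)^2 / \|x_\varepsilon^{a,b}\|^2, \\ T = T(a, b) = -b + \left( \|y_\varepsilon\|^2 - F(a, b) - a \|B x_\varepsilon^{a,b}\|^2 \right) / \|x_\varepsilon^{a,b}\|^2. \end{cases} \tag{47} \]

Using the model function approximation (45)-(47), we can easily derive an iteration algorithm for constructing DRTLS-solutions $x_\varepsilon^{\alpha,\beta}$ from (36) and (41). Let $x_\varepsilon^{\alpha_k,\beta_k}$ be an approximation to a DRTLS-solution constructed at the $k$-th iteration step. Then using the second equation of (41), we can explicitly update $\beta = \beta_{k+1}$ by a fixed point iteration as

\[ \beta_{k+1} = -h\, \frac{\delta + h \|x_\varepsilon^{\alpha_k,\beta_k}\|}{\|x_\varepsilon^{\alpha_k,\beta_k}\|}. \tag{48} \]

For updating $\alpha$ we use (44) and the representation

\[ \|A x_\varepsilon^{\alpha,\beta} - y_\varepsilon\|^2 = F(\alpha, \beta) - \alpha \frac{\partial F}{\partial \alpha}(\alpha, \beta) - \beta \frac{\partial F}{\partial \beta}(\alpha, \beta) \]

to rewrite the first equation of (41) as

\[ F(\alpha, \beta) - \alpha \frac{\partial F}{\partial \alpha}(\alpha, \beta) - \beta \frac{\partial F}{\partial \beta}(\alpha, \beta) = \left( \delta + h \sqrt{\frac{\partial F}{\partial \beta}(\alpha, \beta)} \right)^2. \tag{49} \]

Now the idea is to approximate $F(\alpha, \beta)$ in the neighborhood of $(\alpha_k, \beta_k)$ by a model function (45) with the parameters $C_1$, $C_2$, $T$ determined by (47) for $a = \alpha_k$, $b = \beta_k$, such that the updated regularization parameter $\alpha = \alpha_{k+1}$ can be easily found as the solution of the corresponding approximate version of (49),

\[ m(\alpha, \beta_{k+1}) - \alpha \frac{\partial m}{\partial \alpha}(\alpha, \beta_{k+1}) - \beta_{k+1} \frac{\partial m}{\partial \beta}(\alpha, \beta_{k+1}) = \left( \delta + h \sqrt{\frac{\partial m}{\partial \beta}(\alpha, \beta_{k+1})} \right)^2, \tag{50} \]

which is in fact a linear equation in $\alpha$. The algorithm is iterated until the relative or absolute change in the iterates is determined to be sufficiently small.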
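The explicit formulas (47) can be checked against the interpolation conditions (46) on a small example. The NumPy sketch below computes $C_1$, $C_2$, $T$ at one point $(a, b)$. The matrices, the data vector, and the chosen point are hypothetical, and the code covers only this single building block, not the full DRTLS iteration (48)-(50).

```python
import numpy as np

def solve_36(A, B, y, alpha, beta):
    """Solve (A^T A + alpha B^2 + beta I) x = A^T y for symmetric B."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * (B @ B) + beta * np.eye(n), A.T @ y)

def model_parameters(A, B, y, a, b):
    """Parameters C1, C2, T of the model function (45) at the point (a, b), cf. (47)."""
    x = solve_36(A, B, y, a, b)
    ny2 = y @ y
    nBx2 = np.linalg.norm(B @ x) ** 2
    nx2 = x @ x
    F = np.linalg.norm(A @ x - y) ** 2 + a * nBx2 + b * nx2
    gap = ny2 - F - a * nBx2          # equals (T + b) * ||x||^2
    C1 = -nBx2 * a ** 2
    C2 = -gap ** 2 / nx2
    T = -b + gap / nx2
    return C1, C2, T, F, nBx2, nx2, x

def model(alpha, beta, ny2, C1, C2, T):
    """Model function (45)."""
    return ny2 + C1 / alpha + C2 / (T + beta)

n = 6
A = np.diag(1.06 ** (-np.arange(n)))          # hypothetical design matrix
B = np.diag((np.arange(n) + 0.5) ** 1.5)      # hypothetical symmetric positive B
y = A @ np.ones(n) + 1e-3 * np.arange(n)      # hypothetical data vector
a, b = 1e-2, -1e-4                            # sample point; b < 0 as in (41)

C1, C2, T, F_ab, nBx2, nx2, x_ab = model_parameters(A, B, y, a, b)
m_ab = model(a, b, y @ y, C1, C2, T)
```

At the interpolation point the value `T` coincides with $\|Ax_\varepsilon^{a,b}\|^2 / \|x_\varepsilon^{a,b}\|^2$, which is the local approximation of $\|Ax\|^2$ by $T\|x\|^2$ described before (45).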


In Lu et al. (2008), it has been demonstrated by numerical experiments that the algorithm of the model function approximation (45), (47), (48), and (50) converges rapidly, thereby making the problem of computing DRTLS-solutions computationally tractable. To sum up the discussion, each of the considered multiparameter regularization methods can in principle be equipped with a numerical realization scheme, and it seems that for the DRTLS-method such a scheme is algorithmically the simplest one. In the next section, we present numerical simulations to compare the performance of the discussed methods.

5 Numerical Illustrations Similar to Xu et al. (2006), in our numerical experiments we do not work with real data but with artificially generated data, in order to compare the performance of the considered methods statistically. As in Bauer et al. (2007), we assume that gravity data are given at an orbit height of about 400 km and that a simulated gravitational field is to be reconstructed at the Earth's surface. This situation can be modeled by means of the downward continuation operator (4), where

$$a_j = (1.06)^{-k}, \quad j = i + k^2, \quad i = 1, 2, \ldots, 2k+1, \quad k = 0, 1, \ldots.$$

For the sake of simplicity, in (4) we keep only spherical harmonics up to degree $k = 10$, such that the exact design operator $A_0$ is given in the form of the $121 \times 121$ matrix

$$A_0 = \mathrm{diag}\left(1, (1.06)^{-1}, (1.06)^{-1}, (1.06)^{-1}, \ldots, (1.06)^{-10}\right).$$

Then we simulate a modeling error as in (31), where $E$ is given as $E = \|U\|_F^{-1} U$, $\|\cdot\|_F$ is the Frobenius norm, and $U$ is a $121 \times 121$ matrix with random elements that are uniformly distributed on [0, 1]. The exact solutions $x^\dagger$ are also simulated randomly as 121-dimensional vectors

$$x^\dagger = \left(x^\dagger_{0,1}, x^\dagger_{1,1}, x^\dagger_{1,2}, x^\dagger_{1,3}, x^\dagger_{2,1}, \ldots, x^\dagger_{k,i}, \ldots\right), \quad i = 1, 2, \ldots, 2k+1, \quad k = 2, 3, \ldots, 10,$$


where the $x_{k,i}$ are uniformly distributed on [0, 1] and are scaled, by means of a weighted sum over $k = 2, \ldots, 10$ and $i = 1, \ldots, 2k+1$ with weights depending on $k + \tfrac{1}{2}$, with the constant $3.033 \times 10^{-7}$, which is to say that the components of $x^\dagger$ can be seen as spherical harmonics coefficients of a function satisfying (34). Then noisy observations $y_\varepsilon$ are simulated as

$$y_\varepsilon = A_0 x^\dagger + \delta \|e\|^{-1} e,$$

where $e$ is a 121-dimensional random vector with components uniformly distributed on [0, 1], and $\delta$ is chosen as $\delta = 0.01 \|A_0 x^\dagger\|$, which corresponds to a noise level of 1 %.
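
The simulation setup described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the negative exponent in the diagonal entries is an assumption (a smoothing operator whose singular values decay with the degree), and the "exact" solution is an unscaled stand-in that omits the chapter's normalization.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(t * t for t in v))


def design_diagonal(kmax=10, ratio=1.06):
    """Diagonal of A0: ratio**(-k) repeated 2k+1 times for k = 0..kmax.

    The negative exponent is an assumption of this sketch.
    """
    return [ratio ** (-k) for k in range(kmax + 1) for _ in range(2 * k + 1)]


def modeling_error(n, rng):
    """Random n-by-n matrix with entries from [0, 1], scaled to unit Frobenius norm."""
    u = [[rng.random() for _ in range(n)] for _ in range(n)]
    fro = math.sqrt(sum(t * t for row in u for t in row))
    return [[t / fro for t in row] for row in u]


def noisy_data(diag, x, rng, noise_level=0.01):
    """Return (A0 x, A0 x + delta * e / ||e||) with delta = noise_level * ||A0 x||."""
    y_exact = [a * xi for a, xi in zip(diag, x)]
    e = [rng.random() for _ in range(len(diag))]
    scale = noise_level * norm(y_exact) / norm(e)
    return y_exact, [yi + scale * ei for yi, ei in zip(y_exact, e)]


rng = random.Random(0)
diag = design_diagonal()
x_true = [rng.random() for _ in diag]  # hypothetical stand-in, without the chapter's scaling
y_exact, y_noisy = noisy_data(diag, x_true, rng)
```

Because $A_0$ is diagonal, it is stored as a list of its 121 diagonal entries; the index rule $j = i + k^2$ simply means that degree $k$ occupies $2k+1$ consecutive positions.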


Fig. 1 Comparison of the performances of considered multiparameter regularizations in the case of no modeling error (panel: $\delta = 1\,\%$, $h = 0$; rows: RTLS, DRTLS, IT with $B = I_0$; horizontal axis from 0 to $8 \times 10^{-3}$)

Fig. 2 Comparison of the performances of considered multiparameter regularizations in the presence of modeling error, $h = \delta$ (panel: $\delta = 1\,\%$, $h = \delta$; rows: RTLS, DRTLS, IT with $B = I_0$; horizontal axis from 0 to $6 \times 10^{-3}$)

For each of the above-mentioned problem instances, 50 independent runs of the random number generator were performed, and the methods discussed in Sect. 4 were applied in parallel to the 50 resulting noisy matrix equations. The results are displayed in Figs. 1 and 2, where each circle represents the relative error in solving one of the 50 simulated problems. The circles on the horizontal lines labeled DRTLS, RTLS, and IT with $B = I_0$ correspond to the errors of the DRTLS- and RTLS-methods and to the errors of the iterative version of (20) and (42), respectively. Recall that the latter method is used here as a surrogate for the multiple parameter regularization (11) and (12). Moreover, the results obtained by this method correspond to the ideal choice of the regularization parameter based on complete knowledge of the exact solution $x^\dagger$. Therefore, these results can be considered as a benchmark for assessing the performance of the other methods. Figure 1 corresponds to the case of no modeling error, $h = 0$, while Fig. 2 displays the results for a modeling error simulated with $h = \delta$. In both cases, the RTLS-method was implemented in the straightforward way described in Sect. 4, where the parameters $\alpha_i$, $\beta_j$ are equidistant such that $\alpha_1 = 0.1$, $\alpha_{\max} = \alpha_{20} = 10^{4}$, $\beta_1 = 1$, $\beta_{\max} = \beta_{20} = 10^{3}$. To implement the DRTLS-method we used the algorithm of the model function approximation (45), (47), (48), and (50), and its convergence was observed after 4–5 iteration steps. In the DRTLS-method, as well as in the RTLS-method, we used $B = B_\varepsilon$ defined by (35), and it turns out that the performance of the considered methods is not very sensitive to the choice of $\varepsilon$. In our experiments we tried $\varepsilon = 10^{-1}$ and $\varepsilon = 10^{-10}$, which did not essentially change the situation.
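
The idea of a benchmark obtained from the "ideal" parameter choice can be illustrated with a small sketch. It assumes ordinary one-parameter Tikhonov regularization with $B = I$ for a diagonal operator, which is a simplification and not the iterated version of (20) and (42) used in the experiments; the function names and the parameter grid are hypothetical.

```python
import math


def tikhonov_diag(diag, y, alpha):
    """Minimizer of ||A x - y||^2 + alpha * ||x||^2 for a diagonal operator A."""
    return [a * yi / (a * a + alpha) for a, yi in zip(diag, y)]


def relative_error(x, x_true):
    """Relative Euclidean error ||x - x_true|| / ||x_true||."""
    num = math.sqrt(sum((u - v) ** 2 for u, v in zip(x, x_true)))
    den = math.sqrt(sum(v * v for v in x_true))
    return num / den


def oracle_alpha(diag, y, x_true, alphas):
    """Pick the grid value with the smallest error, using the exact solution."""
    return min(alphas, key=lambda a: relative_error(tikhonov_diag(diag, y, a), x_true))


# Tiny noise-free example: the best parameter on the grid is then alpha = 0.
diag_demo = [1.0, 0.5]
x_demo = [1.0, 2.0]
y_demo = [a * x for a, x in zip(diag_demo, x_demo)]
best = oracle_alpha(diag_demo, y_demo, x_demo, [0.0, 0.1, 1.0])
```

Such an oracle choice is not available in practice, since it requires the exact solution; it only marks the accuracy that a parameter choice rule could reach at best for the given method.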


Table 1 Statistical performance measures for different methods in the case of 1 % of data noise

Method                           Mean relative error   Median relative error   Standard deviation of relative error   Mean regularization parameter(s)
DP with B = I                    0.0174                0.0174                  0.0006                                 0.0093
DRTLS                            0.0027                0.0026                  0.0015                                 18.1807
RTLS                             0.0022                0.0022                  0.0006                                 1, 0.001
Iterated Tikhonov with B = I_0   0.002                 0.002                   0.0007                                 190.62

Table 2 Statistical performance measures for different methods in the presence of modeling error h = δ

Method                           Mean relative error   Median relative error   Standard deviation of relative error   Mean regularization parameter(s)
DP with B = I                    0.0176                0.0175                  0.0006                                 0.0092
DRTLS                            0.0019                0.0019                  0.0007                                 0.0942, 0.002
RTLS                             0.0017                0.0016                  0.0006                                 0.1, 0.001
Iterated Tikhonov with B = I_0   0.0026                0.0026                  0.0005                                 10.2

For comparison we also present Tables 1 and 2, where mean values, median values, and standard deviations of the relative error are given for all discussed methods. Moreover, in these tables one can find statistical performance measures for the most widely used Tikhonov-Phillips regularization scheme corresponding to (20) with $B = I$. The poor performance of this scheme can be seen as a sign that the downward continuation problem should be treated with care. At the same time, multiparameter regularization methods such as the RTLS- and DRTLS-methods perform at the benchmark level, which suggests employing multiparameter regularization schemes in satellite data processing.

6 Conclusion The downward continuation of spaceborne gravity data is a challenging task due to the ill-posedness of the problem. This ill-posedness is inherited by other satellite geodetic problems such as satellite gravity gradiometry and satellite-to-satellite tracking. All these problems can be formally classified as exponentially or severely ill-posed. At the same time, under some circumstances and within a desired accuracy range, one can effectively approximate the problems of satellite geodesy by moderately or polynomially ill-posed problems, as has been shown in Sect. 2. Of course, in this way an additional modeling error is introduced into the data processing, and it should be taken into account together with the unavoidable observation error. In the mathematical formulation this leads to operator equations with noisy problem instances. The RTLS method is the well-accepted remedy for dealing with such equations. Recently the DRTLS method has also been proposed for solving noisy operator equations. Both methods can be viewed as multiparameter regularization methods in which one of the regularization parameters is negative. Note that in the geodetic context the concept of multiparameter regularization was introduced for the first time in Xu and Rummel (1992), but usually all regularization parameters are supposed to be positive.


In this chapter, we have analyzed the above-mentioned multiparameter schemes. Our analysis and numerical illustrations show that multiparameter regularization can indeed produce good results from simulated data. We have restricted our discussion to the case of two regularization parameters, since the RTLS- and DRTLS-methods are, in fact, two-parameter regularization schemes. At the same time, a regularization with more than two parameters seems to be an effective tool. A complete analysis of such regularization is an important direction for future research, since the downward continuation of spaceborne data requires the use of different computational methods as well as different techniques that take into account the nature of this problem. Acknowledgments The authors are supported by the Austrian Fonds Zur Förderung der Wissenschaftlichen Forschung (FWF), Grant P20235-N18.

References
Bauer F, Pereverzev SV (2006) An utilization of a rough approximation of a noise covariance within the framework of multi-parameter regularization. Int J Tomogr Stat 4:1–12
Bauer F, Mathé P, Pereverzev SV (2007) Local solutions to inverse problems in geodesy: the impact of the noise covariance structure upon the accuracy of estimation. J Geod 81:39–51
Beck A, Ben-Tal A, Teboulle M (2006) Finding a global optimal solution for a quadratically constrained fractional quadratic problem with applications to the regularized total least squares. SIAM J Matrix Anal Appl 28:425–445
Boumann J (2000) Quality assessment of satellite-based global gravity field models. PhD dissertation, Delft University of Technology
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer, Dordrecht
Freeden W (1999) Multiscale modeling of spaceborne geodata. B.G. Teubner, Leipzig
Freeden W, Pereverzev SV (2001) Spherical Tikhonov regularization wavelets in satellite gravity gradiometry with random noise. J Geod 74:730–736
Freeden W, Schneider F, Schreiner M (1997) Gradiometry – an inverse problem in modern satellite geodesy. In: Engl HW, Louis AK, Rundell W (eds) GAMM-SIAM symposium on inverse problems in geophysical applications. Fish Lake, Yosemite, pp 179–239
Golub GH, Hansen PC, O’Leary DP (1999) Tikhonov regularization and total least squares. SIAM J Matrix Anal Appl 21:185–194
Kellogg OD (1967) Foundations of potential theory. Springer, Berlin
Klees R, Ditmar P, Broersen P (2003) How to handle colored observation noise in large least-squares problems. J Geod 76:629–640
Kunisch K, Zou J (1998) Iterative choices of regularization parameters in linear inverse problems. Inverse Probl 14:1247–1264
Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76:359–368
Lampe J, Voss H (2009) Efficient determination of the hyperparameter in regularized total least squares problems. Available online https://www.mat.tu-harburg.de/ins/forschung/rep/rep133.pdf
Louis AK (1989) Inverse und schlecht gestellte problems. Teubner, Stuttgart
Lu S, Pereverzev SV, Tautenhahn U (2008) Dual regularized total least squares and multi-parameter regularization. Comput Methods Appl Math 8:253–262
Lu S, Pereverzev SV, Tautenhahn U (2009) Regularized total least squares: computational aspects and error bounds. SIAM J Matrix Anal Appl 31:918–941
Nair MT, Pereverzev SV, Tautenhahn U (2005) Regularization in Hilbert scales under general smoothing conditions. Inverse Probl 2:1851–1869
Pereverzev SV, Schock E (1999) Error estimates for band-limited spherical regularization wavelets in an inverse problem of satellite geodesy. Inverse Probl 15:881–890
Rebhan H, Aguirre M, Johannessen J (2000) The gravity field and steady-state ocean circulation explorer mission-GOCE. ESA Earth Obs Q 66:6–11
Rummel R, van Gelderen, Koop R, Schrama E, Sanso F, Brovelli M, Miggliaccio F, Sacerdote F (1993) Spherical harmonic analysis of satellite gradiometry. Publ Geodesy, New Series, 39. Netherlands Geodetic Commission, Delft
Svensson SL (1983) Pseudodifferential operators – a new approach to the boundary value problems of physical geodesy. Manuscr Geod 8:1–40
van Huffel S, Vanderwalle J (1991) The total least squares problem: computational aspects and analysis. SIAM, Philadelphia
Xu PL (1992) Determination of surface gravity anomalies using gradiometric observables. Geophys J Int 110:321–332
Xu PL, Rummel R (1992) A generalized regularization method with application in determination of potential fields. In: Holota P, Vermeer M (eds) Proceedings of 1st continental workshop on the geoid in Europe, Prague, pp 444–457
Xu PL, Rummel R (1994) A generalized ridge regression method with application in determination of potential fields. Manuscr Geod 20:8–20
Xu PL, Fukuda Y, Liu YM (2006) Multiple parameter regularization: numerical solutions and applications to the determination of geopotential from precise satellite orbits. J Geod 80:17–27


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-2 © Springer-Verlag London 2014

Correlation Modeling of the Gravity Field in Classical Geodesy Christopher Jekeli Division of Geodetic Science, School of Earth Sciences, Ohio State University, Columbus, OH, USA

Abstract The spatial correlation of the Earth’s gravity field is well known and widely utilized in applications of geophysics and physical geodesy. This paper develops the mathematical theory of correlation functions, as well as covariance functions under a statistical interpretation of the field, for functions and processes on the sphere and plane, with formulation of the corresponding power spectral densities in the respective frequency domains and with extensions into the third dimension for harmonic functions. The theory is applied, in particular, to the disturbing gravity potential with consistent relationships of the covariance and power spectral density to any of its spatial derivatives. An analytic model for the covariance function of the disturbing potential is developed for both spherical and planar application, which has analytic forms also for all derivatives in both the spatial and the frequency domains (including the along-track frequency domain). Finally, a method is demonstrated to determine the parameters of this model from empirical regional power spectral densities of the gravity anomaly.

1 Introduction The Earth’s gravitational field plays major roles in geodesy, geophysics, and geodynamics and is also a significant factor in specific applications such as precision navigation and satellite orbit analysis. With the advance of instrumentation technology over the last several decades, we now have gravitational models of high spatial resolution over most of the land areas, thanks to extensive ground and expanding airborne survey campaigns and over the oceans owing to satellite radar altimetry, which measures essentially a level surface. Recent satellite gravity missions (e.g., the Gravity Field and Steady-State Ocean Circulation Explorer (GOCE), Rummel et al. 2011) also have vastly improved the longer-wavelength parts of the model with globally distributed in situ measurements. Despite these improvements, there remain deficiencies in resolution, including a lack of uniformity and accuracy in some land areas, such as Antarctica and significant parts of Africa, South America, and Asia (Pavlis et al. 2012a). These gaps will be filled with continued measurement, mostly using airborne systems for efficient accessibility to remote regions. Determining the required resolution and analyzing the effect or significance of the gravitational field at various scales for particular applications often rely on some a priori knowledge of the field. Also, the interpolation and extrapolation of the field from given discrete data and the prediction or estimation of field quantities other than those directly measured requires a weighting function based on the essential spatial correlative characteristics of the gravitational field. For these reasons, 


the study and development of correlation or covariance functions of the field have occupied geodesists and geophysicists in tandem with the advancements of measurement and instrument technology. The rather slow attenuation of the field as a function of resolution gives it at regional scales a kind of random character, much like the Earth’s topography. Indeed, the shorter spatial wavelengths of the gravitational field are in many cases highly correlated with the topography; and, profiles of topography, like coastlines, are known to be fractals, which arise from certain random fluctuations, analogous to Brownian motion (Mandelbrot 1983). Thus, we may argue that the Earth’s gravitational field at fine scales also exhibits a stochastic nature (Jekeli 1991). This randomness in the field has been argued and counterargued for decades, but it does form the basis for one of the more successful estimation methods in physical geodesy, called least-squares collocation (Moritz 1980). In addition, the correlative description of the field is advantageous in more general error analyses of the problem of field modeling; and, it is particularly useful in generating synthetic fields for deterministic simulations of the field for Monte Carlo types of analyses. The stochastic nature of the gravitational field, besides assumed primarily for the shorter wavelengths, is also limited to the horizontal dimensions. The variation in the vertical (above the Earth’s surface) is constrained deterministically by the attenuation of the gravitational potential with distance from its source, as governed by the solution to Laplace’s differential equation in free space. However, this constraint also extends the stochastic interpretation in estimation theory, since it analytically establishes mutually consistent correlations for vertical derivatives of the potential, or between its horizontal and vertical derivatives, or between the potential (and any of its derivatives) at different vertical levels. 
Thus, with the help of the corresponding covariance functions, one is able to estimate, for example, the geoid undulation from gravity anomaly data in a purely operational approach using no other physical models, which is the essence of the method of least-squares collocation. It is necessary to distinguish and relate correlation and covariance functions as used in this text. The covariance function refers to random or stochastic processes and is the statistical expectation of the product of the centralized process at two points of the process (i.e., of two random variables with their means removed). The correlation function has more than one definition. As a natural extension of the Pearson correlation coefficient, it is the covariance function normalized by the square roots of the variances of the process at the two points (Priestley 1981). An alternative definition is the statistical expectation of the non-centralized product of the process at two points (Maybeck 1979). A third definition characterizes the correlation of deterministic (nonrandom) functions on the basis of averages of products over the domain of the function (de Coulon 1986). Ultimately, the covariance function and the correlation function, in its various incarnations, are related, but there is an advantage in distinguishing between the stochastic and the non-stochastic versions. Minimum error variance estimation requires a stochastic interpretation, and the gravitational field is characterized stochastically in terms of covariance functions. If interpolation or filtering or simulation through arbitrary synthesis is the principal application, then it may be sufficient to dispense with the stochastic interpretation. If the stochastic process is ergodic, then the average-based correlation function of its realization is the same as its covariance function if the means are known and removed.
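
The different definitions can be compared on sample data. The sketch below is illustrative only: the function names are hypothetical, expectations are replaced by sample averages, and the domain-average version assumes a periodic, equally sampled one-dimensional sequence.

```python
import math


def mean(v):
    return sum(v) / len(v)


def covariance(u, v):
    """Sample covariance: average product of the two variables with means removed."""
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])


def pearson_correlation(u, v):
    """Covariance normalized by the square roots of the two variances."""
    return covariance(u, v) / math.sqrt(covariance(u, u) * covariance(v, v))


def noncentral_correlation(u, v):
    """Average of the non-centralized product."""
    return mean([a * b for a, b in zip(u, v)])


def domain_average_correlation(g, lag):
    """Average of g[i] * g[i + lag] over one period of a sampled periodic function."""
    n = len(g)
    return mean([g[i] * g[(i + lag) % n] for i in range(n)])


u = [1.0, 2.0, 3.0, 4.0]
v = [2.0, 4.0, 6.0, 8.0]
cov_uv = covariance(u, v)
corr_uv = pearson_correlation(u, v)
raw_uv = noncentral_correlation(u, v)
```

The non-centralized product differs from the covariance by the product of the means, which is why the two coincide once the means are known and removed.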
Thus, one may start with the formulation of the physical correlation of the gravitational field without the stochastic underpinning and introduce the stochastic interpretation as needed. Since one of the main applications is the popular least-squares collocation in physical geodesy,


the terminology of covariance functions dominates the later chapters. Whether from the more general or the stochastic viewpoint, the correlative methods can be extended to other fields on the Earth’s surface and to fields that are harmonic in free space. For example, the anomalous magnetic potential (due to the magnetization of the crust material induced by the main, outer-core-generated field of the Earth) also satisfies Laplace’s differential equation. Thus, it shares basic similarities to the anomalous gravitational field. Under certain, albeit rather restrictive assumptions, one field may even be represented in terms of the other (Poisson’s relationship; Baranov 1957). Although this relationship has not been studied in detail from the stochastic or more general correlative viewpoint, it does open numerous possibilities in estimation and error analysis. Finally, it is noted that spatial data analyses in geophysics, specifically the optimal prediction and interpolation of geophysical signals, known as kriging (Olea 1999), rely as does collocation in geodesy on a correlative interpretation of the signals. Semi-variograms, instead of correlation functions, are used in kriging, but they are closely related. Therefore, a study of modeling one (correlations or covariances, in the present case) immediately carries over to the other. The following chapters review correlation functions on the sphere and plane, as well as the transforms into their respective spatial frequency domains. For the stochastic understanding of the geopotential field, the covariance function is introduced, under the assumption of ergodicity (hence, stationarity). Again, the frequency domain formulation, that is, the power spectral density of the field, is of particular importance. The method of covariance propagation, which is indispensable in such estimation techniques as least-squares collocation, naturally motivates the analytic modeling of covariance functions. 
Models have occupied physical geodesists since the utility of least-squares collocation first became evident, and myriad types of models and approaches exist. In this paper, a single yet comprehensive, adaptable, and flexible model is developed that offers consistency among all derivatives of the potential, whether in spherical or planar coordinates, and in the space or frequency domains. Methods to derive appropriate parameters for this model conclude the essential discussions of this paper.

2 Correlation Functions We start with functions on the sphere and develop the concept of the correlation function without the need for a stochastic foundation. The statistical interpretation may be imposed at a later time when it is convenient or necessary to do so. As it happens, the infinite plane as functional domain offers more than one option for developing correlation functions, depending on the class of functions, and, therefore, will be treated after considering the unit sphere, $\Omega$. Other types of surfaces that approximate the Earth’s surface more accurately (ellipsoid, geoid, topographic surface) could also be contemplated. However, the extension of the correlation function into space according to potential theory and the development of a useful duality in the spatial frequency domain then become more problematic, if not impossible. In essence, we require surfaces on which functions have a spectral decomposition and such that convolutions transform into the frequency domain as products of spectra. The latter requirement is tied to the analogy between convolutions and correlations. Furthermore, the surface should be sufficiently simple as a boundary in the solution to Laplace’s equation for the gravitational potential. To satisfy all these requirements and with a view toward practical applications, the present discussion is restricted to the plane and the sphere. Although data on the surface are always discrete, we do not consider discrete functions. Rather, it is always assumed that the data are samples of a continuous function. Then, the correlation


functions to be defined are also continuous, and correlations among the data are interpreted as samples of the correlation function.

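2.1 Functions on the Sphere Let $g$ and $h$ be continuous, square-integrable functions on the unit sphere, $\Omega$, i.e.,

$$\frac{1}{4\pi}\iint_\Omega g^2 \, d\Omega < \infty, \qquad \frac{1}{4\pi}\iint_\Omega h^2 \, d\Omega < \infty, \tag{1}$$

and suppose they depend on the spherical polar coordinates, $\{(\theta, \lambda) \mid 0 \le \theta \le \pi,\ 0 \le \lambda < 2\pi\}$. Each function may be represented in terms of its Legendre transform as an infinite series of spherical harmonics,

$$g(\theta, \lambda) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} G_{n,m} \bar{Y}_{n,m}(\theta, \lambda), \tag{2}$$

where the Legendre transform, or the Legendre spectrum of $g$, is

$$G_{n,m} = \frac{1}{4\pi}\iint_\Omega g(\theta, \lambda)\, \bar{Y}_{n,m}(\theta, \lambda)\, d\Omega, \tag{3}$$

and where the functions, $\bar{Y}_{n,m}(\theta, \lambda)$, are surface spherical harmonics defined by

$$\bar{Y}_{n,m}(\theta, \lambda) = \bar{P}_{n,|m|}(\cos\theta) \begin{cases} \cos m\lambda, & m \ge 0 \\ \sin |m|\lambda, & m < 0 \end{cases} \tag{4}$$

The functions, $\bar{P}_{n,m}$, are associated Legendre functions of the first kind, fully normalized so that

$$\frac{1}{4\pi}\iint_\Omega \bar{Y}_{n',m'}(\theta, \lambda)\, \bar{Y}_{n,m}(\theta, \lambda)\, d\Omega = \begin{cases} 1, & n' = n \text{ and } m' = m \\ 0, & n' \ne n \text{ or } m' \ne m \end{cases} \tag{5}$$

A similar relationship exists between $h$ and its Legendre transform, $H_{n,m}$. The degree and order, $(n, m)$, are wave numbers belonging to the frequency domain. The unit sphere is used here only for convenience, and any sphere (radius, $R$) may be used. The Legendre spectrum then refers to this sphere. We define the correlation function of $g$ and $h$ as

$$\phi_{gh}(\psi, \alpha) = \frac{1}{4\pi}\iint_\Omega g(\theta, \lambda)\, h(\theta', \lambda') \sin\theta \, d\theta \, d\lambda, \tag{6}$$

where the points $(\theta, \lambda)$ and $(\theta', \lambda')$ are related by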

$$\cos\psi = \cos\theta \cos\theta' + \sin\theta \sin\theta' \cos(\lambda - \lambda'), \tag{7}$$

$$\tan\alpha = \frac{-\sin\theta' \sin(\lambda - \lambda')}{\sin\theta \cos\theta' - \cos\theta \sin\theta' \cos(\lambda - \lambda')}, \tag{8}$$

and where the integration is performed over all pairs of points, $(\theta, \lambda)$ and $(\theta', \lambda')$, separated by the fixed spherical distance, $\psi$, and oriented by the fixed azimuth, $\alpha$. If the spherical harmonic series, Eq. (2), for $g$ and $h$ are substituted into Eq. (6), we find that, due to the special geometry of the sphere, no simple analytic expression results unless we further average over all azimuths, $\alpha$, thus imposing isotropy on the correlation function. Therefore, we redefine the correlation function of $g$ and $h$ (on the sphere) as follows:

$$\phi_{gh}(\psi) = \frac{1}{8\pi^2}\int_0^{2\pi}\iint_\Omega g(\theta, \lambda)\, h(\theta', \lambda') \sin\theta \, d\theta \, d\lambda \, d\alpha. \tag{9}$$
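
The geometric relations of Eqs. (7) and (8) between a pair of points can be evaluated as in the following sketch. The function name is hypothetical, angles are in radians, and the use of `atan2` to resolve the quadrant of the azimuth is a choice of this sketch (the text gives only the tangent).

```python
import math


def spherical_distance_azimuth(theta, lam, theta_p, lam_p):
    """Spherical distance psi and azimuth alpha from (theta, lam) to (theta_p, lam_p).

    theta and theta_p are colatitudes, lam and lam_p are longitudes.
    """
    dlam = lam - lam_p
    cos_psi = (math.cos(theta) * math.cos(theta_p)
               + math.sin(theta) * math.sin(theta_p) * math.cos(dlam))
    # Guard against rounding slightly outside [-1, 1] before taking acos.
    psi = math.acos(max(-1.0, min(1.0, cos_psi)))
    numerator = -math.sin(theta_p) * math.sin(dlam)
    denominator = (math.sin(theta) * math.cos(theta_p)
                   - math.cos(theta) * math.sin(theta_p) * math.cos(dlam))
    alpha = math.atan2(numerator, denominator)
    return psi, alpha


# Two points on the equator, the second a quarter circle to the east.
psi_east, alpha_east = spherical_distance_azimuth(math.pi / 2, 0.0, math.pi / 2, math.pi / 2)
# Second point on the same meridian, 45 degrees closer to the north pole.
psi_north, alpha_north = spherical_distance_azimuth(math.pi / 2, 0.0, math.pi / 4, 0.0)
```

With this sign convention the azimuth is counted from north (zero) through east (a quarter turn), which is consistent with the minus sign in the numerator of Eq. (8).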

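More precisely, this is the cross-correlation function of $g$ and $h$. The autocorrelation function of $g$ is simply $\phi_{gg}(\psi)$. The prefixes, cross- and auto-, are used mostly to emphasize a particular application and may be dropped when no confusion arises. Because of its sole dependence on $\psi$, $\phi_{gh}$ can be expressed as an infinite series of Legendre polynomials:

$$\phi_{gh}(\psi) = \sum_{n=0}^{\infty} (2n+1) \left(\Phi_{gh}\right)_n P_n(\cos\psi), \tag{10}$$

where the coefficients, $\left(\Phi_{gh}\right)_n$, constitute the Legendre transform of $\phi_{gh}$:

$$\left(\Phi_{gh}\right)_n = \frac{1}{2}\int_0^{\pi} \phi_{gh}(\psi)\, P_n(\cos\psi) \sin\psi \, d\psi. \tag{11}$$

Substituting the decomposition formula for the Legendre polynomial,

$$P_n(\cos\psi) = \frac{1}{2n+1}\sum_{m=-n}^{n} \bar{Y}_{n,m}(\theta, \lambda)\, \bar{Y}_{n,m}(\theta', \lambda'), \tag{12}$$

and Eq. (9) into Eq. (11) and then simplifying using the orthogonality, Eq. (5), and the definition of the Legendre spectrum, Eq. (3), we find:

$$\begin{aligned} \left(\Phi_{gh}\right)_n &= \frac{1}{2n+1}\sum_{m=-n}^{n} \frac{1}{4\pi}\iint_\Omega g(\theta, \lambda)\, \bar{Y}_{n,m}(\theta, \lambda) \left( \frac{1}{4\pi}\iint_\Omega h(\theta', \lambda')\, \bar{Y}_{n,m}(\theta', \lambda') \sin\psi \, d\psi \, d\alpha \right) \sin\theta \, d\theta \, d\lambda \\ &= \frac{1}{2n+1}\sum_{m=-n}^{n} G_{n,m} H_{n,m}, \end{aligned} \tag{13}$$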

where $(\theta, \lambda)$ is constant in the inner integral. The quantities, $\left(\Phi_{gh}\right)_n$, constituting the Legendre transform of the correlation function, may be called the (cross-) power spectral density (PSD) of $g$ and $h$. They are determined directly from the Legendre spectra of $g$ and $h$. The (auto-) PSD of $g$ is simply

$$\left(\Phi_{gg}\right)_n = \frac{1}{2n+1}\sum_{m=-n}^{n} G_{n,m}^2. \tag{14}$$

The terminology that refers the correlation function to “power” is appropriate since it is an integral divided by the solid angle of the sphere. For functions on the plane, we make a distinction between energy and power, depending on the class of functions.
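
As a small numerical companion to Eqs. (10), (13), and (14), the sketch below computes the PSD from a given (truncated) Legendre spectrum and sums the Legendre series for the isotropic correlation function. The function names and the sample coefficients are hypothetical; the spectrum is stored as a list whose entry $n$ holds the $2n+1$ coefficients of degree $n$.

```python
import math


def legendre_polynomials(nmax, t):
    """Return [P_0(t), ..., P_nmax(t)] using the three-term recurrence."""
    p = [1.0, t]
    for n in range(1, nmax):
        p.append(((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1))
    return p[: nmax + 1]


def cross_psd(G, H):
    """Eq. (13): (Phi_gh)_n = sum_m G[n][m] * H[n][m] / (2n + 1)."""
    return [sum(a * b for a, b in zip(Gn, Hn)) / (2 * n + 1)
            for n, (Gn, Hn) in enumerate(zip(G, H))]


def correlation(psd, psi):
    """Eq. (10), truncated at the highest degree contained in psd."""
    p = legendre_polynomials(len(psd) - 1, math.cos(psi))
    return sum((2 * n + 1) * c * pn for n, (c, pn) in enumerate(zip(psd, p)))


# Hypothetical spectrum up to degree 2 (orders m = -n..n for each degree n).
G = [[1.0], [0.5, -0.2, 0.1], [0.3, 0.0, 0.0, 0.1, -0.4]]
auto_psd = cross_psd(G, G)            # Eq. (14)
variance = correlation(auto_psd, 0.0)  # value at zero distance
```

Since $P_n(1) = 1$, the autocorrelation at zero distance equals the sum of all squared coefficients, which gives a quick consistency check of any implementation.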

2.2 Functions on the Plane On the infinite plane with Cartesian coordinates, $\{(x_1, x_2) \mid -\infty < x_1 < \infty,\ -\infty < x_2 < \infty\}$, we consider several possibilities for the functions. The situation is straightforward if the functions are periodic and square integrable over the domain of a period or are square integrable over the plane. Anticipating no confusion, these functions again are denoted, $g$ and $h$. For the periodic case, with periods, $Q_1$ and $Q_2$, in the respective coordinates,

$$\frac{1}{Q_1 Q_2}\int_0^{Q_1}\int_0^{Q_2} g^2 \, dx_1 \, dx_2 < \infty, \qquad \frac{1}{Q_1 Q_2}\int_0^{Q_1}\int_0^{Q_2} h^2 \, dx_1 \, dx_2 < \infty; \tag{15}$$

and each function may be represented in terms of its Fourier transform as an infinite series of sines and cosines, conveniently combined using the complex exponential:

$$g(x_1, x_2) = \frac{1}{Q_1 Q_2}\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty} G_{k_1,k_2}\, e^{i2\pi\left(\frac{k_1 x_1}{Q_1} + \frac{k_2 x_2}{Q_2}\right)}, \tag{16}$$

where the Fourier transform, or the Fourier spectrum of $g$, is given by

$$G_{k_1,k_2} = \int_0^{Q_1}\int_0^{Q_2} g(x_1, x_2)\, e^{-i2\pi\left(\frac{k_1 x_1}{Q_1} + \frac{k_2 x_2}{Q_2}\right)} dx_1 \, dx_2, \tag{17}$$

and a similar relationship exists between $h$ and its transform, $H_{k_1,k_2}$. Again, the integers, $k_1$, $k_2$, are the wave numbers in the frequency domain. Assuming both functions have the same periods, the correlation function of $g$ and $h$ is defined by

$$\phi_{gh}(s_1, s_2) = \frac{1}{Q_1 Q_2}\int_{-Q_1/2}^{Q_1/2}\int_{-Q_2/2}^{Q_2/2} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1' \, dx_2', \tag{18}$$


where $g^*$ is the complex conjugate of $g$ (we deal only with real functions but need this formal definition). The independent variables are the differences between points of evaluation of $h$ at $(x_1, x_2)$ and $g^*$ at $(x_1', x_2')$, respectively, as follows:

$$s_1 = x_1 - x_1', \qquad s_2 = x_2 - x_2'. \tag{19}$$

The integration is performed with $s_1$ and $s_2$ fixed and requires recognition of the fact that $g$ and $h$ are periodic. The correlation function is periodic with the same periods as for $g$ and $h$, and its Fourier transform, that is, the power spectral density (PSD), is discrete and given by

$$\left(\Phi_{gh}\right)_{k_1,k_2} = \int_0^{Q_1}\int_0^{Q_2} \phi_{gh}(s_1, s_2)\, e^{-i2\pi\left(\frac{k_1 s_1}{Q_1} + \frac{k_2 s_2}{Q_2}\right)} ds_1 \, ds_2. \tag{20}$$

Substituting the correlation function, defined by Eq. (18), into Eq. (20) yields, after some straightforward manipulations (making use of Eq. (17) and the periodicity of its integrand):

$$\left(\Phi_{gh}\right)_{k_1,k_2} = \frac{1}{Q_1 Q_2}\, G^*_{k_1,k_2} H_{k_1,k_2}. \tag{21}$$
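
A discrete, one-dimensional analogue of Eq. (21) can be checked numerically. The sketch below is an illustration under stated assumptions, not the continuous two-dimensional relation itself: integrals are replaced by sums over $N$ samples of one period, so the factor $1/(Q_1 Q_2)$ becomes $1/N$, and the function names are hypothetical.

```python
import cmath


def dft(g):
    """Discrete Fourier transform with the sign convention of Eq. (17)."""
    n = len(g)
    return [sum(g[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]


def circular_correlation(g, h):
    """Discrete analogue of Eq. (18): average of conj(g[x]) * h[x + s] over one period."""
    n = len(g)
    return [sum(g[x].conjugate() * h[(x + s) % n] for x in range(n)) / n
            for s in range(n)]


g = [1.0, 2.0, 0.0, -1.0]
h = [0.5, -1.0, 3.0, 2.0]
psd_from_correlation = dft(circular_correlation(g, h))
psd_from_spectra = [a.conjugate() * b / len(g) for a, b in zip(dft(g), dft(h))]
```

Both routes give the same numbers up to rounding, which mirrors the statement that the PSD follows directly from the spectra of the two functions.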

Analogous to the spherical case, Eq. (13), the PSD of periodic functions on the plane can be determined directly from their Fourier series coefficients. A very similar situation arises for nonperiodic functions that are nevertheless square integrable on the plane:

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g^2 \, dx_1 \, dx_2 < \infty, \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h^2 \, dx_1 \, dx_2 < \infty. \tag{22}$$

In this case, the Fourier transform relationships for the function are given by

$$\begin{aligned} g(x_1, x_2) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(f_1, f_2)\, e^{i2\pi(f_1 x_1 + f_2 x_2)}\, df_1 \, df_2, \\ G(f_1, f_2) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x_1, x_2)\, e^{-i2\pi(f_1 x_1 + f_2 x_2)}\, dx_1 \, dx_2, \end{aligned} \tag{23}$$

where $f_1$ and $f_2$ are corresponding spatial (cyclical) frequencies. The correlation function is given by

$$\phi_{gh}(s_1, s_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1' \, dx_2'; \tag{24}$$


and its Fourier transform is easily shown to be

$$\Phi_{gh}(f_1, f_2) = G^*(f_1, f_2)\, H(f_1, f_2). \tag{25}$$

This Fourier transform of the correlation function is called more properly the energy spectral density, since the correlation function is simply the integral of the product of the functions. The square integrability of the functions implies that they have finite energy. Later we consider stochastic processes on the plane that are stationary, which means that they are not square integrable. For this case, one may relax the integrability condition to

$$\lim_{E_1 \to \infty}\lim_{E_2 \to \infty} \frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} |g|^2 \, dx_1 \, dx_2 < \infty, \tag{26}$$

and we say that $g$ has finite power (energy per domain unit). Analogously, the correlation function is given by

$$\phi_{gh}(s_1, s_2) = \lim_{E_1 \to \infty}\lim_{E_2 \to \infty} \frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1' \, dx_2', \tag{27}$$

but the Fourier transforms of the functions, $g$ and $h$, do not exist in the usual way (as in Eq. (23)). On the other hand, the correlation function is square integrable and, therefore, possesses a Fourier transform, that is, the PSD of $g$ and $h$:

$$\Phi_{gh}(f_1, f_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \phi_{gh}(s_1, s_2)\, e^{-i2\pi(f_1 s_1 + f_2 s_2)}\, ds_1 \, ds_2. \tag{28}$$

Consider truncated functions defined on a finite domain:

$$\bar{g}(x_1, x_2) = \begin{cases} g(x_1, x_2), & x_1 \in [-E_1/2, E_1/2] \text{ and } x_2 \in [-E_2/2, E_2/2] \\ 0, & \text{otherwise} \end{cases} \tag{29}$$

and similarly for $\bar{h}$. Then $\bar{g}$ and $\bar{h}$ are square integrable on the plane and have Fourier transforms, $\bar{G}$ and $\bar{H}$, respectively; e.g.,

$$\bar{G}(f_1, f_2) = \int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g(x_1, x_2)\, e^{-i2\pi(f_1 x_1 + f_2 x_2)}\, dx_1 \, dx_2. \tag{30}$$

It is now straightforward to show that in this case, the Fourier transform of $\phi_{gh}$ is given by

$$\Phi_{gh}(f_1, f_2) = \lim_{E_1 \to \infty}\lim_{E_2 \to \infty} \frac{1}{E_1 E_2}\, \bar{G}^*(f_1, f_2)\, \bar{H}(f_1, f_2). \tag{31}$$


In practice, this power spectral density can only be approximated due to the required limit operators. However, the essential relationship between (truncated) function spectra and the PSD is once more evident.
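
One common way to approximate such a PSD from data on a finite extent is to evaluate the expression under the limit for the available extent only. The sketch below does this for a one-dimensional, equally sampled profile; it is an assumption-laden illustration (hypothetical function name, one dimension instead of two, samples starting at $x = 0$ rather than at $-E/2$, which changes only the phase of the spectrum and not its squared magnitude).

```python
import cmath
import math


def periodogram(samples, dx):
    """PSD estimate |G_bar(f_k)|^2 / E at frequencies f_k = k / E, with E = N * dx."""
    n = len(samples)
    extent = n * dx
    psd = []
    for k in range(n):
        # Riemann-sum approximation of the truncated Fourier integral.
        g_bar = dx * sum(samples[j] * cmath.exp(-2j * math.pi * k * j / n)
                         for j in range(n))
        psd.append(abs(g_bar) ** 2 / extent)
    return psd


n, dx = 32, 0.5
samples = [math.cos(2 * math.pi * 3 * j / n) for j in range(n)]  # three cycles over the extent
psd = periodogram(samples, dx)
mean_power = sum(psd) / (n * dx)  # sum over frequencies times the spacing 1 / E
```

For this test signal the estimate is concentrated at the wave numbers 3 and 29 (the latter representing the negative frequency), and the summed PSD reproduces the mean square of the samples.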

2.3 From the Sphere to the Plane

For each class of functions on the plane, we did not need to impose isotropy on the correlation function. However, isotropy proves useful in comparisons to the spherical correlation function at high spatial frequencies. In the case of the nonperiodic functions on the plane, a simple averaging over azimuth changes the Fourier transform of the correlation function to its Hankel transform:

$$\Phi_{gh}(f)=2\pi\int_0^{\infty}\phi_{gh}(s)\,s\,J_0(2\pi fs)\,ds,\qquad \phi_{gh}(s)=2\pi\int_0^{\infty}\Phi_{gh}(f)\,f\,J_0(2\pi fs)\,df, \tag{32}$$

where $s=\sqrt{s_1^2+s_2^2}$ and $f=\sqrt{f_1^2+f_2^2}$, and $J_0$ is the zero-order Bessel function of the first kind. An approximate relation between the transforms of the planar and spherical isotropic correlation functions follows from the asymptotic relationship between Legendre polynomials and Bessel functions:

$$\lim_{n\to\infty}P_n\left(\cos\frac{x}{n}\right)=J_0(x),\quad \text{for } x>0. \tag{33}$$

If we let $x=2\pi fs$, where $s=R\psi$, and $R$ is the radius of the sphere, then with $2\pi f\approx n/R$, we have $x/n\approx\psi$. Hence, for large $n$ (or small $\psi$),

$$P_n(\cos\psi)\approx J_0(2\pi fs). \tag{34}$$

Now, discretizing the second of Eqs. (32) (with $df=1/(2\pi R)$) and substituting Eq. (33) yields (again, with $2\pi f\approx n/R$)

$$\phi_{gh}(s)\approx\sum_{n=0}^{\infty}\frac{n}{2\pi R^2}\,\Phi_{gh}\left(\frac{n}{2\pi R}\right)P_n\left(\cos\frac{s}{R}\right). \tag{35}$$

Comparing this with the spherical correlation function, Eq. (10), we see that

$$(2n+1)\left(\Phi_{gh}\right)_n\approx\frac{n}{2\pi R^2}\,\Phi_{gh}(f),\quad\text{where } f\approx\frac{n}{2\pi R}. \tag{36}$$

This relationship between planar and spherical PSDs holds only for isotropic correlation functions and for large $n$ or $f$.
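The asymptotic relation behind Eqs. (34) and (36) can be checked numerically. The sketch below is only illustrative: the radius value, the helper names, and the choice of $n=2000$ and $x=3$ are assumptions. The Bessel function is evaluated from its integral representation so that only NumPy is needed.

```python
import numpy as np

R = 6_371_000.0  # assumed mean Earth radius in metres (not specified in the text)

def degree_to_frequency(n, radius=R):
    """Planar frequency f ~ n / (2 pi R) in cycles per metre, as in Eq. (36)."""
    return n / (2.0 * np.pi * radius)

def legendre_pn(n, t):
    """P_n(t) using NumPy's Legendre series evaluation."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return np.polynomial.legendre.legval(t, coeffs)

def bessel_j0(x, k=2000):
    """J_0(x) from (1/pi) * integral over [0, pi] of cos(x sin t), midpoint rule."""
    t = (np.arange(k) + 0.5) * np.pi / k
    return np.mean(np.cos(x * np.sin(t)))

n, x = 2000, 3.0
lhs = legendre_pn(n, np.cos(x / n))   # P_n(cos(x/n)), left side of Eq. (33)
rhs = bessel_j0(x)                    # J_0(x), right side of Eq. (33)
f_2160 = degree_to_frequency(2160)    # frequency matched to degree 2160
```

For this high degree the two sides agree to about three decimal places, which illustrates why the correspondence in Eq. (36) is restricted to large $n$.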


2.4 Properties of Correlation Functions and PSDs

Correlation functions satisfy certain properties that should then also hold for corresponding models and may aid in their development. The autocorrelation is a positive definite function, since its eigenvalues defined by its spectrum, the PSD, are positive; e.g., see Eq. (14) or from Eq. (31):

$$\Phi_{gg}(f_1,f_2)=\lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\left|\bar G(f_1,f_2)\right|^2\ge 0. \tag{37}$$

The values of the autocorrelation function for nonzero argument are not greater than at the origin:

$$\phi_{gg}(\psi)\le\phi_{gg}(0),\quad \psi>0;\qquad \phi_{gg}(s_1,s_2)\le\phi_{gg}(0,0),\quad \sqrt{s_1^2+s_2^2}>0; \tag{38}$$

where equality would imply a perfectly correlated function (a constant). The inequalities (38) are proved using Schwarz’s inequality applied to Eqs. (6) and (24), respectively. Note that cross correlations may be larger in absolute value than their values at the origin (e.g., if they vanish there). Because of the imposed isotropy, spherical correlation functions are not defined for $\psi<0$. On the other hand, planar correlation functions may be formulated for all quadrants, and they satisfy

$$\phi_{gh}(s_1,s_2)=\phi_{hg}^*(-s_1,-s_2), \tag{39}$$

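which follows readily from their definition, given by Eqs. (24) or (27). Clearly, the autocorrelation function of a real function is symmetric with respect to the origin, even if not isotropic.

The correlation function of a derivative is the derivative of the correlation. For finite energy functions, we find immediately from Eq. (24) that

$$\frac{\partial}{\partial s_k}\phi_{gh}(s_1,s_2)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}g^*\left(x_1',x_2'\right)\frac{\partial h}{\partial s_k}\left(x_1'+s_1,x_2'+s_2\right)dx_1'\,dx_2'=\phi_{g,\frac{\partial h}{\partial x_k}}(s_1,s_2),\quad k=1,2. \tag{40}$$

From this and Eq. (39), we also have

$$\frac{\partial}{\partial s_k}\phi_{gh}(s_1,s_2)=\frac{\partial}{\partial s_k}\phi_{hg}^*(-s_1,-s_2)=-\phi^*_{h,\frac{\partial g}{\partial x_k}}(-s_1,-s_2)=-\phi_{\frac{\partial g}{\partial x_k},h}(s_1,s_2),\quad k=1,2. \tag{41}$$

The minus sign may be eliminated with the definition of $s_k$, Eqs. (19). We have $\partial/\partial s_k=-\partial/\partial x_k^{(g)}=\partial/\partial x_k^{(h)}$, where $x_k^{(g)}$ and $x_k^{(h)}$ refer, respectively, to the coordinates of $g$ and $h$. Therefore,

$$\phi_{\frac{\partial g}{\partial x_k^{(g)}},h}(s_1,s_2)=\frac{\partial}{\partial x_k^{(g)}}\phi_{gh}(s_1,s_2),\qquad \phi_{g,\frac{\partial h}{\partial x_k^{(h)}}}(s_1,s_2)=\frac{\partial}{\partial x_k^{(h)}}\phi_{gh}(s_1,s_2). \tag{42}$$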

The same results may be shown for correlation functions of other types of functions on the plane (where the derivations in the case of the limit operators require a bit more care).


Higher-order derivatives follow naturally, and indeed, we see that the correlation function of any linear operators on functions, $L^{(g)}g$ and $L^{(h)}h$, is the combination of these linear operators applied to the correlation function:

$$\phi_{L^{(g)}g,\,L^{(h)}h}=L^{(g)}\left(L^{(h)}\phi_{gh}\right). \tag{43}$$

Independent variables are omitted since this property, known as the law of propagation of correlations, holds also for the spherical case.

The PSDs of derivatives of functions on the plane follow directly from the inverse transform of the correlation function:

$$\phi_{gh}(s_1,s_2)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Phi_{gh}(f_1,f_2)\,e^{i2\pi(f_1s_1+f_2s_2)}\,df_1\,df_2. \tag{44}$$

With Eqs. (42), we find

$$\phi_{\frac{\partial g}{\partial x_k^{(g)}},\frac{\partial h}{\partial x_k^{(h)}}}(s_1,s_2)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Phi_{gh}(f_1,f_2)\,\frac{\partial^2}{\partial x_k^{(g)}\partial x_k^{(h)}}\,e^{i2\pi(f_1s_1+f_2s_2)}\,df_1\,df_2. \tag{45}$$

From this (and Eqs. (19)) one may readily infer the following general formula for the PSD of the derivatives of $g$ and $h$ of any order:

$$\Phi_{g_{p_1p_2},h_{q_1q_2}}(f_1,f_2)=(-1)^{p_1+p_2}(i2\pi f_1)^{p_1+q_1}(i2\pi f_2)^{p_2+q_2}\,\Phi_{gh}(f_1,f_2), \tag{46}$$

where $g_{p_1p_2}=\partial^{p_1+p_2}g/\left(\partial x_1^{p_1}\partial x_2^{p_2}\right)$ and $h_{q_1q_2}=\partial^{q_1+q_2}h/\left(\partial x_1^{q_1}\partial x_2^{q_2}\right)$. These expressions could be obtained also through Eqs. (21), (25), or (31), from the spectra of the function derivatives, which have a corresponding relationship to the spectra of the functions.

For functions on the sphere, the situation is hardly as simple. Indeed, this writer is unaware of formulas for the PSDs of horizontal derivatives, with the exception of an approximation for the average horizontal derivative,

$$d_Hg(\theta,\lambda)=\sqrt{\left(\frac{\partial g}{\partial\theta}\right)^2+\left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2}. \tag{47}$$

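Making use of an orthogonality proved by Jeffreys (1955):

$$\frac{1}{4\pi}\iint_{\sigma}\left(\frac{\partial\bar Y_{n,m}}{\partial\theta}\frac{\partial\bar Y_{p,q}}{\partial\theta}+\frac{1}{\sin^2\theta}\frac{\partial\bar Y_{n,m}}{\partial\lambda}\frac{\partial\bar Y_{p,q}}{\partial\lambda}\right)d\sigma=\begin{cases}n(n+1), & n=p \text{ and } m=q\\ 0, & n\ne p \text{ or } m\ne q\end{cases} \tag{48}$$

the autocorrelation function of $d_Hg$ at $\psi=0$ from Eq. (9) becomes

$$\phi_{d_Hg,d_Hg}(0)=\frac{1}{4\pi}\iint_{\sigma}\left(\left(\frac{\partial g}{\partial\theta}\right)^2+\left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2\right)\sin\theta\,d\theta\,d\lambda=\sum_{n=0}^{\infty}n(n+1)\sum_{m=-n}^{n}G_{nm}^2. \tag{49}$$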

It is tempting to identify the PSD by comparing this result to Eq. (10), but Eq. (49) proves this form of the correlation function only for $\psi=0$. The error in this approximation of the PSD of $d_Hg$ is an open question.

For functions, $\breve g(x_1,x_2;z)$, that satisfy Laplace’s equation, $\nabla^2\breve g=0$, in the space exterior to the plane (i.e., they are harmonic for $z>0$) and satisfy the boundary condition, $\breve g(x_1,x_2;0)=g(x_1,x_2)$, the Fourier spectrum on any plane with $z=z_0>0$ is related to the spectrum of $g$:

$$\breve G(f_1,f_2;z_0)=G(f_1,f_2)\,e^{-2\pi fz_0}, \tag{50}$$

where $f^2=f_1^2+f_2^2$. Similarly, for functions, $\breve g(\theta,\lambda;r)$, harmonic outside the sphere ($r>R$) that satisfy $\breve g(\theta,\lambda;R)=g(\theta,\lambda)$, the Legendre spectrum on any sphere with $r=r_0>R$ is related to the spectrum of $g$ according to

$$\breve G_{n,m}(r_0)=\left(\frac{R}{r_0}\right)^{n+1}G_{n,m}. \tag{51}$$

Therefore, the corresponding spectral densities are analogously related. In general, the cross PSD of $\breve g$ at level, $z=z_g$, and $\breve h$ at level, $z=z_h$, is given (e.g., substituting Eq. (50) for $\breve g$ and $\breve h$ into Eq. (31)) by

$$\Phi_{\breve g\breve h}\left(f_1,f_2;z_g,z_h\right)=e^{-2\pi f(z_g+z_h)}\,\Phi_{gh}(f_1,f_2). \tag{52}$$

Note that the altitudes add in the exponent. Similarly, for cross PSDs of functions on spheres, $r=r_g$ and $r=r_h$, we have

$$\left(\Phi_{\breve g\breve h}\right)_n\left(r_g,r_h\right)=\left(\frac{R^2}{r_gr_h}\right)^{n+1}\left(\Phi_{gh}\right)_n. \tag{53}$$

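Although the altitude variables were treated strictly as parameters in these PSDs, one may consider briefly the corresponding correlation functions as “functions” of $z$ and $r$, respectively, for the sole purpose of deriving the correlation functions of vertical (radial) derivatives. Indeed, it is readily seen from the definitions, Eqs. (9) and (27), for the cross correlation of $\breve g\left(\theta,\lambda;r_g\right)$ and $\breve h\left(\theta,\lambda;r_h\right)$ that

$$\phi_{\frac{\partial\breve g}{\partial r_g},\frac{\partial\breve h}{\partial r_h}}\left(\psi;r_g,r_h\right)=\frac{\partial^2}{\partial r_g\partial r_h}\phi_{\breve g\breve h}\left(\psi;r_g,r_h\right),\qquad \phi_{\frac{\partial\breve g}{\partial z_g},\frac{\partial\breve h}{\partial z_h}}\left(s_1,s_2;z_g,z_h\right)=\frac{\partial^2}{\partial z_g\partial z_h}\phi_{\breve g\breve h}\left(s_1,s_2;z_g,z_h\right), \tag{54}$$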

and the law of propagation of correlations, Eq. (43), holds also for this linear operator. It should be stressed, however, that the correlation function is essentially a function of variables on the plane or sphere; no integration of products of functions takes place in the third dimension.


The cross PSDs of vertical derivatives, therefore, are easily derived by applying Eqs. (54) to the inverse transforms of the correlation functions, Eqs. (44) and (10), with extended expressions for the PSDs, Eqs. (52) and (53). The result is

$$\Phi_{\breve g_{z_g^j}\breve h_{z_h^k}}\left(f_1,f_2;z_g,z_h\right)=\frac{\partial^{j+k}}{\partial z_g^j\,\partial z_h^k}\left(e^{-2\pi f(z_g+z_h)}\right)\Phi_{gh}(f_1,f_2)=(-2\pi f)^{j+k}\,e^{-2\pi f(z_g+z_h)}\,\Phi_{gh}(f_1,f_2), \tag{55}$$

$$\left(\Phi_{\breve g_{r_g^j}\breve h_{r_h^k}}\right)_n\left(r_g,r_h\right)=\frac{\partial^{j+k}}{\partial r_g^j\,\partial r_h^k}\left(\left(\frac{R^2}{r_gr_h}\right)^{n+1}\right)\left(\Phi_{gh}\right)_n, \tag{56}$$

where $\breve g_{z_g^j}=\partial^j\breve g/\partial z_g^j$, $\breve h_{z_h^k}=\partial^k\breve h/\partial z_h^k$, $\breve g_{r_g^j}=\partial^j\breve g/\partial r_g^j$, and $\breve h_{r_h^k}=\partial^k\breve h/\partial r_h^k$. Thus, the PSD for any combination of horizontal and vertical derivatives of $g$ and $h$ on horizontal planes in Cartesian space may be obtained by appending the appropriate factors to $\Phi_{gh}$. The same holds for any combination of vertical derivatives of $g$ and $h$ on concentric spheres.
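The planar factors in Eqs. (46) and (55) can be collected in a single helper, sketched below. The function name and argument layout are hypothetical; the sketch assumes that the horizontal and vertical factors simply multiply, as the preceding paragraph indicates for functions on horizontal planes.

```python
import numpy as np

def derivative_psd_factor(f1, f2, p=(0, 0), q=(0, 0), jz=0, kz=0, zg=0.0, zh=0.0):
    """Factor multiplying Phi_gh for derivatives of g and h on horizontal planes.

    p = (p1, p2), q = (q1, q2): horizontal derivative orders of g and h, Eq. (46).
    jz, kz: vertical derivative orders of g and h; zg, zh: altitudes, Eq. (55).
    """
    p1, p2 = p
    q1, q2 = q
    f = np.hypot(f1, f2)
    horizontal = ((-1.0) ** (p1 + p2)
                  * (2j * np.pi * f1) ** (p1 + q1)
                  * (2j * np.pi * f2) ** (p2 + q2))
    vertical = (-2.0 * np.pi * f) ** (jz + kz) * np.exp(-2.0 * np.pi * f * (zg + zh))
    return horizontal * vertical

f1, f2 = 3.0e-5, 4.0e-5   # illustrative frequencies in cycles per metre
auto_dx1 = derivative_psd_factor(f1, f2, p=(1, 0), q=(1, 0))
auto_dx2 = derivative_psd_factor(f1, f2, p=(0, 1), q=(0, 1))
auto_dz = derivative_psd_factor(f1, f2, jz=1, kz=1)
```

For a harmonic function on the plane $z=0$, the auto-PSD factor of the vertical derivative equals the sum of the factors of the two horizontal derivatives, since $(2\pi f)^2=(2\pi f_1)^2+(2\pi f_2)^2$.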

3 Stochastic Processes and Covariance Functions

A stochastic (or random) process is a collection, discrete or continuous, of random variables that are associated with a deterministic variable, in our case, a point on the plane or sphere. At each point, the process is random with an underlying probability distribution. A probability domain or sample space for each random variable is implied but omitted in the following simplified notation; in fact, the distribution may be unknown. If each random variable takes on a specific value from the corresponding sample spaces, then the process is said to be realized, and this realization is a function of the point coordinates. Thus, we continue to use the notation, $g$, to represent a continuous stochastic process, with the understanding that for any fixed point, it is a random variable.

We assume that the process is wide-sense stationary, meaning that all statistics up to second order are invariant with respect to the origin of the space variable. Then, the expectation of $g$ at all points is the same constant, and the covariance between $g$ at any two points depends only on the displacement (vector) of one point relative to the other. Typically, besides not knowing the probability distribution, we have access only to a single realization of the stochastic process, which makes the estimation of essential statistics such as the mean and covariance problematic, unless we invoke an additional powerful condition characteristic of many processes: ergodicity. For ergodic processes the statistics associated with the underlying probability law, based on the statistical (ensemble) expectation, are equivalent to the statistics derived from space-based averages of a single realization of the process. Stationarity is necessary but not sufficient for ergodicity. Also, we consider only wide-sense ergodicity.

It can be shown that stationary stochastic processes whose underlying probability distribution is Gaussian are also ergodic (Moritz 1980; Jekeli 1991). We do not need this result since the probability distribution is not needed in our developments; and, indeed, ergodic processes on the sphere cannot be Gaussian (Lauritzen 1973).


For stochastic processes on the sphere, we define the space average as

$$M(\cdot)=\frac{1}{4\pi}\iint_{\sigma}(\cdot)\,d\sigma. \tag{57}$$

Let $g$ and $h$ be two such processes that are ergodic (hence, also stationary) and let their means, according to Eq. (57), be denoted $\mu_g$ and $\mu_h$. Then, the covariance function of $g$ and $h$ is given by

$$M\left(\left(g-\mu_g\right)^*\left(h-\mu_h\right)\right)=\frac{1}{4\pi}\iint_{\sigma}\left(g(\theta,\lambda)-\mu_g\right)^*\left(h\left(\theta',\lambda'\right)-\mu_h\right)\sin\theta\,d\theta\,d\lambda, \tag{58}$$

which, by the stationarity, depends only on the relative location of $g$ and $h$, that is, on $(\psi,\alpha)$, as given by Eqs. (7) and (8). We will assume without loss of generality that the means of the processes are zero (if not, redefine the process with its constant mean removed). Then, clearly, the covariance function is like the correlation function, Eq. (6), except the interpretation is for stochastic processes. We continue to use the same notation, however, and further redefine the covariance function to be isotropic by including an average in azimuth, $\alpha$, as in Eq. (9). The Legendre transform of the covariance function is also the (cross) PSD of $g$ and $h$ and is given by Eq. (13). The quantities,

$$\left(c_{gh}\right)_n=(2n+1)\left(\Phi_{gh}\right)_n, \tag{59}$$

are known as degree variances, or variances per degree, on account of the total variance being, from Eq. (10),

$$\phi_{gh}(0)=\sum_{n=0}^{\infty}\left(c_{gh}\right)_n. \tag{60}$$

Ergodic processes on the plane are not square integrable since they are also stationary, and we define the average operator as

$$M(\cdot)=\lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}(\cdot)\,dx_1\,dx_2. \tag{61}$$

The covariance function under the assumption of zero means for $g$ and $h$ is, again, the correlation function given by Eq. (27). However, the PSD requires some additional derivation since the truncated stochastic processes, $\bar g$ and $\bar h$, defined as in Eq. (29), are not stationary and, therefore, not ergodic. Since both $\bar g$ and $\bar h$ are random for each space variable, their Fourier transforms, $\bar G$ and $\bar H$, are also stochastic in nature. Consider first the ensemble expectation of the product of transforms, given by Eq. (30),

$$E\left(\bar G^*(f_1,f_2)\,\bar H(f_1,f_2)\right)=\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}E\left(g^*\left(x_1',x_2'\right)h(x_1,x_2)\right)e^{i2\pi\left(f_1(x_1'-x_1)+f_2(x_2'-x_2)\right)}\,dx_1\,dx_2\,dx_1'\,dx_2'. \tag{62}$$

The expectation inside the integrals is the same as the space average and is the covariance function of $g$ and $h$, as defined above, which because of their stationarity depends only on the coordinate differences, $s_1=x_1-x_1'$ and $s_2=x_2-x_2'$. It can be shown (Brown 1983, p. 86) that the integrations reduce to

$$E\left(\bar G^*(f_1,f_2)\,\bar H(f_1,f_2)\right)=E_1E_2\left(\int_{-E_1}^{E_1}\int_{-E_2}^{E_2}\left(1-\frac{|s_1|}{E_1}\right)\left(1-\frac{|s_2|}{E_2}\right)\phi_{gh}(s_1,s_2)\,e^{-i2\pi(f_1s_1+f_2s_2)}\,ds_1\,ds_2\right). \tag{63}$$

In the limit, the integrals on the right side approach the Fourier transform of the covariance function, that is, the PSD, $\Phi_{gh}(f_1,f_2)$; and, we have

$$\Phi_{gh}(f_1,f_2)=\lim_{E_1\to\infty}\lim_{E_2\to\infty}E\left(\frac{1}{E_1E_2}\,\bar G^*(f_1,f_2)\,\bar H(f_1,f_2)\right). \tag{64}$$

Again, in practice, this PSD can only be approximated due to the limit and expectation operators. We have shown that under appropriate assumptions (ergodicity), the covariance functions of stochastic processes on the sphere or plane are essentially identical to the corresponding correlation functions that were developed without a stochastic foundation. The only exception occurs in the relationship between Fourier spectra and the (Fourier) PSD (compare Eqs. (31) and (64)). Furthermore, from Eqs. (62) through (64) we have also shown that the covariance function of a stochastic process is the Fourier transform of the PSD, given by Eq. (64). This is a statement of the more general Wiener-Khintchine theorem (Priestley 1981). Although there are opposing schools of thought as to the stochastic nature of a field like Earth’s gravitational potential, we will argue (see below) that the stochastic interpretation is entirely legitimate. Moreover, the stochastic interpretation of the gravitational field is widely, if not uniformly, accepted in geodesy (e.g., Moritz 1978, 1980; Hofmann-Wellenhof and Moritz 2005), as is the covariance nomenclature. Moritz (1980) provided compelling justifications to view the gravitational field as a stochastic process on the plane or sphere. The use of covariance functions also emphasizes that the significance of correlations among functions lies in their variability irrespective of the means (which we will always assume to be zero). For these reasons, we will henceforth in our applications to the Earth’s gravitational field refer only to covariance functions, use the same notation, and use all the properties and relationships derived for correlation functions.

3.1 Earth’s Anomalous Gravitational Field

The masses of the Earth, including all material below its surface, as well as the atmosphere, generate the gravitational field, which in vacuum is harmonic and satisfies Laplace’s differential equation. For present purposes we neglect the atmosphere (and usually its effect is removed from data) so that for points, $\mathbf{x}$, above the surface, the gravitational potential, $V$, fulfills Laplace’s equation,

$$\nabla^2V(\mathbf{x})=0. \tag{65}$$

Global solutions to this equation depend on boundary values of $V$ or its derivatives on some mathematically convenient bounding surface. Typically this surface is a sphere with radius, $a$, and the solution is then expressed in spherical polar coordinates, $(r,\theta,\lambda)$, as an infinite series of solid spherical harmonic functions, $\bar Y_{n,m}(\theta,\lambda)/r^{n+1}$, for points outside the sphere:

$$V(r,\theta,\lambda)=\frac{GM}{a}\sum_{n=0}^{\infty}\sum_{m=-n}^{n}\left(\frac{a}{r}\right)^{n+1}C_{n,m}\,\bar Y_{n,m}(\theta,\lambda), \tag{66}$$

where $GM$ is Newton’s gravitational constant times the total mass of the Earth (this scale factor is determined from satellite tracking data); and $C_{n,m}$ is a coefficient of degree, $n$, and order, $m$, determined from $V$ and/or its radial derivatives on the bounding sphere (obtained, e.g., from measurements of gravity). Modern solutions also make use of satellite tracking data and in situ measurements of the field by satellite-borne instruments to determine these coefficients.

In a coordinate system fixed to the Earth, we define the gravity potential as the sum of the gravitational potential, $V$, due to mass attraction and the (nongravitational) potential, $\varphi$, whose gradient equals the centrifugal acceleration due to Earth’s rotation:

$$W(\mathbf{x})=V(\mathbf{x})+\varphi(\mathbf{x}). \tag{67}$$

If we define a normal (i.e., reference) gravity potential, $U=V^{\mathrm{ellip}}+\varphi$, associated with a corotating material ellipsoid, such that on this ellipsoid, $U|_{\mathbf{x}\in\mathrm{ellip}}=U_0$, then the difference, called the disturbing potential,

$$T(\mathbf{x})=W(\mathbf{x})-U(\mathbf{x}), \tag{68}$$

is also a harmonic function in free space and may be represented as a spherical harmonic series:

$$T(r,\theta,\lambda)=\frac{GM}{a}\sum_{n=2}^{\infty}\sum_{m=-n}^{n}\left(\frac{a}{r}\right)^{n+1}\delta C_{n,m}\,\bar Y_{n,m}(\theta,\lambda), \tag{69}$$

where the $\delta C_{n,m}$ are coefficients associated with the difference, $V-V^{\mathrm{ellip}}$. The total ellipsoid mass is set equal to the Earth’s total mass, so that $\delta C_{0,0}=0$; and, the coordinate origin is placed at the center of mass of the Earth (and ellipsoid), implying that the first moments of the mass distribution all vanish: $\delta C_{1,m}=0$ for $m=-1,0,1$.

The set of spherical harmonic coefficients, $t_{n,m}=(GM/a)\,\delta C_{n,m}$, represents the Legendre spectrum of $T$. Practically, it is known only up to some finite degree, $n_{\max}$; for example, the model, EGM2008, has $n_{\max}=2{,}190$ (Pavlis et al. 2012a, b). The harmonic coefficients of this model refer to a sphere of boundary values whose radius is equated with the semimajor axis of the best-fitting Earth ellipsoid. The uniform convergence of the infinite series, Eq. (69), is guaranteed for $r\ge a$,

but effects of divergence are evident in the truncated series, EGM2008, when $r<a$, and due care should be exercised in evaluations on or near the Earth’s surface. The disturbing potential may also be defined with respect to higher-degree reference potentials, although in this case one may need to account for significant errors in the coefficients, $C_{n,m}^{\mathrm{ref}}$. In particular, the local interpretation of the field as a stationary random process usually requires removal of a higher-degree reference field.

In the Cartesian formulation, the disturbing potential in free space ($z\ge 0$) is expressed in terms of its Fourier spectrum, $\tau(f_1,f_2)$, on the plane, $z=0$, as

$$T(x_1,x_2;z)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\tau(f_1,f_2)\,e^{-2\pi fz}\,e^{i2\pi(f_1x_1+f_2x_2)}\,df_1\,df_2, \tag{70}$$

where $f=\sqrt{f_1^2+f_2^2}$.
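Eq. (70) suggests a simple discrete procedure for upward continuation on a grid: transform, attenuate each frequency by $e^{-2\pi fz}$, and transform back. The following is a minimal sketch under the planar approximation with periodic boundary conditions implied by the FFT; the function name, grid, and test signal are illustrative assumptions.

```python
import numpy as np

def upward_continue(t0, dx1, dx2, z):
    """Continue gridded values t0 on the plane z = 0 up to altitude z >= 0, Eq. (70)."""
    n1, n2 = t0.shape
    f1 = np.fft.fftfreq(n1, d=dx1)[:, None]
    f2 = np.fft.fftfreq(n2, d=dx2)[None, :]
    f = np.hypot(f1, f2)                       # radial frequency
    spectrum = np.fft.fft2(t0)                 # discrete stand-in for the spectrum
    return np.real(np.fft.ifft2(spectrum * np.exp(-2.0 * np.pi * f * z)))

# Hypothetical test signal: one cosine with a 16 km wavelength along x1.
n, dx = 64, 1000.0
x1 = np.arange(n)[:, None] * dx
wavelength = n * dx / 4.0
t0 = np.cos(2.0 * np.pi * x1 / wavelength) * np.ones((1, n))

tz = upward_continue(t0, dx, dx, z=2000.0)
expected = np.exp(-2.0 * np.pi * 2000.0 / wavelength) * t0
```

A single harmonic keeps its shape and is only scaled by the attenuation factor, so shorter wavelengths are damped more strongly with altitude.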

3.2 The Disturbing Potential as a Stochastic Process

In addition to the well-grounded reasoning already cited, an alternative justification of the stochastic nature of $T$ is argued here based on the fractal (self-similar) characteristics of Earth’s topography (see also Turcotte 1987). This will provide also a basis for modeling the covariance function of $T$ and its derivatives. The fractal geometry of the Earth’s topography (among fractals in general) was investigated and popularized by Mandelbrot in a number of papers and reviewed in his book (Mandelbrot 1983) using fundamentally the concept of Brownian motion, which is the process of a random walk. Thus, without going into the details of fractals, we have at least a connection between topography and randomness.

Next, we may appeal to the well-known (in physical geodesy and geophysics) high degree of linear correlation between gravity anomalies and topographic height. This correlation stems from the theory of isostasy that explains the existence of topography on the Earth whose state generally tends toward one of hydrostatic equilibrium. Although this correlation is not perfect (or almost nonexistent in regions of tectonic subsidence and rifting), empirical evidence suggests that in many areas the correlation is quite faithful to this theory, even with a number of seemingly crude approximations. The gravity anomaly, $\Delta g$, and its isostatic reduction are defined in Hofmann-Wellenhof and Moritz (2005). At a point, $P$, the isostatically reduced gravity anomaly is given by

$$\Delta g_I(P)=\Delta g(P)-C(P)+A(P), \tag{71}$$

where $C(P)$ is the gravitational effect of all masses above the geoid and $A(P)$ is the effect of their isostatic compensation. Several models for isostatic compensation have been developed by geophysicists (Watts 2001). Airy’s model treats the compensation locally and assumes that there is no regional flexural rigidity in the lithosphere. With this model, the topography presumably floats in the denser mantle, and equilibrium is established according to the buoyancy principle (Fig. 1):

$$\rho h=\left(\rho_m-\rho\right)h'=\Delta\rho\,h', \tag{72}$$

where $h'$ is the (positive) depth of the “root” with respect to the depth of compensation, $D$ (typically, $D=30$ km), and the crust density, $\rho$, and the mantle density, $\rho_m$, are assumed constant. Similarly, in ocean areas, the lower density of water relative to the crust allows the mantle to intrude into the crust, where equilibrium is established if $\left(\rho-\rho_w\right)b=\Delta\rho\,b'$, and $b$ is the (positive) bathymetric distance to the ocean floor, $b'$ is the height of the “anti-root” of mantle material, and $\rho_w$ is the density of seawater.

Fig. 1 Isostatic compensation of topography according to the Airy model

Removing the mass that generates $C(P)$ makes the space above the geoid homogeneous (empty). According to Airy’s model, the attraction, $A(P)$, is due, in effect, to adding that mass to the root so as to make the mantle below $D$ homogeneous. If the isostatic compensation is perfect according to this model, then the isostatic anomaly would vanish because of this created homogeneity; and, indeed, isostatic anomalies tend to be small. Therefore, the free-air gravity anomaly according to Eq. (71) with $\Delta g_I(P)\approx 0$ is generated by the attraction due to the topographic masses above the geoid, with density, $\rho$, and by the attraction due to the lack of mass below the depth of compensation, with density, $-\Delta\rho$:

$$\Delta g(P)\approx C(P)-A(P). \tag{73}$$
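The buoyancy relations of Eq. (72) are simple enough to evaluate directly. In the short sketch below the density values are assumed, commonly quoted round numbers (they are not given in the text), and the function names are hypothetical.

```python
RHO_CRUST = 2670.0    # kg/m^3, assumed crust density
RHO_MANTLE = 3270.0   # kg/m^3, assumed mantle density
RHO_WATER = 1030.0    # kg/m^3, assumed seawater density

def airy_root(h, rho=RHO_CRUST, rho_m=RHO_MANTLE):
    """Root thickness h' from Eq. (72): rho * h = (rho_m - rho) * h'."""
    return rho * h / (rho_m - rho)

def airy_antiroot(b, rho=RHO_CRUST, rho_m=RHO_MANTLE, rho_w=RHO_WATER):
    """Anti-root height b' from (rho - rho_w) * b = (rho_m - rho) * b'."""
    return (rho - rho_w) * b / (rho_m - rho)

root = airy_root(1000.0)          # root below 1 km of topography
antiroot = airy_antiroot(4000.0)  # anti-root below 4 km of ocean depth
```

With these assumed densities, each kilometre of topography is balanced by a root several times thicker, because the density contrast between mantle and crust is much smaller than the crust density itself.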

Expressions for the terms on the right side can be found using various approximations. One such approximation (Forsberg 1985) “condenses” the topography onto the geoid (Helmert condensation, Helmert 1884; Martinec 1998), and the gravitational effect is then due to a two-dimensional mass layer with density, $\kappa_H=\rho h$. Likewise, the gravitational effect of the ocean bottom topography can be modeled by forming a layer on the geoid that represents the ocean’s deficiency in density relative to the crust. The density of this layer is negative: $\kappa_B=-\left(\rho-\rho_w\right)b=-\rho\left(1-\rho_w/\rho\right)b$. The gravitational potential, $v$, at a point, $P$, due to a layer condensed from topography (or bathymetry) is given by

$$v(P)=G\rho R^2\iint_{\sigma}\frac{\bar h(Q)}{\ell}\,d\sigma_Q,\qquad \bar h(Q)=\begin{cases}h(Q), & Q\in\text{land}\\ -\left(1-\dfrac{\rho_w}{\rho}\right)b(Q), & Q\in\text{ocean}\end{cases} \tag{74}$$

where $\ell$ is the distance between $P$ and the integration point. Similarly, the potential of the mass added below the depth of compensation can be approximated by that of another layer at level $D$ with density, $\kappa_H'=-\Delta\rho\,h'$, representing a condensation of material that is deficient in density with respect to the mantle and extends a depth, $h'$, below $D$ (see Fig. 1). For ocean areas, the anti-root is condensed onto the depth of compensation with density, $\kappa_B'=\Delta\rho\,b'$.

Equation (74) for a fixed height of the point, $P$, is a convolution of $\bar h$ and the inverse distance. Further making the planar approximation (for local, or high-frequency applications), this distance is $\ell\approx\sqrt{\left(x_1-x_1'\right)^2+\left(x_2-x_2'\right)^2+z^2}$, with $\left(x_1',x_2'\right)$ being the planar coordinates of point $Q$. Applying the convolution theorem, the Fourier transform of the potential at the level of $z>0$ is given by

$$V(f_1,f_2;z)=\frac{G\rho}{f}\,\bar H(f_1,f_2)\,e^{-2\pi fz}. \tag{75}$$

Including the layer at the compensation depth, $D$, below the geoid with density, $\kappa_H'=-\rho h$ (in view of Eqs. (72); and similarly $\kappa_B'=\rho\left(1-\rho_w/\rho\right)b$ for ocean areas), the Fourier transform of the total potential due to both the topography and its isostatic compensation is approximately

$$V(f_1,f_2;z)=\frac{G\rho}{f}\,\bar H(f_1,f_2)\left(e^{-2\pi fz}-e^{-2\pi f(D+z)}\right). \tag{76}$$

Since the gravity anomaly is approximately the radial derivative of this potential, multiplying by $2\pi f$ yields its Fourier transform:

$$\Delta G(f_1,f_2;z)=2\pi G\rho\,\bar H(f_1,f_2)\left(e^{-2\pi fz}-e^{-2\pi f(D+z)}\right). \tag{77}$$
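The factor that multiplies $\bar H$ in Eq. (77) acts as a transfer function from topographic height to gravity anomaly. The sketch below evaluates it for a few frequencies; the crust density, the value of the gravitational constant, and the function name are assumptions chosen for illustration, while $D=30$ km follows the typical value quoted above.

```python
import numpy as np

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2, assumed value of the gravitational constant

def isostatic_admittance(f, z=0.0, depth=30_000.0, rho=2670.0):
    """Factor multiplying the height spectrum in Eq. (77), in s^-2 (m/s^2 per m)."""
    f = np.asarray(f, dtype=float)
    return 2.0 * np.pi * G_NEWTON * rho * (
        np.exp(-2.0 * np.pi * f * z) - np.exp(-2.0 * np.pi * f * (depth + z)))

freqs = np.array([1e-7, 1e-5, 1e-3])                        # cycles per metre
admittance_mgal_per_m = isostatic_admittance(freqs) * 1e5   # 1 mGal = 1e-5 m/s^2
slab_factor = 2.0 * np.pi * G_NEWTON * 2670.0 * 1e5         # limit for z = 0, large f
```

At $z=0$ the factor vanishes for long wavelengths, where compensation cancels the topographic attraction, and it tends to the constant $2\pi G\rho$ for short wavelengths.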

Neglecting the upward continuation term, as well as the isostatic term (which is justified only for very short wavelengths), confirms the empirical linear relationship between the heights and the gravity anomaly.

Figure 2 compares the PSDs of the topography and the gravitational field both globally and locally. The global PSDs were computed from spherical harmonic expansions EGM2008 for the gravitational potential and DTM2006 for the topography (Pavlis et al. 2012a) according to Eq. (14) but converted to spatial frequency using Eq. (36). In addition, both were scaled to the PSD of the geoid undulation, which is related to the disturbing potential as $N=T/\gamma$, where $\gamma=9.8$ m/s² is an average value of gravity. The topographic height is related to the potential through Eq. (76). Both expansions are complete up to degree, $n_{\max}=2{,}160$ ($f_{\max}=5.4\times10^{-5}$ cy/m). DTM2006 is an expansion for both the topographic height above mean sea level and the depth of the ocean and, therefore, does not exactly correspond to $\bar h$, as defined in Eq. (74). This contributes to an overestimation of the power at lower frequencies. The obviously lower power of EGM2008 at higher frequencies results from the higher altitude, on average, to which its spectrum refers, that is, the sphere of radius, $a$.

Fig. 2 Comparison of gravitational and topographic PSDs, scaled to the geoid undulation PSD. Global models are EGM2008 and DTM2006, and local PSDs were derived from gravity and topographic data in the indicated areas 1 and 2
[Axes: frequency [cy/m] versus geoid undulation PSD [m²/(cy/m)²]; curves: EGM2008, DTM2006, and gravity and topography for areas 1 and 2]

The other PSDs in Fig. 2 correspond to the indicated regions and were derived according to Eq. (31) from local terrain elevation and gravity anomaly data provided by the US National Geodetic Survey. The data grids in latitude and longitude have resolution of 30 arcsec for the topography and 1 arcmin for the gravity. With a planar approximation for these areas, the Fourier transforms were calculated using their discrete versions. The PSDs were computed by neglecting the limit (and expectation) operators and were averaged in azimuth. Dividing the gravity PSD by $(2\pi\gamma f)^2$ then yields the geoid undulation PSD; and, as before, Eq. (76) relates the topography PSD to the potential PSD that scales to the geoid undulation PSD by $1/\gamma^2$. In these regions, the gravity and topography PSDs match well at the higher frequencies at least, attesting to their high linear correlation. Moreover, these PSDs follow a power law in accord with the presumed fractal nature of the topography. These examples then offer a validation of the stochastic interpretation of the gravitational field and also provide a starting point to model its covariance function.

4 Covariance Models

Since the true covariance function of a process, such as the Earth’s gravity field, rarely is known and local functions can vary from region to region (thus we allow global non-stationarity in local applications), it must usually be modeled from data. We consider here primarily the modeling of the autocovariance function, that is, when $g=h$. Models for the cross covariance function could follow similar procedures, but usually $g$ and $h$ are linearly related and the method of propagation of covariances (see Sect. 2.4) should be followed to derive $\phi_{gh}$ from $\phi_{gg}$.

Modeling the covariance (or correlation) function of a process on the plane or sphere can proceed with different assumptions and motivations. We distinguish in the first place between empirical and analytic methods and in the second between global and local models. Global models describe the correlation of functions on the sphere; whereas, local models usually are restricted to applications where a planar approximation suffices. Empirical models are derived directly from data distributed on the presumably spherical or planar surface of the Earth. Rarely, if ever, are global empirical covariance models determined for the sphere according to the principal definition, Eq. (9). Instead, such models are given directly by the degree variances, Eq. (59). For local modeling with the planar approximation, the empirical model comes from a discretization of Eq. (27), where we neglect also the limit processes,

$$\hat\phi_{gg}(s_1,s_2)=\frac{1}{M}\sum_{x_1'}\sum_{x_2'}g\left(x_1',x_2'\right)g\left(x_1'+s_1,x_2'+s_2\right), \tag{78}$$

and where $M$ is the total number of summed products for each $(s_1,s_2)$. A corresponding approximation for an isotropic model additionally averages the products of $g$ that are separated by a given distance, $s=\sqrt{\left(x_1-x_1'\right)^2+\left(x_2-x_2'\right)^2}$, over all directions. Typically, the maximum $s$ considered is much smaller than (perhaps 10–20 % of) the physical dimension of the data area, since the approximation of Eq. (27) by Eq. (78) worsens as the number of possible summands within a finite area decreases. Also, $M$ may be fixed at the largest possible value (the number of products for $s=0$) in order to avoid a numerical nonpositive definiteness of the covariance function (Marple 1987, p. 148). However, this creates a biased estimate of the covariance, particularly for the larger distances.

Another form of empirical covariance model is its Fourier (or Legendre) transform, derived directly from the data, as was illustrated for the gravity anomaly and topography in Fig. 2. The inverse transform then yields immediately the covariance function. The disadvantage of the empirical covariance model, Eq. (78), is the limited ability (or inability) to derive consistent covariances of functionally related quantities, such as the derivatives of $g$ through the law of propagation of covariances, Eq. (43). This could only be accomplished by working with its transform (see Eqs. (46)), but generally, an analytic model eventually simplifies the computational aspects of determining auto- and cross covariances.
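The empirical estimate of Eq. (78) can be sketched for a regular grid as follows. The function name and arguments are hypothetical, only non-negative lags in both directions are computed, and the grid mean is removed first to match the zero-mean assumption. The `biased` option fixes $M$ at the number of products for zero lag, as discussed above.

```python
import numpy as np

def empirical_covariance(g, max_lag1, max_lag2, biased=False):
    """Eq. (78) for lags s1 = k1*dx1 >= 0 and s2 = k2*dx2 >= 0 on a regular grid."""
    g = np.asarray(g, dtype=float)
    g = g - g.mean()                         # enforce the zero-mean assumption
    n1, n2 = g.shape
    cov = np.empty((max_lag1 + 1, max_lag2 + 1))
    for k1 in range(max_lag1 + 1):
        for k2 in range(max_lag2 + 1):
            # All available products of values separated by the lag (k1, k2).
            products = g[: n1 - k1, : n2 - k2] * g[k1:, k2:]
            m = n1 * n2 if biased else products.size
            cov[k1, k2] = products.sum() / m
    return cov

# Hypothetical data: one realization of white noise on a 40 x 50 grid.
rng = np.random.default_rng(1)
field = rng.standard_normal((40, 50))
cov = empirical_covariance(field, 5, 5)
cov_biased = empirical_covariance(field, 5, 5, biased=True)
```

The biased variant never exceeds its zero-lag value in magnitude, which is the property that motivates fixing $M$; the unbiased variant divides by the actual number of products at each lag.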

4.1 Analytic Models Analytic covariance models are constructed from relatively simple mathematical functions that typically are fit to empirical data (either in the spatial or frequency domains) and have the benefit of easy computation and additional properties useful to a particular application (such as straightforward propagation of covariances). An analytic model should satisfy all the basic properties of the covariance function (Sect. 2.4), although depending on the application some of these may be omitted (such as the harmonic extension into space for the Gaussian model, $\phi(s) = \sigma^2 e^{-\beta s^2}$). An analytic model may be developed for the PSD or the covariance function. Ideally (but not always), one leads to a mathematically rigorous formulation of the other. Perhaps the most famous global analytic model is known as Kaula's rule, proposed by W. Kaula (1966, p. 98) in order to develop the idea of a stochastic interpretation of the spherical spectrum of the disturbing potential:

$$(\Phi_{TT})_n = \left(\frac{GM}{R}\right)^2 \frac{10^{-10}}{n^4}\ \mathrm{m^4/s^4}, \tag{79}$$


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-2 © Springer-Verlag London 2014

where R is the mean Earth radius. It roughly described the attenuation of the harmonic coefficients known at that time from satellite tracking observations, but it is reasonably faithful to the spectral attenuation of the field even at high degrees (see Fig. 4). Note that Kaula’s rule is a power-law model for the PSD of the geopotential, agreeing with our arguments above for such a characteristic based on the fractal nature of the topography. The geodetic literature of the latter part of the last century is replete with different types of global and local covariance and PSD models for the Earth’s residual gravity field (e.g., Jordan 1972; Tscherning and Rapp 1974; Heller and Jordan 1979; Forsberg 1987; Milbert 1991; among others); but, it is not the purpose here to review them. Rather the present aim is to promote a single elemental prototype model that (1) satisfies all the properties of a covariance model for a stochastic process, (2) has harmonic extension into free space, (3) has both spherical and planar analytic expressions for all derivatives of the potential in both the space and frequency domains, and (4) is sufficiently adaptable to any strength and attenuation of the gravitational field. This is the reciprocal distance model introduced by Moritz (1976, 1980), so called because the covariance function resembles an inverse-distance weighting function. It was also independently studied by Jordan et al. (1981).
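As a small numerical illustration of the power-law attenuation in Kaula's rule, Eq. (79), the following sketch evaluates the rule at a few degrees. The values of `GM` and `R` are assumed here (common reference values) and are not taken from the text.

```python
GM = 3.986004418e14  # m^3/s^2, assumed geocentric gravitational constant
R = 6371000.0        # m, assumed mean Earth radius


def kaula_psd(n):
    """Kaula's rule, Eq. (79): PSD of the disturbing potential at degree n, in m^4/s^4."""
    if n < 1:
        raise ValueError("degree n must be a positive integer")
    return (GM / R) ** 2 * 1.0e-10 / n ** 4


for n in (2, 36, 360, 2190):
    print(n, kaula_psd(n))
```

Doubling the degree reduces the value by a factor of 16, which is the $n^{-4}$ attenuation.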

4.2 The Reciprocal Distance Model Consider the disturbing potential, $T$, as a stochastic process on each of two possibly different horizontal parallel planes or concentric spheres. Given a realization of $T$ on one plane (or sphere), its realization on the other plane (or sphere) is well defined by a solution to Laplace's equation, provided both surfaces are on or outside the Earth's surface (approximated as a plane or sphere). The reciprocal distance covariance model between $T$ on one plane and $T$ on the other is given by

$$\phi_{TT}(s; z_1, z_2) = \frac{\sigma^2}{\sqrt{\alpha^2 s^2 + \left(1 + \alpha (z_1 + z_2)\right)^2}}, \tag{80}$$

where, with Eq. (19), $s = \sqrt{s_1^2 + s_2^2}$; $z_1$, $z_2$ are heights of the two planes; and $\sigma^2$, $\alpha$ are parameters. The Fourier transform, or the PSD, is given by

$$\Phi_{TT}(f; z_1, z_2) = \frac{\sigma^2}{\alpha f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha)}, \qquad f \ne 0. \tag{81}$$

For spheres with radii, $r_1 \ge R$ and $r_2 \ge R$, the spherical covariance model is

$$\phi_{TT}(\psi; r_1, r_2) = \frac{\sigma^2 (1 - \rho_0)}{\rho_0}\, \frac{\rho}{\sqrt{1 + \rho^2 - 2\rho\cos\psi}}, \tag{82}$$

where $\psi$ is given by Eq. (7), $\rho_0 = (R_0/R)^2$ and $\sigma^2$ are parameters, and $\rho = R_0^2/(r_1 r_2)$. The Legendre transform, or PSD, is given by

$$(\Phi_{TT})_n = \frac{\sigma^2 (1 - \rho_0)}{(2n + 1)\,\rho_0}\, \rho^{\,n+1}. \tag{83}$$


In all cases, the heights (or radii) refer to fixed surfaces that define the spatial domain of the corresponding stochastic process. Since we allow $z_1 \ne z_2$ or $r_1 \ne r_2$, the models, Eqs. (80) and (82), technically are cross covariances between two different (but related) processes; and Eqs. (81) and (83) are cross PSDs. The equivalence of the models, Eqs. (80) and (82), as the spherical surface approaches a plane, is established by identifying $z_1 = r_1 - R$, $z_2 = r_2 - R$ and $s^2 \approx 2R^2 (1 - \cos\psi)$, from which it can be shown (Moritz 1980, p. 183) that

$$\frac{1}{\sqrt{1 + \rho^2 - 2\rho\cos\psi}} \approx \frac{R}{\sqrt{s^2 + (1/\alpha + z_1 + z_2)^2}}, \tag{84}$$

where $1/\alpha = 2(R - R_0)$ and terms of order $(R - R_0)/R$ are neglected. The variance parameter, $\sigma^2$, is the same in both versions of the model. It is noted that this model, besides having analytic forms in both the space and frequency domains, is isotropic, depending only on the horizontal distance. Moreover, it correctly incorporates the harmonic extension for the potential at different levels. It is also positive definite since the transform is positive for all frequencies. The analytic forms permit exact propagation of covariances as elaborated in Sect. 2.4. Since many applications involving the stochastic interpretation of the field nowadays are more local than global, only the (easier) planar propagation is given here (Appendix A) up to second-order derivatives. The covariance propagation of derivatives for similar spherical models was developed by Tscherning (1976). Note that the covariances of the horizontal derivatives are not isotropic. One further useful feature of the reciprocal distance model is that it possesses analytic forms for hybrid PSD/covariance functions, those that give the PSD in one dimension and the covariance in the other:

$$S_{TT}(f_1; s_2; z_1, z_2) = \int_{-\infty}^{\infty} \Phi_{TT}(f_1, f_2; z_1, z_2)\, e^{i 2\pi f_2 s_2}\, df_2 = \int_{-\infty}^{\infty} \phi_{TT}(s_1, s_2; z_1, z_2)\, e^{-i 2\pi f_1 s_1}\, ds_1. \tag{85}$$

The first integral transforms the PSD to the covariance in the second variable, while the second equivalent integral transforms the covariance function to the frequency domain in the first variable. When a process is given only on a single profile (e.g., along a data track), one may wish to model its along-track PSD, which is the hybrid PSD/covariance function with $s_2 = 0$. Appendix B gives the corresponding analytic forms for the (planar) reciprocal distance model.
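The planar and spherical forms of the model, Eqs. (80) to (83), and their near-equivalence, Eq. (84), can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the numerical values (mean radius, a depth $R - R_0$ of 5 km, the variance, the separation, and the heights) are chosen only for illustration.

```python
import math


def cov_planar(s, z1, z2, sigma2, alpha):
    """Planar reciprocal distance covariance, Eq. (80)."""
    return sigma2 / math.sqrt(alpha**2 * s**2 + (1.0 + alpha * (z1 + z2)) ** 2)


def psd_planar(f, z1, z2, sigma2, alpha):
    """Planar reciprocal distance PSD, Eq. (81), valid for f > 0."""
    return sigma2 / (alpha * f) * math.exp(-2.0 * math.pi * f * (z1 + z2 + 1.0 / alpha))


def cov_spherical(psi, r1, r2, sigma2, R, R0):
    """Spherical reciprocal distance covariance, Eq. (82)."""
    rho0 = (R0 / R) ** 2
    rho = R0**2 / (r1 * r2)
    return sigma2 * (1.0 - rho0) / rho0 * rho / math.sqrt(
        1.0 + rho**2 - 2.0 * rho * math.cos(psi)
    )


# Illustrative (assumed) values, not taken from the text.
R = 6371000.0                 # m, mean Earth radius
depth = 5000.0                # m, R - R0
R0 = R - depth
alpha = 1.0 / (2.0 * depth)   # from 1/alpha = 2 (R - R0)
sigma2 = 100.0                # m^4/s^4
s, z1, z2 = 20000.0, 0.0, 1000.0

planar = cov_planar(s, z1, z2, sigma2, alpha)
spherical = cov_spherical(s / R, R + z1, R + z2, sigma2, R, R0)
print(planar, spherical, abs(planar - spherical) / planar)
```

The two values should differ only by terms of order $(R - R_0)/R$, consistent with the approximation stated for Eq. (84).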

4.3 Parameter Determination The reciprocal distance PSD model, Eq. (81), clearly does not have the form of a power law, but it nevertheless serves in modeling the PSD of the gravitational field when a number of these models are combined linearly:


$$\Phi_{TT}(f; z_1, z_2) = \sum_{j=1}^{J} \frac{\sigma_j^2}{\alpha_j f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha_j)}. \tag{86}$$

[Figure 3: log-log plot of the PSD versus frequency, $f$, showing the power-law model and a reciprocal distance model component.]

Fig. 3 Fitting a reciprocal distance model component to a power-law PSD

The parameters, $\alpha_j$, $\sigma_j^2$, are chosen appropriately to yield a power-law attenuation of the PSD. This selection is based on the empirical PSD of data that in the case of the gravitational field are usually gravity anomalies, $\Delta g \approx -\partial T/\partial r$, on the Earth's surface ($z_1 = z_2 = 0$). Multiplying the PSD for the disturbing potential by $(2\pi f)^2$, we consider reciprocal distance components of the PSD of the gravity anomaly (from Eq. (86)) in the form

$$\Phi(f) = A f e^{-B f}, \tag{87}$$

where $A = (2\pi)^2 \sigma^2/\alpha$ and $B = 2\pi/\alpha$ are constants to be determined such that the model is tangent to the empirical PSD. Here we assume that the latter is a power-law model (see Fig. 3),

$$p(f) = C f^{-\beta}, \tag{88}$$

where the constants, $C$ and $\beta$, are given. In terms of natural logarithms, the reciprocal distance PSD component is $\ln(\Phi(f)) = \ln(A) + \omega - B e^{\omega}$, where $\omega = \ln f$; and its slope is $d(\ln(\Phi(f)))/d\omega = 1 - B e^{\omega}$. The slope should be $-\beta$, which yields $B e^{\omega} = 1 + \beta$. Also, the reciprocal distance and power-law models should intersect, say, at $f = \bar{f}$, which requires $\ln(C) - \beta\omega = \ln(A) + \omega - B e^{\omega}$. Solving for $A$ and $B$, we find:

$$A = C \left(\frac{e}{\bar{f}}\right)^{1+\beta}, \qquad B = \frac{1+\beta}{\bar{f}}. \tag{89}$$

With a judicious selection of discrete frequencies, $\bar{f}_j$, a number of PSD components may be combined to approximate the power law over a specified domain. Due to the overlap of the component summands in Eq. (86), an appropriate scale factor may still be required for a proper fit.
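The tangency conditions leading to Eq. (89) can be sketched in a few lines. The function names and the numerical values of $C$, $\beta$, and $\bar{f}$ below are hypothetical and only serve to show that the fitted component touches the power law at $\bar{f}$ and lies below it elsewhere.

```python
import math


def fit_component(C, beta, f_bar):
    """Constants A and B of Eq. (87) from the tangency conditions, Eq. (89)."""
    A = C * (math.e / f_bar) ** (1.0 + beta)
    B = (1.0 + beta) / f_bar
    return A, B


def component_psd(f, A, B):
    """Reciprocal distance PSD component of the gravity anomaly, Eq. (87)."""
    return A * f * math.exp(-B * f)


def power_law(f, C, beta):
    """Power-law PSD model, Eq. (88)."""
    return C * f ** (-beta)


def model_parameters(A, B):
    """Recover (sigma^2, alpha) from A = (2 pi)^2 sigma^2 / alpha and B = 2 pi / alpha."""
    alpha = 2.0 * math.pi / B
    sigma2 = A * alpha / (2.0 * math.pi) ** 2
    return sigma2, alpha


# Illustrative (assumed) values.
C, beta, f_bar = 1.0e-6, 3.0, 1.0e-4
A, B = fit_component(C, beta, f_bar)
for f in (0.5 * f_bar, f_bar, 2.0 * f_bar):
    print(f, component_psd(f, A, B), power_law(f, C, beta))
print(model_parameters(A, B))
```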


Table 1 Reciprocal distance PSD parameters

              Area 1                              Area 2
 j    σ_j² (m⁴/s⁴)     α_j (1/m)         σ_j² (m⁴/s⁴)     α_j (1/m)
 1    10^5             3 × 10^-7         10^5             3 × 10^-7
 2    3,300            9.69 × 10^-7      3,300            9.69 × 10^-7
 3    650              4.76 × 10^-6      640              7.56 × 10^-6
 4    162              8.94 × 10^-6      951              9.73 × 10^-6
 5    10.2             2.00 × 10^-5      79.9             2.18 × 10^-5
 6    0.641            4.48 × 10^-5      6.71             4.88 × 10^-5
 7    4.02 × 10^-2     1.00 × 10^-4      0.564            1.09 × 10^-4
 8    2.53 × 10^-3     2.25 × 10^-4      4.74 × 10^-2     2.44 × 10^-4
 9    1.59 × 10^-4     5.03 × 10^-4      3.98 × 10^-3     5.47 × 10^-4
10    9.97 × 10^-6     1.13 × 10^-3      3.34 × 10^-4     1.23 × 10^-3
11    6.26 × 10^-7     2.52 × 10^-3      2.81 × 10^-5     2.74 × 10^-3
12    3.93 × 10^-8     5.64 × 10^-3      2.36 × 10^-6     6.14 × 10^-3
13    2.47 × 10^-9     1.26 × 10^-2      1.98 × 10^-7     1.37 × 10^-2
14    1.55 × 10^-10    2.83 × 10^-2      1.66 × 10^-8     3.08 × 10^-2
15    9.74 × 10^-12    6.33 × 10^-2      1.40 × 10^-9     6.89 × 10^-2

[Figure 4: log-log plot of the gravity anomaly PSD, in mGal²/(cy/m)², versus frequency, in cy/m, showing EGM2008, Kaula's rule, and the empirical and RD model PSDs for area 1 and area 2.]

Fig. 4 Comparison of empirical and reciprocal distance (RD) model PSDs for the gravity anomaly in the two areas shown in Fig. 2

This modeling technique was applied to the two regional PSDs shown in Fig. 2. Additional low-frequency components were added to model the field at frequencies, $f < 10^{-5}$ cy/m. Table 1 lists the reciprocal distance parameters for each of the regions in Fig. 2; and Fig. 4 shows various true and corresponding modeled PSDs for the gravity anomaly. The parameters may be used to define consistently the cross PSDs and cross covariances of any of the derivatives of the disturbing potential in the respective regions.
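The combination of several tangent components with a common scale factor, as described in Sect. 4.3, can be sketched as follows. This is not the procedure used for Table 1; the power-law constants, the geometric spacing of the tangent frequencies, and the reference frequency used to set the scale factor are all assumptions made for illustration.

```python
import math


def power_law(f, C, beta):
    """Power-law PSD model, Eq. (88)."""
    return C * f ** (-beta)


def combined_psd(f, C, beta, f_bars, scale=1.0):
    """Scaled sum of components A_j f exp(-B_j f), each tangent to the
    power law at f_bars[j] according to Eq. (89)."""
    total = 0.0
    for f_bar in f_bars:
        A = C * (math.e / f_bar) ** (1.0 + beta)
        B = (1.0 + beta) / f_bar
        total += A * f * math.exp(-B * f)
    return scale * total


# Illustrative (assumed) values.
C, beta = 1.0, 3.0
f_bars = [1.0e-5 * 2.24**j for j in range(10)]  # geometric spacing
f_ref = 3.0e-4                                  # frequency used to fix the scale
scale = power_law(f_ref, C, beta) / combined_psd(f_ref, C, beta, f_bars)

print("scale factor:", scale)
for f in (1.0e-4, 3.0e-4, 1.0e-3):
    print(f, combined_psd(f, C, beta, f_bars, scale) / power_law(f, C, beta))
```

Because neighboring components overlap, the unscaled sum exceeds the power law, so the scale factor comes out below one; away from the ends of the component range the scaled sum follows the power law closely.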

5 Summary and Future Directions The preceding sections have developed the theory for correlation functions on the sphere and plane for deterministic functions and stochastic processes using standard spherical harmonic (Legendre) and Fourier basis functions. Assuming an ergodic (hence stationary) stochastic process, its covariance function (with zero means) is essentially the correlation function defined for a particular realization of the process. These concepts were applied to the disturbing gravitational


potential. Based on the fractal nature of Earth’s topography and its relationship to the gravitational field, the power spectral density (PSD) of the disturbing potential was shown to behave like a power law at higher spatial frequencies. This provides the basis for the definition and determination of an analytic model for the covariance function that offers mutually consistent cross covariances (and PSDs) among its various derivatives, including vertical derivatives. Once established for a particular region, such models have numerous applications from least-squares collocation (and the related kriging) to more mundane procedures such as interpolation and filtering. Furthermore, they are ideally suited to generating a synthetic field for use in simulation studies of potential theory, as well as Monte Carlo statistical analyses in estimation theory. The details of such applications are beyond the present scope but are readily formulated. The developed reciprocal distance model is quite versatile when combined linearly using appropriate parameters and is able to represent the PSD of the disturbing potential (and any of its derivatives) with different spectral amplitudes depending on the region in question. Two examples are provided in which a combination of 15 such reciprocal distance components is fitted accurately to the empirical gravity anomaly PSD in either smooth or rough regional fields. Although limited to some extent by being isotropic (for the vertical derivatives, only), the resulting models are completely analytic in both spatial and frequency domains; and thus, the computed cross covariances and cross PSDs of all derivatives of the disturbing potential are mutually consistent, which is particularly important in estimation and error analysis studies.
The global representation of the gravitational field in terms of spherical harmonics has many applications that are, in fact, becoming more and more local as the computational capability increases and models are expanded to higher maximum degree, $n_{\max}$. The most recent global model, EGM2008, includes coefficients complete up to $n_{\max} = 2190$, and the historical trend has been to develop models with increasingly high global resolution as more and more globally distributed data become available. However, such high-degree models also face the potential problem of divergence near the Earth’s surface (below the sphere of convergence) and must always submit to the justifiable criticism that they are inefficient local representations of the field. In fact, the two PSDs presented here are based on the classical local approximation, the planar approximation, with traditional Fourier (sinusoidal) basis functions. Besides being limited by the planar approximation, the Fourier basis functions, in the strictest sense, still have global support for nonperiodic functions. However, there exists a vast recent development of local-support representations of the gravitational field using splines on the sphere, including tensor-product splines (e.g., Schumaker and Traas 1991), radial basis functions (Schreiner 1997; Freeden et al. 1998), and splines on sphere-like surfaces (Alfeld et al. 1996); see also Jekeli (2005). Representations of the gravitational field using these splines, particularly the radial basis functions and the Bernstein-Bézier polynomials used by Alfeld et al., depend strictly on local data, and the models can easily be modified by the addition or modification of individual data. Thus, they also do not depend on regularly distributed data, as do the spherical harmonic and Fourier series representations.
On the other hand, these global support models based on regular data distributions lead to particularly straightforward and mutually consistent transformations among the PSDs of all derivatives of the gravitational potential, which greatly facilitates the modeling of their correlations. For irregularly scattered data, the splines lend themselves to a multiresolution representation of the field on the sphere, analogous to wavelets in Cartesian space. This has been developed for the tensor-product splines by Lyche and Schumaker (2000) and for the radial basis splines by Freeden et al. (1998); see also Fengler et al. (2004). For the Bernstein-Bézier polynomial


splines, a multiresolution model is also possible. How these newer constructive approximations can be adapted to correlation modeling with mutually consistent transformations (propagation of covariances and analogous PSDs) among all derivatives of the gravitational potential represents a topic for future development and analysis.

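Appendix A The planar reciprocal distance model, Eq. (80), for the covariance function of the disturbing potential is repeated here for convenience with certain abbreviations:

$$\phi_{T,T}(s_1, s_2; z_1, z_2) = \frac{\sigma^2}{M^{1/2}}, \tag{90}$$

where

$$M = \beta^2 + \alpha^2 s^2, \quad \beta = 1 + \alpha (z_1 + z_2), \quad s^2 = s_1^2 + s_2^2, \quad s_1 = x_1 - x_1', \quad s_2 = x_2 - x_2'. \tag{91}$$

The primed coordinates refer to the first subscripted function in the covariance, and the unprimed coordinates refer to the second function. The altitude levels for these functions are $z_1$ and $z_2$, respectively. Derivatives of the disturbing potential with respect to the coordinates are denoted $\partial T/\partial x_1 = T_{x_1}$, $\partial^2 T/(\partial x_1 \partial z) = T_{x_1 z}$, etc. The following expressions for the cross covariances are derived by repeatedly using Eqs. (42) and (54). The arguments for the resulting function are omitted but are the same as in Eq. (90):

$$\phi_{T_{x_1},T} = \frac{\sigma^2 \alpha^2 s_1}{M^{3/2}} = -\phi_{T,T_{x_1}} \tag{92}$$

$$\phi_{T_{x_2},T} = \frac{\sigma^2 \alpha^2 s_2}{M^{3/2}} = -\phi_{T,T_{x_2}} \tag{93}$$

$$\phi_{T_z,T} = -\frac{\sigma^2 \alpha \beta}{M^{3/2}} = \phi_{T,T_z} \tag{94}$$

$$\phi_{T_{x_1},T_{x_1}} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left(M - 3\alpha^2 s_1^2\right) \tag{95}$$

$$\phi_{T_{x_1},T_{x_2}} = -\frac{3\sigma^2 \alpha^4 s_1 s_2}{M^{5/2}} = \phi_{T_{x_2},T_{x_1}} \tag{96}$$

$$\phi_{T_{x_1},T_z} = -\frac{3\sigma^2 \alpha^3 \beta s_1}{M^{5/2}} = -\phi_{T_z,T_{x_1}} \tag{97}$$

$$\phi_{T_{x_2},T_{x_2}} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left(M - 3\alpha^2 s_2^2\right) \tag{98}$$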


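$$\phi_{T_{x_2},T_z} = -\frac{3\sigma^2 \alpha^3 \beta s_2}{M^{5/2}} = -\phi_{T_z,T_{x_2}} \tag{99}$$

$$\phi_{T_z,T_z} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left(2M - 3\alpha^2 s^2\right) = \phi_{T_{x_1},T_{x_1}} + \phi_{T_{x_2},T_{x_2}} \tag{100}$$

$$\phi_{T,T_{x_1 x_1}} = -\phi_{T_{x_1},T_{x_1}} = \phi_{T_{x_1 x_1},T} \tag{101}$$

$$\phi_{T,T_{x_1 x_2}} = -\phi_{T_{x_1},T_{x_2}} = \phi_{T_{x_1 x_2},T} \tag{102}$$

$$\phi_{T,T_{x_1 z}} = -\phi_{T_{x_1},T_z} = -\phi_{T_{x_1 z},T} \tag{103}$$

$$\phi_{T,T_{x_2 x_2}} = -\phi_{T_{x_2},T_{x_2}} = \phi_{T_{x_2 x_2},T} \tag{104}$$

$$\phi_{T,T_{x_2 z}} = -\phi_{T_{x_2},T_z} = -\phi_{T_{x_2 z},T} \tag{105}$$

$$\phi_{T,T_{zz}} = \phi_{T_z,T_z} = \phi_{T_{zz},T} \tag{106}$$

$$\phi_{T_{x_1},T_{x_1 x_1}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left(-3M + 5\alpha^2 s_1^2\right) = -\phi_{T_{x_1 x_1},T_{x_1}} \tag{107}$$

$$\phi_{T_{x_1},T_{x_1 x_2}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left(-M + 5\alpha^2 s_1^2\right) = -\phi_{T_{x_1 x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1 x_1}} = -\phi_{T_{x_1 x_1},T_{x_2}} \tag{108}$$

$$\phi_{T_{x_1},T_{x_1 z}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left(-M + 5\alpha^2 s_1^2\right) = \phi_{T_{x_1 z},T_{x_1}} = -\phi_{T_z,T_{x_1 x_1}} = -\phi_{T_{x_1 x_1},T_z} \tag{109}$$

$$\phi_{T_{x_1},T_{x_2 x_2}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left(-M + 5\alpha^2 s_2^2\right) = -\phi_{T_{x_2 x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1 x_2}} = -\phi_{T_{x_1 x_2},T_{x_2}} \tag{110}$$

$$\phi_{T_{x_1},T_{x_2 z}} = \frac{15\sigma^2 \alpha^5 \beta s_1 s_2}{M^{7/2}} = \phi_{T_{x_2 z},T_{x_1}} = -\phi_{T_z,T_{x_1 x_2}} = \phi_{T_{x_2},T_{x_1 z}} = -\phi_{T_{x_1 x_2},T_z} = \phi_{T_{x_1 z},T_{x_2}} \tag{111}$$

$$\phi_{T_{x_1},T_{zz}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left(4\beta^2 - \alpha^2 s^2\right) = -\phi_{T_{zz},T_{x_1}} = -\phi_{T_z,T_{x_1 z}} = \phi_{T_{x_1 z},T_z} \tag{112}$$

$$\phi_{T_{x_2},T_{x_2 x_2}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left(-3M + 5\alpha^2 s_2^2\right) = -\phi_{T_{x_2 x_2},T_{x_2}} \tag{113}$$

$$\phi_{T_{x_2},T_{x_2 z}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left(-M + 5\alpha^2 s_2^2\right) = \phi_{T_{x_2 z},T_{x_2}} = -\phi_{T_z,T_{x_2 x_2}} = -\phi_{T_{x_2 x_2},T_z} \tag{114}$$

$$\phi_{T_{x_2},T_{zz}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left(4\beta^2 - \alpha^2 s^2\right) = -\phi_{T_{zz},T_{x_2}} = -\phi_{T_z,T_{x_2 z}} = \phi_{T_{x_2 z},T_z} \tag{115}$$

$$\phi_{T_z,T_{zz}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left(-2M + 5\alpha^2 s^2\right) = \phi_{T_{zz},T_z} \tag{116}$$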

Page 28 of 34

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-2 © Springer-Verlag London 2014

 3 2 ˛ 4  2 2 2 4 4  30M˛ s C 35˛ s 3M 1 1 M 9=2

(117)

Tx1 x1 ;Tx1 x2 D

 15 2 ˛ 6 s1 s2  2 2 s 3M C 7˛ 1 D Tx1 x2 ;Tx1 x1 M 9=2

(118)

Tx1 x1 ;Tx1 z D

 15 2 ˛ 5 ˇs1  3M C 7˛ 2 s12 D Tx1 z ;Tx1 x1 9=2 M

(119)

Tx1 x1 ;Tx1 x1 D

Tx1 x1 ;Tx2 x2 Tx1 x1 ;Tx2 z D

 3 2 ˛ 4  2 2 2 2 2 D  5M˛ s C 35s s M D Tx2 x2 ;Tx1 x1 D Tx1 x2 ;Tx1 x2 1 2 M 9=2

 15 2 ˛ 5 ˇs2  M C 7˛ 2 s12 D Tx2 z ;Tx1 x1 D Tx1 z ;Tx1 x2 D Tx1 x2 ;Tx1 z 9=2 M

(121)

 3 2 ˛ 4  2 2 2 2 2 2 4M D Tzz ;Tx1 x1 D Tx1 z ;Tx1 z C 5M˛ s C 35ˇ ˛ s 2 1 M 9=2

(122)

Tx1 x1 ;Tzz D

Tx1 x2 ;Tx2 x2 D Tx1 x2 ;Tx2 z D

 15 2 ˛ 6 s1 s2  2 2 3M C 7˛ s 2 D Tx2 x2 ;Tx1 x2 M 9=2

(123)

 15 2 ˛ 5 ˇs1  M C 7˛ 2 s22 D Tx2 z ;Tx1 x2 D Tx1 z ;Tx2 x2 D Tx2 x2 ;Tx1 z 9=2 M

(124)

 15 2 ˛ 5 ˇs1  2 D 3M  7ˇ D Tzz ;Tx1 z M 9=2

(125)

Tx1 z ;Tzz

 3 2 ˛ 4  2 2 2 4 4  30M˛ s C 35˛ s 3M 2 2 M 9=2

(126)

 15 2 ˛ 5 ˇs2  2 2 s 3M C 7˛ D Tx2 z ;Tx2 x2 2 M 9=2

(127)

Tx2 x2 ;Tx2 x2 D Tx2 x2 ;Tx2 z D Tx2 x2 ;Tzz

(120)

 3 2 ˛ 4  2 2 2 2 2 2 D C 5M˛ s C 35ˇ ˛ s 4M D Tzz ;Tx2 x2 D Tx2 z ;Tx2 z 1 2 M 9=2  15 2 ˛ 5 ˇs2  3M  7ˇ 2 D Tzz ;Tx2 z 9=2 M

(129)

 3 2 ˛ 4  4 2 2 2 4 4  24ˇ ˛ s C 3˛ s 8ˇ D Tx1 z ;Tx1 z C Tx2 z ;Tx2 z M 9=2

(130)

Tx2 z ;Tzz D Tzz ;Tzz D

(128)
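A simple way to check closed-form expressions such as those in Appendix A is to compare them with finite differences of Eq. (90). The sketch below does this for Eqs. (92) and (100) only; the function names, step sizes, and parameter values are hypothetical choices for illustration.

```python
import math


def cov_TT(s1, s2, z1, z2, sigma2, alpha):
    """Eq. (90): covariance of the disturbing potential."""
    beta = 1.0 + alpha * (z1 + z2)
    M = beta**2 + alpha**2 * (s1**2 + s2**2)
    return sigma2 / math.sqrt(M)


def cov_Tx1_T(s1, s2, z1, z2, sigma2, alpha):
    """Eq. (92): cross covariance of T_x1 (first function) and T."""
    beta = 1.0 + alpha * (z1 + z2)
    M = beta**2 + alpha**2 * (s1**2 + s2**2)
    return sigma2 * alpha**2 * s1 / M**1.5


def cov_Tz_Tz(s1, s2, z1, z2, sigma2, alpha):
    """Eq. (100): covariance of the vertical derivative T_z."""
    beta = 1.0 + alpha * (z1 + z2)
    s_sq = s1**2 + s2**2
    M = beta**2 + alpha**2 * s_sq
    return sigma2 * alpha**2 * (2.0 * M - 3.0 * alpha**2 * s_sq) / M**2.5


# Illustrative (assumed) values.
sigma2, alpha = 100.0, 1.0e-4
s1, s2, z1, z2 = 5000.0, 3000.0, 0.0, 1000.0

# The derivative with respect to the primed coordinate x1' equals minus the
# derivative with respect to s1 = x1 - x1' (central difference, step h).
h = 1.0
num_92 = -(cov_TT(s1 + h, s2, z1, z2, sigma2, alpha)
           - cov_TT(s1 - h, s2, z1, z2, sigma2, alpha)) / (2.0 * h)

# Mixed second difference with respect to z1 and z2.
h = 10.0
num_100 = (cov_TT(s1, s2, z1 + h, z2 + h, sigma2, alpha)
           - cov_TT(s1, s2, z1 + h, z2 - h, sigma2, alpha)
           - cov_TT(s1, s2, z1 - h, z2 + h, sigma2, alpha)
           + cov_TT(s1, s2, z1 - h, z2 - h, sigma2, alpha)) / (4.0 * h**2)

print(num_92, cov_Tx1_T(s1, s2, z1, z2, sigma2, alpha))
print(num_100, cov_Tz_Tz(s1, s2, z1, z2, sigma2, alpha))
```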


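Appendix B The hybrid PSD/covariance function of the disturbing potential, given by Eq. (85), can be shown to be

$$S_{T,T}(f_1; s_2; z_1, z_2) = \frac{2\sigma^2}{\alpha} K_0(2\pi f_1 d), \tag{131}$$

where $K_0$ is the modified Bessel function of the second kind and zero order, and

$$d = \sqrt{\frac{\beta^2}{\alpha^2} + s_2^2}. \tag{132}$$

It is the along-track PSD if $s_2 = 0$. In the following hybrid PSD/covariances of the derivatives of $T$, the modified Bessel function of the second kind and first order, $K_1$, also appears. Both Bessel functions always have the argument $2\pi f_1 d$, and the arguments of the hybrid PSD/covariances are the same as in Eq. (131).

$$S_{T,T_{x_1}} = i 2\pi f_1 S_{T,T} = -S_{T_{x_1},T} \tag{133}$$

$$S_{T,T_{x_2}} = -\frac{2\sigma^2 (2\pi f_1) s_2}{\alpha d} K_1 = -S_{T_{x_2},T} \tag{134}$$

$$S_{T,T_z} = -\frac{2\sigma^2 (2\pi f_1) \beta}{\alpha^2 d} K_1 = S_{T_z,T} \tag{135}$$

$$S_{T_{x_1},T_{x_1}} = (2\pi f_1)^2 S_{T,T} \tag{136}$$

$$S_{T_{x_1},T_{x_2}} = i 2\pi f_1 S_{T_{x_2},T} = S_{T_{x_2},T_{x_1}} \tag{137}$$

$$S_{T_{x_1},T_z} = -i 2\pi f_1 S_{T,T_z} = -S_{T_z,T_{x_1}} \tag{138}$$

$$S_{T_{x_2},T_{x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d} \left( \left(1 - \frac{2 s_2^2}{d^2}\right) K_1 - 2\pi f_1 \frac{s_2^2}{d} K_0 \right) \tag{139}$$

$$S_{T_{x_2},T_z} = -\frac{2\sigma^2 (2\pi f_1) \beta s_2}{\alpha^2 d^3} \left(2 K_1 + 2\pi f_1 d\, K_0\right) = -S_{T_z,T_{x_2}} \tag{140}$$

$$S_{T_z,T_z} = S_{T_{x_1},T_{x_1}} + S_{T_{x_2},T_{x_2}} \tag{141}$$

$$S_{T,T_{x_1 x_1}} = -S_{T_{x_1},T_{x_1}} = S_{T_{x_1 x_1},T} \tag{142}$$

$$S_{T,T_{x_1 x_2}} = -S_{T_{x_1},T_{x_2}} = S_{T_{x_1 x_2},T} \tag{143}$$

$$S_{T,T_{x_1 z}} = -S_{T_{x_1},T_z} = -S_{T_{x_1 z},T} \tag{144}$$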


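$$S_{T,T_{x_2 x_2}} = -S_{T_{x_2},T_{x_2}} = S_{T_{x_2 x_2},T} \tag{145}$$

$$S_{T,T_{x_2 z}} = -S_{T_{x_2},T_z} = -S_{T_{x_2 z},T} \tag{146}$$

$$S_{T,T_{zz}} = S_{T_z,T_z} = S_{T_{zz},T} \tag{147}$$

$$S_{T_{x_1},T_{x_1 x_1}} = i (2\pi f_1)^3 S_{T,T} = -S_{T_{x_1 x_1},T_{x_1}} \tag{148}$$

$$S_{T_{x_1},T_{x_1 x_2}} = (2\pi f_1)^2 S_{T,T_{x_2}} = -S_{T_{x_1 x_2},T_{x_1}} = S_{T_{x_2},T_{x_1 x_1}} = -S_{T_{x_1 x_1},T_{x_2}} \tag{149}$$

$$S_{T_{x_1},T_{x_1 z}} = (2\pi f_1)^2 S_{T,T_z} = S_{T_{x_1 z},T_{x_1}} = -S_{T_z,T_{x_1 x_1}} = -S_{T_{x_1 x_1},T_z} \tag{150}$$

$$S_{T_{x_1},T_{x_2 x_2}} = i 2\pi f_1 S_{T_{x_2},T_{x_2}} = -S_{T_{x_2 x_2},T_{x_1}} = S_{T_{x_2},T_{x_1 x_2}} = -S_{T_{x_1 x_2},T_{x_2}} \tag{151}$$

$$S_{T_{x_1},T_{x_2 z}} = i 2\pi f_1 S_{T_{x_2},T_z} = S_{T_{x_2 z},T_{x_1}} = -S_{T_z,T_{x_1 x_2}} = S_{T_{x_2},T_{x_1 z}} = -S_{T_{x_1 x_2},T_z} = S_{T_{x_1 z},T_{x_2}} \tag{152}$$

$$S_{T_{x_1},T_{zz}} = -i 2\pi f_1 S_{T_z,T_z} = -S_{T_{zz},T_{x_1}} = -S_{T_z,T_{x_1 z}} = S_{T_{x_1 z},T_z} \tag{153}$$

$$S_{T_{x_2},T_{x_2 x_2}} = -\frac{2\sigma^2 (2\pi f_1) s_2}{\alpha d^3} \left( \left(6 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right) K_1 + 2\pi f_1 d \left(3 - \frac{4 s_2^2}{d^2}\right) K_0 \right) = -S_{T_{x_2 x_2},T_{x_2}} \tag{154}$$

$$S_{T_{x_2},T_{x_2 z}} = -\frac{2\sigma^2 (2\pi f_1) \beta}{\alpha^2 d^3} \left( \left(2 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right) K_1 + 2\pi f_1 d \left(1 - \frac{4 s_2^2}{d^2}\right) K_0 \right) = S_{T_{x_2 z},T_{x_2}} = -S_{T_z,T_{x_2 x_2}} = -S_{T_{x_2 x_2},T_z} \tag{155}$$

$$S_{T_{x_2},T_{zz}} = -S_{T_{x_2},T_{x_1 x_1}} - S_{T_{x_2},T_{x_2 x_2}} = -S_{T_{zz},T_{x_2}} = -S_{T_z,T_{x_2 z}} = S_{T_{x_2 z},T_z} \tag{156}$$

$$S_{T_z,T_{zz}} = S_{T_{x_1},T_{x_1 z}} + S_{T_{x_2},T_{x_2 z}} = S_{T_{zz},T_z} \tag{157}$$

$$S_{T_{x_1 x_1},T_{x_1 x_1}} = (2\pi f_1)^4 S_{T,T} \tag{158}$$

$$S_{T_{x_1 x_1},T_{x_1 x_2}} = i (2\pi f_1)^3 S_{T_{x_2},T} = S_{T_{x_1 x_2},T_{x_1 x_1}} \tag{159}$$

$$S_{T_{x_1 x_1},T_{x_1 z}} = -i (2\pi f_1)^3 S_{T,T_z} = -S_{T_{x_1 z},T_{x_1 x_1}} \tag{160}$$

$$S_{T_{x_1 x_1},T_{x_2 x_2}} = (2\pi f_1)^2 S_{T_{x_2},T_{x_2}} = S_{T_{x_2 x_2},T_{x_1 x_1}} = S_{T_{x_1 x_2},T_{x_1 x_2}} \tag{161}$$

$$S_{T_{x_1 x_1},T_{x_2 z}} = (2\pi f_1)^2 S_{T_{x_2},T_z} = -S_{T_{x_2 z},T_{x_1 x_1}} = -S_{T_{x_1 z},T_{x_1 x_2}} = S_{T_{x_1 x_2},T_{x_1 z}} \tag{162}$$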


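$$S_{T_{x_1 x_1},T_{zz}} = -(2\pi f_1)^2 S_{T_z,T_z} = S_{T_{zz},T_{x_1 x_1}} = -S_{T_{x_1 z},T_{x_1 z}} \tag{163}$$

$$S_{T_{x_1 x_2},T_{x_2 x_2}} = i 2\pi f_1 S_{T_{x_2 x_2},T_{x_2}} = S_{T_{x_2 x_2},T_{x_1 x_2}} \tag{164}$$

$$S_{T_{x_1 x_2},T_{x_2 z}} = -i 2\pi f_1 S_{T_{x_2},T_{x_2 z}} = -S_{T_{x_2 z},T_{x_1 x_2}} = -S_{T_{x_1 z},T_{x_2 x_2}} = S_{T_{x_2 x_2},T_{x_1 z}} \tag{165}$$

$$S_{T_{x_1 x_2},T_{zz}} = -i 2\pi f_1 S_{T_{x_2 z},T_z} = S_{T_{zz},T_{x_1 x_2}} = -S_{T_{x_1 z},T_{x_2 z}} = -S_{T_{x_2 z},T_{x_1 z}} \tag{166}$$

$$S_{T_{x_1 z},T_{zz}} = S_{T_{x_1 x_1},T_{x_1 z}} + S_{T_{x_2 x_2},T_{x_1 z}} = -S_{T_{zz},T_{x_1 z}} \tag{167}$$

$$S_{T_{x_2 x_2},T_{x_2 x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d^3} \left( 2\pi f_1 d \left(3 - \frac{24 s_2^2}{d^2} + \frac{24 s_2^4}{d^4} + (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2}\right) K_0 + 2 \left(3 - \frac{24 s_2^2}{d^2} - 3 (2\pi f_1 s_2)^2 + \frac{24 s_2^4}{d^4} + 4 (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2}\right) K_1 \right) \tag{168}$$

$$S_{T_{x_2 x_2},T_{x_2 z}} = -\frac{2\sigma^2 (2\pi f_1) \beta s_2}{\alpha^2 d^5} \left( 2\pi f_1 d \left(12 - \frac{24 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right) K_0 + \left(24 - \frac{48 s_2^2}{d^2} + 3 (2\pi f_1 d)^2 - 8 (2\pi f_1 s_2)^2\right) K_1 \right) = -S_{T_{x_2 z},T_{x_2 x_2}} \tag{169}$$

$$S_{T_{x_2 x_2},T_{zz}} = -S_{T_{x_1 x_1},T_{x_2 x_2}} - S_{T_{x_2 x_2},T_{x_2 x_2}} = S_{T_{zz},T_{x_2 x_2}} = -S_{T_{x_2 z},T_{x_2 z}} \tag{170}$$

$$S_{T_{x_2 z},T_{zz}} = S_{T_{x_1 x_1},T_{x_2 z}} + S_{T_{x_2 x_2},T_{x_2 z}} = -S_{T_{zz},T_{x_2 z}} \tag{171}$$

$$S_{T_{zz},T_{zz}} = S_{T_{x_1 z},T_{x_1 z}} + S_{T_{x_2 z},T_{x_2 z}} \tag{172}$$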
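The hybrid PSD/covariance of Eq. (131), and with it the along-track PSD for $s_2 = 0$, only requires the modified Bessel function $K_0$. The following sketch avoids external libraries by evaluating the standard integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ with the trapezoidal rule; the function names, the truncation rule for the integral, and the parameter values are assumptions made for illustration. A library routine for $K_0$ could be substituted where available.

```python
import math


def bessel_k0(x, steps=2000):
    """Modified Bessel function K0(x), x > 0, from the integral
    K0(x) = int_0^inf exp(-x cosh t) dt, using the trapezoidal rule."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Truncate where the integrand has dropped below exp(-40 - x).
    t_max = math.acosh(1.0 + 40.0 / x)
    h = t_max / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


def hybrid_psd(f1, s2, z1, z2, sigma2, alpha):
    """Hybrid PSD/covariance of the disturbing potential, Eqs. (131) and (132).
    With s2 = 0 this is the along-track PSD."""
    beta = 1.0 + alpha * (z1 + z2)
    d = math.sqrt(beta**2 / alpha**2 + s2**2)
    return 2.0 * sigma2 / alpha * bessel_k0(2.0 * math.pi * f1 * d)


# Illustrative (assumed) parameter values.
sigma2, alpha = 100.0, 1.0e-4
for f1 in (1.0e-5, 3.0e-5, 1.0e-4):
    print(f1, hybrid_psd(f1, 0.0, 0.0, 0.0, sigma2, alpha))
```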

References
Alfeld P, Neamtu M, Schumaker LL (1996) Fitting scattered data on sphere-like surfaces using spherical splines. J Comput Appl Math 73:5–43
Baranov V (1957) A new method for interpretation of aeromagnetic maps: pseudo-gravimetric anomalies. Geophysics 22:359–383
Brown RG (1983) Introduction to random signal analysis and Kalman filtering. Wiley, New York
de Coulon F (1986) Signal theory and processing. Artech House, Dedham
Fengler MJ, Freeden W, Michel V (2004) The Kaiserslautern multiscale geopotential model SWITCH-03 from orbit perturbations of the satellite CHAMP and its comparison to models EGM96, UCPH2002_02_05, EIGEN-1S and EIGEN-2. Geophys J Int 157:499–514
Forsberg R (1985) Gravity field terrain effect computations by FFT. Bull Géod 59(4):342–360
Forsberg R (1987) A new covariance model, for inertial gravimetry and gradiometry. J Geophys Res 92(B2):1305–1310
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere, with applications in geomathematics. Clarendon, Oxford
Heller WG, Jordan SK (1979) Attenuated white noise statistical gravity model. J Geophys Res 84(B9):4680–4688
Helmert FR (1884) Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie, vol 2. BD Teubner, Leipzig


Hofmann-Wellenhof B, Moritz H (2005) Physical geodesy. Springer, Berlin
Jeffreys H (1955) Two properties of spherical harmonics. Q J Mech Appl Math 8(4):448–451
Jekeli C (1991) The statistics of the Earth’s gravity field, revisited. Manuscr Geod 16(5):313–325
Jekeli C (2005) Spline representations of functions on a sphere for geopotential modeling. Report no. 475, Geodetic Science, Ohio State University, Columbus. http://www.geology.osu.edu/~jekeli.1/OSUReports/reports/report_475.pdf
Jordan SK (1972) Self-consistent statistical models for the gravity anomaly, vertical deflections, and the undulation of the geoid. J Geophys Res 77(20):3660–3669
Jordan SK, Moonan PJ, Weiss JD (1981) State-space models of gravity disturbance gradients. IEEE Trans Aerosp Electron Syst AES 17(5):610–619
Kaula WM (1966) Theory of satellite geodesy. Blaisdell, Waltham
Lauritzen SL (1973) The probabilistic background of some statistical methods in physical geodesy. Report no. 48, Geodaestik Institute, Copenhagen
Lyche T, Schumaker LL (2000) A multiresolution tensor spline method for fitting functions on the sphere. SIAM J Sci Comput 22(2):724–746
Mandelbrot B (1983) The fractal geometry of nature. Freeman, San Francisco
Marple SL (1987) Digital spectral analysis with applications. Prentice-Hall, Englewood Cliffs
Martinec Z (1998) Boundary-value problems for gravimetric determination of a precise geoid. Springer, Berlin
Maybeck PS (1979) Stochastic models, estimation, and control, vols I and II. Academic, New York
Milbert DG (1991) A family of covariance functions based on degree variance models and expressible by elliptic integrals. Manuscr Geod 16:155–167
Moritz H (1976) Covariance functions in least-squares collocation. Report no. 240, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1978) Statistical foundations of collocation. Report no. 272, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1980) Advanced physical geodesy. Abacus Press, Tunbridge Wells
Olea RA (1999) Geostatistics for engineers and earth scientists. Kluwer Academic, Boston
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012a) The development and evaluation of earth gravitational model (EGM2008). J Geophys Res 117:B04406. doi:10.1029/2011JB008916
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012b) Correction to “The development and evaluation of Earth Gravitational Model (EGM2008)”. J Geophys Res 118:2633. doi:10.1002/jgrb.50167
Priestley MB (1981) Spectral analysis and time series analysis. Academic, London
Rummel R, Yi W, Stummer C (2011) GOCE gravitational gradiometry. J Geod 85:777–790
Schreiner M (1997) Locally supported kernels for spherical spline interpolation. J Approx Theory 89:172–194
Schumaker LL, Traas C (1991) Fitting scattered data on sphere-like surfaces using tensor products of trigonometric and polynomial splines. Numer Math 60:133–144
Tscherning CC (1976) Covariance expressions for second and lower order derivatives of the anomalous potential. Report no. 225, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm
Tscherning CC, Rapp RH (1974) Closed covariance expressions for gravity anomalies, geoid undulations and deflections of the vertical implied by anomaly degree variance models. Report no. 208, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm


Turcotte DL (1987) A fractal interpretation of topography and geoid spectra on the Earth, Moon, Venus, and Mars. J Geophys Res 92(B4):E597–E601
Watts AB (2001) Isostasy and flexure of the lithosphere. Cambridge University Press, Cambridge


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-3 © Springer-Verlag Berlin Heidelberg 2014

Correlation Modeling of the Gravity Field in Classical Geodesy Christopher Jekeli Division of Geodetic Science, School of Earth Sciences, Ohio State University, Columbus, OH, USA

Abstract The spatial correlation of the Earth’s gravity field is well known and widely utilized in applications of geophysics and physical geodesy. This paper develops the mathematical theory of correlation functions, as well as covariance functions under a statistical interpretation of the field, for functions and processes on the sphere and plane, with formulation of the corresponding power spectral densities in the respective frequency domains and with extensions into the third dimension for harmonic functions. The theory is applied, in particular, to the disturbing gravity potential with consistent relationships of the covariance and power spectral density to any of its spatial derivatives. An analytic model for the covariance function of the disturbing potential is developed for both spherical and planar application, which has analytic forms also for all derivatives in both the spatial and the frequency domains (including the along-track frequency domain). Finally, a method is demonstrated to determine the parameters of this model from empirical regional power spectral densities of the gravity anomaly.

1 Introduction The Earth’s gravitational field plays major roles in geodesy, geophysics, and geodynamics and is also a significant factor in specific applications such as precision navigation and satellite orbit analysis. With the advance of instrumentation technology over the last several decades, we now have gravitational models of high spatial resolution over most of the land areas, thanks to extensive ground and expanding airborne survey campaigns and over the oceans owing to satellite radar altimetry, which measures essentially a level surface. Recent satellite gravity missions (e.g., the Gravity Field and Steady-State Ocean Circulation Explorer (GOCE), Rummel et al. 2011) also have vastly improved the longer-wavelength parts of the model with globally distributed in situ measurements. Despite these improvements, there remain deficiencies in resolution, including a lack of uniformity and accuracy in some land areas, such as Antarctica and significant parts of Africa, South America, and Asia (Pavlis et al. 2012a). These gaps will be filled with continued measurement, mostly using airborne systems for efficient accessibility to remote regions. Determining the required resolution and analyzing the effect or significance of the gravitational field at various scales for particular applications often rely on some a priori knowledge of the field. Also, the interpolation and extrapolation of the field from given discrete data and the prediction or estimation of field quantities other than those directly measured requires a weighting function based on the essential spatial correlative characteristics of the gravitational field. For these reasons, 



the study and development of correlation or covariance functions of the field have occupied geodesists and geophysicists in tandem with the advancements of measurement and instrument technology. The rather slow attenuation of the field as a function of resolution gives it at regional scales a kind of random character, much like the Earth’s topography. Indeed, the shorter spatial wavelengths of the gravitational field are in many cases highly correlated with the topography; and, profiles of topography, like coastlines, are known to be fractals, which arise from certain random fluctuations, analogous to Brownian motion (Mandelbrot 1983). Thus, we may argue that the Earth’s gravitational field at fine scales also exhibits a stochastic nature (Jekeli 1991). This randomness in the field has been argued and counterargued for decades, but it does form the basis for one of the more successful estimation methods in physical geodesy, called least-squares collocation (Moritz 1980). In addition, the correlative description of the field is advantageous in more general error analyses of the problem of field modeling; and, it is particularly useful in generating synthetic fields for deterministic simulations of the field for Monte Carlo types of analyses. The stochastic nature of the gravitational field, besides assumed primarily for the shorter wavelengths, is also limited to the horizontal dimensions. The variation in the vertical (above the Earth’s surface) is constrained deterministically by the attenuation of the gravitational potential with distance from its source, as governed by the solution to Laplace’s differential equation in free space. However, this constraint also extends the stochastic interpretation in estimation theory, since it analytically establishes mutually consistent correlations for vertical derivatives of the potential, or between its horizontal and vertical derivatives, or between the potential (and any of its derivatives) at different vertical levels. 
Thus, with the help of the corresponding covariance functions, one is able to estimate, for example, the geoid undulation from gravity anomaly data in a purely operational approach using no other physical models, which is the essence of the method of least-squares collocation.

It is necessary to distinguish and relate correlation and covariance functions as used in this text. The covariance function refers to random or stochastic processes and is the statistical expectation of the product of the centralized process at two points of the process (i.e., of two random variables with their means removed). The correlation function has more than one definition. As a natural extension of the Pearson correlation coefficient, it is the covariance function normalized by the square roots of the variances of the process at the two points (Priestley 1981). An alternative definition is the statistical expectation of the non-centralized product of the process at two points (Maybeck 1979). A third definition characterizes the correlation of deterministic (nonrandom) functions on the basis of averages of products over the domain of the function (de Coulon 1986). Ultimately, the covariance function and the correlation function, in its various incarnations, are related, but there is an advantage in distinguishing between the stochastic and the non-stochastic versions. Minimum error variance estimation requires a stochastic interpretation, and the gravitational field is characterized stochastically in terms of covariance functions. If interpolation or filtering or simulation through arbitrary synthesis is the principal application, then it may be sufficient to dispense with the stochastic interpretation. If the stochastic process is ergodic, then the average-based correlation function of its realization is the same as its covariance function, provided the means are known and removed.
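The definitions just listed can be written compactly as follows. This is only a notational sketch: the symbols $C_g$, $\rho_g$, $R_g$, $\mu_g$, and the domain $D$ with measure $|D|$ are assumptions introduced here for illustration and are not the notation adopted in the following sections.

```latex
% Covariance function of a stochastic process g with mean \mu_g(x) = E[g(x)]:
C_g(x, x') = E\big[ (g(x) - \mu_g(x)) \, (g(x') - \mu_g(x')) \big]

% Correlation function, first definition (normalized covariance):
\rho_g(x, x') = \frac{C_g(x, x')}{\sqrt{C_g(x, x)\, C_g(x', x')}}

% Correlation function, second definition (non-centralized product):
R_g(x, x') = E\big[ g(x)\, g(x') \big] = C_g(x, x') + \mu_g(x)\, \mu_g(x')

% Correlation function, third definition (deterministic g, average over the domain D):
\phi_g(s) = \frac{1}{|D|} \int_D g(x)\, g(x + s)\, dx
```

With zero means the second definition coincides with the covariance function, and under ergodicity the domain average of a single realization takes the place of the expectation.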
Thus, one may start with the formulation of the physical correlation of the gravitational field without the stochastic underpinning and introduce the stochastic interpretation as needed. Since one of the main applications is the popular least-squares collocation in physical geodesy,


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-3 © Springer-Verlag Berlin Heidelberg 2014

the terminology of covariance functions dominates the later chapters. Whether from the more general or the stochastic viewpoint, the correlative methods can be extended to other fields on the Earth’s surface and to fields that are harmonic in free space. For example, the anomalous magnetic potential (due to the magnetization of the crust material induced by the main, outer-core-generated field of the Earth) also satisfies Laplace’s differential equation. Thus, it shares basic similarities to the anomalous gravitational field. Under certain, albeit rather restrictive assumptions, one field may even be represented in terms of the other (Poisson’s relationship; Baranov 1957). Although this relationship has not been studied in detail from the stochastic or more general correlative viewpoint, it does open numerous possibilities in estimation and error analysis. Finally, it is noted that spatial data analyses in geophysics, specifically the optimal prediction and interpolation of geophysical signals, known as kriging (Olea 1999), rely as does collocation in geodesy on a correlative interpretation of the signals. Semi-variograms, instead of correlation functions, are used in kriging, but they are closely related. Therefore, a study of modeling one (correlations or covariances, in the present case) immediately carries over to the other. The following chapters review correlation functions on the sphere and plane, as well as the transforms into their respective spatial frequency domains. For the stochastic understanding of the geopotential field, the covariance function is introduced, under the assumption of ergodicity (hence, stationarity). Again, the frequency domain formulation, that is, the power spectral density of the field, is of particular importance. The method of covariance propagation, which is indispensable in such estimation techniques as least-squares collocation, naturally motivates the analytic modeling of covariance functions. 
Models have occupied physical geodesists since the utility of least-squares collocation first became evident, and myriad types of models and approaches exist. In this paper, a single yet comprehensive, adaptable, and flexible model is developed that offers consistency among all derivatives of the potential, whether in spherical or planar coordinates, and in the space or frequency domains. Methods to derive appropriate parameters for this model conclude the essential discussions of this paper.

2 Correlation Functions

We start with functions on the sphere and develop the concept of the correlation function without the need for a stochastic foundation. The statistical interpretation may be imposed at a later time when it is convenient or necessary to do so. As it happens, the infinite plane as functional domain offers more than one option for developing correlation functions, depending on the class of functions, and, therefore, will be treated after considering the unit sphere, $\Omega$. Other types of surfaces that approximate the Earth’s surface more accurately (ellipsoid, geoid, topographic surface) could also be contemplated. However, the extension of the correlation function into space according to potential theory and the development of a useful duality in the spatial frequency domain then become more problematic, if not impossible. In essence, we require surfaces on which functions have a spectral decomposition and such that convolutions transform into the frequency domain as products of spectra. The latter requirement is tied to the analogy between convolutions and correlations. Furthermore, the surface should be sufficiently simple as a boundary in the solution to Laplace’s equation for the gravitational potential. To satisfy all these requirements and with a view toward practical applications, the present discussion is restricted to the plane and the sphere. Although data on the surface are always discrete, we do not consider discrete functions. Rather, it is always assumed that the data are samples of a continuous function. Then, the correlation functions to be defined are also continuous, and correlations among the data are interpreted as samples of the correlation function.

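2.1 Functions on the Sphere

Let g and h be continuous, square-integrable functions on the unit sphere, $\Omega$, i.e.,

$$\frac{1}{4\pi}\iint_{\Omega} g^2\, d\Omega < \infty, \qquad \frac{1}{4\pi}\iint_{\Omega} h^2\, d\Omega < \infty, \tag{1}$$

and suppose they depend on the spherical polar coordinates, $\{(\theta,\lambda)\,|\,0 \le \theta \le \pi,\ 0 \le \lambda < 2\pi\}$. Each function may be represented in terms of its Legendre transform as an infinite series of spherical harmonics,

$$g(\theta,\lambda) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} G_{n,m}\,\bar{Y}_{n,m}(\theta,\lambda), \tag{2}$$

where the Legendre transform, or the Legendre spectrum of g, is

$$G_{n,m} = \frac{1}{4\pi}\iint_{\Omega} g(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\, d\Omega, \tag{3}$$

and where the functions, $\bar{Y}_{n,m}(\theta,\lambda)$, are surface spherical harmonics defined by

$$\bar{Y}_{n,m}(\theta,\lambda) = \bar{P}_{n,|m|}(\cos\theta)\begin{cases}\cos m\lambda, & m \ge 0\\ \sin |m|\lambda, & m < 0\end{cases} \tag{4}$$

The functions, $\bar{P}_{n,m}$, are associated Legendre functions of the first kind, fully normalized so that

$$\frac{1}{4\pi}\iint_{\Omega} \bar{Y}_{n',m'}(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\, d\Omega = \begin{cases}1, & n' = n \text{ and } m' = m\\ 0, & n' \ne n \text{ or } m' \ne m\end{cases} \tag{5}$$

A similar relationship exists between h and its Legendre transform, $H_{n,m}$. The degree and order, $(n, m)$, are wave numbers belonging to the frequency domain. The unit sphere is used here only for convenience, and any sphere (radius, R) may be used. The Legendre spectrum then refers to this sphere. We define the correlation function of g and h as

$$\phi_{gh}(\psi,\alpha) = \frac{1}{4\pi}\iint_{\Omega} g(\theta,\lambda)\, h(\theta',\lambda')\,\sin\theta\, d\theta\, d\lambda, \tag{6}$$

where the points $(\theta,\lambda)$ and $(\theta',\lambda')$ are related by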

$$\cos\psi = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\lambda - \lambda'), \tag{7}$$

$$\tan\alpha = \frac{-\sin\theta'\sin(\lambda - \lambda')}{\sin\theta\cos\theta' - \cos\theta\sin\theta'\cos(\lambda - \lambda')}, \tag{8}$$

and where the integration is performed over all pairs of points, $(\theta,\lambda)$ and $(\theta',\lambda')$, separated by the fixed spherical distance, $\psi$, and oriented by the fixed azimuth, $\alpha$. If the spherical harmonic series, Eq. (2), for g and h are substituted into Eq. (6), we find that, due to the special geometry of the sphere, no simple analytic expression results unless we further average over all azimuths, $\alpha$, thus imposing isotropy on the correlation function. Therefore, we redefine the correlation function of g and h (on the sphere) as follows:

$$\phi_{gh}(\psi) = \frac{1}{8\pi^2}\int_0^{2\pi}\iint_{\Omega} g(\theta,\lambda)\, h(\theta',\lambda')\,\sin\theta\, d\theta\, d\lambda\, d\alpha. \tag{9}$$

More precisely, this is the cross-correlation function of g and h. The autocorrelation function of g is simply $\phi_{gg}(\psi)$. The prefixes, cross- and auto-, are used mostly to emphasize a particular application and may be dropped when no confusion arises. Because of its sole dependence on $\psi$, $\phi_{gh}$ can be expressed as an infinite series of Legendre polynomials:

$$\phi_{gh}(\psi) = \sum_{n=0}^{\infty}(2n+1)\,(\Phi_{gh})_n\, P_n(\cos\psi), \tag{10}$$

where the coefficients, $(\Phi_{gh})_n$, constitute the Legendre transform of $\phi_{gh}$:

$$(\Phi_{gh})_n = \frac{1}{2}\int_0^{\pi}\phi_{gh}(\psi)\, P_n(\cos\psi)\,\sin\psi\, d\psi. \tag{11}$$

Substituting the decomposition formula for the Legendre polynomial,

$$P_n(\cos\psi) = \frac{1}{2n+1}\sum_{m=-n}^{n}\bar{Y}_{n,m}(\theta,\lambda)\,\bar{Y}_{n,m}(\theta',\lambda'), \tag{12}$$

and Eq. (9) into Eq. (11) and then simplifying using the orthogonality, Eq. (5), and the definition of the Legendre spectrum, Eq. (3), we find:

$$(\Phi_{gh})_n = \frac{1}{2n+1}\sum_{m=-n}^{n}\frac{1}{4\pi}\iint_{\Omega} g(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\left(\frac{1}{4\pi}\iint_{\Omega} h(\theta',\lambda')\,\bar{Y}_{n,m}(\theta',\lambda')\,\sin\psi\, d\psi\, d\alpha\right)\sin\theta\, d\theta\, d\lambda = \frac{1}{2n+1}\sum_{m=-n}^{n} G_{n,m} H_{n,m}, \tag{13}$$

where $(\theta,\lambda)$ is constant in the inner integral. The quantities, $(\Phi_{gh})_n$, constituting the Legendre transform of the correlation function, may be called the (cross-) power spectral density (PSD) of g and h. They are determined directly from the Legendre spectra of g and h. The (auto-) PSD of g is simply

$$(\Phi_{gg})_n = \frac{1}{2n+1}\sum_{m=-n}^{n} G_{n,m}^2. \tag{14}$$

The terminology that refers the correlation function to “power” is appropriate since it is an integral divided by the solid angle of the sphere. For functions on the plane, we make a distinction between energy and power, depending on the class of functions.
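The relations between Legendre spectra, degree PSD, and correlation function, Eqs. (10), (13), and (14), can be illustrated numerically. The following is a minimal sketch, not a production implementation: the function names, the storage of the coefficients as one array of length 2n+1 per degree, and the randomly generated spectrum are all assumptions made for illustration.

```python
import numpy as np

def degree_psd(G, H):
    """(Phi_gh)_n = 1/(2n+1) * sum_m G[n][m] * H[n][m], as in Eq. (13).

    G and H are lists; entry n holds the 2n+1 coefficients of degree n
    (orders m = -n..n) of a fully normalized real spherical harmonic series.
    """
    return np.array([np.dot(Gn, Hn) / (2 * n + 1)
                     for n, (Gn, Hn) in enumerate(zip(G, H))])

def legendre_polynomials(nmax, t):
    """P_0(t)..P_nmax(t) by the three-term recurrence."""
    P = np.empty(nmax + 1)
    P[0] = 1.0
    if nmax >= 1:
        P[1] = t
    for k in range(1, nmax):
        P[k + 1] = ((2 * k + 1) * t * P[k] - k * P[k - 1]) / (k + 1)
    return P

def correlation_from_psd(psd, psi):
    """phi(psi) = sum_n (2n+1) Phi_n P_n(cos psi), Eq. (10), truncated at len(psd)-1."""
    n = np.arange(len(psd))
    P = legendre_polynomials(len(psd) - 1, np.cos(psi))
    return float(np.sum((2 * n + 1) * psd * P))

# Hypothetical spectrum: random coefficients with a power-law decay.
rng = np.random.default_rng(0)
nmax = 20
G = [rng.standard_normal(2 * n + 1) / (n + 1) ** 2 for n in range(nmax + 1)]

psd_gg = degree_psd(G, G)                      # auto-PSD, Eq. (14)
phi0 = correlation_from_psd(psd_gg, 0.0)       # autocorrelation at psi = 0
total_power = sum(float(np.sum(Gn ** 2)) for Gn in G)
```

Because $P_n(1) = 1$, the autocorrelation at the origin equals the sum of all squared coefficients, which is why `phi0` and `total_power` agree here.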

2.2 Functions on the Plane

On the infinite plane with Cartesian coordinates, $\{(x_1, x_2)\,|\,-\infty < x_1 < \infty,\ -\infty < x_2 < \infty\}$, we consider several possibilities for the functions. The situation is straightforward if the functions are periodic and square integrable over the domain of a period or are square integrable over the plane. Anticipating no confusion, these functions again are denoted, g and h. For the periodic case, with periods, $Q_1$ and $Q_2$, in the respective coordinates,

$$\frac{1}{Q_1 Q_2}\int_0^{Q_1}\!\!\int_0^{Q_2} g^2\, dx_1\, dx_2 < \infty, \qquad \frac{1}{Q_1 Q_2}\int_0^{Q_1}\!\!\int_0^{Q_2} h^2\, dx_1\, dx_2 < \infty; \tag{15}$$

and each function may be represented in terms of its Fourier transform as an infinite series of sines and cosines, conveniently combined using the complex exponential:

$$g(x_1, x_2) = \frac{1}{Q_1 Q_2}\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty} G_{k_1,k_2}\, e^{i2\pi\left(\frac{k_1 x_1}{Q_1} + \frac{k_2 x_2}{Q_2}\right)}, \tag{16}$$

where the Fourier transform, or the Fourier spectrum of g, is given by

$$G_{k_1,k_2} = \int_0^{Q_1}\!\!\int_0^{Q_2} g(x_1, x_2)\, e^{-i2\pi\left(\frac{k_1 x_1}{Q_1} + \frac{k_2 x_2}{Q_2}\right)}\, dx_1\, dx_2, \tag{17}$$

and a similar relationship exists between h and its transform, $H_{k_1,k_2}$. Again, the integers, $k_1$, $k_2$, are the wave numbers in the frequency domain. Assuming both functions have the same periods, the correlation function of g and h is defined by

$$\phi_{gh}(s_1, s_2) = \frac{1}{Q_1 Q_2}\int_{-Q_1/2}^{Q_1/2}\int_{-Q_2/2}^{Q_2/2} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1'\, dx_2', \tag{18}$$


where $g^*$ is the complex conjugate of g (we deal only with real functions but need this formal definition). The independent variables are the differences between points of evaluation of h at $(x_1, x_2)$ and $g^*$ at $(x_1', x_2')$, respectively, as follows:

$$s_1 = x_1 - x_1', \qquad s_2 = x_2 - x_2'. \tag{19}$$

The integration is performed with $s_1$ and $s_2$ fixed and requires recognition of the fact that g and h are periodic. The correlation function is periodic with the same periods as for g and h, and its Fourier transform, that is, the power spectral density (PSD), is discrete and given by

$$(\Phi_{gh})_{k_1,k_2} = \int_0^{Q_1}\!\!\int_0^{Q_2}\phi_{gh}(s_1, s_2)\, e^{-i2\pi\left(\frac{k_1 s_1}{Q_1} + \frac{k_2 s_2}{Q_2}\right)}\, ds_1\, ds_2. \tag{20}$$

Substituting the correlation function, defined by Eq. (18), into Eq. (20) yields, after some straightforward manipulations (making use of Eq. (17) and the periodicity of its integrand):

$$(\Phi_{gh})_{k_1,k_2} = \frac{1}{Q_1 Q_2}\, G^*_{k_1,k_2}\, H_{k_1,k_2}. \tag{21}$$
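Equation (21) can be checked numerically on a sampled period. The sketch below assumes a hypothetical grid (sizes `N1`, `N2`, spacings `dx1`, `dx2`) and random sample values, approximates the integrals in Eqs. (17), (18), and (20) by Riemann sums, and uses NumPy's FFT sign convention, which matches the negative exponent of Eq. (17).

```python
import numpy as np

# Hypothetical periodic grid: N1 x N2 samples with spacings dx1, dx2.
N1, N2 = 8, 6
dx1, dx2 = 0.5, 0.25
Q1, Q2 = N1 * dx1, N2 * dx2          # periods in the two coordinates

rng = np.random.default_rng(1)
g = rng.standard_normal((N1, N2))
h = rng.standard_normal((N1, N2))

# Discrete stand-ins for the Fourier spectra, Eq. (17).
G = dx1 * dx2 * np.fft.fft2(g)
H = dx1 * dx2 * np.fft.fft2(h)

# Correlation function, Eq. (18), as a circular average over one period.
phi = np.empty((N1, N2))
for s1 in range(N1):
    for s2 in range(N2):
        h_shifted = np.roll(h, shift=(-s1, -s2), axis=(0, 1))   # h(x' + s)
        phi[s1, s2] = np.mean(np.conj(g) * h_shifted).real

# PSD two ways: transform of phi, Eq. (20), and product of spectra, Eq. (21).
psd_from_phi = dx1 * dx2 * np.fft.fft2(phi)
psd_from_spectra = np.conj(G) * H / (Q1 * Q2)
max_diff = np.max(np.abs(psd_from_phi - psd_from_spectra))
```

On the discrete grid the two routes agree to rounding error, which mirrors the statement that the PSD follows directly from the Fourier series coefficients.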

Analogous to the spherical case, Eq. (13), the PSD of periodic functions on the plane can be determined directly from their Fourier series coefficients. A very similar situation arises for nonperiodic functions that are nevertheless square integrable on the plane:

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^2\, dx_1\, dx_2 < \infty, \qquad \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h^2\, dx_1\, dx_2 < \infty. \tag{22}$$

In this case, the Fourier transform relationships for the function are given by

$$g(x_1, x_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} G(f_1, f_2)\, e^{i2\pi(f_1 x_1 + f_2 x_2)}\, df_1\, df_2, \qquad G(f_1, f_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x_1, x_2)\, e^{-i2\pi(f_1 x_1 + f_2 x_2)}\, dx_1\, dx_2, \tag{23}$$

where $f_1$ and $f_2$ are corresponding spatial (cyclical) frequencies. The correlation function is given by

$$\phi_{gh}(s_1, s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1'\, dx_2'; \tag{24}$$


and its Fourier transform is easily shown to be

$$\Phi_{gh}(f_1, f_2) = G^*(f_1, f_2)\, H(f_1, f_2). \tag{25}$$

This Fourier transform of the correlation function is called more properly the energy spectral density, since the correlation function is simply the integral of the product of the functions. The square integrability of the functions implies that they have finite energy. Later we consider stochastic processes on the plane that are stationary, which means that they are not square integrable. For this case, one may relax the integrability condition to

$$\lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} |g|^2\, dx_1\, dx_2 < \infty, \tag{26}$$

and we say that g has finite power (energy per domain unit). Analogously, the correlation function is given by

$$\phi_{gh}(s_1, s_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g^*(x_1', x_2')\, h(x_1' + s_1, x_2' + s_2)\, dx_1'\, dx_2', \tag{27}$$

but the Fourier transforms of the functions, g and h, do not exist in the usual way (as in Eq. (23)). On the other hand, the correlation function is square integrable and, therefore, possesses a Fourier transform, that is, the PSD of g and h:

$$\Phi_{gh}(f_1, f_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\phi_{gh}(s_1, s_2)\, e^{-i2\pi(f_1 s_1 + f_2 s_2)}\, ds_1\, ds_2. \tag{28}$$

Consider truncated functions defined on a finite domain:

$$\bar{g}(x_1, x_2) = \begin{cases} g(x_1, x_2), & x_1 \in [-E_1/2, E_1/2] \text{ and } x_2 \in [-E_2/2, E_2/2]\\ 0, & \text{otherwise}\end{cases} \tag{29}$$

and similarly for $\bar{h}$. Then $\bar{g}$ and $\bar{h}$ are square integrable on the plane and have Fourier transforms, $\bar{G}$ and $\bar{H}$, respectively; e.g.,

$$\bar{G}(f_1, f_2) = \int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g(x_1, x_2)\, e^{-i2\pi(f_1 x_1 + f_2 x_2)}\, dx_1\, dx_2. \tag{30}$$

It is now straightforward to show that in this case, the Fourier transform of $\phi_{gh}$ is given by

$$\Phi_{gh}(f_1, f_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1 E_2}\,\bar{G}^*(f_1, f_2)\,\bar{H}(f_1, f_2). \tag{31}$$


In practice, this power spectral density can only be approximated due to the required limit operators. However, the essential relationship between (truncated) function spectra and the PSD is once more evident.

2.3 From the Sphere to the Plane

For each class of functions on the plane, we did not need to impose isotropy on the correlation function. However, isotropy proves useful in comparisons to the spherical correlation function at high spatial frequencies. In the case of the nonperiodic functions on the plane, a simple averaging over azimuth changes the Fourier transform of the correlation function to its Hankel transform:

$$\Phi_{gh}(f) = 2\pi\int_0^{\infty}\phi_{gh}(s)\, s\, J_0(2\pi f s)\, ds, \qquad \phi_{gh}(s) = 2\pi\int_0^{\infty}\Phi_{gh}(f)\, f\, J_0(2\pi f s)\, df, \tag{32}$$

where $s = \sqrt{s_1^2 + s_2^2}$ and $f = \sqrt{f_1^2 + f_2^2}$, and $J_0$ is the zero-order Bessel function of the first kind. An approximate relation between the transforms of the planar and spherical isotropic correlation functions follows from the asymptotic relationship between Legendre polynomials and Bessel functions:

$$\lim_{n\to\infty} P_n\!\left(\cos\frac{x}{n}\right) = J_0(x), \quad \text{for } x > 0. \tag{33}$$

If we let $x = 2\pi f s$, where $s = R\psi$, and R is the radius of the sphere, then with $2\pi f \approx n/R$, we have $x/n \approx \psi$. Hence, for large n (or small $\psi$),

$$P_n(\cos\psi) \approx J_0(2\pi f s). \tag{34}$$

Now, discretizing the second of Eqs. (32) (with $df = 1/(2\pi R)$) and substituting Eq. (33) yields (again, with $2\pi f \approx n/R$)

$$\phi_{gh}(s) \approx \sum_{n=0}^{\infty}\frac{n}{2\pi R^2}\,\Phi_{gh}\!\left(\frac{n}{2\pi R}\right) P_n\!\left(\cos\frac{s}{R}\right). \tag{35}$$

Comparing this with the spherical correlation function, Eq. (10), we see that

$$(2n+1)\,(\Phi_{gh})_n \approx \frac{n}{2\pi R^2}\,\Phi_{gh}(f), \quad \text{where } f \approx \frac{n}{2\pi R}. \tag{36}$$

This relationship between planar and spherical PSDs holds only for isotropic correlation functions and for large n or f.
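The asymptotic relation in Eq. (33), on which the planar approximation rests, is easy to examine numerically. The sketch below is illustrative only: the helper names are assumptions, the Legendre polynomial is evaluated by its standard three-term recurrence, and $J_0$ is computed from Bessel's integral $J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin t)\,dt$ with a midpoint rule rather than from a special-function library.

```python
import numpy as np

def legendre_p(n, t):
    """P_n(t) from (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def bessel_j0(x, samples=2000):
    """J_0(x) from Bessel's integral, approximated by the midpoint rule."""
    t = (np.arange(samples) + 0.5) * np.pi / samples
    return float(np.mean(np.cos(x * np.sin(t))))

# Hypothetical argument x; the discrepancy shrinks as the degree n grows.
x = 3.0
errors = {n: abs(legendre_p(n, np.cos(x / n)) - bessel_j0(x))
          for n in (50, 500, 5000)}
```

The dictionary `errors` shows the difference between the two sides of Eq. (33) decreasing with increasing degree, consistent with the restriction of Eq. (36) to large n.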


2.4 Properties of Correlation Functions and PSDs

Correlation functions satisfy certain properties that should then also hold for corresponding models and may aid in their development. The autocorrelation is a positive definite function, since its eigenvalues defined by its spectrum, the PSD, are positive; e.g., see Eq. (14) or from Eq. (31):

$$\Phi_{gg}(f_1, f_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1 E_2}\left|\bar{G}(f_1, f_2)\right|^2 \ge 0. \tag{37}$$

The values of the autocorrelation function for nonzero argument are not greater than at the origin:

$$\phi_{gg}(\psi) \le \phi_{gg}(0), \quad \psi > 0; \qquad \phi_{gg}(s_1, s_2) \le \phi_{gg}(0, 0), \quad \sqrt{s_1^2 + s_2^2} > 0; \tag{38}$$

where equality would imply a perfectly correlated function (a constant). The inequalities (38) are proved using Schwarz’s inequality applied to Eqs. (6) and (24), respectively. Note that cross correlations may be larger in absolute value than their values at the origin (e.g., if they vanish there). Because of the imposed isotropy, spherical correlation functions are not defined for $\psi < 0$. On the other hand, planar correlation functions may be formulated for all quadrants, and they satisfy

$$\phi_{gh}(s_1, s_2) = \phi^*_{hg}(-s_1, -s_2), \tag{39}$$

which follows readily from their definition, given by Eqs. (24) or (27). Clearly, the autocorrelation function of a real function is symmetric with respect to the origin, even if not isotropic.

The correlation function of a derivative is the derivative of the correlation. For finite energy functions, we find immediately from Eq. (24) that

$$\frac{\partial}{\partial s_k}\phi_{gh}(s_1, s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^*(x_1', x_2')\,\frac{\partial h}{\partial x_k}(x_1' + s_1, x_2' + s_2)\, dx_1'\, dx_2' = \phi_{g,\frac{\partial h}{\partial x_k}}(s_1, s_2), \quad k = 1, 2. \tag{40}$$

From this and Eq. (39), we also have

$$\frac{\partial}{\partial s_k}\phi_{gh}(s_1, s_2) = \frac{\partial}{\partial s_k}\phi^*_{hg}(-s_1, -s_2) = -\phi^*_{h,\frac{\partial g}{\partial x_k}}(-s_1, -s_2) = -\phi_{\frac{\partial g}{\partial x_k},h}(s_1, s_2), \quad k = 1, 2. \tag{41}$$

The minus sign may be eliminated with the definition of $s_k$, Eqs. (19). We have $\partial/\partial s_k = \partial/\partial x_k^{(h)} = -\partial/\partial x_k^{(g)}$, where $x_k^{(g)}$ and $x_k^{(h)}$ refer, respectively, to the coordinates of g and h. Therefore,

$$\phi_{\frac{\partial g}{\partial x_k^{(g)}},h}(s_1, s_2) = \frac{\partial}{\partial x_k^{(g)}}\phi_{gh}(s_1, s_2), \qquad \phi_{g,\frac{\partial h}{\partial x_k^{(h)}}}(s_1, s_2) = \frac{\partial}{\partial x_k^{(h)}}\phi_{gh}(s_1, s_2). \tag{42}$$

The same results may be shown for correlation functions of other types of functions on the plane (where the derivations in the case of the limit operators require a bit more care).


Higher-order derivatives follow naturally, and indeed, we see that the correlation function of any linear operators on functions, $L^{(g)}g$ and $L^{(h)}h$, is the combination of these linear operators applied to the correlation function:

$$\phi_{L^{(g)}g,\,L^{(h)}h} = L^{(g)}\!\left(L^{(h)}\phi_{gh}\right). \tag{43}$$

Independent variables are omitted since this property, known as the law of propagation of correlations, holds also for the spherical case. The PSDs of derivatives of functions on the plane follow directly from the inverse transform of the correlation function:

$$\phi_{gh}(s_1, s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\Phi_{gh}(f_1, f_2)\, e^{i2\pi(f_1 s_1 + f_2 s_2)}\, df_1\, df_2. \tag{44}$$

With Eqs. (42), we find

$$\phi_{\frac{\partial g}{\partial x_k^{(g)}},\frac{\partial h}{\partial x_k^{(h)}}}(s_1, s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\Phi_{gh}(f_1, f_2)\,\frac{\partial^2}{\partial x_k^{(g)}\,\partial x_k^{(h)}}\, e^{i2\pi(f_1 s_1 + f_2 s_2)}\, df_1\, df_2. \tag{45}$$

From this (and Eqs. (19)) one may readily infer the following general formula for the PSD of the derivatives of g and h of any order:

$$\Phi_{g_{p_1 p_2},\,h_{q_1 q_2}}(f_1, f_2) = (-1)^{p_1 + p_2}\,(i2\pi f_1)^{p_1 + q_1}\,(i2\pi f_2)^{p_2 + q_2}\,\Phi_{gh}(f_1, f_2), \tag{46}$$

where $g_{p_1 p_2} = \partial^{p_1 + p_2} g / (\partial x_1^{p_1}\,\partial x_2^{p_2})$ and $h_{q_1 q_2} = \partial^{q_1 + q_2} h / (\partial x_1^{q_1}\,\partial x_2^{q_2})$. These expressions could be obtained also through Eqs. (21), (25), or (31), from the spectra of the function derivatives, which have a corresponding relationship to the spectra of the functions.

For functions on the sphere, the situation is hardly as simple. Indeed, this writer is unaware of formulas for the PSDs of horizontal derivatives, with the exception of an approximation for the average horizontal derivative,

$$d_H g(\theta,\lambda) = \sqrt{\left(\frac{\partial g}{\partial\theta}\right)^2 + \left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2}. \tag{47}$$

Making use of an orthogonality proved by Jeffreys (1955):

$$\frac{1}{4\pi}\iint_{\Omega}\left(\frac{\partial\bar{Y}_{n,m}}{\partial\theta}\frac{\partial\bar{Y}_{p,q}}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial\bar{Y}_{n,m}}{\partial\lambda}\frac{\partial\bar{Y}_{p,q}}{\partial\lambda}\right) d\Omega = \begin{cases} n(n+1), & n = p \text{ and } m = q\\ 0, & n \ne p \text{ or } m \ne q\end{cases} \tag{48}$$

the autocorrelation function of $d_H g$ at $\psi = 0$ from Eq. (9) becomes

$$\phi_{d_H g,\,d_H g}(0) = \frac{1}{4\pi}\iint_{\Omega}\left(\left(\frac{\partial g}{\partial\theta}\right)^2 + \left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2\right)\sin\theta\, d\theta\, d\lambda = \sum_{n=0}^{\infty} n(n+1)\sum_{m=-n}^{n} G_{n,m}^2. \tag{49}$$

It is tempting to identify the PSD by comparing this result to Eq. (10), but Eq. (49) proves this form of the correlation function only for $\psi = 0$. The error in this approximation of the PSD of $d_H g$ is an open question.

For functions, $\breve{g}(x_1, x_2; z)$, that satisfy Laplace’s equation, $\nabla^2\breve{g} = 0$, in the space exterior to the plane (i.e., they are harmonic for $z > 0$) and satisfy the boundary condition, $\breve{g}(x_1, x_2; 0) = g(x_1, x_2)$, the Fourier spectrum on any plane with $z = z_0 > 0$ is related to the spectrum of g:

$$\breve{G}(f_1, f_2; z_0) = G(f_1, f_2)\, e^{-2\pi f z_0}, \tag{50}$$

where $f^2 = f_1^2 + f_2^2$. Similarly, for functions, $\breve{g}(\theta,\lambda; r)$, harmonic outside the sphere ($r > R$) that satisfy $\breve{g}(\theta,\lambda; R) = g(\theta,\lambda)$, the Legendre spectrum on any sphere with $r = r_0 > R$ is related to the spectrum of g according to

$$\breve{G}_{n,m}(r_0) = \left(\frac{R}{r_0}\right)^{n+1} G_{n,m}. \tag{51}$$

Therefore, the corresponding spectral densities are analogously related. In general, the cross PSD of $\breve{g}$ at level, $z = z_g$, and $\breve{h}$ at level, $z = z_h$, is given (e.g., substituting Eq. (50) for $\breve{g}$ and $\breve{h}$ into Eq. (31)) by

$$\Phi_{\breve{g}\breve{h}}(f_1, f_2; z_g, z_h) = e^{-2\pi f (z_g + z_h)}\,\Phi_{gh}(f_1, f_2). \tag{52}$$

Note that the altitudes add in the exponent. Similarly, for cross PSDs of functions on spheres, $r = r_g$ and $r = r_h$, we have

$$\left(\Phi_{\breve{g}\breve{h}}\right)_n(r_g, r_h) = \left(\frac{R^2}{r_g r_h}\right)^{n+1}(\Phi_{gh})_n. \tag{53}$$

Although the altitude variables were treated strictly as parameters in these PSDs, one may consider briefly the corresponding correlation functions as “functions” of z and r, respectively, for the sole purpose of deriving the correlation functions of vertical (radial) derivatives. Indeed, it is readily seen from the definitions, Eqs. (9) and (27), for the cross correlation of $\breve{g}(\theta,\lambda; r_g)$ and $\breve{h}(\theta,\lambda; r_h)$ that

$$\phi_{\frac{\partial\breve{g}}{\partial r_g},\frac{\partial\breve{h}}{\partial r_h}}(\psi; r_g, r_h) = \frac{\partial^2}{\partial r_g\,\partial r_h}\phi_{\breve{g}\breve{h}}(\psi; r_g, r_h), \qquad \phi_{\frac{\partial\breve{g}}{\partial z_g},\frac{\partial\breve{h}}{\partial z_h}}(s_1, s_2; z_g, z_h) = \frac{\partial^2}{\partial z_g\,\partial z_h}\phi_{\breve{g}\breve{h}}(s_1, s_2; z_g, z_h), \tag{54}$$

and the law of propagation of correlations, Eq. (43), holds also for this linear operator. It should be stressed, however, that the correlation function is essentially a function of variables on the plane or sphere; no integration of products of functions takes place in the third dimension.


The cross PSDs of vertical derivatives, therefore, are easily derived by applying Eqs. (54) to the inverse transforms of the correlation functions, Eqs. (44) and (10), with extended expressions for the PSDs, Eqs. (52) and (53). The result is

$$\Phi_{\breve{g}_{z_g^j}\breve{h}_{z_h^k}}(f_1, f_2; z_g, z_h) = \frac{\partial^{j+k}}{\partial z_g^j\,\partial z_h^k}\, e^{-2\pi f (z_g + z_h)}\,\Phi_{gh}(f_1, f_2) = (-2\pi f)^{j+k}\, e^{-2\pi f (z_g + z_h)}\,\Phi_{gh}(f_1, f_2), \tag{55}$$

$$\left(\Phi_{\breve{g}_{r_g^j}\breve{h}_{r_h^k}}\right)_n(r_g, r_h) = \frac{\partial^{j+k}}{\partial r_g^j\,\partial r_h^k}\left(\left(\frac{R^2}{r_g r_h}\right)^{n+1}\right)(\Phi_{gh})_n, \tag{56}$$

where $\breve{g}_{z_g^j} = \partial^j\breve{g}/\partial z_g^j$, $\breve{h}_{z_h^k} = \partial^k\breve{h}/\partial z_h^k$, $\breve{g}_{r_g^j} = \partial^j\breve{g}/\partial r_g^j$, and $\breve{h}_{r_h^k} = \partial^k\breve{h}/\partial r_h^k$. Thus, the PSD for any combination of horizontal and vertical derivatives of g and h on horizontal planes in Cartesian space may be obtained by appending the appropriate factors to $\Phi_{gh}$. The same holds for any combination of vertical derivatives of g and h on concentric spheres.
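The altitude factors in Eqs. (52), (53), and (55) translate directly into code. The following sketch is an illustration under stated assumptions: the function names and the numerical values (frequency, heights, reference PSD) are hypothetical, and the check compares the analytic first vertical derivative factor with a central finite difference.

```python
import numpy as np

def planar_continued_psd(psd0, f, zg, zh, j=0, k=0):
    """Cross PSD of the j-th and k-th vertical derivatives at heights zg, zh, Eq. (55).

    With j = k = 0 this reduces to the upward continuation of Eq. (52).
    """
    return (-2 * np.pi * f) ** (j + k) * np.exp(-2 * np.pi * f * (zg + zh)) * psd0

def spherical_continued_psd(psd0_n, n, R, rg, rh):
    """Degree-n cross PSD on spheres of radii rg, rh (both >= R), Eq. (53)."""
    return (R ** 2 / (rg * rh)) ** (n + 1) * psd0_n

# Hypothetical values: frequency in cycles/m, heights in m, unit PSD on the plane z = 0.
f, zg, zh, psd0 = 1.0e-5, 1000.0, 3000.0, 1.0

# Central difference in zg versus the analytic factor for j = 1, k = 0.
dz = 1.0
numeric = (planar_continued_psd(psd0, f, zg + dz, zh)
           - planar_continued_psd(psd0, f, zg - dz, zh)) / (2 * dz)
analytic = planar_continued_psd(psd0, f, zg, zh, j=1)

# Attenuation of degree 100 at 500 km altitude on both spheres (R in m).
R = 6371.0e3
attenuated = spherical_continued_psd(1.0, 100, R, R + 500.0e3, R + 500.0e3)
```

Keeping the derivative orders as arguments of the same function reflects the point made above: every vertical derivative only appends a factor to the PSD on the reference surface.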

3 Stochastic Processes and Covariance Functions

A stochastic (or random) process is a collection, discrete or continuous, of random variables that are associated with a deterministic variable, in our case, a point on the plane or sphere. At each point, the process is random with an underlying probability distribution. A probability domain or sample space for each random variable is implied but omitted in the following simplified notation; in fact, the distribution may be unknown. If each random variable takes on a specific value from the corresponding sample space, then the process is said to be realized, and this realization is a function of the point coordinates. Thus, we continue to use the notation, g, to represent a continuous stochastic process, with the understanding that for any fixed point, it is a random variable. We assume that the process is wide-sense stationary, meaning that all statistics up to second order are invariant with respect to the origin of the space variable. Then, the expectation of g at all points is the same constant, and the covariance between g at any two points depends only on the displacement (vector) of one point relative to the other. Typically, besides not knowing the probability distribution, we have access only to a single realization of the stochastic process, which makes the estimation of essential statistics such as the mean and covariance problematic, unless we invoke an additional powerful condition characteristic of many processes: ergodicity. For ergodic processes the statistics associated with the underlying probability law, based on the statistical (ensemble) expectation, are equivalent to the statistics derived from space-based averages of a single realization of the process. Stationarity is necessary but not sufficient for ergodicity. Also, we consider only wide-sense ergodicity.
It can be shown that stationary stochastic processes whose underlying probability distribution is Gaussian are also ergodic (Moritz 1980; Jekeli 1991). We do not need this result since the probability distribution is not needed in our developments; indeed, ergodic processes on the sphere cannot be Gaussian (Lauritzen 1973).


For stochastic processes on the sphere, we define the space average as

$$M(\cdot) = \frac{1}{4\pi}\iint_{\Omega}(\cdot)\, d\Omega. \tag{57}$$

Let g and h be two such processes that are ergodic (hence, also stationary) and let their means, according to Eq. (57), be denoted $\mu_g$ and $\mu_h$. Then, the covariance function of g and h is given by

$$M\big((g - \mu_g)(h - \mu_h)\big) = \frac{1}{4\pi}\iint_{\Omega}\big(g(\theta,\lambda) - \mu_g\big)\big(h(\theta',\lambda') - \mu_h\big)\sin\theta\, d\theta\, d\lambda, \tag{58}$$

which, by the stationarity, depends only on the relative location of g and h, that is, on $(\psi,\alpha)$, as given by Eqs. (7) and (8). We will assume without loss of generality that the means of the processes are zero (if not, redefine the process with its constant mean removed). Then, clearly, the covariance function is like the correlation function, Eq. (6), except the interpretation is for stochastic processes. We continue to use the same notation, however, and further redefine the covariance function to be isotropic by including an average in azimuth, $\alpha$, as in Eq. (9). The Legendre transform of the covariance function is also the (cross) PSD of g and h and is given by Eq. (13). The quantities,

$$(c_{gh})_n = (2n+1)\,(\Phi_{gh})_n, \tag{59}$$

are known as degree variances, or variances per degree, on account of the total variance being, from Eq. (10),

$$\phi_{gh}(0) = \sum_{n=0}^{\infty}(c_{gh})_n. \tag{60}$$
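The step from Eq. (10) to Eq. (60) is short enough to write out; it uses only the standard normalization $P_n(1) = 1$ of the Legendre polynomials and the definition in Eq. (59):

```latex
\phi_{gh}(0) = \sum_{n=0}^{\infty} (2n+1)\,(\Phi_{gh})_n\, P_n(\cos 0)
             = \sum_{n=0}^{\infty} (2n+1)\,(\Phi_{gh})_n
             = \sum_{n=0}^{\infty} (c_{gh})_n .
```

For the autocovariance, combining Eqs. (14) and (59) gives $(c_{gg})_n = \sum_{m=-n}^{n} G_{n,m}^2$, so each degree variance is the sum of the squared coefficients of that degree.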

Ergodic processes on the plane are not square integrable since they are also stationary, and we define the average operator as

$$M(\cdot) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}(\cdot)\, dx_1\, dx_2. \tag{61}$$

The covariance function under the assumption of zero means for g and h is, again, the correlation function given by Eq. (27). However, the PSD requires some additional derivation since the truncated stochastic processes, $\bar{g}$ and $\bar{h}$, defined as in Eq. (29), are not stationary and, therefore, not ergodic. Since both $\bar{g}$ and $\bar{h}$ are random for each space variable, their Fourier transforms, $\bar{G}$ and $\bar{H}$, are also stochastic in nature. Consider first the ensemble expectation of the product of transforms, given by Eq. (30),

$$E\big(\bar{G}^*(f_1, f_2)\,\bar{H}(f_1, f_2)\big) = \int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} E\big(g^*(x_1', x_2')\, h(x_1, x_2)\big)\, e^{i2\pi\left(f_1 (x_1' - x_1) + f_2 (x_2' - x_2)\right)}\, dx_1\, dx_2\, dx_1'\, dx_2'. \tag{62}$$

The expectation inside the integrals is the same as the space average and is the covariance function of g and h, as defined above, which because of their stationarity depends only on the coordinate differences, $s_1 = x_1 - x_1'$ and $s_2 = x_2 - x_2'$. It can be shown (Brown 1983, p. 86) that, for the truncation intervals of Eq. (29), the integrations reduce to

$$E\big(\bar{G}^*(f_1, f_2)\,\bar{H}(f_1, f_2)\big) = E_1 E_2\int_{-E_1}^{E_1}\int_{-E_2}^{E_2}\left(1 - \frac{|s_1|}{E_1}\right)\left(1 - \frac{|s_2|}{E_2}\right)\phi_{gh}(s_1, s_2)\, e^{-i2\pi(f_1 s_1 + f_2 s_2)}\, ds_1\, ds_2. \tag{63}$$

In the limit, the integrals on the right side approach the Fourier transform of the covariance function, that is, the PSD, $\Phi_{gh}(f_1, f_2)$; and we have

$$\Phi_{gh}(f_1, f_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty} E\left(\frac{1}{E_1 E_2}\,\bar{G}^*(f_1, f_2)\,\bar{H}(f_1, f_2)\right). \tag{64}$$
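Equation (64) suggests a practical approximation: replace the limit by a finite window and the expectation by an average over realizations. The sketch below assumes a hypothetical discrete white-noise process (grid size `N`, spacing `dx`, standard deviation `sigma`), for which the discretized PSD is flat and equal to `sigma**2 * dx**2`; it is a simulation under these assumptions, not a procedure taken from the text.

```python
import numpy as np

# Hypothetical white-noise process on an N x N grid with spacing dx;
# its (discretized) PSD is flat and equal to sigma**2 * dx**2.
N, dx, sigma = 32, 2.0, 1.5
E1 = E2 = N * dx                      # side lengths of the truncation window
n_realizations = 400
rng = np.random.default_rng(42)

psd_sum = np.zeros((N, N))
for _ in range(n_realizations):
    g = sigma * rng.standard_normal((N, N))       # one realization of the process
    G_bar = dx * dx * np.fft.fft2(g)              # discrete stand-in for Eq. (30)
    psd_sum += np.abs(G_bar) ** 2 / (E1 * E2)     # periodogram of this realization

psd_estimate = psd_sum / n_realizations           # sample mean replaces E(.)
psd_true = sigma ** 2 * dx ** 2
relative_bias = abs(psd_estimate.mean() - psd_true) / psd_true
```

A single periodogram scatters strongly about the true PSD; only the average over realizations (or, under ergodicity, over independent patches of one realization) settles toward it, which is the role of the expectation operator in Eq. (64).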

Again, in practice, this PSD can only be approximated due to the limit and expectation operators. We have shown that under appropriate assumptions (ergodicity), the covariance functions of stochastic processes on the sphere or plane are essentially identical to the corresponding correlation functions that were developed without a stochastic foundation. The only exception occurs in the relationship between Fourier spectra and the (Fourier) PSD (compare Eqs. (31) and (64)). Furthermore, from Eqs. (62) through (64) we have also shown that the covariance function of a stochastic process is the Fourier transform of the PSD, given by Eq. (64). This is a statement of the more general Wiener-Khintchine theorem (Priestley 1981). Although there are opposing schools of thought as to the stochastic nature of a field like Earth’s gravitational potential, we will argue (see below) that the stochastic interpretation is entirely legitimate. Moreover, the stochastic interpretation of the gravitational field is widely, if not uniformly, accepted in geodesy (e.g., Moritz 1978, 1980; Hofmann-Wellenhof and Moritz 2005), as is the covariance nomenclature. Moritz (1980) provided compelling justifications to view the gravitational field as a stochastic process on the plane or sphere. The use of covariance functions also emphasizes that the significance of correlations among functions lies in their variability irrespective of the means (which we will always assume to be zero). For these reasons, we will henceforth in our applications to the Earth’s gravitational field refer only to covariance functions, use the same notation, and use all the properties and relationships derived for correlation functions.

3.1 Earth’s Anomalous Gravitational Field

The masses of the Earth, including all material below its surface, as well as the atmosphere, generate the gravitational field, which in vacuum is harmonic and satisfies Laplace’s differential equation. For present purposes we neglect the atmosphere (and usually its effect is removed from data) so that for points, $\mathbf{x}$, above the surface, the gravitational potential, $V$, fulfills Laplace’s equation,

$$\nabla^2 V(\mathbf{x}) = 0. \tag{65}$$

Global solutions to this equation depend on boundary values of $V$ or its derivatives on some mathematically convenient bounding surface. Typically this surface is a sphere with radius, $a$, and the solution is then expressed in spherical polar coordinates, $(r, \theta, \lambda)$, as an infinite series of solid spherical harmonic functions, $\bar{Y}_{n,m}(\theta,\lambda)/r^{n+1}$, for points outside the sphere:

$$V(r,\theta,\lambda) = \frac{GM}{a}\sum_{n=0}^{\infty}\sum_{m=-n}^{n}\left(\frac{a}{r}\right)^{n+1} C_{n,m}\,\bar{Y}_{n,m}(\theta,\lambda), \tag{66}$$

where $GM$ is Newton’s gravitational constant times the total mass of the Earth (this scale factor is determined from satellite tracking data); and $C_{n,m}$ is a coefficient of degree, $n$, and order, $m$, determined from $V$ and/or its radial derivatives on the bounding sphere (obtained, e.g., from measurements of gravity). Modern solutions also make use of satellite tracking data and in situ measurements of the field by satellite-borne instruments to determine these coefficients. In a coordinate system fixed to the Earth, we define the gravity potential as the sum of the gravitational potential, $V$, due to mass attraction and the (nongravitational) potential, $\varphi$, whose gradient equals the centrifugal acceleration due to Earth’s rotation:

$$W(\mathbf{x}) = V(\mathbf{x}) + \varphi(\mathbf{x}). \tag{67}$$

If we define a normal (i.e., reference) gravity potential, $U = V^{\mathrm{ellip}} + \varphi$, associated with a corotating material ellipsoid, such that on this ellipsoid, $U|_{\mathbf{x}\in\mathrm{ellip}} = U_0$, then the difference, called the disturbing potential,

$$T(\mathbf{x}) = W(\mathbf{x}) - U(\mathbf{x}), \tag{68}$$

is also a harmonic function in free space and may be represented as a spherical harmonic series:

$$T(r,\theta,\lambda) = \frac{GM}{a}\sum_{n=2}^{\infty}\sum_{m=-n}^{n}\left(\frac{a}{r}\right)^{n+1} \delta C_{n,m}\,\bar{Y}_{n,m}(\theta,\lambda), \tag{69}$$

where the $\delta C_{n,m}$ are coefficients associated with the difference, $V - V^{\mathrm{ellip}}$. The total ellipsoid mass is set equal to the Earth’s total mass, so that $\delta C_{0,0} = 0$; and the coordinate origin is placed at the center of mass of the Earth (and ellipsoid), implying that the first moments of the mass distribution all vanish: $\delta C_{1,m} = 0$ for $m = -1, 0, 1$. The set of spherical harmonic coefficients, $t_{n,m} = (GM/a)\,\delta C_{n,m}$, represents the Legendre spectrum of $T$. Practically, it is known only up to some finite degree, $n_{\max}$; for example, the model, EGM2008, has $n_{\max} = 2{,}190$ (Pavlis et al. 2012a, b). The harmonic coefficients of this model refer to a sphere of boundary values whose radius is equated with the semimajor axis of the best-fitting Earth ellipsoid. The uniform convergence of the infinite series, Eq. (69), is guaranteed for $r \ge a$, but effects of divergence are evident in the truncated series, EGM2008, when $r < a$, and due care should be exercised in evaluations on or near the Earth’s surface. The disturbing potential may also be defined with respect to higher-degree reference potentials, although in this case one may need to account for significant errors in the coefficients, $C_{n,m}^{\mathrm{ref}}$. In particular, the local interpretation of the field as a stationary random process usually requires removal of a higher-degree reference field. In the Cartesian formulation, the disturbing potential in free space ($z \ge 0$) is expressed in terms of its Fourier spectrum, $\mathcal{T}(f_1, f_2)$, on the plane, $z = 0$, as

$$T(x_1, x_2; z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{T}(f_1, f_2)\, e^{-2\pi f z}\, e^{i2\pi(f_1 x_1 + f_2 x_2)}\, df_1\, df_2, \tag{70}$$

where $f = \sqrt{f_1^2 + f_2^2}$.
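The factor $e^{-2\pi f z}$ in Eq. (70) controls how quickly short wavelengths fade with height. The following is a minimal sketch of that factor alone, not of the full integral; the function name and the sample wavelengths and heights are illustrative assumptions, not values from the text.

```python
import math

def upward_continuation_factor(f1, f2, z):
    """Attenuation e^{-2*pi*f*z} of the spectral component (f1, f2) at height z.

    f1, f2 are spatial frequencies in cy/m and z is the height in m.
    The function name is hypothetical; only the factor itself comes from Eq. (70).
    """
    if z < 0.0:
        raise ValueError("Eq. (70) applies in free space, z >= 0")
    f = math.hypot(f1, f2)  # radial frequency f = sqrt(f1^2 + f2^2)
    return math.exp(-2.0 * math.pi * f * z)

# Illustrative (assumed) wavelengths and heights.
for wavelength_km in (100.0, 10.0, 1.0):
    f1 = 1.0 / (wavelength_km * 1000.0)
    factors = [upward_continuation_factor(f1, 0.0, z) for z in (0.0, 1000.0, 5000.0)]
    print(wavelength_km, [round(x, 4) for x in factors])
```

The printout makes the low-pass character of upward continuation visible: the longer the wavelength, the less the signal is damped at a given height.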

3.2 The Disturbing Potential as a Stochastic Process

In addition to the well-grounded reasoning already cited, an alternative justification of the stochastic nature of $T$ is argued here based on the fractal (self-similar) characteristics of Earth’s topography (see also Turcotte 1987). This will also provide a basis for modeling the covariance function of $T$ and its derivatives. The fractal geometry of the Earth’s topography (among fractals in general) was investigated and popularized by Mandelbrot in a number of papers and reviewed in his book (Mandelbrot 1983) using fundamentally the concept of Brownian motion, which is the process of a random walk. Thus, without going into the details of fractals, we have at least a connection between topography and randomness. Next, we may appeal to the well-known (in physical geodesy and geophysics) high degree of linear correlation between gravity anomalies and topographic height. This correlation stems from the theory of isostasy that explains the existence of topography on the Earth whose state generally tends toward one of hydrostatic equilibrium. Although this correlation is not perfect (or almost nonexistent in regions of tectonic subsidence and rifting), empirical evidence suggests that in many areas the correlation is quite faithful to this theory, even with a number of seemingly crude approximations. The gravity anomaly, $\Delta g$, and its isostatic reduction are defined in Hofmann-Wellenhof and Moritz (2005). At a point, $P$, the isostatically reduced gravity anomaly is given by

$$\Delta g_I(P) = \Delta g(P) - C(P) + A(P), \tag{71}$$

where $C(P)$ is the gravitational effect of all masses above the geoid and $A(P)$ is the effect of their isostatic compensation. Several models for isostatic compensation have been developed by geophysicists (Watts 2001). Airy’s model treats the compensation locally and assumes that there is no regional flexural rigidity in the lithosphere. With this model, the topography presumably floats in the denser mantle, and equilibrium is established according to the buoyancy principle (Fig. 1):

$$\rho h = (\rho_m - \rho)\, h' = \Delta\rho\, h', \tag{72}$$

where $h'$ is the (positive) depth of the “root” with respect to the depth of compensation, $D$ (typically, $D = 30$ km), and the crust density, $\rho$, and the mantle density, $\rho_m$, are assumed constant. Similarly, in ocean areas, the lower density of water relative to the crust allows the mantle to intrude into the crust, where equilibrium is established if $(\rho - \rho_w)\, b = \Delta\rho\, b'$, and $b$ is the (positive) bathymetric distance to the ocean floor, $b'$ is the height of the “anti-root” of mantle material, and $\rho_w$ is the density of seawater. Removing the mass that generates $C(P)$ makes the space above the geoid homogeneous (empty). According to Airy’s model, the attraction, $A(P)$, is due, in effect, to adding that mass to the root so as to make the mantle below $D$ homogeneous. If the isostatic compensation is perfect according to this model, then the isostatic anomaly would vanish because of this created homogeneity; and, indeed, isostatic anomalies tend to be small. Therefore, the free-air gravity anomaly according to Eq. (71) with $\Delta g_I(P) \approx 0$ is generated by the attraction due to the topographic masses above the geoid, with density, $\rho$, and by the attraction due to the lack of mass below the depth of compensation, with density, $-\Delta\rho$:

$$\Delta g(P) \approx C(P) - A(P). \tag{73}$$

Fig. 1 Isostatic compensation of topography according to the Airy model
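As a small worked illustration of the buoyancy balance in Eq. (72) and its ocean counterpart, the sketch below computes the root depth and anti-root height. The density values (2,670, 3,270, and 1,030 kg/m³) and the function names are assumptions chosen for illustration; the text does not specify them.

```python
def airy_root(h, rho=2670.0, rho_m=3270.0):
    """Root depth h' below the compensation depth D for topographic height h.

    Solves rho*h = (rho_m - rho)*h' (Eq. (72)). Densities in kg/m^3 are assumed values.
    """
    return rho * h / (rho_m - rho)

def airy_antiroot(b, rho=2670.0, rho_m=3270.0, rho_w=1030.0):
    """Anti-root height b' for ocean depth b, from (rho - rho_w)*b = (rho_m - rho)*b'."""
    return (rho - rho_w) * b / (rho_m - rho)

# Hypothetical inputs: 1 km of topography and 4 km of ocean depth.
print(airy_root(1000.0))
print(airy_antiroot(4000.0))
```

With these assumed densities, the root is several times deeper than the topography is high, because the density contrast $\Delta\rho$ is much smaller than the crustal density.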

Expressions for the terms on the right side can be found using various approximations. One such approximation (Forsberg 1985) “condenses” the topography onto the geoid (Helmert condensation, Helmert 1884; Martinec 1998), and the gravitational effect is then due to a two-dimensional mass layer with density, $\kappa_H = \rho h$. Likewise, the gravitational effect of the ocean bottom topography can be modeled by forming a layer on the geoid that represents the ocean’s deficiency in density relative to the crust. The density of this layer is negative: $\kappa_B = -(\rho - \rho_w)\, b = -\rho\,(1 - \rho_w/\rho)\, b$. The gravitational potential, $v$, at a point, $P$, due to a layer condensed from topography (or bathymetry) is given by

$$v(P) = G\rho R^2 \iint_{\Omega} \frac{\bar{h}(Q)}{\ell}\, d\Omega_Q, \qquad \bar{h}(Q) = \begin{cases} h(Q), & Q \in \text{land} \\ -\left(1 - \dfrac{\rho_w}{\rho}\right) b(Q), & Q \in \text{ocean} \end{cases} \tag{74}$$

where $\ell$ is the distance between $P$ and the integration point. Similarly, the potential of the mass added below the depth of compensation can be approximated by that of another layer at level $D$ with density, $\kappa'_H = -\Delta\rho\, h'$, representing a condensation of material that is deficient in density with respect to the mantle and extends a depth, $h'$, below $D$ (see Fig. 1). For ocean areas, the anti-root is condensed onto the depth of compensation with density, $\kappa'_B = \Delta\rho\, b'$. Equation (74) for a fixed height of the point, $P$, is a convolution of $\bar{h}$ and the inverse distance. Further making the planar approximation (for local, or high-frequency applications), this distance is $\ell \approx \sqrt{(x_1 - x_1')^2 + (x_2 - x_2')^2 + z^2}$, with $(x_1', x_2')$ being the planar coordinates of point $Q$. Applying the convolution theorem, the Fourier transform of the potential at the level of $z > 0$ is given by

$$\mathcal{V}(f_1, f_2; z) = \frac{G\rho}{f}\, \bar{\mathcal{H}}(f_1, f_2)\, e^{-2\pi f z}. \tag{75}$$

Including the layer at the compensation depth, $D$, below the geoid with density, $\kappa'_H = -\rho h$ (in view of Eq. (72); and similarly $\kappa'_B = \rho\,(1 - \rho_w/\rho)\, b$ for ocean areas), the Fourier transform of the total potential due to both the topography and its isostatic compensation is approximately

$$\mathcal{V}(f_1, f_2; z) = \frac{G\rho}{f}\, \bar{\mathcal{H}}(f_1, f_2) \left( e^{-2\pi f z} - e^{-2\pi f (D + z)} \right). \tag{76}$$

Since the gravity anomaly is approximately the negative radial derivative of this potential, multiplying by $2\pi f$ yields its Fourier transform:

$$\Delta\mathcal{G}(f_1, f_2; z) = 2\pi G\rho\, \bar{\mathcal{H}}(f_1, f_2) \left( e^{-2\pi f z} - e^{-2\pi f (D + z)} \right). \tag{77}$$
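Equation (77) can be read as a frequency-dependent admittance between topographic height and gravity anomaly. The sketch below evaluates that admittance; the value of Newton's constant, the crust density, and the function name are assumptions made for illustration, while $D = 30$ km is the typical value quoted above.

```python
import math

G_NEWTON = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (assumed standard value)

def isostatic_admittance(f, z=0.0, D=30.0e3, rho=2670.0):
    """Ratio of gravity-anomaly to topography spectra implied by Eq. (77), in s^-2.

    f is radial frequency in cy/m, z the evaluation height in m, D the compensation
    depth in m, rho the crust density in kg/m^3 (rho is an assumed value).
    """
    up = math.exp(-2.0 * math.pi * f * z)
    comp = math.exp(-2.0 * math.pi * f * (D + z))
    return 2.0 * math.pi * G_NEWTON * rho * (up - comp)

# Admittance in mGal per metre of topography (1 mGal = 1e-5 m/s^2), at z = 0.
for wavelength_km in (1000.0, 100.0, 10.0, 1.0):
    f = 1.0 / (wavelength_km * 1000.0)
    print(wavelength_km, round(isostatic_admittance(f) * 1.0e5, 4))
```

At short wavelengths and $z = 0$ the compensation term vanishes and the admittance approaches the constant $2\pi G\rho$, which is the linear height-to-gravity relationship; at long wavelengths the compensation cancels most of the topographic signal.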

Neglecting the upward continuation term, as well as the isostatic term (which is justified only for very short wavelengths), confirms the empirical linear relationship between the heights and the gravity anomaly. Figure 2 compares the PSDs of the topography and the gravitational field both globally and locally. The global PSDs were computed from spherical harmonic expansions, EGM2008 for the gravitational potential and DTM2006 for the topography (Pavlis et al. 2012a), according to Eq. (14) but converted to spatial frequency using Eq. (36). In addition, both were scaled to the PSD of the geoid undulation, which is related to the disturbing potential as $N = T/\gamma$, where $\gamma = 9.8$ m/s$^2$ is an average value of gravity. The topographic height is related to the potential through Eq. (76). Both expansions are complete up to degree, $n_{\max} = 2{,}160$ ($f_{\max} = 5.4 \times 10^{-5}$ cy/m). DTM2006 is an expansion for both the topographic height above mean sea level and the depth of the ocean and, therefore, does not exactly correspond to $\bar{h}$, as defined in Eq. (74). This contributes to an overestimation of the power at lower frequencies. The obviously lower power of EGM2008 at higher frequencies results from the higher altitude, on average, to which its spectrum refers, that is, the sphere of radius, $a$.

Fig. 2 Comparison of gravitational and topographic PSDs, scaled to the geoid undulation PSD. Global models are EGM2008 and DTM2006, and local PSDs were derived from gravity and topographic data in the indicated areas 1 and 2

The other PSDs in Fig. 2 correspond to the indicated regions and were derived according to Eq. (31) from local terrain elevation and gravity anomaly data provided by the US National Geodetic Survey. The data grids in latitude and longitude have resolution of 30 arcsec for the topography and 1 arcmin for the gravity. With a planar approximation for these areas, the Fourier transforms were calculated using their discrete versions. The PSDs were computed by neglecting the limit (and expectation) operators and were averaged in azimuth. Dividing the gravity PSD by $(2\pi\gamma f)^2$ then yields the geoid undulation PSD; and, as before, Eq. (76) relates the topography PSD to the potential PSD that scales to the geoid undulation PSD by $1/\gamma^2$. In these regions, the gravity and topography PSDs match well at the higher frequencies at least, attesting to their high linear correlation. Moreover, these PSDs follow a power law in accord with the presumed fractal nature of the topography. These examples then offer a validation of the stochastic interpretation of the gravitational field and also provide a starting point to model its covariance function.
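The two scalings to the geoid undulation PSD mentioned above can be sketched as follows. This is a minimal illustration in SI units; the function names are hypothetical, and the only numerical value taken from the text is $\gamma = 9.8$ m/s².

```python
import math

GAMMA = 9.8  # average gravity in m/s^2, as quoted in the text

def geoid_psd_from_potential_psd(psd_t):
    """Scale a disturbing-potential PSD to a geoid-undulation PSD via N = T/gamma."""
    return psd_t / GAMMA ** 2

def geoid_psd_from_gravity_psd(psd_dg, f):
    """Scale a gravity-anomaly PSD (SI units) at radial frequency f (cy/m) to a geoid PSD.

    Uses the planar relation between the spectra of T and its vertical derivative,
    so the gravity PSD equals (2*pi*f)^2 times the potential PSD.
    """
    if f <= 0.0:
        raise ValueError("f must be positive")
    return psd_dg / (2.0 * math.pi * GAMMA * f) ** 2

# Consistency check with an arbitrary (assumed) potential PSD value.
f = 2.0e-5
psd_t = 5.0
psd_dg = (2.0 * math.pi * f) ** 2 * psd_t
print(geoid_psd_from_potential_psd(psd_t), geoid_psd_from_gravity_psd(psd_dg, f))
```

Both routes give the same geoid undulation PSD, which is why gravity and topography spectra can be compared on a common scale.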

4 Covariance Models

Since the true covariance function of a process, such as the Earth’s gravity field, rarely is known and local functions can vary from region to region (thus we allow global non-stationarity in local applications), it must usually be modeled from data. We consider here primarily the modeling of the autocovariance function, that is, when $g = h$. Models for the cross covariance function could follow similar procedures, but usually $g$ and $h$ are linearly related and the method of propagation of covariances (see Sect. 2.4) should be followed to derive $\phi_{gh}$ from $\phi_{gg}$. Modeling the covariance (or correlation) function of a process on the plane or sphere can proceed with different assumptions and motivations. We distinguish in the first place between empirical and analytic methods and in the second between global and local models. Global models describe the correlation of functions on the sphere, whereas local models usually are restricted to applications where a planar approximation suffices. Empirical models are derived directly from data distributed on the presumably spherical or planar surface of the Earth. Rarely, if ever, are global empirical covariance models determined for the sphere according to the principal definition, Eq. (9). Instead, such models are given directly by the degree variances, Eq. (59). For local modeling with the planar approximation, the empirical model comes from a discretization of Eq. (27), where we neglect also the limit processes,

$$\hat{\phi}_{gg}(s_1, s_2) = \frac{1}{M} \sum_{x_1'} \sum_{x_2'} g(x_1', x_2')\, g(x_1' + s_1, x_2' + s_2), \tag{78}$$

and where $M$ is the total number of summed products for each $(s_1, s_2)$. A corresponding approximation for an isotropic model additionally averages the products of $g$ that are separated by a given distance, $s = \sqrt{(x_1 - x_1')^2 + (x_2 - x_2')^2}$, over all directions. Typically, the maximum $s$ considered is much smaller than (perhaps 10–20 % of) the physical dimension of the data area, since the approximation of Eq. (27) by Eq. (78) worsens as the number of possible summands within a finite area decreases. Also, $M$ may be fixed at the largest possible value (the number of products for $s = 0$) in order to avoid a numerical nonpositive definiteness of the covariance function (Marple 1987, p. 148). However, this creates a biased estimate of the covariance, particularly for the larger distances. Another form of empirical covariance model is its Fourier (or Legendre) transform, derived directly from the data, as was illustrated for the gravity anomaly and topography in Fig. 2. The inverse transform then yields immediately the covariance function. The disadvantage of the empirical covariance model, Eq. (78), is the limited ability (or inability) to derive consistent covariances of functionally related quantities, such as the derivatives of $g$ through the law of propagation of covariances, Eq. (43). This could only be accomplished by working with its transform (see Eqs. (46)), but generally, an analytic model eventually simplifies the computational aspects of determining auto- and cross covariances.
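The discretized sum of Eq. (78) can be sketched for gridded data as below. This is a minimal illustration under stated assumptions: the data are a zero-mean list of rows on a regular grid, lags are given in grid steps, and the function name and `biased` switch are hypothetical. The switch selects between dividing by the number of available products and dividing by the fixed, largest possible $M$ discussed above.

```python
def empirical_covariance(g, k1, k2, biased=False):
    """Estimate the covariance of zero-mean gridded data g at lag (k1, k2) grid steps.

    g is a list of equally long rows. With biased=False, M is the number of
    products actually summed; with biased=True, M is fixed at the number of
    products for zero lag (the total number of grid points).
    """
    n1, n2 = len(g), len(g[0])
    total, count = 0.0, 0
    for i in range(n1):
        for j in range(n2):
            ii, jj = i + k1, j + k2
            if 0 <= ii < n1 and 0 <= jj < n2:
                total += g[i][j] * g[ii][jj]
                count += 1
    if count == 0:
        raise ValueError("lag exceeds the extent of the data grid")
    m = n1 * n2 if biased else count
    return total / m

# Tiny hypothetical zero-mean grid.
grid = [[1.0, -1.0], [-1.0, 1.0]]
print(empirical_covariance(grid, 0, 0))
print(empirical_covariance(grid, 0, 1))
print(empirical_covariance(grid, 0, 1, biased=True))
```

The biased variant shrinks the estimate at larger lags, which is the trade-off noted in the text between positive definiteness and bias.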

4.1 Analytic Models

Analytic covariance models are constructed from relatively simple mathematical functions that typically are fit to empirical data (either in the spatial or frequency domains) and have the benefit of easy computation and additional properties useful to a particular application (such as straightforward propagation of covariances). An analytic model should satisfy all the basic properties of the covariance function (Sect. 2.4), although depending on the application some of these may be omitted (such as the harmonic extension into space for the Gaussian model, $\phi(s) = \sigma^2 e^{-\beta s^2}$). An analytic model may be developed for the PSD or the covariance function. Ideally (but not always), one leads to a mathematically rigorous formulation of the other. Perhaps the most famous global analytic model is known as Kaula’s rule, proposed by W. Kaula (1966, p. 98) in order to develop the idea of a stochastic interpretation of the spherical spectrum of the disturbing potential:

$$(\Phi_{TT})_n = \left(\frac{GM}{R}\right)^2 \frac{10^{-10}}{n^4}\ \mathrm{m^4/s^4}, \tag{79}$$

where R is the mean Earth radius. It roughly described the attenuation of the harmonic coefficients known at that time from satellite tracking observations, but it is reasonably faithful to the spectral attenuation of the field even at high degrees (see Fig. 4). Note that Kaula’s rule is a power-law model for the PSD of the geopotential, agreeing with our arguments above for such a characteristic based on the fractal nature of the topography. The geodetic literature of the latter part of the last century is replete with different types of global and local covariance and PSD models for the Earth’s residual gravity field (e.g., Jordan 1972; Tscherning and Rapp 1974; Heller and Jordan 1979; Forsberg 1987; Milbert 1991; among others); but, it is not the purpose here to review them. Rather the present aim is to promote a single elemental prototype model that (1) satisfies all the properties of a covariance model for a stochastic process, (2) has harmonic extension into free space, (3) has both spherical and planar analytic expressions for all derivatives of the potential in both the space and frequency domains, and (4) is sufficiently adaptable to any strength and attenuation of the gravitational field. This is the reciprocal distance model introduced by Moritz (1976, 1980), so called because the covariance function resembles an inverse-distance weighting function. It was also independently studied by Jordan et al. (1981).
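Kaula’s rule, Eq. (79), is simple enough to evaluate directly. In the sketch below, the numerical values of $GM$ and $R$ are commonly used constants that I am assuming here; they are not quoted in the text, and the function name is hypothetical.

```python
GM = 3.986004418e14  # m^3/s^2, assumed standard value for the Earth
R_MEAN = 6371000.0   # m, assumed mean Earth radius

def kaula_psd(n):
    """Kaula's rule for the degree-n PSD of the disturbing potential, in m^4/s^4."""
    if n < 1:
        raise ValueError("degree n must be at least 1")
    return (GM / R_MEAN) ** 2 * 1.0e-10 / n ** 4

for n in (2, 20, 200, 2000):
    print(n, kaula_psd(n))
```

Each tenfold increase in degree lowers the value by four orders of magnitude, which is the power-law attenuation referred to in the text.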

4.2 The Reciprocal Distance Model

Consider the disturbing potential, $T$, as a stochastic process on each of two possibly different horizontal parallel planes or concentric spheres. Given a realization of $T$ on one plane (or sphere), its realization on the other plane (or sphere) is well defined by a solution to Laplace’s equation, provided both surfaces are on or outside the Earth’s surface (approximated as a plane or sphere). The reciprocal distance covariance model between $T$ on one plane and $T$ on the other is given by

$$\phi_{TT}(s; z_1, z_2) = \frac{\sigma^2}{\sqrt{\alpha^2 s^2 + \left(1 + \alpha (z_1 + z_2)\right)^2}}, \tag{80}$$

where with Eq. (19), $s = \sqrt{s_1^2 + s_2^2}$; $z_1$, $z_2$ are heights of the two planes; and $\sigma^2$, $\alpha$ are parameters. The Fourier transform, or the PSD, is given by

$$\Phi_{TT}(f; z_1, z_2) = \frac{\sigma^2}{\alpha f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha)}, \qquad f \ne 0. \tag{81}$$
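A quick numerical check of the pair, Eqs. (80) and (81), is that the PSD integrated over the frequency plane reproduces the covariance at zero lag. The sketch below does this with a simple midpoint rule; the parameter values, the integration cutoff, and the function names are illustrative assumptions.

```python
import math

def rd_covariance(s, z1, z2, sigma2, alpha):
    """Planar reciprocal distance covariance, Eq. (80)."""
    beta = 1.0 + alpha * (z1 + z2)
    return sigma2 / math.sqrt((alpha * s) ** 2 + beta ** 2)

def rd_psd(f, z1, z2, sigma2, alpha):
    """Planar reciprocal distance PSD, Eq. (81), for f > 0."""
    if f <= 0.0:
        raise ValueError("Eq. (81) is defined for f != 0; use f > 0 here")
    return sigma2 / (alpha * f) * math.exp(-2.0 * math.pi * f * (z1 + z2 + 1.0 / alpha))

def variance_from_psd(z1, z2, sigma2, alpha, n=20000):
    """Midpoint-rule estimate of the isotropic integral of Phi(f) * 2*pi*f over f > 0."""
    c = z1 + z2 + 1.0 / alpha
    f_max = 40.0 / (2.0 * math.pi * c)  # beyond this the integrand is below e^-40 of its peak
    df = f_max / n
    total = 0.0
    for k in range(n):
        f = (k + 0.5) * df
        total += rd_psd(f, z1, z2, sigma2, alpha) * 2.0 * math.pi * f * df
    return total

# Hypothetical parameters: sigma^2 = 1 m^4/s^4, 1/alpha = 100 km, both planes at 1 km height.
print(rd_covariance(0.0, 1000.0, 1000.0, 1.0, 1.0e-5))
print(variance_from_psd(1000.0, 1000.0, 1.0, 1.0e-5))
```

The agreement of the two printed numbers reflects the transform relationship between covariance and PSD stated above, evaluated at $s = 0$.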

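For spheres with radii, $r_1 \ge R$ and $r_2 \ge R$, the spherical covariance model is

$$\phi_{TT}(\psi; r_1, r_2) = \frac{\sigma^2 (1 - \rho_0)}{\rho_0}\, \frac{\rho}{\sqrt{1 + \rho^2 - 2\rho \cos\psi}}, \tag{82}$$

where $\psi$ is given by Eq. (7), $\rho_0 = (R_0/R)^2$ and $\sigma^2$ are parameters, and $\rho = R_0^2/(r_1 r_2)$. The Legendre transform, or PSD, is given by

$$(\Phi_{TT})_n = \frac{\sigma^2 (1 - \rho_0)}{(2n + 1)\, \rho_0}\, \rho^{n+1}. \tag{83}$$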

In all cases, the heights (or radii) refer to fixed surfaces that define the spatial domain of the corresponding stochastic process. Since we allow $z_1 \ne z_2$ or $r_1 \ne r_2$, the models, Eqs. (80) and (82), technically are cross covariances between two different (but related) processes; and Eqs. (81) and (83) are cross PSDs. The equivalence of the models, Eqs. (80) and (82), as the spherical surface approaches a plane, is established by identifying $z_1 = r_1 - R$, $z_2 = r_2 - R$ and $s^2 \approx 2R^2 (1 - \cos\psi)$, from which it can be shown (Moritz 1980, p. 183) that

$$\frac{1}{\sqrt{1 + \rho^2 - 2\rho \cos\psi}} \approx \frac{R}{\sqrt{s^2 + (1/\alpha + z_1 + z_2)^2}}, \tag{84}$$

where $1/\alpha = 2 (R - R_0)$ and terms of order $(R - R_0)/R$ are neglected. The variance parameter, $\sigma^2$, is the same in both versions of the model. It is noted that this model, besides having analytic forms in both the space and frequency domains, is isotropic, depending only on the horizontal distance. Moreover, it correctly incorporates the harmonic extension for the potential at different levels. It is also positive definite since the transform is positive for all frequencies. The analytic forms permit exact propagation of covariances as elaborated in Sect. 2.4. Since many applications involving the stochastic interpretation of the field nowadays are more local than global, only the (easier) planar propagation is given here (Appendix A) up to second-order derivatives. The covariance propagation of derivatives for similar spherical models was developed by Tscherning (1976). Note that the covariances of the horizontal derivatives are not isotropic. One further useful feature of the reciprocal distance model is that it possesses analytic forms for hybrid PSD/covariance functions, those that give the PSD in one dimension and the covariance in the other:

$$S_{TT}(f_1; s_2; z_1, z_2) = \int_{-\infty}^{\infty} \Phi_{TT}(f_1, f_2; z_1, z_2)\, e^{i2\pi f_2 s_2}\, df_2 = \int_{-\infty}^{\infty} \phi_{TT}(s_1, s_2; z_1, z_2)\, e^{-i2\pi f_1 s_1}\, ds_1. \tag{85}$$

The first integral transforms the PSD to the covariance in the second variable, while the second equivalent integral transforms the covariance function to the frequency domain in the first variable. When a process is given only on a single profile (e.g., along a data track), one may wish to model its along-track PSD, which is the hybrid PSD/covariance function with s2 D 0. Appendix B gives the corresponding analytic forms for the (planar) reciprocal distance model.
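The planar limit of the spherical model, Eq. (84) with $1/\alpha = 2(R - R_0)$, can be checked numerically. The sketch below is a rough illustration under stated assumptions: the spherical form is written with $\rho$ in the numerator, so that the variance at $r_1 = r_2 = R$ equals $\sigma^2$ (this is my reading of Eq. (82)); the radius values and function names are hypothetical.

```python
import math

def rd_cov_sphere(psi, r1, r2, sigma2, r0, r):
    """Spherical reciprocal distance covariance (assumed reading of Eq. (82))."""
    rho0 = (r0 / r) ** 2
    rho = r0 ** 2 / (r1 * r2)
    kernel = rho / math.sqrt(1.0 + rho * rho - 2.0 * rho * math.cos(psi))
    return sigma2 * (1.0 - rho0) / rho0 * kernel

def rd_cov_plane(s, z1, z2, sigma2, alpha):
    """Planar reciprocal distance covariance, Eq. (80)."""
    beta = 1.0 + alpha * (z1 + z2)
    return sigma2 / math.sqrt((alpha * s) ** 2 + beta ** 2)

R_EARTH = 6371000.0                      # assumed mean Earth radius in m
R_ZERO = R_EARTH - 50000.0               # hypothetical R0, 50 km below R
ALPHA = 1.0 / (2.0 * (R_EARTH - R_ZERO)) # from 1/alpha = 2 (R - R0)

def compare(s, z1, z2, sigma2=1.0):
    """Return (spherical, planar) covariances for chord distance s and heights z1, z2."""
    psi = 2.0 * math.asin(s / (2.0 * R_EARTH))  # from s^2 = 2 R^2 (1 - cos psi)
    sph = rd_cov_sphere(psi, R_EARTH + z1, R_EARTH + z2, sigma2, R_ZERO, R_EARTH)
    return sph, rd_cov_plane(s, z1, z2, sigma2, ALPHA)

print(compare(0.0, 0.0, 0.0))
print(compare(50000.0, 0.0, 0.0))
print(compare(50000.0, 1000.0, 1000.0))
```

For these assumed values the two versions differ by well under one percent, consistent with the neglect of terms of order $(R - R_0)/R$.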

4.3 Parameter Determination

The reciprocal distance PSD model, Eq. (81), clearly does not have the form of a power law, but it nevertheless serves in modeling the PSD of the gravitational field when a number of these models are combined linearly:

$$\Phi_{TT}(f; z_1, z_2) = \sum_{j=1}^{J} \frac{\sigma_j^2}{\alpha_j f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha_j)}. \tag{86}$$

Fig. 3 Fitting a reciprocal distance model component to a power-law PSD

The parameters, $\alpha_j$, $\sigma_j^2$, are chosen appropriately to yield a power-law attenuation of the PSD. This selection is based on the empirical PSD of data that in the case of the gravitational field are usually gravity anomalies, $\Delta g \approx -\partial T/\partial r$, on the Earth’s surface ($z_1 = z_2 = 0$). Multiplying the PSD for the disturbing potential by $(2\pi f)^2$, we consider reciprocal distance components of the PSD of the gravity anomaly (from Eq. (86)) in the form

$$\Phi(f) = A f e^{-B f}, \tag{87}$$

where $A = (2\pi)^2 \sigma^2/\alpha$ and $B = 2\pi/\alpha$ are constants to be determined such that the model is tangent to the empirical PSD. Here we assume that the latter is a power-law model (see Fig. 3),

$$p(f) = C f^{-\beta}, \tag{88}$$

where the constants, $C$ and $\beta$, are given. In terms of natural logarithms, the reciprocal distance PSD component is $\ln(\Phi(f)) = \ln(A) + \omega - B e^{\omega}$, where $\omega = \ln f$; and its slope is $d(\ln(\Phi(f)))/d\omega = 1 - B e^{\omega}$. The slope should be $-\beta$, which yields $B e^{\omega} = 1 + \beta$. Also, the reciprocal distance and power-law models should intersect, say, at $f = \bar{f}$, which requires $\ln(C) - \beta\omega = \ln(A) + \omega - B e^{\omega}$. Solving for $A$ and $B$, we find:

$$A = C \left(\frac{e}{\bar{f}}\right)^{1+\beta}, \qquad B = \frac{1 + \beta}{\bar{f}}. \tag{89}$$

With a judicious selection of discrete frequencies, fNj , a number of PSD components may be combined to approximate the power law over a specified domain. Due to the overlap of the component summands in Eq. (86), an appropriate scale factor may still be required for a proper fit.
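The tangent construction of Eqs. (87) through (89) can be sketched as follows, including the conversion from $(A, B)$ back to $(\sigma^2, \alpha)$ via $A = (2\pi)^2\sigma^2/\alpha$ and $B = 2\pi/\alpha$. The power-law constants and the tangent frequency used in the example are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def tangent_component(c, beta, f_bar):
    """Constants (A, B) of A*f*exp(-B*f) tangent to c*f**(-beta) at f = f_bar (Eq. (89))."""
    b = (1.0 + beta) / f_bar
    a = c * (math.e / f_bar) ** (1.0 + beta)
    return a, b

def to_sigma2_alpha(a, b):
    """Convert (A, B) to the reciprocal distance parameters (sigma^2, alpha)."""
    alpha = 2.0 * math.pi / b
    sigma2 = a * alpha / (2.0 * math.pi) ** 2
    return sigma2, alpha

def component_psd(f, a, b):
    """Single reciprocal distance component of the gravity anomaly PSD, Eq. (87)."""
    return a * f * math.exp(-b * f)

def power_law(f, c, beta):
    """Power-law PSD model, Eq. (88)."""
    return c * f ** (-beta)

# Hypothetical power law and tangent frequency.
C_EX, BETA_EX, F_BAR = 1.0e-3, 3.0, 1.0e-4
A_EX, B_EX = tangent_component(C_EX, BETA_EX, F_BAR)
print(to_sigma2_alpha(A_EX, B_EX))
print(component_psd(F_BAR, A_EX, B_EX), power_law(F_BAR, C_EX, BETA_EX))
```

Because the logarithm of the component is concave in $\ln f$ while the power law is a straight line, each component touches the power law at $\bar{f}$ and lies below it elsewhere, which is why several components are summed over a range of $\bar{f}_j$.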


Table 1 Reciprocal distance PSD parameters

          Area 1                            Area 2
  j    σ_j² (m⁴/s⁴)    α_j (1/m)         σ_j² (m⁴/s⁴)    α_j (1/m)
  1    10^5            3 × 10^-7         10^5            3 × 10^-7
  2    3,300           9.69 × 10^-7      3,300           9.69 × 10^-7
  3    650             4.76 × 10^-6      640             7.56 × 10^-6
  4    162             8.94 × 10^-6      951             9.73 × 10^-6
  5    10.2            2.00 × 10^-5      79.9            2.18 × 10^-5
  6    0.641           4.48 × 10^-5      6.71            4.88 × 10^-5
  7    4.02 × 10^-2    1.00 × 10^-4      0.564           1.09 × 10^-4
  8    2.53 × 10^-3    2.25 × 10^-4      4.74 × 10^-2    2.44 × 10^-4
  9    1.59 × 10^-4    5.03 × 10^-4      3.98 × 10^-3    5.47 × 10^-4
 10    9.97 × 10^-6    1.13 × 10^-3      3.34 × 10^-4    1.23 × 10^-3
 11    6.26 × 10^-7    2.52 × 10^-3      2.81 × 10^-5    2.74 × 10^-3
 12    3.93 × 10^-8    5.64 × 10^-3      2.36 × 10^-6    6.14 × 10^-3
 13    2.47 × 10^-9    1.26 × 10^-2      1.98 × 10^-7    1.37 × 10^-2
 14    1.55 × 10^-10   2.83 × 10^-2      1.66 × 10^-8    3.08 × 10^-2
 15    9.74 × 10^-12   6.33 × 10^-2      1.40 × 10^-9    6.89 × 10^-2

Fig. 4 Comparison of empirical and reciprocal distance (RD) model PSDs for the gravity anomaly in the two areas shown in Fig. 2

This modeling technique was applied to the two regional PSDs shown in Fig. 2. Additional low-frequency components were added to model the field at frequencies, $f < 10^{-5}$ cy/m. Table 1 lists the reciprocal distance parameters for each of the regions in Fig. 2; and Fig. 4 shows the empirical and corresponding modeled PSDs for the gravity anomaly. The parameters may be used to define consistently the cross PSDs and cross covariances of any of the derivatives of the disturbing potential in the respective regions.

5 Summary and Future Directions

The preceding sections have developed the theory for correlation functions on the sphere and plane for deterministic functions and stochastic processes using standard spherical harmonic (Legendre) and Fourier basis functions. Assuming an ergodic (hence stationary) stochastic process, its covariance function (with zero means) is essentially the correlation function defined for a particular realization of the process. These concepts were applied to the disturbing gravitational potential. Based on the fractal nature of Earth’s topography and its relationship to the gravitational field, the power spectral density (PSD) of the disturbing potential was shown to behave like a power law at higher spatial frequencies. This provides the basis for the definition and determination of an analytic model for the covariance function that offers mutually consistent cross covariances (and PSDs) among its various derivatives, including vertical derivatives. Once established for a particular region, such models have numerous applications from least-squares collocation (and the related kriging) to more mundane procedures such as interpolation and filtering. Furthermore, they are ideally suited to generating a synthetic field for use in simulation studies of potential theory, as well as Monte Carlo statistical analyses in estimation theory. The details of such applications are beyond the present scope but are readily formulated. The developed reciprocal distance model is quite versatile when combined linearly using appropriate parameters and is able to represent the PSD of the disturbing potential (and any of its derivatives) with different spectral amplitudes depending on the region in question. Two examples are provided in which a combination of 15 such reciprocal distance components is fitted accurately to the empirical gravity anomaly PSD in either smooth or rough regional fields. Although limited to some extent by being isotropic (for the vertical derivatives, only), the resulting models are completely analytic in both spatial and frequency domains; and thus, the computed cross covariances and cross PSDs of all derivatives of the disturbing potential are mutually consistent, which is particularly important in estimation and error analysis studies.
The global representation of the gravitational field in terms of spherical harmonics has many applications that are, in fact, becoming more and more local as the computational capability increases and models are expanded to higher maximum degree, $n_{\max}$. The most recent global model, EGM2008, includes coefficients complete up to $n_{\max} = 2{,}190$, and the historical trend has been to develop models with increasingly high global resolution as more and more globally distributed data become available. However, such high-degree models also face the potential problem of divergence near the Earth’s surface (below the sphere of convergence) and must always submit to the justifiable criticism that they are inefficient local representations of the field. In fact, the two PSDs presented here are based on the classical local approximation, the planar approximation, with traditional Fourier (sinusoidal) basis functions. Besides being limited by the planar approximation, the Fourier basis functions, in the strictest sense, still have global support for nonperiodic functions. However, there exists a vast recent development of local-support representations of the gravitational field using splines on the sphere, including tensor-product splines (e.g., Schumaker and Traas 1991), radial basis functions (Schreiner 1997; Freeden et al. 1998), and splines on sphere-like surfaces (Alfeld et al. 1996); see also Jekeli (2005). Representations of the gravitational field using these splines, particularly the radial basis functions and the Bernstein-Bézier polynomials used by Alfeld et al., depend strictly on local data, and the models can easily be modified by the addition or modification of individual data. Thus, they also do not depend on regularly distributed data, as do the spherical harmonic and Fourier series representations.
On the other hand, these global support models based on regular data distributions lead to particularly straightforward and mutually consistent transformations among the PSDs of all derivatives of the gravitational potential, which greatly facilitates the modeling of their correlations. For irregularly scattered data, the splines lend themselves to a multiresolution representation of the field on the sphere, analogous to wavelets in Cartesian space. This has been developed for the tensor-product splines by Lyche and Schumaker (2000) and for the radial basis splines by Freeden et al. (1998); see also Fengler et al. (2004). For the Bernstein-Bézier polynomial splines, a multiresolution model is also possible. How these newer constructive approximations can be adapted to correlation modeling with mutually consistent transformations (propagation of covariances and analogous PSDs) among all derivatives of the gravitational potential represents a topic for future development and analysis.

Appendix A

The planar reciprocal distance model, Eq. (80), for the covariance function of the disturbing potential is repeated here for convenience with certain abbreviations:

$$\phi_{T,T}(s_1, s_2; z_1, z_2) = \frac{\sigma^2}{M^{1/2}}, \tag{90}$$

where

$$M = \beta^2 + \alpha^2 s^2, \quad \beta = 1 + \alpha (z_1 + z_2), \quad s^2 = s_1^2 + s_2^2, \quad s_1 = x_1 - x_1', \quad s_2 = x_2 - x_2'. \tag{91}$$

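The primed coordinates refer to the first subscripted function in the covariance, and the unprimed coordinates refer to the second function. The altitude levels for these functions are $z_1$ and $z_2$, respectively. Derivatives of the disturbing potential with respect to the coordinates are denoted $\partial T/\partial x_1 = T_{x_1}$, $\partial^2 T/(\partial x_1 \partial z) = T_{x_1 z}$, etc. The following expressions for the cross covariances are derived by repeatedly using Eqs. (42) and (54). The arguments for the resulting function are omitted but are the same as in Eq. (90):

$$\phi_{T_{x_1},T} = \frac{\sigma^2 \alpha^2 s_1}{M^{3/2}} = -\phi_{T,T_{x_1}} \tag{92}$$

$$\phi_{T_{x_2},T} = \frac{\sigma^2 \alpha^2 s_2}{M^{3/2}} = -\phi_{T,T_{x_2}} \tag{93}$$

$$\phi_{T_z,T} = -\frac{\sigma^2 \alpha \beta}{M^{3/2}} = \phi_{T,T_z} \tag{94}$$

$$\phi_{T_{x_1},T_{x_1}} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left( M - 3\alpha^2 s_1^2 \right) \tag{95}$$

$$\phi_{T_{x_1},T_{x_2}} = -3\,\frac{\sigma^2 \alpha^4 s_1 s_2}{M^{5/2}} = \phi_{T_{x_2},T_{x_1}} \tag{96}$$

$$\phi_{T_{x_1},T_z} = -3\,\frac{\sigma^2 \alpha^3 \beta s_1}{M^{5/2}} = -\phi_{T_z,T_{x_1}} \tag{97}$$

$$\phi_{T_{x_2},T_{x_2}} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left( M - 3\alpha^2 s_2^2 \right) \tag{98}$$

$$\phi_{T_{x_2},T_z} = -3\,\frac{\sigma^2 \alpha^3 \beta s_2}{M^{5/2}} = -\phi_{T_z,T_{x_2}} \tag{99}$$

$$\phi_{T_z,T_z} = \frac{\sigma^2 \alpha^2}{M^{5/2}} \left( 2M - 3\alpha^2 s^2 \right) = \phi_{T_{x_1},T_{x_1}} + \phi_{T_{x_2},T_{x_2}} \tag{100}$$

$$\phi_{T,T_{x_1 x_1}} = -\phi_{T_{x_1},T_{x_1}} = \phi_{T_{x_1 x_1},T} \tag{101}$$

$$\phi_{T,T_{x_1 x_2}} = -\phi_{T_{x_1},T_{x_2}} = \phi_{T_{x_1 x_2},T} \tag{102}$$

$$\phi_{T,T_{x_1 z}} = -\phi_{T_{x_1},T_z} = -\phi_{T_{x_1 z},T} \tag{103}$$

$$\phi_{T,T_{x_2 x_2}} = -\phi_{T_{x_2},T_{x_2}} = \phi_{T_{x_2 x_2},T} \tag{104}$$

$$\phi_{T,T_{x_2 z}} = -\phi_{T_{x_2},T_z} = -\phi_{T_{x_2 z},T} \tag{105}$$

$$\phi_{T,T_{zz}} = \phi_{T_z,T_z} = \phi_{T_{zz},T} \tag{106}$$

$$\phi_{T_{x_1},T_{x_1 x_1}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left( -3M + 5\alpha^2 s_1^2 \right) = -\phi_{T_{x_1 x_1},T_{x_1}} \tag{107}$$

$$\phi_{T_{x_1},T_{x_1 x_2}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left( -M + 5\alpha^2 s_1^2 \right) = -\phi_{T_{x_1 x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1 x_1}} = -\phi_{T_{x_1 x_1},T_{x_2}} \tag{108}$$

$$\phi_{T_{x_1},T_{x_1 z}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left( -M + 5\alpha^2 s_1^2 \right) = \phi_{T_{x_1 z},T_{x_1}} = -\phi_{T_z,T_{x_1 x_1}} = -\phi_{T_{x_1 x_1},T_z} \tag{109}$$

$$\phi_{T_{x_1},T_{x_2 x_2}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left( -M + 5\alpha^2 s_2^2 \right) = -\phi_{T_{x_2 x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1 x_2}} = -\phi_{T_{x_1 x_2},T_{x_2}} \tag{110}$$

$$\phi_{T_{x_1},T_{x_2 z}} = \frac{15\sigma^2 \alpha^5 \beta s_1 s_2}{M^{7/2}} = \phi_{T_{x_2 z},T_{x_1}} = -\phi_{T_z,T_{x_1 x_2}} = \phi_{T_{x_2},T_{x_1 z}} = -\phi_{T_{x_1 x_2},T_z} = \phi_{T_{x_1 z},T_{x_2}} \tag{111}$$

$$\phi_{T_{x_1},T_{zz}} = \frac{3\sigma^2 \alpha^4 s_1}{M^{7/2}} \left( 4\beta^2 - \alpha^2 s^2 \right) = -\phi_{T_{zz},T_{x_1}} = -\phi_{T_z,T_{x_1 z}} = \phi_{T_{x_1 z},T_z} \tag{112}$$

$$\phi_{T_{x_2},T_{x_2 x_2}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left( -3M + 5\alpha^2 s_2^2 \right) = -\phi_{T_{x_2 x_2},T_{x_2}} \tag{113}$$

$$\phi_{T_{x_2},T_{x_2 z}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left( -M + 5\alpha^2 s_2^2 \right) = \phi_{T_{x_2 z},T_{x_2}} = -\phi_{T_z,T_{x_2 x_2}} = -\phi_{T_{x_2 x_2},T_z} \tag{114}$$

$$\phi_{T_{x_2},T_{zz}} = \frac{3\sigma^2 \alpha^4 s_2}{M^{7/2}} \left( 4\beta^2 - \alpha^2 s^2 \right) = -\phi_{T_{zz},T_{x_2}} = -\phi_{T_z,T_{x_2 z}} = \phi_{T_{x_2 z},T_z} \tag{115}$$

$$\phi_{T_z,T_{zz}} = \frac{3\sigma^2 \alpha^3 \beta}{M^{7/2}} \left( -2M + 5\alpha^2 s^2 \right) = \phi_{T_{zz},T_z} \tag{116}$$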

Page 28 of 34

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_28-3 © Springer-Verlag Berlin Heidelberg 2014

 3 2 ˛ 4  2 2 2 4 4  30M˛ s C 35˛ s 3M 1 1 M 9=2

(117)

Tx1 x1 ;Tx1 x2 D

 15 2 ˛ 6 s1 s2  2 2 s 3M C 7˛ 1 D Tx1 x2 ;Tx1 x1 M 9=2

(118)

Tx1 x1 ;Tx1 z D

 15 2 ˛ 5 ˇs1  3M C 7˛ 2 s12 D Tx1 z ;Tx1 x1 9=2 M

(119)

Tx1 x1 ;Tx1 x1 D

Tx1 x1 ;Tx2 x2 Tx1 x1 ;Tx2 z D

 3 2 ˛ 4  2 2 2 2 2 D  5M˛ s C 35s s M D Tx2 x2 ;Tx1 x1 D Tx1 x2 ;Tx1 x2 1 2 M 9=2

 15 2 ˛ 5 ˇs2  M C 7˛ 2 s12 D Tx2 z ;Tx1 x1 D Tx1 z ;Tx1 x2 D Tx1 x2 ;Tx1 z 9=2 M

(121)

 3 2 ˛ 4  2 2 2 2 2 2 4M D Tzz ;Tx1 x1 D Tx1 z ;Tx1 z C 5M˛ s C 35ˇ ˛ s 2 1 M 9=2

(122)

Tx1 x1 ;Tzz D

Tx1 x2 ;Tx2 x2 D Tx1 x2 ;Tx2 z D

 15 2 ˛ 6 s1 s2  2 2 3M C 7˛ s 2 D Tx2 x2 ;Tx1 x2 M 9=2

(123)

 15 2 ˛ 5 ˇs1  M C 7˛ 2 s22 D Tx2 z ;Tx1 x2 D Tx1 z ;Tx2 x2 D Tx2 x2 ;Tx1 z 9=2 M

(124)

 15 2 ˛ 5 ˇs1  2 D 3M  7ˇ D Tzz ;Tx1 z M 9=2

(125)

Tx1 z ;Tzz

 3 2 ˛ 4  2 2 2 4 4  30M˛ s C 35˛ s 3M 2 2 M 9=2

(126)

 15 2 ˛ 5 ˇs2  2 2 s 3M C 7˛ D Tx2 z ;Tx2 x2 2 M 9=2

(127)

Tx2 x2 ;Tx2 x2 D Tx2 x2 ;Tx2 z D Tx2 x2 ;Tzz

(120)

 3 2 ˛ 4  2 2 2 2 2 2 D C 5M˛ s C 35ˇ ˛ s 4M D Tzz ;Tx2 x2 D Tx2 z ;Tx2 z 1 2 M 9=2  15 2 ˛ 5 ˇs2  3M  7ˇ 2 D Tzz ;Tx2 z 9=2 M

(129)

 3 2 ˛ 4  4 2 2 2 4 4  24ˇ ˛ s C 3˛ s 8ˇ D Tx1 z ;Tx1 z C Tx2 z ;Tx2 z M 9=2

(130)

Tx2 z ;Tzz D Tzz ;Tzz D

(128)
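As an informal numerical check, and not part of the original appendix, the following Python sketch evaluates the reciprocal distance covariance of Eq. (90) and compares Eqs. (95) and (100) with finite-difference derivatives of Eq. (90). The function names, the parameter values, and the step size are illustrative assumptions. The sign convention assumed is the one implied by the text: derivatives of the first (primed) function contribute a factor of minus one per horizontal derivative.

```python
import math


def m_and_beta(s1, s2, z1, z2, alpha):
    """Abbreviations of Eq. (91): M = beta^2 + alpha^2 s^2, beta = 1 + alpha (z1 + z2)."""
    beta = 1.0 + alpha * (z1 + z2)
    return beta**2 + alpha**2 * (s1**2 + s2**2), beta


def cov_T_T(s1, s2, z1, z2, sigma2=1.0, alpha=1.0):
    """Eq. (90): covariance of the disturbing potential."""
    m, _ = m_and_beta(s1, s2, z1, z2, alpha)
    return sigma2 / math.sqrt(m)


def cov_Txk_Txk(k, s1, s2, z1, z2, sigma2=1.0, alpha=1.0):
    """Eqs. (95) and (98): covariance of T_xk with itself, k = 1 or 2."""
    m, _ = m_and_beta(s1, s2, z1, z2, alpha)
    sk = s1 if k == 1 else s2
    return sigma2 * alpha**2 * (m - 3.0 * alpha**2 * sk**2) / m**2.5


def cov_Tz_Tz(s1, s2, z1, z2, sigma2=1.0, alpha=1.0):
    """Eq. (100): covariance of T_z with itself."""
    m, _ = m_and_beta(s1, s2, z1, z2, alpha)
    s_sq = s1**2 + s2**2
    return sigma2 * alpha**2 * (2.0 * m - 3.0 * alpha**2 * s_sq) / m**2.5


def minus_d2_ds1(s1, s2, z1, z2, sigma2=1.0, alpha=1.0, h=1e-3):
    """Central-difference estimate of -d^2 phi_TT / d s1^2 (to compare with Eq. 95)."""
    def f(s):
        return cov_T_T(s, s2, z1, z2, sigma2, alpha)
    return -(f(s1 + h) - 2.0 * f(s1) + f(s1 - h)) / h**2


def d2_dz1_dz2(s1, s2, z1, z2, sigma2=1.0, alpha=1.0, h=1e-3):
    """Central-difference estimate of d^2 phi_TT / (d z1 d z2) (to compare with Eq. 100)."""
    def f(a, b):
        return cov_T_T(s1, s2, a, b, sigma2, alpha)
    return (f(z1 + h, z2 + h) - f(z1 + h, z2 - h)
            - f(z1 - h, z2 + h) + f(z1 - h, z2 - h)) / (4.0 * h**2)


if __name__ == "__main__":
    point = (0.3, -0.2, 0.1, 0.05)   # illustrative (s1, s2, z1, z2), arbitrary units
    sigma2, alpha = 2.0, 0.5         # illustrative model parameters
    print(cov_Txk_Txk(1, *point, sigma2, alpha), minus_d2_ds1(*point, sigma2, alpha))
    print(cov_Tz_Tz(*point, sigma2, alpha), d2_dz1_dz2(*point, sigma2, alpha))
```

Each printed pair should agree to within the finite-difference error. The second identity in Eq. (100) holds by construction, because the model is harmonic in $(s_1, s_2, z_1 + z_2)$.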


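Appendix B

The hybrid PSD/covariance function of the disturbing potential, given by Eq. (85), can be shown to be

$$S_{T,T}(f_1; s_2; z_1, z_2) = \frac{2\sigma^2}{\alpha} K_0(2\pi f_1 d), \tag{131}$$

where $K_0$ is the modified Bessel function of the second kind and zero order, and

$$d = \sqrt{\frac{\beta^2}{\alpha^2} + s_2^2}. \tag{132}$$

It is the along-track PSD if $s_2 = 0$. In the following hybrid PSD/covariances of the derivatives of $T$, the modified Bessel function of the second kind and first order, $K_1$, also appears. Both Bessel functions always have the argument $2\pi f_1 d$, and the arguments of the hybrid PSD/covariances are the same as in Eq. (131).

$$S_{T,T_{x_1}} = i 2\pi f_1 S_{T,T} = -S_{T_{x_1},T} \tag{133}$$

$$S_{T,T_{x_2}} = -\frac{2\sigma^2 (2\pi f_1) s_2}{\alpha d} K_1 = -S_{T_{x_2},T} \tag{134}$$

$$S_{T,T_z} = -\frac{2\sigma^2 (2\pi f_1) \beta}{\alpha^2 d} K_1 = S_{T_z,T} \tag{135}$$

$$S_{T_{x_1},T_{x_1}} = (2\pi f_1)^2 S_{T,T} \tag{136}$$

$$S_{T_{x_1},T_{x_2}} = i 2\pi f_1 S_{T_{x_2},T} = S_{T_{x_2},T_{x_1}} \tag{137}$$

$$S_{T_{x_1},T_z} = -i 2\pi f_1 S_{T,T_z} = -S_{T_z,T_{x_1}} \tag{138}$$

$$S_{T_{x_2},T_{x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d} \left[ \left( 1 - \frac{2 s_2^2}{d^2} \right) K_1 - 2\pi f_1 \frac{s_2^2}{d} K_0 \right] \tag{139}$$

$$S_{T_{x_2},T_z} = -\frac{2\sigma^2 (2\pi f_1) \beta s_2}{\alpha^2 d^3} \left( 2K_1 + 2\pi f_1 d\, K_0 \right) = -S_{T_z,T_{x_2}} \tag{140}$$

$$S_{T_z,T_z} = S_{T_{x_1},T_{x_1}} + S_{T_{x_2},T_{x_2}} \tag{141}$$

$$S_{T,T_{x_1 x_1}} = -S_{T_{x_1},T_{x_1}} = S_{T_{x_1 x_1},T} \tag{142}$$

$$S_{T,T_{x_1 x_2}} = -S_{T_{x_1},T_{x_2}} = S_{T_{x_1 x_2},T} \tag{143}$$

$$S_{T,T_{x_1 z}} = -S_{T_{x_1},T_z} = -S_{T_{x_1 z},T} \tag{144}$$

$$S_{T,T_{x_2 x_2}} = -S_{T_{x_2},T_{x_2}} = S_{T_{x_2 x_2},T} \tag{145}$$

$$S_{T,T_{x_2 z}} = -S_{T_{x_2},T_z} = -S_{T_{x_2 z},T} \tag{146}$$

$$S_{T,T_{zz}} = S_{T_z,T_z} = S_{T_{zz},T} \tag{147}$$

$$S_{T_{x_1},T_{x_1 x_1}} = i (2\pi f_1)^3 S_{T,T} = -S_{T_{x_1 x_1},T_{x_1}} \tag{148}$$

$$S_{T_{x_1},T_{x_1 x_2}} = (2\pi f_1)^2 S_{T,T_{x_2}} = -S_{T_{x_1 x_2},T_{x_1}} = S_{T_{x_2},T_{x_1 x_1}} = -S_{T_{x_1 x_1},T_{x_2}} \tag{149}$$

$$S_{T_{x_1},T_{x_1 z}} = (2\pi f_1)^2 S_{T,T_z} = S_{T_{x_1 z},T_{x_1}} = -S_{T_z,T_{x_1 x_1}} = -S_{T_{x_1 x_1},T_z} \tag{150}$$

$$S_{T_{x_1},T_{x_2 x_2}} = i 2\pi f_1 S_{T_{x_2},T_{x_2}} = -S_{T_{x_2 x_2},T_{x_1}} = S_{T_{x_2},T_{x_1 x_2}} = -S_{T_{x_1 x_2},T_{x_2}} \tag{151}$$

$$S_{T_{x_1},T_{x_2 z}} = i 2\pi f_1 S_{T_{x_2},T_z} = S_{T_{x_2 z},T_{x_1}} = -S_{T_z,T_{x_1 x_2}} = S_{T_{x_2},T_{x_1 z}} = -S_{T_{x_1 x_2},T_z} = S_{T_{x_1 z},T_{x_2}} \tag{152}$$

$$S_{T_{x_1},T_{zz}} = -i 2\pi f_1 S_{T_z,T_z} = -S_{T_{zz},T_{x_1}} = -S_{T_z,T_{x_1 z}} = S_{T_{x_1 z},T_z} \tag{153}$$

$$S_{T_{x_2},T_{x_2 x_2}} = -\frac{2\sigma^2 (2\pi f_1) s_2}{\alpha d^3} \left[ \left( 6 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2 \right) K_1 + 2\pi f_1 d \left( 3 - \frac{4 s_2^2}{d^2} \right) K_0 \right] = -S_{T_{x_2 x_2},T_{x_2}} \tag{154}$$

$$S_{T_{x_2},T_{x_2 z}} = -\frac{2\sigma^2 (2\pi f_1) \beta}{\alpha^2 d^3} \left[ \left( 2 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2 \right) K_1 + 2\pi f_1 d \left( 1 - \frac{4 s_2^2}{d^2} \right) K_0 \right] = S_{T_{x_2 z},T_{x_2}} = -S_{T_z,T_{x_2 x_2}} = -S_{T_{x_2 x_2},T_z} \tag{155}$$

$$S_{T_{x_2},T_{zz}} = -S_{T_{x_2},T_{x_1 x_1}} - S_{T_{x_2},T_{x_2 x_2}} = -S_{T_{zz},T_{x_2}} = -S_{T_z,T_{x_2 z}} = S_{T_{x_2 z},T_z} \tag{156}$$

$$S_{T_z,T_{zz}} = S_{T_{x_1},T_{x_1 z}} + S_{T_{x_2},T_{x_2 z}} = S_{T_{zz},T_z} \tag{157}$$

$$S_{T_{x_1 x_1},T_{x_1 x_1}} = (2\pi f_1)^4 S_{T,T} \tag{158}$$

$$S_{T_{x_1 x_1},T_{x_1 x_2}} = i (2\pi f_1)^3 S_{T_{x_2},T} = S_{T_{x_1 x_2},T_{x_1 x_1}} \tag{159}$$

$$S_{T_{x_1 x_1},T_{x_1 z}} = -i (2\pi f_1)^3 S_{T,T_z} = -S_{T_{x_1 z},T_{x_1 x_1}} \tag{160}$$

$$S_{T_{x_1 x_1},T_{x_2 x_2}} = (2\pi f_1)^2 S_{T_{x_2},T_{x_2}} = S_{T_{x_2 x_2},T_{x_1 x_1}} = S_{T_{x_1 x_2},T_{x_1 x_2}} \tag{161}$$

$$S_{T_{x_1 x_1},T_{x_2 z}} = (2\pi f_1)^2 S_{T_{x_2},T_z} = -S_{T_{x_2 z},T_{x_1 x_1}} = -S_{T_{x_1 z},T_{x_1 x_2}} = S_{T_{x_1 x_2},T_{x_1 z}} \tag{162}$$

$$S_{T_{x_1 x_1},T_{zz}} = -(2\pi f_1)^2 S_{T_z,T_z} = S_{T_{zz},T_{x_1 x_1}} = -S_{T_{x_1 z},T_{x_1 z}} \tag{163}$$

$$S_{T_{x_1 x_2},T_{x_2 x_2}} = i 2\pi f_1 S_{T_{x_2 x_2},T_{x_2}} = S_{T_{x_2 x_2},T_{x_1 x_2}} \tag{164}$$

$$S_{T_{x_1 x_2},T_{x_2 z}} = -i 2\pi f_1 S_{T_{x_2},T_{x_2 z}} = -S_{T_{x_2 z},T_{x_1 x_2}} = -S_{T_{x_1 z},T_{x_2 x_2}} = S_{T_{x_2 x_2},T_{x_1 z}} \tag{165}$$

$$S_{T_{x_1 x_2},T_{zz}} = -i 2\pi f_1 S_{T_{x_2 z},T_z} = S_{T_{zz},T_{x_1 x_2}} = -S_{T_{x_1 z},T_{x_2 z}} = -S_{T_{x_2 z},T_{x_1 z}} \tag{166}$$

$$S_{T_{x_1 z},T_{zz}} = S_{T_{x_1 x_1},T_{x_1 z}} + S_{T_{x_2 x_2},T_{x_1 z}} = -S_{T_{zz},T_{x_1 z}} \tag{167}$$

$$S_{T_{x_2 x_2},T_{x_2 x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d^3} \left[ 2\pi f_1 d \left( 3 - \frac{24 s_2^2}{d^2} + \frac{24 s_2^4}{d^4} + (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2} \right) K_0 + 2 \left( 3 - \frac{24 s_2^2}{d^2} + \frac{24 s_2^4}{d^4} - 3 (2\pi f_1 s_2)^2 + 4 (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2} \right) K_1 \right] \tag{168}$$

$$S_{T_{x_2 x_2},T_{x_2 z}} = -\frac{2\sigma^2 (2\pi f_1) \beta s_2}{\alpha^2 d^5} \left[ 2\pi f_1 d \left( 12 - \frac{24 s_2^2}{d^2} - (2\pi f_1 s_2)^2 \right) K_0 + \left( 24 - \frac{48 s_2^2}{d^2} + 3 (2\pi f_1 d)^2 - 8 (2\pi f_1 s_2)^2 \right) K_1 \right] = -S_{T_{x_2 z},T_{x_2 x_2}} \tag{169}$$

$$S_{T_{x_2 x_2},T_{zz}} = -S_{T_{x_1 x_1},T_{x_2 x_2}} - S_{T_{x_2 x_2},T_{x_2 x_2}} = S_{T_{zz},T_{x_2 x_2}} = -S_{T_{x_2 z},T_{x_2 z}} \tag{170}$$

$$S_{T_{x_2 z},T_{zz}} = S_{T_{x_1 x_1},T_{x_2 z}} + S_{T_{x_2 x_2},T_{x_2 z}} = -S_{T_{zz},T_{x_2 z}} \tag{171}$$

$$S_{T_{zz},T_{zz}} = S_{T_{x_1 z},T_{x_1 z}} + S_{T_{x_2 z},T_{x_2 z}} \tag{172}$$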
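A small Python sketch, offered only as an illustration and not taken from the original text, evaluates Eq. (131) and checks Eq. (139) against a finite-difference second derivative of Eq. (131) with respect to $s_2$. To stay self-contained, it computes the Bessel functions from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ with a trapezoidal rule; the helper names, the truncation limit, and the parameter values are assumptions, and the quadrature is only suitable for moderate positive arguments.

```python
import math


def bessel_k(order, x, t_max=12.0, n=4000):
    """K_order(x) for x > 0 from the integral of exp(-x cosh t) cosh(order t), t in [0, t_max]."""
    h = t_max / n
    total = 0.5 * math.exp(-x)  # endpoint t = 0, where cosh(0) = 1
    for k in range(1, n + 1):
        t = k * h
        weight = 0.5 if k == n else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h


def d_param(s2, z1, z2, alpha):
    """Eq. (132), with beta = 1 + alpha (z1 + z2) as in Eq. (91)."""
    beta = 1.0 + alpha * (z1 + z2)
    return math.sqrt(beta**2 / alpha**2 + s2**2)


def psd_T_T(f1, s2, z1, z2, sigma2=1.0, alpha=1.0):
    """Eq. (131): hybrid PSD/covariance of the disturbing potential (f1 > 0)."""
    d = d_param(s2, z1, z2, alpha)
    return 2.0 * sigma2 / alpha * bessel_k(0, 2.0 * math.pi * f1 * d)


def psd_Tx2_Tx2(f1, s2, z1, z2, sigma2=1.0, alpha=1.0):
    """Eq. (139): hybrid PSD/covariance of T_x2 with itself (f1 > 0)."""
    d = d_param(s2, z1, z2, alpha)
    w = 2.0 * math.pi * f1
    k0, k1 = bessel_k(0, w * d), bessel_k(1, w * d)
    bracket = (1.0 - 2.0 * s2**2 / d**2) * k1 - w * s2**2 / d * k0
    return 2.0 * sigma2 * w / (alpha * d) * bracket


def minus_d2_ds2(f1, s2, z1, z2, sigma2=1.0, alpha=1.0, h=1e-3):
    """Central-difference estimate of -d^2 S_TT / d s2^2 (to compare with Eq. 139)."""
    def f(s):
        return psd_T_T(f1, s, z1, z2, sigma2, alpha)
    return -(f(s2 + h) - 2.0 * f(s2) + f(s2 - h)) / h**2


if __name__ == "__main__":
    args = (0.1, 0.7, 0.2, 0.3)  # illustrative (f1, s2, z1, z2), arbitrary units
    print(psd_Tx2_Tx2(*args, sigma2=1.0, alpha=0.5))
    print(minus_d2_ds2(*args, sigma2=1.0, alpha=0.5))
```

The two printed values should agree to within the finite-difference error. In practice a library routine for $K_0$ and $K_1$ would replace the quadrature helper.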

References

Alfeld P, Neamtu M, Schumaker LL (1996) Fitting scattered data on sphere-like surfaces using spherical splines. J Comput Appl Math 73:5–43
Baranov V (1957) A new method for interpretation of aeromagnetic maps: pseudo-gravimetric anomalies. Geophysics 22:359–383
Brown RG (1983) Introduction to random signal analysis and Kalman filtering. Wiley, New York
de Coulon F (1986) Signal theory and processing. Artech House, Dedham
Fengler MJ, Freeden W, Michel V (2004) The Kaiserslautern multiscale geopotential model SWITCH-03 from orbit perturbations of the satellite CHAMP and its comparison to models EGM96, UCPH2002_02_05, EIGEN-1S and EIGEN-2. Geophys J Int 157:499–514
Forsberg R (1985) Gravity field terrain effect computations by FFT. Bull Géod 59(4):342–360
Forsberg R (1987) A new covariance model, for inertial gravimetry and gradiometry. J Geophys Res 92(B2):1305–1310
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere, with applications in geomathematics. Clarendon, Oxford
Heller WG, Jordan SK (1979) Attenuated white noise statistical gravity model. J Geophys Res 84(B9):4680–4688
Helmert FR (1884) Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie, vol 2. BD Teubner, Leipzig
Hofmann-Wellenhof B, Moritz H (2005) Physical geodesy. Springer, Berlin
Jeffreys H (1955) Two properties of spherical harmonics. Q J Mech Appl Math 8(4):448–451
Jekeli C (1991) The statistics of the Earth’s gravity field, revisited. Manuscr Geod 16(5):313–325
Jekeli C (2005) Spline representations of functions on a sphere for geopotential modeling. Report no. 475, Geodetic Science, Ohio State University, Columbus. http://www.geology.osu.edu/~jekeli.1/OSUReports/reports/report_475.pdf
Jordan SK (1972) Self-consistent statistical models for the gravity anomaly, vertical deflections, and the undulation of the geoid. J Geophys Res 77(20):3660–3669
Jordan SK, Moonan PJ, Weiss JD (1981) State-space models of gravity disturbance gradients. IEEE Trans Aerosp Electron Syst AES 17(5):610–619
Kaula WM (1966) Theory of satellite geodesy. Blaisdell, Waltham
Lauritzen SL (1973) The probabilistic background of some statistical methods in physical geodesy. Report no. 48, Geodaestik Institute, Copenhagen
Lyche T, Schumaker LL (2000) A multiresolution tensor spline method for fitting functions on the sphere. SIAM J Sci Comput 22(2):724–746
Mandelbrot B (1983) The fractal geometry of nature. Freeman, San Francisco
Marple SL (1987) Digital spectral analysis with applications. Prentice-Hall, Englewood Cliffs
Martinec Z (1998) Boundary-value problems for gravimetric determination of a precise geoid. Springer, Berlin
Maybeck PS (1979) Stochastic models, estimation, and control, vols I and II. Academic, New York
Milbert DG (1991) A family of covariance functions based on degree variance models and expressible by elliptic integrals. Manuscr Geod 16:155–167
Moritz H (1976) Covariance functions in least-squares collocation. Report no. 240, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1978) Statistical foundations of collocation. Report no. 272, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1980) Advanced physical geodesy. Abacus Press, Tunbridge Wells
Olea RA (1999) Geostatistics for engineers and earth scientists. Kluwer Academic, Boston
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012a) The development and evaluation of earth gravitational model (EGM2008). J Geophys Res 117:B04406. doi:10.1029/2011JB008916
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012b) Correction to “The development and evaluation of Earth Gravitational Model (EGM2008)”. J Geophys Res 118:2633. doi:10.1002/jgrb.50167
Priestley MB (1981) Spectral analysis and time series analysis. Academic, London
Rummel R, Yi W, Stummer C (2011) GOCE gravitational gradiometry. J Geod 85:777–790
Schreiner M (1997) Locally supported kernels for spherical spline interpolation. J Approx Theory 89:172–194
Schumaker LL, Traas C (1991) Fitting scattered data on sphere-like surfaces using tensor products of trigonometric and polynomial splines. Numer Math 60:133–144
Tscherning CC (1976) Covariance expressions for second and lower order derivatives of the anomalous potential. Report no. 225, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm
Tscherning CC, Rapp RH (1974) Closed covariance expressions for gravity anomalies, geoid undulations and deflections of the vertical implied by anomaly degree variance models. Report no. 208, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm
Turcotte DL (1987) A fractal interpretation of topography and geoid spectra on the Earth, Moon, Venus, and Mars. J Geophys Res 92(B4):E597–E601
Watts AB (2001) Isostasy and flexure of the lithosphere. Cambridge University Press, Cambridge


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_29-3 © Springer-Verlag Berlin Heidelberg 2015

Modeling Uncertainty of Complex Earth Systems in Metric Space
Jef Caers, Kwangwon Park and Céline Scheidt
Department of Energy Resources Engineering, Stanford University, Stanford, CA, USA

Abstract
Modeling the subsurface of the Earth has many characteristic challenges. Earth models reflect the complexity of the Earth subsurface and contain many complex elements of modeling, such as the subsurface structures, the geological processes of growth and/or deposition, and the placement, movement, or injection/extraction of fluid and gaseous phases contained in rocks or soils. Moreover, due to the limited information provided by measurement data, whether from boreholes or geophysics, and the requirement to make interpretations at each stage of the modeling effort, uncertainty is inherent to any modeling effort. As a result, many alternative (input) models need to be built to reflect the ensemble of sources of uncertainty. On the other hand, the (engineering) purpose (in terms of target response) of these models is often very clear, simple, and straightforward: do we clean up or not, do we drill, where do we drill, what are the oil and gas reserves, how far have contaminants traveled, etc. The observation that models are complex but their purpose is simple suggests that input model complexity and dimensionality can be dramatically reduced, not by itself, but by means of the purpose or target response. Reducing dimension by only considering the variability between all possible models may be an impossible task, since the intrinsic variation between all input models is far too complex to be reduced to a few dimensions by simple statistical techniques such as principal component analysis (PCA). In this chapter, we will define a distance between two models created with different (and possibly randomized) input parameters. This distance can be tailored to the application or target output response at hand, but should be chosen such that it correlates with the difference in target response between any two models. A distance then defines a metric space, for which a broad body of theory exists. Starting from this point of view, we redefine many of the current Cartesian-based Earth modeling problems and methodologies, such as inverse modeling, stochastic simulation and estimation, model selection and screening, model updating, and response uncertainty evaluation, in metric space. We demonstrate how such a redefinition greatly simplifies any modeling effort and increases its effectiveness and efficiency, particularly for efforts that must address model and response uncertainty.

1 Introduction
The purpose of 3D/4D modeling and prediction is very clear: to produce forecasts (climate, reservoir flow, contaminant transport, groundwater recharge efficiency), estimate reserves (mineral resources, total contaminated sediment volume), or make decisions (drill wells, obtain more data, clean up). To complete this task, however, many fields of expertise in geomathematics are required: spatial modeling, structural modeling, process modeling, geological interpretation, data processing and interpretation, modeling of PDEs, optimization, decision theory, etc. In many applications, a rigorous assessment of uncertainty is critical since it is fundamental to any geo-engineering decision-making process. Despite significant progress in each area of expertise, a state-of-the-art, all-encompassing framework integrating all components is still lacking. Most analytical statistical approaches rely on Gaussian models that are too limited to realistically model all sources of uncertainty. Many statistical approaches are categorized under “Bayesian” models. Although general in their formulation, they rely on multivariate Gaussian models for prior and/or posterior as well as additional assumptions of (conditional) independence that cannot include a fully integrated modeling effort (Tarantola 1987; Besag and Green 1993; Moosegard and Tarantola 1995; Omre and Tjelmeland 1996). Also, the Earth models consist of many components, such as a discrete set of complex surfaces describing layering and faults. These cannot easily be described with random variables and their associated probability distributions. Parameterizing such surfaces for modeling is a nontrivial problem. Traditional Monte Carlo approaches are infeasible, mostly due to the large number of uncertain parameters involved and because modeling involves discretization of PDEs that require CPU-expensive computer codes (climate models, flow models, transport models, mine planning models). Experimental design and response surface analysis treat only nonspatial variables and are limited in scope due to their formulation in least-squares theory.

While this chapter does not solve the problem of modeling uncertainty, it introduces some new concepts that aim to overcome many of the current limitations. We will lay out a new approach to integrated modeling, termed distance-based modeling, or modeling in metric space. Model uncertainty (or variability) is formulated by choosing a distance, i.e., a single scalar value that measures the difference between any two complex 3D/4D models. If this distance is chosen with respect to the purpose of the modeling effort (e.g., reserves, amount of contaminant cleanup), then we will show how large complexity and dimensionality in 3D/4D modeling may be significantly reduced in the metric space defined by this distance. As a consequence, the complexity of the model is less of an obstacle as long as an appropriate distance can be defined. Therefore, we envision that any type of model (surface based, cell based, object based) as well as any source of uncertainty can be addressed. Modeling uncertainty, data conditioning, optimization, as well as decision-making can be done more efficiently, therefore becoming more practical for real-world applications. This is partly because this approach focuses on the ensemble of 3D/4D models, not on one model at a time, and on the purpose for which 3D/4D models are used.

Several papers have been published by the authors in this area that focus on more technical aspects (Scheidt et al. 2008; Scheidt and Caers 2009a, b, c; Alpak et al. 2009) and are oriented toward oil and gas reservoirs. The aim of this chapter is to provide an overview of this methodology and to focus on the motivation for, and principles of, this approach within a unified theoretical framework, while emphasizing generality of application. Therefore, we first review typical characteristics that make modeling in the Earth sciences unique (e.g., as opposed to modeling air flow over a wing). This analysis will lead to the observation that modeling does not require a Cartesian framework; a distance defining a metric space suffices.

2 Nomenclature
Because this chapter covers a wide spectrum of disciplines, some clarification on commonly used terms such as model, simulation, grid, etc. is needed. A model can denote many things, such as a multi-Gaussian random field model, which is fully specified when the spatial covariance function is known. From this random field model, samples can be generated or drawn. In geostatistical jargon, one often uses the term “realization” (sometimes also a “simulation”) and, in statistical jargon, a “sample.” In this chapter, we will also use the term “model,” since such a sample or realization is meant to model the Earth, whether subsurface rock types or atmospheric temperature variation. Complex 3D Earth models are often gridded, and such grids may contain several millions of grid cells. Each grid cell may have several properties or variables attached to it, and variables may vary in time (4D), i.e., be dynamic (versus static). A full list of mathematical notation is given in the back of this chapter.

[Figure 1: flow diagram linking (A) data and input variables (geological, geophysical, engineering), (B) the “model” (sedimentology, waves, flow), (C) the output (response prediction), and (D) optimization (control variables). A and B form the complex “system” in high-dimensional space; C and D lie in low-dimensional space, reached through purpose-driven modeling and dimension reduction.]

Fig. 1 Overview of modeling in the Earth sciences from an engineering perspective

3 Scientific Relevance

3.1 Characteristics of Modeling in the Earth Sciences
To rethink the goal and purpose of building complex models in the Earth sciences, consider the “system” view of modeling in Fig. 1. Typically, one builds models from data. Various sources of data exist: point sources or point samples (in geostatistical jargon: hard data), indirect sources often obtained from remote sensing and geophysical surveys, as well as data about the physical state of the system such as boundary conditions, rock and fluid properties, etc. Such data often require processing and interpretation before they are used in any modeling study. Building 3D/4D models requires considerable effort and expertise. First, grids need to be populated with static properties (often rock properties such as rock or soil type, porosity, etc.). The population of grid models with properties may be purely stochastic (e.g., using geostatistical techniques) or may include the physics (process models) by which the Earth system was generated (deposition or growth). Such models are also constrained to the input data (point or indirect sources), which requires the solution of difficult ill-posed inverse problems and knowledge of the physics (quantified in a forward model) relating the Earth system to the data acquired. 3D/4D models are used to solve practical scientific or engineering questions. Hence, when one or a set of alternative models have been built, these models are subsequently processed for prediction purposes. The term “applying a transfer function” has been coined for taking a single model and calculating a desired or “target” response from it. The alternative set of models is termed “input models.” If many input models have been created, the transfer function needs to be applied to each model, resulting in a set of alternative responses that reflect the uncertainty of the target response. This is a common Monte Carlo approach. In a reservoir/aquifer context, this response may be the amount of water or oil from a new well location; in mining, this could be a mining plan obtained through running an optimization code; in environmental applications, this may be the amount of contaminant in a drinking water well; in climate models, the sea temperature change at a specified location, etc. Based on this response prediction, one can then take various actions (see Fig. 1). Either more data is gathered with the aim of further reducing uncertainty, or controls are imposed on the system (pumping rates, control valves, CO2 reduction), or decisions are made (e.g., policy changes, clean up or not). An observation critically important to motivating our approach is that the complexity and dimensionality (number of variables) of the input data and models (boxes A and B in Fig. 1) are far greater than the complexity and dimensionality of the desired target response (box C) or control variables (box D). In fact, the desired output response can be as simple as a binary question: do we drill or not, do we clean up or not? At the same time, the complexity of the input model can be enormous, containing complex relationships between different types of variables and physics (e.g., flow in porous media or wave equations), and is moreover spatially complex. For example, if three variables (soil type, permeability, and porosity) are simulated on a one million cell grid and what is needed is contaminant concentration at 10 years for a groundwater well located at coordinate (x, y), then a single input model has a dimension of $3 \times 10^6$ while the target response is a single variable.
This simple but key observation suggests that input model complexity and dimensionality can be dramatically reduced, not by itself, but by means of the target response. Indeed, many factors may affect contaminant concentration at 10 years to a varying degree of importance. If a difference in value of a single input variable (porosity at location (x, y, z)) leads to a considerable difference in the target response, then that variable is critical to the decision-making process. Note that the previous statement contains the notion of a “distance.” However, because input models are of large dimension, complex, and spatially/time varying, it may not be trivial to discern which variables are critical to the decision-making process. Reducing dimension by simply considering the variability between all input models may be an impossible task, since the intrinsic variation between all input models is far too complex to be reduced to a few dimensions by simple statistical techniques such as principal component analysis (PCA). In the distance approach, we will therefore not work with differences between individual input parameters to the models but define a distance between two models created with different/randomized input parameters. This has the advantage that one no longer has to be concerned with interaction between parameters or whether a parameter is discrete or continuous. More importantly, that distance can be tailored to the application or target output response at hand. For example, if contaminant transport from a source to a specific location is the target, then a distance measuring the connectivity difference (from source to well) between any two models would be a suitable distance. The only requirement is that such a distance can be calculated relatively easily and with negligible effort and CPU time. In the next section, we define some basic concepts related to distances that are critical to our approach.
Then, we provide various tools for modeling uncertainty when models are complex and target responses require considerable CPU time. We show how many popular techniques in modeling, such as model expansion, model updating, and inverse modeling, can be reformulated in metric space and how, when done effectively, this reformulation provides greater efficiency to their solution methods.
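The Monte Carlo use of a transfer function described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the model generator (independent Gaussian values per grid cell), the transfer function (a mean over a few cells standing in for an expensive flow or transport code), and all names and sizes are hypothetical and not taken from the chapter.

```python
import random
import statistics


def generate_model(n_cells, rng):
    """Hypothetical input model: one Gaussian property value per grid cell."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_cells)]


def transfer_function(model, n_well_cells=50):
    """Hypothetical transfer function mapping a high-dimensional model to a
    single scalar target response (here, the mean over the first cells)."""
    return sum(model[:n_well_cells]) / n_well_cells


def monte_carlo_responses(n_models, n_cells, seed=0):
    """Apply the transfer function to each of n_models input models."""
    rng = random.Random(seed)
    return [transfer_function(generate_model(n_cells, rng))
            for _ in range(n_models)]


def summarize(responses):
    """Return the 10th percentile, median, and 90th percentile of the responses."""
    deciles = statistics.quantiles(responses, n=10)
    return deciles[0], statistics.median(responses), deciles[-1]


if __name__ == "__main__":
    responses = monte_carlo_responses(n_models=200, n_cells=1000, seed=1)
    print(summarize(responses))
```

Each input model here has 1,000 values while the response is a single number, which mirrors, on a much smaller scale, the gap in dimensionality between boxes A/B and box C of Fig. 1.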


3.2 Distances, Metric Space, and Multidimensional Scaling
A single (input) model $i$ is represented by a vector $\mathbf{x}_i$ which contains either properties (continuous or categorical or a mix of them) on a grid or an exhaustive list of variables uniquely quantifying that model. The “size” or “dimension” $N$ of the model is then the length of this vector, for example, the number of grid cells in a gridded model. $N$ is typically very large. $L$ denotes the number of alternative models generated, with typically $L \ll N$. All models are collected in the matrix $\mathbf{X}$

$$\mathbf{X} = [\mathbf{x}_1\ \mathbf{x}_2\ \ldots\ \mathbf{x}_L]^T \quad \text{of size } L \times N. \tag{1}$$

One of the most studied distances is the Euclidean distance, which is defined as

$$d_{ij} = \sqrt{(\mathbf{x}_i - \mathbf{x}_j)^T (\mathbf{x}_i - \mathbf{x}_j)} \tag{2}$$

if one applies this distance to a pair of models $\mathbf{x}_i$, $\mathbf{x}_j$. Models exist within a Cartesian space $D$ of high dimension $N$; see Fig. 2. A distance, such as the Euclidean distance, defines a metric space $M$, which is a space equipped only with a distance and hence has no axes, origin, or direction. This means that we cannot uniquely define the location of any $\mathbf{x}$ in this space, only how far each $\mathbf{x}_i$ is from any other $\mathbf{x}_j$. Even though we cannot uniquely define locations for $\mathbf{x}$ in $M$, we can present some mapping or projection of these points in a low-dimensional Cartesian space. Indeed, knowing the distance table between a set of cities, we can produce a 2D map of these cities, up to rotation, reflection, and translation of the mapped city locations. To construct such maps, we employ a traditional statistical technique termed multidimensional scaling (MDS, Borg and Groenen 1997). In MDS, we rely on a duality between the dot-product matrix of the models in $\mathbf{X}$ and the Euclidean distances between pairs of models (Eq. 2). The MDS procedure works as follows. First, we center around the origin as follows:

$$b_{ij} = -\frac{1}{2} \left( d_{ij}^2 - \frac{1}{L} \sum_{k=1}^{L} d_{ik}^2 - \frac{1}{L} \sum_{l=1}^{L} d_{lj}^2 + \frac{1}{L^2} \sum_{k=1}^{L} \sum_{l=1}^{L} d_{kl}^2 \right).$$

This expression can be represented in matrix form as follows. First, construct a matrix $\mathbf{A}$ containing the elements

$$a_{ij} = -\frac{1}{2} d_{ij}^2 \tag{3}$$

and then, center this matrix as follows:

$$\mathbf{B} = \mathbf{H}\mathbf{A}\mathbf{H} \quad \text{with} \quad \mathbf{H} = \mathbf{I} - \frac{1}{L} \mathbf{1}\mathbf{1}^T \tag{4}$$

with $\mathbf{1} = [1\ 1\ 1 \ldots 1]^T$ a column vector of $L$ ones and $\mathbf{I}$ the identity matrix of dimension $L$. Matrix $\mathbf{B}$ contains the elements $b_{ij}$. $\mathbf{B}$ can then be written as

$$\mathbf{B} = (\mathbf{H}\mathbf{X})(\mathbf{H}\mathbf{X})^T \quad \text{of size } L \times L. \tag{5}$$

Consider now the eigenvalue decomposition of $\mathbf{B}$ as

$$\mathbf{B} = \mathbf{V}_B \boldsymbol{\Lambda}_B \mathbf{V}_B^T. \tag{6}$$

[Figure 2: schematic contrasting the high-dimensional Cartesian space of models x with the metric space on x, in which only the distance between models is defined (no axis defined; maximum dimension = number of models).]

Fig. 2 Representation of models in a metric space. In all rigor, one cannot make a plot of models in metric space, since their exact location remains undefined

In our case, $L \ll N$ and the distance is Euclidean; hence all eigenvalues are nonnegative. We can now reconstruct (map onto a location in Cartesian space) any $\mathbf{x}$ in $\mathbf{X}$ in any dimension from a minimum of one dimension up to a maximum of $L$ dimensions, by considering that

$$\mathbf{B} = (\mathbf{H}\mathbf{X})(\mathbf{H}\mathbf{X})^T = \mathbf{V}_B \boldsymbol{\Lambda}_B \mathbf{V}_B^T \;\Rightarrow\; \mathbf{X} = \mathbf{V}_B \boldsymbol{\Lambda}_B^{1/2} \;\Rightarrow\; \mathbf{X}_d = \mathbf{V}_{B,d} \boldsymbol{\Lambda}_{B,d}^{1/2} : \mathbf{X} \overset{\text{MDS}}{\mapsto} \mathbf{X}_d \tag{7}$$

if we take the $d$ largest eigenvalues. $\mathbf{V}_{B,d}$ contains the eigenvectors belonging to the $d$ largest eigenvalues contained in the diagonal matrix $\boldsymbol{\Lambda}_{B,d}$. The solution $\mathbf{X}_d$ retained by MDS is such that the mapped locations have their centroid as origin and the axes are chosen as the principal axes of $\mathbf{X}$. The classical MDS was developed for Euclidean distances, but as an extension, the same operations can be done on any distance matrix (although positive definiteness is not always guaranteed for many typical Earth science applications; see Scheidt and Caers 2009a). Consider the following example in Fig. 3. One thousand models are generated from a multi-Gaussian random field model (size $N = 10{,}000 = 100 \times 100$) using a spherical anisotropic covariance model and a standard Gaussian marginal. The Euclidean distance is calculated between any two models, resulting in a $1{,}000 \times 1{,}000$ distance matrix. A 2D mapping of the realizations is retained in Fig. 3. What is important in this plot is that the Euclidean distance in 2D in Fig. 3 is a good approximation of the Euclidean distance between the models. The axis values are not of any relevance; it is the relative position of locations that matters. Scheidt and Caers (2009a) found that projecting models with MDS rarely requires five or more dimensions for the Euclidean distance between mapped locations to correlate well with the actual (application-tailored) distance. Consider now an alternative distance definition between two models. Using the same models as previously, two points are located (A and B; see Fig. 4). A measure of connectivity (see Park and Caers 2007 for the exact definition) is calculated for each realization. Such a measure simply states how well the high values form a connected path between those two locations. The distance is then simply the difference in connectivity between two models. Using this distance, we produce an equivalent 2D map of the same models; see Fig. 4. Note the difference between Figs. 3 and 4, although both plot locations of the same set of models. If we investigate the connectivity-based projections (Fig. 5), we note how the models of the left-most group are disconnected, while models on the right reflect connected ones. However, any two models that map closely may look, at least by visual inspection, quite different.
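The classical MDS steps of Eqs. (3), (4), (6), and (7) can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, the random test models, and the clipping of negative eigenvalues (a pragmatic guard for non-Euclidean distance matrices) are assumptions.

```python
import numpy as np


def pairwise_euclidean(X):
    """Euclidean distance matrix (Eq. 2) between the rows of X (one model per row)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, 0.0)                 # remove round-off on the diagonal
    return np.sqrt(np.clip(d2, 0.0, None))


def classical_mds(D, d=2):
    """Map an (L, L) distance matrix D to (L, d) Cartesian coordinates."""
    L = D.shape[0]
    A = -0.5 * D**2                           # Eq. (3)
    H = np.eye(L) - np.ones((L, L)) / L       # centering matrix, Eq. (4)
    B = H @ A @ H
    eigval, eigvec = np.linalg.eigh(B)        # Eq. (6); eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:d]      # indices of the d largest eigenvalues
    lam = np.clip(eigval[order], 0.0, None)   # assumption: drop negative eigenvalues
    return eigvec[:, order] * np.sqrt(lam)    # Eq. (7)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 50))          # L = 6 toy models of dimension N = 50
    D = pairwise_euclidean(X)
    X2 = classical_mds(D, d=2)                # 2D map of the six models
    print(X2)
```

For Euclidean distances between L models, keeping L − 1 dimensions reproduces the original distances exactly (up to round-off), because centering leaves at most L − 1 nonzero eigenvalues; keeping fewer dimensions gives the approximation plotted in maps such as Fig. 3.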


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_29-3 © Springer-Verlag Berlin Heidelberg 2015

[Figure 3 panels: "1000 Gaussian models" (X1, X2, ..., X1000) and "2D projection of model locations with MDS"; axes x1 and x2]

Fig. 3 Thousand Gaussian models and their locations after projection with MDS: Euclidean distance case. The axes are given by Eq. (7)

[Figure 4 panels: "1000 Gaussian models" with locations A and B marked and "2D projection of model locations with MDS"; axes x1 and x2]

Fig. 4 Thousand Gaussian models and their locations after projection with MDS: connectivity distance case

Consider now the case where these models are used to assess uncertainty of a contaminant traveling from the source A to a well at location B. Suppose that we are interested in the arrival time of such a contaminant; then Fig. 6 demonstrates clearly that a connectivity distance nicely sorts models in a low-dimensional space, while models projected based on the Euclidean distance do not sort well at all. This will be important later, when we attempt to constrain models to data such as travel times (or any other nonlinear response).

3.3 Kernels and Feature Space

Defining a distance on models and projecting them in a low-dimensional (2D or 3D) Cartesian space presents a simple but powerful diagnostic on model variability in terms of the application


Fig. 5 Location of a few selected models

[Figure 6 panels: contaminant arrival time (color scale from low to high) at model locations; (a) Euclidean distance, (b) connectivity distance; axes x1 and x2]

Fig. 6 Plotting a response function (arrival time) at locations of models projected with different distances

at hand. Clearly, how one looks at model uncertainty is application dependent (see the difference between Figs. 3 and 4). Aside from being a diagnostic tool, the metric space approach to modeling can only be useful if models can be selected (for target response uncertainty evaluation) or new models can be created (e.g., for model updating, for solving inverse problems, or other optimization problems) that share similar properties to the initial set of models. An important problem that will be addressed is how to create a new model from an existing cloud of models, without necessarily needing to go back to the initial sampling or generating algorithm. As shown in Fig. 5, the cloud of models in a 2D projection Cartesian space may have a complex shape, which may make common statistical operations such as clustering or PCA difficult. In this chapter, we propose to use kernel techniques (Vapnik 1998; Schölkopf and Smola 2002) to transform from one metric space into


Fig. 7 Concept of metric and feature space and their projection with MDS

a new metric space such that, after projecting in 2D, 3D, etc. from this new metric space, the cloud of models displays a simpler arrangement. In fact, many common statistical techniques (e.g., clustering, PCA, and even regression) require only knowledge of a distance (or difference) and hence can be applied in any metric space without explicit knowledge of a Cartesian space. One of the main purposes of this chapter is to show that many geomathematical modeling techniques can equally be defined in metric space and therefore a wider set of problems (such as nonlinear, non-Gaussian) solved. Transformations from metric-to-metric space are much simpler than transformations from Cartesian-to-Cartesian space (Fig. 7). Indeed, any transformation (in a high-dimensional Cartesian space) that would simplify the variability between models $x_i$ from a non-Gaussian to a more Gaussian and linear-type behavior is likely nontrivial, involving a complex, multivariate (multipoint) function $\varphi$:

$$x_i \mapsto \varphi(x_i) \;\Rightarrow\; X \mapsto \Phi, \tag{8}$$

where the set of realizations X is transformed into a new set $\Phi$. Finding a suitable $\varphi$ is virtually impossible. Fortunately, we do not need to know $\varphi$; we only need to know its dot product, since that will fully define a new metric space on which several statistical operations such as clustering, PCA, regression, or model expansion can be applied (the “kernel trick,” Schölkopf and Smola 2002). To distinguish this space from the metric space defined by our application-tailored distance, we term this new space the “feature space” F, as often done in computer science publications. If this transformation transforms directly from metric space to another metric space, then the dot product of this new space is written as

$$K_{ij} = k(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle. \tag{9}$$
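The kernel trick can be made concrete with a kernel whose feature map is known explicitly. The sketch below uses the homogeneous quadratic kernel on 2D inputs, which is not the kernel adopted in this chapter; it is chosen only because its feature map `phi` can be written down, so that the equality in Eq. (9) can be checked numerically. The function names and input vectors are illustrative assumptions.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the kernel k(x, y) = (x . y)^2 on 2D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def k(x, y):
    """Kernel evaluated directly, without ever forming phi."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
lhs = k(x, y)                    # kernel value computed in the input space
rhs = float(phi(x) @ phi(y))     # dot product computed in feature space
```

Both quantities agree, which is the point of Eq. (9): every operation that needs only dot products in feature space can be carried out with `k` alone.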

Note that the matrix K constructed from the input elements $K_{ij}$ is also termed the Gram matrix. A function $k(\cdot,\cdot)$ that is considered a dot product in the feature space of $\varphi$ is also known as a kernel function. What remains is the specification of the kernel function. The literature offers many choices for kernel functions, but since we work with Euclidean distance after projection with MDS,


[Figure 8 panels: "2D projection of models from metric space" and "2D projection of models from feature space" (RBF kernel); axes x1 and x2]

Fig. 8 Comparison of locations of models and response function evaluation after projection with MDS from metric space and feature space

we will employ throughout our work the Gaussian radial basis function (RBF), which is given by

$$K_{ij} = k(x_i, x_j) = \exp\left(-\frac{(x_{d,i} - x_{d,j})^T (x_{d,i} - x_{d,j})}{2\sigma^2}\right) \tag{10}$$

with $x_i \xrightarrow{\text{MDS}} x_{d,i}$ and $x_j \xrightarrow{\text{MDS}} x_{d,j}$, where $\sigma$ denotes the kernel bandwidth. Note that we use the approximating Euclidean distances between locations after MDS projection. Since MDS transforms the non-Euclidean distance into an approximating Euclidean distance, we can make use of many useful properties that link Euclidean distances, Gaussian variables, and RBF kernels readily available in the literature (Schölkopf and Smola 2002). Once again, notice the importance of MDS: it allows transforming any non-Euclidean distance into an approximating Euclidean distance. Then, kernel functions are used to simplify variability in the metric space defined by these Euclidean distances. To project realizations in feature space, we apply the same MDS operation to the Gram matrix K. The eigenvalue decomposition of K is calculated, and projections can be mapped in any dimension (Fig. 7), for example, in 2D using

$$\Phi_{f=2} = V_{K,f=2}\, \Lambda_{K,f=2}^{1/2}, \tag{11}$$

where $V_{K,f=2}$ contains the eigenvectors of K belonging to the two largest eigenvalues of K contained in the diagonal matrix $\Lambda_{K,f=2}$. An illustrative example is provided in Fig. 8. One thousand multi-Gaussian models were mapped into 2D Cartesian space (same as Fig. 5). What is shown on the right side of Fig. 8 are the 2D projections of models in feature space. Note how the disorganized cloud of model locations in projected Cartesian space (left side) becomes more organized when model locations are projected to feature space, allowing easier modeling and quantification of model variability. In actual applications, the dimension of the feature space will be taken much higher. Partially, it is this increase in dimension that renders locations of models more linear/Gaussian.
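A minimal sketch of the two steps just described, building the Gram matrix from MDS locations and projecting from feature space with the leading eigenpairs of K, might look as follows. The bandwidth `sigma`, the helper names `rbf_gram` and `project_feature_space`, and the random 2D locations standing in for the MDS output are all assumptions made for illustration.

```python
import numpy as np

def rbf_gram(Xd, sigma):
    """Gram matrix K_ij = exp(-||x_{d,i} - x_{d,j}||^2 / (2 sigma^2))."""
    sq = ((Xd[:, None, :] - Xd[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def project_feature_space(K, f=2):
    """Phi_f = V_{K,f} Lambda_{K,f}^{1/2} from the f largest eigenpairs of K."""
    eigval, eigvec = np.linalg.eigh(K)
    idx = np.argsort(eigval)[::-1][:f]
    return eigvec[:, idx] * np.sqrt(np.clip(eigval[idx], 0.0, None))

# Stand-in for MDS locations of 100 models (assumption: random 2D points).
rng = np.random.default_rng(1)
Xd = rng.standard_normal((100, 2))
K = rbf_gram(Xd, sigma=1.0)
Phi2 = project_feature_space(K, f=2)     # 2D projection from feature space
```

In practice the bandwidth would have to be chosen with care, since it controls how strongly the kernel reshapes the cloud of locations; the value used here is arbitrary.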


3.4 Model Expansion in Metric Space

In geostatistics and spatial statistics, the spatial covariance of a variable is used to generate multiple models of that variable (concept of stochastic simulation; see Lantuejoul 2002; Ripley 2004) reflecting the spatial heterogeneity expressed in that covariance function. The most straightforward and also exact technique to accomplish this is through LU decomposition, where the model covariance matrix C (size N × N) is decomposed into a lower (L) and an upper (U) triangular matrix. Then, a correlated Gaussian vector x can be expressed by

$$C = LU, \qquad x = m + Ly \tag{12}$$

given a random vector of Gaussian deviates y, with m the mean of the Gaussian field. The resulting realizations x follow a multi-Gaussian distribution exactly. The LU decomposition has fallen out of favor because the covariance matrix C is very large, namely, of dimension N × N. An alternative, approximate method to the LU decomposition is the so-called Karhunen-Loeve (KL) model expansion (Karhunen 1947; Loève 1978), which relies on the eigen-decomposition of C instead:

$$C = V_C \Lambda_C V_C^T, \qquad x = V_C \Lambda_C^{1/2}\, y. \tag{13}$$
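Equations (12) and (13) can be compared side by side on a small grid. The sketch below assumes the symmetric form of the triangular factorization (Cholesky, where U is the transpose of L) and an exponential covariance on a 1D grid; both are illustrative choices, as are the grid size and correlation length.

```python
import numpy as np

n = 50
h = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
C = np.exp(-h / 10.0)                     # exponential covariance on a 1D grid
m = np.zeros(n)                           # mean of the Gaussian field
rng = np.random.default_rng(2)
y = rng.standard_normal(n)                # vector of Gaussian deviates

# Triangular factorization (Cholesky form, C = L L^T): x = m + L y
Lfac = np.linalg.cholesky(C)
x_lu = m + Lfac @ y

# Karhunen-Loeve expansion: C = V Lambda V^T, x = m + V Lambda^{1/2} y
lam, V = np.linalg.eigh(C)
A = V * np.sqrt(np.clip(lam, 0.0, None))  # columns scaled by sqrt(eigenvalue)
x_kl = m + A @ y
```

Both factors reproduce C when multiplied by their transpose, so both realizations have covariance C. The KL form becomes approximate only once the expansion is truncated to the leading eigenpairs.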

As stated, the KL model expansion is an expansion of the original random field, so it will only regenerate approximately the input covariance C, at least on average over all models. Note that for large covariance matrices C, the KL expansion is as CPU prohibitive as the LU decomposition. To solve this problem, we propose to apply the KL expansion in metric space. This is possible because of a duality between a spatial covariance and a Euclidean distance for Gaussian models. Indeed, one can establish a relationship between the ensemble covariance matrix of a set of Gaussian models and the Euclidean distances between these realizations. The (centered) ensemble covariance matrix between a set of L models is defined as

$$C = \frac{1}{L} (HX)^T (HX) \;\Rightarrow\; C = \frac{1}{L} X^T H X, \tag{14}$$

where H is the centering matrix. Consider now the eigenvalue decomposition of the dot product B in (6) and (7); pre-multiplying with $(HX)^T$, we get

$$B V_B = V_B \Lambda_B \;\Rightarrow\; (HX)(HX)^T V_B = V_B \Lambda_B \;\Rightarrow\; (HX)^T (HX)(HX)^T V_B = (HX)^T V_B \Lambda_B. \tag{15}$$

This means that

$$V_C = (HX)^T V_B \Lambda_B^{-1/2} \quad \text{and} \quad \Lambda_C = \frac{\Lambda_B}{L}. \tag{16}$$
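The duality in Eqs. (14)–(16) is easy to verify numerically. The following sketch assumes one model per row of X and uses small, arbitrary sizes; it checks that the eigenpairs of the N × N covariance C can be recovered from the much smaller L × L matrix B.

```python
import numpy as np

rng = np.random.default_rng(3)
L, N = 20, 200                            # L models, N grid cells
X = rng.standard_normal((L, N))           # one model per row (assumption)
H = np.eye(L) - np.ones((L, L)) / L       # centering matrix
HX = H @ X

C = X.T @ H @ X / L                       # Eq. (14), size N x N
B = HX @ HX.T                             # dot-product matrix, size L x L

lamB, VB = np.linalg.eigh(B)
keep = lamB > 1e-10 * lamB.max()          # drop the null direction from centering
lamB, VB = lamB[keep], VB[:, keep]

VC = HX.T @ VB / np.sqrt(lamB)            # Eq. (16): V_C = (HX)^T V_B Lambda_B^{-1/2}
lamC = lamB / L                           # Eq. (16): Lambda_C = Lambda_B / L
```

Centering removes one degree of freedom, so B has at most L − 1 nonzero eigenvalues; the sketch discards the remaining null direction before dividing by the square root of the eigenvalues.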

In other words, given the eigenvalue decomposition of B, with eigenvectors $V_B$ and eigenvalue matrix $\Lambda_B$, we can obtain the eigenvalue decomposition of the ensemble covariance C. This means that we can reconstruct the matrix C from the eigen-decomposition of B. Note that since $L \ll N$, the cost of calculating the eigenvalue decomposition of B is much less than the cost of calculating


Fig. 9 Thousand input models are used to generate thousand new models using KL expansion, three models are shown

the eigen-decomposition of C. This motivates the use of distances instead of covariances to create model expansions, which is done next in the context of Karhunen-Loeve expansions. Consider now the case where we have L models of a Gaussian random function; hence, we have matrix X. The N × N centered ensemble covariance matrix C is (Eq. 14)

$$C = \frac{1}{L} X^T H X.$$

The relation between the eigen-decomposition of the dot product matrix B and the eigenvalue decomposition of C (Eq. 16) then allows applying the KL expansion (Eq. 13) in terms of Euclidean distance, instead of covariance; we get

$$x = (HX)^T V_B \frac{1}{\sqrt{L}}\, y. \tag{18}$$
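Equation (18) translates almost directly into code. The sketch below draws one new model from an existing ensemble; adding the ensemble mean back to the zero-mean expansion is an assumption of this example (Eq. (18) itself describes only the centered part), and the function name `kl_from_ensemble` and the toy ensemble are hypothetical.

```python
import numpy as np

def kl_from_ensemble(X, rng):
    """Draw one new model as a linear combination of the centered rows of X (Eq. 18)."""
    L = X.shape[0]
    H = np.eye(L) - np.ones((L, L)) / L
    HX = H @ X
    lamB, VB = np.linalg.eigh(HX @ HX.T)
    VB = VB[:, lamB > 1e-10 * lamB.max()]      # keep directions with nonzero variance
    y = rng.standard_normal(VB.shape[1])       # new Gaussian deviates y_new
    weights = VB @ y / np.sqrt(L)              # coefficients on the centered models
    # Assumption: the ensemble mean is added back to the zero-mean expansion.
    return X.mean(axis=0) + HX.T @ weights, weights

rng = np.random.default_rng(4)
X = rng.standard_normal((30, 500))             # toy ensemble: 30 models, 500 cells
x_new, w = kl_from_ensemble(X, rng)
```

Only the L × L matrix is decomposed, never the N × N covariance, which is where the computational saving discussed above comes from.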

Observe that a new Gaussian model can be written as a linear combination of the existing Gaussian models contained in X, by generating a new vector of Gaussian variables $y^{new}$. Figure 9 shows an example of how models can be generated using a Euclidean distance table. First, 1,000 models of a Gaussian field are generated. The Euclidean distance table was then calculated from these models. Next, 1,000 new models are generated, of which three are shown in Fig. 9. The variogram model and the ensemble variogram of both the 1,000 original and 1,000 new models are shown in Fig. 10b. As can be observed, the histogram and variogram statistics are maintained. To check the variability between the new models, we create a 2D MDS map and plot them on top of the original models. Figure 10a shows that the distribution of model locations in this 2D map is similar; hence, the new models have a similar variability between them as the original ones. The KL expansion remains faithful to the multi-Gaussian properties of the original models. The same KL expansion can be applied in feature space. This will be useful to generate new models when the original input models are non-Gaussian and/or the application-tailored distance is non-Euclidean since the traditional KL expansion is formulated for covariances and Gaussian random fields. Many applications in the Earth sciences require models that are non-Gaussian,


[Figure 10 panels: (a) 2D MDS map, blue dots: 300 new models, red dots: 300 original models; (b) variogram versus distance (variogram model, experimental variogram of initial realizations, experimental variogram of new realizations) and QQ-plot of new models versus initial models]

Fig. 10 (a) Comparison of variability, as represented by location of models, between the initial input models and the model expansion. (b) Comparison of variogram and histogram (QQ-plot)

such as binary models (e.g., Ising model), object-based models, or Markov random field models (Lantuejoul 2002). The KL expansion in the non-Euclidean, non-Gaussian case is based on the eigenvalue decomposition of the Gram matrix K,

$$K = V_K \Lambda_K V_K^T. \tag{19}$$

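In order to perform a KL expansion in feature space, we need to know the eigenvalue decomposition of the covariance of the models $\varphi(x_i)$ in feature space:

$$C_\varphi = \frac{1}{L} \Phi \Phi^T = V_{C_\varphi} \Lambda_{C_\varphi} V_{C_\varphi}^T \;\Rightarrow\; \varphi(x) = V_{C_\varphi} \Lambda_{C_\varphi}^{1/2}\, y, \tag{20}$$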

where y is a vector of standard Gaussian variables. Evidently, this covariance cannot be explicitly calculated since only the dot product is known, not $\Phi$ itself. Using the duality between covariance and distance (in this case the dot product K), we know there exists a relationship between the eigen-decompositions of the covariance of models in feature space and the dot product of that space, namely,

$$\Lambda_{C_\varphi} = \frac{1}{L} \Lambda_K, \qquad V_{C_\varphi} = \Phi V_K \Lambda_K^{-1/2}. \tag{21}$$

Hence, any model in feature space can be written as


$$\varphi(x) = \Phi b \quad \text{with} \quad b = \frac{1}{\sqrt{L}} V_K\, y. \tag{22}$$

This shows how a new model in feature space is formulated as a linear combination of the original input models mapped in this space, $\Phi$. A new model $\varphi(x^{new})$ can be obtained by generating a new vector of Gaussian variables $y^{new}$. The problem with (22) is that neither $\varphi$ nor $\Phi$ is explicitly known; only the dot product (kernel function) is, so any back transformation by applying some $\varphi^{-1}$ to obtain the corresponding model x is even less feasible. In the next section, we demonstrate how such a back transformation relies only on the dot product itself, not on explicit knowledge of $\varphi$ or $\Phi$. In the computer science literature, this problem is better known as the pre-image problem. We will discuss this problem and further extend the solution to it for Earth science-specific applications.

4 Key Issues

4.1 The Pre-image Problem

So far, we have established a simple and efficient method for generating new models in feature space (as points projected in that space), providing a model expansion based on the initial set of input models. This procedure can be summarized as follows:

1. Generate an initial set of prior (or input) models.
2. Define an application-tailored distance between these models.
3. Use MDS to transform this application-tailored distance into a Euclidean distance.
4. Use these Euclidean distances to establish a Gram matrix K with the radial basis function (Eq. 10).
5. Establish a KL expansion in feature space based on the eigenvalue decomposition of K (Eqs. 20–21).

Note that all these operations occur in metric space, and no specific axis system is needed to formulate these operations. MDS can also be used to project the models in a low-dimensional Cartesian space and visualize the results in a 2D or 3D plot. Remaining is the problem of specifying a new model in physical space (N-dimensional), that is, with actual meaningful Earth properties. In the computer science literature, this problem is also known as the “pre-image problem” (Schölkopf and Smola 2002). In this pre-image problem, we deduce, based on the relative position of the model in metric space, what this new model, created by the KL expansion, looks like. Several solutions to this problem are formulated in the computer science literature (Schölkopf and Smola 2002; Kwok and Tsang 2004). However, due to the possibly large dimension of Earth science models x, these solutions do not apply in practice to Earth models, which could contain millions of grid cells. In this chapter, we propose practical solutions to the pre-image problem for real Earth models.

To solve the pre-image problem for Earth models, we include an additional step to the traditional pre-image solution. Consider Fig. 11. On the right is depicted a feature space and its projection through eigenvalue decomposition on K, and on the left a metric space with its own MDS projected space. A new model generated in feature space is shown as well as its corresponding projection. Instead of finding directly a full pre-image, we will first attempt to find its relative position in


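[Figure 11 diagram: metric space and feature space linked by $\varphi$ and $\varphi^{-1}$; the model expansion $\Phi V_K y^{new} / \sqrt{L}$ gives $\varphi(x^{new})$ in feature space; "MDS on B" projects the metric space and "MDS on K" projects the feature space; $x^{new}$ and $x_d^{new}$ are marked]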

Fig. 11 Illustration of model expansion in feature space and the pre-image problem

metric space, and then deduce the full image. This has the advantage of working with a much lower dimension through projection with MDS over reconstructing directly the full vector x. Consider the new, yet unknown model as $x^{new}$ and its (also not yet known) location mapped by MDS in Cartesian space (see Fig. 11) as

$$x^{new} \mapsto x_d^{new}. \tag{23}$$

According to Fig. 11, we first look into the problem of finding $x_d^{new}$ knowing its expansion in feature space $\varphi(x^{new})$. This is essentially the traditional pre-image problem treated in the computer science literature. We pose this problem as an optimization problem as follows:

$$\hat{x}_d^{new} = \arg\min_{x_d^{new}} \left\| \varphi(x_d^{new}) - \Phi b^{new} \right\|^2 \quad \text{with} \quad b^{new} = \frac{1}{\sqrt{L}} V_K\, y^{new}. \tag{24}$$

In other words, one desires the Euclidean distance between the new model mapped in feature space and the model expansion $\Phi b^{new}$ to be as small as possible. This inverse problem is ill-posed because of the nonlinear nature of the function $\varphi$. We can solve this optimization problem without requiring explicit knowledge of $\varphi$ as follows. Since the feature space is obtained through a transformation of distances from the metric space, there exists a relationship between the Euclidean distances d in both these metric spaces:

$$d(x_{d,i}, x_{d,j}) = K_{ii} + K_{jj} - 2\, d(\varphi(x_{d,i}), \varphi(x_{d,j})). \tag{25}$$

Hence, the minimization problem in (24) is equivalent to the pre-image problem

$$\hat{x}_d^{new} = \arg\min_{x_d^{new}} \left( [\varphi(x_d^{new})]^T \varphi(x_d^{new}) - 2\, \varphi^T(x_d^{new})\, \Phi b^{new} + [b^{new}]^T K b^{new} \right). \tag{26}$$

This optimization problem contains only dot products, and solving it does not require explicit knowledge of the function $\varphi$. In order to find a minimum of (26), we set the gradient to zero and solve that equation iteratively using a fixed-point algorithm (Schölkopf and Smola 2002), where each iteration $\ell$ consists of:


$$\hat{x}_d^{new,\ell} = \frac{\sum_{i=1}^{L} b_i^{new}\, k(\hat{x}_d^{new,\ell-1}, x_{d,i})\, x_{d,i}}{\sum_{i=1}^{L} b_i^{new}\, k(\hat{x}_d^{new,\ell-1}, x_{d,i})} \tag{27}$$

with k the radial basis kernel. The starting value $\hat{x}_d^{new,0}$ is defined as a linear combination of the existing model realizations in Cartesian space, with random coefficients. Equation (27) is iterated until convergence, resulting in a solution that is a nonlinear combination of the existing model realization locations in the Cartesian space after projection by MDS:

$$\hat{x}_d^{new} = \sum_{i=1}^{L} \beta_i^{opt}\, x_{d,i} \quad \text{with} \quad \sum_{i=1}^{L} \beta_i^{opt} = 1. \tag{28a}$$
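The fixed-point iteration of Eq. (27) and the resulting weights of Eq. (28a) can be sketched as follows. The coefficients `b` are drawn positive here purely to keep the toy iteration well behaved; in the chapter they come from $b^{new} = V_K y^{new} / \sqrt{L}$ and can take either sign. The function names, the bandwidth `sigma`, the stopping rule, and the random MDS locations are assumptions of this example.

```python
import numpy as np

def rbf(x, Xd, sigma):
    """RBF kernel between one location x and every row of Xd."""
    return np.exp(-((Xd - x) ** 2).sum(axis=1) / (2.0 * sigma ** 2))

def preimage_fixed_point(Xd, b, sigma, x0, n_iter=200, tol=1e-10):
    """Iterate Eq. (27); return the location and the weights beta of Eq. (28a)."""
    x = np.asarray(x0, dtype=float)
    beta = np.full(len(b), 1.0 / len(b))   # fallback weights if no update is possible
    for _ in range(n_iter):
        w = b * rbf(x, Xd, sigma)
        den = w.sum()
        if abs(den) < 1e-12:               # denominator vanished: keep the last iterate
            break
        beta = w / den                     # weights sum to one by construction
        x_next = beta @ Xd
        step = np.linalg.norm(x_next - x)
        x = x_next
        if step < tol:
            break
    return x, beta

rng = np.random.default_rng(5)
L = 40
Xd = rng.standard_normal((L, 2))           # toy MDS locations of L models
b = rng.uniform(0.1, 1.0, size=L)          # toy positive coefficients (assumption)
x0 = rng.dirichlet(np.ones(L)) @ Xd        # random linear combination as starting value
x_hat, beta = preimage_fixed_point(Xd, b, sigma=1.0, x0=x0)
```

Because each iterate is a weighted average of the existing locations with weights that depend on the current iterate, the final combination is nonlinear in the locations even though it has the form of a weighted sum.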

Note that the weights $\beta$ depend on the locations $x_{d,i}$, and hence this constitutes a nonlinear combination. The weights sum up to unity, by construction. The above optimization provides the location of a new model after projection with MDS. Remaining is the problem of specifying what this new model looks like in its actual physical space, that is, actual values of properties on a grid. Three options are proposed in Scheidt et al. (2008): unconstrained optimization, feature-constrained optimization, and geologically constrained optimization. The first option consists of applying the same weights $\beta^{opt}$ in (28a) to the existing input models:

$$\hat{x}^{new} = \sum_{i=1}^{L} \beta_i^{opt}\, x_i \quad \text{with} \quad \sum_{i=1}^{L} \beta_i^{opt} = 1. \tag{28b}$$

The extension of the linear combination of locations $x_{d,i}$ to models $x_i$ can be motivated as follows. The result of applying MDS is such that the distance between any two models is approximately the same as the Euclidean distance between their locations after projection in a low-dimensional Cartesian space. Hence, if a Euclidean distance is an adequate measure of describing variability between models, then extrapolating Eq. (27) to models in physical space would be valid. As a result, Eq. (28) would work if the input models are drawn from a multi-Gaussian model, because of the duality between the spatial covariance of a multi-Gaussian model and the Euclidean distance between the models (see Eq. (16)) and because any linear combination of Gaussian models is again a Gaussian model. The histogram (mean) is preserved if the weights sum up to unity. The spatial covariance (which includes variance) is preserved if the squared weights sum up to unity. That constraint is not explicitly enforced here. However, practice (Scheidt et al. 2008) has shown that, in the case of a large enough ensemble, sufficient preservation of the covariance model is obtained by applying (28) to any covariance-based model (not necessarily Gaussian). If the input models are binary (or categorical), then a nonlinear combination of binary realizations will not necessarily yield a binary model. In such a case, we propose to solve the optimization problem in (26) using any stochastic sampling method (Bayesian or otherwise) that guarantees reproduction of the prior spatial continuity model, such as the gradual deformation (Roggero and Hu 1998), probability perturbation optimization algorithms (Caers and Hoffman


[Figure 12 plot: MDS projection of the channel models; axes x1 and x2]

Fig. 12 Five of 300 channel-type models are shown and their projection with MDS (using a connectivity distance). Using probability perturbation, a single-channel model corresponding to the black dot is created

2006), or McMC samplers. These algorithms ensure that the new realization has the same spatial statistical properties as the input models. Figure 12 shows the result of solving a pre-image problem in a non-Gaussian case. Using a multiple-point geostatistical algorithm, 300 models are created that depict a channel structure, of which five are shown. The task solved in the pre-image solution is how to generate a new model that is not part of the initial set of 300 and is located at a certain position in metric space. This position is highlighted by the black dot in Fig. 12. The probability perturbation algorithm is a Bayesian technique for sampling constrained to an objective (likelihood) and a prior model, in this case the multiple-point statistics model. Starting from an initial guess (the green dot in Fig. 12), this algorithm iteratively finds a solution to the optimization problem (26) as shown in Fig. 12. The resulting model maps exactly at the location of the black dot and has a similar channel structure as the initial 300 models. For details on the overall procedure, we refer to Scheidt et al. (2008).

4.2 The Post-image Problem

In the previous section, we showed how, given any set of prior models (Gaussian or not), a model expansion can be created from this prior set and a sampling technique can be developed to create new models from the existing ones. In this section, we show how such a model expansion can be used to solve large-scale ill-posed inverse problems very efficiently. Consider the following problem formulation, which is often of practical interest in solving inverse problems: a set of input models and their corresponding output responses (generated using some multivariate function g, e.g., a PDE solution, termed the forward model) are available:

$$X = [x_1\; x_2\; \ldots\; x_L]^T, \qquad G = [g(x_1)\; g(x_2)\; \ldots\; g(x_L)]^T. \tag{29}$$


In addition, observation of the true response is available. Geophysical or other measurements are collected from the field and assembled in the vector data. Inverse modeling entails searching for models x as follows:

$$\text{find } x \text{ such that } d(\text{data}, g(x)) = 0, \tag{30a}$$

or

$$x = \arg\min_{x_i}\, d(\text{data}, g(x_i)), \tag{30b}$$

where d is some measure of difference (a distance, an objective) between forward simulated data and actual field data. Note that we do not include any error (data or model). Often, a perfect solution (d = 0) cannot be found, and hence one solves the optimization problem (30b). The “true Earth” $x_{true}$, evidently, is part of the space of models that match the data if one assumes no errors and a prior model that covers the true Earth. Consider reformulating this problem in metric space. As distance, we take

$$d_{ij} = d(g(x_i), g(x_j)) \quad \forall i, j, \tag{31}$$

which is, as in (30b), some measure of difference d in the forward model evaluation between any two models. Since the response of the true Earth $x_{true}$ is available as data, we also have

$$d_{i,\text{data}} = d(g(x_i), \text{data}) \quad \forall i. \tag{32}$$

Hence, we augment the models in (29) as follows:

$$X^+ = [x_1\; x_2\; \ldots\; x_L,\; x_{true}]^T, \qquad G^+ = [g(x_1)\; g(x_2)\; \ldots\; g(x_L),\; \text{data}]^T. \tag{33}$$
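A small sketch can show how the augmented responses of Eq. (33) lead to a map in which the data, and hence the unknown true Earth, receive a location alongside the models. The forward model `forward` below is a hypothetical stand-in (a smooth breakthrough curve driven by the model mean), not a flow simulator, and the ensemble and "field data" are synthetic.

```python
import numpy as np

def classical_mds(D, d=2):
    """Classical MDS from a distance matrix (double centering, leading eigenpairs)."""
    L = D.shape[0]
    H = np.eye(L) - np.ones((L, L)) / L
    eigval, eigvec = np.linalg.eigh(-0.5 * H @ (D ** 2) @ H)
    idx = np.argsort(eigval)[::-1][:d]
    return eigvec[:, idx] * np.sqrt(np.clip(eigval[idx], 0.0, None))

def forward(x, t):
    """Hypothetical stand-in for the forward model g: a smooth breakthrough curve."""
    return 1.0 / (1.0 + np.exp(-(t - 5.0 * x.mean() - 5.0)))

rng = np.random.default_rng(6)
t = np.linspace(0.0, 10.0, 40)
X = rng.uniform(0.0, 1.0, size=(200, 25))          # 200 toy models
G = np.array([forward(x, t) for x in X])           # responses g(x_i)
data = forward(rng.uniform(0.0, 1.0, size=25), t)  # synthetic "field data"

G_plus = np.vstack([G, data])                      # augmented responses, Eq. (33)
diff = G_plus[:, None, :] - G_plus[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))              # least-squares response distance
Xd = classical_mds(D, d=2)
models_2d, true_earth_2d = Xd[:-1], Xd[-1]         # the last row locates the data
mismatch = D[:-1, -1]                              # d(g(x_i), data), Eq. (32)
```

Note that only responses enter the distance, so the true Earth never has to be known as a model; its location follows from the data row alone.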

Using d in (32) as distance, MDS can be performed on $X^+$ and models in metric space can be projected in a low-dimensional Euclidean space. An example is shown in Fig. 13. In this example, 200 Gaussian models of permeability are created. Two wells are located in the model; one well I (Fig. 13) injects fluid A in a porous medium with fluid B. Another well P produces both fluids. The data in this case consist of the fraction of fluid A produced from the producing well P. A discretization of the flow equations (PDEs) is implemented as a forward model and termed “flow simulator.” The flow simulator is applied to each of the 200 models, and 200 responses are obtained, which together with the field data (see Fig. 13) create the vectors in (33). MDS is applied using the least-squares difference between any two responses as distance. A 2D plot is created in Fig. 13, the color representing the mismatch with the field data. The black dot represents the position of the true Earth. Distance (32) creates a metric space and a projection with MDS. The true Earth $x_{true}$ is unknown; however, its projection from metric space using MDS can be uniquely determined; see, for example, Fig. 13. Consider now the following assumption. Assume the relationship between the Euclidean distance between any two models $x_i$ and $x_j$ and the response distance is linear/proportional, namely, that

$$d_E(x_i, x_j) \propto d(g(x_i), g(x_j)) \;\; \forall i, j; \quad \text{hence also} \quad d_E(x_i, x_{true}) \propto d(g(x_i), \text{data}) \tag{34}$$


[Figure 13 panels: permeability model with injector I and producer P; 2D MDS projection (axes x1 and x2) with the true Earth marked; field data, fraction of flow of fluid A versus time (days)]

Fig. 13 Example of MDS of 200 permeability models and the location of the true Earth

with $d_E$ the Euclidean distance. Then, the locations (such as in Fig. 13) of the actual models mapped with MDS are the same as the locations of responses of the models mapped with MDS. Hence, the inverse problem is reduced to finding those models that map at the location of the data in Fig. 13. To find these models, one formulates a model expansion based on the initial set of models using KL expansion. Then, one needs to find those models (generated through the KL model expansion in feature space) whose distance with respect to $x_d^{true}$ is zero. This problem is different from the model expansion problem treated previously, that is, finding model expansions through solving a pre-image problem. Indeed, in the pre-image problem, we know the relative position of the model expansion $\varphi(x^{new})$ in feature space, and we need to find a (new) model x that maps close to $\varphi(x^{new})$. In this so-called post-image problem, the relative position in feature space of the expansion is unknown, while the relative location in metric space $x_d^{true}$ is known. Figure 14 compares the pre-image with the post-image problem. Given the model expansion (22) in feature space, the post-image problem is formulated as an optimization problem as follows:

$$y_{opt} = \arg\min_{y^{new}}\, d(x_d^{true}, x_d^{new}) \quad \text{with} \quad \varphi(x_d^{new}) = \Phi \frac{1}{\sqrt{L}} V_K\, y^{new}. \tag{35}$$
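The structure of the post-image search in Eq. (35) can be illustrated with a deliberately simple optimizer. The sketch below uses plain random search over y as a stand-in for a proper stochastic optimizer, maps each candidate to a location with the fixed-point iteration of Eq. (27), and keeps the candidate closest to the location of the data. All numerical values (locations, bandwidth, number of trials, target location) are toy assumptions.

```python
import numpy as np

def rbf(x, Xd, sigma):
    """RBF kernel between one location x and every row of Xd."""
    return np.exp(-((Xd - x) ** 2).sum(axis=1) / (2.0 * sigma ** 2))

def locate(Xd, b, sigma, n_iter=50):
    """Fixed-point iteration of Eq. (27), started at the centroid of Xd."""
    x = Xd.mean(axis=0)
    beta = np.full(len(b), 1.0 / len(b))
    for _ in range(n_iter):
        w = b * rbf(x, Xd, sigma)
        den = w.sum()
        if abs(den) < 1e-12:        # guard against a vanishing denominator
            break
        beta = w / den
        x = beta @ Xd
    return x, beta

rng = np.random.default_rng(7)
L, sigma = 60, 1.0
Xd = rng.standard_normal((L, 2))              # toy MDS locations of L models
x_true = np.array([0.3, -0.2])                # toy MDS location of the data
sq = ((Xd[:, None, :] - Xd[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq / (2.0 * sigma ** 2))          # Gram matrix, Eq. (10)
_, VK = np.linalg.eigh(K)

best_dist, best_beta = np.inf, None
for _ in range(300):                          # plain random search over y
    y = rng.standard_normal(L)
    b = VK @ y / np.sqrt(L)                   # Eq. (22)
    x_new, beta = locate(Xd, b, sigma)
    dist = np.linalg.norm(x_new - x_true)
    if np.isfinite(dist) and dist < best_dist:
        best_dist, best_beta = dist, beta
```

No forward model is evaluated inside the loop, which mirrors the observation that the post-image search needs no new response evaluations; the retained weights `best_beta` would then be applied to the input models as in Eq. (28b).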

Then, given a solution $y_{opt}$ for (35), we can calculate the weights b (Eq. (22)) and hence the weights $\beta^{opt}$ in (27) or (28) to obtain a new model in the same way as was done for solving the pre-image problem. For obtaining multiple solutions, we apply a global stochastic optimization technique on y. A suitable technique is gradual deformation (Hu and Roggero 1998) since it searches in the space of Gaussian random variables. Each solution for $y_{opt}$ then provides a new set of weights $\beta^{opt}$, hence a new model x that matches the data. Note that the solution of the post-image problem does not require any new response output evaluation. Clearly, Eq. (34) requires some linearity in the inverse model formulation for (35) to be exact. Indeed, the physics in the forward model may be very nonlinear, entailing that the Euclidean difference in two models with different properties (e.g., two hydraulic conductivity models) is


[Figure 14 diagram: the pre-image problem and the post-image problem, each showing the MDS projection from metric space, the MDS projection from feature space, the model expansion, and the unknown location ("? x")]

Fig. 14 Comparison between the pre-image and post-image problem

nonlinearly related to a difference in the response (e.g., contaminant transport) calculated from these models. The solution to this case, as with any nonlinear optimization problem, is to iterate the “linear” problem solution, that is, take the solution of (35) as a new initial guess, and solve the linear problem until convergence.

5 Results

5.1 Illustration of the Post-image Problem

A simple illustrative example is presented to demonstrate the post-image problem. Consider the following inverse problem related to flow in Fig. 15. In a simulated/synthetic case study, water is injected at location I into a porous medium containing any other liquid and travels over time to location P. After about 500 days, water arrives at location P and is extracted from a well P at that location; see Fig. 15c. The percentage of water extracted is recorded for 2,000 days. The task is to find permeability models (porosity is assumed constant and known) that match this data. What is also assumed to be known for this case study is that the spatial continuity model for this permeability field is described by an anisotropic variogram. Figure 15d shows the location of 200 models after projection with MDS using, as distance, the difference in flow response (see Fig. 13) between the models. Figure 15a shows the (prior) permeability models that are drawn from a multi-Gaussian distribution with given spatial covariance. Using the field data, the unknown “true permeability” can also be plotted; see Fig. 15d. The post-image problem can now be formulated and solved. Three models are shown in Fig. 15b and their forward model response plotted on the data in Fig. 15c. Clearly, these posterior model solutions match the data accurately as well as represent the same (known) spatial continuity as the prior models. Many additional matches can be obtained at virtually no CPU cost.

Consider now a similar setup but with different prior knowledge. Suppose a site has been further geologically assessed and it has been determined that channel structures with certain width and sinuosity occur. What is unknown is the location of these channel elements. Figure 16a shows three models of channel structures that have been created with a multiple-point algorithm (Strebelle 2002) reflecting the geologist’s view of the subsurface. Using the same techniques as applied to the previous case, this problem can be solved through (35). Note that the resulting solution in

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_29-3 © Springer-Verlag Berlin Heidelberg 2015

[Figure 15, panels a to d. Labels: injection location I, production location P, “True Earth”, Data. Axis names: Water production at P, Time (days).]

Fig. 15 (a) Three prior models of permeability not constrained to any data. (b) Three posterior models constrained to the water fractional flow data. (c) Green dots are the field data, gray lines are the forward model responses of the prior models, and red lines are the forward model responses of the matched models. (d) Projection with MDS of the models and the data, color indicates mismatch, and axis units are removed because they are not essential.

Fig. 16b shows the same channel structures now positioned such that the flow data is matched; see Fig. 16c. This example illustrates the generality of the distance approach.

6 Conclusions and Future Directions

We end this chapter by pointing out some important properties of modeling in metric space and the use of distances. A distance should not be confused with a measure (a length). For example, one can calculate a global or a local property of a model (such as a mean or any other summary statistic of a variable over a certain area). Often, these summaries are used as a “proxy” of a more CPU-demanding response (e.g., Balin et al. 1992). A distance should not be confused with a proxy. A distance need not require the evaluation of properties or proxies of any individual model. A Euclidean distance and a Hausdorff distance (Dubuisson and Jain 1994; Suzuki and Caers 2008; Suzuki et al. 2008) are such examples. The critical contribution of using distances is twofold. First, it allows including the purpose or target application in the modeling effort, rendering it more effective (targeted) and efficient (lower dimension). Second, it allows modeling with an ensemble of models instead of one model at a time. This last observation is critical for inverse modeling of ill-posed problems and for addressing the important issue of uncertainty, which is critical in engineering applications involving risk. The challenge of many inverse modeling applications is not just to provide a single or a few models that match data but to provide predictions, which requires many models. While substantial effort in the inverse modeling literature has been devoted to obtaining one inverse solution, obtaining several hundred


[Figure 16, panels a to d. Labels: injection location I, production location P, “True Earth”. Axis names: Water production at P, Time (days).]

Fig. 16 Channel case: (a) three prior models of permeability not constrained to any data. (b) Three posterior models constrained to the water fractional flow data. (c) Green dots are the field data, gray lines are the forward model responses of the prior models, and red lines are the forward model responses of the matched models. (d) Projection with MDS of the models and the data, color indicates mismatch, and axis units are removed because they are not essential

solutions is still considered infeasible, not theoretically but practically, due to CPU concerns related to forward models. With the above approach of post-imaging (working with an ensemble of models), this target can now be reached. Other approaches such as ensemble Kalman filters (Evensen 2003) have the same goal but have limiting assumptions of Gaussianity and linearity built into their theory. The distance approach does not have such limitations, as clearly demonstrated by the example of Fig. 16.

The use of kernels in modeling is not new. It has been recognized recently that many statistical techniques such as clustering, PCA, and regression can be formulated with dot products and do not need a Cartesian framework. In this chapter, we extend that observation to spatial modeling. Many spatial modeling techniques require only distances and differences (Kriging is such an example) and hence can be reformulated in metric spaces, resulting in increased robustness and applicability of such techniques, as was observed for clustering and PCA. Moreover, the metric allows inclusion of the purpose of modeling.

A key assumption for some of the proposed techniques to work well is that the distance chosen correlates sufficiently with the difference in target response. Hence, the question arises what would happen if that is not the case. Note that in many practical situations we cannot assess directly how high or low that correlation is, since that would require knowing/evaluating the target response (often CPU prohibitive). In Scheidt and Caers (2009), this issue is further investigated. Choosing a “wrong” distance, that is, one that has very low correlation to target response difference, will not necessarily lead to a “bias” when applying the distance method in such a case; in fact, it leads to


an increase in “variance.” This means that the distance approach becomes inefficient and, in other words, requires an increased number of response evaluations to provide useful results. The goal and philosophy of the metric space approach is not different from many existing approaches to data assimilation, data integration, and inverse modeling through Bayesian approaches. All these approaches attempt to sample models, constrained to a prior model (either as a probability distribution or a generating algorithm) and a likelihood function (the relation between data and model). The resulting samples (realizations or models) of the posterior distribution are representative of the remaining uncertainty. What is different in the distance approaches is the practicality to solve real problems that are high dimensional, involve many components of modeling and optimization, and are consistent with prior geological concepts of the phenomenon studied.

7 Notation

$L$                Number of input models
$N$                “Dimension” of an input model, for example, number of grid cells
$x_i$              A single model
$X$                Matrix containing all models
$D$                Metric space
$F$                Feature space
$d$                Dimension of Cartesian space in which models are projected with MDS
$d_{ij}$           Application-tailored distance (can be Euclidean) between model $x_i$ and $x_j$
$B$                Dot-product matrix derived from application-tailored distance
$\Lambda_S$        Eigenvalue matrix of a matrix $S$
$\Lambda_{S,d}$    Eigenvalue matrix of a matrix $S$ containing the $d$ largest eigenvalues and rest set to zero
$V_S$              Eigenvector matrix of $S$
$V_{S,d}$          Eigenvector matrix retaining eigenvectors belonging to the $d$ largest eigenvalues
$x_{d,i}$          Location of $x_i$ after projection from metric space into $d$ dimensional Cartesian space
$\varphi$          Unspecified multivariate function that maps models to display more linear variation, mapping can be performed on $x_i$ or on $x_{d,i}$
$\Phi$             Matrix of models after being mapped with $\varphi$
$k(\cdot,\cdot)$   Kernel function
$K_{ij}$           Entry $i$, $j$ into the gram matrix $K$
$f$                Dimension after mapping from feature space to Cartesian space using MDS with $K$
$C$                Ensemble covariance matrix of models in $X$
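The notation above names the ingredients of a projection with MDS: a distance $d_{ij}$, a dot-product matrix $B$, and the $d$ largest eigenvalues and eigenvectors used to place each model at $x_{d,i}$. The following is a minimal sketch of how these pieces can fit together, assuming classical (Torgerson) MDS with double centering of the squared distances; that construction is a standard one and is an assumption here, and the function name `classical_mds` is hypothetical rather than taken from the chapter.

```python
import numpy as np


def classical_mds(dist, d):
    """Project L models into d Cartesian dimensions from an L x L distance matrix.

    Assumes classical (Torgerson) MDS; names are illustrative only.
    """
    n_models = dist.shape[0]
    # Centering matrix H = I - (1/L) 1 1^T
    centering = np.eye(n_models) - np.ones((n_models, n_models)) / n_models
    # Dot-product matrix B derived from the squared distances
    b = -0.5 * centering @ (dist ** 2) @ centering
    eigval, eigvec = np.linalg.eigh(b)            # ascending eigenvalues
    keep = np.argsort(eigval)[::-1][:d]           # indices of the d largest
    lam_d = np.clip(eigval[keep], 0.0, None)      # guard against tiny negatives
    # Row i holds the coordinates x_{d,i} of model i
    return eigvec[:, keep] * np.sqrt(lam_d)


rng = np.random.default_rng(0)
models = rng.normal(size=(6, 2))                  # six toy "models" in two dimensions
dist = np.linalg.norm(models[:, None] - models[None, :], axis=-1)
coords = classical_mds(dist, 2)
```

For a Euclidean distance between points that really live in $d$ dimensions, the projected coordinates reproduce the input distances up to rotation and reflection. For an application-tailored distance the projection is only approximate, which is why the choice of $d$ matters.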


References

Alpak F, Barton M, Caers J (2010) A flow-based pattern recognition algorithm for rapid quantification of geologic uncertainty: application to high resolution channelized reservoir models. Comput Geosci 14:603–621
Besag J, Green PJ (1993) Spatial statistics and Bayesian computation. J R Stat Soc B 55:3–23
Borg I, Groenen P (1997) Modern multidimensional scaling: theory and applications. Springer, New York
Caers J, Hoffman T (2006) The probability perturbation method: a new look at Bayesian inverse modeling. Math Geol 38(1):81–100
Dubuisson MP, Jain AK (1994) A modified Hausdorff distance for object matching. In: Proceedings of the 12th international conference on pattern recognition, Jerusalem, vol A, pp 566–568
Evensen G (2003) The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dyn 53:343
Karhunen K (1947) Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Ann Acad Sci Fennicae Ser A I Math-Phys 37:1–79
Kwok JT-Y, Tsang IW-H (2004) The pre-image problem in kernel methods. IEEE Trans Neural Netw 15(6):1517–1525
Lantuejoul C (2002) Geostatistical simulation. Springer, Berlin
Loève M (1978) Probability theory, vol II, 4th edn. Graduate texts in mathematics, vol 46. Springer, Berlin
Mosegaard K, Tarantola A (1995) Monte Carlo sampling of solutions to inverse problems. J Geophys Res B 100:12431–12447
Omre H, Tjelmeland H (1996) Petroleum geostatistics. In: Baafi EY, Schofield NA (eds) Proceeding of the fifth international geostatistics congress, Wollongong Australia, vol 1, pp 41–52
Park K, Caers J (2007) History matching in low-dimensional connectivity vector space. In: EAGE petroleum geostatistics conference, Cascais, 10–14 Sept 2007
Ripley BD (2004) Spatial statistics. Wiley series in probability and statistics. Wiley, Hoboken, 251p
Roggero F, Hu LY (1998) Gradual deformation of continuous geostatistical models for history matching. In: Proceedings society of petroleum engineers 49004, annual technical conference, New Orleans
Scheidt C, Caers J (2009a) Bootstrap confidence intervals for reservoir model selection techniques. Comput Geosci. doi:10.1007/s10596-009-9156-8
Scheidt C, Caers J (2009b) Uncertainty quantification in reservoir performance using distances and kernel methods – application to a West-Africa deepwater turbidite reservoir. SPEJ 118740-PA. Online First
Scheidt C, Caers J (2009c) Representing spatial uncertainty using distances and kernels. Math Geosci 41(4):397–419. doi:10.1007/s11004-008-9186-0
Scheidt C, Park K, Caers J (2008) Defining a random function from a given set of model realizations. In: Proceedings of the VIII international geostatistics congress, Santiago, 1–5 Dec 2008
Schölkopf B, Smola A (2002) Learning with kernels. MIT, Cambridge
Strebelle S (2002) Conditional simulation of complex geological structures using multiple-point geostatistics. Math Geol 34:1–22
Suzuki S, Caers J (2008) A distance-based prior model parameterization for constraining solutions of spatial inverse problems. Math Geosci 40(4):445–469
Suzuki S, Caumon G, Caers J (2008) Dynamic data integration into structural modeling: model screening approach using a distance-based model parameterization. Comput Geosci 12(1):105–119
Tarantola A (1987) Inverse problem theory. Elsevier, Amsterdam, 342p
Vapnik V (1998) Statistical learning theory. Wiley, New York


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_30-2 © Springer-Verlag Berlin Heidelberg 2014

Scalar and Vector Slepian Functions, Spherical Signal Estimation and Spectral Analysis

Frederik J. Simons (a, b) and Alain Plattner (a, c)

(a) Department of Geosciences, Princeton University, Princeton, NJ, USA
(b) Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
(c) Department of Earth and Environmental Science, California State University, Fresno, CA, USA

Abstract

It is a well-known fact that mathematical functions that are timelimited (or spacelimited) cannot be simultaneously bandlimited (in frequency). Yet the finite precision of measurement and computation unavoidably bandlimits our observation and modeling of scientific data, and we often only have access to, or are only interested in, a study area that is temporally or spatially bounded. In the geosciences we may be interested in spectrally modeling a time series defined only on a certain interval, or we may want to characterize a specific geographical area observed using an effectively bandlimited measurement device. It is clear that analyzing and representing scientific data of this kind will be facilitated if a basis of functions can be found that are “spatiospectrally” concentrated, i.e., “localized” in both domains at the same time. Here, we give a theoretical overview of one particular approach to this “concentration” problem, as originally proposed for time series by Slepian and coworkers in the 1960s. We show how this framework leads to practical algorithms and statistically performant methods for the analysis of signals and their power spectra in one and two dimensions and particularly for applications in the geosciences and for scalar and vectorial signals defined on the surface of a unit sphere.

1 Introduction

It is well appreciated that functions cannot have finite support in the temporal (or spatial) and spectral domain at the same time (Slepian 1983). Finding and representing signals that are optimally concentrated in both is a fundamental problem in information theory which was solved in the early 1960s by Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961, 1962). The extensions and generalizations of this problem (Daubechies 1988, 1990; Daubechies and Paul 1988; Cohen 1989) have strong connections with the burgeoning field of wavelet analysis. In this contribution, however, we shall not talk about wavelets, the scaled translates of a “mother” with vanishing moments, the tool for multiresolution analysis (Daubechies 1992; Flandrin 1998; Mallat 1998). Rather, we devote our attention entirely to what we shall collectively refer to as “Slepian functions,” in multiple Cartesian dimensions and on the sphere. These we understand to be orthogonal families of functions that are all defined on a common, e.g., geographical, domain, where they are either optimally concentrated or within which they are exactly limited, and which at the same time are exactly confined within a certain bandwidth or maximally concentrated therein. The measure of concentration is invariably a quadratic energy




ratio, which, though only one choice out of many (Donoho and Stark 1989; Freeden and Windheuser 1997; Riedel and Sidorenko 1995; Freeden and Schreiner 2010; Michel 2010), is perfectly suited to the nature of the problems we are attempting to address. These are, for example: How do we make estimates of signals that are noisily and incompletely observed? How do we analyze the properties of such signals efficiently, and how can we represent them economically? How do we estimate the power spectrum of noisy and incomplete data? What are the particular constraints imposed by dealing with potential-field signals (gravity, magnetism, etc.) and how is the altitude of the observation point, e.g., from a satellite in orbit, taken into account? What are the statistical properties of the resulting signal and power spectral estimates? These and other questions have been studied extensively in one dimension, that is, for time series, but until the twenty-first century, remarkably little work had been done in the Cartesian plane or on the surface of the sphere. For the geosciences, the latter two domains of application are nevertheless vital for the obvious reasons that they deal with information (measurement and modeling) that is geographically distributed on (a portion of) a planetary surface. In our own recent series of papers (Wieczorek and Simons 2005, 2007; Simons and Dahlen 2006, 2007; Simons et al. 2006, 2009; Dahlen and Simons 2008; Plattner and Simons 2013, 2014) we have dealt extensively with Slepian’s problem in spherical geometry. Asymptotic reductions to the plane (Simons et al. 2006; Simons and Wang 2011) then generalize Slepian’s early treatment of the multidimensional Cartesian case (Slepian 1964). In this chapter we provide a framework for the analysis and representation of geoscientific data by means of Slepian functions defined for time series, on two-dimensional Cartesian, and spherical domains. 
We emphasize the common ground underlying the construction of all Slepian functions, discuss practical algorithms, and review the major findings of our own recent work on signal (Wieczorek and Simons 2005; Simons and Dahlen 2006) and power spectral estimation theory on the sphere (Wieczorek and Simons 2007; Dahlen and Simons 2008). Compared to the first edition of this work (Simons 2010), we now also include a section on vector-valued Slepian functions that brings the theory in line with the modern demands of (satellite) gravity, geomagnetic, or oceanographic data analysis (Freeden 2010; Olsen et al. 2010; Grafarend et al. 2010; Martinec 2010; Sabaka et al. 2010).

2 Theory of Slepian Functions

In this section we review the theory of Slepian functions in one dimension, in the Cartesian plane, and on the surface of the unit sphere. The one-dimensional theory is quite well known and perhaps most accessibly presented in the textbook by Percival and Walden (1993). It is briefly reformulated here for consistency and to establish some notation. The two-dimensional planar case formed the subject of one of Slepian’s lesser-known papers (Slepian 1964) and is reviewed here also. We do not discuss alternatives by which two-dimensional Slepian functions are constructed by forming the outer product of pairs of one-dimensional functions. While this approach has produced some useful results (Hanssen 1997; Simons et al. 2000), it does not solve the concentration problem sensu stricto. The spherical scalar case was treated in most detail, and for the first time, by ourselves elsewhere (Wieczorek and Simons 2005; Simons et al. 2006; Simons and Dahlen 2006), though two very important early studies by Slepian, Grünbaum, and others laid much of the foundation for the analytical treatment of the spherical concentration problem for cases with special symmetries (Gilbert and Slepian 1977; Grünbaum 1981). The spherical vector case was treated in its most general form by ourselves elsewhere (Plattner et al. 2012; Plattner and Simons 2013, 2014),


but had also been studied in some, but less general, detail by researchers interested in medical imaging (Maniar and Mitra 2005; Mitra and Maniar 2006) and optics (Jahn and Bokor 2012, 2013). Finally, we recast the theory in the context of reproducing-kernel Hilbert spaces, through which the reader may appreciate some of the connections with radial basis functions, splines, and wavelet analysis, which are commonly formulated in such a framework (Freeden et al. 1998; Michel 2010).

2.1 Spatiospectral Concentration for Time Series

2.1.1 General Theory in One Dimension

We use $t$ to denote time or one-dimensional space and $\omega$ for angular frequency, and adopt a normalization convention (Mallat 1998) in which a real-valued time-domain signal $f(t)$ and its Fourier transform $F(\omega)$ are related by

$$f(t) = (2\pi)^{-1}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega, \qquad F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{1}$$

The problem of finding the strictly bandlimited signal

$$g(t) = (2\pi)^{-1}\int_{-W}^{W} G(\omega)\,e^{i\omega t}\,d\omega, \tag{2}$$

that is maximally, though by virtue of the Paley-Wiener theorem (Daubechies 1992; Mallat 1998) never completely, concentrated into a time interval $|t| \le T$, was first considered by Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961). The optimally concentrated signal is taken to be the one with the least energy outside of the interval:

$$\lambda = \frac{\int_{-T}^{T} g^2(t)\,dt}{\int_{-\infty}^{\infty} g^2(t)\,dt} = \text{maximum}. \tag{3}$$

Bandlimited functions $g(t)$ satisfying the variational problem (3) have spectra $G(\omega)$ that satisfy the frequency-domain convolutional integral eigenvalue equation

$$\int_{-W}^{W} D(\omega,\omega')\,G(\omega')\,d\omega' = \lambda\,G(\omega), \qquad |\omega| \le W, \tag{4a}$$

$$D(\omega,\omega') = \frac{\sin T(\omega-\omega')}{\pi(\omega-\omega')}. \tag{4b}$$

The corresponding time- or spatial-domain formulation is

$$\int_{-T}^{T} D(t,t')\,g(t')\,dt' = \lambda\,g(t), \qquad t \in \mathbb{R}, \tag{5a}$$

$$D(t,t') = \frac{\sin W(t-t')}{\pi(t-t')}. \tag{5b}$$


The “prolate spheroidal eigentapers” $g_1(t), g_2(t), \ldots$ that solve Eq. (5) form a doubly orthogonal set. When they are chosen to be orthonormal over infinite time $|t| \le \infty$, they are also orthogonal over the finite interval $|t| \le T$:

$$\int_{-\infty}^{\infty} g_\alpha g_\beta\,dt = \delta_{\alpha\beta}, \qquad \int_{-T}^{T} g_\alpha g_\beta\,dt = \lambda_\alpha\,\delta_{\alpha\beta}. \tag{6}$$

A change of variables and a scaling of the eigenfunctions transforms Eq. (4) into the dimensionless eigenproblem

$$\int_{-1}^{1} D(x,x')\,\psi(x')\,dx' = \lambda\,\psi(x), \tag{7a}$$

$$D(x,x') = \frac{\sin TW(x-x')}{\pi(x-x')}. \tag{7b}$$

Equation (7) shows that the eigenvalues $\lambda_1 > \lambda_2 > \ldots$ and suitably scaled eigenfunctions $\psi_1(x), \psi_2(x), \ldots$ depend only upon the time-bandwidth product $TW$. The sum of the concentration eigenvalues $\lambda$ relates to this product by

$$N^{1D} = \sum_{\alpha=1}^{\infty} \lambda_\alpha = \int_{-1}^{1} D(x,x)\,dx = \frac{(2T)(2W)}{2\pi} = \frac{2TW}{\pi}. \tag{8}$$

The eigenvalue spectrum of Eq. (7) has a characteristic step shape, showing significant ($\lambda \approx 1$) and insignificant ($\lambda \approx 0$) eigenvalues separated by a narrow transition band (Landau 1965; Slepian and Sonnenblick 1965). Thus, this “Shannon number” is a good estimate of the number of significant eigenvalues or, roughly speaking, $N^{1D}$ is the number of signals $f(t)$ that can be simultaneously well concentrated into a finite time interval $|t| \le T$ and a finite frequency interval $|\omega| \le W$. In other words (Landau and Pollak 1962), $N^{1D}$ is the approximate dimension of the space of signals that is “essentially” timelimited to $T$ and bandlimited to $W$, and using the orthogonal set $g_1, g_2, \ldots, g_{N^{1D}}$ as its basis is parsimonious.

2.1.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

The integral operator acting upon $\psi$ in Eq. (7) commutes with a differential operator that arises in expressing the three-dimensional scalar wave equation in prolate spheroidal coordinates (Slepian and Pollak 1961; Slepian 1983), which makes it possible to find the scaled eigenfunctions $\psi$ by solving the Sturm-Liouville equation

$$\frac{d}{dx}\left[(1-x^2)\frac{d\psi}{dx}\right] + \left[\chi - \frac{(N^{1D})^2\pi^2}{4}\,x^2\right]\psi = 0, \tag{9}$$

where $\chi \ne \lambda$ is the associated eigenvalue. The eigenfunctions $\psi(x)$ of Eq. (9) can be found at discrete values of $x$ by diagonalization of a simple symmetric tridiagonal matrix (Slepian 1978; Grünbaum 1981; Percival and Walden 1993) with elements

$$\left(\frac{N-1-2x}{2}\right)^{2}\cos(2\pi W) \quad\text{for}\quad x = 0, \ldots, N-1, \qquad \frac{x(N-x)}{2} \quad\text{for}\quad x = 1, \ldots, N-1. \tag{10}$$

The matching eigenvalues $\lambda$ can then be obtained directly from Eq. (7). The discovery of the Sturm-Liouville formulation of the concentration problem posed in Eq. (3) proved to be a major impetus for the widespread adoption and practical applications of the “Slepian” basis in signal identification, spectral analysis, and numerical analysis. Compared to the sequence of eigenvalues $\lambda$, the spectrum of the eigenvalues $\chi$ is extremely regular and thus the solution of Eq. (9) is without any problem amenable to finite-precision numerical computation (Percival and Walden 1993).
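The tridiagonal route of Eq. (10) can be tried out in a few lines. The sketch below is illustrative only and rests on assumptions that the text does not spell out: the first expression of Eq. (10) is placed on the main diagonal and the second on the first off-diagonals (the layout used by Percival and Walden 1993), $N$ is the number of samples, and $W$ is expressed in cycles per sample with $0 < W < 1/2$. The function name `slepian_tapers` and its arguments are hypothetical. The concentration of each taper is then evaluated against the discrete counterpart of the kernel in Eq. (5b).

```python
import numpy as np


def slepian_tapers(n_samples, half_bandwidth):
    """Discrete Slepian tapers from the tridiagonal matrix of Eq. (10).

    n_samples      -- number of samples N (hypothetical argument name)
    half_bandwidth -- W in cycles per sample, assumed 0 < W < 1/2
    Returns (tapers, concentrations, kernel), best-concentrated taper first.
    """
    x = np.arange(n_samples)
    # Assumed layout: first expression of Eq. (10) on the main diagonal,
    # second expression on the first off-diagonals.
    main = ((n_samples - 1 - 2 * x) / 2.0) ** 2 * np.cos(2 * np.pi * half_bandwidth)
    off = x[1:] * (n_samples - x[1:]) / 2.0
    tridiagonal = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

    # eigh returns eigenvalues in ascending order; reversing puts the taper
    # belonging to the largest tridiagonal eigenvalue first.
    _, vectors = np.linalg.eigh(tridiagonal)
    tapers = vectors[:, ::-1].T

    # Discrete analogue of the kernel in Eq. (5b): sin(2 pi W d) / (pi d),
    # written with np.sinc(y) = sin(pi y) / (pi y).
    lag = x[:, None] - x[None, :]
    kernel = 2 * half_bandwidth * np.sinc(2 * half_bandwidth * lag)

    # Concentration of each taper as the Rayleigh quotient g^T D g.
    concentrations = np.einsum("ki,ij,kj->k", tapers, kernel, tapers)
    return tapers, concentrations, kernel


tapers, concentrations, kernel = slepian_tapers(64, 4 / 64)
print(concentrations[:8])      # leading concentration eigenvalues
print(concentrations.sum())    # trace of the kernel, 2 N W in this discrete setting
```

In this discrete setting the sum of all concentrations equals the trace of the kernel, $2NW$, which plays the role that the Shannon number of Eq. (8) plays in the continuous problem. Diagonalizing the well-conditioned tridiagonal matrix rather than the kernel itself is the practical benefit the paragraph above describes.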

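2.2 Spatiospectral Concentration in the Cartesian Plane

2.2.1 General Theory in Two Dimensions

A square-integrable function $f(\mathbf{x})$ defined in the plane has the two-dimensional Fourier representation

$$f(\mathbf{x}) = (2\pi)^{-2}\int_{-\infty}^{\infty} F(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}\,d\mathbf{k}, \qquad F(\mathbf{k}) = \int_{-\infty}^{\infty} f(\mathbf{x})\,e^{-i\mathbf{k}\cdot\mathbf{x}}\,d\mathbf{x}. \tag{11}$$

We use $g(\mathbf{x})$ to denote a function that is bandlimited to $\mathcal{K}$, an arbitrary subregion of spectral space,

$$g(\mathbf{x}) = (2\pi)^{-2}\int_{\mathcal{K}} G(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}\,d\mathbf{k}. \tag{12}$$

Following Slepian (1964), we seek to concentrate the power of $g(\mathbf{x})$ into a finite spatial region $\mathcal{R} \in \mathbb{R}^2$, of area $A$:

$$\lambda = \frac{\int_{\mathcal{R}} g^2(\mathbf{x})\,d\mathbf{x}}{\int_{-\infty}^{\infty} g^2(\mathbf{x})\,d\mathbf{x}} = \text{maximum}. \tag{13}$$

Bandlimited functions $g(\mathbf{x})$ that maximize the Rayleigh quotient (13) solve the Fredholm integral equation (Tricomi 1970)

$$\int_{\mathcal{K}} D(\mathbf{k},\mathbf{k}')\,G(\mathbf{k}')\,d\mathbf{k}' = \lambda\,G(\mathbf{k}), \qquad \mathbf{k} \in \mathcal{K}, \tag{14a}$$

$$D(\mathbf{k},\mathbf{k}') = (2\pi)^{-2}\int_{\mathcal{R}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}}\,d\mathbf{x}. \tag{14b}$$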


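The corresponding problem in the spatial domain is

$$\int_{\mathcal{R}} D(\mathbf{x},\mathbf{x}')\,g(\mathbf{x}')\,d\mathbf{x}' = \lambda\,g(\mathbf{x}), \qquad \mathbf{x} \in \mathbb{R}^2, \tag{15a}$$

$$D(\mathbf{x},\mathbf{x}') = (2\pi)^{-2}\int_{\mathcal{K}} e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\,d\mathbf{k}. \tag{15b}$$

The bandlimited spatial-domain eigenfunctions $g_1(\mathbf{x}), g_2(\mathbf{x}), \ldots$ and eigenvalues $\lambda_1 \ge \lambda_2 \ge \ldots$ that solve Eq. (15) may be chosen to be orthonormal over the whole plane $\|\mathbf{x}\| \le \infty$, in which case they are also orthogonal over $\mathcal{R}$:

$$\int_{-\infty}^{\infty} g_\alpha g_\beta\,d\mathbf{x} = \delta_{\alpha\beta}, \qquad \int_{\mathcal{R}} g_\alpha g_\beta\,d\mathbf{x} = \lambda_\alpha\,\delta_{\alpha\beta}. \tag{16}$$

Concentration to the disk-shaped spectral band $\mathcal{K} = \{\mathbf{k} : \|\mathbf{k}\| \le K\}$ allows us to rewrite Eq. (15) after a change of variables and a scaling of the eigenfunctions as

$$\int_{\mathcal{R}_*} D(\boldsymbol{\xi},\boldsymbol{\xi}')\,\psi(\boldsymbol{\xi}')\,d\boldsymbol{\xi}' = \lambda\,\psi(\boldsymbol{\xi}), \tag{17a}$$

$$D(\boldsymbol{\xi},\boldsymbol{\xi}') = \frac{K\sqrt{A/4\pi}}{2\pi}\,\frac{J_1\!\left(K\sqrt{A/4\pi}\,\|\boldsymbol{\xi}-\boldsymbol{\xi}'\|\right)}{\|\boldsymbol{\xi}-\boldsymbol{\xi}'\|}, \tag{17b}$$

where the region $\mathcal{R}_*$ is scaled to area $4\pi$ and $J_1$ is the first-order Bessel function of the first kind. Equation (17) shows that, also in the two-dimensional case, the eigenvalues $\lambda_1, \lambda_2, \ldots$ and the scaled eigenfunctions $\psi_1(\boldsymbol{\xi}), \psi_2(\boldsymbol{\xi}), \ldots$ depend only on the combination of the circular bandwidth $K$ and the spatial concentration area $A$, where the quantity $K^2 A/(4\pi)$ now plays the role of the time-bandwidth product $TW$ in the one-dimensional case. The sum of the concentration eigenvalues $\lambda$ defines the two-dimensional Shannon number $N^{2D}$ as

$$N^{2D} = \sum_{\alpha=1}^{\infty} \lambda_\alpha = \int_{\mathcal{R}_*} D(\boldsymbol{\xi},\boldsymbol{\xi})\,d\boldsymbol{\xi} = \frac{(\pi K^2)(A)}{(2\pi)^2} = K^2\frac{A}{4\pi}. \tag{18}$$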

Just as $N^{1D}$ in Eq. (8), $N^{2D}$ is the product of the spectral and spatial areas of concentration multiplied by the “Nyquist density” (Daubechies 1988, 1992). And, similarly, it is the effective dimension of the space of “essentially” space- and bandlimited functions in which the set of two-dimensional functions $g_1, g_2, \ldots, g_{N^{2D}}$ may act as a sparse orthogonal basis. After a long hiatus since the work of Slepian (1964), the two-dimensional problem has recently been the focus of renewed attention in the applied mathematics community (de Villiers et al. 2003; Shkolnisky 2007), and applications to the geosciences are following suit (Simons and Wang 2011). Numerous numerical methods exist to use Eqs. (14) and (15) in solving the concentration problem (13) on two-dimensional Cartesian domains. An example of Slepian functions on a geographical domain in the Cartesian plane can be found in Fig. 1.


[Figure 1: four columns, titled λ1 = 0.971268, λ2 = 0.957996, λ3 = 0.936969, λ4 = 0.894682. Top row axes: easting (km) and northing (km). Bottom row axes: $k_x/k_x^{\mathrm{Nyq}}$ and $k_y/k_y^{\mathrm{Nyq}}$.]

Fig. 1 Bandlimited eigenfunctions $g_1, g_2, \ldots, g_4$ that are optimally concentrated within the Columbia Plateau, a physiographic region in the United States centered on 116.02° W 43.56° N (near Boise City, Idaho) of area $A \approx 145 \times 10^3$ km². The concentration factors $\lambda_1, \lambda_2, \ldots, \lambda_4$ are indicated; the Shannon number $N^{2D} = 10$. The top row shows a rendition of the eigenfunctions in space on a grid with 5 km resolution in both directions, with the convention that positive values are blue and negative values red, though the sign of the functions is arbitrary. The spatial concentration region is outlined in black. The bottom row shows the squared Fourier coefficients $|G_\alpha(\mathbf{k})|^2$ as calculated from the functions $g_\alpha(\mathbf{x})$ shown, on a wavenumber scale that is expressed as a fraction of the Nyquist wavenumber. The spectral limitation region is shown by the black circle at wavenumber $K = 0.0295$ rad/km. All areas for which the absolute value of the functions plotted is less than one hundredth of the maximum value attained over the domain are left white. The calculations were performed by the Nyström method using Gauss-Legendre integration of Eq. (17) in the two-dimensional spatial domain (Simons and Wang 2011)
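The Shannon number quoted in the caption can be checked against Eq. (18) with the values given there. The snippet below is a small sketch; the function name `shannon_number_2d` is hypothetical, and the inputs are simply the bandwidth and area stated in the caption.

```python
import math


def shannon_number_2d(bandwidth, area):
    """Two-dimensional Shannon number N = K^2 A / (4 pi), as in Eq. (18).

    bandwidth -- circular bandwidth K (rad per unit length)
    area      -- area A of the concentration region (same length unit, squared)
    """
    return bandwidth ** 2 * area / (4.0 * math.pi)


# Values from the caption of Fig. 1: K = 0.0295 rad/km, A of about 145e3 km^2.
n_plateau = shannon_number_2d(0.0295, 145e3)
print(round(n_plateau))
```

Rounded to the nearest integer this gives 10, in agreement with the value stated in the caption.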

2.2.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

If in addition to the circular spectral limitation, space is also circularly limited, in other words, if the spatial region of concentration or limitation $\mathcal{R}$ is a circle of radius $R$, then a polar coordinate, $\mathbf{x} = (r,\theta)$, representation

$$g(r,\theta) = \begin{cases} \sqrt{2}\,g(r)\cos m\theta & \text{if } m < 0, \\ g(r) & \text{if } m = 0, \\ \sqrt{2}\,g(r)\sin m\theta & \text{if } m > 0, \end{cases} \tag{19}$$

may be used to decompose Eq. (17) into a series of nondegenerate fixed-order eigenvalue problems, after scaling,

$$\int_{0}^{1} D(\xi,\xi')\,\psi(\xi')\,\xi'\,d\xi' = \lambda\,\psi(\xi), \tag{20a}$$

$$D(\xi,\xi') = 4N^{2D}\int_{0}^{1} J_m\!\left(2\sqrt{N^{2D}}\,p\,\xi\right) J_m\!\left(2\sqrt{N^{2D}}\,p\,\xi'\right) p\,dp. \tag{20b}$$

The solutions to Eq. (20) also solve a Sturm-Liouville equation on $0 \le \xi \le 1$. In terms of $\varphi(\xi) = \sqrt{\xi}\,\psi(\xi)$,

$$\frac{d}{d\xi}\left[(1-\xi^2)\frac{d\varphi}{d\xi}\right] + \left(\chi + \frac{1/4 - m^2}{\xi^2} - 4N^{2D}\xi^2\right)\varphi = 0, \tag{21}$$

for some $\chi \ne \lambda$. When $m = \pm 1/2$ Eq. (21) reduces to Eq. (9). By extension to $\xi > 1$ the fixed-order “generalized prolate spheroidal functions” $\varphi_1(\xi), \varphi_2(\xi), \ldots$ can be determined from the rapidly converging infinite series

$$\varphi(\xi) = \frac{\sqrt{2}}{\gamma}\sum_{l=0}^{\infty} (2l+m+1)^{1/2}\,d_l\,\frac{J_{m+2l+1}(c\,\xi)}{\sqrt{c\,\xi}}, \qquad \xi \in \mathbb{R}^{+}, \tag{22}$$

where $\varphi(0) = 0$ and the fixed-$m$ expansion coefficients $d_l$ are determined by recursion (Slepian 1964) or by diagonalization of a symmetric tridiagonal matrix (de Villiers et al. 2003; Shkolnisky 2007) with elements given by

$$T_{l\,l} = \left(2l+m+\frac{1}{2}\right)\left(2l+m+\frac{3}{2}\right) + \frac{c^2}{2}\left[1 + \frac{m^2}{(2l+m)(2l+m+2)}\right],$$

$$T_{l+1\,l} = -\frac{c^2\,(l+1)(m+l+1)}{\sqrt{2l+m+1}\,(2l+m+2)\,\sqrt{2l+m+3}}, \tag{23}$$

where the parameter $l$ ranges from 0 to some large value that ensures convergence. The desired concentration eigenvalues $\lambda$ can subsequently be obtained by direct integration of Eq. (17), or, alternatively, from

$$\lambda = 2\gamma^2\sqrt{N^{2D}}, \quad\text{with}\quad \gamma = \frac{c^{\,m+1/2}\,d_0}{2^{m+1}(m+1)!}\left(\sum_{l=0}^{\infty} d_l\right)^{-1}. \tag{24}$$

An example of Slepian functions on a disk-shaped region in the Cartesian plane can be found in Fig. 2. The solutions were obtained using the Nyström method using Gauss-Legendre integration of Eq. (17) in the two-dimensional spatial domain. These differ only very slightly from the results of computations carried out using the diagonalization of Eqs. (23) directly, as shown and discussed by us elsewhere (Simons and Wang 2011).

2.3 Spatiospectral Concentration on the Surface of a Sphere

2.3.1 General Theory in “Three” Dimensions

We denote the colatitude of a geographical point $\hat{\mathbf{r}}$ on the unit sphere surface $\Omega = \{\hat{\mathbf{r}} : \|\hat{\mathbf{r}}\| = 1\}$ by $0 \le \theta \le \pi$ and the longitude by $0 \le \phi < 2\pi$. We use $R$ to denote a region of $\Omega$, of area $A$, within which we seek to concentrate a bandlimited function of position $\hat{\mathbf{r}}$. We express a square-integrable function $f(\hat{\mathbf{r}})$ on the surface of the unit sphere as

$$f(\hat{\mathbf{r}}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} f_{lm}\,Y_{lm}(\hat{\mathbf{r}}), \qquad f_{lm} = \int_{\Omega} f(\hat{\mathbf{r}})\,Y_{lm}(\hat{\mathbf{r}})\,d\Omega, \tag{25}$$




[Figure 2: 30 panels titled with their eigenvalues, in rank order: λ1 = 1.000000, λ2 = 1.000000, λ3 = 1.000000, λ4 = 0.999998, λ5 = 0.999998, λ6 = 0.999997, λ7 = 0.999974, λ8 = 0.999974, λ9 = 0.999930, λ10 = 0.999930, λ11 = 0.999738, λ12 = 0.999738, λ13 = 0.999121, λ14 = 0.999121, λ15 = 0.998757, λ16 = 0.998029, λ17 = 0.998028, λ18 = 0.992470, λ19 = 0.992470, λ20 = 0.988706, λ21 = 0.988701, λ22 = 0.986945, λ23 = 0.986930, λ24 = 0.955298, λ25 = 0.955287, λ26 = 0.951126, λ27 = 0.951109, λ28 = 0.915710, λ29 = 0.915709, λ30 = 0.898353.]

Fig. 2 Bandlimited eigenfunctions $g_\alpha(r,\theta)$ that are optimally concentrated within a Cartesian disk of radius $R = 1$. The dashed circle denotes the region boundary. The Shannon number $N^{2D} = 42$. The eigenvalues $\lambda_\alpha$ have been sorted to a global ranking with the best-concentrated eigenfunction plotted at the top left and the 30th best in the lower right. Blue is positive and red is negative and the color axis is symmetric, but the sign is arbitrary; regions in which the absolute value is less than one hundredth of the maximum value on the domain are left white. The calculations were performed by Gauss-Legendre integration in the two-dimensional spatial domain, which sometimes leads to slight differences in the last two digits of what should be identical eigenvalues for each pair of non-circularly-symmetric eigenfunctions

using orthonormalized real surface spherical harmonics (Edmonds 1996; Dahlen and Tromp 1998)

$$Y_{lm}(\hat{\mathbf{r}}) = Y_{lm}(\theta, \phi) = \begin{cases} \sqrt{2}\, X_{l|m|}(\theta) \cos m\phi & \text{if } -l \le m < 0, \\ X_{l0}(\theta) & \text{if } m = 0, \\ \sqrt{2}\, X_{lm}(\theta) \sin m\phi & \text{if } 0 < m \le l, \end{cases} \tag{26}$$

$$X_{lm}(\theta) = (-1)^m \left( \frac{2l+1}{4\pi} \right)^{1/2} \left[ \frac{(l-m)!}{(l+m)!} \right]^{1/2} P_{lm}(\cos\theta), \tag{27}$$

$$P_{lm}(\mu) = \frac{1}{2^l\, l!} (1 - \mu^2)^{m/2} \left( \frac{d}{d\mu} \right)^{l+m} (\mu^2 - 1)^l. \tag{28}$$

The quantity $0 \le l \le \infty$ is the angular degree of the spherical harmonic, and $-l \le m \le l$ is its angular order. The function $P_{lm}(\mu)$ defined in (28) is the associated Legendre function of integer degree $l$ and order $m$. Our choice of the constants in Eqs. (26) and (27) orthonormalizes the harmonics on the unit sphere:

$$\int_{\Omega} Y_{lm} Y_{l'm'}\, d\Omega = \delta_{ll'} \delta_{mm'}, \tag{29}$$

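and leads to the addition theorem in terms of the Legendre functions $P_l(\mu) = P_{l0}(\mu)$ as

$$\sum_{m=-l}^{l} Y_{lm}(\hat{\mathbf{r}})\, Y_{lm}(\hat{\mathbf{r}}') = \left( \frac{2l+1}{4\pi} \right) P_l(\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}'). \tag{30}$$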
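The normalization conventions of Eqs. (26)-(28) can be checked numerically against the addition theorem (30). The following is a minimal sketch, assuming NumPy is available; the function names `xlm`, `ylm`, and `addition_theorem_sides` are illustrative choices and not part of any software referenced in this chapter. The associated Legendre function of Eq. (28) is evaluated as $(1-\mu^2)^{m/2}\, d^m P_l / d\mu^m$, which follows from Rodrigues' formula.

```python
import math
import numpy as np
from numpy.polynomial import legendre as npleg


def xlm(l, m, theta):
    """Normalized colatitudinal function X_lm(theta) of Eq. (27), for 0 <= m <= l."""
    mu = np.cos(theta)
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0  # Legendre series that equals P_l
    # Eq. (28): P_lm(mu) = (1 - mu^2)^(m/2) d^m P_l / d mu^m (no Condon-Shortley phase)
    plm = (1.0 - mu**2) ** (m / 2.0) * npleg.legval(mu, npleg.legder(coeffs, m))
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return (-1) ** m * norm * plm


def ylm(l, m, theta, phi):
    """Real surface spherical harmonic Y_lm(theta, phi) of Eq. (26)."""
    if m < 0:
        return math.sqrt(2.0) * xlm(l, -m, theta) * np.cos(m * phi)
    if m == 0:
        return xlm(l, 0, theta)
    return math.sqrt(2.0) * xlm(l, m, theta) * np.sin(m * phi)


def addition_theorem_sides(l, theta1, phi1, theta2, phi2):
    """Return the left- and right-hand sides of Eq. (30) for two points on the sphere."""
    lhs = sum(ylm(l, m, theta1, phi1) * ylm(l, m, theta2, phi2)
              for m in range(-l, l + 1))
    # Cosine of the angular distance between the two points
    cos_gamma = (math.cos(theta1) * math.cos(theta2)
                 + math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2))
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    rhs = (2 * l + 1) / (4 * math.pi) * npleg.legval(cos_gamma, coeffs)
    return float(lhs), float(rhs)
```

For any pair of points and any degree, the two returned values should agree to within rounding error, independently of the sign convention $(-1)^m$, which cancels in the product.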

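To maximize the spatial concentration of a bandlimited function

$$g(\hat{\mathbf{r}}) = \sum_{l=0}^{L} \sum_{m=-l}^{l} g_{lm} Y_{lm}(\hat{\mathbf{r}}) \tag{31}$$

within a region $R$, we maximize the energy ratio

$$\lambda = \frac{\displaystyle\int_{R} g^2(\hat{\mathbf{r}})\, d\Omega}{\displaystyle\int_{\Omega} g^2(\hat{\mathbf{r}})\, d\Omega} = \text{maximum}. \tag{32}$$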



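Maximizing Eq. (32) leads to the positive-definite spectral-domain eigenvalue equation

$$\sum_{l'=0}^{L} \sum_{m'=-l'}^{l'} D_{lm,l'm'}\, g_{l'm'} = \lambda\, g_{lm}, \qquad 0 \le l \le L, \tag{33a}$$

$$D_{lm,l'm'} = \int_{R} Y_{lm} Y_{l'm'}\, d\Omega, \tag{33b}$$

and we may equally well rewrite Eq. (33) as a spatial-domain eigenvalue equation:

$$\int_{R} D(\hat{\mathbf{r}}, \hat{\mathbf{r}}')\, g(\hat{\mathbf{r}}')\, d\Omega' = \lambda\, g(\hat{\mathbf{r}}), \qquad \hat{\mathbf{r}} \in \Omega, \tag{34a}$$

$$D(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{L} \left( \frac{2l+1}{4\pi} \right) P_l(\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}'), \tag{34b}$$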
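The step from the spectral form (33) to the spatial form (34) can be spelled out as follows. This is a short sketch that uses only Eqs. (30), (31), and (33): multiply Eq. (33a) by $Y_{lm}(\hat{\mathbf{r}})$, sum over all degrees and orders up to $L$, and exchange summation and integration.

```latex
\begin{align*}
\lambda\, g(\hat{\mathbf{r}})
  &= \sum_{l=0}^{L}\sum_{m=-l}^{l} Y_{lm}(\hat{\mathbf{r}})
     \sum_{l'=0}^{L}\sum_{m'=-l'}^{l'}
     \left[ \int_{R} Y_{lm}(\hat{\mathbf{r}}')\, Y_{l'm'}(\hat{\mathbf{r}}')\, d\Omega' \right] g_{l'm'} \\
  &= \int_{R} \left[ \sum_{l=0}^{L}\sum_{m=-l}^{l}
     Y_{lm}(\hat{\mathbf{r}})\, Y_{lm}(\hat{\mathbf{r}}') \right]
     \left[ \sum_{l'=0}^{L}\sum_{m'=-l'}^{l'} g_{l'm'}\, Y_{l'm'}(\hat{\mathbf{r}}') \right] d\Omega' \\
  &= \int_{R} \left[ \sum_{l=0}^{L} \left( \frac{2l+1}{4\pi} \right)
     P_l(\hat{\mathbf{r}}\cdot\hat{\mathbf{r}}') \right] g(\hat{\mathbf{r}}')\, d\Omega'
   = \int_{R} D(\hat{\mathbf{r}},\hat{\mathbf{r}}')\, g(\hat{\mathbf{r}}')\, d\Omega' .
\end{align*}
```

The first line uses Eqs. (31) and (33), the last line uses the addition theorem (30) and the bandlimited expansion (31) evaluated at $\hat{\mathbf{r}}'$.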


where $P_l$ is the Legendre function of degree $l$. Equation (34) is a homogeneous Fredholm integral equation of the second kind, with a finite-rank, symmetric, Hermitian kernel. We choose the spectral eigenfunctions of the operator in Eq. (33b), whose elements are $g_{\alpha\,lm}$, $\alpha = 1, \ldots, (L+1)^2$, to satisfy the orthonormality relations

$$\sum_{lm}^{L} g_{\alpha\,lm}\, g_{\beta\,lm} = \delta_{\alpha\beta}, \qquad \sum_{lm}^{L} g_{\alpha\,lm} \sum_{l'm'}^{L} D_{lm,l'm'}\, g_{\beta\,l'm'} = \lambda_\alpha \delta_{\alpha\beta}. \tag{35}$$

The finite set of bandlimited spatial eigensolutions $g_1(\hat{\mathbf{r}}), g_2(\hat{\mathbf{r}}), \ldots, g_{(L+1)^2}(\hat{\mathbf{r}})$ can be made orthonormal over the whole sphere $\Omega$ and orthogonal over the region $R$:

$$\int_{\Omega} g_\alpha g_\beta\, d\Omega = \delta_{\alpha\beta}, \qquad \int_{R} g_\alpha g_\beta\, d\Omega = \lambda_\alpha \delta_{\alpha\beta}. \tag{36}$$

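In the limit of a small area $A \to 0$ and a large bandwidth $L \to \infty$ and after a change of variables, a scaled version of Eq. (34) will be given by

$$\int_{R_*} D(\boldsymbol{\xi}, \boldsymbol{\xi}')\, \psi(\boldsymbol{\xi}')\, d\Omega'_* = \lambda\, \psi(\boldsymbol{\xi}), \tag{37a}$$

$$D(\boldsymbol{\xi}, \boldsymbol{\xi}') = \frac{(L+1)\sqrt{A/4\pi}}{2\pi}\, \frac{J_1\!\left[ (L+1)\sqrt{A/4\pi}\, \|\boldsymbol{\xi} - \boldsymbol{\xi}'\| \right]}{\|\boldsymbol{\xi} - \boldsymbol{\xi}'\|}, \tag{37b}$$

where the scaled region $R_*$ now has area $4\pi$ and $J_1$ again is the first-order Bessel function of the first kind. As in the one- and two-dimensional case, the asymptotic, or “flat-Earth” eigenvalues $\lambda_1 \ge \lambda_2 \ge \ldots$ and scaled eigenfunctions $\psi_1(\boldsymbol{\xi}), \psi_2(\boldsymbol{\xi}), \ldots$ depend upon the maximal degree $L$ and the area $A$ only through what is once again a space-bandwidth product, the “spherical Shannon number,” this time given by

$$N^{3D} = \sum_{\alpha=1}^{(L+1)^2} \lambda_\alpha = \sum_{l=0}^{L} \sum_{m=-l}^{l} D_{lm,lm} = \int_{R} D(\hat{\mathbf{r}}, \hat{\mathbf{r}})\, d\Omega = \int_{R_*} D(\boldsymbol{\xi}, \boldsymbol{\xi})\, d\Omega_* = (L+1)^2 \frac{A}{4\pi}. \tag{38}$$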
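As a small illustration of Eq. (38), the Shannon number can be evaluated directly from the bandwidth and the area of the region. The sketch below assumes a circular polar cap, whose area on the unit sphere is $A = 2\pi(1 - \cos\Theta)$; the function names are hypothetical.

```python
import math


def cap_area(theta_cap):
    """Area of a polar cap of colatitudinal radius theta_cap (radians) on the unit sphere."""
    return 2.0 * math.pi * (1.0 - math.cos(theta_cap))


def shannon_number(L, area):
    """Spherical Shannon number of Eq. (38): N = (L + 1)^2 * A / (4 pi)."""
    return (L + 1) ** 2 * area / (4.0 * math.pi)


# Example: bandwidth L = 18 and a cap of radius 40 degrees
n_cap = shannon_number(18, cap_area(math.radians(40.0)))
```

When the region is the whole sphere, the function returns the full dimension $(L+1)^2$ of the bandlimited space, and for a hemisphere it returns exactly half of that.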

Irrespectively of the particular region of concentration that they were designed for, the complete set of bandlimited spatial Slepian eigenfunctions $g_1, g_2, \ldots, g_{(L+1)^2}$ is a basis for bandlimited scalar processes anywhere on the surface of the unit sphere (Simons et al. 2006; Simons and Dahlen 2006). This follows directly from the fact that the spectral localization kernel (33b) is real, symmetric, and positive definite: its eigenvectors $g_{1\,lm}, g_{2\,lm}, \ldots, g_{(L+1)^2\,lm}$ form an orthogonal set as we have seen. Thus the Slepian basis functions $g_\alpha(\hat{\mathbf{r}})$, $\alpha = 1, \ldots, (L+1)^2$ given by Eq. (31) simply transform the same-sized limited set of spherical harmonics $Y_{lm}(\hat{\mathbf{r}})$, $0 \le l \le L$, $-l \le m \le l$ that are a basis for the same space of bandlimited spherical functions with no power above the bandwidth $L$. The effect of this transformation is to order the resulting basis set such that the energy of the first $N^{3D}$ functions, $g_1(\hat{\mathbf{r}}), \ldots, g_{N^{3D}}(\hat{\mathbf{r}})$, with eigenvalues $\lambda \approx 1$, is concentrated in the region $R$, whereas the remaining eigenfunctions, $g_{N^{3D}+1}(\hat{\mathbf{r}}), \ldots, g_{(L+1)^2}(\hat{\mathbf{r}})$, are concentrated in the complementary region $\Omega \setminus R$. As in the one- and two-dimensional case, therefore, the reduced set of basis functions $g_1, g_2, \ldots, g_{N^{3D}}$ can be regarded as a sparse, global basis suitable to approximate bandlimited processes that are primarily localized to the region $R$. The dimensionality reduction is dependent on the fractional area of the region of interest. In other words, the full dimension of the space $(L+1)^2$ can be “sparsified” to an effective dimension of $N^{3D} = (L+1)^2 A/(4\pi)$ when the signal of interest resides in a particular geographic region. Numerical methods for the solution of Eqs. (33) and (34) on completely general domains on the surface of the sphere were discussed by us elsewhere (Simons et al. 2006; Simons and Dahlen 2006, 2007). An example of Slepian functions on a geographical domain on the surface of the sphere is found in Fig. 3.

2.3.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

In the special but important case in which the region of concentration is a circularly symmetric cap of colatitudinal radius $\Theta$, centered on the North Pole, the colatitudinal parts $g(\theta)$ of the separable functions

$$g(\theta, \phi) = \begin{cases} \sqrt{2}\, g(\theta) \cos m\phi & \text{if } -L \le m < 0, \\ g(\theta) & \text{if } m = 0, \\ \sqrt{2}\, g(\theta) \sin m\phi & \text{if } 0 < m \le L, \end{cases} \tag{39}$$

which solve Eq. (34), or, indeed, the fixed-order versions

$$\int_{0}^{\Theta} D(\theta, \theta')\, g(\theta') \sin\theta'\, d\theta' = \lambda\, g(\theta), \qquad 0 \le \theta \le \pi, \tag{40a}$$

$$D(\theta, \theta') = 2\pi \sum_{l=m}^{L} X_{lm}(\theta)\, X_{lm}(\theta'), \tag{40b}$$

are identical to those of a Sturm-Liouville equation for the $g(\theta)$. In terms of $\mu = \cos\theta$,

$$\frac{d}{d\mu} \left[ (\mu - \cos\Theta)(1 - \mu^2) \frac{dg}{d\mu} \right] + \left( \chi + L(L+2)\mu - \frac{m^2 (\mu - \cos\Theta)}{1 - \mu^2} \right) g = 0, \tag{41}$$

with $\chi \ne \lambda$. This equation can be solved in the spectral domain by diagonalization of a simple symmetric tridiagonal matrix with a very well-behaved spectrum (Simons et al. 2006; Simons and Dahlen 2007). This matrix, whose eigenfunctions correspond to the $g_{lm}$ of Eq. (31) at constant $m$, is given by

$$T_{l\,l} = -l(l+1) \cos\Theta, \qquad T_{l\,l+1} = \left[ l(l+2) - L(L+2) \right] \sqrt{\frac{(l+1)^2 - m^2}{(2l+1)(2l+3)}}. \tag{42}$$

[Figure 3: twelve map panels, each titled λ = 1.000; axis tick labels omitted]

Fig. 3 Bandlimited $L = 60$ eigenfunctions $g_1, g_2, \ldots, g_{12}$ that are optimally concentrated within Antarctica. The concentration factors $\lambda_1, \lambda_2, \ldots, \lambda_{12}$ are indicated; the rounded Shannon number is $N^{3D} = 102$. The order of concentration is left to right, top to bottom. Positive values are blue and negative values are red; the sign of an eigenfunction is arbitrary. Regions in which the absolute value is less than one hundredth of the maximum value on the sphere are left white. We integrated Eq. (33b) over a splined high-resolution representation of the boundary, using Gauss-Legendre quadrature over the colatitudes, and analytically in the longitudinal dimension (Simons and Dahlen 2007)
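The tridiagonal formulation of Eq. (42) lends itself to a compact numerical experiment. The sketch below, which assumes NumPy and uses hypothetical function names, builds the matrix of Eq. (42) for a fixed order, builds the fixed-order localization kernel of Eqs. (33b) and (40b) by Gauss-Legendre quadrature, and obtains the concentration eigenvalues as Rayleigh quotients of the eigenvectors of the tridiagonal matrix. It is meant as an illustration of the commuting-matrix idea under these assumptions, not as the implementation used in the cited works.

```python
import math
import numpy as np
from numpy.polynomial import legendre as npleg


def xlm_mu(l, m, mu):
    """X_lm of Eq. (27) as a function of mu = cos(theta), for 0 <= m <= l."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    plm = (1.0 - mu**2) ** (m / 2.0) * npleg.legval(mu, npleg.legder(coeffs, m))
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return (-1) ** m * norm * plm


def grunbaum_matrix(L, m, theta_cap):
    """Symmetric tridiagonal matrix of Eq. (42) for order m >= 0, degrees l = m..L."""
    degrees = np.arange(m, L + 1)
    T = np.diag(-degrees * (degrees + 1.0) * math.cos(theta_cap))
    for i, l in enumerate(degrees[:-1]):
        off = (l * (l + 2) - L * (L + 2)) * math.sqrt(
            ((l + 1) ** 2 - m**2) / ((2 * l + 1) * (2 * l + 3)))
        T[i, i + 1] = T[i + 1, i] = off
    return T


def localization_matrix(L, m, theta_cap):
    """Fixed-order kernel D_ll' = 2 pi * int_0^Theta X_lm X_l'm sin(theta) d(theta)."""
    # In mu = cos(theta) the integrand is a polynomial of degree <= 2L,
    # so L + 1 Gauss-Legendre nodes on [cos(Theta), 1] integrate it exactly.
    x, w = npleg.leggauss(L + 1)
    a = math.cos(theta_cap)
    mu = 0.5 * (1.0 - a) * x + 0.5 * (1.0 + a)
    w = 0.5 * (1.0 - a) * w
    X = np.array([xlm_mu(l, m, mu) for l in range(m, L + 1)])
    return 2.0 * math.pi * (X * w) @ X.T


def cap_slepian(L, m, theta_cap):
    """Concentration eigenvalues and coefficient vectors for a polar cap at order m."""
    T = grunbaum_matrix(L, m, theta_cap)
    D = localization_matrix(L, m, theta_cap)
    _, G = np.linalg.eigh(T)                  # columns are the coefficient vectors
    lam = np.einsum("ia,ij,ja->a", G, D, G)   # lambda = g^T D g, cf. Eq. (35)
    order = np.argsort(lam)[::-1]
    return lam[order], G[:, order]
```

Because the two matrices commute and the spectrum of the tridiagonal matrix is simple, its eigenvectors are also eigenvectors of the localization kernel; the numerically delicate eigenvalues $\lambda$, which cluster near one and zero, are thus never used to separate the eigenvectors.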

Moreover, when the region of concentration is a pair of axisymmetric polar caps of common colatitudinal radius $\Theta$ centered on the North and South Pole, the $g(\theta)$ can be obtained by solving the Sturm-Liouville equation

$$\frac{d}{d\mu} \left[ (\mu^2 - \cos^2\Theta)(1 - \mu^2) \frac{dg}{d\mu} \right] + \left( \chi + L_p(L_p+3)\mu^2 - \frac{m^2 (\mu^2 - \cos^2\Theta)}{1 - \mu^2} \right) g = 0, \tag{43}$$

where $L_p = L$ or $L_p = L - 1$ depending whether the order $m$ of the functions $g(\theta, \phi)$ in Eq. (39) is odd or even and whether the bandwidth $L$ itself is odd or even. In their spectral form the coefficients of the optimally concentrated antipodal polar-cap eigenfunctions only require the numerical diagonalization of a symmetric tridiagonal matrix with analytically prescribed elements and a spectrum of eigenvalues that is guaranteed to be simple (Simons and Dahlen 2006, 2007), namely,

$$T_{l\,l} = -l(l+1) \cos^2\Theta + \frac{2}{2l+3} \left[ (l+1)^2 - m^2 \right] + \left[ (l-2)(l+1) - L_p(L_p+3) \right] \left[ \frac{1}{3} - \frac{2}{3}\, \frac{3m^2 - l(l+1)}{(2l+3)(2l-1)} \right],$$

$$T_{l\,l+2} = \frac{l(l+3) - L_p(L_p+3)}{2l+3} \sqrt{\frac{\left[ (l+2)^2 - m^2 \right] \left[ (l+1)^2 - m^2 \right]}{(2l+5)(2l+1)}}. \tag{44}$$

The concentration values $\lambda$, in turn, can be determined from the defining Eqs. (33) or (34). The spectra of the eigenvalues $\chi$ of Eqs. (42) and (44) display roughly equant spacing, without the numerically troublesome plateaus of nearly equal values that characterize the eigenvalues $\lambda$. Thus, for the special cases of symmetric single and double polar caps, the concentration problem posed in Eq. (32) is not only numerically feasible also in circumstances where direct solution methods are bound to fail (Albertella et al. 1999), but essentially trivial in every situation. In practical applications, the eigenfunctions that are optimally concentrated within a polar cap can be rotated to an arbitrarily positioned circular cap on the unit sphere using standard spherical harmonic rotation formulae (Edmonds 1996; Blanco et al. 1997; Dahlen and Tromp 1998).

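2.4 Vectorial Slepian Functions on the Surface of a Sphere

2.4.1 General Theory in “Three” Vectorial Dimensions

The expansion of a real-valued square-integrable vector field $\mathbf{f}(\hat{\mathbf{r}})$ on the unit sphere $\Omega$ can be written as

$$\mathbf{f}(\hat{\mathbf{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} f^{P}_{lm} \mathbf{P}_{lm}(\hat{\mathbf{r}}) + f^{B}_{lm} \mathbf{B}_{lm}(\hat{\mathbf{r}}) + f^{C}_{lm} \mathbf{C}_{lm}(\hat{\mathbf{r}}), \tag{45a}$$

$$f^{P}_{lm} = \int_{\Omega} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{f}(\hat{\mathbf{r}})\, d\Omega, \qquad f^{B}_{lm} = \int_{\Omega} \mathbf{B}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{f}(\hat{\mathbf{r}})\, d\Omega, \qquad \text{and} \qquad f^{C}_{lm} = \int_{\Omega} \mathbf{C}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{f}(\hat{\mathbf{r}})\, d\Omega, \tag{45b}$$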

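using real vector surface spherical harmonics (Dahlen and Tromp 1998; Sabaka et al. 2010; Gerhards 2011) that are constructed from the scalar harmonics in Eq. (26), as follows. In the vector spherical coordinates $(\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\phi}})$ and using the surface gradient $\nabla_1 = \hat{\boldsymbol{\theta}}\, \partial_\theta + \hat{\boldsymbol{\phi}}\, (\sin\theta)^{-1} \partial_\phi$, we write for $l > 0$ and $-l \le m \le l$,

$$\mathbf{P}_{lm}(\hat{\mathbf{r}}) = \hat{\mathbf{r}}\, Y_{lm}(\hat{\mathbf{r}}), \tag{46}$$

$$\mathbf{B}_{lm}(\hat{\mathbf{r}}) = \frac{\nabla_1 Y_{lm}(\hat{\mathbf{r}})}{\sqrt{l(l+1)}} = \frac{\left[ \hat{\boldsymbol{\theta}}\, \partial_\theta + \hat{\boldsymbol{\phi}}\, (\sin\theta)^{-1} \partial_\phi \right] Y_{lm}(\hat{\mathbf{r}})}{\sqrt{l(l+1)}}, \tag{47}$$

$$\mathbf{C}_{lm}(\hat{\mathbf{r}}) = \frac{-\hat{\mathbf{r}} \times \nabla_1 Y_{lm}(\hat{\mathbf{r}})}{\sqrt{l(l+1)}} = \frac{\left[ \hat{\boldsymbol{\theta}}\, (\sin\theta)^{-1} \partial_\phi - \hat{\boldsymbol{\phi}}\, \partial_\theta \right] Y_{lm}(\hat{\mathbf{r}})}{\sqrt{l(l+1)}}, \tag{48}$$

together with the purely radial $\mathbf{P}_{00} = (4\pi)^{-1/2}\, \hat{\mathbf{r}}$, and setting $f^{B}_{00} = f^{C}_{00} = 0$ for every vector field $\mathbf{f}$. The remaining expansion coefficients (45b) are naturally obtained from Eq. (45a) through the orthonormality relationships

$$\int_{\Omega} \mathbf{P}_{lm} \cdot \mathbf{P}_{l'm'}\, d\Omega = \int_{\Omega} \mathbf{B}_{lm} \cdot \mathbf{B}_{l'm'}\, d\Omega = \int_{\Omega} \mathbf{C}_{lm} \cdot \mathbf{C}_{l'm'}\, d\Omega = \delta_{ll'} \delta_{mm'}, \tag{49a}$$

$$\int_{\Omega} \mathbf{P}_{lm} \cdot \mathbf{B}_{l'm'}\, d\Omega = \int_{\Omega} \mathbf{P}_{lm} \cdot \mathbf{C}_{l'm'}\, d\Omega = \int_{\Omega} \mathbf{B}_{lm} \cdot \mathbf{C}_{l'm'}\, d\Omega = 0. \tag{49b}$$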





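The vector spherical-harmonic addition theorem (Freeden and Schreiner 2009) implies the limited result

$$\sum_{m=-l}^{l} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{P}_{lm}(\hat{\mathbf{r}}) = \left( \frac{2l+1}{4\pi} \right) = \sum_{m=-l}^{l} \mathbf{B}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{B}_{lm}(\hat{\mathbf{r}}) = \sum_{m=-l}^{l} \mathbf{C}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{C}_{lm}(\hat{\mathbf{r}}). \tag{50}$$

As before we seek to maximize the spatial concentration of a bandlimited spherical vector function

$$\mathbf{g}(\hat{\mathbf{r}}) = \sum_{l=0}^{L} \sum_{m=-l}^{l} g^{P}_{lm} \mathbf{P}_{lm}(\hat{\mathbf{r}}) + g^{B}_{lm} \mathbf{B}_{lm}(\hat{\mathbf{r}}) + g^{C}_{lm} \mathbf{C}_{lm}(\hat{\mathbf{r}}) \tag{51}$$

within a certain region $R$, in the vectorial case by maximizing the energy ratio

$$\lambda = \frac{\displaystyle\int_{R} \mathbf{g} \cdot \mathbf{g}\, d\Omega}{\displaystyle\int_{\Omega} \mathbf{g} \cdot \mathbf{g}\, d\Omega} = \text{maximum}. \tag{52}$$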



The maximization of Eq. (52) leads to a coupled system of positive-definite spectral-domain eigenvalue equations, for $0 \le l \le L$ and $-l \le m \le l$,

$$\sum_{l'=0}^{L} \sum_{m'=-l'}^{l'} D_{lm,l'm'}\, g^{P}_{l'm'} = \lambda\, g^{P}_{lm}, \tag{53a}$$

$$\sum_{l'=0}^{L} \sum_{m'=-l'}^{l'} B_{lm,l'm'}\, g^{B}_{l'm'} + C_{lm,l'm'}\, g^{C}_{l'm'} = \lambda\, g^{B}_{lm}, \tag{53b}$$

$$\sum_{l'=0}^{L} \sum_{m'=-l'}^{l'} C^{T}_{lm,l'm'}\, g^{B}_{l'm'} + B_{lm,l'm'}\, g^{C}_{l'm'} = \lambda\, g^{C}_{lm}. \tag{53c}$$

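Of the below matrix elements that complement the equations above, Eq. (54a) is identical to Eq. (33b),

$$D_{lm,l'm'} = \int_{R} \mathbf{P}_{lm} \cdot \mathbf{P}_{l'm'}\, d\Omega = \int_{R} Y_{lm} Y_{l'm'}\, d\Omega, \tag{54a}$$

$$B_{lm,l'm'} = \int_{R} \mathbf{B}_{lm} \cdot \mathbf{B}_{l'm'}\, d\Omega = \int_{R} \mathbf{C}_{lm} \cdot \mathbf{C}_{l'm'}\, d\Omega, \tag{54b}$$

$$C_{lm,l'm'} = \int_{R} \mathbf{B}_{lm} \cdot \mathbf{C}_{l'm'}\, d\Omega, \tag{54c}$$

and the transpose of Eq. (54c) switches its sign. The radial vectorial concentration problem (53a)–(54a) is identical to the corresponding scalar case (33) and can be solved separately from the tangential equations. Altogether, in the space domain, the equivalent eigenvalue equation is

$$\int_{R} \mathbf{D}(\hat{\mathbf{r}}, \hat{\mathbf{r}}') \cdot \mathbf{g}(\hat{\mathbf{r}}')\, d\Omega' = \lambda\, \mathbf{g}(\hat{\mathbf{r}}), \qquad \hat{\mathbf{r}} \in \Omega, \tag{55a}$$

$$\mathbf{D}(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{L} \sum_{m=-l}^{l} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \mathbf{P}_{lm}(\hat{\mathbf{r}}') + \mathbf{B}_{lm}(\hat{\mathbf{r}}) \mathbf{B}_{lm}(\hat{\mathbf{r}}') + \mathbf{C}_{lm}(\hat{\mathbf{r}}) \mathbf{C}_{lm}(\hat{\mathbf{r}}'), \tag{55b}$$

a homogeneous Fredholm integral equation with a finite-rank, symmetric, separable, bandlimited kernel. Further reducing Eq. (55) using the full version of the vectorial addition theorem does not yield much additional insight. After collecting the spheroidal (radial, consoidal) and toroidal expansion coefficients in a vector,

$$\mathsf{g} = (\ldots, g^{P}_{lm}, \ldots, g^{B}_{lm}, \ldots, g^{C}_{lm}, \ldots)^{T} \tag{56}$$

and the kernel elements $D_{lm,l'm'}$, $B_{lm,l'm'}$ and $C_{lm,l'm'}$ of Eq. (54) into the submatrices $\mathsf{D}$, $\mathsf{B}$, and $\mathsf{C}$, we assemble

$$\mathsf{K} = \begin{pmatrix} \mathsf{D} & \mathsf{0} & \mathsf{0} \\ \mathsf{0} & \mathsf{B} & \mathsf{C} \\ \mathsf{0} & \mathsf{C}^{T} & \mathsf{B} \end{pmatrix}. \tag{57}$$

In this new notation Eq. (53) reads as an $[3(L+1)^2 - 2] \times [3(L+1)^2 - 2]$-dimensional algebraic eigenvalue problem

$$\mathsf{K}\, \mathsf{g} = \lambda\, \mathsf{g}, \tag{58}$$

whose eigenvectors $\mathsf{g}_1, \mathsf{g}_2, \ldots, \mathsf{g}_{3(L+1)^2-2}$ are mutually orthogonal in the sense

$$\mathsf{g}^{T}_{\alpha} \mathsf{g}_{\beta} = \delta_{\alpha\beta}, \qquad \mathsf{g}^{T}_{\alpha} \mathsf{K}\, \mathsf{g}_{\beta} = \lambda_\alpha \delta_{\alpha\beta}. \tag{59}$$

The associated eigenfields $\mathbf{g}_1(\hat{\mathbf{r}}), \mathbf{g}_2(\hat{\mathbf{r}}), \ldots, \mathbf{g}_{3(L+1)^2-2}(\hat{\mathbf{r}})$ are orthogonal over the region $R$ and orthonormal over the whole sphere $\Omega$:

$$\int_{\Omega} \mathbf{g}_\alpha \cdot \mathbf{g}_\beta\, d\Omega = \delta_{\alpha\beta}, \qquad \int_{R} \mathbf{g}_\alpha \cdot \mathbf{g}_\beta\, d\Omega = \lambda_\alpha \delta_{\alpha\beta}. \tag{60}$$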

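The relations (60) for the spatial domain are equivalent to their matrix counterparts (59). The eigenfield $\mathbf{g}_1(\hat{\mathbf{r}})$ with the largest eigenvalue $\lambda_1$ is the element in the space of bandlimited vector fields with most of its spatial energy within region $R$; the eigenfield $\mathbf{g}_2(\hat{\mathbf{r}})$ is the next best-concentrated bandlimited function orthogonal to $\mathbf{g}_1(\hat{\mathbf{r}})$ over both $\Omega$ and $R$; and so on. Finally, as in the scalar case, we can sum up the eigenvalues of the matrix $\mathsf{K}$ to define a vectorial spherical Shannon number

$$N^{\mathrm{vec}} = \sum_{\alpha=1}^{3(L+1)^2-2} \lambda_\alpha = \operatorname{tr} \mathsf{K} = \sum_{l=0}^{L} \sum_{m=-l}^{l} \left( D_{lm,lm} + B_{lm,lm} + C_{lm,lm} \right) \tag{61}$$

$$= \int_{R} \left[ \sum_{l=0}^{L} \sum_{m=-l}^{l} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{P}_{lm}(\hat{\mathbf{r}}) + \mathbf{B}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{B}_{lm}(\hat{\mathbf{r}}) + \mathbf{C}_{lm}(\hat{\mathbf{r}}) \cdot \mathbf{C}_{lm}(\hat{\mathbf{r}}) \right] d\Omega \tag{62}$$

$$= \left[ 3(L+1)^2 - 2 \right] \frac{A}{4\pi}. \tag{63}$$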
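For readers who wish to see the counting behind Eq. (63), the following short derivation is a sketch that uses only Eq. (50) and the fact that the tangential harmonics $\mathbf{B}_{lm}$ and $\mathbf{C}_{lm}$ exist only for $l > 0$, so that their sums start at $l = 1$.

```latex
\begin{align*}
\int_{R} \sum_{l=0}^{L}\sum_{m=-l}^{l} \mathbf{P}_{lm}\cdot\mathbf{P}_{lm}\, d\Omega
  &= A \sum_{l=0}^{L} \frac{2l+1}{4\pi} = (L+1)^2\, \frac{A}{4\pi}, \\
\int_{R} \sum_{l=1}^{L}\sum_{m=-l}^{l} \mathbf{B}_{lm}\cdot\mathbf{B}_{lm}\, d\Omega
  &= \int_{R} \sum_{l=1}^{L}\sum_{m=-l}^{l} \mathbf{C}_{lm}\cdot\mathbf{C}_{lm}\, d\Omega
   = \left[ (L+1)^2 - 1 \right] \frac{A}{4\pi}, \\
N^{\mathrm{vec}}
  &= \left\{ (L+1)^2 + 2\left[ (L+1)^2 - 1 \right] \right\} \frac{A}{4\pi}
   = \left[ 3(L+1)^2 - 2 \right] \frac{A}{4\pi}.
\end{align*}
```

The first line is the radial contribution and the second line gives each of the two equal tangential contributions.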

To establish the last equality we used the relation (50). Given the decoupling of the radial from the tangential solutions that is apparent from Eq. (57), we may subdivide the vectorial spherical Shannon number into a radial and a tangential one. These are $N^{r} = (L+1)^2 A/(4\pi)$ and $N^{t} = [2(L+1)^2 - 2] A/(4\pi)$, respectively. Numerical solution methods were discussed by Plattner and Simons (2014). An example of tangential vectorial Slepian functions on a geographical domain on the surface of the sphere is found in Fig. 4.

[Figure 4: twelve map panels titled λ = 1.000000, 1.000000, 0.999999, 0.999999, 0.999997, 0.999997, 0.999995, 0.999995, 0.999990, 0.999990, 0.999976, 0.999976; latitude ticks at −5°, −20°, −35° and longitude ticks at 120°, 140°, 160°]

Fig. 4 Twelve tangential Slepian functions $\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_{12}$, bandlimited to $L = 60$, optimally concentrated within Australia. The concentration factors $\lambda_1, \lambda_2, \ldots, \lambda_{12}$ are indicated. The rounded tangential Shannon number $N^{t} = 112$. Order of concentration is left to right, top to bottom. Color is absolute value (red the maximum) and circles with strokes indicate the direction of the eigenfield on the tangential plane. Regions in which the absolute value is less than one hundredth of the maximum absolute value on the sphere are left white

2.4.2 Sturm-Liouville Character and Tridiagonal Matrix Formulation

When the region of concentration $R$ is a symmetric polar cap with colatitudinal radius $\Theta$ centered on the north pole, special rules apply that greatly facilitate the construction of the localization kernel (57). There are reductions of Eq. (54) to some very manageable integrals that can be carried out exactly by recursion. Solutions for the polar cap can be rotated anywhere on the unit sphere using the same transformations that apply in the rotation of scalar functions (Edmonds 1996; Blanco et al. 1997; Dahlen and Tromp 1998; Freeden and Schreiner 2009). In the axisymmetric case the matrix elements (54a)–(54c) reduce to

$$D_{lm,l'm'} = 2\pi\, \delta_{mm'} \int_{0}^{\Theta} X_{lm} X_{l'm} \sin\theta\, d\theta, \tag{64}$$

$$B_{lm,l'm'} = \frac{2\pi\, \delta_{mm'} \displaystyle\int_{0}^{\Theta} \left[ X'_{lm} X'_{l'm} + m^2 (\sin\theta)^{-2} X_{lm} X_{l'm} \right] \sin\theta\, d\theta}{\sqrt{l(l+1)\, l'(l'+1)}}, \tag{65}$$

$$C_{lm,l'm'} = -\frac{2\pi\, \delta_{-mm'}\, m \displaystyle\int_{0}^{\Theta} \left[ X'_{lm} X_{l'm} + X_{lm} X'_{l'm} \right] d\theta}{\sqrt{l(l+1)\, l'(l'+1)}} = -\frac{2\pi\, \delta_{-mm'}\, m\, X_{lm}(\Theta)\, X_{l'm}(\Theta)}{\sqrt{l(l+1)\, l'(l'+1)}}, \tag{66}$$

using the derivative notation $X'_{lm} = dX_{lm}/d\theta$ for the normalized associated Legendre functions of Eq. (27). Equation (66) can be easily evaluated. The integrals over the product terms $X_{lm} X_{l'm}$ in Eq. (64) can be rewritten using Wigner 3j symbols (Wieczorek and Simons 2005; Simons et al. 2006; Plattner and Simons 2014) to simple integrals over $X_{l\,2m}$ or $X_{l0}$ which can be handled recursively (Paul 1978). Finally, in Eq. (65) integrals of the type $X'_{lm} X'_{l'm}$ and $m^2 (\sin\theta)^{-2} X_{lm} X_{l'm}$ can be rewritten as integrals over undifferentiated products of Legendre functions (Ilk 1983; Eshagh 2009; Plattner and Simons 2014). All in all, these computations are straightforward to carry out and lead to block-diagonal matrices at constant order $m$, which are relatively easily diagonalized.

As this chapter went to press, Jahn and Bokor (2014) reported the exciting discovery of a differential operator that commutes with the tangential part of the concentration operator (55), and a tridiagonal matrix formulation for the tangential vectorial concentration problem on axisymmetric domains. They achieve this feat through a change of basis that reduces the vectorial problem to a scalar one that is separable in $\theta$ and $\phi$, using the special functions $X'_{lm} \pm m (\sin\theta)^{-1} X_{lm}$ (Sheppard and Török 1997). Hence they derive a commuting differential operator and a corresponding spectral matrix for the concentration problem. By their approach, the solutions to the fixed-order tangential concentration problem are again solutions to a Sturm-Liouville problem with a very simple eigenvalue spectrum, and the calculations are always fast and stable, much as they are for the radial problem which completes the construction of vectorial Slepian functions on the sphere.
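The diagonal elements of the polar-cap kernel in Eq. (65) can be checked numerically against the sum rule (50). The sketch below is an assumption-laden illustration: it uses plain Gauss-Legendre quadrature in $\mu = \cos\theta$ rather than the Wigner 3j and recursion techniques of the works cited above, it assumes NumPy, and all function names are hypothetical. Summing $B_{lm,lm}$ over $1 \le l \le L$ and $-l \le m \le l$ should return $[(L+1)^2 - 1]\, A/(4\pi)$, which is one half of the tangential Shannon number $N^t$.

```python
import math
import numpy as np
from numpy.polynomial import legendre as npleg


def xlm_and_derivative(l, m, mu):
    """Return X_lm and dX_lm/d(theta) at mu = cos(theta), for 0 <= m <= l and |mu| < 1."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    dm = npleg.legder(coeffs, m)                 # series of d^m P_l / d mu^m
    p = npleg.legval(mu, dm)
    dp = npleg.legval(mu, npleg.legder(dm))      # d^(m+1) P_l / d mu^(m+1)
    s = np.sqrt(1.0 - mu**2)                     # sin(theta)
    norm = (-1) ** m * math.sqrt((2 * l + 1) / (4 * math.pi)
                                 * math.factorial(l - m) / math.factorial(l + m))
    x = norm * s**m * p
    # Chain rule with d(mu)/d(theta) = -sin(theta)
    dx = norm * (m * mu * s ** (m - 1) * p - s ** (m + 1) * dp)
    return x, dx


def tangential_diagonal_sum(L, theta_cap):
    """Sum of B_lm,lm of Eq. (65) over l = 1..L and m = -l..l for a polar cap."""
    a = math.cos(theta_cap)
    # The integrands are polynomials in mu of degree <= 2L; L + 2 nodes are ample.
    x, w = npleg.leggauss(L + 2)
    mu = 0.5 * (1.0 - a) * x + 0.5 * (1.0 + a)
    w = 0.5 * (1.0 - a) * w
    total = 0.0
    for l in range(1, L + 1):
        for m in range(0, l + 1):
            xv, dxv = xlm_and_derivative(l, m, mu)
            integrand = dxv**2 + m**2 * xv**2 / (1.0 - mu**2)
            b = 2.0 * math.pi * float(np.sum(w * integrand)) / (l * (l + 1))
            total += b if m == 0 else 2.0 * b    # orders +m and -m contribute equally
    return total


def tangential_shannon_number(L, theta_cap):
    """N^t = [2 (L + 1)^2 - 2] A / (4 pi) for a polar cap of radius theta_cap."""
    return (2 * (L + 1) ** 2 - 2) * (1.0 - math.cos(theta_cap)) / 2.0
```

In this sketch `2 * tangential_diagonal_sum(L, theta_cap)` and `tangential_shannon_number(L, theta_cap)` should agree to within rounding error, because the diagonal blocks of Eq. (57) contain the matrix $\mathsf{B}$ twice.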

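2.5 Midterm Summary

It is interesting to reflect, however heuristically, on the commonality of all of the above aspects of spatiospectral localization, in the slightly expanded context of reproducing-kernel Hilbert spaces (Yao 1967; Nashed and Walter 1991; Daubechies 1992; Amirbekyan et al. 2008; Kennedy and Sadeghi 2013). In one dimension, the Fourier orthonormality relation and the “reproducing” properties of the spatial delta function are given by

$$\delta(t, t') = (2\pi)^{-1} \int_{-\infty}^{\infty} e^{i\omega(t - t')}\, d\omega, \qquad \int_{-\infty}^{\infty} f(t')\, \delta(t, t')\, dt' = f(t). \tag{67}$$

In two Cartesian dimensions the equivalent relations are

$$\delta(\mathbf{x}, \mathbf{x}') = (2\pi)^{-2} \int_{-\infty}^{\infty} e^{i\mathbf{k} \cdot (\mathbf{x} - \mathbf{x}')}\, d\mathbf{k}, \qquad \int_{-\infty}^{\infty} f(\mathbf{x}')\, \delta(\mathbf{x}, \mathbf{x}')\, d\mathbf{x}' = f(\mathbf{x}), \tag{68}$$

and on the surface of the unit sphere we have, for the scalar case,

$$\delta(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{\infty} \left( \frac{2l+1}{4\pi} \right) P_l(\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}'), \qquad \int_{\Omega} f(\hat{\mathbf{r}}')\, \delta(\hat{\mathbf{r}}, \hat{\mathbf{r}}')\, d\Omega' = f(\hat{\mathbf{r}}), \tag{69}$$

and for the vector case, we have the sum of dyads

$$\boldsymbol{\delta}(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \mathbf{P}_{lm}(\hat{\mathbf{r}}') + \mathbf{B}_{lm}(\hat{\mathbf{r}}) \mathbf{B}_{lm}(\hat{\mathbf{r}}') + \mathbf{C}_{lm}(\hat{\mathbf{r}}) \mathbf{C}_{lm}(\hat{\mathbf{r}}'), \tag{70a}$$

$$\int_{\Omega} \mathbf{f}(\hat{\mathbf{r}}') \cdot \boldsymbol{\delta}(\hat{\mathbf{r}}, \hat{\mathbf{r}}')\, d\Omega' = \mathbf{f}(\hat{\mathbf{r}}). \tag{70b}$$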

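The integral-equation kernels (5b), (15b), (34b), and (55b) are all bandlimited spatial delta functions which are reproducing kernels for bandlimited functions of the types in Eqs. (2), (12), (31), and (51):

$$D(t, t') = (2\pi)^{-1} \int_{-W}^{W} e^{i\omega(t - t')}\, d\omega, \qquad \int_{-\infty}^{\infty} g(t')\, D(t, t')\, dt' = g(t), \tag{71}$$

$$D(\mathbf{x}, \mathbf{x}') = (2\pi)^{-2} \int_{\mathcal{K}} e^{i\mathbf{k} \cdot (\mathbf{x} - \mathbf{x}')}\, d\mathbf{k}, \qquad \int_{-\infty}^{\infty} g(\mathbf{x}')\, D(\mathbf{x}, \mathbf{x}')\, d\mathbf{x}' = g(\mathbf{x}), \tag{72}$$

$$D(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{L} \left( \frac{2l+1}{4\pi} \right) P_l(\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}'), \qquad \int_{\Omega} g(\hat{\mathbf{r}}')\, D(\hat{\mathbf{r}}, \hat{\mathbf{r}}')\, d\Omega' = g(\hat{\mathbf{r}}), \tag{73}$$

$$\mathbf{D}(\hat{\mathbf{r}}, \hat{\mathbf{r}}') = \sum_{l=0}^{L} \sum_{m=-l}^{l} \mathbf{P}_{lm}(\hat{\mathbf{r}}) \mathbf{P}_{lm}(\hat{\mathbf{r}}') + \mathbf{B}_{lm}(\hat{\mathbf{r}}) \mathbf{B}_{lm}(\hat{\mathbf{r}}') + \mathbf{C}_{lm}(\hat{\mathbf{r}}) \mathbf{C}_{lm}(\hat{\mathbf{r}}'), \tag{74a}$$

$$\int_{\Omega} \mathbf{g}(\hat{\mathbf{r}}') \cdot \mathbf{D}(\hat{\mathbf{r}}, \hat{\mathbf{r}}')\, d\Omega' = \mathbf{g}(\hat{\mathbf{r}}). \tag{74b}$$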

The equivalence of Eq. (71) with Eq. (5b) is through the Euler identity, and the reproducing properties follow from the spectral forms of the orthogonality relations (67) and (68), which are self-evident by change of variables, and from the spectral form of Eq. (69), which is Eq. (29). Much as the delta functions of Eqs. (67)–(70) set up the Hilbert spaces of all square-integrable functions on the real line, in two-dimensional Cartesian space and on the surface of the sphere (both scalar and vector functions), the kernels (71) and (74) induce the equivalent subspaces of bandlimited functions in their respective dimensions. Inasmuch as the Slepian functions are the integral eigenfunctions of these reproducing kernels in the sense of Eqs. (5a), (15a), (34a), and (55a), they are complete bases for their band-limited subspaces (Slepian and Pollak 1961; Landau and Pollak 1961; Daubechies 1992; Flandrin 1998; Freeden et al. 1998; Plattner and Simons 2014). Therein, the $N^{1D}$, $N^{2D}$, $N^{3D}$, or $N^{\mathrm{vec}}$ best time- or space-concentrated members allow for sparse, approximate expansions of signals that are spatially concentrated to the one-dimensional interval $t \in [-T, T] \subset \mathbb{R}$, the Cartesian region $\mathbf{x} \in \mathcal{R} \subset \mathbb{R}^2$, or the spherical surface patch $\hat{\mathbf{r}} \in R \subset \Omega$. As a corollary to this behavior, the infinite sets of exactly time- or space-limited (and thus band-concentrated) versions of the functions $g$ and $\mathbf{g}$, which are the eigenfunctions of Eqs. (5), (15), (34), and (55) with the domains appropriately restricted, are complete bases for square-integrable scalar or vector functions on the intervals to which they are confined (Slepian and Pollak 1961; Landau and Pollak 1961; Simons et al. 2006; Plattner and Simons 2013).
Expansions of such wideband signals in the small subset of their $N^{1D}$, $N^{2D}$, $N^{3D}$, or $N^{\mathrm{vec}}$ most band-concentrated members provide reconstructions which are constructive in the sense that they progressively capture all of the signal in the mean-squared sense, in the limit of letting their numbers grow to infinity. This second class of functions can be trivially obtained, up to a multiplicative constant, from the bandlimited Slepian functions $g$ and $\mathbf{g}$ by simple time- or space limitation. While Slepian (Slepian and Pollak 1961; Slepian 1983), for this reason perhaps, never gave them a name, we have been referring to those as $h$ (and $\mathbf{h}$) in our own investigations of similar functions on the sphere (Simons et al. 2006; Simons and Dahlen 2006; Dahlen and Simons 2008; Plattner and Simons 2013).
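The reproducing property of the bandlimited kernel in Eq. (73) can be illustrated numerically in the simplest setting. The sketch below assumes NumPy, restricts attention to zonal functions $g(\theta) = \sum_l a_l P_l(\cos\theta)$, and evaluates the integral with the evaluation point at the North Pole, where $\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}' = \cos\theta'$; the function names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import legendre as npleg


def kernel_at_pole(L, mu):
    """Kernel of Eq. (73) with r at the North Pole, so that r . r' = mu = cos(theta')."""
    coeffs = (2.0 * np.arange(L + 1) + 1.0) / (4.0 * math.pi)
    return npleg.legval(mu, coeffs)


def reproduce_at_pole(L, zonal_coeffs, nodes=64):
    """Integrate g(r') D(pole, r') over the sphere for zonal g = sum_l a_l P_l(cos theta)."""
    mu, w = npleg.leggauss(nodes)
    g = npleg.legval(mu, zonal_coeffs)
    # d(Omega) = d(mu) d(phi); the longitude integral contributes 2 pi for zonal integrands
    return 2.0 * math.pi * float(np.sum(w * g * kernel_at_pole(L, mu)))
```

Since $P_l(1) = 1$, a zonal function takes the value $\sum_l a_l$ at the pole. When all nonzero $a_l$ have $l \le L$ the integral returns that value; when the function contains degrees above $L$, only the partial sum up to $L$ survives, which shows that the kernel acts as a projection onto the bandlimited subspace.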


3 Problems in the Geosciences and Beyond Taking all of the above at face value but referring again to the literature cited thus far for proof and additional context, we return to considerations closer to home, namely, the estimation of geophysical (or cosmological) signals and/or their power spectra, from noisy and incomplete observations collected at or above the surface of the spheres “Earth” or “planet” (or from inside the sphere “sky”). We restrict ourselves to real-valued scalar measurements, contaminated by additive noise for which we shall adopt idealized models. We focus exclusively on data acquired and solutions expressed on the unit sphere. We have considered generalizations to problems involving satellite potential-field data collected at an altitude elsewhere (Simons and Dahlen 2006; Simons et al. 2009). We furthermore note that descriptions of the scalar gravitational and magnetic potential may be sufficient to capture the behavior of the corresponding gravity and magnetic vector fields, but that with vectorial Slepian functions, versatile and demanding satellite data analysis problems will be able to get robustly handled even in the presence of noise that may be strongly heterogeneous spatially and/or over the individual vector components. Speaking quite generally, the two different statistical problems that arise when geomathematical scalar spherical data are being studied are, (i) how to find the “best” estimate of the signal given the data and (ii) how to construct from the data the “best” estimate of the power spectral density of the signal in question. There are problems intermediate between either case, for instance, those that utilize the solutions to problems of the kind (i) to make inference about the power spectral density without properly solving any problems of kind (ii). Mostly such scenarios, e.g., in localized geomagnetic field analysis (Beggan et al. 2013), are born out of necessity or convenience. 
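We restrict our analysis to the “pure” end-member problems. Thus, let there be some real-valued scalar data distributed on the unit sphere, consisting of “signal,” $s$ and “noise,” $n$, and let there be some region of interest $R \subset \Omega$; in other words, let

$$d(\hat{\mathbf{r}}) = \begin{cases} s(\hat{\mathbf{r}}) + n(\hat{\mathbf{r}}) & \text{if } \hat{\mathbf{r}} \in R, \\ \text{unknown/undesired} & \text{if } \hat{\mathbf{r}} \in \Omega \setminus R. \end{cases} \tag{75}$$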

We assume that the signal of interest can be expressed by way of spherical harmonic expansion as in Eq. (25), and that it is, itself, a realization of a zero-mean, Gaussian, isotropic, random process, namely,

$$s(\hat{\mathbf{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} s_{lm} Y_{lm}(\hat{\mathbf{r}}), \qquad \langle s_{lm} \rangle = 0 \quad \text{and} \quad \langle s_{lm} s_{l'm'} \rangle = S_l\, \delta_{ll'} \delta_{mm'}. \tag{76}$$

For illustration we furthermore assume that the noise is a zero-mean stochastic process with an isotropic power spectrum, i.e., $\langle n(\hat{\mathbf{r}}) \rangle = 0$ and $\langle n_{lm} n_{l'm'} \rangle = N_l\, \delta_{ll'} \delta_{mm'}$, and that it is statistically uncorrelated with the signal. We refer to power as white when $S_l = S$ or $N_l = N$, or, equivalently, when $\langle n(\hat{\mathbf{r}})\, n(\hat{\mathbf{r}}') \rangle = N \delta(\hat{\mathbf{r}}, \hat{\mathbf{r}}')$. Our objectives are thus (i) to determine the best estimate $\hat{s}_{lm}$ of the spherical harmonic expansion coefficients $s_{lm}$ of the signal and (ii) to find the best estimate $\hat{S}_l$ for the isotropic power spectral density $S_l$. While in the physical world there can be no limit on bandwidth, practical restrictions force any and all of our estimates to be bandlimited to some maximum spherical harmonic degree $L$, thus by necessity $\hat{s}_{lm} = 0$ and $\hat{S}_l = 0$ for $l > L$:

$$\hat{s}(\hat{\mathbf{r}}) = \sum_{l=0}^{L} \sum_{m=-l}^{l} \hat{s}_{lm} Y_{lm}(\hat{\mathbf{r}}). \tag{77}$$

This limitation, combined with the statements of Eq. (75) on the data coverage or the study region of interest, naturally leads us back to the concept of “spatiospectral concentration,” and, as we shall see, solving either problem (i) or (ii) will gain from involving the “localized” scalar Slepian functions rather than, or in addition to, the “global” spherical harmonics basis. This leaves us to clarify what we understand by “best” in this context. While we adopt the traditional statistical metrics of bias, variance, and mean-squared error to appraise the quality of our solutions (Cox and Hinkley 1974; Bendat and Piersol 2000), the resulting connections to sparsity will be real and immediate, owing to the Slepian functions being naturally instrumental in constructing efficient, consistent, and/or unbiased estimates of either $\hat{s}_{lm}$ or $\hat{S}_l$. Thus, we define

$$v = \langle \hat{s}^2 \rangle - \langle \hat{s} \rangle^2, \qquad b = \langle \hat{s} \rangle - s, \qquad \epsilon = \hat{s} - s, \qquad \text{and} \qquad \langle \epsilon^2 \rangle = v + b^2 \tag{78}$$

for problem (i), where the lack of subscript indicates that we can study variance, bias, and mean-squared error of the estimate of the coefficients $\hat{s}_{lm}$ but also of their spatial expansion $\hat{s}(\hat{\mathbf{r}})$. For problem (ii) on the other hand, we focus on the estimate of the isotropic power spectrum at a given spherical harmonic degree $l$ by identifying

$$v_l = \langle \hat{S}_l^2 \rangle - \langle \hat{S}_l \rangle^2, \qquad b_l = \langle \hat{S}_l \rangle - S_l, \qquad \epsilon_l = \hat{S}_l - S_l, \qquad \text{and} \qquad \langle \epsilon_l^2 \rangle = v_l + b_l^2. \tag{79}$$

Depending on the application, the “best” estimate could mean the unbiased one with the lowest variance (Tegmark 1997; Tegmark et al. 1997; Bond et al. 1998; Oh et al. 1999; Hinshaw et al. 2003), it could be simply the minimum-variance estimate having some acceptable and quantifiable bias (Wieczorek and Simons 2007), or, as we would usually prefer, it would be the one with the minimum mean-squared error (Simons and Dahlen 2006; Dahlen and Simons 2008).
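The decomposition of the mean-squared error into variance and squared bias in Eqs. (78) and (79) can be verified on any finite collection of estimates. In the minimal sketch below the ensemble averages $\langle \cdot \rangle$ are replaced by sample averages over a list of hypothetical estimates, which is an assumption made purely for illustration.

```python
def error_metrics(estimates, truth):
    """Sample variance, bias, and mean-squared error in the sense of Eq. (78)."""
    n = len(estimates)
    mean = sum(estimates) / n
    mean_sq = sum(e * e for e in estimates) / n
    variance = mean_sq - mean * mean                      # v = <s^2> - <s>^2
    bias = mean - truth                                   # b = <s> - s
    mse = sum((e - truth) ** 2 for e in estimates) / n    # <eps^2>
    return variance, bias, mse


# Example: four hypothetical estimates of a quantity whose true value is 2.5
v, b, mse = error_metrics([1.0, 2.0, 4.0, 5.0], 2.5)
```

For these numbers the mean-squared error equals the variance plus the squared bias, as Eq. (78) requires; an estimator can therefore trade a small bias for a large reduction in variance and still lower its mean-squared error.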

3.1 Problem (i): Signal Estimation from Spherical Data

3.1.1 Spherical Harmonic Solution

Paraphrasing results elaborated elsewhere (Simons and Dahlen 2006), we write the bandlimited solution to the damped inverse problem

$$\int_{R} (s - d)^2\, d\Omega + \eta \int_{\bar{R}} s^2\, d\Omega = \text{minimum}, \tag{80}$$

where $\eta \ge 0$ is a damping parameter, by straightforward algebraic manipulation, as

$$\hat{s}_{lm} = \sum_{l'=0}^{L} \sum_{m'=-l'}^{l'} \left( D_{lm,l'm'} + \eta\, \bar{D}_{lm,l'm'} \right)^{-1} \int_{R} d\, Y_{l'm'}\, d\Omega, \tag{81}$$


where $\bar{D}_{lm,l'm'}$, the kernel that localizes to the region $\bar{R} = \Omega \setminus R$, complements $D_{lm,l'm'}$ given by Eq. (33b) which localizes to $R$. Given the eigenvalue spectrum of the latter, its inversion is inherently unstable, thus Eq. (80) is an ill-conditioned inverse problem unless $\eta > 0$, as has been well known, e.g., in geodesy (Xu 1992; Sneeuw and van Gelderen 1997). Elsewhere (Simons and Dahlen 2006) we have derived exact expressions for the optimal value of the damping parameter $\eta$ as a function of the signal-to-noise ratio under certain simplifying assumptions. As can be easily shown, without damping the estimate is unbiased but effectively incomputable; the introduction of the damping term stabilizes the solution at the cost of added bias. And of course when $R = \Omega$, Eq. (81) is simply the spherical harmonic transform, as in that case, Eq. (33b) reduces to Eq. (29), in other words, then $D_{lm,l'm'} = \delta_{ll'} \delta_{mm'}$.

3.1.2 Slepian Basis Solution

The trial solution in the Slepian basis designed for this region of interest $R$, i.e.,

$$\hat{s}(\hat{\mathbf{r}}) = \sum_{\alpha=1}^{(L+1)^2} \hat{s}_\alpha\, g_\alpha(\hat{\mathbf{r}}), \tag{82}$$

would be completely equivalent to the expression in Eq. (77) by virtue of the completeness of the Slepian basis for bandlimited functions everywhere on the sphere and the unitarity of the transform (31) from the spherical harmonic to the Slepian basis. The solution to the undamped ($\eta = 0$) version of Eq. (80) would then be

$$\hat{s}_\alpha = \lambda_\alpha^{-1} \int_{R} d\, g_\alpha\, d\Omega, \tag{83}$$

which, being completely equivalent to Eq. (81) for $\eta = 0$, would be computable and biased, only when the expansion in Eq. (82) were to be truncated to some finite $J < (L+1)^2$ to prevent the blowup of the inverse eigenvalues $\lambda_\alpha^{-1}$. Assuming for simplicity of the argument that $J = N^{3D}$, the essence of the approach is now that the solution

$$\hat{s}(\hat{\mathbf{r}}) = \sum_{\alpha=1}^{N^{3D}} \hat{s}_\alpha\, g_\alpha(\hat{\mathbf{r}}) \tag{84}$$

will be sparse (in achieving a bandwidth $L$ using $N^{3D}$ Slepian instead of $(L+1)^2$ spherical-harmonic expansion coefficients) yet good (in approximating the signal as well as possible in the mean-squared sense in the region of interest $R$) and of geophysical utility (assuming we are dealing with spatially localized processes that are to be extracted, e.g., from global satellite measurements) as shown by Han and Simons (2008), Simons et al. (2009), and Harig and Simons (2012).

3.2 Bias and Variance

In concluding this section let us illustrate another welcome by-product of our methodology, by writing the mean-squared error for the spherical harmonic solution (81) compared to the equivalent expression for the solution in the Slepian basis, Eq. (83). We do this as a function of the spatial coordinate, in the Slepian basis for both, and, for maximum clarity of the exposition, using the contrived case when both signal and noise should be white (with power $S$ and $N$, respectively) as well as bandlimited (which technically is impossible). In the former case, we get

$$\langle \epsilon^2(\hat{\mathbf{r}}) \rangle = N \sum_{\alpha=1}^{(L+1)^2} \lambda_\alpha \left[ \lambda_\alpha + \eta(1 - \lambda_\alpha) \right]^{-2} g_\alpha^2(\hat{\mathbf{r}}) + \eta^2 S \sum_{\alpha=1}^{(L+1)^2} (1 - \lambda_\alpha)^2 \left[ \lambda_\alpha + \eta(1 - \lambda_\alpha) \right]^{-2} g_\alpha^2(\hat{\mathbf{r}}), \tag{85}$$

while in the latter, we obtain

$$\langle \epsilon^2(\hat{\mathbf{r}}) \rangle = N \sum_{\alpha=1}^{N^{3D}} \lambda_\alpha^{-1}\, g_\alpha^2(\hat{\mathbf{r}}) + S \sum_{\alpha > N^{3D}}^{(L+1)^2} g_\alpha^2(\hat{\mathbf{r}}). \tag{86}$$

All $(L+1)^2$ basis functions are required to express the mean-squared estimation error, whether in Eq. (85) or in Eq. (86). The first term in both expressions is the variance, which depends on the measurement noise. Without damping or truncation the variance grows without bounds. Damping and truncation alleviate this at the expense of added bias, which depends on the characteristics of the signal, as given by the second term. In contrast to Eq. (85), however, the Slepian expression (86) has disentangled the contributions due to noise/variance and signal/bias by projecting them onto the sparse set of well-localized and the remaining set of poorly localized Slepian functions, respectively. The estimation variance is felt via the basis functions $\alpha = 1 \to N^{3D}$ that are well concentrated inside the measurement area, and the effect of the bias is relegated to those $\alpha = N^{3D}+1 \to (L+1)^2$ functions that are confined to the region of missing data. When forming a solution to problem (i) in the Slepian basis by truncation according to Eq. (84), changing the truncation level $J$ to values lower or higher than the Shannon number $N^{3D}$ amounts to navigating the trade-off space between variance, bias (or “resolution”), and sparsity in a manner that is captured with great clarity by Eq. (86). We refer the reader elsewhere (Simons and Dahlen 2006, 2007) for more details, and, in particular, for the case of potential fields estimated from data collected at satellite altitude, treated in detail in chapter “Potential-field Estimation using Scalar and Vector Slepian Functions at Satellite Altitude” of this book.
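The trade-off between variance and bias can be made concrete with a small sketch that evaluates only the per-function weights multiplying $g_\alpha^2(\hat{\mathbf r})$ in Eqs. (85) and (86). The eigenvalue list, the noise and signal powers, and the function names below are made-up illustrations rather than values from any data set.

```python
def damped_weights(lam, eta, N, S):
    """Variance and bias weights of Eq. (85) for each Slepian function (damped solution)."""
    var = [N * l / (l + eta * (1.0 - l)) ** 2 for l in lam]
    bias = [eta ** 2 * S * (1.0 - l) ** 2 / (l + eta * (1.0 - l)) ** 2 for l in lam]
    return var, bias

def truncated_weights(lam, J, N, S):
    """Variance and bias weights of Eq. (86) for truncation after the first J functions."""
    var = [N / l if a < J else 0.0 for a, l in enumerate(lam)]
    bias = [0.0 if a < J else S for a, _ in enumerate(lam)]
    return var, bias

# Hypothetical concentration eigenvalues, sorted from well to poorly concentrated.
lam = [0.999, 0.9, 0.5, 0.1, 1e-3, 1e-6]
N, S = 1.0, 10.0

var_undamped, bias_undamped = damped_weights(lam, 0.0, N, S)   # unbiased, variance blows up
var_damped, bias_damped = damped_weights(lam, 0.1, N, S)       # finite variance, some bias
var_trunc, bias_trunc = truncated_weights(lam, 3, N, S)        # variance and bias separated

for name, v, b in [("undamped", var_undamped, bias_undamped),
                   ("damped", var_damped, bias_damped),
                   ("truncated", var_trunc, bias_trunc)]:
    print(name, [f"{x:.3g}" for x in v], [f"{x:.3g}" for x in b])
```

In the truncated case every function contributes either to the variance or to the bias but never to both, which is the disentangling that Eq. (86) expresses.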

3.3 Problem (ii): Power Spectrum Estimation from Spherical Data

Following Dahlen and Simons (2008) we will find it convenient to regard the data $d(\hat{\mathbf r})$ given in Eq. (75) as having been multiplied by a unit-valued boxcar window function confined to the region $R$,

$$ b(\hat{\mathbf r}) = \sum_{p=0}^{\infty} \sum_{q=-p}^{p} b_{pq}\, Y_{pq}(\hat{\mathbf r}) = \begin{cases} 1 & \text{if } \hat{\mathbf r} \in R, \\ 0 & \text{otherwise.} \end{cases} \tag{87} $$


The power spectrum of the boxcar window (87) is

$$ B_p = \frac{1}{2p+1} \sum_{q=-p}^{p} b_{pq}^2. \tag{88} $$
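As a hedged illustration of Eq. (88), consider the special case of a polar cap of colatitudinal radius $\Theta$, for which only the zonal coefficients $b_{p0}$ are nonzero. Assuming spherical harmonics that are orthonormal over the unit sphere, integrating $Y_{p0}$ over the cap gives $b_{00} = \sqrt{\pi}\,(1-\cos\Theta)$ and $b_{p0} = \sqrt{\pi/(2p+1)}\,[P_{p-1}(\cos\Theta) - P_{p+1}(\cos\Theta)]$ for $p \geq 1$. The function names below are ours.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) via the standard three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def cap_boxcar_power(theta, pmax):
    """Boxcar power B_p of Eq. (88), p = 0..pmax, for a polar cap of radius theta (radians).

    Assumes orthonormal harmonics, so that only the zonal coefficients b_p0 contribute.
    """
    x = math.cos(theta)
    power = []
    for p in range(pmax + 1):
        if p == 0:
            b = math.sqrt(math.pi) * (1.0 - x)
        else:
            b = math.sqrt(math.pi / (2 * p + 1)) * (legendre(p - 1, x) - legendre(p + 1, x))
        power.append(b * b / (2 * p + 1))
    return power

B_cap = cap_boxcar_power(math.radians(30.0), 10)   # a 30-degree cap
B_sphere = cap_boxcar_power(math.pi, 10)           # whole sphere: only B_0 survives
print(B_cap[:3], B_sphere[:3])
```

For the whole sphere the window is constant, so all power sits in degree zero, $B_0 = 4\pi$; for a cap, $B_0 = A^2/(4\pi)$ with $A$ the cap area.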

3.3.1 The Spherical Periodogram

Should we decide that an acceptable estimate of the power spectral density of the available data is nothing else but the weighted average of its spherical harmonic expansion coefficients, we would be forming the spherical analogue of what Schuster (1898) named the “periodogram” in the context of time series analysis, namely,

$$ \hat{S}_l^{\mathrm{SP}} = \left(\frac{4\pi}{A}\right) \frac{1}{2l+1} \sum_{m=-l}^{l} \left[ \int_R d(\hat{\mathbf r})\, Y_{lm}(\hat{\mathbf r})\, d\Omega \right]^2. \tag{89} $$

3.3.2 Bias of the Periodogram

Upon doing so we would discover that the expected value of such an estimator would be the biased quantity

$$ \langle \hat{S}_l^{\mathrm{SP}} \rangle = \sum_{l'=0}^{\infty} K_{ll'} \left(S_{l'} + N_{l'}\right), \tag{90} $$

where, as it is known in astrophysics and cosmology (Peebles 1973; Hauser and Peebles 1973; Hivon et al. 2002), the periodogram “coupling” matrix

$$ K_{ll'} = \left(\frac{4\pi}{A}\right) \frac{1}{2l+1} \sum_{m=-l}^{l} \sum_{m'=-l'}^{l'} \left[ D_{lm,l'm'} \right]^2 \tag{91} $$

governs the extent to which an estimate $\hat{S}_l^{\mathrm{SP}}$ of $S_l$ is influenced by spectral leakage from power in neighboring spherical harmonic degrees $l' = l \pm 1, l \pm 2, \ldots$, all the way down to $0$ and up to $\infty$. In the case of full data coverage, $R = \Omega$, or of a perfectly white spectrum, $S_l = S$, however, the estimate would be unbiased – provided the noise spectrum, if known, can be subtracted beforehand.

3.3.3 Variance of the Periodogram

The covariance of the periodogram estimator (89) would moreover be suffering from strong wideband coupling of the power spectral densities in being given by

$$ \Sigma_{ll'}^{\mathrm{SP}} = \frac{2\,(4\pi/A)^2}{(2l+1)(2l'+1)} \sum_{m=-l}^{l} \sum_{m'=-l'}^{l'} \left[ \sum_{p=0}^{\infty} \sum_{q=-p}^{p} \left(S_p + N_p\right) D_{lm,pq}\, D_{pq,l'm'} \right]^2. \tag{92} $$


Even under the commonly made assumption that the power spectrum is slowly varying within the main lobe of the coupling matrix, such coupling would be nefarious. In the “locally white” case we would have

$$ \Sigma_{ll'}^{\mathrm{SP}} = \frac{2\,(4\pi/A)^2}{(2l+1)(2l'+1)} \left(S_l + N_l\right)\left(S_{l'} + N_{l'}\right) \sum_{m=-l}^{l} \sum_{m'=-l'}^{l'} \left[ D_{lm,l'm'} \right]^2. \tag{93} $$

Only in the limit of whole-sphere data coverage will Eqs. (92) or (93) reduce to

$$ \Sigma_{ll'}^{\mathrm{WS}} = \frac{2}{2l+1} \left(S_l + N_l\right)^2 \delta_{ll'}, \tag{94} $$

which is the “planetary” or “cosmic” variance that can be understood on the basis of elementary statistical considerations (Jones 1963; Knox 1995; Grishchuk and Martin 1997). The strong spectral leakage for small regions ($A \ll 4\pi$) is highly undesirable and makes the periodogram “hopelessly obsolete” (Thomson and Chave 1991), or, to put it kindly, “naive” (Percival and Walden 1993), just as it is for one-dimensional time series. In principle it is possible – after subtraction of the noise bias – to eliminate the leakage bias in the periodogram estimate (89) by numerical inversion of the coupling matrix $K_{ll'}$. Such a “deconvolved periodogram” estimator is unbiased. However, its covariance depends on inverting the periodogram coupling matrix, which is only feasible when the region $R$ covers most of the sphere, $A \approx 4\pi$. For any region whose area $A$ is significantly smaller than $4\pi$, the periodogram coupling matrix (91) will be too ill-conditioned to be invertible. Thus, much like in problem (i) we are faced with bad bias and poor variance, both of which are controlled by the lack of localization of the spherical harmonics and their non-orthogonality over incomplete subdomains of the unit sphere. Both effects are described by the spatiospectral localization kernel defined in (33b), which, in the quadratic estimation problem (ii), appears in “squared” form in Eq. (92). Undoing the effects of the wideband coupling between degrees at which we seek to estimate the power spectral density by inversion of the coupling kernel is virtually impossible, and even if we could accomplish this to remove the estimation bias, this would much inflate the estimation variance (Dahlen and Simons 2008).
3.3.4 The Spherical Multitaper Estimate

We therefore take a page out of the one-dimensional power estimation playbook of Thomson (1982) by forming the “eigenvalue-weighted multitaper estimate.” We could weight single-taper estimates adaptively to minimize quality measures such as estimation variance or mean-squared error (Thomson 1982; Wieczorek and Simons 2007), but in practice, these methods tend to be rather computationally demanding. Instead we simply multiply the data $d(\hat{\mathbf r})$ by the Slepian functions or “tapers” $g_\alpha(\hat{\mathbf r})$ designed for the region of interest prior to computing power and then averaging:

$$ \hat{S}_l^{\mathrm{MT}} = \sum_{\alpha=1}^{(L+1)^2} \lambda_\alpha \left(\frac{4\pi}{N^{3D}}\right) \frac{1}{2l+1} \sum_{m=-l}^{l} \left[ \int_\Omega g_\alpha(\hat{\mathbf r})\, d(\hat{\mathbf r})\, Y_{lm}(\hat{\mathbf r})\, d\Omega \right]^2. \tag{95} $$


3.3.5 Bias of the Multitaper Estimate

The expected value of the estimate (95) is

$$ \langle \hat{S}_l^{\mathrm{MT}} \rangle = \sum_{l'=l-L}^{l+L} M_{ll'} \left(S_{l'} + N_{l'}\right), \tag{96} $$

where the eigenvalue-weighted multitaper coupling matrix, using Wigner 3-j functions (Varshalovich et al. 1988; Messiah 2000), is given by

$$ M_{ll'} = \frac{2l'+1}{(L+1)^2} \sum_{p=0}^{L} (2p+1) \begin{pmatrix} l & p & l' \\ 0 & 0 & 0 \end{pmatrix}^2. \tag{97} $$

It is remarkable that this result depends only upon the chosen bandwidth $L$ and is completely independent of the size, shape, or connectivity of the region $R$, even as $R = \Omega$. Moreover, every row of the matrix in Eq. (97) sums to unity, which ensures that the multitaper spectral estimate $\hat{S}_l^{\mathrm{MT}}$ has no leakage bias in the case of a perfectly white spectrum provided the noise bias is subtracted as well: $\langle \hat{S}_l^{\mathrm{MT}} \rangle - \sum_{l'} M_{ll'} N_{l'} = S$ if $S_l = S$.

3.3.6 Variance of the Multitaper Estimate

Under the moderately colored approximation, which is more easily justified in this case because the coupling (97) is confined to a narrow band of width less than or equal to $2L+1$, with $L$ the bandwidth of the tapers, the eigenvalue-weighted multitaper covariance is

$$ \Sigma_{ll'}^{\mathrm{MT}} = \frac{1}{2\pi} \left(S_l + N_l\right)\left(S_{l'} + N_{l'}\right) \sum_{p=0}^{2L} (2p+1)\, \Gamma_p \begin{pmatrix} l & p & l' \\ 0 & 0 & 0 \end{pmatrix}^2, \tag{98} $$

where, using Wigner 3-j and 6-j functions (Varshalovich et al. 1988; Messiah 2000),

$$ \Gamma_p = \frac{1}{(N^{3D})^2} \sum_{s=0}^{L} \sum_{s'=0}^{L} \sum_{u=0}^{L} \sum_{u'=0}^{L} (2s+1)(2s'+1)(2u+1)(2u'+1) \sum_{e=0}^{2L} (-1)^{p+e} (2e+1)\, B_e \begin{Bmatrix} s & e & s' \\ u & p & u' \end{Bmatrix} \begin{pmatrix} s & e & s' \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u & e & u' \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} s & p & u' \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u & p & s' \\ 0 & 0 & 0 \end{pmatrix}. \tag{99} $$

In this expression $B_e$, the boxcar power (88), which we note does depend on the shape of the region of interest, appears again, summed over angular degrees limited by 3-j selection rules to $0 \leq e \leq 2L$. The sum in Eq. (98) is likewise limited to degrees $0 \leq p \leq 2L$. The effect of tapering with windows bandlimited to $L$ is to introduce covariance between the estimates at any two different degrees $l$ and $l'$ that are separated by fewer than $2L+1$ degrees. Equations (98) and (99) are very efficiently computable, which should make them competitive with, e.g., jackknifed estimates of the estimation variance (Chave et al. 1987; Thomson and Chave 1991; Thomson 2007). The crux of the analysis lies in the fact that the matrix of the spectral covariances between single-tapered estimates is almost diagonal (Wieczorek and Simons 2007), showing that the individual estimates that enter the weighted formula (95) are almost uncorrelated statistically. This embodies the very essence of the multitaper method. It dramatically reduces the estimation variance at the cost of small increases of readily quantifiable bias.
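The multitaper coupling matrix of Eq. (97) depends only on the bandwidth $L$, so it is easy to tabulate. The sketch below is one possible way to do so, using the standard closed-form expression for Wigner 3-j symbols with vanishing orders; the function names are ours, and the factorial-based formula is only practical for moderate degrees.

```python
import math

def wigner3j_000(l1, l2, l3):
    """Wigner 3-j symbol (l1 l2 l3; 0 0 0) from its closed-form factorial expression."""
    J = l1 + l2 + l3
    # Zero unless J is even and the triangle condition holds.
    if J % 2 or l3 < abs(l1 - l2) or l3 > l1 + l2:
        return 0.0
    g = J // 2
    f = math.factorial
    delta = f(J - 2 * l1) * f(J - 2 * l2) * f(J - 2 * l3) / f(J + 1)
    return (-1) ** g * math.sqrt(delta) * f(g) / (f(g - l1) * f(g - l2) * f(g - l3))

def multitaper_coupling(L, lmax):
    """Coupling matrix M[l][l'] of Eq. (97) for 0 <= l <= lmax and 0 <= l' <= lmax + L."""
    M = []
    for l in range(lmax + 1):
        row = []
        for lp in range(lmax + L + 1):
            s = sum((2 * p + 1) * wigner3j_000(l, p, lp) ** 2 for p in range(L + 1))
            row.append((2 * lp + 1) * s / (L + 1) ** 2)
        M.append(row)
    return M

M = multitaper_coupling(L=4, lmax=10)
print([round(sum(row), 12) for row in M])   # each row sums to one
print([round(x, 4) for x in M[10]])         # nonzero only for |l - l'| <= L
```

The unit row sums follow from the orthogonality of the 3-j symbols together with $\sum_{p=0}^{L}(2p+1) = (L+1)^2$, and the band structure reflects the triangle condition $|l - l'| \leq p \leq L$.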

4 Practical Considerations

In this section we now turn to the very practical context of sampled, e.g., geodetic, data on the sphere. We shall deal exclusively with bandlimited scalar functions, which are equally well expressed in the spherical harmonic as the Slepian basis, namely:

$$ f(\hat{\mathbf r}) = \sum_{l=0}^{L} \sum_{m=-l}^{l} f_{lm}\, Y_{lm}(\hat{\mathbf r}) = \sum_{\alpha=1}^{(L+1)^2} f_\alpha\, g_\alpha(\hat{\mathbf r}), \tag{100} $$

whereby the Slepian-basis expansion coefficients are obtained as

$$ f_\alpha = \int_\Omega f(\hat{\mathbf r})\, g_\alpha(\hat{\mathbf r})\, d\Omega. \tag{101} $$

If the function of interest is spatially localized in the region $R$, a truncated reconstruction using Slepian functions built for the same region will constitute a very good, and sparse, local approximation to it (Simons et al. 2009):

$$ f(\hat{\mathbf r}) \approx \sum_{\alpha=1}^{N^{3D}} f_\alpha\, g_\alpha(\hat{\mathbf r}), \qquad \hat{\mathbf r} \in R. \tag{102} $$

We represent any sampled, bandlimited function $f$ by an $M$-dimensional column vector

$$ \mathsf{f} = \left( f_1 \;\cdots\; f_j \;\cdots\; f_M \right)^T, \tag{103} $$

where $f_j = f(\hat{\mathbf r}_j)$ is the value of $f$ at pixel $j$, and $M$ is the total number of sampling locations. In the most general case the distribution of pixel centers will be completely arbitrary (Hesse et al. 2010). The special case of equal-area pixelization of a 2-D function $f(\hat{\mathbf r})$ on the unit sphere $\Omega$ is analogous to the equispaced digitization of a 1-D time series. Integrals will then be assumed to be approximated with sufficient accuracy by a Riemann sum over a dense set of pixels,

$$ \int f(\hat{\mathbf r})\, d\Omega \approx \Delta\Omega \sum_{j=1}^{M} f_j \qquad \text{and} \qquad \int f^2(\hat{\mathbf r})\, d\Omega \approx \Delta\Omega\; \mathsf{f}^T \mathsf{f}. \tag{104} $$

We have deliberately left the integration domain out of the above equations to cover both the cases of sampling over the entire unit sphere surface $\Omega$, in which case the solid angle $\Delta\Omega = 4\pi/M$ (case 1), as well as over an incomplete subdomain $R \subset \Omega$, in which case $\Delta\Omega = A/M$, with $A$ the area of the region $R$ (case 2). If we collect the real spherical harmonic basis functions $Y_{lm}$ into an $(L+1)^2 \times M$-dimensional matrix

$$ \mathsf{Y} = \begin{pmatrix} Y_{00}(\hat{\mathbf r}_1) & \cdots & Y_{00}(\hat{\mathbf r}_j) & \cdots & Y_{00}(\hat{\mathbf r}_M) \\ & & \vdots & & \\ \cdots & & Y_{lm}(\hat{\mathbf r}_j) & & \cdots \\ & & \vdots & & \\ Y_{LL}(\hat{\mathbf r}_1) & \cdots & Y_{LL}(\hat{\mathbf r}_j) & \cdots & Y_{LL}(\hat{\mathbf r}_M) \end{pmatrix}, \tag{105} $$

and the spherical harmonic coefficients of the function into an $(L+1)^2 \times 1$-dimensional vector

$$ \mathbf{f} = \left( f_{00} \;\cdots\; f_{lm} \;\cdots\; f_{LL} \right)^T, \tag{106} $$

we can write the spherical harmonic synthesis in Eq. (100) for sampled data without loss of generality as

$$ \mathsf{f} = \mathsf{Y}^T \mathbf{f}. \tag{107} $$

We will adhere to the notation convention of using sans-serif fonts (e.g., $\mathsf{f}$, $\mathsf{Y}$) for vectors or matrices that depend on at least one spatial variable, and serifed fonts (e.g., $\mathbf{f}$, $\mathbf{D}$) for those that are entirely composed of “spectral” quantities. In the case of dense, equal-area, whole-sphere sampling, we have an approximation to Eq. (29):

$$ \mathsf{Y}\mathsf{Y}^T \approx \Delta\Omega^{-1}\, \mathbf{I} \qquad \text{(case 1)}, \tag{108} $$

where the elements of the $(L+1)^2 \times (L+1)^2$-dimensional spectral identity matrix $\mathbf{I}$ are given by the Kronecker deltas $\delta_{ll'}\delta_{mm'}$. In the case of dense, equal-area sampling over some closed region $R$, we find instead an approximation to the $(L+1)^2 \times (L+1)^2$-dimensional “spatiospectral localization matrix”:

$$ \mathsf{Y}\mathsf{Y}^T \approx \Delta\Omega^{-1}\, \mathbf{D} \qquad \text{(case 2)}, \tag{109} $$

where the elements of $\mathbf{D}$ are those defined in Eq. (33b). Let us now introduce the $(L+1)^2 \times (L+1)^2$-dimensional matrix of spectral Slepian eigenfunctions by

$$ \mathbf{G} = \begin{pmatrix} g_{00\,1} & \cdots & g_{00\,\alpha} & \cdots & g_{00\,(L+1)^2} \\ & & \vdots & & \\ \cdots & & g_{lm\,\alpha} & & \cdots \\ & & \vdots & & \\ g_{LL\,1} & \cdots & g_{LL\,\alpha} & \cdots & g_{LL\,(L+1)^2} \end{pmatrix}. \tag{110} $$

This is the matrix that contains the eigenfunctions of the problem defined in Eq. (33), which we rewrite as

$$ \mathbf{D}\,\mathbf{G} = \mathbf{G}\,\boldsymbol{\Lambda}, \tag{111} $$

where the diagonal matrix with the concentration eigenvalues is given by

$$ \boldsymbol{\Lambda} = \mathrm{diag}\left( \lambda_1 \;\cdots\; \lambda_\alpha \;\cdots\; \lambda_{(L+1)^2} \right). \tag{112} $$

The spectral orthogonality relations of Eq. (35) are

$$ \mathbf{G}^T \mathbf{G} = \mathbf{I}, \qquad \mathbf{G}^T \mathbf{D}\, \mathbf{G} = \boldsymbol{\Lambda}, \tag{113} $$

where the elements of the $(L+1)^2 \times (L+1)^2$-dimensional Slepian identity matrix $\mathbf{I}$ are given by the Kronecker deltas $\delta_{\alpha\beta}$. We write the Slepian functions of Eq. (31) as

$$ \mathsf{G} = \mathbf{G}^T \mathsf{Y} \qquad \text{and} \qquad \mathsf{Y} = \mathbf{G}\,\mathsf{G}, \tag{114} $$

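where the $(L+1)^2 \times M$-dimensional matrix holding the sampled spatial Slepian functions is given by

$$ \mathsf{G} = \begin{pmatrix} g_1(\hat{\mathbf r}_1) & \cdots & g_1(\hat{\mathbf r}_j) & \cdots & g_1(\hat{\mathbf r}_M) \\ & & \vdots & & \\ \cdots & & g_\alpha(\hat{\mathbf r}_j) & & \cdots \\ & & \vdots & & \\ g_{(L+1)^2}(\hat{\mathbf r}_1) & \cdots & g_{(L+1)^2}(\hat{\mathbf r}_j) & \cdots & g_{(L+1)^2}(\hat{\mathbf r}_M) \end{pmatrix}. \tag{115} $$

Under a dense, equal-area, whole-sphere sampling, we will recover the spatial orthogonality of Eq. (36) approximately as

$$ \mathsf{G}\mathsf{G}^T \approx \Delta\Omega^{-1}\, \mathbf{I} \qquad \text{(case 1)}, \tag{116} $$

whereas for dense, equal-area sampling over a region $R$ we will get, instead,

$$ \mathsf{G}\mathsf{G}^T \approx \Delta\Omega^{-1}\, \boldsymbol{\Lambda} \qquad \text{(case 2)}. \tag{117} $$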

With this matrix notation we shall revisit both estimation problems of the previous section.
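Before doing so, the discrete orthogonality of the sampled harmonics under whole-sphere sampling (case 1) can be checked numerically. The sketch below makes several assumptions that are not part of the text: it uses a Fibonacci lattice as a stand-in for a dense, approximately equal-area pixelization, restricts itself to bandwidth $L = 1$, and writes out the four real harmonics explicitly with one particular sign and ordering convention.

```python
import numpy as np

def fibonacci_sphere(M):
    """M approximately equal-area points on the unit sphere (Fibonacci lattice), shape (M, 3)."""
    j = np.arange(M)
    z = 1.0 - (2.0 * j + 1.0) / M
    phi = 2.0 * np.pi * j / ((1.0 + np.sqrt(5.0)) / 2.0)
    s = np.sqrt(1.0 - z ** 2)
    return np.column_stack((s * np.cos(phi), s * np.sin(phi), z))

def real_sh_L1(r):
    """Real harmonics of degree 0 and 1, orthonormal over the unit sphere; shape (4, M)."""
    x, y, z = r.T
    c0 = 1.0 / np.sqrt(4.0 * np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.vstack((c0 * np.ones_like(z), c1 * y, c1 * z, c1 * x))

M = 2000
r = fibonacci_sphere(M)
Y = real_sh_L1(r)                 # the (L+1)^2 x M matrix of Eq. (105), here with L = 1
dOmega = 4.0 * np.pi / M          # solid angle per pixel, case 1

gram = dOmega * (Y @ Y.T)         # should be close to the identity, Eq. (108)
f_true = np.array([1.0, -0.5, 2.0, 0.3])      # hypothetical spectral coefficients
f_samp = Y.T @ f_true                          # synthesis, Eq. (107)
f_hat = dOmega * (Y @ f_samp)                  # approximate analysis by Riemann sum
f_ls = np.linalg.solve(Y @ Y.T, Y @ f_samp)    # least-squares solution of Eq. (107)

print(np.round(gram, 3))
print(np.round(f_hat, 3), np.round(f_ls, 3))
```

The Riemann-sum analysis is only as accurate as the quadrature implied by the pixelization, whereas the least-squares solution recovers the noiseless coefficients to machine precision whenever $\mathsf{Y}\mathsf{Y}^T$ is well conditioned.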

4.1 Problem (i), Revisited

4.1.1 Spherical Harmonic Solution

If we treat Eq. (107) as a noiseless inverse problem in which the sampled data $\mathsf{f}$ are given but from which the coefficients $\mathbf{f}$ are to be determined, we find that for dense, equal-area, whole-sphere sampling, the solution

$$ \hat{\mathbf{f}} \approx \Delta\Omega\; \mathsf{Y}\,\mathsf{f} \qquad \text{(case 1)} \tag{118} $$

is simply the discrete approximation to the spherical harmonic analysis formula (25). For dense, equal-area, regional sampling we need to calculate

$$ \hat{\mathbf{f}} \approx \Delta\Omega\; \mathbf{D}^{-1}\, \mathsf{Y}\,\mathsf{f} \qquad \text{(case 2)}. \tag{119} $$

Both of these cases are simply the relevant solutions to the familiar overdetermined spherical harmonic inversion problem (Kaula 1967; Menke 1989; Aster et al. 2005) for discretely sampled data, i.e., the least-squares solution to Eq. (107),

$$ \hat{\mathbf{f}} = \left(\mathsf{Y}\mathsf{Y}^T\right)^{-1} \mathsf{Y}\,\mathsf{f}, \tag{120} $$

for the particular cases described by Eqs. (108) and (109). In Eq. (119) we furthermore recognize the discrete version of Eq. (81) with $\eta = 0$, the undamped solution to the minimum mean-squared error inverse problem posed in continuous form in Eq. (80). From the continuous limiting case Eq. (81), we thus discover the general form that damping should take in regularizing the ill-conditioned inverse required in Eqs. (119) and (120). Its principal property is that it differs from the customary ad hoc practice of adding small values on the diagonal only. Finally, in the most general and admittedly most commonly encountered case of randomly scattered data, we require the Moore-Penrose pseudo-inverse

$$ \hat{\mathbf{f}} = \mathrm{pinv}\left(\mathsf{Y}^T\right) \mathsf{f}, \tag{121} $$

which is constructed by inverting the singular value decomposition (svd) of $\mathsf{Y}^T$ with its singular values truncated beyond where they fall below a certain threshold (Xu 1998). Solving Eq. (121) by truncated svd is equivalent to inverting a truncated eigenvalue expansion of the normal matrix $\mathsf{Y}\mathsf{Y}^T$ as it appears in Eq. (120), as can be easily shown.

4.1.2 Slepian Basis Solution

If we collect the Slepian expansion coefficients of the function $f$ into the $(L+1)^2 \times 1$-dimensional vector

$$ \mathbf{t} = \left( f_1 \;\cdots\; f_\alpha \;\cdots\; f_{(L+1)^2} \right)^T, \tag{122} $$

the expansion (100) in the Slepian basis takes the form

$$ \mathsf{f} = \mathsf{G}^T \mathbf{t} = \mathsf{Y}^T \mathbf{G}\, \mathbf{t}, \tag{123} $$

where we used Eqs. (113) and (114) to obtain the second equality. Comparing Eq. (123) with Eq. (107), we see that the Slepian expansion coefficients of a function transform to and from the spherical harmonic coefficients as

$$ \mathbf{f} = \mathbf{G}\, \mathbf{t} \qquad \text{and} \qquad \mathbf{t} = \mathbf{G}^T \mathbf{f}. \tag{124} $$

Under dense, equal-area sampling with complete coverage, the coefficients in Eq. (123) can be estimated from

$$ \hat{\mathbf{t}} \approx \Delta\Omega\; \mathsf{G}\,\mathsf{f} \qquad \text{(case 1)}, \tag{125} $$

the discrete, approximate version of Eq. (101). For dense, equal-area sampling in a limited region $R$, we get

$$ \hat{\mathbf{t}} \approx \Delta\Omega\; \boldsymbol{\Lambda}^{-1}\, \mathsf{G}\,\mathsf{f} \qquad \text{(case 2)}. \tag{126} $$

As expected, both of the solutions (125) and (126) are again special cases of the overdetermined least-squares solution

$$ \hat{\mathbf{t}} = \left(\mathsf{G}\mathsf{G}^T\right)^{-1} \mathsf{G}\,\mathsf{f}, \tag{127} $$

as applied to Eqs. (116) and (117). We encountered Eq. (126) before in the continuous form of Eq. (83); it solves the undamped minimum mean-squared error problem (80). The regularization of this ill-conditioned inverse problem may be achieved by truncation of the concentration eigenvalues, e.g., by restricting the size of the $(L+1)^2 \times (L+1)^2$-dimensional operator $\mathsf{G}\mathsf{G}^T$ to its first $J \times J$ subblock. Finally, in the most general, scattered-data case, we would be using an eigenvalue-truncated version of Eq. (127), or, which is equivalent, form the pseudo-inverse

$$ \hat{\mathbf{t}} = \mathrm{pinv}\left(\mathsf{G}^T\right) \mathsf{f}. \tag{128} $$

The solutions (118)–(120) and (125)–(127) are equivalent and differ only by the orthonormal change of basis from the spherical harmonics to the Slepian functions. Indeed, using Eqs. (114) and (124) to transform Eq. (127) into an equation for the spherical harmonic coefficients and comparing with Eq. (120) exposes the relation

$$ \mathbf{G} \left(\mathsf{G}\mathsf{G}^T\right)^{-1} \mathbf{G}^T = \left(\mathsf{Y}\mathsf{Y}^T\right)^{-1}, \tag{129} $$

which is a trivial identity for case 1 (insert Eqs. 108, 116 and 113) and, after substituting Eqs. (109) and (117), entails

$$ \mathbf{G}\, \boldsymbol{\Lambda}^{-1}\, \mathbf{G}^T = \mathbf{D}^{-1} \tag{130} $$

for case 2, which holds by virtue of Eq. (113). Equation (129) can also be verified directly from Eq. (114), which implies

$$ \mathsf{Y}\mathsf{Y}^T = \mathbf{G} \left(\mathsf{G}\mathsf{G}^T\right) \mathbf{G}^T. \tag{131} $$
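The algebra behind Eqs. (120), (121), (127), and (129) does not depend on the particular functions being sampled, only on $\mathbf{G}$ being orthogonal. The following sketch therefore checks the relations with random stand-ins: a random full-rank matrix in place of the sampled harmonics and a random orthogonal matrix in place of the Slepian transform. These stand-ins, the dimensions, and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 9, 40                                   # n stands in for (L+1)^2, M for the pixel count

Y = rng.standard_normal((n, M))                # stand-in for the sampled harmonics
G, _ = np.linalg.qr(rng.standard_normal((n, n)))   # stand-in orthogonal transform, G^T G = I
Gs = G.T @ Y                                   # sampled "Slepian" functions, Eq. (114)

f_true = rng.standard_normal(n)                # hypothetical spectral coefficients
d = Y.T @ f_true                               # noiseless scattered data, Eq. (107)

f_ls = np.linalg.solve(Y @ Y.T, Y @ d)         # least squares, Eq. (120)
f_pinv = np.linalg.pinv(Y.T) @ d               # pseudo-inverse, Eq. (121)
t_ls = np.linalg.solve(Gs @ Gs.T, Gs @ d)      # least squares in the rotated basis, Eq. (127)

lhs = G @ np.linalg.inv(Gs @ Gs.T) @ G.T       # left-hand side of Eq. (129)
rhs = np.linalg.inv(Y @ Y.T)                   # right-hand side of Eq. (129)

print(np.allclose(f_ls, f_pinv), np.allclose(G @ t_ls, f_ls), np.allclose(lhs, rhs))
```

With real data the interesting differences arise only once eigenvalues or singular values are truncated; without truncation, as here, all routes give the same coefficients up to rounding.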

The popular but labor-intensive procedure by which the unknown spherical harmonic expansion coefficients of a scattered data set are obtained by forming the Moore-Penrose pseudo-inverse as in Eq. (121) is thus equivalent to determining the truncated Slepian solution of Eq. (126) in the limit of continuous and equal-area, but incomplete data coverage. In that limit, the generic eigenvalue decomposition of the normal matrix becomes a specific statement of the Slepian problem as we encountered it before, namely,

$$ \mathsf{Y}\mathsf{Y}^T = \mathbf{U}\, \boldsymbol{\Sigma}^2\, \mathbf{U}^T \quad \to \quad \mathbf{D} = \mathbf{G}\, \boldsymbol{\Lambda}\, \mathbf{G}^T. \tag{132} $$

Such a connection has been previously pointed out for time series (Wingham 1992) and leads to the notion of “generalized prolate spheroidal functions” (Bronez 1988) should the “Slepian” functions be computed from a formulation of the concentration problem in the scattered data space directly, rather than being determined by sampling those obtained from solving the corresponding continuous problem, as we have described here. Above, we showed how to stabilize the inverse problem of Eq. (120) by damping. We dealt with the case of continuously available data only; the form in which it appears in Eq. (81) makes it clear that damping is hardly practical for scattered data. Indeed, it requires knowledge of the complementary localization operator $\bar{\mathbf{D}}$, in addition to being sensitive to the choice of $\eta$, whose optimal value depends implicitly on the unknown signal-to-noise ratio (Simons and Dahlen 2006). The data-driven approach taken in Eq. (121) is the more sensible one (Xu 1998). We have now seen that, in the limit of continuous partial coverage, this corresponds to the optimal solution of the problem formulated directly in the Slepian basis. It is consequently advantageous to also work in the Slepian basis in case the data collected are scattered but closely collocated in some region of interest. Prior knowledge of the geometry of this region and a prior idea of the spherical harmonic bandwidth of the data to be inverted allows us to construct a Slepian basis for the situation at hand, and the problem of finding the Slepian expansion coefficients of the unknown underlying function can be solved using Eqs. (127) and (128). The measure within which this approach agrees with the theoretical form of Eq. (126) will depend on how regularly the data are distributed within the region of study, i.e., on the error in the approximation (117).
But if indeed the scattered-data Slepian normal matrix $\mathsf{G}\mathsf{G}^T$ is nearly diagonal in its first $J \times J$-dimensional block due to the collocated observations having been favorably, if irregularly, distributed, then Eq. (126), which, strictly speaking, requires no matrix inversion, can be applied directly. If this is not the case, but the data are still collocated or we are only interested in a local approximation to the unknown signal, we can restrict $\mathsf{G}$ to its first $J$ rows, prior to diagonalizing $\mathsf{G}\mathsf{G}^T$ or performing the svd of a partial $\mathsf{G}^T$ as necessary to calculate Eqs. (127) and (128). Compared to solving Eqs. (120) and (121), the computational savings will still be substantial, as only when $R \approx \Omega$ will the operator $\mathsf{Y}\mathsf{Y}^T$ be nearly diagonal. Truncation of the eigenvalues of $\mathsf{Y}\mathsf{Y}^T$ is akin to truncating the matrix $\mathsf{G}\mathsf{G}^T$ itself, which is diagonal or will be nearly so. With the theoretically, continuously determined, sampled Slepian functions as a parametrization, the truncated expansion is easy to obtain and the solution will be locally faithful within the region of interest $R$. In contrast, should we truncate $\mathsf{Y}\mathsf{Y}^T$ itself, without first diagonalizing it, we would be estimating a low-degree approximation of the signal which would have poor resolution everywhere. See Slobbe et al. (2012) for a set of examples in a slightly expanded and numerically more challenging context.

4.1.3 Bias and Variance

For completeness we briefly return to the expressions for the mean-squared estimation error of the damped spherical-harmonic and the truncated Slepian function methods, Eqs. (85) and (86), which we quoted for the example of “white” signal and noise with power $S$ and $N$, respectively. Introducing the $(L+1)^2 \times (L+1)^2$-dimensional spectral matrices

$$ \mathbf{H} = \boldsymbol{\Lambda} + \eta\,(\mathbf{I} - \boldsymbol{\Lambda}), \tag{133a} $$

$$ \mathbf{V} = N\, \mathbf{H}^{-2} \boldsymbol{\Lambda}, \qquad \text{and} \qquad \mathbf{B} = \sqrt{S}\; \mathbf{H}^{-1} (\mathbf{I} - \boldsymbol{\Lambda}), \tag{133b} $$

we handily rewrite the “full” version of Eq. (85) in two spatial variables as the error covariance matrix

$$ \langle \epsilon(\hat{\mathbf r})\, \epsilon(\hat{\mathbf r}') \rangle = \mathsf{G}^T \left( \mathbf{V} + \eta^2 \mathbf{B}^2 \right) \mathsf{G}. \tag{134} $$

We subdivide the matrix with Slepian functions into the truncated set of the best-concentrated $\alpha = 1 \to J$ and the complementary set of remaining $\alpha = J+1 \to (L+1)^2$ functions, as follows:

$$ \mathsf{G} = \left( \bar{\mathsf{G}}^T \;\; \underline{\mathsf{G}}^T \right)^T, \tag{135} $$

and similarly separate the eigenvalues, writing

$$ \bar{\boldsymbol{\Lambda}} = \mathrm{diag}\left( \lambda_1 \;\cdots\; \lambda_J \right), \tag{136a} $$

$$ \underline{\boldsymbol{\Lambda}} = \mathrm{diag}\left( \lambda_{J+1} \;\cdots\; \lambda_{(L+1)^2} \right). \tag{136b} $$

Likewise, the identity matrix is split into two parts, $\bar{\mathbf{I}}$ and $\underline{\mathbf{I}}$. If we now also redefine

$$ \bar{\mathbf{V}} = N\, \bar{\boldsymbol{\Lambda}}^{-1}, \qquad \text{and} \qquad \bar{\mathbf{B}} = \sqrt{S}\; \bar{\mathbf{I}}, \tag{137a} $$

$$ \underline{\mathbf{V}} = N\, \underline{\boldsymbol{\Lambda}}^{-1}, \qquad \text{and} \qquad \underline{\mathbf{B}} = \sqrt{S}\; \underline{\mathbf{I}}, \tag{137b} $$

the equivalent version of Eq. (86) is readily transformed into the full spatial error covariance matrix

$$ \langle \epsilon(\hat{\mathbf r})\, \epsilon(\hat{\mathbf r}') \rangle = \bar{\mathsf{G}}^T\, \bar{\mathbf{V}}\, \bar{\mathsf{G}} + \underline{\mathsf{G}}^T\, \underline{\mathbf{B}}^2\, \underline{\mathsf{G}}. \tag{138} $$

In selecting the Slepian basis we have thus successfully separated the effect of the variance and the bias on the mean-squared reconstruction error of a noisily observed signal. If the region of observation is a contiguous closed domain $R \subset \Omega$ and the truncation should take place at the Shannon number $J = N^{3D}$, we have thereby identified the variance as due to noise in the region where data are available and the bias to signal neglected in the truncated expansion – which, in the proper Slepian basis, corresponds to the regions over which no observations exist. In practice, the truncation will happen at some $J$ that depends on the signal-to-noise ratio (Simons and Dahlen 2006) and/or on computational considerations (Slobbe et al. 2012). Finally, we shall also apply the notions of discretely acquired data to the solutions of problem (ii), below.

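4.2 Problem (ii), Revisited

We need two more pieces of notation in order to rewrite the expressions for the spectral estimates (89) and (95) in the “pixel-basis.” First we construct the $M \times M$-dimensional symmetric spatial matrix collecting the fixed-degree Legendre polynomials evaluated at the angular distances between all pairs of observation points,

$$ \mathsf{P}_l = \left(\frac{2l+1}{4\pi}\right) \begin{pmatrix} P_l(\hat{\mathbf r}_1 \cdot \hat{\mathbf r}_1) & \cdots & P_l(\hat{\mathbf r}_1 \cdot \hat{\mathbf r}_j) & \cdots & P_l(\hat{\mathbf r}_1 \cdot \hat{\mathbf r}_M) \\ & & \vdots & & \\ \cdots & & P_l(\hat{\mathbf r}_i \cdot \hat{\mathbf r}_j) & & \cdots \\ & & \vdots & & \\ P_l(\hat{\mathbf r}_M \cdot \hat{\mathbf r}_1) & \cdots & P_l(\hat{\mathbf r}_M \cdot \hat{\mathbf r}_j) & \cdots & P_l(\hat{\mathbf r}_M \cdot \hat{\mathbf r}_M) \end{pmatrix}. \tag{139} $$

The elements of $\mathsf{P}_l$ are thus $\sum_{m=-l}^{l} Y_{lm}(\hat{\mathbf r}_i)\, Y_{lm}(\hat{\mathbf r}_j)$, by the addition theorem, Eq. (30). And finally, we define $\mathsf{G}_l^\alpha$, the $M \times M$ symmetric matrix with elements given by

$$ \left( \mathsf{G}_l^\alpha \right)_{ij} = \left(\frac{2l+1}{4\pi}\right) g_\alpha(\hat{\mathbf r}_i)\, P_l(\hat{\mathbf r}_i \cdot \hat{\mathbf r}_j)\, g_\alpha(\hat{\mathbf r}_j). \tag{140} $$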

4.2.1 The Spherical Periodogram

The expression equivalent to Eq. (89) is now written as

$$ \hat{S}_l^{\mathrm{SP}} = \left(\frac{4\pi}{A}\right) \frac{(\Delta\Omega)^2}{2l+1}\; \mathsf{d}^T \mathsf{P}_l\, \mathsf{d}, \tag{141} $$

whereby the column vector $\mathsf{d}$ contains the sampled data as in the notation for Eq. (103). This lends itself easily to computation, and the statistics of Eqs. (90)–(93) hold, approximately, for sufficiently densely sampled data.

4.2.2 The Spherical Multitaper Estimate

Finally, the expression equivalent to Eq. (95) becomes

$$ \hat{S}_l^{\mathrm{MT}} = \sum_{\alpha=1}^{(L+1)^2} \lambda_\alpha \left(\frac{4\pi}{N^{3D}}\right) \frac{(\Delta\Omega)^2}{2l+1}\; \mathsf{d}^T \mathsf{G}_l^\alpha\, \mathsf{d}. \tag{142} $$

Both Eqs. (141) and (142) are quadratic forms, earning them the nickname “quadratic spectral estimators” (Mullis and Scharf 1991). The key difference with the maximum-likelihood estimator popular in cosmology (Bond et al. 1998; Oh et al. 1999; Hinshaw et al. 2003), which can also be written as a quadratic form (Tegmark 1997), is that neither $\mathsf{P}_l$ nor $\mathsf{G}_l^\alpha$ depends on the unknown spectrum itself and can be easily precomputed. In contrast, maximum-likelihood estimation is inherently nonlinear, requiring iteration to converge to the most probable estimate of the power spectral density (Dahlen and Simons 2008). As such, given a pixel grid, a region of interest $R$ and a bandwidth $L$, Eq. (142) produces a consistent localized multitaper power spectral estimate in one step. The estimate (142) has the statistical properties that we listed earlier as Eqs. (96)–(99). These continue to hold when the data pixelization is fine enough to have integral expressions of the kind (104) be exact. As mentioned before, for completely irregularly and potentially non-densely distributed discrete data on the sphere, “generalized” Slepian functions (Bronez 1988) could be constructed specifically for the purpose of their power spectral estimation and used to build the operator (140).
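The pixel-basis periodogram of Eq. (141) is a short computation once the matrix of Eq. (139) is formed. The sketch below is a minimal version under stated assumptions: whole-sphere coverage ($A = 4\pi$), a Fibonacci lattice standing in for a dense equal-area pixelization, and two synthetic test signals whose spectra are known in advance. The helper names are ours.

```python
import numpy as np
from numpy.polynomial import legendre as npleg

def fibonacci_sphere(M):
    """M approximately equal-area points on the unit sphere (Fibonacci lattice), shape (M, 3)."""
    j = np.arange(M)
    z = 1.0 - (2.0 * j + 1.0) / M
    phi = 2.0 * np.pi * j / ((1.0 + np.sqrt(5.0)) / 2.0)
    s = np.sqrt(1.0 - z ** 2)
    return np.column_stack((s * np.cos(phi), s * np.sin(phi), z))

def periodogram(d, r, l, area):
    """Pixel-basis periodogram of Eq. (141) at degree l for data d sampled at unit vectors r."""
    dOmega = area / len(d)
    cosg = np.clip(r @ r.T, -1.0, 1.0)          # cosines of all pairwise angular distances
    coef = np.zeros(l + 1)
    coef[l] = 1.0                                # selects the single Legendre polynomial P_l
    P = (2 * l + 1) / (4.0 * np.pi) * npleg.legval(cosg, coef)   # matrix of Eq. (139)
    return (4.0 * np.pi / area) * dOmega ** 2 / (2 * l + 1) * (d @ P @ d)

M = 500
r = fibonacci_sphere(M)
area = 4.0 * np.pi                               # whole-sphere coverage assumed

d_const = np.ones(M)                             # f = 1, i.e., f_00 = sqrt(4*pi)
d_zonal = np.sqrt(3.0 / (4.0 * np.pi)) * r[:, 2] # f = Y_10, i.e., f_10 = 1

S_const = [periodogram(d_const, r, l, area) for l in range(3)]
S_zonal = [periodogram(d_zonal, r, l, area) for l in range(3)]
print(np.round(S_const, 4), np.round(S_zonal, 4))
```

For these two signals the expected spectra are $S_0 = 4\pi$ with nothing elsewhere, and $S_1 = 1/3$ with nothing elsewhere, respectively; small departures reflect the quadrature error of the pixelization. Replacing $\mathsf{P}_l$ by the tapered matrices of Eq. (140) and summing with eigenvalue weights would give the multitaper estimate of Eq. (142), which additionally requires the Slepian functions for the region.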


5 Conclusions

What is the information contained in a bandlimited set of scientific observations made over an incomplete, e.g., temporally or spatially limited sampling domain? How can this “information,” e.g., an estimate of the signal itself, or of its energy density, be determined from noisy data, and how shall it be represented? These seemingly age-old fundamental questions, which have implications beyond the scientific (Slepian 1976), had been solved – some say, by conveniently ignoring them – heuristically, by engineers, well before receiving their first satisfactory answers given in the theoretical treatment by Slepian, Landau, and Pollak (Slepian and Pollak 1961; Landau and Pollak 1961, 1962), first for “continuous” time series, later generalized to the multidimensional and discrete cases (Slepian 1964; Slepian 1978; Bronez 1988). By the “Slepian functions” in the title of this contribution, we have lumped together all functions that are “spatiospectrally” concentrated, quadratically, in the original sense of Slepian. In one dimension, these are the “prolate spheroidal functions” whose popularity is as enduring as their utility. In two Cartesian dimensions, and on the surface of the unit sphere, both scalar and vectorial, their time for applications in geomathematics has come. The answers to the questions posed above are as relevant as ever for the geosciences of today. There, we often face the additional complications of irregularly shaped study domains, scattered observations of noise-contaminated potential fields, perhaps collected from an altitude above the source by airplanes or satellites, and an acquisition and model-space geometry that is rarely if ever symmetric. Thus the Slepian functions are especially suited for geoscientific applications and to study any type of geographical information, in general.
Two problems that are of particular interest in the geosciences, but also further afield, are how to form a statistically “optimal” estimate of the signal giving rise to the data and how to estimate the power spectral density of such signal. The first, an inverse problem that is linear in the data, applies to forming mass flux estimates from time-variable gravity, e.g., by the GRACE mission (Harig and Simons 2012), or to the characterization of the terrestrial or planetary magnetic fields by satellites such as CHAMP, SWARM, or MGS. The second, which is quadratic in the data, is of interest in studying the statistics of the Earth’s or planetary topography and magnetic fields (Lewis and Simons 2012; Beggan et al. 2013) and especially for the cross-spectral analysis of gravity and topography (Wieczorek 2008), which can yield important clues about the internal structure of the planets. The second problem is also of great interest in cosmology, where missions such as WMAP and PLANCK are mapping the cosmic microwave background radiation, which is best modeled spectrally to constrain models of the evolution of our universe. Slepian functions, as we have shown by focusing on the scalar case in spherical geometry, provide the mathematical framework to solve such problems. They are a convenient and easily obtained doubly orthogonal mathematical basis in which to express, and thus by which to recover, signals that are geographically localized or incompletely (and noisily) observed. For this they are much better suited than the traditional Fourier or spherical harmonic bases, and they are more “geologically intuitive” than wavelet bases in retaining a firm geographic footprint and preserving the traditional notions of frequency or spherical harmonic degree. 
They are furthermore extremely performant as data tapers to regularize the inverse problem of power spectral density determination from noisy and patchy observations, which can then be solved satisfactorily without costly iteration. Finally, by the interpretation of the Slepian functions as their limiting cases, much can be learned about the statistical nature of such inverse problems when the data provided are themselves scattered within a specific areal region of study.


Acknowledgments

We are indebted to Tony Dahlen (1942–2007), Mark Wieczorek, and Volker Michel for many enlightening discussions over the years. Dong V. Wang aided with the calculations of the Cartesian case, and Liying Wei contributed to the development of the vectorial case. Yoel Shkolnisky pointed us to the symmetric relations (22) and (23), and Kornél Jahn shared a preprint of his most recent paper. Financial support for this work was provided by the US National Science Foundation under Grants EAR-0105387, EAR-0710860, EAR-1014606, and EAR-1150145, by the Université Paris Diderot–Paris 7 and the Institut de Physique du Globe de Paris in St. Maur-des-Fossés, the Ulrich Schmucker Memorial Trust, and the Swiss National Science Foundation. Computer algorithms are made available on www.frederik.net.

References Albertella A, Sansò F, Sneeuw N (1999) Band-limited functions on a bounded spherical domain: the Slepian problem on the sphere. J Geodesy 73:436–447 Amirbekyan A, Michel V, Simons FJ (2008) Parameterizing surface-wave tomographic models with harmonic spherical splines. Geophys J Int 174(2):617–628. doi:10.1111/j.1365– 246X.2008.03809.x Aster RC, Borchers B, Thurber CH (2005) Parameter estimation and inverse problems. Volume 90 of International Geophysics Series. Elsevier Academic, San Diego Beggan CD, Saarimäki J, Whaler KA, Simons FJ (2013) Spectral and spatial decomposition of lithospheric magnetic field models using spherical Slepian functions. Geophys J Int 193: 136–148. doi:10.1093/gji/ggs122. Bendat JS, Piersol AG (2000) Random data: Analysis and measurement procedures, 3rd edn. Wiley, New York Blanco MA, Flórez M, Bermejo M (1997) Evaluation of the rotation matrices in the basis of real spherical harmonics. J Mol Struct (Theochem) 419:19–27 Bond JR, Jaffe AH, Knox L (1998) Estimating the power spectrum of the cosmic microwave background. Phys Rev D 57(4):2117–2137 Bronez TP (1988) Spectral estimation of irregularly sampled multidimensional processes by generalized prolate spheroidal sequences. IEEE Trans Acoust Speech Signal Process 36(12): 1862–1873 Chave AD, Thomson DJ, Ander ME (1987) On the robust estimation of power spectra, coherences, and transfer functions. J Geophys Res. 92(B1):633–648 Cohen L (1989) Time-frequency distributions – a review. Proc IEEE 77(7):941–981 Cox DR, Hinkley DV (1974) Theoretical statistics. Chapman and Hall, London Dahlen FA, Simons FJ (2008) Spectral estimation on a sphere in geophysics and cosmology. Geophys J Int 174:774–807. doi:10.1111/j.1365–246X.2008.03854.x Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton Daubechies I (1988) Time-frequency localization operators: a geometric phase space approach. 
IEEE Trans Inform Theory 34:605–612 Daubechies I (1990) The wavelet transform, time-frequency localization and signal analysis. IEEE Trans Inform Theory 36(5):961–1005 Daubechies I (1992) Ten lectures on wavelets. Volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial & Applied Mathematics, Philadelphia

Daubechies I, Paul T (1988) Time-frequency localisation operators – a geometric phase space approach: II. The use of dilations. Inverse Probl 4(3):661–680 de Villiers GD, Marchaud FBT, Pike ER (2003) Generalized Gaussian quadrature applied to an inverse problem in antenna theory: II. The two-dimensional case with circular symmetry. Inverse Probl 19:755–778 Donoho DL, Stark PB (1989) Uncertainty principles and signal recovery. SIAM J Appl Math 49(3):906–931 Edmonds AR (1996) Angular momentum in quantum mechanics. Princeton University Press, Princeton Eshagh M (2009) Spatially restricted integrals in gradiometric boundary value problems. Artif Sat 44(4):131–148. doi:10.2478/v10018–009–0025–4 Flandrin P (1998) Temps-Fréquence, 2nd edn. Hermès, Paris Freeden W (2010) Geomathematics: Its role, its aim, and its potential. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 1. Springer, Heidelberg, pp 3–42. doi:10.1007/978–3–642–01546–5_1 Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Berlin Freeden W, Schreiner M (2010) Special functions in mathematical geosciences: an attempt at a categorization. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 31. Springer, Heidelberg, pp 925–948. doi:10.1007/978–3–642–01546–5_31 Freeden W, Windheuser U (1997) Combined spherical harmonic and wavelet expansion – a future concept in Earth’s gravitational determination, Appl Comput Harmon Anal 4:1–37 Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere. Clarendon, Oxford Gerhards C (2011) Spherical decompositions in a global and local framework: theory and an application to geomagnetic modeling. Int J Geomath 1(2):205–256. doi:10.1007/s13137–010– 0011–9 Gilbert EN, Slepian D (1977) Doubly orthogonal concentrated polynomials. 
SIAM J Math Anal 8(2):290–319 Grafarend EW, Klapp M, Martinec Z (2010) Spacetime modeling of the Earth’s gravity field by ellipsoidal harmonics, In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 7. Springer, Heidelberg, pp 159–252. doi:10.1007/978–3–642–01546–5_7 Grishchuk LP, Martin J (1997) Best unbiased estimates for the microwave background anisotropies. Phys Rev D 56(4):1924–1938 Grünbaum FA (1981) Eigenvectors of a Toeplitz matrix: discrete version of the prolate spheroidal wave functions. SIAM J Alg Disc Methods 2(2):136–141 Han S-C, Simons FJ (2008) Spatiospectral localization of global geopotential fields from the Gravity Recovery and Climate Experiment (GRACE) reveals the coseismic gravity change owing to the 2004 Sumatra-Andaman earthquake. J Geophys Res. 113:B01405. doi:10.1029/2007JB004927 Hanssen A (1997) Multidimensional multitaper spectral estimation. Signal Process 58:327–332 Harig C, Simons FJ (2012) Mapping Greenland’s mass loss in space and time. Proc Natl Acad Soc 109(49):19934–19937 doi:10.1073/pnas.1206785109 Hauser MG, Peebles PJE (1973) Statistical analysis of catalogs of extragalactic objects. II. The Abell catalog of rich clusters. Astrophys J 185:757–785

Hesse K, Sloan IH, Womersley RS (2010) Numerical integration on the sphere. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 40. Springer, Heidelberg, pp 1187–1219. doi:10.1007/978–3–642–01546–5_40 Hinshaw G, Spergel DN, Verde L, Hill RS, Meyer SS, Barnes C, Bennett CL, Halpern M, Jarosik N, Kogut A, Komatsu E, Limon M, Page L, Tucker GS, Weiland JL, Wollack E, Wright EL (2003) First-year Wilkinson Microwave Anisotropy Probe (WMAP) observations: The angular power spectrum. Astrophys J Suppl Ser 148:135–159 Hivon E, Górski KM, Netterfield CB, Crill BP, Prunet S, Hansen F (2002) MASTER of the cosmic microwave background anisotropy power spectrum: a fast method for statistical analysis of large and complex cosmic microwave background data sets. Astrophys J. 567:2–17 Ilk KH (1983) Ein Beitrag zur Dynamik ausgedehnter Körper: Gravitationswechselwirkung. Deutsche Geodätische Kommission C (288) Jahn K, Bokor N (2012) Vector Slepian basis functions with optimal energy concentration in high numerical aperture focusing. Opt Commun 285:2028–2038. doi:10.1016/j.optcom.2011.11.107 Jahn K, Bokor N (2013) Solving the inverse problem of high numerical aperture focusing using vector Slepian harmonics and vector Slepian multipole fields. Opt Commun 288:13–16. doi:10.1016/j.optcom.2012.09.051 Jahn K, Bokor N (2014) Revisiting the concentration problem of vector fields within a spherical cap: A commuting differential operator solution. J Fourier Anal Appl 20:421–451. doi:10.1007/s00041-014-9324-7 Jones RH (1963) Stochastic processes on a sphere. Ann Math Stat 34(1):213–218 Kaula WM (1967) Theory of statistical analysis of data distributed over a sphere. Rev Geophys 5(1):83–107 Kennedy RA, Sadeghi P (2013) Hilbert space methods in signal processing. Cambridge University Press, Cambridge Knox L (1995) Determination of inflationary observables by cosmic microwave background anisotropy experiments. 
Phys Rev D 52(8):4307–4318 Landau HJ (1965) On the eigenvalue behavior of certain convolution equations. Trans Am Math Soc 115:242–256 Landau HJ, Pollak HO (1961) Prolate spheroidal wave functions, Fourier analysis and uncertainty – II. Bell Syst Tech J 40(1):65–84 Landau HJ, Pollak HO (1962) Prolate spheroidal wave functions, Fourier analysis and uncertainty – III: the dimension of the space of essentially time- and band-limited signals. Bell Syst Tech J 41(4):1295–1336 Lewis KW, Simons FJ (2012) Local spectral variability and the origin of the Martian crustal magnetic field. Geophys Res Lett 39:L18201. doi:10.1029/2012GL052708 Mallat S (1998) A wavelet tour of signal processing. Academic, San Diego Maniar H, Mitra PP (2005) The concentration problem for vector fields. Int J Bioelectromag 7(1):142–145 Martinec Z (2010) The forward and adjoint methods of global electromagnetic induction for CHAMP magnetic data. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 19. Springer, Heidelberg, pp 565–624. doi:10.1007/978–3–642–01546–5_19 Menke W (1989) Geophysical data analysis: discrete inverse theory. Volume 45 of International Geophysics Series, Rev. edn. Academic, San Diego Messiah A (2000) Quantum mechanics. Dover, New York

Michel V (2010) Tomography: problems and multiscale solutions. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 32. Springer, Heidelberg, pp 949–972. doi:10.1007/978–3–642–01546–5_32 Mitra PP, Maniar H (2006) Concentration maximization and local basis expansions (LBEX) for linear inverse problems. IEEE Trans Biomed Eng 53(9):1775–1782 Mullis CT, Scharf LL (1991) Quadratic estimators of the power spectrum. In: Haykin S (eds) Advances in spectrum analysis and array processing, vol 1, chap 1. Prentice-Hall, Englewood Cliffs, pp 1–57 Nashed MZ, Walter GG (1991) General sampling theorems for functions in reproducing kernel Hilbert spaces. Math Control Signals Syst 4:363–390 Oh SP, Spergel DN, Hinshaw G (1999) An efficient technique to determine the power spectrum from cosmic microwave background sky maps. Astrophys J 510:551–563 Olsen N, Hulot G, Sabaka TJ (2010) Sources of the geomagnetic field and the modern data that enable their investigation. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 5. Springer, Heidelberg, pp 105–124. doi:10.1007/978–3–642–01546– 5_5 Paul MK (1978) Recurrence relations for integrals of associated Legendre functions. Bull Géod. 52:177–190 Peebles PJE (1973) Statistical analysis of catalogs of extragalactic objects. I. Theory. Astrophys J 185:413–440 Percival DB, Walden AT (1993) Spectral analysis for physical applications, multitaper and conventional univariate techniques. Cambridge University Press, New York Plattner A, Simons FJ, Wei L (2012) Analysis of real vector fields on the sphere using Slepian functions. In: 2012 IEEE Statistical Signal Processing Workshop (SSP’12). Ann Arbor, Mich Plattner A, Simons FJ (2013) A spatiospectral localization approach for analyzing and representing vector-valued functions on spherical surfaces. In: Van de Ville D, Goyal VK, Papadakis M (eds) Wavelets and Sparsity XV, vol 8858. SPIE, pp 88580N. doi: 10.1117/12.2024703. 
Plattner A, Simons FJ (2014) Spatiospectral concentration of vector fields on a sphere. Appl Comput Harmon Anal 36:1–22. doi:10.1016/j.acha.2012.12.001 Riedel KS, Sidorenko A (1995) Minimum bias multiple taper spectral estimation. IEEE Trans Signal Process 43(1):188–195 Sabaka TJ, Hulot G, Olsen N (2010) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 17. Springer, Heidelberg, pp 503–538. doi:10.1007/978–3–642–01546–5_17 Schuster A (1898) An investigation of hidden periodicities with application to a supposed 26-day period of meteorological phenomena. Terr Mag 3:13–41 Sheppard CJR, Török P (1997) Efficient calculation of electromagnetic diffraction in optical systems using a multipole expansion. J Mod Opt 44(4):803–818. doi:10.1080/09500349708230696 Shkolnisky Y (2007) Prolate spheroidal wave functions on a disc – integration and approximation of two-dimensional bandlimited functions. Appl Comput Harmon Anal 22:235–256. doi:10.1016/j.acha.2006.07.002 Simons FJ (2010) Slepian functions and their use in signal estimation and spectral analysis. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 30. Springer, Heidelberg, pp 891–923. doi:10.1007/978–3–642–01546–5_30 Simons FJ, Dahlen FA (2006) Spherical Slepian functions and the polar gap in geodesy. Geophys J Int 166:1039–1061. doi:10.1111/j.1365–246X.2006.03065.x

Simons FJ, Dahlen FA (2007) A spatiospectral localization approach to estimating potential fields on the surface of a sphere from noisy, incomplete data taken at satellite altitudes. In: Van de Ville D, Goyal VK, Papadakis M (eds) Wavelets XII, vol 6701. SPIE, pp 670117. doi:10.1117/12.732406 Simons FJ, Wang DV (2011) Spatiospectral concentration in the Cartesian plane. Int J Geomath 2(1):1–36. doi:10.1007/s13137–011–0016–z Simons FJ, Zuber MT, Korenaga J (2000) Isostatic response of the Australian lithosphere: estimation of effective elastic thickness and anisotropy using multitaper spectral analysis. J Geophys Res 105(B8):19163–19184. doi:10.1029/2000JB900157 Simons FJ, Dahlen FA, Wieczorek MA (2006) Spatiospectral concentration on a sphere. SIAM Rev 48(3):504–536. doi:10.1137/S0036144504445765 Simons FJ, Hawthorne JC, Beggan CD (2009) Efficient analysis and representation of geophysical processes using localized spherical basis functions. In: Goyal VK, Papadakis M, Van de Ville D (eds) Wavelets XIII, vol 7446. SPIE, pp 74460G. doi:10.1117/12.825730 Slepian D (1964) Prolate spheroidal wave functions, Fourier analysis and uncertainty – IV: extensions to many dimensions; generalized prolate spheroidal functions. Bell Syst Tech J 43(6):3009–3057 Slepian D (1976) On bandwidth. Proc IEEE 64(3):292–300 Slepian D (1978) Prolate spheroidal wave functions, Fourier analysis and uncertainty – V: the discrete case. Bell Syst Tech J 57:1371–1429 Slepian D (1983) Some comments on Fourier analysis, uncertainty and modeling. SIAM Rev 25(3):379–393 Slepian D, Pollak HO (1961) Prolate spheroidal wave functions, Fourier analysis and uncertainty – I. Bell Syst Tech J 40(1):43–63 Slepian D, Sonnenblick E (1965) Eigenvalues associated with prolate spheroidal wave functions of zero order. Bell Syst Tech J 44(8):1745–1759 Slobbe DC, Simons FJ, Klees R (2012) The spherical Slepian basis as a means to obtain spectral consistency between mean sea level and the geoid. 
J Geodesy 86(8):609–628. doi:10.1007/s00190–012–0543–x Sneeuw N, van Gelderen M (1997) The polar gap. In: Sansò F, Rummel R (eds) Geodetic boundary value problems in view of the one centimeter geoid. Volume 65 of Lecture notes in Earth sciences. Springer, Berlin, pp 559–568 Tegmark M (1997) How to measure CMB power spectra without losing information. Phys Rev D 55(10):5895–5907 Tegmark M, Taylor AN, Heavens AF (1997) Karhunen-Loève eigenvalue problems in cosmology: How should we tackle large data sets? Astrophys J 480(1):22–35 Thomson DJ (1982) Spectrum estimation and harmonic analysis. Proc IEEE 70(9):1055–1096 Thomson DJ (2007) Jackknifing multitaper spectrum estimates. IEEE Signal Process Mag 20:20– 30. doi:0.1109/MSP.2007.4286561 Thomson DJ, Chave AD (1991) Jackknifed error estimates for spectra, coherences, and transfer functions, In: Haykin S (eds) Advances in spectrum analysis and array processing, vol 1, chap 2. Prentice-Hall, Englewood Cliffs, pp 58–113 Tricomi FG (1970) Integral equations, 5th edn. Interscience, New York Varshalovich DA, Moskalev AN, Kherso´nskii VK (1988) Quantum theory of angular momentum. World Scientific, Singapore

Wieczorek MA (2008) Constraints on the composition of the martian south polar cap from gravity and topography. Icarus 196(2):506–517. doi:10.1016/j.icarus.2007.10.026 Wieczorek MA, Simons FJ (2005) Localized spectral analysis on the sphere. Geophys J Int 162(3):655–675. doi:10.1111/j.1365–246X.2005.02687.x Wieczorek MA, Simons FJ (2007) Minimum-variance spectral analysis on the sphere. J Fourier Anal Appl 13(6):665–692. doi:10.1007/s00041–006–6904–1 Wingham DJ (1992) The reconstruction of a band-limited function and its Fourier transform from a finite number of samples at arbitrary locations by singular value decomposition. IEEE Trans Signal Process 40(3):559–570. doi:10.1109/78.120799 Xu P (1992) Determination of surface gravity anomalies using gradiometric observables. Geophys J Int 110:321–332 Xu P (1998) Truncated SVD methods for discrete linear ill-posed problems. Geophys J Int 135(2):505–514 doi:10.1046/j.1365–246X.1998.00652.x Yao K (1967) Application of reproducing kernel Hilbert spaces – Bandlimited signal models. Inform Control 11(4):429–444

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_31-3 © Springer-Verlag Berlin Heidelberg 2015

Special Functions in Mathematical Geosciences: An Attempt at a Categorization
Willi Freeden (a) and Michael Schreiner (b)
(a) Geomathematics Group, University of Kaiserslautern, Rhineland-Palatinate, Germany
(b) Institute for Computational Engineering, University of Buchs, Buchs, Switzerland

Abstract This chapter reports on the current activities and recent progress in the field of special functions of mathematical geosciences. The chapter focuses on two major topics of interest, namely, trial systems of polynomial (i.e., spherical harmonics) and polynomially based (i.e., zonal kernel) type. A fundamental tool is an uncertainty principle, which gives appropriate bounds for both the qualification and quantification of space and frequency (momentum) localization of the special (kernel) function under consideration. The essential outcome is a better understanding of constructive approximation in terms of zonal kernel functions such as splines and wavelets.

1 Introduction
Today’s geosciences profit greatly from highly advanced electronic measurement concepts, modern computer technology, and, most of all, artificial satellites. In fact, the exceptional situation of getting simultaneous and complementary observations from multiple low-orbiting satellites opens new opportunities to contribute significantly to the understanding of our planet, its climate, and its environment, and to address an expected shortage of natural resources. All of a sudden, key parameters for the study of the dynamics of our planet and the interaction of its solid part with ice, oceans, atmosphere, etc. become accessible. In this context, new types of observations and data measured on (almost) spherical reference surfaces such as the (spherical) Earth or (near-)circular orbits are very likely to be the greatest challenge. This is the reason why adequate components of mathematical thinking, adapted formulations of theories and models, and economical and efficient numerical developments are indispensable (also within the spherical framework). Up to now, the modeling of geodata has mainly been done on a global scale by orthogonal expansions in terms of polynomial structures such as (certain types of) spherical harmonics. So far, however, there are many aspects in which they cannot keep pace with the prospects and the expectations of the “Earth system sciences.” Moreover, there is an increasing need for high-precision modeling on local areas. In this respect, zonal kernel functions, i.e., in the jargon of constructive approximation, radial basis functions, become more and more important because of their space-localizing properties (even in the vectorial and tensorial context). It is known that the addition theorem of the theory of spherical harmonics enables us to express all types of zonal kernel functions in terms of a one-dimensional function, namely, the Legendre polynomial. Weighted additive clustering of Legendre polynomials generates specific classes of space-localizing zonal kernel functions, i.e., Legendre series expansions, ready for approximation within scalar, vectorial, and tensorial context. Furthermore, our investigations will demonstrate that the closer the Legendre series expansion is to the Dirac kernel, the more localized the zonal kernel is in space, and the more economical is its role in (spatial) local computation. In addition, the Funk-Hecke formula provides the natural tool for establishing convolutions of spherical fields against zonal kernels. In consequence, by specifying Dirac families, i.e., sequences of zonal functions tending to the Dirac kernel (space-localized), filtered versions of (square-integrable) spherical fields are obtainable by convolution, leading to “zooming-in” approximation within a multiscale procedure. Altogether, the Legendre polynomial is the essential keystone in the work about special functions in mathematical geosciences. It enables the transition from spherical harmonics via zonal kernels up to the Dirac kernel. In addition, the Funk-Hecke formula and its consequences in spherical convolutions open new methodological perspectives for global, as well as local, approximation in scalar, vectorial, as well as tensorial (physically motivated) applications.

2 Special Function Systems
From modern satellite positioning, it is well known that the Earth’s surface deviates from a sphere by less than 0.4 % of its radius. Thus, spherical functions and concepts play an essential part in all geosciences. In particular, in accordance with the Weierstraß theorem, spherical polynomials, i.e., spherical harmonics, constitute fundamental ingredients of modern geomathematical research wherever spherical fields are significant, be they electromagnetic, gravitational, hydrodynamical, etc.

2.1 Spherical Harmonics
Spherical harmonics are the analogues of trigonometric functions for Fourier expansion theory on the sphere. They were introduced in the 1780s to study gravitational theory (cf. de Laplace 1785; Legendre 1785). Early publications on the theory of spherical harmonics in their original physically motivated meaning as multipoles are, e.g., due to Clebsch (1861), Sylvester (1876), Heine (1878), Neumann (1887), and Maxwell (1891). Today, the use of spherical harmonics in diverse procedures is a well-established technique in all geosciences, particularly for the purpose of representing scalar potentials. A great incentive came from the fact that global geomagnetic data became available in the first half of the nineteenth century (cf. Gauß 1838). Nowadays, reference models for the Earth’s gravitational or magnetic field, e.g., are widely known by tables of coefficients of the spherical harmonic Fourier expansion of their potentials. It is characteristic for the Fourier approach that each spherical harmonic, as an “ansatz-function” of polynomial nature, corresponds to one degree, i.e., in the jargon of signal processing, to exactly one frequency. Thus, orthogonal (Fourier) expansion in terms of spherical harmonics amounts to the superposition of summands showing an oscillating character determined by the degree (frequency) of the Legendre polynomial (see Table 1). The more spherical harmonics of different degrees are involved in the Fourier (orthogonal) expansion of a signal, the more the oscillations grow in number, and the smaller the amplitudes become.

Table 1 Fourier expansion of scalar square-integrable functions on the unit sphere $\Omega$
Weierstraß approximation theorem (use of homogeneous polynomials); (geo)physical constraint of harmonicity → spherical harmonics $Y_{n,j}$ as restrictions of homogeneous harmonic polynomials $H_{n,j}$ to the unit sphere $\Omega$: $Y_{n,j} = H_{n,j}|_{\Omega}$
Orthonormality and orthogonal invariance; addition theorem → one-dimensional Legendre polynomial $P_n$ satisfying $P_n(\xi \cdot \eta) = \frac{4\pi}{2n+1} \sum_{j=1}^{2n+1} Y_{n,j}(\xi)\, Y_{n,j}(\eta)$
Convolution against the Legendre kernel; Funk–Hecke formula → Legendre transform of $F$
Superposition over frequencies; orthogonal (Fourier) series expansion → Fourier series of $F$

Geosciences are much concerned with the space $L^2(\Omega)$ of square-integrable functions on the unit sphere $\Omega$. The quantity $\|F\|_{L^2(\Omega)}$ is called the energy of the “signal” $F \in L^2(\Omega)$. The representation of a signal $F \in L^2(\Omega)$ with finite energy in terms of a countable Hilbert basis is one of the most interesting and important problems in geoscience. In fact, spherical harmonics form a Hilbert basis in this space $L^2(\Omega)$. Appropriate systems of spherical harmonics $\{Y_{n,k}\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$ are usually defined as the restrictions of homogeneous harmonic polynomials to the sphere. The polynomial structure has tremendous advantages. First of all, spherical harmonics of different degrees are orthogonal. In addition, the space $\mathrm{Harm}_n$ of spherical harmonics of degree $n$ is finite-dimensional: $\dim(\mathrm{Harm}_n) = 2n+1$. Therefore, the basis property of $\{Y_{n,k}\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$ is equivalently characterized by the completion of the direct sum $\bigoplus_{n=0}^{\infty} \mathrm{Harm}_n$, i.e.:

$$ L^2(\Omega) = \overline{\bigoplus_{n=0}^{\infty} \mathrm{Harm}_n}^{\,\|\cdot\|_{L^2(\Omega)}}. \tag{1} $$

This is the canonical reason why spherical harmonic expansions (i.e., multipole expansions) are the classical approaches to geopotentials. More explicitly, any “signal” $F \in L^2(\Omega)$ can be split into “orthogonal contributions” involving the Fourier transforms $F^\wedge(n,k)$ defined by

$$ F^\wedge(n,k) = \int_{\Omega} F(\xi)\, Y_{n,k}(\xi)\, d\omega(\xi) \tag{2} $$

(in terms of an $L^2(\Omega)$-orthonormal system of spherical harmonics $\{Y_{n,k}\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$). From Parseval’s identity, we get the orthogonal decomposition into frequency-based energy

$$ \|F\|^2_{L^2(\Omega)} = (F,F)_{L^2(\Omega)} = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \left(F^\wedge(n,k)\right)^2. $$
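As a small numerical illustration of the Fourier transform (2) and of Parseval’s identity, the following sketch uses a hypothetical zonal test signal $F(\xi) = 1 + t + t^2$ with $t = \xi \cdot \varepsilon^3$ (not taken from the text). For such a signal only the zonal harmonics $\sqrt{(2n+1)/(4\pi)}\,P_n(t)$ contribute, so the surface integral reduces to a one-dimensional integral over $t \in [-1,1]$. The quadrature rule and all function names are assumptions made for illustration.

```python
import math

def legendre(n, t):
    """Legendre polynomial P_n(t) via the three-term recurrence."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def simpson(func, a, b, panels=2000):
    """Composite Simpson rule (panels must be even)."""
    step = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * func(a + i * step)
    return total * step / 3.0

def zonal_harmonic(n, t):
    """L2-orthonormal zonal spherical harmonic of degree n, with t = cos(polar angle)."""
    return math.sqrt((2 * n + 1) / (4.0 * math.pi)) * legendre(n, t)

def signal(t):
    """Hypothetical zonal test signal F = 1 + t + t^2 (an assumption for this sketch)."""
    return 1.0 + t + t * t

def fourier_coefficient(n):
    # For zonal F, the surface integral (2) reduces to 2*pi times an integral over t.
    return 2.0 * math.pi * simpson(lambda t: signal(t) * zonal_harmonic(n, t), -1.0, 1.0)

coefficients = [fourier_coefficient(n) for n in range(6)]

# Energy computed in the space domain and in the frequency domain (Parseval).
energy_space = 2.0 * math.pi * simpson(lambda t: signal(t) ** 2, -1.0, 1.0)
energy_spectral = sum(c * c for c in coefficients)

print("coefficients:", coefficients)
print("energy (space):   ", energy_space)
print("energy (spectral):", energy_spectral)
```

Because the test signal is a polynomial of degree 2 in $t$, its coefficients of degree 3 and higher vanish up to quadrature error, so it is also a simple example of a bandlimited function.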

Parseval’s identity explains why (globally oriented) geosciences work much more with the “amplitude spectrum” $\{F^\wedge(n,k)\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$ than with the “original signal” $F \in L^2(\Omega)$. The “inverse Fourier transform”

$$ F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^\wedge(n,k)\, Y_{n,k} \tag{3} $$

allows the geoscientist to think of the function $F$ as a sum of weighted “wave functions” $Y_{n,k}$ of different frequencies. One can think of measurements as operating on an “input signal” $F$ to produce an output signal $G = \Lambda F$, where $\Lambda$ is an operator acting on $L^2(\Omega)$. Fortunately, it is the case that large portions of interest can be well approximated by linear, rotation-invariant pseudodifferential operators (see, e.g., Svensson 1983; Freeden et al. 1998). If $\Lambda$ is such an operator on $L^2(\Omega)$, this means that $\Lambda Y_{n,k} = \Lambda^\wedge(n)\, Y_{n,k}$, $n = 0,1,\ldots$; $k = 1,\ldots,2n+1$, where the so-called symbol $\{\Lambda^\wedge(n)\}_{n \in \mathbb{N}_0}$ is a sequence of real values (independent of $k$). Thus, we have the fundamental fact that the spherical harmonics are the eigenfunctions of the operator $\Lambda$. Different pseudodifferential operators $\Lambda$ are characterized by their eigenvalues $\Lambda^\wedge(n)$. All eigenvalues $\{\Lambda^\wedge(n)\}_{n \in \mathbb{N}_0}$ are collected in the so-called symbol of $\Lambda$. The “amplitude spectrum” $\{G^\wedge(n,k)\}$ of the response of $\Lambda$ is obtained from the amplitude spectrum of the input function (signal) by a simple multiplication by the “transfer” $\Lambda^\wedge(n)$, i.e., $G^\wedge(n,k) = \Lambda^\wedge(n)\, F^\wedge(n,k)$. Physical devices do not transmit spherical harmonics of arbitrarily high frequency without severe attenuation. The “transfer” $\Lambda^\wedge(n)$ usually tends to zero with increasing $n$. It follows that the amplitude spectra of the responses (observations) to functions (signals) of finite energy are also negligibly small beyond some finite frequency. Thus, both because of the frequency-limiting nature of the devices used and because of the nature of the “transmitted signals,” the geoscientist is soon led to consider bandlimited functions. These are the functions $F \in L^2(\Omega)$ whose “amplitude spectra” vanish for all $n > N$ ($N \in \mathbb{N}$ fixed). In other words, each bandlimited function $F \in L^2(\Omega)$ can be written as a finite Fourier series. So, any function $F$ of the form $F = \sum_{n=0}^{N} \sum_{k=1}^{2n+1} F^\wedge(n,k)\, Y_{n,k}$ is said to be bandlimited with the band $N$. In analogous manner, $F \in L^2(\Omega)$ is said to be locally supported (spacelimited) with space width $\rho$ around an axis $\eta \in \Omega$, if for some $\rho \in (-1,1)$, the function $F$ vanishes on the set of all $\xi \in \Omega$ with $-1 \le \xi \cdot \eta \le \rho$. Bandlimited functions are infinitely often differentiable everywhere. Moreover, it is clear that any bandlimited function $F$ is an analytic function. From the analyticity, it follows immediately that a nontrivial bandlimited function cannot vanish on any (nondegenerate) subset of $\Omega$. The only function that is both bandlimited and spacelimited is the trivial function. Now, in addition to bandlimited but non-spacelimited functions, numerical analysis would like to deal with spacelimited functions. But as we have seen, such a function (signal) of finite (space) support cannot be bandlimited; it must contain spherical harmonics of arbitrarily large frequencies. Thus, there is a dilemma of seeking functions that are somehow concentrated in both space and (angular) momentum domain. There is a way of mathematically expressing the impossibility of simultaneous confinement of a function to space and (angular) momentum, namely, the uncertainty principle.
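The action of a rotation-invariant operator on the amplitude spectrum can be sketched in a few lines. The example below is only an illustration under stated assumptions: the input spectrum $F^\wedge(n,k) = 1/(n+1)$ and the symbol $\Lambda^\wedge(n) = q^n$ with $q = 0.8$ are hypothetical choices, not taken from the text. It shows that the response is obtained by degree-wise multiplication and that, by Parseval’s identity, almost all of its energy sits below a finite band.

```python
def make_spectrum(max_degree):
    """Hypothetical amplitude spectrum F^(n,k) = 1/(n+1) for k = 1, ..., 2n+1."""
    return {(n, k): 1.0 / (n + 1)
            for n in range(max_degree + 1)
            for k in range(1, 2 * n + 2)}

def transfer(n, q=0.8):
    """Assumed symbol Lambda^(n) = q**n, which tends to zero as n grows."""
    return q ** n

def apply_operator(spectrum, symbol):
    """Multiply every coefficient of degree n by the transfer value symbol(n)."""
    return {(n, k): symbol(n) * c for (n, k), c in spectrum.items()}

def bandlimit(spectrum, band):
    """Keep only the coefficients of degree n <= band."""
    return {(n, k): c for (n, k), c in spectrum.items() if n <= band}

def energy(spectrum):
    """Squared L2 norm via Parseval's identity: sum of squared coefficients."""
    return sum(c * c for c in spectrum.values())

signal = make_spectrum(max_degree=60)
response = apply_operator(signal, transfer)

# Fraction of the response energy carried by degrees above 30.
lost_fraction = 1.0 - energy(bandlimit(response, 30)) / energy(response)
print("number of coefficients:", len(signal))
print("energy fraction above degree 30:", lost_fraction)
```

In this sketch the response can be replaced by its bandlimited part of band 30 at a relative energy cost below one part in a million, which mirrors the argument that attenuating devices lead naturally to bandlimited models.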

2.2 Transition to Zonal Kernel Functions
To understand the transition from the theory of spherical harmonics to zonal kernel functions up to the Dirac kernel, we have to realize the relative advantages of the classical Fourier expansion method by means of spherical harmonics not only in the frequency domain but also in the space domain. Obviously, it is characteristic for Fourier techniques that the spherical harmonics as polynomial trial functions admit no localization in the space domain, while in the frequency domain (more precisely, momentum domain), they always correspond to exactly one degree, i.e., frequency, and, therefore, are said to show ideal frequency localization. Because of the ideal frequency localization and the simultaneous absence of space localization, local changes of fields (signals) in the space domain affect the whole table of orthogonal (Fourier) coefficients. This, in turn, causes global changes in the corresponding (truncated) Fourier series in the space domain. Nevertheless, the ideal frequency localization usually proves to be helpful for meaningful physical interpretations relating, for a fixed frequency, the different observables of a geopotential to each other. Taking these aspects of spherical harmonic modeling by Fourier series into account, trial functions that simultaneously show ideal frequency localization as well as ideal space localization would be a desirable choice. In fact, such an ideal system of trial functions would admit models of highest spatial resolution expressible in terms of single frequencies. However, as we will see, the uncertainty principle, which connects space and frequency localization, tells us that both characteristics are mutually exclusive. In conclusion, Fourier expansion methods are well suited to resolve low- and medium-frequency phenomena, i.e., the “trend” of a signal, while their application to obtain high resolution in global or local models is critical. This difficulty is also well known in theoretical physics, e.g., when describing monochromatic electromagnetic waves or considering the quantum-mechanical treatment of free particles. In this case, plane waves with fixed frequencies (ideal frequency localization, but no space localization) are the solutions of the corresponding differential equations, but they certainly do not reflect the physical reality. As a remedy, plane waves of different frequencies are superposed to form so-called wave packets, which gain a certain amount of space localization while losing their ideal spectral localization. In a similar way, a suitable superposition of polynomial functions leads to the so-called zonal kernel functions, in particular to kernel functions with a reduced frequency localization but an increased space localization.

More concretely, any kernel function $K: \Omega \times \Omega \to \mathbb{R}$ that is characterized by the property $K(\xi,\eta) = K(|\xi - \eta|)$, $\xi, \eta \in \Omega$, is called a (spherical) radial basis function (at least in the theory of constructive approximation). In other words, a radial basis function is a real-valued kernel function whose values depend only on the Euclidean distance $|\xi - \eta|$ of two unit vectors $\xi, \eta$. A well-known fact is that the distance of two unit vectors is expressible in terms of their inner product: $|\xi - \eta|^2 = |\xi|^2 + |\eta|^2 - 2\,\xi \cdot \eta = 2(1 - \xi \cdot \eta)$, $\xi, \eta \in \Omega$. Consequently, any radial basis function is equivalently characterized by the property of being dependent only on the inner product $\xi \cdot \eta$ of the unit vectors $\xi, \eta \in \Omega$, i.e.,

$$ K(\xi,\eta) = K(|\xi - \eta|) = \hat{K}(\xi \cdot \eta), \qquad \xi, \eta \in \Omega. \tag{4} $$

In the theory of special functions of mathematical physics, however, a kernel K W    ! R O  / D K.t O satisfying K.  t/;  2  for all orthogonal transformation t is known as a zonal kernel function. In order to point out the reducibility of KO to a function defined on the interval O  /; .; / 2    is used throughout this Œ1; 1, the notation of a zonal kernel .; / 7! K. chapter. From the theory of spherical harmonics, we get a representation of any L2 ./-zonal kernel function K in terms of a Legendre series Page 5 of 24


Fig. 1 Two examples of scalar (locally supported) zonal functions on the unit sphere 

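$$K(\xi\,\cdot) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, K^\wedge(n)\, P_n(\xi\,\cdot) \qquad (5)$$

(in $\|\cdot\|_{L^2(\Omega)}$-sense), where the sequence $\{K^\wedge(n)\}_{n\in\mathbb{N}_0}$ given by

$$K^\wedge(n) = 2\pi \int_{-1}^{1} K(t)\, P_n(t)\, dt \qquad (6)$$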

is called the Legendre symbol of the zonal kernel $K(\xi\,\cdot)$. In other words, additive clustering of weighted Legendre kernels generates zonal kernel functions. It is important to distinguish bandlimited kernels (i.e., $K^\wedge(n) = 0$ for all $n \ge N$) and non-bandlimited ones, for which infinitely many numbers $K^\wedge(n)$ do not vanish. Non-bandlimited kernels show a much stronger space localization than their bandlimited counterparts. Empirically, if $K^\wedge(n) \approx K^\wedge(n+1) \approx 1$ for many successive large integers $n$, then the support of the series (5) in the space domain is small (see Fig. 1), i.e., the kernel is spacelimited (in the jargon of approximation theory, "locally supported"). Assuming the condition $\lim_{n\to\infty} K^\wedge(n) = 0$, we are confronted with the situation that the slower the sequence $\{K^\wedge(n)\}_{n=0,1,\ldots}$ converges to zero, the lower is the frequency localization, and the higher is the space localization. Illustrated in a unified scheme (see Table 2), the formalisms for zonal kernel function theory lead to the following principles:

1. Weighted Legendre kernels are the constituting summands of zonal kernel functions.
2. The only zonal kernel that is both band- and spacelimited is the trivial kernel; the Legendre kernel is ideal in frequency localization; and the Dirac kernel is ideal in space localization.

Table 2 From Legendre kernels via zonal kernels to the Dirac kernel

Legendre kernels | Zonal kernels | Dirac kernel
 | General case, bandlimited, spacelimited |
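The pair of relations (5) and (6) can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes NumPy, uses Gauss-Legendre quadrature for (6), and takes as test case the Abel-Poisson kernel of Sect. 4.2, whose Legendre symbol is $h^n$. The function names and the value $h = 0.5$ are illustrative choices.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def kernel_from_symbol(symbol, t):
    """Evaluate the truncated Legendre series (5) at t in [-1, 1]."""
    n = np.arange(len(symbol))
    coeffs = (2 * n + 1) / (4 * np.pi) * np.asarray(symbol, dtype=float)
    return leg.legval(t, coeffs)


def symbol_from_kernel(kernel, n_max, quad_order=200):
    """Approximate K^(n) = 2*pi * int_{-1}^{1} K(t) P_n(t) dt for n = 0..n_max, cf. (6)."""
    nodes, weights = leg.leggauss(quad_order)
    p_at_nodes = leg.legvander(nodes, n_max)  # column n holds P_n(nodes)
    return 2 * np.pi * (weights * kernel(nodes)) @ p_at_nodes


def abel_poisson(t, h=0.5):
    """Closed form of the Abel-Poisson kernel; its symbol is h**n (illustrative h)."""
    return (1 - h**2) / (4 * np.pi * (1 + h**2 - 2 * h * t) ** 1.5)


# Symbol recovered by quadrature, and kernel value rebuilt from the known symbol.
symbol = symbol_from_kernel(abel_poisson, n_max=10)
series_value = kernel_from_symbol(0.5 ** np.arange(80), 0.3)
```

Under these assumptions, `symbol` should reproduce $h^n$ for $n = 0,\ldots,10$, and `series_value` should agree with the closed form at $t = 0.3$ up to rounding, because the series is truncated only after its terms have decayed below machine precision.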


2.3 Transition to the Vector and Tensor Context

In the second half of the last century, a physically motivated approach for the decomposition of spherical vector and tensor fields was presented based on a spherical variant of the Helmholtz theorem (see, e.g., Morse and Feshbach 1953; Backus 1966, 1967, 1986). Following this concept, e.g., the tangential part of a spherical vector field is split up into a curl-free and a divergence-free field by use of two differential operators, namely, the surface gradient and the surface curl gradient. Of course, an analogous splitting is valid in tensor theory. In subsequent publications during the second half of the last century, however, the vector spherical harmonic theory was usually written in local coordinate expressions that make mathematical formulations lengthy and hard to read. Tensor spherical harmonic settings are even more difficult to understand. In addition, when using local coordinates within a global spherical concept, differential geometry tells us that there is no representation of vector and tensor spherical harmonics which is free of singularities. In consequence, the mathematical arrangement involving vector and tensor spherical harmonics has so far led to an unnecessarily complex and less consistent literature. Coordinate-free explicit formulas on vector and/or tensor variants of the Legendre polynomial could not be found in the literature. As an immediate result, the orthogonal invariance based on specific vector/tensor extensions of the Legendre polynomials was not worked out suitably in a unifying scalar/vector/tensor framework. Even more, the concept of zonal (kernel) functions was not generalized adequately to the spherical vector/tensor case. All these new structures concerning spherical functions in mathematical (geo)physics were successfully developed in the monograph (Freeden and Schreiner 2009).
Basically two transitions are undertaken in that approach, namely, the transition from spherical harmonics via zonal kernel functions to the Dirac kernels on the one hand and the transition from scalar to vector and tensor theory on the other hand (see Table 3). To explain, the transition from the theory of scalar spherical harmonics to its vectorial and tensorial extensions (Freeden and Schreiner 2009) starts from physically motivated dual pairs of operators (the reference space being always the space of signals with finite energy, i.e., the space of square-integrable fields). The pair $o^{(i)}, O^{(i)}$, $i \in \{1,2,3\}$, is originated in the constituting ingredients of the Helmholtz decomposition of a vector field, while $o^{(i,k)}, O^{(i,k)}$, $i,k \in \{1,2,3\}$, take the analogous role for the Helmholtz decomposition of tensor fields. For example, in vector theory, $o^{(1)}F$ is assumed to be the normal field $\xi \mapsto o^{(1)}_\xi F(\xi) = \xi F(\xi)$, $\xi \in \Omega$, $o^{(2)}F$ is the surface gradient field $\xi \mapsto o^{(2)}_\xi F(\xi) = \nabla^*_\xi F(\xi)$, $\xi \in \Omega$, and $o^{(3)}F$ is the surface curl gradient field $\xi \mapsto o^{(3)}_\xi F(\xi) = L^*_\xi F(\xi)$, $\xi \in \Omega$, with $L^*_\xi = \xi \wedge \nabla^*_\xi$ applied to a scalar-valued function $F$, while $O^{(1)}f$ is the normal component $\xi \mapsto O^{(1)}_\xi f(\xi) = f(\xi)\cdot\xi$, $\xi \in \Omega$, $O^{(2)}f$ is the negative surface divergence $\xi \mapsto O^{(2)}_\xi f(\xi) = -\nabla^*_\xi \cdot f(\xi)$, $\xi \in \Omega$, and $O^{(3)}f$ is the negative surface curl $\xi \mapsto O^{(3)}_\xi f(\xi) = -L^*_\xi \cdot f(\xi)$, $\xi \in \Omega$, taken over a vector-valued function $f$.

Table 3 From scalar via vectorial to tensorial kernels

Scalar Legendre kernels | Vector Legendre kernels | Tensor Legendre kernels
Scalar zonal kernels | Vector zonal kernels | Tensor zonal kernels
Scalar Dirac kernel | Vector Dirac kernel | Tensor Dirac kernel

Clearly, the operators $o^{(i,k)}, O^{(i,k)}$ are also definable in orientation to the tensor Helmholtz decomposition theorem (for reasons of simplicity, however, their explicit description is omitted here). The pairs $o^{(i)}, O^{(i)}$ and $o^{(i,k)}, O^{(i,k)}$ of dual operators lead us to an associated palette of Legendre kernel functions, all of them generated by the classical one-dimensional Legendre polynomial $P_n$ of degree $n$. To be more concrete, three types of Legendre kernels occur in the vectorial as well as tensorial context (see Table 4). The Legendre kernels $o^{(i)}P_n$, $o^{(i)}o^{(i)}P_n$ are of concern for the vector approach to spherical harmonics, whereas $o^{(i,i)}P_n$, $o^{(i,k)}o^{(i,k)}P_n$, $i,k = 1,2,3$, form the analogues in tensorial theory. Corresponding to each Legendre kernel, we are led to two variants for representing square-integrable fields by orthogonal (Fourier) expansion, where the reconstruction, as in the scalar case, is undertaken by superposition over all frequencies. In a single unified notation, the formalisms for the vector/tensor spherical harmonic theory are based on the following principles (see Freeden and Schreiner 2009):

1. The vector/tensor spherical harmonics involving the $o^{(i)}$, $o^{(i,i)}$-operators, respectively, are obtainable as restrictions of three-dimensional homogeneous harmonic vector/tensor polynomials, respectively.
2. The vector/tensor Legendre kernels are obtainable as the outcome of sums extended over a maximal orthonormal system of vector/tensor spherical harmonics of degree (frequency) $n$, respectively.
3. The vector/tensor Legendre kernels are zonal kernel functions, i.e., they are orthogonally invariant (in vector/tensor sense, respectively) with respect to orthogonal transformations (leaving one point of the unit sphere $\Omega$ fixed).
4. Spherical harmonics of degree (frequency) $n$ form an irreducible subspace of the reference space of (square-integrable) fields on $\Omega$.
5. Each Legendre kernel implies an associated Funk-Hecke formula that determines the constituting features of the convolution (filtering) of a square-integrable field against the Legendre kernel.
6. The orthogonal Fourier expansion of a square-integrable field is the sum of the convolutions of the field against the Legendre kernels being extended over all frequencies.

Table 4 Legendre kernel functions

Vectorial context (the kernels are linked by application of $o^{(i)}$ and $O^{(i)}$):
Scalar Legendre polynomial: $P_n = \dfrac{O^{(i)}O^{(i)}\,p_n^{(i,i)}}{\mu_n^{(i)}}$
Vector Legendre kernel: $p_n^{(i)} = \dfrac{o^{(i)}P_n}{(\mu_n^{(i)})^{1/2}} = \dfrac{O^{(i)}\,p_n^{(i,i)}}{(\mu_n^{(i)})^{1/2}}$
Tensor Legendre kernel (order 2): $p_n^{(i,i)} = \dfrac{o^{(i)}\,p_n^{(i)}}{(\mu_n^{(i)})^{1/2}} = \dfrac{o^{(i)}o^{(i)}P_n}{\mu_n^{(i)}}$

Tensorial context (the kernels are linked by application of $o^{(i,k)}$ and $O^{(i,k)}$):
Scalar Legendre polynomial: $P_n = \dfrac{O^{(i,k)}O^{(i,k)}\,P_n^{(i,k,i,k)}}{\mu_n^{(i,k)}}$
Tensor Legendre kernel (order 2): $p_n^{(i,k)} = \dfrac{o^{(i,k)}P_n}{(\mu_n^{(i,k)})^{1/2}} = \dfrac{O^{(i,k)}\,P_n^{(i,k,i,k)}}{(\mu_n^{(i,k)})^{1/2}}$
Tensor Legendre kernel (order 4): $P_n^{(i,k,i,k)} = \dfrac{o^{(i,k)}\,p_n^{(i,k)}}{(\mu_n^{(i,k)})^{1/2}} = \dfrac{o^{(i,k)}o^{(i,k)}P_n}{\mu_n^{(i,k)}}$

To summarize, the theory of spherical harmonics enables us to provide a unifying attempt at consolidating, reviewing, and supplementing the different approaches in real scalar, vector, and tensor theory. The essential tools are the Legendre kernels which are explicitly available and tremendously significant in rotational invariance and orthogonal Fourier expansions. The coordinate-free setup yields a number of formulas and theorems that previously were derived only in coordinate representation (such as polar coordinates). Consequently, any kind of singularities is avoided at the poles. Finally, our transition from the scalar to the vectorial as well as the tensorial case opens new promising perspectives of constructing important, i.e., zonal classes of spherical trial functions by summing up Legendre kernel expressions, thereby providing (geo-)physical relevance and increasing local applicability.

3 The Uncertainty Principle as Key Issue for Classification As already known, four classes of zonal kernel functions can be distinguished, namely, bandlimited and non-bandlimited and spacelimited and non-spacelimited kernel functions. But the question is what is the right zonal kernel function of local nature for local purposes of approximation? Of course, the user of a mathematical method is interested in knowing the trial system which fits “adequately” to the problem. Actually it is necessary, in the case where several choices are possible or an optimal choice cannot be found, to choose the trial systems in close adaptation to the data width and the required smoothness of the field to be approximated. In this respect, an uncertainty principle specifying the space and frequency localization is helpful to serve as a decisive criterion. The essential outcome of the uncertainty principle is a better understanding of the classification of zonal kernel functions based on the development of suitable bounds for their quantification with respect to space and frequency localization.

3.1 Localization in Space

Suppose that $F$ is of class $L^2(\Omega)$. Assume first that $\|F\|_{L^2(\Omega)} = \left(\int_\Omega (F(\eta))^2\, d\omega(\eta)\right)^{1/2} = 1$. We associate to $F$ the normal (radial) field $\eta \mapsto \eta F(\eta) = o^{(1)}_\eta F(\eta)$, $\eta \in \Omega$. This function maps $L^2(\Omega)$ into the associated set of normal fields on $\Omega$. The "center of gravity of the spherical window" is defined by the expectation value in the space domain

$$g_F^{o^{(1)}} = \int_\Omega \left(o^{(1)}_\eta F(\eta)\right) F(\eta)\, d\omega(\eta) = \int_\Omega \eta\, (F(\eta))^2\, d\omega(\eta) \in \mathbb{R}^3, \qquad (7)$$

thereby interpreting $(F(\eta))^2\, d\omega(\eta)$ as surface mass distribution over the sphere $\Omega$ embedded in Euclidean space $\mathbb{R}^3$. It is clear that $g_F^{o^{(1)}}$ lies in the closed inner space $\overline{\Omega_{\mathrm{int}}}$ of $\Omega$: $|g_F^{o^{(1)}}| \le 1$. The variance in the space domain is understood in a canonical sense as the variance of the operator $o^{(1)}$:

$$\left(\sigma_F^{o^{(1)}}\right)^2 = \int_\Omega \left(\left(o^{(1)}_\eta - g_F^{o^{(1)}}\right) F(\eta)\right)^2 d\omega(\eta) = \int_\Omega \left(\eta - g_F^{o^{(1)}}\right)^2 (F(\eta))^2\, d\omega(\eta) \in \mathbb{R}. \qquad (8)$$

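Observing the identity $\left(\eta - g_F^{o^{(1)}}\right)^2 = 1 + \left(g_F^{o^{(1)}}\right)^2 - 2\,\eta\cdot g_F^{o^{(1)}}$, $\eta \in \Omega$, it follows immediately that $\left(\sigma_F^{o^{(1)}}\right)^2 = 1 - \left(g_F^{o^{(1)}}\right)^2$. Obviously, $0 \le \sigma_F^{o^{(1)}} \le 1$. Since we are particularly interested in bandlimited or non-bandlimited zonal (i.e., radial basis) functions on the sphere, some simplifications can be made. Let $K$ be of class $L^2[-1,1]$ and $\|K\|_{L^2[-1,1]} = 1$. Then the corresponding expectation value ("center of gravity") can be computed readily as follows ($\varepsilon^3 = (0,0,1)^T$):

$$g_{K(\cdot\,\varepsilon^3)}^{o^{(1)}} = \int_\Omega \eta\, \left(K(\eta\cdot\varepsilon^3)\right)^2 d\omega(\eta) = \left(2\pi \int_{-1}^{1} t\, (K(t))^2\, dt\right) \varepsilon^3.$$

Letting $t_K^{o^{(1)}} = \left|g_{K(\cdot\,\varepsilon^3)}^{o^{(1)}}\right| = 2\pi \left|\int_{-1}^{1} t\, (K(t))^2\, dt\right| \in \mathbb{R}$, we find for the corresponding variance

$$\left(\sigma_K^{o^{(1)}}\right)^2 = \int_\Omega \left(\eta - g_{K(\cdot\,\varepsilon^3)}^{o^{(1)}}\right)^2 \left(K(\eta\cdot\varepsilon^3)\right)^2 d\omega(\eta) = 1 - \left(g_{K(\cdot\,\varepsilon^3)}^{o^{(1)}}\right)^2 = 1 - \left(t_K^{o^{(1)}}\right)^2 \in \mathbb{R}. \qquad (9)$$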

Figure 2 gives a geometric interpretation of $g_F^{o^{(1)}}$ and $\sigma_F^{o^{(1)}}$. We associate to $g_F^{o^{(1)}}$, $g_F^{o^{(1)}} \neq 0$, and its projection $\eta_F^{o^{(1)}}$ onto the sphere $\Omega$ the spherical cap

$$C = \left\{\eta \in \Omega : 1 - \eta\cdot\eta_F^{o^{(1)}} \le 1 - \left|g_F^{o^{(1)}}\right|\right\}.$$

Fig. 2 Localization in a spherical cap

Then the boundary $\partial C$ is a circle with radius $\sigma_F^{o^{(1)}}$. If one thinks of a zonal function $F$ as a "window function" on $\Omega$, the window is determined by $C$, and its width is given by $\sigma_F^{o^{(1)}}$.
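The quantities $t_K^{o^{(1)}}$ and $\sigma_K^{o^{(1)}}$ for a zonal function reduce to one-dimensional integrals over $[-1,1]$, so they are easy to approximate. The sketch below is only an illustration under stated assumptions: it uses NumPy's Gauss-Legendre rule, normalizes the kernel internally so that $2\pi\int K^2\,dt = 1$, and the test kernel $K(t) = (1+t)^4$ is a hypothetical example chosen because its integrals can be checked by hand.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss


def space_localization(kernel, quad_order=400):
    """Return (t_K, sigma_K) for the zonal function eta -> K(eta . eps3).

    The kernel is normalized internally so that 2*pi * int K(t)^2 dt = 1.
    """
    nodes, weights = leggauss(quad_order)
    density = kernel(nodes) ** 2
    norm_sq = 2 * np.pi * np.sum(weights * density)
    # length of the "center of gravity" vector along eps3
    t_k = abs(2 * np.pi * np.sum(weights * nodes * density)) / norm_sq
    # space variance from (9): sigma^2 = 1 - t_K^2
    sigma = np.sqrt(1.0 - t_k**2)
    return t_k, sigma


# Hypothetical test kernel concentrated around the north pole.
t_k, sigma = space_localization(lambda t: (1 + t) ** 4)
```

For this test kernel the integrals are elementary and give $t_K = 0.8$ and $\sigma_K = 0.6$, so the spherical cap $C$ has a boundary circle of radius $0.6$.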

3.2 Localization in Frequency

Next the expectation value in the "frequency domain" is introduced to be the expectation value of the surface curl operator $o^{(3)}$ on $\Omega$. Then, for $F \in H_{2l}(\Omega)$, $l \in \mathbb{N}$, i.e., for all $F \in L^2(\Omega)$ such that there exists a function $G \in L^2(\Omega)$ with $G^\wedge(n,k) = (-n(n+1))^l F^\wedge(n,k)$ for all $n = 0,1,\ldots$; $k = 1,\ldots,2n+1$, we have

$$g_F^{o^{(3)}} = \int_\Omega \left(o^{(3)}_\eta F(\eta)\right) F(\eta)\, d\omega(\eta) = 0 \in \mathbb{R}^3.$$

Correspondingly, the variance in the "frequency domain" is given by

$$\left(\sigma_F^{o^{(3)}}\right)^2 = \int_\Omega \left(\left(o^{(3)}_\eta - g_F^{o^{(3)}}\right) F(\eta)\right)^2 d\omega(\eta) \in \mathbb{R}.$$

The surface theorem of Stokes shows us that

$$\left(\sigma_F^{o^{(3)}}\right)^2 = \int_\Omega \left(o^{(3)}_\eta F(\eta)\right)\cdot\left(o^{(3)}_\eta F(\eta)\right) d\omega(\eta) = \int_\Omega \left(-\Delta^*_\eta F(\eta)\right) F(\eta)\, d\omega(\eta).$$

Expressed in terms of spherical harmonics, we get via the Parseval identity

$$\left(\sigma_F^{o^{(3)}}\right)^2 = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} n(n+1)\left(F^\wedge(n,k)\right)^2.$$

Note that we require $\|F\|^2_{L^2(\Omega)} = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \left(F^\wedge(n,k)\right)^2 = 1$. The meaning of $\sigma_F^{o^{(3)}}$ as measure for "frequency localization" is as follows: The range of $\sigma_F^{o^{(3)}}$ is the interval $[0,\infty]$; a large value of $\sigma_F^{o^{(3)}}$ occurs if many Fourier coefficients contribute to $\sigma_F^{o^{(3)}}$. In conclusion, relating any spherical harmonic to a "single wavelength," a large value $\sigma_F^{o^{(3)}}$ tells us that $F$ is spread out widely in "frequency domain." In contrast to this statement, a small value of $\sigma_F^{o^{(3)}}$ indicates that only a few Fourier coefficients are significant (cf. Table 5). Again we formulate our quantities in the context of zonal functions. Let $K(\cdot\,\varepsilon^3)$ be of class $H_2(\Omega)$ satisfying $\|K(\cdot\,\varepsilon^3)\|_{L^2(\Omega)} = 1$, and then

$$\left(\sigma_{K(\cdot\,\varepsilon^3)}^{o^{(3)}}\right)^2 = -\int_\Omega K(\eta\cdot\varepsilon^3)\, \Delta^*_\eta K(\eta\cdot\varepsilon^3)\, d\omega(\eta) = -2\pi \int_{-1}^{1} K(t)\, L_t K(t)\, dt,$$

where $L_t$ denotes the Legendre operator as given by $L_t = \frac{d}{dt}(1-t^2)\frac{d}{dt}$.
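For a zonal kernel given by a finite Legendre symbol, the frequency variance can be computed in two ways: from the Parseval-type sum, which for a zonal function becomes $\sum_n \frac{2n+1}{4\pi}\, n(n+1)\, (K^\wedge(n))^2$, or from the Legendre operator integral after one integration by parts, $-2\pi\int K\, L_t K\, dt = 2\pi\int (1-t^2)(K'(t))^2\, dt$. The following sketch compares both under the assumption that NumPy is available; the symbol $0.6^n$, truncated at degree 11, is an illustrative choice, and both routines divide by the squared norm so that the kernel need not be normalized beforehand.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def frequency_variance_from_symbol(symbol):
    """Sum of n(n+1) weighted by the energy per degree, divided by the total energy."""
    s = np.asarray(symbol, dtype=float)
    n = np.arange(len(s))
    energy = (2 * n + 1) / (4 * np.pi) * s**2
    return np.sum(n * (n + 1) * energy) / np.sum(energy)


def frequency_variance_by_quadrature(symbol, quad_order=200):
    """Same quantity from int (1 - t^2) K'(t)^2 dt / int K(t)^2 dt (factor 2*pi cancels)."""
    s = np.asarray(symbol, dtype=float)
    n = np.arange(len(s))
    coeffs = (2 * n + 1) / (4 * np.pi) * s
    t, w = leg.leggauss(quad_order)
    k = leg.legval(t, coeffs)
    dk = leg.legval(t, leg.legder(coeffs))  # derivative of the Legendre series
    return np.sum(w * (1 - t**2) * dk**2) / np.sum(w * k**2)


symbol = 0.6 ** np.arange(12)
from_symbol = frequency_variance_from_symbol(symbol)
by_quadrature = frequency_variance_by_quadrature(symbol)

# A single Legendre kernel of degree 5 has frequency variance 5 * 6 = 30.
degree_five = np.zeros(6)
degree_five[5] = 1.0
single_degree = frequency_variance_from_symbol(degree_five)
```

Both routes should agree to rounding error, since the quadrature is exact for the polynomial integrands involved.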


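Table 5 Space/frequency localization: a comparison of the operators $o^{(1)}$ and $o^{(3)}$

Domain | Operator | Expectation value | Variance
Space | $o^{(1)}$ | $g_F^{o^{(1)}}$ | $\left(\sigma_F^{o^{(1)}}\right)^2$
Frequency | $o^{(3)}$ | $g_F^{o^{(3)}}$ | $\left(\sigma_F^{o^{(3)}}\right)^2$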

3.3 Uncertainty

The square roots of the variances, i.e., $\sigma^{o^{(1)}}$ and $\sigma^{o^{(3)}}$, are called the uncertainties in $o^{(1)}$ and $o^{(3)}$, respectively. For these quantities we get (see Narcowich and Ward 1996; Freeden and Schreiner 2009) the estimate $\left(\sigma_F^{o^{(1)}}\right)^2 \left(\sigma_F^{o^{(3)}}\right)^2 \ge \left|g_F^{o^{(1)}}\right|^2$. For details concerning the proof in the formulation of Theorem 1, the reader is referred to Freeden (1999). Summarizing our results we are led to the following theorem.

Theorem 1. Let $F \in H_2(\Omega)$ satisfy $\|F\|_{L^2(\Omega)} = 1$. Then

$$\left(\sigma_F^{o^{(1)}}\right)^2 \left(\sigma_F^{o^{(3)}}\right)^2 \ge \left|g_F^{o^{(1)}}\right|^2. \qquad (10)$$

If $g_F^{o^{(1)}}$ is nonvanishing, then

$$\Delta_F^{o^{(1)}}\, \Delta_F^{o^{(3)}} \ge 1, \qquad (11)$$

where we have used the abbreviations $\Delta_F^{o^{(1)}} = \sigma_F^{o^{(1)}} / \left|g_F^{o^{(1)}}\right|$ and $\Delta_F^{o^{(3)}} = \sigma_F^{o^{(3)}}$.

The uncertainty relation measures the trade-off between "space localization" and "frequency localization" ("spread in frequency"). It states that sharp localization in space and "frequency" is mutually exclusive. An immediate consequence of Theorem 1 is its reformulation for zonal functions $K(\cdot\,\varepsilon^3): \eta \mapsto K(\varepsilon^3\cdot\eta)$, $\eta \in \Omega$:

Corollary 2. Let $K(\cdot\,\varepsilon^3) \in H_2(\Omega)$ satisfy $\|K\|_{L^2[-1,1]} = 1$. If $t_K^{o^{(1)}}$ is nonvanishing, then

$$\Delta_K^{o^{(1)}}\, \Delta_K^{o^{(3)}} \ge 1,$$

where $\Delta_K^{o^{(1)}} = \sigma_K^{o^{(1)}} / t_K^{o^{(1)}}$ and $\Delta_K^{o^{(3)}} = \sigma_K^{o^{(3)}}$.

The interpretation of $\left(\sigma_K^{o^{(3)}}\right)^2$ as variance in "total angular momentum" helped us to prove Theorem 1. But this interpretation shows two essential drawbacks: First, the expectation value of the surface curl gradient is a vector which seems to be inadequate in "momentum localization" in terms of scalar spherical harmonics, and second, the value of $g_F^{o^{(3)}}$ vanishes for all candidates $F$. This means that the "center of gravitation of the spherical window" in "momentum domain" is independent of the function $F$ under consideration. Therefore, we are finally interested in the variance of the operator $-\Delta^*$,

$$\left(\sigma_F^{-\Delta^*}\right)^2 = \int_\Omega \left|\left(\left(-\Delta^*_\eta\right) - g_F^{-\Delta^*}\right) F(\eta)\right|^2 d\omega(\eta), \qquad (12)$$

which is a measure for the "spread in momentum." Now the corresponding expectation value $g_F^{-\Delta^*}$ is scalar valued and nonvanishing. It can be easily seen that

$$\left(\sigma_F^{-\Delta^*}\right)^2 = g_F^{(-\Delta^*)^2} - \left(g_F^{-\Delta^*}\right)^2. \qquad (13)$$

In connection with Theorem 1, this leads to the following result.

Theorem 2. Let $F$ be of class $H_4(\Omega)$ such that $\|F\|_{L^2(\Omega)} = 1$. Then

$$\left(\sigma_F^{o^{(1)}}\right)^2 \left(\sigma_F^{-\Delta^*}\right)^2 \ge \left|g_F^{o^{(1)}}\right|^2\, \frac{g_F^{(-\Delta^*)^2} - \left(g_F^{-\Delta^*}\right)^2}{g_F^{-\Delta^*}} \qquad (14)$$

provided that $g_F^{-\Delta^*} \neq 0$. If the right-hand side of (14) is nonvanishing, then

$$\Delta_F^{o^{(1)}}\, \Delta_F^{-\Delta^*} \ge 1, \qquad (15)$$

where

$$\Delta_F^{-\Delta^*} = \left(\frac{\left(\sigma_F^{-\Delta^*}\right)^2\, g_F^{-\Delta^*}}{g_F^{(-\Delta^*)^2} - \left(g_F^{-\Delta^*}\right)^2}\right)^{1/2} = \left(g_F^{-\Delta^*}\right)^{1/2} = \Delta_F^{o^{(3)}}.$$
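The step from Theorem 1 to Theorem 2 can be spelled out in a few lines. The derivation below is a sketch based only on the definitions given above (the expectation value $g_F^{-\Delta^*} = \int_\Omega (-\Delta^*_\eta F(\eta))\,F(\eta)\,d\omega(\eta)$, the Stokes identity of Sect. 3.2, and relation (13)); it assumes $g_F^{-\Delta^*} > 0$.

$$
\begin{aligned}
\left(\sigma_F^{o^{(3)}}\right)^2 &= \int_\Omega \left(-\Delta^*_\eta F(\eta)\right) F(\eta)\, d\omega(\eta) = g_F^{-\Delta^*}, \\
\left(\sigma_F^{o^{(1)}}\right)^2 g_F^{-\Delta^*} &\ge \left|g_F^{o^{(1)}}\right|^2 \quad \text{(Theorem 1)}, \\
\left(\sigma_F^{o^{(1)}}\right)^2 \left(\sigma_F^{-\Delta^*}\right)^2 &\ge \left|g_F^{o^{(1)}}\right|^2\, \frac{\left(\sigma_F^{-\Delta^*}\right)^2}{g_F^{-\Delta^*}} = \left|g_F^{o^{(1)}}\right|^2\, \frac{g_F^{(-\Delta^*)^2} - \left(g_F^{-\Delta^*}\right)^2}{g_F^{-\Delta^*}}.
\end{aligned}
$$

The last line is obtained by multiplying the second one by $\left(\sigma_F^{-\Delta^*}\right)^2 / g_F^{-\Delta^*}$ and inserting (13); it is exactly (14). Dividing (14) by its right-hand side and taking square roots gives (15), and the same substitution shows why $\Delta_F^{-\Delta^*}$ collapses to $\left(g_F^{-\Delta^*}\right)^{1/2} = \sigma_F^{o^{(3)}}$.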

4 Fundamental Results Next, the result of the uncertainty principle will be applied to some examples which are of particular interest in geoscientific research.


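4.1 Localization of Spherical Harmonics

We know that

$$\int_\Omega \left(Y_{n,k}(\xi)\right)^2 d\omega(\xi) = 1. \qquad (16)$$

Now it is clear that

$$g_{Y_{n,k}}^{o^{(1)}} = 0, \quad \sigma_{Y_{n,k}}^{o^{(1)}} = 1. \qquad (17)$$

Moreover, we have

$$g_{Y_{n,k}}^{-\Delta^*} = n(n+1), \quad \sigma_{Y_{n,k}}^{-\Delta^*} = 0. \qquad (18)$$

In other words, spherical harmonics show an ideal frequency localization, but no space localization (see Fig. 3 for an illustration of space and frequency localization for the Legendre polynomials).

Localization of the Legendre Kernel. We have with $P_n^* = \sqrt{\frac{2n+1}{4\pi}}\, P_n$

$$\int_\Omega \left(P_n^*(\xi\cdot\eta)\right)^2 d\omega(\eta) = 1 \qquad (19)$$

for all $\xi \in \Omega$, such that

$$g_{P_n^*(\xi\,\cdot)}^{o^{(1)}} = 0, \quad \sigma_{P_n^*(\xi\,\cdot)}^{o^{(1)}} = 1, \qquad (20)$$

$$g_{P_n^*(\xi\,\cdot)}^{-\Delta^*} = n(n+1), \quad \sigma_{P_n^*(\xi\,\cdot)}^{-\Delta^*} = 0. \qquad (21)$$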

Fig. 3 The Legendre kernel $P_n$ for $n = 2, 5, 9$, space representation $\vartheta \mapsto P_n(\cos\vartheta)$ (left) and frequency representation $m \mapsto (P_n)^\wedge(m)$ (right)

4.2 Localization of Non-bandlimited Kernels

Consider the Abel-Poisson function $Q_h: [-1,1] \to \mathbb{R}$, $h < 1$, given by (see Fig. 4)

$$Q_h(t) = \frac{1}{4\pi}\, \frac{1-h^2}{(1+h^2-2ht)^{3/2}} = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, h^n P_n(t). \qquad (22)$$

An easy calculation gives us

$$\|Q_h\|_{L^2[-1,1]} = \left(Q_{h^2}(1)\right)^{1/2} = \left(\frac{1+h^2}{4\pi}\right)^{1/2} \frac{1}{1-h^2}. \qquad (23)$$

Fig. 4 The Abel-Poisson kernel $Q_h$ for $h = 0.7, 0.5, 0.2$, space representation $\vartheta \mapsto Q_h(\cos\vartheta)$ (left) and frequency representation $n \mapsto (Q_h)^\wedge(n)$ (right)

Fig. 5 Abel-Poisson kernel uncertainty classification: the functions $h \mapsto \Delta_{\bar{Q}_h}^{o^{(1)}}$ and $h \mapsto \Delta_{\bar{Q}_h}^{-\Delta^*}$

Furthermore, for $\bar{Q}_h(t) = \|Q_h\|^{-1}_{L^2[-1,1]}\, Q_h(t)$, $t \in [-1,1]$, we obtain after an elementary calculation (see Fig. 5)

$$\Delta_{\bar{Q}_h}^{o^{(1)}} = \frac{1-h^2}{2h}, \quad \Delta_{\bar{Q}_h}^{-\Delta^*} = \frac{\sqrt{6}\, h}{1-h^2}. \qquad (24)$$

Thus, we finally obtain

$$\Delta_{\bar{Q}_h}^{o^{(1)}}\, \Delta_{\bar{Q}_h}^{-\Delta^*} = \frac{\sqrt{6}}{2} = \sqrt{\frac{3}{2}} > 1. \qquad (25)$$
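The closed forms (24) and (25) can be cross-checked by direct quadrature of the kernel (22). The sketch below is an illustration only, assuming NumPy; it computes the space uncertainty from $t_K$ and the frequency uncertainty from $2\pi\int(1-t^2)(K'(t))^2\,dt$ (the Legendre operator integral after integration by parts), both divided by the squared norm. The function name and the sample values of $h$ are illustrative.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss


def abel_poisson_uncertainties(h, quad_order=400):
    """Return (space uncertainty, frequency uncertainty) of the normalized Abel-Poisson kernel."""
    t, w = leggauss(quad_order)
    base = 1 + h**2 - 2 * h * t
    q = (1 - h**2) / (4 * np.pi * base**1.5)           # Q_h(t), eq. (22)
    dq = 3 * h * (1 - h**2) / (4 * np.pi * base**2.5)  # derivative dQ_h/dt
    norm_sq = 2 * np.pi * np.sum(w * q**2)
    t_q = 2 * np.pi * np.sum(w * t * q**2) / norm_sq   # length of the center of gravity
    sigma_space = np.sqrt(1 - t_q**2)
    # expectation value of -Delta*, i.e. the squared frequency uncertainty
    g_freq = 2 * np.pi * np.sum(w * (1 - t**2) * dq**2) / norm_sq
    return sigma_space / t_q, np.sqrt(g_freq)


results = {h: abel_poisson_uncertainties(h) for h in (0.2, 0.5, 0.7)}
products = {h: d_space * d_freq for h, (d_space, d_freq) in results.items()}
```

Under these assumptions, each entry of `products` should be close to $\sqrt{3/2} \approx 1.2247$, and the individual factors should match (24).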



Note that in this case, the value $\Delta_{\bar{Q}_h}^{o^{(1)}}\, \Delta_{\bar{Q}_h}^{-\Delta^*}$ is independent of $h$. All intermediate cases of "space-frequency localization" occur when discussing the Abel-Poisson kernel. In fact, it should be pointed out that the Abel-Poisson kernel does not satisfy a minimum uncertainty state. Letting $h$ formally tend to 1 in the results provided by the uncertainty principle for the Abel-Poisson kernel function, we are able to interpret the localization properties of the Dirac kernel on $\Omega$ satisfying $\delta^\wedge(n) = 1$ for all $n \in \mathbb{N}_0$:

$$\delta(\xi\cdot\eta) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, P_n(\xi\cdot\eta), \quad \xi,\eta \in \Omega, \qquad (26)$$

where the convergence is understood in the distributional sense. As a matter of fact, letting $h$ tend to 1 shows us that the variances in the space domain take the constant value 0. On the other hand, the variances in the frequency domain converge to $\infty$. Hence, the Dirac kernel shows ideal space localization, but no frequency localization. The minimum uncertainty state within the uncertainty relation is provided by the bell-shaped (Gaussian) probability density function (see Freeden et al. 1998; Laín Fernández 2003).

Localization of the Gaussian Function. Consider the function $G_\lambda$ given by

$$G_\lambda(t) = e^{-(\lambda/2)(1-t)}, \quad t \in [-1,1],\ \lambda > 0. \qquad (27)$$

An elementary calculation shows us that

$$\tilde{G}_\lambda(t) = \gamma(\lambda)\, e^{-(\lambda/2)(1-t)}, \qquad (28)$$

with

$$\gamma(\lambda) = \frac{1}{\sqrt{4\pi}} \left(\frac{1}{2\lambda}\left(1 - e^{-2\lambda}\right)\right)^{-1/2}, \qquad (29)$$

satisfies $\|\tilde{G}_\lambda\|_{L^2[-1,1]} = 1$. Furthermore, it is not difficult to deduce (cf. Freeden and Windheuser 1996) that $\Delta_{\tilde{G}_\lambda}^{o^{(1)}}\, \Delta_{\tilde{G}_\lambda}^{-\Delta^*} \to 1$ as $\lambda \to \infty$. This shows us that the best value of the uncertainty principle (Corollary 2) is 1.
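The limit behaviour of the Gaussian can be made visible with a few numbers. The sketch below is an assumption-laden illustration rather than the cited proof: with the substitution $u = 1 - t$ all integrals reduce to $I_k = \int_0^2 u^k e^{-\lambda u}\,du$, $k = 0,1,2$, which have elementary closed forms, and since $G_\lambda' = (\lambda/2)\,G_\lambda$ the expectation value of $-\Delta^*$ becomes $(\lambda^2/4)(2I_1 - I_2)/I_0$. Only the Python standard library is used; the function name and the sample values of $\lambda$ are illustrative.

```python
import math


def gaussian_uncertainty_product(lam):
    """Uncertainty product of the normalized Gaussian kernel exp(-(lam/2)(1 - t))."""
    e = math.exp(-2 * lam)
    # I_k = int_0^2 u^k exp(-lam * u) du, with u = 1 - t
    i0 = (1 - e) / lam
    i1 = (1 - e * (1 + 2 * lam)) / lam**2
    i2 = (2 - e * (2 + 4 * lam + 4 * lam**2)) / lam**3
    t_k = 1 - i1 / i0                    # expectation of t under the normalized density
    sigma_space_sq = 1 - t_k**2          # space variance, cf. (9)
    # expectation of -Delta*: (lam^2 / 4) * int (1 - t^2) K^2 dt / int K^2 dt,
    # using 1 - t^2 = 2u - u^2
    g_freq = (lam**2 / 4) * (2 * i1 - i2) / i0
    return math.sqrt(sigma_space_sq * g_freq) / t_k


products = [gaussian_uncertainty_product(lam) for lam in (1.0, 10.0, 100.0, 1000.0)]
```

In this sketch the products stay above 1 and decrease towards 1 as $\lambda$ grows, in line with the statement that the Gaussian is asymptotically optimal.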

4.3 Quantitative Illustration of Localization

Summarizing our results, we are led to the following conclusions: The uncertainty principle represents a trade-off between two "spreads," one for the position and the other for the frequency. The main statement is that sharp localization in space and in frequency is mutually exclusive. The reason for the validity of the uncertainty relation (Theorem 1) is that the operators $o^{(1)}$ and $o^{(3)}$ do not commute. Thus, $o^{(1)}$ and $o^{(3)}$ cannot be sharply defined simultaneously. Extremal members in the space/momentum relation are the polynomials (i.e., spherical harmonics) and the Dirac function(al)s. An asymptotically optimal kernel is the Gaussian function. The estimate (Corollary 2) allows us to give a quantitative classification in the form of a canonically defined hierarchy of the space/frequency localization properties of kernel functions of the form

$$K(t) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, K^\wedge(n)\, P_n(t), \quad t = \xi\cdot\eta, \qquad (30)$$

$(\xi,\eta) \in \Omega \times \Omega$. In view of the amount of space/frequency localization, it is also important to distinguish bandlimited kernels (i.e., $K^\wedge(n) = 0$ for all $n \ge N$) and non-bandlimited ones. Non-bandlimited kernels show a much stronger space localization than bandlimited counterparts. It is not difficult to prove that if $K \in L^2[-1,1]$ with $\|K(\cdot\,\eta)\|_{L^2(\Omega)} = 1$, then

$$\left(\sigma_{K(\cdot\,\eta)}^{o^{(1)}}\right)^2 = 1 - \left(\sum_{n=0}^{\infty} \frac{2n+2}{4\pi}\, K^\wedge(n)\, K^\wedge(n+1)\right)^2. \qquad (31)$$
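The space variance can also be evaluated directly from a Legendre symbol. The sketch below is a hedged illustration, assuming NumPy: it uses the recurrence $(2n+1)\,t\,P_n = (n+1)P_{n+1} + nP_{n-1}$ together with the orthogonality of the Legendre polynomials, which gives $t_K = \sum_{n\ge 0} \frac{n+1}{2\pi}\, K^\wedge(n)K^\wedge(n+1)$ for a normalized kernel, and it divides by the squared norm so that unnormalized symbols can be passed in. The three sample symbols (a single Legendre kernel, a bandlimited kernel with constant symbol up to degree 9, and an Abel-Poisson symbol with $h = 0.9$) are illustrative choices.

```python
import numpy as np


def space_variance_from_symbol(symbol):
    """Space variance 1 - t_K^2 of a zonal kernel given by a finite Legendre symbol."""
    s = np.asarray(symbol, dtype=float)
    n = np.arange(len(s))
    norm_sq = np.sum((2 * n + 1) / (4 * np.pi) * s**2)
    # t_K couples neighbouring degrees n and n + 1 only
    t_k = np.sum((n[:-1] + 1) / (2 * np.pi) * s[:-1] * s[1:]) / norm_sq
    return 1.0 - t_k**2


legendre_symbol = np.zeros(10)
legendre_symbol[5] = 1.0                  # Legendre kernel of degree 5
bandlimited_symbol = np.ones(10)          # constant symbol for n = 0..9
abel_poisson_symbol = 0.9 ** np.arange(2000)

var_legendre = space_variance_from_symbol(legendre_symbol)
var_bandlimited = space_variance_from_symbol(bandlimited_symbol)
var_abel_poisson = space_variance_from_symbol(abel_poisson_symbol)
```

For these inputs the Legendre kernel has variance 1 (no space localization), the bandlimited kernel gives $1 - 0.9^2 = 0.19$, and the Abel-Poisson symbol reproduces $\left((1-h^2)/(1+h^2)\right)^2$, which follows from (24).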

Thus, as already mentioned, if $K^\wedge(n) \approx K^\wedge(n+1) \approx 1$ for many successive integers $n$, then the support of (30) in space domain is small. Once again, the varieties of the intensity of the localization on the unit sphere $\Omega$ can be also illustrated by considering the kernel function (30). By choosing $K^\wedge(n) = \delta_{nk}$, we obtain a Legendre kernel of degree $k$, i.e., we arrive at the left end of our scheme (see Fig. 6). On the other hand, if we formally take $K^\wedge(n) = 1$ for $n = 0,1,\ldots$, we obtain the kernel which is the Dirac functional in $L^2(\Omega)$. Bandlimited kernels have the property $K^\wedge(n) = 0$ for all $n \ge N$, $N \in \mathbb{N}_0$. Non-bandlimited kernels satisfy $K^\wedge(n) \neq 0$ for an infinite number of integers $n \in \mathbb{N}_0$. Assuming the condition $\lim_{n\to\infty} K^\wedge(n) = 0$, it follows that the slower the sequence $\{K^\wedge(n)\}_{n=0,1,\ldots}$ converges to zero, the lower the frequency localization, and the higher the space localization. Altogether, Fig. 6 gives a qualitative illustration of the consequences of the uncertainty principle in the theory of zonal kernel functions on the sphere: On the left end of this scheme, we have the Legendre kernels with their ideal frequency (momentum) localization. However, they show no space localization, as they are of polynomial nature. Thus, the present standard way of increasing the accuracy in applications in spherical harmonic (Fourier) expansions is to increase the maximum degree of the spherical harmonics expansions under consideration. On the right end of the scheme, there is the Dirac kernel which maps a function to its value at a certain point. Hence, those functionals have ideal space localization but no frequency localization. Consequently, they can be used in a finite point set approximation.

Fig. 6 The uncertainty principle and its consequences

Space localization | No space localization | ... | Ideal space localization
Frequency localization | Ideal frequency localization | ... | No frequency localization
Kernel type | Legendre kernel | Bandlimited, locally supported | Dirac kernel

5 Future Directions As already pointed out, in this context, the spectral representation of a square-integrable function by means of spherical harmonics is essential to solve many problems in today’s applications.

5.1 Fourier Approach

In future research, however, Fourier (orthogonal) expansions in terms of spherical harmonics $\{Y_{n,j}\}$ will not be the only way of representing a square-integrable function. In order to explain this in more detail, we think of a square-integrable function as a signal in which the spectrum evolves over space in a significant way. We imagine that, at each point on the sphere $\Omega$, the function refers to a certain combination of frequencies and that these frequencies are continuously changing. This space evolution of the frequencies, however, is not reflected in the Fourier expansion in terms of non-space-localizing spherical harmonics, at least not directly. Therefore, in theory, any member $F$ of the space $L^2(\Omega)$ can be reconstructed from its Fourier transforms, i.e., the "amplitude spectrum" $\{F^\wedge(n,j)\}_{n=0,1,\ldots;\ j=1,\ldots,2n+1}$, but the Fourier transform contains information about the frequencies of the function over all positions instead of showing how the frequencies vary in space.

5.2 Multiscale Approach

In what follows we present a two-parameter, i.e., scale- and space-dependent, method of achieving a reconstruction of a function $F \in L^2(\Omega)$ involving (scalar) zonal kernel functions which we refer to as a Dirac family (or equivalently in the language of modern multiscale approximation, a scaling (kernel) function), $\{\Phi_\rho\}_{\rho\in(0,\infty)}$ converging to the (zonal) Dirac kernel $\delta$. In other words, a Dirac family is a set of zonal kernels $\Phi_\rho: [-1,1] \to \mathbb{R}$, $\rho \in (0,\infty)$, of the form

$$\Phi_\rho(\xi\cdot\eta) = \sum_{n=0}^{\infty} \varphi_\rho(n)\, \frac{2n+1}{4\pi}\, P_n(\xi\cdot\eta), \quad \xi,\eta \in \Omega, \qquad (32)$$

converging to the "Dirac kernel" $\delta$ as $\rho \to 0$. Consequently, if $\{\Phi_\rho\}_{\rho\in(0,\infty)}$ is a Dirac family of zonal kernels, its "symbol" $\{\Phi_\rho^\wedge(n)\}_{n=0,1,\ldots}$ constitutes a sequence satisfying the limit relation $\lim_{\rho\to 0} \Phi_\rho^\wedge(n) = 1$ for each $n = 0,1,\ldots$. Accordingly, if $\{\Phi_\rho\}_{\rho\in(0,\infty)}$ is a Dirac family of zonal kernels, the convolution integrals

$$(\Phi_\rho * F)(\xi) = \int_\Omega \Phi_\rho(\xi\cdot\eta)\, F(\eta)\, d\omega(\eta), \quad \xi \in \Omega, \qquad (33)$$

converge (in a certain topology) to the limit

$$F(\xi) = (\delta * F)(\xi) = \int_\Omega \delta(\xi\cdot\eta)\, F(\eta)\, d\omega(\eta), \quad \xi \in \Omega, \qquad (34)$$

for all $\xi \in \Omega$ as $\rho$ tends to 0. In more detail, if $F$ is a function of class $L^2(\Omega)$ and $\{\Phi_\rho\}$ is a (suitable) Dirac family (tending to the Dirac kernel), then the following limit relation holds true:

$$\lim_{\rho\to 0,\ \rho>0} \left\|F - \Phi_\rho * F\right\|_{L^2(\Omega)} = 0. \qquad (35)$$
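In the spectral domain the limit relation (35) is easy to observe: by the Parseval identity, the squared error of the scale approximation is $\sum_n (1 - \Phi_\rho^\wedge(n))^2 \sum_j (F^\wedge(n,j))^2$. The following sketch illustrates this under explicit assumptions that are not taken from the text: the symbol $\Phi_\rho^\wedge(n) = e^{-n\rho}$ (one common form of an Abel-Poisson family), a hypothetical signal whose energy per degree decays like $(n+1)^{-3}$, and truncation at degree 200. NumPy is assumed.

```python
import numpy as np


def abel_poisson_symbol(n, rho):
    """Assumed scaling symbol exp(-n * rho); it tends to 1 for every n as rho -> 0."""
    return np.exp(-n * rho)


def approximation_error(energy_per_degree, rho):
    """L2 error of Phi_rho * F, given sum_j F^(n, j)^2 for each degree n."""
    n = np.arange(len(energy_per_degree))
    damping = 1.0 - abel_poisson_symbol(n, rho)
    return np.sqrt(np.sum(damping**2 * energy_per_degree))


# Hypothetical signal: energy per degree decaying like (n + 1)^(-3), degrees 0..200.
energy = 1.0 / (np.arange(201) + 1.0) ** 3
errors = [approximation_error(energy, rho) for rho in (1.0, 0.1, 0.01, 0.001)]
```

With these choices the entries of `errors` decrease monotonically as $\rho$ decreases, which is the discrete counterpart of (35).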

There is a large number of Dirac families that are of interest for geoscientific applications (for more details, the reader is referred to, e.g., Freeden and Schreiner (1995, 2009) and Freeden and Windheuser (1996) and the references therein). Only three prototypes of Dirac families should be mentioned here, namely, the bandlimited Shannon family, the neither bandlimited nor spacelimited Abel-Poisson and Gauß-Weierstraß families, and the spacelimited Haar family. It should be noted that an approximate convolution identity acts as a space and frequency localization procedure in the following way: As $\{\Phi_\rho\}$, $\rho \in (0,\infty)$, is a Dirac family of zonal scalar kernel functions tending to the Dirac kernel, the function $\Phi_\rho(\eta\,\cdot)$ is highly concentrated about the point $\eta \in \Omega$ if the "scale parameter" $\rho$ is a small positive value. Moreover, as $\rho$ tends to infinity, $\Phi_\rho(\eta\,\cdot)$ becomes more and more localized in frequency. Correspondingly, the uncertainty principle states that the space localization of $\Phi_\rho(\eta\,\cdot)$ decreases more and more. In conclusion, the products $\eta \mapsto \Phi_\rho(\xi\cdot\eta)\, F(\eta)$, $\eta \in \Omega$, $\xi \in \Omega$, for each fixed value $\rho$, display information in $F \in L^2(\Omega)$ at various levels of spatial resolution or frequency bands. Consequently, as $\rho$ approaches $\infty$, the convolution integrals $\Phi_\rho * F = \int_\Omega \Phi_\rho(\cdot\,\eta)\, F(\eta)\, d\omega(\eta)$ display coarser, lower-frequency features. As $\rho$ approaches 0, the integrals give sharper and sharper spatial resolution. Thus, the convolution integrals can measure the space-frequency variations of spectral components, but they have a different space-frequency resolution. Each scale approximation $\Phi_\rho * F$ of a function $F \in L^2(\Omega)$ must be made directly by computing the relevant convolution integrals. In doing so, however, it is inefficient to use no information from the approximation $\Phi_\rho * F$ within the computation of $\Phi_{\rho'} * F$ provided that $\rho' < \rho$.
In fact, the efficient construction of multiscale approximation based on Dirac families begins by a multiresolution analysis in terms of wavelets, i.e., a recursive method which is ideal for computation (see, e.g., Freeden et al. 1998; Freeden and Schreiner 2007, 2009, and the references therein). The wavelet transform acts as a space and frequency localization operator in the following way: If $\{\Phi_\rho\}$, $\rho \in (0,\infty)$, is a Dirac family and $\rho$ approaches infinity, the convolution integrals $\Phi_\rho * F$ display coarser, lower-frequency features. As $\rho$ approaches zero, the integrals give sharper and sharper spatial resolution. In other words, the convolution integrals can measure the space-frequency variations of spectral components, but they have a different space-frequency resolution. In this context, we observe that

$$\int_R^\infty \int_\Omega \Psi_\rho(\cdot,\eta)\, F(\eta)\, d\omega(\eta)\, \frac{d\rho}{\rho} \to F \in L^2(\Omega), \quad R \to 0, \qquad (36)$$

i.e.,

$$\lim_{R\to 0,\ R>0} \left\|F - \int_R^\infty \int_\Omega \Psi_\rho(\cdot,\eta)\, F(\eta)\, d\omega(\eta)\, \frac{d\rho}{\rho}\right\|_{L^2(\Omega)} = 0, \qquad (37)$$

provided that

$$\Psi_\rho(\xi,\eta) = \sum_{n=0}^{\infty} \Psi_\rho^\wedge(n)\, \frac{2n+1}{4\pi}\, P_n(\xi\cdot\eta), \quad (\xi,\eta) \in \Omega \times \Omega, \qquad (38)$$

is given such that

$$\Psi_\rho^\wedge(n) = -\rho\, \frac{d}{d\rho}\, \Phi_\rho^\wedge(n) \qquad (39)$$

for $n = 0,1,\ldots$ and all $\rho \in (0,\infty)$. Conventionally, the family $\{\Psi_\rho\}$, $\rho \in (0,\infty)$, is called a (scale continuous) wavelet. The (scale continuous) wavelet transform $(WT): L^2(\Omega) \to L^2((0,\infty)\times\Omega)$ is defined by

$$(WT)(F)(\rho;\xi) = (\Psi_\rho * F)(\xi) = \int_\Omega \Psi_\rho(\xi,\eta)\, F(\eta)\, d\omega(\eta). \qquad (40)$$

In other words, the wavelet transform is defined as the L2 ./-inner product of F 2 L2 ./ with the set of “rotations” and “dilations” of F . The (scale continuous) wavelet transform (WT) is invertible on L2 ./, i.e., Z Z F D

1

.W T /.F /.I /‰ .; / 

0

d d!./ 

(41)

in the sense of $\|\cdot\|_{L^2(\Omega)}$. From Parseval’s identity in terms of scalar spherical harmonics, it follows that

$$\int_\Omega \int_0^\infty \big( (\Psi_\rho * F)(\eta) \big)^2\, \frac{d\rho}{\rho}\, d\omega(\eta) = \|F\|^2_{L^2(\Omega)}, \tag{42}$$

i.e., (WT) converts a function $F$ of one variable into a function of two variables $\xi \in \Omega$ and $\rho \in (0,\infty)$ without changing its total energy. In terms of filtering, $\{\Phi_\rho\}$ and $\{\Psi_\rho\}$, $\rho \in (0,\infty)$, may be interpreted as low-pass filter and band-pass filter, respectively. Correspondingly, the convolution operators are given by

$$\Phi_\rho * F, \quad F \in L^2(\Omega), \tag{43}$$

$$\Psi_\rho * F, \quad F \in L^2(\Omega). \tag{44}$$

The Fourier transforms read as follows:

$$(\Phi_\rho * F)^\wedge(n,j) = F^\wedge(n,j)\, \Phi_\rho^\wedge(n), \tag{45}$$

$$(\Psi_\rho * F)^\wedge(n,j) = F^\wedge(n,j)\, \Psi_\rho^\wedge(n). \tag{46}$$
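The interplay between the low-pass symbol $\Phi_\rho^\wedge(n)$ and the band-pass symbol $\Psi_\rho^\wedge(n)$ can be illustrated numerically. The following is only a minimal sketch under an assumption that is not made in the text: the scaling function symbol is taken to be of Abel-Poisson type, $\Phi_\rho^\wedge(n) = e^{-\rho n}$, so that (39) gives $\Psi_\rho^\wedge(n) = \rho n e^{-\rho n}$. The function names are hypothetical. The sketch checks that integrating the wavelet symbol over the scales $[R,\infty)$ with respect to $d\rho/\rho$ returns the scaling function symbol at scale $R$, which is the symbol-level counterpart of (36).

```python
import numpy as np


def phi_hat(rho, n):
    """Assumed scaling-function symbol (Abel-Poisson type): exp(-rho * n)."""
    return np.exp(-rho * n)


def psi_hat(rho, n):
    """Wavelet symbol obtained from (39): -rho * d/drho exp(-rho * n)."""
    return rho * n * np.exp(-rho * n)


def integrated_psi_hat(R, n, rho_max=200.0, num=20001):
    """Trapezoidal approximation of the integral of psi_hat over [R, rho_max]
    with respect to d(rho)/rho.

    With t = log(rho) the measure d(rho)/rho becomes dt, so a uniform grid in
    t is used. rho_max stands in for infinity (assumption: n >= 1, so that the
    integrand has decayed to zero there).
    """
    t = np.linspace(np.log(R), np.log(rho_max), num)
    values = psi_hat(np.exp(t), n)
    dt = t[1] - t[0]
    return float(dt * (values.sum() - 0.5 * (values[0] + values[-1])))


if __name__ == "__main__":
    for n in (1, 2, 5):
        for R in (1.0, 0.1, 0.01):
            print(n, R, integrated_psi_hat(R, n), float(phi_hat(R, n)))
```

For this particular symbol the degree $n = 0$ is special: $\Phi_\rho^\wedge(0) = 1$ for all $\rho$, hence $\Psi_\rho^\wedge(0) = 0$, and the scale integral carries no information about the mean value of $F$.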

These formulas provide the transition from the wavelet transform to the Fourier transform. Since all scales $\rho$ are used, the reconstruction is highly redundant. Of course, the redundancy leads us to the following question which is of particular importance in data analysis:

• Given an arbitrary $H \in L^2((0,\infty) \times \Omega)$, how can we know whether $H = (WT)(F)$ for some function $F \in L^2(\Omega)$?

The question amounts to finding the range of the (scale continuous) wavelet transform $(WT): L^2(\Omega) \to L^2((0,\infty) \times \Omega)$ (see Freeden et al. 1998), i.e., the subspace

$$\mathcal{W} = (WT)(L^2(\Omega)) \neq L^2((0,\infty) \times \Omega). \tag{47}$$

Actually, it can be shown that the tendency for minimizing errors by use of the wavelet transform is again expressed in least-squares approximation: Let $H$ be an arbitrary element of $L^2((0,\infty) \times \Omega)$. Then the unique function $F_H \in L^2(\Omega)$ with the property

$$\| H - (WT)(F_H) \|_{L^2((0,\infty) \times \Omega)} = \inf_{U \in L^2(\Omega)} \| H - (WT)(U) \|_{L^2((0,\infty) \times \Omega)} \tag{48}$$

is given by

$$F_H = \int_0^\infty \int_\Omega H(\rho;\eta)\, \Psi_\rho(\cdot,\eta)\, d\omega(\eta)\, \frac{d\rho}{\rho}. \tag{49}$$

$(WT)(F_H)$ is indeed the orthogonal projection of $H$ onto $\mathcal{W}$. Another important question in the context of the wavelet transform is:

• Given an arbitrary $H(\rho;\xi) = (WT)(F)(\rho;\xi)$, $\rho \in (0,\infty)$, and $\xi \in \Omega$, for some $F \in L^2(\Omega)$, how can we reconstruct $F$?

The answer is provided by the so-called least-energy representation. It states: Of all possible functions $H \in L^2((0,\infty) \times \Omega)$ for $F \in L^2(\Omega)$, the function $H = (WT)(F)$ is unique in that it minimizes the “energy” $\|H\|^2_{L^2((0,\infty) \times \Omega)}$. More explicitly (see Freeden et al. 1998),

$$\| (WT)(F) \|_{L^2((0,\infty) \times \Omega)} = \inf_{\substack{H \in L^2((0,\infty) \times \Omega) \\ (WT)^{-1}(H) = F}} \| H \|_{L^2((0,\infty) \times \Omega)}.$$

6 Conclusion

Zonal kernel function techniques such as splines and wavelets have come to play a fundamental role in numerical analysis on the sphere only relatively recently. Spherical splines date back to the early 1980s (Freeden 1981; Wahba 1981). Spline functions are canonical generalizations of “spherical polynomials,” i.e., spherical harmonics, having desirable characteristics as interpolating, smoothing, and best approximating functions (cf. Freeden 1981; Freeden et al. 1998; Wahba 1990). By spline interpolation, we mean a variational problem of minimizing an “energy” norm of a suitable Sobolev space. According to the choice of the norm, bandlimited as well as non-bandlimited splines can be distinguished. Spherical splines have been applied successfully in many areas and by a great number of users (for more details, see Freeden et al. 1998; Freeden 1999; Freeden and Michel 2004, and the references therein).

The construction of spherical wavelets has seen an enormous increase of activities over the last 20 years. Three features are incorporated in the way of thinking about georelevant wavelets, namely, basis property, decorrelation, and fast computation.

First of all, wavelets are building blocks for general data sets derived from functions. By virtue of the basis property, each element of a general class of functions (e.g., a geopotential seen as member of a set of potentials within a Sobolev space framework) can be expressed in a stable way as a linear combination of dilated and shifted copies of a “mother function.” The role of the wavelet transform as a mapping from the class of functions into an associated two-parameter family of space- and scale-dependent functions is properly characterized by least-squares properties.

Secondly, wavelets have the power to decorrelate. In other words, the representation of data in terms of wavelets is somehow “more compact” than the original representation. We search for an accurate approximation by only using a small fraction of the original information of a function. Typically the decorrelation is achieved by building wavelets which have a compact support (localization in space), which are smooth (decay toward high frequencies), and which have vanishing moments (decay toward low frequencies). Different types of wavelets can be found from certain constructions of space/momentum localization. In this respect, the uncertainty principle tells us that sharp localization in “space” and sharp localization in “momentum” are mutually exclusive. Nevertheless, it turns out that decay toward long and short wavelengths (i.e., in information theoretic jargon, band-pass filtering) can be assured without any difficulty. Moreover, vanishing moments of wavelets enable us to combine (polynomial) outer harmonic expansions (responsible for the long-wavelength part of a function) with wavelet multiscale expansions (responsible for the medium- to short-wavelength contributions).

Third, the main question of recovering a function on the sphere, e.g., the Earth’s gravitational potential, is how to decompose the function into wavelet coefficients and how to reconstruct the potential efficiently from the coefficients. There is a “tree algorithm” or “pyramid algorithm” that makes these steps simple and fast. In this respect, it is desirable to switch between the original representation of the data and its wavelet representation in a time proportional to the size of the data. In fact, the fast decorrelation power of wavelets is the key to applications such as data compression, fast data transmission, noise cancelation, signal recovering, etc.

In the last years, wavelets on the sphere have been the focus of several research groups, which led to different wavelet approaches. Common to all these proposals is a multiresolution analysis which enables a balanced amount of both frequency (more accurately, angular momentum) and space localization (see, e.g., Dahlke et al. 1995, Schröder and Sweldens 1995, Potts and Tasche 1995, Lyche and Schumaker 2000, Weinreich 2001). A group theoretical approach to a continuous wavelet transform on the sphere is followed by Antoine and Vandergheynst (1999), Antoine et al. (2002), and Holschneider (1996). The parameter choice of their continuous wavelet transform is the product of SO(3) (for the motion on the sphere) and $\mathbb{R}^+$ (for the dilations). A continuous wavelet transform approach for analyzing functions on the sphere is presented by Dahlke and Maass (1996). The constructions of the Geomathematics Group in Kaiserslautern on spherical wavelets (see, e.g., Freeden and Schreiner 1995; Freeden et al. 1998; Freeden and Schreiner 2009) are intrinsically based on the specific properties concerning the theory of spherical harmonics. Freeden and Schreiner (2007) and Freeden and Gutting (2013) are interested in a compromise connecting zonal function expressions and structured grids on the sphere to obtain fast algorithms.

Finally, the authors would like to point out that much of the material presented in this chapter within a spherical framework can be readily formulated for nonspherical reference surfaces, even for vector and tensor data. Nevertheless, it remains to work with more realistic geometries such as the (actual) Earth’s surface, real satellite orbits, etc. This is the great challenge for future research.

References

Antoine JP, Vandergheynst P (1999) Wavelets on the 2-sphere: a group-theoretic approach. Appl Comput Harmon Anal 7:1–30
Antoine JP, Demanet L, Jacques L, Vandergheynst P (2002) Wavelets on the sphere: implementations and approximations. Appl Comput Harmon Anal 13:177–200
Backus GE (1966) Potentials for tangent tensor fields on spheroids. Arch Ration Mech Anal 22:210–252
Backus GE (1967) Converting vector and tensor equations to scalar equations in spherical coordinates. Geophys J R Astron Soc 13:61–101
Backus GE (1986) Poloidal and toroidal fields in geomagnetic field modelling. Rev Geophys 24:75–109
Clebsch RFA (1861) Über eine Eigenschaft der Kugelfunktionen. Crelles J 60:343
Dahlke S, Maass P (1996) Continuous wavelet transforms with applications to analyzing functions on spheres. J Fourier Anal Appl 2(4):379–396
Dahlke S, Dahmen W, Schmitt W, Weinreich I (1995) Multiresolution analysis and wavelets on $S^2$. Numer Funct Anal Optim 16(1–2):19–41
de Laplace PS (1785) Theorie des attractions des sphéroides et de la figure des planètes. Mèm de l’Acad, Paris
Freeden W (1981) On spherical spline interpolation and approximation. Math Methods Appl Sci 3:551–575
Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Leipzig
Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser Verlag, Boston/Basel/Berlin
Freeden W, Schreiner M (1995) Non-orthogonal expansions on the sphere. Math Methods Appl Sci 18:83–120
Freeden W, Schreiner M (2007) Biorthogonal locally supported wavelets on the sphere based on zonal kernel functions. J Fourier Anal Appl 13:693–709
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Berlin/Heidelberg
Freeden W, Windheuser U (1996) Spherical wavelet transform and its discretization. Adv Comput Math 5:51–94
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford/Clarendon, Oxford
Gauß CF (1838) Allgemeine Theorie des Erdmagnetismus, Resultate aus den Beobachtungen des magnetischen Vereins. Göttinger Magnetischer Verein, Leipzig
Heine E (1878) Handbuch der Kugelfunktionen. Verlag G. Reimer, Berlin
Holschneider M (1996) Continuous wavelet transforms on the sphere. J Math Phys 37:4156–4165
Laín Fernández N (2003) Polynomial bases on the sphere. PhD thesis, University of Lübeck
Legendre AM (1785) Recherches sur l’attraction des sphèroides homogènes. Mèm math phys près à l’Acad Aci par divers savantes 10:411–434
Lyche T, Schumaker L (2000) A multiresolution tensor spline method for fitting functions on the sphere. SIAM J Sci Comput 22:724–746
Maxwell JC (1891) A treatise on electricity and magnetism (1873, 1881, 1891) Bde 1 u. 2. Ungekürzter Nachdruck der letzten Auflage 1891, Carnegie Mellon University, Dover, 1954. (Vol 2. Available at http://posner.library.cmu/Posner/books/book.cgi?call=537_M46T_1873_VOL_2)
Morse PM, Feshbach H (1953) Methods of theoretical physics. McGraw-Hill, New York
Narcowich FJ, Ward JD (1996) Nonstationary wavelets on the m-sphere for scattered data. Appl Comput Harmon Anal 3:324–336
Neumann F (1887) Vorlesungen über die Theorie des Potentials und der Kugelfunktionen. Teubner, Leipzig, pp 135–154
Potts D, Tasche M (1995) Interpolatory wavelets on the sphere. In: Chui CK, Schumaker LL (eds) Approximation theory VIII. World Scientific, Singapore, pp 335–342
Schröder P, Sweldens W (1995) Spherical wavelets: efficiently representing functions on the sphere. In: Computer graphics proceedings (SIGGRAPH95). ACM, New York, pp 161–175
Svensson SL (1983) Pseudodifferential operators: a new approach to the boundary value problems of physical geodesy. Manus Geod 8:1–40
Sylvester T (1876) Note on spherical harmonics. Phil Mag II 291, 400
Wahba G (1981) Spline interpolation and smoothing on the sphere. SIAM J Sci Stat Comput 2:5–16 (also errata: SIAM J Sci Stat Comput 3:385–386)
Wahba G (1990) Spline models for observational data. In: CBMS-NSF regional conference series in applied mathematics, vol 59. SIAM, Philadelphia
Weinreich I (2001) A construction of $C^{(1)}$-wavelets on the two-dimensional sphere. Appl Comput Harmon Anal 10:1–26

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_32-2 © Springer-Verlag Berlin Heidelberg 2014

Tomography: Problems and Multiscale Solutions

Volker Michel
Geomathematics Group, University of Siegen, Siegen, Germany

Abstract

In this chapter, a brief survey of three different approaches for the approximation of functions on the 3d-ball is presented: the expansion in an orthonormal (polynomial) basis, a reproducing kernel-based spline interpolation/approximation, and a wavelet-based multiscale analysis. In addition, some geomathematical tomography problems are discussed as applications.

1 Introduction

Tomography problems are typical examples for applications where approximating structures on a ball (and not only a sphere) are needed. The use of a tensor product of standard 1d-techniques often appears to be inadequate for such problems since the structures that have to be determined (such as the Earth’s interior or a human brain) usually consist of layers that are approximately spherical shells. Therefore, it appears to be more reasonable to extend spherical methods to the interior of the sphere. As is already known (not only) on the sphere, the use of orthogonal polynomials is often connected to a series of drawbacks that are caused by the global character of these functions (see, e.g., Schröder and Sweldens 1995, Amirbekyan et al. 2008, and Michel 2013, p. 142). Spline techniques and wavelet-based multiscale methods can avoid many of these problems and have turned out to be reasonable alternatives. In this chapter, the basics of all three approaches for the 3d-ball are presented to provide the reader with a brief survey of the state of the art. The presented spline and wavelet methods on the ball are based on previous works by the author and his research group. Further details can also be found in the textbook Michel (2013).

2 Complete Orthonormal Systems

2.1 In the Univariate Case

For the following investigations, the basics of orthogonal polynomials on intervals $[a,b]$ on the real line $\mathbb{R}$ are briefly recapitulated. It suffices to consider the interval $[-1,1]$ since an easy substitution can transfer such polynomials to arbitrary intervals $[a,b]$. For further details on univariate orthogonal polynomials, the reader is referred to, for example, Szegö (1939), Freud (1971), and Nikiforov and Uvarov (1988).



E-mail: [email protected], www.geomathematics-siegen.de


The following theorem guarantees the existence of orthogonal polynomials.

Theorem 1. Let $a < b$ be given real numbers and $w: \,]a,b[\, \to \mathbb{R}$ be a given function which is continuous and positive. Moreover, let $(\gamma_n)_{n \in \mathbb{N}_0}$ be a given real sequence, where $\gamma_n \neq 0$ for all $n \in \mathbb{N}_0$. Then there exists one and only one sequence of polynomials $(P_n)_{n \in \mathbb{N}_0}$ such that:
1. Each $P_n$, $n \in \mathbb{N}_0$, is a polynomial of degree $n$.
2. For all $n, m \in \mathbb{N}_0$, $\int_a^b P_n(t) P_m(t) w(t)\, dt = 0$, if $n \neq m$.
3. For every $n \in \mathbb{N}_0$, $P_n(b) = \gamma_n$.

Due to the following theorem, orthogonal polynomials can be used for the expansion of univariate functions on compact intervals.

Theorem 2. Let the conditions of Theorem 1 be satisfied and the Hilbert space $L^2_w[a,b]$ be defined by requiring

$$F \in L^2_w[a,b] \;:\Leftrightarrow\; F: [a,b] \to \mathbb{R} \text{ measurable and } \int_a^b F(t)^2\, w(t)\, dt < +\infty,$$

defining the inner product

$$\langle F, G \rangle_{L^2_w[a,b]} := \int_a^b F(t)\, G(t)\, w(t)\, dt; \quad F, G \in L^2_w[a,b];$$

and considering functions to be equivalent, if they are equal almost everywhere. Then the polynomials $(P_n)_{n \in \mathbb{N}_0}$ constitute a complete orthogonal system in $(L^2_w[a,b], \langle \cdot, \cdot \rangle_{L^2_w[a,b]})$.

The following systems are commonly used orthogonal polynomials. They will be important in the further considerations.

Definition 1. Let $\alpha, \beta > -1$. The polynomials $(P_n^{(\alpha,\beta)})_{n \in \mathbb{N}_0}$ which are uniquely given by the requirements:
1. Each $P_n^{(\alpha,\beta)}$, $n \in \mathbb{N}_0$, is a polynomial of degree $n$,
2. For all $n, m \in \mathbb{N}_0$, $\int_{-1}^{1} (1-t)^\alpha (1+t)^\beta P_n^{(\alpha,\beta)}(t) P_m^{(\alpha,\beta)}(t)\, dt = 0$, if $n \neq m$,
3. For every $n \in \mathbb{N}_0$, $P_n^{(\alpha,\beta)}(1) = \binom{n+\alpha}{n}$,
are called the Jacobi polynomials. The particular functions $P_n := P_n^{(0,0)}$, $n \in \mathbb{N}_0$, are called the Legendre polynomials.

Note that the binomial coefficient is defined by $\binom{r}{s} := \frac{\Gamma(r+1)}{\Gamma(s+1)\,\Gamma(r-s+1)}$ for $r \geq s \geq 0$, where $\Gamma$ represents the well-known Gamma function with $\Gamma(m+1) = m!$ for all $m \in \mathbb{N}_0$ and $\Gamma(x+1) = x\,\Gamma(x)$ for all $x \in \mathbb{R}^+$. The Jacobi polynomials and, in particular, the Legendre polynomials have already been intensively investigated. Only a few properties are mentioned here.


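Theorem 3 (Recurrence Relation for Jacobi Polynomials). Let $\alpha, \beta > -1$. Then the Jacobi polynomials satisfy the recurrence relation

$$2n(n+\alpha+\beta)(2n+\alpha+\beta-2)\, P_n^{(\alpha,\beta)}(x) = (2n+\alpha+\beta-1) \left[ (2n+\alpha+\beta)(2n+\alpha+\beta-2)\, x + \alpha^2 - \beta^2 \right] P_{n-1}^{(\alpha,\beta)}(x) - 2(n+\alpha-1)(n+\beta-1)(2n+\alpha+\beta)\, P_{n-2}^{(\alpha,\beta)}(x)$$

for all $n \geq 2$ and all $x \in [-1,1]$, where

$$P_0^{(\alpha,\beta)}(x) = 1, \qquad P_1^{(\alpha,\beta)}(x) = \frac{1}{2}(\alpha+\beta+2)\, x + \frac{1}{2}(\alpha-\beta)$$

for all $x \in [-1,1]$.

Due to this recurrence formula, the Jacobi polynomials can be calculated efficiently in numerical implementations.

Theorem 4 (Norm of the Jacobi Polynomials). Let $\alpha, \beta > -1$ and $n \in \mathbb{N}_0$. Then the following norms of $P_n^{(\alpha,\beta)}$ are known:

$$\left\| P_n^{(\alpha,\beta)} \right\|_{L^2_w[-1,1]} = \left( \frac{2^{\alpha+\beta+1}}{2n+\alpha+\beta+1}\, \frac{\Gamma(n+\alpha+1)\, \Gamma(n+\beta+1)}{\Gamma(n+1)\, \Gamma(n+\alpha+\beta+1)} \right)^{1/2},$$

$$\left\| P_n^{(\alpha,\beta)} \right\|_{C[-1,1]} = \binom{n+q}{n}, \quad \text{if } q := \max(\alpha,\beta) \geq -\frac{1}{2},$$

where $\|\cdot\|_{C[a,b]}$ denotes the usual maximum norm on $[a,b]$.

Theorem 5 (Ordinary Differential Equation of the Jacobi Polynomials). Let $\alpha, \beta > -1$ and $n \in \mathbb{N}_0$. Then the Jacobi polynomial $y = P_n^{(\alpha,\beta)}$ satisfies the following ordinary differential equation:

$$\frac{d}{dx} \left[ \left(1 - x^2\right) w(x)\, y'(x) \right] - \lambda_n\, w(x)\, y(x) = 0, \quad x \in \,]-1,1[\,,$$

where $w(x) := (1-x)^\alpha (1+x)^\beta$, $x \in \,]-1,1[\,$, is the usual weight function associated to the Jacobi polynomial $P_n^{(\alpha,\beta)}$ and the eigenvalue $\lambda_n$ is given by $\lambda_n = -n(n+\alpha+\beta+1)$. In other words, $y = P_n^{(\alpha,\beta)}$ satisfies

$$\left(1 - x^2\right) y''(x) + \left[ \beta - \alpha - (\alpha+\beta+2)\, x \right] y'(x) + n(n+\alpha+\beta+1)\, y(x) = 0, \quad x \in \,]-1,1[\,.$$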
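The recurrence of Theorem 3 translates directly into a short evaluation routine. The following is a minimal sketch, not taken from the chapter; the function names `jacobi` and `binom` are hypothetical. It evaluates $P_n^{(\alpha,\beta)}(x)$ for a scalar argument and uses the Gamma-function definition of the binomial coefficient from Definition 1, so that the normalization $P_n^{(\alpha,\beta)}(1) = \binom{n+\alpha}{n}$ can be checked.

```python
import math


def jacobi(n, alpha, beta, x):
    """Evaluate the Jacobi polynomial P_n^{(alpha,beta)}(x) at a scalar x
    with the three-term recurrence of Theorem 3 (alpha, beta > -1)."""
    p_prev = 1.0  # P_0
    if n == 0:
        return p_prev
    p_curr = 0.5 * (alpha + beta + 2.0) * x + 0.5 * (alpha - beta)  # P_1
    for k in range(2, n + 1):
        s = 2.0 * k + alpha + beta
        lhs = 2.0 * k * (k + alpha + beta) * (s - 2.0)
        a = (s - 1.0) * (s * (s - 2.0) * x + alpha**2 - beta**2)
        b = 2.0 * (k + alpha - 1.0) * (k + beta - 1.0) * s
        p_prev, p_curr = p_curr, (a * p_curr - b * p_prev) / lhs
    return p_curr


def binom(r, s):
    """Generalized binomial coefficient via the Gamma function (Definition 1)."""
    return math.gamma(r + 1.0) / (math.gamma(s + 1.0) * math.gamma(r - s + 1.0))


if __name__ == "__main__":
    # Legendre case alpha = beta = 0: P_2(x) = (3 x^2 - 1) / 2
    print(jacobi(2, 0.0, 0.0, 0.3), 0.5 * (3 * 0.3**2 - 1))
    # Normalization at x = 1
    print(jacobi(3, 1.5, 0.5, 1.0), binom(4.5, 3))
```

Each evaluation costs $O(n)$ operations and needs only the two previous values, which is why the recurrence is preferred over explicit coefficient formulas in numerical work.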


2.2 On the 2-Sphere

The theory of the orthonormal systems of spherical harmonics in $L^2(\Omega)$, where $\Omega$ is the unit sphere in $\mathbb{R}^3$, is also a well-known mathematical subject. For this reason, it will be kept short here. For further details, see Müller (1966), Freeden et al. (1998), Freeden and Schreiner (2009), and Michel (2013).

Definition 2. The space of the restrictions of all homogeneous harmonic polynomials of degree $n$ on $\mathbb{R}^3$ to $\Omega$ is denoted by $\mathrm{Harm}_n(\Omega)$. Its elements are called the spherical harmonics.

Note that a function $F: \mathbb{R}^3 \to \mathbb{R}$ is called harmonic, if it is an element of the null-space of the Laplacian, i.e., $\Delta_x F(x) := \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} \right) F(x) = 0$ for all $x \in \mathbb{R}^3$. Furthermore, a polynomial $P: \mathbb{R}^3 \to \mathbb{R}$ is called homogeneous of degree $n$ if $P(\lambda x) = \lambda^n P(x)$ for all $\lambda \in \mathbb{R}$ and all $x \in \mathbb{R}^3$.

Theorem 6. The Laplace-Beltrami operator $\Delta^*: C^{(2)}(\Omega) \to C(\Omega)$, which represents the angular part of the Laplacian $\Delta$ in the sense that

$$\Delta_x G(x) = \left( \frac{\partial^2}{\partial r^2} + \frac{2}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \Delta^*_\xi \right) G(r\xi), \quad x = r\xi,\; r \in \mathbb{R}^+,\; \xi \in \Omega,$$

$G \in C^{(2)}(\mathbb{R}^3 \setminus \{0\})$, has the eigenvalues $-n(n+1)$, $n \in \mathbb{N}_0$. The eigenspace corresponding to the eigenvalue $-n(n+1)$ is $\mathrm{Harm}_n(\Omega)$.

Theorem 6 obviously yields an equivalent way of defining spherical harmonics.

Theorem 7. For every $n \in \mathbb{N}_0$, $\dim \mathrm{Harm}_n(\Omega) = 2n+1$.

Definition 3. For every fixed $n \in \mathbb{N}_0$, the system $\{Y_{n,j} \mid j = 1, \ldots, 2n+1\}$ represents an arbitrary but fixed choice of a complete orthonormal system in $(\mathrm{Harm}_n(\Omega), \langle \cdot, \cdot \rangle_{L^2(\Omega)})$. The index $n$ is called the degree of $Y_{n,j}$ and $j$ is called the order of $Y_{n,j}$.

Theorem 8. Taking all systems in Definition 3, i.e., taking the set of functions $\{Y_{n,j} \mid n \in \mathbb{N}_0;\; j = 1, \ldots, 2n+1\}$, yields a complete orthonormal system in $L^2(\Omega)$.

Theorem 9 (Addition Theorem for Spherical Harmonics). For every $n \in \mathbb{N}_0$ and all $\xi, \eta \in \Omega$,

$$\sum_{j=1}^{2n+1} Y_{n,j}(\xi)\, Y_{n,j}(\eta) = \frac{2n+1}{4\pi}\, P_n(\xi \cdot \eta),$$

where $P_n$ is the Legendre polynomial of degree $n$ (see Definition 1).

Theorem 10. For every $n \in \mathbb{N}_0$ and all $j = 1, \ldots, 2n+1$, the norm estimate $\|Y_{n,j}\|_{C(\Omega)} \leq \sqrt{\frac{2n+1}{4\pi}}$ is valid.
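Theorems 9 and 10 can be checked numerically for a fixed degree. The sketch below is an illustration only: it uses one explicit real-valued orthonormal basis of $\mathrm{Harm}_2(\Omega)$ written in Cartesian coordinates, which is an assumed choice (Definition 3 allows any orthonormal basis), and the function names are hypothetical. It compares the sum over the five basis functions with $\frac{5}{4\pi} P_2(\xi \cdot \eta)$.

```python
import numpy as np


def real_sph_harm_deg2(xi):
    """One orthonormal basis of Harm_2 (assumed explicit real-valued choice),
    evaluated at a unit vector xi = (x, y, z)."""
    x, y, z = xi
    c = np.sqrt(15.0 / (4.0 * np.pi))
    return np.array([
        c * x * y,
        c * y * z,
        np.sqrt(5.0 / (16.0 * np.pi)) * (3.0 * z * z - 1.0),
        c * x * z,
        0.5 * c * (x * x - y * y),
    ])


def unit(v):
    """Normalize a vector to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


def addition_theorem_gap(xi, eta):
    """Absolute difference between both sides of Theorem 9 for n = 2."""
    lhs = float(real_sph_harm_deg2(xi) @ real_sph_harm_deg2(eta))
    t = float(np.dot(xi, eta))
    rhs = 5.0 / (4.0 * np.pi) * 0.5 * (3.0 * t * t - 1.0)
    return abs(lhs - rhs)


if __name__ == "__main__":
    xi = unit([0.3, -0.5, 0.8])
    eta = unit([-0.2, 0.9, 0.1])
    print("gap in the addition theorem:", addition_theorem_gap(xi, eta))
    print("max |Y_2,j(xi)|:", np.abs(real_sph_harm_deg2(xi)).max(),
          "bound:", np.sqrt(5.0 / (4.0 * np.pi)))
```

Setting $\xi = \eta$ in the addition theorem gives $\sum_j Y_{n,j}(\xi)^2 = \frac{2n+1}{4\pi}$, from which the estimate of Theorem 10 follows immediately.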


The scalar basis system can be used to construct a vectorial basis system on the sphere. For this purpose, one uses the operators

$$o^{(1)}_\xi F(\xi) := \xi\, F(\xi), \quad \xi \in \Omega,\; F \in C(\Omega),$$

$$o^{(2)}_\xi F(\xi) := \nabla^*_\xi F(\xi), \quad \xi \in \Omega,\; F \in C^{(1)}(\Omega),$$

$$o^{(3)}_\xi F(\xi) := L^*_\xi F(\xi) := \xi \wedge \nabla^*_\xi F(\xi), \quad \xi \in \Omega,\; F \in C^{(1)}(\Omega),$$

where $\nabla^*: C^{(1)}(\Omega) \to C(\Omega, \mathbb{R}^3)$ is the angular part of the gradient in the sense that

$$\nabla_x G(x) = \xi\, \frac{\partial}{\partial r} G(r\xi) + \frac{1}{r}\, \nabla^*_\xi G(r\xi), \quad x = r\xi,\; r \in \mathbb{R}^+,\; \xi \in \Omega,$$

for all $G \in C^{(1)}(\mathbb{R}^3 \setminus \{0\})$.

Definition 4. Let $\{Y_{n,j} \mid j = 1, \ldots, 2n+1\}$ be a system in $\mathrm{Harm}_n(\Omega)$ in the sense of Definition 3. Then we define

$$y^{(i)}_{n,j}(\xi) := \left( \mu^{(i)}_n \right)^{-1/2} o^{(i)}_\xi Y_{n,j}(\xi), \quad \xi \in \Omega,$$

for all $n \geq 0_i$ (with $0_1 := 0$ and $0_2 := 0_3 := 1$); $j = 1, \ldots, 2n+1$; and $i = 1, 2, 3$, where

$$\mu^{(i)}_n := \begin{cases} 1, & i = 1, \\ n(n+1), & i > 1. \end{cases}$$

Every element of $\operatorname{span}\{y^{(i)}_{n,j} \mid j = 1, \ldots, 2n+1\} =: \mathrm{harm}^{(i)}_n(\Omega)$ is called a vector spherical harmonic of degree $n$ and type $i$.

Theorem 11. Taking all systems in Definition 4, i.e., taking the set of functions $\{y^{(i)}_{n,j} \mid i = 1, 2, 3;\; n \geq 0_i;\; j = 1, \ldots, 2n+1\}$, yields a complete orthonormal system in $L^2(\Omega, \mathbb{R}^3)$, i.e., in particular,

$$\int_\Omega y^{(i)}_{n,j}(\xi) \cdot y^{(l)}_{m,k}(\xi)\, d\omega(\xi) = \delta_{nm}\, \delta_{jk}\, \delta_{il},$$

where $\delta_{ab}$ is the Kronecker delta which is 1, if $a = b$, and 0, if $a \neq b$, and “$\cdot$” is the usual Euclidean inner product (dot product) in $\mathbb{R}^3$.

Furthermore, a set of nine operators can be defined to construct an orthonormal basis of $L^2(\Omega, \mathbb{R}^{3 \times 3})$ to expand tensorial functions on the sphere. For further details, see Freeden et al. (1998, Chapter 14).


2.3 On the 3d-Ball

In the following, $B$ denotes the ball in $\mathbb{R}^3$ with the center at 0 and the radius $\beta > 0$. Two orthonormal bases using a radial-angular separation are known for $L^2(B)$. They will be called here the types I and II. Type I is due to Dufour (1977) and Ballani et al. (1993) (see also Michel 1999 and Freeden and Michel 2004) and type II is due to Tscherning (1996). A more abstract discussion of orthogonal functions on the ball can be found in Dunkl and Xu (2001).

Theorem 12. The following systems are complete orthonormal bases in the Hilbert space $L^2(B)$:

$$G^{\mathrm{I}}_{m,n,j}(x) := \sqrt{\frac{4m+2n+3}{\beta^3}}\; P_m^{(0,\,n+\frac{1}{2})}\!\left( 2\frac{|x|^2}{\beta^2} - 1 \right) \left( \frac{|x|}{\beta} \right)^{n} Y_{n,j}\!\left( \frac{x}{|x|} \right), \tag{1}$$

$x \in B$; $m, n \in \mathbb{N}_0$; $j \in \{1, \ldots, 2n+1\}$;

$$G^{\mathrm{II}}_{m,n,j}(x) := \sqrt{\frac{2m+3}{\beta^3}}\; P_m^{(0,2)}\!\left( 2\frac{|x|}{\beta} - 1 \right) Y_{n,j}\!\left( \frac{x}{|x|} \right), \tag{2}$$

$x \in B \setminus \{0\}$; $m, n \in \mathbb{N}_0$; $j \in \{1, \ldots, 2n+1\}$.

In the following, the upper reference “I” or “II” is omitted, if the statement is valid for both systems.

Proof. Both systems are of the form $G_{m,n,j}(x) = F_{m,n}(|x|)\, Y_{n,j}\!\left( \frac{x}{|x|} \right)$. Therefore, the following consideration

$$\langle G_{m,n,j}, G_{\mu,\nu,\iota} \rangle_{L^2(B)} = \int_0^\beta r^2 F_{m,n}(r)\, F_{\mu,\nu}(r)\, dr \int_\Omega Y_{n,j}(\xi)\, Y_{\nu,\iota}(\xi)\, d\omega(\xi) = \int_0^\beta r^2 F_{m,n}(r)\, F_{\mu,\nu}(r)\, dr\; \delta_{n\nu}\, \delta_{j\iota}$$

shows that only the case $m \neq \mu$, $n = \nu$, $j = \iota$ requires further investigations. The substitution $r := \beta \sqrt{\frac{t+1}{2}}$ in the case of type I gives

$$\int_0^\beta r^2 F_{m,n}(r)\, F_{\mu,n}(r)\, dr = \int_{-1}^{1} \frac{\beta^2 (t+1)}{2}\, F_{m,n}\!\left( \beta \sqrt{\frac{t+1}{2}} \right) F_{\mu,n}\!\left( \beta \sqrt{\frac{t+1}{2}} \right) \frac{\beta}{\sqrt{8(t+1)}}\, dt.$$

Setting $F_{m,n}(r) := \left( \frac{r}{\beta} \right)^{n} \tilde{F}_{m,n}\!\left( 2\frac{r^2}{\beta^2} - 1 \right)$ for $r \in [0,\beta]$, one obtains

$$\int_0^\beta r^2 F_{m,n}(r)\, F_{\mu,n}(r)\, dr = \frac{\beta^3}{2^{\,n+\frac{5}{2}}} \int_{-1}^{1} (t+1)^{n+\frac{1}{2}}\, \tilde{F}_{m,n}(t)\, \tilde{F}_{\mu,n}(t)\, dt.$$

Obviously, the orthogonality requirement is achieved, if $\tilde{F}_{m,n}$ is chosen as $c\, P_m^{(0,\,n+\frac{1}{2})}$, $c \in \mathbb{R}$ constant (see Definition 1). In the case of type II, the substitution $r := \frac{\beta}{2}(t+1)$ yields

$$\int_0^\beta r^2 F_{m,n}(r)\, F_{\mu,n}(r)\, dr = \int_{-1}^{1} \frac{\beta^2}{4} (t+1)^2\, F_{m,n}\!\left( \frac{\beta}{2}(t+1) \right) F_{\mu,n}\!\left( \frac{\beta}{2}(t+1) \right) \frac{\beta}{2}\, dt,$$

which, obviously, vanishes for $m \neq \mu$, if $F_{m,n}(r) = c\, P_m^{(0,2)}\!\left( 2\frac{r}{\beta} - 1 \right)$, $r \in \,]0,\beta]$, $c \in \mathbb{R}$ constant, is chosen for all $m, n \in \mathbb{N}_0$. To show that the norms of all functions are 1, one has to use Theorem 4, which yields

$$\int_0^\beta r^2 \left( F^{\mathrm{I}}_{m,n}(r) \right)^2 dr = c_{\mathrm{I}}^2\, \frac{\beta^3}{2^{\,n+\frac{5}{2}}}\, \frac{2^{\,n+\frac{3}{2}}}{2m+n+\frac{3}{2}} = c_{\mathrm{I}}^2\, \frac{\beta^3}{4m+2n+3}$$

and

$$\int_0^\beta r^2 \left( F^{\mathrm{II}}_{m,n}(r) \right)^2 dr = c_{\mathrm{II}}^2\, \frac{\beta^3}{8}\, \frac{2^3}{2m+3} = c_{\mathrm{II}}^2\, \frac{\beta^3}{2m+3}$$

for all $n, m \in \mathbb{N}_0$, where the superscripts “I” and “II” refer to the chosen type. Hence, the functions in (1) and (2) are orthonormal. Finally, for proving the completeness, consider $f \in L^2(B)$ with the property $\langle f, G_{m,n,j} \rangle_{L^2(B)} = 0$ for all $m, n, j$. This implies that

$$\int_0^\beta r^2 F_{m,n}(r) \int_\Omega Y_{n,j}(\xi)\, f(r\xi)\, d\omega(\xi)\, dr = 0 \quad \forall\, m \in \mathbb{N}_0$$

for every fixed pair $(n,j)$. Due to Theorem 2, this yields that each function $r \mapsto \int_\Omega Y_{n,j}(\xi)\, f(r\xi)\, d\omega(\xi)$ vanishes almost everywhere on $[0,\beta]$. Finally, Theorem 8 yields that $f = 0$ in the sense of $L^2(B)$.

The orthogonal functions can be considered as analogues of the 1D orthogonal polynomials and the spherical harmonics. This is supported by the following theorem, which shows an analogy to Theorems 5 and 6 in the sense that the orthogonal functions on the ball are also eigenfunctions of a differential operator.

Theorem 13. Let the differential operators

$$D_r^{\mathrm{I},n} := \left( \beta^2 - r^2 \right) \frac{d^2}{dr^2} + 2 \left( \frac{\beta^2}{r} - 2r \right) \frac{d}{dr} - n(n+1)\, \frac{\beta^2}{r^2},$$

$$D_r^{\mathrm{II}} := r(\beta - r)\, \frac{d^2}{dr^2} + (3\beta - 4r)\, \frac{d}{dr}$$

be defined on $C^{(2)}(]0,\beta])$, where $n \in \mathbb{N}_0$. Then the differential operators

$$\Delta^{\mathrm{I},n}_x := D^{\mathrm{I},n}_{|x|}\, \Delta^*_{x/|x|}, \qquad \Delta^{\mathrm{II}}_x := D^{\mathrm{II}}_{|x|}\, \Delta^*_{x/|x|}$$

on $C^{(4)}(B \setminus \{0\})$ satisfy

$$\Delta^{\mathrm{I},n}_x\, G^{\mathrm{I}}_{m,n,j}(x) = \left[ n(n+3) + 4m \left( m + n + \frac{3}{2} \right) \right] n(n+1)\, G^{\mathrm{I}}_{m,n,j}(x),$$

$$\Delta^{\mathrm{II}}_x\, G^{\mathrm{II}}_{m,n,j}(x) = m(m+3)\, n(n+1)\, G^{\mathrm{II}}_{m,n,j}(x),$$

$x \in B \setminus \{0\}$, for all $m, n \in \mathbb{N}_0$ and all $j = 1, \ldots, 2n+1$.

The proof uses Theorems 5 and 6 in combination with lengthy but elementary calculations; see Akram et al. (2011) and Michel (2013) for further details.

Note that both systems have advantages and disadvantages. The advantage of type I in comparison to type II is that every function is a polynomial due to the factor $|x|^n$. In particular, every $G^{\mathrm{I}}_{m,n,j}$ is well defined in $x = 0$. However, $G^{\mathrm{II}}_{m,n,j}$ is, in general, not defined in $x = 0$ for $n > 0$ since in this case a limit $x \to 0$ cannot be calculated for the angular part and (see, e.g., Szegö 1939, p. 59) $P_m^{(0,2)}(-1) = (-1)^m \binom{m+2}{m} \neq 0$ for all $m \in \mathbb{N}_0$. For this reason, we set the values of $G^{\mathrm{II}}_{m,n,j}$ at $x = 0$ to an arbitrarily chosen value (see also Amirbekyan 2007 and Amirbekyan and Michel 2008).

Definition 5. We set $G^{\mathrm{II}}_{m,n,j}(0) := \sqrt{\frac{3}{4\pi\beta^3}}$ and $F^{\mathrm{II}}_{m,n}(0) := \sqrt{\frac{3}{\beta^3}}$ for all $m \in \mathbb{N}_0$, $n \in \mathbb{N}$, $j \in \{1, \ldots, 2n+1\}$.

On the other hand, the radial part and the angular part of type II are completely decoupled. Hence, essentially less memory is needed for an implementation in comparison to type I. Moreover, series involving coefficients that can be split into factors depending only on m and factors depending only on .n; j / can be calculated as a product of a series in Jacobi polynomials and a series in spherical harmonics. This is, in particular, important for the implementation of product series, which will be studied in the next section.
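The orthonormality of the radial parts $F_{m,n}$ with respect to the weight $r^2$ on $[0,\beta]$, which is the core of the proof of Theorem 12, can be verified numerically. The following is a minimal sketch, not part of the chapter; all function names are hypothetical, and the ball radius is passed as the parameter `radius` to avoid a clash with the Jacobi parameter $\beta$. Since the integrands are polynomials in $r$, Gauss-Legendre quadrature with sufficiently many nodes integrates them exactly up to rounding.

```python
import numpy as np


def jacobi(n, alpha, beta, x):
    """Jacobi polynomial P_n^{(alpha,beta)} via the recurrence of Theorem 3,
    vectorized over a NumPy array x."""
    x = np.asarray(x, dtype=float)
    p_prev = np.ones_like(x)
    if n == 0:
        return p_prev
    p_curr = 0.5 * (alpha + beta + 2.0) * x + 0.5 * (alpha - beta)
    for k in range(2, n + 1):
        s = 2.0 * k + alpha + beta
        lhs = 2.0 * k * (k + alpha + beta) * (s - 2.0)
        a = (s - 1.0) * (s * (s - 2.0) * x + alpha**2 - beta**2)
        b = 2.0 * (k + alpha - 1.0) * (k + beta - 1.0) * s
        p_prev, p_curr = p_curr, (a * p_curr - b * p_prev) / lhs
    return p_curr


def radial_type_one(m, n, r, radius):
    """Radial part of G^I_{m,n,j} from Eq. (1)."""
    scale = np.sqrt((4 * m + 2 * n + 3) / radius**3)
    return scale * jacobi(m, 0.0, n + 0.5, 2.0 * r**2 / radius**2 - 1.0) * (r / radius)**n


def radial_type_two(m, r, radius):
    """Radial part of G^II_{m,n,j} from Eq. (2); it does not depend on n."""
    scale = np.sqrt((2 * m + 3) / radius**3)
    return scale * jacobi(m, 0.0, 2.0, 2.0 * r / radius - 1.0)


def radial_gram(radial, m_max, radius, nodes=64):
    """Gram matrix of radial(0, r), ..., radial(m_max, r) with respect to the
    weight r^2 on [0, radius], computed with Gauss-Legendre quadrature."""
    t, w = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * radius * (t + 1.0)
    weights = 0.5 * radius * w * r**2
    values = np.array([radial(m, r) for m in range(m_max + 1)])
    return (values * weights) @ values.T


if __name__ == "__main__":
    radius, n = 2.0, 3
    gram_one = radial_gram(lambda m, r: radial_type_one(m, n, r, radius), 4, radius)
    gram_two = radial_gram(lambda m, r: radial_type_two(m, r, radius), 4, radius)
    print(np.round(gram_one, 10))
    print(np.round(gram_two, 10))
```

The sketch also makes the memory argument above concrete: `radial_type_two` needs one table of values per $m$, whereas `radial_type_one` needs one per pair $(m,n)$.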

3 Product Series and Reproducing Kernels

3.1 Product Series

For further proceeding, the definition of a product series is needed (see also Michel 2005a, Amirbekyan 2007, Amirbekyan and Michel 2008, Berkel 2009, and Berkel and Michel 2010).

Definition 6. A kernel $K: B \times B \to \mathbb{R}$ of the form

$$K(x,y) = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} K^\wedge(m,n)\, G_{m,n,j}(x)\, G_{m,n,j}(y); \quad x, y \in B; \tag{3}$$

is called a product series, provided that every $G_{m,n,j}$ belongs to the same type (I or II). A product series is called separable, if it has the form

$$K(x,y) = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} a_m\, b_n\, G^{\mathrm{II}}_{m,n,j}(x)\, G^{\mathrm{II}}_{m,n,j}(y); \quad x, y \in B. \tag{4}$$

Note that the coefficients must not depend on the order $j$ of the spherical harmonics. Therefore, the addition theorem (Theorem 9) yields

$$K(x,y) = \sum_{m,n=0}^{\infty} K^\wedge(m,n)\, F_{m,n}(|x|)\, F_{m,n}(|y|)\, \frac{2n+1}{4\pi}\, P_n\!\left( \frac{x}{|x|} \cdot \frac{y}{|y|} \right); \tag{5}$$

$x, y \in B \setminus \{0\}$; and, in the particular case of a separable product series,

$$K(x,y) = \left( \sum_{m=0}^{\infty} a_m\, \frac{2m+3}{\beta^3}\, P_m^{(0,2)}\!\left( 2\frac{|x|}{\beta} - 1 \right) P_m^{(0,2)}\!\left( 2\frac{|y|}{\beta} - 1 \right) \right) \left( \sum_{n=0}^{\infty} b_n\, \frac{2n+1}{4\pi}\, P_n\!\left( \frac{x}{|x|} \cdot \frac{y}{|y|} \right) \right); \quad x, y \in B \setminus \{0\}.$$
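The computational benefit of a separable product series can be sketched as follows. This is an illustration under assumptions that are not part of the chapter: the series are truncated to finitely many coefficients, and the coefficient sequences `a` and `b` as well as all function names are hypothetical. The separable form needs one radial sum and one angular sum, whereas evaluating (5) directly with $K^\wedge(m,n) = a_m b_n$ needs a double sum; both give the same value.

```python
import math


def jacobi(n, alpha, beta, x):
    """Jacobi polynomial P_n^{(alpha,beta)}(x), scalar x, via Theorem 3."""
    p_prev = 1.0
    if n == 0:
        return p_prev
    p_curr = 0.5 * (alpha + beta + 2.0) * x + 0.5 * (alpha - beta)
    for k in range(2, n + 1):
        s = 2.0 * k + alpha + beta
        lhs = 2.0 * k * (k + alpha + beta) * (s - 2.0)
        a = (s - 1.0) * (s * (s - 2.0) * x + alpha**2 - beta**2)
        b = 2.0 * (k + alpha - 1.0) * (k + beta - 1.0) * s
        p_prev, p_curr = p_curr, (a * p_curr - b * p_prev) / lhs
    return p_curr


def _polar(x, y):
    """Return |x|, |y| and the cosine of the angle between x and y."""
    rx = math.sqrt(sum(c * c for c in x))
    ry = math.sqrt(sum(c * c for c in y))
    t = sum(p * q for p, q in zip(x, y)) / (rx * ry)
    return rx, ry, t


def separable_kernel(x, y, a, b, radius):
    """Truncated separable product series: (radial sum) * (angular sum)."""
    rx, ry, t = _polar(x, y)
    radial = sum(
        a_m * (2 * m + 3) / radius**3
        * jacobi(m, 0.0, 2.0, 2.0 * rx / radius - 1.0)
        * jacobi(m, 0.0, 2.0, 2.0 * ry / radius - 1.0)
        for m, a_m in enumerate(a)
    )
    angular = sum(
        b_n * (2 * n + 1) / (4.0 * math.pi) * jacobi(n, 0.0, 0.0, t)
        for n, b_n in enumerate(b)
    )
    return radial * angular


def double_sum_kernel(x, y, a, b, radius):
    """The same truncated kernel evaluated term by term as in Eq. (5)."""
    rx, ry, t = _polar(x, y)
    total = 0.0
    for m, a_m in enumerate(a):
        scale = math.sqrt((2 * m + 3) / radius**3)
        f_x = scale * jacobi(m, 0.0, 2.0, 2.0 * rx / radius - 1.0)
        f_y = scale * jacobi(m, 0.0, 2.0, 2.0 * ry / radius - 1.0)
        for n, b_n in enumerate(b):
            total += a_m * b_n * f_x * f_y * (2 * n + 1) / (4.0 * math.pi) * jacobi(n, 0.0, 0.0, t)
    return total


if __name__ == "__main__":
    a = [0.5**m for m in range(8)]   # hypothetical radial coefficients
    b = [0.7**n for n in range(10)]  # hypothetical angular coefficients
    x, y = (0.3, -0.2, 0.5), (-0.1, 0.4, 0.2)
    print(separable_kernel(x, y, a, b, 1.0), double_sum_kernel(x, y, a, b, 1.0))
```

With $M$ radial and $N$ angular terms, the separable evaluation costs $O(M + N)$ polynomial evaluations instead of $O(MN)$, which is the practical reason for preferring type II in product series.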

An important question is certainly concerned with the convergence of the series. Some answers are given by the following theorem (see also Freeden and Michel 2004, pp. 458–462).

Theorem 14. The series in (3) converges in $L^2(B \times B)$, if and only if

$$\sum_{m,n=0}^{\infty} n \left( K^\wedge(m,n) \right)^2 < +\infty \tag{6}$$

and uniformly on $B \times B$, if

$$\sum_{m,n=0}^{\infty} \left| K^\wedge(m,n) \right| n\,(2m+n)\, \frac{\left( n + m + \frac{1}{2} \right)^{2m}}{(m!)^2} < +\infty \quad \text{in the case of type I}$$

and

$$\sum_{m,n=0}^{\infty} \left| K^\wedge(m,n) \right| n\, m^5 < +\infty \quad \text{in the case of type II},$$

respectively.

Proof. It is an easy task to verify that the application of Fubini’s Theorem yields the $L^2(B \times B)$-orthonormality (but not the completeness!) of the functions $\{ B \times B \ni (x,y) \mapsto G_{m,n,j}(x)\, G_{m,n,j}(y) \}_{m,n \in \mathbb{N}_0;\; j = 1, \ldots, 2n+1}$. This system could (due to Zorn’s Lemma) be theoretically completed to an orthonormal basis. Hence, the Parseval identity in $L^2(B \times B)$ yields the equivalent condition $\sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} \left( K^\wedge(m,n) \right)^2 < +\infty$ for the $L^2(B \times B)$-convergence. This condition is valid, if and only if (6) is satisfied.

For proving the uniform convergence, consider (5). A sufficient condition is, obviously, $\sum_{m,n=0}^{\infty} |K^\wedge(m,n)|\, \|F_{m,n}\|^2_{C[0,\beta]}\; n\, \|P_n\|_{C[-1,1]} < +\infty$. Following Theorems 4 and 12 as well as Definition 5, one gets

$$\left\| F^{\mathrm{I}}_{m,n} \right\|_{C[0,\beta]} = \sqrt{\frac{4m+2n+3}{\beta^3}} \binom{m+n+\frac{1}{2}}{m}, \qquad \left\| F^{\mathrm{II}}_{m,n} \right\|_{C[0,\beta]} = \sqrt{\frac{2m+3}{\beta^3}} \binom{m+2}{m},$$

as well as $\|P_n\|_{C[-1,1]} = 1$ for all $n, m \in \mathbb{N}_0$. The criteria can be simplified by taking into account that

$$\binom{m+n+\frac{1}{2}}{m} = \frac{\Gamma\!\left( m+n+\frac{3}{2} \right)}{m!\; \Gamma\!\left( n+\frac{3}{2} \right)} = \frac{\Gamma\!\left( n+\frac{3}{2} \right)}{m!\; \Gamma\!\left( n+\frac{3}{2} \right)} \prod_{k=\frac{3}{2}}^{m+\frac{1}{2}} (n+k) \leq \frac{\left( n+m+\frac{1}{2} \right)^m}{m!}$$

and

$$\binom{m+2}{m} = \frac{(m+2)!}{m!\; 2} = \frac{(m+2)(m+1)}{2} = \frac{1}{2} \left( m^2 + 3m + 2 \right).$$

This yields the desired result.

Corollary 1. A separable product series (4) converges uniformly on $B \times B$, if $\sum_{m=0}^{\infty} |a_m|\, m^5 < +\infty$ and $\sum_{n=0}^{\infty} |b_n|\, n < +\infty$.
and strain < "ij > of a statistically uniform sample are linked by effective macroscopic moduli C  and S  that obey Hookes’s law of linear elasticity, Cijkl D < ij >< kl >1 ; ; Sijkl D < ij >< kl >1 R R where < ij >D V1 ij .r/dr, < ij >D V1 ij .r/dr, V is the volume, and the notation < : > denotes an ensemble average. The stress (r) and strain "(r) distribution in a real polycrystal vary discontinuously at the surface of grains. By replacing the real polycrystal with a “statistically uniform” sample, we are assuming that stress (r) and strain "(r) are varying slowly and continuously with position r. A number of methods are available for determining the effective macroscopic modulus of an aggregate. We make the simplifying assumption that there is no significant interaction between grains, which for fully dense polycrystalline aggregates is justified by agreement between theory and experiments for the methods we present here. However, these methods are not appropriate for aggregates that contain voids, cracks, or pores filled with liquids or gases, as the elastic contrast between the different microstructural elements will be too high and we cannot ignore elastic interactions in such cases. The classical method that takes into account grain interaction is the self-consistent method based on the Eshelby inclusion model (e.g., Eshelby 1957; Hill 1965, which can also account for the shape of the microstructural elements. The simplest and best-known averaging techniques for obtaining estimates of the effective elastic constants of polycrystals are the Voigt (1928) and Reuss (1929) averages. These averages only use the volume fraction of each phase, the orientation, and the elastic constants of the single crystals or grains. In terms of statistical probability functions, these are first-order bounds, as only the first-order correlation function is used, which is the volume fraction. Note that no information about the shape or position of neighboring grains is used. 
The Voigt average is found by simply assuming that the strain field is everywhere constant (i.e., $\varepsilon(r)$ is independent of $r$) and hence the strain is equal to its mean value in each grain. The strain at every position is set equal to the macroscopic strain of the sample. $C^*$ is then estimated by a volume average of local stiffnesses $C(g_i)$ with orientation $g_i$ and volume fraction $V_i$,

$$C^* \approx C^{\mathrm{Voigt}} = \Big[ \sum_i V_i\, C(g_i) \Big].$$

The Reuss average is found by assuming that the stress field is everywhere constant. The stress at every position is set equal to the macroscopic stress of the sample. $C^*$ or $S^*$ is then estimated by the volume average of local compliances $S(g_i)$,

$$C^* \approx C^{\mathrm{Reuss}} = \Big[ \sum_i V_i\, S(g_i) \Big]^{-1}, \qquad S^* \approx S^{\mathrm{Reuss}} = \Big[ \sum_i V_i\, S(g_i) \Big],$$

and $C^{\mathrm{Voigt}} \neq C^{\mathrm{Reuss}}$ and $C^{\mathrm{Voigt}} \neq [S^{\mathrm{Reuss}}]^{-1}$. These two estimates are not equal for anisotropic solids, with the Voigt being an upper bound and the Reuss a lower bound. A physical estimate of the moduli should lie between the Voigt and Reuss average bounds, as the stress and strain distributions are expected to be somewhere between uniform strain (Voigt bound) and uniform stress (Reuss bound). Hill (1952) observed that the arithmetic mean (and the geometric mean) of the Voigt and Reuss bounds, sometimes called the Hill or Voigt-Reuss-Hill (VRH) average, is often close to experimental values. The VRH average has no theoretical justification. As it is much easier to calculate the arithmetic mean of the Voigt and Reuss elastic tensors, all authors have tended to apply the Hill average as an arithmetic mean. In Earth sciences, the Voigt, Reuss, and Hill averages have been widely used for averages of oriented polyphase rocks (e.g., Crosson and Lin 1971). Although the Voigt and Reuss bounds are often far apart for anisotropic materials, they still provide the limits within which the experimental data should be found.

Several authors have searched for a geometric mean of oriented polycrystals using the exponent of the average of the natural logarithms of the eigenvalues of the stiffness matrix (Matthies and Humbert 1993). Their choice of this averaging procedure was guided by the fact that the ensemble average elastic stiffness $\langle C \rangle$ should equal the inverse of the ensemble average elastic compliance, $\langle S \rangle^{-1}$, which is not true, for example, of the Voigt and Reuss estimates. A method of determining the geometric mean for arbitrary orientation distributions has been developed (Matthies and Humbert 1993). The method derives from the fact that a stable elastic solid must have an elastic strain energy that is positive. It follows from this that the eigenvalues of the elastic matrix must all be positive.
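The Voigt, Reuss, and Hill estimates described above can be sketched numerically as follows. This is a minimal illustration, assuming that every local stiffness $C(g_i)$ is already expressed in sample coordinates as a 6x6 matrix in Voigt notation; the function names and the two isotropic phases with made-up moduli are hypothetical and are not taken from the chapter.

```python
import numpy as np

def isotropic_stiffness(lam, mu):
    """6x6 Voigt-notation stiffness of an isotropic phase (Lame parameters lam, mu)."""
    C = np.zeros((6, 6))
    C[:3, :3] = lam
    C[np.arange(3), np.arange(3)] += 2.0 * mu   # C11 = C22 = C33 = lam + 2 mu
    C[np.arange(3, 6), np.arange(3, 6)] = mu    # C44 = C55 = C66 = mu
    return C

def voigt_reuss_hill(stiffnesses, fractions):
    """Return (C_voigt, C_reuss, C_hill) from local stiffnesses and volume fractions."""
    C = np.asarray(stiffnesses, dtype=float)    # shape (n, 6, 6)
    v = np.asarray(fractions, dtype=float)
    v = v / v.sum()                             # normalise the volume fractions
    C_voigt = np.einsum("n,nij->ij", v, C)      # uniform strain: average the stiffnesses
    S_reuss = np.einsum("n,nij->ij", v, np.linalg.inv(C))  # uniform stress: average the compliances
    C_reuss = np.linalg.inv(S_reuss)
    C_hill = 0.5 * (C_voigt + C_reuss)          # arithmetic (Hill) mean of the two bounds
    return C_voigt, C_reuss, C_hill

# Hypothetical two-phase aggregate, moduli in GPa (illustrative values only).
phases = [isotropic_stiffness(50.0, 30.0), isotropic_stiffness(100.0, 80.0)]
C_v, C_r, C_h = voigt_reuss_hill(phases, [0.5, 0.5])
```

In Voigt notation the 6x6 compliance matrix is the matrix inverse of the 6x6 stiffness matrix, which is why a plain `np.linalg.inv` is sufficient here.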
Comparison between Voigt, Reuss, Hill, and self-consistent estimates shows that the geometric mean provides estimates very close to the self-consistent method but at considerably reduced computational complexity (Matthies and Humbert 1993). The condition that the macroscopic polycrystal elastic stiffness $\langle C \rangle$ must equal the inverse of the aggregate elastic compliance, $\langle S \rangle^{-1}$, would appear to be a powerful physical constraint on the averaging method (Matthies and Humbert 1993). However, the arithmetic (Hill) and geometric means are also very similar (Mainprice and Humbert 1994), which tends to suggest that they are just mean estimates with no additional physical significance.


The wide separation between the Voigt and Reuss bounds for anisotropic materials arises because the microstructure is not fully described by such averages. However, although these methods do not take into account such basic information as the position or the shape of grains, several studies have shown that the Voigt and Hill averages are within 5–10 % of experimental values for crystalline rocks. For example, Barruol and Kern (1996) showed for several anisotropic lower-crust and upper-mantle rocks from the Ivrea zone in Italy that the Voigt average is within 5 % of the experimentally measured velocity.

7.2 Properties of Polycrystalline Aggregates with Texture

The orientation of crystals in a polycrystal can be measured by volume diffraction techniques (e.g., X-ray or neutron diffraction) or individual orientation measurements (e.g., U-stage and optical microscope, electron channeling, or electron backscattered diffraction (EBSD)). In addition, numerical simulations of polycrystalline plasticity also produce populations of crystal orientations at mantle conditions (e.g., Tommasi et al. 2004). An orientation, often given the letter $g$, of a grain or crystal in sample coordinates can be described by the rotation matrix between crystal and sample coordinates. In practice, it is convenient to describe the rotation by a triplet of Euler angles, e.g., $g = (\varphi_1, \Phi, \varphi_2)$ by Bunge (1982). One should be aware that there are many different definitions of Euler angles that are used in the physical sciences. The orientation distribution function (O.D.F.) $f(g)$ is defined as the volume fraction of orientations, with an orientation in the interval between $g$ and $g + dg$ in a space containing all possible orientations, given by

$$\frac{\Delta V}{V} = \int f(g)\,dg,$$

where $\Delta V / V$ is the volume fraction of crystals with orientation $g$, $f(g)$ is the texture function, and $dg = \frac{1}{8\pi^2} \sin\Phi \, d\varphi_1 \, d\Phi \, d\varphi_2$ is the volume of the region of integration in orientation space.

To calculate the seismic properties of a polycrystal, one must evaluate the elastic properties of the aggregate. In the case of an aggregate with a crystallographic texture, the anisotropy of the elastic properties of the single crystal must be taken into account. A potential complication is that the Cartesian frame defined by orthogonal crystallographic directions used to report the elastic tensor of the single crystal may not be the same as the one used for the Euler angle reference frame in texture analysis (e.g., MTEX) or measurement (e.g., EBSD) packages.
To account for this difference, a rotation may be required to bring the crystallographic frame of the tensor into coincidence with the Euler angle frame,

$$C_{ijkl}(g^E) = T_{ip}\, T_{jq}\, T_{kr}\, T_{lt}\, C_{pqrt}(g^T),$$

where $C_{ijkl}(g^E)$ is the elastic property in the Euler reference frame and $C_{pqrt}(g^T)$ is the elastic property in the original tensor reference frame; both frames are in crystal coordinates. The transformation matrix $T_{ij}$ is constructed from the angles between the two sets of mutually perpendicular crystallographic axes, forming rows and columns of the orthogonal transformation or rotation matrix (see Nye 1957). For each orientation $g$, the single-crystal properties have to be rotated into the specimen coordinate frame using the orientation or rotation matrix $g_{ij}$,

$$C_{ijkl}(g) = g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E),$$


where $C_{ijkl}(g)$ is the elastic property in sample coordinates, $g_{ij} = g(\varphi_1, \Phi, \varphi_2)$ is the measured orientation in sample coordinates, and $C_{pqrt}(g^E)$ is the elastic property in crystal coordinates of the Euler frame. We can rewrite the above equation as

$$C_{ijkl}(g) = T_{ijklpqrt}(g)\, C_{pqrt}(g^E)$$

with

$$T_{ijklpqrt}(g) = \frac{\partial x_i}{\partial x_p} \frac{\partial x_j}{\partial x_q} \frac{\partial x_k}{\partial x_r} \frac{\partial x_l}{\partial x_t} = g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}.$$

The elastic properties of the polycrystal may be calculated by integration over all possible orientations of the ODF. Bunge (1982) has shown that the integration is given as

$$\langle C_{ijkl} \rangle = \int g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E)\, f(g)\,dg = \int C_{ijkl}(g)\, f(g)\,dg,$$

where $\langle C_{ijkl} \rangle$ are the elastic properties of the aggregate and $\int f(g)\,dg = 1$. The integral on SO(3) can be calculated efficiently using the numerical methods available in MTEX. We can also regroup the texture-dependent part of the integral as $\langle T_{ijklpqrt} \rangle$,

$$\langle T_{ijklpqrt} \rangle\, C_{pqrt}(g^E) = \int T_{ijklpqrt}(g)\, f(g)\,dg \; C_{pqrt}(g^E).$$

We can evaluate $\langle T_{ijklpqrt} \rangle$ analytically in terms of generalized spherical harmonic coefficients for specific crystal and sample symmetries (e.g., Ganster and Geiss 1985; Johnson and Wenk 1986; Zuo et al. 1989; Morris 2006). The minimum texture information required to calculate the elastic properties consists of the even-order coefficients of a series expansion up to order 4, which derive from the centrosymmetry and the fourth rank of the elasticity tensor, respectively. The direct consequence of this is that only a limited number of pole figures are required to define the ODF, e.g., 1 for cubic and hexagonal and 2 for tetragonal and trigonal crystal symmetries. Alternatively, elastic properties may be determined by simple summation of individual orientation measurements,

$$\langle C_{ijkl} \rangle = \sum g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E)\, V(g) = \sum C_{ijkl}(g)\, V(g),$$

where $V(g)$ is the volume fraction of grains in orientation $g$. For example, the Voigt average of the rock for $m$ mineral phases of volume fraction $V(m)$ is given as

$$\langle C_{ijkl} \rangle^{\mathrm{Voigt}} = \sum V(m)\, \langle C_{ijkl} \rangle^{m}.$$
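The rotation of the single-crystal tensor and the summation over individual orientations can be sketched as below. This is an illustration under stated assumptions: the Euler angles follow Bunge's ZXZ convention in radians, and the matrix `R` passed to `rotate_tensor` plays the role of $g_{ij}$ in the formula above. Whether that is the Bunge matrix or its transpose depends on the convention of the software that produced the orientations. All function names are hypothetical.

```python
import numpy as np

_VOIGT = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]  # Voigt index pairs

def bunge_matrix(phi1, Phi, phi2):
    """Rotation matrix for Bunge (ZXZ) Euler angles in radians (assumed convention)."""
    c1, s1 = np.cos(phi1), np.sin(phi1)
    c, s = np.cos(Phi), np.sin(Phi)
    c2, s2 = np.cos(phi2), np.sin(phi2)
    Rz1 = np.array([[c1, s1, 0.0], [-s1, c1, 0.0], [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])
    Rz2 = np.array([[c2, s2, 0.0], [-s2, c2, 0.0], [0.0, 0.0, 1.0]])
    return Rz2 @ Rx @ Rz1

def voigt_to_tensor(C6):
    """Expand a 6x6 Voigt stiffness matrix into the 3x3x3x3 tensor C_ijkl."""
    C6 = np.asarray(C6, dtype=float)
    C4 = np.zeros((3, 3, 3, 3))
    for I, (i, j) in enumerate(_VOIGT):
        for J, (k, l) in enumerate(_VOIGT):
            C4[i, j, k, l] = C4[j, i, k, l] = C4[i, j, l, k] = C4[j, i, l, k] = C6[I, J]
    return C4

def tensor_to_voigt(C4):
    """Collapse a 3x3x3x3 stiffness tensor back to its 6x6 Voigt matrix."""
    return np.array([[C4[i, j, k, l] for (k, l) in _VOIGT] for (i, j) in _VOIGT])

def rotate_tensor(C4, R):
    """C'_ijkl = R_ip R_jq R_kr R_lt C_pqrt."""
    return np.einsum("ip,jq,kr,lt,pqrt->ijkl", R, R, R, R, C4)

def voigt_average(C6_crystal, rotations, weights):
    """Weighted sum of the rotated single-crystal tensor over discrete orientations."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # volume fractions V(g)
    C4 = voigt_to_tensor(C6_crystal)
    avg = sum(wi * rotate_tensor(C4, R) for wi, R in zip(w, rotations))
    return tensor_to_voigt(avg)
```

For a polyphase rock, the same function can be applied phase by phase and the results combined with the phase volume fractions $V(m)$.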

The final step is the calculation of the three seismic phase velocities by solution of the Christoffel tensor ($T_{ik}$). The Christoffel tensor is symmetrical because of the symmetry of the elastic constants, and hence,

$$T_{ik} = C_{ijkl}\, n_j n_l = C_{jikl}\, n_j n_l = C_{ijlk}\, n_j n_l = C_{klij}\, n_j n_l = T_{ki}.$$


The Christoffel tensor is also invariant upon the change of sign of the propagation direction $n$, as the elastic tensor is not sensitive to the presence or absence of a center of symmetry, being a centrosymmetric physical property. Because the elastic strain energy $\tfrac{1}{2} C_{ijkl}\, \varepsilon_{ij}\, \varepsilon_{kl}$ of a stable crystal is always positive and real (e.g., Nye 1957), the eigenvalues of the $3 \times 3$ Christoffel tensor (being a Hermitian matrix) are three positive real values of the wave moduli $M$ corresponding to $\rho V_p^2$, $\rho V_{s1}^2$, and $\rho V_{s2}^2$ of the plane waves propagating in the direction $n$, where $\rho$ is the density. The three eigenvectors of the Christoffel tensor are the polarization directions (also called vibration, particle movement, or displacement vectors) of the three waves; as the Christoffel tensor is symmetrical, the three eigenvectors and polarization vectors are mutually perpendicular. In the most general case, there are no particular angular relationships between the polarization directions $p$ and the propagation direction $n$; however, typically the P-wave polarization direction is nearly parallel and the two S-wave polarizations are nearly perpendicular to the propagation direction, and they are termed quasi-P or quasi-S waves. If the P-wave and two S-wave polarizations are parallel and perpendicular to the propagation direction, which may happen along a symmetry direction, then the waves are termed pure P and pure S or pure modes. In general, the three waves have polarizations that are perpendicular to one another and propagate in the same direction with different velocities, with $V_p > V_{s1} > V_{s2}$.
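The eigenvalue problem for the Christoffel tensor can be sketched numerically as follows. The sketch assumes a full 3x3x3x3 stiffness tensor in Pa and a density in kg/m^3, and interprets the eigenvalues as $\rho V^2$. The helper `isotropic_tensor` and the function names are hypothetical and serve only to provide a case with a known answer.

```python
import numpy as np

def isotropic_tensor(lam, mu):
    """C_ijkl = lam d_ij d_kl + mu (d_ik d_jl + d_il d_jk) for an isotropic solid."""
    I = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", I, I)
            + mu * (np.einsum("ik,jl->ijkl", I, I) + np.einsum("il,jk->ijkl", I, I)))

def phase_velocities(C4, n, rho):
    """Return (V, P): velocities sorted as Vp >= Vs1 >= Vs2 and polarizations as columns."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)                   # unit propagation direction
    T = np.einsum("ijkl,j,l->ik", C4, n, n)     # Christoffel tensor T_ik = C_ijkl n_j n_l
    M, P = np.linalg.eigh(T)                    # wave moduli rho V^2, ascending order
    order = np.argsort(M)[::-1]                 # reorder to descending: P, S1, S2
    return np.sqrt(M[order] / rho), P[:, order]
```

For an isotropic tensor the result reduces to $V_p = \sqrt{(\lambda + 2\mu)/\rho}$ and $V_{s1} = V_{s2} = \sqrt{\mu/\rho}$, with the P-wave polarization parallel to $n$, which gives a convenient sanity check before applying the function to a textured aggregate.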

7.3 Properties of Polycrystalline Aggregates: An Example

Metamorphic reactions and phase transformations often result in specific crystallographic relations between minerals. A specific orientation relationship between two minerals is defined by choosing any orientation descriptor that is convenient, e.g., a pair of parallel crystallographic features, Euler angle triplet, rotation matrix, or rotation axis and angle. The two minerals may have the same or different crystal symmetries. The composition may be the same, as in polymorphic phase transitions, or different, as in dehydration or oxidization reactions. Recently, Boudier et al. (2009) described the orientation relationship between olivine and antigorite serpentine crystal structures by two pairs of planes and directions that are parallel in both minerals:

relation 1: $(100)_{\mathrm{Olivine}} \parallel (001)_{\mathrm{Antigorite}}$ and $[001]_{\mathrm{Olivine}} \parallel [010]_{\mathrm{Antigorite}}$
relation 2: $(010)_{\mathrm{Olivine}} \parallel (001)_{\mathrm{Antigorite}}$ and $[001]_{\mathrm{Olivine}} \parallel [010]_{\mathrm{Antigorite}}$

Such relationships are called Burgers orientation relationships in metallurgy. The relation is used in the present study to calculate the Euler angle triplet, which characterizes the rotation of the crystal axes of antigorite into coincidence with those of olivine. Olivine is hydrated to form antigorite, and in the present case, the rotational point group symmetry of olivine (orthorhombic) and antigorite (monoclinic) results in four symmetrically equivalent new mineral orientations (see Mainprice et al. (1990) for details) because of the symmetry of the olivine that is transformed. The orientation of the $n$ symmetrically equivalent antigorite minerals is given by

$$g^{\mathrm{Antigorite}}_{n=1,\dots,4} = \Delta g^{\mathrm{Olivine\text{-}Antigorite}} \cdot S_n^{\mathrm{Olivine}} \cdot g^{\mathrm{Olivine}},$$

where $\Delta g^{\mathrm{Olivine\text{-}Antigorite}}$ is the rotation between olivine and newly formed antigorite, $S_n^{\mathrm{Olivine}}$ are the rotational point group symmetry operations of olivine, and $g^{\mathrm{Olivine}}$ is the orientation of an olivine crystal. $\Delta g$ is defined by the Burgers relationships given above, where relation 1 is $\Delta g = (\varphi_1, \Phi, \varphi_2) = (88.6, 90.0, 0.0)$ and relation 2 is $\Delta g = (178.6, 90.0, 0.0)$. Note that the values of the Euler angles of $\Delta g$ will depend on the right-handed orthonormal crystal coordinate system chosen for the orthorhombic olivine and the monoclinic antigorite. In this example, for olivine $K_C = \{a, b, c\}$, and for antigorite $K_C = \{a^*, b, c\}$.

The measurement of the texture of antigorite is often unreliable using EBSD because of sample preparation problems. We will use $\Delta g$, which may be expressed as a mineral or phase misorientation function (Bunge and Weiland 1988) as

$$F^{\mathrm{Olivine\text{-}Antigorite}}(\Delta g) = \int f^{\mathrm{Olivine}}(g)\, f^{\mathrm{Antigorite}}(\Delta g \cdot g)\,dg,$$

to predict the texture of antigorite from the measured texture of olivine. We will only use relation 1 of Boudier et al. (2009) because this relation was found to have a much higher frequency in their samples. As our model olivine texture, illustrated in Fig. 3, we used the olivine texture database of Ben Ismail and Mainprice (1998), consisting of 110 samples and over 10,000 individual measurements made with an optical microscope equipped with a five-axis universal stage. The olivine model texture has the [100] axes aligned with the lineation and the [010] axes normal to the foliation. The texture of the antigorite, calculated using phase misorientation functions, and the pole figures (Fig. 3) clearly show that Burgers orientation relationships between olivine and antigorite are statistically respected in the aggregates.

The seismic properties of the 100 % olivine and antigorite aggregates were calculated using the methods described in Sect. 6.2 for individual orientations using the elastic single-crystal tensors for olivine (Abramson et al. 1997) and antigorite (Pellenq et al. 2009), respectively. The numerical methods for the seismic calculations are described by Mainprice (1990). The seismic velocities for a given propagation direction are calculated on a five-degree grid in the lower hemisphere.
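The relation giving the four symmetrically equivalent antigorite orientations can be illustrated with rotation matrices. This sketch assumes Bunge (ZXZ) Euler angles in degrees, the rotational point group 222 for orthorhombic olivine written in a crystal frame aligned with a, b, c, and plain matrix multiplication in the order of the formula. The function names are hypothetical, and the result depends on the reference-frame conventions noted in the text.

```python
import numpy as np

def bunge_matrix_deg(phi1, Phi, phi2):
    """Rotation matrix for Bunge (ZXZ) Euler angles given in degrees (assumed convention)."""
    a, b, c = np.radians([phi1, Phi, phi2])
    Rz1 = np.array([[np.cos(a), np.sin(a), 0.0], [-np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, np.cos(b), np.sin(b)], [0.0, -np.sin(b), np.cos(b)]])
    Rz2 = np.array([[np.cos(c), np.sin(c), 0.0], [-np.sin(c), np.cos(c), 0.0], [0.0, 0.0, 1.0]])
    return Rz2 @ Rx @ Rz1

# Rotational point group 222: identity and two-fold rotations about a, b, and c.
SYM_222 = [np.diag(d) for d in ([1.0, 1.0, 1.0], [1.0, -1.0, -1.0],
                                [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0])]

def antigorite_variants(g_olivine, delta_g):
    """Return the four products delta_g . S_n . g_olivine, one per symmetry operation."""
    return [delta_g @ S @ g_olivine for S in SYM_222]

# Relation 1, using the Euler angles quoted in the text.
delta_g_rel1 = bunge_matrix_deg(88.6, 90.0, 0.0)
variants = antigorite_variants(np.eye(3), delta_g_rel1)
```

Applying `antigorite_variants` to every measured olivine orientation gives a discrete prediction of the antigorite texture, which is the individual-orientation counterpart of the phase misorientation function used in the text.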
The percentage anisotropy ($A$) is defined here as $A = 200\,(V_{\max} - V_{\min})/(V_{\max} + V_{\min})$. The $V_p$ anisotropy is found by searching the hemisphere for all possible propagation directions for maximum and minimum values of $V_p$. There are in general two orthogonally polarized S-waves for each propagation direction with different velocities in an anisotropic medium. The anisotropy AVs can then be defined for each direction, with one S-wave having the maximum velocity and the other the minimum velocity. Contoured lower-hemisphere stereograms of P-wave velocity ($V_p$), percentage shear-wave anisotropy (AVs), also called shear-wave splitting, as well as polarization ($V_{s1}$) of the fastest S-wave are shown in Fig. 4. The seismic properties show a major change in the orientation of the fast direction of compressional wave propagation from parallel to the lineation (X) in the olivine aggregate to normal to the foliation (Z) in the antigorite aggregate. In addition, there is a dramatic change of orientation of the polarization (or vibration) of the fastest S-wave (S1) from parallel to the (XY) foliation plane in the olivine aggregate to perpendicular to the foliation (Z) in the antigorite aggregate. The remarkable changes in seismic properties associated with hydration of olivine and its transformation to antigorite have been invoked to explain the changes in orientation of S-wave polarization of the upper mantle between back arc and mantle wedge in subduction zones (Faccenda et al. 2008; Kneller et al. 2008; Katayama et al. 2009).
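The anisotropy measures defined above amount to a few lines of arithmetic. The sketch below assumes that velocities have already been computed on a grid of propagation directions; the function names are hypothetical.

```python
import numpy as np

def percent_anisotropy(v_max, v_min):
    """A = 200 (Vmax - Vmin) / (Vmax + Vmin), in percent."""
    return 200.0 * (v_max - v_min) / (v_max + v_min)

def vp_anisotropy(vp_values):
    """P-wave anisotropy from Vp sampled over all directions of the hemisphere."""
    vp = np.asarray(vp_values, dtype=float)
    return percent_anisotropy(vp.max(), vp.min())

def shear_wave_splitting(vs1, vs2):
    """AVs for each propagation direction from the two S-wave velocities."""
    vs1 = np.asarray(vs1, dtype=float)
    vs2 = np.asarray(vs2, dtype=float)
    return percent_anisotropy(np.maximum(vs1, vs2), np.minimum(vs1, vs2))
```

Note the difference in scope: the $V_p$ anisotropy is a single number for the whole hemisphere, whereas AVs is a value per direction and is therefore contoured on the stereogram.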

Fig. 3 Olivine CPO of the Ben Ismail and Mainprice (1998) database and the corresponding antigorite CPO calculated using the phase misorientation function described in the text. Horizontal black lines on the pole figures mark the foliation (XY) plane of the olivine aggregates, and the lineation (X) is East-West. Contours are in multiples of a uniform distribution. Lower hemisphere equal area projection

Fig. 4 The calculated seismic properties of the olivine and antigorite polycrystals with pole figures shown in Fig. 3. Vp is compressional wave velocity, AVs is shear-wave splitting or birefringence anisotropy as a percentage as defined in the text, and Vs1 polarization is the vibration direction of the fastest S-wave. Horizontal black lines on the pole figures mark the foliation (XY) plane of the olivine aggregates, and the lineation (X) is East-West. Lower hemisphere equal area projection

8 Future Directions

Although quantitative texture analysis has been formally available since the publication of H.-J. Bunge's classical book (Bunge 1969), many of the original concepts only applied to single-phase aggregates of metals. These methods were rapidly extended to the lower crystal symmetry typical of rock-forming minerals and the lower sample symmetry corresponding to naturally deformed rocks. The relationship between neighboring crystal orientations, called misorientation, has also now been widely studied. However, most rocks are polymineralic or polyphase, and the extension of quantitative texture analysis to polyphase materials has been slow to develop because a universal mathematical framework is missing. A coherent framework will encompass misorientation between crystals of the same phase and between crystals of different phases. Future research in this area based on the mathematical framework of this chapter will provide a coherent and efficient theoretical and numerical methodology. Other future developments will include methods to quantify the statistical sampling of the orientation space of different types of data.

9 Conclusions

Forty years after Bunge's pioneering Mathematische Methoden der Texturanalyse (Bunge 1969), which together with its English translation (Bunge 1982) is most likely the single most influential textbook on the subject, this contribution to the Handbook of Geomathematics presents elements of mathematical texture analysis as part of mathematical tomography. The "fundamental relationship" of an orientation distribution and its corresponding "pole figures" was identified as a totally geodesic Radon transform on SO(3) or $S^3 \subset \mathbb{H}$. Being a Radon transform, pole figures are governed by an ultrahyperbolic or Darboux-type differential equation, the meaning of which was furiously denied at its first appearance. In fact, this differential equation opened a new dimension, and its general solution, both in terms of harmonics and characteristics, suggested a novel approach by radial basis functions, featuring a compromise of sufficiently good localization in spatial and frequency domains. The availability of fast Fourier methods for spheres and SO(3) was the necessary prerequisite to put the mathematics of texture analysis into practice, as provided by the free and open-source toolbox MTEX.

References

Abramson EH, Brown JM, Slutsky LJ, Zaug J (1997) The elastic constants of San Carlos olivine to 17 GPa. J Geophys Res 102:12253–12263
Altmann SL (1986) Rotations, quaternions and double groups. Clarendon, Oxford
Barruol G, Kern H (1996) P and S waves velocities and shear wave splitting in the lower crustal/upper mantle transition (Ivrea Zone). Experimental and calculated data. Phys Earth Planet Int 95:175–194
Ben Ismail W, Mainprice D (1998) An olivine fabric database: an overview of upper mantle fabrics and seismic anisotropy. Tectonophysics 296:145–157
Bernier JV, Miller MP, Boyce DE (2006) A novel optimization-based pole-figure inversion method: comparison with WIMV and maximum entropy methods. J Appl Cryst 39:697–713
Bernstein S, Schaeben H (2005) A one-dimensional Radon transform on SO(3) and its application to texture goniometry. Math Methods Appl Sci 28:1269–1289
Bernstein S, Hielscher R, Schaeben H (2009) The generalized totally geodesic Radon transform and its application in texture analysis. Math Methods Appl Sci 32:379–394


Boudier F, Baronnet A, Mainprice D (2009) Serpentine mineral replacements of natural olivine and their seismic implications: oceanic lizardite versus subduction-related antigorite. J Pet. doi:10.1093/petrology/egp049
Bunge HJ (1965) Zur Darstellung allgemeiner Texturen. Z Metallk 56:872–874
Bunge HJ (1969) Mathematische Methoden der Texturanalyse. Akademie-Verlag, New York
Bunge HJ (1982) Texture analysis in materials science. Butterworths, Boston
Bunge HJ, Weiland H (1988) Orientation correlation in grain and phase boundaries. Textures Microstruct 7:231–263
Cowley JM (1995) Diffraction physics, 3rd edn. North-Holland personal library. North-Holland, Oxford
Crosson RS, Lin JW (1971) Voigt and Reuss prediction of anisotropic elasticity of dunite. J Geophys Res 76:570–578
Epanechnikov VA (1969) Nonparametric estimates of a multivariate probability density. Theory Probab Appl 14:153–158
Eshelby JD (1957) The determination of the elastic field of an ellipsoidal inclusion, and related problems. Proc R Soc Lond A 241:376–396
Faccenda M, Burlini L, Gerya T, Mainprice D (2008) Fault-induced seismic anisotropy by hydration in subducting oceanic plates. Nature 455:1097–1101
Fengler MJ, Freeden W, Gutting M (2006) The spherical Bernstein wavelet. Int J Pure Appl Math 31:209–230
Forsyth JB (1988) Single crystal diffractometry. In: Newport RJ, Rainford BD, Cywinski R (eds) Neutron scattering at a pulsed source. Adam Hilger, Bristol, pp 177–188
Friedel G (1913) Sur les symetries cristallines que peut reveler la diffraction des rayons Röntgen. C R Acad Sci 157:1533–1536
Ganster J, Geiss D (1985) Polycrystalline simple average of mechanical properties in the general (triclinic) case. Phys Stat Sol (B) 132:395–407
Gel'fand IM, Minlos RA, Shapiro ZYa (1963) Representations of the rotation and Lorentz groups and their application. Pergamon, Oxford
Gürlebeck K, Sprößig W (1997) Quaternionic and Clifford calculus for physicists and engineers. Wiley, New York
Hall P, Watson GS, Cabrera J (1987) Kernel density estimation with spherical data. Biometrika 74:751–762
Hammond C (1997) The basics of crystallography and diffraction. Oxford University Press, Oxford
Hanson AJ (2006) Visualizing quaternions. Morgan Kaufmann, San Francisco
Helgason S (1984) Groups and geometric analysis. Academic, New York/Orlando
Helgason S (1994) Geometric analysis on symmetric spaces. Mathematical surveys and monographs, vol 39. American Mathematical Society, New York/Orlando
Helgason S (1999) The Radon transform, 2nd edn. Birkhäuser Boston, Boston
Hielscher R (2007) The Radon transform on the rotation group-inversion and application to texture analysis. PhD thesis, TU Bergakademie Freiberg
Hielscher R, Schaeben H (2008a) A novel pole figure inversion method: specification of the MTEX algorithm. J Appl Cryst 41:1024–1037
Hielscher R, Schaeben H (2008b) MultiScale texture modeling. Math Geosci 40:63–82
Hielscher R, Potts D, Prestin J, Schaeben H, Schmalz M (2008) The Radon transform on SO(3): a Fourier slice theorem and numerical inversion. Inverse Probl 24:025011 (21p)
Hielscher R, Prestin J, Vollrath A (2010) Fast summation of functions on SO(3). Math Geosci 42:773–794


Hill R (1952) The elastic behaviour of a crystalline aggregate. Proc Phys Soc Lond Ser A 65:349–354
Hill R (1965) A self consistent mechanics of composite materials. J Mech Phys Solids 13:213–222
Johnson GC, Wenk HR (1986) Elastic properties of polycrystals with trigonal crystal and orthorhombic specimen symmetry. J Appl Phys 60:3868–3875
Katayama I, Hirauchi KI, Michibayashi K, Ando JI (2009) Trench-parallel anisotropy produced by serpentine deformation in the hydrated mantle wedge. Nature 461:1114–1118. doi:10.1038/nature08513
Kneller EA, Long MD, van Keken PE (2008) Olivine fabric transitions and shear wave anisotropy in the Ryukyu subduction system. Earth Planet Sci Lett 268:268–282
Kostelec PJ, Rockmore DN (2003) FFTs on the rotation group. Santa Fe institute working papers series paper, 03-11-060
Kreminski R (1997) Visualizing the Hopf fibration. Math Educ Res 6:9–14
Kuipers JB (1999) Quaternions and rotation sequences-a primer with applications to orbits, aerospace, and virtual reality. Princeton University Press, Princeton
Kunze K (1991) Zur quantitativen Texturanalyse von Gesteinen: Bestimmung, Interpretation und Simulation von Quarzgefügen. PhD thesis, RWTH Aachen
Kunze K, Schaeben H (2004) The Bingham distribution of rotations and its spherical Radon transform in texture analysis. Math Geol 36:917–943
Mainprice D (1990) A FORTRAN program to calculate seismic anisotropy from the lattice preferred orientation of minerals. Comput Geosci 16:385–393
Mainprice D, Humbert M (1994) Methods of calculating petrophysical properties from lattice preferred orientation data. Surv Geophys 15:575–592 (Special Issue Seismic properties of crustal and mantle rocks: laboratory measurements and theoretical calculations)
Mainprice D, Humbert M, Wagner F (1990) Phase transformations and inherited lattice preferred orientation: implications for seismic properties. Tectonophysics 180:213–228
Mainprice D, Tommasi A, Couvy H, Cordier P, Frost DJ (2005) Pressure sensitivity of olivine slip systems: implications for the interpretation of seismic anisotropy of the Earth's upper mantle. Nature 433:731–733
Mao HK, Shu J, Shen G, Hemley RJ, Li B, Singh AK (1998) Elasticity and rheology of iron above 220 GPa and the nature of the Earth's inner core. Nature 396:741–743
Matthies S (1979) On the reproducibility of the orientation distribution function of texture samples from pole figures (ghost phenomena). Phys Stat Sol (B) 92:K135–K138
Matthies S, Humbert M (1993) The realization of the concept of a geometric mean for calculating physical constants of polycrystalline materials. Phys Stat Sol (B) 177:K47–K50
Matthies S, Vinel GW, Helming K (1987) Standard distributions in texture analysis, vol I. Akademie Verlag, New York
Meister L, Schaeben H (2004) A concise quaternion geometry of rotations. Math Methods Appl Sci 28:101–126
Morawiec A (2004) Orientations and rotations. Springer, Berlin
Morris PR (2006) Polycrystal elastic constants for triclinic crystal and physical symmetry. J Appl Cryst 39:502–508. doi:10.1107/S0021889806016645
Muller J, Esling C, Bunge HJ (1981) An inversion formula expressing the texture function in terms of angular distribution function. J Phys 42:161–165
Nye JF (1957) Physical properties of crystals – their representation by tensors and matrices. Oxford University Press, Oxford


Nikiforov AF, Uvarov VB (1988) Special functions in mathematical physics. Birkhäuser Boston, Boston
Nikolayev DI, Schaeben H (1999) Characteristics of the ultrahyperbolic differential equation governing pole density functions. Inverse Probl 15:1603–1619
Pellenq RJM, Mainprice D, Ildefonse B, Devouard B, Baronnet A, Grauby O (2009) Atomistic calculations of the elastic properties of antigorite at upper mantle conditions: application to the seismic properties in subduction zones. EPSL submitted
Prior DJ, Mariani E, Wheeler J (2009) EBSD in the Earth Sciences: applications, common practice and challenges. In: Schwartz AJ, Kumar M, Adams BL, Field DP (eds) Electron backscatter diffraction in materials science. Springer, Berlin
Randle V, Engler O (2000) Texture analysis: macrotexture, microtexture, and orientation mapping. Gordon and Breach Science, New York
Raterron P, Merkel S (2009) In situ rheological measurements at extreme pressure and temperature using synchrotron X-ray diffraction and radiography. J Synchrotron Radiat 16:748–756
Reuss A (1929) Berechnung der Fließgrenze von Mischkristallen auf Grund der Plastizitätsbedingung für Einkristalle. Z Angew Math Mech 9:49–58
Roe RJ (1965) Description of crystallite orientation in polycrystal materials III. General solution to pole figure inversion. J Appl Phys 36:2024–2031
Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27:832–837
Sander B (1930) Gefügekunde der Gesteine mit besonderer Berücksichtigung der Tektonite. Springer, Berlin, p 352
Savyolova TI (1994) Inverse formulae for orientation distribution function. Bunge HJ (ed) Proceedings of the tenth international conference on textures of materials (Materials Science Forum 157–162), pp 419–421
Schaeben H (1982) Fabric-diagram contour precision and size of counting element related to sample size by approximation theory methods. Math Geol 14:205–216 [Erratum: Math Geol 15:579–580]
Schaeben H (1997) A simple standard orientation density function: the hyperspherical de la Vallée Poussin kernel. Phys Stat Sol (B) 200:367–376
Schaeben H (1999) The de la Vallée Poussin standard orientation density function. Textures Microstruct 33:365–373
Schaeben H, Sprößig W, van den Boogaart KG (2001) The spherical X-ray transform of texture goniometry. In: Brackx F, Chisholm JSR, Soucek V (eds) Clifford analysis and its applications. Proceedings of the NATO advanced research workshop Prague, 30 Oct–3 Nov, 2000, pp 283–291
Schaeben H, Hielscher R, Fundenberger J-J, Potts D, Prestin J (2007) Orientation density function-controlled pole probability density function measurements: automated adaptive control of texture goniometers. J Appl Cryst 40:570–579
Schwartz AJ, Kumar M, Adams BL (2000) Electron back scatter diffraction in materials science. Kluwer Academic, Dordrecht
Scott DW (1992) Multivariate density estimation-Theory, practice, and visualization. Wiley, New York
Tommasi A, Mainprice D, Cordier P, Thoraval C, Couvy H (2004) Strain-induced seismic anisotropy of wadsleyite polycrystals: constraints on flow patterns in the mantle transition zone. J Geophys Res 109:B12405, 1–10
Vajk KM (1995) Spin space and the strange properties of rotations. MSc thesis, UC Santa Cruz


Van den Boogaart KG (2002) Statistics for individual crystallographic orientation measurements. PhD thesis, TU Bergakademie Freiberg
Van den Boogaart KG, Hielscher R, Prestin J, Schaeben H (2007) Kernel-based methods for inversion of the Radon transform on SO(3) and their applications to texture analysis. J Comput Appl Math 199:122–140
Van Houtte P (1980) A method for orientation distribution function analysis from incomplete pole figures normalized by an iterative method. Mater Sci Eng 43:7–11
Van Houtte P (1984) A new method for the determination of texture functions from incomplete pole figures – comparison with older methods. Textures Microstruct 6:137–162
Varshalovich D, Moskalev A, Khersonski V (1988) Quantum theory of angular momentum. World Scientific, Singapore
Vilenkin NJ (1968) Special functions and the theory of group representations. American Mathematical Society, Providence
Vilenkin NJ, Klimyk AU (1991) Representation of Lie groups and special functions, vol 1. Kluwer Academic, Dordrecht
Voigt W (1928) Lehrbuch der Kristallphysik. Teubner-Verlag, Leipzig
Vollrath A (2006) Fast Fourier transforms on the rotation group and applications. Diploma thesis, Universität zu Lübeck
Watson GS (1969) Density estimation by orthogonal series. Ann Math Stat 40:1496–1498
Watson GS (1983) Statistics on spheres. Wiley, New York
Wenk HR (1985) Preferred orientation in deformed metals and rocks: an introduction to modern texture analysis. Academic, New York
Zuo L, Xu J, Liang Z (1989) Average fourth-rank elastic tensors for textured polycrystalline aggregates without symmetry. J Appl Phys 66:2338–2341


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_34-3 © Springer-Verlag Berlin Heidelberg 2015

Nonlinear Methods for Dimensionality Reduction

Charles K. Chui (a) and Jianzhong Wang (b)

(a) Department of Statistics, Stanford University, Stanford, CA, USA
(b) Department of Mathematics, Sam Houston State University, Huntsville, TX, USA

Abstract The main objective of this handbook paper is to summarize and compare various popular methods and approaches in the research area of dimensionality reduction of high-dimensional data sets, with emphasis on hyperspectral imagery data. In addition, the topics of our discussions will include data preprocessing, data geometry in terms of similarity/dissimilarity, construction of dimensionality reduction kernels, and dimensionality reduction algorithms based on these kernels.

1 Introduction

Current hyperspectral sensors are being used to collect geological/geographical imagery data as a set of images of the same scene, with each image representing a range (also called spectral band) of 5–10 nm (nanometers) of the electromagnetic spectrum. In general, a set of such hyperspectral images contains hundreds, or occasionally even over a thousand, narrow (adjacent) spectral bands of electromagnetic radiation between the ultraviolet range of 350 nm and the infrared range of over 2,500 nm. These images are stacked together to form a three-dimensional "hyperspectral image (HSI) right-angled parallelepiped," which is commonly called an "HSI cube" instead. This abuse of terminology certainly facilitates our discussions in this paper. Furthermore, since the distance between two adjacent images in an HSI cube is negligibly small, an HSI cube is also considered as a three-dimensional image, to be called a hyperspectral image (also with abbreviation HSI), by skipping the word "cube," for simplicity. The problem of dimensionality reduction is to reduce the thickness of this HSI, while retaining all of its key features, particularly its data geometry. Hence, an HSI may be considered as a real-valued function $f$ defined on some cube. Let $f : C \subset \mathbb{R}^3 \to \mathbb{R}$ denote an HSI, where

$$C = \{ (x, y, z) : x \in [a, b],\ y \in [c, d],\ z \in [z_1, z_2] \},$$

so that the HSI is $f : C \subset \mathbb{R}^3 \to \mathbb{R}$ with the domain $C$. In $C$, the two-dimensional rectangular (spatial) region $[a, b] \times [c, d]$ is called the sensing area and the third dimension $[z_1, z_2]$ is called the range of the spectral band. Furthermore, at each position $(x_0, y_0)$ in the sensing area $[a, b] \times [c, d]$, the function $f_0(z) = f(x_0, y_0, z)$ is called a raster cell (also called a pixel, which stands for picture element), and the graph of a raster cell is called its spectral radiance curve (usually called spectral curve, for simplicity).



Material components of objects of interest for each pixel of an HSI are usually identified from the spectral curves. The reason is that a specific material is characterized by its spectral reflectance curve, so that different "pure" material components can be distinguished by such curves, provided that the number of spectral bands for creating the spectral reflectance curves is sufficiently large. The technical term for a "pure" material is "endmember," and the spectral reflectance curves for endmembers are called signatures of the materials, which are identified by using certain libraries. On the other hand, the spectral curve of a raster cell is not an endmember spectrum in general, but rather a mixture of spectra (called a composite spectrum), since more than one material contributes to it. Hyperspectral sensors are used to scan an area of the spatial domain $[a, b] \times [c, d]$ to produce HSI images with more than one sample. For example, high-resolution sensors can capture a square-meter area for each pixel. The precision of hyperspectral sensors is typically measured in terms of both spectral resolution (which is the width of each band of the captured spectrum) and spatial resolution (which is the size of a pixel in an HSI). Since a hyperspectral sensor can collect a large number of fairly narrow bands, it is possible to identify objects even if they are only captured in a handful of pixels. It is important to point out that spatial resolution contributes to the effectiveness of spectral resolution. For example, if the spatial resolution is too low, meaning that a pixel covers too large an area, then multiple objects might be captured within the same pixel, making it difficult to identify the objects of interest. On the other hand, if a pixel covers too small an area, then the energy captured by the sensor cell could be so low that the signal-to-noise ratio drops far enough to reduce the reliability of the measured features.
To achieve high spatial resolution imagery (HRI), a high-resolution black-and-white (or panchromatic) camera is usually integrated in the HSI system. The typical spatial resolution of HSI systems without HRI cameras is one square meter per pixel, while integrated with high-quality HRI cameras, the spatial resolution of an HSI system could be as fine as a few square inches per pixel. Typical applications of HSI data for geological/geographical studies include the following:

• Signature matching (for object detection). Matching reflected light of pixels to spectral signatures of given objects.
• Region segmentation. Partitioning the spatial region of an HSI into multiple regions (or subsets of pixels). The goal of HSI segmentation is to simplify and/or change the representation of an HSI into some other form for more meaningful and effective analysis. Region segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images.
• Classification of objects (in terms of spectral classification). Classifying objects in an HSI by spectral similarity.
• Anomaly detection. Detecting anomalies for given statistical models.
• Demixing. Finding material components in a raster cell.
• Shape recognition. Recognizing the shape of a detected object.

Because of the rich structure of HSI data, the list of applications of HSI for the geological/geographical field is very long, including resource management, agriculture, mineral exploration, environmental monitoring, and homeland security. But the effective use of HSI requires an understanding of the nature and limitations of HSI data. Besides, because of the huge size of HSI data, even experts have difficulty processing and interpreting HSI data directly. In this chapter, we discuss a powerful tool, called "dimensionality reduction (DR)," to facilitate the processing, analysis, and visualization of HSI data.


2 Euclidean Distance and Gram Matrix on HSI

Since the basic elements in dimensionality reduction and data analysis of HSI are raster cells, it is natural to represent an HSI cube as a three-dimensional matrix:

$$H = \{ h_{j,k,l} \,;\ 1 \le j \le J,\ 1 \le k \le K,\ 1 \le l \le D \},$$

where for each fixed $l$, the data represent an $l$-band (spatial) image, while for each fixed ordered pair $(j, k)$, the data represent the spectral radiance of the raster cell at the position $(j, k)$. In order to change the three-dimensional data structure of HSI to two-dimensional, we may introduce a one-one index transformation $\mathbb{N}^2 \to \mathbb{N}$ that uniquely maps each double-index pair $(j, k)$ to a single index $i \in \{1, 2, \dots, n\}$, with $n = J \times K$. Then, the HSI cube $H$ is converted to a $D \times n$ (two-dimensional) matrix $P = (p_{l,i})_{D \times n}$, with $p_{l,i} = h_{j,k,l}$, where $i$ is associated with $(j, k)$ according to the index transformation. Therefore, each column of the HSI data matrix $P$ represents the spectral radiance curve of the corresponding raster cell, while each row of $P$ is a one-dimensional representation of the image corresponding to the $l$th spectral band, called a "band image." For convenience, we shall denote the $i$th row vector of $P$ by $p_i$ and the $l$th column vector of $P$ by $\vec{p}_l$. Thus, the totality of all columns of $P$ constitutes a "data cloud" in the space $\mathbb{R}^D$, called the spectral domain (or spectral space) of the HSI. The data cloud is a "point-cloud" that lies on some (unknown) manifold of lower dimension in the spectral space $\mathbb{R}^D$. The problem of dimensionality reduction is to map this manifold to a significantly lower dimensional space $\mathbb{R}^d$, with only insignificant loss of the HSI data integrity. Correspondingly, we call the $n$-space $\mathbb{R}^{J \times K}$ the spatial space of the HSI.
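The conversion of an HSI cube into a data matrix can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper names `cube_to_matrix` and `matrix_to_cube` are hypothetical, indices are 0-based, and the row-major rule $i = jK + k$ is just one possible choice of the one-one index transformation.

```python
import numpy as np

def cube_to_matrix(cube):
    """Flatten a J x K x D cube into a D x n data matrix, n = J * K.

    Assumption: pixel (j, k) (0-based) is mapped to column i = j * K + k.
    """
    J, K, D = cube.shape
    return cube.reshape(J * K, D).T

def matrix_to_cube(P, J, K):
    """Invert cube_to_matrix for a D x (J * K) data matrix P."""
    D, n = P.shape
    return P.T.reshape(J, K, D)

# Toy cube: J = 2, K = 3 spatial positions and D = 4 spectral bands.
J, K, D = 2, 3, 4
cube = np.arange(J * K * D, dtype=float).reshape(J, K, D)
P = cube_to_matrix(cube)

j, k = 1, 2
i = j * K + k
spectral_curve = P[:, i]   # column of P: spectrum of the raster cell (j, k)
band_image = P[1, :]       # row of P: band l = 1 as a flattened band image
```

Any other bijective index rule works equally well, as long as the same rule is used when mapping results back to the sensing area.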

2.1 Euclidean and Related Distances on HSI

The measurement of data similarity/dissimilarity is the key to the success of HSI data classification. The distance metric chosen for the HSI data measures the similarities and dissimilarities among the pixels of the HSI, with a larger distance between two pixels indicating greater dissimilarity between them. If we do not take the manifold structure of the HSI data (as a point-cloud in the spectral space) into consideration, then the Euclidean distance of the spectral space $\mathbb{R}^D$ can be used for describing the dissimilarity on HSI data. Let $\vec{a} = [a_1, a_2, \dots, a_D]' \in \mathbb{R}^D$ and $\vec{x} = [x_1, x_2, \dots, x_D]' \in \mathbb{R}^D$ be two pixels in an HSI. The Euclidean (spectral) distance between them is defined by

$$d_2(\vec{a}, \vec{x}) = \|\vec{x} - \vec{a}\| = \sqrt{\sum_{l=1}^{D} (x_l - a_l)^2}.$$

However, the Euclidean (spectral) distance does not truly describe the dissimilarities among pixels on an HSI in general. In this regard, recall that a material component of an object is characterized by the signature of its spectral reflectance curve, while the spectral vector is the spectral radiance on the pixel, recorded in terms of the reflected light of the raster cell at the sensor. Hence, spectral radiance is affected not only by spectral reflectance of materials but also by many other factors, such as the spectra of the input solar energy, its interactions during the travel of light, downward, upward, and sideways, passing through the atmosphere. These and other factors contribute to noise


contamination of the data. To reduce such noise contamination, it would be more effective to transform the HSI data so that the distance in the transformed space describes the dissimilarity on the HSI more accurately. In addition, another factor we have to consider is the effect of the energy of the reflected light. For instance, it is clear that if a spectral vector $\vec{a}$ is proportional to (i.e., a constant multiple of) another spectral vector $\vec{b}$, say $\vec{a} = k \vec{b}$, then they are definitely similar. However, the Euclidean (spectral) distance between them could be very large (depending on the constant of proportionality). Therefore, in order to describe the dissimilarity for HSI without taking the energy effect into consideration, we may first normalize each spectral vector (to become a unit vector) before measuring the distance (among unit vectors). In practice, after various transformations of the HSI data, the Euclidean distance is used in the transformed spaces for dissimilarity measurement. The following is a list of several other popular distances used to measure HSI data dissimilarities in certain transformed spaces.

• Angular distance. To eliminate the energy effect, we may use the angular distance between two (nonzero) spectral vectors $\vec{x} \in \mathbb{R}^D$ and $\vec{a} \in \mathbb{R}^D$ in an HSI, defined by

$$d_{ang}(\vec{x}, \vec{a}) = \|\vec{x} - \vec{a}\|_a = \sqrt{\sum_{i=1}^{D} \left| \frac{x_i}{\|\vec{x}\|} - \frac{a_i}{\|\vec{a}\|} \right|^2}.$$

Let the angle between $\vec{x}$ and $\vec{a}$ be denoted by $\theta$, and recall that all spectral vectors are nonnegative, so that $0 \le \theta \le \frac{\pi}{2}$. Then

$$d_{ang}(\vec{x}, \vec{a}) = 2 \sin \frac{\theta}{2},$$

which is an increasing function of $\theta$. Observe that $d_{ang}(\vec{x}, \vec{a}) \approx \theta$ for small values of $\theta$. Hence, the angular distance approximately measures the angle between two spectral vectors.

• Template distance ($l_\infty$ distance). If the peak of a spectral curve is considered one of its main characteristics, then the template distance defined by

$$d_\infty(\vec{x}, \vec{a}) = \|\vec{x} - \vec{a}\|_\infty = \max_{1 \le i \le D} |x_i - a_i|$$

would be a suitable measurement for dissimilarity.

• Fourier distance (distance in the Fourier domain). Since noise mostly occurs in the high-frequency range of the Fourier domain, we may use the fast Fourier transform (FFT) to map each spectral vector to the Fourier domain, followed by deleting the selected high-frequency terms. The Fourier distance is then the Euclidean distance in the cut-off frequency space, defined by

$$\hat{d}_2(\vec{x}, \vec{a}) = \sqrt{\sum_{i=1}^{s} \left( \mathrm{FFT}(\vec{x} - \vec{a}) \right)_i^2}, \qquad 1 \le s < D,$$

where $\mathrm{FFT}(\vec{v})$ denotes the FFT of the vector $\vec{v}$, and $s$ is some suitable integer smaller than $D$. Note that the sum only contains the lower frequency terms of the FFT in order to reduce noise. Of course, for $s = D$, we have $\hat{d}_2(\vec{x}, \vec{a}) = d_2(\vec{x}, \vec{a})$.


• Wavelet distance (distance in the wavelet domain). The wavelet transform (see, for example, Chui and Wang 1991, 1992a; Chui 1992) is a useful tool in data analysis, particularly for separating noise from data contents (see Chui 1997). Hence, the (discrete) wavelet transform of the spectral vector can be adopted, with certain modification, to effectively distinguish its feature from noise. To eliminate noise contamination, we suggest the following wavelet distance on the data set. For instance, we may consider the simple case of single-level discrete wavelet transform (DWT) of a vector $\vec{x} \in \mathbb{R}^D$, denoted by

$$\mathrm{DWT}(\vec{x}) = L(\vec{x}) + H(\vec{x}),$$

where $L(\vec{x})$ is in the low-frequency subspace $V_1$ and $H(\vec{x})$ is in the high-frequency subspace $W_1$. Then, the wavelet distance between $\vec{x}$ and $\vec{a}$ may be defined by

$$d_w(\vec{x}, \vec{a}) = \|L(\vec{x}) - L(\vec{a})\|.$$

Modifications of this definition could include some high-frequency terms of a multilevel DWT. The interested reader is referred to Chui and Wang (1992b, and references therein) for more detailed discussions of this discrete wavelet transform approach and its properties.

• Fourier phase distance. The Fourier phase distance is used to measure the dissimilarity of two raster cells $\vec{x}$ and $\vec{a}$ in an HSI in terms of the phases of their FFT, defined by

$$\hat{d}_\theta(\vec{x}, \vec{a}) = \sqrt{\sum_{i=1}^{s} \left( \arg(\mathrm{FFT}(\vec{x}))_i - \arg(\mathrm{FFT}(\vec{a}))_i \right)^2}, \qquad 1 \le s < D,$$

which only takes the phase difference of the corresponding harmonics into consideration.

• Derivative distance. This distance definition is a special case of distances for the homogeneous Sobolev spaces. It can also be called (one-dimensional) edge distance, since it measures the dissimilarity among edges of two spectral vectors. For images, it is well known that edges play a very important role in describing image features. Hence, the edge distance should well describe the dissimilarity of pixels. The derivative distance can be formulated as

$$d_{der}(\vec{x}, \vec{a}) = \sqrt{\sum_{i=1}^{D-1} \left( \mathrm{diff}(\vec{x} - \vec{a}) \right)_i^2},$$

where "diff" denotes any desirable difference operator.

• Fourier phase-difference distance. This distance definition adopts the same idea as the derivative distance, but the difference of Fourier phases is considered instead, as follows:

$$\hat{d}_{d\theta}(\vec{x}, \vec{a}) = \sqrt{\sum_{i=1}^{s} \left( \mathrm{diff}\left( \arg \mathrm{FFT}(\vec{x}) - \arg \mathrm{FFT}(\vec{a}) \right) \right)_i^2}, \qquad 1 \le s < D.$$


• Directionally weighted distance. Let $M$ be a $D \times D$ symmetric and positive definite (spd) matrix, which defines a metric on $\mathbb{R}^D$. Then, the directionally weighted distance between two pixels $\vec{a}$ and $\vec{x}$ associated with $M$ is defined by

$$d_M(\vec{x}, \vec{a}) = (\vec{x} - \vec{a})' M^{-1} (\vec{x} - \vec{a}). \tag{1}$$

Here and throughout, the notation $A'$ is used, as usual, to denote transposition of the matrix $A$. The above weighted distance can be explained in the following way. Suppose that $\vec{b}_1, \vec{b}_2, \dots, \vec{b}_D$ are the $D$ most important (notably linearly independent) spectral vectors for the given HSI, usually called feature vectors in the study of machine learning (Laub and Müller 2004). Assume also that their importance is to be measured by a certain set of weights $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_D > 0$. Since the vectors $\{\vec{b}_1, \vec{b}_2, \dots, \vec{b}_D\}$ constitute a basis of $\mathbb{R}^D$, the matrix $B = [\vec{b}_1, \vec{b}_2, \dots, \vec{b}_D]$ is invertible, and we may consider the distance between two pixels $\vec{a}$ and $\vec{x}$ as the directionally weighted distance, defined by

$$[B^{-1}(\vec{x} - \vec{a})]' \Lambda B^{-1} (\vec{x} - \vec{a}), \tag{2}$$

to classify the pixels according to the importance of $\vec{b}_1, \vec{b}_2, \dots, \vec{b}_D$. Here, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_D)$, and $B^{-1}\vec{x}$ and $B^{-1}\vec{a}$ are the so-called $B$-coordinates of the vectors $\vec{x}$ and $\vec{a}$, respectively. Now, by setting $M = B \Lambda^{-1} B'$, so that $M^{-1} = (B^{-1})' \Lambda B^{-1}$, we note that the directionally weighted distance in (2) can be reformulated as the distance in (1). We remark that this distance formula is an extension of the well-known Mahalanobis distance

$$d_\Sigma(\vec{x}, \vec{a}) = (\vec{x} - \vec{a})' \Sigma^{-1} (\vec{x} - \vec{a}),$$

where $\Sigma$ denotes the covariance matrix of the $D$ band-images (or sub-images) of the HSI.

Remark 1. Some of the distance formulations defined above may not satisfy the distance axiom $d(\vec{x}, \vec{a}) = 0 \Rightarrow \vec{x} = \vec{a}$. However, if this is the case, it is easy to introduce a quotient space from the original space, so that the distance in the quotient space satisfies the axiom.

Each distance introduced above is the Euclidean distance in some transformed space (of the spectral space). For convenience, we shall later study only the HSI data that are represented in the spectral space, so that the Euclidean distance between two pixels $\vec{x}$ and $\vec{a}$ is indeed $d_2(\vec{x}, \vec{a})$. It is easy to see that the discussion can be extended to HSI data represented in any transformed space.
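Several of the distances listed above can be illustrated with a short NumPy sketch. It rests on assumptions that are not part of the text: the function names are hypothetical, the FFT is taken with unitary scaling (`norm="ortho"`) and the squared modulus of each complex coefficient is summed so that the case $s = D$ reproduces $d_2$, and the weighted distance is evaluated with $M = B \Lambda^{-1} B'$, the choice for which expressions (1) and (2) agree for a general invertible $B$.

```python
import numpy as np

def euclidean(x, a):
    """d_2: Euclidean (spectral) distance."""
    return np.linalg.norm(x - a)

def angular(x, a):
    """d_ang: Euclidean distance between the normalized spectral vectors."""
    return np.linalg.norm(x / np.linalg.norm(x) - a / np.linalg.norm(a))

def template(x, a):
    """d_inf: largest componentwise deviation."""
    return np.max(np.abs(x - a))

def fourier(x, a, s):
    """Fourier distance using the first s coefficients of a unitary FFT.

    Assumption: the squared modulus of each complex coefficient is summed.
    """
    coeffs = np.fft.fft(x - a, norm="ortho")
    return np.sqrt(np.sum(np.abs(coeffs[:s]) ** 2))

def weighted(x, a, M):
    """Formula (1): (x - a)' M^{-1} (x - a) for an spd matrix M."""
    diff = x - a
    return diff @ np.linalg.solve(M, diff)

# Proportional spectra: far apart in d_2, identical in d_ang.
x = np.array([1.0, 2.0, 3.0, 4.0])
a = 3.0 * x
d2_prop, dang_prop = euclidean(x, a), angular(x, a)

# Two nonnegative vectors enclosing the angle pi/4.
u, v = np.array([1.0, 0.0]), np.array([1.0, 1.0])
theta = np.pi / 4
dang_uv = angular(u, v)

# With all s = D coefficients, the Fourier distance equals d_2.
y = np.array([0.5, 1.5, 2.0, 3.0])
dfourier_full = fourier(x, y, len(x))

# Formula (2) versus formula (1) with M = B diag(1/lambda) B'.
B = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])
lam = np.array([3.0, 2.0, 1.0])
p, q = np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])
c = np.linalg.solve(B, p - q)          # B-coordinates of p - q
form2 = c @ (lam * c)
form1 = weighted(p, q, B @ np.diag(1.0 / lam) @ B.T)
```

Solving a linear system with `np.linalg.solve` instead of forming $M^{-1}$ explicitly is a common numerical preference, not something the definitions require.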

2.2 Gram Matrix on HSI

Assume that an HSI is represented by its data matrix $P$, with columns being the spectral vectors of the HSI in $\mathbb{R}^D$, and denote, as usual, the inner product of two spectral vectors $\vec{x}$ and $\vec{y}$ by

$$\langle \vec{x}, \vec{y} \rangle = \sum_{i=1}^{D} x_i y_i.$$


Then, the Gram matrix on the HSI data set is defined by

$$G = [G_{ij}] = [\langle \vec{p}_i, \vec{p}_j \rangle]_{i,j=1}^{n}.$$

On the other hand, it is well known that an $n \times n$ symmetric positive semidefinite (spsd) matrix $G$ can be formulated as a Cholesky decomposition

$$G = X' X, \tag{3}$$

where $X = [\vec{x}_1, \dots, \vec{x}_n]$ is an $m \times n$ matrix with $m \le n$. By (3), $G$ is the Gram matrix of $X$. That is, each spsd matrix is a Gram matrix of some data set. We remark that the Gram matrix has a very close relationship with the Euclidean distance matrix $D = [D_{ij}] = [d_2(\vec{p}_i, \vec{p}_j)]_{i,j=1}^{n}$. Indeed, by the law of cosines

$$d_2(\vec{x}, \vec{y}) = \sqrt{\langle \vec{x}, \vec{x} \rangle + \langle \vec{y}, \vec{y} \rangle - 2 \langle \vec{x}, \vec{y} \rangle},$$

we have

$$D_{ij} = \sqrt{G_{ii} + G_{jj} - 2 G_{ij}}. \tag{4}$$

If $\vec{x}$ and $\vec{y}$ are unit vectors, then since $d_2(\vec{x}, \vec{y}) = \sqrt{2(1 - \langle \vec{x}, \vec{y} \rangle)}$, we have

$$G_{ij} = 1 - \frac{1}{2} d_2^2(\vec{x}, \vec{y}).$$

To improve the condition number of the Gram matrix $G$, we often employ the shifted Gram matrix as follows. Let $\vec{a} \in \mathbb{R}^D$ be a so-called virtual vector for the given HSI data matrix $P$. For each column vector $\vec{x}$ of $P$, we define its $\vec{a}$-shift by $\hat{x} = \vec{x} - \vec{a}$ and write $\hat{P} = [\hat{p}_1, \dots, \hat{p}_n]$. Then, the $\vec{a}$-shifted Gram matrix for the HSI is defined by

$$\hat{G} = [\langle \hat{p}_i, \hat{p}_j \rangle]_{i,j=1}^{n} = \hat{P}' \hat{P}. \tag{5}$$

Note that a shifted Gram matrix $\hat{G}$ has the same relationship as $G$ with the Euclidean distance matrix $D$ as in (4), namely

$$D_{ij} = \sqrt{\hat{G}_{ii} + \hat{G}_{jj} - 2 \hat{G}_{ij}}. \tag{6}$$

This result is usually stated as follows: The Euclidean distance matrix of a data set is translation invariant. In HSI data analysis, it is quite common to choose the mean $\vec{a} = \frac{1}{n} \sum_{i=1}^{n} \vec{p}_i$ of all spectral vectors as the virtual vector. In this case, the corresponding shifted Gram matrix is a centering Gram matrix and the data set $\{\hat{p}_1, \dots, \hat{p}_n\}$ is a centering data set. Here we recall that a data set $\{\hat{p}_1, \dots, \hat{p}_n\}$ is called a centering data set if $\sum_{i=1}^{n} \hat{p}_i = 0$, and an $n \times n$ symmetric matrix $G = [\vec{g}_1, \dots, \vec{g}_n]$ is called a centering matrix if $\sum_{i=1}^{n} \vec{g}_i = 0$. Since the centering data and centering Gram matrices play an important role in DR, we further elaborate our discussion on this topic as follows.
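Relations (4) and (6) are easy to check numerically. The following NumPy sketch uses a small random matrix as a stand-in for HSI data (an assumption made purely for illustration) and a hypothetical helper `distance_matrix_from_gram`; the mean of the columns serves as the virtual vector.

```python
import numpy as np

def distance_matrix_from_gram(G):
    """Apply relation (4): D_ij = sqrt(G_ii + G_jj - 2 G_ij)."""
    g = np.diag(G)
    sq = g[:, None] + g[None, :] - 2.0 * G
    return np.sqrt(np.maximum(sq, 0.0))   # clip tiny negative round-off

rng = np.random.default_rng(0)
D, n = 5, 8
P = rng.random((D, n))                    # columns play the role of pixels

G = P.T @ P                               # Gram matrix
a = P.mean(axis=1, keepdims=True)         # virtual vector: the mean pixel
P_hat = P - a
G_hat = P_hat.T @ P_hat                   # shifted (centering) Gram matrix

# Pairwise Euclidean distances computed directly from the columns of P.
D_direct = np.linalg.norm(P[:, :, None] - P[:, None, :], axis=0)
D_from_G = distance_matrix_from_gram(G)
D_from_G_hat = distance_matrix_from_gram(G_hat)
```

Both Gram matrices reproduce the same distance matrix, and the columns of `G_hat` sum to zero, which is the defining property of a centering matrix.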


Definition 1. Let $e = [1, 1, \dots, 1]' \in \mathbb{R}^n$ and $I$ denote the $n \times n$ identity matrix. The $n \times n$ matrix $H = I - \frac{1}{n} e e'$ is called the $n$-centralizing matrix. If the dimension $n$ is understood, or not important, it will be called the centralizing matrix.

Lemma 1. The centralizing matrix $H$ has the following properties.

1. $H^2 = H$.
2. $e' H = H e = 0$.
3. $X = [\vec{x}_1, \dots, \vec{x}_n]$ is the matrix of a centering data set if and only if $XH = X$.
4. An $n \times n$ symmetric matrix $C$ is a centering matrix if and only if $HCH = C$.

Proof. Set $E = e e'$. Since $e' e = n$, we have $Ee = ne$, so that

$$H^2 = \left( I - \frac{1}{n} E \right)^2 = I - \frac{2}{n} E + \frac{1}{n^2} E e e' = I - \frac{1}{n} E = H,$$

which yields (1). Furthermore, (2) follows from the fact that $Ee = ne$, (3) can be derived from the definition of centering data, and (4) is a direct consequence of (3).

By applying these properties directly, we have the following.

Lemma 2. If $A$ is an $n \times n$ symmetric matrix, then $HAH$ is a centering matrix, and if $X = [\vec{x}_1, \dots, \vec{x}_n]$ is a data matrix, then $XH$ is a centering data matrix.

We shall use the notation $A^c = HAH$ for the centering matrix corresponding to a symmetric matrix $A$. For a centering data set, the relation (4) is reduced to a very simple form.

Theorem 1. Let $m \le n$, and let $\{\vec{p}_1, \dots, \vec{p}_n\}$ be a centering data set in $\mathbb{R}^m$. Set $P = [\vec{p}_1, \dots, \vec{p}_n]$. Let $G = [G_{ij}] = P' P$ be its Gram matrix, and let $S = [d_{ij}^2]_{i,j=1}^{n}$ be the square Euclidean distance matrix of the data set with $d_{ij} = d_2(\vec{p}_i, \vec{p}_j)$. Then

$$S^c = -2G. \tag{7}$$
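Before turning to the proof, the properties of the centralizing matrix and the identity $S^c = -2G$ can be checked numerically. The NumPy sketch below uses random data and a hypothetical helper name `centralizing_matrix`; it is an illustration of the statements, not a substitute for the argument that follows.

```python
import numpy as np

def centralizing_matrix(n):
    """H = I - (1/n) e e' for the all-ones vector e in R^n."""
    return np.eye(n) - np.ones((n, n)) / n

rng = np.random.default_rng(1)
m, n = 4, 7
H = centralizing_matrix(n)

X = rng.standard_normal((m, n))   # arbitrary data matrix
P = X @ H                         # Lemma 2: XH is a centering data matrix

G = P.T @ P                       # Gram matrix of the centering data
g = np.diag(G)
S = g[:, None] + g[None, :] - 2.0 * G   # squared Euclidean distances
S_c = H @ S @ H                   # centering matrix associated with S
```

Here `S_c` agrees with `-2 * G` up to round-off, as Theorem 1 states.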

Proof. Since $\{\vec{p}_1, \dots, \vec{p}_n\}$ is a centering data set, it follows from Lemma 1 that $G$ is a centering Gram matrix, so that $\sum_{i=1}^{n} G_{ij} = 0$. Hence, the relation $d_{ij}^2 = G_{ii} + G_{jj} - 2 G_{ij}$ in (4) immediately yields both

$$\sum_{i=1}^{n} d_{ij}^2 = n G_{jj} + \sum_{i=1}^{n} G_{ii}$$

and

$$\sum_{j=1}^{n} d_{ij}^2 = n G_{ii} + \sum_{j=1}^{n} G_{jj}.$$

Therefore, the $(i, j)$-entry of $S^c$ is given by

$$(S^c)_{ij} = d_{ij}^2 - \frac{1}{n} \left( \sum_{i=1}^{n} d_{ij}^2 + \sum_{j=1}^{n} d_{ij}^2 - \frac{1}{n} \sum_{i,j=1}^{n} d_{ij}^2 \right) = d_{ij}^2 - \left( G_{ii} + G_{jj} \right) = -2 G_{ij},$$

completing the proof of the theorem.

Another important property of the Euclidean distance metric of a data set is the following.

Lemma 3. Let $\{\vec{p}_1, \dots, \vec{p}_n\} \subset \mathbb{R}^m$ and $\{\vec{q}_1, \dots, \vec{q}_n\} \subset \mathbb{R}^k$ be two data sets, with corresponding matrices denoted by $P = [\vec{p}_1, \dots, \vec{p}_n]$ and $Q = [\vec{q}_1, \dots, \vec{q}_n]$, respectively. Then

$$d_2(\vec{p}_i, \vec{p}_j) = d_2(\vec{q}_i, \vec{q}_j), \qquad 1 \le i \le n,\ 1 \le j \le n,$$

provided that

$$P' P = Q' Q. \tag{8}$$

Equation 8 reveals an intrinsic orthogonal transformation $T$ from $\mathrm{span}\{\vec{p}_1, \dots, \vec{p}_n\}$ to $\mathrm{span}\{\vec{q}_1, \dots, \vec{q}_n\}$ such that $T \vec{p}_i = \vec{q}_i$, $i = 1, \dots, n$. Hence, Lemma 3 can be stated as follows: The Euclidean distance of a data set is invariant under orthogonal transformation. An important task of DR is to solve the converse of this statement: given an $n \times n$ symmetric matrix $D$ that is assumed to be the Euclidean distance matrix of an unknown data set, the problem is to recover a data set with $D$ as its Euclidean distance matrix. Observe that since the Euclidean distance matrix is translation invariant as well as invariant under orthogonal transformation, the solution is not unique. We therefore focus on finding a centering data set, which may not be unique either. In this direction, we establish the following result for the formulation of some centering data set from a given Euclidean distance matrix.

Theorem 2. Let an $n \times n$ matrix $S = [d_{ij}^2]_{i,j=1}^{n}$ be the square Euclidean distance matrix of some data set. Then the matrix

$$G = -\frac{1}{2} S^c$$

is a centering spsd matrix. Hence, there is a centering data set such that its Gram matrix is $G$.

Proof. Since $S$ is the square Euclidean distance matrix of some data set, there is an $m \times n$ data matrix $V = [\vec{v}_1, \dots, \vec{v}_n]$ with $m \le n$, such that

$$d_{ij} = \|\vec{v}_i - \vec{v}_j\|.$$

Let $\hat{V} = [\hat{v}_1, \dots, \hat{v}_n]$ be the centering data matrix of $V$, so that

$$\hat{v}_j = \vec{v}_j - \frac{1}{n} \sum_{i=1}^{n} \vec{v}_i, \qquad 1 \le j \le n.$$

By the translation invariance of $S$, we have $d_{ij} = \|\hat{v}_i - \hat{v}_j\|$, and thus, by setting $\mathbf{M} = [\|\hat{v}_1\|^2, \dots, \|\hat{v}_n\|^2]' \in \mathbb{R}^n$, we obtain

$$S = \mathbf{M} e' + e \mathbf{M}' - 2 \hat{V}' \hat{V}.$$

Therefore, by recalling that

$$H (\mathbf{M} e') H = H (e \mathbf{M}') H = 0 \quad \text{and} \quad \hat{V} H = \hat{V},$$

it follows that the matrix $G$ defined by

$$G := -\frac{1}{2} S^c$$

is the Gram matrix of the vector set $\{\hat{v}_1, \dots, \hat{v}_n\}$. This completes the proof of the theorem.

By Theorem 2, we may recover the (nonunique) data set from its square distance matrix as follows.

Corollary 1. Let an $n \times n$ matrix $S = [d_{ij}^2]_{i,j=1}^{n}$ be the square Euclidean distance matrix of some unknown data set. Set $G = -\frac{1}{2} S^c$, and let $G = U' U$ be a Cholesky decomposition of $G$. Write $U = [\vec{u}_1, \dots, \vec{u}_n]$. Then $\{\vec{u}_1, \dots, \vec{u}_n\}$ is a centering data set, and $S$ is its square Euclidean distance matrix: $d_{ij} = d_2(\vec{u}_i, \vec{u}_j)$.

Theorem 2 together with Theorem 1 yields the following.

Theorem 3. Let $Q$ be an $n \times n$ symmetric matrix. Define

$$M = -\frac{1}{2} Q^c.$$

Then $M$ is an spsd centering matrix if and only if $Q$ is a square Euclidean distance matrix of some data set.


We remark that Theorem 3 is essentially equivalent to a classical result of Young and Householder (see Young and Householder 1938; Torgerson 1958).
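The recovery procedure of Corollary 1 can be sketched as follows. One deliberate substitution is made here: instead of a Cholesky routine, the factorization $G = U'U$ is obtained from a symmetric eigendecomposition, because $G$ is singular (it annihilates $e$) and `np.linalg.cholesky` expects a positive definite matrix. The function name `recover_centered_data` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def recover_centered_data(S, tol=1e-10):
    """Return an r x n matrix U with U'U = -(1/2) H S H.

    S is assumed to be a squared Euclidean distance matrix. Eigenvalues
    below a relative tolerance are treated as zero and dropped.
    """
    n = S.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * H @ S @ H
    G = (G + G.T) / 2.0                   # remove round-off asymmetry
    w, V = np.linalg.eigh(G)              # eigenvalues in ascending order
    keep = w > tol * max(w.max(), 1.0)
    return (V[:, keep] * np.sqrt(w[keep])).T

def squared_distances(X):
    """Squared Euclidean distances between the columns of X."""
    g = np.sum(X * X, axis=0)
    return g[:, None] + g[None, :] - 2.0 * X.T @ X

rng = np.random.default_rng(4)
V_data = rng.standard_normal((3, 6))      # "unknown" data set: 6 points in R^3
S = squared_distances(V_data)

U = recover_centered_data(S)
S_rec = squared_distances(U)
```

The recovered columns of `U` generally differ from those of `V_data` by a translation and an orthogonal transformation, which is exactly the non-uniqueness discussed above, while the squared distance matrices coincide.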

3 Linear Methods for Dimensionality Reduction

The key step for dimensionality reduction (DR) of HSI is the construction of an effective DR kernel. Assume that an HSI data set is represented by some matrix $P_{D \times n}$, with columns consisting of vectors in the spectral (or some transformed) space and rows representing band images (or some transformation of them). A DR kernel for the HSI data set is an $n \times n$ symmetric matrix $K$, whose $(i, j)$-entry is used to measure the similarity between the $i$th and $j$th spectral vectors. A good DR kernel should be consistent with the geometric structure of the HSI. If the spectral vectors of the HSI data matrix $P$ (approximately) span some $d$-dimensional subspace of $\mathbb{R}^D$, then linear methods can be applied to reduce the dimension $D$ of $P$ to $d$. On the other hand, if the spectral vectors (nearly) lie on some $d$-dimensional (continuous and nonlinear) manifold in $\mathbb{R}^D$, then nonlinear methods are needed to achieve reasonable DR results. For linear methods, the Gram (or shifted Gram) matrix of the HSI is used as the DR kernel. In other words, the dissimilarity among pixels is measured by the Euclidean distance. As to the nonlinear approach, the manifold structure of the HSI data is taken into consideration, with the distance on the HSI data set measured on the manifold. Note that since an HSI data set is discrete, it can be embedded in infinitely many (smooth) manifolds, at least in theory. In general, various DR kernels introduce different manifolds. The question as to which DR kernel is more suitable for a given classification task is mainly an experimental problem, not a mathematical one. Hence, it would be somewhat meaningless to ask which method is the best. This section is devoted to the discussion of linear methods and the mathematical foundation for the understanding of these methods.
As mentioned above, if the column vectors of an HSI data matrix $P$ approximately span a $d$-dimensional subspace of $\mathbb{R}^D$, then we can use linear methods to reduce the dimension of the data set to $d$. For a linear method, the Euclidean distance is adopted to describe the dissimilarity among pixels in the HSI. The two most popular linear methods are principal component analysis (PCA) (see, for example, Partridge and Calvo 1997) and (linear) multidimensional scaling (MDS) (see Cox and Cox 2004), but they are essentially equivalent. In general, linear methods are much faster and more stable than nonlinear methods. However, as mentioned previously, if the data do not lie on a $d$-dimensional subspace, then linear methods usually yield unacceptably large errors. In an HSI for geological/geographical studies, the typical number of bands is about 200. Hence, under the assumption that the column vectors of $P$ approximately span some $d$-dimensional subspace of $\mathbb{R}^D$, with $d \le 20$, linear methods are very effective for reducing the data dimension from $D$ to 20, particularly for feature classification of the HSI.

Throughout this chapter, we need the following notions and notations. The set of all $m \times n$ matrices with rank $\le r$ and the set of $n \times n$ spsd matrices with rank $\le r$ are denoted, respectively, by $M_{m,n}(r)$ and $S_n(r)$. If the rank is understood, or not important, then we shall use the notation $M_{m,n}$ and $S_n$ instead. We will also denote the set of $n \times k$ $(k \le n)$ matrices with orthonormal (o.n.) columns by $O_{n,k}$, the set of diagonal matrices in $M_{n,m}$ by $D_{n,m}$, and the set of diagonal matrices in $S_n$ by $D_n$. For an spsd matrix $A \in S_n$, its eigenvalues $\lambda_1, \dots, \lambda_n$ are nonnegative and will always be arranged in nonincreasing order, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$, unless stated otherwise.


3.1 Principal Component Analysis

The principal component analysis (Partridge and Calvo 1997) can be described from two different points of view: algebraic and statistical. Let us first discuss the algebraic point of view. As before, let the data set of an HSI be $\{\vec{p}_1, \dots, \vec{p}_n\} \subset \mathbb{R}^D$, and suppose we want to reduce it to a data set in $\mathbb{R}^d$ $(d < D)$. In the following, orthogonal (o.g.) embeddings from $\mathbb{R}^d$ to $\mathbb{R}^D$ will be represented by matrices in $O_{D,d}$. Then, the PCA method is used in finding a data set $\{\tilde{q}_1, \dots, \tilde{q}_n\} \subset \mathbb{R}^d$ and $T_* \in O_{D,d}$ such that the set $\{T_*(\tilde{q}_i)\}_{i=1}^{n}$ is the best approximation of $\{\vec{p}_i\}_{i=1}^{n}$ with respect to the square Euclidean norm, i.e., for each $T \in O_{D,d}$ and each set $\{\vec{q}_1, \dots, \vec{q}_n\} \subset \mathbb{R}^d$,

$$\sum_{i=1}^{n} \|T_* \tilde{q}_i - \vec{p}_i\|^2 \le \sum_{i=1}^{n} \|T \vec{q}_i - \vec{p}_i\|^2. \tag{9}$$

By setting $Q = [\vec{q}_1, \dots, \vec{q}_n] \in M_{d,n}(d)$, the error can be formulated by using the matrix Frobenius norm $\|\cdot\|_F$:

$$\|TQ - P\|_F^2 = \sum_{i=1}^{n} \|T(\vec{q}_i) - \vec{p}_i\|^2.$$

To derive $\{\tilde{q}_1, \dots, \tilde{q}_n\}$ and $T_*$ in (9), we need the following result on best approximation for matrices (P 11.5.5 in Rao and Rao 1998).

Theorem 4 (Young and Householder 1938). Let the singular value decomposition (SVD) of a matrix $A \in M_{m,n}(r)$ be

$$A = V \Sigma U' = \sum_{i=1}^{r} \sigma_i \vec{v}_i (\vec{u}_i)', \tag{10}$$

where $U = [\vec{u}_1, \dots, \vec{u}_r] \in O_{n,r}$, $V = [\vec{v}_1, \dots, \vec{v}_r] \in O_{m,r}$, and $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$ is a diagonal matrix with $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r$ $(> 0)$. Let

$$B_* = \sum_{i=1}^{k} \sigma_i \vec{v}_i (\vec{u}_i)', \qquad k < r. \tag{11}$$

Then $B_* \in M_{m,n}(k)$ in (11) is the best approximation of $A$ under the Frobenius norm, namely

$$\|A - B_*\|_F = \min_{B \in M_{m,n}(k)} \|A - B\|_F,$$

with the error of approximation given by

$$\|A - B_*\|_F = \sqrt{\sum_{l=k+1}^{r} \sigma_l^2}.$$
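Theorem 4 can be illustrated numerically with a truncated SVD. The NumPy sketch below uses a random matrix and a hypothetical helper `best_rank_k`. Note the naming clash with the library: `np.linalg.svd` returns factors in the order left vectors, singular values, transposed right vectors, so its first output corresponds to $V$ in (10) and its last output to $U'$.

```python
import numpy as np

def best_rank_k(A, k):
    """Truncate the SVD of A after k terms, as in (11)."""
    V, sigma, Ut = np.linalg.svd(A, full_matrices=False)  # A = V diag(sigma) U'
    B = (V[:, :k] * sigma[:k]) @ Ut[:k, :]
    return B, sigma

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 9))
k = 2

B_star, sigma = best_rank_k(A, k)
err = np.linalg.norm(A - B_star, "fro")
predicted = np.sqrt(np.sum(sigma[k:] ** 2))   # error formula of Theorem 4

# Any other matrix of rank at most k should do no better than B_star.
C = rng.standard_normal((6, k)) @ rng.standard_normal((k, 9))
err_other = np.linalg.norm(A - C, "fro")
```

The comparison with a single random competitor `C` is only a sanity check of the inequality, of course, not a verification of optimality over all rank-$k$ matrices.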


Since $P \in M_{D,n}(D)$, it has an SVD, namely

$$P = \sum_{i=1}^{D} \sigma_i \vec{v}_i (\vec{u}_i)'. \tag{12}$$

For $d < D$, set $\Sigma_d = \mathrm{diag}(\sigma_1, \dots, \sigma_d)$, $U_d = [\vec{u}_1, \dots, \vec{u}_d] \in O_{n,d}$, $T_* = [\vec{v}_1, \dots, \vec{v}_d] \in O_{D,d}$, and $\tilde{Q} = \Sigma_d U_d' = [\tilde{q}_1, \dots, \tilde{q}_n]$. Then, by Theorem 4, the vectors $\{\tilde{q}_1, \dots, \tilde{q}_n\}$ and the matrix $T_*$ satisfy (9), so that the new data set $\tilde{Q}$ provides the required (linear) dimensionality reduction of $P$. In addition, it is easy to see that the Gram matrix of $P$,

$$G = P' P,$$

can be used as the DR kernel in PCA. Indeed, since the rank of $G$ is no larger than $D$, the eigen decomposition of $G \in S_n(D)$ is given by

$$G = U \Lambda U', \tag{13}$$

where the diagonal matrix $\Lambda = \mathrm{diag}(\sigma_1^2, \dots, \sigma_D^2)$ consists of all the eigenvalues of $G$, and the $i$th column of $U \in O_{n,D}$ is the (unit) eigenvector of $G$ corresponding to the eigenvalue $\sigma_i^2$. Therefore, by using the $d$ leading eigenvectors and eigenvalues of $G$, the new data matrix $\tilde{Q} = \Sigma_d U_d'$ is obtained. In an HSI, since the number of pixels, $n$, is much larger than the number of bands, $D$, the DR kernel $G$ in (13) has the significantly larger size of $n \times n$. To avoid such high-dimensional kernels for computation, we may get around it in the following way. Observe that

$$\tilde{Q} = (T_*)' P. \tag{14}$$

Here, the matrix $T_*$ can be computed as follows. Construct

$$F = P P' \in S_D.$$

Then the eigen decomposition of $F$ is given by

$$F = V \Lambda V',$$

where $V = [\vec{v}_1, \dots, \vec{v}_D]$, so that $T_*$ is determined by the first $d$ columns of $V$. Let $\hat{P} = [\hat{p}_1, \dots, \hat{p}_n]$ be the centering data matrix of $P$. Then

$$d_2(\hat{p}_i, \hat{p}_j) = d_2(\vec{p}_i, \vec{p}_j), \qquad i, j = 1, 2, \dots, n.$$

Therefore, in a PCA algorithm, we may use $\hat{P}$ instead of $P$. This replacement also has the benefit of improving the condition number for the computation in general. Next, let us discuss the statistical point of view of the PCA method. First, we may consider each pixel (i.e., each column of the matrix $P$) as a sample of the random vector $\vec{p} \in \mathbb{R}^D$. Let $\mathcal{T}$ denote the set of all orthogonal (o.g.) projections from $\mathbb{R}^D$ to $\mathbb{R}^d$. Then the PCA method can be


described as finding the o.g. projection $T_* \in \mathcal{T}$ such that the random vector $\vec y := T_*(\vec p)$ has maximal variance. More precisely, recall that the variance of the random vector $\vec p = [p_1, \dots, p_D]'$ is defined by

$$\operatorname{Var}(\vec p) = \sum_{i=1}^{D} \operatorname{Var}(p_i),$$

so that, for an HSI data set $P$, we have

$$\operatorname{Var}(\vec p) = \frac{1}{n} \operatorname{tr}(\hat P \hat P'),$$

where $\operatorname{tr}(M)$ denotes the trace of the matrix $M$. Hence, the PCA method can be formulated as finding $T_* \in O_{D,d}$ such that

$$T_* = \arg\max_{T \in O_{D,d}} \operatorname{Var}(T'P).$$

Let the eigenvalues of $\hat P \hat P'$ be $\lambda_1, \dots, \lambda_D$, listed in non-increasing order. Then, for each $T \in O_{D,d}$,

$$\operatorname{Var}(T'P) \le \frac{1}{n} \sum_{i=1}^{d} \lambda_i.$$

Therefore, the maximal variance of $T'P$ is obtained by the o.g. projection $T_* = [\vec v_1, \dots, \vec v_d]$, and

$$\operatorname{Var}(T_*'P) = \frac{1}{n} \sum_{j=1}^{d} \lambda_j.$$

Hence, dimensionality reduction of $P$ by using the PCA method is achieved by appealing to the random vector $\vec y = T_*(\vec p)$, whose sample set is given by

$$Y = T_*'P.$$
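The PCA recipe above, with the small $D \times D$ kernel in place of the $n \times n$ Gram matrix, can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the data are synthetic, the centred matrix $\hat P$ is used throughout, and the function name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(P, d):
    """Return T (D x d), the reduced data Y = T' P_hat (d x n), and the
    eigenvalues of P_hat P_hat' in non-increasing order."""
    P_hat = P - P.mean(axis=1, keepdims=True)    # centre the pixels (columns)
    F = P_hat @ P_hat.T                          # D x D kernel, cheap when D << n
    eigvals, eigvecs = np.linalg.eigh(F)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # switch to descending order
    T = eigvecs[:, order[:d]]                    # first d eigenvectors
    return T, T.T @ P_hat, eigvals[order]

rng = np.random.default_rng(1)
D, n, d = 5, 200, 2                              # bands, pixels, target dimension
P = rng.standard_normal((D, D)) @ rng.standard_normal((D, n))
T, Y, lam = pca_reduce(P, d)
variance_kept = np.trace(Y @ Y.T) / n            # Var(T'P) for the chosen T
variance_bound = lam[:d].sum() / n               # (1/n) * sum of d largest eigenvalues
```

For the returned `T`, `variance_kept` should coincide with `variance_bound`; any other matrix with orthonormal columns gives a value that is no larger.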

3.2 Linear Multidimensional Scaling

Let $D = [d_{ij}]_{i,j=1}^{n}$ be a given distance matrix of an $n$-data set and $d \in \mathbb{N}$ $(d < D)$. The method of metric multidimensional scaling (Borg and Groenen 1997) is a procedure for finding some data set $\{\vec y_1, \dots, \vec y_n\} \subset \mathbb{R}^d$ such that the (Euclidean) distance matrix associated with $\{\vec y_1, \dots, \vec y_n\}$ is as close as possible to the matrix $D$, namely

$$d_2(\vec y_i, \vec y_j) \approx d_{ij}, \qquad \forall i, j. \tag{15}$$

In practice, a (so-called) loss function is adopted to measure the closeness in (15). The classical multidimensional scaling (or linear MDS) dimensionality reduction method (Cox and Cox 2004) can be described as follows. Let $\{\vec p_1, \dots, \vec p_n\} \subset \mathbb{R}^D$ be a data set of an HSI, which determines the


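Euclidean distance matrix $D = [d_2(\vec p_i, \vec p_j)]$. Let $\mathbf{T}$ be an o.g. projection from $\mathbb{R}^D$ into a $d$-dimensional subspace $S \subset \mathbb{R}^D$ such that $\vec x_i = \mathbf{T}\vec p_i$, $1 \le i \le n$. It is obvious that

$$d_2(\vec x_i, \vec x_j) \le d_2(\vec p_i, \vec p_j), \qquad \forall i, j.$$

We denote $X = [\vec x_1, \dots, \vec x_n] \in M_{D,n}(d)$ and let

$$\eta(X) = \frac{1}{2n} \sum_{i,j=1}^{n} \left( \|\vec p_i - \vec p_j\|^2 - \|\vec x_i - \vec x_j\|^2 \right)$$

be the loss function. Since $\mathbf{T}$ is an o.g. projection, we have

$$\eta(X) = \frac{1}{2n} \sum_{i,j=1}^{n} \left( \|\vec p_i - \vec p_j\|^2 - \|\mathbf{T}(\vec p_i - \vec p_j)\|^2 \right) = \frac{1}{2n} \sum_{i,j=1}^{n} \|(I - \mathbf{T})(\vec p_i - \vec p_j)\|^2.$$

The matrix representation of the o.g. projection $\mathbf{T}$ is $TT'$, where $T \in O_{D,d}$. Then, the linear MDS-DR algorithm can be formulated as the minimization problem of computing

$$T_* = \arg\min_{T \in O_{D,d}} \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} \|(I - TT')(\vec p_i - \vec p_j)\|^2. \tag{16}$$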

Lemma 4. Let $\{\vec p_i\}_{i=1}^{n}$ be a given data set, with corresponding square distance matrix $S_P = [s_{ij}]_{i,j=1}^{n}$, i.e., $s_{ij} = d_2^2(\vec p_i, \vec p_j)$, and let $\hat G_P$ be its associated centering Gram matrix. Then

$$\operatorname{tr}(\hat G_P) = \frac{1}{2n} \sum_{ij} s_{ij}.$$

Proof. Let $H$ be the $n$-centralizing matrix and $E$ be the $n \times n$ matrix in which each entry is 1. Let $\{\hat p_i\}_{i=1}^{n}$ be the centering data set of $\{\vec p_i\}_{i=1}^{n}$. By Theorem 2, we have

$$\hat G_P = -\frac{1}{2} H S_P H = -\frac{1}{2} \left( S_P - \frac{1}{n} E S_P - \frac{1}{n} S_P E + \frac{1}{n^2} E S_P E \right),$$

which yields

$$\langle \hat p_i, \hat p_j \rangle = -\frac{1}{2} \left( s_{ij} - \frac{1}{n} \sum_{k} s_{ik} - \frac{1}{n} \sum_{k} s_{kj} + \frac{1}{n^2} \sum_{k,l} s_{kl} \right).$$

Therefore,

$$\operatorname{tr}(\hat G_P) = -\frac{1}{2} \sum_{i} \left( s_{ii} - \frac{1}{n} \sum_{k} s_{ik} - \frac{1}{n} \sum_{k} s_{ik} + \frac{1}{n^2} \sum_{k,l} s_{kl} \right) = \frac{1}{2n} \sum_{ij} s_{ij} - \frac{1}{2} \operatorname{tr}(S_P).$$

Since $\operatorname{tr}(S_P) = 0$, we have $\operatorname{tr}(\hat G_P) = \frac{1}{2n} \sum_{ij} s_{ij}$, completing the proof of the lemma.

Lemma 5. Let $D_P = [d_2(\vec p_i, \vec p_j)]_{i,j=1}^{n}$ and $\hat P = [\hat p_1, \dots, \hat p_n]$. Then

$$\|\hat P\|_F = \frac{1}{\sqrt{2n}} \|D_P\|_F.$$

Proof. Since $\|D_P\|_F^2 = \sum_{ij} s_{ij}$ and $\|\hat P\|_F^2 = \operatorname{tr}(\hat G_P)$, the result is a consequence of Lemma 4.

We are now ready to establish the following result.

Theorem 5. Let $P = [\vec p_1, \dots, \vec p_n] \in M_{D,n}$, and $\hat P = PH$ with its SVD given by

$$\hat P = V \Sigma_D U',$$

where $\Sigma_D = \operatorname{diag}(\sigma_1, \dots, \sigma_D) \in \mathcal{D}_D$, $U \in O_{n,D}$, and $V = [\vec v_1, \dots, \vec v_D] \in O_{D,D}$. Then $T_* = [\vec v_1, \dots, \vec v_d]$ is the solution of the minimization problem in (16), with

$$\eta(T_* T_*' P) = \sum_{i=d+1}^{D} \sigma_i^2.$$

Proof. Let $X = [\vec x_1, \dots, \vec x_n]$, where $\vec x_i = (I - TT')\vec p_i$, and $D_X = [d_2(\vec x_i, \vec x_j)]$. Then

$$\sum_{i,j=1}^{n} \|(I - TT')(\vec p_i - \vec p_j)\|^2 = \|D_X\|_F^2.$$

By Lemma 5, we have

$$\|D_X\|_F^2 = 2n \|\hat X\|_F^2.$$

Recall that

$$\|\hat X\|_F = \|\hat P - TT'\hat P\|_F,$$

where $\hat P \in M_{D,n}(D)$ and $TT'\hat P \in M_{D,n}(d)$. By Theorem 4, we see that $T_* T_*' \hat P$ is the best approximation of $\hat P$ in $M_{D,n}(d)$ under the Frobenius norm. In addition, since the squared error of this approximation is the sum of the discarded squared singular values,

$$\eta(T_* T_*' P) = \|\hat P - T_* (T_*)' \hat P\|_F^2 = \sum_{i=d+1}^{D} \sigma_i^2,$$

the proof is completed.

We remark that since the operator $T_*$ is an o.g. embedding from $\mathbb{R}^d$ to $\mathbb{R}^D$, we have

$$\|T_* T_*' \hat p_i - T_* T_*' \hat p_j\| = \|T_*' \hat p_i - T_*' \hat p_j\|.$$

Therefore, $T_*' \hat P$ provides the required dimensionality reduction of $P$.
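Lemma 4 and Theorem 5 can be illustrated numerically by running classical MDS directly on a squared distance matrix. The sketch below is a hedged NumPy illustration: it assumes the double-centring identity $\hat G_P = -\tfrac12 H S_P H$ quoted from Theorem 2, uses synthetic data, and the name `classical_mds` is not from the chapter.

```python
import numpy as np

def classical_mds(S, d):
    """Embed n points in R^d from the n x n matrix S of squared distances."""
    n = S.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centralizing matrix
    G = -0.5 * H @ S @ H                         # centring Gram matrix
    lam, V = np.linalg.eigh(G)
    lam, V = lam[::-1], V[:, ::-1]               # eigenvalues in descending order
    Y = np.sqrt(np.clip(lam[:d], 0.0, None))[:, None] * V[:, :d].T
    return Y, G, lam

rng = np.random.default_rng(2)
P = rng.standard_normal((4, 30))                 # D = 4 bands, n = 30 pixels
S = np.sum((P[:, :, None] - P[:, None, :]) ** 2, axis=0)   # s_ij = d_2^2(p_i, p_j)
Y, G, lam = classical_mds(S, d=2)

n = S.shape[0]
lemma4_lhs, lemma4_rhs = np.trace(G), S.sum() / (2 * n)
SY = np.sum((Y[:, :, None] - Y[:, None, :]) ** 2, axis=0)  # squared distances of Y
loss = (S - SY).sum() / (2 * n)                  # loss of the 2-dimensional embedding
tail = lam[2:].sum()                             # sum of the discarded eigenvalues
```

Here the eigenvalues of the centring Gram matrix are the squared singular values of $\hat P$, so `loss` and `tail` should agree up to round-off, as should the two sides of Lemma 4.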


4 Nonlinear Methods for Dimensionality Reduction

Nonlinear dimensionality reduction is based on the assumption that the high-dimensional data set $\{\vec p_i\}_{i=1}^{n}$, of dimension $D$, lies on a lower dimensional (continuous) manifold $M \subset \mathbb{R}^D$, of $\dim(M) = d$. Let $f$ be an isometric mapping from $M$ to $\mathbb{R}^d$, which maps $\{\vec p_i\}_{i=1}^{n}$ to a data set $\{\vec y_i\}_{i=1}^{n} \subset \mathbb{R}^d$, such that $f(\vec p_i) = \vec y_i$. Then, the set $\{\vec y_i\}_{i=1}^{n}$ is viewed as a dimensionality reduction of $\{\vec p_i\}_{i=1}^{n}$. However, in DR, only a discrete data set is given. Hence, a key issue for all nonlinear DR methods is the choice of an appropriate metric structure on the manifold $M$. Note that, although globally the data set $\{\vec p_i\}_{i=1}^{n}$ does not lie on a $d$-dimensional subspace in $\mathbb{R}^D$, within a small local region the data do lie, at least approximately, on a $d$-dimensional subspace, such as the tangent space of the manifold at a certain point. This locally linear approximation enables us to make an appropriate "guess" of the metric structure on $M$.

For nonlinear DR, there are two different approaches in general, described as follows. The first approach is to change the Euclidean distance (used in linear DR) to some non-Euclidean distance, such as the manifold geodesic distance, to measure dissimilarity among the pixels in the data set. As mentioned above, if two pixels are sufficiently close to each other, then the manifold geodesic distance between them can be approximated by the Euclidean distance. In this case, we say that one pixel is in the neighborhood of the other, and a path is created to connect them. Otherwise, they are disconnected. After the neighborhood is defined for each pixel, we may create a graph on the data set. Then, the distance between two pixels is defined as the length of the shortest path connecting these two pixels. Once the distance on the data set has been defined properly, it can be used to construct an effective DR kernel.
Finally, eigen decomposition of the DR kernel is applied to complete the (nonlinear) dimensionality reduction process. This approach can be illustrated by the following diagram (Fig. 1).

Fig. 1 Nonlinear dimensionality reduction approach 1: Neighborhood definition → Distance graph creation → DR kernel construction → DR kernel eigen decomposition

The second approach is to change the Gram matrix (used in linear DR) to a weight matrix to measure the similarity among the pixels in the data set. A nonlinear dimensionality reduction algorithm of this approach usually consists of the following steps (Fig. 2) (see Lee and Verleysen (2007)).

Fig. 2 Nonlinear dimensionality reduction approach 2: Neighborhood definition → Weighted graph creation → DR kernel construction → DR kernel eigen decomposition

The essential difference between these two approaches is the construction of DR kernels. In the second approach, the neighborhood definition is similar to that of the first approach, with a suitable definition of the neighborhood for each pixel, so that only those pixels in the neighborhood of this pixel are considered to be similar to it. A weighted (bidirectional) graph is then constructed on the data set, such that a pixel $\vec q$ is connected to another pixel $\vec p$ if and only if $\vec q$ is in the neighborhood of $\vec p$, and some weight is assigned to the path to describe the similarity between $\vec q$ and $\vec p$: a larger weight indicates that they are more similar. If there is no path from $\vec q$ to $\vec p$, then the weight is considered to be zero. To derive the required DR kernel, the weighted graph on the data set $\{\vec p_1, \dots, \vec p_n\}$ is formulated as an $n \times n$ weight matrix, such as $K = [k_{ij}]_{i,j=1}^{n}$, of which each entry $k_{ij}$ measures the similarity of the two points $\vec p_i$ and $\vec p_j$. Since $K$ is not a Gram (or shifted Gram) matrix of the data set $\{\vec p_1, \dots, \vec p_n\}$, it gives rise to some non-Euclidean distance on the data set. Hence, both approaches lead to very similar geometric structures of the data set.

Also, since the data set is finite, it may be embedded in infinitely many manifolds. Unfortunately, the question as to which manifold is the best for the given data set cannot be answered mathematically. Perhaps experimental results and analysis can be used to settle the question, unless some prior knowledge of the manifold is available. The final step of eigen decomposition of the kernel for both approaches is similar to PCA or MDS. In summary, the only essential difference among all nonlinear methods is in neighborhood definition and weight calculation.

In this section, for convenience, both the data set $\{\vec p_1, \dots, \vec p_n\}$ and the data matrix $[\vec p_1, \dots, \vec p_n]$ are denoted by $P$.

4.1 Neighborhood Definition

The two most popular notions of neighborhoods adopted for DR algorithm development are the $k$-neighborhood and the $\varepsilon$-neighborhood. The $k$-neighborhood of $\vec p_i$ is the set of $k$ pixels $\vec p_{j_1}, \dots, \vec p_{j_k}$ $(\neq \vec p_i)$, which satisfy

$$\max_{1 \le s \le k} d_2(\vec p_i, \vec p_{j_s}) \le d_2(\vec p_i, \vec p_j), \qquad \forall j \notin \{i, j_1, \dots, j_k\}.$$

We shall denote the $k$-neighborhood of $\vec p_i$ by

$$O_i := O_i(k) = \{\vec p_{j_s}\}_{s=1}^{k}.$$

On the other hand, the $\varepsilon$-neighborhood of $\vec p_i$ is simply the set

$$O_i := O_i(\varepsilon) = \{\vec p_j;\ d_2(\vec p_i, \vec p_j) \le \varepsilon,\ j \in \{1, \dots, n\} \setminus \{i\}\}.$$

In the following, let the square Euclidean distance matrix on $\{\vec p_1, \dots, \vec p_n\}$ be denoted by

$$S = [d_2^2(\vec p_i, \vec p_j)]_{i,j=1}^{n}, \tag{17}$$

and consider the matrix

$$\Phi = \operatorname{diag}(\|\vec p_1\|^2, \dots, \|\vec p_n\|^2)\,E,$$

where, as introduced previously, $E$ is the $n \times n$ matrix with the same value 1 for all of its entries. Then the square Euclidean distance matrix $S$ can be computed by using the formula

$$S = \Phi + \Phi' - 2P'P. \tag{18}$$

In algorithm development, both the $k$-neighborhood and the $\varepsilon$-neighborhood of any pixel in $\{\vec p_1, \dots, \vec p_n\}$ can be derived by an appropriate sorting procedure based on the matrix $S$.
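Formula (18) and the two neighborhood notions translate almost directly into array code. The following NumPy sketch is one possible reading, on a toy data set of four collinear pixels; the function names are illustrative assumptions, and a full sort is used where a partial sort would suffice.

```python
import numpy as np

def squared_distance_matrix(P):
    """S = Phi + Phi' - 2 P'P, formula (18); P holds one pixel per column."""
    n = P.shape[1]
    sq_norms = np.sum(P ** 2, axis=0)
    Phi = np.tile(sq_norms[:, None], (1, n))     # Phi_ij = ||p_i||^2
    S = Phi + Phi.T - 2.0 * P.T @ P
    return np.maximum(S, 0.0)                    # clip negative round-off

def k_neighborhoods(S, k):
    """Row i lists the indices of the k pixels nearest to p_i (i excluded)."""
    S = S.copy()
    np.fill_diagonal(S, np.inf)                  # a pixel is not its own neighbor
    return np.argsort(S, axis=1)[:, :k]

def eps_neighborhoods(S, eps):
    """List i holds every j != i with d_2(p_i, p_j) <= eps."""
    n = S.shape[0]
    return [[j for j in range(n) if j != i and S[i, j] <= eps ** 2]
            for i in range(n)]

P = np.array([[0.0, 1.0, 2.0, 10.0],
              [0.0, 0.0, 0.0, 0.0]])             # four pixels on a line
S = squared_distance_matrix(P)
knn = k_neighborhoods(S, k=2)
enn = eps_neighborhoods(S, eps=1.5)
```

In this toy example the isolated fourth pixel has the third and second pixels as its 2-neighborhood, although it belongs to neither of their 2-neighborhoods, and its $\varepsilon$-neighborhood is empty. This is the asymmetry of the $k$-neighborhood that later sections have to deal with.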


4.2 Isomap

The isomap method was introduced by Tenenbaum et al. (2000). It adopts the first approach described above. Isomap gives a parametric representation of the data on some manifold. More recent development and applications of the isomap method can be found in Balasubramanian et al. (2002) and Bachmann et al. (2005, 2006, 2009).

4.2.1 Description of the Isomap Method

Assume that the data set $P = \{\vec p_1, \dots, \vec p_n\} \subset \mathbb{R}^D$ lies on a $d$-dimensional manifold $M \subset \mathbb{R}^D$. If there is a one-one continuous map of $M$ to $\mathbb{R}^d$ such that the image of the manifold $M$ is a convex region, then $M$ is called a convex manifold. For a convex $M$, there is a one-one continuous map $f$ from $M$ to $f(M) \subset \mathbb{R}^d$ that preserves the geodesic distance $d_M(\cdot, \cdot)$ on $M$, namely

$$d_M(\vec x, \vec y) = d_2(f(\vec x), f(\vec y)), \qquad \forall \vec x, \vec y \in M.$$

In particular, we have

$$d_M(\vec p_i, \vec p_j) = d_2(f(\vec p_i), f(\vec p_j)), \qquad \forall i, j.$$

The map $f$ is called an isomap, and $f(P)$ is called the parametric representation of $P$. In practice, since it is very difficult to find the map $f$ directly, the distance matrix $[d_2(f(\vec p_i), f(\vec p_j))]$ is constructed instead.

4.2.2 Isomap Algorithm

The algorithm essentially adopts the steps in Fig. 1. (1) Estimate the geodesic interpoint distances for the data set $P$. Since the points in the data set form a discrete set, the manifold distance can only be approximated by interpoint distances measured along the data. (2) Apply a linear method, such as MDS, to find the projection $f$ that preserves distances.

Step 1: Neighborhood definition. As mentioned above, the neighborhood of a pixel in $\{\vec p_1, \dots, \vec p_n\}$ may be defined by either its $k$-neighborhood or its $\varepsilon$-neighborhood. If the notion of $k$-neighborhood is used, then the method is called a $k$-isomap; otherwise, it is called an $\varepsilon$-isomap.

Step 2: Distance graph construction. After the neighborhoods are defined, a graph $G(P, E)$ can be constructed on the data set, consisting of a vertex set $P$ and an edge set $E$, in which each element connects a vertex to one of its neighbors.
We can now define the graph distance $d_G(i, j)$ between $\vec p_i$ and $\vec p_j$ for each pair of points of $P$. For convenience, let the Euclidean distance $d_2(\vec p_i, \vec p_j)$ be denoted by $d_2(i, j)$. On the graph $G$, each connection from $\vec p_i$ to $\vec p_j$ is called a path, denoted by $\gamma_{ij}$. The set of all such paths is denoted by

$$\Gamma_{ij} = \{\gamma_{ij};\ \gamma_{ij} \text{ is a path from } \vec p_i \text{ to } \vec p_j\}.$$

A path $\gamma_{ij}$ can be represented by the set of all vertices through which it passes, say $\gamma_{ij} = \{\vec p_{i_0}, \vec p_{i_1}, \dots, \vec p_{i_l}\}$, with $\vec p_{i_0} = \vec p_i$ and $\vec p_{i_l} = \vec p_j$. Then, the length of the path is defined by

$$d_{\gamma_{ij}} = \sum_{k=1}^{l} d_2(i_{k-1}, i_k),$$

from which the graph distance is defined by

$$d_G(i, j) = \min_{\gamma_{ij} \in \Gamma_{ij}} d_{\gamma_{ij}}.$$

Recall that if the $k$-neighborhood definition is adopted, then $d_G(i, j)$ may not be equal to $d_G(j, i)$, since

$$\vec p_j \in O_i \;\not\Rightarrow\; \vec p_i \in O_j.$$

Therefore, $\Gamma_{ij}$ may not be equal to $\Gamma_{ji}$. In this case, we define the graph distance between $\vec p_i$ and $\vec p_j$ by $\min(d_G(i, j), d_G(j, i))$. For convenience, we shall abuse notation and still use $d_G(i, j)$ for $\min(d_G(i, j), d_G(j, i))$. If there is no path connecting $\vec p_i$ and $\vec p_j$, then we set $d_G(i, j) = \infty$. It is clear that the graph distance $d_G(i, j)$ satisfies the distance axioms:

$$d_G(i, j) = d_G(j, i); \qquad d_G(i, j) = 0 \ \text{iff}\ \vec p_i = \vec p_j; \qquad d_G(i, j) \le d_G(i, k) + d_G(k, j).$$

The graph distance is an approximation of the geodesic distance on the manifold $M$. Since the given data set is discrete and we have no prior knowledge of $M$, we shall assume

$$d_G(i, j) = d_M(i, j).$$

To compute the graph distance between two pixels, the lengths of all paths must be estimated. However, this is very time consuming. For example, Floyd's algorithm (Kumar et al. 1994), which employs an exhaustive iterative method for computing all possible paths to achieve the minimum, requires $O(n^3)$ computations. For large $n$, Dijkstra's algorithm (Balasubramanian et al. 2002) can be applied to reduce the computation to $O(n^2 \log n)$. In any case, Step 2 yields an $n \times n$ (square graph distance) matrix

$$S_G = \left[ d_G^2(i, j) \right]_{i,j=1}^{n},$$

which is also symmetric.

Step 3: DR kernel construction. Since we have already assumed $d_G = d_M$, and since the map $f$ is an isometry, we have

$$d_G(i, j) = d_M(\vec p_i, \vec p_j) = d_2(f(\vec p_i), f(\vec p_j)), \qquad \forall i, j.$$

Hence, the matrix $S_G$ is the square Euclidean distance matrix of $Y = f(P)$. By setting

$$G = -\frac{1}{2} H S_G H,$$

it follows from Theorem 2 that $G$ is the centering Gram matrix of the data matrix $Y$.

Step 4: Eigen decomposition. Let the eigen decomposition of $G$ be given by

$$G = V \Lambda V',$$

where $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_d, \dots, \lambda_n)$ and $V = [\vec v_1, \dots, \vec v_n]$. Let $Y = [\vec y_1, \dots, \vec y_n]$ be defined by

$$Y' = \left[ \sqrt{\lambda_1}\, \vec v_1, \sqrt{\lambda_2}\, \vec v_2, \dots, \sqrt{\lambda_d}\, \vec v_d \right].$$


Then $Y$ is the dimensionality reduction of $P$. In general, since the data set $\{\vec p_1, \dots, \vec p_n\}$ does not exactly lie on a $d$-dimensional manifold in $\mathbb{R}^D$, the Euclidean distance on $Y$ only approximates the manifold distance on $P$. But by Theorem 5, we have the following error estimate:

$$\frac{1}{2n} \sum_{i,j=1}^{n} \left( d_G^2(\vec p_i, \vec p_j) - d_2^2(\vec y_i, \vec y_j) \right) \le \sum_{l=d+1}^{n} \lambda_l.$$
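The four isomap steps can be put together in a short NumPy sketch. It is only an illustration under stated assumptions: Floyd's $O(n^3)$ algorithm is used because it is the simplest to write (Dijkstra's algorithm would be the choice for large $n$), the asymmetric $k$-neighborhood graph is symmetrized by the rule $\min(d_G(i,j), d_G(j,i))$ from the text, the test data are synthetic pixels on a half circle, and the function names are hypothetical.

```python
import numpy as np

def graph_distances(dist, neighbors):
    """All-pairs graph distances on the directed neighborhood graph."""
    n = dist.shape[0]
    dG = np.full((n, n), np.inf)                 # infinity marks 'no path'
    np.fill_diagonal(dG, 0.0)
    for i in range(n):
        dG[i, neighbors[i]] = dist[i, neighbors[i]]   # edges p_i -> neighbors
    for k in range(n):                           # Floyd's algorithm
        dG = np.minimum(dG, dG[:, [k]] + dG[[k], :])
    return np.minimum(dG, dG.T)                  # min(d_G(i,j), d_G(j,i))

def isomap(P, k, d):
    """k-isomap sketch: returns the d x n embedding Y and the graph distances."""
    n = P.shape[1]
    dist = np.sqrt(np.sum((P[:, :, None] - P[:, None, :]) ** 2, axis=0))
    order = np.argsort(dist + np.diag(np.full(n, np.inf)), axis=1)
    dG = graph_distances(dist, order[:, :k])     # Steps 1 and 2
    if np.isinf(dG).any():
        raise ValueError("neighborhood graph is disconnected; increase k")
    H = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * H @ (dG ** 2) @ H                 # Step 3: DR kernel
    lam, V = np.linalg.eigh(G)                   # Step 4: eigen decomposition
    lam, V = lam[::-1][:d], V[:, ::-1][:, :d]    # d leading eigenpairs
    Y = np.sqrt(np.clip(lam, 0.0, None))[:, None] * V.T
    return Y, dG

# Pixels on a half circle: a one-dimensional manifold in R^2.
theta = np.linspace(0.0, np.pi, 40)
P = np.vstack([np.cos(theta), np.sin(theta)])
Y, dG = isomap(P, k=2, d=1)
```

For this example the graph distance between the two end pixels is close to the arc length $\pi$ rather than the Euclidean distance 2, and the single coordinate in `Y` is, up to sign and shift, nearly proportional to the angle `theta`.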

4.2.3 Conclusion

Isomap is a computationally stable algorithm. Various experiments show that most HSI data lie on convex manifolds. Hence, the isomap method usually models HSI well, and it is widely adopted in HSI data analysis. However, its disadvantages are that estimating geodesic distances requires intensive computation and that the Gram matrix $G$ is usually dense. Hence, the isomap method is considered to be among the most computationally expensive algorithms. Besides, if the HSI data do not lie on a convex manifold (for instance, if the manifold contains some "holes"), then the isomap method may not preserve the topology of the data structure.

4.3 Semidefinite Programming

The method of semidefinite programming (SDP), introduced by Weinberger and Saul in Weinberger et al. (2005), is also called semidefinite embedding (SDE), or maximum variance unfolding (MVU).

4.3.1 Description of the SDP Method

As before, we again assume that the data set $\{\vec p_1, \dots, \vec p_n\} \subset \mathbb{R}^D$ lies on a convex $d$-manifold $M$. Let $O_i$ be the neighborhood of the point $\vec p_i$, and let $N(i)$ denote the index set of the pixels in $O_i$. If the neighborhood $O_i$ is well defined, it should be very close to the tangent space of $M$ at $\vec p_i$, so that the local geometry of $O_i$ resembles that of a set in a $d$-dimensional subspace. Let $\langle \cdot, \cdot \rangle$ denote the inner product on $\mathbb{R}^D$. Then the (local) geometry of $O_i$ can be characterized by the local shifted Gram matrix

$$G_P^i := (G_{jk}^i)_{j,k \in N(i)} := \left( \langle \vec p_i - \vec p_j, \vec p_i - \vec p_k \rangle \right)_{j,k \in N(i)}.$$

Let $Y = f(P)$ be the coordinates of $P$ in $\mathbb{R}^d$, so that the neighborhood $O_i$ is mapped to $f(O_i)$. Assume that $f$ is also isometric. Then we have

$$G_Y^i = G_P^i, \qquad i = 1, \dots, n, \tag{19}$$

where

$$G_Y^i = \left( \langle \vec y_i - \vec y_j, \vec y_i - \vec y_k \rangle \right)_{j,k \in N(i)} \tag{20}$$

and the inner product in (20) is defined on $\mathbb{R}^d$. Since $\{\vec y_1, \dots, \vec y_n\}$ is a data set in $\mathbb{R}^d$ and its geometry is Euclidean, the (global) Gram matrix of $Y$, namely $K$, can be obtained from all the


local shifted Grams $\{G_Y^i\}_{i=1}^{n}$. Since $G_Y^i = G_P^i$ by (19), $K$ can be constructed from $\{G_P^i\}_{i=1}^{n}$. Finally, the SDE method achieves the dimensionality reduction $Y$ from its Gram matrix $K$. In general, the data set $P$ may not exactly lie on a $d$-dimensional manifold $M$, so that (19) does not hold. Then $G_Y^i$ is constructed from the approximation of $G_P^i$.

4.3.2 SDP Algorithm

The SDP algorithm can be described by the following steps.

Step 1: Neighborhood definition. This is the same as the description for the isomap method.

Step 2: Local Gram matrix construction. The local (shifted) Gram matrix $G_P^i$ must be introduced for each neighborhood $O_i$. Recall that the entries of $G_P^i$ can be represented by

$$\left( G_P^i \right)_{jk} = \langle \vec p_i, \vec p_i \rangle - \langle \vec p_i, \vec p_k \rangle - \langle \vec p_j, \vec p_i \rangle + \langle \vec p_j, \vec p_k \rangle. \tag{21}$$

If the Gram matrix of $P$ is denoted by

$$G_P := (g_{ij})_{i,j=1}^{n} = P'P,$$

we then have

$$(G_P^i)_{jk} = g_{ii} - g_{ij} - g_{ik} + g_{jk}.$$

Step 3: DR kernel construction. Construction of the Gram matrix (i.e., the DR kernel) of the dimensionality reduction data set from the local Gram matrices of the original data set $P$ is called semidefinite programming. This is the key step of the SDP. In this step, the Gram matrix

$$K := (K_{ij})_{i,j=1}^{n}, \qquad K_{ij} = \langle \vec y_i, \vec y_j \rangle,$$

of $Y \in M_{d,n}$ is constructed as the solution of the optimization problem

$$K = \arg\max_{K \in S_n(d)} \operatorname{tr}(K)$$

under the constraints that $\sum_{i,j=1}^{n} K_{ij} = 0$ and

$$K_{ii} - K_{ij} - K_{ik} + K_{jk} = g_{ii} - g_{ij} - g_{ik} + g_{jk}, \qquad i = 1, \dots, n, \quad j, k \in N(i),$$

where the first constraint means that $K$ is centering, and the second constraint guarantees that (19) holds, i.e., the map $f: P \to Y$ is locally isometric.

Step 4: Eigen decomposition of DR kernel. This step is similar to Step 4 in the description of the isomap method.

4.3.3 Conclusion

The SDP method has properties similar to those of the isomap method, in that the method is computationally stable but the DR kernel constructed is dense, and the DR algorithm is computationally intensive, while requiring large memory.
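The right-hand sides of the isometry constraints in Step 3 are entries of the local shifted Gram matrices, which by (21) can be read off the global Gram matrix $P'P$ without forming any difference vectors. The NumPy sketch below checks this on synthetic data; it does not solve the semidefinite program itself, which would require a dedicated solver, and the function name and the chosen index set are illustrative assumptions.

```python
import numpy as np

def local_shifted_gram(G, i, nbrs):
    """Entries g_ii - g_ij - g_ik + g_jk for j, k in N(i), from the Gram matrix G."""
    nbrs = np.asarray(nbrs)
    return (G[i, i]
            - G[i, nbrs][:, None]                # g_ij, varies along rows (index j)
            - G[i, nbrs][None, :]                # g_ik, varies along columns (index k)
            + G[np.ix_(nbrs, nbrs)])             # g_jk

rng = np.random.default_rng(3)
P = rng.standard_normal((3, 8))                  # D = 3, n = 8 synthetic pixels
G = P.T @ P                                      # global Gram matrix
i, nbrs = 0, [2, 5, 6]                           # hypothetical neighborhood N(0)
from_gram = local_shifted_gram(G, i, nbrs)

shifted = P[:, [i]] - P[:, nbrs]                 # columns p_i - p_j, j in N(i)
direct = shifted.T @ shifted                     # definition of G_P^i
```

Both routes give the same matrix, and its diagonal holds the squared distances from the pixel to its neighbors. These are the quantities that the optimization over $K$ is required to preserve.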


4.4 Locally Linear Embedding

The method of locally linear embedding (LLE) was introduced by Roweis and Saul (2000).

4.4.1 Description of the LLE Method

Let $P_{D \times n}$ be the given data matrix, and let $O_i$, with index set $N(i)$, be the neighborhood of each pixel $\vec p_i$ in the data set. Since each pixel in $O_i$ is supposed to be similar to $\vec p_i$, we require $d_2(\vec p_i, \vec p_j) \approx 0$, $\forall j \in N(i)$. Hence, for any set

$$\Big\{ w_{ij} \in \mathbb{R};\ j \in N(i),\ \sum_{j \in N(i)} w_{ij} = 1 \Big\}$$

of weights, the sum $\sum_{j \in N(i)} w_{ij} \vec p_j$ must also be similar to $\vec p_i$. The LLE method specifies the choice of the set $\{w_{ij}\}$, $j \in N(i)$, of weights by minimizing the error

$$\Big\| \vec p_i - \sum_{j \in N(i)} w_{ij} \vec p_j \Big\|. \tag{22}$$

Then the sum $\sum_{j \in N(i)} w_{ij} \vec p_j$ is called the locally linear embedding (LLE) of $\vec p_i$, and the LLE method describes similarity between a pixel (in a data set) and its neighbors in terms of this set of weights. Finally, the DR data set $Y_{d \times n}$ of the LLE method is computed by optimally preserving the local linear embedding, in that $Y$ minimizes the locally linear embedding error:

$$Y = \arg\min_{X \in M_{d,n}} \Psi(X), \qquad \Psi(X) := \sum_{i=1}^{n} \Big\| \vec x_i - \sum_{j \in N(i)} w_{ij} \vec x_j \Big\|^2.$$

4.4.2 LLE Algorithm

The LLE method adopts the approach described in Fig. 2.

Step 1: Neighborhood definition. The neighborhood of each point can be either a $k$-neighborhood or an $\varepsilon$-neighborhood. The number of pixels in a neighborhood is usually greater than $d$.

Step 2: Weighted graph construction. LLE generates a weighted graph, by using the weight matrix $W = [w_{ij}]_{i,j=1}^{n}$, according to the following two rules.

1. Localness. If $\vec p_j$ is not in the neighborhood of $\vec p_i$, then the weight $w_{ij} = 0$. Let the $i$th row $w_i$ in $W$ represent the weight-set of $\vec p_i$. Then all the nonzero components in the $i$th row $w_i$ are in the set $\{w_{ij}\}_{j \in N(i)}$.
2. Local invariance. The local weight of $\vec p_i$ is invariant under rotations, translations, and dilations of its neighbors. The reason is that the condition $\sum_{j \in N(i)} w_{ij} = 1$ ensures such local invariance. Indeed, since both rotation and dilation are linear operations, by using $T$ to denote the rotation operator and, for $\rho \neq 0$, denoting by $\rho \vec p_j$ the dilation of $\vec p_j$, we see that

$$\sum_{j \in N(i)} w_{ij} (T \vec p_j) = T \Big( \sum_{j \in N(i)} w_{ij} \vec p_j \Big)$$

and

$$\sum_{j \in N(i)} w_{ij} (\rho \vec p_j) = \rho \Big( \sum_{j \in N(i)} w_{ij} \vec p_j \Big).$$

As to translation invariance, suppose that each vector in $\mathbb{R}^D$ is translated by a vector $\vec a \in \mathbb{R}^D$. Then we have

$$\sum_{j \in N(i)} w_{ij} (\vec p_j - \vec a) = \sum_{j \in N(i)} w_{ij} \vec p_j - \Big( \sum_{j \in N(i)} w_{ij} \Big) \vec a = \sum_{j \in N(i)} w_{ij} \vec p_j - \vec a.$$

Next, write

$$E_i(W) = \Big\| \sum_{j \in N(i)} w_{ij} \vec p_j - \vec p_i \Big\|^2,$$

and set

$$E(W) = \sum_{i=1}^{n} E_i(W) = \sum_{i=1}^{n} \Big\| \vec p_i - \sum_{j \in N(i)} w_{ij} \vec p_j \Big\|^2.$$

Also denote by $I_n$ the set of matrices of which the sum of each row is 1. Then, the LLE weight matrix $W^*$ is the solution of the minimization problem

$$W^* = \arg\min_{W \in I_n} E(W),$$

and this minimization problem can be uncoupled as follows:

$$\arg\min_{w_{ij}} E_i(W), \qquad \text{subject to} \quad \sum_{j \in N(i)} w_{ij} = 1, \tag{23}$$

so that the problem can be solved for each $i$ individually. The variational approach can be applied for this purpose. Indeed, by using the notation $C = (c_{sj})_{k \times k}$, with

$$c_{sj} = \langle \vec p_j - \vec p_i, \vec p_s - \vec p_i \rangle, \qquad j, s \in N(i),$$

the Euler-Lagrange equation of the minimization problem (23), for each $i$, is given by a linear system indexed by $N(i)$ in terms of $\vec w_i = (w_{ij_1}, \dots, w_{ij_k})'$, namely

$$\sum_{j \in N(i)} c_{sj} w_{ij} - \lambda = 0, \qquad s \in N(i),$$

where $\lambda$ is the Lagrange multiplier, with compact vector formulation given by

$$C \vec w_i = \lambda e, \qquad e = [1, 1, \dots, 1]'.$$

If $C$ has full rank, we can first solve for $\vec w_i$ by setting $\lambda = 1$, followed by normalization of the solution by resetting $\lambda = 1 / \sum_{j \in N(i)} w_{ij}$. On the other hand, if $C$ is not of full rank, then we may set $\lambda = 0$ and solve the homogeneous equation $C \vec w_i = 0$ with the constraint $\sum_{j \in N(i)} w_{ij} = 1$. In this case, the solution may not be unique.

Step 3: DR kernel construction. The new data matrix $Y_{d \times n}$ is defined by minimization of the local squared $l_2$-error

$$\Psi(X) = \sum_{i} \Big\| \vec x_i - \sum_{j \in N(i)} w_{ij} \vec x_j \Big\|^2, \tag{24}$$

under the constraints

$$(x^j)' e = 0, \qquad (x^j)' x^j = 1, \qquad j = 1, 2, \dots, d,$$

where $x^j \in \mathbb{R}^n$ is the $j$th row vector of $X$. Recall that

$$\vec x_i - \sum_{j \in N(i)} w_{ij} \vec x_j = \vec x_i - \sum_{j=1}^{n} w_{ij} \vec x_j = \sum_{l=1}^{d} \Big( x_{li} - \sum_{j=1}^{n} w_{ij} x_{lj} \Big) \vec e_l,$$

where the set $\{\vec e_l\}_{l=1}^{d}$ is the standard basis of $\mathbb{R}^d$. Then $\Psi(X)$ in (24) can be reformulated as

$$\Psi(X) = \sum_{i=1}^{n} \Big\langle \vec x_i - \sum_{j=1}^{n} w_{ij} \vec x_j,\ \vec x_i - \sum_{j=1}^{n} w_{ij} \vec x_j \Big\rangle,$$

so that

$$\Psi(X) = \operatorname{tr}(X L' L X'), \tag{25}$$

where $L = I - W$. Hence, by (25), the DR kernel is given by

$$K = L'L = (I - W)'(I - W).$$


Step 4: Eigen decomposition of the DR kernel. Let the singular value decomposition of $L$ be given by

$$L = V \Sigma U'.$$

Then we have

$$\Psi(X) = \operatorname{tr}(X L' L X') = \operatorname{tr}\left( X U \Sigma (X U \Sigma)' \right).$$

Since $Le = 0$, the bottom eigenvector of $K$, associated with the 0-eigenvalue, is $\frac{1}{\sqrt{n}} e$, which does not satisfy the constraints of (24) and should be excluded from the solution set of (24). Let $\mathbf{Y} = [y^1, \dots, y^d]$ be the eigenvector matrix corresponding to the second to the $(d+1)$st smallest eigenvalues of $K$. Then the minimum of $\Psi(X)$ is attained by $\mathbf{Y}$, so that $Y = \mathbf{Y}'$ is the dimensionality reduction of $P$.

4.4.3 Conclusion

The algorithm for the LLE is a fast algorithm, and the method provides a fairly good linearization of the data on some manifold. However, the algorithm is occasionally computationally unstable.
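Steps 2 to 4 of the LLE algorithm in Sect. 4.4.2 can be sketched in NumPy as follows. One deliberate deviation is labeled as an assumption: instead of treating the rank-deficient case of $C$ separately, the sketch adds a small ridge term to $C$ (a common practical device) so that the system $C \vec w_i = e$ is always solvable. The function names, the parameter `reg`, and the half-circle data are illustrative.

```python
import numpy as np

def lle_weights(P, neighbors, reg=1e-3):
    """Row i of W solves C w = e on N(i) and is then rescaled to sum to 1."""
    n = P.shape[1]
    W = np.zeros((n, n))                         # localness: zero off the neighborhood
    for i in range(n):
        idx = np.asarray(neighbors[i])
        Z = P[:, idx] - P[:, [i]]                # columns p_j - p_i
        C = Z.T @ Z                              # c_sj = <p_j - p_i, p_s - p_i>
        # Assumption: ridge regularization keeps C invertible when k > D.
        C = C + reg * np.trace(C) * np.eye(len(idx))
        w = np.linalg.solve(C, np.ones(len(idx)))
        W[i, idx] = w / w.sum()                  # enforce sum_j w_ij = 1
    return W

def lle_embed(W, d):
    """Rows of Y are eigenvectors 2..d+1 of the kernel K = (I - W)'(I - W)."""
    n = W.shape[0]
    L = np.eye(n) - W
    K = L.T @ L
    lam, V = np.linalg.eigh(K)                   # ascending eigenvalues
    return V[:, 1:d + 1].T, lam                  # skip the constant eigenvector

theta = np.linspace(0.0, np.pi, 30)
P = np.vstack([np.cos(theta), np.sin(theta)])    # pixels on a half circle
dist = np.linalg.norm(P[:, :, None] - P[:, None, :], axis=0)
np.fill_diagonal(dist, np.inf)
neighbors = np.argsort(dist, axis=1)[:, :4]      # 4-neighborhoods
W = lle_weights(P, neighbors)
Y, lam = lle_embed(W, d=1)
```

Because every row of `W` sums to 1, the constant vector lies in the null space of `K`, so the smallest eigenvalue in `lam` is zero up to round-off and is skipped when the embedding is read off.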

4.5 Local Tangent Space Alignment

The local tangent space alignment method (LTSA) was introduced by Zhang and Zha (2003, 2005). Recent development and applications of this method can be found in Lin et al. (2006), Zhao (2006), and Zha and Zhang (2009).

4.5.1 Description of the LTSA Method

Let $P_{D \times n}$ be the given data, and assume that it lies on some $d$-dimensional smooth manifold $M \subset \mathbb{R}^D$. Let $f: U \subset \mathbb{R}^d \to M$ be an isometric parametrization of $M$, such that for each $\vec p \in M$,

$$\vec p = f(\vec u), \qquad \vec u \in U.$$

Let $Y = [\vec y_1, \dots, \vec y_n] \in M_{d,n}$ be determined by $f(\vec y_i) = \vec p_i$. Then $Y$ is the dimensionality reduction of $P$. The LTSA method is used for finding the unknown map $f$ as follows. For each $\vec p \in M$, assume that the neighborhood of $\vec p$, denoted by $O_{\vec p}$, is well defined. Then, by using the language of machine learning, the local geometry of each neighborhood $O_{\vec p}$ can be learned from the local tangent space $T_{\vec p}$ at $\vec p$. In fact, let $J_f(\vec u)$ be the Jacobi matrix of $f$ at $\vec u$, where $\vec p = f(\vec u)$, and let $Q_{\vec p}$ be the orthogonal projection from $O_{\vec p}$ to the tangent space $T_{\vec p}$. For each $\tilde p \in O_{\vec p}$, we write the local coordinate of $\tilde p - \vec p$ as

$$\vec\tau = Q_{\vec p}(\tilde p - \vec p), \qquad \vec\tau \in T_{\vec p}.$$

Let $\vec p = f(\vec u)$ and $\tilde p = f(\tilde u)$. By applying the Taylor formula, we have

$$\vec\tau = Q_{\vec p} J_f(\vec u)(\tilde u - \vec u) + O(d_2^2(\tilde u, \vec u)),$$

where $P_{\vec u} := Q_{\vec p} J_f(\vec u)$ is a (locally linear) invertible transformation. We denote its inverse by $L_{\vec u} = P_{\vec u}^{-1}$. Since $d_2^2(\tilde u, \vec u)$ is close to 0, the term $O(d_2^2(\tilde u, \vec u))$ can be ignored, so that

$$\tilde u - \vec u = L_{\vec u} \vec\tau.$$

The above equation shows that the global coordinate $\tilde u - \vec u$ for $f$ can be obtained by aligning the local coordinate $\vec\tau$. Hence, the LTSA method realizes dimensionality reduction by finding the local coordinates (with respect to the tangent space), followed by constructing an alignment to transform them to the global coordinates.

4.5.2 LTSA Algorithm

Step 1: Neighborhood definition. The notion of $k$-neighborhood, with $k \ge d$, is used for this method.

Step 2: Local coordinates computation. Let $X_i = [\vec p_{i_1}, \dots, \vec p_{i_k}]$ be the matrix consisting of the $k$ neighbors of $\vec p_i$, and $\hat X_i = [\vec p_{i_1} - \vec p_i, \dots, \vec p_{i_k} - \vec p_i]$. We find the local coordinates (in $\mathbb{R}^d$) of $\hat X_i$ by PCA, that is, by considering the SVD

$$\hat X_i = \sum_{j=1}^{k} \sigma_j \vec u_j (\vec v_j)'$$

of $\hat X_i$. Then, the local tangent coordinates of $\hat X_i$ are given by the rows $\sigma_j (\vec v_j)'$, $j = 1, \dots, d$, that is, by $T_i = [\sigma_1 \vec v_1, \dots, \sigma_d \vec v_d]' \in M_{d,k}$, and an orthonormal basis for the local tangent coordinates is $Q_i = [\vec v_1, \dots, \vec v_d] \in O_{k,d}$, in which each column is a local tangent coordinate function on $O_i$. Let $G_i = \left[ \frac{1}{\sqrt{k}} e, Q_i \right]$. Then the columns of $G_i \in O_{k,d+1}$ consist of the constant function and all coordinate functions on $O_i$.

Step 3: DR kernel construction. The DR kernel $K$ in LTSA is called the alignment matrix. Let $N(i)$ be the index set of the neighborhood of $\vec p_i$. Then the contribution of the neighborhood $O_i$ to the submatrix of $K$ corresponding to the index set $N(i)$ is $K(N(i), N(i)) = I - G_i G_i'$; since neighborhoods overlap, these contributions are accumulated over $i = 1, \dots, n$.

Step 4: Eigen decomposition of DR kernel. Let $\mathbf{Y} = [y^1, \dots, y^d]$ be the eigenvector matrix corresponding to the 2nd to the $(d+1)$st smallest eigenvalues of $K$. Then $Y = \mathbf{Y}'$ is the dimensionality reduction of $P$.

4.5.3 Conclusion

Similar to LLE, the kernel in the LTSA method is sparse, and hence a fast algorithm can be developed. However, when the data are contaminated with noise, the algorithm may become unstable.
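A compact NumPy sketch of the LTSA steps is given below. Two choices are assumptions rather than statements of the chapter: each neighborhood matrix is centered at its own mean (as in the usual formulation by Zhang and Zha), which makes the constant vector exactly orthogonal to the tangent coordinate functions, and the local blocks $I - G_i G_i'$ are accumulated over all neighborhoods. Function and variable names are illustrative.

```python
import numpy as np

def ltsa(P, neighbors, d):
    """LTSA sketch: returns the d x n embedding and the alignment matrix K."""
    n = P.shape[1]
    K = np.zeros((n, n))
    for i in range(n):
        idx = np.asarray(neighbors[i])
        k = len(idx)
        Xi = P[:, idx]                           # D x k matrix of neighbors
        # Assumption: center at the neighborhood mean, so that the right
        # singular vectors are orthogonal to the constant vector.
        Xi_hat = Xi - Xi.mean(axis=1, keepdims=True)
        _, _, Vt = np.linalg.svd(Xi_hat, full_matrices=False)
        Qi = Vt[:d, :].T                         # k x d tangent coordinate functions
        Gi = np.hstack([np.ones((k, 1)) / np.sqrt(k), Qi])
        K[np.ix_(idx, idx)] += np.eye(k) - Gi @ Gi.T   # accumulate local block
    lam, V = np.linalg.eigh(K)                   # ascending eigenvalues
    return V[:, 1:d + 1].T, K                    # skip the constant eigenvector

theta = np.linspace(0.0, np.pi, 30)
P = np.vstack([np.cos(theta), np.sin(theta)])    # pixels on a half circle
dist = np.linalg.norm(P[:, :, None] - P[:, None, :], axis=0)
np.fill_diagonal(dist, np.inf)
neighbors = np.argsort(dist, axis=1)[:, :4]      # 4-neighborhoods
Y, K = ltsa(P, neighbors, d=1)
```

With this centering, every local block annihilates the constant vector, so the assembled alignment matrix is symmetric, positive semidefinite, and has the constant vector in its null space. This is why the eigenvector of the smallest eigenvalue is skipped in Step 4.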

4.6 Laplacian Eigenmaps

The Laplacian eigenmap method was introduced by Belkin and Niyogi (2002, 2003). It is closely related to manifold learning (see Belkin and Niyogi 2004; Park et al. 2004; Law and Jain 2006;


Li et al. 2007). While the idea of Laplacian eigenmaps is similar to that of LLE, its DR kernel is derived from the Laplace-Beltrami operator on the manifold on which the data set resides.

4.6.1 Description of the Laplacian Eigenmap Method

Let $\{\vec p_1, \dots, \vec p_n\} \subset \mathbb{R}^D$ be the data set, which resides on some $d$-dimensional (smooth) manifold $M \subset \mathbb{R}^D$. Let $\mathbf{f} = [f^1, \dots, f^d]'$ be some local coordinate (vector-valued) function on the manifold $M$, in the sense that for each $x \in M$, $\mathbf{f}(x) \in \mathbb{R}^d$ provides certain local coordinates of $x$. Let $e$ be the constant function on $M$, defined by $e(x) = 1$, $x \in M$. Then, the totality of $f^l$, $1 \le l \le d$, and $e$ constitutes a basis of the space of linear functions on $M$. Let $C^k(M)$ be the space of $k$ times differentiable functions on $M$, and let $\mathcal{L}$ be the Laplace-Beltrami operator on $C^2(M)$, defined by

$$\mathcal{L}u = -\operatorname{div}(\operatorname{grad}(u)), \qquad u \in C^2(M).$$

The operator $\mathcal{L}$ is positive semidefinite and self-adjoint, with $e, f^1, \dots, f^d$ residing in its null space. If the $d$ coordinate functions $f^1, \dots, f^d$ can be constructed, then the data set $\{\mathbf{f}(\vec p_1), \dots, \mathbf{f}(\vec p_n)\} \subset \mathbb{R}^d$ would provide a dimensionality reduction of the given data $\{\vec p_1, \dots, \vec p_n\}$. In practice, since the data set $\{\vec p_1, \dots, \vec p_n\}$ may not precisely reside on a $d$-manifold, we need to choose a suitable $d$-dimensional manifold $M$ on which the data set $P$ approximately resides.

The Laplace-Beltrami operator can be constructed in the following way. As before, assume that the manifold structure is given by the neighborhood definition, and that the neighborhood $O_i$ of each pixel $\vec p_i$ is well defined, either by $k$-neighborhood or by $\varepsilon$-neighborhood. If the $k$-neighborhood definition is adopted, then we may modify it to ensure symmetry of neighbors. This can be achieved as follows. If $\vec p_i$ is a neighbor of $\vec p_j$, but $\vec p_j$ is not a neighbor of $\vec p_i$, then we force $\vec p_j$ to be a neighbor of $\vec p_i$. Let $H_t = \exp(-t\mathcal{L})$ be the Neumann heat kernel on $M$. Then the spectral radius of $H_t$ is 1. Since $e, f^1, \dots, f^d$ are in the null space of $\mathcal{L}$, they are in the 1-eigenspace of $H_t$, which can be approximately represented as a weight matrix $\tilde W = [w_{ij}/d_i]_{i,j=1}^{n}$ on the graph $G(P, E)$, where

$$w_{ij} = \begin{cases} \exp\left( -\dfrac{\|\vec p_i - \vec p_j\|^2}{4t} \right), & \text{if } \vec p_j \in O_i, \\ 0, & \text{otherwise}, \end{cases} \tag{26}$$

and $d_i = \sum_{j=1}^{n} w_{ij}$ (see Belkin and Niyogi 2003, for details). Let

$$D = \operatorname{diag}(d_1, \dots, d_n), \qquad W = [w_{ij}]_{i,j=1}^{n}, \tag{27}$$

and let $\vec y_i = [f^1(\vec p_i), \dots, f^d(\vec p_i)]'$, $i = 1, \dots, n$, and $Y = [\vec y_1, \dots, \vec y_n] \subset \mathbb{R}^d$. Then we have

$$Y D Y' = I, \qquad Y D e = 0, \qquad \text{and} \qquad \operatorname{tr}(Y (D - W) Y') = 0.$$


In general, since the data set $\{\vec p_1, \ldots, \vec p_n\}$ may not reside on a $d$-dimensional manifold, dimensionality reduction is realized by considering minimization of the quadratic form

$$Y = \arg\min_{\substack{Y D Y' = I \\ Y D e = 0}} \operatorname{tr}(Y (D - W) Y'). \tag{28}$$

But since

$$\frac{1}{2} \sum_{i,j} d_2^2(\vec y_i, \vec y_j)\, w_{ij} = \operatorname{tr}(Y (D - W) Y'),$$

the minimization problem (28) can also be formulated as

$$Y = \arg\min_{\substack{Y D Y' = I \\ Y D e = 0}} \sum_{i,j} d_2^2(\vec y_i, \vec y_j)\, w_{ij}. \tag{29}$$
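The identity used to pass from (28) to (29) follows from a short expansion. The derivation below is a sketch that assumes only what is stated above: $W$ is symmetric, $d_i = \sum_j w_{ij}$, and $d_2$ is the Euclidean distance.

$$\begin{aligned}
\sum_{i,j} w_{ij}\,\|\vec y_i - \vec y_j\|^2
&= \sum_{i,j} w_{ij}\left(\|\vec y_i\|^2 + \|\vec y_j\|^2 - 2\,\vec y_i{}'\vec y_j\right) \\
&= 2\sum_{i} d_i\,\|\vec y_i\|^2 - 2\sum_{i,j} w_{ij}\,\vec y_i{}'\vec y_j \\
&= 2\operatorname{tr}(Y D Y') - 2\operatorname{tr}(Y W Y') \\
&= 2\operatorname{tr}(Y (D - W) Y').
\end{aligned}$$

Dividing by 2 gives the stated identity, so (28) and (29) have the same minimizers.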

4.6.2 Theoretic Discussion of the Laplacian Eigenmap Method

In the following, we provide the mathematical foundation for the Laplacian eigenmap method. Assume that the data set $\{\vec p_1, \ldots, \vec p_n\}$ resides on some $d$-dimensional manifold $M \subset \mathbb{R}^D$. Let $f \in C^2(M)$, $f: M \to \mathbb{R}$, and define the inner product on $C^2(M)$ by

$$\langle f, g \rangle = \int_M f(x) g(x)\,dx, \quad f, g \in C^2(M).$$

Let $\mathcal{L}$ be the Laplace-Beltrami operator on $M$ as defined previously. Then, by Stokes' theorem, we have

$$\mathcal{L}f = -\operatorname{div}(\operatorname{grad} f).$$

Again as above, let $e$ be the constant function on $M$ and $f^1, \ldots, f^d \in C^2(M)$ be the selected coordinate functions on $M$. Then $\langle f^l, e \rangle = 0$, $l = 1, \ldots, d$, $\langle f^l, f^k \rangle = 0$, $l \ne k$, and

$$\int_M \operatorname{div}\left(\operatorname{grad} f^l(x)\right) f^l(x)\,dx = 0, \quad l = 1, 2, \ldots, d.$$

Again by Stokes' theorem, we have

$$\int_M \left\|\operatorname{grad} f^l(x)\right\|^2 dx = -\int_M \operatorname{div}\left(\operatorname{grad} f^l(x)\right) f^l(x)\,dx,$$

so that

$$\int_M \left\|\operatorname{grad} f^l(x)\right\|^2 dx = 0, \quad l = 1, 2, \ldots, d. \tag{30}$$


Suppose that the data set $\{\vec p_1, \ldots, \vec p_n\}$ does not reside on a $d$-dimensional manifold, but on some $m$-dimensional manifold $M$, where $d < m \le D$. Then we select the coordinate functions $f^1, \ldots, f^d$ such that $\langle f^l, e \rangle = 0$, $l = 1, \ldots, d$, $\langle f^l, f^k \rangle = 0$, $l \ne k$, and

$$\left(f^1, \ldots, f^d\right) = \arg\min \sum_{l=1}^{d} \int_M \left\|\operatorname{grad} f^l(x)\right\|^2 dx,$$

of which the discrete formulation leads to (29).

4.6.3 Laplacian Eigenmaps Algorithm

Step 1: Neighborhood definition. Either the $k$-neighborhood or $\varepsilon$-neighborhood can be defined for this method.

Step 2: Weighted graph creation. The Laplacian eigenmaps method adopts positive and symmetric graphs. When the $k$-neighborhood is applied, the edges need the following modification to ensure that the weighted graph is symmetric: $\vec p_i$ and $\vec p_j$ have to be connected if either $\vec p_i$ is in the neighborhood of $\vec p_j$, or $\vec p_j$ is in the neighborhood of $\vec p_i$. Then, for each pair $(\vec p_i, \vec p_j)$, we set the weight $w_{ij}$ by (26).

Step 3: DR kernel construction. If the weight matrix is $W = (w_{ij})$, let $d_i = \sum_j w_{ij}$ and $D = \operatorname{diag}(d_1, \ldots, d_n)$. Then $L = D - W$ is a DR kernel for the Laplacian eigenmap method.

Step 4: Eigen decomposition of DR kernel. For the Laplacian eigenmap method, we need to solve the generalized eigenproblem

$$L f = \lambda D f. \tag{31}$$

Let $\{\lambda_i\}_{i=0}^{n-1}$, with $0 = \lambda_0 \le \lambda_1 \le \cdots \le \lambda_{n-1}$, be the eigenvalues of (31), and let $\mathbf{Y} = [f^1, \ldots, f^d]$ be the eigenvector matrix corresponding to the eigenvalues $\lambda_1 \le \cdots \le \lambda_d$ such that

$$L f^i = \lambda_i D f^i, \quad i = 1, \ldots, d.$$

Then $Y = \mathbf{Y}'$ is the required dimension-reduced data of $P$.

4.6.4 Conclusion

Similar to LLE, the kernel used in the Laplacian eigenmap method is sparse. Hence, the algorithm for the Laplacian eigenmap method is fast. However, the weight matrix depends on a parameter ($t$ in (26)), and optimizing such parameters is difficult in general. In addition, when the data are contaminated with noise, the algorithm may become unstable.
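The four steps of Sect. 4.6.3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `laplacian_eigenmap`, the dense distance computation, and the default values of `k` and `t` are hypothetical choices, and the generalized eigenproblem (31) is solved through the equivalent symmetric problem for $D^{-1/2} L D^{-1/2}$.

```python
import numpy as np

def laplacian_eigenmap(P, d, k=6, t=1.0):
    """Sketch of Steps 1-4 for a D x n data matrix P (one point per column).

    Returns Y (d x n, the reduced data) and the weight matrix W (n x n).
    """
    n = P.shape[1]
    sq = np.sum(P**2, axis=0)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (P.T @ P), 0.0)

    # Step 1: k-neighborhood; the diagonal is masked so a point is not its own neighbor.
    masked = dist2.copy()
    np.fill_diagonal(masked, np.inf)
    nbrs = np.argsort(masked, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = True

    # Step 2: symmetrize the graph, then apply the heat-kernel weights of (26).
    adj = adj | adj.T
    W = np.where(adj, np.exp(-dist2 / (4.0 * t)), 0.0)

    # Step 3: DR kernel L = D - W.
    deg = W.sum(axis=1)
    L = np.diag(deg) - W

    # Step 4: solve L f = lambda D f through the symmetric problem
    # (D^-1/2 L D^-1/2) g = lambda g with f = D^-1/2 g.
    s = 1.0 / np.sqrt(deg)
    _, G = np.linalg.eigh(s[:, None] * L * s[None, :])
    F = s[:, None] * G               # columns satisfy F' D F = I
    return F[:, 1:d + 1].T, W        # skip f_0 (eigenvalue 0, constant vector)
```

When the neighborhood graph is connected, the returned $Y$ satisfies the constraints $Y D Y' = I$ and $Y D e = 0$ of (28) up to rounding. A sparse eigensolver would be the natural choice for large $n$; the dense solver is used here only to keep the sketch self-contained.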

4.7 Hessian Locally Linear Embedding

The Hessian locally linear embedding (HLLE) method was introduced by Donoho and Grimes (2003). While the (standard) LLE achieves linear embedding by minimizing the (squared) $l_2$ error in (22), the HLLE achieves linear embedding by minimizing the Hessian functional on the manifold where the data set resides.

4.7.1 Theoretic Background on HLLE Method

To describe the HLLE method, let us first define the Hessian functional on a manifold. Suppose that the data set $\{\vec p_1, \ldots, \vec p_n\}$ lies on some $d$-dimensional smooth manifold $M \subset \mathbb{R}^D$. For each point $m \in M$, there is a (manifold) neighborhood $O_m \subset M$, which is differentiably homeomorphic to an open set $U \subset \mathbb{R}^d$. Any differentiable homeomorphism $g: U \to O_m$ is called a parametrization of $O_m$, and $g^{-1}: O_m \to U$ is called a coordinate system on $O_m$. Of course, the parametrization $g$ is not unique, and we may choose an isometry $g$ of $U$ to $O_m$. Then $g^{-1}$ is called an isometric coordinate system on $O_m$. For instance, the geodesic coordinate system is isometric.

Let $g$ be a parametrization of $O_m$ and $g(u) = m$. Denote the derivative of $g$ at $u$ by $dg_u$. Then the image $dg_u(\mathbb{R}^d) \subset \mathbb{R}^D$ is the tangent space of $M$ at $m$, denoted by $TM_m$. It is known that the tangent space $TM_m$ is a local approximation of $O_m$, namely

$$g(u + \Delta u) = m + (dg_u)\Delta u + O(\|\Delta u\|^2), \quad g \in C^2(U).$$

Write $dg_u = [\tau_1, \ldots, \tau_d]$. Then $\Delta u$ provides the tangent coordinates of $g(u + \Delta u) \in M$. For an isometry $g$, the columns of $dg_u$ form an orthonormal system in $\mathbb{R}^D$, i.e., $dg_u$ is a linear isometric map from $\mathbb{R}^d$ to $TM_m$, so that $(dg_u)^{-1}$ is a linear isometric map from $TM_m$ to $\mathbb{R}^d$.

Let $f: M \to \mathbb{R}$ be some function defined on $M$. We write $f \in C^k(M)$ if for each parametrization $g$, $f \circ g$ is in $C^k(U)$. It is clear that the function space $C^k(M)$ is a linear space.

Definition 2. Let $[\theta_1, \theta_2, \ldots, \theta_d]'$ be isometric coordinates of $O_m$, and $\{e_i\}_{i=1}^d$ be the standard coordinate basis of $\mathbb{R}^d$. Then for each point $p \in O_m$, there is a vector $\theta = \sum_i \theta_i e_i$ such that $p = g(u + \theta)$. For $f \in C^2(M)$, the Hessian of $f$ at $m$ with respect to the isometric coordinates $[\theta_1, \theta_2, \ldots, \theta_d]'$ is defined by

$$\left(H_f^{\mathrm{iso}}\right)_{ij}(m) = \frac{\partial}{\partial \theta_i} \frac{\partial}{\partial \theta_j} (f \circ g)(u + \theta)\Big|_{\theta = 0},$$

and the corresponding Hessian functional on $C^2(M)$ is defined by

$$\mathcal{H}^{\mathrm{iso}}(f) = \int_M \left\|H_f^{\mathrm{iso}}\right\|_F^2\,dm, \quad f \in C^2(M),$$

where $\|\cdot\|_F$ is the matrix Frobenius norm.

Similarly, let $x = [x_1, x_2, \ldots, x_d]'$ be the tangent coordinates of a point $p \in O_m$, associated with the coordinate basis $\{\tau_1, \ldots, \tau_d\}$, so that $g(u + x) = m + (dg_u)x + O(\|x\|^2)$ holds. Let $N_m$ be an open set in $\mathbb{R}^D$ such that $O_m \subset N_m$, and let $f$ be extended to a function in $C^2(N_m)$. The Hessian of $f$ at $m$ in tangent coordinates is defined by

$$\left(H_f^{\mathrm{tan}}\right)_{ij}(m) = \frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j} f(m + (dg_u)x)\Big|_{x = 0},$$

and the corresponding Hessian functional in tangent coordinates is defined by

$$\mathcal{H}^{\mathrm{tan}}(f) = \int_M \left\|H_f^{\mathrm{tan}}\right\|_F^2\,dm, \quad f \in C^2(M).$$

We have the following.

Theorem 6. The Hessian functional of $f$ on $C^2(M)$ is equal to the Hessian functional of $f$ in tangent coordinates, namely $\mathcal{H}^{\mathrm{tan}}(f) = \mathcal{H}^{\mathrm{iso}}(f)$.

Proof. The proof of the theorem is quite long and is omitted here (see Donoho and Grimes 2003, for the proof and further discussions).

Theorem 7. The Hessian functional $\mathcal{H}^{\mathrm{iso}}(f)$ on $C^2(M)$ has a $(d + 1)$-dimensional nullspace, consisting of the constant function and a $d$-dimensional space of functions spanned by the original isometric coordinates.

Proof. A function $f \in C^2(M)$ is said to be linear if for each parametrization $g$, $f \circ g$ is linear on $\mathbb{R}^d$. It is clear that the nullspace of $\mathcal{H}^{\mathrm{tan}}$ is the space of all linear functions $f \in C^2(M)$, whose dimension is $d + 1$. The space of linear functions is spanned by the constant function and the isometric coordinate functions. This completes the proof of the theorem.

4.7.2 Description of the HLLE Method

Assume that the data set $P$ lies on a $d$-dimensional manifold $M \subset \mathbb{R}^D$ with some (vector-valued) coordinate function $\psi = [\psi^1, \ldots, \psi^d]'$. Let $g = [g_1, \ldots, g_d]': U \subset \mathbb{R}^d \to M$ be the inverse of $\psi$, which provides a parametrization of $M$. As before, we denote the constant function on $M$ by $e$. Then, the HLLE method is to find the coordinate function $\psi$ by solving the minimization problem

$$\psi^i = \arg\min_{f \in C^2(M)} \mathcal{H}^{\mathrm{tan}}(f), \quad i = 1, \ldots, d,$$

under the constraints $\langle f^l, e \rangle = 0$, $l = 1, \ldots, d$, and $\langle f^l, f^k \rangle = 0$, $l \ne k$.

4.7.3 HLLE Algorithm

The HLLE algorithm consists of the steps described below. Assume that the data set $\{\vec p_1, \ldots, \vec p_n\} \subset \mathbb{R}^D$ is to be reduced to $\{\vec y_1, \ldots, \vec y_n\} \subset \mathbb{R}^d$.

Step 1: Neighborhood definition. Either the $k$-neighborhood or $\varepsilon$-neighborhood can be defined for this method. Assume that the $k$-nearest neighborhood definition is used. Let $r = (d + 2)(d + 1)/2$. The HLLE method often employs $k \ge r$.

Step 2: Hessian estimator creation. First, the tangent coordinates of each pixel in the neighborhood of $\vec p_i$ are computed. Let $\bar p_i = \frac{1}{k} \sum_{j \in N(i)} \vec p_j$, and define the matrix

$$M^i = [\vec p_{j_1} - \bar p_i, \ldots, \vec p_{j_k} - \bar p_i], \quad j_l \in N(i), \quad |N(i)| = k,$$

whose SVD is $M^i = U \mathbf{D} V'$, where $M^i \in M_{D,k}$, $U \in O_{D,D}$, $V \in O_{k,k}$, and the $D \times k$ matrix $\mathbf{D}$ of singular values is

$$\mathbf{D} = \begin{cases} [\operatorname{diag}(\sigma_1, \ldots, \sigma_D),\ 0], & \text{if } k > D, \\ \operatorname{diag}(\sigma_1, \ldots, \sigma_D), & \text{if } k = D, \\ \begin{bmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_k) \\ 0 \end{bmatrix}, & \text{if } k < D. \end{cases}$$

Let $V^d = [\vec v_1, \ldots, \vec v_d]$ be the first $d$ columns extracted from $V$. Then, the tangent coordinate of the point $\vec p_{j_l}$ is the $l$th row of $V^d$.

Second, the Hessian estimator is developed (key step). Let $f \in C^2(M)$ be a function defined on $M$. Since the data set $P$ is discrete, we only have some discrete formulation of $f$ on $P$. For convenience, we keep the same notation $f$ for its discrete formulation, which is uniquely determined by the vector $\mathbf{f} = (f(\vec p_i))_{i=1}^n$. At the point $\vec p_i$, the Hessian estimator, as a functional of $f$, may be constructed as follows. Denote the tangent Hessian at $\vec p_i$ by $H^i(f)$. Recall that on the neighborhood $O_i$ the function $f$ is determined by $\mathbf{f}_i = [f(\vec p_{j_1}), \ldots, f(\vec p_{j_k})]'$. Hence, the local Hessian functional $H^i(f)$ can be represented by a matrix $H^i_{s \times k}$ (where $s = d(d + 1)/2$) with

$$H^i(f) = H^i_{s \times k}\,\mathbf{f}_i, \quad \forall f \in C^2(O_i),$$

such that

$$H^i_{s \times k}\,\vec e = 0, \quad H^i_{s \times k}\,\vec v_i = 0, \quad i = 1, \ldots, d,$$

where $\vec e = [1, \ldots, 1]' \in \mathbb{R}^k$. Write $\vec a \odot \vec b = [a_1 b_1, \ldots, a_k b_k]'$. Then, the column vectors in the matrix $[\vec v_i \odot \vec v_j]_{1 \le i \le j \le d}$ form a basis of all quadratic functions on $O_i$. Let

$$V^a = \left[\vec e,\ [\vec v_i]_{i=1}^d,\ [\vec v_i \odot \vec v_j]_{1 \le i \le j \le d}\right],$$

whose columns form a basis of all polynomials on $O_i$ with degree $\le 2$. Let $\tilde V \in O_{k,r}$ be the orthonormalization of $V^a$ obtained by the Gram-Schmidt procedure. Let $T^s$ be the submatrix of $\tilde V$ containing the last $s$ columns of $\tilde V$. Then, the local Hessian functional is constructed by

$$H^i_{s \times k} = (T^s)'.$$

To construct the (global) Hessian functional in tangent coordinates, we need to combine all local Hessian functionals. Let $(W_i)_{s \times n}$ be the extension of the local Hessian estimator $(H^i)_{s \times k}$, such that the columns of $W_i$ indexed by $j_1, \ldots, j_k \in N(i)$ constitute the matrix $(H^i)_{s \times k}$ and the remaining columns of $W_i$ vanish. We write $W = [W_1', \ldots, W_n']' \in M_{sn,n}$. Then

$$\mathcal{H}^{\mathrm{tan}}(f) = \|W \mathbf{f}\|^2.$$

Step 3: DR kernel construction. The DR kernel for HLLE is then given by

$$K = W' W.$$

As in the Laplacian eigenmap method, the DR kernel of the HLLE method is constructed for the space of functions on the data.

Step 4: Eigen decomposition of DR kernel. Let $\{\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_d\} \subset \mathbb{R}^n$ be the $d + 1$ eigenvectors corresponding to the $d + 1$ smallest eigenvalues of $K$, in ascending order. It is clear that $\mathbf{y}_0$ is a constant vector, corresponding to the eigenvalue 0. Let $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_d] \in O_{n,d}$. Then

$$\mathbf{Y} = \arg\min_{\mathbf{Y} \in O_{n,d}} \operatorname{tr}(\mathbf{Y}' K \mathbf{Y})$$

under the constraint $\mathbf{Y}' e = 0$. The DR data set is $Y = \mathbf{Y}'$.

4.7.4 Conclusion

The kernel in HLLE is sparse, and the HLLE algorithm works well even when the manifold $M$ is not convex and contains holes. In addition, the HLLE algorithm is a fast algorithm that allows high accuracy. However, the kernel size used for the HLLE method is larger than those for other methods. Besides, the Hessian functional involves second derivatives, so that the algorithm is quite sensitive to noise.
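The HLLE steps of Sect. 4.7.3 can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions rather than the authors' code: the function name `hlle` is hypothetical, a point is not counted among its own $k$ neighbors, a QR factorization stands in for the Gram-Schmidt procedure, and the global estimator $W$ is stored densely. One small deviation is labeled in the comments: when the data lie exactly on a flat manifold, the eigenvalue 0 of $K$ has multiplicity $d + 1$, so the constant vector is removed by projection instead of by discarding the first eigenvector.

```python
import numpy as np

def hlle(P, d, k):
    """HLLE sketch for a D x n data matrix P (one point per column); returns d x n."""
    n = P.shape[1]
    s = d * (d + 1) // 2
    r = 1 + d + s                       # r = (d + 2)(d + 1)/2
    if k < r:
        raise ValueError("need k >= (d + 2)(d + 1)/2")

    # Step 1: k-neighborhood (the point itself is excluded; an assumption).
    sq = np.sum(P**2, axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (P.T @ P)
    np.fill_diagonal(dist2, np.inf)
    nbrs = np.argsort(dist2, axis=1)[:, :k]

    W = np.zeros((s * n, n))
    for i in range(n):
        idx = nbrs[i]
        # Step 2, first part: tangent coordinates from the SVD of the centered neighborhood.
        Mi = P[:, idx] - P[:, idx].mean(axis=1, keepdims=True)
        _, _, Vt = np.linalg.svd(Mi, full_matrices=False)
        Vd = Vt[:d].T                                   # k x d, row l = coordinates of p_{j_l}
        # Step 2, second part: basis [e, v_a, v_a * v_b] of polynomials of degree <= 2.
        cols = [np.ones(k)] + [Vd[:, a] for a in range(d)]
        cols += [Vd[:, a] * Vd[:, b] for a in range(d) for b in range(a, d)]
        V_tilde, _ = np.linalg.qr(np.column_stack(cols))   # orthonormalization
        W[s * i:s * (i + 1), idx] = V_tilde[:, -s:].T      # local estimator (T^s)'

    # Step 3: DR kernel.
    K = W.T @ W
    # Step 4: span of the d + 1 bottom eigenvectors, with the constant direction
    # projected out (equals [y_1, ..., y_d] up to rotation when eigenvalue 0 is simple).
    _, V = np.linalg.eigh(K)
    V0 = V[:, :d + 1]
    B = V0 - V0.mean(axis=0)
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :d].T
```

For data sampled from a flat $d$-dimensional patch embedded isometrically in $\mathbb{R}^D$, the output of this sketch is an affine function of the true coordinates, which is the discrete counterpart of Theorem 7.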

4.8 Diffusion Maps

The mathematics of diffusion maps was first studied in the PhD thesis of Lafon (2004) at Yale and was made available on his homepage. Easily accessible published papers include Coifman and Lafon (2006), Coifman and Maggioni (2006), and Nadler et al. (2006), the last authored by Nadler, Lafon, Coifman, and Kevrekidis.

4.8.1 Diffusion Maps Method Description

The idea of this method is to consider a family of diffusion maps (Coifman and Lafon 2006), each of which embeds the data set $P$ into a Euclidean space so that the Euclidean distance in the space is equal to the diffusion distance on the data $P$, with various (numerical) dimensions in the range of the maps. Let $\Phi$ denote such a diffusion map in the family that (numerically) embeds $P$ into $\mathbb{R}^d$. Then the dimensionality reduction of $P$ is the data set $Y = \Phi(P) \subset \mathbb{R}^d$.


The diffusion maps method may be considered as a generalization of the Laplacian eigenmap method. Recall that if the set $\{\vec p_1, \ldots, \vec p_n\} \subset \mathbb{R}^D$ resides on a $d$-dimensional manifold $M \subset \mathbb{R}^D$, then the Neumann heat kernel $H_t$ for the Laplacian eigenmap method defines a metric on $M$, which, in discrete formulation, is a weight matrix $W$ on the graph $G(P, E)$ that describes the similarity on the data set $P$ with some parameter $t$. The Laplacian eigenmap method is applied to achieve the dimensionality reduction of $P$ by solving the minimization problem (28). Now, if $W$ is so normalized that each row sums to 1, then $D = I$ and $\operatorname{tr}(Y (I - W) Y') = d - \operatorname{tr}(Y W Y')$ whenever $Y Y' = I$, so that the minimization problem (28) is equivalent to the optimization problem

$$Y = \arg\max \operatorname{tr}(Y W Y')$$

with the constraints

$$Y e = 0; \quad Y Y' = I.$$

Hence, since the role of $H_t$ is to equip the data set with a metric, the method of diffusion maps may be considered as an extension of the heat kernel to a semigroup of diffusion kernels for generating some diffusion process on the data set, in a manner similar to what the heat kernel does. The eigen decompositions of the kernels in the semigroup lead to a family of diffusion maps, one of which is used to achieve the required dimensionality reduction of $P$.

4.8.2 Theoretic Background of the Diffusion Maps Method

In the following, we provide the mathematical foundation of the diffusion maps method.

Definition 3. A matrix $K \in S_n$ is called a diffusion kernel if it satisfies the following conditions:

1. Symmetry: $K(i, j) = K(j, i)$.
2. Positivity preservation: $K(i, j) \ge 0$.
3. Positive semi-definite: $\sum_{i,j=1}^n K(i, j) u_i u_j \ge 0$, $\forall \vec u \in \mathbb{R}^n$.
4. Contraction: Largest eigenvalue is 1.

If the contraction condition in the definition is relaxed, then $K$ is called a nonnormalized diffusion kernel, for which the following result provides a normalization method.

Proposition 1. Let $K = [k_{ij}]_{i,j=1}^n$ be an spsd matrix with nonnegative entries. Let $s_i = \sum_{j=1}^n k_{ij}$ and define

$$\tilde K = \left[\frac{k_{ij}}{\sqrt{s_i s_j}}\right].$$

Then $\tilde K$ is also spsd with nonnegative entries, and with largest eigenvalue equal to 1 and corresponding eigenvector given by $[\sqrt{s_1}, \sqrt{s_2}, \ldots, \sqrt{s_n}]'$.

Proof. That $\tilde K$ is spsd with nonnegative entries follows immediately from

$$x' \tilde K x = \left[\frac{x_1}{\sqrt{s_1}}, \ldots, \frac{x_n}{\sqrt{s_n}}\right] K \left[\frac{x_1}{\sqrt{s_1}}, \ldots, \frac{x_n}{\sqrt{s_n}}\right]'.$$

It is also clear that the largest eigenvalue of $\tilde K$ is 1. For each $i$, $1 \le i \le n$, since

$$\sum_{j=1}^n \frac{k_{ij}}{\sqrt{s_i s_j}} \sqrt{s_j} = \sum_{j=1}^n \frac{k_{ij}}{\sqrt{s_i}} = \sqrt{s_i},$$

$[\sqrt{s_1}, \sqrt{s_2}, \ldots, \sqrt{s_n}]'$ is an eigenvector of $\tilde K$ corresponding to the eigenvalue 1, completing the proof of the proposition.

Let $P = \{\vec p_1, \ldots, \vec p_n\} \subset \mathbb{R}^D$ be a data set and $K = [K_{ij}]_{i,j=1}^n$ be a (normalized) diffusion kernel defined on $P$, with the entry $w_{ij} = K_{ij}$ describing the similarity between $\vec p_i$ and $\vec p_j$. Since $K$ is an spsd matrix, we may define a function $d_K: P \times P \to \mathbb{R}_+$ by

$$d_K(\vec p_i, \vec p_j) = \sqrt{K(i, i) + K(j, j) - 2K(i, j)}, \quad i, j = 1, \ldots, n.$$

As discussed in Sect. 2, this function $d_K$ satisfies the distance axioms, and hence defines a distance metric on the set $P$. This non-Euclidean distance $d_K$ is also called the kernel distance; it equips $P$ with a manifold structure. Let $\operatorname{rank}(K) = m + 1$. Then $P$ is embedded into an $m$-manifold $M$ by $d_K$. To find the parameterizations of the embedding, let us consider the operator

$$T(f) = \int_P K(x, y) f(y)\,dy, \quad f \in C(M), \tag{32}$$

on the space $C(M)$, where we have assumed, for simplicity, that the data are uniformly distributed so that the measure of the integral is $d\mu(y) = dy$; counting measures with various weights can be used in place of the Lebesgue measure $dy$. Now, since $\operatorname{rank}(K) = m + 1$, the operator $T$ has $m + 1$ positive eigenvalues, $1 = \lambda_0 \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m > 0$. Let $\phi_0, \phi_1, \phi_2, \ldots, \phi_m$ be their corresponding eigenfunctions such that $T(\phi_0) = \phi_0$ is a constant function (see the description of the Laplacian eigenmap method), and

$$T(\phi_i) = \lambda_i \phi_i, \quad i = 1, \ldots, m.$$

The mapping $\Phi: M \to \mathbb{R}^m$ defined by

$$\Phi(x) = \begin{bmatrix} \sqrt{\lambda_1}\,\phi_1(x) \\ \vdots \\ \sqrt{\lambda_{m-1}}\,\phi_{m-1}(x) \\ \sqrt{\lambda_m}\,\phi_m(x) \end{bmatrix}, \quad x \in M, \tag{33}$$

is an isometry of $M$ to $\mathbb{R}^m$, since

$$(Tf)(x) = \sum_{i=1}^m \lambda_i \phi_i(x) \int_P \phi_i(y) f(y)\,dy$$

leads, for $x = \vec p_i$ and $y = \vec p_j$, to

$$d_2^2(\Phi(x), \Phi(y)) = d_K^2(\vec p_i, \vec p_j). \tag{34}$$

The interested reader is referred to Coifman and Lafon (2006) for further details. If $m = d$, then the data set $\{\Phi(\vec p_1), \ldots, \Phi(\vec p_n)\} \subset \mathbb{R}^d$ is the required dimensionality reduction of $P$. However, it is difficult to construct diffusion kernels with rank exactly equal to $d$. Hence, instead of a single diffusion operator $T$, a semigroup of diffusion operators $\{T^t\}_{t=0}^{\infty}$ is used, where

$$T^t(f) = \int_P K^t(x, y) f(y)\,dy, \quad f \in C(M), \quad t \ge 0.$$

Since the eigenvalues of $T = T^1$ are $1 = \lambda_0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m > 0$, the eigenvalues of $T^t$ are $\{1, \lambda_1^t, \ldots, \lambda_m^t\}$. Correspondingly, the map in (33) now becomes a family of maps

$$\Phi^t(x) = \begin{bmatrix} \sqrt{\lambda_1^t}\,\phi_1(x) \\ \vdots \\ \sqrt{\lambda_{m-1}^t}\,\phi_{m-1}(x) \\ \sqrt{\lambda_m^t}\,\phi_m(x) \end{bmatrix}, \quad x \in M, \quad t \ge 0. \tag{35}$$

The maps in (35) are called diffusion maps, which give rise to the diffusion distances

$$d_2^2(\Phi^t(x), \Phi^t(y)) = \sum_{i=1}^m \lambda_i^t \left(\phi_i(x) - \phi_i(y)\right)^2.$$

For any given sufficiently small threshold $\varepsilon > 0$, there is a $t > 0$ such that $\lambda_{d+1}^t \le \varepsilon$ and $\lambda_d^t > \varepsilon$. Then, the numerical rank of $K^t$ is $d$, and the data set $\{\Phi^t(\vec p_1), \ldots, \Phi^t(\vec p_n)\} \subset \mathbb{R}^d$ provides the required dimensionality reduction of $P$.

4.8.3 Diffusion Maps Algorithm

Step 1: Data scaling. For the diffusion maps method, the nonnormalized weight matrix $W$ is usually chosen to be similar to that of the Laplacian eigenmap method, say $w_{ij} = \exp\left(-\frac{\|\vec p_i - \vec p_j\|^2}{2\sigma^2}\right)$, which, however, depends on the scale of the data. In order to develop diffusion maps algorithms for data from various sources, the data need to be scaled. Let $P = [\vec p_1, \ldots, \vec p_n]$ be a given $D \times n$ data matrix. We first center the data matrix $P$ and then scale it to be in $[0, 1]$, such that

$$p_{ij}^{\mathrm{new}} = \frac{p_{ij} - \frac{1}{n}\sum_{j=1}^n p_{ij} - m}{M - m}, \quad 1 \le i \le D, \quad 1 \le j \le n, \tag{36}$$

where $m = \min_{ij}\left(p_{ij} - \frac{1}{n}\sum_{j=1}^n p_{ij}\right)$ and $M = \max_{ij}\left(p_{ij} - \frac{1}{n}\sum_{j=1}^n p_{ij}\right)$. Then $d_2\left(\vec p_i^{\,\mathrm{new}}, \vec p_j^{\,\mathrm{new}}\right) = \frac{1}{M - m}\, d_2(\vec p_i, \vec p_j)$.


For convenience, we shall use $P$ itself to denote the new data set $P^{\mathrm{new}} = [\vec p_1^{\,\mathrm{new}}, \ldots, \vec p_n^{\,\mathrm{new}}]$ in the following steps.

Step 2: Weight matrix creation. We create the weight matrix $W = (w_{ij})_{i,j=1}^n$ by setting

$$w_{ij} = \begin{cases} \exp\left(-\dfrac{\|\vec p_i - \vec p_j\|^2}{2\sigma_{ij}^2}\right), & \text{if this value exceeds } \eta, \\ 0, & \text{otherwise,} \end{cases}$$

where $\eta$ is some threshold for making the weight matrix sparse and $\sigma_{ij}$ is a parameter that may depend on $(i, j)$. Sparsity of the matrix is important for defining the neighborhood of each point $\vec p_i$.

Step 3: Diffusion kernel construction. We define $K = W' W$, so that $K$ is a nonnormalized diffusion kernel. Using the method described in Proposition 1, we then normalize the kernel $K$ (for which we again use the same notation $K$ for convenience).

Step 4: DR kernel decomposition. Let $v^0, v^1, v^2, \ldots, v^m$ be the eigenvectors of $K$ corresponding to the nonzero eigenvalues $1 = \lambda_0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m > 0$. Then

$$K = \sum_{i=0}^m \lambda_i v^i (v^i)',$$

where $v^i = [v_1^i, \ldots, v_n^i]'$. For a given threshold $\delta > 0$, choose $t$ so that $\lambda_{d+1}^{2t} \le \delta$. Let $\Phi$ be the map from $P$ to $\mathbb{R}^d$ given by

$$\Phi(\vec p_j) = \left[(\lambda_1)^t v_j^1, \ldots, (\lambda_d)^t v_j^d\right]'. \tag{37}$$

Then the data set $\{\Phi(\vec p_1), \ldots, \Phi(\vec p_n)\}$ provides a dimension reduction of $P^{\mathrm{new}}$. Hence, after multiplying with the scaling factor $(M - m)$, we obtain the required dimensionality reduction of $P$.

Remark 2. In practice, it is hard to determine the parameter $\sigma_{ij}$ in the weight $w_{ij}$. In Szlam (2006), the author proposes a geometric-mean adjustment formula for constructing the weight matrix $W = (w_{ij})$ as follows. Define

$$\rho_{\vec p_i}(\vec p_i, \vec p_j) = \frac{\|\vec p_i - \vec p_j\|}{\|\vec p_i - \vec p_{i_m}\|},$$

where $\vec p_{i_m}$ is the $m$th nearest pixel to $\vec p_i$ within its neighborhood, and let $\eta$ be some threshold. Then, the weight matrix is constructed by setting

$$w_{ij} = \begin{cases} \exp\left(-\dfrac{\rho_{\vec p_i}(\vec p_i, \vec p_j)\,\rho_{\vec p_j}(\vec p_j, \vec p_i)}{2\sigma^2}\right), & \text{if this value exceeds } \eta, \\ 0, & \text{otherwise,} \end{cases}$$

where the parameter $\sigma$ in $w_{ij}$ now is universal for all $(i, j)$.
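The algorithm of Sect. 4.8.3, together with the normalization of Proposition 1, can be sketched as below. This is a hedged NumPy illustration, not the authors' implementation: the names `scale_data`, `normalize_kernel`, and `diffusion_map` are hypothetical, a single global `sigma` replaces the pairwise $\sigma_{ij}$, weights below the threshold `eta` are set to zero, and `t` is taken as the smallest value with $\lambda_{d+1}^{2t} \le \delta$. The default parameter values are arbitrary.

```python
import numpy as np

def scale_data(P):
    """Eq. (36): center every row of the D x n matrix P, then map all entries into [0, 1]."""
    C = P - P.mean(axis=1, keepdims=True)
    m, M = C.min(), C.max()
    return (C - m) / (M - m), M - m

def normalize_kernel(K):
    """Proposition 1: divide k_ij by sqrt(s_i s_j), where s_i is the i-th row sum."""
    s = K.sum(axis=1)
    return K / np.sqrt(np.outer(s, s)), s

def diffusion_map(P, d, sigma=0.5, eta=1e-3, delta=1e-2):
    """Sketch of Steps 1-4; returns Y (d x n), the eigenvalues (descending), and t."""
    X, scale = scale_data(P)                              # Step 1
    sq = np.sum(X**2, axis=0)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.exp(-dist2 / (2.0 * sigma**2))                 # Step 2 (one global sigma)
    W[W < eta] = 0.0                                      # thresholding for sparsity
    K, _ = normalize_kernel(W.T @ W)                      # Step 3
    lam, V = np.linalg.eigh(K)                            # Step 4
    lam, V = lam[::-1], V[:, ::-1]                        # descending order, lam[0] = 1
    # Smallest t with lam_{d+1}^{2t} <= delta (t = 1 if lam_{d+1} is numerically zero).
    t = np.log(delta) / (2.0 * np.log(lam[d + 1])) if lam[d + 1] > 0 else 1.0
    Y = (lam[1:d + 1] ** t)[:, None] * V[:, 1:d + 1].T    # Eq. (37)
    return scale * Y, lam, t
```

Applying `normalize_kernel` to any spsd matrix with nonnegative entries yields a kernel whose largest eigenvalue is 1 with eigenvector $[\sqrt{s_1}, \ldots, \sqrt{s_n}]'$, which is the content of Proposition 1.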


4.8.4 Conclusion

Construction of the DR kernel in the diffusion maps method is very simple. The neighborhood definition for the diffusion maps method differs from the $k$-neighborhood and $\varepsilon$-neighborhood: it is realized by thresholding the weights appropriately, and hence avoids the difficulty of selecting $k$ or $\varepsilon$. In addition, by taking advantage of the semigroup structure, the fast diffusion wavelet algorithms developed in Coifman and Lafon (2006) are stable and can be used to compute the eigen decomposition of the DR kernel. Moreover, since the algorithm adopts diffusion processing, it is less sensitive to noise. However, the parameters $(\sigma_{ij})_{i,j=1}^n$ in the weight matrix are difficult to determine in general. Therefore, the error of dimensionality reduction could be quite large if the parameters are not chosen properly.

4.9 Anisotropic Transform Method

The anisotropic transform method was proposed by the authors of this chapter (see Chui and Wang 2008, 2010). This method provides a faster and more stable algorithm as compared with the other nonlinear methods discussed in this chapter. Recall that for all of the nonlinear DR methods introduced previously, the kernel is an $n \times n$ matrix, where $n$ is the total number of pixels in an HSI cube, which is very large. For example, even for a small HSI cube with $560 \times 480 = 268{,}800$ spatial resolution, the number of entries of the nonlinear DR kernel is about $7.2 \times 10^{10}$. Hence, when these DR methods are applied to a larger HSI cube, they usually encounter difficulties in at least three aspects: memory usage, computational complexity, and computational instability. Although the kernels of some of the algorithms discussed previously are sparse, which partially overcomes the difficulties in memory usage and computational complexity, it is not clear whether the instability issue can be settled. The anisotropic transform method for DR avoids large kernels in order to reduce memory usage and computational complexity dramatically, as well as to increase numerical stability.

The anisotropic transform method is based on the observation of the particular structure of HSI data. It is noted that HSI data have very high correlation among band images as well as among cell clusters (i.e., pixels). For example, we studied the correlation between 10 band images in an HSI cube, called AVIRIS, which contains 240 bands, covering the spectrum of wavelengths from 350 to 2,500 nm. The bands we selected cover the wavelengths from 400 to 480 nm. All the correlation coefficients between these band images are larger than 0.92. It is also revealed that in the spatial domain most pixels are similar to their spatial neighbors, with the exception of those near the image edges of the HSI cube. These properties of HSI data distinguish them from other types of high-dimensional data, such as Internet documents and data for face recognition, and constitute the foundation of the anisotropic transform method.

4.9.1 Anisotropic Transform Method Description

The main idea of the anisotropic transform method can be described as follows. Assume that an HSI data set $\{\vec p_1, \ldots, \vec p_n\} \subset \mathbb{R}^D$ (nearly) resides on a $d$-dimensional manifold $M \subset \mathbb{R}^D$, which is equipped with a diffusion distance $d_K$. An anisotropic transformation $T$ from $M$ to a $d$-dimensional subspace $S \subset \mathbb{R}^m$, with $d \le m \ll n$, is constructed such that

$$d_2(T(\vec p_i), T(\vec p_j)) = d_K(\vec p_i, \vec p_j), \quad \vec p_i \in M, \quad T(\vec p_i) \in S, \quad i = 1, \ldots, n. \tag{38}$$

Recall that the anisotropic transform $T$ is represented by an $m \times n$ matrix, whose size is much smaller than $n \times n$. Since $S$ is a $d$-dimensional subspace of $\mathbb{R}^m$, it is easy to find an orthogonal transformation $Q$ from $S$ to $\mathbb{R}^d$. Then the data set $\{Q \circ T(\vec p_1), \ldots, Q \circ T(\vec p_n)\}$ provides dimensionality reduction of $\{\vec p_1, \ldots, \vec p_n\}$.

4.9.2 Theoretic Background of the Anisotropic Transform Method

Let $P = \{\vec p_1, \ldots, \vec p_n\} \subset \mathbb{R}^D$ be an HSI data set, and $W = [w_{ij}]_{i,j=1}^n$ be a weight matrix whose entry $w_{ij}$ describes the (local) similarity between $\vec p_i$ and $\vec p_j$. Therefore, $W$ defines a neighborhood for each pixel $\vec p_i$. In the diffusion maps method, the diffusion kernel $K$ is generated by $K = W' W$, and the weight matrix is constructed by the Gaussian distribution function or its geometric mean adjustment, as described in Sect. 4.8.

Remark 3. A nonnegative decreasing function $c(x)$, $x \in \mathbb{R}_+$, is called a conductivity function. We may use any conductivity function to define the weight by setting

$$w_{ij} = \begin{cases} c(\|\vec p_i - \vec p_j\|), & \text{if this value exceeds } \eta, \\ 0, & \text{otherwise,} \end{cases} \tag{39}$$

where $\eta$ is the sparsity threshold.

As pointed out in Sect. 4.8, the eigen decomposition of $K$ is used to derive a diffusion map $\Phi$ from $P$ to $\mathbb{R}^d$ by (37), which provides the dimensionality reduction of $P$. It can be seen that the diffusion map $\Phi$ is a composition of a nonlinear map from $P$ to $\mathbb{R}^n$ and an orthogonal projection from $\mathbb{R}^n$ to $\mathbb{R}^d$. To verify this, let $F$ be the map from $P$ to $\mathbb{R}^n$ defined by

$$F(\vec p_i) = \vec w_i, \quad i = 1, \ldots, n,$$

where $\vec w_i = [w_{i1}, \ldots, w_{in}]'$. Also, let $Q$ be the map from $\{\vec w_1, \ldots, \vec w_n\}$ to $\mathbb{R}^d$ defined by

$$Q(\vec w_i) = \vec y_i \ (= \Phi(\vec p_i)).$$

Then we have $\Phi = Q \circ F$. Since $K = W' W$, $K$ is the Gram matrix of $W$ and therefore

$$d_K(\vec p_i, \vec p_j) = d_2(\vec w_i, \vec w_j).$$

Here the map $Q$ is an orthogonal projection.

Observe that the main source of computational complexity and numerical instability is the eigen decomposition of $K$. The anisotropic transform method is designed to find a (nonlinear) map $T$ from $P$ to $\mathbb{R}^m$ ($m \ll n$), such that (38) holds. Once $T$ is constructed, the SVD of $T(P)$ can be applied to derive the orthogonal projection $Q$. Note that the size of $T(P)$ is $m \times n$, where $m \ll n$. Hence, if we know the rank of $W$ to be $m$, we may decompose $W$ as

$$W = QS, \quad Q \in O_{n,m}, \quad S \in M_{m,n}.$$

This yields

$$K = S' S.$$

Let $S = [\vec s_1, \ldots, \vec s_n]$. Then the nonlinear map $T$ is constructed by

$$T(\vec p_i) = \vec s_i, \quad i = 1, \ldots, n. \tag{40}$$

Otherwise, we employ the algorithms of rank-revealing approximation (Chan and Hansen 1992; Gu and Eisenstat 1996; Cheng et al. 2005; Woolfe et al. 2008) to construct $T$. The following result on rank-revealing approximation can be found in Cheng et al. (2005) and Woolfe et al. (2008).

Theorem 8. Suppose $A \in M_{m,n}$ and $k \le \min(m, n)$. There is a matrix $P \in M_{k,n}$ and a matrix $B \in M_{m,k}$ whose columns constitute a subset of the columns of $A$ such that

1. $P$ has a $k \times k$ submatrix, which is the identity.
2. No entry of $P$ has an absolute value greater than 2.
3. $\|P\| \le \sqrt{4k(n - k) + 1}$, and the least singular value of $P$ is at least 1.
4. $BP = A$ when $k = m$ or $k = n$, and $\|BP - A\| \le \sqrt{4k(n - k) + 1}\,\sigma_{k+1}$ when $k < \min(m, n)$, where $\sigma_{k+1}$ is the $(k + 1)$st greatest singular value of $A$.
5. There are algorithms that compute both $B$ and $P$ using at most $Ckmn\log(n)$ floating-point operations, where $C$ is a constant.

We then construct the nonlinear map $T$ as follows. First, applying QR decomposition to $P'$, we obtain $S \in M_{m,k}$ and $Q' \in O_{n,k}$, such that $SQ = BP$. The computation of $S$ and $Q$ requires only $Ckmn\log(n)$ floating-point operations. Second, using (40), we construct the nonlinear map $T$.

Remark 4. Because the method realizes DR by constructing a (nonlinear) anisotropic transform $T$ and an orthogonal projection $Q$, we coined it the anisotropic transform method.

4.9.3 Anisotropic Transform Algorithm

Step 1: Data scaling. This step is the same as the one described in the diffusion maps algorithm.

Step 2: Weight matrix introduction. Construct $W = (w_{ij})_{i,j=1}^n$ by (39).

Step 3: Anisotropic transform construction. Apply a rank-revealing algorithm to find a matrix $A = [\vec a_1, \ldots, \vec a_n] \in M_{m,n}$ and a matrix $Q \in O_{n,m}$ such that

$$\|QA - W\| \le c\,\sigma_{m+1},$$

where $c$ is a constant and $\sigma_{m+1}$ is the $(m + 1)$st singular value of $W$. Then the anisotropic transform $T$ is given by

$$T(\vec p_i) = \vec a_i, \quad 1 \le i \le n.$$

Step 4: Orthogonal projection construction. The orthogonal projection $Q$ from $\{\vec a_1, \ldots, \vec a_n\} \subset \mathbb{R}^m$ to $\mathbb{R}^d$ is constructed as follows. Let the SVD of $A$ be

$$A = \sum_{i=0}^m \sigma_i v^i (u^i)',$$

where $1 = \sigma_0 > \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m > 0$ are the singular values of $A$. Let $V_d = [v^1, \ldots, v^d] \in O_{m,d}$, and define

$$Q(\vec a_j) = (V_d)' \vec a_j.$$

Then $Q$ is the orthogonal transformation from $\{\vec a_1, \ldots, \vec a_n\}$ to $\mathbb{R}^d$, which yields the required dimension reduction $Y$ of $P$ by setting $Y = [Q(\vec a_1), \ldots, Q(\vec a_n)]$.

4.9.4 Conclusion

For the anisotropic transform method, an anisotropic transform is constructed to map the original data to an auxiliary data set in a significantly lower dimensional space, say $\mathbb{R}^m$ with $m \ll n$, and the dimensionality reduction is completed by orthogonally mapping the auxiliary data to the required low-dimensional data set in $\mathbb{R}^d$, with $d \le m$. The method avoids eigen decomposition of matrices of large size, namely $n \times n$. Hence, the anisotropic transform method provides a faster and more stable algorithm as compared with the other nonlinear methods. However, the anisotropic transform method encounters the same difficulty as the diffusion maps method in determining the parameters in the weight matrix.
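Steps 3 and 4 of Sect. 4.9.3 can be illustrated with the sketch below. It is not the authors' algorithm: in place of the rank-revealing factorizations cited above, it substitutes a randomized range finder (a Gaussian test matrix followed by a QR factorization) to obtain $Q$ and $A = Q'W$, and the names `low_rank_factor` and `anisotropic_transform` as well as the `seed` parameter are hypothetical. As in the text, the singular vector with index 0 is skipped in Step 4.

```python
import numpy as np

def low_rank_factor(W, m, seed=0):
    """Return Q (n x m, orthonormal columns) and A = Q'W (m x n) with Q A ~ W.

    A randomized range finder is used here as a stand-in for a
    rank-revealing algorithm.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((W.shape[1], m))
    Q, _ = np.linalg.qr(W @ G)           # orthonormal basis of a sample of range(W)
    return Q, Q.T @ W

def anisotropic_transform(W, d, m, seed=0):
    """Sketch of Steps 3-4; returns Y (d x n), A (m x n), and the singular values of A."""
    _, A = low_rank_factor(W, m, seed)   # Step 3: T(p_i) is column i of A
    # Step 4: SVD of the small m x n matrix A instead of an n x n eigenproblem.
    U, sig, _ = np.linalg.svd(A, full_matrices=False)
    Vd = U[:, 1:d + 1]                   # v^1, ..., v^d (index 0 skipped as in the text)
    return Vd.T @ A, A, sig
```

When $W$ has rank exactly $m$, the factorization is exact up to rounding, so that $K = W'W = A'A$ and the kernel distance $d_K(\vec p_i, \vec p_j)$ equals the Euclidean distance between columns $i$ and $j$ of $A$, as required by (38). The only decompositions performed act on matrices with $m$ columns or $m$ rows, which is the source of the savings described in Sect. 4.9.4.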

Appendix: Glossary

Anomaly detection. Detecting anomalies for given statistical models.

Classification of objects (spectral classification). Classifying objects in an HSI data set.

Demixing. Finding material components in a raster cell.

Electromagnetic radiation. Energy in the form of electromagnetic waves.

Electromagnetic spectrum. The entire family of electromagnetic radiation, together with all its various wavelengths.

Endmember spectra. The "pure" spectra that contribute to mixed spectra.

Fused images. A fused image is a combination of the HSI image and the HRI image (see High-resolution imagery below). It is usually the best because of the high resolution from the HRI camera and the color information from the HSI sensor. This combination results in sufficiently high image resolution and contrast to facilitate image evaluation by the human eye.

High-resolution imagery. A high-resolution imagery (HRI) camera, which captures black-and-white or panchromatic images, is usually integrated in an HSI system to capture the same reflected light. However, the HRI camera does not have a diffraction grating to disperse the incoming reflected light. Instead, the incoming light is directed to a wider CCD (charge-coupled device) to capture more image data. The HSI resolution is typically 1 m per pixel, and the HRI resolution is much finer: typically a few inches per pixel.

Hyperspectral imaging. The imagery consists of a large number of spectral bands so that the totality of these bands is numerically sufficient to represent a (continuous) spectral curve for each raster cell.

Illumination factors. The incoming solar energy varies greatly with wavelength, peaking in the range of visible light. To convert spectral radiance to spectral reflectance, the illumination factors must be accounted for. Illumination factors taken into account include both the illumination geometry (angles of incoming light, etc.) and shadowing. Other factors, such as atmospheric and sensor effects, are also taken into consideration.

Macroscopic and intimate mixtures. A macroscopic mixture is a linear combination of its endmembers, while an intimate mixture is a nonlinear mixture of its endmembers.

Mixed spectra. Mixed spectra, also known as composite spectra, are contributed by more than one material component.

Multispectral imaging. The imaging bins the spectrum into a handful of bands.

Raster cell. A pixel in a hyperspectral image.

Reflectance conversion. Radiance values must be converted to reflectance values before comparing image spectra with reference reflectance spectra. This is called atmospheric correction. The method for converting the radiance to reflectance is also called reflectance conversion. The image-based correction methods include flat field conversion and internal average relative reflectance conversion. They apply the model $R_1 = mR_2$, where $R_1$ is the reflectance, $R_2$ is the radiance, and $m$ is the conversion slope. Some conversions also apply the linear model $R_1 = c + mR_2$, where $c$ is an offset that needs to be subtracted from the radiance. The popular conversions are
1. Flat field conversion. A flat field has a relatively flat spectral reflectance curve. The mean spectrum of such an area would be dominated by the combined effects of solar irradiance and atmospheric scattering and absorption. The scene is converted to "relative" reflectance by dividing each image spectrum by the flat field mean spectrum.
2. Internal average relative reflectance conversion. This technique is used when no knowledge of the surface materials is available. The technique calculates a relative reflectance by dividing each spectrum (pixel) by the scene average spectrum.

Region segmentation. Partitioning the spatial region of a hyperspectral image into multiple regions (sets of pixels).
The goal of segmentation is to simplify and/or change the representation of an HSI image into something that is more meaningful and easier to analyze. Region segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images.

Remote sensing. Sensing something from a distance. The following processes affect the light that is sensed by a remote sensing system:
1. Illumination. Light has to illuminate the ground and objects on the ground before they can reflect any light. In the typical remote sensing environment, which is outdoors, illumination comes from the sun. We call it solar illumination.
2. Atmospheric absorption and scattering of illumination light. As solar illumination travels through the atmosphere, some wavelengths are absorbed and some are scattered. Scattering is the change in direction of a light wave that occurs when it strikes a molecule or particle in the atmosphere.
3. Reflection. Some of the light that illuminates the ground and objects on the ground is reflected. The wavelengths that are reflected depend on the wavelength content of the illumination and on the object's reflectance. The area surrounding a reflecting object also reflects light, and some of this light is reflected into the remote sensor.
4. Atmospheric absorption and scattering of reflected light. As reflected light travels through the atmosphere to the remote sensor, some wavelengths are absorbed, some are scattered away from the sensor, and some are scattered into the sensor.
These four effects all change the light that reaches the remote sensor from the original light source. After the reflected light is captured by the remote sensor, the light is further affected by how the sensor converts the captured light into electrical signals. These effects that occur in the sensor are called sensor effects.

Spectral curve. The one-dimensional curve of a spectral reflectance.

Spectral libraries. A spectral library consists of a list of spectral curves with data of their characteristics corresponding to specific materials such as mines, plants, etc.

Spectral radiance. The measurable reflected light reaching the sensor. The spectral reflectance of the material is only one factor affecting it. It also depends on the spectra of the input solar energy, on interactions of this energy during its downward and upward passages through the atmosphere, etc.

Shape recognition. Recognizing the shape of a detected object.

Spectral reflectance. The ratio of reflected energy to incident energy as a function of wavelength. Each material has its own spectral reflectance. The light that is reflected by an object depends on two things: (1) the light that illuminates the object and (2) the reflectance of the object. Reflectance is a physical property of the object surface. It is the percentage of incident EMR of each wavelength that is reflected by the object. Because it is a physical property, it is not affected by the light that illuminates the object.

Spectral reflection. The observed reflected energy, represented as a function of wavelength under illumination. It is affected by both the reflectance of the object and the light that illuminates the object. If an object were illuminated by balanced white light, and if there were no atmospheric absorption or scattering, and if the sensor were perfect, then the wavelength composition of the reflected light detected by an HSI sensor would match the reflectance, or spectral signature, of the object.

Spectral space. The n-dimensional space, where each point is the spectrum of a material or a group of materials.

Spectral signature. A unique characteristic of an object, represented by a plot of the object's reflectance as a function of wavelength. It can be thought of as an EMR "fingerprint" of the object.

Spectroscopy. The study of the wavelength composition of electromagnetic radiation. It is fundamental to how HSI technology works.

Signature matching. Matching reflected light of pixels to spectral signatures of given objects.
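The two image-based reflectance conversions described in the glossary entry on reflectance conversion both divide every pixel spectrum by a reference spectrum. The following is a minimal sketch of that idea, assuming the scene is stored as a NumPy array with one row per pixel and one column per band; the function name `relative_reflectance` and its arguments are hypothetical and not taken from the chapter.

```python
import numpy as np

def relative_reflectance(radiance, flat_field_rows=None):
    """Convert radiance spectra to "relative" reflectance.

    radiance        : array of shape (num_pixels, num_bands)
    flat_field_rows : optional indices of pixels forming a flat field.
                      If given, the flat field mean spectrum is the reference
                      (flat field conversion); otherwise the scene average
                      spectrum is used (internal average relative reflectance).
    """
    radiance = np.asarray(radiance, dtype=float)
    if flat_field_rows is None:
        reference = radiance.mean(axis=0)                   # scene average spectrum
    else:
        reference = radiance[flat_field_rows].mean(axis=0)  # flat field mean spectrum
    # Divide each pixel spectrum band by band by the reference spectrum.
    return radiance / reference

scene = np.array([[2.0, 4.0],
                  [6.0, 12.0]])
iarr = relative_reflectance(scene)                       # reference = [4, 8]
flat = relative_reflectance(scene, flat_field_rows=[0])  # reference = [2, 4]
```

In this toy scene both pixels have the same spectral shape, so every converted spectrum is constant across bands; real data would additionally need protection against zero entries in the reference spectrum.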

References Bachmann CM, Ainsworth TL, Fusina RA (2005) Exploiting manifold geometry in hyperspectral imagery. IEEE Trans Geosci Remote Sens 43:441–454 Bachmann CM, Ainsworth TL, Fusina RA (2006) Improved manifold coordinate representations of large-scale hyperspectral scenes. IEEE Trans Geosci Remote Sens 44:2786–2803 Bachmann CM, Ainsworth TL, Fusina RA, Montes MJ, Bowles JH, Korwan DR, Gillis L (2009) Bathymetric retrieval from hyperspectral imagery using manifold coordinate representations. IEEE Trans Geosci Remote Sens 47:884–897 Balasubramanian M, Schwaartz EL, Tenenbaum JB, de Silva V, Langford JC (2002) The isomap algorithm and topological stability. Science 29 Belkin M, Niyogi P (2002) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14:585–591 Page 44 of 46

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_34-3 © Springer-Verlag Berlin Heidelberg 2015

Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15:1373–1396 Belkin M, Niyogi P (2004) Semi-surpervised learning on Riemannian manifolds. Mach Learn (special issue on clustering) 56:209–239 Borg I, Groenen P (1997) Modern multidimensional scaling. Springer, New York Chan TF, Hansen PC (1992) Some applications of the rank-revealing QR factorization. SIAM J Sci Stat Comput 13:727–741 Cheng H, Gimbutas Z, Martinsson PG, Rokhlin V (2005) On the compression of low rank matrices. SIAM J Sci Comput 26:1389–1404 Chui CK (1992) An introduction to wavelets. Academic, Boston Chui CK (1997) Wavelets: a mathematical tool for signal processing. SIAM, Philadelphia Chui CK, Wang JZ (1991) A cardinal spline approach to wavelets. Proc Am Math Soc 113:785–793 Chui CK, Wang JZ (1992a) On compactly supported wavelet and a duality principle. Trans Am Math Soc 330:903–915 Chui CK, Wang JZ (1992b) A general framework of compactly supported splines and wavelets. J Approx Theory 71:263–304 Chui CK, Wang JZ (2008) Methods and algorithms for dimensionality reduction of HSI data. In: The 2nd advancing the automation of image analysis workshop (AAIA Workshop II), UCLA, Los Angeles, 29–31 July 2008 Chui CK, Wang, JZ (2010) Randomized anisotropic transform for nonlinear dimensionality reduction. Int J Geomath 1:23–50 Coifman RR, Lafon S (2006) Diffusion maps. Appl Comput Harmon Anal 21:5–30 Coifman RR, Maggioni M (2006) Diffusion wavelets in special issue on diffusion maps and wavelets. Appl Comput Harmon Anal 21:53–94 Cox TF, Cox MA (2004) Multidimensional scaling. Chapman & Hall, Landon Donoho D, Grimes C (2003) Hessian eigenmaps: new locall linear embedding techniques for highdimensional data. Proc Natl Acad Sci 100:5591–5596 Gu M, Eisenstat SC (1996) Efficient algorithms for computing a strong rank-revealing QR factorization. 
SIAM J Sci Comput 17:848–869 Kumar V, Grama A, Gupta A, Karypis G (1994) Introduction to paralell computing, design and analysis of algorithms. Benjamin/Cummings, Redwood City Lafon S (2004) Diffusion maps and geometric harmonics, PhD dissertation, Yale University Laub J, Müller KR (2004) Feature discovery in non-metric pairwise data. J Mach Learn Res 5:801–818 Law MHC, Jain AK (2006) Incremental nonlinear dimensionality reduction by manifold learning. IEEE Trans Pattern Anal Mach Intell 28:377–391 Lee JA, Verleysen M (2007) Nonlinear dimensionality reduction. Springer, New York Li CK, Li RC, Ye Q (2007) Eigenvalues of an alignment matrix in nonlinear manifold learning. Commun Math Sci 5:313–329 Lin T, Zha HY, Lee S (2006) Riemannian manifold learning for nonliear dimensionality reduction. In: European conference on computer vision, Graz, pp 44–55 Nadler B, Lafon S, Coifman RR, Kevrekidis IG (2006) Diffusion maps, spectral clustering and the reaction coordinates of dynamical systems. Appl Comput Harm Anal 21:113–127 Park J, Zhang ZY, Zha HY, Kasturi R (2004) Local smoothing for manifold learning. Comput Vis Pattern Recogn 2:452–459

Page 45 of 46

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_34-3 © Springer-Verlag Berlin Heidelberg 2015

Partridge M, Calvo R (1997) Fast dimensionality reduction and simple PCA. Intell Data Anal 2:292–298 Rao C, Rao M (1998) Matrix algebra and its applications to statistics and econometric. World Scientific, Singapore Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 260:2323–2326 Szlam A (2006) Non-stationary analysis on datasets and applications, PhD dissertation, Yale University Tenenbaum JB, de Silva V, Langford JC (2000) A global geometric framwork for nonlinear dimensionality reduction. Science 290:2319–2323 Torgerson WS (1958) Theory and methods of scaling. Wiley, New York Weinberger KQ, Packer BD, Saul LK (2005) Nonlinear dimensionality reduction by semi-definite programming and kernel matrix factorization. In: Proceedings of the 10th international workshop on AI and statistics, Barbados Woolfe F, Liberty E, Rokhlin V, Tygert M (2008) A randomized algorithm for the approximation of matrices. Appl Comput Harmon Anal 25:335–366 Young G, Householder AS (1938) Discussion of a set of points in term of their mutual distances. Psychometrika 3:19–22 Zha HY, Zhang ZY (2009) Spectral properties of the alignment matrices in manifold learning. SIAM Rev 51:546–566 Zhang ZY, Zha HY (2003) Nonlinear dimension reduction via local tangent space alignment. Intell Data Eng Autom Learn 25:477–481 Zhang ZY, Zha HY (2005) Principal manifolds and nonlinear dimensionality reduction via local tangent space alignment. SIAM J Sci Comput 26:313–338 Zhao D (2006) Formulating LLE using alignment technique. Pattern Recogn 39:2233–2235

Page 46 of 46

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_35-4 © Springer-Verlag Berlin Heidelberg 2015

Oblique Stochastic Boundary-Value Problem

Martin Grothaus and Thomas Raskop
Functional Analysis Group, University of Kaiserslautern, Kaiserslautern, Germany

Abstract

The aim of this chapter is to report the current state of the analysis for weak solutions to oblique boundary problems for the Poisson equation. In this chapter, deterministic as well as stochastic inhomogeneities are treated and existence and uniqueness results for corresponding weak solutions are presented. We consider the problem for inner bounded and outer unbounded domains in $\mathbb{R}^n$. The main tools for the deterministic inner problem are a Poincaré inequality and some analysis for Sobolev spaces on submanifolds, in order to use the Lax-Milgram lemma. The Kelvin transformation enables us to translate the outer problem to a corresponding inner problem. Thus, we can define a solution operator by using the solution operator of the inner problem. The extension to stochastic inhomogeneities is done with the help of tensor product spaces of a probability space with the Sobolev spaces from the deterministic problems. We can prove a regularization result, which shows that the weak solution fulfills the classical formulation for smooth data. A Ritz-Galerkin approximation method for numerical computations is available. Finally, we show that the results are applicable to geomathematical problems.

1 Introduction

The main subject of this chapter is existence results for solutions to oblique boundary problems for the Poisson equation. We start with the deterministic problems. The Poisson equation in the domain $\Sigma$ is given by

$$\Delta u = f,$$

and the oblique boundary condition is given by

$$\langle a, \nabla u\rangle + bu = g.$$

This condition is called regular if the inequality

$$|\langle a, \nu\rangle| > C > 0$$

holds on $\partial\Sigma$ for a constant $0 < C < \infty$, where $\nu$ denotes the outer unit normal vector on $\partial\Sigma$. The problem is called an outer problem if the Poisson equation has to hold on an outer domain $\Sigma \subset \mathbb{R}^n$. This is a domain $\Sigma$ having the representation

$$\Sigma = \mathbb{R}^n \setminus \overline{D},$$

where $D$ with $0 \in D$ is a bounded domain. Consequently, $\partial\Sigma$ divides the Euclidean space $\mathbb{R}^n$ into a bounded domain $D$, called inner domain, and an unbounded domain $\Sigma$, called outer domain. A problem defined on a bounded domain is called an inner problem. A classical solution corresponding to continuous $a$, $b$, $g$, and $f$ of the oblique boundary problem for the Poisson equation is a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ which fulfills the first two equations. For the outer problem, $u$ must be regular at infinity, i.e., $u(x) \to 0$ for $|x| \to \infty$. Existence and uniqueness results for a classical solution to regular oblique boundary problems for the Poisson equation are already available; see, e.g., Miranda (1970), Gilbarg and Trudinger (1998), or Rozanov and Sanso (2002a).

In order to allow very weak assumptions on boundary, coefficients, and inhomogeneities, we are interested in weak solutions from Sobolev spaces of once weakly differentiable functions. When facing the deterministic problems, we have to distinguish the inner and the outer setting. The reason is that a Poincaré inequality, namely,

$$\int_\Sigma \langle\nabla u, \nabla u\rangle\, d\lambda^n + \int_{\partial\Sigma} u^2\, dH^{n-1} \ge C \left( \int_\Sigma u^2\, d\lambda^n + \int_\Sigma \langle\nabla u, \nabla u\rangle\, d\lambda^n \right),$$

for all $u \in H^{1,2}(\Sigma)$, is only available for bounded $\Sigma$. Thus, we can only use the Lax-Milgram lemma for the inner problem in order to gain a solution operator. For the outer problem, we use the Kelvin transformation to transform the unbounded domain $\Sigma$ to a bounded domain $\Sigma^K$ via

$$\Sigma^K := \left\{ \frac{x}{|x|^2} \,\Big|\, x \in \Sigma \right\} \cup \{0\}.$$

Additionally, we transform coefficients as well as inhomogeneities and end up with an inner problem, which possesses a unique weak solution $v$. Finally, we transform this function to the outer space by

$$u(x) := \frac{1}{|x|^{n-2}}\, v\!\left(\frac{x}{|x|^2}\right),$$

for all $x \in \Sigma$. This $u$ is then the weak solution to the outer problem, and it can be shown that, in the case of existence, $u$ is the classical solution. Additionally, the transformations are continuous, and consequently the solution depends continuously on the data. Before we go on with stochastic inhomogeneities and stochastic weak solutions, we want to mention that we have to assume a regular inner problem, while we have a transformed regularity condition for the outer problem resulting from the transformations.

Going to a stochastic setting, we have to introduce the spaces of stochastic functions. These are constructed as the tensor product of $L^2(\Omega, dP)$, with a suitable probability space $(\Omega, \mathcal{F}, P)$, and the Sobolev spaces used in the deterministic theory. They are again Hilbert spaces, and we have isomorphisms to Hilbert space-valued random variables. For the stochastic inner problem, we again employ the Lax-Milgram lemma, while in the outer setting, we define the solution operator pointwise for almost all $\omega \in \Omega$. For all solutions, deterministic as well as stochastic, a Ritz-Galerkin approximation method is available. Finally, we give some examples from geomathematics, where stochastic inhomogeneities are implemented. Proofs for the results presented in this chapter are given in Grothaus and Raskop (2006, 2009). The examples are taken from Freeden and Maier (2002) and Bauer (2004). We want to mention that the articles Rozanov and Sanso (2001) as well as Rozanov and Sanso (2002b) also deal with solutions to oblique boundary-value problems.
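The Kelvin transformation used above for the outer problem can be sketched numerically as follows. This is only an illustration under assumptions: the helper names `kelvin_point` and `kelvin_function` are hypothetical, and the symbol `v` stands for a function on the transformed inner domain. The sketch shows the inversion $x \mapsto x/|x|^2$ and the pull-back $u(x) = |x|^{2-n}\, v(x/|x|^2)$.

```python
import numpy as np

def kelvin_point(x):
    """Inversion at the unit sphere: x -> x / |x|^2 (x must be nonzero)."""
    x = np.asarray(x, dtype=float)
    return x / np.dot(x, x)

def kelvin_function(v, n):
    """Return u with u(x) = |x|^(2-n) * v(x / |x|^2) in R^n."""
    def u(x):
        x = np.asarray(x, dtype=float)
        r = np.linalg.norm(x)
        return r ** (2 - n) * v(kelvin_point(x))
    return u

x = np.array([3.0, 4.0, 0.0])             # |x| = 5, a point of an outer domain
y = kelvin_point(x)                        # lies inside the unit ball
u_const = kelvin_function(lambda p: 1.0, n=3)
u_lin = kelvin_function(lambda p: p[0], n=3)
```

Applying `kelvin_point` twice returns the original point, and the constant function 1 is mapped to $1/|x|^{n-2}$, which decays at infinity as required for regularity of the outer solution.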


2 Scientifically Relevant Domains and Function Spaces

In this section, we consider boundary-value problems for the Poisson equation. This means we are searching for a function which satisfies the Poisson equation in a subset $\Sigma$ of $\mathbb{R}^n$ and an additional condition on the boundary $\partial\Sigma$ of this set, i.e.,

$$\Delta u = f \quad \text{in } \Sigma,$$
$$\langle a, \nabla u\rangle + bu = g \quad \text{on } \partial\Sigma.$$

Here, $f$ and $g$ are called inhomogeneities, $a$ and $b$ are called coefficients, and such a function $u$ is then called the solution. Our analysis is motivated by problems from geomathematics. Here, oblique boundary problems arise frequently, because in general the normal of the Earth's surface does not coincide with the direction of the gravity vector. Therefore, the oblique boundary condition is more suitable than a Neumann boundary condition. For details, see Bauer (2004) or Gutting (2008). We are dealing with two different types of sets $\Sigma$, namely, bounded and outer $C^{m,\alpha}$-domains, which are introduced by the following definition. In particular, the outer problem is of major interest for applications.

Definition 1. $\partial\Sigma \subset \mathbb{R}^n$ is called a $C^{m,\alpha}$-surface, $m \in \mathbb{N}$ and $0 \le \alpha \le 1$, and $\Sigma$ is called a bounded $C^{m,\alpha}$-domain if and only if
• $\Sigma$ is a bounded subset of $\mathbb{R}^n$ which is a domain, i.e., open and connected
• There exists an open cover $(U_i)_{i=1,\ldots,N}$ of $\partial\Sigma$ and corresponding $C^{m,\alpha}$-diffeomorphisms $\Psi_i : B_1^{\mathbb{R}^n}(0) \to U_i$, $i = 1, \ldots, N$, such that

$$\Psi_i : B_1^0(0) \to U_i \cap \partial\Sigma, \qquad \Psi_i : B_1^+(0) \to U_i \cap \Sigma, \qquad \Psi_i : B_1^-(0) \to U_i \cap \mathbb{R}^n \setminus \overline{\Sigma},$$

where $B_1^{\mathbb{R}^n}(0)$ denotes the open unit ball in $\mathbb{R}^n$, i.e., all $x \in \mathbb{R}^n$ with $|x| < 1$. $B_1^0(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n = 0$, $B_1^+(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n > 0$, and $B_1^-(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n < 0$.

On the other hand, $\Sigma$ is called an outer $C^{m,\alpha}$-domain if and only if $\Sigma \subset \mathbb{R}^n$ is open, connected, and representable as $\Sigma := \mathbb{R}^n \setminus \overline{D}$, where $D$ is a bounded $C^{m,\alpha}$-domain such that $0 \in D$. $\Psi_i$ is called a $C^{m,\alpha}$-diffeomorphism if and only if it is bijective, $(\Psi_i)_j \in C^{m,\alpha}\big(\overline{B_1^{\mathbb{R}^n}(0)}\big)$, $(\Psi_i^{-1})_j \in C^{m,\alpha}(\overline{U_i})$, $j = 1, \ldots, n$, and we have for the determinant of the Jacobian matrix of $\Psi_i$ that $\operatorname{Det}(D\Psi_i) \neq 0$ in $\overline{B_1^{\mathbb{R}^n}(0)}$.

In Fig. 1, such a $C^{m,\alpha}$-surface is illustrated. For this definition and further details, see, e.g., Dobrowolski (2006). The definition is independent of the mappings chosen. $\partial\Sigma$ is a compact and double-point-free $(n-1)$-dimensional $C^{m,\alpha}$-submanifold. The outer unit normal vector $\nu$ is a $C^{m-1}$-vector field. Furthermore, we find a $C^\infty$-partition of unity $(w_i)_{1 \le i \le N}$ on $\partial\Sigma$ corresponding to the open cover $(U_i)_{1 \le i \le N}$, provided by Alt (2002). $H^{n-1}$ denotes the $(n-1)$-dimensional Hausdorff measure on $\partial\Sigma$ and $\lambda^n$ the Lebesgue measure in $\mathbb{R}^n$. Throughout this chapter, we assume Lipschitz boundaries, i.e., $C^{0,1}$-boundaries $\partial\Sigma$. Then we have $\nu \in L^\infty(\partial\Sigma; \mathbb{R}^n)$. Note that some geomathematically relevant examples are even $C^\infty$-surfaces, e.g., a sphere or an ellipsoid.

Fig. 1 $C^{m,\alpha}$-surface

We will see in Sects. 3 and 4 that the cases of bounded and outer domains have to be treated differently, because the unboundedness causes problems which do not occur in the bounded setting. Nonetheless, we are searching in both cases for solutions under as weak assumptions as possible. More precisely, we are searching for solutions in Sobolev spaces for inhomogeneities from Banach space duals of Sobolev spaces. These spaces are introduced in the following.

Definition 2.

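Let $\Sigma$ be a bounded $C^{0,1}$-domain and $r \in \mathbb{N}$. We define

$$H^{r,2}(\Sigma) := \{F : \Sigma \to \mathbb{R} \mid \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n} F \in L^2(\Sigma) \text{ for all } \alpha_1 + \cdots + \alpha_n \le r\},$$

$$\|F\|_{H^{r,2}(\Sigma)} := \left( \sum_{|\alpha| = 0}^{r} \|\partial^\alpha F\|^2_{L^2(\Sigma)} \right)^{1/2}.$$

Let $\Sigma$ be an outer $C^{0,1}$-domain and $\varrho_1$, $\varrho_2$, $\varrho_3$ be continuous, positive functions defined on $\overline{\Sigma}$. We define

$$L^2_{\varrho_1}(\Sigma) := \Big\{F : \Sigma \to \mathbb{R} \,\Big|\, F \text{ is measurable with } \int_\Sigma F^2(x)\, \varrho_1^2(x)\, d\lambda^n(x) < \infty \Big\},$$

$$H^{1,2}_{\varrho_1,\varrho_2}(\Sigma) := \{F \in L^2_{\varrho_1}(\Sigma) \mid \partial_i F \in L^2_{\varrho_2}(\Sigma),\ 1 \le i \le n\},$$

$$H^{2,2}_{\varrho_1,\varrho_2,\varrho_3}(\Sigma) := \{F \in L^2_{\varrho_1}(\Sigma) \mid \partial_i F \in L^2_{\varrho_2}(\Sigma) \text{ and } \partial_i \partial_j F \in L^2_{\varrho_3}(\Sigma),\ 1 \le j, i \le n\},$$

$$\|F\|_{L^2_{\varrho_1}(\Sigma)} := \left( \int_\Sigma F^2(x)\, \varrho_1^2(x)\, d\lambda^n(x) \right)^{1/2},$$

$$\|F\|_{H^{1,2}_{\varrho_1,\varrho_2}(\Sigma)} := \left( \|F\|^2_{L^2_{\varrho_1}(\Sigma)} + \sum_{i=1}^{n} \|\partial_i F\|^2_{L^2_{\varrho_2}(\Sigma)} \right)^{1/2},$$

$$\|F\|_{H^{2,2}_{\varrho_1,\varrho_2,\varrho_3}(\Sigma)} := \left( \|F\|^2_{L^2_{\varrho_1}(\Sigma)} + \sum_{i=1}^{n} \left( \|\partial_i F\|^2_{L^2_{\varrho_2}(\Sigma)} + \sum_{j=1}^{n} \|\partial_i \partial_j F\|^2_{L^2_{\varrho_3}(\Sigma)} \right) \right)^{1/2}.$$

Let $\partial\Sigma$ be a $C^{0,1}$-surface and $(w_i)_{1 \le i \le N}$ be the $C^\infty$-partition of unity of $\partial\Sigma$ corresponding to the open cover from Definition 1. For a function $F$ defined on $\partial\Sigma$, we obtain a function $\theta_i F$ defined on $\mathbb{R}^{n-1}$ by

$$\theta_i F(y) := \begin{cases} (w_i F)(\Psi_i(y, 0)), & y \in B_1^{\mathbb{R}^{n-1}}(0), \\ 0, & \text{otherwise}. \end{cases}$$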

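Let now $\partial\Sigma$ be a $C^{m,1}$-surface, $m \in \mathbb{N}$. Furthermore, let $s \in \mathbb{R}$ and $r \in \mathbb{N}$, with $s < m + 1$ and $0 \le r \le m$. Then we define

$$H^{s,2}(\partial\Sigma) := \{F : \partial\Sigma \to \mathbb{R} \mid \theta_i F \in H^{s,2}(\mathbb{R}^{n-1}),\ 1 \le i \le N\},$$

$$H^{r,\infty}(\partial\Sigma) := \{F : \partial\Sigma \to \mathbb{R} \mid \theta_i F \in H^{r,\infty}(\mathbb{R}^{n-1}),\ 1 \le i \le N\},$$

$$\|F\|_{H^{s,2}(\partial\Sigma)} := \left( \sum_{i=1}^{N} \|\theta_i F\|^2_{H^{s,2}(\mathbb{R}^{n-1})} \right)^{1/2},$$

$$\|F\|_{H^{r,\infty}(\partial\Sigma)} := \max_{0 \le |s| \le r,\ 1 \le i \le N} \left\{ \operatorname*{ess\,sup}_{B_1^{\mathbb{R}^{n-1}}(0)} \big( |\partial_1^{s_1} \cdots \partial_{n-1}^{s_{n-1}} \theta_i F| \big) \right\},$$

where $H^{0,p}(\partial\Sigma)$ is identical with $L^p(\partial\Sigma)$, $p \in \{2, \infty\}$. The spaces $H^{2,2}_{\varrho_1,\varrho_2,\varrho_3}(\Sigma)$, $H^{1,2}_{\varrho_1,\varrho_2}(\Sigma)$, $L^2_{\varrho_1}(\Sigma)$, $H^{s,2}(\partial\Sigma)$, and $H^{r,2}(\Sigma)$ are Hilbert spaces, while the spaces $H^{r,\infty}(\partial\Sigma)$ are Banach spaces with respect to the norms given above; see, e.g., Adams (1975) or Dautray and Lions (1988). The spaces $H^{s,2}(\mathbb{R}^{n-1})$ are defined via the Fourier transformation. The differentiation in the definition above has to be understood in the sense of weak differentiation. The definition of the spaces on $\partial\Sigma$ above is independent of the choice of $(U_i)_{1 \le i \le N}$, $(w_i)_{1 \le i \le N}$, and $(\Psi_i)_{1 \le i \le N}$.

It is left to introduce the spaces $(H^{s,2}(\partial\Sigma))'$ on a $C^{m,1}$-surface $\partial\Sigma$, $0 \le s < m + 1$. We do this as follows. Identify each function $F \in L^2(\partial\Sigma)$ with a linear continuous functional on $H^{s,2}(\partial\Sigma)$, defined by

$$F(G) := \int_{\partial\Sigma} F(x) \cdot G(x)\, dH^{n-1}(x),$$

for all $G \in H^{s,2}(\partial\Sigma)$. Then $(H^{s,2}(\partial\Sigma))'$ is defined as

$$(H^{s,2}(\partial\Sigma))' := \overline{L^2(\partial\Sigma)}^{\|\cdot\|_{(H^{s,2}(\partial\Sigma))'}},$$

where

$$\|F\|_{(H^{s,2}(\partial\Sigma))'} := \sup_{G \in H^{s,2}(\partial\Sigma)} \frac{|F(G)|}{\|G\|_{H^{s,2}(\partial\Sigma)}}.$$

In this way, we end up with the space $H^{-s,2}(\partial\Sigma)$ defined in the previous definition. We get the following chain of rigged Hilbert spaces, called Gelfand triple:

$$H^{s,2}(\partial\Sigma) \subset L^2(\partial\Sigma) \subset H^{-s,2}(\partial\Sigma),$$

densely and continuously. Additionally, we have for the duality product

$${}_{H^{-s,2}(\partial\Sigma)}\langle F, G\rangle_{H^{s,2}(\partial\Sigma)} = \int_{\partial\Sigma} F(x) \cdot G(x)\, dH^{n-1}(x),$$

for all $F \in L^2(\partial\Sigma)$ and $G \in H^{s,2}(\partial\Sigma)$. Analogously, we introduce the Gelfand triples

$$H^{1,2}(\Sigma) \subset L^2(\Sigma) \subset (H^{1,2}(\Sigma))'$$

for bounded $C^{0,1}$-domains and

$$H^{1,2}_{|x|^2,|x|^3}(\Sigma) \subset L^2_{|x|^2}(\Sigma) \subset \big(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\big)'$$

for outer $C^{0,1}$-domains.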

3 Poincaré Inequality as Key Issue for the Inner Problem

In this section, we will show how to derive a weak formulation for the deterministic inner regular oblique boundary-value problem defined on bounded $C^{1,1}$-domains $\Sigma$. The corresponding weak solution will obviously coincide with the classical solution in the case of existence. First, we derive a weak formulation; then a Poincaré inequality for the Sobolev space $H^{1,2}(\Sigma)$ allows us to apply the Lax-Milgram lemma in order to provide a solution operator. Next, we translate a regularization result for the Neumann boundary-value problem to the oblique boundary-value problem. Finally, a Ritz-Galerkin method allows us to approximate the weak solutions with the help of numerical calculations. We proceed with the stochastic extensions. This means we introduce the stochastic function spaces for the inhomogeneities and corresponding solutions with the help of the tensor product. Then the results for the deterministic problem can be easily extended to the stochastic setting. This section is divided into five subsections according to the described approach.

3.1 The Weak Formulation

In this section, we present the theory of weak solutions to the regular oblique boundary problem for the Poisson equation for inner domains. Although the weak problem can be formulated for bounded $C^{0,1}$-domains, in order to prove the existence of a unique weak solution, we need at least a bounded $C^{1,1}$-domain. Consequently, we assume $\Sigma \subset \mathbb{R}^n$ throughout this section to be such a domain, if not stated otherwise. At first, we give the definition of the regular oblique boundary problem together with the definition of the classical solution.

Definition 3. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $f \in C^0(\Sigma)$, $g, b \in C^0(\partial\Sigma)$, and $a \in C^0(\Sigma; \mathbb{R}^n)$ be given, such that

$$|\langle a(x), \nu(x)\rangle| > C_1 > 0, \tag{1}$$

for all $x \in \partial\Sigma$, where $0 < C_1 < \infty$. Finding a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ such that

$$\Delta u = f \quad \text{in } \Sigma,$$
$$\langle a, \nabla u\rangle + bu = g \quad \text{on } \partial\Sigma,$$

is called the inner regular oblique boundary problem for the Poisson equation, and $u$ is called the classical solution.

Because of the condition in Eq. (1), the problem is called regular. It just means that the vector field $a$ is nontangential to $\partial\Sigma$ for all $x \in \partial\Sigma$. Now we derive the weak formulation. The fundamental theorem of the calculus of variations gives

$$\Delta u = f \quad \text{in } \Sigma$$

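if and only if

$$\int_\Sigma \eta\, \Delta u\, d\lambda^n = \int_\Sigma \eta f\, d\lambda^n \quad \text{for all } \eta \in C_0^\infty(\Sigma)$$

if and only if

$$\int_\Sigma \eta\, \Delta u\, d\lambda^n = \int_\Sigma \eta f\, d\lambda^n \quad \text{for all } \eta \in C^\infty(\overline{\Sigma}).$$

Additionally on $\Sigma$, the following Green formula is valid:

$$\int_\Sigma \varphi\, \Delta\psi\, d\lambda^n + \int_\Sigma \langle\nabla\varphi, \nabla\psi\rangle\, d\lambda^n = \int_{\partial\Sigma} \varphi\, \frac{\partial\psi}{\partial\nu}\, dH^{n-1},$$

for all $\psi \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ and $\varphi \in C^1(\overline{\Sigma})$. This yields for a classical solution

$$\int_{\partial\Sigma} \eta\, \frac{\partial u}{\partial\nu}\, dH^{n-1} - \int_\Sigma \langle\nabla\eta, \nabla u\rangle\, d\lambda^n = \int_\Sigma \eta f\, d\lambda^n,$$

for all $\eta \in C^\infty(\overline{\Sigma})$. Now we transform the boundary condition

$$\langle a, \nabla u\rangle + bu = g \quad \text{on } \partial\Sigma,$$

to the form

$$\langle a, \nu\rangle\, \frac{\partial u}{\partial\nu} + \langle a - \langle a, \nu\rangle\, \nu,\ \nabla_{\partial\Sigma} u\rangle + bu = g \quad \text{on } \partial\Sigma.$$

Using Eq. (1) we may divide this by $\langle a, \nu\rangle \neq 0$ to get the equivalent boundary condition

$$\frac{\partial u}{\partial\nu} + \left\langle \frac{a}{\langle a, \nu\rangle} - \nu,\ \nabla_{\partial\Sigma} u \right\rangle + \frac{b}{\langle a, \nu\rangle}\, u = \frac{g}{\langle a, \nu\rangle} \quad \text{on } \partial\Sigma.$$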
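The splitting of the oblique derivative into a normal and a tangential contribution can be checked pointwise with a few lines of code. The sketch below is illustrative only; the function name `split_oblique` is hypothetical, and the surface gradient is assumed to be the tangential projection $\nabla_{\partial\Sigma} u = \nabla u - \frac{\partial u}{\partial\nu}\,\nu$ at the boundary point under consideration.

```python
import numpy as np

def split_oblique(a, nu, grad_u):
    """Split <a, grad u> into its normal and tangential contributions.

    Returns (normal_part, tangential_part) with
        normal_part     = <a, nu> * du/dnu,
        tangential_part = <a - <a, nu> nu, grad_tan u>,
    where grad_tan u = grad u - (du/dnu) nu is the surface gradient.
    """
    a = np.asarray(a, dtype=float)
    grad_u = np.asarray(grad_u, dtype=float)
    nu = np.asarray(nu, dtype=float)
    nu = nu / np.linalg.norm(nu)       # make sure nu is a unit vector
    a_nu = np.dot(a, nu)               # <a, nu>, nonzero for a regular problem
    du_dnu = np.dot(grad_u, nu)        # normal derivative of u
    grad_tan = grad_u - du_dnu * nu    # surface gradient of u
    a_tan = a - a_nu * nu              # tangential part of a
    return a_nu * du_dnu, np.dot(a_tan, grad_tan)

a = [1.0, 2.0, 3.0]
nu = [0.0, 0.0, 2.0]                   # normalised inside the function
grad_u = [4.0, 5.0, 6.0]
normal_part, tangential_part = split_oblique(a, nu, grad_u)
```

The two parts add up to $\langle a, \nabla u\rangle$, which is exactly the identity used to rewrite the boundary condition; dividing by $\langle a, \nu\rangle$ is only possible because of the regularity condition in Eq. (1).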

Fig. 2 Transformation of the oblique boundary condition

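Plugging this condition into the equation above, we get the following formulation of the regular oblique boundary problem for the Poisson equation which is equivalent to the formulation given in Definition 3. We want to find a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ such that

$$\int_{\partial\Sigma} \eta \left( \frac{g}{\langle a, \nu\rangle} - \frac{b}{\langle a, \nu\rangle}\, u - \left\langle \frac{a}{\langle a, \nu\rangle} - \nu,\ \nabla_{\partial\Sigma} u \right\rangle \right) dH^{n-1} - \int_\Sigma \langle\nabla\eta, \nabla u\rangle\, d\lambda^n - \int_\Sigma \eta f\, d\lambda^n = 0 \quad \text{for all } \eta \in C^\infty(\overline{\Sigma}).$$

The transformation of the boundary term is shown in Fig. 2. Finally, we weaken the assumptions on data, coefficients, the test function, and the solution. We give the weak formulation of the inner regular oblique boundary problem for the Poisson equation, summarized in the following definition.

Definition 4. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$ fulfilling the condition in Eq. (1), $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, and $f \in (H^{1,2}(\Sigma))'$. We want to find a function $u \in H^{1,2}(\Sigma)$ such that

$${}_{H^{\frac{1}{2},2}(\partial\Sigma)}\left\langle \frac{\eta}{\langle a, \nu\rangle},\ g \right\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - \int_\Sigma (\nabla\eta \cdot \nabla u)\, d\lambda^n - \sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\left\langle \eta\, \frac{a_i}{\langle a, \nu\rangle} - \eta\, \nu_i,\ (\nabla_{\partial\Sigma} u)_i \right\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - \int_{\partial\Sigma} \eta\, \frac{b}{\langle a, \nu\rangle}\, u\, dH^{n-1} - {}_{H^{1,2}(\Sigma)}\langle \eta, f\rangle_{(H^{1,2}(\Sigma))'} = 0,$$

for all $\eta \in H^{1,2}(\Sigma)$. Then $u$ is called a weak solution of the inner regular oblique boundary problem for the Poisson equation.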

3.2 Existence and Uniqueness Results for the Weak Solution

It is possible to prove the following existence and uniqueness result for the weak solution to the deterministic inner oblique boundary-value problem for the Poisson equation.

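Theorem 1. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, fulfilling the condition in Eq. (1), and $b \in L^\infty(\partial\Sigma)$ such that

$$\operatorname*{ess\,inf}_{\partial\Sigma} \left( \frac{b}{\langle a, \nu\rangle} - \frac{1}{2}\, \operatorname{div}_{\partial\Sigma}\!\left( \frac{a}{\langle a, \nu\rangle} - \nu \right) \right) > 0. \tag{2}$$

Then for all $f \in (H^{1,2}(\Sigma))'$ and $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, there exists one and only one weak solution $u \in H^{1,2}(\Sigma)$ of the inner regular oblique boundary problem for the Poisson equation. Additionally, we have for a constant $0 < C_2 < \infty$

$$\|u\|_{H^{1,2}(\Sigma)} \le C_2 \left( \|f\|_{(H^{1,2}(\Sigma))'} + \|g\|_{H^{-\frac{1}{2},2}(\partial\Sigma)} \right).$$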

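In the proof, we apply the Lax-Milgram lemma, which gives us a unique $u \in H^{1,2}(\Sigma)$ fulfilling the variational equation

$$F(\eta) = a(\eta, u),$$

for all $\eta \in H^{1,2}(\Sigma)$, provided that $F$ and $a$ are continuous and, additionally, $a$ is a coercive bilinear form. $F$ and $a$ can be obtained easily from the weak formulation as

$$F(\eta) = {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\left\langle \frac{\eta}{\langle a, \nu\rangle},\ g \right\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - {}_{H^{1,2}(\Sigma)}\langle \eta, f\rangle_{(H^{1,2}(\Sigma))'},$$

$$a(\eta, u) = \sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\left\langle \eta\, \frac{a_i}{\langle a, \nu\rangle} - \eta\, \nu_i,\ (\nabla_{\partial\Sigma} u)_i \right\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} + \int_\Sigma (\nabla\eta \cdot \nabla u)\, d\lambda^n + \int_{\partial\Sigma} \eta\, \frac{b}{\langle a, \nu\rangle}\, u\, dH^{n-1}.$$

The continuity can be shown by some results about the Sobolev spaces occurring in the weak formulation. In order to prove that $a$ is coercive, i.e., $|a(u, u)| \ge C_3 \|u\|^2_{H^{1,2}(\Sigma)}$, the Poincaré inequality

$$\int_\Sigma \langle\nabla F, \nabla F\rangle\, d\lambda^n + \int_{\partial\Sigma} F^2\, dH^{n-1} \ge C_4 \left( \int_\Sigma F^2\, d\lambda^n + \int_\Sigma \langle\nabla F, \nabla F\rangle\, d\lambda^n \right),$$

which is valid for all $F \in H^{1,2}(\Sigma)$ and a constant $0 < C_4 < \infty$, is indispensable. Finally, the condition

$$\operatorname*{ess\,inf}_{\partial\Sigma} \left( \frac{b}{\langle a, \nu\rangle} - \frac{1}{2}\, \operatorname{div}_{\partial\Sigma}\!\left( \frac{a}{\langle a, \nu\rangle} - \nu \right) \right) > 0$$

is also essential to ensure the coercivity of $a$. The condition in Eq. (2) can be transformed into the equivalent form

$$\langle a, \nu\rangle\, b > \frac{1}{2}\, (\langle a, \nu\rangle)^2\, \operatorname{div}_{\partial\Sigma}\!\left( \frac{a}{\langle a, \nu\rangle} - \nu \right)$$

$H^{n-1}$-almost everywhere on $\partial\Sigma$. If $\operatorname{div}_{\partial\Sigma}\!\left( \frac{a}{\langle a, \nu\rangle} - \nu \right) = 0$ $H^{n-1}$-almost everywhere on $\partial\Sigma$, we have for $H^{n-1}$-almost all $x \in \partial\Sigma$ the condition from the existence and uniqueness result for the classical solution, i.e., $\langle a, \nu\rangle\, b > 0$. Furthermore, for $a = \nu$, i.e., the Robin problem, the condition reduces to $b > 0$ $H^{n-1}$-almost everywhere on $\partial\Sigma$. Finally, we are able to define for each bounded $C^{1,1}$-domain $\Sigma$, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, and $b \in L^\infty(\partial\Sigma)$, fulfilling the conditions in Eqs. (1) and (2), a continuous invertible linear solution operator $S^{\mathrm{in}}_{a,b}$ by

$$S^{\mathrm{in}}_{a,b} : (H^{1,2}(\Sigma))' \times H^{-\frac{1}{2},2}(\partial\Sigma) \to H^{1,2}(\Sigma), \qquad (f, g) \mapsto u,$$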

where u is the weak solution provided by Theorem 1. In addition, this means that the inner weak problem is well posed.
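To see how the Poincaré inequality enters the coercivity argument, it may help to look at the Robin case $a = \nu$ as a short worked sketch. Here $\eta$ denotes the test function and $C_4$ the Poincaré constant from above, and we assume $b \ge b_0 > 0$ almost everywhere on $\partial\Sigma$; the constant $b_0$ is introduced only for this illustration. For $a = \nu$ we have $\langle a, \nu\rangle = 1$, so the tangential term in the bilinear form vanishes, and

```latex
\begin{align*}
a(\eta, u) &= \int_\Sigma (\nabla\eta \cdot \nabla u)\, d\lambda^n
              + \int_{\partial\Sigma} \eta\, b\, u\, dH^{n-1}, \\
a(u, u)    &\ge \int_\Sigma \langle\nabla u, \nabla u\rangle\, d\lambda^n
              + b_0 \int_{\partial\Sigma} u^2\, dH^{n-1} \\
           &\ge \min\{1, b_0\} \left( \int_\Sigma \langle\nabla u, \nabla u\rangle\, d\lambda^n
              + \int_{\partial\Sigma} u^2\, dH^{n-1} \right) \\
           &\ge \min\{1, b_0\}\, C_4\, \|u\|^2_{H^{1,2}(\Sigma)} .
\end{align*}
```

In this special case, $\min\{1, b_0\}\, C_4$ can therefore serve as a coercivity constant; the general oblique case additionally has to control the tangential term, which is where the condition in Eq. (2) is needed.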

3.3 A Regularization Result

In this section, we will show that the weak solution from the previous section is even an element of $H^{2,2}(\Sigma)$ if we choose the inhomogeneities and coefficients smooth enough. The result for the oblique boundary problem is based on a regularization result for the weak solution to the Neumann problem for the Poisson equation.

Theorem 2. Let $\Sigma \subset \mathbb{R}^n$ be a bounded $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma; \mathbb{R}^n)$ fulfilling the condition in Eq. (1) and $b \in H^{1,\infty}(\partial\Sigma)$. Then for all $f \in L^2(\Sigma)$ and $g \in H^{\frac{1}{2},2}(\partial\Sigma)$, the weak solution $u \in H^{1,2}(\Sigma)$ to the inner regular oblique boundary problem for the Poisson equation, provided in Theorem 1, is even in $H^{2,2}(\Sigma)$. Furthermore, we have the a priori estimate

$$\|u\|_{H^{2,2}(\Sigma)} \le C_5 \left( \|f\|_{L^2(\Sigma)} + \|g\|_{H^{\frac{1}{2},2}(\partial\Sigma)} \right),$$

for a constant $0 < C_5 < \infty$.

In order to prove the result, it suffices to show that the normal derivative of the weak solution $u$ of the oblique boundary problem is an element of $H^{\frac{1}{2},2}(\partial\Sigma)$. Therefore, we use some results for Sobolev spaces defined on submanifolds. The weak solution in $H^{2,2}(\Sigma)$ is related to the classical solution in the following way. Let $u \in H^{2,2}(\Sigma)$ be the weak solution to the inner regular oblique boundary problem for the Poisson equation, provided by Theorem 2. Then we have

$$\Delta u = f \quad \lambda^n\text{-almost everywhere in } \Sigma,$$
$$\langle a, \nabla u\rangle + bu = g \quad H^{n-1}\text{-almost everywhere on } \partial\Sigma.$$

We call such a solution a strong solution to the inner regular oblique boundary problem for the Poisson equation.


3.4 Ritz-Galerkin Approximation

In this section, we provide a Ritz-Galerkin method which allows us to approximate the weak solution numerically. Let $a(\eta, u)$ and $F(\eta)$ be defined as above and the conditions of Theorem 1 be satisfied. Furthermore, let $(U_n)_{n\in\mathbb{N}}$ be an increasing sequence of finite-dimensional subspaces of $H^{1,2}(\Sigma)$, i.e., $U_n \subset U_{n+1}$, such that $\overline{\bigcup_{n\in\mathbb{N}} U_n} = H^{1,2}(\Sigma)$. Because $U_n$ is, as a finite-dimensional subspace of the Hilbert space $H^{1,2}(\Sigma)$, itself a Hilbert space, we find for each $n \in \mathbb{N}$ a unique $u_n \in U_n$ with

$$a(\eta, u_n) = F(\eta) \quad \text{for all } \eta \in U_n.$$

Moreover, let $d := \dim(U_n)$ and $(\varphi_k)_{1\le k\le d}$ be a basis of $U_n$. Then $u_n \in U_n$ has the following unique representation:

$$u_n = \sum_{i=1}^{d} h_i \varphi_i,$$

where $(h_i)_{1\le i\le d}$ is the solution of the linear system of equations given by

$$\sum_{i=1}^{d} a(\varphi_j, \varphi_i)\, h_i = F(\varphi_j), \quad 1 \le j \le d.$$

The following result from Céa proves that the sequence $(u_n)_{n\in\mathbb{N}}$ really approximates the weak solution $u$.

Theorem 3. Let $u$ be the weak solution provided by Theorem 1 and $(u_n)_{n\in\mathbb{N}}$ taken from above. Then

$$\|u - u_n\|_{H^{1,2}(\Sigma)} \le \frac{C_6}{C_7}\, \operatorname{dist}(u, U_n) \xrightarrow{\,n\to\infty\,} 0,$$

where $C_6$ and $C_7$ are the continuity and the coercivity constants of $a$.
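The Galerkin system $\sum_i a(\varphi_j,\varphi_i)h_i = F(\varphi_j)$ can be illustrated on a one-dimensional model problem. The following is only a minimal sketch under stated assumptions, not the method of the cited papers: it treats $-u'' = f$ on $(0,1)$ with Robin data $-u'(0) + b\,u(0) = g_{left}$ and $u'(1) + b\,u(1) = g_{right}$ (the case $a = \nu$, with a sign convention chosen so that the bilinear form is positive), uses piecewise linear hat functions as the basis $(\varphi_k)$, and assumes a constant $f$. The names `solve_linear_system` and `galerkin_robin_1d` are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol


def galerkin_robin_1d(n_elements, f_const, b, g_left, g_right):
    """Galerkin coefficients h_i for -u'' = f_const on (0,1) with Robin data.

    Assumed weak form (model problem, not the chapter's exact form):
      a(eta, u) = int eta' u' dx + b*(eta(0)u(0) + eta(1)u(1)),
      F(eta)    = int eta f dx + g_left*eta(0) + g_right*eta(1).
    The basis consists of hat functions on a uniform mesh.
    """
    h = 1.0 / n_elements
    d = n_elements + 1  # dimension of the subspace U_n
    stiffness = [[0.0] * d for _ in range(d)]
    load = [0.0] * d
    for e in range(n_elements):
        for p in range(2):
            for q in range(2):
                # element contribution of int phi_p' phi_q' dx
                stiffness[e + p][e + q] += (1.0 if p == q else -1.0) / h
            # exact integral of a constant f against a hat function
            load[e + p] += f_const * h / 2.0
    # boundary terms of the bilinear form and of the right-hand side
    stiffness[0][0] += b
    stiffness[d - 1][d - 1] += b
    load[0] += g_left
    load[d - 1] += g_right
    coeffs = solve_linear_system(stiffness, load)
    nodes = [i * h for i in range(d)]
    return nodes, coeffs
```

With $u(x) = x^2$ as reference solution, i.e., `f_const = -2`, `b = 1`, `g_left = 0`, `g_right = 3`, the computed coefficients can be compared with the values of $u$ at the mesh nodes. The positivity requirement $b > 0$ in this sketch mirrors the Robin case of the coercivity condition.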

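3.5 Stochastic Extensions

First, we define the spaces of stochastic functions. We choose a probability space $(\Omega, \mathcal{F}, P)$, arbitrary but fixed, such that $L^2(\Omega, dP)$ is separable, and define

$$(H^{1,2}(\Sigma))'_\Omega := L^2(\Omega, P) \otimes (H^{1,2}(\Sigma))' \cong L^2(\Omega, P; (H^{1,2}(\Sigma))'),$$
$$H^{-\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{-\frac{1}{2},2}(\partial\Sigma) \cong L^2(\Omega, P; H^{-\frac{1}{2},2}(\partial\Sigma)),$$
$$H^{\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{\frac{1}{2},2}(\partial\Sigma) \cong L^2(\Omega, P; H^{\frac{1}{2},2}(\partial\Sigma)),$$
$$L^2_\Omega(\Sigma) := L^2(\Omega, P) \otimes L^2(\Sigma) \cong L^2(\Omega, P; L^2(\Sigma)),$$
$$H^{1,2}_\Omega(\Sigma) := L^2(\Omega, P) \otimes H^{1,2}(\Sigma) \cong L^2(\Omega, P; H^{1,2}(\Sigma)),$$
$$H^{2,2}_\Omega(\Sigma) := L^2(\Omega, P) \otimes H^{2,2}(\Sigma) \cong L^2(\Omega, P; H^{2,2}(\Sigma)),$$

with the help of the tensor product. Now we can investigate the stochastic inner regular oblique boundary problem for the Poisson equation. We are searching for a solution $u \in H^{1,2}_\Omega(\Sigma)$ of

$$\Delta u(x,\omega) = f(x,\omega) \quad \text{for all } x \in \Sigma,\ P\text{-a.a. } \omega \in \Omega,$$
$$(a \cdot \nabla u(x,\omega)) + b\,u(x,\omega) = g(x,\omega) \quad \text{for all } x \in \partial\Sigma,\ P\text{-a.a. } \omega \in \Omega,$$
$$|(a \cdot \nu)| \ge C_8 > 0 \quad \text{on } \partial\Sigma.$$

Using the argumentation from Sect. 3.1, we come immediately to the weak formulation of the stochastic boundary problem.

Definition 5. Find $u \in H^{1,2}_\Omega(\Sigma)$ with

$$\int_\Omega \left( {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta, \frac{g}{\langle a,\nu\rangle} \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - \int_{\partial\Sigma} \eta\, \frac{b}{\langle a,\nu\rangle}\, u \, dH^{n-1} \right) dP - \int_\Omega \sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta\, \frac{a_i}{\langle a,\nu\rangle}, (\nabla_{\partial\Sigma} u)_i \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} dP - \int_\Omega \int_\Sigma (\nabla\eta \cdot \nabla u) \, d\lambda^n \, dP - \int_\Omega {}_{H^{1,2}(\Sigma)}\langle \eta, f \rangle_{(H^{1,2}(\Sigma))'} \, dP = 0$$

for all $\eta \in H^{1,2}_\Omega(\Sigma)$. $u$ is called the stochastic weak solution of the stochastic inner regular oblique boundary problem for the Poisson equation.

Obviously, $u \in H^{1,2}_\Omega(\Sigma)$ is a stochastic weak solution of the stochastic regular oblique boundary problem for the Poisson equation if and only if for $P$-a.a. $\omega \in \Omega$, $u_\omega := u(\cdot,\omega)$ is a weak solution of the deterministic problem

$$\Delta u_\omega = f(\cdot,\omega) \quad \text{on } \Sigma,$$
$$\langle a \cdot \nabla u_\omega \rangle + b\,u_\omega = g(\cdot,\omega) \quad \text{on } \partial\Sigma.$$

The solution operator of the deterministic problem extends to the stochastic setting in the following way.

Theorem 4. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, fulfilling the condition in Eq. (1), and $b \in L^\infty(\partial\Sigma)$ such that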

$$\operatorname*{ess\,inf}_{\partial\Sigma} \left( \frac{b}{\langle a,\nu\rangle} - \frac{1}{2} \operatorname{div}_{\partial\Sigma}\left( \frac{a}{\langle a,\nu\rangle} - \nu \right) \right) > 0.$$

Then for all $f \in (H^{1,2}(\Sigma))'_\Omega$ and $g \in H^{-\frac{1}{2},2}_\Omega(\partial\Sigma)$, there exists one and only one stochastic weak solution $u \in H^{1,2}_\Omega(\Sigma)$ of the stochastic inner regular oblique boundary problem for the Poisson equation. Additionally, we have for a constant $0 < C_9 < \infty$

$$\|u\|_{H^{1,2}_\Omega(\Sigma)} \le C_9 \left( \|f\|_{(H^{1,2}(\Sigma))'_\Omega} + \|g\|_{H^{-\frac{1}{2},2}_\Omega(\partial\Sigma)} \right).$$

In the proof, we use the results from the deterministic setting in order to show that the requirements of the Lax-Milgram lemma are fulfilled. Using the isomorphisms of the tensor product spaces to spaces of Hilbert space-valued random variables, the regularization result also translates to the stochastic setting.

Theorem 5. Let $\Sigma \subset \mathbb{R}^n$ be a bounded $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma; \mathbb{R}^n)$ fulfilling the condition in Eq. (1), and $b \in H^{1,\infty}(\partial\Sigma)$. Then for all $f \in L^2_\Omega(\Sigma)$ and $g \in H^{\frac{1}{2},2}_\Omega(\partial\Sigma)$, the stochastic weak solution $u \in H^{1,2}_\Omega(\Sigma)$ to the inner regular oblique boundary problem for the Poisson equation, provided in Theorem 4, is even in $H^{2,2}_\Omega(\Sigma)$. Furthermore, we have the a priori estimate

$$\|u\|_{H^{2,2}_\Omega(\Sigma)} \le C_{10} \left( \|f\|_{L^2_\Omega(\Sigma)} + \|g\|_{H^{\frac{1}{2},2}_\Omega(\partial\Sigma)} \right),$$

for a constant $0 < C_{10} < \infty$. $u$ is called the stochastic strong solution and fulfills the classical problem almost everywhere.

At the end of this section, we want to mention that a Ritz-Galerkin approximation is also available for the stochastic weak solution, repeating the procedure from the deterministic problem. For details and proofs of the presented results, we refer the reader to Grothaus and Raskop (2006).

4 Fundamental Results for the Outer Problem

In this section, we provide a solution operator for the outer oblique boundary problem for the Poisson equation. The results presented in this section are taken from Grothaus and Raskop (2009), and further details on the proofs can be found in this reference. The outer problem is defined in an unbounded domain $\Sigma \subset \mathbb{R}^n$ which is representable as $\mathbb{R}^n \setminus D$, where $D$ is a bounded domain. Additionally, we assume $0 \in D$, which is necessary for the Kelvin transformation. For unbounded $\Sigma$, a Poincaré inequality is not yet available. Consequently, we cannot use the technique used for the inner problem because we are unable to prove coercivity of the bilinear form of a weak formulation corresponding to the outer problem. Thus, we will not derive a weak formulation for the outer problem and, consequently, do not have to consider a regular outer problem. Our approach is to transform the outer problem to a corresponding inner problem for which a solution operator is available from the results of the previous section. In this way, we construct our weak solution, and for this solution a Ritz-Galerkin method is also available because of the continuity of the Kelvin transformation. Finally, we again extend our results to stochastic inhomogeneities as well as stochastic solutions and present some examples from geomathematics. The procedure is described in the following four subsections.

Table 1 Transformation procedure

$$\begin{array}{lcccc}
\text{Outer problem:} & \Sigma & (f,g) & \xrightarrow{\;S^{out}_{a,b}\;} & u \\
 & \downarrow K_\Sigma & \downarrow T_1 \quad \downarrow T_2 & & \uparrow K \\
\text{Inner problem:} & \Sigma^K & (T_1(f), T_2(g)) & \xrightarrow{\;S^{in}_{T_3(a),T_4(b)}\;} & \psi
\end{array}$$

4.1 Transformations to an Inner Setting

In this section, we define the transformations which will be needed in order to transform the outer oblique boundary problem for the Poisson equation to a corresponding regular inner problem. Then we will apply the solution operator in order to get a weak solution in the inner domain. This solution will be transformed with the help of the Kelvin transformation to a function defined in the outer domain. In the next section, we will finally prove that this function solves the outer problem for sufficiently smooth data almost everywhere, which gives the connection to the original problem. The whole procedure is illustrated in Table 1.

We proceed in the following way. First, we define the Kelvin transformation $K_\Sigma$ of the outer domain $\Sigma$ to a corresponding bounded domain $\Sigma^K$. Next, the Kelvin transformation $K$ of the solution for the inner problem will be presented. Finally, we define the transformations $T_1$ and $T_2$ for the inhomogeneities as well as $T_3$ and $T_4$ for the coefficients. We will also show that the operators $K$, $T_1$, and $T_2$ are continuous. The consequence is that our solution operator

$$S^{out}_{a,b}(f,g) := K\left( S^{in}_{T_3(a),T_4(b)}(T_1(f), T_2(g)) \right)$$

forms a linear and continuous solution operator for the outer problem. Because all main results assume $\Sigma$ to be at least an outer $C^{1,1}$-domain, we fix $\Sigma$ for the rest of this section as such a domain, if not stated otherwise. At first, we transform the outer domain $\Sigma$ to a bounded domain $\Sigma^K$. The tool we use is the so-called Kelvin transformation $K_\Sigma$ for domains. We introduce the Kelvin transformation for outer $C^{1,1}$-domains in the following definition.

Definition 6. Let $\Sigma$ be an outer $C^{1,1}$-domain and $x \in \Sigma$ be given. Then we define the Kelvin transformation $K_\Sigma(x)$ of $x$ by

$$K_\Sigma(x) := \frac{x}{|x|^2}.$$

Furthermore, we define $\Sigma^K$ as the Kelvin transformation of $\Sigma$ via

$$\Sigma^K := K_\Sigma(\Sigma) \cup \{0\} = \{K_\Sigma(x) \mid x \in \Sigma\} \cup \{0\}.$$
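The pointwise Kelvin transformation from Definition 6 is easy to experiment with. The following short sketch (the function names `kelvin_point` and `norm` are illustrative choices, not taken from the source) evaluates $K_\Sigma(x) = x/|x|^2$ and can be used to check numerically that a point at distance $R$ from the origin is mapped to distance $1/R$ and that applying the map twice returns the original point.

```python
import math


def norm(x):
    """Euclidean norm of a point given as a list of coordinates."""
    return math.sqrt(sum(c * c for c in x))


def kelvin_point(x):
    """Kelvin transformation K_Sigma(x) = x / |x|^2 for x != 0."""
    r2 = sum(c * c for c in x)
    if r2 == 0.0:
        raise ValueError("the Kelvin transformation is not defined at the origin")
    return [c / r2 for c in x]


# Example: a point on the sphere of radius 3 in R^3 and its image
x = [1.0, -2.0, 2.0]
y = kelvin_point(x)
radius_of_image = norm(y)
back = kelvin_point(y)
```

Here `radius_of_image` is the distance of the image point from the origin and `back` is the result of applying the transformation twice.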


From this point on, we fix the notation in such a way that $\Sigma^K$ always means the Kelvin transformation of $\Sigma$. Figure 3 illustrates the Kelvin transformation of $\Sigma$.

Fig. 3 Kelvin transformation of $\Sigma$ (the plot shows $\Sigma$, $\partial\Sigma$, $\Sigma^K$, $\partial\Sigma^K$, a point $x$, and its image $K_\Sigma(x)$)

We have $K_\Sigma \in C^\infty(\mathbb{R}^n \setminus \{0\}; \mathbb{R}^n \setminus \{0\})$ with $K_\Sigma^2 = \mathrm{Id}_{\mathbb{R}^n \setminus \{0\}}$. Furthermore, we obtain by standard calculus, using the Leibniz formula for the determinant,

$$|\operatorname{Det}(D(K_\Sigma))(x)| \le C_{11}\, |x|^{-2n}$$

for all $x \in \mathbb{R}^n \setminus \{0\}$. This is one of the reasons for the weighted measures of the Sobolev spaces described below. Moreover, the transformation leaves the regularity of the surface invariant. Let $\Sigma$ be an outer $C^{2,1}$-domain. Then $\Sigma^K$ is a bounded $C^{2,1}$-domain. Moreover, we have that $\partial\Sigma^K = K_\Sigma(\partial\Sigma)$. Furthermore, if $\Sigma$ is an outer $C^{1,1}$-domain, we have that $\Sigma^K$ is a bounded $C^{1,1}$-domain. There are geometric situations in which $\partial\Sigma^K$ can be computed easily. For example, if $\partial\Sigma$ is a sphere around the origin with radius $R$, then $\partial\Sigma^K$ is a sphere around the origin with radius $R^{-1}$. Furthermore, if $\partial\Sigma \subset \mathbb{R}^2$ is an ellipse with semiaxes $a$ and $b$ around the origin, then $\partial\Sigma^K$ is a closed curve around the origin which meets the coordinate axes at the distances $a^{-1}$ and $b^{-1}$; it is again an ellipse only in the circular case $a = b$.

Next, we present the transformation for the weak solution of the inner problem back to the outer setting. To this end, we introduce the operator $K$. This is the so-called Kelvin transformation for functions. It transforms a given function $u$, defined on $\Sigma^K$, to a function $K(u)$, defined on $\Sigma$. In addition, it preserves some properties of the original function. We will state some of these properties. After the following considerations, it will be clear why we choose exactly this transformation and how we have to choose the transformations $T_1, \ldots, T_4$ in the following. We start with a definition.

Definition 7. Let $\Sigma$ be an outer $C^{1,1}$-domain and $u$ be a function defined on $\Sigma^K$. Then we define the Kelvin transformation $K(u)$ of $u$, which is a function defined on $\Sigma$, via

$$K(u)(x) := \frac{1}{|x|^{n-2}}\, u\left( \frac{x}{|x|^2} \right),$$


for all $x \in \Sigma$. It is important that this transformation acts as a multiplier when applying the Laplace operator. Note that $-(n-2)$ is the only exponent for $|x|$ which has this property. We have for $u \in C^2(\Sigma^K)$ that $K(u) \in C^2(\Sigma)$ with

$$(\Delta(K(u)))(x) = \frac{1}{|x|^{n+2}}\, (\Delta u)\left( \frac{x}{|x|^2} \right),$$

for all $x \in \Sigma$. As already mentioned above, we will apply $K$ to functions from $H^{1,2}(\Sigma^K)$. So we want to find a normed function space $(V, \|\cdot\|_V)$ such that $K : H^{1,2}(\Sigma^K) \to V$ defines a continuous operator. It turns out that the weighted Sobolev space $H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)$ is a suitable choice. We have the following important result for $K$ acting on $H^{1,2}(\Sigma^K)$.

Theorem 6. Let $\Sigma$ be an outer $C^{1,1}$-domain. For $u \in H^{1,2}(\Sigma^K)$ let $K(u)$ be defined as above for all $x \in \Sigma$. Then we have that

$$K : H^{1,2}(\Sigma^K) \to H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)$$

is a continuous linear operator. Moreover, $K$ is injective.

It is left to provide the remaining transformations $T_1, \ldots, T_4$. In the first part, we treat $T_1$, which transforms the inhomogeneity $f$ of the outer problem in $\Sigma$ to an inhomogeneity of the corresponding inner problem in $\Sigma^K$. Assume $f$ to be a function defined on $\Sigma$. We want to define the function $T_1(f)$ on $\Sigma^K$ such that

$$\Delta u(x) = T_1(f)(x), \quad x \in \Sigma^K, \tag{3}$$

implies that

$$\Delta(K(u))(y) = f(y), \quad y \in \Sigma. \tag{4}$$

We are able to define $T_1$ for functions defined on $\Sigma$ as follows.

Definition 8. Let $\Sigma$ be an outer $C^{1,1}$-domain and $f$ be a function defined on $\Sigma$. Then we define a function $T_1(f)$ on $\Sigma^K$ by

$$T_1(f)(x) := \frac{1}{|x|^{n+2}}\, f\left( \frac{x}{|x|^2} \right),$$

for all $x \in \Sigma^K \setminus \{0\}$ and $T_1(f)(0) = 0$.
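The multiplier property of $K$ under the Laplace operator, which motivates the definition of $T_1$, can be checked numerically. The following is a rough sketch for $n = 3$ under simplifying assumptions: the Laplacian is approximated by second-order central differences, and the test function `u` and all helper names (`kelvin_function`, `t1`, `laplacian_fd`) are hypothetical choices for illustration.

```python
import math


def norm(x):
    return math.sqrt(sum(c * c for c in x))


def kelvin_point(x):
    """K_Sigma(x) = x / |x|^2."""
    r2 = sum(c * c for c in x)
    return [c / r2 for c in x]


def kelvin_function(u, n):
    """Return K(u), where K(u)(x) = |x|^(2-n) * u(x / |x|^2)."""
    return lambda x: norm(x) ** (2 - n) * u(kelvin_point(x))


def t1(f, n):
    """Return T1(f), where T1(f)(x) = |x|^-(n+2) * f(x / |x|^2) and T1(f)(0) = 0."""
    def t1f(x):
        r = norm(x)
        if r == 0.0:
            return 0.0
        return r ** (-(n + 2)) * f(kelvin_point(x))
    return t1f


def laplacian_fd(func, x, h=1e-3):
    """Central finite-difference approximation of the Laplacian of func at x."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (func(xp) - 2.0 * func(x) + func(xm)) / (h * h)
    return total


# Test function with a non-constant, explicitly known Laplacian
u = lambda y: y[0] ** 3 + y[1] ** 2
laplace_u = lambda y: 6.0 * y[0] + 2.0

x = [1.2, -0.7, 0.5]
lhs = laplacian_fd(kelvin_function(u, 3), x)              # Laplacian of K(u) at x
rhs = norm(x) ** (-5) * laplace_u(kelvin_point(x))        # |x|^-(n+2) (Laplace u)(x/|x|^2)
```

Up to discretization error, `lhs` and `rhs` should agree. Applying `t1` twice to a function reproduces that function, since the weights $|x|^{-(n+2)}$ at $x$ and at $x/|x|^2$ cancel.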


$T_1$ is well defined and fulfills the relation described by Eqs. (3) and (4). Furthermore, $T_1$ defines a linear continuous isomorphism

$$T_1 : L^2_{|x|^2}(\Sigma) \to L^2(\Sigma^K),$$

with $T_1^{-1} = T_1$. We want to generalize our inhomogeneities in a way similar to the inner problem. This means we have to identify a normed vector space $(W, \|\cdot\|_W)$ such that $T_1 : W \to (H^{1,2}(\Sigma^K))'$ defines a linear continuous operator. Additionally, we want to end up with a Gelfand triple

$$U \subset L^2_{|x|^2}(\Sigma) \subset W.$$

Consequently, $L^2_{|x|^2}(\Sigma)$ should be a dense subspace. It is possible to prove that the space $\left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'$ is a suitable choice. Recall the Gelfand triple, given by

$$H^{1,2}_{|x|^2,|x|^3}(\Sigma) \subset L^2_{|x|^2}(\Sigma) \subset \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'.$$

Theorem 7. We define a continuous linear operator

$$T_1 : L^2_{|x|^2}(\Sigma) \to (H^{1,2}(\Sigma^K))'$$

by

$$(T_1(f))(h) := \int_{\Sigma^K} (T_1(f))(y)\, h(y) \, d\lambda^n(y), \quad h \in H^{1,2}(\Sigma^K),$$

for $f \in L^2_{|x|^2}(\Sigma)$, where $L^2_{|x|^2}(\Sigma)$ is equipped with the norm $\|\cdot\|_{\left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'}$. This extends uniquely to a linear bounded operator

$$T_1 : \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \to (H^{1,2}(\Sigma^K))'$$

by the BLT theorem, i.e., the extension theorem for bounded linear transformations; see, e.g., Reed and Simon (1972).

Next, we define the transformations for the boundary inhomogeneity $g$ and the coefficients $a$ and $b$. This means we want to find transformations $T_2$, $T_3$, and $T_4$ such that

$$\langle (T_3(a))(x), \nabla u(x) \rangle + (T_4(b))(x)\, u(x) = (T_2(g))(x), \tag{5}$$

for all $x \in \partial\Sigma^K$, yields that

$$\langle a(y), \nabla((K(u))(y)) \rangle + b(y)\, (K(u))(y) = g(y), \tag{6}$$

for all $y \in \partial\Sigma$. We start with the transformation $T_2(g)$ of $g$.


Definition 9. Let $\Sigma$ be an outer $C^{1,1}$-domain and $g$ be a function defined on $\partial\Sigma$. Then we define a function $T_2(g)$ on $\partial\Sigma^K$ by

$$(T_2(g))(x) := g\left( \frac{x}{|x|^2} \right), \quad x \in \partial\Sigma^K.$$

Again we use a Gelfand triple, namely,

$$H^{\frac{1}{2},2}(\partial\Sigma) \subset L^2(\partial\Sigma) \subset H^{-\frac{1}{2},2}(\partial\Sigma).$$

We have that

$$T_2 : L^2(\partial\Sigma) \to L^2(\partial\Sigma^K), \qquad T_2 : H^{\frac{1}{2},2}(\partial\Sigma) \to H^{\frac{1}{2},2}(\partial\Sigma^K),$$

define linear, bounded isometries with $(T_2)^{-1} = T_2$. Moreover, we define a continuous linear operator

$$T_2 : L^2(\partial\Sigma) \to H^{-\frac{1}{2},2}(\partial\Sigma^K)$$

by

$$(T_2(g))(h) := \int_{\partial\Sigma^K} T_2(g)(y)\, h(y) \, dH^{n-1}(y), \quad h \in H^{\frac{1}{2},2}(\partial\Sigma^K),$$

for $g \in L^2(\partial\Sigma)$, where $L^2(\partial\Sigma)$ is equipped with the norm $\|\cdot\|_{H^{-\frac{1}{2},2}(\partial\Sigma)}$. Hence, again the BLT theorem gives a unique continuous continuation

$$T_2 : H^{-\frac{1}{2},2}(\partial\Sigma) \to H^{-\frac{1}{2},2}(\partial\Sigma^K).$$

Closing this section, we give the definitions of the transformations $T_3$ and $T_4$.

Definition 10. Let $\Sigma$ be an outer $C^{1,1}$-domain and $a$ and $b$ be defined on $\partial\Sigma$. We define the operators $T_3$ and $T_4$ via

$$(T_3(a))(x) := |x|^n \left( a\left( \frac{x}{|x|^2} \right) - 2 \left\langle a\left( \frac{x}{|x|^2} \right), e_x \right\rangle e_x \right),$$
$$(T_4(b))(x) := |x|^{n-2} \left( b\left( \frac{x}{|x|^2} \right) + (2-n) \left\langle a\left( \frac{x}{|x|^2} \right), x \right\rangle \right),$$

for all $x \in \partial\Sigma^K$, where $e_x$ denotes the unit vector in the $x$ direction. All operators are well defined and give the relation formulated by Eqs. (5) and (6).


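These operators have the properties

$$T_3 : H^{1,\infty}(\partial\Sigma) \to H^{1,\infty}(\partial\Sigma^K), \qquad T_4 : L^\infty(\partial\Sigma) \to L^\infty(\partial\Sigma^K),$$

if $\Sigma$ is an outer $C^{1,1}$-domain and $a \in H^{1,\infty}(\partial\Sigma)$ for $T_4$, and

$$T_3 : H^{2,\infty}(\partial\Sigma) \to H^{2,\infty}(\partial\Sigma^K), \qquad T_4 : H^{1,\infty}(\partial\Sigma) \to H^{1,\infty}(\partial\Sigma^K),$$

if $\Sigma$ is an outer $C^{2,1}$-domain and $a \in H^{2,\infty}(\partial\Sigma)$ for $T_4$.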
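The role of $T_2$, $T_3$, and $T_4$ is that the boundary expression of the inner problem at $x = y/|y|^2$ reproduces the boundary expression of the outer problem at $y$ once $u$ is replaced by $K(u)$. The sketch below checks this pointwise for $n = 3$, writing $T_3(a)(x) = |x|^n\,(a(x/|x|^2) - 2\langle a(x/|x|^2), e_x\rangle e_x)$ and $T_4(b)(x) = |x|^{n-2}\,(b(x/|x|^2) + (2-n)\langle a(x/|x|^2), x\rangle)$. It is an illustration under assumptions: the gradient of $K(u)$ is approximated by central differences, the sample functions in the usage note are arbitrary, and all function names are hypothetical.

```python
import math


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def norm(p):
    return math.sqrt(dot(p, p))


def kelvin_point(x):
    """K_Sigma(x) = x / |x|^2."""
    r2 = dot(x, x)
    return [c / r2 for c in x]


def kelvin_function(u, n):
    """K(u)(y) = |y|^(2-n) * u(y / |y|^2)."""
    return lambda y: norm(y) ** (2 - n) * u(kelvin_point(y))


def t2(g):
    """T2(g)(x) = g(x / |x|^2)."""
    return lambda x: g(kelvin_point(x))


def t3(a, n):
    """T3(a)(x) = |x|^n * (a(x/|x|^2) - 2 <a(x/|x|^2), e_x> e_x)."""
    def t3a(x):
        r = norm(x)
        e_x = [c / r for c in x]
        a_y = a(kelvin_point(x))
        s = dot(a_y, e_x)
        return [r ** n * (ai - 2.0 * s * ei) for ai, ei in zip(a_y, e_x)]
    return t3a


def t4(a, b, n):
    """T4(b)(x) = |x|^(n-2) * (b(x/|x|^2) + (2 - n) <a(x/|x|^2), x>)."""
    def t4b(x):
        y = kelvin_point(x)
        return norm(x) ** (n - 2) * (b(y) + (2 - n) * dot(a(y), x))
    return t4b


def gradient_fd(func, y, h=1e-5):
    """Central finite-difference approximation of the gradient of func at y."""
    grad = []
    for i in range(len(y)):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        grad.append((func(yp) - func(ym)) / (2.0 * h))
    return grad


def outer_boundary_value(a, b, u, y, n):
    """<a(y), grad K(u)(y)> + b(y) K(u)(y), with a finite-difference gradient."""
    ku = kelvin_function(u, n)
    return dot(a(y), gradient_fd(ku, y)) + b(y) * ku(y)


def inner_boundary_value(a, b, u, grad_u, x, n):
    """<T3(a)(x), grad u(x)> + T4(b)(x) u(x), with an exact gradient of u."""
    return dot(t3(a, n)(x), grad_u(x)) + t4(a, b, n)(x) * u(x)
```

For a sample point `y` and `x = kelvin_point(y)`, the values `outer_boundary_value(a, b, u, y, 3)` and `inner_boundary_value(a, b, u, grad_u, x, 3)` should coincide up to the finite-difference error, for any smooth `u` with gradient `grad_u` and any sample coefficients `a` and `b`.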

4.2 Solution Operator for the Outer Problem

In this section, we want to apply the solution operator of the inner regular problem in order to get a weak solution of the outer problem. To this end, we use a combination of all the operators defined in Sect. 4.1. In order to avoid confusion, we denote the normal vector of $\partial\Sigma$ by $\nu$ and the normal vector of $\partial\Sigma^K$ by $\nu^K$. We start with the classical formulation of the outer oblique boundary problem for the Poisson equation in Definition 11.

Definition 11. Let $\Sigma$ be an outer $C^{1,1}$-domain, $f \in C^0(\Sigma)$, $b, g \in C^0(\partial\Sigma)$, and $a \in C^0(\partial\Sigma; \mathbb{R}^n)$ be given. A function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ such that

$$\Delta u(x) = f(x) \quad \text{for all } x \in \Sigma,$$
$$\langle a(x) \cdot \nabla u(x) \rangle + b(x)\, u(x) = g(x) \quad \text{for all } x \in \partial\Sigma,$$
$$u(x) \to 0 \quad \text{for } |x| \to \infty,$$

is called the classical solution of the outer oblique boundary problem for the Poisson equation.

Now we state the main result of this section, which can be proved by the results on the transformations above.

Theorem 8. Let $\Sigma$ be an outer $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, and $f \in \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'$ such that

$$|\langle (T_3(a))(y), \nu^K(y) \rangle| > C > 0, \tag{7}$$

$$\operatorname*{ess\,inf}_{\partial\Sigma^K} \left( \frac{T_4(b)}{\langle T_3(a), \nu^K \rangle} - \frac{1}{2} \operatorname{div}_{\partial\Sigma^K}\left( \frac{T_3(a)}{\langle T_3(a), \nu^K \rangle} - \nu^K \right) \right) > 0, \tag{8}$$

for all $y \in \partial\Sigma^K$, where $0 < C < \infty$. Then we define

$$u := S^{out}_{a,b}(f,g) := K\left( S^{in}_{T_3(a),T_4(b)}(T_1(f), T_2(g)) \right)$$


as the weak solution to the outer oblique boundary problem for the Poisson equation from Definition 11. $S^{out}_{a,b}$ is injective, and we have for a constant $0 < C_{12} < \infty$

$$\|u\|_{H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)} \le C_{12} \left( \|f\|_{\left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'} + \|g\|_{H^{-\frac{1}{2},2}(\partial\Sigma)} \right).$$

We are able to prove that the Kelvin transformation for functions is also a continuous operator from $H^{2,2}(\Sigma^K)$ to $H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)$. So, we can prove the following regularization result, based on the regularization result for the inner problem; see Theorem 2. The following theorem shows that the weak solution, defined by Theorem 8, is really related to the outer problem, given in Definition 11, although it is not derived from its own weak formulation.

Theorem 9. Let $\Sigma$ be an outer $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma; \mathbb{R}^n)$, $b \in H^{1,\infty}(\partial\Sigma)$ such that the conditions in Eqs. (7) and (8) hold. If $f \in L^2_{|x|^2}(\Sigma)$ and $g \in H^{\frac{1}{2},2}(\partial\Sigma)$, then we have that $u$ provided by Theorem 8 is a strong solution, i.e., $u \in H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)$, and

$$\Delta u = f, \qquad \langle a, \nabla u \rangle + b u = g,$$

almost everywhere on $\Sigma$ and $\partial\Sigma$, respectively. Furthermore, we have an a priori estimate

$$\|u\|_{H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)} \le C_{13} \left( \|f\|_{L^2_{|x|^2}(\Sigma)} + \|g\|_{H^{\frac{1}{2},2}(\partial\Sigma)} \right),$$

with a constant $0 < C_{13} < \infty$.

As a consequence, if the data in Theorem 9 fulfill the requirements of a classical solution, the weak solution $u$ provided by Theorem 8 coincides with this classical solution. At the end of this section, we investigate the conditions on the oblique vector field. Analogous to the regular inner problem, we have the condition in Eq. (8), which is a transformed version of the condition in Eq. (2) and gives a relation between $a$ and $b$, depending on the geometry of the surface $\partial\Sigma$. Moreover, the condition in Eq. (7) is a transformed version of the condition in Eq. (1) and gives the nonadmissible direction for the oblique vector field $a$. For the regular inner problem, the condition in Eq. (1) states the tangential directions as nonadmissible for the oblique vector field. For the outer problem, the direction depends on the direction of the normal vector $\nu(y)$ at the point $y \in \partial\Sigma$ as well as on the direction of $y$ itself. In this section, we will investigate this dependency in detail. Using the definitions of $T_3$ and $T_4$, we can rewrite the condition in Eq. (7) into the equivalent form

$$\left| \cos\left( \angle_{a(x),\, \nu^K\left(\frac{x}{|x|^2}\right)} \right) - 2 \cos\left( \angle_{a(x),\, e_x} \right) \cos\left( \angle_{e_x,\, \nu^K\left(\frac{x}{|x|^2}\right)} \right) \right| > C_{13} > 0, \tag{9}$$

for all $x \in \partial\Sigma$ and $0 < C_{13} < \infty$ independent of $x$. We use the formula

$$\frac{\langle y, z \rangle}{|y| \cdot |z|} =: \cos(\angle_{y,z})$$

for vectors in $\mathbb{R}^n$, where $\angle_{y,z}$ denotes the angle $0 \le \angle_{y,z} \le \pi$ between $y$ and $z$. Going to $\mathbb{R}^2$ and setting

$$C_{14}(x) := \cos\left( \angle_{e_x,\, \nu^K\left(\frac{x}{|x|^2}\right)} \right), \qquad C_{15}(x) := \sin\left( \angle_{e_x,\, \nu^K\left(\frac{x}{|x|^2}\right)} \right),$$

we can explicitly characterize the nonadmissible direction as

$$\angle_{a(x),\, e_x} = \tan^{-1} \left| \frac{C_{14}(x)}{C_{15}(x)} \right|,$$

if $C_{15}(x) \ne 0$, and $\angle_{a(x),\, e_x} = \frac{\pi}{2}$, if $C_{15}(x) = 0$. Generally, transforming the problem to an inner setting transforms the conditions for the coefficients $a$ and $b$. There are circumstances in which we have the same nonadmissible direction as for the inner problem, i.e., the tangential directions are nonadmissible. For example, this is the case if $\partial\Sigma$ is a sphere around the origin. In Fig. 4, the situation for $\Sigma \subset \mathbb{R}^2$ is illustrated; the dashed line indicates the nonadmissible direction, which occurs because of the transformed regularity condition $|\langle T_3(a), \nu^K \rangle| > C > 0$, see Eq. (7).

Fig. 4 Nonadmissible direction for the outer problem (the plot shows $\Sigma$, $\partial\Sigma$, $\Sigma^K$, $\partial\Sigma^K$, a point $x$ with $e_x$, $-\nu(x)$, and $a(x)$, and the image $K_\Sigma(x)$ with $\nu^K(K_\Sigma(x))$)
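The characterization of the nonadmissible direction in $\mathbb{R}^2$ can be evaluated directly from the two vectors $e_x$ and $\nu^K$. The following small sketch computes the angle $\angle_{a(x),e_x}$ at which the transformed regularity condition fails, using $C_{14} = \cos(\angle_{e_x,\nu^K})$ and $C_{15} = \sin(\angle_{e_x,\nu^K})$; the function name `nonadmissible_angle` and the tolerance used to detect $C_{15} = 0$ are assumptions made for this illustration.

```python
import math


def nonadmissible_angle(e_x, nu_k, tol=1e-14):
    """Angle between a(x) and e_x for which condition (7) is violated in R^2.

    e_x  : direction of the point x (need not be normalized)
    nu_k : normal vector of the transformed boundary at x / |x|^2
    Returns atan(|C14 / C15|) if C15 != 0 and pi/2 otherwise.
    """
    dot = e_x[0] * nu_k[0] + e_x[1] * nu_k[1]
    len_e = math.hypot(e_x[0], e_x[1])
    len_nu = math.hypot(nu_k[0], nu_k[1])
    c14 = max(-1.0, min(1.0, dot / (len_e * len_nu)))  # cosine of the angle
    c15 = math.sqrt(1.0 - c14 * c14)                   # sine, nonnegative on [0, pi]
    if c15 < tol:
        return math.pi / 2.0
    return math.atan(abs(c14 / c15))


# Sphere around the origin: nu^K is parallel to e_x, so the tangential
# direction (angle pi/2 to e_x) is the nonadmissible one.
angle_sphere = nonadmissible_angle([1.0, 0.0], [1.0, 0.0])
```

For a boundary whose transformed normal is not parallel to $e_x$, the returned angle deviates from $\pi/2$, which reflects the dependence on the geometry described above.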

4.3 Ritz-Galerkin Method

In this section, we provide a Ritz-Galerkin method for the weak solution to the outer problem. To this end, we use the approximation of the weak solution to the corresponding inner problem, provided in Sect. 3. Assume $\Sigma$ to be an outer $C^{1,1}$-domain. Furthermore, let $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, and $f \in \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)'$ such that the conditions in Eqs. (7) and (8) are fulfilled. We want to approximate the weak solution $u$ to the outer oblique boundary problem, provided by Theorem 8. Let $a$ and $F$ be defined by

$$a(\eta, \psi) := -\sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma^K)}\Big\langle \eta\, \frac{T_3(a)_i}{\langle T_3(a), \nu^K \rangle}, (\nabla_{\partial\Sigma^K} \psi)_i \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma^K)} - \int_{\Sigma^K} (\nabla\eta, \nabla\psi) \, d\lambda^n - \int_{\partial\Sigma^K} \eta\, \frac{T_4(b)}{\langle T_3(a), \nu^K \rangle}\, \psi \, dH^{n-1},$$

$$F(\eta) := {}_{H^{1,2}(\Sigma^K)}\langle \eta, T_1(f) \rangle_{(H^{1,2}(\Sigma^K))'} - {}_{H^{\frac{1}{2},2}(\partial\Sigma^K)}\Big\langle \eta, \frac{T_2(g)}{\langle T_3(a), \nu^K \rangle} \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma^K)},$$

for $\eta, \psi \in H^{1,2}(\Sigma^K)$. Furthermore, let $(V_n)_{n\in\mathbb{N}}$ be an increasing sequence of finite-dimensional subspaces of $H^{1,2}(\Sigma^K)$, i.e., $V_n \subset V_{n+1}$, such that $\overline{\bigcup_{n\in\mathbb{N}} V_n} = H^{1,2}(\Sigma^K)$. Then there exists for each $n \in \mathbb{N}$ a unique $\psi_n \in V_n$ with

$$a(\eta, \psi_n) = F(\eta) \quad \text{for all } \eta \in V_n;$$

see Sect. 3.4. Moreover, $\psi_n$ can be computed explicitly by solving a linear system of equations. In Sect. 3.4, we have also seen that

$$\|\psi - \psi_n\|_{H^{1,2}(\Sigma^K)} \le C_{16}\, \operatorname{dist}(\psi, V_n) \xrightarrow{\,n\to\infty\,} 0.$$

So, using the continuity of the operator $K$ (see Theorem 6), we consequently get the following result.

Theorem 10. Let $u$ be the weak solution provided by Theorem 8 to the outer problem and $\psi$, $(\psi_n)_{n\in\mathbb{N}}$ taken from Theorems 1 and 3, both corresponding to $a$, $b$, $g$, $f$, and $\Sigma$, given at the beginning of this section. Then

$$\|u - K(\psi_n)\|_{H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)} \le C_{17}\, \operatorname{dist}(\psi, V_n) \xrightarrow{\,n\to\infty\,} 0.$$

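4.4 Stochastic Extensions and Examples

In this section, we implement stochastic inhomogeneities as well as stochastic weak solutions for the outer setting. Again we start by defining the spaces of stochastic functions. So, let $\Sigma$ be an outer $C^{1,1}$-domain and $(\Omega, \mathcal{F}, P)$ a probability space, arbitrary but fixed, such that $L^2(\Omega, P)$ is separable. We define

$$\left( H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) \right)_\Omega := L^2(\Omega, P) \otimes H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) \cong L^2\left( \Omega, P; H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) \right),$$
$$\left( H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma) \right)_\Omega := L^2(\Omega, P) \otimes H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma) \cong L^2\left( \Omega, P; H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma) \right),$$
$$\left( L^2_{|x|^2}(\Sigma) \right)_\Omega := L^2(\Omega, P) \otimes L^2_{|x|^2}(\Sigma) \cong L^2\left( \Omega, P; L^2_{|x|^2}(\Sigma) \right),$$
$$\left( \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \right)_\Omega := L^2(\Omega, P) \otimes \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \cong L^2\left( \Omega, P; \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \right),$$
$$H^{\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{\frac{1}{2},2}(\partial\Sigma) \cong L^2\left( \Omega, P; H^{\frac{1}{2},2}(\partial\Sigma) \right),$$
$$L^2_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes L^2(\partial\Sigma) \cong L^2\left( \Omega, P; L^2(\partial\Sigma) \right),$$
$$H^{-\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{-\frac{1}{2},2}(\partial\Sigma) \cong L^2\left( \Omega, P; H^{-\frac{1}{2},2}(\partial\Sigma) \right).$$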

Because all spaces above are separable, we can again use the isomorphisms to Hilbert space-valued random variables. Thus, we can prove the following main result of this section by defining the stochastic solution operator pointwise.

Theorem 11. Let $\Sigma$ be an outer $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac{1}{2},2}_\Omega(\partial\Sigma)$, and $f \in \left( \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \right)_\Omega$, such that the conditions in Eqs. (7) and (8) hold. Then we define

$$u(\cdot, \omega) := S^{out}_{a,b}(f(\cdot,\omega), g(\cdot,\omega)),$$

for $dP$-almost all $\omega \in \Omega$. $u$ is called the stochastic weak solution to the outer oblique boundary problem for the Poisson equation. Furthermore, we have for a constant $0 < C_{18} < \infty$

$$\|u\|_{\left( H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma) \right)_\Omega} \le C_{18} \left( \|f\|_{\left( \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \right)_\Omega} + \|g\|_{H^{-\frac{1}{2},2}_\Omega(\partial\Sigma)} \right).$$

Moreover, we have the following result for a stochastic strong solution.

Theorem 12. Let $\Sigma$ be an outer $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma; \mathbb{R}^n)$, $b \in H^{1,\infty}(\partial\Sigma)$ such that the conditions in Eqs. (7) and (8) hold. If $f \in \left( L^2_{|x|^2}(\Sigma) \right)_\Omega$ and $g \in H^{\frac{1}{2},2}_\Omega(\partial\Sigma)$, then we have $u \in \left( H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) \right)_\Omega$, for $u$ provided by Theorem 11, and

$$\Delta u(x,\omega) = f(x,\omega), \qquad \langle a(y), \nabla u(y,\omega) \rangle + b(y)\, u(y,\omega) = g(y,\omega),$$

for $\lambda^n$-almost all $x \in \Sigma$, for $H^{n-1}$-almost all $y \in \partial\Sigma$, and for $dP$-almost all $\omega \in \Omega$. Furthermore, we have an a priori estimate

$$\|u\|_{\left( H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) \right)_\Omega} \le C_{19} \left( \|f\|_{\left( L^2_{|x|^2}(\Sigma) \right)_\Omega} + \|g\|_{H^{\frac{1}{2},2}_\Omega(\partial\Sigma)} \right),$$

with a constant $0 < C_{19} < \infty$.

Again a Ritz-Galerkin method is also available for the stochastic weak solution. It is left to the reader to write down the details. As mentioned, we close the section with examples for stochastic data. These are used in geomathematical applications in order to model noise on measured values. In the following, we give examples for the outer problem. They are also suitable for the inner problem.

4.4.1 Gaussian Inhomogeneities

We choose the probability space $(\Omega, \mathcal{F}, P)$ such that $X_i$, $1 \le i \le n_1$, are $P \otimes \lambda^n$-measurable and $Y_j$, $1 \le j \le n_2$, are $P \otimes H^{n-1}$-measurable with $X_i(\cdot, x)$, $x \in \Sigma$, and $Y_j(\cdot, x)$, $x \in \partial\Sigma$, Gaussian random variables with expectation value 0 and variance $f^2_{\sigma_i}(x)$ or variance $g^2_{\sigma_j}(x)$, respectively. Here, $f_{\sigma_i} \in L^2_{|x|^2}(\Sigma)$ and $g_{\sigma_j} \in L^2(\partial\Sigma)$. We define

$$f(\omega, x) := f_\mu(x) + \sum_{i=1}^{n_1} X_i(\omega, x), \qquad g(\omega, x) := g_\mu(x) + \sum_{j=1}^{n_2} Y_j(\omega, x),$$

where $f_\mu \in L^2_{|x|^2}(\Sigma)$ and $g_\mu \in L^2(\partial\Sigma)$. To use this kind of inhomogeneity, we must show

$$f \in L^2(\Omega \times \Sigma, P \otimes |x|^4 \cdot \lambda^n) \quad \text{and} \quad g \in L^2(\Omega \times \partial\Sigma, P \otimes H^{n-1}).$$

It is easy to see that the inhomogeneities defined in this way fulfill these requirements and the main results are applicable. Such a Gaussian inhomogeneity is shown in Fig. 5.

Fig. 5 Data with Gaussian noise

4.4.2 Gauss-Markov Model

Here, we refer to Freeden and Maier (2002), in which an application of the example from the previous section can be found. The authors use a random field

$$h(\omega, x) := H(x) + Z(\omega, x)$$

to model an observation noise, where $x \in \partial B_1(0) \subset \mathbb{R}^3$ and $\omega \in \Omega$ with a probability space $(\Omega, \mathcal{F}, P)$. Here, $Z(\cdot, x)$, $x \in \partial B_1(0)$, is a Gaussian random variable with expectation value 0 and variance $\sigma^2 > 0$. Additionally, $H \in L^2(\partial B_1(0))$ and the covariance is given by

$$\operatorname{cov}(Z(\cdot, x_1), Z(\cdot, x_2)) = K(x_1, x_2),$$

where $K : \partial B_1(0) \times \partial B_1(0) \to \mathbb{R}$ is a suitable kernel. Two geophysically relevant kernels are, e.g.,

$$K_1(x_1, x_2) := \frac{\sigma^2}{(M+1)^2} \sum_{n=1}^{M} \frac{2n+1}{4\pi}\, P_n(x_1 \cdot x_2), \quad 0 \le M < \infty,$$
$$K_2(x_1, x_2) := \frac{\sigma^2}{\exp(c)}\, \exp(c\,(x_1 \cdot x_2)).$$

$P_n$, $1 \le n \le M$, are the Legendre polynomials defined on $\mathbb{R}$. The noise model corresponding to the second kernel is called the first-degree Gauss-Markov model. If one chooses a $P \otimes H^{n-1}$-measurable random field $Z$, then $h$ fulfills the requirements. The existence of a corresponding probability measure $P$ is provided in infinite-dimensional Gaussian analysis; see, e.g., Berezanskij (1995).

4.4.3 Noise Model for Satellite Data

In this section, we give another concrete application, which can be found in Bauer (2004). There, stochastic inhomogeneities are used to implement a noise model for satellite data. To this end, random fields of the form

$$h(\omega, x) := \sum_{i=1}^{m} h_i(x)\, Z_i(\omega)$$

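are used, where $x \in \partial\Sigma \subset \mathbb{R}^3$ and $\omega \in \Omega$ with a suitable probability space $(\Omega, \mathcal{F}, P)$. Here, $\partial\Sigma$ could be, e.g., the Earth's surface, and we are searching for harmonic functions in the space outside the Earth. $Z_i$ are Gaussian random variables with expectation value 0 and variance $\sigma_i^2 > 0$, and the $h_i$ fulfill the assumptions of Sect. 4.4.1. If one chooses $(\Omega, \mathcal{F}, P)$ as $\left( \mathbb{R}^m, \mathcal{B}(\mathbb{R}^m), \gamma^{0,\sigma_i}_{\operatorname{cov}_{ij}} \right)$, where

$$\gamma^{0,\sigma_i}_{\operatorname{cov}_{ij}} := \frac{1}{\sqrt{(2\pi)^m \det(A)}}\, e^{-\frac{1}{2}(y, A^{-1}y)} \, d\lambda^m, \qquad a_{ij} := \operatorname{cov}(Z_i, Z_j), \quad 1 \le i, j \le m,$$

one has a realization of $Z_i$ as the projection on the $i$th component in the separable space $L^2\left( \mathbb{R}^m, \gamma^{0,\sigma_i}_{\operatorname{cov}_{ij}} \right)$.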
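A realization of the noise field $h(\omega, x) = \sum_i h_i(x) Z_i(\omega)$ with a prescribed covariance matrix $A = (\operatorname{cov}(Z_i, Z_j))$ can be drawn by the standard Cholesky approach: factor $A = LL^T$ and set $Z = L\xi$ with independent standard normal $\xi_k$. The sketch below illustrates this under assumptions and is not the procedure of Bauer (2004): the covariance matrix and the functions $h_i$ in the usage note are invented sample data, and the helper names `cholesky`, `sample_z`, and `noise_field` are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor L with L L^T = a for a symmetric positive definite a."""
    m = len(a)
    lower = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def sample_z(lower, rng):
    """Draw one centered Gaussian vector Z = L xi with covariance L L^T."""
    m = len(lower)
    xi = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return [sum(lower[i][k] * xi[k] for k in range(i + 1)) for i in range(m)]


def noise_field(h_funcs, z):
    """Return the realization x -> sum_i h_i(x) * z_i of the random field."""
    return lambda x: sum(h(x) * zi for h, zi in zip(h_funcs, z))


# Sample data (assumed for illustration only)
cov = [[4.0, 1.2], [1.2, 1.0]]
h_funcs = [lambda x: x[0], lambda x: 1.0]

rng = random.Random(0)  # fixed seed for reproducibility
z = sample_z(cholesky(cov), rng)
h_omega = noise_field(h_funcs, z)
value = h_omega([3.0, 0.0, 0.0])
```

Each call to `sample_z` corresponds to one $\omega$, and `h_omega` is the resulting deterministic boundary function that can be passed on as a noisy inhomogeneity.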


5 Future Directions

In this section, we want to point out one direction of further investigations. We have seen how to provide the existence of a weak solution to the outer oblique boundary problem for the Poisson equation. To this end, we introduced several transformations. In Theorem 7, we proved for the transformation of the space inhomogeneity $f$

$$T_1 : \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \to \left( H^{1,2}(\Sigma^K) \right)'.$$

This transformation is not bijective, i.e., $T_1\left( \left( H^{1,2}_{|x|^2,|x|^3}(\Sigma) \right)' \right) \ne (H^{1,2}(\Sigma^K))'$. Finding a Hilbert space $V$ such that $T_1 : V \to \left( H^{1,2}(\Sigma^K) \right)'$ is bijective would lead to the existence of a weak solution for an even larger class of inhomogeneities. Moreover, we have for the transformation $K$ of the weak solution to the inner problem

$$K : H^{1,2}(\Sigma^K) \to H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma),$$

where again

$$K(H^{1,2}(\Sigma^K)) \ne H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma);$$

see Theorem 6. Finding a Hilbert space $W$ such that $K : H^{1,2}(\Sigma^K) \to W$ is bijective would give us uniqueness of the solution and more detailed information about the behavior of $u$ and its weak derivatives when $x$ tends to infinity. Additionally, we would be able to define a bijective solution operator for the outer problem. This could be used to find the right Hilbert spaces such that a Poincaré inequality is available. Consequently, the Lax-Milgram lemma would be applicable directly to a weak formulation for the outer setting, which can be derived similarly to the inner problem. Then we might have to consider a regular outer problem, because the tangential direction is forbidden for the oblique vector field if we want to derive a weak formulation. In turn, we get rid of the transformed regularity condition on $a$. The results presented in this chapter are then still an alternative in order to get weak solutions for tangential $a$. Moreover, the availability of a Poincaré inequality would lead to existence results for weak solutions to a broader class of second-order elliptic partial differential operators in outer domains. See, e.g., Alt (2002) for such second-order elliptic partial differential operators for inner domains. Instead of using the Ritz-Galerkin approximation, it is possible to approximate solutions to oblique boundary-value problems for harmonic functions with the help of geomathematical function systems, e.g., spherical harmonics. For such an approach, see, e.g., Freeden and Michel (2004).

6 Conclusion

The analysis of inner oblique boundary-value problems is rather well understood, and we reached the limit when searching for weak solutions under assumptions that are as weak as possible. The outer problem still poses difficulties because of the unboundedness of the domain. As mentioned in Sect. 5, finding the right distribution spaces such that a Poincaré inequality holds might lead to bijective solution operators for an even broader class of inhomogeneities. Nevertheless, we are already able to provide weak solutions to the outer problem, as presented in the previous sections, for very general inhomogeneities. Stochastic weak solutions for stochastic inhomogeneities as used in geomathematical applications can also be provided, and approximation methods for the weak solutions are available.

References
Adams RA (1975) Sobolev spaces. Academic, New York
Alt HW (2002) Lineare Funktionalanalysis. Springer, Berlin
Bauer F (2004) An alternative approach to the oblique derivative problem in potential theory. Shaker, Aachen
Berezanskij YM (1995) Spectral methods in infinite dimensional analysis. Kluwer, Dordrecht
Dautray R, Lions JL (1988) Mathematical analysis and numerical methods for science and technology. Functional and variational methods, vol 2. Springer, Berlin
Dobrowolski M (2006) Angewandte Funktionalanalysis. Springer, Berlin
Freeden W, Maier T (2002) On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron Trans Numer Anal 14:56–78
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston
Gilbarg D, Trudinger NS (1998) Elliptic partial differential equations of second order. Springer, Berlin
Grothaus M, Raskop T (2006) On the oblique boundary problem with a stochastic inhomogeneity. Stochastics 78(4):233–257
Grothaus M, Raskop T (2009) The outer oblique boundary problem of potential theory. Numer Funct Anal Optim 30(7–8):1–40
Gutting M (2008) Fast multipole methods for oblique derivative problems. Shaker, Aachen
Miranda C (1970) Partial differential equations of elliptic type. Springer, Berlin
Reed M, Simon B (1972) Methods of modern mathematical physics. Functional analysis, vol 1. Academic, New York
Rozanov Y, Sanso F (2001) The analysis of the Neumann and oblique derivative problem: the theory of regularization and its stochastic version. J Geod 75(7–8):391–398
Rozanov Y, Sanso F (2002a) On the stochastic versions of Neumann and oblique derivative problems. Stochast Stochast Rep 74(1–2):371–391
Rozanov Y, Sanso F (2002b) The analysis of the Neumann and oblique derivative problem: weak theory. In: Geodesy: challenge of the 3rd millennium. Springer, Berlin


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_36-4 © Springer-Verlag Berlin Heidelberg 2014

Geodetic Deformation Analysis with Respect to an Extended Uncertainty Budget Hansjörg Kutterer Bundesamt für Kartographie und Geodäsie, Frankfurt am Main, Germany

Abstract This chapter reports current activities and recent progress in the field of geodetic deformation analysis if a refined uncertainty budget is considered. This is meaningful in the context of a thorough system-theoretical assessment of geodetic monitoring and it leads to a more complete formulation of the modeling and analysis chain. The work focuses on three major topics: the mathematical modeling of an extended uncertainty budget, the adequate adaptation of estimation and analysis methods, and the consequences for one outstanding step of geodetic deformation analysis – the test of a linear hypothesis. The essential outcome is a consistent assessment of the quality of the final decisions such as the significance of a possible deformation.

1 Introduction Geodetic monitoring of spatial objects is concerned with the detection of deviations with time or due to physical influences. It has been a subject of geodetic research for decades. Typical fields of interest are geokinematics, the monitoring of large structures, and quality control in industrial production processes. The purpose of a particular monitoring study can originate from a scientific problem but also from public safety demands. Obviously, this purpose can vary as significantly as the relevant temporal and spatial scales. Studies in geokinematics refer, e.g., to the modeling and determination of regional and global terrestrial reference frames. They are strongly related to plate tectonics and recent crustal movements (e.g., Drewes and Heidbach 2004). Regarding the motion of tectonic plates, the relevant temporal scales range from years to decades. Shorter time scales have to be considered, e.g., for early warning systems for volcanoes or landslides. Studies in structural monitoring are concerned with large but local structures such as bridges or dams (e.g., Roberts et al. 2003). Here, time scales can range from subseconds to decades. The actual differences in scales require dedicated and sensitive instrumentation; they refer to the possible observables and thus to metrology and technology, respectively. Despite all indicated differences, a common methodology can be developed and used since geodetic monitoring always comprises aspects of modeling, observation, and analysis (Heunecke and Welsch 2001). As the basic result of geodetic monitoring, deviations in terms of rigid body motions of the object and changes of its geometric shape are determined. By convention, both types of deviation are called deformation; this notion will also be used throughout this chapter. Modeling refers to both the geometric-physical representation of the considered object and the description of the deformation process. Observation means the time-referenced measurement of
relevant metric object properties such as the positions and position changes of control points using a redundant geodetic network or a related configuration. Observation includes the mathematical formulation of error models for sensors and sensor systems as well as of configurations using observation equations and common parameters. Analysis comprises the adjustment or filtering of the temporally referenced measurements including the derivation of object parameters, the statistical assessment of the observations and of the results using hypothesis tests, and the prediction of unobservable object states. Note that a meaningful analysis relies on suitable object and observation models. This will be dealt with in detail later on. It is not possible to cover all present-day problems and work in the field of geodetic deformation monitoring and analysis within one paper. For this reason, the focus is put – after a general description – on current research which addresses a more comprehensive modeling of the observation process as a fundamental basis for all following processing, analysis, and interpretation. This chapter is organized as follows. In Sect. 2, a general survey of geodetic monitoring is presented which particularly addresses geodetic deformation analysis. Section 3 is concerned with the formulation and mathematical modeling of an extended uncertainty budget. Modeling alternatives are discussed. In Sect. 4, this extension is applied to statistical hypothesis testing. Some illustrative examples are presented. This chapter ends with a short summary and an outlook in Sect. 5. Some remarks concerning the mathematical notation are added. As it is good practice in geodesy, vectors are indicated by lowercase bold letters and matrices are indicated by uppercase bold letters. All vector or matrix components, respectively, are real variables or values. 
Variance-covariance matrices are indicated by the bold uppercase Greek $\boldsymbol{\Sigma}$ with the associated random vector referenced in the index. All further notations are introduced in the respective sections.

2 Scientific Relevance 2.1 Background Geodetic monitoring is a complex task with a multitude of aspects to be considered. Regarding its application in engineering practice, it can be considered as a standardized sequence of routine steps. From a scientific point of view, it is a physical experiment whose main purpose is to provide insight into the geometric or physical properties of the object under consideration. On the one hand, geodetic monitoring has a diagnostic component which refers to the generation, testing, and refinement of the knowledge about the object properties. On the other hand, it has a prognostic component which refers to the prediction of a priori unknown object states with respect to temporal changes or physical influences. Therefore, it is necessary to assess the deformation process thoroughly and to describe it by mathematical means. It will be outlined that although geodetic deformation analysis is already practicable to a remarkable extent, there are still several shortcomings in methodology. In geodetic monitoring, the object state is described by geometric and physical properties referring either to discrete, representative object points such as absolute or relative position vectors, or to the position, orientation, and shape of the object as a continuous body. The first type of information can be obtained using GPS (or any other space-geodetic technique) or – mainly in close-range applications – total stations and levels. The second type of information is provided, e.g., by digital cameras, laser scanners, or interferometric radar. The actual components of the parameter vector describing the object state depend on the observables which are defined by the sensors and sensor systems in use. It is also possible to observe or derive time derivatives of the quantities mentioned above. Here, inclinometers, accelerometers, and inertial navigation systems may serve as examples of respective sensors and sensor systems. In the context of system theory, the object can be considered as a dynamic system which itself is a part of a dynamic system as it is embedded into a physical environment (Welsch et al. 2000). The Earth, e.g., is a body which is deformed by atmospheric and oceanic loading, by lunisolar gravitation, by groundwater variations, and many other forces. The shape of a water dam, as a second example, is sensitive to the filling level, to temperature changes, and to the physical properties of the underground. For this reason, it is essential either to observe all influences that are considered as relevant after a thorough system assessment or – if possible – to control or stabilize them during the observation procedure. Typically, the state of an object subject to a deformation process is repeatedly observed based on a geodetic network within a finite time interval. Thus, a time-discrete sequence of the so-called epochs of object states is generated. If the repetition (or sampling) rate is sufficiently high, the temporal representation of the deformation process could be considered as continuous. In this case, the theory of stochastic processes and time series can be applied. Although there has been some recent progress in this field (Neuner and Kutterer 2007), a further discussion is restricted to the analysis of a smaller and finite number of time-discrete object states.

Fig. 1 Classification of geodetic deformation models. Descriptive models (based on the phenomenon): congruence model (spatial reference) and kinematic model (spatiotemporal reference). Causal models (based on the cause–effect relation): static model (spatial reference) and dynamic model (spatiotemporal reference)

2.2 Modeling of a Deformation Process In geodesy, there are several well-established approaches in deformation modeling and analysis. They can be classified primarily along two main lines with respect to the possible understanding of a deformation process. The first group of approaches (and hence models) is called descriptive. These approaches are based on the phenomenology of the process by observing only its effects using geodetic means. The second group of approaches and models is called causal as they refer to the cause-and-effect relation between quantities influencing the process and observable reactions of the object (Heunecke and Welsch 2001); see Fig. 1 for an overview sketch. Within the descriptive approaches there is a distinction between two models: the congruence model that allows a pure comparison of different states and the kinematic model that includes time as a relevant parameter. The congruence model refers to the identity of object states of successive epochs. The kinematic model describes the temporal evolution of a deformation process. Like in
mechanics, acting forces are not taken into account. Congruence models are often used for simply comparing different states, e.g., in the long-term monitoring of structures such as dams, in quality control, or if the number of epochs is rather small. Kinematic models are set up in case of a higher number of epochs or in a quasi-continuous case. The present-day modeling of a terrestrial reference frame with station positions and velocities as well as, e.g., annual components is a typical example of a kinematic model. The causal approaches rely on the input-output relation of the deformation process. Here, in contrast to the descriptive approaches, acting forces can be considered as influence quantities which are either permanently present such as gravitation or induced by, e.g., temperature changes. These approaches can be based on structural models given by a physically meaningful set of differential equations. Besides, they can rely on substitutes such as a set of mathematical functions which simply correlate causes and effects in analogy to the descriptive approaches. Also, within the causal approaches there is a distinction between two more specific models. The static model refers to physical equilibrium states of the object due to changes in the acting forces. The controlled bending of a bridge during a loading experiment may serve as an example. As in the case of the congruence model, time is not a relevant parameter. Among all mentioned deformation models, the dynamic model is most comprehensive as it takes both temporal evolution and acting forces into account. Illustrative examples are the changes of the Earth’s shape due to, e.g., variable loading effects or the high-frequency vibrations of a road bridge due to wind and traffic load. At present, the processing of geodetic observations in the framework of geodetic deformation analysis is based on one of two main techniques. The first one is based on linear estimation theory: the parameters describing the object state are estimated epoch by epoch by a least-squares adjustment of the underlying geodetic network according to the state of the art of network adjustment. Thus, outlier tests are performed and the associated variance-covariance matrix (vcm) is derived. For a meaningful deformation analysis, it is indispensable to ensure a unique geodetic datum for all epochs. For general information on parameter estimation see, e.g., Koch (1999). In case of a congruence model, the object states of the different epochs are compared by means of a linear hypothesis test regarding the null hypothesis “congruence” or “identity” of a subset of all network points. If the null hypothesis is rejected, the difference and hence the deformation is considered as significant. Then, a refined analysis is started in order to identify the most plausible deformation model in terms of stable and unstable points. In addition, parameters of rigid body motion or affine strain can be estimated for point groups with similar position deviations between the epochs. In contrast to the following models, the congruence model does not allow a functional integration of the results of the single epochs. Thus, it allows to some extent the diagnosis of a deformation process but not a prognosis or prediction, respectively. In case of a kinematic model, the single-epoch solutions are combined in a joint estimation based on a functional model for the temporal evolution of the deformation process. The choice of a suitable class of functions is crucial; different kinds of polynomials such as Bernstein polynomials have been suggested (Rawiel 2001). The kinematic model allows both diagnosis and prediction in contrast to the congruence model. As an alternative to least-squares estimation, the Kalman filter was proposed for the mathematical treatment of the kinematic model in geodetic deformation analysis (Heunecke 1994).
The Kalman filter is a recursive state-space filter which combines system information and measurement information in an optimum way. In the Kalman filter approach the system information is provided
in terms of the so-called system equation. In its time-discrete formulation, it is generally represented by the discretized solution of a first-order system of inhomogeneous differential equations. The system equation allows the temporal prediction of a given system state. The measurement equation is formulated as a Gauss-Markov model. The consistency of system and measurement information is checked based on the test of the so-called innovation which indicates the discrepancy between the predicted and the actually observed system state. In case of a static or dynamic model, the use of the Kalman filter is more or less mandatory as it allows one to integrate structural information on the system and observations of the system state in an elegant way. Unobservable system parameters such as material constants can be derived from geodetic observations within the filtering process. Eichhorn (2007) describes, e.g., the dynamic modeling of a thermally induced deformation. All procedures that were briefly introduced above consist of a modeling step, an estimation (or filtering) step, and a testing step. Several decisions are required during the analysis such as the identification and elimination of likely observation outliers, or the adaptation of the variance level that is basic for the significance of a deformation and hence for the probabilities of Type I and Type II errors.
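
As a concrete instance of the testing step, the global congruence test for a single control point observed in two epochs can be sketched in a few lines of Python. The sketch is not taken from the chapter. It assumes that both epochs refer to the same geodetic datum, that the epochs are uncorrelated, and that the variance factor is known, so that the quadratic form of the coordinate differences can be compared with a chi-square quantile. The coordinates, the variance-covariance matrices, and the helper names `solve` and `congruence_test` are hypothetical.

```python
"""Minimal sketch of a global congruence test between two epochs.

Assumptions (not from the chapter): common datum, uncorrelated epochs,
known variance factor, chi-square critical value at the 95 % level.
"""


def solve(a, b):
    """Solve a @ x = b for a small square system by Gaussian elimination."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def congruence_test(x1, vcm1, x2, vcm2, critical_value):
    """Return (test statistic, decision) for the null hypothesis x1 == x2."""
    d = [b - a for a, b in zip(x1, x2)]
    # Uncorrelated epochs: the vcm of the difference is the sum of both vcms.
    vcm_d = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(vcm1, vcm2)]
    w = solve(vcm_d, d)
    t = sum(di * wi for di, wi in zip(d, w))  # quadratic form d^T vcm_d^-1 d
    return t, t > critical_value


# Hypothetical 2D control point (metres) observed in two epochs.
epoch1 = [100.000, 200.000]
epoch2 = [100.004, 200.003]
vcm = [[1.0e-6, 0.0], [0.0, 1.0e-6]]  # 1 mm standard deviation per coordinate
CHI2_95_2DOF = 5.991                  # 95 % quantile, 2 degrees of freedom

statistic, significant = congruence_test(epoch1, vcm, epoch2, vcm, CHI2_95_2DOF)
print(f"T = {statistic:.2f}, deformation significant: {significant}")
```

With an estimated instead of a known variance factor, the statistic would have to be scaled accordingly and compared with a Fisher quantile; this variant is omitted here for brevity.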

2.3 Uncertainty Assessment Up to now, only the functional modeling of the object of interest and of the deformation process was addressed. However, geodetic monitoring and geodetic deformation analysis suffer from several kinds of uncertainty which have to be taken into account in a suitable way. For each epoch, the precision of the estimated parameters of the system state is described by their vcm. The adaptation of the theoretical vcm to a more realistic uncertainty level is possible through the estimated variance of the unit weight. Note that the estimated variance typically reflects only the consistency of the chosen model and the data within a single epoch but not for the set of all epochs. However, for geodetic deformation analysis a more comprehensive assessment is required since the sequence of modeling, estimation, and analysis is accompanied by a variety of assumptions and simplifications. A common assumption is the stability of the observation frame or experimental frame, respectively, within a particular epoch or between different epochs. As an optimal stability is not possible in geodetic monitoring, possible differences are partially compensated – but not fully eliminated – by correction models, e.g., for atmospheric refraction, by algebraic operations such as observation differencing and by formulation of additional parameters or correlations between the observations. This is, in particular, true in the case of imperfections of the materialization and the persistence of the coordinate reference frame. The repeatability of the reference frame is crucial for a deformation analysis as respective uncertainties immediately lead to a loss of accuracy and sensitivity – even in the absence of actual deformations. Seasonal variations were reported, e.g., for observation stations in terrestrial reference frames which could be induced by groundwater variations (Dong et al. 1997). Such effects can be considered as reference frame-induced noise. The influences indicated above increase the amount of uncertainty which has to be taken into account. It depends on the definition of the experimental frame if they act in a random or in a systematic way. But there are more types of uncertainties to be considered in deformation analysis. The spatial and temporal discretization of the deformation process is a typical simplification which is often due to the available equipment. However, the information about the spatial and temporal gaps between the actually observed control points is lost. It can only be reconstructed based on
assumptions. Moreover, some object properties could have been neglected when the object state was modeled. This kind of uncertainty corresponds with the incompleteness of the model with respect to the scientific problem of interest. A last kind of uncertainty to be mentioned here is the ambiguity which is due to the alternatives in the functional model of the deformation process. The following discussion concentrates on random and systematic influences in geodetic monitoring which affect the results of a geodetic deformation analysis and thus the decisions based on them. There are several mathematical concepts and theories which allow the formulation and treatment of uncertainty. The theory of stochastics is well established. Corresponding uncertainty measures are based on variance-covariance matrices which are derived from observed data or which can rely on model assumptions. The stochastic way is strongly applied in geodesy, metrology, and many other disciplines – namely for the expression of the uncertainty of measurement results. Some competing concepts based on, e.g., interval mathematics and fuzzy theory are introduced, motivated, and discussed in Sect. 3.

3 Key Issues 3.1 Modeling Chain in Geodetic Deformation Analysis Present-day geodetic deformation analysis is organized as a sequence or chain of modeling and analysis steps. The modeling part of this chain is the essential basis both for the understanding of the considered deformation process and for the assessment of the uncertainty of the derived results and decisions. Hence, for a refined understanding it has to cover the knowledge about the generation as well as the complete processing and analysis of the data. In this section, the functional modeling is presented and discussed which comprises three main components: the physical modeling of the observations, the modeling of observations as basic quantities within a parameterized configuration or network, and the modeling of the relation between the configuration and the deformation process. The following sections concentrate on the corresponding modeling of uncertainty. In the modeling chain, five successive models can be distinguished; see Fig. 2. The first model (Model I) concentrates on the vector $\mathbf{z}$ of originally derived observations with $\dim(\mathbf{z}) = n_z$, which are given as a differentiable function $\mathbf{d}: \mathbb{R}^{n_s} \to \mathbb{R}^{n_z}$ of some basic quantities of influence $\mathbf{s}$ with $\dim(\mathbf{s}) = n_s$. In the context of geodetic deformation analysis, this model relates observations within a single epoch as it allows one to describe intra-epochal correlations according to the law of variance propagation for linear or linearized functions:

$$\mathbf{z} = \mathbf{d}(\mathbf{s}) \approx \underbrace{\mathbf{d}(\mathbf{s}_0)}_{=:\,\mathbf{d}_0} + \underbrace{\left.\frac{\partial \mathbf{d}}{\partial \mathbf{s}}\right|_{\mathbf{s}_0}}_{=:\,\mathbf{D}} \underbrace{(\mathbf{s} - \mathbf{s}_0)}_{=:\,\Delta\mathbf{s}} = \mathbf{d}_0 + \mathbf{D}\,\Delta\mathbf{s} \;\Rightarrow\; \boldsymbol{\Sigma}_{zz} = \mathbf{D}\,\boldsymbol{\Sigma}_{ss}\,\mathbf{D}^{T}. \tag{1}$$

Such correlations are induced through, e.g., particular atmospheric conditions. In addition, Model I allows the formulation of inter-epochal correlations due to, e.g., diurnal or seasonal variations. An analogous model which relates observed quantities and influence quantities is formulated for the estimation of variance and covariance components; see, e.g., Koch (1999) for the so-called BIQUE approach (Best Invariant Quadratic Unbiased Estimation). Hence, both theoretical and empirical variances and covariances can be derived using Model I.
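
The way in which a common influence induces correlations between observations according to Eq. (1) can be illustrated by a small numerical sketch in Python. The example is hypothetical and not part of the chapter: two observations are assumed to be affected by independent sensor noise and by one common atmospheric effect, and the numerical variances are chosen freely. The helper names `matmul` and `transpose` are illustrative only.

```python
"""Sketch of Eq. (1): variance propagation Sigma_zz = D Sigma_ss D^T."""


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


# Hypothetical influences s = (e1, e2, c): two independent sensor noises and
# one common atmospheric effect c that acts on both observations.
# Observations: z1 = e1 + c, z2 = e2 + c, hence the Jacobian D = dz/ds.
D = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

# Assumed variances of the influences (mm^2); the influences are uncorrelated.
sigma_ss = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.5]]

sigma_zz = matmul(matmul(D, sigma_ss), transpose(D))
correlation = sigma_zz[0][1] / (sigma_zz[0][0] * sigma_zz[1][1]) ** 0.5

print(sigma_zz)      # full vcm of the observations, including the covariance
print(correlation)   # correlation induced by the common influence
```

Although the influences themselves are uncorrelated in this sketch, the off-diagonal element of the resulting matrix is nonzero, which is the mechanism behind the intra-epochal correlations mentioned above.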

Fig. 2 Schematic representation of the functional model chain:
Model I (direct): $\mathbf{z} = \mathbf{d}(\mathbf{s})$; original observations as a function of the original influences
Model II (direct): $\mathbf{y} = \mathbf{g}(\mathbf{z}, \mathbf{p}, \mathbf{q})$; corrected and reduced observations obtained by corrections, reductions, and parametrization
Model III (direct): $\mathbf{l} = \mathbf{m}(\mathbf{y})$; modified observations obtained by transformations and linear combinations
Model IV (inverse): $\mathbf{l} = \mathbf{f}(\mathbf{x}, \mathbf{t})$; modified observations as a function of the object state / process state
Model V (inverse): $\mathbf{b}(\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(k)}) = \mathbf{h}(\mathbf{u})$; epoch object (or process) states as a function of the process parameters

The second model (Model II) uses the original observations $\mathbf{z}$ described in Model I as input quantities which have to be modified, as in practice they are not directly used for deformation analysis. Common modifications refer to correction and reduction models which compensate or eliminate physical effects such as atmospheric refraction or which relate the observations to a defined reference or parameter system, respectively. In a first formulation, Model II reads as $\mathbf{y} = \mathbf{g}(\mathbf{z}, \mathbf{p})$ with the differentiable function $\mathbf{g}: \mathbb{R}^{n_z} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_y}$ of corrections and reductions, the vector $\mathbf{y}$ of reduced and corrected observations with $\dim(\mathbf{y}) = n_y = n_z$, and the vector $\mathbf{p}$ of correction and reduction parameters with $\dim(\mathbf{p}) = n_p$. The formulation of Model II is fundamental both for the estimation or filtering methods and for uncertainty assessment and modeling. Several modeling strategies are possible regarding the knowledge about the parameters of the correction and reduction models as the actual value of a particular parameter can be either known or unknown. If it is derived from theory or determined by definition or convention, the value of the parameter is exactly known and repeatable throughout all epochs. Hence, it is directly used for the calculations. Even if the value is only known approximately, it can be used for the corrections and reductions. However, in this second case the associated uncertainty has to be taken into account depending on the experimental frame and the implied impact. If the knowledge about the parameter can be improved by estimation or filtering based on the observation configuration, it is considered as an additional parameter in the fourth model introduced below. This alternative is the only possible one if the value of the parameter is completely unknown. Note that concerning uncertainty modeling a distinction between deterministic and stochastic parameters is crucial. As a result of the previous discussion, the group of completely unknown parameters in Model II is compiled in the vector $\mathbf{q}$ for a better distinction. Hence, the remaining parameters $\mathbf{p}$ are reinterpreted as those which are approximately known. Consequently, Model II is reformulated as $\mathbf{y} = \mathbf{g}(\mathbf{z}, \mathbf{p}, \mathbf{q})$ with $\dim(\mathbf{q}) = n_q$ and $n_p$ – for the sake of simplicity – reinterpreted as the dimension of $\mathbf{p}$. The third model in the sequence (Model III), $\mathbf{l} = \mathbf{m}(\mathbf{y})$, is still directly related to the observations. It refers to all manipulations of the original data which are not already covered by the second model such as linear combinations of the corrected and reduced observations. Typical examples are the well-known single and double differences or the narrow-lane or wide-lane linear combinations calculated in GPS data processing. Hence, Model III is usually formulated as $\mathbf{l} = \mathbf{M}\mathbf{y}$. Note that
$\dim(\mathbf{l}) = n \le n_y$, which is the number of actually used observations. The function $\mathbf{m}: \mathbb{R}^{n_y} \to \mathbb{R}^{n}$ and the matrix $\mathbf{M} \in \mathbb{R}^{n \times n_y}$ are defined accordingly. The fourth model (Model IV), $\mathbf{l} = \mathbf{f}(\mathbf{x}, \mathbf{t})$, is an inverse model in contrast to the previous three models which are all direct in the sense that the arguments of the function are known; the function values are obtained by just inserting the argument values. Here, the modified observation values $\mathbf{l}$ are known according to the previous discussion and the parameters $\mathbf{x}$ are unknown. From the viewpoint of geodetic deformation analysis, this model represents the observation configuration by the functional relation between the observations $\mathbf{l}$ and the parameters $\mathbf{x}$ which – for the considered epoch – represent the state of the object and the deformation process, respectively. Further parameters, such as those described by the vector $\mathbf{q}$ or required by the configuration (e.g., orientation or scale parameters), are compiled in the vector $\mathbf{t}$. Note that $\mathbf{f}: \mathbb{R}^{u + n_t} \to \mathbb{R}^{n}$ with $\dim(\mathbf{x}) = u$ and $\dim(\mathbf{t}) = n_t$. Actually, Model IV represents the functional part of the well-known Gauss-Markov model (Koch 1999). Hence, it is the basis for the adjustment of the typically used geodetic networks. For the estimation or filtering of the parameters $\mathbf{x}$, additional constraints (or pseudo-observations) may be needed. In case of free networks, these constraints are required for regularization as they introduce the geodetic datum. In case of only weakly determined additional parameters $\mathbf{p}$ and $\mathbf{q}$, they help to improve (or stabilize) the estimation. Note that during this estimation part it is standard to perform outlier tests and variance component estimations for the observation groups. Up to now, all given models refer to a particular epoch only. Hence, the fifth model (Model V) finally relates the object (or process) states of different epochs. It can be formulated as:

$$\mathbf{b}\left(\begin{bmatrix} \mathbf{x}^{(1)} \\ \mathbf{x}^{(2)} \\ \vdots \\ \mathbf{x}^{(k)} \end{bmatrix}\right) = \mathbf{h}(\mathbf{u}), \qquad \dim(\mathbf{u}) = n_u, \qquad \dim\left(\mathbf{x}^{(i)}\right) = u^{(i)} \quad \forall\, i = 1, 2, \ldots, k. \tag{2}$$

The upper index (i) denotes the number of the respective epoch; the total number of epochs is k. The vector u denotes the parameters which are formulated for the actual description of the deformation process. Depending on the particular problem, the parameters u represent, e.g., coefficients of a similarity or affine transformation, polynomial coefficients of a kinematic model such as position, velocity or acceleration, or coefficients of a static model such as material constants.
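
For a kinematic model, Model V can be illustrated by a simple least-squares fit of a linear motion to the states of several epochs. The following Python sketch is an assumption-laden toy example and not the author's implementation: the state of each epoch is a single height value, the function b is taken as the identity on the stacked epoch states, all epochs receive equal weight (the epoch vcms are ignored), and the process parameters are u = (x0, v). The data and the function name `fit_linear_motion` are hypothetical.

```python
"""Sketch of Model V for a kinematic model: x(t) = x0 + v * t."""


def fit_linear_motion(times, states):
    """Least-squares estimate of u = (x0, v) from epoch states, equal weights."""
    k = len(times)
    st = sum(times)
    stt = sum(t * t for t in times)
    sx = sum(states)
    stx = sum(t * x for t, x in zip(times, states))
    det = k * stt - st * st           # determinant of the 2x2 normal matrix
    x0 = (stt * sx - st * stx) / det  # solution of the normal equations
    v = (k * stx - st * sx) / det
    return x0, v


# Hypothetical heights (m) of one control point in k = 4 epochs (t in years).
times = [0.0, 1.0, 2.0, 3.0]
heights = [10.000, 9.998, 9.996, 9.994]

x0, v = fit_linear_motion(times, heights)
print(f"x0 = {x0:.4f} m, v = {v * 1000:.2f} mm/year")
```

In a rigorous analysis the epoch states would enter with their full variance-covariance matrices, so that the normal equations are weighted accordingly.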

3.2 Observation Uncertainty and Its Impact In general, uncertainty is an essential component of quality description in data analysis and hence in geodetic deformation analysis. Quality measures such as reliability (i.e., the ability of a configuration to detect or even localize outliers in the observations), sensitivity (i.e., the ability of a configuration to detect a particular deformation), or specificity (i.e., the ability of a configuration to separate between different deformation models) strongly depend on a proper formulation of uncertainty. Uncertainty refers to average or maximum deviations between reality and the formulated models and the observed data, respectively. In order to quantify the present amount of uncertainty, adequate measures are needed. Although there are mathematical methods which can directly assess nonlinear models, such as Monte Carlo techniques (Saltelli et al. 2000; Koch 2007), the
following discussion uses a first-order sensitivity analysis which is based on a linearization of the relevant models. For this purpose, a joint model of Models II, III, and IV is considered:

$$\mathbf{m}(\mathbf{y}) = \mathbf{m}(\mathbf{g}(\mathbf{z}, \mathbf{p}, \mathbf{q})) = \mathbf{l} = \mathbf{f}(\mathbf{x}, \mathbf{t}). \tag{3}$$

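For the sake of clarity, Model I is not considered here as it merely allows one to refer the uncertainty of the original observations to a set of antecedent influence parameters. In addition, Model V is not included here; this will be done below. The linearization yields

$$\begin{aligned} &\mathbf{m}(\mathbf{g}(\mathbf{z}_0, \mathbf{p}_0, \mathbf{q}_0)) + \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{z}}\right)_0 (\mathbf{z} - \mathbf{z}_0) + \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{p}}\right)_0 (\mathbf{p} - \mathbf{p}_0) + \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{q}}\right)_0 (\mathbf{q} - \mathbf{q}_0) \\ &\quad \approx \mathbf{l} \approx \mathbf{f}(\mathbf{x}_0, \mathbf{t}_0) + \left(\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\right)_0 (\mathbf{x} - \mathbf{x}_0) + \left(\frac{\partial \mathbf{f}}{\partial \mathbf{t}}\right)_0 (\mathbf{t} - \mathbf{t}_0) \end{aligned} \tag{4}$$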

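for suitable approximate values of the vectors. The subindex “0” indicates an approximate value. This representation can be simplified as

$$\mathbf{l} - \mathbf{f}(\mathbf{x}_0, \mathbf{t}_0) \approx \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{z}}\right)_0 \Delta\mathbf{z} + \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{p}}\right)_0 \Delta\mathbf{p} + \left(\frac{\partial \mathbf{m}}{\partial \mathbf{g}} \frac{\partial \mathbf{g}}{\partial \mathbf{q}}\right)_0 \Delta\mathbf{q} \approx \left(\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\right)_0 \Delta\mathbf{x} + \left(\frac{\partial \mathbf{f}}{\partial \mathbf{t}}\right)_0 \Delta\mathbf{t}, \tag{5}$$

with $\mathbf{m}(\mathbf{g}(\mathbf{z}_0, \mathbf{p}_0, \mathbf{q}_0)) = \mathbf{f}(\mathbf{x}_0, \mathbf{t}_0)$, $\Delta\mathbf{z} := \mathbf{z} - \mathbf{z}_0$, $\Delta\mathbf{p} := \mathbf{p} - \mathbf{p}_0$, $\Delta\mathbf{q} := \mathbf{q} - \mathbf{q}_0$, $\Delta\mathbf{x} := \mathbf{x} - \mathbf{x}_0$, $\Delta\mathbf{t} := \mathbf{t} - \mathbf{t}_0$.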

A more compact formulation is Gz z C Gp p  Fxx C Ft t C   Fq q; with Gz WD   ; Fx WD ıf ıt 0

ım ıg ıg ız 0

;

Fq WD 

Gp WD  

ım ıg ıg ıq 0

ım ıg ıg ıp 0

;

Fx WD

 ıf 

ıx 0

;

(6)

As they are subject to estimation, the unknown parameters $q$ or $\Delta q$, respectively, are assigned to the configuration part according to Model IV which is given on the right-hand side of the "$\approx$" sign. Now, the relevant types of uncertainty are introduced. It is reasonable to assess the original observations $z$ as purely random. However, there are systematic deviations between the observations $z$ and $l$. These deviations are significantly mitigated as indicated with Models II and III by applying correction and reduction models, by functional or algebraic operations, or by modeling of additional parameters. This is indicated by $m(g(z_0, p_0, q_0)) = f(x_0, t_0)$, which means that on the basis of good approximate values the observations and the configuration are consistent. For several reasons systematic deviations remain although their magnitude is rather small (Schön and Kutterer 2006). Hence, the discussion has to concentrate on the actual uncertainty of the observations which is crucial for the interpretation of the results of a geodetic deformation analysis. In the following, stochastic parameters (or quantities) are indicated by an underscore for a better understanding. Note that in Model III the vector $p$ has been introduced as approximately known whereas the vector $q$ has been considered as completely unknown. Thus, the vector $q$ comprises deterministic parameters only, whereas $p$ has to be separated into a deterministic part $p_d$ and a stochastic part $\underline{p}_s$. This yields the random reduced observation vector

$$\underline{l} - f(x_0, t_0) \approx G_z \underline{\Delta z} + G_p \Delta p = G_z \underline{\Delta z} + G_p^d \Delta p_d + G_p^s \underline{\Delta p}_s =: \underline{\Delta l}. \tag{7}$$

Based on the previous discussion, it is clear that the indicated deterministic and stochastic components are both present in a typical geodetic deformation analysis. The expectation vector of $\underline{l}$ is given as

$$E(\underline{l}) - f(x_0, t_0) \approx E(\underline{\Delta l}) = G_z E(\underline{\Delta z}) + G_p^d E(\Delta p_d) + G_p^s E(\underline{\Delta p}_s) = G_p^d \Delta p_d \tag{8}$$

since it is reasonable to assume $E(\underline{\Delta z}) = 0$ and $E(\underline{\Delta p}_s) = 0$ in case of good prior knowledge and approximate values, respectively. Accordingly, the variance-covariance matrix is given as

$$\Sigma_{ll} = D(\underline{l}) \approx G_z \Sigma_{zz} G_z^T + G_p^s \Sigma_{p_s p_s} (G_p^s)^T = \Sigma_{\Delta l \Delta l}. \tag{9}$$

It can be seen that $\Sigma_{ll}$ describes only the stochastic part of uncertainty as the deterministic part is contained in the expectation vector. In contrast to other studies, this representation of the reduced observation vector comprises two important features: the consideration of correlations between the observations due to the modeled basic quantities of influence and due to identical parameters in correction and reduction steps, as well as uncertain deterministic parameters. If modeling and analysis are performed in a Bayesian framework all parameters are considered as stochastic (Koch 2007). For a deterministic parameter, it has to be assumed that the limited knowledge introduces randomness. Therefore, prior distributions are formulated which are required for the application of Bayes' theorem. The Bayesian approach is not considered in the following as there are several arguments not to use random variables for the description of the uncertainty of deterministic parameters. The most important one can easily be illustrated by an example: Independent of using classical or Bayesian statistics, the uncertainty of a random variable is understood as its dispersion which is described by the (theoretical or empirical) standard deviation or variance, respectively. In case of a sample mean, the variance of the mean value is derived from the variances of the sample values by means of variance propagation. It is well known that in case of independent and identically distributed sample values, the standard deviation of the mean is $\sigma/\sqrt{n}$ with $n$ the sample size and $\sigma$ the standard deviation of one sample value. For this reason it is obvious that in a stochastic framework, uncertainty is reduced by averaging repeated observations. However, this contradicts, e.g., the idea of an unknown additive bias which is often an inherent parameter in geodetic observation models: a bias common to all observations is not reduced by averaging.
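The contrast between random dispersion and an unknown additive bias can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the function names and the numbers (a standard deviation of 2.0 and a bias bound of 0.5) are hypothetical, and the bias is assumed to be identical for all sample values.

```python
import math

def std_of_mean(sigma: float, n: int) -> float:
    """Standard deviation of the mean of n i.i.d. values with std sigma."""
    return sigma / math.sqrt(n)

def bias_radius_of_mean(bias_radius: float, n: int) -> float:
    """Worst-case bound of the bias of the mean when every sample value
    carries the same unknown additive bias b with |b| <= bias_radius.
    The mean of n copies of b is b, so the bound does not shrink."""
    return sum(bias_radius for _ in range(n)) / n

for n in (1, 4, 100):
    # Random part shrinks with sqrt(n); the deterministic bound stays the same.
    print(n, std_of_mean(2.0, n), bias_radius_of_mean(0.5, n))
```

The random part decreases with the square root of the sample size, while the bound of the common bias is unaffected by averaging.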
It is typical for the applications that the numerical value of the uncertainty of a certain parameter is defined by an experienced person (in other words, an expert) who gives a possible range of values which is indicated either by a lower and an upper bound or by the mean value and the radius or diameter of the range. A close but principally different concept in metrology refers to tolerances. In both cases, the bounds of such a range indicate worst cases. Regarding the mathematical background, it is meaningful to use interval mathematics or interval arithmetic, respectively, for modeling such ranges of values as intervals. Note that in such a concept the actual value of the parameter remains unknown. Without reference to any probability distribution, all values within a given interval are considered possible to the same degree. If several experts assess such a range of values it is clear that the given regions may be partially supporting and partially contradictory. Then, different degrees of possibility can be given, ranging from the minimum degree 0 (i.e., no support by any expert) to 1 (i.e., maximum support by all experts). Also in this case, a set-theoretical approach is meaningful. In contrast to pure interval mathematics, different degrees can be modeled and treated in fuzzy theory and fuzzy data analysis, respectively. In the following, this deterministic part of uncertainty is called imprecision. Note that the approach using interval or fuzzy quantities relies on a significantly reduced amount of information in comparison with the stochastic approach. This often reflects the situation in the applications in a better way.

3.3 Uncertainty Treatment Using Deterministic Sets Regarding geodetic observations and geodetic deformation analysis, the possible range of the value of a single parameter $p$ can be defined by a closed interval $[p]$ as a compact set of real numbers. Typically, such an interval is given either by its lower bound $p_l$ and its upper bound $p_u$ in terms of $[p] = [p_l, p_u]$, $p_l \le p_u$, or by its midpoint $p_m$ and its radius (or spread) $p_s$ in terms of $[p] = \langle p_m, p_s \rangle$, $p_s \ge 0$, with $p_m = (p_u + p_l)/2$ and $p_s = (p_u - p_l)/2$. Regarding the discussion in Sect. 3.2, the midpoint of an interval reflects some average knowledge about the parameter value whereas its radius directly refers to the amount of deterministic uncertainty. If $p_l = p_u$ or $p_s = 0$, the real interval simplifies to a real number which is then considered as precise. For intervals, several set-theoretical and arithmetic rules as well as functions can be defined or extended from the set of real numbers $\mathbb{R}$ to the set of real intervals $\mathbb{IR}$; see, e.g., Alefeld and Herzberger (1983) or Jaulin et al. (2001). Due to the availability of textbooks on interval mathematics, the limited space in this chapter, and the focus on geodetic deformation analysis, the relevant mathematical definitions and methods are summarized only very briefly. For the following discussion mainly two arithmetic rules are of interest. The first rule introduces the addition of two intervals:

$$[a] + [b] = [a_l + b_l, a_u + b_u] = \langle a_m + b_m, a_s + b_s \rangle, \quad [a], [b] \in \mathbb{IR}. \tag{10}$$

It can be seen that the two radii are directly added. Regarding their interpretation as measures of uncertainty (more precisely, of imprecision) this reflects a linear propagation of uncertainty in contrast to the quadratic propagation according to the well-known law of error (or variance) propagation. The second rule introduces the multiplication of a real scalar and a real interval:

$$c[a] = \langle c\, a_m, |c|\, a_s \rangle = \begin{cases} [c\, a_l, c\, a_u], & c \ge 0 \\ [c\, a_u, c\, a_l], & c < 0 \end{cases} \quad \text{with } [a] \in \mathbb{IR},\ c \in \mathbb{R}. \tag{11}$$

Obviously, the midpoint-radius representation is compact whereas the representation by lower and upper bounds distinguishes two cases. This is related to the nonnegativity of the radius of a real interval. As an example for applying both rules, the difference of two intervals yields $[a] - [a] = \langle 0, 2a_s \rangle$. Note that when two intervals are combined, the interval-mathematical rule considers them as independent although identical symbols are used. From the general viewpoint of applications (and in particular regarding geodetic deformation analysis) this property is unsatisfactory; it shows the need for a refined understanding and modeling as well as analysis of data when interval mathematics is applied. Such a procedure is available as will be shown below. It is easy to see that the addition can be extended to more than two intervals. Together with the second rule this allows the evaluation of expressions such as $\sum_{i=1}^{n} c_i [a_i] = \left\langle \sum_{i=1}^{n} c_i a_{i,m}, \sum_{i=1}^{n} |c_i| a_{i,s} \right\rangle$. If a matrix-vector notation is used, this rule can be written as

$$[d] = \langle d_m, d_s \rangle = c^T [a] = \langle c^T a_m, |c|^T a_s \rangle \quad \text{with } [a] = \begin{pmatrix} [a_1] \\ [a_2] \\ \vdots \\ [a_n] \end{pmatrix} \in \mathbb{IR}^n \text{ and } |c| = \begin{pmatrix} |c_1| \\ |c_2| \\ \vdots \\ |c_n| \end{pmatrix}. \tag{12}$$
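The rules of Eqs. 10 to 12 can be sketched in a few lines of code. The class name `Interval`, its fields, and the numerical values are assumptions made for this illustration only; the sketch uses the midpoint-radius representation throughout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    mid: float  # midpoint a_m
    rad: float  # radius a_s >= 0

    def __add__(self, other: "Interval") -> "Interval":
        # Eq. 10: midpoints and radii are both added.
        return Interval(self.mid + other.mid, self.rad + other.rad)

    def scale(self, c: float) -> "Interval":
        # Eq. 11: c[a] = <c a_m, |c| a_s>.
        return Interval(c * self.mid, abs(c) * self.rad)

    def __sub__(self, other: "Interval") -> "Interval":
        # Difference as sum with the interval scaled by -1.
        return self + other.scale(-1.0)

    def bounds(self):
        # Lower and upper bound of the interval.
        return (self.mid - self.rad, self.mid + self.rad)

def linear_combination(c, a):
    # Eq. 12: c^T [a] = <c^T a_m, |c|^T a_s>.
    total = Interval(0.0, 0.0)
    for ci, ai in zip(c, a):
        total = total + ai.scale(ci)
    return total

a = Interval(10.0, 0.5)
b = Interval(4.0, 0.25)
print(a + b)                                    # radii add linearly
print(a - a)                                    # not zero: radius 2 * a_s
print(linear_combination([1.0, -2.0], [a, b]))
```

The difference `a - a` has radius `2 * a.rad` because the two operands are treated as independent, which is exactly the dependency effect described above.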

The transpose is used as for real vectors or matrices, respectively. The space of n-dimensional real interval vectors is denoted by $\mathbb{IR}^n$. Note that this "interval scalar multiplication" is exact as it yields the actual range of values. In geodetic deformation analysis, the multiplication of a real matrix and an interval vector is required. According to the previous discussion this yields

$$y = Fx, \quad F \in \mathbb{R}^{m \times n}, \quad x \in [x] \in \mathbb{IR}^n \;\Rightarrow\; [y] := F[x] = \langle F x_m, |F| x_s \rangle.$$

Note that the set of real intervals $\mathbb{IR}$ is not closed with respect to this multiplication since the actual range of values $\{y\} = \{y \mid y = Fx,\ x \in [x]\}$ is a general convex polyhedron (or zonotope) within $\mathbb{R}^m$ with $\{y\} \subseteq [y] = F[x]$. However, this multiplication yields the tightest inclusion of $\{y\}$ by means of an interval vector as it is exact component by component. Depending on the definition of the parameter (or coordinate) system the shape and size of this interval vector may change (Schön and Kutterer 2005). Now the example of calculating the difference $[d] = [a] - [a]$ can be restudied. In the context of uncertain deterministic quantities, the interpretation of this difference is as follows: a deterministic parameter $a$ is considered whose value is uncertain. If, e.g., two observed distances are both biased by this additional parameter $a$ according to $s_1 = l_1 + a$ and $s_2 = l_2 + a$, the difference is independent of this bias: $s_2 - s_1 = l_2 - l_1$. If interval notation is used this problem can be reformulated as

$$\underbrace{\begin{pmatrix} s_1 \\ s_2 \end{pmatrix}}_{=:s} = \underbrace{\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}}_{=:F} \underbrace{\begin{pmatrix} l_1 \\ l_2 \\ a \end{pmatrix}}_{=:x} \;\Rightarrow\; d = s_2 - s_1 = \underbrace{\begin{pmatrix} -1 & 1 \end{pmatrix}}_{=:M} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} l_1 \\ l_2 \\ a \end{pmatrix} = \underbrace{\begin{pmatrix} -1 & 1 & 0 \end{pmatrix}}_{=MF} \begin{pmatrix} l_1 \\ l_2 \\ a \end{pmatrix}$$

with the matrix $F$ defining the functional relations according to Model II and the matrix $M$ describing a difference operator according to Model III. This yields

$$[d] = (MF)[x] = \begin{pmatrix} -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} \langle l_1, 0 \rangle \\ \langle l_2, 0 \rangle \\ \langle a_m, a_s \rangle \end{pmatrix} = \langle l_2 - l_1, 0 \rangle = l_2 - l_1$$

which, as desired, is free from the parameter $a$ and its uncertainty. Note that the subdistributivity property holds in general: $[y] = (MF)[x] \subseteq M(F[x])$. This is the central mathematical reason for the thorough presentation and discussion of the modeling chain in geodetic deformation analysis given in Sect. 3.1. If it is possible to refer the derived parameters to independent original quantities of influence, the tightest interval inclusion is always obtained. From the viewpoint of applications, an interval implies a sharp separation between the consideration of a particular value of a parameter and its neglect. This problem is overcome if the discussion is based on fuzzy theory. Fuzzy theory is a broad field which cannot be introduced comprehensively within this chapter. However, for the further understanding only a few basics are required. For this reason a brief but self-contained overview is given. For more detailed information the reader is referred to standard literature such as Dubois and Prade (1980) or Bandemer and Näther (1992). Fuzzy sets are the fundamental quantities in fuzzy theory. In general, a fuzzy set is defined as a set of ordered pairs according to

$$\tilde{A} = \{(x, m_{\tilde{A}}(x)) \mid x \in X\}, \quad m_{\tilde{A}}: X \to [0, 1] \tag{13}$$

with $X$ a classical set and $m_{\tilde{A}}$ the so-called membership or characteristic function. A fuzzy set $\tilde{A}$ is an extension of a classical set $A$ since a respective formulation is

$$A = \{(x, m_A(x)) \mid x \in X\}, \quad m_A: X \to \{0, 1\}. \tag{14}$$

In this case $m_A$ is called the indicator function as $m_A(x) = 1$ if a particular value $x$ belongs to the set $A$, otherwise $m_A(x) = 0$. Hence, the use of a fuzzy set allows a (discrete or continuous) transition between these two extreme values. In other words, a particular value $x$ may only partially belong to (or be assigned to) the set $A$. Some important notions or operations, respectively, are the core, the support, and the α-cut of a fuzzy set. The core of a fuzzy set is the classical set

$$\mathrm{core}(\tilde{A}) = \{x \in X \mid m_{\tilde{A}}(x) = 1\}. \tag{15}$$

The support of a fuzzy set is the classical set

$$\mathrm{supp}(\tilde{A}) := \{x \in X \mid m_{\tilde{A}}(x) > 0\}. \tag{16}$$

An α-cut of a fuzzy set is defined as

$$\tilde{A}_\alpha = \{x \in X \mid m_{\tilde{A}}(x) \ge \alpha\}. \tag{17}$$

A fuzzy set is uniquely represented by the set of its α-cuts. With respect to the discussion of intervals and regarding the use in practice, a fuzzy set is assumed to express degrees of possibility (or certainty) for certain values or ranges. These degrees reflect experience which is based on a reasonable number of experiments or simply on experts' opinions. The construction of a fuzzy set is treated by the theory of nested sets which allows one to assess overlaps and contradictions; see, e.g., Nguyen and Kreinovich (1996).
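The definitions of core, support, and α-cut in Eqs. 15 to 17 can be illustrated with a fuzzy set over a small finite set $X$. This is a minimal sketch: the dictionary representation, the function names, and the membership degrees are assumptions chosen for illustration.

```python
def core(fuzzy_set):
    """Eq. 15: elements with membership degree exactly 1."""
    return {x for x, m in fuzzy_set.items() if m == 1.0}

def support(fuzzy_set):
    """Eq. 16: elements with membership degree greater than 0."""
    return {x for x, m in fuzzy_set.items() if m > 0.0}

def alpha_cut(fuzzy_set, alpha):
    """Eq. 17: elements with membership degree of at least alpha."""
    return {x for x, m in fuzzy_set.items() if m >= alpha}

# Hypothetical fuzzy set: element -> membership degree.
fuzzy = {1: 0.0, 2: 0.5, 3: 1.0, 4: 1.0, 5: 0.25, 6: 0.0}

print(core(fuzzy))
print(support(fuzzy))
print(alpha_cut(fuzzy, 0.5))
```

The α-cuts are nested: a cut at a higher level is always contained in a cut at a lower level, which is why the family of all α-cuts represents the fuzzy set.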


Fig. 3 LR-fuzzy interval (left) and L-fuzzy number (right) with respective parameters [Figure: membership functions m(x) over x, showing the core, the reference functions L and R, an α-cut, the midpoint x_m, the core radius x_r, and the spreads]

For the analysis of geodetic data and hence for geodetic deformation analysis the concepts of fuzzy numbers and fuzzy intervals as well as of fuzzy vectors are essential since these quantities are needed to mathematically describe imprecision. The respective definitions are given in the following. Afterwards some relevant set-theoretical and arithmetic operations are introduced. Finally, the relation between interval mathematics and fuzzy theory in the context of fuzzy data analysis is given. A fuzzy number is a fuzzy set with a single-element core and $X = \mathbb{R}$. A fuzzy interval is a fuzzy set with a compact real interval as core and $X = \mathbb{R}$. Consequently, the α-cuts of a fuzzy number and a fuzzy interval, respectively, are real intervals. Obviously, each fuzzy number is a fuzzy interval. In the applications, the so-called LR-fuzzy numbers and LR-fuzzy intervals are widely used because they offer a relevant interpretation and convenient mathematical properties. An LR-fuzzy interval is constructed by means of a classical interval as core as well as left and right reference functions $L, R: \mathbb{R} \to [0, 1]$, respectively, which are monotonically decreasing (see Fig. 3 for a graphical representation of an LR-fuzzy number and an LR-fuzzy interval). LR-fuzzy intervals are typically denoted by $\tilde{x} = (x_m, x_r, x_L, x_R)_{LR}$ with the midpoint of the core $x_m$, the radius of the core $x_r$, and the spreads $x_L$ and $x_R$, respectively, which are assigned to the reference functions. In this general case, the three parameters $x_r$, $x_L$, and $x_R$ contain information on the imprecision. If $L = R$, LL-fuzzy intervals are obtained. If $L = R$ and $x_L = x_R$, L-fuzzy intervals are obtained. Finally, if additionally $x_r = 0$ this yields L-fuzzy numbers $\tilde{x} = (x_m, x_s)_L$ which are well suited for many applications. For fuzzy sets in general (and thus for fuzzy intervals and fuzzy numbers) set-theoretical operations can be extended consistently.
However, it should be noted that there are typically several possible extensions. The intersection of two fuzzy sets $\tilde{A}$ and $\tilde{B}$ can be defined based on their membership functions as

$$m_{\tilde{A} \cap \tilde{B}}(x) = \min(m_{\tilde{A}}(x), m_{\tilde{B}}(x)) \quad \forall x \in \mathbb{R}. \tag{18}$$

The union can accordingly be defined as

$$m_{\tilde{A} \cup \tilde{B}}(x) = \max(m_{\tilde{A}}(x), m_{\tilde{B}}(x)) \quad \forall x \in \mathbb{R}. \tag{19}$$

Other possible definitions are available within the general concept of t-norms and s-conorms (Bandemer and Näther 1992). The complement $\tilde{A}^c$ is also defined through the membership function as

$$m_{\tilde{A}^c}(x) = 1 - m_{\tilde{A}}(x). \tag{20}$$
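The set operations on membership functions can be sketched as follows. The sketch assumes triangular membership functions, i.e., L-fuzzy numbers with the linear reference function $L(u) = \max(0, 1 - u)$, and it uses the minimum for the intersection and the maximum for the union, which is the standard pairing; all names and numbers are illustrative.

```python
def triangular(mid, spread):
    """Membership function of an L-fuzzy number (mid, spread)_L with the
    assumed linear reference function L(u) = max(0, 1 - u)."""
    def m(x):
        return max(0.0, 1.0 - abs(x - mid) / spread)
    return m

def intersection(m_a, m_b):
    # Minimum of the two membership degrees.
    return lambda x: min(m_a(x), m_b(x))

def union(m_a, m_b):
    # Maximum of the two membership degrees.
    return lambda x: max(m_a(x), m_b(x))

def complement(m_a):
    # One minus the membership degree.
    return lambda x: 1.0 - m_a(x)

m_a = triangular(0.0, 2.0)
m_b = triangular(1.0, 2.0)
x = 1.5
print(m_a(x), m_b(x))
print(intersection(m_a, m_b)(x), union(m_a, m_b)(x), complement(m_a)(x))
```

Replacing `min` and `max` by another t-norm and s-conorm pair would give one of the alternative extensions mentioned above.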


For fuzzy numbers and fuzzy intervals extensions of real functions can be defined based on the extension principle by Zadeh according to

$$\tilde{y} = \tilde{g}(\tilde{x}_1, \ldots, \tilde{x}_n) \;:\Leftrightarrow\; m_{\tilde{y}}(y) = \sup_{\substack{(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \\ y = g(x_1, x_2, \ldots, x_n)}} \min(m_{\tilde{x}_1}(x_1), \ldots, m_{\tilde{x}_n}(x_n)) \quad \forall y \in \mathbb{R}. \tag{21}$$

The choice of the min-operator yields for L-fuzzy numbers the following rules:

$$\tilde{x} + \tilde{y} = (x_m + y_m, x_s + y_s)_L \tag{22}$$

and

$$a\tilde{x} = (a\, x_m, |a|\, x_s)_L \tag{23}$$

which are obviously formally identical with the ones derived for real intervals; see Eqs. 10 and 11. Hence, keeping in mind the analogy of real intervals and the α-cuts of fuzzy intervals, the discussion of fuzzy quantities can be replaced by the discussion of real intervals according to the previously presented equations. However, this is only valid if the min-operator is used in the extension principle. Actually, any other t-norm could be used but would lead to different arithmetic rules. In contrast to the rules described above this would allow other ways of imprecision propagation such as, e.g., a quadratic one. As this is a topic of its own it will be treated elsewhere. Note that in general each t-norm allows one to define a fuzzy vector. Hence, a fuzzy vector based on the min-norm reads as

$$m_{\tilde{x}}(x) = \min(m_{\tilde{x}_1}(x_1), m_{\tilde{x}_2}(x_2), \ldots, m_{\tilde{x}_n}(x_n)) \quad \forall (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n. \tag{24}$$

By the extension principle, it is also possible to extend functions other than the linear ones presented so far, such as, e.g., quadratic forms of a fuzzy vector. This is essential for the use of statistical hypothesis tests which are of particular importance in geodetic deformation analysis. Note that the overestimation mentioned in the context of real intervals is also present in case of fuzzy intervals. It can only be avoided if the discussion is strictly referred to the basic, original quantities of influence. Then, scalar functions can be evaluated exactly if the extension principle is applied. The key method in the applications is the so-called α-cut optimization (Möller and Beer 2004). There, the extension principle is interpreted as an optimization problem which can be solved numerically α-cut by α-cut. If the discretization of the fuzzy intervals by α-cuts is sufficiently dense with respect to the properties of the reference functions, the approximation of the true fuzzy solution is sufficiently close.
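For linear functions the α-cut-wise evaluation reduces to interval arithmetic, so the overestimation effect can be reproduced directly. The following sketch revisits the difference of two distances that share a bias $a$. It assumes triangular L-fuzzy numbers (linear reference function), and the helper names and numerical values are hypothetical. The evaluation referred to the original quantities uses the row $(-1, 1, 0)$, whereas the stepwise evaluation first forms the intervals of $s_1$ and $s_2$.

```python
def alpha_cut(mid, spread, alpha):
    """alpha-cut <mid, rad> of a triangular L-fuzzy number (mid, spread)_L,
    assuming the linear reference function L(u) = max(0, 1 - u)."""
    return (mid, (1.0 - alpha) * spread)

def row_times_intervals(row, intervals):
    """<c^T x_m, |c|^T x_s> for one matrix row and (mid, rad) intervals."""
    mid = sum(c * m for c, (m, _) in zip(row, intervals))
    rad = sum(abs(c) * r for c, (_, r) in zip(row, intervals))
    return (mid, rad)

def mat_times_intervals(mat, intervals):
    return [row_times_intervals(row, intervals) for row in mat]

F = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # s = F x with x = (l1, l2, a)
M = [[-1.0, 1.0]]                        # d = M s
MF = [[-1.0, 1.0, 0.0]]                  # product M F formed with real numbers

# (mid, spread): l1 and l2 are taken as precise, the bias a as imprecise.
FUZZY_X = [(100.0, 0.0), (103.0, 0.0), (0.0, 0.5)]

def difference_cuts(alpha):
    """alpha-cut of d: referred to x directly, and evaluated stepwise."""
    cuts = [alpha_cut(m, s, alpha) for m, s in FUZZY_X]
    direct = mat_times_intervals(MF, cuts)[0]
    stepwise = mat_times_intervals(M, mat_times_intervals(F, cuts))[0]
    return direct, stepwise

for alpha in (0.0, 0.5, 1.0):
    print(alpha, *difference_cuts(alpha))
```

At every level below 1 the stepwise result carries a spurious radius, while the evaluation referred to the original quantities of influence is free of the bias.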

3.4 Uncertainty Assessment in Geodetic Deformation Analysis Geodetic deformation analysis relies on models and data as well as on particular analysis methods. As motivated in Sect. 2 there are many sources of uncertainty. Even a careful formulation of the required models and a thorough preprocessing of the data leave unavoidable remaining errors; the induced uncertainty has to be taken into account during the analysis steps. Moreover, it has to be considered in detail in the interpretation step which is an essential basis for the required decisions. Hence, meaningful extensions of all analysis procedures are needed regarding the two types of uncertainty (random variability and imprecision) which are highlighted throughout this chapter. Without doubt random variability can be reduced by averaging redundant observations which are either obtained by repetition or within an overdetermined geodetic configuration. Imprecision can principally be mitigated by improved models. As both ways are limited in practice, the consequences for the analysis techniques are as follows. Random variability is considered as the immanent type of uncertainty which is more fundamental than imprecision. All existing techniques in geodetic deformation analysis are designed for this type of uncertainty. Moreover, they are typically derived as optimal. Imprecision is considered as induced through remaining systematic deviations between the preprocessed data and the models which could not be completely eliminated. In particular, Model IV and Model V refer directly to the geodetic monitoring and deformation analysis. Hence imprecision superposes random variability, which is described by the so-called fuzzy randomness (Möller and Beer 2004). This is clear when the reformulated Model III according to Eq. 7

$$\underline{\Delta l} = G_z \underline{\Delta z} + G_p^s \underline{\Delta p}_s + G_p^d \Delta p_d \tag{25}$$

is considered which contains the approximately known deterministic parameters $p_d$. This third term is a deterministic bias which is added to the originally only random observation vector. All well-known techniques of estimation, filtering, and testing that already refer to random variability can be used further if they are adapted to interval or fuzzy data by means of interval arithmetic and the extension principle, respectively. The interval-mathematical extension of least-squares estimation was studied by, e.g., Schön (2003) with emphasis on geodetic networks. In this thesis 2D measures and 3D measures of imprecision (i.e., zonotopes) were developed which indicate the true range of values.
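The superposition of the two types of uncertainty in Eq. 25 can be sketched for a single reduced observation. This is an illustration under simplifying assumptions: the random inputs are taken as uncorrelated (diagonal variance-covariance matrices), the deterministic parameters are described by interval radii, and all names and numbers are hypothetical.

```python
import math

def random_std(g_z, var_z, g_s, var_ps):
    """Standard deviation of one reduced observation by variance propagation,
    assuming uncorrelated inputs (diagonal variance-covariance matrices)."""
    variance = sum(g * g * v for g, v in zip(g_z, var_z))
    variance += sum(g * g * v for g, v in zip(g_s, var_ps))
    return math.sqrt(variance)

def imprecision_radius(g_d, rad_pd):
    """Interval radius of the deterministic term: |g_d|^T times the radii
    of the approximately known parameters p_d (linear propagation)."""
    return sum(abs(g) * r for g, r in zip(g_d, rad_pd))

# One row of G_z, G_p^s, and G_p^d with illustrative values.
sigma = random_std([1.0, -1.0], [4.0, 4.0], [2.0], [0.25])
radius = imprecision_radius([1.0, -0.5], [0.2, 0.4])
print(sigma, radius)
```

The random part is propagated quadratically and the imprecise part linearly; the two measures are kept separate instead of being merged into a single variance.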

4 Fundamental Results The consequences of the formulation of an extended uncertainty budget and its consideration in the models and analysis steps are manifold. In the first place, uncertainty measures are provided which are more meaningful (and more honest) with respect to the interpretation of the results. Due to the limited space, the presentation of the fundamental results is restricted to the significance tests of the deformation models which are performed within a geodetic deformation analysis. These tests are based on the general linear hypothesis. In the following, the relevant hypotheses and the test are briefly reviewed. Then they are extended with respect to imprecise test values. Note that this allows one to adapt the sensitivity of the test decisions. Model V according to Eq. 2 shows epoch by epoch the relation between the object or process states and the formulated deformation model. It is the basis for the joint analysis of the results of the single epochs. Typically, it is already linear as

$$Bx = B \begin{pmatrix} x^{(1)} \\ x^{(2)} \\ \vdots \\ x^{(k)} \end{pmatrix} = Hu. \tag{26}$$

The expression on the left-hand side of Eq. 26 enables the comparison of the states of different epochs as it can represent, e.g., a difference operation such as

$$Bx = \begin{pmatrix} -I & +I & 0 & 0 & \cdots & 0 \\ -I & 0 & +I & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -I & 0 & 0 & 0 & \cdots & +I \end{pmatrix} \begin{pmatrix} x^{(1)} \\ x^{(2)} \\ \vdots \\ x^{(k)} \end{pmatrix}. \tag{27}$$

Instead, the direct processing of the object states is possible as well. In this case $B = I$. The right-hand side of Eq. 26 represents the deformation model, i.e., the geometric-physical explanation of the differences as indicated in Sect. 2. The parameters $u$ are defined according to the considered deformation model such as single point movements, a rigid body movement, or affine strain. Note that Eq. 26 represents a Gauss-Helmert model (i.e., condition equations with unknown parameters). The relative weighting of the states is controlled by the joint variance-covariance matrix (vcm)

$$\Sigma_{\hat{x}\hat{x}} = \begin{pmatrix} \Sigma_{\hat{x}\hat{x}}^{(1)} & & & \\ & \Sigma_{\hat{x}\hat{x}}^{(2)} & & \\ & & \ddots & \\ & & & \Sigma_{\hat{x}\hat{x}}^{(k)} \end{pmatrix}. \tag{28}$$
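The difference operator of Eq. 27 and the block-diagonal joint vcm of Eq. 28 can be assembled as in the following sketch. For brevity it assumes scalar object states per epoch in the difference operator (so that each identity block $I$ is the number 1); the function names and the numbers are hypothetical.

```python
def difference_operator(k):
    """Eq. 27 for scalar states: row i compares epoch i + 1 with epoch 1."""
    rows = []
    for i in range(1, k):
        row = [0.0] * k
        row[0] = -1.0
        row[i] = 1.0
        rows.append(row)
    return rows

def block_diagonal(blocks):
    """Eq. 28: joint matrix with the epoch-wise vcm blocks on the diagonal."""
    n = sum(len(b) for b in blocks)
    joint = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, value in enumerate(row):
                joint[offset + r][offset + c] = value
        offset += len(b)
    return joint

def matvec(a, x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

B = difference_operator(3)
states = [10.0, 10.5, 9.0]        # one scalar state per epoch
print(matvec(B, states))          # epoch differences with respect to epoch 1
print(block_diagonal([[[1.0]], [[2.0, 0.5], [0.5, 3.0]]]))
```

No correlations between epochs are modeled in this structure, which corresponds to the empty off-diagonal blocks of Eq. 28.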

The values of the parameters $u$ are estimated by means of a least-squares adjustment. This yields the vector $\hat{u}$ and the associated $\Sigma_{\hat{u}\hat{u}}$. The interpretation of the estimated parameters regarding the required decisions and actions, respectively, is based on statistical hypothesis testing. In case of epoch differences of object states the null hypothesis reads as

$$H_0: E(B\hat{x}) = Hu = 0 \tag{29}$$

and the alternative hypothesis reads as

$$H_A: E(B\hat{x}) = Hu \ne 0. \tag{30}$$

If for a simpler representation the discrepancy vector $w = Bx$ is introduced with $\hat{w} = B\hat{x}$ and $\Sigma_{\hat{w}\hat{w}} = B \Sigma_{\hat{x}\hat{x}} B^T$, in case of a full column-rank matrix $H$ the vector $\hat{u}$ is estimated as

$$\hat{u} = (H^T \Sigma_{\hat{w}\hat{w}}^{-1} H)^{-1} H^T \Sigma_{\hat{w}\hat{w}}^{-1} \hat{w} \tag{31}$$

with

$$\Sigma_{\hat{u}\hat{u}} = (H^T \Sigma_{\hat{w}\hat{w}}^{-1} H)^{-1}. \tag{32}$$

An adequate test statistic is given by the quadratic form

$$T = \hat{u}^T \Sigma_{\hat{u}\hat{u}}^{-1} \hat{u} \tag{33}$$

since the probability distributions for the two hypotheses are given as the central $\chi^2$-distribution

$$T \sim \chi^2_{n_u} \mid H_0 \tag{34}$$

with $n_u$ degrees of freedom in case of the null hypothesis and as the noncentral $\chi^2$-distribution

$$T \sim \chi^2_{n_u, \lambda} \mid H_A \tag{35}$$

with $n_u$ degrees of freedom and the noncentrality parameter $\lambda$ in case of the alternative hypothesis. In case of rejection of $H_0$ the difference of the object states and hence the supposed deformation model is considered as significant regarding the chosen significance level $\alpha$. With respect to the discussion in Sect. 3 the test statistic $T$ is understood as being derived from imprecise data. Hence, both its calculation and the statistical hypothesis test have to be extended accordingly. This is possible, e.g., using the concept and methodology which was first developed for one-dimensional test statistics (Kutterer 2004) and extended to multidimensional test statistics (Kutterer and Neumann 2008). The fundamental observation is that both the imprecise test statistic $\tilde{T}$ and the regions of acceptance $A$ and rejection $R$, respectively, are sets. The hypotheses $H_0$ and $H_A$ can be formulated with some linguistic uncertainty such as "The object states at different epochs are approximately identical". In such a case, these regions are better described by fuzzy sets $\tilde{A}$ and $\tilde{R}$, respectively. The proposed strategy compares $\tilde{T}$ with $\tilde{A}$ and $\tilde{R}$, and it quantifies the respective degrees of agreement and of disagreement. The procedure is summarized below; for details see the given references. The degree of agreement of a fuzzy set $\tilde{M}$ with a fuzzy set $\tilde{N}$ can be defined as

$$\gamma_{\tilde{N}}(\tilde{M}) := \frac{\mathrm{card}(\tilde{M} \cap \tilde{N})}{\mathrm{card}(\tilde{M})} \tag{36}$$

based on the cardinality of a fuzzy set according to

$$\mathrm{card}(\tilde{M}) = \int_{\mathbb{R}} m_{\tilde{M}}(x)\, dx. \tag{37}$$

The test criterion is defined as follows: $\gamma_{\tilde{R}}(\tilde{T})$ quantifies the degree of agreement of $\tilde{T}$ and $\tilde{R}$, and $\delta_{\tilde{A}}(\tilde{T}) = 1 - \gamma_{\tilde{A}}(\tilde{T})$ quantifies the degree of disagreement of $\tilde{T}$ and $\tilde{A}$.

The degree of rejection of the null hypothesis is calculated as

$$\rho(\tilde{T}) = \min\left(\gamma_{\tilde{R}}(\tilde{T}), \delta_{\tilde{A}}(\tilde{T})\right). \tag{38}$$

The test decision is based on the comparison of the actual value of the degree of rejection with an accordingly defined critical value $\rho_{crit} \in [0, 1]$. The null hypothesis is accepted if $\rho(\tilde{T}) \le \rho_{crit}$ and it is rejected if $\rho(\tilde{T}) > \rho_{crit}$. In safety-relevant applications of geodetic deformation analysis it is recommended to choose $\rho_{crit} = 0$ which maximizes the sensitivity of the test as it includes the strict consideration of worst-case scenarios. In contrast, the sensitivity is reduced by choosing $\rho_{crit} = 1$ by which point "noise" is introduced to the epoch comparison in terms of a deterministic type of uncertainty. Note that the presented concept allows the computation of the probabilities $\alpha$ and $\beta$ of both Type I errors and Type II errors, respectively, since

$$\alpha = P\left(\rho(\tilde{T}) > \rho_{crit} \mid H_0\right) \tag{39}$$

and

$$\beta = P\left(\rho(\tilde{T}) \le \rho_{crit} \mid H_A\right). \tag{40}$$
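The degree of rejection of Eq. 38 can be approximated numerically from the cardinalities of Eqs. 36 and 37. The following sketch makes several assumptions that are not taken from the text: the imprecise test statistic is a triangular fuzzy number, the fuzzy region of acceptance is one-sided with a linear transition between two bounds, the region of rejection is its complement, and the integrals are approximated by the midpoint rule. All names and numbers are illustrative.

```python
def integrate(f, lo, hi, n=2000):
    """Midpoint rule for the cardinality integral of Eq. 37."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def triangular(mid, spread):
    return lambda x: max(0.0, 1.0 - abs(x - mid) / spread)

def fuzzy_acceptance(k_lo, k_hi):
    """Membership 1 up to k_lo, falling linearly to 0 at k_hi (k_lo < k_hi)."""
    def m(x):
        if x <= k_lo:
            return 1.0
        if x >= k_hi:
            return 0.0
        return (k_hi - x) / (k_hi - k_lo)
    return m

def degree_of_rejection(t_mid, t_spread, k_lo, k_hi):
    m_t = triangular(t_mid, t_spread)
    m_a = fuzzy_acceptance(k_lo, k_hi)
    m_r = lambda x: 1.0 - m_a(x)            # assumed: R is the complement of A
    lo, hi = t_mid - t_spread, t_mid + t_spread
    card_t = integrate(m_t, lo, hi)
    gamma_r = integrate(lambda x: min(m_t(x), m_r(x)), lo, hi) / card_t  # Eq. 36
    gamma_a = integrate(lambda x: min(m_t(x), m_a(x)), lo, hi) / card_t
    return min(gamma_r, 1.0 - gamma_a)      # Eq. 38

for t_mid in (10.0, 18.0, 30.0):
    print(t_mid, degree_of_rejection(t_mid, 1.0, 16.0, 20.0))
```

A test statistic well inside the region of acceptance gives a degree of rejection of 0, one well inside the region of rejection gives 1, and values in the transition zone lie in between, where the choice of the critical value decides.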

Within the so-called utility theory the expected utility (in terms of loss, costs, or gain) can be quantified and taken as decision criterion; see, e.g., Neumann (2009). This approach also gives a theoretical background for the particular choice of $\rho_{crit}$. As a first example, the comparison between the observed distances of two control points is presented. The relevant 1D test statistic is rather simple and hence well suited to show the principal characteristics of the procedure. The test statistic is based on the standardized difference of the two distances according to

$$T = \frac{d}{\sigma_d} \tag{41}$$

with the imprecise extension

$$[T] = [T_l, T_u] = \langle T_m, T_s \rangle \tag{42}$$

if interval mathematics is used. As the null and alternative hypothesis are defined as

$$H_0: E(T) = 0, \quad H_A: E(T) \ne 0, \tag{43}$$

the regions of acceptance $A$ and rejection $R$, respectively, are given as

$$A = [-k, +k] = \langle 0, k \rangle, \quad R = A^c. \tag{44}$$

Considering the possible cases of intersection for $T$, $A$, and $R$, the following cases for the degree of rejection are obtained:

$$\rho(T_m) = \begin{cases} 1, & T_m \le -k - T_s \\ (-T_m + T_s - k)/(2T_s), & -k - T_s < T_m \le -k + T_s \\ 0, & -k + T_s < T_m \le k - T_s \\ (T_m + T_s - k)/(2T_s), & k - T_s < T_m \le k + T_s \\ 1, & T_m > k + T_s \end{cases} \tag{45}$$

Fig. 4 Extension of a 1D hypothesis test for interval data: distinction between three possible cases [Figure: membership m(x) over x with the region of acceptance A between −k and +k and the regions of rejection R outside; the cases shown are [T] ⊆ R, [T] ⊆ A, and [T] ∩ A ≠ ∅ ∧ [T] ∩ R ≠ ∅]

Fig. 5 Degree of rejection with respect to the midpoint Tm and the spread Ts of the interval test statistic [Figure: graph of ρ(Tm) over x with transitions between −k − Ts and −k + Ts and between k − Ts and k + Ts]
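Eq. 45 and the resulting test decision can be sketched compactly because the expression is symmetric in $T_m$. The sketch assumes $T_s \le k$, as in the case distinction of Eq. 45; the function names and the numbers (a distance difference of 4.2 with radius 0.6, a standard deviation of 2.0, and k = 1.96) are hypothetical.

```python
def interval_test_statistic(d_mid, d_rad, sigma_d):
    """[T] = [d] / sigma_d in midpoint-radius form (Eqs. 41 and 42)."""
    return (d_mid / sigma_d, d_rad / sigma_d)

def degree_of_rejection(t_mid, t_rad, k):
    """Eq. 45: share of [T] = <t_mid, t_rad> outside A = [-k, k],
    valid under the assumption t_rad <= k."""
    if t_rad == 0.0:
        return 0.0 if abs(t_mid) <= k else 1.0   # classical precise case
    a = abs(t_mid)                               # Eq. 45 is symmetric in T_m
    if a <= k - t_rad:
        return 0.0
    if a > k + t_rad:
        return 1.0
    return (a + t_rad - k) / (2.0 * t_rad)

def accept_h0(t_mid, t_rad, k, rho_crit):
    """Accept the null hypothesis if the degree of rejection is at most rho_crit."""
    return degree_of_rejection(t_mid, t_rad, k) <= rho_crit

t_mid, t_rad = interval_test_statistic(4.2, 0.6, 2.0)
for rho_crit in (0.0, 0.5, 1.0):
    print(rho_crit, degree_of_rejection(t_mid, t_rad, 1.96),
          accept_h0(t_mid, t_rad, 1.96, rho_crit))
```

In this illustrative configuration the interval test statistic straddles the bound $+k$, so the decision depends on the chosen critical value.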

See Figs. 4 and 5 for a graphical illustration of this example. Figure 4 indicates the three possible cases: (I) the imprecise test statistic is completely an element of $R$, (II) the imprecise test statistic is partially an element of both $A$ and $R$, and (III) the imprecise test statistic is completely an element of $A$. Figure 5 shows the graph of the degree of rejection with clear regions of acceptance ($\rho(T_m) = 0$) and of rejection ($\rho(T_m) = 1$), but also with regions of transition where the test decision depends on the chosen critical value $\rho_{crit}$. Regarding the test decision the null hypothesis is accepted if $\rho(T_m) \le \rho_{crit}$, else it is rejected. Hence, Eq. 45 yields

$$-k + (1 - 2\rho_{crit})\, T_s \le T_m \le k - (1 - 2\rho_{crit})\, T_s \;\Rightarrow\; \text{accept } H_0; \quad \text{else} \;\Rightarrow\; \text{reject } H_0.$$

Obviously, the critical values $-k$ and $+k$ of the case without imprecision are obtained for $\rho_{crit} = 0.5$. If $\rho_{crit} = 0$, the effective region of acceptance is reduced ($A_{eff} = \langle 0, k - T_s \rangle$), and if $\rho_{crit} = 1$ the effective region of acceptance is extended ($A_{eff} = \langle 0, k + T_s \rangle$). Hence, the value of $\rho_{crit}$ can be adapted according to the required type of decision. As already stated above this allows one to control the sensitivity of the test. In the following, the effect of the test principle in case of the congruence model is presented and discussed based on a second example given in Neumann (2009, p. 125f). Figure 6 shows the intersection of the imprecise test statistic (LR-fuzzy interval) derived for the congruence test with fuzzy regions of acceptance and rejection. This fuzziness is introduced through regions of transition between significance probabilities of 0.95 and 0.99. Figure 7 presents the functional relation between the Type I error probability for the extended test based on Eq. 39 and the critical value $\rho_{crit}$. Obviously, $\rho_{crit}$ decreases if this error probability increases. This relation allows one to define a suitable, theoretically justified value of $\rho_{crit}$.

Fig. 6 Extended hypothesis test: comparison of an imprecise test statistic (LR-fuzzy interval calculated by α-cut optimization) with the fuzzy regions of acceptance and rejection (According to Neumann 2009, courtesy of the author) [Figure: membership m(x) over x with the imprecise test statistic (midpoint T_m = 16.93, α-cut optimization points), the fuzzy region of acceptance and the fuzzy region of rejection with a transition between χ²_{0.95}(9.0) = 16.92 and χ²_{0.99}(9.0) = 21.7; card(T̃ ∩ Ã)/card(T̃) = 0.309 and card(T̃ ∩ R̃)/card(T̃) = 0.652]

Fig. 7 Extended hypothesis test: ρ_crit as a function of the imprecise probability of a Type I error calculated by α-cut optimization (According to Neumann 2009, courtesy of the author) [Figure: ρ_crit (0 to 1.0) plotted against α_impr (0 to 1.0)]

5 Future Directions

There are several open questions regarding the discussion in Sect. 4. The proposed extension of the test of the general linear hypothesis was not derived from an optimality criterion but from properties that are meaningful for practical applications. For this reason, a theoretical basis is still required. In addition, thorough comparisons are needed with competing test strategies which mainly rely on imprecise probabilities; see, e.g., Ferson et al. (2007).

Further studies are also required concerning the quantitative modeling of imprecision. In this context, the problem of selecting the most adequate method for imprecision propagation has to be treated. Interval arithmetic and fuzzy data analysis based on the use of the min-operator in the extension principle lead to a linear propagation of interval radii and of the spreads of fuzzy numbers, respectively. If an alternative operator is chosen which is based on a different t-norm, other propagation mechanisms can be provided; see Näther (2009) for a recent discussion. The t-norm discussion also refers to the choice of the operator for deriving the test decision in Sect. 4.

Regarding the discussion in Sects. 2 and 3, there is a need for studies concerning extended state-space filtering, which allows both the spatial and temporal prediction of processes and the diagnosis with respect to unknown system parameters (i.e., a system identification) using, e.g., structural models. Here the Kalman filter can be considered; see, e.g., Kutterer and Neumann (2009). Besides, the extension of a Bayesian state-space filter with respect to imprecision is a worthwhile task. In these cases, the problem of overestimation has to be treated thoroughly.

Despite all dedicated studies regarding the extension of established methods for the use of imprecise data, there is a strong need for comprehensive comparisons of models and results in geodetic deformation analysis which are derived based on different uncertainty theories. As the actual significance of deformation parameters is a key issue which can be safety-relevant, it has to be treated further in dedicated studies in order to optimally assess the risks which are associated with the Type I and Type II errors in testing.

6 Conclusions

In geodetic deformation analysis a system-theoretical approach is required which allows one to assess the functional relation between observable quantities and derived deformation parameters as well as the kind and amount of present uncertainty. In this chapter, it was motivated that the observation uncertainty has to be modeled and propagated to the parameters of interest in an adequate way. Since a purely stochastic approach does not allow a meaningful modeling of the uncertainty of deterministic type, set-theoretical approaches were considered. In order to use these approaches in practice, established stochastic methods had to be extended. Based on the example of a statistical hypothesis test, it was shown that a more meaningful notion of significance is available.

References

Alefeld G, Herzberger J (1983) Introduction to interval computations. Academic, Boston
Bandemer H, Näther W (1992) Fuzzy data analysis. Kluwer Academic, Dordrecht
Dong D, Dickey JO, Chao Y, Cheng MK (1997) Geocenter variations caused by atmosphere, ocean and surface ground water. Geophys Res Lett 24/15:1867–1870
Drewes H, Heidbach O (2004) Deformation of the South American Crust from finite element and collocation methods. In: Sanso F (ed) A window on the future of geodesy. International association of geodesy symposia, vol 128. Springer, Berlin, pp 296–301
Dubois DJ, Prade HM (1980) Fuzzy sets and systems: theory and applications. Academic, London
Eichhorn A (2007) Analysis of dynamic deformation processes with adaptive Kalman-filtering. J Appl Geod 1:9–15
Ferson S, Kreinovich V, Hajagos J, Oberkampf W, Ginzburg L (2007) Experimental uncertainty estimation and statistics for data having interval uncertainty. Sandia National Laboratories, SAND2007-0939
Heunecke O (1994) Zur Identifikation und Verifikation von Deformationsprozessen mittels adaptiver KALMAN-Filterung (in German). PhD thesis, University of Hannover
Heunecke O, Welsch W (2001) Models and terminology for the analysis of geodetic monitoring observations: Official report of the Ad Hoc Committee WG 6.1. In: Whitaker C (ed) Proceedings of the 10th international FIG symposium on deformation measurements, Orange
Jaulin L, Kieffer M, Didrit O, Walter E (2001) Applied interval analysis. Springer, London
Koch KR (1999) Parameter estimation and hypotheses tests in linear models. Springer, Berlin
Koch KR (2007) Introduction to Bayesian statistics, 2nd edn. Springer, Berlin
Kutterer H (2004) Statistical hypothesis tests in case of imprecise data. In: Sanso F (ed) V Hotine-Marussi symposium on mathematical geodesy. Springer, Berlin, pp 49–56
Kutterer H, Neumann I (2008) Multidimensional statistical tests for imprecise data. In: Xu P, Liu J, Dermanis A (eds) VI Hotine-Marussi symposium on theoretical and computational geodesy. Springer, Berlin, pp 232–237
Kutterer H, Neumann I (2009) Fuzzy extensions in state-space filtering: some applications in geodesy. In: Proceedings of the ICOSSAR 2009. Taylor & Francis, London, pp 1268–1275. ISBN:978-0-415-47557-0
Möller B, Beer M (2004) Fuzzy randomness. Springer, Berlin
Näther W (2009) Copulas and t-norms: mathematical tools for modeling propagation of errors and interactions. In: Proceedings of the ICOSSAR 2009. Taylor & Francis, London, pp 1238–1245. ISBN:978-0-415-47557-0
Neumann I (2009) Zur Modellierung eines erweiterten Unsicherheitshaushaltes in Parameterschätzung und Hypothesentests (in German). Series C 634, German Geodetic Commission, Munich
Neuner H, Kutterer H (2007) On the detection of change-points in structural deformation analysis. J Appl Geod 1:63–70
Nguyen HT, Kreinovich V (1996) Nested intervals and sets: concepts, relation to fuzzy sets, and applications. In: Kearfott B, Kreinovich V (eds) Applications of interval computations. Kluwer, Dordrecht, pp 245–290
Rawiel P (2001) Dreidimensionale kinematische Modelle zur Analyse von Deformationen an Hängen (in German). Series C 533, German Geodetic Commission, Munich
Roberts G, Meng X, Meo M, Dodson A, Cosser E, Iuliano E, Morris A (2003) A remote bridge health monitoring system using computational simulation and GPS sensor data. In: Stiros S, Pytharouli P (eds) Proceedings of the 10th international FIG symposium on deformation measurements, Santorini
Saltelli A, Chan K, Scott EM (2000) Sensitivity analysis. Wiley, Chichester
Schön S (2003) Analyse und Optimierung geodätischer Messanordnungen unter besonderer Berücksichtigung des Intervallansatzes (in German). Series C 567, German Geodetic Commission, Munich
Schön S, Kutterer H (2005) Using zonotopes for overestimation-free interval least-squares: some geodetic applications. Reliable Comput 11:137–155
Schön S, Kutterer H (2006) Uncertainty in GPS networks due to remaining systematic errors: the interval approach. J Geod 80:150–162
Welsch W, Heunecke O, Kuhlmann H (2000) Auswertung geodätischer Überwachungsmessungen (in German). Wichmann, Heidelberg


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_37-2 © Springer-Verlag Berlin Heidelberg 2014

Mixed Integer Estimation and Validation for Next Generation GNSS

Peter J.G. Teunissen
Department of Spatial Sciences, Curtin University of Technology, Perth, Australia
Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, Netherlands

Abstract The coming decade will bring a proliferation of Global Navigation Satellite Systems (GNSS) that are likely to revolutionize society in the same way as the mobile phone has done. The promise of a broader multifrequency, multi-signal GNSS “system of systems” has the potential of enabling a much wider range of demanding applications compared to the current GPS-only situation. In order to achieve the highest accuracies, one must exploit the unique properties of the received carrier signals. These properties include the multi-satellite system tracking, the mm-level measurement precision, the frequency diversity, and the integer ambiguities of the carrier phases. Successful exploitation of these properties results in an accuracy improvement of the estimated GNSS parameters of two orders of magnitude. The theory that underpins this ultraprecise GNSS parameter estimation and validation is the theory of integer inference. This theory is the topic of the present chapter.

1 Introduction

1.1 Next Generation GNSS

1.1.1 Background

Global Navigation Satellite Systems (GNSSs) involve satellites, ground stations, and user receiver equipment and software to determine positions anywhere around the world at any time. The global positioning system (GPS) from the United States is the best-known and currently fully operational GNSS. Fueling growth during the next decade will be the next generation GNSSs that are currently being deployed and developed. Current and prospective providers of GNSS systems are the United States, Russia, the European Union, China, Japan, and India. The United States is modernizing its dual-frequency GPS. A third civil frequency will be added, with expected 24-satellite full constellation capability (FOC) around 2015. Russia is revitalizing its GLONASS system, from a current only partially functioning system to 24-satellite FOC reached by 2010. The European Union is developing a completely new multifrequency GNSS, called Galileo, which is currently in its in-orbit validation phase and which will have its 30-satellite FOC by 2012. China is developing its own 30-satellite GNSS, called Compass, of which the first satellite was launched in April 2007. Finally, India and Japan are developing GNSS augmentation systems. India's 7-satellite IRNSS (Indian Regional Navigational Satellite System) is expected to be operational in 2012, and Japan will soon launch the first of its three QZSS (Quasi-Zenith Satellite System) satellites. QZSS is designed to increase the number of satellites available at high-elevation angles over Japan.


1.1.2 Benefits of GNSS

The promise of a broader and more diverse system of GNSSs has enormous potential for improving the accuracy, integrity, and efficiency of positioning worldwide, the importance of which can hardly be overstated. The availability of many more satellites and signals creates exciting opportunities to extend current GPS applications and to enable new applications in areas where the GPS-only situation has been a hindrance to market growth. Extending the operational range of precise carrier-phase GNSS, currently restricted to about 15 km, will allow instantaneous cm-level accuracy positioning at remote locations on land and offshore. Improved integrity will service various industries having high marginal costs (e.g., mining, agriculture, machine-guided construction). Single-frequency tracking of more satellites will create opportunities for the low-cost receiver market of precise real-time location devices, such as handheld devices or devices in moving vehicles. In addition, environmental and spaceborne GNSS will benefit enormously from tracking multiple satellites on multiple frequencies. Environmental GNSS, in general, benefits from denser atmospheric profiling, while short-term weather prediction in particular benefits from a reduction in the latency of GNSS-integrated water vapor estimates. The benefits for spaceborne GNSS are highly accurate orbit determinations of Earth-orbiting space platforms, possibly even in real time, thus offering increased spacecraft autonomy, simplification of spacecraft operations, and support to rapid delivery of end-user data products such as atmospheric profiles from occultation or synthetic aperture radar images for deformation monitoring. An overview of this great variety of GNSS models and their applications can be found in textbooks like Parkinson and Spilker (1996), Strang and Borre (1997), Teunissen and Kleusberg (1998), Farell and Barth (1999), Leick (2004), Misra and Enge (2006), and Hofmann-Wellenhof et al. (2008).

1.1.3 Theory

Several key issues need to be addressed in order to achieve the fullest exploitation of the opportunities created by future GNSSs. The highest possible accuracies can only be achieved if one is able to exploit the unique properties of the received carrier signals. These properties include the mm-level precision with which the carrier phases can be tracked, the frequency diversity of the carriers, and the knowledge that certain functions of the carriers are integer valued. The process of exploiting these properties is known as integer ambiguity resolution (IAR). IAR improves the precision of the estimated GNSS model parameters by at least two orders of magnitude. For positioning, successful IAR effectively transforms the estimated fractional carrier phases into ultraprecise receiver-satellite ranges, thus making high-precision (cm- to mm-level) positioning possible. As a beneficial by-product, it also improves other GNSS model parameters, such as atmospheric parameters, and it enables a reduction of the GNSS parameter-estimation space, sometimes by up to 50 %, thus simplifying computations considerably and accelerating the time to position. However, the success of IAR depends on the strength of the underlying GNSS model. The weaker the model, the more data needs to be accumulated before IAR can be successful and the longer it therefore takes before one can profit from the ultraprecise carrier signals. Clearly, the aim is to have short times-to-convergence, preferably zero, thereby enabling truly instantaneous GNSS positioning.


The theory that underpins ultraprecise GNSS parameter estimation is the theory of integer inference. This theory of estimation and validation is the topic of the present chapter. Although a large part of the theory has been developed since the 1990s for GPS, the theory has a much wider range of applicability.

1.2 Mixed Integer Model

Central in the theory of integer inference is the mixed integer model. To introduce this model, we use the GNSS pseudorange and carrier-phase observables as leading example. If we denote the $j$-frequency pseudorange and carrier phase for the $r$-$s$ receiver-satellite combination at epoch $t$ as $p_{r,j}^{s}(t)$ and $\varphi_{r,j}^{s}(t)$, respectively, then their observation equations can be formulated as

$$\begin{aligned}
p_{r,j}^{s}(t) &= \rho_{r}^{s}(t) + T_{r}^{s}(t) + \mu_j I_{r}^{s}(t) + c\,dt_{r}^{s}(t) + e_{r,j}^{s,p}(t) \\
\varphi_{r,j}^{s}(t) &= \rho_{r}^{s}(t) + T_{r}^{s}(t) - \mu_j I_{r}^{s}(t) + c\,\delta t_{r}^{s}(t) + \lambda_j M_{r,j}^{s} + e_{r,j}^{s,\varphi}(t)
\end{aligned} \tag{1}$$

where $\rho_{r}^{s}$ is the receiver-satellite range, $T_{r}^{s}$ is the tropospheric delay, $I_{r}^{s}$ is the ionospheric delay, $dt_{r}^{s}$ and $\delta t_{r}^{s}$ are the pseudorange and carrier-phase receiver-satellite clock biases, $M_{r,j}^{s}$ is the time-invariant carrier-phase ambiguity, $c$ is the speed of light, $\lambda_j$ is the $j$-frequency wavelength, $\mu_j = (\lambda_j/\lambda_1)^2$, and $e_{r,j}^{s,p}$ and $e_{r,j}^{s,\varphi}$ are the remaining error terms, respectively. The real-valued carrier-phase ambiguity $M_{r,j}^{s} = \varphi_{r,j}(t_0) + \varphi_{j}^{s}(t_0) + N_{r,j}^{s}$ is the sum of the initial receiver-satellite phases and the integer ambiguity $N_{r,j}^{s}$. Through differencing of the observation equations, one can eliminate the initial phases and the clock biases. The so-called double differenced (DD) observation equations then take the form

$$\begin{aligned}
p_{qr,j}^{ts}(t) &= \rho_{qr}^{ts}(t) + T_{qr}^{ts}(t) + \mu_j I_{qr}^{ts}(t) + e_{qr,j}^{ts,p}(t) \\
\varphi_{qr,j}^{ts}(t) &= \rho_{qr}^{ts}(t) + T_{qr}^{ts}(t) - \mu_j I_{qr}^{ts}(t) + \lambda_j N_{qr,j}^{ts} + e_{qr,j}^{ts,\varphi}(t)
\end{aligned} \tag{2}$$

where $p_{qr,j}^{ts}(t) = [p_{r,j}^{s}(t) - p_{r,j}^{t}(t)] - [p_{q,j}^{s}(t) - p_{q,j}^{t}(t)]$, with a similar notation for the other DD variates. The DD tropospheric slant delays are usually further reduced to a single DD vertical delay $T_{qr}^{\text{vert}}$ by means of mapping functions. Furthermore, the need for having the ionospheric delays present depends very much on the baseline length between receivers. These delays can usually be neglected for distances less than 15 km.

If we assume the error terms $e_{qr,j}^{ts,p}(t)$ and $e_{qr,j}^{ts,\varphi}(t)$ in Eq. 2 to be zero-mean random variables, the observation equations can be used to set up a linear model in which some of the unknown parameters are reals and others are integer, namely $N_{qr,j}^{ts}$. Such a model is an example of a mixed integer linear model.

The observation equations of Eq. 2 are parametrized in the DD ranges $\rho_{qr}^{ts}(t)$. These ranges depend on the receiver positions and on the satellite positions. Assuming the satellite orbits are known, these ranges are usually further linearized with respect to the unknown receiver coordinates. As a result one obtains linearized equations that are parametrized in the between-receiver baseline vector increments. Such a model is an example of a mixed integer linearized model. These linearized GNSS models can usually be treated as if they are linear, since the nonlinearities are small.
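The differencing convention of Eq. 2 can be illustrated with a minimal sketch. All numbers below are invented, the receiver-satellite clock term is assumed to split into a receiver part minus a satellite part, and atmospheric delays and noise are left out so that only the clock terms are visible. Under these assumptions the double difference of the pseudoranges equals the double difference of the ranges.

```python
# Hypothetical two-receiver (q, r), two-satellite (s, t) setup; all values are invented.
ranges = {("q", "s"): 20_000_123.4, ("q", "t"): 21_500_456.7,
          ("r", "s"): 20_003_210.9, ("r", "t"): 21_498_765.2}   # metres
rec_clock = {"q": 150.0, "r": -320.0}   # c * receiver clock bias, metres (assumed)
sat_clock = {"s": 45.0, "t": -80.0}     # c * satellite clock bias, metres (assumed)

# Undifferenced pseudoranges: range plus a receiver-satellite clock term.
code = {(rec, sat): ranges[rec, sat] + rec_clock[rec] - sat_clock[sat]
        for rec in "qr" for sat in "st"}


def double_difference(obs, q, r, s, t):
    """[obs_r^s - obs_r^t] - [obs_q^s - obs_q^t], the convention used below Eq. 2."""
    return (obs[r, s] - obs[r, t]) - (obs[q, s] - obs[q, t])


dd_code = double_difference(code, "q", "r", "s", "t")
dd_range = double_difference(ranges, "q", "r", "s", "t")
print(dd_code, dd_range)   # the two values agree up to floating-point rounding
```

The same function applies unchanged to carrier phases, where the differencing additionally removes the initial phases and leaves the integer DD ambiguity.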


We now define the general form of a mixed integer linear model.

Definition 1 (Mixed integer linear model). Let $(A, B)$ be a given $m \times (n + p)$ matrix of full rank and let $Q_{yy}$ be a given $m \times m$ positive definite matrix. Then

$$y \sim N(Aa + Bb,\; Q_{yy}), \quad a \in \mathbb{Z}^n,\; b \in \mathbb{R}^p \tag{3}$$

will be referred to as the mixed integer linear model.

The notation "$\sim$" is used to describe "distributed as." In a GNSS context, the $m$-vector $y$ contains the pseudorange and carrier-phase observables, the $n$-vector $a$ the integer DD ambiguities, and the real-valued $p$-vector $b$ the remaining unknown parameters, such as baseline components (coordinates) and possibly atmospheric delay parameters (troposphere, ionosphere). As in most GNSS applications, the underlying distribution of the above mixed integer model is assumed to be a multivariate normal distribution. Various results in the following sections are also valid, however, for general distributions.

1.3 Chapter Overview

The mixed integer model (Eq. 3) is usually solved and validated in a number of steps. We now briefly present the contributions of this chapter in relation to these steps.

In the first step, the integer nature of $a$ is discarded. The parameters $a$ and $b$ are estimated using least-squares (LS) estimation, which in the present case is equivalent to using maximum likelihood (ML) or best linear unbiased estimation (BLUE). As a result one obtains the so-called float solution:

$$\begin{bmatrix} \hat{a} \\ \hat{b} \end{bmatrix} \sim N\left( \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} Q_{\hat{a}\hat{a}} & Q_{\hat{a}\hat{b}} \\ Q_{\hat{b}\hat{a}} & Q_{\hat{b}\hat{b}} \end{bmatrix} \right) \tag{4}$$

In this first step, one usually also tests the data and GNSS model for possible model misspecifications, e.g., outliers, cycle slips, or other modeling errors. This can be done with the standard theory of hypothesis testing (Baarda 1968; Koch 1999; Teunissen 2006).

In the second step, a mapping $I: \mathbb{R}^n \mapsto \mathbb{Z}^n$ is introduced that takes the integer constraints $a \in \mathbb{Z}^n$ into account:

$$\check{a} = I(\hat{a}) \tag{5}$$

There are many ways in which the mapping $I$ can be defined. In Sect. 2, we introduce three different classes of such estimators. They are the class of integer estimators (I), the class of integer aperture estimators (IA), and the class of integer equivariant estimators (IE). These three classes were introduced in Teunissen (1999a, 2003a, b). They are subsets of one another and related as

$$\text{I} \subset \text{IA} \subset \text{IE} \tag{6}$$

Each class consists of a multitude of estimators. For each class we present the optimal estimator. As optimality criterion we either use the maximization of the probability of correct integer estimation or the minimization of the mean squared error.


Since estimators from the class of integer estimators are most often used, we present in Sect. 3 an analysis of the properties of the three most popular integer estimators. They are the estimators of integer rounding (IR), integer bootstrapping (IB), and integer least-squares (ILS). Special attention is given to computational issues and to their success-rates, i.e., the probabilities of correct integer estimation. It is shown that the performances of these three integer estimators are related as

$$P(\check{a}_{\text{IR}} = a) \le P(\check{a}_{\text{IB}} = a) \le P(\check{a}_{\text{ILS}} = a) \tag{7}$$

Knowing that $a$ is integer strengthens the model and allows one, in principle, to re-evaluate the validation of the model as compared to the first step validation. However, the standard theory of hypothesis testing is then not applicable anymore. This validation problem will be addressed in Sect. 4, where we also present a cross-validation method for the mixed integer model.

In the final step, once $\check{a}$ of Eq. 5 is computed and accepted, the float estimator $\hat{b}$ is readjusted to obtain the so-called fixed estimator

$$\check{b} = \hat{b} - Q_{\hat{b}\hat{a}} Q_{\hat{a}\hat{a}}^{-1} (\hat{a} - \check{a}) \tag{8}$$

Whether or not $\check{b}$ is an improvement over $\hat{b}$ depends in large part on the probabilistic properties of $\check{a}$. We therefore present in Sect. 4 the probability density function (PDF) of $\check{b}$ and show how it is influenced by the probability mass function (PMF) of $\check{a}$.
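The three computational steps (float solution of Eq. 4, integer mapping of Eq. 5, fixed solution of Eq. 8) can be sketched as follows. This is a minimal illustration under stated assumptions, not a GNSS implementation: the design matrices, noise level, and true parameters are invented, and componentwise rounding is used as the simplest possible choice for the mapping $I$.

```python
import numpy as np


def float_solution(y, A, B, Qyy):
    """Weighted least-squares float solution of y ~ N(Aa + Bb, Qyy), cf. Eq. 4."""
    n = A.shape[1]
    M = np.hstack([A, B])
    W = np.linalg.inv(Qyy)
    Q = np.linalg.inv(M.T @ W @ M)        # joint covariance of (a_hat, b_hat)
    x = Q @ M.T @ W @ y
    return x[:n], x[n:], Q[:n, :n], Q[n:, :n]   # a_hat, b_hat, Q_aa, Q_ba


def fixed_solution(b_hat, a_hat, a_check, Q_ba, Q_aa):
    """Readjust the float estimate of b once the integers are accepted, cf. Eq. 8."""
    return b_hat - Q_ba @ np.linalg.solve(Q_aa, a_hat - a_check)


# Invented example with m = 6 observations, n = 2 integers, p = 1 real parameter.
A = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [1, 0], [0, 1]], dtype=float)
B = np.array([[1], [1], [1], [1], [-1], [-1]], dtype=float)
a_true, b_true = np.array([3, -2]), np.array([0.25])
sigma = 0.02
Qyy = sigma**2 * np.eye(6)

rng = np.random.default_rng(1)
y = A @ a_true + B @ b_true + rng.normal(scale=sigma, size=6)

a_hat, b_hat, Q_aa, Q_ba = float_solution(y, A, B, Qyy)
a_check = np.rint(a_hat)                  # integer rounding as the mapping I (assumption)
b_check = fixed_solution(b_hat, a_hat, a_check, Q_ba, Q_aa)
print(a_hat, a_check, b_hat, b_check)
```

In this well-conditioned toy model rounding recovers the integers easily; for realistic, strongly correlated ambiguities the choice of integer estimator matters, which is the subject of Sects. 2 and 3.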

2 Principles of Integer Inference

In this section, we present three different classes of integer parameter estimators. They are the integer estimators, the integer aperture estimators, and the integer equivariant estimators. Within each class we determine the optimal estimator.

2.1 Integer Estimation

2.1.1 Pull-In Regions

We start with the requirement that the estimator $\check{a}$ needs to be integer, $\check{a} = I(\hat{a}) \in \mathbb{Z}^n$. Then $I: \mathbb{R}^n \mapsto \mathbb{Z}^n$ is a many-to-one map, instead of a one-to-one map. Different real-valued vectors will be mapped to one and the same integer vector. One can therefore assign a subset, say $P_z \subset \mathbb{R}^n$, to each integer vector $z \in \mathbb{Z}^n$,

$$P_z = \{x \in \mathbb{R}^n \mid z = I(x)\}, \quad z \in \mathbb{Z}^n \tag{9}$$

This subset is referred to as the pull-in region of $z$. It is the region in which all vectors are pulled to the same integer vector $z$. The concept of pull-in regions can be used to define integer estimators.

Definition 2 (Integer estimators). The mapping $\check{a} = I(\hat{a})$ is said to be an integer estimator if its pull-in regions satisfy

1. $\bigcup_{z \in \mathbb{Z}^n} P_z = \mathbb{R}^n$
2. $\text{Int}(P_u) \cap \text{Int}(P_v) = \emptyset, \; \forall u, v \in \mathbb{Z}^n, \; u \ne v$
3. $P_z = z + P_0, \; \forall z \in \mathbb{Z}^n$

According to this definition an integer estimator is completely specified once the pull-in region $P_0$ is given. The following explicit expression can be given for an integer estimator:

$$\check{a} = \sum_{z \in \mathbb{Z}^n} z\, p_z(\hat{a}) \tag{10}$$

with the indicator function, $p_z(x)$, defined as

$$p_z(x) = \begin{cases} 1 & \text{if } x \in P_z \\ 0 & \text{if } x \notin P_z \end{cases}$$

Note, since $\sum_{z \in \mathbb{Z}^n} p_z(x) = 1, \; \forall x \in \mathbb{R}^n$, that the $p_z(\hat{a})$ can be interpreted as weights. The integer estimator $\check{a}$ is therefore equal to a weighted sum of integer vectors with binary weights. Examples of I-estimators are integer rounding, integer bootstrapping, and integer least-squares. Their pull-in regions are the multivariate versions of a square, a parallelogram, and a hexagon. The properties of these popular integer estimators will be further detailed in Sect. 3.

2.1.2 PMF and Success-Rate

The outcome of an integer estimator should only be used if one has enough confidence in its solution. To evaluate one's confidence in $\check{a}$, one needs its PMF. The PMF of $\check{a}$ is obtained by integrating the PDF of $\hat{a}$, $f_{\hat{a}}(x|a)$, over the pull-in regions $P_z \subset \mathbb{R}^n$,

$$P(\check{a} = z) = P(\hat{a} \in P_z) = \int_{P_z} f_{\hat{a}}(x|a)\,dx, \quad z \in \mathbb{Z}^n \tag{11}$$

In case $\hat{a} \sim N(a, Q_{\hat{a}\hat{a}})$, the PDF is given as $f_{\hat{a}}(x|a) = C \exp\{-\tfrac{1}{2} \|x - a\|^2_{Q_{\hat{a}\hat{a}}}\}$, where $C$ is a normalizing constant and $\|\cdot\|^2_M = (\cdot)^T M^{-1} (\cdot)$.

The PMF of $\check{a}$ depends, of course, on the pull-in regions $P_z$ and therefore on the chosen integer estimator. Since various integer estimators exist, some may be better than others. Having the problem of GNSS ambiguity resolution in mind, one is particularly interested in the probability of correct integer estimation. This probability is referred to as the success-rate, $P_s = P(\check{a} = a)$. Its complement is referred to as the fail-rate, $P_f = 1 - P_s$. The success-rate is computed as

$$P_s = \int_{P_a} f_{\hat{a}}(x|a)\,dx = \int_{P_0} f_{\hat{a}}(x + a|a)\,dx \tag{12}$$

This shows, if the PDF has the translational property $f_{\hat{a}}(x + a|a) = f_{\hat{a}}(x|0)$, that the success-rate can be computed without knowledge of the unknown integer vector $a \in \mathbb{Z}^n$. Obviously, this is the case for the PDF of the multivariate normal distribution.

Equation 12 is very important for GNSS applications. It allows the GNSS user to evaluate (often even before the actual measurements are taken) whether or not the strength of the underlying GNSS model is such that one can expect successful integer ambiguity resolution. The evaluation of the multivariate integral of Eq. 12 can generally be done through Monte Carlo integration (Robert and Casella 1999). For some important integer estimators, we also have easy-to-compute expressions and/or sharp (lower and upper) bounds of their success-rates available (cf. Sect. 3).

2.1.3 Optimal Integer Estimation

Since the success-rate depends on the pull-in region and therefore on the chosen integer estimator, it is of importance to know which integer estimator maximizes the probability of correct integer estimation (Teunissen 1999b).

Theorem 1 (Optimal integer estimation). Let $f_{\hat{a}}(x|a)$ be the PDF of $\hat{a}$ and let the integer maximum likelihood (IML) estimator

$$\check{a}_{\text{IML}} = \arg\max_{z \in \mathbb{Z}^n} f_{\hat{a}}(\hat{a}|z) \tag{13}$$

be an integer estimator. Then

$$P(\check{a}_{\text{IML}} = a) \ge P(\check{a} = a) \tag{14}$$

for any integer estimator $\check{a}$.

This result shows that of all integer estimators, the IML estimator has the largest success-rate. The theorem holds true for an arbitrary PDF of $\hat{a}$. In case $\hat{a} \sim N(a, Q_{\hat{a}\hat{a}})$, the optimal I estimator is the integer least-squares (ILS) estimator

$$\check{a}_{\text{ILS}} = \arg\min_{z \in \mathbb{Z}^n} \|\hat{a} - z\|^2_{Q_{\hat{a}\hat{a}}} \tag{15}$$

The above theorem therefore gives a probabilistic justification for using the ILS estimator when the PDF is Gaussian. We will have more to say about the ILS estimator in Sect. 3.3.
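The Monte Carlo evaluation of Eq. 12 mentioned above can be sketched for a small example. The sketch makes several assumptions that are not from the text: the 2 × 2 covariance matrix is invented, the ILS search is a plain exhaustive search in a small window around the rounded float solution (adequate for this small example, not an efficient or general method), and the true integer vector is set to zero, which the translational property of the normal PDF permits.

```python
import itertools

import numpy as np


def ils_bruteforce(x, Q_inv, radius=2):
    """Integer least-squares (Eq. 15) by exhaustive search around round(x)."""
    center = np.rint(x)
    best, best_d = None, np.inf
    for offset in itertools.product(range(-radius, radius + 1), repeat=len(x)):
        z = center + np.array(offset)
        r = x - z
        d = r @ Q_inv @ r                 # squared distance in the metric of Q
        if d < best_d:
            best, best_d = z, d
    return best


def success_rates(Q, n_samples=5000, seed=0):
    """Monte Carlo estimates of P_s for integer rounding and for ILS, with a = 0."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(np.zeros(len(Q)), Q, size=n_samples)
    Q_inv = np.linalg.inv(Q)
    ok_ir = sum(not np.any(np.rint(x)) for x in samples)
    ok_ils = sum(not np.any(ils_bruteforce(x, Q_inv)) for x in samples)
    return ok_ir / n_samples, ok_ils / n_samples


# Invented, strongly correlated float-ambiguity covariance matrix.
Q = np.array([[0.09, 0.08],
              [0.08, 0.09]])
ps_ir, ps_ils = success_rates(Q)
print(f"rounding: {ps_ir:.3f}, ILS: {ps_ils:.3f}")
```

For a correlated covariance matrix like this one, the estimated ILS success-rate should come out above that of rounding, in line with Theorem 1; for a diagonal matrix the two estimators coincide.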

2.2 Integer Aperture Estimation

2.2.1 Aperture Pull-In Regions

The outcome of an integer (I) estimator is always an integer, whether the fail-rate is large or small. Since the user has no direct control over this fail-rate (other than strengthening the underlying model a priori), the user has no direct influence on the confidence of the integer solution. To give the user control over the fail-rate, the class of integer aperture (IA) estimators was introduced. This class is larger than the I class. The IA class is defined by dropping one of the three conditions in Definition 2, namely, the condition that the pull-in regions should cover $\mathbb{R}^n$ completely. We thus allow the IA pull-in regions to have gaps.

Definition 3 (Integer aperture estimators). Let $\Omega \subset \mathbb{R}^n$ and $\Omega_z = \Omega \cap P_z$, where $P_z$ is a pull-in region of an arbitrary I estimator. Then the estimator

$$\check{a} = \begin{cases} \hat{a} & \text{if } \hat{a} \notin \Omega \\ z & \text{if } \hat{a} \in \Omega_z \end{cases} \tag{16}$$

is said to be an integer aperture estimator if its pull-in regions satisfy

1. $\bigcup_{z \in \mathbb{Z}^n} \Omega_z = \Omega \subset \mathbb{R}^n$
2. $\text{Int}(\Omega_u) \cap \text{Int}(\Omega_v) = \emptyset, \; \forall u, v \in \mathbb{Z}^n, \; u \ne v$
3. $\Omega_z = z + \Omega_0, \; \forall z \in \mathbb{Z}^n$

If we compare it with Definition 2, we note that the role of the complete space $\mathbb{R}^n$ has been replaced by the subset $\Omega \subset \mathbb{R}^n$. It is easily verified from the above conditions that $\Omega$ is $z$-translational invariant, $\Omega = \Omega + z, \; \forall z \in \mathbb{Z}^n$. Also note that the I class is a subset of the IA class. Thus I estimators are IA estimators, but the converse is not necessarily true.

An IA estimator maps the float solution $\hat{a}$ to the integer vector $z$ if $\hat{a} \in \Omega_z$ and it maps the float solution to itself if $\hat{a} \notin \Omega$. An IA estimator can therefore be expressed explicitly as

$$\check{a}_{\text{IA}} = \hat{a} + \sum_{z \in \mathbb{Z}^n} (z - \hat{a})\, \omega_z(\hat{a}) \tag{17}$$

with $\omega_z(x)$ the indicator function of $\Omega_z$. Note that the IA estimator is completely determined once $\Omega_0 \subset P_0$ is given. Thus $\Omega_0$ plays the same role for the IA estimators as $P_0$ does for the I estimators. By changing the size and shape of $\Omega_0$, one changes the outcome of the IA estimator. The subset $\Omega_0$ can therefore be seen as an adjustable pull-in region with two limiting cases: the limiting case in which $\Omega_0$ is empty and the limiting case in which $\Omega_0$ equals $P_0$. In the first case the IA estimator becomes identical to the float solution $\hat{a}$, and in the second case the IA estimator becomes identical to an I estimator. The subset $\Omega_0$ therefore determines the aperture of the pull-in region.

2.2.2 Probability Distribution and Successful Fix Rate

In order to evaluate the performance of an IA estimator, the following three outcomes need to be distinguished: $\hat{a} \in \Omega_a$ for success (correct integer estimation), $\hat{a} \in \Omega \setminus \Omega_a$ for failure (incorrect integer estimation), and $\hat{a} \notin \Omega$ for undecided ($a$ not estimated as integer). The corresponding probabilities of success ($s$), failure ($f$), and undecided ($u$) are given as

$$\begin{aligned}
P_s &= P(\check{a}_{\text{IA}} = a) = \int_{\Omega_a} f_{\hat{a}}(x|a)\,dx \\
P_f &= \sum_{z \in \mathbb{Z}^n \setminus \{a\}} P(\check{a}_{\text{IA}} = z) = \sum_{z \in \mathbb{Z}^n \setminus \{a\}} \int_{\Omega_z} f_{\hat{a}}(x|a)\,dx \\
P_u &= 1 - P_s - P_f = 1 - \int_{\Omega_0} f_{\check{\delta}}(x|a)\,dx
\end{aligned} \tag{18}$$

where

$$f_{\check{\delta}}(x|a) = \sum_{z \in \mathbb{Z}^n} f_{\hat{a}}(x + z|a)\, p_0(x) \tag{19}$$

is the PDF of the residual $\check{\delta} = \hat{a} - \check{a}$ (Teunissen 2002; Verhagen and Teunissen 2004, 2005).


Since a  Pa , it follows that the success-rate of an IA estimator will never be larger than that of the corresponding I estimator. So what have we gained? What we have gained is that we can now control the fail-rate and in particular control the successful fix rate, i.e., the probability of successful fixing. Note that the complement of the undecided probability 1  Pu D Ps C Pf is the fix probability, i.e., the probability that the outcome of the IA estimator is integer. The probability of successful fixing is therefore given by the ratio Psf D

Ps Ps C Pf

(20)

To have confidence in the integer outcomes of IA-estimation, a user would like to have $P_{sf}$ close to 1. This can be achieved by setting the fail-rate $P_f$ at a small-enough level. Thus the user chooses the level of fail-rate he finds acceptable and then determines the size of the aperture pull-in region that corresponds with this fail-rate level. With such a setting, the user has the guarantee that the fail-rate of his IA estimator will never become unacceptably large. As with I estimation, the user can choose from a whole class of IA estimators simply by using different shape definitions for the aperture pull-in regions. Various examples have been given in Verhagen and Teunissen (2006). The ILS estimator combined with the popular GNSS Ratio-Test (Leick 2004) is such an example. Unfortunately one can still find various incorrect interpretations of the Ratio-Test in the literature. Its use is often motivated by stating that it determines whether the ILS solution is true or false. This is not correct. Also the current ways of choosing the tolerance value $\mu$ are ad hoc or based on false theoretical grounds. Often a fixed value of 1/2 or 1/3 is used. However, as shown in Verhagen and Teunissen (2006) and Teunissen and Verhagen (2009), instead of using a fixed $\mu$-value, one should use the fixed fail-rate approach. From the fixed fail-rate, one can then compute the variable $\mu$-value (it varies with varying strength of the underlying GNSS model).

2.2.3 Optimal Integer Aperture Estimation

So far we considered IA estimation with a priori chosen aperture pull-in shapes. Now we determine which of the IA estimators performs best. As the optimal IA estimator we choose the one which maximizes the success-rate subject to a given fail-rate. The optimal IA estimator is given by the following theorem (Teunissen 2005).

Theorem 2 (Optimal integer aperture estimation). Let $f_{\hat{a}}(x|a)$ and $f_{\check{\delta}}(x|a)$ be the PDFs of $\hat{a}$ and $\check{\delta} = \hat{a} - \check{a}_{IML}$, respectively, and let $P_s$ and $P_f$ be the success-rate and the fail-rate of the IA estimator. Then the solution to

$$\max_{\Omega_0 \subset \mathcal{P}_0} P_s \quad \text{subject to given } P_f \tag{21}$$

is given by the aperture pull-in region

$$\Omega_0 = \{x \in \mathcal{P}_0 \mid f_{\check{\delta}}(x|a) \le \mu f_{\hat{a}}(x + a|a)\} \tag{22}$$

where

$$\mathcal{P}_0 = \{x \in \mathbb{R}^n \mid 0 = \arg\max_{z \in \mathbb{Z}^n} f_{\hat{a}}(x|z)\}$$


and with the aperture parameter $\mu$ chosen so as to satisfy the a priori fixed fail-rate $P_f$.

The steps in computing the optimal IA estimator are therefore as follows:

(1) Compute the optimal I estimator $\check{a} = \arg\max_{z \in \mathbb{Z}^n} f_{\hat{a}}(\hat{a}|z)$.
(2) Determine the aperture parameter $\mu$ from the user-defined fail-rate $P_f$. Ways of doing this are discussed in Verhagen (2005a, b) and Teunissen and Verhagen (2009).
(3) Check whether $\check{\delta} = \hat{a} - \check{a}$ lies in $\Omega_0$. If $\check{\delta} \in \Omega_0$, then $\check{a}$ is the outcome of the optimal IA estimator; otherwise, the outcome is $\hat{a}$.

2.3 Integer Equivariant Estimation

2.3.1 A Larger Class of Estimators

The class of IA estimators includes the class of I estimators. Now we introduce an even larger class. This larger class is obtained by dropping another condition of Definition 2. Since we would at least like to retain the integer remove-restore property, we keep the condition that the estimators must be z-translational invariant. Such estimators will be called integer equivariant (IE) estimators.

Definition 4 (Integer equivariant estimators). The estimator $\hat{\theta}_{IE} = F_\theta(\hat{a})$, with $F_\theta: \mathbb{R}^n \mapsto \mathbb{R}$, is said to be an integer equivariant estimator of the linear function $\theta = l^T a$ if

$$F_\theta(x + z) = F_\theta(x) + l^T z, \quad \forall x \in \mathbb{R}^n,\ z \in \mathbb{Z}^n \tag{23}$$

It will be clear that I estimators and IA estimators are also IE estimators. The converse, however, is not necessarily true. The class of IE estimators is also larger than the class of linear unbiased estimators, assuming that the float solution is unbiased. Let $F_\theta^T \hat{a}$, for some $F_\theta \in \mathbb{R}^n$, be the linear estimator of $\theta = l^T a$. For it to be unbiased one needs, using $E\{\hat{a}\} = a$, that $F_\theta^T E\{\hat{a}\} = l^T a,\ \forall a \in \mathbb{R}^n$ holds true or that $F_\theta = l$. But this is equivalent to stating that $F_\theta^T(\hat{a} + a) = F_\theta^T \hat{a} + l^T a,\ \forall \hat{a} \in \mathbb{R}^n,\ a \in \mathbb{R}^n$. Comparison with (23) shows that the condition of linear unbiasedness is more restrictive than the condition of integer equivariance. The class of linear unbiased estimators is therefore a subset of the IE class. This implies that a “best” IE estimator must be at least as good as the BLUE $\hat{a}$. After all, the float solution $\hat{a}$ is an IE estimator as well.

2.3.2 Best Integer Equivariant Estimation

We denote the best integer equivariant (BIE) estimator of $\theta$ as $\hat{\theta}_{BIE}$ and use the mean squared error (MSE) as our criterion of “best.” The BIE estimator of $\theta = l^T a$ is therefore defined as

$$\hat{\theta}_{BIE} = \arg\min_{F_\theta \in IE} E\{(F_\theta(\hat{a}) - \theta)^2\} \tag{24}$$

in which IE stands for the class of IE estimators. The minimization is thus taken over all integer equivariant functions that satisfy the condition of Definition 4. Thus the BIE estimator is the optimal IE estimator in the MSE sense.


The reason for choosing the MSE criterion is twofold. First, it is a well-known probabilistic criterion for measuring the closeness of an estimator to its target value, in our case $\theta = l^T a$. Second, the MSE criterion is also often used as a measure for the quality of the float solution itself. It should be kept in mind, however, that the MSE criterion is a weaker criterion than the probabilistic criterion used in the previous two sections. The BIE estimator is given by the following theorem (Teunissen 2003b).

Theorem 3 (Best integer equivariant estimation). Let $f_{\hat{a}}(x|a)$ be the PDF of $\hat{a}$ and let $\hat{\theta}_{BIE}$ be the best integer equivariant estimator of $\theta = l^T a$. Then $\hat{\theta}_{BIE} = l^T \hat{a}_{BIE}$, where

$$\hat{a}_{BIE} = \sum_{z \in \mathbb{Z}^n} z\, w_z(\hat{a}) \tag{25}$$

with the weighting functions $w_z(x)$ given as

$$w_z(x) = \frac{f_{\hat{a}}(x + a - z|a)}{\sum_{u \in \mathbb{Z}^n} f_{\hat{a}}(x + a - u|a)} \tag{26}$$

Like the I estimator, the BIE estimator is also a weighted sum of all integer vectors in $\mathbb{Z}^n$. In the present case, however, the weights are not binary. They vary between 0 and 1, and their values are determined by the float solution and its PDF. As a consequence the BIE estimator will be real valued in general, instead of integer valued. An important consequence of the above theorem is that the BIE estimator is always better than or at least as good as any integer estimator as well as any linear unbiased estimator. After all, the class of integer estimators and the class of linear unbiased estimators are both subsets of the class of IE estimators. The nonlinear BIE estimator is therefore also better than, or at least as good as, the best linear unbiased estimator (BLUE):

$$\mathrm{MSE}(\hat{\theta}_{BIE}) \le \mathrm{MSE}(\hat{\theta}_{BLUE}) \tag{27}$$

The BLUE is the minimum variance estimator of the class of linear unbiased estimators and it is given by the well-known Gauss-Markov theorem. The two estimators $\hat{\theta}_{BIE}$ and $\hat{\theta}_{BLUE}$ therefore both minimize the mean squared error, albeit within a different class. The above theorem holds true for any PDF the float solution $\hat{a}$ might have. In the Gaussian case $\hat{a} \sim N(a, Q_{\hat{a}\hat{a}})$, the weighting function of Eq. 26 becomes

$$w_z(x) = \frac{\exp\left\{-\tfrac{1}{2}\|x - z\|^2_{Q_{\hat{a}\hat{a}}}\right\}}{\sum_{u \in \mathbb{Z}^n} \exp\left\{-\tfrac{1}{2}\|x - u\|^2_{Q_{\hat{a}\hat{a}}}\right\}} \tag{28}$$

Since the space of integers $\mathbb{Z}^n$ can be seen as a certain discretized version of the space of real numbers $\mathbb{R}^n$, one would expect, if the integer grid size gets smaller in relation to the size and extent of the PDF, that the difference between the two estimators, $\hat{a}_{BIE}$ and $\hat{a}$, gets smaller as well. Similarly, if the PDF gets more peaked in relation to the integer grid size, one would expect that the BIE estimator $\hat{a}_{BIE}$ tends to an integer estimator. This is made precise in the following lemma.


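Lemma 1 (Limits of the BIE estimator).

(i) If we replace $\sum_{z \in \mathbb{Z}^n}$ by $\int_{\mathbb{R}^n} dz$ in Eqs. 25 and 28, then $\hat{a}_{BIE} = \hat{a}$.

(ii) Let $\hat{a} \sim N(a, Q_{\hat{a}\hat{a}} = \sigma^2 G)$. Then

$$\lim_{\sigma \to 0} \hat{a}_{BIE} = \check{a}_{ILS}$$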

A probabilistic performance comparison between a, O aILS , and aO BIE can be found in Verhagen and Teunissen (2005). It is interesting to observe that the above given Gaussian-based expression for aO BIE is identical to its Bayesian counterpart as given in Betti et al. (1993), Gundlich and Koch (2002), and Gundlich and Teunissen (2004). This is not quite true for the general case however. Still, this Gaussian equivalence nicely bridges the gap between the current theory of integer inference and the Bayesian approach.
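The two limits of Lemma 1 can be observed numerically in the scalar Gaussian case. The sketch below evaluates Eqs. (25) and (28) for a scalar float solution; the function name `bie_scalar` and the truncation of the infinite integer sum to a finite window (`half_width`) are assumptions made for illustration, not part of the theory above.

```python
import math


def bie_scalar(a_hat: float, sigma: float, half_width: int = 50) -> float:
    """Scalar BIE estimate from Eqs. (25) and (28).

    The sum over all integers is truncated to a window of
    2 * half_width + 1 integers around the rounded float solution
    (an approximation chosen for this sketch).
    """
    center = math.floor(a_hat + 0.5)
    candidates = range(center - half_width, center + half_width + 1)
    # Gaussian exponents -0.5 * (a_hat - z)^2 / sigma^2
    exponents = [-0.5 * ((a_hat - z) / sigma) ** 2 for z in candidates]
    shift = max(exponents)  # avoids underflow of all terms at once
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return sum(z * w for z, w in zip(candidates, weights)) / total


# Peaked PDF: the estimate approaches the nearest integer.
print(bie_scalar(2.3, 0.05))
# Flat PDF relative to the integer grid: the estimate approaches the float value.
print(bie_scalar(2.3, 3.0))
```

For intermediate values of `sigma` the result lies between the float solution and the nearest integer, which reflects the non-binary weights of the BIE estimator.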

3 Three Popular Integer Estimators

In this section, we discuss integer rounding, integer bootstrapping, and integer least-squares, with special attention to computational issues and the success-rates. We assume $\hat{a} \sim N(a \in \mathbb{Z}^n, Q_{\hat{a}\hat{a}})$.

3.1 Integer Rounding

3.1.1 Scalar and Vectorial Rounding

The simplest integer estimator is “rounding to the nearest integer.” In the scalar case, its pull-in regions (intervals) are given as

$$\mathcal{R}_z = \{x \in \mathbb{R} \mid |x - z| \le 1/2\}, \quad z \in \mathbb{Z} \tag{29}$$

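Any outcome of $\hat{a} \sim N(a \in \mathbb{Z}, \sigma_{\hat{a}}^2)$ that satisfies $|\hat{a} - z| \le 1/2$ will thus be pulled to the integer $z$. We denote the rounding estimator as $\check{a}_R$ and the operation of integer rounding as $\lceil \cdot \rfloor$. Thus $\check{a}_R = \lceil \hat{a} \rfloor$ and $\check{a}_R = z$ if $\hat{a} \in \mathcal{R}_z$. The PMF of $\check{a}_R = \lceil \hat{a} \rfloor$ is given as

$$P(\check{a}_R = z) = \Phi\left(\frac{1 + 2(a - z)}{2\sigma_{\hat{a}}}\right) + \Phi\left(\frac{1 - 2(a - z)}{2\sigma_{\hat{a}}}\right) - 1, \quad z \in \mathbb{Z} \tag{30}$$

where $\Phi(x)$ denotes the normal distribution function, $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left\{-\tfrac{1}{2} v^2\right\} dv$. Note that the PMF is symmetric about $a$. Thus integer rounding provides for an unbiased integer estimator, $E(\check{a}_R) = a \in \mathbb{Z}$. Also note that the PMF becomes more peaked when $\sigma_{\hat{a}}$ gets smaller.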


For GNSS ambiguity resolution, the success-rate is of particular importance. The success-rate of integer rounding follows from Eq. 30 by setting $z$ equal to $a$:

$$P(\check{a}_R = a) = 2\Phi\left(\frac{1}{2\sigma_{\hat{a}}}\right) - 1 \tag{31}$$

Thus the smaller the standard deviation, the larger the success-rate. A success-rate better than 0.99 requires $\sigma_{\hat{a}} \le 0.15$ cycle. Scalar rounding is easily generalized to the vectorial case. It is defined as the component-wise rounding of $\hat{a} = (\hat{a}_1, \ldots, \hat{a}_n)^T$: $\check{a}_R = (\lceil \hat{a}_1 \rfloor, \lceil \hat{a}_2 \rfloor, \ldots, \lceil \hat{a}_n \rfloor)^T$. The pull-in regions of vectorial rounding are the multivariate versions of the scalar pull-in intervals:

$$\mathcal{R}_z = \left\{x \in \mathbb{R}^n \mid \left|c_i^T(x - z)\right| \le 1/2,\ i = 1, \ldots, n\right\}, \quad z \in \mathbb{Z}^n \tag{32}$$

where $c_i$ denotes the unit vector having a 1 as its $i$th entry and 0s otherwise. Thus the pull-in regions of rounding are unit squares in 2D, unit cubes in 3D, etc.

3.1.2 Rounding Success-Rate

To determine the joint PMF of the components of $\check{a}_R$, we have to integrate the PDF of $\hat{a} \sim N(a, Q_{\hat{a}\hat{a}})$ over the pull-in regions $\mathcal{R}_z$. These $n$-fold integrals are unfortunately difficult to evaluate, unless the variance matrix $Q_{\hat{a}\hat{a}}$ is diagonal, in which case the components of $\check{a}_R$ are independent and their joint PMF follows as the product of the univariate PMFs of the components. The corresponding success-rate is then given by the $n$-fold product of the univariate success-rates. In the case of GNSS, the ambiguity variance matrix will usually be fully populated, meaning that one will have to resort to methods of Monte Carlo simulation for computing the joint PMF. In the case of the success-rate, one can alternatively make use of the following bounds (Teunissen 1998b).

Theorem 4 (Rounding success-rate bounds). Let $\hat{a} \sim N(a \in \mathbb{Z}^n, Q_{\hat{a}\hat{a}})$. Then the rounding success-rate can be bounded from below and from above as

$$\prod_{i=1}^{n} \left[2\Phi\left(\frac{1}{2\sigma_{\hat{a}_i}}\right) - 1\right] \le P(\check{a}_R = a) \le 2\Phi\left(\frac{1}{2\sigma_{\max}}\right) - 1 \tag{33}$$

where $\sigma_{\max} = \max_{i=1,\ldots,n} \sigma_{\hat{a}_i}$.

These easy-to-compute bounds are very useful for determining the expected success of GNSS ambiguity rounding. The upper bound is useful to quickly decide against such ambiguity resolution. It shows that ambiguity resolution based on vectorial rounding cannot be expected to succeed if even one of the scalar rounding success-rates is too low. The lower bound is useful to quickly decide in favor of vectorial rounding. If the lower bound is sufficiently close to 1, one can be confident that vectorial rounding will produce the correct integer ambiguity vector. Note that this requires each of the individual probabilities in the product of the lower bound to be sufficiently close to 1.
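The rounding PMF of Eq. (30), the scalar success-rate of Eq. (31), and the bounds of Eq. (33) are simple enough to evaluate with the standard library. The following is a minimal sketch; the function names and the sample standard deviations are hypothetical and chosen for illustration only.

```python
import math


def phi(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rounding_pmf(z: int, a: int, sigma: float) -> float:
    """P(a_R = z) for scalar rounding, Eq. (30)."""
    d = a - z
    return phi((1 + 2 * d) / (2 * sigma)) + phi((1 - 2 * d) / (2 * sigma)) - 1.0


def scalar_success_rate(sigma: float) -> float:
    """P(a_R = a) for scalar rounding, Eq. (31)."""
    return 2.0 * phi(1.0 / (2.0 * sigma)) - 1.0


def success_rate_bounds(sigmas):
    """Lower and upper bound of the vectorial rounding success-rate, Eq. (33)."""
    lower = math.prod(scalar_success_rate(s) for s in sigmas)
    upper = scalar_success_rate(max(sigmas))
    return lower, upper


# Illustrative ambiguity standard deviations in cycles.
print(scalar_success_rate(0.15))
print(success_rate_bounds([0.05, 0.10, 0.15]))
```

For a standard deviation of 0.15 cycle the scalar success-rate evaluates to about 0.999. Note that the bounds only need the diagonal of the variance matrix, which is why they are cheap compared with a Monte Carlo evaluation of the exact success-rate.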


3.1.3 Z-Transformations

Although $\check{a}_R$ is easy to compute, the rounding estimator suffers from a lack of invariance against integer reparametrizations or the so-called Z-transformations. A matrix is called a Z-transformation if it is one-to-one (i.e., invertible) and integer (Teunissen 1995a). Such transformations leave the integer nature of the parameters intact. By saying that the rounding estimator lacks Z-invariance, we mean that if the float solution is Z-transformed, the integer solution does not transform accordingly. That is, rounding and transforming do not commute:

$$\check{z}_R \ne Z\check{a}_R \quad \text{if} \quad \hat{z} = Z\hat{a} \tag{34}$$

Only in case $Z$ is a permutation matrix, $Z = \Pi$, do we have $\check{z}_R = \Pi\check{a}_R$. In this case, the transformation is a simple reordering of the ambiguities. Also the success-rate lacks Z-invariance. Since the pull-in regions of rounding remain unaffected by the Z-transformation, while the distribution of the float solution changes to $\hat{z} \sim N(z = Za, Q_{\hat{z}\hat{z}} = ZQ_{\hat{a}\hat{a}}Z^T)$, we have, in general,

$$P(\check{z}_R = z) \ne P(\check{a}_R = a) \tag{35}$$

This lack of invariance implies that integer rounding is not optimal in the vectorial case. The lack of invariance does not occur in the scalar case, since multiplication by ˙1 is then the only admissible Z-transformation. Does the mentioned lack of invariance mean that rounding is unfit for GNSS integer ambiguity resolution? No, by no means. Integer rounding is a valid ambiguity estimator, since it obeys the principle of integer equivariance, and it is an attractive estimator, because of its computational simplicity. Whether or not it can be successfully applied in any concrete situation depends solely on the value of its success-rate for that particular situation. What the lack of invariance shows is the nonoptimality of rounding. Despite being nonoptimal, rounding can achieve high success-rates, provided the underlying GNSS model is of sufficient strength and provided the proper ambiguity parametrization is chosen. In section “The ILS Search”, we come back to this issue and see how we can use the existing degrees of freedom of integer parametrization to our advantage.
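The lack of Z-invariance of rounding is easy to see on a two-dimensional example. The sketch below uses a hypothetical float solution and a simple integer matrix with integer inverse; both are chosen only to illustrate that rounding and transforming need not commute.

```python
import math


def nearest_int(x: float) -> int:
    """Round to the nearest integer (ties go upward)."""
    return math.floor(x + 0.5)


def round_vec(v):
    return [nearest_int(x) for x in v]


def mat_vec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# Integer matrix with determinant 1, so its inverse is integer as well.
Z = [[1, 1],
     [0, 1]]
a_hat = [0.6, 0.6]  # hypothetical float ambiguities

z_hat = mat_vec(Z, a_hat)                            # transformed float solution
round_then_transform = mat_vec(Z, round_vec(a_hat))  # Z applied to rounded a_hat
transform_then_round = round_vec(z_hat)              # rounding of Z a_hat

print(round_then_transform, transform_then_round)
```

The two integer vectors differ in their first component, so for this float solution the rounded and transformed solutions disagree, whereas a permutation matrix in place of `Z` would give identical results.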

3.2 Integer Bootstrapping

3.2.1 The Bootstrapping Principle

Integer bootstrapping is a generalization of integer rounding; it combines integer rounding with sequential conditional least-squares estimation and as such takes some of the correlation between the components of the float solution into account. The method goes as follows. If $\hat{a} = (\hat{a}_1, \ldots, \hat{a}_n)^T$, one starts with $\hat{a}_1$ and as before rounds its value to the nearest integer. Having obtained the integer of the first component, the real-valued estimates of all remaining components are then corrected by virtue of their correlation with $\hat{a}_1$. Then the second, but now corrected, real-valued component is rounded to its nearest integer. Having obtained the integer value of this second component, the real-valued estimates of all remaining $n - 2$ components are then again corrected by virtue of their correlation with the second component. This process is continued until all $n$ components are taken care of. We have the following definition.
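Before the formal definition, the procedure just described can be sketched in code. The implementation below is one possible realization under stated assumptions: it performs the sequential conditioning by updating the remaining float components and their variance matrix after each rounding step (a Schur-complement update), and the function name `integer_bootstrap` is hypothetical. It is not necessarily how the estimator is computed in practice.

```python
import math


def nearest_int(x: float) -> int:
    """Round to the nearest integer (ties go upward)."""
    return math.floor(x + 0.5)


def integer_bootstrap(a_hat, q):
    """Sequentially round the components of a_hat.

    a_hat : list of n float ambiguities
    q     : n x n variance matrix of a_hat (list of lists)
    After component i is rounded, the remaining components are corrected
    through their (conditional) covariance with component i.
    """
    n = len(a_hat)
    a = list(a_hat)
    q = [list(row) for row in q]  # work on copies
    fixed = []
    for i in range(n):
        z_i = nearest_int(a[i])
        fixed.append(z_i)
        residual = a[i] - z_i
        # Correct the not-yet-rounded components.
        for j in range(i + 1, n):
            a[j] -= q[j][i] / q[i][i] * residual
        # Conditional variance matrix of the remaining components.
        for j in range(i + 1, n):
            for k in range(i + 1, n):
                q[j][k] -= q[j][i] * q[i][k] / q[i][i]
    return fixed


# Hypothetical, strongly correlated float solution.
print(integer_bootstrap([0.6, 0.4], [[1.0, 0.9], [0.9, 1.0]]))
```

In this example plain component-wise rounding would give (1, 0), whereas the correction of the second component after fixing the first one leads to (1, 1). With a diagonal variance matrix the corrections vanish and the result coincides with integer rounding.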


Definition 5 (Integer bootstrapping). Let $\hat{a} = (\hat{a}_1, \ldots, \hat{a}_n)^T \in \mathbb{R}^n$ be the float solution and let $\check{a}_B = (\check{a}_{B,1}, \ldots, \check{a}_{B,n})^T \in \mathbb{Z}^n$ denote the corresponding integer bootstrapped solution. Then

$$\begin{aligned}
\check{a}_{B,1} &= \lceil \hat{a}_1 \rfloor \\
\check{a}_{B,2} &= \lceil \hat{a}_{2|1} \rfloor = \left\lceil \hat{a}_2 - \sigma_{21}\sigma_1^{-2}(\hat{a}_1 - \check{a}_{B,1}) \right\rfloor \\
&\ \ \vdots \\
\check{a}_{B,n} &= \lceil \hat{a}_{n|N} \rfloor = \left\lceil \hat{a}_n - \sum_{j=1}^{n-1} \sigma_{n,j|J}\,\sigma_{j|J}^{-2}(\hat{a}_{j|J} - \check{a}_{B,j}) \right\rfloor
\end{aligned} \tag{36}$$

where $\hat{a}_{i|I}$ is the least-squares estimator of $a_i$ conditioned on the values of the previous $I = \{1, \ldots, (i-1)\}$ sequentially rounded components, $\sigma_{i,j|J}$ is the covariance between $\hat{a}_i$ and $\hat{a}_{j|J}$, and $\sigma^2_{j|J}$ is the variance of $\hat{a}_{j|J}$. For $i = 1$, $\hat{a}_{i|I} = \hat{a}_1$.

As the definition shows, the bootstrapped estimator can be seen as a generalization of integer rounding. The bootstrapped estimator reduces to integer rounding in case correlations are absent, i.e., in case the variance matrix $Q_{\hat{a}\hat{a}}$ is diagonal. The bootstrapped estimator combines sequential conditional least-squares estimation with integer rounding. If we replace the “least-squares estimation” part by “linear estimation,” we can construct a whole class of sequential integer estimators. This class is defined as follows.

Definition 6 (Sequential integer estimation). Let $\hat{a} = (\hat{a}_1, \ldots, \hat{a}_n)^T \in \mathbb{R}^n$ be the float solution. Then $\check{a} = (\check{a}_1, \ldots, \check{a}_n)^T \in \mathbb{Z}^n$ is a sequential integer estimator of $E(\hat{a}) = a \in \mathbb{Z}^n$ if $\check{a}_i = \lceil \hat{a}_i + \sum_{j=1}^{i-1} r_{ij}(\hat{a}_j - \check{a}_j) \rfloor$, $i = 1, \ldots, n$, or, in vector-matrix form, if

$$\check{a} = \lceil \hat{a} + (R - I_n)(\hat{a} - \check{a}) \rfloor \tag{37}$$

with $R$ a unit lower triangular matrix.

By showing how the bootstrapped estimator can be computed using the triangular factorization of the variance matrix $Q_{\hat{a}\hat{a}}$, it becomes immediately clear that the bootstrapped estimator $\check{a}_B$ is indeed a member of the class of sequential integer estimators. We have the following result (Teunissen 2007a).

Theorem 5 (Bootstrapping and the triangular decomposition). Let $\hat{a} \in \mathbb{R}^n$ be the float solution and let the unit lower triangular decomposition of its variance matrix be given as $Q_{\hat{a}\hat{a}} = LDL^T$. The entries of $L$ and $D$ are then given as
0 the maximizer will have all eigenvalues (singular values) strictly positive. If it were possible to ignore the dependence of the eigenvalues on $X_N$, then both problems would be solved by taking all the eigenvalues $\lambda_j(G)$ of $G$, and all the squares of the singular values $(\sigma_j(Y))^2$ of $Y$, respectively, equal, that is,

$$(\sigma_j(Y))^2 = \lambda_j(G) = \lambda_{avg} = \frac{(L+1)^2}{4\pi}, \quad j = 1, \ldots, N,$$

corresponding to $G = \lambda_{avg} I$, where $I$ is the identity matrix. But this is not possible, except for $L = 1$ and $N = 4$ with the points forming the vertices of the regular tetrahedron, since otherwise it would contradict a celebrated theorem (see Sect. 4.5) on the nonexistence of tight spherical designs. As the Lagrangians $q_j(x)$, given by (22), can be expressed as a ratio of determinants, extremal points have the appealing property that

$$\sup_{x \in \mathbb{S}^2} \max_{1 \le j \le N} |q_j(x)| \le 1,$$

and hence that the Lebesgue constant, see Reimer (2003) for example, satisfies

$$\|\Lambda_N\|_{C(\mathbb{S}^2)} = \sup_{x \in \mathbb{S}^2} \sum_{j=1}^{N} |q_j(x)| \le (L+1)^2.$$

This upper bound is, however, far greater than the optimal order $O(L^{1/2})$ and also greater than the order observed for extremal points in the numerical experiments of Sloan and Womersley (2004). In summary, extremal points are attractive for both interpolation and cubature. In addition, the computed weights $w_j$ are positive and satisfy $w_j / w_{avg} > 0.5$ for all degrees $L \le 165$, where $w_{avg} = 4\pi/(L+1)^2$ denotes the average weight. There is as yet no proof that the weights are positive for all degrees $L$. Extremal points also have good geometrical properties, see Sloan and Womersley (2004). The points and weights are available from Womersley (2007).

4.5 Spherical Designs

Cubature rules with specified degree of precision $L$ and equal weights $w_j = 4\pi/N$ have attracted special interest.

Definition 2. A spherical $L$-design on $\mathbb{S}^2$ is a set $X_N = \{x_1, \ldots, x_N\}$ of $N$ points on $\mathbb{S}^2$ such that

$$\frac{4\pi}{N} \sum_{j=1}^{N} p(x_j) = \int_{\mathbb{S}^2} p(x)\, d\omega(x) \quad \text{for all } p \in \mathbb{P}_L(\mathbb{S}^2).$$


Spherical designs were introduced by Delsarte et al. (1997), who gave the lower bound on the number of points

$$N \ge \begin{cases} \dfrac{(L+1)(L+3)}{4} & \text{if } L \text{ is odd}, \\[2ex] \dfrac{(L+2)^2}{4} & \text{if } L \text{ is even}. \end{cases} \tag{25}$$

A spherical $L$-design that achieves these lower bounds is known as a tight spherical design. However, it is also known that for $\mathbb{S}^2$ tight spherical designs do not exist (Bannai and Damerell 1979) except for $L = 1, 2, 3$, or $5$. On the other hand, it is known (Seymour and Zaslavsky 1984) that for any $L$, spherical $L$-designs always exist if $N$ is sufficiently large. Based on equal weight quadrature for $[-1, 1]$ with the degree of precision $L$, Bajnok (1991) and Korevaar and Meyers (1993) give tensor product constructions of spherical $L$-designs with $N = O(L^3)$. For tabulated designs with modest numbers of points, see Hardin and Sloane (1996) and Sloane (2000). Numerically there is strong evidence that spherical $L$-designs with $N = (L+1)^2$ points exist (Chen and Womersley 2006), and recently Chen et al. (2009) have used interval methods to prove existence of spherical $L$-designs with $N = (L+1)^2$ for all values of $L$ up to 100; but there is no proof yet that spherical $L$-designs with $O(L^2)$ points exist for all degrees $L$. Computed spherical $L$-designs for $L = 1, \ldots, 140$ and symmetric, or antipodal (that is, $x \in X_N \Leftrightarrow -x \in X_N$), spherical $L$-designs for $L = 1, \ldots, 181$ with $N \approx (L+1)^2/2$ are provided by Womersley (2009). Symmetric point sets have the advantage of integrating all spherical harmonics $Y_{\ell,k}$ with $\ell$ odd exactly, thus reducing both the number of equations and the number of variables, as in Sect. 4.2. These provide practical equal weight cubature rules with high degrees of precision. They have approximately the same number of points as the tensor product longitude-latitude rules using a Gauß-Legendre rule for $[-1, 1]$ but without the crowding of points with small weights near the poles.
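The defining property of a spherical $L$-design can be checked numerically for small point sets. The sketch below relies on two standard facts that are not stated above and are used here as assumptions: the polynomials on the sphere of degree at most $L$ are spanned by the restrictions of the monomials $x^a y^b z^c$ with $a + b + c \le L$, and the integral of such a monomial over the unit sphere has a closed form in terms of the Gamma function. The function names are hypothetical.

```python
import math
from itertools import product


def monomial_integral(a: int, b: int, c: int) -> float:
    """Exact integral of x^a y^b z^c over the unit sphere."""
    if a % 2 or b % 2 or c % 2:
        return 0.0  # odd powers integrate to zero by symmetry
    g = math.gamma
    return 2.0 * g((a + 1) / 2) * g((b + 1) / 2) * g((c + 1) / 2) / g((a + b + c + 3) / 2)


def is_spherical_design(points, degree: int, tol: float = 1e-10) -> bool:
    """Check the equal-weight rule with weights 4*pi/N on all monomials up to `degree`."""
    n = len(points)
    for a, b, c in product(range(degree + 1), repeat=3):
        if a + b + c > degree:
            continue
        rule = 4.0 * math.pi / n * sum(x**a * y**b * z**c for x, y, z in points)
        if abs(rule - monomial_integral(a, b, c)) > tol:
            return False
    return True


def design_lower_bound(degree: int) -> int:
    """Lower bound (25) on the number of points of a spherical design."""
    if degree % 2:
        return (degree + 1) * (degree + 3) // 4
    return (degree + 2) ** 2 // 4


# Vertices of the regular octahedron.
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
print(is_spherical_design(octahedron, 3), is_spherical_design(octahedron, 4))
print(design_lower_bound(3))
```

The octahedron passes the test for degree 3 and fails for degree 4. Since it has exactly as many points as the lower bound (25) for $L = 3$, it is an example of one of the exceptional tight designs mentioned above.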

4.6 Number of Points for Rules with Polynomial Accuracy

In this section, we summarize what is known about the number of points $N$ needed for rules of polynomial degree of precision $L$. The longitude-latitude rules of Sect. 4.1 with degree of precision $L$ need $\hat{m}(L+1)$, $\hat{m}(L+1) + 1$, $\hat{m}(L+1) + 2$ points, where $\hat{m} := \lfloor L/2 \rfloor + 1$, for the Gauß-Legendre, Gauß-Radau, and Gauß-Lobatto points, respectively. Thus in all cases the number of points is $N = \tfrac{1}{2}L^2 + O(L)$. For rules with octahedral and icosahedral symmetry it is common to use (McLaren 1963)

$$\eta = \frac{(L+1)^2}{3N}$$

as a measure of the efficiency of the rule. In most cases $\eta$ is close to 1 and the octahedral constructions of Lebedev and Laikov (1999) ensure $\eta \to 1$ as $L \to \infty$. Thus for these rules we have $N = \tfrac{1}{3}L^2 + O(L)$. Interpolatory rules based on extremal point sets have, by definition, $N = (L+1)^2 = L^2 + O(L)$. Although the existence of spherical $L$-designs with $N = O(L^2)$ has yet to be proved for all $L$, spherical $L$-designs with precisely $N = (L+1)^2$ points are now known rigorously to exist


for all $L$ up to 100 (Chen et al. 2009). And there is strong computational evidence that spherical $L$-designs with $N = \tfrac{1}{2}L^2 + O(L)$ exist for $L$ up to 181 (Womersley 2009). Finally, we note the following lower bound on the number of points for a rule with specified degree of precision $L$.

Theorem 2. A rule of the form (14) with degree of precision $L$ has the number of points $N$ bounded below by

$$N \ge \dim(\mathbb{P}_{\lfloor L/2 \rfloor}(\mathbb{S}^2)) = (\lfloor L/2 \rfloor + 1)^2.$$

For a proof see, for example, the survey paper (Cools 1997, Theorem 7.1). For the special case of positive-weight rules, a very simple proof of Theorem 2 is as follows: After renumbering the points, we may assume that $x_1$ is the point of the positive-weight rule (14) that has the largest weight $w_1$. Since $w_1$ is at least as large as the average weight, it follows from (15) that $w_1 \ge 4\pi/N$. The special spherical polynomial

$$p(y) := \sum_{\ell=0}^{\lfloor L/2 \rfloor} \sum_{k=1}^{2\ell+1} Y_{\ell,k}(x_1) Y_{\ell,k}(y) = \sum_{\ell=0}^{\lfloor L/2 \rfloor} \frac{2\ell+1}{4\pi} P_\ell(x_1 \cdot y) \tag{26}$$

has degree $\lfloor L/2 \rfloor$, and hence $p^2$ has degree $2\lfloor L/2 \rfloor \le L$ and is therefore integrated exactly by the cubature rule (14). (The second formula in (26) follows from the addition theorem (5).) Thus, using the positivity of the weights and the fact that the spherical harmonics $Y_{\ell,k}$ are $L_2(\mathbb{S}^2)$-orthonormal,

$$w_1 (p(x_1))^2 \le \sum_{j=1}^{N} w_j (p(x_j))^2 = \int_{\mathbb{S}^2} (p(x))^2\, d\omega(x) = p(x_1),$$

where the last step follows from (26). Canceling $p(x_1)$ on both sides gives $w_1 p(x_1) \le 1$, and using $w_1 \ge 4\pi/N$, together with the second representation in (26) and $P_\ell(1) = 1$, yields

$$\frac{4\pi}{N} p(x_1) \le w_1 p(x_1) \le 1 \implies N \ge 4\pi p(x_1) = \sum_{\ell=0}^{\lfloor L/2 \rfloor} (2\ell + 1) = (\lfloor L/2 \rfloor + 1)^2.$$
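The point counts discussed in this section are easy to tabulate against the lower bound of Theorem 2. The sketch below is purely illustrative; the function names are hypothetical, and the Gauß-Legendre count uses the longitude-latitude formula quoted above.

```python
def lower_bound(degree: int) -> int:
    """Lower bound of Theorem 2: (floor(L/2) + 1)^2."""
    return (degree // 2 + 1) ** 2


def gauss_legendre_count(degree: int) -> int:
    """Points of the longitude-latitude rule with Gauss-Legendre nodes."""
    return (degree // 2 + 1) * (degree + 1)


def extremal_count(degree: int) -> int:
    """Points of an interpolatory rule: dim of the polynomial space, (L + 1)^2."""
    return (degree + 1) ** 2


for degree in (2, 10, 50, 100):
    # The sum of (2l + 1) for l = 0..floor(L/2) reproduces the lower bound.
    assert sum(2 * l + 1 for l in range(degree // 2 + 1)) == lower_bound(degree)
    print(degree, lower_bound(degree), gauss_legendre_count(degree), extremal_count(degree))
```

For large degrees the lower bound grows like a quarter of the square of the degree, the longitude-latitude count like one half, and the interpolatory count like the full square, in line with the asymptotic statements above.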

4.7 Gauß Rules for $\mathbb{S}^2$ Do Not Exist for $L \ge 2$

An often asked natural question is as follows: What is the analogue for the sphere of the Gauß quadrature rules for $[-1, 1]$? The answer is that, on the sphere $\mathbb{S}^2$, cubature rules with the “double-degree-of-precision” property do not exist for $L \ge 2$. The following theorem, related to one of Reimer (1992), states this precisely.

Theorem 3. Assume $L \ge 2$. There exists no positive-weight cubature rule $Q_N$ on $\mathbb{S}^2$ with $N = (L+1)^2$ points and polynomial degree of precision $2L$.

Note that the theorem cannot be proved by appealing to Theorem 2, since with $L$ replaced by $2L$ the lower bound becomes $N \ge (L+1)^2$, giving no contradiction.


In brief, the theorem follows by noting that if $N = (L+1)^2$ and $Q_N$ has degree of precision $2L$, then, as the $Y_{\ell,k}$, $k = 1, \ldots, 2\ell+1$; $\ell = 0, \ldots, L$, form an $L_2(\mathbb{S}^2)$-orthonormal system, the square matrix $Y$ defined by (8) satisfies

$$Y W Y^T = I,$$

where $I$ is the $N \times N$ identity matrix and $W$ is the $N \times N$ diagonal matrix with the cubature weights $w_j$, $j = 1, \ldots, N$, as diagonal elements. From this it follows that $Y$ is invertible and $Y^T = W^{-1} Y^{-1}$, so

$$G := Y^T Y = W^{-1}.$$

In particular, the $j$th diagonal element gives $w_j^{-1} = G_{j,j} = N/(4\pi)$, by (24). Hence the weights are independent of $j$ and the cubature rule $Q_N$, if it exists, is a spherical $2L$-design that achieves the lower bound (25). But, by the previously mentioned result of Bannai and Damerell (1979), tight spherical $2L$-designs on $\mathbb{S}^2$ do not exist for $2L \ge 4$.

The negative result stated in Theorem 3 has one unfortunate consequence for interpolatory cubature rules (Sect. 4.3). Whereas interpolation of trigonometric polynomials of degree $\le L$ at equally spaced points on the circle can be expressed by an explicit formula (because inner products are done exactly), for $L \ge 2$ no such formula exists for polynomial interpolation on $\mathbb{S}^2$, see Sloan (1995).

4.8 Integration Error for Rules with Polynomial Accuracy

In this section, we give error estimates for integrands $f$ with some smoothness. We do this in both a classical setting of functions that belong to $C^k(\mathbb{S}^2)$ (that is, have all derivatives up to order $k$ continuous), and also in the so-called Sobolev space setting $H^s(\mathbb{S}^2)$, where the derivatives up to order $s$ are merely required to be square-integrable. Here $s > 1$ is allowed to be fractional. The space $H^s(\mathbb{S}^2)$ is defined in Sect. 8.2. Note that all the error estimates in this chapter are what are called a priori estimates: they provide a guide to the expected behavior, but, since the constants are not explicit, they tell us nothing about the error in a particular calculation.

Theorem 4. Let $Q_N$ be a positive-weight cubature rule on $\mathbb{S}^2$ with $N$ points and the degree of precision $L \ge 1$. Suppose that $f \in C^k(\mathbb{S}^2)$ and that $N = O(L^2)$. Then there exists a constant $c = c(k, f)$ such that

$$|If - Q_N f| \le \frac{c}{N^{k/2}}.$$

Note that the constant does not depend on $N$ or on the rule $Q_N$, as long as all its weights are positive. (We use $c$ to denote a generic constant from now on.) The theorem is proved in Sect. 8.1. It suggests, correctly, that for functions with reasonable smoothness it pays to increase the degree of precision $L$. A similar result is presented in Sect. 8.3 for functions $f$ in an appropriate Sobolev space.


Theorem 5. Let $Q_N$ be a positive-weight cubature rule on $\mathbb{S}^2$ with $N$ points and degree of precision $L \ge 1$. Suppose that $f \in H^s(\mathbb{S}^2)$ for some $s > 1$, and that $N = O(L^2)$. Then there exists a constant $c = c(s, f)$ such that

$$|If - Q_N f| \le \frac{c}{N^{s/2}}.$$

In the Sobolev space case we also know that the convergence rate $O(N^{-s/2})$ cannot be improved, because of a known lower bound given in Sect. 8.2. (A word of caution about the interpretation of the last remark: for a particular function $f$ the rate usually can be improved at the expense of changing the constant, because a given $f$ lies in more than one Sobolev space.)

5 Integration of Scattered Data

In practical applications, the data will often not be given on a point grid that can be chosen by the user but will instead be given at “scattered points” not under the user's control. Therefore, cubature rules that can handle data at scattered points are of great interest for practical applications. Often the first necessary step is to thin the data (see, for example, Floater and Iske 1996a, b), in order to produce a set of reasonable size and better geometry, as expressed, for example, through the mesh ratio, see Sect. 2.2. This is an important step, but we do not discuss it further.

In Sect. 5.1, we describe two simple rules that can be used with scattered data, one based on the partition into Voronoi cells and the other employing the Delaunay triangulation. In Sect. 5.2, we discuss the construction of positive-weight rules with polynomial degree of precision starting from scattered data points. This has been known to be possible in principle since Mhaskar et al. (2001) and Narcowich et al. (2006) established theoretically the existence of positive-weight cubature rules for scattered points that are exact on $\mathbb{P}_L(\mathbb{S}^2)$, under certain regularity assumptions expressed through the mesh norm and the separation distance. The publications Le Gia and Mhaskar (2008) and Gräf et al. (2009) provide methods of construction and give numerical results, while Filbir and Themistoclakis (2008) provide more concrete bounds for some of the constants in the estimates. Section 5.3 follows the paper Sommariva and Womersley (2005) in developing cubature rules based on the exact integration of a radial basis function interpolant. Later on we explain how the optimal rate of convergence with increasing $N$ can be achieved with both of the strategies in Sects. 5.2 and 5.3, provided that the mesh ratio remains bounded.

5.1 Cubature Based on Voronoi Tessellation or Delaunay Triangulation

Starting with a point set $X_N = \{x_1, \ldots, x_N\}$, we may partition the sphere into Voronoi cells (see Sect. 2.2), labeling as $T_j$ the Voronoi cell that contains $x_j$. Then the rule (17) and the results of Sect. 3.1 can be used, once the areas $|T_j|$ have been computed. A different but popular way to obtain a cubature rule based on a given set of points is to use the Delaunay triangulation $\mathcal{T}$ as the partition, then using

$$Q_N f := \sum_{\Delta_{ijk} \in \mathcal{T}} \frac{f(x_i) + f(x_j) + f(x_k)}{3}\, |\Delta_{ijk}|,$$

where $\Delta_{ijk}$ is the spherical triangle with vertices $x_i$, $x_j$, and $x_k$, whose area has the simple formula

$$|\Delta_{ijk}| = \alpha_i + \alpha_j + \alpha_k - \pi,$$

where $\alpha_i$, $\alpha_j$, and $\alpha_k$ are the angles at the vertices of $\Delta_{ijk}$. The number of times a function value is included depends on the number of Delaunay triangles having it as a vertex. For well-distributed point sets this number is most commonly 6, corresponding to hexagonal Voronoi cells. Note that although the rule based on the Delaunay triangulation is not expressed in the canonical form (14), of a sum of weights times function values, it is clear that if reexpressed in that form then all weights will be positive.
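The triangle-based rule can be sketched as follows. The code assumes that a triangulation of the point set is already available as a list of index triples (computing the Delaunay triangulation itself is not shown), and it evaluates the vertex angles from tangent vectors, which is one of several possible ways to obtain them. All function names are hypothetical.

```python
import math


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _angle_at(a, b, c):
    """Interior angle at vertex a of the spherical triangle with unit vertices a, b, c."""
    # Tangent vectors at a pointing along the great circles towards b and c.
    tb = [bi - _dot(a, b) * ai for ai, bi in zip(a, b)]
    tc = [ci - _dot(a, c) * ai for ai, ci in zip(a, c)]
    cos_angle = _dot(tb, tc) / math.sqrt(_dot(tb, tb) * _dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def triangle_area(a, b, c):
    """Area of a spherical triangle: sum of the vertex angles minus pi."""
    return _angle_at(a, b, c) + _angle_at(b, c, a) + _angle_at(c, a, b) - math.pi


def triangulation_rule(f, points, triangles):
    """Average of f over the three vertices times the triangle area, summed over all triangles."""
    total = 0.0
    for i, j, k in triangles:
        area = triangle_area(points[i], points[j], points[k])
        total += (f(points[i]) + f(points[j]) + f(points[k])) / 3.0 * area
    return total


# Octahedron vertices and its eight octant triangles as a simple test triangulation.
points = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
triangles = [(i, j, k) for i in (0, 1) for j in (2, 3) for k in (4, 5)]
print(triangulation_rule(lambda x: 1.0, points, triangles))
```

For the constant function the rule returns the total area of the sphere, because the areas of the triangles of any triangulation add up to $4\pi$.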

5.2 Cubature Rules with Polynomial Accuracy

Now we discuss the construction of rules with prescribed polynomial degree of precision from scattered data. Consider a set $X_N = \{x_1, \ldots, x_N\}$ of $N$ distinct scattered points on the sphere $\mathbb{S}^2$. As discussed in Sect. 2.2, the quality of the geometric distribution of this point set is usually measured by considering the mesh norm $h_{X_N}$ (see (10)), the separation distance $\delta_{X_N}$ (see (11)), and the mesh ratio $\rho_{X_N} := 2h_{X_N}/\delta_{X_N} \ge 1$. In Mhaskar et al. (2001), the existence of positive-weight cubature rules with polynomial degree of precision $L$ from a given set of scattered points is proved, under the assumption that the points have a sufficiently uniform point distribution. The results in Mhaskar et al. (2001) are improved in Narcowich et al. (2006). Corollary 4.4 (see also Theorem 4.3) in the latter paper can be expressed as follows.

Theorem 6. There exists a constant $c$ such that for every $\rho \ge 1$ and every $0 < \eta < \tfrac{1}{2}$ the following holds: For every point set $X_N = \{x_1, \ldots, x_N\} \subset \mathbb{S}^2$ whose mesh ratio satisfies $\rho_{X_N} \le \rho$ and for every $L \ge 1$ such that

$$h_{X_N} \le c\, \eta\, \rho^{-1} L^{-1}, \tag{27}$$

there exist positive weights $w_1, \ldots, w_N$ such that the cubature rule

$$Q_N f := \sum_{j=1}^{N} w_j f(x_j)$$

is exact on $\mathbb{P}_L(\mathbb{S}^2)$, that is,

$$Q_N p = \sum_{j=1}^{N} w_j p(x_j) = \int_{\mathbb{S}^2} p(x)\, d\omega(x) \quad \text{for all } p \in \mathbb{P}_L(\mathbb{S}^2).$$

The weights $w_1, \ldots, w_N$ satisfy the estimate


 1  2 2 2 4  hXN  wj  4 ; 1 .bL=2c C 1/2 

j D 1; : : : ; N:

(28)

If for a given point set $X_N$ the polynomial degree $L$ in Theorem 6 is chosen as large as possible, then from (27) $h_{X_N} \asymp L^{-1}$, and from (28) the weights satisfy

$$w_j \asymp L^{-2} \asymp h_{X_N}^2.$$

The notation $a_L \asymp b_L$ for two sequences $\{a_L\}$ and $\{b_L\}$ means that there exist positive constants $c_1$ and $c_2$, independent of $L$, such that $c_1 b_L \le a_L \le c_2 b_L$ for all $L$. Finally it follows from the boundedness of the mesh ratio, (12) and (13), that the number of points satisfies

$$N \asymp h_{X_N}^{-2} \asymp L^2 \asymp \dim(\mathbb{P}_L(\mathbb{S}^2)).$$

Thus, the error estimates in both Theorems 4 and 5 apply to the resulting positive-weight rule. A similar result to Theorem 6 is stated in Gräf et al. (2009, Lemma 2.1), and a result with more explicit constants is given by Filbir and Themistoclakis (2008, Theorem 5.1). The first results in Mhaskar et al. (2001) on the computation of the positive-weight rules whose existence is guaranteed by Theorem 6 yielded rules of relatively low degree, but in Le Gia and Mhaskar (2008) and Gräf et al. (2009) numerical results are reported that give positive-weight rules with scattered points and high polynomial degree of precision L. For approaches to the actual computation of the positive weights, the reader is referred to these publications. It would go beyond the scope of this survey paper to explain the ideas behind the proof of Theorem 6.

5.3 Cubature Based on Spherical Radial Basis Functions

Given function values $f(x_1), \dots, f(x_N)$ at scattered data points $x_1, \dots, x_N \in \mathbb{S}^2$, Sommariva and Womersley (2005) propose estimating the integral by integrating exactly a spherical radial basis function (SBF) approximation of the form

$$f(x) \approx \sum_{j=1}^{N} a_j\, \phi(x, x_j), \qquad x \in \mathbb{S}^2.$$

Here $\phi$ is a certain (strictly) positive definite zonal kernel, that is, $\phi(x, y)$ depends only on $x \cdot y$ (or equivalently on $\operatorname{dist}(x, y) = \arccos(x \cdot y)$), and $\phi$ satisfies

$$\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j\, \phi(y_i, y_j) \ge 0 \tag{29}$$

for all distinct points $y_j \in \mathbb{S}^2$, $j = 1, \dots, N$, and all $\alpha_j \in \mathbb{R}$, $j = 1, \dots, N$, and all $N \in \mathbb{N}$, with equality in (29) if and only if $\alpha_1 = \alpha_2 = \dots = \alpha_N = 0$. From the positive definiteness property it follows that the SBF interpolant at the given data points

$$\Lambda_N f = \sum_{i=1}^{N} a_i\, \phi(\cdot, x_i), \qquad (\Lambda_N f)(x_j) = f(x_j), \quad j = 1, \dots, N, \tag{30}$$

is well defined, since the coefficients are determined by the linear system

$$\sum_{i=1}^{N} a_i\, \phi(x_j, x_i) = f(x_j), \qquad j = 1, \dots, N, \tag{31}$$

in which the matrix is positive definite. Once the coefficients $a_i$ in the interpolant (30) are known, an approximation to the integral follows easily by integrating the interpolant,

$$Q_N f := \int_{\mathbb{S}^2} (\Lambda_N f)(x)\, d\omega(x) = \sum_{i=1}^{N} a_i\, \beta, \tag{32}$$

where

$$\beta := \int_{\mathbb{S}^2} \phi(x, x_i)\, d\omega(x), \qquad i = 1, \dots, N,$$

which is easily seen to be independent of $x_i$. In the paper Sommariva and Womersley (2005), a number of different possibilities for $\phi$ are considered, but in the present treatment we consider only SBFs that are the restrictions to $\mathbb{S}^2$ of compactly supported radial basis functions (RBFs) on $\mathbb{R}^3$. Thus if the RBF is given by $\Psi(x, y) = \psi(\|x - y\|)$ for $x, y \in \mathbb{R}^3$, then

$$\phi(x, y) = \Psi(x, y) = \psi\bigl(\sqrt{2(1 - x \cdot y)}\bigr), \qquad x, y \in \mathbb{S}^2, \tag{33}$$

because for $x, y \in \mathbb{S}^2 \subset \mathbb{R}^3$ we have $\|x\| = \|y\| = 1$ and hence $\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2x \cdot y = 2 - 2x \cdot y$. Then, from (3) (with the polar axis taken through $x_i$), we have

$$\beta = 2\pi \int_{-1}^{1} \psi\bigl(\sqrt{2(1 - z)}\bigr)\, dz. \tag{34}$$

We choose to restrict ourselves to one of the following compactly supported RBFs introduced by Wendland (1995):

$$\psi_1(t) := (1 - t)_+^2, \qquad \psi_2(t) := (1 - t)_+^4 (4t + 1), \qquad \psi_3(t) := (1 - t)_+^6 (35t^2 + 18t + 3), \tag{35}$$

where $a_+ := a$ if $a \ge 0$, $a_+ := 0$ if $a < 0$.
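For the three unscaled Wendland functions in (35), the constant in (34) can be checked numerically. The sketch below assumes the substitution $t = \sqrt{2(1 - z)}$, under which (34) becomes $2\pi \int_0^1 t\,\psi(t)\, dt$ (the integrand vanishes for $t \ge 1$), and uses a composite Simpson rule; the function names are illustrative. A short hand calculation with the same substitution gives the values $\pi/6$, $\pi/7$, and $\pi/3$ for the three functions, respectively.

```python
import math

def wendland(i, t):
    """Unscaled Wendland functions from (35), i = 1, 2, 3."""
    cut = max(0.0, 1.0 - t)                     # truncated power (1 - t)_+
    if i == 1:
        return cut ** 2
    if i == 2:
        return cut ** 4 * (4.0 * t + 1.0)
    if i == 3:
        return cut ** 6 * (35.0 * t * t + 18.0 * t + 3.0)
    raise ValueError("i must be 1, 2, or 3")

def beta(i, n=2000):
    """Constant from (34), computed as 2*pi * int_0^1 t * psi_i(t) dt.

    Uses the composite Simpson rule with n (even) subintervals.
    """
    h = 1.0 / n
    def g(t):
        return t * wendland(i, t)
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return 2.0 * math.pi * total * h / 3.0

betas = [beta(i) for i in (1, 2, 3)]            # close to pi/6, pi/7, pi/3
```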


There are obvious computational attractions in using these RBFs: the coefficient matrix in (31) is then sparse (because of the compactness of the support), the nonzero elements are available analytically through (33) and (35), and $\beta$ can be evaluated easily using (34). There is an implicit choice of scale in (35), in that each $\psi_i(t)$ could clearly be replaced by $\psi_i(t/\eta)$ for any $\eta > 0$. At this time, little attention has been paid in the literature to the critical question of how the scale should be chosen, see however Le Gia et al. (2009). For simplicity, we will consider only the unscaled version. The choice of which of the three Wendland functions to use in a particular situation should be dictated by the smoothness of the target function $f$, since, see Sect. 8.4, each of $\psi_1$, $\psi_2$, $\psi_3$ is associated in a natural way with a Sobolev space $H^{s_i}(\mathbb{S}^2)$, $i = 1, 2, 3$, with $s_i = \frac{3}{2}, \frac{5}{2}, \frac{7}{2}$, respectively. The following theorem appears to be new.

Theorem 7. Let $i \in \{1, 2, 3\}$, and let $s = s_i$ where $s_1 = \frac{3}{2}$, $s_2 = \frac{5}{2}$, and $s_3 = \frac{7}{2}$. Let $H^s(\mathbb{S}^2)$ denote the Sobolev space of order $s$ endowed with its usual norm $\|\cdot\|_{H^s}$. Consider a sequence of point sets $X_N$ with mesh norm $h_{X_N} = O(N^{-1/2})$ as $N \to \infty$. For $f \in H^s(\mathbb{S}^2)$, let $Q_N$ be the cubature rule constructed as in (30), (31), and (32), with $\phi(x, y) = \psi_i\bigl(\sqrt{2(1 - x \cdot y)}\bigr)$. Then there exists $c > 0$ such that

$$|If - Q_N f| \le \frac{c}{N^{s/2}}\, \|f\|_{H^s}.$$

The theorem is proved in Sect. 8.4. We note that the convergence rate is the same as obtained with rules of polynomial accuracy in Theorem 5 if $f \in H^s(\mathbb{S}^2)$, and that the rate is optimal in this space, since it coincides with the lower bound given by Theorem 10.

The SBF cubature rule in (32) is not presented in the canonical form (14), of a sum of weights times function values, but it can be easily rewritten in that form. There is no practical necessity to do this, but to understand the connection and get formulas for the weights, we use again the “Lagrangians” and proceed in analogy to Sect. 4.3, noting, however, that the interpolant and the Lagrangians are no longer polynomials. Let the finite-dimensional approximation space be

$$V_N := \operatorname{span}\{\phi(\cdot, x_j) : j = 1, \dots, N\},$$

and define the Lagrangians $q_j \in V_N$, $q_j(x_k) = \delta_{j,k}$, $k = 1, \dots, N$. If we represent now the SBF interpolant (30) in the form

$$\Lambda_N f = \sum_{j=1}^{N} f(x_j)\, q_j,$$

then integration over $\mathbb{S}^2$ yields for $Q_N$ the formula

$$Q_N f := \int_{\mathbb{S}^2} (\Lambda_N f)(x)\, d\omega(x) = \sum_{j=1}^{N} w_j f(x_j), \qquad \text{with } w_j := \int_{\mathbb{S}^2} q_j(x)\, d\omega(x).$$

It is easily seen that the weights $w_j$, $j = 1, \dots, N$, are the solutions of the linear system

$$\sum_{j=1}^{N} \phi(x_k, x_j)\, w_j = \int_{\mathbb{S}^2} \phi(x_k, x)\, d\omega(x) = \beta, \qquad k = 1, \dots, N,$$

which expresses the obvious fact that the rule QN is exact for all functions in VN . The weights wj for this cubature rule are not necessarily positive, but in the extensive numerical experiments of Sommariva and Womersley (2005) most weights turned out to be positive, and the negative weights were found to have an insignificant effect on the stability of the rule. In the above construction, interpolation can be replaced by approximation, the most natural approach being to replace the interpolant ƒN f of the integrand f in (32) by a smoothed approximation of f , one that respects but does not interpolate the data. Such an approach is especially relevant for applications with noisy data, but we do not consider it further.
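The two equivalent forms of the SBF rule can be compared on a small example. The following is a minimal sketch, assuming the first Wendland function $(1 - t)_+^2$ from (35) as the radial profile, the value $\pi/6$ for the constant in (34) (obtained by hand from the substitution $t = \sqrt{2(1 - z)}$), and a spiral point set standing in for scattered data. The dense Gaussian elimination in `solve` is only for illustration and ignores the sparsity mentioned above; all function names are hypothetical.

```python
import math

BETA = math.pi / 6.0     # constant from (34) for the profile (1 - t)_+^2

def kernel(x, y):
    """Zonal kernel (33) built from the Wendland function (1 - t)_+^2."""
    dot = sum(a * b for a, b in zip(x, y))
    t = math.sqrt(max(0.0, 2.0 * (1.0 - dot)))
    return max(0.0, 1.0 - t) ** 2

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting (dense, small N only)."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor != 0.0:
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / a[r][r]
    return x

def spiral_points(n):
    """Stand-in for scattered data points: a spiral point set on the sphere."""
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    pts = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = 2.0 * math.pi * k / golden
        pts.append((r * math.cos(phi), r * math.sin(phi), z))
    return pts

points = spiral_points(60)
values = [1.0 + x[2] ** 2 for x in points]       # samples of f(x) = 1 + x_3^2
gram = [[kernel(xj, xi) for xi in points] for xj in points]

coeffs = solve(gram, values)                     # linear system (31)
q_from_coeffs = BETA * sum(coeffs)               # rule in the form (32)

weights = solve(gram, [BETA] * len(points))      # linear system for the weights
q_from_weights = sum(w * v for w, v in zip(weights, values))
```

Because the kernel matrix is symmetric, both forms give the same number up to rounding; the weight form has the advantage that the weights can be reused for any other set of function values at the same points.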

6 Rules with Special Properties

In this section, we consider rules based on equal-area partitions and centroidal Voronoi tessellations.

6.1 Equal-Area Points

The “equal-area points” (Rakhmanov et al. 1994) aim to achieve a partition $\mathcal{T}$ of the sphere (16) into a user-chosen number $N$ of subsets $T_j$ each of which has the same area

$$|T_j| = \frac{4\pi}{N}, \qquad j = 1, \dots, N,$$

and a well-behaved diameter $\operatorname{diam}(T_j)$,

$$\operatorname{diam}(T_j) \le \frac{c}{\sqrt{N}},$$

for some constant $c > 0$. Within each $T_j$ (and roughly in the center of $T_j$) there is a chosen point $x_j \in T_j$. The natural cubature rule associated with a set of $N$ equal-area points $\{x_1, \dots, x_N\}$ is the equal weight rule (a special case of (17))

$$Q_N f := \frac{4\pi}{N} \sum_{j=1}^{N} f(x_j).$$

Then from (18) we have

$$|If - Q_N f| \le \frac{4\pi c\, \sigma}{\sqrt{N}},$$

for any Lipschitz continuous function $f$ with Lipschitz constant $\sigma$.


In brief, the construction used in Rakhmanov et al. (1994) first partitions the sphere into “zones” defined by lines of constant latitude, with the set of separating latitudes depending in a precise way on the selected value of $N$. Then each zone is divided into $m_j$ congruent “spherical rectangles,”

$$T_{j,k} = \left\{ x(\theta, \phi) : \alpha_j < \theta < \beta_j,\ \gamma_j + \frac{2\pi(k - 1)}{m_j} < \phi < \gamma_j + \frac{2\pi k}{m_j} \right\},$$

$k = 1, 2, \dots, m_j$, for some $\alpha_j, \beta_j \in [0, \pi]$, $\gamma_j \in [0, 2\pi]$, and $m_j \in \mathbb{N}$. The point $x_j$ is chosen to be at the parametric midpoints,

$$x_j = x\left( \frac{\alpha_j + \beta_j}{2},\ \gamma_j + \frac{\pi(2k - 1)}{m_j} \right).$$
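The zonal structure can be illustrated with a simplified construction. The sketch below is not the construction of Rakhmanov et al. (1994): the list `cells_per_zone` of values $m_j$ is a hypothetical user choice, and the separating latitudes are merely chosen so that every cell has area $4\pi/N$, with no attempt to control the cell diameters. The points are the parametric midpoints from the formula above, with the same offset $\gamma_j$ in every zone.

```python
import math

def zone_points(cells_per_zone, gamma=0.0):
    """Parametric midpoints of an equal-area partition into latitude zones.

    cells_per_zone[j] is the number m_j of cells in zone j (north to south).
    """
    n_total = sum(cells_per_zone)
    points = []
    used = 0
    for m in cells_per_zone:
        # A zone alpha < theta < beta has area 2*pi*(cos(alpha) - cos(beta));
        # requiring m cells of area 4*pi/N fixes cos(beta).
        alpha = math.acos(max(-1.0, min(1.0, 1.0 - 2.0 * used / n_total)))
        used += m
        beta = math.acos(max(-1.0, min(1.0, 1.0 - 2.0 * used / n_total)))
        theta = 0.5 * (alpha + beta)
        for k in range(1, m + 1):
            phi = gamma + math.pi * (2 * k - 1) / m
            points.append((math.sin(theta) * math.cos(phi),
                           math.sin(theta) * math.sin(phi),
                           math.cos(theta)))
    return points

def equal_weight_rule(f, points):
    """Equal weight rule: (4*pi/N) times the sum of the function values."""
    return 4.0 * math.pi / len(points) * sum(f(x) for x in points)

points = zone_points([1, 6, 12, 16, 12, 6, 1])           # N = 54 cells
q_one = equal_weight_rule(lambda x: 1.0, points)          # equals 4*pi
q_z2 = equal_weight_rule(lambda x: x[2] ** 2, points)     # near 4*pi/3
```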

6.2 Centroidal Voronoi Tessellations

Another example of an equal-area construction is based on a centroidal Voronoi tessellation (Du et al. 1999). This means that the points $x_j$, $j = 1, \dots, N$, are chosen so that the corresponding Voronoi cells (see Sect. 2.2) have area $4\pi/N$. This again is just a special case of Sect. 3.1. For the nontrivial task of obtaining a centroidal Voronoi tessellation, we refer to the cited reference.

7 Integration Over Subsets

In this section we discuss local cubature rules, that is, rules for numerical integration over local subsets of the sphere with all points located in the subset. The three types of subsets that we consider are spherical caps and spherical collars (in Sect. 7.1), spherical triangles (in Sect. 7.2), and spherical rectangles (in Sect. 7.3).

7.1 Integration Over Spherical Caps and Collars

In Hesse and Womersley (2009), tensor product rules for numerical integration over spherical caps $C(z, \gamma)$ with positive weights and an arbitrarily high polynomial degree of precision are constructed and investigated. The construction can be extended to the so-called spherical collars

$$C(z, \gamma_1, \gamma_2) := \left\{ x \in \mathbb{S}^2 : \gamma_1 \le \operatorname{dist}(x, z) \le \gamma_2 \right\},$$

where $0 < \gamma_1 < \gamma_2 < \pi$. The construction of the rules follows the general principle for the construction of tensor product rules for product domains (see comments in Sect. 4.1 and Stroud 1971, chap. 2). First we observe that, given a cubature rule for the spherical cap $C(z, \gamma)$

$$Q_N f := \sum_{j=1}^{N} w_j f(x_j),$$

with points $x_j \in C(z, \gamma)$ and weights $w_j$, $j = 1, \dots, N$, then a cubature rule for any other spherical cap $C(w, \gamma)$ of the same radius $\gamma$ can be obtained by “rotating the cubature rule” for $C(z, \gamma)$. This is achieved by introducing a $3 \times 3$ orthogonal matrix $R$ that maps $z$ onto $w$ and hence $C(z, \gamma)$ onto $C(w, \gamma)$, and then defining the rule for $C(w, \gamma)$ by

$$\widetilde{Q}_N g := \sum_{j=1}^{N} w_j\, g(y_j), \qquad \text{with } y_j = R x_j, \quad j = 1, \dots, N.$$

The same strategy can also be applied to map a rule for the spherical collar $C(z, \gamma_1, \gamma_2)$ to another spherical collar $C(w, \gamma_1, \gamma_2)$. It is easily seen that the rule $\widetilde{Q}_N$ inherits polynomial precision from the rule $Q_N$. For this reason we need only construct cubature rules for spherical caps and collars with center $z$ at the north pole $e_3 = (0, 0, 1)^T$. In the coordinates (2), the spherical cap $C(e_3, \gamma)$ is the product domain $[\cos\gamma, 1] \times [0, 2\pi]$ and the spherical collar $C(e_3, \gamma_1, \gamma_2)$ is the product domain $[\cos\gamma_2, \cos\gamma_1] \times [0, 2\pi]$. From (3), it is easily seen that in the coordinates $(z, \phi)$ (see (2))

$$\int_{C(e_3, \gamma)} f(x)\, d\omega(x) = \int_0^{2\pi} \int_{\cos\gamma}^{1} f(x(z, \phi))\, dz\, d\phi,$$

and

$$\int_{C(e_3, \gamma_1, \gamma_2)} f(x)\, d\omega(x) = \int_0^{2\pi} \int_{\cos\gamma_2}^{\cos\gamma_1} f(x(z, \phi))\, dz\, d\phi.$$

In analogy to Sect. 4.1, a cubature rule can be constructed as the tensor product of two suitable rules for the two intervals. We explain this explicitly for the case of the spherical cap $C(e_3, \gamma)$; the case of the spherical collar $C(e_3, \gamma_1, \gamma_2)$ is completely analogous. We choose now the trapezoidal rule (19) for the interval $[0, 2\pi]$. For the interval $[\cos\gamma, 1]$, we choose a positive-weight rule

$$Q_{[\cos\gamma, 1]}\, h := \sum_{i=1}^{m} b_i\, h(z_i) \tag{36}$$

that is exact for all algebraic polynomials of degree $\le L$. A rule (36) with the mentioned properties can easily be obtained by “scaling,” for example, the Gauß-Legendre rule to the interval $[\cos\gamma, 1]$. Forming the tensor product of these two rules yields the tensor product rule

$$Q_N f := \frac{2\pi}{L + 1} \sum_{j=0}^{L} \sum_{i=1}^{m} b_i\, f\left( x\left( z_i, \frac{2\pi j}{L + 1} \right) \right),$$

with $N = (L + 1)m$ for the Gauß-Legendre rule. This rule $Q_N$ has positive weights, all points in the spherical cap $C(e_3, \gamma)$, and degree of precision $L$.

Note that for a spherical cap with $\gamma < \pi$ and if the Gauß-Radau rule or the Gauß-Lobatto rule is used for $[-1, 1]$, then one or both of the end points are included. The point $z = 1$ corresponds to the center of the cap, so is only included once with its weight multiplied by $L + 1$, while the point $z = -1$, which is “scaled” to $\cos\gamma$, corresponds to the boundary of the cap, and $L + 1$ points are distributed equally around the boundary of the cap. For a spherical collar with $0 < \gamma_1 < \gamma_2 < \pi$ both end points correspond to boundaries, and $L + 1$ points are distributed equally around one (Gauß-Radau) or both (Gauß-Lobatto) boundaries.

The implementation of these tensor product rules is very simple, and in Hesse and Womersley (2009) numerical experiments for the spherical cap are reported which illustrate that these rules are easy to use and perform well.
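The tensor product rule for a cap centered at the north pole can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation of Hesse and Womersley (2009): it uses the Gauß-Legendre rule with $m = \lfloor L/2 \rfloor + 1$ points (so that $2m - 1 \ge L$), computes its nodes by Newton's method on the Legendre polynomial, and scales them affinely to $[\cos\gamma, 1]$. The names `gauss_legendre` and `cap_rule` are illustrative.

```python
import math

def gauss_legendre(m):
    """Nodes and weights of the m-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, m + 1):
        x = math.cos(math.pi * (i - 0.25) / (m + 0.5))   # initial guess
        for _ in range(100):
            p0, p1 = 1.0, x
            for k in range(2, m + 1):                    # Legendre recurrence
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = m * (x * p1 - p0) / (x * x - 1.0)       # derivative of P_m
            dx = p1 / dp
            x -= dx                                      # Newton step
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def cap_rule(gamma, degree):
    """Tensor product rule for the cap C(e3, gamma) as (weight, point) pairs."""
    m = degree // 2 + 1                  # Gauss-Legendre exact to degree 2m - 1
    t, b = gauss_legendre(m)
    lo = math.cos(gamma)
    half = 0.5 * (1.0 - lo)              # scaling factor from [-1, 1] to [lo, 1]
    rule = []
    for ti, bi in zip(t, b):
        z = lo + half * (ti + 1.0)
        r = math.sqrt(max(0.0, 1.0 - z * z))
        for j in range(degree + 1):      # trapezoidal rule in the azimuth
            phi = 2.0 * math.pi * j / (degree + 1)
            w = 2.0 * math.pi / (degree + 1) * half * bi
            rule.append((w, (r * math.cos(phi), r * math.sin(phi), z)))
    return rule

rule = cap_rule(1.0, 4)                              # gamma = 1, L = 4, N = 15
cap_area = sum(w for w, _ in rule)                   # 2*pi*(1 - cos(1))
```

A rule for a cap with a different center is then obtained by applying a rotation to the points and keeping the weights, as described above.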

7.2 Integration Over Spherical Triangles

Given a spherical triangle $\Delta$ defined by three vertices $v_1, v_2, v_3 \in \mathbb{S}^2$, a natural approach (see, for example, Atkinson 1982; Boal and Sayas 2004) is to use a one-to-one map of the standard triangle $T_0$ in the plane with vertices $(0, 0)$, $(1, 0)$, and $(0, 1)$ to the spherical triangle $\Delta$, and then use a cubature rule for a planar triangle. Such a map is given by $\widehat{u}$, where

$$u(x, y) := v_1 + x(v_2 - v_1) + y(v_3 - v_1), \qquad \widehat{u}(x, y) := \frac{u(x, y)}{\|u(x, y)\|}, \qquad (x, y) \in T_0.$$

The determinant of the Jacobian of the transformation is

$$J(x, y) = \left| \frac{\partial \widehat{u}(x, y)}{\partial x} \times \frac{\partial \widehat{u}(x, y)}{\partial y} \right|,$$

where

$$\frac{\partial \widehat{u}(x, y)}{\partial x} = \frac{1}{\|u(x, y)\|}\,(v_2 - v_1) - \frac{u(x, y) \cdot (v_2 - v_1)}{\|u(x, y)\|^3}\, u(x, y),$$

$$\frac{\partial \widehat{u}(x, y)}{\partial y} = \frac{1}{\|u(x, y)\|}\,(v_3 - v_1) - \frac{u(x, y) \cdot (v_3 - v_1)}{\|u(x, y)\|^3}\, u(x, y).$$

Then

$$\int_{\Delta} f(x)\, d\omega(x) = \int_{T_0} f(\widehat{u}(x, y))\, J(x, y)\, dx\, dy.$$

Two basic rules for a spherical triangle $\Delta$ are

• Area times the average of the function values at the vertices, which gives

$$\int_{\Delta} f(x)\, d\omega(x) \approx |\Delta|\, \frac{f(v_1) + f(v_2) + f(v_3)}{3}.$$

• Area times the function value at the centroid $(x, y) = \left(\frac{1}{3}, \frac{1}{3}\right)$, which corresponds to $x_c = (v_1 + v_2 + v_3)/\|v_1 + v_2 + v_3\|$, giving

$$\int_{\Delta} f(x)\, d\omega(x) \approx |\Delta|\, f(x_c).$$

Note that the first choice uses only the function values at the vertices of the triangle, while the second rule requires a new function value. A rule that uses the vertices, the midpoints of the sides, and the centroid comes from the following rule for the planar triangle $T_0$ with $g(x, y) = f(\widehat{u}(x, y))\, J(x, y)$, which is exact for polynomials of degree 3:

$$\int_{T_0} g(x, y)\, dx\, dy \approx \frac{1}{40}\bigl[ g(0, 0) + g(1, 0) + g(0, 1) \bigr] + \frac{1}{15}\left[ g\left(0, \tfrac{1}{2}\right) + g\left(\tfrac{1}{2}, 0\right) + g\left(\tfrac{1}{2}, \tfrac{1}{2}\right) \right] + \frac{9}{40}\, g\left(\tfrac{1}{3}, \tfrac{1}{3}\right).$$

See Stroud (1971) and Cools and Rabinowitz (1993) for more information on rules for triangular regions in the plane, including rules with polynomial exactness. A spherical triangle may be subdivided into four subtriangles in a number of ways, for example, by joining the midpoints of each edge by arcs of great circles. Starting with a regular triangulation of the sphere, for example, that corresponding to the faces of the tetrahedron, octahedron, or icosahedron, this will produce a cubature rule for the whole sphere (Atkinson 1982; Baumgardner and Frederickson 1985). Using an error estimate to decide which triangles to subdivide can produce an adaptive method (Boal and Sayas 2004).
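The rules of this section can be tried out on a single triangle. The sketch below rests on two facts that are not stated in the text and are used here as assumptions: the area of a spherical triangle is computed from $\tan(|\Delta|/2) = |v_1 \cdot (v_2 \times v_3)| / (1 + v_1 \cdot v_2 + v_2 \cdot v_3 + v_3 \cdot v_1)$, and the Jacobian of the radial projection is evaluated in the closed form $|v_1 \cdot (v_2 \times v_3)| / \|u(x, y)\|^3$, which follows from the derivative formulas for the map. All function names are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_area(v1, v2, v3):
    """Area of the spherical triangle with unit-vector vertices v1, v2, v3."""
    num = abs(dot(v1, cross(v2, v3)))
    den = 1.0 + dot(v1, v2) + dot(v2, v3) + dot(v3, v1)
    return 2.0 * math.atan2(num, den)

def vertex_rule(f, v1, v2, v3):
    """Area times the average of the function values at the vertices."""
    return triangle_area(v1, v2, v3) * (f(v1) + f(v2) + f(v3)) / 3.0

def centroid_rule(f, v1, v2, v3):
    """Area times the function value at the projected centroid."""
    s = tuple(a + b + c for a, b, c in zip(v1, v2, v3))
    r = math.sqrt(dot(s, s))
    return triangle_area(v1, v2, v3) * f(tuple(c / r for c in s))

def seven_point_planar(g):
    """Degree-3 rule on the standard triangle (vertices, midpoints, centroid)."""
    verts = g(0.0, 0.0) + g(1.0, 0.0) + g(0.0, 1.0)
    mids = g(0.0, 0.5) + g(0.5, 0.0) + g(0.5, 0.5)
    return verts / 40.0 + mids / 15.0 + 9.0 * g(1.0 / 3.0, 1.0 / 3.0) / 40.0

def mapped_integrand(f, v1, v2, v3):
    """g(x, y) = f(u_hat(x, y)) * J(x, y) for the radial projection map."""
    det = abs(dot(v1, cross(v2, v3)))
    def g(x, y):
        u = tuple(a + x * (b - a) + y * (c - a) for a, b, c in zip(v1, v2, v3))
        r = math.sqrt(dot(u, u))
        return f(tuple(c / r for c in u)) * det / r ** 3
    return g

# One octant of the sphere as the spherical triangle (area pi/2).
tri = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
f = lambda p: 1.0 + p[2]
area = triangle_area(*tri)
q_vertex = vertex_rule(f, *tri)
q_centroid = centroid_rule(f, *tri)
q_seven = seven_point_planar(mapped_integrand(f, *tri))
```

An octant is a very large triangle, so none of the three values should be expected to be accurate; the sketch is meant to be combined with the subdivision strategy described in the text.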

7.3 Integration Over Spherical Rectangles

For spherical rectangles, in particular those defined by longitude-latitude boundaries, the natural strategy is to use the same parametrization as in (3) and a tensor product of one-dimensional rules, for example, the Gauß rules (Stroud 1971), for the resulting planar rectangle. For other rules with (planar) polynomial accuracy, see Cools and Rabinowitz (1993).

8 Error of Numerical Integration

8.1 Error Analysis Based on Best Uniform Approximation

The following easy error bound holds for all rules $Q_N$ with the degree of precision $L$. In this theorem, $E_L(f)$ denotes the error of best uniform approximation of a continuous function $f$ by a polynomial of degree $\le L$,

$$E_L(f) := \inf_{p \in \mathbb{P}_L(\mathbb{S}^2)} \|f - p\|_{C(\mathbb{S}^2)}.$$

Theorem 8. Assume $Q_N$ is a cubature rule of the form (14) with the degree of precision $L$. Then

$$|If - Q_N f| \le \left( 4\pi + \sum_{j=1}^{N} |w_j| \right) E_L(f), \qquad f \in C(\mathbb{S}^2).$$

The theorem follows easily by observing that, from the linearity of $I$ and $Q_N$ and the exactness of $Q_N$ for $p \in \mathbb{P}_L(\mathbb{S}^2)$,

$$If - Q_N f = I(f - p) - Q_N(f - p), \qquad p \in \mathbb{P}_L(\mathbb{S}^2),$$

and hence, since $p \in \mathbb{P}_L(\mathbb{S}^2)$ is arbitrary,

$$|If - Q_N f| \le \left( \|I\| + \|Q_N\| \right) \inf_{p \in \mathbb{P}_L(\mathbb{S}^2)} \|f - p\|_{C(\mathbb{S}^2)},$$

where the operator norms of the bounded linear functionals $I$ and $Q_N$ are for $C(\mathbb{S}^2)$ endowed with the supremum norm $\|\cdot\|_{C(\mathbb{S}^2)}$. Since $\|I\| = 4\pi$ and $\|Q_N\| = \sum_{j=1}^{N} |w_j|$, the result follows.

Bounds for $E_L(f)$ are given by the so-called Jackson theorems. For the special case of the sphere $\mathbb{S}^2$ a weakened but easily stated form of the Jackson theorem obtained by Ragozin (1971) is given in Theorem 9.

Theorem 9. Suppose that $f$ belongs to $C^k(\mathbb{S}^2)$. Then there exists a constant $c = c(k, f)$ such that

$$E_L(f) \le \frac{c}{L^k}.$$

Here, the space $C^k(\mathbb{S}^2)$ is the space of functions on $\mathbb{S}^2$ with all (tangential) derivatives of order up to $k$ continuous.

The bound stated earlier in Theorem 4 follows from Theorems 8 and 9 using $N = O(L^2)$ and $\sum_{j=1}^{N} |w_j| = 4\pi$ for positive weights.

8.2 The Sobolev Space Setting and a Lower Bound

In this section, we define the Sobolev space $H^s = H^s(\mathbb{S}^2)$. In the context of cubature, the space $H^s$ is useful only for $s > 1$, since that condition ensures continuity of the functions in $H^s$. We define the worst-case error in the space $H^s$ and then state a theorem giving a lower bound on the worst-case error.

For integer $s \ge 0$, the space $H^s$ contains all those measurable functions on $\mathbb{S}^2$ whose distributional derivatives up to order $s$ exist and are square-integrable. The space $H^s$ can easily be characterized for all $s \ge 0$, with the help of the Fourier coefficients defined in (6), by

$$H^s = \left\{ f \in L_2(\mathbb{S}^2) : \sum_{\ell=0}^{\infty} (\ell + 1)^{2s} \sum_{k=1}^{2\ell+1} |\widehat{f}_{\ell,k}|^2 < \infty \right\}.$$

The Sobolev space $H^s$ is a Hilbert space with the inner product

$$(f, g)_{H^s} = \sum_{\ell=0}^{\infty} (\ell + 1)^{2s} \sum_{k=1}^{2\ell+1} \widehat{f}_{\ell,k}\, \widehat{g}_{\ell,k},$$

and norm $\|f\|_{H^s} := \sqrt{(f, f)_{H^s}}$. On setting $s = 0$ we recover $H^0 = L_2(\mathbb{S}^2)$. If $s > 1$, then the Sobolev space $H^s$ is known to be embedded into the space $C(\mathbb{S}^2)$, and hence numerical integration can be defined.

For discussing the quality of a cubature rule $Q_N$ on $H^s$, $s > 1$, it is useful to introduce the worst-case error

$$\operatorname{Error}(Q_N; H^s) := \sup_{f \in H^s,\ \|f\|_{H^s} \le 1} |If - Q_N f|.$$

One case where the worst-case error can be explicitly evaluated is $s = \frac{3}{2}$ (Sloan and Womersley 2004), since in that case the worst-case error is a constant multiple of the generalized discrepancy of Cui and Freeden (1997), which has a closed form expression. Note that any upper bound on the worst-case error gives immediately an error bound for a given $f \in H^s$, because of the obvious inequality

$$|If - Q_N f| \le \operatorname{Error}(Q_N; H^s)\, \|f\|_{H^s}, \qquad \text{for } f \in H^s.$$

The following lower bound, due to Hesse and Sloan (2005a), tells us that the best possible rate of convergence achievable for all functions in $H^s$ is $O(N^{-s/2})$. This allows us to say that the convergence rate achieved by some of our cubature rules cannot be improved.

Theorem 10. Let $s > 1$. There exists a positive constant $c = c(s)$, such that for any cubature rule $Q_N$ with $N$ points

$$\operatorname{Error}(Q_N; H^s) \ge \frac{c}{N^{s/2}}.$$

8.3 Sobolev Space Error Bounds for Rules with Polynomial Accuracy

In Hesse and Sloan (2006), the following upper bound is derived for the worst-case error in $H^s$, $s > 1$, for cubature rules with positive weights and the degree of precision $L$.

Theorem 11. Let $s > 1$. There exists a positive constant $c = c(s)$ such that for every cubature rule $Q_N$ that has positive weights and the degree of precision $L \ge 1$ the worst-case error in $H^s$ has the upper bound

$$\operatorname{Error}(Q_N; H^s) \le \frac{c}{L^s}.$$

As a consequence of this theorem and Theorem 10 we obtain immediately the following result.

Theorem 12. Let $s > 1$. For a sequence of positive-weight cubature rules $(Q_N)$ with the degree of precision $L \ge 1$ and $N = O(L^2)$ points there exist positive constants $c_1$ and $c_2$ such that

$$\frac{c_1}{N^{s/2}} \le \operatorname{Error}(Q_N; H^s) \le \frac{c_2}{N^{s/2}}.$$

Theorem 12 applies to suitable sequences of longitude-latitude rules (see Sect. 4.1) and also to suitable sequences of rules with scattered data points (see Sect. 5.2), provided that the mesh ratio stays bounded.

8.4 Error Analysis for Cubature Based on SBF

In this section, error estimates for the SBF interpolant are used to derive the estimate for the cubature error stated in Theorem 7 of Sect. 5.3. We first expand $\phi$ into a series of Legendre polynomials: Since $\phi(x, y)$ depends only on the inner product $x \cdot y$, it is essentially a one-dimensional function and has in the $L_2(\mathbb{S}^2)$ sense a series expansion

$$\phi(x, y) = \sum_{\ell=0}^{\infty} \widehat{\phi}(\ell)\, \frac{2\ell + 1}{4\pi}\, P_\ell(x \cdot y) = \sum_{\ell=0}^{\infty} \widehat{\phi}(\ell) \sum_{k=1}^{2\ell+1} Y_{\ell,k}(x)\, Y_{\ell,k}(y).$$

For the special case of the SBFs given by (35) it is known, from Narcowich and Ward (2002), that

$$\widehat{\phi}(\ell) \asymp (\ell + 1)^{-2s_i}, \qquad i = 1, 2, 3, \tag{37}$$

where $s_1 = \frac{3}{2}$, $s_2 = \frac{5}{2}$, and $s_3 = \frac{7}{2}$, respectively, for the three cases in (35). It follows from this and the definition of the Sobolev space $H^s$ that $\phi_i(x_j, \cdot) \in H^{s_i}$, where $\phi_i$ denotes the kernel (33) generated by the $i$th function in (35), and in turn this allows us to use the following result from Le Gia et al. (2006, Corollary 3.4).

Theorem 13. Let $s > 1$, and let $h_{X_N}$ be the mesh norm of the point set $X_N$. Let $\phi$ be a positive definite SBF with $\widehat{\phi}(\ell) \asymp (\ell + 1)^{-2s}$. Then, with $\Lambda_N f$ denoting the SBF interpolant of $f$ on $X_N$, there exists $c > 0$ such that for $h_{X_N}$ sufficiently small

$$\|f - \Lambda_N f\|_{L_2(\mathbb{S}^2)} \le c\, h_{X_N}^{s}\, \|f\|_{H^s}, \qquad f \in H^s.$$

Since the cubature rule $Q_N f$ is obtained by integrating the interpolant $\Lambda_N f$, see (32), we have

$$If - Q_N f = \int_{\mathbb{S}^2} \bigl( f(x) - (\Lambda_N f)(x) \bigr)\, d\omega(x),$$

and hence, with the aid of the Cauchy-Schwarz inequality,

$$|If - Q_N f| \le \sqrt{4\pi}\, \|f - \Lambda_N f\|_{L_2(\mathbb{S}^2)}.$$

Together with (37) and Theorem 13 this gives

$$|If - Q_N f| \le c\, h_{X_N}^{s_i}\, \|f\|_{H^{s_i}}, \qquad i = 1, 2, 3.$$

Using now $h_{X_N} = O(N^{-1/2})$, we obtain the result in Theorem 7.


9 Conclusions

This chapter discusses numerical integration over the sphere, in particular the role played by positive-weight cubature rules with a specified polynomial degree of precision. The chapter considers both rules in which the user is free to (judiciously) choose the points of the cubature rule and rules with scattered points from measured data. Cubature rules for subsets of the sphere are also discussed. For numerical integration of functions with an appropriate degree of smoothness over the whole sphere, it is shown that the optimal rate of convergence can be achieved both by positive-weight rules with polynomial precision and by rules obtained by integrating a suitable radial basis function interpolant.

Acknowledgments

The support of the Australian Research Council is gratefully acknowledged. IHS and RSW acknowledge the support of the Hong Kong Polytechnic University, where much of this work was carried out. The authors also thank Ronald Cools for helpful advice.

References

Ahrens C, Beylkin G (2009) Rotationally invariant quadratures for the sphere. Proc R Soc A 465:3103–3125
Alfeld P, Neamtu M, Schumaker LL (1996) Bernstein-Bézier polynomials on spheres and spherelike surfaces. Comput Aided Geom Des 13:333–349
Atkinson K (1982) Numerical integration on the sphere. J Austral Math Soc (Ser B) 23:332–347
Atkinson K (1998) An introduction to numerical analysis. Wiley, New York
Atkinson K, Sommariva A (2005) Quadrature over the sphere. Electron Trans Numer Anal 20:104–118
Bajnok B (1991) Construction of designs on the 2-sphere. Eur J Comb 12:377–382
Bannai E, Bannai E (2009) A survey on spherical designs and algebraic combinatorics on spheres. Eur J Comb 30(6):1392–1425
Bannai E, Damerell RM (1979) Tight spherical designs I. J Math Soc Jpn 31(1):199–207
Baumgardner JR, Frederickson PO (1985) Icosahedral discretization of the two-sphere. SIAM J Numer Anal 22(6):1107–1115
Boal N, Sayas F-J (2004) Adaptive numerical integration on spherical triangles. Monografías del Seminario Matemático García de Galdeano 31:61–69
Chen D, Menegatto VA, Sun X (2003) A necessary and sufficient condition for strictly positive definite functions on spheres. Proc Am Math Soc 131:2733–2740
Chen X, Frommer A, Lang B (2009) Computational existence proofs for spherical t-designs. Department of Applied Mathematics, The Hong Kong Polytechnic University
Chen X, Womersley RS (2006) Existence of solutions to systems of underdetermined equations and spherical designs. SIAM J Numer Anal 44(6):2326–2341
Cohn H, Kumar A (2007) Universally optimal distribution of points on spheres. J Am Math Soc 20(1):99–148
Cools R (1997) Constructing cubature formulae: the science behind the art. Acta Numer 1997:1–54
Cools R, Rabinowitz P (1993) Monomial cubature rules since “Stroud”: a compilation. J Comput Appl Math 48:309–326
Cui J, Freeden W (1997) Equidistribution on the sphere. SIAM J Sci Comput 18(2):595–609
Davis PJ, Rabinowitz P (1984) Methods of numerical integration, 2nd edn. Academic, Orlando
Delsarte P, Goethals JM, Seidel JJ (1977) Spherical codes and designs. Geom Dedicata 6:363–388
Ditkin VA, Lyusternik LA (1953) On a method of practical harmonic analysis on the sphere (in Russian). Vychisl Mat Vychisl Tekhn 1:3–13
Du Q, Faber V, Gunzburger M (1999) Centroidal Voronoi tessellations: applications and algorithms. SIAM Rev 41(4):637–676
Erdélyi A (ed), Magnus W, Oberhettinger F, Tricomi FG (research associates) (1953) Higher transcendental functions, vol 2, Bateman Manuscript Project, California Institute of Technology. McGraw-Hill, New York/Toronto/London
Fasshauer G (2007) Meshfree approximation methods with Matlab. World Scientific, Singapore
Fasshauer GE, Schumaker LL (1998) Scattered data fitting on the sphere. In: Dahlen M, Lyche T, Schumaker LL (eds) Mathematical methods for curves and surfaces II. Vanderbilt University, Nashville, pp 117–166
Filbir F, Themistoclakis W (2008) Polynomial approximation on the sphere using scattered data. Math Nachr 281(5):650–668
Floater MS, Iske A (1996a) Multistep scattered data interpolation using compactly supported radial basis functions. J Comput Appl Math 73:65–78
Floater MS, Iske A (1996b) Thinning and approximation of large sets of scattered data. In: Fontanella F, Jetter K, Laurent P-J (eds) Advanced topics in multivariate approximation. World Scientific, Singapore, pp 87–96
Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Leipzig
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford Science/Clarendon, Oxford
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston/Basel/Berlin
Freeden W, Reuter R (1982) Remainder terms in numerical integration formulas of the sphere. In: Schempp W, Zeller K (eds) Multivariate approximation theory II. Birkhäuser, Basel, pp 151–170
Gautschi W (2004) Orthogonal polynomials: computation and approximation. Oxford University, New York
Górski KM, Hivon E, Banday AJ, Wandelt BD, Hansen FK, Reinecke M, Bartelmann M (2005) HEALPix: a framework for high-resolution discretization and fast analysis of data distributed on the sphere. Astrophys J 622:759–771
Gräf M, Kunis S, Potts D (2009) On the computation of nonnegative quadrature weights on the sphere. Appl Comput Harmon Anal 27(1):124–132
Hannay JH, Nye JF (2004) Fibonacci numerical integration on a sphere. J Phys A Math Gen 37:11591–11601
Hardin RH, Sloane NJA (1996) McLaren’s improved snub cube and other new spherical designs in three dimensions. Discret Comput Geom 15:429–441
Hesse K, Sloan IH (2005a) Optimal lower bounds for cubature error on the sphere $S^2$. J Complex 21:790–803
Hesse K, Sloan IH (2005b) Optimal order integration on the sphere. In: Li T, Zhang P (eds) Frontiers and prospects of contemporary applied mathematics. Series in contemporary applied mathematics, vol 6. Higher Education, Beijing/World Scientific, Singapore, pp 59–70
Hesse K, Sloan IH (2006) Cubature over the sphere $S^2$ in Sobolev spaces of arbitrary order. J Approx Theory 141:118–133
Hesse K, Womersley RS (2009) Numerical integration with polynomial exactness over a spherical cap. Technical report SMRR-2009-09, Department of Mathematics, University of Sussex
Jetter K, Stöckler J, Ward JD (1998) Norming sets and spherical quadrature formulas. In: Chen Li, Micchelli C, Xu Y (eds) Computational mathematics. Marcel Dekker, New York, pp 237–245
Korevaar J, Meyers JLH (1993) Spherical Faraday cage for the case of equal point charges and Chebyshev-type quadrature on the sphere. Integral Transform Spec Funct 1(2):105–117
Lebedev VI (1975) Values of the nodes and weights of ninth to seventeenth order Gauss-Markov quadrature formulae invariant under the octahedron group with inversion. Comput Math Math Phys 15:44–51
Lebedev VI, Laikov DN (1999) A quadrature formula for the sphere of the 131st algebraic order of accuracy. Dokl Math 59(3):477–481
Le Gia QT, Mhaskar HN (2008) Localized linear polynomial operators and quadrature on the sphere. SIAM J Numer Anal 47(1):440–466
Le Gia QT, Narcowich FJ, Ward JD, Wendland H (2006) Continuous and discrete least-squares approximation by radial basis functions on spheres. J Approx Theory 143:124–133
Le Gia QT, Sloan IH, Wendland H (2009) Multiscale analysis in Sobolev spaces on the sphere. Applied mathematics report AMR09/20, University of New South Wales
McLaren AD (1963) Optimal numerical integration on a sphere. Math Comput 17(84):361–383
Mhaskar HN (2004a) Local quadrature formulas on the sphere. J Complex 20:753–772
Mhaskar HN (2004b) Local quadrature formulas on the sphere, II. In: Neamtu M, Saff EB (eds) Advances in constructive approximation. Nashboro, Nashville, pp 333–344
Mhaskar HN, Narcowich FJ, Ward JD (2001) Spherical Marcinkiewicz-Zygmund inequalities and positive quadrature. Math Comput 70:1113–1130 (Corrigendum (2002) Math Comput 71:453–454)
Müller C (1966) Spherical harmonics. Lecture notes in mathematics, vol 17. Springer-Verlag, New York
Narcowich FJ, Petrushev P, Ward JD (2006) Localized tight frames on spheres. SIAM J Math Anal 38(2):574–594
Narcowich FJ, Ward JD (2002) Scattered data interpolation on spheres: error estimates and locally supported basis functions. SIAM J Math Anal 33(6):1393–1410
Popov AS (2008) Cubature formulas on a sphere invariant under the icosahedral rotation group. Numer Anal Appl 1(4):355–361
Ragozin DL (1971) Constructive polynomial approximation on spheres and projective spaces. Trans Am Math Soc 162:157–170
Rakhmanov EA, Saff EB, Zhou YM (1994) Minimal discrete energy on the sphere. Math Res Lett 1(6):647–662
Reimer M (1992) On the existence problem for Gauss-quadrature on the sphere. In: Fuglede F (ed) Approximation by solutions of partial differential equations. Kluwer, Dordrecht, pp 169–184
Reimer M (1994) Quadrature rules for the surface integral of the unit sphere based on extremal fundamental systems. Math Nachr 169:235–241
Reimer M (2003) Multivariate polynomial approximation. Birkhäuser, Basel/Boston/Berlin
Renka RJ (1997) Algorithm 772: STRIPACK: Delaunay triangulation and Voronoi diagram on the surface of a sphere. ACM Trans Math Softw 23(3):416–434
Saff EB, Kuijlaars ABJ (1997) Distributing many points on a sphere. Math Intell 19:5–11
Sansone G (1959) Orthogonal functions. Interscience, London/New York
Seymour PD, Zaslavsky T (1984) Averaging sets: a generalization of mean values and spherical designs. Adv Math 52:213–240
Sidi A (2005) Application of class Im variable transformations to numerical integration over surfaces of spheres. J Comput Appl Math 184(2):475–492
Sloan IH (1995) Polynomial interpolation and hyperinterpolation over general regions. J Approx Theory 83:238–254
Sloan IH, Womersley RS (2004) Extremal systems of points and numerical integration on the sphere. Adv Comput Math 21:107–125
Sloan IH, Womersley RS (2009) A variational characterization of spherical designs. J Approx Theory 159:308–318
Sloane NJA (2000) Spherical designs. http://www.research.att.com/~njas/sphdesigns/index.html
Sobolev SL (1962) Cubature formulas on the sphere invariant with respect to any finite group of rotations. Dokl Acad Nauk SSSR 146:310–313
Sobolev SL, Vaskevich VL (1997) The theory of cubature formulas. Kluwer, Dordrecht/Boston/London
Sommariva A, Womersley RS (2005) Integration by RBF over the sphere. Applied mathematics report AMR05/17, University of New South Wales
Stroud AH (1971) Approximate calculation of multiple integrals. Prentice-Hall, Inc., Englewood Cliffs
Szegö G (1975) Orthogonal polynomials. American mathematical society colloquium publications, vol 23, 4th edn. American Mathematical Society, Providence
Tegmark M (1996) An icosahedron-based method for pixelizing the celestial sphere. Astrophys J 470:L81–L84
Wendland H (1995) Piecewise polynomial, positive definite and compactly supported radial basis functions of minimal degree. Adv Comput Math 4:389–396
Wendland H (1998) Error estimates for interpolation by compactly supported radial basis functions of minimal degree. J Approx Theory 93:258–272
Wendland H (2005) Scattered data approximation. Cambridge University, Cambridge
Womersley RS (2007) Interpolation and cubature on the sphere. http://web.maths.unsw.edu.au/~rsw/Sphere/
Womersley RS (2009) Spherical designs with close to the minimal number of points. Applied mathematics report AMR09/26, The University of New South Wales
Womersley RS, Sloan IH (2001) How good can polynomial interpolation on the sphere be? Adv Comput Math 14:195–226
Xu Y, Cheney EW (1992) Strictly positive definite functions on spheres. Proc Am Math Soc 116:977–981


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_41-2 © Springer-Verlag Berlin Heidelberg 2014

Multiscale Approximation Stephan Dahlke FB 12 Mathematics and Computer Sciences, Philipps-University of Marburg, Marburg, Germany

Abstract In this chapter, we briefly recall the concept of multiscale approximations of functions by means of wavelet expansions. We present a short overview on the basic construction principles and discuss the most important properties of wavelets such as characterizations of function spaces. Moreover, we explain how wavelets can be used in signal/image analysis, particularly for compression and denoising.

1 Introduction

One of the most important tasks in applied analysis is the analysis of signals, which are usually modeled by a function $f \in L_2(\mathbb{R}^d)$. The first step is always the decomposition of the signal into suitable “building blocks,” which may be taken from a basis, a frame, or, more generally, from a so-called dictionary. Then one is faced with the problem of identifying those building blocks that are best suited for the features one is interested in. For many years, the Fourier transform has very often been the method of choice. In this case, the signal is decomposed with respect to its frequency components. This approach has been very successful; nevertheless, there is a serious drawback. Although the Fourier basis is perfectly localized in the frequency domain, the basis functions are by no means localized in the time domain, so that small changes in the signal influence the whole Fourier spectrum. This problem has been partially ameliorated by Gabor (1946). In Gabor’s formalism, one analyzes the signal $f(t)$ with the aid of the window function $g(t)$ and computes the coefficients

$$C(m,n) := \int_{\mathbb{R}} f(t)\, g(t - n t_0)\, e^{-\mathrm{i} m \omega_0 t}\, dt, \qquad m, n \in \mathbb{Z}. \tag{1}$$

That means that we have localized around $n t_0$ with the help of the window $g(t - n t_0)$ (where $t_0$ roughly is the size of the window) and then computed the Fourier coefficients of our localized $f(t)\, g(t - n t_0)$ corresponding to the frequency $m \omega_0$. Nevertheless, to detect sharp singularities, the Gabor transform is not perfectly suited since the size of the window function $g$ is fixed and cannot be adapted to the singularity. This is completely different in the wavelet setting which, contrary to the Fourier approach outlined above, provides a time-scale decomposition instead of a time-frequency decomposition. In general, a wavelet basis consists of the scaled, dilated, and translated versions of a finite set $\{\psi_i\}_{i \in I}$ of functions, i.e., the set

$$\Psi = \{\psi_{i,j,k}\}_{i \in I,\ j \in \mathbb{Z},\ k \in \mathbb{Z}^d}, \qquad \psi_{i,j,k}(x) := 2^{jd/2}\, \psi_i(2^j x - k), \tag{2}$$

forms a (stable) basis of $L_2(\mathbb{R}^d)$. Given such a basis, we obtain the wavelet representation of a function, i.e.,

$$f(x) = \sum_{i \in I} \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}^d} c_{i,j,k}\, \psi_{i,j,k}(x). \tag{3}$$

Fig. 1 The Haar wavelet ($\psi(t)$ plotted against $t$)

The mother wavelets $\psi_i$ may be chosen to be exponentially decaying or even compactly supported, so that, contrary to the Fourier approach, the wavelet approach is local in the time domain, and small changes in the signal only influence those wavelet coefficients that correspond to wavelets located in the vicinity of the perturbation. Moreover, larger “scales” $j$ correspond to the resolution of finer and finer details, so that the wavelet approach indeed provides a multiscale approximation.

The first example of a wavelet basis was constructed by Haar in 1910. In that case, the mother wavelet looks as depicted in Fig. 1. Clearly, the mother wavelet is very well localized but not very smooth. For a long time, it was an open problem whether a mother wavelet that is both localized and smooth really exists. Indeed, it was often conjectured that due to the Heisenberg uncertainty principle such a function does not exist. Quite surprisingly, in 1985 Meyer found a mother wavelet that is $C^\infty(\mathbb{R})$ and decays faster than any polynomial. The fundamental breakthrough of wavelet analysis took place in 1986. In that year, Mallat introduced a very systematic approach for the construction of wavelets, i.e., the multiresolution analysis (MRA) (Mallat 1989). Moreover, Daubechies used this concept to show that there exists a family of arbitrarily smooth and compactly supported wavelets (Daubechies 1987). Soon afterward, the wavelet concept was generalized to the multivariate setting; see, e.g., Jia and Micchelli (1991) and Cohen and Daubechies (1993). Since then, the number of papers has been increasing almost exponentially, and by now there even exist several journals that are particularly concerned with wavelet analysis.

The applications of wavelet concepts are quite comprehensive and range from image analysis to numerical analysis, geophysics, tomography, astrophysics, and meteorology. In many fields, particularly in image/signal analysis, wavelets are nowadays very often the method of choice. This is reflected, e.g., in the fact that the JPEG 2000 standard is based on wavelet concepts. Besides the locality of wavelets, the success of wavelet methods is based on the following facts:

• Weighted sequence norms of wavelet expansion coefficients are equivalent in a certain range (depending on the regularity of the wavelets) to smoothness norms such as Sobolev and Besov norms.
• For a wide class of operators, their representation in the wavelet basis is nearly diagonal.
• The vanishing moments of wavelets remove the smooth part of a function.

In this chapter, we give a short overview of the basic wavelet construction techniques, and we discuss the most important properties of the resulting wavelet bases. Moreover, we explain how the powerful properties of wavelets can be exploited in image/signal analysis.

2 Wavelet Analysis

In this section, we shall briefly recall the basic setting of wavelet analysis. First of all, in Sect. 2.1, we collect some facts concerning the discrete wavelet transform. Then, in Sect. 2.2, we discuss the biorthogonal wavelet approach. It is one of the most important properties of wavelets that they form stable bases for scales of smoothness spaces such as Sobolev and Besov spaces. These aspects will be discussed in Sect. 2.3. Finally, in Sect. 2.4, we briefly recall the construction of wavelet bases on bounded domains.

2.1 The Discrete Wavelet Transform

In general, a function $\psi$ is called a wavelet if all its scaled, dilated, and integer-translated versions

$$\psi_{j,k}(x) := 2^{j/2}\, \psi(2^j x - k), \qquad j, k \in \mathbb{Z}, \tag{4}$$

form a (Riesz) basis of $L_2(\mathbb{R})$. Usually, they are constructed by means of a multiresolution analysis introduced by Mallat (1986):

Definition 1. A sequence $\{V_j\}_{j \in \mathbb{Z}}$ of closed subspaces of $L_2(\mathbb{R})$ is called a multiresolution analysis (MRA) of $L_2(\mathbb{R})$ if

$$\cdots \subset V_{j-1} \subset V_j \subset V_{j+1} \subset \cdots; \tag{5}$$

$$\overline{\bigcup_{j=-\infty}^{\infty} V_j} = L_2(\mathbb{R}); \tag{6}$$

$$\bigcap_{j=-\infty}^{\infty} V_j = \{0\}; \tag{7}$$

$$f(\cdot) \in V_j \iff f(2\,\cdot) \in V_{j+1}; \tag{8}$$

$$f(\cdot) \in V_0 \iff f(\cdot - k) \in V_0 \quad \text{for all } k \in \mathbb{Z}. \tag{9}$$

Moreover, we assume that there exists a function $\varphi \in V_0$ such that

$$V_0 := \overline{\operatorname{span}}\{\varphi(\cdot - k),\ k \in \mathbb{Z}\} \tag{10}$$

and that $\varphi$ has stable integer translates, i.e.,

$$C_1 \|c\|_{\ell_2} \le \Big\| \sum_{k \in \mathbb{Z}} c_k\, \varphi(\cdot - k) \Big\|_{L_2} \le C_2 \|c\|_{\ell_2}, \qquad c := \{c_k\}_{k \in \mathbb{Z}} \in \ell_2(\mathbb{Z}). \tag{11}$$

The function $\varphi$ is called the generator of the multiresolution analysis. The properties (5), (8), (10), and (11) immediately imply that $\varphi$ is refinable, i.e., it satisfies a two-scale relation

$$\varphi(x) = \sum_{k \in \mathbb{Z}} a_k\, \varphi(2x - k), \tag{12}$$

with the mask $a = \{a_k\}_{k \in \mathbb{Z}} \in \ell_2(\mathbb{Z})$. A function satisfying a relation of the form (12) is sometimes also called a scaling function. Because the union of the spaces $\{V_j\}_{j \in \mathbb{Z}}$ is dense in $L_2(\mathbb{R})$, it is easy to see that the construction of a wavelet basis reduces to finding a function $\psi$ whose translates span a complement space $W_0$ of $V_0$ in $V_1$,

$$V_1 = V_0 \oplus W_0, \qquad W_0 = \overline{\operatorname{span}}\{\psi(\cdot - k) \mid k \in \mathbb{Z}\}. \tag{13}$$

Indeed, if we define

$$W_j := \{f(\cdot) \in L_2(\mathbb{R}) \mid f(2^{-j}\,\cdot) \in W_0\}, \tag{14}$$

it follows from (6)–(8) that

$$L_2(\mathbb{R}) = \bigoplus_{j=-\infty}^{\infty} W_j, \tag{15}$$

so that

$$\psi_{j,k}(\cdot) = 2^{j/2}\, \psi(2^j \cdot - k), \qquad j, k \in \mathbb{Z}, \tag{16}$$

forms a wavelet basis of $L_2(\mathbb{R})$. Clearly, (8) and (10) imply that the wavelet $\psi$ can be found by means of a functional equation of the form

$$\psi(x) = \sum_{k \in \mathbb{Z}} b_k\, \varphi(2x - k), \tag{17}$$

where the sequence $b := \{b_k\}_{k \in \mathbb{Z}}$ has to be judiciously chosen; see, e.g., Chui (1992), Daubechies (1992), and Meyer (1992) for details. The construction outlined above is quite general. In many applications, it is convenient to impose some more conditions, i.e., to require that functions on different scales are orthogonal with respect to the usual $L_2$-inner product, i.e.,

$$\langle \psi(2^j \cdot - k),\ \psi(2^{j'} \cdot - k') \rangle_{L_2(\mathbb{R})} = 0, \qquad \text{if } j \ne j'. \tag{18}$$

This can be achieved if the translates of $\psi$ not only span an algebraic complement but the orthogonal complement,

$$V_0 \perp W_0, \qquad W_0 = \overline{\operatorname{span}}\{\psi(\cdot - k) \mid k \in \mathbb{Z}\}. \tag{19}$$
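The two-scale relation (12) can be checked numerically for a concrete generator. The following minimal sketch assumes the hat function (the cardinal B-spline of order 2, supported on $[0, 2]$) with mask $a_0 = 1/2$, $a_1 = 1$, $a_2 = 1/2$; this generator and the names `hat`, `MASK`, and `refined` are illustrative choices and are not taken from the text.

```python
def hat(x):
    """Hat function (cardinal B-spline of order 2) supported on [0, 2]."""
    return max(0.0, 1.0 - abs(x - 1.0))


# Assumed mask of the hat function in the normalization of (12):
# the coefficients a_k sum to 2.
MASK = {0: 0.5, 1: 1.0, 2: 0.5}


def refined(x):
    """Right-hand side of the two-scale relation (12) for the hat function."""
    return sum(a_k * hat(2.0 * x - k) for k, a_k in MASK.items())


# Compare both sides of (12) on a dyadic grid covering [-0.5, 2.5].
grid = [i / 64.0 - 0.5 for i in range(193)]
max_error = max(abs(hat(x) - refined(x)) for x in grid)
```

On this grid both sides agree up to rounding, which illustrates that the hat function is refinable with the stated mask.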


The resulting functions are sometimes called pre-wavelets. The basic properties of scaling functions and (pre-)wavelets can be summarized as follows:

• Reproduction of polynomials: If $\varphi$ is contained in $C_0^r(\mathbb{R}) := \{g \mid g \in C^r(\mathbb{R}) \text{ and } \operatorname{supp} g \text{ compact}\}$, then every monomial $x^\alpha$ has an expansion of the form

$$x^\alpha = \sum_{k \in \mathbb{Z}} c_k^\alpha\, \varphi(x - k), \qquad \alpha \le r. \tag{20}$$

• Oscillations: If the generator $\varphi$ is contained in $C_0^r(\mathbb{R})$, then the associated wavelet $\psi$ has vanishing moments up to order $r$, i.e.,

$$\int_{\mathbb{R}} x^\alpha\, \psi(x)\, dx = 0 \qquad \text{for all } 0 \le \alpha \le r. \tag{21}$$

• Approximation: If $\varphi \in C_0^r(\mathbb{R})$ and $f \in H^r(\mathbb{R})$, then the following Jackson-type inequality holds:

$$\inf_{g \in V_j} \|f - g\|_{L_2(\mathbb{R})} \le C\, 2^{-jr}\, |f|_{H^r}. \tag{22}$$

In practice, it is clearly desirable to work with an orthonormal wavelet basis. This can be realized as follows. Given an $\ell_2$-stable generator $\varphi$ in the sense of (11), one may define another generator $\phi$ by

$$\hat{\phi}(\xi) := \frac{\hat{\varphi}(\xi)}{\Big( \sum_{k \in \mathbb{Z}} |\hat{\varphi}(\xi + 2\pi k)|^2 \Big)^{1/2}}, \tag{23}$$

and it can be directly checked that the translates of $\phi$ are orthonormal and span the same space $V_0$. The generator $\phi$ is also refinable,

$$\phi(x) = \sum_{k \in \mathbb{Z}} \hat{a}_k\, \phi(2x - k), \tag{24}$$

and it can be shown that the function

$$\psi(x) = \sum_{k \in \mathbb{Z}} b_k\, \phi(2x - k), \qquad b_k := (-1)^k\, \hat{a}_{1-k}, \tag{25}$$

is an orthonormal wavelet with the same regularity properties as the original generator $\varphi$. However, this approach has a serious disadvantage. If the generator $\varphi$ is compactly supported, this property will in general not carry over to the resulting wavelet since it gets lost during the orthonormalization procedure (23). Therefore the compact support will only be preserved if we can dispense with the orthonormalization procedure, i.e., if the translates of $\varphi$ are already orthonormal.

This observation was the starting point for the investigations of Daubechies, who constructed a family $\phi_N$, $N \in \mathbb{N}$, of generators with the following properties (Daubechies 1987, 1992).

Theorem 1. There exist a constant $\beta > 0$ and a family $\phi_N$ of generators satisfying $\phi_N \in C^{\beta N}(\mathbb{R})$, $\operatorname{supp} \phi_N = [0, 2N - 1]$, and

$$\langle \phi_N(\cdot),\ \phi_N(\cdot - k) \rangle_{L_2(\mathbb{R})} = \delta_{0,k}, \qquad \phi_N(x) = \sum_{k=N_1}^{N_2} a_k\, \phi_N(2x - k). \tag{26}$$
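For $N = 2$, the conditions of Theorem 1 can be illustrated on the level of the mask. The sketch below assumes the commonly tabulated four-coefficient Daubechies mask, rescaled so that the coefficients sum to 2 as in (12); the numerical values and all function names are assumptions that are not given in the text. It checks the discrete counterpart of the orthonormality in (26), $\sum_k a_k a_{k-2l} = 2\,\delta_{0,l}$, and the first two discrete moments of the wavelet mask $b_k = (-1)^k a_{1-k}$ from (25).

```python
import math

SQRT3 = math.sqrt(3.0)
# Assumed mask a_0, ..., a_3 for N = 2 (support [0, 3]), normalized to sum 2.
A = [(1 + SQRT3) / 4, (3 + SQRT3) / 4, (3 - SQRT3) / 4, (1 - SQRT3) / 4]


def a(k):
    """Mask coefficient a_k, extended by zero outside 0..3."""
    return A[k] if 0 <= k < len(A) else 0.0


def b(k):
    """Wavelet mask b_k = (-1)^k a_{1-k}, cf. (25)."""
    return (-1) ** (k % 2) * a(1 - k)


def mask_autocorrelation(l):
    """sum_k a_k a_{k-2l}; equals 2 for l = 0 and 0 otherwise if the
    integer translates of the generator are orthonormal."""
    return sum(a(k) * a(k - 2 * l) for k in range(len(A)))


def discrete_moment(alpha):
    """sum_k k^alpha b_k over the support k = -2, ..., 1 of b."""
    return sum(k ** alpha * b(k) for k in range(-2, 2))
```

Both `discrete_moment(0)` and `discrete_moment(1)` vanish up to rounding, which is the mask-level analogue of the vanishing moments in (21).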

Clearly, (26) and (25) imply that the associated wavelet $\psi_N$ is also compactly supported with the same regularity properties as $\phi_N$. Given such an orthonormal wavelet basis, any function $v \in V_j$ has two equivalent representations, the single-scale representation with respect to the functions $\phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k)$ and the multiscale representation which is based on the functions $\phi_{0,k}$, $\psi_{l,m}$, $k, m \in \mathbb{Z}$, $0 \le l < j$, $\psi_{l,m}(x) := 2^{l/2} \psi(2^l x - m)$. From the coefficients of $v$ in the single-scale representation, the coefficients in the multiscale representation can easily be obtained by some kind of filtering, and vice versa. Indeed, given

$$v = \sum_{k \in \mathbb{Z}} \lambda_k^j\, \phi_{j,k}$$

and using the refinement equation (26) and the functional equation (25), it turns out that

$$v = \sum_{l \in \mathbb{Z}} \Big( 2^{-1/2} \sum_{k \in \mathbb{Z}} \bar{a}_{k-2l}\, \lambda_k^j \Big) \phi_{j-1,l} + \sum_{m \in \mathbb{Z}} \Big( 2^{-1/2} \sum_{k \in \mathbb{Z}} \bar{b}_{k-2m}\, \lambda_k^j \Big) \psi_{j-1,m}. \tag{27}$$

From (27), we observe that the coefficient sequence $\lambda^{j-1} = \{\lambda_k^{j-1}\}_{k \in \mathbb{Z}}$, which describes the information corresponding to $V_{j-1}$, can be obtained by applying the low-pass filter $H$ induced by $a$ to $\lambda^j$,

$$\lambda^{j-1} = H \lambda^j, \qquad \lambda_l^{j-1} = \sum_{k \in \mathbb{Z}} 2^{-1/2}\, \bar{a}_{k-2l}\, \lambda_k^j. \tag{28}$$

The wavelet space $W_{j-1}$ describes the detailed information added to $V_{j-1}$. From (27), we can conclude that this information can be obtained by applying the high-pass filter $D$ induced by $b$ to $\lambda^j$:

$$c^{j-1} = D \lambda^j, \qquad c_l^{j-1} = 2^{-1/2} \sum_{k \in \mathbb{Z}} \bar{b}_{k-2l}\, \lambda_k^j. \tag{29}$$

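By iterating this decomposition method, we obtain a pyramid algorithm, the so-called fast wavelet transform. A reconstruction algorithm can be obtained in a similar fashion. Indeed, a straightforward computation shows

$$\sum_{l \in \mathbb{Z}} \lambda_l^j\, \phi_{j,l} = \frac{1}{\sqrt{2}} \sum_{l \in \mathbb{Z}} \Big( \sum_{k \in \mathbb{Z}} a_{l-2k}\, \lambda_k^{j-1} + \sum_{n \in \mathbb{Z}} b_{l-2n}\, c_n^{j-1} \Big) \phi_{j,l}$$

so that

$$\lambda_l^j = 2^{-1/2} \sum_{k \in \mathbb{Z}} a_{l-2k}\, \lambda_k^{j-1} + 2^{-1/2} \sum_{n \in \mathbb{Z}} b_{l-2n}\, c_n^{j-1}. \tag{30}$$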
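The decomposition steps (28) and (29) and the reconstruction step (30) can be sketched for the simplest orthonormal case. The example assumes the Haar mask $a_0 = a_1 = 1$ (so that $b_0 = 1$, $b_1 = -1$ by (25)) and a finite signal whose length is a power of two; these choices and all names are illustrative assumptions, not part of the text.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)
A = {0: 1.0, 1: 1.0}    # assumed Haar mask a_k, normalized to sum 2
B = {0: 1.0, 1: -1.0}   # b_k = (-1)^k a_{1-k}, cf. (25)


def decompose(lam):
    """One step of (28) and (29): split lam^j into lam^{j-1} and c^{j-1}."""
    half = len(lam) // 2
    coarse = [SQRT_HALF * sum(a_k * lam[2 * l + k] for k, a_k in A.items())
              for l in range(half)]
    detail = [SQRT_HALF * sum(b_k * lam[2 * l + k] for k, b_k in B.items())
              for l in range(half)]
    return coarse, detail


def reconstruct(coarse, detail):
    """One step of (30): rebuild lam^j from lam^{j-1} and c^{j-1}."""
    lam = [0.0] * (2 * len(coarse))
    for k, (low, high) in enumerate(zip(coarse, detail)):
        for r in (0, 1):  # entry l = 2k + r receives a_r * low + b_r * high
            lam[2 * k + r] = SQRT_HALF * (A[r] * low + B[r] * high)
    return lam


def fast_wavelet_transform(lam):
    """Pyramid algorithm: iterate decompose down to a single coarse value."""
    details = []  # finest level first
    while len(lam) > 1:
        lam, c = decompose(lam)
        details.append(c)
    return lam, details


def inverse_fast_wavelet_transform(coarse, details):
    """Undo the pyramid algorithm, starting from the coarsest level."""
    for c in reversed(details):
        coarse = reconstruct(coarse, c)
    return coarse
```

Each level halves the number of coefficients, so the total work of the transform is proportional to the signal length. Because the Haar basis is orthonormal, the sum of squares of all coefficients equals that of the input signal.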

Similar decomposition and reconstruction schemes also exist for the pre-wavelet case. There are several methods to generalize this concept to higher dimensions. The simplest way is to use tensor products. Given a univariate multiresolution analysis with generator $\varphi$, it turns out that

$$\varphi(x_1, \ldots, x_d) := \varphi(x_1) \cdots \varphi(x_d) \tag{31}$$

generates a multiresolution analysis of $L_2(\mathbb{R}^d)$. Let $E$ denote the vertices of the unit cube in $\mathbb{R}^d$. Defining $\psi^0 := \varphi$ and $\psi^1 := \psi$, it can be shown that the set $\Psi$ of $2^d - 1$ functions

$$\psi^e(x_1, \ldots, x_d) := \prod_{l=1}^{d} \psi^{e_l}(x_l), \qquad e = (e_1, \ldots, e_d) \in E \setminus \{0\}, \tag{32}$$

generates by shifts and dilates a basis of $L_2(\mathbb{R}^d)$. There also exist multivariate wavelet constructions with respect to nonseparable refinable functions $\phi$ satisfying

$$\phi(x) = \sum_{k \in \mathbb{Z}^d} a_k\, \phi(2x - k), \qquad \{a_k\}_{k \in \mathbb{Z}^d} \in \ell_2(\mathbb{Z}^d); \tag{33}$$

see, e.g., Jia and Micchelli (1991) for details. Analogously to the tensor product case, a family $\psi^i$, $i = 1, \ldots, 2^d - 1$, of wavelets is needed. Each $\psi^i$ satisfies a functional equation similar to (17),

$$\psi^i(x) = \sum_{k \in \mathbb{Z}^d} b_k^i\, \phi(2x - k). \tag{34}$$

The basic properties of wavelets and scaling functions (approximation, oscillation, etc.) carry over to the multivariate case in the usual way.
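The tensor-product construction (32) can be made concrete with a small sketch. It assumes the Haar generator and Haar wavelet as the univariate pair $\psi^0 = \varphi$, $\psi^1 = \psi$; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def haar_phi(t):
    """Assumed univariate generator: indicator function of [0, 1)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def haar_psi(t):
    """Assumed univariate wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


UNIVARIATE = {0: haar_phi, 1: haar_psi}  # psi^0 := phi, psi^1 := psi


def wavelet_types(d):
    """All e in E without the origin: the 2^d - 1 nonzero cube vertices."""
    return [e for e in product((0, 1), repeat=d) if any(e)]


def tensor_wavelet(e, x):
    """Evaluate psi^e(x_1, ..., x_d) as the product in (32)."""
    value = 1.0
    for e_l, x_l in zip(e, x):
        value *= UNIVARIATE[e_l](x_l)
    return value
```

For $d = 2$ this yields the three wavelet types $(0,1)$, $(1,0)$, and $(1,1)$, which in image analysis are often associated with horizontal, vertical, and diagonal details.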


2.2 Biorthogonal Bases

Given an orthonormal wavelet basis, the basic calculations are usually quite simple. For instance, the wavelet expansion of a function $v \in L_2(\mathbb{R})$ can be computed as

$$v = \sum_{j,k \in \mathbb{Z}} \langle v, \psi_{j,k} \rangle_{L_2(\mathbb{R})}\, \psi_{j,k}, \qquad \psi_{j,k}(x) := 2^{j/2}\, \psi(2^j x - k). \tag{35}$$

However, requiring smoothness and orthonormality is quite restrictive, and consequently, as we have already seen above, the resulting wavelets are usually not compactly supported. It is one of the advantages of the pre-wavelet setting that the compact support property of the generator can be preserved. Moreover, since we have to deal with weaker conditions, the pre-wavelet approach provides us with much more flexibility. Therefore, given a generator $\phi$, many different families of pre-wavelets adapted to a specific application can be constructed. Nevertheless, since orthonormality is lost, one is still interested in finding suitable alternatives which in some sense provide a compromise between both concepts. This can be performed by using the biorthogonal approach. For a given (univariate) wavelet basis $\{\psi_{j,k},\ j, k \in \mathbb{Z}\}$, $\psi_{j,k}(x) := 2^{j/2} \psi(2^j x - k)$, one is interested in finding a second system $\{\tilde{\psi}_{j,k},\ j, k \in \mathbb{Z}\}$ satisfying

$$\langle \psi_{j,k}(\cdot),\ \tilde{\psi}_{j',k'}(\cdot) \rangle_{L_2(\mathbb{R})} = \delta_{j,j'}\, \delta_{k,k'}, \qquad j, j', k, k' \in \mathbb{Z}. \tag{36}$$

Then all the computations are as simple as in the orthonormal case, i.e.,

$$v = \sum_{j,k \in \mathbb{Z}} \langle v, \tilde{\psi}_{j,k} \rangle_{L_2(\mathbb{R})}\, \psi_{j,k} = \sum_{j',k' \in \mathbb{Z}} \langle v, \psi_{j',k'} \rangle_{L_2(\mathbb{R})}\, \tilde{\psi}_{j',k'}. \tag{37}$$

To construct such a biorthogonal system, one needs two sequences of approximation spaces $\{V_j\}_{j \in \mathbb{Z}}$ and $\{\tilde{V}_j\}_{j \in \mathbb{Z}}$. As for the orthonormal case, one has to find bases for certain algebraic complement spaces $W_0$ and $\tilde{W}_0$ satisfying the biorthogonality conditions

$$V_0 \perp \tilde{W}_0, \qquad \tilde{V}_0 \perp W_0, \qquad V_0 \oplus W_0 = V_1, \qquad \tilde{V}_0 \oplus \tilde{W}_0 = \tilde{V}_1. \tag{38}$$

This is quite easy if the two generators $\phi$ and $\tilde{\phi}$ form a dual pair,

$$\langle \phi(\cdot),\ \tilde{\phi}(\cdot - k) \rangle_{L_2(\mathbb{R})} = \delta_{0,k}. \tag{39}$$

Indeed, the two biorthogonal wavelets $\psi$ and $\tilde{\psi}$ can be constructed as

$$\psi(x) = \sum_{k \in \mathbb{Z}} (-1)^k\, d_{1-k}\, \phi(2x - k), \qquad \tilde{\psi}(x) = \sum_{k \in \mathbb{Z}} (-1)^k\, a_{1-k}\, \tilde{\phi}(2x - k), \tag{40}$$

where

$$\phi(x) = \sum_{k \in \mathbb{Z}} a_k\, \phi(2x - k), \qquad \tilde{\phi}(x) = \sum_{k \in \mathbb{Z}} d_k\, \tilde{\phi}(2x - k). \tag{41}$$
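Inserting the refinement equations (41) into the duality relation (39) gives, for real masks, the necessary condition $\sum_k a_k d_{k-2l} = 2\,\delta_{0,l}$. The sketch below checks this condition for an assumed example pair, namely the centered hat-function mask together with a five-coefficient dual mask (the widely used 5/3 spline filter pair, here rescaled so that both masks sum to 2). The coefficient values and names are assumptions for illustration and do not appear in the text.

```python
# Assumed primal mask a_k: hat function centered at 0 (support [-1, 1]).
PRIMAL = {-1: 0.5, 0: 1.0, 1: 0.5}
# Assumed dual mask d_k of the 5/3 spline pair, normalized to sum 2.
DUAL = {-2: -0.25, -1: 0.5, 0: 1.5, 1: 0.5, 2: -0.25}


def mask_correlation(l):
    """sum_k a_k d_{k-2l}; duality (39) requires 2 for l = 0, else 0."""
    return sum(a_k * DUAL.get(k - 2 * l, 0.0) for k, a_k in PRIMAL.items())


def primal_wavelet_mask(k):
    """Coefficient (-1)^k d_{1-k} of the primal wavelet in (40)."""
    return (-1) ** (k % 2) * DUAL.get(1 - k, 0.0)


def dual_wavelet_mask(k):
    """Coefficient (-1)^k a_{1-k} of the dual wavelet in (40)."""
    return (-1) ** (k % 2) * PRIMAL.get(1 - k, 0.0)


# Both wavelet masks should sum to zero (zeroth vanishing moment).
primal_wavelet_sum = sum(primal_wavelet_mask(k) for k in range(-1, 4))
dual_wavelet_sum = sum(dual_wavelet_mask(k) for k in range(0, 3))
```

Neither mask satisfies the orthonormality condition on its own; only the pair is dual, which is the extra freedom the biorthogonal setting provides.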

Fig. 2 Typical biorthogonal mother wavelets (panels a and b). Red: mother wavelet, blue: dilated and translated version

Therefore, given a primal generator $\phi$, one has to find a smooth and compactly supported dual generator $\tilde{\phi}$ satisfying (39), which is much less restrictive than the orthonormal setting. Elegant constructions can be found, e.g., in Cohen et al. (1992). There, especially the important case where the primal generator is a cardinal B-spline, $\phi = N_m$, $m \ge 1$, is discussed in detail. Generalizations to higher dimensions also exist (Cohen and Daubechies 1993). In particular, one may again use tensor products in the sense of (32). In Fig. 2, we show two examples of biorthogonal mother wavelets, together with a dilated and translated version. We see that these functions indeed look like small waves. The oscillations are necessary to obtain enough vanishing moments. For further information on wavelet analysis, the reader is referred to one of the excellent textbooks on wavelets which have appeared quite recently (Chui 1992; Daubechies 1992; Meyer 1992; Kahane and Lemarié-Rieusset 1995; Wojtaszczyk 1997).

2.3 Wavelets and Function Spaces

It is one of the most important properties of wavelets that they give rise to stable bases in scales of function spaces. For wavelet applications, the Besov spaces are the most important smoothness spaces. Therefore, in this section, we introduce the Besov spaces and recall their characterization by means of wavelet expansions. Let us start by recalling the definition of Besov spaces. The modulus of smoothness $\omega_r(v, t)_{L_p(\mathbb{R}^d)}$ of a function $v \in L_p(\mathbb{R}^d)$, $0 < p \le \infty$, is defined by

$$\omega_r(v, t)_{L_p(\mathbb{R}^d)} := \sup_{|h| \le t} \|\Delta_h^r(v, \cdot)\|_{L_p(\mathbb{R}^d)}, \qquad t > 0,$$

with $\Delta_h^r$ being the $r$th difference with step size $h$. For $s > 0$ and $0 < q, p \le \infty$, the Besov space $B_q^s(L_p(\mathbb{R}^d))$ is defined as the space of all functions $v$ for which

$$|v|_{B_q^s(L_p(\mathbb{R}^d))} := \begin{cases} \Big( \int_0^\infty \big[ t^{-s}\, \omega_r(v, t)_{L_p(\mathbb{R}^d)} \big]^q\, dt/t \Big)^{1/q}, & 0 < q < \infty, \\ \sup_{t \ge 0}\, t^{-s}\, \omega_r(v, t)_{L_p(\mathbb{R}^d)}, & q = \infty, \end{cases} \tag{42}$$

is finite with $r := [s] + 1$. Then, (42) is a (quasi-)semi-norm for $B_q^s(L_p(\mathbb{R}^d))$. If we add $\|v\|_{L_p(\mathbb{R}^d)}$ to (42), we obtain a (quasi-)norm for $B_q^s(L_p(\mathbb{R}^d))$. Let $P_0$ be the biorthogonal projector which maps $L_2(\mathbb{R}^d)$ onto $V_0$, i.e.,

$$P_0(v) := \sum_{k \in \mathbb{Z}^d} \langle v, \tilde{\phi}(\cdot - k) \rangle\, \phi(\cdot - k).$$

Then, $P_0$ has an extension as a projector to $L_p(\mathbb{R}^d)$, $1 \le p \le \infty$. For each $v \in L_p(\mathbb{R}^d)$, we have

$$v = P_0(v) + \sum_{e \in E \setminus \{0\}} \sum_{j \ge 0} \sum_{k \in \mathbb{Z}^d} \langle v, \tilde{\psi}_{j,k}^e \rangle\, \psi_{j,k}^e, \tag{43}$$

where $\psi_{j,k}^e$, $\tilde{\psi}_{j,k}^e$ denote the associated tensor product wavelet basis elements in the sense of (32). The Besov spaces $B_q^s(L_p(\mathbb{R}^d))$ can be characterized by wavelet coefficients provided the parameters $s$, $p$, $q$ satisfy certain restrictions. For simplicity, we shall only discuss the case $q = p$. Then, the following characterization holds:

Proposition 1. Let $\phi$ and $\tilde{\phi}$ be in $C^r(\mathbb{R})$ and compactly supported. If $0 < p \le \infty$ and $r > s > d(1/p - 1)$, then a function $v$ is in the Besov space $B_p^s(L_p(\mathbb{R}^d))$ if and only if

$$v = P_0(v) + \sum_{e \in E \setminus \{0\}} \sum_{j \ge 0} \sum_{k \in \mathbb{Z}^d} \langle v, \tilde{\psi}_{j,k}^e \rangle\, \psi_{j,k}^e \tag{44}$$

with

$$\|P_0(v)\|_{L_p(\mathbb{R}^d)} + \Big( \sum_{e \in E \setminus \{0\}} \sum_{j \ge 0} \sum_{k \in \mathbb{Z}^d} 2^{j \left( s + d \left( \frac{1}{2} - \frac{1}{p} \right) \right) p}\, \big| \langle v, \tilde{\psi}_{j,k}^e \rangle \big|^p \Big)^{1/p} < \infty, \tag{45}$$

and (45) provides an equivalent (quasi-)norm for $B_p^s(L_p(\mathbb{R}^d))$.

In the case $p \ge 1$, this is a standard result and can be found, e.g., in Meyer (1992, Sect. 10 of Chap. 6). For the general case of $p$, this can be deduced from general results in Littlewood-Paley theory (see, e.g., Sect. 4 of Frazier and Jawerth 1990) or proved directly. We also refer to Cohen (2003) and Triebel (2004) for further information. The condition that $s > d(1/p - 1)$ implies that the Besov space $B_p^s(L_p(\mathbb{R}^d))$ is embedded in $L_{\tilde{p}}(\mathbb{R}^d)$ for some $\tilde{p} > 1$ so that the wavelet decomposition of $v$ is defined. Also, with this restriction on $s$, the Besov space $B_p^s(L_p(\mathbb{R}^d))$ is equivalent to the nonhomogeneous Besov space $B_{p,p}^s$ defined via Fourier transforms and Littlewood-Paley theory.

The other important scale of function spaces we shall be concerned with are the Sobolev spaces $H^s(\mathbb{R}^d)$. Usually, these spaces are defined by means of weak derivatives or by Fourier transforms; see, e.g., Adams (1975) for details. However, it can be shown that for $p = q = 2$, Sobolev and Besov spaces coincide:

$$H^s(\mathbb{R}^d) = B_2^s(L_2(\mathbb{R}^d)), \qquad s > 0.$$

Therefore, Proposition 1 immediately provides us with a characterization of Sobolev spaces.

Proposition 2. Let $\phi$ and $\tilde{\phi}$ be in $C^r(\mathbb{R})$, $r > s$. Then a function $v$ is in the Sobolev space $H^s(\mathbb{R}^d)$ if and only if

$$\|P_0(v)\|_{L_2(\mathbb{R}^d)} + \Big( \sum_{e \in E \setminus \{0\}} \sum_{j \ge 0} \sum_{k \in \mathbb{Z}^d} 2^{2js}\, \big| \langle v, \tilde{\psi}_{j,k}^e \rangle \big|^2 \Big)^{1/2} < \infty, \tag{46}$$

and (46) provides an equivalent norm for $H^s(\mathbb{R}^d)$. For further information, the reader is referred to Frazier and Jawerth (1990) and Meyer (1992).
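The weighted coefficient sums in (45) and (46) are straightforward to evaluate once the wavelet coefficients are available level by level. The following sketch computes only the second term of (45), without the contribution $\|P_0(v)\|$; the data layout (a dictionary mapping the level $j$ to a list of coefficients, with all types $e$ and shifts $k$ merged) and the function names are assumptions made for this illustration.

```python
def besov_weight(j, s, d, p):
    """Level weight 2^{j (s + d (1/2 - 1/p)) p} from (45)."""
    return 2.0 ** (j * (s + d * (0.5 - 1.0 / p)) * p)


def coefficient_quasi_norm(coeffs, s, d, p):
    """Weighted l_p (quasi-)norm of wavelet coefficients as in (45).

    coeffs maps each level j >= 0 to an iterable containing all
    coefficients <v, psi~^e_{j,k}> of that level (assumed layout).
    """
    total = 0.0
    for j, level in coeffs.items():
        total += besov_weight(j, s, d, p) * sum(abs(c) ** p for c in level)
    return total ** (1.0 / p)
```

For $p = 2$ the weight reduces to $2^{2js}$, so the same function also evaluates the sum in (46).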

2.4 Wavelets on Domains

When it comes to practical applications, one clearly has to deal with objects that are defined on bounded domains and manifolds. In image analysis, particularly the case of functions defined on cubes contained in $\mathbb{R}^d$ is essential; see also Sect. 3. Therefore, a wavelet basis on these domains or manifolds is needed. We cannot simply cut all elements of a wavelet basis on the whole Euclidean plane, for this would destroy almost all of the important properties described above. For the unit cube, a first natural idea would be periodization. If one applies the operator

$$f^p(x) = \sum_{k \in \mathbb{Z}^d} f(x + k)$$

to the elements of a biorthogonal wavelet basis in $L_2(\mathbb{R}^d)$, one obtains a biorthogonal basis on the torus $T = \mathbb{R}^d / \mathbb{Z}^d$; see Meyer (1992), Dahmen et al. (1993), and Dahlke and Kunoth (1994) for details. However, this approach usually creates singularities at the boundary, which is inconvenient in many cases. Fortunately, it is by now well known how to construct suitable biorthogonal wavelet bases for many cases of interest; see Dahmen and Schneider (1998), Dahmen et al. (1999), Canuto et al. (1999, 2000), and Cohen and Masson (2000). The first step is always to construct nested sequences $S = \{S_j\}_{j \ge j_0}$, $\tilde{S} = \{\tilde{S}_j\}_{j \ge j_0}$ whose unions are dense in $L_2(\Omega)$. Then, one has to find suitable bases

$$\Psi_j = \{\psi_{j,k} : k \in \nabla_j\}, \qquad \tilde{\Psi}_j = \{\tilde{\psi}_{j,k} : k \in \nabla_j\},$$

for some complements $W_j$ of $S_j$ in $S_{j+1}$ and $\tilde{W}_j$ of $\tilde{S}_j$ in $\tilde{S}_{j+1}$ such that the biorthogonality condition

$$\langle \psi_{j,k},\ \tilde{\psi}_{j',k'} \rangle = \delta_{j,j'}\, \delta_{k,k'} \tag{47}$$

holds. Moreover, one has to ensure that all the convenient properties of wavelets, especially the characterization of Sobolev spaces according to Proposition 2, can still be saved. What turns out to matter is that

$$\Psi = \bigcup_{j \ge j_0} \Psi_j$$

forms a Riesz basis of $L_2(\Omega)$, i.e., every $v \in L_2(\Omega)$ has a unique expansion

$$v = \sum_{j=j_0}^{\infty} \sum_{k \in \nabla_j} \langle v, \tilde{\psi}_{j,k} \rangle\, \psi_{j,k} \tag{48}$$

such that

$$\|v\|_{L_2(\Omega)} \sim \Big( \sum_{j=j_0}^{\infty} \sum_{k \in \nabla_j} |\langle v, \tilde{\psi}_{j,k} \rangle|^2 \Big)^{1/2}, \qquad v \in L_2(\Omega), \tag{49}$$

and that both $S$ and $\tilde{S}$ should have some approximation and regularity properties which can be stated in terms of the following pair of estimates. There exists some $\gamma > 0$ such that the inverse estimate

$$\|v_n\|_{H^s(\Omega)} \lesssim 2^{ns}\, \|v_n\|_{L_2(\Omega)}, \qquad v_n \in S_n, \tag{50}$$

holds for $s < \gamma$. Moreover, there exists some $m \ge \gamma$ such that the direct estimate

$$\inf_{v_n \in S_n} \|v - v_n\|_{L_2(\Omega)} \lesssim 2^{-sn}\, \|v\|_{H^s(\Omega)}, \qquad v \in H^s(\Omega), \tag{51}$$

holds for $s \le m$; compare with (22). Such estimates are known to hold for every finite element or spline space. For instance, for piecewise linear finite elements one has $\gamma = 3/2$, $m = 2$. It will be convenient to introduce the following notation. Let

$$J := \{\lambda = (j, k) : k \in \nabla_j,\ j \ge j_0\} = \bigcup_{j=j_0}^{\infty} (\{j\} \times \nabla_j),$$

and define $|\lambda| := j$ if $\lambda = (j, k) \in J$ or $k \in \nabla_j$. Then the following result holds (Dahmen 1996).

Theorem 2. Suppose that $\Psi = \{\psi_\lambda : \lambda \in J\}$ and $\tilde{\Psi} = \{\tilde{\psi}_\lambda : \lambda \in J\}$ are biorthogonal collections in $L_2(\Omega)$ satisfying (49). If both $S$ and $\tilde{S}$ satisfy (50) and (51) relative to some $\gamma, \gamma' > 0$, $\gamma \le m$, $\gamma' \le m'$, then

$$\|v\|_{H^s(\Omega)} \sim \begin{cases} \Big( \sum_{\lambda \in J} 2^{2|\lambda| s}\, |\langle v, \tilde{\psi}_\lambda \rangle|^2 \Big)^{1/2}, & s \in (-\gamma', \gamma), \\ \Big( \sum_{\lambda \in J} 2^{2|\lambda| s}\, |\langle v, \psi_\lambda \rangle|^2 \Big)^{1/2}, & s \in (-\gamma, \gamma'), \end{cases} \qquad v \in H^s(\Omega). \tag{52}$$

All the constructions in Dahmen and Schneider (1998), Dahmen et al. (1999), Canuto et al. (1999, 2000), and Cohen and Masson (2000) induce isomorphisms of the form (52) at least for 1/2 […]

[…] $|y|$. Then the outer harmonic of degree $n \in \mathbb{N}_0$ and order $m \in \mathbb{Z}$, $-n \le m \le n$, at $x - y$ can be expanded in terms of inner and outer harmonics as follows:

$$O_{n,m}(x - y) = \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} I^*_{n',m'}(y)\, O_{n+n',m+m'}(x) = \sum_{n'=n}^{\infty} \sum_{m'=-n'}^{n'} I^*_{n'-n,m'-m}(y)\, O_{n',m'}(x). \tag{17}$$

Note that in (17), we make use of the convention that $I_{n,m} = 0$ if $|m| > n$.

Theorem 4 (Translation Theorem for Inner Harmonics). Let $x, y \in \mathbb{R}^3$. Then the inner harmonic of degree $n \in \mathbb{N}_0$ and order $m \in \mathbb{Z}$, $-n \le m \le n$, at $x - y$ can be expanded in a finite sum of inner harmonics

$$I_{n,m}(x - y) = \sum_{n'=0}^{n} \sum_{m'=-n'}^{n'} (-1)^{n'}\, I_{n',m'}(y)\, I_{n-n',m-m'}(x).$$

For orders with $|m| > n$, we have again by convention $I_{n,m} = 0$. By applying (17), Theorem 3 allows the translation of an outer harmonics expansion with expansion center $x_0$ such as

$$F(x) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F^{\wedge,O}_{x_0}(n, m)\, O_{n,m}(x - x_0), \tag{18}$$

which converges uniformly for $x \in \Omega^{\mathrm{ext}}_{r_0}(x_0)$ with some $r_0 > 0$. Here, $\Omega^{\mathrm{ext}}_{r_0}(x_0)$ denotes the exterior of the sphere of radius $r_0$ around $x_0$. The outer harmonics series resulting from the translation possesses the expansion center $x_1$ and the coefficients

$$F^{\wedge,O}_{x_1}(n', m') = \sum_{n=0}^{n'} \sum_{m=-n}^{n} F^{\wedge,O}_{x_0}(n, m)\, I^*_{n'-n,m'-m}(x_0 - x_1). \tag{19}$$

The expansion converges uniformly for $x \in \Omega^{\mathrm{ext}}_{r_1}(x_1)$, where $\Omega^{\mathrm{ext}}_{r_1}(x_1) \subset \Omega^{\mathrm{ext}}_{r_0}(x_0)$. This translation is called multipole-to-multipole translation (M2M). By Theorem 3, we also find that the outer harmonics expansion can be translated into an inner harmonics series centered around $x_2$ which converges uniformly for $x \in \Omega^{\mathrm{int}}_{r_2}(x_2)$ if the new ball of convergence is situated completely in $\Omega^{\mathrm{ext}}_{r_1}(x_1)$, i.e., $\Omega^{\mathrm{int}}_{r_1}(x_1) \cap \Omega^{\mathrm{int}}_{r_2}(x_2) = \emptyset$. The resulting coefficients of the inner harmonics expansion are

$$F^{\wedge,I}_{x_2}(n', m') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F^{\wedge,O}_{x_1}(n, m)\, (-1)^{n'+m}\, O^*_{n+n',m'-m}(x_2 - x_1), \tag{20}$$

and this translation is named multipole-to-local translation (M2L). Furthermore, Theorem 4 lets us shift the expansion center of such inner harmonics expansions to the new center $x_3$. The shifted expansion possesses the coefficients

$$F^{\wedge,I}_{x_3}(n', m') = \sum_{n=n'}^{\infty} \sum_{m=-n}^{n} F^{\wedge,I}_{x_2}(n, m)\, I_{n-n',m-m'}(x_3 - x_2) \tag{21}$$

and converges uniformly for $x \in \Omega^{\mathrm{int}}_{r_3}(x_3) \subset \Omega^{\mathrm{int}}_{r_2}(x_2)$. This translation step is called local-to-local translation (L2L). For further details, we refer to Gutting (2007) and the references therein, in particular Epton and Dembart (1995).

3.4 The Fast Multipole Algorithm

Before kernel expansions can be translated, we have to compute multipole expansions, more precisely a first set of expansion coefficients for each cube containing any points. Thus, only the part of the spline related to a single cube $X$ is considered, i.e., the kernel functions $K_H(x_i, \cdot)$, where $x_i \in X$:

$$F = \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i\, K_H(x_i, \cdot) = \frac{|y^{KT}|}{R} \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i \left. \left( D_x - \frac{1}{2R} \right) \frac{1}{|x - h y^{KT}|} \right|_{x = x_i}.$$

We find the following expansion for $|h y^{KT} - x_0| > |x_i - x_0|$, $x_i \in X$, i.e., if $x_0$ is the center of the cube $X$, the targets $h y^{KT}$ and the cube $X$ need to fulfill a distance requirement:

$$F = \frac{|y^{KT}|}{R} \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i \left. \left( \left( D_x - \frac{1}{2R} \right) \sum_{n=0}^{\infty} \sum_{m=-n}^{n} I^*_{n,m}(x - x_0)\, O_{n,m}(h y^{KT} - x_0) \right) \right|_{x = x_i} = \frac{|y^{KT}|}{R} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F^{\wedge,O}_{x_0}(n, m)\, O_{n,m}(h y^{KT} - x_0), \tag{22}$$

where the multipole coefficients $F^{\wedge,O}_{x_0}(n, m)$ of the cube $X$ are given by

$$F^{\wedge,O}_{x_0}(n, m) = \sum_{\substack{i=1 \\ x_i \in X}}^{N} a_i \left. \left( \left( D_x - \frac{1}{2R} \right) I^*_{n,m}(x - x_0) \right) \right|_{x = x_i}. \tag{23}$$

These coefficients can be translated to other cubes via relations (19), (20) as well as (21) as long as the distance requirements are fulfilled by the construction of the decomposition of the domain into nested cubes. This first step is called point to multipole (P2M) step. Obviously, the infinite sum in (22) has to be truncated at degree p. This parameter essentially determines the accuracy of the algorithm. At the end of the fast multipole cycle, i.e., after many M2M-, M2L-, L2L-translations, each cube Y possesses an inner harmonics expansion centered around the center of the cube. This expansion has to be evaluated at the targets contained by Y , i.e., the local to targets (L2T) step is performed: !ˇ p n ˇ jy KT j X X ^;I ˇ Fx0 .n; m/In;m .hy KT  x0 / ˇ ; (24) Lj F D F .yj / D ˇ R nD0 mDn

yDyj

2

R where the variable y is hidden by y KT D jyj 2 y. In Algorithm 1 we briefly recapitulate the fast multipole algorithm (see, e.g., Carrier et al. 1988; Cheng et al. 1999 or Gutting 2007 for our specific implementation). For the computation of the spline coefficients of the smoothing splines of Sect. 2.2, we have to consider the system of linear equations (11) instead of (8). This means that we have to add

Page 15 of 32

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_47-1 © Springer-Verlag Berlin Heidelberg 2014

Algorithm 1 Fast Multipole Algorithm

Input:
• a set of points $x_i \in \Sigma_{\mathrm{ext}}$ (often $x_i \in \Sigma$), $i = 1, \dots, N$,
• a set of coefficients $a_i$, $i = 1, \dots, N$,
• the choice of the type of the reproducing kernel $K_{\mathcal{H}}$ (singularity or Abel-Poisson with the parameter $h$ and the radius of the Runge sphere $R$),
• a set of evaluation points $y_j \in \Sigma_{\mathrm{ext}}$, $j = 1, \dots, M$,
• the degree of the multipole expansion $p$,
• the maximal number of points per cube $m$.

Aim: compute the sum
\[ F(y_j) = \sum_{i=1}^{N} a_i K_{\mathcal{H}}(x_i, y_j) \quad \text{for each } j = 1, \dots, M. \]

Initialization:
• Compute the targets $h y_j^{KT} = h \frac{R^2}{|y_j|^2} y_j$, $j = 1, \dots, M$.
• Create a bounding box that contains all points and all targets, build the adaptive octree, and sort in all points and targets (as described in the beginning of Sect. 3.2). Set $L$ as the maximum level, and eliminate all empty cubes.
• Determine list 1 to list 4 of Definition 7. Create a list of all cubes of level $l$ for each level $l = 0, \dots, L$. Collect all leaves in a list.
• Allocate memory for the different expansion coefficients of each cube $X$: multipole expansion (coefficient vector $M_X$), local expansion (coefficient vector $L_X$).

Fast Multipole cycle:
1. Generation of the multipole coefficients: For all leaves $X$: P2M, i.e., compute the multipole coefficients $M_X$ of the multipole expansion up to degree $p$ around the center of $X$ from the points in $X$ as in (23). For level $l = L-1, \dots, 2$: M2M, i.e., translate the multipole coefficients of the children of $X$ to $X$ itself for all cubes $X$ of level $l$ via (19).
2. Interaction phase for list 4: For level $l = 2, \dots, L$: for all cubes $X$ of level $l$: compute the expansion coefficients of an inner harmonics expansion around the center of $X$ from the points in $Y$ for all cubes $Y$ of list 4 of $X$ and add them to $L_X$, or use direct evaluation of the kernel sum corresponding to the points in $Y$ to obtain the result at the targets in $X$ if the number of targets in $X$ is at most $p^2$ and $X$ is a leaf.
3. Multipole-to-local translation: For level $l = 2, \dots, L$: for all cubes $X$ of level $l$: use (20) to translate $M_X$ to $L_Y$ for all cubes $Y$ in list 2 of $X$.
4. Translation of the inner harmonics expansions: For level $l = 2, \dots, L-1$: L2L, i.e., translate the local coefficients $L_X$ to the children of $X$ (if there are any) via (21) and add the resulting coefficients to $L_Z$, where $Z$ denotes the corresponding child of $X$, for all cubes $X$ of level $l$.
5. Evaluation of the expansions and direct interaction: For all leaves $X$: L2T, i.e., evaluate the inner harmonics expansion of $X$ at all targets in $X$ as in (24). Store the result in $F$. For all cubes $Y$ in list 1 of $X$: P2T, i.e., add the direct evaluation of the kernel sum corresponding to the points in $Y$ at the targets in $X$ to $F$. For all cubes $Y$ in list 3 of $X$: evaluate the multipole expansion around the center of $Y$ (coefficients $M_Y$) at the targets in $X$ and add the results to $F$, or use direct evaluation of the kernel sum corresponding to the points in $Y$ to add the result at the targets in $X$ to $F$ if the number of points in $Y$ is at most $p^2$ and $Y$ is a leaf.
6. Reverse the effects of the Kelvin transformation: $\tilde{F}_j = \frac{|y_j^{KT}|}{R} F_j$ for $j = 1, \dots, M$.

Return the result $\tilde{F}$.

$\beta \sum_{i=1}^{N} a_i (C^{-1})_{ij}$ to the matrix-vector product that is computed by the FMM. In order to keep the algorithm fast, the matrix $C^{-1}$ has to allow a fast summation method or $C$ has to be a sparse matrix. The trivial case where $C$ is a diagonal matrix can also be included in the direct evaluation step of the fast multipole algorithm.
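The initialization and the last step of Algorithm 1 both rely on the Kelvin transformation. The following minimal sketch shows the two operations in NumPy. The function names `kelvin_targets` and `undo_kelvin` are hypothetical and not taken from the chapter. The final check only verifies the purely algebraic identity $|x|^2|y|^2 - 2hR^2\, x\cdot y + h^2R^4 = |y|^2\,|x - h\,y^{KT}|^2$; how this quadratic form enters the reproducing kernels is defined earlier in the chapter and is not restated here.

```python
import numpy as np

def kelvin_targets(y, R, h):
    """Targets h * y^KT = h * R^2 / |y|^2 * y for each row of y (initialization step)."""
    y = np.asarray(y, dtype=float)
    r2 = np.sum(y * y, axis=1, keepdims=True)   # |y_j|^2, shape (M, 1)
    return h * R**2 / r2 * y

def undo_kelvin(F, y, R):
    """Step 6: multiply F_j by |y_j^KT| / R, which equals R / |y_j|."""
    y = np.asarray(y, dtype=float)
    return R / np.linalg.norm(y, axis=1) * np.asarray(F, dtype=float)

# Illustrative values only: points near the Earth's surface (in km).
rng = np.random.default_rng(0)
R, h = 6352.0, 0.95
x = rng.standard_normal((5, 3))
x *= 6370.0 / np.linalg.norm(x, axis=1, keepdims=True)
y = rng.standard_normal((5, 3))
y *= 6375.0 / np.linalg.norm(y, axis=1, keepdims=True)

t = kelvin_targets(y, R, h)
lhs = (np.sum(x * x, axis=1) * np.sum(y * y, axis=1)
       - 2.0 * h * R**2 * np.sum(x * y, axis=1) + h**2 * R**4)
rhs = np.sum(y * y, axis=1) * np.sum((x - t) ** 2, axis=1)
identity_holds = bool(np.allclose(lhs, rhs, rtol=1e-9))
```

Because the targets lie inside the Runge sphere, every point-target pair is separated, which is presumably what allows the sum to be treated with expansions of the single pole.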

3.5 Acceleration of the Translations

Following the ideas of White and Head-Gordon (1996) (see also Greengard and Rokhlin 1997; Cheng et al. 1999), the multipole-to-multipole (M2M) and the local-to-local (L2L) translations can both be accelerated by using Wigner rotation matrices (cf., e.g., Edmonds 1964; Biedenharn and Louck 1981; Varšalovič et al. 1988; Choi et al. 1999). With these rotations, the shift direction becomes the $\varepsilon^3$-axis; we shift there and rotate back. This reduces the numerical costs from $O(p^4)$ in the M2M and L2L steps to $O(p^3)$, since each rotation requires an effort of $O(p^3)$ and the shift along the $\varepsilon^3$-axis is given by
\[ \tilde{F}^{\wedge,O}_{x_1}(n', m') = \sum_{n=|m'|}^{n'} \tilde{F}^{\wedge,O}_{x_0}(n, m') \, \frac{|x_0 - x_1|^{n'-n}}{(n'-n)!}, \qquad m' = -n', \dots, n', \]
for M2M (the tilde indicates that we are dealing with rotated coefficients) and by
\[ \tilde{F}^{\wedge,I}_{x_3}(n', m') = \sum_{n=n'}^{p} \tilde{F}^{\wedge,I}_{x_2}(n, m') \, \frac{|x_3 - x_2|^{n-n'}}{(n-n')!}, \qquad m' = -n', \dots, n', \]
for L2L, which also requires an effort of $O(p^3)$. For a detailed description, we refer to White and Head-Gordon (1996) or Gutting (2007) with all technical details.

For the M2L translation, we follow the idea of Greengard and Rokhlin (1997) and Cheng et al. (1999) by replacing it with exponential translations which are based on the representation
\[ \frac{1}{|x-y|} = \frac{1}{2\pi} \int_0^{\infty} e^{-\lambda (x_3 - y_3)} \int_0^{2\pi} e^{i\lambda((x_1 - y_1)\cos\alpha + (x_2 - y_2)\sin\alpha)} \, d\alpha \, d\lambda = \sum_{k=1}^{s(\varepsilon)} \frac{w_k}{M_k} \sum_{j=1}^{M_k} e^{-\lambda_k (x_3 - y_3)} e^{i\lambda_k((x_1 - y_1)\cos\alpha_{j,k} + (x_2 - y_2)\sin\alpha_{j,k})} + O(\varepsilon) \tag{25} \]
for points $x, y$ whose Cartesian coordinates satisfy $1 \le x_3 - y_3 \le 4$ as well as $0 \le \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \le 4\sqrt{2}$. The inner integral is discretized using the trapezoidal rule, i.e., $\alpha_{j,k} = \frac{2\pi j}{M_k}$, $j = 1, \dots, M_k$, and the outer integral is treated with the integration weights $w_k$ and the integration points $\lambda_k$, $k = 1, \dots, s(\varepsilon)$, for a chosen accuracy $\varepsilon$. The values of $w_k$, $\lambda_k$, $M_k$ can be found in Greengard and Rokhlin (1997), Yarvin and Rokhlin (1998), and Cheng et al. (1999). The total number of numerical integration points, i.e., the number of exponential functions, is $S(\varepsilon) = \sum_{k=1}^{s(\varepsilon)} M_k \in O(p^2)$, whereas $s(\varepsilon) \in O(p)$ takes the role of $p$ in determining the accuracy of the integration.

Since outer harmonics are related to the single pole by Hobson's formula (cf. Hobson 1965), a multipole expansion of $F$ with coefficients $F_c^{\wedge,O}(n, m)$, center $c$, and accuracy $\varepsilon$ can be transformed by (25) into a series of exponentials (multipole-to-exponential step, briefly M2X)
\[ F(x) = \sum_{k=1}^{s(\varepsilon)} \sum_{j=1}^{M_k} W(k, j) \, e^{-\lambda_k \frac{x_3 - c_3}{d}} \, e^{i\lambda_k \left( \frac{x_1 - c_1}{d} \cos\alpha_{j,k} + \frac{x_2 - c_2}{d} \sin\alpha_{j,k} \right)} + O(\varepsilon), \tag{26} \]

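with the coefficients given by
\[ W(k, j) = \frac{w_k}{M_k} \sum_{n=0}^{p} \sum_{m=-n}^{n} \frac{F_c^{\wedge,O}(n, m)}{d^{\,n+1}} \, \lambda_k^{n} \, i^{m} e^{i m \alpha_{j,k}} \tag{27} \]
for $k = 1, \dots, s(\varepsilon)$, $j = 1, \dots, M_k$. It is required that $c$ is the center of a box of edge length $d$ containing the points $x$ and the series of exponentials is valid for points $y$ with $d \le x_3 - y_3 \le 4d$ and $0 \le \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \le 4\sqrt{2}\, d$. Such exponential expansions can be shifted very efficiently to the new center $b$, i.e., to
\[ F(x) = \sum_{k=1}^{s(\varepsilon)} \sum_{j=1}^{M_k} V(k, j) \, e^{-\lambda_k \frac{x_3 - b_3}{d}} \, e^{i\lambda_k \left( \frac{x_1 - b_1}{d} \cos\alpha_{j,k} + \frac{x_2 - b_2}{d} \sin\alpha_{j,k} \right)} \]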

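with the new coefficients
\[ V(k, j) = W(k, j) \, e^{-\lambda_k \frac{b_3 - c_3}{d}} \, e^{i\lambda_k \left( \frac{b_1 - c_1}{d} \cos\alpha_{j,k} + \frac{b_2 - c_2}{d} \sin\alpha_{j,k} \right)} \tag{28} \]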

for $k = 1, \dots, s(\varepsilon)$, $j = 1, \dots, M_k$. This exponential-to-exponential shift is abbreviated by X2X. Afterwards, the exponential expansion is transformed back into an inner harmonics expansion completing the M2L translation step:
\[ F(x) = \sum_{n=0}^{p} \sum_{m=-n}^{n} F_c^{\wedge,I}(n, m) \, I_{n,m}(x - c) + O(\varepsilon), \tag{29} \]

where the coefficients can be computed by this so-called exponential-to-local (X2L) step
\[ F_c^{\wedge,I}(n, m) = \sum_{k=1}^{s(\varepsilon)} \sum_{j=1}^{M_k} W(k, j) \, \frac{(-\lambda_k)^n}{d^{\,n}} \, i^{m} e^{i m \alpha_{j,k}} \tag{30} \]
for $n = 0, \dots, p$, $m = -n, \dots, n$. Obviously, the same geometric restrictions hold as before.

The restrictions on the positions of $x$ and $y$ mean that the exponential translations are applicable for cubes in list 2 that are situated above the current cube with another cube in between. However, by combining the idea with rotations of the multipole expansion using again the Wigner rotation matrices, the exponential translation can substitute the M2L translation for all cubes in list 2. Therefore, the list of all well-separated cubes (list 2) is split into six directional lists (up, down, North, South, East, and West), and instead of M2L, the following sequence of transformations is used: (rotation), M2X (27), X2X (28), and X2L (30), (inverse rotation).

Each exponential shift requires numerical costs of $O(S(\varepsilon)) = O(p^2)$, and the rotations can be applied using $O(p^3)$ operations (as do the M2X and X2L steps). Thus, this improves the performance compared to the M2L step's $O(p^4)$ effort. Each cube needs to allocate some additional memory for an outgoing and an incoming set of coefficients of the sum of exponentials.

Moreover, we can save translations by recombination (see Greengard and Rokhlin 1997; Cheng et al. 1999; Gutting 2007). If several cubes have the same target cubes, intermediate stops are performed. No additional memory is required since we can use the available storage of cubes at a lower level. In 3D we perform 8 translations from $X_1, \dots, X_8$ to their common parent cube $C$ and 9 translations from $C$ to $B_1, \dots, B_9$, which are the parent cubes of $S_1, \dots, S_{36}$, instead of 288 translations from $X_1, \dots, X_8$ to $S_1, \dots, S_{36}$, as summarized in the following diagram:

[Diagram (31): translations $X_1, \dots, X_8 \to C \to B_1, \dots, B_9 \to S_1, \dots, S_{36}$]

Not only expansions of $X_j$, but also of other cubes that have the children of $B_k$ as target cubes, are collected in the parent cubes $B_k$ and are shifted to the cubes $S_j$ after all contributions are added. Thus, we only need to perform the translations from $B_k$ to its children once. This further reduction is the reason that we use two intermediate stations for the translation. We call this the first stage of recombination. Similar considerations can be used in other situations where fewer cubes have common target cubes (second stage of recombination). The corresponding diagram for the 3D case is given by

[Diagram (32)]

As before, the final translation is executed only after all contributions have arrived in the cubes $B_k$. These two stages already cover the complete list for the directions up and down. For the other four lists, we introduce a third stage. Obviously, we only consider cubes that have not yet been treated by the first two stages. If a cube $X$ has a cube $Y$ in one of its directional lists, we collect all other children of $X$'s parent $C$ which also have $Y$ in the corresponding directional list in the combination list (altogether $M$ cubes). If $M > 1$, we collect the maximal number of common cubes in the corresponding directional lists of all cubes of the combination list in the target cube list (altogether $N$ cubes). If $N > 1$, all expansions from cubes in the combination list are shifted to their parent cube $C$ and then to the cubes of the target cube list. The number of translations is reduced from $N \cdot M$ to $N + M$.
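The translation counts quoted for the recombination can be tallied in a few lines. This is only a back-of-the-envelope check, assuming that each of the nine cubes $B_k$ has four of the target cubes $S_j$ as children (which is what $36 = 9 \cdot 4$ suggests):
\[ \underbrace{8 \cdot 36}_{\text{direct}} = 288 \qquad \text{versus} \qquad \underbrace{8}_{X_j \to C} + \underbrace{9}_{C \to B_k} = 17 \ \text{shifts, plus the } 36 \text{ final shifts } B_k \to S_j, \]
where the final 36 shifts are performed only once for all source cubes that contribute to the $B_k$. For the third stage, a combination list of, say, $M = 3$ cubes with $N = 4$ common target cubes needs $N + M = 7$ instead of $N \cdot M = 12$ translations.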

Finally, any remaining cubes are treated individually as without the recombination technique. This procedure can significantly reduce the number of necessary translations. The algorithm is described and analyzed in full detail in Gutting (2007). It should also be noted that there are several symmetries in the coefficients of the exponential expansion since we are dealing with a real-valued function F in (26). They are used to further reduce the numerical costs (cf. Greengard and Rokhlin 1997; Cheng et al. 1999).
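The integral representation (25) can be checked numerically with a few lines of NumPy. The sketch below is not the quadrature used in the chapter: instead of the tabulated generalized Gaussian nodes $\lambda_k$, $w_k$ with $k$-dependent $M_k$ from Cheng et al. (1999), it assumes a brute-force Gauss-Legendre rule on a truncated interval $[0, \lambda_{\max}]$ and a fixed number of trapezoidal points for the inner integral. The function names and parameter values are illustrative only.

```python
import numpy as np

def exp_nodes(n_lambda=200, lam_max=40.0, n_alpha=64):
    """Brute-force nodes and weights: Gauss-Legendre in lambda, trapezoid in alpha."""
    t, w = np.polynomial.legendre.leggauss(n_lambda)
    lam = 0.5 * lam_max * (t + 1.0)          # map [-1, 1] to [0, lam_max]
    w = 0.5 * lam_max * w
    alpha = 2.0 * np.pi * np.arange(1, n_alpha + 1) / n_alpha
    return lam, w, alpha

def inv_dist_exp(d, lam, w, alpha):
    """Approximate 1/|d| for d = x - y with d[2] > 0 by the discretised form of (25)."""
    phase = d[0] * np.cos(alpha) + d[1] * np.sin(alpha)       # shape (n_alpha,)
    # (1/M) * sum_j exp(i * lam_k * phase_j): trapezoidal rule including the 1/(2*pi)
    inner = np.exp(1j * np.outer(lam, phase)).mean(axis=1)    # shape (n_lambda,)
    return float(np.real(np.sum(w * np.exp(-lam * d[2]) * inner)))

lam, w, alpha = exp_nodes()
d = np.array([0.5, -0.3, 2.0])        # satisfies 1 <= d[2] <= 4
approx = inv_dist_exp(d, lam, w, alpha)
exact = 1.0 / np.linalg.norm(d)
```

The point of the specialised quadrature in the cited references is to reach a prescribed accuracy $\varepsilon$ with far fewer exponentials than such a naive rule needs.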

4 Numerical Tests and Results

At first, the effect of the recombination described at the end of Sect. 3.5 is investigated for a fully populated bounding box. Obviously, this is the ideal case maximizing the effect of the recombination. The increased efficiency of the X2X step for the various stages of recombination (see Sect. 3.5; by stage 0 we indicate the case without recombination) is documented by Tables 1 and 2. Note that the accuracy $s(\varepsilon) = 26$ and the corresponding values for $M_k$, $S(\varepsilon)$ from Cheng et al. (1999) are chosen for this test. The well-known symmetries in the exponential coefficients (see Gutting 2007 for a detailed derivation) are also used to reduce the computational costs.

Table 1 Number of exponential translations depending on the stages in the recombination technique for the fully populated octree with levels 2–5. In brackets in the first column, the number of cubes in this level is given; the average number of translations per cube is listed in brackets in the other columns

Level (cubes)   Stage 0             Stage 1             Stage 2             Stage 3
2 (64)          3,096 (48.4)        1,648 (25.8)        1,000 (15.6)        902 (14.1)
3 (512)         53,352 (104.2)      27,792 (54.3)       16,056 (31.4)       13,602 (26.6)
4 (4,096)       584,136 (142.6)     302,176 (73.8)      171,640 (41.9)      141,050 (34.4)
5 (32,768)      5,398,920 (164.8)   2,784,960 (85.0)    1,570,680 (47.9)    1,271,850 (38.8)

Table 2 Computing times (in seconds) corresponding to the exponential translations of Table 1

Level   Stage 0    Stage 1   Stage 2   Stage 3
2       0.61       0.32      0.20      0.18
3       9.96       5.27      3.11      2.67
4       107.89     56.55     32.87     27.34
5       1,040.81   548.03    330.88    263.01

For a rather sphere-like setting as in the discrete version of the Dirichlet boundary value problem (Problem 3) or in an interpolation/approximation problem on the Earth's surface, the savings reduce to about 30 % because of the adaptive construction of the octree. This can be seen in Table 3 where an adaptive decomposition is computed for points randomly distributed between two spheres of radius 6,356 and 6,378 km. The targets are the Kelvin transforms of these points with respect to a sphere of radius $R = 6{,}352$ km and the parameter $h = 0.95$. The adaptive FMM used the parameters $p = 23$, $s(\varepsilon) = 26$, $S(\varepsilon) = 670$, $m = 130$.

Table 3 Computing times (in seconds) for the exponential translation without and with recombination (all three stages). Note that the listed times include the M2X and X2L transformations in both cases

Maximal level   Cubes   Leaves   Without recombination   With recombination
4               1,313   1,112    30.8                    24.21
4               1,545   1,280    37.37                   29.24
5               1,682   1,384    41.62                   32.26
5               2,630   2,200    58.74                   47.81

Now the truncation degree $p$ is investigated for different accuracies of the exponential translation $s(\varepsilon)$. We increase $p$ while $s(\varepsilon)$ is kept fixed and determine when the integration error of the exponential translation dominates the truncation error. This leads to the choices of $p$ for different levels of $s(\varepsilon)$ given by Table 4.

Table 4 Resulting truncation degrees $p$ for different $s(\varepsilon)$ for the two types of kernels

s(ε)   Singularity kernel   Abel-Poisson kernel
8      4                    5
17     12                   13
26     23                   25

For the remaining tests in this section, we consider the following test scenario: We take latitudes and longitudes of the points of a so-called spiral grid (cf. Rakhmanov et al. 1994) and combine these with a random radius between 6,356.7 and 6,378.1 km. As radius of the Runge sphere, $R = 6{,}356$ km is chosen and $h = 0.95$. The spline coefficients $a_i$ are just random numbers between $-1$ and $1$. We compute
\[ S(x_j) = \sum_{i=1}^{N} a_i \, \mathcal{L}_j \mathcal{L}_i K_{\mathcal{H}}(\cdot, \cdot) = \sum_{i=1}^{N} a_i K_{\mathcal{H}}(x_i, x_j), \qquad j = 1, \dots, N, \]
with exponential translations of different accuracies and without them (using standard M2L translations). We compare the results to the direct summation without FMM. The error behavior can be seen in Fig. 3, and together with many further tests (cf. Gutting 2007), we obtain the values of Table 4.

Fig. 3 Error induced by the FMM for the singularity kernel (left) and Abel-Poisson kernel (right) with exponential translations of different accuracies for $s(\varepsilon) = 8$ (red line), $s(\varepsilon) = 17$ (cyan line), $s(\varepsilon) = 26$ (green line). For comparison, we include the standard M2L translation (blue line) with increasing truncation degree $p$ as abscissae

Finally, the maximal number of points or targets per cube $m$ has a strong influence on the adaptive octree construction and the performance of the FMM. If $m$ is too small, there are many cubes each containing only very few points. Thus, the kernel expansion coefficients no longer combine the information of enough points to be efficient. If $m$ is too large, there are only few cubes each with a large number of points. This means that direct interaction is used far too often instead of the kernel expansion. After many empirical tests (cf. Gutting 2007), we came to the conclusion that the choices for $m$ given by Table 5 lead to a good performance in our implementation.

Table 5 Chosen maximal numbers of points $m$ per cube for the singularity kernel and the Abel-Poisson kernel and the different error levels

s(ε)   Singularity kernel   Abel-Poisson kernel
8      85                   75
17     130                  140
26     380                  240

After these optimizations of the parameters of the fast multipole algorithm, we can compare its performance with direct computation and find the break-even points of our implementation, i.e., the minimal number of points that is necessary for our algorithm to be faster than the direct approach (see Table 6). Note that such results are always very dependent on the implementation. Furthermore, the linear asymptotic behavior which we expect from the FMM becomes obvious in Fig. 4 compared to the quadratic behavior of the direct approach. Our implementation turns out to be efficient even for rather small problem sizes. In general, the Abel-Poisson kernel requires some more computation time since it leads to a more difficult P2M step.

Table 6 Break-even points for the singularity kernel and the Abel-Poisson kernel

s(ε)   Singularity kernel   Abel-Poisson kernel
8      530                  360
17     1,160                960
26     2,670                2,250

Fig. 4 Break-even points by comparison of computation times (in seconds) for direct (blue line) and FMM-accelerated (red line) computation (left: singularity kernel, right: Abel-Poisson kernel), the number of points forms the abscissae
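The effect of the recombination stages reported in Tables 1 and 2 can be summarised as relative savings with respect to stage 0. The following short script only re-uses the published table entries; the helper `saving` is a hypothetical name introduced for this illustration.

```python
# Entries of Table 1 (number of exponential translations) for stages 0..3
translations = {
    2: (3096, 1648, 1000, 902),
    3: (53352, 27792, 16056, 13602),
    4: (584136, 302176, 171640, 141050),
    5: (5398920, 2784960, 1570680, 1271850),
}
# Entries of Table 2 (computing times in seconds) for stages 0..3
seconds = {
    2: (0.61, 0.32, 0.20, 0.18),
    3: (9.96, 5.27, 3.11, 2.67),
    4: (107.89, 56.55, 32.87, 27.34),
    5: (1040.81, 548.03, 330.88, 263.01),
}

def saving(values):
    """Relative saving of each stage with respect to stage 0 (first entry)."""
    return [1.0 - v / values[0] for v in values]

for level in sorted(translations):
    s_tr = saving(translations[level])[-1]
    s_t = saving(seconds[level])[-1]
    print(f"level {level}: {100 * s_tr:.1f} % fewer translations, "
          f"{100 * s_t:.1f} % less time with all three stages")
```

For the fully populated octree, the savings in computing time follow the savings in the number of translations closely, which is consistent with the X2X step dominating the measured times.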

5 Overlapping Domain Decomposition Preconditioner

The fast multipole algorithm already provides us with an adaptive decomposition of the computational domain which we also use to improve the condition of the system of linear equations (8) by using a Schwarz algorithm as preconditioner (see Chan and Mathew 1994; Smith et al. 1996 and the references therein for an overview). The basic idea of the Schwarz algorithms is to split up the computational domain into several (often overlapping) parts, solve the resulting smaller problems for each part, and combine the partial solutions. We intend to use only one decomposition for both the FMM and the Schwarz algorithm.

There are two main variants that have different ways to update the residual: In the multiplicative variant, the residual is updated after each single solution of a subproblem. In the additive variant, the solutions of the subproblems are merged after all of them are computed, and the merged result updates the residual (cf. Beatson et al. 2000; Zhou et al. 2003 for use of these methods in radial basis function interpolation and Hesse 2002 for use of the multiplicative variant in the context of harmonic splines). The multiplicative variant possesses the advantage of not requiring the full matrix for the update of the residuals, but this is less relevant in the presence of the FMM which provides a very fast way to (approximately) multiply with the full matrix. Moreover, the total number of update steps is higher for the multiplicative variant. Thus, we choose the additive Schwarz algorithm together with a coarse grid correction and use a residual update of Hybrid-II type (by the categorization of Smith et al. 1996).

Since all the decomposing of the domain and sorting of points is already part of the FMM, the only additional work in the initialization of the preconditioner is the determination of points in the overlapping parts of the subdomains which are exactly given by the leaves of the octree. The amount of the overlap is controlled by the parameter $\vartheta_{\mathrm{ov}}$, and the area is a part of the directly adjacent cubes, i.e., cubes in list 1 as set in Definition 7 (white in Fig. 5). The overlap is the blue area in Fig. 5. Note that $\vartheta_{\mathrm{ov}} = 0$, i.e., no overlap at all, is a valid choice.

In our problems we usually do not assume a structured point distribution. The idea for establishing a coarse grid from our scattered data points is to choose from each subproblem the point that is closest to the center of the domain and add that point to the coarse grid. Note that for the preconditioner, we only consider points and ignore the so-called targets of Sect. 3.2. If a cube only contains targets and none of the points $x_1, \dots, x_N$, neither overlap nor coarse grid points are determined for it. After these initialization steps that need to be computed only once, we can investigate the preconditioning cycle which yields $v = M^{-1} b$ for a given residual $b$.

Fig. 5 Two-dimensional illustration of the overlap (dark blue area) for an adaptive decomposition of the domain. The red and light blue areas belong to list 2 and 3, respectively

At first, we introduce the restriction matrix for each subdomain. Let $M \in \mathbb{N}$ be the total number of subdomains and the subdomain $X_r$, $r = 1, \dots, M$, contain $N_r$ points in the cube and its corresponding overlap. By $\pi_r$, we denote the permutation of the numbers $1, \dots, N$ that yields the points $x_{\pi_r(i)}$ of $X_r$ for the indices $i = 1, \dots, N_r \le N$. For $i = N_r + 1, \dots, N$, the permutation $\pi_r$ is supposed to give the remaining indices between $1$ and $N$ such that $\pi_r(i) < \pi_r(i+1)$ for all $i = N_r + 1, \dots, N - 1$. With the help of the canonical basis vectors in $\mathbb{R}^N$ and the index permutation $\pi_r$, the restriction matrix $R_r$ is defined as
\[ R_r = \begin{pmatrix} e^T_{\pi_r(1)} \\ \vdots \\ e^T_{\pi_r(N_r)} \end{pmatrix} \in \mathbb{R}^{N_r \times N}. \tag{33} \]
The restriction matrices reduce a vector $x \in \mathbb{R}^N$ to a vector in $\mathbb{R}^{N_r}$ consisting only of components of $x$ related to $X_r$. The corresponding transposed matrix $R_r^T \in \mathbb{R}^{N \times N_r}$ is called the prolongation matrix in the multiplicative variant. The restriction/prolongation matrices $R_0$, $R_0^T$ for the coarse grid are defined analogously using the corresponding set of points. In the additive variant with overlapping subdomains, $R_r^T$, $r = 1, \dots, M$, stands for the operation of fitting together solutions corresponding to the subdomains. Several possibilities are available (see Gutting 2007 and the references therein), and we choose to only update components that correspond to points of the cube in $X_r$ without the overlap. The overlapping part still plays a role in solving the subproblems, i.e., in $R_r$ and $A_r$. Obviously, these restriction and prolongation operators are only written as matrices for the description of the algorithm; their effect is implemented in the form of index operations.

Algorithm 2 Overlapping Additive Schwarz Preconditioner

Initialization step:
Find points $x_i$, $i = 1, \dots, N_r$, in each domain including overlap and set matrices
\[ A_r = \left( K_{\mathcal{H}}(x_i, x_j) \right)_{i,j = 1, \dots, N_r}, \qquad r = 1, \dots, M. \tag{34} \]
Find the coarse grid and build the corresponding matrix $A_0 = \left( K_{\mathcal{H}}(x_i, x_j) \right)$, where $i, j = 1, \dots, N_0$ correspond to the indices of the $N_0$ points of the coarse grid. The residual $b$ is given.

Preconditioning cycle:
For $r = 1, \dots, M$: Solve $A_r z_r = R_r b$ to obtain $z_r$.
Update at the inner points: $v = \sum_{r=1}^{M} R_r^T z_r$.
Update the coarse residual and perform coarse grid correction:
\[ v_{\mathrm{final}} = v + R_0^T A_0^{-1} R_0 (b - A v), \tag{35} \]
where $A$ denotes the full matrix corresponding to all $N$ points.
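The preconditioning cycle of Algorithm 2 can be sketched with dense NumPy operations. This is a minimal illustration under simplifying assumptions, not the implementation described in the chapter: the subproblems are solved by `np.linalg.solve` instead of re-using stored factorizations, the product $Av$ is formed with the dense matrix instead of the FMM, and the function name, the argument layout, and the exponential test kernel are all hypothetical.

```python
import numpy as np

def additive_schwarz(A, b, subdomains, coarse):
    """One preconditioning cycle v_final = v + R0^T A0^{-1} R0 (b - A v).

    subdomains: list of (inner, overlap) index sequences, one pair per cube;
    coarse:     index sequence of the coarse grid points.
    """
    v = np.zeros(len(b))
    for inner, overlap in subdomains:
        inner = np.asarray(inner, dtype=int)
        overlap = np.asarray(overlap, dtype=int)
        idx = np.concatenate([inner, overlap])              # R_r: cube plus overlap
        z = np.linalg.solve(A[np.ix_(idx, idx)], b[idx])    # A_r z_r = R_r b
        v[inner] += z[:len(inner)]                          # R_r^T: inner points only
    c = np.asarray(coarse, dtype=int)
    residual = b - A @ v                                    # dense stand-in for the FMM
    v_final = v.copy()
    v_final[c] += np.linalg.solve(A[np.ix_(c, c)], residual[c])
    return v_final

# Small illustrative system with an exponential kernel on 12 points in [0, 1].
x = np.linspace(0.0, 1.0, 12)
A = np.exp(-np.abs(x[:, None] - x[None, :]))
b = np.sin(2.0 * np.pi * x)

# Two cubes with six inner points each and two overlap points taken from the neighbor.
subdomains = [(range(0, 6), [6, 7]), (range(6, 12), [4, 5])]
v = additive_schwarz(A, b, subdomains, coarse=[2, 9])
```

Writing the prolongation as an index assignment restricted to the inner points mirrors the remark that the restriction and prolongation operators are implemented as index operations rather than as matrices.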

Table 7 Splines using the singularity and the Abel-Poisson kernel with $N = 8{,}800$ or $N = 11{,}200$ points. Computation times (in seconds) and the number of iterations in brackets. The first line uses a direct solver for comparison; the second line uses GMRES without preconditioning. Note that we restrict the GMRES algorithm to a maximum of 300 iterations. If no convergence is achieved by then, we write >300

ϑ_ov     Singularity        Singularity        Abel-Poisson      Abel-Poisson
         (8,800 points)     (11,200 points)    (8,800 points)    (11,200 points)
Direct   331.32             580.30             291.81            577.53
No PC.   1,020.7 (>300)     1,187.84 (>300)    522.76 (84)       747.14 (101)
0.1      108.20 (29)        149.50 (35)        90.29 (13)        131.38 (16)
0.2      52.16 (13)         84.45 (19)         52.48 (7)         63.06 (7)
0.3      38.00 (9)          52.68 (11)         40.26 (5)         50.27 (5)
0.4      32.01 (7)          37.82 (7)          34.71 (4)         50.63 (5)
0.5      26.20 (5)          37.62 (6)          36.23 (4)         46.41 (4)
0.6      29.20 (5)          39.91 (5)          39.33 (4)         53.16 (4)

Remark 1. The computation of $R_0 A v$ in (35) stands for the product of an $N_0 \times N$-matrix with a vector in $\mathbb{R}^N$. Since $N_0 \ll N$, this matrix-vector product is computed directly in our implementation even though it is of the type for which the FMM works as fast summation method.

Our implementation solves the subproblems directly, i.e., the matrices $A_r$ are factorized once in the initialization step. Algorithm 2 is applied as preconditioner in the well-known preconditioned GMRES algorithm for the iterative solution of systems of linear equations (cf. Saad and Schultz 1986 for the first formulation of GMRES or, e.g., Saad 2003). The matrix-vector products with the full matrix $A$ that occur in GMRES are accelerated with the fast multipole algorithm of Sect. 3.

The overlap parameter $\vartheta_{\mathrm{ov}}$ which controls the size of the overlapping parts of the domains (see Fig. 5) plays an important role in the effectiveness of the preconditioner. Note that the number of overlap points is not directly controlled this way. On the one hand, a larger overlapping part, i.e., more overlap points, generally reduces the number of iterations, because the efficiency of the additive Schwarz preconditioner rises. On the other hand, the number of points related to each domain also grows, i.e., the effort for each domain increases, and hence the iterations take more time.

We investigate the following test scenario to determine a good balance. The Earth's surface is described by the TerrainBase model (see Hastings and Row 1997), and we choose the latitudes and longitudes of the data points from a so-called spiral grid (cf. Rakhmanov et al. 1994), i.e., the points are approximately equidistributed on the surface for this test. As gravitational data to interpolate, we use the EGM96 model (cf. Lemoine et al. 1998). The fast multipole algorithm is run with $s(\varepsilon) = 17$ and corresponding values for $p$ and $m$ (see Sect. 4). The spline parameters are $R = 6{,}352$ km and $h = 0.92$ for both kernels.

From Table 7, we find that the differences for $\vartheta_{\mathrm{ov}} \ge 0.4$ are rather small. Moreover, the disadvantages of a too large overlap become more and more evident for more points, in particular more points in the cubes, e.g., resulting from $s(\varepsilon) = 26$ (see Table 5). Therefore, we use the preconditioner with $\vartheta_{\mathrm{ov}} = 0.4$ in our calculations in Sect. 6. Further parameter studies and tests of other update variants can be found in Gutting (2007).

For smoothing splines, similar restrictions on the matrix $C$ in (11) hold as for the FMM (see the end of Sect. 3.4). However, the system of linear equations (11) is typically much better conditioned than (8) because of the smoothing.
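The trade-off between fewer iterations and more expensive iterations can be read off Table 7 by dividing the total time by the iteration count. The snippet below does this for the singularity kernel with 8,800 points. It is only a rough indicator, since the listed times presumably also contain the one-time initialization (e.g., the factorization of the matrices $A_r$), which is not reported separately.

```python
# Table 7, singularity kernel, N = 8,800: (total time in seconds, iterations)
overlap = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
singularity_8800 = [(108.20, 29), (52.16, 13), (38.00, 9),
                    (32.01, 7), (26.20, 5), (29.20, 5)]

# Average time per GMRES iteration (including a share of the setup costs)
per_iteration = [time / iters for time, iters in singularity_8800]

# Overlap parameter with the smallest total time in this column
best_overlap = overlap[min(range(len(overlap)),
                           key=lambda i: singularity_8800[i][0])]

for th, (time, iters), t_it in zip(overlap, singularity_8800, per_iteration):
    print(f"overlap {th:.1f}: {iters:2d} iterations, {t_it:.2f} s per iteration")
```

In this column the average cost per iteration grows with every increase of the overlap, while the iteration count stops decreasing beyond 0.5.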

Fig. 6 Spline interpolation of the gravitational potential (left) and absolute error distribution (right), both in m^2/s^2

Fig. 7 A closer look at some data gaps where the largest error (in m^2/s^2) occurs in Fig. 6 (right). The data points are added in magenta

6 Numerical Tests and Results

At first, we consider a global test example putting together all previous parts. In this section, the FMM is used with $s(\varepsilon) = 26$ and the corresponding parameters (see Sect. 4), and the preconditioner is set up with the overlap parameter $\vartheta_{\mathrm{ov}} = 0.4$ as explained in Sect. 5. To determine random points on the Earth's surface, we compute a uniform distribution on a sphere. Since the Earth's surface is close to a sphere, the distribution can be expected to be still close to uniform. Of course, even for uniformly distributed random points, data gaps can occur leading to complications. As test data, we simply evaluate the EGM96 (using degrees 3–100 here) at $N = 100{,}000$ random points on the Earth's surface. In Fig. 6, the spline of the gravitational potential and the distribution of the absolute error of the spline interpolation can be seen. Figure 7 provides a detailed view of the errors at some data gaps. The spline with the singularity kernel is computed with $h = 0.98$ and $R = 6{,}352$ km in just 18 iterations and leads to a mean absolute error of $\varepsilon_{\mathrm{mabs}} = 0.0413\ \mathrm{m^2/s^2}$ and a maximal absolute error of $\varepsilon_{\max} = 4.1442\ \mathrm{m^2/s^2}$.

For our further considerations, we no longer work on the whole surface of the Earth but restrict ourselves to a region containing South America (65° S to 30° N, 110° W to 10° W), and only the points within this region are interpolated. As test data, the EGM96 (using degrees 16–200) is evaluated at $N = 48{,}749$ almost uniformly distributed random points (constructed as before) in the region on the Earth's surface (see Fig. 8 (left)). The resulting spline with the singularity kernel with $h = 0.9875$ and $R = 63{,}544$ km required 32 iterations to obtain the coefficients, and its mean absolute error is $\varepsilon_{\mathrm{mabs}} = 0.0353\ \mathrm{m^2/s^2}$. The maximal error is $\varepsilon_{\max} = 8.0492\ \mathrm{m^2/s^2}$ and occurs at the boundary of the region (see Fig. 9). Thus, we ignore 2.5 % of the region at each boundary (see Fig. 10), which reduces both mean and maximal errors to $\tilde{\varepsilon}_{\mathrm{mabs}} = 0.0280\ \mathrm{m^2/s^2}$ and $\tilde{\varepsilon}_{\max} = 3.6544\ \mathrm{m^2/s^2}$, respectively (compare Fig. 10 with Fig. 9). It should be noted that the error naturally is largest at the boundary since there are no more data points beyond it. Therefore, one can consider everything outside the region under consideration as one huge data gap.

Finally, we consider another regional example with the Abel-Poisson kernel (with $h = 0.98$ and $R = 6{,}354$ km) using the spiral grid of Rakhmanov et al. (1994) mapped to the Earth's surface as before (see the test scenarios in Sect. 5). The points ($N = 48{,}744$) are illustrated in Fig. 8 (right), and as data we evaluate again the EGM96 (using degrees 16–360). Our algorithm requires 15 iterations, and the errors are $\varepsilon_{\mathrm{mabs}} = 0.0199\ \mathrm{m^2/s^2}$ and $\varepsilon_{\max} = 10.7790\ \mathrm{m^2/s^2}$, where the maximal error again occurs at the boundary of the region (see Fig. 11). Cutting off again 2.5 % of the region at the boundary (see Fig. 12 and compare it to Fig. 11), these errors reduce to $\tilde{\varepsilon}_{\mathrm{mabs}} = 0.0046\ \mathrm{m^2/s^2}$ and $\tilde{\varepsilon}_{\max} = 0.9984\ \mathrm{m^2/s^2}$. Obviously, the more regular distribution of points removes the errors that result from data gaps and leads to better results.

Fig. 8 Local examples: $N = 48{,}749$ randomly distributed points (left) and $N = 48{,}744$ points based on a spiral grid (right) in the region of the Earth's surface

Fig. 9 Spline interpolation of the gravitational potential (left) and absolute error distribution (right) using the singularity kernel and the interpolation points in Fig. 8 (left), both in m^2/s^2
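The construction of the random test points and the two error measures can be sketched as follows. The sampling by normalized Gaussian vectors is one standard way to obtain a uniform distribution on the sphere and is an assumption here, since the text does not say which method was used; all function names are hypothetical, and the mapping of the points to the actual topography is omitted.

```python
import numpy as np

def random_sphere_points(n, seed=0):
    """Uniformly distributed random points on the unit sphere (normalized Gaussians)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def to_lat_lon(u):
    """Latitude and longitude in degrees for unit vectors u (rows)."""
    lat = np.degrees(np.arcsin(np.clip(u[:, 2], -1.0, 1.0)))
    lon = np.degrees(np.arctan2(u[:, 1], u[:, 0]))
    return lat, lon

def in_region(lat, lon, lat_min=-65.0, lat_max=30.0, lon_min=-110.0, lon_max=-10.0):
    """Mask for the region 65 deg S to 30 deg N, 110 deg W to 10 deg W."""
    return (lat >= lat_min) & (lat <= lat_max) & (lon >= lon_min) & (lon <= lon_max)

def error_stats(approx, exact):
    """Mean absolute error and maximal absolute error."""
    err = np.abs(np.asarray(approx, dtype=float) - np.asarray(exact, dtype=float))
    return err.mean(), err.max()

u = random_sphere_points(100_000)
lat, lon = to_lat_lon(u)
mask = in_region(lat, lon)
fraction = mask.mean()   # share of the global points that falls into the region
```

The expected share of points in the region is its relative area, $(\sin 30^\circ + \sin 65^\circ)/2 \cdot 100/360 \approx 0.195$, so a global sample has to be roughly five times larger than the desired regional point set.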

Fig. 10 Same results as in Fig. 9, but 2.5 % of the area at the boundary of the region have been removed

Fig. 11 Spline interpolation of the gravitational potential (left) and absolute error distribution (right) using the Abel-Poisson kernel and the interpolation points in Fig. 8 (right), both in m^2/s^2

Fig. 12 Same results as in Fig. 11, but 2.5 % of the area at the boundary of the region have been removed

7 Conclusion

Combining the FMM and the additive Schwarz algorithm as preconditioner in the iterative algorithm GMRES proves to be an efficient solution strategy that can treat interpolation problems and Dirichlet boundary value problems with many data points on regular surfaces (e.g., the actual topography of the Earth). It should be pointed out that our approach is not restricted to a global treatment but also applies to regional domains as shown by the numerical examples. This can lead to a local improvement of the gravitational field in areas of particular interest. A global model (e.g., in terms of spherical harmonics) should first be subtracted from the data and combined afterwards with the spline solution. In general, the singularity kernel leads to results slightly faster than the Abel-Poisson kernel.

The spline approach naturally includes spherical boundaries as a special case and can be extended to spline smoothing with some restrictions (see Sect. 2.2 and the end of Sect. 3.4). Note that for smoothing splines, it is quite possible to leave out the preconditioner since the smoothing itself drastically improves the condition of the matrix. However, the smoothing parameter(s) plays a crucial role in this approach and must be chosen very carefully; otherwise, a lot of detail information is lost to oversmoothing. The combination with parameter choice methods from ill-posed problems (cf. Bauer and Lukas 2011; Bauer et al. 2014 and the references therein) is an interesting challenge for the future.

For highly irregular distributions of data points, the spline approach reaches its limits. The largest data gap in the domain calls for a small value of the parameter $h$, whereas the closest data points require a larger value of $h$ to avoid ill-conditioning. Even smoothing splines cannot completely bridge this gap so far, though further investigation is required. Functional matching pursuit methods can result in better approximations (see Michel 2014a and the references therein), but so far these algorithms require large numerical costs.

References
Aronszajn N (1950) Theory of reproducing kernels. Trans Am Math Soc 68:337–404
Bauer F, Lukas MA (2011) Comparing parameter choice methods for regularization of ill-posed problems. Math Comput Simul 81(9):1795–1841
Bauer F, Gutting M, Lukas MA (2014) Evaluation of parameter choice methods for regularization of ill-posed problems in geomathematics. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Beatson RK, Billings S, Light WA (2000) Fast solution of the radial basis function interpolation equations: domain decomposition methods. SIAM J Sci Comput 22(5):1717–1740
Biedenharn LC, Louck JD (1981) Angular momentum in quantum physics (theory and application). Encyclopedia of mathematics and its applications. Addison-Wesley, Reading
Carrier J, Greengard L, Rokhlin V (1988) A fast adaptive multipole algorithm for particle simulations. SIAM J Sci Stat Comput 9(4):669–686
Chan TF, Mathew TP (1994) Domain decomposition algorithms. Acta Numer 3:61–143
Cheng H, Greengard L, Rokhlin V (1999) A fast adaptive multipole algorithm in three dimensions. J Comput Phys 155:468–498
Choi CH, Ivanic J, Gordon MS, Ruedenberg K (1999) Rapid and stable determination of rotation matrices between spherical harmonics by direct recursion. J Chem Phys 111(19):8825–8831
Edmonds AR (1964) Drehimpulse in der Quantenmechanik. Bibliographisches Institut, Mannheim
Epton MA, Dembart B (1995) Multipole translation theory for the three-dimensional Laplace and Helmholtz equations. SIAM J Sci Comput 16(4):865–897
Fengler MJ (2005) Vector spherical harmonic and vector wavelet based non-linear Galerkin schemes for solving the incompressible Navier–Stokes equation on the sphere. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen
Freeden W (1981a) On approximation by harmonic splines. Manuscr Geod 6:193–244
Freeden W (1981b) On spherical spline interpolation and approximation. Math Method Appl Sci 3:551–575
Freeden W (1982a) Interpolation and best approximation by harmonic spline functions. Boll Geod Sci Aff 1:105–120
Freeden W (1982b) On spline methods in geodetic approximation problems. Math Method Appl Sci 4:382–396
Freeden W (1984a) Spherical spline interpolation: basic theory and computational aspects. J Comput Appl Math 11:367–375
Freeden W (1984b) Ein Konvergenzsatz in sphärischer Spline-Interpolation. Z f Vermessungswes (ZfV) 109:569–576
Freeden W (1987a) A spline interpolation method for solving boundary value problems of potential theory from discretely given data. Numer Methods Partial Differ Equ 3:375–398
Freeden W (1987b) Harmonic splines for solving boundary value problems of potential theory. In: Mason JC, Cox MG (eds) Algorithms for approximation. The institute of mathematics and its applications, conference series, vol 10. Clarendon Press, Oxford, pp 507–529
Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Stuttgart/Leipzig
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton
Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston/Basel/Berlin
Freeden W, Schreiner M (2014) Special functions in mathematical geosciences: an attempt at a categorization. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Freeden W, Schreiner M, Franke R (1997) A survey on spherical spline approximation. Surv Math Ind 7:29–85
Freeden W, Gervens T, Schreiner M (1998a) Constructive approximation on the sphere (with applications to geomathematics). Oxford Science Publications, Clarendon Press, Oxford
Freeden W, Glockner O, Schreiner M (1998b) Spherical panel clustering and its numerical aspects. J Geodesy 72:586–599
Glockner O (2002) On numerical aspects of gravitational field modelling from SST and SGG by harmonic splines and wavelets (with application to CHAMP data). PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen
Greengard L (1988) The rapid evaluation of potential fields in particle systems. MIT, Cambridge
Greengard L, Rokhlin V (1987) A fast algorithm for particle simulations. J Comput Phys 73(1):325–348
Greengard L, Rokhlin V (1988) Rapid evaluation of potential fields in three dimensions. In: Anderson C, Greengard L (eds) Vortex methods. Springer, Berlin/Heidelberg/New York, pp 121–141
Greengard L, Rokhlin V (1997) A new version of the fast multipole method for the Laplace equation in three dimensions. Acta Numer 6:229–269
Gutting M (2007) Fast multipole methods for oblique derivative problems. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen
Gutting M (2012) Fast multipole accelerated solution of the oblique derivative boundary value problem. GEM Int J Geom 3(2):223–252
Hastings D, Row LW III (1997) TerrainBase global Terrain model summary documentation. National Geodetic Data Center, Boulder
Hesse K (2002) Domain decomposition methods for multiscale geopotential determination from SST and SGG. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. Shaker, Aachen
Hobson EW (1965) The theory of spherical and ellipsoidal harmonics (second reprint). Chelsea Publishing Company, New York
Keiner J, Kunis S, Potts D (2006) Fast summation of radial functions on the sphere. Computing 78:1–15
Kellogg OD (1967) Foundations of potential theory. Springer, Berlin/Heidelberg/New York
Lemoine FG, Kenyon SC, Factor JK, Trimmer RG, Pavlis NK, Chinn DS, Cox CM, Klosko SM, Luthcke SB, Torrence MH, Wang YM, Williamson RG, Pavlis EC, Rapp RH, Olson TR (1998) The development of the joint NASA GSFC and NIMA geopotential model EGM96. NASA/TP-1998-206861, NASA Goddard Space Flight Center, Greenbelt
Michel V (2013) Lectures on constructive approximation – Fourier, spline, and wavelet methods on the real line, the sphere, and the ball. Birkhäuser, Boston
Michel V (2014a) RFMP – an iterative best basis algorithm for inverse problems in the geosciences. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Michel V (2014b) Tomography: problems and multiscale solutions. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Moritz H (2014) Classical physical geodesy. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg
Potts D, Steidl G (2003) Fast summation at nonequispaced knots by NFFTs. SIAM J Sci Comput 24(6):2013–2037
Rakhmanov EA, Saff EB, Zhou YM (1994) Minimal discrete energy on the sphere. Math Res Lett 1:647–662
Rokhlin V (1985) Rapid solution of integral equations of classical potential theory. J Comput Phys 60:187–207
Saad Y (2003) Iterative methods for sparse linear systems, 2nd edn. SIAM, Philadelphia
Saad Y, Schultz MH (1986) GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J Sci Stat Comput 7:856–869
Shure L, Parker RL, Backus GE (1982) Harmonic splines for geomagnetic modelling. Phys Earth Planet Inter 28:215–229
Smith BF, Bjørstad PE, Gropp WD (1996) Domain decomposition (parallel multilevel methods for elliptic partial differential equations). Cambridge University Press, Cambridge
Varšalovič DA, Moskalev AN, Chersonskij VK (1988) Quantum theory of angular momentum. World Scientific, Singapore
Wahba G (1981) Spline interpolation and smoothing on the sphere. SIAM J Sci Stat Comput 2:5–16. Also errata: SIAM J Sci Stat Comput 3:385–386 (1981)
Wahba G (1990) Spline models for observational data. SIAM, Philadelphia
White CA, Head-Gordon M (1996) Rotating around the quartic angular momentum barrier in fast multipole method calculations. J Chem Phys 105(12):5061–5067
Yamabe H (1950) On an extension of the Helly’s theorem. Osaka Math J 2(1):15–17
Yarvin N, Rokhlin V (1998) Generalized Gaussian quadratures and singular value decomposition of integral equations. SIAM J Sci Comput 20(2):699–718
Zhou X, Hon YC, Li J (2003) Overlapping domain decomposition method by radial basis functions. Appl Numer Math 44:241–255

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_49-2 © Springer-Verlag Berlin Heidelberg 2014

Clifford Analysis and Harmonic Polynomials Klaus Gürlebeck (a) and Wolfgang Sprößig (b); (a) Bauhaus-Universität Weimar, Weimar, Germany; (b) Technische Universität Bergakademie Freiberg, München, Germany

Abstract This overview gives an insight into the new field of hypercomplex analysis in relation to harmonic analysis. The algebra of complex numbers is replaced by the non-commutative algebra of real quaternions or by Clifford algebras. This contribution is focused on the presentation of an operator calculus on the sphere as well as the discussion of monogenic and holomorphic orthonormal systems of polynomials and special functions on the unit ball and the sphere. Function theoretic elements are outlined. Relations to the Lie algebra SO(3) are discussed and a corresponding Radon transform is introduced.

1 Introduction In 1833 Sir William Rowan Hamilton, one of the most fascinating scientists of the nineteenth century, proved that the set of all complex numbers forms a division algebra, i.e., there exists a unit element $1 \neq 0$ and all nonzero elements have an inverse. All arithmetic operations (addition, subtraction, multiplication, and division (for nonzero elements)) are defined and satisfy the usual algebraic rules. Hamilton recognized that there are two units “1” and “i” with

$$1^2 = 1 \quad \text{and} \quad i^2 = -1.$$

Every element of the algebra has the form $x + iy$ with $x, y \in \mathbb{R}$. In the following 10 years, he tried to extend this result to triples, i.e., the real unit “1” and two further imaginary units “i” and “j” form the basis. He called such triples vectors. For a long time, he did not succeed in finding a division rule for vectors. Only in 1843, after introducing a further imaginary unit and dropping commutativity, was Hamilton able to divide vectors. The following story is told about the discovery: Hamilton had to chair a meeting of the Royal Irish Academy. His wife walked with him along the Royal Canal in Dublin. Suddenly he had an ingenious idea, took his pocket knife, and carved the fundamental relations of the skew-field of quaternions in a stone of the Brougham Bridge: Here as he walked by on October 16 in 1843, Sir William Rowan Hamilton in a flash of genius discovered the fundamental formula for quaternion multiplication $i^2 = j^2 = k^2 = ijk = -1$ and cut it on a stone of this bridge.




On November 14, 1843, a first paper on quaternions appeared in the Council Books of the Royal Academy. Carl Friedrich Gauss had already been working with transformations of space in 1819. He had to compose real quadruples. Given $(a, b, c, d)$ and $(\alpha, \beta, \gamma, \delta)$ in $\mathbb{R}^4$, he obtained for the composite 4-tuple

$$(\alpha a - \beta b - \gamma c - \delta d,\; a\beta + b\alpha - c\delta + d\gamma,\; a\gamma + b\delta + c\alpha - d\beta,\; a\delta - b\gamma + c\beta + d\alpha).$$

After transposition of the 2nd and 3rd coordinates, one gets today's usual notation of quaternionic multiplication. Gauss did not publish this result. It was found after his death in his diary (see also Gauss 1900 and Lam 2003). In order to act in higher dimensions, one has to free oneself from any spatial imagination. In 1844 the high school teacher Hermann G. Grassmann (1809–1877) from Stettin wrote his famous book Die lineale Ausdehnungslehre (extension theory). There the most important question is discussed: Is geometry part of mathematics? Grassmann wrote: . . . there (geometry) should be a branch of mathematics, which in a purely abstract way produces similar laws as they appear linked to space in geometry (Grassmann 1844). Grassmann’s work remained largely unnoticed and poorly understood. In 1862 a methodically much improved book appeared. It was H. Hankel who supported the popularization of the main ideas in Grassmann’s work. The fundamental algebraic terms in the extension theory were called elementary quantities, which can be connected by two product constructions: the inner product and the outer product. In modern notation, one has for the quantities $e_1, \dots, e_n$ the following algebraic relations:

$$e_i e_j + e_j e_i = 0, \qquad e_i^2 = 0 \quad (i, j = 1, 2, \dots, n),$$

$$e_i (e_j + e_k) = e_i e_j + e_i e_k.$$

In this way the so-called Grassmann algebra is defined. It is the work of the English philosopher and geometer William Kingdon Clifford (1845–1879) to combine in an ingenious manner Hamilton’s quaternions and Grassmann’s extension theory. In his famous work from 1878, Applications of Grassmann’s Extensive Algebra, he introduced geometric algebras (Clifford’s notation), nowadays called Clifford algebras (Clifford 1878). This new type of algebra is constructed from scalars, vectors, bivectors, and in general k-vectors ($1 \le k \le n$). Elements are called Clifford numbers, i.e., a Clifford number is a linear combination of k-vectors and scalars for any $k \in \mathbb{N}$, $1 \le k \le n$. Complex numbers and quaternions (by isomorphy) are simple examples of Clifford numbers. Clifford’s life and his relation to J.C. Maxwell are very nicely described in the wonderful book Such Silver Currents: The story of William and Lucy Clifford, 1845–1929 by Chisholm (2002) with a foreword by Sir Roger Penrose.

In this chapter, it is intended to show the richness, beauty, and usefulness of mathematics in Clifford algebras, in particular in quaternions. In Sect. 2 basic algebraic structures such as the algebras of real quaternions, complex quaternions, and Clifford algebras are introduced and studied, together with some important low-dimensional examples. It is shown that rotations and reflections can be considered advantageously in such algebras. In the brief Sect. 3, some kinds of special functions important for a function theory on the sphere as well as the Lie algebra SO(3) are provided. One of the main topics of this chapter is found in Sect. 4: higher-dimensional versions of holomorphy (monogenicity) and the construction of systems of holomorphic (monogenic) functions in the ball. An important system of holomorphic polynomials is given by the so-called Fueter polynomials. In the case of the Riesz system in $\mathbb{R}^3$, a complete orthonormal system of homogeneous holomorphic polynomials is constructed. Finally, in this section, a complete orthogonal Appell system of polynomials is constructed. The Appell property is understood with respect to the hyperholomorphic derivative. This system serves as a basis for a hypercomplex Taylor series expansion of monogenic functions. After normalization the polynomials are used as a basis for a Fourier series with convenient and explicit relations between Fourier and Taylor coefficients, respectively. Section 5 is devoted to the construction of harmonic conjugates. Two different approaches are presented. In Sects. 6 and 7, an operator calculus over domains on the sphere is deduced. For this reason, it is necessary to construct corresponding differential operators (vector derivative, Günter derivative, etc.) and integral operators (singular integral operator of Cauchy type, Plemelj type projections). Gegenbauer polynomials are used in order to get the kernel for the Cauchy-type integral operator. Furthermore, a Hilbert space decomposition is presented. A Bergman type projection is constructed. The next part is concerned with the generalization of some aspects of harmonic analysis to SO(3) (Sect. 8). Wigner functions now play the role of spherical harmonics. Applications are given in Sect. 9. There, the operator calculus is applied to the consideration of Saint-Venant equations. A solution theory is beyond the content of this part but already worked out. A Radon transform over SO(3) is introduced and describes the interrelation of Wigner functions with spherical harmonics.

2 Quaternions and Clifford Algebras Let us start with the discussion of one of the simplest hypercomplex structures, the algebra of real quaternions.

2.1 Real Quaternions In the real vector space $\mathbb{R}^4$ with the orthonormal basis $\{e_0, e_1, e_2, e_3\}$, a multiplication is defined by

$$e_i e_j + e_j e_i = -2\delta_{ij} e_0 \quad \text{and} \quad e_0 e_j = e_j e_0; \quad e_0^2 = e_0; \quad e_1 e_2 = e_3, \qquad i, j = 1, 2, 3.$$

Thus an algebra (= vector space over $\mathbb{R}$ or $\mathbb{C}$, extended by a suitable compatible multiplication) is generated. In honor of W. R. Hamilton, the algebra of real quaternions is denoted by $\mathbb{H}$. An arbitrary element $a \in \mathbb{H}$ may be represented by $a = \alpha_0 e_0 + \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3$. The quaternionic conjugation is given by

$$\overline{e_0} = e_0, \quad \overline{e_k} = -e_k \quad (k = 1, 2, 3),$$

so that for $x = x_0 + \sum_{k=1}^{3} x_k e_k =: x_0 + \mathbf{x}$ one has

$$\overline{x} = x_0 - \sum_{k=1}^{3} x_k e_k = x_0 - \mathbf{x}.$$


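Further, it holds

$$x\overline{x} = |x|^2_{\mathbb{R}^4} =: |x|^2_{\mathbb{H}}, \qquad x^{-1} := \frac{1}{|x|^2}\,\overline{x} \ \text{ for } x \neq 0, \qquad \overline{xy} = \overline{y}\,\overline{x}.$$

Let $\varphi = \arccos \frac{x_0}{|x|}$ and $\omega(x) = \frac{\mathbf{x}}{|\mathbf{x}|} \in S^2$; then for any quaternion $x$ with $x = |x|(\cos\varphi + \omega(x)\sin\varphi)$ it holds

$$(\cos\varphi + \omega(x)\sin\varphi)^n = \cos n\varphi + \omega(x)\sin n\varphi.$$

This is an analogue of de Moivre’s formula in complex analysis. $\mathbf{x} =: \mathrm{Vec}(x)$ denotes the vector part and $x_0 =: \mathrm{Sc}(x)$ the scalar part of the quaternion $x$. In the vector case, the quaternion product yields $\mathbf{x}\mathbf{y} = -\mathbf{x} \cdot \mathbf{y} + \mathbf{x} \times \mathbf{y}$. It should be remarked that the inner product (scalar product) was introduced by Grassmann (1844) and the cross product by Gibbs (1881). For more details, see Crowe (1967).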
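The rules of this subsection can be tried out numerically. The following is a minimal sketch, assuming quaternions are stored as plain 4-tuples (scalar part first); the helper names `qmul`, `qconj`, `qinv`, `qpow`, and `polar` are illustrative and not taken from the text. It checks the de Moivre-type formula for one sample quaternion.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def qconj(a):
    """Quaternionic conjugation: flip the sign of the vector part."""
    return (a[0], -a[1], -a[2], -a[3])

def qnorm(a):
    """Euclidean norm in R^4, i.e. sqrt(x * conj(x))."""
    return math.sqrt(sum(c * c for c in a))

def qinv(a):
    """Inverse conj(x) / |x|^2 of a nonzero quaternion."""
    n2 = sum(c * c for c in a)
    return tuple(c / n2 for c in qconj(a))

def qpow(a, n):
    """n-th power by repeated multiplication (n >= 0)."""
    result = (1.0, 0.0, 0.0, 0.0)
    for _ in range(n):
        result = qmul(result, a)
    return result

def polar(x):
    """Return (|x|, phi, omega) with x = |x| (cos phi + omega sin phi)."""
    r = qnorm(x)
    vec_norm = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    phi = math.acos(x[0] / r)
    omega = (0.0, x[1] / vec_norm, x[2] / vec_norm, x[3] / vec_norm)
    return r, phi, omega

x = (1.0, 2.0, -1.0, 0.5)
r, phi, omega = polar(x)
unit = tuple(c / r for c in x)  # cos(phi) + omega sin(phi)
n = 5
lhs = qpow(unit, n)
rhs = (math.cos(n * phi),) + tuple(math.sin(n * phi) * w for w in omega[1:])
de_moivre_error = max(abs(p - q) for p, q in zip(lhs, rhs))
print(de_moivre_error)  # should be at rounding level
```

The sample quaternion has a nonzero vector part; for a purely real $x$ the direction $\omega(x)$ is undefined and `polar` would divide by zero.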

2.2 Quaternions and Vectors It is interesting to study relations between quaternions and vectors. It holds

$$\mathbf{x} \cdot \mathbf{y} = -\frac{1}{2}(\mathbf{x}\mathbf{y} + \mathbf{y}\mathbf{x}) \quad \text{and} \quad \mathbf{x} \times \mathbf{y} = \frac{1}{2}(\mathbf{x}\mathbf{y} - \mathbf{y}\mathbf{x}).$$

For $x \in \mathbb{R}$, the commutator relation $xy = yx$ holds for any $y \in \mathbb{H}$. It is important to know that $x^2 = y^2$ does not imply $x = \pm y$. The easy proofs can be found, for instance, in Sprößig and Fichtner (2004). Moreover, one gets: (i) Let $x \in \mathbb{H}$; then there exists a vector $\mathbf{y} \neq 0$ such that $x\mathbf{y}$ is a vector. (ii) Any quaternion $x$ is the product of two vectors. (iii) The inverse of a vector is again a vector. It should be stressed that in $\mathbb{H}$ only scalars and vectors exist.
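As a small illustration of the two product identities, the sketch below recovers the scalar and the cross product of two vectors in $\mathbb{R}^3$ from quaternion products. The vectors are embedded as pure quaternions; the function names are hypothetical helpers chosen for this example. The last lines show that $e_1^2 = e_2^2$ although $e_1 \neq \pm e_2$.

```python
def qmul(a, b):
    """Hamilton product of two quaternions given as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def pure(v):
    """Embed a vector of R^3 as a quaternion with zero scalar part."""
    return (0.0, v[0], v[1], v[2])

def dot_via_quaternions(x, y):
    """x . y = -(xy + yx) / 2; the sum xy + yx is a pure scalar."""
    xy, yx = qmul(pure(x), pure(y)), qmul(pure(y), pure(x))
    return -0.5 * (xy[0] + yx[0])

def cross_via_quaternions(x, y):
    """x cross y = (xy - yx) / 2; the difference xy - yx is a pure vector."""
    xy, yx = qmul(pure(x), pure(y)), qmul(pure(y), pure(x))
    return tuple(0.5 * (p - q) for p, q in zip(xy[1:], yx[1:]))

x = (1.0, 2.0, 3.0)
y = (-2.0, 0.5, 4.0)
print(dot_via_quaternions(x, y))
print(cross_via_quaternions(x, y))

# x^2 = y^2 does not imply x = +-y: both imaginary units square to -1.
e1 = (0.0, 1.0, 0.0, 0.0)
e2 = (0.0, 0.0, 1.0, 0.0)
print(qmul(e1, e1) == qmul(e2, e2))
```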

2.3 Rotations in $\mathbb{R}^3$ In 1840 Benjamin Olinde Rodrigues (1794–1851), a banker’s son from Bordeaux, studied rotations as general movements on a sphere. He solved the problem (Euler 1775) of the composition of two rotations in a constructive way. One has to consider the automorphism of $\mathbb{H}$ given by $\rho_y(x) = y x y^{-1}$ for $y \in \mathbb{H}$, $y \neq 0$. The mapping $\rho_y$ is also an automorphism of $\mathbb{R}^3$. First, it holds

$$\rho_y(\mathbf{x}) \times \rho_y(\mathbf{x}') = \rho_y(\mathbf{x} \times \mathbf{x}').$$

Then the so-called Euler-Rodrigues formula follows:

$$\mathbf{x}' := \rho_y(\mathbf{x}) = \mathbf{x}\cos 2\varphi + (\boldsymbol{\omega} \times \mathbf{x})\sin 2\varphi + (1 - \cos 2\varphi)(\boldsymbol{\omega} \cdot \mathbf{x})\,\boldsymbol{\omega}$$

with $y_0 = \cos\varphi$, $\boldsymbol{\omega} = \frac{\mathbf{y}}{|\mathbf{y}|}$, and $|\mathbf{y}| = \sin\varphi$. In Gürlebeck et al. (2008) it is proved that the rotations in $\mathbb{H}$ are exactly the mappings $x \mapsto x' = axb$ with $|a| = |b| = 1$ and $a, b \in \mathbb{H}$. It follows that every rotation in $\mathbb{R}^3$ can be composed of at most four reflections about planes. Porteous (1969) proved that the map

$$\rho : S^3 \to SO(3) \quad \text{with} \quad \rho(y) = \rho_y, \quad \rho_y(x) := y x \overline{y},$$

is a surjective group homomorphism with $\ker \rho = \{1, -1\}$.
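The Euler-Rodrigues formula can be compared directly with the conjugation $y x y^{-1}$. The sketch below does this for a unit quaternion $y = \cos\varphi + \boldsymbol{\omega}\sin\varphi$, which rotates by the angle $2\varphi$ about the axis $\boldsymbol{\omega}$. All function names are illustrative assumptions, not notation from the text.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def rotate_by_conjugation(omega, phi, v):
    """Vector part of y v y^{-1} for the unit quaternion y = cos(phi) + omega sin(phi)."""
    s = math.sin(phi)
    y = (math.cos(phi), s * omega[0], s * omega[1], s * omega[2])
    y_inv = (y[0], -y[1], -y[2], -y[3])  # conjugate equals inverse since |y| = 1
    result = qmul(qmul(y, (0.0, v[0], v[1], v[2])), y_inv)
    return result[1:]

def euler_rodrigues(omega, phi, v):
    """v cos(2 phi) + (omega x v) sin(2 phi) + (1 - cos(2 phi)) (omega . v) omega."""
    c, s = math.cos(2 * phi), math.sin(2 * phi)
    cross = (
        omega[1] * v[2] - omega[2] * v[1],
        omega[2] * v[0] - omega[0] * v[2],
        omega[0] * v[1] - omega[1] * v[0],
    )
    dot = sum(o * w for o, w in zip(omega, v))
    return tuple(c * v[i] + s * cross[i] + (1 - c) * dot * omega[i] for i in range(3))

axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit vector
v = (0.3, -1.2, 2.0)
a = rotate_by_conjugation(axis, 0.7, v)
b = euler_rodrigues(axis, 0.7, v)
print(max(abs(p - q) for p, q in zip(a, b)))  # should be at rounding level
```

Since $y$ and $-y$ give the same conjugation, the sketch also reflects why the kernel of the map onto SO(3) consists of $\pm 1$.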

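2.4 Complex Quaternions We denote by $\mathbb{H}(\mathbb{C})$ the set of quaternions with complex coefficients $\alpha_k = \alpha_k^1 + i\alpha_k^2$ ($\alpha_k^i \in \mathbb{R}$), i.e.,

$$a = \alpha_0 e_0 + \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3 = a^1 + i a^2 \qquad (a^j \in \mathbb{H}).$$

Assume $i e_k = e_k i$; we therefore also use the notation $\mathbb{CH} = \mathbb{H}(\mathbb{C})$. In the complex case, there are three possible conjugations: (i) $\overline{a}^{\mathbb{C}} := a^1 - i a^2$; (ii) $\overline{a}^{\mathbb{H}} := \overline{a^1} + i\,\overline{a^2}$; (iii) $\overline{a}^{\mathbb{CH}} := \overline{a^1} - i\,\overline{a^2}$. Formally one can define a complex-valued norm:

$$|a|^2_{\mathbb{C}} = a\,\overline{a}^{\mathbb{H}} = |a^1|^2 - |a^2|^2 + 2i\,[\alpha_0^1\alpha_0^2 + \mathbf{a}^1 \cdot \mathbf{a}^2].$$

A corresponding (real) norm is given by $\|a\|^4 := \big||a|^2_{\mathbb{C}}\big|^2$. In $\mathbb{H}(\mathbb{C})$ one has to distinguish between scalars, vectors, bivectors, and the so-called pseudoscalars (scalars with orientation).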
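The complex-valued norm can be checked with ordinary complex arithmetic. In the following sketch a complex quaternion is a 4-tuple of Python `complex` numbers, and the conjugation used is the quaternionic one, which flips the vector part and leaves the complex unit untouched; this choice, like the helper names, is an assumption of the example. Under it the product $a\,\overline{a}^{\mathbb{H}}$ is the complex number $\sum_k \alpha_k^2$, and its real and imaginary parts are compared with $|a^1|^2 - |a^2|^2$ and $2\langle a^1, a^2\rangle_{\mathbb{R}^4}$.

```python
def qmul(a, b):
    """Hamilton product; the entries may be complex since i commutes with e_k."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def conj_h(a):
    """Quaternionic conjugation of a complex quaternion (complex parts unchanged)."""
    return (a[0], -a[1], -a[2], -a[3])

def complex_norm_squared(a):
    """Scalar part of a * conj_h(a); the vector part of this product vanishes."""
    return qmul(a, conj_h(a))[0]

a = (1 + 2j, 0.5 - 1j, -3 + 0j, 2j)
a1 = [c.real for c in a]  # real quaternion a^1
a2 = [c.imag for c in a]  # real quaternion a^2
expected = (
    sum(p * p for p in a1)
    - sum(q * q for q in a2)
    + 2j * sum(p * q for p, q in zip(a1, a2))
)
product = qmul(a, conj_h(a))
print(complex_norm_squared(a), expected)
```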

2.5 Clifford Algebras As already mentioned, the main problem in higher dimensions, say $\mathbb{R}^n$, is to define a multiplication rule. This was done by W. K. Clifford in an ingenious way: Let $n \ge 1$; $1 \le p, q \le n$; $p + q = n$, and let $e_0, e_1, \dots, e_n$ be an orthonormal basis of $\mathbb{R}^{n+1}$. Then a multiplication is defined by

$$e_0 e_i = e_i e_0 = e_i, \quad i = 0, \dots, n;$$

$$e_i^2 = 1, \quad i = 1, \dots, p; \qquad e_i^2 = -1, \quad i = p + 1, \dots, n;$$

$$e_i e_j = -e_j e_i, \quad i \neq j, \quad i, j = 1, \dots, n.$$

Additionally, one has to assume the condition $e_1 e_2 \cdots e_n \neq \pm 1$ if $p - q \equiv 1 \pmod 4$. In this way, the algebra $C\ell_{p,q}$ is completely described; it is called the (universal) Clifford algebra. Addition and multiplication by real numbers are defined coordinatewise. A basis is given by

$$e_0;\ e_1, \dots, e_n;\ e_1 e_2, \dots, e_{n-1} e_n;\ e_1 e_2 e_3, \dots;\ \dots;\ e_1 e_2 \cdots e_n$$

with $e_0$ as unit element. An arbitrary Clifford number is given by

$$x = x_0 + \sum_{k=1}^{n} \sum_{0 < \dots} x_{i_1 \dots i_k}\, e_{i_1 \dots i_k} = \dots$$

[…]

$$\dots_{L^2(\Omega;\mathcal{A};\mathbb{R})} = \dots,$$

where $dV$ denotes the Lebesgue measure in $\mathbb{R}^3$. The subspace $L^2(\Omega;\mathcal{A};\mathbb{R}) \cap \ker\overline{\partial}$ of polynomial (R)-solutions of degree $n$ is denoted by $\mathcal{M}^+(\Omega;\mathcal{A};n)$. In Leutwiler (2001), it is shown that the space $\mathcal{M}^+(\Omega;\mathcal{A};n)$ has dimension $2n+3$. Later, this result was generalized to arbitrary higher dimensions in the framework of a Clifford algebra by Delanghe in 2007. The space of square integrable $\mathcal{A}$-valued monogenic functions defined in $\Omega$ will be denoted by $\mathcal{M}^+(\Omega;\mathcal{A})$. The main idea of the constructions is the factorization of the Laplace operator. In this way, one can understand the following constructions as a refinement of the well-known spherical harmonics and harmonic analysis. This strategy goes back to Cacao (2004) and starts by considering the set of homogeneous harmonic polynomials

$$\{r^{n+1} U^0_{n+1},\ r^{n+1} U^m_{n+1},\ r^{n+1} V^m_{n+1},\ m = 1, \dots, n+1\}_{n \in \mathbb{N}_0}, \qquad (20)$$

formed by the extensions in the ball of an orthogonal basis of spherical harmonics in $\mathbb{R}^3$ considered, e.g., in Sansone (1959). This complete orthogonal system is given explicitly in spherical coordinates by

$$U^0_{n+1}(\theta, \varphi) = P_{n+1}(\cos\theta),$$
$$U^m_{n+1}(\theta, \varphi) = P^m_{n+1}(\cos\theta)\cos m\varphi,$$
$$V^m_{n+1}(\theta, \varphi) = P^m_{n+1}(\cos\theta)\sin m\varphi, \qquad n = 0, \dots, \infty;\ m = 1, \dots, n+1. \qquad (21)$$

Here $P_{n+1}$ denotes the Legendre polynomial of degree $n+1$, given by

$$P_{n+1}(t) = \sum_{k=0}^{\left[\frac{n+1}{2}\right]} a_{n+1,k}\, t^{n+1-2k}, \qquad P_0(t) = 1, \qquad t \in [-1, 1],$$

with

$$a_{n+1,k} = (-1)^k\, \frac{1}{2^{n+1}}\, \frac{(2n+2-2k)!}{k!\,(n+1-k)!\,(n+1-2k)!}.$$

As usual $[k]$ denotes the largest integer $\le k$.
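The explicit coefficient formula is easy to evaluate exactly. The sketch below writes $N = n + 1$ for the degree and uses rational arithmetic; the function names are illustrative. It reproduces the familiar low-degree polynomials and checks the standard three-term (Bonnet) recurrence, which is not stated in the text but holds for Legendre polynomials.

```python
from fractions import Fraction
from math import factorial

def legendre_coefficients(N):
    """Coefficients a_{N,k}, k = 0..floor(N/2), of P_N(t) = sum_k a_{N,k} t^(N-2k)."""
    return [
        Fraction(
            (-1) ** k * factorial(2 * N - 2 * k),
            2 ** N * factorial(k) * factorial(N - k) * factorial(N - 2 * k),
        )
        for k in range(N // 2 + 1)
    ]

def legendre(N, t):
    """Evaluate P_N(t) from the explicit sum."""
    return sum(a * t ** (N - 2 * k) for k, a in enumerate(legendre_coefficients(N)))

print(legendre_coefficients(2))  # coefficients of (3 t^2 - 1) / 2
print(legendre_coefficients(3))  # coefficients of (5 t^3 - 3 t) / 2

# Bonnet recurrence: (N + 1) P_{N+1}(t) = (2N + 1) t P_N(t) - N P_{N-1}(t)
t = Fraction(2, 7)
bonnet_ok = all(
    (N + 1) * legendre(N + 1, t) == (2 * N + 1) * t * legendre(N, t) - N * legendre(N - 1, t)
    for N in range(1, 10)
)
print(bonnet_ok)
```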


The functions $P^m_{n+1}$ are the associated Legendre functions defined by

$$P^m_{n+1}(t) := (1 - t^2)^{m/2}\, \frac{d^m}{dt^m} P_{n+1}(t), \qquad m = 1, \dots, n+1.$$

For $m = 0$, the associated Legendre function $P^0_{n+1}(t)$ is identical with the corresponding Legendre polynomial $P_{n+1}(t)$. The application of the operator $\partial$ to the homogeneous harmonic polynomials in (21) leads to the following set of homogeneous monogenic polynomials:

$$\{X_n^{0,\dagger},\ X_n^{m,\dagger},\ Y_n^{m,\dagger} : m = 1, \dots, n+1\}, \qquad (22)$$

with the notation

$$X_n^{0,\dagger} := r^n X_n^0, \qquad X_n^{m,\dagger} := r^n X_n^m, \qquad Y_n^{m,\dagger} := r^n Y_n^m.$$

The spherical holomorphic polynomials are explicitly given by the following formulas:

$$X_n^0 := A^{0,n} + B^{0,n}\cos\varphi\, e_1 + B^{0,n}\sin\varphi\, e_2, \qquad (23)$$

where

$$A^{0,n} := \frac{1}{2}\left(\sin^2\theta\, \frac{d}{dt}[P_{n+1}(t)]_{t=\cos\theta} + (n+1)\cos\theta\, P_{n+1}(\cos\theta)\right), \qquad (24)$$

$$B^{0,n} := \frac{1}{2}\left(\sin\theta\cos\theta\, \frac{d}{dt}[P_{n+1}(t)]_{t=\cos\theta} - (n+1)\sin\theta\, P_{n+1}(\cos\theta)\right), \qquad (25)$$

and

$$X_n^m := A^{m,n}\cos m\varphi + (B^{m,n}\cos\varphi\cos m\varphi - C^{m,n}\sin\varphi\sin m\varphi)\, e_1 + (B^{m,n}\sin\varphi\cos m\varphi + C^{m,n}\cos\varphi\sin m\varphi)\, e_2, \qquad (26)$$

$$Y_n^m := A^{m,n}\sin m\varphi + (B^{m,n}\cos\varphi\sin m\varphi + C^{m,n}\sin\varphi\cos m\varphi)\, e_1 + (B^{m,n}\sin\varphi\sin m\varphi - C^{m,n}\cos\varphi\cos m\varphi)\, e_2 \qquad (27)$$


with the coefficients

$$A^{m,n} := \frac{1}{2}\left(\sin^2\theta\, \frac{d}{dt}[P^m_{n+1}(t)]_{t=\cos\theta} + (n+1)\cos\theta\, P^m_{n+1}(\cos\theta)\right), \qquad (28)$$

$$B^{m,n} := \frac{1}{2}\left(\sin\theta\cos\theta\, \frac{d}{dt}[P^m_{n+1}(t)]_{t=\cos\theta} - (n+1)\sin\theta\, P^m_{n+1}(\cos\theta)\right),$$
$$C^{m,n} := \frac{1}{2}\, m\, \frac{1}{\sin\theta}\, P^m_{n+1}(\cos\theta), \qquad (29)$$

$m = 1, \dots, n+1$. The important property is that the application of $\partial$ to the orthogonal system of solid spherical harmonics preserves the orthogonality. This was proved in Cacao (2004). For each $n$, the set

$$\{X_n^{0,\dagger},\ X_n^{m,\dagger},\ Y_n^{m,\dagger} : m = 1, \dots, n+1\}$$

is orthogonal with respect to the real-inner product (19). The scalar parts of the (R)-holomorphic spherical polynomials are again harmonic and must have a representation in the original solid spherical harmonics. Surprisingly this representation is very simple and convenient:

$$\mathrm{Sc}(X_n^{l,\dagger}) := \frac{(n+l+1)}{2}\, U_n^{l,\dagger}, \qquad \mathrm{Sc}(Y_n^{m,\dagger}) := \frac{(n+m+1)}{2}\, V_n^{m,\dagger}.$$

Moreover, it can be shown that the polynomials $X_n^{n+1,\dagger}$ and $Y_n^{n+1,\dagger}$ are hyperholomorphic constants, that is, their hyperholomorphic derivative vanishes. It should be underlined that the set of constants in the case of the Riesz system is richer than in complex analysis. The constants are in general (R)-holomorphic functions which depend only on $x_1$ and $x_2$. For getting a Fourier expansion of square integrable (R)-solutions, the constructed system needs to be normalized. The norms can be calculated explicitly (see Cacao 2004; Gürlebeck and Morais 2009a, b). For $n \in \mathbb{N}_0$, the norms of the homogeneous monogenic polynomials $X_n^{0,\dagger}$, $X_n^{m,\dagger}$, and $Y_n^{m,\dagger}$ ($m = 1, \dots, n$) and their associated scalar parts are given by

$$\|X_n^{0,\dagger}\|_{L^2(B;\mathcal{A};\mathbb{R})} = \sqrt{\frac{\pi(n+1)}{2n+3}},$$

$$\|X_n^{m,\dagger}\|_{L^2(B;\mathcal{A};\mathbb{R})} = \|Y_n^{m,\dagger}\|_{L^2(B;\mathcal{A};\mathbb{R})} = \sqrt{\frac{\pi}{2}\, \frac{(n+1)}{2n+3}\, \frac{(n+1+m)!}{(n+1-m)!}},$$

$$\|X_n^{n+1,\dagger}\|_{L^2(B;\mathcal{A};\mathbb{R})} = \|Y_n^{n+1,\dagger}\|_{L^2(B;\mathcal{A};\mathbb{R})} = \sqrt{\frac{\pi}{2}\, \frac{(n+1)(2n+2)!}{2n+3}},$$

$$\|\mathrm{Sc}(X_n^{0,\dagger})\|_{L^2(B)} = \frac{(n+1)}{\sqrt{2n+3}}\, \sqrt{\frac{\pi}{2n+1}},$$

$$\|\mathrm{Sc}(X_n^{m,\dagger})\|_{L^2(B)} = \|\mathrm{Sc}(Y_n^{m,\dagger})\|_{L^2(B)} = \frac{(n+1+m)}{\sqrt{2n+3}}\, \sqrt{\frac{\pi}{2}\, \frac{1}{(2n+1)}\, \frac{(n+m)!}{(n-m)!}}.$$

Denote by $X_n^{0,\dagger,*}, X_n^{m,\dagger,*}, Y_n^{m,\dagger,*}$ ($m = 1, \dots, n+1$) the new normalized basis functions $X_n^{0,\dagger}, X_n^{m,\dagger}, Y_n^{m,\dagger}$ in $L^2(B;\mathcal{A};\mathbb{R})$. The last necessary property for an approximation theorem is the completeness of the orthonormalized system. For each $n$, the $2n+3$ homogeneous monogenic polynomials

$$\{X_n^{0,\dagger,*},\ X_n^{m,\dagger,*},\ Y_n^{m,\dagger,*} : m = 1, \dots, n+1\} \qquad (30)$$

form an orthonormal basis in the subspace $\mathcal{M}^+(B;\mathcal{A};n)$ with respect to the real-inner product (19). Consequently,

$$\{X_n^{0,\dagger,*},\ X_n^{m,\dagger,*},\ Y_n^{m,\dagger,*};\ m = 1, \dots, n+1;\ n = 0, 1, \dots\}$$

is an orthonormal basis in $\mathcal{M}^+(B;\mathcal{A})$. This was proved in Morais (2009). Now it is possible to introduce the desired Fourier expansion. Let $f$ be a square integrable $\mathcal{A}$-valued monogenic function. The function $f$ can then be represented by the orthonormal system (30):

$$f = \sum_{n=0}^{\infty} \left[ X_n^{0,\dagger,*}\, a_n^0 + \sum_{m=1}^{n+1} \left( X_n^{m,\dagger,*}\, a_n^m + Y_n^{m,\dagger,*}\, b_n^m \right) \right], \qquad (31)$$

where for each $n \in \mathbb{N}_0$, $a_n^0, a_n^m, b_n^m \in \mathbb{R}$ ($m = 1, \dots, n+1$) are the associated Fourier coefficients. By reordering the expected Fourier series expansion, one can decompose any square integrable (R)-holomorphic function into an orthogonal sum of a monogenic “main part” of the function ($g$) and a hyperholomorphic constant ($h$). More precisely, a function $f \in \mathcal{M}^+(B;\mathcal{A})$ can be decomposed into

$$f := f(0) + g + h, \qquad (32)$$

where the functions $g$ and $h$ have the Fourier series

$$g(x) = \sum_{n=1}^{\infty} \left( X_n^{0,\dagger,*}(x)\, a_n^0 + \sum_{m=1}^{n} \left( X_n^{m,\dagger,*}(x)\, a_n^m + Y_n^{m,\dagger,*}(x)\, b_n^m \right) \right),$$

$$h(x) = \sum_{n=1}^{\infty} \left( X_n^{n+1,\dagger,*}(x)\, a_n^{n+1} + Y_n^{n+1,\dagger,*}(x)\, b_n^{n+1} \right).$$


Finally, it should be remarked that for $n = 2$ also the system of the Fueter polynomials can be used for the representation of $\mathcal{A}$-valued holomorphic functions. The presented Taylor series expansion is a tool for local approximation (but not an orthogonal series expansion). The Fourier series developed here is, of course, a global approximation.

4.4.3 Orthogonal Polynomials in $\mathbb{H}$ Looking for polynomial systems defined in $\mathbb{R}^3$ with values in $\mathbb{H}$, the situation is now different. For quaternion-valued polynomials, the algebraic structure of $\mathbb{H}$ can be used. Therefore, a right-linear vector space with coefficients from $\mathbb{H}$ can now be considered. Consequently, a quaternion-valued inner product is defined by

$$(f, g) = \int_{B_3} \overline{f(x)}\, g(x)\, dx.$$

The space of square integrable functions will be denoted by $L^2(\Omega;\mathbb{H};\mathbb{H})$ to distinguish it clearly from the $L^2$-space of $\mathcal{A}$-valued functions and the real-valued inner product. Let $B_3$ be the 3D unit ball. In Bock (2012) a complete orthogonal system was introduced as follows: The system $\{A_n^l;\ l = 0, 1, \dots, n;\ n = 0, 1, \dots\}$ of solid spherical monogenics is generated by the two-step recurrence formula

$$A_{n+1}^l = \frac{n+1}{2(n-l+1)(n+l+2)} \left[ \big((2n+3)x + (2n+1)\overline{x}\big) A_n^l - 2n\, x\overline{x}\, A_{n-1}^l \right]$$

with

$$A_{l+1}^l = \frac{1}{4}\left[(2l+3)x + (2l+1)\overline{x}\right] A_l^l; \qquad A_l^l = (x_1 - x_2 e_3)^l,$$

satisfying the following Appell property:

$$\frac{1}{2}\,\partial A_n^l = \begin{cases} n A_{n-1}^l & : l = 0, 1, \dots, n-1, \\ 0 & : l = n, \end{cases} \qquad \text{and} \qquad \overline{\partial}_{\mathbb{C}} A_n^n = n A_{n-1}^{n-1}.$$

Here $\partial$ denotes the adjoint generalized Cauchy-Riemann operator as above and $\overline{\partial}_{\mathbb{C}} := \frac{1}{2}\left(\frac{\partial}{\partial x_1} + e_3 \frac{\partial}{\partial x_2}\right)$. This approach does not explicitly use the spherical symmetry and can also be extended to the construction of spheroidal monogenics. The Appell property goes back to Appell (1880), who considered polynomial systems $\{P_n(x)\}_{n \in \mathbb{N}}$ with the property that $\frac{d}{dx} P_n(x) = n P_{n-1}(x)$. This property obviously generalizes the property of the “ordinary” powers $x^n$ or $z^n$. In particular in $\mathbb{C}$ this property is very useful because the Taylor series and the Fourier series are orthogonal expansions and can be differentiated summand by summand preserving the orthogonality. An $\mathbb{H}$-holomorphic Taylor series expansion can be introduced in the following way: Let $f \in L^2(B_3;\mathbb{H};\mathbb{H}) \cap \ker\overline{\partial}$. The series representation

$$f := \sum_{n=0}^{\infty} \sum_{l=0}^{n} A_n^l\, t_{n,l}, \qquad \text{with} \quad t_{n,l} = \frac{1}{n!}\, \overline{\partial}_{\mathbb{C}}^{\,l}\, \partial_0^{\,n-l} f(x) \Big|_{x=0} \qquad (33)$$

is called generalized Taylor-type series in $L^2(B_3;\mathbb{H};\mathbb{H}) \cap \ker\overline{\partial}$. The operators $\partial_0^{\,0}$ and $\overline{\partial}_{\mathbb{C}}^{\,0}$ are identified with the identity operators. There is also a formal argument to call the obtained series expansion a Taylor-type series. By definition the coefficients are determined by $\overline{\partial}_{\mathbb{C}}^{\,l}\, \partial_0^{\,n-l} f(x) \big|_{x=0}$. One has to consider only the hypercomplex derivative (important on the principal part of the monogenic function, orthogonal to the monogenic constants) and a Cauchy-Riemann-type operator (acting as a derivative on the monogenic constants). The almost complete analogy with the complex function theory becomes visible by comparing the Taylor series and the Fourier series, respectively. For this reason, the given Appell system has to be normalized in order to get a complete orthonormal system. In Bock (2012) the construction of this system is described as follows: Using the notations $X_{n,j}^{0,\dagger} := r^n X_n^0 e_j$, $X_{n,j}^{m,\dagger} := r^n X_n^m e_j$, $Y_{n,j}^{m,\dagger} := r^n Y_n^m e_j$, where $m = 1, \dots, n+1$ and $j = 0, 1, 2, 3$, one can introduce the normalized system of solid spherical monogenics

$$\tilde{X}_{n,j}^{0,\dagger} := \frac{X_{n,j}^{0,\dagger}}{\|X_{n,j}^{0,\dagger}\|_{L^2(B_3)}}, \qquad \tilde{X}_{n,j}^{m,\dagger} := \frac{X_{n,j}^{m,\dagger}}{\|X_{n,j}^{m,\dagger}\|_{L^2(B_3)}}, \qquad \tilde{Y}_{n,j}^{m,\dagger} := \frac{Y_{n,j}^{m,\dagger}}{\|Y_{n,j}^{m,\dagger}\|_{L^2(B_3)}}. \qquad (34)$$

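The necessary norms are explicitly given above. Collecting all this one obtains the following explicit representation: For each $n \in \mathbb{N}_0$, the following $n+1$ solid spherical monogenics are orthonormal in $L^2(B_3;\mathbb{H};\mathbb{H}) \cap \ker\bar\partial$:

\[ \varphi_{n,\mathbb{H}}^{0} := \tilde X_{n,0}^{0,\dagger}, \qquad \varphi_{n,\mathbb{H}}^{l} := c_{n,l}\left(\tilde X_{n,0}^{l,\dagger} - \tilde Y_{n,3}^{l,\dagger}\right), \tag{35} \]

where $c_{n,l} = \sqrt{\dfrac{n+1}{2(n-l+1)}}$ and $l = 1,\dots,n$. Consider now for $f \in L^2(B_3;\mathbb{H};\mathbb{H})$ the Fourier series

\[ f = \sum_{n=0}^{\infty}\sum_{l=0}^{n} \varphi_{n,\mathbb{H}}^{l}\,\alpha_{n,l}, \qquad \alpha_{n,l} = \int_{B_3} \overline{\varphi_{n,\mathbb{H}}^{l}}\, f \, dV \tag{36} \]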


as well as the Taylor series

\[ f = \sum_{n=0}^{\infty}\sum_{l=0}^{n} A_n^l\, t_{n,l}, \qquad t_{n,l} = \frac{1}{n!}\,\bar\partial_C^{\,l}\,\partial_0^{\,n-l} f(x)\Big|_{x=0}. \tag{37} \]

Because each Appell polynomial $A_n^l$ is an $\mathbb{H}$-multiple of a single orthonormal basis polynomial $\varphi_{n,\mathbb{H}}^{l}$, one can compare the Fourier coefficients with the corresponding Taylor coefficients for each $n \in \mathbb{N}_0$ and $l = 0,\dots,n$:

\[ \varphi_{n,\mathbb{H}}^{l}\,\alpha_{n,l} = A_n^l\, t_{n,l}. \]

Finally, the desired relation between Fourier and Taylor coefficients is given by

\[ \alpha_{n,l} = 2^{\,l+1}\sqrt{\frac{\pi}{(2n+3)\,(n-l)!\,(n+l+1)!}}\;\bar\partial_C^{\,l}\,\partial_0^{\,n-l} f(x)\Big|_{x=0}. \]

As a consequence, one can conclude that for an arbitrary Appell polynomial $A_n^l$, $l = 0,\dots,n$, $n \in \mathbb{N}_0$, of the system, the $(n-l)$-fold application of $\partial_0$ and afterwards the $l$-fold application of $\bar\partial_C$ resolve to

\[ \bar\partial_C^{\,l}\,\partial_0^{\,n-l} A_n^l = n!. \]

5 Harmonic Conjugates

5.1 Poisson’s Formula

In real analysis, Hardy spaces are subspaces of $L^p$. They have, for instance, applications in control theory, scattering theory, and in several inner mathematical fields. Let $p \in (0,\infty)$, then a holomorphic function $u$ in $B = B_1(0) \subset \mathbb{H}$ belongs to the Hardy space (or holomorphic Hardy space) $H^p(B)$ if the condition $\|u\|_{H^p} := \big(\sup \dots$ […]

[…] $Z_k(x,y)$ are extended zonal harmonics. It should be mentioned that the given formula for the harmonic conjugation maps harmonic polynomials to monogenic polynomials.

6 Spherical Operators

Let $G$ be a domain on the sphere $S^2$ with the boundary $\Gamma$. A tangential vector field $u$ is called toroidal if its surface divergence vanishes. The surface divergence is just Günter’s derivative applied on the vector field $u$ via the inner product. It is of course necessary to study also the corresponding tangential derivatives and connections between them on smooth surfaces. The sphere appears as a special case. For forthcoming studies, the works Cnops (2002), Duduchava et al. (2006), and Mitrea (2002) are recommended.

6.1 Vector Derivative

Let $M$ be a 2-dimensional manifold in $\mathbb{R}^3$, a so-called hypersurface, and $B_r(x)$ a ball with radius $r$ around the point $x \in M$. Further let $h$ be a function defined on $M \cap B_r(x)$ and $H$ a smooth extension into $B_r(x)$. Furthermore, let $P_x$ be the orthogonal projection onto $T_xM$, the tangential plane at $x \in M$. Then the vector derivative $\nabla_M$ is defined by

\[ (\nabla_M h)(x) := \sum_{i=1}^{3} P_x e_i\,\partial_i H. \]

Let $y \in \mathbb{R}^3$, then the orthogonal projection onto $T_xM$ is given by

\[ P_x y := y - n_x\,(n_x \cdot y). \]


Because of $n_x \cdot P_x y = 0$, the image of any vector $y$ belongs to $T_xM$. In $\mathbb{R}^3$ the relation $n_x \times n_y = n_x \wedge n_y$ holds, and the double cross product yields the representation

\[ P_x y = -\,n_x \wedge (n_x \wedge y) = (n_x \wedge y) \wedge n_x. \]

For the vector derivative follows

\[ (\nabla_M h)(x) := \sum_{i=1}^{3} \left[(n_x \wedge e_i) \wedge n_x\right] \partial_i H. \]

Now let $D = \sum_{i=1}^{3} e_i\,\partial_i$ be the massless Dirac operator. The formal action of the projection $P_x$ on the Dirac operator $D$ is given by

\[ (\nabla_M)(h) = P_x(D)H := \left[D - n_x\,(n_x \cdot D)\right]H. \]

$\nabla_M$ is called Günter’s gradient. In components it follows that

\[ (\nabla_M)(h) = \sum_{j=1}^{3} \left[\partial_j - n_{x,j}\,(n_x \cdot D)\right] e_j H = \sum_{j=1}^{3} \mathcal{D}_j\, e_j H, \]

where $\mathcal{D}_j$ are called Günter’s derivatives.
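The projection formula can be checked numerically. The following is a minimal sketch, assuming that $n_x$ is a unit normal and taking the unit sphere as the surface, so that the normal at $x$ is $x$ itself; the function name `tangential_projection` and the test vectors are illustrative choices, not part of the original text.

```python
import numpy as np

def tangential_projection(n, y):
    """Orthogonal projection of y onto the plane with unit normal n: y - n (n . y)."""
    n = np.asarray(n, dtype=float)
    y = np.asarray(y, dtype=float)
    return y - n * np.dot(n, y)

# Point on the unit sphere; the outward unit normal there equals the point itself.
x = np.array([1.0, 2.0, 2.0]) / 3.0
y = np.array([0.3, -1.2, 0.5])

Py = tangential_projection(x, y)

# Same projection written as a double cross product, (n x y) x n.
Py_cross = np.cross(np.cross(x, y), x)

# Tangential part of the Euclidean gradient of H(x) = x_3 at the north pole.
north = np.array([0.0, 0.0, 1.0])
grad_H = np.array([0.0, 0.0, 1.0])
tangential_grad = tangential_projection(north, grad_H)
```

The image `Py` is orthogonal to the normal, the two expressions for the projection agree, and the gradient of the height function has no tangential component at the pole.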

6.2 Relations Between Derivatives

Following the alternative representation of the vector derivative and using the quaternionic multiplication rules, one gets

\[ \nabla_M = (n_x \wedge D)\,n_x = \sum_{i<j} e_{ij}\,(n_i\,\partial_j - n_j\,\partial_i)\,n_x = \dots \]

[…] $\sigma > 0$ denotes a positive constant. We then define the diagonal matrix $D$ by $D_{i,i} = \sum_j W_{i,j}$. Finally, the graph Laplacian is given by $L = D - W$.

3. Eigendecomposition of the graph Laplacian. Find the mapping $y = \{y_1,\dots,y_N\}$, $y_i \in \mathbb{R}^n$, by solving the minimization problem

\[ \arg\min_{y^T D y = I}\ \frac{1}{2}\sum_{i,j} \|y_i - y_j\|_2^2\, W_{i,j}, \tag{2} \]

which, in turn, is equivalent to solving the minimization problem

\[ \arg\min_{y^T D y = I}\ \operatorname{trace}\left(y^T L y\right). \tag{3} \]

Letting $z = D^{1/2} y$ in (3), it follows that solving the minimization problem (3) is equivalent to finding the first $n$ solutions to the generalized eigenproblem $Lv = \lambda Dv$, sorted in increasing order of $\lambda$. It is clear that $v_0 = (1,1,\dots,1)$ is a solution for $\lambda = 0$ (see, e.g., Chung 1997). If $G$ is a connected graph, this solution is unique for $\lambda = 0$. Hence, the problem can be refined further by assuming that $v_0 \in \ker(y)$, i.e., we only look for eigenvectors corresponding to nonzero eigenvalues.
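The eigendecomposition step can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions: heat-kernel weights are placed on the complete graph (no nearest-neighbor selection), the data are synthetic, and the names `graph_laplacian` and `laplacian_eigenmaps` are hypothetical. The generalized problem is reduced to a symmetric one through the substitution $z = D^{1/2} v$.

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Heat-kernel weights on the complete graph, the degrees, and L = D - W."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / sigma)
    np.fill_diagonal(W, 0.0)              # no self-loops
    d = W.sum(axis=1)                     # diagonal entries of D
    return W, d, np.diag(d) - W

def laplacian_eigenmaps(X, n_components, sigma=1.0):
    """Smallest nonzero eigenpairs of the generalized problem L v = lambda D v."""
    _, d, L = graph_laplacian(X, sigma)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # z = D^{1/2} v turns L v = lambda D v into a symmetric eigenproblem.
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, Z = np.linalg.eigh(L_sym)    # eigenvalues in ascending order
    V = d_inv_sqrt[:, None] * Z           # back-substitution v = D^{-1/2} z
    # Index 0 is the constant eigenvector with eigenvalue 0 and is skipped.
    return eigvals[1:n_components + 1], V[:, 1:n_components + 1]

rng = np.random.default_rng(0)
X = rng.random((15, 3))                   # 15 synthetic points in R^3
lam, Y = laplacian_eigenmaps(X, n_components=2)
```

The rows of `Y` are the embedded points $y_i$; by construction they satisfy the constraint $y^T D y = I$.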


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_50-1 © Springer-Verlag Berlin Heidelberg 2014

3.3 Schroedinger Eigenmaps

A major feature of Laplacian Eigenmaps and related methods is the preservation of local spectral distances, while reducing the overall dimension of the data represented by the aforementioned graph (cf., Chung 1997). However, Laplacian Eigenmaps and similar DR methods lead to fully automated classification algorithms which do not easily allow for expert input. To compensate for this deficiency, we note that the Laplace operator, $\Delta\Psi(x)$, can be extended to the time-independent Schroedinger operator $E$,

\[ E\Psi(x) = \Delta\Psi(x) + v(x)\Psi(x), \tag{4} \]

by adding a potential term $v(x)$. The potential $v$ in (4) is considered as a nonnegative multiplier operator. The discrete analogue of $E = \Delta + v$ is the matrix $E = L + V$, where $V$ is a nonnegative diagonal $N \times N$ potential matrix. We then replace the traditional Laplacian optimization problem (3) with the minimization problem

\[ \min_{y^T D y = I}\ \operatorname{trace}\left(y^T (L + \alpha V)\, y\right), \tag{5} \]

which, in turn, is equivalent to the minimization problem

\[ \min_{y^T D y = I}\ \left\{ \operatorname{trace}\left(y^T L y\right) + \operatorname{trace}\left(y^T \alpha V y\right) \right\}. \tag{6} \]

The parameter $\alpha \ge 0$ is added here so that it can be used to emphasize the relative significance of the potential $V$ with respect to the graph Laplace operator. The minimization problem (6) is equivalent to the minimization problem

\[ \min_{y^T D y = I}\ \left\{ \frac{1}{2}\sum_{i,j} \|y_i - y_j\|^2\, W_{i,j} + \alpha \sum_{i} V(i)\,\|y_i\|^2 \right\}, \tag{7} \]

where $V = \operatorname{diag}(V(1),\dots,V(N))$. The first sum in (7) incurs a penalty when neighboring points $x_i$ and $x_j$ are mapped into points $y_i$ and $y_j$, respectively, which are far apart. The second sum penalizes those points $y_i$, $i = 1,\dots,N$, which are associated with large values of $V(i)$. For example, if $V$ took only two values, 0 and 1, then the minimization (7) yields a dimension-reduced representation $y$, which forces increased clustering of the representations $y_i$ of points associated with the value $V(i) = 1$ while attempting to ensure that close points remain close after the dimension reduction. As such, we may utilize the potential $V$ to label points which we would like to be identified together after the dimension reduction. Because of the built-in preservation of topology induced by the Laplacian, this labeling may be used to segment a particular class of points.
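The effect of a two-valued potential can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not the published algorithm: the weights are dense heat-kernel weights on synthetic points, the labeled indices 0 and 7 are arbitrary, the value `alpha=1e4` is chosen only to make the effect visible, and the function name `schroedinger_eigenmaps` is hypothetical. It solves problem (5) through the generalized eigenproblem for $L + \alpha V$ with respect to $D$.

```python
import numpy as np

def schroedinger_eigenmaps(W, potential, alpha, n_components):
    """First eigenvectors of (L + alpha V) v = lambda D v, scaled so that Y^T D Y = I."""
    W = np.asarray(W, dtype=float)
    potential = np.asarray(potential, dtype=float)
    d = W.sum(axis=1)
    E = np.diag(d) - W + alpha * np.diag(potential)    # discrete operator L + alpha V
    d_inv_sqrt = 1.0 / np.sqrt(d)
    E_sym = d_inv_sqrt[:, None] * E * d_inv_sqrt[None, :]
    _, Z = np.linalg.eigh(E_sym)                       # ascending eigenvalues
    # For alpha = 0 the first column is the constant vector of plain Laplacian Eigenmaps.
    return d_inv_sqrt[:, None] * Z[:, :n_components]

rng = np.random.default_rng(1)
X = rng.random((12, 2))                                # synthetic data
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dist)
np.fill_diagonal(W, 0.0)

labeled = [0, 7]                                       # hypothetical expert labels
V = np.zeros(12)
V[labeled] = 1.0

Y_plain = schroedinger_eigenmaps(W, V, alpha=0.0, n_components=2)
Y_label = schroedinger_eigenmaps(W, V, alpha=1e4, n_components=2)

# Squared norms of the embedded labeled points, without and with the potential.
mass_plain = np.sum(Y_plain[labeled] ** 2)
mass_label = np.sum(Y_label[labeled] ** 2)
```

With a large $\alpha$ the second sum in (7) dominates for the labeled points, so their images are pulled toward the origin and hence toward each other, while the remaining points are still arranged by the graph term.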


3.4 Properties

It is known that the rescaled graph Laplacian on $G$ converges to the Laplace–Beltrami operator on the underlying manifold $\mathcal{M}$ (e.g., Belkin and Niyogi 2008) when we assume that all data points are connected. The Laplacian map on $X \subset \mathbb{R}^D$ is given by

\[ L_N^{\sigma} f(x_i) = f(x_i) \sum_j e^{-\frac{\|x_i - x_j\|^2}{\sigma}} - \sum_j f(x_j)\, e^{-\frac{\|x_i - x_j\|^2}{\sigma}}. \]

Replacing the $x_i$ by an arbitrary $x \in \mathbb{R}^D$ yields

\[ L_N^{\sigma} f(x) = f(x) \sum_j e^{-\frac{\|x - x_j\|^2}{\sigma}} - \sum_j f(x_j)\, e^{-\frac{\|x - x_j\|^2}{\sigma}}. \]

It turns out that there exists a positive constant $C$ such that for i.i.d. uniformly sampled data points $\{x_1,\dots,x_N\}$ and $\sigma_N = N^{-\frac{1}{n+2+s}}$, where $s > 0$, and $f \in C^{\infty}(\mathcal{M})$, we have the convergence

\[ \lim_{N\to\infty} C\,\frac{(\sigma_N)^{-\frac{n+2}{2}}}{N}\, L_N^{\sigma_N} f(x) = \Delta_{\mathcal{M}} f(x), \tag{8} \]

in probability (see Belkin and Niyogi 2008). This convergence carries over to the Schroedinger operator as we shall see next. Let $v$ be a given potential on the manifold $\mathcal{M}$. The associated matrix $V$ acting on a discrete $N$-point set is defined as $V = \operatorname{diag}(v(x_1),\dots,v(x_N))$. Since the potential does not depend on $\sigma$, we may replace $x_i$ by an arbitrary $x \in \mathbb{R}^D$ to obtain the map

\[ V_N f(x) = v(x) f(x). \]

Clearly, this extension coincides with the continuous potential on the manifold. As such, adding the discrete potential $V_N$ to the discrete Laplacian does not impede the convergence in (8). Consequently, the term

\[ C\,\frac{(\sigma_N)^{-\frac{n+2}{2}}}{N}\, L_N^{\sigma_N} f(x) + V_N f(x) \tag{9} \]

converges for $N \to \infty$ to $\Delta_{\mathcal{M}} f(x) + v(x) f(x)$ in probability (see Czaja and Halevy 2011; Halevy 2011). We note that (9) induces a specific choice of the parameter $\alpha$. Indeed, in order to consider $E = L + \alpha V$, rescaling of $L_N^{\sigma_N}$ in (9) implies the need to reversely rescale $V$. This can be done by means of multiplication with $\alpha = C^{-1} N (\sigma_N)^{\frac{n+2}{2}}$, which converges to infinity as $N \to \infty$.


3.5 Applications to Remote Sensing Although a mathematical novelty, kernel eigenmap methods associated with diffusion-based kernels have been used in various roles in applications to remote sensing. The nonlinear nature of hyperspectral satellite imagery data (HSI) has been analyzed and verified by Bachmann et al. (2005). This led to a wide range of applications of nonlinear techniques. LLE has been employed for dimensionality reduction and vector segmentation by Mohan et al. (2007). Castrodad used iterative schemes to perform semi-supervised multi-class classification and segmentation on HSI data (see Castrodad 2009). Kernel fusion based on LE-derived kernels has been exploited for the purpose of spatial–spectral integration in Benedetto et al. (2012b) and Duke (2012). Kernel methods in remote sensing have also been combined with other mathematical techniques, such as randomized anisotropic transforms (Chui and Wang 2010) or approximate graph constructions and randomized projections (Halevy 2011). For more in-depth discussion and additional examples of kernel DR methods in remote sensing, we refer to the work of Chui and Wang (2010). The Schroedinger Eigenmaps algorithm (Czaja and Ehler 2013) allows experts, e.g., trained analysts, to introduce their input in the form of additional, labeled information to improve the detection and classification processes. This labeled information can take the form of a barrier potential for the associated Schroedinger operator on a graph. This added potential steers the diffusion process, induced by the Schroedinger operator on the data-dependent graph, according to both the dynamics of the labeled data and the geometry of the underlying graph. This is the major difference from the case of Laplacian Eigenmaps, where the diffusion process is determined by the geometry of the data alone (cf., Coifman and Lafon 2006). In Benedetto et al. 
(2012a), the impact of Schroedinger Eigenmaps on classification is analyzed on multispectral and hyperspectral imagery. Efficient methods for building the potentials are based on expert ground-truth data and on automated clustering techniques, and it is shown that they lead to significant improvements in class separation (Benedetto et al. 2012a), see Fig. 1.

4 The Theory of Frames

4.1 Overview

Frames were introduced by Duffin and Schaeffer in 1952 (Duffin and Schaeffer 1952). However, their practical potential was not recognized until the 1990s. We refer the interested reader to other works for a more in-depth treatment of frames and their constructions and applications (Benedetto and Walnut 1994; Benedetto 1994; Casazza 1999; Christensen 2003; Kovačević and Chebira 2007, 2008). Since then, frames have been both generalized and specialized to allow for constructions of appropriately designed representation systems with varied features adapted to specific applications. Among the generalizations of frames, many ideas have been proposed in recent years, e.g., frames of subspaces (Casazza and Kutyniok 2003), pseudo-frames (Li and Ogawa 2004), fusion frames, oblique frames (Christensen and Eldar 2004), outer frames (Aldroubi et al. 2004), and multiplicative frames (Benedetto and Dellomo 2015). Finally, many of these constructions have been unified by an operator-based approach called g-frames (Sun 2006).


[Figure: two scatter-plot panels, “LE” and “SE with alpha = 100”]

Fig. 1 The effect of using Schroedinger potentials for decorrelating clusters as compared to traditional Laplacian embedding

4.2 Frames

A frame for a Hilbert space $H$ is a collection $\{x_i : i \in I\} \subset H$ of vectors such that there exist constants $0 < A \le B < \infty$ so that, for each $y \in H$,

\[ A\|y\|^2 \le \sum_{i \in I} |\langle x_i, y\rangle|^2 \le B\|y\|^2. \tag{10} \]

Constants $A$ and $B$, which satisfy (10), are called frame bounds of $\{x_i : i \in I\}$. Optimally chosen values $A$ and $B$ are referred to as the optimal frame bounds of the frame. When $A = B$, the frame $\{x_i : i \in I\}$ is referred to as a tight frame. As an example of a frame, one may choose an orthonormal basis; it is in fact a tight frame with constants $A = B = 1$. A union of any two orthonormal bases is a tight frame with constants $A = B = 2$. A union of an orthonormal basis with $N$ arbitrary unit norm vectors is a frame with, not necessarily optimal, bounds $A = 1$ and $B = N + 1$, and, in general, it is not a tight frame. Given a frame $\{x_i : i \in I\}$, a dual frame is a collection $\{\tilde x_i : i \in I\} \subset H$ of vectors such that for all $x \in H$, we have the reconstruction formula

\[ x = \sum_{i \in I} \langle x, x_i\rangle\, \tilde x_i. \]

Every frame possesses a dual frame. In order to obtain a dual frame to a given frame, we shall define the frame operator.
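In finite dimensions the optimal frame bounds can be read off numerically. The sketch below assumes real vectors stored as the rows of an array, in which case the optimal bounds are the smallest and largest eigenvalues of the matrix $\sum_i x_i x_i^T$; the function name `frame_bounds` and the example frames are illustrative choices.

```python
import numpy as np

def frame_bounds(vectors):
    """Optimal frame bounds of a finite real frame given as rows of an array."""
    F = np.asarray(vectors, dtype=float)
    S = F.T @ F                               # sum of the outer products x_i x_i^T
    eigvals = np.linalg.eigvalsh(S)           # ascending order
    return eigvals[0], eigvals[-1]

# An orthonormal basis of R^2.
basis = np.eye(2)

# Union of two orthonormal bases: the standard one and its rotation by 45 degrees.
c = np.sqrt(0.5)
rotated = np.array([[c, c], [-c, c]])
two_bases = np.vstack([basis, rotated])

# Three unit vectors at mutual angles of 120 degrees.
three_vectors = np.array([[0.0, 1.0],
                          [-np.sqrt(3) / 2, -0.5],
                          [np.sqrt(3) / 2, -0.5]])

# Orthonormal basis together with N = 2 further unit norm vectors.
extra = np.array([[0.6, 0.8], [1.0, 0.0]])
basis_plus_two = np.vstack([basis, extra])

bounds_basis = frame_bounds(basis)
bounds_two_bases = frame_bounds(two_bases)
bounds_three = frame_bounds(three_vectors)
bounds_plus_two = frame_bounds(basis_plus_two)
```

The first three examples are tight frames, while the last one has optimal bounds lying between the guaranteed values $A = 1$ and $B = N + 1 = 3$.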

4.3 Frame Operator

Let $\ell^2(I)$ denote the space of square-summable sequences indexed by $I$. Given a frame $\{x_i : i \in I\}$, the analysis operator $\Theta : H \to \ell^2(I)$ is defined as

\[ \Theta(x) = \left(\langle x, x_i\rangle\right)_{i \in I}. \]

The adjoint $\Theta^*$ of the analysis operator is called the synthesis operator, and $S = \Theta^*\Theta$ is the frame operator. The synthesis operator satisfies the equation

\[ \Theta^*(c) = \sum_{i \in I} c_i x_i, \]

where $c$ is any finitely supported sequence in $\ell^2(I)$. The following results are well known (e.g., Christensen 2003).

Theorem 1. Let $\{x_i : i \in I\} \subset H$ be a frame for $H$. Then the following are satisfied:

a. $\Theta$ is a bounded operator from $H$ into $\ell^2(I)$.
b. $\Theta^*$ extends to a bounded operator from $\ell^2(I)$ into $H$.
c. $\Theta$ and $\Theta^*$ are adjoint operators of each other.

Theorem 2. Let $\{x_i : i \in I\} \subset H$ be a frame for $H$. The frame operator $S = \Theta^*\Theta$ maps $H$ onto $H$ and is a positive invertible operator satisfying $A \cdot Id \le S \le B \cdot Id$ and $B^{-1} \cdot Id \le S^{-1} \le A^{-1} \cdot Id$. In particular, $\{x_i : i \in I\}$ is a tight frame if and only if $S = A \cdot Id$.

The sequence $\{S^{-1}x_i : i \in I\}$ of vectors in $H$ is called the canonical dual frame, and it is a dual frame for $\{x_i : i \in I\}$, i.e., we have

\[ x = \sum_{i \in I} \langle x, S^{-1}(x_i)\rangle\, x_i \]

and

\[ x = \sum_{i \in I} \langle x, x_i\rangle\, S^{-1}(x_i), \]

where both sums converge unconditionally in $H$. We note here that dual frames are not in general unique, and this underlies the importance of the canonical dual frame. On the other hand, there are significant applications of frames where dual frames other than the canonical dual are critical (Lammers et al. 2009).
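For a finite real frame the canonical dual and both reconstruction formulas can be verified directly. This is a minimal sketch with a hypothetical three-vector frame of $\mathbb{R}^2$; the frame vectors are stored as rows, so the analysis operator is the matrix `F` and the synthesis operator is its transpose.

```python
import numpy as np

# A non-tight frame of R^2 with three vectors (rows); chosen only for illustration.
F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

S = F.T @ F                                   # frame operator (synthesis after analysis)
# Rows of the canonical dual are S^{-1} x_i; solve with S instead of forming its inverse.
F_dual = np.linalg.solve(S, F.T).T

x = np.array([2.0, -3.0])

# x = sum_i <x, S^{-1} x_i> x_i
x_rec_1 = F.T @ (F_dual @ x)
# x = sum_i <x, x_i> S^{-1} x_i
x_rec_2 = F_dual.T @ (F @ x)

# Optimal frame bounds, for comparison with Theorem 2.
A, B = np.linalg.eigvalsh(S)[[0, -1]]
```

Both `x_rec_1` and `x_rec_2` reproduce `x` up to rounding, although the frame is redundant and not tight.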

4.4 Parseval Frames

For a particular given frame, it may not be easy to apply the procedure in the preceding paragraph to obtain a dual frame. One special case in which it is easy is that of Parseval frames. A Parseval frame is a tight frame with frame bound $A = B = 1$. For Parseval frames, we have that, for every $x \in H$,

\[ x = \sum_{i \in I} \langle x, x_i\rangle\, x_i. \tag{11} \]

In particular, Parseval frames are dual frames of themselves. For this reason, among others, Parseval frames are the best behaved of frames. The following theorem, which goes back to Naimark, who used different terminology, is the source of most of the basic general properties of Parseval frames.

Theorem 3. A collection $\{x_i : i \in I\} \subset H$ of vectors in $H$ is a Parseval frame for $H$ if and only if there exist a Hilbert space $K$ containing $H$ as a closed subspace and an orthonormal basis $\{e_i : i \in I\}$ for $K$ such that, for all $i \in I$, $Pe_i = x_i$, where $P$ is the orthogonal projection onto $H$.

In finite-dimensional Hilbert spaces, the notion of a frame becomes intuitively simple. Let $N \ge d$; then $\{x_i : i = 1,\dots,N\}$ is a frame for $\mathbb{F}^d$, where $\mathbb{F}$ denotes the field of real or complex numbers, if and only if it is a spanning system for $\mathbb{F}^d$. State-of-the-art mathematical algorithms construct frames through minimization of frame potential energy functions on complex manifolds (Benedetto and Fickus 2003; Strawn 2011).
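One direction of Theorem 3 is easy to try out numerically. In the sketch below, which is only an illustration, $K = \mathbb{R}^3$, the subspace $H$ is spanned by the first two coordinate axes, and the orthonormal basis of $K$ is taken from the QR factorization of a seeded random matrix; these choices are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
# Rows of Q form an orthonormal basis e_1, e_2, e_3 of K = R^3.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Orthogonal projection onto H = span of the first two coordinates:
# keeping the first two entries of each e_i gives x_i = P e_i, viewed in R^2.
F = Q[:, :2]                                  # rows are the frame vectors x_i

S = F.T @ F                                   # frame operator of the projected system

x = np.array([0.7, -1.9])
x_rec = F.T @ (F @ x)                         # sum_i <x, x_i> x_i, as in (11)
```

The frame operator `S` is the identity on $\mathbb{R}^2$, so the three projected vectors reconstruct every `x` without any dual frame computation.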

4.5 Applications to Remote Sensing Frame theoretic techniques are relatively new in remote sensing processing. However, as typical data collected in remote sensing experiments is far from being orthogonal (see Fig. 2), these techniques find novel applications. In Widemann (2008), Benedetto et al. (2009), and Hirn (2009), kernel eigenmap methods were used to map the high-dimensional space X to a low-dimensional feature space Y , and then a frame is constructed for Y , which plays the same role as endmembers play in linear mixture models. An original idea for constructing 2D tight frames that provide a new way to analyze, visualize, and process data at multiple scales and directions was proposed by Bosch et al. (2013).


[Figure: “Spectral Signatures of VegPasture (Red) and VegTrees (Blue)”; horizontal axis: wavelength in nanometers; vertical axis: percent reflectance, 250 = 25%]

Fig. 2 Spectral classes in HSI are typically non-orthogonal

This is achieved by employing the proper choice of functional components in directional signal representations to isolate directional information while, at the same time, effectively characterizing the underlying variation. This new family of frames has been shown to be suitable for a range of geostatistical applications, including superresolution and image inpainting. Examples of specifically constructed frames with built-in features have been utilized in remote sensing data processing (see, e.g., Bosch et al. 2009; Benedetto et al. 2010; Flake 2010). Olshausen and his collaborators (Charles et al. 2011) also used learned dictionaries which are effectively frames, in their work on improving the performance of supervised classification algorithms for HSI data.

5 Sparse Representation and Compressed Sensing

5.1 Overview

Since the introduction of multiscale techniques in image analysis, there has been a strong motivation to provide sparse representations with the ability to detect edges, in particular, and directional content in general. Although wavelets became a popular method to analyze multidimensional data, they fail to provide optimal $n$-term approximation rates for images with $C^2$ edges (see Candès and Donoho 2002). Therefore, a number of new representations have been introduced in an attempt to solve this problem. A few examples of these constructions are contourlets (Do and Vetterli 2002), curvelets (Candès and Donoho 2002), brushlets (Meyer and Coifman 1997), wedgelets (Donoho 1999), shearlets (Labate et al. 2005), and composite wavelets (Guo et al. 2006). A very different approach to induce sparsity in representation can be achieved by combining compression with sampling. This approach led to the introduction of one of the most fundamental models in data complexity reduction, which has been the focus of much recent attention: compressed sensing (CS). At its foundation is the concept of sparse signals. Given a basis for the ambient (potentially high-dimensional) space $\mathbb{R}^D$, a signal is called $K$-sparse if it can be represented using at most $K$ nonzero coefficients. The theory of CS (Candès and Tao 2005, 2006; Donoho and Tanner 2005; Candès et al. 2006a, b; Donoho 2006) exploits this model in order to maintain a low-dimensional representation of the signal from which a faithful approximation to the original signal can be recovered efficiently. Dimensionality reduction in CS is linear and nonadaptive, i.e., the mapping does not depend on the data.

5.2 K-Sparse Signals

CS theory states that $K$-sparse signals $x \in \mathbb{R}^D$ can be recovered from $n < D$ linear measurements $y = \Phi x$, where $\Phi$ represents an $n \times D$ measurement matrix. This can be achieved via the following recovery algorithm:

\[ \min_{\tilde x \in \mathbb{R}^D} \|\tilde x\|_1 \quad \text{subject to} \quad \Phi(\tilde x) = y, \tag{12} \]

where $\|x\|_1 = \sum_{j=1}^{D} |x_j|$. Naturally, as $n < D$, this recovery cannot be obtained with just any sensing matrix $\Phi$. Hence, we consider here matrices $\Phi$ which satisfy the restricted isometry property (RIP) of order $K$, that is, matrices for which there exists a constant $\delta_K \in (0,1)$, such that

\[ (1 - \delta_K)\,\|x\|_2^2 \le \|\Phi(x)\|_2^2 \le (1 + \delta_K)\,\|x\|_2^2, \]

for every $K$-sparse vector $x \in \mathbb{R}^D$. Under this assumption, Candès proved the following theorem in Candès (2008).

Theorem 4. Let $x^*$ be the solution of the minimization problem (12). Let $x_K$ denote the best $K$-sparse approximation to $x \in \mathbb{R}^D$. Assume that $\delta_{2K} < \sqrt{2} - 1$. Then, there exists $C > 0$ such that

\[ \|x^* - x\|_1 \le C\,\|x - x_K\|_1 \quad \text{and} \quad \|x^* - x\|_2 \le C K^{-1/2}\,\|x - x_K\|_1. \]

Theorem 4 clearly implies that for $K$-sparse vectors $x$, the recovery in (12) is exact. However, two types of questions now follow. One is how to find matrices $\Phi$ which satisfy RIP. The other is what is the fewest number of measurements we can afford to take and still recover the signal. While it is quite difficult to satisfactorily answer these questions in a deterministic manner, statistical concepts proved to be much easier to deal with. As such, one can assert that with high probability, every $K$-sparse signal $x \in \mathbb{R}^D$ can be recovered from just $n = O(K \log(D/K))$ measurements $y = \Phi x$, where $\Phi$ is an $n \times D$ measurement matrix drawn randomly from an acceptable probability distribution. This includes random sampling matrices $\Phi$ which have i.i.d. Bernoulli, Gaussian, or uniform entries (see Sect. 5.3). We note that RIP is just one of several ways to provide such conditions (see Wang 2013). We note that the number of samples $n$ is linear at the “information level,” i.e., with respect to $K$, and is logarithmic in terms of the ambient dimension $D$. The number of random samples $n$ is typically taken large enough to ensure that all $K$-sparse signals remain well separated when embedded in $\mathbb{R}^n$. CS theory applies equally well to signals that are not strictly sparse but compressible, i.e., if the coefficients in the signal’s representation decay fast enough. Furthermore, near-optimal recovery is guaranteed even in the presence of noise (e.g., Candès 2008).
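Recovery of a sparse vector from random measurements can be tried out in a few lines. Note that the sketch below does not solve the $\ell^1$ program (12); it swaps in orthogonal matching pursuit, a greedy method, because it needs nothing beyond NumPy. The Gaussian measurement matrix, the problem sizes, and the support of the test signal are assumptions made for the example.

```python
import numpy as np

def omp(Phi, y, K):
    """Greedy recovery of a K-sparse vector from y = Phi x (orthogonal matching pursuit)."""
    n, D = Phi.shape
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(K):
        # Pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit of y on the selected columns.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(D)
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
D, n, K = 200, 100, 3                             # ambient dimension, measurements, sparsity
Phi = rng.standard_normal((n, D)) / np.sqrt(n)    # i.i.d. Gaussian measurement matrix

x = np.zeros(D)
x[[5, 17, 142]] = [1.0, -1.0, 1.0]                # hypothetical 3-sparse signal
y = Phi @ x                                       # n < D linear measurements

x_hat = omp(Phi, y, K)
recovery_error = np.linalg.norm(x_hat - x)
```

The greedy method comes with its own recovery conditions, which differ from the RIP-based guarantee of Theorem 4, so this example only illustrates that far fewer than $D$ random measurements can suffice for a sparse signal.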

5.3 Random Projections

The notion of using a random projection for dimensionality reduction is not new. In fact, it can be traced back to long before the present wave of interest in CS. One fundamental result in which this type of dimension reduction manifests itself is the Johnson–Lindenstrauss Lemma (JL) (Johnson and Lindenstrauss 1984) (cf., Dasgupta and Gupta 1999), where one can use a random projection for a stable embedding of a finite data set, effectively providing dimension reduction.

Lemma 1 (Johnson–Lindenstrauss). Given $0 < \epsilon < 1$, a set $X$ of $N$ points in $\mathbb{R}^D$, and a number $n \ge O(\ln N)/\epsilon^2$, there is a Lipschitz function $f : \mathbb{R}^D \to \mathbb{R}^n$ such that, for all $u, v \in X$,

\[ (1 - \epsilon)\,\|u - v\| \le \|f(u) - f(v)\| \le (1 + \epsilon)\,\|u - v\|. \]

The statement of JL is completely deterministic. Surprisingly, probability enters in two different but related ways. On the one hand, the technique of proof of JL depends on concentration of measure inequalities (see Dasgupta and Gupta 1999; Baraniuk and Wakin 2009). On the other hand, the result itself is very useful from the perspective of CS. In Baraniuk et al. (2008), a fundamental connection was identified between CS theory and JL, despite the fact that the former allows for the embedding of an uncountable number of points. This connection allows us, in particular, to answer the question about construction of matrices which satisfy RIP due to the following theorem (Baraniuk et al. 2008).

Theorem 5. Let $D$, $n$, and $\delta$ be given. Assume that entries of the matrix $\Phi$ are independent realizations of a probability distribution satisfying

\[ \Pr\left( \left| \|\Phi(x)\|_{\ell_2^n}^2 - \|x\|_{\ell_2^D}^2 \right| \ge \epsilon\,\|x\|_{\ell_2^D}^2 \right) \le 2e^{-nC(\epsilon)}, \qquad \epsilon \in (0,1). \]

Then, there exist constants $C_1, C_2 > 0$ such that $\Phi$ satisfies RIP with $\delta$ and any $K \le C_1 n/\log(D/K)$ with probability exceeding $1 - 2e^{-C_2 n}$.

We note that computing random projections is relatively inexpensive: projecting $N$ points from $D$ to $n$ dimensions costs $O(DnN)$. Manifold models generalize the notion of sparsity beyond bases. These models arise whenever a signal in $\mathbb{R}^D$ is a continuous function of a $K$-dimensional parameter. For example, a pure sinusoid is completely determined by its amplitude, phase, and frequency. So a class of signals consisting of pure sinusoids would form a three-dimensional manifold in $\mathbb{R}^D$. The dimension of the manifold under this model is analogous to the sparsity level in the CS model. In Baraniuk and Wakin (2009), the authors extend CS theory by demonstrating that random linear projections can be used to map the high-dimensional manifold-modeled data to a low-dimensional space while, with high probability, approximately preserving all pairwise distances between points.
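The distance-preserving behavior of a random projection is easy to observe empirically. The following sketch assumes a scaled Gaussian matrix as the random map and synthetic Gaussian data; the sizes `D`, `n`, and `N` are arbitrary illustrative values and are not tied to the constants in Lemma 1.

```python
import numpy as np

def pairwise_distances(P):
    """Euclidean distances between all rows of P."""
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

rng = np.random.default_rng(0)
D, n, N = 1000, 400, 20                           # ambient dim, target dim, number of points

X = rng.standard_normal((N, D))                   # synthetic data set
Phi = rng.standard_normal((n, D)) / np.sqrt(n)    # random linear map R^D -> R^n
Y = X @ Phi.T                                     # projected points; cost O(D n N)

upper = np.triu_indices(N, k=1)                   # each unordered pair once
ratios = pairwise_distances(Y)[upper] / pairwise_distances(X)[upper]

# Smallest epsilon for which the two-sided bound of Lemma 1 holds on this sample.
eps_observed = np.max(np.abs(ratios - 1.0))
```

All 190 distance ratios cluster around 1, and `eps_observed` shrinks as the target dimension `n` grows.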

5.4 Applications to Remote Sensing CS has many promising applications in signal acquisition, compression, and medical imaging. Among them, new possibilities arise for using CS in remote sensing. Based on CS concepts, Patel et al. introduced a new synthetic aperture radar (SAR) imaging modality which can provide a high-resolution map of the spatial distribution of targets and terrain while using a significantly reduced number of waveforms (Patel et al. 2010). Deloye et al. (2013) analyze the coded aperture snapshot spectral imager (CASSI) system, which is a class of imaging spectrometers that provide implementation of compressive sensing ideas for hyperspectral imaging. In the context of HSI data, Greer (2013) shows that standard theoretical guarantees do not apply to the performance of classical CS reconstruction algorithms such as orthogonal matching pursuit (OMP) and basis pursuit (BP). He introduces a new algorithm, sparse demixing (SD), and proves its optimality in reconstruction sparsity and accuracy.

6 Diffusion-Based Image Processing 6.1 Overview Diffusion-based methods have been extensively used in image analysis, both as self-contained techniques (Emmerich 2003; Bertozzi et al. 2007) and as tools for approximating the total variation (TV) functional (see, e.g., Rudin et al. 1992; Chambolle and Lions 1997). Further, even earlier, wavelet-type systems appeared in the context of variational problems (e.g., Chan et al. 2006; Elad et al. 2005). In particular, Chan et al. (2006) discusses the TV minimization in the wavelet domain that successfully reproduces lost coefficients. A related work (Elad et al. 2005) describes an algorithm for filling in holes in overlapping texture and cartoon image layers by means of a direct extension of the image decomposition method called morphological component analysis (MCA). The relationship between wavelet-based image processing algorithms and variational problems is analyzed in Chambolle et al. (1998). Shearlet-based TV minimization utilized for image denoising is studied in Easley et al. (2009).

6.2 Ginzburg–Landau Energy

The Ginzburg–Landau (GL) energy functional (Ginzburg and Landau 1950; Chan and Shen 2005),

\[ GL(u) = \frac{\epsilon}{2}\int |\nabla u(x)|^2\,dx + \frac{1}{4\epsilon}\int W(u)\,dx, \qquad W(u) = (u^2 - 1)^2, \tag{13} \]

may be considered as a diffuse interface approximation to the TV functional $\int |\nabla u|\,dx$ for the case of binary images. Based on this concept, several efficient algorithms for deconvolution, image inpainting, superresolution, and other applications have been proposed (see, e.g., Dobrosotskaya and Bertozzi 2008, 2013; Dobrosotskaya and Czaja 2013). The common approach has been to use modifications of the GL functional as the primary regularizer for the solution of an ill-posed problem. In those works, $\int |\nabla u(x)|^2\,dx$ has been replaced with wavelet-based, shearlet-based, and related seminorms, with the goal of removing the “fuzzy” diffuse interface features and utilizing advantages of sparse directional representations. Characterizing signal regularity in terms of the decay of wavelet coefficients via a Besov seminorm (cf., Mallat 1999) allows one to construct a method with properties similar to PDE-based methods but without an $\epsilon$-scale blur.
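A discrete version of the GL energy (13) makes the roles of its two terms concrete. The sketch below is one possible discretization, assuming a uniform grid on the unit square, forward differences for the gradient, and a simple Riemann sum for the integrals; the function name `gl_energy` and the test images are illustrative.

```python
import numpy as np

def gl_energy(u, eps, h):
    """Discrete Ginzburg-Landau energy of a 2D array u sampled with grid spacing h."""
    ux = np.diff(u, axis=0) / h                   # forward differences
    uy = np.diff(u, axis=1) / h
    gradient_term = 0.5 * eps * (np.sum(ux ** 2) + np.sum(uy ** 2)) * h * h
    well_term = np.sum((u ** 2 - 1.0) ** 2) * h * h / (4.0 * eps)   # double-well W(u)
    return gradient_term + well_term

m = 64
h = 1.0 / m
eps = 0.1

binary_flat = np.ones((m, m))                     # u = 1 everywhere: a minimum of W
gray_flat = np.zeros((m, m))                      # u = 0 everywhere: the top of the well
step = np.ones((m, m))
step[:, : m // 2] = -1.0                          # binary image with one sharp edge

energy_flat = gl_energy(binary_flat, eps, h)
energy_gray = gl_energy(gray_flat, eps, h)
energy_step = gl_energy(step, eps, h)
```

Constant binary images carry no energy, intermediate gray values are penalized by the double-well term, and a sharp edge is penalized only through the gradient term.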

6.3 Composite Wavelet Ginzburg–Landau Energy

Our goal is to explore the possibilities arising from constructions of frames with directional content, such as curvelets, shearlets, or, more generally, composite wavelets. For this purpose, let $\psi_{i,\gamma} = D_a^i L_\gamma \psi$, where $D_a$ denotes the dilation and $L_\gamma$ the generalized shift associated with a composite wavelet construction, for parameters $i \in \mathbb{Z}$ and $\gamma \in \Gamma$. We do not specify the choice of the parameter set $\Gamma$, as it depends on the choice of the directional multiscale representation, and we refer the reader to Czaja et al. (2013) for details. For more information on the structure of composite wavelets and their theoretical underpinnings, we refer to Guo et al. (2006). For any $u \in L^2([0,1]^2)$, we define the composite wavelet seminorm as

\[ |u|_{CW}^2 = \sum_{i=0}^{\infty} |\det a|^i \sum_{\gamma \in \Gamma} |\langle u, \psi_{i,\gamma}\rangle|^2. \]

We define the composite wavelet GL energy (CWGL) as

\[ CWGL(u) = \frac{\epsilon}{2}\,|u|_{CW}^2 + \frac{1}{4\epsilon}\int W(u)\,dx. \]

Then the “composite wavelet Allen–Cahn equation,” i.e., the gradient descent minimization equation for CWGL, is

\[ u_t = \epsilon\,\Delta_{CW} u - \frac{1}{\epsilon}\,W'(u), \]

where

\[ \Delta_{CW} u = -\sum_{i=0}^{\infty} |\det a|^i \sum_{\gamma \in \Gamma} \langle u, \psi_{i,\gamma}\rangle\, \psi_{i,\gamma} \]

is the “composite wavelet Laplace” operator. This means that we are replacing the regular Laplace operator that appears in the classical Allen–Cahn equation and that expresses the gradient descent minimization of GL energy.


Fig. 3 Data recovery by means of CWGL algorithm with post-processing

On the one hand, the minimizers of CWGL energy exhibit properties similar to those in the classical diffuse interface model. On the other hand, the dependence on $\epsilon$ is different for CWGL. In fact, $\epsilon$ defines the dominant wavelet scales in the decomposition of the minimizer (see Dobrosotskaya and Bertozzi 2013), and since the wavelet functions are well localized, increasing $\epsilon$ causes the low phase transition blur, which provides advantages in reconnecting edges over large gaps without losing the image sharpness.

6.4 Applications to Remote Sensing Consider an image f as a function on Œ0; 1 2 , and let  Œ0; 1 2 be the domain where we assume this image to be known. The goal of image inpainting (data recovery) is to recover the values f .x/ for x 2 C , the complement of . In this model, the GL-type energy term plays the role of a regularizer, while the forcing term is expressed as the L2 norm between the minimizer u and the known image f on the known domain. In the following formula, we consider composite wavelet GL energy and seminorm; however, without loss of generality, these can be replaced with other types of GL energy functionals. Define E.u/ D CWGL.u/ C

ku  f k2L2 . / : 2

One recovers the complete image as the minimizer of this modified functional E. In order to find this minimizer, it is necessary to consider it as a stable state solution of the respective gradient descent equation 1 ut D CW u  W 0 .u/  1 .u  f /;  where 1 is the characteristic function of the known portion of the domain. In a series of works (Dobrosotskaya and Bertozzi 2008, 2010, 2013), Bertozzi and Dobrosotskaya studied a wavelet analogue of the Ginzburg–Landau energy with an additional edgepreserving forcing term. Their applications include inpainting, superresolution, segmentation, denoising, and contour detection, and they have been used, for example, in partial road classification and inpainting for satellite imagery. Page 17 of 22


In Dobrosotskaya and Czaja (2013), this work has been extended to exploit directional content based on shearlet representations. In Czaja et al. (2013), the above approach is used in an algorithm to recover missing data. The algorithm benefits from the sparsity of composite wavelet representations, especially when compared with inpainting algorithms induced by traditional wavelet representations (cf. Fig. 3).

Acknowledgements The first named author gratefully acknowledges the support of MURI-ARO Grant W911NF-0901-0383, NGA Grant HM - 1582-08-1-0009, and DTRA Grant HDTRA 1-13-1-0015. The second named author gratefully acknowledges the support of NSF through grant CBET 0854233, NGA through grant HM - 1582-08-1-0009, and DTRA through grant HDTRA 1-13-1-0015.

References
Aldroubi A, Cabrelli C, Molter U (2004) Wavelets on irregular grids with arbitrary dilation matrices and frame atoms for L2(R^d). Appl Comput Harmon Anal 17(2):119–140
Bachmann CM, Ainsworth TL, Fusina RA (2005) Exploiting manifold geometry in hyperspectral imagery. IEEE Trans Geosci Remote Sens 43(3):441–454
Banerjee A, Burlina P, Broadwater J (2007) A machine learning approach for finding hyperspectral endmembers. In: IEEE international geoscience and remote sensing symposium, Barcelona, 2007, pp 3817–3820
Baraniuk RG, Wakin MB (2009) Random projections of smooth manifolds. Found Comput Math 9(1):51–77
Baraniuk R, Davenport M, DeVore R, Wakin M (2008) A simple proof of the restricted isometry property for random matrices. Constr Approx 28(3):253–263
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396
Belkin M, Niyogi P (2008) Towards a theoretical foundation for Laplacian-based manifold methods. J Comput Syst Sci 74(8):1289–1308
Benedetto JJ (1994) Frame decompositions, sampling and uncertainty principle inequalities. In: Benedetto J, Frazier M (eds) Wavelets: mathematics and applications. CRC, Boca Raton, pp 247–304
Benedetto JJ, Czaja W, Dobrosotskaya J, Doster T, Duke K, Gillis D (2012a) Semi-supervised learning of heterogeneous data in remote sensing imagery. In: Independent component analyses, compressive sampling, wavelets, neural net, biosystems, and nanoengineering X, Baltimore. Proceedings of SPIE, vol 8401, 8401-03
Benedetto JJ, Czaja W, Dobrosotskaya J, Doster T, Duke K, Gillis D (2012b) Integration of heterogeneous data for classification in hyperspectral satellite imagery. In: Algorithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XVIII, Baltimore. Proceedings of SPIE, vol 8390, 8390-78
Benedetto JJ, Czaja W, Ehler M, Flake C, Hirn M (2010) Wavelet packets for multi- and hyperspectral imagery. In: Wavelet applications in industrial processing VII, San Jose. Proceedings of SPIE, vol 7535, 7535-08


Benedetto JJ, Czaja W, Flake JC, Hirn M (2009) Frame based kernel methods for automatic classification in hyperspectral data. In: IEEE IGARSS, Cape Town
Benedetto JJ, Fickus M (2003) Finite normalized tight frames. Adv Comput Math 18:357–385
Benedetto JJ, Walnut D (1994) Gabor frames for L2 and related spaces. In: Benedetto J, Frazier M (eds) Wavelets: mathematics and applications. CRC, Boca Raton, pp 97–162
Benedetto JJ, Dellomo M (2015, preprint) Reactive sensing and multiplicative frames
Bertozzi A, Esedoglu S, Gillette A (2007) Analysis of a two-scale Cahn-Hilliard model for image inpainting. Multiscale Model Simul 6(3):913–936
Boardman J, Kruse F, Green R (1995) Mapping target signatures via partial unmixing of AVIRIS data. In: Fifth JPL Airborne Earth Science Workshop, Pasadena. Volume 1 of JPL Publication 95-1, pp 23–26
Bowles J, Palmadesso P, Antoniades J, Baumbeck M, Rickard L (1995) Use of filter vectors in hyperspectral data analysis. Proc SPIE 2553:148–157
Bosch EH, Castrodad A, Cooper JS, Czaja W, Dobrosotskaya J (2013) Multiscale and multidirectional tight frames for image analysis. Proc SPIE 8750
Bosch EH, González A, Vivas J, Easley G (2009) Directional wavelets and a wavelet variogram for two-dimensional data. Math Geosci 41(6):611–641
Candès EJ (2008) The restricted isometry property and its implications for compressed sensing. Comptes Rendus de l’Académie des Sciences 346:589–592
Candès EJ, Donoho DL (2002) New tight frames of curvelets and optimal representations of objects with piecewise-C^2 singularities. Commun Pure Appl Math 57:219–266
Candès EJ, Romberg J, Tao T (2006a) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans Inf Theory 52:489–509
Candès EJ, Romberg J, Tao T (2006b) Stable signal recovery from incomplete and inaccurate measurements. Commun Pure Appl Math 59:1207–1223
Candès EJ, Tao T (2005) Decoding by linear programming. IEEE Trans Inf Theory 51:4203–4215
Candès EJ, Tao T (2006) Near-optimal signal recovery from random projections: universal encoding strategies. IEEE Trans Inf Theory 52:5406–5425
Casazza P (1999) The art of frame theory. arXiv preprint math/9910168
Casazza P, Kutyniok G (2003) Frames of subspaces. In: Wavelets, frames and operator theory. Contemporary mathematics, vol 345. American Mathematical Society, Providence, pp 87–113
Castrodad A (2009) Graph-based denoising and classification of hyperspectral imagery using nonlocal operators. In: Algorithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XV, Orlando. Proceedings of SPIE, vol 7334, 7334-0E
Chambolle A, Lions P-L (1997) Image recovery via total variation minimization and related problems. Numer Math 76:167–188
Chambolle A, DeVore RA, Lee N, Lucier BJ (1998) Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage. IEEE Trans Image Process 7(3):319–333
Chan TF, Shen J (2005) Image processing and analysis: variational, PDE, wavelet, and stochastic methods. SIAM, Philadelphia
Chan TF, Shen J, Zhou H-M (2006) Total variation wavelet inpainting. J Math Imaging Vis 25:107–125
Charles AS, Olshausen BA, Rozell CJ (2011) Learning sparse codes for hyperspectral imagery. IEEE J Sel Top Signal Process 5(5):963–978
Christensen O (2003) An introduction to frames and Riesz bases. Birkhäuser, Boston


Christensen O, Eldar Y (2004) Oblique dual frames and shift-invariant spaces. Appl Comput Harmon Anal 17(1):48–68
Chui CK, Wang J (2010) Randomized anisotropic transform for nonlinear dimensionality reduction. Int J Geomath 1(1):23–50
Chui CK, Wang J (2010) Dimensionality reduction of hyper-spectral imagery data for feature classification. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, vol 1. Springer, Berlin/Heidelberg, pp 1005–1048
Chung FRK (1997) Spectral graph theory. CBMS regional conference series in mathematics, vol 92. American Mathematical Society, Providence
Coifman RR, Lafon S (2006) Geometric harmonics: a novel tool for multiscale out-of-sample extension of empirical functions. Appl Comput Harmon Anal 21(1):31–52
Coifman RR, Lafon S, Lee AB, Maggioni M, Nadler B, Warner FJ, Zucker SW (2005) Geometric diffusions as a tool for harmonic analysis and structure definition of data. Part I: diffusion maps. Proc Natl Acad Sci 102:7426–7431
Coifman RR, Maggioni M (2006) Diffusion wavelets. Appl Comput Harmon Anal 21(1):53–94
Czaja W, Ehler M (2013) Schroedinger eigenmaps for the analysis of biomedical data. IEEE Trans Pattern Anal Mach Intell 35(5):1274–1280
Czaja W, Halevy A (2011, preprint) On convergence of Schroedinger eigenmaps
Czaja W, Dobrosotskaya J, Manning B (2013) Composite wavelet representations for reconstruction of missing data. Proc SPIE 8750
Dasgupta S, Gupta A (1999) An elementary proof of the Johnson-Lindenstrauss lemma. Technical report 99-006, UC Berkeley
Deloye CJ, Flake JC, Kittle D, Bosch EH, Rand RS, Brady DJ (2013) Exploitation performance and characterization of a prototype compressive sensing imaging spectrometer. In: Excursions in harmonic analysis, vol 1. Applied and numerical harmonic analysis. Birkhäuser, Boston, pp 151–171
Do MN, Vetterli M (2002) Contourlets: a directional multiresolution image representation. In: Proceedings of IEEE international conference on image processing (ICIP), Rochester
Dobrosotskaya J, Bertozzi A (2008) A wavelet-Laplace variational technique for image deconvolution and inpainting. IEEE Trans Image Process 17(5):657–663
Dobrosotskaya J, Bertozzi A (2010) Wavelet analogue of the Ginzburg-Landau energy and its gamma-convergence. Interfaces Free Bound 12(2):497–525
Dobrosotskaya J, Bertozzi A (2013) Analysis of the wavelet Ginzburg-Landau energy in image applications with edges. SIAM J Imaging Sci 6(1):698–729
Dobrosotskaya J, Czaja W (2013, preprint) Shearlet Ginzburg-Landau energy, its gamma convergence and applications
Donoho DL (1999) Wedgelets: nearly minimax estimation of edges. Ann Stat 27(3):859–897
Donoho DL (2006) Compressed sensing. IEEE Trans Inf Theory 52(4):1289–1306
Donoho DL, Grimes C (2003) Hessian eigenmaps: new locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci 100:5591–5596
Donoho D, Tanner J (2005) Sparse nonnegative solutions of underdetermined linear equations by linear programming. Proc Natl Acad Sci 102(27):9446–9451
Duffin RJ, Schaeffer AC (1952) A class of nonharmonic Fourier series. Trans Am Math Soc 72:341–366
Duke K (2012) A study of the relationship between spectrum and geometry through Fourier frames and Laplacian eigenmaps. Ph.D. thesis, University of Maryland, College Park


Easley GR, Labate D, Colonna F (2009) Shearlet based total variation for denoising. IEEE Trans Image Process 18(2):260–268
Elad M, Starck JL, Querre P, Donoho DL (2005) Simultaneous cartoon texture image inpainting using morphological component analysis. Appl Comput Harmon Anal 19:340–358
Emmerich H (2003) Diffuse interface approach in materials science thermodynamic concepts and applications of phase-field models. Springer, Berlin/New York
Flake JC (2010) The multiplicative Zak transform, dimension reduction, and wavelet analysis of LIDAR data. Ph.D. thesis, University of Maryland, College Park
Gillis D, Bowles J (2013) An introduction to hyperspectral image data modeling. In: Excursions in harmonic analysis, vol 1. Applied and numerical harmonic analysis. Birkhäuser, Boston, pp 173–194
Ginzburg VL, Landau LD (1950) Zh Eksp Teor Fiz 20:1064
Goldberg Y, Zakai A, Kushnir D, Ritov Y (2008) Manifold learning: the price of normalization. J Mach Learn Res 9:1909–1939
Greer JB (2013) Hyperspectral demixing: sparse recovery of highly correlated endmembers. In: Excursions in harmonic analysis, vol 1. Applied and numerical harmonic analysis. Birkhäuser, Boston, pp 195–210
Guo K, Labate D, Lim W-Q, Weiss G, Wilson E (2006) The theory of wavelets with composite dilations. In: Heil C (ed) Harmonic analysis and applications. Applied and numerical harmonic analysis. Birkhäuser, Boston, pp 231–250
Halevy A (2011) Extensions of Laplacian eigenmaps for manifold learning. Ph.D. thesis, University of Maryland, College Park
Hirn M (2009) Enumeration of harmonic frames and frame based dimension reduction. Ph.D. thesis, University of Maryland, College Park
Johnson WB, Lindenstrauss J (1984) Extensions of Lipschitz mappings into a Hilbert space. Contemp Math 26:189–206
Kovačević J, Chebira A (2007) Life beyond bases: the advent of frames (parts I and II). IEEE Signal Process Mag 24(4):86–104 and 24(5):115–125
Kovačević J, Chebira A (2008) Introduction to frames. Foundations and trends in signal processing, vol 2(1). Now Publishers, Boston
Labate D, Lim W, Kutyniok G, Weiss G (2005) Sparse multidimensional representation using shearlets. In: Wavelets XI, San Diego. SPIE proceedings, vol 5914, pp 254–262
Lammers M, Powell A, Yilmaz Ö (2009) Alternative dual frames for digital-to-analog conversion in sigma-delta quantization. Adv Comput Math 32(1):73–102
Lee JA, Verleysen M (2007) Nonlinear dimensionality reduction. Springer, New York/London
Li S, Ogawa H (2004) Pseudoframes for subspaces with applications. J Fourier Anal Appl 10(4):409–431
Mallat S (1999) Wavelet tour of signal processing. Academic, San Diego
Meyer F, Coifman R (1997) Brushlets: a tool for directional image analysis and image compression. Appl Comput Harmon Anal 4:147–187
Mohan A, Sapiro G, Bosch E (2007) Spatially coherent nonlinear dimensionality reduction and segmentation of hyperspectral images. IEEE Geosci Remote Sens Lett 4(2):206–210
Patel VM, Easley GR, Healy DM Jr, Chellappa R (2010) Compressed synthetic aperture radar. IEEE J Sel Top Signal Process 4(2):244–254
Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 2(6):559–572


Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Rudin LI, Osher S, Fatemi E (1992) Nonlinear total variation based noise removal algorithms. Physica D 60:259–268
Schölkopf B, Smola A, Müller K (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319
Strawn N (2011) Geometric structures and optimization on spaces of finite frames. Ph.D. thesis, University of Maryland, College Park
Sun W (2006) G-frames and g-Riesz bases. J Math Anal Appl 322(1):437–452
Tenenbaum JB, de Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290:2319–2323
Wang R (2013) Global geometric conditions on dictionaries for the convergence of L1 minimization algorithms. Ph.D. thesis, University of Maryland, College Park
Widemann D (2008) Dimensionality reduction for hyperspectral data. Ph.D. thesis, University of Maryland, College Park
Winter M (1999) N-FINDR: an algorithm for fast autonomous spectral endmember determination in hyperspectral data. Proc SPIE 3753:266–275


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_52-2 © Springer-Verlag Berlin Heidelberg 2014

Potential Methods and Geoinformation Systems
Hans-Jürgen Götze
Institut für Geowissenschaften, Geophysik, Christian-Albrechts-Universität zu Kiel, Kiel, Germany

Abstract The geophysical methods of gravity and magnetics belong, together with geoelectrics, to the potential methods. This chapter focuses on gravity and magnetic methods. Their fields can be described by the Laplace and Poisson differential equations, depending on whether the observation is taken outside or inside the masses. Potentials were introduced to reduce vector fields to scalar fields, because the mathematical treatment of scalar fields is numerically easier. Gravity and magnetic exploration can help to locate faults, mineral or petroleum resources, and groundwater reservoirs. The interpretation of gravity and magnetic fields and their respective anomalies is not unique, and boundary conditions are always required. Geoinformation systems can help to overcome the ambiguity of potential methods and support integrated modeling of potential fields by the allocation of boundary conditions, data/information fusion, and advanced visualization at different scales. These systems should help to facilitate 3D (and even 4D) interpretation that is based on data from multiple sources. 3D potential field forward modeling and inversion, visualization, and metadata handling facilitate interdisciplinary interpretation across the fields of geophysics and geoinformatics.

1 Introduction
1.1 What Are Potential Methods?
Every arrow that flies, feels the attraction of the Earth. (Henry Wadsworth Longfellow)
Potential Theory belongs to the oldest branches of Theoretical Physics, and it probably initiated the field of Theoretical Physics itself, or the subject of “Mathematical Physics,” at the end of the Renaissance epoch. In the meantime, the spectrum of research fields of Theoretical Physics has extended tremendously, and less space and time are devoted today to Potential Theory and potential methods, to their principal theorems, boundary value problems, and their solutions, as well as to the analysis of spherical harmonics. They have moved somewhat out of focus to make way for more advanced concepts of modern physics. However, for an in-depth understanding of geophysics, Potential Theory and the application of potential methods are still essential and include many details. They are used to investigate the Earth’s interior and its dynamic processes. A stimulating brief history of potential methods, particularly gravimetry and magnetics, is contained in the first chapters of R.J. Blakely’s textbook Potential Theory in Gravity and Magnetic Applications (Blakely 1995). Readers who find this introduction too short are referred to Ramsey (1940), Kellog (1953), or MacMillan (1958).


1.2 Potential Fields
In physics a field is a set of functions of space and time that assigns physical quantities (e.g., temperature, density, or the magnetic susceptibility of rocks) to each point in space and time. According to the character of these quantities, one distinguishes between scalar, vector, and tensor fields. The field quantities are expected to be piecewise smooth in the mathematical sense. Examples of scalar fields are pressure, temperature, and density. In contrast, vector fields are characterized by three components (functions) in an orthogonal space; examples are deformation fields, gravity and magnetic fields, and the velocity of flow fields. Stress is an example of a tensor field. The mathematical treatment of vector fields with a specific geometry is rather simple if a suitable coordinate system is selected. The corresponding field equations are clearly arranged in orthogonal systems, because the coordinate surfaces intersect at right angles. At each point in space, the field quantities themselves are independent of the choice of coordinate system, and so are the results of applying differential and integral operators to them, which are invariant. The components of vector and tensor fields, however, depend on the selected coordinate system. Readers who are not familiar with tensors may refer, e.g., to Fleisch (2011) for basic information.
In general there are three main potential fields of further interest in geophysics:
• Scalar electric potential V
• Magnetic vector potential A
• Scalar gravitational potential U
They, respectively, define three main force fields:
• Electric field E
• Magnetic field B
• Gravitational field g
The following equations show how they relate:

$$-\nabla V = E$$

A gradient is an incline, an increase in some quantity over some distance. A change of the scalar electric potential V over some distance establishes an electric field.

$$\nabla \times A = B$$

The curl of the magnetic vector potential gives the magnetic force field.
A magnetic field line may be visualized as the central axis of a vortex made of vector potential. If the vector potential A has zero vorticity, then no magnetic field arises, yet it can still distort in other ways by fluctuating, diverging, or compressing.

$$-\nabla U = g$$

This equation is similar to the first. It says that the gravitational force field g points down toward the Earth’s center and accelerates falling masses at an average rate of 9.81 m/s^2. It is the negative gradient of the gravitational potential U. The gravitational potential increases with height, forming a gradient whose downward slope points toward the ground. There is a fourth equation, which relates the magnetic vector potential A to the electric field E:

$$-\frac{\partial A}{\partial t} = E$$

It states that if A increases over time, an electric field arises that points opposite to the direction of A. This is why changing magnetic fields cause electric fields and vice versa; magnetic fields consist of curled vector potentials, and a change in the latter manifests itself as an electric field.
In the course of this chapter, the focus is set on gravity and magnetic methods. Their fields can be described by the Laplace and Poisson differential equations, depending on whether the observations are taken outside or inside the masses. A potential is introduced to reduce a vector field to a scalar field, because the mathematical treatment is numerically easier. Potential Theory provides the rules and formalisms to do so. In gravity and magnetic surveys, the spatial and temporal variations and the absolute values of the Earth’s gravity and magnetic fields are measured, determining either single components in the three spatial directions (vector measurements) or the total value (scalar measurements). Gravity and magnetic exploration, also referred to as potential field exploration, gives geoscientists an indirect way to explore the Earth’s subsurface by sensing different physical properties of rocks (density and magnetization, respectively). Gravity and magnetic exploration can help to locate faults, mineral or hydrocarbon resources, and groundwater reservoirs: USGS: http://pubs.usgs.gov/fs/fs-0239-95/fs-0239-95.pdf. LIAG: http://www.liag-hannover.de/en/s/s1/p1/publications-products-potential-field-maps.html. However, an inherent characteristic of gravity and magnetic fields is their nonuniqueness. Therefore, it is necessary to introduce independent information into the interpretation of the fields. This is one crucial point where geoinformation systems will support modeling and interpretation.
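The relation between the gravitational potential and the gravitational field can be checked numerically. The following is a minimal sketch, assuming a point-mass Earth with the sign convention U = -GM/r, so that U increases with height and g is the negative gradient of U; the constants and the function names `potential` and `gravity` are illustrative choices, not taken from this chapter.

```python
import math

G = 6.674e-11           # gravitational constant in m^3 kg^-1 s^-2
EARTH_MASS = 5.972e24   # kg (assumed round value)
EARTH_RADIUS = 6.371e6  # m (assumed mean radius)


def potential(x, y, z):
    """Point-mass potential U = -G M / r, chosen so that g = -grad U."""
    return -G * EARTH_MASS / math.sqrt(x * x + y * y + z * z)


def gravity(x, y, z, h=1.0):
    """Approximate g = -grad U with central differences of step h (metres)."""
    gx = -(potential(x + h, y, z) - potential(x - h, y, z)) / (2.0 * h)
    gy = -(potential(x, y + h, z) - potential(x, y - h, z)) / (2.0 * h)
    gz = -(potential(x, y, z + h) - potential(x, y, z - h)) / (2.0 * h)
    return gx, gy, gz


# Field at a point on the surface, directly "above" the centre along the z axis.
g_surface = gravity(0.0, 0.0, EARTH_RADIUS)
```

At this surface point the z component is negative, i.e., the field points toward the Earth's center, and its magnitude is close to the average value quoted in the text.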

1.3 Ambiguity and Principle of Equivalence
The interpretation of gravity and magnetic fields and their respective anomalies is not unique, and boundary conditions are always required, mainly from other geophysical disciplines, geology, and/or petrology. Modeling easily shows that, for a given anomaly and given density or susceptibility contrasts, a wide range of interpretations is possible, at various depths and based upon different geometrical shapes (e.g., Torge 1989; Fedi and Rapolla 1997; Boschetti et al. 1999). Nor does interpretation by field gradients allow a unique interpretation or a distinction between deep and shallow anomalies, as has been claimed: it has been shown that the ambiguity cannot be escaped by using second-derivative quantities (gradients) or curvature and that, in fact, the gravity and magnetic fields and their derivatives are related by a corollary of Green’s theorem. This theorem provides an analytical proof of ambiguity (e.g., Skeels 1947).
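A simple instance of this ambiguity can be sketched with two buried spheres of equal mass. Outside a homogeneous sphere the attraction equals that of a point mass at its center, so a small dense sphere and a large light sphere at the same depth produce the same profile. The function name `sphere_gz` and all numerical values are hypothetical illustrations.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def sphere_gz(x, depth, radius, density_contrast):
    """Vertical attraction (m/s^2) at horizontal offset x above a buried sphere.

    Outside the sphere the field equals that of a point mass at its centre.
    """
    mass = 4.0 / 3.0 * math.pi * radius ** 3 * density_contrast
    return G * mass * depth / (x * x + depth * depth) ** 1.5


# Profile from -10 km to +10 km in 500 m steps across the sphere centre.
profile = [-10000.0 + 500.0 * k for k in range(41)]
# Radius 1 km with 1000 kg/m^3 versus radius 2 km with 125 kg/m^3: equal mass.
small_dense = [sphere_gz(x, 5000.0, 1000.0, 1000.0) for x in profile]
large_light = [sphere_gz(x, 5000.0, 2000.0, 125.0) for x in profile]
```

The two profiles agree to rounding error, so the anomaly alone cannot separate radius from density contrast; only independent constraints can.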


Fig. 1 Gravitational potential (U), vertical gravity component (Vz), and derivatives of gravity (Vzz, Vxz, Vzzz) over a homogeneous sphere at a depth of 5 km, with a radius of 1 km, a density contrast of 1 t/m^3, and equivalent masses between the surface and the maximum depth; not to scale (Götze 2011 after Torge 1989)

Fig. 2 A density body T. P(x, y, z) is the point where gravity is measured, dτ(ξ, η, ζ) is the volume element, and r is the distance between P and dτ

1.4 The Inversion Problem
The ambiguity of the inverse problem of potential fields is related to the computation of density functions from observed gravity and/or magnetic fields. For example, density functions yield information on the location, shape, and densities of the causative masses. The computation leads to an integral equation for which no unique solution exists. For a gravity anomaly generated by a particular mass distribution, an infinite number of equivalent disturbing masses can be constructed at depths above the maximum depth of the disturbing body (Fig. 1). However, the total mass of the disturbing body expressed by the surface integral of the gravity/magnetic anomaly, the lateral position of its mass center, and the corresponding coating on the surface of the earth can be determined uniquely. If certain conditions are met, limiting values of the maximal depth, density (difference), thickness, and lateral extension of the disturbing body can be found. The necessary additional independent information (constraints) is taken from other geophysical methods and geological and/or petrological investigations (e.g., Götze 2011); refer also to Chapter 3. For the gravity potential U, one can write (Fig. 2)

$$U(x,y,z) = f \iiint_T \rho\,\frac{d\tau}{r} = f \iiint_T \rho(\xi,\eta,\zeta)\,\frac{1}{\left[(\xi-x)^2+(\eta-y)^2+(\zeta-z)^2\right]^{1/2}}\,d\xi\,d\eta\,d\zeta$$

and the expression for the vertical gravity component $U_z = g(z)$ is

$$U_z(x,y,z) = \frac{\partial U}{\partial z} = f \iiint_T \rho(\xi,\eta,\zeta)\,\frac{(\zeta-z)}{\left[(\xi-x)^2+(\eta-y)^2+(\zeta-z)^2\right]^{3/2}}\,d\xi\,d\eta\,d\zeta$$

Integration leads to a final formulation which can be used in forward modeling. However, we will write

$$U_z(x,y,z) = f \iiint_T \rho(\xi,\eta,\zeta)\,\Psi\left[(\xi-x),(\eta-y),(\zeta-z)\right]\,d\xi\,d\eta\,d\zeta$$

This integral can be considered a convolution integral. A similar formulation is possible for the magnetic method; however, the derivatives of the mathematical functions are of higher order than those of the gravitational potential and its derivatives. There are two different ways to read the last integral:
(I) For the solution of the “forward” modeling problem, we use it for integration:

$$U_z(x,y,z) = f \iiint_T \rho(\xi,\eta,\zeta)\,\Psi\left[(\xi-x),(\eta-y),(\zeta-z)\right]\,d\xi\,d\eta\,d\zeta$$

(II) For inversion, there are two unknown terms in this equation:

$$\iiint_T \rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta \quad\text{and}\quad \iiint_T \Psi\left[(\xi-x),(\eta-y),(\zeta-z)\right]\,d\xi\,d\eta\,d\zeta$$

The unknowns under the integral have to be multiplied, which results in an unstable and nonunique solution. In practical solutions of inversion problems, the direct problem of potential methods appears, i.e., the computation of the effects caused by particular mass and magnetic material distributions given in terms of location, shape, and density/susceptibility/remanent magnetization. In this case a unique solution is always possible by the application of the appropriate equations (among many others: Talwani et al. 1959; Götze and Lahmeyer 1988; Telford et al. 1990; Jacoby and Smilde 2009; Schmidt et al. 2011).
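The forward problem (I) can be sketched by evaluating the volume integral for the vertical component numerically and comparing it with the closed-form attraction of a sphere. This is a minimal illustration, assuming that f is the gravitational constant, that z points downward, and that a midpoint rule on cubic cells is accurate enough; the function names and the grid size n are hypothetical, and the sphere parameters follow Fig. 1.

```python
import math

G = 6.674e-11  # gravitational constant (f in the text), m^3 kg^-1 s^-2


def gz_numeric(obs, centre, radius, density, n=30):
    """Midpoint-rule sum of f * rho * (zeta - z) / r^3 * dV over a sphere.

    The sphere is approximated by the cubic cells whose centres lie inside it.
    Coordinates are in metres with z pointing downward.
    """
    x, y, z = obs
    cx, cy, cz = centre
    h = 2.0 * radius / n      # cell edge length
    cell_volume = h ** 3
    total = 0.0
    for i in range(n):
        xi = cx - radius + (i + 0.5) * h
        for j in range(n):
            eta = cy - radius + (j + 0.5) * h
            for k in range(n):
                zeta = cz - radius + (k + 0.5) * h
                d2 = (xi - cx) ** 2 + (eta - cy) ** 2 + (zeta - cz) ** 2
                if d2 <= radius ** 2:
                    r2 = (xi - x) ** 2 + (eta - y) ** 2 + (zeta - z) ** 2
                    total += density * (zeta - z) / r2 ** 1.5 * cell_volume
    return G * total


def gz_analytic(obs, centre, radius, density):
    """Closed-form vertical attraction of a homogeneous sphere (exterior point)."""
    x, y, z = obs
    cx, cy, cz = centre
    mass = 4.0 / 3.0 * math.pi * radius ** 3 * density
    r2 = (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2
    return G * mass * (cz - z) / r2 ** 1.5


# Sphere of Fig. 1: depth 5 km, radius 1 km, density contrast 1 t/m^3.
station = (0.0, 0.0, 0.0)
sphere_centre = (0.0, 0.0, 5000.0)
numeric = gz_numeric(station, sphere_centre, 1000.0, 1000.0)
analytic = gz_analytic(station, sphere_centre, 1000.0, 1000.0)
```

For these parameters the closed-form value is roughly 1.1 mGal (1 mGal = 1e-5 m/s^2), and the cell sum approaches it as n grows. Because the density is prescribed, the result is unique, in contrast to the inverse reading (II).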

2 Geoinformation Systems
2.1 How Can Geoinformation Systems Help?
“Geoinformatics or GIScience strives to understand and simulate spatiotemporal phenomena and processes. As such, GIScience is much related to human spatial cognition and the fundamental search for human concepts of space. Technically, it deals with representation, manipulation, analysis, and communication of geospatial information using computer science methods” (Sester, this issue). Following the previous chapters, one may add: It is expected that geoinformation systems will help to overcome the ambiguity of potential methods and support integrated modeling of different fields by the preparation of boundary conditions, data/information fusion, and advanced visualization at different scales. These systems should help to facilitate 3D interpretation (even 4D, see below) based on data from multiple sources across different disciplines: remote sensing, geodesy, geology, geophysics, and petrology, which provide a wide variety of data and information types. One of the crucial points is that all this information has to be in a digital format and georeferenced; however, even today, a vast number of software applications do not follow any standard for input and output parameters. This fact, in addition to differences in hardware platforms and operating systems, still hampers the free exchange of data. Additionally, the amount of data available online and/or from other database systems has long strained hardware and software in the context of integrated interpretation. All this calls for close cooperation between the geosciences (geophysics in particular) and the developers of geoinformation systems; see, for example, Breunig et al. (2000). However, close ties between geophysics and geoinformatics/geomathematics already exist. In the following chapters, some examples shall elucidate the state of the art. They are based on personal experience gained (1) in a large-scale research project which took place in the Andes over the last 30 years (Collaborative Research Center, SFB 267, Oncken et al. 2006) and (2) in a smaller-scale project of applied geoscience, the “virtual” sequestration of CO2 (Mopa Project, Bauer et al. 2012). The Collaborative Research Center 267 “Deformation Processes in the Andes” focused on interdisciplinary research in the Central and Southern Andes and involved nearly all geoscientific disciplines.
Over a period of 12 years, interdisciplinary and international task groups investigated the structure and dynamic processes that act at the convergent plate margin of Central South America, through both laboratory work and large field campaigns. The digital database of this research center has required extensive data documentation to guarantee its long-term use and avoid redundancies. A metadata information system has also been developed to facilitate queries by Internet and intranet (Oncken et al. 2006; Götze et al. 2006a). The Internet module mingles new datasets from laboratory work, field research, and remote sensing with diverse geoscientific metadata in a way that makes it more useful to both scientists and the general public (Munier et al. 2006; Munier 1997). The original website has been turned into a geo-service tool that provides data, metadata, and numerical tools for three-dimensional modeling, mapping, and visualization. The full Meta Information System is accessible over the Internet at the SFB 267 website (URL: http://www.cms.fu-berlin.de/sfb/sfb267); refer also to Ott et al. (2002).

2.2 The Search for Interoperability
Most of the big international interdisciplinary projects (e.g., Lithoprobe in Canada, AlpArray in Europe, Global Geoscience Transects of the International Lithosphere Program, etc.) were initiated to foster and stimulate interdisciplinary work. However, geoscientific problems have long been considered interdisciplinary by definition. The heterogeneity of the geodisciplines, all focusing on the same object yet using different methods, data, spatial resolutions, and scales, convincingly demonstrates this fact. Interdisciplinary interpretation should not be confused with multidisciplinary interpretation, in which different groups work independently and discuss their individual results only at the end in order to find “a joint interpretation.” In contrast, interdisciplinary interpretation continually tests the research of each working group against the results of other groups during the entire interpretation process (Fig. 3). This iterative cycle ensures the ongoing control of interpretations and the early-stage convergence of final models.

Fig. 3 Interdisciplinary, multidisciplinary, and joint interpretation schemes. The figure illustrates independent (parallel) multidisciplinary interpretation (left) versus joint (simultaneous) interdisciplinary interpretation, which substantially increases synergy (right); after Schmidt et al. (2007)

Substantial scientific progress can be achieved if the intensive coupling of different geoscientific disciplines is not restricted to the exchange of data but leads to a final integration of both data and methods. Indeed, a group of some 100 geoscientists, with the ambitious goal of understanding the structure and deformation processes anywhere in the world, definitely needs a sophisticated working environment that provides the full advantage of interoperability in an easy-to-use “plug and play” manner. In the early days, from 1996 to 2000, the development of an interoperable open GIS (IOGIS) was initiated in a joint project by colleagues of the Universities of Bonn and Berlin (Breunig 1996; Schmidt and Götze 1999; Breunig et al. 2000). Data handling and the communication between software components were established in CORBA (Common Object Request Broker Architecture), an object-oriented software architecture for distributed software (Bode et al. 2002). Driven by growing computer performance, new geophysical data acquisition (e.g., by the modern satellite missions CHAMP, GRACE, and GOCE) and processing methods have led to an increased amount of data. The conception and technical realization of an interdisciplinary geodata management, as a combination of metadata handling/catalogue and web mapping technology, will be briefly presented later. Related to the storage and retrieval of different datasets is the need for advanced visualization. 3D visualization is a useful tool in interpretation if heterogeneous datasets have to be visualized at the same time (Damm and Götze 2009; Thomsen et al. 2012).
Geophysical modelers and users of GIS software do not want to work with entities like “point,” “line,” “triangle,” or “polyhedron.” They prefer terms like “fault planes,” “geological formations,” or “increased reflectivity.” It is therefore more useful to have access to the geometry of the “XYZ-Fault” than to “lines 17 through 218.” This observation leads directly to the definition of geo-objects: a geo-object is an “existing” geoscientific object composed of a name, a geometry, and a “thematic description,” as discussed by Breunig (1996). For those involved in complex interpretation of interdisciplinary and heterogeneous geodata, GIS methods and functions based on a geo-object data structure have become important. Object-oriented database management affects most of the modeling tasks in geosciences (e.g., 3D/4D models, visualization, validation, geostatistics, among others). However, easy access to this object-oriented technology is still a bottleneck. Geologists and geophysicists work with objects which are composed of different geometries and parameters at different scales. Furthermore, these “objects” are linked to each other by complicated relations varying through time. An impressive example is the “Mohorovičić discontinuity” (Moho) and its description by a geo-object. A “MOHO” geo-object could be defined as follows (Damm and Götze 2009):

1. Lower boundary of bodies with crustal densities
2. A line (2D) or surface (3D) across which seismic P-wave velocities increase abruptly (or smoothly)
3. A surface at which topography becomes isostatically balanced
4. An interface between rigid crustal material and weaker lithospheric material
5. A boundary at which the petrological composition changes
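The geo-object idea (a name, a geometry, and a thematic description) can be illustrated with a minimal sketch. The class names, fields, and sample values below are hypothetical and chosen only for illustration; they are not taken from IOGIS, DB4GeO, or IGMAS+.

```python
from dataclasses import dataclass, field


@dataclass
class Geometry:
    """One geometric part of a geo-object (hypothetical minimal form)."""
    kind: str                                   # e.g. "polyline" (2D) or "triangle_mesh" (3D)
    vertices: list                              # list of (x, y, z) tuples
    cells: list = field(default_factory=list)   # index tuples into `vertices`


@dataclass
class GeoObject:
    """Name + geometry + thematic description, following the definition in the text."""
    name: str
    thematic: dict = field(default_factory=dict)
    parts: list = field(default_factory=list)   # composite geometry: several linked parts

    def add_part(self, geometry):
        """Link a further, geometrically independent part to this geo-object."""
        self.parts.append(geometry)

    def kinds(self):
        """Sorted list of the geometry kinds that make up this geo-object."""
        return sorted({part.kind for part in self.parts})


# Illustrative values only: a "MOHO" object with a 2D line and a 3D surface part.
moho = GeoObject(name="MOHO", thematic={"definition": "lower boundary of crustal densities"})
moho.add_part(Geometry("polyline", [(0.0, 0.0, 35.0), (100.0, 0.0, 38.0)]))
moho.add_part(Geometry("triangle_mesh",
                       [(0.0, 0.0, 35.0), (100.0, 0.0, 38.0), (0.0, 100.0, 36.0)],
                       cells=[(0, 1, 2)]))
```

The interpreter addresses the object by its name ("MOHO") while the parts keep their individual geometry, which mirrors the preference for the “XYZ-Fault” over “lines 17 through 218.”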

A tool for object definitions should provide the following functions:

1. The ability to define geo-objects with composite geometry
2. The ability to link several geometrically independent objects corresponding to the same geo-object

In advanced spatiotemporal scenarios, such as the simulation of complex geo-processes, the analysis of complex surface- and volume-based objects changing their locations and shapes in time is a central task (Breunig et al. 2012): landslides, mass movements, and ocean currents are examples that require 4D modeling based on dynamic geometric and topological database structures. Therefore, Breunig et al. (2012) introduced concepts and implementations for an effective handling of geospatial and time-dependent data in corresponding databases with a service-based geodatabase architecture. Thomsen et al. (2012) used this database architecture to model the underground of Northern Germany in the framework of a virtual CO2 sequestration project. Most of the data are surface data, either layers or unconformities/faults, represented by triangle meshes in 3D space. Balovnev et al. (2004) and Bär (2007) describe this kind of database, which stores 3D geometry data as simplicial complexes, e.g., point sets, polylines, triangle meshes, and tetrahedron meshes with associated thematic data, or as nonuniform structured grids. As an example, the database “DB4GeO” offers a set of spatial operations for retrieving and transforming geometry data. It supports the construction and modification of hierarchies defined by aggregation of cells by using the “generalized map topological model” (Lienhardt 1994; Fradin et al. 2005; Breunig et al. 2012). For data exchange, GOCAD ASCII, GML, and a proprietary XML format are accepted. Some of the abovementioned features are implemented in the in-house software package IGMAS+ (Schmidt et al. 2011; Alvers et al. 2013) for the modeling of potential fields.
IGMAS is an acronym for Interactive Gravity and Magnetic Application System. IGMAS+ is a new Java software package that replaces the older version of the IGMAS software, which has already been used for 3D density and magnetic modeling in different tectonic environments of the Earth (Döring and Götze 1999; Götze and Schmidt 2003; Woldetinsae and Götze 2005; Götze et al. 2006b, 2009, 2010; Bilgili et al. 2007; Prezzi et al. 2009).


3 3D Integrated and Constrained Potential Field Modeling It has long been recognized that the storage and interrogation of geoscientific data and information is an interdisciplinary problem. To date, only insufficient progress has been made, since the exchange of information is limited to the fields concerned. Thus, while exchange often takes place with the help of modern information technology, it still corresponds to the classical form of strongly limited, individual interpretations of the geodisciplines involved. A substantial increase in scientific value can be obtained by intensified coupling of geoscientific methods using new computer technologies, e.g., three- and four-dimensional geoscientific information systems (GIS). Here, the term “coupling” goes beyond the often one-sided exchange of data and information (e.g., Thomsen et al. 2012). A first glimpse of this “coupling” was achieved by the author and his research group in the MoPa project (MoPa stands for “modeling, parameterization, and evaluation of monitoring methods for CO2 storage in deep saline formations,” Bauer et al. 2012). One of the relevant aims of the virtual MoPa project was the investigation and verification of processes in underground structures by geophysical modeling and monitoring, on the basis of “real” data in a completely virtual computer environment. Toward this end, the subprojects investigated methods of numerical process simulation, and virtual measurements were created and evaluated like real-world data. A typical synthetic site geometry was selected which was similar to those in the Northwest German Basin, and the pressure response and chemical reactions of aquifer and host rock were simulated for this layered structure of the storage system. Model calculations of time-lapse microgravity effects were found to be promising for detecting the CO2 phase distribution. The modeled data for the MoPa project were extremely variable in space, time scales, and content.
Therefore a distributed data management was required, which allowed heterogeneous and multidisciplinary data handling. Figures 4 and 5 illustrate the interplay of MoPa subprojects, the data flux of measurements, and modeled data (bold arrows) and metadata (light color). The GIS/geophysical subprojects M3 and M4 (Figs. 4 and 5) combined potential field (gravity, magnetic, and electric) and seismic data handling and modeling.

3.1 What Is a Model? A model is an object of material or immaterial nature which replaces another object on the basis of the structure, function, or behavior of the original. It is used to solve tasks whose immediate accomplishment is not possible or too expensive. A model is a substitute for a real system. One distinguishes between graphical, mathematical (symbolic), physical, and artificial representations of simplified phenomena, structures, or other aspects of the real world (e.g., Ford 2009). Modeling includes the elimination of unnecessary components and aids decision making by simulating scenarios and by explaining, controlling, and predicting events on the basis of observations and simulations. This chapter briefly focuses on density models of the Earth’s underground. The models are based on equations for 3D modeling which are fast enough to be used interactively to fit the computed fields to observations of the gravity and magnetic fields. We will concentrate on a special category of numerical models which are called “forward” models. In special cases a very useful approximation of masses in the system Earth is given by point masses, which are often described as 1D masses; an example is given below, where the calculation with variable density is demonstrated. If the computation point is far away from the attracting/magnetic masses/bodies, the 1D calculation provides very fast modeling results.
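As a minimal sketch of such a forward calculation, the vertical attraction of a single point mass follows directly from Newton's law. The function name, the coordinate convention (z positive downward), and the value of the gravitational constant are assumptions made for this illustration.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (assumed value)


def gz_point_mass(station, mass_pos, mass):
    """Vertical attraction (m/s^2, z positive down) of a point mass at a station.

    station, mass_pos: (x, y, z) in metres; mass in kg.
    """
    dx = mass_pos[0] - station[0]
    dy = mass_pos[1] - station[1]
    dz = mass_pos[2] - station[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    return G * mass * dz / r ** 3


# A mass of 1e12 kg located 1,000 m directly below the station.
gz = gz_point_mass((0.0, 0.0, 0.0), (0.0, 0.0, 1000.0), 1.0e12)
```

Because only one distance has to be evaluated per mass, the effect of many point masses is obtained by a plain sum, which is what makes this approximation fast.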


Fig. 4 Network representation and information flows in the CO2 MoPa project (Thomsen, pers. comm.). Besides the visualization of large complex models and processes by commercial graphic packages (Gocad, Petrel, and others), individual data of subprojects (M6–E2 in the above graphic), the interaction of different modeling stages, and their parameters were documented and analyzed. Prerequisites were the representation and analysis of the specific information flow among the models of individual subprojects. A complex scenario is derived thereof as a system of individual models and their relationships. Data retrieval and data transformation were the task of the “Geoinformatics” subproject M3, and data and meta-information were accessible through the MoPa web server

Fig. 5 Data (solid arrows) and metadata flow (light arrows) in the CO2 MoPa project. Laboratory data stem from the geochemistry subproject (M5), and the geometry of underground structures was provided by the geology subproject (M6). The geoinformation subproject (M3) organized the information transfer via semantic web to distribute formal and content-related descriptions of individual data and their relationships


Geological structures are called 2D if they represent lineated structures in a particular horizontal direction (e.g., rift zones, oceanic fracture zones, or dykes). To be “linear” is a rather subjective criterion; in general, potential field methods consider a geological body to be 2D when it causes closed anomaly contours that are approximately elliptical in shape, with long dimensions at least 3–5 times longer than their (short) cross section (e.g., Blakely 1995). In this case, for example, the magnetic susceptibility κ (kappa) becomes $\kappa(x, y, z) = \kappa(x, z)$. Talwani et al. (1959) replaced the cross section of a 2D geological body with a simplified polygon, calculated the attraction/magnetic field of this polygon with constant density and magnetic susceptibility at model stations, and compared the results with real measurements. Their approach stems from a paper by Hubbert (1948), and their method is probably the most widely used technique in potential field interpretation today. Subroutines for both magnetic and gravity calculations are available from commercial software products and from public-domain sources (e.g., http://pubs.usgs.gov/fs/fs-0076-95/FS076-95.html). A Java web version of the 2D Talwani algorithm is also available at http://www.gravity.uni-kiel.de/Software/Mod2D.htm.
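The polygon approach can be sketched in a few lines. The following is a minimal illustration of a line-integral formulation for the vertical attraction of a 2D polygon with constant density; it is not the code of the cited subroutines. The function name, the coordinate convention (z positive downward, body entirely below the station), and the value of the gravitational constant are assumptions.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (assumed value)


def gz_polygon_2d(station, corners, rho):
    """Vertical attraction (m/s^2, z positive down) of an infinitely long body.

    station: (x0, z0); corners: [(x, z), ...] of the cross section in metres;
    rho: constant density in kg/m^3. The body must lie below the station.
    """
    x0, z0 = station
    pts = [(x - x0, z - z0) for x, z in corners]
    total = 0.0
    area2 = 0.0  # twice the signed area, used to make the result orientation independent
    for (x1, z1), (x2, z2) in zip(pts, pts[1:] + pts[:1]):
        area2 += x1 * z2 - x2 * z1
        th1 = math.atan2(z1, x1)
        th2 = math.atan2(z2, x2)
        if z1 == z2:
            # Horizontal side: limiting value of the general term below.
            total += z1 * (th2 - th1)
            continue
        alpha = (x2 - x1) / (z2 - z1)   # side written as x = alpha * z + beta
        beta = x1 - alpha * z1
        log_ratio = 0.5 * math.log((x2 * x2 + z2 * z2) / (x1 * x1 + z1 * z1))
        total += beta / (1.0 + alpha * alpha) * (log_ratio - alpha * (th2 - th1))
    if area2 < 0.0:
        total = -total
    return 2.0 * G * rho * total


# Rectangle 2 m wide, between 1 m and 2 m depth, density 1,000 kg/m^3.
rect = [(-1.0, 1.0), (1.0, 1.0), (1.0, 2.0), (-1.0, 2.0)]
gz_rect = gz_polygon_2d((0.0, 0.0), rect, 1000.0)
```

For a very wide rectangle the result approaches the infinite-slab value 2πGρh, which is a convenient sanity check for any implementation of this kind.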

3.2 Three-Dimensional Modeling and GIS Functionality For an integrated processing and interdisciplinary interpretation of gravity and magnetic data yielding an improved geological interpretation, three-dimensional model bodies are constructed using polyhedrons of suitable geometry and physical parameters (e.g., Götze and Lahmeyer 1988; Pohánka 1988; Singh and Guptasarma 2001). For the purpose of 3D modeling of potential fields and their gradients, the in-house software IGMAS+ has been developed over approximately 20 years. It calculates the potential as well as the first and second derivatives of the gravity and magnetic fields. Polyhedrons approximate the geological domains and are characterized by rock densities, induced susceptibility, and/or remanent magnetization (Götze 1984; Götze and Lahmeyer 1988; Schmidt and Götze 1999; Schmidt et al. 2011). The gravity effect of a homogeneous polyhedron is calculated by the transformation of a volume integral into a sum of line integrals. An additional advantage of this method is that its application requires considerably less computing time than conventional methods based on the direct evaluation of volume integrals. The software is especially designed for constrained modeling, concurrent processing, and interpretation of complicated geological structures (like salt domes) in an interactive mode. The interactive modeling software package and its 3D visualization tool allow the user to change the geometry as well as the density and/or susceptibility of the elementary polyhedrons and to quickly observe results during the course of modeling. The current version of IGMAS+ provides 3D GIS functions that allow the integration of other geophysical models, information, and data from both geophysics and geology (Schmidt and Götze 1999). This procedure enables the interpreter to decide immediately if and where a modeled geologic structure must be modified in order to fit its modeled field to that of a field survey.
The analytical solution of the volume integral for the gravity and magnetic effect of a homogeneous body is based on the reduction of the volume integral to an integral over the bounding surfaces of the polyhedrons by application of the divergence theorem of potential theory. The hull of each polyhedron is approximated by a set of plane triangles. Homogeneous density is assumed and only attraction is calculated (no centrifugal force):

$$R = |\vec r\,' - \vec r|$$

$$U = f \rho \iiint_V \frac{1}{|\vec r\,' - \vec r|}\, dV$$

with f: gravity constant, $6.67 \times 10^{-11}$ m³/(kg s²); ρ: density, constant for each polyhedron; and V: volume of the attracting body. The direction of the coordinate axes of a rectangular system is defined by a basis of unit vectors $(\vec e_x, \vec e_y, \vec e_z)$. The three components of the gravity vector with respect to this basis are given by

$$\vec g = \left( \frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z} \right)^T$$

and the three vertical second derivatives $\partial g_z/\partial x$, $\partial g_z/\partial y$, $\partial g_z/\partial z$ are of special interest in geophysics:

$$\frac{\partial g_z}{\partial x} = \frac{\partial^2 U}{\partial z\, \partial x}; \quad \frac{\partial g_z}{\partial y} = \frac{\partial^2 U}{\partial z\, \partial y}; \quad \frac{\partial g_z}{\partial z} = \frac{\partial^2 U}{\partial z^2}$$

For the application of potential field modeling, the following integrals have to be solved:

$$U = f \rho \iiint_V \frac{1}{|\vec r\,' - \vec r|}\, dV$$

$$g_x = f \rho \iiint_V \frac{\partial}{\partial x} \frac{1}{|\vec r\,' - \vec r|}\, dV$$

$$g_y = f \rho \iiint_V \frac{\partial}{\partial y} \frac{1}{|\vec r\,' - \vec r|}\, dV$$

$$g_z = f \rho \iiint_V \frac{\partial}{\partial z} \frac{1}{|\vec r\,' - \vec r|}\, dV$$

$$VG_x = f \rho \iiint_V \frac{\partial^2}{\partial z\, \partial x} \frac{1}{|\vec r\,' - \vec r|}\, dV$$
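For comparison, a direct numerical evaluation of one of these volume integrals can be sketched as follows. This is the kind of brute-force quadrature that the analytical approach is meant to avoid, not the method used in IGMAS+. The function name, the midpoint rule, the rectangular prism as test body, and the value of the gravitational constant are assumptions for this illustration; z is taken positive downward and the derivative is taken with respect to the station coordinate.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (assumed value)


def gz_prism_numeric(station, lo, hi, rho, n=20):
    """Midpoint-rule estimate of g_z = G * rho * integral of (z' - z) / R^3 dV.

    The body is the rectangular prism lo <= (x', y', z') <= hi (metres),
    rho is its constant density (kg/m^3), n the number of cells per axis.
    """
    x, y, z = station
    step = [(hi[k] - lo[k]) / n for k in range(3)]
    cell_volume = step[0] * step[1] * step[2]
    total = 0.0
    for i in range(n):
        xs = lo[0] + (i + 0.5) * step[0] - x
        for j in range(n):
            ys = lo[1] + (j + 0.5) * step[1] - y
            for k in range(n):
                zs = lo[2] + (k + 0.5) * step[2] - z
                r = math.sqrt(xs * xs + ys * ys + zs * zs)
                total += zs / r ** 3
    return G * rho * total * cell_volume


# Cube of 100 m side length centred 1,000 m below the station.
gz_cube = gz_prism_numeric((0.0, 0.0, 0.0),
                           (-50.0, -50.0, 950.0), (50.0, 50.0, 1050.0), 2670.0)
```

The cost grows with the cube of the resolution n, and the accuracy degrades when the station approaches the body, which illustrates why a closed-form reduction to surface and line integrals is attractive for interactive work.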

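This is done by transforming the volume integrals into surface integrals and later into line integrals along the edges of the polyhedron by applying the divergence theorem:

$$\iiint_V \operatorname{div} \vec u \, dV = \iint_S \vec u \cdot \vec n \, dS$$

with $\vec u$: vector function and $\vec n$: surface normal. One has to find a vector field $\vec u$ such that

$$\operatorname{div} \vec u = \frac{1}{|\vec r\,' - \vec r|} \quad \text{for the potential integral,}$$

$$\operatorname{div} \vec u = \frac{\partial}{\partial x_i} \left( \frac{1}{|\vec r\,' - \vec r|} \right) \quad \text{for the gravity integral,}$$

$$\operatorname{div} \vec u = \frac{\partial^2}{\partial z\, \partial x_i} \left( \frac{1}{|\vec r\,' - \vec r|} \right) \quad \text{for the vertical gradient.}$$

This problem does not have a unique solution. Among all possible solutions, a “simple” one holds:

$$\vec u = \tfrac{1}{2} R^{-1} (x_1, x_2, x_3)^T = \operatorname{grad} \frac{R}{2} \quad \text{with} \quad \operatorname{div} \vec u = R^{-1}$$

for the potential,

$$\vec u = R^{-1}\, \vec e_i \quad \text{with} \quad \operatorname{div} \vec u = \frac{\partial}{\partial x_i} R^{-1}$$

for the components of the gravity vector, and

$$\vec u = \frac{\partial}{\partial x_i} \frac{1}{R}\, \vec e_3 \quad \text{with} \quad \operatorname{div} \vec u = \frac{\partial^2}{\partial z\, \partial x_i} \frac{1}{R}, \quad i = 1, 2, 3 \; (x, y, z),$$

for the z-components of the gradient tensor.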
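As a short check of the first choice, the divergence can be worked out directly. The sketch below assumes that $(x_1, x_2, x_3)$ denote the components of $\vec r\,' - \vec r$, so that $R^2 = x_1^2 + x_2^2 + x_3^2$:

```latex
\begin{aligned}
\frac{\partial R}{\partial x_i} &= \frac{x_i}{R}, \qquad
\vec u = \operatorname{grad}\frac{R}{2} = \frac{1}{2R}\,(x_1, x_2, x_3)^T, \\
\operatorname{div}\vec u
  &= \frac{1}{2}\sum_{i=1}^{3}\frac{\partial}{\partial x_i}\frac{x_i}{R}
   = \frac{1}{2}\sum_{i=1}^{3}\left(\frac{1}{R} - \frac{x_i^2}{R^3}\right)
   = \frac{1}{2}\left(\frac{3}{R} - \frac{R^2}{R^3}\right)
   = \frac{1}{R}.
\end{aligned}
```

Applying the divergence theorem to this field turns the potential integral into a surface integral of $\tfrac{1}{2}\,(\vec r\,' - \vec r)\cdot\vec n / R$, and on a plane triangle the product $(\vec r\,' - \vec r)\cdot\vec n$ is constant, equal in magnitude to the distance between the station and the plane of the triangle.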

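The solution is described, e.g., by Götze and Lahmeyer (1988) and Götze and Schmidt (2003). Here “m” stands for the number of triangles and “n” for a normal vector; refer also to Fig. 6 for further explanations. For the potential,

$$U = \frac{f \rho}{2} \sum_{t=1}^{m} \mathrm{SIGN}(D)_t\, D_t \left[ \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, d_{tv}\, LN_{tv} + D_t \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, ARC_{tv} \right]$$

For the components of the gravity vector,

$$U_i = f \rho \sum_{t=1}^{m} \cos(\vec n_t, \vec e_i) \left[ \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, d_{tv}\, LN_{tv} + D_t \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, ARC_{tv} \right]$$

The z-components of the gradient tensor are included in the software package and can easily be determined:

$$U_{3j} = f \rho \sum_{t=1}^{m} \cos(\vec n_t, \vec e_3) \left[ \sum_{v=1}^{3} \cos(\vec n_{tv}, \vec e_j)\, LN_{tv} + \mathrm{SIGN}(D)_t \cos(\vec n_t, \vec e_j) \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, ARC_{tv} \right]$$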


Fig. 6 The figure illustrates the items and quantities used in the equations above: e.g., the distances RR1 and RR2 from a station to the vertices of triangle “t,” the vertical distance Dt between station and triangle, the surface normal (nt) and the normals of the polygon sides (ntv); dtv is the normal distance between the foot point of the station and the single side “v.” ANtv and BNtv are the limits of the line integrals. The interactive 3D gravity and magnetic application system was built around these equations and has been in use for more than 20 years. Initially developed on a mainframe, it was transferred to the first DOS PCs and later adapted to Linux PCs in the 1990s. The program has proven to be very fast, accurate, and easy to use once a model has been established (Schmidt et al. 2011). In the modeling interface, the gravity effect of the model can be updated quickly after geometry changes because only the changed triangles have to be recalculated. Optimized storage enables very fast inversion of densities. Changes of the model geometry are restricted to predefined parallel vertical sections (Fig. 7). This is a small restriction of the flexibility but makes geometry changes easy. No complex 3D editor is needed. The vertical sections are displayed together with the measured and calculated gravity fields (Fig. 7). The geometry is updated and the fields are recalculated immediately after each modification

For magnetic modeling, the components of the magnetic field are

$$H_j = RJ_j \sum_{t=1}^{m} \left[ \cos(\vec n_t, \vec e_j) \sum_{v=1}^{3} \cos(\vec n_{tv}, \vec e_j)\, LN_{tv} + \mathrm{SIGN}(D)_t \cos(\vec n_t, \vec e_j) \sum_{v=1}^{3} \mathrm{SIGN}(d)_{tv}\, ARC_{tv} \right]$$

with

$$LN_{tv} = \ln \frac{BN_{tv} + RR2_{tv}}{AN_{tv} + RR1_{tv}}$$

$$ARC_{tv} = \arctan \frac{D_t\, BN_{tv}}{d_{tv}\, RR2_{tv}} - \arctan \frac{D_t\, AN_{tv}}{d_{tv}\, RR1_{tv}}$$

(simplification: $\arctan A - \arctan B = \arctan\big((A - B)/(1 + AB)\big)\ [\pm \pi]$) and

m = number of triangles
RJ_j = component j of the normal magnetic field
ρ = constant density contrast
f = gravity constant

All other variables are geometrical distances and normal vectors, calculated in the original coordinate system.
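The two edge terms LN and ARC are simple enough to be evaluated and checked numerically. The sketch below assumes, from the geometry described in the caption of Fig. 6, that the distances from the station to the two end points of a side satisfy RR1² = AN² + d² + D² and RR2² = BN² + d² + D²; the function names are hypothetical.

```python
import math


def ln_term(an, bn, d, D):
    """LN = ln((BN + RR2) / (AN + RR1)) for one side of a triangle."""
    rr1 = math.sqrt(an * an + d * d + D * D)
    rr2 = math.sqrt(bn * bn + d * d + D * D)
    return math.log((bn + rr2) / (an + rr1))


def arc_term(an, bn, d, D):
    """ARC = arctan(D*BN / (d*RR2)) - arctan(D*AN / (d*RR1)); requires d != 0."""
    rr1 = math.sqrt(an * an + d * d + D * D)
    rr2 = math.sqrt(bn * bn + d * d + D * D)
    return math.atan(D * bn / (d * rr2)) - math.atan(D * an / (d * rr1))


def midpoint(func, a, b, n=20000):
    """Plain midpoint rule, used here only to cross-check the closed forms."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


an, bn, d, D = -1.0, 2.0, 1.5, 0.8
ln_value = ln_term(an, bn, d, D)
arc_value = arc_term(an, bn, d, D)
```

Under the stated assumption, LN is the integral of 1/RR along the side from AN to BN, and ARC is the integral of D·d / ((s² + d²)·RR) along the same side, which can be confirmed with the midpoint rule above.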


Fig. 7 The IGMAS+ display of a (flat) density model of the South American active continental margin (Gutknecht, pers. comm.). The figure shows how the model geometry is defined at vertical cross sections (blue and red tones). The thick black line indicates the coastline of the continent, and thin black lines mark the ±4,000 m depth/altitude. Yellow dots illustrate geographical positions of volcanoes, and black dots mark epicenters of earthquakes. The colored field superposing the model structure is the vertical stress anomaly, which was calculated on top of the downgoing oceanic plate (normal stress field)

Because of the triangular model structure, IGMAS+ can handle complex structures (multi-Z surfaces) like the overhangs of salt domes very well. The software development was directed toward scientific usage at universities with frequent, often experimental changes. The software integrates constraining data into the interactive modeling process by means of modern visualization and combination of independent data. Stress calculations and true 3D modeling of variable density/susceptibility structures have recently been included (http://www.potentialgs.com). The visual combination of 2D and 3D models (e.g., from seismic reflection or refraction surveys, seismology, electromagnetics, geology, petrology) enables a quantitative comparison and adjustment by the interpreter and results in a model comprising as much independently derived information as possible. Figure 7 demonstrates the overlay of very different data and information from rather different sources. Most of them stem from the data catalogue of the “Special Research Center 267, SFB 267,” which was already introduced in Section 2.1. This clearly demonstrates the crucial impact of information systems on constrained potential field modeling, here the modeling of a gravity field.

3.3 Variable Material Parameters Another important aspect of integrated modeling is the integration of variable physical parameters (densities, remanence, and magnetic susceptibilities) in the modeling process. Schmidt et al. (2011) solve this problem by superimposing voxels on the polyhedrons. The voxel grid is used to “modify” the constant physical properties of the polyhedrons, so that the voxels contribute only relatively small effects, which keeps the approximation error of the voxel model as small as possible. Figure 8 demonstrates the superposition of the triangulated density model with constant density values (constant colors and solid density depth profile) and a linear density depth gradient ρ = z · 0.05 t/m³ in a voxel cube; voxel densities vary from 0 to 0.075 t/m³, which results in absolute density values between 2.2 and 2.275 t/m³ (this corresponds to the dotted line in the density depth profile in Fig. 8). In general, there is no technical restriction on the voxel parameter function: it may depend on the absolute depth, on the depth below seafloor, or on the horizontal position of each voxel. It is thus well adapted to the conversion of densities from seismic velocity functions, seismic tomography, or porosity models. Appropriate conversion formulas such as those of Nafe and Drake (1957), Gardner et al. (1974), or Barton (1986) are well known in many cases, and the knowledge of porosity depth functions may provide further information to constrain the density function. The voxel density functions necessarily have to be linked to the polyhedrons of the same model structure/layer. This strategy ensures that the effective voxel cube is modified when the geometry of the polyhedrons (the geological domains) is changed. Figure 9 illustrates the voxel model overlaying the upper layer before and after the modification of the corresponding layer geometry. This combination of locally dependent density functions and the restriction of their spatial validity through triangulated geometries allows easy testing of different scenarios and learning about the solution space of the model (Schmidt et al. 2011).

Fig. 8 Superposition of constant densities and density functions (densities in t/m³) (after Schmidt et al. 2011). For more information, refer to text

Fig. 9 A self-acting modification of the voxel density functions is linked to geometry modifications of triangulated polyhedrons (modified after Schmidt et al. 2011)

The overlaying voxels and their constant densities are approximated by mass points. The implementation of the appropriate formulas is very easy and can also be applied in the case of spherical model calculations. However, it has to be ensured that distances between model stations and point masses are not too small, because mass points cause numerical artifacts in the modeled gravity/magnetic fields if stations are located too close to them. Therefore, all distances should exceed twice the horizontal voxel size (Schmidt et al. 2011), and high-resolution voxel cubes take considerable computation time. Li (2010) suggested that the effects of mass points (voxels) can be calculated in reasonable time in the wavenumber domain (fast Fourier transform, FFT). In this case each of the different components and/or gradient components is calculated in the wavenumber domain, and the effects are finally added in the spatial domain. This method results in the best performance in terms of computation time. The disadvantage was already mentioned for direct mass point calculations: proper results presume a “certain distance” between the stations and the uppermost voxel layer. As the fields depend linearly on the physical parameters (density and magnetic susceptibility), these parameters can also be easily inverted using any linear inversion procedure. In IGMAS+ a minimum mean square error algorithm is implemented (Haase 2008). This procedure is based on an algorithm which was originally introduced by Sæther (1997).

Fig. 10 The box on the left side defines where the model can be changed. By moving the handles, the model space is distorted and an updated geometry is generated (picture on the right) (modified after Alvers et al. (2013))
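The superposition of a constant polyhedron density and a depth-dependent voxel density, together with the distance criterion for mass points, can be sketched as follows. The function names are hypothetical, the depth unit is left as in the text (the gradient of 0.05 t/m³ per depth unit and a depth range of 0 to 1.5 reproduce the quoted values), and the background density of 2.2 t/m³ is taken from the example of Fig. 8.

```python
import math


def voxel_density(z, gradient=0.05):
    """Linear density-depth function of the voxel cube in t/m^3 (z in the text's depth unit)."""
    return gradient * z


def absolute_density(rho_polyhedron, z, gradient=0.05):
    """Constant polyhedron density plus the superimposed voxel density."""
    return rho_polyhedron + voxel_density(z, gradient)


def stations_far_enough(stations, voxel_centres, horizontal_voxel_size):
    """True if every station-voxel distance exceeds twice the horizontal voxel size."""
    limit = 2.0 * horizontal_voxel_size
    return all(math.dist(s, c) > limit for s in stations for c in voxel_centres)


rho_top = absolute_density(2.2, 0.0)      # top of the voxel cube
rho_bottom = absolute_density(2.2, 1.5)   # bottom of the voxel cube
ok = stations_far_enough([(0.0, 0.0, 0.0)], [(0.0, 0.0, 300.0)], 100.0)
```

Any other parameter function (depth below seafloor, horizontal position, or a velocity-density conversion) could replace `voxel_density` without changing the rest of the sketch.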

3.4 Interactive Inversion by “Distorting the Space” For the geometry inversion of 3D potential field models, Alvers (1998) suggested a new concept for the interactive update of complex 3D models: distorting the space instead of directly changing the model geometry. In the distorted space the model topology remains intact. By a simple coordinate transformation, this model can then be transformed back to the undistorted 3D space, ready for the next model update. Figure 10 shows the concept for a 2D section; in the 3D case, the handles used to distort the space are defined in a cube. The resolution of the cube can be adjusted such that model updates either are more focused or affect a larger area. This is useful for switching between a more detailed work area and more regional model updates. More regional updates in the deeper parts of the model are also useful because the user gets immediate feedback on the model changes. The orientation of the cube with the handles can be adjusted to the geology or to available seismic sections. This method is generic and can be applied to polyhedral models, voxel models, and hybrid models. For automated interpretation (inversion), the covariance matrix adaptation evolution strategy (CMA-ES; Ostermeier et al. 1994) is applied. Alvers et al. (2013) stated that the usage of this evolution strategy becomes problematic if many solutions need to be rejected because the model becomes inconsistent: the inversion process slows down or stops completely. However, if CMA-ES is combined with the “space distort” strategy instead of moving the vertices of the triangles directly, this problem is solved.
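The principle of moving vertices through a distortion of the surrounding space, rather than editing them one by one, can be illustrated for a 2D section. The sketch below uses bilinear interpolation of the displacements of four corner handles; this particular interpolation scheme, the function names, and the sample values are assumptions for illustration and are not claimed to be the scheme of Alvers (1998) or IGMAS+.

```python
def distort(point, box_min, box_max, handle_shift):
    """Shift a 2D point (x, z) by bilinear interpolation of corner-handle shifts.

    handle_shift maps a corner index (i, j), with i, j in {0, 1}, to its (dx, dz).
    Points outside the box are returned unchanged.
    """
    u = (point[0] - box_min[0]) / (box_max[0] - box_min[0])
    w = (point[1] - box_min[1]) / (box_max[1] - box_min[1])
    if not (0.0 <= u <= 1.0 and 0.0 <= w <= 1.0):
        return point
    dx = dz = 0.0
    for (i, j), (sx, sz) in handle_shift.items():
        weight = (u if i else 1.0 - u) * (w if j else 1.0 - w)
        dx += weight * sx
        dz += weight * sz
    return (point[0] + dx, point[1] + dz)


def distort_model(vertices, triangles, box_min, box_max, handle_shift):
    """Move all vertices; the triangle index list (the topology) is left untouched."""
    moved = [distort(v, box_min, box_max, handle_shift) for v in vertices]
    return moved, triangles


vertices = [(0.0, 0.0), (10.0, 5.0), (5.0, 2.5)]
triangles = [(0, 1, 2)]
shifts = {(0, 0): (0.0, 0.0), (1, 0): (0.0, 0.0), (0, 1): (0.0, 0.0), (1, 1): (2.0, -1.0)}
new_vertices, new_triangles = distort_model(vertices, triangles, (0.0, 0.0), (10.0, 5.0), shifts)
```

Since only coordinates change and the connectivity list is passed through unchanged, the model topology is preserved by construction, which is the property the text emphasizes. A coarser or finer handle grid would correspond to more regional or more focused updates.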


4 Metadata Handling The preceding sections have shown the importance of including independent information and data to overcome the problem of potential field ambiguity. Closely linked to this problem is the handling of metadata, i.e., crucial information on the data which were used as boundary conditions in the course of modeling. The meta-information database is a part of the database system because it provides valuable information on the data used in modeling. Geodata are mostly strongly heterogeneous and should be accompanied by a homogeneous layer of additional information which represents the data resources: datasets, software, documents, images, or even a piece of hardware, a concept, an external web site, and much more. Each information item consists of an XML document of limited size and predefined form, corresponding roughly to an entry in a classical library catalogue, with some additional information (Thomsen et al. 2012). The structure of all these entries follows standard patterns with some variants corresponding to the different types of resources. Often there are a specific name and a title; support for simple system navigation; classical metadata, e.g., owner, institution, location, date of creation, and update; and comments and figures encoded in HTML. The most important part consists of three types of web links:

1. A list of references to the resource that is represented by the meta-information item
2. Lists of hyperlinks referring to external, additional information in the WWW (e.g., scientific articles)
3. Project-internal links referring to other resource representations; they are also part of the meta-information database

For more technical details, refer to Thomsen et al. (2012). The knowledge of data structures and formats alone is not sufficient to make adequate use of data. In many cases a certain understanding of the work done by the partners is required. This is complicated by the use of different conceptual models, definitions, standards, and generally different terminology, and here meta-information provides substantial help. Therefore, each data file should contain references explaining the generating program and documentation of parameters and constraints (e.g., objective, system of equations and/or inequalities, initial and boundary conditions, algorithm, discretization, material parameters like porosity and density, and much more). Also very helpful are descriptions of the methods used to generate the data, references to the documentation used, and, particularly important for many applications in geosciences, information on the coordinate system, a description of the “level of detail” or “resolution,” and error information.
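A meta-information item of this kind can be sketched as a small XML document. The element and attribute names below, as well as the example.org addresses, are hypothetical and serve only to illustrate the structure with its three types of links; the actual schema is described by Thomsen et al. (2012) and is not reproduced here.

```python
import xml.etree.ElementTree as ET


def make_item(name, title, owner, resource_refs, external_links, internal_links):
    """Build one meta-information item with classical metadata and three link lists."""
    item = ET.Element("item", {"name": name})
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "owner").text = owner
    groups = (("resourceLinks", resource_refs),
              ("externalLinks", external_links),
              ("internalLinks", internal_links))
    for tag, targets in groups:
        group = ET.SubElement(item, tag)
        for target in targets:
            ET.SubElement(group, "link", {"href": target})
    return item


# Illustrative content only.
item = make_item(
    name="density_model_v1",
    title="3D density model, synthetic site",
    owner="Subproject M4",
    resource_refs=["https://example.org/data/density_model_v1.dat"],
    external_links=["https://example.org/articles/1", "https://example.org/articles/2"],
    internal_links=["item:geometry_M6"],
)
xml_text = ET.tostring(item, encoding="unicode")
```

Keeping every item in the same small, predefined form is what allows heterogeneous resources (datasets, software, documents) to be navigated and cross-linked in a uniform way.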

5 Conclusion This chapter described the way in which the interaction of potential field modeling and geoinformatics systems adds value to interdisciplinary interpretation. Data and metadata exchange in both a large-scale program (SFB 267) and a project of applied geophysics (CO2-MoPa) were demonstrated in order to illustrate the state of the art. Much has been done; however, an easy and automatic exchange of geodata (geo-objects) between processing, modeling, and interpretation software is still a vision of the future. As the development of suitable data models and algorithms is an ongoing process, one should be prepared to design these models to be as general as possible. This would enable the extension of their use to other problems. The required technology mostly exists, and it is up to geologists and geophysicists to define their requirements for the modeling and GIS software. An important prerequisite for interdisciplinary modeling processes is the development of optimal and general interfaces. This ambitious task can be achieved only by intensive and permanent discussions between computer scientists and geoscientists. The times in which different groups worked in isolation using only a single method are definitely past. Modeling with the aid of object-oriented GIS will ensure constant feedback among scientists from various disciplines and across different methods.

References
Alvers M (1998) Zur Anwendung von Optimierungsstrategien auf Potentialfeldmodelle. Dissertation Fachbereich Geowissenschaften, FU Berlin, Berliner Geowissenschaftliche Abhandlungen, Reihe B 28:1–108
Alvers MR, Götze HJ, Lahmeyer B, Plonka C, Schmidt S (2013) Advances in 3D potential field modeling. Extended abstracts EAGE 2013, London
Balovnev O, Bode T, Breunig M, Cremers AB, Möller G, Pogodaev G, Shumilov S, Siebeck J, Siehl A, Thomsen A (2004) The story of the GeoToolKit – an object-oriented geodata base kernel system. GeoInformatica 8(1):5–47
Barton PJ (1986) The relationship between seismic velocity and density in the continental crust – a useful constraint? Geophys J R Astron Soc 87:195–208
Bär W (2007) Management of geoscientific 3D data in mobile database management systems. PhD thesis, University of Osnabrück, 184p
Bauer S, Class H, Ebert M, Feeser V, Götze HJ, Holzheid A, Kolditz O, Rosenbaum S, Rabbel W, Schäfer W, Dahmke A (2012) Modeling, parameterization and evaluation of monitoring methods for CO2 storage in deep saline formations: the CO2-MoPa project. Environ Earth Sci 67(2):351–367
Bilgili F, Götze HJ, Pašteka R, Schmidt S, Hackney R (2007) Intrusion versus inversion – a 3D density model of the southern rim of the Northwest German Basin. Int J Earth Sci 98(3):1–13
Blakely R (1995) Potential theory in gravity and magnetic applications. Cambridge University Press, Cambridge
Bode T, Radetzki U, Shumilov S, Cremers AB (2002) COBIDS: a component-based framework for the integration of geo-applications in a distributed spatial data infrastructure. In: Annual conference of the international association for mathematical geology, Berlin
Boschetti F, Horowitz FG, Hornby P (1999) Ambiguity analysis and the constrained inversion of potential fields. Australian Geodynamics Cooperative Research Centre
Breunig M (1996) Integration of spatial information for geo-information systems. Lecture notes in earth sciences, vol 61. Springer, Berlin/Heidelberg/New York
Breunig M, Cremers AB, Götze HJ, Seidemann R, Schmidt S, Shumilov S, Siehl A (2000) Geologic mapping based on 3D models using an interoperable GIS. GIS J Spat Inf Decis Mak 13(2):12–18
Breunig M, Butwilowski E, Golovko D, Kuper PV, Menninghaus M, Thomsen A (2012) Advancing DB4GeO. 3DGeoinfo 2012, Ontario, 16p
Damm T, Götze HJ (2009) Modern geodata management: application of interdisciplinary interpretation and visualization in Central America. Int J Geophy 13pp. hindawi.com

Page 19 of 21

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_52-2 © Springer-Verlag Berlin Heidelberg 2014

Döring J, Götze HJ (1999) The isostatic state of the southern Urals crust. Geologische Rundschau 87(4):500–510 Fedi M, Rapolla A (1997) Space-frequency analysis and reduction of poptential field ambiguity. Annali di Geofisica 5:1189–1200 Fleisch D (2011) A student’s guide to Maxwell’s equations. Cambridge University Press, Cambridge Ford A (2009) Modeling the environment. Island Press, Washington, DC Fradin D, Meneveaux D, Lienhardt P (2005) Hierarchy of generalized maps for modeling and rendering complex indoor scenes. Technical report, Signal Image Communication laboratory, CNRS, University of Poitiers, 19p Gardner GHF, Gardner LW, Gregory AR (1974) Formation velocity and density – the diagnostic basics for stratigraphic traps. Geophysics 39:770–780 Götze HJ (1984) Über den Einsatz interaktiver Computergraphik im Rahmen 3-dimensionaler Interpretationstechniken in Gravimetrie und Magnetik. Habilitation thersis, (in German) Technical University Clausthal-Zellerfeld Götze, HJ, Lahmeyer B (1988) Application of three-dimensional interactive modeling in gravity and magnetics. Geophysics 53(8):1096–1108 Götze HJ, Schmidt S (2003) Geophysical 3D-modelling using GIS-functions. In: Proceedings IAMG-2002, Berlin, pp 14–28 Götze HJ, Alten M, Burger H, Goni P, Melnick D, Mohr S, Munier K, Ott N, Reutter K, Schmidt S (2006a) Data management of the SFB 267 for the Andes – from ink and paper to digital databases. In: Oncken O, Chong G, Franz G, Giese P, Götze HJ, Ramos V, Strecker M, Wigger P (eds) Frontiers in earth sciences, vol 1. Springer, Heidelberg/New York Götze HJ, El-Kelani R, Schmidt S, Rybakov M, Hassouneh, M, Förster HJ, Ebbing J (2006b) Integrated 3-D density modelling and segmentation of the Dead Sea Transform (DST). Int J Earth Sci (Geologische Rundschau) 96(2):289–302 Götze HJ, Gabriel G, Giszas V, Hese F, Kirsch R, Köther N, Schmidt S (2009) The ice age paleochannel Ellerbeker Rinne an integrated 3D gravity study. 
Zeitschrift der Deutschen Gesellschaft für Geowissenschaften ZDGG 160(3):279–293 Götze HJ, Meyer U, Choi, S (2010) Helicopter gravity survey in the Dead Sea Area. EOS Trans Am Geophys Union 91(12):109–110 Götze HJ (2011) Gravity methods, principles. In: Gupta H (ed) Encyclopaedia of solid earth geophysics. Springer, Dordrecht, pp 500–504 Haase C (2008) Inversion of gravity, gravity gradient, and magnetic datawith application to subsalt imaging. Diploma thesis, Christian-Albrechts-University Kiel Hubbert MK (1948) A line-integral method of computing the gravimetric effects of twodimensional masses. Geophysics 13:215–225 Jacoby W, Smilde PL (2009) Gravity interpretation – fundamentals and application of gravity inversion and geological interpretation. Springer, Heidelberg/New York Kellog OD (1953) Foundations of potential theory. Dover, New York Li X (2010) Efficient 3D gravity and magnetic modeling. In: EGM 2010 international workshop, Capri. Expanded abstracts Lienhardt P (1994) N-dimensional generalized combinatorial maps and cellular quasi-manifolds. Int J Comput Geom Appl 4(3):275–324 MacMillan WD (1958) The theory of the potential. Dover, New York

Page 20 of 21

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_52-2 © Springer-Verlag Berlin Heidelberg 2014

Munier K (1997) Landsat-TM Satellitenbildmosaik der zentralen Anden (20˚ S–26˚ S). Photogrammetrie Fernerkundung Geoinformatik 6:391–392 Munier K, Levenhagen J, Burger H (2006) Introduction to the attached DVD. In: Oncken O, Chong G, Franz G, Giese P, Götze HJ, Ramos V, Strecker M, Wigger P (eds) Frontiers in earth sciences, vol 1. Springer, Heidelberg/New York Nafe J, Drake C L (1957) Variation with depth in shallow and deep water marine sediments of porosity, density and the velocities of compressional and shear waves. Geophysics 22:523–552 Oncken O, Chong G, Franz G, Giese P, Götze HJ, Ramos V, Strecker M, Wigger P (2006) The Andes – active subduction orogeny. Frontiers in earth sciences. Springer, Heidelberg/NewYork Ostermeier A, Gawelczyk A, Hansen N (1994) A derandomized approach to self-adaptation of evolution strategies. Evol Comput 2(4):369–380 Ott N, Götze H-J, Schmidt S, Burger H, Alten M (2002) Meta geoinformation system facilitates use of complex data for study of Central Andes. EOS 83:34 Pohánka V (1988) Optimum expression for computation of the gravity field of a homogeneous polyhedral body. Geophys Prospect 36:733–751 Prezzi C, Götze HJ, Schmidt S (2009) 3D density model of the Central Andes. Phys Earth Planet Inter 177:217–234 Ramsey AS (1940) An introduction to the theory of Newtonian attraction. Cambridge university Press, Cambridge Sæther B (1997) Improved estimation of subsurface magnetic properties using minimum meansquare error methods. PhD thesis, Norwegian University of Science and Technology Schmidt S, Götze H-J (1999) Integration of data constraintsand potential field modeling – an example from southern Lower Saxony, Germany. Phys Chem Earth A24(3):191–196 Schmidt S, Götze HJ, Fichler C, Ebbing J, Alvers MR (2007) 3D gravity, FTG and magnetic modeling: the new IGMAS+ software. 
Extended abstracts EGM 2007 international workshop, Capri Schmidt S, Plonka C, Götze HJ and Lahmeyer B (2011) Hybrid modelling of gravity, gravity gradients and magnetic fields. Geophys Prospect 12–6:1046–1051 Singh B, Guptasarma D (2001) New method for fast computation of gravity and magnetic anomalies from arbitrary polyhedra. Geophysics 66:521–526 Skeels DC (1947) Ambiguity in gravity interpretation. Geophysics 12:43. doi:10.1190/1.1437295 Talwani M, Worzel JL, Landisman M (1959) Rapid gravity computations for two-dimensional bodies with application to the Mendocino submarine fracture zone. J Geophys Res 64(1):49–59 Thomsen A, Götze HJ, Altenbrunn K (2012) Towards information management for the synoptic interpretation of complex geoscientific models – the virtual CO2 storage project “CO2 -MoPa” as an example. Zeitschrift Geol Wiss 40(6):393–415 Telford WM, Geldart LP, Sheriff RE (1990) Applied geophysics. Cambridge University Press, Cambridge Torge W (1989) Gravimetry. de Gruyter Berlin, New York Woldetinsae G, Götze HJ (2005) Gravity field and isostatic state of Ethiopia and adjacent areas. J Afr Earth Sci 41(1–2):103–117

Page 21 of 21

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_53-2 © Springer-Verlag Berlin Heidelberg 2015

Theory of Map Projection¹: From Riemann Manifolds to Riemann Manifolds
Erik W. Grafarend and Friedrich W. Krumm
Department of Geodesy and Geoinformatics, Stuttgart University, Stuttgart, Germany

Abstract
The theory of map projections is based here on the transformation of Riemann manifolds to Riemann manifolds. Section 2 offers some orientation based on the simultaneous diagonalization of two symmetric matrices. We distinguish simply connected regions from regions that are not simply connected and review in detail the pullback versus pushforward operations. Section 3 introduces the first multiplicative measure of deformation: the Cauchy–Green deformation tensor, its polar decomposition, and its singular value decomposition. An example is the Hammer retroazimuthal projection. A second multiplicative measure of deformation is presented in Sect. 4: stretch, or length distortion, together with the Tissot portrait. The Euler–Lagrange deformation tensor presented in Sect. 5 is the first additive measure of deformation, based on the difference of the metrics $\{ds^2, dS^2\}$. Section 6 reviews 25 different measures of deformation. Angular shear, also called angular distortion (left and right), is the second additive measure. Section 8 introduces a third multiplicative measure of deformation, called relative angular shear. In contrast, the equivalence theorem of conformal mapping in Sect. 9 is based on the Korn–Lichtenstein equations. Areal distortion in Sect. 10 offers a popular alternative based on the fourth multiplicative and additive measure of deformation, namely, the dual deformation called areomorphism. Section 11 offers an equivalence theorem of equiareal mapping. The highlight is our review of canonical criteria in Sect. 12: (i) isometry; (ii) equidistant mapping of submanifolds; (iii) in particular, canonical conformism, areomorphism, isometry, and equidistance; and finally, (iv) optimal map projections. Section 13 closes with an exercise: the Armadillo double projection.

1 Introduction
Our introduction to map projections is divided into a chapter of foundational type, namely, the mapping from Riemann manifold to Riemann manifold, and two chapters of applications:
• “UTM”: optimal ellipsoidal Universal Transverse Mercator Projection, Gauss–Krüger map (Earth, planets, Moon, Sun)
• “UM”, “UTPC”: optimal ellipsoidal Universal Mercator Projection, polycylindric, Grafarend–Syffus 1997 (Indonesia)

¹ Map projections represent classical topics in (geo)mathematical research areas. Hence, the necessity of such a contribution to a reference work on Geomathematics comes naturally. Therefore the editors and the publisher have decided to include this chapter here despite the fact that its main content has been extracted from Grafarend & Krumm, Map Projections – Cartographic Information Systems, Springer, 2007.

Fig. 1 The classical scheme of map projections:
• Mappings of the sphere or the ellipsoid to the tangential plane
• Mappings of the sphere or the ellipsoid to the cylinder
• Mappings of the sphere or the ellipsoid to the cone
• Conformal mappings
• Equidistant mappings
• Equal area mappings
• Pseudo-mappings of types azimuthal, cylindric, or conic
• Perspective mappings
• Geodetic mappings (initial value versus boundary value problems)
• Double projections (“sphere to ellipsoid” and “sphere to plane”)

Both projections are of the type “optimal,” a term we have to explain. Next, we present our classical scheme of map projections, which covers mappings of the sphere and the ellipsoid as well as other types of surfaces, namely, torus, hyperboloid, paraboloid, onion shape, and many others (Fig. 1). A special example is the map projection of the double helix illustrating DNA. This structure of DNA was proposed by Francis Crick and James D. Watson in 1953. It revealed how DNA is the substance of the genes, consisting of two polynucleotide strands winding around each other. Both of them received the Nobel Prize for their research. Read “Genes, Girls and Gamow” by J.D. Watson, A. A. Knopf, New York 2002. Next, we intend to follow the classical scheme of map projections. Consult the formal scheme above for a first impression.

1.1 The Standard Map Projections: Tangential, Cylindrical, Conical
We start with mapping the sphere or the ellipsoid-of-revolution to the tangential plane, in particular the polar aspect. Examples are the universal polar stereographic (UPS) projection and the meta-azimuthal mapping in the transverse as well as the oblique aspect. They range from equidistant mapping via conformal mapping to equal-area mapping and finally to normal perspective mappings. Special cases are mappings of the type “sphere to tangential plane” at maximal distance, at minimal distance, and at the equatorial plane (three cases). We treat the line-of-sight, the line-of-contact, and the minimal versus the complete atlas. The gnomonic projection, the orthographic projection, and the Lagrange projection follow. Finally, we ask the question: “What is the best projection in the class of polar and azimuthal projections of the sphere to the plane?” A special section on pseudoazimuthal mappings, namely, the Wiechel polar pseudoazimuthal mapping, and another special section on meta-azimuthal projections (stereographic, transverse Lambert, oblique UPS, and oblique Lambert) conclude the important chapter on various maps “sphere to plane.”

Next is the first topic on mapping the ellipsoid-of-revolution to the tangential plane. We treat mappings of type equidistant, conformal, and equal area and of type perspective. Here, we list the first chapter on double projections. First, we introduce the celebrated Gauss double projection. Alternatively, we introduce the authalic equal-area projection of the ellipsoid to the sphere and from the sphere to the plane.

There is a wide spectrum of mappings “sphere to cylinder,” namely, the polar aspect, the metacylindrical projections of type transverse and of type oblique, and finally the pseudocylindrical mode. Four examples of mapping the sphere to a cylinder (polar aspect, transverse aspect, oblique aspect, pseudocylindrical equal-area projections) document the power of these spherical projections. The resulting map projections are called (i) Plate Carrée (quadratische Plattkarte), (ii) Mercator projection (Gerardus Mercator 1512–1594), and (iii) equal-area Lambert projection. A special feature of the Mercator projection is its property of “mapping loxodromes (rhumb lines, lines of constant azimuth) to straight lines crossing all meridians at a constant angle.” The most popular map projection is the Universal Transverse Mercator projection (UTM) of the sphere to the cylinder. The pseudocylindrical projections (only equal-area versions exist) are widely used in the sinusoidal version (Cossin, Sanson–Flamsteed), in the elliptical version (Mollweide, very popular), in the parabolic version (Craster), and in the rectilinear version (Eckert II).

Next comes the review of mapping an ellipsoid-of-revolution to a cylinder. We start with the polar aspect of type $\{x = A\Lambda,\ y = f(\Phi)\}$, specialize to normal equidistant, normal conformal, and normal equiareal mappings, and generalize to a rotationally symmetric figure (e.g., the torus). The transverse aspect is applied to the transverse Mercator projection and the special Gauss–Krueger coordinates (UTM, GK) derived from the celebrated Korn–Lichtenstein equations subject to an integrability condition and an optimality condition for estimating the factor of conformality (dilatation factor) in a given quantity range $[-l_E, +l_E] \times [B_S, B_N] = [-3.5^\circ, +3.5^\circ] \times [80^\circ\,\mathrm{S}, 84^\circ\,\mathrm{N}]$ or $[-l_E, +l_E] \times [B_S, B_N] = [-2^\circ, +2^\circ] \times [80^\circ\,\mathrm{S}, 80^\circ\,\mathrm{N}]$, namely, $\rho = 0.999578$ or $\rho = 0.999864$. Due to its practical importance, we have added three examples for the transverse Mercator projection and for the Gauss–Krueger coordinate system of type {Easting, Northing}, adding the meridian zone number. Another special topic is the strip transformation from one meridian strip system to another one, both for Gauss–Krueger coordinates and for UTM coordinates. We conclude with two detailed examples of strip transformation (Bessel ellipsoid, World Geodetic System 84). At the end, we present the oblique aspect of type Oblique Mercator Projection (OMP) of the ellipsoid-of-revolution, also called rectified skew orthomorphic by M. Hotine. J. P. Snyder calls it “Hotine Oblique Mercator Projection (HOM).” Landsat-type data are a satellite example.

We present the maps of the sphere to the cone only in the polar aspect. As an illustration, we use the setup $\{\alpha = \Lambda \sin\Phi_0,\ r = f(\Phi)\}$ in terms of polar coordinates; $n = \sin\Phi_0$ ranges from $n = 0$ for the cylinder to $n = 1$ for the azimuthal mapping. Thus, we are left with the rule $0 < n < 1$ for conic projections. Among the wide variety of conic projections, the version that is equidistant and conformal on the circle-of-contact was already known to Ptolemy. If we want a point-like image of the North Pole, the equidistant and conformal version on the circle-of-contact is our favorite. Another equidistant and conformal version on two parallels is the de L’Isle mapping. Various versions of conformal mapping range from the equidistant mappings on the circle-of-contact to the equidistant mappings on two parallels (secant cone, J. H. Lambert). The equal-area mappings range from the case of an equidistant and conformal mapping on the circle-of-contact, via the case of an equidistant and conformal mapping on the circle-of-contact with a point-like image of the North Pole, to the case of equidistance and conformality on two parallels (secant cone, H. C. Albers). An extension is an introduction into mapping the sphere to the cone of type pseudoconic. We specialize to the Stab–Werner projection and to the Bonne projection. Both types have the shape of a heart.

The polar aspect of mapping the ellipsoid-of-revolution to the cone is the key topic of the review of the line-of-contact and the principal stretches before we enter into special cases, namely, of type equidistant on the set of parallel circles, of type conformal (variant equidistant on the circle-of-reference, variant equidistant on two parallel circles, generalized Lambert conic projection), and of type equal area (variant equidistant and conformal on the reference circle, variant pointwise mapping of the central point and equidistant and conformal on the parallel circle, variant of an equidistant and conformal mapping on two parallel circles, generalized Albers conic projection).

Geodesics and geodetic mappings, in particular the geodesic circle, the Darboux frame, and the Riemann polar and normal coordinates, are special topics. We illustrate the Lagrange and the Hamilton portrait of a geodesic, introduce the Legendre series, the corresponding Hamilton equations, the notion of initial and boundary value problems, the Riemann polar and normal coordinates, and Lie series, and specialize to the Clairaut constant and to the ellipsoid-of-revolution. Geodetic parallel coordinates refer to Soldner coordinates. Finally, we refer to Fermi coordinates. The deformation analysis of Riemann, Soldner, and Gauss–Krueger coordinates is presented.
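The three normal-aspect cylindrical maps of the sphere named in this section, and the loxodrome property of the Mercator projection, can be tried out numerically. The following Python sketch is an illustration only and is not taken from the source: the function names, the unit radius, and the midpoint integration of the loxodrome are assumptions made for this example. It uses the standard spherical formulas $x = R\Lambda,\ y = R\Phi$ (Plate Carrée), $y = R \ln\tan(\pi/4 + \Phi/2)$ (Mercator), and $y = R\sin\Phi$ (Lambert equal-area).

```python
import math

R = 1.0  # sphere radius (assumed: unit sphere)

def plate_carree(lam, phi, radius=R):
    """Normal equidistant cylindrical map: x = R*lam, y = R*phi."""
    return radius * lam, radius * phi

def mercator(lam, phi, radius=R):
    """Normal conformal cylindrical map: y = R*ln tan(pi/4 + phi/2)."""
    return radius * lam, radius * math.log(math.tan(math.pi / 4 + phi / 2))

def lambert_equal_area(lam, phi, radius=R):
    """Normal equal-area cylindrical map: y = R*sin(phi)."""
    return radius * lam, radius * math.sin(phi)

def loxodrome(lam0, phi0, azimuth, length, steps=2000, radius=R):
    """Trace a line of constant azimuth on the sphere by integrating
    d(phi)/ds = cos(az)/R and d(lam)/ds = sin(az)/(R*cos(phi))
    with the midpoint rule (hypothetical helper for this sketch)."""
    ds = length / steps
    lam, phi = lam0, phi0
    points = [(lam, phi)]
    for _ in range(steps):
        phi_mid = phi + 0.5 * ds * math.cos(azimuth) / radius
        lam += ds * math.sin(azimuth) / (radius * math.cos(phi_mid))
        phi += ds * math.cos(azimuth) / radius
        points.append((lam, phi))
    return points

def mercator_slopes(points):
    """Slopes dy/dx between consecutive Mercator images of the points."""
    xy = [mercator(lam, phi) for lam, phi in points]
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(xy, xy[1:])]

# Loxodrome with azimuth 60 degrees starting at latitude 0.1 rad.
points = loxodrome(0.0, 0.1, math.radians(60.0), 1.0)
slopes = mercator_slopes(points)
```

For this azimuth, the slopes stay close to $\cot 60^\circ$ along the whole curve, which is the straight-line property quoted above; the same points mapped with `plate_carree` or `lambert_equal_area` do not share it.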

1.2 Where Can We Find More Information about Map Projections?
The major sources of our contribution are two books: “Map Projections: Cartographic Information Systems” by E. Grafarend and F. Krumm (2006, 713 pages, 1387 references) and its second edition by E. Grafarend, R.J. You, and R. Syffus (2013, 800 pages, 1500 references), both published by Springer Verlag, Berlin-Heidelberg-New York.

1.3 Why Are We Treating the Legal Map Projections on the Ellipsoid-of-Revolution or on the International Reference Ellipsoid 2000?
Research on the ellipsoid-of-revolution dates back to the famous project of the French Academy of Science in the eighteenth century. First, along the River Tornio, which flows north-south and nowadays forms the borderline between Sweden and Finland, the Frenchman P.L. Maupertuis and the Swede A. Celsius observed a meridian arc during 1736–1737 in the territory of Lapland. Second, at about the same time, the Frenchmen J. Godin, P. Bouguer, and C.M. de la Condamine measured an equatorial arc in Peru during 1736–1744. More details about the historical measurement campaign can be found in E. Tobe (1986), “Fransysk visit i Torne dalen 1736–1737,” Lulea T. Tryck, as well as R. Whitaker (2004), “The Mapmaker’s Wife: A True Tale of Love, Murder, and Survival in the Amazon,” Basic Books/Perseus Book Group, New York.


The result supported what I. Newton had already proved (Principia, Book III, Propositions XVIII–XX): the rotating Earth assumes an equilibrium figure of type oblate ellipsoid, namely, an ellipsoid-of-revolution. Totally wrong was the argument of those generations of the Cassini family who thought that the Earth is flattened at the Equator. Voltaire, at that time Maupertuis’s friend, congratulated him warmly for having “aplati les pôles et les Cassini.” The Earth’s actual ellipticity 1/294 is substantially smaller than Newton’s predicted value 1/230, a result caused by the inhomogeneity of the Earth. Next, in 1742, Maclaurin presented a generalization of Newton’s result when the ellipticity is caused by the Earth’s complex rotation. He compared the gravity at the Equator and at the poles and presented the first formula relating the square of the rotational speed to the eccentricity, nowadays called “Maclaurin’s formula.” This first result opened the door for a careful study of ellipsoidal figures of equilibrium. More than ten thousand scientific contributions have been published since. The climax was the Nobel Prize for S. Chandrasekhar, who calculated in 1965–1967 the post-Newtonian effects in general relativity. He also wrote the best textbook, “Ellipsoidal Figures of Equilibrium,” Yale University Press, New Haven 1969.

1.4 Why Do We Need the Result that the Lusternik–Schnirelmann Category of $S^2$ or $E^2$ Equals 2: $\mathrm{CAT}(S^2) = \mathrm{CAT}(E^2) = 2$?
One basic result of category theory is that a minimal atlas of the sphere or the ellipsoid consists of two charts. For instance, try to compute the tangent vectors at the North Pole or at the South Pole! You will be surprised that the tangent vector is zero. But that result cannot be realistic! This is the reason that we use open domains for the spherical or ellipsoidal coordinates: $-\pi/2 < \varphi < +\pi/2$, if we denote the latitude by the letter $\varphi$. $\mathrm{CAT}(S^2) = 2$ or $\mathrm{CAT}(E^2) = 2$ can be interpreted as follows: for a singularity-free definition of permissible coordinates of a sphere or of an ellipsoid, we need two charts, for instance, the sets $\{\lambda, \varphi\}$ and $\{\alpha, \beta\}$, called {longitude, latitude} and {metalongitude, metalatitude}, for open domains. This is the reason that we use coordinates and metacoordinates according to UTM and UM!

Finally, we have to emphasize that our introduction into map projections is exclusively based upon right-handed coordinates.

In particular, we got support from colleagues J. Engels (Stuttgart), F. Krumm (Stuttgart), V. Scharze (Backnang), R. Syffus (Munich), and R.J. You (Tainan/Taiwan). We thank all our readers for their interest in the Wonderful World of Map Projections. Overall, we stand on the strong shoulders of great scientists, in particular C.F. Gauss, J.L. Lagrange, B. Riemann, E. Fermi, J.H. Lambert, and J.H. Soldner.
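The pole singularity of the {longitude, latitude} chart and its removal by a second chart, as discussed in this section, can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions, not the authors' construction: the function names are hypothetical, the sphere has unit radius, and the meta-pole is placed on the equator at longitude zero (a transverse aspect), with a right-handed meta-frame.

```python
import math

def tangent_norm_longitude(phi, radius=1.0):
    """Length of the tangent vector d x / d lambda on the sphere,
    which equals radius * cos(phi) and vanishes at the poles."""
    return radius * abs(math.cos(phi))

def to_meta(lam, phi):
    """Convert {longitude, latitude} to {metalongitude, metalatitude}
    for a meta-pole on the equator at lam = 0 (assumed placement)."""
    x = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    z = math.sin(phi)
    # The old x-axis becomes the new polar axis; (y, z, x) is right-handed.
    beta = math.asin(max(-1.0, min(1.0, x)))   # metalatitude
    alpha = math.atan2(z, y)                   # metalongitude
    return alpha, beta

# North Pole in the first chart: the longitude tangent vector degenerates.
north_pole_norm = tangent_norm_longitude(math.pi / 2)

# The same point in the second chart lies on the meta-equator,
# where the corresponding tangent vector has full length.
alpha_np, beta_np = to_meta(0.0, math.pi / 2)
meta_norm = tangent_norm_longitude(beta_np)
```

The second chart has its own singular points (the meta-poles on the equator), which are regular points of the first chart; together the two open charts cover the sphere.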


2 From Riemann Manifolds to Riemann Manifolds
It is vain to do with more what can be done with fewer. (Entities should not be multiplied without necessity.) William of Ockham (1285–1349)

Mappings from a left two-dimensional Riemann manifold to a right two-dimensional Riemann manifold, simultaneous diagonalization of two matrices, mappings (isoparametric, conformal, equiareal, isometric, equidistant), measures of deformation (Cauchy–Green deformation tensor, Euler–Lagrange deformation tensor, stretch, angular shear, areal distortion), decompositions (polar, singular value), equivalence theorems of conformal and equiareal mappings (conformeomorphism, areomorphism), Korn–Lichtenstein equations, optimal map projections.

There is no chance to map a curved surface (left Riemann manifold) that differs from a developable surface to a plane or to another curved surface (right Riemann manifold) without distortion or deformation. Such distortion or deformation measures are reviewed here as they have been developed in differential geometry, continuum mechanics, and mathematical cartography. The classification of various mappings from one Riemann manifold (called left) onto another Riemann manifold (called right) is conventionally based upon a comparison of the metric.

Example 1 (Classification). The terms equidistant, equiareal, conformal, geodesic, loxodromic, concircular, and harmonic represent examples of such classifications.

In terms of the geometry of surfaces, this refers to the first fundamental form, namely, the Gaussian differential invariant. In particular, in order to derive certain invariant measures of the mappings outlined in the examples above, called deformation measures, a “canonical formalism” is applied. The simultaneous diagonalization of two symmetric matrices is of focal interest here. Such a diagonalization rests on the following Theorem 1.

Theorem 1 (Simultaneous diagonalization of two symmetric matrices). If $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix and $B \in \mathbb{R}^{n \times n}$ is a symmetric positive-definite matrix such that the product $AB^{-1}$ exists, then there exists a nonsingular matrix $X$ such that both of the following matrices are diagonal, where $I_n$ is the $n$-dimensional unit matrix:

$$X^{T} A X = \mathrm{diag}(\lambda_1, \ldots, \lambda_n), \qquad X^{T} B X = I_n = \mathrm{diag}(1, \ldots, 1). \tag{1}$$
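Theorem 1 can be made concrete with a short numerical construction. The sketch below is one common way to build such a matrix $X$ (Cholesky factorization of $B$ followed by a symmetric eigendecomposition); it is an assumption of this example rather than the method of the source, and the function name and the test matrices are hypothetical.

```python
import numpy as np

def simultaneous_diagonalization(A, B):
    """Return (lam, X) with X.T @ A @ X = diag(lam) and X.T @ B @ X = I.

    A: symmetric (n, n) array; B: symmetric positive-definite (n, n) array.
    """
    L = np.linalg.cholesky(B)        # B = L @ L.T, L lower triangular
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.T          # symmetric matrix with the same lam
    C = 0.5 * (C + C.T)              # remove round-off asymmetry
    lam, V = np.linalg.eigh(C)       # C = V @ diag(lam) @ V.T, V orthogonal
    X = L_inv.T @ V                  # nonsingular because L and V are
    return lam, X

# Illustrative 2 x 2 matrices (arbitrary values chosen for this sketch).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[4.0, 1.0], [1.0, 2.0]])
lam, X = simultaneous_diagonalization(A, B)
```

Since $V$ is orthogonal, $X^T B X = V^T L^{-1} (L L^T) L^{-T} V = I_n$ and $X^T A X = V^T C V = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$; each column $x_i$ of $X$ also satisfies $(A - \lambda_i B)\,x_i = 0$.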

According to our understanding, the theorem had been intuitively applied by C. F. Gauss when he developed his theory of curvature of parameterized surfaces (two-dimensional Riemann manifolds). Here, the second fundamental form (Hesse matrix of second derivatives, symmetric matrix $H$) had been analyzed with respect to the first fundamental form (a product of Jacobi matrices of first derivatives, a symmetric and positive-definite matrix $G$). Equivalent to the simultaneous diagonalization of a symmetric matrix $H$ and a symmetric and positive-definite matrix $G$ is the general eigenvalue problem

$$|H - \lambda G| = 0, \tag{2}$$

which corresponds to the special eigenvalue problem

$$\left| H G^{-1} - \lambda I_n \right| = 0, \tag{3}$$

where $HG^{-1}$ defines the Gaussian curvature matrix

$$K = H G^{-1}. \tag{4}$$
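As a small check of Eqs. (2)–(4), the curvature matrix can be evaluated for a sphere, where both eigenvalues have the same magnitude $1/r$. The following Python sketch is illustrative only: the finite-difference helper, the function names, and the radius $r = 2$ are assumptions of this example, and the sign of $H$ depends on the chosen orientation of the surface normal (here the outward normal).

```python
import numpy as np

def sphere(lam, phi, r=2.0):
    """Point on a sphere of radius r for longitude lam and latitude phi."""
    return r * np.array([np.cos(phi) * np.cos(lam),
                         np.cos(phi) * np.sin(lam),
                         np.sin(phi)])

def fundamental_forms(surface, u, v, h=1e-4):
    """First (G) and second (H) fundamental form by central differences."""
    xu = (surface(u + h, v) - surface(u - h, v)) / (2 * h)
    xv = (surface(u, v + h) - surface(u, v - h)) / (2 * h)
    xuu = (surface(u + h, v) - 2 * surface(u, v) + surface(u - h, v)) / h**2
    xvv = (surface(u, v + h) - 2 * surface(u, v) + surface(u, v - h)) / h**2
    xuv = (surface(u + h, v + h) - surface(u + h, v - h)
           - surface(u - h, v + h) + surface(u - h, v - h)) / (4 * h**2)
    n = np.cross(xu, xv)
    n = n / np.linalg.norm(n)                     # unit surface normal
    G = np.array([[xu @ xu, xu @ xv], [xu @ xv, xv @ xv]])
    H = np.array([[xuu @ n, xuv @ n], [xuv @ n, xvv @ n]])
    return G, H

def curvature_matrix(G, H):
    """K = H G^{-1} and the two roots of |H - lam*G| = 0."""
    K = H @ np.linalg.inv(G)
    mean = 0.5 * np.trace(K)
    gauss = np.linalg.det(K)
    root = np.sqrt(max(mean**2 - gauss, 0.0))     # clamp round-off
    return K, (mean - root, mean + root)

G, H = fundamental_forms(sphere, 0.3, 0.7)
K, (k1, k2) = curvature_matrix(G, H)
```

The roots of the characteristic polynomial of $K$ are the same numbers $\lambda$ that make $|H - \lambda G|$ vanish, because $|H - \lambda G| = |HG^{-1} - \lambda I_n|\,|G|$ and $|G| > 0$.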

In comparing two Riemann manifolds by a mapping from one (left) to the other (right), we concentrate here only on the corresponding metric, the first fundamental forms of two parameterized surfaces. A comparative analysis of the second and third fundamental forms of two parameterized surfaces related by a mapping is given elsewhere. F. Uhlig (1979) published a historical survey of the above theorem to which we refer. Generalizations to canonically factorize two symmetric matrices $A$ and $B$ which are only definite (which are needed for mappings between pseudo-Riemann manifolds) can be traced to R. W. Newcomb (1960), S. K. Mitra and C. R. Rao (1968), C. R. Rao and S. K. Mitra (1971), F. Uhlig (1973, 1976, 1979), S. R. Searle (1982, pp. 312–316), W. Shougen and Z. Shuqin (1991), M. T. Chu (1991a, b), and J.F. Cardoso and A. Souloumiac (1996). In mathematical cartography, the canonical formalism for the analysis of deformations has been introduced by N.A. Tissot (1881). Note that there exists a beautiful variational formulation of the simultaneous diagonalization of two symmetric matrices which motivates the notation of eigenvalues as Lagrange multipliers $\lambda$ and which is expressed by Corollary 1.

Corollary 1 (Variational formulation, simultaneous diagonalization of two symmetric matrices). If $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix and $B \in \mathbb{R}^{n \times n}$ is a symmetric positive-definite matrix such that the product $AB^{-1}$ exists, then there exist extremal (semi)norm solutions of the Lagrange function $\left(\mathrm{tr}\left[X^{T} A X\right]\right)^{1/2} =: \|X\|_A$, the $A$-weighted Frobenius norm of the nonsingular matrix $X$, subject to the constraint

$$\mathrm{tr}\left[X^{T} B X - I_n\right] = 0, \tag{5}$$

namely, the constraint optimization

$$\|X\|_A^{2} - \lambda\, \mathrm{tr}\left[X^{T} B X - I_n\right] = \underset{X,\,\lambda}{\mathrm{extr}}, \tag{6}$$

which is solved by the system of normal equations

$$(A - \lambda B)\, X = 0, \tag{7}$$

subject to

$$X^{T} B X = I_n. \tag{8}$$

This is known as the general eigenvalue-eigenvector problem. The Lagrange multiplier $\lambda$ is identified as eigenvalue.

Let the left and right two-dimensional Riemann manifolds $\{\mathbb{M}^2_l, G_{MN}\}$ and $\{\mathbb{M}^2_r, g_{\mu\nu}\}$ be given, with standard metrics $G_{MN} = G_{NM}$ and $g_{\mu\nu} = g_{\nu\mu}$, respectively, both symmetric and positive-definite. A subset $U_l \subset \mathbb{M}^2_l$ and $U_r \subset \mathbb{M}^2_r$, respectively, is covered by the chart $V_l \subset \mathbb{E}^2 := \{\mathbb{R}^2, \delta_{IJ}\}$ and $V_r \subset \mathbb{E}^2 := \{\mathbb{R}^2, \delta_{ij}\}$, respectively, with respect to the standard canonical metric $\delta_{IJ}$ and $\delta_{ij}$, respectively, of the left two-dimensional Euclidean space and the right two-dimensional Euclidean space. Such a chart is constituted by local coordinates $\{U, V\} \in S_\Omega \subset \mathbb{E}^2$ and $\{u, v\} \in S_\omega \subset \mathbb{E}^2$, respectively, over open sets $S_\Omega$ and $S_\omega$. Figures 2 and 3 illustrate by a commutative diagram the mappings $\Phi_l$, $\Phi_r$ and $f$, $\underline{f}$. The left mapping $\Phi_l$ maps a point from the left two-dimensional Riemann manifold (surface) to a point of the left chart, while $\Phi_r$ maps a point from the right two-dimensional Riemann manifold (surface) to a point of the right chart. In contrast, the mapping $f$ relates a point of the left two-dimensional Riemann manifold (surface) to a point of the right two-dimensional Riemann manifold (surface). Analogously, the mapping $\underline{f}$ maps a point of the left chart to a point of the right chart: $f: \mathbb{M}^2_l \to \mathbb{M}^2_r$, $\underline{f}: V_l \to V_r$, $\underline{f} = \Phi_r \circ f \circ \Phi_l^{-1}$. All mappings are assumed to be diffeomorphisms: the mapping $\{dU, dV\} \to \{du, dv\}$ is bijective. Example 2 is the simple example of an isoparametric mapping of a point on an ellipsoid-of-revolution to a point on the sphere.

Fig. 2 Commutative diagram $(f, \underline{f}, \Phi_l, \Phi_r)$; $f: \mathbb{M}^2_l \to \mathbb{M}^2_r$; $\underline{f} = \Phi_r \circ f \circ \Phi_l^{-1}$

Fig. 3 Bijective mapping of an ellipsoid-of-revolution $\mathbb{E}^2_{A_1,A_1,A_2}$ to a sphere $\mathbb{S}^2_r$; $f: \mathbb{E}^2_{A_1,A_1,A_2} \to \mathbb{S}^2_r$; $\Phi_l := [\Lambda, \Phi]$, $\Phi_r := [\lambda, \phi]$; isoparametric mapping $\underline{f} = \mathrm{id}$, namely, $\{\Lambda, \Phi\} = \{\lambda, \phi\}$

Example 2 ($\mathbb{E}^2_{A_1,A_1,A_2} \to \mathbb{S}^2_r$, isoparametric mapping). As an example of the mapping $f: \mathbb{M}^2_l \to \mathbb{M}^2_r$ and the commutative diagram $(f, \underline{f}, \Phi_l, \Phi_r)$, think of an ellipsoid-of-revolution

$$\mathbb{E}^2_{A_1,A_1,A_2} := \left\{ X \in \mathbb{R}^3 \,\middle|\, \frac{X^2 + Y^2}{A_1^2} + \frac{Z^2}{A_2^2} = 1,\ \mathbb{R}^+ \ni A_1 > A_2 \in \mathbb{R}^+ \right\} \tag{9}$$

of semimajor axis $A_1$ and semiminor axis $A_2$ as the left Riemann manifold $\mathbb{M}^2_l = \mathbb{E}^2_{A_1,A_1,A_2}$, and think of a sphere

$$\mathbb{S}^2_r := \left\{ x \in \mathbb{R}^3 \,\middle|\, x^2 + y^2 + z^2 = r^2,\ r \in \mathbb{R}^+ \right\} \tag{10}$$

of radius $r$ as the right Riemann manifold $\mathbb{M}^2_r = \mathbb{S}^2_r$, $f$ being the pointwise one-to-one mapping of $\mathbb{E}^2_{A_1,A_1,A_2}$ to $\mathbb{S}^2_r$. $f$ could be illustrated by a one-to-one transformation of {ellipsoidal longitude $\Lambda$, ellipsoidal latitude $\Phi$} onto {spherical longitude $\lambda$, spherical latitude $\phi$}. The mapping $\underline{f} = \mathrm{id}$ is called isoparametric if $\{\Lambda = \lambda, \Phi = \phi\}$ or $\{U = u, V = v\}$ in general coordinates of the left Riemann manifold and the right Riemann manifold, respectively. Accordingly, in an isoparametric mapping, {ellipsoidal longitude, ellipsoidal latitude} and {spherical longitude, spherical latitude} are identical. An isoparametric mapping of this type is illustrated by the commutative diagram of Fig. 3.

We take notice that the differential mappings, conventionally called $f^*$ and $f_*$, respectively, between the bell-shaped surface of revolution and the torus illustrated by Fig. 2 do not generate a diffeomorphism due to the different genus of the two surfaces. While Fig. 4 illustrates simply connected regions in $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively, Fig. 5 demonstrates regions that are not simply connected. Those regions are characterized by closed curves that can be laid around the inner holes and that cannot be contracted to a point within the region. The holes prevent contraction.

The mapping $f: \mathbb{M}^2_l \to \mathbb{M}^2_r$ is usually called deformation. In addition, the mappings $f^*$ (pullback) versus $f_*$ (pushforward) of the left tangent space $T\mathbb{M}^2_l$ onto the right tangent space $T\mathbb{M}^2_r$, also called pullback (right derivative map, Jacobi map $J_r$), and of the right tangent space $T\mathbb{M}^2_r$ onto the left tangent space $T\mathbb{M}^2_l$, also called pushforward (left derivative map, Jacobi map $J_l$), are of focal interest for the following discussion.
Indeed, the pullback map f coincides with the mapping of the right cotangent space  T M2r 3 fdu; dvg onto the left cotangent space  T M2l 3 fdU; dV g as well as the pushforward map f  with the mapping of the left cotangent space  T M2l 3 fdU; dV g onto the right cotangent space  T M2r 3 fdu; dvg. This is illustrated by the relations f W

T M2l ! T M2r  T M2r !  T M2l .pullback/

versus



f W



T M2l !  T M2r : T M2r ! T M2l .pushforward/

(11)

3 Cauchy–Green Deformation Tensor

A first multiplicative measure of deformation: the Cauchy–Green deformation tensor, polar decomposition, singular value decomposition, Hammer retroazimuthal projection.

There are various local multiplicative and additive measures of deformation being derived from the infinitesimal distances $dS^2$ of $\mathbb{M}^2_l$ and $ds^2$ of $\mathbb{M}^2_r$, with

$$dS^2 = G_{MN}(U^L)\, dU^M dU^N \quad\text{versus}\quad ds^2 = g_{\mu\nu}(u^\lambda)\, du^\mu du^\nu. \tag{12}$$

The mapping of type deformation, $f: \mathbb{M}^2_l \to \mathbb{M}^2_r$, is represented locally by $f^\mu$, in particular $U^M \to u^\mu$; the mapping of type inverse deformation, $f^{-1}: \mathbb{M}^2_r \to \mathbb{M}^2_l$, is represented locally by $F^M$, in particular $u^\mu \to U^M$, with $U^M \to u^\mu = f^\mu(U^M)$ and $u^\mu \to U^M = F^M(u^\mu)$.

Fig. 4 Simply connected regions in $\mathbb{R}^2$ and $\mathbb{R}^3$

Fig. 5 Not simply connected regions in $\mathbb{R}^2$ and $\mathbb{R}^3$

In the left and right tangent bundles $T\mathbb{M}^2_l \times \mathbb{M}^2_l$ and $T\mathbb{M}^2_r \times \mathbb{M}^2_r$, we represent locally the projections $\pi(T\mathbb{M}^2_l \times \mathbb{M}^2_l) = T\mathbb{M}^2_l$ and $\pi(T\mathbb{M}^2_r \times \mathbb{M}^2_r) = T\mathbb{M}^2_r$ by the pullback map and the pushforward map, in particular by

$$f^*:\ dU^M = \frac{\partial U^M}{\partial u^\mu}\, du^\mu \quad\text{versus}\quad f_*:\ du^\mu = \frac{\partial u^\mu}{\partial U^M}\, dU^M. \tag{13}$$

The conditions $|\partial U^M/\partial u^\mu| > 0$ versus $|\partial u^\mu/\partial U^M| > 0$ preserve the orientation $\partial/\partial U \wedge \partial/\partial V$ and $\partial/\partial u \wedge \partial/\partial v$, respectively, of $\mathbb{M}^2_l$ and $\mathbb{M}^2_r$, respectively. The first multiplicative measure of deformation has been introduced by A. L. Cauchy (1828) and G. Green (1839), reviewed in the sets of relations shown in Box 1, where the abbreviation Left CG indicates the left Cauchy–Green deformation tensor and the abbreviation Right CG indicates the right Cauchy–Green deformation tensor. With respect to the deformation gradients, the left and right Cauchy–Green tensors are represented in matrix algebra by

$$\mathsf{C}_l := \mathsf{J}_l^T \mathsf{G}_r \mathsf{J}_l \quad\text{versus}\quad \mathsf{C}_r := \mathsf{J}_r^T \mathsf{G}_l \mathsf{J}_r. \tag{14}$$

The set of deformation gradients is described by the two Jacobi matrices $\mathsf{J}_l$ and $\mathsf{J}_r$, which obey the matrix relations

$$\mathsf{J}_l := \left[\frac{\partial u^\mu}{\partial U^M}\right] = \mathsf{J}_r^{-1} \quad\text{versus}\quad \mathsf{J}_r := \left[\frac{\partial U^M}{\partial u^\mu}\right] = \mathsf{J}_l^{-1}. \tag{15}$$

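The abstract notation hopefully becomes more concrete when you work through Example 3, where we compute the Cauchy–Green deformation tensor for an isoparametric mapping of a point on an ellipsoid-of-revolution to a point on a sphere.

Box 1 (Left and right Cauchy–Green deformation tensor).

Left CG:

$$ds^2 = g_{\mu\nu}\{f^\lambda(U^L)\}\, \frac{\partial u^\mu}{\partial U^M} \frac{\partial u^\nu}{\partial U^N}\, dU^M dU^N = c_{MN}(U^L)\, dU^M dU^N, \qquad c_{MN}(U^L) = g_{\mu\nu}(U^L)\, \frac{\partial u^\mu}{\partial U^M}(U^L)\, \frac{\partial u^\nu}{\partial U^N}(U^L).$$

Right CG:

$$dS^2 = G_{MN}\{F^L(u^\lambda)\}\, \frac{\partial U^M}{\partial u^\mu} \frac{\partial U^N}{\partial u^\nu}\, du^\mu du^\nu = C_{\mu\nu}(u^\lambda)\, du^\mu du^\nu, \qquad C_{\mu\nu}(u^\lambda) = G_{MN}(u^\lambda)\, \frac{\partial U^M}{\partial u^\mu}(u^\lambda)\, \frac{\partial U^N}{\partial u^\nu}(u^\lambda). \tag{16}$$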

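Example 3 (Cauchy–Green deformation tensor, $f: \mathbb{E}^2_{A_1,A_1,A_2} \to \mathbb{S}^2_r$). The embedding of an ellipsoid-of-revolution $\mathbb{M}^2_l = \mathbb{E}^2_{A_1,A_1,A_2}$ and a sphere $\mathbb{M}^2_r = \mathbb{S}^2_r$ into a three-dimensional Euclidean space $\{\mathbb{R}^3, \mathsf{I}_3\}$ with respect to a standard Euclidean metric $\mathsf{I}_3$ (where $\mathsf{I}_3$ is the $3 \times 3$ unit matrix) is governed by

$$\boldsymbol{X}(\Lambda, \Phi) = \boldsymbol{E}_1 \frac{A_1 \cos\Phi \cos\Lambda}{\sqrt{1 - E^2 \sin^2\Phi}} + \boldsymbol{E}_2 \frac{A_1 \cos\Phi \sin\Lambda}{\sqrt{1 - E^2 \sin^2\Phi}} + \boldsymbol{E}_3 \frac{A_1 (1 - E^2) \sin\Phi}{\sqrt{1 - E^2 \sin^2\Phi}} = [\boldsymbol{E}_1, \boldsymbol{E}_2, \boldsymbol{E}_3]\, \frac{A_1}{\sqrt{1 - E^2 \sin^2\Phi}} \begin{bmatrix} \cos\Phi \cos\Lambda \\ \cos\Phi \sin\Lambda \\ (1 - E^2) \sin\Phi \end{bmatrix},$$

$$A_2^2 / A_1^2 = 1 - E^2, \qquad E^2 := (A_1^2 - A_2^2)/A_1^2 = 1 - A_2^2/A_1^2, \tag{17}$$

and by

$$\boldsymbol{x}(\lambda, \phi) = \boldsymbol{e}_1\, r \cos\phi \cos\lambda + \boldsymbol{e}_2\, r \cos\phi \sin\lambda + \boldsymbol{e}_3\, r \sin\phi = [\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3] \begin{bmatrix} r \cos\phi \cos\lambda \\ r \cos\phi \sin\lambda \\ r \sin\phi \end{bmatrix}, \tag{18}$$

respectively. The coordinates $(X, Y, Z)$ and $(x, y, z)$ of the placement vectors $\boldsymbol{X}(\Lambda, \Phi) \in \mathbb{E}^2_{A_1,A_1,A_2}$ and $\boldsymbol{x}(\lambda, \phi) \in \mathbb{S}^2_r$ are expressed in the left and right orthonormal fixed frames $\{\boldsymbol{E}_1, \boldsymbol{E}_2, \boldsymbol{E}_3 \mid O\}$ and $\{\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3 \mid \mathcal{O}\}$ at their origins $O$ and $\mathcal{O}$.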

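Next, we are going to construct the left tangent space $T\mathbb{M}^2_l$ as well as the right tangent space $T\mathbb{M}^2_r$, respectively. The vector field $\boldsymbol{X}(\Lambda, \Phi)$ is locally characterized by the field of tangent vectors $\{\partial\boldsymbol{X}/\partial\Lambda, \partial\boldsymbol{X}/\partial\Phi\}$, the Jacobi map with respect to the “surface normal ellipsoidal longitude $\Lambda$” and the “surface normal ellipsoidal latitude $\Phi$,” namely,

$$\left[\frac{\partial\boldsymbol{X}}{\partial\Lambda}, \frac{\partial\boldsymbol{X}}{\partial\Phi}\right] = [\boldsymbol{E}_1, \boldsymbol{E}_2, \boldsymbol{E}_3] \begin{bmatrix} X_\Lambda & X_\Phi \\ Y_\Lambda & Y_\Phi \\ Z_\Lambda & Z_\Phi \end{bmatrix} = [\boldsymbol{E}_1, \boldsymbol{E}_2, \boldsymbol{E}_3] \begin{bmatrix} -\dfrac{A_1 \cos\Phi \sin\Lambda}{\sqrt{1 - E^2\sin^2\Phi}} & -\dfrac{A_1 (1 - E^2) \sin\Phi \cos\Lambda}{(1 - E^2\sin^2\Phi)^{3/2}} \\[2ex] +\dfrac{A_1 \cos\Phi \cos\Lambda}{\sqrt{1 - E^2\sin^2\Phi}} & -\dfrac{A_1 (1 - E^2) \sin\Phi \sin\Lambda}{(1 - E^2\sin^2\Phi)^{3/2}} \\[2ex] 0 & +\dfrac{A_1 (1 - E^2) \cos\Phi}{(1 - E^2\sin^2\Phi)^{3/2}} \end{bmatrix}, \tag{19}$$

as well as the vector field $\boldsymbol{x}(\lambda, \phi)$ is locally characterized by the field of tangent vectors $\{\partial\boldsymbol{x}/\partial\lambda, \partial\boldsymbol{x}/\partial\phi\}$, the Jacobi map with respect to the “spherical longitude $\lambda$” and the “spherical latitude $\phi$,” namely,

$$\left[\frac{\partial\boldsymbol{x}}{\partial\lambda}, \frac{\partial\boldsymbol{x}}{\partial\phi}\right] = [\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3] \begin{bmatrix} x_\lambda & x_\phi \\ y_\lambda & y_\phi \\ z_\lambda & z_\phi \end{bmatrix} = [\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3] \begin{bmatrix} -r \cos\phi \sin\lambda & -r \sin\phi \cos\lambda \\ +r \cos\phi \cos\lambda & -r \sin\phi \sin\lambda \\ 0 & r \cos\phi \end{bmatrix}. \tag{20}$$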

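Next, we are going to identify the coordinates of the left metric tensor $\mathsf{G}_l$ and of the right metric tensor $\mathsf{G}_r$, in particular from the inner products

$$\left\langle \frac{\partial\boldsymbol{X}}{\partial\Lambda} \,\middle|\, \frac{\partial\boldsymbol{X}}{\partial\Lambda} \right\rangle = \frac{A_1^2 \cos^2\Phi}{1 - E^2\sin^2\Phi} =: G_{11}, \qquad \left\langle \frac{\partial\boldsymbol{x}}{\partial\lambda} \,\middle|\, \frac{\partial\boldsymbol{x}}{\partial\lambda} \right\rangle = r^2\cos^2\phi =: g_{11},$$

$$\left\langle \frac{\partial\boldsymbol{X}}{\partial\Lambda} \,\middle|\, \frac{\partial\boldsymbol{X}}{\partial\Phi} \right\rangle = \left\langle \frac{\partial\boldsymbol{X}}{\partial\Phi} \,\middle|\, \frac{\partial\boldsymbol{X}}{\partial\Lambda} \right\rangle =: G_{12} = 0, \qquad \left\langle \frac{\partial\boldsymbol{x}}{\partial\lambda} \,\middle|\, \frac{\partial\boldsymbol{x}}{\partial\phi} \right\rangle = \left\langle \frac{\partial\boldsymbol{x}}{\partial\phi} \,\middle|\, \frac{\partial\boldsymbol{x}}{\partial\lambda} \right\rangle =: g_{12} = 0,$$

$$\left\langle \frac{\partial\boldsymbol{X}}{\partial\Phi} \,\middle|\, \frac{\partial\boldsymbol{X}}{\partial\Phi} \right\rangle = \frac{A_1^2 (1 - E^2)^2}{(1 - E^2\sin^2\Phi)^3} =: G_{22}, \qquad \left\langle \frac{\partial\boldsymbol{x}}{\partial\phi} \,\middle|\, \frac{\partial\boldsymbol{x}}{\partial\phi} \right\rangle = r^2 =: g_{22},$$

$$dS^2 = \frac{A_1^2 \cos^2\Phi}{1 - E^2\sin^2\Phi}\, d\Lambda^2 + \frac{A_1^2 (1 - E^2)^2}{(1 - E^2\sin^2\Phi)^3}\, d\Phi^2, \qquad ds^2 = r^2\cos^2\phi\, d\lambda^2 + r^2\, d\phi^2. \tag{21}$$

Resorting to this identification, we obtain the left metric tensor, i.e., $\mathsf{G}_l$, and the right metric tensor, i.e., $\mathsf{G}_r$, according to

$$\mathsf{G}_l := \{G_{MN}\} = \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} = \begin{bmatrix} \dfrac{A_1^2 \cos^2\Phi}{1 - E^2\sin^2\Phi} & 0 \\ 0 & \dfrac{A_1^2 (1 - E^2)^2}{(1 - E^2\sin^2\Phi)^3} \end{bmatrix}, \qquad \mathsf{G}_r := \{g_{\mu\nu}\} = \begin{bmatrix} g_{11} & g_{12} \\ g_{12} & g_{22} \end{bmatrix} = \begin{bmatrix} r^2\cos^2\phi & 0 \\ 0 & r^2 \end{bmatrix}. \tag{22}$$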

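Finally, we implement the isoparametric mapping $f = \mathrm{id}$. Applying the summation convention over repeated indices, this is realized by

$$U^M \to u^\mu = f^\mu(U^M), \quad u^\mu = \delta^\mu_M U^M, \quad u^1 = U^1,\ u^2 = U^2, \quad \lambda = \Lambda,\ \phi = \Phi, \quad \mathsf{J}_l = \mathsf{I}_2 = \mathsf{J}_r, \quad \left|\partial U^M/\partial u^\mu\right| = 1 > 0, \quad \left|\partial u^\mu/\partial U^M\right| = 1 > 0; \tag{23}$$

$$f^*:\ dU^M = \delta^M_\mu\, du^\mu, \quad \begin{bmatrix} d\Lambda \\ d\Phi \end{bmatrix} = \begin{bmatrix} d\lambda \\ d\phi \end{bmatrix}; \qquad f_*:\ du^\mu = \delta^\mu_M\, dU^M, \quad \begin{bmatrix} d\lambda \\ d\phi \end{bmatrix} = \begin{bmatrix} d\Lambda \\ d\Phi \end{bmatrix}. \tag{24}$$

Resorting to these relations and applying again the summation convention over repeated indices, we arrive at the left and right Cauchy–Green tensors, namely,

$$c_{MN} = g_{\mu\nu} \frac{\partial u^\mu}{\partial U^M} \frac{\partial u^\nu}{\partial U^N} = g_{\mu\nu}\, \delta^\mu_M \delta^\nu_N, \qquad C_{\mu\nu} = G_{MN} \frac{\partial U^M}{\partial u^\mu} \frac{\partial U^N}{\partial u^\nu} = G_{MN}\, \delta^M_\mu \delta^N_\nu,$$

$$\mathsf{C}_l = \{c_{MN}\} = \mathsf{J}_l^T \mathsf{G}_r \mathsf{J}_l = \begin{bmatrix} r^2\cos^2\Phi & 0 \\ 0 & r^2 \end{bmatrix}, \qquad \mathsf{C}_r = \{C_{\mu\nu}\} = \mathsf{J}_r^T \mathsf{G}_l \mathsf{J}_r = \begin{bmatrix} \dfrac{A_1^2\cos^2\phi}{1 - E^2\sin^2\phi} & 0 \\ 0 & \dfrac{A_1^2 (1 - E^2)^2}{(1 - E^2\sin^2\phi)^3} \end{bmatrix},$$

$$ds^2 = r^2\cos^2\Phi\, d\Lambda^2 + r^2\, d\Phi^2, \qquad dS^2 = \frac{A_1^2\cos^2\phi}{1 - E^2\sin^2\phi}\, d\lambda^2 + \frac{A_1^2 (1 - E^2)^2}{(1 - E^2\sin^2\phi)^3}\, d\phi^2. \tag{25}$$

By means of the left Cauchy–Green tensor, we have succeeded in representing the right metric, i.e., the metric of the right manifold $\mathbb{M}^2_r$, in the coordinates of the left manifold $\mathbb{M}^2_l$; or we may say that we have pulled back $(d\lambda, d\phi) \in {}^*T_{\lambda,\phi}\mathbb{M}^2_r$ to $(d\Lambda, d\Phi) \in {}^*T_{\Lambda,\Phi}\mathbb{M}^2_l$, namely, from the right cotangent space to the left cotangent space. By means of the right Cauchy–Green tensor, we have been able to represent the left metric, i.e., the metric of the left manifold $\mathbb{M}^2_l$, in the coordinates of the right manifold $\mathbb{M}^2_r$; or we may say that we have pushed forward $(d\Lambda, d\Phi) \in {}^*T_{\Lambda,\Phi}\mathbb{M}^2_l$ to $(d\lambda, d\phi) \in {}^*T_{\lambda,\phi}\mathbb{M}^2_r$, namely, from the left cotangent space to the right cotangent space.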
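The computations of Example 3 can be retraced numerically. The following is a minimal sketch, assuming NumPy; the function names and the numerical values of $A_1$, $E^2$, $r$, and the evaluation point are illustrative choices and not part of the original text. It evaluates the closed-form metric tensors of Eq. (22), cross-checks the left metric against inner products of finite-difference tangent vectors as in Eq. (21), and forms the Cauchy–Green tensors of Eq. (14) for the isoparametric mapping.

```python
import numpy as np

def ellipsoid_point(Lam, Phi, A1, E2):
    """Placement vector X(Lambda, Phi) on the ellipsoid-of-revolution, Eq. (17)."""
    w = np.sqrt(1.0 - E2 * np.sin(Phi) ** 2)
    return (A1 / w) * np.array([
        np.cos(Phi) * np.cos(Lam),
        np.cos(Phi) * np.sin(Lam),
        (1.0 - E2) * np.sin(Phi),
    ])

def left_metric(Phi, A1, E2):
    """Closed-form left metric tensor G_l of the ellipsoid, Eq. (22)."""
    w2 = 1.0 - E2 * np.sin(Phi) ** 2
    return np.diag([A1**2 * np.cos(Phi) ** 2 / w2,
                    A1**2 * (1.0 - E2) ** 2 / w2**3])

def right_metric(phi, r):
    """Closed-form right metric tensor G_r of the sphere, Eq. (22)."""
    return np.diag([r**2 * np.cos(phi) ** 2, r**2])

def numerical_metric(point, u, v, h=1e-6):
    """Metric from central-difference tangent vectors (inner products, Eq. (21))."""
    t_u = (point(u + h, v) - point(u - h, v)) / (2.0 * h)
    t_v = (point(u, v + h) - point(u, v - h)) / (2.0 * h)
    T = np.column_stack([t_u, t_v])
    return T.T @ T

# Illustrative (assumed) parameters: semimajor axis, squared eccentricity, radius.
A1, E2, r = 6378137.0, 0.00669438, 6371000.0
Lam, Phi = np.radians(10.0), np.radians(45.0)

G_l = left_metric(Phi, A1, E2)
G_r = right_metric(Phi, r)      # isoparametric mapping: phi = Phi
J_l = J_r = np.eye(2)           # Eq. (23): both Jacobi matrices are I_2

C_l = J_l.T @ G_r @ J_l         # left Cauchy-Green tensor, Eq. (14)
C_r = J_r.T @ G_l @ J_r         # right Cauchy-Green tensor, Eq. (14)

G_l_num = numerical_metric(lambda a, b: ellipsoid_point(a, b, A1, E2), Lam, Phi)
metric_ok = np.allclose(G_l_num, G_l, rtol=1e-6, atol=1e-6 * G_l.max())
print(metric_ok)
print(C_l)
print(C_r)
```

Because the Jacobi matrices are unit matrices here, the left Cauchy–Green tensor simply reproduces the right metric and vice versa, which is what Eq. (25) states.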

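There exists an intriguing representation of the matrix of deformation gradients $\mathsf{J}$ as well as of the matrix of Cauchy–Green deformations $\mathsf{C}$, namely, the polar decomposition. It is a generalization to matrices of the familiar polar representation of a complex number $z = r \exp(i\phi)$, $(r \ge 0)$, and is defined in Corollary 2.

Corollary 2 (Polar decomposition). Let $\mathsf{J} \in \mathbb{R}^{n \times n}$. Then, there exists a unique orthonormal matrix $\mathsf{R} \in SO(n)$ (called rotation matrix) and a unique symmetric positive-definite matrix $\mathsf{S}$ (called stretch) such that (26) holds and the expressions (27) are a polar decomposition of the matrix of Cauchy–Green deformation.

$$\mathsf{J} = \mathsf{R}\mathsf{S}, \quad \mathsf{R}^*\mathsf{R} = \mathsf{I}_n, \quad \mathsf{S} = \mathsf{S}^*; \tag{26}$$

$$\mathsf{C}_l = \mathsf{J}_l^* \mathsf{G}_r \mathsf{J}_l = \mathsf{S}_l \mathsf{R}^* \mathsf{G}_r \mathsf{R} \mathsf{S}_l \quad\text{versus}\quad \mathsf{S}_r \mathsf{R}^* \mathsf{G}_l \mathsf{R} \mathsf{S}_r = \mathsf{J}_r^* \mathsf{G}_l \mathsf{J}_r = \mathsf{C}_r. \tag{27}$$

Question. “How can we compute the polar decomposition of the Jacobi matrix?” Answer: “An elegant way is the singular value decomposition, defined in Corollary 3.”

Corollary 3 (Polar decomposition by singular value decomposition). Let the matrix $\mathsf{J} \in \mathbb{R}^{2 \times 2}$ have the singular value decomposition $\mathsf{J} = \mathsf{U}\Sigma\mathsf{V}^*$, where the matrices $\mathsf{U} \in \mathbb{R}^{2 \times 2}$ and $\mathsf{V} \in \mathbb{R}^{2 \times 2}$ are orthonormal (unitary), i.e., $\mathsf{U}^*\mathsf{U} = \mathsf{I}_2$ and $\mathsf{V}^*\mathsf{V} = \mathsf{I}_2$, and where $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2)$ in descending order $\sigma_1 \ge \sigma_2 \ge 0$ is the diagonal matrix of singular values $\{\sigma_1, \sigma_2\}$. If $\mathsf{J}$ has the polar decomposition $\mathsf{J} = \mathsf{R}\mathsf{S}$, then $\mathsf{R} = \mathsf{U}\mathsf{V}^*$ and $\mathsf{S} = \mathsf{V}\Sigma\mathsf{V}^*$. Let $\lambda(\mathsf{J})$ and $\sigma(\mathsf{J})$ denote, respectively, the set of eigenvalues and the set of singular values of $\mathsf{J}$. Then, the left eigenspace is spanned by the left eigencolumns $\boldsymbol{u}_1$ and $\boldsymbol{u}_2$, which are generated by

$$(\mathsf{J}\mathsf{J}^* - \lambda_i \mathsf{I}_2)\, \boldsymbol{u}_i = (\mathsf{J}\mathsf{J}^* - \sigma_i^2 \mathsf{I}_2)\, \boldsymbol{u}_i = 0, \quad \|\boldsymbol{u}_1\| = \|\boldsymbol{u}_2\| = 1; \tag{28}$$

the right eigenspace is spanned by the right eigencolumns $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$, generated by

$$(\mathsf{J}^*\mathsf{J} - \lambda_j \mathsf{I}_2)\, \boldsymbol{v}_j = (\mathsf{J}^*\mathsf{J} - \sigma_j^2 \mathsf{I}_2)\, \boldsymbol{v}_j = 0, \quad \|\boldsymbol{v}_1\| = \|\boldsymbol{v}_2\| = 1; \tag{29}$$

the characteristic equation of the eigenvalues is determined by

$$|\mathsf{J}\mathsf{J}^* - \lambda \mathsf{I}_2| = 0 \quad\text{or}\quad |\mathsf{J}^*\mathsf{J} - \lambda \mathsf{I}_2| = 0, \tag{30}$$

which leads to $\lambda^2 - \lambda I + II = 0$, with the invariants

$$I := \mathrm{tr}[\mathsf{J}\mathsf{J}^*] = \mathrm{tr}[\mathsf{J}^*\mathsf{J}], \quad II := (\det[\mathsf{J}])^2 = \det[\mathsf{J}\mathsf{J}^*] = \det[\mathsf{J}^*\mathsf{J}], \qquad \lambda_1 = \sigma_1^2 = \tfrac{1}{2}\left(I + \sqrt{I^2 - 4\,II}\right), \quad \lambda_2 = \sigma_2^2 = \tfrac{1}{2}\left(I - \sqrt{I^2 - 4\,II}\right); \tag{31}$$

the matrices $\mathsf{S}$ and $\mathsf{R}$ can be expressed as

$$\mathsf{S} = (\mathsf{J}^*\mathsf{J})^{1/2} = (\boldsymbol{v}_1, \boldsymbol{v}_2)\, \mathrm{diag}(\sigma_1, \sigma_2)\, (\boldsymbol{v}_1, \boldsymbol{v}_2)^*, \qquad \mathsf{R} = \mathsf{J}\mathsf{S}^{-1} = (\boldsymbol{u}_1, \boldsymbol{u}_2)\, (\boldsymbol{v}_1, \boldsymbol{v}_2)^*; \tag{32}$$

$\mathsf{J}$ is normal if and only if $\mathsf{R}\mathsf{S} = \mathsf{S}\mathsf{R}$.

More details about the polar decomposition related to the singular value decomposition can be found in the classical texts by T. C. T. Ting (1985), N. J. Higham (1986), and C. Kenney and A. J. Laub (1991). Example 4 is a numerical example for singular value decomposition and polar decomposition.

Example 4 (Singular value decomposition, polar decomposition). Let there be given the Jacobi matrix $\mathsf{J}$ and the product matrices $\mathsf{J}\mathsf{J}^*$ and $\mathsf{J}^*\mathsf{J}$, such that the left and right characteristic equations of eigenvalues read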

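$$\mathsf{J} = \begin{bmatrix} 5 & 2 \\ -1 & 7 \end{bmatrix}, \quad \mathsf{J}\mathsf{J}^* = \begin{bmatrix} 29 & 9 \\ 9 & 50 \end{bmatrix}, \quad \mathsf{J}^*\mathsf{J} = \begin{bmatrix} 26 & 3 \\ 3 & 53 \end{bmatrix}, \tag{33}$$

$$|\mathsf{J}\mathsf{J}^* - \lambda \mathsf{I}_2| = \begin{vmatrix} 29 - \lambda & 9 \\ 9 & 50 - \lambda \end{vmatrix} = \lambda^2 - 79\lambda + 1369 = 0, \qquad |\mathsf{J}^*\mathsf{J} - \lambda \mathsf{I}_2| = \begin{vmatrix} 26 - \lambda & 3 \\ 3 & 53 - \lambda \end{vmatrix} = \lambda^2 - 79\lambda + 1369 = 0, \tag{34}$$

$$I := \mathrm{tr}[\mathsf{J}\mathsf{J}^*] = \mathrm{tr}[\mathsf{J}^*\mathsf{J}] = 79, \qquad II := \det[\mathsf{J}\mathsf{J}^*] = \det[\mathsf{J}^*\mathsf{J}] = 1369, \tag{35}$$

$$\lambda_1 = 53.329317, \quad \sigma_1 = \sqrt{\lambda_1} = 7.302692, \qquad \lambda_2 = 25.670683, \quad \sigma_2 = \sqrt{\lambda_2} = 5.066624. \tag{36}$$

The left eigenspace is spanned by the left eigencolumns $(\boldsymbol{u}_1, \boldsymbol{u}_2)$, the right eigenspace by the right eigencolumns $(\boldsymbol{v}_1, \boldsymbol{v}_2)$, namely,

$$(\mathsf{J}\mathsf{J}^* - \lambda_1 \mathsf{I}_2)\, \boldsymbol{u}_1 = 0, \quad (\mathsf{J}^*\mathsf{J} - \lambda_1 \mathsf{I}_2)\, \boldsymbol{v}_1 = 0, \qquad (\mathsf{J}\mathsf{J}^* - \lambda_2 \mathsf{I}_2)\, \boldsymbol{u}_2 = 0, \quad (\mathsf{J}^*\mathsf{J} - \lambda_2 \mathsf{I}_2)\, \boldsymbol{v}_2 = 0, \tag{37}$$

or

$$\begin{bmatrix} -24.329317 & 9 \\ 9 & -3.329317 \end{bmatrix} \begin{bmatrix} u_{11} \\ u_{21} \end{bmatrix} = 0, \quad \begin{bmatrix} -27.329317 & 3 \\ 3 & -0.329317 \end{bmatrix} \begin{bmatrix} v_{11} \\ v_{21} \end{bmatrix} = 0,$$

$$\begin{bmatrix} 3.329317 & 9 \\ 9 & 24.329317 \end{bmatrix} \begin{bmatrix} u_{12} \\ u_{22} \end{bmatrix} = 0, \quad \begin{bmatrix} 0.329317 & 3 \\ 3 & 27.329317 \end{bmatrix} \begin{bmatrix} v_{12} \\ v_{22} \end{bmatrix} = 0. \tag{38}$$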

Note that the matrices $\mathsf{J}\mathsf{J}^* - \lambda_i \mathsf{I}_2$ and $\mathsf{J}^*\mathsf{J} - \lambda_i \mathsf{I}_2$ have only rank one. Accordingly, in order to solve the homogeneous linear equations uniquely, we need an additional constraint. Conventionally, this problem is solved by postulating normalized eigencolumns, namely,

$$u_{11}^2 + u_{21}^2 = 1, \quad u_{12}^2 + u_{22}^2 = 1, \quad v_{11}^2 + v_{21}^2 = 1, \quad v_{12}^2 + v_{22}^2 = 1, \qquad \|\boldsymbol{u}_1\| = \|\boldsymbol{u}_2\| = 1, \quad \|\boldsymbol{v}_1\| = \|\boldsymbol{v}_2\| = 1. \tag{39}$$

The left eigencolumns, which are here denoted as $(\boldsymbol{u}_1, \boldsymbol{u}_2)$, are constructed from the following system of equations:

$$-24.329317\, u_{11} + 9\, u_{21} = 0, \quad +3.329317\, u_{12} + 9\, u_{22} = 0, \quad u_{11}^2 + u_{21}^2 = 1, \quad u_{12}^2 + u_{22}^2 = 1. \tag{40}$$

This system of equations leads to two solutions, since each normalized eigencolumn is determined only up to its sign. In the frame of the example to be considered here, we have chosen the following result:

$$u_{11} = +0.346946, \quad u_{12} = +0.937885, \quad u_{21} = +0.937885, \quad u_{22} = -0.346946. \tag{41}$$

The right eigencolumns, which are here denoted as $(\boldsymbol{v}_1, \boldsymbol{v}_2)$, are constructed from the following system of equations:

$$-27.329317\, v_{11} + 3\, v_{21} = 0, \quad +0.329317\, v_{12} + 3\, v_{22} = 0, \quad v_{11}^2 + v_{21}^2 = 1, \quad v_{12}^2 + v_{22}^2 = 1. \tag{42}$$

This system of equations again leads to two solutions. In the frame of the example to be considered here, we have chosen the following result:

$$v_{11} = +0.109117, \quad v_{12} = +0.994029, \quad v_{21} = +0.994029, \quad v_{22} = -0.109117. \tag{43}$$

In summary, the left and right eigencolumns are collected in the two following orthonormal matrices $\mathsf{U}$ and $\mathsf{V}$:

$$\mathsf{U} = \begin{bmatrix} +0.346946 & +0.937885 \\ +0.937885 & -0.346946 \end{bmatrix}, \qquad \mathsf{V} = \begin{bmatrix} +0.109117 & +0.994029 \\ +0.994029 & -0.109117 \end{bmatrix}. \tag{44}$$

The polar decomposition is now straightforward. According to the above considerations, we finally arrive at the result

$$\mathsf{R} = \mathsf{U}\mathsf{V}^*, \quad \mathsf{S} = \mathsf{V}\Sigma\mathsf{V}^*, \quad \Sigma = \mathrm{diag}(\sigma_1, \sigma_2), \tag{45}$$

$$\mathsf{R} = \begin{bmatrix} +0.970142 & +0.242536 \\ -0.242536 & +0.970142 \end{bmatrix}, \qquad \mathsf{S} = \begin{bmatrix} +5.093248 & +0.242536 \\ +0.242536 & +7.276069 \end{bmatrix}. \tag{46}$$

Note that it follows immediately from this result that $\mathsf{R}$ is an orthonormal matrix. Furthermore, note that $\mathsf{S}$ is indeed a symmetric matrix.

Before we consider a second multiplicative measure of deformation, please enjoy Fig. 6, which shows the Hammer retroazimuthal projection illustrating special mapping equations of the sphere. The ID card of this special pseudoazimuthal map projection is shown in Table 1.
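Example 4 can be reproduced with a few lines of code. The sketch below assumes NumPy and uses `numpy.linalg.svd`, which returns $\mathsf{U}$, the singular values, and $\mathsf{V}^*$; the helper name `polar_decomposition` is an illustrative choice. It follows Corollary 3, i.e., $\mathsf{R} = \mathsf{U}\mathsf{V}^*$ and $\mathsf{S} = \mathsf{V}\Sigma\mathsf{V}^*$. The individual eigencolumns returned by a library routine may differ in sign from Eqs. (41) and (43), while the products $\mathsf{R}$ and $\mathsf{S}$ are unaffected by that choice.

```python
import numpy as np

def polar_decomposition(J):
    """Polar decomposition J = R S via the singular value decomposition (Corollary 3)."""
    U, sigma, Vh = np.linalg.svd(J)     # J = U diag(sigma) Vh, sigma in descending order
    R = U @ Vh                          # orthonormal factor R = U V*
    S = Vh.T @ np.diag(sigma) @ Vh      # symmetric stretch S = V Sigma V*
    return R, S, sigma

J = np.array([[5.0, 2.0],
              [-1.0, 7.0]])             # Jacobi matrix of Example 4, Eq. (33)

R, S, sigma = polar_decomposition(J)

print(sigma)   # singular values, compare Eq. (36)
print(R)       # rotation matrix, compare Eq. (46)
print(S)       # stretch matrix, compare Eq. (46)
```

For this particular matrix the factors can also be written in closed form, $\mathsf{R} = \frac{1}{\sqrt{17}}\begin{bmatrix} 4 & 1 \\ -1 & 4 \end{bmatrix}$ and $\mathsf{S} = \frac{1}{\sqrt{17}}\begin{bmatrix} 21 & 1 \\ 1 & 30 \end{bmatrix}$, which agrees with the rounded values of Eq. (46).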

4 Stretch, or Length Distortion

A second multiplicative measure of deformation: stretch, or length distortion, Tissot portrait, simultaneous diagonalization of two matrices.

The second multiplicative measure of deformation is based upon the scale ratio, which is also called stretch, dilatation factor, or length distortion. One here distinguishes the left and right stretch:

$$\text{left stretch:}\quad \Lambda^2\, dS^2 = ds^2, \quad \frac{ds^2}{dS^2} = \Lambda^2 =: \Lambda_l^2; \qquad \text{right stretch:}\quad \lambda^2\, ds^2 = dS^2, \quad \lambda_r^2 := \lambda^2 = \frac{dS^2}{ds^2}, \tag{47}$$

Fig. 6 Special map projection of the sphere, called Hammer retroazimuthal projection, centered near St. Louis (longitude 90° W, latitude 40° N), with shorelines, 15° graticule, two hemispheres, one of which appears backward (they should be superimposed for the full map)

Table 1 ID card of Hammer retroazimuthal projection of the sphere
(i) Classification: Retroazimuthal, modified azimuthal, neither conformal nor equal-area
(ii) Graticule: Meridians: central meridian is straight; other meridians are curved. Parallels: curved. Poles of the sphere: curved lines. Symmetry: about the central meridians
(iii) Distortions: Distortions of area and shape
(iv) Other features: The direction from any point to the center of the map is the angle that a straight line connecting the two points makes with a vertical line. This feature is the basis of the term “retroazimuthal.” Scimitar-shaped boundary. Considerable overlapping when entire sphere is shown
(v) Usage: To determine the direction of a central point from a given location
(vii) Origins: Presented by E. Hammer (1858–1925) in 1910. The author is the successor of E. Hammer in the Geodesy Chair of Stuttgart University (Germany). The map projection was independently presented by E. A. Reeves (1862–1945) and A. R. Hinks (1874–1945) of England in 1929

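subject to the duality $\Lambda^2 \lambda^2 = 1$.

Question. “What is the role of stretch $\{\Lambda^2, \lambda^2\}$ in the context of the pair of (symmetric, positive-definite) matrices $\{c_{MN}, G_{MN}\}$, $\{\mathsf{C}_l, \mathsf{G}_l\}$ and $\{C_{\mu\nu}, g_{\mu\nu}\}$, $\{\mathsf{C}_r, \mathsf{G}_r\}$, respectively?” Answer: “Due to a standard lemma of matrix algebra, both matrices can be simultaneously diagonalized, one matrix being the unit matrix.”

We briefly outline the simultaneous diagonalization of the positive-definite symmetric matrices $\{\mathsf{C}_l, \mathsf{G}_l\}$ and $\{\mathsf{C}_r, \mathsf{G}_r\}$, respectively, which is based upon a transformation called “Kartenwechsel”:

$$\text{left “Kartenwechsel”}\ T:\ V_l\!\left(U_{\mathbb{M}^2_l}\right) \to \tilde{V}_l\!\left(U_{\mathbb{M}^2_l}\right) \quad\text{versus}\quad \text{right “Kartenwechsel”}\ \tau:\ V_r\!\left(U_{\mathbb{M}^2_r}\right) \to \tilde{V}_r\!\left(U_{\mathbb{M}^2_r}\right). \tag{48}$$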

Fig. 7 Commutative diagram, canonical representation of pairs of metric tensors, “Kartenwechsel” $T$ and $\tau$, canonical mapping $f_{\mathrm{can}}$ from the left chart $\tilde{V}_l$ to the right chart $\tilde{V}_r$

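The commutative diagram shown in Fig. 7 illustrates this “Kartenwechsel.” Let us pay attention to Theorem 1 and Corollary 2, and let us present the various transformations in Boxes 2–8.

Box 2 (Left versus right Cauchy–Green deformation tensor).

Left CG:

$$ds^2 = g_{\mu\nu}\{f^\lambda(U^L)\}\, \frac{\partial u^\mu}{\partial U^M} \frac{\partial u^\nu}{\partial U^N}\, dU^M dU^N = c_{MN}(U^L)\, dU^M dU^N, \qquad c_{MN}(U^L) := g_{\mu\nu}(U^L)\, \frac{\partial u^\mu}{\partial U^M}(U^L)\, \frac{\partial u^\nu}{\partial U^N}(U^L).$$

Right CG:

$$dS^2 = G_{MN}\{F^L(u^\lambda)\}\, \frac{\partial U^M}{\partial u^\mu} \frac{\partial U^N}{\partial u^\nu}\, du^\mu du^\nu = C_{\mu\nu}(u^\lambda)\, du^\mu du^\nu, \qquad C_{\mu\nu}(u^\lambda) := G_{MN}(u^\lambda)\, \frac{\partial U^M}{\partial u^\mu}(u^\lambda)\, \frac{\partial U^N}{\partial u^\nu}(u^\lambda). \tag{49}$$

Box 3 (Left Tissot circle versus left Tissot ellipse, left Cauchy–Green deformation tensor: Ricci calculus).

Left Tissot circle $\mathbb{S}^1$:

$$dS^2 = G_{MN}\, U^M_A U^N_B\, \bar{d}V^A \bar{d}V^B = \delta_{AB}\, \bar{d}V^A \bar{d}V^B = (\bar{d}V^1)^2 + (\bar{d}V^2)^2 = \Omega_1^2 + \Omega_2^2.$$

Left Tissot ellipse $\mathbb{E}^1_{\Lambda_1,\Lambda_2}$:

$$ds^2 = g_{\mu\nu}\, u^\mu_M u^\nu_N\, U^M_A U^N_B\, \bar{d}V^A \bar{d}V^B = \Lambda_1^2 (\bar{d}V^1)^2 + \Lambda_2^2 (\bar{d}V^2)^2 = \Omega_1^2/\lambda_1^2 + \Omega_2^2/\lambda_2^2. \tag{50}$$

Box 4 (Left Tissot circle versus left Tissot ellipse, left Cauchy–Green deformation tensor: Cayley calculus).

Left Tissot circle $\mathbb{S}^1$:

$$dS^2 = \Omega^T \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l\, \Omega = \Omega^T \Omega \iff \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}.$$

Left Tissot ellipse $\mathbb{E}^1_{\Lambda_1,\Lambda_2}$:

$$ds^2 = \Omega^T \mathsf{F}_l^T \mathsf{C}_l \mathsf{F}_l\, \Omega = \Omega^T \mathrm{diag}(\Lambda_1^2, \Lambda_2^2)\, \Omega \iff \mathsf{F}_l^T \mathsf{C}_l \mathsf{F}_l = \mathrm{diag}(\Lambda_1^2, \Lambda_2^2) = \mathrm{diag}(1/\lambda_1^2, 1/\lambda_2^2). \tag{51}$$

Box 5 (Right Tissot ellipse versus right Tissot circle, right Cauchy–Green deformation tensor: Ricci calculus).

Right Tissot ellipse $\mathbb{E}^1_{\lambda_1,\lambda_2}$:

$$dS^2 = G_{MN}\, U^M_\mu U^N_\nu\, u^\mu_\alpha u^\nu_\beta\, \bar{d}v^\alpha \bar{d}v^\beta = (\bar{d}v^1)^2/\Lambda_1^2 + (\bar{d}v^2)^2/\Lambda_2^2 = \lambda_1^2 \omega_1^2 + \lambda_2^2 \omega_2^2.$$

Right Tissot circle $\mathbb{S}^1$:

$$ds^2 = g_{\mu\nu}\, u^\mu_\alpha u^\nu_\beta\, \bar{d}v^\alpha \bar{d}v^\beta = \delta_{\alpha\beta}\, \bar{d}v^\alpha \bar{d}v^\beta = (\bar{d}v^1)^2 + (\bar{d}v^2)^2 = \omega_1^2 + \omega_2^2. \tag{52}$$

Box 6 (Right Tissot ellipse versus right Tissot circle, right Cauchy–Green deformation tensor: Cayley calculus).

Right Tissot ellipse $\mathbb{E}^1_{\lambda_1,\lambda_2}$:

$$dS^2 = \omega^T \mathsf{F}_r^T \mathsf{C}_r \mathsf{F}_r\, \omega = \omega^T \mathrm{diag}(\lambda_1^2, \lambda_2^2)\, \omega \iff \mathsf{F}_r^T \mathsf{C}_r \mathsf{F}_r = \mathrm{diag}(\lambda_1^2, \lambda_2^2) = \mathrm{diag}(1/\Lambda_1^2, 1/\Lambda_2^2).$$

Right Tissot circle $\mathbb{S}^1$:

$$ds^2 = \omega^T \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r\, \omega = \omega^T \omega \iff \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}. \tag{53}$$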

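Certainly, we agree that the various transformations have to be checked by “paper and pencil,” in particular by means of Examples 2 and 3. Wherever we are led to nonintegrable differentials (namely, differential forms), we have indicated this by writing “$\bar{d}V$” and “$\bar{d}v$” according to the M. Planck notation. The left and right Frobenius matrices, $\mathsf{F}_l$ and $\mathsf{F}_r$, have to be seen in this context. They are used as matrices of integrating factors, which transform “imperfect differentials” $\bar{d}V^A$ (namely, $\bar{d}V^1$, $\bar{d}V^2$, or differential forms $\Omega_1$, $\Omega_2$) or $\bar{d}v^\alpha$ (namely, $\bar{d}v^1$, $\bar{d}v^2$, or differential forms $\omega_1$, $\omega_2$) to “perfect differentials” $dU^A$ (namely, $dU^1$, $dU^2$) or $du^\alpha$ (namely, $du^1$, $du^2$). As a sample reference of the theory of differential forms and the Frobenius Integration Theorem, we direct the interested reader to H. Flanders (1970, p. 97), M. P. do Carmo (1994), and J. A. de Azcárraga and J. M. Izquierdo (1995). Indeed, we hope that the reader appreciates the triple of index notation (Ricci calculus), matrix notation (Cayley calculus), and explicit notation (Leibniz–Newton calculus).

Box 7 (Left general eigenvalue problem and right general eigenvalue problem: Ricci calculus).

Left eigenvalue problem:

$$\Lambda^2\, dS^2 = ds^2, \qquad \Lambda^2 G_{MN}\, U^M_A U^N_B\, \bar{d}V^A \bar{d}V^B = g_{\mu\nu}\, u^\mu_M u^\nu_N\, U^M_A U^N_B\, \bar{d}V^A \bar{d}V^B \iff \Lambda^2 G_{MN} U^N_B = c_{MN} U^N_B \iff (c_{MN} - \Lambda^2 G_{MN})\, U^N_B = 0,$$

subject to $G_{MN}\, U^M_A U^N_B = \delta_{AB}$.

Right eigenvalue problem:

$$\lambda^2\, ds^2 = dS^2, \qquad \lambda^2 g_{\mu\nu}\, u^\mu_\alpha u^\nu_\beta\, \bar{d}v^\alpha \bar{d}v^\beta = G_{MN}\, U^M_\mu U^N_\nu\, u^\mu_\alpha u^\nu_\beta\, \bar{d}v^\alpha \bar{d}v^\beta \iff \lambda^2 g_{\mu\nu} u^\nu_\beta = C_{\mu\nu} u^\nu_\beta \iff (C_{\mu\nu} - \lambda^2 g_{\mu\nu})\, u^\nu_\beta = 0,$$

subject to $g_{\mu\nu}\, u^\mu_\alpha u^\nu_\beta = \delta_{\alpha\beta}$. (54)

Box 8 (Left general eigenvalue problem and right general eigenvalue problem: Cayley calculus).

Left eigenvalue problem:

$$\Lambda^2\, dS^2 = ds^2, \qquad \Lambda^2\, \bar{d}V^T \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l\, \bar{d}V = \bar{d}V^T \mathsf{F}_l^T \mathsf{C}_l \mathsf{F}_l\, \bar{d}V \iff (\mathsf{C}_l - \Lambda^2 \mathsf{G}_l)\, \mathsf{F}_l = 0,$$

subject to $\mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}$.

Right eigenvalue problem:

$$\lambda^2\, ds^2 = dS^2, \qquad \lambda^2\, \bar{d}v^T \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r\, \bar{d}v = \bar{d}v^T \mathsf{F}_r^T \mathsf{C}_r \mathsf{F}_r\, \bar{d}v \iff (\mathsf{C}_r - \lambda^2 \mathsf{G}_r)\, \mathsf{F}_r = 0,$$

subject to $\mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}$. (55)

Thus, we are led to the general eigenvalue problem as a result of the simultaneous diagonalization of two positive-definite symmetric matrices $\{\mathsf{C}_l, \mathsf{G}_l\}$ or $\{\mathsf{C}_r, \mathsf{G}_r\}$, respectively. Compare with Lemma 1.

Lemma 1 (Left and right general eigenvalue problem of the Cauchy–Green deformation tensor). For the pair of positive-definite symmetric matrices $\{\mathsf{C}_l, \mathsf{G}_l\}$ or $\{\mathsf{C}_r, \mathsf{G}_r\}$, respectively, a simultaneous diagonalization defined by

$$\text{left diagonalization:}\quad \mathsf{F}_l^T \mathsf{C}_l \mathsf{F}_l = \mathrm{diag}(\Lambda_1^2, \Lambda_2^2) =: \mathsf{D}_l, \quad \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2$$

versus

$$\text{right diagonalization:}\quad \mathsf{F}_r^T \mathsf{C}_r \mathsf{F}_r = \mathrm{diag}(\lambda_1^2, \lambda_2^2) =: \mathsf{D}_r, \quad \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}_2 \tag{56}$$

is readily obtained from the following general eigenvalue–eigenvector problem of type left eigenvalues and left principal stretches:

$$\mathsf{C}_l \mathsf{F}_l - \mathsf{G}_l \mathsf{F}_l \mathsf{D}_l = 0 \iff (\mathsf{C}_l - \Lambda_i^2 \mathsf{G}_l)\, \boldsymbol{f}_{li} = 0 \iff |\mathsf{C}_l - \Lambda^2 \mathsf{G}_l| = 0,$$

$$\Lambda_{1,2}^2 = \Lambda_\pm^2 = \frac{1}{2}\left( \mathrm{tr}[\mathsf{C}_l \mathsf{G}_l^{-1}] \pm \sqrt{\left(\mathrm{tr}[\mathsf{C}_l \mathsf{G}_l^{-1}]\right)^2 - 4 \det[\mathsf{C}_l \mathsf{G}_l^{-1}]} \right), \tag{57}$$

subject to $\mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2$, and of type right eigenvalues and right principal stretches:

$$\mathsf{C}_r \mathsf{F}_r - \mathsf{G}_r \mathsf{F}_r \mathsf{D}_r = 0 \iff (\mathsf{C}_r - \lambda_i^2 \mathsf{G}_r)\, \boldsymbol{f}_{ri} = 0 \iff |\mathsf{C}_r - \lambda^2 \mathsf{G}_r| = 0,$$

$$\lambda_{1,2}^2 = \lambda_\pm^2 = \frac{1}{2}\left( \mathrm{tr}[\mathsf{C}_r \mathsf{G}_r^{-1}] \pm \sqrt{\left(\mathrm{tr}[\mathsf{C}_r \mathsf{G}_r^{-1}]\right)^2 - 4 \det[\mathsf{C}_r \mathsf{G}_r^{-1}]} \right), \tag{58}$$

subject to $\mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}_2$, and

$$\Lambda_{1,2}^2 = 1/\lambda_{1,2}^2, \qquad 1/\Lambda_{1,2}^2 = \lambda_{1,2}^2. \tag{59}$$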
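The simultaneous diagonalization of Lemma 1 can be sketched numerically as follows. This is one possible route, assuming NumPy: the general eigenvalue problem $(\mathsf{C} - \Lambda^2 \mathsf{G})\mathsf{F} = 0$ is reduced to a standard symmetric one by a Cholesky factorization of $\mathsf{G}$, which is a common numerical device and not the construction used in the text. The function names and the test matrices are illustrative assumptions. The closed-form roots of Eq. (57) are evaluated alongside for comparison.

```python
import numpy as np

def principal_stretches(C, G):
    """Solve (C - Lambda^2 G) F = 0 subject to F^T G F = I (C symmetric, G SPD)."""
    L = np.linalg.cholesky(G)            # G = L L^T
    L_inv = np.linalg.inv(L)
    A = L_inv @ C @ L_inv.T              # symmetric standard eigenvalue problem
    lam2, Y = np.linalg.eigh(A)          # eigenvalues in ascending order
    F = L_inv.T @ Y                      # back-transformed, G-orthonormal eigencolumns
    return lam2, F

def principal_stretches_closed_form(C, G):
    """Squared principal stretches from trace and determinant of C G^{-1}, Eq. (57)."""
    M = C @ np.linalg.inv(G)
    t, d = np.trace(M), np.linalg.det(M)
    root = np.sqrt(max(t * t - 4.0 * d, 0.0))
    return np.array([0.5 * (t - root), 0.5 * (t + root)])   # ascending order

# Illustrative pair of symmetric positive-definite matrices (assumed values).
C_l = np.array([[2.0, 0.5],
                [0.5, 1.0]])
G_l = np.array([[1.0, 0.2],
                [0.2, 2.0]])

lam2, F_l = principal_stretches(C_l, G_l)
lam2_cf = principal_stretches_closed_form(C_l, G_l)

print(lam2, lam2_cf)            # squared principal stretches, two ways
print(F_l.T @ G_l @ F_l)        # should be close to the unit matrix, Eq. (56)
print(F_l.T @ C_l @ F_l)        # should be close to diag(lam2), Eq. (56)
```

The columns of `F_l` play the role of the Frobenius matrix $\mathsf{F}_l$; they are fixed only up to sign, in line with the sign freedom of the eigencolumns.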

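In order to visualize the eigenspace of the left and right Cauchy–Green deformation tensors $\mathsf{C}_l$ and $\mathsf{C}_r$ relative to the left and right metric tensors $\mathsf{G}_l$ and $\mathsf{G}_r$, we are forced to compute in addition the eigenvectors, in particular the eigencolumns (also called eigendirections) of the pairs $\{\mathsf{C}_l, \mathsf{G}_l\}$ and $\{\mathsf{C}_r, \mathsf{G}_r\}$, respectively. Compare with Lemma 2.

Lemma 2 (Left and right general eigenvectors, left and right principal stretch directions). For the pair of positive-definite symmetric matrices $\{\mathsf{C}_l, \mathsf{G}_l\}$ and $\{\mathsf{C}_r, \mathsf{G}_r\}$, an explicit form of the left eigencolumns (also called left principal stretch directions) and of the right eigencolumns (also called right principal stretch directions) is

1st left eigencolumn, $\Lambda_1$:

$$\begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = \frac{1}{\sqrt{(c_{22} - \Lambda_1^2 G_{22})^2 G_{11} - 2 (c_{12} - \Lambda_1^2 G_{12})(c_{22} - \Lambda_1^2 G_{22}) G_{12} + (c_{12} - \Lambda_1^2 G_{12})^2 G_{22}}} \begin{bmatrix} c_{22} - \Lambda_1^2 G_{22} \\ -(c_{12} - \Lambda_1^2 G_{12}) \end{bmatrix},$$

2nd left eigencolumn, $\Lambda_2$:

$$\begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = \frac{1}{\sqrt{(c_{11} - \Lambda_2^2 G_{11})^2 G_{22} - 2 (c_{11} - \Lambda_2^2 G_{11})(c_{12} - \Lambda_2^2 G_{12}) G_{12} + (c_{12} - \Lambda_2^2 G_{12})^2 G_{11}}} \begin{bmatrix} -(c_{12} - \Lambda_2^2 G_{12}) \\ c_{11} - \Lambda_2^2 G_{11} \end{bmatrix}, \tag{60}$$

1st right eigencolumn, $\lambda_1$:

$$\begin{bmatrix} f_{11} \\ f_{21} \end{bmatrix} = \frac{1}{\sqrt{(C_{22} - \lambda_1^2 g_{22})^2 g_{11} - 2 (C_{12} - \lambda_1^2 g_{12})(C_{22} - \lambda_1^2 g_{22}) g_{12} + (C_{12} - \lambda_1^2 g_{12})^2 g_{22}}} \begin{bmatrix} C_{22} - \lambda_1^2 g_{22} \\ -(C_{12} - \lambda_1^2 g_{12}) \end{bmatrix},$$

2nd right eigencolumn, $\lambda_2$:

$$\begin{bmatrix} f_{12} \\ f_{22} \end{bmatrix} = \frac{1}{\sqrt{(C_{11} - \lambda_2^2 g_{11})^2 g_{22} - 2 (C_{11} - \lambda_2^2 g_{11})(C_{12} - \lambda_2^2 g_{12}) g_{12} + (C_{12} - \lambda_2^2 g_{12})^2 g_{11}}} \begin{bmatrix} -(C_{12} - \lambda_2^2 g_{12}) \\ C_{11} - \lambda_2^2 g_{11} \end{bmatrix}. \tag{61}$$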

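A sketch of a proof is presented in the following. Note that there are four pairs of $\{F_{11}, F_{22}\}$, dependent on the sign choice $\{+,+\}$, $\{+,-\}$, $\{-,+\}$, and $\{-,-\}$. In Lemma 2, we have chosen the solution $\mathrm{sign}\{F_{11}, F_{22}\} = \{+,+\}$. Furthermore, note that the proof for representing the right eigencolumns or right eigendirections runs analogously. The dimension four of the solution space of eigencolumns or eigendirections has already been documented by J. M. Gere and W. Weaver (1965), for instance.

Proof (first and second left eigencolumns).

First left eigencolumn, $\Lambda_1$:

$$\begin{bmatrix} c_{11} - \Lambda_1^2 G_{11} & c_{12} - \Lambda_1^2 G_{12} \\ c_{12} - \Lambda_1^2 G_{12} & c_{22} - \Lambda_1^2 G_{22} \end{bmatrix} \begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}; \tag{62}$$

second identity:

$$(c_{12} - \Lambda_1^2 G_{12}) F_{11} + (c_{22} - \Lambda_1^2 G_{22}) F_{21} = 0 \implies F_{21} = -\frac{c_{12} - \Lambda_1^2 G_{12}}{c_{22} - \Lambda_1^2 G_{22}}\, F_{11} \iff \begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = F_{11} \begin{bmatrix} 1 \\ -\dfrac{c_{12} - \Lambda_1^2 G_{12}}{c_{22} - \Lambda_1^2 G_{22}} \end{bmatrix}. \tag{63}$$

Second left eigencolumn, $\Lambda_2$:

$$\begin{bmatrix} c_{11} - \Lambda_2^2 G_{11} & c_{12} - \Lambda_2^2 G_{12} \\ c_{12} - \Lambda_2^2 G_{12} & c_{22} - \Lambda_2^2 G_{22} \end{bmatrix} \begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}; \tag{64}$$

first identity:

$$(c_{11} - \Lambda_2^2 G_{11}) F_{12} + (c_{12} - \Lambda_2^2 G_{12}) F_{22} = 0 \implies F_{12} = -\frac{c_{12} - \Lambda_2^2 G_{12}}{c_{11} - \Lambda_2^2 G_{11}}\, F_{22} \iff \begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = F_{22} \begin{bmatrix} -\dfrac{c_{12} - \Lambda_2^2 G_{12}}{c_{11} - \Lambda_2^2 G_{11}} \\ 1 \end{bmatrix}. \tag{65}$$

Left conditions:

$$\mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2 \iff \begin{bmatrix} F_{11} & F_{21} \\ F_{12} & F_{22} \end{bmatrix} \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \tag{66}$$

first and second partitioning:

$$[F_{11}, F_{21}] \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} \begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = 1, \qquad [F_{12}, F_{22}] \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} \begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = 1. \tag{67}$$

Inserting the second identity (63) into the first partitioning:

$$F_{11}^2 \left[ 1,\ -\frac{c_{12} - \Lambda_1^2 G_{12}}{c_{22} - \Lambda_1^2 G_{22}} \right] \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} \begin{bmatrix} 1 \\ -\dfrac{c_{12} - \Lambda_1^2 G_{12}}{c_{22} - \Lambda_1^2 G_{22}} \end{bmatrix} = 1$$

$$\iff F_{11}^2 \left[ G_{11} - 2 G_{12} \frac{c_{12} - \Lambda_1^2 G_{12}}{c_{22} - \Lambda_1^2 G_{22}} + G_{22} \frac{(c_{12} - \Lambda_1^2 G_{12})^2}{(c_{22} - \Lambda_1^2 G_{22})^2} \right] = 1$$

$$\implies F_{11} = \pm \frac{c_{22} - \Lambda_1^2 G_{22}}{\sqrt{(c_{22} - \Lambda_1^2 G_{22})^2 G_{11} - 2 (c_{12} - \Lambda_1^2 G_{12})(c_{22} - \Lambda_1^2 G_{22}) G_{12} + (c_{12} - \Lambda_1^2 G_{12})^2 G_{22}}}$$

$$\iff \begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = \pm \frac{1}{\sqrt{(c_{22} - \Lambda_1^2 G_{22})^2 G_{11} - 2 (c_{12} - \Lambda_1^2 G_{12})(c_{22} - \Lambda_1^2 G_{22}) G_{12} + (c_{12} - \Lambda_1^2 G_{12})^2 G_{22}}} \begin{bmatrix} c_{22} - \Lambda_1^2 G_{22} \\ -(c_{12} - \Lambda_1^2 G_{12}) \end{bmatrix} \quad\text{q.e.d.} \tag{68}$$

Inserting the first identity (65) into the second partitioning:

$$F_{22}^2 \left[ -\frac{c_{12} - \Lambda_2^2 G_{12}}{c_{11} - \Lambda_2^2 G_{11}},\ 1 \right] \begin{bmatrix} G_{11} & G_{12} \\ G_{12} & G_{22} \end{bmatrix} \begin{bmatrix} -\dfrac{c_{12} - \Lambda_2^2 G_{12}}{c_{11} - \Lambda_2^2 G_{11}} \\ 1 \end{bmatrix} = 1$$

$$\iff F_{22}^2 \left[ G_{22} - 2 G_{12} \frac{c_{12} - \Lambda_2^2 G_{12}}{c_{11} - \Lambda_2^2 G_{11}} + G_{11} \frac{(c_{12} - \Lambda_2^2 G_{12})^2}{(c_{11} - \Lambda_2^2 G_{11})^2} \right] = 1$$

$$\implies F_{22} = \pm \frac{c_{11} - \Lambda_2^2 G_{11}}{\sqrt{(c_{11} - \Lambda_2^2 G_{11})^2 G_{22} - 2 (c_{11} - \Lambda_2^2 G_{11})(c_{12} - \Lambda_2^2 G_{12}) G_{12} + (c_{12} - \Lambda_2^2 G_{12})^2 G_{11}}}$$

$$\iff \begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = \pm \frac{1}{\sqrt{(c_{11} - \Lambda_2^2 G_{11})^2 G_{22} - 2 (c_{11} - \Lambda_2^2 G_{11})(c_{12} - \Lambda_2^2 G_{12}) G_{12} + (c_{12} - \Lambda_2^2 G_{12})^2 G_{11}}} \begin{bmatrix} -(c_{12} - \Lambda_2^2 G_{12}) \\ c_{11} - \Lambda_2^2 G_{11} \end{bmatrix} \quad\text{q.e.d.} \tag{69}$$

The canonical forms of the metric, namely, $dS^2$ and $ds^2$, have been interpreted as the following pairs:

left Tissot circle $\mathbb{S}^1$ versus left Tissot ellipse $\mathbb{E}^1_{\Lambda_1,\Lambda_2}$,

and

right Tissot ellipse $\mathbb{E}^1_{\lambda_1,\lambda_2}$ versus right Tissot circle $\mathbb{S}^1$.

(70)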
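The explicit left eigencolumns of Lemma 2, Eq. (60), can be checked against the normalization $\mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2$ of Eq. (66). The following is a minimal sketch, assuming NumPy; the function names and the test matrices are illustrative assumptions. It assigns $\Lambda_1^2$ to the plus root and $\Lambda_2^2$ to the minus root of Eq. (57) and uses the sign choice $\{+,+\}$ of the lemma. The formula degenerates to $0/0$ when both entries of the numerator column vanish (for instance, for diagonal matrices with the roots assigned in the other order), a case this sketch does not handle.

```python
import numpy as np

def squared_stretches(C, G):
    """Roots of |C - Lambda^2 G| = 0 via trace and determinant of C G^{-1}, Eq. (57)."""
    M = C @ np.linalg.inv(G)
    t, d = np.trace(M), np.linalg.det(M)
    root = np.sqrt(max(t * t - 4.0 * d, 0.0))
    return 0.5 * (t + root), 0.5 * (t - root)    # Lambda_1^2 (plus), Lambda_2^2 (minus)

def left_eigencolumns(C, G):
    """Explicit G-normalized eigencolumns of the pair {C, G} following Eq. (60)."""
    lam2_1, lam2_2 = squared_stretches(C, G)

    # First eigencolumn: taken from the second row of (C - Lambda_1^2 G) f = 0.
    a1 = C[1, 1] - lam2_1 * G[1, 1]
    b1 = C[0, 1] - lam2_1 * G[0, 1]
    n1 = np.sqrt(a1**2 * G[0, 0] - 2.0 * b1 * a1 * G[0, 1] + b1**2 * G[1, 1])
    f1 = np.array([a1, -b1]) / n1

    # Second eigencolumn: taken from the first row of (C - Lambda_2^2 G) f = 0.
    a2 = C[0, 0] - lam2_2 * G[0, 0]
    b2 = C[0, 1] - lam2_2 * G[0, 1]
    n2 = np.sqrt(a2**2 * G[1, 1] - 2.0 * a2 * b2 * G[0, 1] + b2**2 * G[0, 0])
    f2 = np.array([-b2, a2]) / n2

    return np.array([lam2_1, lam2_2]), np.column_stack([f1, f2])

# Illustrative pair of symmetric positive-definite matrices (assumed values).
C_l = np.array([[2.0, 0.5],
                [0.5, 1.0]])
G_l = np.array([[1.0, 0.2],
                [0.2, 2.0]])

lam2, F_l = left_eigencolumns(C_l, G_l)
print(lam2)
print(F_l.T @ G_l @ F_l)   # should be close to the unit matrix
print(F_l.T @ C_l @ F_l)   # should be close to diag(lam2)
```

The right eigencolumns of Eq. (61) follow from the same function by passing $\{\mathsf{C}_r, \mathsf{G}_r\}$ instead of $\{\mathsf{C}_l, \mathsf{G}_l\}$.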

Figure 8 illustrates the pair {left Cauchy–Green deformation tensor, left metric tensor} by means of the left Tissot circle $\mathbb{S}^1$ and the left Tissot ellipse $\mathbb{E}^1_{\Lambda_1,\Lambda_2}$ on the left tangent space $T\mathbb{M}^2_l$. In contrast, by means of Fig. 9, we aim at illustrating the pair {right Cauchy–Green deformation tensor, right metric tensor} by means of the right Tissot ellipse $\mathbb{E}^1_{\lambda_1,\lambda_2}$ and the right Tissot circle $\mathbb{S}^1$ on the right tangent space $T\mathbb{M}^2_r$.

Fig. 8 Left Cauchy–Green tensor, left Tissot circle $\mathbb{S}^1$, left Tissot ellipse $\mathbb{E}^1_{\Lambda_1,\Lambda_2}$; the tangent vectors are $\partial/\partial U$ and $\partial/\partial V$

Fig. 9 Right Cauchy–Green tensor, right Tissot ellipse $\mathbb{E}^1_{\lambda_1,\lambda_2}$, right Tissot circle $\mathbb{S}^1$; the tangent vectors are $\partial/\partial u$ and $\partial/\partial v$

The left eigenvectors span canonically the left tangent space $T\mathbb{M}^2_l$, while the right eigenvectors span the right tangent space $T\mathbb{M}^2_r$, namely,

$$U^M_A \frac{\partial}{\partial U^M} \quad\text{versus}\quad u^\mu_\alpha \frac{\partial}{\partial u^\mu}, \qquad \mathsf{F}_l \frac{\partial}{\partial \boldsymbol{U}} \quad\text{versus}\quad \mathsf{F}_r \frac{\partial}{\partial \boldsymbol{u}}. \tag{71}$$

Indeed, they are generated from a dual holonomic base (coordinate base) $\{dU^1, dU^2\}$ versus $\{du^1, du^2\}$ to an anholonomic base $\{\bar{d}V^1, \bar{d}V^2\} = \{\Omega_1, \Omega_2\}$ versus $\{\bar{d}v^1, \bar{d}v^2\} = \{\omega_1, \omega_2\}$ by the transformations

$$\begin{bmatrix} \Omega_1 \\ \Omega_2 \end{bmatrix} = \mathsf{F}_l^{-1} \begin{bmatrix} dU^1 \\ dU^2 \end{bmatrix} \quad\text{versus}\quad \begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix} = \mathsf{F}_r^{-1} \begin{bmatrix} du^1 \\ du^2 \end{bmatrix}. \tag{72}$$

5 Euler–Lagrange Deformation Tensor

Approach your problems from the right end and begin with the answers. Then, one day, perhaps you will find the final question. (The Hermit Clad in Crane Feathers, in R. van Gulik’s The Chinese Maze Murders.)

A first additive measure of deformation: Euler–Lagrange deformation tensor, relations between the Cauchy–Green and Euler–Lagrange deformation tensors.

The first additive measure of deformation is based upon the scale differences $ds^2 - dS^2$ versus $dS^2 - ds^2$, which are represented by pullback $U^M \to u^\mu = f^\mu(U^M)$ or pushforward $u^\mu \to U^M = F^M(u^\mu)$, in particular by

$$ds^2 - dS^2 = d\boldsymbol{U}^T \left( \mathsf{J}_l^T \mathsf{G}_r \mathsf{J}_l - \mathsf{G}_l \right) d\boldsymbol{U} \quad\text{versus}\quad dS^2 - ds^2 = d\boldsymbol{u}^T \left( \mathsf{J}_r^T \mathsf{G}_l \mathsf{J}_r - \mathsf{G}_r \right) d\boldsymbol{u}. \tag{73}$$

Accordingly, we are led to the deformation measures of Box 9, called strains, which have been introduced by L. Euler and J. L. Lagrange.

Box 9 (Left versus right Euler–Lagrange deformation tensor).

Left EL deformation tensor:

$$ds^2 - dS^2 = d\boldsymbol{U}^T \left( \mathsf{J}_l^T \mathsf{G}_r \mathsf{J}_l - \mathsf{G}_l \right) d\boldsymbol{U}, \qquad \tfrac{1}{2}\left( ds^2 - dS^2 \right) = +\, d\boldsymbol{U}^T \mathsf{E}_l\, d\boldsymbol{U}, \qquad \mathsf{E}_l := \tfrac{1}{2}\left( \mathsf{J}_l^T \mathsf{G}_r \mathsf{J}_l - \mathsf{G}_l \right).$$

Right EL deformation tensor:

$$dS^2 - ds^2 = d\boldsymbol{u}^T \left( \mathsf{J}_r^T \mathsf{G}_l \mathsf{J}_r - \mathsf{G}_r \right) d\boldsymbol{u}, \qquad \tfrac{1}{2}\left( dS^2 - ds^2 \right) = -\, d\boldsymbol{u}^T \mathsf{E}_r\, d\boldsymbol{u}, \qquad \mathsf{E}_r := \tfrac{1}{2}\left( \mathsf{G}_r - \mathsf{J}_r^T \mathsf{G}_l \mathsf{J}_r \right). \tag{74}$$

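Question. “What is the role of strain in the context of the pair of matrices $\{\mathsf{E}_l, \mathsf{G}_l\}$ and $\{\mathsf{E}_r, \mathsf{G}_r\}$, respectively?” Answer: “$\{\mathsf{E}_l, \mathsf{E}_r\}$ are symmetric matrices, and $\{\mathsf{G}_l, \mathsf{G}_r\}$ are symmetric, positive-definite matrices. Thus, according to a standard lemma of matrix algebra, both matrices can be simultaneously diagonalized, one matrix being the unit matrix. With reference to the general eigenvalue problem we experienced for the Cauchy–Green deformation tensor, we arrive at Lemma 3.”

Lemma 3 (Left and right general eigenvalue problem of the Euler–Lagrange deformation tensor). For the pair of symmetric matrices $\{\mathsf{E}_l, \mathsf{G}_l\}$ or $\{\mathsf{E}_r, \mathsf{G}_r\}$, where $\{\mathsf{G}_l, \mathsf{G}_r\}$ are positive-definite matrices, a simultaneous diagonalization, namely,

$$\mathsf{F}_l^T \mathsf{E}_l \mathsf{F}_l = \mathrm{diag}(K_1, K_2), \quad \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2 \quad\text{versus}\quad \mathsf{F}_r^T \mathsf{E}_r \mathsf{F}_r = \mathrm{diag}(\kappa_1, \kappa_2), \quad \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}_2, \tag{75}$$

is immediately obtained from the left and right general eigenvalue–eigenvector problems

$$\mathsf{E}_l \mathsf{F}_l - \mathsf{G}_l \mathsf{F}_l\, \mathrm{diag}(K_1, K_2) = 0 \iff (\mathsf{E}_l - K_i \mathsf{G}_l)\, \boldsymbol{f}_{li} = 0 \ (\forall i \in \{1, 2\}) \iff |\mathsf{E}_l - K_i \mathsf{G}_l| = 0, \quad \mathsf{F}_l^T \mathsf{G}_l \mathsf{F}_l = \mathsf{I}_2,$$

$$K_{1,2} = K_\pm = \frac{1}{2}\left( \mathrm{tr}[\mathsf{E}_l \mathsf{G}_l^{-1}] \pm \sqrt{\left(\mathrm{tr}[\mathsf{E}_l \mathsf{G}_l^{-1}]\right)^2 - 4 \det[\mathsf{E}_l \mathsf{G}_l^{-1}]} \right),$$

and

$$\mathsf{E}_r \mathsf{F}_r - \mathsf{G}_r \mathsf{F}_r\, \mathrm{diag}(\kappa_1, \kappa_2) = 0 \iff (\mathsf{E}_r - \kappa_i \mathsf{G}_r)\, \boldsymbol{f}_{ri} = 0 \ (\forall i \in \{1, 2\}) \iff |\mathsf{E}_r - \kappa_i \mathsf{G}_r| = 0, \quad \mathsf{F}_r^T \mathsf{G}_r \mathsf{F}_r = \mathsf{I}_2,$$

$$\kappa_{1,2} = \kappa_\pm = \frac{1}{2}\left( \mathrm{tr}[\mathsf{E}_r \mathsf{G}_r^{-1}] \pm \sqrt{\left(\mathrm{tr}[\mathsf{E}_r \mathsf{G}_r^{-1}]\right)^2 - 4 \det[\mathsf{E}_r \mathsf{G}_r^{-1}]} \right). \tag{76}$$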

In order to visualize the eigenspace of both the left and the right Euler–Lagrange deformation tensors $\mathsf{E}_l$ and $\mathsf{E}_r$ relative to the left and right metric tensors $\mathsf{G}_l$ and $\mathsf{G}_r$, we are forced to compute in addition the left and right eigenvectors (namely, the left and right eigencolumns, also called eigendirections) of the pairs $\{\mathsf{E}_l, \mathsf{G}_l\}$ and $\{\mathsf{E}_r, \mathsf{G}_r\}$, respectively. Lemma 4 summarizes the results.

Lemma 4 (Left and right eigenvectors of the left and right Euler–Lagrange deformation tensors). For the pair of symmetric matrices $\{\mathsf{E}_l, \mathsf{G}_l\}$ or $\{\mathsf{E}_r, \mathsf{G}_r\}$, an explicit form of the left eigencolumns and the right eigencolumns is

first left eigencolumn, $K_1$:

$$\begin{bmatrix} F_{11} \\ F_{21} \end{bmatrix} = \frac{1}{\sqrt{G_{11} (e_{22} - K_1 G_{22})^2 - 2 G_{12} (e_{12} - K_1 G_{12})(e_{22} - K_1 G_{22}) + G_{22} (e_{12} - K_1 G_{12})^2}} \begin{bmatrix} e_{22} - K_1 G_{22} \\ -(e_{12} - K_1 G_{12}) \end{bmatrix}; \tag{77}$$

second left eigencolumn, $K_2$:

$$\begin{bmatrix} F_{12} \\ F_{22} \end{bmatrix} = \frac{1}{\sqrt{G_{22} (e_{11} - K_2 G_{11})^2 - 2 G_{12} (e_{11} - K_2 G_{11})(e_{12} - K_2 G_{12}) + G_{11} (e_{12} - K_2 G_{12})^2}} \begin{bmatrix} e_{12} - K_2 G_{12} \\ -(e_{11} - K_2 G_{11}) \end{bmatrix}; \tag{78}$$

first right eigencolumn, $\kappa_1$:

$$\begin{bmatrix} f_{11} \\ f_{21} \end{bmatrix} = \frac{1}{\sqrt{g_{11} (E_{22} - \kappa_1 g_{22})^2 - 2 g_{12} (E_{12} - \kappa_1 g_{12})(E_{22} - \kappa_1 g_{22}) + g_{22} (E_{12} - \kappa_1 g_{12})^2}} \begin{bmatrix} E_{22} - \kappa_1 g_{22} \\ -(E_{12} - \kappa_1 g_{12}) \end{bmatrix}; \tag{79}$$

second right eigencolumn, $\kappa_2$:

$$\begin{bmatrix} f_{12} \\ f_{22} \end{bmatrix} = \frac{1}{\sqrt{g_{22} (E_{11} - \kappa_2 g_{11})^2 - 2 g_{12} (E_{11} - \kappa_2 g_{11})(E_{12} - \kappa_2 g_{12}) + g_{11} (E_{12} - \kappa_2 g_{12})^2}} \begin{bmatrix} -(E_{12} - \kappa_2 g_{12}) \\ E_{11} - \kappa_2 g_{11} \end{bmatrix}. \tag{80}$$

The proof of these relations follows the line of thought of the proof of Lemma 2. Accordingly, we skip any proof here. The canonical forms of the scale difference $(ds)^2 - (dS)^2$ and $(dS)^2 - (ds)^2$, respectively, have been interpreted as

left Euler–Lagrange circle $\mathbb{S}^1$ versus left Euler–Lagrange ellipse $\mathbb{E}^1_{\sqrt{K_1},\sqrt{K_2}}$ $(K_i > 0\ \forall i = 1, 2)$, left Euler–Lagrange hyperbola $\mathbb{H}^1_{\sqrt{K_1},\sqrt{K_2}}$ $(K_1 > 0, K_2 < 0)$,

and

right Euler–Lagrange circle $\mathbb{S}^1$ versus right Euler–Lagrange ellipse $\mathbb{E}^1_{\sqrt{\kappa_1},\sqrt{\kappa_2}}$ $(\kappa_i > 0\ \forall i = 1, 2)$, right Euler–Lagrange hyperbola $\mathbb{H}^1_{\sqrt{\kappa_1},\sqrt{\kappa_2}}$ $(\kappa_1 > 0, \kappa_2 < 0)$,

(81)

on the left tangent space $T_U\mathbb{M}^2_l$ and the right tangent space $T_u\mathbb{M}^2_r$, respectively. A deformation portrait with a positive eigenvalue $K(\mathsf{E}_l, \mathsf{G}_r)$ or $\kappa(\mathsf{E}_r, \mathsf{G}_l)$ is referred to as extension, with a negative eigenvalue $K(\mathsf{E}_l, \mathsf{G}_r)$ or $\kappa(\mathsf{E}_r, \mathsf{G}_l)$ as compression. Obviously, Cauchy–Green deformation and Euler–Lagrange deformation are related as outlined in Corollary 4. The four cases of the eigenspace analysis of the left and right Euler–Lagrange deformations are illustrated in Figs. 10–14.


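Corollary 4 (Relation between the Cauchy–Green and Euler–Lagrange deformation tensors).

\[
\begin{aligned}
2\mathrm{E}_l&=\mathrm{J}_l^{T}\mathrm{G}_r\mathrm{J}_l-\mathrm{G}_l=\mathrm{C}_l-\mathrm{G}_l &&\text{versus}& 2\mathrm{E}_r&=\mathrm{G}_r-\mathrm{J}_r^{T}\mathrm{G}_l\mathrm{J}_r=\mathrm{G}_r-\mathrm{C}_r;\\
\mathrm{C}_l&=2\mathrm{E}_l+\mathrm{G}_l &&\text{versus}& \mathrm{C}_r&=\mathrm{G}_r-2\mathrm{E}_r;\\
\mathrm{E}_l&=\mathrm{J}_l^{T}\mathrm{E}_r\mathrm{J}_l &&\text{versus}& \mathrm{E}_r&=\mathrm{J}_r^{T}\mathrm{E}_l\mathrm{J}_r;\\
2K_i&=\Lambda_i^2-1\ \ \forall i=1,2 &&\text{versus}& 2\kappa_i&=1-\lambda_i^2\ \ \forall i=1,2.
\end{aligned}
\tag{82}
\]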

Examples for the mapping between two Riemann manifolds are the following. C. F. Gauss (1822, 1844) presented his celebrated conformal mapping of the biaxial ellipsoid $\mathbb{E}^2_{A_1,A_1,A_2}=\mathbb{M}^2_l$ onto the sphere $\mathbb{S}^2_r=\mathbb{M}^2_r$, also called double projection due to a second conformal mapping of the sphere $\mathbb{S}^2_r$ onto the plane $\mathbb{R}^2$. M. Amalvict and E. Livieratos (1988) elaborated the isoparametric mapping of the triaxial ellipsoid $\mathbb{E}^2_{A_1,A_2,A_3}=\mathbb{M}^2_l$ onto the biaxial ellipsoid $\mathbb{E}^2_{A_1,A_1,A_2}=\mathbb{M}^2_r$. A. Dermanis, E. Livieratos, and S. Pertsinidou (1984) mapped the geoid onto the biaxial ellipsoid. While nearly all existing map projections are analyzed by means of the Cauchy–Green deformation tensor, A. Dermanis and E. Livieratos (1993) used the Euler–Lagrange deformation tensor for map projections, in particular for dilatation $\mathrm{tr}\,[\mathrm{E}_l\mathrm{G}_l^{-1}]$ or $\mathrm{tr}\,[\mathrm{E}_r\mathrm{G}_r^{-1}]$ and general shear $\sqrt{(\mathrm{tr}\,[\mathrm{E}_l\mathrm{G}_l^{-1}])^2-4\det[\mathrm{E}_l\mathrm{G}_l^{-1}]}$ or $\sqrt{(\mathrm{tr}\,[\mathrm{E}_r\mathrm{G}_r^{-1}])^2-4\det[\mathrm{E}_r\mathrm{G}_r^{-1}]}$. An elaborate example is discussed in Sects. 2–6. However, to give you some breathing time, please first enjoy the Berghaus star projection, presented in Fig. 13.

Fig. 10 Left Euler–Lagrange tensor, $K_1>0$, $K_2>0$, left Euler–Lagrange circle $\mathbb{S}^1$, left Euler–Lagrange ellipse $\mathbb{E}^1_{\sqrt{K_1},\sqrt{K_2}}$ (axes ∂/∂U, ∂/∂V and ∂/∂u, ∂/∂v; semi-axes $\sqrt{K_1}$, $\sqrt{K_2}$)

Fig. 11 Left Euler–Lagrange tensor, $K_1>0$, $K_2<0$, left Euler–Lagrange circle $\mathbb{S}^1$, left Euler–Lagrange hyperbola $\mathbb{H}^1_{\sqrt{K_1},\sqrt{K_2}}$, left and right focal points $F_l$ and $F_r$ (axes ∂/∂U, ∂/∂V and ∂/∂u, ∂/∂v)


Fig. 12 Right Euler–Lagrange tensor, $\kappa_1>0$, $\kappa_2>0$, right Euler–Lagrange circle $\mathbb{S}^1$, right Euler–Lagrange ellipse $\mathbb{E}^1_{\sqrt{\kappa_1},\sqrt{\kappa_2}}$ (axes ∂/∂U, ∂/∂V and ∂/∂u, ∂/∂v)

Fig. 13 Right Euler–Lagrange tensor, $\kappa_1>0$, $\kappa_2<0$, right Euler–Lagrange circle $\mathbb{S}^1$, right Euler–Lagrange hyperbola $\mathbb{H}^1_{\sqrt{\kappa_1},\sqrt{\kappa_2}}$, left and right focal points $F_l$ and $F_r$ (axes ∂/∂U, ∂/∂V and ∂/∂u, ∂/∂v)

6 Review: The Deformation Measures

Review: The family of 22 different deformation measures, compatibility conditions, integrability conditions, and differential forms.

By means of Table 2, let us introduce a collection of various deformation measures, i.e., deformation tensors of the first kind, based upon the reviews by D. B. Macvean (1968), K. N. Morman (1986), and E. Grafarend (1995). For the classification scheme, various representation theorems of T. C. T. Ting (1985) are most useful. Compatibility conditions for Cauchy–Green deformation fields have been formulated by F. P. Duda and L. C. Martins (1995). They are needed for the problem of determining the mapping equations $U^K=f^K(u^k)$ or $u^k=f^k(U^K)$ from prescribed left or right Cauchy–Green deformation fields as tensor-valued functions. In the context of exterior calculus, these compatibility conditions are classified as integrability conditions. The various deformation measures honor the works of G. Piola (1836), G. Green (1839), A. Cauchy (1889, 1890), J. Finger (1894), E. Almansi (1911), H. Hencky (1928), Z. Karni and M. Reiner (1960), and B. R. Seth (1964a, b).

The inverse deformation matrices, namely, $\mathrm{E}_5$, $\mathrm{E}_6$, $\mathrm{E}_{15}$, $\mathrm{E}_{16}$, $\mathrm{E}_{17}$, and $\mathrm{E}_{18}$, appear in various forms of distortion energy. Logarithmic and root measures of deformation appear in special stress–strain relations, which very often are called constitutive equations. The measures $\mathrm{E}_3$ and $\mathrm{E}_4$ as well as $\mathrm{E}_{13}$ and $\mathrm{E}_{14}$ build up the special eigenvalue problems. They correspond to definitions of the curvature matrix $\mathrm{K}=\mathrm{H}\mathrm{G}^{-1}$ in surface geometry, built on the matrix of the first differential form $\mathrm{I}\sim(dg)^2=g_{\mu\nu}\,du^\mu du^\nu$ as well as on the matrix of the second differential form $\mathrm{II}\sim(dh)^2=h_{\mu\nu}\,du^\mu du^\nu$, which is also called the Hesse form. Indeed, they establish the matrix pair $\{\mathrm{H},\mathrm{G}\}$, where $\mathrm{G}$ is positive definite.


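Table 2 Various deformation tensors of the first kind

| Definitions | Author | Comments |
|---|---|---|
| $\mathrm{E}_1=\mathrm{C}_l=\mathrm{S}_l\mathrm{R}^{T}\mathrm{G}_r\mathrm{R}\mathrm{S}_l=\mathrm{J}_l^{T}\mathrm{G}_r\mathrm{J}_l$ | A. Cauchy (1889, 1890) ("left Cauchy–Green") | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{C}_l=\mathrm{S}_l^2=\mathrm{J}_l^{T}\mathrm{J}_l$ |
| $\mathrm{E}_2=\mathrm{C}_r=\mathrm{S}_r\mathrm{R}^{T}\mathrm{G}_l\mathrm{R}\mathrm{S}_r=\mathrm{J}_r^{T}\mathrm{G}_l\mathrm{J}_r$ | G. Green (1839) ("right Cauchy–Green") | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{C}_r=\mathrm{S}_r^2=\mathrm{J}_r^{T}\mathrm{J}_r$ |
| $\mathrm{E}_3=\mathrm{C}_l\mathrm{G}_l^{-1}$ | E. Grafarend (1995) ("left-right Cauchy–Green") | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_3=\mathrm{C}_l$ |
| $\mathrm{E}_4=\mathrm{C}_r\mathrm{G}_r^{-1}$ | E. Grafarend (1995) ("right-right Cauchy–Green") | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_4=\mathrm{C}_r$ |
| $\mathrm{E}_5=\mathrm{G}_l\mathrm{C}_l^{-1}$ | E. Grafarend (1995) ("inverse left-right Cauchy–Green") | J. Finger (1894a); if $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_5=\mathrm{E}_3^{-1}$ |
| $\mathrm{E}_6=\mathrm{G}_r\mathrm{C}_r^{-1}$ | E. Grafarend (1995) ("inverse right-left Cauchy–Green") | G. Piola (1836); if $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_6=\mathrm{E}_4^{-1}$ |
| $\mathrm{E}_7=\mathrm{C}_l^{m/2}\sim\{\Lambda_1^m,\Lambda_2^m\}$ | B. R. Seth (1964a, b) $(m\in\mathbb{Z},\ m\neq0)$ | $m=2$: $\mathrm{E}_7=\mathrm{E}_1$ |
| $\mathrm{E}_8=\ln\mathrm{C}_l\sim\{\ln\Lambda_1,\ln\Lambda_2\}$ | H. Hencky (1928) | – |
| $\mathrm{E}_9=\mathrm{C}_r^{m/2}\sim\{\lambda_1^m,\lambda_2^m\}$ | B. R. Seth (1964a, b) | $m=2$: $\mathrm{E}_9=\mathrm{E}_2$ |
| $\mathrm{E}_{10}=\ln\mathrm{C}_r\sim\{\ln\lambda_1,\ln\lambda_2\}$ | H. Hencky (1928) | – |
| $\mathrm{E}_{11}=\mathrm{E}_l=\tfrac12(\mathrm{C}_l-\mathrm{G}_l)$ | A. Cauchy (1889, 1890) ("left Euler–Lagrange") | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_l=\tfrac12(\mathrm{C}_l-\mathrm{I})$ |
| $\mathrm{E}_{12}=\mathrm{E}_r=\tfrac12(\mathrm{G}_r-\mathrm{C}_r)$ | E. Almansi (1911) ("right Euler–Lagrange") | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_r=\tfrac12(\mathrm{I}-\mathrm{C}_r)$ |
| $\mathrm{E}_{13}=\mathrm{E}_l\mathrm{G}_l^{-1}=\tfrac12(\mathrm{C}_l\mathrm{G}_l^{-1}-\mathrm{I})$ | E. Grafarend (1995) ("left-right Euler–Lagrange") | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_{13}=\mathrm{E}_l$ |
| $\mathrm{E}_{14}=\mathrm{E}_r\mathrm{G}_r^{-1}=\tfrac12(\mathrm{I}-\mathrm{C}_r\mathrm{G}_r^{-1})$ | E. Grafarend (1995) ("right-left Euler–Lagrange") | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_{14}=\mathrm{E}_r$ |
| $\mathrm{E}_{15}=\tfrac12(\mathrm{C}_l^{-1}-\mathrm{G}_l^{-1})$ | Z. Karni and M. Reiner (1960) | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_{15}=\tfrac12(\mathrm{C}_l^{-1}-\mathrm{I})$ |
| $\mathrm{E}_{16}=\tfrac12(\mathrm{G}_r^{-1}-\mathrm{C}_r^{-1})$ | Z. Karni and M. Reiner (1960) | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_{16}=\tfrac12(\mathrm{I}-\mathrm{C}_r^{-1})$ |
| $\mathrm{E}_{17}=\mathrm{G}_l\mathrm{E}_l^{-1}$ | E. Grafarend (1995) ("inverse left-right Euler–Lagrange") | If $\mathrm{G}_l=\mathrm{I}$, then $\mathrm{E}_{17}=\mathrm{E}_l^{-1}$ |
| $\mathrm{E}_{18}=\mathrm{G}_r\mathrm{E}_r^{-1}$ | E. Grafarend (1995) ("inverse right-left Euler–Lagrange") | If $\mathrm{G}_r=\mathrm{I}$, then $\mathrm{E}_{18}=\mathrm{E}_r^{-1}$ |
| $\mathrm{E}_{19}=\mathrm{E}_l^{m/2}\sim\{K_1^{m/2},K_2^{m/2}\}$ | B. R. Seth (1964a, b) $(m\in\mathbb{Z},\ m\neq0)$ | $m=2$: $\mathrm{E}_{19}=\mathrm{E}_{11}$ |
| $\mathrm{E}_{20}=\tfrac12\ln\mathrm{E}_l$ | H. Hencky (1928) | – |
| $\mathrm{E}_{21}=\mathrm{E}_r^{m/2}\sim\{\kappa_1^{m/2},\kappa_2^{m/2}\}$ | B. R. Seth (1964a, b) | $m=2$: $\mathrm{E}_{21}=\mathrm{E}_{12}$ |
| $\mathrm{E}_{22}=\tfrac12\ln\mathrm{E}_r$ | H. Hencky (1928) | – |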

7 Angular Shear

A second additive measure of deformation: angular shear (also called angular distortion), left and right surfaces, parameterized curves.

An alternative additive measure of deformation is angular shear, also called angular distortion. Assume that two parameterized curves in $\mathbb{M}^2_l$ and their images in $\mathbb{M}^2_r$ intersect at the points $U_0$ and $u_0$, respectively. Two vectors $\dot U_1$ and $\dot U_2$, as well as $\dot u_1$ and $\dot u_2$, being elements of the corresponding local tangent spaces $T_{U_0}\mathbb{M}^2_l$ and $T_{u_0}\mathbb{M}^2_r$,

\[
\dot U_1^{M}\in T_{U_0}\mathbb{M}^2_l,\ \ \dot U_2^{N}\in T_{U_0}\mathbb{M}^2_l \quad\text{versus}\quad \dot u_1^{\mu}\in T_{u_0}\mathbb{M}^2_r,\ \ \dot u_2^{\nu}\in T_{u_0}\mathbb{M}^2_r, \tag{83}
\]

enclose the angles $\Psi_l$ and $\Psi_r$. (Note that prime differentiation is understood as differentiation with respect to arc length. In contrast, dot differentiation is understood as differentiation with respect to arbitrary curve parameters, called "$t_l$" and "$t_r$", respectively.) As illustrated in Fig. 14, the left angle $\Psi_l$ and the right angle $\Psi_r$ are represented by the inner products

\[
\cos\Psi_l=\langle U'_1\,|\,U'_2\rangle=\frac{G_{MN}\dot U_1^{M}\dot U_2^{N}}{\sqrt{G_{AB}\dot U_1^{A}\dot U_1^{B}}\sqrt{G_{\Gamma\Delta}\dot U_2^{\Gamma}\dot U_2^{\Delta}}}
\quad\text{versus}\quad
\cos\Psi_r=\langle u'_1\,|\,u'_2\rangle=\frac{g_{\mu\nu}\dot u_1^{\mu}\dot u_2^{\nu}}{\sqrt{g_{\alpha\beta}\dot u_1^{\alpha}\dot u_1^{\beta}}\sqrt{g_{\gamma\delta}\dot u_2^{\gamma}\dot u_2^{\delta}}}. \tag{84}
\]

The second additive measure of deformation is the angular shear, or the angle of shear ($\Sigma_l$ is of type "left," and $\Sigma_r$ is of type "right", respectively):

\[
\Sigma_l=\Sigma:=\Psi_l-\Psi_r \quad\text{versus}\quad \Sigma_r=\sigma:=\Psi_r-\Psi_l. \tag{85}
\]

Example 5 and Box 10 illustrate this second additive measure of deformation. In order to be simple, however, we have chosen the coordinate lines that are illustrated in Fig. 15.

Fig. 14 Angular measure of deformation, left and right shear (tangent vectors $\dot X_1=\frac{\partial X}{\partial U^M}\dot U_1^M$, $\dot X_2=\frac{\partial X}{\partial U^N}\dot U_2^N\in T_{U_0}\mathbb{M}^2_l$ enclosing $\Psi_l$ at $U_0$; $\dot x_1=\frac{\partial x}{\partial u^\mu}\dot u_1^\mu$, $\dot x_2=\frac{\partial x}{\partial u^\nu}\dot u_2^\nu\in T_{u_0}\mathbb{M}^2_r$ enclosing $\Psi_r$ at $u_0$)

Fig. 15 Angular shear, isoparametric mapping $\mathbb{E}^2_{A_1,A_1,A_2}\to\mathbb{S}^2_r$, left and right parameterized curves of type {ellipsoidal parallel circle, ellipsoidal meridian} and {spherical parallel circle, spherical meridian} (coordinates $\Lambda$, $\Phi$ and $\lambda$, $\phi$)


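Box 10 (Angular shear or angular distortion).

Left vector field versus right vector field:

\[
X(\Lambda,\Phi)=E_1\frac{A_1\cos\Phi\cos\Lambda}{\sqrt{1-E^2\sin^2\Phi}}+E_2\frac{A_1\cos\Phi\sin\Lambda}{\sqrt{1-E^2\sin^2\Phi}}+E_3\frac{A_1(1-E^2)\sin\Phi}{\sqrt{1-E^2\sin^2\Phi}}
\quad\text{versus}\quad
x(\lambda,\phi)=e_1\,r\cos\phi\cos\lambda+e_2\,r\cos\phi\sin\lambda+e_3\,r\sin\phi. \tag{86}
\]

Left displacement field versus right displacement field:

\[
dX=\frac{\partial X}{\partial\Lambda}\frac{d\Lambda}{dt_l}\,dt_l+\frac{\partial X}{\partial\Phi}\frac{d\Phi}{dt_l}\,dt_l
\quad\text{versus}\quad
dx=\frac{\partial x}{\partial\lambda}\frac{d\lambda}{dt_r}\,dt_r+\frac{\partial x}{\partial\phi}\frac{d\phi}{dt_r}\,dt_r. \tag{87}
\]

1st left parameterized curve versus 1st right parameterized curve:

\[
\dot\Lambda=1,\ \dot\Phi=0,\ \frac{dX}{dt_l}=\frac{\partial X}{\partial\Lambda}
\quad\text{versus}\quad
\dot\lambda=1,\ \dot\phi=0,\ \frac{dx}{dt_r}=\frac{\partial x}{\partial\lambda}. \tag{88}
\]

2nd left parameterized curve versus 2nd right parameterized curve:

\[
\dot\Lambda=0,\ \dot\Phi=1,\ \frac{dX}{dt_l}=\frac{\partial X}{\partial\Phi}
\quad\text{versus}\quad
\dot\lambda=0,\ \dot\phi=1,\ \frac{dx}{dt_r}=\frac{\partial x}{\partial\phi}. \tag{89}
\]

Left angular shear versus right angular shear:

\[
\langle\dot X_1\,|\,\dot X_2\rangle=\left\langle\frac{\partial X}{\partial\Lambda}\,\Big|\,\frac{\partial X}{\partial\Phi}\right\rangle=0,\ \ \cos\Psi_l=0\iff\Psi_l=\pm\frac{\pi}{2},\ \ \Sigma_l=\Psi_l-\Psi_r=0
\]
\[
\text{versus}\quad
\langle\dot x_1\,|\,\dot x_2\rangle=\left\langle\frac{\partial x}{\partial\lambda}\,\Big|\,\frac{\partial x}{\partial\phi}\right\rangle=0,\ \ \cos\Psi_r=0\iff\Psi_r=\pm\frac{\pi}{2},\ \ \Sigma_r=\Psi_r-\Psi_l=0. \tag{90}
\]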

Example 5 (Angular shear or angular distortion, $f:\mathbb{E}^2_{A_1,A_1,A_2}\to\mathbb{S}^2_r$). Let us refer to Example 3, where we analyze the isoparametric mapping $f=\mathrm{id}$ from an ellipsoid-of-revolution $\mathbb{M}^2_l=\mathbb{E}^2_{A_1,A_1,A_2}$ to a sphere $\mathbb{M}^2_r=\mathbb{S}^2_r$. Here, we shall continue the analysis by computing the angular shear or angular distortion of two parameterized curves in $\mathbb{M}^2_l=\mathbb{E}^2_{A_1,A_1,A_2}$ as well as their images in $\mathbb{M}^2_r=\mathbb{S}^2_r$.

Left surface, parameterized curves versus right surface, parameterized curves:

(i) parallel circles: $U^1=\Lambda=t_l$, $U^2=\Phi=\text{constant}$ versus $u^1=\lambda=t_r$, $u^2=\phi=\text{constant}$; (91)

(ii) meridians: $U^1=\Lambda=\text{constant}$, $U^2=\Phi=t_l$ versus $u^1=\lambda=\text{constant}$, $u^2=\phi=t_r$. (92)

\[
U_1(t_l)=\begin{pmatrix}\Lambda(t_l)\\\Phi(t_l)\end{pmatrix}=\begin{pmatrix}t_l\\\text{constant}\end{pmatrix},\quad
U_2(t_l)=\begin{pmatrix}\Lambda(t_l)\\\Phi(t_l)\end{pmatrix}=\begin{pmatrix}\text{constant}\\t_l\end{pmatrix};
\]
\[
u_1(t_r)=\begin{pmatrix}\lambda(t_r)\\\phi(t_r)\end{pmatrix}=\begin{pmatrix}t_r\\\text{constant}\end{pmatrix},\quad
u_2(t_r)=\begin{pmatrix}\lambda(t_r)\\\phi(t_r)\end{pmatrix}=\begin{pmatrix}\text{constant}\\t_r\end{pmatrix}. \tag{93}
\]

With these parameterized curves in $\mathbb{M}^2_l$ and $\mathbb{M}^2_r$, respectively, we enter Box 10. Here, we compute $\Phi_l^{-1}$ and $\Phi_r^{-1}$ in parameterized form, namely, $(\Lambda,\Phi)\to X(\Lambda,\Phi)\in\mathbb{R}^3$ and $(\lambda,\phi)\to x(\lambda,\phi)\in\mathbb{R}^3$, respectively. The left and right displacement fields are used to derive the tangent vectors $\{\dot X_1,\dot X_2\}$ of type "left" and $\{\dot x_1,\dot x_2\}$ of type "right." The inner products vanish according to our test computations in Example 3. In consequence, $\Psi_l=\Psi_r=\pi/2$ and $\Sigma_l=\Sigma_r=0$, i.e., no angular distortion appears.
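The vanishing inner products of Example 5 can be checked numerically. The sketch below is an illustration only: it assumes the standard parameterizations of the ellipsoid-of-revolution and the sphere by longitude and latitude, approximates the tangent vectors by central differences, and uses hypothetical helper names (`ellipsoid_point`, `sphere_point`, `tangent`, `cos_angle`) and sample values for $A_1$, $E^2$, and $r$.

```python
import math

def ellipsoid_point(lam, phi, a1=1.0, e2=0.0067):
    """Point X(Lambda, Phi) on an ellipsoid-of-revolution (semi-major axis a1, E^2 = e2)."""
    w = math.sqrt(1.0 - e2 * math.sin(phi) ** 2)
    return (a1 * math.cos(phi) * math.cos(lam) / w,
            a1 * math.cos(phi) * math.sin(lam) / w,
            a1 * (1.0 - e2) * math.sin(phi) / w)

def sphere_point(lam, phi, r=1.0):
    """Point x(lambda, phi) on a sphere of radius r."""
    return (r * math.cos(phi) * math.cos(lam),
            r * math.cos(phi) * math.sin(lam),
            r * math.sin(phi))

def tangent(surface, lam, phi, direction, h=1e-6):
    """Central-difference tangent vector along 'lambda' or 'phi'."""
    dl, dp = (h, 0.0) if direction == "lambda" else (0.0, h)
    plus = surface(lam + dl, phi + dp)
    minus = surface(lam - dl, phi - dp)
    return tuple((a - b) / (2.0 * h) for a, b in zip(plus, minus))

def cos_angle(t1, t2):
    """Cosine of the angle between two vectors of R^3."""
    dot = sum(a * b for a, b in zip(t1, t2))
    n1 = math.sqrt(sum(a * a for a in t1))
    n2 = math.sqrt(sum(b * b for b in t2))
    return dot / (n1 * n2)

if __name__ == "__main__":
    lam, phi = 0.3, 0.8
    for surface in (ellipsoid_point, sphere_point):
        c = cos_angle(tangent(surface, lam, phi, "lambda"),
                      tangent(surface, lam, phi, "phi"))
        print(surface.__name__, c)  # both values are close to zero
```

Since both cosines vanish up to discretization error, parallel circles and meridians intersect at right angles on both surfaces, which is the statement $\Sigma_l=\Sigma_r=0$.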

8 Relative Angular Shear

A third multiplicative measure of deformation: relative angular shear, Cauchy–Green deformation tensor, Euler–Lagrange deformation tensor.

The third multiplicative measures of deformation are the ratios $Q_l$ and $Q_r$, respectively. These ratios are also called relative angular shear. In particular,

\[
Q_l=Q:=\frac{\cos\Psi_r}{\cos\Psi_l} \quad\text{versus}\quad Q_r=q:=\frac{\cos\Psi_l}{\cos\Psi_r}, \tag{94}
\]

subject to the duality $Qq=1$. Note that additive angular shear and multiplicative angular shear are related by

\[
Q_l\cos\Psi_l=\cos\Psi_r,\quad \cos\Sigma_l=Q_l\cos^2\Psi_l+\sqrt{1-Q_l^2\cos^2\Psi_l}\,\sin\Psi_l
\]
\[
\text{versus}\quad
Q_r\cos\Psi_r=\cos\Psi_l,\quad \cos\Sigma_r=Q_r\cos^2\Psi_r+\sqrt{1-Q_r^2\cos^2\Psi_r}\,\sin\Psi_r. \tag{95}
\]

In Box 11, we have collected various representations of angular shear, in particular in terms of the Cauchy–Green deformation and the Euler–Lagrange deformation tensors as well as their eigenvalues. Example 6 and Box 12 illustrate this third multiplicative measure of deformation.

Example 6 (Relative angular shear). Again, we refer to Example 3, and to Example 5 in addition, where the isoparametric mapping $f=\mathrm{id}$ from an ellipsoid-of-revolution $\mathbb{M}^2_l=\mathbb{E}^2_{A_1,A_1,A_2}$ to a sphere $\mathbb{M}^2_r=\mathbb{S}^2_r$ has been analyzed with respect to the Cauchy–Green deformation tensor and the absolute angular shear. Here, we aim at relative angular shear. First, by means of Box 12, we are going to compute $\cos\Psi_l$ and $\cos\Psi_r$ from the two sets of left and right curves, namely, from the left Cauchy–Green tensor and the right Cauchy–Green tensor. Second, we derive the relative angular shear: $Q_l=Q_r=1$.
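The link between the additive measure $\Sigma_l=\Psi_l-\Psi_r$ and the multiplicative measure $Q_l=\cos\Psi_r/\cos\Psi_l$ follows from $\cos(\Psi_l-\Psi_r)=\cos\Psi_l\cos\Psi_r+\sin\Psi_l\sin\Psi_r$ with $\cos\Psi_r=Q_l\cos\Psi_l$. A small numerical sketch, assuming both angles lie in $(0,\pi)$ so that $\sin\Psi_r\ge 0$ and $\cos\Psi_l\ne 0$; the function names are hypothetical:

```python
import math

def relative_shear(psi_l, psi_r):
    """Left relative angular shear Q_l = cos(psi_r) / cos(psi_l)."""
    return math.cos(psi_r) / math.cos(psi_l)

def cos_sigma_from_q(psi_l, q_l):
    """cos(Sigma_l) expressed through Q_l and psi_l (assumes sin(psi_r) >= 0)."""
    c = math.cos(psi_l)
    return q_l * c * c + math.sqrt(1.0 - q_l * q_l * c * c) * math.sin(psi_l)

if __name__ == "__main__":
    psi_l, psi_r = 1.0, 0.7            # sample angles in radians
    q_l = relative_shear(psi_l, psi_r)
    q_r = relative_shear(psi_r, psi_l)  # Q_r is the reciprocal ratio
    print(q_l * q_r)                                            # duality Q q = 1
    print(cos_sigma_from_q(psi_l, q_l), math.cos(psi_l - psi_r))  # the two agree
```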


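In the following section, we consider the equivalence theorem for conformal mapping. However, in order to give you first some breathing time, please enjoy the Stab–Werner pseudoconic projection presented in Fig. 16.

Box 11 (Left and right angular shear).

\[
\cos\Psi_l=\frac{\dot U_1^{T}\mathrm{G}_l\dot U_2}{\|\dot U_1\|_{\mathrm{G}_l}\|\dot U_2\|_{\mathrm{G}_l}}=\frac{\dot u_1^{T}\mathrm{C}_r\dot u_2}{\|\dot u_1\|_{\mathrm{C}_r}\|\dot u_2\|_{\mathrm{C}_r}},\qquad
\cos\Psi_r=\frac{\dot u_1^{T}\mathrm{G}_r\dot u_2}{\|\dot u_1\|_{\mathrm{G}_r}\|\dot u_2\|_{\mathrm{G}_r}}=\frac{\dot U_1^{T}\mathrm{C}_l\dot U_2}{\|\dot U_1\|_{\mathrm{C}_l}\|\dot U_2\|_{\mathrm{C}_l}}, \tag{96}
\]

\[
Q_l:=\frac{\cos\Psi_r}{\cos\Psi_l}=\frac{\dot U_1^{T}\mathrm{C}_l\dot U_2}{\dot U_1^{T}\mathrm{G}_l\dot U_2}\,\frac{\|\dot U_1\|_{\mathrm{G}_l}\|\dot U_2\|_{\mathrm{G}_l}}{\|\dot U_1\|_{\mathrm{C}_l}\|\dot U_2\|_{\mathrm{C}_l}}=\frac{\dot U_1^{T}\mathrm{C}_l\dot U_2}{\dot U_1^{T}\mathrm{G}_l\dot U_2}\,\frac{1}{\Lambda(\dot U_1)\Lambda(\dot U_2)},
\]
\[
Q_r:=\frac{\cos\Psi_l}{\cos\Psi_r}=\frac{\dot u_1^{T}\mathrm{C}_r\dot u_2}{\dot u_1^{T}\mathrm{G}_r\dot u_2}\,\frac{\|\dot u_1\|_{\mathrm{G}_r}\|\dot u_2\|_{\mathrm{G}_r}}{\|\dot u_1\|_{\mathrm{C}_r}\|\dot u_2\|_{\mathrm{C}_r}}=\frac{\dot u_1^{T}\mathrm{C}_r\dot u_2}{\dot u_1^{T}\mathrm{G}_r\dot u_2}\,\frac{1}{\lambda(\dot u_1)\lambda(\dot u_2)}, \tag{97}
\]

\[
Q_l=\frac{\dot U_1^{T}(2\mathrm{E}_l+\mathrm{G}_l)\dot U_2}{\dot U_1^{T}\mathrm{G}_l\dot U_2}\,\frac{\|\dot U_1\|_{\mathrm{G}_l}}{\sqrt{\dot U_1^{T}(2\mathrm{E}_l+\mathrm{G}_l)\dot U_1}}\,\frac{\|\dot U_2\|_{\mathrm{G}_l}}{\sqrt{\dot U_2^{T}(2\mathrm{E}_l+\mathrm{G}_l)\dot U_2}}
=\frac{1+2(\dot U_1^{T}\mathrm{E}_l\dot U_2)/(\dot U_1^{T}\mathrm{G}_l\dot U_2)}{\sqrt{1+2(\dot U_1^{T}\mathrm{E}_l\dot U_1)/(\dot U_1^{T}\mathrm{G}_l\dot U_1)}\,\sqrt{1+2(\dot U_2^{T}\mathrm{E}_l\dot U_2)/(\dot U_2^{T}\mathrm{G}_l\dot U_2)}}, \tag{98}
\]

\[
Q_r=\frac{\dot u_1^{T}(\mathrm{G}_r-2\mathrm{E}_r)\dot u_2}{\dot u_1^{T}\mathrm{G}_r\dot u_2}\,\frac{\|\dot u_1\|_{\mathrm{G}_r}}{\sqrt{\dot u_1^{T}(\mathrm{G}_r-2\mathrm{E}_r)\dot u_1}}\,\frac{\|\dot u_2\|_{\mathrm{G}_r}}{\sqrt{\dot u_2^{T}(\mathrm{G}_r-2\mathrm{E}_r)\dot u_2}}
=\frac{1-2(\dot u_1^{T}\mathrm{E}_r\dot u_2)/(\dot u_1^{T}\mathrm{G}_r\dot u_2)}{\sqrt{1-2(\dot u_1^{T}\mathrm{E}_r\dot u_1)/(\dot u_1^{T}\mathrm{G}_r\dot u_1)}\,\sqrt{1-2(\dot u_2^{T}\mathrm{E}_r\dot u_2)/(\dot u_2^{T}\mathrm{G}_r\dot u_2)}}, \tag{99}
\]

\[
\cos\Psi_l=\frac{\dot v_1^{T}\mathrm{F}_r^{T}\mathrm{C}_r\mathrm{F}_r\dot v_2}{\|\dot v_1\|_{\mathrm{F}_r^{T}\mathrm{C}_r\mathrm{F}_r}\|\dot v_2\|_{\mathrm{F}_r^{T}\mathrm{C}_r\mathrm{F}_r}}=\frac{\dot v_1^{T}\,\mathrm{diag}(\lambda_1^2,\lambda_2^2)\,\dot v_2}{\|\dot v_1\|_{\mathrm{D}}\|\dot v_2\|_{\mathrm{D}}},\qquad
\cos\Psi_r=\frac{\dot V_1^{T}\mathrm{F}_l^{T}\mathrm{C}_l\mathrm{F}_l\dot V_2}{\|\dot V_1\|_{\mathrm{F}_l^{T}\mathrm{C}_l\mathrm{F}_l}\|\dot V_2\|_{\mathrm{F}_l^{T}\mathrm{C}_l\mathrm{F}_l}}=\frac{\dot V_1^{T}\,\mathrm{diag}(\Lambda_1^2,\Lambda_2^2)\,\dot V_2}{\|\dot V_1\|_{\mathrm{D}}\|\dot V_2\|_{\mathrm{D}}}. \tag{100}
\]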


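Box 12 (Relative angular shear).

Left Cauchy–Green matrix versus right Cauchy–Green matrix:

\[
\mathrm{C}_l=\begin{pmatrix}r^2\cos^2\Phi&0\\0&r^2\end{pmatrix}
\quad\text{versus}\quad
\mathrm{C}_r=\begin{pmatrix}\dfrac{A_1^2\cos^2\phi}{1-E^2\sin^2\phi}&0\\[2mm]0&\dfrac{A_1^2(1-E^2)^2}{(1-E^2\sin^2\phi)^3}\end{pmatrix}. \tag{101}
\]

Left angular shear versus right angular shear:

\[
\cos\Psi_l=\frac{\dot u_1^{T}\mathrm{C}_r\dot u_2}{\|\dot u_1\|_{\mathrm{C}_r}\|\dot u_2\|_{\mathrm{C}_r}},\quad \cos\Psi_l\sim[1,0]\,\mathrm{C}_r\begin{pmatrix}0\\1\end{pmatrix}=0,\quad \cos\Psi_l=0\iff\Psi_l=\pm\frac{\pi}{2}
\]
\[
\text{versus}\quad
\cos\Psi_r=\frac{\dot U_1^{T}\mathrm{C}_l\dot U_2}{\|\dot U_1\|_{\mathrm{C}_l}\|\dot U_2\|_{\mathrm{C}_l}},\quad \cos\Psi_r\sim[1,0]\,\mathrm{C}_l\begin{pmatrix}0\\1\end{pmatrix}=0,\quad \cos\Psi_r=0\iff\Psi_r=\pm\frac{\pi}{2}. \tag{102}
\]

Left relative angular shear versus right relative angular shear:

\[
Q_l:=\frac{\cos\Psi_r}{\cos\Psi_l}=1 \quad\text{versus}\quad Q_r:=\frac{\cos\Psi_l}{\cos\Psi_r}=1. \tag{103}
\]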

Fig. 16 Stab–Werner pseudoconic projection, with shorelines of a spherical earth, equidistant mapping of the Greenwich meridian, Tissot ellipses of distortion, "cordiform mapping" (Johannes Werner: Libellus de quatuor terrarum orbis in plano figurationibus. Nova translatio primi libri geographiae Cl. Ptolemaei (Latin), Nuremberg 1514, "designed after instructions by Johann Stabius"; first map by Petrus Apianus, World Map of Ingolstadt)

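9 Equivalence Theorem of Conformal Mapping

Experience proves that anyone who studied geometry is infinitely quicker to grasp difficult subjects than one who has not. (Plato. The Republic Book 7, 375 B. C.)

The equivalence theorem of conformal mapping from the left to the right two-dimensional Riemann manifold (conformeomorphism), generalized Korn–Lichtenstein equations.

We shall define conformeomorphism as well as angular shear and shall present the equivalence theorem, which relates conformeomorphism to a special structure of the Cauchy–Green deformation tensor, the Euler–Lagrange deformation tensor, the left and right principal stretches (left and right eigenvalues), as well as dilatations, before we are led to the generalized Korn–Lichtenstein equations, which govern any conformal mapping: compare with Definition 1 and Theorem 2. For a further motivation, we refer to Figs. 17 and 18, which present an image of Lichtenstein's original publication, "Zur Theorie der konformen Abbildung."

Definition 1 (Conformal mapping). An orientation-preserving diffeomorphism $f:\mathbb{M}^2_l\to\mathbb{M}^2_r$ is called angle-preserving conformal mapping (conformeomorphism, inner product preserving) if $\Psi_l=\Psi_r$ and $\Sigma_l=\Sigma=0\iff\Sigma_r=\sigma=0$ holds for all points of $\mathbb{M}^2_l$ and $\mathbb{M}^2_r$, respectively.

Theorem 2 (Conformeomorphism $\mathbb{M}^2_l\to\mathbb{M}^2_r$, conformal mapping). Let $f:\mathbb{M}^2_l\to\mathbb{M}^2_r$ be an orientation-preserving conformal mapping. Then, the following conditions (i)–(iv) are equivalent:

\[
\text{(i)}\quad \Psi_l(\dot U_1,\dot U_2)=\Psi_r(\dot u_1,\dot u_2) \tag{104}
\]

for all tangent vectors $\{\dot U_1,\dot U_2\}$ and their images $\{\dot u_1,\dot u_2\}$, respectively;

\[
\text{(ii)}\quad
\begin{aligned}
\mathrm{C}_l&=\Lambda^2(U_0)\,\mathrm{G}_l,\ \ \mathrm{C}_l\mathrm{G}_l^{-1}=\Lambda^2(U_0)\,\mathrm{I}_2 &&\text{versus}& \mathrm{C}_r&=\lambda^2(u_0)\,\mathrm{G}_r,\ \ \mathrm{C}_r\mathrm{G}_r^{-1}=\lambda^2(u_0)\,\mathrm{I}_2,\\
\mathrm{E}_l&=K(U_0)\,\mathrm{G}_l,\ \ \mathrm{E}_l\mathrm{G}_l^{-1}=K(U_0)\,\mathrm{I}_2 &&\text{versus}& \mathrm{E}_r&=\kappa(u_0)\,\mathrm{G}_r,\ \ \mathrm{E}_r\mathrm{G}_r^{-1}=\kappa(u_0)\,\mathrm{I}_2;
\end{aligned}
\tag{105}
\]

\[
\text{(iii)}\quad
\begin{aligned}
K&=(\Lambda^2-1)/2,\ \ \Lambda^2=2K+1 &&\text{versus}& \kappa&=(1-\lambda^2)/2,\ \ \lambda^2=1-2\kappa,\\
\Lambda_1&=\Lambda_2=\Lambda(U_0) &&\text{versus}& \lambda_1&=\lambda_2=\lambda(u_0),
\end{aligned}
\tag{106}
\]

\[
\begin{aligned}
K_1&=K_2=K(U_0) &&\text{versus}& \kappa_1&=\kappa_2=\kappa(u_0),\\
\Lambda^2(U_0)&=\tfrac12\,\mathrm{tr}\,[\mathrm{C}_l\mathrm{G}_l^{-1}] &&\text{versus}& \lambda^2(u_0)&=\tfrac12\,\mathrm{tr}\,[\mathrm{C}_r\mathrm{G}_r^{-1}];\\
\text{left dilatation: } K&=\tfrac12\,\mathrm{tr}\,[\mathrm{E}_l\mathrm{G}_l^{-1}] &&\text{versus}& \text{right dilatation: } \kappa&=\tfrac12\,\mathrm{tr}\,[\mathrm{E}_r\mathrm{G}_r^{-1}],\\
\mathrm{tr}\,[\mathrm{C}_l\mathrm{G}_l^{-1}]&=2\sqrt{\det[\mathrm{C}_l\mathrm{G}_l^{-1}]} &&\text{versus}& \mathrm{tr}\,[\mathrm{C}_r\mathrm{G}_r^{-1}]&=2\sqrt{\det[\mathrm{C}_r\mathrm{G}_r^{-1}]},\\
\mathrm{tr}\,[\mathrm{E}_l\mathrm{G}_l^{-1}]&=2\sqrt{\det[\mathrm{E}_l\mathrm{G}_l^{-1}]} &&\text{versus}& \mathrm{tr}\,[\mathrm{E}_r\mathrm{G}_r^{-1}]&=2\sqrt{\det[\mathrm{E}_r\mathrm{G}_r^{-1}]};
\end{aligned}
\tag{107}
\]

(iv) generalized Korn–Lichtenstein equations (special case: $g_{12}=0$):

\[
\begin{pmatrix}u_U\\u_V\end{pmatrix}=\frac{1}{\sqrt{G_{11}G_{22}-G_{12}^2}}\sqrt{\frac{g_{22}}{g_{11}}}\begin{pmatrix}-G_{12}&G_{11}\\-G_{22}&G_{12}\end{pmatrix}\begin{pmatrix}v_U\\v_V\end{pmatrix}, \tag{108}
\]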

subject to the integrability conditions $u_{UV}=u_{VU}$ and $v_{UV}=v_{VU}$.

Before we present the sketches of proofs for the various conditions, it has to be noted that the generalized Korn–Lichtenstein equations, which govern the conformal mapping $\mathbb{M}^2_l\to\mathbb{M}^2_r$, suffer from the defect that they contain the unknown functions $g_{11}[u^\lambda(U^\Lambda)]$ and $g_{22}[u^\lambda(U^\Lambda)]$, and the reason is that the mapping functions $u^\lambda(U^\Lambda)$ have to be determined. In case of $\{\mathbb{M}^2_r,g_{\mu\nu}\}=\{\mathbb{R}^2,\delta_{\mu\nu}\}$, the corresponding Korn–Lichtenstein equations do not suffer, since these functions do not appear. The stated problem is overcome by representing the right Riemann manifold $\mathbb{M}^2_r$ by isometric coordinates (also called conformal coordinates or isothermal coordinates) directly, such that the quotient $g_{22}/g_{11}$ is identical to 1. This is exactly the procedure advocated by C. F. Gauss (1822, 1844) and applied to the conformal mapping of $\mathbb{E}^2$ onto $\mathbb{S}^2_r$. We shall come back to this point of view after the proof.

Fig. 17 Lichtenstein, L.: Zur Theorie der konformen Abbildung. Konforme Abbildung nichtanalytischer singularitätenfreier Flächenstücke auf ebene Gebiete (Bull. Int. Acad. Sci. Cracovie, Classe des Sciences, Math. et Natur., Serie A, pp. 192–217, Cracovie 1916). Part 1

Fig. 18 Lichtenstein, L.: Zur Theorie der konformen Abbildung. Konforme Abbildung nichtanalytischer singularitätenfreier Flächenstücke auf ebene Gebiete (Bull. Int. Acad. Sci. Cracovie, Classe des Sciences, Math. et Natur., Serie A, pp. 192–217, Cracovie 1916). Part 2

Proof (first part).

(i) ⇒ (ii):

\[
\Psi_l=\Psi_r\ \to\ \cos\Psi_l=\cos\Psi_r \iff U_1'^{\,T}\mathrm{G}_lU_2'=u_1'^{\,T}\mathrm{G}_ru_2'
\iff du_1^{T}\mathrm{J}_r^{T}\mathrm{G}_l\mathrm{J}_r\,du_2=\frac{dS_1}{ds_1}\frac{dS_2}{ds_2}\,du_1^{T}\mathrm{G}_r\,du_2
\]
\[
\iff du_1^{T}\mathrm{C}_r\,du_2=\lambda_1\lambda_2\,du_1^{T}\mathrm{G}_r\,du_2 \iff \lambda_1=\lambda_2=\lambda(u_0),\ \ \mathrm{C}_r=\lambda^2(u_0)\,\mathrm{G}_r \quad\text{q.e.d.} \tag{109}
\]

\[
\cos\Psi_r=\cos\Psi_l \iff u_1'^{\,T}\mathrm{G}_ru_2'=U_1'^{\,T}\mathrm{G}_lU_2'
\iff dU_1^{T}\mathrm{J}_l^{T}\mathrm{G}_r\mathrm{J}_l\,dU_2=\frac{ds_1}{dS_1}\frac{ds_2}{dS_2}\,dU_1^{T}\mathrm{G}_l\,dU_2
\]
\[
\iff \Lambda_1=\Lambda_2=\Lambda(U_0),\ \ \mathrm{C}_l=\Lambda^2(U_0)\,\mathrm{G}_l \quad\text{q.e.d.} \tag{110}
\]

(i) ⇐ (ii):

\[
\cos\Psi_l=U_1'^{\,T}\mathrm{G}_lU_2'=\frac{ds_1}{dS_1}\frac{ds_2}{dS_2}\,u_1'^{\,T}\mathrm{J}_r^{T}\mathrm{G}_l\mathrm{J}_r\,u_2'=u_1'^{\,T}\mathrm{G}_ru_2'=\cos\Psi_r,
\]
\[
\text{with}\ \ \mathrm{J}_r^{T}\mathrm{G}_l\mathrm{J}_r=\mathrm{C}_r=\lambda^2(u_0)\,\mathrm{G}_r,\ \ \Lambda_1=\Lambda_2=\lambda^{-1},\ \ \text{and orientation preserved}\ \Rightarrow\ \Psi_l=\Psi_r \quad\text{q.e.d.} \tag{111}
\]

Proof (second part).

(ii) ⇒ (iii):

Left eigenvalue problem:

\[
\mathrm{C}_l=\Lambda^2(U_0)\,\mathrm{G}_l,\ \ \mathrm{E}_l=K(U_0)\,\mathrm{G}_l \iff \Lambda^2(U_0)=\Lambda_1^2=\Lambda_2^2,\ \ K(U_0)=K_1=K_2. \tag{112}
\]

Right eigenvalue problem:

\[
\mathrm{C}_r=\lambda^2(u_0)\,\mathrm{G}_r,\ \ \mathrm{E}_r=\kappa(u_0)\,\mathrm{G}_r \iff \lambda^2(u_0)=\lambda_1^2=\lambda_2^2,\ \ \kappa(u_0)=\kappa_1=\kappa_2. \tag{113}
\]

(ii) ⇐ (iii):

\[
\begin{aligned}
&\Lambda_1^2=\Lambda_2^2=\Lambda^2(U_0),\ \ \mathrm{F}_l^{-T}\,\mathrm{diag}(\Lambda_1^2,\Lambda_2^2)\,\mathrm{F}_l^{-1}=\mathrm{C}_l,\ \ \mathrm{F}_l^{-T}\mathrm{F}_l^{-1}=\mathrm{G}_l\ \Rightarrow\ \mathrm{C}_l=\Lambda^2(U_0)\,\mathrm{G}_l;\\
&\lambda_1^2=\lambda_2^2=\lambda^2(u_0),\ \ \mathrm{F}_r^{-T}\,\mathrm{diag}(\lambda_1^2,\lambda_2^2)\,\mathrm{F}_r^{-1}=\mathrm{C}_r,\ \ \mathrm{F}_r^{-T}\mathrm{F}_r^{-1}=\mathrm{G}_r\ \Rightarrow\ \mathrm{C}_r=\lambda^2(u_0)\,\mathrm{G}_r.
\end{aligned}
\tag{114}
\]

The statements for the quantities $\mathrm{E}_l$, $\mathrm{E}_r$, $\mathrm{E}_l\mathrm{G}_l^{-1}$, $\mathrm{E}_r\mathrm{G}_r^{-1}$, $K$, $\kappa$, $\Lambda$, and $\lambda$ follow in the same way.

Proof (third part).

(ii) ⇒ (iv): In order to derive a linear system of partial differential equations for $\{u_U,u_V,v_U,v_V\}$, we depart from the inverse right Cauchy–Green deformation tensor $\mathrm{C}_r^{-1}$, since it contains just the above-quoted partials. For an inverse portrait involving the partials $\{U_u,U_v,V_u,V_v\}$, one would start from the inverse left Cauchy–Green deformation tensor, a procedure we are not following further.

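First step:

\[
\mathrm{C}_r^{-1}=\mathrm{J}_l\mathrm{G}_l^{-1}\mathrm{J}_l^{T}=\Lambda^2\mathrm{G}_r^{-1}
\iff
\begin{pmatrix}u_U&u_V\\v_U&v_V\end{pmatrix}\mathrm{G}_l^{-1}\begin{pmatrix}u_U&v_U\\u_V&v_V\end{pmatrix}=\frac{\Lambda^2}{\det\mathrm{G}_r}\begin{pmatrix}g_{22}&-g_{12}\\-g_{12}&g_{11}\end{pmatrix}
\]
\[
\Rightarrow\quad
\begin{aligned}
(\alpha)\ \ &x_1^{T}\mathrm{G}_l^{-1}x_1=+\frac{\Lambda^2}{\det\mathrm{G}_r}\,g_{22},\\
(\beta)\ \ &x_2^{T}\mathrm{G}_l^{-1}x_2=+\frac{\Lambda^2}{\det\mathrm{G}_r}\,g_{11},\\
(\gamma)\ \ &x_1^{T}\mathrm{G}_l^{-1}x_2=-\frac{\Lambda^2}{\det\mathrm{G}_r}\,g_{12},\\
(\delta)\ \ &x_2^{T}\mathrm{G}_l^{-1}x_1=-\frac{\Lambda^2}{\det\mathrm{G}_r}\,g_{12},
\end{aligned}
\qquad x_1:=\begin{pmatrix}u_U\\u_V\end{pmatrix},\ \ x_2:=\begin{pmatrix}v_U\\v_V\end{pmatrix}. \tag{115}
\]

Without loss of generality (see the remark that follows after the proof), let us here assume that the right two-dimensional Riemann manifold (i.e., the right parameterized surface) $\mathbb{M}^2_r$ is charted by orthogonal parameters (orthogonal coordinates), such that $g_{12}=0$ holds. Such a parameterization of a surface can always be achieved, though it might turn out to be a difficult numerical procedure.

Second step:

\[
(\delta)\ \ x_2^{T}\mathrm{G}_l^{-1}x_1=0,\quad \text{"Ansatz" } x_1=\mathrm{G}_l\mathrm{X}x_2\ \ (\mathrm{X}=\text{unknown matrix}) \iff (\varepsilon)\ \ x_2^{T}\mathrm{X}x_2=0\ \ \forall x_2\in\mathbb{R}^2. \tag{116}
\]

A quadratic form over the field of real numbers vanishes identically ("isotropic") if and only if $\mathrm{X}$ is antisymmetric, i.e., $\mathrm{X}=-\mathrm{X}^{T}$ (for a proof, we refer to A. Crumeyrolle (1990), Proposition 1.1.3):

\[
\text{"Ansatz" } \mathrm{X}=\mathrm{A}x\ \ \forall\,\mathrm{A}=-\mathrm{A}^{T},\qquad \mathrm{A}:=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\in\mathbb{R}^{2\times2},\quad x\in\mathbb{R}. \tag{117}
\]

Third step:

\[
\frac{1}{g_{22}}x_1^{T}\mathrm{G}_l^{-1}x_1=\frac{1}{g_{11}}x_2^{T}\mathrm{G}_l^{-1}x_2=\frac{\Lambda^2}{\det\mathrm{G}_r},\quad x_1=\mathrm{G}_l\mathrm{A}x\,x_2
\]
\[
\Rightarrow\ \frac{1}{g_{22}}x_1^{T}\mathrm{G}_l^{-1}x_1=\frac{1}{g_{22}}x_2^{T}\mathrm{A}^{T}\mathrm{G}_l\mathrm{A}\,x_2\,x^2=\frac{1}{g_{11}}x_2^{T}\mathrm{G}_l^{-1}x_2
\iff \frac{g_{11}}{g_{22}}\mathrm{A}^{T}\mathrm{G}_l\mathrm{A}\mathrm{G}_l\,x^2=\mathrm{I}
\]
\[
\iff x=\frac{1}{\sqrt{G_{11}G_{22}-G_{12}^2}}\sqrt{\frac{g_{22}}{g_{11}}}
\ \Rightarrow\ x_1=\mathrm{G}_l\mathrm{A}\,\frac{1}{\sqrt{G_{11}G_{22}-G_{12}^2}}\sqrt{\frac{g_{22}}{g_{11}}}\;x_2. \tag{118}
\]

The converse (iv) ⇒ (ii) is obvious.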
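As a plausibility check of the generalized Korn–Lichtenstein equations, one can insert a mapping that is known to be conformal. The sketch below is an illustration, not part of the proof: it assumes the Mercator projection of a sphere of radius $r$ onto the plane, with left coordinates $(U,V)=(\Lambda,\Phi)$, left metric $\mathrm{G}_l=\mathrm{diag}(r^2\cos^2\Phi,\,r^2)$, and right metric $g_{11}=g_{22}=1$, $g_{12}=0$. All function names are hypothetical, and the partial derivatives are approximated by central differences.

```python
import math

def kl_right_hand_side(G, g11, g22, v_U, v_V):
    """Right-hand side of the generalized KL equations (special case g12 = 0)."""
    (G11, G12), (_, G22) = G
    factor = math.sqrt(g22 / g11) / math.sqrt(G11 * G22 - G12 * G12)
    return (factor * (-G12 * v_U + G11 * v_V),
            factor * (-G22 * v_U + G12 * v_V))

def mercator(lam, phi, r=1.0):
    """Mercator mapping (u, v) of spherical longitude/latitude."""
    return r * lam, r * math.log(math.tan(math.pi / 4.0 + phi / 2.0))

def partials(f, lam, phi, h=1e-6):
    """Central-difference partials (u_U, u_V, v_U, v_V) of a mapping f."""
    u_p, v_p = f(lam + h, phi)
    u_m, v_m = f(lam - h, phi)
    u_q, v_q = f(lam, phi + h)
    u_n, v_n = f(lam, phi - h)
    return ((u_p - u_m) / (2 * h), (u_q - u_n) / (2 * h),
            (v_p - v_m) / (2 * h), (v_q - v_n) / (2 * h))

def kl_residual(lam, phi, r=1.0):
    """Difference between (u_U, u_V) and the KL right-hand side."""
    G = ((r * r * math.cos(phi) ** 2, 0.0), (0.0, r * r))
    u_U, u_V, v_U, v_V = partials(lambda a, b: mercator(a, b, r), lam, phi)
    rhs_U, rhs_V = kl_right_hand_side(G, 1.0, 1.0, v_U, v_V)
    return u_U - rhs_U, u_V - rhs_V

if __name__ == "__main__":
    print(kl_residual(0.4, 0.9, r=2.0))  # both residuals are close to zero
```

A non-conformal mapping, for instance the plate carrée $u=r\Lambda$, $v=r\Phi$, would leave a nonzero first residual away from the equator.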

Fig. 19 Flow chart conformal mapping $\mathbb{M}^2_l\to\mathbb{M}^2_r$: general parameters (general left coordinates) → special left Korn–Lichtenstein equations → left conformal coordinates (left isometric, left isothermal) → special Korn–Lichtenstein equations, Cauchy–Riemann equations (d'Alembert–Euler equations) → right conformal coordinates (right isometric, right isothermal) ← special right Korn–Lichtenstein equations ← general parameters (general right coordinates)

Here is the remark relating to $\mathrm{G}_r$ being diagonal, not unity, of course. An obvious generalization for solving $(\gamma)$ and $(\delta)$ for $g_{12}\neq0$ would be the

\[
\text{"Ansatz" } x_1=\mathrm{G}_l\mathrm{X}x_2,\qquad \mathrm{X}=\begin{pmatrix}y&x\\-x&-y\end{pmatrix}, \tag{119}
\]

the superposition of a diagonal trace-free matrix $\mathrm{diag}[y,-y]$ and the antisymmetric matrix $\mathrm{A}x$. Indeed, we succeed in determining the unknowns $x$ and $y$ according to the above steps but fail to arrive at linear relations between the partials $\{u_U,u_V,v_U,v_V\}$.

In practice, a different way of constructing a conformeomorphism $\mathbb{M}^2_l\to\mathbb{M}^2_r$ has been chosen. In Fig. 19, the alternative path of generating a conformal mapping from a left curved surface to a right curved surface is outlined. First, the original coordinates $\{U^1,U^2\}$ or $\{U,V\}$, which parameterize the left surface, are transformed to alternative left conformal coordinates $\{P,Q\}$, which are also called isometric or isothermal. Indeed, the left differential invariant $\mathrm{I}_l\sim dS^2=\Lambda^2(dP^2+dQ^2)$ is described by identical metric coefficients $G_{PP}=G_{QQ}=\Lambda^2$ and $G_{PQ}=0$. Second, the original coordinates $\{u^1,u^2\}$ or $\{u,v\}$, which parameterize the right surface, are transformed to alternative right conformal coordinates $\{p,q\}$, which are also called isometric or isothermal. Indeed, the right differential invariant $\mathrm{I}_r\sim ds^2=\lambda^2(dp^2+dq^2)$ is described by identical metric coefficients $g_{pp}=g_{qq}=\lambda^2$ and $g_{pq}=0$. Third, the left conformal coordinates $\{P,Q\}$ are transformed into right conformal coordinates by solving the special Korn–Lichtenstein equations for $\mathbb{M}^2_l\{P,Q\,|\,\mathrm{G}_l=\Lambda^2\mathrm{I}_2\}\to\mathbb{M}^2_r\{p,q\,|\,\mathrm{G}_r=\lambda^2\mathrm{I}_2\}$, which are called Cauchy–Riemann (d'Alembert–Euler) equations, subject to an integrability condition. The integrability condition turns out to be the vector-valued Laplace equation of harmonicity as stated in the following theorem and proven later on.

Theorem 3 (Conformeomorphism $\mathbb{M}^2_l\to\mathbb{M}^2_r$, conformal mapping). An orientation-preserving conformal mapping $\mathbb{M}^2_l\to\mathbb{M}^2_r$ can be constructed in three steps by solving special Korn–Lichtenstein equations.

First step or left step. The left Riemann manifold $\mathbb{M}^2_l\{U^1,U^2\,|\,\mathrm{G}_l\}$, which is called left surface, is parameterized by general left parameters (general left coordinates) $\{U^1,U^2\}$ or $\{U,V\}$. The solution of the following special Korn–Lichtenstein equations (i), subject to the following integrability conditions of harmonicity (ii) and orientation conservation (iii), is needed:

(i) Special KL:

\[
\begin{pmatrix}P_U\\P_V\end{pmatrix}=\frac{1}{\sqrt{G_{11}G_{22}-G_{12}^2}}\begin{pmatrix}-G_{12}&G_{11}\\-G_{22}&G_{12}\end{pmatrix}\begin{pmatrix}Q_U\\Q_V\end{pmatrix},\quad
P_U=\frac{-G_{12}Q_U+G_{11}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}},\quad
P_V=\frac{-G_{22}Q_U+G_{12}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}. \tag{120}
\]

(ii) Left integrability:

\[
P_{UV}=P_{VU}\quad\text{and}\quad Q_{UV}=Q_{VU} \tag{121}
\]

or (in terms of the Laplace–Beltrami operator)

\[
\begin{aligned}
\Delta_{UV}P&:=\left(\frac{G_{11}P_V-G_{12}P_U}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{V}+\left(\frac{G_{22}P_U-G_{12}P_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{U}=0,\\
\Delta_{UV}Q&:=\left(\frac{G_{11}Q_V-G_{12}Q_U}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{V}+\left(\frac{G_{22}Q_U-G_{12}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{U}=0.
\end{aligned}
\tag{122}
\]

(iii) Left orientation conservation:

\[
\begin{vmatrix}P_U&P_V\\Q_U&Q_V\end{vmatrix}=P_UQ_V-P_VQ_U>0. \tag{123}
\]

Note that the coordinates $P$ and $Q$ are the left conformal coordinates, which are also called isometric or isothermal.

Second step or right step. The right Riemann manifold $\mathbb{M}^2_r\{u^1,u^2\,|\,\mathrm{G}_r\}$, which is called right surface, is parameterized by general right parameters (general right coordinates) $\{u^1,u^2\}$ or $\{u,v\}$. The solution of the following special Korn–Lichtenstein equations (i), subject to the following integrability conditions of harmonicity (ii) and orientation conservation (iii), is needed:

(i) Special KL:

\[
\begin{pmatrix}p_u\\p_v\end{pmatrix}=\frac{1}{\sqrt{g_{11}g_{22}-g_{12}^2}}\begin{pmatrix}-g_{12}&g_{11}\\-g_{22}&g_{12}\end{pmatrix}\begin{pmatrix}q_u\\q_v\end{pmatrix},\quad
p_u=\frac{-g_{12}q_u+g_{11}q_v}{\sqrt{g_{11}g_{22}-g_{12}^2}},\quad
p_v=\frac{-g_{22}q_u+g_{12}q_v}{\sqrt{g_{11}g_{22}-g_{12}^2}}. \tag{124}
\]

(ii) Right integrability:

\[
p_{uv}=p_{vu}\quad\text{and}\quad q_{uv}=q_{vu} \tag{125}
\]

or (in terms of the Laplace–Beltrami operator)

\[
\begin{aligned}
\Delta_{uv}p&:=\left(\frac{g_{11}p_v-g_{12}p_u}{\sqrt{g_{11}g_{22}-g_{12}^2}}\right)_{v}+\left(\frac{g_{22}p_u-g_{12}p_v}{\sqrt{g_{11}g_{22}-g_{12}^2}}\right)_{u}=0,\\
\Delta_{uv}q&:=\left(\frac{g_{11}q_v-g_{12}q_u}{\sqrt{g_{11}g_{22}-g_{12}^2}}\right)_{v}+\left(\frac{g_{22}q_u-g_{12}q_v}{\sqrt{g_{11}g_{22}-g_{12}^2}}\right)_{u}=0.
\end{aligned}
\tag{126}
\]

(iii) Right orientation conservation:

\[
\begin{vmatrix}p_u&p_v\\q_u&q_v\end{vmatrix}=p_uq_v-p_vq_u>0. \tag{127}
\]

Third step (left–right). The left Riemann manifold $\mathbb{M}^2_l\{P,Q\,|\,\Lambda^2\mathrm{I}_2\}$, which here is called left surface and is parameterized in left conformal coordinates $\{P,Q\}$, is mapped conformally and with preserved orientation onto the right Riemann manifold $\mathbb{M}^2_r\{p,q\,|\,\lambda^2\mathrm{I}_2\}$, which here is called right surface and is parameterized in right conformal coordinates $\{p,q\}$, if the following special Korn–Lichtenstein equations (i) (called Cauchy–Riemann (or d'Alembert–Euler) equations), subject to the following integrability conditions of harmonicity (ii) and orientation conservation (iii), are solved:

(i) Special KL (Cauchy–Riemann, d'Alembert–Euler):

\[
\begin{pmatrix}p_P\\p_Q\end{pmatrix}=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\begin{pmatrix}q_P\\q_Q\end{pmatrix},\qquad p_P=q_Q\quad\text{and}\quad p_Q=-q_P. \tag{128}
\]

(ii) Right integrability:

\[
p_{PQ}=p_{QP}\quad\text{and}\quad q_{PQ}=q_{QP} \tag{129}
\]

or (in terms of the Laplace–Beltrami operator)

\[
\begin{aligned}
\Delta_{PQ}\,p&:=p_{PP}+p_{QQ}=\left(\frac{\partial^2}{\partial P^2}+\frac{\partial^2}{\partial Q^2}\right)p(P,Q)=0,\\
\Delta_{PQ}\,q&:=q_{PP}+q_{QQ}=\left(\frac{\partial^2}{\partial P^2}+\frac{\partial^2}{\partial Q^2}\right)q(P,Q)=0.
\end{aligned}
\tag{130}
\]

(iii) Left–right orientation conservation:

\[
\begin{vmatrix}p_P&p_Q\\q_P&q_Q\end{vmatrix}=p_Pq_Q-p_Qq_P>0. \tag{131}
\]

The special Korn–Lichtenstein equations, which govern as Cauchy–Riemann (or d'Alembert–Euler) equations any harmonic and orientation-preserving conformal mapping $\mathbb{M}^2_l(P,Q)\to\mathbb{M}^2_r(p,q)$, are uniquely solvable if a proper boundary value problem is formulated.
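The three conditions of the third step can be illustrated with any holomorphic function. The sketch below is an illustration under stated assumptions: it takes $p+\mathrm{i}q=\exp(P+\mathrm{i}Q)$ as a sample mapping (not one discussed in the text), approximates derivatives by finite differences, and uses hypothetical function names.

```python
import math

def conformal_map(P, Q):
    """Sample mapping p + i q = exp(P + i Q)."""
    return math.exp(P) * math.cos(Q), math.exp(P) * math.sin(Q)

def first_partials(f, P, Q, h=1e-5):
    """Central-difference partials (p_P, p_Q, q_P, q_Q)."""
    p1, q1 = f(P + h, Q)
    p2, q2 = f(P - h, Q)
    p3, q3 = f(P, Q + h)
    p4, q4 = f(P, Q - h)
    return ((p1 - p2) / (2 * h), (p3 - p4) / (2 * h),
            (q1 - q2) / (2 * h), (q3 - q4) / (2 * h))

def laplacians(f, P, Q, h=1e-4):
    """Five-point approximations of p_PP + p_QQ and q_PP + q_QQ."""
    p0, q0 = f(P, Q)
    neighbours = [f(P + h, Q), f(P - h, Q), f(P, Q + h), f(P, Q - h)]
    lap_p = (sum(n[0] for n in neighbours) - 4.0 * p0) / (h * h)
    lap_q = (sum(n[1] for n in neighbours) - 4.0 * q0) / (h * h)
    return lap_p, lap_q

def check_conditions(P, Q):
    """Return Cauchy-Riemann residuals, the Jacobian determinant, and the Laplacians."""
    p_P, p_Q, q_P, q_Q = first_partials(conformal_map, P, Q)
    lap_p, lap_q = laplacians(conformal_map, P, Q)
    return {
        "cr1": p_P - q_Q,                  # should vanish: p_P = q_Q
        "cr2": p_Q + q_P,                  # should vanish: p_Q = -q_P
        "jacobian": p_P * q_Q - p_Q * q_P, # should be positive
        "lap_p": lap_p,                    # should vanish (harmonicity)
        "lap_q": lap_q,
    }

if __name__ == "__main__":
    for name, value in check_conditions(0.3, 1.1).items():
        print(name, value)
```

For this sample mapping the Jacobian determinant equals $e^{2P}$, so the orientation condition holds everywhere.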


The proof of the operational theorem of conformal mapping $\mathbb{M}^2_l\to\mathbb{M}^2_r$ rests upon the existence theorem of S. S. Chern (1955), where it is shown that under rather mild regularity assumptions, namely, $C^{2,x}$, conformal coordinates (isometric coordinates, isothermal coordinates) exist as solutions of the left or right Korn–Lichtenstein equations. Let us here also refer to the following authors: J. L. Lagrange (1781), C. F. Gauss (1822, 1844), C. G. J. Jacobi (1839), J. Liouville (1850), B. Riemann (1851), E. Schering (1857), H. Weber (1867), L. Krueger (1903, 1922), L. Lichtenstein (1911, 1916), A. Korn (1914), G. Ricci (1918), H. Weyl (1918, 1921), J. A. Schouten (1921), A. Finzi (1922), H. Schmehl (1927), L. P. Eisenhart (1949), N. Kuiper (1949, 1950), K. Konig and K. H. Weise (1951), P. Hartman and A. Wintner (1954), A. I. Markuschewitsch (1955), L. Mirsky (1960), M. Zadro and A. Carminelli (1966), S. S. Chern (1967), S. K. Mitra and C. R. Rao (1968), E. M. Stein and G. Weiss (1968), H. Samelson (1969), R. S. Kulkarni (1969, 1972), G. M. Lancaster (1969, 1973), D. G. L. Boulware, L. S. Brown and R. D. Peccei (1970), J. P. Bourguignon (1970), K. Yano (1970), S. Ferrara, A. F. Grillo and R. Gatto (1972), W. Blaschke and K. Leichtweiß (1973), B. Y. Chen (1973), B. Y. Chen and K. Yano (1973), S. Nishikawa (1974), T. Wray (1974), S. S. Chern, L. Euler (1755, 1777), J. D. Moore (1977), C. W. Misner (1978), M. Spivak (1979), W. Klingenberg (1982), A. I. Yanushauskas (1982), R. Schoen (1984), M. DoCarmo, M. Dajczer and F. Mercuri (1985), J. Zund (1987), S. Heitz (1988), R. S. Kulkarni and U. Pinkall (1988), J. Lafontaine (1988), B. Moor and H. Zha (1991), and H. Goenner, E. Grafarend and R. J. You (1994).

Proof (sketch of the proof for the first step). The special KL equations generate a conformal mapping, $\mathbb{M}^2_l(U,V\,|\,\mathrm{G}_l)\to\mathbb{M}^2_l(P,Q\,|\,\Lambda^2\mathrm{I}_2)$, namely, a conformal coordinate transformation from general left coordinates $\{U,V\}$ to left conformal coordinates $\{P,Q\}$. The left matrix of the metric, i.e., the matrix $\mathrm{G}_l$, is transformed to the left matrix of the conformally flat metric, $\Lambda^2\mathrm{I}_2$. Up to the factor of conformality, $\Lambda^2(P,Q)$, the transformed matrix of the metric is a unit matrix, $\mathrm{I}_2$. Here, we only outline how the integrability conditions, $P_{UV}=P_{VU}$ and $Q_{UV}=Q_{VU}$, are converted to the Laplace–Beltrami equation.

KL, first equation and second equation, lead to

\[
P_{UV}=\left(\frac{-G_{12}Q_U+G_{11}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{V},\qquad
P_{VU}=\left(\frac{-G_{22}Q_U+G_{12}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{U}; \tag{132}
\]

\[
\text{first: } P_{UV}=P_{VU}\iff\left(\frac{G_{11}Q_V-G_{12}Q_U}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{V}=-\left(\frac{G_{22}Q_U-G_{12}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{U}
\]
\[
\Rightarrow\ \left(\frac{G_{11}Q_V-G_{12}Q_U}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{V}+\left(\frac{G_{22}Q_U-G_{12}Q_V}{\sqrt{G_{11}G_{22}-G_{12}^2}}\right)_{U}=0. \tag{133}
\]
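The KL matrix equation is inverted to

\[ \begin{bmatrix} Q_U \\ Q_V \end{bmatrix} = \frac{1}{\sqrt{G_{11}G_{22} - G_{12}^2}} \begin{bmatrix} G_{12} & -G_{11} \\ G_{22} & -G_{12} \end{bmatrix} \begin{bmatrix} P_U \\ P_V \end{bmatrix}; \qquad Q_U = \frac{G_{12} P_U - G_{11} P_V}{\sqrt{G_{11}G_{22} - G_{12}^2}}, \quad Q_V = \frac{G_{22} P_U - G_{12} P_V}{\sqrt{G_{11}G_{22} - G_{12}^2}} . \tag{134} \]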

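The inverted KL equations lead to

\[ Q_{UV} = -\left( \frac{G_{11} P_V - G_{12} P_U}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_V , \qquad Q_{VU} = \left( \frac{G_{22} P_U - G_{12} P_V}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_U . \tag{135} \]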






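second: $Q_{UV} = Q_{VU}$

\[ \iff -\left( \frac{G_{11} P_V - G_{12} P_U}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_V = \left( \frac{G_{22} P_U - G_{12} P_V}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_U \;\Rightarrow\; \left( \frac{G_{11} P_V - G_{12} P_U}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_V + \left( \frac{G_{22} P_U - G_{12} P_V}{\sqrt{G_{11}G_{22} - G_{12}^2}} \right)_U = 0 . \tag{136} \]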
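As a hedged numerical illustration of the first step, the following sketch checks the left KL equations, taken in the form $P_U = (-G_{12}Q_U + G_{11}Q_V)/\sqrt{G_{11}G_{22}-G_{12}^2}$ and $P_V = (-G_{22}Q_U + G_{12}Q_V)/\sqrt{G_{11}G_{22}-G_{12}^2}$, on a sphere of radius R. The choice of left coordinates (longitude, latitude), the candidate conformal coordinates P = longitude and Q = Mercator isometric latitude, and all function names are assumptions of this sketch, not taken from the text.

```python
import math

def isometric_latitude(phi):
    """Mercator isometric latitude ln tan(pi/4 + phi/2) on the sphere."""
    return math.log(math.tan(math.pi / 4 + phi / 2))

def kl_residuals(lam, phi, R=1.0, h=1e-6):
    """Residuals of the two left KL equations at (U, V) = (lam, phi)."""
    # Metric of the sphere in longitude/latitude:
    # G11 = R^2 cos^2(phi), G12 = 0, G22 = R^2.
    G11, G12, G22 = (R * math.cos(phi)) ** 2, 0.0, R ** 2
    root = math.sqrt(G11 * G22 - G12 ** 2)

    # Candidate conformal coordinates (an assumption of this sketch).
    P = lambda U, V: U
    Q = lambda U, V: isometric_latitude(V)

    def d_dU(F):  # central difference with respect to U
        return (F(lam + h, phi) - F(lam - h, phi)) / (2 * h)

    def d_dV(F):  # central difference with respect to V
        return (F(lam, phi + h) - F(lam, phi - h)) / (2 * h)

    P_U, P_V, Q_U, Q_V = d_dU(P), d_dV(P), d_dU(Q), d_dV(Q)
    r1 = P_U - (-G12 * Q_U + G11 * Q_V) / root
    r2 = P_V - (-G22 * Q_U + G12 * Q_V) / root
    return r1, r2

# Both residuals should vanish up to finite-difference error.
print(kl_residuals(0.3, 0.7))
```

Because both residuals vanish, the pair (longitude, isometric latitude) is one admissible set of conformal coordinates on the sphere under these assumptions.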

Proof (sketch of the proof for the second step). The special KL equations generate the conformal mapping $M^2_r(u,v \mid G_r) \to M^2_r(p,q \mid \lambda^2 I_2)$, a conformal coordinate transformation from general right coordinates $\{u,v\}$ to right conformal coordinates $\{p,q\}$. The right matrix of the metric, $G_r$, is transformed to the right matrix of the conformally flat metric, $\lambda^2 I_2$. Up to the factor of conformality, $\lambda^2(p,q)$, the transformed matrix of the metric is a unit matrix, $I_2$. Here, we only outline how the integrability conditions, $p_{uv} = p_{vu}$ and $q_{uv} = q_{vu}$, are converted to the Laplace–Beltrami equation. KL, first equation and second equation, lead to

\[ p_{uv} = \left( \frac{-g_{12} q_u + g_{11} q_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v , \qquad p_{vu} = \left( \frac{-g_{22} q_u + g_{12} q_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u . \tag{137} \]

first: $p_{uv} = p_{vu}$

\[ \iff \left( \frac{g_{11} q_v - g_{12} q_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v = \left( \frac{g_{12} q_v - g_{22} q_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u \;\Rightarrow\; \left( \frac{g_{11} q_v - g_{12} q_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v + \left( \frac{g_{22} q_u - g_{12} q_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u = 0 . \tag{138} \]

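The KL matrix equation is inverted to

\[ \begin{bmatrix} q_u \\ q_v \end{bmatrix} = \frac{1}{\sqrt{g_{11}g_{22} - g_{12}^2}} \begin{bmatrix} g_{12} & -g_{11} \\ g_{22} & -g_{12} \end{bmatrix} \begin{bmatrix} p_u \\ p_v \end{bmatrix}; \qquad q_u = \frac{g_{12} p_u - g_{11} p_v}{\sqrt{g_{11}g_{22} - g_{12}^2}}, \quad q_v = \frac{g_{22} p_u - g_{12} p_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} . \tag{139} \]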

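The inverted KL equations lead to

\[ q_{uv} = -\left( \frac{g_{11} p_v - g_{12} p_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v , \qquad q_{vu} = \left( \frac{g_{22} p_u - g_{12} p_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u . \tag{140} \]

second: $q_{uv} = q_{vu}$

\[ \iff -\left( \frac{g_{11} p_v - g_{12} p_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v = \left( \frac{g_{22} p_u - g_{12} p_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u \;\Rightarrow\; \left( \frac{g_{11} p_v - g_{12} p_u}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_v + \left( \frac{g_{22} p_u - g_{12} p_v}{\sqrt{g_{11}g_{22} - g_{12}^2}} \right)_u = 0 . \tag{141} \]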

Proof (sketch of the proof for the third step). The special KL equations generate a conformal mapping $M^2_l(P,Q \mid \Lambda^2 I_2) \to M^2_r(p,q \mid \lambda^2 I_2)$, namely, a conformal transformation from left conformal (isometric, isothermal) coordinates $\{P,Q\}$ to right conformal (isometric, isothermal) coordinates $\{p,q\}$. The left matrix of the conformally flat metric, $\Lambda^2 I_2$, is transformed to the right matrix of the conformally flat metric, $\lambda^2 I_2$. Up to the factors of conformality, $\Lambda^2(P,Q)$ and $\lambda^2(p,q)$, the left and right matrices of the metric are unit matrices, $I_2$. Here, we only outline how the integrability conditions, $p_{PQ} = p_{QP}$ and $q_{PQ} = q_{QP}$, are converted to the special Laplace–Beltrami equation. KL, first equation and second equation, lead to the following relations:


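first: $p_{PQ} = p_{QP}$; with $p_{PQ} = q_{QQ}$ and $p_{QP} = -q_{PP}$,

\[ p_{PQ} = p_{QP} \iff q_{QQ} = -q_{PP} \;\Rightarrow\; q_{PP} + q_{QQ} = 0 . \tag{142} \]

second: $q_{PQ} = q_{QP}$; with $q_{QP} = p_{PP}$ and $q_{PQ} = -p_{QQ}$,

\[ q_{QP} = q_{PQ} \iff p_{PP} = -p_{QQ} \;\Rightarrow\; p_{PP} + p_{QQ} = 0 . \tag{143} \]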

This concludes the proofs. Note that a more elegant proof of the Korn–Lichtenstein equations, based upon exterior calculus, has been presented by E. Grafarend and R. Syffus (1998b). In addition, the authors succeeded in generalizing the fundamental differential equations that govern a conformeomorphism to an arbitrary number of dimensions n (for n = 3, they coincide with the Zund equations (J. Zund 1987) from $M^3_l$ to $M^3_r$), namely, for the mapping left (pseudo-)Riemann manifold $M^n_l \to$ right (pseudo-)Riemann manifold $M^n_r := E^{r,s}$ $(r + s = n)$. In general, conformal mappings from an arbitrary left (pseudo-)Riemann manifold $M^n_l$ to an arbitrary right (pseudo-)Riemann manifold $M^n_r$ do not exist. The dimension n = 2 is an exception: here conformal mappings always exist, though they may be difficult to find. For instance, owing to the difficulties involved, the Philosophical Faculty of the University of Goettingen Georgia Augusta (dated 13 June 1857) set up the “Preisaufgabe” to find a conformal mapping of the triaxial ellipsoid, which had already been parameterized by C. F. Gauss in terms of “surface normal coordinates,” applying the “Gauss map.” Based upon Jacobi’s contributions to elliptic coordinates (C. G. J. Jacobi 1839), which separate the Laplace–Beltrami equations of harmonicity, the “Preisschrift” of E. Schering (1857) was finally crowned. It nevertheless left open the numerical problem of how to construct a conformal map of the triaxial ellipsoid, which remains an open problem up to now (H. Schmehl 1927; W. Klingenberg 1982; B. Mueller 1991). The dimension n = 3 is a special case that has to be treated separately. In contrast, for dimensions n > 3, a general statement can be made: A conformeomorphism exists if and only if the Weyl curvature tensor, being a curvature element of the Riemann curvature tensor, vanishes. We have given in Table 3 a list of related, commented references.
A typical case of the nonexistence of a conformeomorphism is provided by the following example.

Example 7 (Nonexistence of a conformeomorphism). In general relativity, the solutions of the Einstein gravitational field equations (for instance, the Schwarzschild metric) generate a Weyl curvature different from 0. Accordingly, the mapping of the pseudo-Riemann manifold of space time, $M^{3,1}_l(\text{space time}) \to M^{3,1}_r := \{R^{3,1}, \delta\}$, does not allow a conformal mapping to the pseudo-Euclidean manifold $\{R^{3,1}, I_4\}$, where $I_4 := \delta := \mathrm{diag}[1, 1, 1, -1]$. Note that the details referring to those authors are listed in Table 3.

Physical Aside. There is another interesting connection between the geometry of conformal mappings and the physical field equations, say, of gravitostatics, electrostatics, and magnetostatics. It turns out as a result of conformal field theory that the factor of conformality, $\Lambda^2$ or $\lambda^2$, respectively, corresponds to the gravitational potential, the electric potential, and the magnetic potential, a notion introduced by C. F. Gauss. A highlight has been the contribution of C. W. Misner (1978), who used the vector-valued four-dimensional Laplace–Beltrami equations (harmonic maps) as models of physical theories.


Table 3 Conformal mapping $M^2_l \to M^2_r$, commented references

W. Blaschke, K. Leichtweiß (1973): Two-dimensional conformal mapping, Korn–Lichtenstein equations: n = 2
J. P. Bourguignon (1970): n-dimensional conformal mapping, Weyl curvature: n ≥ 3
L. P. Eisenhart (1949): n-dimensional conformal mapping, Weyl curvature: n = arbitrary
A. Finzi (1922): Three-dimensional conformal mapping, generalized Korn–Lichtenstein equations: n = 3
C. F. Gauss (1822): Classical contribution on two-dimensional conformal mapping: n = 2
C. F. Gauss (1816–1827): Classical contribution on conformal mapping of the ellipsoid-of-revolution
E. Grafarend, R. Syffus (1998a): n-dimensional conformal mapping, generalized Korn–Lichtenstein equations
E. R. Hedrick, L. Ingold (1925a): Analytic functions in three dimensions
E. R. Hedrick, L. Ingold (1925b): Laplace–Beltrami equations in three dimensions
W. Klingenberg (1982): Conformal mapping of the triaxial ellipsoid, elliptic coordinates
R. S. Kulkarni (1969): Curvature structures and conformal mapping
R. S. Kulkarni (1972): Conformally flat manifolds
R. S. Kulkarni, U. Pinkall (eds.) (1988): Conformal geometry
J. Lafontaine (1988): Conformal mapping “from the Riemann view point”
J. Liouville (1850): Three-dimensional conformal mapping
G. Ricci (1918): Conformal mapping
J. A. Schouten (1921): n-dimensional conformal mapping, Weyl curvature
H. Weyl (1918): Conformal mapping
H. Weyl (1921): Conformal mapping
A. I. Yanushauskas (1982): Three-dimensional conformal mapping, generalized Korn–Lichtenstein equations
J. Zund (1987): Three-dimensional conformal mapping, generalized Korn–Lichtenstein equations

10 Areal Distortion

It isn’t that they can’t see the solution. It is that they can’t see the problem. (G. K. Chesterton, The Scandal of Father Brown. The Point of a Pin.)

Fourth multiplicative and additive measures of deformation, dual deformation measures, areomorphism, equiareal mapping.

Up to now, all deformation measures have been built on the first differential invariants $I_l$ and $I_r$ of surface geometry, which are also called $dS^2$ and $ds^2$. Such an invariant, “left” or “right,” measures the infinitesimal distance between two points on the “left” or the “right” surface. A dual measure of a surface (two-dimensional Riemann manifold) immersed in $R^3$ is the infinitesimal surface element. Indeed, the surface element “left versus right,” (144), is dual to the infinitesimal distance element “left versus right,” (145):

\[ dS_l := \sqrt{\det[G_l]}\, dU \wedge dV \quad \text{versus} \quad dS_r := \sqrt{\det[G_r]}\, du \wedge dv , \tag{144} \]

\[ dS^2 = G_{11}\, dU^2 + 2 G_{12}\, dU\, dV + G_{22}\, dV^2 \quad \text{versus} \quad ds^2 = g_{11}\, du^2 + 2 g_{12}\, du\, dv + g_{22}\, dv^2 . \tag{145} \]

In the context of the mapping $f: M^2_l \to M^2_r$, we next define an areomorphism as an equiareal mapping $M^2_l \to M^2_r$; see Definition 2.


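Box 13 (Areal distortion, representations of its multiplicative and additive deformation measures).

(i) $M^2_l \to M^2_r$:

\[ \sqrt{\det[G_l]}\, dU \wedge dV = \sqrt{\det[C_r]}\, du \wedge dv , \qquad \sqrt{\det[G_r]}\, du \wedge dv = \sqrt{\det[C_l]}\, dU \wedge dV ; \]
\[ \sqrt{\det[G_l]}\, dU \wedge dV = \sqrt{\det[G_r - 2E_r]}\, du \wedge dv , \qquad \sqrt{\det[G_r]}\, du \wedge dv = \sqrt{\det[G_l + 2E_l]}\, dU \wedge dV ; \]
\[ \sqrt{\det[G_l]}\, dU \wedge dV = \lambda_1 \lambda_2 \frac{1}{\det[F_r]}\, du \wedge dv , \qquad \sqrt{\det[G_r]}\, du \wedge dv = \Lambda_1 \Lambda_2 \frac{1}{\det[F_l]}\, dU \wedge dV ; \]
\[ \{M^2_r, g\} = \{R^2, \delta\} \;\Rightarrow\; \sqrt{\det[G_l]}\, dU \wedge dV = \lambda_1 \lambda_2\, du \wedge dv . \tag{146} \]

(ii)

\[ \Phi_l^2 = \sqrt{\det[C_l G_l^{-1}]} = \Lambda_1 \Lambda_2 , \qquad \Phi_r^2 = \sqrt{\det[C_r G_r^{-1}]} = \lambda_1 \lambda_2 . \tag{147} \]

(iii)

\[ S_{lr} = \left( \sqrt{\det[C_l]} - \sqrt{\det[G_l]} \right) dU \wedge dV , \qquad S_{rl} = \left( \sqrt{\det[C_r]} - \sqrt{\det[G_r]} \right) du \wedge dv ; \]
\[ S_{lr} = (\Lambda_1 \Lambda_2 - 1) \frac{1}{\det[F_l]}\, dU \wedge dV , \qquad S_{rl} = (\lambda_1 \lambda_2 - 1) \frac{1}{\det[F_r]}\, du \wedge dv ; \]
\[ \{M^2_r, g\} = \{R^2, \delta\} \;\Rightarrow\; S_{rl} = (\lambda_1 \lambda_2 - 1)\, du \wedge dv . \tag{148} \]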

Definition 2 (Equiareal mapping). An orientation-preserving diffeomorphism $f: M^2_l \to M^2_r$ is called area preserving or equiareal (vector product preserving, areomorphism) if

\[ \sqrt{\det[G_l]}\, dU \wedge dV = \sqrt{\det[G_r]}\, du \wedge dv \tag{149} \]

or, equivalently,

\[ \Phi_l^2 := \frac{\sqrt{\det[G_r]}\, du \wedge dv}{\sqrt{\det[G_l]}\, dU \wedge dV} = 1 \iff 1 = \frac{\sqrt{\det[G_l]}\, dU \wedge dV}{\sqrt{\det[G_r]}\, du \wedge dv} =: \Phi_r^2 \tag{150} \]

or

\[ S_{lr} := \sqrt{\det[G_r]}\, du \wedge dv - \sqrt{\det[G_l]}\, dU \wedge dV = 0 \iff S_{rl} := \sqrt{\det[G_l]}\, dU \wedge dV - \sqrt{\det[G_r]}\, du \wedge dv = 0 \tag{151} \]

holds for all points of $M^2_l$ and $M^2_r$, respectively.

Indeed, the left surface element $\sqrt{\det[G_l]}\, dU \wedge dV$ as well as the right surface element $\sqrt{\det[G_r]}\, du \wedge dv$ have enabled us to introduce dual measures to the left length element $d\mathbf{U}^T G_l\, d\mathbf{U}$ as well as to the right length element $d\mathbf{u}^T G_r\, d\mathbf{u}$. There exist representations of the multiplicative measure of areal distortion, $\{\Phi_l^2, \Phi_r^2\}$, and of the additive measure of areal distortion, $\{S_{lr}, S_{rl}\}$, in terms of the Cauchy–Green deformation tensor, the Euler–Lagrange deformation tensor, and the principal stretches (left or right eigenvalues). They are collected in Box 13 and turn out to be useful in the equivalence theorem. To give you again some breathing time, please enjoy Fig. 20, which presents the “quasicordiform” Bonne-pseudoconic projection.

11 Equivalence Theorem of Equiareal Mapping

The equivalence theorem of equiareal mapping from the left to the right two-dimensional Riemann manifold (areomorphism).

We have already defined areomorphism by means of areal distortion. Here we present an equivalence theorem that relates areomorphism to a special partial differential equation whose solution guarantees an equiareal mapping. In particular, we make a “canonical statement”: the product of the left principal stretches and the product of the right principal stretches are both equal to 1. Furthermore, we specify the equiareal mapping for the case that the right manifold $\{M^2_r, g\} = \{R^2, \delta\}$ is Euclidean.

Theorem 4 (Areomorphism $M^2_l \to M^2_r$, equiareal mapping). Let $f: M^2_l \to M^2_r$ be an orientation-preserving equiareal mapping. Then, the following conditions (i)–(iv) are equivalent:

(i) $\sqrt{\det[G_l]}\, dU \wedge dV = \sqrt{\det[G_r]}\, du \wedge dv$;

(ii) $\det[C_r] = \det[G_r]$, $\det[C_l] = \det[G_l]$; $\det[G_r - 2E_r] = \det[G_r]$, $\det[G_l + 2E_l] = \det[G_l]$;

(iii) $\Lambda_1 \Lambda_2 = 1$, $\lambda_1 \lambda_2 = 1$;

(iv)
\[ U_u V_v - U_v V_u = \sqrt{\frac{\det[G_r]}{\det[G_l]}} = \sqrt{\frac{g_{11}g_{22} - g_{12}^2}{G_{11}G_{22} - G_{12}^2}} , \qquad u_U v_V - u_V v_U = \sqrt{\frac{\det[G_l]}{\det[G_r]}} = \sqrt{\frac{G_{11}G_{22} - G_{12}^2}{g_{11}g_{22} - g_{12}^2}} . \tag{152} \]


Fig. 20 Bonne-pseudoconic projection, with shorelines of a spherical Earth, equidistant mapping of the line-of-contact of a circular cone, “quasicordiform.” Tissot ellipses of distortion (according to Rigobert Werner)

The proof is straightforward. For a better insight into the equivalence theorem of an equiareal mapping, we recommend a detailed study of the next example.
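A quick numerical sanity check of the Jacobian form of the equiareal condition, $u_U v_V - u_V v_U = \sqrt{\det[G_l]/\det[G_r]}$, can be sketched as follows. The sketch assumes a sphere of radius R as the left surface with coordinates (longitude, latitude), the Euclidean plane as the right surface, and the Lambert cylindrical equal-area and Mercator projections as test mappings; none of these choices or names come from the text.

```python
import math

def lambert_cylindrical(lam, phi, R=1.0):
    """Lambert cylindrical equal-area projection of the sphere."""
    return R * lam, R * math.sin(phi)

def mercator(lam, phi, R=1.0):
    """Mercator projection of the sphere (conformal, not equiareal)."""
    return R * lam, R * math.log(math.tan(math.pi / 4 + phi / 2))

def jacobian_det(f, U, V, h=1e-6):
    """u_U v_V - u_V v_U for f(U, V) -> (u, v), by central differences."""
    u_p, v_p = f(U + h, V)
    u_m, v_m = f(U - h, V)
    u_U, v_U = (u_p - u_m) / (2 * h), (v_p - v_m) / (2 * h)
    u_p, v_p = f(U, V + h)
    u_m, v_m = f(U, V - h)
    u_V, v_V = (u_p - u_m) / (2 * h), (v_p - v_m) / (2 * h)
    return u_U * v_V - u_V * v_U

def equiareal_defect(f, lam, phi, R=1.0):
    """Jacobian determinant minus sqrt(det[G_l] / det[G_r])."""
    det_Gl = R ** 4 * math.cos(phi) ** 2  # sphere: G_l = diag(R^2 cos^2 phi, R^2)
    det_Gr = 1.0                          # plane: G_r = I_2
    jac = jacobian_det(lambda U, V: f(U, V, R), lam, phi)
    return jac - math.sqrt(det_Gl / det_Gr)

print(equiareal_defect(lambert_cylindrical, 0.3, 0.7))  # close to zero
print(equiareal_defect(mercator, 0.3, 0.7))             # clearly nonzero
```

The defect vanishes for the equal-area projection and not for the conformal one, which is the behavior the equivalence theorem leads one to expect.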

12 Review: The Canonical Criteria

Where we cannot use the compass of mathematics or the torch of experience . . . it is certain we cannot take a single step forward. (Voltaire.)

Review: the canonical criteria for conformal, equiareal, isometric, and equidistant mappings, optimal map projections, Gaussian curvatures.

Up to now, we have defined conformal mapping (compare with Definition 1) as well as equiareal mapping (compare with Definition 2) from the left two-dimensional Riemann manifold (here, left surface immersed into $R^3$) to the right two-dimensional Riemann manifold (here, right surface immersed into $R^3$). We demonstrated that under the action of a conformal map, angles are preserved. In contrast, an equiareal transformation preserves the surface element. However, what can be said about length-preserving mappings $f: M^2_l \to M^2_r$?

12.1 Isometry Let us begin with the definition of an isometry and relate it in the form of an equivalence theorem to other measures of deformation. In particular, we ask the question: When does an isometric mapping exist?


Definition 3 (Isometry). An admissible mapping $f: M^2_l \to M^2_r$ is called length preserving or an isometry if, for any curve in the left surface (“left curve”: $c_l(t_l)$, $t_l \in I(c_l)$), the corresponding curve in the right surface (“right curve”: $c_r(t_r)$, $t_r \in I(c_r)$), as its image $f \circ c_l(t_l)$, has an identical length:

\[ \int_{a_l}^{b_l} \dot{s}_l\, dt_l = \int_{a_r}^{b_r} \dot{s}_r\, dt_r . \tag{153} \]

Two Riemann manifolds, $M^2_l$ and $M^2_r$, respectively, which are mapped onto each other by means of an isometry, are called isometric. Without proof, we make the following equivalence statement (of course, an equivalent statement could be made for the right manifold $M^2_r$).

Theorem 5 (Isometry $M^2_l \to M^2_r$). An admissible mapping $f: M^2_l \to M^2_r$ is an isometry if and only if the following equivalent conditions are fulfilled:

(i) The coordinates of the left Cauchy–Green tensor $C_l$ are identical to the coordinates of the left metric tensor $G_l$, i.e.,

\[ C_l = G_l . \tag{154} \]

(ii) The stretch $\Lambda$ for any point $X \in M^1_l \subset M^2_l \subset R^3$ is independent of the direction of the tangent vector $\dot{X}$ and is a constant equal to 1, i.e.,

\[ \Lambda(\dot{X}) = 1 \quad \forall\, \dot{X} \neq 0, \quad \dot{X} \in T M^1_l \subset T M^2_l , \qquad \dot{X} = \sum_{I=1}^{3} \sum_{M=1}^{2} E_I \frac{\partial X^I}{\partial U^M} \frac{dU^M}{dt_l} . \tag{155} \]

(iii) The left principal stretches for any point $X \in M^2_l$ are a constant equal to 1: $\Lambda_1 = \Lambda_2 = 1$.

If an isometric mapping $f: M^2_l \to M^2_r$ existed for arbitrary left and right two-dimensional Riemann manifolds, we would have met an ideal situation. Let us, therefore, ask: When does an isometric mapping $f: M^2_l \to M^2_r$ exist? Unfortunately, we can only sketch the existence proof here, which is based upon the intrinsic measure of curvature of a surface, namely, the Gaussian curvature, computed by

\[ k = \det[K] = \frac{\det[H]}{\det[G]} , \qquad K := -H G^{-1} \in R^{2 \times 2} . \tag{156} \]

The curvature matrix, K, of a surface is the negative product of the Hesse matrix H and the inverse of the Gauss matrix G, defined as follows:


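Box 14 (Curvature matrix of a surface).

\[ G = \left[ \sum_{I=1}^{3} \frac{\partial X^I}{\partial U^M} \frac{\partial X^I}{\partial U^N} \right] = \begin{bmatrix} e & f \\ f & g \end{bmatrix} , \tag{157} \]

\[ H = \left[ \sum_{I=1}^{3} \frac{\partial^2 X^I}{\partial U^M \partial U^N} N^I \right] = \begin{bmatrix} l & m \\ m & n \end{bmatrix} , \tag{158} \]

\[ K = -H G^{-1} = \frac{1}{eg - f^2} \begin{bmatrix} -gl + fm & fl - em \\ -gm + fn & fm - en \end{bmatrix} . \tag{159} \]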

Table 4 Gaussian curvatures for some surfaces

Type of surface: Gaussian curvature
Sphere $S^2_R$: $k = 1/R^2 > 0$
Ellipsoid-of-revolution $E^2_{A_1,A_1,A_2}$: $k = 1/(MN)$, with $M := A_1 (1 - E^2) / (1 - E^2 \sin^2 \Phi)^{3/2}$ and $N := A_1 / \sqrt{1 - E^2 \sin^2 \Phi}$
Plane, cylinder, cone, ruled surface: $k = 0$

$N^I$ denotes the coordinates of the surface normal vector $N \in N M^2_l$ with respect to the basis $\{E_1, E_2, E_3 \mid O\}$ fixed to the origin $O$ and assumed to be orthonormal: $N = E_1 N^1 + E_2 N^2 + E_3 N^3$. The $X^I(U, V)$ are the representers of $\Phi_l^{-1}$, which are also called embedding functions $M^2_l \subset R^3$ if we exclude self-intersections and singular points (corners) of $M^2_l$. The “Theorema Egregium” of C. F. Gauss states that the determinant of the curvature matrix, in short the Gaussian curvature, depends only on (i) the metric coefficients $e, f, g$; (ii) their first derivatives $e_U, e_V, f_U, f_V, g_U, g_V$; and (iii) their second derivatives $e_{UU}, e_{UV}, e_{VV}, \ldots, g_{UU}, g_{UV}, g_{VV}$. The fundamental theorem of an isometric mapping can now be formulated as follows:

Theorem 6 (Isometric mapping). If a left surface is isometrically mapped to a right surface, then corresponding points $X \in M^2_l$ and $x \in M^2_r$ have identical Gaussian curvatures.

A list of Gaussian curvatures for different surfaces is shown in Table 4. In consequence, there are no isometries (i) from ellipsoid to sphere and (ii) from ellipsoid or sphere to plane, cylinder, cone, or any ruled surface (developable surfaces of Gaussian curvature 0).
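The curvature matrix $K = -HG^{-1}$ and the Gaussian curvature $k = \det[K]$ can be evaluated numerically for a given embedding. The following is a minimal sketch that approximates the derivatives of the embedding functions by central differences; the helper names, the step size, and the two test surfaces (a sphere of radius 2 and a unit cylinder) are assumptions made for illustration only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curvature_matrix(X, U, V, h=1e-4):
    """Return (K, k) with K = -H G^{-1} and k = det K at the point (U, V)."""
    X0 = X(U, V)
    Xpu, Xmu = X(U + h, V), X(U - h, V)
    Xpv, Xmv = X(U, V + h), X(U, V - h)
    # First derivatives (tangent vectors) and second derivatives.
    XU = tuple((p - m) / (2 * h) for p, m in zip(Xpu, Xmu))
    XV = tuple((p - m) / (2 * h) for p, m in zip(Xpv, Xmv))
    XUU = tuple((p - 2 * c + m) / h ** 2 for p, c, m in zip(Xpu, X0, Xmu))
    XVV = tuple((p - 2 * c + m) / h ** 2 for p, c, m in zip(Xpv, X0, Xmv))
    XUV = tuple((a - b - c + d) / (4 * h * h) for a, b, c, d in zip(
        X(U + h, V + h), X(U + h, V - h), X(U - h, V + h), X(U - h, V - h)))
    # Unit surface normal N.
    nv = cross(XU, XV)
    norm = math.sqrt(dot(nv, nv))
    N = tuple(c / norm for c in nv)
    # Gauss matrix G = [[e, f], [f, g]] and Hesse matrix H = [[l, m], [m, n]].
    e, f, g = dot(XU, XU), dot(XU, XV), dot(XV, XV)
    l, m, n = dot(XUU, N), dot(XUV, N), dot(XVV, N)
    D = e * g - f * f
    K = [[(-g * l + f * m) / D, (f * l - e * m) / D],
         [(-g * m + f * n) / D, (f * m - e * n) / D]]
    k = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return K, k

def sphere(U, V, R=2.0):
    return (R * math.cos(V) * math.cos(U), R * math.cos(V) * math.sin(U), R * math.sin(V))

def cylinder(U, V):
    return (math.cos(U), math.sin(U), V)

print(curvature_matrix(sphere, 0.3, 0.5)[1])    # approximately 1 / R^2
print(curvature_matrix(cylinder, 0.3, 0.5)[1])  # approximately 0
```

Since the two curvatures differ, the sketch is consistent with the statement that no isometry exists between a sphere and a cylinder.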

12.2 Equidistant Mapping of Submanifolds

Indeed, we are unable to produce an isometric landscape of the Earth, its Moon, the Sun, the planets, other celestial bodies, or the universe. In this situation, we have to look for a softer version of a length-preserving mapping. Such an alternative concept is found by “dimension reduction”: only a one-dimensional submanifold $M^1$ of the two-dimensional Riemann manifold $M^2$ is mapped “length preserving.” For instance, we map the left coordinate line “ellipsoidal equator” equidistantly to the right coordinate line “spherical equator,” namely, by the postulate $A_1 \Lambda = r \lambda$.


The arc length $A_1 \Lambda$ of the ellipsoidal equator coincides with the arc length $r \lambda$ of the spherical equator. A more precise definition is given in Definition 4.

Definition 4 (Equidistant mapping). Let a particular mapping $f: M^2_l \to M^2_r$ of a left surface (left two-dimensional Riemann manifold) to a right surface (right two-dimensional Riemann manifold) be given. Apart from exceptional points, both parameterized surfaces $M^2_l$ and $M^2_r$ are covered by a set of coordinate lines $\{U = \text{constant}, V\}$, $\{U, V = \text{constant}\}$ as well as $\{u = \text{constant}, v\}$, $\{u, v = \text{constant}\}$, called left curves $c_l(t)$ (left one-dimensional submanifold) and right curves $c_r(t)$ (right one-dimensional submanifold). Under the mapping $f \circ c_l(t) = c_r(t)$, the mapping

\[ c_l(t) \to c_r(t) \quad \text{or} \quad R^3 \supset M^2_l \supset M^1_l \xrightarrow{\ \text{equidistant}\ } M^1_r \subset M^2_r \subset R^3 \tag{160} \]

is called equidistant if a finite section of a specific left curve $c_l(t)$ has the same length as a finite section of the corresponding right curve $c_r(t)$.

Let us work out the equivalence theorem for an equidistant mapping from a left curve $c_l(t)$ to a right curve $c_r(t)$.

Theorem 7 (Equidistant mapping $R^3 \supset M^2_l \supset M^1_l \to M^1_r \subset M^2_r \subset R^3$). Let us assume that the left surface (left two-dimensional Riemann manifold) as well as the right surface (right two-dimensional Riemann manifold) have been parameterized by left coordinates $\{U, V\}$ and right coordinates $\{u, v\}$. If the directions of their left tangent vectors and their right tangent vectors coincide with the directions of the left principal stretches (left eigendirections, left eigenvectors) and of the right principal stretches (right eigendirections, right eigenvectors), then the following conditions of an equidistant mapping are equivalent:

(i) Equidistant mapping of a section of a specific left curve $c_l(t)$ to a corresponding section of a specific right curve $c_r(t)$.

U coordinate line to u coordinate line: $\int_{a_l}^{b_l} \sqrt{G_{22}(t)}\, \dot{V}\, dt = \int_{a_r}^{b_r} \sqrt{g_{22}(t)}\, \dot{v}\, dt$;
V coordinate line to v coordinate line: $\int_{a_l}^{b_l} \sqrt{G_{11}(t)}\, \dot{U}\, dt = \int_{a_r}^{b_r} \sqrt{g_{11}(t)}\, \dot{u}\, dt$. (161)

(ii) Left or right Cauchy–Green matrix under an equidistant mapping $c_l(t) \to c_r(t)$.

U coordinate line to u coordinate line: $c_{22} = G_{22}$ or $C_{22} = g_{22}$;
V coordinate line to v coordinate line: $c_{11} = G_{11}$ or $C_{11} = g_{11}$. (162)

(iii) Left or right principal stretches under an equidistant mapping $c_l(t) \to c_r(t)$.

U coordinate line to u coordinate line: $\Lambda_2 = 1$ or $\lambda_2 = 1$;
V coordinate line to v coordinate line: $\Lambda_1 = 1$ or $\lambda_1 = 1$. (163)

The proof is straightforward.


12.3 Canonical Criteria

By means of the various equivalence theorems, we are well prepared to collect in Box 15 the canonical criteria or measures for a conformal, an equiareal, and an isometric mapping $M^2_l \to M^2_r$ as well as for an equidistant mapping $c_l(t) \to c_r(t)$. These canonical measures are exclusively used to generate, in the following sections, equidistant, conformal, and equiareal mappings of various surfaces, from the ellipsoid-of-revolution to the sphere. Hilbert’s invariant theory is finally used to generate scalar functions of the tensor-valued deformation measures. Box 16 reviews the two fundamental Hilbert invariants of the Cauchy–Green and Euler–Lagrange deformation tensors.

Box 15 (Canonical criteria for a conformal, equiareal, and isometric mapping $M^2_l \to M^2_r$ as well as for an equidistant mapping $c_l(t) \to c_r(t)$).

Conformeomorphism:
\[ \Lambda_1 = \Lambda_2 \ \text{or} \ \lambda_1 = \lambda_2 ; \qquad K_1 = K_2 \ \text{or} \ \kappa_1 = \kappa_2 \tag{164} \]
for all points of $M^2_l$ or $M^2_r$, respectively.

Areomorphism:
\[ \Lambda_1 \Lambda_2 = 1 \ \text{or} \ \lambda_1 \lambda_2 = 1 ; \qquad K_1 K_2 + \tfrac{1}{2}(K_1 + K_2) = 0 \ \text{or} \ \kappa_1 \kappa_2 - \tfrac{1}{2}(\kappa_1 + \kappa_2) = 0 \tag{165} \]
for all points of $M^2_l$ or $M^2_r$, respectively.

Isometry:
\[ \Lambda_1 = \Lambda_2 = 1 \ \text{or} \ \lambda_1 = \lambda_2 = 1 ; \qquad K_1 = K_2 = 0 \ \text{or} \ \kappa_1 = \kappa_2 = 0 \tag{166} \]
for all points of $M^2_l$ or $M^2_r$, respectively.

Equidistance:
\[ \Lambda_1 = 1, \ \Lambda_2 = 1 \ \text{or} \ \lambda_1 = 1, \ \lambda_2 = 1 ; \qquad K_1 = 0, \ K_2 = 0 \ \text{or} \ \kappa_1 = 0, \ \kappa_2 = 0 \tag{167} \]
for all points of $M^1_l$ (left curve) and $M^1_r$ (right curve), which are equidistantly mapped.


The quantities

\[ \tfrac{1}{2} I_1 = \tfrac{1}{2} (\Lambda_1^2 + \Lambda_2^2) = \tfrac{1}{2} \mathrm{tr}[C_l G_l^{-1}] , \tag{168} \]

\[ \tfrac{1}{2} i_1 = \tfrac{1}{2} (\lambda_1^2 + \lambda_2^2) = \tfrac{1}{2} \mathrm{tr}[C_r G_r^{-1}] \tag{169} \]

represent the average Cauchy–Green deformation, the distortion energy density of the first kind, also called Cauchy–Green dilatation. In contrast,

\[ \ln \sqrt{I_2} = \tfrac{1}{2} (\ln \Lambda_1^2 + \ln \Lambda_2^2) = \ln \sqrt{\det[C_l G_l^{-1}]} , \qquad \ln \sqrt{i_2} = \tfrac{1}{2} (\ln \lambda_1^2 + \ln \lambda_2^2) = \ln \sqrt{\det[C_r G_r^{-1}]} \tag{170} \]

are the geometric mean of the Cauchy–Green deformation or the distortion energy density of the second kind. Note that similar Hilbert invariants can be formulated and interpreted for the Euler–Lagrange deformation tensor.

Box 16 (Canonical representation of Hilbert invariants derived from deformation measures).

\[ I_1(C_l) := \Lambda_1^2 + \Lambda_2^2 = \mathrm{tr}[C_l G_l^{-1}] \quad \text{versus} \quad i_1(C_r) := \lambda_1^2 + \lambda_2^2 = \mathrm{tr}[C_r G_r^{-1}] , \]
\[ I_2(C_l) := \Lambda_1^2 \Lambda_2^2 = \det[C_l G_l^{-1}] \quad \text{versus} \quad i_2(C_r) := \lambda_1^2 \lambda_2^2 = \det[C_r G_r^{-1}] , \tag{171} \]

or

\[ I_1(E_l) := K_1 + K_2 = \mathrm{tr}[E_l G_l^{-1}] \quad \text{versus} \quad i_1(E_r) := \kappa_1 + \kappa_2 = \mathrm{tr}[E_r G_r^{-1}] , \]
\[ I_2(E_l) := K_1 K_2 = \det[E_l G_l^{-1}] \quad \text{versus} \quad i_2(E_r) := \kappa_1 \kappa_2 = \det[E_r G_r^{-1}] . \tag{172} \]
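The two Hilbert invariants of the Cauchy–Green tensor, $\mathrm{tr}[C_l G_l^{-1}]$ and $\det[C_l G_l^{-1}]$, determine the principal stretches as the roots of a quadratic equation. The sketch below illustrates this for 2 × 2 matrices given as nested lists. The example matrices, which assume the Lambert cylindrical equal-area projection of the unit sphere at latitude $\Phi$ with $G_l = \mathrm{diag}(\cos^2\Phi, 1)$ and $C_l = \mathrm{diag}(1, \cos^2\Phi)$, and the helper names are illustrative assumptions.

```python
import math

def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def hilbert_invariants(C, G):
    """Return (I1, I2) = (tr[C G^-1], det[C G^-1])."""
    M = mul2(C, inv2(G))
    I1 = M[0][0] + M[1][1]
    I2 = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return I1, I2

def principal_stretches(C, G):
    """Principal stretches: square roots of the roots of x^2 - I1 x + I2 = 0."""
    I1, I2 = hilbert_invariants(C, G)
    disc = math.sqrt(max(I1 * I1 / 4 - I2, 0.0))
    return math.sqrt(I1 / 2 + disc), math.sqrt(I1 / 2 - disc)

phi = 1.0
c = math.cos(phi)
G_l = [[c * c, 0.0], [0.0, 1.0]]   # metric of the unit sphere (longitude, latitude)
C_l = [[1.0, 0.0], [0.0, c * c]]   # assumed Cauchy-Green matrix of the Lambert map

print(hilbert_invariants(C_l, G_l))   # I2 equals 1 for an equiareal mapping
print(principal_stretches(C_l, G_l))  # 1 / cos(phi) and cos(phi)
```

In this example the second invariant equals 1, in agreement with the canonical criterion $\Lambda_1 \Lambda_2 = 1$ for an areomorphism.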


Physical Aside. Alternative measures of distortion energy density are introduced in continuum mechanics. By means of the weighted Frobenius matrix norm of Box 17, we have given quadratic forms of Cauchy–Green and Euler–Lagrange deformation densities. The weight matrices $W_l$ and $W_r$ are Hooke matrices, also called direct and inverse stiffness matrices.

Boxes 16 and 17 review local scalar-valued deformation measures, namely, distortion densities of the first and the second kind. As soon as we have to map a certain part of the left surface as well as of the right surface, we should consequently introduce global invariant distortion measures, as summarized in Box 18, which constitute Cauchy–Green and Euler–Lagrange deformation energies. $dS_l$ denotes the left surface element, while $dS_r$ denotes the right surface element, for instance, $dS_l = \sqrt{\det[G_l]}\, dU\, dV$ and $dS_r = \sqrt{\det[G_r]}\, du\, dv$, respectively. The vec operator maps a matrix, as a two-dimensional array, to a column, as a one-dimensional array: under the operation $\mathrm{vec}[A]$, the columns of the matrix $A$ are stacked vertically one by one. An example is $A \in R^{2 \times 2}$, $\mathrm{vec}[A] = (a_{11}, a_{21}, a_{12}, a_{22})^T$.

Box 17 (Weighted matrix norms of Cauchy–Green and Euler–Lagrange deformations).

Cauchy–Green deformation:

\[ \| C_l G_l^{-1} \|^2_{W_l} := \mathrm{tr}\left[ (C_l G_l^{-1})^T W_l (C_l G_l^{-1}) \right] \quad \text{versus} \quad \| C_r G_r^{-1} \|^2_{W_r} := \mathrm{tr}\left[ (C_r G_r^{-1})^T W_r (C_r G_r^{-1}) \right] , \]
\[ \| C_l G_l^{-1} \|^2_{W_l} = (\mathrm{vec}[C_l G_l^{-1}])^T W_l (\mathrm{vec}[C_l G_l^{-1}]) \quad \text{versus} \quad \| C_r G_r^{-1} \|^2_{W_r} = (\mathrm{vec}[C_r G_r^{-1}])^T W_r (\mathrm{vec}[C_r G_r^{-1}]) . \tag{173} \]

Euler–Lagrange deformation:

\[ \| E_l G_l^{-1} \|^2_{W_l} := \mathrm{tr}\left[ (E_l G_l^{-1})^T W_l (E_l G_l^{-1}) \right] \quad \text{versus} \quad \| E_r G_r^{-1} \|^2_{W_r} := \mathrm{tr}\left[ (E_r G_r^{-1})^T W_r (E_r G_r^{-1}) \right] , \]
\[ \| E_l G_l^{-1} \|^2_{W_l} = (\mathrm{vec}[E_l G_l^{-1}])^T W_l (\mathrm{vec}[E_l G_l^{-1}]) \quad \text{versus} \quad \| E_r G_r^{-1} \|^2_{W_r} = (\mathrm{vec}[E_r G_r^{-1}])^T W_r (\mathrm{vec}[E_r G_r^{-1}]) . \tag{174} \]
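The vec operator and the two forms of the weighted norm can be illustrated with a small sketch. The trace form uses a 2 × 2 weight matrix, whereas the vec form needs a 4 × 4 weight; the sketch assumes that the 4 × 4 weight is the block-diagonal matrix $I_2 \otimes W$, for which both forms agree. This identification, the function names, and the sample numbers are assumptions of the sketch and are not stated in the text.

```python
def vec(A):
    """Stack the columns of a 2x2 matrix: (a11, a21, a12, a22)."""
    return [A[0][0], A[1][0], A[0][1], A[1][1]]

def kron_identity(W):
    """I_2 (x) W: the 4x4 block-diagonal matrix diag(W, W)."""
    K = [[0.0] * 4 for _ in range(4)]
    for block in range(2):
        for i in range(2):
            for j in range(2):
                K[2 * block + i][2 * block + j] = W[i][j]
    return K

def norm_trace(A, W):
    """tr[A^T W A] for 2x2 matrices A and W."""
    return sum(A[i][j] * W[i][k] * A[k][j]
               for j in range(2) for i in range(2) for k in range(2))

def norm_vec(A, W):
    """vec[A]^T (I_2 (x) W) vec[A]."""
    a, K = vec(A), kron_identity(W)
    return sum(a[i] * K[i][j] * a[j] for i in range(4) for j in range(4))

A = [[1.0, 2.0], [3.0, 4.0]]   # stands in for C G^{-1} or E G^{-1}
W = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical symmetric weight matrix

print(vec(A))
print(norm_trace(A, W), norm_vec(A, W))  # the two values agree
```

With $W = I_2$ both expressions reduce to the squared (unweighted) Frobenius norm.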


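Box 18 (Cauchy–Green distortion energy, Euler–Lagrange distortion energy).

(i) Cauchy–Green distortion energy:

(1st) $\frac{1}{2} \int dS_l\, \mathrm{tr}[C_l G_l^{-1}] = \frac{1}{2} \int dS_l\, (\Lambda_1^2 + \Lambda_2^2)$ versus $\frac{1}{2} \int dS_r\, \mathrm{tr}[C_r G_r^{-1}] = \frac{1}{2} \int dS_r\, (\lambda_1^2 + \lambda_2^2)$;

(2nd) $\int dS_l\, \sqrt{\det[C_l G_l^{-1}]} = \int dS_l\, \Lambda_1 \Lambda_2$ versus $\int dS_r\, \sqrt{\det[C_r G_r^{-1}]} = \int dS_r\, \lambda_1 \lambda_2$;

(3rd) $\int dS_l\, (\ln \Lambda_1^2 + \ln \Lambda_2^2)$ versus $\int dS_r\, (\ln \lambda_1^2 + \ln \lambda_2^2)$;

(4th) $\int dS_l\, \mathrm{tr}\left[ (C_l G_l^{-1})^T W_l (C_l G_l^{-1}) \right] =: ||| C_l G_l^{-1} |||^2_{W_l}$ versus $\int dS_r\, \mathrm{tr}\left[ (C_r G_r^{-1})^T W_r (C_r G_r^{-1}) \right] =: ||| C_r G_r^{-1} |||^2_{W_r}$. (175)

(ii) Euler–Lagrange distortion energy:

(1st) $\frac{1}{2} \int dS_l\, \mathrm{tr}[E_l G_l^{-1}] = \frac{1}{2} \int dS_l\, (K_1 + K_2)$ versus $\frac{1}{2} \int dS_r\, \mathrm{tr}[E_r G_r^{-1}] = \frac{1}{2} \int dS_r\, (\kappa_1 + \kappa_2)$;

(2nd) $\int dS_l\, \sqrt{\det[E_l G_l^{-1}]} = \int dS_l\, \sqrt{K_1 K_2}$ versus $\int dS_r\, \sqrt{\det[E_r G_r^{-1}]} = \int dS_r\, \sqrt{\kappa_1 \kappa_2}$;

(3rd) $\frac{1}{2} \int dS_l\, (\ln K_1 + \ln K_2)$ versus $\frac{1}{2} \int dS_r\, (\ln \kappa_1 + \ln \kappa_2)$;

(4th) $\frac{1}{2} \int dS_l\, \mathrm{tr}\left[ (E_l G_l^{-1})^T W_l (E_l G_l^{-1}) \right] =: ||| E_l G_l^{-1} |||^2_{W_l}$ versus $\frac{1}{2} \int dS_r\, \mathrm{tr}\left[ (E_r G_r^{-1})^T W_r (E_r G_r^{-1}) \right] =: ||| E_r G_r^{-1} |||^2_{W_r}$. (176)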

12.4 Optimal Map Projections

Optimal map projections relate to the invariant scalar measures of Cauchy–Green deformation. More than 1,000 scientific contributions have been published on this topic. Harmonic maps, optimal Universal Mercator Projections (opt UMP), as well as optimal Universal Transverse Mercator projections (opt UTM) belong to this category. Let us only introduce here the optimality conditions, as they are summarized in Boxes 19–21.

First, G. B. Airy (1861) and V. V. Kavrajski (1958) introduced local as well as global measures of the departure of $f: M^2_l \to M^2_r$ from isometry. Since, for an isometry, canonically $\Lambda_1 = \Lambda_2 = 1$ or $\lambda_1 = \lambda_2 = 1$ holds, $\{\Lambda_1 - 1, \Lambda_2 - 1\}$ or $\{\ln \Lambda_1, \ln \Lambda_2\}$ and $\{\lambda_1 - 1, \lambda_2 - 1\}$ or $\{\ln \lambda_1, \ln \lambda_2\}$, as “errors” $\varepsilon_l$ and $\varepsilon_r$, are measures of the local departure from isometry. When integrated over the part of the left or right surface to be mapped, we are led to the global measures of departure from isometry, namely, $I_A$ and $I_{AK}$ of type “left” and “right.”

Second, we introduce local and global measures of the departure of $f: M^2_l \to M^2_r$ from an areomorphism or a conformeomorphism. Since, for an equiareal mapping, canonically $\Lambda_1 \Lambda_2 = 1$ or $\lambda_1 \lambda_2 = 1$ holds, $\{\Lambda_1 \Lambda_2 - 1\}$ or $\{\lambda_1 \lambda_2 - 1\}$, as “errors” $\varepsilon_l$ and $\varepsilon_r$ of type “areal,” measure the local departure from an areomorphism. Similarly, for a conformal mapping, canonically $\Lambda_1 = \Lambda_2$ or $\lambda_1 = \lambda_2$ holds. Accordingly, $\Lambda_1 - \Lambda_2$ or $\lambda_1 - \lambda_2$, as “errors” $\varepsilon_l$ and $\varepsilon_r$ of type “conformal,” describe the local departure from a conformeomorphism. When integrated over the part of the left or right surface to be mapped, we are led to global measures of departure from areomorphism or conformeomorphism, namely, $I_{\text{areal}}$ and $I_{\text{conf}}$ of type “left” and “right.” Examples are given in the following chapters.

Box 19 (Local measures for a departure of the mapping $M^2_l \to M^2_r$ from isometry).

(i) G. B. Airy (1861):
\[ \varepsilon^2_{lA} := \tfrac{1}{2} \left[ (\Lambda_1 - 1)^2 + (\Lambda_2 - 1)^2 \right] \quad \text{versus} \quad \tfrac{1}{2} \left[ (\lambda_1 - 1)^2 + (\lambda_2 - 1)^2 \right] =: \varepsilon^2_{rA} . \tag{177} \]

(ii) V. V. Kavrajski (1958):
\[ \varepsilon^2_{lAK} := \tfrac{1}{2} \left[ (\ln \Lambda_1)^2 + (\ln \Lambda_2)^2 \right] \quad \text{versus} \quad \tfrac{1}{2} \left[ (\ln \lambda_1)^2 + (\ln \lambda_2)^2 \right] =: \varepsilon^2_{rAK} . \tag{178} \]

Box 20 (Local measures for a departure of the mapping $M^2_l \to M^2_r$ from an equiareal and a conformal mapping).

(i) Departure from an equiareal mapping:
\[ \varepsilon^2_{l\,\text{areal}} := (\Lambda_1 \Lambda_2 - 1)^2 \quad \text{versus} \quad (\lambda_1 \lambda_2 - 1)^2 =: \varepsilon^2_{r\,\text{areal}} . \tag{179} \]

(ii) Departure from a conformal mapping:
\[ \varepsilon^2_{l\,\text{conf}} := (\Lambda_1 - \Lambda_2)^2 \quad \text{versus} \quad (\lambda_1 - \lambda_2)^2 =: \varepsilon^2_{r\,\text{conf}} . \tag{180} \]


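Box 21 (Global measures for a departure of the mapping $M^2_l \to M^2_r$ from isometry, areomorphism, and conformeomorphism).

(i) Isometry:
\[ I_{lA} := \frac{1}{S_l} \int dS_l\, \varepsilon^2_{lA} \quad \text{versus} \quad \frac{1}{S_r} \int dS_r\, \varepsilon^2_{rA} =: I_{rA} , \]
\[ I_{lAK} := \frac{1}{S_l} \int dS_l\, \varepsilon^2_{lAK} \quad \text{versus} \quad \frac{1}{S_r} \int dS_r\, \varepsilon^2_{rAK} =: I_{rAK} . \tag{181} \]

(ii) Areomorphism:
\[ I_{l\,\text{areal}} := \frac{1}{S_l} \int dS_l\, \varepsilon^2_{l\,\text{areal}} \quad \text{versus} \quad \frac{1}{S_r} \int dS_r\, \varepsilon^2_{r\,\text{areal}} =: I_{r\,\text{areal}} . \tag{182} \]

(iii) Conformeomorphism:
\[ I_{l\,\text{conf}} := \frac{1}{S_l} \int dS_l\, \varepsilon^2_{l\,\text{conf}} \quad \text{versus} \quad \frac{1}{S_r} \int dS_r\, \varepsilon^2_{r\,\text{conf}} =: I_{r\,\text{conf}} . \tag{183} \]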
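The local measures of Boxes 19 and 20 and a global mean of the kind defined in Box 21 can be sketched for a concrete case. The example assumes the unit sphere as the left surface, a latitude band as the region to be mapped (so that the longitude integral cancels in the mean), and the left principal stretches of the Mercator projection, $(1/\cos\Phi, 1/\cos\Phi)$, and of the Lambert cylindrical equal-area projection, $(1/\cos\Phi, \cos\Phi)$. The function names and the midpoint-rule integration are illustrative choices, not part of the text.

```python
import math

def airy(L1, L2):
    """Local Airy measure of departure from isometry."""
    return 0.5 * ((L1 - 1) ** 2 + (L2 - 1) ** 2)

def airy_kavrajski(L1, L2):
    """Local logarithmic (Kavrajski-type) measure of departure from isometry."""
    return 0.5 * (math.log(L1) ** 2 + math.log(L2) ** 2)

def areal_error(L1, L2):
    """Local measure of departure from an equiareal mapping."""
    return (L1 * L2 - 1) ** 2

def conformal_error(L1, L2):
    """Local measure of departure from a conformal mapping."""
    return (L1 - L2) ** 2

def mercator_stretches(phi):
    s = 1.0 / math.cos(phi)
    return s, s

def lambert_stretches(phi):
    return 1.0 / math.cos(phi), math.cos(phi)

def global_measure(local, stretches, phi_min, phi_max, n=2000):
    """Area-weighted mean of a local measure over a latitude band (midpoint rule)."""
    dphi = (phi_max - phi_min) / n
    num = den = 0.0
    for i in range(n):
        phi = phi_min + (i + 0.5) * dphi
        w = math.cos(phi) * dphi  # surface element of the unit sphere per unit longitude
        num += local(*stretches(phi)) * w
        den += w
    return num / den

band = (-math.radians(60), math.radians(60))
print(global_measure(airy, mercator_stretches, *band))
print(global_measure(airy, lambert_stretches, *band))
print(global_measure(conformal_error, mercator_stretches, *band))  # vanishes
print(global_measure(areal_error, lambert_stretches, *band))       # vanishes
```

Comparing the Airy means of different projections over the same region is one simple way to rank them with respect to their departure from isometry.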

12.5 Maximal Angular Distortion

The conformal mapping $f: M^2_l \to M^2_r$ had been previously defined by the angular identity $\Psi_l = \Psi_r$ or by zero angular shear $\Sigma_l := \Psi_l - \Psi_r = 0$ or $\Sigma_r := \Psi_r - \Psi_l = 0$. By means of the canonical criteria $\Lambda_1 = \Lambda_2$ or $\Lambda_1 - \Lambda_2 = 0$, we succeeded in formulating an equivalence for conformality. Here, by means of a case study, we shall concentrate on the deviation of a general mapping, $f: M^2_l \to M^2_r$, from conformality. In particular, we shall solve the optimization problem of maximal angular shear or of the largest deviation of such a general mapping from conformality. Fast first-hand information is offered by Lemma 5:

Lemma 5 (Left and right general eigenvalue problem of the Cauchy–Green deformation tensor). The angular distortion is maximal if

\[ \Omega_l = 2 \Sigma_l^{+} = 2 \arcsin \frac{\Lambda_1 - \Lambda_2}{\Lambda_1 + \Lambda_2} \quad \text{or} \quad \Omega_r = 2 \Sigma_r^{+} = 2 \arcsin \frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2} . \]

The general proof of such a lemma can be taken from C. Truesdell and R. Toupin (1960), pp. 257–266. Here, we make the simplifying assumption that $\{G_{12} = 0, c_{12} = 0\}$ and $\{g_{12} = 0, C_{12} = 0\}$. The off-diagonal elements of the left matrix of the metric $G_l$ as well as of the left Cauchy–Green matrix $C_l$ vanish. Or we may say that the coordinate lines “left” and their images “right” intersect at right angles. In consequence, the mapping equations are specified by $\{u(U), v(V)\}$. An analogous statement can be made for the special case $\{g_{12} = 0, C_{12} = 0\}$.

First, we have to define the angular parameters $\Psi_l$ and $\Psi_r$. According to Figs. 21 and 22, we refer the angles $\Psi_l$ and $\Psi_r$, respectively, to the unit tangent vector $C_1$ along the V = constant coordinate line and to the unit tangent vector $D_1$ of an arbitrary curve intersecting the coordinate line V = constant, as well as to the unit tangent vector $c_1$ along the v = constant coordinate line and to the unit tangent vector $d_1$ of an arbitrary curve intersecting the coordinate line v = constant. Such an image curve is generated by mapping the original curve $C(S) \in M^1_l \subset M^2_l$ to $c(s) \in M^1_r \subset M^2_r$. Box 22 summarizes the related reference frames, namely,

Gauss reference frame (3-leg):
\[ \{G_1, G_2, G_3 \mid U, V\} , \quad \{g_1, g_2, g_3 \mid u, v\} ; \tag{184} \]

Cartan reference frame (3-leg, orthonormal, repère mobile):
\[ \{C_1, C_2, C_3 \mid U, V\} , \quad \{c_1, c_2, c_3 \mid u, v\} ; \tag{185} \]

Darboux reference frame (3-leg, orthonormal):
\[ \{D_1, D_2, D_3 \mid U(S), V(S)\} , \quad \{d_1, d_2, d_3 \mid u(s), v(s)\} . \tag{186} \]

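Box 22 (Reference frames (3-leg) of type Gauss, Cartan, and Darboux).

The left manifolds ($\mathbb{M}^1_l \subset \mathbb{M}^2_l$, $G_{12} = 0$).

Gauss:
\[ \mathbf{G}_1 := \frac{\partial \mathbf{X}(U,V)}{\partial U}, \quad \mathbf{G}_2 := \frac{\partial \mathbf{X}(U,V)}{\partial V}, \quad \mathbf{G}_3 := \frac{\mathbf{G}_1 \times \mathbf{G}_2}{\|\mathbf{G}_1 \times \mathbf{G}_2\|}. \]

Cartan:
\[ \mathbf{C}_1 := \frac{\mathbf{G}_1}{\|\mathbf{G}_1\|} = \frac{\mathbf{G}_1}{\sqrt{G_{11}}}, \quad \mathbf{C}_2 := \frac{\mathbf{G}_2}{\|\mathbf{G}_2\|} = \frac{\mathbf{G}_2}{\sqrt{G_{22}}}, \quad \mathbf{C}_3 := \mathbf{C}_1 \times \mathbf{C}_2 = *(\mathbf{C}_1 \wedge \mathbf{C}_2) = \mathbf{G}_3. \]

Darboux:
\[ \mathbf{D}_1 := \mathbf{X}' = \frac{d\mathbf{X}}{dS}, \quad \mathbf{D}_2 := \mathbf{D}_3 \times \mathbf{D}_1 = *(\mathbf{D}_3 \wedge \mathbf{D}_1), \quad \mathbf{D}_3 = \mathbf{C}_3 = \mathbf{G}_3. \tag{187} \]

The right manifolds ($c_{12} = 0$).

Gauss:
\[ \mathbf{g}_1 := \frac{\partial \mathbf{x}(u(U), v(V))}{\partial U}, \quad \mathbf{g}_2 := \frac{\partial \mathbf{x}(u(U), v(V))}{\partial V}, \quad \mathbf{g}_3 := \frac{\mathbf{g}_1 \times \mathbf{g}_2}{\|\mathbf{g}_1 \times \mathbf{g}_2\|}. \]

Cartan:
\[ \mathbf{c}_1 := \frac{\mathbf{g}_1}{\|\mathbf{g}_1\|}, \quad \mathbf{c}_2 := \frac{\mathbf{g}_2}{\|\mathbf{g}_2\|}, \quad \mathbf{c}_3 := \mathbf{c}_1 \times \mathbf{c}_2 = *(\mathbf{c}_1 \wedge \mathbf{c}_2). \]

Darboux:
\[ \mathbf{d}_1 := \mathbf{x}' = \frac{d\mathbf{x}(u(s), v(s))}{ds}, \quad \mathbf{d}_2 := \mathbf{d}_3 \times \mathbf{d}_1 = *(\mathbf{d}_3 \wedge \mathbf{d}_1), \quad \mathbf{d}_3 = \mathbf{c}_3 = \mathbf{g}_3. \tag{188} \]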

Those forms of reference are needed to represent $\cos\Psi_l$ and $\cos\Psi_r$, the cosines of the angles between the tangent vectors $\mathbf{C}_1$ and $\mathbf{c}_1$, respectively, and the tangent vectors $\mathbf{D}_1$ and $\mathbf{d}_1$, respectively (also called "Cartan 1" and "Darboux 1"), by means of the scalar products $\langle \mathbf{X}' \mid \mathbf{C}_1 \rangle$ and $\langle \mathbf{x}' \mid \mathbf{c}_1 \rangle$, respectively. Second, according to Box 23, we derive the basic relations $\cos\Psi_l = \sqrt{G_{11}}\,U'$ and $\sin\Psi_l = \sqrt{G_{22}}\,V'$ as well as $\cos\Psi_r = \sqrt{g_{11}}\,u'$ and $\sin\Psi_r = \sqrt{g_{22}}\,v'$. Here $\{U', V'\}$ and $\{u', v'\}$ denote the derivatives of the parameterized curves $C(S)$ and $c(s)$, respectively, with respect to the canonical curve parameters {arc length $S$, arc length $s$}. Third, as outlined in Box 24, by means of the chain rule we derive $\{U', V'\}$ and $\{u', v'\}$, respectively, in terms of the elements of the Jacobi matrices $[\partial\{U,V\}/\partial\{u,v\}]$ and $[\partial\{u,v\}/\partial\{U,V\}]$ and the stretches $ds/dS$ and $dS/ds$, respectively. In this way, we represent $\cos\Psi_l$ and $\sin\Psi_l$ as well as $\cos\Psi_r$ and $\sin\Psi_r$ in terms of the elements of the left and right Cauchy–Green matrices $\mathsf{C}_l$ and $\mathsf{C}_r$, respectively. Fourth, Box 25 leads us to the left and right angular shear, $\Sigma_l$ and $\Sigma_r$, respectively. Our main results are presented in Corollary 5. The proof follows the lines of Box 25, namely, the addition theorem $\tan(x - y) = (\tan x - \tan y)/(1 + \tan x \tan y)$. The functions $\tan\Sigma_l(\Psi_l)$ and $\tan\Sigma_r(\Psi_r)$ establish the optimization criteria for maximal angular distortion. Fifth, the characteristic optimization problem $\Sigma_l(\Psi_l) = \text{extr.}$ or $\Sigma_r(\Psi_r) = \text{extr.}$ is dealt with in Box 26. Indeed, we find the two stationary points $\tan\Psi_l^{\pm}$ and $\tan\Psi_r^{\pm}$. These stationary solutions lead us to the extremal values $\Sigma_l^{\pm}$ and $\Sigma_r^{\pm}$, the celebrated representations
\[ \sin\Sigma_l^{\pm} = \pm\frac{\Lambda_1 - \Lambda_2}{\Lambda_1 + \Lambda_2} \quad\text{versus}\quad \sin\Sigma_r^{\pm} = \pm\frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}. \tag{189} \]
From these extremal values of the left and right angular shear, $\Sigma_l^{\pm}$ and $\Sigma_r^{\pm}$, we derive the left and right maximal angular distortion $\Omega_l$ and $\Omega_r$, respectively, namely,
\[ \Omega_l = 2\arcsin\left|\frac{\Lambda_1 - \Lambda_2}{\Lambda_1 + \Lambda_2}\right| \quad\text{versus}\quad \Omega_r = 2\arcsin\left|\frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}\right|, \tag{190} \]
based upon the symmetry $\Sigma_l^{-} = -\Sigma_l^{+}$ and $\Sigma_r^{-} = -\Sigma_r^{+}$, so that $\Omega_l := \Sigma_l^{+} - \Sigma_l^{-} = 2\Sigma_l^{+}$ and $\Omega_r := \Sigma_r^{+} - \Sigma_r^{-} = 2\Sigma_r^{+}$. Indeed, $\Omega_l$ and $\Omega_r$ are the maximal data of angular distortion.

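[Figure: coordinate lines $U = \text{constant}$ and $V = \text{constant}$ with the Cartan vectors $\mathbf{C}_1$, $\mathbf{C}_2$, the Darboux vector $\mathbf{D}_1$, and the angles $\Psi_l$ and $\pi/2 - \Psi_l$]

Fig. 21 Left angular shear $\Sigma_l := \Psi_l - \Psi_r$, left Gauss frame, left Cartan frame, left Darboux frame, angular shear parameter $\Psi_l$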


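Box 23 (Angular parameters $\Psi_l$ and $\Psi_r$).

"Left":
\[
\begin{aligned}
\mathbf{X}' &:= \frac{\partial \mathbf{X}}{\partial U}\frac{dU}{dS} + \frac{\partial \mathbf{X}}{\partial V}\frac{dV}{dS};\\
\cos\Psi_l &= \|\mathbf{X}'\|\,\|\mathbf{C}_1\|\cos\Psi_l = \langle \mathbf{X}' \mid \mathbf{C}_1 \rangle, \qquad \cos\Psi_l = \sqrt{G_{11}}\,U';\\
\sin\Psi_l &= \cos\left(\frac{\pi}{2} - \Psi_l\right) = \|\mathbf{X}'\|\,\|\mathbf{C}_2\|\cos\left(\frac{\pi}{2} - \Psi_l\right) = \langle \mathbf{X}' \mid \mathbf{C}_2 \rangle, \qquad \sin\Psi_l = \sqrt{G_{22}}\,V'.
\end{aligned}
\]

"Right":
\[
\begin{aligned}
\mathbf{x}' &:= \frac{\partial \mathbf{x}}{\partial u}\frac{du}{ds} + \frac{\partial \mathbf{x}}{\partial v}\frac{dv}{ds};\\
\cos\Psi_r &= \|\mathbf{x}'\|\,\|\mathbf{c}_1\|\cos\Psi_r = \langle \mathbf{x}' \mid \mathbf{c}_1 \rangle, \qquad \cos\Psi_r = \sqrt{g_{11}}\,u';\\
\sin\Psi_r &= \cos\left(\frac{\pi}{2} - \Psi_r\right) = \|\mathbf{x}'\|\,\|\mathbf{c}_2\|\cos\left(\frac{\pi}{2} - \Psi_r\right) = \langle \mathbf{x}' \mid \mathbf{c}_2 \rangle, \qquad \sin\Psi_r = \sqrt{g_{22}}\,v'.
\end{aligned}
\tag{191}
\]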

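Box 24 (Transformation of angular parameters $\Psi_l$ and $\Psi_r$. Special case: $G_{12} = 0$, $c_{12} = 0$, $u(U), v(V)$ versus $U(u), V(v)$).

\[
\begin{aligned}
\cos\Psi_l &= \sqrt{G_{11}}\,U' = \sqrt{G_{11}U'^2}, & \sin\Psi_l &= \sqrt{G_{22}}\,V' = \sqrt{G_{22}V'^2},\\
\cos\Psi_r &= \sqrt{g_{11}}\,u' = \sqrt{g_{11}u'^2}, & \sin\Psi_r &= \sqrt{g_{22}}\,v' = \sqrt{g_{22}v'^2},
\end{aligned}
\tag{192}
\]
with $U' = dU/dS$, $V' = dV/dS$, $u' = du/ds$, $v' = dv/ds$.

"Left":
\[
\begin{aligned}
U' &= \frac{dU}{du}\frac{du}{ds}\frac{ds}{dS}, \qquad V' = \frac{dV}{dv}\frac{dv}{ds}\frac{ds}{dS}\\
&\Rightarrow\quad \cos\Psi_l = \sqrt{G_{11}\left(\frac{dU}{du}\right)^2}\,u'\,\frac{ds}{dS}, \qquad \sin\Psi_l = \sqrt{G_{22}\left(\frac{dV}{dv}\right)^2}\,v'\,\frac{ds}{dS}\\
&\Rightarrow\quad \cos\Psi_l = \sqrt{C_{11}}\,u'\,\frac{ds}{dS}, \qquad \sin\Psi_l = \sqrt{C_{22}}\,v'\,\frac{ds}{dS}.
\end{aligned}
\]

"Right":
\[
\begin{aligned}
u' &= \frac{du}{dU}\frac{dU}{dS}\frac{dS}{ds}, \qquad v' = \frac{dv}{dV}\frac{dV}{dS}\frac{dS}{ds}\\
&\Rightarrow\quad \cos\Psi_r = \sqrt{g_{11}\left(\frac{du}{dU}\right)^2}\,U'\,\frac{dS}{ds}, \qquad \sin\Psi_r = \sqrt{g_{22}\left(\frac{dv}{dV}\right)^2}\,V'\,\frac{dS}{ds}\\
&\Rightarrow\quad \cos\Psi_r = \sqrt{c_{11}}\,U'\,\frac{dS}{ds}, \qquad \sin\Psi_r = \sqrt{c_{22}}\,V'\,\frac{dS}{ds}.
\end{aligned}
\tag{193}
\]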


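Box 25 (The canonical representation of left angular shear and right angular shear. Special case: $G_{12} = 0$, $c_{12} = 0$ and $g_{12} = 0$, $C_{12} = 0$).

Left and right angular parameters $\Psi_l$ and $\Psi_r$:
\[
\begin{aligned}
\cos\Psi_l &= \sqrt{C_{11}}\,\frac{du}{ds}\frac{ds}{dS} & &\text{versus} & \sqrt{c_{11}}\,\frac{dU}{dS}\frac{dS}{ds} &= \cos\Psi_r,\\
\cos\Psi_l &= \sqrt{C_{11}}\,\frac{du}{ds}\frac{1}{\lambda} & &\text{versus} & \sqrt{c_{11}}\,\frac{dU}{dS}\frac{1}{\Lambda} &= \cos\Psi_r,\\
\sin\Psi_l &= \sqrt{C_{22}}\,\frac{dv}{ds}\frac{ds}{dS} & &\text{versus} & \sqrt{c_{22}}\,\frac{dV}{dS}\frac{dS}{ds} &= \sin\Psi_r,\\
\sin\Psi_l &= \sqrt{C_{22}}\,\frac{dv}{ds}\frac{1}{\lambda} & &\text{versus} & \sqrt{c_{22}}\,\frac{dV}{dS}\frac{1}{\Lambda} &= \sin\Psi_r.
\end{aligned}
\tag{194}
\]

Left and right stretches, left and right principal stretches:
\[
\begin{aligned}
\lambda^2 &:= \frac{dS^2}{ds^2} & &\text{versus} & \Lambda^2 &:= \frac{ds^2}{dS^2},\\
\lambda_1^2 &= \frac{C_{11}}{g_{11}} & &\text{versus} & \Lambda_1^2 &= \frac{c_{11}}{G_{11}},\\
\lambda_2^2 &= \frac{C_{22}}{g_{22}} & &\text{versus} & \Lambda_2^2 &= \frac{c_{22}}{G_{22}}.
\end{aligned}
\tag{195}
\]

With $\cos\Psi_r = \sqrt{g_{11}}\,u'$ it follows that
\[
\begin{aligned}
\cos\Psi_l &= \sqrt{\frac{C_{11}}{g_{11}}}\,\frac{1}{\lambda}\cos\Psi_r, & \sin\Psi_l &= \sqrt{\frac{C_{22}}{g_{22}}}\,\frac{1}{\lambda}\sin\Psi_r,\\
\cos\Psi_l &= \frac{\lambda_1}{\lambda}\cos\Psi_r, & \sin\Psi_l &= \frac{\lambda_2}{\lambda}\sin\Psi_r,\\
\tan\Psi_l &= \frac{\lambda_2}{\lambda_1}\tan\Psi_r,
\end{aligned}
\tag{196}
\]
versus, with $\sqrt{G_{11}}\,U' = \cos\Psi_l$,
\[
\begin{aligned}
\sqrt{\frac{c_{11}}{G_{11}}}\,\frac{1}{\Lambda}\cos\Psi_l &= \cos\Psi_r, & \sqrt{\frac{c_{22}}{G_{22}}}\,\frac{1}{\Lambda}\sin\Psi_l &= \sin\Psi_r,\\
\frac{\Lambda_1}{\Lambda}\cos\Psi_l &= \cos\Psi_r, & \frac{\Lambda_2}{\Lambda}\sin\Psi_l &= \sin\Psi_r,\\
\frac{\Lambda_2}{\Lambda_1}\tan\Psi_l &= \tan\Psi_r.
\end{aligned}
\tag{197}
\]

Left and right angular shear, left and right angular distortion:
\[
\begin{aligned}
\tan\Sigma_l &= \tan(\Psi_l - \Psi_r) & &\text{versus} & \tan\Sigma_r &= \tan(\Psi_r - \Psi_l),\\
\tan\Sigma_l &= \frac{\tan\Psi_l - \tan\Psi_r}{1 + \tan\Psi_l\tan\Psi_r} & &\text{versus} & \tan\Sigma_r &= \frac{\tan\Psi_r - \tan\Psi_l}{1 + \tan\Psi_r\tan\Psi_l},\\
\tan\Sigma_l &= \frac{\tan\Psi_l - \Lambda_2\Lambda_1^{-1}\tan\Psi_l}{1 + \Lambda_2\Lambda_1^{-1}\tan^2\Psi_l} & &\text{versus} & \tan\Sigma_r &= \frac{\tan\Psi_r - \lambda_2\lambda_1^{-1}\tan\Psi_r}{1 + \lambda_2\lambda_1^{-1}\tan^2\Psi_r},\\
\tan\Sigma_l &= (\Lambda_1 - \Lambda_2)\frac{\tan\Psi_l}{\Lambda_1 + \Lambda_2\tan^2\Psi_l} & &\text{versus} & \tan\Sigma_r &= (\lambda_1 - \lambda_2)\frac{\tan\Psi_r}{\lambda_1 + \lambda_2\tan^2\Psi_r}.
\end{aligned}
\tag{198}
\]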


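Box 26 (The optimization problem; extremal left angular shear or right angular shear; maximal angular distortion).

Optimization problem:
\[ \Sigma_l^{\pm} = \arg\{\Psi_l \in [0, 2\pi] \mid \Sigma_l(\Psi_l) = \text{extr.}\} \quad\text{versus}\quad \Sigma_r^{\pm} = \arg\{\Psi_r \in [0, 2\pi] \mid \Sigma_r(\Psi_r) = \text{extr.}\}, \tag{199} \]
\[
\begin{aligned}
x &:= \Psi_l,\ f(x) := \Sigma_l(\Psi_l),\ a := \Lambda_1,\ b := \Lambda_2 \quad\text{versus}\quad x := \Psi_r,\ f(x) := \Sigma_r(\Psi_r),\ a := \lambda_1,\ b := \lambda_2,\\
\tan f(x) &= (a - b)\frac{\tan x}{a + b\tan^2 x}.
\end{aligned}
\tag{200}
\]

Stationary points:
\[ (\tan x)' = 1 + \tan^2 x, \tag{201} \]
\[
\begin{aligned}
f'(x) = 0 \ &\Leftrightarrow\ (\tan f(x))' = (1 + \tan^2 f(x))\,f'(x) = 0,\\
(\tan f(x))' &= (a - b)\frac{(1 + \tan^2 x)(a - b\tan^2 x)}{(a + b\tan^2 x)^2},\\
(\tan f(x))' = 0 \ &\Leftrightarrow\ a - b\tan^2 x = 0 \ \Leftrightarrow\ \tan x = \pm\sqrt{\frac{a}{b}},
\end{aligned}
\tag{202}
\]
\[ \tan\Psi_l^{\pm} = \pm\sqrt{\frac{\Lambda_1}{\Lambda_2}} \quad\text{versus}\quad \tan\Psi_r^{\pm} = \pm\sqrt{\frac{\lambda_1}{\lambda_2}}. \tag{203} \]

Extremal left or right angular shear:
\[
\begin{aligned}
\tan\Sigma_l^{\pm} &= \pm\frac{1}{2}\frac{\Lambda_1 - \Lambda_2}{\sqrt{\Lambda_1\Lambda_2}} & &\text{versus} & \tan\Sigma_r^{\pm} &= \pm\frac{1}{2}\frac{\lambda_1 - \lambda_2}{\sqrt{\lambda_1\lambda_2}},\\
& & \sin x &= \frac{\tan x}{\sqrt{1 + \tan^2 x}}, & &\\
\sin\Sigma_l^{\pm} &= \pm\frac{\Lambda_1 - \Lambda_2}{\Lambda_1 + \Lambda_2} & &\text{versus} & \sin\Sigma_r^{\pm} &= \pm\frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}.
\end{aligned}
\tag{204}
\]

Maximal angular distortion:
\[
\begin{aligned}
\Omega_l &:= \Sigma_l^{+} - \Sigma_l^{-} = 2\Sigma_l^{+} & &\text{versus} & \Omega_r &:= \Sigma_r^{+} - \Sigma_r^{-} = 2\Sigma_r^{+},\\
\Omega_l &= 2\arcsin\left|\frac{\Lambda_1 - \Lambda_2}{\Lambda_1 + \Lambda_2}\right| & &\text{versus} & \Omega_r &= 2\arcsin\left|\frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}\right|.
\end{aligned}
\tag{205}
\]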
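The optimization of Box 26 can be checked numerically. The following is a minimal sketch, not part of the original text: it scans the angular parameter over the first quadrant, locates the largest angular shear by brute force, and compares it with the closed forms (203) to (205). The function names and the sample stretches 1.3 and 0.8 are assumptions chosen only for illustration.

```python
import math


def max_angular_distortion(stretch_1, stretch_2):
    """Closed form (205): Omega = 2 * arcsin(|L1 - L2| / (L1 + L2)), in radians."""
    return 2.0 * math.asin(abs(stretch_1 - stretch_2) / (stretch_1 + stretch_2))


def angular_shear(psi, stretch_1, stretch_2):
    """Angular shear from (198): tan(Sigma) = (L1 - L2) tan(Psi) / (L1 + L2 tan^2(Psi))."""
    t = math.tan(psi)
    return math.atan((stretch_1 - stretch_2) * t / (stretch_1 + stretch_2 * t * t))


def brute_force_extremum(stretch_1, stretch_2, samples=20_000):
    """Scan Psi over (0, pi/2) and return (Psi, Sigma) where |Sigma| is largest."""
    best_psi, best_sigma = 0.0, 0.0
    for k in range(1, samples):
        psi = 0.5 * math.pi * k / samples
        sigma = angular_shear(psi, stretch_1, stretch_2)
        if abs(sigma) > abs(best_sigma):
            best_psi, best_sigma = psi, sigma
    return best_psi, best_sigma


if __name__ == "__main__":
    L1, L2 = 1.3, 0.8  # hypothetical principal stretches
    psi_star, sigma_star = brute_force_extremum(L1, L2)
    print("tan(Psi*) scanned:", math.tan(psi_star), "closed form:", math.sqrt(L1 / L2))
    print("Sigma* scanned:   ", sigma_star, "closed form:", math.asin((L1 - L2) / (L1 + L2)))
    print("Omega in degrees: ", math.degrees(max_angular_distortion(L1, L2)))
```

For equal principal stretches the function `max_angular_distortion` returns zero, which is the conformal case.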


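[Figure: coordinate lines $u = \text{constant}$ and $v = f(V) = \text{constant}$ with the Cartan vectors $\mathbf{c}_1$, $\mathbf{c}_2$, the Darboux vector $\mathbf{d}_1$, and the angles $\Psi_r$ and $\pi/2 - \Psi_r$]

Fig. 22 Right angular shear $\Sigma_r := \Psi_r - \Psi_l$, right Gauss frame, right Cartan frame, right Darboux frame, angular shear parameter $\Psi_r$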

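Corollary 5 (The canonical representation of left angular shear and right angular shear. Special case: $G_{12} = 0$, $c_{12} = 0$ and $g_{12} = 0$, $C_{12} = 0$). Let $\Sigma_l := \Psi_l - \Psi_r$ and $\Sigma_r := \Psi_r - \Psi_l$, respectively, denote the left and right angular shear, a measure of the deviation of the mapping $\mathbb{M}^2_l \to \mathbb{M}^2_r$ from conformality. Then, a canonical representation of the angular parameters $\Psi_l$ and $\Psi_r$ as well as of the angular shear parameters $\Sigma_l$ and $\Sigma_r$ is
\[ \tan\Psi_l = \frac{\lambda_2}{\lambda_1}\tan\Psi_r \quad\text{versus}\quad \tan\Psi_r = \frac{\lambda_1}{\lambda_2}\tan\Psi_l = \frac{\Lambda_2}{\Lambda_1}\tan\Psi_l, \tag{206} \]
where the last equality uses $\lambda_i = 1/\Lambda_i$, and
\[ \tan\Sigma_l = (\Lambda_1 - \Lambda_2)\frac{\tan\Psi_l}{\Lambda_1 + \Lambda_2\tan^2\Psi_l} \quad\text{versus}\quad \tan\Sigma_r = (\lambda_1 - \lambda_2)\frac{\tan\Psi_r}{\lambda_1 + \lambda_2\tan^2\Psi_r}. \tag{207} \]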
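As a plausibility check of Corollary 5, the short sketch below compares the left angular shear computed from its definition, the difference of the two angular parameters, with the closed form for its tangent. It is an illustration under stated assumptions, not the author's procedure: the relation between the two angles is taken from Box 25 in the form tan(Psi_r) = (L2 / L1) tan(Psi_l), the angles are restricted to the first quadrant so that the arctangent selects the correct branch, and all function names are hypothetical.

```python
import math


def psi_right(psi_left, L1, L2):
    """Right angular parameter from tan(Psi_r) = (L2 / L1) * tan(Psi_l)."""
    return math.atan((L2 / L1) * math.tan(psi_left))


def shear_by_definition(psi_left, L1, L2):
    """Left angular shear as the difference Sigma_l = Psi_l - Psi_r."""
    return psi_left - psi_right(psi_left, L1, L2)


def shear_closed_form(psi_left, L1, L2):
    """Left angular shear from tan(Sigma_l) = (L1 - L2) tan(Psi_l) / (L1 + L2 tan^2(Psi_l))."""
    t = math.tan(psi_left)
    return math.atan((L1 - L2) * t / (L1 + L2 * t * t))


if __name__ == "__main__":
    L1, L2 = 1.3, 0.8  # hypothetical left principal stretches
    for degrees in (10.0, 30.0, 60.0, 80.0):
        psi = math.radians(degrees)
        print(degrees, shear_by_definition(psi, L1, L2), shear_closed_form(psi, L1, L2))
```

Both columns of the printed table should agree to rounding error for every sampled angle.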

13 Exercise: The Armadillo Double Projection

Exercise: the Armadillo double projection. First: sphere to torus. Second: torus to plane. The oblique orthogonal projection.

An excellent example of a mapping from a left two-dimensional Riemann manifold to a right two-dimensional Riemann manifold, where we have to use all the power of the previous paragraphs, is the Armadillo map modified by Raisz, which is illustrated in Fig. 23. First, points of the sphere $\mathbb{S}^2_R$ of radius $R$ are mapped onto a specific torus $\mathbb{T}^2_{a,b}$. Second, subject to $a = b = R$, $\mathbb{T}^2_{a,b}$ is mapped by an oblique orthogonal projection onto a central plane $\mathbb{P}^2_O$. Such a double projection is analytically presented in Box 27. The first mapping, namely, $\mathbb{S}^2_R \to \mathbb{T}^2_{a,b}$, is fixed by the postulate $\{\lambda = \Lambda/2, \phi = \Phi\}$, which cuts the spherical longitude $\Lambda$ in half to be gauged to the toroidal longitude $\lambda$. In contrast, the spherical latitude $\Phi$ is set identical to the toroidal latitude $\phi$. For generating the second mapping, namely, $\mathbb{T}^2_{a,b} \to \mathbb{P}^2_O$, subject to $a = b = R$, we rotate around the 2 axis by $\beta$ from $\{X, Y, Z\} \in \mathbb{R}^3$ to $\{X', Y', Z'\} \in \mathbb{R}^3$. In consequence, we obtain an orthogonal projection of any point of the specific torus $\mathbb{T}^2_{a,b}$ onto the $Y'$–$Z'$ plane such that $x = Y'$ and $y = Z'$. In this way, we have succeeded in parameterizing the double projection $\mathbb{S}^2_R \to \mathbb{T}^2_{a,b} \to \mathbb{P}^2_O$ by $\{x(\Lambda, \Phi), y(\Lambda, \Phi)\}$. However, we pose the following problems: (i) Determine the left principal stretches $\{\Lambda_1, \Lambda_2\}$ from the direct mapping equations $x(\Lambda, \Phi)$ and $y(\Lambda, \Phi)$ subject to the left matrix $\mathsf{G}_l$ of the metric, the right matrix $\mathsf{G}_r$ of the metric, the left Jacobi matrix $\mathsf{J}_l$, and the left Cauchy–Green matrix $\mathsf{C}_l$ viewed in Box 28. (ii) Prove that the Armadillo double projection is not equiareal. (iii) Prove that the images of the parallel circles of the sphere are ellipses. Determine their semimajor and semiminor axes as well as the location of the center. (iv) Prove that the images of the meridians of the sphere are conic sections.

Fig. 23 Armadillo projection modified by Raisz: double projection, (i) sphere → torus, (ii) torus → plane, obliquity $\beta = 20^\circ$, Tissot ellipses of distortion

Solutions (all problems). Here are some ideas to solve the hard problems. For the second problem, we advise you to prove the inequality $\det[\mathsf{C}_l] \neq \det[\mathsf{G}_l]$. To solve the third problem, choose $\Phi = \text{constant}$, and eliminate $\Lambda$ from the direct equations of the mapping, for instance, by $\sin\Lambda/2 = x/[R(1 + \cos\Phi)]$ as well as $\cos\Lambda/2 = (R\cos\beta\sin\Phi - y)/[R(1 + \cos\Phi)\sin\beta]$. Next, add $\sin^2\Lambda/2 + \cos^2\Lambda/2 = 1$, and you are done. Similarly, to solve the fourth problem, choose $\Lambda = \text{constant}$, and eliminate $\Phi$ from the direct equations of the mapping, for instance, by $1 + \cos\Phi = x/[R\sin\Lambda/2]$, that is, $\cos\Phi = (x - R\sin\Lambda/2)/(R\sin\Lambda/2)$, which yields $\cos^2\Phi$, as well as by $y/R + (x\sin\beta)/(R\tan\Lambda/2) = \cos\beta\sin\Phi$, to be squared to $\cos^2\beta\sin^2\Phi$. Adding $\sin^2\Phi + \cos^2\Phi = 1$ leads to a quadratic form of type $ax^2 + bxy + cy^2 + dx = 0$, indeed a conic section.

With this box, we finish the general consideration of mappings between Riemann manifolds. In the following chapter, we specialize the various rules for mappings between Riemann manifolds and Euclidean manifolds.


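Box 27 (Armadillo double projection. First: sphere to torus. Second: torus to oblique plane).

Left manifold $\mathbb{S}^2_R$, left coordinates (spherical longitude $\Lambda$, spherical latitude $\Phi$):
\[
\begin{aligned}
\mathbb{S}^2_R &:= \{\mathbf{X} \in \mathbb{R}^3 \mid X^2 + Y^2 + Z^2 - R^2 = 0,\ R \in \mathbb{R}^+,\ R > 0\},\\
\Phi_l^{-1}&: \mathbf{X}(\Lambda, \Phi) = \mathbf{E}_1 R\cos\Phi\cos\Lambda + \mathbf{E}_2 R\cos\Phi\sin\Lambda + \mathbf{E}_3 R\sin\Phi.
\end{aligned}
\]
Right manifold $\mathbb{T}^2_{a,b} \cong \mathbb{S}^1_a \times \mathbb{S}^1_b$, right coordinates (toroidal longitude $\lambda$, toroidal latitude $\phi$):
\[
\begin{aligned}
\mathbb{T}^2_{a,b} &:= \left\{\mathbf{x} \in \mathbb{R}^3 \;\middle|\; \left(\sqrt{x^2 + y^2} - a\right)^2 + z^2 - b^2 = 0,\ a \in \mathbb{R}^+,\ b \in \mathbb{R}^+,\ b \le a\right\},\\
\Phi_r^{-1}&: \mathbf{x}(\lambda, \phi) = \mathbf{e}_1 (a + b\cos\phi)\cos\lambda + \mathbf{e}_2 (a + b\cos\phi)\sin\lambda + \mathbf{e}_3\, b\sin\phi.
\end{aligned}
\tag{208}
\]

First mapping:
\[ \begin{bmatrix}\lambda\\ \phi\end{bmatrix} = \begin{bmatrix}\Lambda/2\\ \Phi\end{bmatrix}, \qquad a = b = R. \tag{209} \]

Second mapping (oblique orthogonal projection).

Left manifold $\mathbb{T}^2_{a,b}$ and right manifold (oblique plane) $\mathbb{P}^2_O$:
\[ X = R(1 + \cos\Phi)\cos\Lambda/2, \quad Y = R(1 + \cos\Phi)\sin\Lambda/2, \quad Z = R\sin\Phi; \qquad x = Y', \quad y = Z'; \tag{210} \]
\[
\begin{bmatrix}X\\ Y\\ Z\end{bmatrix} = \mathsf{R}_2(\beta)\begin{bmatrix}X'\\ Y'\\ Z'\end{bmatrix}
\ \Leftrightarrow\
\begin{bmatrix}X'\\ Y'\\ Z'\end{bmatrix} = \mathsf{R}_2^{\mathsf T}(\beta)\begin{bmatrix}X\\ Y\\ Z\end{bmatrix},
\]
\[
\mathsf{R}_2(\beta) = \begin{bmatrix}\cos\beta & 0 & -\sin\beta\\ 0 & 1 & 0\\ \sin\beta & 0 & \cos\beta\end{bmatrix}
\ \Leftrightarrow\
\begin{bmatrix}\cos\beta & 0 & \sin\beta\\ 0 & 1 & 0\\ -\sin\beta & 0 & \cos\beta\end{bmatrix} = \mathsf{R}_2^{\mathsf T}(\beta) = \mathsf{R}_2^{-1}(\beta);
\tag{211}
\]
\[
\begin{aligned}
x &:= Y' = Y, & y &:= Z' = -\sin\beta\, X + \cos\beta\, Z,\\
x &= R(1 + \cos\Phi)\sin\Lambda/2, & y &= -R(1 + \cos\Phi)\sin\beta\cos\Lambda/2 + R\cos\beta\sin\Phi.
\end{aligned}
\tag{212}
\]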
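The mapping equations of Box 27 translate directly into code. The sketch below is an illustration rather than a reference implementation: it evaluates the closed form (212), rebuilds the same point step by step (sphere to torus with a = b = R, rotation about the 2 axis, projection onto the rotated plane), and evaluates the ellipse equation that follows from the elimination of the longitude suggested in the solution hints. The function names are hypothetical; the default obliquity of 20 degrees is the value quoted in the caption of Fig. 23, and the unit radius is an assumption.

```python
import math


def armadillo(lon, lat, radius=1.0, beta=math.radians(20.0)):
    """Closed-form map (212): spherical longitude/latitude in radians to plane (x, y)."""
    half = 0.5 * lon
    x = radius * (1.0 + math.cos(lat)) * math.sin(half)
    y = (-radius * (1.0 + math.cos(lat)) * math.sin(beta) * math.cos(half)
         + radius * math.cos(beta) * math.sin(lat))
    return x, y


def armadillo_via_torus(lon, lat, radius=1.0, beta=math.radians(20.0)):
    """Same map in two steps: sphere -> torus (a = b = R), then rotate and project."""
    lam, phi = 0.5 * lon, lat  # first mapping (209)
    X = radius * (1.0 + math.cos(phi)) * math.cos(lam)
    Y = radius * (1.0 + math.cos(phi)) * math.sin(lam)
    Z = radius * math.sin(phi)
    # Rotated coordinates; only Y' and Z' are needed for the projection.
    y_rot = Y
    z_rot = -math.sin(beta) * X + math.cos(beta) * Z
    return y_rot, z_rot


def parallel_ellipse_residual(lon, lat, radius=1.0, beta=math.radians(20.0)):
    """Residual of (x/A)^2 + ((y - y0)/B)^2 - 1 for the image of a parallel circle.

    A = R(1 + cos(lat)), B = A sin(beta), y0 = R cos(beta) sin(lat).
    """
    x, y = armadillo(lon, lat, radius, beta)
    semi_x = radius * (1.0 + math.cos(lat))
    semi_y = semi_x * math.sin(beta)
    center_y = radius * math.cos(beta) * math.sin(lat)
    return (x / semi_x) ** 2 + ((y - center_y) / semi_y) ** 2 - 1.0


if __name__ == "__main__":
    lon, lat = math.radians(60.0), math.radians(30.0)
    print(armadillo(lon, lat))
    print(armadillo_via_torus(lon, lat))
    print(parallel_ellipse_residual(lon, lat))
```

A vanishing residual along a parallel is consistent with problem (iii): under these assumptions the image of a parallel is an ellipse centered on the y axis.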


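Box 28 (Left principal stretches).

Left and right matrices of the metric:
\[ \mathsf{G}_l = \begin{bmatrix}R^2\cos^2\Phi & 0\\ 0 & R^2\end{bmatrix}, \qquad \mathsf{G}_r := \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}. \tag{213} \]

Left Jacobi matrix:
\[
\mathsf{J}_l := \begin{bmatrix}D_\Lambda x & D_\Phi x\\ D_\Lambda y & D_\Phi y\end{bmatrix}
= R\begin{bmatrix}\tfrac{1}{2}(1 + \cos\Phi)\cos\Lambda/2 & -\sin\Phi\sin\Lambda/2\\ \tfrac{1}{2}(1 + \cos\Phi)\sin\beta\sin\Lambda/2 & \cos\beta\cos\Phi + \sin\beta\sin\Phi\cos\Lambda/2\end{bmatrix}.
\tag{214}
\]

Left Cauchy–Green matrix:
\[
\begin{aligned}
\mathsf{C}_l &:= \mathsf{J}_l^{\mathsf T}\mathsf{G}_r\mathsf{J}_l = \begin{bmatrix}c_{11} & c_{12}\\ c_{12} & c_{22}\end{bmatrix},\\
c_{11} &= x_\Lambda^2 + y_\Lambda^2
= \tfrac{1}{4}R^2(1 + \cos\Phi)^2\left(\cos^2\Lambda/2 + \sin^2\beta\sin^2\Lambda/2\right)
= \tfrac{1}{4}R^2(1 + \cos\Phi)^2\left(1 - \cos^2\beta\sin^2\Lambda/2\right),\\
c_{12} &= x_\Lambda x_\Phi + y_\Lambda y_\Phi\\
&= \tfrac{1}{2}R^2(1 + \cos\Phi)\sin\Lambda/2\left[-\sin\Phi\cos\Lambda/2 + \sin\beta\left(\cos\beta\cos\Phi + \sin\Phi\sin\beta\cos\Lambda/2\right)\right]\\
&= \tfrac{1}{2}R^2(1 + \cos\Phi)\sin\Lambda/2\left(\sin\beta\cos\beta\cos\Phi - \sin\Phi\cos^2\beta\cos\Lambda/2\right)\\
&= \tfrac{1}{2}R^2(1 + \cos\Phi)\sin\Lambda/2\,\cos\beta\left(\sin\beta\cos\Phi - \sin\Phi\cos\beta\cos\Lambda/2\right),
\end{aligned}
\tag{215}
\]
\[
\begin{aligned}
c_{22} &= x_\Phi^2 + y_\Phi^2 = R^2\left[\sin^2\Phi\sin^2\Lambda/2 + \left(\cos\beta\cos\Phi + \sin\Phi\sin\beta\cos\Lambda/2\right)^2\right],
\end{aligned}
\tag{216}
\]
\[
\det[\mathsf{C}_l] = R^4\frac{(1 + \cos\Phi)^2}{4}\left(\sin\beta\sin\Phi + \cos\Phi\cos\beta\cos\Lambda/2\right)^2, \qquad
\det[\mathsf{G}_l] = R^4\cos^2\Phi \neq \det[\mathsf{C}_l].
\]

Left principal stretches:
\[ \left|\mathsf{C}_l - \Lambda_l^2\mathsf{G}_l\right| = 0. \tag{217} \]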
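Problem (i) can be explored numerically with the matrices of Box 28. The following sketch assumes NumPy and solves the general eigenvalue problem (217) by computing the eigenvalues of the product of the inverse left metric matrix with the left Cauchy–Green matrix; this is one common way to treat such a pencil and is not claimed to be the author's method. The function names, the unit radius, and the sample point are assumptions; the obliquity of 20 degrees follows the caption of Fig. 23.

```python
import numpy as np


def jacobi_left(lon, lat, radius=1.0, beta=np.radians(20.0)):
    """Left Jacobi matrix (214) of the Armadillo double projection."""
    half = 0.5 * lon
    return radius * np.array([
        [0.5 * (1.0 + np.cos(lat)) * np.cos(half), -np.sin(lat) * np.sin(half)],
        [0.5 * (1.0 + np.cos(lat)) * np.sin(beta) * np.sin(half),
         np.cos(beta) * np.cos(lat) + np.sin(beta) * np.sin(lat) * np.cos(half)],
    ])


def principal_stretches(lon, lat, radius=1.0, beta=np.radians(20.0)):
    """Return (stretches sorted in descending order, C_l, G_l) at one point of the sphere."""
    J = jacobi_left(lon, lat, radius, beta)
    G_left = np.diag([radius**2 * np.cos(lat) ** 2, radius**2])
    G_right = np.eye(2)
    C_left = J.T @ G_right @ J
    # |C_l - Lambda^2 G_l| = 0 is equivalent to the eigenvalue problem of G_l^{-1} C_l.
    squared = np.linalg.eigvals(np.linalg.solve(G_left, C_left))
    stretches = np.sort(np.sqrt(squared.real))[::-1]
    return stretches, C_left, G_left


if __name__ == "__main__":
    stretches, C_left, G_left = principal_stretches(2.0, 1.0)
    print("principal stretches:", stretches)
    print("areal distortion (product):", stretches.prod())
    print("det C_l / det G_l:", np.linalg.det(C_left) / np.linalg.det(G_left))
```

The product of the two stretches equals the square root of the determinant ratio, so a product different from one at any point illustrates problem (ii): the map is not equiareal.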


References Airy GB (1861) Explanation of a projection by balance of errors for maps applying to a very large extent of the Earth’s surface: comparison of this projection with other projections. Philos Mag 22:409–442 Almansi E (1911) Sulle deformazioni finite di solidi elastici isotropi, Note I. Atti Accad naz Lincei Re Serie Quinta 201:705–714 Amalvict M, Livieratos E (1988) Surface mapping of a rotational reference ellipsoid onto a triaxial counterpart through strain parameters. Manuscripta Geodaetica 13:133–138 Blaschke W, Leichtweiß K (1973) Elementare Differentialgeometrie. Springer, Berlin/Heidelberg/New York Boulware DG, Brown LS, Peccei RD (1970) Deep-inelastic electroproduction and conformal symmetry. Phys Rev D 2:293–298 Bourguignon JP (1970) Transformation infinitesimal conformes fermées des variétés riemanniennes connexes complètes. C R Acad Sci Ser A 270:1593–1596 Cardoso JF, Souloumiac A (1996) Jacobi angles for simultaneous diagonalization. SIAM J Matrix Anal Appl 17:161–164 Cauchy A (1828) Sur les équations qui expriment les conditions d’équilibre ou les lois du mouvement intérieur d’un corps solide, élastique, ou non élastique. Ex De Math 3:160–187 Cauchy A (1889) Oeuvres complétes, Iie série, tome VII. Gauthier–Villars et Fils, pp 82–93 Cauchy A (1890) Sur l’équilibre et le mouvement d’un systéme de points matériels sollicités par des forces d’attraction ou de répulsion mutuelle, Oeuvres Complétes, Iie série, tome VIII. Gauthier–Villars et Fils, pp 227–252 Chen BY (1973) Geometry of submanifolds. Marcel Dekker, New York Chen BY, Yano K (1973) Special conformally flat spaces and canal hypersurfaces. Tohoku Math J 25:77–184 Chern SS, Hartman P, Wintner A (1954) On isothermic coordinates. Commentarii Mathematici Helvetici 28:301–309 Chu MT (1991a) A continuous Jacobi-like approach to the simultaneous reduction of real matrices. Linear Algebra Appl 147:75–96 Chu MT (1991b) Least squares approximation by real normal matrices with specified spectrum. 
SIAM J Matrix Anal Appl 12:115–127 Crumeyrolle A (1990) Orthogonal and sympletic Clifford algebras. Kluwer Academic, Dordrecht/Boston/London De Azcárraga JA, Izquierdo JM (1995) Lie groups, Lie algebras, cohomology and some applications in physics. Cambridge monographs on mathematical physics. Cambridge University Press, Cambridge/New York De Moor B, Zha H (1991) A tree of generalizations of the ordinary singular value decomposition. Linear Algebra Appl 147:469–500 Dermanis A, Livieratos E (1993) Dilatation, shear, rotation and energy: analysis of map projections. Bollettino di Geodesia e Science Affini 42:53–68 Do Carmo MP (1994) Differential forms and applications. Springer, Berlin/New York Do Carmo M, Dajczer M, Mercuri F (1985) Compact conformally flat hypersurfaces. Trans Am Math Soc 288:189–203 Duda FP, Martins LC (1995) Compatibility conditions for the Cauchy–Green strain fields: solutions for the plane case. J Elast 39:247–264


Eisenhart LP (1949) Riemannian geometry. Princeton University Press, Princeton Euler L (1755) Principes généraux des mouvement des fluides, Memoirs de l’Accad. des Sciences de Berlin 11:274–315 Euler L (1777) Über die Abbildung einer Kugelfläche in die Ebene. Acta Academiae Scientiarum Petropolitanae, Petersburg Ferrara S, Grillo AF, Gatto R (1972) Conformal algebra in two space-time dimensions and the Thirring model. Il Nuovo Cinmento 12A:959–968 Finger J (1894) Über die allgemeinsten Beziehungen zwischen den Deformationen und den zugehörigen Spannungen in aerotropen und isotropen Substanzen. Sitzber Akad Wiss Wien (2a) 103:1073–1100 Finzi A (1922) Sulle varieta in rappresentazione conforme con la varieta euclidea a piu di tre dimensioni. Rend Acc Lincei Classe Sci Ser 5–31:8–12 Flanders H (1970) Differential forms with applications to the physical sciences, 4th printing. Academic, London Gauss CF (1822) Allgemeine Auflösung der Aufgabe, die Teile einer gegebenen Fläche auf einer anderen gegebenen Fläche so abzubilden, daß die Abbildung dem Abgebildeten in den kleinsten Teilen ähnlich wird, 1822, Abhandlungen Königl. Gesellschaft der Wissenschaften zu Göttingen, Bd. IV, pp 189–216, Göttingen 1838 Gauss CF (1844) Untersuchungen über Gegenstände der höheren Geodäsie, erste Abhandlung, Abhandlungen der Königl. Gesellschaft der Wissenschaften zu Göttingen, Bd. 2 (1844), Ges. Werke IV, pp 259–334, Göttingen 1880 Gere JM, Weaver W (1965) Matrix algebra for engineers. D. Van Nostrand, New York Goenner H, Grafarend EW, You RJ (1994) Newton mechanics as geodesic flow on Maupertuis’ manifolds: the local isometric embedding into flat spaces. Manuscripta Geodaetica 19:339–345 Grafarend EW (1995) The optimal universal Mercator projection. Manuscripta Geodaetica 20:421–468 Grafarend EW, Syffus R (1998a) The optimal mercator projection and the optimal polycylindric projection of conformal type – case study Indonesia. 
J Geod 72:251–258 Grafarend EW, Syffus R (1998b) The solution of the Korn-Lichtenstein equations of conformal mapping: the direct generation of ellipsoidal Gauss-Krueger conformal coordinates or the transverse mercator projection. J Geod 72:282–293 Grafarend EW, Krumm F (2006) Map projections, cartographic informationsystems, pp 713, Springer, Berlin/New York Grafarend EW, You RJ, Syffus R (2014) Map projections, cartographic information systems, second ed, vol 1: pp 413, vol 2: pp 510, Berlin/New York Green G (1839) On the laws of reflection and refraction of light at the common surface of two non-crystallized media. Trans Camb Philos Soc 7:1–24 Hedrick ER, Ingold L (1925a) Analytic functions in three dimensions. Trans Am Math Soc 27: 551–555 Hedrick ER, Ingold L (1925b) The Beltrami equations in three dimensions. Trans Am Math Soc 27:556–562 Hencky H (1928) Über die Form des Elastizitätsgesetzes bei ideal elastischen Stoffen. Zeitschrift f techn Physik 9:215–220, 457 Heitz S (1988) Coordinates in geodesy. Springer, Berlin/Heidelberg/New York Higham NJ (1986) Computing the polar decomposition. SIAM J Sci Stat Comput 7:1160–1174


Jacobi CGJ (1839) Note von der geodätischen Linie auf einem Ellipsoid und den verschiedenen Anwendungen einer merkwürdigen analytischen Substitution. Crelles J 19:309–313 Karni Z, Reiner M (1960) Measures of deformation in the strained and in the unstrained state. Bull Res Counc Isr 8c 89. Jerusalem Kavrajski VV (1958) Ausgewählte Werke, Mathematische Kartographie, Allgemeine Theorie der kartographischen Abbildungen, Kegel-und Zylinderabbildungen, ihre Anwendungen (russ.) GS VMP, Moskau Kenney C, Laub AJ (1991) Polar decomposition and matrix sign function condition estimates. SIAM J Sci Stat Comput 12:488–504 Klingenberg W (1982) Riemannian geometry. de Gruyter, Berlin/New York Koenig R, Weise KH (1951) Mathematische Grundlagen der höheren Geodäsie und Kartographie, Bd. I. Springer, Berlin/Heidelberg/New York Korn A (1914) Zwei Anwendungen der Methode der sukzessiven Annäherungen. In: Mathematischen Abhandlungen Hermann Amandus Schwarz zu seinem fünfzigjährigen Doktorjubiläum. Springer, Berlin/Heidelberg/New York, pp 215–229 Krueger L (1903) Bemerkungen zu C.F. Gauss: Conforme Abbildungen des Sphäroids in der Ebene, C.F. Gauss, Werke, Königl. Ges. Wiss. Göttingen Bd. IX (1903), pp 195–204 Krueger L (1922) Zur stereographischen Projektion. P. Stankiewicz, Berlin Kuiper NH (1949) On conformally-flat spaces in the large. Ann Math 50:916–924 Kuiper NH (1950) On compact conformally Euclidean spaces of dimension > 2. Ann Math 52: 478–490 Kulkarni RS (1969) Curvature structures and conformal transformations. J Differ Geom 4:425–451 Kulkarni RS (1972) Conformally flat manifolds. Proc Natl Acad Sci USA 69:2675–2676 Kulkarni RS, Pinkall U (eds) (1988) Conformal geometry. Vieweg, Braunschweig/Wiesbaden Lagrange de JL (1781) Sur la construction des cartes geographiques. Nouveaux Mémoires de l’Academie Royale des Sciences et Belles Lettres de Berlin 161–210. Berlin Lancaster GM (1969) A characterization of certain conformally Euclidean spaces of class one. 
Proc Am Math Soc 21:623–628 Lancaster GM (1973) Canonical metrics for certain conformally Euclidean spaces of dimension three and codimension one. Duke Math J 40:1–8 Lichtenstein L (1911) Beweis des Satzes, daß jedes hinreichend kleine, im wesentlichen stetig gekrümmte, singularitätenfreie Flächenstück auf einem Teil einer Ebene zusammenhängend und in den kleinsten Teilen ähnlich abgebildet werden kann. Preußische Akademie der Wissenschaften, Berlin Lichtenstein L (1916) Zur Theorie der konformen Abbildung. Konforme Abbildung nichtanalytischer, singularitätenfreier Flächenstücke auf ebene Gebiete. Anzeiger der Akademie der Wissenschaften in Krakau 2–4:192–217 Liouville J (1850) Extension au cas de trois dimensions de la question du tracé géographique, Note VI, by G. Monge: application de l’analyse à la géométrie, 5ème èdition revue corrigèe par M. Liouville, Bachelier, Paris Macvean DB (1968) Die Elementarbeit in einem Kontinuum und die Zuordnung von Spannungsund Verzerrungstensoren. Zeitschrift für Angewandte Mathematik und Physik 19:137–185 Markuschewitsch AI (1955) Skizzen zur Geschichte der analytischen Funktion. Deutscher Verlag der Wissenschaften, Belin Mirsky L (1960) Symmetric gauge functions and unitary invariant norms. Q J Math 11:50–59 Misner CW (1978) Harmonic maps as models for physical theores. Phys Rev D 18:4510–4524


Mitra SK, Rao CR (1968) Simultaneous reduction of a pair of quadratic forms. Sankya A 30: 312–322 Morman KN Jr (1986) The generalized strain measure with application to nonhomogeneous deformations in rubber-like solids. J Appl Mech 53:726–728 Moore JD (1977) Conformally flat submanifolds of Euclidean space. Math Ann 225:89–97 Mueller B (1991) Kartenprojektionen des dreiachsigen Ellipsoids, Diplomarbeit Geodätisches Institut Universität, Stuttgart Newcomb RW (1960) On the simultaneous diagonalization of two semidefinite matrices. Q J Appl Math 19:144–146 Nishikawa S (1974) Conformally flat hypersurfaces in a Euclidean space. Tohoku Math J 26:563–572 Piola G (1836) Nuova analisi per tutte le questioni della meccanica molecolare. Memorie Mat Fis Soc Ital Sci Modena 21:155–321 Rao CR, Mitra SK (1971) Generalized inverse of matrices and its applications. Wiley, New York Ricci G (1918) Sulla determinazione di varieta dotate di proprietà intrinseche date a priori – note I. Rend Acc Lincei Classe Sci Ser 5, vol 19, Rome 1918 Riemann B (1851) Grundlagen für eine allgemeine Theorie der Funktionen einer veränderlichen complexen Größe, Inauguraldissertation, Göttingen Samelson H (1969) Orientability of hypersurfaces in Rn. Proc Am Math Soc 22:301–302 Schering E (1857) Über die conforme Abbildung des Ellipsoides auf der Ebene, Göttingen 1857, Nachdruck in: Ges. Math. Werke von E. Schering, eds. R. Haussner und K. Schering, 1.Bd. Mayer und Müller Verlag, Berlin 1902 Schmehl H (1927) Untersuchungen über ein allgemeines Erdellipsoid, Veröffentlichungen des Preußischen Geodätischen Institutes, Neue Folge Nr. 98, Potsdam Schoen R (1984) Conformal deformation of a Riemannian metric to constant scalar curvature. J Differ Geom 20:479–495 Schouten JA (1921) Über die konforme Abbildung n-dimensionaler Mannigfaltigkeiten mit quadratischer Maßbestimmung auf eine Mannigfaltigkeit mit Euklidischer Maßbestimmung. Math Z 11:58–88 Searle SR (1982) Matrix algebra useful for statistics. 
Wiley, New York Seth BR (1964a) Generalized strain measure with applications to physical problems. In: Reiner M, Abir D (eds) Second order effects in elasticity, plasticity and fluid dynamics. Pergamon Press, Oxford, pp 162–172 Seth BR (1964b) Generalized strain and transition concepts for elastic-plastic deformation-creep and relaxation. IUTAM symposium. Pergamon Press, München, pp 383–389 Shougen W, Shuqin Z (1991) An algorithm for Ax D ./Bx with symmetric and positive-definite A and B. SIAM J Matrix Anal Appl 12:654–660 Spivak M (1979) A comprehensive introduction to differential geometry, vol 4, 2nd edn. Publish or Perish, Boston Stein EM, Weiss G (1968) Generalization of the Cauchy-Riemann equations and representations of the rotation group. Am J Math 90:163–196 Ting TCT (1985) Determination of C1=2 , C1=2 and more general isotropic tensor functions of C. J Elast 15:319–323 Tissot NA (1881) Mémoire sur la représentation des surfaces et les projections des cartes géographiques. Gauthier-Villars, Paris


Truesdell C, Toupin R (1960) The classical field theories. In: Handbuch der Physik, vol III/I. Springer, Berlin Uhlig F (1973) Simultaneous block diagonalization of two real symmetric matrices. Linear Algebra Appl 7:281–289 Uhlig F (1976) A canonical form for a pair of real symmetric matrices that generate a nonsingurlar pencil. Linear Algebra Appl 14:189–210 Uhlig F (1979) A recurring theorem about pairs of quadratic forms and extensions: a survey. Linear Algebra Appl 25:189–210 Weber H (1867) Über ein Prinzip der Abbildung der Teile einer krummen Oberfläche auf einer Ebene. Journal für die reine und angewandte Mathematik 67:229–247 Weyl H (1918) Reine Infinitesimalgeometrie. Math Z 2:384–411 Weyl H (1921) Zur Infinitesimalgeometrie: Einordnung der projektiven und der konformen Auffassung, Nachr. Königl. Ges. Wiss. Göttingen, Math.-Phys. Klasse, Göttingen Wray T (1974) The seven aspects of a general map projection. Supplement 2, Canadian cartographer 11, monographe no.11, Cartographica. B.V. Gustell Publ./University of Toronto Press, Toronto Yano K (1970) On Riemannian manifolds admitting an infinitesimal conformal transformation. Math Z 113:205–214 Yanushaushas AI (1982) Three-dimensional analogues of conformal mappings (in Russian). Iz da te l’stvo Nauka, Novosibirsk Zadro M, Carminelli A (1966) Rapprezentazione conforme del geoide sull ellissoide internazionale. Bollettino di Geodesia e Scienze Affini 25:25–36 Zund JD (1987) The tensorial form of the Cauchy-Riemann equations. Tensor New Ser 44: 281–290 Zund JD, Moore WA (1987) Conformal geometry, Hotine’s conjecture, and differential geodesy, Department of Mathematical Sciences, New Mexico State University, Scientific report no. 1, Las Cruces


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_56-2 © Springer-Verlag Berlin Heidelberg 2014

Satellite-to-Satellite Tracking (Low–Low/High–Low SST)

Wolfgang Keller
Geodätisches Institut, Universität Stuttgart, Stuttgart, Germany

Abstract

This contribution reviews the mathematical ideas behind the most frequently used techniques for the processing of satellite-to-satellite tracking data. Its emphasis is on the model part rather than on all necessary technicalities in data preprocessing and numerical implementation. The main outcomes of these data-processing strategies, when applied to data of the satellite missions CHAMP and GRACE, are reviewed.

1 Introduction

The dedicated gravity field missions CHAMP and GRACE have attracted and are still attracting the attention of many researchers worldwide. An impressive number of algorithms and approaches have been developed to extract information about the gravitational field of the Earth from the original satellite-to-satellite tracking (SST) signal. This contribution aims at a mathematical description of the basic ideas behind the most frequently used and most developed approaches for the processing of SST data. It will not cover the important issue of data preprocessing, though this preprocessing is vital for the extraction of geophysically meaningful results from the SST data.

2 Scientific Relevance

Geophysical processes close to the Earth's surface that are connected with mass transports, such as the hydrological cycle, imprint their signature on the time variability of the gravitational field of the Earth. Though gravity and gravity changes can be precisely measured at the Earth's surface, these data have a very coarse spatial and temporal resolution. The way out of this sparse data coverage is to use artificial satellites as proof masses in the Earth's gravitational field: The deviation of the orbits of these satellites from simple Keplerian orbits is due to the deviation of the Earth's gravitational field from a rotationally symmetric field. The orbital heights of those satellites have to be large enough to guarantee that the atmospheric friction on the satellite does not exceed the anomalous gravitational acceleration of the satellite. On the other hand, the intensity of the gravitational signal decreases with increasing distance from the mass center of the Earth, and this decay is faster the smaller the scales of the gravitational anomalies are. This means that a recovery of the gravitational field from orbit observations of artificial satellites smooths out smaller details of the gravitational field. This smoothing cannot





be counteracted by a lower orbit because the atmospheric friction would obscure the gravitational signal. This smoothing can only be counteracted by differential measurements, meaning that instead of the orbits of the satellites themselves the orbit differences between two or more satellites are observed. This concept of relative orbit observation and its implications for the recovery of geophysical phenomena is explained in more detail in Rummel (2003). This satellite-to-satellite tracking (SST) principle can be put into mathematical terms in the following way: Assume that there are $n$ satellites which can “see” each other in a certain way. The orbits of these satellites will be denoted by $(x_i(t), \dot x_i(t)),\ i = 1,\dots,n$. The “things” that the satellites $2,\dots,n$ “see” of satellite 1 can be modeled as a vectorial function $F$ of the orbits of all involved satellites:

$$s(t) := F(x_1(t), \dot x_1(t), \dots, x_n(t), \dot x_n(t)) + \varepsilon(t). \tag{1}$$

Here $\varepsilon$ comprises all unavoidable observation errors as well as the errors from imperfect data preprocessing. The signal $s$ is the primary SST signal, and all the desired information about the gravitational field and its variability in space and time has to be extracted from this signal. The gravitational field $V$ is hidden in the orbits of the satellites. Hence a more precise notation of the orbits of the involved satellites is $(x_i(t,V), \dot x_i(t,V)),\ i = 1,\dots,n$, which changes the primary SST model (1) to

$$s(t) := F(x_1(t,V), \dot x_1(t,V), \dots, x_n(t,V), \dot x_n(t,V)) + \varepsilon(t). \tag{2}$$

The recovery of the gravitational field from SST data can be modeled as the following minimization problem:

$$V = \operatorname{argmin}\left\{ \int_0^T \left\| s(t) - F(x_1(t,V), \dot x_1(t,V), \dots, x_n(t,V), \dot x_n(t,V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\}. \tag{3}$$

The minimization is carried out over all functions $V$ that are harmonic in $\Omega$ and regular at infinity ($\mathrm{Harm}(\Omega)$). In most cases $\Omega$ will be the exterior of a sphere of radius $R$. Already at this stage two different modes of SST can be distinguished: the high–low SST (hl-SST) and the low–low SST (ll-SST). In the hl-SST mode the satellites $2,\dots,n$ have such a high orbital altitude that the uncertainties in the gravitational field do not measurably influence their computed orbits. This means the orbits $(x_i(t,V_0), \dot x_i(t,V_0)),\ i = 2,\dots,n$ can be computed using a known reference potential $V_0$ and no longer depend on the unknown gravitational potential $V$. The minimization problem simplifies to

$$V = \operatorname{argmin}\left\{ \int_0^T \left\| s(t) - \hat F(x_1(t,V), \dot x_1(t,V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\} \tag{4}$$

with

$$\hat F(x_1(t,V), \dot x_1(t,V)) := F(x_1(t,V), \dot x_1(t,V), x_2(t,V_0), \dot x_2(t,V_0), \dots, x_n(t,V_0), \dot x_n(t,V_0)). \tag{5}$$

In the ll-SST mode two satellites in the same low orbit are chasing each other and are observing their relative positions and velocities. The corresponding minimization problem is

$$V = \operatorname{argmin}\left\{ \int_0^T \left\| s(t) - F(x_1(t,V), \dot x_1(t,V), x_2(t,V), \dot x_2(t,V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\}. \tag{6}$$

Both the hl-SST (4) and the ll-SST (6) minimization problems are infinite-dimensional problems and as such not suitable for numerical implementation. In order to discretize these infinite-dimensional problems, a parametric model for the unknown gravitational potential has to be introduced:

$$\tilde V(x) = \sum_{z \in C \subset \mathbb{Z}^r} c_z\, \Phi_z(x, \boldsymbol{\theta}_z). \tag{7}$$

In this notation $\Phi_z \in \mathrm{Harm}(\Omega)$ are harmonic basis functions, which, besides the location $x$, can also depend on additional parameters $\boldsymbol{\theta}_z$. The weights $c_z$ in the linear combination (7) can be complex numbers, and the multi-index $z$ ranges over a certain subset $C$ of the $r$-dimensional integers. As a second discretization step, the SST signal $s(t)$ has to be sampled equidistantly:

$$s_i := s(ih), \qquad h > 0. \tag{8}$$

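After that, the infinite-dimensional minimization problems (4) and (6) can be approximated by their finite-dimensional counterparts:

$$\{(c_z, \boldsymbol{\theta}_z) \mid z \in C\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| s_i - \hat F\!\left(x_1(ih, \tilde V(c_z, \boldsymbol{\theta}_z)),\ \dot x_1(ih, \tilde V(c_z, \boldsymbol{\theta}_z))\right) \right\|^2 \tag{9}$$

in the hl-SST mode and

$$\{(c_z, \boldsymbol{\theta}_z) \mid z \in C\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| s_i - F\!\left(x_1(ih, \tilde V(c_z, \boldsymbol{\theta}_z)),\ \dot x_1(ih, \tilde V(c_z, \boldsymbol{\theta}_z)),\ x_2(ih, \tilde V(c_z, \boldsymbol{\theta}_z)),\ \dot x_2(ih, \tilde V(c_z, \boldsymbol{\theta}_z))\right) \right\|^2 \tag{10}$$

in the ll-SST mode. Both (9) and (10) are finite-dimensional nonlinear least-squares problems and can be solved by certain variants of the Levenberg–Marquardt algorithm (Levenberg 1944; Marquardt 1963):

$$(c_z, \boldsymbol{\theta}_z)^{(i+1)} = (c_z, \boldsymbol{\theta}_z)^{(i)} + \left( \frac{\partial F}{\partial (c_z, \boldsymbol{\theta}_z)}^{\!\top} \frac{\partial F}{\partial (c_z, \boldsymbol{\theta}_z)} + \alpha^{(i)} I \right)^{-1} \frac{\partial F}{\partial (c_z, \boldsymbol{\theta}_z)}^{\!\top} \left( s - F\!\left((c_z, \boldsymbol{\theta}_z)^{(i)}\right) \right). \tag{11}$$

With the residual written as $s - F$, the correction is added to the current iterate. The different algorithms for SST data processing differ in the way they compute the entries of the Jacobian $\partial F / \partial (c_z, \boldsymbol{\theta}_z)$ or in how they combine the primary observations $s_i$ into secondary observations, which then are linearly related to the unknown parameters. These questions will be discussed in detail in Sect. 5.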
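The iteration (11) can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions: the function names, the damping update rule (halve on success, multiply by ten on failure), and the exponential test model are hypothetical choices for illustration and are not taken from the SST processing chains discussed here.

```python
import numpy as np

def levenberg_marquardt(model, jacobian, s, p0, alpha=1e-2, n_iter=100):
    """Damped Gauss-Newton iteration in the spirit of Eq. (11).

    model(p)    -> predicted observations F(p)
    jacobian(p) -> matrix dF/dp
    s           -> observation vector
    The damping update below is an assumed, simple heuristic.
    """
    p = np.asarray(p0, dtype=float)
    for _ in range(n_iter):
        r = s - model(p)                       # residual s - F(p)
        J = jacobian(p)
        step = np.linalg.solve(J.T @ J + alpha * np.eye(p.size), J.T @ r)
        p_new = p + step
        if np.sum((s - model(p_new)) ** 2) < np.sum(r ** 2):
            p, alpha = p_new, 0.5 * alpha      # accept step, relax damping
        else:
            alpha *= 10.0                      # reject step, increase damping
    return p

# Toy example (hypothetical model): fit p[0] * exp(-p[1] * t) to noise-free data.
t = np.linspace(0.0, 1.0, 20)
toy_model = lambda p: p[0] * np.exp(-p[1] * t)
toy_jacobian = lambda p: np.column_stack(
    [np.exp(-p[1] * t), -p[0] * t * np.exp(-p[1] * t)]
)
p_hat = levenberg_marquardt(toy_model, toy_jacobian,
                            toy_model(np.array([2.0, 1.5])), [1.0, 1.0])
```

In an actual SST application the Jacobian would come from one of the strategies of Sect. 5 rather than from a closed-form expression.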

3 Conventions

3.1 Reference Systems

In geodynamics a well-defined set of reference systems is in use. For the precise definition of these reference systems and the transformation methods between the individual reference systems, the IERS conventions (Petit and Luzum 2010) can be consulted. Since, by astronomical standards, the time span of SST observations is rather short, three simplified reference systems can be used:

1. A space-fixed system
2. An Earth-fixed system
3. An orbital system

The space-fixed system has its origin in the mass center of the Earth, and its $x_3$-axis points in the direction of the mean rotation axis of the Earth at the initial epoch $t_0$ of the SST observations. Its $x_1$-axis points to the intersection of the mean equatorial plane with the ecliptic at this epoch $t_0$, and the $x_2$-axis completes the former two axes to an orthogonal right-handed Cartesian system. Since the origin of the system is not in unaccelerated motion, this system is not a proper inertial system. The acceleration is due to the attracting forces of Sun, Moon, and planets. If these attracting forces are considered in the form of tidal forces in the data preprocessing, the space-fixed system becomes an inertial system. The differences between the simplified space-fixed system used here and the space-fixed system defined in the IERS conventions have to be taken into account by precession, nutation, and polar-motion corrections. These corrections can be made part of the data preprocessing and will not be discussed here.

The Earth-fixed system has the same $x_3$-axis, but the $x_1$-axis lies on the meridian of Greenwich. The $x_2$-axis completes the former two axes to an orthogonal right-handed Cartesian system. The angle $\Theta$ between the $x_1$-axis of the space-fixed and the $x_1$-axis of the Earth-fixed system is called Greenwich sidereal time. Hence, the coordinates of a point change from the space-fixed to the Earth-fixed system according to

$$x_{Ef} = R_3(\Theta)\, x_{sf}, \qquad R_3(\alpha) := \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{12}$$

Fig. 1 Relations between reference systems

The orbital system is related to a fictitious satellite with an exactly circular orbit in the gravitational field of a flattened Earth. Its $x_3$-axis is perpendicular to the orbital plane and its $x_1$-axis points to the fictitious satellite. The $x_2$-axis again completes a right-handed orthogonal Cartesian system. According to Kaula (2000), for such a satellite, the angle $\Omega$ between the $x_1$-axis of the space-fixed system and the nodal line, i.e., the intersection between equatorial and orbital plane, changes with time as

$$\Omega = \Omega_0 - \frac{3n}{2} J_2 \left(\frac{R}{a}\right)^{2} \cos i \cdot t, \qquad n = \sqrt{\frac{GM}{a^3}},$$

with $GM$ being the product of gravitational constant and mass of the Earth; $J_2$ being the dynamic form factor (cf. Petit and Luzum 2010); $i$ being the inclination of the orbit, i.e., the angle between the equatorial and the orbital plane; and $a$ being the radius of the orbital circle (Fig. 1). The argument of latitude $u$, i.e., the angle between the nodal line and the position of the satellite, is given by

$$u = u_0 + \left( n + \frac{3n}{4} J_2 \left(\frac{R}{a}\right)^{2} \left(3\cos^2 i - 1\right) \right) \cdot t. \tag{13}$$

This means the transformation from the space-fixed to the orbital system is given by

$$x_{orb} = R_3(u)\, R_1(i)\, R_3(\Omega)\, x_{sf}, \qquad R_1(\alpha) := \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}. \tag{14}$$
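The transformations (12) to (14) are easy to check numerically. The sketch below assumes the passive rotation matrices exactly as written in (12) and (14) and the secular rates as quoted from Kaula; the function names are hypothetical, and any numerical constants passed in (for instance $GM$, $R$, $J_2$) must be supplied by the user.

```python
import numpy as np

def R1(angle):
    """Rotation about the x1-axis, as defined in Eq. (14)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def R3(angle):
    """Rotation about the x3-axis, as defined in Eq. (12)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def space_to_earth_fixed(x_sf, theta):
    """Eq. (12): theta is the Greenwich sidereal time in radians."""
    return R3(theta) @ x_sf

def space_to_orbital(x_sf, u, inc, Omega):
    """Eq. (14): x_orb = R3(u) R1(i) R3(Omega) x_sf."""
    return R3(u) @ R1(inc) @ R3(Omega) @ x_sf

def secular_angles(t, a, inc, Omega0, u0, GM, R, J2):
    """Node Omega(t) and argument of latitude u(t) of the fictitious
    circular orbit, following the drift formula for Omega and Eq. (13)."""
    n = np.sqrt(GM / a**3)
    Omega = Omega0 - 1.5 * n * J2 * (R / a) ** 2 * np.cos(inc) * t
    u = u0 + (n + 0.75 * n * J2 * (R / a) ** 2 * (3.0 * np.cos(inc) ** 2 - 1.0)) * t
    return Omega, u
```

By construction, the fictitious satellite itself has the orbital-system coordinates $(a, 0, 0)^\top$, which gives a simple consistency check for an implementation.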


3.2 Basis Functions

There are two types of basis functions for the representation of the gravitational potential that are used in the SST community: spherical harmonics and radial basis functions. Nevertheless, they are used in different normalizations. Here, the standard used in Wolfram MathWorld (Weisstein) will be applied. The surface spherical harmonics are defined as

$$Y_{l,m}(\vartheta,\lambda) = \begin{cases} \sqrt{\dfrac{(2l+1)}{4\pi}\dfrac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\vartheta)\, e^{\imath m\lambda}, & m \ge 0, \\[2ex] (-1)^m\, \overline{Y}_{l,-m}(\vartheta,\lambda), & m < 0, \end{cases} \tag{15}$$

where $\overline{Y}_{l,m}$ is the complex conjugate of $Y_{l,m}$, the $P_l^m$ are the Legendre functions, and $\vartheta, \lambda$ are spherical coordinates related to the Earth-fixed system. Assume now that the original coordinate system is rotated by the three Eulerian angles $\alpha, \beta$, and $\gamma$:

$$R(\alpha,\beta,\gamma) := R_3(\gamma)\, R_2(\beta)\, R_3(\alpha), \qquad R_2(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix},$$

and that the spherical coordinates in the rotated system are denoted by $\vartheta', \lambda'$. Then a spherical harmonic in the non-rotated coordinates can be expressed as a linear combination of spherical harmonics of the same degree in the rotated coordinates,

$$Y_{l,m}(\vartheta,\lambda) = \sum_{k=-l}^{l} D^{l}_{m,k}(\alpha,\beta,\gamma)\, Y_{l,k}(\vartheta',\lambda'), \tag{16}$$

with

$$D^{l}_{m,k}(\alpha,\beta,\gamma) = e^{\imath m\alpha}\, d^{l}_{m,k}(\beta)\, e^{\imath k\gamma} \tag{17}$$

being the so-called Wigner functions according to Kostelec and Rockmore (2008). The representation of the gravitational potential as a spherical harmonics expansion in the Earth-fixed system is given by

$$V(r,\vartheta,\lambda) = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} K_{l,m}\, Y_{l,m}(\vartheta,\lambda), \qquad K_{l,m} \in \mathbb{C}. \tag{18}$$

This means that, in the case of spherical harmonics in an Earth-fixed system as basis functions, the general notation specializes to

$$\Phi_{l,m}(x) = \frac{GM}{R} \left(\frac{R}{r}\right)^{l+1} Y_{l,m}(\vartheta,\lambda), \qquad (l,m) \in C = \left\{ (i,j) \in \mathbb{Z}^2 \mid i \ge 0,\ |j| \le i \right\}. \tag{19}$$


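Apart from the index vector $z = (l,m)$ and the position vector $x$, the basis functions do not depend on additional parameters $\boldsymbol{\theta}_z$. Since the transition from the Earth-fixed system to the orbital system is accomplished by the rotation

$$R\!\left(\Omega - \Theta - \frac{\pi}{2},\ i,\ u + \frac{\pi}{2}\right), \tag{20}$$

the gravitational potential at the position of the fictitious satellite is given by

$$V(r,u,\Lambda) = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} \sum_{k=-l}^{l} K_{l,m}\, \imath^{k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{k}(0)\, e^{\imath(ku+m\Lambda)}, \tag{21}$$

with $\Lambda = \Omega - \Theta$ and

$$\overline{P}_l^{m}(x) = \begin{cases} \sqrt{\dfrac{(2l+1)}{4\pi}\dfrac{(l-m)!}{(l+m)!}}\; P_l^m(x), & m \ge 0, \\[2ex] (-1)^m\, \overline{P}_l^{-m}(x), & m < 0. \end{cases} \tag{22}$$

Hence, in the orbital system, the general notation of a basis function is specialized to

$$\Phi_{m,k}(u,\Lambda) = \left[ \frac{GM}{R} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+1} \imath^{k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{k}(0) \right] e^{\imath(ku+m\Lambda)}. \tag{23}$$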

This means that in the orbital system the basis functions become imaginary exponentials defined on the torus $[0,2\pi) \times [0,2\pi)$. The domain of definition gave this representation its name: torus approach (Sneeuw 2000). A satellite position with respect to the Earth is mapped onto the position $(\Lambda, u)$ on the torus. Since the Earth has the topological genus 0 and the torus has the topological genus 1, there is no one-to-one mapping between the satellite positions in an Earth-fixed system and the positions on a torus. Two different points on the torus can correspond to the same point related to the Earth: once the point is reached on an ascending and once on a descending orbital arc.

A second kind of basis functions are the so-called radial basis functions. A radial basis function (RBF) on the sphere is a function which depends only upon the distance of its argument from the north pole of the sphere, i.e.,

$$\Phi(x) = \tilde\Phi\!\left(e_3^\top \xi\right), \qquad \xi = \frac{x}{\|x\|}, \qquad e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{24}$$


Radial basis functions are used to approximate functions defined on the sphere as linear combinations of rotated versions of $\Phi$. An RBF rotated to the position $\eta$ is a function which depends only on the distance of its argument from that position:

$$\Phi(x,\eta) = \tilde\Phi\!\left(\eta^\top \xi\right). \tag{25}$$

Since $-1 \le \eta^\top \xi \le 1$ holds, any square-integrable rotated RBF must have a series expansion in Legendre polynomials:

$$\Phi(x,\eta) = \sum_{n=0}^{\infty} \sigma_n\, P_n\!\left(\eta^\top \xi\right). \tag{26}$$

This means that in an Earth-fixed system and for rotated RBFs, the general notation of a basis function specializes to

$$\Phi_l(x,\boldsymbol{\theta}_l) = \sum_{n=0}^{\infty} \sigma_n\, P_n\!\left(\eta_l^\top \xi\right), \tag{27}$$

with

$$\boldsymbol{\theta}_l = \left\{ \eta_l, \{\sigma_n\} \right\}. \tag{28}$$

So far, RBFs have been defined on the surface of the unit sphere only. Their harmonic continuation to the exterior of the mean Earth sphere is given by

$$\Phi_l(x,\boldsymbol{\theta}_l) = \sum_{n=0}^{\infty} \sigma_n \left(\frac{R}{r}\right)^{n+1} P_n\!\left(\eta_l^\top \xi\right). \tag{29}$$

Since the Legendre polynomials $P_n$ are sums of products of surface spherical harmonics,

$$P_n\!\left(\eta^\top \xi\right) = \frac{4\pi}{2n+1} \sum_{m=-n}^{n} \overline{Y}_{n,m}(\xi)\, Y_{n,m}(\eta), \tag{30}$$

the rotated RBF $\Phi(x,\boldsymbol{\theta}_l)$ can be expressed as

$$\Phi(x,\eta_l) = \sum_{n=0}^{\infty} \frac{4\pi\sigma_n}{2n+1} \sum_{m=-n}^{n} \overline{Y}_{n,m}(\xi)\, Y_{n,m}(\eta_l). \tag{31}$$

In this representation the position parameter $\eta_l$ and the argument $\xi$ are separated. Since the position parameter $\eta_l$ is the following rotation of the vector $e_3$,

$$\eta_l = R_3(\lambda_l)\, R_2(\vartheta_l)\, e_3, \qquad \eta_l = \begin{pmatrix} \sin\vartheta_l \cos\lambda_l \\ \sin\vartheta_l \sin\lambda_l \\ \cos\vartheta_l \end{pmatrix}, \tag{32}$$

the representation (31) is equivalent to

$$\Phi(x,\eta_l) = \sum_{n=0}^{\infty} \frac{4\pi\sigma_n}{2n+1} \sum_{m=-n}^{n} \left( \sum_{k=-n}^{n} d^{n}_{m,k}(\vartheta_l)\, e^{\imath k\lambda_l}\, Y_{n,k}(e_3) \right) \overline{Y}_{n,m}(\xi). \tag{33}$$
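A truncated version of the harmonically continued RBF (29) can be evaluated directly from its Legendre series. The following sketch assumes that the coefficient sequence $\sigma_n$ is given as a finite array (truncation at `len(sigma) - 1`) and uses NumPy's Legendre series evaluation; the function name and the argument layout are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def rbf_value(x, eta, sigma, R):
    """Evaluate sum_n sigma_n (R/r)^(n+1) P_n(eta . xi), cf. Eq. (29).

    x     : Cartesian evaluation point (3,)
    eta   : unit vector of the RBF centre (3,)
    sigma : Legendre coefficients sigma_0 .. sigma_N (truncated series)
    R     : radius of the reference sphere
    """
    x = np.asarray(x, dtype=float)
    r = np.linalg.norm(x)
    xi = x / r                                   # unit vector x / ||x||
    t = float(np.dot(eta, xi))                   # cosine of spherical distance
    n = np.arange(len(sigma))
    coeff = np.asarray(sigma, dtype=float) * (R / r) ** (n + 1)
    return legendre.legval(t, coeff)             # sum_n coeff_n P_n(t)
```

For $\sigma_n = 1$ the series is the generating function of the Legendre polynomials, so the truncated sum can be compared against $h/\sqrt{1 - 2ht + h^2}$ with $h = R/r$.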

A comparison of (31) with (18) allows the conversion of an RBF representation into the corresponding surface spherical harmonics representation. So far, both RBF representations refer to an Earth-fixed system. The representation of an RBF in the orbital system is

$$\begin{aligned} \Phi(u,\Lambda,\boldsymbol{\theta}_l) &= \sum_{n=0}^{\infty} \frac{4\pi\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \sum_{m=-n}^{n} \left( \sum_{k=-n}^{n} \imath^{k-m}\, d^{n}_{m,k}(i)\, e^{\imath(ku+m\Lambda)}\, \overline{P}_n^{k}(0) \right) Y_{n,m}(\eta_l) \\ &= \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \left[ \sum_{n=\max\{|k|,|m|\}}^{\infty} \frac{4\pi\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \imath^{k-m}\, d^{n}_{m,k}(i)\, \overline{P}_n^{k}(0)\, Y_{n,m}(\eta_l) \right] e^{\imath(ku+m\Lambda)}. \end{aligned} \tag{34}$$

This means that, as in the case of spherical harmonics, the RBFs in the orbital system are imaginary exponentials on the torus. A basis function on the torus is of particular importance if the underlying orbit is a so-called “repeat orbit.” For a repeat orbit,

$$\frac{\dot u}{\dot\Lambda} = \frac{\beta}{\alpha} \tag{35}$$

holds, with $\alpha, \beta$ relatively prime. Then

$$\psi_{m,k} := ku + m\Lambda = \underbrace{\frac{\dot u}{\beta}(k\beta + m\alpha)}_{\dot\psi_{m,k}}\, t = \dot\psi_{m,k}\, t \tag{36}$$

holds. Due to this relation between the time $t$ and the location $(u,\Lambda)$ on the torus, the basis function, evaluated along a repeat orbit, changes into a periodic time function:

$$\Phi_{m,k}(t) = \left[ \frac{GM}{R} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+1} \imath^{k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{k}(0) \right] e^{\imath\dot\psi_{m,k} t} \tag{37}$$

and

$$\Phi(t,\boldsymbol{\theta}_l) = \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \left[ \sum_{n=\max\{|k|,|m|\}}^{\infty} \frac{4\pi\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \imath^{k-m}\, d^{n}_{m,k}(i)\, \overline{P}_n^{k}(0)\, Y_{n,m}(\eta_l) \right] e^{\imath\dot\psi_{m,k} t}, \tag{38}$$

respectively. The period of these functions is given by $T = \alpha$ nodal days.
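The periodicity claim can be verified numerically: over one repeat period every phase $\dot\psi_{m,k}\,T$ is an integer multiple of $2\pi$. The sketch below assumes that a nodal day is $2\pi/|\dot\Lambda|$ and that $u$ and $\Lambda$ both start at zero; the function names and the sample values for $\alpha$, $\beta$, and $\dot u$ are hypothetical.

```python
import numpy as np

def psi_dot(m, k, u_dot, alpha, beta):
    """Frequency of the (m, k) torus exponential along a repeat orbit, Eq. (36)."""
    return u_dot / beta * (k * beta + m * alpha)

def repeat_period(u_dot, alpha, beta):
    """Repeat period T = alpha nodal days (assumed: nodal day = 2*pi/|Lambda_dot|)."""
    lambda_dot = alpha * u_dot / beta          # from u_dot / Lambda_dot = beta / alpha
    nodal_day = 2.0 * np.pi / abs(lambda_dot)
    return alpha * nodal_day

# Hypothetical repeat orbit: beta = 31 revolutions in alpha = 2 nodal days.
alpha, beta, u_dot = 2, 31, 1.1e-3             # u_dot in rad/s
T = repeat_period(u_dot, alpha, beta)
phase = psi_dot(3, -2, u_dot, alpha, beta) * T  # equals 2*pi*(k*beta + m*alpha)
```

Here `phase / (2 * np.pi)` is the integer $k\beta + m\alpha$, which is why each term of (37) and (38) returns to its initial value after the time $T$.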

4 Celestial Mechanics

In an inertial system, the motion of a unit point mass is governed by the Newtonian equations

$$\ddot x(t) = \nabla V(x(t),t) + F(x(t), \dot x(t), t), \tag{39}$$

where $x(t)$ is the Cartesian position of the unit point mass representing the satellite, $V$ is the potential of the gravitational force acting on it, and $F$ is the nonconservative force exerted on the satellite, for example, by atmospheric friction. In a rotating system inertial forces have to be added, and the equation of motion changes from (39) to

$$\ddot x'(t) = \nabla V'(x'(t),t) + F'(x'(t), \dot x'(t), t) \underbrace{-\ \omega \times \left(\omega \times x'(t)\right)}_{\text{centrifugal force}} \underbrace{-\ 2\,\omega \times \dot x'(t)}_{\text{Coriolis force}}. \tag{40}$$

Here $\omega$ is the rotation vector of the system, and all the quantities denoted by a prime refer to that rotating system. Since we can assume $\dot\omega = 0$, the inclusion of the Eulerian force is not necessary. Here, we distinguish two rotating systems: the Earth-fixed and the orbital system. For the Earth-fixed system, $\omega = 7.292115 \times 10^{-5}\, e_3$ (in rad/s) holds, and $V'$ no longer depends explicitly on $t$. For the orbital system

$$\omega = \sqrt{\frac{GM}{a^3}} \begin{pmatrix} \sin i \sin\Omega \\ -\sin i \cos\Omega \\ \cos i \end{pmatrix}$$

holds, but in this case the gravitational potential $V'$ still depends explicitly on $t$. The equations of motion can be solved as an initial value problem or as a boundary value problem. In the case of an initial value problem, position $x'(0)$ and velocity $\dot x'(0)$ at the beginning of the orbital arc have to be known, while in the case of the boundary value problem, the positions $x'(0)$ at the beginning and $x'(T)$ at the end of the orbital arc have to be given.


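In the case of a boundary value problem, it is convenient to transform the equations of motion into an equivalent integral equation (cf. Schneider 1968):

$$\tilde x(\tau) = \tilde x(0)(1-\tau) + \tilde x(1)\,\tau - T^2 \int_0^1 K(\tau,\tau')\, f(\tau')\, d\tau', \tag{41}$$

where

$$\tau := \frac{t}{T}, \qquad \tilde x(\tau) := x'(\tau T),$$

and the integral kernel $K$ is given by

$$K(\tau,\tau') := \begin{cases} \tau'(1-\tau), & \tau' \le \tau, \\ \tau(1-\tau'), & \tau' > \tau. \end{cases} \tag{42}$$

The function $f$ comprises all forces acting on the satellite:

$$f(\tau') := \nabla V'\!\left(x'(\tau' T), \tau' T\right) + F'\!\left(x'(\tau' T), \dot x'(\tau' T), \tau' T\right) - \omega \times \left(\omega \times x'(\tau' T)\right) - 2\,\omega \times \dot x'(\tau' T). \tag{43}$$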
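The structure of (41) and (42) can be made concrete with a small quadrature sketch. It assumes, for illustration only, that the force function is already known as a function of $\tau'$; in the real problem $f$ depends on the unknown orbit, so (41) would have to be solved iteratively. The function names and the trapezoidal rule are choices made for this example.

```python
import numpy as np

def kernel(tau, tau_p):
    """Integral kernel K(tau, tau') of Eq. (42)."""
    return np.where(tau_p <= tau, tau_p * (1.0 - tau), tau * (1.0 - tau_p))

def boundary_value_orbit(x_start, x_end, T, f, n=201):
    """Evaluate Eq. (41) on an equidistant grid of normalized time tau.

    f(tau') must return the total force (3,) at normalized time tau'.
    Returns the grid tau (n,) and the positions (n, 3).
    """
    tau = np.linspace(0.0, 1.0, n)
    forces = np.array([f(tp) for tp in tau])           # shape (n, 3)
    K = kernel(tau[:, None], tau[None, :])             # shape (n, n)
    w = np.full(n, 1.0 / (n - 1))                      # trapezoidal weights
    w[0] = w[-1] = 0.5 / (n - 1)
    integral = (K * w) @ forces                        # shape (n, 3)
    line = np.outer(1.0 - tau, x_start) + np.outer(tau, x_end)
    return tau, line - T**2 * integral
```

For a constant force $g$ the integral of the kernel is $\tau(1-\tau)/2$, so the result must be the straight line between the boundary positions minus $T^2 g\,\tau(1-\tau)/2$, which is a convenient check of signs and scaling.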

5 Observation Models and Data Processing Strategies

5.1 Satellite-to-Satellite Tracking in the High–Low Mode

The high–low mode is the scenario in which a low-flying satellite is tracked by several high-flying satellites. Since the orbits of the high-flying satellites can be computed sufficiently precisely from existing gravity field models, their orbits do not contribute to an improvement of the knowledge about the gravitational field. Hence, the positions of the high-flying satellites are used as known reference positions, and the position of the low-flying satellite is determined in reference to these known positions. In general, GPS satellites are used as high-flying satellites, and the low-flying satellite tracks its position and velocity in an Earth-fixed system, relative to the known positions and velocities of the GPS satellites, by an onboard GPS receiver. If this is put in relation to the general SST model (4) and (6), we obtain

$$s(t) = \hat F\!\left(x'(V',t), \dot x'(V',t)\right) + \varepsilon(t) := \begin{pmatrix} x'(V',t) \\ \dot x'(V',t) \end{pmatrix} + \varepsilon(t) \tag{44}$$

and

$$V = \operatorname{argmin} \int_0^T \left\| s(t) - \hat F\!\left(x'(V',t), \dot x'(V',t)\right) \right\|^2 dt. \tag{45}$$


If this problem is discretized according to (7) and (8), it results in a nonlinear least-squares problem

$$\{(c_z, \boldsymbol{\theta}_z) \mid z \in C\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| s_i - \hat F\!\left(x'(V',t_i), \dot x'(V',t_i)\right) \right\|^2. \tag{46}$$

In order to solve this least-squares problem, the entries of the Jacobian

$$\frac{\partial \hat F}{\partial (c_z, \boldsymbol{\theta}_z)} = \frac{\partial \hat F}{\partial x'} \frac{\partial x'\!\left(\tilde V(c_z, \boldsymbol{\theta}_z)\right)}{\partial (c_z, \boldsymbol{\theta}_z)} + \frac{\partial \hat F}{\partial \dot x'} \frac{\partial \dot x'\!\left(\tilde V(c_z, \boldsymbol{\theta}_z)\right)}{\partial (c_z, \boldsymbol{\theta}_z)}$$

have to be known. As a standard approach, they are computed by solving the so-called variational equations (Ballani 1988). Since this technique is numerically costly, some less costly techniques have been developed, which convert the position and velocity information into synthetic observations that are directly related to the unknown potential. Those techniques are named:

• Acceleration approach
• Energy-balance approach
• Torus approach

A technique that is strongly related to the variational equation approach is the so-called integral equation approach.

5.1.1 Energy-Balance Approach

The energy-balance approach is one of the oldest ideas in dynamical satellite geodesy. It was first discussed in O’Keefe (1957), Reigber (1969), and Bjerhammar (1976). The method acquired practical importance only with the availability of temporally and spatially dense data, as, for instance, delivered by the satellite mission CHAMP. There are numerous publications about the application of the energy-balance approach to CHAMP data. Without any ranking, the following contributions are mentioned: Visser et al. (2003), Gerlach et al. (2003), and Badura et al. (2006). In an inertial frame the Lagrangian of the low-flying satellite is given as

$$L = T - V = \frac{1}{2} \dot x^\top \dot x - V. \tag{47}$$

In the Earth-fixed system, the Lagrangian changes to

$$L = \frac{1}{2} \dot x'^\top \dot x' + \dot x'^\top \left(\omega \times x'\right) + \frac{1}{2} \left(\omega \times x'\right)^\top \left(\omega \times x'\right) - V. \tag{48}$$

A change from the Lagrangian to the Hamiltonian

$$H = p^\top \dot x' - L, \qquad p = \frac{\partial L}{\partial \dot x'} \tag{49}$$


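yields

$$H = \frac{1}{2} \dot x'^\top \dot x' - \frac{1}{2} \left(\omega \times x'\right)^\top \left(\omega \times x'\right) + V. \tag{50}$$

Since the Hamiltonian is constant and $x', \dot x'$ are known from orbit integration, this leads to an observation equation for the unknown parameters $\{c_z, \boldsymbol{\theta}_z\}$ in the basis function representation $\tilde V$ of $V$. So far all nonconservative forces acting on the satellite have been ignored. Taking them into account adds another term to the energy-balance equation:

$$H = \frac{1}{2} \dot x'^\top \dot x' - \frac{1}{2} \left(\omega \times x'\right)^\top \left(\omega \times x'\right) + V + \int_0^t f^\top \dot x'\, dt'. \tag{51}$$

This last term is the line integral of the nonconservative force $f$ along the orbital path. If the nonconservative force is measured or modeled, this line integral can be evaluated for each observation epoch $t = ih$. Inserting the basis function representation $\tilde V$ and the known positions $x'_i := x'(ih)$ and velocities $\dot x'_i := \dot x'(ih)$ yields the following observation equation:

$$\frac{1}{2} \left\|\dot x'_i\right\|^2 - \frac{1}{2} \left(\omega \times x'_i\right)^\top \left(\omega \times x'_i\right) + \int_0^{ih} f^\top \dot x'\, dt' = H - \sum_{z \in C \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(x'_i, \boldsymbol{\theta}_z\right). \tag{52}$$

Hence the observations $s_i$ in (52) are the sum of kinetic energy, potential energy, and dissipative work

$$s_i = \frac{1}{2} \left\|\dot x'_i\right\|^2 - \frac{1}{2} \left(\omega \times x'_i\right)^\top \left(\omega \times x'_i\right) + \int_0^{ih} f^\top \dot x'\, dt', \tag{53}$$

and the least-squares problem

$$\sum_{i=0}^{N} \left( s_i - H + \sum_{z \in C \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(x'_i, \boldsymbol{\theta}_z\right) \right)^{2} \to \min \tag{54}$$

is partly linear and partly nonlinear in the unknown parameters $H, c_z, \boldsymbol{\theta}_z$. In order to relate the least-squares problem (54) to the generic setting (9), the following definition has to be made:

$$\hat F\!\left(x'(ih, \tilde V), \dot x'(ih, \tilde V)\right) = H - \sum_{z \in C \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(x', \boldsymbol{\theta}_z\right).$$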
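The pseudo-observations (53) are straightforward to compute from an orbit given in the Earth-fixed frame. The sketch below is an illustration under stated assumptions: positions, velocities, and nonconservative accelerations are available at the equidistant epochs $t = ih$, the work integral is approximated by the cumulative trapezoidal rule, and the sign conventions are those of (51) to (53). The function name is hypothetical.

```python
import numpy as np

def energy_balance_observations(x, v, omega, f, h):
    """Pseudo-observations s_i of Eq. (53).

    x, v  : positions and velocities in the rotating frame, shape (N, 3)
    omega : rotation vector of the frame, shape (3,)
    f     : nonconservative force per unit mass at the epochs, shape (N, 3)
    h     : sampling interval
    """
    x, v, f = (np.asarray(arr, dtype=float) for arr in (x, v, f))
    wx = np.cross(omega, x)                              # omega x x' per epoch
    kinetic = 0.5 * np.sum(v * v, axis=1)
    centrifugal = 0.5 * np.sum(wx * wx, axis=1)
    power = np.sum(f * v, axis=1)                        # integrand f^T x_dot'
    work = np.concatenate(
        ([0.0], np.cumsum(0.5 * h * (power[1:] + power[:-1])))
    )                                                    # trapezoidal line integral
    return kinetic - centrifugal + work
```

With these values, (54) becomes a linear least-squares problem in $H$ and $c_z$ whenever the basis functions carry no free parameters $\boldsymbol{\theta}_z$.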

For the solution of this least-squares problem, the entries of the Jacobian

$$X_l = \begin{pmatrix} -\dfrac{\left(x'_l(t_1) - x'(t_1)\right)^\top}{\left\|x'_l(t_1) - x'(t_1)\right\|}\, \zeta_1(t_1) & \dots & -\dfrac{\left(x'_l(t_1) - x'(t_1)\right)^\top}{\left\|x'_l(t_1) - x'(t_1)\right\|}\, \zeta_N(t_1) \\ \vdots & & \vdots \\ -\dfrac{\left(x'_l(t_n) - x'(t_n)\right)^\top}{\left\|x'_l(t_n) - x'(t_n)\right\|}\, \zeta_1(t_n) & \dots & -\dfrac{\left(x'_l(t_n) - x'(t_n)\right)^\top}{\left\|x'_l(t_n) - x'(t_n)\right\|}\, \zeta_N(t_n) \end{pmatrix}.$$

4. Collect all GPS pseudo-range residuals

$$\Delta s = \left( \Delta s_1(t_1), \dots, \Delta s_1(t_n), \Delta s_2(t_1), \dots, \Delta s_2(t_n), \dots, \Delta s_4(t_n) \right)^\top,$$

where the residuals are defined as the difference between observed pseudo-ranges and pseudo-ranges computed with the initial guess $p_0$.

5. Assemble the design matrix

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_4 \end{pmatrix}.$$

6. Compute parameter corrections

$$\Delta p = \left(X^\top X\right)^{-1} X^\top \Delta s.$$

5.1.5 Colombo’s Modification of Variational Equation Approach

Since for each unknown parameter $p_k$ one differential equation has to be solved, the total numerical effort is considerable. Therefore, it is desirable to find an at least approximate closed solution of (77). Colombo (1984) solved this problem by treating it in the orbital system. The transformation from the Earth-fixed to the orbital system is accomplished by the rotation (20) and results in the following rotation vector with respect to the orbital system:

$$\omega = \begin{pmatrix} 0 \\ 0 \\ n \end{pmatrix}, \qquad n = \sqrt{\frac{GM}{a^3}}.$$

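We get

$$-\omega \times \left(\omega \times \zeta_k(t)\right) - 2\,\omega \times \dot\zeta_k(t) = \begin{pmatrix} n^2 \zeta_{k,1} + 2n\, \dot\zeta_{k,2} \\ n^2 \zeta_{k,2} - 2n\, \dot\zeta_{k,1} \\ 0 \end{pmatrix}.$$

If now the gravitational potential $V'$ is split into its spherical part $U = \frac{GM}{r}$ and the disturbing potential $T'$, we first obtain

$$\nabla^2 U\, \zeta_k = n^2 \begin{pmatrix} 2\zeta_{k,1} \\ -\zeta_{k,2} \\ -\zeta_{k,3} \end{pmatrix},$$

and (77) simplifies to

$$\begin{pmatrix} \ddot\zeta_{k,1} - 3n^2 \zeta_{k,1} - 2n\, \dot\zeta_{k,2} \\ \ddot\zeta_{k,2} + 2n\, \dot\zeta_{k,1} \\ \ddot\zeta_{k,3} + n^2 \zeta_{k,3} \end{pmatrix} = \nabla^2 T'(x')\, \zeta_k(t) + \frac{\partial \nabla T'}{\partial p_k}. \tag{78}$$

Since both $\zeta$ and $\nabla^2 T$ are small, their product can be neglected, and the final form of the variational equations in the orbital system is obtained:

$$\begin{pmatrix} \ddot\zeta_{k,1} - 3n^2 \zeta_{k,1} - 2n\, \dot\zeta_{k,2} \\ \ddot\zeta_{k,2} + 2n\, \dot\zeta_{k,1} \\ \ddot\zeta_{k,3} + n^2 \zeta_{k,3} \end{pmatrix} = \frac{\partial \nabla T'}{\partial p_k}. \tag{79}$$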

This is an inhomogeneous ordinary differential equation with constant coefficients, which can be solved in closed form, provided the inhomogeneity is sufficiently “simple.” In order to ensure this, the orbit of the satellite is approximated by a so-called repeat orbit. If a basis function representation of $T$ is chosen according to (37), its evaluation along the repeat orbit results in a periodic disturbing potential:

$$T = \sum_{m,k} c_{m,k}\, \Phi_{m,k}(t) = \sum_{m,k} c_{m,k}\, A_{m,k}(a,i)\, e^{\imath\dot\psi_{m,k} t}. \tag{80}$$

Consequently, the gradient of $T$ along the repeat orbit is also a periodic vector function:

$$\nabla T = \sum_{m,k} c_{m,k} \begin{pmatrix} \dfrac{\partial}{\partial a} \\[1ex] \dfrac{1}{a} \dfrac{\partial}{\partial u} \\[1ex] \dfrac{1}{a\cos i} \dfrac{\partial}{\partial i} \end{pmatrix} A_{m,k}(a,i)\, e^{\imath\dot\psi_{m,k} t} = \sum_{m,k} c_{m,k} \begin{pmatrix} \dfrac{\partial A_{m,k}}{\partial a} \\[1ex] \dfrac{\imath k}{a} A_{m,k} \\[1ex] \dfrac{1}{a\cos i} \dfrac{\partial A_{m,k}}{\partial i} \end{pmatrix} e^{\imath\dot\psi_{m,k} t}. \tag{81}$$

This means the inhomogeneity is a Fourier series, and since the differential equation is linear, the superposition principle can be applied and each term of the force function can be treated separately. For each term a differential equation with constant coefficients and a periodic inhomogeneity has to be solved. The solution is elementary and consists of a superposition of two periodic solutions: one with the eigenfrequency $n$ of the homogeneous differential equation and one with the frequency $\dot\psi_{m,k}$ of the excitation term. Hence, the partial derivatives $\zeta_k$ are series of periodic functions with the frequencies $n$ and $\dot\psi_{m,k}$. These partial derivatives have to be transformed back from the orbital system to the Earth-fixed system according to

$$\frac{\partial x'}{\partial p_k} = R^\top \zeta_k = R\!\left(-\frac{\pi}{2} - u,\ -i,\ \frac{\pi}{2} + \Theta - \Omega\right) \zeta_k. \tag{82}$$

This means that in the Colombo modification, step 2 of the variational equation approach has to be replaced by the closed solution of the variational equations in the orbital frame and the back-transformation of the partial derivatives $\zeta_k$ into the Earth-fixed system.

Colombo’s Modification

1. With an initial guess $p_0$ for the unknown parameters, compute a reference orbit $x'(t)$.
2. Find the orbital elements $a, i, \Omega, M(t_0)$ of the best-fitting circular repeat orbit.
3. Find the Fourier series representation of $\nabla T$ according to (81).
4. For each Fourier term, solve Eqs. (79) for the partial derivatives $\zeta_k$ in the original system.
5. For each GPS satellite $l$, build the matrix

$$X_l = \begin{pmatrix} -\dfrac{\left(x'_l(t_1) - x'(t_1)\right)^\top}{\left\|x'_l(t_1) - x'(t_1)\right\|}\, R^\top \zeta_1(t_1) & \dots & -\dfrac{\left(x'_l(t_1) - x'(t_1)\right)^\top}{\left\|x'_l(t_1) - x'(t_1)\right\|}\, R^\top \zeta_N(t_1) \\ \vdots & & \vdots \\ -\dfrac{\left(x'_l(t_n) - x'(t_n)\right)^\top}{\left\|x'_l(t_n) - x'(t_n)\right\|}\, R^\top \zeta_1(t_n) & \dots & -\dfrac{\left(x'_l(t_n) - x'(t_n)\right)^\top}{\left\|x'_l(t_n) - x'(t_n)\right\|}\, R^\top \zeta_N(t_n) \end{pmatrix}.$$

6. Collect all GPS pseudo-range residuals

$$\Delta s = \left( \Delta s_1(t_1), \dots, \Delta s_1(t_n), \Delta s_2(t_1), \dots, \Delta s_2(t_n), \dots, \Delta s_4(t_n) \right)^\top,$$

where the residuals are defined as the difference between observed pseudo-ranges and pseudo-ranges computed with the initial guess $p_0$.

7. Assemble the design matrix

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_4 \end{pmatrix}.$$

8. Compute parameter corrections

$$\Delta p = \left(X^\top X\right)^{-1} X^\top \Delta s.$$
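The last steps of the list above (building the per-satellite blocks, stacking them, and solving for the parameter corrections) can be sketched as follows. The array layout and function names are assumptions made for this example; the partial derivatives are taken as given, already expressed in the Earth-fixed system, and the normal equations are solved via a least-squares routine rather than by forming the inverse explicitly.

```python
import numpy as np

def design_block(x_gps, x_leo, partials):
    """Design matrix block X_l for one GPS satellite.

    x_gps, x_leo : positions at the n epochs, shape (n, 3)
    partials     : partial derivatives dx'/dp_k at the epochs, shape (n, 3, N)
    Each row is minus the line-of-sight unit vector times the partials.
    """
    d = np.asarray(x_gps, dtype=float) - np.asarray(x_leo, dtype=float)
    e = d / np.linalg.norm(d, axis=1, keepdims=True)
    return -np.einsum("ni,nik->nk", e, partials)

def parameter_correction(X_blocks, ds_blocks):
    """Stack the blocks and solve X dp = ds in the least-squares sense,
    which is equivalent to dp = (X^T X)^(-1) X^T ds for full column rank."""
    X = np.vstack(X_blocks)
    ds = np.concatenate(ds_blocks)
    dp, *_ = np.linalg.lstsq(X, ds, rcond=None)
    return dp
```

Using `numpy.linalg.lstsq` avoids squaring the condition number of $X$, which can matter when many gravity field parameters are estimated.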

5.2 Satellite-to-Satellite Tracking in the Low–Low Mode

For the determination of the gravitational field of the Earth, satellite-to-satellite tracking unfolds its true potential only in the low–low mode. In this mode two satellites measure their relative velocity along the line of sight, the so-called range-rate:

$$\dot\rho := \left( \dot x'_2(t,p) - \dot x'_1(t,p) \right)^\top e_{12}, \tag{83}$$

with $e_{12}$ being the line-of-sight (LOS) unit vector

$$e_{12} := \frac{x'_2 - x'_1}{\left\|x'_2 - x'_1\right\|}. \tag{84}$$

For the determination of the parameters $p$ describing the gravitational field of the Earth, the following least-squares problem has to be solved:

$$p := \operatorname{argmin} \sum_{j=1}^{n} \left( \dot\rho(t_j) - \left( \dot x'_2(t_j,p) - \dot x'_1(t_j,p) \right)^\top e_{12}(t_j,p) \right)^{2}. \tag{85}$$

As in the high–low mode, there are two groups of methods to solve this least-squares problem:

1. Conversion of the measured range-rates into artificial in situ observations, as in
• The potential-difference approach
• The line-of-sight gradiometry approach
2. The computation of the partial derivatives $\partial\dot\rho / \partial p$, as in
• The variational equation approach
• The integral equation approach
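The model range-rate of (83) and (84), which all of the methods listed above compare against the measured range-rate, can be computed in a few lines. This is a minimal sketch; the function name is hypothetical and the inputs are assumed to be position and velocity arrays of the two satellites in the same frame.

```python
import numpy as np

def range_rate(x1, v1, x2, v2):
    """Range-rate (x2_dot - x1_dot)^T e12, Eqs. (83) and (84).

    Works for single epochs, shape (3,), or time series, shape (n, 3).
    """
    d = np.asarray(x2, dtype=float) - np.asarray(x1, dtype=float)
    e12 = d / np.linalg.norm(d, axis=-1, keepdims=True)   # line-of-sight unit vector
    dv = np.asarray(v2, dtype=float) - np.asarray(v1, dtype=float)
    return np.sum(dv * e12, axis=-1)
```

Only the component of the relative velocity along the line of sight enters; a purely transverse relative motion gives a range-rate of zero.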


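5.2.1 Potential-Difference Approach

The basic idea of this approach dates back to an article of Wolff in 1969. It was further developed and tested for the GRACE mission by Jekeli (1999) and Han (2004). The observation quantity is the range-rate

$$\dot\rho_{12} = \dot x_{12}^\top e_{12} := \left( \dot x_2 - \dot x_1 \right)^\top e_{12}. \tag{86}$$

Using the energy conservation in the inertial frame

$$V = \frac{1}{2} \dot x^\top \dot x - E_0 \tag{87}$$

and forming the along-track derivative $d_a$ on both sides yields

$$d_a V = \dot x^\top d_a \dot x. \tag{88}$$

If both satellites are close to each other, one can approximate

$$V_{12} := V(x_2) - V(x_1) \approx \left\|\dot x_1\right\| \dot\rho_{12}. \tag{89}$$

So far the developments are made in an inertial system for a static potential $V$. In reality the Earth rotates, and therefore the potential is time dependent. In order to remove this time dependency, the energy balance has to be considered in a rotating system. Doing this, the centrifugal potential has to be added to the energy balance, yielding the following observation equation for the potential differences:

$$V'(x_2) - V'(x_1) + E_0 \approx \left\|\dot x_1\right\| \dot\rho_{12} - \frac{1}{2} \left(\omega \times x'_2\right)^\top \left(\omega \times x'_2\right) + \frac{1}{2} \left(\omega \times x'_1\right)^\top \left(\omega \times x'_1\right). \tag{90}$$

Using the basis function representation of the gravitational field, the following relationship between range-rates and gravitational field parameters can be established:

$$\sum_{z \in C \subset \mathbb{Z}^r} c_z \left( \Phi_z\!\left(x'_2, \boldsymbol{\theta}_z\right) - \Phi_z\!\left(x'_1, \boldsymbol{\theta}_z\right) \right) + E_0 \approx \left\|\dot x_1\right\| \dot\rho_{12} - \frac{1}{2} \left(\omega \times x'_2\right)^\top \left(\omega \times x'_2\right) + \frac{1}{2} \left(\omega \times x'_1\right)^\top \left(\omega \times x'_1\right). \tag{91}$$

If either the basis functions do not depend on the parameters $\boldsymbol{\theta}_z$ or these parameters are fixed a priori, Eqs. (91) are linear in the unknown coefficients $c_z$, and these coefficients can be determined by a standard linear least-squares technique:

$$\sum_{j=1}^{n} \left( \left\|\dot x_1(t_j)\right\| \dot\rho_{12}(t_j) - \frac{1}{2} \left(\omega \times x'_2(t_j)\right)^\top \left(\omega \times x'_2(t_j)\right) + \frac{1}{2} \left(\omega \times x'_1(t_j)\right)^\top \left(\omega \times x'_1(t_j)\right) - \sum_{z} c_z \left( \Phi_z\!\left(x'_2(t_j)\right) - \Phi_z\!\left(x'_1(t_j)\right) \right) - E_0 \right)^{2} \to \min. \tag{92}$$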


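Potential-Difference Approach

1. Multiply the measured range-rates with the total velocity of the trailing satellite:

$$\bar s_j = \dot\rho_{12}(t_j) \left\|\dot x'_1(t_j)\right\|.$$

2. Reduce this pseudo-observation by the difference in the centrifugal potential:

$$s_j = \bar s_j - \frac{1}{2} \left(\omega \times x'_2(t_j)\right)^\top \left(\omega \times x'_2(t_j)\right) + \frac{1}{2} \left(\omega \times x'_1(t_j)\right)^\top \left(\omega \times x'_1(t_j)\right).$$

3. Solve the linear least-squares problem

$$\sum_{j=1}^{n} \left( s_j - \sum_{z} c_z \left( \Phi_z\!\left(x'_2(t_j)\right) - \Phi_z\!\left(x'_1(t_j)\right) \right) - E_0 \right)^{2} \to \min.$$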
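The three steps above can be sketched as follows. The example assumes basis functions without free parameters, passed as Python callables that map an array of positions to an array of values; these callables, the function names, and the treatment of $E_0$ as an additional unknown column of ones are illustrative assumptions.

```python
import numpy as np

def potential_difference_observations(rho_dot, x1, v1, x2, omega):
    """Steps 1 and 2: pseudo-observations s_j.

    rho_dot : measured range-rates, shape (n,)
    x1, v1  : position and velocity of satellite 1, shape (n, 3)
    x2      : position of satellite 2, shape (n, 3)
    omega   : rotation vector of the Earth-fixed frame, shape (3,)
    """
    s_bar = rho_dot * np.linalg.norm(v1, axis=1)
    w1, w2 = np.cross(omega, x1), np.cross(omega, x2)
    return s_bar - 0.5 * np.sum(w2 * w2, axis=1) + 0.5 * np.sum(w1 * w1, axis=1)

def solve_coefficients(s, basis, x1, x2):
    """Step 3: linear least squares for the coefficients c_z and E_0.

    basis : list of callables phi(x) -> values, x of shape (n, 3)
    """
    columns = [phi(x2) - phi(x1) for phi in basis]
    A = np.column_stack(columns + [np.ones(len(s))])     # last column models E_0
    sol, *_ = np.linalg.lstsq(A, s, rcond=None)
    return sol[:-1], sol[-1]

# Hypothetical parameter-free basis functions, for illustration only.
example_basis = [
    lambda x: 1.0 / np.linalg.norm(x, axis=1),
    lambda x: x[:, 2] / np.linalg.norm(x, axis=1) ** 3,
]
```

Because the observation equation contains only differences of basis functions, any constant part of the potential is unobservable and is absorbed by $E_0$.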

5.2.2 Line-of-Sight Gradiometry

A different technique to convert range-rates into in situ observations is the line-of-sight gradiometry. This technique is described in Blaha (1992), Heß and Keller (1999), and Keller and Sharifi (2005). The main idea of this approach is to compute the time derivative of the observed range-rates:

$$\ddot\rho = \left( \ddot x_2 - \ddot x_1 \right)^\top e_{12} + \frac{\left\|\dot x_2 - \dot x_1\right\|^2}{\rho} - \frac{\dot\rho^2}{\rho}, \tag{93}$$

with the measured range-rate $\dot\rho$ and the inter-satellite range $\rho = \|x_2 - x_1\|$. Since $\ddot x_i = \nabla V(x_i)$ holds, Eq. (93) can be recast into an observation equation for the differences in the potential gradients:

$$\left( \nabla V(x_2) - \nabla V(x_1) \right)^\top e_{12} = \ddot\rho - \frac{\left\|\dot x_2 - \dot x_1\right\|^2}{\rho} + \frac{\dot\rho^2}{\rho}. \tag{94}$$

A Taylor expansion of the left-hand side of this equation at the satellite midpoint $x := \frac{1}{2}(x_1 + x_2)$ yields

$$e_{12}^\top \nabla^2 V(x)\, e_{12} \approx \frac{\ddot\rho}{\rho} - \frac{\left\|\dot x_2 - \dot x_1\right\|^2}{\rho^2} + \frac{\dot\rho^2}{\rho^2}. \tag{95}$$

The left side of this equation is the gravity gradient in the direction of the line of sight between the two satellites, which gives this approach its name. If now the basis function representation of the gravitational potential is inserted, we arrive at a least-squares problem for the parameters of this representation:

$$(c_z, \boldsymbol{\theta}_z) = \operatorname{argmin} \sum_{j=1}^{n} \left( \Gamma_{yy}(t_j) - \sum_{z \in C \subset \mathbb{Z}^r} c_z\, e_{12}^\top \nabla^2 \Phi_z\!\left(x(t_j), \boldsymbol{\theta}_z\right) e_{12} \right)^{2}. \tag{96}$$

In this equation the quantity $\Gamma_{yy}$ is an abbreviation for the artificial observation

$$\Gamma_{yy} := \frac{\ddot\rho}{\rho} - \frac{\left\|\dot x_2 - \dot x_1\right\|^2}{\rho^2} + \frac{\dot\rho^2}{\rho^2}. \tag{97}$$
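The artificial observation (97) can be formed directly from the orbit data and the measured range-rates. In the sketch below the relative acceleration $\ddot\rho$ is obtained by simple finite differences of the range-rate series; this differentiation scheme, the equidistant sampling, and the function name are assumptions for illustration, and a production implementation would likely use a more careful differentiation filter.

```python
import numpy as np

def los_gradient_observation(x1, v1, x2, v2, rho_dot, h):
    """Artificial line-of-sight gradient observations, Eq. (97).

    x1, v1, x2, v2 : positions and velocities of both satellites, shape (n, 3)
    rho_dot        : measured range-rates, shape (n,)
    h              : sampling interval of the time series
    """
    rho = np.linalg.norm(np.asarray(x2, float) - np.asarray(x1, float), axis=1)
    dv = np.asarray(v2, float) - np.asarray(v1, float)
    dv_sq = np.sum(dv * dv, axis=1)
    rho_dot = np.asarray(rho_dot, float)
    rho_ddot = np.gradient(rho_dot, h)          # finite-difference derivative
    return rho_ddot / rho - dv_sq / rho**2 + rho_dot**2 / rho**2
```

For two satellites at constant separation $\rho$ with a purely transverse relative velocity of magnitude $w\rho$, the expression reduces to $-w^2$, which is a quick way to test the signs.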

Instead in the Earth-fixed system, the problem can also be treated in the orbital system. This has the advantage that the line-of-sight gradient can be expressed as the linear combination of the second-order derivative in u direction and of the first-order derivative in radial direction: 2 e> 12 r V .x/ e12 D

1 @2 V 1 @V C ; 2 2 r @u r @r

(98)

which makes the computation of all components of the Marussi tensor r 2 V obsolete. The leastsquares problem simplifies to n X      cm;k ;  m;k Dargmin

yy tj j D1



X

cm;k

m;k

!!2     1 @2 ˆm;k u; ƒ;  m;k   1 @ˆm;k u; ƒ;  m;k   tj : (99) tj C a2 @u2 a @a

For spherical harmonics as basis functions, their representation (23) in the orbital system leads to the following simplified least-squares problem:

$$(c_{m,k}, \psi_{m,k}) = \operatorname{argmin} \sum_{j=1}^{n} \Big( yy(t_j) - \sum_{m,k} c_{m,k}\, \Phi_{m,k}(u_j, \Lambda_j)(t_j) \Big)^2 \tag{100}$$

with

$$\Phi_{m,k}(u_j, \Lambda_j) = \left[ \frac{GM}{R^3} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+3} \left(-l - 1 - k^2\right) \imath^{\,k-m}\, d^{\,l}_{m,k}(i)\, \bar{P}^{\,k}_{l}(0) \right] e^{\imath (k u_j + m \Lambda_j)}. \tag{101}$$

All in all, for spherical harmonics as basis functions, the line-of-sight gradiometry approach consists of the following steps:

Line-of-Sight Gradiometry Approach

1. Convert the measured range-rate $\dot{\rho}$ into synthetic relative accelerations $\ddot{\rho}$ by some numerical differentiation scheme.
2. Compute the line-of-sight gravity gradient $yy(t_j)$ according to (97).
3. Solve the linear least-squares problem

$$\sum_{j=1}^{n} \Big( yy(t_j) - \sum_{m,k} c_{m,k}\, \Phi_{m,k}(u_j, \Lambda_j)(t_j) \Big)^2 \to \min.$$
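The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the differentiation scheme (`numpy.gradient`) is one possible choice, and `design` is a hypothetical real-valued matrix whose columns hold the basis functions evaluated at the observation epochs, whereas the text works with complex exponentials.

```python
import numpy as np

def line_of_sight_solution(t, rho, rho_dot, rel_velocity, design):
    """Sketch of the line-of-sight gradiometry steps.

    t:            observation epochs, shape (n,)
    rho, rho_dot: range and range-rate at the epochs, shape (n,)
    rel_velocity: relative velocities x2_dot - x1_dot, shape (n, 3)
    design:       basis functions evaluated at the epochs, shape (n, q)
                  (hypothetical; how it is filled depends on the basis)
    """
    # Step 1: range-acceleration by numerical differentiation of the range-rate.
    rho_ddot = np.gradient(rho_dot, t, edge_order=2)
    # Step 2: artificial line-of-sight observation, Eq. (97).
    speed_sq = np.sum(np.asarray(rel_velocity, dtype=float) ** 2, axis=1)
    y = rho_ddot / rho - (speed_sq - rho_dot**2) / rho**2
    # Step 3: linear least squares for the coefficients.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs, y
```

Numerical differentiation amplifies measurement noise, so in practice the choice of the differentiation scheme in step 1 matters more than this sketch suggests.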


5.2.3 Variational Equation Approach

The variational equation approach is used by several groups concerned with the processing of GRACE data (Beutler et al. 2010a, b). It is centered around the nonlinear least-squares problem

$$\sum_{j=1}^{n} \Big( \dot{\rho}(t_j) - \big(\dot{x}_2'(t_j) - \dot{x}_1'(t_j)\big)^\top e_{12}'(t_j) \Big)^2 \to \min. \tag{102}$$

To solve this nonlinear least-squares problem, the partial derivatives of the model range-rates $\dot{\rho}_M(t) = \big(\dot{x}_2'(t) - \dot{x}_1'(t)\big)^\top e_{12}'(t)$ with respect to the unknown parameters $p$ have to be computed. Obviously, we have

$$\frac{\partial \dot{\rho}_M}{\partial p_k} = \frac{\partial (\dot{x}_2' - \dot{x}_1')^\top}{\partial p_k}\, e_{12}' + \frac{1}{\rho} \Big( (\dot{x}_2' - \dot{x}_1') - \frac{\dot{\rho}}{\rho} (x_2' - x_1') \Big)^\top \frac{\partial (x_2' - x_1')}{\partial p_k}$$

$$= e_{12}'^{\top} \big( \dot{\xi}_k^{(2)} - \dot{\xi}_k^{(1)} \big) + \frac{1}{\rho} \Big( (\dot{x}_2' - \dot{x}_1') - \frac{\dot{\rho}}{\rho} (x_2' - x_1') \Big)^\top \big( \xi_k^{(2)} - \xi_k^{(1)} \big). \tag{103}$$

For each satellite $(i)$ the quantities $\xi_k^{(i)}$ are computed as the solutions of the variational equations as described in Sect. 5.1.4. Hence, the variational equation approach for the determination of the gravity field parameters $p$ from the observed range-rates $\dot{\rho}$ consists of the following steps:

Variational Equation Approach

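1. For each satellite solve the variational equations for the partial derivatives $\xi_k^{(i)}$ according to Sect. 5.1.4.
2. Compute the partial derivatives of the model range-rates with respect to the parameters $p_k$ as

$$\frac{\partial \dot{\rho}_M}{\partial p_k} = e_{12}'^{\top} \big( \dot{\xi}_k^{(2)} - \dot{\xi}_k^{(1)} \big) + \frac{1}{\rho} \Big( (\dot{x}_2' - \dot{x}_1') - \frac{\dot{\rho}}{\rho} (x_2' - x_1') \Big)^\top \big( \xi_k^{(2)} - \xi_k^{(1)} \big).$$

3. Solve the linear least-squares problem

$$\sum_{j=1}^{n} \Big( \dot{\rho}(t_j) - \sum_{k} \frac{\partial \dot{\rho}_M}{\partial p_k}\, p_k \Big)^2 \to \min.$$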

In principle, the nonlinear least-squares problem can also be solved iteratively. But since rather good a priori values $p_0$ for $p$ are known and the solution of the variational equations is costly, in most cases only a single step is carried out.
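Step 2 of this approach can be illustrated with a short sketch. It assumes that the orbit partials (the solutions of the variational equations and their time derivatives for one parameter) are already available as 3-vectors; the function and argument names are hypothetical.

```python
import numpy as np

def range_rate_partial(x1, x2, v1, v2, xi1, xi2, xi1_dot, xi2_dot):
    """Partial derivative of the model range-rate w.r.t. one parameter, Eq. (103).

    x1, x2, v1, v2:   positions and velocities of the two satellites
    xi1, xi2:         partials of the positions w.r.t. the parameter
    xi1_dot, xi2_dot: partials of the velocities w.r.t. the parameter
    """
    dx, dv = x2 - x1, v2 - v1
    rho = np.linalg.norm(dx)   # inter-satellite range
    e12 = dx / rho             # line-of-sight unit vector
    rho_dot = dv @ e12         # model range-rate
    # (dv - rho_dot * e12) / rho is the bracket of Eq. (103) divided by rho.
    return e12 @ (xi2_dot - xi1_dot) + (dv - rho_dot * e12) @ (xi2 - xi1) / rho
```

A simple way to check such a routine is to perturb positions and velocities along the given partials and compare the result with a finite-difference quotient of the range-rate.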


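5.3 Integral Equations Approach

The integral equation approach, fully developed in Mayer-Gürr et al. (2007) and Mayer-Gürr (2012), also tackles the least-squares problem (102). It differs from the variational equation approach in the way the quantities $\xi_k^{(i)}$ and $\dot{\xi}_k^{(i)}$ in the partial derivatives of the model range-rates with respect to the unknown parameters $p$ are computed. In the integral equations approach, the point of departure is not the equation of motion (40) but its equivalent integral equation counterpart (43). If the unknown potential $V'$ in (43) is now replaced by its basis function representation, for each satellite $(i)$ we obtain the following approximation of the integral equation describing its motion:

$$\tilde{x}^{(i)}(\tau) = \tilde{x}^{(i)}(0)(1 - \tau) + \tilde{x}^{(i)}(1)\, \tau - T^2 \int_0^1 K(\tau, \tau')\, f^{(i)}(\tau', p)\, d\tau' \tag{104}$$

with

$$f^{(i)}(\tau', p) := \nabla \Big( \sum_{z \in C \times Z^r} c_z\, \Phi_z\big(x'^{(i)}(\tau' T), \psi_z\big) \Big) + F'\big(x'^{(i)}(\tau' T), \dot{x}'^{(i)}(\tau' T), \tau' T\big) - \omega \times \big(\omega \times x'^{(i)}(\tau' T)\big) - 2 \omega \times \dot{x}'^{(i)}(\tau' T). \tag{105}$$

The vector $p$ collects all unknown parameters $c_z, \psi_z$ in the basis function representation of the unknown potential. For the determination of the parameters $p$ from the observed positions and velocities, their partial derivatives with respect to the parameters are needed. The partial derivatives solve the integral equations

$$\xi^{(i)}(\tau) := \frac{\partial \tilde{x}^{(i)}(\tau)}{\partial p} = -T^2 \int_0^1 K(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \frac{\partial \tilde{x}^{(i)}}{\partial p}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \frac{\partial \dot{\tilde{x}}^{(i)}}{\partial p}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau'$$

$$= -T^2 \int_0^1 K(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \xi^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \dot{\xi}^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau' \tag{106}$$

and

$$\dot{\xi}^{(i)}(\tau) := \frac{\partial \dot{\tilde{x}}^{(i)}(\tau)}{\partial p} = -T^2 \int_0^1 \frac{\partial K}{\partial \tau}(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \frac{\partial \tilde{x}^{(i)}}{\partial p}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \frac{\partial \dot{\tilde{x}}^{(i)}}{\partial p}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau'$$

$$= -T^2 \int_0^1 \frac{\partial K}{\partial \tau}(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \xi^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \dot{\xi}^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau', \tag{107}$$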


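respectively. If the integrals are now approximated by quadrature formulas

$$I := \int_0^1 K(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \xi^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \dot{\xi}^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau' \approx \sum_{l=1}^{n} w_l\, K(\tau, \tau_l') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau_l', p)\, \xi_l^{(i)} + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_l', p)\, \dot{\xi}_l^{(i)} + \frac{\partial f^{(i)}}{\partial p}(\tau_l', p) \Big), \tag{108}$$

$$\dot{I} := \int_0^1 \frac{\partial K}{\partial \tau}(\tau, \tau') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau', p)\, \xi^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau', p)\, \dot{\xi}^{(i)}(\tau') + \frac{\partial f^{(i)}}{\partial p}(\tau', p) \Big)\, d\tau' \approx \sum_{l=1}^{n} w_l\, \frac{\partial K}{\partial \tau}(\tau, \tau_l') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau_l', p)\, \xi_l^{(i)} + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_l', p)\, \dot{\xi}_l^{(i)} + \frac{\partial f^{(i)}}{\partial p}(\tau_l', p) \Big), \tag{109}$$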

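we obtain two linear systems of equations for the unknown partial derivatives:

$$\xi_j^{(i)} = -T^2 \sum_{l=1}^{n} w_l\, K(\tau_j, \tau_l') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau_l', p)\, \xi_l^{(i)} + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_l', p)\, \dot{\xi}_l^{(i)} + \frac{\partial f^{(i)}}{\partial p}(\tau_l', p) \Big), \tag{110}$$

$$\dot{\xi}_j^{(i)} = -T^2 \sum_{l=1}^{n} w_l\, \frac{\partial K}{\partial \tau}(\tau_j, \tau_l') \Big( \frac{\partial f^{(i)}}{\partial x'}(\tau_l', p)\, \xi_l^{(i)} + \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_l', p)\, \dot{\xi}_l^{(i)} + \frac{\partial f^{(i)}}{\partial p}(\tau_l', p) \Big). \tag{111}$$

If the unknown partial derivatives with respect to the $k$th parameter $p_k$ are assembled into column vectors

$$Z_k^{(i)} := \operatorname{vec}\Big( \xi_k^{(i)}(\tau_1), \ldots, \xi_k^{(i)}(\tau_n), \dot{\xi}_k^{(i)}(\tau_1), \ldots, \dot{\xi}_k^{(i)}(\tau_n) \Big), \tag{112}$$

these linear equations can be written in matrix form as

$$\left( I + \begin{pmatrix} A^{(i)} & B^{(i)} \\ C^{(i)} & D^{(i)} \end{pmatrix} \right) Z_k^{(i)} = b_k^{(i)} \tag{113}$$

with

$$A^{(i)} = T^2 \begin{pmatrix} w_1 K(\tau_1, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n K(\tau_1, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \\ w_1 K(\tau_2, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n K(\tau_2, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \\ \vdots & & \vdots \\ w_1 K(\tau_n, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n K(\tau_n, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \end{pmatrix}, \tag{114}$$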

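$$B^{(i)} = T^2 \begin{pmatrix} w_1 K(\tau_1, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n K(\tau_1, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \\ w_1 K(\tau_2, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n K(\tau_2, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \\ \vdots & & \vdots \\ w_1 K(\tau_n, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n K(\tau_n, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \end{pmatrix}, \tag{115}$$

$$C^{(i)} = T^2 \begin{pmatrix} w_1 \frac{\partial K}{\partial \tau}(\tau_1, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_1, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \\ w_1 \frac{\partial K}{\partial \tau}(\tau_2, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_2, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \\ \vdots & & \vdots \\ w_1 \frac{\partial K}{\partial \tau}(\tau_n, \tau_1') \frac{\partial f^{(i)}}{\partial x'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_n, \tau_n') \frac{\partial f^{(i)}}{\partial x'}(\tau_n', p) \end{pmatrix}, \tag{116}$$

$$D^{(i)} = T^2 \begin{pmatrix} w_1 \frac{\partial K}{\partial \tau}(\tau_1, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_1, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \\ w_1 \frac{\partial K}{\partial \tau}(\tau_2, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_2, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \\ \vdots & & \vdots \\ w_1 \frac{\partial K}{\partial \tau}(\tau_n, \tau_1') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_1', p) & \cdots & w_n \frac{\partial K}{\partial \tau}(\tau_n, \tau_n') \frac{\partial f^{(i)}}{\partial \dot{x}'}(\tau_n', p) \end{pmatrix}, \tag{117}$$

and

$$b_k^{(i)} = -T^2 \operatorname{vec}\Big( \sum_{l=1}^{n} w_l\, K(\tau_1, \tau_l')\, \frac{\partial f^{(i)}}{\partial p_k}(\tau_l', p), \ldots, \sum_{l=1}^{n} w_l\, K(\tau_n, \tau_l')\, \frac{\partial f^{(i)}}{\partial p_k}(\tau_l', p) \Big). \tag{118}$$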

The matrices $A^{(i)}, B^{(i)}, C^{(i)}, D^{(i)}$ are independent of the parameter $p_k$. Hence the main numerical effort, the LU-decomposition, has to be carried out only once per satellite, no matter how many parameters $p_k$ are to be determined. This leads to the following algorithm:

5.3.1 Integral Equation Approach

1. For each satellite compute the matrices $A^{(i)}, B^{(i)}, C^{(i)}, D^{(i)}$ according to (114)–(117).
2. For each parameter $p_k$
   • Compute the vector $b_k^{(i)}$ according to (118).
   • Compute the partial derivatives $Z_k^{(i)}$ with respect to the parameter $p_k$ as the solution of (113).
   • Compute the partial derivatives of the model range-rates with respect to the parameters $p_k$ as

$$\frac{\partial \dot{\rho}_M}{\partial p_k} = e_{12}'^{\top} \big( \dot{\xi}_k^{(2)} - \dot{\xi}_k^{(1)} \big) + \frac{1}{\rho} \Big( (\dot{x}_2' - \dot{x}_1') - \frac{\dot{\rho}}{\rho} (x_2' - x_1') \Big)^\top \big( \xi_k^{(2)} - \xi_k^{(1)} \big).$$

3. Solve the linear least-squares problem

$$\sum_{j=1}^{n} \Big( \dot{\rho}(t_j) - \sum_{k} \frac{\partial \dot{\rho}_M}{\partial p_k}\, p_k \Big)^2 \to \min.$$
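The remark that the factorization is needed only once per satellite can be illustrated with a small sketch of system (113). It assumes that the four blocks have already been filled; here they are random placeholders, and the right-hand sides for all parameters are stacked as columns of one matrix so that a single call handles them together.

```python
import numpy as np

def solve_all_partials(A, B, C, D, rhs):
    """Solve (I + [[A, B], [C, D]]) Z = rhs for all parameters at once.

    A, B, C, D: square blocks of equal size (placeholders for (114)-(117))
    rhs:        right-hand sides b_k stacked as columns, one per parameter
    """
    n = A.shape[0]
    system = np.eye(2 * n) + np.block([[A, B], [C, D]])
    # np.linalg.solve factorizes `system` once and back-substitutes each column.
    return np.linalg.solve(system, rhs)

# Hypothetical sizes and random entries, for illustration only.
rng = np.random.default_rng(0)
n, n_params = 4, 3
A, B, C, D = (0.1 * rng.standard_normal((n, n)) for _ in range(4))
rhs = rng.standard_normal((2 * n, n_params))
Z = solve_all_partials(A, B, C, D, rhs)
print(Z.shape)  # prints (8, 3)
```

Column k of the result then plays the role of the vector of partial derivatives for parameter k in (113).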

6 Regional Gravity Field Models

Spherical harmonics are suitable basis functions for a global recovery of the gravitational field of the Earth. Because of their global support on the sphere, they are best suited for a homogeneous data distribution on the sphere. Since SST satellites in general have almost polar orbits, their orbital arcs converge toward the poles. Therefore, the data distribution at the poles is much denser than around the equator. In this situation it is reasonable to combine a global gravity field model, based on a spherical harmonics expansion, with regional improvements of this global field. The latter are represented by basis functions with a local support or by basis functions that are at least rapidly decaying. A technique relying on basis functions with a local support is the so-called mascons technique. This technique was developed by Muller and Sjogren (1968) for the recovery of the lunar gravitational field. Because constraints between the basis functions can be introduced, this technique has been applied to GRACE data by several authors (Luthcke et al. 2008; Rowlands et al. 2010). Nevertheless, the majority of authors use RBFs for the regional improvement of a global gravity field solution. In this respect the contributions of Schmidt et al. (2006), Fengler et al. (2007), Klees et al. (2008), Eicker (2012), and others have to be mentioned.

The basic idea of the regional improvement by RBFs is the separation of the data set $\{s_i\}_{i=1}^{N}$ in Eq. (10) into two disjoint subsets $\{s_i\}_{i=1}^{M} \cup \{s_i\}_{i=M+1}^{N}$. The first subset is used to derive a global spherical harmonics model

$$\{K_{l,m}\} = \operatorname{argmin} \sum_{i=1}^{M} \Big\| s_i - F\Big( x_1\big(ih, \tilde{V}(K_{l,m})\big), \dot{x}_1\big(ih, \tilde{V}(K_{l,m})\big), x_2\big(ih, \tilde{V}(K_{l,m})\big), \dot{x}_2\big(ih, \tilde{V}(K_{l,m})\big) \Big) \Big\|^2 \tag{119}$$

with

$$\tilde{V} = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} K_{l,m}\, Y_{l,m}(\vartheta, \lambda). \tag{120}$$

Once the global model (120) has been determined, the residual observations between the original data and the synthetic data are computed for both subsets:

$$r_i = s_i - F\Big( x_1\big(ih, \tilde{V}(K_{l,m})\big), \dot{x}_1\big(ih, \tilde{V}(K_{l,m})\big), x_2\big(ih, \tilde{V}(K_{l,m})\big), \dot{x}_2\big(ih, \tilde{V}(K_{l,m})\big) \Big), \quad i = 1, \ldots, N. \tag{121}$$

The residuals then undergo an analysis with respect to RBFs as basis functions


$$\{(c_l, \psi_l)\} = \operatorname{argmin} \sum_{i=1}^{N} \Big\| r_i - F\Big( x_1\big(ih, \hat{V}(c_l, \psi_l)\big), \dot{x}_1\big(ih, \hat{V}(c_l, \psi_l)\big), x_2\big(ih, \hat{V}(c_l, \psi_l)\big), \dot{x}_2\big(ih, \hat{V}(c_l, \psi_l)\big) \Big) \Big\|^2 \tag{122}$$

with

$$\hat{V} = \sum_{l=1}^{L} c_l\, \Phi(x, \psi_l) \tag{123}$$

and $\Phi$ according to (27). In most applications, the parameters $\psi_l$ are assigned fixed a priori values, which makes the regional improvement technique a linear least-squares problem. One of the few publications in which the parameters $\psi_l$ are also subject to optimization, i.e., in which the basis functions are allowed to change shape and position during the minimization process, is Antoni (2012).
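The two-step idea (a global fit from the first subset, then a regional fit of the residuals of all data with fixed basis function parameters) can be mimicked on a one-dimensional toy problem. Everything in this sketch is an assumption for illustration: low-order trigonometric functions stand in for spherical harmonics, Gaussian bumps with fixed centers and width stand in for the RBFs, and the "observations" are the field values themselves rather than SST data.

```python
import numpy as np

def fit(design, data):
    """Linear least-squares fit of `data` with the columns of `design`."""
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    return coeffs

def global_basis(x):
    # Stand-in for a low-degree spherical harmonics expansion.
    return np.column_stack([np.ones_like(x), np.cos(x), np.sin(x)])

def rbf_basis(x, centers, width):
    # Gaussian bumps with fixed (a priori) centers and width.
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
data = np.cos(x) + 0.5 * np.exp(-((x - 1.0) / 0.2) ** 2)  # smooth field + local bump

first = np.arange(x.size) % 2 == 0                  # first subset
global_coeffs = fit(global_basis(x[first]), data[first])
residuals = data - global_basis(x) @ global_coeffs  # residuals for all data

centers = np.linspace(0.5, 1.5, 5)
regional_coeffs = fit(rbf_basis(x, centers, 0.2), residuals)
remaining = residuals - rbf_basis(x, centers, 0.2) @ regional_coeffs

def rms(values):
    return float(np.sqrt(np.mean(values**2)))

print(rms(residuals), rms(remaining))
```

Because the centers and the width are held fixed, the regional step stays a linear least-squares problem, as noted above for fixed a priori parameters.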

6.1 Final Remarks

So far, all arguments give only a coarse sketch of the methods applied in the processing of SST data. Only the basic ideas behind these methods have been presented. The practical application differs from the given arguments in the following aspects:

1. An important step in data preprocessing is the so-called de-aliasing. Though this step is absolutely vital for the derivation of meaningful results from the SST data, it could not be discussed here. De-aliasing means that all forces that do not fit into the simple observation model have to be either measured or modeled and that the genuine SST data have to be reduced for these forces. Hence, the concept of de-aliasing includes the following reductions:
   • For atmospheric friction
   • For tidal effects
   • For precession and nutation effects
   • For atmospheric and oceanographic loading effects
   A comprehensive description of the de-aliasing procedures is given in Bettadpur (2012).
2. In general not the de-aliased data themselves but the de-aliased residual data are the input for the subsequent analysis. Residual data means that synthetic observations, which are computed from an a priori gravity field model, are subtracted from the de-aliased data. This step reduces the effect of all approximations made in the derivation of the observation model. The consequence of forming residuals is that not the gravity field parameters themselves but only their differences to the parameters of the a priori model can be determined.
3. In general the normal equation matrices of the least-squares problems are poorly conditioned. Therefore, in most cases different kinds of regularization are applied.
4. Due to imperfect de-aliasing, the monthly variations in GRACE-derived gravity field models show dominant north–south stripes, which obscure the desired geophysical information. Therefore, a number of filter strategies are applied in post-processing to remove these stripes (Kusche et al. 2009).
5. Since the GRACE satellites also carry onboard GPS receivers, the GRACE mission is both an hl and an ll SST mission. While the hl data are insensitive to the short wavelengths in the gravity field, the ll data are “ignorant” of the long-wavelength features. For an optimal resolution both data types have to be combined. Due to their different error budgets, a proper combination requires a preceding variance components estimation.
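The regularization mentioned in item 3 can take many forms. As one hedged example, the sketch below applies plain Tikhonov regularization (adding a multiple of the identity to the normal equation matrix); this particular choice, the parameter value, and the toy design matrix are assumptions, not the scheme used for any specific gravity field model.

```python
import numpy as np

def regularized_solution(design, data, alpha):
    """Solve the Tikhonov-regularized normal equations (A^T A + alpha I) x = A^T y."""
    normal = design.T @ design
    rhs = design.T @ data
    return np.linalg.solve(normal + alpha * np.eye(normal.shape[0]), rhs)

# Toy problem with two nearly collinear columns, i.e. poorly conditioned normals.
t = np.linspace(0.0, 1.0, 50)
design = np.column_stack([t, t + 1e-6 * t**2])
data = design @ np.array([1.0, 1.0])

normal = design.T @ design
alpha = 1e-6
cond_plain = np.linalg.cond(normal)
cond_regularized = np.linalg.cond(normal + alpha * np.eye(2))
solution = regularized_solution(design, data, alpha)
print(cond_plain, cond_regularized)
```

The regularized system is far better conditioned, at the price of a small bias in the estimated parameters; choosing the regularization parameter is a separate problem not addressed here.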

7 Missions and Outcomes

7.1 CHAMP Mission

The CHAMP mission was dedicated not only to the gravity field but also to the magnetic field of the Earth. Another important experiment carried out with CHAMP was the study of the atmosphere by radio occultation. An overview of the results after 5 years in orbit is given in Reigber et al. (2006). Different institutions processed the CHAMP data for different orbit lengths and derived individual spherical harmonic gravity field models. All these models have been collected by the International Centre for Global Earth Models (ICGEM) and can be accessed via its website (ICGEM). The models differ in the time-span of data and in the resolution limit in degree and order. The highest resolution, up to degree and order 140, is given by the model EIGEN-CHAMP03S (Reigber et al. 2004). For its derivation CHAMP data from October 2000 until June 2003 were analyzed. The normal equations for this model were regularized from degree and order 60 onward. The quality of the derived models can be evaluated by a comparison with a model of higher quality derived from GRACE data, e.g., the GRACE-EIGEN6C2 model. Taking this model as the truth, one can see that the estimation error of the EIGEN-CHAMP03S model matches the signal strength at about degree 90. Roughly speaking, this means that from this point on, the estimation error of a coefficient exceeds its magnitude. While this is a relative evaluation, an absolute evaluation can be done by comparing geoid heights obtained from a combination of GPS heights and spirit-leveling heights with gravity field-derived geoid heights. In this comparison the GPS heights minus leveling heights solution is considered the master solution. The comparison shows a disagreement of about 0.8 m in North America and 1.2 m in Europe for EIGEN-CHAMP03S.
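The degree at which the estimation error matches the signal strength can be located with a few lines of code once signal and error amplitudes per degree are available. In the sketch below the signal follows Kaula's rule of thumb, while the error curve is a purely hypothetical exponential and not the error spectrum of any CHAMP or GRACE model.

```python
import numpy as np

def crossover_degree(degrees, signal, error):
    """Lowest degree at which the error amplitude reaches the signal amplitude."""
    hits = np.nonzero(np.asarray(error) >= np.asarray(signal))[0]
    return int(np.asarray(degrees)[hits[0]]) if hits.size else None

degrees = np.arange(2, 141)
signal = 1e-5 / degrees**2              # Kaula's rule of thumb per coefficient
error = 1e-11 * np.exp(degrees / 18.0)  # hypothetical error curve, not mission data
print(crossover_degree(degrees, signal, error))
```

With real models, the signal would be taken from the reference model and the error from the coefficient differences to it.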

7.2 GRACE Mission

Besides static solutions for the time-invariant part of the gravitational field, the GRACE mission also provides monthly solutions. The static solutions can be accessed via the ICGEM website (ICGEM). The models differ in the time-span of data and in the resolution limit in degree and order. The highest resolution, up to degree and order 180, is given by the model ITG-GRACE2010s (Mayer-Gürr et al. 2010). It uses data from August 2002 until August 2009. If evaluated against GRACE-EIGEN6C2, it shows that the estimation error matches the signal strength at about degree 160. This means an improvement by a factor of about 2 compared to the best CHAMP solution. The comparison of the ITG-GRACE2010s model with GPS-leveling also results in a disagreement of about 0.5 m for all continents, which is likewise about a factor of 2 better than the best CHAMP solution in Europe.


The essential outcome of the GRACE mission is the monthly gravity field solutions, because they carry the imprint of mass transport processes at or close to the surface of the Earth. They can also be accessed via the ICGEM website (ICGEM). Besides the unmodified monthly solutions, monthly solutions that have been de-striped with the filter of Kusche et al. (2009) are provided. The importance of the monthly GRACE solutions for geophysical interpretation rests on the assumption that all mass changes observed by GRACE take place in a thin layer at the surface of the Earth. Under this assumption, observed changes $\Delta K_{l,m}$ in the spherical harmonics coefficients can be converted into changes of surface layer density $\Delta\sigma$, i.e., mass change per area (cf. Chao and Gross 1987):

$$\Delta\sigma(\vartheta, \lambda) = \frac{R \bar{\rho}}{3} \sum_{l=0}^{\infty} \frac{2l + 1}{1 + k_l} \sum_{m=-l}^{l} \Delta K_{l,m}\, Y_{l,m}(\vartheta, \lambda). \tag{124}$$

In this equation $\bar{\rho}$ stands for the average density of the Earth and the coefficients $k_l$ are the so-called Love numbers (Han and Wahr 1995). Since the majority of mass changes that can be observed by GRACE are related to water and ice redistributions, these density changes can be converted into changes of equivalent water thickness by

$$\Delta EWT = \frac{\Delta\sigma}{\rho_w}, \tag{125}$$

with $\rho_w$ being the density of water. This change in EWT corresponds to the vertically integrated mass changes inside aquifers, soil, surface reservoirs, and snow and ice packs. It can be observed by GRACE with an accuracy of a few millimeters for a spatial resolution of about 400 km. For this reason the EWT observed by GRACE has an important impact on:

• Hydrology
• Oceanography
• Glaciology

In continental hydrology GRACE estimates of continental water storage improve the understanding of hydrological and atmospheric processes. The great advantage of GRACE estimates is that they are averages over a few hundred kilometers, while traditional ground-based hydrological data refer to scales of 10 km or less. The weakness of GRACE results is that they cannot distinguish between water on the surface and in the soil. Nor can they discriminate between water, snow, and ice. All in all, the GRACE estimates are in good agreement with estimates derived from hydrological models such as WGHM or GLDAS. GRACE can give reliable estimates of the total water budget over large regions, and with this information, it can contribute to the improvement of hydrological models.

In oceanography the main contribution of GRACE is the separation between steric and nonsteric sea-level rise, because only the latter is related to mass transports. Precise information about the steric sea-level rise makes it possible to estimate the change in heat storage in the oceans, which is important for climate change prediction.

In the pre-GRACE era of glaciology, information about the mass balance of the Arctic and Antarctic ice sheets could only be derived from measurements at a small number of benchmark glaciers. The parameters derived from these measurements are hardly representative of an entire ice sheet. Here the mass change derived from GRACE is an important additional source of information, because it refers to averages over a few hundred kilometers. But the total mass change observed by GRACE is not only due to the mass loss or mass gain of ice sheets but additionally stems from the effect of postglacial rebound. The reduction of the GRACE observations by the influence of postglacial rebound, computed from existing models, gives valuable constraints for ice-dynamics models.
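Equations (124) and (125) amount to multiplying every coefficient change of degree l by a degree-dependent factor. The sketch below computes these factors; the default radius and densities are commonly used round values and should be read as assumptions, and the Love numbers are passed in by the caller (the zeros used in the example are placeholders for a rigid Earth, not real load Love numbers).

```python
import numpy as np

def ewt_degree_factors(love_numbers, radius=6.371e6, rho_earth=5517.0, rho_water=1000.0):
    """Per-degree factors converting coefficient changes into EWT in meters.

    love_numbers: k_l for l = 0, 1, ..., L (supplied by the caller)
    radius:       reference radius R in meters (assumed value)
    rho_earth:    average density of the Earth in kg/m^3 (assumed value)
    rho_water:    density of water in kg/m^3 (assumed value)
    """
    k = np.asarray(love_numbers, dtype=float)
    degrees = np.arange(k.size)
    # Factor of Eq. (124) divided by the water density of Eq. (125).
    return radius * rho_earth / (3.0 * rho_water) * (2 * degrees + 1) / (1.0 + k)

love_numbers = np.zeros(61)  # placeholder: rigid Earth, degrees 0..60
factors = ewt_degree_factors(love_numbers)
print(factors[2] / factors[0])  # prints 5.0
```

The factors grow linearly with degree, which is one reason why noise in the high-degree coefficients is so visible in maps of equivalent water thickness.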

8 Conclusions

Satellite-to-satellite tracking techniques have proved capable of monitoring both the static and the time-variable gravity field of the Earth with unprecedented accuracy and resolution. A number of different algorithms have been developed to convert the genuine SST observations into parameters of a basis function representation of the gravitational field of the Earth. Nevertheless, a number of problems remain unsolved, such as the de-striping of monthly GRACE solutions, spectral leakage, and aliasing, and they constitute a challenge for further research.

References

Antoni M (2012) Nichtlineare Optimierung regionaler Gravitationsfeldmodelle aus SST Daten. PhD thesis, Universität Stuttgart
Badura T, Sakulin C, Gruber T, Klostius R (2006) Derivation of the CHAMP-only gravity field model TUG-CHAMP04 applying the energy integral approach. Stud Geophys Geod 50:57–74
Ballani L (1988) Partielle Ableitungen und Variationsgleichungen zur Modellierung von Satellitenbahnen und Parameterbestimmung. Vermessungstechnik 36:192–194
Bettadpur S (2012) Level-2 gravity field product user handbook rev. 3.0, May 29. ftp://podaac-ftp.jpl.nasa.gov/GeodeticsGravity/grace/L1B/JPL/RL01/docs/L2-UserHandbook_v3.0.pdf
Beutler G, Jäggi A, Mervart L, Meyer U (2010a) The celestial mechanics approach: theoretical foundations. J Geodesy 84:605–624
Beutler G, Jäggi A, Mervart L, Meyer U (2010b) The celestial mechanics approach: application to data of the GRACE mission. J Geodesy 84:661–681
Bjerhammar A (1976) On the energy integral for satellites. Technical report, Report of the Royal Institute of Technology, Stockholm
Blaha G (1992) Refinement of the satellite-to-satellite line-of-sight model in residual gravity field. Manuscr Geod 17:321–333
Chao BF, Gross RS (1987) Changes in the Earth’s rotation and low-degree gravitational field induced by earthquakes. J R Astron Soc 91:569–596
Colombo O (1984) Global mapping of gravity with two satellites. Technical report vol 7 Nr 3, Netherlands Geodetic Commission
Eicker A (2012) Gravity field refinement by radial basis functions from in-situ satellite data. Technical report, DGK Reihe C, Bd. 676
Fengler MJ, Freeden W, Kohlhaas A, Michel V, Peters T (2007) Wavelet modeling of regional variations of the Earth’s gravitational potential observed by GRACE. J Geodesy 81:5–15
Gerlach CL, Földvary L, Švehla D, Gruber T, Wermuth M, Sneeuw N, Frommknecht B, Oberhofer H, Peters T, Rothacher M, Rummel R, Steigenberger P (2003) A CHAMP-only gravity field model from kinematic orbits using the energy integral. Geophys Res Lett, doi:10.1029/2003GL018025
Han D, Wahr J (1995) The viscoelastic relaxation of a realistic stratified Earth and further analysis of post-glacial rebound. Geophys J Int 120:287–311
Han S-C (2004) Efficient determination of global gravity field from satellite-to-satellite tracking mission. Celest Mech Dyn Astron 88:69–102
Heß D, Keller W (1999) Gradiometrie mit GRACE. Z Vermess 124:137–144
ICGEM. http://icgem.gfz-potsdam.de/ICGEM/ICGEM.html, 2014
Jekeli C (1999) The determination of gravitational potential differences from satellite-to-satellite tracking. Celest Mech Dyn Astron 75:85–101
Kaula WM (2000) Theory of satellite geodesy. Applications of satellites to geodesy. Dover, New York
Keller W, Sharifi MA (2005) Satellite gradiometry using a satellite pair. J Geodesy 78:544–557
Klees R, Liu X, Wittwer T, Gunter BC, Revtova EA, Tenzer R, Ditmar P, Winsemius HC, Savenije HHG (2008) A comparison of global and regional GRACE models for land hydrology. Surv Geophys 29:335–359
Kostelec PJ, Rockmore DN (2008) FFTs on the rotation group. J Fourier Anal Appl 14:145–179
Kusche J, Schmidt R, Petrovic S, Rietbroek R (2009) Decorrelated GRACE time-variable gravity field solutions by GFZ, and their validation using a hydrological model. J Geodesy 83:903–913
Levenberg KA (1944) A method for the solution of certain problems in least squares. Q Appl Math 2:164–168
Luthcke SB, Arendt AA, Rowlands DD, McCarthy JJ, Larsen CF (2008) Recent glacier mass changes in the Gulf of Alaska region from GRACE mascons solutions. J Glaciol 54:767–777
Marquardt D (1963) An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math 11:431–443
Mayer-Gürr T (2012) Gravitationsfeldbestimmung aus der Analyse kurzer Bahnbögen am Beispiel der Satellitenmissionen CHAMP und GRACE. Technical report, DGK Reihe C, Bd. 675
Mayer-Gürr T, Eicker A, Ilk K-H (2007) ITG-Grace02s: a GRACE gravity field derived from range measurements of short arcs. In: Gravity field of the Earth, proceedings of the 1st international symposium of the international gravity field service (IGFS), Istanbul
Mayer-Gürr T, Ilk H, Eicker A, Feuchtinger M (2005) ITG-CHAMP01: a CHAMP gravity field model from short kinematic arcs over a one-year observation period. J Geodesy 78:462–480
Mayer-Gürr T, Kurtenbach E, Eicker A (2010) ITG-grace2010 gravity field model. http://www.igg.uni-bonn.de/apmg/index.php?id=itg-grace2010
Muller PM, Sjogren WL (1968) Mascons: lunar mass concentrations. Science 161:680–684
O’Keefe JA (1957) An application of Jacobi’s integral to the motion of an Earth satellite. Astron J 62:265–266
Petit G, Luzum B (2010) IERS conventions (2010) (IERS technical note 36). Technical report, Verlag des Bundesamtes für Kartographie und Geodäsie, Frankfurt am Main
Reigber C (1969) Zur Bestimmung des Gravitationsfeldes der Erde aus Satellitenbeobachtungen. Technical report, DGK Reihe C, Bd. 137
Reigber C, Jochmann H, Wünsch J, Petrovic S, Schwintzer P, Barthelmes F, Neumayer K-H, König R, Förste C, Balmino G, Biancale R, Lemoine J-M, Loyer S, Perosanz F (2004) Earth gravity field and seasonal variability from CHAMP. In: Reigber C, Lühr H, Schwintzer P, Wickert J (eds) Earth observation with CHAMP: results from three years in orbit. Springer, Berlin, pp 25–30
Reigber C, Lühr H, Grunwald L, Förste C, König R (2006) CHAMP mission 5 years in orbit. In: Flury J, Rummel R, Reigber C, Rothacher M, Boedecker G, Schreiber U (eds) Observation of the Earth system from space. Springer, Berlin/Heidelberg/New York
Reubelt T (2009) Harmonische Gravitationsfeldanalyse aus GPS-vermessenen kinematischen Bahnen niedrig fliegender Satelliten vom Typ CHAMP, GRACE, GOCE mit einem hochauflösenden Beschleunigungsansatz. Technical report, DGK Reihe C, Bd. 632
Reubelt T, Austen G, Grafarend EW (2003) Harmonic analysis of the Earth’s gravitational field by means of semi-continuous ephemerides of a low Earth orbiting GPS-tracked satellite. Case study: CHAMP. J Geodesy 77:257–278
Rowlands DD, Luthcke SB, McCarthy JJ, Klosko SM, Chinn DS, Lemoine FG, Boy J-P, Sabaka TS (2010) Global mass-flux solutions from GRACE: a comparison of parameter estimation strategies, mass concentrations versus Stokes coefficients. J Geophys Res 115:B01403
Rummel R (2003) How to climb the gravity wall. Space Sci Rev 108:1–14
Schmidt M, Han S-C, Kusche J, Sanchez L, Shum CK (2006) Regional high-resolution spatiotemporal gravity modeling from GRACE data using spherical wavelets. Geophys Res Lett 33:L08403
Schneider M (1968) A general method of orbit determination. Technical report, Library Translations, Aircraft Establishment, Ministry of Technology, Farnborough
Sneeuw N (2000) A semi-analytical approach to gravity field analysis from satellite observations. Technical report, DGK Reihe C, Bd. 527
Visser PNAM, Sneeuw N, Gerlach C (2003) Energy integral method for gravity field determination from satellite orbit coordinates. J Geodesy 77:207–216
Weisstein E Wolfram mathworld. http://mathworld.wolfram.com, 2014
Wolff M (1969) Direct measurements of the Earth’s gravitational field using a satellite pair. J Geophys Res 74:5295–5300


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_58-2 © Springer-Verlag Berlin Heidelberg 2014

Transmission Tomography in Seismology Guust Nolet Geosciences Azur, Université de Nice, Sophia Antipolis, France

Abstract

This chapter summarizes three important methods for seismic transmission tomography: the interpretation of delays in onset times of seismic phases using ray theory, of cross-correlation delays using finite-frequency methods, and of full waveforms using adjoint techniques. Delay-time techniques differ importantly in one key aspect from full waveform inversions in that they are more linear. The inverse problem for onset times is usually small enough that it can be solved by matrix inversion; for waveform inversions gradient searches are generally needed, and for cross-correlation delays the solver depends on the size of the problem. Onset times can simply be interpreted using the approximations of geometrical optics (ray theory). For cross-correlation delays one can use ray theory to compute the linearized dependency on model perturbations in a volume around the ray, if the observed phase travels a well-identified raypath. However, for diffracted pulses or headwaves, numerical solvers for the wavefield are needed. This is also the case for waveform inversions. Whatever the technique that is used, the resulting linearized system is usually underdetermined and needs to be regularized. Progress in the near future is to be expected from efforts to densify the network of seismometers and extend it to the oceanic domain, as well as from the continued growth in the power of supercomputing that will soon push waveform inversions to embrace the full frequency range of observed seismic signals.

1 Introduction

Efforts to image the subsurface using seismic observations divide broadly into two groups: those that use reflected waves and those that use transmitted waves. Reflection seismology is very much like depth sounding with sonar on a ship: we know the speed of sound in water, and the arrival time for the reflected wave can therefore be converted into depth to the sea bottom. As the ship moves on, the reflection times trace a line of sea bottom depth. Similarly, on land we can observe reflecting surfaces using a large spread of geophones and an array of sources that move on like a ship. If the reflector is not horizontal, and the reflection point is thus not located vertically beneath the source, methods of “migration” exist to take reflector topography into account. Though the velocity in the subsurface is a priori unknown, an approximate value can be deduced from the shape of the reflection curve from each source. Reflector imaging is a crucial exploration tool for the oil and gas industry.




Obtaining a more reliable estimate of the subsurface velocity is needed to improve the imaging, but this is difficult to obtain from reflected waves. Transmitted waves are more powerful to estimate the velocity and its variations in two or three dimensions. The first attempts at transmission tomography were made in the exploration industry by Bois et al. (1971) between two boreholes. The field developed mostly outside of industry, however, where lack of large arrays of sensors made transmission tomography the most promising tool to image the deep three-dimensional structure of the Earth. Nolet (2008) gives an account of the development of transmission tomography in the past 40 years. More recently, the line between reflection and transmission tomography has become less sharp; the deployment of very dense arrays over hundreds or thousands of km sometimes allows for the imaging of the deep Earth using reflected waves. On the other hand, the need for a more precise model of the shallow subsurface velocity motivates the industry to record reflected waves at large distance (“wide angle”) and abandon traditional migration (backprojection) algorithms for more sophisticated “full waveform inversions” to image energy that has been both reflected and transmitted. The model to be retrieved from such data thus consists of parameters of very different character – one attempts to map the topography of discontinuities at the same time as the variations of seismic velocities within each layer. These velocities, in turn, depend on density p compressional waves one has a velocity p , shear modulus , and bulk modulus : for ˛ D . C 2=3/=, whereas for shear waves ˇ D =. As a rule, these velocities are easier to determine than the density and elastic parameters separately. Information on the density can only be obtained if accurate amplitudes are available: reflection coefficients, for example, depend on the seismic impedances ˛ and ˇ. 
But amplitudes are influenced by many other factors such as attenuation and focusing/defocusing, and accurate estimation of density with seismic waves alone has so far proven to be elusive.

Apart from the crucial role that reflection seismics, and more recently tomographic techniques like waveform inversion, play in the search for oil and gas reservoirs hidden at depths down to half a dozen kilometers, seismic transmission tomography is the only tool that allows us to map structural anomalies with a useful resolution down to the fluid core or even to the center of the Earth. Aside from transmission tomography, the observation of the normal mode frequencies and their splitting due to lateral heterogeneity of the Earth has helped to constrain the very long wavelength structure in our planet’s interior. A discussion of this topic is beyond the scope of this chapter, which concentrates on transmission tomography. Interested readers are referred to the textbook by Dahlen and Tromp (1998).

2 Key Issues

Traditionally, seismic tomography has relied on the approximations of geometrical optics to model the travel time of a seismic wave with a line integral along a seismic “ray”:

$$T = \int_P \frac{ds}{c(\mathbf{r})}, \tag{1}$$

where $T$ is the observed time it takes the wave to travel from the earthquake or explosion source to the receiver, $c$ is the wave speed at location $\mathbf{r}$, and $P$ indicates a path satisfying Snel’s law. We use $c$ as a general notation; it stands for $\alpha$ or $\beta$ depending on the nature of the observed wave. For acoustic waves in fluids, the notation $c = \sqrt{\kappa/\rho}$ is often used as well. If $T$ is determined by picking the “onset” of the wave, which by definition satisfies Fermat’s principle of a stationary travel time, (1) provides the correct theory for its interpretation, since minimizing $T$ leads to Euler-Lagrange equations that are equivalent to Snel’s law (Nolet 2008). Our knowledge of the velocity structure of the Earth is sufficient to calculate the path $P$ with first-order accuracy. The fact that the travel time is stationary, and thus insensitive to small errors in the path, allows us to view (1) as quasi-linear in the “slowness” $c^{-1}$. If it is not sufficiently linear, it can be linearized:

$$\delta T = -\int_P \frac{1}{c_0(\mathbf{r})^2}\,\delta c(\mathbf{r})\,ds, \tag{2}$$

where $\delta T = T_{\mathrm{obs}} - T_0$ is the difference between the observed travel time and the one predicted by the background model velocity $c_0(\mathbf{r})$. The inversion problem is then handled by repeated application of (2) while adapting the trajectory $P$ to the adjusted model $c_0 + \delta c$. Equation (2) is easy to use in tomography, and the ray-theoretical approach on which it is based still dominates the field.

Ray-theoretical tomography has a number of limitations, though. The onset of a wave is often difficult to observe in the presence of noise. There exist reflected phases that follow not a strict minimum time path but a “minimax” time path (the time is at a stationary maximum for the position of the reflecting point), for which a sharp onset would not exist even in the absence of noise. And, most importantly, when one observes only the onset of a wave, one ignores information that resides in the rest of the seismogram. In particular, energy diffracted around small heterogeneities influences the waveform even if it arrives after the onset.
Finally, ray theory is inadequate to model amplitude variations, since the ray-theoretical dependence of a wave amplitude on $c(\mathbf{r})$ is highly nonlinear and leads to amplitude variations that are far larger than observed for global seismic waves at typical frequencies of 0.1–0.3 Hz (Tibuleac et al. 2003).

To improve on ray-theoretical seismology, Luo and Schuster (1991) and Dahlen et al. (2000) developed, in exploration seismics and global seismology respectively, the theory to interpret travel times estimated by picking the time of the maximum in the cross-correlation $\gamma(t)$ between the observed wave arrival $u(t)$ and a synthetic seismogram $u_0(t)$ computed for a “background” or “starting” model $m_0$:

$$\gamma(t) = \int u(t')\,u_0(t' - t)\,dt'. \tag{3}$$

Note that, in contrast to many other applications of cross-correlograms, we do not require that $u(t)$ and $u_0(t)$ are the same waveform, differing perhaps only in amplitude and noise content. If they are the same, it means we are in a domain where ray theory is valid (absence of scattered or diffracted energy), and the maximum of the cross-correlation simply gives us the same delay for the observed wave as the onset time would give us. The importance of (3) is that it allows us to interpret energy arriving after the onset, which may move the time of the maximum of $\gamma(t)$ away from the ray-theoretical delay. If we low-pass the signal, we are forced to include more of the later arriving energy in the cross-correlation window; this means that energy that has ventured further away from the raypath may influence the delay. The size of the region in the Earth that influences the delay thus depends on the frequency of the wave, as we shall see.
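As a minimal sketch of how a delay is read off the cross-correlation (3), the pure-Python fragment below shifts a Ricker wavelet by 0.75 s and recovers that shift as the lag at which the discrete correlation is largest. The Ricker pulse, its 0.2 Hz dominant frequency, the sampling interval, and the names `ricker` and `cross_correlation_delay` are illustrative assumptions, not choices made in the text.

```python
import math

def ricker(t, f0):
    """Ricker wavelet with dominant frequency f0 (Hz), centred on t = 0."""
    a = (math.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)

def cross_correlation_delay(u, u0, dt, max_lag):
    """Lag (in s) that maximizes gamma(lag) = sum_k u[k] * u0[k - lag], cf. Eq. (3)."""
    n = len(u)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for k in range(n):
            j = k - lag
            if 0 <= j < n:
                val += u[k] * u0[j]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag * dt

dt = 0.05                                        # sampling interval (s), assumed
times = [k * dt for k in range(-200, 201)]       # -10 s .. 10 s
u0 = [ricker(t, 0.2) for t in times]             # synthetic seismogram
u = [ricker(t - 0.75, 0.2) for t in times]       # "observed" arrival, 0.75 s late
delay = cross_correlation_delay(u, u0, dt, max_lag=60)
```

Here the two waveforms are identical apart from the shift, so the result coincides with the onset delay; the interesting cases in the text are those in which later-arriving energy changes the waveform and moves the maximum.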


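In order to interpret the cross-correlation travel time, we assume that $u(t)$ has a form slightly different from $u_0(t)$:

$$u(t) = u_0(t) + \delta u(t). \tag{4}$$

Denoting the autocorrelation of $u_0(t)$ by $\gamma_0(t)$, we find for the observed cross-correlation

$$\gamma(t) = \gamma_0(t) + \delta\gamma(t) = \int \left[u_0(t') + \delta u(t')\right] u_0(t' - t)\,dt', \tag{5}$$

which reaches a maximum after a delay $\delta T$:

$$\dot\gamma(\delta T) = \dot\gamma_0(\delta T) + \delta\dot\gamma(\delta T) = \dot\gamma_0(0) + \ddot\gamma_0(0)\,\delta T + \delta\dot\gamma(0) + O(\delta^2) = 0. \tag{6}$$

Since $\dot\gamma_0(0) = 0$, we find to first order

$$\delta T = -\frac{\delta\dot\gamma(0)}{\ddot\gamma_0(0)} = \frac{\int_{-\infty}^{\infty} \dot u_0(t')\,\delta u(t')\,dt'}{\int_{-\infty}^{\infty} \ddot u_0(t')\,u_0(t')\,dt'} = \frac{\mathrm{Re}\int_0^{\infty} i\omega\,u_0(\omega)^{*}\,\delta u(\omega)\,d\omega}{\int_0^{\infty} \omega^2\,u_0(\omega)^{*}\,u_0(\omega)\,d\omega}. \tag{7}$$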
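The time-domain form of the first-order delay can be checked numerically. For a pure time shift, $\delta u(t) = u_0(t - \delta T) - u_0(t)$, the ratio $\int \dot u_0\,\delta u\,dt' / \int \ddot u_0\,u_0\,dt'$ should return approximately $\delta T$ when the shift is small compared with the pulse width, and should underestimate it otherwise. The sketch below assumes a Gaussian pulse of width `sigma` (an illustrative choice) with analytic derivatives; `linearized_delay` is a hypothetical helper name.

```python
import math

def gaussian_pulse(t, sigma):
    """Gaussian pulse exp(-t^2 / (2 sigma^2)) used as the reference waveform u0."""
    return math.exp(-t * t / (2.0 * sigma * sigma))

def linearized_delay(delta_t, sigma=1.0, dt=0.01, half_window=10.0):
    """First-order delay estimate: int(u0' * du) / int(u0'' * u0), time domain."""
    n = int(round(half_window / dt))
    num = 0.0
    den = 0.0
    for k in range(-n, n + 1):
        t = k * dt
        u0 = gaussian_pulse(t, sigma)
        u0_dot = -t / sigma**2 * u0                           # analytic du0/dt
        u0_ddot = (t * t / sigma**4 - 1.0 / sigma**2) * u0    # analytic d2u0/dt2
        du = gaussian_pulse(t - delta_t, sigma) - u0          # exact perturbation
        num += u0_dot * du * dt
        den += u0_ddot * u0 * dt
    return num / den

small_shift = linearized_delay(0.1)   # shift much smaller than the pulse width
large_shift = linearized_delay(1.0)   # shift comparable to the pulse width
```

For this particular pulse the estimate equals $\delta T\,\exp(-\delta T^2/4\sigma^2)$, so it is accurate to a fraction of a percent for the small shift and about 22 % too low for the large one.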

The frequency-domain expression in (7) was obtained with Parseval’s theorem and a Fourier sign convention $e^{i\omega t}$ for the time signal. Equation (7) does not yet give us a direct link between an observed delay and the velocity structure of the Earth $c(\mathbf{r})$ like (1) does. For that we need one more linearization that relates perturbations $\delta c(\mathbf{r})$ in the background velocity $c_0(\mathbf{r})$ to perturbations $\delta u(\omega)$ in the wave arrival. Born theory, a first-order scattering theory, is the vehicle required for this. The wavefield satisfies the elastodynamic equations, which we write symbolically (using boldface to acknowledge that the displacement is a vector field even if we may observe only one component):

$$A_0 \mathbf{u}_0 = \mathbf{f}, \tag{8}$$

where $\mathbf{f}$ represents the source and $A_0$ is an operator representing the second-order differential equations for elastic motion with density $\rho$ and elastic coefficients $c_{iklm}$ for the background model, or more generally

$$(A_0 \mathbf{u})_i = \rho\,\frac{\partial^2 u_i}{\partial t^2} - \sum_{klm} \frac{\partial}{\partial x_k}\left( c_{iklm}\,\frac{\partial u_m}{\partial x_l} \right). \tag{9}$$

If we discretize the model as well as time, $A_0$ and its boundary conditions are represented by a matrix that operates on a displacement field represented by the vector $\mathbf{u}_0$. If $\mathbf{f} = \delta(\mathbf{x} - \mathbf{x}')\,\delta(t)\,\hat{\mathbf{e}}_j$, a unit force in direction $j$ at time $t = 0$ at the point $\mathbf{x}'$, we denote the $i$th component of the solution by the Green’s function $G_{ij}(\mathbf{x}, \mathbf{x}'; t)$. For a more general force distribution, the solution is then

$$u_i(\mathbf{x}, t) = \sum_j \int_{-\infty}^{\infty} \int_V G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')\,f_j(\mathbf{x}', t')\,dV'\,dt'. \tag{10}$$


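The perturbed system satisfies

$$A\mathbf{u} = [A_0 + \delta A](\mathbf{u}_0 + \delta\mathbf{u}) = \mathbf{f}, \tag{11}$$

with

$$(\delta A\,\mathbf{u})_i = \delta\rho\,\frac{\partial^2 u_i}{\partial t^2} - \sum_{klm} \frac{\partial}{\partial x_k}\left( \delta c_{iklm}\,\frac{\partial u_m}{\partial x_l} \right), \tag{12}$$

or, after subtracting (8),

$$A_0\,\delta\mathbf{u} = -\delta A\,\mathbf{u}_0 + O(\delta^2). \tag{13}$$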

We thus see that $\delta\mathbf{u}$ satisfies the same equations as $\mathbf{u}_0$, but the source term is replaced by $-\delta A\,\mathbf{u}_0$. The heterogeneities in the medium act as a source of scattered energy. Since $\delta A$ is linear in the perturbations $\delta\rho$ and $\delta c_{jklm}$, we have effectively linearized the inverse problem for these model parameters; the expression for the perturbed wavefield in a general anisotropic medium with its elastic moduli described by a fourth-order tensor $c_{jklm}$ is given with some algebra by inserting the scattering source term into (10):

$$\delta u_i(t) = -\sum_j \int_0^t \int_V \left[ \delta\rho(\mathbf{x}')\,G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm} \delta c_{jklm}(\mathbf{x}')\,\frac{\partial G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')}{\partial x'_k}\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l} \right] dV'\,dt'. \tag{14}$$

Even though only 21 of the 81 constants $c_{jklm}$ are independent, in practice it is highly unrealistic to work with so many elastic constants of which the spatial variations can never be resolved, and one usually assumes an isotropic Earth,

$$c_{jklm} = \left(\kappa - \tfrac{2}{3}\mu\right)\delta_{jk}\delta_{lm} + \mu\left(\delta_{jl}\delta_{km} + \delta_{jm}\delta_{kl}\right) \tag{15}$$

(with $\delta_{ij}$ the Kronecker delta), or anisotropy with a single symmetry axis.

Since the expression (7) is linear in $\delta u$, and $\delta u$ itself can be linearized using Born theory, (7) also represents a linearized relationship between the cross-correlation delay $\delta T$ and the perturbations in the model density and elastic parameters. In other words,

$$\delta T = \int K_T(\mathbf{r})\,\delta m(\mathbf{r})\,d^3 r, \tag{16}$$

where we write $\delta m(\mathbf{r})$ for the perturbation in any one of the model parameters (or a combination of them) and where the integration is over the volume of the Earth where the kernel $K_T(\mathbf{r})$ is not negligible. If the volume sensitivity (16) is used for the interpretation, one speaks of finite-frequency tomography. If the wave arrivals are also filtered in different frequency bands, essentially capturing the dispersion in $\delta T$, it is called multiple-frequency tomography.


2.1 Linearity

The linearity of the problem is a key issue. If we estimate $\delta T$ by cross-correlation, we rely on three linearizations to establish the dependence of this delay on perturbations in density and elastic parameters: the change in the location of the maximum in the cross-correlation function $\gamma(t)$ in (6), the change in $\gamma$ itself as defined by (5), and the change in the wavefield $u_0$ as obtained from the Born approximation (14).

If the heterogeneity in the Earth is smooth, ray theory is valid: a pulse-like arrival can be delayed by $\delta T$ and may change amplitude by focusing, but its shape remains intact (there is no “dispersion”). In that case $\delta T$ is a linear function of the elastic perturbation: a perturbation double in amplitude, or over a layer thickness twice as large, will double $\delta T$. The limitations of Born theory have no effect on this: Born is used to establish the functional derivative kernel $K_T(\mathbf{r})$ in the limit $\delta m \to 0$. This derivative is always correct for very small perturbations, but as long as the delay $\delta T$ depends linearly on $\delta m$, we can use this derivative over a much larger range of perturbations than would be permitted by the Born approximation itself.

If the heterogeneity in the Earth has sharp transitions generating reflections which reflect again (“second-order scattering”), the linearity assumed in Born theory is affected if the impedance contrasts are strong enough that the energy transferred to scattered waves is non-negligible. In this case, the waveform will be affected and the cross-correlation delay becomes frequency dependent. Mercerat and Nolet (2013) test the linearity of the cross-correlation delays as a function of model heterogeneity and dominant frequency of the signal. Figure 1 shows the result for a pulse propagating in a model with random heterogeneity with a scale length close to the dominant wavelength of the wave. The observed delay approximately doubles when the r.m.s. velocity perturbation is doubled from 5 % to 10 %. Though there is some scatter around the purely linear relationship (the solid line in Fig. 1), the errors implied by this scatter are generally acceptable when compared to the observational errors. This validates the assumption of linearity for many applications in global tomography, where seismic anomalies below the crust rarely exceed 10 %. For crustal applications, or for near-surface studies, nonlinearity may pose a problem, though.

Fig. 1 Left: a random model with a Gaussian autocorrelation, a standard deviation of 300 m/s (5 %) in seismic velocity, and a correlation length of 12 m. Only a slice through the model is shown. Right: measured cross-correlation delays for a set of seismograms computed from 4 sources placed in boreholes at the vertical edges of the slice shown on the left, both for the 5 % model and for a model with the anomalies amplified by a factor of 2 (10 %). The dominant wavelength is 24 m. The line denotes the delay predictions for a fully linear relationship (After Mercerat and Nolet (2013))

In full waveform tomography, the observed seismogram $u$ itself is the datum and we invert for the difference with the predicted seismogram; for one of the components,

$$\delta u(t) = u(t) - u_0(t) = \int K_u(\mathbf{r}, t)\,\delta m(\mathbf{r})\,d^3 r, \tag{17}$$

where the time-dependent kernel $K_u(\mathbf{r}, t)$ is implicitly given in (14). It is important to notice that in this case, even if ray theory is valid, a further linearization enters into consideration that is not needed in the case of delay-time interpretation, since we need to assume that a first-order Taylor expansion for the delay $\delta T$ is valid:

$$u(t) = u_0(t - \delta T) \approx u_0(t) - \delta T\,\dot u_0(t), \tag{18}$$

which is clearly limited to $\delta T$ much less than the dominant period of the pulse. Even if $\delta T$ can still be correctly estimated using linearized theory (e.g., when ray theory is valid), the waveform perturbation becomes nonlinear. Waveform inversions are therefore much more nonlinear than delay-time inversions. In addition, their dependence on the amplitude of the observed seismogram requires an accurate knowledge of the amplitude response of the instrument as well as of impedance effects of the soil directly beneath the instrument.
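The restriction that (18) places on waveform data can be made concrete with a small experiment. The sketch below measures the relative misfit between the exact waveform perturbation $u_0(t - \delta T) - u_0(t)$ and its linearized form $-\delta T\,\dot u_0(t)$ for a delay of 1 % and of 50 % of the dominant period. The Ricker pulse, the 5 s period, and the helper name `taylor_misfit` are assumptions made for illustration only.

```python
import math

def ricker(t, f0):
    """Ricker wavelet with dominant frequency f0 (Hz), centred on t = 0."""
    a = (math.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)

def taylor_misfit(delta_t, f0=0.2, dt=0.01, half_window=15.0, h=1e-4):
    """Relative L2 error of the first-order expansion in Eq. (18)."""
    n = int(round(half_window / dt))
    err2 = 0.0
    ref2 = 0.0
    for k in range(-n, n + 1):
        t = k * dt
        exact = ricker(t - delta_t, f0) - ricker(t, f0)
        # central finite difference for du0/dt
        u0_dot = (ricker(t + h, f0) - ricker(t - h, f0)) / (2.0 * h)
        linear = -delta_t * u0_dot
        err2 += (exact - linear) ** 2
        ref2 += exact ** 2
    return math.sqrt(err2 / ref2)

period = 5.0                                  # dominant period 1/f0 in seconds
misfit_small = taylor_misfit(0.01 * period)   # delay of 1 % of the period
misfit_large = taylor_misfit(0.50 * period)   # delay of half a period
```

With these settings the linearized perturbation is accurate to a few percent for the small delay, whereas for a delay of half a period the misfit exceeds the size of the perturbation itself, which is the nonlinearity the paragraph above refers to.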

2.2 Solving the Forward Problem

In practice, we face a choice of how to compute $\mathbf{u}_0$ and with that the force term in (13). For very complex systems we have little choice but to discretize $A$ and use purely numerical methods such as a finite difference solver or the spectral element method (Luo and Schuster 1991; Tromp et al. 2005). For sufficiently smooth models in which the wave travels as one coherent pulse-like arrival, we may approximate the Green’s function in the spectral domain as

$$G_{ij}(\mathbf{x}_s, \mathbf{x}_r, \omega) \approx \frac{\mathcal{A}_{ij}(\omega)}{R_{rs}}\,e^{-i\omega T_{rs}}, \tag{19}$$

where $\mathcal{A}_{ij}$ defines the amplitude and polarization of the wave radiated from the source $s$ in the direction of the receiver $r$ and where ray theory provides the geometrical spreading $R_{rs}$ and the travel time $T_{rs}$. Equation (19) is known as the ray-theoretical solution. Its validity is limited to phases with a well-defined trajectory, away from focal points or caustic surfaces. Surface waves, headwaves, or other diffracted waves cannot be modeled using ray theory in this way and require a more numerical treatment. Fortunately, efficient numerical methods are available that rely on the symmetry of the background model. For frequencies below about 0.1 Hz, summation of normal modes can be used (Zhao et al. 2000). The direct solution method, a Galerkin-type method, can compute synthetic seismograms up to 2 Hz (Kawai et al. 2006). A 2D version of the spectral element method, applied to a spherically symmetric Earth model, provides another alternative (Nissen-Meyer et al. 2007).

2.3 Kernel Computation

Wavefields are reciprocal: a force in direction $\hat{\mathbf{e}}_1$ observed on a seismometer component in direction $\hat{\mathbf{e}}_2$ yields the same seismogram that we obtain if we interchange the source and receiver locations and observe an $\hat{\mathbf{e}}_1$ component from a source in the $\hat{\mathbf{e}}_2$ direction. This property is used to significantly reduce the computational effort needed to compute $\delta u(\omega)$ in (7) and (14). Formally,

$$G_{ij}(\mathbf{x}, \mathbf{x}'; t - t') = G_{ji}(\mathbf{x}', \mathbf{x}; t - t'), \tag{20}$$

so that we only need to compute the wavefield $\mathbf{u}_0$ from a source location $\mathbf{x}_s$, and the Green’s function for a source at the (receiver) location $\mathbf{x}_r$:

$$\delta u_i(\mathbf{x}_r, t) = -\sum_j \int_0^t \int_V \left[ \delta\rho(\mathbf{x}')\,G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm} \delta c_{jklm}(\mathbf{x}')\,\frac{\partial G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')}{\partial x'_k}\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l} \right] dV'\,dt'. \tag{21}$$

Substitution of (21) and (15) into (7) gives the kernels that define the linearized relationship between the delay and perturbations in $\rho$, $\kappa$, and $\mu$ (see also Sect. 2.5). To translate this parameterization into more convenient perturbations of density and seismic velocity $(\rho, \alpha, \beta)$, we use

$$K_\alpha = 2\rho\alpha\,K_\kappa,$$

$$K_\beta = 2\rho\beta\left(K_\mu - \tfrac{4}{3}K_\kappa\right),$$

$$K'_\rho = K_\rho + \beta^2 K_\mu + \frac{\kappa}{\rho}\,K_\kappa,$$

where the accent on the density kernel indicates that we vary $\rho$ while keeping the seismic velocity, rather than $\kappa$ and $\mu$, constant. In practice the sensitivity to density is weak and generally ignored.

So far we assumed we have to use finite difference or spectral element methods to compute the Green’s functions from source and receiver locations. But if we use a smooth background model and the ray-theoretical Green’s function (19) to find the kernel expressions for well-defined arrivals such as P or S, we are able to find analytical kernel expressions, directly in terms of the seismic velocity. If, in addition, we neglect differences between the initial amplitude $\mathcal{A}_{ij}(\omega)$ of the direct wave and that of the wave that departs in the direction of the scatterer, as well as the directivity of the scatterer, the expression for the kernel $K_c$ (where $c$ stands for $\alpha$ or $\beta$) becomes

$$K_c(\mathbf{x}') = -\frac{1}{2\pi\,c(\mathbf{x}_r)\,c(\mathbf{x}')}\,\frac{R_{rs}}{R_{xr} R_{xs}}\,\frac{\int_0^{\infty} \omega^3\,|u_0(\omega)|^2 \sin[\omega\,\Delta T(\mathbf{x}')]\,d\omega}{\int_0^{\infty} \omega^2\,|u_0(\omega)|^2\,d\omega}, \tag{22}$$

where $\mathbf{x}'$ is the location of the scatterer, $R_{xr}$ is the geometrical spreading of a ray from $r$ to the scatterer (and $R_{xs}$ likewise from the source $s$), and $\Delta T(\mathbf{x}')$ is the extra travel time of the wave scattered at $\mathbf{x}'$ relative to the direct wave. Though this expression is somewhat simplified, it captures the essential differences between ray-theoretical and finite-frequency sensitivity and is usually accurate enough to be used in this form. An example of a kernel computed in this way is shown in Fig. 2. It is ironic that we can use ray theory to improve on ray theory, but the use of (22) can speed up the computation by two to three orders of magnitude with respect to the spectral element method (Mercerat and Nolet 2012), so it is certainly worth the effort. For more complete expressions that include amplitude variations and angle dependence of the scattering as well as possible phase shifts caused by supercritical reflections or passage of a caustic, or for kernels for alternative delay-time definitions, see Nolet (2008).

Fig. 2 An example of a travel time kernel $K_\alpha(\mathbf{x})$ (Eq. 22) for a P wave arrival with a dominant period of 20 s. The color scale indicates values of the kernel in $10^{-7}$ s/km$^3$. Note the zero sensitivity at the location of the geometrical raypath in the center and the existence of the second Fresnel zone with reversed sensitivity. The black line plots the value of the kernel at a cross section through its center
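To illustrate the behavior described by the analytical travel time kernel (zero sensitivity on the geometrical ray, negative sensitivity in the first Fresnel zone, reversed sign further out), the sketch below evaluates the kernel in a homogeneous medium. Several things here are assumptions made only for the example: the velocity is constant, so that the geometrical spreading equals the distance and the detour time is (|x' - s| + |x' - r| - |r - s|)/c; the power spectrum is a Gaussian centred on the dominant frequency; and the kernel carries the prefactor 1/(2 pi c c). The function name `travel_time_kernel` is hypothetical.

```python
import math

def travel_time_kernel(x, src, rec, c, period, n_omega=400):
    """Finite-frequency delay kernel at scatterer x in a homogeneous medium."""
    r_rs = math.dist(src, rec)          # spreading = distance when c is constant
    r_xs = math.dist(x, src)
    r_xr = math.dist(x, rec)
    detour = (r_xs + r_xr - r_rs) / c   # extra travel time via the scatterer
    omega0 = 2.0 * math.pi / period
    sigma = 0.2 * omega0                # assumed spectral width
    d_omega = 3.0 * omega0 / n_omega
    num = 0.0
    den = 0.0
    for k in range(1, n_omega + 1):     # rectangle rule over 0 < omega <= 3*omega0
        w = k * d_omega
        power = math.exp(-((w - omega0) / sigma) ** 2)   # assumed |u0(omega)|^2
        num += w**3 * power * math.sin(w * detour) * d_omega
        den += w**2 * power * d_omega
    return -r_rs / (2.0 * math.pi * c * c * r_xr * r_xs) * num / den

source = (0.0, 0.0, 0.0)
receiver = (6000.0, 0.0, 0.0)           # km
k_on_ray = travel_time_kernel((3000.0, 0.0, 0.0), source, receiver, 10.0, 20.0)
k_first_zone = travel_time_kernel((3000.0, 200.0, 0.0), source, receiver, 10.0, 20.0)
k_second_zone = travel_time_kernel((3000.0, 675.0, 0.0), source, receiver, 10.0, 20.0)
```

With distances in km and velocity in km/s, the three sample points lie on the ray, about 1.3 s of detour time away from it, and about 15 s away (three quarters of the 20 s period), which places the last one in the zone of reversed sensitivity.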

2.4 Regularization of Large Matrix Systems

For the discretization of the model, we face several choices, all of which can be written in the form $\delta c(\mathbf{x}) = \sum_{j=1}^{M} m_j h_j(\mathbf{x})$, using a set of basis functions $h_j(\mathbf{x})$, $j = 1, \ldots, M$. For the basis $h_j$ we can choose a local parameterization into cells or “voxels” ($h_j = 1$ in voxel $j$, 0 outside), a global parameterization involving spherical harmonics, or a compromise between the two using wavelets. Equations (2) and (7) for delays, or (14) for waveform data, can then be written in a general matrix notation:

$$A\mathbf{m} = \mathbf{d}. \tag{23}$$

Since this system is almost always underdetermined (see Sect. 2.6), it must be regularized. The regularization is done by minimizing a penalty function, the generic form of which is

$$J(\mathbf{m}) = \tfrac{1}{2}\,\|A\mathbf{m} - \mathbf{d}\| + R(\mathbf{m}). \tag{24}$$

For the data misfit one usually adopts the Euclidean norm (least squares fit). $R(\mathbf{m})$ is a measure of the size and/or complexity of the model that we wish to keep under control. There are many different choices, but the most important in seismic tomography are (Loris 2013)

$$R(\mathbf{m}) = \int \delta c(\mathbf{x})^2\,dV \quad \text{(Tikhonov or norm damping)} \tag{25}$$

$$R(\mathbf{m}) = \int |\nabla \delta c(\mathbf{x})|\,dV \quad \text{(total variation or Laplacian smoothing)} \tag{26}$$

$$R(\mathbf{m}) = \sum_j |m_j| \ \text{ for a wavelet basis } h_j \quad \text{(compressed sensing)} \tag{27}$$
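As a small illustration of norm damping, the sketch below solves a three-voxel, two-ray system with a squared Euclidean misfit and a Tikhonov term. The explicit trade-off weight `damping` multiplying the model norm is an assumption added for the example (the generic penalty function above absorbs any such weight into R), and the dense normal-equation solve is only practical for toy problems. Both function names are hypothetical.

```python
def solve_linear(mat, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def damped_least_squares(A, d, damping):
    """Minimize 0.5*|A m - d|^2 + 0.5*damping*|m|^2 via (A^T A + damping I) m = A^T d."""
    n_model = len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) + (damping if i == j else 0.0)
            for j in range(n_model)] for i in range(n_model)]
    atd = [sum(row[i] * d_i for row, d_i in zip(A, d)) for i in range(n_model)]
    return solve_linear(ata, atd)

# Two rays crossing three voxels; entries are path lengths in each voxel.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
d = [2.0, 2.0]
m_weak = damped_least_squares(A, d, 1e-6)     # close to the minimum-norm solution
m_strong = damped_least_squares(A, d, 10.0)   # heavily damped, smaller model
```

Without damping the normal equations of this underdetermined system are singular; a weak damping selects the smallest model that fits the data, and a strong damping trades data fit for an even smaller model.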

2.5 Adjoint Inversion

If a waveform inversion is pursued, the inverse problem easily becomes too large to fit in the memory of computer clusters. But even in finite-frequency or multiple-frequency delay-time inversions, the large matrix size may be prohibitive. In that case one may attempt to find a solution by searching in the model space along the gradient of the penalty function $J(\mathbf{m})$, using “adjoint” equations to compute the gradient at each step. Since the gradient is recomputed anyway, adjoint inversions lend themselves very well to highly nonlinear problems, such as full waveform inversions. In travel time tomography of the Earth’s mantle, nonlinearity is weak, and it can be much more efficient to store the rows of the matrix $A$ on disk than to recompute them.

Excellent descriptions of adjoint inversion for waveform as well as delay-time tomography can be found in Tromp et al. (2005) and Fichtner et al. (2006). Here I first give a simple example of adjoint inversion for the matrix system (23), followed by the more complicated case for waveform inversion. If we use the Euclidean norm $\|\cdot\| = |\cdot|^2$ in (24), the gradient of $J$ with respect to the elements of the model vector $\mathbf{m}$ is

$$\nabla J = \frac{\partial J}{\partial \mathbf{m}} = A^T(A\mathbf{m} - \mathbf{d}) = A^T\mathbf{r}, \tag{28}$$

where $A^T$ is the transpose of $A$ and $\mathbf{r}$ is a vector of travel time residuals. For simplicity we ignore here the contribution of the regularization term $R$. The computation of the data residuals $\mathbf{r}$ is straightforward and is usually done “on the fly” while computing each row of the matrix $A$, since each row corresponds to one particular source-station pair. For the multiplication of $\mathbf{r}$ with $A^T$, we again need the matrix. We can either recompute it or read it back from disk; the first approach is doable for a homogeneous background model, but quickly becomes too slow even for simple layered models. There is an apparent difficulty in that we compute (and store or read) the matrix in row order, whereas the multiplication with $A^T$ is normally done in column order (which is row order for the transpose matrix). This, however, is not really needed, as the following pseudo-code for a row-order multiplication shows:


    To compute g = A^T r:
      set g = 0
      for i = 1, ..., N
        read row i from disk (or compute it)
        for j = 1, ..., M
          g_j <- g_j + A_ij r_i

The function of $A^T$ is to project the data residuals back into the model space. The residuals observed in a particular station are redistributed along the raypaths to that station. Where raypaths cross, and the sign of the residuals is the same, their sum will create a visible anomaly. Thus, once the gradient $A^T\mathbf{r}$ has been obtained, the model can be updated. The simplest form would be

$$\mathbf{m}_{\mathrm{iter}+1} = \mathbf{m}_{\mathrm{iter}} - \alpha\,\nabla J, \tag{29}$$

where the optimal step size $\alpha$ is typically found through quadratic interpolation on three values of $J(\mathbf{m}_{\mathrm{iter}} - \alpha\,\nabla J)$, where $\alpha$ is near $2J/|\mathbf{g}|$. Convergence can be sped up using conjugate gradients (Fletcher and Reeves 1964).

If we compute the kernels using a full waveform algorithm rather than ray theory, we must substitute the Born approximation (21) for $\delta u$ in (7); limiting the time integral to the cross-correlation window $[0, T]$ for the $i$th component of the seismogram, this gives

$$\delta T(\mathbf{x}_r) = -\frac{1}{E} \int_0^T \dot u_{0i}(\mathbf{x}_r, t) \sum_j \int_0^t \int_V \left[ \delta\rho(\mathbf{x}')\,G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm} \delta c_{jklm}(\mathbf{x}')\,\frac{\partial G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')}{\partial x'_k}\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l} \right] dV'\,dt'\,dt, \tag{30}$$

where $E = \sum_i \int_{-\infty}^{\infty} \ddot u_{0i}(t)\,u_{0i}(t)\,dt$. A similar “backprojection” interpretation can be obtained if we identify the travel time adjoint field as

$$u_j^{\dagger}(\mathbf{x}', \mathbf{x}_r; T - t') = \frac{1}{E} \int_0^{T - t'} G_{ji}(\mathbf{x}', \mathbf{x}_r; T - t - t')\,\dot u_{0i}(\mathbf{x}_r, T - t)\,dt, \tag{31}$$

which is generated from the “adjoint source”:

$$f_i^{\dagger}(\mathbf{x}, t) = \frac{1}{E}\,\dot u_{0i}(\mathbf{x}_r, T - t)\,\delta(\mathbf{x} - \mathbf{x}_r). \tag{32}$$

Equations (31) and (32) can be used to compute the kernel in (30) using a numerical algorithm that gives the wavefield in complicated media, in which the ray-theoretical expression (22) is not valid. They can also be used to interpret cross-correlation delays for arbitrary parts of the seismogram that are not identifiable as a ray arrival. This, for example, is needed in the case of headwaves such as the Pn wave that travels along the crust-mantle interface, or the core-diffracted Pdiff wave.
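Returning to the matrix system (23), the row-order accumulation of the gradient and a steepest-descent update with a step size found by quadratic interpolation can be sketched as follows. This is a minimal illustration under stated assumptions: the matrix is held in memory as a list of rows (standing in for rows read from disk), the regularization term is ignored, the step is taken against the gradient, and the trial step `j0 / g2` used for the three-point interpolation is an arbitrary choice. The function names are hypothetical.

```python
def gradient_row_order(rows, m, d):
    """Accumulate g = A^T (A m - d) and J = 0.5*|A m - d|^2, one row of A at a time."""
    g = [0.0] * len(m)
    misfit = 0.0
    for row, d_i in zip(rows, d):            # in practice: read row i from disk
        r_i = sum(a * m_j for a, m_j in zip(row, m)) - d_i
        misfit += 0.5 * r_i * r_i
        for j, a_ij in enumerate(row):       # backproject residual i along row i
            g[j] += a_ij * r_i
    return g, misfit

def steepest_descent(rows, d, n_model, n_iter=50):
    """Gradient search with a step from quadratic interpolation on three J values."""
    m = [0.0] * n_model
    for _ in range(n_iter):
        g, j0 = gradient_row_order(rows, m, d)
        g2 = sum(g_j * g_j for g_j in g)
        if g2 < 1e-24:                       # gradient vanished: converged
            break
        a = j0 / g2                          # assumed trial step
        j1 = gradient_row_order(rows, [x - a * y for x, y in zip(m, g)], d)[1]
        j2 = gradient_row_order(rows, [x - 2.0 * a * y for x, y in zip(m, g)], d)[1]
        curvature = j2 - 2.0 * j1 + j0
        if curvature <= 0.0:
            break
        # minimum of the parabola through (0, j0), (a, j1), (2a, j2)
        alpha = a * (3.0 * j0 - 4.0 * j1 + j2) / (2.0 * curvature)
        m = [x - alpha * y for x, y in zip(m, g)]
    return m

rows = [[1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0]]                     # two rays, three voxels
d = [1.0, 3.0]
m_est = steepest_descent(rows, d, n_model=3)
```

Because the misfit is quadratic in the step length, the three-point interpolation returns the exact line minimum here; starting from a zero model, the iteration converges to the smallest model that fits both data.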


Note that the cross-correlation delay itself is not needed to generate the adjoint field; in view of the weak nonlinearity, the computation can thus be done once and for all even if the system is solved by gradient searches. Equation (30) has the form (16); the kernel is implicitly defined by this equation. If we do a full waveform inversion, i.e., if we invert (21) directly, this is not the case. We define the penalty function:

$$J = \frac{1}{2} \sum_{r=1}^{N} \int_0^T |u_{0i}(\mathbf{x}_r, t) - u_i(\mathbf{x}_r, t)|^2\,dt. \tag{33}$$

A summation over components $i$ can be implicitly assumed in case we invert for more than one component of the wavefield. Perturbing the model gives a perturbation in $J$:

$$\delta J = \sum_{r=1}^{N} \int_0^T \left[u_{0i}(\mathbf{x}_r, t) - u_i(\mathbf{x}_r, t)\right] \delta u_i(\mathbf{x}_r, t)\,dt. \tag{34}$$

Tromp et al. (2005) show that the adjoint field is now given by

$$u_j^{\dagger}(\mathbf{x}', t') = \sum_{r=1}^{N} \sum_i \int_0^{T - t'} G_{ji}(\mathbf{x}', \mathbf{x}_r; T - t - t') \left[u_{0i}(\mathbf{x}_r, T - t) - u_i(\mathbf{x}_r, T - t)\right] dt, \tag{35}$$

with an adjoint source that sums the waveform discrepancies in all receivers:

$$f_i^{\dagger}(\mathbf{x}, t) = \sum_{r=1}^{N} \left[u_{0i}(\mathbf{x}_r, T - t) - u_i(\mathbf{x}_r, T - t)\right] \delta(\mathbf{x} - \mathbf{x}_r), \tag{36}$$

and substituting this, and the expression for the perturbed field (21) in (34), gives again a kernel interpretation of the form

$$\delta J = \int \left[ K_\rho(\mathbf{x}')\,\delta\rho(\mathbf{x}') + K_\kappa(\mathbf{x}')\,\delta\kappa(\mathbf{x}') + K_\mu(\mathbf{x}')\,\delta\mu(\mathbf{x}') \right] dV', \tag{37}$$

but this formulation cannot be used in practice since the matrices to be stored are too large. Instead, the backprojection is done as in the case of (28) by computing the wavefield back from each receiver using the virtual sources (36). However, in contrast to the backprojection of travel time delays, the source terms depend on the observed misfit in each station. Note that, even if we invert for only one component, all three components enter in the computation of the gradient. Since horizontal components are often much noisier than the vertical ones, this is not a trivial observation.

2.6 Resolution Analysis

Errors in the data propagate into the solution. Moreover, since the problem is almost always underdetermined, the regularization introduces a bias into the solution obtained. Only for the smallest tomographic problems are we able to compute the a posteriori covariance matrix of the solution. For large problems, the bias can be studied by generating a synthetic data set $\mathbf{d}_{\mathrm{synt}}$ for a known model $\mathbf{m}_{\mathrm{synt}}$, solving the system $A\mathbf{m} = \mathbf{d}_{\mathrm{synt}}$, and comparing the solution $\mathbf{m}$ with $\mathbf{m}_{\mathrm{synt}}$. If one adds an error to the synthetic data with a distribution equal to that estimated for the real data, and if one repeats the exercise many times for the same $\mathbf{m}_{\mathrm{synt}}$ but different realizations of the errors, the covariance matrix of the solution can be estimated as well. Since the synthetic model often takes the form of voxels with alternating positive and negative anomalies, typically of a few percent, such tests are widely referred to as “checkerboard tests,” referring to the checkerboard-like image of $\mathbf{m}_{\mathrm{synt}}$ when plotted in two-dimensional cross sections.
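A checkerboard test of this kind can be sketched on a toy problem. The example below uses a 2 x 2 grid of voxels, straight rays along rows, columns, and diagonals, a weakly damped gradient solver, and optional Gaussian noise; all of these (including the names `forward`, `damped_solve`, and `checkerboard_test`, the damping value, and the fixed random seed) are illustrative assumptions rather than a prescribed procedure.

```python
import math
import random

def forward(A, m):
    """Predict data d = A m."""
    return [sum(a * m_j for a, m_j in zip(row, m)) for row in A]

def damped_solve(A, d, damping=1e-6, n_iter=500):
    """Gradient (Landweber) iterations on 0.5*|A m - d|^2 + 0.5*damping*|m|^2."""
    n_model = len(A[0])
    # step below 1 / (largest eigenvalue of A^T A + damping), so iterations converge
    step = 1.0 / (sum(a * a for row in A for a in row) + damping)
    m = [0.0] * n_model
    for _ in range(n_iter):
        r = [p - d_i for p, d_i in zip(forward(A, m), d)]
        g = [sum(row[j] * r_i for row, r_i in zip(A, r)) + damping * m[j]
             for j in range(n_model)]
        m = [m_j - step * g_j for m_j, g_j in zip(m, g)]
    return m

def checkerboard_test(A, m_synt, noise_std=0.0, n_trials=1, seed=0):
    """Mean recovered model and per-voxel standard deviation over noise trials."""
    rng = random.Random(seed)
    d_clean = forward(A, m_synt)
    solutions = []
    for _ in range(n_trials):
        d = [d_i + rng.gauss(0.0, noise_std) for d_i in d_clean]
        solutions.append(damped_solve(A, d))
    n = len(m_synt)
    mean = [sum(s[j] for s in solutions) / n_trials for j in range(n)]
    std = [math.sqrt(sum((s[j] - mean[j]) ** 2 for s in solutions) / n_trials)
           for j in range(n)]
    return mean, std

s = math.sqrt(2.0)
rows_cols = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
diagonals = [[s, 0, 0, s], [0, s, s, 0]]
m_synt = [0.02, -0.02, -0.02, 0.02]          # +/-2 % checkerboard
recovered_full, _ = checkerboard_test(rows_cols + diagonals, m_synt)
recovered_poor, _ = checkerboard_test(rows_cols, m_synt)
noisy_mean, noisy_std = checkerboard_test(rows_cols + diagonals, m_synt,
                                          noise_std=0.005, n_trials=20)
```

With all six rays the checkerboard is recovered almost exactly, whereas with rows and columns only the positive and negative anomalies cancel along every ray and the recovered model is zero: an extreme case of the bias that such tests are meant to expose. The spread `noisy_std` over repeated noise realizations is a crude estimate of the diagonal of the solution covariance.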

3 Fundamental Results

A number of intriguing and important discoveries have been made using global tomography and ray theory: for example, it has become clear that the oceanic lithosphere can subduct to great depths in the mantle, though this behavior varies with the tectonic setting (Fig. 3). Two major “superplumes” exist in the Southern Hemisphere just above the core, under South Africa and under the Society Islands in the Pacific. The origin and nature of these features are still debated, but strong indications exist that they are chemically distinct and intrinsically denser than the surrounding mantle and may have been in existence since the formation of our planet (Forte et al. 2010). These superplumes are the most pronounced of a series of much narrower lower mantle plumes, first discovered with finite-frequency tomography by Montelli et al. (2004), who combined cross-correlation delay data with onset time data. The difference in sensitivity between the two types of delays to features with a typical length scale of several hundred km allowed the imaging of plumes with diameters of 400 km and larger. These rise more or less vertically from the core-mantle boundary (see Fig. 4) and are located beneath volcanic islands such as Hawaii or Cape Verde.

The surface separating the upper and lower mantle shows topography of several tens of kilometers. Since this surface is a silicate phase transition, this shows that strong lateral temperature variations must exist in the interior of our planet (Lawrence and Shearer 2008). Most surprising is that lateral variations persist to the very center of the Earth, despite its high temperature (more than 5,000 °C) and pressure (365 GPa): the solid inner core shows an Eastern and a Western hemisphere with different seismic velocity and anisotropy (e.g., Irving and Deuss 2011).

4 Future Directions

The theory of seismic wave propagation is by now well developed, and stable algorithms exist that can predict seismic motion up to frequencies that approach 1 Hz; with exaflop computing facilities widely believed to be within reach before 2020, this means we can soon compute wavefields over the full observable frequency range. More fundamental progress is thus only to be expected on the side of the observations.

Current seismic networks severely undersample the wavefield, even in dedicated high-density deployments such as USArray with an average station distance of 70 km. Except for a few ocean island stations, and temporary and expensive deployments of seismometers on the ocean floor, the oceanic domain is void of sensors, severely hampering global tomography. Acoustic sensors (hydrophones) mounted on robotic floats that drift with the ocean currents may soon start to provide us with observations of P wave arrivals, the delays of which will be instrumental in resolving anomalies in the mantle beneath the oceans.

Finite-frequency methods or numerical algorithms to compute the seismogram in a laterally heterogeneous Earth make it in principle also possible to interpret amplitudes. Amplitudes are influenced by focusing and defocusing effects, as well as by the intrinsic attenuation of the rock. If the attenuation can be reliably inferred from seismic observations, important constraints on temperature can be obtained. The interpretation of amplitudes in terms of 3D structure is however still at an early stage, partly because the instrument response is usually better known for the phase, and with GPS clock corrections, the time keeping has become very reliable. By contrast, the recorded amplitude of the seismic signal is influenced by the impedance of the local structure directly beneath the seismograph and is thus less certain than the phase. This presents an obstacle that will be important to overcome in the near future.

Fig. 3 Vertical mantle sections across the Tonga-Kermadec arc, where oceanic lithosphere subducts back into the mantle. The top two rows show perturbations in the P velocity $\alpha$ in two different models, obtained with transmission tomography. For comparison, the models in the lower half of the figure show the perturbation in the S velocity $\beta$ for the same locations, obtained with low frequency data (normal modes). The velocity variations are relative to a spherical average. Blue colors represent fast, red slow; to first order these can be interpreted as cold and hot regions in the mantle, respectively. The amplitude scale is different among the four models, as indicated by the numbers below each image. Circles denote earthquakes, clear indicators of active subduction (From Fukao et al. (2001), reproduced with permission from the American Geophysical Union)

Fig. 4 If one vertically averages the P velocity anomalies over the deepest 1,000 km of the lower mantle in a tomographic model, so as to emphasize features with vertical continuity, the two superplumes present under South Africa and the South Pacific become apparent, as do others, such as Hawaii and the Canary Island plume. As in Fig. 3, reddish colors indicate slow, presumably hot, mantle rock (From Montelli et al. (2006), with permission from the American Geophysical Union)

5 Conclusion

Since its introduction in seismology more than 30 years ago, seismic tomography has become an indispensable tool to study the interior of our planet and is increasingly finding applications in our search for natural resources and in the monitoring of activities underground – ranging from the evolution of volcanic activity or the pumping of a gas reservoir to the detection of tunnels intended to escape border controls. The theory of seismic transmission tomography has evolved in the past decade from the simple ray-theoretical approach toward finite-frequency interpretation of cross-correlation delays and toward adjoint inversion of full waveforms. With these new methods we are able to extract significantly more information from observed seismograms than was possible until recently. Most of the improvement in the near future will come from an increase in data, rather than from theoretical improvements. The growing size of the inverse problem will require a continued growth in the power of supercomputers or new mathematical techniques to reduce the linearized system without significant loss of information.

References

Bois P, la Porte M, Lavergne M, Thomas G (1971) Essai de determination automatique des vitesses sismiques par mesures entre puits. Geophys Prospect 19:42–81
Dahlen FA, Hung S-H, Nolet G (2000) Fréchet kernels for finite-frequency traveltimes – I. theory. Geophys J Int 141:157–174
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton
Fichtner A, Bunge H-P, Igel H (2006) The adjoint method in seismology I. Theory. Phys Earth Planet Inter 157:86–104
Fletcher R, Reeves C (1964) Function minimization by conjugate gradients. Comput J 7:149–154
Forte AM, Sandrine Q, Moucha R, Simmons NA, Grand SP, Mitrovica JX, Rowley DB (2010) Joint seismic-geodynamic-mineral physical modelling of African geodynamics: a reconciliation of deep-mantle convection with surface geophysical constraints. Earth Planet Sci Lett 295:329–341
Fukao Y, Widiyantoro S, Obayashi M (2001) Stagnant slabs in the upper and lower mantle transition region. Rev Geophys 39:291–323
Irving JCE, Deuss A (2011) Hemispherical structure in inner core velocity anisotropy. J Geophys Res 116:B04307
Kawai K, Takeuchi N, Geller RJ (2006) Complete synthetic seismograms up to 2 Hz for transversely isotropic spherically symmetric media. Geophys J Int 164:411–424
Lawrence JF, Shearer PM (2008) Imaging mantle transition zone thickness with SdS-SS finite-frequency sensitivity kernels. Geophys J Int 174:143–158
Loris I (2013) Numerical algorithms for non-smooth optimization applicable to seismic recovery. In: Freeden W (ed) Handbook of geomathematics (submitted)
Luo Y, Schuster GT (1991) Wave-equation travel time tomography. Geophysics 56:645–653
Mercerat D, Nolet G (2012) Comparison of ray-based and adjoint-based sensitivity kernels for body-wave seismic tomography. Geophys Res Lett 39:L12301
Mercerat ED, Nolet G (2013) On the linearity of cross-correlation delay times in finite-frequency tomography. Geophys J Int 192:681–687
Montelli R, Nolet G, Dahlen FA, Masters G (2006) A catalogue of deep mantle plumes: new results from finite-frequency tomography. Geochem Geophys Geosys (G3) 7:Q11007
Montelli R, Nolet G, Dahlen FA, Masters G, Engdahl ER, Hung S-H (2004) Finite frequency tomography reveals a variety of plumes in the mantle. Science 303:338–343
Nissen-Meyer T, Dahlen FA, Fournier A (2007) Spherical-earth Fréchet sensitivity kernels. Geophys J Int 168:1051–1066
Nolet G (2008) A breviary of seismic tomography. Cambridge University Press, Cambridge
Tibuleac IM, Nolet G, Michaelson C, Koulakov I (2003) P wave amplitudes in a 3-D Earth. Geophys J Int 155:1–10
Tromp J, Tape C, Liu Q (2005) Seismic tomography, adjoint methods, time reversal and banana-doughnut kernels. Geophys J Int 160:195–216
Zhao L, Jordan TH, Chapman CH (2000) Three-dimensional Fréchet kernels for seismic delay times. Geophys J Int 141:558–576


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_60-2 © Springer-Verlag London 2014

Fractional Diffusion and Wave Propagation Yuri Luchko Department of Mathematics, Physics, and Chemistry, Beuth Technical University of Applied Sciences Berlin, Berlin, Germany

Abstract In this chapter, a short overview of the current research towards applications of the partial differential equations of an arbitrary (not necessarily integer) order for modeling of the anomalous transport processes (diffusion, heat transfer, and wave propagation) in the nonhomogeneous media is presented. On the microscopic level, these processes are described by the continuous time random walk (CTRW) model that is a starting point for derivation of some deterministic equations for the time- and space-averaged quantities that characterize the transport processes on the macroscopic level. In this work, the deterministic models are derived in the form of the partial differential equations of the fractional order. In particular, a generalized time-fractional diffusion equation and a time- and space-fractional wave equation are introduced and analyzed in detail. Finally, some open questions and directions for further work are suggested.

1 Introduction

In many geological applications, e.g., modeling of geothermal energy extraction (Luchko and Punzi 2011), stability and seismicity of fractal fault systems (Gudehus and Touplikiotis 2012), or propagation of damped waves (Luchko 2013), one has to deal with transport processes that take place in highly inhomogeneous media and are subject to external and internal forces applied at different time and space scales. This raises the question of how reliable the standard models for the transport processes are in these complex environments. In particular, even though Fourier's and Fick's laws are still the standard tools for modeling transport processes on the macro-level, they often fail to capture the behavior of systems with anomalous components and phenomena, the so-called anomalous transport processes (see, e.g., Geiger and Emmanuel 2010). Within the last few decades, anomalous transport processes that do not follow Gaussian statistics have been observed and confirmed in several different application areas in the natural sciences, biology, geological sciences, medicine, etc. This has stimulated even more research into techniques and approaches for their adequate modeling. In this chapter, we consider one powerful approach, namely, the continuous time random walk (CTRW) model combined with fractional dynamics on the macro-level, and apply it to the modeling of anomalous diffusion, heat transport, and wave propagation in heterogeneous media. Models for anomalous transport processes in the form of time- and/or space-fractional partial differential equations have enjoyed particular attention and have been introduced and analyzed by a number of researchers since the 1980s. In particular, this kind of phenomena is known to occur in inhomogeneous media that combine characteristics of solid-like materials that exhibit wave propagation and fluid-like materials that support diffusion processes.

In addition to the physical motivation and setting up of the models, this chapter presents a mathematical analysis of the models, an overview of the numerical methods for their solution, some plots, and an interpretation of the obtained results. In particular, we deal with the generalized time-fractional diffusion equation, which can of course also be employed to describe anomalous heat conduction. To investigate this equation, a maximum principle well known for the elliptic and parabolic type PDEs is extended to the initial-boundary-value problems for the generalized diffusion equation of the fractional order. Then the Fourier spectral method is applied to obtain solutions to these problems in explicit form. Another important equation that is considered in this chapter is a fractional generalization of the wave equation that describes propagation of damped waves in inhomogeneous media. In contrast to the fractional diffusion-wave equation, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that this feature is a decisive factor for inheriting some crucial characteristics of the wave equation, like a constant propagation velocity of both the maximum of its fundamental solution and its gravity and mass centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. The fundamental solution of the fractional wave equation is determined and shown to be a spatial probability density function evolving in time that possesses finite moments up to the order $\alpha$. To illustrate the analytical findings, results of numerical calculations and some plots are presented.

2 Continuous Time Random Walk Model

The continuous time random walk (CTRW) model was first introduced in Montroll and Weiss (1965) to model transport processes that show anomalous behavior. The main idea behind a CTRW is first to interpret a transport process on the microlevel as a flow of many parcels. If one assumes that the parcels are independent of each other, then their state and behavior can be described in terms of the probability $P(x,t)\,\Delta V\,\Delta t$ of an individual parcel to be located inside the volume $\Delta V$ within the time interval $\Delta t$. The function $P(x,t)$ is an unknown probability density function (pdf) that satisfies the so-called master equation and is connected with the jump pdf that characterizes the transport process and is supposed to be known. Under certain conditions on the jump pdf, the master equation can be transformed to some deterministic differential or integrodifferential equations that the pdf $P(x,t)$ has to satisfy at least on the large time and space scales. In turn, the macro-characteristics of the transport process, like the concentration $c(x,t)$ of the substance or its temperature $T(x,t)$ at a certain place $x$ and a certain time instant $t$, averaged over the time interval $\Delta t$, $t \in \Delta t$, and the space volume $\Delta V$, $x \in \Delta V$, are proportional to $P(x,t)$ and consequently governed by the same equations. This means that in the framework of the CTRW model, the key role is played by the pdf $P(x,t)$ that describes a random walk of an individual parcel within the transport process. In what follows, we consider the random walk of an individual parcel and analyze its characteristics. For notational simplicity, we focus on the one-dimensional random walks. The multidimensional version follows the same steps. For more details regarding the CTRW models and their applications for modeling of the anomalous transport processes, see, e.g., Berkowitz et al. (2002), Emmanuel and Berkowitz (2007), Fulger et al. (2008), Gorenflo and Mainardi (2009), Luchko (2012a), and Metzler and Klafter (2004).

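2.1 Brownian Motion

In the framework of the well-known random walk model for the Brownian motion, the random walker jumps at each time step $t = t_0,\ t_0 + \Delta t,\ t_0 + 2\Delta t, \ldots$ in a randomly selected direction, thereby covering the distance $\Delta x$, the lattice constant. Denoting by $P(x,t)\,\Delta x$ the probability that the random walker is located between $x$ and $x + \Delta x$ at the time $t$, the formula of total probability easily leads to the master equation

$$P(x, t + \Delta t) = \frac{1}{2} P(x + \Delta x, t) + \frac{1}{2} P(x - \Delta x, t). \tag{1}$$

For the one-dimensional Brownian motion, we substitute the Taylor expansions

$$P(x, t + \Delta t) = P(x,t) + \Delta t\, \frac{\partial P}{\partial t} + O\left((\Delta t)^2\right) \quad \text{as } \Delta t \to 0,$$

$$P(x \pm \Delta x, t) = P(x,t) \pm \Delta x\, \frac{\partial P}{\partial x} + \frac{(\Delta x)^2}{2} \frac{\partial^2 P}{\partial x^2} + O\left((\Delta x)^3\right) \quad \text{as } \Delta x \to 0$$

into the master equation (1) and get the formula

$$\frac{\partial P}{\partial t} = \frac{(\Delta x)^2}{2 \Delta t} \frac{\partial^2 P}{\partial x^2} + O(\Delta t) + \Delta x\, O\left(\frac{(\Delta x)^2}{\Delta t}\right) \quad \text{as } \Delta t \to 0,\ \Delta x \to 0. \tag{2}$$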

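In the continuum limit $\Delta t \to 0$ and $\Delta x \to 0$, this equation becomes the standard diffusion equation

$$\frac{\partial P}{\partial t} = d_1 \frac{\partial^2 P}{\partial x^2} \tag{3}$$

under the condition that the diffusion coefficient

$$d_1 = \lim_{\Delta x \to 0,\ \Delta t \to 0} \frac{(\Delta x)^2}{2 \Delta t}$$

is finite. Of course, the same procedure leads to the two- or three-dimensional diffusion equations for the two- or three-dimensional Brownian motion, respectively:

$$\frac{\partial P}{\partial t} = d_1 \Delta P, \qquad \Delta := \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2}, \quad n = 2, 3. \tag{4}$$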

When the random walker is located at the starting point $x = 0$, $x \in \mathbb{R}^n$, $n = 1, 2, 3$, at the time $t = 0$, then the initial condition

$$P(x, 0) = \prod_{i=1}^{n} \delta(x_i), \quad n = 1, 2, 3, \tag{5}$$

with the Dirac $\delta$-function has to be added to the model. The solution

$$P(x,t) = \frac{1}{\sqrt{4 \pi d_1 t}} \exp\left(-\frac{x^2}{4 d_1 t}\right) \tag{6}$$

of the one-dimensional diffusion equation (3) with the initial condition (5) can be easily obtained, e.g., with the Laplace and Fourier integral transforms technique. The pdf (6) is a spatial Gaussian distribution at any time point $t > 0$ with the mean value $\mu = 0$ and the standard deviation $\sigma = \sqrt{2 d_1 t}$, which means that the mean squared displacement of a parcel that participates in the transport process is given by $\sigma^2(t) = 2 d_1 t$. It is important to mention that the central limit theorem ensures the same behavior of the pdf $P(x,t)$ on the large time and space scales in the case when the waiting time $\Delta t$ is not fixed, but the pdf that describes the distribution of the waiting times between two successive jumps possesses a finite mean value $\Delta t$.
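The relation between the master equation (1) and the Gaussian solution (6) can be checked numerically. The following Python sketch iterates Eq. (1) exactly on the lattice, starting from a unit mass at $x = 0$, and compares the resulting mean squared displacement with $2 d_1 t$. The function names and the values of `dx`, `dt`, and `n_steps` are illustrative assumptions and do not come from the text.

```python
import math


def evolve_master_equation(n_steps):
    """Iterate Eq. (1) from a unit mass at x = 0.

    Returns a dict mapping the lattice index j (position x = j*dx)
    to the probability of finding the walker there after n_steps jumps.
    """
    prob = {0: 1.0}
    for _ in range(n_steps):
        new_prob = {}
        for j, p in prob.items():
            # Half of the probability moves one site to the left, half to the right.
            new_prob[j - 1] = new_prob.get(j - 1, 0.0) + 0.5 * p
            new_prob[j + 1] = new_prob.get(j + 1, 0.0) + 0.5 * p
        prob = new_prob
    return prob


def mean_squared_displacement(prob, dx):
    """Second moment of the lattice distribution."""
    return sum(p * (j * dx) ** 2 for j, p in prob.items())


def gaussian_pdf(x, t, d1):
    """Solution (6) of the one-dimensional diffusion equation (3)."""
    return math.exp(-x * x / (4.0 * d1 * t)) / math.sqrt(4.0 * math.pi * d1 * t)


if __name__ == "__main__":
    dx, dt, n_steps = 0.1, 0.01, 400   # illustrative values (assumption)
    d1 = dx ** 2 / (2.0 * dt)          # diffusion coefficient of the continuum limit
    t = n_steps * dt
    prob = evolve_master_equation(n_steps)
    print("MSD from Eq. (1):", mean_squared_displacement(prob, dx))
    print("2 * d1 * t      :", 2.0 * d1 * t)
    # After an even number of steps only even sites are occupied, so the
    # lattice probability corresponds to the density (6) times the spacing 2*dx.
    print("P at x = 0      :", prob[0], "vs", 2.0 * dx * gaussian_pdf(0.0, t, d1))
```

On the lattice the mean squared displacement after $n$ steps equals $n (\Delta x)^2$, which coincides with $2 d_1 t$ for $d_1 = (\Delta x)^2 / (2 \Delta t)$ and $t = n \Delta t$; the Gaussian shape itself is reproduced only approximately for finite $n$.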

2.2 CTRW Model

In contrast to the random walk model for the Brownian motion, the CTRW model is based on the idea that the lengths of the jumps and the waiting times between two successive jumps are governed by a joint pdf $\psi(x,t)$ that is referred to as the jump pdf. From $\psi(x,t)$, the jump length pdf

$$\lambda(x) = \int_0^{\infty} \psi(x,t)\, dt \tag{7}$$

and the waiting time pdf

$$w(t) = \int_{-\infty}^{\infty} \psi(x,t)\, dx \tag{8}$$

can be deduced. The main characteristics of the CTRW models are the characteristic waiting time

$$T = \int_0^{\infty} w(t)\, t\, dt \tag{9}$$

and the jump length variance

$$\Sigma^2 = \int_{-\infty}^{\infty} \lambda(x)\, x^2\, dx. \tag{10}$$

They can be finite or infinite, and this makes the difference between the CTRW models. Usually, the following different cases are distinguished:

• Both $T$ and $\Sigma^2$ are finite: Brownian motion (diffusion equation as a deterministic model)
• $T$ diverges, $\Sigma^2$ is finite: Sub-diffusion (time-fractional diffusion equation as a deterministic model)
• $T$ is finite, $\Sigma^2$ diverges: Markovian Lévy flights (space-fractional diffusion equation as a deterministic model)
• Both $T$ and $\Sigma^2$ are infinite: Non-Markovian Lévy flights (time-space-fractional diffusion equation as a deterministic model)

It is known that the master equations for the CTRW model can be formulated in the form of some integral equations of the convolution type (see, e.g., Metzler and Klafter 2000). Below we give a short summary of how to derive these equations. Let us denote by $\eta(x,t)$ the probability of the event that at the time instant $t$ a parcel just arrives at the point $x$. According to the law of total probability, $\eta(x,t)$ satisfies the equation

$$\eta(x,t) = \int_{-\infty}^{+\infty} dx' \int_0^{t} \eta(x', t')\, \psi(x - x', t - t')\, dt' + P_0(x)\, \delta(t), \tag{11}$$

$\psi(x,t)$ being the jump pdf that is supposed to be known. The pdf $P(x,t)$ that governs the event that at the time instant $t$ the parcel is located at the position $x$ is given by

$$P(x,t) = \int_0^{t} \eta(x, t')\, \Psi(t - t')\, dt', \tag{12}$$

where

$$\Psi(t) = 1 - \int_0^{t} w(t')\, dt' \tag{13}$$

is the probability of no jump event within the time interval $[0, t]$ and $w(t)$ is the waiting time pdf. The integral equations (11)–(13) determine the one-point probability density function that is an important part of the mathematical model for the CTRW but of course not enough to fully characterize the underlying stochastic process (see, e.g., Germano et al. 2009 for more details). Let us now transform Eqs. (11)–(13) into the frequency domain by applying the Fourier and the Laplace transforms. Applying the well-known convolution theorems for the Fourier and the Laplace transforms and solving the transformed equations for the unknown Fourier and Laplace transformed pdf $\hat{\tilde{P}}(\kappa, s)$, we get the formula

$$\hat{\tilde{P}}(\kappa, s) = \frac{1 - \tilde{w}(s)}{s}\, \frac{\hat{P}_0(\kappa)}{1 - \hat{\tilde{\psi}}(\kappa, s)}, \tag{14}$$

where $\hat{P}_0(\kappa)$ denotes the Fourier transform of the initial condition $P_0(x) := P(x, 0)$. It is worth mentioning that a purely probabilistic proof of Eq. (14) is given in Germano et al. (2009). We remind the readers that the Fourier transform is defined by

$$\hat{f}(\kappa) = \mathcal{F}\{f(x); \kappa\} = \int_{-\infty}^{+\infty} e^{+i \kappa x} f(x)\, dx, \quad \kappa \in \mathbb{R},$$

and the Laplace transform by

$$\tilde{f}(s) = \mathcal{L}\{f(t); s\} = \int_0^{\infty} e^{-s t} f(t)\, dt, \quad s \in \mathbb{C}.$$
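The difference between a finite and a diverging characteristic waiting time can be illustrated with a small Monte Carlo experiment. The sketch below simulates one-dimensional walkers with unit jumps and estimates how fast the mean squared displacement grows. Everything in it is an illustrative assumption rather than part of the original text: the function names, the choice of exponential waiting times (finite mean) versus Pareto waiting times with tail index 0.5 (diverging mean), the observation times, and the number of walkers.

```python
import math
import random


def ctrw_msd(sample_wait, times, n_walkers=2000, seed=0):
    """Estimate the mean squared displacement of a 1D CTRW at the given times.

    sample_wait(rng) must return one random waiting time; jumps are +1 or -1.
    """
    rng = random.Random(seed)
    times = sorted(times)
    msd = [0.0] * len(times)
    for _ in range(n_walkers):
        t, x, k = 0.0, 0, 0
        while k < len(times):
            t_next = t + sample_wait(rng)  # time of the next jump
            # Record the position at every observation time before that jump.
            while k < len(times) and times[k] < t_next:
                msd[k] += x * x
                k += 1
            t = t_next
            x += rng.choice((-1, 1))
    return [m / n_walkers for m in msd]


def growth_exponent(msd, times):
    """Slope of log(MSD) versus log(t) between the first and last time."""
    return math.log(msd[-1] / msd[0]) / math.log(times[-1] / times[0])


if __name__ == "__main__":
    normal_times = [10.0, 100.0]
    normal = ctrw_msd(lambda rng: rng.expovariate(1.0), normal_times)
    print("finite mean waiting time, exponent:", growth_exponent(normal, normal_times))

    slow_times = [100.0, 10000.0]
    slow = ctrw_msd(lambda rng: rng.paretovariate(0.5), slow_times)
    print("diverging mean waiting time, exponent:", growth_exponent(slow, slow_times))
```

With a finite mean waiting time the estimated exponent should be close to one (Brownian motion), whereas for the heavy-tailed waiting times it should be clearly smaller than one, which is the sub-diffusive case of the list above. The estimates are noisy and depend on the chosen observation times.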

2.3 Advection-Diffusion Equation

Until now, no particular assumptions on the densities $\lambda(x)$ and $w(t)$, except for their integrability, have been made. In what follows, we assume for simplicity that the jump lengths and waiting times are independent random variables and the jump pdf $\psi(x,t)$ can be written in the decoupled form $\psi(x,t) = \lambda(x)\, w(t)$. In this case the equation $\hat{\tilde{\psi}}(\kappa, s) = \hat{\lambda}(\kappa)\, \tilde{w}(s)$ holds true. Straightforward calculations show that if $w(t)$ possesses a finite mean and $\lambda(x)$ a finite variance, then Eq. (14) can be transformed to the standard advection-diffusion equation for large $t$ and $|x|$ and therefore describes the Brownian motion on the large time and space scales. Indeed, in this case the Fourier transform of $\lambda(x)$ has the asymptotic behavior

$$\hat{\lambda}(\kappa) \approx 1 - i M_1 \kappa - \frac{M_2}{2} \kappa^2 \quad \text{as } \kappa \to 0, \tag{15}$$

where $M_1$ and $M_2$ are the first and the second moments of the jump length pdf $\lambda(x)$, respectively. Substituting this expression into (14), we get

$$\hat{\tilde{P}}(\kappa, s) \approx \frac{1 - \tilde{w}(s)}{s}\, \frac{\hat{P}_0(\kappa)}{1 - \tilde{w}(s) \left(1 - i M_1 \kappa - \frac{M_2}{2} \kappa^2\right)} \quad \text{as } \kappa \to 0,$$

which is equivalent to

$$s\, \hat{\tilde{P}}(\kappa, s) - \hat{P}_0(\kappa) \approx \frac{s\, \tilde{w}(s)}{1 - \tilde{w}(s)} \left(-i M_1 \kappa - \frac{M_2}{2} \kappa^2\right) \hat{\tilde{P}}(\kappa, s) \quad \text{as } \kappa \to 0.$$

Applying the inverse Fourier transform and making use of the differentiation theorem for the Fourier transform,

$$\mathcal{F}\left\{\frac{\partial}{\partial x} f(x,t); \kappa\right\} = -i \kappa\, \mathcal{F}\{f(x,t); \kappa\}, \qquad \mathcal{F}\left\{\frac{\partial^2}{\partial x^2} f(x,t); \kappa\right\} = -\kappa^2\, \mathcal{F}\{f(x,t); \kappa\},$$

one gets

$$s\, \tilde{P}(x, s) - P_0(x) \approx \frac{s\, \tilde{w}(s)}{1 - \tilde{w}(s)} \left(v \frac{\partial}{\partial x} \tilde{P}(x, s) + d_1 \frac{\partial^2}{\partial x^2} \tilde{P}(x, s)\right) \quad \text{as } |x| \to \infty, \tag{16}$$

where $v = M_1$ and $d_1 = M_2 / 2$ can be interpreted as the velocity and the diffusion coefficient, respectively. When the waiting time pdf $w(t)$ possesses a finite mean (for an in-depth treatment of this problem, we refer to Emmanuel and Berkowitz (2007) and Geiger and Emmanuel (2010), where some methods for determination of the mean value of $w(t)$ in the case of heat transport in porous media were discussed), then the asymptotics of its Laplace transform $\tilde{w}$ can be represented in the form

$$\tilde{w}(s) \approx 1 - \tau s \quad \text{as } s \to 0, \tag{17}$$

where $\tau$ denotes the first moment of the pdf $w(t)$. Substituting (17) into Eq. (16) and applying the inverse Laplace transform and the differentiation theorem for the Laplace transform, $\mathcal{L}\left\{\frac{\partial}{\partial t} f(x,t); s\right\} = s\, \mathcal{L}\{f(x,t); s\} - f(x, 0)$, we obtain an initial-value problem $P(x, 0) = P_0(x)$ for the standard advection-diffusion equation

$$\frac{\partial}{\partial t} P(x,t) = v \frac{\partial}{\partial x} P(x,t) + d_1 \frac{\partial^2}{\partial x^2} P(x,t) \quad \text{as } t \to \infty,\ |x| \to \infty \tag{18}$$

on the large time and space scales.
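As a worked example of the last step, consider the exponential (Poissonian) waiting time pdf with mean $\tau$; this specific choice is an illustration and is not required by the derivation above. Its Laplace transform is known in closed form, so the prefactor in the Laplace-domain equation can be evaluated exactly:

$$w(t) = \frac{1}{\tau} e^{-t/\tau} \quad \Longrightarrow \quad \tilde{w}(s) = \frac{1}{1 + \tau s} = 1 - \tau s + O(s^2), \qquad \frac{s\, \tilde{w}(s)}{1 - \tilde{w}(s)} = \frac{s}{1 + \tau s} \cdot \frac{1 + \tau s}{\tau s} = \frac{1}{\tau}.$$

For this pdf the prefactor is therefore the constant $1/\tau$ for all $s$, and the inverse Laplace transform gives

$$\frac{\partial P}{\partial t} = \frac{v}{\tau} \frac{\partial P}{\partial x} + \frac{d_1}{\tau} \frac{\partial^2 P}{\partial x^2},$$

which has the form of the advection-diffusion equation once the constant factor $1/\tau$ is absorbed into the velocity and the diffusion coefficient. Reading the coefficients in this rescaled sense is an assumption made here for the example.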

2.4 Fractional Diffusion-Wave Equations

With the CTRW model, it is possible to go beyond this standard framework and to explore other kinds of transport processes including the anomalous transport. Let us first assume that the mean value of the waiting time pdf $w(t)$ is not finite. As an example, a particular long-tailed waiting time pdf with the asymptotic behavior

$$w(t) \approx A_\alpha \left(\frac{\tau}{t}\right)^{1 + \alpha}, \quad t \to +\infty,\ 0 < \alpha < 1, \tag{19}$$

is considered. Its asymptotics in the Laplace domain can be easily determined by the so-called Tauberian theorem and is as follows:

$$\tilde{w}(s) \approx 1 - (s \tau)^{\alpha}, \quad s \to 0.$$

It is important to mention that the specific form of $w(t)$ is of minor importance. In particular, the so-called Mittag-Leffler waiting time pdf

$$w(t) = -\frac{d}{dt} E_\alpha(-t^{\alpha}), \qquad E_\alpha(x) := \sum_{k=0}^{\infty} \frac{x^k}{\Gamma(\alpha k + 1)},$$

can be taken without loss of generality. The Laplace transform of the Mittag-Leffler pdf can be evaluated in explicit form

$$\tilde{w}(s) = \frac{1}{1 + s^{\alpha}}$$

and has the desired asymptotics. Together with the Gaussian jump length pdf

$$\lambda(x) = \left(4 \pi \sigma^2\right)^{-1/2} \exp\left(-\frac{x^2}{4 \sigma^2}\right), \qquad \Sigma^2 = 2 \sigma^2,$$

with the Fourier transform in the form

$$\hat{\lambda}(\kappa) \approx 1 - \sigma^2 \kappa^2, \quad \kappa \to 0,$$

the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ becomes

$$\hat{\tilde{P}}(\kappa, s) \approx \frac{\hat{P}_0(\kappa) / s}{1 + d_\alpha s^{-\alpha} \kappa^2}, \quad s \to 0,\ \kappa \to 0. \tag{20}$$
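The step from the general formula (14) to the asymptotic form (20) can be written out as follows. This is a sketch under the assumption that only the leading terms in $s$ and $\kappa$ are kept, so the product term $(s \tau)^{\alpha} \sigma^2 \kappa^2$ is neglected:

$$1 - \tilde{w}(s)\, \hat{\lambda}(\kappa) \approx 1 - \left(1 - (s \tau)^{\alpha}\right) \left(1 - \sigma^2 \kappa^2\right) \approx (s \tau)^{\alpha} + \sigma^2 \kappa^2,$$

$$\hat{\tilde{P}}(\kappa, s) \approx \frac{(s \tau)^{\alpha}}{s}\, \frac{\hat{P}_0(\kappa)}{(s \tau)^{\alpha} + \sigma^2 \kappa^2} = \frac{\hat{P}_0(\kappa) / s}{1 + \dfrac{\sigma^2}{\tau^{\alpha}}\, s^{-\alpha} \kappa^2}.$$

Under this reading the generalized diffusion coefficient is $d_\alpha = \sigma^2 / \tau^{\alpha}$, which reduces to a quantity with the dimension of an ordinary diffusion coefficient as $\alpha \to 1$.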

Using the Tauberian theorems for the Laplace and Fourier transforms, the last equation can be transformed for large $t$ and $|x|$ to a time-fractional partial differential equation. Namely, after multiplication with the denominator of the right-hand side, Eq. (20) becomes

$$\left(1 + d_\alpha s^{-\alpha} \kappa^2\right) \hat{\tilde{P}}(\kappa, s) \approx \hat{P}_0(\kappa) / s, \quad s \to 0,\ \kappa \to 0. \tag{21}$$

Making use of the differentiation theorem for the Fourier transform and employing the integration rule $\mathcal{L}\{(I^{\alpha} f)(t); s\} = s^{-\alpha} \tilde{f}(s)$ for the Riemann-Liouville fractional integral $I^{\alpha}$ defined by

$$(I^{\alpha} f)(t) := \frac{1}{\Gamma(\alpha)} \int_0^{t} f(\tau) (t - \tau)^{\alpha - 1}\, d\tau, \quad \alpha > 0, \qquad (I^{0} f)(t) = f(t), \tag{22}$$

Eq. (21) can be rewritten in the form of the fractional integral equation

$$P(x,t) - P_0(x) = d_\alpha \frac{\partial^2}{\partial x^2} \left(I^{\alpha} P(x, \cdot)\right)(t) \tag{23}$$

for large $t$ and $|x|$. Application of a fractional differential operator $D_t^{\alpha}$ to Eq. (23) transforms it for large $t$ and $|x|$ to the initial-value problem $P(x, 0) = P_0(x)$ for the so-called time-fractional diffusion equation

$$\left(D_t^{\alpha} P\right)(t) = d_\alpha \frac{\partial^2 P}{\partial x^2}, \quad 0 < \alpha < 1. \tag{24}$$
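The Mittag-Leffler function $E_\alpha$ defined by the power series above appears twice in this setting: $E_\alpha(-t^{\alpha})$ is the probability of no jump up to time $t$ for the Mittag-Leffler waiting time pdf, and, since $E_\alpha(-c\, t^{\alpha})$ has the Laplace transform $s^{\alpha - 1} / (s^{\alpha} + c)$, a Fourier mode of the time-fractional diffusion equation decays like $E_\alpha(-d_\alpha \kappa^2 t^{\alpha})$. The following Python sketch evaluates the truncated series. The function name, the tolerance, and the truncation limits are assumptions made for illustration, and the series is only suitable for arguments of moderate size because of cancellation and overflow.

```python
import math


def mittag_leffler(alpha, x, tol=1e-15, max_terms=200):
    """Truncated power series E_alpha(x) = sum_k x**k / Gamma(alpha*k + 1).

    Intended for alpha > 0 and moderate |x| only.
    """
    total = 0.0
    for k in range(max_terms):
        arg = alpha * k + 1.0
        if arg > 170.0:  # math.gamma overflows for arguments above about 171
            break
        term = x ** k / math.gamma(arg)
        total += term
        if k > 0 and abs(term) < tol * max(1.0, abs(total)):
            break
    return total


if __name__ == "__main__":
    # Classical special cases used as sanity checks.
    print(mittag_leffler(1.0, 1.0), "vs", math.e)                        # E_1(x) = exp(x)
    print(mittag_leffler(2.0, -1.0), "vs", math.cos(1.0))                # E_2(-x^2) = cos(x)
    print(mittag_leffler(0.5, -1.0), "vs", math.e * math.erfc(1.0))      # E_1/2(-x) = exp(x^2) erfc(x)
    # Survival probability of the Mittag-Leffler waiting time pdf for alpha = 0.5.
    for t in (0.1, 1.0, 4.0):
        print(t, mittag_leffler(0.5, -t ** 0.5))
```

For larger arguments, specialized algorithms based on integral representations or asymptotic expansions are normally preferred to the plain series.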

In what follows, the fractional derivative $D_t^{\alpha}$ ($n - 1 < \alpha \le n$, $n \in \mathbb{N}$) is taken in the Caputo sense:

$$\left(D_t^{\alpha} f\right)(t) := \left(I^{n - \alpha} f^{(n)}\right)(t), \tag{25}$$

$I^{\alpha}$ being the Riemann-Liouville fractional integral (22). For the theory and applications of this fractional derivative and for other forms of the fractional derivatives, we refer the readers to Samko et al. (1993) and Diethelm (2010). It is worth mentioning that the integrodifferential nature of the fractional differential operator in Eq. (24) ensures the non-Markovian nature of the subdiffusive process we are dealing with. Indeed, calculating the Laplace transform of the mean squared displacement via the relation

$$\widetilde{x^2}(s) = \lim_{\kappa \to 0} \left(-\frac{d^2}{d \kappa^2}\, \hat{\tilde{P}}(\kappa, s)\right)$$

and applying the inverse Laplace transform, the formula

$$x^2(t) = \frac{2 d_\alpha}{\Gamma(1 + \alpha)}\, t^{\alpha}$$

for the mean squared displacement in time is obtained. As we see, in contrast to the case of the Brownian motion, the mean squared displacement does not depend linearly on the time $t$ but is a power function with the exponent $\alpha$. The mathematical theory of the general time-fractional diffusion equation is presented in Sect. 3. Now we discuss the case when the characteristic waiting time $T$ is finite, but the jump length variance $\Sigma^2$ is infinite. Again, the specific form of the pdf $\lambda(x)$ is of minor importance, so that without loss of generality, we can, e.g., consider one of the Lévy-stable pdfs with the Fourier transform given by the formula

$$\hat{\lambda}(\kappa) = \exp\left(-\sigma^{\beta} |\kappa|^{\beta}\right) \approx 1 - \sigma^{\beta} |\kappa|^{\beta}, \quad 1 < \beta < 2,\ |\kappa| \to 0. \tag{26}$$

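In the spatial domain, we get the asymptotic formula

$$\lambda(x) \approx A_\beta\, \sigma^{\beta} |x|^{-1 - \beta}, \quad |x| \to \infty,$$

which shows the "long tails" of the pdf $\lambda(x)$. For the Poissonian waiting time pdf $w(t) = \tau^{-1} \exp(-t / \tau)$, $\tau > 0$, with the Laplace transform of the form

$$\tilde{w}(s) \approx 1 - s \tau, \quad s \to 0,$$

the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ can be written in the form

$$\hat{\tilde{P}}(\kappa, s) \approx \frac{1}{s + c_\beta |\kappa|^{\beta}}, \quad s \to 0,\ |\kappa| \to 0. \tag{27}$$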

By inverting the Laplace and Fourier transforms in Eq. (27), an initial-value problem $P(x, 0) = P_0(x)$ for the space-fractional diffusion equation

$$\frac{\partial P}{\partial t} = c_\beta\, R_x^{\beta} P(x,t) \tag{28}$$

is obtained for large $t$ and $|x|$, where $R_x^{\beta}$ is the Riesz fractional derivative defined for a sufficiently well-behaved function $f$ and $0 < \beta \le 2$ as a pseudo-differential operator with the symbol $-|\kappa|^{\beta}$ (see, e.g., Samko et al. 1993; Mainardi et al. 2001):

$$\mathcal{F}\left\{\left(R_x^{\beta} f\right)(x); \kappa\right\} = -|\kappa|^{\beta}\, \mathcal{F}\{f(x); \kappa\}. \tag{29}$$

In the one-dimensional case, the Riesz fractional derivative (29) with $0 < \beta < 2$ can be represented as a hypersingular integral (see Samko et al. 1993 for the case $\beta \ne 1$ and Gorenflo and Mainardi 2001 for the general case):

$$\left(R_x^{\beta} f\right)(x) = \frac{\Gamma(1 + \beta)}{\pi} \sin\left(\frac{\beta \pi}{2}\right) \int_0^{\infty} \frac{f(x + \xi) - 2 f(x) + f(x - \xi)}{\xi^{\beta + 1}}\, d\xi. \tag{30}$$
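As a plausibility check, the hypersingular integral can be applied formally to a single harmonic $f(x) = \cos(\kappa x)$, $\kappa \ne 0$. The choice of test function and the formal treatment (the harmonic is not integrable, so the Fourier-domain definition applies only in a generalized sense) are assumptions of this example. Using $f(x + \xi) - 2 f(x) + f(x - \xi) = -2 \left(1 - \cos(\kappa \xi)\right) \cos(\kappa x)$ and the substitution $u = |\kappa| \xi$, one finds

$$\left(R_x^{\beta} f\right)(x) = -\frac{2\, \Gamma(1 + \beta)}{\pi} \sin\left(\frac{\beta \pi}{2}\right) |\kappa|^{\beta} \cos(\kappa x) \int_0^{\infty} \frac{1 - \cos u}{u^{\beta + 1}}\, du = -|\kappa|^{\beta} \cos(\kappa x),$$

where the known value of the integral for $0 < \beta < 2$ was used:

$$\int_0^{\infty} \frac{1 - \cos u}{u^{\beta + 1}}\, du = \frac{\pi}{2\, \Gamma(1 + \beta) \sin(\beta \pi / 2)}.$$

The harmonic is thus multiplied by $-|\kappa|^{\beta}$, in agreement with the symbol of the Riesz derivative, and for $\beta \to 2$ the factor tends to $-\kappa^2$, the multiplier of the second derivative.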

For $\beta = 1$, the relation (30) can be interpreted in terms of the Hilbert transform

$$\left(R_x^{1} f\right)(x) = -\frac{1}{\pi} \frac{d}{dx} \int_{-\infty}^{+\infty} \frac{f(\xi)}{x - \xi}\, d\xi, \tag{31}$$

where the integral is understood in the sense of the Cauchy principal value as first noted in Feller (1952) and then revisited and stated more precisely in Gorenflo and Mainardi (2001). In the case both $T$ and $\Sigma^2$ diverge, we employ in the CTRW model, e.g., the long-tailed pdf (19) as the waiting time pdf $w(t)$ and the Lévy-stable pdf (26) as the jump length pdf $\lambda(x)$. Proceeding in the same way as above, an initial-value problem $P(x, 0) = P_0(x)$ for the time- and space-fractional diffusion equation

$$\left(D_t^{\alpha} P\right)(t) = s_{\alpha, \beta} \left(R_x^{\beta} P\right)(x) \tag{32}$$

with the Caputo fractional derivative $D_t^{\alpha}$ in time and the Riesz fractional derivative $R_x^{\beta}$ in space is deduced from the CTRW model on the large time and space scales. For the mathematical and numerical analysis of the time- and space-fractional diffusion equations of type (32), we refer to Mainardi et al. (2001), where an even more general equation with the Riesz-Feller space-fractional derivative was investigated in detail in the one-dimensional case, and to Hanyga (2002), where the multidimensional time-space fractional diffusion equations were treated. Let us now mention one important particular case of the time- and space-fractional diffusion equation that is referred to as the neutral-fractional diffusion equation (Mainardi et al. 2001; Metzler and Nonnenmacher 2002) or the fractional wave equation (Luchko 2013). This equation is obtained from (32) when we set $\alpha = \beta$, i.e., when the orders of the fractional derivatives in time and in space are the same. From the viewpoint of the CTRW model, this condition means that the asymptotics of the waiting time pdf $w(t)$ and the jump length pdf $\lambda(x)$ are the same on large time and space scales, i.e., the waiting times and jump lengths are adapted to each other in such a way that the corresponding CTRW model describes an anomalous wave propagation rather than anomalous diffusion. The mathematical theory of the fractional wave equation is presented in Sect. 4.1. Further important models that generalize the well-known conventional transport equations are the fractional diffusion-advection equation (anomalous diffusion with an additional velocity field) and the fractional Fokker-Planck equation (anomalous diffusion in the presence of an external field). Of course, as in the conventional case, multidimensional generalizations, equations of fractional order with nonconstant coefficients, and nonlinear fractional differential equations appear in the corresponding models and are worth investigating.

3 Time-Fractional Diffusion-Wave Equations

3.1 Generalized Time-Fractional Diffusion Equation

Motivated by the models derived in the previous section, we consider in this subsection the $n$-dimensional generalized time-fractional diffusion equation (GTFDE)

$$\left(D_t^{\alpha} u\right)(t) = -L(u) + F(x,t), \quad 0 < \alpha \le 1, \tag{33}$$

where $u = u(x,t)$, $(x,t) \in \Omega_T := G \times (0, T)$, $G \subset \mathbb{R}^n$, is the unknown function, the operator $L$ is given by

$$L(u) = -\operatorname{div}(p(x) \operatorname{grad} u) + q(x) u$$

with the coefficients $p$ and $q$ that satisfy the conditions

$$p \in C^1(\bar{G}), \quad q \in C(\bar{G}), \quad p(x) > 0, \quad q(x) \ge 0, \quad x \in \bar{G}, \tag{34}$$

the fractional derivative $D_t^{\alpha}$ is defined in the Caputo sense (see (25)), and the domain $G$ with the boundary $S$ is open and bounded in $\mathbb{R}^n$. The operator $L$ is a linear elliptic type differential operator of the second order,

$$-L(u) = \sum_{k=1}^{n} \left(p(x) \frac{\partial^2 u}{\partial x_k^2} + \frac{\partial p}{\partial x_k} \frac{\partial u}{\partial x_k}\right) - q(x) u,$$

that can be represented in the form

$$-L(u) = p(x) \Delta u + (\operatorname{grad} p, \operatorname{grad} u) - q(x) u, \tag{35}$$

$\Delta$ being the Laplace operator. For $\alpha = 1$, Eq. (33) is reduced to a linear second-order parabolic PDE. The theory of this equation is well known, so that the main focus in this section is on the case $0 < \alpha < 1$. Of course, in applications, the transport processes that we model with the time-fractional diffusion equations take place in some bounded domains, so that mainly the initial-boundary-value problems for these equations are worth studying from the viewpoint of applications. In this section, the initial-boundary-value problem

$$u\big|_{t = 0} = u_0(x), \quad x \in \bar{G}, \tag{36}$$

$$u\big|_{S} = v(x,t), \quad (x,t) \in S \times [0, T], \tag{37}$$

for Eq. (33) is considered. A function $u = u(x,t)$ defined in the domain $\bar{\Omega}_T := \bar{G} \times [0, T]$ is called a solution to the problem (33), (36), (37) if it belongs to the space $C(\bar{\Omega}_T) \cap W^1((0, T]) \cap C^2(G)$ and satisfies both Eq. (33) and the initial and boundary conditions (36)–(37). By $W^1((0, T])$, the space of the functions $f \in C^1((0, T])$ such that $f' \in L((0, T))$ is denoted. If the problem (33), (36), (37) possesses a solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\Omega_T)$, $C(\bar{G})$, and $C(S \times [0, T])$, respectively. In the further discussions, these inclusions are always supposed to be valid. The presentation of the results in this section follows Luchko (2009a, b, 2010, 2011a, b, 2012a, b), and the readers are advised to consult these papers for the proofs and more details.

3.1.1 Uniqueness of the Solution

First, we investigate uniqueness of the solution to the problem (33), (36), (37). The main component of the uniqueness proof is an appropriate maximum principle for Eq. (33). In turn, the proof of the maximum principle uses an extremum principle for the Caputo fractional derivative that is formulated in the following theorem.

Theorem 1 (Luchko 2009b). Let a function $f \in W^1((0, T]) \cap C([0, T])$ attain its maximum over the interval $[0, T]$ at the point $\tau = t_0$, $t_0 \in (0, T]$. Then the Caputo fractional derivative of the function $f$ is nonnegative at the point $t_0$ for any $\alpha$, $0 < \alpha < 1$:

$$\left(D_t^{\alpha} f\right)(t_0) \ge 0, \quad 0 < \alpha < 1. \tag{38}$$

Let us mention that recently a stronger estimate for the Caputo derivative of a function $f \in C^1[0, 1]$ at the maximum point $t_0$ was proved in Al-Refai (2012), namely,

$$\left(D_t^{\alpha} f\right)(t_0) \ge \frac{t_0^{-\alpha}}{\Gamma(1 - \alpha)} \left(f(t_0) - f(0)\right) \ge 0, \quad 0 < \alpha < 1.$$
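Both estimates can be illustrated numerically. The sketch below approximates the Caputo derivative of order $0 < \alpha < 1$ with the piecewise-linear product-integration rule commonly called the L1 scheme and evaluates it for the test function $f(t) = -(t - t_0)^2$, which attains its maximum at $t_0$. For this function the derivative at $t_0$ can be computed by hand, $\left(D_t^{\alpha} f\right)(t_0) = 2 t_0^{2 - \alpha} / \left((2 - \alpha) \Gamma(1 - \alpha)\right)$, while the lower bound above equals $t_0^{2 - \alpha} / \Gamma(1 - \alpha)$. The function names, the test function, and the step count are illustrative assumptions and are not taken from the cited papers.

```python
import math


def caputo_l1(f, t, alpha, n=1000):
    """L1 approximation of the Caputo derivative of order 0 < alpha < 1 at time t.

    The integral of (t - tau)**(-alpha) * f'(tau) over [0, t] is split into n
    subintervals, and f' is replaced by a difference quotient on each of them.
    """
    h = t / n
    total = 0.0
    for j in range(n):
        weight = (n - j) ** (1.0 - alpha) - (n - j - 1) ** (1.0 - alpha)
        total += (f((j + 1) * h) - f(j * h)) * weight
    return total / (h ** alpha * math.gamma(2.0 - alpha))


def maximum_point_lower_bound(f, t0, alpha):
    """Right-hand side of the stronger estimate at a maximum point t0."""
    return t0 ** (-alpha) * (f(t0) - f(0.0)) / math.gamma(1.0 - alpha)


if __name__ == "__main__":
    alpha, t0 = 0.5, 0.5                      # illustrative values (assumption)
    f = lambda s: -(s - t0) ** 2              # maximum over [0, t0] at s = t0
    numeric = caputo_l1(f, t0, alpha)
    exact = 2.0 * t0 ** (2.0 - alpha) / ((2.0 - alpha) * math.gamma(1.0 - alpha))
    bound = maximum_point_lower_bound(f, t0, alpha)
    print("L1 approximation:", numeric)
    print("closed form     :", exact)
    print("lower bound     :", bound)
```

For the quadratic test function the ratio of the exact value to the lower bound is $2 / (2 - \alpha) \ge 1$, so the chain of inequalities is satisfied with room to spare.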

The extremum principle for the Caputo fractional derivative is used to prove a maximum principle for the generalized time-fractional diffusion equation (33) that is formulated in the same way as the one for the parabolic type PDEs.

Theorem 2 (Luchko 2009b). Let a function $u \in C(\bar{\Omega}_T) \cap W^1((0, T]) \cap C^2(G)$ be a solution of the generalized time-fractional diffusion equation (33) and $F(x,t) \le 0$, $(x,t) \in \Omega_T$. Then either $u(x,t) \le 0$, $\forall (x,t) \in \bar{\Omega}_T$, or the function $u$ attains its positive maximum on the part $S_G^T := (\bar{G} \times \{0\}) \cup (S \times [0, T])$ of the boundary of the domain $\Omega_T$, i.e.,

$$u(x,t) \le \max\left\{0, \max_{(x,t) \in S_G^T} u(x,t)\right\}, \quad \forall (x,t) \in \bar{\Omega}_T. \tag{39}$$

Similarly to the case of the partial differential equations of parabolic type ($\alpha = 1$), an appropriate minimum principle is valid, too. The maximum and minimum principles can be applied to show that the problem (33), (36)–(37) possesses at most one solution and that this solution, if it exists, continuously depends on the data given in the problem.

Theorem 3 (Luchko 2009b). The initial-value problem (36)–(37) for the GTFDE (33) possesses at most one solution. This solution continuously depends on the data given in the problem in the sense that if

$$\|F - \tilde F\|_{C(\bar\Omega_T)} \le \epsilon, \quad \|u_0 - \tilde u_0\|_{C(\bar G)} \le \epsilon_0, \quad \|v - \tilde v\|_{C(S \times [0,T])} \le \epsilon_1,$$

and $u$ and $\tilde u$ are the solutions to the problem (33), (36)–(37) with the source functions $F$ and $\tilde F$, the initial conditions $u_0$ and $\tilde u_0$, and the boundary conditions $v$ and $\tilde v$, respectively, then the norm estimate

$$\|u - \tilde u\|_{C(\bar\Omega_T)} \le \max\{\epsilon_0, \epsilon_1\} + \frac{T^\alpha}{\Gamma(1+\alpha)}\,\epsilon \tag{40}$$

for the solutions $u$ and $\tilde u$ holds true. Because the problem under consideration is a linear one, the uniqueness of the solution immediately follows from the fact that the homogeneous problem (33), (36)–(37), i.e., the problem with $F \equiv 0$, $u_0 \equiv 0$, and $v \equiv 0$, has only one solution, namely, $u(x,t) \equiv 0$, $(x,t) \in \bar\Omega_T$.

3.1.2 Existence of the Solution

To tackle the existence problem, the notion of a generalized solution is first introduced following Vladimirov (1971), where the case $\alpha = 1$ was considered.

Definition 1 (Luchko 2010). Let $F_k \in C(\bar\Omega_T)$, $u_{0k} \in C(\bar G)$, and $v_k \in C(S \times [0,T])$, $k = 1, 2, \ldots$ be sequences of functions that satisfy the following conditions:

(1) There exist functions $F$, $u_0$, and $v$ such that

$$\|F_k - F\|_{C(\bar\Omega_T)} \to 0 \text{ as } k \to \infty, \tag{41}$$

$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \text{ as } k \to \infty, \tag{42}$$

$$\|v_k - v\|_{C(S \times [0,T])} \to 0 \text{ as } k \to \infty. \tag{43}$$

(2) For any $k = 1, 2, \ldots$, there exists a solution $u_k$ of the initial-boundary-value problem

$$u_k\big|_{t=0} = u_{0k}(x), \quad x \in \bar G, \tag{44}$$

$$u_k\big|_S = v_k(x,t), \quad (x,t) \in S \times [0,T] \tag{45}$$

for the generalized time-fractional diffusion equation

$$(D_t^\alpha u_k)(t) = -L(u_k) + F_k(x,t). \tag{46}$$

Suppose there exists a function $u \in C(\bar\Omega_T)$ such that

$$\|u_k - u\|_{C(\bar\Omega_T)} \to 0 \text{ as } k \to \infty. \tag{47}$$

The function $u$ is called a generalized solution of the problem (33), (36)–(37).

The generalized solution of the problem (33), (36)–(37) is a continuous function, not a generalized one. Still, the generalized solution is not required to be from the functional space $C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$, to which the solution has to belong. It follows from Definition 1 that if the problem (33), (36)–(37) possesses a solution, then this solution is a generalized solution of the problem, too. In this sense, Definition 1 extends the notion of the solution of the problem (33), (36)–(37). This extension is needed to get some existence results. But of course one does not want to lose the uniqueness of the solution. Let us now discuss some properties of the generalized solution including its uniqueness. If the problem (33), (36), (37) possesses a generalized solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\bar\Omega_T)$, $C(\bar G)$, and $C(S \times [0,T])$, respectively. In the further discussions, these inclusions are always supposed to be valid.

First, we show that the sequence $u_k$, $k = 1, 2, \ldots$ defined by the relations (41)–(46) of Definition 1 is always uniformly convergent in $\bar\Omega_T$, i.e., there always exists a function $u \in C(\bar\Omega_T)$ that satisfies the property (47). Indeed, applying the estimate (40) from Theorem 3 to the functions $u_k$ and $u_p$ that are solutions of the corresponding initial-boundary-value problems (44) and (45) for Eq. (46), one gets the inequality

$$\|u_k - u_p\|_{C(\bar\Omega_T)} \le \max\bigl\{\|u_{0k} - u_{0p}\|_{C(\bar G)},\ \|v_k - v_p\|_{C(S \times [0,T])}\bigr\} + \frac{T^\alpha}{\Gamma(1+\alpha)}\,\|F_k - F_p\|_{C(\bar\Omega_T)},$$

that, together with the relations (41)–(43), means that $u_k$, $k = 1, 2, \ldots$ is a Cauchy sequence in $C(\bar\Omega_T)$ that converges to a function $u \in C(\bar\Omega_T)$. Moreover, the following important uniqueness theorem holds true.

Theorem 4 (Luchko 2010). The problem (33), (36)–(37) possesses at most one generalized solution in the sense of Definition 1. The generalized solution, if it exists, continuously depends on the data given in the problem in the sense of the estimate (40).

In contrast to the situation with the solution to the problem (33), (36)–(37), existence of the generalized solution can be shown under some standard restrictions on the problem data and the boundary $S$ of the domain $G$. In this section, existence of the solution of the problem

$$(D_t^\alpha u)(t) = -L(u), \tag{48}$$

$$u\big|_{t=0} = u_0(x), \quad x \in \bar G, \tag{49}$$

$$u\big|_S = 0, \quad (x,t) \in S \times [0,T] \tag{50}$$

is considered to demonstrate the technique that can be used with the appropriate standard modifications in the general case, too. The generalized solution of the problem (48)–(50) can be constructed in an analytical form by using the Fourier method of separation of variables. Let us look for a particular solution $u$ of Eq. (48) in the form

$$u(x,t) = T(t)\,X(x), \quad (x,t) \in \bar\Omega_T, \tag{51}$$

that satisfies the boundary condition (50). Substituting (51) into Eq. (48) and separating the variables, we get the equation

$$\frac{(D_t^\alpha T)(t)}{T(t)} = -\frac{L(X)}{X(x)} = -\lambda, \tag{52}$$

$\lambda$ being a constant not depending on the variables $t$ and $x$. The last equation, together with the boundary condition (50), is equivalent to the fractional differential equation

$$(D_t^\alpha T)(t) + \lambda\,T(t) = 0 \tag{53}$$

and the eigenvalue problem

$$L(X) = \lambda X, \tag{54}$$

$$X\big|_S = 0, \quad x \in S \tag{55}$$

for the operator $L$. Due to the condition (34), the operator $L$ is a positive definite and self-adjoint linear operator. The theory of the eigenvalue problems for such operators is well known (see, e.g., Vladimirov 1971). In particular, the eigenvalue problem (54)–(55) has a countable number of positive eigenvalues $0 < \lambda_1 \le \lambda_2 \le \ldots$ with finite multiplicity, and if the boundary $S$ of $G$ is a smooth surface, any function $f \in M_L$ can be represented through its Fourier series in the form

$$f(x) = \sum_{i=1}^{\infty} (f, X_i)\,X_i(x), \tag{56}$$

where $(f, g)$ denotes the scalar product of two functions in $L^2(G)$ and $X_i \in M_L$ are the eigenfunctions corresponding to the eigenvalues $\lambda_i$:

$$L(X_i) = \lambda_i X_i, \quad i = 1, 2, \ldots. \tag{57}$$

By $M_L$, the space of the functions $f$ that satisfy the boundary condition (55) and the inclusions $f \in C^1(\bar\Omega_T) \cap C^2(G)$, $L(f) \in L^2(G)$ is denoted.


The solution of the fractional differential equation (53) with $\lambda = \lambda_i$, $i = 1, 2, \ldots$ has the form (see, e.g., Luchko 1999; Luchko and Gorenflo 1999)

$$T_i(t) = c_i\,E_\alpha(-\lambda_i t^\alpha), \tag{58}$$

$E_\alpha$ being the Mittag-Leffler function defined by

$$E_\alpha(z) := \sum_{k=0}^{\infty} \frac{z^k}{\Gamma(\alpha k + 1)}. \tag{59}$$
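The series (59) can be summed directly for moderate values of $|z|$. The following is a minimal sketch, not one of the algorithms cited in this chapter: the function name `mittag_leffler` and the truncation parameter `max_terms` are illustrative assumptions, and the plain power series loses accuracy through cancellation when $z$ is large and negative.

```python
import math


def mittag_leffler(alpha: float, z: float, max_terms: int = 200) -> float:
    """Truncated power series (59) for E_alpha(z) with real z and alpha > 0.

    Illustrative only: suitable for moderate |z|, not a production algorithm.
    """
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 0.0
    for k in range(max_terms):
        # |z|^k / Gamma(alpha*k + 1), evaluated in log space to avoid overflow
        magnitude = math.exp(k * log_abs_z - math.lgamma(alpha * k + 1.0))
        # z^k is negative only for negative z and odd k
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * magnitude
    return total
```

For $\alpha = 1$ the series reduces to the exponential function, and for $\alpha = 2$ one has $E_2(-z^2) = \cos z$, which gives two simple sanity checks for such a routine.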

Any of the functions

$$u_i(x,t) = c_i\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \quad i = 1, 2, \ldots \tag{60}$$

and thus the finite sums

$$u_k(x,t) = \sum_{i=1}^{k} c_i\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \quad k = 1, 2, \ldots \tag{61}$$

satisfy both Eq. (48) and the boundary condition (50). To construct a function that satisfies the initial condition (49), too, the notion of a formal solution is introduced.

Definition 2 (Luchko 2010). A formal solution to the problem (48)–(50) is the Fourier series of the form

$$u(x,t) = \sum_{i=1}^{\infty} (u_0, X_i)\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \tag{62}$$

$X_i$, $i = 1, 2, \ldots$ being the eigenfunctions corresponding to the eigenvalues $\lambda_i$ of the eigenvalue problem (54)–(55).

Under certain conditions, the formal solution (62) can be proved to be the generalized solution of the problem (48)–(50).

Theorem 5 (Luchko 2010). Let the initial condition $u_0$ be from the space $M_L$. Then the formal solution (62) of the problem (48)–(50) is its generalized solution.

Indeed, it can be easily verified that the functions $u_k$, $k = 1, 2, \ldots$ defined by (61) with $c_i = (u_0, X_i)$ are solutions of the problem (48)–(50) with the initial conditions

$$u_{0k}(x) = \sum_{i=1}^{k} (u_0, X_i)\,X_i(x) \tag{63}$$

instead of $u_0$. Because the function $u_0$ is from the functional space $M_L$, its Fourier series converges uniformly to the function $u_0$, so that

$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \text{ as } k \to \infty.$$

To prove the theorem, one only needs to show that the sequence $u_k$, $k = 1, 2, \ldots$ of the partial sums (61) converges uniformly on $\bar\Omega_T$. But this statement immediately follows from the estimate (see, e.g., Haubold et al. 2011)

$$|E_\alpha(-x)| \le \frac{M}{1+x} \le M, \quad 0 \le x, \quad 0 < \alpha < 1 \tag{64}$$

for the Mittag-Leffler function (59) and the fact that the Fourier series $\sum_{i=1}^{\infty} (u_0, X_i)\,X_i(x)$ of the function $u_0 \in M_L$ uniformly converges on $\bar\Omega_T$.

In some cases, the generalized solution (62) can be shown to be the solution of the initial-value problem for the generalized time-fractional diffusion equation, too. One important example is given by the following theorem.

Theorem 6 (Luchko 2012b). Let an open domain $G$ be a one-dimensional interval $(0, l)$, $u_0 \in M_L$, and $L(u_0) \in M_L$. Then the unique solution of the initial-value problem

$$u\big|_{t=0} = u_0(x), \quad 0 \le x \le l, \qquad u(0,t) = u(l,t) = 0, \quad 0 \le t \le T$$

for the one-dimensional generalized time-fractional equation

$$(D_t^\alpha u)(t) = \frac{\partial}{\partial x}\left(p(x)\,\frac{\partial u}{\partial x}\right) - q(x)\,u$$

is a continuously differentiable function with respect to the time variable on the interval $(0, T)$ that is given by the formula (62).

3.2 Fractional Diffusion-Wave Equation

Of course, in the case of fractional differential equations with constant coefficients, techniques other than the spectral method presented in the previous section, such as the method of integral transforms, can be applied to deduce explicit formulas for solutions of the initial-, boundary-, or initial-boundary-value problems for these equations. As an example, we consider in this section the initial-value problem (the Cauchy problem)

[…]

(65)

for the time-fractional diffusion-wave equation of order $\alpha$ with $1 \le \alpha \le 2$, namely,

$$(D_t^\alpha u)(t) = \frac{\partial^2 u}{\partial x^2}, \quad x \in \mathbb{R}, \quad t \in \mathbb{R}_+. \tag{66}$$

This equation interpolates between the diffusion equation ($\alpha = 1$) and the wave equation ($\alpha = 2$) and was considered in detail, e.g., in Mainardi (1994, 1996a, b) (see also Luchko et al. 2013 for more recent results). To simplify the notations in the formulas, we set $\nu = \alpha/2$, so that $1/2 \le \nu \le 1$ for $1 \le \alpha \le 2$. The Green function $G_c(x,t;\nu)$ of the problem under consideration is its solution with the initial condition $f(x) = \delta(x)$, $\delta$ being the Dirac $\delta$-function. The solution of the Cauchy problem (65) is obtained via the Green function in the form

$$u(x,t;\nu) = \int_{-\infty}^{+\infty} G_c(x - \xi, t; \nu)\,f(\xi)\,d\xi.$$

For the diffusion equation ($\nu = 1/2$), the Green function is well known and is given by

$$G_c(x,t;1/2) = G_c^d(x,t) = \frac{t^{-1/2}}{2\sqrt{\pi}}\,e^{-x^2/(4t)}, \tag{67}$$

whereas for the wave equation ($\nu = 1$), we get the representation

$$G_c(x,t;1) = G_c^w(x,t) = \frac{1}{2}\bigl(\delta(x - t) + \delta(x + t)\bigr). \tag{68}$$

Following Mainardi (1994, 1996a, b) and Luchko et al. (2013), several representations of the Green function $G_c$ in the form of integrals and series as well as some methods for its numerical evaluation are presented and discussed in this subsection. We start by transforming the Cauchy problem for Eq. (66) into the Laplace-Fourier domain using the known formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{(D_t^\alpha f)(t); s\} = s^\alpha\,\mathcal{L}\{f(t); s\} - \sum_{k=0}^{n-1} f^{(k)}(0+)\,s^{\alpha-1-k}, \quad n - 1 < \alpha \le n, \quad n \in \mathbb{N} \tag{69}$$

for the Laplace transform of the Caputo fractional derivative. This formula, together with the standard formulas for the Fourier transform of the second derivative and of the Dirac $\delta$-function, leads to the representation

$$\widehat{\widetilde{G_c}}(\kappa, s; \nu) = \frac{s^{2\nu-1}}{s^{2\nu} + \kappa^2}, \quad \nu = \alpha/2 \tag{70}$$

of the Laplace-Fourier transform $\widehat{\widetilde{G_c}}$ of the Green function $G_c$. Using the Laplace transform formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{E_\alpha(-t^\alpha); s\} = \frac{s^{\alpha-1}}{s^\alpha + 1}$$

and applying to the right-hand side of the formula (70) first the inverse Laplace transform and then the inverse Fourier transform, we obtain the integral representation

$$G_c(x,t;\nu) = \frac{1}{\pi} \int_0^{\infty} E_{2\nu}\bigl(-\kappa^2 t^{2\nu}\bigr)\cos(x\kappa)\,d\kappa, \tag{71}$$

if we take into consideration the fact that the Green function of the Cauchy problem is an even function of $x$, which follows from the formula (70).

In Mainardi (1994), another representation of the Green function was obtained by applying to the right-hand side of the formula (70) first the inverse Fourier transform and then the inverse Laplace transform:

$$2\nu\,x\,G_c(x,t;\nu) = F_\nu(r) = \nu\,r\,M_\nu(r), \tag{72}$$

where $r = x/t^\nu$ is the similarity variable and

$$F_\nu(r) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma - r\sigma^\nu}\,d\sigma, \qquad M_\nu(r) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma - r\sigma^\nu}\,\frac{d\sigma}{\sigma^{1-\nu}}$$

are the two auxiliary functions nowadays often referred to as the Mainardi functions and $Ha$ denotes the Hankel integration path.

Let us note that the form of the similarity variable can be explained by the Lie group analysis of the time-fractional diffusion-wave equation (66). In Buckwar and Luchko (1998), Luchko and Gorenflo (1998), and Gorenflo et al. (2000b), symmetry groups of scaling transformations for the time- and space-fractional partial differential equations have been constructed. In particular, it has been proved in Buckwar and Luchko (1998) that the only invariant of the symmetry group $T$ of scaling transformations of the time-fractional diffusion-wave equation (66) has the form $\eta(x,t,u) = x/t^\nu$, which explains the form of the scaling variable.

Using the well-known representation of the Wright function, which reads (in our notation) for $z \in \mathbb{C}$

$$W_{\lambda,\mu}(z) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma + z\sigma^{-\lambda}}\,\frac{d\sigma}{\sigma^\mu} = \sum_{n=0}^{\infty} \frac{z^n}{n!\,\Gamma(\lambda n + \mu)}, \tag{73}$$

where $\lambda > -1$ and $\mu > 0$, we recognize that the auxiliary functions $F_\nu$ and $M_\nu$ are related to the Wright function according to the formula

$$F_\nu(z) = W_{-\nu,0}(-z) = \nu\,z\,M_\nu(z), \qquad M_\nu(z) = W_{-\nu,1-\nu}(-z). \tag{74}$$

This relation (74) along with (73) provides us with the series representations of the Mainardi functions and thus of the Green function $G_c$ (for $x > 0$):

$$G_c(x,t;\nu) = \frac{1}{2\nu x}\,F_\nu(r) = \frac{1}{2t^\nu}\,M_\nu(r) = \frac{1}{2t^\nu} \sum_{n=0}^{\infty} \frac{(-x/t^\nu)^n}{n!\,\Gamma(-\nu n + 1 - \nu)}. \tag{75}$$
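As a rough illustration of the series (75), the following sketch sums its first terms and can be compared with the Gaussian kernel (67) at $\nu = 1/2$. The names `reciprocal_gamma` and `green_diffusion_wave` and the default number of terms are assumptions made for this example; this is not one of the algorithms cited in this chapter, and the truncated series is only reliable for moderate values of the similarity variable $x/t^\nu$.

```python
import math


def reciprocal_gamma(x: float) -> float:
    """1/Gamma(x), set to 0 at the poles x = 0, -1, -2, ... of the gamma function."""
    if x <= 0.0 and abs(x - round(x)) < 1e-12:
        return 0.0
    return 1.0 / math.gamma(x)


def green_diffusion_wave(x: float, t: float, nu: float, terms: int = 60) -> float:
    """Partial sum of the series (75) for G_c(x, t; nu) with x > 0 and t > 0."""
    r = x / t**nu  # similarity variable
    total = 0.0
    for n in range(terms):
        total += (-r) ** n / math.factorial(n) * reciprocal_gamma(-nu * n + 1.0 - nu)
    return total / (2.0 * t**nu)
```

For example, `green_diffusion_wave(1.0, 1.0, 0.5)` should agree with $e^{-1/4}/(2\sqrt{\pi})$ from (67) up to rounding errors.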

Fig. 1 Green function $G_c(x;\nu) := G_c(x,1;\nu)$: plots for several different values of $\nu$ (curves for $\nu$ = 0.5, 0.65, 0.9, and 1; horizontal axis $x$, vertical axis $G_c(x;\nu)$)

Let us now briefly introduce some algorithms for the numerical evaluation of the Green function $G_c$. Because $G_c$ is a particular case of the Wright function (see formula (74)), one can of course use the algorithms for the numerical evaluation of the Wright function suggested in Luchko (2008) to evaluate the Green function $G_c$. Another approach to the numerical evaluation of $G_c$, which we employed to produce the plots in Figs. 1 and 2, uses the integral representation (71). To evaluate the Mittag-Leffler function $E_\alpha$ in (71), we applied the algorithms suggested in Gorenflo et al. (2002) and the MATLAB programs that implement these algorithms and are available from Matlab File Exchange (2005). In Fig. 1, several plots of the Green function $G_c(x;\nu) := G_c(x,1;\nu)$ for different values of the parameter $\nu$ ($\nu = \alpha/2$) are presented. It can be seen that for $x \ge 0$ each Green function has a unique maximum and that the location of the maximum point changes with the value of $\nu$. For a detailed discussion of the maximum location, maximum value, and the propagation velocity of the maximum point, we refer to Luchko et al. (2013). In Fig. 2, the Green function $G_c(x,t;\nu)$ is plotted for $\nu = 0.875$ ($\alpha = 1.75$) from different perspectives. The plots show that both the location of the maximum and the maximum value depend on time $t > 0$: whereas the maximum value decreases with time (Fig. 2, right), the $x$-coordinate of the maximum location increases (Fig. 2, left). Surprisingly, the product of the maximum location and the maximum value of $G_c(x,t;\nu)$ does not depend on time $t > 0$ and is just a function of the parameter $\nu$ (see Luchko et al. 2013 for more details).

Fig. 2 Green function $G_c(x,t;\nu)$: plots for $\nu = 0.875$ from different perspectives

4 Fractional Wave Equation

In this section, a fractional generalization of the wave equation that describes propagation of damped waves is considered. In contrast to the fractional diffusion-wave equation that was considered in the previous section, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that the fractional wave equation inherits some crucial characteristics of the wave equation like a constant propagation velocity of both the maximum of its fundamental solution and its gravity and “mass” centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. In this section, the fundamental solution of the one-dimensional fractional wave equation is obtained in explicit form and shown to be a spatial probability density function evolving in time, all of whose moments of order less than $\alpha$ are finite. To illustrate analytical findings, results of numerical calculations and plots are presented.

From the mathematical viewpoint, the one-dimensional fractional wave equation we deal with in this section was introduced for the first time in Gorenflo et al. (2000a), where this equation was called the neutral-fractional diffusion equation. In Mainardi et al. (2001), a time-space fractional diffusion-wave equation with the Riesz-Feller derivative of order $\alpha \in (0, 2]$ and skewness $\theta$ has been investigated in detail. A particular case of this equation that for $\theta = 0$ corresponds to our fractional wave equation has been briefly mentioned in Mainardi et al. (2001). In Metzler and Nonnenmacher (2002), a fundamental solution to the neutral-fractional diffusion equation was deduced and analyzed in terms of the Fox H-function. For a detailed treatment of the one-dimensional fractional wave equation, we refer to the recent paper Luchko (2013).

In the applications, the fractional wave equations of different types were employed, e.g., for modeling of dynamics of sand and fissured rock with the seismic excitations in Gudehus and Touplikiotis (2012) and for description of the causal elastic waves with a frequency power-law attenuation in Näsholm and Holm (2013).
As has been shown, e.g., in Szabo and Wu (2000), elastic wave attenuation in complex media such as biological tissue, polymers, rocks, and rubber often follows a frequency power law, and thus, such elastic waves can be modeled with the fractional wave equations.

4.1 Analysis of the Fractional Wave Equation

4.1.1 Problem Formulation

In this section, we consider the fractional wave equation in the form

$$D_t^\alpha u(x,t) = C_\alpha\,R_x^\alpha u(x,t), \quad x \in \mathbb{R}^n, \quad t \in \mathbb{R}_+, \quad 1 \le \alpha \le 2, \tag{76}$$

where $u = u(x,t)$ is a real field variable, $R_x^\alpha$ is the Riesz space-fractional derivative (29) of order $\alpha$, and $D_t^\alpha$ is the Caputo time-fractional derivative (25) of order $\alpha$. The Caputo and the Riesz fractional derivatives were introduced and briefly discussed in the second section. Here we just note that the Riesz fractional derivative is a symmetric operator with respect to the space variable $x$. Because of the relation $-|\kappa|^\alpha = -(\kappa^2)^{\alpha/2}$, it can be formally interpreted as

$$R_x^\alpha = -(-\Delta)^{\alpha/2},$$

i.e., as a power of the self-adjoint and positive definite operator $-\Delta$, $\Delta$ being the Laplace operator.

To make the analysis of Eq. (76) simpler and clearer, in the further discussions we focus on the model one-dimensional fractional wave equation

$$D_t^\alpha u(x,t) = R_x^\alpha u(x,t), \quad x \in \mathbb{R}, \quad t \in \mathbb{R}_+, \quad 1 \le \alpha \le 2. \tag{77}$$

In (77), all quantities are supposed to be dimensionless, so that the coefficient of the Riesz space-fractional derivative can be taken to be equal to one without loss of generality.

As we have mentioned in the second section, in the one-dimensional case, the Riesz fractional derivative (29) can be represented as the hypersingular integral (30) that for $\alpha = 1$ can be rewritten via the Hilbert transform (31). This means that for $\alpha = 1$ Eq. (77) can be represented in the form

$$\frac{\partial u}{\partial t}(x,t) = -\frac{1}{\pi}\,\frac{d}{dx} \int_{-\infty}^{+\infty} \frac{u(\xi,t)}{x - \xi}\,d\xi \tag{78}$$

that we call a modified convection equation and that is of course different from the standard convection equation. For $\alpha = 2$, Eq. (77) is reduced to the one-dimensional wave equation. In what follows, we focus on the case $1 \le \alpha < 2$ because the case $\alpha = 2$ (wave equation) is well studied in the literature.

For Eq. (77), the initial-value problem

$$u(x,0) = \varphi(x), \qquad \frac{\partial u}{\partial t}(x,0) = 0, \quad x \in \mathbb{R} \tag{79}$$

is considered for $1 < \alpha < 2$. If $\alpha = 1$, the second initial condition in (79) is omitted. In this section, we are mostly interested in the behavior and properties of the fundamental solution (Green function) $G_\alpha$ of Eq. (77), i.e., its solution with the initial condition $\varphi(x) = \delta(x)$, $\delta$ being the Dirac delta function.

4.1.2 Fundamental Solution of the Fractional Wave Equation

We start our analysis by applying the Fourier transform with respect to the space variable $x$ to Eq. (77) with $1 < \alpha < 2$ and to the initial conditions (79) with $\varphi(x) = \delta(x)$. Using the definition of the Riesz fractional derivative, for the Fourier transform $\hat G_\alpha$, we get the initial-value problem

$$\hat G_\alpha(\kappa, 0) = 1, \qquad \frac{\partial \hat G_\alpha}{\partial t}(\kappa, 0) = 0 \tag{80}$$

for the fractional differential equation

$$(D^\alpha \hat G_\alpha)(t) + |\kappa|^\alpha\,\hat G_\alpha(\kappa, t) = 0. \tag{81}$$

The unique solution of (80), (81) is given by the expression (see, e.g., Luchko 1999)

$$\hat G_\alpha(\kappa, t) = E_\alpha(-|\kappa|^\alpha t^\alpha) \tag{82}$$

in terms of the Mittag-Leffler function (59). The well-known formula (see, e.g., Podlubny 1999)

$$E_\alpha(-x) = -\sum_{k=1}^{m} \frac{(-x)^{-k}}{\Gamma(1 - \alpha k)} + O\bigl(|x|^{-1-m}\bigr), \quad m \in \mathbb{N}, \quad x \to +\infty$$

for asymptotics of the Mittag-Leffler function that is valid for $0 < \alpha < 2$ and the formula (82) show that $\hat G_\alpha$ belongs to $L^1(\mathbb{R})$ with respect to $\kappa$ for $1 < \alpha < 2$. Therefore, we can apply the inverse Fourier transform and get the representation

$$G_\alpha(x,t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-i\kappa x}\,E_\alpha(-|\kappa|^\alpha t^\alpha)\,d\kappa, \quad x \in \mathbb{R}, \quad t > 0 \tag{83}$$

for the Green function $G_\alpha$. The last formula shows that the fundamental solution $G_\alpha$ is an even function in $x$, i.e.,

$$G_\alpha(-x,t) = G_\alpha(x,t), \quad x \in \mathbb{R}, \quad t > 0 \tag{84}$$

and (83) can be rewritten as the cos-Fourier transform:

$$G_\alpha(x,t) = \frac{1}{\pi} \int_0^{\infty} \cos(\kappa x)\,E_\alpha(-\kappa^\alpha t^\alpha)\,d\kappa, \quad x \in \mathbb{R}, \quad t > 0. \tag{85}$$

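Remarkably, the fundamental solution $G_\alpha$ can be represented in terms of elementary functions for every $\alpha$, $1 < \alpha < 2$. To show this, the technique of the Mellin integral transform was applied in Luchko (2013) to rewrite the integral (85) as a particular case of the Fox H-function:

$$G_\alpha(x,t) = \frac{1}{\alpha x}\,\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Gamma\left(\frac{s}{\alpha}\right)\Gamma\left(1 - \frac{s}{\alpha}\right)}{\Gamma\left(1 - \frac{s}{2}\right)\Gamma\left(\frac{s}{2}\right)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha \tag{86}$$

that can be represented in the form

$$G_\alpha(x,t) = \frac{1}{\alpha x}\,\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha \tag{87}$$

using the duplication and reflection formulas for the Euler gamma function $\Gamma$. From (86) or (87), a useful representation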


$$G_\alpha(x,t) = \frac{1}{x}\,L_\alpha(t/x), \quad x > 0, \quad t > 0 \tag{88}$$

of the fundamental solution $G_\alpha$ in terms of the auxiliary function $L_\alpha$ defined by

$$L_\alpha(x) = \frac{1}{\alpha}\,\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}\,x^{-s}\,ds, \quad 0 < \gamma < \alpha \tag{89}$$

can be obtained. It follows from the representation (89) (or from the formulas (84) and (88)) that $L_\alpha$ is an odd function, i.e.,

$$L_\alpha(-x) = -L_\alpha(x), \quad x \in \mathbb{R}. \tag{90}$$

Moreover, the important formula

$$L_\alpha(x) = L_\alpha(1/x), \quad x \ne 0 \tag{91}$$

can be obtained from the representation (89) by the variables substitution $s = -s_1$ in the integral at the right-hand side of (89). From (88) and (91) and the fact that $G_\alpha(-x,t) = G_\alpha(x,t) = G_\alpha(|x|,t)$ for all $x \ne 0$, $t > 0$, the similarity properties of the fundamental solution

$$G_\alpha(x,t) = \frac{1}{|x|}\,G_\alpha(1, t/|x|) = \frac{1}{|x|}\,G_\alpha(1, |x|/t) = \frac{1}{t}\,G_\alpha(x/t, 1) = \frac{t}{x^2}\,G_\alpha(t/x, 1) \tag{92}$$

are deduced. It is worthwhile to stress the remarkable fact that two of these similarity properties hold with the variable $x$ fixed to 1, the other two with the variable $t$ fixed to 1. This fact reflects the property that in the fractional wave equation (77) we deal with in this section, the time-fractional derivative and the space-fractional derivative are of the same order $\alpha$. The correctness of the four similarity properties (92) can also be checked directly using the final formula (97) for the fundamental solution $G_\alpha$ of the fractional wave equation.

Because the auxiliary function $L_\alpha$ is defined in (89) as an inverse Mellin transform, its Mellin transform is given by the formula

$$L_\alpha^*(s) = \int_0^{\infty} L_\alpha(x)\,x^{s-1}\,dx = \frac{1}{\alpha}\,\frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}, \quad 0 < \Re(s) < \alpha. \tag{93}$$

[…]

$$G_\alpha(x,t) = \frac{1}{\pi}\,\frac{|x|^{\alpha-1}\,t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2|x|^\alpha t^\alpha \cos(\pi\alpha/2) + |x|^{2\alpha}}, \quad t > 0, \quad x \in \mathbb{R}, \quad 1 < \alpha < 2. \tag{97}$$
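The closed form (97) is straightforward to evaluate, so some of its properties can be checked numerically. The sketch below is an illustration under stated assumptions: the function names, the change of variables $r = t\,u/(1-u)$, and the midpoint rule are choices made for this example and are not taken from the cited papers.

```python
import math


def green_fractional_wave(x: float, t: float, alpha: float) -> float:
    """Closed form (97) of G_alpha(x, t) for t > 0 and 1 <= alpha < 2."""
    r = abs(x)
    half_angle = math.pi * alpha / 2.0
    numerator = r ** (alpha - 1.0) * t**alpha * math.sin(half_angle)
    denominator = (
        t ** (2.0 * alpha)
        + 2.0 * r**alpha * t**alpha * math.cos(half_angle)
        + r ** (2.0 * alpha)
    )
    return numerator / (math.pi * denominator)


def total_mass(t: float, alpha: float, n: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of G_alpha(x, t) over the real line.

    The half-line r > 0 is mapped to 0 < u < 1 by r = t*u/(1 - u);
    the result is doubled because G_alpha is even in x.
    """
    h = 1.0 / n
    half = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        r = t * u / (1.0 - u)
        jacobian = t / (1.0 - u) ** 2  # dr/du
        half += green_fractional_wave(r, t, alpha) * jacobian * h
    return 2.0 * half
```

With these helpers one can confirm, for instance, that `green_fractional_wave(x, t, 1.0)` reproduces the Cauchy kernel $t/(\pi(t^2 + x^2))$, that the scaling relation $G_\alpha(x,t) = G_\alpha(x/t,1)/t$ from (92) holds, and that `total_mass(t, alpha)` is close to one.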

4.2 Fundamental Solution as a pdf

We begin with a remark that the formula (97) is valid for $\alpha = 1$ (modified convection equation (78)), too, which can be proved by direct calculations. In this case we get the well-known Cauchy kernel

$$G_1(x,t) = \frac{1}{\pi}\,\frac{t}{t^2 + x^2} \tag{98}$$

that is a spatial probability density function evolving in time. For $1 < \alpha < 2$, the Green function (97) is a spatial probability density function evolving in time, too. Indeed, the function (97) is evidently nonnegative for all $t > 0$. Furthermore,

$$\int_{-\infty}^{\infty} G_\alpha(x,t)\,dx = (\mathcal{F}\,G_\alpha(x,t))(0) = \hat G_\alpha(0,t) = E_\alpha(-|\kappa|^\alpha t^\alpha)\big|_{\kappa=0} = 1 \tag{99}$$

for all $t > 0$ and $1 < \alpha < 2$ according to the formula (82). Thus, $G_\alpha$ given by (97) is a spatial probability density function evolving in time that can be considered to be a fractional generalization of the Cauchy kernel (98) for the case of an arbitrary index $\alpha$, $1 \le \alpha < 2$.

Now let us study some properties of the fundamental solution (97) as a pdf. Because $G_\alpha$ is an even function, we consider the function

$$G_\alpha^+(r,t) = G_\alpha(|x|,t) = \frac{1}{\pi}\,\frac{r^{\alpha-1}\,t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2r^\alpha t^\alpha \cos(\pi\alpha/2) + r^{2\alpha}}, \quad t > 0, \quad 1 < \alpha < 2$$

with $r = |x| \ge 0$. It is easy to see that $G_\alpha^+$ behaves like a power function in $r$ both at $r = 0$ and at $r = +\infty$ for a fixed $t > 0$:

$$G_\alpha^+(r,t) \sim \begin{cases} r^{\alpha-1}, & r \to 0, \\ r^{-\alpha-1}, & r \to +\infty. \end{cases} \tag{100}$$

This means that the pdf $G_\alpha$ possesses finite moments of order $\lambda$ for $0 \le \lambda < \alpha$, but the moment of order $\alpha$ is infinite. In particular, the mean value of $G_\alpha$ (its first moment) exists for all $\alpha > 1$ (we note that the Cauchy kernel does not possess a mean value).

Let us now evaluate the moments of the one-sided fractional Cauchy kernel $G_\alpha^+$ for a fixed $t > 0$. To do this, we refer to the representation (88) of $G_\alpha^+$ in terms of the auxiliary function $L_\alpha$ that is given by the formula (see (90) and (97))

$$L_\alpha(x) = \frac{1}{\pi}\,\frac{\operatorname{sign}(x)\,|x|^\alpha \sin(\pi\alpha/2)}{|x|^{2\alpha} + 2|x|^\alpha \cos(\pi\alpha/2) + 1}, \quad x \in \mathbb{R}, \quad 1 < \alpha < 2. \tag{101}$$

Taking into account this formula, the function

$$C_\alpha(x) := \frac{L_\alpha(x)}{x}$$

can be interpreted as a fractional Cauchy pdf of the order $\alpha$. Indeed, $C_\alpha(x)$ is evidently nonnegative for all $x \in \mathbb{R}$, and the property

$$\int_{-\infty}^{+\infty} C_\alpha(x)\,dx = 1$$

is valid because of the formulas (102) and (103). Of course, for $\alpha = 1$, the pdf $C_\alpha(x)$ coincides with the Cauchy pdf.

The moment of the order $\lambda$, $0 \le \lambda < \alpha$ of $G_\alpha^+$ can be represented in terms of the Mellin integral transform of $L_\alpha$ that is known (see the formula (93)) and thus evaluated:

$$\int_0^{\infty} G_\alpha^+(r,t)\,r^\lambda\,dr = t^\lambda \int_0^{\infty} L_\alpha(\tau)\,\tau^{\lambda-1}\,d\tau = \frac{t^\lambda \sin(\pi\lambda/2)}{\alpha \sin(\pi\lambda/\alpha)}. \tag{102}$$

In particular, we get the formula

$$\int_0^{\infty} G_\alpha^+(r,t)\,dr = \frac{1}{2} \tag{103}$$

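that is in accordance with (99) because $G_\alpha$ is an even function in $x$. We mention also the important formula

$$\int_0^{\infty} G_\alpha^+(r,t)\,r\,dr = \frac{t}{\alpha \sin(\pi/\alpha)}, \quad 1 < \alpha < 2. \tag{104}$$

[…] $> 0$ for $x \ne 0$, so that $x = 0$ is a minimum point of the fundamental solution $G_\alpha$ for any $t > 0$. Because $G_\alpha$ is an even function in $x$, we again consider the function $G_\alpha(|x|,t)$ that was denoted by $G_\alpha^+(r,t)$ with $r = |x|$. To determine the maximum locations of $G_\alpha^+$ for the fixed values of $t$ and $\alpha$, we solve the equation

$$\frac{\partial G_\alpha^+}{\partial r}(r,t) = 0$$

that turns out to be equivalent to the quadratic equation

$$(\alpha + 1)\left(\frac{r^\alpha}{t^\alpha}\right)^2 + 2\cos(\pi\alpha/2)\,\frac{r^\alpha}{t^\alpha} - (\alpha - 1) = 0$$

with solutions given by

$$\frac{r^\alpha}{t^\alpha} = \frac{-\cos(\pi\alpha/2) \pm \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}.$$

Since we are interested in the nonnegative solutions, the only candidate for this role is the point

$$\frac{r^\alpha}{t^\alpha} = c_\alpha, \qquad c_\alpha := \frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}. \tag{105}$$

Because $\frac{\partial G_\alpha^+}{\partial r}(r,t)$ is positive for $\frac{r^\alpha}{t^\alpha} < c_\alpha$ and negative for $\frac{r^\alpha}{t^\alpha} > c_\alpha$, we conclude that the point

$$r_\alpha^\star(t) = v_p(\alpha)\,t, \qquad v_p(\alpha) := (c_\alpha)^{1/\alpha} \tag{106}$$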

with $c_\alpha$ given by (105) is the only maximum point of the one-sided fractional Cauchy kernel $G_\alpha^+$. Of course, this point and the point $-r_\alpha^\star(t) < 0$ are maximum points of $G_\alpha$ because $G_\alpha$ is an even function in $x$.

To determine the maximum value of the function $G_\alpha$ that coincides with the maximum value of $G_\alpha^+$ and is denoted by $G_\alpha^\star(t)$, we substitute the point $r = r_\alpha^\star(t)$ given by (106) into the function $G_\alpha^+$ and get the formula

$$G_\alpha^\star(t) = \frac{1}{t}\,m_\alpha, \qquad m_\alpha := G_\alpha^\star(1) = \frac{1}{\pi\,v_p(\alpha)}\,\frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2c_\alpha \cos(\pi\alpha/2) + c_\alpha^2}, \tag{107}$$

where $v_p(\alpha)$ and $c_\alpha$ are defined as in the formulas (105) and (106). Of course, we can also use the relation (88) and obtain the formula

$$G_\alpha^\star(t) = \frac{1}{t\,v_p(\alpha)}\,L_\alpha(v_p(\alpha)) \tag{108}$$

via the auxiliary function $L_\alpha$. It follows from the formulas (106) and (107) (or (108)) that for a fixed value of $\alpha$, $1 < \alpha < 2$, the product $p_\alpha$ of the maximum value $G_\alpha^\star(t)$ and the maximum locations $\pm r_\alpha^\star(t)$ is time independent:

$$p_\alpha = \pm r_\alpha^\star(t)\,G_\alpha^\star(t) = \pm\frac{1}{\pi}\,\frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2c_\alpha \cos(\pi\alpha/2) + c_\alpha^2} = \pm L_\alpha\bigl(v_p(\alpha)\bigr). \tag{109}$$

For $\alpha = 1$, the maximum location of the Green function $G_1(x,t) = \frac{1}{\pi}\,\frac{t}{t^2 + x^2}$ does not move with time and is evidently at the point $x = 0$ for any $t > 0$, i.e., $c_1 = v_p(1) = p_1 = 0$, which is in accordance with the formulas (105), (106), and (109).

Now we calculate some physical characteristics of the damped waves that are described by the fundamental solution $G_\alpha$. Because $G_\alpha$ consists in fact of two symmetric branches that move in opposite directions, we again consider only one of them, say, $G_\alpha^+$ that is a restriction of $G_\alpha$ to $x \ge 0$. The location of the gravity center $r_\alpha^g(t)$ of $G_\alpha^+$ is defined by the formula

$$r_\alpha^g(t) = \frac{\int_0^{\infty} r\,G_\alpha^+(r,t)\,dr}{\int_0^{\infty} G_\alpha^+(r,t)\,dr}. \tag{110}$$

For $1 < \alpha < 2$, the formulas (103) and (104) lead to the following result:

$$r_\alpha^g(t) = \frac{2t}{\alpha \sin(\pi/\alpha)}. \tag{111}$$
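The formula (111) can be cross-checked against the definition (110) by numerical quadrature. The sketch below is only an illustration: the function names are assumptions, $t$ is fixed to 1, and both integrals are folded onto $0 < \tau < 1$ by means of (88) and the symmetry (91) and then smoothed by the substitution $\tau = s^4$, which is a choice made here for numerical convenience.

```python
import math


def auxiliary_L(tau: float, alpha: float) -> float:
    """Auxiliary function L_alpha from (101), for tau > 0."""
    half_angle = math.pi * alpha / 2.0
    y = tau**alpha
    return y * math.sin(half_angle) / (math.pi * (y * y + 2.0 * y * math.cos(half_angle) + 1.0))


def gravity_center_formula(alpha: float) -> float:
    """Gravity center r_g(t)/t according to (111), for 1 < alpha <= 2."""
    return 2.0 / (alpha * math.sin(math.pi / alpha))


def gravity_center_numeric(alpha: float, n: int = 20_000) -> float:
    """Ratio (110) at t = 1 by the midpoint rule.

    With G+(r, 1) = L(r)/r and L(tau) = L(1/tau):
      first moment  = integral over (0, 1) of L(tau) * (1 + tau**-2),
      zeroth moment = integral over (0, 1) of 2 * L(tau) / tau.
    """
    h = 1.0 / n
    first = 0.0
    zeroth = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        tau = s**4
        weight = 4.0 * s**3 * h  # d(tau) = 4 s^3 ds
        value = auxiliary_L(tau, alpha)
        first += value * (1.0 + tau**-2) * weight
        zeroth += 2.0 * value / tau * weight
    return first / zeroth
```

For $\alpha = 2$ the formula gives the value 1, i.e., the gravity center of each branch moves with the unit velocity of the classical wave equation.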

If $\alpha = 1$, the mean value of $G_1^+$ does not exist and thus the gravity center $r_1^g(t)$ of $G_1^+$ is located at $+\infty$ for any $t > 0$. The “mass” center $r_\alpha^m(t)$ of $G_\alpha^+$ is determined by the formula (Gurwich 2001)

$$r_\alpha^m(t) = \frac{\int_0^{\infty} r\,\bigl(G_\alpha^+(r,t)\bigr)^2\,dr}{\int_0^{\infty} \bigl(G_\alpha^+(r,t)\bigr)^2\,dr}. \tag{112}$$

Substituting the representation (88) into (112) and transforming the obtained integrals, we get the formula R 1 1 2 L . / d ; (113) r˛m .t / D vm .˛/ t; vm .˛/ D 0R 1 2 ˛ 0 L˛ . / d where the function L˛ is defined by (101). The formula (113) (as well as the formula (118)) includes some integrals of the form

Page 28 of 36

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_60-2 © Springer-Verlag London 2014

Z

1

I.ˇ/ D

ˇ L2˛ . / d ; 2˛  1 < ˇ < 2˛  1

(114)

0

that in general cannot be expressed via known elementary or special functions. Remarkably, there exists an explicit formula for the integrals (114) in the case ˛ D 1, namely, (Prudnikov et al. 1986) Z

1

ˇ L21 . / d D

0

1 1Cˇ ; 3 < ˇ < 1: 4 cos.ˇ=2/

(115)

It follows from this formula that the “mass” center r1m of G1C can be represented by the simple formula r1m .t / D

2 t: 

(116)

In the general case, we just note that the symmetry relation $I(\beta) = I(-\beta - 2)$, $-2\alpha - 1 < \beta < 2\alpha - 1$, holds true because of the formula (91). Finally, we mention that the location of energy of the damped wave $G_\alpha^+$, which is defined as the time corresponding to the centroid of the function $G_\alpha^+$ in the time domain, is given by the formula (Carcione et al. 2010)

$$t_\alpha^c(r) = \frac{\int_0^\infty t \left(G_\alpha^+(r,t)\right)^2 dt}{\int_0^\infty \left(G_\alpha^+(r,t)\right)^2 dt}. \tag{117}$$

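For $1 < \alpha < 2$, both integrals on the right-hand side of (117) converge and the finite location of energy can be represented in the form

$$t_\alpha^c(r) = \frac{r}{v_c(\alpha)}, \qquad v_c(\alpha) = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{118}$$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the integral $\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau$ diverges, so that $v_c(\alpha)$ tends to $0$ as $\alpha \to 1$.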

4.4 Velocities of the Damped Waves

It is well known (see, e.g., Smith 1970; Bloch 1977; Groesen and Mainardi 1989, 1990; Gurwich 2001; Carcione et al. 2010) that several different definitions of the wave velocities, and in particular of the light velocity, can be introduced. For the damped waves that are described by the fractional wave equation (77), we evaluate the propagation velocity of the maximum of its fundamental solution $G_\alpha$ that can be interpreted as the phase velocity, the propagation velocity of the gravity center of $G_\alpha$, the velocity of its “mass” center or the pulse velocity, and three different kinds of its centrovelocity. It turns out that all these velocities are constant in time and depend just on the order $\alpha$ of the fractional wave equation. Whereas four out of six velocities are different from each other, the first centrovelocity coincides with the Smith centrovelocity, and the second centrovelocity is the same as the pulse velocity. We start with the phase velocity and determine it using the formula (106), which leads to the result that the maximum locations of the fundamental solution $G_\alpha$ propagate with the constant velocities $v_p(\alpha)$ that are given by the expression

$$v_p(\alpha) := \pm\frac{dr_\alpha^\star(t)}{dt} = \pm\left(\frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}\right)^{1/\alpha}. \tag{119}$$

For $\alpha = 1$ (modified convection equation (78)), the propagation velocity of the maximum of $G_\alpha$ is equal to zero (the maximum point stays at $x = 0$), whereas for $\alpha = 2$ (wave equation), the maximum points propagate with the constant velocity $\pm 1$. To determine the propagation velocity $v_g(\alpha)$ of the gravity center of $G_\alpha$, we employ the formula (111) and get the following result:

$$v_g(\alpha) := \frac{dr_\alpha^g(t)}{dt} = \frac{2}{\alpha \sin(\pi/\alpha)}. \tag{120}$$
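The two closed-form velocities can be tabulated directly. The sketch below is a minimal illustration, assuming that (119) reads $v_p(\alpha) = \big((-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)})/(\alpha+1)\big)^{1/\alpha}$ (positive branch) and that (120) reads $v_g(\alpha) = 2/(\alpha\sin(\pi/\alpha))$; the function names are hypothetical and the clamping against round-off is an implementation detail added here.

```python
import math


def phase_velocity(alpha):
    """Positive branch of v_p(alpha), formula (119) as read above, 1 <= alpha <= 2."""
    c = math.cos(math.pi * alpha / 2.0)
    s = math.sin(math.pi * alpha / 2.0)
    root = math.sqrt(max(alpha * alpha - s * s, 0.0))
    # Clamp at zero: for alpha = 1 round-off can make the base slightly negative.
    base = max((-c + root) / (alpha + 1.0), 0.0)
    return base ** (1.0 / alpha)


def gravity_center_velocity(alpha):
    """v_g(alpha), formula (120) as read above, valid for 1 < alpha <= 2."""
    return 2.0 / (alpha * math.sin(math.pi / alpha))


if __name__ == "__main__":
    for alpha in (1.1, 1.5, 1.9, 2.0):
        print(alpha, phase_velocity(alpha), gravity_center_velocity(alpha))
```

The limiting cases quoted in the text serve as a check: the phase velocity vanishes at $\alpha = 1$, and both velocities equal $1$ at $\alpha = 2$.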

The velocity $v_g(\alpha)$ is thus time independent and determined by the order $\alpha$ of the fractional wave equation. Evidently, $v_g(2) = 1$ and $v_g(\alpha) \to +\infty$ as $\alpha \to 1 + 0$. The velocity $v_m(\alpha)$ of the “mass” center of $G_\alpha$, or its pulse velocity (Gurwich 2001), is obtained from the formula (113) and is equal to

$$v_m(\alpha) := \frac{dr_\alpha^m(t)}{dt} = \frac{\int_0^\infty \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^\infty L_\alpha^2(\tau)\, d\tau}, \tag{121}$$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the pulse velocity is equal to $2/\pi \approx 0.64$ (see the formula (116)). Following Carcione et al. (2010), we define the second centrovelocity $v_2(\alpha)$ as the mean pulse velocity computed from the time $0$ to the time $t$. It follows from (113) and (121) that for the damped wave that is described by the fundamental solution of the fractional wave equation, the second centrovelocity is equal to its pulse velocity $v_m(\alpha)$:

$$v_2(\alpha) := \frac{r_\alpha^m(t)}{t} = v_m(\alpha) = \frac{\int_0^\infty \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^\infty L_\alpha^2(\tau)\, d\tau}. \tag{122}$$

The Smith centrovelocity $v_c(\alpha)$ (Smith 1970) of the damped waves describes the motion of the first moment of their energy distribution and can be evaluated in explicit form using the formula (118):

$$v_c(\alpha) := \left(\frac{dt_\alpha^c(r)}{dr}\right)^{-1} = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{123}$$

Fig. 3 Fundamental solution $G_\alpha$: plots for $\alpha = 1.01$ (1st line, left), $\alpha = 1.1$ (1st line, right), $\alpha = 1.5$ (2nd line, left), and $\alpha = 1.9$ (2nd line, right) for $-0.5 \le x \le 0.5$ and $t = 0.1, 0.2, 0.3$

where the function $L_\alpha$ is defined by (101). Because the integral $\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau$ diverges for $\alpha = 1$, the Smith centrovelocity tends to $0$ as $\alpha \to 1$. Finally, we evaluate the first centrovelocity $v_1(\alpha)$ that is defined as the mean centrovelocity from $0$ to $x$ (Carcione et al. 2010). It follows from (118) and (123) that for the damped wave $G_\alpha$ the first centrovelocity is equal to the Smith centrovelocity $v_c(\alpha)$:

$$v_1(\alpha) := \frac{r}{t_\alpha^c(r)} = v_c(\alpha) = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}. \tag{124}$$

As we have seen, all velocities introduced above are constant in time and depend just on the order $\alpha$ of the fractional wave equation. The phase velocity, the velocity of the gravity center of $G_\alpha$, the pulse velocity, and the Smith centrovelocity are different from each other, whereas the first centrovelocity coincides with the Smith centrovelocity and the second centrovelocity is the same as the pulse velocity. For the physical interpretation and meaning of the velocities that were determined above, we refer to, e.g., Bloch (1977), Groesen and Mainardi (1989, 1990), Gurwich (2001), and Carcione et al. (2010).

4.5 Discussion of the Obtained Results and Plots

To start with, let us consider the evolution of the fundamental solution $G_\alpha$ in time for some characteristic values of $\alpha$. In Fig. 3, the plots of $G_\alpha$ for $\alpha = 1.01, 1.1, 1.5$, and $1.9$ are presented. As we can see, in all cases the maximum location of $G_\alpha$ moves in time according to the formula (106), whereas the maximum value decreases according to the formula (107). The behavior of $G_\alpha$ can thus be interpreted as propagation of damped waves whose amplitude decreases with time. This phenomenon can be very clearly recognized on the 3D plot presented in Fig. 4.

Fig. 4 Plot of $G_\alpha$ for $\alpha = 1.5$, $-0.5 \le x \le 0.5$, and $0 < t \le 0.3$

Of course, because of the nonlocal character of the fractional derivatives in the fractional wave equation, the solutions to this equation show some properties of diffusion processes, too. In particular, the fundamental solution $G_\alpha$ is positive for all $x \ne 0$ at any small time instance $t > 0$, which means that a disturbance of the initial conditions spreads infinitely fast and Eq. (33) is nonrelativistic like the classical diffusion equation. But in contrast to the diffusion equation, the maximum location of the fundamental solution $G_\alpha$, its gravity and “mass” centers, and the location of its energy all propagate with finite constant velocities, like the fundamental solution of the wave equation.

Fig. 5 Plots of the gravity center velocity $v_g(\alpha)$, the pulse velocity $v_m(\alpha)$, the phase velocity $v_p(\alpha)$, and the centrovelocity $v_c(\alpha)$ for $1 \le \alpha \le 2$

The plots of the propagation velocity $v_p$ of the maximum location of the fundamental solution $G_\alpha$ (phase velocity), the velocity $v_g$ of its gravity center, its pulse velocity $v_m$, and its centrovelocity $v_c$ are presented in Fig. 5. As expected, $v_p = v_c = 0$ and $v_m = 2/\pi \approx 0.64$ for $\alpha = 1$ (modified convection equation), and all velocities smoothly approach the value $1$ as $\alpha \to 2$ (wave equation). For $1 < \alpha < 2$, $v_p$, $v_m$, and $v_c$ monotonically increase, whereas $v_g$ monotonically decreases. It is interesting to note that for all velocities $v = v(\alpha)$, the property $\frac{dv}{d\alpha}(2 - 0) = 0$ holds true, i.e., in a small neighborhood of the point $\alpha = 2$, the velocities of $G_\alpha$ are nearly the same as those of the fundamental solution of the wave equation. The velocity $v_g$ of the gravity center of $G_\alpha$ tends to $+\infty$ for $\alpha \to 1 + 0$ and $t > 0$ (modified convection equation) because the first moment of the Cauchy kernel (98) does not exist. It is also interesting to note that for all $\alpha$, $1 < \alpha < 2$, the velocities $v_p, v_g, v_m, v_c$ are different from each other and fulfill the inequalities $v_c(\alpha) < v_p(\alpha) < v_m(\alpha) < v_g(\alpha)$. For $\alpha = 2$, all velocities are equal to $1$.
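The ordering of the four velocities can be reproduced numerically. The sketch below rests on an explicit assumption: the kernel $L_\alpha$ of (101) is not reproduced in this section, and the code assumes the form $L_\alpha(\tau) = \frac{1}{\pi}\,\frac{\tau^\alpha \sin(\pi\alpha/2)}{1 + 2\tau^\alpha\cos(\pi\alpha/2) + \tau^{2\alpha}}$, which is consistent with (115) and (116) for $\alpha = 1$. The function names, the substitution $\tau = e^y$, and the trapezoidal rule are choices made here for illustration; the velocity formulas are the moment ratios of $L_\alpha^2$ described in Sect. 4.4.

```python
import math


def kernel_L(tau, alpha):
    """ASSUMED form of L_alpha from (101); see the lead-in for the caveat."""
    u = tau ** alpha
    phi = math.pi * alpha / 2.0
    return u * math.sin(phi) / (math.pi * (1.0 + 2.0 * u * math.cos(phi) + u * u))


def moment_integral(beta, alpha, y_max=40.0, h=0.01):
    """I(beta) = int_0^inf tau^beta L_alpha(tau)^2 dtau, via tau = exp(y), trapezoidal rule."""
    n = int(round(2.0 * y_max / h))
    total = 0.0
    for k in range(n + 1):
        tau = math.exp(-y_max + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        # dtau = tau dy, hence the exponent beta + 1.
        total += weight * tau ** (beta + 1.0) * kernel_L(tau, alpha) ** 2
    return total * h


def pulse_velocity(alpha):
    """v_m = I(-1) / I(0), cf. (121)."""
    return moment_integral(-1.0, alpha) / moment_integral(0.0, alpha)


def smith_centrovelocity(alpha):
    """v_c = I(0) / I(1), cf. (123); requires alpha > 1 for convergence."""
    return moment_integral(0.0, alpha) / moment_integral(1.0, alpha)


def phase_velocity(alpha):
    """v_p from (119), positive branch."""
    c = math.cos(math.pi * alpha / 2.0)
    s = math.sin(math.pi * alpha / 2.0)
    base = max((-c + math.sqrt(max(alpha * alpha - s * s, 0.0))) / (alpha + 1.0), 0.0)
    return base ** (1.0 / alpha)


def gravity_center_velocity(alpha):
    """v_g from (120)."""
    return 2.0 / (alpha * math.sin(math.pi / alpha))


if __name__ == "__main__":
    a = 1.5
    print(smith_centrovelocity(a), phase_velocity(a),
          pulse_velocity(a), gravity_center_velocity(a))
```

For values of $\alpha$ very close to $1$ or $2$ the integrands decay slowly or become sharply peaked, so the default truncation `y_max` and step `h` of this sketch would need to be adjusted.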


5 Conclusions and Open Problems

In this chapter, anomalous transport processes have been first modeled with the continuous time random walks on the microlevel. On the macrolevel, the CTRW models were reduced to the deterministic fractional diffusion-wave equations on the large time and space scales and under some suitable assumptions posed on the jump pdf. Equations of this kind have already been successfully employed, e.g., for modeling of the geothermal energy extraction (Luchko and Punzi 2011), stability and seismicity of the fractal fault systems (Gudehus and Touplikiotis 2012), and propagation of the damped waves (Luchko 2013), which shows their potential importance for different geomathematical applications. In this chapter, some important types of the partial differential equations of fractional order including the generalized time-fractional diffusion equation, the time-fractional diffusion-wave equation, and the time- and space-fractional wave equation were treated. For these equations, both the initial-boundary-value problems with the Dirichlet boundary conditions and the Cauchy initial-value problems have been posed and investigated. Of course, the same method can be applied to the initial-boundary-value problems with the Neumann, Robin, or mixed boundary conditions. For the generalized time-fractional diffusion equation, a powerful maximum principle has been established. It enables us to obtain information regarding solutions and their a priori estimates without any explicit knowledge of the form of the solutions themselves and thus is a valuable tool in scientific research. In this connection we mention an important problem that is still waiting for its solution, namely, to extend the maximum principle to the space- and time-space-fractional partial differential equations. These equations are nowadays actively employed in modeling of relevant complex phenomena like anomalous diffusion in inhomogeneous and porous media, Lévy processes and Lévy flights, and the so-called fractional kinetics, and they deserve to be treated in detail from the mathematical viewpoint. In the last section of the chapter, a fractional wave equation with the fractional derivatives of order $\alpha$, $1 \le \alpha \le 2$, both in space and in time was introduced and analyzed. We showed that the fractional wave equation inherits some crucial characteristics of the wave equation like the constant propagation velocities of the maximum of its fundamental solution, its gravity and “mass” centers, and its energy location. Because the maximum value of the fundamental solution $G_\alpha$ (wave amplitude) decreases with time whereas its location moves with a constant velocity, solutions to the fractional wave equation can be interpreted as damped waves. Moreover, $G_\alpha$, which turns out to be expressed in terms of elementary functions for all values of $\alpha$, $1 \le \alpha < 2$, can be interpreted as a spatial pdf evolving in time, all of whose moments of order less than $\alpha$ are finite. In connection with the fractional wave equation, an important problem for further research would be the determination of other velocities like the group velocity or the ratio-of-units velocity (see, e.g., Bloch 1977; Gurwich 2001) for the damped waves described by the fractional wave equation. Finally, the fractional wave equations with nonconstant coefficients as well as the qualitative behavior of solutions to the nonlinear fractional wave equations would be worth considering from the mathematical viewpoint and employing as models in suitable applications.

References

Al-Refai M (2012) On the fractional derivatives at extreme points. Electron J Qual Theory Differ Equ 55:1–5
Berkowitz B, Klafter J, Metzler R, Scher H (2002) Physical pictures of transport in heterogeneous media: advection-dispersion, random walk and fractional derivative formulations. Water Resour Res 38:1191–1203
Bloch SC (1977) Eighth velocity of light. Am J Phys 45:538–549
Buckwar E, Luchko Yu (1998) Invariance of a partial differential equation of fractional order under the Lie group of scaling transformations. J Math Anal Appl 227:81–97
Carcione JM, Gei D, Treitel S (2010) The velocity of energy through a dissipative medium. Geophysics 75:T37–T47
Diethelm K (2010) The analysis of fractional differential equations. Springer, Berlin
Emmanuel S, Berkowitz B (2007) Continuous time random walks and heat transfer in porous media. Transp Porous Media 67:413–430
Feller W (1952) On a generalization of Marcel Riesz’ potentials and the semi-groups generated by them. Meddelanden Lunds Universitets Matematiska Seminarium (Comm. Sém. Mathém. Université de Lund), Tome suppl. dédié à M. Riesz: 73–81
Fulger D, Scalas E, Germano G (2008) Monte Carlo simulation of uncoupled continuous time random walks yielding a stochastic solution of the space-time fractional diffusion equation. Phys Rev E 77:021122
Geiger S, Emmanuel S (2010) Non-Fourier thermal transport in fractured geological media. Water Resour Res 46:W07504
Germano G, Politi M, Scalas E, Schilling RL (2009) Stochastic calculus for uncoupled continuous-time random walks. Phys Rev E 79:066102
Gorenflo R, Mainardi F (2001) Random walk models approximating symmetric space-fractional diffusion processes. In: Elschner J, Gohberg I, Silbermann B (eds) Problems in mathematical physics. Birkhäuser Verlag, Boston/Basel/Berlin
Gorenflo R, Mainardi F (2009) Some recent advances in theory and simulation of fractional diffusion processes. J Comput Appl Math 229:400–415
Gorenflo R, Iskenderov A, Luchko Yu (2000a) Mapping between solutions of fractional diffusion-wave equations. Fract Calc Appl Anal 3:75–86
Gorenflo R, Luchko Yu, Mainardi F (2000b) Wright functions as scale-invariant solutions of the diffusion-wave equation. J Comput Appl Math 118:175–191
Gorenflo R, Loutchko J, Luchko Yu (2002) Computation of the Mittag-Leffler function and its derivatives. Fract Calc Appl Anal 5:491–518
Groesen E, Mainardi F (1989) Energy propagation in dissipative systems, Part I: centrovelocity for linear systems. Wave Motion 11:201–209
Groesen E, Mainardi F (1990) Balance laws and centrovelocity in dissipative systems. J Math Phys 30:2136–2140
Gudehus G, Touplikiotis A (2012) Clasmatic seismodynamics – oxymoron or pleonasm? Soil Dyn Earthq Eng 38:1–14
Gurwich I (2001) On the pulse velocity in absorbing and nonlinear media and parallels with the quantum mechanics. Prog Electromagn Res 33:69–96
Hanyga A (2002) Multi-dimensional solutions of space-time-fractional diffusion equations. Proc R Soc Lond A 458:429–450
Haubold J, Mathai AM, Saxena RK (2011) Mittag-Leffler functions and their applications. J Appl Math 2011:298628
Luchko Yu (1999) Operational method in fractional calculus. Fract Calc Appl Anal 2:463–489
Luchko Yu (2008) Algorithms for evaluation of the Wright function for the real arguments’ values. Fract Calc Appl Anal 11:57–75
Luchko Yu (2009a) Boundary value problems for the generalized time-fractional diffusion equation of distributed order. Fract Calc Appl Anal 12:409–422
Luchko Yu (2009b) Maximum principle for the generalized time-fractional diffusion equation. J Math Anal Appl 351:218–223
Luchko Yu (2010) Some uniqueness and existence results for the initial-boundary-value problems for the generalized time-fractional diffusion equation. Comput Math Appl 59:1766–1772
Luchko Yu (2011a) Initial-boundary-value problems for the generalized multi-term time-fractional diffusion equation. J Math Anal Appl 374:538–548
Luchko Yu (2011b) Maximum principle and its application for the time-fractional diffusion equations. Fract Calc Appl Anal 14:110–124
Luchko Yu (2012a) Anomalous diffusion: models, their analysis, and interpretation. In: Rogosin S, Koroleva A (eds) Advances in applied analysis. Series: trends in mathematics. Birkhäuser Verlag, Boston/Basel/Berlin
Luchko Yu (2012b) Initial-boundary-value problems for the one-dimensional time-fractional diffusion equation. Fract Calc Appl Anal 15:141–160
Luchko Yu (2013) Fractional wave equation and damped waves. J Math Phys 54:031505
Luchko Yu, Gorenflo R (1998) Scale-invariant solutions of a partial differential equation of fractional order. Fract Calc Appl Anal 1:63–78
Luchko Yu, Gorenflo R (1999) An operational method for solving fractional differential equations with the Caputo derivatives. Acta Math Vietnam 24:207–233
Luchko Yu, Punzi A (2011) Modeling anomalous heat transport in geothermal reservoirs via fractional diffusion equations. Int J Geomath 1:257–276
Luchko Yu, Mainardi F, Povstenko Yu (2013) Propagation speed of the maximum of the fundamental solution to the fractional diffusion-wave equation. Comput Math Appl 66:774–784
Mainardi F (1994) On the initial-value problem for the fractional diffusion-wave equation. In: Rionero S, Ruggeri T (eds) Waves and stability in continuous media. World Scientific, Singapore
Mainardi F (1996a) Fractional relaxation-oscillation and fractional diffusion-wave phenomena. Chaos Solitons Fractals 7:1461–1477
Mainardi F (1996b) The fundamental solutions for the fractional diffusion-wave equation. Appl Math Lett 9:23–28
Mainardi F, Luchko Yu, Pagnini G (2001) The fundamental solution of the space-time fractional diffusion equation. Fract Calc Appl Anal 4:153–192. E-print http://arxiv.org/abs/cond-mat/0702419
Marichev OI (1983) Handbook of integral transforms of higher transcendental functions, theory and algorithmic tables. Ellis Horwood, Chichester
Matlab File Exchange (2005) Matlab-Code that calculates the Mittag-Leffler function with desired accuracy. Available for download at www.mathworks.com/matlabcentral/fileexchange/8738mittag-leffler-function
Metzler R, Klafter J (2000) The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys Rep 339:1–77
Metzler R, Klafter J (2004) The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics. J Phys A 37:161–208
Metzler R, Nonnenmacher TF (2002) Space- and time-fractional diffusion and wave equations, fractional Fokker-Planck equations, and physical motivation. Chem Phys 284:67–90
Montroll E, Weiss G (1965) Random walks on lattices. J Math Phys 6:167
Näsholm SP, Holm S (2013) On a fractional Zener elastic wave equation. Fract Calc Appl Anal 16:26–50
Podlubny I (1999) Fractional differential equations. Academic, San Diego
Prudnikov AP, Brychkov YA, Marichev OI (1986) Integrals and series. Vol 1: Elementary functions. Gordon and Breach, New York
Samko SG, Kilbas AA, Marichev OI (1993) Fractional integrals and derivatives: theory and applications. Gordon and Breach, Yverdon
Smith RL (1970) The velocities of light. Am J Phys 38:978–984
Szabo TL, Wu J (2000) A model for longitudinal and shear wave propagation in viscoelastic media. J Acoust Soc Am 107:2437–2446
Vladimirov VS (1971) Equations of the mathematical physics. Nauka, Moscow

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_60-3 © Springer-Verlag Berlin Heidelberg 2014

Fractional Diffusion and Wave Propagation

Yuri Luchko
Department of Mathematics, Physics, and Chemistry, Beuth Technical University of Applied Sciences Berlin, Berlin, Germany

Abstract

In this chapter, a short overview of the current research on applications of partial differential equations of an arbitrary (not necessarily integer) order to modeling of anomalous transport processes (diffusion, heat transfer, and wave propagation) in nonhomogeneous media is presented. On the microscopic level, these processes are described by the continuous time random walk (CTRW) model, which is a starting point for the derivation of some deterministic equations for the time- and space-averaged quantities that characterize the transport processes on the macroscopic level. In this work, the deterministic models are derived in the form of partial differential equations of fractional order. In particular, a generalized time-fractional diffusion equation and a time- and space-fractional wave equation are introduced and analyzed in detail. Finally, some open questions and directions for further work are suggested.

1 Introduction

In many geological applications, e.g., modeling of the geothermal energy extraction (Luchko and Punzi 2011), stability and seismicity of the fractal fault systems (Gudehus and Touplikiotis 2012), or propagation of the damped waves (Luchko 2013), one has to deal with transport processes that take place in highly inhomogeneous media and are subject to external and internal forces applied at different time and space scales. This raises the question of how reliable the standard models for the transport processes are in these complex environments. In particular, even though the Fourier and Fick’s laws are still the standard tools for modeling of the transport processes on the macro-level, they often fail to grasp the behavior of systems with anomalous components and phenomena, the so-called anomalous transport processes (see, e.g., Geiger and Emmanuel 2010). Within the last few decades, anomalous transport processes that do not follow the Gaussian statistics have been observed and confirmed in several different application areas in natural sciences, biology, geological sciences, medicine, etc. This stimulated even stronger research activities towards techniques and approaches for their adequate modeling. In this chapter, we consider one powerful approach, namely, the continuous time random walk (CTRW) model combined with the fractional dynamics on the macro-level, and apply it to modeling of anomalous diffusion, heat transport, and wave propagation in heterogeneous media. The models for anomalous transport processes in the form of the time- and/or space-fractional partial differential equations enjoyed a


particular attention and were introduced and analyzed by a number of researchers since the 1980s. In particular, this kind of phenomena is known to occur in inhomogeneous media that combine characteristics of solid-like materials that exhibit wave propagation and fluid-like materials that support diffusion processes. In addition to the physical motivation and setting up of the models, mathematical analysis of the models and an overview of the numerical methods for their solution, some plots, and interpretation of the obtained results are presented in this chapter. In particular, we deal with the generalized time-fractional diffusion equation that of course can be employed to describe the anomalous heat conduction, too. To investigate this equation, a maximum principle well known for the elliptic and parabolic type PDEs is extended to the initial-boundary-value problems for the generalized diffusion equation of the fractional order. Then the Fourier spectral method is applied to obtain solutions to these problems in explicit form. Another important equation that is considered in this chapter is a fractional generalization of the wave equation that describes propagation of damped waves in inhomogeneous media. In contrast to the fractional diffusion-wave equation, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that this feature is a decisive factor for inheriting some crucial characteristics of the wave equation like a constant propagation velocity of both the maximum of its fundamental solution and its gravity and mass centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. The fundamental solution of the fractional wave equation is determined and shown to be a spatial probability density function evolving in time that possesses finite moments up to the order $\alpha$. To illustrate analytical findings, results of numerical calculations and some plots are presented.

2 Continuous Time Random Walk Model

The continuous time random walk (CTRW) model was first introduced in Montroll and Weiss (1965) to model transport processes that show anomalous behavior. The main idea behind a CTRW is first to interpret a transport process on the microlevel as a flow of many parcels. If one assumes that the parcels are independent from each other, then their state and behavior can be described in terms of the probability $P(x,t)\,\Delta V\,\Delta t$ of an individual parcel to be located inside the volume $\Delta V$ within the time interval $\Delta t$. The function $P(x,t)$ is an unknown probability density function (pdf) that satisfies the so-called master equation and is connected with the jump pdf that characterizes the transport process and is supposed to be known. Imposing certain conditions on the jump pdf, the master equation can be transformed to some deterministic differential or integrodifferential equations that the pdf $P(x,t)$ has to satisfy at least on the large time and space scales. In turn, the macro-characteristics of the transport process like the concentration $c(x,t)$ of the substance or its temperature $T(x,t)$ at a certain place $x$ at a certain time instant $t$, averaged over the time interval $\Delta t$, $t \in \Delta t$, and the space volume $\Delta V$, $x \in \Delta V$, are proportional to $P(x,t)$ and consequently governed by the same equations. This means that in the framework of the CTRW model, the key role is played by the pdf $P(x,t)$ that describes a random walk of an individual parcel within the transport process. In what follows, we consider the random walk of an individual parcel and analyze its characteristics. For notational simplicity, we focus on the one-dimensional random walks. The multidimensional version follows the same steps. For more details regarding the CTRW models and their applications for modeling of the anomalous transport processes, see, e.g., Berkowitz et al. (2002), Emmanuel and Berkowitz (2007), Fulger et al. (2008), Gorenflo and Mainardi (2009), Luchko (2012a), and Metzler and Klafter (2004).

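2.1 Brownian Motion

In the framework of the well-known random walk model for the Brownian motion, the random walker jumps at each time step $t = t_0, t_0 + \Delta t, t_0 + 2\Delta t, \ldots$ in a randomly selected direction, thereby covering the distance $\Delta x$, the lattice constant. Denoting by $P(x,t)\,\Delta x$ the probability that the random walker is located between $x$ and $x + \Delta x$ at the time $t$, the formula of total probability easily leads to the master equation

$$P(x, t + \Delta t) = \frac{1}{2} P(x + \Delta x, t) + \frac{1}{2} P(x - \Delta x, t). \tag{1}$$

For the one-dimensional Brownian motion, we substitute the Taylor expansions

$$P(x, t + \Delta t) = P(x,t) + \Delta t\, \frac{\partial P}{\partial t} + O\left((\Delta t)^2\right) \quad \text{as } \Delta t \to 0,$$

$$P(x \pm \Delta x, t) = P(x,t) \pm \Delta x\, \frac{\partial P}{\partial x} + \frac{(\Delta x)^2}{2}\, \frac{\partial^2 P}{\partial x^2} + O\left((\Delta x)^3\right) \quad \text{as } \Delta x \to 0$$

into the master equation (1) and get the formula

$$\frac{\partial P}{\partial t} = \frac{(\Delta x)^2}{2\Delta t}\, \frac{\partial^2 P}{\partial x^2} + O(\Delta t) + \Delta x\, O\left(\frac{(\Delta x)^2}{\Delta t}\right) \quad \text{as } \Delta t \to 0,\ \Delta x \to 0. \tag{2}$$

In the continuum limit $\Delta t \to 0$ and $\Delta x \to 0$, this equation becomes the standard diffusion equation

$$\frac{\partial P}{\partial t} = d_1\, \frac{\partial^2 P}{\partial x^2} \tag{3}$$

under the condition that the diffusion coefficient

$$d_1 = \lim_{\Delta x \to 0,\ \Delta t \to 0} \frac{(\Delta x)^2}{2\Delta t}$$

is finite. Of course, the same procedure leads to the two- or three-dimensional diffusion equations for the two- or three-dimensional Brownian motion, respectively:

$$\frac{\partial P}{\partial t} = d_1\, \Delta P, \qquad \Delta := \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2}, \quad n = 2, 3. \tag{4}$$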

When the random walker is located at the starting point $x = 0$, $x \in \mathbb{R}^n$, $n = 1, 2, 3$, at the time $t = 0$, then the initial condition

$$P(x, 0) = \prod_{i=1}^{n} \delta(x_i), \quad n = 1, 2, 3, \tag{5}$$

with the Dirac $\delta$-function has to be added to the model. The solution

$$P(x,t) = \frac{1}{\sqrt{4\pi d_1 t}} \exp\left(-\frac{x^2}{4 d_1 t}\right) \tag{6}$$

of the one-dimensional diffusion equation (3) with the initial condition (5) can be easily obtained, e.g., with the Laplace and Fourier integral transforms technique. The pdf (6) is a spatial Gaussian distribution at any time point $t > 0$ with the mean value $\mu = 0$ and with the deviation $\sigma = \sqrt{2 d_1 t}$, which means that the mean squared displacement of a parcel that participates in the transport process is given by $\sigma^2(t) = 2 d_1 t$. It is important to mention that the central limit theorem ensures the same behavior of the pdf $P(x,t)$ on the large time and space scales in the case when the waiting time $\Delta t$ is not fixed, but the pdf that describes a distribution of the waiting times between two successive jumps possesses a finite mean value $\Delta t$.
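The passage from the master equation (1) to the Gaussian (6) can be illustrated by iterating (1) on a lattice. The sketch below is an illustration only, with choices made here rather than in the chapter: lattice constant $\Delta x = 1$ and time step $\Delta t = 1$ (so that $d_1 = 1/2$), a unit mass at the origin as the discrete analogue of (5), and hypothetical function names.

```python
import math


def evolve_master_equation(n_steps):
    """Iterate P(x, t+1) = P(x+1, t)/2 + P(x-1, t)/2 on sites -n_steps..n_steps."""
    size = 2 * n_steps + 1
    p = [0.0] * size
    p[n_steps] = 1.0  # walker starts at x = 0
    for _ in range(n_steps):
        new = [0.0] * size
        for i in range(size):
            left = p[i - 1] if i > 0 else 0.0
            right = p[i + 1] if i < size - 1 else 0.0
            new[i] = 0.5 * left + 0.5 * right
        p = new
    return p


def variance(p):
    """Second moment of the lattice distribution about x = 0."""
    center = (len(p) - 1) // 2
    return sum((i - center) ** 2 * q for i, q in enumerate(p))


def gaussian_pdf(x, t, d1):
    """The solution (6) of the diffusion equation (3)."""
    return math.exp(-x * x / (4.0 * d1 * t)) / math.sqrt(4.0 * math.pi * d1 * t)


if __name__ == "__main__":
    n = 200
    p = evolve_master_equation(n)
    print("variance:", variance(p), "expected 2*d1*t =", n)
    # After an even number of steps only even sites are occupied, so the lattice
    # probability at a site corresponds to the density (6) times the spacing 2.
    print("center:", p[n], "Gaussian:", 2.0 * gaussian_pdf(0.0, n, 0.5))
```

The variance grows exactly linearly in the number of steps, in agreement with $\sigma^2(t) = 2 d_1 t$, while the pointwise agreement with (6) is only approximate and improves as the number of steps increases.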

2.2 CTRW Model

In contrast to the random walk model for the Brownian motion, the CTRW model is based on the idea that the lengths of the jumps and the waiting times between two successive jumps are governed by a joint pdf $\psi(x,t)$ that is referred to as the jump pdf. From $\psi(x,t)$, the jump length pdf

$$\lambda(x) = \int_0^\infty \psi(x,t)\, dt \tag{7}$$

and the waiting time pdf

$$w(t) = \int_{-\infty}^{\infty} \psi(x,t)\, dx \tag{8}$$

can be deduced. The main characteristics of the CTRW models are the characteristic waiting time

$$T = \int_0^\infty w(t)\, t\, dt \tag{9}$$

and the jump length variance

$$\Sigma^2 = \int_{-\infty}^{\infty} \lambda(x)\, x^2\, dx. \tag{10}$$

They can be finite or infinite and this makes the difference between the CTRW models. Usually, the following different cases are distinguished:

• Both $T$ and $\Sigma^2$ are finite: Brownian motion (diffusion equation as a deterministic model)
• $T$ diverges, $\Sigma^2$ is finite: Sub-diffusion (time-fractional diffusion equation as a deterministic model)
• $T$ is finite, $\Sigma^2$ diverges: Markovian Lévy flights (space-fractional diffusion equation as a deterministic model)
• Both $T$ and $\Sigma^2$ are infinite: Non-Markovian Lévy flights (time-space-fractional diffusion equation as a deterministic model)

It is known that the master equations for the CTRW model can be formulated in the form of some integral equations of the convolution type (see, e.g., Metzler and Klafter 2000). Below we give a short summary of how to derive these equations. Let us denote by $\eta(x,t)$ the probability of the event that at the time instant $t$ a parcel just arrives at the point $x$. According to the law of total probability, $\eta(x,t)$ satisfies the equation

$$\eta(x,t) = \int_{-\infty}^{+\infty} dx' \int_0^t \eta(x', t')\, \psi(x - x', t - t')\, dt' + P_0(x)\,\delta(t), \tag{11}$$

$\psi(x,t)$ being the jump pdf that is supposed to be known. The pdf $P(x,t)$ that governs the event that at the time instant $t$ the parcel is located at the position $x$ is given by

$$P(x,t) = \int_0^t \eta(x, t')\, \Psi(t - t')\, dt', \tag{12}$$

where

$$\Psi(t) = 1 - \int_0^t w(t')\, dt' \tag{13}$$

is assigned to the probability of no jump event within the time interval Œ0; t and w.t / is the waiting time pdf. The integral equations (11)–(13) determine the one-point probability density function that is an important part of the mathematical model for the CTRW but of course not enough to fully characterize the underlying stochastic process (see, e.g., Germano et al. 2009 for more details). Let us now transform Eqs. (11)–(13) into the frequency domain by applying the Fourier and the Laplace transforms. Applying the well-known convolution theorems for the Fourier and the Laplace transforms and solving the transformed equations for the unknown Fourier and Laplace transformed pdf POQ . ; s/, we get the formula PO0 . / 1  w.s/ Q ; POQ . ; s/ D s 1  OQ . ; s/

(14)

where PO0 . / denotes the Fourier transform of the initial condition P0 .x/ WD P .x; 0/. It is worth to mention that a purely probabilistic proof of Eq. (14) is given in Germano et al. (2009). We remind the readers that the Fourier transform is defined by


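$$\hat f(\kappa) = \mathcal{F}\{f(x);\kappa\} = \int_{-\infty}^{+\infty} e^{+i\kappa x} f(x)\,dx, \quad \kappa \in \mathbb{R},$$

and the Laplace transform by

$$\tilde f(s) = \mathcal{L}\{f(t);s\} = \int_0^{\infty} e^{-st} f(t)\,dt, \quad s \in \mathbb{C}.$$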
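As a small numerical illustration of Eq. (14), the following Python sketch evaluates the Fourier-Laplace transform of the pdf under the assumption that the jump pdf factorizes into a jump length pdf and a waiting time pdf, so that its transform is a product of a Fourier and a Laplace transform. The function names, the exponential waiting time pdf, the Gaussian jump length pdf, and the parameter values are illustrative assumptions and are not prescribed by the text.

```python
import math

def montroll_weiss(kappa, s, w_laplace, lam_fourier, p0_fourier=lambda kappa: 1.0):
    """Right-hand side of Eq. (14) for a factorized jump pdf psi(x, t) = lam(x) * w(t).

    The factorized form of psi is an assumption of this sketch.
    """
    w = w_laplace(s)
    return (1.0 - w) / s * p0_fourier(kappa) / (1.0 - w * lam_fourier(kappa))

# Assumed example data: exponential waiting times with mean TAU and
# Gaussian jump lengths with variance 2 * SIGMA**2.
TAU, SIGMA = 1.0, 0.5

def w_exponential(s):
    """Laplace transform of the pdf exp(-t / TAU) / TAU."""
    return 1.0 / (1.0 + TAU * s)

def lam_gaussian(kappa):
    """Fourier transform of the Gaussian pdf with variance 2 * SIGMA**2."""
    return math.exp(-(SIGMA * kappa) ** 2)

def diffusion_transform(kappa, s):
    """Fourier-Laplace transform of the Gaussian propagator with D = SIGMA**2 / TAU."""
    return 1.0 / (s + SIGMA ** 2 / TAU * kappa ** 2)

if __name__ == "__main__":
    for kappa, s in [(0.0, 0.3), (1e-2, 1e-3), (1.0, 1.0)]:
        print(kappa, s,
              montroll_weiss(kappa, s, w_exponential, lam_gaussian),
              diffusion_transform(kappa, s))
```

At $\kappa = 0$ the formula returns $1/s$, which is the Laplace transform of the total probability 1. For small $\kappa$ and $s$ the two printed values should nearly coincide, whereas for $\kappa$ and $s$ of order one they differ, since the diffusive form is only an asymptotic statement.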

2.3 Advection-Diffusion Equation

Until now, no particular assumptions about the densities $\lambda(x)$ and $w(t)$, except for their integrability, have been made. In what follows, we assume for simplicity that the jump lengths and waiting times are independent random variables, so that the jump pdf $\psi(x,t)$ can be written in the decoupled form $\psi(x,t) = \lambda(x)\,w(t)$. In this case the equation $\hat{\tilde\psi}(\kappa,s) = \hat\lambda(\kappa)\,\tilde w(s)$ holds true. Straightforward calculations show that if $w(t)$ possesses a finite mean and $\lambda(x)$ a finite variance, then Eq. (14) can be transformed to the standard advection-diffusion equation for large $t$ and $|x|$ and therefore describes the Brownian motion on large time and space scales. Indeed, in this case the Fourier transform of $\lambda(x)$ has the asymptotic behavior

$$\hat\lambda(\kappa) \sim 1 - iM_1\kappa - \frac{M_2}{2}\kappa^2 \quad \text{as } \kappa \to 0, \tag{15}$$

where $M_1$ and $M_2$ are the first and the second moments of the jump length pdf $\lambda(x)$, respectively. Substituting this expression into (14), we get

$$\hat{\tilde P}(\kappa,s) = \frac{1-\tilde w(s)}{s}\,\frac{\hat P_0(\kappa)}{1-\tilde w(s)\left(1 - iM_1\kappa - \frac{M_2}{2}\kappa^2\right)} \quad \text{as } \kappa \to 0,$$

which is equivalent to

$$s\,\hat{\tilde P}(\kappa,s) - \hat P_0(\kappa) = \frac{s\,\tilde w(s)}{1-\tilde w(s)}\left(-iM_1\kappa - \frac{M_2}{2}\kappa^2\right)\hat{\tilde P}(\kappa,s) \quad \text{as } \kappa \to 0.$$

Applying the inverse Fourier transform and making use of the differentiation theorem for the Fourier transform,

$$\mathcal{F}\left\{\frac{\partial}{\partial x} f(x,t);\kappa\right\} = -i\kappa\,\mathcal{F}\{f(x,t);\kappa\}, \quad \mathcal{F}\left\{\frac{\partial^2}{\partial x^2} f(x,t);\kappa\right\} = -\kappa^2\,\mathcal{F}\{f(x,t);\kappa\},$$

one gets

$$s\,\tilde P(x,s) - P_0(x) = \frac{\tau s\,\tilde w(s)}{1-\tilde w(s)}\left(v\,\frac{\partial}{\partial x}\tilde P(x,s) + d_1\,\frac{\partial^2}{\partial x^2}\tilde P(x,s)\right) \quad \text{as } |x| \to \infty, \tag{16}$$

where $v = M_1/\tau$ and $d_1 = M_2/(2\tau)$, with $\tau$ the mean waiting time, can be interpreted as the velocity and the diffusion coefficient, respectively. When the waiting time pdf $w(t)$ possesses a finite mean (for an in-depth treatment of this problem, we refer to Emmanuel and Berkowitz (2007) and Geiger and Emmanuel (2010), where some methods for the determination of the mean value of $w(t)$ in the case of heat transport in porous media were discussed), the asymptotics of its Laplace transform $\tilde w$ can be represented in the form

$$\tilde w(s) \sim 1 - \tau s \quad \text{as } s \to 0, \tag{17}$$

where $\tau$ denotes the first moment of the pdf $w(t)$. Substituting (17) into Eq. (16) and applying the inverse Laplace transform and the differentiation theorem for the Laplace transform, $\mathcal{L}\left\{\frac{\partial}{\partial t} f(x,t);s\right\} = s\,\mathcal{L}\{f(x,t);s\} - f(x,0)$, we obtain an initial-value problem $P(x,0) = P_0(x)$ for the standard advection-diffusion equation

$$\frac{\partial}{\partial t}P(x,t) = v\,\frac{\partial}{\partial x}P(x,t) + d_1\,\frac{\partial^2}{\partial x^2}P(x,t) \quad \text{as } t \to \infty,\ |x| \to \infty \tag{18}$$

on large time and space scales.
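The Brownian limit can also be observed directly by simulating the random walk. The following Python sketch is a minimal Monte Carlo experiment under assumed ingredients that are not taken from the text: exponentially distributed waiting times with mean `tau`, zero-mean Gaussian jumps with standard deviation `jump_std` (so that $M_1 = 0$ and $M_2$ equals `jump_std**2`), and a fixed random seed. Since about $t/\tau$ jumps of variance $M_2$ occur up to time $t$, the estimated mean squared displacement should be close to $M_2\,t/\tau$ and grow linearly in time.

```python
import random

def ctrw_position(t_end, tau, jump_std, rng):
    """Position at time t_end of one CTRW walker that starts at x = 0."""
    x = 0.0
    t = rng.expovariate(1.0 / tau)          # time of the first jump
    while t <= t_end:
        x += rng.gauss(0.0, jump_std)       # zero-mean jump, so M1 = 0
        t += rng.expovariate(1.0 / tau)     # waiting time until the next jump
    return x

def mean_squared_displacement(t_end, tau, jump_std, walkers=4000, seed=0):
    """Monte Carlo estimate of the mean squared displacement at time t_end."""
    rng = random.Random(seed)
    total = sum(ctrw_position(t_end, tau, jump_std, rng) ** 2
                for _ in range(walkers))
    return total / walkers

if __name__ == "__main__":
    tau, jump_std = 1.0, 1.0
    for t_end in (10.0, 20.0, 40.0):
        estimate = mean_squared_displacement(t_end, tau, jump_std)
        print(t_end, estimate, jump_std ** 2 * t_end / tau)
```

The estimate carries a statistical error of a few percent for the default number of walkers, so only approximate agreement with the linear law should be expected.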

2.4 Fractional Diffusion-Wave Equations

With the CTRW model, it is possible to go beyond this standard framework and to explore other kinds of transport processes, including anomalous transport. Let us first assume that the mean value of the waiting time pdf $w(t)$ is not finite. As an example, a particular long-tailed waiting time pdf with the asymptotic behavior

$$w(t) \sim A_\alpha\,(\tau/t)^{1+\alpha}, \quad t \to +\infty,\ 0 < \alpha < 1 \tag{19}$$

is considered. Its asymptotics in the Laplace domain can be easily determined by the so-called Tauberian theorem and is as follows:

$$\tilde w(s) \sim 1 - (s\tau)^{\alpha}, \quad s \to 0.$$

It is important to mention that the specific form of $w(t)$ is of minor importance. In particular, the so-called Mittag-Leffler waiting time pdf

$$w(t) = -\frac{d}{dt}E_\alpha(-t^{\alpha}), \quad E_\alpha(x) := \sum_{k=0}^{\infty}\frac{x^k}{\Gamma(\alpha k+1)}$$

can be taken without loss of generality. The Laplace transform of the Mittag-Leffler pdf can be evaluated in explicit form,

$$\tilde w(s) = \frac{1}{1+s^{\alpha}},$$

and has the desired asymptotics. Together with the Gaussian jump length pdf

$$\lambda(x) = (4\pi\sigma^2)^{-1/2}\exp\left(-x^2/(4\sigma^2)\right), \quad \Sigma^2 = 2\sigma^2,$$

with the Fourier transform in the form $\hat\lambda(\kappa) \sim 1 - \sigma^2\kappa^2$, $\kappa \to 0$, the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ becomes

$$\hat{\tilde P}(\kappa,s) \sim \frac{\hat P_0(\kappa)/s}{1 + d_\alpha s^{-\alpha}\kappa^2}, \quad s \to 0,\ \kappa \to 0. \tag{20}$$

Using the Tauberian theorems for the Laplace and Fourier transforms, the last equation can be transformed for large $t$ and $|x|$ to a time-fractional partial differential equation. Namely, after multiplication with the denominator of the right-hand side, Eq. (20) becomes

$$\left(1 + d_\alpha s^{-\alpha}\kappa^2\right)\hat{\tilde P}(\kappa,s) \sim \hat P_0(\kappa)/s, \quad s \to 0,\ \kappa \to 0. \tag{21}$$
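The Mittag-Leffler function that enters the waiting time pdf above can be evaluated for moderate arguments by summing its defining power series. The following Python sketch does this with the standard library only; the function names, the tolerance, and the term limit are assumptions of this sketch. The series suffers from cancellation for large negative arguments, so this is not a general-purpose algorithm. The second function uses the fact that the probability of no jump in $[0,t]$ for the Mittag-Leffler waiting time pdf is $E_\alpha(-t^{\alpha})$.

```python
import math

def mittag_leffler(alpha, z, tol=1e-16, max_terms=1000):
    """Sum the power series of E_alpha(z); intended for moderate |z| only."""
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 1.0                                   # term with k = 0
    for k in range(1, max_terms):
        # |z|**k / Gamma(alpha*k + 1), computed via logarithms to avoid overflow.
        magnitude = math.exp(k * log_abs_z - math.lgamma(alpha * k + 1.0))
        total += magnitude if (z > 0.0 or k % 2 == 0) else -magnitude
        if magnitude < tol:                       # remaining terms are negligible
            break
    return total

def survival_probability(t, alpha):
    """E_alpha(-t**alpha): probability of no jump in [0, t]."""
    return mittag_leffler(alpha, -t ** alpha)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, survival_probability(t, 0.7), math.exp(-t))
```

For $\alpha = 1$ the function reduces to the exponential, which gives a simple consistency check; for $0 < \alpha < 1$ the survival probability decays more slowly than the exponential at large times, in line with the long-tailed asymptotics (19).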

Making use of the differentiation theorem for the Fourier transform and employing the integration rule $\mathcal{L}\{(I^{\alpha}f)(t);s\} = s^{-\alpha}\tilde f(s)$ for the Riemann-Liouville fractional integral $I^{\alpha}$ defined by

$$(I^{\alpha}f)(t) := \frac{1}{\Gamma(\alpha)}\int_0^t f(\tau)\,(t-\tau)^{\alpha-1}\,d\tau, \quad \alpha > 0, \qquad (I^{0}f)(t) = f(t), \tag{22}$$

Eq. (21) can be rewritten in the form of the fractional integral equation

$$P(x,t) - P_0(x) = d_\alpha\left(I^{\alpha}\,\frac{\partial^2}{\partial x^2}P(x,\tau)\right)(t) \tag{23}$$

for large $t$ and $|x|$. Application of the fractional differential operator $D_t^{\alpha}$ to Eq. (23) transforms it for large $t$ and $|x|$ to the initial-value problem $P(x,0) = P_0(x)$ for the so-called time-fractional diffusion equation

$$(D_t^{\alpha}P)(t) = d_\alpha\,\frac{\partial^2 P}{\partial x^2}, \quad 0 < \alpha < 1. \tag{24}$$
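The Riemann-Liouville fractional integral (22) has a weakly singular kernel, which a naive quadrature rule handles poorly. The following Python sketch uses a product midpoint rule, in which the kernel is integrated exactly on every subinterval and only $f$ is approximated. The function name, the number of subintervals, and the choice of rule are assumptions of this sketch and are not taken from the cited literature. The result can be checked against the power rule $I^{\alpha}t^{\mu} = \frac{\Gamma(\mu+1)}{\Gamma(\mu+\alpha+1)}\,t^{\mu+\alpha}$.

```python
import math

def rl_integral(f, t, alpha, n=2000):
    """Approximate (I^alpha f)(t) from Eq. (22) with a product midpoint rule."""
    h = t / n
    total = 0.0
    for j in range(n):
        a = j * h
        b = t if j == n - 1 else (j + 1) * h   # hit the endpoint exactly
        # Exact integral of the kernel (t - tau)**(alpha - 1) over [a, b].
        weight = ((t - a) ** alpha - (t - b) ** alpha) / alpha
        total += f(0.5 * (a + b)) * weight
    return total / math.gamma(alpha)

if __name__ == "__main__":
    alpha, t = 0.5, 2.0
    exact = math.gamma(2.0) / math.gamma(2.0 + alpha) * t ** (1.0 + alpha)
    print(rl_integral(lambda tau: tau, t, alpha), exact)
```

Treating the kernel analytically keeps the singularity at $\tau = t$ out of the numerical part, at the price of a convergence order below two near the endpoint.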

In what follows, the fractional derivative $D_t^{\alpha}$ ($n-1 < \alpha \le n$, $n \in \mathbb{N}$) is taken in the Caputo sense:

$$(D_t^{\alpha}f)(t) := \left(I^{n-\alpha}f^{(n)}\right)(t), \tag{25}$$

$I^{\alpha}$ being the Riemann-Liouville fractional integral (22). For the theory and applications of this fractional derivative and for other forms of the fractional derivatives, we refer the readers to Samko et al. (1993) and Diethelm (2010). It is worth mentioning that the integro-differential nature of the fractional differential operator in Eq. (24) ensures the non-Markovian nature of the subdiffusive process we are dealing with. Indeed, calculating the Laplace transform of the mean squared displacement via the relation

$$\widetilde{\overline{x^2}}(s) = \lim_{\kappa \to 0}\left(-\frac{d^2}{d\kappa^2}\hat{\tilde P}(\kappa,s)\right)$$

and applying the inverse Laplace transform, the formula

$$\overline{x^2}(t) = \frac{2d_\alpha}{\Gamma(1+\alpha)}\,t^{\alpha}$$

for the mean squared displacement in time is obtained. As we see, in contrast to the case of the Brownian motion, the mean squared displacement does not depend linearly on the time $t$ but is a power function with the exponent $\alpha$. The mathematical theory of the general time-fractional diffusion equation is presented in Sect. 3.

Now we discuss the case when the characteristic waiting time $T$ is finite but the jump length variance $\Sigma^2$ is infinite. Again, the specific form of the pdf $\lambda(x)$ is of minor importance, so that without loss of generality we can, e.g., consider one of the Lévy-stable pdfs with the Fourier transform given by the formula

$$\hat\lambda(\kappa) = \exp\left(-\sigma^{\beta}|\kappa|^{\beta}\right) \sim 1 - \sigma^{\beta}|\kappa|^{\beta}, \quad 1 < \beta < 2,\ |\kappa| \to 0. \tag{26}$$

In the spatial domain, we get the asymptotic formula $\lambda(x) \sim A_\beta\,\sigma^{-\beta}|x|^{-1-\beta}$, $|x| \to \infty$, which shows the "long tails" of the pdf $\lambda(x)$. For the Poissonian waiting time pdf $w(t) = \tau^{-1}\exp(-t/\tau)$, $\tau > 0$, with the Laplace transform of the form $\tilde w(s) \sim 1 - s\tau$, $s \to 0$, the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ can be written in the form

$$\hat{\tilde P}(\kappa,s) \sim \frac{1}{s + c_\beta|\kappa|^{\beta}}, \quad s \to 0,\ |\kappa| \to 0. \tag{27}$$


By inverting the Laplace and Fourier transforms in Eq. (27), an initial-value problem $P(x,0) = P_0(x)$ for the space-fractional diffusion equation

$$\frac{\partial P}{\partial t} = c_\beta\,R_x^{\beta}P(x,t) \tag{28}$$

is obtained for large $t$ and $|x|$, where $R_x^{\beta}$ is the Riesz fractional derivative defined for a sufficiently well-behaved function $f$ and $0 < \beta \le 2$ as a pseudo-differential operator with the symbol $-|\kappa|^{\beta}$ (see, e.g., Samko et al. 1993; Mainardi et al. 2001):

$$\mathcal{F}\left\{\left(R_x^{\beta}f\right)(x);\kappa\right\} = -|\kappa|^{\beta}\,\mathcal{F}\{f(x);\kappa\}. \tag{29}$$

In the one-dimensional case, the Riesz fractional derivative (29) with $0 < \beta < 2$ can be represented as a hypersingular integral (see Samko et al. 1993 for the case $\beta \ne 1$ and Gorenflo and Mainardi 2001 for the general case):

$$\left(R_x^{\beta}f\right)(x) = \frac{\Gamma(1+\beta)}{\pi}\,\sin(\beta\pi/2)\int_0^{\infty}\frac{f(x+\xi) - 2f(x) + f(x-\xi)}{\xi^{\beta+1}}\,d\xi. \tag{30}$$
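The integral (30) can be evaluated numerically for smooth, rapidly decaying functions. The following Python sketch splits the integration range into three parts: a small neighborhood of zero, where the second difference is replaced by its leading Taylor term, a middle part treated with the midpoint rule in a logarithmic variable, and a tail, where $f(x \pm \xi)$ is assumed to vanish. The function name, the splitting parameters `eps` and `cutoff`, and the quadrature are assumptions of this sketch. For the Gaussian $f(x) = e^{-x^2}$, the symbol in (29) gives the closed form $(R_x^{\beta}f)(0) = -2^{\beta}\,\Gamma((\beta+1)/2)/\sqrt{\pi}$, which serves as a check.

```python
import math

def riesz_derivative(f, x, beta, eps=1e-3, cutoff=50.0, n=4000):
    """Approximate (R_x^beta f)(x), 0 < beta < 2, from the integral (30).

    Assumes f is smooth and f(x +/- xi) is negligible for xi >= cutoff.
    """
    def second_diff(xi):
        return f(x + xi) - 2.0 * f(x) + f(x - xi)

    # (0, eps): second_diff(xi) ~ c * xi**2 with c estimated at xi = eps;
    # the integral of c * xi**(1 - beta) is then taken analytically.
    near = second_diff(eps) / eps ** 2 * eps ** (2.0 - beta) / (2.0 - beta)

    # (eps, cutoff): midpoint rule in y = log(xi), using d(xi) = xi * dy.
    y0, y1 = math.log(eps), math.log(cutoff)
    h = (y1 - y0) / n
    middle = 0.0
    for j in range(n):
        xi = math.exp(y0 + (j + 0.5) * h)
        middle += second_diff(xi) / xi ** beta
    middle *= h

    # (cutoff, infinity): second_diff(xi) ~ -2 f(x), integrated analytically.
    far = -2.0 * f(x) * cutoff ** (-beta) / beta

    prefactor = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0) / math.pi
    return prefactor * (near + middle + far)

if __name__ == "__main__":
    beta = 1.5
    approx = riesz_derivative(lambda x: math.exp(-x * x), 0.0, beta)
    exact = -2.0 ** beta * math.gamma((beta + 1.0) / 2.0) / math.sqrt(math.pi)
    print(approx, exact)
```

The logarithmic variable concentrates the nodes near $\xi = 0$, where the integrand of (30) varies most strongly.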

For $\beta = 1$, the relation (30) can be interpreted in terms of the Hilbert transform

$$\left(R_x^{1}f\right)(x) = -\frac{1}{\pi}\,\frac{d}{dx}\int_{-\infty}^{+\infty}\frac{f(\xi)}{x-\xi}\,d\xi, \tag{31}$$

where the integral is understood in the sense of the Cauchy principal value, as first noted in Feller (1952) and then revisited and stated more precisely in Gorenflo and Mainardi (2001).

In the case when both $T$ and $\Sigma^2$ diverge, we employ in the CTRW model, e.g., the long-tailed pdf (19) as the waiting time pdf $w(t)$ and the Lévy-stable pdf (26) as the jump length pdf $\lambda(x)$. Proceeding in the same way as above, an initial-value problem $P(x,0) = P_0(x)$ for the time- and space-fractional diffusion equation

$$(D_t^{\alpha}P)(t) = s_{\alpha,\beta}\left(R_x^{\beta}P\right)(x) \tag{32}$$

with the Caputo fractional derivative $D_t^{\alpha}$ in time and the Riesz fractional derivative $R_x^{\beta}$ in space is deduced from the CTRW model on large time and space scales. For the mathematical and numerical analysis of the time- and space-fractional diffusion equations of type (32), we refer to Mainardi et al. (2001), where an even more general equation with the Riesz-Feller space-fractional derivative was investigated in detail in the one-dimensional case, and to Hanyga (2002), where the multidimensional time-space fractional diffusion equations were treated.

Let us now mention one important particular case of the time- and space-fractional diffusion equation that is referred to as the neutral-fractional diffusion equation (Mainardi et al. 2001; Metzler and Nonnenmacher 2002) or the fractional wave equation (Luchko 2013). This equation is obtained from (32) when we set $\alpha = \beta$, i.e., when the orders of the fractional derivatives in time and in space are the same. From the viewpoint of the CTRW model, this condition means that the asymptotics of the waiting time pdf $w(t)$ and the jump length pdf $\lambda(x)$ are the same on large time and space scales. In other words, the waiting times and jump lengths are adapted to each other in such a way that the corresponding CTRW model describes anomalous wave propagation rather than anomalous diffusion. The mathematical theory of the fractional wave equation is presented in Sect. 4.1.

Further important models that generalize the well-known conventional transport equations are the fractional diffusion-advection equation (anomalous diffusion with an additional velocity field) and the fractional Fokker-Planck equation (anomalous diffusion in the presence of an external field). Of course, as in the conventional case, multidimensional generalizations, fractional-order equations with nonconstant coefficients, and nonlinear fractional differential equations appear in the corresponding models and are worth investigating.

3 Time-Fractional Diffusion-Wave Equations

3.1 Generalized Time-Fractional Diffusion Equation

Motivated by the models derived in the previous section, we consider in this subsection the $n$-dimensional generalized time-fractional diffusion equation (GTFDE)

$$(D_t^{\alpha}u)(t) = -L(u) + F(x,t), \quad 0 < \alpha \le 1, \tag{33}$$

where $u = u(x,t)$, $(x,t) \in \Omega_T := G \times (0,T)$, $G \subset \mathbb{R}^n$, is the unknown function, the operator $L$ is given by

$$L(u) = -\operatorname{div}(p(x)\operatorname{grad}u) + q(x)\,u$$

with the coefficients $p$ and $q$ that satisfy the conditions

$$p \in C^1(\bar G), \quad q \in C(\bar G), \quad p(x) > 0,\ q(x) \ge 0,\ x \in \bar G, \tag{34}$$

the fractional derivative $D_t^{\alpha}$ is defined in the Caputo sense (see (25)), and the domain $G$ with the boundary $S$ is open and bounded in $\mathbb{R}^n$. The operator $-L$ is a linear elliptic differential operator of the second order,

$$-L(u) = \sum_{k=1}^{n}\left(p(x)\,\frac{\partial^2 u}{\partial x_k^2} + \frac{\partial p}{\partial x_k}\,\frac{\partial u}{\partial x_k}\right) - q(x)\,u,$$

that can be represented in the form

$$-L(u) = p(x)\,\Delta u + (\operatorname{grad}p, \operatorname{grad}u) - q(x)\,u, \tag{35}$$

$\Delta$ being the Laplace operator. For $\alpha = 1$, Eq. (33) reduces to a linear second-order parabolic PDE. The theory of this equation is well known, so that the main focus in this section is on the case $0 < \alpha < 1$.

Of course, in applications, the transport processes that we model with the time-fractional diffusion equations take place in some bounded domains, so that mainly the initial-boundary-value problems for these equations are worth studying from the viewpoint of applications. In this section, the initial-boundary-value problem

$$u\big|_{t=0} = u_0(x), \quad x \in \bar G, \tag{36}$$

$$u\big|_{S} = v(x,t), \quad (x,t) \in S \times [0,T] \tag{37}$$

for Eq. (33) is considered. A solution to the problem (33), (36), (37) is a function $u = u(x,t)$ defined in the domain $\bar\Omega_T := \bar G \times [0,T]$ that belongs to the space $C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$ and satisfies both Eq. (33) and the initial and boundary conditions (36)–(37). By $W^1((0,T])$, the space of the functions $f \in C^1((0,T])$ such that $f' \in L((0,T))$ is denoted. If the problem (33), (36), (37) possesses a solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\Omega_T)$, $C(\bar G)$, and $C(S \times [0,T])$, respectively. In the further discussions, these inclusions are always supposed to be valid. The presentation of the results in this section follows Luchko (2009a, b, 2010, 2011a, b, 2012a, b), and the readers are advised to consult these papers for the proofs and more details.

3.1.1 Uniqueness of the Solution

First, we investigate uniqueness of the solution to the problem (33), (36), (37). The main component of the uniqueness proof is an appropriate maximum principle for Eq. (33). In its turn, the proof of the maximum principle uses an extremum principle for the Caputo fractional derivative that is formulated in the following theorem.

Theorem 1 (Luchko 2009b). Let a function $f \in W^1((0,T]) \cap C([0,T])$ attain its maximum over the interval $[0,T]$ at the point $\tau = t_0$, $t_0 \in (0,T]$. Then the Caputo fractional derivative of the function $f$ is nonnegative at the point $t_0$ for any $\alpha$, $0 < \alpha < 1$:

$$(D_t^{\alpha}f)(t_0) \ge 0, \quad 0 < \alpha < 1. \tag{38}$$

Let us mention that recently a stronger estimate for the Caputo derivative of a function $f \in C^1[0,1]$ at the maximum point $t_0$ was proved in Al-Refai (2012), namely,

$$(D_t^{\alpha}f)(t_0) \ge \frac{t_0^{-\alpha}}{\Gamma(1-\alpha)}\left(f(t_0) - f(0)\right) \ge 0, \quad 0 < \alpha < 1.$$
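Both estimates can be checked numerically for a concrete function. The following Python sketch approximates the Caputo derivative (25) of order $0 < \alpha < 1$ with the widely used L1 discretization, in which $f$ is replaced by its piecewise linear interpolant on a uniform grid. The discretization, the function names, and the test function $\sin(\pi t/2)$, which attains its maximum over $[0,1]$ at $t_0 = 1$, are assumptions of this sketch and not the method of the cited papers.

```python
import math

def caputo_l1(f, t, alpha, n=2000):
    """L1 approximation of the Caputo derivative (25), 0 < alpha < 1, at time t."""
    h = t / n
    total = 0.0
    for j in range(n):
        k = n - 1 - j
        # Weight of the increment f(t_{j+1}) - f(t_j) in the L1 scheme.
        b = (k + 1) ** (1.0 - alpha) - k ** (1.0 - alpha)
        total += b * (f((j + 1) * h) - f(j * h))
    return total * h ** (-alpha) / math.gamma(2.0 - alpha)

def al_refai_bound(f, t0, alpha):
    """Lower bound t0**(-alpha) / Gamma(1 - alpha) * (f(t0) - f(0))."""
    return t0 ** (-alpha) / math.gamma(1.0 - alpha) * (f(t0) - f(0.0))

if __name__ == "__main__":
    f = lambda t: math.sin(0.5 * math.pi * t)    # maximum over [0, 1] at t0 = 1
    for alpha in (0.2, 0.5, 0.8):
        print(alpha, caputo_l1(f, 1.0, alpha), al_refai_bound(f, 1.0, alpha))
```

For every order the computed derivative at the maximum point should be nonnegative and should not fall below the stronger lower bound. The L1 scheme reproduces the Caputo derivative of a linear function up to rounding, which gives an independent check of the implementation.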

The extremum principle for the Caputo fractional derivative is used to prove a maximum principle for the generalized time-fractional diffusion equation (33) that is formulated in the same way as the one for parabolic PDEs.

Theorem 2 (Luchko 2009b). Let a function $u \in C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$ be a solution of the generalized time-fractional diffusion equation (33) and $F(x,t) \le 0$, $(x,t) \in \Omega_T$. Then either $u(x,t) \le 0$ for all $(x,t) \in \bar\Omega_T$, or the function $u$ attains its positive maximum on the part $S_G^T := (\bar G \times \{0\}) \cup (S \times [0,T])$ of the boundary of the domain $\Omega_T$, i.e.,

$$u(x,t) \le \max\left\{0,\ \max_{(x,t) \in S_G^T} u(x,t)\right\}, \quad \forall (x,t) \in \bar\Omega_T. \tag{39}$$

Similarly to the case of the partial differential equations of parabolic type ($\alpha = 1$), an appropriate minimum principle is valid, too. The maximum and minimum principles can be applied to show that the problem (33), (36)–(37) possesses at most one solution and that this solution – if it exists – continuously depends on the data given in the problem.

Theorem 3 (Luchko 2009b). The initial-boundary-value problem (36)–(37) for the GTFDE (33) possesses at most one solution. This solution continuously depends on the data given in the problem in the sense that if

$$\|F - \tilde F\|_{C(\bar\Omega_T)} \le \epsilon, \quad \|u_0 - \tilde u_0\|_{C(\bar G)} \le \epsilon_0, \quad \|v - \tilde v\|_{C(S \times [0,T])} \le \epsilon_1,$$

and $u$ and $\tilde u$ are the solutions to the problem (33), (36)–(37) with the source functions $F$ and $\tilde F$, the initial conditions $u_0$ and $\tilde u_0$, and the boundary conditions $v$ and $\tilde v$, respectively, then the norm estimate

$$\|u - \tilde u\|_{C(\bar\Omega_T)} \le \max\{\epsilon_0, \epsilon_1\} + \frac{T^{\alpha}}{\Gamma(1+\alpha)}\,\epsilon \tag{40}$$

for the solutions $u$ and $\tilde u$ holds true. Because the problem under consideration is linear, the uniqueness of the solution immediately follows from the fact that the homogeneous problem (33), (36)–(37), i.e., the problem with $F \equiv 0$, $u_0 \equiv 0$, and $v \equiv 0$, has only one solution, namely, $u(x,t) \equiv 0$, $(x,t) \in \bar\Omega_T$.

3.1.2 Existence of the Solution

To tackle the existence problem, the notion of a generalized solution is first introduced following Vladimirov (1971), where the case $\alpha = 1$ was considered.

Definition 1 (Luchko 2010). Let $F_k \in C(\bar\Omega_T)$, $u_{0k} \in C(\bar G)$, and $v_k \in C(S \times [0,T])$, $k = 1, 2, \ldots$, be sequences of functions that satisfy the following conditions:

(1) There exist functions $F$, $u_0$, and $v$ such that

$$\|F_k - F\|_{C(\bar\Omega_T)} \to 0 \quad \text{as } k \to \infty, \tag{41}$$

$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \quad \text{as } k \to \infty, \tag{42}$$

$$\|v_k - v\|_{C(S \times [0,T])} \to 0 \quad \text{as } k \to \infty. \tag{43}$$

(2) For any $k = 1, 2, \ldots$, there exists a solution $u_k$ of the initial-boundary-value problem

$$u_k\big|_{t=0} = u_{0k}(x), \quad x \in \bar G, \tag{44}$$

$$u_k\big|_{S} = v_k(x,t), \quad (x,t) \in S \times [0,T] \tag{45}$$

for the generalized time-fractional diffusion equation

$$(D_t^{\alpha}u_k)(t) = -L(u_k) + F_k(x,t). \tag{46}$$

Suppose there exists a function $u \in C(\bar\Omega_T)$ such that

$$\|u_k - u\|_{C(\bar\Omega_T)} \to 0 \quad \text{as } k \to \infty. \tag{47}$$

The function $u$ is called a generalized solution of the problem (33), (36)–(37).

The generalized solution of the problem (33), (36)–(37) is a continuous function, not a generalized one. Still, the generalized solution is not required to belong to the functional space $C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$, to which the solution has to belong. It follows from Definition 1 that if the problem (33), (36)–(37) possesses a solution, then this solution is a generalized solution of the problem, too. In this sense, Definition 1 extends the notion of the solution of the problem (33), (36)–(37). This extension is needed to get some existence results. But of course one does not want to lose the uniqueness of the solution.

Let us now discuss some properties of the generalized solution, including its uniqueness. If the problem (33), (36), (37) possesses a generalized solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\bar\Omega_T)$, $C(\bar G)$, and $C(S \times [0,T])$, respectively. In the further discussions, these inclusions are always supposed to be valid.

First, we show that the sequence $u_k$, $k = 1, 2, \ldots$, defined by the relations (41)–(46) of Definition 1 is always uniformly convergent in $\bar\Omega_T$, i.e., there always exists a function $u \in C(\bar\Omega_T)$ that satisfies the property (47). Indeed, applying the estimate (40) from Theorem 3 to the functions $u_k$ and $u_p$ that are solutions of the corresponding initial-boundary-value problems (44) and (45) for Eq. (46), one gets the inequality

$$\|u_k - u_p\|_{C(\bar\Omega_T)} \le \max\left\{\|u_{0k} - u_{0p}\|_{C(\bar G)},\ \|v_k - v_p\|_{C(S \times [0,T])}\right\} + \frac{T^{\alpha}}{\Gamma(1+\alpha)}\,\|F_k - F_p\|_{C(\bar\Omega_T)},$$

that, together with the relations (41)–(43), means that $u_k$, $k = 1, 2, \ldots$, is a Cauchy sequence in $C(\bar\Omega_T)$ that converges to a function $u \in C(\bar\Omega_T)$. Moreover, the following important uniqueness theorem holds true.

Theorem 4 (Luchko 2010). The problem (33), (36)–(37) possesses at most one generalized solution in the sense of Definition 1. The generalized solution – if it exists – continuously depends on the data given in the problem in the sense of the estimate (40).

In contrast to the situation with the solution to the problem (33), (36)–(37), existence of the generalized solution can be shown under some standard restrictions on the problem data and the boundary $S$ of the domain $G$. In this section, existence of the solution of the problem

$$(D_t^{\alpha}u)(t) = -L(u), \tag{48}$$

$$u\big|_{t=0} = u_0(x), \quad x \in \bar G, \tag{49}$$

$$u\big|_{S} = 0, \quad (x,t) \in S \times [0,T] \tag{50}$$

is considered to demonstrate the technique that can be used with the appropriate standard modifications in the general case, too. The generalized solution of the problem (48)–(50) can be constructed in an analytical form by using the Fourier method of separation of variables. Let us look for a particular solution $u$ of Eq. (48) in the form

$$u(x,t) = T(t)\,X(x), \quad (x,t) \in \bar\Omega_T, \tag{51}$$

that satisfies the boundary condition (50). Substituting (51) into Eq. (48) and separating the variables, we get the equation

$$\frac{(D_t^{\alpha}T)(t)}{T(t)} = -\frac{L(X)}{X(x)} = -\lambda, \tag{52}$$

$\lambda$ being a constant not depending on the variables $t$ and $x$. The last equation, together with the boundary condition (50), is equivalent to the fractional differential equation

$$(D_t^{\alpha}T)(t) + \lambda\,T(t) = 0 \tag{53}$$

and the eigenvalue problem

$$L(X) = \lambda\,X, \tag{54}$$

$$X\big|_{S} = 0, \quad x \in S \tag{55}$$

for the operator $L$. Due to the condition (34), the operator $L$ is a positive definite and self-adjoint linear operator. The theory of the eigenvalue problems for such operators is well known (see, e.g., Vladimirov 1971). In particular, the eigenvalue problem (54)–(55) has a countable number of positive eigenvalues $0 < \lambda_1 \le \lambda_2 \le \ldots$ with finite multiplicity, and if the boundary $S$ of $G$ is a smooth surface, any function $f \in M_L$ can be represented through its Fourier series in the form

$$f(x) = \sum_{i=1}^{\infty}(f, X_i)\,X_i(x), \tag{56}$$

where $(f,g)$ denotes the scalar product of two functions in $L^2(G)$ and $X_i \in M_L$ are the eigenfunctions corresponding to the eigenvalues $\lambda_i$:

$$L(X_i) = \lambda_i\,X_i, \quad i = 1, 2, \ldots. \tag{57}$$

By $M_L$, the space of the functions $f$ that satisfy the boundary condition (55) and the inclusions $f \in C^1(\bar G) \cap C^2(G)$, $L(f) \in L^2(G)$ is denoted.


The solution of the fractional differential equation (53) with $\lambda = \lambda_i$, $i = 1, 2, \ldots$, has the form (see, e.g., Luchko 1999; Luchko and Gorenflo 1999)

$$T_i(t) = c_i\,E_\alpha(-\lambda_i t^{\alpha}), \tag{58}$$

$E_\alpha$ being the Mittag-Leffler function defined by

$$E_\alpha(z) := \sum_{k=0}^{\infty}\frac{z^k}{\Gamma(\alpha k+1)}. \tag{59}$$

Any of the functions

$$u_i(x,t) = c_i\,E_\alpha(-\lambda_i t^{\alpha})\,X_i(x), \quad i = 1, 2, \ldots \tag{60}$$

and thus the finite sums

$$u_k(x,t) = \sum_{i=1}^{k} c_i\,E_\alpha(-\lambda_i t^{\alpha})\,X_i(x), \quad k = 1, 2, \ldots \tag{61}$$

satisfy both Eq. (48) and the boundary condition (50). To construct a function that satisfies the initial condition (49), too, the notion of a formal solution is introduced.

Definition 2 (Luchko 2010). A formal solution to the problem (48)–(50) is the Fourier series of the form

$$u(x,t) = \sum_{i=1}^{\infty}(u_0, X_i)\,E_\alpha(-\lambda_i t^{\alpha})\,X_i(x), \tag{62}$$

$X_i$, $i = 1, 2, \ldots$, being the eigenfunctions corresponding to the eigenvalues $\lambda_i$ of the eigenvalue problem (54)–(55). Under certain conditions, the formal solution (62) can be proved to be the generalized solution of the problem (48)–(50).

Theorem 5 (Luchko 2010). Let the initial condition $u_0$ be from the space $M_L$. Then the formal solution (62) of the problem (48)–(50) is its generalized solution.

Indeed, it can be easily verified that the functions $u_k$, $k = 1, 2, \ldots$, defined by (61) with $c_i = (u_0, X_i)$ are solutions of the problem (48)–(50) with the initial conditions

$$u_{0k}(x) = \sum_{i=1}^{k}(u_0, X_i)\,X_i(x) \tag{63}$$

instead of $u_0$. Because the function $u_0$ is from the functional space $M_L$, its Fourier series converges uniformly to the function $u_0$, so that

$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \quad \text{as } k \to \infty.$$

To prove the theorem, one only needs to show that the sequence $u_k$, $k = 1, 2, \ldots$, of the partial sums (61) converges uniformly on $\bar\Omega_T$. But this statement immediately follows from the estimate (see, e.g., Haubold et al. 2011)

$$|E_\alpha(-x)| \le \frac{M}{1+x} \le M, \quad 0 \le x,\ 0 < \alpha < 1 \tag{64}$$

for the Mittag-Leffler function (59) and the fact that the Fourier series $\sum_{i=1}^{\infty}(u_0, X_i)\,X_i(x)$ of the function $u_0 \in M_L$ uniformly converges on $\bar\Omega_T$.

In some cases, the generalized solution (62) can be shown to be the solution of the initial-boundary-value problem for the generalized time-fractional diffusion equation, too. One important example is given by the following theorem.

Theorem 6 (Luchko 2012b). Let an open domain $G$ be a one-dimensional interval $(0,l)$, $u_0 \in M_L$, and $L(u_0) \in M_L$. Then the unique solution of the initial-boundary-value problem

$$u\big|_{t=0} = u_0(x), \quad 0 \le x \le l, \qquad u(0,t) = u(l,t) = 0, \quad 0 \le t \le T$$

for the one-dimensional generalized time-fractional equation

$$(D_t^{\alpha}u)(t) = \frac{\partial}{\partial x}\left(p(x)\,\frac{\partial u}{\partial x}\right) - q(x)\,u$$

is a continuously differentiable function with respect to the time variable on the interval $(0,T)$ that is given by the formula (62).
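For the simplest one-dimensional case, the series solution can be written down explicitly. The following Python sketch assumes $p \equiv 1$ and $q \equiv 0$ on the interval $(0,l)$, so that $L = -d^2/dx^2$ with the eigenvalues $\lambda_i = (i\pi/l)^2$ and the normalized eigenfunctions $X_i(x) = \sqrt{2/l}\,\sin(i\pi x/l)$. The function names and the power series evaluation of the Mittag-Leffler function, which is adequate only while $\lambda_i t^{\alpha}$ stays moderate, are assumptions of this sketch.

```python
import math

def mittag_leffler(alpha, z, terms=200):
    """Power series (59) for E_alpha(z); adequate for moderate |z| only."""
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 0.0
    for k in range(terms):
        magnitude = math.exp(k * log_abs_z - math.lgamma(alpha * k + 1.0))
        total += magnitude if (z > 0.0 or k % 2 == 0) else -magnitude
    return total

def gtfde_solution(x, t, alpha, coeffs, length=1.0):
    """Partial sum (61) for p = 1, q = 0 on (0, length) with zero boundary values.

    coeffs[i - 1] is the Fourier coefficient (u0, X_i) of the initial condition.
    """
    total = 0.0
    for i, c in enumerate(coeffs, start=1):
        lam = (i * math.pi / length) ** 2             # eigenvalue of -d^2/dx^2
        x_i = math.sqrt(2.0 / length) * math.sin(i * math.pi * x / length)
        total += c * mittag_leffler(alpha, -lam * t ** alpha) * x_i
    return total

if __name__ == "__main__":
    coeffs = [1.0, 0.5]                               # assumed initial data
    for alpha in (1.0, 0.5):
        print(alpha, [gtfde_solution(0.3, t, alpha, coeffs) for t in (0.0, 0.001, 0.01)])
```

For $\alpha = 1$ every mode decays exponentially, as in the classical heat equation, which gives a simple consistency check of the implementation.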

3.2 Fractional Diffusion-Wave Equation

Of course, in the case of fractional differential equations with constant coefficients, techniques other than the spectral method presented in the previous section, such as the integral transforms method, can be applied to deduce explicit formulas for solutions of the initial-, boundary-, or initial-boundary-value problems for these equations. As an example, we consider in this section the initial-value problem (the Cauchy problem)

$$u(x,0) = f(x), \qquad \frac{\partial u}{\partial t}(x,0) = 0 \ \ (\text{for } 1 < \alpha \le 2), \qquad x \in \mathbb{R} \tag{65}$$

for the time-fractional diffusion-wave equation of order $\alpha$ with $1 \le \alpha \le 2$, namely,

$$(D_t^{\alpha}u)(t) = \frac{\partial^2 u}{\partial x^2}, \quad x \in \mathbb{R},\ t \in \mathbb{R}_+. \tag{66}$$

This equation interpolates between the diffusion equation ($\alpha = 1$) and the wave equation ($\alpha = 2$) and was considered in detail, e.g., in Mainardi (1994, 1996a, b) (see also Luchko et al. 2013 for more recent results). To simplify the notation in the formulas, we set $\nu = \alpha/2$, so that $1/2 \le \nu \le 1$ for $1 \le \alpha \le 2$.

The Green function $G_c(x,t;\nu)$ of the problem under consideration is its solution with the initial condition $f(x) = \delta(x)$, $\delta$ being the Dirac $\delta$-function. The solution of the Cauchy problem (65) is obtained via the Green function in the form

$$u(x,t;\nu) = \int_{-\infty}^{+\infty} G_c(x-\xi,t;\nu)\,f(\xi)\,d\xi.$$

For the diffusion equation ($\nu = 1/2$), the Green function is well known and is given by

$$G_c(x,t;1/2) = G_c^{d}(x,t) = \frac{t^{-1/2}}{2\sqrt{\pi}}\,e^{-x^2/(4t)}, \tag{67}$$

whereas for the wave equation ($\nu = 1$), we get the representation

$$G_c(x,t;1) = G_c^{w}(x,t) = \frac{1}{2}\left(\delta(x-t) + \delta(x+t)\right). \tag{68}$$

Following Mainardi (1994, 1996a, b) and Luchko et al. (2013), several representations of the Green function $G_c$ in the form of integrals and series as well as some methods for its numerical evaluation are presented and discussed in this subsection.

We start by transforming the Cauchy problem for Eq. (66) into the Laplace-Fourier domain using the known formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{(D_t^{\alpha}f)(t);s\} = s^{\alpha}\,\mathcal{L}\{f(t);s\} - \sum_{k=0}^{n-1} f^{(k)}(0^+)\,s^{\alpha-1-k}, \quad n-1 < \alpha \le n,\ n \in \mathbb{N} \tag{69}$$

for the Laplace transform of the Caputo fractional derivative. This formula, together with the standard formulas for the Fourier transform of the second derivative and of the Dirac $\delta$-function, leads to the representation

$$\hat{\tilde G}_c(\kappa,s;\nu) = \frac{s^{2\nu-1}}{s^{2\nu}+\kappa^2}, \quad \nu = \alpha/2 \tag{70}$$

of the Laplace-Fourier transform $\hat{\tilde G}_c$ of the Green function $G_c$. Using the Laplace transform formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{E_\alpha(-t^{\alpha});s\} = \frac{s^{\alpha-1}}{s^{\alpha}+1}$$


and applying to the right-hand side of the formula (70) first the inverse Laplace transform and then the inverse Fourier transform, we obtain the integral representation

$$G_c(x,t;\nu) = \frac{1}{\pi}\int_0^{\infty} E_{2\nu}\left(-\kappa^2 t^{2\nu}\right)\cos(x\kappa)\,d\kappa, \tag{71}$$

if we take into consideration the fact that the Green function of the Cauchy problem is an even function of $x$, which follows from the formula (70).

In Mainardi (1994), another representation of the Green function was obtained by applying to the right-hand side of the formula (70) first the inverse Fourier transform and then the inverse Laplace transform:

$$2\nu x\,G_c(x,t;\nu) = F_\nu(r) = \nu r\,M_\nu(r), \tag{72}$$

where $r = x/t^{\nu}$ is the similarity variable and

$$F_\nu(r) := \frac{1}{2\pi i}\int_{Ha} e^{\sigma - r\sigma^{\nu}}\,d\sigma, \qquad M_\nu(r) := \frac{1}{2\pi i}\int_{Ha} e^{\sigma - r\sigma^{\nu}}\,\frac{d\sigma}{\sigma^{1-\nu}}$$

are the two auxiliary functions nowadays often referred to as the Mainardi functions, and $Ha$ denotes the Hankel integration path.

Let us note that the form of the similarity variable can be explained by the Lie group analysis of the time-fractional diffusion-wave equation (66). In Buckwar and Luchko (1998), Luchko and Gorenflo (1998), and Gorenflo et al. (2000b), symmetry groups of scaling transformations for the time- and space-fractional partial differential equations have been constructed. In particular, it has been proved in Buckwar and Luchko (1998) that the only invariant of the symmetry group $T$ of scaling transformations of the time-fractional diffusion-wave equation (66) has the form $\eta(x,t,u) = x/t^{\nu}$, which explains the form of the scaling variable.

Using the well-known representation of the Wright function, which reads (in our notation) for $z \in \mathbb{C}$

$$W_{\lambda,\mu}(z) := \frac{1}{2\pi i}\int_{Ha} e^{\sigma + z\sigma^{-\lambda}}\,\frac{d\sigma}{\sigma^{\mu}} = \sum_{n=0}^{\infty}\frac{z^n}{n!\,\Gamma(\lambda n+\mu)}, \tag{73}$$

where $\lambda > -1$ and $\mu > 0$, we recognize that the auxiliary functions $F_\nu$ and $M_\nu$ are related to the Wright function according to the formula

$$F_\nu(z) = W_{-\nu,0}(-z) = \nu z\,M_\nu(z), \qquad M_\nu(z) = W_{-\nu,1-\nu}(-z). \tag{74}$$

The relation (74) along with (73) provides us with the series representations of the Mainardi functions and thus of the Green function $G_c$ (for $x > 0$):

$$G_c(x,t;\nu) = \frac{1}{2\nu x}\,F_\nu(r) = \frac{1}{2t^{\nu}}\,M_\nu(r) = \frac{1}{2t^{\nu}}\sum_{n=0}^{\infty}\frac{(-x/t^{\nu})^n}{n!\,\Gamma(-\nu n+1-\nu)}. \tag{75}$$
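The series (75) can be summed directly for moderate values of the similarity variable. The following Python sketch rewrites the reciprocal Gamma function with the reflection formula $1/\Gamma(1-z) = \Gamma(z)\sin(\pi z)/\pi$ for $z = \nu(n+1)$, which avoids evaluating the Gamma function at negative arguments. The function names, the fixed number of terms, and the restriction to $0 < \nu < 1$ and moderate $|x|/t^{\nu}$ are assumptions of this sketch; it is not one of the algorithms from the cited literature. For $\nu = 1/2$ the result can be compared with the Gaussian (67).

```python
import math

def m_wright(nu, r, terms=150):
    """Series for the Mainardi function M_nu(r), 0 < nu < 1, moderate r >= 0."""
    total = 0.0
    for n in range(terms):
        z = nu * (n + 1)
        # 1 / (n! * Gamma(1 - z)) via the reflection formula and log-Gamma.
        coeff = math.exp(math.lgamma(z) - math.lgamma(n + 1.0))
        coeff *= math.sin(math.pi * z) / math.pi
        total += (-r) ** n * coeff
    return total

def green_function(x, t, nu):
    """Green function G_c(x, t; nu) from (75), extended to x < 0 by evenness."""
    return m_wright(nu, abs(x) / t ** nu) / (2.0 * t ** nu)

if __name__ == "__main__":
    x, t = 0.7, 1.3
    gauss = math.exp(-x * x / (4.0 * t)) / (2.0 * math.sqrt(math.pi * t))
    print(green_function(x, t, 0.5), gauss)
    print([green_function(0.1 * k, 1.0, 0.875) for k in range(0, 16, 3)])
```

For larger values of the similarity variable the alternating series loses accuracy through cancellation, and a more robust evaluation method would be needed.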


Fig. 1 Green function $G_c(x;\nu) := G_c(x,1;\nu)$: plots for several different values of $\nu$ ($\nu$ = 0.5, 0.65, 0.9, 1; horizontal axis $x$, vertical axis $G_c(x;\nu)$)

Let us now briefly introduce some algorithms for the numerical evaluation of the Green function $G_c$. Because $G_c$ is a particular case of the Wright function (see formula (74)), one can of course use the algorithms for the numerical evaluation of the Wright function suggested in Luchko (2008) to evaluate the Green function $G_c$. Another approach, which we employed to produce the plots in Figs. 1 and 2, uses the integral representation (71). To evaluate the Mittag-Leffler function $E_\alpha$ in (71), we applied the algorithms suggested in Gorenflo et al. (2002) and the MATLAB programs that implement these algorithms and are available from Matlab File Exchange (2005).

In Fig. 1, several plots of the Green function $G_c(x;\nu) := G_c(x,1;\nu)$ for different values of the parameter $\nu$ ($\nu = \alpha/2$) are presented. It can be seen that for $x \ge 0$ each Green function has a unique maximum and that the location of the maximum point changes with the value of $\nu$. For a detailed discussion of the maximum location, the maximum value, and the propagation velocity of the maximum point, we refer to Luchko et al. (2013).

In Fig. 2, the Green function $G_c(x,t;\nu)$ is plotted for $\nu = 0.875$ ($\alpha = 1.75$) from different perspectives. The plots show that both the location of the maximum and the maximum value depend on the time $t > 0$: whereas the maximum value decreases with time (Fig. 2, right), the $x$-coordinate of the maximum location becomes larger (Fig. 2, left). Surprisingly, the product of the maximum location and the maximum value of $G_c(x,t;\nu)$ does not depend on the time $t > 0$ and is just a function of the parameter $\nu$ (see Luchko et al. 2013 for more details).

Fig. 2 Green function $G_c(x,t;\nu)$: plots for $\nu = 0.875$ from different perspectives

4 Fractional Wave Equation

In this section, a fractional generalization of the wave equation that describes propagation of damped waves is considered. In contrast to the fractional diffusion-wave equation that was considered in the previous section, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that the fractional wave equation inherits some crucial characteristics of the wave equation, such as a constant propagation velocity of both the maximum of its fundamental solution and its gravity and "mass" centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. In this section, the fundamental solution of the one-dimensional fractional wave equation is obtained in explicit form and shown to be a spatial probability density function evolving in time, all of whose moments of order less than $\alpha$ are finite. To illustrate the analytical findings, results of numerical calculations and plots are presented.

From the mathematical viewpoint, the one-dimensional fractional wave equation we deal with in this section was introduced for the first time in Gorenflo et al. (2000a), where this equation was called the neutral-fractional diffusion equation. In Mainardi et al. (2001), a time-space fractional diffusion-wave equation with the Riesz-Feller derivative of order $\alpha \in (0,2]$ and skewness $\theta$ has been investigated in detail. A particular case of this equation that for $\theta = 0$ corresponds to our fractional wave equation has been briefly mentioned in Mainardi et al. (2001). In Metzler and Nonnenmacher (2002), a fundamental solution to the neutral-fractional diffusion equation was deduced and analyzed in terms of the Fox H-function. For a detailed treatment of the one-dimensional fractional wave equation, we refer to the recent paper Luchko (2013).

In applications, fractional wave equations of different types were employed, e.g., for modeling the dynamics of sand and fissured rock under seismic excitations in Gudehus and Touplikiotis (2012) and for the description of causal elastic waves with a frequency power-law attenuation in Näsholm and Holm (2013).
As has been shown, e.g., in Szabo and Wu (2000), elastic wave attenuation in complex media such as biological tissue, polymers, rocks, and rubber often follows a frequency power law, and thus, such elastic waves can be modeled with the fractional wave equations.

4.1 Analysis of the Fractional Wave Equation

4.1.1 Problem Formulation

In this section, we consider the fractional wave equation in the form

$$ D_t^\alpha u(x,t) = C_\alpha R_x^\alpha u(x,t), \quad x \in \mathbb{R}^n,\ t \in \mathbb{R}_+,\ 1 \le \alpha \le 2, \tag{76} $$

where $u = u(x,t)$ is a real field variable, $R_x^\alpha$ is the Riesz space-fractional derivative (29) of order $\alpha$, and $D_t^\alpha$ is the Caputo time-fractional derivative (25) of order $\alpha$. The Caputo and the Riesz fractional derivatives were introduced and briefly discussed in the second section. Here we just note that the Riesz fractional derivative is a symmetric operator with respect to the space variable $x$. Because of the relation $-|\kappa|^\alpha = -(\kappa^2)^{\alpha/2}$, it can be formally interpreted as $R_x^\alpha = -(-\Delta)^{\alpha/2}$, i.e., as a power of the self-adjoint and positive definite operator $-\Delta$, $\Delta$ being the Laplace operator. To make the analysis of Eq. (76) simpler and clearer, in what follows we focus on the model one-dimensional fractional wave equation

$$ D_t^\alpha u(x,t) = R_x^\alpha u(x,t), \quad x \in \mathbb{R},\ t \in \mathbb{R}_+,\ 1 \le \alpha \le 2. \tag{77} $$

In (77), all quantities are supposed to be dimensionless, so that the coefficient of the Riesz space-fractional derivative can be taken equal to one without loss of generality. As mentioned in the second section, in the one-dimensional case the Riesz fractional derivative (29) can be represented as the hypersingular integral (30), which for $\alpha = 1$ can be rewritten via the Hilbert transform (31). This means that for $\alpha = 1$ Eq. (77) can be represented in the form

$$ \frac{\partial u}{\partial t}(x,t) = -\frac{1}{\pi}\,\frac{d}{dx}\int_{-\infty}^{+\infty} \frac{u(\zeta,t)}{x - \zeta}\, d\zeta \tag{78} $$

that we call a modified convection equation and that is of course different from the standard convection equation. For $\alpha = 2$, Eq. (77) reduces to the one-dimensional wave equation. In what follows, we focus on the case $1 \le \alpha < 2$ because the case $\alpha = 2$ (wave equation) is well studied in the literature. For Eq. (77), the initial-value problem

$$ u(x,0) = \varphi(x), \quad \frac{\partial u}{\partial t}(x,0) = 0, \quad x \in \mathbb{R} \tag{79} $$

is considered for $1 < \alpha < 2$. If $\alpha = 1$, the second initial condition in (79) is omitted. In this section, we are mostly interested in the behavior and properties of the fundamental solution (Green function) $G_\alpha$ of Eq. (77), i.e., its solution with the initial condition $\varphi(x) = \delta(x)$, $\delta$ being the Dirac delta function.

4.1.2 Fundamental Solution of the Fractional Wave Equation

We start our analysis by applying the Fourier transform with respect to the space variable $x$ to Eq. (77) with $1 < \alpha < 2$ and to the initial conditions (79) with $\varphi(x) = \delta(x)$. Using the definition of the Riesz fractional derivative, for the Fourier transform $\hat{G}_\alpha$ we get the initial-value problem

$$ \hat{G}_\alpha(\kappa, 0) = 1, \quad \frac{\partial \hat{G}_\alpha}{\partial t}(\kappa, 0) = 0 \tag{80} $$

for the fractional differential equation

$$ D_t^\alpha \hat{G}_\alpha(\kappa, t) + |\kappa|^\alpha\, \hat{G}_\alpha(\kappa, t) = 0. \tag{81} $$

The unique solution of (80), (81) is given by the expression (see, e.g., Luchko 1999)

$$ \hat{G}_\alpha(\kappa, t) = E_\alpha\left(-|\kappa|^\alpha t^\alpha\right) \tag{82} $$

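in terms of the Mittag-Leffler function (59). The well-known formula (see, e.g., Podlubny 1999)

$$ E_\alpha(-x) = -\sum_{k=1}^{m} \frac{(-x)^{-k}}{\Gamma(1 - \alpha k)} + O\left(x^{-1-m}\right), \quad m \in \mathbb{N},\ x \to +\infty $$

for the asymptotics of the Mittag-Leffler function, which is valid for $0 < \alpha < 2$, and the formula (82) show that $\hat{G}_\alpha$ belongs to $L_1(\mathbb{R})$ with respect to $\kappa$ for $1 < \alpha < 2$. Therefore, we can apply the inverse Fourier transform and get the representation

$$ G_\alpha(x,t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-i\kappa x}\, E_\alpha\left(-|\kappa|^\alpha t^\alpha\right) d\kappa, \quad x \in \mathbb{R},\ t > 0 \tag{83} $$

for the Green function $G_\alpha$. The last formula shows that the fundamental solution $G_\alpha$ is an even function in $x$, i.e.,

$$ G_\alpha(-x,t) = G_\alpha(x,t), \quad x \in \mathbb{R},\ t > 0, \tag{84} $$

and (83) can be rewritten as the cos-Fourier transform:

$$ G_\alpha(x,t) = \frac{1}{\pi}\int_{0}^{\infty} \cos(\kappa x)\, E_\alpha\left(-\kappa^\alpha t^\alpha\right) d\kappa, \quad x \in \mathbb{R},\ t > 0. \tag{85} $$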

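Remarkably, the fundamental solution $G_\alpha$ can be represented in terms of elementary functions for every $\alpha$, $1 < \alpha < 2$. To show this, the technique of the Mellin integral transform was applied in Luchko (2013) to rewrite the integral (85) as a particular case of the Fox H-function:

$$ G_\alpha(x,t) = \frac{1}{\alpha x}\,\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Gamma\left(\frac{s}{\alpha}\right)\Gamma\left(1 - \frac{s}{\alpha}\right)}{\Gamma\left(\frac{s}{2}\right)\Gamma\left(1 - \frac{s}{2}\right)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha, \tag{86} $$

that can be represented in the form

$$ G_\alpha(x,t) = \frac{1}{\alpha x}\,\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha \tag{87} $$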

using the duplication and reflection formulas for the Euler gamma function $\Gamma$. From (86) or (87), a useful representation

$$ G_\alpha(x,t) = \frac{1}{x}\, L_\alpha(t/x), \quad x > 0,\ t > 0 \tag{88} $$

of the fundamental solution $G_\alpha$ in terms of the auxiliary function $L_\alpha$ defined by

$$ L_\alpha(x) = \frac{1}{\alpha}\,\frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}\, x^{-s}\, ds, \quad 0 < \gamma < \alpha \tag{89} $$

can be obtained. It follows from the representation (89) (or from the formulas (84) and (88)) that $L_\alpha$ is an odd function, i.e.,

$$ L_\alpha(-x) = -L_\alpha(x), \quad x \in \mathbb{R}. \tag{90} $$

Moreover, the important formula

$$ L_\alpha(x) = L_\alpha(1/x), \quad x \ne 0 \tag{91} $$

can be obtained from the representation (89) by the substitution $s = -s_1$ in the integral on the right-hand side of (89). From (88) and (91) and the fact that $G_\alpha(-x,t) = G_\alpha(x,t) = G_\alpha(|x|,t)$ for all $x \ne 0$, $t > 0$, the similarity properties of the fundamental solution

$$ G_\alpha(x,t) = \frac{1}{|x|}\, G_\alpha(1, t/|x|) = \frac{1}{|x|}\, G_\alpha(1, |x|/t) = \frac{1}{t}\, G_\alpha(x/t, 1) = \frac{t}{x^2}\, G_\alpha(t/x, 1) \tag{92} $$

are deduced. It is worthwhile to stress the remarkable fact that two of these similarity properties hold with the variable $x$ fixed to 1 and the other two with the variable $t$ fixed to 1. This fact reflects the property that in the fractional wave equation (77) we deal with in this section, the time-fractional derivative and the space-fractional derivative are of the same order $\alpha$. The correctness of the four similarity properties (92) can also be checked directly using the final formula (97) for the fundamental solution $G_\alpha$ of the fractional wave equation. Because the auxiliary function $L_\alpha$ is defined in (89) as an inverse Mellin transform, its Mellin transform is given by the formula

$$ L_\alpha^*(s) = \int_0^\infty L_\alpha(x)\, x^{s-1}\, dx = \frac{1}{\alpha}\,\frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}, \quad 0 < \Re(s) < \alpha. \tag{93} $$

[…]

$$ G_\alpha(x,t) = \frac{1}{\pi}\,\frac{|x|^{\alpha-1}\, t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2|x|^\alpha t^\alpha \cos(\pi\alpha/2) + |x|^{2\alpha}}, \quad t > 0,\ x \in \mathbb{R},\ 1 < \alpha < 2. \tag{97} $$
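The closed form (97) is easy to evaluate directly. The following Python sketch (the function name `green_alpha` is an illustrative assumption, not part of the cited works) evaluates (97) and lets one check numerically the evenness (84), the four similarity properties (92), and the fact that setting $\alpha = 1$ in the expression yields the Cauchy kernel $t / (\pi (t^2 + x^2))$.

```python
import math


def green_alpha(x: float, t: float, alpha: float) -> float:
    """Closed form (97) of the fundamental solution, 1 <= alpha < 2, t > 0."""
    r = abs(x)
    half_angle = math.pi * alpha / 2.0
    numerator = r ** (alpha - 1.0) * t ** alpha * math.sin(half_angle)
    denominator = (t ** (2.0 * alpha)
                   + 2.0 * r ** alpha * t ** alpha * math.cos(half_angle)
                   + r ** (2.0 * alpha))
    return numerator / (math.pi * denominator)


if __name__ == "__main__":
    x, t, alpha = 0.3, 0.2, 1.5
    # The four similarity forms in (92) should all reproduce G_alpha(x, t).
    print(green_alpha(x, t, alpha),
          green_alpha(1.0, t / abs(x), alpha) / abs(x),
          green_alpha(1.0, abs(x) / t, alpha) / abs(x),
          green_alpha(x / t, 1.0, alpha) / t,
          t / x ** 2 * green_alpha(t / x, 1.0, alpha))
    # alpha = 1 gives the Cauchy kernel t / (pi * (t**2 + x**2)).
    print(green_alpha(x, t, 1.0), t / (math.pi * (t ** 2 + x ** 2)))
```

Because (97) involves only powers and trigonometric functions, no special-function library is needed to plot $G_\alpha$.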

4.2 Fundamental Solution as a pdf

We begin with the remark that the formula (97) is valid for $\alpha = 1$ (modified convection equation (78)), too, which can be proved by direct calculation. In this case we get the well-known Cauchy kernel

$$ G_1(x,t) = \frac{1}{\pi}\,\frac{t}{t^2 + x^2} \tag{98} $$

that is a spatial probability density function evolving in time. For $1 < \alpha < 2$, the Green function (97) is a spatial probability density function evolving in time, too. Indeed, the function (97) is evidently nonnegative for all $t > 0$. Furthermore,

$$ \int_{-\infty}^{\infty} G_\alpha(x,t)\, dx = \left(\mathcal{F}\, G_\alpha(x,t)\right)(0) = \hat{G}_\alpha(0,t) = E_\alpha\left(-|\kappa|^\alpha t^\alpha\right)\Big|_{\kappa = 0} = 1 \tag{99} $$

for all $t > 0$ and $1 < \alpha < 2$ according to the formula (82). Thus, $G_\alpha$ given by (97) is a spatial probability density function evolving in time that can be considered a fractional generalization of the Cauchy kernel (98) for the case of an arbitrary index $\alpha$, $1 \le \alpha < 2$. Now let us study some properties of the fundamental solution (97) as a pdf. Because $G_\alpha$ is an even function, we consider the function

$$ G_\alpha^+(r,t) = G_\alpha(|x|,t) = \frac{1}{\pi}\,\frac{r^{\alpha-1}\, t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2 r^\alpha t^\alpha \cos(\pi\alpha/2) + r^{2\alpha}}, \quad t > 0,\ 1 < \alpha < 2 $$

with $r = |x| \ge 0$. It is easy to see that $G_\alpha^+$ behaves like a power function in $r$ both at $r = 0$ and at $r = +\infty$ for a fixed $t > 0$:

$$ G_\alpha^+(r,t) \sim \begin{cases} r^{\alpha-1}, & r \to 0, \\ r^{-\alpha-1}, & r \to +\infty. \end{cases} \tag{100} $$

This means that the pdf $G_\alpha$ possesses finite moments of order $\gamma$ for $0 \le \gamma < \alpha$, but the moment of order $\alpha$ is infinite. In particular, the mean value of $G_\alpha^+$ (its first moment) exists for all $\alpha > 1$ (we note that the Cauchy kernel does not possess a mean value). Let us now evaluate the moments of the one-sided fractional Cauchy kernel $G_\alpha^+$ for a fixed $t > 0$. To do this, we refer to the representation (88) of $G_\alpha^+$ in terms of the auxiliary function $L_\alpha$ that is given by the formula (see (90) and (97))

$$ L_\alpha(x) = \frac{1}{\pi}\,\frac{\operatorname{sign}(x)\,|x|^\alpha \sin(\pi\alpha/2)}{|x|^{2\alpha} + 2|x|^\alpha \cos(\pi\alpha/2) + 1}, \quad x \in \mathbb{R},\ 1 < \alpha < 2. \tag{101} $$

Taking into account this formula, the function

$$ C_\alpha(x) := \frac{L_\alpha(x)}{x} $$

can be interpreted as a fractional Cauchy pdf of order $\alpha$. Indeed, $C_\alpha(x)$ is evidently nonnegative for all $x \in \mathbb{R}$, and the property

$$ \int_{-\infty}^{+\infty} C_\alpha(x)\, dx = 1 $$

is valid because of the formulas (102) and (103). Of course, for $\alpha = 1$, the pdf $C_\alpha(x)$ coincides with the Cauchy pdf. The moment of order $\gamma$, $0 \le \gamma < \alpha$, of $G_\alpha^+$ can be represented in terms of the Mellin integral transform of $L_\alpha$ that is known (see the formula (93)) and thus evaluated:

$$ \int_0^\infty G_\alpha^+(r,t)\, r^\gamma\, dr = t^\gamma \int_0^\infty L_\alpha(\tau)\, \tau^{-\gamma-1}\, d\tau = \frac{t^\gamma}{\alpha}\,\frac{\sin(\pi\gamma/2)}{\sin(\pi\gamma/\alpha)}. \tag{102} $$

In particular, we get the formula

$$ \int_0^\infty G_\alpha^+(r,t)\, dr = \frac{1}{2} \tag{103} $$

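that is in accordance with (99) because $G_\alpha$ is an even function in $x$. We mention also the important formula

$$ \int_0^\infty G_\alpha^+(r,t)\, r\, dr = \frac{t}{\alpha \sin(\pi/\alpha)}, \quad 1 < \alpha < 2. \tag{104} $$

[…] $G_\alpha(x,t) > 0$ for $x \ne 0$, so that $x = 0$ is a minimum point of the fundamental solution $G_\alpha$ for any $t > 0$. Because $G_\alpha$ is an even function in $x$, we again consider the function $G_\alpha(|x|,t)$ that was denoted by $G_\alpha^+(r,t)$ with $r = |x|$. To determine the maximum locations of $G_\alpha^+$ for fixed values of $t$ and $\alpha$, we solve the equation

$$ \frac{\partial G_\alpha^+}{\partial r}(r,t) = 0 $$

that turns out to be equivalent to the quadratic equation

$$ (\alpha + 1)\left(\frac{r^\alpha}{t^\alpha}\right)^2 + 2\cos(\pi\alpha/2)\,\frac{r^\alpha}{t^\alpha} - (\alpha - 1) = 0 $$

with solutions given by

$$ \frac{r^\alpha}{t^\alpha} = \frac{-\cos(\pi\alpha/2) \pm \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}. $$

Since we are interested in the nonnegative solutions, the only candidate is the point

$$ \frac{r^\alpha}{t^\alpha} = c_\alpha, \quad c_\alpha := \frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}. \tag{105} $$

Because $\frac{\partial G_\alpha^+}{\partial r}(r,t)$ is positive for $\frac{r^\alpha}{t^\alpha} < c_\alpha$ and negative for $\frac{r^\alpha}{t^\alpha} > c_\alpha$, we conclude that the point

$$ r_\alpha^\star(t) = v_p(\alpha)\, t, \quad v_p(\alpha) := (c_\alpha)^{1/\alpha} \tag{106} $$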

with $c_\alpha$ given by (105) is the only maximum point of the one-sided fractional Cauchy kernel $G_\alpha^+$. Of course, this point and the point $-r_\alpha^\star(t) < 0$ are maximum points of $G_\alpha$ because $G_\alpha$ is an even function in $x$. To determine the maximum value of the function $G_\alpha$, which coincides with the maximum value of $G_\alpha^+$ and is denoted by $G_\alpha^\star(t)$, we substitute the point $r = r_\alpha^\star(t)$ given by (106) into the function $G_\alpha^+$ and get the formula

$$ G_\alpha^\star(t) = \frac{1}{t}\, m_\alpha, \quad m_\alpha := G_\alpha^\star(1) = \frac{1}{\pi}\,\frac{1}{v_p(\alpha)}\,\frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2 c_\alpha \cos(\pi\alpha/2) + c_\alpha^2}, \tag{107} $$

where $v_p(\alpha)$ and $c_\alpha$ are defined as in the formulas (105) and (106). Of course, we can also use the relation (88) and obtain the formula

$$ G_\alpha^\star(t) = \frac{1}{t\, v_p(\alpha)}\, L_\alpha\left(v_p(\alpha)\right) \tag{108} $$

via the auxiliary function $L_\alpha$. It follows from the formulas (106) and (107) (or (108)) that for a fixed value of $\alpha$, $1 < \alpha < 2$, the product $p_\alpha$ of the maximum value $G_\alpha^\star(t)$ and the maximum locations $\pm r_\alpha^\star(t)$ is time independent:

$$ p_\alpha = \pm r_\alpha^\star(t)\, G_\alpha^\star(t) = \pm\frac{1}{\pi}\,\frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2 c_\alpha \cos(\pi\alpha/2) + c_\alpha^2} = \pm L_\alpha\left(v_p(\alpha)\right). \tag{109} $$
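The statements (105)-(109) can be checked numerically with a few lines of Python. This is a minimal sketch under the formulas above; the helper names `c_alpha`, `phase_velocity`, `green_plus`, and `product_p` are illustrative assumptions.

```python
import math


def c_alpha(alpha: float) -> float:
    """Nonnegative root (105) of the quadratic equation for r**alpha / t**alpha."""
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    return (-c + math.sqrt(alpha ** 2 - s ** 2)) / (alpha + 1.0)


def phase_velocity(alpha: float) -> float:
    """v_p(alpha) = c_alpha**(1/alpha), see (106)."""
    return c_alpha(alpha) ** (1.0 / alpha)


def green_plus(r: float, t: float, alpha: float) -> float:
    """One-sided kernel G_alpha^+(r, t) for r >= 0, t > 0."""
    half_angle = math.pi * alpha / 2.0
    numerator = r ** (alpha - 1.0) * t ** alpha * math.sin(half_angle)
    denominator = (t ** (2.0 * alpha)
                   + 2.0 * r ** alpha * t ** alpha * math.cos(half_angle)
                   + r ** (2.0 * alpha))
    return numerator / (math.pi * denominator)


def product_p(alpha: float) -> float:
    """Right-hand side of (109), taken with the plus sign."""
    c = c_alpha(alpha)
    half_angle = math.pi * alpha / 2.0
    return c * math.sin(half_angle) / (
        math.pi * (1.0 + 2.0 * c * math.cos(half_angle) + c * c))


if __name__ == "__main__":
    alpha = 1.5
    for t in (0.1, 1.0, 3.0):
        r_star = phase_velocity(alpha) * t
        # The product of maximum location and maximum value should not depend on t.
        print(t, r_star, r_star * green_plus(r_star, t, alpha), product_p(alpha))
```

The printed product stays the same for every $t$, while the maximum location grows linearly and the maximum value decays like $1/t$.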

For $\alpha = 1$, the maximum location of the Green function $G_1(x,t) = \frac{1}{\pi}\frac{t}{t^2 + x^2}$ does not move with time and is evidently at the point $x = 0$ for any $t > 0$, i.e., $c_1 = v_p(1) = p_1 = 0$, which is in accordance with the formulas (105), (106), and (109). Now we calculate some physical characteristics of the damped waves that are described by the fundamental solution $G_\alpha$. Because $G_\alpha$ consists in fact of two symmetric branches that move in opposite directions, we again consider only one of them, say, $G_\alpha^+$, which is the restriction of $G_\alpha$ to $x \ge 0$. The location of the gravity center $r_\alpha^g(t)$ of $G_\alpha^+$ is defined by the formula

$$ r_\alpha^g(t) = \frac{\int_0^\infty r\, G_\alpha^+(r,t)\, dr}{\int_0^\infty G_\alpha^+(r,t)\, dr}. \tag{110} $$

For $1 < \alpha < 2$, the formulas (103) and (104) lead to the following result:

$$ r_\alpha^g(t) = \frac{2t}{\alpha \sin(\pi/\alpha)}. \tag{111} $$
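The moment formula (102) and the gravity center (111) can be cross-checked by simple quadrature. The sketch below is an illustration only: the function names and the quadrature scheme are assumptions. It integrates $L_\alpha(\tau)\,\tau^{-\gamma-1}$ over $(0, \infty)$ by folding the range onto $(0, 1)$ with the symmetry (91), substituting $\tau = u^4$ to remove the endpoint singularity, and applying a midpoint rule.

```python
import math


def l_aux(tau: float, alpha: float) -> float:
    """Auxiliary function (101) for tau > 0."""
    ta = tau ** alpha
    half_angle = math.pi * alpha / 2.0
    return ta * math.sin(half_angle) / (
        math.pi * (ta * ta + 2.0 * ta * math.cos(half_angle) + 1.0))


def moment_numeric(gamma: float, t: float, alpha: float, n: int = 20000) -> float:
    """Moment of order gamma (0 <= gamma < alpha) of G_alpha^+(., t), midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        tau = u ** 4  # d tau = 4 u**3 du
        # Contributions of (0, 1) and, via L(tau) = L(1/tau), of (1, inf).
        weight = tau ** (-gamma - 1.0) + tau ** (gamma - 1.0)
        total += l_aux(tau, alpha) * weight * 4.0 * u ** 3
    return t ** gamma * total * h


def moment_closed(gamma: float, t: float, alpha: float) -> float:
    """Right-hand side of (102); gamma = 0 is the limit value 1/2 from (103)."""
    if gamma == 0.0:
        return 0.5
    return (t ** gamma / alpha) * math.sin(math.pi * gamma / 2.0) / math.sin(
        math.pi * gamma / alpha)


if __name__ == "__main__":
    alpha, t = 1.5, 0.7
    for gamma in (0.0, 0.5, 1.0):
        print(gamma, moment_numeric(gamma, t, alpha), moment_closed(gamma, t, alpha))
    # Gravity center (110) versus the closed form (111).
    print(moment_numeric(1.0, t, alpha) / moment_numeric(0.0, t, alpha),
          2.0 * t / (alpha * math.sin(math.pi / alpha)))
```

The substitution $\tau = u^4$ is a convenience for this sketch; an adaptive quadrature routine would handle the endpoint behavior automatically.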

If $\alpha = 1$, the mean value of $G_1^+$ does not exist, and thus the gravity center $r_1^g(t)$ of $G_1^+$ is located at $+\infty$ for any $t > 0$. The "mass" center $r_\alpha^m(t)$ of $G_\alpha^+$ is determined by the formula (Gurwich 2001)

$$ r_\alpha^m(t) = \frac{\int_0^\infty r \left(G_\alpha^+(r,t)\right)^2 dr}{\int_0^\infty \left(G_\alpha^+(r,t)\right)^2 dr}. \tag{112} $$

Substituting the representation (88) into (112) and transforming the obtained integrals, we get the formula

$$ r_\alpha^m(t) = v_m(\alpha)\, t, \quad v_m(\alpha) = \frac{\int_0^\infty \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^\infty L_\alpha^2(\tau)\, d\tau}, \tag{113} $$

where the function $L_\alpha$ is defined by (101). The formula (113) (as well as the formula (118)) includes some integrals of the form

$$ I(\beta) = \int_0^\infty \tau^\beta L_\alpha^2(\tau)\, d\tau, \quad -2\alpha - 1 < \beta < 2\alpha - 1 \tag{114} $$

that in general cannot be expressed via known elementary or special functions. Remarkably, there exists an explicit formula for the integrals (114) in the case $\alpha = 1$, namely (Prudnikov et al. 1986),

$$ \int_0^\infty \tau^\beta L_1^2(\tau)\, d\tau = \frac{1}{4\pi}\,\frac{1 + \beta}{\cos(\pi\beta/2)}, \quad -3 < \beta < 1. \tag{115} $$

It follows from this formula that the "mass" center $r_1^m$ of $G_1^+$ can be represented by the simple formula

$$ r_1^m(t) = \frac{2}{\pi}\, t. \tag{116} $$
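For $\alpha = 1$ the explicit formula (115) can be confirmed by elementary quadrature. In the sketch below (function names are illustrative assumptions), the substitution $\tau = \tan\theta$ turns $\tau^\beta L_1^2(\tau)\, d\tau$ with $L_1(\tau) = \tau / (\pi (1 + \tau^2))$ into $\pi^{-2} \tan^\beta\theta\, \sin^2\theta\, d\theta$ on $(0, \pi/2)$, which a midpoint rule handles well. At $\beta = -1$ the right-hand side of (115) is understood as its limit $1/(2\pi^2)$.

```python
import math


def i_beta_cauchy_numeric(beta: float, n: int = 100_000) -> float:
    """Midpoint-rule value of the integral (114) for alpha = 1, -3 < beta < 1."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.tan(theta) ** beta * math.sin(theta) ** 2
    return total * h / math.pi ** 2


def i_beta_cauchy_closed(beta: float) -> float:
    """Right-hand side of (115), with the limit value at beta = -1."""
    if abs(beta + 1.0) < 1e-12:
        return 1.0 / (2.0 * math.pi ** 2)
    return (1.0 + beta) / (4.0 * math.pi * math.cos(math.pi * beta / 2.0))


if __name__ == "__main__":
    for beta in (-1.0, -0.5, 0.0):
        print(beta, i_beta_cauchy_numeric(beta), i_beta_cauchy_closed(beta))
    # Ratio I(-1) / I(0) is the coefficient 2/pi in (116).
    print(i_beta_cauchy_numeric(-1.0) / i_beta_cauchy_numeric(0.0), 2.0 / math.pi)
```

The ratio in the last line is exactly the coefficient of $t$ in (113) evaluated at $\alpha = 1$.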

In the general case, we just note that the symmetry relation

$$ I(\beta) = I(-\beta - 2), \quad -2\alpha - 1 < \beta < 2\alpha - 1 $$

holds true because of the formula (91). Finally we mention that the location of energy of the damped wave $G_\alpha^+$, defined as the time corresponding to the centroid of the function $G_\alpha^+$ in the time domain, is given by the formula (Carcione et al. 2010)

$$ t_\alpha^c(r) = \frac{\int_0^\infty t \left(G_\alpha^+(r,t)\right)^2 dt}{\int_0^\infty \left(G_\alpha^+(r,t)\right)^2 dt}. \tag{117} $$

For $1 < \alpha < 2$, both integrals on the right-hand side of (117) converge and the finite location of energy can be represented in the form

$$ t_\alpha^c(r) = \frac{r}{v_c(\alpha)}, \quad v_c(\alpha) = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{118} $$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the integral $\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau$ diverges, so that $v_c(\alpha)$ tends to 0 as $\alpha \to 1$.

4.4 Velocities of the Damped Waves

It is well known (see, e.g., Smith 1970; Bloch 1977; Groesen and Mainardi 1989, 1990; Gurwich 2001; Carcione et al. 2010) that several different definitions of wave velocities, and in particular of the velocity of light, can be introduced. For the damped waves that are described by the fractional wave equation (77), we evaluate the propagation velocity of the maximum of its fundamental solution $G_\alpha$, which can be interpreted as the phase velocity, the propagation velocity of the gravity center of $G_\alpha$, the velocity of its "mass" center or the pulse velocity, and three different kinds of its centrovelocity. It turns out that all these velocities are constant in time and depend only on the order $\alpha$ of the fractional wave equation. Whereas four out of six velocities are different from each other, the first centrovelocity coincides with the Smith centrovelocity, and the second centrovelocity is the same as the pulse velocity. We start with the phase velocity and determine it using the formula (106), which leads to the result that the maximum locations of the fundamental solution $G_\alpha$ propagate with the constant velocities $\pm v_p(\alpha)$ that are given by the expression

$$ \pm v_p(\alpha) = \pm\frac{d r_\alpha^\star(t)}{dt} = \pm\left(\frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}\right)^{1/\alpha}. \tag{119} $$

For $\alpha = 1$ (modified convection equation (78)), the propagation velocity of the maximum of $G_\alpha$ is equal to zero (the maximum point stays at $x = 0$), whereas for $\alpha = 2$ (wave equation), the maximum points propagate with the constant velocity $\pm 1$. To determine the propagation velocity $v_g(\alpha)$ of the gravity center of $G_\alpha$, we employ the formula (111) and get the following result:

$$ v_g(\alpha) := \frac{d r_\alpha^g(t)}{dt} = \frac{2}{\alpha \sin(\pi/\alpha)}. \tag{120} $$

$v_g(\alpha)$ is thus time independent and determined by the order $\alpha$ of the fractional wave equation. Evidently, $v_g(2) = 1$ and $v_g(\alpha) \to +\infty$ as $\alpha \to 1 + 0$. The velocity $v_m(\alpha)$ of the "mass" center of $G_\alpha$ or its pulse velocity (Gurwich 2001) is obtained from the formula (113) and is equal to

$$ v_m(\alpha) := \frac{d r_\alpha^m(t)}{dt} = \frac{\int_0^\infty \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^\infty L_\alpha^2(\tau)\, d\tau}, \tag{121} $$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the pulse velocity is equal to $2/\pi \approx 0.64$ (see the formula (116)). Following Carcione et al. (2010) we define the second centrovelocity $v_2(\alpha)$ as the mean pulse velocity computed from the time 0 to the time $t$. It follows from (113) and (121) that for the damped wave that is described by the fundamental solution of the fractional wave equation, the second centrovelocity is equal to its pulse velocity $v_m(\alpha)$:

$$ v_2(\alpha) := \frac{r_\alpha^m(t)}{t} = v_m(\alpha) = \frac{\int_0^\infty \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^\infty L_\alpha^2(\tau)\, d\tau}. \tag{122} $$

The Smith centrovelocity $v_c(\alpha)$ (Smith 1970) of the damped waves describes the motion of the first moment of their energy distribution and can be evaluated in explicit form using the formula (118):

$$ v_c(\alpha) := \left(\frac{d t_\alpha^c(r)}{dr}\right)^{-1} = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{123} $$

where the function $L_\alpha$ is defined by (101). Because the integral $\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau$ diverges for $\alpha = 1$, the Smith centrovelocity tends to 0 as $\alpha \to 1$. Finally, we evaluate the first centrovelocity $v_1(\alpha)$ that is defined as the mean centrovelocity from 0 to $x$ (Carcione et al. 2010). It follows from (118) and (123) that for the damped wave $G_\alpha$ the first centrovelocity is equal to the Smith centrovelocity $v_c(\alpha)$:

$$ v_1(\alpha) := \frac{r}{t_\alpha^c(r)} = v_c(\alpha) = \frac{\int_0^\infty L_\alpha^2(\tau)\, d\tau}{\int_0^\infty \tau\, L_\alpha^2(\tau)\, d\tau}. \tag{124} $$

Fig. 3 Fundamental solution $G_\alpha$: plots for $\alpha = 1.01$ (1st line, left), $\alpha = 1.1$ (1st line, right), $\alpha = 1.5$ (2nd line, left), and $\alpha = 1.9$ (2nd line, right) for $-0.5 \le x \le 0.5$ and $t = 0.1, 0.2, 0.3$

As we have seen, all velocities introduced above are constant in time and depend only on the order $\alpha$ of the fractional wave equation. The phase velocity, the velocity of the gravity center of $G_\alpha$, the pulse velocity, and the Smith centrovelocity are different from each other, whereas the first centrovelocity coincides with the Smith centrovelocity and the second centrovelocity is the same as the pulse velocity. For the physical interpretation and meaning of the velocities that were determined above, we refer to, e.g., Bloch (1977), Groesen and Mainardi (1989, 1990), Gurwich (2001), and Carcione et al. (2010).

4.5 Discussion of the Obtained Results and Plots

To start with, let us consider the evolution of the fundamental solution $G_\alpha$ in time for some characteristic values of $\alpha$. In Fig. 3, the plots of $G_\alpha$ for $\alpha = 1.01, 1.1, 1.5$, and $1.9$ are presented. As we can see, in all cases the maximum location of $G_\alpha$ moves in time according to the formula (106), whereas the maximum value decreases according to the formula (107). The behavior of $G_\alpha$ can thus be interpreted as the propagation of damped waves whose amplitude decreases with time. This phenomenon can be very clearly recognized in the 3D plot presented in Fig. 4. Of course, because of the nonlocal character of the fractional derivatives in the fractional wave equation, the solutions to this equation show some properties of diffusion processes, too. In particular, the fundamental solution $G_\alpha$ is positive for all $x \ne 0$ at any small time instant $t > 0$, which means that a disturbance of the initial conditions spreads infinitely fast and Eq. (33) is nonrelativistic like the classical diffusion equation. But in contrast to the diffusion equation, the maximum location of the fundamental solution $G_\alpha$, its gravity and "mass" centers, and the location of its energy all propagate with finite constant velocities, like the fundamental solution of the wave equation.

Fig. 4 Plot of $G_\alpha$ for $\alpha = 1.5$, $-0.5 \le x \le 0.5$, and $0 < t \le 0.3$

The plots of the propagation velocity $v_p$ of the maximum location of the fundamental solution $G_\alpha$ (phase velocity), the velocity $v_g$ of its gravity center, its pulse velocity $v_m$, and its centrovelocity $v_c$ are presented in Fig. 5. As expected, $v_p = v_c = 0$ and $v_m = 2/\pi \approx 0.64$ for $\alpha = 1$ (modified convection equation), and all velocities smoothly approach the value 1 as $\alpha \to 2$ (wave equation). For $1 < \alpha < 2$, $v_p$, $v_m$, and $v_c$ monotonically increase, whereas $v_g$ monotonically decreases. It is interesting to note that for all velocities $v = v(\alpha)$, the property $\frac{dv}{d\alpha}(2 - 0) = 0$ holds true, i.e., in a small neighborhood of the point $\alpha = 2$, the velocities of $G_\alpha$ are nearly the same as those of the fundamental solution of the wave equation. The velocity $v_g$ of the gravity center of $G_\alpha$ tends to $+\infty$ for $\alpha \to 1 + 0$ and $t > 0$ (modified convection equation) because the first moment of the Cauchy kernel (98) does not exist. It is interesting to note that for all $\alpha$, $1 < \alpha < 2$, the velocities $v_p, v_g, v_m, v_c$ are different from each other and fulfill the inequalities $v_c(\alpha) < v_p(\alpha) < v_m(\alpha) < v_g(\alpha)$. For $\alpha = 2$, all velocities are equal to 1.

Fig. 5 Plots of the gravity center velocity $v_g(\alpha)$, the pulse velocity $v_m(\alpha)$, the phase velocity $v_p(\alpha)$, and the centrovelocity $v_c(\alpha)$ for $1 \le \alpha \le 2$
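The four velocities discussed above can be tabulated with a short script. The following Python sketch is an illustration under the formulas (101), (114), and (119)-(123); the function names and the quadrature (folding $(0, \infty)$ onto $(0, 1)$ with (91), substituting $\tau = u^4$, midpoint rule) are assumptions of this sketch and not the method used to produce Fig. 5. The quadrature is intended for moderate orders, roughly $1.2 \le \alpha \le 1.8$, plus the special case $\alpha = 1$ for $\beta < 1$.

```python
import math


def l_aux(tau: float, alpha: float) -> float:
    """Auxiliary function (101) for tau > 0."""
    ta = tau ** alpha
    half_angle = math.pi * alpha / 2.0
    return ta * math.sin(half_angle) / (
        math.pi * (ta * ta + 2.0 * ta * math.cos(half_angle) + 1.0))


def i_beta(beta: float, alpha: float, n: int = 20000) -> float:
    """Integral (114), I(beta) = int_0^inf tau**beta * L_alpha(tau)**2 d tau."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        tau = u ** 4  # d tau = 4 u**3 du
        # (0, 1) part plus the (1, inf) part mapped back with L(tau) = L(1/tau).
        weight = tau ** beta + tau ** (-beta - 2.0)
        total += weight * l_aux(tau, alpha) ** 2 * 4.0 * u ** 3
    return total * h


def velocities(alpha: float) -> dict:
    """Phase, gravity-center, pulse, and Smith centrovelocity for 1 < alpha < 2."""
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    c_a = (-c + math.sqrt(alpha ** 2 - s ** 2)) / (alpha + 1.0)  # (105)
    i0 = i_beta(0.0, alpha)
    return {
        "v_p": c_a ** (1.0 / alpha),                         # (119)
        "v_g": 2.0 / (alpha * math.sin(math.pi / alpha)),    # (120)
        "v_m": i_beta(-1.0, alpha) / i0,                     # (121)
        "v_c": i0 / i_beta(1.0, alpha),                      # (123)
    }


if __name__ == "__main__":
    for alpha in (1.25, 1.5, 1.75):
        print(alpha, velocities(alpha))
```

For $\alpha = 1$ the same routine reproduces $I(0) = 1/(4\pi)$ and the pulse velocity $2/\pi$ from (115) and (116), which gives a quick sanity check of the quadrature.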

5 Conclusions and Open Problems

In this chapter, anomalous transport processes have first been modeled with continuous time random walks (CTRW) on the microlevel. On the macrolevel, the CTRW models were reduced to deterministic fractional diffusion-wave equations on large time and space scales and under some suitable assumptions on the jump pdf. Equations of this kind have already been successfully employed, e.g., for modeling geothermal energy extraction (Luchko and Punzi 2011), the stability and seismicity of fractal fault systems (Gudehus and Touplikiotis 2012), and the propagation of damped waves (Luchko 2013), which shows their potential importance for different geomathematical applications. In this chapter, some important types of partial differential equations of fractional order, including the generalized time-fractional diffusion equation, the time-fractional diffusion-wave equation, and the time- and space-fractional wave equation, were treated. For these equations, both the initial-boundary-value problems with Dirichlet boundary conditions and the Cauchy initial-value problems have been posed and investigated. Of course, the same method can be applied to the initial-boundary-value problems with Neumann, Robin, or mixed boundary conditions. For the generalized time-fractional diffusion equation, a powerful maximum principle has been established. It enables us to obtain information regarding solutions and their a priori estimates without any explicit knowledge of the form of the solutions themselves and thus is a valuable tool in scientific research. In this connection we mention an important problem that is still waiting for its solution, namely, to extend the maximum principle to the space- and time-space-fractional partial differential equations. These equations are nowadays actively employed in modeling relevant complex phenomena like anomalous diffusion in inhomogeneous and porous media, Lévy processes and Lévy flights, and the so-called fractional kinetics, and they deserve detailed treatment from the mathematical viewpoint.

In the last section of the chapter, a fractional wave equation with fractional derivatives of order $\alpha$, $1 \le \alpha \le 2$, both in space and in time was introduced and analyzed. We showed that the fractional wave equation inherits some crucial characteristics of the wave equation, such as the constant propagation velocities of the maximum of its fundamental solution, its gravity and "mass" centers, and its energy location. Because the maximum value of the fundamental solution $G_\alpha$ (wave amplitude) decreases with time whereas its location moves with a constant velocity, solutions to the fractional wave equation can be interpreted as damped waves. Moreover, $G_\alpha$, which turns out to be expressed in terms of elementary functions for all values of $\alpha$, $1 \le \alpha < 2$, can be interpreted as a spatial pdf evolving in time, all of whose moments of order less than $\alpha$ are finite. In connection with the fractional wave equation, an important problem for further research would be the determination of other velocities like the group velocity or the ratio-of-units velocity (see, e.g., Bloch 1977; Gurwich 2001) for the damped waves described by the fractional wave equation. Finally, fractional wave equations with nonconstant coefficients as well as the qualitative behavior of solutions to nonlinear fractional wave equations would be worth considering from the mathematical viewpoint and employing as models in suitable applications.

References

Al-Refai M (2012) On the fractional derivatives at extreme points. Electron J Qual Theory Differ Equ 55:1–5
Berkowitz B, Klafter J, Metzler R, Scher H (2002) Physical pictures of transport in heterogeneous media: advection-dispersion, random walk and fractional derivative formulations. Water Resour Res 38:1191–1203
Bloch SC (1977) Eighth velocity of light. Am J Phys 45:538–549
Buckwar E, Luchko Yu (1998) Invariance of a partial differential equation of fractional order under the Lie group of scaling transformations. J Math Anal Appl 227:81–97
Carcione JM, Gei D, Treitel S (2010) The velocity of energy through a dissipative medium. Geophysics 75:T37–T47
Diethelm K (2010) The analysis of fractional differential equations. Springer, Berlin
Emmanuel S, Berkowitz B (2007) Continuous time random walks and heat transfer in porous media. Transp Porous Media 67:413–430
Feller W (1952) On a generalization of Marcel Riesz' potentials and the semi-groups generated by them. Meddelanden Lunds Universitets Matematiska Seminarium (Comm. Sém. Mathém. Université de Lund), Tome suppl. dédié à M. Riesz: 73–81
Fulger D, Scalas E, Germano G (2008) Monte Carlo simulation of uncoupled continuous time random walks yielding a stochastic solution of the space-time fractional diffusion equation. Phys Rev E 77:021122
Geiger S, Emmanuel S (2010) Non-Fourier thermal transport in fractured geological media. Water Resour Res 46:W07504
Germano G, Politi M, Scalas E, Schilling RL (2009) Stochastic calculus for uncoupled continuous-time random walks. Phys Rev E 79:066102
Gorenflo R, Mainardi F (2001) Random walk models approximating symmetric space-fractional diffusion processes. In: Elschner J, Gohberg I, Silbermann B (eds) Problems in mathematical physics. Birkhäuser Verlag, Boston/Basel/Berlin
Gorenflo R, Mainardi F (2009) Some recent advances in theory and simulation of fractional diffusion processes. J Comput Appl Math 229:400–415
Gorenflo R, Iskenderov A, Luchko Yu (2000a) Mapping between solutions of fractional diffusion-wave equations. Fract Calc Appl Anal 3:75–86
Gorenflo R, Luchko Yu, Mainardi F (2000b) Wright functions as scale-invariant solutions of the diffusion-wave equation. J Comput Appl Math 118:175–191
Gorenflo R, Loutchko J, Luchko Yu (2002) Computation of the Mittag-Leffler function and its derivatives. Fract Calc Appl Anal 5:491–518
Groesen E, Mainardi F (1989) Energy propagation in dissipative systems, Part I: centrovelocity for linear systems. Wave Motion 11:201–209
Groesen E, Mainardi F (1990) Balance laws and centrovelocity in dissipative systems. J Math Phys 30:2136–2140
Gudehus G, Touplikiotis A (2012) Clasmatic seismodynamics – oxymoron or pleonasm? Soil Dyn Earthq Eng 38:1–14
Gurwich I (2001) On the pulse velocity in absorbing and nonlinear media and parallels with the quantum mechanics. Prog Electromagn Res 33:69–96
Hanyga A (2002) Multi-dimensional solutions of space-time-fractional diffusion equations. Proc R Soc Lond A 458:429–450
Haubold J, Mathai AM, Saxena RK (2011) Mittag-Leffler functions and their applications. J Appl Math 2011:298628
Luchko Yu (1999) Operational method in fractional calculus. Fract Calc Appl Anal 2:463–489

Luchko Yu (2008) Algorithms for evaluation of the Wright function for the real arguments' values. Fract Calc Appl Anal 11:57–75
Luchko Yu (2009a) Boundary value problems for the generalized time-fractional diffusion equation of distributed order. Fract Calc Appl Anal 12:409–422
Luchko Yu (2009b) Maximum principle for the generalized time-fractional diffusion equation. J Math Anal Appl 351:218–223
Luchko Yu (2010) Some uniqueness and existence results for the initial-boundary-value problems for the generalized time-fractional diffusion equation. Comput Math Appl 59:1766–1772
Luchko Yu (2011a) Initial-boundary-value problems for the generalized multi-term time-fractional diffusion equation. J Math Anal Appl 374:538–548
Luchko Yu (2011b) Maximum principle and its application for the time-fractional diffusion equations. Fract Calc Appl Anal 14:110–124
Luchko Yu (2012a) Anomalous diffusion: models, their analysis, and interpretation. In: Rogosin S, Koroleva A (eds) Advances in applied analysis. Series: trends in mathematics. Birkhäuser Verlag, Boston/Basel/Berlin
Luchko Yu (2012b) Initial-boundary-value problems for the one-dimensional time-fractional diffusion equation. Fract Calc Appl Anal 15:141–160
Luchko Yu (2013) Fractional wave equation and damped waves. J Math Phys 54:031505
Luchko Yu, Gorenflo R (1998) Scale-invariant solutions of a partial differential equation of fractional order. Fract Calc Appl Anal 1:63–78
Luchko Yu, Gorenflo R (1999) An operational method for solving fractional differential equations with the Caputo derivatives. Acta Math Vietnam 24:207–233
Luchko Yu, Punzi A (2011) Modeling anomalous heat transport in geothermal reservoirs via fractional diffusion equations. Int J Geomath 1:257–276
Luchko Yu, Mainardi F, Povstenko Yu (2013) Propagation speed of the maximum of the fundamental solution to the fractional diffusion-wave equation. Comput Math Appl 66:774–784
Mainardi F (1994) On the initial-value problem for the fractional diffusion-wave equation. In: Rionero S, Ruggeri T (eds) Waves and stability in continuous media. World Scientific, Singapore
Mainardi F (1996a) Fractional relaxation-oscillation and fractional diffusion-wave phenomena. Chaos Solitons Fractals 7:1461–1477
Mainardi F (1996b) The fundamental solutions for the fractional diffusion-wave equation. Appl Math Lett 9:23–28
Mainardi F, Luchko Yu, Pagnini G (2001) The fundamental solution of the space-time fractional diffusion equation. Fract Calc Appl Anal 4:153–192. E-print http://arxiv.org/abs/cond-mat/0702419
Marichev OI (1983) Handbook of integral transforms of higher transcendental functions, theory and algorithmic tables. Ellis Horwood, Chichester
Matlab File Exchange (2005) Matlab-Code that calculates the Mittag-Leffler function with desired accuracy. Available for download at www.mathworks.com/matlabcentral/fileexchange/8738-mittag-leffler-function
Metzler R, Klafter J (2000) The random walk's guide to anomalous diffusion: a fractional dynamics approach. Phys Rep 339:1–77
Metzler R, Klafter J (2004) The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics. J Phys A 37:161–208
Metzler R, Nonnenmacher TF (2002) Space- and time-fractional diffusion and wave equations, fractional Fokker-Planck equations, and physical motivation. Chem Phys 284:67–90

Page 35 of 36

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_60-3 © Springer-Verlag Berlin Heidelberg 2014

Montroll E, Weiss, G (1965) Random walks on lattices. J Math Phys 6:167 Näsholm SP, Holm S (2013) On a fractional Zener elastic wave equation. Fract Calc Appl Anal 16:26–50 Podlubny I (1999) Fractional differential equations. Academic, San Diego Prudnikov AP, Brychkov YA, Marichev OI (1986) Integrals and series. Vol 1: Elementary functions. Gordon and Breach, New York Samko SG, Kilbas AA, Marichev OI (1993) Fractional integrals and derivatives: theory and applications. Gordon and Breach, Yverdon Smith RL (1970) The velocities of light. Am J Phys 38:978–984 Szabo TL, Wu J (2000) A model for longitudinal and shear wave propagation in viscoelastic media. J Acoust Soc Am 107:2437–2446 Vladimirov VS (1971) Equations of the mathematical physics. Nauka, Moscow


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_61-1 © Springer-Verlag Berlin Heidelberg 2014

Radial Basis Function-Generated Finite Differences: A Mesh-Free Method for Computational Geosciences
Natasha Flyer (a), Grady B. Wright (b) and Bengt Fornberg (c)
(a) Institute for Mathematics Applied to Geosciences, National Center for Atmospheric Research, Boulder, CO, USA
(b) Department of Mathematics, Boise State University, Boise, ID, USA
(c) Department of Applied Mathematics, University of Colorado, Boulder, CO, USA

Abstract Radial basis function-generated finite differences (RBF-FD) is a mesh-free method for numerically solving partial differential equations that emerged in the last decade and has shown rapid growth in the last few years. From a practical standpoint, RBF-FD sprouted out of global RBF methods, which have shown exceptional numerical qualities in terms of accuracy and time stability for numerically solving PDEs but are not practical when scaled to very large problem sizes because of their computational cost and memory requirements. RBF-FD bypasses these issues by using local approximations for derivatives instead of global ones. Matrices in the RBF-FD methodology go from being completely full to 99 % empty. Of course, the sacrifice is the exchange of spectral accuracy from the global RBF methods for high-order algebraic convergence of RBF-FD, assuming smooth data. However, since natural processes are almost never infinitely differentiable, little is lost and much is gained in terms of memory and runtime. This chapter provides a survey of a group of topics relevant to using RBF-FD for a variety of problems that arise in the geosciences. Particular emphasis is given to problems in spherical geometries, both on surfaces and within a volume. Applications discussed include nonlinear shallow water equations on a sphere, reaction–diffusion equations, the global electric circuit, and mantle convection in a spherical shell. The results from the last three of these applications are new and have not been presented before for RBF-FD.

1 Introduction to the Concept of Radial Basis Functions (RBFs)
The motivation for the RBF method originated with R.L. Hardy asking the question, “Given a set of sparse scattered data, $\{f_j\}_{j=1}^{N}$, at the node locations $\{\mathbf{x}_j\}_{j=1}^{N} \subset \mathbb{R}^d$, can an interpolant be constructed that adequately represents the unknown surface?” (Hardy 1971). It was first shown by Mairhuber (1956) that, in more than one dimension, interpolation is not well posed when using an expansion of basis functions, $\{\psi_j(\mathbf{x})\}_{j=1}^{N}$, $\mathbf{x} \in \mathbb{R}^d$, $d \ge 2$, that are independent of the node locations. That is, there exist an infinite number of node configurations that will yield a singular interpolation problem. Hardy bypassed this singularity problem by constructing the interpolant from a basis function set consisting of translates of a single radially symmetric function, with one centered at each data location. By giving up orthogonality, well posedness of the interpolant and its derivatives for any set of distinct scattered nodes in any dimension is guaranteed. Although unconditional nonsingularity of the interpolation problem was known early in some special cases (Bochner 1933; Schoenberg 1938), the proof in 1986 of guaranteed nonsingularity for multiquadric (MQ) RBFs (Micchelli 1986) accelerated the further development and acceptance of RBFs. Pioneering work by M.J.D. Powell and his collaborators at the University of Cambridge also played a major role in the early history of RBFs (Powell 1992).

Piecewise smooth RBFs feature a jump in some derivative and thus can only lead to algebraic convergence. For example, the radial cubic $|r|^3$, where $r = \|\mathbf{x} - \mathbf{x}_j\|$ is the Euclidean norm, has a jump in the third derivative at $\mathbf{x} = \mathbf{x}_j$, leading to fourth-order convergence in 1D, with the order of convergence increasing as the dimension increases (cf. Powell 1992). On the other hand, interpolating with infinitely smooth RBFs, such as $\sqrt{1 + (\varepsilon r)^2}$, $\exp(-(\varepsilon r)^2)$, and $1/(1 + (\varepsilon r)^2)$, will lead to spectral convergence (Madych and Nelson 1992; Yoon 2001). Table 1 shows commonly used RBFs, noting that infinitely smooth RBFs depend on a shape parameter $\varepsilon$. It was first shown by Driscoll and Fornberg (2002) that, in 1D, in the limit of $\varepsilon \to 0$ (i.e., flat RBFs), the RBF methodology reproduces pseudospectral (PS) methods if the nodes are placed accordingly (i.e., equispaced nodes for Fourier methods, Gauss–Chebyshev nodes for Chebyshev methods, etc.). Similarly, on the surface of a sphere, Fornberg and Piret (2007) showed that, in the limit of $\varepsilon \to 0$, RBFs reproduce spherical harmonics in the sense that they span an equivalent space for any scattered node set. Later in the chapter, algorithms for computing in the near-flat RBF limit will be discussed.

Table 1 Some common choices for radial functions

| Type of basis function | Radial function $\phi(r)$ |
|---|---|
| Piecewise smooth RBFs | |
| Generalized Duchon spline (GDS) | $r^{2m}\log r$, $m \in \mathbb{N}$; $r^{2m}$, $m > 0$ and $m \notin \mathbb{N}$ |
| Matern | $\frac{2^{1-m}}{\Gamma(m)}\, r^m K_m(r)$, $m > 0$ (Bessel K-function). Special cases: $e^{-r}$, $e^{-r}(1 + r)$ for $m = \frac{1}{2}, \frac{3}{2}$ |
| Compact support (“Wendland”) | $(1 - r)_+^m\, p(r)$; $p$ certain polynomials, $m \in \mathbb{N}$ |
| Infinitely smooth RBFs | |
| Gaussian (GA) | $e^{-(\varepsilon r)^2}$ |
| Multiquadric (MQ) | $\sqrt{1 + (\varepsilon r)^2}$ |
| Inverse multiquadric (IMQ) | $1/\sqrt{1 + (\varepsilon r)^2}$ |
| Inverse quadratic (IQ) | $1/(1 + (\varepsilon r)^2)$ |
| Sech (SH) | $\operatorname{sech}(\varepsilon r)$ |
| Bessel (BE) ($d = 1, 2, \ldots$) | $J_{d/2-1}(\varepsilon r)/(\varepsilon r)^{d/2-1}$ |

2 RBF-Generated Finite Difference (RBF-FD) Approximations
Before delving into differentiation, interpolation is discussed, as it can be considered a zeroth-order differentiation. RBF interpolation is based on a linear combination of translates of a single radially symmetric function, $\phi(\|\mathbf{x} - \mathbf{x}_k\|)$, that collocates the data $\{f_k\}_{k=1}^{N}$ at the nodes $\{\mathbf{x}_k\}_{k=1}^{N}$, $\mathbf{x}_k \in \mathbb{R}^d$, and is given by

$$s(\mathbf{x}) = \sum_{k=1}^{N} c_k\, \phi(\|\mathbf{x} - \mathbf{x}_k\|), \tag{1}$$

where $\|\cdot\|$ denotes the $\ell_2$ norm. The expansion coefficients, $c_k$, can be found by inverting the matrix, $A$, in (2):

$$\underbrace{\begin{bmatrix} \phi(\|\mathbf{x}_1 - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_1 - \mathbf{x}_2\|) & \cdots & \phi(\|\mathbf{x}_1 - \mathbf{x}_N\|) \\ \phi(\|\mathbf{x}_2 - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_2 - \mathbf{x}_2\|) & \cdots & \phi(\|\mathbf{x}_2 - \mathbf{x}_N\|) \\ \vdots & \vdots & \ddots & \vdots \\ \phi(\|\mathbf{x}_N - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_N - \mathbf{x}_2\|) & \cdots & \phi(\|\mathbf{x}_N - \mathbf{x}_N\|) \end{bmatrix}}_{\text{Interpolation matrix } A} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_N \end{bmatrix}. \tag{2}$$
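As an illustration of (1) and (2), the following minimal Python sketch builds the interpolation matrix for Gaussian RBFs and solves for the expansion coefficients. The node set (a slightly jittered 6 x 6 lattice), the test data, and the value of the shape parameter are assumptions chosen only so that the system is well conditioned; the function names are hypothetical.

```python
import numpy as np

def gaussian(r, eps):
    """Gaussian RBF, phi(r) = exp(-(eps*r)^2)."""
    return np.exp(-(eps * r) ** 2)

def rbf_interpolant(nodes, values, eps):
    """Return s(x) from (1), with coefficients obtained by solving (2)."""
    # A[i, j] = phi(||x_i - x_j||)
    dist = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    coeffs = np.linalg.solve(gaussian(dist, eps), values)

    def s(x):
        x = np.atleast_2d(x)
        r = np.linalg.norm(x[:, None, :] - nodes[None, :, :], axis=-1)
        return gaussian(r, eps) @ coeffs

    return s

# Assumed test setup: jittered lattice in the unit square, smooth data.
rng = np.random.default_rng(0)
g = np.linspace(0.0, 1.0, 6)
X, Y = np.meshgrid(g, g)
nodes = np.column_stack([X.ravel(), Y.ravel()]) + rng.uniform(-0.03, 0.03, (36, 2))
values = np.sin(2 * nodes[:, 0]) * np.cos(3 * nodes[:, 1])

s = rbf_interpolant(nodes, values, eps=8.0)
max_node_residual = np.max(np.abs(s(nodes) - values))
print("max residual at the nodes:", max_node_residual)
```

Solving the linear system directly, rather than forming the inverse of $A$, is the usual numerical choice here; for small shape parameters this direct approach becomes ill conditioned.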

Since RBFs only depend on a scalar distance (defined by the $\ell_2$ norm), the form of (2) is independent of coordinate system, dimension, and domain geometry. As an example, even if node locations on a sphere are given in spherical coordinates, no such grids need to be used. Distances are measured straight through the sphere and not along great arcs. That is, the argument of the RBF centered at $(x_1, y_1, z_1)$ and evaluated at $(x_2, y_2, z_2)$ is

$$r = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} = \sqrt{2\left(1 - \mathbf{x}_2^T \mathbf{x}_1\right)} = \sqrt{2\left(1 - \cos\theta_2 \cos\theta_1 \cos(\lambda_2 - \lambda_1) - \sin\theta_2 \sin\theta_1\right)},$$

where $\theta_{1,2}$, $\lambda_{1,2}$ are the respective latitude and longitude of the points.

The RBF differentiation matrix (DM), $D_N$, is derived by applying the analytic derivative operator $L$ to the RBF interpolant and evaluating it at the desired node locations. As an example on a sphere, to approximate $L$ at $(\lambda, \theta) = (\lambda_i, \theta_i)$,

$$L s(\lambda_i, \theta_i) = \sum_{k=1}^{N} c_k \underbrace{\left[L\phi_k(r)\right]\big|_{(\lambda,\theta) = (\lambda_i,\theta_i)}}_{\text{Components of } B}, \quad i = 1, \ldots, N,$$

$$\phantom{L s(\lambda_i, \theta_i)} = B\mathbf{c} = \left(BA^{-1}\right)\mathbf{f} = D_N \mathbf{f}, \tag{3}$$

where $\mathbf{f}$ contains the $N$ discrete data values at the node locations and $\mathbf{c}$ contains the $N$ discrete expansion coefficients, formally given by $\mathbf{c} = A^{-1}\mathbf{f}$, where $A^{-1}$ is the inverse of the RBF interpolation matrix defined in (2). The above derivation in (3) is for a global RBF DM. That is, to calculate the discretized derivative operator at one node, all the nodes in the domain are used. This is analogous to a spectral collocation method. Thus, for smooth data and infinitely smooth RBFs, spectral convergence can be expected.

Deriving RBF-FD DMs is very similar, except that for each stencil, only $n$ nodes out of the $N$ nodes in the domain are used. Similar to regular finite differences, as the stencil size $n$ increases, so does the rate of algebraic convergence. Now, the differential operator $L$ is approximated at the node $\mathbf{x}_c$ by a linear combination of the function values, $u_k$, at the neighboring $n - 1$ node locations, $\mathbf{x}_k$. In other words, $\sum_{k=1}^{n} a_k u_k \approx L u_c$. The differentiation weights, $a_k$, are calculated by enforcing that this linear combination should be exact for RBFs, $\{\phi(\|\mathbf{x} - \mathbf{x}_k\|)\}_{k=1}^{n}$, centered at each of the node locations $\{\mathbf{x}_k\}_{k=1}^{n}$ (classical FD would enforce that it be exact for polynomials instead). It has also been shown through experience and studies (Fornberg et al. 2002; Wright and Fornberg 2006; Fornberg and Lehto 2011) that better accuracy is gained by the interpolant being able to reproduce a constant. Hence, the constraint $\sum_{k=1}^{n} a_k = L1|_{\mathbf{x} = \mathbf{x}_c} = 0$ is added. Combining these constraints, the solution for the RBF-FD weights can be determined from the following linear system:

$$\begin{bmatrix} \phi(\|\mathbf{x}_1 - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_1 - \mathbf{x}_2\|) & \cdots & \phi(\|\mathbf{x}_1 - \mathbf{x}_n\|) & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \phi(\|\mathbf{x}_n - \mathbf{x}_1\|) & \phi(\|\mathbf{x}_n - \mathbf{x}_2\|) & \cdots & \phi(\|\mathbf{x}_n - \mathbf{x}_n\|) & 1 \\ 1 & 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} a_1 \\ \vdots \\ a_n \\ a_{n+1} \end{bmatrix} = \begin{bmatrix} L\phi(\|\mathbf{x} - \mathbf{x}_1\|)|_{\mathbf{x} = \mathbf{x}_c} \\ \vdots \\ L\phi(\|\mathbf{x} - \mathbf{x}_n\|)|_{\mathbf{x} = \mathbf{x}_c} \\ 0 \end{bmatrix}, \tag{4}$$

where $a_{n+1}$ is not actually used in the RBF-FD approximation after the system is solved. Solving this system once gives one row of the RBF-FD DM. For a total of $N$ nodes on the sphere, there will be $N$ linear systems of size $(n+1) \times (n+1)$ to solve, resulting in a preprocessing cost of $O(n^3 N)$. Each subsequent matrix-vector multiplication will then cost only $O(nN)$ operations. In high-resolution computations, $N \gg n$, and thus, the cost to time step the RBF-FD method is $O(nN)$. This results in a significant speedup from global RBFs, which require $O(N^3)$ operations to create the DMs and $O(N^2)$ to time step. In general, a kd-tree algorithm is used to find the $n - 1$ nearest neighbors when calculating the differentiation weights for the stencils. MATLAB conveniently provides a kd-tree search through the function “knnsearch” in its Statistics Toolbox. RBF-FD matrices are very sparse, generally about 99 % empty. Thus, in order to reduce the bandwidth of the matrix as well as index the entries effectively in memory, a reverse Cuthill–McKee ordering is applied to the RBF-FD DM via the MATLAB function “symrcm.”
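The assembly of an RBF-FD differentiation matrix from system (4) can be sketched as follows, here in Python for the 2D Laplacian with Gaussian RBFs on a planar node set. This is a minimal sketch under several assumptions: the node set, stencil size, and shape parameter are illustrative, the neighbor search is a brute-force sort standing in for a kd-tree query, and the matrix is stored densely for readability even though a sparse format would be used in practice. The expression for the Laplacian of the Gaussian is the standard closed form $4\varepsilon^2((\varepsilon r)^2 - 1)e^{-(\varepsilon r)^2}$.

```python
import numpy as np

def ga(r, eps):
    """Gaussian RBF, phi(r) = exp(-(eps*r)^2)."""
    return np.exp(-(eps * r) ** 2)

def ga_laplacian_2d(r, eps):
    """2D Laplacian of the Gaussian RBF, written as a function of r."""
    return 4 * eps**2 * ((eps * r) ** 2 - 1) * ga(r, eps)

def rbf_fd_weights(stencil, center, eps):
    """Weights a_1..a_n from system (4) with L = 2D Laplacian."""
    n = len(stencil)
    dist = np.linalg.norm(stencil[:, None, :] - stencil[None, :, :], axis=-1)
    M = np.ones((n + 1, n + 1))          # last row/column of ones
    M[:n, :n] = ga(dist, eps)
    M[n, n] = 0.0
    rhs = np.zeros(n + 1)
    rhs[:n] = ga_laplacian_2d(np.linalg.norm(stencil - center, axis=1), eps)
    return np.linalg.solve(M, rhs)[:n]   # a_{n+1} is discarded

def rbf_fd_matrix(nodes, n, eps):
    """One row of weights per node; dense storage for readability only."""
    N = len(nodes)
    D = np.zeros((N, N))
    for i in range(N):
        # Brute-force nearest neighbors (the node itself comes first).
        idx = np.argsort(np.linalg.norm(nodes - nodes[i], axis=1))[:n]
        D[i, idx] = rbf_fd_weights(nodes[idx], nodes[i], eps)
    return D

# Assumed test setup: jittered 12 x 12 lattice in the unit square.
rng = np.random.default_rng(0)
g = np.linspace(0.0, 1.0, 12)
X, Y = np.meshgrid(g, g)
nodes = np.column_stack([X.ravel(), Y.ravel()])
nodes += rng.uniform(-0.02, 0.02, nodes.shape)

D = rbf_fd_matrix(nodes, n=9, eps=4.0)
row_sums = D.sum(axis=1)                 # should vanish by the constraint row
fill = np.count_nonzero(D) / D.size      # fraction of nonzero entries
print("largest |row sum|:", np.abs(row_sums).max(), " fill:", fill)
```

Each row holds at most $n$ nonzero weights, which is the source of the sparsity noted above; the row sums vanish (up to rounding) because of the added constraint.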

2.1 Limit of Flat Basis Functions
With $\varepsilon$ available as a free parameter, an obvious numerical test is to see how its choice influences the accuracy that is obtained. A typical experiment is shown in Fig. 1. The error is observed to decrease rapidly as $\varepsilon$ decreases, until the calculation suddenly breaks down due to the ill conditioning of RBF-Direct, i.e., evaluation of (2) followed by (1). This may suggest that a tradeoff will be required between accuracy and numerical conditioning (described as an uncertainty principle in Schaback 1995). It was however realized shortly afterward that the RBF interpolation problem itself actually does not become ill conditioned in the flat basis function limit (Driscoll and Fornberg 2002; Fornberg et al. 2004). Instead, it is only the RBF-Direct algorithm that then becomes an ill-conditioned numerical procedure for a problem that remains well conditioned. Several well-conditioned numerical algorithms were subsequently developed. Using these, it was soon found that RBF errors often become extremely low before leveling off or bouncing back up, cf. Fig. 2. Sometimes, the most accurate $\varepsilon$-range can be reached already with RBF-Direct, while at other times it requires a stable algorithm (or more costly extended precision arithmetic). If the nodes are latticelike, it can happen that the RBF interpolant diverges as $\varepsilon \to 0$ (although never in the GA case; Schaback 2005). For node sets with some irregularity, the interpolant will in the flat limit take the form of a multivariate polynomial (Driscoll and Fornberg 2002; Fornberg et al. 2004). One reason that small $\varepsilon$ often is better than $\varepsilon \to 0$ is that, with RBF interpolants converging to polynomials, the boundary accuracy often deteriorates (the polynomial Runge phenomenon).
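The growth of the condition number of the RBF-Direct matrix as the basis functions flatten can be observed with a few lines of Python. The sketch below is only an illustration under assumed settings (ten random nodes in the unit square, Gaussian RBFs, three values of the shape parameter); it does not reproduce the experiments of Figs. 1 and 2.

```python
import numpy as np

def ga_matrix(nodes, eps):
    """Interpolation matrix A of (2) for Gaussian RBFs."""
    d = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    return np.exp(-(eps * d) ** 2)

rng = np.random.default_rng(1)
nodes = rng.random((10, 2))        # assumed scattered nodes in the unit square
eps_values = [4.0, 2.0, 1.0]       # progressively flatter basis functions
conds = [np.linalg.cond(ga_matrix(nodes, e)) for e in eps_values]

for e, c in zip(eps_values, conds):
    print(f"eps = {e:3.1f}   cond(A) = {c:.2e}")
```

The condition number increases each time the shape parameter is halved, even though the interpolation problem itself stays well conditioned.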


[Figure 1: panel (a) shows the nodes in the (x, y) plane; panel (b) shows |Error| on a logarithmic scale versus ε]

Fig. 1 (a) A set of 41 scattered nodes in the unit circle. (b) The error in max norm when the test function $f(x, y) = 59 \big/ \left(67 + \left(x + \frac{1}{7}\right)^2 + \left(y - \frac{1}{11}\right)^2\right)$ is interpolated using these nodes, displayed as a function of the shape parameter $\varepsilon$ (Reproduced from Fornberg and Wright 2004)

Fig. 2 (a) Test function $f(\mathbf{x}) = e^{-7\left(x + \frac{1}{2}\right)^2 - 8\left(y + \frac{1}{2}\right)^2 - 9\left(z - \frac{1}{\sqrt{2}}\right)^2}$, (b) $n = 1{,}849$ ME (minimal energy) nodes, (c) Interpolation errors (in max norm) when using RBF-Direct vs. using the stable RBF-QR algorithm

2.2 The Ill Conditioning of the A-Matrix
Sideways translates of near-flat basis functions all look the same, and it is intuitively obvious that they must form a very ill-conditioned basis to expand in. Just how bad it is can readily be quantified (Fornberg and Zuev 2007). For example, for scattered nodes in 2D, the eigenvalues of the A-matrix form distinct groups, following the specific pattern

$$\{O(1)\},\ \{O(\varepsilon^2), O(\varepsilon^2)\},\ \{O(\varepsilon^4), O(\varepsilon^4), O(\varepsilon^4)\},\ \{O(\varepsilon^6), O(\varepsilon^6), O(\varepsilon^6), O(\varepsilon^6)\},\ \ldots \tag{5}$$

until the last eigenvalue is reached (causing the last group to possibly contain fewer eigenvalues than the general pattern would suggest). Different choices of scattered node locations or of RBF types (IQ, MQ, or GA) make no difference in this regard; lattice-based nodes or Bessel-type RBFs form exceptions. More concisely, the eigenvalue pattern above can be written as

$$1, 2, 3, 4, 5, 6, \ldots, \tag{6}$$

indicating how many eigenvalues there are of orders $\varepsilon^0, \varepsilon^2, \varepsilon^4, \varepsilon^6, \varepsilon^8, \varepsilon^{10}$, etc. Table 2 shows some more of such sequences. Given these patterns, one can immediately calculate the orders of both $\operatorname{cond}(A)$ and $\det(A) = \prod_{k=1}^{n} \lambda_k$ as functions of $n$. For the examples in Figs. 1 and 2, $\operatorname{cond}(A)$ becomes equal to $O(\varepsilon^{-16})$ and $O(\varepsilon^{-84})$, respectively.


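Table 2 Numbers of eigenvalues of different sizes (powers of $\varepsilon$) for different geometries

| Geometry | Power of $\varepsilon$ = 0 | 2 | 4 | 6 | 8 | 10 | 12 | 14 | ... |
|---|---|---|---|---|---|---|---|---|---|
| 1D nonperiodic | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... |
| 1D on circle periphery | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | ... |
| 2D nonperiodic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ... |
| 2D on spherical surface | 1 | 3 | 5 | 7 | 9 | 11 | 13 | 15 | ... |
| 3D nonperiodic | 1 | 3 | 6 | 10 | 15 | 21 | 28 | 36 | ... |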
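The step from the sequences in Table 2 to the order of $\operatorname{cond}(A)$ is a simple counting argument: the eigenvalues are assigned to the groups in order until all $n$ of them are used, and the power of $\varepsilon$ of the last group reached gives the order of the smallest eigenvalue. A minimal Python sketch of this count follows; the function name and the lambda expressions for the group sizes are hypothetical helpers, written to match the 2D nonperiodic and spherical-surface rows of Table 2.

```python
from itertools import count

def smallest_eigenvalue_power(n, group_size):
    """Return p such that the smallest of n eigenvalues is O(eps**p).

    group_size(j) is the number of eigenvalues of order eps**(2*j).
    """
    remaining = n
    for j in count():
        remaining -= group_size(j)
        if remaining <= 0:
            return 2 * j

# Group sizes read off Table 2 (j = 0, 1, 2, ...).
two_d_nonperiodic = lambda j: j + 1        # 1, 2, 3, 4, ...
spherical_surface = lambda j: 2 * j + 1    # 1, 3, 5, 7, ...

p_fig1 = smallest_eigenvalue_power(41, two_d_nonperiodic)
p_fig2 = smallest_eigenvalue_power(1849, spherical_surface)
print(p_fig1, p_fig2)   # cond(A) = O(eps**(-p)), since the largest eigenvalue is O(1)
```

For $n = 41$ nodes in 2D and $n = 1{,}849$ nodes on the sphere, this count gives the powers 16 and 84 quoted in the text.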

2.3 Overview of Some Computational Options
In cases when RBF-Direct is too ill conditioned, the most straightforward approach is to resort to extended precision arithmetic. The only drawback is that the cost usually becomes excessive. Given the results quoted in the previous section, one can readily determine in advance just how many digits of precision would be needed as a function of $n$ and $\varepsilon$ in various geometrical settings. Some types of preconditioning and SVD enhancements have been suggested for the RBF-Direct approach. While preconditioning can speed up certain iterative procedures (cf. Chapter 34 in Fasshauer 2007), these approaches do not address the issue that significant information is lost already when the coefficient matrix $A$ is formed (with all its entries virtually the same when $\varepsilon$ is small). Recovery of such missing information is challenging or impossible. Stable algorithms use computational steps that remain numerically well conditioned all the way into the $\varepsilon \to 0$ limit (and therefore require only standard double precision arithmetic, no matter how small $\varepsilon$ is). So far, two main classes of stable algorithms have been proposed. The first realizations of these were denoted Contour-Padé (Fornberg and Wright 2004) and RBF-QR (Fornberg and Piret 2007), respectively. Related to the latter is the recent RBF-GA algorithm (Fornberg et al. 2013).

2.3.1 The Contour-Padé Algorithm
Although $\varepsilon$ is a real-valued quantity, nothing stops us from considering it also for complex values. Focusing on the GA case, it can be shown that the interpolant $s(\mathbf{x}, \varepsilon)$, for any fixed evaluation point $\mathbf{x}$, then becomes a meromorphic function of $\varepsilon$ (i.e., with poles as its only singularities across the finite complex $\varepsilon$-plane). Furthermore, it is known that $s(\mathbf{x}, \varepsilon)$ remains finite as $\varepsilon \to 0$. The origin $\varepsilon = 0$ must therefore be a removable singularity of $s(\mathbf{x}, \varepsilon)$, i.e., there is no actual singularity in the flat basis function limit as far as the RBF interpolant is concerned. The Contour-Padé algorithm is based on Cauchy’s integral theorem, allowing the evaluation of an analytic function at a point (such as $\varepsilon = 0$) using an integration path that does not need to come close to it. The path can thus follow such a large circle in the $\varepsilon$-plane that RBF-Direct can safely be used along it.

2.3.2 The RBF-QR Algorithm
As noted above, translates of near-flat RBFs form a basis that is ill suited for immediate numerical use. This naturally raises the question of whether the underlying approximation space also is bad, or whether all conditioning issues can be resolved by just finding an alternate good basis in exactly the same space. This latter case turns out to hold true, leading to the follow-up issue of how one can carry out the basis conversion by analytic means also in scattered node cases, i.e., so that no numerical cancelations will arise in the process. The RBF-QR method offers a systematic approach for this, first implemented for nodes on the surface of a sphere (Fornberg and Piret 2007) and more recently for node sets in 1D, 2D, and 3D (although then limited to GA RBFs; Fornberg et al. 2011).

2.3.3 The RBF-GA Algorithm
The RBF-QR algorithm involves extensive manipulations of Taylor expansions. While these can be truncated as needed, the RBF-GA algorithm exploits the fact that exact remainders can be expressed as incomplete gamma functions. This leads to a stable algorithm that is free from both infinite expansions and inexact truncations. It applies to GA RBFs in any number of dimensions and is presently the fastest stable option available (at around 10 times the cost of RBF-Direct). Although it may be slightly less accurate than RBF-QR in some cases (such as for large latticelike node sets), it is nevertheless well suited for generating RBF-FD approximations.
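The contour-integration idea behind the Contour-Padé algorithm of Sect. 2.3.1 can be illustrated on a toy function. The Python sketch below is not the Contour-Padé algorithm itself: it omits the Padé step that handles poles inside the contour and uses the assumed stand-in function $\sin(z)/z$, which, like the interpolant at $\varepsilon = 0$, has a removable singularity at the origin and cannot be evaluated there directly. By Cauchy's integral formula, the value at the center of a circle equals the mean of the function over the circle, which the equispaced trapezoidal rule approximates very accurately.

```python
import numpy as np

def value_at_origin(g, radius=1.0, m=64):
    """Approximate g(0) by averaging g over m equispaced points on |z| = radius.

    Valid when g is analytic inside and on the circle (no poles enclosed).
    """
    z = radius * np.exp(2j * np.pi * np.arange(m) / m)
    return np.mean(g(z))

# Assumed stand-in for the interpolant: removable singularity at z = 0,
# where direct evaluation would give 0/0; the limit value is 1.
g = lambda z: np.sin(z) / z

approx = value_at_origin(g)
print(approx)
```

The function is only ever sampled at distance 1 from the origin, mirroring how RBF-Direct is used along a circle on which it is still well conditioned.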

2.4 Time Stabilization: Hyperviscosity
Stability issues for RBF-FD emanate from the fact that the natural intrinsic irregularity of the RBF-FD stencils causes eigenvalues of the DM to scatter into the right half of the complex plane. This becomes a hurdle to the RBF-FD method when (1) solving naturally dissipation-free PDEs, such that even a very mild numerical scatter of the eigenvalues into the right half complex plane can cause severe instability, and (2) using large RBF-FD stencils, since as the stencil size increases so does the scatter of eigenvalues. This latter point is even an issue for systems with dissipation, in which case the scatter might be too large for the natural dissipation to control. Stabilization of the RBF-FD method is achieved by applying a hyperviscosity filter, which is a high-order Laplacian $\Delta^k$, $k \in \mathbb{N}$, $k > 1$, to the right-hand side (RHS) of the system of PDEs being solved. Since the stencils for the hyperviscosity are the same size as those used in discretizing the spatial operators and are simply added to the RHS, there is no additional cost per time step (Fornberg and Lehto 2011; Flyer et al. 2012). If GA RBFs $\phi(r) = e^{-(\varepsilon r)^2}$ are used for creating the hyperviscosity stencils, the values for $\Delta^k \phi(r)$ are available explicitly, thanks to the relation

$$\Delta^k \phi(r) = \varepsilon^{2k}\, p_k(r)\, \phi(r). \tag{7}$$

Here, $k$ is the order of the Laplacian, and $p_k(r)$ are multiples of generalized Laguerre polynomials that are generated recursively (see Section 3.2, Fornberg and Lehto 2011). A 2D Laplacian operator is assumed when working on the surface of the sphere since a local stencil can be viewed as planar. Applying hyperviscosity leaves the physically relevant eigenvalues largely intact but shifts all the spurious ones of the PDE system to the left half of the complex plane. This shift is controlled by $k$, the order of the Laplacian, and a scaling parameter $\gamma_c$, defined by

$$H = \gamma \Delta^k = \gamma_c N^{-k} \Delta^k,$$

where $N$ is the total number of nodes in the domain. Given a choice of $N$, $n$, and $\varepsilon$, it was found experimentally that $\gamma_c$ ranging from $O(1)$ to $O(10^{2})$ provides stability with good accuracy for PDEs with convective operators (Flyer et al. 2012). In general, the larger the stencil size, the higher the order of the Laplacian. This is attributed to the fact that as the stencil size $n$ increases, the accuracy of the RBF-FD method increases; that is, a wider range of physical modes is represented accurately. As a result, the hyperviscosity operator should preserve as much of that range as possible; as $k$ increases, more physical modes are preserved correctly.
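Relation (7) can be made concrete for the 2D Laplacian. Writing $x = (\varepsilon r)^2$ and $\Delta^k \phi = \varepsilon^{2k} q_k(x)\, e^{-x}$, applying the radial 2D Laplacian once more gives the recursion $q_{k+1} = 4\left[x q_k'' + (1 - 2x) q_k' + (x - 1) q_k\right]$ with $q_0 = 1$. This recursion is derived here for illustration and is not taken from Fornberg and Lehto (2011), whose formulation may differ; the function names in the Python sketch below are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def ga_hyperviscosity_polys(kmax):
    """Polynomials q_0..q_kmax in x = (eps*r)^2 with
    Laplacian^k exp(-x) = eps^(2k) * q_k(x) * exp(-x) in 2D."""
    x = Polynomial([0.0, 1.0])
    q = [Polynomial([1.0])]
    for _ in range(kmax):
        p = q[-1]
        q.append(4 * (x * p.deriv(2) + (1 - 2 * x) * p.deriv(1) + (x - 1) * p))
    return q

def ga_laplacian_power(r, eps, k):
    """Evaluate Laplacian^k of the Gaussian RBF at radius r (2D)."""
    q = ga_hyperviscosity_polys(k)[k]
    return eps ** (2 * k) * q((eps * r) ** 2) * np.exp(-(eps * r) ** 2)

# Sanity check of k = 1 against a finite-difference radial Laplacian.
r0, eps, h = 0.7, 1.3, 1e-4
phi = lambda r: np.exp(-(eps * r) ** 2)
fd = (phi(r0 + h) - 2 * phi(r0) + phi(r0 - h)) / h**2 \
     + (phi(r0 + h) - phi(r0 - h)) / (2 * h * r0)
fd_error = abs(fd - ga_laplacian_power(r0, eps, 1))

q2 = ga_hyperviscosity_polys(2)[2]   # equals 16 x^2 - 64 x + 32
print(fd_error, q2(0.5))
```

The first few polynomials are $q_1 = 4(x - 1)$ and $q_2 = 16x^2 - 64x + 32$, which are multiples of the Laguerre polynomials $L_1$ and $L_2$, consistent with the statement following (7).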

3 Implicit (Compact) RBF-FD Approximations
Since a derivative is a “local” property of a function, there is something intuitively contradictory about enhancing the order of a FD approximation by invoking data located increasingly far away. When the task is to solve a PDE (rather than just to approximate an operator), compact approximations offer a different opportunity for improving the order of accuracy. For finite differences, the concept has a long history (Fox 1947; Collatz 1960) with several more recent enhancements available (such as to nonlinear PDEs in 2D and 3D, etc.; Gupta 1991; Lele 1992; Li et al. 1995; Zhai et al. 2013). Before considering compact approximations in scattered node RBF-FD cases, the basic idea is illustrated in the case of approximating $\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$ on a 2D lattice, with spacing $h$ in each direction. The most obvious FD approximation can be written as

$$\begin{bmatrix} & 1 & \\ 1 & -4 & 1 \\ & 1 & \end{bmatrix} u / h^2 = \Delta u + O(h^2). \tag{8}$$

Using only a $3 \times 3$ stencil size, it is impossible to find weights that improve the accuracy above second order. Extending the stencil to 5 weights in both directions permits fourth order, but causes problems when solving the PDE $\Delta u = f$:
• The center weight becomes smaller in magnitude than the sum of magnitudes of the remaining weights, i.e., diagonal dominance is lost. This damages the convergence rate of many iterative schemes, and it also opens up the possibility of system singularities.
• Wider stencils need more boundary information than what is readily available.

Another way to approximate $\Delta u$ is given by

$$\begin{bmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{bmatrix} u / (6h^2) = \Delta u + O(h^2), \tag{9}$$

with no immediately obvious advantage over (8). However, Taylor expansions will reveal that, for solving $\Delta u = f$,

$$\begin{bmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{bmatrix} u / (6h^2) = \begin{bmatrix} & 1 & \\ 1 & 8 & 1 \\ & 1 & \end{bmatrix} f / 12 + O(h^4),$$

and for solving $\Delta u = 0$,

$$\begin{bmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{bmatrix} u / (6h^2) = 0 + O(h^6).$$

The latter two approximations suffer from neither of the problems noted above but nevertheless achieve significantly improved levels of accuracy.

The weights $a_i$ in “regular” RBF-FD approximations of a linear operator $L$ at a node $\mathbf{x}_1$ are obtained by requiring that $Lu|_{\mathbf{x} = \mathbf{x}_1} = \sum_{i=1}^{n} a_i u_i$ is exact for all the RBFs $\phi(\|\mathbf{x} - \mathbf{x}_i\|)$, leading to an $n \times n$ linear system for the weights (the system represented in (4) without the column and row of ones). For compact RBF-FD approximations, first described in Wright and Fornberg (2006), we again consider a RBF-FD stencil centered at $\mathbf{x}_1$, with further nodes $\mathbf{x}_2, \ldots, \mathbf{x}_n$, and then repeat $m \le n - 1$ of the latter ones: $\hat{\mathbf{x}}_1, \ldots, \hat{\mathbf{x}}_m$. Wishing to solve $Lu = f$, we extend the basis function set to also include $L\phi(\|\mathbf{x} - \hat{\mathbf{x}}_j\|)$ and require $Lu|_{\mathbf{x} = \mathbf{x}_1} = \sum_{i=1}^{n} a_i\, u|_{\mathbf{x} = \mathbf{x}_i} + \sum_{j=1}^{m} b_j\, Lu|_{\mathbf{x} = \hat{\mathbf{x}}_j}$ to hold for them all. This leads to an $(n + m) \times (n + m)$ linear system for the weights $a_i$ and $b_j$. With $Lu|_{\mathbf{x} = \mathbf{x}_1} = f_1$, $u|_{\mathbf{x} = \mathbf{x}_i} = u_i$, and $Lu|_{\mathbf{x} = \hat{\mathbf{x}}_j} = \hat{f}_j$, we have thus arrived at the desired compact RBF-FD formula (sometimes denoted RBF-HFD for being based on a Hermite-type interpolation). The usual enhancement of also enforcing exact results for $u(\mathbf{x}) \equiv 1$ together with the constraint $\sum_{i=1}^{n} \lambda_i = 0$ is again beneficial for accuracy and is readily incorporated. The resulting $(n + m + 1) \times (n + m + 1)$ linear system will again generally be positive definite. The advantages noted above for compact formulas in the FD case carry directly over to the scattered node RBF-FD case. Several test examples and further discussions are provided in Wright and Fornberg (2006).
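The orders of accuracy claimed for the two compact lattice formulas above can be checked numerically by evaluating the residual at a fixed point for spacings $h$ and $h/2$ and taking the base-2 logarithm of the ratio. The Python sketch below does this; the test functions $u = \sin x \sin 2y$ (with $\Delta u = -5u$) and the harmonic $u = e^x \cos y$, as well as the evaluation point, are assumptions chosen for illustration.

```python
import numpy as np

def compact_residual(u, f, x, y, h):
    """Left side minus right side of the compact formula for Laplace(u) = f."""
    edges = u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
    corners = (u(x + h, y + h) + u(x + h, y - h)
               + u(x - h, y + h) + u(x - h, y - h))
    lhs = (corners + 4 * edges - 20 * u(x, y)) / (6 * h**2)
    rhs = (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
           + 8 * f(x, y)) / 12
    return lhs - rhs

def observed_order(u, f, x=0.3, y=0.4, h=0.2):
    """Estimate the order from the residuals at spacings h and h/2."""
    r1 = abs(compact_residual(u, f, x, y, h))
    r2 = abs(compact_residual(u, f, x, y, h / 2))
    return np.log2(r1 / r2)

# Poisson case: u = sin(x) sin(2y), so Laplace(u) = -5 u.
u_poisson = lambda x, y: np.sin(x) * np.sin(2 * y)
f_poisson = lambda x, y: -5 * u_poisson(x, y)

# Laplace case: u = exp(x) cos(y) is harmonic, so f = 0.
u_harmonic = lambda x, y: np.exp(x) * np.cos(y)
f_harmonic = lambda x, y: 0.0 * x

order_poisson = observed_order(u_poisson, f_poisson)
order_harmonic = observed_order(u_harmonic, f_harmonic)
print(order_poisson, order_harmonic)
```

The observed orders come out close to 4 and 6, respectively, even though only a $3 \times 3$ stencil of $u$-values is used.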

4 Applications on the Surface of Spheres and Spheroids

4.1 Cartesian Form of Surface Differential Operators
Spherical coordinates are well known to suffer from the “pole problem.” When expressing PDEs in this coordinate system, this problem is exacerbated by the fact that directional velocity vector components in the lateral direction $u$ (latitudinal) and $v$ (longitudinal) will inherently carry pole singularities in their solution since the unit vectors $\hat{\boldsymbol{\lambda}}$ and $\hat{\boldsymbol{\theta}}$ are singular at the poles. Similar issues occur when representing more general surfaces in surface-based (or intrinsic) coordinate systems, such as oblate or prolate spheroidal coordinates. If, however, a Cartesian coordinate system is used, these singularities can be completely avoided. This section describes how to express common surface differential operators in Cartesian form. The discussion below is for general two-dimensional surfaces embedded in three-dimensional space, with some specific comments on the sphere and spheroids toward the end. It is completely natural to apply these Cartesian-based operators to an RBF expansion, since the radial functions are free of any surface-based coordinate system. Let $\mathbf{x} = (x, y, z)$ be a point on the target surface and $\mathbf{n} = (n_x, n_y, n_z)$ denote the unit normal vector to the surface at $\mathbf{x}$. If $\mathbf{u} = (u, v, w)$ is a vector expressed in Cartesian coordinates and is placed at $(x, y, z)$, then $\mathbf{n}\mathbf{n}^T\mathbf{u}$ gives the projection of $\mathbf{u}$ onto the normal to the surface, and $\mathbf{u} - \mathbf{n}\mathbf{n}^T\mathbf{u}$ gives the projection of $\mathbf{u}$ onto the plane tangent to the surface at $(x, y, z)$. This projection operation can be expressed using the following projection matrix:


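$$P = I - \mathbf{n}\mathbf{n}^T = \begin{bmatrix} 1 - n_x n_x & -n_x n_y & -n_x n_z \\ -n_x n_y & 1 - n_y n_y & -n_y n_z \\ -n_x n_z & -n_y n_z & 1 - n_z n_z \end{bmatrix} = \begin{bmatrix} \mathbf{p}_x^T \\ \mathbf{p}_y^T \\ \mathbf{p}_z^T \end{bmatrix}, \tag{10}$$

where $I$ is the 3-by-3 identity matrix. Here $\mathbf{p}_x$, $\mathbf{p}_y$, and $\mathbf{p}_z$ represent the projection operators in the $x$, $y$, and $z$ directions, respectively. This projection operator can be combined with the standard Cartesian-based gradient operator $\nabla$ to define a variety of surface differential operators in Cartesian coordinates. Table 3 summarizes these results.

Table 3 Common surface differential operators expressed in Cartesian coordinates. The projection matrix $P$ is defined in (10) and $\nabla$ is the standard Cartesian-based gradient, i.e., $\nabla = \partial_x \hat{\mathbf{i}} + \partial_y \hat{\mathbf{j}} + \partial_z \hat{\mathbf{k}}$

| Surface differential operator | Expression in Cartesian coordinates |
|---|---|
| Gradient of a scalar $h$ | $P\nabla h = \begin{bmatrix} (\mathbf{p}_x \cdot \nabla)h \\ (\mathbf{p}_y \cdot \nabla)h \\ (\mathbf{p}_z \cdot \nabla)h \end{bmatrix}$ |
| Divergence of a vector $\mathbf{u} = (u, v, w)$ | $(P\nabla) \cdot \mathbf{u} = (\mathbf{p}_x \cdot \nabla)u + (\mathbf{p}_y \cdot \nabla)v + (\mathbf{p}_z \cdot \nabla)w$ |
| Curl of a vector $\mathbf{u} = (u, v, w)$ | $(P\nabla) \times \mathbf{u} = \begin{bmatrix} (\mathbf{p}_y \cdot \nabla)w - (\mathbf{p}_z \cdot \nabla)v \\ (\mathbf{p}_z \cdot \nabla)u - (\mathbf{p}_x \cdot \nabla)w \\ (\mathbf{p}_x \cdot \nabla)v - (\mathbf{p}_y \cdot \nabla)u \end{bmatrix}$ |
| Laplace-Beltrami of scalar $h$ | $(P\nabla) \cdot (P\nabla h) = (\mathbf{p}_x \cdot \nabla)(\mathbf{p}_x \cdot \nabla)h + (\mathbf{p}_y \cdot \nabla)(\mathbf{p}_y \cdot \nabla)h + (\mathbf{p}_z \cdot \nabla)(\mathbf{p}_z \cdot \nabla)h$ |

In the case of the unit sphere, the normal vector $\mathbf{n}$ at a point $\mathbf{x} = (x, y, z)$ on the sphere is just $\mathbf{x}$. In this case, the projection operator simplifies to

$$P = I - \mathbf{x}\mathbf{x}^T = \begin{bmatrix} 1 - x^2 & -xy & -xz \\ -xy & 1 - y^2 & -yz \\ -xz & -yz & 1 - z^2 \end{bmatrix}. \tag{11}$$

For a spheroid defined by

$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{c^2} = 1, \tag{12}$$

the unit normal vectors are given by

$$\mathbf{n} = (n_x, n_y, n_z) = \frac{1}{\sqrt{(x^2 + y^2)/a^4 + z^2/c^4}} \left( \frac{x}{a^2}, \frac{y}{a^2}, \frac{z}{c^2} \right). \tag{13}$$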
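The projection matrix (10) and the spheroid normal (13) are straightforward to evaluate numerically. The following Python sketch, with hypothetical function names and assumed sample values for the semi-axes and the test point, checks that the projected vector is tangent to the surface and that the surface gradient of $h = z$ on the unit sphere reproduces the third row of (11).

```python
import numpy as np

def projection_matrix(n):
    """P = I - n n^T for a (not necessarily normalized) normal vector n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return np.eye(3) - np.outer(n, n)

def spheroid_normal(x, y, z, a, c):
    """Unit normal to the spheroid (x^2 + y^2)/a^2 + z^2/c^2 = 1 at (x, y, z)."""
    g = np.array([x / a**2, y / a**2, z / c**2])
    return g / np.linalg.norm(g)

# Assumed sample point on a spheroid with semi-axes a = 2, c = 1.
a, c = 2.0, 1.0
lon, t = 0.7, 0.4
point = np.array([a * np.cos(t) * np.cos(lon),
                  a * np.cos(t) * np.sin(lon),
                  c * np.sin(t)])
n = spheroid_normal(*point, a, c)
P = projection_matrix(n)

u = np.array([1.0, -2.0, 0.5])   # arbitrary Cartesian vector at the point
u_tan = P @ u                    # its tangential part

# Unit sphere: surface gradient of h = z is P applied to grad h = (0, 0, 1).
xs = np.array([0.6, 0.0, 0.8])
surf_grad_z = projection_matrix(xs) @ np.array([0.0, 0.0, 1.0])
print(n @ u_tan, surf_grad_z)
```

For the unit-sphere point $(0.6, 0, 0.8)$, the result is $(-xz, -yz, 1 - z^2) = (-0.48, 0, 0.36)$, in agreement with (11).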

4.2 Nonlinear Shallow Water Equations on a Sphere
The shallow water equations describe the nonlinear flow of an incompressible fluid. They represent the 2D atmospheric flow conditions in a single hydrostatic atmospheric layer and are therefore considered an idealized test bed for the horizontal dynamics (known as the dynamical core) of all 3D climate model developments. The equations address the majority of the modeling challenges associated with the temporal and horizontal discretization techniques in spherical geometry. Using the notation from Table 3, the shallow water equations on the surface of the unit sphere in Cartesian coordinates are given by

$$\frac{\partial u}{\partial t} = -\mathbf{p}_x \cdot \underbrace{\begin{bmatrix} (\mathbf{u} \cdot P\nabla)u + f(\mathbf{x} \times \mathbf{u}) \cdot \hat{\mathbf{i}} + g(\mathbf{p}_x \cdot \nabla)h \\ (\mathbf{u} \cdot P\nabla)v + f(\mathbf{x} \times \mathbf{u}) \cdot \hat{\mathbf{j}} + g(\mathbf{p}_y \cdot \nabla)h \\ (\mathbf{u} \cdot P\nabla)w + f(\mathbf{x} \times \mathbf{u}) \cdot \hat{\mathbf{k}} + g(\mathbf{p}_z \cdot \nabla)h \end{bmatrix}}_{\text{RHS}}, \tag{14a}$$

$$\frac{\partial v}{\partial t} = -\mathbf{p}_y \cdot \text{RHS}, \tag{14b}$$

$$\frac{\partial w}{\partial t} = -\mathbf{p}_z \cdot \text{RHS}, \tag{14c}$$

$$\frac{\partial h}{\partial t} = -(P\nabla) \cdot (h\mathbf{u}), \tag{14d}$$

where $P$ is the projection operator given in (11), which confines the flow to the sphere, $f$ is the Coriolis parameter, $\mathbf{u} = u\hat{\mathbf{i}} + v\hat{\mathbf{j}} + w\hat{\mathbf{k}}$ is the velocity vector, $\mathbf{x}$ represents the position vector, and $h$ is the geopotential height. Notice that the only spatial operators that need to be discretized are the components of the surface gradient, $(\mathbf{p}_x \cdot \nabla)$, $(\mathbf{p}_y \cdot \nabla)$, and $(\mathbf{p}_z \cdot \nabla)$. These operators are used as $L$ in (4) to generate RBF-FD approximations in the numerical test cases that follow.

4.2.1 Flow Over an Isolated Mountain

This test case describes flow over a single isolated mountain, which is achieved by adding a forcing term $h_{\text{mtn}}$ to the right-hand side of the geopotential height equation (14d). The standard non-differentiable $C^0$ mountain is given by

\[
h_{\text{mtn}} = h_{\max}(1 - d/R),
\tag{15}
\]

where $h_{\max} = 2{,}000$ m, $R = \pi/9$, and $d^2 = \min\{R^2, (\lambda - \lambda_c)^2 + (\theta - \theta_c)^2\}$, with $(\theta_c = 30^\circ\,\text{N}, \lambda_c = 90^\circ\,\text{W})$ being the center of the mountain. To differentiate between errors due to a non-smooth forcing, which causes Gibbs phenomena in any high-order method, and those inherent in the RBF-FD method, the results for convergence and accuracy are compared against test runs that use an exceptionally steep $C^\infty$ Gaussian profile given by

\[
h(\lambda, \theta) = h_{\max}\, e^{-(2.8\, d/R)^2},
\tag{16}
\]


where $h_{\max}$, $d$, and $R$ are the same as for the cone mountain. The initial conditions are given by

\[
h = h_0 - \frac{1}{g}\left( a\Omega u_0 + \frac{u_0^2}{2} \right) z^2, \qquad
\mathbf{u} = u_0\{-y, x, 0\},
\tag{17}
\]

where $h_0 = 5{,}400$ m (mean reference height), $g = 9.80616$ m/s$^2$, $u_0 = 20$ m/s, $a = 6{,}371{,}220$ m (mean radius of the earth), and $\Omega = 7.292 \times 10^{-5}$ s$^{-1}$ (rotation rate of the earth). The simulation is run for 15 days using the standard RK4 time-stepping scheme.

The left column of Fig. 3 shows the profile of the conical mountain, the solution in $h$ at day 15, and the magnitude of the error between the RBF-FD solution for $N = 25{,}600$, $n = 31$, and the discontinuous Galerkin (DG) reference solution. The same holds for the right column but with the Gaussian mountain profile. With the solutions looking essentially identical, the key difference to notice is that even though the $C^\infty$ Gaussian mountain is slightly steeper than the $C^0$ mountain, no Gibbs phenomena are observed in the former. With the $C^0$ mountain, there are high-frequency waves emanating throughout the domain (i.e., Gibbs phenomena), illustrating the sensitivity of high-order methods to non-smooth forcing. Another consequence is that the accuracy of the RBF-FD method does not increase indefinitely with stencil size $n$, as shown in the bottom left panel of Fig. 3. Beyond $n = 31$, stencil size has no bearing on accuracy when a non-smooth mountain forcing is present. Hence, this is the stencil size chosen for this test case. In contrast, the bottom right panel of Fig. 3 demonstrates how the accuracy of the RBF-FD method for a $C^\infty$ solution does increase as $n$ increases, that is, as the derivative approximations become more global. However, even with a smooth forcing, the rate of convergence is not much greater than for the cone case, since both the Gaussian and cone mountains are so steep that the solution remains underresolved even for very large node sets. To overcome this, adaptive node refinement in the area of the mountain needs to be used, as was done in St-Cyr et al. (2008).

Next, the performance of the RBF-FD method is compared not only against itself but also against discontinuous Galerkin (DG) and spherical harmonic (SH) models.

1. DG: The DG model (Blaise and St-Cyr 2012) is in 3D Cartesian coordinates, with flows being tangentially constrained to the sphere by adding a Lagrange multiplier to the system of equations. The simulations used as references herein have been performed on a cubed sphere grid made up of 6,144 elements. Each element contains $12 \times 12$ Legendre quadrature nodes to represent the solution, which results in a total of 884,736 degrees of freedom and an average resolution around 26 km. For computing these reference solutions, no dissipation mechanism was found to be needed. However, for the run time versus error computations in Fig. 5, the two-dimensional exponential filter described in Hesthaven et al. (2007) was applied.
2. SH: The SH model from the DWD (Deutscher Wetterdienst, German National Weather Service, see http://icon.enes.org/) is an updated derivative of the NCAR spectral transform model. It is implemented with de-aliasing, using Orszag's 2/3 rule (Gottlieb and Orszag 1977), and has become the standard reference solution in the community. For the flow over a mountain test, it has a spectral truncation of T426, that is, it uses 182,329 spherical harmonic bases.
3. RBF-FD: A high-resolution RBF-FD model is also used as a reference, based on $N = 163{,}842$ icosahedral-type nodes on the sphere, representing a 60 km resolution. It uses a stencil size of $n = 31$.
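The problem sizes quoted for the three reference models can be reproduced with a few lines of arithmetic. The sketch below is only a consistency check: the variable names are illustrative, and the node-count formula $10 \cdot 4^k + 2$ for recursively bisected icosahedral node sets is an assumption about how the icosahedral-type nodes were generated, although it does reproduce the quoted count for $k = 7$.

```python
# Discontinuous Galerkin reference run: cubed-sphere elements times the
# 12 x 12 quadrature nodes carried by each element.
dg_elements = 6144
dg_nodes_per_element = 12 * 12
dg_dof = dg_elements * dg_nodes_per_element                  # 884736

# Spherical harmonic model with triangular truncation T426: degrees
# l = 0..426, each with 2l + 1 orders, i.e. (426 + 1)**2 basis functions.
truncation = 426
sh_basis = sum(2 * l + 1 for l in range(truncation + 1))     # 182329

# Icosahedral-type node sets obtained by repeated bisection contain
# 10 * 4**k + 2 nodes (assumed construction); k = 7 gives the node
# count of the high-resolution RBF-FD reference.
ico_nodes = 10 * 4**7 + 2                                    # 163842
```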


Fig. 3 Left column, cone mountain results: (1) profile of mountain; (2) RBF-FD solution for $h$ at day 15, $N = 25{,}600$, and $n = 31$ with contour intervals at 50 m; (3) magnitude of the error between the RBF-FD solution and DG reference solution; and (4) $\ell_2$ error as a function of the resolution $N$ for varying stencil sizes. Right column: same as left but for the Gaussian mountain forcing. The dashed circle in all plots is the base of the mountain
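To make the two forcing profiles (15) and (16) and the initial state (17) concrete, a minimal Python sketch follows. It is an illustration under stated assumptions, not the authors' code: longitude is taken in radians east, so that $90^\circ$ W becomes $-\pi/2$, no longitude wrap-around is handled, and all function and constant names are hypothetical. The physical constants are those quoted in the text.

```python
import math

H_MAX = 2000.0               # h_max, mountain height (m)
R = math.pi / 9.0            # mountain radius (rad)
LAM_C = -math.pi / 2.0       # lambda_c = 90 deg W in radians east (assumed convention)
THETA_C = math.pi / 6.0      # theta_c = 30 deg N

H0 = 5400.0                  # h_0, mean reference height (m)
G = 9.80616                  # gravitational acceleration (m/s^2)
U0 = 20.0                    # u_0 (m/s)
A = 6371220.0                # a, mean radius of the earth (m)
OMEGA = 7.292e-5             # rotation rate of the earth (1/s)


def distance(lam, theta):
    """d with d^2 = min(R^2, (lam - lam_c)^2 + (theta - theta_c)^2)."""
    return math.sqrt(min(R**2, (lam - LAM_C)**2 + (theta - THETA_C)**2))


def cone_mountain(lam, theta):
    """C^0 cone, h_mtn = h_max (1 - d/R)."""
    return H_MAX * (1.0 - distance(lam, theta) / R)


def gaussian_mountain(lam, theta):
    """Smooth profile, h = h_max exp(-(2.8 d/R)^2)."""
    return H_MAX * math.exp(-(2.8 * distance(lam, theta) / R) ** 2)


def initial_state(x, y, z):
    """Height and Cartesian velocity at a point (x, y, z) on the unit sphere."""
    h = H0 - (A * OMEGA * U0 + 0.5 * U0**2) * z**2 / G
    u = (-U0 * y, U0 * x, 0.0)
    return h, u


half = (LAM_C + 0.5 * R, THETA_C)      # half a mountain radius east of the summit
peak_cone = cone_mountain(LAM_C, THETA_C)
peak_gauss = gaussian_mountain(LAM_C, THETA_C)
h_eq, u_eq = initial_state(1.0, 0.0, 0.0)          # a point on the equator
h_mid, u_mid = initial_state(0.6, 0.0, 0.8)        # a mid-latitude point
```

Halfway down the slope the Gaussian profile is already far lower than the cone, which is one way to see why the text calls it exceptionally steep. The velocity in (17) is tangent to the sphere by construction, since $\mathbf{u} \cdot \mathbf{x} = u_0(-yx + xy) = 0$.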

The left panel of Fig. 4 shows that the normalized $\ell_2$ error is an order of magnitude larger when the DWD-SH reference solution is used, as opposed to the DG or RBF-FD ($N = 163{,}842$) reference solutions. Furthermore, when these latter reference solutions are used, the normalized $\ell_2$ errors are almost identical (notice how the corresponding markers overlap). This same trend is also seen in the right panel of Fig. 4 with global RBFs, a different approach from both RBF-FD and DG that does not require hyperviscosity. Given that DG, RBF-FD, and global RBFs are vastly different numerical methods, this strongly indicates that the DWD-SH T426 spectral simulation is providing a less accurate solution. This is further supported by the few articles that do report $\ell_2$ errors for this test case (Taylor et al. 1997; Spotz et al. 1998; St-Cyr et al. 2008), all of which use either the NCAR or DWD-SH reference solution and obtain errors on the order of $10^{-4}$, an order of magnitude larger than that obtained by DG or RBF-FD.

Next, time benchmarking is considered with comparison against the DG model. Benchmarking was done on a MacBook Pro laptop with an Intel i7 2.2 GHz quad-core processor, using only a single core and 8 GB of memory. The RBF-FD code was written in MATLAB and the DG code in C++. The RBF-FD reference solution with $N = 163{,}842$ and $n = 31$ (i.e., 60 km resolution) was used for calculating the $\ell_2$ error versus runtime (i.e., wall-clock time) for both methods in Fig. 5. The RBF-FD resolutions with corresponding time steps are given in Table 4. The RBF-FD method


Fig. 4 The normalized $\ell_2$ error in the height field $h$ as a function of $N$ for flow over a conical mountain at day 15 for RBF-FD (left panel) and global RBFs (right panel). The different markers correspond to different reference solutions (T426, DG, and RBF-FD ref.)

Fig. 5 The $\ell_2$ error as a function of runtime in seconds (defined by wall-clock time) for the flow over the cone mountain test case, for the DG and RBF-FD methods. Points along each curve are labeled with the corresponding resolution in km

Table 4 Time steps used for the cone mountain case with respective spatial resolutions for the maximal determinant (MD) nodes based on the mesh norm, $\max_{\mathbf{x} \in \mathbb{S}^2} \min_{1 \le i \le N} \operatorname{dist}(\mathbf{x}, \mathbf{x}_i)$ (see Womersley and Sloan 2003/2007 for discussion and tabulation)

N          Resolution (km)   Node type   Δt (min)
4,096      550               MD          20
6,400      420               MD          15
12,100     330               MD          12
25,600     220               MD          5
40,962     120               ICO         3
163,842    60                ICO         1


was computationally faster than the DG method, from about an order and a half of magnitude for coarser resolutions to four times faster for the finest resolutions.

4.2.2 Evolution of a Highly Nonlinear Wave

This test case describes the evolution of a highly nonlinear wave with rapid energy transfer from large to small scales over a short time period. First, high-frequency gravity waves propagate around the sphere, followed by complex vortex formations with sharp gradients. The details of how to set up the test can be found in Galewsky et al. (2004) and Flyer et al. (2012). The background flow is only a function of latitude, represented by an exponential profile that is zero everywhere except in the latitudinal band $\pi/7 \le \theta \le 5\pi/14$. To generate the instability, the height (pressure) field is perturbed by Gaussians, in longitude and latitude, multiplied by a cosine to force the perturbation to go to zero at the poles $\theta = \pm\pi/2$. The test case is run for 6 days.

There are two main concerns with this test case: (a) how well the sharp gradients are resolved and (b) the effect of Gibbs phenomena. For short time integration periods, as here, these two numerical issues become a balancing act. As both the resolution $N$ and the order of the method (set by $n$) are increased, the gradients are resolved better. However, as with classical FD, the higher the order of the method, i.e., the larger $n$, the more prominent the Gibbs phenomena. Here, with $N = 25{,}000$, $n = 101$ would correspond approximately to a ninth-order method. Notice in Fig. 6 that when an $n = 101$ stencil size is used, the contour lines are more jagged, with evidence of Gibbs phenomena. Yet, the gradients and vortices are much better defined than for $n = 31$, which is roughly analogous to a fifth- to sixth-order method. In this latter case, the features appear smoothed out and underdeveloped. Increasing the resolution $N$ and keeping $n = 31$ fixed gives the best results, as seen in Fig. 7 for $N = 163{,}842$ (60 km or $0.54^\circ \times 0.54^\circ$). The solution is extremely similar to that given by the high-order DG solution with a resolution of 39 km or $0.35^\circ \times 0.35^\circ$.

Due to its high accuracy in approximating derivatives, the RBF-FD method is able to produce the basic wave pattern structure even at very coarse resolutions such as $N = 4{,}096$ ($5^\circ \times 5^\circ$), displayed in Fig. 8. This is not the case with the DG method (a slightly higher resolution of $3^\circ \times 3^\circ$ is displayed in the third panel of Fig. 8), a spectral element method (SE), or a finite volume method (FV). At such coarse resolutions, the DG and SE (St-Cyr et al. 2008) methods instead produce features of the grid, such as an artificial wavenumber-four pattern for the cubed sphere. The FV method, notorious for being dissipative, shows no spatial structures at a resolution of $5^\circ \times 5^\circ$ in Fig. 8.
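The equivalences between resolutions in kilometres and in degrees quoted above follow from the length of a great-circle arc. The short Python sketch below reproduces them; it assumes the mean earth radius of 6,371.22 km given in Sect. 4.2.1, and the function names are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.22   # mean radius of the earth from Sect. 4.2.1


def km_to_degrees(km):
    """Great-circle arc of length km, expressed in degrees."""
    return math.degrees(km / EARTH_RADIUS_KM)


def degrees_to_km(deg):
    """Length in km of a great-circle arc spanning deg degrees."""
    return math.radians(deg) * EARTH_RADIUS_KM


rbf_fd_deg = km_to_degrees(60.0)   # RBF-FD run with N = 163,842
dg_deg = km_to_degrees(39.0)       # DG reference solution
coarse_km = degrees_to_km(5.0)     # the 5 x 5 degree runs of Fig. 8
```

Rounded to two decimals, `rbf_fd_deg` and `dg_deg` give the 0.54 and 0.35 degrees stated in the text, and `coarse_km` is close to the 550 km listed for $N = 4{,}096$ in Table 4.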

4.3 Reaction-Diffusion Equations on Spheroids

While the sphere plays a prominent role in several applications from geoscience, it is often a physical idealization. In some cases, spheroidal domains (oblate or prolate) may be more physically relevant. Unfortunately, many numerical techniques that work well for the sphere can be considerably more difficult in spheroidal domains (e.g., going from spherical harmonics to oblate/prolate spheroidal harmonics). Since RBFs are free of any surface-based coordinate system, they do not suffer from these complications. This fundamental feature is illustrated in this section by considering the problem of simulating certain reaction-diffusion equations on the sphere, an oblate spheroid, and a prolate spheroid using the RBF-FD method. These types of equations and these domains have a number of applications in biology and chemistry to model such things as diffusion of chemicals on biological cells or membranes, pattern formations in biology, and nonlinear chemical oscillators in excitable media. While the discussion below is for the surface of


Fig. 6 The relative vorticity at day 6 for the evolution of the highly unstable wave case as a function of resolution $N$ and stencil size $n$. Contour interval is $2 \times 10^{-5}$ s$^{-1}$

Fig. 7 Top: RBF-FD solution for $N = 163{,}842$ and $n = 31$. Bottom: DG solution at 39 km resolution


Fig. 8 The relative vorticity at day 6 for the evolution of the highly unstable wave case with a resolution of $5^\circ \times 5^\circ$ for a spectral element model (SEM), a finite volume model (FV) (St-Cyr et al. 2008), and the RBF-FD model. The DG model has a resolution of approximately $3^\circ \times 3^\circ$. R2 corresponds to 16 elements per face of the cubed sphere for a total of 96 elements, while N6 corresponds to using $6 \times 6 = 36$ Legendre quadrature nodes. In the SEM model, $n_e = 3$ corresponds to $3 \times 3 = 9$ elements per face of the cubed sphere. In each element, $8 \times 8$ Gauss–Legendre–Lobatto nodes are used. Panel titles: FV $5^\circ \times 5^\circ$; SEM $n_e = 3$ ($5^\circ \times 5^\circ$); DG R2N6; RBF-FD $5^\circ \times 5^\circ$, $N = 4096$, $n = 31$, $\Delta^4$-type hyperviscosity. Axes: longitude and latitude; color scale: relative vorticity [$10^{-5}$ s$^{-1}$]


a spheroid defined by (12), it may be possible to generalize it to more general (closed) surfaces, as has been done for the global RBF method in Piret (2012) and Fuselier and Wright (2013). In the case of two species $u$ and $v$, the prototypical form of reaction–diffusion equations on a spheroid (or more general surface) can be written as

\[
\frac{\partial u}{\partial t} = \delta_u \Delta_S u + f_u(t, u, v), \qquad
\frac{\partial v}{\partial t} = \delta_v \Delta_S v + f_v(t, u, v),
\tag{18}
\]

where $\delta_u, \delta_v \ge 0$, $f_u, f_v$ are typically nonlinear scalar functions describing reactions of the species, and $\Delta_S$ is the Laplace–Beltrami operator for the spheroid (or surface). As mentioned above, this operator can be difficult to treat numerically if using surface-based coordinate systems (i.e., oblate/prolate spheroidal coordinates), as they will induce an artificial coordinate singularity (e.g., at the poles). As in the case of the shallow water wave equations, Cartesian coordinates will thus be used to avoid these singularities. For (18), this means expressing $\Delta_S$ using the formulation given in the last row of Table 3.

In the case of a sphere, the direct discretization of $\Delta_S$ using RBFs is particularly simple (as first discussed in Wright et al. 2010 and employed below in Sect. 5.2). Unfortunately, when switching to a more general spheroid, this simplicity is lost. The discretization approach taken in Fuselier and Wright (2013) is therefore adopted here. In this approach, $N$ "scattered" nodes are first distributed on the surface of the spheroid. Each of the components of the surface gradient listed in the top row of Table 3 is then approximated with the RBF-FD method using the projection matrix $P$ defined by the unit normal vectors in (13). Finally, the DM for the surface Laplacian is constructed like the continuous formulation listed in the last row of Table 3 but using the RBF-FD DMs for each component of the surface gradient instead of the continuous ones (see Fuselier and Wright 2013 for full details).

Based on numerical experiments, one small modification to the RBF-FD method described in Sect. 2 is suggested for the above procedure on the spheroid. Instead of computing the distance from node $\mathbf{x}_j$ to $\mathbf{x}_k$ using the standard Euclidean distance, a weighted Euclidean distance defined as

\[
\|\mathbf{x}_j - \mathbf{x}_k\|_S^2 = \frac{(x_j - x_k)^2 + (y_j - y_k)^2}{a^2} + \frac{(z_j - z_k)^2}{c^2}
\]

should be used, where $a$ and $c$ define the parameters of the spheroid in (12). Although the results are not presented here due to space limitations, this was observed to give more accurate approximations of the surface Laplacian. It should be noted that this modification does not alter the non-singularity results of the linear systems involved in computing the RBF-FD formulas discussed in Sect. 2.

4.3.1 Turing Patterns

Since Turing's classical paper (Turing 1952), which suggested how certain nonlinear models of reaction and diffusion can lead to stable, heterogeneous pattern formations, there has been an


explosion of research in reaction–diffusion-type models for various kinds of morphogenesis. The following is one such model that has been studied on spherical domains (Varea et al. 1999):

\begin{align}
\frac{\partial u}{\partial t} &= \delta_u \Delta_S u + \alpha u\left(1 - \tau_1 v^2\right) + v(1 - \tau_2 u), \nonumber \\
\frac{\partial v}{\partial t} &= \delta_v \Delta_S v + \beta v\left(1 + \frac{\alpha\tau_1}{\beta} uv\right) + u(\gamma + \tau_2 v).
\tag{19}
\end{align}

Here $u$ and $v$ are morphogens, with $u$ the "activator" and $v$ the "inhibitor." If $\alpha = -\gamma$, then $(u, v) = (0, 0)$ is a unique equilibrium point of this system. By changing the diffusivity rates of $u$ and $v$, an instability can form that leads to different pattern formations. The cubic coupling parameter $\tau_1$ favors the formation of stripes, while the quadratic coupling parameter $\tau_2$ favors the formation of spots. The spot pattern formations are more robust than stripes and take far less time to reach "steady state."

The RBF-FD method was applied to solve (19) numerically on three surfaces: an oblate spheroid defined by (12) with $a = 1$ and $c = 0.5$, a prolate spheroid with $a = 1$ and $c = 1.5$, and the unit sphere (i.e., $a = c = 1$). The surfaces were discretized using a radial projection of a set of $N$ spherical MD nodes to the surface. In all experiments, $N = 16{,}384$ and $n = 31$ for the RBF-FD approximations of the surface Laplacian. The equations were advanced in time using the third-order, semi-implicit, backward differentiation formula (SBDF3) method (Ascher et al. 1995). This scheme treats the diffusion terms implicitly and the (nonlinear) reaction terms explicitly. The implicit systems were solved using the (unpreconditioned) BiCGSTAB iterative method (van der Vorst 1992) with a tolerance of $10^{-8}$ on the relative residual. A time step of $\Delta t = 0.05$ was used, which was near the largest that could be used while still maintaining time stability, owing to the nonlinear reaction terms. Throughout the simulations, the largest number of BiCGSTAB iterations required to solve the implicit systems was 5.

Similar to the experiments in Varea et al. (1999), we set the initial values of $u$ and $v$ to random values between $-0.5$ and $0.5$ in a thin strip around the "equator" of each surface and $u = v = 0$ elsewhere. Figure 9 shows the results from the simulations using parameters that lead to both spot and stripe patterns. These parameter values were motivated by those used in Varea et al. (1999). For the spot patterns, the parameters $\alpha = 0.899$, $\beta = -0.91$, $\gamma = -0.899$, $\tau_1 = 0.02$, and $\tau_2 = 0.2$ were used for all the surfaces. The diffusion values were set as $\delta_u = 0.516\,\delta_v$, with $\delta_v = 2.25 \times 10^{-3}$ for the oblate spheroid, $\delta_v = 3 \times 10^{-3}$ for the sphere, and $\delta_v = 4.5 \times 10^{-3}$ for the prolate spheroid. For the stripe patterns, $\tau_1 = 3.5$ and $\tau_2 = 0$, with $\alpha$, $\beta$, and $\gamma$ set the same as for the spots on all the surfaces. The diffusion values were set as $\delta_u = 0.516\,\delta_v$, with $\delta_v = 1.575 \times 10^{-3}$ for the oblate spheroid, $\delta_v = 2.1 \times 10^{-3}$ for the sphere, and $\delta_v = 3.15 \times 10^{-3}$ for the prolate spheroid.

While the initial conditions used in these simulations are random, the resulting spot and stripe patterns observed in Fig. 9 are qualitatively similar to those obtained from the global RBF method of Fuselier and Wright (2013). This new RBF-FD method is significantly more computationally efficient, since iterative methods can be used to successfully solve the implicit linear systems from the time-stepping scheme, as opposed to the direct methods used in Fuselier and Wright (2013). This also allows much higher resolutions to be used in the simulations.
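The implicit-explicit splitting used by SBDF3 can be illustrated on a scalar model problem. The sketch below is not the solver used for (19): it replaces the RBF-FD surface Laplacian by a single stiff coefficient `lin` (so the implicit "solve" is a scalar division rather than a BiCGSTAB iteration) and the reaction terms by a simple function `nonlin`. The SBDF3 coefficients are those of the third-order scheme in Ascher et al. (1995); the function names, the test problem $u' = -2u + u$ with exact solution $e^{-t}$, and the use of exact starting values are assumptions made for illustration.

```python
import math


def sbdf3(lin, nonlin, u_hist, dt, steps):
    """Advance u' = lin*u + nonlin(u) with the third-order SBDF scheme.

    lin    : coefficient of the stiff linear term (treated implicitly)
    nonlin : callable for the reaction term (treated explicitly)
    u_hist : starting values [u_{n-2}, u_{n-1}, u_n]
    """
    u2, u1, u0 = u_hist
    for _ in range(steps):
        # (11/6) u_new - 3 u0 + (3/2) u1 - (1/3) u2
        #     = dt * (lin * u_new + 3 f(u0) - 3 f(u1) + f(u2))
        rhs = (3.0 * u0 - 1.5 * u1 + u2 / 3.0
               + dt * (3.0 * nonlin(u0) - 3.0 * nonlin(u1) + nonlin(u2)))
        new = rhs / (11.0 / 6.0 - dt * lin)    # scalar stand-in for the implicit solve
        u2, u1, u0 = u1, u0, new
    return u0


def error_at_t1(n_steps):
    """Error at t = 1 for u' = -2u + u, whose exact solution is exp(-t)."""
    dt = 1.0 / n_steps
    start = [math.exp(-k * dt) for k in range(3)]     # exact u(0), u(dt), u(2 dt)
    approx = sbdf3(-2.0, lambda u: u, start, dt, n_steps - 2)
    return abs(approx - math.exp(-1.0))


err_coarse = error_at_t1(20)      # dt = 0.05, the step size quoted in the text
err_fine = error_at_t1(40)        # dt = 0.025
ratio = err_coarse / err_fine     # should be near 2**3 for a third-order scheme
```

In the full problem, the division by `11/6 - dt*lin` becomes a sparse linear system with the matrix $\tfrac{11}{6} I - \Delta t\, \delta\, L_S$, where $L_S$ denotes the RBF-FD differentiation matrix for the surface Laplacian; this is the system that the text solves with BiCGSTAB.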


Fig. 9 Turing spot and stripe patterns computed from the model (19) on an (a) oblate spheroid, (b) sphere, and (c) prolate spheroid. The pseudocolor plots are for the activator $u$ once steady state is reached. In all plots, black corresponds to a high concentration of $u$, and white to a low concentration. For all simulations, $N = 16{,}384$ and $n = 31$, with a time step of $\Delta t = 0.05$. All simulations were run until a steady-state solution was reached, which was about 800 time steps for the spot patterns and 6,000 for the stripe patterns
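Two geometric ingredients of the spheroid experiments in Sect. 4.3 lend themselves to a short illustration: the radial projection of sphere nodes onto the spheroid (12) and the weighted distance suggested for the RBF-FD stencils. The Python sketch below is a minimal version under stated assumptions; the function names and the three sample nodes are hypothetical, and real computations would start from a full MD node set.

```python
import math


def project_to_spheroid(p, a, c):
    """Scale a point p on the unit sphere along its ray onto the spheroid (12)."""
    x, y, z = p
    t = 1.0 / math.sqrt((x * x + y * y) / a**2 + z * z / c**2)
    return (t * x, t * y, t * z)


def weighted_distance_sq(p, q, a, c):
    """Weighted squared distance ((dx)^2 + (dy)^2)/a^2 + (dz)^2/c^2."""
    dx, dy, dz = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return (dx * dx + dy * dy) / a**2 + dz * dz / c**2


a, c = 1.0, 1.5                     # prolate spheroid of Sect. 4.3.1
sphere_nodes = [(1.0, 0.0, 0.0), (0.0, 0.6, 0.8), (0.0, 0.0, 1.0)]
nodes = [project_to_spheroid(p, a, c) for p in sphere_nodes]

# Each projected node should satisfy the spheroid equation.
on_surface = [(x * x + y * y) / a**2 + z * z / c**2 for x, y, z in nodes]

# Weighted distance between the "equatorial" node and the "pole".
d2 = weighted_distance_sq(nodes[0], nodes[2], a, c)
```

The weighted distance is the ordinary Euclidean distance after the axes are rescaled by $1/a$, $1/a$, and $1/c$, so for the two nodes above it equals the squared distance between $(1, 0, 0)$ and $(0, 0, 1)$ on the unit sphere.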

5 3D Applications in a Spherical Shell

5.1 Global Electric Circuit

Electrical linkages within the atmosphere are often discussed in terms of a "global electric circuit" (GEC). The GEC extends from the Earth's surface to the base of the ionosphere, defined to be at 90 km altitude. The basic idea, first postulated by Wilson (1920, 1929), is that thunderclouds and other highly electrified clouds produce an upward current (approximately 1,000–2,000 A) that maintains the ionosphere at a quasi-static potential on the order of $240 \pm 40$ kV with respect to the ground. A downward return current density of 1–10 pA m$^{-2}$ is distributed over the rest of the globe in the so-called "fair weather regions" (no cloud activity). The quasi-static coupled atmospheric and ionospheric components of the GEC can be modeled as an elliptic Poisson-type PDE

\[
\nabla \cdot (\sigma \nabla \Phi) = J_s(\vec{x}), \qquad \vec{x} \in \Omega,
\tag{20}
\]

with suitable boundary conditions. Here, $\Phi$ is the electrostatic potential, $\Omega$ is the atmosphere below the ionosphere, and $\sigma$ is the atmospheric conductivity. The right-hand side, $J_s$, is a representation of the current sources (e.g., thunderstorms) in the domain. To a good first approximation, the conductivity $\sigma$ can be considered simply as a function of altitude. Below, it is even further simplified by taking it to be a constant, thus resulting in Poisson's equation. This allows for a simple direct comparison between an explicit RBF-FD stencil implementation and an implicit (compact) RBF-FD one in 3D spherical geometry, with the focus being on accuracy and convergence of a Krylov iterative solver (BiCGSTAB, van der Vorst 1992, in this case).


Fig. 10 Schematic picture of the two types of stencils. Left panel: explicit stencil. Right panel: implicit stencil

5.1.1 Poisson's Equation

Consider Poisson's equation with homogeneous Dirichlet conditions, given by

\begin{align}
\Delta u &= f, \qquad \mathbf{x} \in \Omega, \tag{21} \\
u &= 0, \qquad \mathbf{x} \in \partial\Omega, \tag{22}
\end{align}

where $\Omega$ is a spherical shell with inner radius $R_i = 2$ and outer radius $R_o = 4$. The right-hand side is chosen such that the exact solution is given by

\[
u_{\text{exact}}(r, \theta, \varphi) = \sin\left( 2\pi\, \frac{r - R_i}{R_o - R_i} \right) Y_{9,3}(\theta, \varphi),
\tag{23}
\]

where $Y_{l,m}(\theta, \varphi)$ denotes a spherical harmonic of degree $l$ and order $m$. Two discretizations are considered, based on RBF-FD. The first is an explicit scheme, where the angular terms of the Laplacian in (21) are implemented using 31-point RBF-FD stencils on scattered nodes and the radial terms are discretized using 5-point RBF-FD stencils on equidistant nodes. The second scheme is implicit, based on the methodology in Wright and Fornberg (2006) and discussed in Sect. 3. For the angular terms in the Laplacian, $\{n = 9, m = 6\}$ node stencils are applied, while $\{n = 3, m = 2\}$ nodes are used in the radial discretization. Maximum determinant (MD) nodes, see Womersley and Sloan (2003/2007), are used in the angular discretization for both stencil types. A typical stencil of each kind is shown in Fig. 10. The sparsity pattern of the Laplacian DM for each method, following a reverse Cuthill–McKee reordering of the entries, is shown in Fig. 11.

The horizontal resolution is given by $N_H$, and the radial one by $N_R$. The total number of nodes is denoted $N = N_H \times N_R$. The linear system arising from the discretization is solved using BiCGSTAB with zero-fill modified incomplete LU factorization as a preconditioner. A residual tolerance of $10^{-8}$ is used in the iterative solver. The subplots in Fig. 12 show the relative $\ell_2$ error and the number of iterations required, respectively, as a function of the total number of nodes. While both methods achieve fourth-order convergence, the implicit method is more efficient in terms of memory and number of iterations required to converge. Table 5 gives the exact specifications of the six node sets used for the results in Fig. 12.
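The fourth-order convergence claim can be checked roughly against the errors listed in Table 5. The Python sketch below estimates the observed order from the coarsest and finest node sets of each stencil type. It rests on one assumption that is not stated in the text: that the node spacing $h$ scales like $N^{-1/3}$ for a 3D node set with $N = N_H \times N_R$ nodes. The function name and data layout are illustrative.

```python
import math

# (N_R, N_H, relative l2 error) for the coarsest and finest node sets in Table 5
explicit = [(11, 2500, 0.00129), (33, 25600, 6.73e-6)]
implicit = [(11, 2500, 0.000930), (33, 25600, 7.87e-6)]


def observed_order(coarse, fine):
    """Estimate p in error ~ h**p, assuming h scales like N**(-1/3) in 3D."""
    n_coarse = coarse[0] * coarse[1]     # total number of nodes N = N_R * N_H
    n_fine = fine[0] * fine[1]
    return 3.0 * math.log(coarse[2] / fine[2]) / math.log(n_fine / n_coarse)


p_explicit = observed_order(*explicit)
p_implicit = observed_order(*implicit)
```

Under this assumption both estimates come out somewhat above four, which is consistent with the fourth-order behavior reported for both stencil types.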


Fig. 11 The sparsity pattern for $N_H = 2{,}500$, $N_R = 11$. (a) Explicit stencil. (b) Implicit stencil

Fig. 12 The error and the number of iterations as a function of $N$. (a) Relative $\ell_2$ error for the explicit and implicit stencils, with an $h^4$ reference line. (b) BiCGSTAB iterations for the explicit and implicit stencils

5.2 Mantle Convection

Thermal convection of an incompressible, Boussinesq fluid at infinite Prandtl number in a spherical shell that is heated from below is discussed here. This can be considered as a simplified version of mantle convection with constant viscosity. The dynamics of the fluid are governed by the Rayleigh number, $Ra$, which can be interpreted as a ratio of the destabilizing force due to the buoyancy of the heated fluid to the stabilizing force due to the viscosity of the fluid. Since the fluid is incompressible and isoviscous, the velocity can be expressed solely in terms of a poloidal potential, $\mathbf{u} = \nabla \times \nabla \times ((\Phi r)\hat{\mathbf{r}})$ (see, e.g., Chandrasekhar 1981; Backus 1966). Exploiting this relationship, the governing equations for a shell of inner radius $R_i$ and outer radius $R_o$ can be written as

\begin{align}
\Delta_S \Omega + \frac{\partial}{\partial r}\left( r^2 \frac{\partial \Omega}{\partial r} \right) &= Ra\, r\, T, \tag{24} \\
\Delta_S \Phi + \frac{\partial}{\partial r}\left( r^2 \frac{\partial \Phi}{\partial r} \right) &= r^2\, \Omega, \tag{25}
\end{align}


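Table 5 The specifications for the results shown in Fig. 12

Stencil type   N_R   N_H      ε_R   ε_H   ℓ2 error                 Iterations
Explicit       11    2,500    0.5   5     0.00129                  24.5
Explicit       15    4,900    0.5   5     0.000330                 35.5
Explicit       19    8,100    0.5   5     0.000106                 47.5
Explicit       23    12,100   0.5   5     $4.09 \times 10^{-5}$    65.5
Explicit       27    16,900   0.5   5     $1.84 \times 10^{-5}$    98.5
Explicit       33    25,600   0.5   5     $6.73 \times 10^{-6}$    624.5
Implicit       11    2,500    0.5   4     0.000930                 12
Implicit       15    4,900    0.5   4     0.000221                 16.5
Implicit       19    8,100    0.5   4     $7.68 \times 10^{-5}$    23.5
Implicit       23    12,100   0.5   4     $3.39 \times 10^{-5}$    25.5
Implicit       27    16,900   0.5   4     $1.79 \times 10^{-5}$    32
Implicit       33    25,600   0.5   4     $7.87 \times 10^{-6}$    34.5

\[
\frac{\partial T}{\partial t} = -\left( u_r \frac{\partial T}{\partial r} + u_\theta \frac{1}{r} \frac{\partial T}{\partial \theta} + u_\lambda \frac{1}{r\cos\theta} \frac{\partial T}{\partial \lambda} \right) + \frac{1}{r^2} \Delta_S T + \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \frac{\partial T}{\partial r} \right),
\tag{26}
\]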

where $R_i \le r \le R_o$, $-\pi/2 \le \theta \le \pi/2$, $-\pi < \lambda \le \pi$ are the standard spherical coordinates, $\mathbf{u} = (u_r, u_\theta, u_\lambda)$ is the velocity field in spherical coordinates, $T$ is temperature, and $\Delta_S$ is the surface Laplacian operator. The boundary conditions on the velocity of the fluid at the inner and outer surfaces of the spherical shell are assumed to be impermeable and shear stress-free, which translates into

\[
\Phi\big|_{r = R_i, R_o} = 0 \quad \text{and} \quad \left. \frac{\partial^2 \Phi}{\partial r^2} \right|_{r = R_i, R_o} = 0.
\tag{27}
\]

The boundary conditions on the temperature are

\[
T(R_i, \theta, \lambda) = 1 \quad \text{and} \quad T(R_o, \theta, \lambda) = 0.
\]

Equations (24)–(26) have been nondimensionalized with the length scale chosen as the thickness of the shell, $\Delta R = R_o - R_i$; the timescale chosen as the thermal diffusion time, $t = (\Delta R)^2/\kappa$ ($\kappa$ = thermal diffusivity); and the temperature scale chosen as the difference between the temperature at the inner and outer boundaries, $\Delta T$. In the experiments below, the inner and outer radii of the shell are set to $R_i = 11/9$ and $R_o = 20/9$, which give approximately the same ratio of the inner and outer radii as the Earth's mantle (i.e., 0.55).

5.2.1 Overview of the Numerical Approach

To numerically solve (24)–(26), an operator-splitting method similar to Wright et al. (2010) is used in space, where the lateral directions $(\theta, \lambda)$ are discretized separately from the radial direction (see Fig. 13 for an illustration). $M$ Chebyshev nodes are used in the radial direction, and $N$ "scattered" nodes are used on each of the resulting $M$ spherical surfaces, giving a total of $MN$ nodes in the 3D spherical shell. The maximal determinant (MD) nodes (Womersley and Sloan 2003/2007) are again used for the spherical surfaces. All radial derivatives are discretized using collocation with


Fig. 13 (a) Node layout for the RBF-FD approximations on the surface of a sphere; (b) 3D view of the discretization of the spherical shell used in the hybrid RBF-FD/Chebyshev calculation. Blue is the outer boundary and red is the inner boundary, and black circles display the computational nodes, which are distributed in the radial direction along the extrema of the Chebyshev polynomials. The spherical shell has been opened up in (b) to show the detail of the radial discretization

Chebyshev polynomials (see, for example, Fornberg 1996; Trefethen 2000 for details), while all differential operators in the latitudinal direction $\theta$ and longitudinal direction $\lambda$ are approximated on each spherical surface using $n$-node RBF-FD stencils. The complete numerical algorithm is nearly identical to the one in Wright et al. (2010), with the only difference being that RBF-FD is used instead of global RBFs. The reader is therefore referred to Appendix B of Wright et al. (2010) for a detailed description. A general overview is given by the following steps:

1. Discretize $\dfrac{\partial}{\partial\theta}$, $\dfrac{1}{\cos\theta}\dfrac{\partial}{\partial\lambda}$, and $\Delta_S$ at each of the $N$ MD nodes using $n$-point RBF-FD formulas. As discussed in Wright et al. (2010), the coordinate singularities associated with the latter two of these operators are harmlessly removed when the operators are applied to an RBF expansion.
2. Discretize $\dfrac{\partial}{\partial r}$, $\dfrac{\partial}{\partial r}\left( r^2 \dfrac{\partial}{\partial r} \right)$, $\left.\dfrac{\partial^2}{\partial r^2}\right|_{r = R_i}$, and $\left.\dfrac{\partial^2}{\partial r^2}\right|_{r = R_o}$ using collocation with $M$ Chebyshev polynomials.
3. With the given temperature initial condition, solve the Poisson equations (24) and (25) for $\Omega$ and $\Phi$. This procedure is complicated by the fact that all four boundary conditions on these equations are only specified on $\Phi$. To handle this, the influence matrix method (Peyret 2002) is employed to find the unknown boundary values on $\Omega$ such that all four boundary conditions on $\Phi$ are satisfied (see Section 5 of Wright et al. 2010).
4. Compute the velocity field: $\mathbf{u} = \nabla \times \nabla \times [(\Phi r)\hat{\mathbf{r}}] = \left( -\dfrac{1}{r} \Delta_S \Phi,\ \dfrac{1}{r} \dfrac{\partial^2}{\partial r\, \partial\theta}(\Phi r),\ \dfrac{1}{r\cos\theta} \dfrac{\partial^2}{\partial r\, \partial\lambda}(\Phi r) \right)$.
5. Update the energy equation (26) to the next time step $t + \Delta t$ using the semi-implicit third-order Adams–Bashforth (AB3) and second-order trapezoidal rule (or Crank–Nicolson) method. In this case, all terms are treated explicitly with AB3, except the radial component of the diffusion term, which is treated implicitly using the trapezoidal rule. Return to step 3 with an updated temperature profile.
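The radial part of step 2 can be sketched with the classical Chebyshev collocation differentiation matrix on the extrema nodes. The Python code below is a minimal illustration and not the authors' implementation: it follows the standard construction described in Trefethen (2000), maps the nodes from $[-1, 1]$ to the shell $[R_i, R_o]$ with the radii of Sect. 5.2, and checks the first derivative on a cubic, which collocation should differentiate exactly up to rounding. The function name `cheb`, the choice $M = 23$, and the test function are assumptions for illustration.

```python
import math


def cheb(n):
    """Chebyshev extrema x_j = cos(j*pi/n) and differentiation matrix on [-1, 1]."""
    x = [math.cos(math.pi * j / n) for j in range(n + 1)]
    c = [(2.0 if j in (0, n) else 1.0) * (-1.0) ** j for j in range(n + 1)]
    D = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                D[i][j] = (c[i] / c[j]) / (x[i] - x[j])
        D[i][i] = -sum(D[i])      # diagonal chosen so that each row sums to zero
    return x, D


R_I, R_O = 11.0 / 9.0, 20.0 / 9.0        # shell radii from Sect. 5.2
M = 23                                   # number of radial nodes (assumed)
x, D = cheb(M - 1)

# Map [-1, 1] to [R_I, R_O]; the chain rule rescales the derivative.
r = [0.5 * (R_O + R_I) + 0.5 * (R_O - R_I) * xj for xj in x]
scale = 2.0 / (R_O - R_I)

f = [rj**3 for rj in r]                  # test function with known derivative 3 r^2
df = [scale * sum(D[i][j] * f[j] for j in range(M)) for i in range(M)]
max_err = max(abs(df[i] - 3.0 * r[i] ** 2) for i in range(M))
```

The operator $\partial/\partial r\,(r^2\, \partial/\partial r)$ from step 2 could then be assembled from the same matrix, for example as the product of the scaled matrix, a diagonal matrix holding $r_j^2$, and the scaled matrix again.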


Fig. 14 Steady-state isosurfaces of the residual temperature, $\delta T$, for the three isoviscous benchmark test cases using the RBF-FD model. Here the residual temperature $\delta T = T(r, \theta, \lambda) - \langle T(r) \rangle$ is visualized, where $\langle\ \rangle$ denotes averaging over a spherical surface. Yellow corresponds to $\delta T = 0.15$ and denotes upwelling relative to the average temperature at each radial level, while blue corresponds to $\delta T = -0.15$ and denotes downwelling. The red solid sphere shows the inner boundary of the 3D shell corresponding to the core. (a) Tetrahedral, $Ra = 7 \times 10^3$. (b) Cubic, $Ra = 7 \times 10^3$. (c) Cubic, $Ra = 1 \times 10^5$

5.2.2 Validation on Community Benchmark Problems

The two most common benchmarks for computational models of mantle convection in a spherical shell are the isoviscous steady-state tetrahedral and cubic test cases. These names correspond to the steady-state upwelling plumes that result from these tests at the faces of a regular tetrahedron and cube, respectively. The initial condition for the temperature is specified as

\[
T(r, \theta, \lambda) = \frac{R_i (r - R_o)}{r (R_i - R_o)} + 0.01\, Y_3^2(\theta, \lambda) \sin\left( \pi\, \frac{r - R_i}{R_o - R_i} \right)
\tag{28}
\]

for the tetrahedral test case and

\[
T(r, \theta, \lambda) = \frac{R_i (r - R_o)}{r (R_i - R_o)} + 0.01 \left[ Y_4^0(\theta, \lambda) + \frac{5}{7} Y_4^4(\theta, \lambda) \right] \sin\left( \pi\, \frac{r - R_i}{R_o - R_i} \right)
\tag{29}
\]

for the cubic test case, where $Y_\ell^m$ denotes the normalized spherical harmonic of degree $\ell$ and order $m$. The first term in each of the initial conditions represents a purely conductive temperature profile, while the second terms are perturbations to this profile and determine the final steady-state solution.

Both the tetrahedral and cubic tests were first run at $Ra = 7{,}000$, with $N = 2{,}601$ nodes on each spherical shell and $M = 23$ Chebyshev nodes in the radial direction, giving a total of 59,823 nodes. Each differential operator on the spherical surfaces was approximated using $n = 50$ node RBF-FD formulas. A time step of $10^{-4}$ was used to reach steady state at the nondimensionalized time of $t = 1$, corresponding to roughly 58 times the age of the Earth. The final RBF-FD steady-state solutions for the tetrahedral and cubic test cases are displayed in Fig. 14a, b, respectively. As no analytical solutions exist, validation is done via comparison to other published results in the literature with respect to scalar global quantities, such as the Nusselt number at the inner and outer boundaries ($Nu_i$ and $Nu_o$), and the averaged root mean square velocity and temperature over the volume. Such a comparison for the RBF-FD method with respect to established methods (including the global RBF-PS method, Wright et al. 2010) is given in Table 6. For the RBF-FD and RBF-PS


Table 6 Comparison between computational methods for the isoviscous tetrahedral and cubic mantle convection test cases with $Ra = 7{,}000$. $\mathrm{Nu}_o$ and $\mathrm{Nu}_i$ denote the respective Nusselt numbers at the outer and inner spherical surfaces, $\langle V_{rms}\rangle$ the volume-averaged RMS velocity over the 3D shell, and $\langle T\rangle$ the mean temperature of the 3D shell

Model                                    Type   Nodes       r × (θ × λ)             Nu_o    Nu_i    ⟨V_rms⟩  ⟨T⟩

Tetrahedral test case, Ra = 7,000
Zhong et al. (2008)                      FE     393,216     32 × (12 × 32 × 32)     3.5126  3.4919  32.66    0.2171
Yoshida and Kageyama (2004)              FD     2,122,416   102 × (102 × 204)       3.4430  —       32.0481  —
Kameyama et al. (2008)                   FD     12,582,912  128 × (2 × 128 × 384)   3.4945  —       32.6308  0.21597
Ratcliff et al. (1996)                   FV     200,000     40 × (50 × 100)         3.4423  —       32.19    —
Stemmer et al. (2006)                    FV     663,552     48 × (6 × 48 × 48)      3.4864  3.4864  32.5894  0.21564
Stemmer et al. (2006)                    FV     Extrap.     Extrap.                 3.4949  —       32.6234  0.21560
Harder (1998) and Stemmer et al. (2006)  SP-FD  552,960     120 × (48 × 96)         3.4955  —       32.6375  0.21561
Harder (1998) and Stemmer et al. (2006)  SP-FD  Extrap.     Extrap.                 3.4962  —       32.6424  0.21556
RBF-PS (Wright et al. 2010)              SP     36,800      23 × (1,600)            3.4962  3.4962  32.6424  0.21556
RBF-FD, n = 50                           SP-FD  59,823      23 × (2,601)            3.4962  3.4962  32.6425  0.21556

Cubic test case, Ra = 7,000
Zhong et al. (2008)                      FE     393,216     32 × (12 × 32 × 32)     3.6254  3.6016  31.09    0.2176
Yoshida and Kageyama (2004)              FD     2,122,416   102 × (102 × 204)       3.5554  —       30.5197  —
Kameyama et al. (2008)                   FD     12,582,912  128 × (2 × 128 × 384)   3.6083  —       31.0741  0.21639
Ratcliff et al. (1996)                   FV     200,000     40 × (50 × 100)         3.5806  —       30.87    —
Stemmer et al. (2006)                    FV     663,552     48 × (6 × 48 × 48)      3.5983  3.5984  31.0226  0.21594
Stemmer et al. (2006)                    FV     Extrap.     Extrap.                 3.6090  —       31.0709  0.21583
Harder (1998) and Stemmer et al. (2006)  SP-FD  552,960     120 × (48 × 96)         3.6086  —       31.0765  0.21582
Harder (1998) and Stemmer et al. (2006)  SP-FD  Extrap.     Extrap.                 3.6096  —       31.0821  0.21578
RBF-PS (Wright et al. 2010)              SP     36,800      23 × (1,600)            3.6096  3.6096  31.0820  0.21577
RBF-FD, n = 50                           SP-FD  59,823      23 × (2,601)            3.6095  3.6096  31.0819  0.21577

Extrap. indicates that the results were obtained using Romberg extrapolation.
Solid lines (—) indicate numbers were not reported.
Abbreviations: FE finite element, FD finite difference, FV finite volume, SP-FD hybrid spectral and finite difference, SP purely spectral

methods, the standard deviation of all the quantities over the last 1,000 time steps was less than $5 \times 10^{-5}$, which is a standard measure indicating that the model has reached numerical steady state.

While the $Ra = 7{,}000$ tests above are the de facto benchmarks, some results are also available in the literature for the cubic test at $Ra = 10^5$. This Rayleigh number results in a more convective regime, with thinner plumes, so that higher resolutions are needed to properly capture the solution. For this test, $N = 6{,}561$ nodes were used on each spherical shell and $M = 43$ Chebyshev nodes in the radial direction. The spherical surface differential operators were again approximated using


Table 7 Comparison between computational methods for the isoviscous cubic mantle convection test case with $Ra = 10^5$

Model                        Type   Nodes      r × (θ × λ)           Nu_o    Nu_i    ⟨V_rms⟩  ⟨T⟩
Zhong et al. (2008)          FE     1,327,104  48 × (12 × 48 × 48)   7.8495  7.7701  154.8    0.1728
RBF-PS (Wright et al. 2010)  SP     176,128    43 × (4,096)          7.8120  7.8005  154.49   0.17123
RBF-FD, n = 50               SP-FD  282,123    43 × (6,561)          7.8072  7.8030  154.43   0.17069

$n = 50$. Increasing $M$ necessitates a decrease in the time step to maintain stability, since the Chebyshev nodes cluster quadratically at the boundaries. A time step of $6 \times 10^{-6}$ was used to reach a final time of $t = 0.35$, at which numerical steady state was reached (using the same criterion as in the $Ra = 7{,}000$ cases). The final RBF-FD steady-state solution is displayed in Fig. 14c, and a comparison with two other methods is given in Table 7.

The following observations can be made regarding the results for these benchmarks:

1. For the $Ra = 7{,}000$ test cases, the results from the RBF-PS method (Wright et al. 2010) and Harder's extrapolated results from a spherical harmonic-finite difference method (Harder 1998) are expected to be correct to four digits. The hybrid RBF-FD method matched these values to four digits for both tests. Furthermore, the RBF-FD method fails to match the global RBF-PS method to all digits in only three values.
2. The number of nodes (degrees of freedom) needed to obtain the $Ra = 7{,}000$ and $Ra = 10^5$ results with the RBF-FD method is significantly lower (by over an order of magnitude in all but one case) than for any of the other methods that use a hybrid spectral-FD discretization or a full finite difference, finite volume, or finite element method.
3. The Nusselt number is nondimensional and measures the ratio of convective to conductive heat transfer across a boundary. Thus, if there are no sources or sinks in the domain, energy should be conserved and $\mathrm{Nu}_i = \mathrm{Nu}_o$. This is exhibited quite clearly in the $Ra = 7{,}000$ cases for the RBF-FD method (as for the RBF-PS method) and to a slightly lesser extent in the $Ra = 10^5$ case (with a difference of only 0.05 %). This suggests that the method inherently dissipates physical quantities less.
4. While the RBF-FD method requires a higher resolution (larger $N$) than the global RBF-PS method to achieve similar results, it has a significant advantage in terms of computational cost and memory storage, as the spatial discretizations require a small fraction of the number of terms required by the global method.
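The bookkeeping behind these observations can be reproduced with a few lines of arithmetic. The following is only a sketch: the helper names are hypothetical, and the numbers are copied from the text and from Tables 6 and 7. It recomputes the total node counts, the number of time steps implied by the quoted step sizes and final times, and the relative mismatch between the two Nusselt numbers for the RBF-FD run at $Ra = 10^5$.

```python
# Hypothetical helpers; all numeric inputs are copied from the text and Tables 6-7.

def total_nodes(radial_nodes, nodes_per_shell):
    """Total degrees of freedom for M radial levels with N nodes per spherical shell."""
    return radial_nodes * nodes_per_shell


def step_count(final_time, time_step):
    """Number of time steps needed to reach final_time (rounded to nearest integer)."""
    return round(final_time / time_step)


def nusselt_mismatch_percent(nu_outer, nu_inner):
    """Relative difference between outer and inner Nusselt numbers, in percent."""
    return abs(nu_outer - nu_inner) / nu_inner * 100.0


if __name__ == "__main__":
    print("Ra = 7,000 nodes:", total_nodes(23, 2601))
    print("Ra = 1e5 nodes:  ", total_nodes(43, 6561))
    print("Ra = 7,000 steps:", step_count(1.0, 1e-4))
    print("Ra = 1e5 steps:  ", step_count(0.35, 6e-6))
    print("Nu mismatch [%]: ", nusselt_mismatch_percent(7.8072, 7.8030))
```

The mismatch is measured relative to $\mathrm{Nu}_i$ here; using $\mathrm{Nu}_o$ as the reference changes the result only in the third significant digit.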

6 Future Directions and Concluding Remarks

This chapter provides a survey of topics relevant to using RBF-FD, a meshless method, for a variety of problems that arise in the geosciences, with particular emphasis on spherical geometries, both on surfaces and within a volume. RBF-FD is a method that has only begun to grow in the last few years. From a practical standpoint, RBF-FD grew out of global RBFs, which showed exceptional accuracy and time stability for solving PDEs but whose computational cost had to be reduced if RBFs were to be effective at very large problem sizes. With the RBF-FD methodology, matrices go from full to 99 % empty. The price is the exchange of spectral accuracy for high-order algebraic convergence, assuming smooth data. However, since natural processes are almost never infinitely differentiable, little is lost and much is gained in terms of memory and runtime.

Although the future of RBF-FD is bright, many topics still need to be addressed. These include, but are not limited to, improved reliability, effectiveness, and scalability of RBF-FD implementations on novel computer architectures (Bollig et al. 2012), dynamic adaptive node refinement (Driscoll and Heryudono 2007), application of preconditioners and iterative solvers for elliptic PDEs, stability analysis in the presence of boundary conditions for hyperbolic problems, and the treatment of discontinuities in the domain.

References

Ascher UM, Ruuth SJ, Wetton BTR (1995) Implicit-explicit methods for time-dependent partial differential equations. SIAM J Numer Anal 32:797–823
Backus GE (1966) Potentials for tangent tensor fields on spheroids. Arch Ration Mech Anal 22:210–252
Blaise S, St-Cyr A (2012) A dynamic hp-adaptive discontinuous Galerkin method for shallow water flows on the sphere with application to a global tsunami simulation. Mon Weather Rev 140(3):978–996
Bochner S (1933) Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math Ann 108:378–410
Bollig E, Flyer N, Erlebacher G (2012) Solution to PDEs using radial basis function finite-differences (RBF-FD) on multiple GPUs. J Comput Phys 231:7133–7151
Chandrasekhar S (1981) Hydrodynamic and hydromagnetic stability. Dover, New York
Collatz L (1960) The numerical treatment of differential equations. Springer, Berlin
Driscoll TA, Fornberg B (2002) Interpolation in the limit of increasingly flat radial basis functions. Comput Math Appl 43:413–422
Driscoll TA, Heryudono A (2007) Adaptive residual subsampling methods for radial basis function interpolation and collocation problems. Comput Math Appl 53:927–939
Fasshauer GE (2007) Meshfree approximation methods with MATLAB. Interdisciplinary mathematical sciences, vol 6. World Scientific, Singapore
Flyer N, Lehto E, Blaise S, Wright GB, St-Cyr A (2012) A guide to RBF-generated finite differences for nonlinear transport: shallow water simulations on a sphere. J Comput Phys 231:4078–4095
Fornberg B (1996) A practical guide to pseudospectral methods. Cambridge University Press, Cambridge
Fornberg B, Lehto E (2011) Stabilization of RBF-generated finite difference methods for convective PDEs. J Comput Phys 230:2270–2285
Fornberg B, Piret C (2007) A stable algorithm for flat radial basis functions on a sphere. SIAM J Sci Comput 30:60–80
Fornberg B, Wright G (2004) Stable computation of multiquadric interpolants for all values of the shape parameter. Comput Math Appl 48:853–867
Fornberg B, Zuev J (2007) The Runge phenomenon and spatially variable shape parameters in RBF interpolation. Comput Math Appl 54:379–398
Fornberg B, Driscoll TA, Wright G, Charles R (2002) Observations on the behavior of radial basis functions near boundaries. Comput Math Appl 43:473–490


Fornberg B, Wright G, Larsson E (2004) Some observations regarding interpolants in the limit of flat radial basis functions. Comput Math Appl 47:37–55
Fornberg B, Larsson E, Flyer N (2011) Stable computations with Gaussian radial basis functions. SIAM J Sci Comput 33(2):869–892
Fornberg B, Lehto E, Powell C (2013) Stable calculation of Gaussian-based RBF-FD stencils. Comput Math Appl 65:627–637
Fox L (1947) Some improvements in the use of relaxation methods for the solution of ordinary and partial differential equations. Proc R Soc A 190:31–59
Fuselier EJ, Wright GB (2013) A high-order kernel method for diffusion and reaction-diffusion equations on surfaces. J Sci Comput 1–31. doi:10.1007/s10915-013-9688-x, http://dx.doi.org/10.1007/s10915-013-9688-x
Galewsky J, Scott RK, Polvani LM (2004) An initial-value problem for testing numerical models of the global shallow-water equations. Tellus 56A:429–440
Gottlieb D, Orszag SA (1977) Numerical analysis of spectral methods. SIAM, Philadelphia
Gupta MM (1991) High accuracy solutions of incompressible Navier-Stokes equations. J Comput Phys 93:343–359
Harder H (1998) Phase transitions and the three-dimensional planform of thermal convection in the Martian mantle. J Geophys Res 103:16,775–16,797
Hardy RL (1971) Multiquadric equations of topography and other irregular surfaces. J Geophys Res 76:1905–1915
Hesthaven JS, Gottlieb S, Gottlieb D (2007) Spectral methods for time-dependent problems. Cambridge University Press, Cambridge
Kameyama MC, Kageyama A, Sato T (2008) Multigrid-based simulation code for mantle convection in spherical shell using Yin-Yang grid. Phys Earth Planet Inter 171:19–32
Lele SK (1992) Compact finite difference schemes with spectral-like resolution. J Comput Phys 103:16–42
Li M, Tang T, Fornberg B (1995) A compact fourth-order finite difference scheme for the steady incompressible Navier-Stokes equations. Int J Numer Methods Fluids 20:1137–1151
Madych WR, Nelson SA (1992) Bounds on multivariate polynomials and exponential error estimates for multiquadric interpolation. J Approx Theory 70:94–114
Mairhuber JC (1956) On Haar's theorem concerning Chebyshev approximation problems having unique solutions. Proc Am Math Soc 7(4):609–615
Micchelli CA (1986) Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constr Approx 2:11–22
Peyret R (2002) Spectral methods for incompressible viscous flow. Springer, New York
Piret C (2012) The orthogonal gradients method: a radial basis functions method for solving partial differential equations on arbitrary surfaces. J Comput Phys 231:4662–4675
Powell MJD (1992) The theory of radial basis function approximation in 1990. In: Light W (ed) Advances in numerical analysis, vol II: wavelets, subdivision algorithms and radial functions. Oxford University Press, Oxford, pp 105–210
Ratcliff JT, Schubert G, Zebib A (1996) Steady tetrahedral and cubic patterns of spherical shell convection with temperature-dependent viscosity. J Geophys Res 101:473–484
Schaback R (1995) Error estimates and condition numbers for radial basis function interpolants. Adv Comput Math 3:251–264
Schaback R (2005) Multivariate interpolation by polynomials and radial basis functions. Constr Approx 21:293–317


Schoenberg IJ (1938) Metric spaces and completely monotone functions. Ann Math 39:811–841
Spotz WF, Taylor MA, Swarztrauber PN (1998) Fast shallow water equation solvers in latitude-longitude coordinates. J Comput Phys 145:432–444
St-Cyr A, Jablonowski C, Dennis JM, Tufo HM, Thomas SJ (2008) A comparison of two shallow-water models with nonconforming adaptive grids. Mon Weather Rev 136:1898–1922
Stemmer K, Harder H, Hansen U (2006) A new method to simulate convection with strongly temperature-dependent and pressure-dependent viscosity in spherical shell. Phys Earth Planet Inter 157:223–249
Taylor M, Tribbia J, Iskandarani M (1997) The spectral element method for the shallow water equations on the sphere. J Comput Phys 130:92–108
Trefethen LN (2000) Spectral methods in MATLAB. SIAM, Philadelphia
Turing A (1952) The chemical basis of morphogenesis. Philos Trans R Soc B 237:37–52
van der Vorst H (1992) BI-CGSTAB: a fast and smoothly converging variant of BI-CG for the solution of nonsymmetric linear systems. SIAM J Sci Stat Comput 13(2):631–644. doi:10.1137/0913035
Varea C, Aragon J, Barrio R (1999) Turing patterns on a sphere. Phys Rev E 60:4588–4592
Wilson CTR (1920) Investigations on lightning discharges and on the electric field of thunderstorms. Philos Trans R Soc Lond A 221:73–115
Wilson CTR (1929) Some thundercloud problems. J Frankl Inst 208:1–12
Womersley RS, Sloan IH (2003/2007) Interpolation and cubature on the sphere. Website, http://web.maths.unsw.edu.au/~rsw/Sphere/
Wright GB, Fornberg B (2006) Scattered node compact finite difference-type formulas generated from radial basis functions. J Comput Phys 212:99–123
Wright GB, Flyer N, Yuen DA (2010) A hybrid radial basis function – pseudospectral method for thermal convection in a 3D spherical shell. Geochem Geophys Geosyst 11(7):Q07,003
Yoon J (2001) Spectral approximation orders of radial basis function interpolation on the Sobolev space. SIAM J Math Anal 33(4):946–958
Yoshida M, Kageyama A (2004) Application of the Yin-Yang grid to a thermal convection of a Boussinesq fluid with infinite Prandtl number in a three-dimensional spherical shell. Geophys Res Lett 31:L12,609
Zhai S, Feng X, He Y (2013) A family of fourth-order and sixth-order compact difference schemes for the three-dimensional Poisson equation. J Sci Comput 54:97–120
Zhong S, McNamara A, Tan E, Moresi L, Gurnis M (2008) A benchmark study on mantle convection in a 3-D spherical shell using CitcomS. Geochem Geophys Geosyst 9:Q10,017


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_62-2 © Springer-Verlag Berlin Heidelberg 2014

Inverse Resistivity Problems in Computational Geoscience

Alemdar Hasanov (Hasanoğlu) (a) and Balgaisha Mukanova (b)
(a) Mathematics and Computer Science, Izmir University, Izmir, Turkey
(b) Eurasian National University, Astana, Kazakhstan

Abstract

We study coefficient inverse problems arising in the modeling of resistivity prospecting. Numerical simulations are investigated for vertically and cylindrically layered media. Conductivity coefficients are assumed to be sufficiently smooth 1D functions. The model leads to an inverse problem of identifying an unknown coefficient (the conductivity) in an elliptic equation in $\mathbb{R}^2$ inside a slab or in a cylinder. The direct problem is formulated as a mixed BVP in $\mathbb{R}^2$. Measured data are assumed to be available on the upper boundary of the medium or along the axis of the well. A logarithmic transformation is applied to the unknown coefficient, and the inverse problem is studied as a minimization problem for the residual functional. A numerical method is discussed for interpreting resistivity prospecting data in both models of a layered medium. The method is implemented for realistic conductivity distributions, with both noise-free and noisy data.

1 Introduction

The resistivity sounding method entered geophysical prospecting in 1927, following the work of the Schlumberger brothers. The main ideas of the method are described in Stefanescu and Shlumberger (1930). The method is based on measurements of surface potentials produced by currents injected into a medium. The corresponding mathematical models were first stated by Slichter (1933) and Langer (1933). In most practical cases, the model of a vertically layered medium is used as a preliminary approximation to the resistivity distribution of the medium, and the conductivity function is assumed to be piecewise constant. The data interpretation technique is then based on the layer stripping method, first proposed by Pekeris (1940). The modern formulation and applications of that technique are described by Sylvester (2000) in a review book. The method is attractive because of its simplicity and computational efficiency. The results of data interpretation are then improved by using other medium models and/or additional prospecting methods. Here we discuss how the resistivity sounding problem can be solved by using a model of the medium with continuously distributed electrical properties. We consider only cylindrically and vertically layered media.

2 Scientific Relevance of Coefficient Inverse Problems in Geomathematics

A mathematical model of resistivity prospecting is based on the equations for the field potential in a nonhomogeneous medium. One needs to determine the resistivity distribution of the medium from additional measurements on the accessible boundary. Thus, the problem reduces to a coefficient inverse problem (CIP) for an elliptic equation.

The original formulation of Langer (1933) is regarded as the classical statement of the resistivity sounding problem. The model of Langer assumes a vertical electrical sounding in which the electrical properties of the medium depend only on depth and a source electrode is placed on the surface of the medium. Tikhonov (1949) also studied this model and rigorously proved the uniqueness of the solution to the inverse problem. This problem is an example of a coefficient inverse problem that is severely ill-posed (Alessandrini 1988). The general principles of the solution to such problems, including quasisolution and regularization methods, are stated by Ivanov (1963) and Tikhonov and Arsenin (1977).

Nowadays, the theory of coefficient inverse problems is one of the most active areas of applied mathematics. The best-known application is electrical impedance tomography, used in medicine, defectoscopy, and geophysics. Note that the first complete statement of the coefficient inverse problem for an elliptic equation and an elliptic system of equations was presented by Hasanov (1995). A review of the literature in this field of geophysical research can be found in the book by Spichak (2011). The newest methods for solving CIPs, with many numerical examples, are described in the book by Beilina and Klibanov (2012).

The resistivity prospecting method is used both for the vertically layered model of the medium and for well logging. Well logging has been studied extensively in the oil and gas industry since the 1940s (Archie 1942; Tabarovsky et al. 1994; Peng 1997; Dvoretckiy and Yarmakhov 1998; Epov et al. 2010; Onegova and Epov 2011, and references therein). Archie (1942) demonstrated for the first time how the electrical log can be used to provide qualitative indications of the presence of oil and gas reservoirs. Kaufman and Dashevsky (2003) provide an introduction to the basic principles and techniques of well electromagnetic sounding. Numerical solutions to the direct problem, in which the conductivity is a given function, have been considered by many authors for different well logging tools (see, for instance, Lv et al. 2009; Epov et al. 2010; Geng et al. 2012, and references therein). In Peng (1997), the spontaneous potential well logging inverse problem is formulated as a variational problem for an elliptic equation in which the resistivity of the objective layer appears as a coefficient. The existence of solutions to the model is then proven, and sufficient conditions ensuring the uniqueness of the solution are obtained. In Tabarovsky et al. (1994) and Pen'kovskii and Korsakova (2010), the inverse problem is formulated for an inductive logging tool, and several numerical results are obtained.

The first successful attempt at a numerical solution to the problem posed by Langer (1933) and Tikhonov (1949) was briefly described in the unfinished research of Alekseev et al. (1989). These authors employed a regularization method, smoothed the solution at each iteration step, and obtained some examples of recovery of the unknown coefficient. In the paper by Mukanova and Orunkhanov (2010), the problem is solved numerically for various assumed conductivity distributions. Unlike in Alekseev et al. (1989), neither regularization of the residual functional nor a smoothing procedure is used. Instead, a two-stage procedure for recovering the unknown coefficient is proposed, which leads to a successful numerical solution of the considered inverse problem. In the first step, the logarithmic derivative, $p(z)$, of the unknown conductivity $\sigma(z)$ is recovered. In the second step, the function $\sigma(z)$ is found using an analytical formula.

In the paper by Mukanova (2012), this method is applied to the well logging problem in which the electrical properties of the medium depend only on the distance to the polar axis and do not depend on the depth. In addition, whereas the papers cited above (Langer 1933; Tikhonov 1949; Alekseev et al. 1989) consider the case in which the source electrode is placed on the surface of a vertically layered medium, here it is assumed that the current source is placed inside a well and that the surrounding medium is cylindrically layered. The problem is expressed in cylindrical coordinates $(r, \varphi, z)$ and uses the statement of the direct problem described in the book by Dvoretckiy and Yarmakhov (1998). This statement differs from the classical treatment of Langer and Tikhonov in both the boundary conditions and the geometry of the physical domain.

3 Least-Square (Quasisolution) Approach for Resistivity Prospecting Problem

3.1 Formulation of Inverse Problems Corresponding to Two Models of a Medium

1. We consider two different models of the medium, denoted by M1 and M2.

M1. Assume that the medium is vertically layered, i.e., the conductivity function $\sigma(z)$ depends on the depth $z$ only. Suppose that its value $\sigma_0 = \sigma(0)$ on the surface of the medium is given and that $\sigma(z)$ changes continuously along $z \in [0, H]$, where $H$ is the thickness of the slab.

M2. Assume that the conductivity of the medium in a well has a given value, $\sigma_w = \mathrm{const}$, and let $r_w$ be the radius of the well. Suppose that the well is surrounded by a penetration zone, $r_w \le r \le r_e$, in which the conductivity of the medium changes continuously in the radial direction. Outside the penetration zone, the containing environment has another known and constant conductivity, $\sigma_e = \mathrm{const}$ for $r > r_e$.

For convenience, we introduce dimensionless variables. Let the values $H$, $\sigma_0$ and $r_w$, $\sigma_w$ be the units of length and conductivity for models M1 and M2, respectively. Let the current source be placed at the origin, with current amplitude $I$. We take

$$ M = \frac{I}{4\pi\sigma_0 H} \quad \text{and} \quad M = \frac{I}{4\pi\sigma_w r_w} \tag{1} $$

to be the corresponding units of potential for models M1 and M2. In the case of model M1, we assume that the set $S$ of dimensionless conductivity functions $\sigma(z)$ satisfies the following conditions:

$$ S := \{\sigma(z) \in C^2[0,1] : \sigma(0) = 1,\ \sigma'(0) = 0;\ 0 < \sigma_1 \le \sigma(z) \le \sigma_2\}. $$

For the model M2,

$$ \sigma_{total}(r) = \begin{cases} 1, & 0 \le r < 1, \\ \sigma(r), & 1 \le r \le r_1, \\ \sigma_1 = \mathrm{const}, & r_1 < r < \infty, \end{cases} $$


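and the set of admissible conductivity functions $S$ is defined as follows:

$$ \sigma(r) \in S = \{\sigma(r) : 0 < \sigma_1 \le \sigma(r) \le \sigma_2 < \infty,\ \sigma(r) \in C^2[1, r_1]\}, $$

where $r_1 = r_e/r_w$ and $\sigma_1 = \sigma_e/\sigma_w$.

The commonly used mathematical model of resistivity prospecting is an equation for the field potential $u(r, z)$. Let us formulate the coefficient inverse problems (CIPs) in terms of this model for the cases M1 and M2.

Case M1 (CIP1): Let the function $u(r, z)$ be the solution to the following boundary value problem:

$$ \begin{cases} \dfrac{\sigma(z)}{r}\dfrac{\partial}{\partial r}\left(r\dfrac{\partial u}{\partial r}\right) + \dfrac{\partial}{\partial z}\left(\sigma(z)\dfrac{\partial u}{\partial z}\right) = 0, & 0 < r < \infty,\ 0 < z < 1, \\ \dfrac{\partial u}{\partial z}\Big|_{z=0} = \delta(r), \\ \dfrac{\partial u}{\partial r}\Big|_{r=0} = 0, \quad u(r, z)|_{z=1} = 0, \\ \lim\limits_{r\to\infty} u(r, z) = 0. \end{cases} \tag{2} $$

Determine the unknown electric conductivity coefficient $\sigma(z) \in S$ from the measured data $u_1(r)$, where

$$ \frac{\partial u}{\partial r}\Big|_{z=0} = u_1(r). \tag{3} $$

Case M2 (CIP2): Let the function $u(r, z)$ be the solution to the following boundary value problem:

$$ \begin{cases} \dfrac{1}{r}\dfrac{\partial}{\partial r}\left(r\sigma(r)\dfrac{\partial u(r,z)}{\partial r}\right) + \sigma(r)\dfrac{\partial^2 u(r,z)}{\partial z^2} = 0, & 0 < r < \infty,\ -\infty < z < +\infty, \\ \dfrac{\partial u}{\partial r}\Big|_{r=0} = \delta(r), \\ \lim\limits_{r\to\infty} u(r, z) = 0, \\ \lim\limits_{z\to\pm\infty} u(r, z) = 0. \end{cases} \tag{4} $$

Determine the unknown electric conductivity coefficient $\sigma(r) \in S$ from the measured data $U(z)$, where

$$ u(r, z)|_{r=0} = U(z). \tag{5} $$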

Remark: in the sequel we formulate CIP2 anew.

2. Let us first reformulate the direct problems.

Case M1. Note that the boundary condition at $z = 1$ in the statement (2) is an approximate form of the requirement that the potential should vanish at infinity. Namely, the condition at infinity is shifted to the boundary $z = 1$. Let us apply the Hankel transform to the problem (2) with respect to the variable $r$. Then we obtain a two-point problem for a second-order ordinary differential equation:

$$ \frac{d}{dz}\left[\sigma(z)\frac{dV}{dz}\right] - \lambda^2\sigma(z)V = 0, \tag{6} $$

with the following boundary conditions:

$$ \frac{dV}{dz}\Big|_{z=0} = 1, \quad V|_{z=H} = 0. \tag{7} $$

Here the function

$$ V(\lambda, z) = \int_0^\infty u(r, z)J_0(\lambda r)\,r\,dr \tag{8} $$

is the Hankel transform of the potential $u(r, z)$. Having the function $V(\lambda, z)$, we can derive the solution to the problem (2) by the following inversion formula:

$$ u(r, z) = \int_0^\infty V(\lambda, z)J_0(\lambda r)\,\lambda\,d\lambda. $$
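The two-point problem (6)–(7) is easy to discretize, which is what makes the transformed formulation attractive numerically. The sketch below is not the authors' scheme; it is a minimal second-order finite difference solver written under the following assumptions: the equation is divided by $\sigma$ and written as $V'' + pV' - \lambda^2 V = 0$ with $p = \sigma'/\sigma$, the interval is the dimensionless $[0, 1]$, and the boundary flux $V'(0)$ is passed as a parameter `flux`, since its sign convention depends on how the source term is normalized. The function name and arguments are hypothetical.

```python
import math


def solve_forward_m1(p, lam, flux=1.0, n=400):
    """Solve V'' + p(z) V' - lam**2 V = 0 on [0, 1], V'(0) = flux, V(1) = 0.

    p    : callable, logarithmic derivative sigma'(z) / sigma(z)
    lam  : Hankel transform parameter
    flux : prescribed value of V'(0) (sign convention is an assumption)
    Returns the grid z and the nodal values V (lists of length n + 1).
    """
    h = 1.0 / n
    z = [i * h for i in range(n + 1)]
    m = n  # unknowns V_0 .. V_{n-1}; V_n = 0 is imposed directly
    lower = [0.0] * m
    diag = [0.0] * m
    upper = [0.0] * m
    rhs = [0.0] * m

    # Row 0: ghost node V_{-1} = V_1 - 2 h flux eliminates the Neumann condition.
    diag[0] = -2.0 / h**2 - lam**2
    upper[0] = 2.0 / h**2
    rhs[0] = 2.0 * flux / h - p(z[0]) * flux

    # Interior rows: central differences for V'' and V'.
    for i in range(1, m):
        pi = p(z[i])
        lower[i] = 1.0 / h**2 - pi / (2.0 * h)
        diag[i] = -2.0 / h**2 - lam**2
        upper[i] = 1.0 / h**2 + pi / (2.0 * h)  # multiplies V_n = 0 in the last row

    # Thomas algorithm for the tridiagonal system.
    for i in range(1, m):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    v = [0.0] * (n + 1)
    v[m - 1] = rhs[m - 1] / diag[m - 1]
    for i in range(m - 2, -1, -1):
        v[i] = (rhs[i] - upper[i] * v[i + 1]) / diag[i]
    return z, v


if __name__ == "__main__":
    # Homogeneous medium (p = 0): V(0) = -flux * tanh(lam) / lam in closed form.
    lam = 2.0
    z, v = solve_forward_m1(lambda x: 0.0, lam, flux=1.0)
    print(v[0], -math.tanh(lam) / lam)
```

For a homogeneous medium the numerical value of $V(\lambda, 0)$ can be compared with the closed-form solution $V = -\text{flux}\cdot\sinh(\lambda(1-z))/(\lambda\cosh\lambda)$, which gives a quick check of the discretization before a variable $p(z)$ is used.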

Case M2. In the cylindrically symmetric case, the governing equation is given by (4). We further consider the solution on the cylindrical layer $1 < r < r_1$. Then the function $u(r, z)$ should satisfy the condition

$$ \lim_{z\to\pm\infty} u(r, z) = 0, \quad 1 < r < r_1. \tag{9} $$

Now we need boundary conditions at $r = 1$ and $r = r_1$. They should express the continuity of the potential $u(r, z)$ and of the current density $\sigma(r)\,\partial u/\partial r$ at the boundaries $r = 1$ and $r = r_1$. We derive them later. First, let us apply the Fourier transform to the electric potential $u(r, z)$ with respect to $z$:

$$ W(r, s) = \int_0^\infty u(r, z)\cos(sz)\,dz. \tag{10} $$

Then

$$ u(r, z) = \frac{2}{\pi}\int_0^\infty W(r, s)\cos(sz)\,ds. \tag{11} $$

Introduce the notation

$$ a(r) = r\sigma(r). \tag{12} $$


Then the transformed governing equation is the following ordinary differential equation with parameter $s$:

$$ \frac{d}{dr}\left[a(r)\frac{dW}{dr}\right] - a(r)s^2 W(r, s) = 0, \quad s \in [0, \infty),\ r \in (1, r_1). \tag{13} $$

Boundary conditions for Eq. (13) were first derived in Dvoretckiy and Yarmakhov (1998). The proof is available in Mukanova (2012). It has been shown that the boundary conditions at $r = 1$ and $r = r_1$ are the following:

$$ W'(1, s) = k(s)W(1, s) - l(s), \quad W'(r_1, s) = m(s)W(r_1, s), \tag{14} $$

where the functions

$$ k(s) = sI_1(s)/I_0(s), \quad l(s) = 1/I_0(s), \quad m(s) = -sK_1(sr_1)/K_0(sr_1) \tag{15} $$

are defined in terms of the modified Bessel functions $K_1$, $K_0$ and $I_0$, $I_1$. These conditions express the continuity of the potential and the current at the boundaries $r = 1$ and $r = r_1$.

Remark. Another form of the condition at $z = 1$ for the model M1 could be derived using the technique used for the model M2, but that technique requires the value of $\sigma(1)$, which is unknown by the physical meaning of the problem.

Statements (6)–(7) and (13)–(14) comprise the forward problem for models M1 and M2, respectively, when the conductivity function is known.

3. Now we reformulate the coefficient inverse problem in the case M2. To state an inverse problem in the layer $1 < r < r_1$, we first consider the solution inside the well. Suppose that in a small region around the origin the solution is just the potential of a point source and can in general be expressed in the form

$$ u(r, z) = \frac{1}{\sqrt{r^2 + z^2}} + \Phi(r, z), \tag{16} $$

where the function $\Phi(r, z)$ is bounded, vanishes as $z \to \pm\infty$, and has a well-defined Fourier transform (with respect to $z$). In the region where the conductivity function is constant, i.e., for $r \in (0, 1)$ (and for $r \in (r_1, \infty)$ as well), the function $W(r, s)$ is a solution to the modified Bessel equation

$$ W'' + \frac{1}{r}W' - s^2 W = 0, $$

whose general solution can be expressed in terms of modified Bessel functions of the first and second kind of order zero:

$$ W(r, s) = \alpha(s)I_0(sr) + \beta(s)K_0(sr). \tag{17} $$


From the formula

$$ \frac{1}{\sqrt{r^2 + z^2}} = \frac{2}{\pi}\int_0^\infty K_0(sr)\cos(sz)\,ds, $$

which holds for Bessel functions, and from expressions (16) and (17), we find that inside the well the function $W(r, s)$ can be written as

$$ W(r, s) = \alpha(s)I_0(sr) + K_0(sr). \tag{18} $$

Then by (16),

$$ \alpha(s) = \int_0^\infty \Phi(0, z)\cos(sz)\,dz \tag{19} $$

is the Fourier transform of the function $\Phi(0, z)$. The function $\Phi(0, z)$ is known from the measured data,

$$ \Phi(0, z) = U(z) - \frac{1}{z}, $$

so the solution to Eq. (13) on the boundary $r = 1$ of the well is given by

$$ W(1, s) = \alpha(s)I_0(s) + K_0(s) \equiv \varphi(s). \tag{20} $$

Thus, the considered inverse problem is formulated as follows:

CIP2. Find the function $\sigma(r) \equiv a(r)/r$ using the solution of the BVP stated in Eq. (13) and conditions (14), satisfying the additional condition (20), where the function $\varphi(s)$ is given.

We have thus reduced the inverse problem CIP2 to Eq. (13) and conditions (14) and (20), which is similar to the statement for the model M1 except for the boundary conditions. We can therefore apply the same numerical method to both cases.
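The boundary condition at $r = 1$ in (14) can be checked directly against the in-well representation (18): differentiating $W = \alpha I_0(sr) + K_0(sr)$ and using the Wronskian identity $I_0(s)K_1(s) + I_1(s)K_0(s) = 1/s$ gives $W'(1,s) = k(s)W(1,s) - l(s)$ for any $\alpha$. The sketch below verifies this numerically. It is an illustration only: the Bessel functions are evaluated with a plain trapezoidal rule applied to their standard integral representations (the helper names `bessel_i` and `bessel_k` are hypothetical, and a library routine would normally be used instead), and the sign of $m(s)$ is the one required for an exterior solution proportional to $K_0(sr)$, which decays as $r \to \infty$.

```python
import math


def bessel_i(n, x, m=400):
    """I_n(x) = (1/pi) * int_0^pi exp(x cos t) cos(n t) dt, integer n, trapezoidal rule."""
    h = math.pi / m
    total = 0.5 * (math.exp(x) + math.exp(-x) * math.cos(n * math.pi))
    for j in range(1, m):
        t = j * h
        total += math.exp(x * math.cos(t)) * math.cos(n * t)
    return total * h / math.pi


def bessel_k(n, x, m=4000):
    """K_n(x) = int_0^inf exp(-x cosh t) cosh(n t) dt for x > 0, truncated trapezoidal rule."""
    t_max = math.acosh(1.0 + 40.0 / x)  # integrand is below exp(-40) beyond this point
    h = t_max / m
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)) * math.cosh(n * t_max))
    for j in range(1, m):
        t = j * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(n * t)
    return total * h


def k_coef(s):
    return s * bessel_i(1, s) / bessel_i(0, s)


def l_coef(s):
    return 1.0 / bessel_i(0, s)


def m_coef(s, r1):
    # Sign chosen so that W = C * K_0(s r) satisfies W'(r1) = m(s) W(r1).
    return -s * bessel_k(1, s * r1) / bessel_k(0, s * r1)


def well_condition_residual(s, alpha):
    """W'(1, s) - (k(s) W(1, s) - l(s)) for W = alpha I_0(s r) + K_0(s r)."""
    w1 = alpha * bessel_i(0, s) + bessel_k(0, s)
    dw1 = s * (alpha * bessel_i(1, s) - bessel_k(1, s))  # I_0' = I_1, K_0' = -K_1
    return dw1 - (k_coef(s) * w1 - l_coef(s))


if __name__ == "__main__":
    for s in (0.5, 1.0, 3.0):
        print(s, well_condition_residual(s, alpha=0.7), m_coef(s, r1=2.0))
```

The residual vanishes to quadrature accuracy for arbitrary `alpha`, which reflects the fact that condition (14) at $r = 1$ encodes only the point source and not the unknown coefficient.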

3.2 Quasisolution Method Due to measurement errors, these problems CIP1 and CIP2 might not have a solution in any suitable class of admissible coefficients. For this reason, use the quasisolution method (Ivanov 1963) to solve the state inverse problems. First, make some preliminary transformations. Let us introduce the logarithmic derivative of the coefficient in Eqs. (6) and (13): p.z/ D .ln .z//0 D

0 1 0 . r/0 ; and p.r/ D .ln a.r//0 D D C  r  r

(21)

It has been shown in Mukanova and Orunkhanov (2010, Appendix) that the measured data are sensitive to a reflection factor and not sensitive to the contrast in the electrical properties of the adjacent medium. The function $p(\cdot)$ introduced above is just a continuous analog of a reflection factor. It is therefore preferable to express the CIPs in terms of this function. Evidently, the conductivity function can be expressed via the reflection factor function $p(\cdot)$ as follows:

$$ \sigma(z) = \exp\left\{ \int_0^z p(z)\,dz \right\} \tag{22} $$

in the case M1, and

$$ \sigma(r) = \frac{1}{r} \exp\left\{ \int_1^r p(r)\,dr \right\} \tag{23} $$

in the case M2. To construct residual functionals, we reformulate the CIPs above.

Case M1. Let $\sigma(z)$ be a given coefficient and $p(z)$ its logarithmic derivative. Denote by $u = u(r, z; p)$ the unique solution to the direct problem (2) corresponding to the coefficient $\sigma(z)$. Introduce the operator

$$ \Lambda[p] := \left. \frac{\partial u(r, z; p)}{\partial r} \right|_{z=0}. $$

Then CIP1 can be formulated in the following operator form (Hasanov 1997): $\Lambda(p)(r) = u_1(r)$, $r \in [0, \infty)$. Therefore, CIP1 can be reduced to solving this operator equation. Now, define the residual functional (Tikhonov and Arsenin 1977)

$$ J(p) = \frac{1}{2} \int_0^\infty \left( \Lambda(p) - u_1(r) \right)^2 r\,dr, \tag{24} $$

and consider the minimization problem

$$ J(p_*) = \inf_{p \in P} J(p), \quad p_* \in P, \tag{25} $$

in the class of admissible reflection coefficients

$$ P = \{\, p(z) \in C^1[0, H] :\ p(z) = (\ln \sigma(z))',\ \sigma(z) \in S \,\}. \tag{26} $$

The function $\sigma_*(z)$ calculated from $p_*$ by formula (22) is defined to be a quasisolution of the inverse problem CIP1.
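The passage from a reflection factor function back to a conductivity can be illustrated numerically. The following is a minimal sketch, assuming the relation $\sigma(z) = \exp\{\int_0^z p\,dz\}$ of the case M1 and a simple cumulative trapezoidal rule; the function name `sigma_from_p`, the grid, and the test profile $p(z) = 2z$ are illustrative assumptions, not part of the original method.

```python
import math

def sigma_from_p(p, z_grid):
    """Evaluate sigma(z) = exp(int_0^z p(t) dt) on z_grid (cumulative trapezoidal rule).

    The grid is assumed to start at the lower limit of the integral, so sigma = 1 there.
    """
    sigma = [1.0]
    integral = 0.0
    for z_prev, z_next in zip(z_grid, z_grid[1:]):
        integral += 0.5 * (p(z_prev) + p(z_next)) * (z_next - z_prev)
        sigma.append(math.exp(integral))
    return sigma

# Hypothetical test profile: p(z) = 2 z, whose exact conductivity is sigma(z) = exp(z^2).
n = 200
z_grid = [i / n for i in range(n + 1)]
sigma = sigma_from_p(lambda z: 2.0 * z, z_grid)
```

Because the conductivity is obtained as an exponential, it stays positive for any reflection factor function, which is one practical reason to iterate on $p$ rather than on the conductivity itself.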

Define the transformed measured data

$$ \varphi(\lambda) = \int_0^\infty u_1(r)\,J_1(\lambda r)\,r\,dr. $$

Due to the unitary property of the Hankel transformation, the differentiation rules for transformed functions, and formulas (8), the residual functional is equal to

$$ J(p) = \frac{1}{2} \int_0^\infty \left( \varphi(\lambda) - V(\lambda, 0) \right)^2 d\lambda. \tag{27} $$

We will consider the above minimization problem (24) and (25) for the transformed functional (27) in the set of admissible reflection factor functions $p(z)$. In the same way, we introduce a residual functional for CIP2. Let

$$ J[p] = \frac{1}{2} \int_{s_1}^{s_2} \left( W(1, s) - \varphi(s) \right)^2 ds \tag{28} $$

be the residual functional, where $0 \le s_1 < s_2 < \infty$. We construct a quasisolution to CIP2, defined in (13), (14), and (20), by minimizing the functional (28) with respect to the function $p(r)$.

4 Numerical Method Based on Conjugate Gradient Algorithm

4.1 Gradient Formulas

Thus, we first need to reformulate the problem in terms of reflection functions; then we have to solve the minimization problem numerically. Gradient methods are the most common way to do so. There are different ways to obtain gradient formulas. In most practical cases, the gradient is expressed via the solution to the corresponding adjoint problem. In our case, however, the gradient is expressed in closed form via the solution to the direct problem. Note that a closed form is preferable because it yields higher numerical accuracy. In Mukanova (2009, 2012), the Fréchet derivatives of the considered functionals are obtained in the following forms:

CIP1:

$$ \nabla J[p] = \sigma(z) \int_0^\infty \left( V(\lambda, 0) - \varphi(\lambda) \right) V(\lambda, z)\,V'(\lambda, z)\,\lambda^2\,d\lambda, \tag{29} $$

CIP2:

$$ \nabla J[p] = a(r) \int_{s_1}^{s_2} l^{-1}(s) \left( W(1, s) - \varphi(s) \right) W(r, s)\,W'(r, s)\,ds. \tag{30} $$

Here the functions $\sigma(z)$ and $a(r)$ should be replaced by their corresponding expressions via $p(\cdot)$. The expressions (29) and (30) are similar to each other. The differences are the weighting coefficient $l^{-1}(s)$ and the statement of the boundary conditions in the direct problems.

4.2 The Solution to Direct Problems

Implementing any gradient method requires solving the direct problem many times. The number of repetitions depends on the number of iterations of the gradient method and on the number of grid points for the Fourier transform parameter. Evidently, the total number of direct problem solutions might be very high, so the solver should be both accurate and efficient. Because the direct problems formulated above are very simple, they can be solved by standard finite difference (FDM) or finite element (FEM) methods.

4.3 A Gradient Method Algorithm

The minimization problem (25) for the residuals (27) and (28) can be solved using the following iterative algorithm (see, for instance, Vasil'ev 1981):

(1) Specify the value of $\varepsilon$ used in the termination criterion (see step (4)) and the tolerance for determining the parameter $\alpha_n$ as one of the minima of $J[p^{(n)} - \alpha_n q^{(n)}]$ in step (3).

(2) Choose an initial guess $p^{(0)}(z) = (\ln \sigma^{(0)}(z))'$ (case M1) or $p^{(0)}(r) = (\ln (r \sigma^{(0)}(r)))'$ (case M2).

(3) Find the new values of $p$ and $q$ using the conjugate gradient method formulas:

$$ \begin{aligned} q^{(0)} &= \nabla J[p^{(0)}], \quad q^{(1)} = q^{(0)}, \\ q^{(n+1)} &= q^{(n)} - \alpha_n \left( \nabla J[p^{(n)}] - \beta_n q^{(n-1)} \right), \quad n = 1, 2, 3, \ldots, \\ p^{(n+1)} &= p^{(n)} - \alpha_n q^{(n)}, \quad n = 0, 1, 2, \ldots, \end{aligned} \tag{31} $$

where the coefficient $\beta_n$ is computed as

$$ \beta_n = - \frac{\langle \nabla J[p^{(n)}], \nabla J[p^{(n-1)}] \rangle}{\| \nabla J[p^{(n-1)}] \|^2} $$

and the value $\alpha_n$ is defined by the conditions

$$ \alpha_n \ge 0, \quad J[p^{(n)} - \alpha_n q^{(n)}] = \min_{\alpha \in [0, \alpha_{\max}]} J[p^{(n)} - \alpha q^{(n)}]. \tag{32} $$

(4) Repeat step (3) until the following stopping criterion is satisfied:

$$ \max\left( |J[p^{(n)}]|, \| \nabla J[p^{(n)}] \| \right) < \max\left( \| \omega \|_{L_2[s_1, s_2]}, \varepsilon \right). \tag{33} $$

Fig. 1 Recovery of an increasing $\sigma(z)$ and the corresponding $p(z)$ in the case of noise-free synthetic measured data

Here, $\| \omega \|_{L_2[s_1, s_2]}$ is the estimated norm of the additive noise in the measurements. Let us give several notes on the practical use of the gradient method:

(a) It is important to start with very small values of the parameter $\alpha_n$; in our practice it was above $10^{-5}$.

(b) When choosing an initial guess, it is recommended to set it to the lowest possible values of the function $\sigma(\cdot)$.

(c) It is useful to use a logarithmic grid for the Fourier or Hankel transform parameter, for example of the form $\ln(1 + s)$.

(d) The most time-consuming step in an implementation of the method is the determination of $\alpha_n$ by (32); it is especially important to choose a convenient value of $\alpha_{\max}$. Computations show that it is useful to set $\alpha_{\max} \sim \| \nabla J[p^{(n-1)}] \|^{-1}$.
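The overall structure of such an iteration (a search direction built from gradients, a bounded line search on $[0, \alpha_{\max}]$, and a stopping rule of the type (33)) can be sketched on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses the standard Fletcher-Reeves coefficient instead of the chapter's formula for $\beta_n$, a golden-section line search, a restart safeguard, and a hypothetical factor `alpha_scale` in $\alpha_{\max} = \texttt{alpha\_scale} / \|\nabla J\|$. The quadratic functional stands in for the residual functionals, whose evaluation would require solving the direct problem.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def golden_section(phi, a, b, iters=80):
    """Minimize a unimodal function phi on [a, b] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc, fd = phi(c), phi(d)
    for _ in range(iters):
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = phi(c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)

def conjugate_gradient(J, grad, p0, eps=1e-6, noise_norm=0.0,
                       alpha_scale=10.0, max_iter=200):
    """Conjugate gradient descent with a line search bounded by alpha_max."""
    p = list(p0)
    g = grad(p)
    q = list(g)                                   # first direction: the gradient
    for _ in range(max_iter):
        g_norm = math.sqrt(dot(g, g))
        # Stopping rule of the type (33).
        if g_norm == 0.0 or max(abs(J(p)), g_norm) < max(noise_norm, eps):
            break
        alpha_max = alpha_scale / g_norm          # heuristic in the spirit of note (d)
        alpha = golden_section(
            lambda a: J([pi - a * qi for pi, qi in zip(p, q)]), 0.0, alpha_max)
        p = [pi - alpha * qi for pi, qi in zip(p, q)]
        g_new = grad(p)
        beta = dot(g_new, g_new) / dot(g, g)      # Fletcher-Reeves (assumed choice)
        q = [gn + beta * qi for gn, qi in zip(g_new, q)]
        if dot(g_new, q) <= 0.0:                  # safeguard: restart if not a descent direction
            q = list(g_new)
        g = g_new
    return p

# Toy quadratic functional with minimizer (1, -2) and minimum value 0.
weights, target = [1.0, 4.0], [1.0, -2.0]
J = lambda p: 0.5 * sum(w * (pi - ti) ** 2 for w, pi, ti in zip(weights, p, target))
grad = lambda p: [w * (pi - ti) for w, pi, ti in zip(weights, p, target)]
p_min = conjugate_gradient(J, grad, [0.0, 0.0])
```

In the actual inverse problem each call of `J` or `grad` involves a direct problem solve, which is why the choice of $\alpha_{\max}$ and the number of line-search evaluations dominate the cost.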

4.4 Numerical Results

The method described above has been tested with noisy and noise-free synthetic data. Computations show that the conditions under which the solution can be obtained numerically are similar for models M1 and M2 (see Mukanova and Orunkhanov 2010; Mukanova 2012). The results of numerical simulations of CIP1 in the case of a monotonically increasing $\sigma(z)$ are depicted in Fig. 1, left-hand panel. The corresponding reflection factor functions $p(z)$ are compared in the right-hand panel.

To check the stability of the method, we introduced random additive noise into the synthetic measured data. The noise model is described in detail in Mukanova and Orunkhanov (2010) and Mukanova (2012). The noise is represented as a sum of harmonics with random amplitudes. The number of harmonics is equal to the number of grid points. Different cases of noise functions with their Fourier transforms are depicted in Fig. 2. The noise level is expressed as the ratio between the maximum of the noise and the maximum of the measured data. The results obtained for a noise level equal to 5 % are presented in Fig. 3. The left-hand panels of the figure show the transformed noisy measured data and the data that correspond to the initial guess.
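A noise signal of this kind can be generated as follows. This is a minimal sketch under assumptions: the sine basis, the uniformly distributed amplitudes, and the names `harmonic_noise`, `r_grid`, and `data` are illustrative choices, and the exact noise model is the one given in the cited articles. Only two properties are taken from the text: the number of harmonics equals the number of grid points, and the noise level is the ratio of the maximum of the noise to the maximum of the data.

```python
import math
import random

def harmonic_noise(r_grid, data, level, rng):
    """Sum of len(r_grid) sine harmonics with random amplitudes, scaled so that
    max|noise| / max|data| equals the requested noise level."""
    n = len(r_grid)
    start = r_grid[0]
    length = r_grid[-1] - r_grid[0]
    amplitudes = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    raw = [
        sum(a * math.sin(math.pi * (k + 1) * (r - start) / length)
            for k, a in enumerate(amplitudes))
        for r in r_grid
    ]
    scale = level * max(abs(v) for v in data) / max(abs(v) for v in raw)
    return [scale * v for v in raw]

# Hypothetical smooth "measured" curve on a uniform grid, with 5 % noise added.
r_grid = [10.0 * i / 63 for i in range(64)]
data = [1.0 / (1.0 + r) for r in r_grid]
noise = harmonic_noise(r_grid, data, 0.05, random.Random(0))
noisy_data = [d + e for d, e in zip(data, noise)]
```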

[Fig. 2: plots of the noise functions ("noises", horizontal axis r) and of their transforms ("Transformed noises")]

Fig. 2 Different examples of noise functions and their Fourier transforms

The most favorable case is when the function $\sigma(z)$ is decreasing. The results obtained with a 5 % noise level are compared in Fig. 4. When the conductivity is an increasing function, admissible results are obtained up to a 2.5 % noise level (see Fig. 5). The most unfavorable case occurs when the conductivity has a local minimum. The reason for this is an "overshadowing" effect. We show an example corresponding to this case with a 1 % noise level (Fig. 6).

[Fig. 3: plots of the transformed data $\varphi(\lambda)$ (noised, recovered, initial) and of the conductivity $\sigma(z)$ versus z (recovered, synthetic, initial)]

Fig. 3 Transformed noisy measured data, initial guess, and recovered functions in the case of a conductivity function with a maximum, for different realizations of 5 % random noise

In general, the results obtained for the cylindrically symmetric case turn out to be significantly worse than those for a vertically layered medium model and require further improvement. In particular, the conductivity function can be recovered with satisfactory quality for contrast ratios $a_{\max}/a_{\min}$ up to about 10 and for simple distributions of $\sigma(r)$. More acceptable results are obtained for the cases where $\sigma(\cdot)$ is monotonic and for those with a local maximum near an available boundary. We show several numerical results obtained for CIP2 in Figs. 7 and 8.

5 Future Directions

We described how data interpretation can be carried out using layered medium models with 1D smooth conductivity functions. Another relatively simple type of medium model is a non-layered one with a piecewise constant conductivity function. Such models are useful if one considers a medium with several local inclusions or with buried relief. We expect that the direct problems in these cases can be solved efficiently and with high accuracy by the integral equation method. Then, different directions in solving inverse problems are possible.


Fig. 4 Recovered conductivity distributions with 5 % noised data in the case of monotonically decreasing $\sigma(z)$

Fig. 5 Recovered conductivity distributions with 2.5 % noised data in the case of monotonically increasing $\sigma(z)$

Fig. 6 Recovered conductivity distribution with 1 % noised data in the case of $\sigma(z)$ having a local minimum

[Fig. 7: two plots of the conductivity $\sigma(r)$ versus r, each comparing the exact and the recovered distribution]

Fig. 7 Examples of numerical solutions to CIP2 with 1 % noise level of measured data

Fig. 8 The recovery of conductivity distributions with 5 % noised data in the cases of $\sigma(z)$ having local extrema

6 Conclusion

The mathematical models considered here are a step up in complexity from standard layered medium models. They are still simple and efficient and can be used as an additional alternative for interpreting measurement data.

References

Alekseev AS, Tcheverda VA, Niambaa Sh (1989) Optimization method for solving the inverse problem of geophysical prospecting by electric means under direct current for vertically-inhomogeneous media. In: Vogel A et al (eds) Inverse Modeling in Exploration Geophysics. Vieweg & Sohn, Braunschweig/Wiesbaden, pp 171–189
Alessandrini G (1988) Stable determination of conductivity by boundary measurements. Appl Anal 27:153–172
Archie GE (1942) The electrical resistivity log as an aid in determining some reservoir characteristics. AIME Trans 146:54–62
Beilina L, Klibanov MV (2012) Approximate global convergence and adaptivity for coefficient inverse problems. Springer, New York/Dordrecht/Heidelberg/London
Dvoretckiy PI, Yarmakhov IG (1998) Electromagnetic and hydrodynamic methods in oil and gas deposit exploration. Nedra, Moscow (in Russian)
Epov MI, Mironov VL, Muzalevskiy KV, Yeltsov IN (2010) UWB electromagnetic borehole logging tool. In: IEEE International Symposium on Geoscience and Remote Sensing, Honolulu, HI, USA, pp 3565–3567
Geng M, Liang H, Yin H, Liu D, Gao Y (2012) Numerical simulation in whole space for resistivity logging through casing under approximate conditions. Procedia Eng 29:3600–3607
Hasanov A (1995) An inverse coefficient problem for an elasto-plastic medium. SIAM J Appl Math 55:1736–1752
Hasanov A (1997) Inverse coefficient problems for monotone potential operators. Inverse Probl 13:1265–1278
Ivanov VK (1963) On ill-posed problems. Math Sb 61(103) 2:211–223
Kaufman AA, Dashevsky YA (2003) Principles of induction logging. Elsevier, Amsterdam
Langer RE (1933) An inverse problem in differential equations. Am Math Soc Bull 39:814–820
Lv W-G, Chu Z-T, Zhao X-Q, Fan Y-X, Song R-L, Han W (2009) Simulation of electromagnetic wave logging response in deviated wells based on vector finite element method. Chin Phys Lett 26:014102
Mukanova B (2009) An inverse resistivity problem: 1. Lipschitz continuity of the gradient of the objective functional. Appl Anal 88:749–765
Mukanova B (2012) A numerical solution to the well resistivity-sounding problem in the axisymmetric case. Inverse Probl Sci Eng. doi:10.1080/17415977.2012.727085. Taylor & Francis
Mukanova B, Orunkhanov M (2010) Inverse resistivity problem: geoelectric uncertainty principle and numerical reconstruction method. Math Comput Simul 80:2091–2108
Onegova EV, Epov MI (2011) 3D simulation of transient electromagnetic field for geosteering horizontal wells. Russ Geol Geophys 52(7):725–729
Pekeris SL (1940) Direct method of interpretation in resistivity prospecting. Geophysics 5(1):31–42
Pen'kovskii VI, Korsakova NK (2010) The new method of data interpretation of well electromagnetic sounding. Inverse Probl Sci Eng 18(7):983–995
Peng Y-J (1997) An inverse problem in petroleum exploitation. Inverse Probl 13:1533
Slichter LV (1933) The interpretation of resistivity prospecting method for horizontal structures. Physics 4:307–311
Spichak V (2011) Electromagnetic sounding of the Earth's interior. Elsevier, Amsterdam
Stefanescu SS, Schlumberger C (1930) Sur la distribution electrique potencielle dans une terrain a couches horizontals, homogenes et isotropes. J Phys Radium 7:132–141
Sylvester J (2000) Layer stripping. In: Colton D et al (eds) Surveys on solution methods for inverse problems. Springer, New York
Tabarovsky LA, Bear DR, Mezzatesta A (1994) Induction logging: resolution analysis and optimal tool design using block spectrum analysis. In: SPWLA 35th Annual Logging Symposium, 19–22 June, Tulsa, Oklahoma. Society of Petrophysicists & Well Log Analysts, pp 1–19
Tikhonov AN (1949) About uniqueness of geoelectrics problem solution. Dokl Acad Sci USSR 69(6):797–800
Tikhonov A, Arsenin V (1977) Solution of ill-posed problems. Wiley, New York
Vasil'ev FP (1981) Methods for solving extremal problems. Nauka, Moscow

Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_63-2 © Springer-Verlag Berlin Heidelberg 2014

Mathematical Foundations of Photogrammetry

Konrad Schindler
Photogrammetry and Remote Sensing, ETH Zürich, Switzerland

Abstract

Photogrammetry uses photographic cameras to obtain information about the 3D world. The basic principle of photogrammetric measurement is straightforward: recording a light ray in a photographic image corresponds to observing a direction from the camera to the 3D scene point where the light was reflected or emitted. From this relation, procedures have been derived to orient cameras relative to each other or relative to a 3D object coordinate frame and to reconstruct unknown 3D objects through triangulation. The chapter provides a compact, gentle introduction to the fundamental geometric relations that underlie image-based 3D measurement.

1 Introduction

The goal of photogrammetry is to obtain information about the physical environment from images. This chapter is dedicated to the mathematical relations that allow one to extract geometric 3D measurements from 2D perspective images.¹ Its aim is to give a brief and gentle overview for students or researchers in neighboring disciplines. For a more extensive treatment, the reader is referred to textbooks such as Hartley and Zisserman (2004) and McGlone (2013).

The basic principle of measurement with photographic cameras, and with many other optical instruments, is simple: light travels along (approximately) straight rays; these rays are recorded by the camera. Thus, the sensor measures directions in 3D space. The fundamental geometric relation of photogrammetry is therefore a simple collinearity constraint: a 3D world point, its image in the camera, and the camera's projection center must all lie on a straight line.

The following discussion is restricted to the most common type of camera, the so-called perspective camera, which has a single center of projection and captures light on a flat sensor plane. It should however be pointed out that the model is valid for all cameras with a single center of projection (appropriately replacing only the mapping from image points to rays in space), and extensions exist to noncentral projections along straight rays (e.g., Pajdla 2002).

In a physical camera, the light-sensitive sensor plane is located behind the projection center, and the image is captured upside down (the "upside-down configuration"). However, there exists a symmetrical setup with the image plane located in front of the camera, as in a slide projector (the "upright configuration"), for which the resulting image is geometrically identical. For convenience the latter configuration is used here, with the image plane between object and camera.

¹ Beyond geometric measurement, photogrammetry also includes the semantic interpretation of images and the derivation of physical object parameters from the observed radiometric intensities. The methodological basis for these tasks is a lot broader and less coherent and is not treated here.

Fig. 1 Coordinate systems: $\mathbf{X}$ are the 3D object coordinates of a point; the 3D camera coordinates of the same point are $\tilde{\mathbf{x}}$; the 2D image coordinates of its projection are $\mathbf{x}$. (a) Object coordinates. (b) Camera coordinates. (c) Image coordinates

1.1 Preliminaries and Notation

To understand the material in this chapter, the reader should possess basic knowledge of engineering mathematics (linear algebra and calculus) and should be familiar with the basic notions of homogeneous coordinates and projective geometry found in textbooks such as Semple and Kneebone (1952).

In terms of notation, scalars will be denoted in italic font $x$, vectors in bold font $\mathbf{x}$, and matrices in sans-serif font $\mathsf{X}$. The symbol $\mathsf{I}$ is reserved for an identity matrix of size $3 \times 3$, and $\mathbf{0}$ denotes a 3-vector of zeros. All coordinate systems are defined as right handed. Three coordinate systems are required to describe a perspective camera (see Fig. 1):

• The 3D object coordinate system;
• The camera coordinate system, which is another 3D coordinate system, attached to the camera such that its origin lies at the projection center and the sensor plane is parallel to its xy-plane and displaced in positive z-direction;
• The 2D image coordinate system in the sensor plane; its origin lies at the upper left corner of the image, and its x- and y-axis are parallel to those of the camera coordinate system; for digital cameras, it is convenient to also align the x-axis with the rows of the sensor array.

To distinguish these coordinate systems, uppercase letters $\mathbf{X}$ refer to object coordinates, lowercase letters with tilde $\tilde{\mathbf{x}}$ to camera coordinates, and plain lowercase letters $\mathbf{x}$ to image coordinates. Transposition of a vector or matrix is written $\mathsf{X}^\top$, and the cross-product between two 3-vectors is denoted either as $\mathbf{x} \times \mathbf{y}$ or using the cross-product matrix $[\mathbf{x}]_\times \mathbf{y}$, where for $\mathbf{x} = [u, v, w]^\top$

$$ [\mathbf{x}]_\times = \begin{bmatrix} 0 & -w & v \\ w & 0 & -u \\ -v & u & 0 \end{bmatrix}. $$

The Kronecker product between two vectors or matrices is denoted $\mathsf{X} \otimes \mathsf{Y}$, such that

$$ \mathsf{X} \otimes \mathsf{Y} = \begin{bmatrix} x_{11}\mathsf{Y} & x_{12}\mathsf{Y} & \cdots & x_{1n}\mathsf{Y} \\ x_{21}\mathsf{Y} & x_{22}\mathsf{Y} & \cdots & x_{2n}\mathsf{Y} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1}\mathsf{Y} & x_{m2}\mathsf{Y} & \cdots & x_{mn}\mathsf{Y} \end{bmatrix}. \tag{1} $$

Some further matrix operators are required: the determinant $\det(\mathsf{X})$, the vectorization $\operatorname{vec}(\mathsf{X}) = \mathbf{x} = [X_{11}, X_{12}, \ldots, X_{mn}]^\top$, and the diagonal matrix (here for a 3-dimensional example)

$$ \operatorname{diag}(a, b, c) = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}. $$

Geometric entities are usually assumed to be given in homogeneous coordinates, so a vector $\mathbf{X} = [U, V, W, T]^\top$ refers to a 3D object point with Euclidean coordinates $\frac{1}{T}[U, V, W]^\top$, where $T \neq 0$ is the projective scale.² Similarly, a 2D image point is $\mathbf{x} = [u, v, t]^\top$. If Euclidean vectors are needed, they are denoted with a superscript e, so, for example, the Euclidean image point is $\mathbf{x}^e = [x, y]^\top = \frac{1}{t}[u, v]^\top$. Although stochastic uncertainty modeling is not covered in this chapter, it should be noted that variance propagation is equally possible in homogeneous notation (Förstner 2010).
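The notation above can be made concrete with a few lines of code. The following sketch, in plain Python with hypothetical helper names (`cross_matrix`, `mat_vec`, `cross`, `to_euclidean`), builds the cross-product matrix, checks that multiplying by it reproduces the ordinary cross product, and converts a homogeneous vector to Euclidean coordinates by dividing by the projective scale.

```python
def cross_matrix(x):
    """Skew-symmetric matrix [x]_x of a 3-vector x = [u, v, w]."""
    u, v, w = x
    return [[0.0, -w, v],
            [w, 0.0, -u],
            [-v, u, 0.0]]

def mat_vec(A, y):
    return [sum(a * b for a, b in zip(row, y)) for row in A]

def cross(x, y):
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]

def to_euclidean(x_hom):
    """Divide by the last (projective scale) component, which must be nonzero."""
    *head, scale = x_hom
    if scale == 0:
        raise ValueError("point at infinity has no Euclidean coordinates")
    return [c / scale for c in head]

x = [1.0, 2.0, 3.0]
y = [-4.0, 5.0, 0.5]
via_matrix = mat_vec(cross_matrix(x), y)   # [x]_x y
direct = cross(x, y)                       # x × y
point = to_euclidean([2.0, 4.0, 6.0, 2.0])
```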

2 Single-View Geometry

2.1 The Collinearity Equation

The mapping with an ideal perspective camera can be decomposed into two steps, namely:

1. A transformation from object coordinates to camera coordinates, referred to as exterior orientation, and
2. A projection from camera coordinates to image coordinates with the help of the camera's interior orientation.

The exterior orientation is achieved by a translation from the object coordinate origin to the origin of the camera coordinate system (i.e., the projection center), followed by a rotation which aligns the axes of the two coordinate systems. With the Euclidean object coordinates $\mathbf{X}_0^e = [X_0, Y_0, Z_0]^\top$ of the projection center and the $3 \times 3$ rotation matrix $\mathsf{R}$, this reads as

$$ \tilde{\mathbf{x}} = \mathsf{M}\mathbf{X} = \begin{bmatrix} \mathsf{R} & \mathbf{0} \\ \mathbf{0}^\top & 1 \end{bmatrix} \begin{bmatrix} \mathsf{I} & -\mathbf{X}_0^e \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}. \tag{2} $$

² Historically, much of the mathematics of photogrammetry was developed in Euclidean notation, and that form is still used in several textbooks. The projective formulation has found widespread use since 2000.

Let us now have a closer look at the camera coordinate system. In this system, the image plane is perpendicular to the z-axis. The z-axis is also called the principal ray and intersects the image plane in the principal point, which has the camera coordinates $\tilde{\mathbf{x}}_H = \tilde{t} \cdot [0, 0, c, 1]^\top$ and the image coordinates $\mathbf{x}_H = t \cdot [x_H, y_H, 1]^\top$. The distance $c$ between the projection center and the image plane is the focal length (or camera constant). The perspective mapping from camera coordinates to image coordinates then reads

$$ \mathbf{x} = [\mathsf{K} | \mathbf{0}]\,\tilde{\mathbf{x}} = \begin{bmatrix} c & 0 & x_H & 0 \\ 0 & c & y_H & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \tilde{\mathbf{x}}. \tag{3} $$

This relation holds if the image coordinate system has no shear (orthogonal axes, i.e., an orthogonal pixel raster) and isotropic scale (same unit along both axes, i.e., square pixels). If a shear $s$ and a scale difference $m$ are present, they amount to an affine distortion of the image coordinate system, and the camera matrix becomes

$$ \mathsf{K} = \begin{bmatrix} c & cs & x_H \\ 0 & c(1 + m) & y_H \\ 0 & 0 & 1 \end{bmatrix} \tag{4} $$

with five parameters for the interior orientation. By concatenating the two steps from object to image coordinates, we get the final projection, i.e., the algebraic formulation of the collinearity constraint (Das 1949)

$$ \mathbf{x} = \mathsf{P}\mathbf{X} = \mathsf{K}\mathsf{R}\,[\mathsf{I} \,|\, -\mathbf{X}_0^e]\,\mathbf{X}. \tag{5} $$

If an object point $\mathbf{X}$ and its image $\mathbf{x}$ are both given at an arbitrary projective scale, they will only satisfy the relation up to a constant factor. To verify the constraint, i.e., check whether $\mathbf{x}$ is the projection of $\mathbf{X}$, one can use the relation

$$ \mathbf{x} \times \mathsf{P}\mathbf{X} = [\mathbf{x}]_\times \mathsf{P}\mathbf{X} = \mathbf{0}. \tag{6} $$

Note that due to the projective formulation, only two of the three rows of this equation are linearly independent. Given a projection matrix $\mathsf{P}$, it is often necessary to extract the interior and exterior orientation parameters. To that end, observe that

$$ \mathsf{P} = [\mathsf{M} | \mathbf{m}] = [\mathsf{K}\mathsf{R} \,|\, -\mathsf{K}\mathsf{R}\mathbf{X}_0^e]. \tag{7} $$

The translation part of the exterior orientation immediately follows from $\mathbf{X}_0^e = -\mathsf{M}^{-1}\mathbf{m}$. Moreover, the rotation must by definition be an orthonormal matrix, and the calibration must be an upper triangular matrix. Both properties are preserved by matrix inversion; hence, the two matrices can be found by QR decomposition of $\mathsf{M}^{-1} = \mathsf{R}^\top \mathsf{K}^{-1}$ (or, more efficiently, by RQ decomposition of $\mathsf{M}$).
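As an illustration of Eqs. (5) and (6), the sketch below assembles a projection matrix from an interior and an exterior orientation, projects one object point, and verifies the collinearity constraint for an arbitrarily rescaled image point. All numerical values (camera constant, principal point, rotation angle, projection center, object point) are made-up assumptions, and the helper names are hypothetical.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def projection_matrix(K, R, X0):
    """P = K R [I | -X0] as a 3x4 matrix, cf. Eqs. (5) and (7)."""
    KR = mat_mul(K, R)
    m = [-v for v in mat_vec(KR, X0)]
    return [row + [mi] for row, mi in zip(KR, m)]

# Assumed interior orientation: no shear, square pixels.
c, x_h, y_h = 1000.0, 320.0, 240.0
K = [[c, 0.0, x_h], [0.0, c, y_h], [0.0, 0.0, 1.0]]

# Assumed exterior orientation: rotation about the z-axis and a projection center.
angle = math.radians(10.0)
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle), math.cos(angle), 0.0],
     [0.0, 0.0, 1.0]]
X0 = [1.0, 2.0, -5.0]

P = projection_matrix(K, R, X0)
X = [1.5, 1.0, 5.0, 1.0]                      # homogeneous object point
x = mat_vec(P, X)                             # homogeneous image point
x_euclid = [x[0] / x[2], x[1] / x[2]]         # pixel coordinates

# Eq. (6): an arbitrarily rescaled image point still satisfies x × PX = 0.
x_rescaled = [3.0 * v for v in x]
residual = cross(x_rescaled, mat_vec(P, X))
```

With this choice of rotation the third homogeneous component of `x` equals the depth of the point along the principal ray, here 10 units. The decomposition of `P` back into `K`, `R`, and `X0` is not shown; it would need an RQ routine from a linear algebra library.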


2.2 Nonlinear Errors

Equation (5) is valid for ideal perspective cameras. Real physical cameras obey the model only approximately, mainly because the light is collected with the help of lenses rather than entering through an ideal, infinitely small projection center ("pinhole"). The light observed on a real camera's sensor did not travel there from the object point along a perfectly straight path, which leads to errors if one uses only the projective model. In the image of a real camera, we cannot measure the ideal image coordinates $\mathbf{x}^e$, but rather the ones displaced by the nonlinear image distortion,

$$ \breve{\mathbf{x}}^e = \mathbf{x}^e + \Delta\mathbf{x}(\mathbf{x}^e, \mathbf{q}), \tag{8} $$

with $\mathbf{q}$ as the parameters of the model that describes the distortion. A simple example would be a radially symmetric lens distortion around the principal point:

$$ \Delta\mathbf{x} = \frac{\mathbf{x}^e - \mathbf{x}_H^e}{r} \left( q_2 r^2 + q_4 r^4 \right), \quad r = \| \mathbf{x}^e - \mathbf{x}_H^e \|. \tag{9} $$

Here, the different physical or empirical distortion models are not further discussed. Instead, the focus is on how to compensate the effect when given a distortion model and its parameters $\mathbf{q}$. The corrections $\Delta\mathbf{x}$ vary across the image, which means that they depend on the (ideal) image coordinates $\mathbf{x}$. This may be represented by the mapping

$$ \breve{\mathbf{x}} = \mathsf{H}(\mathbf{x})\,\mathbf{x} = \begin{bmatrix} 1 & 0 & \Delta x(\mathbf{x}, \mathbf{q}) \\ 0 & 1 & \Delta y(\mathbf{x}, \mathbf{q}) \\ 0 & 0 & 1 \end{bmatrix} \mathbf{x}. \tag{10} $$

The overall mapping from object points to observable image points, including nonlinear distortions, is now

$$ \breve{\mathbf{x}} = \breve{\mathsf{P}}(\mathbf{x})\,\mathbf{X} = \mathsf{H}(\mathbf{x})\,\mathsf{P}\mathbf{X}. \tag{11} $$

Note that the nonlinear distortions $\Delta\mathbf{x}(\mathbf{x}^e, \mathbf{q})$ are a property of the camera, i.e., they are part of the interior orientation, together with the calibration matrix $\mathsf{K}$. Equation (11) forms the basis for the correction of nonlinear distortions. The computation is split into two steps. Going from object point to image point, one first projects the object point to an ideal image point, $\mathbf{x} = \mathsf{P}\mathbf{X}$, and then applies the distortion, $\breve{\mathbf{x}} = \mathsf{H}(\mathbf{x})\,\mathbf{x}$. Note that for practical purposes the (linear) affine distortion parameters $s$ and $m$ of the image coordinate system are often also included in $\mathsf{H}(\mathbf{x})$ rather than in $\mathsf{K}$.

For photogrammetric operations, the inverse relation is needed, i.e., one measures the coordinates $\breve{\mathbf{x}}$ and wants to convert them to ideal ones $\mathbf{x}$, in order to use them as inputs for procedures based on collinearity, such as orientation and triangulation. Often it even makes sense to remove the distortion and synthetically generate perspective (straight line preserving) images as a basis for further processing. To correct the measured coordinates,

$$ \mathbf{x} = \mathsf{H}^{-1}(\mathbf{x})\,\breve{\mathbf{x}} = \begin{bmatrix} 1 & 0 & -\Delta x(\mathbf{x}, \mathbf{q}) \\ 0 & 1 & -\Delta y(\mathbf{x}, \mathbf{q}) \\ 0 & 0 & 1 \end{bmatrix} \breve{\mathbf{x}}, \tag{12} $$

one would need to already know the ideal coordinate one is searching for, so as to evaluate $\mathsf{H}(\mathbf{x})$. One thus resorts to an iterative scheme, starting from $\mathbf{x} \approx \breve{\mathbf{x}}$. Usually at most one iteration is required, because the nonlinear distortions vary slowly across the image.

Fig. 2 The coplanarity constraint: the two projection rays $\boldsymbol{\rho}$, $\boldsymbol{\rho}'$ must lie in one plane, which also contains the baseline $\mathbf{t}$ and the object point $\mathbf{X}$. As a consequence, possible correspondences to an image point $\mathbf{x}$ must lie on the epipolar line $\mathbf{l}'$ and vice versa. All epipolar lines $\mathbf{l}$ intersect in the epipole $\mathbf{e}$, the image of the other camera's projection center
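The iterative correction can be sketched as a fixed-point iteration: the distortion is evaluated at the current estimate of the ideal point and subtracted from the measured point. The code below is a minimal illustration assuming the radial model of Eq. (9) in Euclidean image coordinates; the coefficients `q2` and `q4`, the principal point, and the sample point are made-up values, and the function names are hypothetical.

```python
import math

def radial_distortion(x, x_h, q2, q4):
    """Displacement of Eq. (9): radial direction times (q2 r^2 + q4 r^4)."""
    dx, dy = x[0] - x_h[0], x[1] - x_h[1]
    r = math.hypot(dx, dy)
    if r == 0.0:                     # no displacement at the principal point
        return (0.0, 0.0)
    magnitude = q2 * r ** 2 + q4 * r ** 4
    return (dx / r * magnitude, dy / r * magnitude)

def distort(x, x_h, q2, q4):
    """Ideal point -> observable (distorted) point, cf. Eq. (8)."""
    d = radial_distortion(x, x_h, q2, q4)
    return (x[0] + d[0], x[1] + d[1])

def undistort(x_obs, x_h, q2, q4, iterations=1):
    """Observable point -> ideal point by fixed-point iteration, cf. Eq. (12)."""
    x = x_obs                        # start from x ~ observed point
    for _ in range(iterations):
        d = radial_distortion(x, x_h, q2, q4)
        x = (x_obs[0] - d[0], x_obs[1] - d[1])
    return x

x_h = (0.0, 0.0)
q2, q4 = 0.05, 0.01                  # assumed distortion coefficients
x_true = (0.3, 0.4)
x_obs = distort(x_true, x_h, q2, q4)

def error(x):
    return math.hypot(x[0] - x_true[0], x[1] - x_true[1])

err_start = error(x_obs)
err_one = error(undistort(x_obs, x_h, q2, q4, iterations=1))
err_many = error(undistort(x_obs, x_h, q2, q4, iterations=20))
```

For slowly varying distortions a single pass already removes most of the error, which matches the remark above; further passes are cheap if higher accuracy is needed.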

3 Two-View Geometry

From a single image of an unknown scene, no 3D measurements can be derived, because the third dimension (the depth along the ray) is lost during projection. The photogrammetric measurement principle is to acquire multiple images from different viewpoints and measure corresponding points, meaning image points which are the projections of the same physical object point. From correspondences one can reconstruct the 3D coordinates via triangulation. The minimal case of two views forms the nucleus for this approach.

3.1 The Coplanarity Constraint

A direct consequence of the collinearity constraint is the coplanarity constraint for two cameras: the viewing rays through corresponding image points must be coplanar, because they intersect in the 3D point. It follows that even if only the relative orientation between the two cameras is known, one can reconstruct a (projectively distorted) straight-line-preserving model of the 3D world, by intersecting corresponding rays, and that if additionally the interior orientations are known (the cameras are calibrated), one can reconstruct an angle-preserving model of the world in the same way. The scale of such a photogrammetric model cannot be determined without external reference, because scaling up and down the two ray bundles together does not change the coplanarity.

Now let us look at a pair of corresponding image points $\mathbf{x}$ in the first image and $\mathbf{x}'$ in the second image (Fig. 2). The coplanarity constraint (or epipolar constraint) states that the two corresponding rays in object space must lie in a plane. By construction that plane also contains the baseline $\mathbf{t} = \mathbf{X}_0'^{\,e} - \mathbf{X}_0^e$ between the projection centers. From (2) and (3) the ray direction through $\mathbf{x}$ in

object space (in Euclidean coordinates) is $\boldsymbol{\rho} = \mathsf{R}^\top \mathsf{K}^{-1} \mathbf{x}$, and similarly for the second camera $\boldsymbol{\rho}' = \mathsf{R}'^\top \mathsf{K}'^{-1} \mathbf{x}'$. Coplanarity between the three vectors implies

$$ \boldsymbol{\rho} \cdot (\mathbf{t} \times \boldsymbol{\rho}') = \mathbf{x}^\top \mathsf{K}^{-\top} \mathsf{R}\,[\mathbf{t}]_\times \mathsf{R}'^\top \mathsf{K}'^{-1} \mathbf{x}' = \mathbf{x}^\top \mathsf{F} \mathbf{x}' = 0. \tag{13} $$

The matrix $\mathsf{F}$ is called the fundamental matrix and completely describes the relative orientation. It has the following properties:

• $\mathbf{l} = \mathsf{F}\mathbf{x}'$ is the epipolar line to $\mathbf{x}'$, i.e., the image of the ray $\boldsymbol{\rho}'$ in the first camera. Corresponding points to $\mathbf{x}'$ must lie on that line, $\mathbf{x}^\top \mathbf{l} = 0$. Conversely, $\mathbf{l}' = \mathsf{F}^\top \mathbf{x}$ is the epipolar line to $\mathbf{x}$.
• The left null-space of $\mathsf{F}$ is the epipole $\mathbf{e}$ of the first image, i.e., the image of the second projection center $\mathbf{X}_0'$ in which all epipolar lines intersect, $\mathsf{F}^\top \mathbf{e} = \mathbf{0}$. Conversely, the right null-space of $\mathsf{F}$ is the epipole of the second image.
• $\mathsf{F}$ is singular and has rank $\le 2$, because $[\mathbf{t}]_\times$ has rank $\le 2$. Accordingly, $\mathsf{F}$ maps points to lines. It thus has seven degrees of freedom (nine entries determined only up to a common scale factor, minus one rank constraint).

The coplanarity constraint is linear in the elements of $\mathsf{F}$ and bilinear in the image coordinates, which is the basis for directly estimating the relative orientation.

If the interior orientations of both cameras are known (the cameras have been calibrated), the epipolar constraint can also be written in camera coordinates. The ray from the projection center to $\mathbf{x}$ in the camera coordinate system is given by $\tilde{\boldsymbol{\rho}} = \mathsf{K}^{-1} \mathbf{x}$, and similarly $\tilde{\boldsymbol{\rho}}' = \mathsf{K}'^{-1} \mathbf{x}'$. With these direction vectors, the epipolar constraint reads

$$ \tilde{\boldsymbol{\rho}}^\top \mathsf{R}\,[\mathbf{t}]_\times \mathsf{R}'^\top \tilde{\boldsymbol{\rho}}' = \tilde{\boldsymbol{\rho}}^\top \mathsf{E}\,\tilde{\boldsymbol{\rho}}' = 0. \tag{14} $$

The matrix $\mathsf{E}$ is called the essential matrix and completely describes the relative orientation between calibrated cameras, i.e., their relative rotation and the direction of the relative translation (the baseline). It has the following properties:

• $\mathsf{E}$ has rank 2. Additionally, its two nonzero singular values are equal. $\mathsf{E}$ has five degrees of freedom, corresponding to the relative orientation of an angle-preserving photogrammetric model (three for the relative rotation, two for the baseline direction).
• The constraint between calibrated rays is still linear in the elements of $\mathsf{E}$ and bilinear in the image coordinates.

For completeness, it shall be mentioned that beyond coplanarity, further constraints, so-called trifocal constraints, exist between triplets of images: if a corresponding straight line has been observed in three images, then the planes formed by the associated projection rays must all intersect in a single 3D line; see, for example, Hartley (1997) and McGlone (2013). This topic is not further treated here.
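The epipolar constraint for calibrated cameras can be checked numerically. The sketch below builds $\mathsf{E} = \mathsf{R}\,[\mathbf{t}]_\times \mathsf{R}'^\top$ for two made-up camera poses, forms the rays to an object point in each camera coordinate system, and evaluates the bilinear form of Eq. (14) for a true correspondence and for a deliberately wrong one. The poses, points, and helper names are illustrative assumptions.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross_matrix(t):
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def rotation_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def camera_ray(R, X0, X):
    """Direction to the object point X in camera coordinates: R (X - X0)."""
    return mat_vec(R, [a - b for a, b in zip(X, X0)])

# Assumed exterior orientations of the two cameras.
R1, X0_1 = rotation_y(0.0), [0.0, 0.0, 0.0]
R2, X0_2 = rotation_y(math.radians(15.0)), [1.0, 0.2, 0.0]

t = [b - a for a, b in zip(X0_1, X0_2)]                 # baseline
E = mat_mul(R1, mat_mul(cross_matrix(t), transpose(R2)))

X = [0.5, -0.3, 6.0]                                    # object point seen by both cameras
ray1 = camera_ray(R1, X0_1, X)
ray2 = camera_ray(R2, X0_2, X)
value_match = dot(ray1, mat_vec(E, ray2))               # should vanish

Y = [0.5, 1.0, 6.0]                                     # a different object point
value_mismatch = dot(ray1, mat_vec(E, camera_ray(R2, X0_2, Y)))
```

A nonzero value for the mismatched pair is what makes the constraint useful for rejecting wrong correspondences.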


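3.2 Absolute Orientation

The transformation from the coordinates of the angle-preserving photogrammetric model $\mathbf{X}^m$ to a given 3D object coordinate system $\mathbf{X}$ is called absolute orientation. It corresponds to a similarity transform and thus has seven degrees of freedom (translation, rotation, and scaling of the model).

$$ \mathbf{X} = \mathbf{S}\mathbf{R}\mathbf{T}\,\mathbf{X}^m = \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0}^\top & s^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0}^\top & 1 \end{bmatrix} \begin{bmatrix} \mathbf{I} & \mathbf{T} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}^m . \tag{15} $$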
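As an illustration of the similarity transform, the following minimal sketch composes a scaling, a rotation, and a translation as 4x4 homogeneous matrices and applies them to a model point. It assumes that the scale enters through the homogeneous coordinate (a factor 1/s in the lower-right entry, which is projectively equivalent to multiplying the Euclidean part by s); the function names and the numerical values are hypothetical.

```python
import numpy as np

def similarity_matrix(s, R, T):
    """Compose the 4x4 homogeneous matrices S @ R @ T of a similarity transform."""
    S = np.diag([1.0, 1.0, 1.0, 1.0 / s])  # assumed form: dividing the homogeneous part by s scales by s
    Rh = np.eye(4)
    Rh[:3, :3] = R                          # rotation block
    Th = np.eye(4)
    Th[:3, 3] = T                           # translation column
    return S @ Rh @ Th

def transform_point(M, X_model):
    """Apply M to a Euclidean 3D point and return the Euclidean result."""
    Xh = M @ np.append(X_model, 1.0)
    return Xh[:3] / Xh[3]

# Hypothetical values: 90 degree rotation about the z-axis, unit shift along x, scale 2.
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
M = similarity_matrix(2.0, R, np.array([1.0, 0.0, 0.0]))
X_object = transform_point(M, np.array([1.0, 0.0, 0.0]))
```

With this ordering the model point is first translated, then rotated, and finally scaled.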

4 Analytical Operations

The inputs to the photogrammetric process are raw images or, more precisely, coordinates measured in those images. This section describes methods to estimate unknown parameters from image coordinates, using the models developed above.

4.1 Single-Image Orientation

The complete orientation of a single image has 11 unknowns (5 for the interior orientation and 6 for the exterior orientation). An image point affords two observations; thus, at least six ground control points in the object coordinate system and their corresponding image points are required. An algebraic solution, known as the Direct Linear Transform or DLT (Abdel-Aziz and Karara 1971), is obtained directly from Eq. (6),

$$ [\mathbf{x}]_\times \mathbf{P}\mathbf{X} = \left( [\mathbf{x}]_\times \otimes \mathbf{X}^\top \right) \mathbf{p} = \mathbf{0} , \tag{16} $$

with the vector $\mathbf{p} = \mathrm{vec}(\mathbf{P}) = [P_{11}, P_{12}, \ldots, P_{34}]^\top$. For each control point, two of the three equations are linearly independent. Selecting two such equations for each of $N \ge 6$ ground control points and stacking them yields a homogeneous linear system $\mathbf{A}_{2N \times 12}\,\mathbf{p} = \mathbf{0}$, which is solved with singular value decomposition to obtain the projection matrix $\mathbf{P}$. Note that the DLT fails if all control points are coplanar and is unstable if they are nearly coplanar. A further critical configuration, albeit of rather theoretical interest, is if all control points lie on a twisted cubic curve (Hartley and Zisserman 2004). The direct algebraic solution is not geometrically optimal, but can serve as a starting value for an iterative estimation of the optimal Euclidean camera parameters; see McGlone (2013).

A further frequent orientation procedure is to determine the exterior orientation of a single camera with known interior orientation. Three noncollinear control points are needed to determine the six unknowns. The procedure is known as spatial resection or P3P problem. The solution (Grunert 1841) is sketched in Fig. 3. With the known interior orientation, the three image points are converted to rays in camera coordinates, $\boldsymbol{\xi}_i = \mathbf{K}^{-1}\mathbf{x}_i$, $i = 1 \ldots 3$. The pairwise angles between these rays are determined via $\cos\alpha = \dfrac{\boldsymbol{\xi}_2^\top \boldsymbol{\xi}_3}{|\boldsymbol{\xi}_2|\,|\boldsymbol{\xi}_3|}$, etc. In the object coordinate system, the distances between the points are determined, $a = |\mathbf{X}^e_2 - \mathbf{X}^e_3|$, etc. The three triangles containing the projection center now give rise to constraints


Fig. 3 Spatial resection: the image coordinates together with the interior orientation give rise to three rays in camera coordinates, forming a trilateral pyramid. Applying the cosine law on each of the pyramid's faces relates the pairwise angles $\alpha, \beta, \gamma$ between the rays and the known distances $a, b, c$ between the object points to the pyramid's sides $s_1, s_2, s_3$. Solving for these three lengths yields the camera coordinates of the three points

$$ \begin{aligned} a^2 &= s_2^2 + s_3^2 - 2 s_2 s_3 \cos\alpha \\ b^2 &= s_3^2 + s_1^2 - 2 s_3 s_1 \cos\beta \\ c^2 &= s_1^2 + s_2^2 - 2 s_1 s_2 \cos\gamma \end{aligned} \tag{17} $$

for the three unknowns $s_1, s_2, s_3$. Substituting auxiliary variables $u = s_2/s_1$ and $v = s_3/s_1$ yields, after some manipulations, a fourth-order polynomial for $v$ and hence up to four solutions; see, e.g., Haralick et al. (1994). Back-substitution delivers first $u$ and then the three distances $s_1, s_2, s_3$. Given these distances, the ground control points in camera coordinates follow from $\tilde{\mathbf{x}}^e_i = s_i\,\boldsymbol{\xi}_i / |\boldsymbol{\xi}_i|$. The exterior orientation is then found by computing the rotation and translation between the $\tilde{\mathbf{x}}_i$ and the $\mathbf{X}_i$.

There are two critical configurations for spatial resection: one where the projection center is located on (or near) a circular cylinder generated by sweeping the circle through $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$ along the triangle's normal vector and the other when the control points lie on the cubic horopter curve. Based on the geometric construction (17), several other algebraic schemes exist to solve the equation system; see, e.g., Haralick et al. (1994). For more than three control points, an iterative optimal resection algorithm can be found in the literature (McGlone 2013).
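Returning to the Direct Linear Transform of Eq. (16), the stacking and the SVD solution can be sketched in a few lines. This is a minimal illustration, not the reference implementation of any library: the function names `skew` and `dlt` are hypothetical, the image points are assumed to be homogeneous with a nonzero third coordinate (so that the first two rows of the cross-product matrix are the independent ones), and no numerical conditioning is applied.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def dlt(x_img, X_obj):
    """Estimate the 3x4 projection matrix from N >= 6 control points.

    x_img: (N, 3) homogeneous image points, third coordinate nonzero (assumption).
    X_obj: (N, 4) homogeneous object points.
    """
    rows = []
    for x, X in zip(x_img, X_obj):
        block = np.kron(skew(x), X[None, :])  # 3x12 block [x]_x (kron) X^T
        rows.append(block[:2])                # keep two independent equations
    A = np.vstack(rows)                       # (2N, 12) design matrix
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)               # null vector, reshaped row-wise
```

The result is defined only up to scale (and sign), so a comparison with a known projection matrix requires normalizing both, for example by one of their entries.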

4.2 Relative Orientation of Two Images

A further elementary operation is the relative orientation of images to gain a photogrammetric model. The present chapter deals with the relative orientation of two images. Since relative orientations can be transitively chained, that operation forms the elementary building block for orienting larger image networks (note that, in practice, it is often preferred to chain image triplets because the associated redundancy affords robustness; however, that case is not treated here). The basis of relative orientation from observed image correspondences is the coplanarity constraint (13). Since the constraint is linear in the unknown elements of the relative orientation, it can be directly reordered and solved. Each corresponding point pair gives rise to an equation

$$ \left( \mathbf{x}^\top \otimes \mathbf{x}'^\top \right) \mathbf{f} = 0 , \tag{18} $$


with $\mathbf{f} = \mathrm{vec}(\hat{\mathbf{F}}) = [F_{11}, F_{12}, \ldots, F_{33}]^\top$. Stacking eight or more such equations yields a regular or, for more than eight, overdetermined homogeneous equation system for $\mathbf{f}$. The direct solution ignores the rank deficiency of the fundamental matrix, instead using at least eight points to determine seven unknowns. Due to measurement noise the resulting matrix $\hat{\mathbf{F}}$ will not be a fundamental matrix. To correct this, one can find the nearest (according to the Frobenius norm) rank-2 matrix by decomposing $\hat{\mathbf{F}}$ with SVD and nullifying the smallest singular value,

$$ \hat{\mathbf{F}} = \mathbf{U}\,\mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)\,\mathbf{V}^\top , \qquad \mathbf{F} = \mathbf{U}\,\mathrm{diag}(\sigma_1, \sigma_2, 0)\,\mathbf{V}^\top . \tag{19} $$
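The two steps just described (linear solution of the stacked equations (18), followed by the rank-2 correction (19)) can be sketched as follows. This is an illustrative, unconditioned version; the function name `eight_point` is hypothetical, and the convention that the first point set multiplies the matrix from the left is taken from Eq. (18).

```python
import numpy as np

def eight_point(x1, x2):
    """Linear estimate of F with x1_i^T F x2_i = 0 from N >= 8 correspondences.

    x1, x2: (N, 3) arrays of homogeneous image points.
    """
    # Each row is x^T (kron) x'^T, matching the row-wise vectorization of F.
    A = np.stack([np.kron(a, b) for a, b in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    F_hat = Vt[-1].reshape(3, 3)          # least-squares null vector
    # Nearest rank-2 matrix in the Frobenius norm: zero the smallest singular value.
    U, sv, Vt2 = np.linalg.svd(F_hat)
    sv[2] = 0.0
    return U @ np.diag(sv) @ Vt2
```

In practice the image coordinates would usually be translated and scaled before building the design matrix; that step is omitted here to keep the sketch short.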

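This so-called "8-point algorithm" (Longuet-Higgins 1981) can be used in equivalent form to estimate the essential matrix between two calibrated cameras. Reordering (14) to

$$ \left( \boldsymbol{\xi}^\top \otimes \boldsymbol{\xi}'^\top \right) \mathbf{e} = 0 \tag{20} $$

yields the entries of $\mathbf{e} = \mathrm{vec}(\hat{\mathbf{E}})$, and the solution is corrected to the nearest essential matrix by enforcing the constraints on the singular values,

$$ \hat{\mathbf{E}} = \mathbf{U}\,\mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)\,\mathbf{V}^\top , \qquad \mathbf{E} = \mathbf{U}\,\mathrm{diag}(1, 1, 0)\,\mathbf{V}^\top . \tag{21} $$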

Estimating the seven unknowns of $\mathbf{F}$ or the five unknowns of $\mathbf{E}$ from eight points is obviously not a minimal solution, and thus not ideal, especially since a main application of the direct solution is robust estimation in RANSAC-type sampling algorithms. A minimal solution for $\mathbf{F}$, called the "7-point algorithm", can be obtained in the following way (von Sanden 1908; Hartley 1994): only seven equations (18) are stacked into $\mathbf{A}_{7 \times 9}\,\mathbf{f} = \mathbf{0}$. Solving this expression with SVD yields a two-dimensional null-space

$$ \mathbf{f}(\delta) = \delta\,\mathbf{v}_8 + \mathbf{v}_9 , \tag{22} $$

with arbitrary $\delta$. To find a fundamental matrix (i.e., a rank-deficient matrix) in that null-space, one introduces the nine elements of $\mathbf{f}(\delta)$ into the determinant constraint $\det(\mathbf{F}) = 0$ and analytically expands the determinant with Sarrus' rule. This results in a cubic equation for $\delta$, and consequently in either one or three solutions for $\mathbf{F}$.

Following the same idea, a "5-point algorithm" exists for the calibrated case (Nistér 2004): stacking (20) for five correspondences leads to a four-dimensional null-space

$$ \mathbf{e}(\delta, \epsilon, \zeta) = \delta\,\mathbf{v}_6 + \epsilon\,\mathbf{v}_7 + \zeta\,\mathbf{v}_8 + \mathbf{v}_9 . \tag{23} $$

Again this can be substituted back into the determinant constraint. Furthermore, it can be shown that the additional constraints on the essential matrix can be written

$$ \mathbf{E}\mathbf{E}^\top\mathbf{E} - \tfrac{1}{2}\,\mathrm{trace}(\mathbf{E}\mathbf{E}^\top)\,\mathbf{E} = \mathbf{0} , \tag{24} $$

in which one can also substitute (23). Through further, rather cumbersome, algebraic variable substitutions, one arrives at a 10th-order polynomial in $\zeta$, which is solved numerically. For the (up to 10) real roots, one then recovers $\delta$ and $\epsilon$, and thus $\mathbf{E}$, through back-substitution.
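The seven-point construction around Eq. (22) can be sketched numerically. Note one deliberate substitution: instead of expanding the determinant analytically with Sarrus' rule, the sketch below recovers the cubic by evaluating the determinant at four values of the free parameter and interpolating, which yields the same polynomial because the determinant is exactly cubic in that parameter. The function name `seven_point` and the tolerance on the imaginary part are assumptions, and the degenerate case in which the solution coincides with the first null vector is not handled.

```python
import numpy as np

def seven_point(x1, x2):
    """Return the one or three real solutions F with x1_i^T F x2_i = 0.

    x1, x2: (7, 3) arrays of homogeneous image points.
    """
    A = np.stack([np.kron(a, b) for a, b in zip(x1, x2)])   # 7x9 design matrix
    _, _, Vt = np.linalg.svd(A)                              # full 9x9 V^T
    F1 = Vt[-2].reshape(3, 3)                                # v8
    F2 = Vt[-1].reshape(3, 3)                                # v9
    # det(d*F1 + F2) is a cubic in d: sample it at four points and interpolate.
    d = np.array([-1.0, 0.0, 1.0, 2.0])
    dets = [np.linalg.det(t * F1 + F2) for t in d]
    coeffs = np.polyfit(d, dets, 3)
    roots = np.roots(coeffs)
    real_roots = roots[np.abs(roots.imag) < 1e-8].real
    solutions = []
    for r in real_roots:
        F = r * F1 + F2
        solutions.append(F / np.linalg.norm(F))              # unit Frobenius norm
    return solutions
```

Inside a RANSAC-type loop, each returned candidate would be scored against the remaining correspondences.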


The fundamental matrix is ambiguous if all points are coplanar in object space. The corresponding equations become singular in that case and unstable near it. On the contrary, the essential matrix does not suffer from that problem. Naturally, once initial values for the relative orientation parameters are available, an iterative solution exists to find the geometrically optimal solution for an arbitrary number of correspondences; see McGlone (2013).

Having determined $\mathbf{E}$, it is in many cases necessary to extract explicit relative orientation parameters (rotation and translation direction) for the image pair. Given the singular value decomposition (21) and the two auxiliary matrices,

$$ \mathbf{W} = \begin{bmatrix} 0 & \pm 1 & 0 \\ \mp 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} , \qquad \mathbf{Z} = \begin{bmatrix} 0 & \pm 1 & 0 \\ \mp 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} , \tag{25} $$

the orientation elements are given by

$$ [\mathbf{t}]_\times = \mathbf{U}\mathbf{Z}\mathbf{U}^\top , \qquad \mathbf{R} = \mathbf{U}\mathbf{W}\mathbf{V}^\top , \tag{26} $$

which can be easily verified by checking $\mathbf{E} = [\mathbf{t}]_\times \mathbf{R}$. The sign ambiguities in $\mathbf{W}$ and $\mathbf{Z}$ give rise to four combinations, corresponding to all combinations of the "upright" and "upside-down" camera configurations for the two images. The correct one is found by checking in which one a 3D object point is located in front of both cameras.
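The extraction of the four candidate orientations can be sketched as follows. The sketch assumes an essential matrix of the form $[\mathbf{t}]_\times \mathbf{R}$ with a unit-length baseline; the function name `decompose_essential` is hypothetical. Flipping the sign of the orthogonal factors so that they become proper rotations is an added step that is permissible because the essential matrix is only defined up to scale and sign. The final test that selects the candidate with points in front of both cameras is omitted.

```python
import numpy as np

def decompose_essential(E):
    """Return the four (R, t) candidates for an essential matrix E ~ [t]_x R."""
    U, _, Vt = np.linalg.svd(E)
    # Force det = +1 so that U W V^T is a rotation (E is defined up to sign).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    Z = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    candidates = []
    for Wk in (W, W.T):          # the two sign choices in W
        for Zk in (Z, Z.T):      # the two sign choices in Z
            R = U @ Wk @ Vt
            t_x = U @ Zk @ U.T   # skew-symmetric matrix [t]_x
            t = np.array([t_x[2, 1], t_x[0, 2], t_x[1, 0]])
            candidates.append((R, t))
    return candidates
```

Triangulating a single correspondence with each candidate and checking the sign of the depths in both cameras would then single out the physically valid configuration.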

4.3 Reconstruction of 3D Points

For photogrammetric 3D reconstruction, the camera orientations are in fact only an unavoidable by-product, whereas the actual goal is to reconstruct 3D points (note, however, that the opposite is true for image-based navigation). The basic operation of reconstruction is triangulation of 3D object points from cameras with known orientations $\mathbf{P}_i$. A direct algebraic solution is found from the collinearity constraint in the form (6). Each image point gives rise to

$$ \left( [\mathbf{x}_i]_\times \mathbf{P}_i \right) \mathbf{X} = \mathbf{0} , \tag{27} $$

of which two rows are linearly independent. Stacking the equations leads to an equation system for the object point $\mathbf{X}$. Solving with SVD yields a unique solution for two cameras $\mathbf{P}_1, \mathbf{P}_2$ or, for more than two cameras, a (projective) least-squares solution. A geometrically optimal solution for two views exists, which involves numerically solving a polynomial of degree 6 (Hartley and Sturm 1997). Iterative solutions for three or more views also exist. In general the algebraic solution (27) is a good approximation, if one employs proper numerical conditioning.
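The algebraic triangulation of Eq. (27) can be sketched as below. The function names are hypothetical, the image points are assumed to be homogeneous with a nonzero third coordinate (so that the first two rows of each block are the independent ones), and the conditioning mentioned in the text is left out for brevity.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def triangulate(xs, Ps):
    """Algebraic triangulation of one object point from two or more views.

    xs: list of homogeneous image points (3-vectors).
    Ps: list of 3x4 projection matrices, one per view.
    """
    rows = [(skew(x) @ P)[:2] for x, P in zip(xs, Ps)]  # two equations per view
    A = np.vstack(rows)
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                                           # homogeneous solution
    return X / X[3]                                      # normalize (finite point assumed)
```

The final normalization assumes the point is not at infinity; for distant points it is safer to keep the homogeneous vector.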


4.4 Orientation of Multi-image Networks

Most applications require a network of >2 images to cover the object of interest.³ By combinations of the elementary operations described above, the relative orientation of all images in a common coordinate system can be found: either one can chain two-view relative orientations together while estimating the relative scale from a few object points or one can generate a photogrammetric model from two views and then iteratively add additional views to it by alternating single-image orientation with triangulation of new object points. Absolute orientation is accomplished (either at the end or at an intermediate stage) by estimating the 3D similarity transform that aligns the photogrammetric model with known ground control points in the object coordinate system.

Obviously such an iterative procedure will lead to error buildup. In most applications, the image network is thus polished with a global least-squares optimization of all unknown parameters. The specialization of least-squares adjustment to photogrammetric ray bundles, using the collinearity constraint as functional model, is called bundle adjustment (Brown 1958; Triggs et al. 1999; McGlone 2013). Adjustment proceeds in the usual way: the constraints $\mathbf{y} = f(\mathbf{x})$ between observations $\mathbf{y}$ and unknowns $\mathbf{x}$ are linearized at the approximate solution $\mathbf{x}_0$, leading to an overdetermined equation system $\delta\mathbf{y} = \mathbf{J}\,\delta\mathbf{x}$. The equations are solved in a least-squares sense,

$$ \mathbf{N}\,\delta\mathbf{x} = \mathbf{n} , \qquad \mathbf{N} = \mathbf{J}^\top \mathbf{S}_{yy}^{-1} \mathbf{J} , \qquad \mathbf{n} = \mathbf{J}^\top \mathbf{S}_{yy}^{-1}\,\delta\mathbf{y} , \tag{28} $$

with $\mathbf{S}_{yy}$ the covariance matrix of the observations. The approximate solution is then updated, $\mathbf{x}_1 = \mathbf{x}_0 + \delta\mathbf{x}$, and the procedure iterated until convergence. In order to yield geometrically optimal solutions, the collinearity constraint is first transformed to Euclidean space by removing the projective scale. Denoting cameras by index $j$, object points by index $i$, and the rows of the projection matrix by $\mathbf{P}^{(1)}$, $\mathbf{P}^{(2)}$, $\mathbf{P}^{(3)}$, we get

$$ x^e_{ij} = \frac{\mathbf{P}^{(1)}_j \mathbf{X}_i}{\mathbf{P}^{(3)}_j \mathbf{X}_i} , \qquad y^e_{ij} = \frac{\mathbf{P}^{(2)}_j \mathbf{X}_i}{\mathbf{P}^{(3)}_j \mathbf{X}_i} . \tag{29} $$

These equations must then be linearized for all observed image points w.r.t. the orientation parameters contained in the Pj as well as the 3D object point coordinates Xi . Moreover, equations for the ground control points, as well as additional measurements such as GPS/IMU observations for the projection centers, are added. For maximum accuracy, it is also common to regard interior orientation parameters (including nonlinear distortions) as observations of a specified accuracy rather than as constants and to estimate their values during bundle adjustment. This so-called self-calibration can take different forms, e.g., for crowd-sourced amateur images, it is usually required to estimate the focal length and radial distortion of each individual image, whereas for professional aerial imagery, it is common to use a single set of orientation parameters for all images but include more complex nonlinear distortion coefficients. For details about GPS/IMU integration, self-calibration, etc. see McGlone (2013).

³ In aerial photogrammetry, the network is often called an "image block," since the images are usually recorded on a regular raster.

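The normal equations for photogrammetric networks are often extremely large (up to >$10^6$ unknowns). However, they are also highly structured and very sparse. Partitioning the unknowns into the object point coordinates (index $x$) and the remaining parameters (index $q$), the normal equations take the block form

$$ \begin{bmatrix} \mathbf{N}_{xx} & \mathbf{N}_{xq} \\ \mathbf{N}_{xq}^\top & \mathbf{N}_{qq} \end{bmatrix} \begin{bmatrix} \delta\mathbf{x}_x \\ \delta\mathbf{x}_q \end{bmatrix} = \begin{bmatrix} \mathbf{n}_x \\ \mathbf{n}_q \end{bmatrix} . \tag{30} $$

Inverting $\mathbf{N}_{xx}$ is cheap because it is block diagonal with individual $(3 \times 3)$-blocks for each object point. Using that fact, the normal equations can efficiently be reduced to a much smaller system:

$$ \bar{\mathbf{N}}\,\delta\mathbf{x}_q = \bar{\mathbf{n}} , \quad \text{with} \quad \bar{\mathbf{N}} = \mathbf{N}_{qq} - \mathbf{N}_{xq}^\top \mathbf{N}_{xx}^{-1} \mathbf{N}_{xq} , \quad \bar{\mathbf{n}} = \mathbf{n}_q - \mathbf{N}_{xq}^\top \mathbf{N}_{xx}^{-1} \mathbf{n}_x . \tag{31} $$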
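The reduction in Eq. (31) can be illustrated with dense arrays. The sketch below is only meant to show the algebra: the function name and argument layout are hypothetical, the point block is passed as a stack of 3x3 matrices so that its inverse is formed block by block, and the back-substitution for the point corrections follows from the first block row of the partitioned normal equations. A production system would use sparse storage throughout.

```python
import numpy as np

def solve_reduced(N_xx_blocks, N_xq, N_qq, n_x, n_q):
    """Solve the partitioned normal equations via the reduced system.

    N_xx_blocks: (P, 3, 3) diagonal blocks of N_xx, one per object point.
    N_xq: (3P, Q) coupling block, N_qq: (Q, Q), n_x: (3P,), n_q: (Q,).
    Returns (dx_q, dx_x).
    """
    P = len(N_xx_blocks)
    inv_blocks = np.linalg.inv(N_xx_blocks)                  # batched 3x3 inverses
    # Apply N_xx^{-1} block-wise to N_xq and to n_x.
    Ninv_Nxq = np.einsum('pij,pjk->pik', inv_blocks,
                         N_xq.reshape(P, 3, -1)).reshape(3 * P, -1)
    Ninv_nx = np.einsum('pij,pj->pi', inv_blocks,
                        n_x.reshape(P, 3)).reshape(-1)
    N_bar = N_qq - N_xq.T @ Ninv_Nxq                         # reduced normal matrix
    n_bar = n_q - N_xq.T @ Ninv_nx                           # reduced right-hand side
    dx_q = np.linalg.solve(N_bar, n_bar)
    dx_x = Ninv_nx - Ninv_Nxq @ dx_q                         # back-substitution
    return dx_q, dx_x
```

The expensive solve involves only the smaller parameter block, while the point corrections follow from inexpensive block-wise products.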

The standard way to solve the reduced normal equations is to adaptively damp the equation system with the Levenberg-Marquardt method (Levenberg 1944; Nocedal and Wright 2006) for better convergence, i.e., the equation system is modified to

$$ \left( \bar{\mathbf{N}} + \lambda\,\mathbf{I}_q \right) \delta\mathbf{x}_q = \bar{\mathbf{n}} , \tag{32} $$

with $\mathbf{I}_q$ the identity matrix of appropriate size and the damping factor $\lambda$ depending on the success of the previous iteration. The system (32) is then reduced to a triangular form with variants of Cholesky factorization. Using recursive partitioning and equation solvers which exploit sparsity, it is possible to perform bundle adjustment for photogrammetric networks with >10,000 cameras.

Due to automatic tie-point measurement as well as the sheer size of modern photogrammetric campaigns, blunders (mainly incorrect tie-point matches) are unavoidable in practice. Therefore, bundle adjustment routinely employs robust methods such as iteratively reweighted least squares (IRLS) (e.g., Huber 1981) to defuse, and subsequently eliminate, gross outliers.
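The idea of iteratively reweighted least squares can be shown on a toy linear problem rather than on a full bundle adjustment. The sketch below uses Huber weights with a fixed threshold `k`; the function name, the threshold, the iteration count, and the assumption that the residuals are already expressed in units of their standard deviation are all illustrative choices.

```python
import numpy as np

def irls_huber(A, y, k=1.0, iterations=100):
    """Robust linear fit of y ~ A x with Huber weights (residuals assumed unit scale)."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]           # ordinary least-squares start
    for _ in range(iterations):
        r = y - A @ x                                   # current residuals
        abs_r = np.maximum(np.abs(r), 1e-12)            # guard against division by zero
        w = np.where(abs_r <= k, 1.0, k / abs_r)        # Huber weights
        Aw = A * w[:, None]
        x = np.linalg.solve(A.T @ Aw, Aw.T @ y)         # weighted normal equations
    return x
```

Observations with large residuals receive progressively smaller weights, so a single gross error no longer dominates the estimate; in a second pass such observations could be removed entirely.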

5 Conclusion

A brief summary has been given of the elementary geometry underlying photogrammetric modeling, as well as the mathematical operations for image orientation and image-based 3D reconstruction. The theory of photogrammetry started to emerge in the nineteenth century, and most of it was developed in the twentieth century. The geometric relations that govern the photographic imaging process, and their inversion for 3D measurement purposes, are nowadays well understood; the theory is mature and has been compiled, in much more detail than here, in several excellent textbooks (e.g., Hartley and Zisserman 2004; Luhmann et al. 2006; McGlone 2013). Still, some important findings, such as the direct solution for relative orientation of calibrated cameras, are surprisingly recent.


References

Abdel-Aziz YI, Karara HM (1971) Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry. In: Proceedings of the symposium on close-range photogrammetry. American Society of Photogrammetry, Falls Church
Brown DC (1958) A solution to the general problem of multiple station analytical stereotriangulation. Tech. Rep. RCA-MTP data reduction technical report no 43, Patrick Airforce Base
Das GB (1949) A mathematical approach to problems in photogrammetry. Emp Surv Rev 10(73):131–137
Förstner W (2010) Minimal representations for uncertainty and estimation in projective spaces. In: Proceedings of the Asian conference on computer vision, Queenstown. Lecture notes in computer science, vol 6493. Springer
Grunert JA (1841) Das Pothenot'sche Problem in erweiterter Gestalt; nebst Bemerkungen über seine Anwendung in der Geodäsie. pp. 238–248
Haralick RM, Lee CN, Ottenberg K, Nölle M (1994) Review and analysis of solutions of the three point perspective pose estimation problem. Int J Comput Vis 13(3):331–356
Hartley RI (1994) Projective reconstruction and invariants from multiple images. IEEE Trans Pattern Anal Mach Intell 16(10):1036–1041
Hartley RI (1997) Lines and points in three views and the trifocal tensor. Int J Comput Vis 22(2):125–140
Hartley RI, Sturm P (1997) Triangulation. Comput Vis Image Underst 68(2):146–157
Hartley RI, Zisserman A (2004) Multiple view geometry in computer vision, 2nd edn. Cambridge University Press, Cambridge
Huber PJ (1981) Robust statistics. Wiley, New York
Levenberg K (1944) A method for the solution of certain non-linear problems in least squares. Q Appl Math 2:164–168
Longuet-Higgins HC (1981) A computer algorithm for reconstructing a scene from two projections. Nature 293:133–135
Luhmann T, Robson S, Kyle S, Boehm J (2014) Close-range photogrammetry and 3D imaging, 2nd edn. De Gruyter, Berlin
McGlone JC (ed) (2013) Manual of Photogrammetry, 6th edn. American Society for Photogrammetry and Remote Sensing, Bethesda
Nistér D (2004) An efficient solution to the five-point relative pose problem. IEEE Trans Pattern Anal Mach Intell 26(6):756–777
Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn. Springer, New York
Pajdla T (2002) Stereo with oblique cameras. Int J Comput Vis 47(1–3):161–170
Semple JG, Kneebone GT (1952) Algebraic projective geometry. Oxford University Press, Oxford
Triggs B, McLauchlan PF, Hartley RI, Fitzgibbon AW (1999) Bundle adjustment – a modern synthesis. In: Vision algorithms: theory and practice. Lecture notes in computer science, vol 1883. Springer, Berlin/Heidelberg
von Sanden H (1908) Die Bestimmung der Kernpunkte in der Photogrammetrie. PhD thesis, Universität Göttingen


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_64-2 © Springer-Verlag Berlin Heidelberg 2014

Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude

Alain Plattner (a, b) and Frederik J. Simons (a, c)
(a) Department of Geosciences, Princeton University, Princeton, NJ, USA
(b) Department of Earth and Environmental Science, California State University, Fresno, Fresno, CA, USA
(c) Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA

Abstract

In the last few decades, a series of increasingly sophisticated satellite missions has brought us gravity and magnetometry data of ever improving quality. To make optimal use of this rich source of information on the structure of the Earth and other celestial bodies, our computational algorithms should be well matched to the specific properties of the data. In particular, inversion methods require specialized adaptation if the data are only locally available, if their quality varies spatially, or if we are interested in model recovery only for a specific spatial region. Here, we present two approaches to estimate potential fields on a spherical Earth, from gradient data collected at satellite altitude. Our context is that of the estimation of the gravitational or magnetic potential from vector-valued measurements. Both of our approaches utilize spherical Slepian functions to produce an approximation of local data at satellite altitude, which is subsequently transformed to the Earth's spherical reference surface. The first approach is designed for radial-component data only and uses scalar Slepian functions. The second approach uses all three components of the gradient data and incorporates a new type of vectorial spherical Slepian functions that we introduce in this chapter.

1 Introduction The estimation of the gravity potential (e.g., Nutz, 2002; Moritz, 2010) or that of the magnetic potential on a spherical Earth (e.g., Sabaka et al. 2010) from gradient data at satellite altitude can be stated as a “reevaluation,” of a three-dimensional function that is harmonic in a spherical shell, given values of its gradient within the harmonic shell (Freeden and Schreiner 2009). The reevaluation on the surface of a spherical Earth or planet is to be interpreted as a transformation, between the gradient at satellite altitude on the one hand and the potential function on the surface on the other hand. Such an operation is entwined with the notion of a (global) basis of functions in which to carry it out. When expressed in spherical harmonics, its numerical conditioning depends exponentially on the spherical-harmonic bandwidth of the data (Freeden and Schreiner 2009). The better the data quality, the higher the spherical-harmonic degrees that can be resolved (e.g., Maus et al. 2006a) but also the poorer the conditioning of the transformation. Scalar and vector spherical harmonics (e.g., Arkani-Hamed 2001, 2004; Maus et al. 2006b; Olsen et al. 2009; Gubbins et al. 2011) are only a few among the many basis functions that can be used for magnetic-field estimation. 


Alternatives include ellipsoidal harmonics (e.g., Bölling and Grafarend 2005; Maus 2010; Lowes and Winch 2012), monopoles (e.g., O'Brien and Parker 1994), spherical wavelets (e.g., Chambodut et al. 2005; Mayer and Maier 2006), spherical-cap harmonics (e.g., Haines 1985; Hwang and Chen 1997; Korte and Holme 2003), and their relatives (e.g., de Santis 1991; Thébault et al. 2006). For gravity-field estimation, besides the spherical harmonics (e.g., Eshagh 2009; Freeden and Schreiner 2009), we can also list spherical wavelets (e.g., Chambodut et al. 2005; Fengler et al. 2007), ellipsoidal harmonics (e.g., Lowes and Winch 2012), and mascons (e.g., Rowlands et al. 2005).

Data quality might not be evenly distributed over the entire sphere or may even only be locally available (Arkani-Hamed and Strangway 1986; Arkani-Hamed 2002; Maus et al. 2006c). For this reason, methods that take the locality of the data into account are of great value. Unfortunately, a function, and hence a method of analysis, cannot be bandlimited and spacelimited at the same time. Every localized method that transforms data at satellite altitude into a potential field on Earth's surface needs to circumvent or embrace this fact. Schachtschneider et al. (2010, 2012) analyze the errors introduced by local approximation in a general framework.

The method that we present here builds on the localized function bases first described by Slepian and Pollak (1961) for problems in time-series analysis. They constructed one-dimensional functions that are bandlimited but optimally concentrated within a target interval, and later extended the concept of what became known as the Slepian functions to multidimensional Cartesian cases (Slepian 1964). Albertella et al. (1999) and then Simons et al. (2006) ushered in the realm of scalar spherical Slepian functions, and Jahn and Bokor (2012, 2014) and Plattner and Simons (2012, 2014) described vectorial spherical Slepian functions, all of these ideally suited for applications in geomathematics and fitting neatly with the general notions of signal concentration and the uncertainty principle espoused by Freeden and Michel (2004) and Kennedy and Sadeghi (2013), among others. A more detailed introduction to scalar and vectorial Slepian functions can be found in the chapter "Scalar and Vector Slepian Functions, Spherical Signal Estimation and Spectral Analysis" by Simons and Plattner in this book.

Theoretical considerations on the application of scalar Slepian functions to potential-field estimation from scalar potential data at satellite altitude were presented by Simons and Dahlen (2006), and some very practical cases in oceanography, terrestrial geodesy, and planetary science can be found elsewhere (Harig and Simons 2012; Lewis and Simons 2012; Slobbe et al. 2012). In this chapter, after emphasizing some preliminaries in Sect. 2, stating the problems to be solved in Sect. 3, and introducing the scalar and a special type of vector Slepian functions in Sect. 4, we extend the approach presented by Simons and Dahlen (2006) to the potential estimation from radial-derivative data, in Sect. 5. Subsequently, we present a method to estimate the potential field from local three-component gradient data using vector Slepian functions in Sect. 6. Finally, in Sect. 7, we present numerical examples for both the radial-component method and the fully vectorial gradient data method.

2 Scalar and Vector Spherical Harmonics and Harmonic Continuation

In this chapter, we employ a notation that is similar to the one used in the chapter "Scalar and Vector Slepian Functions, Spherical Signal Estimation, and Spectral Analysis" by Simons and Plattner in this book. We adapted the notation to transparently account for scalar and vector-valued functions. Scalar-valued functions are italicized, with capital letters such as $Y_{lm}$ for the classical spherical-harmonic functions. Vector-valued functions are italic but boldfaced, with capital letters, such as $\boldsymbol{E}_{lm}$ for the gradient-vector harmonics that we define. Column vectors containing scalar functions are in a calligraphic font, for example, $\mathcal{Y}$, whereas column vectors that contain vector functions are calligraphic but bold, as in $\boldsymbol{\mathcal{E}}$. Column vectors of expansion coefficients are roman and lowercase, such as $\mathrm{u}$, and their scalar entries are in lowercase italics, such as $u_{lm}$. If functions or coefficients are estimated from the data, they receive a tilde, such as $\tilde{V}$ or $\tilde{\mathrm{u}}$. Matrices containing coefficients or multiplicative factors are roman and bold, such as $\mathbf{A}$. Matrices containing functions evaluated at specific points are sans-serif bold, such as $\boldsymbol{\mathsf{Y}}$.

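2.1 Scalar Spherical Harmonics

As customary we define, for a point $\hat{\boldsymbol{r}}$ on the surface of the unit sphere $\Omega = \{\boldsymbol{x} : \|\boldsymbol{x}\| = 1\}$ with colatitudinal value $0 \le \theta \le \pi$ and longitudinal value $0 \le \phi < 2\pi$, the real-valued spherical-harmonic functions

$$ Y_{lm}(\hat{\boldsymbol{r}}) = Y_{lm}(\theta, \phi) = \begin{cases} \sqrt{2}\,X_{l|m|}(\theta)\cos m\phi & \text{if } -l \le m < 0, \\ X_{l0}(\theta) & \text{if } m = 0, \\ \sqrt{2}\,X_{lm}(\theta)\sin m\phi & \text{if } 0 < m \le l, \end{cases} \tag{1} $$

$$ X_{lm}(\theta) = (-1)^m \left( \frac{2l+1}{4\pi} \right)^{1/2} \left[ \frac{(l-m)!}{(l+m)!} \right]^{1/2} P_{lm}(\cos\theta) , \tag{2} $$

$$ P_{lm}(\mu) = \frac{1}{2^l\,l!}\,(1 - \mu^2)^{m/2} \left( \frac{d}{d\mu} \right)^{l+m} (\mu^2 - 1)^l . \tag{3} $$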
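Definitions (1)-(3) translate almost literally into code. The following sketch evaluates the associated Legendre functions through the derivative formula (3) using NumPy's polynomial class, which is transparent but only sensible for low degrees (recursions are preferable at high degree). The function names are hypothetical.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial import Polynomial

def P_lm(l, m, mu):
    """Associated Legendre function of Eq. (3), for 0 <= m <= l."""
    poly = (Polynomial([-1.0, 0.0, 1.0]) ** l).deriv(l + m)   # d^(l+m)/dmu^(l+m) of (mu^2 - 1)^l
    return (1.0 - mu**2) ** (m / 2) * poly(mu) / (2**l * factorial(l))

def X_lm(l, m, theta):
    """Normalized colatitudinal function of Eq. (2), for 0 <= m <= l."""
    norm = sqrt((2 * l + 1) / (4 * pi)) * sqrt(factorial(l - m) / factorial(l + m))
    return (-1) ** m * norm * P_lm(l, m, np.cos(theta))

def Y_lm(l, m, theta, phi):
    """Real spherical harmonic of Eq. (1): cosines for m < 0, sines for m > 0."""
    if m < 0:
        return sqrt(2) * X_lm(l, -m, theta) * np.cos(m * phi)
    if m == 0:
        return X_lm(l, 0, theta) * np.ones_like(phi)
    return sqrt(2) * X_lm(l, m, theta) * np.sin(m * phi)
```

Orthonormality over the unit sphere can be checked numerically with Gauss-Legendre nodes in the cosine of the colatitude and equally spaced longitudes.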

With this definition of the surface spherical harmonics $Y_{lm}$, we may learn from Backus et al. (1996), Dahlen and Tromp (1998), or Freeden and Schreiner (2009) that they are the orthonormal eigenfunctions of the scalar Laplace-Beltrami operator

$$ \nabla_1^2 = \partial_\theta^2 + \cot\theta\,\partial_\theta + (\sin\theta)^{-2}\,\partial_\phi^2 , \tag{4} $$

with eigenvalues $-l(l+1)$; thus $\nabla_1^2 Y_{lm} = -l(l+1)\,Y_{lm}$. In spherical coordinates, we can define the three-dimensional Laplace operator

$$ \nabla^2 = \partial_r^2 + 2 r^{-1}\,\partial_r + r^{-2}\,\nabla_1^2 , \tag{5} $$

and the Laplace equation by which we define a three-dimensional function $V(r\hat{\boldsymbol{r}})$ to be harmonic,

$$ \nabla^2 V(r\hat{\boldsymbol{r}}) = 0 . \tag{6} $$

The general solution of Eq. (6) comprises one component that vanishes at the origin $r = 0$ and another that is regular by going to zero at infinity. The inner, $r^{l}\,Y_{lm}$, and outer, $r^{-l-1}\,Y_{lm}$, solid spherical harmonics form a basis for all solutions of Laplace's equation and serve to approximate external-source and internal-source scalar potentials (Olsen et al. 2010), respectively (Blakely 1995; Langel and Hinze 1998).

The spherical harmonics $Y_{lm}$ defined in (1) form an orthonormal basis for square-integrable real-valued functions on the unit sphere $\Omega$. We can describe any such function $V(\hat{\boldsymbol{r}})$ as a unique linear combination of spherical harmonics via the expansion

$$ V(\hat{\boldsymbol{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} u_{lm}\,Y_{lm}(\hat{\boldsymbol{r}}) , \quad \text{where} \quad u_{lm} = \int_\Omega V(\hat{\boldsymbol{r}})\,Y_{lm}(\hat{\boldsymbol{r}})\,d\Omega . \tag{7} $$

Now let $V(r\hat{\boldsymbol{r}})$ be a three-dimensional function that satisfies the Laplace equation (6) outside of the unit sphere, and which is regular at infinity. If we know the spherical-harmonic coefficients of $V(r\hat{\boldsymbol{r}})$ on the unit sphere ($r = 1$), from Eq. (7), then we can describe the function at any point $r \ge 1$ outside of the unit sphere using the outer harmonics by writing

$$ V(r\hat{\boldsymbol{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} r^{-l-1}\,u_{lm}\,Y_{lm}(\hat{\boldsymbol{r}}) . \tag{8} $$

More generally, for a function $V(r\hat{\boldsymbol{r}})$ that satisfies Eq. (6) outside a ball of radius $r_e$, and which is regular at infinity, its evaluation on a sphere $\Omega_{r_a}$ of radius $r_a \ge r_e$ is an expansion of spherical harmonics in the following way:

$$ V(r_a\hat{\boldsymbol{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} u^{r_a}_{lm}\,Y_{lm}(\hat{\boldsymbol{r}}) , \quad \text{where} \quad u^{r_a}_{lm} = \int_\Omega V(r_a\hat{\boldsymbol{r}})\,Y_{lm}(\hat{\boldsymbol{r}})\,d\Omega . \tag{9} $$

In order to evaluate $V(r\hat{\boldsymbol{r}})$ at any other radius $r \ge r_e$ given the spherical-harmonic coefficient values $u^{r_a}_{lm}$ at radius $r_a \ge r_e$, we can use Eq. (8) twice, to first evaluate $V(r\hat{\boldsymbol{r}})$ on the unit sphere and then, at radius $r$, to obtain

$$ V(r\hat{\boldsymbol{r}}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \left( \frac{r}{r_a} \right)^{-l-1} u^{r_a}_{lm}\,Y_{lm}(\hat{\boldsymbol{r}}) . \tag{10} $$
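In coefficient space, Eq. (10) is a degree-dependent rescaling. The short sketch below applies that factor to an array of coefficients; the function name and the example radii (a mean Earth radius of 6371 km and a hypothetical orbit 400 km above it) are assumptions made only for illustration.

```python
import numpy as np

def continue_coefficients(u, degrees, r_from, r_to):
    """Rescale coefficients given at radius r_from to radius r_to.

    u: coefficients u_lm at radius r_from; degrees: the degree l of each entry.
    Implements the factor (r_to / r_from)^(-l-1) of the harmonic continuation.
    """
    degrees = np.asarray(degrees, dtype=float)
    return np.asarray(u, dtype=float) * (r_to / r_from) ** (-degrees - 1.0)

# Hypothetical radii in km: surface and satellite altitude.
r_e, r_s = 6371.0, 6771.0
degrees = np.arange(0, 101)
# Amplification applied to each degree when going from the satellite down to the surface.
amplification = continue_coefficients(np.ones(degrees.size), degrees, r_s, r_e)
```

The amplification grows geometrically with the degree, which is the exponential dependence of the conditioning on the bandwidth mentioned in the introduction.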

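2.2 Gradient-Vector Spherical Harmonics

From the scalar spherical harmonics $Y_{lm}(\hat{\boldsymbol{r}})$, we may define vector spherical-harmonic functions on the unit sphere using the Helmholtz decomposition in the usual way (Backus et al. 1996; Dahlen and Tromp 1998; Freeden and Schreiner 2009) as the fully normalized $\boldsymbol{P}_{00}(\hat{\boldsymbol{r}}) = \hat{\boldsymbol{r}}\,Y_{00}(\hat{\boldsymbol{r}})$ and, for $l \ge 1$ and $-l \le m \le l$,

$$ \boldsymbol{P}_{lm}(\hat{\boldsymbol{r}}) = \hat{\boldsymbol{r}}\,Y_{lm}(\hat{\boldsymbol{r}}) , \tag{11} $$

$$ \boldsymbol{B}_{lm}(\hat{\boldsymbol{r}}) = \frac{\boldsymbol{\nabla}_1 Y_{lm}(\hat{\boldsymbol{r}})}{\sqrt{l(l+1)}} = \frac{\left[ \hat{\boldsymbol{\theta}}\,\partial_\theta + \hat{\boldsymbol{\phi}}\,(\sin\theta)^{-1}\,\partial_\phi \right] Y_{lm}(\hat{\boldsymbol{r}})}{\sqrt{l(l+1)}} , \tag{12} $$

$$ \boldsymbol{C}_{lm}(\hat{\boldsymbol{r}}) = \frac{-\hat{\boldsymbol{r}} \times \boldsymbol{\nabla}_1 Y_{lm}(\hat{\boldsymbol{r}})}{\sqrt{l(l+1)}} = \frac{\left[ \hat{\boldsymbol{\theta}}\,(\sin\theta)^{-1}\,\partial_\phi - \hat{\boldsymbol{\phi}}\,\partial_\theta \right] Y_{lm}(\hat{\boldsymbol{r}})}{\sqrt{l(l+1)}} , \tag{13} $$

where the relevant surface and the three-dimensional gradient operators are

$$ \boldsymbol{\nabla}_1 = \hat{\boldsymbol{\theta}}\,\partial_\theta + \hat{\boldsymbol{\phi}}\,(\sin\theta)^{-1}\,\partial_\phi , \tag{14} $$

$$ \boldsymbol{\nabla} = \hat{\boldsymbol{r}}\,\partial_r + r^{-1}\,\boldsymbol{\nabla}_1 . \tag{15} $$

For our purposes, we use an alternative basis of normalized vector spherical harmonics (Nutz 2002; Mayer and Maier 2006; Freeden and Schreiner 2009). We define $\boldsymbol{E}_{00} = \boldsymbol{P}_{00}$, and, for $l \ge 1$ and $-l \le m \le l$,

$$ \boldsymbol{E}_{lm} = \sqrt{\frac{l+1}{2l+1}}\,\boldsymbol{P}_{lm} - \sqrt{\frac{l}{2l+1}}\,\boldsymbol{B}_{lm} , \tag{16} $$

$$ \boldsymbol{F}_{lm} = \sqrt{\frac{l}{2l+1}}\,\boldsymbol{P}_{lm} + \sqrt{\frac{l+1}{2l+1}}\,\boldsymbol{B}_{lm} . \tag{17} $$

This alternative orthonormal basis of vector spherical harmonics $\boldsymbol{E}_{lm}$, $\boldsymbol{F}_{lm}$, and $\boldsymbol{C}_{lm}$ is identical to $\tilde{y}^{(1)}_{n,m}$, $\tilde{y}^{(2)}_{n,m}$, $\tilde{y}^{(3)}_{n,m}$ in the notation of Freeden and Schreiner (2009) and to the $u^{(1)}_{n,k}$, $u^{(2)}_{n,k}$, $u^{(3)}_{n,k}$ of Mayer and Maier (2006). The functions $\Pi^{m,(c,s)}_{ni}$ by Sabaka et al. (2010) are scaled variants of the functions $\boldsymbol{E}_{lm}$. Figure 1 shows three-component spatial renditions of two of the basis elements, $\boldsymbol{E}_{3\,2}$ and $\boldsymbol{F}_{3\,2}$.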

2.3 Harmonic Continuation of Scalar and Vector Fields

From now on, we will always assume that the Earth’s surface is a sphere $\Omega_{r_e}$ of fixed radius $r_e$ and that the satellite altitude is a sphere $\Omega_{r_s}$ of radius $r_s \ge r_e$. Using Eqs. (9) and (10), we can express the potential field $V(r_s\hat{\mathbf r})$ at the satellite altitude $r_s$ via the spherical-harmonic coefficients $u^{r_e}_{lm}$ on Earth’s surface $\Omega_{r_e}$ by

$$V(r_s\hat{\mathbf r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \left(\frac{r_s}{r_e}\right)^{-l-1} u^{r_e}_{lm}\, Y_{lm}(\hat{\mathbf r}), \tag{18}$$

where the coefficients $u^{r_e}_{lm}$, the entries of a vector $\mathbf{u}^{r_e}$, are given by

$$u^{r_e}_{lm} = \int_\Omega V(r_e\hat{\mathbf r})\, Y_{lm}(\hat{\mathbf r})\, d\Omega. \tag{19}$$



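The gradient of the potential at satellite altitude will then, by Eq. (15), be given by the expression

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \left[ -(l+1)\, r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} u^{r_e}_{lm}\, \hat{\mathbf r}\, Y_{lm}(\hat{\mathbf r}) + r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} u^{r_e}_{lm}\, \boldsymbol{\nabla}_1 Y_{lm}(\hat{\mathbf r}) \right]. \tag{20}$$

Fig. 1 The gradient-vector spherical harmonics of Eqs. (16) and (17), more specifically $\boldsymbol{E}_{3\,2}$ and $\boldsymbol{F}_{3\,2}$. Shown are the radial components $\boldsymbol{E}_{3\,2}\cdot\hat{\mathbf r}$ and $\boldsymbol{F}_{3\,2}\cdot\hat{\mathbf r}$, the tangential (colatitudinal) components $\boldsymbol{E}_{3\,2}\cdot\hat{\boldsymbol\theta}$ and $\boldsymbol{F}_{3\,2}\cdot\hat{\boldsymbol\theta}$, and the tangential (longitudinal) components $\boldsymbol{E}_{3\,2}\cdot\hat{\boldsymbol\phi}$ and $\boldsymbol{F}_{3\,2}\cdot\hat{\boldsymbol\phi}$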

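Equation (20) reveals that the potential coefficients $u^{r_e}_{lm}$ are uniquely determined from the radial component of its gradient, as is well known (Lowes et al. 1995),

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r})\cdot\hat{\mathbf r} = \partial_r V(r_s\hat{\mathbf r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} -(l+1)\, r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} u^{r_e}_{lm}\, Y_{lm}(\hat{\mathbf r}). \tag{21}$$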

If we had perfect knowledge of the radial component of the field $\boldsymbol{\nabla} V$, the potential $V$ would be uniquely determined. When the data are contaminated by noise, we might gain by taking the radial and both tangential components into account. As shown, for example, by Freeden and Schreiner (2009), we can reformulate Eq. (20) by inserting the definitions (11) and (12) of the vector spherical harmonics $\boldsymbol{P}_{lm}$ and $\boldsymbol{B}_{lm}$ and then using the definition (16) of the vector spherical harmonics $\boldsymbol{E}_{lm}$ to write

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} u^{r_e}_{lm} \left[ (-l-1)\,\boldsymbol{P}_{lm}(\hat{\mathbf r}) + \boldsymbol{\nabla}_1 Y_{lm}(\hat{\mathbf r}) \right]$$

$$\phantom{\boldsymbol{\nabla} V(r_s\hat{\mathbf r})} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} -\sqrt{(l+1)(2l+1)}\; r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} u^{r_e}_{lm}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}). \tag{22}$$

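Equation (22) thus shows that the gradient $\boldsymbol{\nabla} V(r\hat{\mathbf r})$ of a potential $V(r\hat{\mathbf r})$ that satisfies the Laplace equation $\nabla^2 V(r\hat{\mathbf r}) = 0$ outside the sphere, $r > r_e$, and which vanishes at infinity can be expressed as a linear combination of the vector spherical harmonics $\boldsymbol{E}_{lm}(\hat{\mathbf r})$ of Eq. (16). For this reason, we will dub those gradient-vector spherical harmonics in this paper. We can expand $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ as

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} v^{r_s}_{lm}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}), \tag{23}$$

where the entries of the vector $\mathbf{v}^{r_s}$ are given by

$$v^{r_s}_{lm} = \int_\Omega \boldsymbol{\nabla} V(r_s\hat{\mathbf r})\cdot\boldsymbol{E}_{lm}(\hat{\mathbf r})\, d\Omega. \tag{24}$$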



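The relationships between the spherical-harmonic expansion coefficients of the scalar potential $V(r\hat{\mathbf r})$, the radial component of the gradient $\partial_r V(r\hat{\mathbf r})$, and the gradient-vector expansion coefficients of the gradient $\boldsymbol{\nabla} V(r\hat{\mathbf r})$, on Earth’s surface $r = r_e$, and at satellite altitude $r = r_s$, can be described in the following (extended) “Meissl” scheme (Rummel and van Gelderen 1995; Nutz 2002; Freeden and Schreiner 2009) which identifies the basis transformations and the multiplicative factors for the expansion coefficients needed to interrelate them:

$$
\begin{array}{ccccc}
\partial_r V(r_s\hat{\mathbf r}) & \xleftarrow{\;\times(-l-1)/r_s\;} & V(r_s\hat{\mathbf r}) & \xrightarrow[\;Y_{lm}\to\boldsymbol{E}_{lm}\;]{\;\times\left(-\sqrt{(l+1)(2l+1)}\right)/r_s\;} & \boldsymbol{\nabla} V(r_s\hat{\mathbf r}) \\[6pt]
\Big\uparrow {\scriptstyle \times\left(\frac{r_s}{r_e}\right)^{-l-2}} & & \Big\uparrow {\scriptstyle \times\left(\frac{r_s}{r_e}\right)^{-l-1}} & & \Big\uparrow {\scriptstyle \times\left(\frac{r_s}{r_e}\right)^{-l-2}} \\[6pt]
\partial_r V(r_e\hat{\mathbf r}) & \xleftarrow{\;\times(-l-1)/r_e\;} & V(r_e\hat{\mathbf r}) & \xrightarrow[\;Y_{lm}\to\boldsymbol{E}_{lm}\;]{\;\times\left(-\sqrt{(l+1)(2l+1)}\right)/r_e\;} & \boldsymbol{\nabla} V(r_e\hat{\mathbf r})
\end{array}
\tag{25}
$$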

From the spherical-harmonic coefficients of $V(r_e\hat{\mathbf r})$, we can obtain the spherical-harmonic coefficients of $V(r_s\hat{\mathbf r})$ as $u^{r_s}_{lm} = (r_s/r_e)^{-l-1}\, u^{r_e}_{lm}$. In order to obtain the spherical-harmonic coefficients of $\partial_r V(r_s\hat{\mathbf r})$ from those of $V(r_e\hat{\mathbf r})$, we can either first follow $V(r_e\hat{\mathbf r}) \to \partial_r V(r_e\hat{\mathbf r})$ and then $\partial_r V(r_e\hat{\mathbf r}) \to \partial_r V(r_s\hat{\mathbf r})$ or first $V(r_e\hat{\mathbf r}) \to V(r_s\hat{\mathbf r})$ and then $V(r_s\hat{\mathbf r}) \to \partial_r V(r_s\hat{\mathbf r})$. Either way we then obtain the spherical-harmonic coefficients of $\partial_r V(r_s\hat{\mathbf r})$ as $-(l+1)\, r_e^{-1} (r_s/r_e)^{-l-2}\, u^{r_e}_{lm}$. To obtain $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ from $V(r_e\hat{\mathbf r})$, we replace the spherical-harmonic functions $Y_{lm}$ by the gradient-vector spherical harmonics $\boldsymbol{E}_{lm}$ and multiply their coefficients with $-\sqrt{(l+1)(2l+1)}\; r_e^{-1} (r_s/r_e)^{-l-2}$. Similarly, we can obtain the coefficients for any function in this scheme from the coefficients of any other function by following the arrows: replacing, if necessary, basis functions and multiplying the coefficients with the corresponding factors, as shown.
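The path independence of the scheme can be verified directly from the factors quoted in the text. The sketch below does so for the two routes from $V(r_e\hat{\mathbf r})$ to $\partial_r V(r_s\hat{\mathbf r})$ and to $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$; the function names and the radii (an Earth radius of 6371 km and an altitude of 400 km) are assumptions made for illustration and are not taken from the chapter.

```python
import math

def radial_factor_via_surface(l, re, rs):
    # V(re) -> d_r V(re) -> d_r V(rs): differentiate first, then continue upward
    return (-(l + 1) / re) * (rs / re) ** (-l - 2)

def radial_factor_via_altitude(l, re, rs):
    # V(re) -> V(rs) -> d_r V(rs): continue upward first, then differentiate
    return (rs / re) ** (-l - 1) * (-(l + 1) / rs)

def gradient_factor_via_surface(l, re, rs):
    # V(re) -> grad V(re) -> grad V(rs), with Y_lm replaced by E_lm
    return (-math.sqrt((l + 1) * (2 * l + 1)) / re) * (rs / re) ** (-l - 2)

def gradient_factor_via_altitude(l, re, rs):
    # V(re) -> V(rs) -> grad V(rs), with Y_lm replaced by E_lm
    return (rs / re) ** (-l - 1) * (-math.sqrt((l + 1) * (2 * l + 1)) / rs)

re, rs = 6371.0, 6771.0  # assumed radii in km, for illustration only
for l in range(0, 30):
    assert math.isclose(radial_factor_via_surface(l, re, rs),
                        radial_factor_via_altitude(l, re, rs), rel_tol=1e-12)
    assert math.isclose(gradient_factor_via_surface(l, re, rs),
                        gradient_factor_via_altitude(l, re, rs), rel_tol=1e-12)
```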


3 Potential-Field Estimation Using Spherical Harmonics

With the preliminaries out of the way, we now turn our attention to problems of geomathematical and geophysical interest. We distinguish and treat the following four problems in potential-field estimation:

P1 Estimating the spherical-harmonic potential-field coefficients from scalar data collected at the same altitude.
P2 Estimating spherical-harmonic potential-field coefficients at source level from radial data collected at satellite altitude.
P3 Estimating the gradient-vector spherical-harmonic coefficients from vector data collected at the same altitude.
P4 Estimating spherical-harmonic potential-field coefficients at source level from gradient data at satellite altitude.

Problems P1 and P3 will serve as problems introductory to the more involved but practically more relevant P2 and P4. We will provide numerical solutions as estimations based on data point values for all four problems. For problems P2 and P4, we will also provide analytic solutions which will then enable us to calculate the effects of localization and bandlimitation on the estimation process. When discussing, in Sects. 5 and 6, the use of localized basis functions as a means of regularizing problems P2 and P4, we will provide an analysis of the effect of making bandlimited reconstructions of non-bandlimited functions explicitly, in Sects. 5.2 and 6.2.

3.1 Discrete Formulation and Unregularized Solutions

In this section, we describe classical least-squares approaches to estimating the spherical-harmonic (problems P1, P2, and P4) or gradient-vector spherical-harmonic (problem P3) coefficients of potential fields and their gradients from discretely available, noiseless data.

3.1.1 Problem P1: Scalar Potential Data, Scalar-Harmonic Potential Coefficients, and Same Altitude

Let there be $k$ scalar function values

$$\mathsf{V} = \big( V(r_s\hat{\mathbf r}_1) \;\cdots\; V(r_s\hat{\mathbf r}_k) \big)^T, \tag{26}$$

evaluated at positions $r_s\hat{\mathbf r}_1, \ldots, r_s\hat{\mathbf r}_k$ on a sphere $\Omega_{r_s}$. These are the samples

$$V(r_s\hat{\mathbf r}_i) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} u^{r_s}_{lm}\, Y_{lm}(\hat{\mathbf r}_i). \tag{27}$$

Our objective is to estimate the spherical-harmonic coefficients $u^{r_s}_{lm}$ within a certain bandwidth $L$, i.e., for $0 \le l \le L$ and $-l \le m \le l$. This can be performed using least-squares analysis, assuming that the number of data exceeds the number of degrees of freedom in the system, $(L+1)^2 \le k$. Defining the matrix of point evaluations on the unit sphere

$$\mathsf{Y} = \begin{pmatrix} Y_{00}(\hat{\mathbf r}_1) & \cdots & Y_{00}(\hat{\mathbf r}_k) \\ \vdots & & \vdots \\ Y_{LL}(\hat{\mathbf r}_1) & \cdots & Y_{LL}(\hat{\mathbf r}_k) \end{pmatrix}, \tag{28}$$

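and the bandlimited vector of estimated coefficients

$$\tilde{\mathbf u}^{r_s} = \big( \tilde u^{r_s}_{00} \;\cdots\; \tilde u^{r_s}_{LL} \big)^T, \tag{29}$$

the statement of our first problem is to solve

$$\arg\min_{\tilde{\mathbf u}^{r_s}} \left\| \mathsf{Y}^T \tilde{\mathbf u}^{r_s} - \mathsf{V} \right\|^2, \tag{30}$$

and the solution is given by

$$\tilde{\mathbf u}^{r_s} = \left( \mathsf{Y}\mathsf{Y}^T \right)^{-1} \mathsf{Y}\,\mathsf{V} \qquad \text{(solution to problem P1).} \tag{31}$$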
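As an illustration of Eqs. (26) to (31), the following sketch recovers made-up coefficients from noiseless samples at random points. It assumes NumPy, a bandwidth $L = 1$, and real spherical harmonics that are orthonormal over the unit sphere; the helper `real_sh_L1`, its ordering of the rows, and its sign convention are choices made here and may differ from the chapter’s Eqs. (9) and (10).

```python
import numpy as np

def real_sh_L1(theta, phi):
    """Rows Y00, Y1-1, Y10, Y11 (real, orthonormal over the unit sphere).
    Ordering and signs are assumptions of this sketch."""
    c0 = 0.5 / np.sqrt(np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.vstack([
        c0 * np.ones_like(theta),
        c1 * np.sin(theta) * np.sin(phi),
        c1 * np.cos(theta),
        c1 * np.sin(theta) * np.cos(phi),
    ])

rng = np.random.default_rng(0)
k = 50                                          # number of data points, k >= (L+1)^2
theta = np.arccos(rng.uniform(-1.0, 1.0, k))    # colatitudes, uniform on the sphere
phi = rng.uniform(0.0, 2.0 * np.pi, k)          # longitudes

Y = real_sh_L1(theta, phi)                      # (L+1)^2 x k matrix of Eq. (28)
u_true = np.array([1.0, -0.5, 2.0, 0.25])       # made-up bandlimited coefficients
V = Y.T @ u_true                                # noiseless samples, Eq. (27)

# Eq. (31): solve the normal equations rather than forming the inverse explicitly
u_est = np.linalg.solve(Y @ Y.T, Y @ V)
print(u_est)
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to computing $(\mathsf{Y}\mathsf{Y}^T)^{-1}$ explicitly, although both express the same estimator.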

3.1.2 Problem P2: Scalar Radial-Derivative Data, Scalar-Harmonic Potential Coefficients, and Different Altitudes

Next, we wish to turn the equal-altitude problem P1 described in Eq. (30) and solved in Eq. (31) into a $r_s$-to-$r_e$ downward-continuation, radial-derivative component-to-potential problem P2. We define a diagonal upward transformation matrix $\mathbf{A}$, which includes the effects of harmonic continuation and radial differentiation (see Eqs. 21 and 25), by its elements

$$A_{lm,l'm'} = -(l+1)\, r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} \delta_{ll'}\,\delta_{mm'}. \tag{32}$$

The discrete set of point values from which we desire to recover the spherical-harmonic potential coefficients on the surface of the Earth, $u^{r_e}_{lm}$, are the sampled radial components of the gradient of the potential (27) evaluated at satellite altitude $r_s$,

$$\mathsf{V}'_r = \big( \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_1)\cdot\hat{\mathbf r} \;\cdots\; \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_k)\cdot\hat{\mathbf r} \big)^T. \tag{33}$$

Problem P2, estimating the spherical-harmonic coefficients $u^{r_e}_{lm}$ of the potential on Earth’s surface $\Omega_{r_e}$, collected in the vector

$$\tilde{\mathbf u}^{r_e} = \big( \tilde u^{r_e}_{00} \;\cdots\; \tilde u^{r_e}_{LL} \big)^T, \tag{34}$$

from potential-field data collected at satellite altitude on $\Omega_{r_s}$, is then formulated as

$$\arg\min_{\tilde{\mathbf u}^{r_e}} \left\| \mathsf{Y}^T \mathbf{A}\, \tilde{\mathbf u}^{r_e} - \mathsf{V}'_r \right\|^2, \tag{35}$$

and is found to be

$$\tilde{\mathbf u}^{r_e} = \mathbf{A}^{-1} \left( \mathsf{Y}\mathsf{Y}^T \right)^{-1} \mathsf{Y}\,\mathsf{V}'_r \qquad \text{(solution 1 to problem P2).} \tag{36}$$
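Because $\mathbf{A}$ is diagonal, its inverse in Eq. (36) acts degree by degree. The sketch below builds the diagonal of Eq. (32) and compares the size of the inverse entries at two degrees; the function name, the radii, and the bandwidth are assumptions chosen for illustration.

```python
def radial_upward_diagonal(L, re, rs):
    """Diagonal of A in Eq. (32), ordered by degree l and then order m = -l..l."""
    diag = []
    for l in range(L + 1):
        a_l = -(l + 1) / re * (rs / re) ** (-l - 2)
        diag.extend([a_l] * (2 * l + 1))   # the factor does not depend on m
    return diag

re, rs, L = 6371.0, 6771.0, 200            # assumed radii (km) and bandwidth
A_diag = radial_upward_diagonal(L, re, rs)
A_inv_diag = [1.0 / a for a in A_diag]

# Last entry of degree l sits at index (l + 1)**2 - 1 in this ordering.
gain_100 = abs(A_inv_diag[101 ** 2 - 1])
gain_200 = abs(A_inv_diag[201 ** 2 - 1])
print(gain_200 / gain_100)
```

With these assumed values, the inverse factor grows by more than two orders of magnitude between degrees 100 and 200. At low degrees the $1/(l+1)$ from differentiation partly offsets the continuation factor, but for sufficiently high degrees the exponential term dominates, which is the mechanism by which small-scale noise is inflated.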

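3.1.3 Problem P3: Vector Gradient Data, Vector-Harmonic Coefficients, and Same Altitude

In a third problem, we seek to estimate the coefficients of the gradient function $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ at satellite altitude, all together

$$\tilde{\mathbf v}^{r_s} = \big( \tilde v^{r_s}_{00} \;\cdots\; \tilde v^{r_s}_{LL} \big)^T, \tag{37}$$

in the basis of the gradient-vector spherical harmonics $\boldsymbol{E}_{lm}$, from discrete function values of $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ given at the points $r_s\hat{\mathbf r}_1, \ldots, r_s\hat{\mathbf r}_k$. We introduce

$$\mathsf{V}' = \big( \mathsf{V}'^{\,T}_r \;\; \mathsf{V}'^{\,T}_\theta \;\; \mathsf{V}'^{\,T}_\phi \big)^T, \tag{38}$$

with $\mathsf{V}'_r$ as defined previously in Eq. (33), and, analogously,

$$\mathsf{V}'_\theta = \big( \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\theta} \;\cdots\; \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\theta} \big)^T, \tag{39}$$

$$\mathsf{V}'_\phi = \big( \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\phi} \;\cdots\; \boldsymbol{\nabla} V(r_s\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\phi} \big)^T. \tag{40}$$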

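To formulate problem P3 for the pointwise evaluated functions given in Eqs. (33), (39), and (40), namely, the samples

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}_i) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} v^{r_s}_{lm}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}_i), \tag{41}$$

we also define the matrix of point evaluations of the gradient-vector spherical harmonics

$$\mathsf{E} = \big( \mathsf{E}_r \;\; \mathsf{E}_\theta \;\; \mathsf{E}_\phi \big), \tag{42}$$

where the constituent matrices are given by

$$\mathsf{E}_r = \begin{pmatrix} \boldsymbol{E}_{00}(\hat{\mathbf r}_1)\cdot\hat{\mathbf r} & \cdots & \boldsymbol{E}_{00}(\hat{\mathbf r}_k)\cdot\hat{\mathbf r} \\ \vdots & & \vdots \\ \boldsymbol{E}_{LL}(\hat{\mathbf r}_1)\cdot\hat{\mathbf r} & \cdots & \boldsymbol{E}_{LL}(\hat{\mathbf r}_k)\cdot\hat{\mathbf r} \end{pmatrix}, \tag{43}$$

$$\mathsf{E}_\theta = \begin{pmatrix} \boldsymbol{E}_{00}(\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\theta} & \cdots & \boldsymbol{E}_{00}(\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\theta} \\ \vdots & & \vdots \\ \boldsymbol{E}_{LL}(\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\theta} & \cdots & \boldsymbol{E}_{LL}(\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\theta} \end{pmatrix}, \tag{44}$$

$$\mathsf{E}_\phi = \begin{pmatrix} \boldsymbol{E}_{00}(\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\phi} & \cdots & \boldsymbol{E}_{00}(\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\phi} \\ \vdots & & \vdots \\ \boldsymbol{E}_{LL}(\hat{\mathbf r}_1)\cdot\hat{\boldsymbol\phi} & \cdots & \boldsymbol{E}_{LL}(\hat{\mathbf r}_k)\cdot\hat{\boldsymbol\phi} \end{pmatrix}. \tag{45}$$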

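Using the definitions in Eqs. (37), (38), and (42), problem P3 is stated as

$$\arg\min_{\tilde{\mathbf v}^{r_s}} \left\| \mathsf{E}^T \tilde{\mathbf v}^{r_s} - \mathsf{V}' \right\|^2, \tag{46}$$

and easily seen to be solved by

$$\tilde{\mathbf v}^{r_s} = \left( \mathsf{E}\mathsf{E}^T \right)^{-1} \mathsf{E}\,\mathsf{V}' \qquad \text{(solution to problem P3).} \tag{47}$$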

3.1.4 Problem P4: Vector Gradient Data, Scalar-Harmonic Potential Coefficients, and Different Altitudes

Finally, in order to transform the equal-altitude gradient-vector problem P3 into a downward-continuation, gradient data to scalar potential problem P4, we introduce the upward-transformation matrix $\mathbf{B}$. This diagonal matrix contains the effect of harmonic continuation and differentiation (see Eqs. 22 and 25) and has the elements

$$B_{lm,l'm'} = -\sqrt{(l+1)(2l+1)}\; r_e^{-1} \left(\frac{r_s}{r_e}\right)^{-l-2} \delta_{ll'}\,\delta_{mm'}. \tag{48}$$

Problem P4, estimating the spherical-harmonic coefficients $u^{r_e}_{lm}$ of the potential on Earth’s surface $\Omega_{r_e}$, from gradient data collected at satellite altitude on $\Omega_{r_s}$, can hence be formulated as

$$\arg\min_{\tilde{\mathbf u}^{r_e}} \left\| \mathsf{E}^T \mathbf{B}\, \tilde{\mathbf u}^{r_e} - \mathsf{V}' \right\|^2, \tag{49}$$

with the solution

$$\tilde{\mathbf u}^{r_e} = \mathbf{B}^{-1} \left( \mathsf{E}\mathsf{E}^T \right)^{-1} \mathsf{E}\,\mathsf{V}' \qquad \text{(solution 1 to problem P4).} \tag{50}$$

For all of the solutions listed thus far in Eqs. (31), (36), (47), and (50), we require at least as many data points as there are coefficients to estimate, $k \ge (L+1)^2$, or $3k \ge (L+1)^2$ for the vectorial case; otherwise, the matrices $(\mathsf{Y}\mathsf{Y}^T)$ and $(\mathsf{E}\mathsf{E}^T)$ will not be invertible. If we have data distributed only over a certain concentration region $R$, the matrices $(\mathsf{Y}\mathsf{Y}^T)$ or $(\mathsf{E}\mathsf{E}^T)$ will usually be badly conditioned and require regularization (Simons and Dahlen 2006). Furthermore, we have sidestepped issues of bias due to making bandlimited estimates (Eqs. 29, 34, and 37) from intrinsically wideband field observations (27) and (41). Lastly, we have so far blithely ignored any observational noise. For the more realistic practical cases of the problems P2 and P4, we will develop regularization methods, in Sects. 5 and 6, that take the target region $R$ explicitly into account and whose performance we assess using detailed statistical considerations. Before doing so, however, we first establish some more notation.
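The loss of conditioning under regional data coverage can be seen already at very low bandwidth. The sketch below compares the condition number of $\mathsf{Y}\mathsf{Y}^T$ for points scattered over the whole sphere with that for points confined to a polar cap of 10° radius. It assumes NumPy, $L = 1$, and the same illustrative real harmonics `real_sh_L1` (ordering and signs are choices made here); the cap size, point count, and seed are arbitrary.

```python
import numpy as np

def real_sh_L1(theta, phi):
    """Rows Y00, Y1-1, Y10, Y11 (real, orthonormal over the unit sphere); assumed convention."""
    c0 = 0.5 / np.sqrt(np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.vstack([
        c0 * np.ones_like(theta),
        c1 * np.sin(theta) * np.sin(phi),
        c1 * np.cos(theta),
        c1 * np.sin(theta) * np.cos(phi),
    ])

def normal_matrix_condition(cos_min, k=400, seed=1):
    """Condition number of Y Y^T for k random points with cos(colatitude) in [cos_min, 1]."""
    rng = np.random.default_rng(seed)
    theta = np.arccos(rng.uniform(cos_min, 1.0, k))
    phi = rng.uniform(0.0, 2.0 * np.pi, k)
    Y = real_sh_L1(theta, phi)
    return np.linalg.cond(Y @ Y.T)

cond_global = normal_matrix_condition(-1.0)                        # whole sphere
cond_cap = normal_matrix_condition(np.cos(np.radians(10.0)))      # 10-degree polar cap
print(cond_global, cond_cap)
```

Inside a small cap, $Y_{00}$ and $Y_{10}$ are nearly proportional, so the normal matrix becomes close to singular even though $k$ far exceeds $(L+1)^2$.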


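3.2 Continuous Formulation and Bandwidth Considerations

Let us define the $(L+1)^2$-dimensional vector $\mathcal{Y}$ to contain the spherical-harmonic functions $Y_{lm}$ up to a bandlimit $L$,

$$\mathcal{Y} = \big( Y_{00} \;\cdots\; Y_{LL} \big)^T. \tag{51}$$

In the same manner, we shall define the vector of all spherical-harmonic functions up to infinite bandwidth as, simply, $\hat{\mathcal{Y}}$. The symbol $\hat{\mathcal{Y}}_{>L}$ will denote the vector of spherical harmonics with degrees higher than $L$. Using this notation, we write the column vector with the complete basis

$$\hat{\mathcal{Y}} = \begin{pmatrix} \mathcal{Y} \\ \hat{\mathcal{Y}}_{>L} \end{pmatrix}. \tag{52}$$

Up to a certain bandlimit $L$, we can describe the spherical-harmonic coefficients of a potential field $V(r_s\hat{\mathbf r})$ on the sphere $\Omega_{r_s}$, whose estimates we encountered previously in Eq. (29), as

$$\mathbf{u}^{r_s} = \int_\Omega \mathcal{Y}\, V(r_s\hat{\mathbf r})\, d\Omega, \tag{53}$$

and their infinite-dimensional counterparts will be

$$\hat{\mathbf u}^{r_s} = \int_\Omega \hat{\mathcal{Y}}\, V(r_s\hat{\mathbf r})\, d\Omega, \tag{54}$$

$$\hat{\mathbf u}^{r_s}_{>L} = \int_\Omega \hat{\mathcal{Y}}_{>L}\, V(r_s\hat{\mathbf r})\, d\Omega. \tag{55}$$

With these definitions, we rewrite a representation similar to Eq. (27), for a potential field that is not bandlimited, as

$$V(r_s\hat{\mathbf r}) = \hat{\mathcal{Y}}^T \hat{\mathbf u}^{r_s} = \big( \mathcal{Y}^T \;\; \hat{\mathcal{Y}}^T_{>L} \big) \begin{pmatrix} \mathbf{u}^{r_s} \\ \hat{\mathbf u}^{r_s}_{>L} \end{pmatrix} = \mathcal{Y}^T \mathbf{u}^{r_s} + \hat{\mathcal{Y}}^T_{>L}\, \hat{\mathbf u}^{r_s}_{>L}, \tag{56}$$

and for future reference, we also write the equivalent of Eq. (21), using Eq. (32), in broadband and bandlimited form as

$$\partial_r V(r_s\hat{\mathbf r}) = \hat{\mathcal{Y}}^T \hat{\mathbf A}\, \hat{\mathbf u}^{r_e} = \mathcal{Y}^T \mathbf{A}\, \mathbf{u}^{r_e} + \hat{\mathcal{Y}}^T_{>L}\, \hat{\mathbf A}_{>L}\, \hat{\mathbf u}^{r_e}_{>L}. \tag{57}$$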

O >L together make up the infiniteThe matrix A and its infinite-dimensional complement A O Equation (56) contains an estimation problem that, assuming continuity of dimensional matrix A. global data R coverage, is solved by Eq. (54), owing to the orthonormality of the Ylm over the entire sphere,  Ylm Yl 0 m0 d  D ıl l 0 ımm0 . For complete data coverage, Eq. (53) solves the bandlimited portion of the estimation problem, and we can see that in that case Eq. (53) is indeed the continuous equivalent of Eq. (31), as pointed out also in the chapter “Scalar and Vector Slepian Functions, Spherical Signal Estimation and Spectral Analysis” by Simons and Plattner elsewhere in this book.


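For the gradient-vector spherical harmonics, we define the $(L+1)^2$-dimensional vector of functions containing the $\boldsymbol{E}_{lm}$ up to a certain bandlimit $L$ as

$$\mathcal{E} = \big( \boldsymbol{E}_{00} \;\cdots\; \boldsymbol{E}_{LL} \big)^T. \tag{58}$$

Using a similar notation as for the scalar harmonics, the infinite-dimensional vector containing all gradient-vector spherical harmonics to infinite bandlimit will be $\hat{\mathcal{E}}$, and the infinite-dimensional vector with all gradient-vector spherical harmonics for degrees $l > L$ will be $\hat{\mathcal{E}}_{>L}$. The column vector with the complete vector basis is thus

$$\hat{\mathcal{E}} = \begin{pmatrix} \mathcal{E} \\ \hat{\mathcal{E}}_{>L} \end{pmatrix}. \tag{59}$$

Up to a given bandwidth $L$, we can calculate the gradient-vector spherical-harmonic coefficients of a gradient field $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ at satellite altitude, previously known in the form of Eq. (24), via the expression

$$\mathbf{v}^{r_s} = \int_\Omega \mathcal{E}\cdot\boldsymbol{\nabla} V(r_s\hat{\mathbf r})\, d\Omega. \tag{60}$$

The corresponding infinite-dimensional vectors of gradient-vector spherical-harmonic coefficients are

$$\hat{\mathbf v}^{r_s} = \int_\Omega \hat{\mathcal{E}}\cdot\boldsymbol{\nabla} V(r_s\hat{\mathbf r})\, d\Omega, \tag{61}$$

$$\hat{\mathbf v}^{r_s}_{>L} = \int_\Omega \hat{\mathcal{E}}_{>L}\cdot\boldsymbol{\nabla} V(r_s\hat{\mathbf r})\, d\Omega. \tag{62}$$

Our definition of the inner product between a vector of vector-valued functions and a vector-valued function is

$$\mathcal{E}\cdot\boldsymbol{\nabla} V = \begin{pmatrix} \boldsymbol{E}_{00}\cdot\boldsymbol{\nabla} V \\ \vdots \\ \boldsymbol{E}_{LL}\cdot\boldsymbol{\nabla} V \end{pmatrix}. \tag{63}$$

In the same way, we define the outer product between two vectors of vector-valued functions as

$$\mathcal{E}\cdot\mathcal{E}^T = \begin{pmatrix} \boldsymbol{E}_{00}\cdot\boldsymbol{E}_{00} & \cdots & \boldsymbol{E}_{00}\cdot\boldsymbol{E}_{LL} \\ \vdots & & \vdots \\ \boldsymbol{E}_{LL}\cdot\boldsymbol{E}_{00} & \cdots & \boldsymbol{E}_{LL}\cdot\boldsymbol{E}_{LL} \end{pmatrix}. \tag{64}$$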

We can represent the non-bandlimited gradient function $\boldsymbol{\nabla} V(r_s\hat{\mathbf r})$ via its gradient-vector spherical-harmonic coefficients

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}) = \hat{\mathcal{E}}^T \hat{\mathbf v}^{r_s} = \big( \mathcal{E}^T \;\; \hat{\mathcal{E}}^T_{>L} \big) \begin{pmatrix} \mathbf{v}^{r_s} \\ \hat{\mathbf v}^{r_s}_{>L} \end{pmatrix} = \mathcal{E}^T \mathbf{v}^{r_s} + \hat{\mathcal{E}}^T_{>L}\, \hat{\mathbf v}^{r_s}_{>L}, \tag{65}$$

and, via Eq. (48) as in Eq. (57), the equivalent of Eq. (22),

$$\boldsymbol{\nabla} V(r_s\hat{\mathbf r}) = \hat{\mathcal{E}}^T \hat{\mathbf B}\, \hat{\mathbf u}^{r_e} = \mathcal{E}^T \mathbf{B}\, \mathbf{u}^{r_e} + \hat{\mathcal{E}}^T_{>L}\, \hat{\mathbf B}_{>L}\, \hat{\mathbf u}^{r_e}_{>L}. \tag{66}$$

The matrix $\mathbf{B}$ and its infinite-dimensional complement $\hat{\mathbf B}_{>L}$ together make up the infinite-dimensional matrix $\hat{\mathbf B}$. Equation (65) again contains an estimation problem solved by Eq. (61) in the scenario of noiseless, continuous, and complete data coverage, as can be seen from the orthonormality relation $\int_\Omega \boldsymbol{E}_{lm}\cdot\boldsymbol{E}_{l'm'}\, d\Omega = \delta_{ll'}\,\delta_{mm'}$. As with the scalar problem described above, the bandlimited coefficient set (60) is approximated by the discrete solution (47) in the case of complete data coverage.

4 Scalar and Vector Spherical Slepian Functions

In this section, we summarize the derivation and properties of scalar spherical Slepian functions developed by Simons et al. (2006) and further discussed in the chapter “Scalar and Vector Slepian Functions, Spherical Signal Estimation, and Spectral Analysis” by Simons and Plattner in this book. The scalar Slepian functions will play a key role in the solution to problem P2, the estimation of scalar spherical-harmonic coefficients of the potential on Earth’s surface from radial-component data at satellite altitude, in a spatially localized setting. To be able to consider spatial localization in the context of problem P4, the estimation of the scalar potential on Earth’s surface from vectorial gradient data at altitude, we introduce a special case of the vectorial Slepian functions constructed by Plattner and Simons (2014) and discussed in the same chapter.

4.1 Scalar Slepian Functions

We design functions that are bandlimited to a maximum spherical-harmonic degree $L$ but at the same time spatially concentrated inside a target region $R$. Via optimization of a local energy criterion, we obtain a new basis of functions in the sense of Slepian (1983), as a particular linear combination of spherical harmonics. Unlike the latter, which are global functions indexed by their degree and order, the “Slepian” functions can be sorted according to their energy concentration inside of the target region. Local approximations to scalar functions can be made from the first few well-concentrated Slepian functions, as we will be needing for the solution to problem P2, where the spherical-harmonic coefficients of a potential field are determined from radial data only. Scalar spherical Slepian functions $G$ are bandlimited spherical-harmonic expansions

$$G(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} g_{lm}\, Y_{lm}(\hat{\mathbf r}) = \mathcal{Y}^T \mathbf{g} \tag{67}$$

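that are constructed by solving the quadratic optimization problem

$$\lambda = \max_{G} \frac{\displaystyle\int_R G^2(\hat{\mathbf r})\, d\Omega}{\displaystyle\int_\Omega G^2(\hat{\mathbf r})\, d\Omega} = \max_{\mathbf g} \frac{\mathbf{g}^T \mathbf{D}\, \mathbf{g}}{\mathbf{g}^T \mathbf{g}}, \tag{68}$$

for the expansion coefficients $g_{lm}$ in the $(L+1)^2$-dimensional column vectors

$$\mathbf{g} = \big( g_{00} \;\cdots\; g_{LL} \big)^T, \tag{69}$$

with $\mathcal{Y}$ as in Eq. (51). The symmetric positive-definite kernel matrix $\mathbf{D}$ is defined by its elements

$$D_{lm,l'm'} = \int_R Y_{lm}(\hat{\mathbf r})\, Y_{l'm'}(\hat{\mathbf r})\, d\Omega, \qquad \mathbf{D} = \int_R \mathcal{Y}\mathcal{Y}^T\, d\Omega. \tag{70}$$

The stationary solutions of Eq. (68) are the eigenvectors $\mathbf{g}_1, \ldots, \mathbf{g}_\alpha, \ldots, \mathbf{g}_{(L+1)^2}$ that constitute an orthogonal coefficient matrix

$$\mathbf{G} = \big( \mathbf{g}_1 \;\cdots\; \mathbf{g}_\alpha \;\cdots\; \mathbf{g}_{(L+1)^2} \big), \qquad \mathbf{G}\mathbf{G}^T = \mathbf{G}^T\mathbf{G} = \mathbf{I} = \int_\Omega \mathcal{Y}\mathcal{Y}^T\, d\Omega, \tag{71}$$

defined by the eigenvalue problem

$$\mathbf{D}\,\mathbf{G} = \mathbf{G}\,\boldsymbol{\Lambda}, \qquad \mathbf{D} = \mathbf{G}\,\boldsymbol{\Lambda}\,\mathbf{G}^T, \tag{72}$$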

with the eigenvalues $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_{(L+1)^2})$ the concentration values of Eq. (68), many of which are near one, and many near zero. We index the individual elements $g_{lm,\alpha} \in \mathbf{G}$ by $\alpha = 1, \ldots, (L+1)^2$ and order them according to their eigenvalues in decreasing order $1 > \lambda_1 \ge \cdots \ge \lambda_{(L+1)^2} > 0$, to obtain a global basis for the space of spherical functions with bandlimit $L$, given by

$$G_\alpha(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} g_{lm,\alpha}\, Y_{lm}(\hat{\mathbf r}) = \mathcal{Y}^T \mathbf{g}_\alpha. \tag{73}$$
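To make the construction in Eqs. (70) to (73) concrete, the sketch below assembles the kernel $\mathbf{D}$ for a polar cap by numerical quadrature and diagonalizes it. It assumes NumPy, a bandwidth $L = 1$, a cap of 40° radius, and the illustrative real harmonics `real_sh_L1` (ordering and signs are choices made here); none of these values come from the chapter.

```python
import numpy as np

def real_sh_L1(theta, phi):
    """Rows Y00, Y1-1, Y10, Y11 (real, orthonormal over the unit sphere); assumed convention."""
    c0 = 0.5 / np.sqrt(np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.vstack([
        c0 * np.ones_like(theta),
        c1 * np.sin(theta) * np.sin(phi),
        c1 * np.cos(theta),
        c1 * np.sin(theta) * np.cos(phi),
    ])

def cap_kernel(cap_deg, n_theta=16, n_phi=16):
    """Localization kernel D of Eq. (70) for a polar cap, by Gauss-Legendre quadrature
    in cos(colatitude) and an equispaced rule in longitude."""
    x, w = np.polynomial.legendre.leggauss(n_theta)       # nodes and weights on [-1, 1]
    x_min = np.cos(np.radians(cap_deg))
    xc = 0.5 * (1.0 - x_min) * x + 0.5 * (1.0 + x_min)    # map nodes to [cos(cap), 1]
    wc = 0.5 * (1.0 - x_min) * w
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    X, P = np.meshgrid(xc, phi, indexing="ij")
    W = np.repeat(wc[:, None], n_phi, axis=1) * (2.0 * np.pi / n_phi)
    Y = real_sh_L1(np.arccos(X).ravel(), P.ravel())
    return (Y * W.ravel()) @ Y.T

cap_deg = 40.0
D = cap_kernel(cap_deg)
lam, G = np.linalg.eigh(D)              # eigh returns eigenvalues in ascending order
order = np.argsort(lam)[::-1]           # reorder so the best-concentrated comes first
lam, G = lam[order], G[:, order]
print(lam)
```

The eigenvalues sum to $(L+1)^2$ times the fractional area of the cap, which follows from the trace of $\mathbf{D}$ and provides a quick sanity check of the quadrature.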

We normalize the different eigenvectors $\mathbf{g}_\alpha$ so that the newly constructed basis $G_1, \ldots, G_{(L+1)^2}$ remains orthonormal over the entire sphere $\Omega$, but it is now also orthogonal over the region $R$,

$$\int_\Omega G_\alpha\, G_\beta\, d\Omega = \delta_{\alpha\beta}, \qquad \int_R G_\alpha\, G_\beta\, d\Omega = \lambda_\alpha\, \delta_{\alpha\beta}. \tag{74}$$

To further the notation introduced in and after (51), we now define the $(L+1)^2$-dimensional function vector containing all Slepian functions, for a bandlimit $L$ and a region $R$, to be

$$\mathcal{G} = \big( G_1 \;\cdots\; G_{(L+1)^2} \big)^T = \mathbf{G}^T \mathcal{Y}. \tag{75}$$

Identifying the Slepian transformation matrix $\mathbf{G}$ in this way, we can then write the representation of a bandlimited function $V(\hat{\mathbf r})$ by involving the spherical-harmonic expansion coefficients $\mathbf{u}$, or the Slepian-function expansion coefficients $\mathbf{s} = \mathbf{G}^T \mathbf{u}$, in the equivalent forms

$$V(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} u_{lm}\, Y_{lm}(\hat{\mathbf r}) = \mathcal{Y}^T \mathbf{u} = \mathcal{Y}^T \mathbf{G}\mathbf{G}^T \mathbf{u} = \mathcal{G}^T \mathbf{s} = \sum_{\alpha=1}^{(L+1)^2} s_\alpha\, G_\alpha(\hat{\mathbf r}). \tag{76}$$

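Writing the $(L+1)^2 \times J$-dimensional matrix containing the $(L+1)^2$ spherical-harmonic coefficients of the $J$ best-concentrated Slepian functions $\mathbf{G}_J$ and its $(L+1)^2 \times [(L+1)^2 - J]$-dimensional complement $\mathbf{G}_{>J}$ as

$$\mathbf{G}_J = \big( \mathbf{g}_1 \;\cdots\; \mathbf{g}_J \big), \qquad \mathbf{G}_{>J} = \big( \mathbf{g}_{J+1} \;\cdots\; \mathbf{g}_{(L+1)^2} \big), \tag{77}$$

the $J$-dimensional vector of functions containing the $J$ best-concentrated bandlimited Slepian functions $\mathcal{G}_J$ and its complement $\mathcal{G}_{>J}$ as

$$\mathcal{G}_J = \mathbf{G}_J^T \mathcal{Y} = \big( G_1 \;\cdots\; G_J \big)^T, \qquad \mathcal{G}_{>J} = \mathbf{G}_{>J}^T \mathcal{Y}, \tag{78}$$

and denoting the $J \times J$-dimensional diagonal matrix containing the $J$ largest concentration ratios by $\boldsymbol{\Lambda}_J$, Eqs. (70), (72), and (78) together imply that

$$\boldsymbol{\Lambda}_J = \mathrm{diag}(\lambda_1, \ldots, \lambda_J) = \int_R \mathcal{G}_J\, \mathcal{G}_J^T\, d\Omega. \tag{79}$$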

The orthonormality of the eigenvectors $\mathbf{g}_1, \ldots, \mathbf{g}_{(L+1)^2}$ in Eqs. (71) and (72) guarantees that $\mathbf{G}_J^T \mathbf{G}_J = \mathbf{I}_{J\times J}$. In contrast, the matrix $\mathbf{G}_J \mathbf{G}_J^T$ is a $(L+1)^2 \times (L+1)^2$-dimensional noninvertible projection, $(\mathbf{G}_J \mathbf{G}_J^T)^2 = \mathbf{G}_J \mathbf{G}_J^T \mathbf{G}_J \mathbf{G}_J^T = \mathbf{G}_J \mathbf{G}_J^T$. The Slepian functions allow for a constructive approximation of bandlimited functions of the kind $V(\hat{\mathbf r})$, locally within the target region $R$, by restricting the expansion (76) to the $J$ best-concentrated Slepian functions (Simons et al. 2009; Beggan et al. 2013),

$$V(\hat{\mathbf r}) \approx \sum_{\alpha=1}^{J} s_\alpha\, G_\alpha(\hat{\mathbf r}) = \mathcal{G}_J^T \mathbf{s}_J = \mathcal{Y}^T \mathbf{G}_J \mathbf{G}_J^T \mathbf{u}, \qquad \hat{\mathbf r} \in R. \tag{80}$$

The greater the number of terms $J$, the less well localized the approximation, but the smaller the approximation error. Instead of spatially concentrating spectrally limited functions, we can also spectrally concentrate spatially limited functions. The spacelimited Slepian functions can be obtained by restricting the bandlimited Slepian functions to the space domain of interest:

$$\hat{G}_\alpha(\hat{\mathbf r}) = \begin{cases} G_\alpha(\hat{\mathbf r}) & \text{if } \hat{\mathbf r} \in R, \\ 0 & \text{if } \hat{\mathbf r} \in \Omega \setminus R. \end{cases} \tag{81}$$
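The algebraic properties of the truncation in Eq. (80) do not depend on the particular region and can be checked with any orthogonal matrix. The sketch below uses the eigenvectors of a random symmetric matrix as a stand-in for $\mathbf{G}$; this stand-in, the dimension, and the truncation level are assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)
n, J = 9, 3                           # n = (L+1)^2 for L = 2; J is the truncation level
M = rng.standard_normal((n, n))
_, G = np.linalg.eigh(M + M.T)        # orthonormal eigenvectors of a stand-in symmetric kernel
GJ = G[:, :J]                         # J orthonormal columns, playing the role of G_J

P = GJ @ GJ.T                         # the (L+1)^2 x (L+1)^2 matrix G_J G_J^T
u = rng.standard_normal(n)            # made-up spherical-harmonic coefficients
s_J = GJ.T @ u                        # the J retained Slepian coefficients
u_J = GJ @ s_J                        # harmonic coefficients of the truncated expansion (80)

print(np.linalg.matrix_rank(P))
```

The projection has rank $J$, so it cannot be inverted, while applying it twice changes nothing.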


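The spherical-harmonic coefficients of the Slepian functions $\hat{G}_\alpha = \hat{\mathcal{Y}}^T \hat{\mathbf g}_\alpha$, using the notation of Eq. (52), form the infinite-dimensional vector

$$\hat{\mathbf g}_\alpha = \big( \hat g_{00,\alpha} \;\cdots\; \hat g_{LL,\alpha} \;\cdots \big)^T, \tag{82}$$

and thus, using the orthonormality of the spherical harmonics and Eqs. (81) and (73), they are given by

$$\hat{\mathbf g}_\alpha = \int_\Omega \hat{\mathcal{Y}}\, \hat{G}_\alpha\, d\Omega = \int_R \hat{\mathcal{Y}}\, \hat{G}_\alpha\, d\Omega = \int_R \hat{\mathcal{Y}}\, G_\alpha\, d\Omega = \left( \int_R \hat{\mathcal{Y}}\, \mathcal{Y}^T\, d\Omega \right) \mathbf{g}_\alpha = \hat{\mathbf D}_L\, \mathbf{g}_\alpha, \tag{83}$$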

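where we have defined the $\infty \times (L+1)^2$-dimensional rectangular counterpart of the localization kernel (70), namely,

$$\hat{\mathbf D}_L = \int_R \hat{\mathcal{Y}}\, \mathcal{Y}^T\, d\Omega. \tag{84}$$

To prepare for what is yet to come, in Sect. 5.2, we now also introduce another rectangular kernel,

$$\hat{\mathbf D}_{>L,L} = \int_R \hat{\mathcal{Y}}_{>L}\, \mathcal{Y}^T\, d\Omega, \tag{85}$$

an infinite-dimensional vector containing the spherical-harmonic coefficients of $\hat{\mathbf g}_\alpha$ for degrees higher than $L$,

$$\hat{\mathbf g}_{>L,\alpha} = \big( \hat g_{L+1\,-L-1,\alpha} \;\; \hat g_{L+1\,-L,\alpha} \;\cdots \big)^T, \tag{86}$$

and the $\infty \times J$-dimensional matrix containing the expansion coefficients $\hat{\mathbf g}_{>L,\alpha}$, for $\alpha = 1, \ldots, J$, as

$$\hat{\mathbf G}_{>L,J} = \big( \hat{\mathbf g}_{>L,1} \;\cdots\; \hat{\mathbf g}_{>L,J} \big) = \hat{\mathbf D}_{>L,L}\, \mathbf{G}_J. \tag{87}$$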

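The vector of coefficients $\hat{\mathbf g}_{>L,\alpha}$ defined in Eq. (86) spectrally truncates the spacelimited Slepian function $\hat{G}_\alpha$ to a function

$$\hat{G}_{>L,\alpha} = \sum_{l=L+1}^{\infty}\sum_{m=-l}^{l} \hat g_{lm,\alpha}\, Y_{lm} = \hat{\mathcal{Y}}^T_{>L}\, \hat{\mathbf g}_{>L,\alpha}, \tag{88}$$

the $\alpha$th element of the vector of functions $\hat{\mathcal{G}}_{>L}$, and finally, we also define the $J$-dimensional vector of functions with contributions confined to the degrees higher than $L$, using Eqs. (87), (85), and (78) again, in the equivalent formulations

$$\hat{\mathcal{G}}_{>L,J} = \big( \hat{G}_{>L,1} \;\cdots\; \hat{G}_{>L,J} \big)^T = \hat{\mathbf G}^T_{>L,J}\, \hat{\mathcal{Y}}_{>L} = \mathbf{G}_J^T\, \hat{\mathbf D}^T_{>L,L}\, \hat{\mathcal{Y}}_{>L} = \left( \int_R \mathcal{G}_J\, \hat{\mathcal{Y}}^T_{>L}\, d\Omega \right) \hat{\mathcal{Y}}_{>L}. \tag{89}$$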


4.2 Gradient-Vector Slepian Functions

Similarly to the scalar Slepian functions in Sect. 4.1, we can construct Slepian functions from vector spherical harmonics, as described by Plattner and Simons (2014) and in the chapter “Scalar and Vector Slepian Functions, Spherical Signal Estimation, and Spectral Analysis” by Simons and Plattner in this book. However, in Sect. 2.3, we showed that the estimation of a scalar potential field from vectorial data only depends on the gradient-vector spherical harmonics $\boldsymbol{E}_{lm}$ defined in Sect. 2.2. In the following, we will therefore construct vector Slepian functions from gradient-vector spherical harmonics $\boldsymbol{E}_{lm}$ only. These new so-called gradient-vector Slepian functions will be useful for problem P4, the estimation of a scalar potential from vectorial data. We construct the gradient-vector Slepian functions

$$\boldsymbol{H}(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} h_{lm}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}) = \mathcal{E}^T \mathbf{h}, \tag{90}$$

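as the stationary solutions of the maximization problem

$$\sigma = \max_{\boldsymbol H} \frac{\displaystyle\int_R \boldsymbol{H}(\hat{\mathbf r})\cdot\boldsymbol{H}(\hat{\mathbf r})\, d\Omega}{\displaystyle\int_\Omega \boldsymbol{H}(\hat{\mathbf r})\cdot\boldsymbol{H}(\hat{\mathbf r})\, d\Omega} = \max_{\mathbf h} \frac{\mathbf{h}^T \mathbf{K}\, \mathbf{h}}{\mathbf{h}^T \mathbf{h}}, \tag{91}$$

for the expansion coefficients $h_{lm}$ in the $(L+1)^2$-dimensional vector

$$\mathbf{h} = \big( h_{00} \;\cdots\; h_{LL} \big)^T, \tag{92}$$

where $\mathcal{E}$ was defined in Eq. (58). The symmetric positive-definite matrix $\mathbf{K}$ is given by its elements

$$K_{lm,l'm'} = \int_R \boldsymbol{E}_{lm}(\hat{\mathbf r})\cdot\boldsymbol{E}_{l'm'}(\hat{\mathbf r})\, d\Omega, \qquad \mathbf{K} = \int_R \mathcal{E}\cdot\mathcal{E}^T\, d\Omega, \tag{93}$$

using Eq. (64). The stationary solutions of Eq. (91) are the eigenvectors $\mathbf{h}_1, \ldots, \mathbf{h}_\alpha, \ldots, \mathbf{h}_{(L+1)^2}$ in the matrix

$$\mathbf{H} = \big( \mathbf{h}_1 \;\cdots\; \mathbf{h}_\alpha \;\cdots\; \mathbf{h}_{(L+1)^2} \big), \qquad \mathbf{H}\mathbf{H}^T = \mathbf{H}^T\mathbf{H} = \mathbf{I} = \int_\Omega \mathcal{E}\cdot\mathcal{E}^T\, d\Omega, \tag{94}$$

defined by the eigenvalue problem

$$\mathbf{K}\,\mathbf{H} = \mathbf{H}\,\boldsymbol{\Sigma}, \qquad \mathbf{K} = \mathbf{H}\,\boldsymbol{\Sigma}\,\mathbf{H}^T, \tag{95}$$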

with the eigenvalues $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_1, \ldots, \sigma_{(L+1)^2})$ the concentration values of Eq. (91), of which most are near unity or near zero. We index and order the $h_{lm,\alpha} \in \mathbf{H}$ according to their eigenvalues in decreasing order such that $1 > \sigma_1 \ge \cdots \ge \sigma_{(L+1)^2} > 0$ to obtain a concentration-ordered basis of gradient-vector functions bandlimited to $L$ given by

$$\boldsymbol{H}_\alpha(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} h_{lm,\alpha}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}) = \mathcal{E}^T \mathbf{h}_\alpha. \tag{96}$$

See Fig. 2 for a three-component space-domain example. We normalize the eigenvectors $\mathbf{h}_\alpha$ of Eq. (95) so that the new basis $\boldsymbol{H}_1, \ldots, \boldsymbol{H}_{(L+1)^2}$ is orthonormal over the entire sphere $\Omega$ and orthogonal over the region $R$,

$$\int_\Omega \boldsymbol{H}_\alpha\cdot\boldsymbol{H}_\beta\, d\Omega = \delta_{\alpha\beta}, \qquad \int_R \boldsymbol{H}_\alpha\cdot\boldsymbol{H}_\beta\, d\Omega = \sigma_\alpha\, \delta_{\alpha\beta}. \tag{97}$$

In the notation of Eq. (58) and beyond, the vector containing all gradient-vector Slepian functions for bandlimit $L$ and region $R$ is given by

$$\mathcal{H} = \big( \boldsymbol{H}_1 \;\cdots\; \boldsymbol{H}_{(L+1)^2} \big)^T = \mathbf{H}^T \mathcal{E}. \tag{98}$$

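The transformation of a bandlimited gradient-vector function into its equivalent gradient-vector Slepian-function expansion happens via the gradient-vector Slepian transformation matrix $\mathbf{H}$ as $\mathbf{t} = \mathbf{H}^T \mathbf{v}$ and

$$\boldsymbol{\nabla} V(\hat{\mathbf r}) = \sum_{l=0}^{L}\sum_{m=-l}^{l} v_{lm}\, \boldsymbol{E}_{lm}(\hat{\mathbf r}) = \mathcal{E}^T \mathbf{v} = \mathcal{E}^T \mathbf{H}\mathbf{H}^T \mathbf{v} = \mathcal{H}^T \mathbf{t} = \sum_{\alpha=1}^{(L+1)^2} t_\alpha\, \boldsymbol{H}_\alpha(\hat{\mathbf r}). \tag{99}$$

We introduce the $(L+1)^2 \times J$-dimensional matrix containing the $(L+1)^2$ gradient-vector spherical-harmonic coefficients for each of the $J$ best-concentrated gradient-vector Slepian functions

$$\mathbf{H}_J = \big( \mathbf{h}_1 \;\cdots\; \mathbf{h}_J \big), \tag{100}$$

the $J$-dimensional vector of vector-valued functions containing the $J$ best-concentrated gradient-vector Slepian functions

$$\mathcal{H}_J = \mathbf{H}_J^T \mathcal{E} = \big( \boldsymbol{H}_1 \;\cdots\; \boldsymbol{H}_J \big)^T, \tag{101}$$

and the $J \times J$-dimensional diagonal matrix containing the $J$ largest concentration ratios

$$\boldsymbol{\Sigma}_J = \mathrm{diag}(\sigma_1, \ldots, \sigma_J) = \int_R \mathcal{H}_J\cdot\mathcal{H}_J^T\, d\Omega, \tag{102}$$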

where the last equality is a consequence of Eqs. (93), (95), and (101). The orthonormality of the $\mathbf{h}_1, \ldots, \mathbf{h}_{(L+1)^2}$ in Eqs. (94) and (95) ensures that $\mathbf{H}_J^T \mathbf{H}_J = \mathbf{I}_{J\times J}$, but the $(L+1)^2 \times (L+1)^2$-dimensional projection matrix $\mathbf{H}_J \mathbf{H}_J^T$ is not invertible. A local approximation of the gradient function can be obtained from

$$\boldsymbol{\nabla} V(\hat{\mathbf r}) \approx \sum_{\alpha=1}^{J} t_\alpha\, \boldsymbol{H}_\alpha(\hat{\mathbf r}) = \mathcal{H}_J^T \mathbf{t}_J = \mathcal{E}^T \mathbf{H}_J \mathbf{H}_J^T \mathbf{v}, \qquad \hat{\mathbf r} \in R. \tag{103}$$

Fig. 2 The three vectorial components of the gradient-vector Slepian function $\boldsymbol{H}_1$ best concentrated to Africa at a maximum spherical-harmonic degree $L = 30$. Top panel shows the radial component $\boldsymbol{H}_1\cdot\hat{\mathbf r}$, center panel the tangential (colatitudinal) component $\boldsymbol{H}_1\cdot\hat{\boldsymbol\theta}$, and bottom panel the tangential (longitudinal) component $\boldsymbol{H}_1\cdot\hat{\boldsymbol\phi}$. The concentration coefficient is $\sigma = 0.999892$

For use in Sect. 6.2, we finally define the $\infty \times (L+1)^2$-dimensional matrix

$$\hat{\mathbf K}_{>L,L} = \int_R \hat{\mathcal{E}}_{>L}\cdot\mathcal{E}^T\, d\Omega, \tag{104}$$

and the $\infty \times J$-dimensional matrix $\hat{\mathbf H}_{E,>L,J} = \hat{\mathbf K}_{>L,L}\, \mathbf{H}_J$ using the notation in Eqs. (58) and (59). From this, we derive an expression for the $\boldsymbol{E}_{lm}$-component of the $J$ first spacelimited gradient-vector Slepian functions for degrees greater than $L$,

$$\hat{\mathcal{H}}_{E,>L,J} = \hat{\mathbf H}^T_{E,>L,J}\, \hat{\mathcal{E}}_{>L} = \mathbf{H}_J^T\, \hat{\mathbf K}^T_{>L,L}\, \hat{\mathcal{E}}_{>L} = \left( \int_R \mathcal{H}_J\cdot\hat{\mathcal{E}}^T_{>L}\, d\Omega \right) \hat{\mathcal{E}}_{>L}. \tag{105}$$


The analogy with the scalar Eq. (89) is only partial since the spacelimited versions of $\mathcal{H}$ also have nonvanishing components in the span of the $\boldsymbol{F}_{lm}$ of Eq. (17) and the $\boldsymbol{C}_{lm}$ of Eq. (13), not just the $\boldsymbol{E}_{lm}$; hence the more explicit notation.

5 Potential-Field Estimation from Radial Data Using Slepian Functions

With the scalar Slepian functions defined in Sect. 4.1, we can now formulate the solution to problem P2 as a localized bandlimited potential-field estimation problem, from noisy radial-derivative data at satellite altitude. More precisely, we will use the Slepian functions to localize the radial-field analysis at satellite altitude and then, in a second step, downward-transform the resulting spherical-harmonic coefficients using the notions developed in Sect. 2.3. As in the exposition of the classical spherical-harmonic-based solutions described in Sects. 3.1 and 3.2, we start with a description of the numerical estimation procedure based on pointwise data in Sect. 5.1 before proceeding to a functional formulation that will facilitate the statistical analysis of the performance of the methods, in Sect. 5.2. Throughout this section, we do not assume that the target signal $V(\hat{\mathbf r})$ is bandlimited, but a bandwidth $L$ does need to be chosen to form the approximation $\tilde{V}(\hat{\mathbf r})$. The bias that arises from this choice of bandlimitation will be discussed in Sect. 5.2.

5.1 Discrete Formulation and Truncated Solutions

From pointwise data values of the radial derivative of the potential at satellite altitude, given at the points $r_s\hat{\mathbf r}_1, \ldots, r_s\hat{\mathbf r}_k$, all inside the region $R$, and polluted by noise,

$$\mathsf{d}_r = \mathsf{V}'_r + \mathsf{n}_r, \tag{106}$$

we seek to estimate the bandlimited partial set of corresponding spherical-harmonic coefficients $\mathbf{u}^{r_e} = ( u^{r_e}_{00} \;\cdots\; u^{r_e}_{LL} )^T$ of the scalar potential $V$ on Earth’s surface $\Omega_{r_e}$, as in the original statement (35) of Problem P2. In Eq. (106), $\mathsf{V}'_r$ is defined as in Eq. (33), and $\mathsf{n}_r$ is a vector of noise values at the evaluation points.

As seen in Eq. (36), the solution to problem P2 involves the inversion of a “normal” matrix, $(\mathsf{Y}\mathsf{Y}^T)^{-1}$, that is reminiscent of the localization kernel in Eq. (70) and therefore has many near-zero eigenvalues, and the additional accounting for the effects of altitude via the term $\mathbf{A}^{-1}$, which will potentially unstably inflate the smallest-scale noise terms (Maus et al. 2006c). Instead of regularization by damping (in the spherical-harmonic basis), the approach we propose is based on truncation (in the Slepian basis). We focus on the estimation of the radial field at satellite altitude in a chosen target region $R$, by estimating only its $J$ best-concentrated Slepian coefficients. The hard truncation level $J$ is a regularization parameter whose value needs to be chosen based on signal-to-noise considerations and an optimality criterion, much as a proper damping parameter would (Kaula 1967; Simons and Dahlen 2006; Wieczorek and Simons 2007; Mallat 2008).

Define the $(L+1)^2 \times k$-dimensional matrix containing the Slepian functions $G_1, \ldots, G_{(L+1)^2}$ evaluated at the latitudinal and longitudinal locations of the data (on the unit sphere),

$$\mathsf{G} = \mathbf{G}^T \mathsf{Y}, \tag{107}$$


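where the scalar Slepian transformation matrix $\mathbf G$ is defined in Eq. (71). Note the change in (serif vs sans) type. The matrix $\mathsf Y$ contains the spherical harmonics evaluated at the data locations on the unit sphere, as in Eq. (28). Problem P2 is restated from its original formulation in Eq. (35) via a bandlimited Slepian transformation at altitude to

$$\arg\min_{\tilde{\mathsf u}^{r_e}} \big\|\mathsf Y^T\mathbf A\,\tilde{\mathsf u}^{r_e} - \mathsf d_r\big\|^2 = \arg\min_{\tilde{\mathsf u}^{r_e}} \big\|\mathsf Y^T\mathbf G\mathbf G^T\mathbf A\,\tilde{\mathsf u}^{r_e} - \mathsf d_r\big\|^2 = \mathbf A^{-1}\mathbf G\,\arg\min_{\tilde{\mathsf s}^{r_s}} \big\|\mathsf G^T\tilde{\mathsf s}^{r_s} - \mathsf d_r\big\|^2, \tag{108}$$

where we used the orthogonality $\mathbf G\mathbf G^T = \mathbf I$ and the definition Eq. (107) and identified the Slepian expansion coefficients at satellite altitude through transformation of the bandlimited vector (34) into the $(L+1)^2$-dimensional vector

$$\tilde{\mathsf s}^{r_s} = \mathbf G^T\mathbf A\,\tilde{\mathsf u}^{r_e}. \tag{109}$$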

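We invoke our regularization of only solving for the coefficients of the $J$ best-concentrated Slepian functions at satellite altitude by defining the $J \times k$-dimensional matrix containing the point evaluations of the $J$ best-concentrated Slepian functions on the unit sphere,

$$\mathsf G_J = \mathbf G_J^T\mathsf Y, \tag{110}$$

and by solving, instead of Eq. (108),

$$\arg\min_{\tilde{\mathsf s}_J^{r_s}} \big\|\mathsf G_J^T\tilde{\mathsf s}_J^{r_s} - \mathsf d_r\big\|^2 \tag{111}$$

for the $J$-dimensional vector $\tilde{\mathsf s}_J^{r_s}$ containing the coefficients of the approximation at satellite altitude in the bandlimited Slepian basis. When $J \le k$, we have the solution

$$\tilde{\mathsf s}_J^{r_s} = \big(\mathsf G_J\mathsf G_J^T\big)^{-1}\mathsf G_J\mathsf d_r, \tag{112}$$

which we then downward-transform to the $(L+1)^2$ spherical-harmonic coefficients $\tilde{\mathsf u}^{r_e}$ of the field on Earth’s surface $r_e$ as

$$\tilde{\mathsf u}^{r_e} = \mathbf A^{-1}\mathbf G_J\tilde{\mathsf s}_J^{r_s} = \mathbf A^{-1}\mathbf G_J\big(\mathsf G_J\mathsf G_J^T\big)^{-1}\mathsf G_J\mathsf d_r \quad \text{(solution 2 to noisy problem P2).} \tag{113}$$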
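The two steps of Eqs. (112) and (113) can be sketched in a few lines of code. The following is a minimal illustration in plain Python, not the implementation used for the results of this chapter. It assumes that the columns of the transformation matrix are already sorted from best to worst concentrated, and the function and variable names (`truncated_slepian_estimate`, `solve`, and so on) are hypothetical. A small Gaussian-elimination routine stands in for a library linear solver.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def truncated_slepian_estimate(G, Y, A, d, J):
    """Return (u, s): surface coefficients and Slepian coefficients at altitude.

    G : (L+1)^2 x (L+1)^2 transformation matrix, columns assumed sorted
        by decreasing concentration (assumption of this sketch)
    Y : (L+1)^2 x k spherical harmonics at the data locations
    A : (L+1)^2 x (L+1)^2 upward-transformation matrix
    d : k data values at altitude
    """
    GJ = [row[:J] for row in G]                  # first J columns of G
    GJ_eval = matmul(transpose(GJ), Y)           # J x k matrix of Eq. (110)
    normal = matmul(GJ_eval, transpose(GJ_eval)) # J x J normal matrix
    s = solve(normal, matvec(GJ_eval, d))        # Eq. (112)
    u = solve(A, matvec(GJ, s))                  # Eq. (113): apply A^{-1} G_J
    return u, s

# Toy example with made-up numbers: two coefficients, three data points.
G = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
A = [[2.0, 0.0], [0.0, 4.0]]
d = [2.0, -2.0, 0.0]        # noise-free data for u = (1, -0.5)
u_full, s_full = truncated_slepian_estimate(G, Y, A, d, J=2)
u_trunc, s_trunc = truncated_slepian_estimate(G, Y, A, d, J=1)
```

With $J$ equal to the full dimension, the noise-free toy data return the coefficients that generated them; with $J = 1$ only the first Slepian coefficient is estimated, which illustrates the truncation at the cost of a biased result.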

The numerical conditioning of the matrix $(\mathsf G_J\mathsf G_J^T)$ is determined by the truncation parameter $J$, and we require the inverse of the matrix $\mathbf A$ defined in Eq. (32). The resulting approximation $\tilde V(r_e\hat{\mathbf r})$ of the potential field $V(r_e\hat{\mathbf r})$ at any point of interest on $r_e$ can be calculated as

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal Y^T\tilde{\mathsf u}^{r_e} = \mathcal G_{\downarrow J}^T\big(\mathsf G_J\mathsf G_J^T\big)^{-1}\mathsf G_J\mathsf d_r = \mathcal G_{\downarrow J}^T\tilde{\mathsf s}_J^{r_s}, \tag{114}$$

where we have defined the vector of the $J$ best-concentrated (and its complement) downward-transformed scalar Slepian functions as

$$\mathcal G_{\downarrow J} = \mathbf G_J^T\mathbf A^{-1}\mathcal Y, \qquad \mathcal G_{\downarrow >J} = \mathbf G_{>J}^T\mathbf A^{-1}\mathcal Y, \tag{115}$$

an example of which is plotted in Fig. 3. We reserve for later use the vectors of upward-transformed Slepian functions,

$$\mathcal G_{\uparrow J} = \mathbf G_J^T\mathbf A\,\mathcal Y, \qquad \mathcal G_{\uparrow >J} = \mathbf G_{>J}^T\mathbf A\,\mathcal Y. \tag{116}$$

Fig. 3 Downward transformation of the tenth best-concentrated scalar Slepian function for Africa and a maximum spherical-harmonic degree $L = 30$. The right panel shows the concentrated scalar Slepian function $G_{10} = \mathcal Y^T\mathsf g_{10}$ for the radial component at an altitude of 500 km. The left panel shows the equivalent downward-transformed function $G_{\downarrow 10} = \mathcal Y^T\mathbf A^{-1}\mathsf g_{10}$ to describe a scalar potential on Earth’s surface ($r_e = 6371$ km). The concentration coefficient for the Slepian function $G_{10}$ at altitude is $\lambda = 0.99985$

From Eqs. (115), (116) and (71) or (75), we also obtain the equivalencies

$$\mathcal G_\downarrow^T(\hat{\mathbf r})\,\mathcal G_\uparrow(\hat{\mathbf r}') = \mathcal Y^T(\hat{\mathbf r})\,\mathbf A^{-1}\mathbf G\mathbf G^T\mathbf A\,\mathcal Y(\hat{\mathbf r}') = \mathcal Y^T(\hat{\mathbf r})\,\mathcal Y(\hat{\mathbf r}') = \mathcal G^T(\hat{\mathbf r})\,\mathcal G(\hat{\mathbf r}'), \tag{117}$$

in the “silent” $J = (L+1)^2$ notation of Eq. (75), noting that Eq. (117) does not have an equivalent in truncated form when $J \ne (L+1)^2$. We also have

$$\mathcal G_\downarrow^T\mathcal G_\uparrow = \mathcal G_{\downarrow J}^T\mathcal G_{\uparrow J} + \mathcal G_{\downarrow >J}^T\mathcal G_{\uparrow >J}. \tag{118}$$

5.2 Continuous Formulation and Statistical Considerations

In this section, we provide a formulation of the approach described in Sect. 5.1 that considers the data in their functional form instead of being given as point values. In this formalism, we will then express the estimation variance, bias, and mean squared error for the methods presented under some special cases. Our results will generalize the scalar treatment of Simons and Dahlen (2006), in whose work we will point out a misprint that we correct here.

5.2.1 Continuous Formulation

The analytic counterpart to the pointwise data from Eq. (106) known (or desired) only within the target region $R$ is

$$d(\hat{\mathbf r}) = \begin{cases} \partial_r V(r_s\hat{\mathbf r}) + n(\hat{\mathbf r}) & \text{if } \hat{\mathbf r} \in R, \\ \text{unknown} & \text{if } \hat{\mathbf r} \in \Omega \setminus R, \end{cases} \tag{119}$$

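where $n(\hat{\mathbf r})$ is the spatial noise function. The estimation problem equivalent to Eq. (108) can now be formulated as

$$\arg\min_{\tilde{\mathsf u}^{r_e}} \int_R \big(\mathcal Y^T\mathbf A\,\tilde{\mathsf u}^{r_e} - d\big)^2 d\Omega = \arg\min_{\tilde{\mathsf u}^{r_e}} \int_R \big(\mathcal Y^T\mathbf G\mathbf G^T\mathbf A\,\tilde{\mathsf u}^{r_e} - d\big)^2 d\Omega = \mathbf A^{-1}\mathbf G\,\arg\min_{\tilde{\mathsf s}^{r_s}} \int_R \big(\mathcal G^T\tilde{\mathsf s}^{r_s} - d\big)^2 d\Omega, \tag{120}$$

where the vector of Slepian functions $\mathcal G$ is defined in Eq. (75) and the estimated coefficients at satellite altitude $\tilde{\mathsf s}^{r_s}$ are in Eq. (109). The problem is regularized by solving exclusively for the $J$ best-concentrated Slepian coefficients that describe the data in Eq. (119), which transforms Eq. (120) into the estimation problem

$$\arg\min_{\tilde{\mathsf s}_J^{r_s}} \int_R \big(\mathcal G_J^T\tilde{\mathsf s}_J^{r_s} - d\big)^2 d\Omega. \tag{121}$$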

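Differentiating with respect to $\tilde{\mathsf s}_J^{r_s}$ to find the stationary points, and making use of Eq. (79), the solution is given by

$$\tilde{\mathsf s}_J^{r_s} = \left(\int_R \mathcal G_J\mathcal G_J^T\, d\Omega\right)^{-1} \int_R \mathcal G_J\, d\, d\Omega = \boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, d\, d\Omega. \tag{122}$$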
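The differentiation step behind Eq. (122) can be written out explicitly. The lines below are a sketch that assumes, as the reference to Eq. (79) suggests, that the regional Gram matrix of the $J$ retained Slepian functions equals the eigenvalue matrix, $\int_R \mathcal G_J\mathcal G_J^T\, d\Omega = \boldsymbol\Lambda_J$:

$$\begin{aligned}
\frac{\partial}{\partial\tilde{\mathsf s}_J^{r_s}} \int_R \big(\mathcal G_J^T\tilde{\mathsf s}_J^{r_s} - d\big)^2 d\Omega &= 2\int_R \mathcal G_J\big(\mathcal G_J^T\tilde{\mathsf s}_J^{r_s} - d\big)\, d\Omega = 0 \\
\Longrightarrow\quad \left(\int_R \mathcal G_J\mathcal G_J^T\, d\Omega\right)\tilde{\mathsf s}_J^{r_s} &= \int_R \mathcal G_J\, d\, d\Omega .
\end{aligned}$$

If $\boldsymbol\Lambda_J$ carries the concentration eigenvalues $\lambda_\alpha$ on its diagonal, each estimated coefficient is simply the regional projection of the data onto one Slepian function, divided by that function's eigenvalue, $\tilde s_\alpha^{r_s} = \lambda_\alpha^{-1}\int_R G_\alpha\, d\, d\Omega$ for $\alpha = 1,\ldots,J$. Poorly concentrated functions (small $\lambda_\alpha$) therefore amplify whatever noise the projection picks up.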

As with the estimation of the spherical-harmonic coefficients of the potential field from the Slepian coefficients at altitude obtained from pointwise data in Eq. (113), we can estimate the vector containing the $(L+1)^2$ spherical-harmonic coefficients $\tilde{\mathsf u}^{r_e}$ from the $J$-dimensional vector of Slepian coefficients $\tilde{\mathsf s}_J^{r_s}$ by first transforming it to the $(L+1)^2$-dimensional vector of spherical-harmonic coefficients $\mathbf G_J\tilde{\mathsf s}_J^{r_s}$ and then downward-transforming it using the inverse of the matrix $\mathbf A$ defined in Eq. (32). We thereby obtain the spherical-harmonic coefficients $\tilde{\mathsf u}^{r_e}$ for the estimation $\tilde V(r_e\hat{\mathbf r})$ of the potential field on Earth’s surface $r_e$ as

$$\tilde{\mathsf u}^{r_e} = \mathbf A^{-1}\mathbf G_J\boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, d\, d\Omega \quad \text{(analytic solution 2 to problem P2).} \tag{123}$$

We can expand the coefficients $\tilde{\mathsf u}^{r_e}$ obtained from the data $d$ by Eq. (123) to evaluate the potential field anywhere on Earth’s surface as

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal Y^T\tilde{\mathsf u}^{r_e} = \mathcal Y^T\mathbf A^{-1}\mathbf G_J\boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, d\, d\Omega = \mathcal G_{\downarrow J}^T\boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, d\, d\Omega, \tag{124}$$

where the truncated vector of downward-transformed Slepian functions $\mathcal G_{\downarrow J}$ is defined in Eq. (115).


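5.2.2 Effects of Bandlimiting the Scalar Estimates

The estimate given in Eq. (124) has a bandlimited representation of the unknown potential at its heart, though the actual potential that we are attempting to estimate will generally not be bandlimited (see Eqs. 18 and 27), nor will the noise be. To isolate the effects of the bandlimitation, we write the data as the sum of a bandlimited part (which is expanded globally in Slepian functions of the same bandwidth), its wideband complement, which contains spherical harmonics with degree greater than $L$ introduced in Eq. (52), and the noise contribution. Equation (119) then becomes

$$d = \partial_r V(r_s\hat{\mathbf r}) + n = \mathcal G^T \int_\Omega \mathcal G\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \hat{\mathcal Y}_{>L}^T \int_\Omega \hat{\mathcal Y}_{>L}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + n \tag{125}$$

within the region $R$. To this we apply the integral transform of Eq. (124) using the $J$ best-concentrated Slepian functions $\mathcal G_J$, and we make use of the orthogonality Eq. (74), Eqs. (78), (79), and (89), to obtain the expression

$$\begin{aligned}
\int_R \mathcal G_J\, d\, d\Omega &= \int_R \mathcal G_J\mathcal G^T d\Omega \int_\Omega \mathcal G\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\hat{\mathcal Y}_{>L}^T d\Omega \int_\Omega \hat{\mathcal Y}_{>L}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\, n\, d\Omega && (126) \\
&= \boldsymbol\Lambda_J \int_\Omega \mathcal G_J\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \mathbf G_J^T\hat{\mathbf D}_{>L,L}^T \int_\Omega \hat{\mathcal Y}_{>L}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\, n\, d\Omega && (127) \\
&= \boldsymbol\Lambda_J \int_\Omega \mathcal G_J\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \hat{\mathbf G}_{>L,J}^T \int_\Omega \hat{\mathcal Y}_{>L}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\, n\, d\Omega && (128) \\
&= \boldsymbol\Lambda_J \int_\Omega \mathcal G_J\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_\Omega \hat{\mathcal G}_{>L,J}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\, n\, d\Omega. && (129)
\end{aligned}$$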

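Finally, we can insert the result (129) into Eq. (124) to discover the contributions to the bandlimited estimate $\tilde V(r_e\hat{\mathbf r})$ from signal with energy in the spherical-harmonic degree range $l > L$ and the presence of noise:

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal G_{\downarrow J}^T \int_\Omega \mathcal G_J\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \mathcal G_{\downarrow J}^T\boldsymbol\Lambda_J^{-1} \left( \int_\Omega \hat{\mathcal G}_{>L,J}\, \partial_r V(r_s\hat{\mathbf r})\, d\Omega + \int_R \mathcal G_J\, n\, d\Omega \right), \tag{130}$$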

an expression equivalent to Eq. (136) of Simons and Dahlen (2006). Ultimately, Eq. (130) is derived from an estimate of the spherical-harmonic potential coefficients, Eq. (123), that uses a truncated (to $J$) set of bandlimited (to $L$) spatially concentrated (to $R$) Slepian functions. Keeping with the terminology introduced by Simons and Dahlen (2006), the truncation bias in the bandlimited part of the estimate (the first right-hand-side term in Eq. 130) diminishes as $J$ increases, but the second, parenthetical, term grows, very unfavorably fast, with the inverse-eigenvalue matrix $\boldsymbol\Lambda_J^{-1}$. This term contains the broadband leakage, which is captured from the non-bandlimited part of the signal by the nonvanishing regional product integral in the second term of Eq. (126), and the contribution due to the noise in the region over which data are available. Comparison of the bandlimited estimate (130) with the wideband original form (27) will furthermore identify a broadband bias that arises from the outright neglect of the necessary basis functions and is thus, essentially, unavoidable. The broadband leakage can be controlled under some theoretical or numerical schemes (e.g., Hwang 1993; Trampert and Snieder 1996; Albertella et al. 2008). Oftentimes, however, those fail to be practically successful at the desired level of accuracy of the solution (e.g., Slobbe et al. 2012).

5.2.3 Statistical Analysis for Scalar Bandlimited White Processes

The complete assessment of the statistical performance of the estimators (123) and (124) is an ambitious objective. It is difficult to go beyond Eq. (130) without making detailed assumptions about the underlying statistics of both signal and noise, not to mention the specifics of the region of data coverage and the satellite altitude (e.g., Kaula 1967; Whaler and Gubbins 1981; Xu 1992a, b, 1998; Schachtschneider et al. 2010, 2012; Slobbe et al. 2012). However, as shown by Simons and Dahlen (2006), special cases are easy to come by and learn from. We recall the standard definitions for the estimation error, bias, and variance,

$$\epsilon = \tilde V(r_e\hat{\mathbf r}) - V(r_e\hat{\mathbf r}), \tag{131}$$

$$\beta = \big\langle \tilde V(r_e\hat{\mathbf r}) \big\rangle - V(r_e\hat{\mathbf r}), \tag{132}$$

$$\nu = \big\langle \tilde V^2(r_e\hat{\mathbf r}) \big\rangle - \big\langle \tilde V(r_e\hat{\mathbf r}) \big\rangle^2, \tag{133}$$

and, typically the quantity to be minimized, the mean squared error:

$$\langle \epsilon^2 \rangle = \nu + \langle \beta^2 \rangle. \tag{134}$$

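The angular brackets in Eq. (134) refer to averaging over a hypothetical ensemble of repeated observations, treating both signal and noise as stochastic processes (see Simons and Dahlen 2006). We make the following four oversimplified assumptions by which to obtain simple and insightful expressions for $\nu$, $\beta$, and $\langle \epsilon^2 \rangle$:

1. The signal $V(r_e\hat{\mathbf r})$ is bandlimited, as are the Slepian functions $\mathcal G$, with the same bandwidth $L$.
2. The signal is (almost, given the incompatible stipulation 1) “white” on Earth’s surface, with power $S$, in the sense $\langle V(r_e\hat{\mathbf r})V(r_e\hat{\mathbf r}') \rangle = S\,\delta(\hat{\mathbf r},\hat{\mathbf r}')$, and with $\delta(\hat{\mathbf r},\hat{\mathbf r}')$ the scalar spherical delta function (see Simons et al. 2006).
3. The noise is white at the observation level, with power $N$, as $\langle n(\hat{\mathbf r})n(\hat{\mathbf r}') \rangle = N\,\delta(\hat{\mathbf r},\hat{\mathbf r}')$, and, again irreconcilably, zero outside of $R$.
4. The noise has zero mean and is uncorrelated with the signal, $\langle n(\hat{\mathbf r}) \rangle = 0 = \langle n(\hat{\mathbf r})V(\hat{\mathbf r}') \rangle$.

To honor assumption 1, we insert the bandwidth-restricted version of Eq. (57) into Eq. (130); observe the cancellation, via the whole-sphere orthogonality of $\hat{\mathcal G}_{>L,J}$ and $\mathcal Y$, of the first term inside of the parentheses in Eq. (130); and then apply the relation (78) and the orthogonality (71), to arrive at

$$\begin{aligned}
\tilde V(r_e\hat{\mathbf r}) &= \mathcal G_{\downarrow J}^T \left( \int_\Omega \mathcal G_J\mathcal Y^T\mathbf A\,\mathsf u^{r_e}\, d\Omega + \boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, n\, d\Omega \right) = \mathcal G_{\downarrow J}^T \left( \mathbf G_J^T\mathbf A\,\mathsf u^{r_e} + \boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, n\, d\Omega \right) \\
&= \mathcal G_{\downarrow J}^T \left( \int_\Omega \mathcal G_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, n\, d\Omega \right).
\end{aligned} \tag{135}$$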


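The last equality follows from the bandlimited identification $V(r_e\hat{\mathbf r}) = \mathcal Y^T\mathsf u^{r_e}$ as from Eq. (56), global orthogonality of the $\mathcal Y$, and by substitution of Eq. (116). From Eqs. (117) and (118), we furthermore know that the unknown bandlimited signal $V(r_e\hat{\mathbf r})$ can be represented using the upward- and downward-transformed Slepian functions as

$$V(r_e\hat{\mathbf r}) = \mathcal G_\downarrow^T \int_\Omega \mathcal G_\uparrow\, V(r_e\hat{\mathbf r})\, d\Omega = \mathcal G_{\downarrow J}^T \int_\Omega \mathcal G_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \mathcal G_{\downarrow >J}^T \int_\Omega \mathcal G_{\uparrow >J}\, V(r_e\hat{\mathbf r})\, d\Omega. \tag{136}$$

We can now calculate the bias $\beta$ from Eq. (132) by applying the averaging operation to Eq. (135), using assumption 4, and then subtracting Eq. (136), to give the result, which grows with diminishing truncation $J$,

$$\beta = -\mathcal G_{\downarrow >J}^T \int_\Omega \mathcal G_{\uparrow >J}\, V(r_e\hat{\mathbf r})\, d\Omega. \tag{137}$$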

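In order to calculate the variance $\nu$, we use Eq. (135) to obtain the square

$$\tilde V^2(r_e\hat{\mathbf r}) = \mathcal G_{\downarrow J}^T \left( \int_\Omega \mathcal G_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \boldsymbol\Lambda_J^{-1} \int_R \mathcal G_J\, n\, d\Omega \right) \left( \int_\Omega V(r_e\hat{\mathbf r})\, \mathcal G_{\uparrow J}^T\, d\Omega + \int_R n\, \mathcal G_J^T\, d\Omega\; \boldsymbol\Lambda_J^{-1} \right) \mathcal G_{\downarrow J} \tag{138}$$

$$\begin{aligned}
= \mathcal G_{\downarrow J}^T \bigg( & \int_\Omega\!\int_\Omega \mathcal G_{\uparrow J}(\hat{\mathbf r})\, V(r_e\hat{\mathbf r})\, V(r_e\hat{\mathbf r}')\, \mathcal G_{\uparrow J}^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega \\
& + \boldsymbol\Lambda_J^{-1} \int_R\!\int_R \mathcal G_J(\hat{\mathbf r})\, n(\hat{\mathbf r})\, n(\hat{\mathbf r}')\, \mathcal G_J^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega\; \boldsymbol\Lambda_J^{-1} \\
& + \int_\Omega\!\int_R \mathcal G_{\uparrow J}(\hat{\mathbf r})\, V(r_e\hat{\mathbf r})\, n(\hat{\mathbf r}')\, \mathcal G_J^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega\; \boldsymbol\Lambda_J^{-1} \\
& + \boldsymbol\Lambda_J^{-1} \int_R\!\int_\Omega \mathcal G_J(\hat{\mathbf r})\, n(\hat{\mathbf r})\, V(r_e\hat{\mathbf r}')\, \mathcal G_{\uparrow J}^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega \bigg)\, \mathcal G_{\downarrow J}.
\end{aligned} \tag{139}$$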

We apply the averaging over the different realizations of the noise in Eq. (139), and use assumptions 3 and 4 and Eq. (79), from which we subtract the square of the average of Eq. (135) to obtain the variance in Eq. (133), which grows with $J$, as

$$\nu = N\, \mathcal G_{\downarrow J}^T\boldsymbol\Lambda_J^{-1}\mathcal G_{\downarrow J}. \tag{140}$$
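The reduction from Eq. (139) to Eq. (140) rests on the noise-noise term alone. As a sketch, assuming that Eq. (79) states $\int_R \mathcal G_J\mathcal G_J^T\, d\Omega = \boldsymbol\Lambda_J$ and writing $\langle\cdot\rangle$ for the average over noise realizations, assumption 3 collapses the double integral to a single one:

$$\left\langle \boldsymbol\Lambda_J^{-1} \int_R\!\int_R \mathcal G_J(\hat{\mathbf r})\, n(\hat{\mathbf r})\, n(\hat{\mathbf r}')\, \mathcal G_J^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega\; \boldsymbol\Lambda_J^{-1} \right\rangle = N\, \boldsymbol\Lambda_J^{-1} \left( \int_R \mathcal G_J\mathcal G_J^T\, d\Omega \right) \boldsymbol\Lambda_J^{-1} = N\, \boldsymbol\Lambda_J^{-1}.$$

The two cross terms in Eq. (139) are linear in the noise and vanish on averaging by assumption 4, and the signal-signal term is identical to the square of the noise-averaged Eq. (135), so it cancels in the subtraction. What remains is $N\boldsymbol\Lambda_J^{-1}$ sandwiched between $\mathcal G_{\downarrow J}^T$ and $\mathcal G_{\downarrow J}$.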

The squared bias averaged over all realizations of the signal, using assumption 2, making the substitution (116), and using the whole-sphere orthogonality (71) of the spherical harmonics $\mathcal Y$, yields

$$\langle \beta^2 \rangle = S\, \mathcal G_{\downarrow >J}^T\mathbf G_{>J}^T\mathbf A^2\mathbf G_{>J}\mathcal G_{\downarrow >J}, \tag{141}$$

which leads, together with the variance in Eq. (140), via Eq. (134) to the mean squared estimation error

$$\langle \epsilon^2 \rangle = N\, \mathcal G_{\downarrow J}^T\boldsymbol\Lambda_J^{-1}\mathcal G_{\downarrow J} + S\, \mathcal G_{\downarrow >J}^T\mathbf G_{>J}^T\mathbf A^2\mathbf G_{>J}\mathcal G_{\downarrow >J}. \tag{142}$$

With Eqs. (137), (141), and (142), we correct Eqs. (143)–(145) of Simons and Dahlen (2006). We can understand their typo by writing Eq. (141) using Eq. (115) as $\langle \beta^2 \rangle = S\, \mathcal Y^T\mathbf A^{-1}\mathbf G_{>J}\mathbf G_{>J}^T\mathbf A^2\mathbf G_{>J}\mathbf G_{>J}^T\mathbf A^{-1}\mathcal Y$ and recognizing that the terms $\mathbf G_{>J}\mathbf G_{>J}^T$ are never identities and that the interior term $\mathbf G_{>J}^T\mathbf A^2\mathbf G_{>J}$ is an identity only when $\mathbf A$ itself is an identity, which is never the case in this chapter, but would apply in the zero-altitude scalar case considered by Simons and Dahlen (2006). Another way of stating it is that Simons and Dahlen (2006) mistakenly applied their identity (93), which is our (117), in the case of truncated sums, for which it does not hold. The typos do not affect any of their further analysis or conclusions, which were conducted at zero altitude.
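A two-dimensional toy case makes the argument concrete. The sketch below is purely illustrative: it takes a $2\times 2$ rotation as a stand-in for the orthogonal transformation matrix, keeps $J = 1$ function, and uses a made-up diagonal matrix in place of $\mathbf A$ (the diagonal form and the numerical entries are assumptions of this example, not values from the chapter).

```python
import math

def discarded_column(theta):
    """Second column of a 2x2 rotation; plays the role of G_{>J} with J = 1."""
    return (-math.sin(theta), math.cos(theta))

def interior_term(theta, a1, a2):
    """g^T A^2 g for A = diag(a1, a2), the analogue of G_{>J}^T A^2 G_{>J}."""
    g = discarded_column(theta)
    return a1 ** 2 * g[0] ** 2 + a2 ** 2 * g[1] ** 2

def outer_term(theta):
    """g g^T, the analogue of G_{>J} G_{>J}^T (a projector, not an identity)."""
    g = discarded_column(theta)
    return [[g[0] * g[0], g[0] * g[1]],
            [g[1] * g[0], g[1] * g[1]]]

theta = math.pi / 6
identity_case = interior_term(theta, 1.0, 1.0)    # A = I (zero altitude)
altitude_case = interior_term(theta, 0.5, 0.25)   # A != I, made-up entries
P = outer_term(theta)
```

In this toy setting `identity_case` evaluates to 1, `altitude_case` does not, and `P` has rank one, so neither simplification is available once the sum is truncated and the altitude is nonzero.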

6 Potential-Field Estimation from Vectorial Data Using Slepian Functions

In this section, we present a method to solve problem P4, the estimation of the potential field on Earth’s surface from noisy (three-component) vectorial data at satellite altitude (e.g., Arkani-Hamed 2002). The method is constructed in a similar fashion as the scalar solutions to problem P2 described in Sect. 5. We will use the gradient-vector Slepian functions introduced in Sect. 4.2 to fit the local data at satellite altitude and then downward-transform the gradient-vector spherical-harmonic coefficients thus obtained. As for the scalar case, we will first present the numerical method applicable to pointwise data and then develop a functional formulation that will allow us to analyze the effect of non-bandlimited signal and noise on the estimation.

6.1 Discrete Formulation and Truncated Solutions

Given pointwise data values of the gradient of the potential that are polluted by noise at the points $r_s\hat{\mathbf r}_1,\ldots,r_s\hat{\mathbf r}_k$,

$$\mathsf d = \mathsf V' + \mathsf n, \tag{143}$$

where $\mathsf V'$ is defined in Eq. (38), and $\mathsf n$ is a vector of noise values at the evaluation points for the individual components, we seek to estimate the spherical-harmonic coefficients $\mathsf u^{r_e} = (u_{00}^{r_e} \cdots u_{LL}^{r_e})^T$ of the scalar potential $V$ on Earth’s surface $r_e$, as in the statement (49) of problem P4. The solution Eq. (50) contains the matrix inverse $(\mathsf E\mathsf E^T)^{-1}$ which, like its counterpart Eq. (93), is intrinsically poorly conditioned. To regularize the problem, we transform the problem into the gradient-vector Slepian basis for the relevant bandwidth and the chosen target region $R$ and focus on estimating only the $J$ best-concentrated gradient-vector Slepian coefficients. We leave the choice of the value $J$ for later. We define the $(L+1)^2 \times 3k$-dimensional matrix containing the $(L+1)^2$ gradient-vector Slepian functions $\boldsymbol{\mathcal H}_1,\ldots,\boldsymbol{\mathcal H}_{(L+1)^2}$ evaluated at the unit-sphere longitudes and latitudes of the data,

$$\mathsf H = \mathbf H^T\mathsf E, \tag{144}$$

where the gradient-vector Slepian transformation matrix $\mathbf H$ is defined in Eq. (94) and the matrix $\mathsf E$ containing the values of the gradient-vector spherical harmonics evaluated at the data locations on the unit sphere is defined in Eq. (42). Problem P4 is rewritten from Eq. (49) via the gradient-vector Slepian transformation $\mathbf H$ at altitude, to

$$\arg\min_{\tilde{\mathsf u}^{r_e}} \big\|\mathsf E^T\mathbf B\,\tilde{\mathsf u}^{r_e} - \mathsf d\big\|^2 = \arg\min_{\tilde{\mathsf u}^{r_e}} \big\|\mathsf E^T\mathbf H\mathbf H^T\mathbf B\,\tilde{\mathsf u}^{r_e} - \mathsf d\big\|^2 = \mathbf B^{-1}\mathbf H\,\arg\min_{\tilde{\mathsf t}^{r_s}} \big\|\mathsf H^T\tilde{\mathsf t}^{r_s} - \mathsf d\big\|^2, \tag{145}$$

where we used the orthogonality $\mathbf H\mathbf H^T = \mathbf I$, the definition Eq. (144) and introduced the gradient-vector Slepian coefficients at satellite altitude

$$\tilde{\mathsf t}^{r_s} = \mathbf H^T\mathbf B\,\tilde{\mathsf u}^{r_e}. \tag{146}$$

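As for the scalar case, we apply regularization by only estimating the coefficients for the $J$ best-concentrated gradient-vector Slepian functions. We define the $J \times 3k$-dimensional matrix containing the point evaluations of those,

$$\mathsf H_J = \mathbf H_J^T\mathsf E, \tag{147}$$

and then solve

$$\arg\min_{\tilde{\mathsf t}_J^{r_s}} \big\|\mathsf H_J^T\tilde{\mathsf t}_J^{r_s} - \mathsf d\big\|^2 \tag{148}$$

for the $J$-dimensional vector $\tilde{\mathsf t}_J^{r_s}$ of gradient-vector Slepian coefficients at satellite altitude. For $J \le 3k$, the minimizer

$$\tilde{\mathsf t}_J^{r_s} = \big(\mathsf H_J\mathsf H_J^T\big)^{-1}\mathsf H_J\mathsf d \tag{149}$$

is subsequently downward-transformed to the $(L+1)^2$ spherical-harmonic coefficients $\tilde{\mathsf u}^{r_e}$ of the field on Earth’s surface $r_e$ as

$$\tilde{\mathsf u}^{r_e} = \mathbf B^{-1}\mathbf H_J\tilde{\mathsf t}_J^{r_s} = \mathbf B^{-1}\mathbf H_J\big(\mathsf H_J\mathsf H_J^T\big)^{-1}\mathsf H_J\mathsf d \quad \text{(solution 2 to noisy problem P4),} \tag{150}$$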

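using the matrix $\mathbf B$ defined in Eq. (48). The conditioning of the matrix $(\mathsf H_J\mathsf H_J^T)$ is determined by the truncation level $J$. The local approximation $\tilde V(r_e\hat{\mathbf r})$ of the potential field $V(r_e\hat{\mathbf r})$ can now be calculated by

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal Y^T\tilde{\mathsf u}^{r_e} = \mathcal H_{\downarrow J}^T\big(\mathsf H_J\mathsf H_J^T\big)^{-1}\mathsf H_J\mathsf d = \mathcal H_{\downarrow J}^T\tilde{\mathsf t}_J^{r_s}, \tag{151}$$

where we have defined the vector of the $J$ best-concentrated gradient-vector Slepian functions (and its complement) that are downward-transformed (hence, expanded in scalar spherical harmonics) as

$$\mathcal H_{\downarrow J} = \mathbf H_J^T\mathbf B^{-1}\mathcal Y, \qquad \mathcal H_{\downarrow >J} = \mathbf H_{>J}^T\mathbf B^{-1}\mathcal Y. \tag{152}$$

Figure 4 shows an example. Similarly, we will be needing the upward-transformed pair of vectors

$$\mathcal H_{\uparrow J} = \mathbf H_J^T\mathbf B\,\mathcal Y, \qquad \mathcal H_{\uparrow >J} = \mathbf H_{>J}^T\mathbf B\,\mathcal Y, \tag{153}$$

and the relation derived from them when $J = (L+1)^2$ and Eq. (94) or Eq. (98), the equivalent of Eq. (117), namely,

$$\mathcal H_\downarrow^T(\hat{\mathbf r})\,\mathcal H_\uparrow(\hat{\mathbf r}') = \mathcal Y^T(\hat{\mathbf r})\,\mathbf B^{-1}\mathbf H\mathbf H^T\mathbf B\,\mathcal Y(\hat{\mathbf r}') = \mathcal Y^T(\hat{\mathbf r})\,\mathcal Y(\hat{\mathbf r}') = \mathcal H^T(\hat{\mathbf r})\,\mathcal H(\hat{\mathbf r}'). \tag{154}$$

Once again we stress that we cannot derive such an equality after any truncation of the Slepian-function set. We do have

$$\mathcal H_\downarrow^T(\hat{\mathbf r})\,\mathcal H_\uparrow(\hat{\mathbf r}') = \mathcal H_{\downarrow J}^T(\hat{\mathbf r})\,\mathcal H_{\uparrow J}(\hat{\mathbf r}') + \mathcal H_{\downarrow >J}^T(\hat{\mathbf r})\,\mathcal H_{\uparrow >J}(\hat{\mathbf r}'). \tag{155}$$

Fig. 4 Downward transformation of the 10th best-concentrated gradient-vector Slepian function for Africa and a maximum spherical-harmonic degree $L = 30$. The right panels show the concentrated gradient-vector Slepian function $\boldsymbol{\mathcal H}_{10} = \boldsymbol{\mathcal E}^T\mathsf h_{10}$. Top-right panel shows the radial component $\boldsymbol{\mathcal H}_{10}\cdot\hat{\mathbf r}$, middle-right panel the tangential (colatitudinal) component $\boldsymbol{\mathcal H}_{10}\cdot\hat{\boldsymbol\theta}$, and the lower-right panel the tangential (longitudinal) component $\boldsymbol{\mathcal H}_{10}\cdot\hat{\boldsymbol\phi}$. The left panel shows the downward-transformed scalar potential $H_{\downarrow 10} = \mathcal Y^T\mathbf B^{-1}\mathsf h_{10}$ on Earth’s surface ($r_e = 6371$ km) that corresponds to the field $\boldsymbol{\mathcal H}_{10}$ at satellite altitude 500 km. The concentration coefficient for the gradient-vector Slepian function $\boldsymbol{\mathcal H}_{10}$ at satellite altitude is $\sigma = 0.93$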

6.2 Continuous Formulation and Statistical Considerations

In this section, we reformulate the method described in Sect. 6.1 such that instead of estimating the potential field from pointwise data, we estimate the field from functional data that are only available in the target region $R$. This will then enable us to analyze the effect of a non-bandlimited signal and general noise on the estimation of the potential field on Earth’s surface $r_e$.

6.2.1 Continuous Formulation

The data that are the functional equivalent of the point values (143) in the target region $R$ are now expressed as

$$\mathbf d(\hat{\mathbf r}) = \begin{cases} \boldsymbol\nabla V(r_s\hat{\mathbf r}) + \mathbf n(\hat{\mathbf r}) & \text{if } \hat{\mathbf r} \in R, \\ \text{unknown} & \text{if } \hat{\mathbf r} \in \Omega \setminus R, \end{cases} \tag{156}$$

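where $\mathbf n(\hat{\mathbf r})$ is a vector-valued function of space describing the noise at satellite altitude $r_s$. The problem equivalent to Eq. (145) is

$$\arg\min_{\tilde{\mathsf u}^{r_e}} \int_R \big(\boldsymbol{\mathcal E}^T\mathbf B\,\tilde{\mathsf u}^{r_e} - \mathbf d\big)^2 d\Omega = \arg\min_{\tilde{\mathsf u}^{r_e}} \int_R \big(\boldsymbol{\mathcal E}^T\mathbf H\mathbf H^T\mathbf B\,\tilde{\mathsf u}^{r_e} - \mathbf d\big)^2 d\Omega = \mathbf B^{-1}\mathbf H\,\arg\min_{\tilde{\mathsf t}^{r_s}} \int_R \big(\boldsymbol{\mathcal H}^T\tilde{\mathsf t}^{r_s} - \mathbf d\big)^2 d\Omega, \tag{157}$$

where the vector of gradient-vector Slepian functions $\boldsymbol{\mathcal H}$ is defined in Eq. (98) and the estimated vector of coefficients for the gradient-vector Slepian functions at satellite altitude $\tilde{\mathsf t}^{r_s}$ is defined in Eq. (146). In Eq. (157), the scalar-valued square of a three-dimensional vector is defined as the inner product of this vector with itself. As for the numerical formulation, we apply regularization by solving only for the coefficients of the $J$ best-concentrated gradient-vector Slepian functions at altitude to fit the data $\mathbf d$ given in Eq. (156). We thence turn Eq. (157) into the estimation problem

$$\arg\min_{\tilde{\mathsf t}_J^{r_s}} \int_R \big(\boldsymbol{\mathcal H}_J^T\tilde{\mathsf t}_J^{r_s} - \mathbf d\big)^2 d\Omega = \arg\min_{\tilde{\mathsf t}_J^{r_s}} \int_R \big(\boldsymbol{\mathcal H}_J^T\tilde{\mathsf t}_J^{r_s} - \mathbf d\big) \cdot \big(\boldsymbol{\mathcal H}_J^T\tilde{\mathsf t}_J^{r_s} - \mathbf d\big)\, d\Omega, \tag{158}$$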

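which is solved by

$$\tilde{\mathsf t}_J^{r_s} = \left( \int_R \boldsymbol{\mathcal H}_J \cdot \boldsymbol{\mathcal H}_J^T\, d\Omega \right)^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega = \boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega, \tag{159}$$

where we have used Eq. (102). As for the pointwise data case shown in Eq. (150), we obtain an estimate $\tilde{\mathsf u}^{r_e}$ for the spherical-harmonic coefficients of the potential field on Earth’s surface $r_e$ as

$$\tilde{\mathsf u}^{r_e} = \mathbf B^{-1}\mathbf H_J\boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega \quad \text{(analytic solution 2 to problem P4).} \tag{160}$$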

We can transform the coefficients $\tilde{\mathsf u}^{r_e}$ obtained from the data $\mathbf d$ by Eq. (160) into a local estimate of the potential field at the Earth’s surface as

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal Y^T\tilde{\mathsf u}^{r_e} = \mathcal Y^T\mathbf B^{-1}\mathbf H_J\boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega = \mathcal H_{\downarrow J}^T\boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega, \tag{161}$$

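where the vector containing the downward-transformed gradient-vector Slepian functions $\mathcal H_{\downarrow J}$ was defined in Eq. (152).

6.2.2 Effects of Bandlimiting the Vector Estimates

The estimate (161) is bandlimited but neither the data nor the noise usually would be. To study the leakage and bias that arise from this discrepancy in the representation, we separate the data explicitly into a bandlimited and a broadband signal part, and the noise, much like we did for the scalar case in Sect. 5.2.2, as

$$\mathbf d = \boldsymbol\nabla V(r_s\hat{\mathbf r}) + \mathbf n = \boldsymbol{\mathcal H}^T \int_\Omega \boldsymbol{\mathcal H} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \hat{\boldsymbol{\mathcal E}}_{>L}^T \int_\Omega \hat{\boldsymbol{\mathcal E}}_{>L} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \mathbf n \tag{162}$$

within the region $R$. To work toward Eq. (161), we multiply the data with the vector $\boldsymbol{\mathcal H}_J$ containing the $J$ best-concentrated gradient-vector Slepian functions and integrate over the region. We make use of the orthogonality Eq. (97), and Eqs. (101) and (102), and the relations Eqs. (104) and (105), to arrive at

$$\begin{aligned}
\int_R \boldsymbol{\mathcal H}_J \cdot \mathbf d\, d\Omega &= \int_R \boldsymbol{\mathcal H}_J \cdot \boldsymbol{\mathcal H}^T d\Omega \int_\Omega \boldsymbol{\mathcal H} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega \\
&\quad + \int_R \boldsymbol{\mathcal H}_J \cdot \hat{\boldsymbol{\mathcal E}}_{>L}^T d\Omega \int_\Omega \hat{\boldsymbol{\mathcal E}}_{>L} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega && (163) \\
&= \boldsymbol\Sigma_J \int_\Omega \boldsymbol{\mathcal H}_J \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \mathbf H_J^T\hat{\mathbf K}_{>L,L}^T \int_\Omega \hat{\boldsymbol{\mathcal E}}_{>L} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega && (164) \\
&= \boldsymbol\Sigma_J \int_\Omega \boldsymbol{\mathcal H}_J \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \hat{\mathbf H}_{\hat E,>L,J}^T \int_\Omega \hat{\boldsymbol{\mathcal E}}_{>L} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega && (165) \\
&= \boldsymbol\Sigma_J \int_\Omega \boldsymbol{\mathcal H}_J \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_\Omega \hat{\boldsymbol{\mathcal H}}_{\hat E,>L,J} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega. && (166)
\end{aligned}$$

Substituting Eq. (166) into the expression for our estimate Eq. (161) exposes its bandlimited and broadband constituent terms

$$\tilde V(r_e\hat{\mathbf r}) = \mathcal H_{\downarrow J}^T \int_\Omega \boldsymbol{\mathcal H}_J \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \mathcal H_{\downarrow J}^T\boldsymbol\Sigma_J^{-1} \left( \int_\Omega \hat{\boldsymbol{\mathcal H}}_{\hat E,>L,J} \cdot \boldsymbol\nabla V(r_s\hat{\mathbf r})\, d\Omega + \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega \right). \tag{167}$$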

The convenience of our notation is apparent from the comparison of this equation with Eq. (130), which is functionally very similar. Here, as there, the estimation error of the bandlimited part of the signal (the first term in Eq. 167) becomes smaller with less truncation (larger $J$), but the bias from the non-bandlimited part of the signal and the noise (second term) grows, amplified by the concentration factor $\boldsymbol\Sigma_J^{-1}$, which becomes less well conditioned with growing $J$, as Slepian functions with ever smaller eigenvalues are being included into the estimate.

6.2.3 Statistical Analysis for Vectorial Bandlimited White Processes

Even more so than for the scalar case described in Sect. 5.2, the calculation of the variance, bias, and mean squared error of the estimates (160) and (161), in the general sense of Eq. (167), would be very involved without imparting much insight. Instead, as for the scalar case, we narrow our scope to vectorial data $\mathbf d$ that satisfy some special properties. Because the field $\tilde V(r_e\hat{\mathbf r})$ that we estimate from these data is still a scalar function, we can retain the definitions of variance, bias, and mean squared error given in Eqs. (131)–(134). We update the list of assumptions as follows:

1. The signal $V(r_e\hat{\mathbf r})$ is bandlimited with the same bandlimit $L$ as the Slepian functions $\boldsymbol{\mathcal H}$.
2. The signal is white on the surface, $\langle V(r_e\hat{\mathbf r})V(r_e\hat{\mathbf r}') \rangle = S\,\delta(\hat{\mathbf r},\hat{\mathbf r}')$.
3. The noise is white at the observation level, $\langle \mathbf n(\hat{\mathbf r})\mathbf n(\hat{\mathbf r}') \rangle = N\,\boldsymbol\delta(\hat{\mathbf r},\hat{\mathbf r}')$, with $\boldsymbol\delta(\hat{\mathbf r},\hat{\mathbf r}')$ the vectorial delta function (see Plattner and Simons 2014), and it is zero outside of $R$.
4. The noise has zero mean and none of its components are correlated with the signal, $\langle \mathbf n(\hat{\mathbf r}) \rangle = \mathbf 0 = \langle \mathbf n(\hat{\mathbf r})V(\hat{\mathbf r}') \rangle$.

Following assumption 1, we insert the bandlimited portion of Eq. (66) into Eq. (167), supply the form of Eq. (101), observe the cancellation of the whole-sphere inner product between $\hat{\boldsymbol{\mathcal H}}_{\hat E,>L,J}$ and $\boldsymbol{\mathcal E}$ inside the parentheses in Eq. (167), and then use the relations (101) and (94) to write

$$\begin{aligned}
\tilde V(r_e\hat{\mathbf r}) &= \mathcal H_{\downarrow J}^T \left( \int_\Omega \boldsymbol{\mathcal H}_J \cdot \boldsymbol{\mathcal E}^T\mathbf B\,\mathsf u^{r_e}\, d\Omega + \boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega \right) = \mathcal H_{\downarrow J}^T \left( \mathbf H_J^T\mathbf B\,\mathsf u^{r_e} + \boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega \right) \\
&= \mathcal H_{\downarrow J}^T \left( \int_\Omega \mathcal H_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega \right),
\end{aligned} \tag{168}$$

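the last equality following from Eq. (56), global orthogonality of the $\boldsymbol{\mathcal E}$, and Eq. (153). From Eqs. (154) and (155), we learn that the unknown bandlimited true signal $V(r_e\hat{\mathbf r})$ can be represented by

$$V(r_e\hat{\mathbf r}) = \mathcal H_\downarrow^T \int_\Omega \mathcal H_\uparrow\, V(r_e\hat{\mathbf r})\, d\Omega = \mathcal H_{\downarrow J}^T \int_\Omega \mathcal H_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \mathcal H_{\downarrow >J}^T \int_\Omega \mathcal H_{\uparrow >J}\, V(r_e\hat{\mathbf r})\, d\Omega. \tag{169}$$

The bias of Eq. (132) derives from averaging Eq. (168), using assumption 4, and then subtracting Eq. (169) to yield a term that grows as $J$ gets lowered,

$$\beta = -\mathcal H_{\downarrow >J}^T \int_\Omega \mathcal H_{\uparrow >J}\, V(r_e\hat{\mathbf r})\, d\Omega. \tag{170}$$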

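The variance requires the square of Eq. (168), that is,

$$\begin{aligned}
\tilde V^2(r_e\hat{\mathbf r}) &= \mathcal H_{\downarrow J}^T \left( \int_\Omega \mathcal H_{\uparrow J}\, V(r_e\hat{\mathbf r})\, d\Omega + \boldsymbol\Sigma_J^{-1} \int_R \boldsymbol{\mathcal H}_J \cdot \mathbf n\, d\Omega \right) \left( \int_\Omega V(r_e\hat{\mathbf r})\, \mathcal H_{\uparrow J}^T\, d\Omega + \int_R \mathbf n \cdot \boldsymbol{\mathcal H}_J^T\, d\Omega\; \boldsymbol\Sigma_J^{-1} \right) \mathcal H_{\downarrow J} \\
&= \mathcal H_{\downarrow J}^T \bigg( \int_\Omega\!\int_\Omega \mathcal H_{\uparrow J}(\hat{\mathbf r})\, V(r_e\hat{\mathbf r})\, V(r_e\hat{\mathbf r}')\, \mathcal H_{\uparrow J}^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega \\
&\qquad + \boldsymbol\Sigma_J^{-1} \int_R\!\int_R \big[\boldsymbol{\mathcal H}_J(\hat{\mathbf r}) \cdot \mathbf n(\hat{\mathbf r})\big] \big[\mathbf n(\hat{\mathbf r}') \cdot \boldsymbol{\mathcal H}_J^T(\hat{\mathbf r}')\big]\, d\Omega'\, d\Omega\; \boldsymbol\Sigma_J^{-1} \\
&\qquad + \int_\Omega\!\int_R \mathcal H_{\uparrow J}(\hat{\mathbf r})\, V(r_e\hat{\mathbf r}) \big[\mathbf n(\hat{\mathbf r}') \cdot \boldsymbol{\mathcal H}_J^T(\hat{\mathbf r}')\big]\, d\Omega'\, d\Omega\; \boldsymbol\Sigma_J^{-1} \\
&\qquad + \boldsymbol\Sigma_J^{-1} \int_R\!\int_\Omega \big[\boldsymbol{\mathcal H}_J(\hat{\mathbf r}) \cdot \mathbf n(\hat{\mathbf r})\big]\, V(r_e\hat{\mathbf r}')\, \mathcal H_{\uparrow J}^T(\hat{\mathbf r}')\, d\Omega'\, d\Omega \bigg)\, \mathcal H_{\downarrow J}.
\end{aligned} \tag{171}$$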

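After averaging Eq. (171) under the assumptions 3 and 4, using Eq. (102), and subtracting the square of the average of Eq. (168), we get the estimation variance of Eq. (133), which grows with $J$, in the form

$$\nu = N\, \mathcal H_{\downarrow J}^T\boldsymbol\Sigma_J^{-1}\mathcal H_{\downarrow J}. \tag{172}$$

The average squared bias under the assumption 2, with Eq. (153) and the global orthogonality of the spherical harmonics $\mathcal Y$, is written as

$$\langle \beta^2 \rangle = S\, \mathcal H_{\downarrow >J}^T\mathbf H_{>J}^T\mathbf B^2\mathbf H_{>J}\mathcal H_{\downarrow >J}, \tag{173}$$

which, together with the variance in Eq. (172), leads to the mean squared error defined in Eq. (134), in the form

$$\langle \epsilon^2 \rangle = N\, \mathcal H_{\downarrow J}^T\boldsymbol\Sigma_J^{-1}\mathcal H_{\downarrow J} + S\, \mathcal H_{\downarrow >J}^T\mathbf H_{>J}^T\mathbf B^2\mathbf H_{>J}\mathcal H_{\downarrow >J}. \tag{174}$$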

7 Numerical Examples

In this section, we illustrate the use of Eqs. (113) and (114) to solve the noisy scalar problem P2 and Eqs. (150) and (151) for the noisy vectorial problem P4. In both cases, our aim is to estimate the scalar potential field on Earth’s surface from noisy scalar and vectorial data, synthetically generated at a representative altitude. Throughout the section, we assume the Earth to be a sphere of radius $r_e = 6{,}371$ km and the satellite to fly in a spherical orbit at $(r_s - r_e) = 500$ km above the Earth’s surface. We implemented the numerical algorithms in Matlab, and wherever the solution of a linear system of equations was required, such as in Eq. (112) or Eq. (149), we used the operator mldivide, e.g., $(\mathsf G_J\mathsf G_J^T)\backslash(\mathsf G_J\mathsf d_r)$ and $(\mathsf H_J\mathsf H_J^T)\backslash(\mathsf H_J\mathsf d)$.

The “true” potential field $V(r_e\hat{\mathbf r}) = \mathcal Y^T\mathsf u^{r_e}$ in our numerical experiments is bandlimited to degree $L = 72$ and its isotropic signal power is constant within the bandlimit by satisfying $\frac{1}{2l+1}\sum_{m=-l}^{l}\big(u_{lm}^{r_e}\big)^2 = 1$ for $1 \le l \le L$. We ensured that the signal had zero mean over the entire Earth’s surface by setting $u_{00}^{r_e} = 0$. Figures 5 and 7 show the potential-field signal in their upper-left panels. The bandlimited scalar quantity at satellite altitude $\partial_r V(r_s\hat{\mathbf r})$ is defined by the bandlimited version of Eq. (57), and likewise, the vectorial quantity $\boldsymbol\nabla V(r_s\hat{\mathbf r})$ by the bandlimited restriction of Eq. (65). In each of the experiments in this section, we sampled the fields at altitude at the same set of 2,217 points which were uniformly distributed (equal surface area) over the target region $R$, Africa, of solid-angle area $a = \int_R d\Omega$. From these points, we created vectors with the data $\mathsf d_r$ or $\mathsf d$ as in Eqs. (106) and (143). The noise for the scalar problem was generated at every location of the data points by independent sampling from a zero-mean Gaussian distribution with a variance equal to 2.5 % of the numerical signal power at satellite altitude $r_s$ given by $(1/k)\|\mathsf V'_r\|^2 = (1/k)\sum_{i=1}^{k}\big[\partial_r V(r_s\hat{\mathbf r}_i)\big]^2$. For the vectorial problem, we generated the noise for each of the three signal components at satellite altitude, $\partial_r V(r_s\hat{\mathbf r})$, $\partial_\theta V(r_s\hat{\mathbf r})$, and $\partial_\phi V(r_s\hat{\mathbf r})$, independently from zero-mean Gaussian distributions with identical variances equal to 2.5 % of the numerical power of the signal in each of the components separately. At each fixed Slepian-basis truncation level $J$, the scalar estimates in Eq. (113) are derived from the solutions (112) which minimize the quadratic misfit (111) that is our regularized proxy for the noisy problem (108). Similarly, the vectorial estimates Eq. (150) derive from the solutions (149) to the misfit (148) which is our regularized version of the noisy problem (145). As we have seen in the theoretical treatment of the problem, the truncation regularization biases the estimates (see Eqs. 137 and 170) by an amount that grows when lowering $J$ (more truncation), but the estimation variances (see Eqs. 140 and 172) are positively affected by lowering $J$ (which leads to smaller variance).
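The synthetic setup described above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions rather than the Matlab code behind the experiments: the coefficients are drawn at random and rescaled so that each degree has unit power, the "signal at altitude" is a placeholder sequence instead of an actual upward-continued field, and the helper names (`unit_power_coefficients`, `add_noise`) are hypothetical.

```python
import math
import random

def unit_power_coefficients(L, rng):
    """Random coefficients u[(l, m)] with (1/(2l+1)) * sum_m u_lm^2 = 1
    for 1 <= l <= L, and u_00 = 0 for a zero-mean signal."""
    u = {(0, 0): 0.0}
    for l in range(1, L + 1):
        raw = [rng.gauss(0.0, 1.0) for _ in range(2 * l + 1)]
        scale = math.sqrt((2 * l + 1) / sum(x * x for x in raw))
        for m, x in zip(range(-l, l + 1), raw):
            u[(l, m)] = scale * x
    return u

def add_noise(signal, fraction, rng):
    """Add independent zero-mean Gaussian noise whose variance equals
    `fraction` times the mean signal power (1/k) * sum_i signal_i^2."""
    power = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(fraction * power)
    return [v + rng.gauss(0.0, sigma) for v in signal]

rng = random.Random(0)
u = unit_power_coefficients(4, rng)                 # small L for illustration
signal = [math.sin(0.1 * i) for i in range(2217)]   # placeholder values only
data = add_noise(signal, 0.025, rng)                # 2.5 % noise variance
```

For the vectorial case, the same `add_noise` call would be applied to each of the three components separately, so that each component receives noise scaled to its own power.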
In all this, our ultimate objective is to control the trade-off between bias and variance and make our estimates of the potential field at the surface of the Earth as efficient as possible (Cox and Hinkley 1974; Davison 2003). We thus need to evaluate the quality of the estimates made using different truncation levels $J$ in terms of their mean squared errors (see Eqs. 142 and 174). For each experiment, we will compute as a measure of efficiency the mean squared error between the estimated potential field and the (bandlimited) truth, at the Earth’s surface, averaged over the area of interest, as follows:

$$\mathrm{mse} = \frac{1}{a}\int_R \left[V(r_e\hat{\mathbf{r}}) - \tilde{V}(r_e\hat{\mathbf{r}})\right]^2 d\Omega = \frac{1}{a}\left(\mathbf{u}^{r_e} - \tilde{\mathbf{u}}^{r_e}\right)^T \mathbf{D}\left(\mathbf{u}^{r_e} - \tilde{\mathbf{u}}^{r_e}\right). \tag{175}$$

With the truth $V(r_e\hat{\mathbf{r}}) = \mathcal{Y}^T\mathbf{u}^{r_e}$ and the estimates in the common form $\tilde{V}(r_e\hat{\mathbf{r}}) = \mathcal{Y}^T\tilde{\mathbf{u}}^{r_e}$ as given by either Eqs. (114) or (151), the truncation-level $J$-dependent Eq. (175) can be calculated directly with the aid of the localization kernel Eq. (70), as shown. We will express the regional mean squared error relative to the mean squared signal strength over the same area, which is given by

$$\mathrm{mss} = \frac{1}{a}\int_R V^2(r_e\hat{\mathbf{r}})\, d\Omega = \frac{1}{a}\left(\mathbf{u}^{r_e}\right)^T \mathbf{D}\,\mathbf{u}^{r_e}. \tag{176}$$

We will call the relative measure

$$\varphi(J) = \frac{\mathrm{mse}}{\mathrm{mss}}, \tag{177}$$


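and plot it as a function of the Slepian-function truncation level $J$. Finally, we will also quote the relative quadratic measure of data misfit, Eq. (111), between the given data $\mathsf{d}_r$ and the predicted data, $\mathsf{Y}^T\mathbf{A}\tilde{\mathbf{u}}^{r_e}$,

$$\psi(J) = \frac{\left\|\mathsf{Y}^T\mathbf{A}\tilde{\mathbf{u}}^{r_e} - \mathsf{d}_r\right\|^2}{\left\|\mathsf{d}_r\right\|^2}, \tag{178}$$

where we recall that the prediction $\tilde{\mathbf{u}}^{r_e}$ is given by Eq. (113) and thereby remains a function of the truncation level $J$. In the vectorial case, the equivalent metric is the relative mean squared data misfit, Eq. (148), between the three vectorial components of the given data $\mathsf{d}$ and the three vectorial components of the predicted data, $\mathsf{E}^T\mathbf{B}\tilde{\mathbf{u}}^{r_e}$,

$$\psi(J) = \frac{\left\|\mathsf{E}^T\mathbf{B}\tilde{\mathbf{u}}^{r_e} - \mathsf{d}\right\|^2}{\left\|\mathsf{d}\right\|^2}. \tag{179}$$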
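The two quality measures of Eqs. (177)–(179) reduce to ratios of quadratic forms, which the following sketch evaluates. The function names and the toy inputs (a random symmetric positive semidefinite matrix standing in for the localization kernel, random coefficient and data vectors) are assumptions made for illustration only; they are not the quantities of the Africa experiments.

```python
import numpy as np

def relative_model_error(u_true, u_est, D):
    """phi = mse / mss from the quadratic forms of Eqs. (175)-(176).
    The common factor 1/a cancels in the ratio."""
    diff = u_true - u_est
    return (diff @ D @ diff) / (u_true @ D @ u_true)

def relative_data_misfit(d_pred, d_obs):
    """psi = ||d_pred - d_obs||^2 / ||d_obs||^2, as in Eqs. (178)-(179)."""
    diff = d_pred - d_obs
    return (diff @ diff) / (d_obs @ d_obs)

# Toy inputs (hypothetical): a symmetric positive semidefinite matrix
# plays the role of the localization kernel.
rng = np.random.default_rng(1)
n = 16
M = rng.standard_normal((n, n))
D = M.T @ M / n
u_true = rng.standard_normal(n)
u_est = u_true + 0.1 * rng.standard_normal(n)
d_obs = rng.standard_normal(50)
d_pred = d_obs + 0.05 * rng.standard_normal(50)

phi = relative_model_error(u_true, u_est, D)
psi = relative_data_misfit(d_pred, d_obs)
print(f"phi = {phi:.4f}, psi = {psi:.4f}")
```

With the zero model as the estimate, the first function returns 1, which is the starting value of every error curve when no basis functions are used.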

7.1 Estimating the Potential Field at the Surface from Radial-Component Data at Satellite Altitude

Figure 5 shows the results from a suite of experiments with noisy scalar data. For generality we omitted a color bar and legend. We used the same linear color scale, normalized to the maximum absolute $V(r_e\hat{\mathbf{r}})$ value, for all three panels on the left side. Blue is positive, red is negative, and all points with absolute value smaller than 1 % of the maximum are left white. The data, shown on the right, are also color-coded in the same color map, but the colors are scaled with respect to the scale of the panels in the left column to account for the reduced data values at satellite altitude.

The true potential field, $V(r_e\hat{\mathbf{r}})$, is displayed in the upper-left panel of Fig. 5, and one realization of the noisy radial-derivative data at altitude, $\mathsf{d}_r$, is shown in the upper-right panel. In the middle-left panel, we plot the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ at Earth’s surface $r_e$, from Eq. (114), with $J = 412$. In the bottom-left panel, we show the absolute value of the difference between the truth and the estimate. The relative mean squared error, following Eq. (177), is 0.126. The Slepian-function truncation level $J = 412$ was chosen based on the numerical experiment shown in Fig. 6. For this value of $J$, the estimated potential field $\tilde{V}(r_e\hat{\mathbf{r}})$ approximates the true potential field $V(r_e\hat{\mathbf{r}})$ very well within Africa, and it has almost no energy outside the region of interest.

In Fig. 6, each of the 64 gray lines labeled $\varphi$ is a curve of $\varphi(J)$, the regional relative mean squared model error calculated as in Eq. (177). The same true signal values $\mathsf{V}'_r$ were used, but every experiment used data $\mathsf{d}_r$, as given by Eq. (106), that were contaminated by a different realization of the noise field $\mathsf{n}_r$, as described at the beginning of this section. Every curve starts at $\varphi(0) = 1$, since without any basis functions only the zero model is obtained.
The relative mse decreases dramatically after about $J = 250$, and the estimation improves as more Slepian functions are involved. As we have explained earlier for the theoretical behavior in Eq. (142), the squared bias term $\beta^2$ diminishes in value with increasing $J$. Less truncation (larger $J$) reduces the estimation bias, but this decrease is in competition with the variance term, which increases with $J$. The influence of data noise is felt more and more with the inclusion of additional basis functions.


Fig. 5 Example of the estimation of a potential field on Earth’s surface from noisy radial-derivative data at satellite altitude $r_s = r_e + 500$ km, using Slepian functions bandlimited to $L = 72$ and spatially concentrated to the target region Africa. The upper-left panel shows the true potential field $V(r_e\hat{\mathbf{r}})$ on Earth’s surface. The upper-right panel shows the 2,217 noisy data $\mathsf{d}_r$ at satellite altitude. The middle-left panel shows the estimated potential field $\tilde{V}(r_e\hat{\mathbf{r}})$ calculated from the data using Eq. (114), with Slepian-function truncation level $J = 412$. The lower-left panel shows the absolute value of the difference $|V(r_e\hat{\mathbf{r}}) - \tilde{V}(r_e\hat{\mathbf{r}})|$ between the true and the estimated potential fields

The turning points of minimum relative mean squared estimation error for each of the experiments are indicated by a gray circle. At the corresponding value $J$, the optimal Slepian truncation level for each specific data set is reached. The average of all of the $\varphi(J)$ curves shown is represented by a black dashed line. All individual turning points are clustered around the average ideal truncation point, which is the $J = 412$ indicated by the black circle.

The relative regional mean squared model errors $\varphi$ do not improve immediately after $J = 1$, unlike the data misfits $\psi$. There is a local minimum, followed by a rise, and a precipitous decline after $J = 250$ or thereabouts. We explain this behavior theoretically by our minimizing the misfit of the upward-transformed potential field at the altitude of the data (see Eq. 121) instead of the misfit on the surface, which is measured by $\varphi$. To obtain the potential field on the surface, we need to downward-transform the radial-field estimate at altitude, obtained by truncation, as shown by Eq. (123). The inverse of the upward-transformation operator $\mathbf{A}$ defined in Eq. (32) is poorly conditioned for high maximum degrees $L$ and large relative satellite altitudes $r_s/r_e$. The interaction between all of these terms produces a complex behavior that nevertheless has a clear global minimum, which leads to a working algorithm and an objective decision as to the optimal Slepian-function truncation level.

[Figure 6: curves of relative mse (vertical axis, 0 to 1) against Slepian truncation J (horizontal axis, 0 to 700), with annotations at J = 412 and at the values 0.14 and 0.025.]

Fig. 6 Relative regional mean squared model errors $\varphi(J)$, from Eq. (177), and relative mean squared data misfit $\psi(J)$, from Eq. (178), for potential-field estimation from radial-derivative data as described in Eq. (114). The true signal is the one shown in Fig. 5. Each of the 64 realizations of noise leads to a gray $\varphi(J)$-curve and a gray $\psi(J)$-curve. The optimal truncation points are indicated by gray circles and the average optimal truncation point by a black circle, and the average $\varphi(J)$ behavior is the black dashed line. The dashed horizontal line is the relative energy of the noise

Because the noise level is relatively small compared to the signal strength, and because we use the same 2,217 data locations, the $\psi$-curves of the data misfit are close together. The relative mean squared data misfit curves $\psi(J)$ in Fig. 6 decrease fast until their values reach the relative energy of the noise, 2.5 %, indicated by the dashed horizontal black line. At this point the relative mean squared data misfit decreases much more slowly, or almost not at all. We recall that the noise is generated in the spatial domain and is therefore not bandlimited. Hence, the noise has appreciable energy in the degrees larger than 72, which cannot be fit by the $L = 72$ bandlimited Slepian functions.
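The selection of the truncation level from a suite of noise realizations amounts to locating the global minimum of each error curve and of their average. The sketch below assumes the curves are already available as an array with one row per experiment; the function name `optimal_truncation` and the synthetic curves (a decaying squared-bias term plus a growing variance term) are illustrative assumptions and do not reproduce the curves of the experiments.

```python
import numpy as np

def optimal_truncation(phi):
    """Given phi with shape (n_experiments, n_levels), where column j holds
    the relative model error at truncation level J = j, return the optimal
    level of each experiment, the optimal level of the mean curve, and the
    mean curve itself."""
    phi = np.asarray(phi, dtype=float)
    j_each = np.argmin(phi, axis=1)        # global minimum of each curve
    mean_curve = phi.mean(axis=0)          # average over all experiments
    j_mean = int(np.argmin(mean_curve))    # average optimal truncation
    return j_each, j_mean, mean_curve

# Synthetic stand-in curves for 64 noise realizations (hypothetical shapes).
rng = np.random.default_rng(2)
J = np.arange(0, 701)
bias2 = np.exp(-J / 100.0)                             # decreases with J
scale = 0.2 * (1.0 + 0.1 * rng.standard_normal(64))    # varies per realization
variance = scale[:, None] * (J / 700.0) ** 2           # increases with J
phi = bias2[None, :] + variance                        # phi(0) = 1 in every row

j_each, j_mean, mean_curve = optimal_truncation(phi)
print("average optimal truncation level:", j_mean)
```

Because `argmin` returns the global minimum, an earlier local minimum of the kind described in the text does not affect the selected level.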

7.2 Estimating the Potential Field at the Surface from Gradient-Vector Data at Satellite Altitude

Figure 7 shows the results from an experiment with noisy vectorial data. Our plot color conventions are unchanged from those in Sect. 7.1, except now the three panels on the right are scaled to the maximum absolute vectorial data value at satellite altitude. The true potential field $V(r_e\hat{\mathbf{r}})$ is found in the upper-left panel of Fig. 7, and the noisy data at altitude $\mathsf{d}$ are shown on the right. The top-right panel shows the radial component $\mathsf{d}_r$, the middle-right panel the tangential colatitudinal component $\mathsf{d}_\theta$, and the lower-right panel the tangential longitudinal component $\mathsf{d}_\phi$.

We use Eq. (151) to calculate an estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ for the potential field on Earth’s surface, choosing the Slepian truncation $J = 472$ based on the numerical experiments shown in Fig. 8. The estimated scalar potential field on Earth’s surface $\tilde{V}(r_e\hat{\mathbf{r}})$ is shown in the middle-left panel of Fig. 7. The lower-left panel of Fig. 7 shows the absolute difference $|V(r_e\hat{\mathbf{r}}) - \tilde{V}(r_e\hat{\mathbf{r}})|$ between the true and the estimated signal. The estimated field $\tilde{V}(r_e\hat{\mathbf{r}})$ approximates the true signal $V(r_e\hat{\mathbf{r}})$ well within Africa and is close to zero outside of that target region. The relative regional mean squared model error calculated using Eq. (177) is 0.057.


Fig. 7 Example of a potential-field estimation on Earth’s surface from noisy gradient data at altitude $r_s = r_e + 500$ km for Slepian functions with maximum degree $L = 72$ and target region Africa. The upper-left panel shows the true potential field $V(r_e\hat{\mathbf{r}})$ on Earth’s surface. The three right panels show the noisy data $\mathsf{d}$ at satellite altitude given by 2,217 data values. The top-right panel depicts the radial component $\mathsf{d}_r$, the middle-right panel the tangential colatitudinal component $\mathsf{d}_\theta$, and the lower-right panel the tangential longitudinal component $\mathsf{d}_\phi$. The middle-left panel shows the estimated potential field $\tilde{V}(r_e\hat{\mathbf{r}})$ calculated from the data with Slepian truncation $J = 472$. The lower-left panel shows the absolute difference $|V(r_e\hat{\mathbf{r}}) - \tilde{V}(r_e\hat{\mathbf{r}})|$ between the true and the estimated potential fields

In Fig. 8, we plot the relative regional mean squared model errors $\varphi(J)$ defined in Eq. (177) as a function of the truncation level $J$, for each of the 64 experiments. Each data set $\mathsf{d}$ is generated from the same true vector field $\mathsf{V}'$ using Eq. (143), but differs by the realization of the noise $\mathsf{n}$, as discussed at the top of this section. Each experiment starts at $\varphi(0) = 1$ and descends from about $J = 250$ into a deep valley with increasing number of Slepian functions. The theoretical relation in Eq. (174) explains how the decreasing bias and increasing variance trade off as a function of the increasing number $J$ of Slepian functions. The turning points are indicated by gray circles; they all cluster around the same truncation value. The average relative regional mean squared model error is shown by a dashed black line, and the average optimal Slepian truncation level $J = 472$ by a black circle.

As in the scalar case, the curves $\varphi(J)$ go through a local minimum before reaching the global optimum truncation level. Indeed, since we minimized Eq. (158) at altitude, in order to obtain the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ at the Earth’s surface, we need to apply the downward-transformation operator $\mathbf{B}$ defined in Eq. (48). At high maximum degrees $L$ and high relative satellite altitudes $r_s/r_e$, this operator is poorly conditioned. The interaction between the various competing effects produces a complex but reproducible error behavior. The 64 curves for the relative mean squared data misfit in Fig. 8 are close together because the signal-to-noise level is high and because we reuse the same 2,217 data locations. As for the scalar case, the relative mean squared data misfit $\psi(J)$ decreases fast until it reaches the relative energy of the noise, 2.5 %, indicated by the dashed horizontal black line.
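The dependence of the conditioning on bandwidth and altitude can be illustrated with a diagonal operator. The sketch below assumes the textbook degree-wise factor for the radial derivative of an exterior harmonic potential, $-(l+1)\,r_e^{-1}(r_e/r_s)^{l+2}$; this form and the function names are assumptions, and the normalization may differ from that of the operators in Eqs. (32) and (48). Repeating each factor for the $2l+1$ orders of a degree would not change the ratio computed here.

```python
import numpy as np

def radial_upward_factors(L, r_e, r_s):
    """Assumed degree-wise factors mapping surface coefficients to those of
    the radial derivative at radius r_s: -(l+1)/r_e * (r_e/r_s)**(l+2)."""
    l = np.arange(L + 1)
    return -(l + 1) / r_e * (r_e / r_s) ** (l + 2)

def diag_condition_number(diag):
    """Condition number of a diagonal matrix: the largest magnitude on the
    diagonal divided by the smallest."""
    mag = np.abs(diag)
    return mag.max() / mag.min()

r_e = 6371.0  # km, mean Earth radius (assumed value)

# Bandwidth and altitude quoted in the figure captions, then two variations.
cond_L72 = diag_condition_number(radial_upward_factors(72, r_e, r_e + 500.0))
cond_L200 = diag_condition_number(radial_upward_factors(200, r_e, r_e + 500.0))
cond_high = diag_condition_number(radial_upward_factors(72, r_e, r_e + 2000.0))

print(f"L = 72,  500 km:  {cond_L72:.3g}")
print(f"L = 200, 500 km:  {cond_L200:.3g}")
print(f"L = 72,  2000 km: {cond_high:.3g}")
```

Under these assumptions the condition number grows both when the maximum degree is raised and when the altitude is increased, because the factor $(r_e/r_s)^{l+2}$ decays geometrically with degree.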


[Figure 8: curves of relative mse (vertical axis, 0 to 1) against Slepian truncation J (horizontal axis, 0 to 700), with annotations at J = 472 and at the values 0.052 and 0.025.]

Fig. 8 Relative regional mean squared model error $\varphi(J)$, from Eq. (177), and relative mean squared data misfit $\psi(J)$, from Eq. (179), for potential-field estimation from vectorial data described in Eq. (151). The true signal is the same as for Fig. 7. Each of the 64 realizations of noise leads to a gray $\varphi(J)$-line and a gray $\psi(J)$-line. The optimal truncation points are indicated by the gray circles, the average optimal truncation point by the black circle, and the average $\varphi(J)$ line by the black dashed line. The dashed horizontal line is the relative energy of the noise

8 Conclusions

We presented two methods to estimate a potential field from gradient data at satellite altitude that are concentrated over a certain region. At the heart of both methods lies the use of spatiospectrally concentrated spherical basis functions. The first method only considered the radial component of the data and used scalar Slepian functions. The second method considered all three vectorial components of the data and used gradient-vector Slepian functions, a special case of vector Slepian functions. From the theoretical analysis of both methods, and through extensive experimentation, we showed how the mean squared reconstruction error depends on the number of Slepian or gradient-vector Slepian functions used for the estimation. The more Slepian functions involved, the smaller the bias but the larger the variance in the presence of noise.

Acknowledgments

A. P. thanks the Ulrich Schmucker Memorial Trust and the Swiss National Science Foundation, the National Science Foundation, and Princeton University for funding and the Smart family of Cape Town for their hospitality while writing this manuscript. This research was sponsored by the US National Science Foundation under grants EAR-1150145 and EAR-1245788 to F.J.S., and by the National Aeronautics & Space Administration under grant NNX14AM29G to A.P. and F.J.S.


Table of Symbols

Symbol | Description | Eq.
$L$ | Spherical-harmonic bandwidth |
$R$ | Target region of data availability and for Slepian-function concentration |
$V(r_e\hat{\mathbf{r}})$ | Three-dimensional potential-field function at Earth’s surface $r_e$ | (9)
$V(r_s\hat{\mathbf{r}})$ | Three-dimensional potential-field function at satellite altitude $r_s$ | (18)
$u_{lm}^{r_e}$ | Expansion coefficients of $V(r_e\hat{\mathbf{r}})$ in the basis of spherical harmonics $Y_{lm}$ | (19)
$u_{lm}^{r_s}$ | Expansion coefficients of $V(r_s\hat{\mathbf{r}})$ in the basis of spherical harmonics $Y_{lm}$ | (27)
$\partial_r V(r_s\hat{\mathbf{r}})$ | Radial derivative of the potential field at satellite altitude $r_s$ | (21)
$\nabla V(r_s\hat{\mathbf{r}})$ | Three-dimensional gradient of the potential field at satellite altitude $r_s$ | (20)
$v_{lm}^{r_s}$ | Expansion coefficients of $\nabla V(r_s\hat{\mathbf{r}})$ in the basis of gradient-vector harmonics $E_{lm}$ | (24)
$\mathbf{u}^{r_a}$ | $(L+1)^2 \times 1$ vector containing the coefficients $u_{lm}^{r_a}$ with $0 \le l \le L$ at radius $r_a$ | (19)
$\hat{\mathbf{u}}^{r_a}$ | Infinite-dimensional vector containing the coefficients $u_{lm}^{r_a}$ with $0 \le l \le \infty$ at radius $r_a$ | (54)
$\hat{\mathbf{u}}_{>L}^{r_a}$ | Infinite-dimensional vector containing the coefficients $u_{lm}^{r_a}$ with $L < l \le \infty$ at radius $r_a$ | (55)
$\mathbf{v}^{r_a}$ | $(L+1)^2 \times 1$ vector containing the coefficients $v_{lm}^{r_a}$ with $0 \le l \le L$ at radius $r_a$ | (24)
$\mathbf{A}$ | $(L+1)^2 \times (L+1)^2$ diagonal matrix transforming the $\mathbf{u}^{r_e}$ to the $Y_{lm}$ coefficients of $\partial_r V(r_s\hat{\mathbf{r}})$ | (32)
$\hat{\mathbf{A}}$ | Infinite-dimensional diagonal matrix transforming the $\hat{\mathbf{u}}^{r_e}$ to the $Y_{lm}$ coefficients of $\partial_r V(r_s\hat{\mathbf{r}})$ | (57)
$\hat{\mathbf{A}}_{>L}$ | Infinite-dimensional diagonal matrix transforming the $\hat{\mathbf{u}}_{>L}^{r_e}$ to the $Y_{lm}$ coefficients of $\partial_r V(r_s\hat{\mathbf{r}})$ | (57)
$\mathbf{B}$ | $(L+1)^2 \times (L+1)^2$ diagonal matrix transforming the $\mathbf{u}^{r_e}$ to the $E_{lm}$ coefficients of $\nabla V(r_s\hat{\mathbf{r}})$ | (48)
$\hat{\mathbf{B}}$ | Infinite-dimensional diagonal matrix transforming the $\hat{\mathbf{u}}^{r_e}$ to the $E_{lm}$ coefficients of $\nabla V(r_s\hat{\mathbf{r}})$ | (66)
$\hat{\mathbf{B}}_{>L}$ | Infinite-dimensional diagonal matrix transforming the $\hat{\mathbf{u}}_{>L}^{r_e}$ to the $E_{lm}$ coefficients of $\nabla V(r_s\hat{\mathbf{r}})$ | (66)
$Y_{lm}$ | Scalar spherical-harmonic function for degree $l$ and order $m$ | (1)
$\mathcal{Y}$ | Vector of all $(L+1)^2$ scalar spherical-harmonic functions to degree $L$ | (51)
$\mathsf{Y}$ | $(L+1)^2 \times k$ matrix of $Y_{lm}$ with bandwidth $L$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (28)
$\hat{\mathcal{Y}}$ | Vector of all scalar spherical-harmonic functions to degree $\infty$ | (52)
$\hat{\mathcal{Y}}_{>L}$ | Vector of all scalar spherical-harmonic functions for degrees $L < l \le \infty$ | (52)
$E_{lm}$ | Gradient-vector spherical-harmonic function for degree $l$ and order $m$ | (16)
$\mathcal{E}$ | $(L+1)^2 \times 1$ vector of all $E_{lm}$ up to degree $L$ | (58)
$\mathsf{E}$ | $(L+1)^2 \times 3k$ matrix of all of the $E_{lm}$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (42)
$\hat{\mathcal{E}}$ | Vector of all gradient-vector spherical harmonics up to degree $\infty$ | (59)
$\hat{\mathcal{E}}_{>L}$ | Vector of all gradient-vector spherical harmonics for degrees $L < l \le \infty$ | (59)
$G_\alpha$ | $\alpha$th best spatially concentrated (within $R$) bandlimited (to $L$) scalar spherical Slepian function | (73)
$\hat{G}_\alpha$ | $\alpha$th best spectrally concentrated (within $L$) spacelimited (to $R$) scalar spherical Slepian function | (81)
$\mathbf{g}_\alpha$ | $(L+1)^2 \times 1$ vector containing the $Y_{lm}$ coefficients of one of the $G_\alpha$ | (73)
$\hat{\mathbf{g}}_\alpha$ | Infinite-dimensional vector containing the $Y_{lm}$ coefficients of one of the $\hat{G}_\alpha$ | (82)
$\mathcal{G}$ | $(L+1)^2 \times 1$ vector containing all of the $G_\alpha$ ordered with decreasing concentration ratio $\lambda_\alpha$ | (75)
$\mathcal{G}_J$ | $J \times 1$ vector of functions containing the $G_1, \ldots, G_J$ | (78)
$\mathcal{G}_{\downarrow J}$ | $J \times 1$ vector of localized downward-transformed scalar Slepian functions | (115)
$\mathcal{G}_{\uparrow J}$ | $J \times 1$ vector of localized upward-transformed scalar Slepian functions | (116)
$\mathcal{G}_{\downarrow >J}$ | $\left((L+1)^2 - J\right) \times 1$ vector complementing $\mathcal{G}_{\downarrow J}$ | (115)
$\mathcal{G}_{\uparrow >J}$ | $\left((L+1)^2 - J\right) \times 1$ vector complementing $\mathcal{G}_{\uparrow J}$ | (116)
$\mathsf{G}$ | $(L+1)^2 \times k$ matrix of all of the $G_\alpha$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (107)
$\mathsf{G}_J$ | $J \times k$ matrix of $G_1, \ldots, G_J$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (110)
$\mathbf{G}$ | $(L+1)^2 \times (L+1)^2$ matrix containing the $Y_{lm}$ coefficients for all of the $G_\alpha$ | (71)
$\mathbf{G}_J$ | $(L+1)^2 \times J$ matrix containing the $Y_{lm}$ coefficients for the $G_1, \ldots, G_J$ | (77)
$\lambda_\alpha$ | Energy concentration ratio of $G_\alpha$ | (68)
$\boldsymbol{\Lambda}$ | $(L+1)^2 \times (L+1)^2$ diagonal matrix containing all of the $\lambda_\alpha$ | (72)
$\boldsymbol{\Lambda}_J$ | $J \times J$ diagonal matrix containing the $J$ largest $\lambda_1, \ldots, \lambda_J$ | (79)
$\mathbf{D}$ | $(L+1)^2 \times (L+1)^2$ localization matrix diagonalized by $\mathbf{G}$ | (70)
$\hat{\mathbf{D}}_L$ | $\infty \times (L+1)^2$ matrix extending $\mathbf{D}$ to contain the inner products of $\hat{\mathcal{Y}}$ and $\mathcal{Y}$ | (84)
$\hat{\mathbf{D}}_{>L,L}$ | $\infty \times (L+1)^2$ matrix containing the portion of $\hat{\mathbf{D}}_L$ for degrees $l > L$ | (85)
$\hat{G}_{>L,\alpha}$ | Scalar function made from the degrees $l > L$ of $\hat{G}_\alpha$ | (88)
$\hat{\mathbf{g}}_{>L,\alpha}$ | Infinite-dimensional vector containing the $l > L$ entries of $\hat{\mathbf{g}}_\alpha$ | (86)
$\hat{\mathcal{G}}_{>L,J}$ | $J \times 1$ vector of functions containing the first $J$ of the $\hat{G}_{>L,\alpha}$ | (89)
$\hat{\mathbf{G}}_{>L,J}$ | $\infty \times J$ matrix containing the $Y_{lm}$ coefficients for $l > L$ of the $\hat{G}_\alpha$ | (87)
$H_\alpha$ | $\alpha$th best-concentrated gradient-vector Slepian function for bandwidth $L$ and region $R$ | (96)
$\mathbf{h}_\alpha$ | $(L+1)^2 \times 1$ vector containing the $E_{lm}$ coefficients of one of the $H_\alpha$ | (96)
$\mathcal{H}$ | $(L+1)^2 \times 1$ vector containing all of the $H_\alpha$ ordered with decreasing concentration ratio $\sigma_\alpha$ | (98)
$\mathcal{H}_J$ | $J \times 1$ vector of functions containing the $H_1, \ldots, H_J$ | (101)
$\mathcal{H}_{\downarrow J}$ | $J \times 1$ vector of scalar-valued downward-transformed gradient-vector Slepian functions | (152)
$\mathcal{H}_{\uparrow J}$ | $J \times 1$ vector of scalar-valued upward-transformed gradient-vector Slepian functions | (153)
$\mathcal{H}_{\downarrow >J}$ | $\left((L+1)^2 - J\right) \times 1$ vector complementing $\mathcal{H}_{\downarrow J}$ | (152)
$\mathcal{H}_{\uparrow >J}$ | $\left((L+1)^2 - J\right) \times 1$ vector complementing $\mathcal{H}_{\uparrow J}$ | (153)
$\mathsf{H}$ | $(L+1)^2 \times 3k$ matrix of all of the $H_\alpha$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (144)
$\mathsf{H}_J$ | $J \times 3k$ matrix of $H_1, \ldots, H_J$ evaluated at $\hat{\mathbf{r}}_1, \ldots, \hat{\mathbf{r}}_k$ | (147)
$\mathbf{H}$ | $(L+1)^2 \times (L+1)^2$ matrix containing the $E_{lm}$ coefficients for all of the $H_\alpha$ | (94)
$\mathbf{H}_J$ | $(L+1)^2 \times J$ matrix containing the $E_{lm}$ coefficients for the $H_1, \ldots, H_J$ | (100)
$\sigma_\alpha$ | Energy concentration ratio of $H_\alpha$ over $R$ | (91)
$\boldsymbol{\Sigma}$ | $(L+1)^2 \times (L+1)^2$ diagonal matrix containing all of the $\sigma_\alpha$ | (95)
$\boldsymbol{\Sigma}_J$ | $J \times J$ diagonal matrix containing the $J$ largest $\sigma_1, \ldots, \sigma_J$ | (102)
$\mathbf{K}$ | $(L+1)^2 \times (L+1)^2$ localization matrix diagonalized by $\mathbf{H}$ | (93)
$\hat{\mathbf{K}}_{>L,L}$ | $\infty \times (L+1)^2$ matrix containing the inner products of $\hat{\mathcal{E}}_{>L}$ with $\mathcal{E}$ | (104)
$\hat{\mathcal{H}}_{E,>L,J}$ | $J \times 1$ vector of functions containing the $\hat{\mathcal{E}}_{>L}$ components of the spacelimited $H$ | (105)
$d(\hat{\mathbf{r}})$ | Scalar data function at satellite altitude $r_s$ | (119)
$\mathbf{d}(\hat{\mathbf{r}})$ | Gradient data function at satellite altitude $r_s$ | (156)
$\mathsf{d}_r$ | $k \times 1$ vector of measured radial data values at satellite altitude $r_s$ | (106)
$\mathsf{d}$ | $3k \times 1$ vector of measured gradient data values at satellite altitude $r_s$ | (143)
$n(\hat{\mathbf{r}})$ | Scalar noise function at satellite altitude $r_s$ | (119)
$\mathbf{n}(\hat{\mathbf{r}})$ | Vectorial noise function at satellite altitude $r_s$ | (156)
$\mathsf{n}_r$ | $k \times 1$ vector of radial-derivative noise at satellite altitude $r_s$ | (106)
$\mathsf{n}$ | $3k \times 1$ vector of vectorial noise at satellite altitude $r_s$ | (143)
$\mathsf{V}$ | $k \times 1$ vector containing the potential-field signal points $V(r_s\hat{\mathbf{r}}_1), \ldots, V(r_s\hat{\mathbf{r}}_k)$ | (26)
$\mathsf{V}'_r$ | $k \times 1$ vector containing the radial-derivative signal points $\partial_r V(r_s\hat{\mathbf{r}}_1), \ldots, \partial_r V(r_s\hat{\mathbf{r}}_k)$ | (33)
$\mathsf{V}'$ | $3k \times 1$ vector containing the full gradient signal points $\mathsf{V}'_r$, $\mathsf{V}'_\theta$, and $\mathsf{V}'_\phi$ at satellite altitude | (38)
$\tilde{V}(r_e\hat{\mathbf{r}})$ | Potential field at the Earth’s surface estimated from the radial-derivative data at altitude | (114)
$\tilde{V}(r_e\hat{\mathbf{r}})$ | Potential field at the Earth’s surface estimated from the full gradient data at altitude | (151)
$\tilde{\mathbf{s}}_J^{r_s}$ | $J \times 1$ vector of $G_\alpha$ coefficients of $V(r_s\hat{\mathbf{r}})$ estimated from the scalar data $\mathsf{d}_r$ | (112)
$\tilde{\mathbf{t}}_J^{r_s}$ | $J \times 1$ vector of $H_\alpha$ coefficients of $\nabla V(r_s\hat{\mathbf{r}})$ estimated from the vector data $\mathsf{d}$ | (149)
$\tilde{\mathbf{u}}^{r_e}$ | $(L+1)^2 \times 1$ vector of $Y_{lm}$ coefficients of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ derived from the $\tilde{\mathbf{s}}_J^{r_s}$ | (113)
$\tilde{\mathbf{u}}^{r_e}$ | $(L+1)^2 \times 1$ vector of $Y_{lm}$ coefficients of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ derived from the $\tilde{\mathbf{t}}_J^{r_s}$ | (150)
$\nu$ | Variance of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the scalar data $d(\hat{\mathbf{r}})$ in truncated Slepian estimation | (140)
$\nu$ | Variance of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the vector data $\mathbf{d}(\hat{\mathbf{r}})$ in truncated Slepian estimation | (172)
$\beta$ | Bias of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the scalar data $d(\hat{\mathbf{r}})$ in truncated Slepian estimation | (137)
$\beta$ | Bias of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the vector data $\mathbf{d}(\hat{\mathbf{r}})$ in truncated Slepian estimation | (170)
$\langle\epsilon^2\rangle$ | Mean squared error of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the scalar data $d(\hat{\mathbf{r}})$ | (142)
$\langle\epsilon^2\rangle$ | Mean squared error of the estimate $\tilde{V}(r_e\hat{\mathbf{r}})$ from the vector data $\mathbf{d}(\hat{\mathbf{r}})$ | (174)
$\varphi(J)$ | Relative regional mean squared model error between $\tilde{V}(r_e\hat{\mathbf{r}})$ and $V(r_e\hat{\mathbf{r}})$ | (177)
$\psi(J)$ | Relative mean squared data misfit between $\mathsf{d}_r$ and $\mathsf{Y}^T\mathbf{A}\tilde{\mathbf{u}}^{r_e}$ for the scalar case | (178)
$\psi(J)$ | Relative mean squared data misfit between $\mathsf{d}$ and $\mathsf{E}^T\mathbf{B}\tilde{\mathbf{u}}^{r_e}$ for the vector case | (179)

References

Albertella A, Sansò F, Sneeuw N (1999) Band-limited functions on a bounded spherical domain: the Slepian problem on the sphere. J Geodesy 73:436–447
Albertella A, Savcenko R, Bosch W, Rummel R (2008) Dynamic ocean topography – the geodetic approach, Technical report 27, Institut für Astronomische und Physikalische Geodäsie, Forschungseinrichtung Satellitengeodäsie, München
Arkani-Hamed J (2001) A 50-degree spherical harmonic model of the magnetic field of Mars. J Geophys Res 106(E10):23197–23208. doi:10.1029/2000JE001365
Arkani-Hamed J (2002) An improved 50-degree spherical harmonic model of the magnetic field of Mars derived from both high-altitude and low-altitude data. J Geophys Res 107(E10):5083. doi:10.1029/2001JE001835
Arkani-Hamed J (2004) A coherent model of the crustal magnetic field of Mars. J Geophys Res 109:E09005. doi:10.1029/2004JE002265
Arkani-Hamed J, Strangway DW (1986) Band-limited global scalar magnetic anomaly map of the Earth derived from Magsat data. J Geophys Res 91(B8):8193–8203
Backus GE, Parker RL, Constable CG (1996) Foundations of geomagnetism. Cambridge University Press, Cambridge
Beggan CD, Saarimäki J, Whaler KA, Simons FJ (2013) Spectral and spatial decomposition of lithospheric magnetic field models using spherical Slepian functions. Geophys J Int 193(1):136–148. doi:10.1093/gji/ggs122
Blakely RJ (1995) Potential theory in gravity and magnetic applications. Cambridge University Press, New York
Bölling K, Grafarend EW (2005) Ellipsoidal spectral properties of the Earth’s gravitational potential and its first and second derivatives. J Geodesy 79(6–7):300–330. doi:10.1007/s00190-005-0465-y
Chambodut A, Panet I, Mandea M, Diament M, Holschneider M, Jamet O (2005) Wavelet frames: an alternative to spherical harmonic representation of potential fields. Geophys J Int 163(3):875–899
Cox DR, Hinkley DV (1974) Theoretical statistics. Chapman and Hall, London
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton
Davison AC (2003) Statistical models. Cambridge University Press, Cambridge


de Santis A (1991) Translated origin spherical cap harmonic analysis. Geophys J Int 106:253–263
Eshagh M (2009) Comparison of two approaches for considering laterally varying density in topographic effect on satellite gravity gradiometric data. Acta Geophysica. doi:10.2478/s11600-009-0057-y
Fengler MJ, Freeden W, Kohlhaas A, Michel V, Peters T (2007) Wavelet modeling of regional and temporal variations of the earth’s gravitational potential observed by GRACE. J Geodesy 81(1):5–15. doi:10.1007/s00190-006-0040-1
Freeden W, Michel V (2004) Multiscale potential theory. Birkhäuser, Boston
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Berlin
Gubbins D, Ivers D, Masterton SM, Winch DE (2011) Analysis of lithospheric magnetization in vector spherical harmonics. Geophys J Int 187:99–117. doi:10.1111/j.1365-246X.2011.05153.x
Haines GV (1985) Spherical cap harmonic analysis. J Geophys Res 90(B3):2583–2591
Harig C, Simons FJ (2012) Mapping Greenland’s mass loss in space and time. Proc Natl Acad Sci 109(49):19934–19937. doi:10.1073/pnas.1206785109
Hwang C (1993) Spectral analysis using orthonormal functions with a case study on sea surface topography. Geophys J Int 115:1148–1160
Hwang C, Chen S-K (1997) Fully normalized spherical cap harmonics: Application to the analysis of sea-level data from TOPEX/POSEIDON and ERS-1. Geophys J Int 129:450–460
Jahn K, Bokor N (2012) Vector Slepian basis functions with optimal energy concentration in high numerical aperture focusing. Opt Commun 285:2028–2038. doi:10.1016/j.optcom.2011.11.107
Jahn K, Bokor N (2014) Revisiting the concentration problem of vector fields within a spherical cap: A commuting differential operator solution. J Fourier Anal Appl 20:421–451. doi:10.1007/s00041-014-9324-7
Kaula WM (1967) Theory of statistical analysis of data distributed over a sphere. Rev Geophys 5(1):83–107
Kennedy RA, Sadeghi P (2013) Hilbert space methods in signal processing. Cambridge University Press, Cambridge
Korte M, Holme R (2003) Regularization of spherical cap harmonics. Geophys J Int 153:253–262. doi:10.1046/j.1365-246X.2003.01898.x
Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: The satellite perspective. Cambridge University Press, Cambridge
Lewis KW, Simons FJ (2012) Local spectral variability and the origin of the Martian crustal magnetic field. Geophys Res Lett 39:L18201. doi:10.1029/2012GL052708
Lowes FJ, Winch DE (2012) Orthogonality of harmonic potentials and fields in spheroidal and ellipsoidal coordinates: application to geomagnetism and geodesy. Geophys J Int 191(2):491–507. doi:10.1111/j.1365-246X.2012.05590.x
Lowes FJ, de Santis A, Duka B (1995) A discussion of the uniqueness of a Laplacian potential when given only partial field information on a sphere. Geophys J Int 121(2):579–584
Mallat S (2008) A wavelet tour of signal processing, the sparse way, 3rd edn. Academic, San Diego
Maus S (2010) An ellipsoidal harmonic representation of Earth’s lithospheric magnetic field to degree and order 720. Geochem Geophys Geosys 11(6):Q06015. doi:10.1029/2010GC003026
Maus S, Lühr H, Purucker M (2006a) Simulation of the high-degree lithospheric field recovery for the Swarm constellation of satellites. Earth Planets Space 58:397–407


Maus S, Rother M, Hemant K, Stolle C, Lühr H, Kuvshinov A, Olsen N (2006b) Earth’s lithospheric magnetic field determined to spherical harmonic degree 90 from CHAMP satellite measurements. Geophys J Int 164:319–330. doi:10.1111/j.1365-246X.2005.02833.x
Maus S, Rother M, Stolle C, Mai W, Choi S, Lühr H, Cooke D, Roth C (2006c) Third generation of the Potsdam magnetic model of the earth (POMME). Geochem Geophys Geosys 7:Q07008. doi:10.1029/2006GC001269
Mayer C, Maier T (2006) Separating inner and outer Earth’s magnetic field from CHAMP satellite measurements by means of vector scaling functions and wavelets. Geophys J Int 167:1188–1203. doi:10.1111/j.1365-246X.2006.03199.x
Moritz H (2010) Classical physical geodesy. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, chap 6, pp 127–158. Springer, Heidelberg. doi:10.1007/978-3-642-01546-5_6
Nutz H (2002) A unified setup of gravitational field observables. Ph.D. thesis, University Kaiserslautern, Germany
O’Brien MS, Parker RL (1994) Regularized geomagnetic field modelling using monopoles. Geophys J Int 118(3):566–578. doi:10.1111/j.1365-246X.1994.tb03985.x
Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2009) CHAOS-2 – a geomagnetic field model derived from one decade of continuous satellite data. Geophys J Int 179:1477–1487. doi:10.1111/j.1365-246X.2009.04386.x
Olsen N, Hulot G, Sabaka TJ (2010) Sources of the geomagnetic field and the modern data that enable their investigation. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of Geomathematics, chap 5, pp 105–124. Springer, Heidelberg. doi:10.1007/978-3-642-01546-5_5
Plattner A, Simons FJ (2013) A spatiospectral localization approach for analyzing and representing vector-valued functions on spherical surfaces. In: Van de Ville D, Goyal VK, Papadakis M (eds) Wavelets and sparsity XV. SPIE, vol 8858, pp 88580N. doi:10.1117/12.2024703
Plattner A, Simons FJ (2014) Spatiospectral concentration of vector fields on a sphere. Appl Comput Harmon Anal 36:1–22. doi:10.1016/j.acha.2012.12.001
Plattner A, Simons FJ, Wei L (2012) Analysis of real vector fields on the sphere using Slepian functions. In: 2012 IEEE statistical signal processing workshop (SSP’12), Ann Arbor
Rowlands DD, Luthcke SB, Klosko SM, Lemoine FGR, Chinn DS, McCarthy JJ, Cox CM, Anderson OB (2005) Resolving mass flux at high spatial and temporal resolution using GRACE intersatellite measurements. Geophys Res Lett 32:L04310. doi:10.1029/2004GL021908
Rummel R, van Gelderen M (1995) Meissl scheme – spectral characteristics of physical geodesy. Manuscr Geod 20(5):379–385
Sabaka TJ, Hulot G, Olsen N (2010) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of Geomathematics, chap 17, pp 503–538. Springer, Heidelberg. doi:10.1007/978-3-642-01546-5_17
Schachtschneider R, Holschneider M, Mandea M (2010) Error distribution in regional inversion of potential field data. Geophys J Int 181:1428–1440. doi:10.1111/j.1365-246X.2010.04598.x
Schachtschneider R, Holschneider M, Mandea M (2012) Error distribution in regional modelling of the geomagnetic field. Geophys J Int 191:1015–1024. doi:10.1111/j.1365-246X.2012.05675.x
Simons FJ, Dahlen FA (2006) Spherical Slepian functions and the polar gap in geodesy. Geophys J Int 166:1039–1061. doi:10.1111/j.1365-246X.2006.03065.x
Simons FJ, Dahlen FA, Wieczorek MA (2006) Spatiospectral concentration on a sphere. SIAM Rev 48(3):504–536. doi:10.1137/S0036144504445765


Simons FJ, Hawthorne JC, Beggan CD (2009) Efficient analysis and representation of geophysical processes using localized spherical basis functions. In: Goyal VK, Papadakis M, Van de Ville D (eds) Wavelets XIII, vol 7446, pp 74460G. SPIE. doi:10.1117/12.825730
Slepian D (1964) Prolate spheroidal wave functions, Fourier analysis and uncertainty – IV: extensions to many dimensions; generalized prolate spheroidal functions. Bell Syst Tech J 43(6):3009–3057
Slepian D (1983) Some comments on Fourier analysis, uncertainty and modeling. SIAM Rev 25(3):379–393
Slepian D, Pollak HO (1961) Prolate spheroidal wave functions, Fourier analysis and uncertainty – I. Bell Syst Tech J 40(1):43–63
Slobbe DC, Simons FJ, Klees R (2012) The spherical Slepian basis as a means to obtain spectral consistency between mean sea level and the geoid. J Geodesy 86(8):609–628. doi:10.1007/s00190-012-0543-x
Thébault E, Schott JJ, Mandea M (2006) Revised spherical cap harmonic analysis (R-SCHA): validation and properties. J Geophys Res 111(B1):B01102. doi:10.1029/2005JB003836
Trampert J, Snieder R (1996) Model estimations biased by truncated expansions: Possible artifacts in seismic tomography. Science 271(5253):1257–1260. doi:10.1126/science.271.5253.1257
Whaler KA, Gubbins D (1981) Spherical harmonic analysis of the geomagnetic field: an example of a linear inverse problem. Geophys J Int 65(3):645–693. doi:10.1111/j.1365-246X.1981.tb04877.x
Wieczorek MA, Simons FJ (2007) Minimum-variance spectral analysis on the sphere. J Fourier Anal Appl 13(6):665–692. doi:10.1007/s00041-006-6904-1
Xu P (1992) Determination of surface gravity anomalies using gradiometric observables. Geophys J Int 110:321–332
Xu P (1992) The value of minimum norm estimation of geopotential fields. Geophys J Int 111:170–178
Xu P (1998) Truncated SVD methods for discrete linear ill-posed problems. Geophys J Int 135(2):505–514. doi:10.1046/j.1365-246X.1998.00652.x


Handbook of Geomathematics DOI 10.1007/978-3-642-27793-1_65-3 © Springer-Verlag Berlin Heidelberg 2014

Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery Ignace Loris Université libre de Bruxelles, Bruxelles, Belgium

Abstract Inverse problems in seismic tomography are often cast in the form of an optimization problem involving a cost function composed of a data misfit term and regularizing constraint or penalty. Depending on the noise model that is assumed to underlie the data acquisition, these optimization problems may be non-smooth. Another source of lack of smoothness (differentiability) of the cost function may arise from the regularization method chosen to handle the ill-posed nature of the inverse problem. A numerical algorithm that is well suited to handle minimization problems involving two non-smooth convex functions and two linear operators is studied. The emphasis lies on the use of some simple proximity operators that allow for the iterative solution of non-smooth convex optimization problems. Explicit formulas for several of these proximity operators are given and their application to seismic tomography is demonstrated.

1 Introduction

Global seismic tomography deals with the determination of the Earth’s inner structure based on the measurement of earthquake data (Nolet 2008). Inverse problems in global seismic tomography are typically plagued by a lack of measurement data that would allow for a unique determination of the seismic wave-speed anomaly in the Earth’s mantle. In addition, available data is contaminated by measurement noise. Assuming a linear relationship between wave-speed anomaly $u$ and measurement data $y$, seismic recovery can be written as an ill-posed linear problem $Ku = y$. Its solution may not exist or may not be unique; the singular value spectrum of the measurement operator $K$, in combination with the noise, may make recovery unstable.

Inverse problems of this type are often regularized by changing them into minimization problems involving two parts. The first part measures the difference between observed (noisy) data $y$ and predicted data $Ku$, while the second part plays the role of regularizer. Regularization may be achieved by penalizing “large” solutions (Tikhonov regularization (Tikhonov 1963)) or by constraining the solution to lie in a bounded set (Ivanov regularization (Ivanov 1976)).

In this tutorial paper, a number of algorithms applicable to optimization problems of this kind are studied. Let $f(Ku)$ represent the data misfit term, measuring the goodness of fit between predicted data $Ku$ and experimental data $y$. We will assume that the function $f$ is a “simple” nonnegative convex function. In this context, “simple” should be understood as $f$ possessing a proximity operator that can be computed easily. Similarly, we will assume that the penalty function has the form $g(Au)$, where $A$ is a linear operator and $g$ is a simple convex nonnegative function (possessing an easy-to-compute proximity operator). The function $g$ thus serves to impose a penalty or a constraint on the set of possible models with the same data misfit $f$. Having introduced the two matrices $K$ and $A$, and the corresponding functions $f$ and $g$, the problem we wish to solve is

$$\arg\min_u \; f(Ku) + g(Au). \tag{1}$$

Whereas the function $f$ and the matrix $K$ are dictated by the experimental setup, the function $g$ (and the matrix $A$) is chosen by the scientist to impose desirable properties on the reconstruction $u$.

In case the data is contaminated by Gaussian noise, the function $f(Ku)$ is just the usual least squares function $\|y - Ku\|_2^2/2$. However, if the data is affected by outliers in the noise, a more robust data misfit term may be needed. One well-known example is the sum of absolute deviations $\|y - Ku\|_1$, which puts less weight on large deviations than ordinary least squares and which is therefore less sensitive to outliers. Early use in seismic tomography is found in Claerbout and Muir (1973), Taylor et al. (1979), and Santosa and Symes (1986). Yet another noise model could assume a uniform distribution of the noise, whereby one would need to treat the minimization of the function $\|y - Ku\|_\infty$. Clearly, the $\ell_1$-norm and the $\ell_\infty$-norm provide two examples of non-smooth (non-differentiable) functions. Moreover, a single problem may contain a combination of different noise models.

Even if Gaussian noise is assumed, non-differentiability of the cost function in minimization problem (1) may be introduced through the regularization term $g$. Traditionally, this has often just been the $\ell_2$-norm squared $\|u\|_2^2$ of the unknown $u$ (for imposing a limit on the size of $u$) or the $\ell_2$-norm squared of the gradient of $u$ (for imposing some smoothness on the seismic wave speed $u$). Recently (Daubechies et al. 2004; Bruckstein et al. 2009), the use of the $\ell_1$-norm as a way of imposing certain a priori information on the solution of linear inverse problems has gained popularity. One example consists of using the $\ell_1$-norm to impose sparsity on $u$ or, more precisely, on a linear transformation $Au$ of the vector $u$. Another example is the so-called total variation (Rudin et al. 1992) regularization, where the $\ell_1$-norm of the local gradient is used to impose small local variations, while allowing for sharper discontinuities than those admitted by the use of the $\ell_2$-norm squared of the local gradient. Applications of these techniques in seismic recovery are, e.g., discussed in Loris et al. (2007), Herrmann et al. (2008), Herrmann and Hennenfent (2008), Loris et al. (2010), and Gholami and Siahkoohi (2010).

The goal of this paper is to provide an introduction and guide to the use of proximity operator-based algorithms for convex minimization problems of type (1). Such proximity operators and their connection to optimization are discussed in section “Basic Concepts.” A small number of iterative optimization algorithms will be discussed in section “Iterative Algorithms for Convex Optimization.” The aim of this paper is not to compare different algorithms and their speed of convergence. Indeed, many iterative algorithms can be written for solving the same problem (see, e.g., Esser 2010; Esser et al. 2010). For the same reason, we will not systematically formulate the most general version of each algorithm but restrict ourselves to one that is of sufficient practical use.

Problem (1) is symmetric with respect to interchanging $f$ and $g$ (and $K$ and $A$). However, in the proposed algorithm, we will treat the two functions $f$ and $g$ (and the two operators $K$ and $A$) in a slightly different manner. The advantage of this is that the conditions on $K$ and $A$ for convergence of the iterative algorithm are uncoupled. We shall argue that this is the more natural thing to do as the operator $K$ is fixed by the physics of the data collection, whereas the operator $A$ is determined by assumptions underlying the regularization.

The proposed iterative algorithms are applied to linear inverse problems in section “Application to Linear Inverse Problems,” assuming different noise models (e.g., robust and uniform noise models). Section “Example” describes a synthetic inverse problem in global seismic tomography that demonstrates the use of these algorithms and noise models. In particular, the total variation regularization (on an irregular triangular or tetrahedral grid) of a global seismic tomographic problem is demonstrated.
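As a small illustration of the three noise models mentioned in this introduction, the sketch below evaluates the least-squares, absolute-deviation, and maximum-deviation misfits on a made-up residual vector, once without and once with a single gross outlier. The function names and the numbers are assumptions chosen for illustration only; they do not come from the seismic example of this chapter.

```python
import numpy as np

def misfit_l2(residual):
    """Least-squares misfit ||r||_2^2 / 2 (Gaussian noise model)."""
    return 0.5 * np.sum(residual ** 2)

def misfit_l1(residual):
    """Sum of absolute deviations ||r||_1 (robust noise model)."""
    return np.sum(np.abs(residual))

def misfit_linf(residual):
    """Largest absolute deviation ||r||_inf (uniform noise model)."""
    return np.max(np.abs(residual))

# Made-up residuals y - K u, and the same residuals plus one gross outlier.
residual = np.array([0.1, -0.2, 0.1, 0.0])
with_outlier = np.append(residual, 10.0)

for name, misfit in [("l2", misfit_l2), ("l1", misfit_l1), ("linf", misfit_linf)]:
    print(name, misfit(residual), misfit(with_outlier))
```

The outlier enters the least-squares misfit quadratically but the sum of absolute deviations only linearly, which is the sense in which the latter puts less weight on large deviations.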

2 Basic Concepts

In this section, a number of basic concepts of convex analysis are introduced. The most important one is that of the proximity operator, which is a generalization of the projection on a convex set. It will form the basis of the algorithms discussed in sections “Iterative Algorithms for Convex Optimization” and “Application to Linear Inverse Problems.” The emphasis is on presenting relevant examples explicitly, rather than giving proofs in full detail. An introduction to convex optimization may be found in Rockafellar (1997) and Boyd and Vandenberghe (2004).

2.1 Convex Functions and Their Subdifferentials

As is well known, a set $C \subset \mathbb{R}^d$ is said to be convex if

$$u, v \in C \;\Rightarrow\; \lambda u + (1-\lambda) v \in C \tag{2}$$

for all $\lambda \in [0,1]$. A function $f: \mathbb{R}^d \to \bar{\mathbb{R}}$ is said to be convex if

$$f(\lambda u + (1-\lambda) v) \le \lambda f(u) + (1-\lambda) f(v) \tag{3}$$

for all points $u, v \in \mathbb{R}^d$ and for $\lambda \in [0,1]$. The convex functions that we will be mostly interested in are expressed in terms of the $\ell_1$-norm, the $\ell_2$-norm (or Euclidean norm), and the $\ell_\infty$-norm (or max-norm). They are defined as

$$\|u\|_1 = \sum_i |u_i|, \qquad \|u\|_2 = \Big(\sum_i |u_i|^2\Big)^{1/2} \qquad\text{and}\qquad \|u\|_\infty = \max_i |u_i| \tag{4}$$

for any $u \in \mathbb{R}^d$. In some applications, it makes sense to use a mixed norm of type $\|u\|_{1,2} = \sum_i \sqrt{|u_{i,1}|^2 + |u_{i,2}|^2}$ for a vector $u \in \mathbb{R}^{2d}$. This will be the case for the example of section “Example.” To each of these three convex functions corresponds a ball of radius $R$ around the origin:

$$B_R^{(p)} = \{u \mid \|u\|_p \le R\}, \qquad\text{for}\quad p = 1, 2, \infty. \tag{5}$$
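The norms of formula (4), the mixed norm, and the membership test for the balls (5) can be sketched in a few lines of NumPy. The function names are hypothetical, and storing a vector of $\mathbb{R}^{2d}$ as an array of shape `(d, 2)` is an assumption made here for convenience.

```python
import numpy as np

def norm_p(u, p):
    """l_p norm of formula (4) for p = 1, 2 or np.inf."""
    return np.linalg.norm(u, ord=p)

def norm_mixed_12(u_pairs):
    """Mixed norm: sum over i of sqrt(u_{i,1}^2 + u_{i,2}^2); input has shape (d, 2)."""
    return np.sum(np.sqrt(np.sum(u_pairs ** 2, axis=1)))

def in_ball(u, R, p):
    """Membership test for the ball B_R^(p) of formula (5)."""
    return norm_p(u, p) <= R

u = np.array([3.0, -4.0])
norms = {p: norm_p(u, p) for p in (1, 2, np.inf)}      # 7, 5 and 4
pairs = np.array([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]])
mixed = norm_mixed_12(pairs)                           # 5 + 0 + 13
```

For the same radius the three balls are nested: a vector in the $\ell_1$-ball also lies in the $\ell_2$-ball, which in turn lies in the $\ell_\infty$-ball.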

The indicator functions of these convex sets,

$$i_{B_R^{(p)}}(u) = \begin{cases} 0 & \|u\|_p \le R \\ +\infty & \|u\|_p > R \end{cases} \tag{6}$$

($p = 1, 2, \infty$), are also convex functions according to definition (3).

The convex dual $f^*$ of a convex function $f$ is defined as

$$f^*(u) = \sup_v \; \langle v, u\rangle - f(v). \tag{7}$$

A straightforward calculation shows that the dual function of $f(u) = \frac{1}{2}\|u\|_2^2$ is simply $f^*(u) = \frac{1}{2}\|u\|_2^2$. One can also show the following relations:

$$f(u) = \|u\|_p \;\Rightarrow\; f^*(u) = i_{B_1^{(q)}}(u) \tag{8}$$

(where $1/p + 1/q = 1$). We will use this property often in the following subsection.

In case of convex functions, the notion of derivative can be extended to non-differentiable functions. Let $f: \mathbb{R}^d \to \bar{\mathbb{R}}$ be a convex function. The subdifferential $\partial f(u)$ of $f$ at the point $u \in \mathbb{R}^d$ is the set of vectors $w \in \mathbb{R}^d$ such that

$$f(v) \ge f(u) + \langle w, v - u\rangle \qquad \forall v. \tag{9}$$

In case confusion is possible, one can explicitly indicate the independent variable and write $\partial_u f$ instead of $\partial f$. The elements $w$ of the set $\partial f(u)$ are called the subgradients of $f$ at $u$. Even for a non-differentiable function, the subgradient can still be interpreted as the slope of a line (or plane) touching the function $f$ from below at the point $u$ (more than one such touching line/plane can exist). In case the function $f$ is differentiable at $u$, the subdifferential reduces to a single vector (the usual gradient). If the subdifferential is a singleton, it is often identified with its only element. A simple example of the subdifferential of a non-differentiable convex function is

$$f: \mathbb{R} \to \mathbb{R}: f(u) = |u| \;\Rightarrow\; \partial f(u) = \begin{cases} \{-1\} & u < 0 \\ [-1, 1] & u = 0 \\ \{1\} & u > 0. \end{cases} \tag{10}$$
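The subgradient inequality (9) can be checked numerically for the absolute value. The sketch below is only a finite-grid sanity check, not a proof; the helper names `subdiff_abs` and `is_subgradient` are hypothetical.

```python
import numpy as np

def subdiff_abs(u):
    """Subdifferential of |u| at u, returned as a closed interval (lo, hi)."""
    if u < 0:
        return (-1.0, -1.0)
    if u > 0:
        return (1.0, 1.0)
    return (-1.0, 1.0)

def is_subgradient(f, u, w, test_points, tol=1e-12):
    """Check inequality (9), f(v) >= f(u) + w * (v - u), on a finite set of points v."""
    return all(f(v) >= f(u) + w * (v - u) - tol for v in test_points)

v_grid = np.linspace(-3.0, 3.0, 121)
lo, hi = subdiff_abs(0.0)
# Every slope in [-1, 1] gives a line touching |u| from below at u = 0 ...
inside = all(is_subgradient(abs, 0.0, w, v_grid) for w in np.linspace(lo, hi, 9))
# ... whereas a slope outside that interval cuts through the graph.
outside = is_subgradient(abs, 0.0, 1.5, v_grid)
```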

The subdifferential can be used to express the minimization of a convex function $f$:

$$\hat{u} = \arg\min_u f(u) \;\Leftrightarrow\; 0 \in \partial f(\hat{u}). \tag{11}$$

Indeed, $\hat{u} = \arg\min_u f(u)$ corresponds to $f(u) \ge f(\hat{u})$ for all $u$ or equivalently $f(u) \ge f(\hat{u}) + \langle 0, u - \hat{u}\rangle$ for all $u$. This is just saying that $0 \in \partial f(\hat{u})$.

Let us, as an example, express the conditions for minimizing the function of formula (1), where $K$ and $A$ are two linear operators (matrices), in terms of the subdifferentials of $f$ and $g$. We find that

$$u = \arg\min f(Ku) + g(Au) \;\Leftrightarrow\; 0 \in K^T \partial f(Ku) + A^T \partial g(Au), \tag{12}$$

or in other words, there exist $v$ and $w$ such that

$$0 = K^T v + A^T w, \qquad v \in \partial f(Ku) \qquad\text{and}\qquad w \in \partial g(Au). \tag{13}$$

In the next subsection, we will rewrite the inclusions $v \in \partial f(Ku)$ and $w \in \partial g(Au)$ as algebraic equalities. This will allow us to rewrite minimization problem (1) as a system of algebraic equations. For solving these equations, we will then write iterative algorithms.
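To see conditions (13) at work in the simplest setting, consider the smooth case in which both subdifferentials are singletons. The choice $f(v) = \frac{1}{2}\|v - y\|_2^2$ and $g(w) = \frac{\mu}{2}\|w\|_2^2$, with a weight $\mu > 0$, is an assumption made here for illustration; in that case (13) reduces to the familiar regularized normal equations:

```latex
% Assumed example: f(v) = \tfrac12\|v-y\|_2^2 and g(w) = \tfrac{\mu}{2}\|w\|_2^2 with \mu > 0
\begin{align*}
\partial f(Ku) &= \{\, Ku - y \,\}, \qquad \partial g(Au) = \{\, \mu A u \,\}, \\
0 &= K^T v + A^T w = K^T (Ku - y) + \mu A^T A u, \\
(K^T K + \mu A^T A)\, u &= K^T y .
\end{align*}
```

For non-smooth $f$ or $g$ the sets $\partial f(Ku)$ and $\partial g(Au)$ are no longer singletons, and no such closed-form linear system is available.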

2.2 Projections and Proximity Operators

As this tutorial paper is aimed at geoscientists, the goal of this subsection is not to give the properties of the proximity operators in their full mathematical generality. A more detailed overview, with applications to optimization problems, may also be found in Combettes and Wajs (2005) and Combettes and Pesquet (2011).

2.2.1 Projection on a Convex Set

The projection of the vector $u$ on the (nonempty) closed convex set $C$ is defined as the closest point in $C$ to $u$:

$$P_C(u) = \arg\min_{v \in C} \|u - v\|_2^2. \tag{14}$$

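For the $\ell_p$-balls $B_R^{(p)}$, these projections can be explicitly calculated. The projection $P_R^{(\infty)}$ on the convex set $B_R^{(\infty)}$ is given by

$$\left(P_R^{(\infty)}(u)\right)_i = \begin{cases} u_i & |u_i| \le R \\ R\,\dfrac{u_i}{|u_i|} & |u_i| \ge R \end{cases} \tag{15}$$

(component-wise calculation). The projection $P_R^{(2)}$ on the Euclidean ball $B_R^{(2)}$ is given by

$$P_R^{(2)}(u) = \begin{cases} u & \|u\|_2 \le R \\ R\,\dfrac{u}{\|u\|_2} & \|u\|_2 \ge R. \end{cases}$$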
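Both projections translate directly into NumPy. The sketch below implements the component-wise clipping for the $\ell_\infty$-ball and the radial rescaling for the Euclidean ball; the latter is the standard closed form and is assumed here. The function names are hypothetical.

```python
import numpy as np

def project_linf_ball(u, R):
    """Projection on the l_inf-ball of radius R: clip each component to [-R, R]."""
    return np.clip(u, -R, R)

def project_l2_ball(u, R):
    """Projection on the Euclidean ball of radius R: rescale u if it lies outside."""
    norm = np.linalg.norm(u)
    return u if norm <= R else (R / norm) * u

u = np.array([3.0, -4.0, 0.5])
p_inf = project_linf_ball(u, 1.0)   # components larger than 1 in size are cut to +-1
p_2 = project_l2_ball(u, 1.0)       # same direction as u, unit Euclidean length
```

A point that already lies inside the ball is returned unchanged by either function, as required of a projection.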

[…] $> M$. As there exists a converging subsequence, the right-hand side can be made as small as one desires by choosing $M$ large enough. It follows that the left-hand side tends to zero for $N \to \infty$. In other words, the whole sequence $\left(u^{(n)}, v^{(n)}\right)$ (and not just a subsequence) converges to the limit $\left(u^\dagger, v^\dagger\right)$. □

As algorithm (42) implies that $u^{(n+1)} = \bar{u}^{(n+1)} - K^T\left(v^{(n+1)} - v^{(n)}\right)$ and hence $u^{(n)} = \bar{u}^{(n)} - K^T\left(v^{(n)} - v^{(n-1)}\right)$, it follows that the variable $u^{(n)}$ can be eliminated from iteration (42). One finds the equivalent algorithm:

$$\begin{cases} u^{(n+1)} = u^{(n)} - K^T\left(2v^{(n)} - v^{(n-1)}\right) \\ v^{(n+1)} = \operatorname{prox}_{f^*}\left(v^{(n)} + Ku^{(n+1)}\right) \end{cases} \tag{49}$$
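The two-line scheme (49) can be sketched as follows. This is a minimal illustration under stated assumptions: the initial values (including $v^{(-1)}$) are set to zero, the test problem is a made-up straight-line fit with one outlier, and the misfit $f(v) = \|v - y\|_1$ is chosen so that $\operatorname{prox}_{f^*}$ is a simple clipping operation. The matrix is scaled so that $\|K\|_2 < 1$, in line with the norm condition of Proposition 2. None of the names below come from the text.

```python
import numpy as np

def primal_dual_49(K, prox_fstar, n_iter=5000):
    """Iterate scheme (49): u <- u - K^T (2 v - v_prev), then v <- prox_{f*}(v + K u)."""
    m, d = K.shape
    u, v, v_prev = np.zeros(d), np.zeros(m), np.zeros(m)
    for _ in range(n_iter):
        u = u - K.T @ (2.0 * v - v_prev)
        # The right-hand side is evaluated with the old v before either name is rebound.
        v_prev, v = v, prox_fstar(v + K @ u)
    return u, v

# Hypothetical test problem: straight-line fit through 10 samples, one of them an outlier.
t = np.arange(10) - 4.5
Q = np.column_stack([np.ones(10) / np.sqrt(10.0), t / np.sqrt(82.5)])  # orthonormal columns
K = 0.9 * Q                                  # spectral norm 0.9 < 1
u_true = np.array([1.0, -2.0])
y = K @ u_true
y[7] += 5.0                                  # single gross outlier

# f(v) = ||v - y||_1 has f*(x) = <x, y> + indicator of the unit l_inf-ball,
# so prox_{f*}(x) is the clipping of x - y to [-1, 1].
prox_l1_dual = lambda x: np.clip(x - y, -1.0, 1.0)
u_l1, v_l1 = primal_dual_49(K, prox_l1_dual)

# Ordinary least squares on the same data, for comparison.
u_ls = np.linalg.lstsq(K, y, rcond=None)[0]
```

On this example the $\ell_1$ misfit ignores the single outlier, whereas the least-squares fit is pulled away from `u_true`.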

In (49), the bar has also been dropped from the variable $\bar{u}$. Algorithm (42) (or (49)) is well known. It is a special case of algorithm 1 of Chambolle and Pock (2010), of algorithm A1 of Zhang et al. (2011), or of the so-called PDHGMp algorithm of Esser et al. (2010). It is also possible to introduce variable step lengths (depending on iteration step $n$) (see, e.g., Esser (2010, page 78)).

Algorithm (42) (or equivalently algorithm (49)) can also be used to minimize a sum of convex functions. The following result is a direct consequence of the previous proposition.

Proposition 2. Let $f_i: \mathbb{R}^{d_i} \to \bar{\mathbb{R}}$ be $m$ convex functions, let $K_i: \mathbb{R}^d \to \mathbb{R}^{d_i}$ be $m$ linear operators with $\left\|\sum_{i=1}^m K_i^T K_i\right\|_2 < 1$, and suppose that a minimizer of the problem

$$\arg\min_u \sum_{i=1}^m f_i(K_i u) \tag{50}$$

exists. The algorithm […]

[…] $> M$. As $\left(u^{(n)}, v^{(n)}, w^{(n)}, z^{(n)}\right)$ possesses a converging subsequence, it follows that the whole sequence $\left(u^{(n)}, v^{(n)}, w^{(n)}, z^{(n)}\right)$ converges to $\left(u^\dagger, v^\dagger, w^\dagger, z^\dagger\right)$. □

In contrast to algorithm (56), algorithm (58) appears asymmetric with respect to $f$ and $g$ (and $K$ and $A$). By rewriting the latter algorithm, it is possible to better understand the origin of this asymmetry. Introducing an additional auxiliary variable $\tilde{u}$, algorithm (58) can equivalently be rewritten as

$$\begin{cases}
\bar{u}^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n)} - A^T w^{(n)} \\
w^{(n+1)} = \operatorname{prox}_{g^*}\left(w^{(n)} + A\bar{u}^{(n+1)}\right) \\
u^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n)} - A^T w^{(n+1)} \\
v^{(n+1)} = \operatorname{prox}_{f^*}\left(v^{(n)} + Ku^{(n+1)}\right) \\
\tilde{u}^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n+1)} - A^T w^{(n+1)}.
\end{cases} \tag{70}$$

Indeed, it follows from the third and last equation that $\tilde{u}^{(n+1)} - u^{(n+1)} = -K^T\left(v^{(n+1)} - v^{(n)}\right)$ or $\tilde{u}^{(n)} = u^{(n)} + K^T\left(v^{(n-1)} - v^{(n)}\right)$. The auxiliary variable $\tilde{u}^{(n)}$ can therefore be eliminated from system (70), and one recovers algorithm (58).

In form (70), it is clear how the symmetry between $f$ (and $K$) and $g$ (and $A$) is broken. A predict-update step is performed with respect to both the auxiliary variables $w$ and $v$, sequentially instead of the parallel update in algorithm (56). The symmetry between $f$ and $g$ (and $K$ and $A$) is therefore broken simply by performing the $w$ update step before the $v$ update. The $v$ update uses an already updated $u$.

From the proof of Proposition 4, it is also clear that this technique is not immediately generalizable to the minimization of a sum of $m$ functions (as in problem (50)). Indeed, the proof shows that the operator $K$ ends up in the norm used for the variable $u^{(n)}$ instead of in the norm used for the auxiliary variable $v^{(n)}$ (see, e.g., expression (67)). The matrix $A$ only shows up in the norm applied to the variable $w^{(n)}$.

A special case of this algorithm, where $f$ is the indicator function of the $\ell_2$-ball and $g$ is the $\ell_1$-norm, was introduced in Loris and Verhoeven (2012). Apart from the two proximity operators $\operatorname{prox}_{f^*}$ and $\operatorname{prox}_{g^*}$, one also needs to perform four matrix-vector multiplications per iteration step.

3.3 Penalized Least Squares Minimization

In this final subsection, we treat the minimization of a penalized least squares functional with a penalty of type $g(Au)$, where $\operatorname{prox}_g$ is known but $\operatorname{prox}_{g(A\,\cdot)}$ is not known. The subdifferential of the quadratic part can be computed explicitly. The use of two proximity operators can then be avoided.

Proposition 5. Let $g: \mathbb{R}^{d_2} \to \bar{\mathbb{R}}$ be a convex function. Let $K: \mathbb{R}^d \to \mathbb{R}^{d_1}$ and $A: \mathbb{R}^d \to \mathbb{R}^{d_2}$ be two linear operators with $\|K\|_2 < \sqrt{2}$ and $\|A\|_2 < 1$, and suppose that a minimizer of the problem

$$\arg\min_u \frac{1}{2}\|Ku - y\|_2^2 + g(Au) \tag{71}$$

exists. The algorithm

$$\begin{cases}
\bar{u}^{(n+1)} = u^{(n)} + K^T\left(y - Ku^{(n)}\right) - A^T w^{(n)} \\
w^{(n+1)} = \operatorname{prox}_{g^*}\left(w^{(n)} + A\bar{u}^{(n+1)}\right) \\
u^{(n+1)} = u^{(n)} + K^T\left(y - Ku^{(n)}\right) - A^T w^{(n+1)}
\end{cases} \tag{72}$$

($u^{(0)}$, $\bar{u}^{(0)}$, and $w^{(0)}$ are arbitrary) converges to a minimizer of (71).

Proof. See Loris and Verhoeven (2011). A generalization of algorithm (72) later appeared in Chen et al. (2013). □

Here too, step-length parameters can be introduced (see, e.g., formula (86) in section “Application to Linear Inverse Problems”). Many other algorithms can be written to solve problems of type (71) (see, e.g., Daubechies et al. (2007), Zhu and Chan (2008), Beck and Teboulle (2009), Bredies (2009), Afonso et al. (2010), Chambolle and Pock (2010), Esser et al. (2010), and Zhang et al. (2011)). When $A$ is the identity, algorithm (72) reduces to the proximal algorithm of Proposition 6.
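The predictor-corrector structure of algorithm (72) can be sketched as below. The example is an assumption chosen for illustration: one-dimensional total variation denoising with $K$ the identity, a forward-difference matrix $D$, and the penalty $\lambda\|Du\|_1$. Writing $A = D/2$ keeps $\|A\|_2 < 1$, and the penalty then reads $g(Au)$ with $g(w) = 2\lambda\|w\|_1$, so that $\operatorname{prox}_{g^*}$ is the projection on the $\ell_\infty$-ball of radius $2\lambda$. Function and variable names are hypothetical.

```python
import numpy as np

def penalized_least_squares_72(K, A, y, prox_gstar, n_iter=1000):
    """Iterate scheme (72) for min_u 0.5 * ||K u - y||_2^2 + g(A u)."""
    u = np.zeros(K.shape[1])
    w = np.zeros(A.shape[0])
    for _ in range(n_iter):
        grad_step = u + K.T @ (y - K @ u)   # u^(n) + K^T (y - K u^(n))
        u_bar = grad_step - A.T @ w         # predictor, uses the old w
        w = prox_gstar(w + A @ u_bar)       # dual update
        u = grad_step - A.T @ w             # corrector, uses the new w
    return u, w

# Hypothetical example: denoise a step signal with a strong total variation penalty.
n = 6
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
D = np.diff(np.eye(n), axis=0)     # forward differences, spectral norm below 2
A = D / 2.0                        # rescaled so that ||A||_2 < 1
K = np.eye(n)                      # ||K||_2 = 1 < sqrt(2)
lam = 2.0
# g(w) = 2 * lam * ||w||_1, hence prox_{g*} clips to [-2 * lam, 2 * lam].
prox_gstar = lambda x: np.clip(x, -2.0 * lam, 2.0 * lam)
u_tv, w_tv = penalized_least_squares_72(K, A, y, prox_gstar)
```

With this choice of $\lambda$ the penalty is strong enough to flatten the step completely, so the minimizer is the constant signal equal to the mean of $y$. A smaller $\lambda$ would retain part of the jump.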

Proposition 6. Let $g: \mathbb{R}^d \to \bar{\mathbb{R}}$ be a convex function and suppose that a minimizer of the problem

$$\arg\min_u \frac{1}{2}\|Ku - y\|_2^2 + g(u) \tag{73}$$

exists. The algorithm

$$u^{(n+1)} = \operatorname{prox}_{\alpha g}\left(u^{(n)} + K^T\left(y - Ku^{(n)}\right)\right)$$

($u^{(0)}$ is arbitrary) converges to a minimizer of (73) if $\|K\|$